Azure OpenAI TTS
The Azure OpenAI TTS provider enables your agent to use Azure OpenAI's text-to-speech models for converting text responses to natural-sounding audio output.
Installation
Install the Azure OpenAI-enabled VideoSDK Agents package:
pip install "videosdk-plugins-openai"
Importing
from videosdk.plugins.openai import OpenAITTS
Authentication
The Azure OpenAI plugin requires an Azure OpenAI API key, along with your Azure OpenAI endpoint and API version.
Set AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT, and OPENAI_API_VERSION in your .env file.
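Since a missing variable usually only surfaces once the first synthesis request fails, it can help to check the environment up front. The following is a minimal sketch using only the standard library; `missing_azure_env` is a hypothetical helper, not part of the SDK, and it assumes the .env file has already been loaded into the process environment (for example by a loader such as python-dotenv).

```python
import os

# Variable names taken from the Authentication section above.
REQUIRED_VARS = (
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "OPENAI_API_VERSION",
)


def missing_azure_env(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ
    and can be replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_azure_env()` before constructing the TTS instance returns an empty list when all three variables are present, and otherwise lists the names that still need to be set.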
Example Usage
from videosdk.plugins.openai import OpenAITTS
from videosdk.agents import CascadingPipeline
# Initialize the Azure OpenAI TTS model
tts = OpenAITTS.azure(
    azure_deployment="gpt-4o-mini-tts",
    speed=1.0,
    response_format="pcm"
)
# Add tts to cascading pipeline
pipeline = CascadingPipeline(tts=tts)
Note
When using a .env file for credentials, don't pass them as arguments to model instances or context objects. The SDK automatically reads environment variables, so omit api_key, videosdk_auth, and other credential parameters from your code.
Configuration Options
azure_deployment: The deployment ID to use (defaults to the model name, e.g., "gpt-4o-mini-tts")
api_key: Your Azure OpenAI API key (can also be set via environment variable)
azure_endpoint: Your Azure OpenAI deployment endpoint URL (can also be set via environment variable)
api_version: Your Azure OpenAI API version (can also be set via environment variable)
voice: (str) Voice to use for audio output (e.g., "alloy", "echo", "fable", "onyx", "nova", "shimmer")
speed: (float) Speed of the generated audio (0.25 to 4.0, default: 1.0)
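The documented speed range can be enforced before the options reach the provider. The sketch below is one possible way to do that; `build_tts_options` is a hypothetical helper, not an SDK function. It only checks speed, because the voices listed above are given as examples and may not be exhaustive.

```python
# Speed bounds as stated in the configuration options above.
MIN_SPEED = 0.25
MAX_SPEED = 4.0


def build_tts_options(azure_deployment, voice, speed=1.0):
    """Return keyword arguments for the TTS constructor after checking speed.

    Hypothetical helper for illustration. The voice is passed through
    unchecked, since the documented voice names are examples only.
    """
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(
            f"speed must be between {MIN_SPEED} and {MAX_SPEED}, got {speed}"
        )
    return {
        "azure_deployment": azure_deployment,
        "voice": voice,
        "speed": speed,
    }
```

The returned dictionary uses only parameter names from the list above, so it could be unpacked into the constructor, for example `OpenAITTS.azure(**build_tts_options("gpt-4o-mini-tts", "alloy"))`.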
Additional Resources
The following resources provide more information about using OpenAI with the VideoSDK Agents SDK.

