OpenAI TTS
The OpenAI TTS provider enables your agent to use OpenAI's text-to-speech models for converting text responses to natural-sounding audio output.
Installation
Install the OpenAI-enabled VideoSDK Agents package:
pip install "videosdk-plugins-openai"
Importing
from videosdk.plugins.openai import OpenAITTS
Example Usage
from videosdk.plugins.openai import OpenAITTS
from videosdk.agents import CascadingPipeline

# Initialize the OpenAI TTS model
tts = OpenAITTS(
    # Pass api_key only if OPENAI_API_KEY is NOT set in .env; otherwise omit it
    api_key="your-openai-api-key",
    model="tts-1",
    voice="alloy",
    speed=1.0,
    response_format="pcm",
)

# Add the TTS model to the cascading pipeline
pipeline = CascadingPipeline(tts=tts)
Note
When using a .env file for credentials, do not also pass them as arguments to model instances or context objects. The SDK reads environment variables automatically, so omit api_key, videosdk_auth, and other credential parameters from your code.
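One way to follow this rule in code is to decide at runtime whether api_key should be passed at all. The sketch below is only an illustration: build_tts_kwargs is a hypothetical helper (not part of the SDK), its default values are copied from the example usage above, and it assumes OPENAI_API_KEY has already been loaded into the process environment.

```python
import os


def build_tts_kwargs(api_key=None, **options):
    """Build keyword arguments for OpenAITTS.

    Hypothetical helper, not part of the SDK. Defaults mirror the
    example usage above; any of them can be overridden via **options.
    """
    kwargs = {
        "model": "tts-1",
        "voice": "alloy",
        "speed": 1.0,
        "response_format": "pcm",
    }
    kwargs.update(options)
    # Pass api_key explicitly only when the environment does not provide one.
    if api_key and not os.environ.get("OPENAI_API_KEY"):
        kwargs["api_key"] = api_key
    return kwargs


# Usage (requires the videosdk-plugins-openai package):
# from videosdk.plugins.openai import OpenAITTS
# tts = OpenAITTS(**build_tts_kwargs(voice="nova"))
```

With this approach the same code runs whether credentials come from the environment or from an explicit argument, and the key is never passed twice.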
Configuration Options
model: The OpenAI TTS model to use (e.g., "tts-1", "tts-1-hd")
voice: (str) Voice to use for audio output (e.g., "alloy", "echo", "fable", "onyx", "nova", "shimmer")
speed: (float) Speed of the generated audio (0.25 to 4.0, default: 1.0)
instructions: (str) Custom instructions to guide speech synthesis style
api_key: Your OpenAI API key (can also be set via environment variable)
base_url: Custom base URL for OpenAI API (optional)
response_format: (str) Audio format for output (default: "pcm")
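Since speed has a documented range, it can help to check options before constructing the provider. The following is a minimal sketch, assuming you want to fail fast on an out-of-range speed; TTSOptions is a hypothetical wrapper (not an SDK class), and its model and voice defaults are taken from the example usage rather than from documented defaults.

```python
from dataclasses import asdict, dataclass

# Speed range as documented in the configuration options above.
MIN_SPEED, MAX_SPEED = 0.25, 4.0


@dataclass(frozen=True)
class TTSOptions:
    """Hypothetical container for OpenAITTS options (not an SDK class)."""

    model: str = "tts-1"          # assumed default, from the example usage
    voice: str = "alloy"          # assumed default, from the example usage
    speed: float = 1.0            # documented default
    response_format: str = "pcm"  # documented default

    def __post_init__(self):
        # Reject speeds outside the documented 0.25 to 4.0 range.
        if not MIN_SPEED <= self.speed <= MAX_SPEED:
            raise ValueError(
                f"speed must be between {MIN_SPEED} and {MAX_SPEED}, got {self.speed}"
            )

    def as_kwargs(self):
        """Return the options as a dict suitable for OpenAITTS(**kwargs)."""
        return asdict(self)


# Usage (requires the videosdk-plugins-openai package):
# from videosdk.plugins.openai import OpenAITTS
# tts = OpenAITTS(**TTSOptions(voice="nova", speed=1.25).as_kwargs())
```

Only speed is validated here because the model and voice lists above are given as examples and may not be exhaustive.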