Version: 1.0.x

OpenAI TTS

The OpenAI TTS provider enables your agent to use OpenAI's text-to-speech models for converting text responses to natural-sounding audio output.

Installation

Install the OpenAI-enabled VideoSDK Agents package:

pip install "videosdk-plugins-openai"

Importing

from videosdk.agents.plugins import OpenAITTS

Authentication

The OpenAI plugin requires an OpenAI API key.

Set OPENAI_API_KEY in your .env file.
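
A quick way to confirm the key is visible to your process before constructing the provider is sketched below. It assumes the .env file has already been loaded into the process environment (by the SDK, or by a loader such as python-dotenv). `require_openai_key` is a hypothetical helper written for illustration, not part of the VideoSDK SDK.

```python
import os


def require_openai_key(env=None):
    """Return the OpenAI API key from the environment, or raise if it is missing.

    `env` is any mapping; it defaults to the live process environment.
    This helper is hypothetical and only illustrates a startup check.
    """
    if env is None:
        env = os.environ
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")
    return key
```

Calling this once at startup makes a missing key fail early with a clear message instead of surfacing later as a failed synthesis request.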

Example Usage

from videosdk.agents.plugins import OpenAITTS
from videosdk.agents import Pipeline

# Initialize the OpenAI TTS model
tts = OpenAITTS(
    # Pass api_key only if OPENAI_API_KEY is NOT set in .env;
    # otherwise remove this argument and let the SDK read the variable.
    api_key="your-openai-api-key",
    model="tts-1",
    voice="alloy",
    speed=1.0,
    response_format="pcm",
)

# Add the TTS model to the cascade pipeline
pipeline = Pipeline(tts=tts)

Note

When you keep credentials in a .env file, do not also pass them as arguments to model instances or context objects. The SDK reads environment variables automatically, so omit api_key, videosdk_auth, and other credential parameters from your code.

Configuration Options

  • model: (str) The OpenAI TTS model to use (e.g., "tts-1", "tts-1-hd", "gpt-4o-mini-tts")
  • voice: (str | dict[str, str]) Built-in voice name (e.g., "alloy", "echo", "fable", "onyx", "nova", "shimmer") or a custom voice reference dict {"id": "voice_xxx"}
  • speed: (float) Speed of the generated audio (default: 1.0)
  • instructions: (str) Custom instructions to guide speech synthesis style (only honored by gpt-4o-mini-tts)
  • language: (str) ISO language hint (e.g., "hi", "mr", "fr") for non-English input or custom voices (default: None)
  • api_key: (str) Your OpenAI API key (can also be set via the OPENAI_API_KEY environment variable)
  • base_url: (str) Custom base URL for the OpenAI API (optional)
  • response_format: (str) Audio format for output (default: "pcm")
  • chunked_synthesis: (bool) When True, dispatch one POST per flush boundary; when False, accumulate the entire stream into a single POST (default: False)
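
The options above can be combined as in the following sketch. `build_tts_kwargs` is a hypothetical helper, not part of the plugin: it only collects the documented parameters into a dict and drops `instructions` when the model would ignore them. The final `OpenAITTS(**kwargs)` call is left as a comment so the snippet runs without the SDK installed.

```python
def build_tts_kwargs(model="tts-1", voice="alloy", speed=1.0,
                     instructions=None, language=None,
                     response_format="pcm", chunked_synthesis=False):
    """Collect OpenAITTS keyword arguments (hypothetical helper)."""
    # A custom voice is documented as a reference dict such as {"id": "voice_xxx"}.
    if isinstance(voice, dict) and "id" not in voice:
        raise ValueError('a custom voice must be a dict like {"id": "voice_xxx"}')
    kwargs = {
        "model": model,
        "voice": voice,
        "speed": speed,
        "response_format": response_format,
        "chunked_synthesis": chunked_synthesis,
    }
    # instructions are only honored by gpt-4o-mini-tts, so skip them otherwise.
    if instructions is not None and model == "gpt-4o-mini-tts":
        kwargs["instructions"] = instructions
    # language is an optional hint; leave it out to keep the default (None).
    if language is not None:
        kwargs["language"] = language
    return kwargs


kwargs = build_tts_kwargs(
    model="gpt-4o-mini-tts",
    voice="nova",
    instructions="Speak in a calm, friendly tone.",
    language="fr",
)
# tts = OpenAITTS(**kwargs)  # api_key omitted: read from OPENAI_API_KEY
```

Keeping the options in a plain dict makes it easy to log or test the configuration before any audio request is made.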

