Ultravox

The Ultravox provider enables your agent to use Ultravox's models for real-time, conversational AI interactions.

Installation

Install the Ultravox-enabled VideoSDK Agents package:

pip install "videosdk-plugins-ultravox"

Authentication

The Ultravox plugin requires an Ultravox API key.

Set the ULTRAVOX_API_KEY environment variable in your .env file so the SDK can read it automatically.
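
If you want to fail fast when the key is missing, a small check like the following can run before the model is created. This is a minimal sketch using only the Python standard library. The helper name `get_ultravox_api_key` is hypothetical and not part of the SDK, and the sketch assumes the variable has already been loaded into the process environment (for example by your shell or a dotenv loader).

```python
import os


def get_ultravox_api_key(env=os.environ):
    """Return the Ultravox API key from the given environment mapping.

    Hypothetical helper (not part of the SDK): raises a clear error
    when ULTRAVOX_API_KEY is missing or blank.
    """
    key = env.get("ULTRAVOX_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ULTRAVOX_API_KEY is not set; add it to your .env file "
            "or export it in your shell."
        )
    return key
```

Calling `get_ultravox_api_key()` at startup surfaces a missing key immediately instead of at the first request.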

Importing

from videosdk.plugins.ultravox import UltravoxRealtime, UltravoxLiveConfig

Example Usage

from videosdk.plugins.ultravox import UltravoxRealtime, UltravoxLiveConfig
from videosdk.agents import RealTimePipeline

# Initialize the Ultravox real-time model
model = UltravoxRealtime(
    model="fixie-ai/ultravox",
    # Pass api_key only if ULTRAVOX_API_KEY is not set in your .env file;
    # otherwise omit this argument and the SDK reads the key automatically.
    api_key="your-ultravox-api-key",
    config=UltravoxLiveConfig(
        voice="54ebeae1-88df-4d66-af13-6c41283b4332"
    )
)

# Create the pipeline with the model
pipeline = RealTimePipeline(model=model)

Note: When using a .env file for credentials, you do not need to pass api_key as an argument to the model instance; the SDK reads it automatically.

Key Features

Real-time Interactions: Utilize Ultravox's powerful models for low-latency voice conversations.
Function Calling: Empower your agent to perform actions like retrieving weather data or calling external APIs.
Custom Agent Behaviors: Define a unique personality and interaction style for your agent through system prompts.
Call Control: Agents can manage the conversation flow and gracefully terminate calls.
MCP Integration: Connect to external tools and data sources using the Model Context Protocol (MCP) via MCPServerStdio for local processes or MCPServerHTTP for remote services.

Configuration Options

model: The Ultravox model to use (e.g., "fixie-ai/ultravox").
api_key: Your Ultravox API key (can also be set via the ULTRAVOX_API_KEY environment variable).
config: An UltravoxLiveConfig object for advanced options:
- voice: (str) The Voice ID for the synthesized speech.
- language_hint: (str) A hint for the conversation's language (e.g., "en").
- temperature: (float) Controls the randomness of responses (0.0 to 1.0).
- vad_turn_endpoint_delay: (int) Delay in milliseconds for voice activity detection to determine the end of a turn.
- vad_minimum_turn_duration: (int) The minimum duration in milliseconds for a valid speech turn.
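
The options above can be collected and sanity-checked before they are handed to the config object. The sketch below is illustrative only: `build_live_config_kwargs` is a hypothetical helper, the numeric values in the usage lines are example values rather than SDK defaults, and the only constraints it checks are those stated in the list above (temperature between 0.0 and 1.0, delays given as integer milliseconds).

```python
def build_live_config_kwargs(voice=None, language_hint=None, temperature=None,
                             vad_turn_endpoint_delay=None,
                             vad_minimum_turn_duration=None):
    """Validate the documented options and return only the ones that were set.

    Hypothetical helper, not part of the SDK.
    """
    if temperature is not None and not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")

    # Both VAD settings are documented as integer millisecond values.
    for name, value in (
        ("vad_turn_endpoint_delay", vad_turn_endpoint_delay),
        ("vad_minimum_turn_duration", vad_minimum_turn_duration),
    ):
        if value is not None and (not isinstance(value, int) or value < 0):
            raise ValueError(f"{name} must be a non-negative integer (milliseconds)")

    options = {
        "voice": voice,
        "language_hint": language_hint,
        "temperature": temperature,
        "vad_turn_endpoint_delay": vad_turn_endpoint_delay,
        "vad_minimum_turn_duration": vad_minimum_turn_duration,
    }
    # Drop unset options so the SDK's own defaults apply to them.
    return {key: value for key, value in options.items() if value is not None}


# Example values only; tune them for your application.
kwargs = build_live_config_kwargs(
    voice="54ebeae1-88df-4d66-af13-6c41283b4332",
    language_hint="en",
    temperature=0.7,
    vad_turn_endpoint_delay=800,
)
# With the plugin installed, these could be passed on as:
# config = UltravoxLiveConfig(**kwargs)
```

Leaving unset options out of the dictionary keeps the call short and avoids overriding whatever defaults the plugin applies.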

Additional Resources

The following resources provide more information about using Ultravox with the VideoSDK Agents SDK.

SDK Reference

GitHub Repository

Python Package
