Version: 1.0.x

CambAI TTS

The CambAI TTS provider enables your agent to use CambAI's high-quality, low-latency text-to-speech models for generating natural-sounding voice output with advanced voice customization capabilities.

Installation

Install the CambAI-enabled VideoSDK Agents package:

pip install "videosdk-plugins-cambai"

Importing

from videosdk.plugins.cambai import CambAITTS, InferenceOptions, VoiceSettings, OutputConfiguration

Authentication

The CambAI plugin requires a CambAI API key.

Set CAMBAI_API_KEY in your .env file.
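
A missing key usually only surfaces at the first synthesis request, so it can help to check for it at startup. The sketch below is a minimal illustration, not part of the plugin: `require_cambai_api_key` is a hypothetical helper, and it assumes the contents of your .env file have already been loaded into the process environment (for example by a dotenv loader).

```python
import os
from typing import Mapping, Optional


def require_cambai_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the CambAI API key, or raise a clear error if it is missing.

    Hypothetical helper for illustration; the plugin itself does not ship it.
    `env` defaults to the current process environment.
    """
    source = os.environ if env is None else env
    # Treat an empty or whitespace-only value the same as a missing one.
    key = source.get("CAMBAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "CAMBAI_API_KEY is not set; add it to your .env file or export it."
        )
    return key
```

Calling `require_cambai_api_key()` once before building the pipeline turns a late authentication failure into an immediate, readable error.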

Example Usage

from videosdk.plugins.cambai import CambAITTS, InferenceOptions, VoiceSettings, OutputConfiguration
from videosdk.agents import Pipeline

# Configure model sampling controls
inference_options = InferenceOptions(
    stability=0.5,
    temperature=0.7,
    inference_steps=60,
    speaker_similarity=0.8,
    localize_speaker_weight=0.5,
    acoustic_quality_boost=True
)

# Configure voice settings
voice_settings = VoiceSettings(
    enhance_reference_audio_quality=False,
    maintain_source_accent=False,
)

# Configure audio output format and pacing
output_configuration = OutputConfiguration(
    format="wav",
    sample_rate=24000,   # Audio sample rate in Hz
    duration=None        # None keeps natural pacing
)

# Initialize CambAI TTS with optional audio output settings
tts = CambAITTS(
    speech_model="mars-pro",
    voice_id=147320,
    language="en-us",
    user_instructions=None,  # Optional for mars-instruct
    enhance_named_entities_pronunciation=True,
    voice_settings=voice_settings,
    inference_options=inference_options,
    output_configuration=output_configuration,
)

# Add TTS to a cascade pipeline
pipeline = Pipeline(tts=tts)

Note

When credentials are stored in a .env file, do not also pass them as arguments to model instances. The SDK reads environment variables automatically, so omit api_key and other credential parameters from your code.

Configuration Options

api_key: (str) Your CambAI API key. Can also be set via the CAMBAI_API_KEY environment variable.
speech_model: (str) The CambAI TTS model to use (e.g., "mars-pro", "mars-flash", "mars-instruct"). Defaults to "mars-pro".
voice_id: (int) Numeric voice profile ID from CambAI's voice library. Defaults to 147320.
language: (str) BCP-47 locale string (e.g., "en-us"). Defaults to "en-us".
user_instructions: (str) Style and tone guidance for the generated speech. Only supported when speech_model is set to "mars-instruct".
enhance_named_entities_pronunciation: (bool) Improve pronunciation of brand names and proper nouns (default: False).
voice_settings: (VoiceSettings) Voice behaviour preferences:
- enhance_reference_audio_quality: (bool) Enhance the quality of reference audio (default: False)
- maintain_source_accent: (bool) Preserve the original speaker's accent (default: False)
inference_options: (InferenceOptions) Model sampling controls:
- stability: (float) Voice stability control (optional)
- temperature: (float) Sampling temperature (optional)
- inference_steps: (int) Number of inference steps (optional)
- speaker_similarity: (float) Speaker similarity control (optional)
- localize_speaker_weight: (float) Speaker localization weight (optional)
- acoustic_quality_boost: (bool) Enable acoustic quality enhancement (optional)
output_configuration: (OutputConfiguration) Audio output format and pacing options:
- format: (str) Output audio format. Currently "wav" is supported (default: "wav")
- sample_rate: (int) Audio sample rate in Hz (default: 24000)
- duration: (float) Target speech duration in seconds. When set, the model attempts to pace the audio to match the requested duration. Omit or set to None for natural pacing (optional)
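
Two of the constraints listed above are easy to violate by accident: user_instructions applies only to "mars-instruct", and "wav" is the only output format named here. The following is a minimal sketch of a pre-flight check built from those descriptions. `check_tts_options` is a hypothetical helper, not part of the plugin, and the positivity checks on sample_rate and duration are assumptions rather than documented rules.

```python
from typing import List, Optional

# Values taken from the option descriptions above.
INSTRUCT_MODEL = "mars-instruct"
SUPPORTED_FORMATS = {"wav"}


def check_tts_options(
    speech_model: str = "mars-pro",
    user_instructions: Optional[str] = None,
    audio_format: str = "wav",
    sample_rate: int = 24000,
    duration: Optional[float] = None,
) -> List[str]:
    """Return a list of readable problems; an empty list means none were found."""
    problems: List[str] = []
    # Documented: user_instructions is only supported with "mars-instruct".
    if user_instructions is not None and speech_model != INSTRUCT_MODEL:
        problems.append(
            "user_instructions requires speech_model='mars-instruct', "
            f"got {speech_model!r}"
        )
    # Documented: "wav" is the supported output format.
    if audio_format not in SUPPORTED_FORMATS:
        problems.append(f"unsupported output format {audio_format!r}")
    # Assumption (not stated in the docs): these values must be positive.
    if sample_rate <= 0:
        problems.append("sample_rate must be a positive number of Hz")
    if duration is not None and duration <= 0:
        problems.append("duration must be positive, or None for natural pacing")
    return problems
```

Running the check on the same values you pass to CambAITTS and OutputConfiguration, and logging any returned messages, catches these mismatches before a request is sent.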

Additional Resources

The following resources provide more information about using CambAI with VideoSDK Agents.

CambAI docs: CambAI TTS docs.

SDK Reference

GitHub Repository

Python Package
