Google TTS

The Google TTS provider enables your agent to use Google's high-quality text-to-speech models for generating natural-sounding voice output.

Installation

Install the Google-enabled VideoSDK Agents package:

pip install "videosdk-plugins-google"

Importing

from videosdk.plugins.google import GoogleTTS, GoogleVoiceConfig

Example Usage

from videosdk.plugins.google import GoogleTTS, GoogleVoiceConfig
from videosdk.agents import CascadingPipeline

# Configure voice settings
voice_config = GoogleVoiceConfig(
    languageCode="en-US",
    name="en-US-Chirp3-HD-Aoede",
    ssmlGender="FEMALE",
)

# Initialize the Google TTS model
tts = GoogleTTS(
    # Omit api_key if GOOGLE_API_KEY is already set in your .env file
    api_key="your-google-api-key",
    speed=1.0,
    pitch=0.0,
    voice_config=voice_config,
)

# Add tts to cascading pipeline
pipeline = CascadingPipeline(tts=tts)

Note

When credentials are stored in a .env file, do not also pass them as arguments to model instances or context objects. The SDK reads environment variables automatically, so omit api_key and other credential parameters from your code.
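The rule in the note can be sketched as a small helper that adds api_key to the constructor arguments only when GOOGLE_API_KEY is not already present in the environment. This is an illustration, not part of the SDK: the name google_tts_kwargs is hypothetical, and it assumes the .env file has already been loaded into the process environment.

```python
import os
from typing import Any, Dict, Mapping, Optional


def google_tts_kwargs(
    explicit_key: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Dict[str, Any]:
    """Build keyword arguments for GoogleTTS (hypothetical helper, not an SDK function)."""
    kwargs: Dict[str, Any] = {"speed": 1.0, "pitch": 0.0}
    if env.get("GOOGLE_API_KEY"):
        # The key is already in the environment: leave api_key out
        # so the SDK can pick it up on its own.
        return kwargs
    if explicit_key is None:
        raise ValueError("Set GOOGLE_API_KEY or pass an API key explicitly")
    # No environment variable found: fall back to the explicit key.
    kwargs["api_key"] = explicit_key
    return kwargs
```

The result could then be unpacked into the constructor, for example GoogleTTS(voice_config=voice_config, **google_tts_kwargs()).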

Configuration Options​

  • api_key: (str) Your Google Cloud TTS API key. Can also be set via the GOOGLE_API_KEY environment variable.
  • speed: (float) The speaking rate of the generated audio (default: 1.0).
  • pitch: (float) The pitch of the generated audio. Can be between -20.0 and 20.0 (default: 0.0).
  • response_format: (str) The format of the audio response. Currently only supports "pcm" (default: "pcm").
  • voice_config: (GoogleVoiceConfig) Configuration for the voice to be used.
    • languageCode: (str) The language code of the voice (e.g., "en-US", "en-GB") (default: "en-US").
    • name: (str) The name of the voice to use (e.g., "en-US-Chirp3-HD-Aoede", "en-US-News-N") (default: "en-US-Chirp3-HD-Aoede").
    • ssmlGender: (str) The gender of the voice ("MALE", "FEMALE", "NEUTRAL") (default: "FEMALE").
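The documented constraints above can be checked before constructing the model. The following is a minimal sketch, not SDK behavior: check_tts_options is a hypothetical helper, and it only encodes the limits stated in this list (pitch range, the "pcm" response format, and the three ssmlGender values). Whether the SDK performs its own validation is not covered here.

```python
from typing import List

# Values taken from the option list above.
PITCH_MIN, PITCH_MAX = -20.0, 20.0
SUPPORTED_FORMATS = {"pcm"}
VALID_GENDERS = {"MALE", "FEMALE", "NEUTRAL"}


def check_tts_options(
    pitch: float = 0.0,
    response_format: str = "pcm",
    ssml_gender: str = "FEMALE",
) -> List[str]:
    """Return a list of problems found; an empty list means the options look valid."""
    problems: List[str] = []
    if not PITCH_MIN <= pitch <= PITCH_MAX:
        problems.append(f"pitch {pitch} is outside [{PITCH_MIN}, {PITCH_MAX}]")
    if response_format not in SUPPORTED_FORMATS:
        problems.append(f"response_format {response_format!r} is not supported")
    if ssml_gender not in VALID_GENDERS:
        # The comparison is case-sensitive, matching the upper-case values in the list.
        problems.append(f"ssmlGender {ssml_gender!r} is not one of {sorted(VALID_GENDERS)}")
    return problems
```

Calling check_tts_options() with the defaults returns an empty list, while an out-of-range pitch or a lower-case gender string each adds one message.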
