Google TTS
The Google TTS plugin lets your agent use Google's text-to-speech models to generate natural-sounding voice output. It supports low-latency gRPC streaming with Chirp 3 HD voices and, as a separate non-streaming mode, the Vertex AI endpoint.
Installation
pip install "videosdk-plugins-google"
Authentication
Set your Google API key as an environment variable:
export GOOGLE_API_KEY="your-google-api-key"
You can obtain an API key from Google AI Studio.
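A missing key can be easier to diagnose when it is checked at startup rather than at the first synthesis call. The following is a minimal sketch using only the standard library; require_env is a hypothetical helper and not part of the plugin.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; export it or add it to your .env file "
            "before starting the agent."
        )
    return value
```

Calling require_env("GOOGLE_API_KEY") before constructing the pipeline raises immediately if the variable is absent or empty.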
Example Usage
from videosdk.plugins.google import GoogleTTS, GoogleVoiceConfig
from videosdk.agents import CascadingPipeline
# Configure voice settings
voice_config = GoogleVoiceConfig(
    languageCode="en-US",
    name="en-US-Chirp3-HD-Aoede",
    ssmlGender="FEMALE",
)
# Initialize the Google TTS model
tts = GoogleTTS(
    # Omit api_key when GOOGLE_API_KEY is set in the environment or a .env file.
    # Pass it explicitly only when the variable is not set:
    # api_key="your-google-api-key",
    speed=1.0,
    pitch=0.0,
    voice_config=voice_config,
    custom_pronunciations=[{"tomato": "təˈmeɪtoʊ"}],  # Optional IPA overrides
)
# Add tts to cascading pipeline
pipeline = CascadingPipeline(tts=tts)
Vertex AI
To use the Vertex AI endpoint instead of an API key, authenticate using Application Default Credentials (ADC) and set your project ID:
export GOOGLE_CLOUD_PROJECT="my-gcp-project"
from videosdk.plugins.google import GoogleTTS, VertexAIConfig
tts = GoogleTTS(
    vertexai=True,
    vertexai_config=VertexAIConfig(location="us-central1"),
    streaming=False,  # Streaming cannot be used with Vertex AI
)
Note
- streaming=True (the default) requires a Chirp 3 HD voice (e.g. en-US-Chirp3-HD-Aoede) and cannot be combined with vertexai=True.
- Vertex AI requires a GCP project ID via VertexAIConfig(project_id="..."), the GOOGLE_CLOUD_PROJECT env variable, or a GOOGLE_APPLICATION_CREDENTIALS service-account file.
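The two streaming constraints above can be checked before the TTS object is constructed. This is a rough sketch, not the plugin's own validation: check_tts_options is a hypothetical helper, and detecting a Chirp 3 HD voice by looking for "Chirp3-HD" in the voice name is an assumption based on the example name en-US-Chirp3-HD-Aoede.

```python
def check_tts_options(streaming: bool, vertexai: bool, voice_name: str) -> list[str]:
    """Return human-readable problems; an empty list means the combination looks valid."""
    problems = []
    if streaming and vertexai:
        problems.append("streaming=True cannot be combined with vertexai=True")
    # Assumption: Chirp 3 HD voice names contain "Chirp3-HD".
    if streaming and "Chirp3-HD" not in voice_name:
        problems.append(
            f"streaming=True requires a Chirp 3 HD voice, got {voice_name!r}"
        )
    return problems
```

For example, check_tts_options(True, True, "en-US-News-N") reports both problems, while the same voice with streaming=False reports none.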
Configuration Options
- api_key: (str) Your Google Cloud TTS API key. Can also be set via the GOOGLE_API_KEY environment variable.
- speed: (float) The speaking rate of the generated audio (default: 1.0).
- pitch: (float) The pitch of the generated audio. Can be between -20.0 and 20.0 (default: 0.0).
- response_format: (str) The format of the audio response. Currently only supports "pcm" (default: "pcm").
- voice_config: (GoogleVoiceConfig) Configuration for the voice to be used.
  - languageCode: (str) The language code of the voice (e.g., "en-US", "en-GB") (default: "en-US").
  - name: (str) The name of the voice to use (e.g., "en-US-Chirp3-HD-Aoede", "en-US-News-N") (default: "en-US-Chirp3-HD-Aoede").
  - ssmlGender: (str) The gender of the voice ("MALE", "FEMALE", "NEUTRAL") (default: "FEMALE").
- custom_pronunciations: (list[dict] | dict | None) IPA pronunciation overrides for specific words (e.g., [{"tomato": "təˈmeɪtoʊ"}]). Defaults to None.
- streaming: (bool) Use gRPC StreamingSynthesize for lower-latency audio generation. Only compatible with Chirp 3 HD voices and cannot be combined with vertexai=True (default: True).
- vertexai: (bool) Use the Vertex AI TTS endpoint with Application Default Credentials (ADC) instead of an API key (default: False).
- vertexai_config: (VertexAIConfig) Project and region settings for Vertex AI.
  - project_id: (str | None) Your GCP project ID. Falls back to GOOGLE_CLOUD_PROJECT or GOOGLE_APPLICATION_CREDENTIALS (default: None).
  - location: (str) GCP region for the TTS endpoint (default: "us-central1").
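To see which project ID a Vertex AI setup would end up with, the fallback described for project_id can be sketched as follows. resolve_project_id is a hypothetical helper, and the precedence shown (explicit value, then GOOGLE_CLOUD_PROJECT, then the project_id field of the service-account key file) is an assumption; the plugin may resolve these differently.

```python
import json
import os
from typing import Mapping, Optional


def resolve_project_id(
    explicit: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Resolve a GCP project ID from an explicit value, the environment, or a key file."""
    if explicit:
        return explicit
    if env.get("GOOGLE_CLOUD_PROJECT"):
        return env["GOOGLE_CLOUD_PROJECT"]
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_path and os.path.isfile(key_path):
        # Service-account key files are JSON and normally carry a "project_id" field.
        with open(key_path, encoding="utf-8") as fh:
            return json.load(fh).get("project_id")
    return None
```

A return value of None means none of the three sources supplied a project ID, which is worth catching before enabling vertexai=True.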
Additional Resources
The following resources provide more information about using Google with the VideoSDK Agents SDK.
- Google TTS docs: Google Cloud TTS documentation.
- Chirp 3 HD voices: Available voices for low-latency streaming synthesis.
- Vertex AI TTS: Vertex AI Text-to-Speech documentation.

