Sarvam AI STT
The Sarvam AI STT provider enables your agent to use Sarvam AI's speech-to-text models for transcription. This provider uses Voice Activity Detection (VAD) to send audio chunks for transcription after a period of silence.
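To make the "send after a period of silence" behaviour concrete, here is a minimal sketch of silence-based chunking. It is not the plugin's actual VAD: it swaps in a simple RMS energy threshold, and the function names, the threshold, and the frame counts are all hypothetical values chosen for illustration.

```python
import math


def rms(frame):
    """Root-mean-square amplitude of one frame of PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def segment_on_silence(frames, threshold=500.0, silence_frames=3):
    """Group loud frames into utterances (illustrative only).

    An utterance is closed once `silence_frames` quiet frames occur in a row.
    Quiet frames themselves are dropped rather than stored.
    """
    utterances = []
    current = []
    quiet_run = 0
    for frame in frames:
        if rms(frame) >= threshold:
            current.append(frame)
            quiet_run = 0
        elif current:
            quiet_run += 1
            if quiet_run >= silence_frames:
                # Enough silence: this chunk would be sent for transcription.
                utterances.append(current)
                current = []
                quiet_run = 0
    if current:
        # Flush whatever speech remains at the end of the stream.
        utterances.append(current)
    return utterances
```

In a real pipeline each completed utterance would be handed to the transcription request; the plugin handles this internally, so the sketch is only meant to show the idea.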
Installation
Install the Sarvam AI-enabled VideoSDK Agents package:
pip install "videosdk-plugins-sarvamai"
Importing
from videosdk.plugins.sarvamai import SarvamAISTT
Authentication
The Sarvam plugin requires a Sarvam API key.
Set SARVAMAI_API_KEY in your .env file.
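If you want to fail early when the key is missing, a small check like the following can run at startup. This is a sketch, not part of the SDK: the helper name require_sarvam_api_key is hypothetical, the variable name SARVAMAI_API_KEY is taken from the configuration options on this page, and the check assumes the .env file has already been loaded into the process environment.

```python
import os


def require_sarvam_api_key(env=None):
    """Return the Sarvam API key from the environment or raise a clear error.

    Hypothetical helper for illustration; `env` defaults to os.environ and
    can be replaced with a plain dict in tests.
    """
    if env is None:
        env = os.environ
    key = env.get("SARVAMAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("SARVAMAI_API_KEY is not set; add it to your .env file")
    return key
```

Calling this before building the pipeline surfaces a missing credential immediately instead of at the first transcription request.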
Example Usage
from videosdk.plugins.sarvamai import SarvamAISTT
from videosdk.agents import CascadingPipeline
# Initialize the Sarvam AI STT model
stt = SarvamAISTT(
    # Omit api_key when SARVAMAI_API_KEY is set in your .env file
    api_key="your-sarvam-ai-api-key",
    model="saaras:v3",
    language="en-IN",
)
# Add stt to cascading pipeline
pipeline = CascadingPipeline(stt=stt)
Note: When using a .env file for credentials, don't pass them as arguments to model instances or context objects. The SDK automatically reads environment variables, so omit api_key and other credential parameters from your code.
Configuration Options
- api_key: (str) Your Sarvam AI API key. Can also be set via the SARVAMAI_API_KEY environment variable.
- model: (str) The Sarvam AI model to use (default: "saaras:v3").
- language: (str) Language code for transcription (default: "en-IN").
- input_sample_rate: (int) The sample rate of the audio from the source in Hz (default: 48000).
- output_sample_rate: (int) The sample rate to which the audio is resampled before sending for transcription (default: 16000).
- mode: (str) Mode of operation. Only applicable for saaras:v3. Allowed values: "transcribe", "translate", "verbatim", "translit", "codemix" (default: "transcribe" for saaras:v3, None for other models).
- high_vad_sensitivity: (bool) Whether to use high sensitivity voice activity detection (default: None).
- flush_signal: (bool) Whether to send flush signal (default: None).
- translation: (bool) Enable speech-to-text translation. Supported on saaras:v3 and saaras:v2.5 models. When enabled, routes to the translation endpoint (default: False).
- prompt: (str) Prompt to guide the translation. Only applicable when translation is True (default: None).
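Several of these options only apply in combination (mode with saaras:v3, prompt with translation). The sketch below checks those combinations before constructing the model. The helper build_stt_options and the ALLOWED_MODES set are hypothetical and not part of the plugin; only the option names and allowed values come from the list above.

```python
ALLOWED_MODES = {"transcribe", "translate", "verbatim", "translit", "codemix"}


def build_stt_options(model="saaras:v3", language="en-IN",
                      mode=None, translation=False, prompt=None):
    """Validate option combinations and return keyword arguments (illustrative)."""
    options = {"model": model, "language": language, "translation": translation}
    if mode is not None:
        # mode is documented as applicable to saaras:v3 only.
        if model != "saaras:v3":
            raise ValueError("mode is only applicable to saaras:v3")
        if mode not in ALLOWED_MODES:
            raise ValueError(f"unsupported mode: {mode!r}")
        options["mode"] = mode
    if prompt is not None:
        # prompt is documented as applicable only when translation is True.
        if not translation:
            raise ValueError("prompt only applies when translation is True")
        options["prompt"] = prompt
    return options


# Assumed usage, with the API key read from the environment:
# stt = SarvamAISTT(**build_stt_options(mode="codemix"))
```

Validating up front keeps a mistyped mode or a stray prompt from reaching the transcription request, where the failure would be harder to trace.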
Additional Resources
The following resources provide more information about using Sarvam AI with the VideoSDK Agents SDK.
- Sarvam docs: Sarvam's full docs site.

