Deepgram STT
The Deepgram STT provider enables your agent to use Deepgram's advanced speech-to-text models for high-accuracy, real-time audio transcription.
Installation
Install the Deepgram-enabled VideoSDK Agents package:
pip install "videosdk-plugins-deepgram"
Importing
from videosdk.plugins.deepgram import DeepgramSTT
Example Usage
from videosdk.plugins.deepgram import DeepgramSTT
from videosdk.agents import CascadingPipeline

# Initialize the Deepgram STT model
stt = DeepgramSTT(
    # Omit api_key when DEEPGRAM_API_KEY is set in .env; the SDK reads it automatically
    api_key="your-deepgram-api-key",
    model="nova-2",
    language="en-US",
    interim_results=True,
    punctuate=True,
    smart_format=True,
)

# Add the STT model to a cascading pipeline
pipeline = CascadingPipeline(stt=stt)
Note
When you use a .env file for credentials, don't also pass them as arguments to model instances or context objects. The SDK automatically reads environment variables, so omit api_key, videosdk_auth, and other credential parameters from your code.
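As a pre-flight check, it can help to confirm that a key is actually available before the agent starts. The sketch below is illustrative only: `resolve_api_key` is a hypothetical helper, not part of the SDK, and the precedence it applies (explicit argument first, then `DEEPGRAM_API_KEY`) is an assumption rather than documented SDK behavior. Loading the .env file into the process environment is a separate step that is not shown here.

```python
import os


def resolve_api_key(explicit_key=None, env=None):
    """Hypothetical helper: return a Deepgram key or fail early.

    Assumed precedence: an explicitly passed key wins, otherwise the
    DEEPGRAM_API_KEY environment variable is used.
    """
    env = os.environ if env is None else env
    key = explicit_key or env.get("DEEPGRAM_API_KEY")
    if not key:
        raise ValueError("Set DEEPGRAM_API_KEY or pass api_key explicitly")
    return key
```

Calling `resolve_api_key()` at startup surfaces a missing credential as a clear error instead of a failed connection later on.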
Configuration Options
api_key: Your Deepgram API key (can also be set via environment variable)
model: The Deepgram model to use (e.g., "nova-2", "nova-3", "whisper-large")
language: (str) Language code for transcription (default: "en-US")
interim_results: (bool) Enable real-time partial transcription results (default: True)
punctuate: (bool) Add punctuation to transcription (default: True)
smart_format: (bool) Apply intelligent formatting to output (default: True)
sample_rate: (int) Audio sample rate in Hz (default: 48000)
endpointing: (int) Silence detection threshold in milliseconds (default: 50)
filler_words: (bool) Include filler words like "uh", "um" in transcription (default: True)
base_url: (str) WebSocket endpoint URL (default: "wss://api.deepgram.com/v1/listen")
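To see how these options might fit together, the sketch below assembles a streaming URL from the defaults listed above. It is not taken from the plugin's source: `build_listen_url` is a hypothetical helper, and it assumes each option maps to a same-named query parameter on the endpoint given by base_url. No default is listed for model, so the caller supplies it.

```python
from urllib.parse import urlencode

# Defaults copied from the option list above (model has no listed default).
DOCUMENTED_DEFAULTS = {
    "language": "en-US",
    "interim_results": True,
    "punctuate": True,
    "smart_format": True,
    "sample_rate": 48000,
    "endpointing": 50,
    "filler_words": True,
}


def build_listen_url(base_url="wss://api.deepgram.com/v1/listen", **overrides):
    """Hypothetical helper: merge overrides into the defaults and encode them."""
    options = {**DOCUMENTED_DEFAULTS, **overrides}
    # Assumption: booleans are sent as lowercase "true"/"false" strings.
    query = {
        name: str(value).lower() if isinstance(value, bool) else value
        for name, value in options.items()
    }
    return f"{base_url}?{urlencode(query)}"


if __name__ == "__main__":
    # Example: wait for 300 ms of silence before ending an utterance.
    print(build_listen_url(model="nova-2", endpointing=300))
```

Keeping the defaults in one dictionary makes it easy to see which values a given call overrides.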