Gladia STT

The Gladia STT provider enables your agent to use Gladia's fast and accurate speech-to-text models for real-time audio transcription with support for multiple languages and code-switching.

Installation

Install the Gladia-enabled VideoSDK Agents package:

pip install "videosdk-plugins-gladia"

Authentication

The Gladia plugin requires a Gladia API key.

Set GLADIA_API_KEY in your .env file.
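
Before constructing the STT instance, it can help to confirm that the key is visible to the process. The following is a minimal sketch, not part of the plugin: the helper name require_gladia_api_key is hypothetical, and it assumes your .env file has already been loaded into the process environment (for example, by a dotenv loader or your process manager).

```python
import os
from typing import Mapping, Optional


def require_gladia_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return GLADIA_API_KEY from the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of videosdk-plugins-gladia.
    """
    # Fall back to the real process environment when no mapping is supplied.
    env = os.environ if env is None else env
    key = env.get("GLADIA_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GLADIA_API_KEY is not set; add it to your .env file or export it."
        )
    return key
```

Calling require_gladia_api_key() at startup fails fast with a readable message instead of surfacing a missing key later, when the first audio frame is sent.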

Importing

from videosdk.plugins.gladia import GladiaSTT

Example Usage

from videosdk.plugins.gladia import GladiaSTT
from videosdk.agents import CascadingPipeline

# Initialize the Gladia STT model
stt = GladiaSTT(
    # Omit api_key when GLADIA_API_KEY is already set in your .env file;
    # pass it explicitly only if you are not using the environment variable.
    api_key="your-gladia-api-key",
    languages=["en"],
    code_switching=True,
    receive_partial_transcripts=True,
)

# Add stt to a cascading pipeline
pipeline = CascadingPipeline(stt=stt)

Note

When using a .env file for credentials, you do not need to pass api_key as an argument to the model instance; the SDK reads it automatically.

Configuration Options

  • api_key: (str, optional) Your Gladia API key. Can also be set via the GLADIA_API_KEY environment variable.
  • model: (str, optional) The model to use. Defaults to "solaria-1".
  • languages: (List[str], optional) A list of language codes to detect (e.g., ["en", "fr"]). Defaults to ["en"].
  • code_switching: (bool, optional) Enables automatic language switching between the provided languages. Defaults to True.
  • input_sample_rate: (int, optional) The sample rate of the incoming audio. Defaults to 48000.
  • output_sample_rate: (int, optional) The sample rate Gladia should process. Defaults to 16000.
  • encoding: (str, optional) The audio encoding format. Defaults to "wav/pcm".
  • bit_depth: (int, optional) The bit depth of the audio. Defaults to 16.
  • channels: (int, optional) The number of audio channels. Defaults to 1 (mono).
  • receive_partial_transcripts: (bool, optional) Set to True to receive interim transcription results for lower latency. Defaults to False.
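
The options above can be collected in one place and sanity-checked before they reach the constructor. The sketch below is illustrative only: GLADIA_DEFAULTS, build_gladia_options, and pcm_bytes_per_second are hypothetical names, the default values are copied from the list above, and the byte-rate calculation assumes the processed audio is uncompressed PCM.

```python
from typing import Any, Dict

# Defaults copied from the option list above (api_key is deliberately left out
# so it can come from the GLADIA_API_KEY environment variable).
GLADIA_DEFAULTS: Dict[str, Any] = {
    "model": "solaria-1",
    "languages": ["en"],
    "code_switching": True,
    "input_sample_rate": 48000,
    "output_sample_rate": 16000,
    "encoding": "wav/pcm",
    "bit_depth": 16,
    "channels": 1,
    "receive_partial_transcripts": False,
}


def build_gladia_options(**overrides: Any) -> Dict[str, Any]:
    """Merge overrides into the documented defaults, rejecting unknown names."""
    unknown = set(overrides) - set(GLADIA_DEFAULTS)
    if unknown:
        raise TypeError(f"Unknown Gladia option(s): {sorted(unknown)}")
    options = dict(GLADIA_DEFAULTS)
    # Copy the list so callers cannot mutate the shared default.
    options["languages"] = list(GLADIA_DEFAULTS["languages"])
    options.update(overrides)
    return options


def pcm_bytes_per_second(options: Dict[str, Any]) -> int:
    """Byte rate of uncompressed PCM: sample rate x bytes per sample x channels."""
    return (
        options["output_sample_rate"]
        * (options["bit_depth"] // 8)
        * options["channels"]
    )


options = build_gladia_options(
    languages=["en", "fr"],
    receive_partial_transcripts=True,
)
print(options["languages"])           # ['en', 'fr']
print(pcm_bytes_per_second(options))  # 16000 * 2 * 1 = 32000
```

Assuming the constructor accepts the keyword names exactly as listed above, the resulting dictionary could be passed as GladiaSTT(**options). With the defaults, 48000 Hz input is processed at 16000 Hz, a 3:1 reduction in sample rate.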
