Cascading Pipeline

The Cascading Pipeline component provides a flexible, modular approach to building AI agents by allowing you to mix and match different components for Speech-to-Text (STT), Large Language Models (LLM), Text-to-Speech (TTS), Voice Activity Detection (VAD), and Turn Detection.

Key Features:

Modular Component Selection - Choose different providers for each component
Flexible Configuration - Mix and match STT, LLM, TTS, VAD, and Turn Detection
Custom Processing - Add custom processing for STT and LLM outputs
Provider Agnostic - Support for multiple AI service providers
Advanced Control - Fine-tune each component independently

Example Implementation:

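import os

from videosdk.agents import CascadingPipeline
from videosdk.plugins.openai import OpenAILLM
from videosdk.plugins.deepgram import DeepgramSTT
# Assumed import path for ElevenLabsTTS, following the other plugins' naming pattern
from videosdk.plugins.elevenlabs import ElevenLabsTTS
from videosdk.plugins.silero import SileroVAD
from videosdk.plugins.turn_detector import TurnDetector

# Speech-to-Text: transcribes the user's audio
stt = DeepgramSTT(
    api_key=os.getenv("DEEPGRAM_API_KEY"),
    model="nova-2",
    language="en",
)

# Large Language Model: generates the agent's reply
llm = OpenAILLM(
    api_key=os.getenv("OPENAI_API_KEY"),
    model="gpt-4o",
)

# Text-to-Speech: synthesizes the reply as audio
tts = ElevenLabsTTS(
    api_key=os.getenv("ELEVENLABS_API_KEY"),
    voice_id="your-voice-id",
)

# Voice Activity Detection
vad = SileroVAD(threshold=0.35)

# Turn Detection
turn_detector = TurnDetector(threshold=0.8)

# Combine the independently configured components into one pipeline
pipeline = CascadingPipeline(
    stt=stt, llm=llm, tts=tts, vad=vad, turn_detector=turn_detector
)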
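
The example reads its API keys with os.getenv, which returns None when a variable is unset, so a missing key is passed to the provider as None. A minimal sketch of a pre-flight check follows. It uses only the Python standard library, and the helper names missing_env_vars and require_env_vars are hypothetical, not part of the SDK.

```python
import os

# Environment variables read by the example above.
REQUIRED_KEYS = ("DEEPGRAM_API_KEY", "OPENAI_API_KEY", "ELEVENLABS_API_KEY")


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


def require_env_vars(names, environ=None):
    """Raise RuntimeError listing every missing variable (hypothetical helper)."""
    missing = missing_env_vars(names, environ)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling require_env_vars(REQUIRED_KEYS) before constructing the components reports every missing key at once instead of one provider at a time.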

Use Cases:

Multi-language Support - Use specialized STT for different languages
Cost Optimization - Mix premium and cost-effective services
Custom Voice Processing - Add domain-specific processing logic
Performance Optimization - Choose fastest providers for each component
Compliance Requirements - Use specific providers for regulatory compliance
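
The multi-language use case can be sketched as a lookup from language code to STT settings. This is an illustration under assumptions: the table STT_SETTINGS and the helper stt_settings_for are hypothetical, and only the model and language parameter names (and the values "nova-2" and "en") come from the example implementation.

```python
# Hypothetical per-language STT settings. The keys mirror the `model` and
# `language` parameters used in the example implementation; the "es" entry
# is illustrative only.
STT_SETTINGS = {
    "en": {"model": "nova-2", "language": "en"},
    "es": {"model": "nova-2", "language": "es"},
}
DEFAULT_LANGUAGE = "en"


def stt_settings_for(language):
    """Return a copy of the STT keyword arguments for a language code.

    Unknown language codes fall back to DEFAULT_LANGUAGE.
    """
    settings = STT_SETTINGS.get(language, STT_SETTINGS[DEFAULT_LANGUAGE])
    return dict(settings)  # copy so callers cannot mutate the shared table
```

Assuming the STT constructor accepts these keyword arguments as in the example, the result could be unpacked into it, for instance DeepgramSTT(api_key=..., **stt_settings_for("es")). The same lookup pattern applies to the cost and compliance use cases by keying the table on a pricing tier or region instead of a language.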

