Fallback Adapter
The Fallback Adapter provides automatic failover between multiple STT, LLM, or TTS providers. It switches providers under two conditions: on errors, when a provider fails or becomes unavailable, and on latency, when a provider stays slower than its configured latency budget. In both cases, the system switches to the next configured provider without interrupting the session.
Features
- Automatic Fallback: Switches to lower-priority providers if the primary provider fails.
- Latency-based Fallback: Optionally switches providers when a component stays above its latency budget for several consecutive turns.
- Cooldown-based Retry: Implements a cooldown period before retrying a failed provider, preventing immediate repeated failures.
- Auto-Recovery: Automatically switches back to a higher-priority provider once it becomes healthy again.
- Permanent Disable: Permanently disables a provider after a configured number of failed recovery attempts.
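To make the idea of priority-ordered failover concrete, here is a minimal, library-independent sketch. It is not the adapter's implementation: `call_with_fallback`, `AllProvidersFailed`, and the two stand-in provider functions are hypothetical names used only for illustration, and the assumption is simply that providers are tried in the order they are listed.

```python
from typing import Callable, List, Sequence


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the list has failed."""


def call_with_fallback(providers: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each provider in list order and return the first successful result."""
    errors: List[Exception] = []
    for provider in providers:
        try:
            return provider(request)
        except Exception as exc:  # a real adapter would likely catch narrower errors
            errors.append(exc)
    raise AllProvidersFailed(f"{len(errors)} provider(s) failed")


def flaky_primary(text: str) -> str:
    # Stand-in for a primary provider that is currently unavailable.
    raise ConnectionError("primary unavailable")


def healthy_secondary(text: str) -> str:
    # Stand-in for a lower-priority provider that works.
    return f"secondary:{text}"


result = call_with_fallback([flaky_primary, healthy_secondary], "hello")
```

In this sketch the caller never sees the primary's error. It only sees a failure when every listed provider has failed, which mirrors the "without interrupting the session" goal described above.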
Error-based Fallback
The following example configures error-based fallback for STT, LLM, and TTS in your agent. When a provider fails or becomes unavailable, the system switches to the next configured provider.
from videosdk.agents import FallbackSTT, FallbackLLM, FallbackTTS
from videosdk.agents.plugins import OpenAISTT, OpenAILLM, OpenAITTS, DeepgramSTT, CerebrasLLM, CartesiaTTS
# Configure Fallback STT
stt_provider = FallbackSTT(
    [OpenAISTT(), DeepgramSTT()],
    temporary_disable_sec=30.0,
    permanent_disable_after_attempts=3,
)

# Configure Fallback LLM
llm_provider = FallbackLLM(
    [OpenAILLM(model="gpt-4o-mini"), CerebrasLLM()],
    temporary_disable_sec=30.0,
    permanent_disable_after_attempts=3,
)

# Configure Fallback TTS
tts_provider = FallbackTTS(
    [OpenAITTS(voice="alloy"), CartesiaTTS()],
    temporary_disable_sec=30.0,
    permanent_disable_after_attempts=3,
)
Configuration Options
You can configure the error-based fallback behavior using the following parameters:
| Parameter | Description |
|---|---|
| temporary_disable_sec | The duration (in seconds) to wait before retrying a failed provider. |
| permanent_disable_after_attempts | The maximum number of recovery attempts allowed before a provider is permanently disabled. |
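The interaction between the two parameters can be sketched as a small state tracker. This is an illustrative model under stated assumptions, not the library's internal code: `ProviderHealth` and its methods are hypothetical, the clock is passed in explicitly so the example is deterministic, and it assumes that a successful recovery resets the attempt counter.

```python
from typing import Optional


class ProviderHealth:
    """Hypothetical per-provider bookkeeping for cooldown and permanent disable."""

    def __init__(self, temporary_disable_sec: float = 30.0,
                 permanent_disable_after_attempts: int = 3) -> None:
        self.temporary_disable_sec = temporary_disable_sec
        self.permanent_disable_after_attempts = permanent_disable_after_attempts
        self.disabled_at: Optional[float] = None  # None means the provider is healthy
        self.failed_recoveries = 0
        self.permanently_disabled = False

    def mark_failed(self, now: float) -> None:
        """Record the initial failure and start the cooldown."""
        self.disabled_at = now

    def can_retry(self, now: float) -> bool:
        """True once the cooldown has elapsed, unless permanently disabled."""
        if self.permanently_disabled or self.disabled_at is None:
            return False
        return now - self.disabled_at >= self.temporary_disable_sec

    def record_recovery(self, succeeded: bool, now: float) -> None:
        """Record the outcome of one recovery attempt."""
        if succeeded:
            self.disabled_at = None
            self.failed_recoveries = 0  # assumption: success resets the counter
            return
        self.failed_recoveries += 1
        self.disabled_at = now  # restart the cooldown
        if self.failed_recoveries >= self.permanent_disable_after_attempts:
            self.permanently_disabled = True


health = ProviderHealth(temporary_disable_sec=30.0, permanent_disable_after_attempts=3)
health.mark_failed(now=0.0)
retry_too_early = health.can_retry(now=10.0)       # still cooling down
retry_after_cooldown = health.can_retry(now=30.0)  # cooldown elapsed

# Three failed recovery attempts in a row reach the configured limit.
for attempt_time in (30.0, 60.0, 90.0):
    health.record_recovery(succeeded=False, now=attempt_time)
```

With the values used in the configuration above, a failed provider in this model is retried no sooner than 30 seconds after each failure and is given up on after the third failed recovery attempt.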
Latency-based Fallback
Beyond hard failures, the Fallback Adapter can switch providers when a healthy provider becomes too slow. This is useful for keeping conversations responsive when a provider degrades without erroring out.
- Latency-based fallback is off by default. Set latency_threshold_ms on a component to enable it.
- Each component measures a relevant latency metric: STT uses stt_latency, LLM uses llm_ttft (time to first token), and TTS uses ttfb (time to first byte).
- A provider is only switched after it stays above the threshold for consecutive_latency_hits turns in a row, avoiding switches caused by a single slow turn.
- Recovery and cooldown for a latency-disabled provider use the same temporary_disable_sec and permanent_disable_after_attempts settings as the error path.
To enable latency-based fallback, add latency_threshold_ms (and optionally consecutive_latency_hits) on top of the error-based configuration:
from videosdk.agents import FallbackSTT, FallbackLLM, FallbackTTS
from videosdk.agents.plugins import OpenAISTT, OpenAILLM, OpenAITTS, DeepgramSTT, CerebrasLLM, CartesiaTTS
# Configure Fallback STT
stt_provider = FallbackSTT(
    [OpenAISTT(), DeepgramSTT()],
    temporary_disable_sec=30.0,
    permanent_disable_after_attempts=3,
    latency_threshold_ms=350,
    consecutive_latency_hits=3,
)

# Configure Fallback LLM
llm_provider = FallbackLLM(
    [OpenAILLM(model="gpt-4o-mini"), CerebrasLLM()],
    temporary_disable_sec=30.0,
    permanent_disable_after_attempts=3,
    latency_threshold_ms=800,
    consecutive_latency_hits=3,
)

# Configure Fallback TTS
tts_provider = FallbackTTS(
    [OpenAITTS(voice="alloy"), CartesiaTTS()],
    temporary_disable_sec=30.0,
    permanent_disable_after_attempts=3,
    latency_threshold_ms=250,
    consecutive_latency_hits=3,
)
Configuration Options
You can configure the latency-based fallback behavior using the following parameters:
| Parameter | Description |
|---|---|
| latency_threshold_ms | Per-component latency budget in milliseconds (STT: stt_latency, LLM: llm_ttft, TTS: ttfb). Off by default. Pass a value to enable latency-based fallback. |
| consecutive_latency_hits | The number of consecutive turns that must exceed latency_threshold_ms before switching providers. Defaults to 3. |
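The "consecutive turns" rule can be illustrated with a small counter. This is a sketch of the behavior described above, not the adapter's actual code: `LatencyMonitor` and `observe` are hypothetical names, and it assumes that a single turn at or below the budget resets the streak.

```python
class LatencyMonitor:
    """Hypothetical tracker for consecutive over-budget turns."""

    def __init__(self, latency_threshold_ms: float, consecutive_latency_hits: int = 3) -> None:
        self.latency_threshold_ms = latency_threshold_ms
        self.consecutive_latency_hits = consecutive_latency_hits
        self.hits = 0

    def observe(self, latency_ms: float) -> bool:
        """Record one turn; return True when a provider switch should be triggered."""
        if latency_ms > self.latency_threshold_ms:
            self.hits += 1
        else:
            self.hits = 0  # assumption: one fast turn resets the streak
        if self.hits >= self.consecutive_latency_hits:
            self.hits = 0  # start counting afresh after a switch
            return True
        return False


# LLM budget of 800 ms, as in the configuration above.
monitor = LatencyMonitor(latency_threshold_ms=800, consecutive_latency_hits=3)
turns_ms = [900, 950, 400, 850, 900, 1200]
decisions = [monitor.observe(ms) for ms in turns_ms]
```

Here the first two slow turns do not trigger a switch because the 400 ms turn breaks the streak. Only the last turn, the third slow one in a row, returns True.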

