Version: 1.0.x

Fallback Adapter

The Fallback Adapter provides automatic failover between multiple STT, LLM, or TTS providers. It switches providers on two conditions: on errors, when a provider fails or becomes unavailable, and on latency, when a provider stays slower than its configured budget. In both cases the system switches to the next configured provider without interrupting the session.

Features

  • Automatic Fallback: Switches to lower-priority providers if the primary provider fails.
  • Latency-based Fallback: Optionally switches providers when a component stays above its latency budget for several consecutive turns.
  • Cooldown-based Retry: Implements a cooldown period before retrying a failed provider, preventing immediate repeated failures.
  • Auto-Recovery: Automatically switches back to a higher-priority provider once it becomes healthy again.
  • Permanent Disable: Permanently disables a provider after a configured number of failed recovery attempts.
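The basic failover idea behind these features can be pictured as a loop that tries providers in priority order. The sketch below is illustrative only and is not the adapter's implementation: call_with_fallback, flaky_primary, and healthy_secondary are hypothetical names, and it assumes that the order of the list is the priority order.

```python
from typing import Callable, List, Sequence


def call_with_fallback(providers: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each provider in priority order; return the first successful result."""
    errors: List[Exception] = []
    for provider in providers:
        try:
            return provider(request)
        except Exception as exc:  # a real adapter would likely catch narrower errors
            errors.append(exc)
    raise RuntimeError(f"all {len(providers)} providers failed: {errors}")


# Hypothetical stand-ins for a failing primary and a healthy secondary provider.
def flaky_primary(text: str) -> str:
    raise ConnectionError("primary unavailable")


def healthy_secondary(text: str) -> str:
    return f"secondary handled: {text}"


result = call_with_fallback([flaky_primary, healthy_secondary], "hello")
print(result)
```

The caller sees a single successful result even though the first provider raised, which is the sense in which the switch does not interrupt the session.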

Error-based Fallback

The following example configures error-based fallback for STT, LLM, and TTS in an agent. When a provider fails or becomes unavailable, the system switches to the next configured provider.

from videosdk.agents import FallbackSTT, FallbackLLM, FallbackTTS
from videosdk.agents.plugins import OpenAISTT, OpenAILLM, OpenAITTS, DeepgramSTT, CerebrasLLM, CartesiaTTS

# Configure Fallback STT
stt_provider = FallbackSTT(
    [OpenAISTT(), DeepgramSTT()],
    temporary_disable_sec=30.0,
    permanent_disable_after_attempts=3,
)

# Configure Fallback LLM
llm_provider = FallbackLLM(
    [OpenAILLM(model="gpt-4o-mini"), CerebrasLLM()],
    temporary_disable_sec=30.0,
    permanent_disable_after_attempts=3,
)

# Configure Fallback TTS
tts_provider = FallbackTTS(
    [OpenAITTS(voice="alloy"), CartesiaTTS()],
    temporary_disable_sec=30.0,
    permanent_disable_after_attempts=3,
)

Configuration Options

You can configure the error-based fallback behavior using the following parameters:

Parameter | Description
temporary_disable_sec | The duration (in seconds) to wait before retrying a failed provider.
permanent_disable_after_attempts | The maximum number of recovery attempts allowed before a provider is permanently disabled.
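To make the interaction between the two parameters concrete, here is a minimal bookkeeping sketch. It is not the library's code: the ProviderHealth class and its methods are hypothetical, and it assumes that a failure on a retry after cooldown counts as one failed recovery attempt and that a success resets the counter.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProviderHealth:
    """Hypothetical per-provider state for cooldown and permanent disable."""
    temporary_disable_sec: float = 30.0
    permanent_disable_after_attempts: int = 3
    disabled_until: Optional[float] = None  # None means the provider is healthy
    failed_recoveries: int = 0
    permanently_disabled: bool = False

    def record_failure(self, now: float) -> None:
        # A failure while already marked unhealthy is a failed recovery attempt.
        if self.disabled_until is not None:
            self.failed_recoveries += 1
            if self.failed_recoveries >= self.permanent_disable_after_attempts:
                self.permanently_disabled = True
                return
        # Start (or restart) the cooldown window.
        self.disabled_until = now + self.temporary_disable_sec

    def record_success(self) -> None:
        # Assumption: one success marks the provider healthy again.
        self.disabled_until = None
        self.failed_recoveries = 0

    def can_try(self, now: float) -> bool:
        if self.permanently_disabled:
            return False
        return self.disabled_until is None or now >= self.disabled_until


health = ProviderHealth(temporary_disable_sec=30.0, permanent_disable_after_attempts=3)
health.record_failure(now=0.0)   # initial failure starts a 30-second cooldown
print(health.can_try(now=10.0))  # still cooling down
print(health.can_try(now=30.0))  # cooldown elapsed, a retry is allowed
```

Under these assumptions, a provider that keeps failing is retried once per cooldown window and is dropped for good after the third failed retry.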

Latency-based Fallback

Beyond hard failures, the Fallback Adapter can switch providers when a healthy provider becomes too slow. This is useful for keeping conversations responsive when a provider degrades without erroring out.

  • Latency-based fallback is off by default. Set latency_threshold_ms on a component to enable it.
  • Each component measures a relevant latency metric: STT uses stt_latency, LLM uses llm_ttft (time to first token), and TTS uses ttfb (time to first byte).
  • A provider is only switched after it stays above the threshold for consecutive_latency_hits turns in a row, avoiding switches caused by a single slow turn.
  • Recovery and cooldown for a latency-disabled provider use the same temporary_disable_sec and permanent_disable_after_attempts settings as the error path.
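The consecutive-hit rule described above can be sketched as a small streak counter. This is an illustration under stated assumptions, not the adapter's internals: LatencyMonitor and observe are hypothetical names, "above the threshold" is taken to mean strictly greater, and a single turn at or below the budget is assumed to reset the streak.

```python
class LatencyMonitor:
    """Hypothetical tracker for consecutive over-budget turns."""

    def __init__(self, latency_threshold_ms: float, consecutive_latency_hits: int = 3):
        self.latency_threshold_ms = latency_threshold_ms
        self.consecutive_latency_hits = consecutive_latency_hits
        self._streak = 0

    def observe(self, latency_ms: float) -> bool:
        """Record one turn; return True when the provider should be switched."""
        if latency_ms > self.latency_threshold_ms:
            self._streak += 1
        else:
            self._streak = 0  # one turn within budget breaks the streak
        if self._streak >= self.consecutive_latency_hits:
            self._streak = 0
            return True
        return False


# Example with an 800 ms LLM time-to-first-token budget.
monitor = LatencyMonitor(latency_threshold_ms=800, consecutive_latency_hits=3)
decisions = [monitor.observe(ms) for ms in (900, 950, 400, 850, 900, 1200)]
print(decisions)
```

In this run the 400 ms turn resets the streak, so the switch is only signalled on the sixth turn, after three over-budget turns in a row.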

To enable latency-based fallback, add latency_threshold_ms (and optionally consecutive_latency_hits) on top of the error-based configuration:

from videosdk.agents import FallbackSTT, FallbackLLM, FallbackTTS
from videosdk.agents.plugins import OpenAISTT, OpenAILLM, OpenAITTS, DeepgramSTT, CerebrasLLM, CartesiaTTS

# Configure Fallback STT
stt_provider = FallbackSTT(
    [OpenAISTT(), DeepgramSTT()],
    temporary_disable_sec=30.0,
    permanent_disable_after_attempts=3,
    latency_threshold_ms=350,
    consecutive_latency_hits=3,
)

# Configure Fallback LLM
llm_provider = FallbackLLM(
    [OpenAILLM(model="gpt-4o-mini"), CerebrasLLM()],
    temporary_disable_sec=30.0,
    permanent_disable_after_attempts=3,
    latency_threshold_ms=800,
    consecutive_latency_hits=3,
)

# Configure Fallback TTS
tts_provider = FallbackTTS(
    [OpenAITTS(voice="alloy"), CartesiaTTS()],
    temporary_disable_sec=30.0,
    permanent_disable_after_attempts=3,
    latency_threshold_ms=250,
    consecutive_latency_hits=3,
)

Configuration Options

You can configure the latency-based fallback behavior using the following parameters:

Parameter | Description
latency_threshold_ms | Per-component latency budget in milliseconds (STT stt_latency, LLM llm_ttft, TTS ttfb). Off by default. Pass a value to enable latency-based fallback.
consecutive_latency_hits | The number of consecutive turns that must exceed latency_threshold_ms before switching providers. Defaults to 3.

