Inference Pricing
VideoSDK provides access to state-of-the-art AI models for building intelligent voice and video applications. Our inference API supports multiple providers including Google and Sarvam AI.
Contact sales for preferred rates on high-volume usage.
Speech-to-Text (STT)
Transcribe audio to text with high accuracy and low latency.
| Model | Provider | Price (per minute) | Documentation |
|---|---|---|---|
| Google STT Chirp 2 | Google | $0.01200 | Pricing |
| Speech-to-Text | Sarvam | $0.00560 | Pricing |
Large Language Models (LLM)
Build conversational AI agents with the latest language models from Google.
| Model | Provider | Input (per 1M tokens) | Output (per 1M tokens) | Documentation |
|---|---|---|---|---|
| Gemini 2.5 Flash | Google | $0.30 | $0.85 | Pricing |
| Gemini 2.5 Flash Lite | Google | $0.10 | $0.40 | Pricing |
| Gemini 2.0 Flash | Google | $0.10 | $0.40 | Pricing |
| Gemini 2.0 Flash Lite | Google | $0.07 | $0.30 | Pricing |
Text-to-Speech (TTS)
Convert text to natural-sounding speech with multiple quality tiers.
| Model | Provider | Price (per 1M chars) | Documentation |
|---|---|---|---|
| Google TTS Studio | Google | $160.00 | Pricing |
| Google TTS Chirp 3 | Google | $30.00 | Pricing |
| Google Cloud TTS Neural | Google | $16.00 | Pricing |
| Google Cloud TTS Standard | Google | $4.00 | Pricing |
| Text-to-Speech (TTS) | Sarvam | $17.00 | Pricing |
Speech-to-Speech Models
Enable real-time voice conversations with native audio processing.
| Model | Provider | Input (per 1M tokens) | Output (per 1M tokens) | Documentation |
|---|---|---|---|---|
| Gemini Live 2.5 Flash Native Audio | Google | $3.00 | $12.00 | Pricing |
Billing & Usage
How Inference Pricing Works
- Pay-as-you-go: You are charged only for what you use
- No minimum commitment: Start small and scale as needed
- Usage-based billing: Charges are calculated based on actual API usage
- Transparent usage charges: You can track your usage through the VideoSDK Dashboard
Usage Calculation
LLM & Speech-to-Speech Models
- Input tokens: Charged per 1 million input tokens processed
- Output tokens: Charged per 1 million output tokens generated
- Token counts are calculated based on the model's tokenizer
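The per-token arithmetic can be sketched as follows. This is a minimal estimate, not an official billing implementation: the function name `llm_cost` and its parameters are hypothetical, and the token counts are assumed to come from the model's own usage reporting.

```python
from decimal import Decimal

TOKENS_PER_UNIT = Decimal(1_000_000)  # prices are quoted per 1M tokens


def llm_cost(input_tokens: int, output_tokens: int,
             input_price: str, output_price: str) -> Decimal:
    """Estimate cost in USD. Prices are per 1M tokens, passed as strings
    so that Decimal keeps them exact (hypothetical helper)."""
    input_cost = Decimal(input_tokens) * Decimal(input_price) / TOKENS_PER_UNIT
    output_cost = Decimal(output_tokens) * Decimal(output_price) / TOKENS_PER_UNIT
    return input_cost + output_cost


# Gemini 2.5 Flash rates from the LLM table: $0.30 input, $0.85 output.
estimate = llm_cost(1_200_000, 300_000, "0.30", "0.85")
print(f"${estimate:.4f}")  # $0.6150
```

`Decimal` is used instead of `float` so that small per-token amounts do not pick up binary rounding error.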
Text-to-Speech Models
- Characters: Charged per 1 million characters converted to speech
- All characters in the input text are counted, including spaces and punctuation
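A rough way to estimate a TTS charge before sending text is shown below. The helper `tts_cost` is hypothetical, and it assumes that Python's `len()` (which counts Unicode code points) matches how characters are billed; how markup or multi-byte characters are counted is not stated here.

```python
from decimal import Decimal

CHARS_PER_UNIT = Decimal(1_000_000)  # prices are quoted per 1M characters


def tts_cost(text: str, price_per_million_chars: str) -> Decimal:
    """Estimate cost in USD for converting `text` to speech (hypothetical helper)."""
    billable_chars = len(text)  # spaces and punctuation are included
    return Decimal(billable_chars) * Decimal(price_per_million_chars) / CHARS_PER_UNIT


text = "Hello, world!"
print(len(text))  # 13 (the comma, space, and exclamation mark all count)

# Google Cloud TTS Neural rate from the TTS table: $16.00 per 1M characters.
print(tts_cost(text, "16.00"))
```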
Speech-to-Text Models
- Audio duration: Charged per minute of audio transcribed
- Partial minutes are rounded up to the nearest minute
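The round-up rule can be sketched like this. The helper `stt_cost` is hypothetical, and it assumes rounding is applied to a single audio duration; whether rounding happens per request, per session, or per billing period is not specified here.

```python
import math
from decimal import Decimal


def stt_cost(duration_seconds: float, price_per_minute: str) -> Decimal:
    """Estimate cost in USD for transcribing audio (hypothetical helper)."""
    # Partial minutes are rounded up to the next whole minute.
    billable_minutes = math.ceil(duration_seconds / 60)
    return Decimal(billable_minutes) * Decimal(price_per_minute)


# Google STT Chirp 2 rate from the STT table: $0.01200 per minute.
# 90 seconds of audio is billed as 2 minutes under the round-up assumption.
print(stt_cost(90, "0.01200"))
```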
Example Calculations
LLM Usage Example
If you process 5 million input tokens and generate 2 million output tokens using Gemini 2.0 Flash:
- Input cost: 5 × $0.10 = $0.50
- Output cost: 2 × $0.40 = $0.80
- Total cost: $1.30
TTS Usage Example
If you convert 10 million characters to speech using Google Cloud TTS Standard:
- Cost: 10 × $4.00 = $40.00
- Total cost: $40.00
STT Usage Example
If you transcribe 120 minutes of audio using Google STT Chirp 2:
- Cost: 120 × $0.01200 = $1.44
- Total cost: $1.44
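The three worked examples above can be reproduced with a short script. This is only a sketch: the dictionary names are hypothetical, and the rates are copied by hand from the tables on this page rather than fetched from any API.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)

# Rates in USD, copied from the pricing tables (hypothetical structure).
LLM_PRICES = {"Gemini 2.0 Flash": (Decimal("0.10"), Decimal("0.40"))}  # (input, output) per 1M tokens
TTS_PRICES = {"Google Cloud TTS Standard": Decimal("4.00")}  # per 1M characters
STT_PRICES = {"Google STT Chirp 2": Decimal("0.01200")}  # per minute

# LLM: 5M input tokens and 2M output tokens.
input_price, output_price = LLM_PRICES["Gemini 2.0 Flash"]
llm_total = 5_000_000 * input_price / MILLION + 2_000_000 * output_price / MILLION

# TTS: 10M characters.
tts_total = 10_000_000 * TTS_PRICES["Google Cloud TTS Standard"] / MILLION

# STT: 120 whole minutes of audio.
stt_total = 120 * STT_PRICES["Google STT Chirp 2"]

print(f"LLM: ${llm_total:.2f}")  # LLM: $1.30
print(f"TTS: ${tts_total:.2f}")  # TTS: $40.00
print(f"STT: ${stt_total:.2f}")  # STT: $1.44

# Each model is billed separately, so a combined estimate is the sum.
grand_total = llm_total + tts_total + stt_total
print(f"Total: ${grand_total:.2f}")  # Total: $42.74
```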
Frequently Asked Questions
1. Are there any free tiers for inference APIs?
Currently, inference APIs are billed on a pay-as-you-go basis without a free tier. However, we offer competitive pricing and volume discounts for high-usage customers.
2. Can I use multiple models in my application?
Yes, you can use any combination of models from our supported providers. Each model will be billed separately based on its usage.
3. How do I monitor my inference usage?
You can track your inference API usage through the VideoSDK Dashboard.
4. What happens if I exhaust my balance?
You will receive usage alerts, and you can enable auto-recharge in the VideoSDK Dashboard to avoid service interruptions.
5. Are there volume discounts available?
Yes. If you expect high-volume usage, please contact our sales team.
6. Which providers are supported?
We currently support Google (Gemini, Google Cloud TTS/STT) and Sarvam AI. We're continuously adding new providers and models based on customer demand.
7. How are partial units billed?
- Tokens: Charged per actual token count
- Characters: Charged per actual character count
- Audio minutes: Charged per minute of audio, with partial minutes rounded up to the nearest minute
8. Can I switch between models?
Yes, you can switch between different models at any time, and you'll be charged for each model's actual usage.

