---
title: Introduction
hide_title: false
hide_table_of_contents: false
description: "Introduce yourself to the VideoSDK AI Agent SDK, a Python framework for integrating AI-powered voice agents into VideoSDK meetings. Understand its high-level architecture and how it bridges AI models with users for real-time interactions."
pagination_label: "Introduction"
keywords:
- AI Agent SDK
- VideoSDK Agents
- Introduction
- Python SDK
- Voice AI
- Real-time Communication
- AI Integration
- VideoSDK Cloud
- Conversational AI
- Build AI Agents
image: img/videosdklive-thumbnail.jpg
sidebar_position: 1
sidebar_label: Introduction
slug: introduction
---
import { AgentCardGrid, GithubIcon, RobotIcon, DocumentIcon, PlayIcon, CodeIcon, ExternalLinkIcon, SettingsIcon, TelephonyIcon, WaveformIcon, DocsIcon, CloudIcon, PuzzlePieceSimpleIcon, MetricsIcon, BulbIcon, DiscordIcon, SupportIcon } from '@site/src/components/agent/cards';
# AI Voice Agents
The VideoSDK AI Agent SDK is a powerful Python framework for developers to seamlessly integrate intelligent, real-time voice agents into any application. Bridge the gap between advanced AI models and human interaction, creating natural, engaging, and responsive conversational experiences.
- [AI Telephony Agent Quickstart](/ai_agents/ai-phone-agent-quick-start): Build an AI telephony agent in less than 10 minutes.
- [GitHub Repository](https://github.com/videosdk-live/agents): The VideoSDK agent code and examples.
- [Agent Starter Apps](/ai_agents/agent-runtime/connect-agent/web-integrations/agent-starter-react): Ready-to-run starter apps to get your AI agent up and running fast.
## The Architecture
The VideoSDK AI Agents framework connects four key components to enable seamless AI voice interactions:
- Your **Infrastructure** hosts the agent management system
- The **Agent Worker** creates and manages AI sessions
- The **VideoSDK Room** handles real-time meeting operations
- **User Devices** connect through web, mobile apps, or phone calls to interact with intelligent agents that can listen, understand, and respond naturally in real-time conversations.

## Use Cases
Here are some real-world applications where VideoSDK AI Agents can be deployed to create intelligent, voice-enabled experiences across different industries and scenarios. Use them as they are, or as a reference when building your own customized agent.
## The Building Blocks
Our SDK is built on four primary, modular components that work together to create powerful and customizable agents. Understand these concepts, and you're ready to build.
- [Deployment Options](/ai_agents/deployments/introduction): Deploy your agent on the cloud or self-host it on your own infrastructure.
- [Observability](/ai_agents/tracing-observability/session-analytics): Monitor and debug with confidence using built-in session analytics, latency tracking, and detailed traces.
- [Plugin Ecosystem](/ai_agents/plugins/realtime/openai): Integrate with dozens of providers like OpenAI, Google, Anthropic, and ElevenLabs for STT, LLM, and TTS.
## Need Help?
If you have any queries, please feel free to reach out to us using one of the following methods:
- [GitHub](https://github.com/videosdk-live/agents/issues): Ask your questions on GitHub.
- [Support](https://www.videosdk.live/contact): Talk to an expert, book a demo, or talk to sales.
## Frequently Asked Questions
**What programming language and version are required?**

The AI Agent SDK is built in Python. You'll need Python 3.12 or higher to use the SDK.

**Can my agent answer phone calls?**

Yes. By integrating with our SIP/telephony services, your AI agent can join a room initiated by a standard phone call. This allows you to build powerful IVR systems, automated appointment schedulers, AI-powered call centers, and more.

**What AI models are supported?**

The SDK supports various AI models including:
- **Real-time Models**: OpenAI, Google Gemini, AWS Nova Sonic
- **LLM Providers**: OpenAI, Google Gemini, Anthropic Claude, Sarvam AI, Cerebras
- **TTS Providers**: ElevenLabs, OpenAI, Google, AWS Polly, Cartesia, and many more
- **STT Providers**: OpenAI Whisper, Deepgram, Google, AssemblyAI, and others

**Can I use my own custom models?**

Absolutely! The SDK's modular architecture allows you to create custom plugins for any AI provider. Check our [plugin development guide](https://github.com/videosdk-live/agents/blob/main/BUILD_YOUR_OWN_PLUGIN.md) for detailed instructions.

**How is pricing handled for the AI Agent SDK?**

VideoSDK offers a free tier with limited usage. The AI Agent SDK itself is open-source, but you'll need API keys for the AI services you choose to use (OpenAI, Google, etc.). Check the [pricing page](https://www.videosdk.live/pricing) for VideoSDK usage limits.

**Can agents handle more than just voice?**

Absolutely! Agents support multimodal interactions including vision processing, data messages, and real-time video streams. They can also use function tools to interact with external systems and APIs.

**Is the SDK production-ready?**

Yes, the AI Agent SDK is stable and production-ready. It is designed to be self-hosted on your own infrastructure for full control and scalability, from a single server to a Kubernetes cluster. It includes comprehensive error handling, metrics collection, and deployment flexibility.
---
---
title: A2A Implementation Guide
hide_title: false
hide_table_of_contents: false
description: "Complete implementation guide for building Agent to Agent (A2A) systems with VideoSDK AI Agents. Learn to create customer service and specialist agents that collaborate seamlessly using real-world examples."
pagination_label: "A2A Implementation"
keywords:
- A2A Implementation
- Agent to Agent Example
- Multi-Agent System
- Multiple Agent
- A2A Protocol
- AI Agent
- Google's A2A
- Customer Service Agent
- Loan Specialist Agent
- VideoSDK Agents
- AI Agent SDK
- Python Implementation
- Agent Collaboration
image: img/videosdklive-thumbnail.jpg
sidebar_position: 6
sidebar_label: Implementation
slug: implementation
---
# A2A Implementation Guide
This guide shows you how to build a complete Agent to Agent (A2A) system using the concepts from the [A2A Overview](overview). We'll create a banking customer service system with a main customer service agent and a loan specialist.
## Implementation Overview
We'll build a system with:
- **Customer Service Agent**: Voice-enabled interface agent using **RealTimePipeline** for low-latency voice interactions
- **Loan Specialist Agent**: Text-based domain expert using **CascadingPipeline** for efficient text processing
- **Intelligent Routing**: Automatic detection and forwarding of loan queries
- **Seamless Communication**: Users get expert responses without knowing about the routing
## Structure of the project
```text
A2A
├── agents/
│   ├── customer_agent.py    # CustomerServiceAgent definition
│   └── loan_agent.py        # LoanAgent definition
├── session_manager.py       # Handles session creation, pipeline setup, meeting join/leave
└── main.py                  # Entry point: runs main() and starts agents
```
## Sequence Diagram

## Step 1: Create the Customer Service Agent
- **`Interface Agent`**: Creates `CustomerServiceAgent` as the main user-facing agent with voice capabilities and customer service instructions.
- **`Function Tool`**: Implements `forward_to_specialist()`, decorated with `@function_tool`, which uses A2A discovery to find domain specialists and route queries to them.
- **`Response Relay`**: Includes a `handle_specialist_response()` method that automatically receives specialist responses and relays them back to users.
```python title="agents/customer_agent.py"
from videosdk.agents import Agent, AgentCard, A2AMessage, function_tool
import asyncio
from typing import Dict, Any


class CustomerServiceAgent(Agent):
    def __init__(self):
        super().__init__(
            agent_id="customer_service_1",
            instructions=(
                "You are a helpful bank customer service agent. "
                "For general banking queries (account balances, transactions, basic services), answer directly. "
                "For ANY loan-related queries, questions, or follow-ups, ALWAYS use the forward_to_specialist function "
                "with domain set to 'loan'. This includes initial loan questions AND all follow-up questions about loans. "
                "Do NOT attempt to answer loan questions yourself - always forward them to the specialist. "
                "After forwarding a loan query, stay engaged and automatically relay any response you receive from the specialist. "
                "When you receive responses from specialists, immediately relay them naturally to the customer."
            )
        )

    @function_tool
    async def forward_to_specialist(self, query: str, domain: str) -> Dict[str, Any]:
        """Forward queries to domain specialist agents using A2A discovery"""
        # Use A2A discovery to find specialists by domain
        specialists = self.a2a.registry.find_agents_by_domain(domain)
        id_of_target_agent = specialists[0] if specialists else None
        if not id_of_target_agent:
            return {"error": f"No specialist found for domain {domain}"}

        # Send A2A message to the specialist
        await self.a2a.send_message(
            to_agent=id_of_target_agent,
            message_type="specialist_query",
            content={"query": query}
        )
        return {
            "status": "forwarded",
            "specialist": id_of_target_agent,
            "message": "Let me get that information for you from our loan specialist..."
        }

    async def handle_specialist_response(self, message: A2AMessage) -> None:
        """Handle responses from specialist agents and relay to user"""
        response = message.content.get("response")
        if response:
            # Brief pause for natural conversation flow
            await asyncio.sleep(0.5)

            # Try multiple methods to relay the response to the user
            prompt = f"The loan specialist has responded: {response}"
            methods_to_try = [
                (self.session.pipeline.send_text_message, prompt),  # While using Cascading as main agent, comment this
                (self.session.pipeline.model.send_message, response),  # While using Cascading as main agent, comment this
                (self.session.say, response)
            ]
            for method, arg in methods_to_try:
                try:
                    await method(arg)
                    break
                except Exception as e:
                    print(f"Error with {method.__name__}: {e}")

    async def on_enter(self):
        # Register this agent with the A2A system
        await self.register_a2a(AgentCard(
            id="customer_service_1",
            name="Customer Service Agent",
            domain="customer_service",
            capabilities=["query_handling", "specialist_coordination"],
            description="Handles customer queries and coordinates with specialists"
        ))
        await self.session.say("Hello! I am your customer service agent. How can I help you?")

        # Set up message listener for specialist responses
        self.a2a.on_message("specialist_response", self.handle_specialist_response)

    async def on_exit(self):
        print("Customer agent left the meeting")
```
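The fallback loop inside `handle_specialist_response()` can be studied in isolation. The following is a minimal standalone sketch, not SDK code: `first_successful`, `failing_sender`, and `working_sender` are hypothetical names, and the two senders merely stand in for the pipeline and session methods used above.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Optional, Tuple

# A sender is any async callable that takes the text to deliver.
Sender = Callable[[str], Awaitable[Any]]


async def first_successful(attempts: List[Tuple[Sender, str]]) -> Optional[str]:
    """Call each (sender, argument) pair in order and stop at the first success.

    Returns the name of the sender that succeeded, or None if all of them failed.
    """
    for sender, arg in attempts:
        try:
            await sender(arg)
            return sender.__name__
        except Exception as exc:  # log and fall through to the next sender
            print(f"Error with {sender.__name__}: {exc}")
    return None


delivered: List[str] = []


async def failing_sender(text: str) -> None:
    # Hypothetical stand-in for a relay method the current pipeline does not support
    raise RuntimeError("text messages are not supported here")


async def working_sender(text: str) -> None:
    # Hypothetical stand-in for a relay method that works
    delivered.append(text)


def run_demo() -> Optional[str]:
    return asyncio.run(first_successful([
        (failing_sender, "hello"),
        (working_sender, "hello"),
    ]))
```

Ordering the attempts from most to least preferred means the first failure is logged and the next option is tried, which is the same behavior as the `break` after the first successful `await` in the agent.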
## Step 2: Create the Loan Specialist Agent
- **`Specialist Agent Setup`**: Creates the `LoanAgent` class with specialized loan expertise instructions and the agent ID `"specialist_1"`.
- **`Message Handlers`**: Implements `handle_specialist_query()` to process incoming queries and `handle_model_response()` to send responses back.
- **`Registration`**: Registers with the A2A system using the domain `"loan"` so that other agents needing loan expertise can discover it.
```python title="agents/loan_agent.py"
from videosdk.agents import Agent, AgentCard, A2AMessage


class LoanAgent(Agent):
    def __init__(self):
        super().__init__(
            agent_id="specialist_1",
            instructions=(
                "You are a specialized loan expert at a bank. "
                "Provide detailed, helpful information about loans including interest rates, terms, and requirements. "
                "Give complete answers with specific details when possible. "
                "You can discuss personal loans, car loans, home loans, and business loans. "
                "Provide helpful guidance and next steps for loan applications. "
                "Be friendly and professional in your responses. "
                "Keep responses concise within 5-7 lines and easily understandable."
            )
        )

    async def handle_specialist_query(self, message: A2AMessage):
        """Process incoming queries from customer service agent"""
        query = message.content.get("query")
        if query:
            # Send the query to our AI model for processing
            await self.session.pipeline.send_text_message(query)

    async def handle_model_response(self, message: A2AMessage):
        """Send processed responses back to requesting agent"""
        response = message.content.get("response")
        requesting_agent = message.to_agent
        if response and requesting_agent:
            # Send the specialist response back to the customer service agent
            await self.a2a.send_message(
                to_agent=requesting_agent,
                message_type="specialist_response",
                content={"response": response}
            )

    async def on_enter(self):
        await self.register_a2a(AgentCard(
            id="specialist_1",
            name="Loan Specialist Agent",
            domain="loan",
            capabilities=["loan_consultation", "loan_information", "interest_rates"],
            description="Handles loan queries"
        ))
        self.a2a.on_message("specialist_query", self.handle_specialist_query)
        self.a2a.on_message("model_response", self.handle_model_response)

    async def on_exit(self):
        print("LoanAgent Left")
```
## Step 3: Configure Session Management
- **`Pipeline Architecture`**: Uses **RealTimePipeline** for customer agent (audio-enabled Gemini for voice interaction) and **CascadingPipeline** for specialist agent (text-only OpenAI for efficient processing).
- **`Session Factory`**: Provides `create_pipeline()` and `create_session()` functions to properly configure agent sessions based on their roles.
- **`Modality Separation`**: Ensures customer agent can handle voice while specialist processes text in background.
```python title="session_manager.py"
from videosdk.agents import AgentSession, CascadingPipeline, RealTimePipeline, ConversationFlow
from videosdk.plugins.openai import OpenAILLM
from videosdk.plugins.google import GeminiRealtime, GeminiLiveConfig
import os


class MyConversationFlow(ConversationFlow):
    async def on_turn_start(self, transcript: str) -> None:
        pass

    async def on_turn_end(self) -> None:
        pass


def create_pipeline(agent_type: str):
    if agent_type == "customer":
        # Customer agent: RealTimePipeline for voice interaction
        return RealTimePipeline(
            model=GeminiRealtime(
                model="gemini-2.5-flash-native-audio-preview-12-2025",
                config=GeminiLiveConfig(
                    voice="Leda",
                    response_modalities=["AUDIO"]
                )
            )
        )
    else:
        # Specialist agent: CascadingPipeline for text processing
        return CascadingPipeline(
            llm=OpenAILLM(api_key=os.getenv("OPENAI_API_KEY")),
        )


def create_session(agent, pipeline) -> AgentSession:
    return AgentSession(
        agent=agent,
        pipeline=pipeline,
        conversation_flow=MyConversationFlow(agent=agent),
    )
```
:::note
While setting up pipelines, make sure:
- The **customer agent** has **voice capabilities only** (via `RealTimePipeline`).
- The **specialist agent (Loan Agent)** operates in **text-only mode** (via `CascadingPipeline`).
:::
:::info
**Pipeline Support**: The VideoSDK AI Agents framework supports both **RealTimePipeline** and **CascadingPipeline**, enabling flexible configurations for voice and text processing with **A2A**. You can run a full `RealTimePipeline` or `CascadingPipeline` for both modalities, or create a hybrid setup that combines the two. This allows you to tailor the use of STT, TTS, and LLM to suit your specific use case, whether for low-latency interactions, complex processing flows, or a mix of both.
:::
## Step 4: Deploy A2A System on VideoSDK Platform
- **`Meeting Setup`**: Customer agent joins VideoSDK meeting for user interaction while specialist runs in background mode. Requires environment variables: `VIDEOSDK_AUTH_TOKEN`, `GOOGLE_API_KEY`, and `OPENAI_API_KEY`.
- **`System Orchestration`**: Uses `JobContext` and `WorkerJob` to manage the meeting lifecycle and agent coordination.
- **`Resource Management`**: Handles the startup sequence, keeps the system running, and provides a clean shutdown with proper A2A unregistration.
```python title="main.py"
import asyncio
from contextlib import suppress
from agents.customer_agent import CustomerServiceAgent
from agents.loan_agent import LoanAgent
from session_manager import create_pipeline, create_session
from videosdk.agents import JobContext, RoomOptions, WorkerJob


async def main(ctx: JobContext):
    specialist_agent = LoanAgent()
    specialist_pipeline = create_pipeline("specialist")
    specialist_session = create_session(specialist_agent, specialist_pipeline)

    customer_agent = CustomerServiceAgent()
    customer_pipeline = create_pipeline("customer")
    customer_session = create_session(customer_agent, customer_pipeline)

    # Run the specialist in the background while the customer agent joins the meeting
    specialist_task = asyncio.create_task(specialist_session.start())

    try:
        await ctx.connect()
        await customer_session.start()
        await asyncio.Event().wait()
    except (KeyboardInterrupt, asyncio.CancelledError):
        print("Shutting down...")
    finally:
        specialist_task.cancel()
        with suppress(asyncio.CancelledError):
            await specialist_task
        await specialist_session.close()
        await customer_session.close()
        await specialist_agent.unregister_a2a()
        await customer_agent.unregister_a2a()
        await ctx.shutdown()


def customer_agent_context() -> JobContext:
    room_options = RoomOptions(room_id="", name="Customer Service Agent", playground=True)
    return JobContext(
        room_options=room_options
    )


if __name__ == "__main__":
    job = WorkerJob(entrypoint=main, jobctx=customer_agent_context)
    job.start()
```
:::note
Ensure that the `JobContext` is created **only for the primary (main) agent**, i.e., the agent responsible for user-facing interaction (e.g., Customer Agent).
The background agent (e.g., Loan Agent) should not have its own context or initiate a separate connection.
:::
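The startup and shutdown ordering in `main.py` follows a common `asyncio` pattern: start a background task, do the foreground work, then cancel and await the task in `finally`. The sketch below shows only that pattern using the standard library. `background_worker` and `run_once` are hypothetical stand-ins for the specialist session and the main entry point, not SDK functions.

```python
import asyncio
from contextlib import suppress
from typing import List

events: List[str] = []


async def background_worker() -> None:
    """Stands in for the specialist session: runs until it is cancelled."""
    events.append("worker started")
    try:
        await asyncio.Event().wait()  # never set, so this waits until cancellation
    finally:
        events.append("worker stopped")


async def run_once() -> None:
    worker = asyncio.create_task(background_worker())
    try:
        await asyncio.sleep(0)  # yield control so the worker can start
        events.append("foreground work done")
    finally:
        # Cancel the background task and wait for it to finish unwinding
        worker.cancel()
        with suppress(asyncio.CancelledError):
            await worker
        events.append("cleanup finished")


def run_demo() -> List[str]:
    events.clear()
    asyncio.run(run_once())
    return list(events)
```

Awaiting the cancelled task inside `suppress(asyncio.CancelledError)` lets the worker's own `finally` block complete before the remaining cleanup steps run.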
#### Running the Application
Set the required environment variables:
```bash
export VIDEOSDK_AUTH_TOKEN="your_videosdk_token"
export GOOGLE_API_KEY="your_google_api_key"
export OPENAI_API_KEY="your_openai_api_key"
```
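A missing key usually surfaces only once an agent tries to call its provider, so it can help to check the environment before starting. The helper below is an optional sketch and not part of the quick start repository. `missing_env_vars` is a hypothetical name, and the variable list simply mirrors the three exports above.

```python
import os
from typing import List, Mapping

# The three variables this guide asks you to export
REQUIRED_VARS = ("VIDEOSDK_AUTH_TOKEN", "GOOGLE_API_KEY", "OPENAI_API_KEY")


def missing_env_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_env_vars()` at the top of `main.py` and stopping when the returned list is non-empty gives a clearer error than a failed provider request later on.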
Set `room_id` in `RoomOptions` (in `main.py`) to your actual meeting ID, then run:
```bash
cd A2A
python main.py
```
:::tip Quick Start
Get the complete working example at [A2A Quick Start Repository](https://github.com/videosdk-live/agents-quickstart/tree/main/A2A) with all the code ready to run.
:::
---
---
title: Agent to Agent (A2A)
hide_title: false
hide_table_of_contents: false
description: "Understanding the core concepts of Agent to Agent (A2A) communication in VideoSDK AI Agents - AgentCard, A2AMessage, agent registration, and discovery mechanisms for building collaborative multi-agent systems."
pagination_label: "A2A Overview"
keywords:
- A2A Overview
- A2A Protocol
- Agent To Agent
- AI Agent
- Google's A2A
- AgentCard
- A2AMessage
- Agent Registration
- Agent Discovery
- Multi-Agent Communication
- VideoSDK Agents
- AI Agent SDK
- Agent Collaboration
image: img/videosdklive-thumbnail.jpg
sidebar_position: 5
sidebar_label: Overview
slug: overview
---
# Agent to Agent (A2A)
The Agent to Agent (A2A) protocol enables seamless collaboration between specialized AI agents, allowing them to communicate, share knowledge, and coordinate responses based on their unique capabilities and domain expertise. With VideoSDK's A2A implementation, you can create multi-agent systems where different agents work together to provide comprehensive solutions.
## How It Works
### Basic Flow
1. **Agent Registration**: Agents register themselves with an `AgentCard` that contains their capabilities and domain expertise
2. **Client Query**: Client sends a query to the main agent
3. **Agent Discovery**: Main agent discovers relevant specialist agents using agent cards
4. **Query Forwarding**: Main agent forwards specialized queries to appropriate agents
5. **Response Chain**: Specialist agents process queries and respond back to the main agent
6. **Client Response**: Main agent formats and delivers the final response to the client
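The six steps above can be traced in a small, self-contained sketch. It uses plain dictionaries instead of the SDK: `register`, `discover`, and `main_agent` are hypothetical helpers written only to illustrate the order of operations, and the specialist is a simple function rather than a real agent.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory stand-ins; the SDK's registry and transport are not shown here.
cards: Dict[str, dict] = {}
handlers: Dict[str, Callable[[str], str]] = {}


def register(agent_id: str, domain: str, handler: Callable[[str], str]) -> None:
    """Step 1: an agent announces its domain and how to reach it."""
    cards[agent_id] = {"id": agent_id, "domain": domain}
    handlers[agent_id] = handler


def discover(domain: str) -> List[str]:
    """Step 3: look up the IDs of agents whose card matches the domain."""
    return [agent_id for agent_id, card in cards.items() if card["domain"] == domain]


def main_agent(query: str, domain: str) -> str:
    """Steps 2, 4, 5, and 6: receive the query, forward it, and relay the answer."""
    specialists = discover(domain)
    if not specialists:
        return f"No specialist found for domain {domain}"
    answer = handlers[specialists[0]](query)  # forward and wait for the response
    return f"Specialist says: {answer}"


register("agent_flight_001", "flight", lambda q: f"handled '{q}'")
```

In the SDK the forwarding step is asynchronous and message-based; the direct function call here only keeps the example short.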

### Example Scenario
```
Client → "Book a flight to New York and find a hotel"
↓
Travel Agent (Main) → Analyzes query
↓
Travel Agent → Discovers Flight Booking Agent & Hotel Booking Agent
↓
Travel Agent → Forwards flight query to Flight Booking Agent
Travel Agent → Forwards hotel query to Hotel Booking Agent
↓
Specialist Agents → Process queries and respond back (text format)
↓
Travel Agent → Combines responses and sends to client (audio format)
```
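In the scenario above the two sub-queries are independent of each other, so a main agent could forward them concurrently and combine the results once both arrive. The sketch below illustrates that idea with `asyncio.gather`. The specialist coroutines are hypothetical placeholders, and whether your agents should fan out this way is a design choice rather than something the SDK prescribes.

```python
import asyncio
from typing import Dict


async def flight_specialist(query: str) -> str:
    await asyncio.sleep(0)  # placeholder for real work
    return f"flight result for: {query}"


async def hotel_specialist(query: str) -> str:
    await asyncio.sleep(0)  # placeholder for real work
    return f"hotel result for: {query}"


async def travel_agent(flight_query: str, hotel_query: str) -> Dict[str, str]:
    """Forward both sub-queries concurrently and combine the text responses."""
    flight, hotel = await asyncio.gather(
        flight_specialist(flight_query),
        hotel_specialist(hotel_query),
    )
    return {"flight": flight, "hotel": hotel}


def run_demo() -> Dict[str, str]:
    return asyncio.run(travel_agent("flight to New York", "hotel in New York"))
```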
# Core Components
## 1. AgentCard
The `AgentCard` is how agents identify themselves and advertise their capabilities to other agents.
#### Structure
```python
AgentCard(
    id="agent_flight_001",
    name="Skymate",
    domain="flight",
    capabilities=[
        "search_flights",
        "modify_bookings",
        "show_flight_status"
    ],
    description="Handles all flight-related tasks"
)
```
#### Parameters
| Parameter | Type | Required | Description |
| -------------- | ------ | -------- | ------------------------------------ |
| `id` | string | Yes | Unique identifier for the agent |
| `name` | string | Yes | Human-readable agent name |
| `domain` | string | Yes | Primary expertise domain |
| `capabilities` | list | Yes | List of specific capabilities |
| `description` | string | Yes | Brief description of agent's purpose |
| `metadata` | dict | No | Additional metadata for the agent |
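To make the required and optional columns concrete, the table can be mirrored as a plain dataclass. This is an illustrative model only: `AgentCardSketch` is a hypothetical name, and the SDK's own `AgentCard` class may be implemented differently.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class AgentCardSketch:
    """Illustrative mirror of the parameter table; not the SDK's AgentCard class."""
    id: str
    name: str
    domain: str
    capabilities: List[str]
    description: str
    metadata: Dict[str, Any] = field(default_factory=dict)  # the only optional field


card = AgentCardSketch(
    id="agent_flight_001",
    name="Skymate",
    domain="flight",
    capabilities=["search_flights", "modify_bookings", "show_flight_status"],
    description="Handles all flight-related tasks",
)
```

Leaving out any of the five required fields raises a `TypeError` at construction time, while `metadata` falls back to an empty dictionary.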
## 2. A2AMessage
`A2AMessage` is the standardized communication format between agents.
#### Structure
```python
message = A2AMessage(
    from_agent="travel_agent_1",
    to_agent="agent_flight_001",
    type="flight_status_query",
    content={"query": "What's the status of AI202?"},
    metadata={"client_id": "xyz123", "urgency": "medium"}
)
```
#### Parameters
| Parameter | Type | Required | Description |
| ------------ | ------ | -------- | --------------------------- |
| `from_agent` | string | Yes | ID of the sending agent |
| `to_agent` | string | Yes | ID of the receiving agent |
| `type` | string | Yes | Message type/event name |
| `content` | dict | Yes | Message payload |
| `metadata` | dict | No | Additional message metadata |
## 3. Agent Registry
#### `register_a2a(agent_card)`
Register an agent with the A2A system.
```python
async def on_enter(self):
    await self.register_a2a(AgentCard(
        id="agent_flight_001",
        name="Skymate",
        domain="flight",
        capabilities=[
            "search_flights",
            "modify_bookings",
            "show_flight_status"
        ],
        description="Handles all flight-related tasks"
    ))
```
**What Registration Does:**
- Adds the agent to the global `AgentRegistry` singleton
- Makes the agent discoverable by other agents
- Stores both the `AgentCard` and agent instance
- Enables message routing to this agent
#### `unregister_a2a()`
Unregister an agent from the A2A system.
```python
await self.unregister_a2a()
```
## 4. A2AProtocol Class
The main class for managing agent-to-agent communication.
### Agent Discovery
#### `find_agents_by_domain(domain: str)`
Discover agents based on their domain expertise.
```python
agents = self.a2a.registry.find_agents_by_domain("hotel")
# Returns: ["agent_hotel_001"]
```
#### `find_agents_by_capability(cap: str)`
Find agents with specific skills.
```python
agents = await self.a2a.registry.find_agents_by_capability("modify_bookings")
# Returns: ["agent_flight_001"]
```
---
### Agent Communications
#### `send_message(to_agent, message_type, content, metadata=None)`
Send messages directly to other agents.
```python
await self.a2a.send_message(
    to_agent="agent_hotel_001",
    message_type="hotel_booking_query",  # Event name that the receiving agent listens for
    content={"query": "Find 3-star hotels in Delhi under $100"},
    metadata={"client_id": "xyz123"}  # Optional metadata
)
```
**Parameters:**
- `to_agent` (string): Target agent ID
- `message_type` (string): Event name the receiving agent listens for
- `content` (dict): Message payload
- `metadata` (dict, optional): Additional message metadata
#### `on_message(message_type, handler)`
Register message handlers for incoming messages.
```python
# Register a handler for specialist queries (for example, inside on_enter)
self.a2a.on_message("hotel_booking_query", self.handle_specialist_query)

# Handler method defined on the same agent class
async def handle_specialist_query(self, message):
    # Process the incoming message
    query = message.content.get("query")
    # ... process query ...
    # Return response
    return {"response": "Here are 3-star hotels in Delhi under $100"}
```
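Conceptually, `on_message` maps a message type to a handler so that incoming messages can be routed by name. The standalone sketch below shows one way such a dispatcher could work. `MessageBusSketch` and its `dispatch` method are hypothetical and do not describe how `A2AProtocol` is implemented.

```python
import asyncio
from collections import defaultdict
from typing import Any, Awaitable, Callable, DefaultDict, Dict, List

Handler = Callable[[Dict[str, Any]], Awaitable[None]]


class MessageBusSketch:
    """Hypothetical dispatcher that routes messages to handlers by message type."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def on_message(self, message_type: str, handler: Handler) -> None:
        self._handlers[message_type].append(handler)

    async def dispatch(self, message_type: str, content: Dict[str, Any]) -> int:
        """Await every handler registered for the type and return how many ran."""
        handlers = self._handlers.get(message_type, [])
        for handler in handlers:
            await handler(content)
        return len(handlers)


received: List[str] = []


async def handle_hotel_query(content: Dict[str, Any]) -> None:
    received.append(content["query"])


def run_demo() -> List[int]:
    bus = MessageBusSketch()
    bus.on_message("hotel_booking_query", handle_hotel_query)

    async def scenario() -> List[int]:
        matched = await bus.dispatch("hotel_booking_query", {"query": "3-star hotels in Delhi"})
        unmatched = await bus.dispatch("flight_status_query", {"query": "AI202"})
        return [matched, unmatched]

    return asyncio.run(scenario())
```

A message whose type has no registered handler is dropped in this sketch, which is why the `message_type` passed to `send_message` must match the name given to `on_message` on the receiving agent.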
## Next Steps
Now that you're familiar with the core A2A concepts, it's time to move from theory to practice:
👉 **[Explore the Full A2A Implementation](implementation)**
Dive into a complete, working example that demonstrates agent discovery, messaging, and collaboration in action.
---
---
title: Build a Custom Voice AI Agent in Minutes
hide_title: false
hide_table_of_contents: false
description: "Use VideoSDK's low-code builder to design, test, and deploy a personalized voice agent powered by your preferred LLM."
keywords:
- voice ai agent
- low-code agent builder
- conversational ai
- videosdk agents
- gemini
- realtime pipeline
- telephony
- knowledge base
- speech recognition
- tts
image: https://strapi.videosdk.live/uploads/Screenshot_2025_11_17_at_5_06_23_PM_33a509fd4e.png
sidebar_label: Build Agent
slug: build-agent
---
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import Step from '@site/src/components/Step'
# Agent Runtime Guide
AI voice agents are transforming how businesses interact with customers, providing natural, conversational experiences through voice interfaces. VideoSDK's **Agent Runtime** feature offers a powerful **no-code/low-code interface** that enables you to build sophisticated AI voice agents without extensive programming knowledge.
## Prerequisites
Before you begin, ensure you have:
- **VideoSDK Account:** Visit [VideoSDK Dashboard](https://app.videosdk.live) to sign up for a free account and access the AI Agent builder.
## Step-By-Step Guide
### Step 1: Create a New Agent
1. In the dashboard, click **Agents** in the sidebar under **AI Agent**, or open the [Agents Dashboard](https://app.videosdk.live/agents/agents) directly.
2. The **AI Agent > Agents** section opens. This is where you create and manage your voice agents.

### Step 2: Click `Add New Agent`
This is where you'll start creating your voice agent. If no agent has been created yet, you'll see an **Add New Agent** button. If agents already exist, you'll see a list of all AI voice agents, and you can click the button in the top-right corner to create a new agent.

### Step 3: Configure Agent Details
This is where you can define your AI voice agent's persona and behavior:
- **Agent Name:** Set a descriptive name for your agent (e.g., "AI Interviewer").
- **System Prompt:** Define the agent's role, personality, and behavior guidelines.
- **Welcome Message:** Set the message that plays when the agent joins a conversation.
- **Closing Message:** Set the message that plays when the agent leaves a conversation.

### Step 4: Configure the Pipeline
The pipeline is the core engine of your voice agent, processing audio through speech recognition, AI reasoning, and text-to-speech. VideoSDK offers two pipeline options: **Realtime Pipeline** and **Cascading Pipeline**.
The **Realtime Pipeline** provides direct speech-to-speech processing with minimal latency, ideal for natural, conversational interactions.
Example: Adding **Gemini Realtime Model**
1. Add your Gemini API key in the pipeline configuration or at [Realtime Integrations](https://app.videosdk.live/agents/integrations/realtime).
2. To get your API key, visit [Gemini API Keys](https://aistudio.google.com/api-keys).

**Available models:**
- `gemini-2.5-flash-native-audio-preview-12-2025`
- `gemini-2.0-flash`
- `gemini-2.5-flash-native-audio`
The **Cascading Pipeline** processes audio through distinct stages (STT → LLM → TTS), providing maximum control over each component.
Configure your providers for [STT Integrations](https://app.videosdk.live/agents/integrations/stt), [LLM Integrations](https://app.videosdk.live/agents/integrations/llm) and [TTS Integrations](https://app.videosdk.live/agents/integrations/tts).

Example: Adding **Deepgram STT**
- Get API Key at: [Deepgram Console](https://console.deepgram.com/)
**Available models:**
- `flux-general-en`
- `nova-2` or `nova-2-general` (for non-English transcriptions)
- `nova-3` or `nova-3-general`
- `base`
### Step 5: Knowledge Base Integration
Upload a knowledge base to provide context and domain expertise to your voice agent. Grounding answers in your own documents can improve accuracy and lets your agent handle specialized queries.
- Navigate to the **Knowledge Base** tab in your agent configuration.
- Upload documents, FAQs, or product sheets that contain relevant information.
- The agent will use this knowledge to provide more accurate and contextual responses.

### Step 6: Configure Telephony Settings
Configure telephony settings to enable your agent to handle phone calls:
- **Agent Type:** Set the type of agent (inbound, outbound, or both).
- **Inbound Gateways:** Set up gateways to receive incoming calls.
- **Outbound Gateways:** Set up gateways to make outbound calls.
- **Routing Rules:** Create rules to map phone numbers to your agent.
- **Calling Settings:** Configure call handling preferences and behavior.

This configuration is essential for **call center automation**, **platform integration**, and smooth **agent orchestration**.
### Step 7: Test Your Voice Agent
You can interact with the agent directly from the dashboard before connecting it to production channels:
1. Visit [Agents Dashboard](https://app.videosdk.live/agents/agents).
2. Locate your agent in the list and click the **Test** button in the top-right corner.
3. Use the built-in simulator to speak with the agent in real time, view live transcripts, and fine-tune prompts based on the conversation.

### Step 8: Connect Voice Agent
Once your agent is configured, you can connect it to various platforms and devices:
- **Web:** Integrate your agent into web applications.
- **Mobile:** Connect to iOS and Android mobile apps.
- **Telephony:** Deploy to phone systems for voice calls.
- **IoT Devices:** Connect to Internet of Things devices.

## Next Steps
Congratulations! You've successfully created your AI voice agent. Here are the next steps:
- **Test Your Agent:** Use the built-in test simulator to verify your agent's behavior and responses.
- **Deploy to Production:** Connect your agent to production environments and real user interactions.
- **Monitor Performance:** Track agent performance, user satisfaction, and conversation quality.
- **Iterate and Improve:** Refine your agent's prompts, knowledge base, and configuration based on real-world usage.
Keep refining your agent's configuration to build a powerful voice AI solution tailored to your specific business needs.
### Starter Apps
import { AgentCardGrid, SettingsIcon, PlayIcon, CodeIcon, DocumentIcon, ExternalLinkIcon, RobotIcon, GithubIcon } from '@site/src/components/agent/cards';
---
---
title: Flutter Agent Starter
hide_title: false
hide_table_of_contents: false
description: Integrate AI agents with real-time voice interaction using a Flutter frontend and a no-code agent built in the VideoSDK dashboard.
sidebar_label: Flutter
pagination_label: Agent Runtime with Flutter
keywords:
- ai agent
- no-code
- voice interaction
- real-time communication
- flutter sdk
image: img/videosdklive-thumbnail.jpg
sidebar_position: 2
slug: agent-starter-flutter
---
import Step from '@site/src/components/Step'
import CreateAgent from '@site/mdx/_ai-agent-starter-sdk-guide.mdx'
# Agent Starter App - Flutter
VideoSDK lets you add a voice-enabled AI agent to your Flutter app. This guide walks you through connecting your Flutter frontend to an agent configured and deployed directly from the VideoSDK dashboard.
## Prerequisites
- A deployed AI agent on VideoSDK Agent Cloud. If you haven't done this yet, create and deploy your agent using the [Low-Code Deployment UI](/ai_agents/agent-runtime/build-agent) on the VideoSDK Dashboard — no coding required. Once deployed, note down your **Agent ID**.
- If your target platform is iOS, your development environment must meet the following requirements:
  - Flutter 3.8.0 or later
  - Dart 3.x or later
- Valid Video SDK [Account](https://app.videosdk.live/)
import APISecret from '@site/mdx/introduction/_api-key.mdx';
## Run the Sample Project
### Step 1: Clone the sample project
Clone the repository to your local environment.
```bash
git clone https://github.com/videosdk-live/agent-starter-app-flutter.git
cd agent-starter-app-flutter
```
### Step 2: Install the dependencies
Install all the dependencies to run the project.
```bash
flutter pub get
```
### Step 3: Create Your Agent (Optional)
:::info
If you've already configured and deployed your agent from the VideoSDK Dashboard, you can jump directly to [Step 4](#step-4-setup-environment-variables).
:::
### Step 4: Setup Environment Variables
Copy the `.env.example` file to `.env`.
```bash
cp .env.example .env
```
Update the `.env` file with your credentials. The `AGENT_ID` is the identifier for the Low-Code agent you deployed from the VideoSDK Dashboard.
```env title=".env"
AUTH_TOKEN=your_videosdk_auth_token
AGENT_ID=your_agent_id
MEETING_ID=your_meeting_id
VERSION_ID=your_version_id
```
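Before launching the app, it can help to confirm that the required keys in `.env` are actually filled in. The following is a minimal sketch, not part of the sample project: `check_env_keys` is a hypothetical helper, and treating only `AUTH_TOKEN` and `AGENT_ID` as required is an assumption.

```bash
# Hypothetical helper: succeeds only if every named key has a non-empty value in the env file.
check_env_keys() {
  local file="$1"; shift
  local key missing=0
  for key in "$@"; do
    # Match lines of the form KEY=value where value has at least one character.
    if ! grep -Eq "^${key}=.+" "$file"; then
      echo "missing or empty: ${key}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (run from the project root):
# check_env_keys .env AUTH_TOKEN AGENT_ID && flutter run
```

Note that this only checks for non-empty values, so an unchanged placeholder such as `your_agent_id` still passes.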
> **Tip:** You can obtain your `AUTH_TOKEN` and `AGENT_ID` from the [VideoSDK Dashboard](https://app.videosdk.live/) under your Agent Cloud deployment. `MEETING_ID` is optional — if left blank, the app will create a new meeting automatically.
### Step 5: Run the Sample App
Bingo, it's time to push the launch button.
**Android:**
```bash
flutter run
```
**iOS:**
```bash
cd ios && pod install && cd ..
flutter run -d ios
```
Once running, the app will use the Dispatch API to send your deployed agent into the meeting room. You'll see the live transcription as you speak, and the agent will respond in real time.
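To see what a dispatch request looks like outside the app, here is a rough sketch using `curl`. The endpoint and the `agentId`, `meetingId`, and `versionId` fields are the ones used in `lib/api_call.dart` of the Agent Runtime with Flutter guide; that the starter app sends exactly this request is an assumption, and `dispatch_body` and `dispatch_agent` are hypothetical helper names.

```bash
# Build the JSON body for the dispatch request.
dispatch_body() {
  printf '{"agentId":"%s","meetingId":"%s","versionId":"%s"}' "$1" "$2" "$3"
}

# Send the request. AUTH_TOKEN, AGENT_ID, MEETING_ID and VERSION_ID are assumed
# to be exported in the current shell (same names as in the .env file).
dispatch_agent() {
  curl -sS -X POST https://api.videosdk.live/v2/agent/general/dispatch \
    -H "Authorization: ${AUTH_TOKEN}" \
    -H "Content-Type: application/json" \
    -d "$(dispatch_body "$AGENT_ID" "$MEETING_ID" "$VERSION_ID")"
}
```

Calling `dispatch_agent` from a terminal is a quick way to tell whether a failure comes from the credentials or from the app itself.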
---
## Troubleshooting
### Common Issues:
1. **Agent not joining:**
   - Check that the `AGENT_ID` and `VERSION_ID` in your `.env` are correctly set.
   - Verify your VideoSDK token is valid and has the necessary permissions.
2. **Audio not working:**
   - Check device permissions for microphone access.
3. **"Failed to connect agent" error:**
   - Verify your `AGENT_ID` and `VERSION_ID` are correct.
   - Check the debug console for any network errors.
4. **Flutter build issues:**
   - Ensure your Flutter version is compatible (3.8.0 or later for iOS targets).
   - Try cleaning the build: `flutter clean`.
   - Delete `pubspec.lock` and run `flutter pub get`.
   - For iOS: run `cd ios && pod install` before `flutter run`.
---
---
title: iOS Agent Starter
hide_title: false
hide_table_of_contents: false
description: VideoSDK enables the opportunity to integrate AI agents with real-time voice interaction using iOS frontend and a no-code agent from the dashboard.
sidebar_label: iOS
pagination_label: Agent Runtime with iOS
keywords:
- ai agent
- no-code
- voice interaction
- real-time communication
- ios sdk
image: img/videosdklive-thumbnail.jpg
sidebar_position: 3
slug: agent-starter-ios
---
import Step from '@site/src/components/Step'
import CreateAgent from '@site/mdx/_ai-agent-starter-sdk-guide.mdx'
# Agent Starter App - iOS
VideoSDK enables you to seamlessly add a voice-enabled AI agent to your iOS app — this guide walks you through connecting your iOS application to an agent configured and deployed directly from the VideoSDK dashboard.
## Prerequisites
- A deployed AI agent on VideoSDK Agent Cloud. If you haven't done this yet, create and deploy your agent using the [Low-Code Deployment UI](/ai_agents/agent-runtime/build-agent) on the VideoSDK Dashboard — no coding required. Once deployed, note down your **Agent ID**.
- For iOS, your development environment must meet the following requirements:
  - iOS 18 or later
  - Xcode 16.4 or later
- Valid Video SDK [Account](https://app.videosdk.live/)
import APISecret from '@site/mdx/introduction/_api-key.mdx';
## Run the Sample Project
### Step 1: Clone the sample project
Clone the repository to your local environment.
```bash
git clone https://github.com/videosdk-live/agent-starter-app-ios.git
cd agent-starter-app-ios
```
### Step 2: Open the project in Xcode
Open the `agent-starter-ios.xcodeproj` file using Xcode.
### Step 3: Create Your Agent (Optional)
:::info
If you've already configured and deployed your agent from the VideoSDK Dashboard, you can jump directly to [Step 4](#step-4-set-up-credentials).
:::
### Step 4: Set up credentials
Before running the app, you need to configure your authentication details. Open `agent-starter-ios/Constants/MeetingConfig.swift` and supply the required values:
```
AUTH_TOKEN:
AGENT_ID:
MEETING_ID:
VERSION_ID:
```
> **Tip:** You can obtain your `AUTH_TOKEN` and `AGENT_ID` from the [VideoSDK Dashboard](https://app.videosdk.live/) under your Agent Cloud deployment. `MEETING_ID` is optional — if left blank, the app will create a new meeting automatically. `VERSION_ID` is also optional, if left blank, the app will fetch the agent's version and choose the latest one and proceed with the meeting.
### Step 5: Build and Run
Bingo! Now select your target physical device and click the Run button (or press Cmd + R) in Xcode.
Once running, the app will use the Dispatch API to send your deployed agent into the meeting room. You'll see the live transcription as you speak, and the agent will respond in real time.
---
## Troubleshooting
### Common Issues:
1. **Agent not joining:**
   - Check that the `AGENT_ID` and `VERSION_ID` in your `agent-starter-ios/Constants/MeetingConfig.swift` are correctly set.
   - Verify your VideoSDK token is valid and has the necessary permissions.
2. **Audio not working:**
   - Check device permissions for microphone access.
3. **"Failed to connect agent" error:**
   - Verify your `AGENT_ID` and `VERSION_ID` are correct.
   - Check the debug console for any network errors.
---
---
title: Agent Runtime with Flutter
hide_title: false
hide_table_of_contents: false
description: VideoSDK enables the opportunity to integrate AI agents with real-time voice interaction using Flutter frontend and a no-code agent from the dashboard.
sidebar_label: With Flutter
pagination_label: Agent Runtime with Flutter
keywords:
- ai agent
- no-code
- voice interaction
- real-time communication
- flutter sdk
image: img/videosdklive-thumbnail.jpg
sidebar_position: 2
slug: with-flutter
---
import Step from '@site/src/components/Step'
# Agent Runtime with Flutter
VideoSDK empowers you to seamlessly integrate AI agents with real-time voice interaction into your Flutter application within minutes. This guide shows you how to connect a Flutter frontend with an AI agent created and configured entirely from the VideoSDK dashboard.
## Prerequisites
Before proceeding, ensure that your development environment meets the following requirements:
- Video SDK Developer Account (if you don't have one, create it from the **[Video SDK Dashboard](https://app.videosdk.live/)**)
- Flutter installed on your device
- Familiarity with creating a no-code voice agent. If you're new to this, please follow our guide on how to **[Build a Custom Voice AI Agent in Minutes](/ai_agents/agent-runtime/build-agent)** first.
:::important
You need a VideoSDK account to generate a token and an agent from the dashboard.
Visit the VideoSDK **[dashboard](https://app.videosdk.live/api-keys)** to generate a token.
:::
## Project Structure
Your project structure should look like this:
```jsx title="Project Structure"
root
├── android
├── ios
├── lib
│ ├── api_call.dart
│ ├── join_screen.dart
│ ├── main.dart
│ ├── meeting_controls.dart
│ ├── meeting_screen.dart
│ └── participant_tile.dart
├── macos
├── web
└── windows
```
You will be working on the following files:
- `join_screen.dart`: Responsible for the user interface to join a meeting.
- `meeting_screen.dart`: Displays the meeting interface and handles meeting logic.
- `api_call.dart`: Handles API calls for creating meetings and dispatching agents.
## 1. Flutter Frontend
### Step 1: Getting Started
Follow these steps to create the environment necessary to add AI agent functionality to your app.
#### Create a New Flutter App
Create a new Flutter app using the following command:
```bash
flutter create videosdk_ai_agent_flutter_app
```
#### Install VideoSDK
Install the VideoSDK using the following Flutter command. Make sure you are in your Flutter app directory before you run this command.
```bash
flutter pub add videosdk
flutter pub add http
```
### Step 2: Configure Project
#### For Android
- Update the `/android/app/src/main/AndroidManifest.xml` for the permissions we will be using to implement the audio and video features.
```xml title="android/app/src/main/AndroidManifest.xml"
```
- If necessary, increase the `minSdkVersion` in the `defaultConfig` block of `build.gradle` to `23` (the default Flutter generator currently sets it to `16`).
#### For iOS
- Add the following entries which allow your app to access the camera and microphone to your `/ios/Runner/Info.plist` file :
```xml title="/ios/Runner/Info.plist"
<key>NSCameraUsageDescription</key>
<string>$(PRODUCT_NAME) Camera Usage!</string>
<key>NSMicrophoneUsageDescription</key>
<string>$(PRODUCT_NAME) Microphone Usage!</string>
```
- Uncomment the following line to define a global platform for your project in `/ios/Podfile` :
```ruby title="/ios/Podfile"
platform :ios, '12.0'
```
#### For macOS
- Add the following entries to your `/macos/Runner/Info.plist` file which allow your app to access the camera and microphone.
```xml title="/macos/Runner/Info.plist"
<key>NSCameraUsageDescription</key>
<string>$(PRODUCT_NAME) Camera Usage!</string>
<key>NSMicrophoneUsageDescription</key>
<string>$(PRODUCT_NAME) Microphone Usage!</string>
```
- Add the following entries to your `/macos/Runner/DebugProfile.entitlements` file which allow your app to access the camera, microphone and open outgoing network connections.
```xml title="/macos/Runner/DebugProfile.entitlements"
<key>com.apple.security.network.client</key>
<true/>
<key>com.apple.security.device.camera</key>
<true/>
<key>com.apple.security.device.microphone</key>
<true/>
```
- Add the following entries to your `/macos/Runner/Release.entitlements` file which allow your app to access the camera, microphone and open outgoing network connections.
```xml title="/macos/Runner/Release.entitlements"
<key>com.apple.security.network.server</key>
<true/>
<key>com.apple.security.network.client</key>
<true/>
<key>com.apple.security.device.camera</key>
<true/>
<key>com.apple.security.device.microphone</key>
<true/>
```
### Step 3: Configure Environment and Credentials
Create a meeting room using the VideoSDK API:
```bash
curl -X POST https://api.videosdk.live/v2/rooms \
-H "Authorization: YOUR_JWT_TOKEN_HERE" \
-H "Content-Type: application/json"
```
Copy the `roomId` from the response. Use it as the meeting ID in `lib/join_screen.dart` (in place of the `YOUR_MEETING_ID` placeholder), and add your token and agent credentials to `lib/api_call.dart`.
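If you prefer not to copy the ID by hand, the `roomId` can be pulled out of the response on the command line. This is a minimal sketch that assumes the API returns a single-line JSON object containing a `"roomId"` string; `extract_room_id` is a hypothetical helper, not part of any VideoSDK tooling.

```bash
# Read JSON on stdin and print the value of the "roomId" field.
extract_room_id() {
  sed -n 's/.*"roomId"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage (YOUR_JWT_TOKEN_HERE is the same placeholder as in the curl command):
# ROOM_ID=$(curl -s -X POST https://api.videosdk.live/v2/rooms \
#   -H "Authorization: YOUR_JWT_TOKEN_HERE" \
#   -H "Content-Type: application/json" | extract_room_id)
# echo "$ROOM_ID"
```

For anything beyond a quick check, a real JSON parser such as `jq` is the more robust choice.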
```dart title="lib/api_call.dart"
import 'dart:convert';
import 'package:http/http.dart' as http;
//Auth token we will use to generate a meeting and connect to it
const token = 'YOUR_VIDEOSDK_AUTH_TOKEN';
const agentId = 'YOUR_AGENT_ID';
const versionId = 'YOUR_VERSION_ID';
// API call to create meeting
Future<String> createMeeting() async {
final http.Response httpResponse = await http.post(
Uri.parse('https://api.videosdk.live/v2/rooms'),
headers: {'Authorization': token},
);
//Destructuring the roomId from the response
return json.decode(httpResponse.body)['roomId'];
}
// API call to connect agent
Future<void> connectAgent(String meetingId) async {
final http.Response httpResponse = await http.post(
Uri.parse('https://api.videosdk.live/v2/agent/general/dispatch'),
headers: {
'Authorization': token,
'Content-Type': 'application/json',
},
body: json.encode({
'agentId': agentId,
'meetingId': meetingId,
'versionId': versionId,
}),
);
if (httpResponse.statusCode != 200) {
throw Exception('Failed to connect agent');
}
}
```
### Step 4: Design the User Interface (UI)
Update the UI files to add the "Connect Agent" button and connect the logic.
```dart title="lib/join_screen.dart"
import 'package:flutter/material.dart';
import 'api_call.dart';
import 'meeting_screen.dart';
class JoinScreen extends StatelessWidget {
final _meetingIdController = TextEditingController();
JoinScreen({super.key});
void onJoinButtonPressed(BuildContext context) {
// check that the meeting id is not null or invalid
// if the meeting id is valid, navigate to MeetingScreen with meetingId and token
Navigator.of(context).push(
MaterialPageRoute(
builder:
(context) =>
MeetingScreen(meetingId: "YOUR_MEETING_ID", token: token),
),
);
}
@override
Widget build(BuildContext context) {
return Scaffold(
appBar: AppBar(title: const Text('VideoSDK QuickStart')),
body: Padding(
padding: const EdgeInsets.all(12.0),
child: Center(
child: ElevatedButton(
onPressed: () => onJoinButtonPressed(context),
child: const Text('Join Meeting'),
),
),
),
);
}
}
```
```dart title="lib/meeting_screen.dart"
import 'package:flutter/material.dart';
import 'package:videosdk/videosdk.dart';
import 'participant_tile.dart';
import 'meeting_controls.dart';
import 'api_call.dart';
class MeetingScreen extends StatefulWidget {
final String meetingId;
final String token;
const MeetingScreen({
super.key,
required this.meetingId,
required this.token,
});
@override
State<MeetingScreen> createState() => _MeetingScreenState();
}
class _MeetingScreenState extends State<MeetingScreen> {
late Room _room;
var micEnabled = true;
var camEnabled = true;
bool _isAgentConnected = false;
Map<String, Participant> participants = {};
@override
void initState() {
// create room
_room = VideoSDK.createRoom(
roomId: widget.meetingId,
token: widget.token,
displayName: "John Doe",
micEnabled: micEnabled,
camEnabled: false,
defaultCameraIndex:
1, // Index of MediaDevices will be used to set default camera
);
setMeetingEventListener();
// Join room
_room.join();
super.initState();
}
// listening to meeting events
void setMeetingEventListener() {
_room.on(Events.roomJoined, () {
setState(() {
participants.putIfAbsent(
_room.localParticipant.id,
() => _room.localParticipant,
);
});
});
_room.on(Events.participantJoined, (Participant participant) {
setState(
() => participants.putIfAbsent(participant.id, () => participant),
);
});
_room.on(Events.participantLeft, (String participantId) {
if (participants.containsKey(participantId)) {
setState(() => participants.remove(participantId));
}
});
_room.on(Events.roomLeft, () {
participants.clear();
Navigator.popUntil(context, ModalRoute.withName('/'));
});
}
void _connectAgent() async {
try {
await connectAgent(widget.meetingId);
setState(() {
_isAgentConnected = true;
});
ScaffoldMessenger.of(context).showSnackBar(
const SnackBar(content: Text('Agent connected successfully!')),
);
} catch (e) {
ScaffoldMessenger.of(context).showSnackBar(
SnackBar(content: Text('Failed to connect agent: ${e.toString()}')),
);
}
}
// onbackButton pressed leave the room
Future<bool> _onWillPop() async {
_room.leave();
return true;
}
@override
Widget build(BuildContext context) {
return WillPopScope(
onWillPop: () => _onWillPop(),
child: Scaffold(
appBar: AppBar(title: const Text('VideoSDK QuickStart')),
body: Padding(
padding: const EdgeInsets.all(8.0),
child: Column(
children: [
Text(widget.meetingId),
//render all participant
Expanded(
child: Padding(
padding: const EdgeInsets.all(8.0),
child: GridView.builder(
gridDelegate:
const SliverGridDelegateWithFixedCrossAxisCount(
crossAxisCount: 2,
crossAxisSpacing: 10,
mainAxisSpacing: 10,
mainAxisExtent: 300,
),
itemBuilder: (context, index) {
return ParticipantTile(
key: Key(participants.values.elementAt(index).id),
participant: participants.values.elementAt(index),
);
},
itemCount: participants.length,
),
),
),
MeetingControls(
onToggleMicButtonPressed: () {
micEnabled ? _room.muteMic() : _room.unmuteMic();
micEnabled = !micEnabled;
},
onLeaveButtonPressed: () => _room.leave(),
onConnectAgentButtonPressed: _isAgentConnected ? null : _connectAgent,
),
],
),
),
),
);
}
}
```
```dart title="lib/meeting_controls.dart"
import 'package:flutter/material.dart';
class MeetingControls extends StatelessWidget {
final void Function() onToggleMicButtonPressed;
final void Function() onLeaveButtonPressed;
final void Function()? onConnectAgentButtonPressed;
const MeetingControls({
super.key,
required this.onToggleMicButtonPressed,
required this.onLeaveButtonPressed,
required this.onConnectAgentButtonPressed,
});
@override
Widget build(BuildContext context) {
return Row(
mainAxisAlignment: MainAxisAlignment.spaceEvenly,
children: [
ElevatedButton(
onPressed: onLeaveButtonPressed,
child: const Text('Leave'),
),
ElevatedButton(
onPressed: onToggleMicButtonPressed,
child: const Text('Toggle Mic'),
),
ElevatedButton(
onPressed: onConnectAgentButtonPressed,
child: const Text('Connect Agent'),
),
],
);
}
}
```
## 2. Creating the AI Agent from Dashboard (No-Code)
You can create and configure a powerful AI agent directly from the VideoSDK dashboard.
### Step 1: Create Your Agent
First, follow our detailed guide to **[Build a Custom Voice AI Agent in Minutes](/ai_agents/agent-runtime/build-agent)**. This will walk you through creating the agent's persona, configuring its pipeline (Realtime or Cascading), and testing it directly from the dashboard.
### Step 2: Get Agent and Version ID
Once your agent is created, you need to get its `agentId` and `versionId` to connect it to your frontend application.
1. After creating your agent, open the agent's page and find the JSON editor on the right side. Copy the `agentId`.
2. To get the `versionId`, click the three dots beside the **Deploy** button and select "Version History". Use the copy button next to the version you want to copy its ID.

### Step 3: Configure IDs in Frontend
Now, update your `lib/api_call.dart` file with these IDs.
```dart title="lib/api_call.dart"
const token = 'your_videosdk_auth_token_here';
const agentId = 'paste_your_agent_id_here';
const versionId = 'paste_your_version_id_here';
```
## 3. Run the Application
### Step 1: Run the Frontend
Once you have completed all the steps mentioned above, start your Flutter application:
```bash
flutter run
```
### Step 2: Connect and Interact
1. **Join the meeting from the Flutter app:**
   - Click the "Join Meeting" button.
   - Allow microphone permissions when prompted.
2. **Connect the agent:**
   - Once you join, click the "Connect Agent" button.
   - You should see a confirmation that the agent was connected.
   - The AI agent will join the meeting and greet you.
3. **Start playing:**
   - Interact with your AI agent using your microphone.
## Troubleshooting
### Common Issues:
1. **Agent not joining:**
   - Check that the `roomId`, `agentId`, and `versionId` are correctly set.
   - Verify your VideoSDK token is valid and has the necessary permissions.
2. **Audio not working:**
   - Check device permissions for microphone access.
3. **"Failed to connect agent" error:**
   - Verify your `agentId` and `versionId` are correct.
   - Check the debug console for any network errors.
4. **Flutter build issues:**
   - Ensure your Flutter version is compatible.
   - Try cleaning the build: `flutter clean`.
   - Delete `pubspec.lock` and run `flutter pub get`.
---
---
title: Agent Runtime with iOS
hide_title: false
hide_table_of_contents: false
description: VideoSDK enables the opportunity to integrate AI agents with real-time voice interaction using an iOS frontend and a no-code agent from the dashboard.
sidebar_label: With iOS
pagination_label: Agent Runtime with iOS
keywords:
- ai agent
- no-code
- voice interaction
- real-time communication
- ios sdk
- swiftui
image: img/videosdklive-thumbnail.jpg
sidebar_position: 2
slug: with-ios
---
import Step from '@site/src/components/Step'
# Agent Runtime with iOS
VideoSDK empowers you to integrate an AI voice agent into your iOS app within minutes. This guide shows you how to connect an iOS (SwiftUI) frontend with an AI agent created and configured entirely from the VideoSDK dashboard.
## Prerequisites
- macOS with Xcode 15.0+
- iOS 13.0+ deployment target
- Valid VideoSDK [Account](https://app.videosdk.live/)
- Familiarity with creating a no-code voice agent. If you're new to this, please follow our guide on how to **[Build a Custom Voice AI Agent in Minutes](/ai_agents/agent-runtime/build-agent)** first.
:::important
You need a VideoSDK account to generate a token and an agent from the dashboard.
:::
### Step 1: Clone the sample project
Clone the repository to your local environment.
```bash
git clone https://github.com/videosdk-live/agents-quickstart.git
cd agents-quickstart/mobile-quickstarts/ios/
```
### Step 2: Environment Configuration
#### Create a Meeting Room
Create a meeting room using the VideoSDK API:
```bash
curl -X POST https://api.videosdk.live/v2/rooms \
-H "Authorization: YOUR_VIDEOSDK_AUTH_TOKEN" \
-H "Content-Type: application/json"
```
Use the returned `roomId` in your configuration files.
#### Configuration Files
Update the following files with your credentials. The Agent and Version IDs will be retrieved in a later step.
**MeetingViewController.swift** (line 14):
```swift
var token = "YOUR_VIDEOSDK_AUTH_TOKEN" // Add Your token here
var agentId = "YOUR_AGENT_ID"
var versionId = "YOUR_VERSION_ID"
```
**JoinScreenView.swift** (line 13):
```swift
let meetingId: String = "YOUR_MEETING_ID"
```
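If you would rather script these substitutions than edit the files by hand, the sketch below replaces a placeholder string in a file. It is only an illustration: `fill_placeholder` is a hypothetical helper, it assumes the placeholders appear verbatim as shown above, and it must be run from the directory that contains the Swift files.

```bash
# Hypothetical helper: replace every occurrence of a placeholder in a file.
fill_placeholder() {
  local file="$1" placeholder="$2" value="$3" tmp
  tmp=$(mktemp)
  # '|' is the sed delimiter, so the value must not contain '|', '&' or backslashes.
  if sed "s|${placeholder}|${value}|g" "$file" > "$tmp"; then
    cat "$tmp" > "$file"   # overwrite in place, keeping the original file permissions
  fi
  rm -f "$tmp"
}

# Usage (AUTH_TOKEN and ROOM_ID are assumed to be set in your shell):
# fill_placeholder MeetingViewController.swift YOUR_VIDEOSDK_AUTH_TOKEN "$AUTH_TOKEN"
# fill_placeholder JoinScreenView.swift YOUR_MEETING_ID "$ROOM_ID"
```

Avoid committing the edited files, since they then contain your real token.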
### Step 3: iOS Frontend Modifications
#### Step 1: Add Connect Agent Button
In `MeetingView.swift`, add a button to connect the agent.
```swift title="MeetingView.swift"
// Add this button to your view hierarchy
Button(action: {
meetingVC.connectAgent()
}) {
Text("Connect Agent")
}
.disabled(meetingVC.isAgentConnected)
```
#### Step 2: Implement Connect Logic
In `MeetingViewController.swift`, add the logic to call the dispatch API.
```swift title="MeetingViewController.swift"
// Add state to track if the agent is connected
@Published var isAgentConnected = false
// ...
func connectAgent() {
guard let url = URL(string: "https://api.videosdk.live/v2/agent/general/dispatch") else { return }
var request = URLRequest(url: url)
request.httpMethod = "POST"
request.setValue("application/json", forHTTPHeaderField: "Content-Type")
request.setValue(token, forHTTPHeaderField: "Authorization")
let body: [String: Any] = [
"agentId": agentId,
"meetingId": room?.id ?? "",
"versionId": versionId
]
request.httpBody = try? JSONSerialization.data(withJSONObject: body)
URLSession.shared.dataTask(with: request) { data, response, error in
if let error = error {
print("Connect error: \(error.localizedDescription)")
return
}
if let httpResponse = response as? HTTPURLResponse, httpResponse.statusCode == 200 {
DispatchQueue.main.async {
self.isAgentConnected = true
print("Agent connected successfully")
}
} else {
print("Failed to connect agent")
}
}.resume()
}
```
### Step 4: Creating the AI Agent from Dashboard (No-Code)
#### Step 1: Create Your Agent
First, follow our detailed guide to **[Build a Custom Voice AI Agent in Minutes](/ai_agents/agent-runtime/build-agent)**. This will walk you through creating the agent's persona, configuring its pipeline (Realtime or Cascading), and testing it directly from the dashboard.
#### Step 2: Get Agent and Version ID
Once your agent is created, you need to get its `agentId` and `versionId` to connect it to your frontend application.
1. After creating your agent, open the agent's page and find the JSON editor on the right side. Copy the `agentId`.
2. To get the `versionId`, click the three dots beside the **Deploy** button and select "Version History". Use the copy button next to the version you want to copy its ID.

#### Step 3: Configure IDs in Frontend
Now, update your `MeetingViewController.swift` file with these IDs.
```swift title="MeetingViewController.swift"
var agentId = "paste_your_agent_id_here"
var versionId = "paste_your_version_id_here"
```
### Step 5: Run the iOS Frontend
1. **Open Xcode:**
   ```bash
   open videosdk-agents-quickstart-ios.xcodeproj
   ```
2. **Configure your development team:**
   - Select the project in Xcode
   - Go to "Signing & Capabilities"
   - Select your development team
3. **Build and run:**
   - Select your target device or simulator
   - Press `Cmd + R` to build and run
### Step 6: Connect and Interact
1. Join the meeting from the app and allow microphone permissions.
2. When you join, click the "Connect Agent" button to call the agent into the meeting.
3. Talk to the agent in real time.
## Troubleshooting
### Common Issues
1. **Build Errors:**
   - Ensure Xcode 15.0+ is installed
   - Check iOS deployment target (13.0+)
   - Verify VideoSDK package dependency
2. **Authentication Issues:**
   - Verify the `token` value in `MeetingViewController.swift`
   - Check token permissions include `allow_join`
3. **Meeting Connection Issues:**
   - Ensure the `meetingId` in `JoinScreenView.swift` is correct
   - Verify network connectivity
   - Check VideoSDK account status
4. **AI Agent Issues:**
   - Verify `agentId` and `versionId` are set correctly
   - Check for errors in the Xcode console when connecting the agent.
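To check the `allow_join` permission without leaving the terminal, you can decode the token's payload. The sketch below assumes the VideoSDK token is a standard three-part JWT and that the permission appears as plain text in its payload; `jwt_payload` is a hypothetical helper. It only decodes the token and does not verify the signature.

```bash
# Print the decoded payload (second dot-separated segment) of a JWT.
jwt_payload() {
  local seg
  # Convert base64url characters to standard base64.
  seg=$(printf '%s' "$1" | cut -d '.' -f 2 | tr '_-' '/+')
  # Restore the '=' padding that JWTs strip.
  while [ $(( ${#seg} % 4 )) -ne 0 ]; do seg="${seg}="; done
  printf '%s' "$seg" | base64 --decode
}

# Usage (AUTH_TOKEN is assumed to hold your token):
# jwt_payload "$AUTH_TOKEN" | grep -q 'allow_join' && echo "token includes allow_join"
```

The decoded payload usually also shows the token's expiry, which is worth checking when a previously working token starts failing.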
---
---
title: Agent Runtime with React Native
hide_title: false
hide_table_of_contents: false
description: VideoSDK enables the opportunity to integrate AI agents with real-time voice interaction using a React Native frontend and a no-code agent from the dashboard.
sidebar_label: With React Native
pagination_label: Agent Runtime with React Native
keywords:
- ai agent
- no-code
- voice interaction
- real-time communication
- react native sdk
image: img/videosdklive-thumbnail.jpg
sidebar_position: 2
slug: with-react-native
---
import Step from '@site/src/components/Step'
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
# Agent Runtime with React Native
VideoSDK empowers you to integrate an AI voice agent into your React Native app (Android/iOS) within minutes. This guide shows you how to connect a React Native frontend with an AI agent created and configured entirely from the VideoSDK dashboard.
## Prerequisites
- VideoSDK Developer Account (get token from the [dashboard](https://app.videosdk.live/api-keys))
- Node.js and a working React Native environment (Android Studio and/or Xcode)
- Familiarity with creating a no-code voice agent. If you're new to this, please follow our guide on how to **[Build a Custom Voice AI Agent in Minutes](/ai_agents/agent-runtime/build-agent)** first.
:::important
You need a VideoSDK token and an agent from the dashboard. Generate your VideoSDK token from the dashboard.
:::
## Project Structure
First, create an empty project folder with `mkdir folder_name` in your preferred location for the React Native frontend. Your final project structure should look like this:
```jsx title="Directory Structure"
root
├── android/
├── ios/
├── App.js
├── constants.js
└── index.js
```
You will work on:
- `android/`: Contains the Android-specific project files.
- `ios/`: Contains the iOS-specific project files.
- `App.js`: The main React Native component, containing the UI and meeting logic.
- `constants.js`: To store token, meetingId, and agent credentials for the frontend.
- `index.js`: The entry point of the React Native application, where VideoSDK is registered.
## Building the React Native Frontend
### Step 1: Create App and Install SDKs
Create a React Native app and install the VideoSDK RN SDK:
```bash
npx react-native init videosdkAiAgentRN
cd videosdkAiAgentRN
# Install VideoSDK
npm install "@videosdk.live/react-native-sdk"
```
### Step 2: Configure the Project
#### Android Setup
```xml title="android/app/src/main/AndroidManifest.xml"
```
```java title="android/app/build.gradle"
dependencies {
implementation project(':rnwebrtc')
}
```
```gradle title="android/settings.gradle"
include ':rnwebrtc'
project(':rnwebrtc').projectDir = new File(rootProject.projectDir, '../node_modules/@videosdk.live/react-native-webrtc/android')
```
```kotlin title="MainApplication.kt"
import live.videosdk.rnwebrtc.WebRTCModulePackage
class MainApplication : Application(), ReactApplication {
override val reactNativeHost: ReactNativeHost =
object : DefaultReactNativeHost(this) {
override fun getPackages(): List<ReactPackage> {
val packages = PackageList(this).packages.toMutableList()
packages.add(WebRTCModulePackage())
return packages
}
// ...
}
}
```
```properties title="android/gradle.properties"
# This one fixes a weird WebRTC runtime problem on some devices.
android.enableDexingArtifactTransform.desugaring=false
```
```java title="android/app/proguard-rules.pro"
-keep class org.webrtc.** { *; }
```
```java title="android/build.gradle"
buildscript {
ext {
minSdkVersion = 23
}
}
```
#### iOS Setup
To update CocoaPods, you can reinstall the gem using the following command:
```sh
sudo gem install cocoapods
```
```ruby title="ios/Podfile"
pod 'react-native-webrtc', :path => '../node_modules/@videosdk.live/react-native-webrtc'
```
You need to change the platform field in the Podfile to 12.0 or above because react-native-webrtc doesn't support iOS versions earlier than 12.0. Update the line to `platform :ios, '12.0'`.
After updating the version, you need to install the pods by running the following command:
```sh
pod install
```
Add the following lines to your `Info.plist` file (located at `<project folder>/ios/<project name>/Info.plist`):
```xml title="ios/MyApp/Info.plist"
<key>NSCameraUsageDescription</key>
<string>Camera permission description</string>
<key>NSMicrophoneUsageDescription</key>
<string>Microphone permission description</string>
```
### Step 3: Register Service and Configure
Register the VideoSDK services in your root `index.js` file so they are initialized when the app starts.
```js title="index.js"
import { AppRegistry } from "react-native";
import App from "./App";
import { name as appName } from "./app.json";
import { register } from "@videosdk.live/react-native-sdk";
register();
AppRegistry.registerComponent(appName, () => App);
```
Create a `constants.js` file to store your token, meeting ID, and agent credentials.
```js title="constants.js"
export const token = "YOUR_VIDEOSDK_AUTH_TOKEN";
export const meetingId = "YOUR_MEETING_ID";
export const name = "User Name";
export const agentId = "YOUR_AGENT_ID";
export const versionId = "YOUR_VERSION_ID";
```
### Step 4: Build UI and wire up MeetingProvider
```js title="App.js"
import React, { useState } from 'react';
import {
SafeAreaView,
TouchableOpacity,
Text,
View,
FlatList,
Alert,
} from 'react-native';
import {
MeetingProvider,
useMeeting,
} from '@videosdk.live/react-native-sdk';
import { meetingId, token, name, agentId, versionId } from './constants';
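// NOTE: the JSX layout and inline styles below are a minimal sketch; adapt them to your design.
const Button = ({ onPress, buttonText, backgroundColor }) => {
  return (
    <TouchableOpacity
      onPress={onPress}
      style={{
        backgroundColor: backgroundColor,
        justifyContent: 'center',
        alignItems: 'center',
        padding: 12,
        borderRadius: 4,
      }}>
      <Text style={{ color: 'white', fontSize: 12 }}>{buttonText}</Text>
    </TouchableOpacity>
  );
};

function ControlsContainer({ join, leave, toggleMic }) {
  // Tracks whether the agent has already been dispatched into this meeting
  const [connected, setConnected] = useState(false);

  // Calls the dispatch API to send the dashboard agent into the meeting
  const connectAgent = async () => {
    try {
      const response = await fetch("https://api.videosdk.live/v2/agent/general/dispatch", {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          Authorization: token,
        },
        body: JSON.stringify({ agentId: agentId, meetingId: meetingId, versionId: versionId }),
      });
      if (response.ok) {
        Alert.alert("Agent connected successfully!");
        setConnected(true);
      } else {
        Alert.alert("Failed to connect agent.");
      }
    } catch (error) {
      console.error("Error connecting agent:", error);
      Alert.alert("An error occurred while connecting the agent.");
    }
  };

  return (
    <View style={{ padding: 24, flexDirection: 'row', justifyContent: 'space-between' }}>
      <Button onPress={() => join()} buttonText="Join" backgroundColor="#1178F8" />
      <Button onPress={() => toggleMic()} buttonText="Toggle Mic" backgroundColor="#1178F8" />
      {/* Hide the button once the agent is connected */}
      {!connected && (
        <Button onPress={connectAgent} buttonText="Connect Agent" backgroundColor="#1178F8" />
      )}
      <Button onPress={() => leave()} buttonText="Leave" backgroundColor="#FF0000" />
    </View>
  );
}

function ParticipantView({ participantDisplayName }) {
  return (
    <View style={{ backgroundColor: 'grey', height: 300, justifyContent: 'center', alignItems: 'center', marginVertical: 8 }}>
      <Text style={{ fontSize: 16, color: 'white' }}>Participant: {participantDisplayName}</Text>
    </View>
  );
}

function ParticipantList({ participants }) {
  return participants.length > 0 ? (
    <FlatList
      data={participants}
      renderItem={({ item }) => {
        return <ParticipantView participantDisplayName={item.displayName} />;
      }}
    />
  ) : (
    <View style={{ flex: 1, justifyContent: 'center', alignItems: 'center' }}>
      <Text style={{ fontSize: 20 }}>Press Join button to enter meeting.</Text>
    </View>
  );
}

function MeetingView() {
  const { join, leave, toggleMic, participants, meetingId } = useMeeting({});
  const participantsList = [...participants.values()].map(participant => ({
    displayName: participant.displayName,
  }));

  return (
    <View style={{ flex: 1 }}>
      {meetingId ? (
        <Text style={{ fontSize: 18, padding: 12 }}>Meeting Id : {meetingId}</Text>
      ) : null}
      <ParticipantList participants={participantsList} />
      <ControlsContainer join={join} leave={leave} toggleMic={toggleMic} />
    </View>
  );
}

export default function App() {
  if (!meetingId || !token) {
    return (
      <SafeAreaView style={{ flex: 1, justifyContent: 'center', alignItems: 'center' }}>
        <Text style={{ fontSize: 20, textAlign: 'center' }}>
          Please add a valid Meeting ID and Token in the `constants.js` file.
        </Text>
      </SafeAreaView>
    );
  }

  return (
    <SafeAreaView style={{ flex: 1, backgroundColor: '#F6F6FF' }}>
      <MeetingProvider
        config={{ meetingId, micEnabled: true, webcamEnabled: false, name }}
        token={token}>
        <MeetingView />
      </MeetingProvider>
    </SafeAreaView>
  );
}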
```
## Creating the AI Agent from Dashboard (No-Code)
You can create and configure a powerful AI agent directly from the VideoSDK dashboard.
### Step 1: Create Your Agent
First, follow our detailed guide to **[Build a Custom Voice AI Agent in Minutes](/ai_agents/agent-runtime/build-agent)**. This will walk you through creating the agent's persona, configuring its pipeline (Realtime or Cascading), and testing it directly from the dashboard.
### Step 2: Get Agent and Version ID
Once your agent is created, you need to get its `agentId` and `versionId` to connect it to your frontend application.
1. After creating your agent, open the agent's page and find the JSON editor on the right side. Copy the `agentId` from it.
2. To get the `versionId`, click the three dots beside the Deploy button and select "Version History". Use the copy button next to the version you want to copy its version ID.

### Step 3: Configure IDs in Frontend
Now, update your `constants.js` file with these IDs.
```js title="constants.js"
export const token = "your_videosdk_auth_token_here";
export const meetingId = "YOUR_MEETING_ID";
export const name = "User Name";
export const agentId = "paste_your_agent_id_here";
export const versionId = "paste_your_version_id_here";
```
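Because the app only checks that `meetingId` and `token` are non-empty, a template placeholder left in `constants.js` would surface later as a failed dispatch. The sketch below flags values that are empty or still equal to the placeholders shown above. `findUnsetConstants` is a hypothetical helper, not part of the sample project.

```javascript
// Placeholder strings copied from the constants.js template above.
const PLACEHOLDERS = new Set([
  "your_videosdk_auth_token_here",
  "YOUR_MEETING_ID",
  "paste_your_agent_id_here",
  "paste_your_version_id_here",
]);

// Hypothetical helper: returns the names of constants that are empty
// or still hold a template placeholder.
function findUnsetConstants(constants) {
  return Object.entries(constants)
    .filter(([, value]) => !value || PLACEHOLDERS.has(value))
    .map(([name]) => name);
}
```

Calling `findUnsetConstants({ token, meetingId, agentId, versionId })` at startup and logging any returned names makes a forgotten value visible before the first dispatch attempt.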
## Run the Application
### 1) Start the React Native app
```bash
npm install
# Android
npm run android
# iOS (macOS only)
cd ios && pod install && cd ..
npm run ios
```
### 2) Connect and interact
1. Join the meeting from the app and allow microphone permissions.
2. After joining, tap the "Connect Agent" button to bring the agent into the meeting.
3. Talk to the agent in real time.
## Troubleshooting
- Ensure the app and the agent use the same `meetingId`, and that the `agentId` and `versionId` in `constants.js` are correct.
- Verify microphone permissions on the device/simulator.
- Confirm your VideoSDK token is valid.
- If audio is silent, check device output volume.
---
---
title: React Agent Starter
hide_title: false
hide_table_of_contents: false
description: VideoSDK lets you integrate AI agents with real-time voice interaction using a React frontend and a no-code agent from the dashboard.
sidebar_label: React
pagination_label: Agent Runtime with React
keywords:
- ai agent
- no-code
- voice interaction
- real-time communication
- React sdk
image: img/videosdklive-thumbnail.jpg
sidebar_position: 2
slug: agent-starter-react
---
import Step from '@site/src/components/Step'
import CreateAgent from '@site/mdx/_ai-agent-starter-sdk-guide.mdx'
# Agent Starter App - React
VideoSDK lets you add a voice-enabled AI agent to your React app. This guide walks you through connecting your React frontend to an agent configured and deployed directly from the VideoSDK dashboard.
## Prerequisites
- A deployed AI agent on VideoSDK Agent Cloud. If you haven't done this yet, create and deploy your agent using the [Low-Code Deployment UI](/ai_agents/agent-runtime/build-agent) on the VideoSDK Dashboard — no coding required. Once deployed, note down your **Agent ID**.
- Node.js (v18.x or later)
- npm or yarn
- A valid Video SDK [account](https://app.videosdk.live/)
import APISecret from '@site/mdx/introduction/_api-key.mdx';
## Run the Sample Project
### Step 1: Clone the sample project
Clone the repository to your local environment.
```bash
git clone https://github.com/videosdk-live/agent-starter-app-react.git
cd agent-starter-app-react
```
### Step 2: Install the dependencies
Install all the dependencies to run the project.
```bash
npm install
# or
yarn install
```
### Step 3: Create Your Agent (Optional)
:::info
If you've already configured and deployed your agent from the VideoSDK Dashboard, you can jump directly to [Step 4](#step-4-setup-environment-variables).
:::
### Step 4: Setup Environment Variables
Copy the `.env.example` file to `.env`.
```bash
cp .env.example .env
```
Update the `.env` file with your credentials. The `AGENT_ID` is the identifier for the Low-Code agent you deployed from the VideoSDK Dashboard.
```env title=".env"
AUTH_TOKEN=your_videosdk_auth_token
AGENT_ID=your_agent_id
MEETING_ID=your_meeting_id
VERSION_ID=your_version_id
```
> **Tip:** You can obtain your `AUTH_TOKEN` and `AGENT_ID` from the [VideoSDK Dashboard](https://app.videosdk.live/) under your Agent Cloud deployment. `MEETING_ID` is optional — if left blank, the app will create a new meeting automatically.
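
As a sketch of how these variables might be validated at startup, the function below requires `AUTH_TOKEN`, `AGENT_ID`, and `VERSION_ID` and treats a blank `MEETING_ID` as "create a new meeting". `readAgentConfig` is a hypothetical helper. How the sample app actually reads its environment is not shown here, and treating `VERSION_ID` as required is an assumption based on the troubleshooting notes in this guide.

```javascript
// Hypothetical helper: validates the variables listed in .env.
// Pass in whatever object your build tool exposes them on.
function readAgentConfig(env) {
  // Assumption: these three values are needed before an agent can be dispatched.
  const required = ["AUTH_TOKEN", "AGENT_ID", "VERSION_ID"];
  const missing = required.filter((name) => !env[name] || env[name].trim() === "");
  if (missing.length > 0) {
    throw new Error(`Missing required variables: ${missing.join(", ")}`);
  }
  const meetingId = env.MEETING_ID ? env.MEETING_ID.trim() : "";
  return {
    token: env.AUTH_TOKEN.trim(),
    agentId: env.AGENT_ID.trim(),
    versionId: env.VERSION_ID.trim(),
    // MEETING_ID is optional; null signals that a new meeting should be created.
    meetingId: meetingId !== "" ? meetingId : null,
  };
}
```

Failing fast with the list of missing names tends to be easier to debug than a rejected dispatch request later on.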
### Step 5: Run the Sample App
Bingo, it's time to push the launch button.
```bash
npm run dev
# or
yarn dev
```
Once running, the app will use the Dispatch API to send your deployed agent into the meeting room. You'll see the live transcription as you speak, and the agent will respond in real time.
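The dispatch step can be sketched as a single request. This assumes the same endpoint and request body that the React Native starter's `connectAgent` handler uses; the React sample may wrap the call differently. `dispatchAgent` and its `fetchImpl` parameter are hypothetical names introduced for illustration.

```javascript
// Assumed endpoint, taken from the React Native starter's connectAgent handler.
const DISPATCH_URL = "https://api.videosdk.live/v2/agent/general/dispatch";

// Hypothetical helper. `fetchImpl` defaults to the global fetch (browsers, Node 18+)
// and can be replaced with a stub in tests.
async function dispatchAgent({ token, agentId, meetingId, versionId }, fetchImpl = fetch) {
  const response = await fetchImpl(DISPATCH_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json", Authorization: token },
    body: JSON.stringify({ agentId, meetingId, versionId }),
  });
  if (!response.ok) {
    // Surface the HTTP status so a failed dispatch is easier to debug.
    throw new Error(`Agent dispatch failed with HTTP ${response.status}`);
  }
  return true;
}
```

Including the HTTP status in the error gives a starting point when the agent does not join.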
---
## Troubleshooting
### Common Issues:
1. **Agent not joining:**
- Check that the `AGENT_ID` and `VERSION_ID` in your `.env` are correctly set.
- Verify your VideoSDK token is valid and has the necessary permissions.
2. **Audio not working:**
- Check device permissions for microphone access.
3. **"Failed to connect agent" error:**
- Verify your `AGENT_ID` and `VERSION_ID` are correct.
- Check the debug console for any network errors.
4. **React build issues:**
- Ensure `Node.js` version is 18 or higher.
- Restart the dev server.
- Delete `node_modules` and reinstall:
```bash
rm -rf node_modules
npm install
```
---
---
title: Agent Runtime with JavaScript
hide_title: false
hide_table_of_contents: false
description: VideoSDK lets you integrate AI agents with real-time voice interaction using a JavaScript frontend and a no-code backend.
sidebar_label: With JavaScript
pagination_label: Agent Runtime with JavaScript
keywords:
- ai agent
- no-code
- voice interaction
- real-time communication
- javascript sdk
image: img/videosdklive-thumbnail.jpg
sidebar_position: 2
slug: with-javascript
---
import Step from '@site/src/components/Step'
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
# Agent Runtime with JavaScript
VideoSDK empowers you to seamlessly integrate AI agents with real-time voice interaction into your JavaScript application within minutes. This guide shows you how to connect a JavaScript frontend with an AI agent created and configured entirely from the VideoSDK dashboard.
## Prerequisites
Before proceeding, ensure that your development environment meets the following requirements:
- A Video SDK developer account (if you don't have one, sign up on the **[Video SDK Dashboard](https://app.videosdk.live/)**)
- Node.js installed on your device
- Familiarity with creating a no-code voice agent. If you're new to this, please follow our guide on how to **[Build a Custom Voice AI Agent in Minutes](/ai_agents/agent-runtime/build-agent)** first.
:::important
You need a VideoSDK account to generate a token and to create an agent from the dashboard.
Visit the VideoSDK **[dashboard](https://app.videosdk.live/api-keys)** to generate a token.
:::
## Project Structure
First, create an empty project folder with `mkdir folder_name` in your preferred location for the JavaScript frontend. Your final project structure should look like this:
```jsx title="Project Structure"
root
├── index.html
├── config.js
└── index.js
```
You will be working on the following files:
- `index.html`: Responsible for creating a basic UI for joining the meeting.
- `config.js`: Responsible for storing the token, room ID, and agent credentials.
- `index.js`: Responsible for rendering the meeting view and audio functionality.
## Building the JavaScript Frontend