
MCP Integration

The Model Context Protocol (MCP) is an open standard that lets AI assistants connect securely to data sources and tools. With VideoSDK's AI Agents, you can integrate MCP servers to extend your agent's capabilities with external services, applications, databases, and APIs.

MCP Server Types

VideoSDK supports two transport methods for MCP servers:

1. STDIO Transport

  • Direct process communication
  • Local Python scripts
  • Best for custom tools and functions
  • Ideal for server-side integrations

2. HTTP Transport (Streamable HTTP or SSE)

  • Network-based communication
  • External MCP services
  • Best for third-party integrations
  • Supports remote MCP servers
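
To make "direct process communication" concrete, the sketch below launches a child process and exchanges one newline-delimited JSON-RPC message with it over stdin and stdout, which is the framing the STDIO transport relies on. The child script here is a hypothetical stand-in, not a real MCP server: it skips the MCP initialization handshake and simply echoes the method name. The names CHILD_SOURCE and call_over_stdio are illustrative only.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for an MCP server: reads one JSON-RPC request per
# line on stdin and writes one JSON-RPC response per line on stdout.
CHILD_SOURCE = r'''
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    response = {"jsonrpc": "2.0", "id": request["id"],
                "result": {"echo": request["method"]}}
    sys.stdout.write(json.dumps(response) + "\n")
    sys.stdout.flush()
'''


def call_over_stdio(method: str, request_id: int = 1) -> dict:
    """Send one JSON-RPC request to the child process and return its reply."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        request = {"jsonrpc": "2.0", "id": request_id, "method": method}
        proc.stdin.write(json.dumps(request) + "\n")
        proc.stdin.flush()
        # Each message occupies exactly one line, so readline() returns one reply.
        return json.loads(proc.stdout.readline())
    finally:
        # Closing stdin signals end of input, which lets the child exit.
        proc.stdin.close()
        proc.wait(timeout=5)
        proc.stdout.close()


if __name__ == "__main__":
    print(call_over_stdio("tools/list"))
```

With the HTTP transport, the same kind of JSON-RPC message travels over a network connection instead of a pipe, which is why that transport suits remote servers.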

How It Works with VideoSDK's AI Agent

MCP tools are discovered automatically and made available to your agent. The agent chooses which tools to use based on the user's request. When a user asks for information that requires external data, the agent will:

  • Identify the need for external data based on the user's request
  • Select appropriate tools from available MCP servers
  • Execute the tools with relevant parameters
  • Process the results and provide a natural language response
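
The execute and respond steps above can be sketched as a small dispatch function. This is an illustrative model only, not VideoSDK's internal implementation: the TOOLS registry, the shape of the call dictionary, and the function name execute_tool_call are all assumptions, and the selection step is taken as already done by the language model.

```python
from typing import Any, Callable, Dict

# Hypothetical registry mapping tool names to callables. A real agent would
# build this from the tools that each connected MCP server advertises.
TOOLS: Dict[str, Callable[..., Any]] = {
    "get_current_time": lambda: "12:00:00",
    "add": lambda a, b: a + b,
}


def execute_tool_call(call: Dict[str, Any]) -> str:
    """Run the tool chosen by the model and turn the result into text."""
    name = call["name"]
    arguments = call.get("arguments", {})

    # Guard against a model asking for a tool that no server provides.
    if name not in TOOLS:
        return f"Tool '{name}' is not available."

    # Execute the tool with the parameters the model supplied.
    result = TOOLS[name](**arguments)

    # A real agent would hand this back to the model to phrase a reply.
    return f"{name} returned {result}"


if __name__ == "__main__":
    print(execute_tool_call({"name": "add", "arguments": {"a": 2, "b": 3}}))
```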

This seamless integration allows your voice agent to access real-time data and external services while maintaining a natural conversational flow.

Creating an MCP Server

Basic MCP Server Structure

The following simple MCP server uses the STDIO transport to return the current time. First, install the required package:

pip install fastmcp
mcp_stdio_example.py
from mcp.server.fastmcp import FastMCP
import datetime

# Create the MCP server
mcp = FastMCP("CurrentTimeServer")

@mcp.tool()
def get_current_time() -> str:
    """Get the current time in the user's location"""

    # Get current time
    now = datetime.datetime.now()

    # Return formatted time string
    return f"The current time is {now.strftime('%H:%M:%S')} on {now.strftime('%Y-%m-%d')}"

if __name__ == "__main__":
    # Run the server with STDIO transport
    mcp.run(transport="stdio")
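
Note that datetime.datetime.now() reads the clock of the machine running the server, which may differ from the user's location. As a hedged sketch of a tool that accepts a parameter, the function below takes a fixed UTC offset. The name get_time_at_offset and its parameter are hypothetical; to expose it as a tool, it would be decorated with @mcp.tool() in the same way as get_current_time. The decorator is left out here so the snippet runs with only the standard library.

```python
import datetime


def get_time_at_offset(utc_offset_hours: int = 0) -> str:
    """Get the current time at a fixed UTC offset, for example 5 for UTC+05:00."""
    # Reject offsets outside the range used by real-world time zones.
    if not -12 <= utc_offset_hours <= 14:
        raise ValueError("utc_offset_hours must be between -12 and 14")

    # Build a fixed-offset time zone and read the clock in that zone.
    tz = datetime.timezone(datetime.timedelta(hours=utc_offset_hours))
    now = datetime.datetime.now(tz)

    # Same message format as get_current_time, with the offset appended.
    return (
        f"The current time is {now.strftime('%H:%M:%S')} "
        f"on {now.strftime('%Y-%m-%d')} (UTC{now.strftime('%z')})"
    )


if __name__ == "__main__":
    print(get_time_at_offset(5))
```

Type hints and the docstring matter here because FastMCP uses them to describe the tool and its arguments to the model.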

Integrating MCP with VideoSDK Agent

The example below attaches an STDIO server and an HTTP server to a VideoSDK AI Agent through the mcp_servers argument:

main.py
import asyncio
import sys
from pathlib import Path  # Path is used below to locate the MCP script
from videosdk.agents import Agent, AgentSession, RealTimePipeline, MCPServerStdio, MCPServerHTTP
from videosdk.plugins.google import GeminiRealtime, GeminiLiveConfig

class MyVoiceAgent(Agent):
    def __init__(self):
        # Define paths to your MCP servers
        mcp_script = Path(__file__).parent.parent / "MCP_Example" / "mcp_stdio_example.py"
        super().__init__(
            instructions="""You are a helpful assistant with access to real-time data.
            You can provide current time information.
            Always be conversational and helpful in your responses.""",
            mcp_servers=[
                # STDIO MCP Server (local Python script for time)
                MCPServerStdio(
                    command=sys.executable,  # Use current Python interpreter
                    args=[str(mcp_script)],
                    client_session_timeout_seconds=30
                ),
                # HTTP MCP Server (external service, e.g. Zapier)
                MCPServerHTTP(
                    url="https://your-mcp-service.com/api/mcp",
                    client_session_timeout_seconds=30
                )
            ]
        )

    async def on_enter(self) -> None:
        await self.session.say("Hi there! How can I help you today?")

    async def on_exit(self) -> None:
        await self.session.say("Thank you for using the assistant. Goodbye!")

async def main(context: dict):
    # Configure Gemini Realtime model
    model = GeminiRealtime(
        model="gemini-2.0-flash-live-001",
        config=GeminiLiveConfig(
            voice="Leda",  # Available voices: Puck, Charon, Kore, Fenrir, Aoede, Leda, Orus, Zephyr
            response_modalities=["AUDIO"]
        )
    )

    pipeline = RealTimePipeline(model=model)
    agent = MyVoiceAgent()

    session = AgentSession(
        agent=agent,
        pipeline=pipeline,
        context=context
    )

    try:
        # Start the session
        await session.start()
        # Keep the session running until manually terminated
        await asyncio.Event().wait()
    finally:
        # Clean up resources when done
        await session.close()

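if __name__ == "__main__":
    def make_context():
        # If VIDEOSDK_AUTH_TOKEN is set in .env, omit the "videosdk_auth" key
        return {
            "meetingId": "your_actual_meeting_id_here",  # Replace with actual meeting ID
            "name": "AI Voice Agent",
            "videosdk_auth": "your_videosdk_auth_token_here"  # Replace with actual token
        }

    # Assumed entry point: the original snippet ends without invoking main(),
    # so this runs it with the standard asyncio runner.
    asyncio.run(main(context=make_context()))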
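
The comment in make_context says to leave out videosdk_auth when VIDEOSDK_AUTH_TOKEN is already configured. One way to follow that rule without editing the dictionary by hand is sketched below. It is an assumption-laden illustration: the function build_context and its env parameter are hypothetical, and it only checks the process environment, so it presumes something else has already loaded the .env file.

```python
import os
from typing import Dict, Mapping, Optional


def build_context(
    meeting_id: str,
    name: str = "AI Voice Agent",
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Build the session context, adding videosdk_auth only when needed."""
    # Default to the real process environment; tests can pass a plain dict.
    env = os.environ if env is None else env

    context = {"meetingId": meeting_id, "name": name}

    # Only include the token key when the environment variable is absent.
    if not env.get("VIDEOSDK_AUTH_TOKEN"):
        context["videosdk_auth"] = "your_videosdk_auth_token_here"  # placeholder

    return context


if __name__ == "__main__":
    print(build_context("your_actual_meeting_id_here"))
```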

Tip

To get started quickly, see the Quick Start Example for the VideoSDK AI Agent SDK with MCP, which includes everything you need to build your first AI agent.
