--- title: A2A Implementation Guide hide_title: false hide_table_of_contents: false description: "Complete implementation guide for building Agent to Agent (A2A) systems with VideoSDK AI Agents. Learn to create customer service and specialist agents that collaborate seamlessly using real-world examples." pagination_label: "A2A Implementation" keywords: - A2A Implementation - Agent to Agent Example - Multi-Agent System - Multiple Agent - A2A Protocol - AI Agent - Google's A2A - Customer Service Agent - Loan Specialist Agent - VideoSDK Agents - AI Agent SDK - Python Implementation - Agent Collaboration image: img/videosdklive-thumbnail.jpg sidebar_position: 6 sidebar_label: Implementation slug: implementation --- # A2A Implementation Guide This guide shows you how to build a complete Agent to Agent (A2A) system using the concepts from the [A2A Overview](overview). We'll create a banking customer service system with a main customer service agent and a loan specialist. ## Implementation Overview We'll build a system with: - **Customer Service Agent**: Voice-enabled interface agent that users interact with - **Loan Specialist Agent**: Text-based domain expert for loan-related queries - **Intelligent Routing**: Automatic detection and forwarding of loan queries - **Seamless Communication**: Users get expert responses without knowing about the routing ## Structure of the project ```js A2A ├── agents/ │ ├── customer_agent.py # CustomerServiceAgent definition │ ├── loan_agent.py # LoanAgent definition │ ├── session_manager.py # Handles session creation, pipeline setup, meeting join/leave └── main.py # Entry point: runs main() and starts agents ``` ## Sequence Diagram ![A2A Architecture](https://cdn.videosdk.live/website-resources/docs-resources/a2a_sequence_diagram.png) ## Step 1: Create the Customer Service Agent - **`Interface Agent`**: Creates `CustomerServiceAgent` as the main user-facing agent with voice capabilities and customer service instructions. - **`Function Tool`**: Implements`@function_tool forward_to_specialist()`that uses A2A discovery to find and route queries to domain specialists. - **`Response Relay`**: Includes `handle_specialist_response()` method that automatically receives and relays specialist responses back to users. ```python title="agents/customer_agent.py" from videosdk.agents import Agent, AgentCard, A2AMessage, function_tool import asyncio from typing import Dict, Any class CustomerServiceAgent(Agent): def __init__(self): super().__init__( agent_id="customer_service_1", instructions=( "You are a helpful bank customer service agent. " "For general banking queries (account balances, transactions, basic services), answer directly. " "For ANY loan-related queries, questions, or follow-ups, ALWAYS use the forward_to_specialist function " "with domain set to 'loan'. This includes initial loan questions AND all follow-up questions about loans. " "Do NOT attempt to answer loan questions yourself - always forward them to the specialist. " "After forwarding a loan query, stay engaged and automatically relay any response you receive from the specialist. " "When you receive responses from specialists, immediately relay them naturally to the customer." 
) ) @function_tool async def forward_to_specialist(self, query: str, domain: str) -> Dict[str, Any]: """Forward queries to domain specialist agents using A2A discovery""" # Use A2A discovery to find specialists by domain specialists = self.a2a.registry.find_agents_by_domain(domain) id_of_target_agent = specialists[0] if specialists else None if not id_of_target_agent: return {"error": f"No specialist found for domain {domain}"} # Send A2A message to the specialist await self.a2a.send_message( to_agent=id_of_target_agent, message_type="specialist_query", content={"query": query} ) return { "status": "forwarded", "specialist": id_of_target_agent, "message": "Let me get that information for you from our loan specialist..." } async def handle_specialist_response(self, message: A2AMessage) -> None: """Handle responses from specialist agents and relay to user""" response = message.content.get("response") if response: # Brief pause for natural conversation flow await asyncio.sleep(0.5) # Try multiple methods to relay the response to the user prompt = f"The loan specialist has responded: {response}" methods_to_try = [ (self.session.pipeline.send_text_message, prompt), (self.session.pipeline.model.send_message, response), (self.session.say, response) ] for method, arg in methods_to_try: try: await method(arg) break except Exception as e: print(f"Error with {method.__name__}: {e}") async def on_enter(self): # Register this agent with the A2A system await self.register_a2a(AgentCard( id="customer_service_1", name="Customer Service Agent", domain="customer_service", capabilities=["query_handling", "specialist_coordination"], description="Handles customer queries and coordinates with specialists", metadata={"version": "1.0", "type": "interface"} )) # Greet the user await self.session.say("Hello! I am your customer service agent. How can I help you?") # Set up message listener for specialist responses self.a2a.on_message("specialist_response", self.handle_specialist_response) async def on_exit(self): print("Customer agent left the meeting") ``` ## Step 2: Create the Loan Specialist Agent - **`Specialist Agent Setup`**: Creates `LoanAgent` class with specialized loan expertise instructions and agent_id `"specialist_1"`. - **`Message Handlers`**: Implements` handle_specialist_query()` to process incoming queries and handle_model_response() to send responses back. - **`Registration`**: Registers with A2A system using domain "loan" so it can be `discovered` by other agents needing loan expertise. ```python title="agents/loan_agent.py" from videosdk.agents import Agent, AgentCard, A2AMessage class LoanAgent(Agent): def __init__(self): super().__init__( agent_id="specialist_1", instructions=( "You are a specialized loan expert at a bank. " "Provide detailed, helpful information about loans including interest rates, terms, and requirements. " "Give complete answers with specific details when possible. " "You can discuss personal loans, car loans, home loans, and business loans. " "Provide helpful guidance and next steps for loan applications. " "Be friendly and professional in your responses. " "Keep responses concise within 5-7 lines and easily understandable." 
) ) async def handle_specialist_query(self, message: A2AMessage): """Process incoming queries from customer service agent""" query = message.content.get("query") if query: # Send the query to our AI model for processing await self.session.pipeline.send_text_message(query) async def handle_model_response(self, message: A2AMessage): """Send processed responses back to requesting agent""" response = message.content.get("response") requesting_agent = message.to_agent if response and requesting_agent: # Send the specialist response back to the customer service agent await self.a2a.send_message( to_agent=requesting_agent, message_type="specialist_response", content={"response": response} ) async def on_enter(self): print("LoanAgent joined the system") # Register this agent with the A2A system await self.register_a2a(AgentCard( id="specialist_1", name="Loan Specialist Agent", domain="loan", capabilities=["loan_consultation", "loan_information", "interest_rates"], description="Handles loan queries with specialized expertise", metadata={"version": "1.0", "type": "specialist"} )) # Set up message listeners for different message types self.a2a.on_message("specialist_query", self.handle_specialist_query) self.a2a.on_message("model_response", self.handle_model_response) async def on_exit(self): print("LoanAgent left the system") ``` ## Step 3: Configure Session Management - **`Pipeline Creation`**: Sets up different pipeline configurations - customer agent gets audio-enabled Gemini for voice interaction, specialist gets text-only for efficiency. - **`Session Factory`**: Provides `create_pipeline()` and `create_session()` functions to properly configure agent sessions based on their roles. - **`Modality Separation`**: Ensures customer agent can handle voice while specialist processes text in background. ```python title="session_manager.py" from videosdk.agents import AgentSession, RealTimePipeline from videosdk.plugins.google import GeminiRealtime, GeminiLiveConfig from typing import Dict def create_pipeline(agent_type: str) -> RealTimePipeline: """Create appropriate pipeline based on agent type""" if agent_type == "customer": # Customer agent: Audio-enabled for real-time voice interaction model = GeminiRealtime( model="gemini-2.0-flash-live-001", config=GeminiLiveConfig( voice="Leda", # Available voices: Puck, Charon, Kore, Fenrir, Aoede, Leda, Orus, Zephyr response_modalities=["AUDIO"] ) ) else: # Specialist agent: Text-only for efficient processing model = GeminiRealtime( model="gemini-2.0-flash-live-001", config=GeminiLiveConfig(response_modalities=["TEXT"]) ) return RealTimePipeline(model=model) def create_session(agent, pipeline, context: Dict) -> AgentSession: """Create agent session with given configuration""" return AgentSession(agent=agent, pipeline=pipeline, context=context) ``` :::info You can use OpenAI's text and audio features, Gemini's text and audio features, or a combination of both. ::: ## Step 4: Deploy A2A System on VideoSDK Platform - **`Meeting Setup`**: Customer agent joins VideoSDK meeting for user interaction while specialist runs in background mode. Requires environment variables: `VIDEOSDK_AUTH_TOKEN` and `GOOGLE_API_KEY`, plus meeting ID configuration in session context. 
- **`System Orchestration`**: Initializes both agents, creates their respective pipelines, and starts their sessions with proper meeting configurations - **`Resource Management`**: Handles startup sequence, keeps system running, and provides clean shutdown with proper A2A unregistration ```python title="main.py" import asyncio from agents.customer_agent import CustomerServiceAgent from agents.loan_agent import LoanAgent from session_manager import create_pipeline, create_session async def main(): # Initialize both customer and specialist agents customer_agent = CustomerServiceAgent() specialist_agent = LoanAgent() # Create pipelines for different agent types customer_pipeline = create_pipeline("customer") specialist_pipeline = create_pipeline("specialist") # Create sessions with appropriate configurations customer_session = create_session(customer_agent, customer_pipeline, { "name": "Customer Service Assistant", "meetingId": "YOUR_MEETING_ID", # Replace with your meeting ID "join_meeting": True # Customer agent joins the meeting for user interaction }) specialist_session = create_session(specialist_agent, specialist_pipeline, { "join_meeting": False # Specialist agent runs in background }) try: # Start both agent sessions await customer_session.start() await specialist_session.start() # Keep the system running until manually terminated await asyncio.Event().wait() except KeyboardInterrupt: print("\n Shutting down A2A system...") finally: # Clean up resources await customer_session.close() await specialist_session.close() await customer_agent.unregister_a2a() await specialist_agent.unregister_a2a() print(" System shutdown complete") if __name__ == "__main__": asyncio.run(main()) ``` #### Running the Application ```bash cd A2A python main.py ``` :::tip Quick Start Get the complete working example at [A2A Quick Start Repository](https://github.com/videosdk-live/agents-quickstart/tree/main/A2A) with all the code ready to run. ::: --- --- title: Agent to Agent (A2A) hide_title: false hide_table_of_contents: false description: "Understanding the core concepts of Agent to Agent (A2A) communication in VideoSDK AI Agents - AgentCard, A2AMessage, agent registration, and discovery mechanisms for building collaborative multi-agent systems." pagination_label: "A2A Overview" keywords: - A2A Overview - A2A Protocol - Agent To Agent - AI Agent - Google's A2A - AgentCard - A2AMessage - Agent Registration - Agent Discovery - Multi-Agent Communication - VideoSDK Agents - AI Agent SDK - Agent Collaboration image: img/videosdklive-thumbnail.jpg sidebar_position: 5 sidebar_label: Overview slug: overview --- # Agent to Agent (A2A) The Agent to Agent (A2A) protocol enables seamless collaboration between specialized AI agents, allowing them to communicate, share knowledge, and coordinate responses based on their unique capabilities and domain expertise. With VideoSDK's A2A implementation, you can create multi-agent systems where different agents work together to provide comprehensive solutions. ## How It Works ### Basic Flow 1. **Agent Registration**: Agents register themselves with an `AgentCard` that contains their capabilities and domain expertise 2. **Client Query**: Client sends a query to the main agent 3. **Agent Discovery**: Main agent discovers relevant specialist agents using agent cards 4. **Query Forwarding**: Main agent forwards specialized queries to appropriate agents 5. **Response Chain**: Specialist agents process queries and respond back to the main agent 6. 
**Client Response**: Main agent formats and delivers the final response to the client ![A2A Architecture](https://cdn.videosdk.live/website-resources/docs-resources/a2a_diagram.png) ### Example Scenario ``` Client → "Book a flight to New York and find a hotel" ↓ Travel Agent (Main) → Analyzes query ↓ Travel Agent → Discovers Flight Booking Agent & Hotel Booking Agent ↓ Travel Agent → Forwards flight query to Flight Booking Agent Travel Agent → Forwards hotel query to Hotel Booking Agent ↓ Specialist Agents → Process queries and respond back (text format) ↓ Travel Agent → Combines responses and sends to client (audio format) ``` # Core Components ## 1. AgentCard The `AgentCard` is how agents identify themselves and advertise their capabilities to other agents. #### Structure ```python AgentCard( id="agent_flight_001", name="Skymate", domain="flight", capabilities=[ "search_flights", "modify_bookings", "show_flight_status" ], description="Handles all flight-related tasks" ) ``` #### Parameters | Parameter | Type | Required | Description | | -------------- | ------ | -------- | ------------------------------------ | | `id` | string | Yes | Unique identifier for the agent | | `name` | string | Yes | Human-readable agent name | | `domain` | string | Yes | Primary expertise domain | | `capabilities` | list | Yes | List of specific capabilities | | `description` | string | Yes | Brief description of agent's purpose | | `metadata` | dict | No | Additional metadata for the agent | ## 2. A2AMessage `A2AMessage` is the standardized communication format between agents. #### Structure ```python message = A2AMessage( from_agent="travel_agent_1", to_agent="agent_flight_001", type="flight_status_query", content={"query": "What's the status of AI202?"}, metadata={"client_id": "xyz123", "urgency": "medium"} ) ``` #### Parameters | Parameter | Type | Required | Description | | ------------ | ------ | -------- | --------------------------- | | `from_agent` | string | Yes | ID of the sending agent | | `to_agent` | string | Yes | ID of the receiving agent | | `type` | string | Yes | Message type/event name | | `content` | dict | Yes | Message payload | | `metadata` | dict | No | Additional message metadata | ## 3. Agent Registry #### `register_a2a(agent_card)` Register an agent with the A2A system. ```python async def on_enter(self): await self.register_a2a(AgentCard( id="agent_flight_001", name="Skymate", domain="flight", capabilities=[ "search_flights", "modify_bookings", "show_flight_status" ], description="Handles all flight-related tasks" )) ``` **What Registration Does:** - Adds the agent to the global `AgentRegistry` singleton - Makes the agent discoverable by other agents - Stores both the `AgentCard` and agent instance - Enables message routing to this agent #### `unregister()` Unregister an agent from the A2A system. ```python await self.unregister_a2a() ``` ## 4. A2AProtocol Class The main class for managing agent-to-agent communication. ### Agent Discovery #### `find_agents_by_domain(domain: str)` Discover agents based on their domain expertise. ```python agents = self.a2a.registry.find_agents_by_domain("hotel") # Returns: ["agent_hotel_001"] ``` #### `find_agents_by_capability(cap: str)` Find agents with specific skills. ```python agents = await self.a2a.registry.find_agents_by_capability("modify_bookings") # Returns: ["agent_flight_001"] ``` --- ### Agent Communications #### `send_message(to_agent, message_type, content, metadata=None)` Send messages directly to other agents. 
```python await self.a2a.send_message( to_agent="agent_hotel_001", message_type="hotel_booking_query", # Event name that the receiving agent listens for content={"query": "Find 3-star hotels in Delhi under $100"}, metadata={"client_id": "xyz123"} # Optional metadata ) ``` **Parameters:** - `to_agent` (string): Target agent ID - `message_type` (string): Event name the receiving agent listens for - `content` (dict): Message payload - `metadata` (dict, optional): Additional message metadata #### `on_message(message_type, handler)` Register message handlers for incoming messages. ```python # Register a handler for specialist queries self.a2a.on_message("hotel_booking_query", self.handle_specialist_query) async def handle_specialist_query(self, message): # Process the incoming message query = message.content.get("query") # ... process query ... # Return response return {"response": "Current mortgage rates are 6.5%"} ``` ## Next Steps Now that you're familiar with the core A2A concepts, it's time to move from theory to practice: 👉 **[Explore the Full A2A Implementation](implementation)** Dive into a complete, working example that demonstrates agent discovery, messaging, and collaboration in action. --- --- title: Playground hide_title: false hide_table_of_contents: false description: "Test and interact with your VideoSDK AI agents in real-time using Playground mode. Learn how to enable the interactive testing environment for rapid development and debugging of voice AI agents." pagination_label: "Playground" keywords: - AI Agent SDK - VideoSDK Agents - Playground - Testing - Python SDK - Voice AI - Real-time Communication - AI Integration - VideoSDK Cloud - Development - Debugging image: img/videosdklive-thumbnail.jpg sidebar_position: 1 sidebar_label: Playground slug: playground --- # Agents Playground The Agents Playground provides an interactive testing environment where you can directly communicate with your AI agents during development. This feature enables rapid prototyping, testing, and debugging of your voice AI implementations without needing a separate client application. ## Overview Playground mode creates a web-based interface that connects directly to your agent session, allowing you to: - Test agent in real-time - Demonstrate agent capabilities to stakeholders ## Enabling Playground Mode To activate playground mode, simply set `playground: True` in your agent session context: ### Basic Implementation ```python from videosdk.agents import AgentSession def make_context(): return { "meetingId": "", "name": "", "playground": True } # Initialize the agent session with all components session = AgentSession( agent=VoiceAgent(), pipeline=pipeline, context=make_context ) # Start the agent session session.start() ``` ## Accessing the Playground Once your agent session starts, the playground URL will be displayed in your terminal: ``` Agent started in playground mode Interact with agent here at: https://playground.videosdk.live?token={auth_token}&meetingId={meeting_id} ``` ### URL Structure The playground URL follows this format: ``` https://playground.videosdk.live?token={auth_token}&meetingId={meeting_id} ``` Where: - `auth_token`: videosdk_auth that is provided in session context or in env file. - `meeting_id`: The meeting ID specified in session context. **Note**: Playground mode is designed for development and testing purposes. For production deployments, ensure playground mode is disabled to maintain security and performance. 
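Putting the pieces above together, here is a minimal runnable sketch of a playground session. It assumes a basic `VoiceAgent` and a Gemini realtime pipeline like the ones built in the AI Voice Agent Quick Start, with credentials (`VIDEOSDK_AUTH_TOKEN`, `GOOGLE_API_KEY`) supplied via a `.env` file; adjust the agent, model, and meeting ID to match your setup. The file name is illustrative.

```python title="playground_example.py"
import asyncio

from videosdk.agents import Agent, AgentSession, RealTimePipeline
from videosdk.plugins.google import GeminiRealtime, GeminiLiveConfig

class VoiceAgent(Agent):
    def __init__(self):
        super().__init__(instructions="You are a helpful voice assistant.")

    async def on_enter(self) -> None:
        await self.session.say("Hi there! How can I help you today?")

async def run_playground():
    model = GeminiRealtime(
        model="gemini-2.0-flash-live-001",
        config=GeminiLiveConfig(voice="Leda", response_modalities=["AUDIO"]),
    )
    pipeline = RealTimePipeline(model=model)

    session = AgentSession(
        agent=VoiceAgent(),
        pipeline=pipeline,
        context={
            "meetingId": "your_meeting_id_here",  # Replace with your meeting ID
            "name": "Playground Agent",
            "playground": True,  # Enables playground mode
        },
    )
    try:
        await session.start()  # The playground URL is printed to the terminal here
        await asyncio.Event().wait()
    finally:
        await session.close()

if __name__ == "__main__":
    asyncio.run(run_playground())
```

Open the printed URL in your browser to start talking to the agent.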
---

---
title: AI Voice Agent Quick Start
hide_title: false
hide_table_of_contents: false
description: "A step-by-step guide to quickly integrate an AI-powered voice agent into your VideoSDK meetings using the AI Agent SDK. Covers prerequisites, installation, custom agent creation, function tools, pipeline setup, and session management."
pagination_label: "AI Voice Agent Quick Start"
keywords:
  - AI Voice Agent
  - Quick Start
  - VideoSDK Agents
  - AI Agent SDK
  - Python
  - OpenAI
  - Gemini
  - Live API
  - Speech To Speech
  - Amazon Nova Sonic
  - AWS Nova Sonic
  - Function Tools
  - Real-time AI
  - Voice Integration
  - VideoSDK Meetings
image: img/videosdklive-thumbnail.jpg
sidebar_position: 2
sidebar_label: AI Voice Agent Quick Start
slug: voice-agent-quick-start
---

import Step from '@site/src/components/Step'

# AI Voice Agent Quick Start

This guide will help you integrate an AI-powered voice agent into your VideoSDK meetings.

## Prerequisites

Before you begin, ensure you have:

- A VideoSDK authentication token (generate from [app.videosdk.live](https://app.videosdk.live))
- A VideoSDK meeting ID (you can generate one using the [Create Room API](https://docs.videosdk.live/api-reference/realtime-communication/create-room) or through the VideoSDK dashboard)
- Python 3.12 or higher
- API Key: An API key corresponding to your chosen model provider:
  - OpenAI API key (for OpenAI models)
  - Google Gemini API key (for Gemini's Live API)
  - AWS credentials (aws_access_key_id and aws_secret_access_key) for Amazon Nova Sonic

## Installation

Create and activate a virtual environment with Python 3.12 or higher:

```bash
python3.12 -m venv venv
source venv/bin/activate
```

```bash
python -m venv venv
venv\Scripts\activate
```

First, install the VideoSDK AI Agent package using pip:

```bash
pip install videosdk-agents
```

Now it's time to install the plugin for your chosen AI model. Each plugin is tailored for seamless integration with the VideoSDK AI Agent SDK.

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

```bash
pip install "videosdk-plugins-openai"
```

```bash
pip install "videosdk-plugins-google"
```

```bash
pip install "videosdk-plugins-aws"
```

## Environment Setup

It's recommended to use environment variables for secure storage of API keys, secret tokens, and authentication tokens. Create a `.env` file in your project root:

```bash title=".env"
VIDEOSDK_AUTH_TOKEN=your_videosdk_auth_token
OPENAI_API_KEY=your_openai_api_key
```

```bash title=".env"
VIDEOSDK_AUTH_TOKEN=your_videosdk_auth_token
GOOGLE_API_KEY=your_google_api_key
```

```bash title=".env"
VIDEOSDK_AUTH_TOKEN=your_videosdk_auth_token
AWS_ACCESS_KEY_ID=your_aws_access_key
AWS_SECRET_ACCESS_KEY=your_aws_secret_key
AWS_DEFAULT_REGION=your_aws_region
```

## Generating a VideoSDK Meeting ID

Before your AI agent can join a meeting, you'll need to create a meeting ID. You can generate one using the VideoSDK Create Room API:

### Using cURL

```bash
curl -X POST https://api.videosdk.live/v2/rooms \
  -H "Authorization: YOUR_JWT_TOKEN_HERE" \
  -H "Content-Type: application/json"
```

For more details on the Create Room API, refer to the [VideoSDK documentation](https://docs.videosdk.live/api-reference/realtime-communication/create-room).
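If you prefer to create the room from Python rather than cURL, a minimal sketch using the `requests` library is shown below. It assumes your token is available as the `VIDEOSDK_AUTH_TOKEN` environment variable and that the response JSON contains a `roomId` field, as described in the Create Room API reference; the file name is illustrative.

```python title="create_room.py"
import os

import requests

def create_meeting_id() -> str:
    """Create a VideoSDK room and return its roomId."""
    response = requests.post(
        "https://api.videosdk.live/v2/rooms",
        headers={
            "Authorization": os.environ["VIDEOSDK_AUTH_TOKEN"],
            "Content-Type": "application/json",
        },
    )
    response.raise_for_status()
    return response.json()["roomId"]

if __name__ == "__main__":
    print("Meeting ID:", create_meeting_id())
```

Use the printed ID as the `meetingId` in the session context you build in the steps below.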
### Step 1: Creating a Custom Agent
First, let's create a custom voice agent by inheriting from the base `Agent` class: ```python title="main.py" from videosdk.agents import Agent, function_tool # External Tool # async def get_weather(self, latitude: str, longitude: str): class VoiceAgent(Agent): def __init__(self): super().__init__( instructions="You are a helpful voice assistant that can answer questions and help with tasks.", tools=[get_weather] # You can register any external tool defined outside of this scope ) async def on_enter(self) -> None: """Called when the agent first joins the meeting""" await self.session.say("Hi there! How can I help you today?") ``` This code defines a basic voice agent with: - Custom instructions that define the agent's personality and capabilities - An entry message when joining a meeting - State change handling to track the agent's current activity
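The snippet above only overrides `on_enter`. If you also want to react when the agent leaves the meeting, you can override `on_exit` in the same way — a small sketch that mirrors the lifecycle hooks used in the A2A and MCP examples elsewhere in these docs (the log message is illustrative):

```python
from videosdk.agents import Agent

class VoiceAgent(Agent):
    def __init__(self):
        super().__init__(
            instructions="You are a helpful voice assistant that can answer questions and help with tasks.",
        )

    async def on_enter(self) -> None:
        """Called when the agent first joins the meeting"""
        await self.session.say("Hi there! How can I help you today?")

    async def on_exit(self) -> None:
        """Called when the agent leaves the meeting"""
        print("VoiceAgent left the meeting")
```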
### Step 2: Implementing Function Tools
Function tools allow your agent to perform actions beyond conversation. There are two ways to define tools:

- **External Tools:** Defined as standalone functions outside the agent class and registered via the `tools` argument in the agent's constructor.
- **Internal Tools:** Defined as methods inside the agent class and decorated with `@function_tool`.

Below is an example of both:

```python title="main.py"
import aiohttp

# External Function Tools
@function_tool
async def get_weather(latitude: str, longitude: str):
    """Called when the user asks about the weather. This function will return the weather for the given location.
    When given a location, please estimate the latitude and longitude of the location and do not ask the user for them.

    Args:
        latitude: The latitude of the location
        longitude: The longitude of the location
    """
    print(f"Getting weather for {latitude}, {longitude}")
    url = f"https://api.open-meteo.com/v1/forecast?latitude={latitude}&longitude={longitude}&current=temperature_2m"
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            if response.status == 200:
                data = await response.json()
                return {
                    "temperature": data["current"]["temperature_2m"],
                    "temperature_unit": "Celsius",
                }
            else:
                raise Exception(
                    f"Failed to get weather data, status code: {response.status}"
                )

class VoiceAgent(Agent):
    # ... previous code ...

    # Internal Function Tools
    @function_tool
    async def get_horoscope(self, sign: str) -> dict:
        """Get today's horoscope for a given zodiac sign.

        Args:
            sign: The zodiac sign (e.g., Aries, Taurus, Gemini, etc.)
        """
        horoscopes = {
            "Aries": "Today is your lucky day!",
            "Taurus": "Focus on your goals today.",
            "Gemini": "Communication will be important today.",
        }
        return {
            "sign": sign,
            "horoscope": horoscopes.get(sign, "The stars are aligned for you today!"),
        }
```

- Use external tools for reusable, standalone functions (registered via `tools=[...]`).
- Use internal tools for agent-specific logic as class methods.
- Both must be decorated with `@function_tool` for the agent to recognize and use them.
### Step 3: Setting Up the Pipeline
The pipeline connects your agent to an AI model. In this example, we're using OpenAI's real-time model: ```python title="main.py" from videosdk.plugins.openai import OpenAIRealtime, OpenAIRealtimeConfig from videosdk.agents import RealTimePipeline from openai.types.beta.realtime.session import TurnDetection async def start_session(context: dict): # Initialize the AI model model = OpenAIRealtime( model="gpt-4o-realtime-preview", # When OPENAI_API_KEY is set in .env - DON'T pass api_key parameter api_key="sk-proj-XXXXXXXXXXXXXXXXXXXX", config=OpenAIRealtimeConfig( voice="alloy", # alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, and verse modalities=["text", "audio"], turn_detection=TurnDetection( type="server_vad", threshold=0.5, prefix_padding_ms=300, silence_duration_ms=200, ), tool_choice="auto" ) ) pipeline = RealTimePipeline(model=model) # Continue to the next steps... ``` ```python title="main.py" from videosdk.plugins.google import GeminiRealtime, GeminiLiveConfig from videosdk.agents import RealTimePipeline async def start_session(context: dict): # Initialize the AI model model = GeminiRealtime( model="gemini-2.0-flash-live-001", # When GOOGLE_API_KEY is set in .env - DON'T pass api_key parameter api_key="AKZSXXXXXXXXXXXXXXXXXXXX", config=GeminiLiveConfig( voice="Leda", # Puck, Charon, Kore, Fenrir, Aoede, Leda, Orus, and Zephyr. response_modalities=["AUDIO"] ) ) pipeline = RealTimePipeline(model=model) # Continue to the next steps... ``` ```python title="main.py" from videosdk.plugins.aws import NovaSonicRealtime, NovaSonicConfig from videosdk.agents import RealTimePipeline async def start_session(context: dict): # Initialize the AI model model = NovaSonicRealtime( model="amazon.nova-sonic-v1:0", # When AWS credentials and region are set in .env - DON'T pass credential parameters region="us-east-1", # Currently, only "us-east-1" is supported for Amazon Nova Sonic. aws_access_key_id="AWSXXXXXXXXXXXXXXXXXXXX", aws_secret_access_key="AQSXXXXXXXXXXXXXXXXXXXX", config=NovaSonicConfig( voice="tiffany", # "tiffany","matthew", "amy" temperature=0.7, top_p=0.9, max_tokens=1024 ) ) pipeline = RealTimePipeline(model=model) # Continue to the next steps... ``` :::note To initiate a conversation with Amazon Nova Sonic, the user must speak first. The model listens for user input to begin the interaction. ::: :::note When using .env file for credentials, don't pass them as arguments to model instances or context objects. The SDK automatically reads environment variables, so omit api_key, videosdk_auth, and other credential parameters from your code. :::
### Step 4: Assembling and Starting the Agent Session
Now, let's put everything together and start the agent session: ```python title="main.py" import asyncio from videosdk.agents import AgentSession async def start_session(context: dict): # ... previous setup code ... # Create the agent session session = AgentSession( agent=VoiceAgent(), pipeline=pipeline, context=context ) try: # Start the session await session.start() # Keep the session running until manually terminated await asyncio.Event().wait() finally: # Clean up resources when done await session.close() if __name__ == "__main__": def make_context(): # When VIDEOSDK_AUTH_TOKEN is set in .env - DON'T include videosdk_auth return { "meetingId": "your_actual_meeting_id_here", # Replace with actual meeting ID "name": "AI Voice Agent", "videosdk_auth": "your_videosdk_auth_token_here" # Replace with actual token } asyncio.run(start_session(context=make_context())) ```
### Step 5: Connecting with VideoSDK Client Applications
After setting up your AI Agent, you'll need a client application to connect with it. You can use any of the VideoSDK quickstart examples to create a client that joins the same meeting: - [JavaScript](https://github.com/videosdk-live/quickstart/tree/main/js-rtc) - [React](https://github.com/videosdk-live/quickstart/tree/main/react-rtc) - [React Native](https://github.com/videosdk-live/quickstart/tree/main/react-native) - [Android](https://github.com/videosdk-live/quickstart/tree/main/android-rtc) - [Flutter](https://github.com/videosdk-live/quickstart/tree/main/flutter-rtc) - [iOS](https://github.com/videosdk-live/quickstart/tree/main/ios-rtc) When setting up your client application, make sure to use the same meeting ID that your AI Agent is using.
### Step 6: Running the Project
Once you have completed the setup, you can run your AI Voice Agent project using Python. Make sure your `.env` file is properly configured and all dependencies are installed.

```bash
python main.py
```

:::tip
Get started quickly with the [Quick Start Example](https://github.com/videosdk-live/agents-quickstart/) for the VideoSDK AI Agent SDK — everything you need to build your first AI agent fast.
:::

---

---
title: Agent Session
hide_title: false
hide_table_of_contents: false
description: "Discover how the `AgentSession` in VideoSDK's AI Agent SDK orchestrates various components into a unified workflow, managing the agent's interaction lifecycle and context for seamless real-time communication."
pagination_label: "Agent Session"
keywords:
  - AgentSession
  - AI Agent SDK
  - VideoSDK Agents
  - Component Orchestration
  - Session Management
  - Context Handling
  - Agent Workflow
  - Real-time AI
  - Python SDK
image: img/videosdklive-thumbnail.jpg
sidebar_position: 5
sidebar_label: Agent Session
slug: agent-session
---

# Agent Session

The `AgentSession` integrates all the components into a cohesive workflow, creating a complete agent interaction system.

#### Key Features:

- Component orchestration
- Session lifecycle management
- Context handling

#### Example Implementation:

```python
from videosdk.agents import AgentSession

def make_context():
    return {"meetingId": "", "name": ""}

# Initialize the agent session with all components
session = AgentSession(
    agent=VoiceAgent(),
    pipeline=pipeline,
    context=make_context
)

# Start the agent session
session.start()
```

---

---
title: Agent
hide_title: false
hide_table_of_contents: false
description: "Learn about the `Agent` base class in the VideoSDK AI Agent SDK. Understand how to create custom agents, define system prompts, manage state, and register function tools."
pagination_label: "Agent"
keywords:
  - Agent Class
  - AI Agent SDK
  - VideoSDK Agents
  - Custom Agents
  - System Prompts
  - State Management
  - Function Tools
  - Python SDK
image: img/videosdklive-thumbnail.jpg
sidebar_position: 2
sidebar_label: Agent
slug: agent
---

# Agent

The `Agent` base class serves as the foundation for creating custom agents. By inheriting from this class, you can define system prompts and register function tools.

### Key Features:

- Custom instruction configuration
- State management lifecycle hooks
- Function tool registration (internal and external)

#### Example Implementation:

You can register function tools in two ways:

- **External Tools:** Defined as standalone functions outside the agent class and registered via the `tools` argument in the agent's constructor.
- **Internal Tools:** Defined as methods inside the agent class and decorated with `@function_tool`.

```python
import aiohttp

from videosdk.agents import Agent, function_tool

# --- External Function Tool ---
@function_tool
async def get_weather(latitude: str, longitude: str):
    """Called when the user asks about the weather. This function will return the weather for the given location.
    When given a location, please estimate the latitude and longitude of the location and do not ask the user for them.

    Args:
        latitude: The latitude of the location
        longitude: The longitude of the location
    """
    print(f"Getting weather for {latitude}, {longitude}")
    url = f"https://api.open-meteo.com/v1/forecast?latitude={latitude}&longitude={longitude}&current=temperature_2m"
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            if response.status == 200:
                data = await response.json()
                return {
                    "temperature": data["current"]["temperature_2m"],
                    "temperature_unit": "Celsius",
                }
            else:
                raise Exception(
                    f"Failed to get weather data, status code: {response.status}"
                )

class VoiceAgent(Agent):
    def __init__(self):
        super().__init__(
            instructions="You are a helpful voice assistant that can answer questions and help with tasks.",
            tools=[get_weather]  # Register external tools here
        )

    # --- Internal Function Tool ---
    @function_tool
    async def get_horoscope(self, sign: str) -> dict:
        """Get today's horoscope for a given zodiac sign.

        Args:
            sign: The zodiac sign (e.g., Aries, Taurus, Gemini, etc.)

        Returns:
            dict: Contains the sign and horoscope text
        """
        horoscopes = {
            "Aries": "Today is your lucky day!",
            "Taurus": "Focus on your goals today.",
            "Gemini": "Communication will be important today.",
        }
        return {
            "sign": sign,
            "horoscope": horoscopes.get(sign, "The stars are aligned for you today!"),
        }
```

---

---
title: Cascading Pipeline
hide_title: false
hide_table_of_contents: false
description: "Explore the `Cascading Pipeline` component in the VideoSDK AI Agent SDK. Learn how it manages AI models (like OpenAI and Gemini), configurations, streaming audio, and multi-modal capabilities."
pagination_label: "Cascading Pipeline"
keywords:
  - Pipeline Component
  - AI Agent SDK
  - VideoSDK Agents
  - AI Models
  - OpenAI
  - Gemini
  - Model Configuration
  - Streaming Audio
  - Multi-modal AI
  - Python SDK
image: img/videosdklive-thumbnail.jpg
sidebar_position: 4
sidebar_label: Cascading Pipeline
slug: cascading-pipeline
---

# Cascading Pipeline

The `Cascading Pipeline` component provides a flexible, modular approach to building AI agents by allowing you to mix and match different components for Speech-to-Text (STT), Large Language Models (LLM), Text-to-Speech (TTS), Voice Activity Detection (VAD), and Turn Detection.
#### Key Features:

- **Modular Component Selection** - Choose different providers for each component
- **Flexible Configuration** - Mix and match STT, LLM, TTS, VAD, and Turn Detection
- **Custom Processing** - Add custom processing for STT and LLM outputs
- **Provider Agnostic** - Support for multiple AI service providers
- **Advanced Control** - Fine-tune each component independently

#### Example Implementation:

```python
import os

from videosdk.agents import CascadingPipeline
from videosdk.plugins.openai import OpenAILLM
from videosdk.plugins.deepgram import DeepgramSTT
from videosdk.plugins.elevenlabs import ElevenLabsTTS  # assumed import path for the ElevenLabs TTS plugin
from videosdk.plugins.silero import SileroVAD
from videosdk.plugins.turn_detector import TurnDetector

stt = DeepgramSTT(
    api_key=os.getenv("DEEPGRAM_API_KEY"),
    model="nova-2",
    language="en"
)
llm = OpenAILLM(
    api_key=os.getenv("OPENAI_API_KEY"),
    model="gpt-4o"
)
tts = ElevenLabsTTS(
    api_key=os.getenv("ELEVENLABS_API_KEY"),
    voice_id="your-voice-id"
)
vad = SileroVAD(threshold=0.35)
turn_detector = TurnDetector(threshold=0.8)

pipeline = CascadingPipeline(stt=stt, llm=llm, tts=tts, vad=vad, turn_detector=turn_detector)
```

#### Use Cases:

- **Multi-language Support** - Use specialized STT for different languages
- **Cost Optimization** - Mix premium and cost-effective services
- **Custom Voice Processing** - Add domain-specific processing logic
- **Performance Optimization** - Choose fastest providers for each component
- **Compliance Requirements** - Use specific providers for regulatory compliance

---

---
title: Conversation Flow
hide_title: false
hide_table_of_contents: false
description: "Explore the `Conversation Flow` component in the VideoSDK AI Agent SDK. Learn how it manages turn-taking in your agents."
pagination_label: "Conversation Flow"
keywords:
  - Conversation Flow
  - AI Agent SDK
  - VideoSDK Agents
  - AI Models
  - OpenAI
  - Gemini
  - Model Configuration
  - Streaming Audio
  - Multi-modal AI
  - Python SDK
image: img/videosdklive-thumbnail.jpg
sidebar_position: 4
sidebar_label: Conversation Flow
slug: conversation-flow
---

# Conversation Flow

The `Conversation Flow` component is an inheritable class that enables custom turn-taking logic, transcript preprocessing, and memory/RAG integrations before LLM processing.
#### Key Features:

- Custom transcript preprocessing
- Turn-based conversation management
- Memory and RAG integration hooks
- Flexible LLM processing pipeline
- Streaming response support

#### Example Implementation:

```python
from typing import AsyncIterator

from videosdk.agents import Agent, ChatContext, ChatRole, ConversationFlow

class MyConversationFlow(ConversationFlow):
    def __init__(self, agent, stt=None, llm=None, tts=None):
        super().__init__(agent, stt, llm, tts)

    async def run(self, transcript: str) -> AsyncIterator[str]:
        """Main conversation loop: handle a user turn."""
        await self.on_turn_start(transcript)

        # Pre-process transcript: clean, normalize, validate input
        # Apply custom business logic, content filtering
        # sample impl
        processed_transcript = transcript.lower().strip()
        self.agent.chat_context.add_message(role=ChatRole.USER, content=processed_transcript)

        # Use the process_with_llm method to invoke the LLM after preprocessing the transcript
        async for response_chunk in self.process_with_llm():
            yield response_chunk

        await self.on_turn_end()

    async def on_turn_start(self, transcript: str) -> None:
        """Called at the start of a user turn."""
        # Fetch relevant context from memory/database
        # Perform RAG retrieval for enhanced responses
        # Initialize turn-specific variables
        # Log conversation analytics
        self.is_turn_active = True

    async def on_turn_end(self) -> None:
        """Called at the end of a user turn."""
        # Store conversation in memory/database
        # Update user preferences and context
        # Perform cleanup operations
        # Log turn completion metrics
        self.is_turn_active = False
```

#### Use Cases:

- **RAG Implementation** - Retrieve relevant documents and context before LLM processing
- **Memory Management** - Store and recall conversation history across sessions
- **Content Filtering** - Apply safety checks and content moderation on input/output
- **Analytics & Logging** - Track conversation metrics and user behavior patterns
- **Business Logic Integration** - Add domain-specific processing and validation rules
- **Multi-step Workflows** - Implement complex conversation flows with state management

The key methods allow you to inject custom logic at different stages of the conversation flow, enabling sophisticated AI agent behaviors while maintaining clean separation of concerns.

:::note
Conversation Flow currently works with the [Cascading Pipeline](cascading-pipeline.md) only.
:::

---

---
title: Overview
hide_title: false
hide_table_of_contents: false
description: "Get an overview of the VideoSDK AI Agent SDK, a framework for building AI agents for real-time conversations. Learn about its core components: Agent, Pipeline, and Agent Session."
pagination_label: "Overview"
keywords:
  - AI Agent SDK
  - VideoSDK Agents
  - Core Components
  - Agent
  - Pipeline
  - Agent Session
  - Real-time AI
  - Agentic Workflow
  - Python SDK
image: img/videosdklive-thumbnail.jpg
sidebar_position: 1
sidebar_label: Overview
slug: overview
---

The AI Agent SDK provides a powerful framework for building AI agents that can participate in real-time conversations. This guide explains the core components and demonstrates how to create a complete agentic workflow.

## Core Components

The SDK consists of three primary components:

1. **Agent** - Base class for defining agent behavior and capabilities
2. **Pipeline** - Manages model selection and configuration
3. **Agent Session** - Combines all components into a cohesive workflow

---

---
title: Realtime Pipeline
hide_title: false
hide_table_of_contents: false
description: "Explore the `Realtime Pipeline` component in the VideoSDK AI Agent SDK. Learn how it manages AI models (like OpenAI and Gemini), configurations, streaming audio, and multi-modal capabilities."
pagination_label: "Realtime Pipeline"
keywords:
  - Pipeline Component
  - AI Agent SDK
  - VideoSDK Agents
  - AI Models
  - OpenAI
  - Gemini
  - Model Configuration
  - Streaming Audio
  - Multi-modal AI
  - Python SDK
image: img/videosdklive-thumbnail.jpg
sidebar_position: 3
sidebar_label: Realtime Pipeline
slug: realtime-pipeline
---

# Realtime Pipeline

The `Realtime Pipeline` component manages the AI model and its configuration. It provides a standardized interface for different model providers.

#### Key Features:

- Flexible model selection
- Configurable model parameters
- Streaming audio support
- Multi-modal capabilities

#### Example Implementation:

```python
from videosdk.agents import RealTimePipeline
from openai.types.beta.realtime.session import TurnDetection
from videosdk.plugins.openai import OpenAIRealtime, OpenAIRealtimeConfig

# For Gemini
# from videosdk.plugins.google import GeminiRealtime, GeminiLiveConfig
# from google.genai.types import AudioTranscriptionConfig

# Option 1: Using OpenAI's real-time models
openai_model = OpenAIRealtime(
    model="gpt-4o-realtime-preview",
    api_key="your-openai-api-key"  # Or use environment variable
)

# Option 2: Using Google's models
# gemini_model = GeminiRealtime(
#     model="gemini-2.0-flash-live-001",
#     api_key="your-google-api-key"
# )

# Configure the pipeline with your selected model
model = OpenAIRealtime(
    model="gpt-4o-realtime-preview",
    config=OpenAIRealtimeConfig(
        modalities=["text", "audio"],
        turn_detection=TurnDetection(
            type="server_vad",
            threshold=0.5,
            prefix_padding_ms=300,
            silence_duration_ms=200,
        ),
        tool_choice="auto"
    )
)

# For Gemini
# model = GeminiRealtime(
#     model="gemini-2.0-flash-live-001",
#     config=GeminiLiveConfig(
#         response_modalities=["AUDIO"],
#     )
# )

# Create the pipeline with the model and configuration
pipeline = RealTimePipeline(model=model)
```

---

---
title: Introduction
hide_title: false
hide_table_of_contents: false
description: "Introduce yourself to the VideoSDK AI Agent SDK, a Python framework for integrating AI-powered voice agents into VideoSDK meetings. Understand its high-level architecture and how it bridges AI models with users for real-time interactions."
pagination_label: "Introduction"
keywords:
  - AI Agent SDK
  - VideoSDK Agents
  - Introduction
  - Python SDK
  - Voice AI
  - Real-time Communication
  - AI Integration
  - VideoSDK Cloud
image: img/videosdklive-thumbnail.jpg
sidebar_position: 1
sidebar_label: Introduction
slug: introduction
---

# Introduction

The AI Agent SDK is a Python framework built on top of the VideoSDK Python SDK that enables AI-powered agents to join VideoSDK rooms as participants. This SDK serves as a real-time bridge between AI models (like OpenAI and Gemini) and your users, facilitating seamless voice and media interactions.

## High Level Architecture

This architecture shows how AI voice agents connect to VideoSDK meetings. The system links your backend with VideoSDK's platform, allowing AI assistants to interact with users in real-time.

### System Components

- **Your Backend**: Hosts the Worker and Agent Job that powers the AI agents
- **VideoSDK Cloud**: Manages the meeting rooms where agents and users interact in real time
- **Client SDK**: Applications on user devices (web, mobile, or SIP) that connect to VideoSDK meetings

### Process Flow

1. **Register**: Your backend worker registers with the VideoSDK Cloud
2.
**Initiate to join Room**: The user initiates joining a VideoSDK Room via the Client SDK on their device 3. **Notify worker for Agent to join Room**: The VideoSDK Cloud notifies your backend worker to have an Agent join the room. 4. **Agent joins the room**: The Agent connects to the VideoSDK Room and can interact with the user. --- --- title: MCP Integration hide_title: false hide_table_of_contents: false description: "Learn how to integrate Model Context Protocol (MCP) servers with VideoSDK AI Agents to extend your agent's capabilities with external services, databases, and APIs using STDIO and HTTP transport methods." pagination_label: "MCP Integration" keywords: - MCP Integration - Model Context Protocol - MCP Client - MCP Servers - Multiple MCP Servers - MCP Server Client Example - VideoSDK Agents - AI Agent SDK - Python - MCP Tools - MCP Standard Input/Output (stdio) - MCP Streamable HTTP - MCP Server-Sent Events (SSE) - External APIs - Voice Agent Sessions - Real Time MCP image: img/videosdklive-thumbnail.jpg sidebar_position: 4 sidebar_label: MCP Integration slug: mcp-integration --- The Model Context Protocol (MCP) is an open standard that enables AI assistants to securely connect to data sources and tools. With VideoSDK's AI Agents, you can seamlessly integrate MCP servers to extend your agent's capabilities with external services or applications, databases, and APIs. ## MCP Server Types VideoSDK supports two transport methods for MCP servers: ### 1. STDIO Transport - Direct process communication - Local Python scripts - Best for custom tools and functions - Ideal for server-side integrations ### 2. HTTP Transport (Streamable HTTP or SSE) - Network-based communication - External MCP services - Best for third-party integrations - Supports remote MCP servers ## How It Works with VideoSDK's AI Agent MCP tools are automatically discovered and made available to your agent. Agent will intelligently choose which tools to use based on user requests. When a user asks for information that requires external data, the agent will: - Identify the need for external data based on the user's request - Select appropriate tools from available MCP servers - Execute the tools with relevant parameters - Process the results and provide a natural language response This seamless integration allows your voice agent to access real-time data and external services while maintaining a natural conversational flow. ## Creating an MCP Server # Basic MCP Server Structure A simple MCP server using STDIO to return the current time. 
First, install the required package:

```bash
pip install fastmcp
```

```python title="mcp_stdio_example.py"
from mcp.server.fastmcp import FastMCP
import datetime

# Create the MCP server
mcp = FastMCP("CurrentTimeServer")

@mcp.tool()
def get_current_time() -> str:
    """Get the current time in the user's location"""
    # Get current time
    now = datetime.datetime.now()
    # Return formatted time string
    return f"The current time is {now.strftime('%H:%M:%S')} on {now.strftime('%Y-%m-%d')}"

if __name__ == "__main__":
    # Run the server with STDIO transport
    mcp.run(transport="stdio")
```

## Integrating MCP with VideoSDK Agent

Now we'll see how to integrate MCP servers with your VideoSDK AI Agent:

```python title="main.py"
import asyncio
import sys
from pathlib import Path

from videosdk.agents import Agent, AgentSession, RealTimePipeline, MCPServerStdio, MCPServerHTTP
from videosdk.plugins.google import GeminiRealtime, GeminiLiveConfig

class MyVoiceAgent(Agent):
    def __init__(self):
        # Define paths to your MCP servers
        mcp_script = Path(__file__).parent.parent / "MCP_Example" / "mcp_stdio_example.py"
        super().__init__(
            instructions="""You are a helpful assistant with access to real-time data.
            You can provide current time information.
            Always be conversational and helpful in your responses.""",
            mcp_servers=[
                # STDIO MCP Server (Local Python script for time)
                MCPServerStdio(
                    command=sys.executable,  # Use current Python interpreter
                    args=[str(mcp_script)],
                    client_session_timeout_seconds=30
                ),
                # HTTP MCP Server (External service example, e.g. Zapier)
                MCPServerHTTP(
                    url="https://your-mcp-service.com/api/mcp",
                    client_session_timeout_seconds=30
                )
            ]
        )

    async def on_enter(self) -> None:
        await self.session.say("Hi there! How can I help you today?")

    async def on_exit(self) -> None:
        await self.session.say("Thank you for using the assistant. Goodbye!")

async def main(context: dict):
    # Configure Gemini Realtime model
    model = GeminiRealtime(
        model="gemini-2.0-flash-live-001",
        config=GeminiLiveConfig(
            voice="Leda",  # Available voices: Puck, Charon, Kore, Fenrir, Aoede, Leda, Orus, Zephyr
            response_modalities=["AUDIO"]
        )
    )

    pipeline = RealTimePipeline(model=model)
    agent = MyVoiceAgent()

    session = AgentSession(
        agent=agent,
        pipeline=pipeline,
        context=context
    )

    try:
        # Start the session
        await session.start()
        # Keep the session running until manually terminated
        await asyncio.Event().wait()
    finally:
        # Clean up resources when done
        await session.close()

if __name__ == "__main__":
    def make_context():
        # When VIDEOSDK_AUTH_TOKEN is set in .env - DON'T include videosdk_auth
        return {
            "meetingId": "your_actual_meeting_id_here",  # Replace with actual meeting ID
            "name": "AI Voice Agent",
            "videosdk_auth": "your_videosdk_auth_token_here"  # Replace with actual token
        }

    asyncio.run(main(context=make_context()))
```

:::tip
Get started quickly with the [Quick Start Example](https://github.com/videosdk-live/agents-quickstart/tree/main/MCP%20Server) for the VideoSDK AI Agent SDK with MCP — everything you need to build your first AI agent fast.
:::

---

---
title: Google LLM
hide_title: false
hide_table_of_contents: false
description: "Learn how to use Google's LLM models (Gemini) with the VideoSDK AI Agent SDK. This guide covers model configuration, API integration, and implementing text-based AI capabilities for your conversational agents."
pagination_label: "Google LLM" keywords: - Google - Gemini - gemini-2.0-flash-001 - LLM - Large Language Model - VideoSDK Agents - Python SDK - Text Generation - AI Chat - Conversational AI image: img/videosdklive-thumbnail.jpg sidebar_position: 2 sidebar_label: Google slug: google-llm --- # Google LLM The Google LLM provider enables your agent to use Google's Gemini family of language models for text-based conversations and processing. ## Installation Install the Google-enabled VideoSDK Agents package: ```bash pip install "videosdk-plugins-google" ``` ## Importing ```python from videosdk.plugins.google import GoogleLLM ``` ## Example Usage ```python from videosdk.plugins.google import GoogleLLM from videosdk.agents import CascadingPipeline # Initialize the Google LLM model llm = GoogleLLM( model="gemini-2.0-flash-001", # When GOOGLE_API_KEY is set in .env - DON'T pass api_key parameter api_key="your-google-api-key", temperature=0.7, tool_choice="auto", max_output_tokens=1000 ) # Add llm to cascading pipeline pipeline = CascadingPipeline(llm=llm) ``` :::note When using .env file for credentials, don't pass them as arguments to model instances or context objects. The SDK automatically reads environment variables, so omit `api_key` and other credential parameters from your code. ::: ## Configuration Options - `model`: (str) The Google model to use (e.g., `"gemini-2.0-flash-001"`, `"gemini-1.5-pro"`) (default: `"gemini-2.0-flash-001"`). - `api_key`: (str) Your Google API key. Can also be set via the `GOOGLE_API_KEY` environment variable. - `temperature`: (float) Sampling temperature for response randomness (default: `0.7`). - `tool_choice`: (ToolChoice) Tool selection mode (`"auto"`, `"required"`, `"none"`) (default: `"auto"`). - `max_output_tokens`: (int) Maximum number of tokens in the completion response (optional). - `top_p`: (float) The nucleus sampling probability (optional). - `top_k`: (int) The top-k sampling parameter (optional). - `presence_penalty`: (float) Penalizes new tokens based on whether they appear in the text so far (optional). - `frequency_penalty`: (float) Penalizes new tokens based on their existing frequency in the text so far (optional). --- --- title: OpenAI LLM hide_title: false hide_table_of_contents: false description: "Learn how to use OpenAI's LLM models with the VideoSDK AI Agent SDK. This guide covers model configuration, API integration, and implementing text-based AI capabilities for your conversational agents." pagination_label: "OpenAI LLM" keywords: - OpenAI - GPT-4o - LLM - Large Language Model - VideoSDK Agents - Python SDK - Text Generation - AI Chat - Conversational AI image: img/videosdklive-thumbnail.jpg sidebar_position: 1 sidebar_label: OpenAI slug: openai --- # OpenAI LLM The OpenAI LLM provider enables your agent to use OpenAI's language models (like GPT-4o) for text-based conversations and processing. 
## Installation Install the OpenAI-enabled VideoSDK Agents package: ```bash pip install "videosdk-plugins-openai" ``` ## Importing ```python from videosdk.plugins.openai import OpenAILLM ``` ## Example Usage ```python from videosdk.plugins.openai import OpenAILLM from videosdk.agents import CascadingPipeline # Initialize the OpenAI LLM model llm = OpenAILLM( model="gpt-4o", # When OPENAI_API_KEY is set in .env - DON'T pass api_key parameter api_key="your-openai-api-key", temperature=0.7, tool_choice="auto", max_completion_tokens=1000 ) # Add llm to cascading pipeline pipeline = CascadingPipeline(llm=llm) ``` :::note When using .env file for credentials, don't pass them as arguments to model instances or context objects. The SDK automatically reads environment variables, so omit api_key, videosdk_auth, and other credential parameters from your code. ::: ## Configuration Options - `model`: The OpenAI model to use (e.g., `"gpt-4o"`, `"gpt-4o-mini"`, `"gpt-3.5-turbo"`) - `api_key`: Your OpenAI API key (can also be set via environment variable) - `base_url`: Custom base URL for OpenAI API (optional) - `temperature`: (float) Sampling temperature for response randomness (0.0 to 2.0, default: 0.7) - `tool_choice`: Tool selection mode (e.g., `"auto"`, `"none"`, or specific tool) - `max_completion_tokens`: (int) Maximum number of tokens in the completion response --- --- title: Sarvam AI LLM hide_title: false hide_table_of_contents: false description: "Learn how to use Sarvam AI's LLM models with the VideoSDK AI Agent SDK. This guide covers model configuration, API integration, and implementing text-based AI capabilities for your conversational agents." pagination_label: "Sarvam AI LLM" keywords: - Sarvam AI - sarvam-m - LLM - Large Language Model - VideoSDK Agents - Python SDK - Text Generation - AI Chat - Conversational AI image: img/videosdklive-thumbnail.jpg sidebar_position: 3 sidebar_label: Sarvam AI slug: sarvam-ai-llm --- # Sarvam AI LLM The Sarvam AI LLM provider enables your agent to use Sarvam AI's language models for text-based conversations and processing. ## Installation Install the Sarvam AI-enabled VideoSDK Agents package: ```bash pip install "videosdk-plugins-sarvamai" ``` ## Importing ```python from videosdk.plugins.sarvamai import SarvamAILLM ``` :::note When using Sarvam AI as the LLM option, the function tool calls and MCP tool will not work. ::: ## Example Usage ```python from videosdk.plugins.sarvamai import SarvamAILLM from videosdk.agents import CascadingPipeline # Initialize the Sarvam AI LLM model llm = SarvamAILLM( model="sarvam-m", # When SARVAMAI_API_KEY is set in .env - DON'T pass api_key parameter api_key="your-sarvam-ai-api-key", temperature=0.7, tool_choice="auto", max_completion_tokens=1000 ) # Add llm to cascading pipeline pipeline = CascadingPipeline(llm=llm) ``` :::note When using .env file for credentials, don't pass them as arguments to model instances or context objects. The SDK automatically reads environment variables, so omit `api_key` and other credential parameters from your code. ::: ## Configuration Options - `model`: (str) The Sarvam AI model to use (default: `"sarvam-m"`). - `api_key`: (str) Your Sarvam AI API key. Can also be set via the `SARVAMAI_API_KEY` environment variable. - `temperature`: (float) Sampling temperature for response randomness (default: `0.7`). - `tool_choice`: (ToolChoice) Tool selection mode (default: `"auto"`). - `max_completion_tokens`: (int) Maximum number of tokens in the completion response (optional). 
--- --- title: AWS Nova Sonic hide_title: false hide_table_of_contents: false description: "Learn how to use Amazon's Nova Sonic model with the VideoSDK AI Agent SDK. This guide covers model configuration, streaming audio, and integration with your agent pipeline." pagination_label: "Amazon Nova Sonic" keywords: - Amazon's Nova Sonic - AWS Nova Sonic - AWS Model - Amazon Nova Sonic - NovaSonicRealtime - NovaSonicLiveConfig - Real-time AI - VideoSDK Agents - Python SDK image: img/videosdklive-thumbnail.jpg sidebar_position: 2 sidebar_label: AWS Nova Sonic slug: aws-nova-sonic --- # AWS Nova Sonic The AWS Nova Sonic provider enables your agent to use Amazon's Nova Sonic model for real-time, speech-to-speech AI interactions. ### Prerequisites Before you start using AWS Nova Sonic with the VideoSDK AI Agent, ensure the following: - `AWS Account`: You have an active AWS account with permissions to access Amazon Bedrock. - `Model Access`: You've requested and obtained access to the Amazon Nova models (Nova Lite and Nova Canvas) via the Amazon Bedrock console. - `Region Selection`: You're operating in the US East (N. Virginia) (us-east-1) region, as model access is region-specific. - `AWS Credentials`: Your AWS credentials (aws_access_key_id and aws_secret_access_key) are configured, either through environment variables or your preferred credential management method. ## Installation Install the AWS-enabled VideoSDK Agents package: ```bash pip install "videosdk-plugins-aws" ``` ## Importing ```python from videosdk.plugins.aws import NovaSonicRealtime, NovaSonicConfig ``` ## Example Usage ```python from videosdk.plugins.aws import NovaSonicRealtime, NovaSonicConfig from videosdk.agents import RealTimePipeline # Initialize the Nova Sonic real-time model model = NovaSonicRealtime( model="amazon.nova-sonic-v1:0", # When AWS credentials and region are set in .env - DON'T pass credential parameters region="us-east-1", # Currently, only "us-east-1" is supported for Amazon Nova Sonic. aws_access_key_id="YOUR_ACCESS_KEY", aws_secret_access_key="YOUR_SECRET_KEY", config=NovaSonicConfig( voice="tiffany", # "tiffany", "matthew", "amy" temperature=0.7, top_p=0.9, max_tokens=1024 ) ) # Create the pipeline with the model pipeline = RealTimePipeline(model=model) ``` :::note When using .env file for credentials, don't pass them as arguments to model instances or context objects. The SDK automatically reads environment variables, so omit api_key, videosdk_auth, and other credential parameters from your code. ::: :::note To initiate a conversation with Amazon Nova Sonic, the user must speak first. The model listens for user input to begin the interaction. ::: ## Configuration Options - `model`: The Amazon Nova Sonic model to use (e.g., "amazon.nova-sonic-v1:0"). - `region`: AWS region where the model is hosted (e.g., "us-east-1"). - `aws_access_key_id`: Your AWS access key ID. - `aws_secret_access_key`: Your AWS secret access key. - `config`: A NovaSonicConfig object for advanced options: - `voice`: (str or None) The voice to use for audio output (e.g., "matthew", "tiffany", "amy"). - `temperature`: (float or None) Sampling temperature for response randomness. - `top_p`: (float or None) Nucleus sampling probability. - `max_tokens`: (int or None) Maximum number of tokens in the output. :::tip Explore and utilize ready-made scripts for integrating AWS Nova Sonic with the VideoSDK AI Agent SDK.
[AWS Nova Sonic Example Script.](https://github.com/videosdk-live/agents-quickstart/tree/main/AWS%20Nova%20Sonic) ::: --- --- title: Google Gemini (LiveAPI) hide_title: false hide_table_of_contents: false description: "Learn how to use Google's Gemini models with the VideoSDK AI Agent SDK. This guide covers model configuration, streaming audio, and integration with your agent pipeline." pagination_label: "Google Gemini" keywords: - Google Gemini - GeminiRealtime - GeminiLiveConfig - Real-time AI - VideoSDK Agents - Python SDK image: img/videosdklive-thumbnail.jpg sidebar_position: 2 sidebar_label: Google Gemini (LiveAPI) slug: google-live-api --- # Google Gemini (LiveAPI) The Google Gemini (Live API) provider allows your agent to leverage Google's Gemini models for real-time, multimodal AI interactions. ## Installation Install the Gemini-enabled VideoSDK Agents package: ```bash pip install "videosdk-plugins-google" ``` ## Importing ```python from videosdk.plugins.google import GeminiRealtime, GeminiLiveConfig ``` ## Example Usage ```python from videosdk.plugins.google import GeminiRealtime, GeminiLiveConfig from videosdk.agents import RealTimePipeline # Initialize the Gemini real-time model model = GeminiRealtime( model="gemini-2.0-flash-live-001", # When GOOGLE_API_KEY is set in .env - DON'T pass api_key parameter api_key="your-google-api-key", config=GeminiLiveConfig( voice="Leda", # Puck, Charon, Kore, Fenrir, Aoede, Leda, Orus, and Zephyr. response_modalities=["AUDIO"] ) ) # Create the pipeline with the model pipeline = RealTimePipeline(model=model) ``` :::note When using .env file for credentials, don't pass them as arguments to model instances or context objects. The SDK automatically reads environment variables, so omit api_key, videosdk_auth, and other credential parameters from your code. ::: ## Configuration Options - `model`: The Gemini model to use (e.g., `"gemini-2.0-flash-live-001"`). Other supported models include: `"gemini-2.5-flash-preview-native-audio-dialog"` and `"gemini-2.5-flash-exp-native-audio-thinking-dialog"`. - `api_key`: Your Google API key (can also be set via environment variable) - `config`: A `GeminiLiveConfig` object for advanced options: - `voice`: (str or None) The voice to use for audio output (e.g., `"Puck"`). - `language_code`: (str or None) The language code for the conversation (e.g., `"en-US"`). - `temperature`: (float or None) Sampling temperature for response randomness. - `top_p`: (float or None) Nucleus sampling probability. - `top_k`: (float or None) Top-k sampling for response diversity. - `candidate_count`: (int or None) Number of candidate responses to generate. - `max_output_tokens`: (int or None) Maximum number of tokens in the output. - `presence_penalty`: (float or None) Penalty for introducing new topics. - `frequency_penalty`: (float or None) Penalty for repeating tokens. - `response_modalities`: (List[str] or None) List of enabled output modalities (e.g., `["TEXT", "AUDIO"]`). - `output_audio_transcription`: (`AudioTranscriptionConfig` or None) Configuration for audio output transcription. :::tip Explore and utilize ready-made scripts for Gemini(LiveAPI) with the VideoSDK AI Agent SDK. [Gemini(LiveAPI) Example Script.](https://github.com/videosdk-live/agents-quickstart/tree/main/Google%20Gemini%20(LiveAPI)) ::: --- --- title: OpenAI hide_title: false hide_table_of_contents: false description: "Learn how to use OpenAI's real-time models with the VideoSDK AI Agent SDK. 
This guide covers model configuration, streaming audio, and integration with your agent pipeline." pagination_label: "OpenAI" keywords: - OpenAI - GPT-4o - Real-time AI - VideoSDK Agents - Python SDK - OpenAIRealtime - OpenAIRealtimeConfig image: img/videosdklive-thumbnail.jpg sidebar_position: 1 sidebar_label: OpenAI slug: openai --- # OpenAI The OpenAI provider enables your agent to use OpenAI's real-time models (like GPT-4o) for text and audio interactions. ## Installation Install the OpenAI-enabled VideoSDK Agents package: ```bash pip install "videosdk-plugins-openai" ``` ## Importing ```python from videosdk.plugins.openai import OpenAIRealtime, OpenAIRealtimeConfig ``` ## Example Usage ```python from videosdk.plugins.openai import OpenAIRealtime, OpenAIRealtimeConfig from videosdk.agents import RealTimePipeline from openai.types.beta.realtime.session import TurnDetection # Initialize the OpenAI real-time model model = OpenAIRealtime( model="gpt-4o-realtime-preview", # When OPENAI_API_KEY is set in .env - DON'T pass api_key parameter api_key="your-openai-api-key", config=OpenAIRealtimeConfig( voice="alloy", # alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, and verse modalities=["text", "audio"], turn_detection=TurnDetection( type="server_vad", threshold=0.5, prefix_padding_ms=300, silence_duration_ms=200, ), tool_choice="auto" ) ) # Create the pipeline with the model pipeline = RealTimePipeline(model=model) ``` :::note When using .env file for credentials, don't pass them as arguments to model instances or context objects. The SDK automatically reads environment variables, so omit api_key, videosdk_auth, and other credential parameters from your code. ::: ## Configuration Options - `model`: The OpenAI model to use (e.g., `"gpt-4o-realtime-preview"`) - `api_key`: Your OpenAI API key (can also be set via environment variable) - `config`: An `OpenAIRealtimeConfig` object for advanced options: - `voice`: (str) The voice to use for audio output (e.g., `"alloy"`). - `temperature`: (float) Sampling temperature for response randomness. - `turn_detection`: (`TurnDetection` or None) Configure how the agent detects when a user has finished speaking. - `input_audio_transcription`: (`InputAudioTranscription` or None) Configure audio-to-text (e.g., Whisper). - `tool_choice`: (str or None) Tool selection mode (e.g., `"auto"`). - `modalities`: (list[str]) List of enabled modalities (e.g., `["text", "audio"]`). :::tip Explore and utilize ready-made scripts for OpenAI with the VideoSDK AI Agent SDK. [OpenAI Example Script.](https://github.com/videosdk-live/agents-quickstart/tree/main/OpenAI) ::: --- --- title: Silero VAD hide_title: false hide_table_of_contents: false description: "Learn how to use Silero's VAD with the VideoSDK AI Agent SDK. This guide covers model configuration, related events." pagination_label: "Silero VAD" keywords: - Silero - VAD - Large Language Model - VideoSDK Agents - Python SDK - Text To Speech - AI Chat - Conversational AI image: img/videosdklive-thumbnail.jpg sidebar_position: 1 sidebar_label: Silero VAD slug: silero-vad --- # Silero VAD The Silero VAD (Voice Activity Detection) provider enables your agent to detect when users start and stop speaking. When added to a cascading pipeline, it automatically enables interrupt functionality - allowing users to interrupt the agent mid-response. 
## Installation Install the Silero VAD-enabled VideoSDK Agents package: ```bash pip install "videosdk-plugins-silero" ``` ## Importing ```python from videosdk.plugins.silero import SileroVAD ``` ## Example Usage ```python from videosdk.plugins.silero import SileroVAD from videosdk.agents import CascadingPipeline # Initialize the Silero VAD vad = SileroVAD( input_sample_rate=48000, model_sample_rate=16000, threshold=0.3, min_speech_duration=0.1, min_silence_duration=0.75, prefix_padding_duration=0.3 ) # Add VAD to cascading pipeline - automatically enables interrupts pipeline = CascadingPipeline(vad=vad) ``` ## Configuration Options - `input_sample_rate`: (int) Sample rate of input audio in Hz (default: `48000`) - `model_sample_rate`: (Literal[8000, 16000]) Model's expected sample rate (default: `16000`) - `threshold`: (float) Voice activity detection sensitivity (0.0 to 1.0, default: `0.3`) - `min_speech_duration`: (float) Minimum speech duration to trigger detection in seconds (default: `0.1`) - `min_silence_duration`: (float) Minimum silence duration to end speech detection in seconds (default: `0.75`) - `max_buffered_speech`: (float) Maximum speech buffer duration in seconds (default: `60.0`) - `force_cpu`: (bool) Force CPU usage instead of GPU acceleration (default: `True`) - `prefix_padding_duration`: (float) Audio padding before speech detection in seconds (default: `0.3`) --- --- title: Deepgram STT hide_title: false hide_table_of_contents: false description: "Learn how to use Deepgram's STT models with the VideoSDK AI Agent SDK. This guide covers model configuration, API integration, and implementing speech to text for Deepgram's services" pagination_label: "Deepgram STT" keywords: - Deepgram - nova-2 - nova-3 - STT - Large Language Model - VideoSDK Agents - Python SDK - Speech To Text - AI Chat - Conversational AI image: img/videosdklive-thumbnail.jpg sidebar_position: 1 sidebar_label: Deepgram slug: deepgram --- # Deepgram STT The Deepgram STT provider enables your agent to use Deepgram's advanced speech-to-text models for high-accuracy, real-time audio transcription. ## Installation Install the Deepgram-enabled VideoSDK Agents package: ```bash pip install "videosdk-plugins-deepgram" ``` ## Importing ```python from videosdk.plugins.deepgram import DeepgramSTT ``` ## Example Usage ```python from videosdk.plugins.deepgram import DeepgramSTT from videosdk.agents import CascadingPipeline # Initialize the Deepgram STT model stt = DeepgramSTT( # When DEEPGRAM_API_KEY is set in .env - DON'T pass api_key parameter api_key="your-deepgram-api-key", model="nova-2", language="en-US", interim_results=True, punctuate=True, smart_format=True ) # Add stt to cascading pipeline pipeline = CascadingPipeline(stt=stt) ``` :::note When using .env file for credentials, don't pass them as arguments to model instances or context objects. The SDK automatically reads environment variables, so omit api_key, videosdk_auth, and other credential parameters from your code. 
::: ## Configuration Options - `api_key`: Your Deepgram API key (can also be set via environment variable) - `model`: The Deepgram model to use (e.g., `"nova-2"`, `"nova-3"`, `"whisper-large"`) - `language`: (str) Language code for transcription (default: `"en-US"`) - `interim_results`: (bool) Enable real-time partial transcription results (default: `True`) - `punctuate`: (bool) Add punctuation to transcription (default: `True`) - `smart_format`: (bool) Apply intelligent formatting to output (default: `True`) - `sample_rate`: (int) Audio sample rate in Hz (default: `48000`) - `endpointing`: (int) Silence detection threshold in milliseconds (default: `50`) - `filler_words`: (bool) Include filler words like "uh", "um" in transcription (default: `True`) - `base_url`: (str) WebSocket endpoint URL (default: `"wss://api.deepgram.com/v1/listen"`) --- --- title: Google STT hide_title: false hide_table_of_contents: false description: "Learn how to use Google's STT models with the VideoSDK AI Agent SDK. This guide covers model configuration, API integration, and implementing speech to text for Google's services" pagination_label: "Google STT" keywords: - Google - Speech-to-Text - STT - Large Language Model - VideoSDK Agents - Python SDK - Speech To Text - AI Chat - Conversational AI image: img/videosdklive-thumbnail.jpg sidebar_position: 2 sidebar_label: Google slug: google --- # Google STT The Google STT provider enables your agent to use Google's advanced speech-to-text models for high-accuracy, real-time audio transcription. ## Installation Install the Google-enabled VideoSDK Agents package: ```bash pip install "videosdk-plugins-google" ``` ## Importing ```python from videosdk.plugins.google import GoogleSTT ``` ## Setup Credentials To use Google STT, you need to set up your Google Cloud credentials. You can do this by setting the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to the path of your service account key file. ```bash export GOOGLE_APPLICATION_CREDENTIALS="/path/to/your/keyfile.json" ``` Alternatively, you can pass the path to the key file directly to the `GoogleSTT` constructor via the `api_key` parameter. ## Example Usage ```python from videosdk.plugins.google import GoogleSTT from videosdk.agents import CascadingPipeline # Initialize the Google STT model stt = GoogleSTT( # If GOOGLE_APPLICATION_CREDENTIALS is set, you can omit api_key api_key="/path/to/your/keyfile.json", languages="en-US", model="latest_long", interim_results=True, punctuate=True ) # Add stt to cascading pipeline pipeline = CascadingPipeline(stt=stt) ``` :::note When using an environment variable for credentials, don't pass the `api_key` as an argument to the model instance. The SDK automatically reads the environment variable. ::: ## Configuration Options - `api_key`: (str) Path to your Google Cloud service account JSON file. This can also be set via the `GOOGLE_APPLICATION_CREDENTIALS` environment variable. - `languages`: (Union[str, list[str]]) Language code or a list of language codes for transcription (default: `"en-US"`). - `model`: (str) The Google STT model to use (e.g., `"latest_long"`, `"telephony"`) (default: `"latest_long"`). - `sample_rate`: (int) The target audio sample rate in Hz for transcription (default: `16000`). The input audio at 48000Hz will be resampled to this rate. - `interim_results`: (bool) Enable real-time partial transcription results (default: `True`). - `punctuate`: (bool) Add punctuation to transcription (default: `True`). 
- `min_confidence_threshold`: (float) The minimum confidence level for a transcription result to be considered valid (default: `0.1`). - `location`: (str) The Google Cloud location to use for the STT service (default: `"global"`). --- --- title: OpenAI STT hide_title: false hide_table_of_contents: false description: "Learn how to use OpenAI's STT models with the VideoSDK AI Agent SDK. This guide covers model configuration, API integration, and implementing speech to text for OpenAI's services" pagination_label: "OpenAI STT" keywords: - OpenAI - gpt-4o-mini-transcribe - whisper-1 - STT - Large Language Model - VideoSDK Agents - Python SDK - Speech To Text - AI Chat - Conversational AI image: img/videosdklive-thumbnail.jpg sidebar_position: 2 sidebar_label: OpenAI slug: openai --- # OpenAI STT The OpenAI STT provider enables your agent to use OpenAI's speech-to-text models (like Whisper) for converting audio input to text. ## Installation Install the OpenAI-enabled VideoSDK Agents package: ```bash pip install "videosdk-plugins-openai" ``` ## Importing ```python from videosdk.plugins.openai import OpenAISTT ``` ## Example Usage ```python from videosdk.plugins.openai import OpenAISTT from videosdk.agents import CascadingPipeline # Initialize the OpenAI STT model stt = OpenAISTT( # When OPENAI_API_KEY is set in .env - DON'T pass api_key parameter api_key="your-openai-api-key", model="whisper-1", language="en", prompt="Transcribe this audio with proper punctuation and formatting." ) # Add stt to cascading pipeline pipeline = CascadingPipeline(stt=stt) ``` :::note When using .env file for credentials, don't pass them as arguments to model instances or context objects. The SDK automatically reads environment variables, so omit api_key, videosdk_auth, and other credential parameters from your code. ::: ## Configuration Options - `api_key`: Your OpenAI API key (required, can also be set via environment variable) - `model`: The OpenAI STT model to use (e.g., `"whisper-1"`, `"gpt-4o-mini-transcribe"`) - `base_url`: Custom base URL for OpenAI API (optional) - `prompt`: (str) Custom prompt to guide transcription style and format - `language`: (str) Language code for transcription (default: `"en"`) - `turn_detection`: (dict) Configuration for detecting conversation turns --- --- title: Sarvam AI STT hide_title: false hide_table_of_contents: false description: "Learn how to use Sarvam AI's STT models with the VideoSDK AI Agent SDK. This guide covers model configuration, API integration, and implementing speech to text for Sarvam AI's services" pagination_label: "Sarvam AI STT" keywords: - Sarvam AI - saarika:v2 - STT - Large Language Model - VideoSDK Agents - Python SDK - Speech To Text - AI Chat - Conversational AI image: img/videosdklive-thumbnail.jpg sidebar_position: 3 sidebar_label: Sarvam AI slug: sarvam-ai --- # Sarvam AI STT The Sarvam AI STT provider enables your agent to use Sarvam AI's speech-to-text models for transcription. This provider uses Voice Activity Detection (VAD) to send audio chunks for transcription after a period of silence. 
## Installation Install the Sarvam AI-enabled VideoSDK Agents package: ```bash pip install "videosdk-plugins-sarvamai" ``` ## Importing ```python from videosdk.plugins.sarvamai import SarvamAISTT ``` ## Example Usage ```python from videosdk.plugins.sarvamai import SarvamAISTT from videosdk.agents import CascadingPipeline # Initialize the Sarvam AI STT model stt = SarvamAISTT( # When SARVAMAI_API_KEY is set in .env - DON'T pass api_key parameter api_key="your-sarvam-ai-api-key", model="saarika:v2", language="en-IN" ) # Add stt to cascading pipeline pipeline = CascadingPipeline(stt=stt) ``` :::note When using .env file for credentials, don't pass them as arguments to model instances or context objects. The SDK automatically reads environment variables, so omit `api_key` and other credential parameters from your code. ::: ## Configuration Options - `api_key`: (str) Your Sarvam AI API key. Can also be set via the `SARVAMAI_API_KEY` environment variable. - `model`: (str) The Sarvam AI model to use (default: `"saarika:v2"`). - `language`: (str) Language code for transcription (default: `"en-IN"`). - `input_sample_rate`: (int) The sample rate of the audio from the source in Hz (default: `48000`). - `output_sample_rate`: (int) The sample rate to which the audio is resampled before sending for transcription (default: `16000`). - `silence_threshold`: (float) The normalized amplitude threshold for silence detection (default: `0.01`). - `silence_duration`: (float) The duration of silence in seconds that triggers the end of a speech segment for transcription (default: `0.8`). --- --- title: ElevenLabs TTS hide_title: false hide_table_of_contents: false description: "Learn how to use ElevenLabs's TTS models with the VideoSDK AI Agent SDK. This guide covers model configuration, API integration, and implementing text to speech for ElevenLabs's services" pagination_label: "ElevenLabs TTS" keywords: - ElevenLabs - eleven_flash_v2_5 - TTS - Large Language Model - VideoSDK Agents - Python SDK - Text To Speech - AI Chat - Conversational AI image: img/videosdklive-thumbnail.jpg sidebar_position: 1 sidebar_label: ElevenLabs slug: eleven-labs --- # ElevenLabs TTS The ElevenLabs TTS provider enables your agent to use ElevenLabs' high-quality text-to-speech models for generating natural, expressive voice output with advanced voice cloning capabilities. ## Installation Install the ElevenLabs-enabled VideoSDK Agents package: ```bash pip install "videosdk-plugins-elevenlabs" ``` ## Importing ```python from videosdk.plugins.elevenlabs import ElevenLabsTTS, VoiceSettings ``` ## Example Usage ```python from videosdk.plugins.elevenlabs import ElevenLabsTTS, VoiceSettings from videosdk.agents import CascadingPipeline # Configure voice settings voice_settings = VoiceSettings( stability=0.71, similarity_boost=0.5, style=0.0, use_speaker_boost=True ) # Initialize the ElevenLabs TTS model tts = ElevenLabsTTS( # When ELEVENLABS_API_KEY is set in .env - DON'T pass api_key parameter api_key="your-elevenlabs-api-key", model="eleven_flash_v2_5", voice="your-voice-id", speed=1.0, response_format="pcm_24000", voice_settings=voice_settings, enable_streaming=True ) # Add tts to cascading pipeline pipeline = CascadingPipeline(tts=tts) ``` :::note When using .env file for credentials, don't pass them as arguments to model instances or context objects. The SDK automatically reads environment variables, so omit api_key, videosdk_auth, and other credential parameters from your code. 
::: ## Configuration Options - `model`: The ElevenLabs model to use (e.g., `"eleven_flash_v2_5"`, `"eleven_multilingual_v2"`) - `voice`: (str) Voice ID to use for audio output (get from ElevenLabs dashboard) - `speed`: (float) Speed of the generated audio (default: 1.0) - `api_key`: Your ElevenLabs API key (can also be set via environment variable) - `response_format`: (str) Audio format for output (default: `"pcm_24000"`) - `voice_settings`: (`VoiceSettings`) Advanced voice configuration options: - `stability`: (float) Voice stability (0.0 to 1.0, default: 0.71) - `similarity_boost`: (float) Voice similarity enhancement (0.0 to 1.0, default: 0.5) - `style`: (float) Voice style exaggeration (0.0 to 1.0, default: 0.0) - `use_speaker_boost`: (bool) Enable speaker boost for clarity (default: `True`) - `base_url`: (str) Custom base URL for ElevenLabs API (optional) - `enable_streaming`: (bool) Enable real-time audio streaming (default: `False`) --- --- title: Google TTS hide_title: false hide_table_of_contents: false description: "Learn how to use Google's TTS models with the VideoSDK AI Agent SDK. This guide covers model configuration, API integration, and implementing text to speech for Google's services" pagination_label: "Google TTS" keywords: - Google - Text-to-Speech - TTS - Large Language Model - VideoSDK Agents - Python SDK - Text To Speech - AI Chat - Conversational AI image: img/videosdklive-thumbnail.jpg sidebar_position: 2 sidebar_label: Google slug: google-tts --- # Google TTS The Google TTS provider enables your agent to use Google's high-quality text-to-speech models for generating natural-sounding voice output. ## Installation Install the Google-enabled VideoSDK Agents package: ```bash pip install "videosdk-plugins-google" ``` ## Importing ```python from videosdk.plugins.google import GoogleTTS, GoogleVoiceConfig ``` ## Example Usage ```python from videosdk.plugins.google import GoogleTTS, GoogleVoiceConfig from videosdk.agents import CascadingPipeline # Configure voice settings voice_config = GoogleVoiceConfig( languageCode="en-US", name="en-US-Chirp3-HD-Aoede", ssmlGender="FEMALE" ) # Initialize the Google TTS model tts = GoogleTTS( # When GOOGLE_API_KEY is set in .env - DON'T pass api_key parameter api_key="your-google-api-key", speed=1.0, pitch=0.0, voice_config=voice_config ) # Add tts to cascading pipeline pipeline = CascadingPipeline(tts=tts) ``` :::note When using .env file for credentials, don't pass them as arguments to model instances or context objects. The SDK automatically reads environment variables, so omit `api_key` and other credential parameters from your code. ::: ## Configuration Options - `api_key`: (str) Your Google Cloud TTS API key. Can also be set via the `GOOGLE_API_KEY` environment variable. - `speed`: (float) The speaking rate of the generated audio (default: `1.0`). - `pitch`: (float) The pitch of the generated audio. Can be between -20.0 and 20.0 (default: `0.0`). - `response_format`: (str) The format of the audio response. Currently only supports `"pcm"` (default: `"pcm"`). - `voice_config`: (`GoogleVoiceConfig`) Configuration for the voice to be used. - `languageCode`: (str) The language code of the voice (e.g., `"en-US"`, `"en-GB"`) (default: `"en-US"`). - `name`: (str) The name of the voice to use (e.g., `"en-US-Chirp3-HD-Aoede"`, `"en-US-News-N"`) (default: `"en-US-Chirp3-HD-Aoede"`). - `ssmlGender`: (str) The gender of the voice (`"MALE"`, `"FEMALE"`, `"NEUTRAL"`) (default: `"FEMALE"`). 
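The STT, LLM, TTS, and VAD pages in this section each attach a single component to `CascadingPipeline`. For orientation, here is a minimal sketch of how those components can be combined into one pipeline; the specific providers chosen below are illustrative, and any of the alternatives documented in this section can be swapped in.

```python
from videosdk.plugins.deepgram import DeepgramSTT
from videosdk.plugins.google import GoogleLLM, GoogleTTS
from videosdk.plugins.silero import SileroVAD
from videosdk.agents import CascadingPipeline

# Credentials are assumed to come from DEEPGRAM_API_KEY / GOOGLE_API_KEY in the environment
stt = DeepgramSTT(model="nova-2", language="en-US")
llm = GoogleLLM(model="gemini-2.0-flash-001", temperature=0.7)
tts = GoogleTTS(speed=1.0)
vad = SileroVAD(threshold=0.3)  # adding VAD also enables interrupt handling

# One cascading pipeline carrying all components
pipeline = CascadingPipeline(stt=stt, llm=llm, tts=tts, vad=vad)
```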
--- --- title: OpenAI TTS hide_title: false hide_table_of_contents: false description: "Learn how to use OpenAI's TTS models with the VideoSDK AI Agent SDK. This guide covers model configuration, API integration, and implementing text to speech for OpenAI's services" pagination_label: "OpenAI TTS" keywords: - OpenAI - gpt-4o-mini-tts - TTS - Large Language Model - VideoSDK Agents - Python SDK - Text To Speech - AI Chat - Conversational AI image: img/videosdklive-thumbnail.jpg sidebar_position: 2 sidebar_label: OpenAI slug: openai --- # OpenAI TTS The OpenAI TTS provider enables your agent to use OpenAI's text-to-speech models for converting text responses to natural-sounding audio output. ## Installation Install the OpenAI-enabled VideoSDK Agents package: ```bash pip install "videosdk-plugins-openai" ``` ## Importing ```python from videosdk.plugins.openai import OpenAITTS ``` ## Example Usage ```python from videosdk.plugins.openai import OpenAITTS from videosdk.agents import CascadingPipeline # Initialize the OpenAI TTS model tts = OpenAITTS( # When OPENAI_API_KEY is set in .env - DON'T pass api_key parameter api_key="your-openai-api-key", model="tts-1", voice="alloy", speed=1.0, response_format="pcm" ) # Add tts to cascading pipeline pipeline = CascadingPipeline(tts=tts) ``` :::note When using .env file for credentials, don't pass them as arguments to model instances or context objects. The SDK automatically reads environment variables, so omit api_key, videosdk_auth, and other credential parameters from your code. ::: ## Configuration Options - `model`: The OpenAI TTS model to use (e.g., `"tts-1"`, `"tts-1-hd"`) - `voice`: (str) Voice to use for audio output (e.g., `"alloy"`, `"echo"`, `"fable"`, `"onyx"`, `"nova"`, `"shimmer"`) - `speed`: (float) Speed of the generated audio (0.25 to 4.0, default: 1.0) - `instructions`: (str) Custom instructions to guide speech synthesis style - `api_key`: Your OpenAI API key (can also be set via environment variable) - `base_url`: Custom base URL for OpenAI API (optional) - `response_format`: (str) Audio format for output (default: `"pcm"`) --- --- title: Sarvam AI TTS hide_title: false hide_table_of_contents: false description: "Learn how to use Sarvam AI's TTS models with the VideoSDK AI Agent SDK. This guide covers model configuration, API integration, and implementing text to speech for Sarvam AI's services" pagination_label: "Sarvam AI TTS" keywords: - Sarvam AI - bulbul:v2 - TTS - Large Language Model - VideoSDK Agents - Python SDK - Text To Speech - AI Chat - Conversational AI image: img/videosdklive-thumbnail.jpg sidebar_position: 3 sidebar_label: Sarvam AI slug: sarvam-ai-tts --- # Sarvam AI TTS The Sarvam AI TTS provider enables your agent to use Sarvam AI's text-to-speech models for generating voice output. 
## Installation Install the Sarvam AI-enabled VideoSDK Agents package: ```bash pip install "videosdk-plugins-sarvamai" ``` ## Importing ```python from videosdk.plugins.sarvamai import SarvamAITTS ``` ## Example Usage ```python from videosdk.plugins.sarvamai import SarvamAITTS from videosdk.agents import CascadingPipeline # Initialize the Sarvam AI TTS model tts = SarvamAITTS( # When SARVAMAI_API_KEY is set in .env - DON'T pass api_key parameter api_key="your-sarvam-ai-api-key", model="bulbul:v2", speaker="anushka", target_language_code="en-IN", pitch=0.0, pace=1.0, loudness=1.2 ) # Add tts to cascading pipeline pipeline = CascadingPipeline(tts=tts) ``` :::note When using .env file for credentials, don't pass them as arguments to model instances or context objects. The SDK automatically reads environment variables, so omit `api_key` and other credential parameters from your code. ::: ## Configuration Options - `api_key`: (str) Your Sarvam AI API key. Can also be set via the `SARVAMAI_API_KEY` environment variable. - `model`: (str) The Sarvam AI model to use (default: `"bulbul:v2"`). - `speaker`: (str) The speaker voice to use (default: `"anushka"`). - `target_language_code`: (str) The language code for the generated audio (default: `"en-IN"`). - `pitch`: (float) The pitch of the generated audio (default: `0.0`). - `pace`: (float) The pace or speed of the generated audio (default: `1.0`). - `loudness`: (float) The loudness of the generated audio (default: `1.2`). - `enable_preprocessing`: (bool) Whether to enable text preprocessing on the server (default: `True`). --- --- title: Turn Detector hide_title: false hide_table_of_contents: false description: "Learn how to use TurnDetector model with the VideoSDK AI Agent SDK. This guide covers model configuration." pagination_label: "Turn Detector" keywords: - Turn Detection - Turn Detector - Large Language Model - VideoSDK Agents - Python SDK - Text To Speech - AI Chat - Conversational AI image: img/videosdklive-thumbnail.jpg sidebar_position: 1 sidebar_label: Turn Detector slug: turn-detector --- # Turn Detector The Turn Detector uses a Hugging Face model to determine whether a user's turn is completed or not, enabling precise conversation flow management in cascading pipelines. ## Installation Install the Turn Detector-enabled VideoSDK Agents package: ```bash pip install "videosdk-plugins-turn-detector" ``` ## Importing ```python from videosdk.plugins.turn_detector import TurnDetector ``` ## Example Usage ```python from videosdk.plugins.turn_detector import TurnDetector, pre_download_model from videosdk.agents import CascadingPipeline # Pre-download the model (optional but recommended) pre_download_model() # Initialize the Turn Detector turn_detector = TurnDetector( threshold=0.7 ) # Add Turn Detector to cascading pipeline pipeline = CascadingPipeline(turn_detector=turn_detector) ``` ## Configuration Options - `threshold`: (float) Confidence threshold for turn completion detection (0.0 to 1.0, default: `0.7`) ## Pre-downloading Model To avoid delays during agent initialization, you can pre-download the Hugging Face model: ```python from videosdk.plugins.turn_detector import pre_download_model # Download model before running the agent pre_download_model() ``` --- --- title: Running Multiple Agents hide_title: false hide_table_of_contents: false description: "Learn how to run multiple AI agent instances concurrently using the Worker system in the VideoSDK AI Agent SDK. 
Understand WorkerJob and Worker components for parallel task execution and managing multiple voice agent sessions." pagination_label: "Running Multiple Agents" keywords: - Multiple Agents - Worker System - VideoSDK Agents - AI Agent SDK - Python - Multiprocessing - Parallel Execution - WorkerJob - Worker - Voice Agent Sessions image: img/videosdklive-thumbnail.jpg sidebar_position: 4 sidebar_label: Running Multiple Agents slug: running-multiple-agents --- The worker system provides a way to run multiple tasks in parallel using Python's multiprocessing. It's particularly useful for running multiple instances of the same task concurrently, such as handling multiple voice agent sessions. ## Key Components ### 1. WorkerJob `WorkerJob` is the main class that defines a task to be executed. It takes two parameters: - `job_func`: The function to be executed - `jobctx`: Context data for the job (can be static or a callable) ```python job = WorkerJob(job_func=my_function, jobctx=my_context) ``` ### 2. Worker `Worker` manages the execution of jobs and provides a command-line interface to control them. It supports the following commands: - `n`: Start a new worker process - `l`: List all running worker processes - `q `: Stop a specific process - `q`: Stop all running processes - `x`: Exit the worker interface ## Usage Example Here's a complete example of how to use the worker system with a voice agent: ```python import asyncio from videosdk.plugins.openai import OpenAIRealtime, OpenAIRealtimeConfig from videosdk.agents import Agent, AgentSession, RealTimePipeline, WorkerJob class VoiceAgent(Agent): def __init__(self): super().__init__( instructions="You are a helpful voice assistant that can answer questions and help with tasks.", ) async def on_enter(self) -> None: await self.session.say("Hi there! How can I help you today?") async def start_session(jobctx): model = OpenAIRealtime( model="gpt-4o-realtime-preview", ) pipeline = RealTimePipeline(model=model) session = AgentSession( agent=VoiceAgent(), pipeline=pipeline, context=jobctx ) try: await session.start() await asyncio.Event().wait() finally: await session.close() def entryPoint(jobctx): asyncio.run(start_session(jobctx)) # Create job context def make_context(): return {"meetingId": "", "name": ""} # Start a worker job job = WorkerJob(job_func=entryPoint, jobctx=make_context) job.start() ``` --- --- title: Custom Tracks hide_title: true hide_table_of_contents: false description: Custom Video Track features quick integrate in Javascript, React JS, Android, IOS, React Native, Flutter with Video SDK to add live video & audio conferencing to your applications. sidebar_label: Custom Tracks pagination_label: Custom Tracks keywords: - custom Track - audio calling - video calling - real-time communication image: img/videosdklive-thumbnail.jpg sidebar_position: 1 --- ## Custom Video Track - Android - You can create a Video Track using `createCameraVideoTrack()` method of `VideoSDK`. - This method can be used to create video track using different encoding parameters, camera facing mode, and optimization mode. ### Parameters - **encoderConfig**: - type: `String` - required: `true` - default: `h480p_w720p` - You can chose from the below mentioned list of values for the encoder config. 
| Encoder Config | Resolution | Frame Rate | Bitrate | | -------------- | :--------: | :--------: | :----------: | | h144p_w176p | 176x144 | 15 fps | 120000 kbps | | h240p_w320p | 320x240 | 15 fps | 150000 kbps | | h480p_w640p | 640x480 | 25 fps | 300000 kbps | | h480p_w720p | 720x480 | 30 fps | 450000 kbps | | h720p_w960p | 720x960 | 30 fps | 1500000 kbps | | h1080p_w1440p | 1080x1440 | 30 fps | 2500000 kbps | | h720p_w1280p | 720x1280 | 30 fps | 2000000 kbps | | h360p_w640p | 360x640 | 30 fps | 400000 kbps | :::note The above encoder configurations are valid for both landscape and portrait mode. ::: - **facingMode**: - type: `String` - required: `true` - Allowed values: `front` | `back` - It specifies whether to use the front or back camera for the video track. - **optimizationMode** - type: `CustomStreamTrack.VideoMode` - required: `true` - Allowed values: `motion` | `text` | `detail` - It specifies the optimization mode for the video track being generated. - **multiStream**: - type: `boolean` - required: `true` - It specifies whether the stream should send multiple resolution layers or a single resolution layer. - **context**: - type: `Context` - required: `true` - Pass the Android Context for this parameter. - **observer**: - type: `CapturerObserver` - required: `false` - If you want to use a video filter from an external SDK (e.g., [Banuba](https://www.banuba.com/)), pass an instance of `CapturerObserver` in this parameter. - **videoDeviceInfo**: - type: `VideoDeviceInfo` - required: `false` - Use this if you want to specify a camera device to be used in the meeting. :::note For Banuba integration with VideoSDK, please visit [Banuba Integration with VideoSDK](/android/guide/video-and-audio-calling-api-sdk/video-processor/banuba-integration) ::: :::info - To learn more about optimizations and best practices for using custom video tracks, [follow this guide](/android/guide/video-and-audio-calling-api-sdk/render-media/optimize-video-track). ::: #### Returns - `CustomStreamTrack` ### Example import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; ```javascript val videoCustomTrack: CustomStreamTrack = VideoSDK.createCameraVideoTrack("h720p_w960p", "front", CustomStreamTrack.VideoMode.MOTION, false, this) ``` ```javascript CustomStreamTrack customStreamTrack = VideoSDK.createCameraVideoTrack("h720p_w960p", "front", CustomStreamTrack.VideoMode.MOTION, false, this); ``` ## Custom Audio Track - Android - You can create an audio track using the `createAudioTrack()` method of `VideoSDK`. - This method can be used to create an audio track with different encoding parameters. ### Parameters - **encoderConfig**: - type: `String` - required: `true` - default: `speech_standard` - You can choose from the list of values mentioned below for the encoder config. | Encoder Config | Bitrate | Auto Gain | Echo Cancellation | Noise Suppression | | ------------------- | :------: | :-------: | :---------------: | :---------------: | | speech_low_quality | 16 kbps | TRUE | TRUE | TRUE | | speech_standard | 24 kbps | TRUE | TRUE | TRUE | | music_standard | 32 kbps | FALSE | FALSE | FALSE | | standard_stereo | 64 kbps | FALSE | FALSE | FALSE | | high_quality | 128 kbps | FALSE | FALSE | FALSE | | high_quality_stereo | 192 kbps | FALSE | FALSE | FALSE | - **context** - type: `Context` - required: `true` - Pass the Android Context for this parameter.
#### Returns - `CustomStreamTrack` ### Example ```js val audioCustomTrack: CustomStreamTrack = VideoSDK.createAudioTrack("speech_standard", this) ``` ```js CustomStreamTrack audioCustomTrack = VideoSDK.createAudioTrack("speech_standard", this); ``` ## Custom Screen Share Track - Android - You can create a screen share track using the `createScreenShareVideoTrack()` method of `VideoSDK`. - This method can be used to create a screen share track with different encoding parameters. ### Parameters - **encoderConfig**: - type: `String` - required: `true` - default: `h720p_15fps` - You can choose from the list of values mentioned below for the encoder config. | Encoder Config | Resolution | Frame Rate | Bitrate | | -------------- | :--------: | :--------: | :----------: | | h360p_30fps | 640x360 | 3 fps | 200000 kbps | | h720p_5fps | 1280x720 | 5 fps | 400000 kbps | | h720p_15fps | 1280x720 | 15 fps | 1000000 kbps | | h1080p_15fps | 1920x1080 | 15 fps | 1500000 kbps | | h1080p_30fps | 1920x1080 | 15 fps | 1000000 kbps | :::note The above encoder configurations are valid for both landscape and portrait mode. ::: - **data** - type: `Intent` - required: `true` - It is the Intent received from `onActivityResult` when the user grants permission for screen share. - **context** - type: `Context` - required: `true` - Pass the Android Context for this parameter. - **listener** - type: `CustomTrackListener` - required: `true` - A callback to this listener is made, with the custom track as a parameter, when the track is ready. ### Example ```javascript // data is received from onActivityResult method. VideoSDK.createScreenShareVideoTrack("h720p_15fps", data, this) { track -> meeting!!.enableScreenShare(track) } ``` ```javascript // data is received from onActivityResult method. VideoSDK.createScreenShareVideoTrack("h720p_15fps", data, this, (track)->{ meeting.enableScreenShare(track); }); ``` --- --- sidebar_position: 2 sidebar_label: Meeting Error Codes pagination_label: Meeting Error Codes title: Meeting Error Codes --- # Meeting Error Codes - Android If you encounter any of the errors listed below, refer to the [Developer Experience Guide](../../guide/best-practices/developer-experience.md#listen-for-error-events), which offers recommended solutions based on common error categories. import ServerErrorCodes from '../../../mdx/\_server-error-codes.mdx' import SDKErrorCodes from '../../data/\_sdk-error-codes.mdx' --- --- sidebar_position: 2 sidebar_label: Initializing a Meeting pagination_label: Initializing a Meeting title: Initializing a Meeting --- # Initializing a Meeting - Android
## initialize() To initialize the meeting, you first have to initialize the `VideoSDK`. You can do this using the `initialize()` method provided by the SDK. #### Parameters - **context**: Context #### Returns - _`void`_ ```js title="initialize" VideoSDK.initialize(Context context) ``` --- ## config() Next, you have to set the `token` property of the `VideoSDK` class using the `config()` method. Please refer to this [documentation](/api-reference/realtime-communication/intro/) to generate a token. #### Parameters - **token**: String #### Returns - _`void`_ ```js title="config" VideoSDK.config(String token) ``` --- ## initMeeting() - Now, you can initialize the meeting using a factory method provided by the SDK called `initMeeting()`. - `initMeeting()` will create a new [`Meeting`](./meeting-class/introduction.md) instance and return it. ```js title="initMeeting" VideoSDK.initMeeting( Context context, String meetingId, String name, boolean micEnabled, boolean webcamEnabled, String participantId, String mode, boolean multiStream, Map customTracks, JSONObject metaData, String signalingBaseUrl, PreferredProtocol preferredProtocol ) ``` ## Parameters ### context - Context of the activity. - type : Context - `REQUIRED` ### meetingId - Unique id of the meeting that the participant will be joining. - type : `String` - `REQUIRED` Please refer to this [documentation](/api-reference/realtime-communication/create-room) to create a room. ### name - Name of the participant who will be joining the meeting; this name will be displayed to other participants in the same meeting. - type : String - `REQUIRED` ### micEnabled - Whether the `mic` of the participant will be on while joining the meeting. If it is set to `false`, then the mic of that participant will be `disabled` by default, but can be `enabled` or `disabled` later. - type: `Boolean` - `REQUIRED` ### webcamEnabled - Whether the `webcam` of the participant will be on while joining the meeting. If it is set to `false`, then the webcam of that participant will be `disabled` by default, but can be `enabled` or `disabled` later. - type: `Boolean` - `REQUIRED` ### participantId - Unique id of the participant. If you pass `null`, the SDK will create an id by itself and use that id. - type : `String` or `null` - `REQUIRED` ### mode There are three modes available: - **`SEND_AND_RECV`**: In this mode, both audio and video streams will be produced and consumed. - **`SIGNALLING_ONLY`**: In this mode, no audio or video streams will be produced or consumed. It is used solely for signaling. - **`RECV_ONLY`**: This mode allows only the consumption of audio and video streams without producing any. **Type**: `String` or `null` **Default Value**: `SEND_AND_RECV` import CautionMessage from '@site/src/theme/CautionMessage'; ### multiStream - It specifies whether the stream should send multiple resolution layers or a single resolution layer. - type: `boolean` - `REQUIRED` ### customTracks - If you want to use custom tracks from the start of the meeting, you can pass a map of custom tracks in this parameter. - type : `Map` or `null` - `REQUIRED` Please refer to this [documentation](../../guide/video-and-audio-calling-api-sdk/features/custom-track/custom-video-track) to know more about custom tracks. ### metaData - If you want to provide additional details about a user joining a meeting, such as their profile image, you can pass that information in this parameter.
- type: `JSONObject` - `REQUIRED` ### signalingBaseUrl - If you want to use a proxy server with the VideoSDK, you can specify your base URL here. - type: `String` - `OPTIONAL` :::note If you intend to use a proxy server with the VideoSDK, please inform us in advance at support@videosdk.live. ::: ### preferredProtocol - If you want to provide a preferred network protocol for communication, you can specify that in `PreferredProtocol`, with options including `UDP_ONLY`, `UDP_OVER_TCP`, and `TCP_ONLY`. - type: `PreferredProtocol` - `OPTIONAL` ## Returns ### meeting - After initializing the meeting, `initMeeting()` will return a new [`Meeting`](./meeting-class/introduction.md) instance. --- ## Example import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; ```js title="initMeeting" VideoSDK.initialize(applicationContext) // Configure the token VideoSDK.config(token) // pass the token generated from VideoSDK Dashboard // Initialize the meeting; arguments follow the initMeeting() signature shown above var meeting = VideoSDK.initMeeting( this@MainActivity, "abc-1234-xyz", "John Doe", true, true, null, null, false, null, null, null, null ) ``` ```js title="initMeeting" VideoSDK.initialize(getApplicationContext()); // Configure the token VideoSDK.config(token); // pass the token generated from VideoSDK Dashboard // Initialize the meeting; arguments follow the initMeeting() signature shown above Meeting meeting = VideoSDK.initMeeting( MainActivity.this, "abc-1234-xyz", "John Doe", true, true, null, null, false, null, null, null, null ); ```
--- --- title: MediaEffects library hide_title: true hide_table_of_contents: false description: The MediaEffects library enhances video applications by providing advanced media effects, including virtual backgrounds. sidebar_label: MediaEffects library pagination_label: MediaEffects library keywords: - Apply Virtual Background - Remove Virtual Background - Change Virtual Background image: img/videosdklive-thumbnail.jpg sidebar_position: 1 --- # MediaEffects library - Android
## Introduction - The `MediaEffects` library enhances video applications with advanced media effects, including virtual backgrounds. It supports real-time processing and is optimized for Android devices. - The `MediaEffects` library offers three classes to customize your video background: using a custom image, applying a blur effect, or choosing a solid color. :::info The Virtual Background feature in VideoSDK can be utilized regardless of the meeting environment, including the pre-call screen. ::: ## 1. BackgroundImageProcessor - `BackgroundImageProcessor` sets a specified image as the background in a video stream, allowing you to customize the visual appearance of the video. - `BackgroundImageProcessor` class provides following method. - `setBackgroundSource()` method updates the virtual background by setting a new image as the background that the user wants to switch to. - **Parameters**: `Uri`: An image URI for the background image. - **Return Type**: `void` import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; ```js val uri = Uri.parse("https://st.depositphotos.com/2605379/52364/i/450/depositphotos_523648932-stock-photo-concrete-rooftop-night-city-view.jpg") val backgroundImageProcessor = BackgroundImageProcessor(uri) // Sets the background image val newUri = Uri.parse("https://img.freepik.com/free-photo/plant-against-blue-wall-mockup_53876-96052.jpg?size=626&ext=jpg&ga=GA1.1.2008272138.1723420800&semt=ais_hybrid") backgroundImageProcessor.setBackgroundSource(newUri) // Changed background image ``` ```java Uri uri = Uri.parse("https://st.depositphotos.com/2605379/52364/i/450/depositphotos_523648932-stock-photo-concrete-rooftop-night-city-view.jpg"); BackgroundImageProcessor backgroundImageProcessor= new BackgroundImageProcessor(uri); // Sets the background image Uri uri = Uri.parse("https://img.freepik.com/free-photo/plant-against-blue-wall-mockup_53876-96052.jpg?size=626&ext=jpg&ga=GA1.1.2008272138.1723420800&semt=ais_hybrid"); backgroundImageProcessor.setBackgroundSource(uri); //Changed background image ``` ## 2. BackgroundBlurProcessor - `BackgroundBlurProcessor` applies a blur effect to the video background, with the intensity controlled by a float value, creating a softened visual effect. - `BackgroundBlurProcessor` class provides following method. - `setBlurRadius()` method adjusts the blur effect on the video background, with the blur strength controlled by the specified float value. - **Parameters**: `Float`: representing the blur strength; higher values mean stronger blur. The supported range is 0 to 25. - **Return Type**: `void` ```js val backgroundBlurProcessor = BackgroundBlurProcessor(25, this) // Applies a blur with intensity 25 backgroundBlurProcessor.setBlurRadius(17) // Changes the blur intensity to 17 ``` ```java BackgroundBlurProcessor backgroundBlurProcessor = new BackgroundBlurProcessor(25, this);// Applies a blur with intensity 25 backgroundBlurProcessor.setBlurRadius(17); // changes the blur intensity to 17 ``` ## 3. BackgroundColorProcessor - `BackgroundColorProcessor` sets a solid color as the video background using a `Color` object, enabling you to create a uniform color backdrop for your video. - `BackgroundColorProcessor` class provides following method. - `setBackgroundColor()` method sets the color that user wants to switch to, for virtual background effect. - **Parameters**: `Integer`: Specifies the color for the virtual background. 
- **Return Type**: `void` ```js val backgroundColorProcessor = BackgroundColorProcessor(Color.BLUE) // Sets the background color to blue backgroundColorProcessor.setBackgroundColor(Color.CYAN) // Changes the background color to CYAN ``` ```java BackgroundColorProcessor backgroundColorProcessor = new BackgroundColorProcessor(Color.BLUE);// Sets the background color to blue backgroundColorProcessor.setBackgroundColor(Color.CYAN); // changed the background color to CYAN ```
--- --- sidebar_position: 1 sidebar_label: Introduction pagination_label: Intro to Video SDK Meeting Class title: Video SDK Meeting Class --- # Video SDK Meeting Class - Android
## Introduction The `Meeting` class includes properties, methods, and a meeting event listener class for managing a meeting, its participants, video, audio and share streams, messaging, and UI customization. import LinksGrid from "../../../../src/theme/LinksGrid"; import properties from "./../data/meeting-class/properties.json"; import methods from "./../data/meeting-class/methods.json"; import events from "./../data/meeting-class/events.json"; ## Meeting Properties
- [getmeetingId()](/android/api/sdk-reference/meeting-class/properties#getmeetingid)
- [getLocalParticipant()](./properties#getlocalparticipant)
- [getParticipants()](./properties#getparticipants)
- [pubSub](./properties#pubsub)
## Meeting Methods
- [join()](./methods#join)
- [leave()](./methods#leave)
- [end()](./methods#end)
- [enableWebcam()](./methods#enablewebcam)
- [disableWebcam()](./methods#disablewebcam)
- [unmuteMic()](./methods#unmutemic)
- [muteMic()](./methods#mutemic)
- [enableScreenShare()](./methods#enablescreenshare)
- [disableScreenShare()](./methods#disablescreenshare)
- [startRecording()](./methods#startrecording)
- [stopRecording()](./methods#stoprecording)
- [startLiveStream()](./methods#startlivestream)
- [stopLiveStream()](./methods#stoplivestream)
- [startHls()](./methods#starthls)
- [stopHls()](./methods#stophls)
- [startTranscription()](./methods#starttranscription)
- [stopTranscription()](./methods#stoptranscription)
- [changeMode()](./methods#changemode)
- [getMics()](./methods#getmics)
- [changeMic()](./methods#changemic)
- [setAudioDeviceChangeListener()](./methods#setaudiodevicechangelistener)
- [changeWebcam()](./methods#changewebcam)
- [uploadBase64File()](./methods#uploadbase64file)
- [fetchBase64File()](./methods#fetchbase64file)
- [addEventListener()](./methods#addeventlistener)
- [removeEventListener()](./methods#removeeventlistener)
- [removeAllListeners()](./methods#removealllisteners)
- [startWhiteboard()](./methods#startwhiteboard)
- [stopWhiteboard()](./methods#stopwhiteboard)
## Meeting Events
- [onMeetingJoined](./meeting-event-listener-class#onmeetingjoined)
- [onMeetingLeft](./meeting-event-listener-class#onmeetingleft)
- [onParticipantJoined](./meeting-event-listener-class#onparticipantjoined)
- [onParticipantLeft](./meeting-event-listener-class#onparticipantleft)
- [onSpeakerChanged](./meeting-event-listener-class#onspeakerchanged)
- [onPresenterChanged](./meeting-event-listener-class#onpresenterchanged)
- [onEntryRequested](./meeting-event-listener-class#onentryrequested)
- [onEntryResponded](./meeting-event-listener-class#onentryresponded)
- [onWebcamRequested](./meeting-event-listener-class#onwebcamrequested)
- [onMicRequested](./meeting-event-listener-class#onmicrequested)
- [onRecordingStateChanged](./meeting-event-listener-class#onrecordingstatechanged)
- [onRecordingStarted](./meeting-event-listener-class#onrecordingstarted)
- [onRecordingStopped](./meeting-event-listener-class#onrecordingstopped)
- [onLivestreamStateChanged](./meeting-event-listener-class#onlivestreamstatechanged)
- [onLivestreamStarted](./meeting-event-listener-class#onlivestreamstarted)
- [onLivestreamStopped](./meeting-event-listener-class#onlivestreamstopped)
- [onHlsStateChanged](./meeting-event-listener-class#onhlsstatechanged)
- [onTranscriptionStateChanged](./meeting-event-listener-class#ontranscriptionstatechanged)
- [onTranscriptionText](./meeting-event-listener-class#ontranscriptiontext)
- [onExternalCallStarted](./meeting-event-listener-class#onexternalcallstarted)
- [onMeetingStateChanged](./meeting-event-listener-class#onmeetingstatechanged)
- [onParticipantModeChanged](./meeting-event-listener-class#onparticipantmodechanged)
- [onPinStateChanged()](./meeting-event-listener-class#onpinstatechanged)
- [onWhiteboardStarted()](./meeting-event-listener-class#onwhiteboardstarted)
- [onWhiteboardStopped()](./meeting-event-listener-class#onwhiteboardstopped)
--- --- sidebar_position: 1 sidebar_label: MeetingEventListener Class pagination_label: MeetingEventListener Class title: MeetingEventListener Class --- # MeetingEventListener Class - Android
--- ### Implementation - You can implement all the methods of the abstract `MeetingEventListener` class and add the listener to the `Meeting` class using the `addEventListener()` method of the `Meeting` class. #### Example import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; ```javascript private val meetingEventListener: MeetingEventListener = object : MeetingEventListener() { override fun onMeetingJoined() { Log.d("#meeting", "onMeetingJoined()") } } ``` ```javascript private final MeetingEventListener meetingEventListener = new MeetingEventListener() { @Override public void onMeetingJoined() { Log.d("#meeting", "onMeetingJoined()"); } } ``` --- ### onMeetingJoined() - This event will be emitted when a [localParticipant](./properties#getlocalparticipant) successfully joins the meeting. #### Example ```javascript override fun onMeetingJoined() { Log.d("#meeting", "onMeetingJoined()") } ``` ```javascript @Override public void onMeetingJoined() { Log.d("#meeting", "onMeetingJoined()"); } ``` --- ### onMeetingLeft() - This event will be emitted when a [localParticipant](./properties#getlocalparticipant) leaves the meeting. #### Example ```javascript override fun onMeetingLeft() { Log.d("#meeting", "onMeetingLeft()") } ``` ```javascript @Override public void onMeetingLeft() { Log.d("#meeting", "onMeetingLeft()"); } ``` --- ### onParticipantJoined() - This event will be emitted when a new [participant](../participant-class/introduction) joins the meeting. #### Event callback parameters - **participant**: [Participant](../participant-class/introduction) #### Example ```javascript override fun onParticipantJoined(participant: Participant) { Log.d("#meeting", participant.displayName + " joined") } ``` ```javascript @Override public void onParticipantJoined(Participant participant) { Log.d("#meeting", participant.getDisplayName() + " joined"); } ``` --- ### onParticipantLeft() - This event will be emitted when a joined [participant](../participant-class/introduction) leaves the meeting. #### Event callback parameters - **participant**: [Participant](../participant-class/introduction) #### Example ```javascript override fun onParticipantLeft(participant: Participant) { Log.d("#meeting", participant.displayName + " left") } ``` ```javascript @Override public void onParticipantLeft(Participant participant) { Log.d("#meeting", participant.getDisplayName() + " left"); } ``` --- ### onSpeakerChanged() - This event will be emitted when the active speaker changes. - Use this event if you want to know which participant is actively speaking. - If no participant is actively speaking, this event will pass `null` as the event callback parameter. #### Event callback parameters - **participantId**: String #### Example ```javascript override fun onSpeakerChanged(participantId: String?) { // } ``` ```javascript @Override public void onSpeakerChanged(String participantId) { // } ``` --- ### onPresenterChanged() - This event will be emitted when any [participant](../participant-class/introduction) starts or stops screen sharing. - It will pass `participantId` as an event callback parameter. - If a participant stops screen sharing, this event will pass `null` as the event callback parameter.
#### Event callback parameters - **participantId**: String #### Example ```javascript override fun onPresenterChanged(participantId: String) { // } ``` ```javascript @Override public void onPresenterChanged(String participantId) { // } ``` --- ### onEntryRequested() - This event will be emitted when a new [participant](../participant-class/introduction) who is trying to join the meeting, is having permission **`ask_join`** in token. - This event will only be emitted to the [participants](./properties#getparticipants) in the meeting, who is having the permission **`allow_join`** in token. - This event will pass following parameters as an event parameters, `participantId` and `name` of the new participant who is trying to join the meeting, `allow()` and `deny()` to take required actions. #### Event callback parameters - **peerId**: String - **name**: String #### Example ```javascript override fun onEntryRequested(id: String?, name: String?) { // } ``` ```javascript @Override public void onEntryRequested(String id, String name) { // } ``` --- ### onEntryResponded() - This event will be emitted when the `join()` request is responded. - This event will be emitted to the [participants](./properties#getparticipants) in the meeting, who is having the permission **`allow_join`** in token. - This event will be also emitted to the [participant](../participant-class/introduction) who requested to join the meeting. #### Event callback parameters - **participantId**: _String_ - **decision**: _"allowed"_ | _"denied"_ #### Example ```javascript override fun onEntryResponded(id: String?, decision: String?) { // } ``` ```javascript @Override public void onEntryResponded(String id, String decision) { // } ``` --- ### onWebcamRequested() - This event will be emitted to the participant `B` when any other participant `A` requests to enable webcam of participant `B`. - On accepting the request, webcam of participant `B` will be enabled. #### Event callback parameters - **participantId**: String - **listener**: WebcamRequestListener \{ **accept**: Method; **reject**: Method } #### Example ```javascript override fun onWebcamRequested(participantId: String, listener: WebcamRequestListener) { // if accept request listener.accept() // if reject request listener.reject() } ``` ```javascript @Override public void onWebcamRequested(String participantId, WebcamRequestListener listener) { // if accept request listener.accept(); // if reject request listener.reject(); } ``` ### onMicRequested() - This event will be emitted to the participant `B` when any other participant `A` requests to enable mic of participant `B`. - On accepting the request, mic of participant `B` will be enabled. #### Event callback parameters - **participantId**: String - **listener**: MicRequestListener \{ **accept**: Method; **reject**: Method } #### Example ```javascript override fun onMicRequested(participantId: String, listener: MicRequestListener) { // if accept request listener.accept() // if reject request listener.reject() } ``` ```javascript @Override public void onMicRequested(String participantId, MicRequestListener listener) { // if accept request listener.accept(); // if reject request listener.reject(); } ``` --- ### onRecordingStateChanged() - This event will be emitted when the meeting's recording status changed. #### Event callback parameters - **recordingState**: String `recordingState` has following values - `RECORDING_STARTING` - Recording is in starting phase and hasn't started yet. 
- `RECORDING_STARTED` - Recording has started successfully. - `RECORDING_STOPPING` - Recording is in stopping phase and hasn't stopped yet. - `RECORDING_STOPPED` - Recording has stopped successfully. #### Example ```javascript override fun onRecordingStateChanged(recordingState: String) { when (recordingState) { "RECORDING_STARTING" -> { Log.d("onRecordingStateChanged", "Meeting recording is starting") } "RECORDING_STARTED" -> { Log.d("onRecordingStateChanged", "Meeting recording is started") } "RECORDING_STOPPING" -> { Log.d("onRecordingStateChanged", "Meeting recording is stopping") } "RECORDING_STOPPED" -> { Log.d("onRecordingStateChanged", "Meeting recording is stopped") } } } ``` ```javascript @Override public void onRecordingStateChanged(String recordingState) { switch (recordingState) { case "RECORDING_STARTING": Log.d("onRecordingStateChanged", "Meeting recording is starting"); break; case "RECORDING_STARTED": Log.d("onRecordingStateChanged", "Meeting recording is started"); break; case "RECORDING_STOPPING": Log.d("onRecordingStateChanged", "Meeting recording is stopping"); break; case "RECORDING_STOPPED": Log.d("onRecordingStateChanged", "Meeting recording is stopped"); break; } } ``` --- ### onRecordingStarted() _`This event will be deprecated soon`_ - This event will be emitted when recording of the meeting is started. #### Example ```javascript override fun onRecordingStarted() { // } ``` ```javascript @Override public void onRecordingStarted() { // } ``` --- ### onRecordingStopped() _`This event will be deprecated soon`_ - This event will be emitted when recording of the meeting is stopped. #### Example ```javascript override fun onRecordingStopped() { // } ``` ```javascript @Override public void onRecordingStopped() { // } ``` --- ### onLivestreamStateChanged() - This event will be emitted when the meeting's livestream status changed. #### Event callback parameters - **livestreamState**: String `livestreamState` has following values - `LIVESTREAM_STARTING` - Livestream is in starting phase and hasn't started yet. - `LIVESTREAM_STARTED` - Livestream has started successfully. - `LIVESTREAM_STOPPING` - Livestream is in stopping phase and hasn't stopped yet. - `LIVESTREAM_STOPPED` - Livestream has stopped successfully. #### Example ```javascript override fun onLivestreamStateChanged(livestreamState: String?) { when (livestreamState) { "LIVESTREAM_STARTING" -> Log.d( "LivestreamStateChanged", "Meeting livestream is starting" ) "LIVESTREAM_STARTED" -> Log.d( "LivestreamStateChanged", "Meeting livestream is started" ) "LIVESTREAM_STOPPING" -> Log.d("LivestreamStateChanged", "Meeting livestream is stopping" ) "LIVESTREAM_STOPPED" -> Log.d("LivestreamStateChanged", "Meeting livestream is stopped" ) } } ``` ```javascript @Override public void onLivestreamStateChanged(String livestreamState) { switch (livestreamState) { case "LIVESTREAM_STARTING": Log.d("LivestreamStateChanged", "Meeting livestream is starting"); break; case "LIVESTREAM_STARTED": Log.d("LivestreamStateChanged", "Meeting livestream is started"); break; case "LIVESTREAM_STOPPING": Log.d("LivestreamStateChanged", "Meeting livestream is stopping"); break; case "LIVESTREAM_STOPPED": Log.d("LivestreamStateChanged", "Meeting livestream is stopped"); break; } } ``` --- ### onLivestreamStarted() _`This event will be deprecated soon`_ - This event will be emitted when `RTMP` live stream of the meeting is started. 
#### Example ```javascript override fun onLivestreamStarted() { // } ``` ```javascript @Override public void onLivestreamStarted() { // } ``` --- ### onLivestreamStopped() _`This event will be deprecated soon`_ - This event will be emitted when `RTMP` live stream of the meeting is stopped. #### Example ```javascript override fun onLivestreamStopped() { // } ``` ```javascript @Override public void onLivestreamStopped() { // } ``` --- ### onHlsStateChanged() - This event will be emitted when the meeting's HLS(Http Livestreaming) status changed. #### Event callback parameters - **HlsState**: \{ **status**: String} - `status` has following values : - `HLS_STARTING` - HLS is in starting phase and hasn't started yet. - `HLS_STARTED` - HLS has started successfully. - `HLS_PLAYABLE` - HLS can be playable now. - `HLS_STOPPING` - HLS is in stopping phase and hasn't stopped yet. - `HLS_STOPPED` - HLS has stopped successfully. - when you receive `HLS_PLAYABLE` status you will receive 2 urls in response - `playbackHlsUrl` - Live HLS with playback support - `livestreamUrl` - Live HLS without playback support :::note `downstreamUrl` is now depecated. Use `playbackHlsUrl` or `livestreamUrl` in place of `downstreamUrl` ::: #### Example ```javascript override fun onHlsStateChanged(HlsState: JSONObject) { when (HlsState.getString("status")) { "HLS_STARTING" -> Log.d("onHlsStateChanged", "Meeting hls is starting") "HLS_STARTED" -> Log.d("onHlsStateChanged", "Meeting hls is started") "HLS_PLAYABLE" -> { Log.d("onHlsStateChanged", "Meeting hls is playable now") // on hls playable you will receive playbackHlsUrl and livestreamUrl val playbackHlsUrl = HlsState.getString("playbackHlsUrl") val livestreamUrl = HlsState.getString("livestreamUrl") } "HLS_STOPPING" -> Log.d("onHlsStateChanged", "Meeting hls is stopping") "HLS_STOPPED" -> Log.d("onHlsStateChanged", "Meeting hls is stopped") } } ``` ```javascript @Override public void onHlsStateChanged(JSONObject HlsState) { switch (HlsState.getString("status")) { case "HLS_STARTING": Log.d("onHlsStateChanged", "Meeting hls is starting"); break; case "HLS_STARTED": Log.d("onHlsStateChanged", "Meeting hls is started"); break; case "HLS_PLAYABLE": Log.d("onHlsStateChanged", "Meeting hls is playable now"); // on hls started you will receive playbackHlsUrl and livestreamUrl String playbackHlsUrl = HlsState.getString("playbackHlsUrl"); String livestreamUrl = HlsState.getString("livestreamUrl"); break; case "HLS_STOPPING": Log.d("onHlsStateChanged", "Meeting hls is stopping"); break; case "HLS_STOPPED": Log.d("onHlsStateChanged", "Meeting hls is stopped"); break; } } ``` --- ### onTranscriptionStateChanged() - This event will be triggered whenever state of realtime transcription is changed. #### Event callback parameters - **data**: \{ **status**: String, **id**: String } - **status**: String - **id**: String `status` has following values - `TRANSCRIPTION_STARTING` - Realtime Transcription is in starting phase and hasn't started yet. - `TRANSCRIPTION_STARTED` - Realtime Transcription has started successfully. - `TRANSCRIPTION_STOPPING` - Realtime Transcription is in stopping phase and hasn't stopped yet. - `TRANSCRIPTION_STOPPED` - Realtime Transcription has stopped successfully. 
#### Example ```javascript override fun onTranscriptionStateChanged(data: JSONObject) { //Status can be :: TRANSCRIPTION_STARTING //Status can be :: TRANSCRIPTION_STARTED //Status can be :: TRANSCRIPTION_STOPPING //Status can be :: TRANSCRIPTION_STOPPED val status = data.getString("status") Log.d("MeetingActivity", "Transcription status: $status") } ``` ```javascript @Override public void onTranscriptionStateChanged(JSONObject data) { //Status can be :: TRANSCRIPTION_STARTING //Status can be :: TRANSCRIPTION_STARTED //Status can be :: TRANSCRIPTION_STOPPING //Status can be :: TRANSCRIPTION_STOPPED String status = data.getString("status"); Log.d("MeetingActivity", "Transcription status: " + status); } ``` --- ### onTranscriptionText() - This event will be emitted when text for running realtime transcription received. #### Event callback parameters - **data**: TranscriptionText - **TranscriptionText.participantId**: String - **TranscriptionText.participantName**: String - **TranscriptionText.text**: String - **TranscriptionText.timestamp**: int - **TranscriptionText.type**: String #### Example ```javascript override fun onTranscriptionText(data: TranscriptionText) { val participantId = data.participantId val participantName = data.participantName val text = data.text val timestamp = data.timestamp val type = data.type Log.d("MeetingActivity", "$participantName: $text $timestamp") } ``` ```javascript @Override public void onTranscriptionText(TranscriptionText data) { String participantId = data.getParticipantId(); String participantName = data.getParticipantName(); String text = data.getText(); int timestamp = data.getTimestamp(); String type = data.getType(); Log.d("MeetingActivity", participantName + ": " + text + " " + timestamp); } ``` --- ### onWhiteboardStarted() - This event will be triggered when the whiteboard is successfully started. #### Event callback parameters **url**: String #### Example ```javascript override fun onWhiteboardStarted(url: String) { super.onWhiteboardStarted(url) //... } ``` ```java @Override public void onWhiteboardStarted(String url) { super.onWhiteboardStarted(url); //... } ``` --- ### onWhiteboardStopped() - This event will be triggered when the whiteboard session is successfully terminated. #### Example ```javascript override fun onWhiteboardStopped() { super.onWhiteboardStopped() //... } ``` ```java @Override public void onWhiteboardStopped() { super.onWhiteboardStopped(); //... } ``` --- ### onExternalCallStarted() - This event will be emitted when local particpant receive incoming call. #### Example ```javascript override fun onExternalCallStarted() { // } ``` ```javascript @Override public void onExternalCallStarted() { // } ``` --- ### onMeetingStateChanged() - This event will be emitted when state of meeting changes. - It will pass **`state`** as an event callback parameter which will indicate current state of the meeting. - All available states are `CONNECTING`, `CONNECTED`, `FAILED`, `DISCONNECTED`, `CLOSING`, `CLOSED`. #### Event callback parameters - **state**: String #### Example ```javascript override fun onMeetingStateChanged(state: String?) 
{ when (state) { "CONNECTING" -> Log.d("onMeetingStateChanged: ", "Meeting is Connecting") "CONNECTED" -> Log.d("onMeetingStateChanged: ", "Meeting is Connected") "FAILED" -> Log.d("onMeetingStateChanged: ", "Meeting connection failed") "DISCONNECTED" -> Log.d("onMeetingStateChanged: ","Meeting connection disconnected abruptly") "CLOSING" -> Log.d("onMeetingStateChanged: ", "Meeting is closing") "CLOSED" -> Log.d("onMeetingStateChanged: ", "Meeting connection closed") } } ``` ```javascript @Override public void onMeetingStateChanged(String state) { switch (state) { case "CONNECTING": Log.d("onMeetingStateChanged: ", "Meeting is Connecting"); break; case "CONNECTED": Log.d("onMeetingStateChanged: ", "Meeting is Connected"); break; case "FAILED": Log.d("onMeetingStateChanged: ", "Meeting connection failed"); break; case "DISCONNECTED": Log.d("onMeetingStateChanged: ", "Meeting connection disconnected abruptly"); break; case "CLOSING": Log.d("onMeetingStateChanged: ", "Meeting is closing"); break; case "CLOSED": Log.d("onMeetingStateChanged: ", "Meeting connection closed"); break; } } ``` --- ### onParticipantModeChanged() This event is triggered when a participant's mode is updated. It passes `data` as an event callback parameter, which includes the following: - **`SEND_AND_RECV`**: Both audio and video streams will be produced and consumed. - **`SIGNALLING_ONLY`**: Audio and video streams will not be produced or consumed. It is used solely for signaling. - **`RECV_ONLY`**: Only audio and video streams will be consumed without producing any. #### Event callback parameters - **data**: `{ mode: String, participantId: String }` - **mode**: `String` - **participantId**: `String` #### Example ```javascript override fun onParticipantModeChanged(data: JSONObject?) { // } ``` ```javascript @Override public void onParticipantModeChanged(JSONObject data) { // } ``` --- ### onPinStateChanged() - This event will be triggered when any participant gets pinned or unpinned by any participant. #### Event callback parameters - **pinStateData**: \{ **peerId**: String, **state**: JSONObject, **pinnedBy**: String } - **peerId**: String - **state**: JSONObject - **pinnedBy**: String #### Example ```javascript override fun onPinStateChanged(pinStateData: JSONObject?) { Log.d("onPinStateChanged: ", pinStateData!!.getString("peerId")) // id of the participant who was pinned Log.d("onPinStateChanged: ", pinStateData.getJSONObject("state").toString()) // { cam: true, share: true } Log.d("onPinStateChanged: ", pinStateData.getString("pinnedBy")) // id of the participant who pinned that participant } ``` ```javascript @Override public void onPinStateChanged(JSONObject pinStateData) { Log.d("onPinStateChanged: ", pinStateData.getString("peerId")); // id of the participant who was pinned Log.d("onPinStateChanged: ", pinStateData.getJSONObject("state").toString()); // { cam: true, share: true } Log.d("onPinStateChanged: ", pinStateData.getString("pinnedBy")); // id of the participant who pinned that participant } ```
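None of the callbacks above fire until the listener has been attached to a `Meeting` instance with `addEventListener()`. Below is a minimal sketch of one typical wiring inside an Activity; it assumes a nullable `meeting` obtained earlier from `initMeeting()` and only overrides a few of the documented callbacks.

```kotlin
// Minimal sketch (assumes `meeting` comes from VideoSDK.initMeeting(...))
private val meetingEventListener = object : MeetingEventListener() {
    override fun onMeetingJoined() {
        Log.d("#meeting", "Local participant joined")
    }

    override fun onParticipantJoined(participant: Participant) {
        Log.d("#meeting", participant.displayName + " joined")
    }

    override fun onRecordingStateChanged(recordingState: String) {
        Log.d("#meeting", "Recording state: $recordingState")
    }
}

override fun onCreate(savedInstanceState: Bundle?) {
    super.onCreate(savedInstanceState)
    // Register the listener before joining so early events are not missed
    meeting!!.addEventListener(meetingEventListener)
    meeting!!.join()
}

override fun onDestroy() {
    // Remove the listener to avoid leaking the Activity
    meeting?.removeEventListener(meetingEventListener)
    super.onDestroy()
}
```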
--- --- sidebar_position: 1 sidebar_label: Methods pagination_label: Meeting Class Methods title: Meeting Class Methods --- # Meeting Class Methods - Android
### join() - It is used to join a meeting. - After meeting initialization by [`initMeeting()`](../initMeeting) it returns a new instance of [Meeting](./introduction). However by default, it will not automatically join the meeting. Hence, to join the meeting you should call `join()`. #### Events associated with `join()`: - Local Participant will receive a [`onMeetingJoined`](./meeting-event-listener-class#onmeetingjoined) event, when successfully joined. - Remote Participant will receive a [`onParticipantJoined`](./meeting-event-listener-class#onparticipantjoined) event with the newly joined [`Participant`](../participant-class/introduction) object from the event callback. #### Participant having `ask_join` permission inside token - If a token contains the permission `ask_join`, then the participant will not join the meeting directly after calling `join()`, but an event will be emitted to the participant having the permission `allow_join` called [`onEntryRequested`](./meeting-event-listener-class#onentryrequested). - After the decision from the remote participant, an event will be emitted to participant called [`onEntryResponded`](./meeting-event-listener-class#onentryresponded). This event will contain the decision made by the remote participant. #### Participant having `allow_join` permission inside token - If a token containing the permission `allow_join`, then the participant will join the meeting directly after calling `join()`. #### Returns - _`void`_ --- ### leave() - It is used to leave the current meeting. #### Events associated with `leave()`: - Local participant will receive a [`onMeetingLeft`](./meeting-event-listener-class#onmeetingleft) event. - All remote participants will receive a [`onParticipantLeft`](./meeting-event-listener-class#onparticipantleft) event with `participantId`. #### Returns - _`void`_ --- ### end() - It is used to end the current running session. - By calling `end()`, all joined [participants](properties#getparticipants) including [localParticipant](./properties.md#getlocalparticipant) of that session will leave the meeting. #### Events associated with `end()`: - All [participants](./properties.md#getparticipants) and [localParticipant](./properties.md#getlocalparticipant), will be emitted [`onMeetingLeft`](./meeting-event-listener-class#onmeetingleft) event. #### Returns - _`void`_ --- ### enableWebcam() - It is used to enable self camera. - [`onStreamEnabled`](../participant-class/participant-event-listener-class#onstreamenabled) event of `ParticipantEventListener` will be emitted with [`stream`](../stream-class/introduction) object from the event callback. #### Returns - _`void`_ --- ### disableWebcam() - It is used to disable self camera. - [`onStreamDisabled`](../participant-class/participant-event-listener-class#onstreamdisabled) event of `ParticipantEventListener` will be emitted with [`stream`](../stream-class/introduction) object from the event callback. #### Returns - _`void`_ --- ### unmuteMic() - It is used to enable self microphone. - [`onStreamEnabled`](../participant-class/participant-event-listener-class#onstreamenabled) event of `ParticipantEventListener` will be emitted with [`stream`](../stream-class/introduction) object from the event callback. #### Returns - _`void`_ --- ### muteMic() - It is used to disable self microphone. 
- [`onStreamDisabled`](../participant-class/participant-event-listener-class#onstreamdisabled) event of `ParticipantEventListener` will be emitted with [`stream`](../stream-class/introduction) object from the event callback. #### Returns - _`void`_ --- ### enableScreenShare() - It is used to enable screen-sharing. - [`onStreamEnabled`](../participant-class/participant-event-listener-class#onstreamenabled) event of `ParticipantEventListener` will be emitted with [`stream`](../stream-class/introduction) object from the event callback. - [`onPresenterChanged()`](./meeting-event-listener-class#onpresenterchanged) event will be triggered for all participants with `participantId`. #### Parameters - **data**: Intent #### Returns - _`void`_ --- ### disableScreenShare() - It is used to disable screen-sharing. - [`onStreamDisabled`](../participant-class/participant-event-listener-class#onstreamdisabled) event of `ParticipantEventListener` will be emitted with [`stream`](../stream-class/introduction) object from the event callback. - [`onPresenterChanged()`](./meeting-event-listener-class#onpresenterchanged) event will be triggered for all participants with `null`. #### Returns - _`void`_ --- ### uploadBase64File() - It is used to upload your file to VideoSDK's temporary storage. - `base64Data`: Convert your file to base64 and pass it here. - `token`: Pass your VideoSDK token. Read more about token [here](/android/guide/video-and-audio-calling-api-sdk/authentication-and-token). - `fileName`: Provide your file name with extension. - `TaskCompletionListener` will handle the result of the upload operation. - When the upload is complete, the `onComplete()` method of `TaskCompletionListener` will provide the corresponding `fileUrl`, which can be used to retrieve the uploaded file. - If an error occurs during the upload process, the `onError()` method of `TaskCompletionListener` will provide the error details. #### Parameters - **base64Data**: String - **token**: String - **fileName**: String - **listener**: TaskCompletionListener #### Returns - _`void`_ #### Example import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; ```js private fun uploadFile() { val base64Data = "" // Convert your file to base64 and pass here val token = "" val fileName = "myImage.jpeg" // Provide name with extension here meeting!!.uploadBase64File( base64Data, token, fileName, object : TaskCompletionListener { override fun onComplete(data: String?) { Log.d("VideoSDK", "Uploaded file url: $data") } override fun onError(error: String?) { Log.d("VideoSDK", "Error in upload file: $error") } } ) } ``` ```js private void uploadFile() { String base64Data = ""; // Convert your file to base64 and pass here String token = ""; String fileName = "myImage.jpeg"; // Provide name with extension here meeting.uploadBase64File(base64Data, token, fileName, new TaskCompletionListener() { @Override public void onComplete(@Nullable String data) { Log.d("VideoSDK", "Uploaded file url: " + data); } @Override public void onError(@Nullable String error) { Log.d("VideoSDK", "Error in upload file: " + error); } }); } ``` --- ### fetchBase64File() - It is used to retrieve your file from VideoSDK's temporary storage. - `url`: Pass the `fileUrl` returned by `uploadBase64File()`. - `token`: Pass your VideoSDK token. Read more about token [here](/android/guide/video-and-audio-calling-api-sdk/authentication-and-token). - `TaskCompletionListener` will handle the result of the fetch operation. 
- When the fetch operation is complete, the `onComplete()` method of `TaskCompletionListener` will provide the file in `base64` format. - If an error occurs during the fetch operation, the `onError()` method of `TaskCompletionListener` will provide the error details. #### Parameters - **url**: String - **token**: String - **listener**: TaskCompletionListener #### Returns - _`void`_ #### Example ```js private fun fetchFile() { val url = "" // Provide fileUrl which is returned by uploadBase64File() val token = "" meeting.fetchBase64File(url, token, object : TaskCompletionListener { override fun onComplete(data: String?) { Log.d("VideoSDK", "Fetched file in base64:$data") } override fun onError(error: String?) { Log.d("VideoSDK", "Error in fetch file: $error") } }) } ``` ```js private void fetchFile() { String url = ""; // Provide fileUrl which is returned by uploadBase64File() String token = ""; meeting.fetchBase64File(url, token, new TaskCompletionListener() { @Override public void onComplete(@Nullable String data) { Log.d("VideoSDK", "Fetched file in base64:" + data); } @Override public void onError(@Nullable String error) { Log.d("VideoSDK", "Error in fetch file: " + error); } }); } ``` --- ### startRecording() - `startRecording` is used to start meeting recording. - `webhookUrl` will be triggered when the recording is completed and stored into server. Read more about webhooks [here](https://en.wikipedia.org/wiki/Webhook). - `awsDirPath` will be the path for the your S3 bucket where you want to store recordings to. To allow us to store recording in your S3 bucket, you will need to fill this form by providing the required values. [VideoSDK AWS S3 Integration](https://zfrmz.in/RVlFLFiturVJ7Q97fr23) - `config: mode` is used to either record video-and-audio both or only audio. And by default it will be video-and-audio. - `config: quality` is only applicable to video-and-audio. - `transcription` This parameter lets you start post transcription for the recording. 
#### Parameters - **webhookUrl**: String - **awsDirPath**: String - **config**: - **layout**: - **type**: _"GRID"_ | _"SPOTLIGHT"_ | _"SIDEBAR"_ - **priority**: _"SPEAKER"_ | _"PIN"_ - **gridSize**: Number _`max 4`_ - **theme**: _"DARK"_ | _"LIGHT"_ | _"DEFAULT"_ - **mode**: _"video-and-audio"_ | _"audio"_ - **quality**: _"low"_ | _"med"_ | _"high"_ - **orientation**: _"landscape"_ | _"portrait"_ - **transcription**: **PostTranscriptionConfig** - **PostTranscriptionConfig.enabled**: boolean - **PostTranscriptionConfig.modelId**: String - **PostTranscriptionConfig.summary**: SummaryConfig - **SummaryConfig.enabled**: boolean - **SummaryConfig.prompt**: String #### Returns - _`void`_ #### Events associated with `startRecording()`: - Every participant will receive a callback on [`onRecordingStateChanged()`](./meeting-event-listener-class#onrecordingstatechanged) #### Example ```js val webhookUrl = "https://webhook.your-api-server.com" var config = JSONObject() var layout = JSONObject() JsonUtils.jsonPut(layout, "type", "SPOTLIGHT") JsonUtils.jsonPut(layout, "priority", "PIN") JsonUtils.jsonPut(layout, "gridSize", 9) JsonUtils.jsonPut(config, "layout", layout) JsonUtils.jsonPut(config, "theme", "DARK") val prompt = "Write summary in sections like Title, Agenda, Speakers, Action Items, Outlines, Notes and Summary" val summaryConfig = SummaryConfig(true, prompt) val modelId = "raman_v1" val transcription = PostTranscriptionConfig(true, summaryConfig, modelId) meeting!!.startRecording(webhookUrl,null,config,transcription) ``` ```js String webhookUrl = "https://webhook.your-api-server.com"; JSONObject config = new JSONObject(); JSONObject layout = new JSONObject(); JsonUtils.jsonPut(layout, "type", "SPOTLIGHT"); JsonUtils.jsonPut(layout, "priority", "PIN"); JsonUtils.jsonPut(layout, "gridSize", 9); JsonUtils.jsonPut(config, "layout", layout); JsonUtils.jsonPut(config, "theme", "DARK"); String prompt = "Write summary in sections like Title, Agenda, Speakers, Action Items, Outlines, Notes and Summary"; SummaryConfig summaryConfig = new SummaryConfig(true, prompt); String modelId = "raman_v1"; PostTranscriptionConfig transcription = new PostTranscriptionConfig(true, summaryConfig, modelId); meeting.startRecording(webhookUrl,null,config,transcription); ``` --- ### stopRecording() - It is used to stop meeting recording. #### Returns - _`void`_ #### Events associated with `stopRecording()`: - Every participant will receive a callback on [`onRecordingStateChanged()`](./meeting-event-listener-class#onrecordingstatechanged) #### Example ```javascript meeting!!.stopRecording() ``` ```javascript meeting.stopRecording(); ``` --- ### startLivestream() - `startLiveStream()` is used to start meeting livestreaming. - You will be able to start live stream meetings to other platforms such as Youtube, Facebook, etc. that support `RTMP` streaming. 
#### Parameters - **outputs**: `List<LivestreamOutput>` - **config**: - **layout**: - **type**: _"GRID"_ | _"SPOTLIGHT"_ | _"SIDEBAR"_ - **priority**: _"SPEAKER"_ | _"PIN"_ - **gridSize**: Number _`max 25`_ - **theme**: _"DARK"_ | _"LIGHT"_ | _"DEFAULT"_ #### Returns - _`void`_ #### Events associated with `startLivestream()`: - Every participant will receive a callback on [`onLivestreamStateChanged()`](./meeting-event-listener-class#onlivestreamstatechanged) #### Example ```javascript val YOUTUBE_RTMP_URL = "rtmp://a.rtmp.youtube.com/live2" val YOUTUBE_RTMP_STREAM_KEY = "" val outputs: MutableList<LivestreamOutput> = ArrayList() outputs.add(LivestreamOutput(YOUTUBE_RTMP_URL, YOUTUBE_RTMP_STREAM_KEY)) var config = JSONObject() var layout = JSONObject() JsonUtils.jsonPut(layout, "type", "SPOTLIGHT") JsonUtils.jsonPut(layout, "priority", "PIN") JsonUtils.jsonPut(layout, "gridSize", 9) JsonUtils.jsonPut(config, "layout", layout) JsonUtils.jsonPut(config, "theme", "DARK") meeting!!.startLivestream(outputs, config) ``` ```javascript final String YOUTUBE_RTMP_URL = "rtmp://a.rtmp.youtube.com/live2"; final String YOUTUBE_RTMP_STREAM_KEY = ""; List<LivestreamOutput> outputs = new ArrayList<>(); outputs.add(new LivestreamOutput(YOUTUBE_RTMP_URL, YOUTUBE_RTMP_STREAM_KEY)); JSONObject config = new JSONObject(); JSONObject layout = new JSONObject(); JsonUtils.jsonPut(layout, "type", "SPOTLIGHT"); JsonUtils.jsonPut(layout, "priority", "PIN"); JsonUtils.jsonPut(layout, "gridSize", 9); JsonUtils.jsonPut(config, "layout", layout); JsonUtils.jsonPut(config, "theme", "DARK"); meeting.startLivestream(outputs, config); ``` --- ### stopLivestream() - It is used to stop meeting livestreaming. #### Returns - _`void`_ #### Events associated with `stopLivestream()`: - Every participant will receive a callback on [`onLivestreamStateChanged()`](./meeting-event-listener-class#onlivestreamstatechanged) #### Example ```javascript meeting!!.stopLivestream() ``` ```javascript meeting.stopLivestream(); ``` --- ### startHls() - `startHls()` will start HLS streaming of your meeting. - You will be able to start HLS and watch the live stream of the meeting over HLS. - `mode` is used to start HLS streaming of both video-and-audio, or audio only. By default it will be video-and-audio. - `quality` is only applicable to video-and-audio. - `transcription`: This parameter lets you start post transcription for the recording. 
#### Parameters - **config**: - **layout**: - **type**: _"GRID"_ | _"SPOTLIGHT"_ | _"SIDEBAR"_ - **priority**: _"SPEAKER"_ | _"PIN"_ - **gridSize**: Number _`max 25`_ - **theme**: _"DARK"_ | _"LIGHT"_ | _"DEFAULT"_ - **mode**: _"video-and-audio"_ | _"audio"_ - **quality**: _"low"_ | _"med"_ | _"high"_ - **transcription**: **PostTranscriptionConfig** - **PostTranscriptionConfig.enabled**: boolean - **PostTranscriptionConfig.modelId**: String - **PostTranscriptionConfig.summary**: SummaryConfig - **SummaryConfig.enabled**: boolean - **SummaryConfig.prompt**: String #### Returns - _`void`_ #### Events associated with `startHls()`: - Every participant will receive a callback on [`onHlsStateChanged()`](./meeting-event-listener-class#onhlsstatechanged) #### Example ```javascript var config = JSONObject() var layout = JSONObject() JsonUtils.jsonPut(layout, "type", "SPOTLIGHT") JsonUtils.jsonPut(layout, "priority", "PIN") JsonUtils.jsonPut(layout, "gridSize", 9) JsonUtils.jsonPut(config, "layout", layout) JsonUtils.jsonPut(config, "orientation", "portrait") JsonUtils.jsonPut(config, "theme", "DARK") val prompt = "Write summary in sections like Title, Agenda, Speakers, Action Items, Outlines, Notes and Summary" val summaryConfig = SummaryConfig(true, prompt) val modelId = "raman_v1" val transcription = PostTranscriptionConfig(true, summaryConfig,modelId) meeting!!.startHls(config, transcription) ``` ```javascript JSONObject config = new JSONObject(); JSONObject layout = new JSONObject(); JsonUtils.jsonPut(layout, "type", "SPOTLIGHT"); JsonUtils.jsonPut(layout, "priority", "PIN"); JsonUtils.jsonPut(layout, "gridSize", 9); JsonUtils.jsonPut(config, "layout", layout); JsonUtils.jsonPut(config, "orientation", "portrait"); JsonUtils.jsonPut(config, "theme", "DARK"); String prompt = "Write summary in sections like Title, Agenda, Speakers, Action Items, Outlines, Notes and Summary"; SummaryConfig summaryConfig = new SummaryConfig(true, prompt); String modelId = "raman_v1"; PostTranscriptionConfig transcription = new PostTranscriptionConfig(true, summaryConfig,modelId); meeting.startHls(config,transcription); ``` --- ### stopHls() - `stopHls()` is used to stop the HLS streaming. #### Returns - _`void`_ #### Events associated with `stopHls()`: - Every participant will receive a callback on [`onHlsStateChanged()`](./meeting-event-listener-class#onhlsstatechanged) #### Example ```javascript meeting!!.stopHls() ``` ```javascript meeting.stopHls(); ``` --- ### startTranscription() - `startTranscription()` It is used to start realtime transcription. #### Parameters #### config - type : `TranscriptionConfig` - This specifies the configurations for realtime transcription. You can specify following properties. - `TranscriptionConfig.webhookUrl`: Webhooks will be triggered when the state of realtime transcription is changed. Read more about webhooks [here](https://en.wikipedia.org/wiki/Webhook) - `TranscriptionConfig.summary`: `SummaryConfig` - `enabled`: Indicates whether realtime transcription summary generation is enabled. Summary will be available after realtime transcription stopped. Default: `false` - `prompt`: provides guidelines or instructions for generating a custom summary based on the realtime transcription content. 
#### Returns - _`void`_ #### Events associated with `startTranscription()`: - Every participant will receive a callback on [`onTranscriptionStateChanged()`](./meeting-event-listener-class#ontranscriptionstatechanged) - Every participant will receive a callback on [`onTranscriptionText()`](./meeting-event-listener-class#ontranscriptiontext) #### Example ```javascript // Realtime Transcription Configuration val webhookUrl = "https://www.example.com" val summaryConfig = SummaryConfig( true, "Write summary in sections like Title, Agenda, Speakers, Action Items, Outlines, Notes and Summary" ) val transcriptionConfig = TranscriptionConfig( webhookUrl, summaryConfig ) meeting!!.startTranscription(transcriptionConfig) ``` ```javascript // Realtime Transcription Configuration final String webhookUrl = "https://www.example.com"; SummaryConfig summaryConfig = new SummaryConfig( true, "Write summary in sections like Title, Agenda, Speakers, Action Items, Outlines, Notes and Summary" ); TranscriptionConfig transcriptionConfig = new TranscriptionConfig( webhookUrl, summaryConfig ); meeting.startTranscription(transcriptionConfig); ``` --- ### stopTranscription() - `stopTranscription()` is used to stop realtime transcription. #### Returns - _`void`_ #### Events associated with `stopTranscription()`: - Every participant will receive a callback on [`onTranscriptionStateChanged()`](./meeting-event-listener-class#ontranscriptionstatechanged) #### Example ```javascript meeting!!.stopTranscription() ``` ```javascript meeting.stopTranscription(); ``` --- ### startWhiteboard() - It is used to initialize a whiteboard session. #### Returns - _`void`_ --- ### stopWhiteboard() - It is used to end a whiteboard session. #### Returns - _`void`_ --- ### changeMode() - It is used to change the mode. - You can toggle between the following modes: - **`SEND_AND_RECV`**: Both audio and video streams will be produced and consumed. - **`SIGNALLING_ONLY`**: Audio and video streams will not be produced or consumed. It is used solely for signaling. - **`RECV_ONLY`**: Only audio and video streams will be consumed without producing any. #### Parameters - **mode**: `String` #### Returns - _`void`_ #### Events associated with `changeMode()`: - Every participant will receive a callback on [`onParticipantModeChanged()`](./meeting-event-listener-class#onparticipantmodechanged) #### Example ```javascript meeting!!.changeMode("SIGNALLING_ONLY") ``` ```javascript meeting.changeMode("SIGNALLING_ONLY"); ``` --- ### getMics() - It will return all connected mic devices. #### Returns - `Set` #### Example ```javascript val mics = meeting!!.mics var mic: String for (i in mics.indices) { mic = mics.toTypedArray()[i].toString() Toast.makeText(this, "Mic : $mic", Toast.LENGTH_SHORT).show() } ``` ```javascript Set mics = meeting.getMics(); String mic; for (int i = 0; i < mics.size(); i++) { mic = mics.toArray()[i].toString(); Toast.makeText(this, "Mic : " + mic, Toast.LENGTH_SHORT).show(); } ``` --- ### changeMic() - It is used to change the mic device. - If multiple mic devices are connected, by using `changeMic()` one can change the mic device. 
#### Parameters - **device**: AppRTCAudioManager.AudioDevice #### Returns - _`void`_ #### Example ```javascript meeting!!.changeMic(AppRTCAudioManager.AudioDevice.BLUETOOTH) ``` ```javascript meeting.changeMic(AppRTCAudioManager.AudioDevice.BLUETOOTH); ``` --- ### changeWebcam() - It is used to change the camera device. - If multiple camera devices are connected, by using `changeWebcam()`, one can change the camera device with its respective device id. - You can get a list of connected video devices using [`VideoSDK.getVideoDevices()`](../videosdk-class/methods#getvideodevices) #### Parameters - **deviceId**: - The `deviceId` represents the unique identifier of the camera device you wish to switch to. If no deviceId is provided, the facing mode will toggle, from the back camera to the front camera if the back camera is currently in use, or from the front camera to the back camera if the front camera is currently in use. - type : String - `OPTIONAL` #### Returns - _`void`_ #### Example ```javascript meeting!!.changeWebcam() ``` ```javascript meeting.changeWebcam(); ``` --- ### setAudioDeviceChangeListener() - When a Local participant changes the Mic, `AppRTCAudioManager.AudioManagerEvents()` is triggered which can be set by using `setAudioDeviceChangeListener()` method. #### Parameters - **audioManagerEvents**: AppRTCAudioManager.AudioManagerEvents #### Returns - _`void`_ #### Example ```javascript meeting!!.setAudioDeviceChangeListener(object : AudioManagerEvents { override fun onAudioDeviceChanged( selectedAudioDevice: AppRTCAudioManager.AudioDevice, availableAudioDevices: Set ) { when (selectedAudioDevice) { AppRTCAudioManager.AudioDevice.BLUETOOTH -> Toast.makeText(this@MainActivity, "Selected AudioDevice: BLUETOOTH", Toast.LENGTH_SHORT).show() AppRTCAudioManager.AudioDevice.WIRED_HEADSET -> Toast.makeText(this@MainActivity, "Selected AudioDevice: WIRED_HEADSET", Toast.LENGTH_SHORT).show() AppRTCAudioManager.AudioDevice.SPEAKER_PHONE -> Toast.makeText(this@MainActivity, "Selected AudioDevice: SPEAKER_PHONE", Toast.LENGTH_SHORT).show() AppRTCAudioManager.AudioDevice.EARPIECE -> Toast.makeText(this@MainActivity, "Selected AudioDevice: EARPIECE", Toast.LENGTH_SHORT).show() } } }) ``` ```javascript meeting.setAudioDeviceChangeListener(new AppRTCAudioManager.AudioManagerEvents() { @Override public void onAudioDeviceChanged(AppRTCAudioManager.AudioDevice selectedAudioDevice, Set availableAudioDevices) { switch (selectedAudioDevice) { case BLUETOOTH: Toast.makeText(MainActivity.this, "Selected AudioDevice: BLUETOOTH", Toast.LENGTH_SHORT).show(); break; case WIRED_HEADSET: Toast.makeText(MainActivity.this, "Selected AudioDevice: WIRED_HEADSET", Toast.LENGTH_SHORT).show(); break; case SPEAKER_PHONE: Toast.makeText(MainActivity.this, "Selected AudioDevice: SPEAKER_PHONE", Toast.LENGTH_SHORT).show(); break; case EARPIECE: Toast.makeText(MainActivity.this, "Selected AudioDevice: EARPIECE", Toast.LENGTH_SHORT).show(); break; } } }); ``` --- ### addEventListener() #### Parameters - **listener**: MeetingEventListener #### Returns - _`void`_ --- ### removeEventListener() #### Parameters - **listener**: MeetingEventListener #### Returns - _`void`_ --- ### removeAllListeners() #### Returns - _`void`_
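`addEventListener()`, `removeEventListener()`, and `removeAllListeners()` have no examples above, so here is a minimal, hedged sketch of how they are typically combined with the device-control methods documented on this page. The nullable `meeting` instance and the `micEnabled` / `webcamEnabled` flags are assumptions made for illustration.

```kotlin
// Minimal sketch (assumes `meeting` comes from VideoSDK.initMeeting(...));
// micEnabled / webcamEnabled are app-side flags, not SDK properties.
private var micEnabled = true
private var webcamEnabled = true

private fun setupMeeting(listener: MeetingEventListener) {
    meeting!!.addEventListener(listener)
    meeting!!.join()
}

private fun toggleMic() {
    if (micEnabled) meeting!!.muteMic() else meeting!!.unmuteMic()
    micEnabled = !micEnabled
}

private fun toggleWebcam() {
    if (webcamEnabled) meeting!!.disableWebcam() else meeting!!.enableWebcam()
    webcamEnabled = !webcamEnabled
}

private fun teardownMeeting() {
    meeting!!.removeAllListeners()
    meeting!!.leave()
}
```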
--- --- sidebar_position: 1 sidebar_label: Properties pagination_label: Meeting Class Properties title: Meeting Class Properties --- # Meeting Class Properties - Android
### getmeetingId() - type: `String` - `getmeetingId()` will return the `meetingId`, which is the unique id of the meeting the participant has joined. --- ### getLocalParticipant() - type: [Participant](../participant-class/introduction) - It will be the instance of the [Participant](../participant-class/introduction) class for the local participant (you) who joined the meeting. --- ### getParticipants() - type: [`Map`](https://developer.android.com/reference/java/util/Map) of [Participant](../participant-class/introduction) - `Map<participantId, Participant>` - It will contain all joined participants in the meeting except the `localParticipant`. - This [`Map`](https://developer.android.com/reference/java/util/Map) contains all participants, keyed by the id of each participant. import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; ```javascript val remoteParticipantId = "ajf897" val participant = meeting!!.participants[remoteParticipantId] ``` ```javascript String remoteParticipantId = "ajf897"; Participant participant = meeting.getParticipants().get(remoteParticipantId); ``` --- ### pubSub - type: [`PubSub`](../pubsub-class/introduction) - It is used to enable the Publisher-Subscriber feature in the [`meeting`](introduction) class. Learn more about `PubSub` [here](../pubsub-class/introduction).
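As a quick illustration, the sketch below reads the documented properties from a joined `meeting` instance (assumed to exist) and iterates the participants map; the logging is purely illustrative.

```kotlin
// Minimal sketch (assumes `meeting` is a joined Meeting instance)
val meetingId = meeting!!.getmeetingId()
Log.d("#meeting", "Joined meeting: $meetingId")

val localParticipant = meeting!!.localParticipant
Log.d("#meeting", "Local participant: " + localParticipant.displayName)

// getParticipants() excludes the local participant
for ((participantId, participant) in meeting!!.participants) {
    Log.d("#meeting", "Remote participant $participantId: " + participant.displayName)
}
```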
--- --- title: Meeting class for android SDK. hide_title: false hide_table_of_contents: false description: RTC Meeting Class provides features to implement custom meeting layout in your application. sidebar_label: Meeting Class pagination_label: Meeting Class keywords: - RTC Android - Meeting Class - Video API - Video Conferencing image: img/videosdklive-thumbnail.jpg sidebar_position: 1 slug: meeting-class --- import NoIndex from '/mdx/\_no-index.mdx'; # Meeting Class ## Using Meeting Class The `Meeting Class` includes methods and events for managing meetings, participants, video & audio streams, data channels and UI customization. import MethodListGroup from '@theme/MethodListGroup'; import MethodListItemLabel from '@theme/MethodListItemLabel'; import MethodListHeading from '@theme/MethodListHeading'; ## Constructor ### Meeting(String meetingId, Participant localParticipant) - return type : `void` ## Properties ### getmeetingId() - `getmeetingId()` will return `meetingId`, which represents the meetingId for the current meeting - return type : `String` ### getLocalParticipant() - `getLocalParticipant()` will return the local participant - return type : `Participant` ### getParticipants() - `getParticipants()` will return all remote participants - return type : `Map<String, Participant>` ### pubSub() - `pubSub()` will return an object of the `PubSub` class - return type : `PubSub` ### Events ### Methods --- --- title: MeetingEventListener Class for android SDK. hide_title: false hide_table_of_contents: false description: The `MeetingEventListener Class` includes a list of events which can be useful for designing a custom user interface. sidebar_label: MeetingEventListener Class pagination_label: MeetingEventListener Class keywords: - RTC Android - MeetingEventListener Class - Video API - Video Conferencing image: img/videosdklive-thumbnail.jpg sidebar_position: 1 slug: meeting-event-listener-class --- import NoIndex from '/mdx/\_no-index.mdx'; # MeetingEventListener Class ## Using MeetingEventListener Class The `MeetingEventListener Class` is responsible for listening to all the events that are related to the `Meeting Class`. import MethodListGroup from '@theme/MethodListGroup'; import MethodListItemLabel from '@theme/MethodListItemLabel'; import MethodListHeading from '@theme/MethodListHeading'; ### Listeners --- --- title: Video SDK Participant Class sidebar_position: 1 sidebar_label: Introduction pagination_label: Video SDK Participant Class --- # Video SDK Participant Class - Android
import properties from './../data/participant-class/properties.json' import methods from './../data/participant-class/methods.json' import events from './../data/participant-class/events.json' import LinksGrid from '../../../../src/theme/LinksGrid' The Participant class includes all the properties, methods and events related to all the participants joined in a particular meeting. ## Get local and remote participants You can get the local streams and participant meta from `meeting.getLocalParticipant()`, and a Map of joined participants is always available via `meeting.getParticipants()`. import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; ```js title="Kotlin" val localParticipant = meeting!!.getLocalParticipant() val participants = meeting!!.getParticipants() ``` ```js title="Java" Participant localParticipant = meeting.getLocalParticipant(); Map<String, Participant> participants = meeting.getParticipants(); ``` ## Participant Properties
- [getId()](./properties#getid)
- [getDisplayName()](./properties#getdisplayname)
- [getQuality()](./properties#getquality)
- [isLocal()](./properties#islocal)
- [getStreams()](./properties#getstreams)
- [getMode()](./properties#getmode)
- [getMetaData()](./properties#getmetadata)
## Participant Methods
- [enableWebcam()](./methods#enablewebcam)
- [disableWebcam()](./methods#disablewebcam)
- [enableMic()](./methods#enablemic)
- [disableMic()](./methods#disablemic)
- [remove()](./methods#remove)
- [setQuality()](./methods#setquality)
- [setViewPort()](./methods#setviewport)
- [captureImage()](./methods#captureimage)
## Participant Events
- [onStreamEnabled](./participant-event-listener-class#onstreamenabled)
- [onStreamDisabled](./participant-event-listener-class#onstreamdisabled)
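The properties, methods, and events linked above are documented in detail on the following pages. As a rough orientation, the sketch below shows one plausible way to listen for stream events on every remote participant; the joined `meeting` instance is an assumption.

```kotlin
// Minimal sketch (assumes `meeting` is a joined Meeting instance)
val streamListener = object : ParticipantEventListener() {
    override fun onStreamEnabled(stream: Stream) {
        Log.d("#participant", "A stream was enabled")
    }

    override fun onStreamDisabled(stream: Stream) {
        Log.d("#participant", "A stream was disabled")
    }
}

// Attach the same listener to every remote participant
for ((_, participant) in meeting!!.participants) {
    participant.addEventListener(streamListener)
}
```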
--- --- title: Participant Class Methods sidebar_position: 1 sidebar_label: Methods pagination_label: Participant Class Methods --- # Participant Class Methods - Android
### enableWebcam() - `enableWebcam()` is used to enable the participant's camera. #### Events associated with `enableWebcam()` : - First the participant will get a callback on [onWebcamRequested()](../meeting-class/meeting-event-listener-class#onwebcamrequested) and once the participant accepts the request, the webcam will be enabled. - Every Participant will receive a `streamEnabled` event of the `ParticipantEventListener` Class with the `stream` object. #### Returns - `void` --- ### disableWebcam() - `disableWebcam()` is used to disable the participant's camera. #### Events associated with `disableWebcam()` : - Every Participant will receive a `streamDisabled` event of the `ParticipantEventListener` Class with the `stream` object. #### Returns - `void` --- ### enableMic() - `enableMic()` is used to enable the participant's microphone. #### Events associated with `enableMic()` : - First the participant will get a callback on [onMicRequested()](../meeting-class/meeting-event-listener-class#onmicrequested) and once the participant accepts the request, the mic will be enabled. - Every Participant will receive a `streamEnabled` event of the `ParticipantEventListener` Class with the `stream` object. #### Returns - `void` --- ### disableMic() - `disableMic()` is used to disable the participant's microphone. #### Events associated with `disableMic()`: - Every Participant will receive a `streamDisabled` event of the `ParticipantEventListener` Class with the `stream` object. #### Returns - `void` --- ### pin() - It is used to set the pin state of the participant. You can use it to pin the screen share, camera or both of the participant. It accepts a parameter of type `String`. Default is `SHARE_AND_CAM` #### Parameters - **pinType**: `SHARE_AND_CAM` | `CAM` | `SHARE` --- ### unpin() - It is used to unpin the participant. You can use it to unpin the screen share, camera or both of the participant. It accepts a parameter of type `String`. Default is `SHARE_AND_CAM` #### Parameters - **pinType**: `SHARE_AND_CAM` | `CAM` | `SHARE` --- ### setQuality() - `setQuality()` is used to set the quality of the participant's video stream. #### Parameters - `quality`: low | med | high #### Returns - `void` --- ### setViewPort() - `setViewPort()` is used to set the quality of the participant's video stream based on the viewport height and width. #### Parameters - **width**: int - **height**: int #### Returns - `void` --- :::info MultiStream is not supported by the Android SDK. Use `customTrack` rather than `setQuality()` and `setViewPort()` if you want to change the quality of a participant who joined using our Android SDK. To know more about customTrack visit [here](/android/guide/video-and-audio-calling-api-sdk/features/custom-track/custom-video-track) ::: ### remove() - `remove()` is used to remove the participant from the meeting. #### Events associated with `remove()` : - Local participant will receive a [`onMeetingLeft`](../meeting-class/meeting-event-listener-class.md#onmeetingleft) event. - All remote participants will receive a [`onParticipantLeft`](../meeting-class/meeting-event-listener-class.md#onparticipantleft) event with `participantId`. #### Returns - `void` --- ### captureImage() - It is used to capture an image of the local participant's current videoStream. - You need to pass an implementation of `TaskCompletionListener` as a parameter. This listener will handle the result of the image capture task. 
- When the image capture task is complete, the `onComplete()` method will provide the image in the form of a `base64` string. If an error occurs, the `onError()` method will provide the error details. #### Parameters - **height**: int - **width**: int - **listener**: TaskCompletionListener #### Returns - _`void`_ --- ### getVideoStats() - `getVideoStats()` will return an JSONObject which will contain details regarding the participant's critical video metrics such as **Jitter**, **Packet Loss**, **Quality Score** etc. #### Returns - `JSONObject` - `jitter` : It represents the distortion in the stream. - `bitrate` : It represents the bitrate of the stream which is being transmitted. - `totalPackets` : It represents the total packet count which were transmitted for that particiular stream. - `packetsLost` : It represents the total packets lost during the transimission of the stream. - `rtt` : It represents the time between the stream being reached to client from the server in milliseconds(ms). - `codec`: It represents the codec used for the stream. - `network`: It represents the network used to transmit the stream - `size`: It is object containing the height, width and frame rate of the stream. :::note getVideoStats() will return the metrics for the participant at that given point of time and not average data of the complete meeting. To view the metrics for the complete meeting using the stats API documented [here](/api-reference/realtime-communication/fetch-session-quality-stats). ::: :::info If you are getting `rtt` greater than 300ms, try using a different region which is nearest to your user. To know more about changing region [visit here](/api-reference/realtime-communication/create-room). If you are getting high packet loss, try using the `customTrack` for better experience. To know more about customTrack [visit here](/android/guide/video-and-audio-calling-api-sdk/features/custom-track/custom-video-track) ::: --- ### getAudioStats() - `getAudioStats()` will return an JSONObject which will contain details regarding the participant's critical audio metrics such as **Jitter**, **Packet Loss**, **Quality Score** etc. #### Returns - `JSONObject` - `jitter` : It represents the distortion in the stream. - `bitrate` : It represents the bitrate of the stream which is being transmitted. - `totalPackets` : It represents the total packet count which were transmitted for that particiular stream. - `packetsLost` : It represents the total packets lost during the transimission of the stream. - `rtt` : It represents the time between the stream being reached to client from the server in milliseconds(ms). - `codec`: It represents the codec used for the stream. - `network`: It represents the network used to transmit the stream :::note getAudioStats() will return the metrics for the participant at that given point of time and not average data of the complete meeting. To view the metrics for the complete meeting using the stats API documented [here](/api-reference/realtime-communication/fetch-session-quality-stats). ::: :::info If you are getting `rtt` greater than 300ms, try using a different region which is nearest to your user. To know more about changing region [visit here](/api-reference/realtime-communication/create-room). ::: ### getShareStats() - `getShareStats()` will return an JSONObject which will contain details regarding the participant's critical video metrics such as **Jitter**, **Packet Loss**, **Quality Score** etc. 
#### Returns - `JSONObject` - `jitter` : It represents the distortion in the stream. - `bitrate` : It represents the bitrate of the stream which is being transmitted. - `totalPackets` : It represents the total packet count which were transmitted for that particiular stream. - `packetsLost` : It represents the total packets lost during the transimission of the stream. - `rtt` : It represents the time between the stream being reached to client from the server in milliseconds(ms). - `codec`: It represents the codec used for the stream. - `network`: It represents the network used to transmit the stream - `size`: It is object containing the height, width and frame rate of the stream. :::note getShareStats() will return the metrics for the participant at that given point of time and not average data of the complete meeting. To view the metrics for the complete meeting using the stats API documented [here](/api-reference/realtime-communication/fetch-session-quality-stats). ::: :::info If you are getting `rtt` greater than 300ms, try using a different region which is nearest to your user. To know more about changing region [visit here](/api-reference/realtime-communication/create-room). ::: ### addEventListener() #### Parameters - **listener**: ParticipantEventListener #### Returns - _`void`_ --- ### removeEventListener() #### Parameters - **listener**: ParticipantEventListener #### Returns - _`void`_ --- ### removeAllListeners() #### Returns - _`void`_
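The stats getters above return an `org.json.JSONObject`; the sketch below samples a few of the documented fields. The `participant` reference and the choice of `opt*` accessors (and their numeric types) are assumptions made for illustration.

```kotlin
// Minimal sketch (assumes `participant` comes from meeting.getParticipants())
val videoStats = participant.getVideoStats()

// Field names follow the Returns section above; numeric types are assumed
val jitter = videoStats.optDouble("jitter")
val bitrate = videoStats.optLong("bitrate")
val packetsLost = videoStats.optLong("packetsLost")
val rtt = videoStats.optLong("rtt")

Log.d("#stats", "jitter=$jitter bitrate=$bitrate packetsLost=$packetsLost rtt=${rtt}ms")
```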
--- --- title: ParticipantEventListener Class sidebar_position: 1 sidebar_label: ParticipantEventListener pagination_label: ParticipantEventListener Class --- # ParticipantEventListener Class - Android
### Implementation - You can implement all the methods of `ParticipantEventListener` abstract Class and add the listener to `Participant` class using the `addEventListener()` method of `Participant` Class. --- ### onStreamEnabled() - `onStreamEnabled()` is a callback which gets triggered whenever a participant's video, audio or screen share stream is enabled. #### Event callback parameters - **stream**: [Stream](../stream-class/introduction.md) --- ### onStreamDisabled() - `onStreamDisabled()` is a callback which gets triggered whenever a participant's video, audio or screen share stream is disabled. #### Event callback parameters - **stream**: [Stream](../stream-class/introduction.md) --- ### Example import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; ```js meeting!!.localParticipant.addEventListener(object : ParticipantEventListener() { override fun onStreamEnabled(stream: Stream) { // } override fun onStreamDisabled(stream: Stream) { // } }); ``` ```js participant.addEventListener(new ParticipantEventListener() { @Override public void onStreamEnabled(Stream stream) { // } @Override public void onStreamDisabled(Stream stream) { // } }); ```
--- --- title: Participant Class Properties sidebar_position: 1 sidebar_label: Properties pagination_label: Participant Class Properties --- # Participant Class Properties - Android
### getId() - type: `String` - `getId()` will return the unique id of the participant who has joined the meeting. --- ### getDisplayName() - type: `String` - It will return the `displayName` of the participant who has joined the meeting. --- ### getQuality() - type: `String` - `getQuality()` will return the quality of the participant's stream. The stream could be `audio`, `video` or `share`. --- ### isLocal() - type: `boolean` - `isLocal()` will return `true` if the participant is local, `false` otherwise. --- ### getStreams() - type: `Map` - It represents the streams of that particular participant who has joined the meeting. Streams could be `audio`, `video` or `share`. --- ### getMode() - type : `String` - `getMode()` will return the mode of the participant. --- ### getMetaData() - type : `JSONObject` - `getMetaData()` will return the additional information that you have passed in `initMeeting()`.
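For reference, the sketch below logs these properties for each remote participant, using Kotlin's synthetic property access for the getters; the joined `meeting` instance is an assumption.

```kotlin
// Minimal sketch (assumes `meeting` is a joined Meeting instance)
for ((_, participant) in meeting!!.participants) {
    Log.d(
        "#participant",
        "id=${participant.id} " +
            "name=${participant.displayName} " +
            "local=${participant.isLocal} " +
            "quality=${participant.quality} " +
            "mode=${participant.mode}"
    )
}
```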
--- --- title: Participant class for android SDK. hide_title: false hide_table_of_contents: false description: The `Participant Class` includes methods and events for participants and their associated video & audio streams, data channels and UI customization. sidebar_label: Participant Class pagination_label: Participant Class keywords: - RTC Android - Participant Class - Video API - Video Conferencing image: img/videosdklive-thumbnail.jpg sidebar_position: 1 slug: participant-class --- import NoIndex from '/mdx/\_no-index.mdx'; # Participant Class ## Introduction The `Participant Class` includes methods and events for participants and their associated video & audio streams, data channels and UI customization. import MethodListGroup from '@theme/MethodListGroup'; import MethodListItemLabel from '@theme/MethodListItemLabel'; import MethodListHeading from '@theme/MethodListHeading'; ## Properties ### getId() - `getId()` will return participant's Id - return type : `String` ### getDisplayName() - `getDisplayName()` will return name of participant - return type : `String` ### getQuality() - `getQuality()` will return quality of participant's video stream - return type : `String` ### isLocal() - `isLocal()` will return `true` if the participant is local, `false` otherwise - return type : `boolean` ### getStreams() - `getStreams()` will return streams of participant - return type : `Map` - Map contains `streamId` as key and `stream` as value ## Events ### addEventListener(ParticipantEventListener listener) - By using `addEventListener(ParticipantEventListener listener)`, we can add a listener to the List of `ParticipantEventListener` - return type : `void` ### removeEventListener(ParticipantEventListener listener) - By using `removeEventListener(ParticipantEventListener listener)`, we can remove a listener from the List of `ParticipantEventListener` - return type : `void` ### removeAllListeners() - By using `removeAllListeners()`, we can remove all listeners from the List - return type : `void` ## Methods ### enableMic() - By using `enableMic()` function, a participant can enable the Mic of any particular Remote Participant - When `enableMic()` is called, - Local Participant will receive a callback on `streamEnabled()` of `ParticipantEventListener` class - Remote Participant will receive a callback for `onMicRequested()` and once the remote participant accepts the request, mic will be enabled for that participant - return type : `void` ### disableMic() - By using `disableMic()` function, a participant can disable the Mic of any particular Remote Participant - When `disableMic()` is called, - Local Participant will receive a callback on `streamDisabled()` of `ParticipantEventListener` class - Remote Participant will receive a callback on `streamDisabled()` of `ParticipantEventListener` class - return type : `void` ### enableWebcam() - By using `enableWebcam()` function, a participant can enable the Webcam of any particular Remote Participant - When `enableWebcam()` is called, - Local Participant will receive a callback on `streamEnabled()` of `ParticipantEventListener` class - Remote Participant will receive a callback for `webcamRequested()` and once the remote participant accepts the request, webcam will be enabled for that participant - return type : `void` ### disableWebcam() - By using `disableWebcam()` function, a participant can disable the Webcam of any particular Remote Participant - When `disableWebcam()` is called, - Local Participant will receive a callback on `streamDisabled()` of
`ParticipantEventListener` class - Remote Participant will receive a callback on `streamDisabled()` of `ParticipantEventListener` class - return type : `void` ### remove() - By using `remove()` function, a participant can remove any particular Remote Participant - When `remove()` is called, - Local Participant will receive a callback on `meetingLeft()` - Remote Participant will receive a callback on `participantLeft()` - return type : `void` ### setQuality() - By using `setQuality()`, you can set the quality of the participant's video stream - return type : `void` --- --- title: ParticipantEventListener Class for android SDK. hide_title: false hide_table_of_contents: false description: The `ParticipantEventListener Class` includes a list of events which can be useful for designing a custom user interface. sidebar_label: ParticipantEventListener Class pagination_label: ParticipantEventListener Class keywords: - RTC Android - ParticipantEventListener Class - Video API - Video Conferencing image: img/videosdklive-thumbnail.jpg sidebar_position: 1 slug: participant-event-listener-class --- # ParticipantEventListener Class ## using ParticipantEventListener Class The `ParticipantEventListener Class` is responsible for listening to all the events that are related to `Participant Class`. import MethodListGroup from '@theme/MethodListGroup'; import MethodListItemLabel from '@theme/MethodListItemLabel'; import MethodListHeading from '@theme/MethodListHeading'; ### Listeners --- --- title: Video SDK PubSub Class sidebar_position: 1 sidebar_label: Introduction pagination_label: Video SDK PubSub Class --- # Video SDK PubSub Class - Android
## Introduction PubSub class provides the methods to implement Publisher-Subscriber feature in your Application. ## PubSub Methods
- [publish()](methods#publish)
- [subscribe()](methods#subscribe)
- [unsubscribe()](methods#unsubscribe)
--- --- sidebar_position: 1 sidebar_label: Methods pagination_label: PubSub Class Methods title: PubSub Class Methods --- # PubSub Class Methods - Android
### publish() - `publish()` is used to publish messages on a specified topic in the meeting. #### Parameters - topic - type: `String` - This is the name of the topic for which the message will be published. - message - type: `String` - This is the actual message. - options - type: [`PubSubPublishOptions`](pubsub-publish-options-class) - This specifies the options for publish. - payload - type: `JSONObject` - `OPTIONAL` - If you need to include additional information along with a message, you can pass it here as a `JSONObject`. #### Returns - _`void`_ #### Example import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; ```js // Publish message for 'CHAT' topic val publishOptions = PubSubPublishOptions() publishOptions.isPersist = true meeting!!.pubSub.publish("CHAT", "Hello from Android", publishOptions) ``` ```js // Publish message for 'CHAT' topic PubSubPublishOptions publishOptions = new PubSubPublishOptions(); publishOptions.setPersist(true); meeting.pubSub.publish("CHAT", "Hello from Android", publishOptions); ``` --- ### subscribe() - `subscribe()` is used to subscribe to a particular topic to get all the messages of that particular topic in the meeting. #### Parameters - topic: - type: `String` - Participants can listen to messages on that particular topic. - listener: - type: [`PubSubMessageListener`](pubsub-message-listener-class) #### Returns - [_`List`_](pubsub-message-class) #### Example ```js var pubSubMessageListener: PubSubMessageListener = PubSubMessageListener { message -> Log.d("#message","onMessageReceived: " + message.message) } // Subscribe for 'CHAT' topic val pubSubMessageList = meeting!!.pubSub.subscribe("CHAT", pubSubMessageListener) ``` ```js PubSubMessageListener pubSubMessageListener = new PubSubMessageListener() { @Override public void onMessageReceived(PubSubMessage message) { Log.d("#message", "onMessageReceived: " + message.getMessage()); } }; // Subscribe for 'CHAT' topic List pubSubMessageList = meeting.pubSub.subscribe("CHAT", pubSubMessageListener); ``` --- ### unsubscribe() - `unsubscribe()` is used to unsubscribe from a particular topic to which you have subscribed previously. #### Parameters - topic: - type: `String` - This is the name of the topic to be unsubscribed. - listener: - type: [`PubSubMessageListener`](pubsub-message-listener-class) #### Returns - _`void`_ #### Example ```js // Unsubscribe for 'CHAT' topic meeting!!.pubSub.unsubscribe("CHAT", pubSubMessageListener) ``` ```js // Unsubscribe for 'CHAT' topic meeting.pubSub.unsubscribe("CHAT", pubSubMessageListener); ```
--- --- sidebar_position: 1 sidebar_label: Properties pagination_label: Properties title: Properties --- # Properties - Android
### getId() - type: `String` - `getId()` will return the unique id of the pubsub message. --- ### getMessage() - type: `String` - `getMessage()` will return the message that has been published on the specific topic. --- ### getTopic() - type: `String` - `getTopic()` will return the name of the message topic. --- ### getSenderId() - type: `String` - `getSenderId()` will return the id of the participant who has sent the message. --- ### getSenderName() - type: `String` - `getSenderName()` will return the name of the participant who has sent the pubsub message. --- ### getTimestamp() - type: `long` - `getTimestamp()` will return the timestamp at which the pubsub message was sent. --- ### getPayload() - type: `JSONObject` - `getPayload()` will return the data that you have sent with the message.
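For illustration, here is a minimal sketch (Kotlin) that reads these properties inside a `PubSubMessageListener`, following the listener pattern shown on the PubSub methods page.

```js
// Minimal sketch: log the documented PubSubMessage properties when a message arrives.
val pubSubMessageListener = PubSubMessageListener { message ->
    Log.d("#message", "id: " + message.id)
    Log.d("#message", "topic: " + message.topic)
    Log.d("#message", "message: " + message.message)
    Log.d("#message", "sender: " + message.senderName + " (" + message.senderId + ")")
    Log.d("#message", "timestamp: " + message.timestamp)
    Log.d("#message", "payload: " + message.payload)
}
```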
--- --- sidebar_position: 1 sidebar_label: PubSubMessageListener Class pagination_label: PubSubMessageListener Class title: PubSubMessageListener Class --- # PubSubMessageListener Class - Android
--- #### onMessageReceived() - This event will be emitted whenever a pubsub message is received. #### Example import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; ```javascript var pubSubMessageListener = PubSubMessageListener { message -> Log.d("#message", "onMessageReceived: " + message.message) } ``` ```javascript PubSubMessageListener pubSubMessageListener = new PubSubMessageListener() { @Override public void onMessageReceived(PubSubMessage message) { Log.d("#message", "onMessageReceived: " + message.getMessage()); } }; ```
--- --- sidebar_position: 1 sidebar_label: PubSubPublishOptions Class pagination_label: PubSubPublishOptions Class title: PubSubPublishOptions Class --- # PubSubPublishOptions Class - Android
## Properties ### persist - type: `boolean` - defaultValue: `false` - This property specifies whether to store messages on the server for upcoming participants. - If the value of this property is true, then the server will store pubsub messages for the upcoming participants. --- ### sendOnly - type: `String[]` - defaultValue: `null` - If you want to send a message to specific participants, you can pass their respective `participantId` here. If you don't provide any IDs or pass a `null` value, the message will be sent to all participants by default. :::note Make sure that the participants whose `participantId` is present in the array have subscribed to that specific topic. :::
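Below is a minimal sketch (Kotlin) combining both options. It assumes a `sendOnly` setter that mirrors the `persist` setter used in the `publish()` example, and `targetParticipantId` is a hypothetical id of a participant who has subscribed to the topic.

```js
// Minimal sketch: persist the message and deliver it only to one participant.
// targetParticipantId is a hypothetical, already-known participant id.
val targetParticipantId = "<participant-id>"
val publishOptions = PubSubPublishOptions()
publishOptions.isPersist = true                        // store for upcoming participants
publishOptions.sendOnly = arrayOf(targetParticipantId) // assumed setter for the sendOnly property
meeting!!.pubSub.publish("CHAT", "Hello, just for you", publishOptions)
```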
--- --- title: PubSub class for android SDK. hide_title: false hide_table_of_contents: false description: PubSub Class sidebar_label: PubSub Class pagination_label: PubSub Class keywords: - RTC Android - Publisher-Subscriber - PubSub image: img/videosdklive-thumbnail.jpg sidebar_position: 1 slug: pubsub-class --- # PubSub Class ## using PubSub Class The `PubSub` class includes methods for pubsub. import MethodListGroup from '@theme/MethodListGroup'; import MethodListItemLabel from '@theme/MethodListItemLabel'; import MethodListHeading from '@theme/MethodListHeading'; ### Methods --- --- title: PubSubMessage class for android SDK. hide_title: false hide_table_of_contents: false description: PubSubMessage Class sidebar_label: PubSubMessage Class pagination_label: PubSubMessage Class keywords: - RTC Android - Publisher-Subscriber - PubSub - PubSubMessage image: img/videosdklive-thumbnail.jpg sidebar_position: 1 slug: pubsub-message-class --- # PubSubMessage Class ## using PubSubMessage Class The `PubSubMessage` class includes the properties of a PubSub message. import MethodListGroup from '@theme/MethodListGroup'; import MethodListItemLabel from '@theme/MethodListItemLabel'; import MethodListHeading from '@theme/MethodListHeading'; ### Properties --- --- title: PubSubPublishOptions class for android SDK. hide_title: false hide_table_of_contents: false description: PubSubPublishOptions Class sidebar_label: PubSubPublishOptions Class pagination_label: PubSubPublishOptions Class keywords: - RTC Android - PubSub - PubSubPublishOptions image: img/videosdklive-thumbnail.jpg sidebar_position: 1 slug: pubsub-publish-options-class --- # PubSubPublishOptions Class ## using PubSubPublishOptions Class The `PubSubPublishOptions` class includes the properties of PubSub publish options. import MethodListGroup from '@theme/MethodListGroup'; import MethodListItemLabel from '@theme/MethodListItemLabel'; import MethodListHeading from '@theme/MethodListHeading'; ### Properties ### Methods --- --- id: setup title: Installation steps for RTC Android SDK hide_title: false hide_table_of_contents: false description: RTC Android SDK provides a client for almost all Android devices. It uses a small amount of CPU and memory. sidebar_label: Setup pagination_label: Setup keywords: - RTC Android - Android SDK - Kotlin SDK - Java SDK image: img/videosdklive-thumbnail.jpg sidebar_position: 1 slug: setup --- # Setup - Android ## Setting up android sdk The Android SDK is a client for real-time communication on Android devices. It inherits the same terminology as all the other SDKs. ## Minimum OS/SDK versions It supports the following OS/SDK versions. ### Android: minSdkVersion >= 21 ## Installation 1. If your Android Studio version is older than Android Studio Bumblebee, add the repository to the project's `build.gradle` file.
If you are using Android Studio Bumblebee or a newer version, add the repository to the `settings.gradle` file. :::note You can use imports with Maven Central after rtc-android-sdk version `0.1.12`. Whether on Maven or Jitpack, the same version numbers always refer to the same SDK. ::: import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; ```js title="build.gradle" allprojects { repositories { // ... google() mavenCentral() maven { url "https://maven.aliyun.com/repository/jcenter" } } } ``` ```js title="settings.gradle" dependencyResolutionManagement{ repositories { // ... google() mavenCentral() maven { url "https://maven.aliyun.com/repository/jcenter" } } } ``` ```js title="build.gradle" allprojects { repositories { // ... google() maven { url 'https://jitpack.io' } mavenCentral() maven { url "https://maven.aliyun.com/repository/jcenter" } } } ``` ```js title="settings.gradle" dependencyResolutionManagement{ repositories { // ... google() maven { url 'https://jitpack.io' } mavenCentral() maven { url "https://maven.aliyun.com/repository/jcenter" } } } ``` ### Step 2: Add the following dependency in your app's `app/build.gradle`. ```js title="app/build.gradle" dependencies { implementation 'live.videosdk:rtc-android-sdk:0.1.38' // library to perform Network call to generate a meeting id implementation 'com.amitshekhar.android:android-networking:1.0.2' // other app dependencies } ``` :::important The Android SDK is compatible with armeabi-v7a, arm64-v8a, and x86_64 architectures. If you want to run the application in an emulator, choose ABI x86_64 when creating a device. ::: ## Integration ### Step 1: Add the following permissions in `AndroidManifest.xml`. ```xml title="AndroidManifest.xml" ``` ### Step 2: Create `MainApplication` class which will extend the `android.app.Application`. ```js title="MainApplication.kt" package live.videosdk.demo import android.app.Application import live.videosdk.android.VideoSDK class MainApplication : Application() { override fun onCreate() { super.onCreate() VideoSDK.initialize(applicationContext) } } ``` ```js title="MainApplication.java" package live.videosdk.demo; import android.app.Application; import live.videosdk.android.VideoSDK; public class MainApplication extends Application { @Override public void onCreate() { super.onCreate(); VideoSDK.initialize(getApplicationContext()); } } ``` ### Step 3: Add `MainApplication` to `AndroidManifest.xml`. ```js title="AndroidManifest.xml" ``` ### Step 4: In your `JoinActivity` add the following code in `onCreate()` method. ```js title="JoinActivity.kt" override fun onCreate(savedInstanceState: Bundle?)
{ super.onCreate(savedInstanceState) setContentView(R.layout.activity_join) val meetingId = "" val participantName = "John Doe" var micEnabled = true var webcamEnabled = true // generate the jwt token from your api server and add it here VideoSDK.config("JWT TOKEN GENERATED FROM SERVER") // create a new meeting instance meeting = VideoSDK.initMeeting( this@MeetingActivity, meetingId, participantName, micEnabled, webcamEnabled, null, null, false, null, null) // get permissions and join the meeting with meeting.join(); // checkPermissionAndJoinMeeting(); } ``` ```js title="JoinActivity.java" @Override protected void onCreate(Bundle savedInstanceState) { super.onCreate(savedInstanceState); setContentView(R.layout.activity_join); final String meetingId = ""; final String participantName = "John Doe"; final boolean micEnabled = true; final boolean webcamEnabled = true; // generate the jwt token from your api server and add it here VideoSDK.config("JWT TOKEN GENERATED FROM SERVER"); // create a new meeting instance Meeting meeting = VideoSDK.initMeeting( MainActivity.this, meetingId, participantName, micEnabled, webcamEnabled, null, null, false, null, null ); // get permissions and join the meeting with meeting.join(); // checkPermissionAndJoinMeeting(); } ``` All set! Here is the link to the complete sample code on [Github](https://github.com/videosdk-live/videosdk-rtc-android-java-sdk-example). Please refer to the [documentation](initMeeting) for a full list of available methods, events and features of the SDK. --- --- title: Video SDK Stream Class sidebar_position: 1 sidebar_label: Introduction pagination_label: Video SDK Stream Class --- # Video SDK Stream Class - Android
import properties from './../data/stream-class/properties.json' import methods from './../data/stream-class/methods.json' import LinksGrid from '../../../../src/theme/LinksGrid' Stream class is responsible for handling audio, video and screen sharing streams. Stream class defines instance of audio, video and shared screen stream of participants. ## Stream Properties
- [getId()](./properties#getid)
- [getKind()](./properties#getkind)
- [getTrack()](./properties#gettrack)
## Stream Methods
- [resume()](methods#resume)
- [pause()](./methods#pause)
--- --- title: Stream Class Methods sidebar_position: 1 sidebar_label: Methods pagination_label: Stream Class Methods --- # Stream Class Methods - Android
### resume() - By using `resume()`, a participant can resume the stream of a Remote Participant. #### Returns - `void` --- ### pause() - By using `pause()`, a participant can pause the stream of a Remote Participant. #### Returns - `void`
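As an illustration, here is a minimal sketch (Kotlin) that pauses a remote participant's video stream while their tile is off screen and resumes it when it becomes visible again; `remoteParticipant` is an assumed `Participant` instance obtained from the meeting.

```js
// Minimal sketch: pause/resume a remote participant's video stream based on visibility.
fun onTileVisibilityChanged(remoteParticipant: Participant, visible: Boolean) {
    for ((_, stream) in remoteParticipant.getStreams()) {
        if (stream.getKind() == "video") {
            if (visible) stream.resume() else stream.pause()
        }
    }
}
```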
--- --- title: Stream Class Properties sidebar_position: 1 sidebar_label: Properties pagination_label: Stream Class Properties --- # Stream Class Properties - Android
### getId() - type: `String` - `getId()` will return the id of that stream. --- ### getKind() - type: `String` - `getKind()` will return the `kind`, which represents the type of the stream and can be `audio`, `video` or `share`. --- ### getTrack() - type: `MediaStreamTrack` - `getTrack()` will return the MediaStreamTrack object stored in the MediaStream object.
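For example, `getKind()` and `getTrack()` can be combined to render only video streams. The sketch below (Kotlin) assumes `videoView` is a `VideoView` from your layout, following the rendering pattern used elsewhere in these docs.

```js
// Minimal sketch: attach only video streams to a VideoView for rendering.
if (stream.getKind() == "video") {
    videoView.addTrack(stream.getTrack() as VideoTrack)
}
```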
--- --- title: Stream class for android SDK. hide_title: false hide_table_of_contents: false description: RTC Stream Class enables opportunity to . sidebar_label: Stream Class pagination_label: Stream Class keywords: - RTC Android - Stream Class - Video API - Video Conferencing image: img/videosdklive-thumbnail.jpg sidebar_position: 1 slug: stream-class --- # Stream Class ## Introduction The `Stream Class` includes methods and events of video & audio streams. import MethodListGroup from '@theme/MethodListGroup'; import MethodListItemLabel from '@theme/MethodListItemLabel'; import MethodListHeading from '@theme/MethodListHeading'; ## Properties ### getId() - `getId()` will return Id of stream - return type : `String` ### getKind() - `getKind()` will return kind of stream, which can `audio`,`video` or `share` - return type : `String` ### getTrack() - `getTrack()` will return a MediaStreamTrack object stored in the MediaStream object - return type : `MediaStreamTrack` ## Methods ### pause() - By using `pause()` function, a participant can pause the stream of Remote Participant - return type : `void` ### resume() - By using `resume()` function, a participant can resume the stream of Remote Participant - return type : `void` --- --- title: Terminology - Video SDK Documentation hide_title: true hide_table_of_contents: false description: Video SDK enables the opportunity to integrate native IOS, Android & Web SDKs to add live video & audio conferencing to your applications. sidebar_label: Terminology pagination_label: Terminology keywords: - audio calling - video calling - real-time communication - collaboration image: img/videosdklive-thumbnail.jpg sidebar_position: 1 slug: terminology --- import Terminology from '../../../mdx/introduction/\_terminology.mdx'; --- --- title: Video SDK Class for android SDK. hide_title: false hide_table_of_contents: false description: Video SDK Class is a factory for initialize, configure and init meetings. sidebar_label: VideoSDK Class pagination_label: VideoSDK Class keywords: - RTC Android - VideoSDK Class - Video API - Video Conferencing image: img/videosdklive-thumbnail.jpg sidebar_position: 1 slug: videosdk-class --- # VideoSDK Class The entry point into real-time communication SDK. ## using VideoSDK Class The `VideoSDK Class` includes methods and events to initialize and configure the SDK. It is a factory class. import MethodListGroup from '@theme/MethodListGroup'; import MethodListItemLabel from '@theme/MethodListItemLabel'; import MethodListHeading from '@theme/MethodListHeading'; ### Parameters ### Methods ## Example import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; ```js title="initMeeting" // Configure the token VideoSDK.config(token) // Initialize the meeting Meeting meeting = VideoSDK.initMeeting( context, meetingId, // required name, // required micEnabled, // required webcamEnabled, // required null, // required null, // required null // required ) }); ``` ```js title="initMeeting" // Configure the token VideoSDK.config(token) // Initialize the meeting Meeting meeting = VideoSDK.initMeeting({ context, meetingId, // required name, // required micEnabled, // required webcamEnabled, // required null, // required null // required null // required }); ``` --- --- sidebar_position: 1 sidebar_label: Events pagination_label: VideoSDK Class Events title: VideoSDK Class Events --- # VideoSDK Class Events - Android
### onAudioDeviceChanged() - This event will be emitted when an audio device, is connected to or removed from the device. #### Example import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; ```javascript VideoSDK.setAudioDeviceChangeListener(object : VideoSDK.AudioDeviceChangeEvent { override fun onAudioDeviceChanged( selectedAudioDevice: AudioDeviceInfo?, audioDevices: MutableSet? ) { Log.d( "VideoSDK", "Selected Audio Device: " + selectedAudioDevice.label ) for (audioDevice in audioDevices) { Log.d("VideoSDK", "Audio Devices" + audioDevice.label) } } }) ``` ```javascript VideoSDK.setAudioDeviceChangeListener(new VideoSDK.AudioDeviceChangeEvent() { @Override public void onAudioDeviceChanged(AudioDeviceInfo selectedAudioDevice, Set audioDevices) { Log.d("VideoSDK", "Selected Audio Device: " + selectedAudioDevice.getLabel()); for (AudioDeviceInfo audioDevice : audioDevices) { Log.d("VideoSDK", "Audio Devices" + audioDevice.getLabel()); } } }); ``` ---
--- --- sidebar_position: 1 sidebar_label: Introduction pagination_label: Intro to VideoSDK Class title: VideoSDK Class --- # VideoSDK Class - Android
## Introduction The `VideoSDK` class includes properties, methods and events for creating and configuring a meeting, and managing media devices. import LinksGrid from "../../../../src/theme/LinksGrid"; //import properties from "./../data/meeting-class/properties.json"; import methods from "./../data/meeting-class/methods.json"; import events from "./../data/meeting-class/events.json"; ## VideoSDK Properties
- [getSelectedAudioDevice()](./properties.md#getselectedaudiodevice)
- [getSelectedVideoDevice()](./properties#getselectedvideodevice)
## VideoSDK Methods
- [initialize()](./methods#initialize)
- [config()](./methods#config)
- [initMeeting()](./methods#initmeeting)
- [getDevices()](./methods#getdevices)
- [getVideoDevices()](./methods#getvideodevices)
- [getAudioDevices()](./methods#getaudiodevices)
- [checkPermissions()](./methods#checkpermissions)
- [setAudioDeviceChangeListener()](./methods#setaudiodevicechangelistener)
- [setSelectedAudioDevice()](./methods#setselectedaudiodevice)
- [setSelectedVideoDevice()](./methods#setselectedvideodevice)
## VideoSDK Events
- [onAudioDeviceChanged](./events.md#onaudiodevicechanged)
--- --- sidebar_position: 1 sidebar_label: Methods pagination_label: VideoSDK Class Methods title: VideoSDK Class Methods --- # VideoSDK Class Methods - Android
### initialize() To initialize the meeting, first you have to initialize the `VideoSDK`. You can initialize the `VideoSDK` using `initialize()` method provided by the SDK. #### Parameters - **context**: Context #### Returns - _`void`_ ```js title="initialize" VideoSDK.initialize(Context context) ``` --- ### config() By using `config()` method, you can set the `token` property of `VideoSDK` class. Please refer this [documentation](/api-reference/realtime-communication/intro/) to generate a token. #### Parameters - **token**: String #### Returns - _`void`_ ```js title="config" VideoSDK.config(String token) ``` --- ### initMeeting() - Initialize the meeting using a factory method provided by the SDK called `initMeeting()`. - `initMeeting()` will generate a new [`Meeting`](../meeting-class/introduction.md) class and the initiated meeting will be returned. ```js title="initMeeting" VideoSDK.initMeeting( Context context, String meetingId, String name, boolean micEnabled, boolean webcamEnabled, String participantId, String mode, boolean multiStream, Map customTracks JSONObject metaData, String signalingBaseUrl PreferredProtocol preferredProtocol ) ``` - Please refer this [documentation](../initMeeting.md#initmeeting) to know more about `initMeeting()`. --- ### getDevices() - The `getDevices()` method returns a list of the currently available media devices, such as microphones, cameras, headsets, and so forth. The method returns a list of `DeviceInfo` objects describing the devices. - `DeviceInfo` class has four properties : 1. `DeviceInfo.deviceId` - Returns a string that is an identifier for the represented device, persisted across sessions. 2. `DeviceInfo.label` - Returns a string describing this device (for example `BLUETOOTH`). 3. `DeviceInfo.kind` - Returns an enumerated value that is either `video` or `audio`. 4. `DeviceInfo.FacingMode` - Returns a value of type `FacingMode` indicating which camera device is in use (front or back). #### Returns - `Set` #### Example import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; ```javascript val devices: Set = VideoSDK.getDevices() for (deviceInfo in devices) { Log.d("VideoSDK", "Device's DeviceId " + deviceInfo.deviceId) Log.d("VideoSDK", "Device's Label " + deviceInfo.label) Log.d("VideoSDK", "Device's Kind " + deviceInfo.kind) Log.d("VideoSDK", "Device's Facing Mode " + deviceInfo.facingMode) //Value will be null for Audio Devices } ``` ```javascript Set devices = VideoSDK.getDevices(); for (DeviceInfo deviceInfo : devices) { Log.d("VideoSDK", "Device's DeviceId " + deviceInfo.getDeviceId()); Log.d("VideoSDK", "Device's Label " + deviceInfo.getLabel()); Log.d("VideoSDK", "Device's Kind " + deviceInfo.getKind()); Log.d("VideoSDK", "Device's Facing Mode " + deviceInfo.getFacingMode()) //Value will be null for Audio Devices } ``` --- ### getVideoDevices() - The `getVideoDevices` method returns a list of currently available video devices. The method returns a list of `VideoDeviceInfo` objects describing the video devices. - `VideoDeviceInfo` class has four properties : 1. `VideoDeviceInfo.deviceId` - Returns a string that is an identifier for the represented device, persisted across sessions. 2. `VideoDeviceInfo.label` - Returns a string describing this device (for example `BLUETOOTH`). 2. `VideoDeviceInfo.kind` - Returns an enumerated value that is `video` . 4. `VideoDeviceInfo.FacingMode` - Returns a value of type `FacingMode` indicating which camera device is in use (front or back). 
#### Returns - `Set` #### Example ```js val videoDevices: Set = VideoSDK.getVideoDevices() for (videoDevice in videoDevices) { Log.d("VideoSDK", "Video Device's DeviceId " + videoDevice.deviceId) Log.d("VideoSDK", "Video Device's Label " + videoDevice.label) Log.d("VideoSDK", "Video Device's Kind " + videoDevice.kind) } ``` ```js Set videoDevices = VideoSDK.getVideoDevices(); for (VideoDeviceInfo videoDevice: videoDevices) { Log.d("VideoSDK", "Video Device's DeviceId " + videoDevice.getDeviceId()); Log.d("VideoSDK", "Video Device's Label " + videoDevice.getLabel()); Log.d("VideoSDK", "Video Device's Kind " + videoDevice.getKind()); } ``` --- ### getAudioDevices() - The `getAudioDevices` method returns a list of currently available audio devices. The method returns a list of `AudioDeviceInfo` objects describing the audio devices. - `AudioDeviceInfo` class has three properties : 1. `AudioDeviceInfo.deviceId` - Returns a string that is an identifier for the represented device, persisted across sessions. 2. `AudioDeviceInfo.label` - Returns a string describing this device (for example `BLUETOOTH`). 3. `AudioDeviceInfo.kind` - Returns an enumerated value that is `audio`. #### Returns - `Set` #### Example ```js val audioDevices: Set = VideoSDK.getAudioDevices() for (audioDevice in audioDevices) { Log.d("VideoSDK", "Audio Device's DeviceId " + audioDevice.deviceId) Log.d("VideoSDK", "Audio Device's Label " + audioDevice.label) Log.d("VideoSDK", "Audio Device's Kind " + audioDevice.kind) } ``` ```js Set audioDevices = VideoSDK.getAudioDevices(); for (AudioDeviceInfo audioDevice: audioDevices) { Log.d("VideoSDK", "Audio Device's DeviceId " + audioDevice.getDeviceId()); Log.d("VideoSDK", "Audio Device's Label " + audioDevice.getLabel()); Log.d("VideoSDK", "Audio Device's Kind " + audioDevice.getKind()); } ``` --- ### setAudioDeviceChangeListener() - The `AudioDeviceChangeEvent` is emitted when an audio device, is connected to or removed from the device. This event can be set by using `setAudioDeviceChangeListener()` method. #### Parameters - **audioDeviceChangeEvent**: AudioDeviceChangeEvent #### Returns - _`void`_ #### Example ```javascript VideoSDK.setAudioDeviceChangeListener { selectedAudioDevice: AudioDeviceInfo, audioDevices: Set -> Log.d( "VideoSDK", "Selected Audio Device: " + selectedAudioDevice.label ) for (audioDevice in audioDevices) { Log.d("VideoSDK", "Audio Devices" + audioDevice.label) } } ``` ```javascript VideoSDK.setAudioDeviceChangeListener((selectedAudioDevice, audioDevices) -> { Log.d("VideoSDK", "Selected Audio Device: " + selectedAudioDevice.getLabel()); for (AudioDeviceInfo audioDevice : audioDevices) { Log.d("VideoSDK", "Audio Devices" + audioDevice.getLabel()); } }); ``` --- ### checkPermissions() - The `checkPermissions()` method verifies whether permissions to access camera and microphone devices have been granted. If the required permissions are not granted, the method will proceed to request these permissions from the user. #### Parameters - context - type: `Context` - `REQUIRED` - The android context. - permission - type: `List` - `REQUIRED` - The permission to be requested. - permissionHandler - type: `PermissionHandler` - `REQUIRED` - The permission handler object for handling callbacks of various user actions such as permission granted, permission denied, etc. - rationale - type: `String` - `OPTIONAL` - Explanation to be shown to user if they have denied permission earlier. 
If this parameter is not provided, permissions will be requested without showing the rationale dialog. - options - type: `Permissions.Options` - `OPTIONAL` - The options object for setting title and description of dialog box that prompts users to manually grant permissions by navigating to device settings. If this parameter is not provided,the default title and decription will be used for the dialog box. #### Returns - _`void`_ #### Example ```js private val permissionHandler: PermissionHandler = object : PermissionHandler() { override fun onGranted() {} override fun onBlocked( context: Context, blockedList: java.util.ArrayList ): Boolean { for (blockedPermission in blockedList) { Log.d("VideoSDK Permission", "onBlocked: $blockedPermission") } return super.onBlocked(context, blockedList) } override fun onDenied( context: Context, deniedPermissions: java.util.ArrayList ) { for (deniedPermission in deniedPermissions) { Log.d("VideoSDK Permission", "onDenied: $deniedPermission") } super.onDenied(context, deniedPermissions) } override fun onJustBlocked( context: Context, justBlockedList: java.util.ArrayList, deniedPermissions: java.util.ArrayList ) { for (justBlockedPermission in justBlockedList) { Log.d("VideoSDK Permission", "onJustBlocked: $justBlockedPermission") } super.onJustBlocked(context, justBlockedList, deniedPermissions) } } val permissionList: MutableList = ArrayList() permissionList.add(Permission.audio) permissionList.add(Permission.video) permissionList.add(Permission.bluetooth) val rationale = "Please provide permissions" val options = Permissions.Options().setRationaleDialogTitle("Info").setSettingsDialogTitle("Warning") //If you wish to disable the dialog box that prompts //users to manually grant permissions by navigating to device settings, //you can set options.sendDontAskAgainToSettings(false) VideoSDK.checkPermissions(this, permissionList, rationale, options, permissionHandler) ``` ```js private final PermissionHandler permissionHandler = new PermissionHandler() { @Override public void onGranted() { } @Override public boolean onBlocked(Context context, ArrayList blockedList) { for (Permission blockedPermission : blockedList) { Log.d("VideoSDK Permission", "onBlocked: " + blockedPermission); } return super.onBlocked(context, blockedList); } @Override public void onDenied(Context context, ArrayList deniedPermissions) { for (Permission deniedPermission : deniedPermissions) { Log.d("VideoSDK Permission", "onDenied: " + deniedPermission); } super.onDenied(context, deniedPermissions); } @Override public void onJustBlocked(Context context, ArrayList justBlockedList, ArrayList deniedPermissions) { for (Permission justBlockedPermission : justBlockedList) { Log.d("VideoSDK Permission", "onJustBlocked: " + justBlockedPermission); } super.onJustBlocked(context, justBlockedList, deniedPermissions); } }; List permissionList = new ArrayList<>(); permissionList.add(Permission.audio); permissionList.add(Permission.video); permissionList.add(Permission.bluetooth); String rationale = "Please provide permissions"; Permissions.Options options = new Permissions.Options().setRationaleDialogTitle("Info").setSettingsDialogTitle("Warning"); //If you wish to disable the dialog box that prompts //users to manually grant permissions by navigating to device settings, //you can set options.sendDontAskAgainToSettings(false) VideoSDK.checkPermissions(this, permissionList, rationale, options, permissionHandler); ``` --- ### setSelectedAudioDevice() - It sets the selected audio device, allowing the user 
to specify which audio device to use in the meeting. #### Parameters - **selectedAudioDevice**: AudioDeviceInfo #### Returns - _`void`_ #### Example ```js val audioDevices: Set = VideoSDK.getAudioDevices() val audioDeviceInfo: AudioDeviceInfo = audioDevices.toTypedArray().get(0) as AudioDeviceInfo VideoSDK.setSelectedAudioDevice(audioDeviceInfo) ``` ```js Set audioDevices = VideoSDK.getAudioDevices(); AudioDeviceInfo audioDeviceInfo = (AudioDeviceInfo) audioDevices.toArray()[0]; VideoSDK.setSelectedAudioDevice(audioDeviceInfo); ``` --- ### setSelectedVideoDevice() - It sets the selected video device, allowing the user to specify which video device to use in the meeting. #### Parameters - **selectedVideoDevice**: VideoDeviceInfo #### Returns - _`void`_ #### Example ```js val videoDevices: Set = VideoSDK.getVideoDevices() val videoDeviceInfo: VideoDeviceInfo = videoDevices.toTypedArray().get(1) as VideoDeviceInfo VideoSDK.setSelectedVideoDevice(videoDeviceInfo) ``` ```js Set videoDevices = VideoSDK.getVideoDevices(); VideoDeviceInfo videoDeviceInfo = (VideoDeviceInfo) videoDevices.toArray()[1]; VideoSDK.setSelectedVideoDevice(videoDeviceInfo); ``` --- ### applyVideoProcessor() - This method allows users to dynamically apply a virtual background to their video stream during a live session. #### Parameters - videoFrameProcessor - type: `VideoFrameProcessor` - This is an object of the `VideoFrameProcessor` class, which overrides the `onFrameCaptured(VideoFrame videoFrame)` method. #### Returns - _`void`_ #### Example ```js val uri = Uri.parse("https://st.depositphotos.com/2605379/52364/i/450/depositphotos_523648932-stock-photo-concrete-rooftop-night-city-view.jpg") val backgroundImageProcessor = BackgroundImageProcessor(uri) VideoSDK.applyVideoProcessor(backgroundImageProcessor) ``` ```java Uri uri = Uri.parse("https://st.depositphotos.com/2605379/52364/i/450/depositphotos_523648932-stock-photo-concrete-rooftop-night-city-view.jpg"); BackgroundImageProcessor backgroundImageProcessor = new BackgroundImageProcessor(uri); VideoSDK.applyVideoProcessor(backgroundImageProcessor); ``` --- ### removeVideoProcessor() - This method provides users with a convenient way to revert their video background to its original state, removing any previously applied virtual background. - **Returns:** - _`void`_ #### Example ```js VideoSDK.removeVideoProcessor(); ```
--- --- sidebar_position: 1 sidebar_label: Properties pagination_label: VideoSDK Class Properties title: VideoSDK Class Properties --- # VideoSDK Class Properties - Android
### getSelectedAudioDevice() - type: `AudioDeviceInfo` - The `getSelectedAudioDevice()` method will return the object of the audio device, which is currently in use. --- ### getSelectedVideoDevice() - type: `VideoDeviceInfo` - The `getSelectedVideoDevice()` method will return the object of the video device, which is currently in use.
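For illustration, a minimal sketch (Kotlin) of checking the currently selected devices is shown below; treating the return values as nullable is an assumption for the case where no device has been selected yet.

```js
// Minimal sketch: log which audio and video devices are currently in use.
val selectedAudio: AudioDeviceInfo? = VideoSDK.getSelectedAudioDevice()
val selectedVideo: VideoDeviceInfo? = VideoSDK.getSelectedVideoDevice()
Log.d("VideoSDK", "Selected audio device: " + selectedAudio?.label)
Log.d("VideoSDK", "Selected video device: " + selectedVideo?.label)
```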
--- ### 5.Errors associated with Media These errors involve media access, device availability, or permission-related issues affecting camera, microphone, and screen sharing. ### Device access-related errors | Type | Code | Message | |-------------------------------------------|-------|--------------------------------------------------| | ERROR_CAMERA_ACCESS | 3002 | Something went wrong. Unable to access camera. | | ERROR_MIC_ACCESS_DENIED | 3003 | It seems like microphone access was denied or dismissed. To proceed, kindly grant access through your device's settings. | | ERROR_CAMERA_ACCESS_DENIED | 3004 | It seems like camera access was denied or dismissed. To proceed, kindly grant access through your device's settings. | ### 6.Errors associated with Track These errors occur when there are issues with video or audio tracks, such as disconnections or invalid custom tracks. | Type | Code | Message | |-----------------------------|-------|--------------------------------------------------| | ERROR_CUSTOM_SCREEN_SHARE_TRACK_ENDED | 3005 | The provided custom track is in an ended state. Please try again with new custom track. | | ERROR_CUSTOM_SCREEN_SHARE_TRACK_DISPOSED | 3006 | The provided custom track was disposed. Please try again with new custom track. | | ERROR_CHANGE_WEBCAM | 3007 | Something went wrong, and the camera could not be changed. Please try again. | ### 7.Errors associated with Actions Below error is triggered when an action is attempted before joining a meeting. | Type | Code | Message | |-----------------------------|-------|--------------------------------------------------| | ERROR_ACTION_PERFORMED_BEFORE_MEETING_JOINED | 3001 | Oops! Something went wrong. The room was in a connecting state, and during that time, an action encountered an issue. Please try again after joining a meeting. | --- --- sidebar_label: App Size Optimization pagination_label: App Size Optimization --- # App Size Optimization - Android This guide is designed to help developers optimize app size, enhancing performance and efficiency across different devices. By following these best practices, you can reduce load times, minimize storage requirements, and deliver a more seamless experience to users, all while preserving essential functionality. ### Deliver Leaner Apps with App Bundles Using Android App Bundles (AAB) is an effective way to optimize the size of your application, making it lighter and more efficient for users to download and install. App Bundles allow Google Play to dynamically generate APKs tailored to each device, so users only download the resources and code relevant to their specific configuration. This approach reduces app size significantly, leading to faster installs and conserving storage space on users’ devices. Recommended Practices: - `Enable App Bundles`: Configure your build to use the App Bundle format instead of APKs. This will allow Google Play to optimize your app for each device type automatically. - `Organize Resources by Device Type`: Ensure that resources (like images and layouts) are organized by device type (such as screen density or language) to maximize the benefits of App Bundles. - `Test Modularization`: If your app contains large, optional features, use dynamic feature modules to let users download them on demand. This reduces the initial download size and provides features only as needed. 
- `Monitor Size Reductions`: Regularly analyze your app size to see where the most savings occur, and make sure that App Bundle optimizations are effectively reducing your app size across different device configurations. ### Optimize Libraries for a Leaner App Experience Managing dependencies carefully is essential for minimizing app size and improving performance. Every library or dependency included in your app adds to its overall size, so it’s crucial to only incorporate what’s necessary. Optimizing dependencies helps streamline your app, reduce load times, and enhance maintainability. Recommended Practices: - `Use Only Essential Libraries`: Review all libraries and dependencies, removing any that are not critical to your app’s functionality. This helps avoid unnecessary bloat. - `Leverage Lightweight Alternatives`: Whenever possible, choose lightweight libraries or modularized versions of larger ones. For example, opt for a specific feature module rather than including an entire library. - `Monitor Library Updates`: Regularly update your dependencies to take advantage of any optimizations or size reductions made by the library maintainers. Newer versions are often more efficient. - `Minimize Native Libraries`: If your app uses native libraries, ensure they’re essential and compatible across platforms, as they can significantly increase app size. - `Analyze Dependency Tree`: Use tools like Gradle’s dependency analyzer to identify unnecessary or redundant dependencies, ensuring your app’s dependency tree is as lean as possible. ### Optimize with ProGuard ProGuard is a powerful tool for shrinking, optimizing, and obfuscating your code, which can significantly reduce your app's size and improve performance. By removing unused code and reducing the size of classes, fields, and methods, ProGuard helps to minimize the footprint of your app without sacrificing functionality. Additionally, ProGuard’s obfuscation feature enhances security by making reverse engineering more difficult. You can refer to the official [documentation](https://developer.android.com/build/shrink-code) for more information. Recommended Practices: - `Enable ProGuard`: To enable ProGuard in your project, ensure that your `proguard-rules.pro` file is properly configured, and add the following lines to your `build.gradle` file: ```js title="build.gradle" buildTypes { release { minifyEnabled true proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro' } } ``` - `Customize ProGuard Rules`: Carefully review and customize ProGuard rules in the proguard-rules.pro file to avoid stripping essential code. For example, to keep a specific class, add: ```js -keep class com.example.myapp.MyClass { *; } ``` If you encounter an issue after enabling ProGuard rules, refer to our [known issues section](https://docs.videosdk.live/android/guide/video-and-audio-calling-api-sdk/known-issues). --- --- sidebar_label: Developer Experience Guidelines pagination_label: Developer Experience Guidelines --- # Developer Experience Guidelines - Android This section provides best practices for creating a smooth and efficient development process when working with VideoSDK. From handling errors gracefully to managing resources and event subscriptions, these guidelines help developers build more reliable and maintainable applications. Following these practices can simplify troubleshooting, prevent common pitfalls, and improve overall application performance. 
### Initiate Key Features After Meeting Join Event To provide a seamless and reliable meeting experience, initiate specific features **only** after the [onMeetingJoined()](https://docs.videosdk.live/android/api/sdk-reference/meeting-class/meeting-event-listener-class#onmeetingjoined) event has been triggered. - **Trigger Key Actions After Joining the Meeting** : Initiating crucial actions after the `onMeetingJoined()` event helps avoid errors and optimizes the meeting setup, ensuring a smoother experience for participants. If your application utilizes any of the following features or you want to perform any action as soon as meeting joins, it's recommended to call them only after the meeting has successfully started: - `Chat Subscription`: To enable in-meeting chat functionality, subscribe to the chat topic after the `onMeetingJoined()` event is triggered. This ensures that messages are reliably received by participants.
- `Device Management`: If you need users to use specific audio or video devices when the meeting is first joined, you can utilize the [`setSelectedAudioDevice()`](https://docs.videosdk.live/android/api/sdk-reference/videosdk-class/methods#setselectedaudiodevice) and [`setSelectedVideoDevice()`](https://docs.videosdk.live/android/api/sdk-reference/videosdk-class/methods#setselectedvideodevice) methods of `VideoSDK` class. - `Recording and Transcription`: To automatically start recording or transcription as soon as the meeting begins, configure the `autoStartConfig` in the `createMeeting` API. For detailed information, refer to the documentation [here](https://docs.videosdk.live/api-reference/realtime-communication/create-room#autoCloseConfig). ### Dispose Custom Tracks When Necessary Proper disposal of custom tracks is essential for managing system resources and ensuring a smooth experience. In most scenarios, tracks are automatically disposed of by the SDK, ensuring efficient resource management. However, in specific cases outlined below, you will need to dispose of custom tracks explicitly: 1. **When Enabling/Disabling the Camera on a Precall Screen**: - If your application includes a precall screen and you want to ensure that the device's camera is not used when the camera is disabled, you must dispose of the custom video track. Otherwise, the device’s camera will continue to be used even when the camera is off. - Additionally, remember to create a new track when the user enables the camera again. - If you don’t need to manage the camera's usage on the device level (i.e., you’re okay with the camera being used whether it’s enabled or disabled), you can skip this step. - Here's how you can manage customTrack on a precall screen : import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; ```js import live.videosdk.rtc.android.CustomStreamTrack import live.videosdk.rtc.android.VideoSDK import live.videosdk.rtc.android.VideoView class JoinActivity : AppCompatActivity() { private var videoTrack: CustomStreamTrack? = null private var joinView: VideoView? = null private fun toggleWebcam(videoDevice: VideoDeviceInfo?) { if (isWebcamEnabled) { // check the track state is LIVE if(videoTrack?.track?.state()?.equals("LIVE") == true){ videoTrack?.track?.dispose() // Dispose the current video track videoTrack?.track?.setEnabled(false) // Disable the track } videoTrack = null joinView!!.removeTrack() // Remove the video track from the view joinView!!.releaseSurfaceViewRenderer() joinView!!.visibility = View.INVISIBLE; } else { // Re-enabling the webcam by creating a new track videoTrack = VideoSDK.createCameraVideoTrack( "h720p_w960p", "front", CustomStreamTrack.VideoMode.TEXT, true, this, videoDevice // Passes the VideoDeviceInfo object of the user's selected device ) // display in localView joinView!!.addTrack(videoTrack!!.track as VideoTrack?) 
joinView!!.visibility = View.VISIBLE } isWebcamEnabled = !isWebcamEnabled // Toggle webcam state } } ``` ```js import live.videosdk.rtc.android.CustomStreamTrack import live.videosdk.rtc.android.VideoSDK import live.videosdk.rtc.android.VideoView public class JoinActivity extends AppCompatActivity { private CustomStreamTrack videoTrack = null; private VideoView joinView = null; private boolean isWebcamEnabled = false; private void toggleWebcam(VideoDeviceInfo videoDevice) { if (isWebcamEnabled) { // Check if the track state is LIVE if (videoTrack != null && "LIVE".equals(videoTrack.getTrack().state())) { videoTrack.getTrack().dispose(); // Dispose the current video track videoTrack.getTrack().setEnabled(false); // Disable the track } videoTrack = null; joinView.removeTrack(); // Remove the video track from the view joinView.releaseSurfaceViewRenderer(); joinView.setVisibility(View.INVISIBLE); } else { // Re-enabling the webcam by creating a new track videoTrack = VideoSDK.createCameraVideoTrack( "h720p_w960p", "front", CustomStreamTrack.VideoMode.TEXT, true, this, videoDevice // Passes the VideoDeviceInfo object of the user's selected device ); // Display in the local view joinView.addTrack((VideoTrack) videoTrack.getTrack()); joinView.setVisibility(View.VISIBLE); } isWebcamEnabled = !isWebcamEnabled; // Toggle webcam state } } ``` ### Listen for Error Events Listening to error events enables your application to handle unexpected issues efficiently, providing users with clear feedback and potential solutions. Error codes pinpoint specific problems, whether from configuration settings, account restrictions, permission limitations, or device constraints. Here are recommended solutions based on common error categories: 1. [Errors associated with Organization](../../api/sdk-reference/error-codes.md#1-errors-associated-with-organization): If you encounter errors related to your organization (e.g., account status or participant limits), reach out to support at support@videosdk.live or reach out to us on [Discord](https://discord.com/invite/Qfm8j4YAUJ) for assistance. 2. [Errors associated with Token](../../api/sdk-reference/error-codes#2-errors-associated-with-token): For errors related to authentication tokens, ensure the token is valid and hasn’t expired, then try the request again. 3. [Errors associated with Meeting and Participant](../../api/sdk-reference/error-codes#3-errors-associated-with-meeting-and-participant): Check that meetingId and participantId are correctly passed and valid. Also, ensure each participant has a unique participantId to avoid duplicate entries. 4. [Errors associated with Add-on Service](../../api/sdk-reference/error-codes#4-errors-associated-with-add-on-service): If you encounter errors with add-on services (such as recording or streaming), try restarting the service after receiving a failure event. For example, if a `START_RECORDING_FAILED` error event occurs, attempt to call the `startRecording()` method again. If you're using webhooks, you can also retry on [recording-failed](https://docs.videosdk.live/api-reference/realtime-communication/user-webhooks#recording-failed) hook. 5. [Errors associated with Media](../../api/sdk-reference/error-codes#5errors-associated-with-media): Inform the user about media access issues, such as microphone or camera permissions. Design the UI to clearly indicate what is preventing the mic or camera from enabling, helping the user understand the problem. 6. 
[Errors associated with Track](../../api/sdk-reference/error-codes#6errors-associated-with-track): Ensure that the track you’ve created and passed to enable the mic or camera methods meets the required specifications. 7. [Errors associated with Actions](../../api/sdk-reference/error-codes#7errors-associated-with-actions): If you need to perform actions as soon as a meeting joins, only initiate them after receiving the [onMeetingJoined()](https://docs.videosdk.live/android/api/sdk-reference/meeting-class/meeting-event-listener-class#onmeetingjoined) event, otherwise it will not work well. - Here's how to listen for the error event: ```js private val meetingEventListener: MeetingEventListener = object : MeetingEventListener() { //.. override fun onError(error: JSONObject) { try { val errorCodes: JSONObject = VideoSDK.getErrorCodes() val code = error.getInt("code") Log.d("#error", "Error is: " + error["message"]) } catch (e: Exception) { e.printStackTrace() } } } ``` ```js private final MeetingEventListener meetingEventListener = new MeetingEventListener() { //.. @Override public void onError(JSONObject error) { try { JSONObject errorCodes = VideoSDK.getErrorCodes(); int code = error.getInt("code"); Log.d("#error", "Error is: " + error.get("message")); } catch (Exception e) { e.printStackTrace(); } } }; ``` --- --- sidebar_label: Handle Large Rooms pagination_label: Handle Large Rooms --- # Handle Large Rooms - Android Managing large meetings requires specific strategies to ensure performance, stability, and a seamless user experience. This section provides best practices for optimizing VideoSDK applications to handle high participant volumes effectively. By implementing these recommendations, you can reduce lag, maintain video and audio quality, and provide a smooth experience even in large rooms. ### User Interface Optimization When hosting large meetings, an optimized UI helps manage participant visibility and ensures smooth performance. Recommended Practices: - `Limit Visible Participants`: Display only a limited number of participants on screen at any given time, adapting the view based on screen size. Use pagination to allow users to browse or switch between additional participants seamlessly. For example, you could display only users whose video stream is enabled, or you could choose to display all active speakers. This approach helps manage screen space efficiently, ensuring that the most relevant participants are visible without overwhelming the interface. - `Prioritize Active Speakers`: Ensure all active speakers are displayed on the screen to highlight who is currently talking, helping participants stay engaged and aware of ongoing discussions. To identify which participant is speaking, you can use the [onSpeakerChanged()](https://docs.videosdk.live/android/api/sdk-reference/meeting-class/meeting-event-listener-class#onspeakerchanged) event. ### Optimizing Media Streams In large video calls, it’s important to manage media streams effectively to optimize system resources while maintaining a smooth user experience. Recommended Practices: - `Pause Streams for Non-Visible Participants`: To optimize performance, pause the video and/or audio streams of participants who are not currently visible on the screen. This reduces unnecessary resource consumption. - `Resume Streams When Visible`: Once a participant comes into view, resume their stream to provide an uninterrupted experience as they appear on the screen. 
For detailed setup instructions on how to achieve this, check out our in-depth documentation [here](https://docs.videosdk.live/android/guide/video-and-audio-calling-api-sdk/render-media/layout-and-grid-management#pauseresume-stream). ### Media Stream Quality Adjustment In large meetings, managing media stream quality is essential to balance performance and user experience. Recommended Practices: - `High Quality for Active Speakers`: For all active speakers, set the video stream quality to a higher level using the setQuality method (e.g., `setQuality("high")`). This ensures that participants will receive higher-quality video for active speakers, providing a clearer and more engaging experience. - `Lower Quality for Non-Speaking Participants`: For other participants who are not actively speaking, set their video stream quality to a lower level (e.g., `setQuality("low")`). This helps conserve bandwidth and system resources while maintaining overall meeting performance. Checkout the documentation for `setQuality()` method [here](https://docs.videosdk.live/android/api/sdk-reference/participant-class/methods#setquality) --- --- sidebar_label: User Experience Guidelines pagination_label: User Experience Guidelines --- # User Experience Guidelines - Android This guide aims to help developers optimize the user experience and functionality of video conferencing applications with VideoSDK. By following these best practices, you can create smoother interactions, minimize common issues, and deliver a more reliable experience for users. Here are our recommended best practices to enhance the user experience in your application: | **Section** | **Description** | |--------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------| | [Configure Precall for Effortless Meeting Join](#configure-precall-for-effortless-meeting-join) | Users may enter meetings unprepared due to device or connection issues. A Precall setup can help them configure devices and settings beforehand for a smooth start. | | [Listen Key Events for Optimal User Experience](#listen-key-events-for-optimal-user-experience) | Users can feel lost without real-time updates on meeting status, events, and errors. Event monitoring and notifications keep them informed and engaged. | | [Handling Media Devices](#handling-media-devices) | Users may want to change their audio or video setup mid-meeting but struggle to manage device controls. Providing easy device switching enhances control and flexibility. | | [Monitoring Real-Time Participant Statistics](#monitoring-real-time-participant-statistics) | Poor video or audio quality without real-time feedback leaves users frustrated. Real-time stats let them assess connection quality and troubleshoot issues actively. | ### Configure Precall for Effortless Meeting Join A Precall step is crucial for ensuring users are set up correctly and have no device before joining a meeting. This step allows users to configure their devices and settings before entering a meeting, leading to a smoother experience and minimizing technical issues once the call begins. Recommended Practices: - `Request Permissions`: Prompt users to grant microphone, and camera permissions before entering the meeting, ensuring seamless access to their devices. - `Device Selection`: Allow users to select their preferred camera, and microphone giving them control over their setup from the start. 
- `Entry Preferences`: Provide options to join with the microphone and camera either on or off, letting users choose their level of engagement upon entry.
- `Camera Preview`: Show a live camera preview, allowing users to adjust angles and lighting to ensure they appear clearly and professionally.
- `Virtual Backgrounds`: Allow users to choose from different virtual backgrounds or enter with a virtual background enabled, enhancing privacy and creating a more polished appearance.

For detailed setup instructions on each of these features, check out our in-depth documentation [here](https://docs.videosdk.live/android/guide/video-and-audio-calling-api-sdk/setup-call/precall).
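As a starting point for the permission step, here is a hedged Kotlin sketch using the standard AndroidX Activity Result API; `PrecallActivity` and `showDevicePickerAndPreview()` are illustrative names, not part of the VideoSDK SDK.

```kotlin
import android.Manifest
import android.os.Bundle
import androidx.activity.result.contract.ActivityResultContracts
import androidx.appcompat.app.AppCompatActivity

// Illustrative precall screen: request mic/camera permissions before the user joins.
class PrecallActivity : AppCompatActivity() {

    private val permissionLauncher =
        registerForActivityResult(ActivityResultContracts.RequestMultiplePermissions()) { results ->
            if (results.values.all { it }) {
                showDevicePickerAndPreview() // let the user pick devices and check their camera
            } else {
                // Explain why mic/camera access is needed and offer a retry.
            }
        }

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        permissionLauncher.launch(
            arrayOf(Manifest.permission.RECORD_AUDIO, Manifest.permission.CAMERA)
        )
    }

    private fun showDevicePickerAndPreview() {
        // Populate camera/microphone pickers and start a local preview here.
    }
}
```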
### Monitor Key Events for Optimal User Experience

Listening for crucial events is vital for providing users with a responsive and engaging experience in your application. By effectively managing state changes and user notifications, you can keep participants informed and enhance their overall experience during meetings.

Recommended Practices:

- `Monitor State Change Events`: Listen for state change events, such as `onMeetingStateChanged` and `onRecordingStateChanged`, and notify users promptly about these transitions. Keeping users informed helps them understand the current state of the meeting.
- `UI Handling on Event Trigger`: Update the user interface only in response to specific events. For instance, display that the meeting is recording only when the `onRecordingStateChanged` event with the status `RECORDING_STARTED` is received, rather than when the record button is clicked. This ensures users receive accurate and timely updates.
- `Notify Participants of Join/Leave Events`: Keep users informed about participant activity by notifying them when someone joins or leaves the meeting. This fosters a sense of presence and awareness of who is currently available to engage.
- `Listen for Error Events`: It is crucial to monitor error events and notify users promptly when issues arise. Clear communication about errors can help users troubleshoot and address problems quickly, minimizing disruptions to the meeting.

### Handling Media Devices

Providing seamless control over devices enhances user convenience and allows participants to adjust their setup for the best meeting experience. Proper device management within the UI also helps users stay informed about their current settings and troubleshoot issues effectively.

Recommended Practices:

- `Allow Device Switching`: Provide users with the option to switch between available microphone and camera devices during the meeting. This flexibility is essential, especially if users want to adjust their setup mid-call.
- `Display Selected Devices`: Ensure the UI shows users which microphone and camera devices are currently selected. Clear device labeling in the interface can reduce confusion and help users verify their setup at a glance. A combined sketch of the event and device-handling points above follows this list.
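The following is a minimal, hedged Kotlin sketch tying the event-notification and device-switching recommendations together. It assumes a joined `Meeting` instance named `meeting`, a `showToast()` helper, and a `switchCameraButton` in your layout; the callback names follow the `MeetingEventListener` reference used elsewhere in this guide, but verify them against the SDK version you use.

```kotlin
// Hedged sketch: surface key meeting events to users and let them flip the camera.
private val uxEventListener = object : MeetingEventListener() {
    override fun onParticipantJoined(participant: Participant) {
        showToast("${participant.displayName} joined the meeting")
    }

    override fun onParticipantLeft(participant: Participant) {
        showToast("${participant.displayName} left the meeting")
    }

    override fun onRecordingStateChanged(recordingState: String) {
        // Update the UI only when the event confirms the state change.
        if (recordingState == "RECORDING_STARTED") {
            showToast("This meeting is being recorded")
        }
    }
}

// Register the listener once the meeting object is available.
meeting.addEventListener(uxEventListener)

// Device switching: toggle between the front and back camera on user request.
switchCameraButton.setOnClickListener { meeting.changeWebcam() }
```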
### Monitoring Real-Time Participant Statistics

Providing real-time insights into stream quality allows participants to monitor and optimize their connection for the best experience. With detailed metrics on video, audio, and screen sharing, users can assess and troubleshoot quality issues, ensuring smooth and uninterrupted meetings.

To display these statistics, you can use the [getVideoStats()](https://docs.videosdk.live/android/api/sdk-reference/participant-class/methods#getvideostats), [getAudioStats()](https://docs.videosdk.live/android/api/sdk-reference/participant-class/methods#getaudiostats), and [getShareStats()](https://docs.videosdk.live/android/api/sdk-reference/participant-class/methods#getsharestats) methods.
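As a quick illustration, here is a hedged Kotlin sketch that simply logs the returned stats for one participant; the methods come from the participant reference linked above, but the exact shape of the returned data may vary by SDK version.

```kotlin
// Hedged sketch: log the current stats for a participant (Participant comes from the
// VideoSDK Android SDK). String interpolation just calls toString() on the results.
fun logParticipantStats(participant: Participant) {
    Log.d("Stats", "Video stats: ${participant.getVideoStats()}")
    Log.d("Stats", "Audio stats: ${participant.getAudioStats()}")
    Log.d("Stats", "Share stats: ${participant.getShareStats()}")
}
```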

:::note
To show the popup dialog for the participant's realtime stats, you can [refer to this function](https://github.com/videosdk-live/videosdk-rtc-android-kotlin-sdk-example/blob/main/app/src/main/java/live/videosdk/rtc/android/kotlin/Common/Utils/HelperClass.kt#L91).
:::

---

---
sidebar_label: Face Match API
pagination_label: Face Match API
---

# Face Match API

import FaceMatchApi from '../../../mdx/_api-face-match.mdx';

---

---
sidebar_label: Face Spoof Detection API
pagination_label: Face Spoof Detection API
---

# Face Spoof Detection

import FaceSpoofDetection from '../../../mdx/_api-spoof-detection.mdx';

---

---
sidebar_label: Number of Face Detection API
pagination_label: Number of Face Detection API
---

# Number of Face Detection

import NoOfFaceDetectionApi from '../../../mdx/_api-number-of-face-detection.mdx';

---

---
sidebar_label: OCR API
pagination_label: OCR API
---

# OCR API

import OcrApi from '../../../mdx/_api-ocr.mdx';

---

---
title: Customized Live Stream
sidebar_position: 1
sidebar_label: Customized Live Stream
hide_table_of_contents: false
---

# Customized Live Stream - Android

VideoSDK is a platform that offers a range of video streaming tools and solutions for content creators, publishers, and developers.

### Custom Template

- A custom template is a template for live streams that allows users to add real-time graphics to their streams.
- With custom templates, users can create unique and engaging video experiences by overlaying graphics, text, images, and animations onto their live streams. These graphics can be customized to match their branding.
- Custom templates enable users to create engaging video content with real-time graphics. With live scoreboards, social media feeds, and other customizations, users can easily create unique and visually appealing streams that stand out from the crowd.

:::note
Custom templates can be used with the recording and RTMP services provided by VideoSDK as well.
:::

### What you can do with Custom Template

Using a custom template, you can create a variety of modes. Here are a few of the more well-known modes you can create:

- **`PK Host:`** The host can organize a player-vs-player battle. The image below shows an example of a gaming battle.
- **`Watermark:`** The host can add and update a watermark anywhere in the template. In the image below, we have added the VideoSDK watermark to the top-right corner of the screen.
- **`News Mode:`** The host can add dynamic text in a lower-third banner. In the image below, we have added sample text to the bottom left of the screen.

![Mobile Custom Template](https://cdn.videosdk.live/website-resources/docs-resources/mobile_custom_template.png)

### Custom template with VideoSDK

In this section, we will discuss how Custom Templates work with VideoSDK.

- **`Host`**: The host is responsible for starting the live stream by passing the `templateURL`. The `templateURL` is the URL of the hosted template webpage. The host is also responsible for managing the template, such as changing text, logos, and switching template layout, among other things.
- **`VideoSDK Template Engine`**: The VideoSDK Template Engine accepts and opens the `templateURL` in the browser. It listens to all the events performed by the Host and enables customization of the template according to the Host's preferences.
- **`Viewer`**: The viewer can stream the content. They can watch the live stream with the added real-time graphics, which makes for a unique and engaging viewing experience.
![custom template](https://cdn.videosdk.live/website-resources/docs-resources/custom_template.png)

### Understanding Template URL

The template URL is the webpage that the VideoSDK Template Engine will open while composing the live stream. The template URL will appear as shown below.

![template url](https://cdn.videosdk.live/website-resources/docs-resources/custom_template_url.png)

The Template URL consists of two parts:

- Your actual page URL, which will look something like `https://example.com/videosdk-template`.
- Query parameters, which will allow the VideoSDK Template Engine to join the meeting when the URL is opened.

There are a total of three query parameters:

- `token`: This will be your token, which will be used to join the meeting.
- `meetingId`: This will be the meeting ID that will be joined by the VideoSDK Template Engine.
- `participantId`: This will be the participant ID of the VideoSDK Template Engine, which should be passed while joining the template engine in your template so that the template engine participant is not visible to other participants. **This parameter will be added by VideoSDK.**

:::info
The above-mentioned query parameters are mandatory. Apart from these, you can pass any other extra parameters required for your use case.
:::

### **Creating Template**

**`Step 1:`** Create a new React app using the command below:

```bash
npx create-react-app videosdk-custom-template
```

:::note
You can use VideoSDK's React or JavaScript SDK to create a custom template. The following is an example of building a custom template with the React SDK.
:::

**`Step 2:`** Install VideoSDK using the npm command below. Make sure you are in your React app directory before you run this command.

```bash
npm install "@videosdk.live/react-sdk"

# For the participants' video
npm install "react-player"
```

###### App Architecture

![template architecture](https://cdn.videosdk.live/website-resources/docs-resources/custom_template_arch.png)

###### Structure of the Project

```jsx title="Project Structure"
root
├── node_modules
├── public
├── src
│   ├── components
│   │   ├── MeetingContainer.js
│   │   ├── ParticipantsAudioPlayer.js
│   │   ├── ParticipantsView.js
│   │   ├── Notification.js
│   ├── icons
│   ├── App.js
│   ├── index.js
├── package.json
.
```

**`Step 3:`** Next, we will fetch the query parameters from the URL, which we will later use to initialize the meeting.

```js title=App.js
function App() {
  const { meetingId, token, participantId } = useMemo(() => {
    //highlight-start
    const location = window.location;
    const urlParams = new URLSearchParams(location.search);

    const paramKeys = {
      meetingId: "meetingId",
      token: "token",
      participantId: "participantId",
    };

    Object.keys(paramKeys).forEach((key) => {
      paramKeys[key] = urlParams.get(key)
        ? decodeURIComponent(urlParams.get(key))
        : null;
    });

    return paramKeys;
    //highlight-end
  }, []);
}
```

**`Step 4:`** Now we will initialize the meeting with the parameters extracted from the URL. Make sure `joinWithoutUserInteraction` is specified, so that the template engine can join the meeting directly on page load.

```js title=App.js
function App() {
  //highlight-next-line
  ...

  // Note: the JSX markup was lost in this copy of the doc; the MeetingProvider usage
  // below is a typical reconstruction with illustrative config values.
  return meetingId && token && participantId ? (
    <MeetingProvider
      config={{
        meetingId,
        micEnabled: true,
        webcamEnabled: false,
        name: "Template Engine",
        participantId,
      }}
      token={token}
      joinWithoutUserInteraction
    >
      {/* We will create this in upcoming steps */}
      <MeetingContainer />
    </MeetingProvider>
  ) : null;
}
```

**`Step 5:`** Let us create the `MeetingContainer`, which will render the meeting view for us.

- It will also listen to the PubSub messages from the `CHANGE_BACKGROUND` topic, which will change the background color of the meeting.
- It will have a `Notification` component which will show any messages shared by the Host.

:::note
We will be using the PubSub mechanism to communicate with the template. You can learn more about [PubSub from here](../video-and-audio-calling-api-sdk/collaboration-in-meeting/pubsub).
:::

```js title=MeetingContainer.js
import { Constants, useMeeting, usePubSub } from "@videosdk.live/react-sdk";
import { Notification } from "./Notification";
import { ParticipantsAudioPlayer } from "./ParticipantsAudioPlayer";
import { ParticipantView } from "./ParticipantView";

export const MeetingContainer = () => {
  const { isMeetingJoined, participants, localParticipant } = useMeeting();
  //highlight-next-line
  const { messages } = usePubSub("CHANGE_BACKGROUND");

  const remoteSpeakers = [...participants.values()].filter((participant) => {
    return (
      participant.mode == Constants.modes.SEND_AND_RECV && !participant.local
    );
  });

  return isMeetingJoined ? (
    // Note: the JSX markup was lost in this copy; the structure and inline styles
    // below are an illustrative reconstruction.
    <div
      style={{
        height: "100vh",
        //highlight-start
        backgroundColor:
          messages.length > 0
            ? messages.at(messages.length - 1).message
            : "#fff",
        //highlight-end
      }}
    >
      //highlight-next-line
      <ParticipantsAudioPlayer />
      <div
        style={{
          display: "grid",
          gridTemplateColumns: remoteSpeakers.length > 1 ? "1fr 1fr" : "1fr",
          flex: 1,
          maxHeight: `100vh`,
          overflowY: "auto",
          gap: "20px",
          padding: "20px",
          alignItems: "center",
          justifyItems: "center",
        }}
      >
        {[...remoteSpeakers].map((participant) => {
          return (
            //highlight-start
            <ParticipantView
              key={participant.id}
              participantId={participant.id}
            />
            //highlight-end
          );
        })}
      </div>
      //highlight-next-line
      <Notification />
    </div>
  ) : (
    <></>
  );
};
```

**`Step 6:`** Let us create the `ParticipantView` and `ParticipantsAudioPlayer`, which will render the video and audio of the participants respectively.

```js title=ParticipantView.js
import { useParticipant } from "@videosdk.live/react-sdk";
import { useMemo } from "react";
import ReactPlayer from "react-player";
import MicOffIcon from "../icons/MicOffIcon";

export const ParticipantView = (props) => {
  const { webcamStream, webcamOn, displayName, micOn } = useParticipant(
    props.participantId
  );

  const videoStream = useMemo(() => {
    if (webcamOn && webcamStream) {
      const mediaStream = new MediaStream();
      mediaStream.addTrack(webcamStream.track);
      return mediaStream;
    }
  }, [webcamStream, webcamOn]);

  // Note: the original JSX markup was lost in this copy; the container structure and
  // inline styles below are an illustrative reconstruction.
  return (
    <div
      style={{
        position: "relative",
        height: "300px",
        width: "100%",
        overflow: "hidden",
      }}
    >
      {webcamOn && webcamStream ? (
        <ReactPlayer
          playsinline
          playing={true}
          muted={true}
          controls={false}
          url={videoStream}
          height={"100%"}
          width={"100%"}
          onError={(err) => {
            console.log(err, "participant video error");
          }}
        />
      ) : (
        <div
          style={{
            display: "flex",
            height: "100%",
            alignItems: "center",
            justifyContent: "center",
          }}
        >
          {String(displayName).charAt(0).toUpperCase()}
        </div>
      )}
      <div style={{ position: "absolute", bottom: 8, left: 8 }}>
        {displayName}{" "}
        {!micOn && <MicOffIcon />}
      </div>
    </div>
  );
};
```

```js title=ParticipantsAudioPlayer.js
import { useMeeting, useParticipant } from "@videosdk.live/react-sdk";
import { useEffect, useRef } from "react";

const ParticipantAudio = ({ participantId }) => {
  const { micOn, micStream, isLocal } = useParticipant(participantId);
  const audioPlayer = useRef();

  useEffect(() => {
    if (!isLocal && audioPlayer.current && micOn && micStream) {
      const mediaStream = new MediaStream();
      mediaStream.addTrack(micStream.track);
      audioPlayer.current.srcObject = mediaStream;
      audioPlayer.current.play().catch((err) => {});
    } else if (audioPlayer.current) {
      audioPlayer.current.srcObject = null;
    }
  }, [micStream, micOn, isLocal, participantId]);

  return