
AI Agent with React Native - Quick Start

VideoSDK empowers you to integrate an AI voice agent into your React Native app (Android/iOS) within minutes. The agent joins the same meeting room and interacts over voice using the Google Gemini Live API.

Prerequisites

  • VideoSDK Developer Account (get token from the dashboard)
  • Node.js and a working React Native environment (Android Studio and/or Xcode)
  • Python 3.12+
  • Google API Key with Gemini Live API access
important

You need a VideoSDK account to generate a token and a Google API key for the Gemini Live API. Visit the VideoSDK dashboard to generate the token and Google AI Studio to create the API key.
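If you don't yet have a meeting ID, you can create one with VideoSDK's REST API using your auth token. Below is a minimal Python sketch (standard library only); it assumes the `v2/rooms` room-creation endpoint, and the helper names are illustrative, not part of any SDK:

```python
import json
import urllib.request

API_URL = "https://api.videosdk.live/v2/rooms"  # VideoSDK room-creation endpoint

def auth_headers(token: str) -> dict:
    # VideoSDK expects the raw token in the Authorization header.
    return {"Authorization": token, "Content-Type": "application/json"}

def create_meeting(token: str) -> str:
    # POST with an empty JSON body; the response contains the new roomId.
    req = urllib.request.Request(API_URL, data=b"{}", headers=auth_headers(token), method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["roomId"]

# Usage (requires a valid token):
#   meeting_id = create_meeting("YOUR_VIDEOSDK_AUTH_TOKEN")
```

The returned `roomId` is what goes into `constants.js` on the frontend and `RoomOptions` in the agent.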

Project Structure

First, create an empty folder (e.g., mkdir folder_name) at your preferred location for the React Native frontend. Your final project structure should look like this:

Directory Structure
  root
├── android/
├── ios/
├── App.js
├── constants.js
├── index.js
├── agent-react-native.py
└── .env

You will work on:

  • android/: Contains the Android-specific project files.
  • ios/: Contains the iOS-specific project files.
  • App.js: The main React Native component, containing the UI and meeting logic.
  • constants.js: To store token and meetingId for the frontend.
  • index.js: The entry point of the React Native application, where VideoSDK is registered.
  • agent-react-native.py: The Python agent that joins the meeting.
  • .env: Environment variables file for the Python agent (stores API keys).

1. Building the React Native Frontend

Step 1: Create App and Install SDKs

Create a React Native app and install the VideoSDK RN SDK:

npx react-native init videosdkAiAgentRN
cd videosdkAiAgentRN

# Install VideoSDK
npm install "@videosdk.live/react-native-sdk"

Step 2: Configure the Project

Android Setup

android/app/src/main/AndroidManifest.xml
<manifest
  xmlns:android="http://schemas.android.com/apk/res/android">

  <uses-permission android:name="android.permission.INTERNET" />
  <uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />
  <uses-permission
    android:name="android.permission.BLUETOOTH"
    android:maxSdkVersion="30" />
  <uses-permission
    android:name="android.permission.BLUETOOTH_ADMIN"
    android:maxSdkVersion="30" />
  <uses-permission android:name="android.permission.BLUETOOTH_CONNECT" />

  <uses-permission android:name="android.permission.CAMERA" />
  <uses-permission android:name="android.permission.MODIFY_AUDIO_SETTINGS" />
  <uses-permission android:name="android.permission.RECORD_AUDIO" />
  <uses-permission android:name="android.permission.WAKE_LOCK" />
</manifest>
android/app/build.gradle
dependencies {
    implementation project(':rnwebrtc')
}
android/settings.gradle
include ':rnwebrtc'
project(':rnwebrtc').projectDir = new File(rootProject.projectDir, '../node_modules/@videosdk.live/react-native-webrtc/android')
MainApplication.kt
import live.videosdk.rnwebrtc.WebRTCModulePackage

class MainApplication : Application(), ReactApplication {
    override val reactNativeHost: ReactNativeHost =
        object : DefaultReactNativeHost(this) {
            override fun getPackages(): List<ReactPackage> {
                val packages = PackageList(this).packages.toMutableList()
                packages.add(WebRTCModulePackage())
                return packages
            }
            // ...
        }
}
android/gradle.properties
# Fixes a WebRTC runtime problem on some devices.
android.enableDexingArtifactTransform.desugaring=false
android/build.gradle
buildscript {
    ext {
        minSdkVersion = 23
    }
}

iOS Setup

To update CocoaPods, you can reinstall the gem using the following command:

sudo gem install cocoapods
ios/Podfile
pod 'react-native-webrtc', :path => '../node_modules/@videosdk.live/react-native-webrtc'

You need to change the platform field in the Podfile to 12.0 or above, because react-native-webrtc doesn't support iOS versions earlier than 12.0. Update the line: platform :ios, '12.0'.

After updating the version, you need to install the pods by running the following command:

pod install

Add the following keys to your Info.plist file, located at ios/<project name>/Info.plist:

ios/MyApp/Info.plist
<key>NSCameraUsageDescription</key>
<string>Camera permission description</string>
<key>NSMicrophoneUsageDescription</key>
<string>Microphone permission description</string>

Step 3: Register Service and Configure

Register the VideoSDK service in your root index.js file so the SDK is initialized before the app starts.

index.js
import { AppRegistry } from "react-native";
import App from "./App";
import { name as appName } from "./app.json";
import { register } from "@videosdk.live/react-native-sdk";

register();

AppRegistry.registerComponent(appName, () => App);

Create a constants.js file to store your token and meeting ID.

constants.js
export const token = "YOUR_VIDEOSDK_AUTH_TOKEN";
export const meetingId = "YOUR_MEETING_ID";
export const name = "User Name";

Step 4: Build UI and wire up MeetingProvider

App.js
import React from 'react';
import {
  SafeAreaView,
  TouchableOpacity,
  Text,
  View,
  FlatList,
} from 'react-native';
import {
  MeetingProvider,
  useMeeting,
} from '@videosdk.live/react-native-sdk';
import { meetingId, token, name } from './constants';

const Button = ({ onPress, buttonText, backgroundColor }) => {
  return (
    <TouchableOpacity
      onPress={onPress}
      style={{
        backgroundColor: backgroundColor,
        justifyContent: 'center',
        alignItems: 'center',
        padding: 12,
        borderRadius: 4,
      }}>
      <Text style={{ color: 'white', fontSize: 12 }}>{buttonText}</Text>
    </TouchableOpacity>
  );
};

function ControlsContainer({ join, leave, toggleMic }) {
  return (
    <View
      style={{
        padding: 24,
        flexDirection: 'row',
        justifyContent: 'space-between',
      }}>
      <Button
        onPress={() => {
          join();
        }}
        buttonText={'Join'}
        backgroundColor={'#1178F8'}
      />
      <Button
        onPress={() => {
          toggleMic();
        }}
        buttonText={'Toggle Mic'}
        backgroundColor={'#1178F8'}
      />
      <Button
        onPress={() => {
          leave();
        }}
        buttonText={'Leave'}
        backgroundColor={'#FF0000'}
      />
    </View>
  );
}

function ParticipantView({ participantDisplayName }) {
  return (
    <View
      style={{
        backgroundColor: 'grey',
        height: 300,
        justifyContent: 'center',
        alignItems: 'center',
        marginVertical: 8,
        marginHorizontal: 8,
      }}>
      <Text style={{ fontSize: 16 }}>Participant: {participantDisplayName}</Text>
    </View>
  );
}

function ParticipantList({ participants }) {
  return participants.length > 0 ? (
    <FlatList
      data={participants}
      renderItem={({ item }) => {
        return <ParticipantView participantDisplayName={item.displayName} />;
      }}
    />
  ) : (
    <View
      style={{
        flex: 1,
        backgroundColor: '#F6F6FF',
        justifyContent: 'center',
        alignItems: 'center',
      }}>
      <Text style={{ fontSize: 20 }}>Press Join button to enter meeting.</Text>
    </View>
  );
}

function MeetingView() {
  const { join, leave, toggleMic, participants, meetingId } = useMeeting({});

  const participantsList = [...participants.values()].map(participant => ({
    displayName: participant.displayName,
  }));

  return (
    <View style={{ flex: 1 }}>
      {meetingId ? (
        <Text style={{ fontSize: 18, padding: 12 }}>Meeting Id : {meetingId}</Text>
      ) : null}
      <ParticipantList participants={participantsList} />
      <ControlsContainer
        join={join}
        leave={leave}
        toggleMic={toggleMic}
      />
    </View>
  );
}

export default function App() {
  if (!meetingId || !token) {
    return (
      <SafeAreaView style={{ flex: 1, backgroundColor: '#F6F6FF' }}>
        <View style={{ flex: 1, justifyContent: 'center', alignItems: 'center' }}>
          <Text style={{ fontSize: 20, textAlign: 'center' }}>
            Please add a valid Meeting ID and Token in the `constants.js` file.
          </Text>
        </View>
      </SafeAreaView>
    );
  }

  return (
    <SafeAreaView style={{ flex: 1, backgroundColor: '#F6F6FF' }}>
      <MeetingProvider
        config={{
          meetingId,
          micEnabled: true,
          webcamEnabled: false,
          name,
        }}
        token={token}>
        <MeetingView />
      </MeetingProvider>
    </SafeAreaView>
  );
}

2. Building the Python Agent

Step 1: Configure the Agent

Create a .env file to store your API keys securely for the Python agent:

.env
# Google API Key for Gemini Live API
GOOGLE_API_KEY=your_google_api_key_here

# VideoSDK Authentication Token
VIDEOSDK_AUTH_TOKEN=your_videosdk_auth_token_here
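Before launching the agent, it can help to confirm both keys are actually visible to the process, since a missing or blank variable tends to surface later as an opaque auth error. A minimal sketch (the `missing_keys` helper is hypothetical, not part of the SDK):

```python
REQUIRED_KEYS = ("GOOGLE_API_KEY", "VIDEOSDK_AUTH_TOKEN")

def missing_keys(env) -> list:
    # Return the required keys that are absent or blank in the given mapping.
    return [k for k in REQUIRED_KEYS if not str(env.get(k) or "").strip()]

# Usage:
#   import os
#   if missing_keys(os.environ):
#       raise SystemExit(f"Missing environment variables: {missing_keys(os.environ)}")
```

Pass `dict(os.environ)` (or any mapping loaded from your .env file) to get back the list of keys still to be set.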

Step 2: Create the Python Agent

agent-react-native.py
from videosdk.agents import Agent, AgentSession, RealTimePipeline, JobContext, RoomOptions, WorkerJob
from videosdk.plugins.google import GeminiRealtime, GeminiLiveConfig
import logging

logging.getLogger().setLevel(logging.INFO)

class MyVoiceAgent(Agent):
    def __init__(self):
        super().__init__(
            instructions="You are a high-energy game-show host guiding the caller to guess a secret number from 1 to 100 to win $1,000,000.",
        )

    async def on_enter(self) -> None:
        await self.session.say("Welcome to VideoSDK's AI Agent game show! I'm your host, and we're about to play for $1,000,000. Are you ready to play?")

    async def on_exit(self) -> None:
        await self.session.say("Goodbye!")

async def start_session(context: JobContext):
    agent = MyVoiceAgent()
    model = GeminiRealtime(
        model="gemini-2.0-flash-live-001",
        # When GOOGLE_API_KEY is set in .env, DON'T pass the api_key parameter
        # api_key="AIXXXXXXXXXXXXXXXXXXXX",
        config=GeminiLiveConfig(
            voice="Leda",  # Options: Puck, Charon, Kore, Fenrir, Aoede, Leda, Orus, and Zephyr
            response_modalities=["AUDIO"]
        )
    )

    pipeline = RealTimePipeline(model=model)
    session = AgentSession(
        agent=agent,
        pipeline=pipeline
    )

    def on_transcription(data: dict):
        role = data.get("role")
        text = data.get("text")
        print(f"[TRANSCRIPT][{role}]: {text}")

    pipeline.on("realtime_model_transcription", on_transcription)

    await context.run_until_shutdown(session=session, wait_for_participant=True)

def make_context() -> JobContext:
    room_options = RoomOptions(
        # Static meeting ID - same as used in the frontend (constants.js)
        room_id="YOUR_MEETING_ID",  # Replace with your actual room_id
        name="Gemini Agent",
        playground=True,
    )

    return JobContext(room_options=room_options)

if __name__ == "__main__":
    job = WorkerJob(entrypoint=start_session, jobctx=make_context)
    job.start()

3. Run the Application

1) Start the React Native app

npm install

# Android
npm run android

# iOS (macOS only)
cd ios && pod install && cd ..
npm run ios

2) Start the Python Agent

pip install videosdk-agents
pip install "videosdk-plugins-google"

python agent-react-native.py

3) Connect and interact

  1. Join the meeting from the app and allow microphone permissions.
  2. When you join, the Python agent detects your participation and starts speaking.
  3. Talk to the agent in real time and play the number guessing game.

Troubleshooting

  • Ensure the same room_id is set in both the RN app (constants.js) and the agent's RoomOptions.
  • Verify microphone and camera permissions on the device/simulator.
  • Confirm your VideoSDK token is valid and Google API key is set.
  • If audio is silent, check the device output volume and whether the agent is running in playground mode (playground=True in RoomOptions).

Next Steps

Clone the repo for a quick implementation.

Got a question? Ask us on Discord.