AI Agent with Flutter - Quick Start

VideoSDK empowers you to seamlessly integrate AI agents with real-time voice interaction into your Flutter application within minutes.

In this quickstart, you'll learn how to create an AI agent that joins a Flutter meeting room and interacts with users through voice using the Google Gemini Live API.

Prerequisites

Before proceeding, ensure that your development environment meets the following requirements:

  • A VideoSDK developer account (if you don't have one, sign up via the VideoSDK Dashboard)
  • Flutter and Python 3.12+ installed on your device
  • Google API Key with Gemini Live API access
important

You need a VideoSDK auth token and a Google API key for the Gemini Live API. Visit the VideoSDK dashboard to generate the token and Google AI Studio to create the API key.

Project Structure

Your project structure should look like this:

Project Structure
root
├── android
├── ios
├── lib
│   ├── api_call.dart
│   ├── join_screen.dart
│   ├── main.dart
│   ├── meeting_controls.dart
│   ├── meeting_screen.dart
│   └── participant_tile.dart
├── macos
├── web
├── windows
├── agent-flutter.py
└── .env

You will be working on the following files:

  • join_screen.dart: Responsible for the user interface to join a meeting.
  • meeting_screen.dart: Displays the meeting interface and handles meeting logic.
  • api_call.dart: Handles API calls for creating meetings.
  • agent-flutter.py: The Python AI agent backend using Google Gemini Live API.
  • .env: For storing API keys.

1. Flutter Frontend

Step 1: Getting Started

Follow these steps to create the environment necessary to add AI agent functionality to your app.

Create a New Flutter App

Create a new Flutter app using the following command:

$ flutter create videosdk_ai_agent_flutter_app

Install VideoSDK

Install the VideoSDK and HTTP packages using the following commands. Make sure you are in your Flutter app directory before running them.

$ flutter pub add videosdk
$ flutter pub add http

Step 2: Configure Project

For Android

  • Update /android/app/src/main/AndroidManifest.xml with the permissions required to implement the audio and video features.
android/app/src/main/AndroidManifest.xml
<uses-feature android:name="android.hardware.camera" />
<uses-feature android:name="android.hardware.camera.autofocus" />
<uses-permission android:name="android.permission.CAMERA" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />
<uses-permission android:name="android.permission.CHANGE_NETWORK_STATE" />
<uses-permission android:name="android.permission.MODIFY_AUDIO_SETTINGS" />
<uses-permission android:name="android.permission.INTERNET" />
  • If necessary, increase the minSdkVersion of defaultConfig in build.gradle to 23 (the default Flutter generator currently sets it to 16), as in the sketch below.
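
A minimal sketch of that change, assuming the classic Groovy android/app/build.gradle layout (newer Flutter templates use build.gradle.kts with a flutter.minSdkVersion placeholder, but the idea is the same):

android/app/build.gradle
android {
    defaultConfig {
        // Raise the minimum Android API level to 23, as required above
        minSdkVersion 23
    }
}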

For iOS

  • Add the following entries, which allow your app to access the camera and microphone, to your /ios/Runner/Info.plist file:
/ios/Runner/Info.plist
<key>NSCameraUsageDescription</key>
<string>$(PRODUCT_NAME) Camera Usage!</string>
<key>NSMicrophoneUsageDescription</key>
<string>$(PRODUCT_NAME) Microphone Usage!</string>
  • Uncomment the following line to define a global platform for your project in /ios/Podfile:
/ios/Podfile
platform :ios, '12.0'

For macOS

  • Add the following entries to your /macos/Runner/Info.plist file, which allow your app to access the camera and microphone.
/macos/Runner/Info.plist
<key>NSCameraUsageDescription</key>
<string>$(PRODUCT_NAME) Camera Usage!</string>
<key>NSMicrophoneUsageDescription</key>
<string>$(PRODUCT_NAME) Microphone Usage!</string>
  • Add the following entries to your /macos/Runner/DebugProfile.entitlements file, which allow your app to access the camera and microphone and open outgoing network connections.
/macos/Runner/DebugProfile.entitlements
<key>com.apple.security.network.client</key>
<true/>
<key>com.apple.security.device.camera</key>
<true/>
<key>com.apple.security.device.microphone</key>
<true/>
  • Add the following entries to your /macos/Runner/Release.entitlements file, which allow your app to access the camera and microphone and open network connections.
/macos/Runner/Release.entitlements
<key>com.apple.security.network.server</key>
<true/>
<key>com.apple.security.network.client</key>
<true/>
<key>com.apple.security.device.camera</key>
<true/>
<key>com.apple.security.device.microphone</key>
<true/>

Step 3: Configure Environment and Credentials

Create a meeting room using the VideoSDK API:

curl -X POST https://api.videosdk.live/v2/rooms \
  -H "Authorization: YOUR_JWT_TOKEN_HERE" \
  -H "Content-Type: application/json"

Copy the roomId from the response; you will use it as YOUR_MEETING_ID in lib/join_screen.dart (and later in agent-flutter.py). Set your auth token in lib/api_call.dart.

lib/api_call.dart
import 'dart:convert';
import 'package:http/http.dart' as http;

// Auth token used to create a meeting and connect to it
const token = 'YOUR_VIDEOSDK_AUTH_TOKEN';

// API call to create a meeting
Future<String> createMeeting() async {
  final http.Response httpResponse = await http.post(
    Uri.parse('https://api.videosdk.live/v2/rooms'),
    headers: {'Authorization': token},
  );

  // Extract the roomId from the response
  return json.decode(httpResponse.body)['roomId'];
}
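
The createMeeting above assumes the request succeeds. A hedged variant (hypothetical hardening, not part of the quickstart) could check the status code before decoding:

// Hypothetical defensive version of createMeeting
Future<String> createMeeting() async {
  final response = await http.post(
    Uri.parse('https://api.videosdk.live/v2/rooms'),
    headers: {'Authorization': token},
  );
  // Fail fast on auth or network errors instead of decoding an error body
  if (response.statusCode != 200) {
    throw Exception('Failed to create meeting: ${response.statusCode}');
  }
  return json.decode(response.body)['roomId'] as String;
}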
lib/join_screen.dart
import 'package:flutter/material.dart';
import 'api_call.dart';
import 'meeting_screen.dart';

class JoinScreen extends StatelessWidget {
  // Controller for an optional meeting-ID text field (not wired up in this quickstart)
  final _meetingIdController = TextEditingController();

  JoinScreen({super.key});

  void onCreateButtonPressed(BuildContext context) async {
    // Call the API to create a meeting, then navigate to MeetingScreen with meetingId and token
    await createMeeting().then((meetingId) {
      if (!context.mounted) return;
      Navigator.of(context).push(
        MaterialPageRoute(
          builder: (context) =>
              MeetingScreen(meetingId: meetingId, token: token),
        ),
      );
    });
  }

  void onJoinButtonPressed(BuildContext context) {
    // Check that the meeting ID is not null or invalid;
    // if it is valid, navigate to MeetingScreen with meetingId and token
    Navigator.of(context).push(
      MaterialPageRoute(
        builder: (context) =>
            MeetingScreen(meetingId: "YOUR_MEETING_ID", token: token),
      ),
    );
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(title: const Text('VideoSDK QuickStart')),
      body: Padding(
        padding: const EdgeInsets.all(12.0),
        child: Center(
          child: ElevatedButton(
            onPressed: () => onJoinButtonPressed(context),
            child: const Text('Join Meeting'),
          ),
        ),
      ),
    );
  }
}

Step 4: Design the User Interface (UI)

Create the main MeetingScreen widget with audio-only interaction in lib/meeting_screen.dart:

lib/meeting_screen.dart
import 'package:flutter/material.dart';
import 'package:videosdk/videosdk.dart';
import 'participant_tile.dart';
import 'meeting_controls.dart';

class MeetingScreen extends StatefulWidget {
  final String meetingId;
  final String token;

  const MeetingScreen({
    super.key,
    required this.meetingId,
    required this.token,
  });

  @override
  State<MeetingScreen> createState() => _MeetingScreenState();
}

class _MeetingScreenState extends State<MeetingScreen> {
  late Room _room;
  var micEnabled = true;
  var camEnabled = true;

  Map<String, Participant> participants = {};

  @override
  void initState() {
    // Create the room
    _room = VideoSDK.createRoom(
      roomId: widget.meetingId,
      token: widget.token,
      displayName: "John Doe",
      micEnabled: micEnabled,
      camEnabled: false, // audio-only interaction: keep the camera off
      defaultCameraIndex: 1, // index of MediaDevices used as the default camera
    );

    setMeetingEventListener();

    // Join the room
    _room.join();

    super.initState();
  }

  // Listen to meeting events
  void setMeetingEventListener() {
    _room.on(Events.roomJoined, () {
      setState(() {
        participants.putIfAbsent(
          _room.localParticipant.id,
          () => _room.localParticipant,
        );
      });
    });

    _room.on(Events.participantJoined, (Participant participant) {
      setState(
        () => participants.putIfAbsent(participant.id, () => participant),
      );
    });

    _room.on(Events.participantLeft, (String participantId) {
      if (participants.containsKey(participantId)) {
        setState(() => participants.remove(participantId));
      }
    });

    _room.on(Events.roomLeft, () {
      participants.clear();
      Navigator.popUntil(context, ModalRoute.withName('/'));
    });
  }

  // On back button press, leave the room
  Future<bool> _onWillPop() async {
    _room.leave();
    return true;
  }

  @override
  Widget build(BuildContext context) {
    return WillPopScope(
      onWillPop: () => _onWillPop(),
      child: Scaffold(
        appBar: AppBar(title: const Text('VideoSDK QuickStart')),
        body: Padding(
          padding: const EdgeInsets.all(8.0),
          child: Column(
            children: [
              Text(widget.meetingId),
              // Render all participants
              Expanded(
                child: Padding(
                  padding: const EdgeInsets.all(8.0),
                  child: GridView.builder(
                    gridDelegate:
                        const SliverGridDelegateWithFixedCrossAxisCount(
                      crossAxisCount: 2,
                      crossAxisSpacing: 10,
                      mainAxisSpacing: 10,
                      mainAxisExtent: 300,
                    ),
                    itemBuilder: (context, index) {
                      return ParticipantTile(
                        key: Key(participants.values.elementAt(index).id),
                        participant: participants.values.elementAt(index),
                      );
                    },
                    itemCount: participants.length,
                  ),
                ),
              ),
              MeetingControls(
                onToggleMicButtonPressed: () {
                  micEnabled ? _room.muteMic() : _room.unmuteMic();
                  micEnabled = !micEnabled;
                },
                onLeaveButtonPressed: () => _room.leave(),
              ),
            ],
          ),
        ),
      ),
    );
  }
}
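
Note: on recent Flutter releases, WillPopScope is deprecated in favor of PopScope. A rough equivalent sketch, assuming Flutter 3.12 or later (the Scaffold subtree stays unchanged):

// Inside build(), replacing the WillPopScope wrapper
return PopScope(
  canPop: true,
  onPopInvoked: (didPop) {
    // Leave the room when the user navigates back
    if (didPop) _room.leave();
  },
  child: Scaffold(
    // ... same Scaffold subtree as above ...
  ),
);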
lib/participant_tile.dart
import 'package:flutter/material.dart';
import 'package:videosdk/videosdk.dart';

class ParticipantTile extends StatefulWidget {
  final Participant participant;
  const ParticipantTile({super.key, required this.participant});

  @override
  State<ParticipantTile> createState() => _ParticipantTileState();
}

class _ParticipantTileState extends State<ParticipantTile> {
  late String participantName;

  @override
  void initState() {
    participantName = widget.participant.displayName;
    super.initState();
  }

  @override
  Widget build(BuildContext context) {
    return Padding(
      padding: const EdgeInsets.all(8.0),
      child: Container(
        color: Colors.grey.shade800,
        child: Center(
          child: Text(
            participantName,
            style: const TextStyle(color: Colors.white),
          ),
        ),
      ),
    );
  }
}
lib/meeting_controls.dart
import 'package:flutter/material.dart';

class MeetingControls extends StatelessWidget {
  final void Function() onToggleMicButtonPressed;
  final void Function() onLeaveButtonPressed;

  const MeetingControls({
    super.key,
    required this.onToggleMicButtonPressed,
    required this.onLeaveButtonPressed,
  });

  @override
  Widget build(BuildContext context) {
    return Row(
      mainAxisAlignment: MainAxisAlignment.spaceEvenly,
      children: [
        ElevatedButton(
          onPressed: onLeaveButtonPressed,
          child: const Text('Leave'),
        ),
        ElevatedButton(
          onPressed: onToggleMicButtonPressed,
          child: const Text('Toggle Mic'),
        ),
      ],
    );
  }
}

2. Python AI Agent

Step 1: Create Python AI Agent

Create a .env file to store your API keys securely for the Python agent:

.env
# Google API Key for Gemini Live API
GOOGLE_API_KEY=your_google_api_key_here

# VideoSDK Authentication Token
VIDEOSDK_AUTH_TOKEN=your_videosdk_auth_token_here

Create the Python AI agent that will join the same meeting room and interact with users through voice.

agent-flutter.py
from videosdk.agents import Agent, AgentSession, RealTimePipeline, JobContext, RoomOptions, WorkerJob
from videosdk.plugins.google import GeminiRealtime, GeminiLiveConfig
import logging

logging.getLogger().setLevel(logging.INFO)

class MyVoiceAgent(Agent):
    def __init__(self):
        super().__init__(
            instructions="You are a high-energy game-show host guiding the caller to guess a secret number from 1 to 100 to win $1,000,000.",
        )

    async def on_enter(self) -> None:
        await self.session.say("Welcome to VideoSDK's AI Agent game show! I'm your host, and we're about to play for $1,000,000. Are you ready to play?")

    async def on_exit(self) -> None:
        await self.session.say("Goodbye!")

async def start_session(context: JobContext):
    agent = MyVoiceAgent()
    model = GeminiRealtime(
        model="gemini-2.0-flash-live-001",
        # When GOOGLE_API_KEY is set in .env, DON'T pass the api_key parameter
        # api_key="AIXXXXXXXXXXXXXXXXXXXX",
        config=GeminiLiveConfig(
            voice="Leda",  # Options: Puck, Charon, Kore, Fenrir, Aoede, Leda, Orus, and Zephyr
            response_modalities=["AUDIO"]
        )
    )

    pipeline = RealTimePipeline(model=model)
    session = AgentSession(
        agent=agent,
        pipeline=pipeline
    )

    # Print live transcripts of both the user and the agent
    def on_transcription(data: dict):
        role = data.get("role")
        text = data.get("text")
        print(f"[TRANSCRIPT][{role}]: {text}")

    pipeline.on("realtime_model_transcription", on_transcription)

    await context.run_until_shutdown(session=session, wait_for_participant=True)

def make_context() -> JobContext:
    room_options = RoomOptions(
        # Static meeting ID - same as used in the frontend
        room_id="YOUR_MEETING_ID",  # Replace with your actual room_id
        name="Gemini Agent",
        playground=True,
    )
    return JobContext(room_options=room_options)

if __name__ == "__main__":
    job = WorkerJob(entrypoint=start_session, jobctx=make_context)
    job.start()
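
The agent reads GOOGLE_API_KEY and VIDEOSDK_AUTH_TOKEN from the environment. Depending on how you run the script, the values from .env may not be loaded automatically; if they aren't picked up, a small sketch using python-dotenv (an extra dependency, not part of the quickstart) loads them at the top of agent-flutter.py:

# pip install python-dotenv
from dotenv import load_dotenv

# Read GOOGLE_API_KEY and VIDEOSDK_AUTH_TOKEN from .env into the process env
load_dotenv()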

3. Run the Application

Step 1: Run the Frontend

Once you have completed all the steps mentioned above, start your Flutter application:

flutter run
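
If more than one device or simulator is connected, you can list the available targets and run on a specific one:

flutter devices
flutter run -d <device-id>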

Step 2: Run the AI Agent

Open a new terminal and run the Python agent:

# Install Python dependencies
pip install "videosdk-plugins-google"
pip install videosdk-agents

# Run the AI agent
python agent-flutter.py

Step 3: Connect and Interact

  1. Join the meeting from the Flutter app:

    • Click the "Join Meeting" button.
    • Allow microphone permissions when prompted.
  2. Agent connection:

    • Once you join, the Python agent will detect your participation.
    • You should see "Participant joined" in the terminal.
    • The AI agent will greet you and start the game.
  3. Start playing:

    • The agent will guide you through a number guessing game (1-100).
    • Use your microphone to interact with the AI host.
    • The agent will provide hints and encouragement throughout the game.

Troubleshooting

Common Issues:

  1. "Waiting for participant..." but no connection:

    • Ensure both the frontend and the agent are running.
    • Check that the room ID matches in both lib/join_screen.dart and agent-flutter.py.
    • Verify your VideoSDK token is valid.
  2. Audio not working:

    • Check microphone permissions on your device (or in the browser, for Flutter web).
    • Ensure your Google API key has Gemini Live API access enabled.
  3. Agent not responding:

    • Verify your Google API key is correctly set in the environment.
    • Check that the Gemini Live API is enabled in your Google Cloud Console.
  4. Flutter build issues:

    • Ensure your Flutter version is compatible.
    • Try cleaning the build: flutter clean.
    • Delete pubspec.lock and run flutter pub get.

Next Steps

Clone the repo for a quick implementation.

Got a question? Ask us on Discord.