AI Agent with Flutter - Quick Start
VideoSDK empowers you to seamlessly integrate AI agents with real-time voice interaction into your Flutter application within minutes.
In this quickstart, you'll learn how to create an AI agent that joins a Flutter meeting room and interacts with users through voice, using the Google Gemini Live API.
Prerequisites
Before proceeding, ensure that your development environment meets the following requirements:
- A VideoSDK developer account (if you don't have one, sign up via the VideoSDK Dashboard)
- Flutter and Python 3.12+ installed on your machine
- Google API Key with Gemini Live API access
You need a VideoSDK account to generate a token and a Google API key for the Gemini Live API. Visit the VideoSDK dashboard to generate the token, and Google AI Studio to create the API key.
Project Structure
Your project structure should look like this:
root
├── android
├── ios
├── lib
│   ├── api_call.dart
│   ├── join_screen.dart
│   ├── main.dart
│   ├── meeting_controls.dart
│   ├── meeting_screen.dart
│   └── participant_tile.dart
├── macos
├── web
├── windows
├── agent-flutter.py
└── .env
You will be working on the following files:
- join_screen.dart: Responsible for the user interface to join a meeting.
- meeting_screen.dart: Displays the meeting interface and handles meeting logic.
- api_call.dart: Handles API calls for creating meetings.
- agent-flutter.py: The Python AI agent backend using the Google Gemini Live API.
- .env: Stores API keys.
1. Flutter Frontend
Step 1: Getting Started
Follow these steps to create the environment necessary to add AI agent functionality to your app.
Create a New Flutter App
Create a new Flutter app using the following command:
$ flutter create videosdk_ai_agent_flutter_app
Install VideoSDK
Install the VideoSDK and http packages using the following commands. Make sure you are in your Flutter app directory before running them.
$ flutter pub add videosdk
$ flutter pub add http
Step 2: Configure Project
For Android
- Update /android/app/src/main/AndroidManifest.xml with the permissions required for the audio and video features:
<uses-feature android:name="android.hardware.camera" />
<uses-feature android:name="android.hardware.camera.autofocus" />
<uses-permission android:name="android.permission.CAMERA" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />
<uses-permission android:name="android.permission.CHANGE_NETWORK_STATE" />
<uses-permission android:name="android.permission.MODIFY_AUDIO_SETTINGS" />
<uses-permission android:name="android.permission.INTERNET"/>
- If necessary, increase the minSdkVersion of defaultConfig in build.gradle to 23 (the default Flutter template currently sets it to 16), as shown in the sketch below.
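For reference, the relevant block in /android/app/build.gradle would look something like this (a sketch; newer Flutter templates read this value from flutter.minSdkVersion instead of hardcoding it):

android {
    defaultConfig {
        // VideoSDK requires Android API level 23 or higher
        minSdkVersion 23
    }
}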
For iOS
- Add the following entries, which allow your app to access the camera and microphone, to your /ios/Runner/Info.plist file:
<key>NSCameraUsageDescription</key>
<string>$(PRODUCT_NAME) Camera Usage!</string>
<key>NSMicrophoneUsageDescription</key>
<string>$(PRODUCT_NAME) Microphone Usage!</string>
- Uncomment the following line in /ios/Podfile to define a global platform for your project:
platform :ios, '12.0'
For macOS
- Add the following entries, which allow your app to access the camera and microphone, to your /macos/Runner/Info.plist file:
<key>NSCameraUsageDescription</key>
<string>$(PRODUCT_NAME) Camera Usage!</string>
<key>NSMicrophoneUsageDescription</key>
<string>$(PRODUCT_NAME) Microphone Usage!</string>
- Add the following entries, which allow your app to access the camera and microphone and open outgoing network connections, to your /macos/Runner/DebugProfile.entitlements file:
<key>com.apple.security.network.client</key>
<true/>
<key>com.apple.security.device.camera</key>
<true/>
<key>com.apple.security.device.microphone</key>
<true/>
- Add the following entries, which allow your app to access the camera and microphone and open outgoing network connections, to your /macos/Runner/Release.entitlements file:
<key>com.apple.security.network.server</key>
<true/>
<key>com.apple.security.network.client</key>
<true/>
<key>com.apple.security.device.camera</key>
<true/>
<key>com.apple.security.device.microphone</key>
<true/>
Step 3: Configure Environment and Credentials
Create a meeting room using the VideoSDK API:
curl -X POST https://api.videosdk.live/v2/rooms \
-H "Authorization: YOUR_JWT_TOKEN_HERE" \
-H "Content-Type: application/json"
Copy the roomId from the response and configure it in lib/join_screen.dart and lib/api_call.dart.
First, set up the auth token and the meeting-creation helper in lib/api_call.dart:
import 'dart:convert';

import 'package:http/http.dart' as http;

// Auth token we will use to generate a meeting and connect to it
const token = 'YOUR_VIDEOSDK_AUTH_TOKEN';

// API call to create a meeting
Future<String> createMeeting() async {
  final http.Response httpResponse = await http.post(
    Uri.parse('https://api.videosdk.live/v2/rooms'),
    headers: {'Authorization': token},
  );

  // Destructure the roomId from the response
  return json.decode(httpResponse.body)['roomId'];
}
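The helper above assumes the request always succeeds. A minimal hardening sketch (hypothetical error handling, not part of the original quickstart):

Future<String> createMeeting() async {
  final http.Response httpResponse = await http.post(
    Uri.parse('https://api.videosdk.live/v2/rooms'),
    headers: {'Authorization': token},
  );

  // Fail loudly instead of trying to decode an error body
  if (httpResponse.statusCode != 200) {
    throw Exception('Failed to create room: ${httpResponse.statusCode}');
  }
  return json.decode(httpResponse.body)['roomId'];
}

Next, create the join screen in lib/join_screen.dart: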
import 'package:flutter/material.dart';

import 'api_call.dart';
import 'meeting_screen.dart';

class JoinScreen extends StatelessWidget {
  final _meetingIdController = TextEditingController();

  JoinScreen({super.key});

  void onCreateButtonPressed(BuildContext context) async {
    // Call the API to create a meeting, then navigate to MeetingScreen with meetingId and token
    await createMeeting().then((meetingId) {
      if (!context.mounted) return;
      Navigator.of(context).push(
        MaterialPageRoute(
          builder: (context) => MeetingScreen(meetingId: meetingId, token: token),
        ),
      );
    });
  }

  void onJoinButtonPressed(BuildContext context) {
    // Check that the meeting ID is not null or invalid;
    // if it is valid, navigate to MeetingScreen with meetingId and token
    Navigator.of(context).push(
      MaterialPageRoute(
        builder: (context) => MeetingScreen(
          // Replace with the roomId you created above
          meetingId: "YOUR_MEETING_ID",
          token: token,
        ),
      ),
    );
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(title: const Text('VideoSDK QuickStart')),
      body: Padding(
        padding: const EdgeInsets.all(12.0),
        child: Center(
          child: ElevatedButton(
            onPressed: () => onJoinButtonPressed(context),
            child: const Text('Join Meeting'),
          ),
        ),
      ),
    );
  }
}
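Note that _meetingIdController and onCreateButtonPressed are declared but never used in this minimal UI. If you'd rather let users create a room or paste a room ID instead of hardcoding it, here is a sketch of an extended build method (hypothetical, not part of the original quickstart):

@override
Widget build(BuildContext context) {
  return Scaffold(
    appBar: AppBar(title: const Text('VideoSDK QuickStart')),
    body: Padding(
      padding: const EdgeInsets.all(12.0),
      child: Column(
        mainAxisAlignment: MainAxisAlignment.center,
        children: [
          ElevatedButton(
            onPressed: () => onCreateButtonPressed(context),
            child: const Text('Create Meeting'),
          ),
          TextField(
            controller: _meetingIdController,
            decoration: const InputDecoration(hintText: 'Enter Meeting Id'),
          ),
          ElevatedButton(
            // Pass the entered ID instead of the hardcoded placeholder
            onPressed: () => Navigator.of(context).push(
              MaterialPageRoute(
                builder: (context) => MeetingScreen(
                  meetingId: _meetingIdController.text,
                  token: token,
                ),
              ),
            ),
            child: const Text('Join Meeting'),
          ),
        ],
      ),
    ),
  );
}

Keep in mind that the Python agent joins a fixed room_id, so whatever ID you enter must match the one configured in agent-flutter.py.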
Step 4: Design the User Interface (UI)
Create the main MeetingScreen component with audio-only interaction in lib/meeting_screen.dart:
import 'package:flutter/material.dart';
import 'package:videosdk/videosdk.dart';

import 'meeting_controls.dart';
import 'participant_tile.dart';

class MeetingScreen extends StatefulWidget {
  final String meetingId;
  final String token;

  const MeetingScreen({
    super.key,
    required this.meetingId,
    required this.token,
  });

  @override
  State<MeetingScreen> createState() => _MeetingScreenState();
}

class _MeetingScreenState extends State<MeetingScreen> {
  late Room _room;
  var micEnabled = true;
  var camEnabled = true;

  Map<String, Participant> participants = {};

  @override
  void initState() {
    // Create the room; the camera stays off for this audio-only quickstart
    _room = VideoSDK.createRoom(
      roomId: widget.meetingId,
      token: widget.token,
      displayName: "John Doe",
      micEnabled: micEnabled,
      camEnabled: false,
      defaultCameraIndex: 1, // index into MediaDevices used as the default camera
    );

    setMeetingEventListener();

    // Join the room
    _room.join();

    super.initState();
  }

  // Listen to meeting events
  void setMeetingEventListener() {
    _room.on(Events.roomJoined, () {
      setState(() {
        participants.putIfAbsent(
          _room.localParticipant.id,
          () => _room.localParticipant,
        );
      });
    });

    _room.on(Events.participantJoined, (Participant participant) {
      setState(
        () => participants.putIfAbsent(participant.id, () => participant),
      );
    });

    _room.on(Events.participantLeft, (String participantId) {
      if (participants.containsKey(participantId)) {
        setState(() => participants.remove(participantId));
      }
    });

    _room.on(Events.roomLeft, () {
      participants.clear();
      Navigator.popUntil(context, ModalRoute.withName('/'));
    });
  }

  // On back button press, leave the room
  Future<bool> _onWillPop() async {
    _room.leave();
    return true;
  }

  @override
  Widget build(BuildContext context) {
    return WillPopScope(
      onWillPop: () => _onWillPop(),
      child: Scaffold(
        appBar: AppBar(title: const Text('VideoSDK QuickStart')),
        body: Padding(
          padding: const EdgeInsets.all(8.0),
          child: Column(
            children: [
              Text(widget.meetingId),
              // Render all participants
              Expanded(
                child: Padding(
                  padding: const EdgeInsets.all(8.0),
                  child: GridView.builder(
                    gridDelegate: const SliverGridDelegateWithFixedCrossAxisCount(
                      crossAxisCount: 2,
                      crossAxisSpacing: 10,
                      mainAxisSpacing: 10,
                      mainAxisExtent: 300,
                    ),
                    itemBuilder: (context, index) {
                      return ParticipantTile(
                        key: Key(participants.values.elementAt(index).id),
                        participant: participants.values.elementAt(index),
                      );
                    },
                    itemCount: participants.length,
                  ),
                ),
              ),
              MeetingControls(
                onToggleMicButtonPressed: () {
                  // If the mic is currently on, mute it; otherwise unmute it
                  micEnabled ? _room.muteMic() : _room.unmuteMic();
                  micEnabled = !micEnabled;
                },
                onLeaveButtonPressed: () => _room.leave(),
              ),
            ],
          ),
        ),
      ),
    );
  }
}
Create lib/participant_tile.dart to display each participant:

import 'package:flutter/material.dart';
import 'package:videosdk/videosdk.dart';

class ParticipantTile extends StatefulWidget {
  final Participant participant;

  const ParticipantTile({super.key, required this.participant});

  @override
  State<ParticipantTile> createState() => _ParticipantTileState();
}

class _ParticipantTileState extends State<ParticipantTile> {
  late String participantName;

  @override
  void initState() {
    participantName = widget.participant.displayName;
    super.initState();
  }

  @override
  Widget build(BuildContext context) {
    return Padding(
      padding: const EdgeInsets.all(8.0),
      child: Container(
        color: Colors.grey.shade800,
        child: Center(
          child: Text(
            participantName,
            style: const TextStyle(color: Colors.white),
          ),
        ),
      ),
    );
  }
}
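The tile above only renders the display name. If you want visual feedback while the agent is streaming audio, you can subscribe to the participant's stream events. A rough sketch, assuming the videosdk package's Events.streamEnabled and Events.streamDisabled participant events (declare an isSpeaking field and tint the Container with it):

var isSpeaking = false;

@override
void initState() {
  participantName = widget.participant.displayName;
  // Flip a flag whenever the participant's audio stream starts or stops
  widget.participant.on(Events.streamEnabled, (Stream stream) {
    if (stream.kind == 'audio') setState(() => isSpeaking = true);
  });
  widget.participant.on(Events.streamDisabled, (Stream stream) {
    if (stream.kind == 'audio') setState(() => isSpeaking = false);
  });
  super.initState();
}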
Create lib/meeting_controls.dart for the mic and leave buttons:

import 'package:flutter/material.dart';

class MeetingControls extends StatelessWidget {
  final void Function() onToggleMicButtonPressed;
  final void Function() onLeaveButtonPressed;

  const MeetingControls({
    super.key,
    required this.onToggleMicButtonPressed,
    required this.onLeaveButtonPressed,
  });

  @override
  Widget build(BuildContext context) {
    return Row(
      mainAxisAlignment: MainAxisAlignment.spaceEvenly,
      children: [
        ElevatedButton(
          onPressed: onLeaveButtonPressed,
          child: const Text('Leave'),
        ),
        ElevatedButton(
          onPressed: onToggleMicButtonPressed,
          child: const Text('Toggle Mic'),
        ),
      ],
    );
  }
}
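The project structure also lists lib/main.dart, which this quickstart doesn't show. A minimal version that boots straight into the join screen (a sketch, assuming no extra routing or theming):

import 'package:flutter/material.dart';

import 'join_screen.dart';

void main() {
  runApp(const MyApp());
}

class MyApp extends StatelessWidget {
  const MyApp({super.key});

  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      title: 'VideoSDK QuickStart',
      home: JoinScreen(),
    );
  }
}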
2. Python AI Agent
Step 1: Create Python AI Agent
Create a .env file to store your API keys securely for the Python agent:
# Google API Key for Gemini Live API
GOOGLE_API_KEY=your_google_api_key_here
# VideoSDK Authentication Token
VIDEOSDK_AUTH_TOKEN=your_videosdk_auth_token_here
Create the Python AI agent in agent-flutter.py; it will join the same meeting room and interact with users through voice:
import logging

from videosdk.agents import Agent, AgentSession, RealTimePipeline, JobContext, RoomOptions, WorkerJob
from videosdk.plugins.google import GeminiRealtime, GeminiLiveConfig

logging.getLogger().setLevel(logging.INFO)

# Credentials (GOOGLE_API_KEY, VIDEOSDK_AUTH_TOKEN) are picked up from the
# environment / the .env file created above.

class MyVoiceAgent(Agent):
    def __init__(self):
        super().__init__(
            instructions="You are a high-energy game-show host guiding the caller to guess a secret number from 1 to 100 to win $1,000,000.",
        )

    async def on_enter(self) -> None:
        await self.session.say("Welcome to VideoSDK's AI Agent game show! I'm your host, and we're about to play for $1,000,000. Are you ready to play?")

    async def on_exit(self) -> None:
        await self.session.say("Goodbye!")

async def start_session(context: JobContext):
    agent = MyVoiceAgent()

    model = GeminiRealtime(
        model="gemini-2.0-flash-live-001",
        # When GOOGLE_API_KEY is set in .env, DON'T pass the api_key parameter
        # api_key="AIXXXXXXXXXXXXXXXXXXXX",
        config=GeminiLiveConfig(
            voice="Leda",  # Puck, Charon, Kore, Fenrir, Aoede, Leda, Orus, and Zephyr
            response_modalities=["AUDIO"]
        )
    )

    pipeline = RealTimePipeline(model=model)

    session = AgentSession(
        agent=agent,
        pipeline=pipeline
    )

    # Print live transcripts of both the user and the agent
    def on_transcription(data: dict):
        role = data.get("role")
        text = data.get("text")
        print(f"[TRANSCRIPT][{role}]: {text}")

    pipeline.on("realtime_model_transcription", on_transcription)

    await context.run_until_shutdown(session=session, wait_for_participant=True)

def make_context() -> JobContext:
    room_options = RoomOptions(
        # Static meeting ID - same as used in the frontend
        room_id="YOUR_MEETING_ID",  # Replace with your actual room_id
        name="Gemini Agent",
        playground=True,
    )
    return JobContext(room_options=room_options)

if __name__ == "__main__":
    job = WorkerJob(entrypoint=start_session, jobctx=make_context)
    job.start()
3. Run the Application
Step 1: Run the Frontend
Once you have completed all the steps mentioned above, start your Flutter application:
flutter run
Step 2: Run the AI Agent
Open a new terminal and run the Python agent:
# Install Python dependencies
pip install "videosdk-plugins-google"
pip install videosdk-agents
# Run the AI agent
python agent-flutter.py
Step 3: Connect and Interact
- Join the meeting from the Flutter app:
  - Click the "Join Meeting" button.
  - Allow microphone permissions when prompted.
- Agent connection:
  - Once you join, the Python agent will detect your participation.
  - You should see "Participant joined" in the terminal.
  - The AI agent will greet you and start the game.
- Start playing:
  - The agent will guide you through a number-guessing game (1-100).
  - Use your microphone to interact with the AI host.
  - The agent will provide hints and encouragement throughout the game.
Troubleshooting
Common Issues:
- "Waiting for participant..." but no connection:
  - Ensure both the frontend and the agent are running.
  - Check that the room ID matches in both lib/join_screen.dart and agent-flutter.py.
  - Verify your VideoSDK token is valid.
- Audio not working:
  - Check browser or device permissions for microphone access.
  - Ensure your Google API key has Gemini Live API access enabled.
- Agent not responding:
  - Verify your Google API key is correctly set in the environment.
  - Check that the Gemini Live API is enabled in your Google Cloud Console.
- Flutter build issues:
  - Ensure your Flutter version is compatible.
  - Try cleaning the build with flutter clean.
  - Delete pubspec.lock and run flutter pub get (the full sequence is shown below).
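For the last set of fixes, the full command sequence looks like this (use del instead of rm on Windows):

flutter clean
rm pubspec.lock
flutter pub get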
Next Steps
Clone the repo for a quick implementation.
Got a question? Ask us on Discord.