Version: 1.0.x

Memory

Give your AI agents the ability to remember past interactions and user preferences. By integrating a memory provider, your agent can move beyond the limits of its immediate context window to deliver truly personalized and context-aware conversations.

How Memory Enhances Conversations

A standard LLM's memory is limited to its context window. A dedicated memory provider solves this by creating a persistent, intelligent storage layer that recalls information across different sessions.

Memory-enabled Conversation Flow

The agent stores key facts (name, preferences, interests) and retrieves them in later conversations to provide a personalized interaction.
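The flow above reduces to two operations: store facts when a turn ends, recall them when the next session starts. A minimal sketch of that cycle, using a plain in-memory dict as a stand-in for a real provider like Mem0 (all names here are illustrative):

```python
# Illustration of the store/recall cycle. A real provider (e.g. Mem0)
# persists across processes; a dict only persists within one.

class InMemoryStore:
    def __init__(self):
        self._memories: dict[str, list[str]] = {}

    def store(self, user_id: str, fact: str) -> None:
        self._memories.setdefault(user_id, []).append(fact)

    def recall(self, user_id: str) -> list[str]:
        return self._memories.get(user_id, [])

# Session 1: the agent learns a fact and persists it.
store = InMemoryStore()
store.store("demo-user", "Name is Alice; prefers short answers")

# Session 2 (fresh context window): recalled facts are injected into the prompt.
facts = store.recall("demo-user")
prompt = "You are a helpful assistant.\nKnown about the user:\n" + "\n".join(
    f"- {f}" for f in facts
)
```

The persistent layer is what lets session 2 start from the facts session 1 learned, even though the LLM's context window is empty.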

Implementation with Mem0

This guide demonstrates how to implement long-term memory using Mem0. We'll build a personal assistant that remembers returning users.

note

For a complete working example, see the GitHub repository.

Prerequisites

Based on the code below, you will need:

  • A Mem0 API key, exported as the MEM0_API_KEY environment variable.
  • API keys for the pipeline providers used in this example (Deepgram, OpenAI, ElevenLabs).
  • The VideoSDK Agents SDK with the corresponding plugins installed.

Step 1: Create a Memory Manager

Create a class that wraps the Mem0 REST API. It handles fetching, searching, and storing memories; Mem0 itself extracts what's worth remembering from each stored turn.

main.py
import httpx

class Mem0Memory:
    # Keywords that suggest a turn contains a personal fact. Not used by the
    # methods below (Mem0 extracts facts server-side), but useful if you want
    # to filter client-side before calling store().
    STORE_KEYWORDS = (
        "remember", "my name", "i like", "i dislike", "favorite",
        "i prefer", "i love", "i hate", "i'm", "i am", "i work",
    )

    def __init__(self, api_key: str, user_id: str):
        self.user_id = user_id
        self._client = httpx.AsyncClient(
            base_url="https://api.mem0.ai",
            headers={
                "Authorization": f"Token {api_key}",
                "Content-Type": "application/json",
            },
            timeout=10.0,
        )

    async def get_memories(self, limit: int = 5) -> list[str]:
        """Fetch all stored memories for this user."""
        try:
            r = await self._client.get(
                "/v1/memories/", params={"user_id": self.user_id}
            )
            r.raise_for_status()
            data = r.json()
            entries = data if isinstance(data, list) else data.get("results", [])
            return [
                e.get("memory", "")
                for e in entries
                if isinstance(e, dict) and e.get("memory", "").strip()
            ][:limit]
        except Exception:
            return []

    async def search(self, query: str, top_k: int = 5) -> list[str]:
        """Search for memories relevant to the user's current query."""
        try:
            r = await self._client.post(
                "/v1/memories/search/",
                json={"query": query, "user_id": self.user_id, "top_k": top_k},
            )
            r.raise_for_status()
            data = r.json()
            results = data if isinstance(data, list) else data.get("results", [])
            return [
                e.get("memory", "")
                for e in results
                if isinstance(e, dict) and e.get("memory", "").strip()
            ]
        except Exception:
            return []

    async def store(self, user_msg: str, assistant_msg: str | None = None):
        """Store a conversation turn. Mem0 extracts what's worth remembering."""
        messages = [{"role": "user", "content": user_msg}]
        if assistant_msg:
            messages.append({"role": "assistant", "content": assistant_msg})
        r = await self._client.post(
            "/v1/memories/", json={"messages": messages, "user_id": self.user_id}
        )
        r.raise_for_status()
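The STORE_KEYWORDS tuple isn't wired into the class as shown, since Mem0 extracts facts server-side. One way to use it is as a client-side gate so only turns that likely contain personal facts trigger a write (saving API round-trips). A hypothetical helper built on that idea:

```python
# Same keyword list as Mem0Memory.STORE_KEYWORDS above.
STORE_KEYWORDS = (
    "remember", "my name", "i like", "i dislike", "favorite",
    "i prefer", "i love", "i hate", "i'm", "i am", "i work",
)

def worth_storing(user_msg: str) -> bool:
    """Heuristic gate: True if the turn looks like it contains a personal fact."""
    msg = user_msg.lower()
    return any(kw in msg for kw in STORE_KEYWORDS)
```

You could then call `await memory.store(...)` only when `worth_storing(transcript)` is true; "My name is Alice" passes the gate, while "What's the weather?" does not. The trade-off is that a keyword heuristic misses facts phrased unusually, which is why the guide stores every turn and lets Mem0 decide.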

Step 2: Create the Agent with Personalized Greeting

At session startup, fetch stored memories and inject them into the agent's system prompt. The agent greets returning users differently from new users.

main.py
from videosdk.agents import Agent

class PersonalAssistant(Agent):
    def __init__(self, instructions: str, memories: list[str]):
        self._memories = memories
        super().__init__(instructions=instructions)

    async def on_enter(self):
        if self._memories:
            await self.session.say("Hey! Welcome back. How can I help you today?")
        else:
            await self.session.say(
                "Hi there! I'm your personal assistant. "
                "Tell me about yourself so I can remember you next time!"
            )

    async def on_exit(self):
        await self.session.say("Bye! I'll remember everything for next time.")

Step 3: Store Memories with Pipeline Hooks

Use two hooks together:

  • user_turn_start — Search Mem0 for relevant memories and inject them into chat_context before the LLM runs. This lets the agent reference past conversations.
  • llm — After the LLM responds, store the conversation turn. Mem0 automatically extracts what's worth remembering.
tip

Review core concepts in the Pipeline Hooks guide.

main.py
# These hooks live inside the entrypoint (see Step 4), where `memory`,
# `agent`, and `pipeline` are in scope.
pending_msg = None

@pipeline.on("user_turn_start")
async def on_user(transcript: str):
    nonlocal pending_msg
    pending_msg = transcript

    # Search for relevant past memories and inject them into context
    relevant = await memory.search(transcript)
    if relevant:
        context = "\n".join(f"- {m}" for m in relevant)
        agent.chat_context.add_message(
            role="system",
            content=(
                f"Relevant memories about this user:\n{context}\n\n"
                "Use these to answer personally."
            ),
        )

@pipeline.on("llm")
async def on_llm(data: dict):
    nonlocal pending_msg
    if not pending_msg:
        return
    # Store every turn — Mem0 decides what's worth remembering
    await memory.store(pending_msg, data.get("text", ""))
    pending_msg = None
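The two hooks coordinate through pending_msg: user_turn_start records the transcript, and the llm hook pairs it with the model's reply, then clears it so a repeated llm event becomes a no-op. A self-contained simulation of that handoff (stub memory object and hand-called hooks standing in for the real pipeline):

```python
import asyncio

class StubMemory:
    """Stand-in for Mem0Memory that records store() calls instead of calling the API."""
    def __init__(self):
        self.stored: list[tuple[str, str]] = []

    async def store(self, user_msg: str, assistant_msg: str = "") -> None:
        self.stored.append((user_msg, assistant_msg))

memory = StubMemory()

async def simulate_turn() -> None:
    pending_msg = None

    async def on_user(transcript: str):
        nonlocal pending_msg
        pending_msg = transcript  # user_turn_start fires before the LLM runs

    async def on_llm(data: dict):
        nonlocal pending_msg
        if not pending_msg:
            return  # nothing pending: ignore the event
        await memory.store(pending_msg, data.get("text", ""))
        pending_msg = None  # cleared, so a duplicate llm event stores nothing

    await on_user("I love hiking")
    await on_llm({"text": "Noted, hiking it is!"})
    await on_llm({"text": "duplicate event"})  # ignored: pending_msg already cleared

asyncio.run(simulate_turn())
```

This is why the guard in the llm hook matters: without clearing pending_msg, one user turn could be stored multiple times if the pipeline emits more than one llm event.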

Step 4: Wire Everything Together

Initialize the memory manager, build personalized instructions, create the pipeline, register hooks, and start the session.

main.py
import os
from videosdk.agents import Agent, AgentSession, Pipeline, WorkerJob, JobContext, RoomOptions
from videosdk.plugins.deepgram import DeepgramSTT
from videosdk.plugins.openai import OpenAILLM
from videosdk.plugins.elevenlabs import ElevenLabsTTS
from videosdk.plugins.silero import SileroVAD
from videosdk.plugins.turn_detector import TurnDetector

async def entrypoint(ctx: JobContext):
    # 1. Set up memory
    memory = Mem0Memory(api_key=os.getenv("MEM0_API_KEY"), user_id="demo-user")
    memories = await memory.get_memories()

    # 2. Build personalized instructions
    base = "You are a friendly personal assistant. Keep responses short and conversational."
    if memories:
        facts = "\n".join(f"- {m}" for m in memories)
        instructions = f"{base}\n\nYou already know this about the user:\n{facts}"
    else:
        instructions = base

    # 3. Create agent and pipeline
    agent = PersonalAssistant(instructions=instructions, memories=memories)
    pipeline = Pipeline(
        stt=DeepgramSTT(),
        llm=OpenAILLM(),
        tts=ElevenLabsTTS(),
        vad=SileroVAD(),
        turn_detector=TurnDetector(),
    )

    # 4. Register memory hooks (see Step 3)
    pending_msg = None

    @pipeline.on("user_turn_start")
    async def on_user(transcript: str):
        nonlocal pending_msg
        pending_msg = transcript
        # Search and inject relevant memories before the LLM runs
        relevant = await memory.search(transcript)
        if relevant:
            context = "\n".join(f"- {m}" for m in relevant)
            agent.chat_context.add_message(
                role="system",
                content=(
                    f"Relevant memories about this user:\n{context}\n\n"
                    "Use these to answer personally."
                ),
            )

    @pipeline.on("llm")
    async def on_llm(data: dict):
        nonlocal pending_msg
        if not pending_msg:
            return
        # Store every turn — Mem0 decides what's worth remembering
        await memory.store(pending_msg, data.get("text", ""))
        pending_msg = None

    # 5. Start the session
    session = AgentSession(agent=agent, pipeline=pipeline)
    await session.start(wait_for_participant=True, run_until_shutdown=True)

def make_context() -> JobContext:
    return JobContext(room_options=RoomOptions(name="Personal Assistant", playground=True))

if __name__ == "__main__":
    WorkerJob(entrypoint=entrypoint, jobctx=make_context).start()
