Memory
Give your AI agents the ability to remember past interactions and user preferences. By integrating a memory provider, your agent can move beyond the limits of its immediate context window to deliver truly personalized and context-aware conversations.
How Memory Enhances Conversations
A standard LLM's memory is limited to its context window. A dedicated memory provider solves this by creating a persistent, intelligent storage layer that recalls information across different sessions.
In this model, the agent stores key facts as they come up and retrieves them in later conversations, providing a personalized and efficient interaction across sessions.
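To make the idea concrete, here is a toy, dictionary-backed memory layer (purely illustrative — not the Mem0 API) showing how a fact stored in one session remains available in the next, outside any single context window:

```python
class ToyMemoryStore:
    """A minimal in-process stand-in for a persistent memory provider."""

    def __init__(self):
        self._memories: dict[str, list[str]] = {}

    def store(self, user_id: str, fact: str) -> None:
        self._memories.setdefault(user_id, []).append(fact)

    def recall(self, user_id: str, limit: int = 5) -> list[str]:
        # Most recently stored facts first
        return list(reversed(self._memories.get(user_id, [])))[:limit]


# Session 1: the agent learns something about the caller
store = ToyMemoryStore()
store.store("caller-42", "prefers window seats")

# Session 2 (a fresh context window): the fact is still available
print(store.recall("caller-42"))  # → ['prefers window seats']
```

A real provider like Mem0 adds the intelligence this toy lacks: extracting salient facts from raw transcripts, deduplicating them, and ranking what to recall.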
Implementation with Mem0
This guide demonstrates how to implement long-term memory using Mem0, an open-source platform designed to give AI agents a persistent memory layer.
This example creates a "Concierge Agent" that remembers returning users. We will break down the implementation into logical steps.
The following sections outline the steps you might follow. For a complete working example, see the GitHub repository.
Prerequisites
- A Mem0 API key, available from the Mem0 dashboard.
- Ensure your agent environment is set up per the AI Voice Agent Quickstart. This is the baseline app where we'll implement the memory features in the steps below.
Step 1: Create a Dedicated Memory Manager
Start by creating a memory manager class that abstracts your chosen memory provider's API. This class should handle three core operations: storing memories, retrieving memories, and deciding what to remember.
The key is to implement a should_store method that intelligently determines which conversations are worth remembering based on keywords, user intent, or other criteria you define.
from mem0.client.main import AsyncMemoryClient


class Mem0MemoryManager:
    """Handles all interactions with the Mem0 API."""

    def __init__(self, api_key: str, user_id: str):
        self.user_id = user_id
        self._client = AsyncMemoryClient(api_key=api_key)

    async def fetch_recent_memories(self, limit: int = 5) -> list[str]:
        """Retrieves the most recent memories for the user."""
        try:
            response = await self._client.get_all(filters={"user_id": self.user_id}, limit=limit)
            return [entry.get("memory", "") for entry in response]
        except Exception as e:
            print(f"Error fetching memories: {e}")
            return []

    def should_store(self, user_message: str) -> bool:
        """Determines if a message contains keywords indicating a fact to remember."""
        keywords = ("remember", "preference", "my name is", "likes", "dislike")
        return any(keyword in user_message.lower() for keyword in keywords)

    async def record_memory(self, user_message: str, assistant_message: str | None = None):
        """Stores a conversational turn in Mem0."""
        messages = [{"role": "user", "content": user_message}]
        if assistant_message:
            messages.append({"role": "assistant", "content": assistant_message})
        try:
            # Mem0 extracts and stores the salient facts from the turn
            await self._client.add(messages, user_id=self.user_id)
        except Exception as e:
            print(f"Error recording memory: {e}")
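The keyword heuristic can be exercised on its own, without an API key. This standalone copy of the should_store logic (extracted here purely for illustration) shows which utterances would be flagged for storage:

```python
def should_store(user_message: str) -> bool:
    """Standalone mirror of Mem0MemoryManager.should_store, for quick local testing."""
    keywords = ("remember", "preference", "my name is", "likes", "dislike")
    return any(keyword in user_message.lower() for keyword in keywords)


print(should_store("My name is Priya and I want aisle seats"))  # → True
print(should_store("What's the weather today?"))                # → False
```

A keyword list is a deliberately simple starting point; in production you might replace it with an LLM-based classifier or rely on the memory provider's own fact extraction.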
Step 2: Access Memory to Personalize the Agent
Implement memory retrieval at session startup to personalize your agent's behavior. Create a function that fetches relevant user memories and injects them into your agent's system prompt or context.
Consider how you want to use retrieved memories: for personalized greetings, context-aware responses, or maintaining conversation continuity across sessions.
from videosdk.agents import Agent


class MemoryAgent(Agent):
    def __init__(self, instructions: str, remembered_facts: list[str] | None = None):
        self._remembered_facts = remembered_facts or []
        super().__init__(instructions=instructions)

    async def on_enter(self):
        # Use the retrieved facts for a personalized greeting
        if self._remembered_facts:
            top_fact = "; ".join(self._remembered_facts[:2])
            await self.session.say(f"Welcome back! I remember that {top_fact}. What can I help you with?")
        else:
            await self.session.say("Hello! How can I help today?")


# This helper function runs at the start of the session
async def build_agent_instructions(memory_manager: Mem0MemoryManager | None) -> tuple[str, list[str]]:
    base_instructions = "You are a helpful voice concierge..."
    if not memory_manager:
        return base_instructions, []
    # Fetch memories and add them to the system prompt
    remembered_facts = await memory_manager.fetch_recent_memories()
    if not remembered_facts:
        return base_instructions, []
    memory_lines = "\n".join(f"- {fact}" for fact in remembered_facts)
    enriched_instructions = f"{base_instructions}\n\nKnown details about this caller:\n{memory_lines}"
    return enriched_instructions, remembered_facts
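The enrichment step itself is plain string formatting. Isolated from the async plumbing, it looks like this (the sample facts are made up for illustration):

```python
def enrich_instructions(base_instructions: str, remembered_facts: list[str]) -> str:
    """Append known caller facts to the system prompt, one bullet per fact."""
    if not remembered_facts:
        return base_instructions
    memory_lines = "\n".join(f"- {fact}" for fact in remembered_facts)
    return f"{base_instructions}\n\nKnown details about this caller:\n{memory_lines}"


prompt = enrich_instructions(
    "You are a helpful voice concierge.",
    ["Name is Priya", "Prefers morning appointments"],
)
print(prompt)
```

Keeping the facts as a plain bullet list in the system prompt means the LLM can reference them naturally, without any special retrieval logic at response time.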
Step 3: Storing New Memories with a Custom Conversation Flow
Extend your conversation flow to capture and store new memories during interactions. If you are new to flows, review the core concepts in the Conversation Flow guide. Override the conversation flow's main processing method to evaluate each user message after the agent responds.
The goal is to identify valuable information (user preferences, personal details, important facts) and store it without impacting response latency. You can implement this as a post-processing step or integrate it into your existing conversation handling logic.
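One way to keep memory writes off the response path (an illustrative asyncio pattern, not part of the SDK) is to schedule the write as a background task rather than awaiting it inline:

```python
import asyncio

# Hold references so background tasks aren't garbage-collected mid-flight
_background_tasks: set[asyncio.Task] = set()


async def record_memory(user_message: str, assistant_message: str) -> None:
    """Stand-in for a slow memory-provider write."""
    await asyncio.sleep(0.1)  # simulated network latency
    print(f"stored: {user_message!r}")


async def handle_turn(transcript: str) -> str:
    response = f"Echo: {transcript}"
    # Schedule the write without blocking the reply
    task = asyncio.create_task(record_memory(transcript, response))
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    return response  # returned immediately, before the write completes


async def main():
    reply = await handle_turn("Remember that I prefer aisle seats")
    print(reply)
    await asyncio.sleep(0.2)  # let the background write finish before exiting


asyncio.run(main())
```

The trade-off is that a crash before the task completes can lose a memory; awaiting the write after the response has streamed (as in the flow below) is the simpler alternative when provider latency is acceptable.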
Want a deeper dive or to run this locally?
- Review core concepts in the Conversation Flow guide.
- To run your agent, follow the AI Voice Agent Quickstart.
from videosdk.agents import Agent, ConversationFlow


class Mem0ConversationFlow(ConversationFlow):
    """A custom flow that records memories after each turn."""

    def __init__(self, agent: Agent, memory_manager: Mem0MemoryManager, **kwargs):
        super().__init__(agent=agent, **kwargs)
        self._memory_manager = memory_manager
        self._pending_user_message: str | None = None

    async def run(self, transcript: str):
        self._pending_user_message = transcript
        response_chunks: list[str] = []
        # First, let the standard conversation turn happen, forwarding each chunk downstream
        async for chunk in super().run(transcript):
            response_chunks.append(chunk)
            yield chunk
        full_response = "".join(response_chunks)
        # After the response, decide if the turn should be stored in memory
        if self._pending_user_message and self._memory_manager.should_store(self._pending_user_message):
            await self._memory_manager.record_memory(self._pending_user_message, full_response or None)
        self._pending_user_message = None
Step 4: Assembling the Agent Session
Integrate all components in your main application entry point. Initialize your memory manager, use it to build personalized agent instructions, and configure your session with the enhanced conversation flow.
This is where you connect the memory system to your agent's lifecycle, ensuring memories are loaded at startup and new information is captured during conversations.
import os

from videosdk.agents import AgentSession, JobContext


async def start_session(context: JobContext):
    # 1. Set up the memory manager
    memory_manager = Mem0MemoryManager(api_key=os.getenv("MEM0_API_KEY"), user_id="demo-user")

    # 2. Build the agent with personalized instructions
    instructions, facts = await build_agent_instructions(memory_manager)
    agent = MemoryAgent(instructions=instructions, remembered_facts=facts)

    # 3. Set up the conversation flow with memory capabilities
    conversation_flow = Mem0ConversationFlow(agent=agent, memory_manager=memory_manager, ...)

    # 4. Create the session with the custom flow
    session = AgentSession(
        agent=agent,
        pipeline=pipeline,  # your pipeline
        conversation_flow=conversation_flow
    )

    # ... rest of your session and job context setup
This creates a powerful feedback loop where each interaction enriches the agent's knowledge, leading to smarter and more personalized conversations over time.
Step 5: Run the Agent
Start your worker process and connect the agent to a room using a JobContext. This boots your agent and keeps it running.
from videosdk.agents import WorkerJob, JobContext, RoomOptions


def make_context() -> JobContext:
    return JobContext(room_options=RoomOptions(name="Concierge Agent", playground=True))


if __name__ == "__main__":
    WorkerJob(entrypoint=start_session, jobctx=make_context).start()
This will initialize the session using your start_session function from Step 4 and keep the worker alive.
Example - Try It Yourself
Explore our complete, runnable example on GitHub to see how to integrate a memory provider into a VideoSDK AI Agent.
Got a question? Ask us on Discord.