Human in the Loop

Human in the Loop (HITL) enables AI agents to escalate specific queries to human operators for review and approval. This implementation uses Discord as the human interface, allowing seamless handoffs between AI automation and human oversight.

Overview

The HITL system allows AI agents to:

  • Handle routine customer inquiries autonomously
  • Escalate specific queries (like discount requests) to human operators via Discord
  • Receive human responses and relay them back to customers
  • Maintain conversation flow while waiting for human input

Use Cases

  • Discount Requests: AI escalates pricing queries to human sales agents
  • Complex Support: Technical issues requiring human expertise
  • Policy Decisions: Requests that need human approval or clarification
  • Escalation Scenarios: Situations where AI confidence is low

Example Overview

The implementation consists of two main components:

  1. Customer Agent: VideoSDK AI agent that handles customer interactions and escalates specific queries
  2. Discord MCP Server: MCP server that creates Discord threads for human operator responses

Example Implementation

Customer Agent Setup

from typing import Optional
import pathlib
import sys

from videosdk.agents import Agent, JobContext, MCPServerStdio


class CustomerAgent(Agent):
    def __init__(self, ctx: Optional[JobContext] = None):
        # The Discord MCP server script is expected to sit next to this file.
        current_dir = pathlib.Path(__file__).parent
        discord_mcp_server_path = current_dir / "discord_mcp_server.py"

        super().__init__(
            instructions="You are a customer-facing agent for VideoSDK. You have access to various tools to assist with customer inquiries, provide support, and handle tasks. When a user asks for a discount percentage, always use the appropriate tool to retrieve and provide the accurate answer from your superior human agent.",
            mcp_servers=[
                # Run the MCP server as a subprocess over stdio, using the current interpreter.
                MCPServerStdio(
                    command=sys.executable,
                    args=[str(discord_mcp_server_path)],
                    client_session_timeout_seconds=30
                ),
            ]
        )
        self.ctx = ctx

Discord MCP Server

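import asyncio
import os

import discord
from discord.ext import commands
from mcp.server.fastmcp import FastMCP


class DiscordHuman:
    def __init__(self, user_id: int, channel_id: int):
        self.user_id = user_id
        self.channel_id = channel_id
        self.bot = commands.Bot(command_prefix="!", intents=discord.Intents.all())
        # Holds the pending reply; the bot event handler that resolves it is omitted from this excerpt.
        self.response_future = None

    async def ask(self, question: str) -> str:
        # Open a thread for the question (Discord caps thread names at 100 characters)
        # and mention the human operator in it.
        channel = self.bot.get_channel(self.channel_id)
        thread = await channel.create_thread(
            name=question[:100],
            type=discord.ChannelType.public_thread
        )
        await thread.send(f"<@{self.user_id}> {question}")

        # Wait up to 10 minutes for the operator's reply.
        self.response_future = asyncio.get_running_loop().create_future()
        try:
            return await asyncio.wait_for(self.response_future, timeout=600)
        except asyncio.TimeoutError:
            return "⏱️ Timed out waiting for a human response"


# The operator and channel IDs come from the environment variables listed below.
discord_human = DiscordHuman(
    user_id=int(os.environ["DISCORD_USER_ID"]),
    channel_id=int(os.environ["DISCORD_CHANNEL_ID"]),
)

# MCP Server Setup
mcp = FastMCP("HumanInTheLoopServer")

@mcp.tool(description="Ask a human agent via Discord for a specific user query such as discount percentage, etc.")
async def ask_human(question: str) -> str:
    return await discord_human.ask(question)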
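
The excerpt above creates a future in ask() and awaits it with a timeout, but it does not show how that future is resolved once the operator replies. The sketch below illustrates the handoff pattern with asyncio alone. HumanInbox, deliver, and demo are hypothetical names used only for illustration, and the Discord calls are deliberately left out; in a real bot, a message handler would presumably call something like deliver() when the operator posts in the thread.

```python
import asyncio
from typing import Optional


class HumanInbox:
    """Hypothetical stand-in for DiscordHuman with the Discord calls removed."""

    def __init__(self, timeout: float = 600.0):
        self.timeout = timeout
        self.response_future: Optional[asyncio.Future] = None

    async def ask(self, question: str) -> str:
        # Create the future on the loop that is running this coroutine.
        self.response_future = asyncio.get_running_loop().create_future()
        try:
            return await asyncio.wait_for(self.response_future, timeout=self.timeout)
        except asyncio.TimeoutError:
            return "Timed out waiting for a human response"
        finally:
            # Clear the slot so a late reply cannot resolve a stale question.
            self.response_future = None

    def deliver(self, reply: str) -> bool:
        """Resolve the pending question; returns False if nothing is waiting."""
        future = self.response_future
        if future is None or future.done():
            return False
        future.set_result(reply)
        return True


async def demo() -> tuple:
    # Case 1: the operator answers before the timeout.
    inbox = HumanInbox(timeout=1.0)
    pending = asyncio.create_task(inbox.ask("Can we offer a discount?"))
    await asyncio.sleep(0)  # let ask() start and create its future
    accepted = inbox.deliver("Yes, 10% is approved.")
    answered = await pending

    # Case 2: nobody answers, so ask() falls back to the timeout message.
    quiet = HumanInbox(timeout=0.05)
    timed_out = await quiet.ask("Anyone there?")
    return accepted, answered, timed_out
```

Running asyncio.run(demo()) exercises both paths. Because ask() returns a plain string in either case, the agent always has something to relay to the customer, even when no human responds in time.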

Pipeline Configuration

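import os

from videosdk.agents import CascadingPipeline
from videosdk.plugins.anthropic import AnthropicLLM
from videosdk.plugins.deepgram import DeepgramSTT
from videosdk.plugins.google import GoogleTTS
from videosdk.plugins.silero import SileroVAD
from videosdk.plugins.turn_detector import TurnDetector

pipeline = CascadingPipeline(
    stt=DeepgramSTT(api_key=os.getenv("DEEPGRAM_API_KEY")),
    llm=AnthropicLLM(api_key=os.getenv("ANTHROPIC_API_KEY")),
    tts=GoogleTTS(api_key=os.getenv("GOOGLE_API_KEY")),
    vad=SileroVAD(),
    turn_detector=TurnDetector(threshold=0.8)
)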

Environment Variables

Set the following environment variables:

DISCORD_TOKEN=your_discord_bot_token
DISCORD_USER_ID=human_operator_user_id
DISCORD_CHANNEL_ID=channel_id_for_escalations
DEEPGRAM_API_KEY=your_deepgram_key
ANTHROPIC_API_KEY=your_anthropic_key
GOOGLE_API_KEY=your_google_key
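
A missing variable would otherwise only surface when the agent or the Discord server first reads it, so it can help to check the configuration at startup. The following is a minimal sketch, not part of the VideoSDK example: missing_env_vars and discord_ids are hypothetical helpers, and the variable names are the ones listed above.

```python
import os
from typing import List, Mapping, Optional

REQUIRED_ENV_VARS = (
    "DISCORD_TOKEN",
    "DISCORD_USER_ID",
    "DISCORD_CHANNEL_ID",
    "DEEPGRAM_API_KEY",
    "ANTHROPIC_API_KEY",
    "GOOGLE_API_KEY",
)


def missing_env_vars(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variable names that are unset or blank."""
    source = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not source.get(name, "").strip()]


def discord_ids(env: Optional[Mapping[str, str]] = None) -> tuple:
    """Parse the two Discord IDs as integers, matching DiscordHuman's signature."""
    source = os.environ if env is None else env
    return int(source["DISCORD_USER_ID"]), int(source["DISCORD_CHANNEL_ID"])
```

Calling missing_env_vars() with no argument checks the real process environment; passing a dictionary makes the check easy to try out without touching it.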

The complete implementation, with full source code, setup instructions, and configuration examples, is available in the VideoSDK Agents GitHub repository.
