Version: 1.0.x

LangChain & LangGraph LLM

The videosdk-plugins-langchain package provides two LLM adapters that let you drop any LangChain-compatible model or a full LangGraph workflow directly into the VideoSDK voice pipeline — no changes needed to the rest of your pipeline.

| Adapter | When to use |
|---|---|
| `LangChainLLM` | Wrap a single `BaseChatModel` (OpenAI, Anthropic, Gemini, Mistral, …) and optionally attach LangChain-native tools |
| `LangGraphLLM` | Wrap a compiled `StateGraph` — multi-node flows, conditional routing, tool nodes, planners, and more |

Installation

pip install "videosdk-plugins-langchain"

You also need the LangChain integration package for your chosen model provider, e.g.:

pip install langchain-openai        # OpenAI / Azure OpenAI
pip install langchain-google-genai  # Google Gemini
pip install langchain-anthropic     # Anthropic Claude

Importing

from videosdk.plugins.langchain import LangChainLLM, LangGraphLLM

LangChainLLM

LangChainLLM adapts any LangChain BaseChatModel for use inside the VideoSDK pipeline. It supports two tool-calling modes that can be combined on the same instance.

Tool-calling modes

Mode A — VideoSDK @function_tool methods

Define tools as @function_tool methods on your Agent subclass (exactly as you would with OpenAILLM or GoogleLLM). The adapter converts them to LangChain stubs for schema binding and lets VideoSDK dispatch and re-feed the results — the standard VideoSDK tool loop.

Mode B — LangChain-native tools

Pass LangChain tools (e.g. TavilySearchResults, WikipediaQueryRun, or any custom @tool function) at init via tools=[...]. The full call→execute→feed loop runs inside the adapter; the voice pipeline only receives the final text stream.

Both modes can be active simultaneously.

Basic Usage

agent.py
from langchain_openai import ChatOpenAI
from videosdk.plugins.langchain import LangChainLLM
from videosdk.agents import Agent, Pipeline, function_tool

class SlackAgent(Agent):
    def __init__(self):
        super().__init__(instructions="You are a helpful Slack assistant.")

    @function_tool
    async def post_message(self, channel: str, message: str) -> str:
        """Post a message to a Slack channel.

        Args:
            channel: Channel name (e.g. 'general')
            message: The message text to post
        """
        # ... your Slack API call here
        return f"Message posted to #{channel}."

langchain_llm = LangChainLLM(
    llm=ChatOpenAI(model="gpt-4o-mini", streaming=True),
)

pipeline = Pipeline(llm=langchain_llm, ...)

Configuration Options

| Parameter | Type | Default | Description |
|---|---|---|---|
| `llm` | `BaseChatModel` | required | Any LangChain chat model instance |
| `tools` | `list \| None` | `None` | LangChain-native tools executed internally (Mode B) |
| `max_tool_iterations` | `int` | `10` | Safety cap on consecutive internal tool-call rounds |
> **Note:** When using a `.env` file for credentials, pass them to the LangChain model constructor (e.g. `ChatOpenAI()`), not to `LangChainLLM`. The SDK reads environment variables automatically, so you can also omit them entirely and rely on the provider's SDK defaults.

Full Example — Voice-controlled Slack Assistant

agent.py
"""
Voice-controlled Slack assistant powered by LangChain.
Pipeline: DeepgramSTT + LangChainLLM + CartesiaTTS + SileroVAD + TurnDetector

Env Vars: VIDEOSDK_AUTH_TOKEN, DEEPGRAM_API_KEY, OPENAI_API_KEY,
CARTESIA_API_KEY, SLACK_BOT_TOKEN
"""
import os
from slack_sdk.web.async_client import AsyncWebClient
from langchain_openai import ChatOpenAI

from videosdk.agents import Agent, AgentSession, Pipeline, WorkerJob, JobContext, RoomOptions, function_tool
from videosdk.plugins.deepgram import DeepgramSTT
from videosdk.plugins.cartesia import CartesiaTTS
from videosdk.plugins.silero import SileroVAD
from videosdk.plugins.turn_detector import TurnDetector, pre_download_model
from videosdk.plugins.langchain import LangChainLLM

pre_download_model()

_slack = AsyncWebClient(token=os.environ.get("SLACK_BOT_TOKEN", ""))


class SlackVoiceAgent(Agent):
    def __init__(self):
        super().__init__(
            instructions=(
                "You are Max, a voice-controlled Slack assistant. "
                "You can post messages to channels. "
                "After executing any action, confirm it briefly."
            ),
        )

    async def on_enter(self) -> None:
        await self.session.say("Hey! I'm Max. Which channel would you like to post to?")

    @function_tool
    async def post_message(self, channel: str, message: str) -> str:
        """Post a message to a Slack channel.

        Args:
            channel: Channel name or ID (e.g. 'general', 'C01234ABCDE')
            message: The message text to post
        """
        channel_name = channel.lstrip("#")
        try:
            await _slack.chat_postMessage(channel=f"#{channel_name}", text=message)
            return f"Message posted to #{channel_name}."
        except Exception as exc:
            return f"Failed to post to #{channel_name}: {exc}"


async def entrypoint(ctx: JobContext):
    agent = SlackVoiceAgent()
    langchain_llm = LangChainLLM(
        llm=ChatOpenAI(model="gpt-4o-mini", streaming=True),
    )
    pipeline = Pipeline(
        stt=DeepgramSTT(),
        llm=langchain_llm,
        tts=CartesiaTTS(),
        vad=SileroVAD(),
        turn_detector=TurnDetector(),
    )
    session = AgentSession(agent=agent, pipeline=pipeline)
    await session.start(wait_for_participant=True, run_until_shutdown=True)


def make_context() -> JobContext:
    return JobContext(room_options=RoomOptions(room_id="<room_id>", name="Slack Assistant", playground=True))


if __name__ == "__main__":
    WorkerJob(entrypoint=entrypoint, jobctx=make_context).start()

LangGraphLLM

LangGraphLLM wraps a compiled LangGraph StateGraph as a VideoSDK LLM. The entire graph — nodes, edges, tool nodes, conditional routing, and internal state — runs as the "LLM" from the pipeline's perspective.

Key concepts

  • output_node — Only text chunks emitted by this node name reach TTS. Use it to suppress intermediate planner/researcher nodes and expose only the final synthesis node to the voice pipeline.
  • stream_mode — "messages" (default) streams AIMessageChunk tokens; "custom" streams arbitrary objects emitted from inside nodes via LangGraph's stream writer. Pass a list to enable both simultaneously.
  • subgraphs — Set True to stream tokens from nested subgraphs (requires LangGraph ≥ 0.2).
  • config — Optional LangGraph RunnableConfig dict for thread IDs, recursion limits, or custom callbacks.

Basic Usage

agent.py
from videosdk.plugins.langchain import LangGraphLLM
from videosdk.agents import Pipeline

# graph is a compiled LangGraph StateGraph
llm = LangGraphLLM(
    graph=my_compiled_graph,
    output_node="synthesizer_node",  # only this node's text reaches TTS
)

pipeline = Pipeline(llm=llm, ...)

Configuration Options

| Parameter | Type | Default | Description |
|---|---|---|---|
| `graph` | `CompiledStateGraph` | required | A compiled LangGraph graph (`StateGraph.compile()`) |
| `output_node` | `str \| None` | `None` | Node name whose text is forwarded to TTS; `None` forwards all AI text |
| `config` | `dict \| None` | `None` | LangGraph `RunnableConfig` (thread IDs, recursion limit, callbacks) |
| `stream_mode` | `str \| list[str]` | `"messages"` | Streaming mode: `"messages"`, `"custom"`, or both as a list |
| `subgraphs` | `bool` | `False` | Stream tokens from nested subgraphs |
| `context` | `Any \| None` | `None` | LangGraph 2.0 context object injected at runtime |

Full Example — Voice-driven Blog Writer

This example shows a 3-question information-gathering flow before a multi-step sequential writing pipeline.

START
  → coordinator_node           (extract topic / audience / tone)
      ├─ all 3 gathered?
      │    → planner_node      (plan 4 BlogSection objects)
      │    → write_sections    (4 sequential LLM calls — one per section)
      │    → compiler_node     (join + save {slug}.md)
      │    → synthesizer_node  ← OUTPUT NODE (spoken announcement)
      └─ info still missing?
           → synthesizer_node  (asks the next gathering question)
END
agent.py
"""
Voice-driven blog writer powered by a sequential LangGraph pipeline.

Env Vars: VIDEOSDK_AUTH_TOKEN, DEEPGRAM_API_KEY, GOOGLE_API_KEY, CARTESIA_API_KEY
"""
import os
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
from langgraph.graph import END, START, MessagesState, StateGraph
from langchain_google_genai import ChatGoogleGenerativeAI
from pydantic import BaseModel, Field

from videosdk.agents import Agent, AgentSession, Pipeline, WorkerJob, JobContext, RoomOptions
from videosdk.plugins.deepgram import DeepgramSTT
from videosdk.plugins.cartesia import CartesiaTTS
from videosdk.plugins.silero import SileroVAD
from videosdk.plugins.turn_detector import TurnDetector, pre_download_model
from videosdk.plugins.langchain import LangGraphLLM

pre_download_model()


# --- Pydantic schemas ---

class GatheringInfo(BaseModel):
    topic: str = Field(default="")
    audience: str = Field(default="")
    tone: str = Field(default="")

class BlogSection(BaseModel):
    name: str
    description: str

class BlogSections(BaseModel):
    sections: list[BlogSection]

class BlogState(MessagesState):
    topic: str
    audience: str
    tone: str
    sections: list[BlogSection]
    completed_sections: list[str]
    filename: str
    blog_done: bool


# --- LLMs ---

_MODEL = "gemini-2.5-flash"
_coordinator_llm = ChatGoogleGenerativeAI(model=_MODEL).with_structured_output(GatheringInfo)
_planner_llm = ChatGoogleGenerativeAI(model=_MODEL).with_structured_output(BlogSections)
_writer_llm = ChatGoogleGenerativeAI(model=_MODEL, streaming=True)
_synth_llm = ChatGoogleGenerativeAI(model=_MODEL, streaming=True)


# --- Graph nodes (abbreviated — see full example in repo) ---

def coordinator_node(state: BlogState) -> dict: ...
def planner_node(state: BlogState) -> dict: ...
def write_sections_node(state: BlogState) -> dict: ...
def compiler_node(state: BlogState) -> dict: ...
def synthesizer_node(state: BlogState) -> dict: ...
def route_after_coordinator(state: BlogState) -> str: ...

builder = StateGraph(BlogState)
builder.add_node("coordinator", coordinator_node)
builder.add_node("planner", planner_node)
builder.add_node("write_sections", write_sections_node)
builder.add_node("compiler", compiler_node)
builder.add_node("synthesizer_node", synthesizer_node)

builder.add_edge(START, "coordinator")
builder.add_conditional_edges(
    "coordinator", route_after_coordinator,
    {"planner": "planner", "synthesizer_node": "synthesizer_node"},
)
builder.add_edge("planner", "write_sections")
builder.add_edge("write_sections", "compiler")
builder.add_edge("compiler", "synthesizer_node")
builder.add_edge("synthesizer_node", END)

blog_graph = builder.compile()


# --- Agent ---

class BlogWriterAgent(Agent):
    def __init__(self):
        super().__init__(instructions="You are Aria, a friendly AI writing assistant.")

    async def on_enter(self) -> None:
        await self.session.say(
            "Hi! I'm Aria. What topic would you like me to write a blog about?"
        )


async def entrypoint(ctx: JobContext):
    agent = BlogWriterAgent()
    langgraph_llm = LangGraphLLM(
        graph=blog_graph,
        output_node="synthesizer_node",  # only synthesizer text reaches TTS
    )
    pipeline = Pipeline(
        stt=DeepgramSTT(),
        llm=langgraph_llm,
        tts=CartesiaTTS(),
        vad=SileroVAD(),
        turn_detector=TurnDetector(),
    )
    session = AgentSession(agent=agent, pipeline=pipeline)
    await session.start(wait_for_participant=True, run_until_shutdown=True)


def make_context() -> JobContext:
    return JobContext(room_options=RoomOptions(room_id="<room_id>", name="Blog Writer", playground=True))


if __name__ == "__main__":
    WorkerJob(entrypoint=entrypoint, jobctx=make_context).start()

> **Tip:** Always set `output_node` to your final "speech synthesis" node when using multi-node graphs. Without it, intermediate planner/researcher node text is also forwarded to TTS, producing unexpected spoken output.


Choosing Between LangChainLLM and LangGraphLLM

| Scenario | Recommended adapter |
|---|---|
| Simple model swap (e.g. use Mistral instead of OpenAI) | `LangChainLLM` |
| Add web-search / RAG tools without changing agent code | `LangChainLLM` with `tools=[...]` |
| Multi-step sequential pipeline (plan → write → compile) | `LangGraphLLM` |
| Conditional routing / state machines | `LangGraphLLM` |
| Mixture-of-experts or parallel sub-agents | `LangGraphLLM` |

Additional Resources

Got a question? Ask us on Discord.