LangChain & LangGraph LLM
The `videosdk-plugins-langchain` package provides two LLM adapters that let you drop any LangChain-compatible model or a full LangGraph workflow directly into the VideoSDK voice pipeline, with no changes to the rest of your agent code.
| Adapter | When to use |
|---|---|
| `LangChainLLM` | Wrap a single `BaseChatModel` (OpenAI, Anthropic, Gemini, Mistral, …) and optionally attach LangChain-native tools |
| `LangGraphLLM` | Wrap a compiled `StateGraph` — multi-node flows, conditional routing, tool nodes, planners, and more |
Installation
```bash
pip install "videosdk-plugins-langchain"
```
You also need the LangChain integration package for your chosen model provider, e.g.:
```bash
pip install langchain-openai        # OpenAI / Azure OpenAI
pip install langchain-google-genai  # Google Gemini
pip install langchain-anthropic     # Anthropic Claude
```
Importing
```python
from videosdk.plugins.langchain import LangChainLLM, LangGraphLLM
```
LangChainLLM
LangChainLLM adapts any LangChain BaseChatModel for use inside the VideoSDK pipeline. It supports two tool-calling modes that can be combined on the same instance.
Tool-calling modes
Mode A — VideoSDK @function_tool methods
Define tools as @function_tool methods on your Agent subclass (exactly as you would with OpenAILLM or GoogleLLM). The adapter converts them to LangChain stubs for schema binding and lets VideoSDK dispatch and re-feed the results — the standard VideoSDK tool loop.
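To make the schema binding concrete, here is an illustrative, stdlib-only sketch of how a `@function_tool`-style method can be turned into an OpenAI-style tool schema. `tool_stub` and `_PY_TO_JSON` are hypothetical names invented for this example, not adapter API:

```python
import inspect

# Map Python annotations to JSON Schema type names (illustrative subset).
_PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def tool_stub(fn) -> dict:
    """Build a tool schema dict from a method's signature and docstring."""
    sig = inspect.signature(fn)
    props = {
        name: {"type": _PY_TO_JSON.get(p.annotation, "string")}
        for name, p in sig.parameters.items()
        if name != "self"  # skip the bound-method receiver
    }
    return {
        "name": fn.__name__,
        "description": (inspect.getdoc(fn) or "").split("\n")[0],
        "parameters": {"type": "object", "properties": props, "required": list(props)},
    }

async def post_message(self, channel: str, message: str) -> str:
    """Post a message to a Slack channel."""
    ...

schema = tool_stub(post_message)
# schema["name"] == "post_message"; both parameters are typed "string"
```

The real adapter does more (nested types, optional parameters), but the shape of the output is what the model ultimately sees when tools are bound.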
Mode B — LangChain-native tools
Pass LangChain tools (e.g. TavilySearchResults, WikipediaQueryRun, or any custom @tool function) at init via tools=[...]. The full call→execute→feed loop runs inside the adapter; the voice pipeline only receives the final text stream.
Both modes can be active simultaneously.
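The Mode B loop can be sketched in plain Python. Everything here (`run_tool_loop`, `fake_model`, the message dicts) is hypothetical scaffolding to show the control flow and the `max_tool_iterations` cap, not the adapter's actual implementation:

```python
def run_tool_loop(model, tools, messages, max_tool_iterations=10):
    """Call the model, execute any requested tools, feed results back."""
    tools_by_name = {t["name"]: t for t in tools}
    for _ in range(max_tool_iterations):
        reply = model(messages)                  # one model round
        calls = reply.get("tool_calls", [])
        if not calls:                            # no tool calls: final answer
            return reply["content"]
        messages.append(reply)
        for call in calls:                       # execute each requested tool
            result = tools_by_name[call["name"]]["fn"](**call["args"])
            messages.append({"role": "tool", "name": call["name"],
                             "content": str(result)})
    return "Tool iteration limit reached."

# Toy model: requests one search, then answers using the tool result.
def fake_model(messages):
    if any(m.get("role") == "tool" for m in messages):
        return {"content": "Answer: " + messages[-1]["content"], "tool_calls": []}
    return {"content": "", "tool_calls": [{"name": "search", "args": {"q": "videosdk"}}]}

tools = [{"name": "search", "fn": lambda q: f"results for {q}"}]
final = run_tool_loop(fake_model, tools, [{"role": "user", "content": "hi"}])
# final == "Answer: results for videosdk"
```

The voice pipeline never sees the intermediate tool messages; only the final text stream (here, `final`) would be forwarded to TTS.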
Basic Usage
**Mode A — VideoSDK tools**

```python
from langchain_openai import ChatOpenAI
from videosdk.plugins.langchain import LangChainLLM
from videosdk.agents import Agent, Pipeline, function_tool

class SlackAgent(Agent):
    def __init__(self):
        super().__init__(instructions="You are a helpful Slack assistant.")

    @function_tool
    async def post_message(self, channel: str, message: str) -> str:
        """Post a message to a Slack channel.

        Args:
            channel: Channel name (e.g. 'general')
            message: The message text to post
        """
        # ... your Slack API call here
        return f"Message posted to #{channel}."

langchain_llm = LangChainLLM(
    llm=ChatOpenAI(model="gpt-4o-mini", streaming=True),
)

pipeline = Pipeline(llm=langchain_llm, ...)
```

**Mode B — LangChain-native tools**

```python
from langchain_openai import ChatOpenAI
from langchain_community.tools.tavily_search import TavilySearchResults
from videosdk.plugins.langchain import LangChainLLM
from videosdk.agents import Pipeline

langchain_llm = LangChainLLM(
    llm=ChatOpenAI(model="gpt-4o-mini", streaming=True),
    tools=[TavilySearchResults(max_results=3)],
)

pipeline = Pipeline(llm=langchain_llm, ...)
```
Configuration Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| `llm` | `BaseChatModel` | required | Any LangChain chat model instance |
| `tools` | `list \| None` | `None` | LangChain-native tools executed internally (Mode B) |
| `max_tool_iterations` | `int` | `10` | Safety cap on consecutive internal tool-call rounds |
When loading credentials from a `.env` file, pass them to the LangChain model constructor (e.g. `ChatOpenAI()`), not to `LangChainLLM`. The SDK reads environment variables automatically, so you can also omit them entirely and rely on the provider's SDK defaults.
Full Example — Voice-controlled Slack Assistant
```python
"""
Voice-controlled Slack assistant powered by LangChain.

Pipeline: DeepgramSTT + LangChainLLM + CartesiaTTS + SileroVAD + TurnDetector
Env Vars: VIDEOSDK_AUTH_TOKEN, DEEPGRAM_API_KEY, OPENAI_API_KEY,
          CARTESIA_API_KEY, SLACK_BOT_TOKEN
"""
import os

from slack_sdk.web.async_client import AsyncWebClient
from langchain_openai import ChatOpenAI

from videosdk.agents import Agent, AgentSession, Pipeline, WorkerJob, JobContext, RoomOptions, function_tool
from videosdk.plugins.deepgram import DeepgramSTT
from videosdk.plugins.cartesia import CartesiaTTS
from videosdk.plugins.silero import SileroVAD
from videosdk.plugins.turn_detector import TurnDetector, pre_download_model
from videosdk.plugins.langchain import LangChainLLM

pre_download_model()

_slack = AsyncWebClient(token=os.environ.get("SLACK_BOT_TOKEN", ""))

class SlackVoiceAgent(Agent):
    def __init__(self):
        super().__init__(
            instructions=(
                "You are Max, a voice-controlled Slack assistant. "
                "You can post messages to channels. "
                "After executing any action, confirm it briefly."
            ),
        )

    async def on_enter(self) -> None:
        await self.session.say("Hey! I'm Max. Which channel would you like to post to?")

    @function_tool
    async def post_message(self, channel: str, message: str) -> str:
        """Post a message to a Slack channel.

        Args:
            channel: Channel name or ID (e.g. 'general', 'C01234ABCDE')
            message: The message text to post
        """
        channel_name = channel.lstrip("#")
        try:
            await _slack.chat_postMessage(channel=f"#{channel_name}", text=message)
            return f"Message posted to #{channel_name}."
        except Exception as exc:
            return f"Failed to post to #{channel_name}: {exc}"

async def entrypoint(ctx: JobContext):
    agent = SlackVoiceAgent()
    langchain_llm = LangChainLLM(
        llm=ChatOpenAI(model="gpt-4o-mini", streaming=True),
    )
    pipeline = Pipeline(
        stt=DeepgramSTT(),
        llm=langchain_llm,
        tts=CartesiaTTS(),
        vad=SileroVAD(),
        turn_detector=TurnDetector(),
    )
    session = AgentSession(agent=agent, pipeline=pipeline)
    await session.start(wait_for_participant=True, run_until_shutdown=True)

def make_context() -> JobContext:
    return JobContext(room_options=RoomOptions(room_id="<room_id>", name="Slack Assistant", playground=True))

if __name__ == "__main__":
    WorkerJob(entrypoint=entrypoint, jobctx=make_context).start()
```
LangGraphLLM
LangGraphLLM wraps a compiled LangGraph StateGraph as a VideoSDK LLM. The entire graph — nodes, edges, tool nodes, conditional routing, and internal state — runs as the "LLM" from the pipeline's perspective.
Key concepts
- `output_node` — Only text chunks emitted by this node name reach TTS. Use it to suppress intermediate planner/researcher nodes and expose only the final synthesis node to the voice pipeline.
- `stream_mode` — `"messages"` (default) streams `AIMessageChunk` tokens. `"custom"` streams arbitrary objects emitted via `graph.send()`. Pass a list for both simultaneously.
- `subgraphs` — Set `True` to stream tokens from nested subgraphs (requires LangGraph ≥ 0.2).
- `config` — Optional LangGraph `RunnableConfig` dict for thread IDs, recursion limits, or custom callbacks.
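The `output_node` filter can be illustrated with a minimal sketch. The `(chunk, metadata)` event shape mirrors LangGraph's `"messages"` stream mode, where metadata names the emitting node; `filter_stream` itself is a hypothetical helper written for this example, not part of the adapter:

```python
def filter_stream(events, output_node=None):
    """Yield only the text chunks that should reach TTS."""
    for chunk, metadata in events:
        # None means "forward everything"; otherwise match the node name.
        if output_node is None or metadata.get("langgraph_node") == output_node:
            yield chunk

events = [
    ("Plan: intro, body, outro. ", {"langgraph_node": "planner_node"}),
    ("Draft text... ",             {"langgraph_node": "write_sections"}),
    ("Your blog is ready!",        {"langgraph_node": "synthesizer_node"}),
]

spoken = "".join(filter_stream(events, output_node="synthesizer_node"))
# spoken == "Your blog is ready!"
```

Without the filter, all three chunks would be concatenated and spoken aloud, which is exactly the failure mode the `output_node` parameter exists to prevent.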
Basic Usage
```python
from videosdk.plugins.langchain import LangGraphLLM
from videosdk.agents import Pipeline

# graph is a compiled LangGraph StateGraph
llm = LangGraphLLM(
    graph=my_compiled_graph,
    output_node="synthesizer_node",  # only this node's text reaches TTS
)

pipeline = Pipeline(llm=llm, ...)
```
Configuration Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| `graph` | `CompiledStateGraph` | required | A compiled LangGraph graph (`StateGraph.compile()`) |
| `output_node` | `str \| None` | `None` | Node name whose text is forwarded to TTS; `None` forwards all AI text |
| `config` | `dict \| None` | `None` | LangGraph `RunnableConfig` (thread IDs, recursion limit, callbacks) |
| `stream_mode` | `str \| list[str]` | `"messages"` | Streaming mode: `"messages"`, `"custom"`, or both as a list |
| `subgraphs` | `bool` | `False` | Stream tokens from nested subgraphs |
| `context` | `Any \| None` | `None` | LangGraph 2.0 context object injected at runtime |
Full Example — Voice-driven Blog Writer
This example shows a three-question information-gathering flow followed by a multi-step sequential writing pipeline.
```text
START
  → coordinator_node (extract topic / audience / tone)
      ├─ all 3 gathered?
      │    → planner_node      (plan 4 BlogSection objects)
      │    → write_sections    (4 sequential LLM calls, one per section)
      │    → compiler_node     (join + save {slug}.md)
      │    → synthesizer_node  ← OUTPUT NODE (spoken announcement)
      └─ info still missing?
           → synthesizer_node  (asks the next gathering question)
  → END
```
```python
"""
Voice-driven blog writer powered by a sequential LangGraph pipeline.

Env Vars: VIDEOSDK_AUTH_TOKEN, DEEPGRAM_API_KEY, GOOGLE_API_KEY, CARTESIA_API_KEY
"""
import os

from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
from langgraph.graph import END, START, MessagesState, StateGraph
from langchain_google_genai import ChatGoogleGenerativeAI
from pydantic import BaseModel, Field

from videosdk.agents import Agent, AgentSession, Pipeline, WorkerJob, JobContext, RoomOptions
from videosdk.plugins.deepgram import DeepgramSTT
from videosdk.plugins.cartesia import CartesiaTTS
from videosdk.plugins.silero import SileroVAD
from videosdk.plugins.turn_detector import TurnDetector, pre_download_model
from videosdk.plugins.langchain import LangGraphLLM

pre_download_model()

# --- Pydantic schemas ---
class GatheringInfo(BaseModel):
    topic: str = Field(default="")
    audience: str = Field(default="")
    tone: str = Field(default="")

class BlogSection(BaseModel):
    name: str
    description: str

class BlogSections(BaseModel):
    sections: list[BlogSection]

class BlogState(MessagesState):
    topic: str
    audience: str
    tone: str
    sections: list[BlogSection]
    completed_sections: list[str]
    filename: str
    blog_done: bool

# --- LLMs ---
_MODEL = "gemini-2.5-flash"
_coordinator_llm = ChatGoogleGenerativeAI(model=_MODEL).with_structured_output(GatheringInfo)
_planner_llm = ChatGoogleGenerativeAI(model=_MODEL).with_structured_output(BlogSections)
_writer_llm = ChatGoogleGenerativeAI(model=_MODEL, streaming=True)
_synth_llm = ChatGoogleGenerativeAI(model=_MODEL, streaming=True)

# --- Graph nodes (abbreviated — see full example in repo) ---
def coordinator_node(state: BlogState) -> dict: ...
def planner_node(state: BlogState) -> dict: ...
def write_sections_node(state: BlogState) -> dict: ...
def compiler_node(state: BlogState) -> dict: ...
def synthesizer_node(state: BlogState) -> dict: ...
def route_after_coordinator(state: BlogState) -> str: ...

builder = StateGraph(BlogState)
builder.add_node("coordinator", coordinator_node)
builder.add_node("planner", planner_node)
builder.add_node("write_sections", write_sections_node)
builder.add_node("compiler", compiler_node)
builder.add_node("synthesizer_node", synthesizer_node)

builder.add_edge(START, "coordinator")
builder.add_conditional_edges(
    "coordinator", route_after_coordinator,
    {"planner": "planner", "synthesizer_node": "synthesizer_node"},
)
builder.add_edge("planner", "write_sections")
builder.add_edge("write_sections", "compiler")
builder.add_edge("compiler", "synthesizer_node")
builder.add_edge("synthesizer_node", END)

blog_graph = builder.compile()

# --- Agent ---
class BlogWriterAgent(Agent):
    def __init__(self):
        super().__init__(instructions="You are Aria, a friendly AI writing assistant.")

    async def on_enter(self) -> None:
        await self.session.say(
            "Hi! I'm Aria. What topic would you like me to write a blog about?"
        )

async def entrypoint(ctx: JobContext):
    agent = BlogWriterAgent()
    langgraph_llm = LangGraphLLM(
        graph=blog_graph,
        output_node="synthesizer_node",  # only synthesizer text reaches TTS
    )
    pipeline = Pipeline(
        stt=DeepgramSTT(),
        llm=langgraph_llm,
        tts=CartesiaTTS(),
        vad=SileroVAD(),
        turn_detector=TurnDetector(),
    )
    session = AgentSession(agent=agent, pipeline=pipeline)
    await session.start(wait_for_participant=True, run_until_shutdown=True)

def make_context() -> JobContext:
    return JobContext(room_options=RoomOptions(room_id="<room_id>", name="Blog Writer", playground=True))

if __name__ == "__main__":
    WorkerJob(entrypoint=entrypoint, jobctx=make_context).start()
```
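The node and routing functions above are abbreviated. A plausible shape for `route_after_coordinator`, written here as a hypothetical stdlib-only sketch, routes to the planner only once all three pieces of information are present:

```python
def route_after_coordinator(state: dict) -> str:
    """Return the name of the next node based on gathered state."""
    # Empty strings are falsy, so a missing field keeps us gathering.
    gathered = all(state.get(k) for k in ("topic", "audience", "tone"))
    return "planner" if gathered else "synthesizer_node"

route_after_coordinator(
    {"topic": "AI agents", "audience": "developers", "tone": "friendly"}
)  # → "planner"
route_after_coordinator({"topic": "AI agents"})  # → "synthesizer_node"
```

Returning node names as strings matches how `add_conditional_edges` maps routing results to destination nodes in the graph above.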
Always set `output_node` to your final "speech synthesis" node when using multi-node graphs. Without it, text from intermediate planner/researcher nodes is also forwarded to TTS, producing unexpected spoken output.
Choosing Between LangChainLLM and LangGraphLLM
| Scenario | Recommended adapter |
|---|---|
| Simple model swap (e.g. use Mistral instead of OpenAI) | LangChainLLM |
| Add web-search / RAG tools without changing agent code | LangChainLLM with tools=[...] |
| Multi-step sequential pipeline (plan → write → compile) | LangGraphLLM |
| Conditional routing / state machines | LangGraphLLM |
| Mixture-of-experts or parallel sub-agents | LangGraphLLM |
Additional Resources
Got a question? Ask us on Discord.

