Why Conversational Graph

Build deterministic, multi-turn voice AI agents as directed graphs - where you control the flow and the LLM handles natural conversation.

The Problem: LLM-Driven Conversations

You ask the same question twice and get two completely different answers. LLM-driven flows are non-deterministic by nature: the same prompt won't always lead to the same decision.

Test 1 - Hallucination (Invalid Decision Making)

User: "I want to apply for a loan"
LLM: "Sure! What's your name?"

User: "Rahul"
LLM: "Great Rahul! You're approved!"

Test 2 - Irrelevant / Off-Track Flow

User: "I want to apply for a loan"
LLM: "Sure! What's your name?"

User: "Rahul"
LLM: "Nice to meet you Rahul! By the way, do you want tips on saving money?"

Test 3 - Correct Behavior

User: "I want to apply for a loan"
LLM: "Sure! What's your name?"

User: "Rahul"
LLM: "Thanks Rahul. What type of loan are you looking for? (e.g., personal, home, car)"

The LLM decides what to ask, when to branch, and when to stop. You have no control over the path, no guarantee that required data is collected, and no way to enforce business rules.

The Solution: Separate Flow from Language

Conversational Graph splits responsibilities:

You define the what and the when; the LLM handles how it sounds. Using an LLM is also optional.

What You Get

| Capability | How It Works |
| --- | --- |
| Predictable flows | Conversations follow the exact path you define. No hallucinated detours. |
| Structured data collection | Extract and validate user data with Pydantic models - types, constraints, custom validators. |
| Conditional branching | Route to different paths based on collected data (e.g., credit score -> approve/reject/review). |
| Human-in-the-loop | Pause the conversation, wait for external input (payment callback, human review), then resume. |
| Checkpointing | Save conversation state after every step. Resume interrupted calls. Time-travel to any past point. |
| Parallel tool execution | Run multiple nodes or tool calls concurrently with fan-out/fan-in transitions. |
| Provider agnostic | Works with any STT, LLM, TTS via the VideoSDK pipeline. |
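The structured-data-collection row can be sketched with a plain Pydantic model. The model below is illustrative only - the field names, constraints, and the `LoanDetails` class are assumptions for this example, not part of the Conversational Graph API:

```python
from pydantic import BaseModel, Field, field_validator

# Illustrative model for a loan-application step; field names and
# constraints are assumptions, not part of the SDK.
class LoanDetails(BaseModel):
    name: str = Field(min_length=1)
    loan_type: str  # e.g. "personal", "home", "car"
    amount: float = Field(gt=0, le=1_000_000)

    @field_validator("loan_type")
    @classmethod
    def known_loan_type(cls, v: str) -> str:
        allowed = {"personal", "home", "car"}
        if v.lower() not in allowed:
            raise ValueError(f"loan_type must be one of {sorted(allowed)}")
        return v.lower()

# Valid data parses; invalid data raises a ValidationError the graph
# can use to re-prompt the user instead of moving on.
details = LoanDetails(name="Rahul", loan_type="Home", amount=250000)
print(details.loan_type)  # normalized to "home"
```

Because validation happens in the model rather than the prompt, a bad value ("boat") fails loudly and the flow can re-ask, instead of the LLM silently accepting it.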

Before and After

Without Conversational Graph - the LLM controls everything:

User speaks -> LLM decides what to do -> LLM responds
|
+-- May skip steps
+-- May hallucinate decisions
+-- May ask irrelevant questions
+-- No audit trail
+-- No way to pause/resume

With Conversational Graph - you control flow, LLM handles language:

User speaks -> Graph Engine decides what to do -> LLM makes it sound natural
|
+-- Always follows your defined path
+-- Validates every extracted value
+-- Branches based on your rules
+-- Saves state at every step
+-- Can pause and resume anytime
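The contrast above can be made concrete with a minimal, self-contained sketch of a deterministic flow engine. This is illustrative Python under assumed node names, not the VideoSDK API - the point is that the engine, never the LLM, picks the next node:

```python
# Minimal deterministic flow engine (illustrative; not the VideoSDK API).
# Each node names the field it collects; transitions are fixed by you.
GRAPH = {
    "ask_name":      {"collect": "name",      "next": "ask_loan_type"},
    "ask_loan_type": {"collect": "loan_type", "next": "done"},
    "done":          {"collect": None,        "next": None},
}

def run(turns):
    """Walk the graph in order, storing each user turn under the
    field the current node collects. The path never varies."""
    state, node, path = {}, "ask_name", []
    for turn in turns:
        path.append(node)
        field = GRAPH[node]["collect"]
        if field:
            state[field] = turn  # in practice the LLM extracts this value
        node = GRAPH[node]["next"]
    return state, path

state, path = run(["Rahul", "home"])
print(path)   # ['ask_name', 'ask_loan_type']
print(state)  # {'name': 'Rahul', 'loan_type': 'home'}
```

Run it twice with the same turns and you get the same path both times - the property the Test 1 and Test 2 transcripts above lack.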

How It Compares to Other Frameworks

Most agent frameworks fall into two categories. Conversational Graph takes a different approach.

The Two Common Approaches

1. LLM-navigated graphs - You define a graph of nodes, but the LLM decides which edges to follow at runtime. Great for tool-calling agents and multi-step reasoning. But the LLM controls routing, which means conversations can take unexpected paths.

User message -> LLM decides next step -> LLM calls tools -> LLM responds
|
+-- LLM chooses which node to visit
+-- LLM decides when to loop or stop
+-- You define the graph, but LLM navigates it

2. Guideline-driven agents - You write natural language rules ("guidelines") that steer LLM behavior. The LLM interprets these rules at runtime. Good for open-ended customer support. But guidelines are soft constraints - they can conflict, and the LLM may misinterpret them.

User message -> LLM reads guidelines -> LLM decides what to say and do
|
+-- Guidelines are soft constraints ("prefer X over Y")
+-- LLM interprets guidelines at runtime
+-- No fixed conversation path

What Conversational Graph Does Differently

Conversational Graph takes a third approach: you define the flow, and the graph engine follows it deterministically. The LLM never makes routing decisions - it only generates natural language and extracts data.

User speaks -> Graph Engine decides next step -> LLM makes it sound natural
|
+-- Engine follows your transitions exactly
+-- No LLM in the routing loop
+-- Deterministic, auditable path
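Because routing is plain code, a branch like the credit-score example from the table above can be an ordinary function with testable, auditable rules. The thresholds here are made up for illustration:

```python
def route_on_credit_score(score: int) -> str:
    """Deterministic branch: same input, same next node, every time.
    Thresholds are illustrative, not from the SDK."""
    if score >= 750:
        return "approve"
    if score >= 600:
        return "manual_review"
    return "reject"

print(route_on_credit_score(780))  # approve
```

An auditor can read these three lines and know every possible path; no amount of prompt engineering gives the same guarantee.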

When to use Conversational Graph

Use Conversational Graph when the conversation must follow a strict, auditable path - loan applications, appointment booking, insurance claims, payment flows - where every step must happen in order, every field must be validated, and business rules (not LLM judgment) determine the next step.
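Pause-and-resume in flows like these reduces to checkpointing: persist the collected state and the current node after every step, then reload them when the call resumes. A minimal JSON sketch - the file name and checkpoint shape are assumptions, not the SDK's format:

```python
import json
from pathlib import Path

# Illustrative checkpoint: persist collected fields plus the current
# node so an interrupted call can resume exactly where it stopped.
def save_checkpoint(path: Path, node: str, state: dict) -> None:
    path.write_text(json.dumps({"node": node, "state": state}))

def load_checkpoint(path: Path) -> tuple[str, dict]:
    data = json.loads(path.read_text())
    return data["node"], data["state"]

ckpt = Path("loan_call.json")
save_checkpoint(ckpt, "ask_loan_type", {"name": "Rahul"})
node, state = load_checkpoint(ckpt)
print(node, state)  # ask_loan_type {'name': 'Rahul'}
```

Saving after every step is also what enables time-travel: keep each checkpoint instead of overwriting it, and you can replay the conversation from any past point.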

Got a question? Ask us on Discord.