How It Works
Conversational Graph separates conversation flow from language generation. You define the flow as a directed graph of nodes and transitions; the graph engine executes it deterministically, while the LLM handles only natural-language responses and data extraction.
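As a rough illustration of that separation, here is a minimal sketch of a directed graph of nodes and transitions. The names (`Node`, `next_node`, the node and condition strings) are illustrative stand-ins, not the library's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    instruction: str  # what the LLM should say at this node
    transitions: dict = field(default_factory=dict)  # condition -> next node name

# A hypothetical booking flow as a directed graph:
graph = {
    "greet": Node("greet", "Welcome the caller and ask for their name.",
                  {"name_collected": "collect_date"}),
    "collect_date": Node("collect_date", "Ask for a preferred appointment date.",
                         {"date_collected": "confirm"}),
    "confirm": Node("confirm", "Read back the details and confirm.", {}),
}

def next_node(current: str, condition: str) -> str:
    """Deterministic transition: the engine, not the LLM, picks the edge."""
    return graph[current].transitions.get(condition, current)

print(next_node("greet", "name_collected"))  # -> collect_date
```

Because transitions are plain lookups, the conversation can only move along edges you declared, no matter what the LLM generates.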
Architecture
The diagram above shows how user speech flows through the system. The STT layer converts speech to text, the Conversational Graph engine (State Machine, Node Executor, Prompt Builder, Extraction Validator) processes it and builds an enriched prompt, the LLM generates a natural response with extracted values, and the TTS layer speaks the response back to the user.
Turn Lifecycle
Every conversation turn follows a structured cycle, from the moment the user speaks to when they hear the response. The graph engine orchestrates each step, ensuring that the right node runs, state is validated, and the conversation advances deterministically.
1. User speaks
|
2. STT converts speech to text
|
3. graph.handle_input(text)
|
+-- Runs the current node function
+-- Node returns an action (Route, Interrupt, END, etc.)
+-- Enriches the prompt with collected state
|
4. LLM receives enriched prompt + user text
|
+-- Generates natural response
+-- Extracts state values from user message
|
5. graph.handle_decision(llm_response)
|
+-- Validates extracted values (Pydantic)
+-- Updates state with valid extractions
+-- Re-runs node if state changed (routing logic fires)
+-- Advances StateMachine to next node
+-- Saves checkpoint
|
6. TTS speaks response to user
|
7. Repeat from step 1
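The cycle above can be sketched as a single turn function. Every function here is a stub standing in for the real STT, graph-engine, LLM, and TTS components; the names and return shapes are assumptions for illustration only:

```python
def stt(audio: bytes) -> str:
    return "I'd like to book for Friday"  # stub transcription (step 2)

def handle_input(text: str) -> str:
    # Step 3: run the current node and enrich the prompt with collected state.
    return f"[node=collect_date, state={{}}] User said: {text}"

def llm(prompt: str) -> dict:
    # Step 4: natural reply plus extracted state values.
    return {"reply": "Friday works. What time?", "extracted": {"date": "Friday"}}

def handle_decision(response: dict) -> str:
    # Step 5: validate extractions, update state, advance the state machine.
    state = dict(response["extracted"])  # stand-in for the real state update
    return response["reply"]

def turn(audio: bytes) -> str:
    prompt = handle_input(stt(audio))
    return handle_decision(llm(prompt))  # step 6 would hand this reply to TTS

print(turn(b""))
```

The point of the shape is that only `llm` produces language; every other step is deterministic code the engine controls.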
What the LLM Sees
The graph generates a system prompt that constrains the LLM to two jobs:
- Respond: Generate a natural reply following the graph's current instruction.
- Extract: Pull structured state values from the user's message.
The LLM does NOT decide routing, branching, or next steps; the graph engine owns all of that control flow.
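To show why the LLM cannot influence routing, here is a stdlib stand-in for the validation step (the engine itself uses Pydantic). The field names, prompt text, and the extra `route_to` key are all hypothetical:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AppointmentState:
    name: Optional[str] = None
    date: Optional[str] = None

def validate_extraction(raw: dict) -> dict:
    """Keep only known, non-empty string fields; drop everything else."""
    allowed = AppointmentState.__dataclass_fields__
    return {k: v for k, v in raw.items()
            if k in allowed and isinstance(v, str) and v}

# Simulated LLM payload that tries to smuggle in a routing decision:
llm_payload = {"name": "Ada", "date": "2025-06-01", "route_to": "checkout"}
clean = validate_extraction(llm_payload)  # 'route_to' is silently dropped
state = AppointmentState(**clean)
print(state)
```

Only fields declared in the schema survive validation, so extraction can fill state but never steer the graph.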

