State: Define Your Data Model

State is a Pydantic model that defines all the data your graph collects. Fields are extracted from user utterances and validated automatically.

Defining State

To define your state, create a class that extends GraphState and declare each field using Pydantic's Field. Each field represents a piece of data your graph will collect from the user during the conversation.

from videosdk.conversational_graph import GraphState
from pydantic import Field, field_validator
from typing import Optional

class LoanState(GraphState):
    name:              Optional[str]  = Field(None, description="Applicant full name (first + last)")
    employment_status: Optional[str]  = Field(
        None,
        description="Employment status: employed | unemployed | student | self-employed",
    )
    income:            Optional[int]  = Field(None, ge=0, description="Annual income in INR")
    loan_amount:       Optional[int]  = Field(None, description="Requested loan amount in INR")
    credit_score:      Optional[int]  = Field(None, ge=300, le=850, description="Credit score 300-850")

Field Definitions

Declare every field as Optional with a default of None (or another explicit default value), since values are collected incrementally during the conversation and most fields are still empty when the graph starts.

description - Critical. This text is included in the LLM system prompt as extraction hints. Be specific about format and valid values.
ge, le - Numeric constraints enforced by Pydantic validation.
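The points above can be tried out without the SDK. The following is a minimal sketch, assuming plain Pydantic v2 behavior: `BaseModel` stands in for `GraphState`, and `missing_fields` is a hypothetical helper that is not part of the library. It shows that descriptions are available as field metadata, that all-`None` defaults allow a partially filled state, and that `ge`/`le` reject out-of-range values.

```python
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationError


# Assumption: BaseModel stands in for GraphState so the sketch runs without the SDK.
class LoanState(BaseModel):
    name: Optional[str] = Field(None, description="Applicant full name (first + last)")
    income: Optional[int] = Field(None, ge=0, description="Annual income in INR")
    credit_score: Optional[int] = Field(None, ge=300, le=850, description="Credit score 300-850")


def missing_fields(state: BaseModel) -> List[str]:
    """Hypothetical helper: names of fields that are still None."""
    return [name for name, value in state.model_dump().items() if value is None]


# Each field's description is stored as metadata on the model class.
hints = {name: info.description for name, info in LoanState.model_fields.items()}

# A partially filled state is valid because every field defaults to None.
state = LoanState(name="Asha Rao")
still_needed = missing_fields(state)

# ge/le are checked when the model is built; 900 exceeds le=850.
try:
    LoanState(credit_score=900)
    out_of_range_rejected = False
except ValidationError:
    out_of_range_rejected = True
```

How the SDK turns these descriptions into prompt text is not shown here; the `hints` dictionary only illustrates where that text lives on the model.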

Custom Validators

Beyond Pydantic's built-in constraints, you can use @field_validator to add custom validation and normalization logic. This is useful for enforcing business rules, restricting values to a known set, or normalizing user input (e.g., lowercasing, trimming whitespace).

class LoanState(GraphState):
    employment_status: Optional[str] = Field(
        None, description="Employment status: employed | unemployed | student | self-employed"
    )
    name: Optional[str] = Field(None, description="Applicant full name")

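    @field_validator("employment_status")
    @classmethod
    def _validate_employment(cls, v):
        # None (or an empty string) passes through unchanged.
        if not v:
            return v
        valid = {"employed", "unemployed", "student", "self-employed"}
        # Trim and lowercase before comparing, so " Employed " is accepted.
        normalized = v.strip().lower()
        if normalized not in valid:
            raise ValueError(f"Must be one of {sorted(valid)}")
        return normalized

    @field_validator("name")
    @classmethod
    def _validate_name(cls, v):
        if not v:
            return v
        # Trim first so surrounding whitespace does not count toward the length.
        cleaned = v.strip()
        if len(cleaned) < 2:
            raise ValueError("Name too short")
        return cleaned.title()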

When validation fails, the extracted value is silently dropped - the field remains None and the node can re-ask.
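The drop-on-failure behavior can be approximated with plain Pydantic. This is a sketch under stated assumptions, not the SDK's actual implementation: `BaseModel` stands in for `GraphState`, `validate_assignment=True` is enabled so that each assignment runs the validators, and `apply_extracted` is a hypothetical helper introduced only for illustration.

```python
from typing import Any, Dict, List, Optional

from pydantic import BaseModel, ConfigDict, Field, ValidationError, field_validator


# Assumption: BaseModel stands in for GraphState so the sketch runs without the SDK.
class LoanState(BaseModel):
    # Run field validators on every attribute assignment, not only at construction.
    model_config = ConfigDict(validate_assignment=True)

    name: Optional[str] = Field(None, description="Applicant full name")
    employment_status: Optional[str] = Field(
        None, description="Employment status: employed | unemployed | student | self-employed"
    )

    @field_validator("employment_status")
    @classmethod
    def _validate_employment(cls, v):
        valid = {"employed", "unemployed", "student", "self-employed"}
        if v and v.lower() not in valid:
            raise ValueError(f"Must be one of {sorted(valid)}")
        return v.lower() if v else v

    @field_validator("name")
    @classmethod
    def _validate_name(cls, v):
        if v and len(v.strip()) < 2:
            raise ValueError("Name too short")
        return v.strip().title() if v else v


def apply_extracted(state: LoanState, extracted: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: assign each value, skipping any that fail validation."""
    rejected = []
    for field_name, value in extracted.items():
        try:
            setattr(state, field_name, value)
        except ValidationError:
            # The failed assignment leaves the field at its previous value (None here).
            rejected.append(field_name)
    return rejected


state = LoanState()
rejected = apply_extracted(state, {"name": "  asha rao ", "employment_status": "Retired"})
```

After this runs, the name is stored in normalized form, while `employment_status` is still `None`, which is the condition a node would check before asking again.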

Collect Schema vs Global State

There are two ways to tell the extractor which fields to collect: fields (a list of field names from your global state) or schema (a separate Pydantic model). When calling ctx.extractor.collect(), pass schema instead of fields to scope extraction to that specific node:

class NameSchema(GraphState):
    name: str = Field(..., description="Applicant full name (first + last)")

async def collect_name_node(state, ctx):
    result = await ctx.extractor.collect(
        schema=NameSchema,
        prompt="Ask the user for their full name.",
    )
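For comparison, the same node can name a global-state field directly instead of passing a schema. The sketch below assumes `collect()` accepts a `fields` keyword as described in this section. `StubExtractor` and `StubContext` are hypothetical stand-ins that only record the call so the example runs without the SDK; the real return value of `collect()` is not documented here and is not modeled.

```python
import asyncio


class StubExtractor:
    """Hypothetical stand-in for ctx.extractor; it only records what was requested."""

    def __init__(self):
        self.calls = []

    async def collect(self, fields=None, schema=None, prompt=""):
        self.calls.append({"fields": fields, "schema": schema, "prompt": prompt})
        return None  # the real result shape is not covered in this section


class StubContext:
    """Hypothetical stand-in for the ctx object passed to a node."""

    def __init__(self):
        self.extractor = StubExtractor()


async def collect_name_node(state, ctx):
    # Refer to the field by name; its description comes from the global state.
    return await ctx.extractor.collect(
        fields=["name"],
        prompt="Ask the user for their full name.",
    )


ctx = StubContext()
asyncio.run(collect_name_node(state=None, ctx=ctx))
```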

Note: Every field in a collect schema must also exist in the global GraphState.

The collect schema controls what the LLM extracts in that node, but the extracted values are stored in the global state so they are accessible throughout the entire graph. If a field exists in a collect schema but not in the global state, it will be extracted but lost.

# Global state - declares ALL fields used anywhere in the graph
class LoanState(GraphState):
    name:              Optional[str]  = Field(None, description="Applicant full name (first + last)")
    employment_status: Optional[str]  = Field(None, description="Employment status")
    income:            Optional[int]  = Field(None, ge=0, description="Annual income in INR")

# Collect schema - scoped to one node, extracts a subset of fields
class NameSchema(GraphState):
    name: str = Field(..., description="Applicant full name (first + last)")

In the collect schema, fields can be non-optional (required) since the schema is only used for extraction at that node.

| Approach | When to use |
| --- | --- |
| `fields=["name"]` | Field already defined in global state. Simplest option - no extra model needed. |
| `schema=NameSchema` | You want custom descriptions, stricter validation, or to group multiple fields for a single extraction. |

Both approaches store extracted values in the same global state.
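Because a field that appears in a collect schema but not in the global state is extracted and then lost, it can help to check for that mismatch up front. The following is a minimal sketch using Pydantic v2's `model_fields`. `BaseModel` stands in for `GraphState`, and both `ContactSchema` and `fields_missing_from_state` are hypothetical names introduced for illustration, not part of the SDK.

```python
from typing import List, Optional, Type

from pydantic import BaseModel, Field


# Assumption: BaseModel stands in for GraphState so the sketch runs without the SDK.
class LoanState(BaseModel):
    name: Optional[str] = Field(None, description="Applicant full name (first + last)")
    income: Optional[int] = Field(None, ge=0, description="Annual income in INR")


class NameSchema(BaseModel):
    name: str = Field(..., description="Applicant full name (first + last)")


class ContactSchema(BaseModel):
    name: str = Field(..., description="Applicant full name (first + last)")
    phone: str = Field(..., description="Phone number")  # not declared in LoanState


def fields_missing_from_state(
    schema: Type[BaseModel], state_cls: Type[BaseModel]
) -> List[str]:
    """Hypothetical helper: schema fields that the global state does not declare."""
    return [name for name in schema.model_fields if name not in state_cls.model_fields]


ok = fields_missing_from_state(NameSchema, LoanState)
lost = fields_missing_from_state(ContactSchema, LoanState)
```

A check like this could run once at startup or in a unit test, so a missing global field is caught before a conversation reaches the node.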
