
Worker Configuration

Workers are the execution engines that run your AI Agent jobs. Think of them as the bridge between your agent logic and the VideoSDK runtime.

This guide walks you through how to configure and tune a Worker for different environments — from local dev to production.

Quick Start: Minimal Worker

Here’s the simplest Worker setup to get going:

from videosdk.agents import WorkerJob, Options, JobContext, RoomOptions

options = Options(
    agent_id="MyAgent",
    max_processes=5,
    register=True,  # Registers the worker with the backend for job scheduling
)

room_options = RoomOptions(
    name="My Agent",
)

job_context = JobContext(room_options=room_options)

job = WorkerJob(
    entrypoint=your_agent_function,
    jobctx=lambda: job_context,
    options=options,
)

job.start()

That’s enough to start processing jobs locally or in staging.

Worker Options Explained

The Options class gives you fine-grained control over Worker behavior:

Option              Purpose                                          Example
agent_id            Unique identifier for your agent                 "SupportBot01"
max_processes       Maximum concurrent jobs                          10
num_idle_processes  Pre-warmed processes for faster startup          2
load_threshold      Max CPU/load tolerance before refusing jobs      0.75
register            Whether to register with the backend             True (prod) / False (local)
log_level           Logging verbosity                                "DEBUG", "INFO", "ERROR"
host, port          Bind address for health/status endpoints         "0.0.0.0", 8081
memory_warn_mb      Memory usage (MB) that triggers warning logs     500.0
memory_limit_mb     Hard memory cap in MB (0 = unlimited)            1000.0
ping_interval       Heartbeat interval in seconds                    30.0
max_retry           Max connection retries before giving up          16
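Putting several of these together, a fully specified Options might look like the sketch below. The values are illustrative starting points, not recommendations:

```python
from videosdk.agents import Options

options = Options(
    agent_id="SupportBot01",    # unique identifier for this agent
    max_processes=10,           # up to 10 concurrent jobs
    num_idle_processes=2,       # keep 2 processes pre-warmed
    load_threshold=0.75,        # refuse new jobs above 75% load
    register=True,              # register with the backend scheduler
    log_level="INFO",
    host="0.0.0.0",
    port=8081,                  # health/status endpoint bind address
    memory_warn_mb=500.0,       # warn above 500 MB
    memory_limit_mb=1000.0,     # hard cap at 1 GB (0 = unlimited)
    ping_interval=30.0,         # heartbeat every 30 seconds
    max_retry=16,               # give up after 16 failed connection retries
)
```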

Example Configurations

A standard production configuration for typical deployments:

options = Options(
    agent_id="StandardAgent",
    max_processes=5,
    register=True,
    log_level="INFO",
)

This configuration is suitable for:

  • Standard production deployments
  • Moderate traffic loads
  • Most business applications
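For local development, a lighter configuration usually makes sense: skip backend registration and turn up logging. This is a sketch under those assumptions, not a prescribed setup:

```python
from videosdk.agents import Options

options = Options(
    agent_id="DevAgent",
    max_processes=2,    # keep concurrency low on a laptop
    register=False,     # don't register with the backend while testing locally
    log_level="DEBUG",  # verbose logs during development
)
```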


Scaling Your Workers

Workers can scale both vertically (more power per instance) and horizontally (more instances).

  • Vertical Scaling → Increase max_processes to run more jobs per worker.
  • Horizontal Scaling → Deploy multiple workers; the backend registry will balance load.
  • Idle Processes → Use num_idle_processes to reduce cold start latency.
  • Load Threshold → Tune load_threshold (default 0.75) to prevent overload.
  • Memory Safety → Use memory_warn_mb and memory_limit_mb to keep processes healthy.
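The scaling knobs above combine naturally in one configuration. The sketch below shows illustrative values only; tune each against your own metrics:

```python
from videosdk.agents import Options

options = Options(
    agent_id="ScaledAgent",
    max_processes=10,        # vertical scaling: more jobs per worker
    num_idle_processes=2,    # warm processes reduce cold-start latency
    load_threshold=0.75,     # back off before the host saturates
    memory_warn_mb=500.0,    # log early when memory climbs
    memory_limit_mb=1000.0,  # hard cap so one leaky job can't take down the worker
    register=True,           # let the backend registry balance load across workers
)
```

For horizontal scaling, deploy several workers with this same configuration; with register=True, the backend registry distributes jobs among them.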

Pro Tips

  • Start small → Begin with max_processes=5 and adjust as you observe metrics.
  • Log smart → Use DEBUG in dev, but INFO or WARN in prod to reduce noise.
  • Monitor & Auto-Scale → Pair with metrics (Prometheus, Grafana, CloudWatch, etc.) to auto-scale horizontally.
  • Keep processes warm → Set at least num_idle_processes=1 in production for faster first-response times.

Got a question? Ask us on Discord.