Cerebras LLM

The Cerebras AI LLM provider enables your agent to use Cerebras AI's language models for text-based conversations and processing.

Installation​

Install the Cerebras-enabled VideoSDK Agents package:

pip install "videosdk-plugins-cerebras"

Importing​

from videosdk.plugins.cerebras import CerebrasLLM

Authentication​

The Cerebras plugin requires a Cerebras API key.

Set CEREBRAS_API_KEY in your .env file.

Example Usage​

from videosdk.plugins.cerebras import CerebrasLLM
from videosdk.agents import CascadingPipeline

# Initialize the Cerebras LLM model
llm = CerebrasLLM(
    model="llama3.3-70b",
    temperature=0.7,
    max_completion_tokens=1024,
)

# Add llm to cascading pipeline
pipeline = CascadingPipeline(llm=llm)

note

When using a .env file for credentials, don't pass them as arguments to model instances or context objects. The SDK reads environment variables automatically, so omit api_key and other credential parameters from your code.
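As a minimal sketch of this pattern, you can confirm the variable is present before constructing the model. The helper name and error message below are illustrative, not part of the SDK, and the snippet assumes your .env file has already been loaded into the environment (for example via python-dotenv's load_dotenv()):

```python
import os

# Illustrative helper (not part of the SDK): confirm CEREBRAS_API_KEY is set
# before building CerebrasLLM, so a missing key fails fast with a clear message.
def require_cerebras_key() -> str:
    key = os.getenv("CEREBRAS_API_KEY")
    if not key:
        raise RuntimeError("CEREBRAS_API_KEY is not set; add it to your .env file")
    return key
```

With the key in the environment, CerebrasLLM(...) can then be called without any api_key argument.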

Configuration Options​

  • model: (str) The Cerebras model to use (default: "llama3.3-70b"). Supported models include: llama3.3-70b, llama3.1-8b, llama-4-scout-17b-16e-instruct, qwen-3-32b, deepseek-r1-distill-llama-70b (private preview)
  • api_key: (str) Your Cerebras API key. Can also be set via the CEREBRAS_API_KEY environment variable.
  • temperature: (float) Sampling temperature for response randomness (default: 0.7).
  • tool_choice: (ToolChoice) Tool selection mode ("auto", "required", "none") (default: "auto").
  • max_completion_tokens: (int) Maximum number of tokens to generate in the response (optional).
  • top_p: (float) Nucleus sampling probability (optional).
  • seed: (int) Random seed for reproducible completions (optional).
  • stop: (str) Stop sequence that halts generation when encountered (optional).
  • user: (str) Identifier for the end user triggering the request (optional).
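To illustrate, the options above all map to CerebrasLLM constructor keyword arguments. A fully-specified configuration might look like the following sketch; the values are examples chosen for illustration, not defaults:

```python
# Example keyword arguments for CerebrasLLM; every value here is illustrative.
cerebras_llm_kwargs = {
    "model": "llama3.1-8b",          # any supported Cerebras model
    "temperature": 0.4,              # lower values give more deterministic output
    "tool_choice": "auto",           # "auto", "required", or "none"
    "max_completion_tokens": 512,    # cap on tokens generated per response
    "top_p": 0.9,                    # nucleus sampling probability
    "seed": 42,                      # fixed seed for reproducible completions
    "stop": "\n\n",                  # halt generation at a blank line
    "user": "end-user-1234",         # identifier for the end user
}

# Unpack into the constructor; api_key is omitted so the plugin reads it
# from the CEREBRAS_API_KEY environment variable:
# llm = CerebrasLLM(**cerebras_llm_kwargs)
```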
