xAI (Grok)

The xAI (Grok) provider lets your agent use xAI's Grok models for real-time, multimodal (voice and text) interactions.

Installation

Install the xAI-enabled VideoSDK Agents package:

pip install "videosdk-plugins-xai"

Authentication

The xAI plugin requires an xAI API key.

Set XAI_API_KEY in your .env file.
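
A quick startup check can catch a missing key before the agent tries to connect. The sketch below is illustrative only: `require_xai_api_key` is a hypothetical helper, not part of the plugin, and it assumes the contents of your .env file have already been loaded into the process environment (for example by the SDK itself or by a tool such as python-dotenv).

```python
import os
from typing import Mapping, Optional


def require_xai_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the xAI API key from the environment, or raise if it is missing.

    Hypothetical helper for illustration; not part of videosdk-plugins-xai.
    """
    # Fall back to the real process environment when no mapping is supplied.
    source = os.environ if env is None else env
    key = source.get("XAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "XAI_API_KEY is not set. Add it to your .env file or export it "
            "in your shell before starting the agent."
        )
    return key
```

Calling `require_xai_api_key()` once at startup fails fast with a readable message instead of surfacing an authentication error later in the session.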

Importing

from videosdk.plugins.xai import XAIRealtime, XAIRealtimeConfig

Example Usage

from videosdk.plugins.xai import XAIRealtime, XAIRealtimeConfig
from videosdk.agents import RealTimePipeline

# Initialize the xAI Grok real-time model
model = XAIRealtime(
    model="grok-4-1-fast-non-reasoning",
    # Omit api_key when XAI_API_KEY is set in your .env file;
    # pass it explicitly only if the environment variable is not set.
    api_key="your-xai-api-key",
    config=XAIRealtimeConfig(
        voice="Eve",
        # collection_id="your-collection-id",  # Optional
    ),
)

# Create the pipeline with the model
pipeline = RealTimePipeline(model=model)

Note: If your credentials are stored in a .env file, do not also pass them as arguments to model instances. The SDK reads environment variables automatically, so you can omit the api_key parameter from your code.

Key Features

  • Multimodal Interactions: Use xAI's Grok models for voice and text.
  • Function Calling: Define custom tools to retrieve weather data, interact with external APIs, or perform other actions.
  • Web Search: Enable real-time web search capabilities by setting enable_web_search=True.
  • X Search: Access X (formerly Twitter) content by setting enable_x_search=True and providing allowed_x_handles.

Configuration Options

  • model: The Grok model to use (e.g., "grok-4-1-fast-non-reasoning").
  • api_key: Your xAI API key (can also be set via the XAI_API_KEY environment variable).
  • config: An XAIRealtimeConfig object for advanced options:
    • voice: (str) The voice to use for audio output (e.g., "Eve", "Ara", "Rex", "Sal", "Leo").
    • enable_web_search: (bool) Enable or disable web search capabilities.
    • enable_x_search: (bool) Enable or disable search on X (Twitter).
    • allowed_x_handles: (List[str]) A list of allowed X handles to search within.
    • collection_id: (str, optional) The ID of a custom collection from your xAI Console storage to provide additional context.
    • turn_detection: Configuration for detecting when a user has finished speaking.
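
As a rough sketch of how the search options above fit together, the hypothetical helper below collects them into keyword arguments for XAIRealtimeConfig. The field names (voice, enable_web_search, enable_x_search, allowed_x_handles) come from the list above; the helper itself, and its choice to turn on X search only when handles are supplied, are assumptions made for illustration rather than documented plugin behavior.

```python
from typing import Any, Dict, Optional, Sequence


def build_search_options(
    voice: str = "Eve",
    enable_web_search: bool = False,
    allowed_x_handles: Optional[Sequence[str]] = None,
) -> Dict[str, Any]:
    """Collect XAIRealtimeConfig keyword arguments (hypothetical helper)."""
    options: Dict[str, Any] = {
        "voice": voice,
        "enable_web_search": enable_web_search,
    }
    if allowed_x_handles:
        # Assumption: X search is switched on only when handles are given.
        options["enable_x_search"] = True
        options["allowed_x_handles"] = list(allowed_x_handles)
    return options


options = build_search_options(enable_web_search=True, allowed_x_handles=["xai"])
# With the plugin installed: config = XAIRealtimeConfig(**options)
```

Keeping the options in a plain dictionary makes them easy to log or unit-test before they are handed to the plugin.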

Collection Storage

xAI Grok supports using "collections" to provide additional context to your agent, grounding its responses in your own documents or data.

To use a collection:

  1. Navigate to xAI Console: Go to your console.x.ai dashboard.
  2. Access Storage: Click on the Storage section in the sidebar.
  3. Create New Collection: Click the "Create New Collection" button.
  4. Upload Files: Upload your relevant documents or data files to the new collection.
  5. Get Collection ID: Once the collection is created, copy its Collection ID.
  6. Use in Config: Pass the copied ID to your agent's configuration:
config=XAIRealtimeConfig(
    voice="Eve",
    collection_id="your-collection-id-from-console",
    # ... other config options
)

The agent will now use the content of this collection to inform its responses.
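
If you would rather not hard-code the ID, one option is to read it from the environment and pass collection_id only when it is present. This is a sketch under assumptions: the XAI_COLLECTION_ID variable name and the `with_collection` helper are hypothetical and are not defined by the plugin.

```python
import os
from typing import Any, Dict, Mapping, Optional


def with_collection(
    options: Dict[str, Any],
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, Any]:
    """Return a copy of options, adding collection_id if one is configured."""
    source = os.environ if env is None else env
    merged = dict(options)  # leave the caller's dictionary untouched
    # XAI_COLLECTION_ID is a hypothetical variable name chosen for this example.
    collection_id = source.get("XAI_COLLECTION_ID", "").strip()
    if collection_id:
        merged["collection_id"] = collection_id
    return merged


config_kwargs = with_collection({"voice": "Eve"})
# With the plugin installed: config = XAIRealtimeConfig(**config_kwargs)
```

Because collection_id is optional, leaving the variable unset simply produces a configuration without a collection.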
