
Google LLM

The Google LLM provider enables your agent to use Google's Gemini family of language models for text-based conversations and processing.

Installation

Install the Google-enabled VideoSDK Agents package:

pip install "videosdk-plugins-google"

Importing

from videosdk.plugins.google import GoogleLLM

Example Usage

from videosdk.plugins.google import GoogleLLM
from videosdk.agents import CascadingPipeline

# Initialize the Google LLM
llm = GoogleLLM(
    model="gemini-2.0-flash-001",
    # Omit the api_key argument when GOOGLE_API_KEY is set in your .env
    api_key="your-google-api-key",
    temperature=0.7,
    tool_choice="auto",
    max_output_tokens=1000,
)

# Add the LLM to the cascading pipeline
pipeline = CascadingPipeline(llm=llm)
Note: When using a .env file for credentials, don't pass them as arguments to model instances or context objects. The SDK automatically reads environment variables, so omit api_key and other credential parameters from your code.
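For example, a minimal sketch relying only on the environment variable (it assumes GOOGLE_API_KEY has already been exported, e.g. by your shell or a .env loader such as python-dotenv):

import os

from videosdk.plugins.google import GoogleLLM

# Assumes GOOGLE_API_KEY is already present in the environment,
# e.g. exported by your shell or loaded from a .env file.
assert "GOOGLE_API_KEY" in os.environ, "GOOGLE_API_KEY is not set"

# No api_key argument: the SDK reads the environment variable itself
llm = GoogleLLM(model="gemini-2.0-flash-001")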

Configuration Options

  • model: (str) The Google model to use (e.g., "gemini-2.0-flash-001", "gemini-1.5-pro") (default: "gemini-2.0-flash-001").
  • api_key: (str) Your Google API key. Can also be set via the GOOGLE_API_KEY environment variable.
  • temperature: (float) Sampling temperature for response randomness (default: 0.7).
  • tool_choice: (ToolChoice) Tool selection mode ("auto", "required", "none") (default: "auto").
  • max_output_tokens: (int) Maximum number of tokens in the completion response (optional).
  • top_p: (float) The nucleus sampling probability (optional).
  • top_k: (int) The top-k sampling parameter (optional).
  • presence_penalty: (float) Penalizes new tokens based on whether they appear in the text so far (optional).
  • frequency_penalty: (float) Penalizes new tokens based on their existing frequency in the text so far (optional).
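
As a rough sketch of how the optional sampling parameters above can be combined (the values are illustrative, not tuned recommendations):

from videosdk.plugins.google import GoogleLLM

llm = GoogleLLM(
    model="gemini-1.5-pro",
    temperature=0.4,        # lower randomness for steadier replies
    top_p=0.9,              # nucleus sampling: keep the top 90% probability mass
    top_k=40,               # sample only from the 40 most likely tokens
    presence_penalty=0.2,   # nudge the model toward new topics
    frequency_penalty=0.3,  # discourage verbatim repetition
    max_output_tokens=512,  # cap the completion length
)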
