Version: 1.0.x

Google LLM

The Google LLM provider enables your agent to use Google's Gemini family of language models for text-based conversations and processing. It also supports vision input capabilities, allowing your agent to analyze and respond to images alongside text with the supported models.

Installation

Install the Google-enabled VideoSDK Agents package:

pip install "videosdk-plugins-google"

Importing

from videosdk.plugins.google import GoogleLLM

Authentication

The Google plugin requires a Gemini API key.

Set GOOGLE_API_KEY in your .env file.
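For example, a minimal .env entry (the value shown is a placeholder for your real key):

```shell
# .env
GOOGLE_API_KEY=your-gemini-api-key
```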

Example Usage

from videosdk.plugins.google import GoogleLLM
from videosdk.agents import Pipeline

llm = GoogleLLM(
    model="gemini-2.5-flash-lite",
    temperature=0.7,
    tool_choice="auto",
    max_output_tokens=1000,
)

pipeline = Pipeline(llm=llm)
note

When using a .env file for credentials, don't pass them as arguments to model instances. The SDK automatically reads environment variables, so omit api_key and other credential parameters from your code.

Configuration Options

Core

  • model — The Gemini model to use (e.g. "gemini-2.5-flash-lite", "gemini-3-flash-preview", "gemini-3-pro-preview"). Default: "gemini-2.5-flash-lite".
  • api_key — Your Google API key. Falls back to the GOOGLE_API_KEY environment variable.
  • temperature — Sampling temperature. Default: 0.7.
  • tool_choice — Tool selection mode: "auto", "required", "none". Default: "auto".
  • max_output_tokens — Maximum tokens in the response (optional).
  • top_p — Nucleus sampling probability mass (float, optional).
  • top_k — Restricts sampling to the top-k most probable tokens (int, optional).
  • presence_penalty — Penalises tokens that have already appeared (float, optional).
  • frequency_penalty — Penalises tokens by their existing frequency in the response (float, optional).

Generation knobs

  • seed — Integer seed for deterministic sampling (optional).

Safety

  • safety_settings — List of google.genai.types.SafetySetting objects (or equivalent dicts) to override the model's default content-safety thresholds (optional).
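As a sketch, the dict form might look like the following. The category and threshold strings mirror the Gemini API's harm-category and block-threshold enums; treat the exact set of categories you need as something to confirm against the google-genai reference:

```python
# Dict-form safety settings, mirroring google.genai.types.SafetySetting fields.
# Each entry pairs a harm category with a blocking threshold.
safety_settings = [
    {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_ONLY_HIGH"},
    {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
]
```

These would then be passed as `GoogleLLM(safety_settings=safety_settings)`.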

Extended thinking

  • thinking_budget — Token budget for extended reasoning on Gemini 2.5 models. Set to a positive integer to enable; 0 to explicitly disable. Omit (None) to use the API default (optional).
  • thinking_level — Qualitative reasoning effort for Gemini 3 models: "low", "medium", "high", or "minimal". Ignored on Gemini 2.5 (optional).
  • include_thoughts — When True, the model's internal reasoning steps are surfaced in the response metadata alongside the final answer. Works with thinking_budget on Gemini 2.5 (bool, optional).

Extended Thinking

Gemini models support extended thinking — an internal reasoning pass the model performs before producing the final answer.

Gemini 2.5 — thinking_budget

from videosdk.plugins.google import GoogleLLM
from videosdk.agents import Pipeline

llm = GoogleLLM(
    model="gemini-2.5-flash-lite",
    thinking_budget=1024,    # token budget for internal reasoning
    include_thoughts=True,   # surface thoughts in response metadata
)

pipeline = Pipeline(llm=llm)

Gemini 3 — thinking_level

from videosdk.plugins.google import GoogleLLM
from videosdk.agents import Pipeline

llm = GoogleLLM(
    model="gemini-3-flash-preview",
    thinking_level="medium",  # "low" | "medium" | "high" | "minimal"
)

pipeline = Pipeline(llm=llm)
note

thinking_budget and thinking_level are mutually exclusive. Use thinking_budget for Gemini 2.5 models and thinking_level for Gemini 3 models. The plugin automatically routes to the correct configuration based on the model name.
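The routing described above can be sketched as a small helper. This is illustrative only, not the plugin's actual internals: the function name, the model-prefix check, and the returned dict keys are all assumptions made for the example.

```python
from typing import Optional


def resolve_thinking_config(
    model: str,
    thinking_budget: Optional[int] = None,
    thinking_level: Optional[str] = None,
) -> dict:
    """Illustrative sketch: pick the thinking option that matches the model family."""
    if thinking_budget is not None and thinking_level is not None:
        raise ValueError("thinking_budget and thinking_level are mutually exclusive")
    config: dict = {}
    if model.startswith("gemini-3"):
        # Gemini 3 models take a qualitative thinking_level.
        if thinking_level is not None:
            config["thinking_level"] = thinking_level
    else:
        # Gemini 2.5 models take a numeric thinking_budget.
        if thinking_budget is not None:
            config["thinking_budget"] = thinking_budget
    return config
```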

Vertex AI Integration

You can use Gemini models through Vertex AI. This requires different authentication and configuration.

Authentication for Vertex AI

Create a service account, download the JSON key file, and set the path in your environment:

export GOOGLE_APPLICATION_CREDENTIALS="/path/to/your/keyfile.json"
export GOOGLE_CLOUD_PROJECT="your-gcp-project-id"
export GOOGLE_CLOUD_LOCATION="your-gcp-location"

Example Usage with Vertex AI

from videosdk.plugins.google import GoogleLLM, VertexAIConfig
from videosdk.agents import Pipeline

llm = GoogleLLM(
    vertexai=True,
    vertexai_config=VertexAIConfig(
        project_id="videosdk",
        location="us-central1",
    ),
)

pipeline = Pipeline(llm=llm)

Additional Resources

Got a question? Ask us on Discord.