Skip to main content
Version: 1.0.x

AWS Bedrock LLM

The AWS Bedrock LLM provider enables your agent to use any Bedrock-hosted model (Amazon Nova, Anthropic Claude, Meta Llama, Mistral, and more) for text-based conversations and processing through the unified Bedrock Converse API. It also supports vision input, allowing your agent to analyze and respond to images alongside text with the supported models.

Installation

Install the AWS-enabled VideoSDK Agents package:

pip install "videosdk-plugins-aws"

Importing

from videosdk.agents.plugins import AWSBedrockLLM

Authentication

  • AWS Account: You have an active AWS account with permissions to access Amazon Bedrock and the model you intend to invoke.
  • Model Access: You've requested and been granted access to the target model in the Bedrock console under Model access, in the region you're using.
  • Region Selection: You're operating in a region where the model is available (e.g. US East (N. Virginia) us-east-1), as model access is region-specific.
  • AWS Credentials: Your AWS credentials (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION) are configured, either through environment variables or your preferred credential management method (IAM role, shared profile, etc.).

Example Usage

from videosdk.agents.plugins import AWSBedrockLLM
from videosdk.agents import Pipeline

# Initialize the AWS Bedrock LLM model
llm = AWSBedrockLLM(
model="amazon.nova-lite-v1:0",
region="us-east-1",
temperature=0.7,
max_tokens=1024,
)

# Add llm to pipeline
pipeline = Pipeline(llm=llm)
note

When using a .env file for credentials, don't pass them as arguments to model instances. The SDK automatically reads environment variables, so omit aws_access_key_id, aws_secret_access_key, and other credential parameters from your code.

Inference Profiles (Cross-Region)

Many newer models — including most Anthropic Claude models — cannot be invoked on-demand by their bare model id and require a cross-region inference profile. If you use the plain model id you'll see an error like:

ValidationException: Invocation of model ID anthropic.claude-haiku-4-5-20251001-v1:0
with on-demand throughput isn't supported. Retry your request with the ID or ARN of an
inference profile that contains this model.

The fix is to prefix the model id with the region group that matches your region (us., eu., or apac.). The prefix is the only change:

from videosdk.agents.plugins import AWSBedrockLLM
from videosdk.agents import Pipeline

# Cross-region inference profile (note the "us." prefix) for us-east-1
llm = AWSBedrockLLM(
model="us.anthropic.claude-haiku-4-5-20251001-v1:0",
region="us-east-1",
temperature=0.7,
)

pipeline = Pipeline(llm=llm)

Examples for us-east-1:

Model idInvocation
amazon.nova-lite-v1:0On-demand OK
us.anthropic.claude-haiku-4-5-20251001-v1:0Inference profile
us.anthropic.claude-3-5-sonnet-20241022-v2:0Inference profile
us.meta.llama3-1-70b-instruct-v1:0Inference profile
tip

Match the prefix to your region: use us. for US regions, eu. for Europe, and apac. for Asia-Pacific. You can also pass a full inference profile ARN, or set the BEDROCK_INFERENCE_PROFILE_ARN environment variable and omit model. See the AWS inference profiles guide.

Configuration Options

Core

  • model — The Bedrock model id or inference profile ARN (e.g. "amazon.nova-lite-v1:0", "us.anthropic.claude-haiku-4-5-20251001-v1:0"). Falls back to the BEDROCK_INFERENCE_PROFILE_ARN environment variable. Default: "amazon.nova-lite-v1:0".
  • region — AWS region for Bedrock Runtime. Falls back to the AWS_DEFAULT_REGION environment variable. Default: "us-east-1".
  • temperature — Sampling temperature. Default: 0.7.
  • tool_choice — Tool selection mode: "auto", "required", "none", or a dict {"type": "function", "function": {"name": "my_tool"}} to force a specific tool. Default: "auto".
  • max_tokens — Maximum tokens to generate in the response (optional).

Credentials

  • aws_access_key_id — AWS access key ID. Falls back to the AWS_ACCESS_KEY_ID environment variable (optional).
  • aws_secret_access_key — AWS secret access key. Falls back to the AWS_SECRET_ACCESS_KEY environment variable (optional).
  • aws_session_token — Optional AWS session token for temporary credentials. Falls back to the AWS_SESSION_TOKEN environment variable (optional).
note

When none of the credential arguments are provided, the standard boto3 credential chain is used (environment variables, shared config/credentials files, attached IAM role, etc.).

Generation knobs

  • top_p — Nucleus sampling probability mass (float, optional).
  • top_k — Restricts sampling to the top-k most probable tokens. Sent via additionalModelRequestFields; model support varies (int, optional).
  • stop_sequences — List of sequences that stop generation (list of str, optional).
  • additional_request_fields — Extra fields merged into additionalModelRequestFields for model-specific parameters (dict, optional).

Prompt caching

  • cache_system — Append a prompt-cache checkpoint after the system prompt to reduce input token usage. Default: False.
  • cache_tools — Append a prompt-cache checkpoint after the tool definitions. Default: False.

Output handling

  • strip_thinking — Remove <thinking>...</thinking> spans from the streamed text before it is yielded. Amazon Nova models emit chain-of-thought in these tags, which would otherwise be read aloud by TTS. Default: True.

Client

  • client — Optional pre-built boto3 bedrock-runtime client. When provided, the credential and region arguments are ignored and the caller retains ownership of the client (optional).

Advanced Example

from videosdk.agents.plugins import AWSBedrockLLM
from videosdk.agents import Pipeline

llm = AWSBedrockLLM(
model="us.anthropic.claude-haiku-4-5-20251001-v1:0",
region="us-east-1",
temperature=0.7,
max_tokens=2048,
top_p=0.95,
top_k=40,
stop_sequences=["\n\nUser:"],
cache_system=True,
cache_tools=True,
)

pipeline = Pipeline(llm=llm)

Additional Resources

The following resources provide more information about using AWS Bedrock with VideoSDK Agents SDK.

Got a Question? Ask us on discord