Version: 1.0.x

AWS Bedrock LLM

The AWS Bedrock LLM provider enables your agent to use any Bedrock-hosted model (Amazon Nova, Anthropic Claude, Meta Llama, Mistral, and more) for text-based conversations and processing through the unified Bedrock Converse API. It also supports vision input, allowing your agent to analyze and respond to images alongside text with the supported models.

Installation

Install the AWS-enabled VideoSDK Agents package:

pip install "videosdk-plugins-aws"

Importing

from videosdk.agents.plugins import AWSBedrockLLM

Authentication

AWS Account: You have an active AWS account with permissions to access Amazon Bedrock and the model you intend to invoke.
Model Access: You've requested and been granted access to the target model in the Bedrock console under Model access, in the region you're using.
Region Selection: You're operating in a region where the model is available (e.g. US East (N. Virginia) us-east-1), as model access is region-specific.
AWS Credentials: Your AWS credentials (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION) are configured, either through environment variables or your preferred credential management method (IAM role, shared profile, etc.).

Example Usage

from videosdk.agents.plugins import AWSBedrockLLM
from videosdk.agents import Pipeline

# Initialize the AWS Bedrock LLM model
llm = AWSBedrockLLM(
    model="amazon.nova-lite-v1:0",
    region="us-east-1",
    temperature=0.7,
    max_tokens=1024,
)

# Add llm to pipeline
pipeline = Pipeline(llm=llm)

note

When using a .env file for credentials, don't pass them as arguments to model instances. The SDK automatically reads environment variables, so omit aws_access_key_id, aws_secret_access_key, and other credential parameters from your code.

Inference Profiles (Cross-Region)

Many newer models — including most Anthropic Claude models — cannot be invoked on-demand by their bare model id and require a cross-region inference profile. If you use the plain model id you'll see an error like:

ValidationException: Invocation of model ID anthropic.claude-haiku-4-5-20251001-v1:0
with on-demand throughput isn't supported. Retry your request with the ID or ARN of an
inference profile that contains this model.

The fix is to prefix the model id with the region group that matches your region (us., eu., or apac.). The prefix is the only change:

from videosdk.agents.plugins import AWSBedrockLLM
from videosdk.agents import Pipeline

# Cross-region inference profile (note the "us." prefix) for us-east-1
llm = AWSBedrockLLM(
    model="us.anthropic.claude-haiku-4-5-20251001-v1:0",
    region="us-east-1",
    temperature=0.7,
)

pipeline = Pipeline(llm=llm)

Examples for us-east-1:

Model id	Invocation
`amazon.nova-lite-v1:0`	On-demand OK
`us.anthropic.claude-haiku-4-5-20251001-v1:0`	Inference profile
`us.anthropic.claude-3-5-sonnet-20241022-v2:0`	Inference profile
`us.meta.llama3-1-70b-instruct-v1:0`	Inference profile

tip

Match the prefix to your region: use us. for US regions, eu. for Europe, and apac. for Asia-Pacific. You can also pass a full inference profile ARN, or set the BEDROCK_INFERENCE_PROFILE_ARN environment variable and omit model. See the AWS inference profiles guide.

Configuration Options

Core

model — The Bedrock model id or inference profile ARN (e.g. "amazon.nova-lite-v1:0", "us.anthropic.claude-haiku-4-5-20251001-v1:0"). Falls back to the BEDROCK_INFERENCE_PROFILE_ARN environment variable. Default: "amazon.nova-lite-v1:0".
region — AWS region for Bedrock Runtime. Falls back to the AWS_DEFAULT_REGION environment variable. Default: "us-east-1".
temperature — Sampling temperature. Default: 0.7.
tool_choice — Tool selection mode: "auto", "required", "none", or a dict {"type": "function", "function": {"name": "my_tool"}} to force a specific tool. Default: "auto".
max_tokens — Maximum tokens to generate in the response (optional).

Credentials

aws_access_key_id — AWS access key ID. Falls back to the AWS_ACCESS_KEY_ID environment variable (optional).
aws_secret_access_key — AWS secret access key. Falls back to the AWS_SECRET_ACCESS_KEY environment variable (optional).
aws_session_token — Optional AWS session token for temporary credentials. Falls back to the AWS_SESSION_TOKEN environment variable (optional).

note

When none of the credential arguments are provided, the standard boto3 credential chain is used (environment variables, shared config/credentials files, attached IAM role, etc.).

Generation knobs

top_p — Nucleus sampling probability mass (float, optional).
top_k — Restricts sampling to the top-k most probable tokens. Sent via additionalModelRequestFields; model support varies (int, optional).
stop_sequences — List of sequences that stop generation (list of str, optional).
additional_request_fields — Extra fields merged into additionalModelRequestFields for model-specific parameters (dict, optional).

Prompt caching

cache_system — Append a prompt-cache checkpoint after the system prompt to reduce input token usage. Default: False.
cache_tools — Append a prompt-cache checkpoint after the tool definitions. Default: False.

Output handling

strip_thinking — Remove <thinking>...</thinking> spans from the streamed text before it is yielded. Amazon Nova models emit chain-of-thought in these tags, which would otherwise be read aloud by TTS. Default: True.

Client

client — Optional pre-built boto3 bedrock-runtime client. When provided, the credential and region arguments are ignored and the caller retains ownership of the client (optional).

Advanced Example

from videosdk.agents.plugins import AWSBedrockLLM
from videosdk.agents import Pipeline

llm = AWSBedrockLLM(
    model="us.anthropic.claude-haiku-4-5-20251001-v1:0",
    region="us-east-1",
    temperature=0.7,
    max_tokens=2048,
    top_p=0.95,
    top_k=40,
    stop_sequences=["\n\nUser:"],
    cache_system=True,
    cache_tools=True,
)

pipeline = Pipeline(llm=llm)

Additional Resources

The following resources provide more information about using AWS Bedrock with VideoSDK Agents SDK.

AWS Bedrock docs: AWS Bedrock documentation.
Converse API reference: Bedrock ConverseStream API reference.
Inference profiles: Using cross-region inference profiles.

SDK Reference

GitHub Repository

Python Package

Got a Question? Ask us on discord

Installation​

Importing​

Authentication​

Example Usage​

Inference Profiles (Cross-Region)​

Configuration Options​

Core​

Credentials​

Generation knobs​

Prompt caching​

Output handling​

Client​

Advanced Example​

Additional Resources​