AWS Bedrock LLM
The AWS Bedrock LLM provider enables your agent to use any Bedrock-hosted model (Amazon Nova, Anthropic Claude, Meta Llama, Mistral, and more) for text-based conversations and processing through the unified Bedrock Converse API. It also supports vision input, allowing your agent to analyze and respond to images alongside text with the supported models.
Installation
Install the AWS-enabled VideoSDK Agents package:
pip install "videosdk-plugins-aws"
Importing
from videosdk.agents.plugins import AWSBedrockLLM
Authentication
AWS Account: You have an active AWS account with permissions to access Amazon Bedrock and the model you intend to invoke.Model Access: You've requested and been granted access to the target model in the Bedrock console under Model access, in the region you're using.Region Selection: You're operating in a region where the model is available (e.g. US East (N. Virginia)us-east-1), as model access is region-specific.AWS Credentials: Your AWS credentials (AWS_ACCESS_KEY_ID,AWS_SECRET_ACCESS_KEY,AWS_DEFAULT_REGION) are configured, either through environment variables or your preferred credential management method (IAM role, shared profile, etc.).
Example Usage
from videosdk.agents.plugins import AWSBedrockLLM
from videosdk.agents import Pipeline
# Initialize the AWS Bedrock LLM model
llm = AWSBedrockLLM(
model="amazon.nova-lite-v1:0",
region="us-east-1",
temperature=0.7,
max_tokens=1024,
)
# Add llm to pipeline
pipeline = Pipeline(llm=llm)
When using a .env file for credentials, don't pass them as arguments to model instances. The SDK automatically reads environment variables, so omit aws_access_key_id, aws_secret_access_key, and other credential parameters from your code.
Inference Profiles (Cross-Region)
Many newer models — including most Anthropic Claude models — cannot be invoked on-demand by their bare model id and require a cross-region inference profile. If you use the plain model id you'll see an error like:
ValidationException: Invocation of model ID anthropic.claude-haiku-4-5-20251001-v1:0
with on-demand throughput isn't supported. Retry your request with the ID or ARN of an
inference profile that contains this model.
The fix is to prefix the model id with the region group that matches your region (us., eu., or apac.). The prefix is the only change:
from videosdk.agents.plugins import AWSBedrockLLM
from videosdk.agents import Pipeline
# Cross-region inference profile (note the "us." prefix) for us-east-1
llm = AWSBedrockLLM(
model="us.anthropic.claude-haiku-4-5-20251001-v1:0",
region="us-east-1",
temperature=0.7,
)
pipeline = Pipeline(llm=llm)
Examples for us-east-1:
| Model id | Invocation |
|---|---|
amazon.nova-lite-v1:0 | On-demand OK |
us.anthropic.claude-haiku-4-5-20251001-v1:0 | Inference profile |
us.anthropic.claude-3-5-sonnet-20241022-v2:0 | Inference profile |
us.meta.llama3-1-70b-instruct-v1:0 | Inference profile |
Match the prefix to your region: use us. for US regions, eu. for Europe, and apac. for Asia-Pacific. You can also pass a full inference profile ARN, or set the BEDROCK_INFERENCE_PROFILE_ARN environment variable and omit model. See the AWS inference profiles guide.
Configuration Options
Core
model— The Bedrock model id or inference profile ARN (e.g."amazon.nova-lite-v1:0","us.anthropic.claude-haiku-4-5-20251001-v1:0"). Falls back to theBEDROCK_INFERENCE_PROFILE_ARNenvironment variable. Default:"amazon.nova-lite-v1:0".region— AWS region for Bedrock Runtime. Falls back to theAWS_DEFAULT_REGIONenvironment variable. Default:"us-east-1".temperature— Sampling temperature. Default:0.7.tool_choice— Tool selection mode:"auto","required","none", or a dict{"type": "function", "function": {"name": "my_tool"}}to force a specific tool. Default:"auto".max_tokens— Maximum tokens to generate in the response (optional).
Credentials
aws_access_key_id— AWS access key ID. Falls back to theAWS_ACCESS_KEY_IDenvironment variable (optional).aws_secret_access_key— AWS secret access key. Falls back to theAWS_SECRET_ACCESS_KEYenvironment variable (optional).aws_session_token— Optional AWS session token for temporary credentials. Falls back to theAWS_SESSION_TOKENenvironment variable (optional).
When none of the credential arguments are provided, the standard boto3 credential chain is used (environment variables, shared config/credentials files, attached IAM role, etc.).
Generation knobs
top_p— Nucleus sampling probability mass (float, optional).top_k— Restricts sampling to the top-k most probable tokens. Sent viaadditionalModelRequestFields; model support varies (int, optional).stop_sequences— List of sequences that stop generation (list of str, optional).additional_request_fields— Extra fields merged intoadditionalModelRequestFieldsfor model-specific parameters (dict, optional).
Prompt caching
cache_system— Append a prompt-cache checkpoint after the system prompt to reduce input token usage. Default:False.cache_tools— Append a prompt-cache checkpoint after the tool definitions. Default:False.
Output handling
strip_thinking— Remove<thinking>...</thinking>spans from the streamed text before it is yielded. Amazon Nova models emit chain-of-thought in these tags, which would otherwise be read aloud by TTS. Default:True.
Client
client— Optional pre-built boto3bedrock-runtimeclient. When provided, the credential and region arguments are ignored and the caller retains ownership of the client (optional).
Advanced Example
from videosdk.agents.plugins import AWSBedrockLLM
from videosdk.agents import Pipeline
llm = AWSBedrockLLM(
model="us.anthropic.claude-haiku-4-5-20251001-v1:0",
region="us-east-1",
temperature=0.7,
max_tokens=2048,
top_p=0.95,
top_k=40,
stop_sequences=["\n\nUser:"],
cache_system=True,
cache_tools=True,
)
pipeline = Pipeline(llm=llm)
Additional Resources
The following resources provide more information about using AWS Bedrock with VideoSDK Agents SDK.
- AWS Bedrock docs: AWS Bedrock documentation.
- Converse API reference: Bedrock
ConverseStreamAPI reference. - Inference profiles: Using cross-region inference profiles.
Got a Question? Ask us on discord

