AWS Nova Sonic

The AWS Nova Sonic provider enables your agent to use Amazon's Nova Sonic model for real-time, speech-to-speech AI interactions.

Prerequisites

Before you start using AWS Nova Sonic with the VideoSDK AI Agent, ensure the following:

  • AWS Account: You have an active AWS account with permissions to access Amazon Bedrock.
  • Model Access: You've requested and obtained access to the Amazon Nova models, including Nova Sonic, via the Amazon Bedrock console.
  • Region Selection: You're operating in the US East (N. Virginia) (us-east-1) region, as model access is region-specific.
  • AWS Credentials: Your AWS credentials (aws_access_key_id and aws_secret_access_key) are configured, either through environment variables or your preferred credential management method.
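
A quick pre-flight check can confirm that the credentials and region are visible to the process before the agent starts. The sketch below assumes the standard AWS environment variable names (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION). The helper name missing_aws_settings is hypothetical and not part of the VideoSDK SDK.

```python
import os
from typing import List, Mapping, Optional

# Standard AWS environment variable names. Whether the plugin itself reads
# AWS_DEFAULT_REGION is an assumption; the region can also be passed explicitly.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION")


def missing_aws_settings(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_aws_settings()
    if missing:
        print("Missing AWS settings:", ", ".join(missing))
    else:
        print("AWS credentials and region are set.")
```

Running this before starting the agent surfaces a missing variable early, rather than as an authentication error once the session is underway.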

Installation

Install the AWS-enabled VideoSDK Agents package:

pip install "videosdk-plugins-aws"

Importing

from videosdk.plugins.aws import NovaSonicRealtime, NovaSonicConfig

Example Usage

from videosdk.plugins.aws import NovaSonicRealtime, NovaSonicConfig
from videosdk.agents import RealTimePipeline

# Initialize the Nova Sonic real-time model
model = NovaSonicRealtime(
    model="amazon.nova-sonic-v1:0",
    region="us-east-1",  # Currently, only "us-east-1" is supported for Amazon Nova Sonic.
    # Omit the two credential parameters below when AWS credentials are set in .env.
    aws_access_key_id="YOUR_ACCESS_KEY",
    aws_secret_access_key="YOUR_SECRET_KEY",
    config=NovaSonicConfig(
        voice="tiffany",  # Options: "tiffany", "matthew", "amy"
        temperature=0.7,
        top_p=0.9,
        max_tokens=1024,
    ),
)

# Create the pipeline with the model
pipeline = RealTimePipeline(model=model)

note

When using a .env file for credentials, don't pass them as arguments to model instances or context objects. The SDK automatically reads environment variables, so omit aws_access_key_id, aws_secret_access_key, videosdk_auth, and other credential parameters from your code.
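
One way to follow this rule is to build the constructor arguments conditionally. This is a minimal sketch, assuming the credentials are exposed under the standard AWS variable names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. The helper nova_sonic_kwargs is hypothetical; only the keyword names (model, region, aws_access_key_id, aws_secret_access_key) come from this page.

```python
import os
from typing import Any, Dict, Mapping, Optional


def nova_sonic_kwargs(
    env: Optional[Mapping[str, str]] = None,
    access_key: Optional[str] = None,
    secret_key: Optional[str] = None,
) -> Dict[str, Any]:
    """Build constructor keyword arguments, omitting credentials found in the environment."""
    env = os.environ if env is None else env
    kwargs: Dict[str, Any] = {
        "model": "amazon.nova-sonic-v1:0",
        "region": "us-east-1",
    }
    # Assumed variable names; adjust if your setup stores credentials elsewhere.
    in_env = bool(env.get("AWS_ACCESS_KEY_ID")) and bool(env.get("AWS_SECRET_ACCESS_KEY"))
    if not in_env:
        if not (access_key and secret_key):
            raise ValueError("No AWS credentials in the environment or arguments")
        # Fall back to explicit credentials only when the environment has none.
        kwargs["aws_access_key_id"] = access_key
        kwargs["aws_secret_access_key"] = secret_key
    return kwargs
```

The result can then be unpacked into the constructor, for example NovaSonicRealtime(**nova_sonic_kwargs()), so the credential parameters appear only when the environment does not already provide them.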

note

To initiate a conversation with Amazon Nova Sonic, the user must speak first. The model listens for user input to begin the interaction.

Configuration Options

  • model: The Amazon Nova Sonic model to use (e.g., "amazon.nova-sonic-v1:0").
  • region: AWS region where the model is hosted (e.g., "us-east-1").
  • aws_access_key_id: Your AWS access key ID.
  • aws_secret_access_key: Your AWS secret access key.
  • config: A NovaSonicConfig object for advanced options:
    • voice: (str or None) The voice to use for audio output (e.g., "matthew", "tiffany", "amy").
    • temperature: (float or None) Sampling temperature for response randomness.
    • top_p: (float or None) Nucleus sampling probability.
    • max_tokens: (int or None) Maximum number of tokens in the output.
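
The options above can be sanity-checked before they are handed to NovaSonicConfig. The sketch below is illustrative only: check_nova_sonic_options is a hypothetical helper, the voice set is limited to the voices named on this page, and the numeric ranges follow general sampling conventions rather than documented Nova Sonic limits.

```python
from typing import List, Optional

# Voices listed in this document; the model may support others.
KNOWN_VOICES = {"tiffany", "matthew", "amy"}


def check_nova_sonic_options(
    voice: Optional[str] = None,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    max_tokens: Optional[int] = None,
) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems: List[str] = []
    if voice is not None and voice not in KNOWN_VOICES:
        problems.append(f"unknown voice: {voice!r}")
    # Assumed ranges (general sampling conventions, not documented limits).
    if temperature is not None and temperature < 0:
        problems.append("temperature must not be negative")
    if top_p is not None and not 0 < top_p <= 1:
        problems.append("top_p must be in (0, 1]")
    if max_tokens is not None and max_tokens <= 0:
        problems.append("max_tokens must be positive")
    return problems
```

Every option accepts None, so calling the helper with no arguments reports no problems, mirroring the optional fields in the list above.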
tip

Explore ready-made scripts for integrating AWS Nova Sonic with the VideoSDK AI Agent SDK: AWS Nova Sonic Example Script.
