Background Audio

The Background Audio feature lets voice agents play audio during conversations, such as ambient sounds or processing feedback. It provides two kinds of audio:

Thinking Audio: Plays automatically during agent processing (e.g., keyboard typing sounds)
Background Audio: Plays on-demand for ambient music or soundscapes

Getting Started

Enable Background Audio

from videosdk.agents import RoomOptions, JobContext  
  
room_options = RoomOptions(  
    room_id="your-room-id",  
    name="My Agent",  
    background_audio=True  # Enable background audio support  
)  
  
context = JobContext(room_options=room_options)

Agent Methods

1. Set Thinking Audio

set_thinking_audio(): Configures audio that plays automatically while the agent processes responses.

Parameters:

file (str, optional): Path to custom WAV audio file. If not provided, uses built-in agent_keyboard.wav
volume (float, optional): Volume of the audio. Default: 0.3

Example:

from videosdk.agents import Agent

class MyAgent(Agent):
    def __init__(self):
        super().__init__(instructions="...")
        # Use the default keyboard sound
        self.set_thinking_audio()
        # Or use custom audio
        # self.set_thinking_audio(file="path/to/custom.wav")
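If you do not have a custom sound at hand, a short test tone is enough to try the `file` parameter. The sketch below uses only the Python standard library; the helper name `write_tone_wav` and its defaults are illustrative and not part of the SDK. It writes a mono, 16-bit PCM file at 16 kHz.

```python
import math
import struct
import wave


def write_tone_wav(path, freq_hz=440.0, duration_s=0.5,
                   sample_rate=16000, amplitude=0.2):
    """Write a mono, 16-bit PCM sine tone to `path` and return the frame count.

    Hypothetical helper for testing; not part of videosdk.agents.
    """
    n_frames = int(duration_s * sample_rate)
    peak = int(amplitude * 32767)  # scale into the signed 16-bit range
    frames = bytearray()
    for i in range(n_frames):
        sample = int(peak * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        frames += struct.pack("<h", sample)  # little-endian signed 16-bit
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return n_frames
```

After calling `write_tone_wav("custom_thinking.wav")`, the resulting path can be passed as `self.set_thinking_audio(file="custom_thinking.wav")`.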

2. Play Background Audio

play_background_audio(): Starts playing background audio during the conversation.

Parameters:

file (str, optional): Path to custom WAV audio file. If not provided, uses built-in classical.wav
looping (bool, optional): Whether to loop the audio. Default: False
override_thinking (bool, optional): Whether to stop thinking audio when background audio starts. Default: True
volume (float, optional): Volume of the audio. Default: 1.0

Example:

# Define this method inside your Agent subclass
@function_tool
async def play_music(self):
    """Plays background music"""
    await self.play_background_audio(
        looping=True,
        override_thinking=False
    )
    return "Music started"

3. Stop Background Audio

stop_background_audio(): Stops currently playing background audio.

Example:

# Define this method inside your Agent subclass
@function_tool
async def stop_music(self):
    """Stops background music"""
    await self.stop_background_audio()
    return "Music stopped"

Complete Example

main.py
from videosdk.agents import (  
    Agent, AgentSession, CascadingPipeline,   
    WorkerJob, ConversationFlow, JobContext,   
    RoomOptions, function_tool  
)  
from videosdk.plugins.openai import OpenAILLM, OpenAITTS  
from videosdk.plugins.deepgram import DeepgramSTT  
from videosdk.plugins.silero import SileroVAD  
from videosdk.plugins.turn_detector import TurnDetector  
  
class MusicAgent(Agent):  
    def __init__(self):  
        super().__init__(  
            instructions="You are a helpful assistant. Use control_music to play or stop background music."  
        )  
        # Enable thinking audio with default keyboard sound  
        self.set_thinking_audio()  
      
    async def on_enter(self):  
        await self.session.say("Hello! Ask me to play music.")
    
    async def on_exit(self):
        await self.session.say("Goodbye! Hope you enjoyed the music.")
      
    @function_tool  
    async def control_music(self, action: str):  
        """  
        Controls background music.  
        :param action: 'play' to start music, 'stop' to end it  
        """  
        if action == "play":  
            await self.play_background_audio(  
                override_thinking=True,  
                looping=True  
            )  
            return "Music started"  
        elif action == "stop":  
            await self.stop_background_audio()  
            return "Music stopped"  
        return "Invalid action"  
  
async def entrypoint(ctx: JobContext):  
    agent = MusicAgent()  
      
    pipeline = CascadingPipeline(  
        stt=DeepgramSTT(),  
        llm=OpenAILLM(),  
        tts=OpenAITTS(),  
        vad=SileroVAD(),  
        turn_detector=TurnDetector()  
    )  
      
    session = AgentSession(  
        agent=agent,  
        pipeline=pipeline,  
        conversation_flow=ConversationFlow(agent)  
    )  
      
    await ctx.run_until_shutdown(session=session)  
  
def make_context():  
    return JobContext(  
        room_options=RoomOptions(  
            room_id="<room_id>",  
            name="Music Agent",  
            background_audio=True  # Required!  
        )  
    )  
  
if __name__ == "__main__":  
    job = WorkerJob(entrypoint=entrypoint, jobctx=make_context)  
    job.start()
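Note that `control_music` compares `action` against exact lowercase strings, so a model-supplied value such as "Play" or " stop " would fall through to "Invalid action". One possible way to make the tool more forgiving is to normalize the input first. The sketch below is an assumption layered on top of the example, not SDK behavior; `MusicAction` and `parse_action` are hypothetical names.

```python
from enum import Enum
from typing import Optional


class MusicAction(Enum):
    """The two actions the example tool understands (hypothetical helper)."""
    PLAY = "play"
    STOP = "stop"


def parse_action(raw: str) -> Optional[MusicAction]:
    """Map free-form input to a MusicAction, or None if unrecognized."""
    cleaned = raw.strip().lower()
    try:
        return MusicAction(cleaned)
    except ValueError:
        return None
```

Inside `control_music`, the tool could call `parse_action(action)` and branch on `MusicAction.PLAY` or `MusicAction.STOP`, returning "Invalid action" when the result is `None`.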

Pipeline Support

Background audio works with both pipeline types:

Cascading Pipeline

Thinking audio plays automatically during LLM processing
Background audio can be controlled via agent methods
Audio stops automatically when agent speaks

RealTime Pipeline

Full background audio support with streaming models
Automatic lifecycle management during conversation turns

Audio Behavior

Feature	Thinking Audio	Background Audio
Trigger	Automatic during processing	Manual via `play_background_audio()`
Default File	`agent_keyboard.wav`	`classical.wav`
Typical Duration	Short (during LLM call)	Long/continuous
Looping	Optional	Recommended (`looping=True`)
User Control	No	Yes (via function tools)
Stops When	Agent speaks	Agent speaks or `stop_background_audio()`

Audio File Requirements

Format: WAV (.wav)
Recommended: 16-bit PCM, 16 kHz sample rate, mono channel
Built-in files:
- agent_keyboard.wav: Default thinking sound
- classical.wav: Default background music
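Before passing a custom file to the agent, it can help to check it against the recommended format. The following is a minimal sketch using Python's standard `wave` module; `check_wav` is a hypothetical helper, and its defaults simply mirror the recommendation above rather than any limit enforced by the SDK.

```python
import wave


def check_wav(path, sample_rate=16000, channels=1, sample_width=2):
    """Return a list of mismatches against the recommended format.

    An empty list means the file matches. Hypothetical helper, not part
    of videosdk.agents.
    """
    problems = []
    try:
        with wave.open(path, "rb") as wav:
            if wav.getsampwidth() != sample_width:
                problems.append(
                    f"expected {8 * sample_width}-bit samples, "
                    f"found {8 * wav.getsampwidth()}-bit"
                )
            if wav.getframerate() != sample_rate:
                problems.append(
                    f"expected {sample_rate} Hz, found {wav.getframerate()} Hz"
                )
            if wav.getnchannels() != channels:
                problems.append(
                    f"expected {channels} channel(s), found {wav.getnchannels()}"
                )
    except (wave.Error, EOFError) as exc:
        # The wave module only reads uncompressed PCM WAV files
        problems.append(f"not a readable PCM WAV file: {exc}")
    return problems
```

For example, `check_wav("path/to/custom.wav")` returns an empty list for a 16-bit, 16 kHz, mono file, and one message per mismatch otherwise.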

Best Practices

Always enable in RoomOptions: Set background_audio=True before using audio methods
Keep override_thinking=True (the default): This stops thinking audio when music starts, avoiding overlapping sounds
Loop background audio: Set looping=True for continuous ambient sounds
Control via function tools: Let users control music through natural language
Clean audio files: Use high-quality WAV files to avoid distortion

Common Use Cases

Music player agent: Control playback through conversation
Ambient soundscapes: Create atmosphere during interactions
Processing feedback: Custom thinking sounds for different agent personalities
Hold music: Play audio while agent performs long operations

Example - Try It Yourself

Background Audio example

Implement and experience the background audio functionality yourself

FAQs

Troubleshooting

Issue	Solution
Audio not playing	Verify `background_audio=True` in `RoomOptions`
Audio quality issues	Use WAV format with 16-bit PCM encoding
Audio doesn't stop	Ensure `stop_background_audio()` is actually called and awaited
Overlapping sounds	Use `override_thinking=True` when playing background audio
