Version: 0.0.x

Custom Audio Track | Python

In VideoSDK, you can ingest a custom audio track from a file into your meeting using a CustomAudioTrack.

This example demonstrates how to create a custom audio track from an MP4 file and integrate it into a VideoSDK meeting.

Setup

Before running the example, make sure you have the following values for VideoSDK authentication and meeting setup:

VIDEOSDK_TOKEN = "<VIDEOSDK_TOKEN>"
MEETING_ID = "<MEETING_ID>"
NAME = "<NAME>"
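
If you would rather not hard-code these values, one option is to read them from the process environment. The sketch below is a minimal illustration and not part of the VideoSDK API: the helper name `load_settings` is hypothetical, and it assumes the environment variable names match the three constants above.

```python
import os
from typing import Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the three values from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    required = ("VIDEOSDK_TOKEN", "MEETING_ID", "NAME")
    missing = [key for key in required if not env.get(key)]
    if missing:
        # Fail early with a clear message instead of joining with empty values.
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {key: env[key] for key in required}
```

Calling `load_settings()` with no argument reads from `os.environ`; passing a plain dictionary is convenient for testing.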

Custom Audio Track Class

The AudioFileTrack class extends CustomAudioTrack to handle audio stream ingestion from an MP4 file:

import time
import fractions
import asyncio
import av
from av import AudioFrame, AudioResampler
from vsaiortc.mediastreams import MediaStreamError
from videosdk import CustomAudioTrack, MeetingConfig, VideoSDK, Participant, MeetingEventHandler, ParticipantEventHandler

VIDEOSDK_TOKEN = "VIDEOSDK_TOKEN"
MEETING_ID = "MEETING_ID"
NAME = "NAME"
loop = asyncio.get_event_loop()

class AudioFileTrack(CustomAudioTrack):
    def __init__(self, file_path: str):
        super().__init__()
        self.file_path = file_path
        # av.open is called explicitly so the built-in open() is not shadowed
        self.container = av.open(file_path)
        self.stream = self.container.streams.audio[0]
        self.decoder = self.container.decode(self.stream)
        self._start = None
        self._timestamp = 0
        # Resample decoded frames to 16-bit mono at the rate used for pacing in recv()
        self.resampler = AudioResampler(format='s16', layout='mono', rate=90000)
        self.frame_buffer = []

    async def recv(self) -> AudioFrame:
        """
        Receive the next :class:`~av.audio.frame.AudioFrame` from the MP4 file.
        """
        # implementing in next step

AudioFileTrack.recv()

Purpose:

The recv method of the AudioFileTrack class is responsible for retrieving the next audio frame from the source (a file, a URL, a buffer, or any other input).

Key Components:

  • Frame Buffering: Frames are read from the file and stored in self.frame_buffer after resampling them to match the desired audio format ('s16' format, mono channel, specified sample rate).
  • Timestamp Synchronization: It ensures that audio frames are delivered at the correct timestamps to maintain audio-video synchronization.
  • Asynchronous Waiting: Uses asyncio.sleep() to synchronize the frame delivery based on calculated wait time (wait).
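
The wait-time calculation behind the last two points can be sketched in isolation. This is a minimal illustration under stated assumptions, not the SDK's implementation: the function name `compute_wait` and its parameters are hypothetical, and the clock value is passed in explicitly so the arithmetic is easy to check.

```python
def compute_wait(start: float, samples_sent: int, sample_rate: int, now: float) -> float:
    """Return how many seconds to sleep before delivering the current frame.

    start        -- wall-clock time (seconds) when the first frame was requested
    samples_sent -- running total of samples, including the current frame
    sample_rate  -- samples per second of the resampled audio
    now          -- current wall-clock time (seconds)
    """
    # The frame is due once all samples counted so far would have played out.
    due = start + samples_sent / sample_rate
    # A non-positive difference means we are already late, so do not wait.
    return max(0.0, due - now)
```

For example, with a sample rate of 90000 and 1800 samples counted so far, the frame is due 0.02 seconds after `start`; if 0.005 seconds have already elapsed, the result is about 0.015 seconds. Because the due time is always derived from `start` and the running sample total, small delays in one call do not accumulate across frames.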

Code Snippet:

    async def recv(self) -> AudioFrame:
        """
        Receive the next :class:`~av.audio.frame.AudioFrame` from the MP4 file.
        """
        if self.readyState != "live":
            raise MediaStreamError

        if self._start is None:
            self._start = time.time()

        # Frame Buffering: decode and resample until at least one frame is available
        # (the resampler may return no frames for a given input frame)
        while not self.frame_buffer:
            try:
                frame = next(self.decoder)
            except StopIteration:
                raise MediaStreamError("End of stream")
            frames = self.resampler.resample(frame)
            self.frame_buffer.extend(frames)

        frame = self.frame_buffer.pop(0)

        # Calculate the wait time to synchronize playback
        sample_rate = 90000  # must match the resampler rate set in __init__
        self._timestamp += frame.samples
        wait = self._start + (self._timestamp / sample_rate) - time.time()
        if wait > 0:
            await asyncio.sleep(wait)

        # Stamp the frame so its presentation time is pts * time_base seconds
        frame.pts = self._timestamp
        frame.time_base = fractions.Fraction(1, sample_rate)

        return frame
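
To make the relationship between `pts` and `time_base` in the snippet above concrete, here is a small sketch that converts a timestamp back to seconds. The helper name `pts_to_seconds` is hypothetical and exists only for illustration; it assumes, as in the snippet, that `pts` counts samples and `time_base` is one over the sample rate.

```python
import fractions


def pts_to_seconds(pts: int, time_base: fractions.Fraction) -> float:
    """Convert a presentation timestamp to seconds (hypothetical helper)."""
    # Fraction arithmetic keeps the product exact until the final conversion.
    return float(pts * time_base)


sample_rate = 90000  # same value as used in recv()
time_base = fractions.Fraction(1, sample_rate)

one_second = pts_to_seconds(90000, time_base)    # 90000 samples at 90000 Hz
half_second = pts_to_seconds(45000, time_base)   # half as many samples
```

Here `one_second` evaluates to 1.0 and `half_second` to 0.5, which is why incrementing the timestamp by `frame.samples` on every call advances the presentation time by exactly one frame duration.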

Event Handlers

Define custom event handlers to handle meeting and participant events:

class MyMeetingEventHandler(MeetingEventHandler):
    def on_meeting_joined(self, data):
        print("Meeting joined")

    def on_meeting_left(self, data):
        print("Meeting left")


class MyParticipantEventHandler(ParticipantEventHandler):
    def __init__(self, participant_id: str):
        super().__init__()
        self.participant_id = participant_id

    def on_stream_enabled(self, stream):
        print(f"Participant {self.participant_id} stream enabled: {stream.kind}")

    def on_stream_disabled(self, stream):
        print(f"Participant {self.participant_id} stream disabled: {stream.kind}")

Main Function

Configure and run the main function to initialize the meeting with the custom audio track:

def main():
    # Initialize custom audio track
    audio_track = AudioFileTrack("./example-video.mp4")

    # Configure meeting
    meeting_config = MeetingConfig(
        meeting_id=MEETING_ID,
        name=NAME,
        mic_enabled=True,
        webcam_enabled=False,
        token=VIDEOSDK_TOKEN,
        custom_microphone_audio_track=audio_track
    )

    # Initialize VideoSDK meeting
    meeting = VideoSDK.init_meeting(**meeting_config)

    # Add event listeners
    meeting.add_event_listener(MyMeetingEventHandler())
    meeting.local_participant.add_event_listener(MyParticipantEventHandler(participant_id=meeting.local_participant.id))

    # Join the meeting
    print("Joining the meeting...")
    meeting.join()

    print("Meeting setup complete")

if __name__ == '__main__':
    main()
    loop.run_forever()
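
The example keeps the process alive with `loop.run_forever()`, which runs until the loop is stopped or the process is interrupted. One possible way to stop cleanly on Ctrl+C is sketched below. The helper `run_until_stopped` and its `on_shutdown` callback are hypothetical names used for illustration; what cleanup the SDK actually needs at shutdown is not covered here, so treat the callback as a placeholder.

```python
import asyncio
from typing import Callable


def run_until_stopped(loop: asyncio.AbstractEventLoop,
                      on_shutdown: Callable[[], None]) -> None:
    """Run the loop until it is stopped or Ctrl+C is pressed, then clean up."""
    try:
        loop.run_forever()
    except KeyboardInterrupt:
        # Ctrl+C is an expected way to end the script, so it is not re-raised.
        pass
    finally:
        # Placeholder hook for any cleanup the application wants to perform.
        on_shutdown()
        loop.close()
```

In the script above, this would take the place of the bare `loop.run_forever()` call, for example `run_until_stopped(loop, lambda: print("Shutting down"))`.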

Stuck anywhere? Check out this example code on GitHub

Conclusion

This setup demonstrates how to integrate a custom audio track from a file into a VideoSDK meeting, with event handling for meeting and participant interactions.

API Reference

Refer to the following API documentation for methods and events used in this guide:
