Custom Audio Track | Python
In VideoSDK, you can ingest a custom audio track from a file into your meeting using a CustomAudioTrack.
This example demonstrates how to create a custom audio track from an MP4 file and integrate it into a VideoSDK meeting.
Setup​
Before running the example, set the following values used for VideoSDK authentication (the example defines them as constants at the top of the script):
VIDEOSDK_TOKEN = "<VIDEOSDK_TOKEN>"
MEETING_ID = "<MEETING_ID>"
NAME = "<NAME>"
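If you would rather not hard-code these values, one option is to read them from real environment variables. The sketch below is a minimal illustration and not part of the VideoSDK API: the helper name `load_settings` is hypothetical, and it assumes the variables are named exactly as shown above.

```python
import os

# Names of the settings used throughout this guide.
REQUIRED_KEYS = ("VIDEOSDK_TOKEN", "MEETING_ID", "NAME")


def load_settings(env=None):
    """Return the three settings as a dict, raising if any is missing or empty.

    `load_settings` is a hypothetical helper, not part of the VideoSDK SDK.
    Pass a plain dict as `env` to test it without touching os.environ.
    """
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}
```

Failing early with the list of missing names tends to be easier to debug than a rejected token at join time.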
Custom Audio Track Class​
The AudioFileTrack class extends CustomAudioTrack to handle audio stream ingestion from an MP4 file:
import time
import fractions
import asyncio
from av import AudioFrame, open, AudioResampler
from vsaiortc.mediastreams import MediaStreamError
from videosdk import (
    CustomAudioTrack,
    MeetingConfig,
    VideoSDK,
    Participant,
    MeetingEventHandler,
    ParticipantEventHandler,
)

VIDEOSDK_TOKEN = "VIDEOSDK_TOKEN"
MEETING_ID = "MEETING_ID"
NAME = "NAME"

loop = asyncio.get_event_loop()


class AudioFileTrack(CustomAudioTrack):
    def __init__(self, file_path: str):
        super().__init__()
        self.file_path = file_path
        # Open the container and decode its first audio stream
        self.container = open(file_path)
        self.stream = self.container.streams.audio[0]
        self.decoder = self.container.decode(self.stream)
        # Wall-clock start time and running sample count used for pacing
        self._start = None
        self._timestamp = 0
        # Convert every decoded frame to 16-bit mono at a fixed rate
        self.resampler = AudioResampler(format='s16', layout='mono', rate=90000)
        self.frame_buffer = []

    async def recv(self) -> AudioFrame:
        """
        Receive the next :class:`~av.audio.frame.AudioFrame` from the MP4 file.
        """
        # implementing in next step
AudioFileTrack.recv()
Purpose:​
The recv method in the AudioFileTrack class is responsible for retrieving the next audio frame from the source (a file, URL, buffer, or any other input).
Key Components:​
- Frame Buffering: Frames are read from the file and stored in self.frame_buffer after resampling them to match the desired audio format ('s16' format, mono channel, specified sample rate).
- Timestamp Synchronization: It ensures that audio frames are delivered at the correct timestamps to maintain audio-video synchronization.
- Asynchronous Waiting: Uses asyncio.sleep() to synchronize the frame delivery based on the calculated wait time (wait).
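The pacing arithmetic behind these components can be tried in isolation, without PyAV or a meeting. The sketch below is illustrative only: `FramePacer` is a hypothetical helper that mirrors the timestamp logic described above, and it takes the current time as an argument so the behaviour can be checked deterministically.

```python
import fractions


class FramePacer:
    """Tracks a running sample count and derives per-frame wait times.

    Hypothetical helper for illustration; not part of VideoSDK or PyAV.
    """

    def __init__(self, sample_rate, start):
        self.sample_rate = sample_rate
        self.start = start    # wall-clock time of the first recv() call
        self.timestamp = 0    # total samples scheduled so far
        self.time_base = fractions.Fraction(1, sample_rate)

    def next_wait(self, samples, now):
        """Advance by `samples` and return (pts, seconds to sleep)."""
        self.timestamp += samples
        # The frame is due once all samples scheduled so far have "played"
        due = self.start + self.timestamp / self.sample_rate
        # Never return a negative wait: a late frame is delivered immediately
        return self.timestamp, max(0.0, due - now)
```

For instance, with a sample rate of 90000 and a 900-sample frame, the first call returns a wait of roughly 0.01 seconds if no time has passed since the start. If the caller is already behind schedule, the wait is 0 and the frame is returned without sleeping.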
Code Snippet:​
    async def recv(self) -> AudioFrame:
        """
        Receive the next :class:`~av.audio.frame.AudioFrame` from the MP4 file.
        """
        if self.readyState != "live":
            raise MediaStreamError

        if self._start is None:
            self._start = time.time()

        # Frame Buffering: decode and resample one frame when the buffer is empty
        if not self.frame_buffer:
            try:
                frame = next(self.decoder)
                frames = self.resampler.resample(frame)
                self.frame_buffer.extend(frames)
            except StopIteration:
                raise MediaStreamError("End of stream")

        frame = self.frame_buffer.pop(0)

        # Calculate the wait time to synchronize playback.
        # sample_rate must match the rate passed to AudioResampler in __init__.
        sample_rate = 90000
        self._timestamp += frame.samples
        wait = self._start + (self._timestamp / sample_rate) - time.time()
        if wait > 0:
            await asyncio.sleep(wait)

        # Stamp the frame so the receiver can order and schedule it
        frame.pts = self._timestamp
        frame.time_base = fractions.Fraction(1, sample_rate)
        return frame
Event Handlers​
Define custom event handlers to handle meeting and participant events:
class MyMeetingEventHandler(MeetingEventHandler):
    def on_meeting_joined(self, data):
        print("Meeting joined")

    def on_meeting_left(self, data):
        print("Meeting left")


class MyParticipantEventHandler(ParticipantEventHandler):
    def __init__(self, participant_id: str):
        super().__init__()
        self.participant_id = participant_id

    def on_stream_enabled(self, stream):
        print(f"Participant {self.participant_id} stream enabled: {stream.kind}")

    def on_stream_disabled(self, stream):
        print(f"Participant {self.participant_id} stream disabled: {stream.kind}")
- You can learn more about the MeetingEventHandler and ParticipantEventHandler classes in the upcoming guide.
Main Function​
Configure and run the main function to initialize the meeting with the custom audio track:
def main():
    # Initialize custom audio track
    audio_track = AudioFileTrack("./example-video.mp4")

    # Configure meeting
    meeting_config = MeetingConfig(
        meeting_id=MEETING_ID,
        name=NAME,
        mic_enabled=True,
        webcam_enabled=False,
        token=VIDEOSDK_TOKEN,
        custom_microphone_audio_track=audio_track
    )

    # Initialize VideoSDK meeting
    meeting = VideoSDK.init_meeting(**meeting_config)

    # Add event listeners
    meeting.add_event_listener(MyMeetingEventHandler())
    meeting.local_participant.add_event_listener(
        MyParticipantEventHandler(participant_id=meeting.local_participant.id)
    )

    # Join the meeting
    print("Joining the meeting...")
    meeting.join()
    print("Meeting setup complete")


if __name__ == '__main__':
    main()
    loop.run_forever()
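Because the script keeps the event loop running indefinitely, it only ends when the process is interrupted. One possible way to stop cleanly on Ctrl+C is sketched below. This is an assumption about how you might structure shutdown, not something the SDK requires: `run_until_interrupted` is a hypothetical helper, and `on_shutdown` stands for whatever zero-argument cleanup your application needs.

```python
import asyncio


def run_until_interrupted(loop, on_shutdown=None):
    """Run `loop` until it is stopped or Ctrl+C is pressed, then clean up.

    Hypothetical helper; `on_shutdown` is any zero-argument callable.
    """
    try:
        loop.run_forever()
    except KeyboardInterrupt:
        print("Interrupted, shutting down...")
    finally:
        # Runs whether the loop was stopped normally or interrupted
        if on_shutdown is not None:
            on_shutdown()
```

In the script above, a call to this helper could take the place of the bare loop.run_forever() line.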
Stuck anywhere? Check out this example code on GitHub
Conclusion​
This setup demonstrates how to integrate a custom audio track from a file into a VideoSDK meeting, with event handling for meeting and participant interactions.
API Reference​
Refer to the following API documentation for methods and events used in this guide: