Version: 0.0.x

Realtime Transcription - Python

Realtime transcription converts the audio of a session into text while the session is in progress. This guide walks you through using the start_transcription() and stop_transcription() methods to manage realtime transcription from your server.

VideoSDK also lets you configure a webhook for realtime transcription.

Integrating Realtime Transcription Feature

The realtime transcription flow has the following stages:

Start Transcription: The SDK client initiates realtime transcription by calling the start_transcription() method.
Resource Acquisition: The VideoSDK server requests the necessary resources from the transcription service.
- If the request is denied, the server sends a transcription-failed event to the SDK client.
- If the request succeeds, the server sends a transcription-started event to the client, indicating that transcription has begun.
Transcription Data: As transcription progresses, the client receives transcription-text events carrying data such as the text itself, the participant ID, and a timestamp.
Stop Transcription: When the client decides to stop transcription, it tells the VideoSDK server to release the resources.
- The server then sends a transcription-stopped event to confirm that transcription has ended and the resources have been released.
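The stages above can be sketched as a small client-side state tracker. This is an illustration only, not part of the VideoSDK SDK: the event names come from the list above, while the TranscriptionState values and the TranscriptionTracker class are hypothetical names introduced for this sketch.

```python
from enum import Enum


class TranscriptionState(Enum):
    # Hypothetical state names; the SDK may use different ones.
    IDLE = "idle"
    STARTING = "starting"
    ACTIVE = "active"
    FAILED = "failed"
    STOPPED = "stopped"


# Event name -> state the client ends up in after receiving it.
_EVENT_TO_STATE = {
    "transcription-started": TranscriptionState.ACTIVE,
    "transcription-failed": TranscriptionState.FAILED,
    "transcription-stopped": TranscriptionState.STOPPED,
}


class TranscriptionTracker:
    """Tracks the lifecycle described above from the events the client receives."""

    def __init__(self):
        self.state = TranscriptionState.IDLE
        self.texts = []

    def request_start(self):
        # The client has asked for transcription; resources are not confirmed yet.
        self.state = TranscriptionState.STARTING

    def handle_event(self, name, data=None):
        if name == "transcription-text":
            # Text is only recorded while transcription is active.
            if self.state is TranscriptionState.ACTIVE:
                self.texts.append(data)
        elif name in _EVENT_TO_STATE:
            self.state = _EVENT_TO_STATE[name]
        else:
            raise ValueError(f"unknown event: {name}")
        return self.state
```

A tracker like this makes it easy to ignore stray text events that arrive after transcription has failed or stopped.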

Step 1: Configure Realtime Transcription

In this step, we set up the configuration for realtime transcription by defining the URL at which the webhook requests will be received.

# Configuration for realtime transcription
webhook_url = "https://www.example.com"
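The webhook URL has to point at an endpoint that accepts HTTP requests. The following is a minimal sketch of such a receiver using only the Python standard library. It assumes that webhooks arrive as POST requests with a JSON object in the body; this guide does not describe the payload format, so treat that as an assumption. The names decode_webhook_body, TranscriptionWebhookHandler, and serve, as well as port 8080, are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def decode_webhook_body(raw: bytes):
    """Return the JSON object in a request body, or None if it is not a JSON object."""
    try:
        payload = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None
    return payload if isinstance(payload, dict) else None


class TranscriptionWebhookHandler(BaseHTTPRequestHandler):
    # ASSUMPTION: webhooks are delivered as POST requests with a JSON body.
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = decode_webhook_body(self.rfile.read(length))
        if payload is None:
            self.send_response(400)  # reject bodies we cannot parse
        else:
            print("webhook received:", payload)
            self.send_response(200)
        self.end_headers()


def serve(port: int = 8080):
    """Run the receiver until interrupted (the port is an arbitrary choice)."""
    HTTPServer(("0.0.0.0", port), TranscriptionWebhookHandler).serve_forever()
```

Calling serve() starts the receiver; the publicly reachable address of that server is what you would use as webhook_url.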

Step 2: Listen for the transcription events

Here, we configure the callback methods for transcription events.

from videosdk import MeetingEventHandler

class MyMeetingEventHandler(MeetingEventHandler):
    def __init__(self):
        super().__init__()

    # Called whenever the transcription state changes
    def on_transcription_state_changed(self, data):
        print("transcription state changed", data)

    # Called each time transcribed text is received
    def on_transcription_text(self, data):
        print("transcription text", data)
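Printing each event is enough for a first test, but in practice you will usually want to keep the text. Below is a hedged sketch of a collector whose on_transcription_text method could be called from the event handler. The payload keys "participantId", "text", and "timestamp" are assumptions based on the fields mentioned earlier in this guide; check the actual event data before relying on them. TranscriptCollector is a hypothetical helper, not an SDK class.

```python
from collections import defaultdict


class TranscriptCollector:
    """Accumulates transcription-text payloads per participant."""

    def __init__(self):
        self._lines = defaultdict(list)

    def on_transcription_text(self, data):
        # ASSUMPTION: data is a dict with "participantId", "text" and "timestamp" keys.
        participant = data.get("participantId", "unknown")
        text = (data.get("text") or "").strip()
        if text:  # skip empty fragments
            self._lines[participant].append((data.get("timestamp", 0), text))

    def transcript_for(self, participant):
        # Order fragments by timestamp and join them into one string.
        ordered = sorted(self._lines.get(participant, []), key=lambda item: item[0])
        return " ".join(text for _, text in ordered)
```

In an event handler, on_transcription_text(self, data) would simply forward data to a shared collector instance.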

Step 3: Start realtime transcription

Initiate the realtime transcription using the start_transcription() method.

from videosdk import SummaryConfig, TranscriptionConfig

transcription_config = TranscriptionConfig(
    webhook_url=webhook_url,
    summary=SummaryConfig(
        enabled=True,
        prompt="Write summary in sections like Title, Agenda, Speakers, Action Items, Outlines, Notes and Summary"
    )
)

# `meeting` is the object returned by VideoSDK.init_meeting()
meeting.start_transcription(transcription_config)
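If the summary sections vary between meetings, the prompt string can be assembled from a list instead of being hard-coded. The helper below is a small sketch; build_summary_prompt is a hypothetical name, and the wording of the prompt simply mirrors the one used in this guide.

```python
def build_summary_prompt(sections):
    """Build a summary prompt that lists the requested section names."""
    if not sections:
        raise ValueError("at least one section is required")
    if len(sections) == 1:
        listing = sections[0]
    else:
        # Join all but the last name with commas, then add "and <last>".
        listing = ", ".join(sections[:-1]) + " and " + sections[-1]
    return f"Write summary in sections like {listing}"
```

The returned string can be passed as the prompt argument of SummaryConfig.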

Step 4: Stop realtime transcription

Terminate the realtime transcription using the stop_transcription() method.

meeting.stop_transcription()
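Because transcription holds server-side resources until it is stopped, it can help to guarantee that stop_transcription() runs even when your own code raises an error. One way to do that, sketched here under the assumption that both methods are plain synchronous calls as shown in this guide, is a context manager. transcription_session is a hypothetical helper, not part of the SDK.

```python
from contextlib import contextmanager


@contextmanager
def transcription_session(meeting, config):
    """Start transcription on entry and always stop it on exit."""
    meeting.start_transcription(config)
    try:
        yield meeting
    finally:
        # Runs on normal exit and when the body raises.
        meeting.stop_transcription()
```

With this helper, the code that should run while transcription is active goes inside a `with transcription_session(meeting, transcription_config):` block.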

Example

The following Python snippet joins a meeting, starts realtime transcription after 5 seconds, and stops it 60 seconds later.

import asyncio
from videosdk import (
    MeetingConfig,
    VideoSDK,
    MeetingEventHandler,
    SummaryConfig,
    TranscriptionConfig
)

VIDEOSDK_TOKEN = "<VIDEOSDK_TOKEN>"
MEETING_ID = "<MEETING_ID>"
NAME = "<NAME>"
loop = asyncio.get_event_loop()

class MyMeetingEventHandler(MeetingEventHandler):
    def on_transcription_state_changed(self, data):
        print(f"===== transcription state changed -> {data} =====")

    def on_transcription_text(self, data):
        print(f"===== transcription text -> {data} =====")


async def main():
  meeting = VideoSDK.init_meeting(**MeetingConfig(
      meeting_id=MEETING_ID,
      name=NAME,
      mic_enabled=True,
      webcam_enabled=True,
      token=VIDEOSDK_TOKEN,
  ))
  meeting.add_event_listener(MyMeetingEventHandler())

  meeting.join()

  await asyncio.sleep(5)
  meeting.start_transcription(TranscriptionConfig(
    summary=SummaryConfig(
      enabled=True,
      prompt="Write summary in sections like Title, Agenda, Speakers, Action Items, Outlines, Notes and Summary"
    )
  ))

  await asyncio.sleep(60)
  meeting.stop_transcription()

if __name__ == "__main__":
  loop.run_until_complete(main())
  loop.run_forever()

Important: You can access a summary of your realtime transcription using the Fetch Realtime Transcription API.

API Reference

The API references for all the methods utilized in this guide are provided below.
