Version: 0.0.x

Realtime Transcription - JavaScript

Realtime transcription converts the audio in a session into text while the session is in progress. This guide shows how to use the startTranscription() and stopTranscription() methods to manage realtime transcription in your application.

You can also configure a webhook URL so that VideoSDK delivers realtime transcription webhooks to an endpoint you choose.

Integrating Realtime Transcription Feature

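[Figure: realtime transcription flow between the SDK Client, the VideoSDK server, and the transcription service]

The figure shows the following flow: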

  1. Start Transcription: The SDK Client initiates real-time transcription using the startTranscription method.

  2. Resource Acquisition: The VideoSDK server requests the necessary resources from the transcription service.

    • If the request is denied, the server sends a transcription-failed event to the SDK Client.
    • If the request is successful, the server sends a transcription-started event to the client, indicating that transcription has begun.
  3. Transcription Data: As transcription progresses, the client receives transcription-text events with data such as the text itself, the participant ID, and a timestamp.

  4. Stop Transcription: When the client decides to stop transcription, it informs the VideoSDK server to release resources.

    • The server then sends a transcription-stopped event to confirm that transcription has ended and resources are released.
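
The flow above can be read as a small state machine. The following is a minimal sketch of that reading, not SDK code: the event names come from the list above, while the state names ("idle", "requested", "active", "stopping") and the nextTranscriptionState helper are illustrative assumptions.

```javascript
// Hypothetical lifecycle tracker. Event names are taken from the flow above;
// the state names and the "start"/"stop" client actions are assumptions.
const TRANSITIONS = {
  idle: { start: "requested" },
  requested: {
    "transcription-started": "active",
    "transcription-failed": "idle",
  },
  active: {
    "transcription-text": "active",
    stop: "stopping",
  },
  stopping: { "transcription-stopped": "idle" },
};

// Returns the next state, or the current state when the event is not
// expected in that state (unexpected events are ignored, not thrown).
function nextTranscriptionState(state, event) {
  const allowed = TRANSITIONS[state] || {};
  return Object.prototype.hasOwnProperty.call(allowed, event)
    ? allowed[event]
    : state;
}

// Walk through the successful path described in the list above.
const happyPath = [
  "start",
  "transcription-started",
  "transcription-text",
  "stop",
  "transcription-stopped",
];
const visited = happyPath.reduce(
  (states, event) => [
    ...states,
    nextTranscriptionState(states[states.length - 1], event),
  ],
  ["idle"]
);
console.log(visited.join(" -> "));
```

Modelling the flow this way makes it explicit that a denied request returns the client to its initial state, and that text events only matter while transcription is active.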

Step 1: Configure Realtime Transcription

  • In this step, we set up the configuration for realtime transcription: the webhook URL where the webhooks will be received and, optionally, summary generation with a custom prompt.
// Realtime Transcription Configuration
const config = {
  // URL where the transcription webhooks will be received
  webhookUrl: "https://www.example.com",
  // Summary generation settings
  summary: {
    enabled: true,
    prompt: "Write summary in sections like Title, Agenda, Speakers, Action Items, Outlines, Notes and Summary",
  },
};
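
Because a malformed configuration object is easy to write by hand, it can help to check it before starting transcription. The sketch below is a hypothetical helper, not part of the SDK. It assumes that webhookUrl must be an absolute http(s) URL and that summary.prompt should be a string when summary.enabled is true; neither rule is confirmed by this guide.

```javascript
// Hypothetical helper: checks the config shape used in this guide.
// The validation rules are assumptions, not documented SDK behaviour.
function validateTranscriptionConfig(config) {
  const errors = [];

  // webhookUrl must parse as an absolute http(s) URL.
  try {
    const url = new URL(config.webhookUrl);
    if (url.protocol !== "https:" && url.protocol !== "http:") {
      errors.push("webhookUrl must use http or https");
    }
  } catch {
    errors.push("webhookUrl is missing or not a valid URL");
  }

  // summary is treated as optional; when present, check both fields.
  if (config.summary != null) {
    if (typeof config.summary.enabled !== "boolean") {
      errors.push("summary.enabled must be a boolean");
    }
    if (config.summary.enabled && typeof config.summary.prompt !== "string") {
      errors.push("summary.prompt must be a string when summary is enabled");
    }
  }
  return errors;
}

// Example: the configuration from this step produces no errors.
const sampleConfig = {
  webhookUrl: "https://www.example.com",
  summary: { enabled: true, prompt: "Write summary in sections" },
};
console.log(validateTranscriptionConfig(sampleConfig));
```

Returning a list of messages rather than throwing lets the caller decide whether to block the start call or only log a warning.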

Step 2: Listen for the transcription events

  • Here, we configure the callback methods for transcription events.
import { VideoSDK } from "@videosdk.live/js-sdk";

const Constants = VideoSDK.Constants;

// Listen for transcription state changed event
meeting.on("transcription-state-changed", (data) => {
  const { status, id } = data;

  if (status === Constants.transcriptionEvents.TRANSCRIPTION_STARTING) {
    console.log(`Realtime Transcription with ${id} is starting`);
  } else if (status === Constants.transcriptionEvents.TRANSCRIPTION_STARTED) {
    console.log(`Realtime Transcription with ${id} is started`);
  } else if (status === Constants.transcriptionEvents.TRANSCRIPTION_STOPPING) {
    console.log(`Realtime Transcription with ${id} is stopping`);
  } else if (status === Constants.transcriptionEvents.TRANSCRIPTION_STOPPED) {
    console.log(`Realtime Transcription with ${id} is stopped`);
  }
});

// Listen for transcription text event
meeting.on("transcription-text", (data) => {
  const { participantId, participantName, text, timestamp, type } = data;
  console.log(`${participantName}: ${text} ${timestamp}`);
});
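
Logging each transcription-text payload is enough for debugging, but an application usually wants to keep the text. The following sketch shows one possible way to accumulate the payloads into a transcript. createTranscriptCollector is a hypothetical helper, not an SDK function; it treats timestamp as an opaque value because this guide does not specify its format, and the sample payloads are made up for illustration.

```javascript
// Hypothetical helper, not part of the SDK: collects "transcription-text"
// payloads and renders them as a readable transcript.
function createTranscriptCollector() {
  const entries = [];
  return {
    // Can be passed as the listener:
    //   meeting.on("transcription-text", collector.add)
    add({ participantId, participantName, text, timestamp }) {
      entries.push({ participantId, participantName, text, timestamp });
    },
    // One line per entry, in arrival order. Falls back to the participant
    // ID when no name is present.
    toText() {
      return entries
        .map(
          (e) =>
            `[${e.timestamp}] ${e.participantName || e.participantId}: ${e.text}`
        )
        .join("\n");
    },
    // Groups the text spoken by each participant, keyed by participantId.
    byParticipant() {
      const grouped = {};
      for (const e of entries) {
        if (!grouped[e.participantId]) grouped[e.participantId] = [];
        grouped[e.participantId].push(e.text);
      }
      return grouped;
    },
  };
}

// Made-up payloads standing in for real events.
const collector = createTranscriptCollector();
collector.add({ participantId: "p1", participantName: "Asha", text: "Hello", timestamp: 1000 });
collector.add({ participantId: "p2", participantName: "Ben", text: "Hi there", timestamp: 2000 });
collector.add({ participantId: "p1", participantName: "Asha", text: "Shall we start?", timestamp: 3000 });
console.log(collector.toText());
```

The add method does not rely on this, so it can be handed directly to meeting.on without binding.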

Step 3: Start realtime transcription

  • Initiate the realtime transcription using the startTranscription() method.
// Starts realtime transcription
meeting.startTranscription(config);

Step 4: Stop realtime transcription

  • Terminate the realtime transcription using the stopTranscription() method.
// Stops realtime transcription
meeting.stopTranscription();
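
Steps 3 and 4 are often driven by a single control, so it can be useful to guard against starting twice or stopping while a start is still in progress. The sketch below is one possible approach under stated assumptions: createTranscriptionToggle is a hypothetical wrapper, statuses is expected to be Constants.transcriptionEvents, transcription is assumed to be stopped initially, and the fake meeting object and its status strings are placeholders so the example runs without the SDK.

```javascript
// Hypothetical wrapper, not part of the SDK. `statuses` is expected to be
// Constants.transcriptionEvents; `meeting` needs on(), startTranscription()
// and stopTranscription() as used in this guide.
function createTranscriptionToggle(meeting, statuses, config) {
  let status = statuses.TRANSCRIPTION_STOPPED; // assumed initial state

  meeting.on("transcription-state-changed", (data) => {
    status = data.status;
  });

  return {
    getStatus: () => status,
    // Starts when stopped, stops when started, ignores calls in between.
    toggle() {
      if (status === statuses.TRANSCRIPTION_STOPPED) {
        status = statuses.TRANSCRIPTION_STARTING; // optimistic local update
        meeting.startTranscription(config);
        return "start";
      }
      if (status === statuses.TRANSCRIPTION_STARTED) {
        status = statuses.TRANSCRIPTION_STOPPING; // optimistic local update
        meeting.stopTranscription();
        return "stop";
      }
      return "ignored";
    },
  };
}

// Placeholder status values; the SDK's real values may differ.
const demoStatuses = {
  TRANSCRIPTION_STARTING: "starting",
  TRANSCRIPTION_STARTED: "started",
  TRANSCRIPTION_STOPPING: "stopping",
  TRANSCRIPTION_STOPPED: "stopped",
};

// Stand-in for a real meeting object so the sketch runs without the SDK.
function createFakeMeeting() {
  const listeners = [];
  const calls = [];
  return {
    calls,
    on(event, listener) {
      if (event === "transcription-state-changed") listeners.push(listener);
    },
    startTranscription() {
      calls.push("start");
    },
    stopTranscription() {
      calls.push("stop");
    },
    emitStatus(status) {
      listeners.forEach((listener) => listener({ status, id: "demo" }));
    },
  };
}

const fakeMeeting = createFakeMeeting();
const transcriptionToggle = createTranscriptionToggle(fakeMeeting, demoStatuses, {
  webhookUrl: "https://www.example.com",
});
const toggleResults = [];
toggleResults.push(transcriptionToggle.toggle()); // stopped, so it starts
toggleResults.push(transcriptionToggle.toggle()); // still starting, so ignored
fakeMeeting.emitStatus(demoStatuses.TRANSCRIPTION_STARTED);
toggleResults.push(transcriptionToggle.toggle()); // started, so it stops
console.log(toggleResults, fakeMeeting.calls);
```

The optimistic update means that if a start request fails without a state-changed event arriving, the wrapper would remain in the starting state; a real integration would need to reset it on failure.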

You can access a summary of your realtime transcription using the Fetch Realtime Transcription API.

Example

  • The following JavaScript snippet starts and stops realtime transcription on button clicks: clicking the "Start Realtime Transcription" button starts transcription, and clicking the "Stop Realtime Transcription" button stops it.
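import { VideoSDK } from "@videosdk.live/js-sdk";

// Status constants used by the state changed listener
const Constants = VideoSDK.Constants;

// Initialize Meeting
const meeting = VideoSDK.initMeeting({
  // ...
});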

// Get start button element
const startRealtimeTranscriptionBtn = document.getElementById(
  "startRealtimeTranscriptionBtn"
);

// Get stop button element
const stopRealtimeTranscriptionBtn = document.getElementById(
  "stopRealtimeTranscriptionBtn"
);

// Listen for transcription state changed event
meeting?.on("transcription-state-changed", (data) => {
  const { status, id } = data;

  // Check for starting event
  if (status === Constants.transcriptionEvents.TRANSCRIPTION_STARTING) {
    console.log("Realtime Transcription is starting", id);
  }
  // Check for started event
  else if (status === Constants.transcriptionEvents.TRANSCRIPTION_STARTED) {
    console.log("Realtime Transcription is started", id);
  }
  // Check for stopping event
  else if (status === Constants.transcriptionEvents.TRANSCRIPTION_STOPPING) {
    console.log("Realtime Transcription is stopping", id);
  }
  // Check for stopped event
  else if (status === Constants.transcriptionEvents.TRANSCRIPTION_STOPPED) {
    console.log("Realtime Transcription is stopped", id);
  }
});

// Listen for transcription text event
meeting?.on("transcription-text", (data) => {
  // Destructuring data
  const { participantId, participantName, text, timestamp, type } = data;
  console.log(`${participantName}: ${text} ${timestamp}`);
});

// Listen for click event
startRealtimeTranscriptionBtn.addEventListener("click", () => {
  // Configuration for realtime transcription
  const config = {
    webhookUrl: "https://example.com",
  };

  // Start realtime transcription
  meeting?.startTranscription(config);
});

// Listen for click event
stopRealtimeTranscriptionBtn.addEventListener("click", () => {
  // Stop realtime transcription
  meeting?.stopTranscription();
});

API Reference

The API references for all the methods utilized in this guide are provided below.
