How to enable post transcription and summary in VideoSDK
Overviewโ
Transcription API empowers developers to convert spoken language into written text automatically. In this tutorial, we'll delve into how to effectively utilize a Transcription API to transcribe meeting recordings. We'll explore two distinct methods for initiating transcription: enabling transcription via the start recording API and enabling transcription via auto-configuration while creating a meetingId using API.
This feature is currently in BETA.
Prerequisitesโ
Before diving into the Transcription API, ensure you have:
- Basic knowledge of REST APIs.
- Understanding of JSON formatting.
- An Auth Token for the Transcription API.
Step 1: Understand the workflowโ
Before enabling transcription and summary features in VideoSDK, it's important to understand the basic workflow. Transcription and summarization occur after recording, meaning that recording is a necessary step before these features can be used. To activate the transcription and summary APIs, you must first start recording using either the Start Recording API or by auto config recording within the Create MeetingId API. To use transcription, recordings needs to be stored in VideoSDK Cloud
Step 2: Choosing Your Methodโ
There are two primary methods for initiating transcription
- via the start recording API
- via auto configuration while creating a meeting API
Choose the method that best fits your use case.
Option 1: Enabling Transcription via Recording APIโ
In this method, transcription is enabled by including a transcription object with the property enabled set to true in the start recording API request. Below is an example implementation in Node.js:
Request
import fetch from "node-fetch";
const options = {
method: "POST",
headers: {
Authorization: "YOUR_TOKEN",
"Content-Type": "application/json",
},
body: JSON.stringify({
roomId: "room-id",
transcription: {
enabled: true,
summary: {
enabled: true,
},
},
}),
};
const url = `https://api.videosdk.live/v2/recordings/start`;
const response = await fetch(url, options);
const data = await response.json();
console.log(data);
Response
"Recording started successfully";
Replace room-id
with the ID of your meeting or room and YOUR_TOKEN
with your auth token.
Additionally, you can enable summary feature by setting the enabled
property to true
within the summary object.
Option 2: Enabling Transcription via Auto Configuration while creating meeting using APIโ
If you prefer transcription to be enabled by default when creating a meeting, you can configure this during the meeting creation process. Here's an example implementation in Node.js:
Request
import fetch from 'node-fetch';
const options = {
method: "POST",
headers: {
"Authorization": "$YOUR_TOKEN",
"Content-Type": "application/json",
},
body: JSON.stringify({
"autoStartConfig" : {
"recording":
"transcription": {
"enabled": true
"summary": {
"enabled": true
},
}
}
}),
};
const url= `https://api.videosdk.live/v2/rooms`;
const response = await fetch(url, options);
const data = await response.json();
console.log(data);
Response
{
"roomId": "abc-xyzw-lmno",
"disabled": false,
"createdAt": "2024-02-27T04:49:11.024Z",
"updatedAt": "2024-02-27T04:49:11.024Z",
"id": "623d49c760a18e699abcc8a4",
"links": {
"get_room": "https://api.videosdk.live/v2/rooms/abc-xyzw-lmno",
"get_session": "https://api.videosdk.live/v2/sessions/"
}
Replace "YOUR_TOKEN"
with your authentication token and set the summary object's enabled
property to true
if you wish to utilize the summary feature.
In this method, recording will automatically start with transcription enabled when initiating a session with a specified Meeting ID or Room ID.
Step 3: Conducting a Meetingโ
You can initiate a meeting using any of our SDKs (Prebuilt, React, React Native, JS, Flutter, Android, iOS).
- Option 1: If you opt to enable transcription via the recording API, pass the transcription object as described in Step 2. The recording will commence with the transcription feature enabled.
- Option 2: If you prefer enabling the transcription feature during meeting creation using the API, the recording will automatically begin with transcription enabled when initiating your meeting.
Step 4: Retrieving Transcription and Summaryโ
After the recording concludes and transcription was enabled for that recording or meeting, you can retrieve the transcription and summary using the API.
To retrieve the transcription of a meeting, make a GET request to the Transcription API endpoint with the meeting ID. Below is an example using Node.js:
Request
import fetch from "node-fetch";
const options = {
method: "GET",
headers: {
Authorization: "$YOUR_TOKEN",
"Content-Type": "application/json",
},
};
const roomId = "room-id";
const url = `https://api.videosdk.live/ai/v1/post-transcriptions?roomId=${roomId}`;
const response = await fetch(url, options);
const data = await response.json();
console.log(data);
Response
{
"pageInfo": {
"currentPage": 1,
"perPage": 20,
"lastPage": 1,
"total": 1
},
"transcriptions": [
{
"id": "transcription-id",
"status": "completed",
"roomId": "room-id",
"sessionId": "session-id",
"recordingId": "recording-id",
"filePath": "https://cdn.videosdk.live/encoded/videos/dummy.mp4",
"transcriptionFilePaths": {
"json": "https://cdn.videosdk.live/transcriptions/dummy/dummy.json",
"srt": "https://cdn.videosdk.live/transcriptions/dummy/dummy.srt",
"txt": "https://cdn.videosdk.live/transcriptions/dummy/dummy.txt",
"tsv": "https://cdn.videosdk.live/transcriptions/dummy/dummy.tsv",
"vtt": "https://cdn.videosdk.live/transcriptions/dummy/dummy.vtt"
},
"summarizedFilePaths": {
"txt": "https://cdn.videosdk.live/transcriptions/dummy/dummy-summary.txt"
},
"userStorage": null,
"start": "2024-02-27T16:00:36.828Z",
"end": "2024-02-27T16:01:46.939Z"
}
]
}
Replace 'YOUR_AUTH_TOKEN'
with your authentication token and 'room-id'
with the ID of your meeting or room.
Here, you will receive transcriptionFilePaths
object containing paths of the transcribed files with various formats such as json, srt, txt, tsv, vtt.
Additionally, if you have enabled summary feature, you will receive a summarizedFilePaths
object containing path of the summarized text file.
Conclusionโ
In this tutorial, you've learned how to enable transcription for meetings using the Transcription API. By following the steps outlined above, you can easily integrate transcription capabilities into your applications and retrieve transcribed text and summaries for your meetings. Transcription can greatly enhance the accessibility and usability of your meeting recordings, making them more searchable and comprehensible. Experiment with these methods and explore further possibilities to unlock the full potential of transcription in your applications and enhance the efficiency and productivity of your meetings.