Namo Turn Detector

The Namo Turn Detector v1 uses a custom model fine-tuned by VideoSDK to determine whether a user has finished speaking. This lets an agent manage conversation flow precisely, especially in cascading pipeline setups. It can run as a multilingual model or be configured for a single language for optimized performance.

Installation

Install the Turn Detector plugin package for VideoSDK Agents:

pip install "videosdk-plugins-turn-detector"

Importing

from videosdk.plugins.turn_detector import NamoTurnDetectorV1

Example Usage

1. For a specific language (e.g., English):

from videosdk.plugins.turn_detector import NamoTurnDetectorV1, pre_download_namo_turn_v1_model
from videosdk.agents import CascadingPipeline

# Pre-download the English model to avoid delays
pre_download_namo_turn_v1_model(language="en")

# Initialize the Turn Detector for English
turn_detector = NamoTurnDetectorV1(
    language="en",
    threshold=0.7,
)

# Add the Turn Detector to a cascading pipeline
pipeline = CascadingPipeline(turn_detector=turn_detector)

2. For multilingual support:

If you don't specify a language, the detector defaults to the multilingual model, which handles all supported languages.

from videosdk.plugins.turn_detector import NamoTurnDetectorV1, pre_download_namo_turn_v1_model
from videosdk.agents import CascadingPipeline

# Pre-download the multilingual model
pre_download_namo_turn_v1_model()

# Initialize the multilingual Turn Detector
turn_detector = NamoTurnDetectorV1(
    threshold=0.7,
)

# Add the Turn Detector to a cascading pipeline
pipeline = CascadingPipeline(turn_detector=turn_detector)

Configuration Options

  • language (str, optional): Language code for the turn detection model. If left as None (the default), the multilingual model is loaded, which handles all supported languages.

  • threshold (float): Confidence threshold for turn completion detection, from 0.0 to 1.0 (default: 0.7).
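
To make the role of threshold concrete, here is a minimal sketch of how a confidence threshold of this kind is commonly applied. The helper is_turn_complete and the ">=" comparison are assumptions for illustration only; they are not the plugin's internal implementation.

```python
def is_turn_complete(probability: float, threshold: float = 0.7) -> bool:
    """Hypothetical helper (not part of the plugin).

    Treats the user's turn as finished once the model's end-of-turn
    probability reaches the configured threshold.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    return probability >= threshold
```

Under this assumption, raising the threshold makes the agent wait for stronger evidence before responding, while lowering it makes the agent respond sooner at the risk of cutting the user off.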

Supported Languages

The NamoTurnDetectorV1 supports a wide range of languages when you specify the corresponding language code. If no language is specified, the multilingual model will be used.

Here is a list of the supported languages and their codes:

Language      Code
Arabic        ar
Bengali       bn
Chinese       zh
Danish        da
Dutch         nl
English       en
Finnish       fi
French        fr
German        de
Hindi         hi
Indonesian    id
Italian       it
Japanese      ja
Korean        ko
Marathi       mr
Norwegian     no
Polish        pl
Portuguese    pt
Russian       ru
Spanish       es
Turkish       tr
Ukrainian     uk
Vietnamese    vi
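
When the language comes from user input or a locale setting, it can help to check it against the list above before constructing the detector. The sketch below is one way to do that; resolve_language and SUPPORTED_LANGUAGES are hypothetical names introduced here, not part of the plugin, and returning None relies on the documented behavior that an unspecified language selects the multilingual model.

```python
from typing import Optional

# Language codes copied from the supported-languages list above.
SUPPORTED_LANGUAGES = {
    "ar", "bn", "zh", "da", "nl", "en", "fi", "fr", "de", "hi", "id", "it",
    "ja", "ko", "mr", "no", "pl", "pt", "ru", "es", "tr", "uk", "vi",
}


def resolve_language(code: Optional[str]) -> Optional[str]:
    """Hypothetical helper (not part of the plugin).

    Returns the normalized code if it is in the supported list,
    otherwise None so that the multilingual model is used.
    """
    if code is None:
        return None
    normalized = code.strip().lower()
    return normalized if normalized in SUPPORTED_LANGUAGES else None
```

The result can then be passed straight through, for example NamoTurnDetectorV1(language=resolve_language(user_language)), where user_language is whatever value your application supplies.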

Pre-downloading Model

To avoid delays during agent initialization, you can pre-download the Hugging Face model before the agent runs.

To pre-download a specific language model:

from videosdk.plugins.turn_detector import pre_download_namo_turn_v1_model

# Download the English model before the agent runs
pre_download_namo_turn_v1_model(language="en")

Or pre-download the multilingual model:

from videosdk.plugins.turn_detector import pre_download_namo_turn_v1_model

# Download the multilingual model
pre_download_namo_turn_v1_model()
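
If an agent serves several languages, the two calls above can be combined in one setup step, for example during an image build. The following is a minimal sketch: pre_download_all is a hypothetical helper, and the download function is passed in as an argument so the example does not depend on anything beyond the two call forms shown above.

```python
from typing import Callable, Iterable, List, Optional


def pre_download_all(
    download: Callable[..., None],
    languages: Iterable[Optional[str]],
) -> List[Optional[str]]:
    """Hypothetical helper (not part of the plugin).

    Calls `download` once per entry. A None entry requests the
    multilingual model by calling `download()` with no arguments.
    """
    completed: List[Optional[str]] = []
    for language in languages:
        if language is None:
            download()  # multilingual model
        else:
            download(language=language)  # language-specific model
        completed.append(language)
    return completed
```

Usage would look like pre_download_all(pre_download_namo_turn_v1_model, ["en", "hi", None]), which fetches the English, Hindi, and multilingual models in turn.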

Additional Resources

The following resources provide more information about the VideoSDK Turn Detector plugin for the AI Agents SDK.