Namo Turn Detector
The Namo Turn Detector v1 uses a custom fine-tuned model from VideoSDK to determine whether a user has finished speaking. This lets an agent manage conversation flow precisely, especially in cascading pipeline setups. It can run as a multilingual model or be configured for a single language for optimized performance.
Installation
Install the Turn Detector-enabled VideoSDK Agents package:
pip install "videosdk-plugins-turn-detector"
Importing
from videosdk.plugins.turn_detector import NamoTurnDetectorV1
Example Usage
1. For a specific language (e.g., English):
from videosdk.plugins.turn_detector import NamoTurnDetectorV1, pre_download_namo_turn_v1_model
from videosdk.agents import CascadingPipeline
# Pre-download the English model to avoid delays
pre_download_namo_turn_v1_model(language="en")
# Initialize the Turn Detector for English
turn_detector = NamoTurnDetectorV1(
    language="en",
    threshold=0.7
)
# Add the Turn Detector to a cascading pipeline
pipeline = CascadingPipeline(turn_detector=turn_detector)
2. For multilingual support:
If you don't specify a language, the detector defaults to the multilingual model, which handles all supported languages.
from videosdk.plugins.turn_detector import NamoTurnDetectorV1, pre_download_namo_turn_v1_model
from videosdk.agents import CascadingPipeline
# Pre-download the multilingual model
pre_download_namo_turn_v1_model()
# Initialize the multilingual Turn Detector
turn_detector = NamoTurnDetectorV1(
    threshold=0.7
)
# Add the Turn Detector to a cascading pipeline
pipeline = CascadingPipeline(turn_detector=turn_detector)
Configuration Options
- language: (Optional, str) Specifies the language for the turn detection model. If left as None (the default), it loads a multilingual model capable of handling all supported languages.
- threshold: (float) Confidence threshold for turn completion detection (0.0 to 1.0, default: 0.7).
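To make the role of threshold concrete, here is a minimal sketch of how a confidence threshold is typically applied to a model score. It is not the plugin's internal code: the function name is_turn_complete, the end_of_turn_probability argument, and the use of >= for the comparison are all assumptions made for illustration.

```python
def is_turn_complete(end_of_turn_probability: float, threshold: float = 0.7) -> bool:
    """Hypothetical helper: treat the turn as finished when the model's
    confidence reaches the threshold. Names and the >= comparison are
    assumptions, not the plugin's documented behavior."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    return end_of_turn_probability >= threshold


# A higher threshold waits for more confidence before ending the turn.
print(is_turn_complete(0.9))                  # confidence above the default 0.7
print(is_turn_complete(0.5))                  # confidence below the default 0.7
print(is_turn_complete(0.5, threshold=0.4))   # a lower threshold ends turns sooner
```

Under this reading, raising threshold makes the agent less likely to cut in while the user is pausing, and lowering it makes the agent respond sooner.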
Supported Languages
NamoTurnDetectorV1 supports a wide range of languages when you specify the corresponding language code. If no language is specified, the multilingual model is used.
Here is a list of the supported languages and their codes:
Language | Code |
---|---|
Arabic | ar |
Bengali | bn |
Chinese | zh |
Danish | da |
Dutch | nl |
English | en |
Finnish | fi |
French | fr |
German | de |
Hindi | hi |
Indonesian | id |
Italian | it |
Japanese | ja |
Korean | ko |
Marathi | mr |
Norwegian | no |
Polish | pl |
Portuguese | pt |
Russian | ru |
Spanish | es |
Turkish | tr |
Ukrainian | uk |
Vietnamese | vi |
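If the language comes from user input or a locale setting, it may help to check it against the table before constructing the detector. The sketch below is one possible approach, not part of the plugin: SUPPORTED_LANGUAGE_CODES and resolve_language are hypothetical names, and the codes are copied from the table above. Unknown codes resolve to None, which the detector treats as a request for the multilingual model.

```python
from typing import Optional

# Codes copied from the Supported Languages table (hypothetical constant name).
SUPPORTED_LANGUAGE_CODES = frozenset({
    "ar", "bn", "zh", "da", "nl", "en", "fi", "fr", "de", "hi", "id", "it",
    "ja", "ko", "mr", "no", "pl", "pt", "ru", "es", "tr", "uk", "vi",
})


def resolve_language(code: Optional[str]) -> Optional[str]:
    """Return a supported language code, or None to fall back to the
    multilingual model. This helper is illustrative, not a plugin API."""
    if code is None:
        return None
    normalized = code.strip().lower()
    return normalized if normalized in SUPPORTED_LANGUAGE_CODES else None


# Possible usage (requires the plugin to be installed):
# turn_detector = NamoTurnDetectorV1(language=resolve_language("EN"), threshold=0.7)
```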
Pre-downloading Model
To avoid delays during agent initialization, you can pre-download the Hugging Face model before the agent starts.
To pre-download a specific language model:
from videosdk.plugins.turn_detector import pre_download_namo_turn_v1_model
# Download the English model before the agent runs
pre_download_namo_turn_v1_model(language="en")
Or pre-download the multilingual model:
from videosdk.plugins.turn_detector import pre_download_namo_turn_v1_model
# Download the multilingual model
pre_download_namo_turn_v1_model()
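When an agent may serve several languages, the two calls above can be combined in a small loop. The following is a sketch only: pre_download_models is a hypothetical helper, and it takes the download function as an argument so it can be exercised without the plugin installed. None stands for the multilingual model, matching the no-argument call shown above.

```python
from typing import Callable, Iterable, List, Optional


def pre_download_models(
    languages: Iterable[Optional[str]],
    download: Callable[..., None],
) -> List[Optional[str]]:
    """Call `download` once per distinct entry, preserving order.
    A None entry triggers the no-argument (multilingual) call."""
    fetched: List[Optional[str]] = []
    for language in languages:
        if language in fetched:
            continue  # skip duplicates so each model is requested once
        if language is None:
            download()
        else:
            download(language=language)
        fetched.append(language)
    return fetched


# Possible usage (requires the plugin to be installed):
# from videosdk.plugins.turn_detector import pre_download_namo_turn_v1_model
# pre_download_models(["en", "hi", None], pre_download_namo_turn_v1_model)
```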