Face Detection - Python
This guide will help you understand how to implement real-time face detection on video frames using the VideoSDK. We will leverage the Mediapipe library to detect faces and draw bounding boxes around them.
Prerequisites​
- Install necessary libraries:
pip install videosdk python-dotenv opencv-python av mediapipe
- Create a
.env
file and add your VideoSDK token, meeting ID, and name:VIDEOSDK_TOKEN=your_token
MEETING_ID=your_meeting_id
NAME=your_name
Code Breakdown​
Imports and Constants​
We start by importing necessary libraries and loading environment variables in face_detection.py
file:
import asyncio
import os
from videosdk import MeetingConfig, VideoSDK, Participant, Stream, MeetingEventHandler, ParticipantEventHandler, CustomVideoTrack, Meeting
import mediapipe as mp
import cv2
from av import VideoFrame
from dotenv import load_dotenv
load_dotenv()
VIDEOSDK_TOKEN = os.getenv("VIDEOSDK_TOKEN")
MEETING_ID = os.getenv("MEETING_ID")
NAME = os.getenv("NAME")
loop = asyncio.get_event_loop()
# Initialize Mediapipe face detection
mp_face_detection = mp.solutions.face_detection
mp_drawing = mp.solutions.drawing_utils
meeting: Meeting = None
Face Detection Processor​
This processor performs face detection on each video frame, draws the outline and return the frame:
class FaceDetectionProcessor():
def __init__(self) -> None:
print("Processor initialized")
def process(self, frame: VideoFrame) -> VideoFrame:
# Convert frame to image
img = frame.to_ndarray(format="bgr24")
# Convert the image to RGB
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
# Perform face detection
with mp_face_detection.FaceDetection(
min_detection_confidence=0.2
) as face_detection:
results = face_detection.process(img_rgb)
# Draw face detections on the image
if results.detections:
for detection in results.detections:
mp_drawing.draw_detection(img, detection)
# rebuild a VideoFrame, preserving timing information
new_frame = VideoFrame.from_ndarray(img, format="bgr24")
new_frame.pts = frame.pts
new_frame.time_base = frame.time_base
return new_frame
CustomVideoTrack​
Define a custom video track that will run the above processor when new frame received.
class ProcessedVideoTrack(CustomVideoTrack):
"""
A video stream track that transforms frames from an another track.
"""
kind = "video"
def __init__(self, track):
super().__init__() # don't forget this!
self.track = track
self.processor = FaceDetectionProcessor()
async def recv(self):
frame = await self.track.recv()
new_frame = self.processor.process(frame)
return new_frame
Process on stream available​
This function applies the ProcessedVideoTrack
to a available video track:
def process_video(track: CustomVideoTrack):
global meeting
meeting.add_custom_video_track(
track=ProcessedVideoTrack(track=track)
)
Event Handlers​
Define event handlers to handle meeting and participant events:
class MyMeetingEventHandler(MeetingEventHandler):
def __init__(self):
super().__init__()
def on_meeting_left(self, data):
print("on_meeting_left")
def on_participant_joined(self, participant: Participant):
participant.add_event_listener(
MyParticipantEventHandler(participant_id=participant.id)
)
def on_participant_left(self, participant: Participant):
print("on_participant_left")
class MyParticipantEventHandler(ParticipantEventHandler):
def __init__(self, participant_id: str):
super().__init__()
self.participant_id = participant_id
def on_stream_enabled(self, stream: Stream):
print("on_stream_enabled: " + stream.kind)
if stream.kind == "video":
process_video(track=stream.track)
def on_stream_disabled(self, stream: Stream):
print("on_stream_disabled")
Main Function​
Initialize the meeting and start the event loop:
def main():
global meeting
# Example usage:
meeting_config = MeetingConfig(
meeting_id=MEETING_ID,
name=NAME,
mic_enabled=False,
webcam_enabled=False,
token=VIDEOSDK_TOKEN,
)
meeting = VideoSDK.init_meeting(**meeting_config)
print("adding event listener...")
meeting.add_event_listener(MyMeetingEventHandler())
print("joining into meeting...")
meeting.join()
if __name__ == "__main__":
main()
loop.run_forever()
Running the Code​
To run the code, simply execute the script:
python face_detection.py
This script will join the meeting specified by MEETING_ID
with the provided VIDEOSDK_TOKEN
and NAME
, and perform real-time face detection on video frames using Mediapipe.
Feel free to modify the face detection logic inside the FaceDetectionProcessor
class to adjust the detection parameters or apply additional processing.
Output​
Stuck anywhere? Check out this example code on GitHub.
API Reference​
The API references for all the methods and events utilized in this guide are provided below.
Got a Question? Ask us on discord