Real-time hand detection and tracking system using MediaPipe and OpenCV.
This project implements a real-time hand detector that uses Google's MediaPipe to identify and track up to 4 hands simultaneously through a webcam. The system draws the hand landmarks and the connections between them, and overlays additional information such as FPS and the handedness (left/right) of each detected hand.
- ✋ Detection of up to 4 hands simultaneously
- 🎯 21 landmarks per hand
- 🔄 Real-time processing with LIVE_STREAM mode
- 📊 Real-time FPS visualization
- 🎨 Drawing of landmarks and connections between joints
- 👆 Left/right hand identification with confidence level
- ⚙️ Customizable camera and detector configuration
- Python 3.8+
- Webcam
- Clone or download this repository
- Install the dependencies:
  ```
  pip install -r requirements.txt
  ```
- Make sure the model file `hand_landmarker.task` is in the project root directory
- Run the main program:
  ```
  python main.py
  ```
- `Q` or `ESC`: exit the program
```
mediapipeHandLandmarker/
│
├── main.py                # Main program entry point
├── settings.py            # Camera and detector configuration
├── drawer.py              # Landmark drawing functions
├── common.py              # Shared global variables
├── hand_landmarker.task   # MediaPipe model (required)
├── requirements.txt       # Project dependencies
└── README.md              # This file
```
Camera Configuration (`CameraConfig` in `settings.py`):

```python
WIDTH = 680   # Resolution width
HEIGHT = 460  # Resolution height
FPS = 30      # Frames per second
```

Detector Configuration (`HandDetectorConfig` in `settings.py`):

```python
NUM_HANDS = 4                   # Maximum number of hands to detect
MIN_DETECTION_CONFIDENCE = 0.5  # Minimum confidence for detection
MIN_PRESENCE_CONFIDENCE = 0.5   # Minimum presence confidence
MIN_TRACKING_CONFIDENCE = 0.5   # Minimum confidence for tracking
```

`main.py`:
- Initializes the camera and the MediaPipe detector
- Manages the main capture and processing loop
- Handles asynchronous detection mode (LIVE_STREAM)
`drawer.py`:
- Draws the 21 landmarks of each detected hand
- Connects landmarks according to hand anatomy
- Displays FPS information and number of hands detected
- Labels each hand as left/right with its confidence level
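The anatomical connections mentioned above can be expressed as index pairs over the 21 landmarks. A minimal sketch (the pair list mirrors the standard MediaPipe hand topology; the helper name is illustrative, not the project's actual code):

```python
# Landmark index pairs forming the hand skeleton:
# palm outline first, then one chain per finger.
PALM = [(0, 1), (0, 5), (5, 9), (9, 13), (13, 17), (0, 17)]

def finger_chain(base):
    """Three segments from a finger's base joint to its tip."""
    return [(base + i, base + i + 1) for i in range(3)]

HAND_CONNECTIONS = (
    PALM
    + finger_chain(1)   # thumb:  1-2-3-4
    + finger_chain(5)   # index:  5-6-7-8
    + finger_chain(9)   # middle: 9-10-11-12
    + finger_chain(13)  # ring:   13-14-15-16
    + finger_chain(17)  # pinky:  17-18-19-20
)
```

Drawing code can then iterate over these pairs and draw one line segment between the two landmark coordinates of each pair.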
`settings.py`:
- Centralizes all project configuration
- Defines camera and detector parameters
- Imports necessary classes from MediaPipe
`common.py`:
- Global variables for sharing detection results
- Manages state between asynchronous callbacks
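The shared-state pattern can be sketched as a lock-protected holder that the asynchronous callback writes to and the main loop reads from (names here are illustrative, not the project's actual variables):

```python
import threading

class SharedResult:
    """Holds the latest detection result passed between threads."""
    def __init__(self):
        self._lock = threading.Lock()
        self._result = None

    def set(self, result):
        # Called from the detector's result callback.
        with self._lock:
            self._result = result

    def get(self):
        # Called from the main capture/drawing loop.
        with self._lock:
            return self._result

latest = SharedResult()
```

The lock keeps reads and writes atomic, so the main loop always sees either the previous complete result or the new one, never a partial update.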
- `mediapipe` (0.10.31): ML framework for hand detection
- opencv-python (4.12.0.88): Image and video processing
- numpy (2.2.6): Numerical operations
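Pinned, these versions would correspond to a `requirements.txt` like the following (an assumption about the exact file contents, based on the versions listed above):

```
mediapipe==0.10.31
opencv-python==4.12.0.88
numpy==2.2.6
```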
The system detects 21 reference points per hand:
- 0: Wrist
- 1-4: Thumb
- 5-8: Index
- 9-12: Middle
- 13-16: Ring
- 17-20: Pinky
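The index ranges above can be turned into a small lookup helper (an illustrative sketch, not part of the project's code):

```python
def landmark_name(index):
    """Map a hand landmark index (0-20) to its wrist/finger label."""
    if index == 0:
        return "Wrist"
    fingers = ["Thumb", "Index", "Middle", "Ring", "Pinky"]
    if 1 <= index <= 20:
        # Each finger owns a block of 4 consecutive indices.
        return fingers[(index - 1) // 4]
    raise ValueError(f"invalid landmark index: {index}")
```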
- The system operates in LIVE_STREAM mode for asynchronous processing
- Uses callbacks to handle detection results without blocking the flow
- Timestamps are handled in milliseconds for synchronization
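Since LIVE_STREAM mode expects a monotonically increasing millisecond timestamp for each frame sent to the detector, the timestamps can be produced like this (a minimal sketch; the actual project may derive its timestamps differently):

```python
import time

class FrameClock:
    """Produces strictly increasing millisecond timestamps."""
    def __init__(self):
        self._last = -1

    def next_timestamp_ms(self):
        now = int(time.monotonic() * 1000)
        # Guard against two frames landing in the same millisecond.
        if now <= self._last:
            now = self._last + 1
        self._last = now
        return now
```

Each captured frame is then tagged with `next_timestamp_ms()` before being handed to the detector, keeping results in order.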
- Detection works best with good lighting and contrasting background
- Verify that your webcam is connected and not being used by another application
- Try changing the camera index in `cv2.VideoCapture(0)` to `1` or `2`
- Reduce the resolution in `CameraConfig`
- Decrease `NUM_HANDS` if you don't need to detect that many hands
- Close other resource-consuming applications
- Adjust the confidence values in `HandDetectorConfig`
- Make sure you have good lighting
- Keep hands within the camera's field of view
This project uses Google's MediaPipe, which is under the Apache 2.0 license.
Developed with MediaPipe and OpenCV
Contributions are welcome. Please open an issue or pull request for suggestions or improvements.