Overview
The hand tracking service:
- Captures webcam input at 60 fps (capped)
- Detects up to 2 hands using MediaPipe
- Streams hand landmark data via WebSocket (ws://localhost:8765)
- Supports natural gestures for camera control and node manipulation
Hand tracking is optional. Sprout works without it, but you lose the ability to control the 3D graph with hand gestures.
Prerequisites
Install Anaconda or Miniconda
Download and install Miniconda (recommended) or full Anaconda:
- macOS
- Linux
- Windows
Environment Setup
Create an isolated conda environment for hand tracking dependencies:
Create Conda Environment
Use Python 3.10 or 3.11. Avoid 3.12+ as MediaPipe may have compatibility issues.
Install Dependencies
The hand tracking service requires four Python packages with specific versions:
- From requirements.txt (Recommended)
- Manual Installation
- With Cache Clearing (Troubleshooting)
Package Details
| Package | Version | Purpose |
|---|---|---|
| mediapipe | 0.10.14 | Hand landmark detection (21 points per hand) |
| opencv-python | 4.13.0.92 | Webcam capture and image processing |
| websockets | 12.0 | WebSocket server for streaming data |
| numpy | 2.4.2 | Array operations (auto-installed by mediapipe) |
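To confirm that the active environment matches the pinned versions in the table, a small check along these lines can help. This is a sketch only: `PINNED`, `installed_version`, and `find_mismatches` are hypothetical names, not part of Sprout, and the version strings are copied from the table above.

```python
from importlib import metadata

# Pinned versions copied from the Package Details table.
PINNED = {
    "mediapipe": "0.10.14",
    "opencv-python": "4.13.0.92",
    "websockets": "12.0",
    "numpy": "2.4.2",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned=PINNED, lookup=installed_version):
    """Return {package: (expected, found)} for every package that differs."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches().items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

Run it inside the conda environment. No output means every package matches its pin.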
Running the Service
Start the WebSocket server by running python backend.py. Keep this terminal window open while using hand tracking. The service runs continuously until you stop it with Ctrl+C.
Configuration
The hand tracking service is configured in backend.py:
MediaPipe Settings
backend.py
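As a rough sketch, the four settings listed below might be passed to MediaPipe's Hands solution like this. Only `max_num_hands = 2` comes from this page. The other three values are MediaPipe's own library defaults and may differ from what backend.py actually uses. `HAND_TRACKER_SETTINGS` and `create_hand_tracker` are hypothetical names.

```python
# Only max_num_hands=2 is documented for Sprout. The remaining values are
# MediaPipe's library defaults, used here as placeholders (assumption).
HAND_TRACKER_SETTINGS = {
    "max_num_hands": 2,
    "model_complexity": 1,
    "min_detection_confidence": 0.5,
    "min_tracking_confidence": 0.5,
}


def create_hand_tracker(settings=HAND_TRACKER_SETTINGS):
    """Build a MediaPipe Hands instance configured for live video."""
    # Imported lazily so the settings can be inspected without MediaPipe installed.
    import mediapipe as mp

    return mp.solutions.hands.Hands(static_image_mode=False, **settings)
```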
- max_num_hands
- model_complexity
- min_detection_confidence
- min_tracking_confidence
Default: 2
Maximum number of hands to detect simultaneously.
Camera Settings
backend.py
- Default Camera
- Multiple Cameras
Performance Settings
backend.py
- Frame Rate
- Smoothing
- Grab Gesture
Default: 60 fps
Gesture System
The hand tracking service recognizes two main gestures:
1. Camera Control (Normal Hand)
When your hand is not in an open palm position:
- Index finger tip position controls camera azimuth and elevation
- Camera orbits around the current focus point
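One way to picture the camera control mapping is as a linear map from the normalized fingertip position to orbit angles. The sketch below is an assumption about how such a mapping could work, not Sprout's actual implementation. The function name and the angle ranges are hypothetical.

```python
import math


def fingertip_to_orbit(x, y, azimuth_range=math.pi, elevation_range=math.pi / 2):
    """Map a normalized fingertip position (0-1) to (azimuth, elevation) in radians.

    The centre of the frame (0.5, 0.5) maps to (0, 0). The ranges are
    illustrative defaults, not values taken from Sprout.
    """
    # Clamp so that out-of-frame landmarks cannot overshoot the ranges.
    x = min(max(x, 0.0), 1.0)
    y = min(max(y, 0.0), 1.0)
    azimuth = (x - 0.5) * 2.0 * azimuth_range
    # Image y grows downward, so flip it: raising the hand raises the camera.
    elevation = (0.5 - y) * 2.0 * elevation_range
    return azimuth, elevation
```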
2. Grab Mode (Open Palm)
Hold an open palm for 3 seconds to enter grab mode:
- All four fingers (index, middle, ring, pinky) must be extended
- Palm center position is tracked
- Hovering over a node while grabbing allows you to drag it in 3D space
is_open_palm):
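The grab-mode rules can be sketched as follows. The landmark indices are MediaPipe's standard hand landmark numbering. The "tip above PIP joint" test for an extended finger is an assumed heuristic that only holds for an upright hand, and `GrabTimer` is a hypothetical helper. Landmarks are taken as plain `(x, y, z)` tuples, so MediaPipe landmark objects would need converting first.

```python
import time

WRIST = 0
MCP_JOINTS = (5, 9, 13, 17)  # index, middle, ring, pinky knuckles
FINGERS = ((8, 6), (12, 10), (16, 14), (20, 18))  # (tip, PIP joint) per finger
GRAB_HOLD_SECONDS = 3.0  # hold time stated in this guide


def palm_center(landmarks):
    """Average the wrist and the four MCP joints, as described in the protocol."""
    points = [landmarks[WRIST]] + [landmarks[i] for i in MCP_JOINTS]
    return tuple(sum(p[axis] for p in points) / len(points) for axis in range(3))


def is_open_palm(landmarks):
    """Assumed heuristic: a finger is extended when its tip is above its PIP joint."""
    # Image y grows downward, so "above" means a smaller y value.
    return all(landmarks[tip][1] < landmarks[pip][1] for tip, pip in FINGERS)


class GrabTimer:
    """Tracks how long the palm has been open and when grab mode starts."""

    def __init__(self, hold_seconds=GRAB_HOLD_SECONDS, clock=time.monotonic):
        self._hold_seconds = hold_seconds
        self._clock = clock
        self._started = None

    def update(self, open_palm):
        """Return (palm_hold_duration, is_grabbing) for the current frame."""
        if not open_palm:
            self._started = None  # closing the hand resets the timer
            return 0.0, False
        now = self._clock()
        if self._started is None:
            self._started = now
        held = now - self._started
        return held, held >= self._hold_seconds
```

Passing a larger `hold_seconds` makes grab mode harder to trigger by accident.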
Gesture Flow
Protocol
The WebSocket server sends JSON messages at 60 fps (capped):
Field Definitions
| Field | Type | Description |
|---|---|---|
| handedness | string | “Left” or “Right” |
| x, y, z | float | Index finger tip position (normalized 0-1) |
| pinch | boolean | Thumb and index finger are pinching |
| palm_x, palm_y, palm_z | float | Palm center position (average of wrist + 4 MCP joints) |
| is_open_palm | boolean | All 4 fingers are extended |
| palm_hold_duration | float | Seconds open palm has been held |
| is_grabbing | boolean | Grab mode is active (palm held for 3s) |
Both hands are sent when detected. The frontend uses handedness (not array index) to track gestures consistently.
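For testing the stream outside the frontend, a small Python client could decode each message into typed records keyed by handedness. The field names below come from the table above, but the message envelope is an assumption: this sketch expects a JSON object with a `"hands"` array, which may not match what backend.py sends. `HandState`, `parse_hands`, and `print_hands` are hypothetical names.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class HandState:
    handedness: str
    x: float
    y: float
    z: float
    pinch: bool
    palm_x: float
    palm_y: float
    palm_z: float
    is_open_palm: bool
    palm_hold_duration: float
    is_grabbing: bool


FIELD_NAMES = tuple(f.name for f in fields(HandState))


def parse_hands(raw, key="hands"):
    """Decode one message into {handedness: HandState}.

    The "hands" envelope key is an assumption. Adjust it to the real payload.
    """
    payload = json.loads(raw)
    result = {}
    for hand in payload.get(key, []):
        # Keep only the documented fields and ignore any extras.
        state = HandState(**{name: hand[name] for name in FIELD_NAMES})
        result[state.handedness] = state
    return result


async def print_hands(url="ws://localhost:8765"):
    """Connect to the service and print every decoded message."""
    import websockets  # imported lazily: only needed when actually connecting

    async with websockets.connect(url) as ws:
        async for raw in ws:
            print(parse_hands(raw))
```

With the service running, `asyncio.run(print_hands())` prints the decoded hands for each frame. Keying the result by handedness mirrors how the frontend tracks gestures.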
Frontend Integration
The frontend connects to the WebSocket server from the hand tracking toggle:
Troubleshooting
MediaPipe module not found
Error: ModuleNotFoundError: No module named 'mediapipe'
Solution:
- Verify you’re in the conda environment:
- Reinstall dependencies:
Camera permission denied
Error: cv2.VideoCapture fails to open the camera, or reads return no frame or blank frames.
Solution:
- Grant camera permissions in system settings
- Close other apps using the webcam
- Try a different camera index:
WebSocket connection refused
Frontend shows “WebSocket error: Connection refused”.
Solution:
- Ensure python backend.py is running
- Check the port is 8765 (default)
- Verify firewall isn’t blocking localhost connections
Hand tracking is jittery
Hand positions jump around erratically.
Solution:
- Increase smoothing:
- Increase tracking confidence:
- Improve lighting conditions
- Use a higher quality webcam
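This guide does not say which smoothing method backend.py uses. Assuming it is an exponential moving average, which is a common choice for landmark jitter, the idea can be sketched as below. `PointSmoother` and its `smoothing` parameter are hypothetical. A higher `smoothing` value means steadier but laggier positions.

```python
class PointSmoother:
    """Exponential moving average over successive (x, y, z) points."""

    def __init__(self, smoothing=0.5):
        if not 0.0 <= smoothing < 1.0:
            raise ValueError("smoothing must be in [0, 1)")
        self.smoothing = smoothing
        self._previous = None

    def update(self, point):
        """Blend the new point with the running estimate and return the result."""
        if self._previous is None:
            # The first sample has nothing to blend with.
            self._previous = tuple(point)
        else:
            s = self.smoothing
            self._previous = tuple(
                s * old + (1.0 - s) * new
                for old, new in zip(self._previous, point)
            )
        return self._previous
```

With `smoothing=0.5`, a fingertip that jumps from 0.0 to 1.0 is reported as 0.5 on the first frame and 0.75 on the next, so single-frame spikes are damped.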
Grab mode activates too easily
Grab mode triggers unintentionally.
Solution: Increase palm hold duration:
MediaPipe fails on Python 3.12+
Error: ImportError: DLL load failed or compatibility issues.
Solution:
- Delete the environment:
- Recreate with Python 3.11:
Advanced Configuration
Custom WebSocket Port
Change the WebSocket port in backend.py:
backend.py
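A minimal sketch of serving on a non-default port with the `websockets` library. `WS_HOST`, `WS_PORT`, `ws_url`, and `run_server` are hypothetical names, and the real backend.py may structure this differently. The frontend presumably needs to be pointed at the same URL.

```python
import asyncio

WS_HOST = "localhost"
WS_PORT = 9000  # example value; the documented default is 8765


def ws_url(host=WS_HOST, port=WS_PORT):
    """Return the URL that clients should connect to."""
    return f"ws://{host}:{port}"


async def run_server(handler, host=WS_HOST, port=WS_PORT):
    """Serve `handler` (an async function taking one connection) until cancelled."""
    import websockets  # imported lazily so ws_url works without the package

    async with websockets.serve(handler, host, port):
        await asyncio.Future()  # never completes, so the server stays up
```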
Multiple Camera Support
Cycle through cameras to find the correct index:
test_cameras.py
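A probing script along these lines tries each index in turn and reports the ones that deliver a frame. It is a sketch rather than the original test_cameras.py. The function names and the default upper bound of 4 are assumptions.

```python
def open_with_opencv(index):
    """Return True if OpenCV can open the camera at `index` and read one frame."""
    import cv2  # imported lazily so probe_cameras can be tested without OpenCV

    cap = cv2.VideoCapture(index)
    try:
        if not cap.isOpened():
            return False
        ok, _frame = cap.read()
        return bool(ok)
    finally:
        cap.release()  # always free the device, even on failure


def probe_cameras(max_index=4, can_open=open_with_opencv):
    """Return every camera index from 0 to max_index that can be opened."""
    return [index for index in range(max_index + 1) if can_open(index)]


if __name__ == "__main__":
    print("Working camera indices:", probe_cameras())
```

Close other applications that use the webcam before running it, since a busy camera may fail to open.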
Logging Hand Data
Add logging for debugging:
backend.py
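One possible approach uses the standard `logging` module and logs only every Nth frame, since a line per frame at 60 fps would flood the terminal. The hand dictionaries are assumed to carry the fields from the protocol table. `describe_hand` and `log_hands` are hypothetical helpers.

```python
import logging

logger = logging.getLogger("hand_tracking")


def describe_hand(hand):
    """Format one hand dictionary as a single readable line."""
    return "%s hand: tip=(%.3f, %.3f, %.3f) pinch=%s grabbing=%s" % (
        hand["handedness"],
        hand["x"],
        hand["y"],
        hand["z"],
        hand["pinch"],
        hand["is_grabbing"],
    )


def log_hands(hands, frame_index, every_n=60):
    """Log one line per hand on every Nth frame and return the number of lines logged."""
    if frame_index % every_n != 0:
        return 0
    for hand in hands:
        logger.debug(describe_hand(hand))
    return len(hands)


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    sample = {"handedness": "Right", "x": 0.5, "y": 0.25, "z": -0.1,
              "pinch": False, "is_grabbing": True}
    log_hands([sample], frame_index=0)
```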
Next Steps
Document Uploads
Configure S3 for document storage
Running Locally
Start all services in development mode