Overview
Iris is built as a stateless, high-performance face recognition API using Rust, the Axum web framework, and OpenCV. The architecture is designed for speed, security, and scalability with zero data persistence.
Every request is processed independently in RAM with no database, no file storage, and no logging of user images.
System Components
The application is structured around three core modules:
FaceEngine
Neural network models for face detection and recognition
Request Handler
HTTP endpoints and request processing logic
Stats & Rate Limiting
Performance monitoring and API protection
Application State
The AppState struct (main.rs:28-33) holds shared components:
- engine: Thread-safe reference to the face processing engine
- limiter: IP-based rate limiter (5 req/sec, burst of 10)
- stats: Real-time request metrics
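As a rough illustration of how these three components could be bundled, here is a minimal sketch using only the standard library. The FaceEngine, RateLimiter, and Stats types below are placeholders invented for this example, as are their fields and the record_request helper; the real definitions in main.rs may differ.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

/// Placeholder for the real face engine (detector and recognizer models).
#[derive(Default)]
pub struct FaceEngine {
    /// Hypothetical field so the placeholder carries some mutable state.
    pub frames_processed: u64,
}

/// Placeholder for the per-IP rate limiter.
#[derive(Default)]
pub struct RateLimiter;

/// Placeholder for request metrics (field name is an assumption).
#[derive(Default)]
pub struct Stats {
    pub total_requests: AtomicU64,
}

/// Shared state handed to every request handler. Cloning only bumps
/// reference counts, so all clones point at the same components.
#[derive(Clone, Default)]
pub struct AppState {
    pub engine: Arc<Mutex<FaceEngine>>,
    pub limiter: Arc<RateLimiter>,
    pub stats: Arc<Stats>,
}

impl AppState {
    /// Record one handled request and return the new total.
    pub fn record_request(&self) -> u64 {
        self.stats.total_requests.fetch_add(1, Ordering::Relaxed) + 1
    }
}
```

Because every field sits behind an Arc, a framework can clone the state once per request without copying the models or the counters.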
Data Flow
Request Arrival
Client sends a POST request to /compare with the target face URL and a list of people to match against.
Face Recognition
The SFace recognizer extracts 128-dimensional embeddings and computes cosine similarity (face.rs:32-36).
Visual Flow
Request Processing
The /compare endpoint (main.rs:62-117) follows this logic:
Step 1: Target Image Processing
Step 2: Extract Target Embedding
Step 3: Compare Against All Candidates
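The comparison step comes down to cosine similarity between the target embedding and each candidate embedding. The sketch below shows that calculation on plain slices. It is not the code from main.rs or face.rs: the find_matches helper, its tuple-based candidate list, and the threshold parameter are assumptions made for illustration.

```rust
/// Cosine similarity between two embeddings of equal length.
/// Returns None if the lengths differ, the input is empty,
/// or either vector has zero norm.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Compare `target` against every candidate and keep those whose
/// similarity is at least `threshold` (a hypothetical parameter).
/// Candidates that cannot be scored are skipped.
pub fn find_matches<'a>(
    target: &[f32],
    candidates: &'a [(String, Vec<f32>)],
    threshold: f32,
) -> Vec<(&'a str, f32)> {
    candidates
        .iter()
        .filter_map(|(name, embedding)| {
            let score = cosine_similarity(target, embedding)?;
            if score >= threshold {
                Some((name.as_str(), score))
            } else {
                None
            }
        })
        .collect()
}
```

A real embedding would have 128 components, but the arithmetic is the same for any length.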
Concurrency Model
Iris uses the Tokio async runtime to handle concurrent requests efficiently.
The face engine is wrapped in Arc<Mutex<FaceEngine>> to allow safe shared access across async tasks:
- Arc: Enables multiple ownership across threads
- Mutex: Ensures only one request processes face recognition at a time
- async/await: Allows other tasks to run while waiting for I/O operations
Why Mutex Instead of RwLock?
Face recognition modifies internal state in the OpenCV models, so all operations require mutable access to the detector and recognizer. With no read-only operations that could run in parallel, RwLock would offer no advantage, and we use Mutex instead.
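The sharing pattern described in this section can be sketched with the standard library alone. This example uses OS threads and std::sync::Mutex instead of Tokio tasks to stay self-contained, and the FaceEngine here is a placeholder whose compare method and calls counter are invented for illustration.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Placeholder engine. `compare` takes `&mut self`, mirroring the point
/// that inference mutates internal model state.
pub struct FaceEngine {
    calls: u32,
}

impl FaceEngine {
    pub fn new() -> Self {
        FaceEngine { calls: 0 }
    }

    /// Stand-in for a real comparison; returns the running call count.
    pub fn compare(&mut self) -> u32 {
        self.calls += 1;
        self.calls
    }
}

/// Spawn `workers` threads that each run one comparison on the shared
/// engine, then return how many calls the engine saw in total.
pub fn run_workers(workers: u32) -> u32 {
    let engine = Arc::new(Mutex::new(FaceEngine::new()));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Each worker owns its own Arc handle to the same engine.
            let engine = Arc::clone(&engine);
            thread::spawn(move || {
                // The guard grants exclusive access and unlocks on drop.
                let mut guard = engine.lock().expect("engine mutex poisoned");
                guard.compare();
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker panicked");
    }
    let total = engine.lock().expect("engine mutex poisoned").calls;
    total
}
```

With an RwLock, a read guard would only yield a shared reference, so a call to compare would not compile; every caller would need the write lock, which behaves like a Mutex anyway.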
Performance Optimizations
Zero Allocation
Images are decoded directly into OpenCV Mat without intermediate buffers
ONNX Runtime
Pre-trained models use ONNX format for fast inference
Connection Pooling
Reqwest client reuses HTTP connections when downloading images
Early Returns
Invalid images or failed detections exit immediately
Security Architecture
Rate Limiting
Implemented using the governor crate with per-IP quotas (main.rs:127-130):
- 5 requests/second sustained rate
- Burst of 10 for occasional spikes
- Per IP address tracking
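To make the quota semantics concrete, here is a simplified per-IP token bucket written against the standard library. It is not the governor crate's API or algorithm, only an approximation of the behavior listed above. The IpRateLimiter type and its check method are hypothetical, and time is passed in as seconds so the logic is easy to follow and test.

```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// Token bucket for a single client.
struct Bucket {
    tokens: f64,
    last_seen: f64, // timestamp in seconds
}

/// Simplified per-IP limiter: `rate_per_sec` tokens are refilled each
/// second, and at most `burst` tokens can accumulate.
pub struct IpRateLimiter {
    rate_per_sec: f64,
    burst: f64,
    buckets: HashMap<IpAddr, Bucket>,
}

impl IpRateLimiter {
    pub fn new(rate_per_sec: f64, burst: f64) -> Self {
        IpRateLimiter {
            rate_per_sec,
            burst,
            buckets: HashMap::new(),
        }
    }

    /// Returns true if a request from `ip` at time `now` is allowed.
    /// A false result corresponds to answering 429 TOO_MANY_REQUESTS.
    pub fn check(&mut self, ip: IpAddr, now: f64) -> bool {
        let rate = self.rate_per_sec;
        let burst = self.burst;
        // New clients start with a full bucket.
        let bucket = self.buckets.entry(ip).or_insert(Bucket {
            tokens: burst,
            last_seen: now,
        });
        // Refill according to elapsed time, capped at the burst size.
        let elapsed = (now - bucket.last_seen).max(0.0);
        bucket.tokens = (bucket.tokens + elapsed * rate).min(burst);
        bucket.last_seen = now;
        if bucket.tokens >= 1.0 {
            bucket.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

With IpRateLimiter::new(5.0, 10.0), a client can send 10 requests at once, after which it is held to 5 per second, while other IP addresses are unaffected.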
CORS Configuration
Deployment Architecture
Iris runs as a single binary with all dependencies.
Error Handling Strategy
The API follows a graceful degradation pattern:
- Invalid target image: Returns empty matches array
- Invalid candidate image: Skips that candidate, continues processing
- No face detected: Treats as non-match, continues
- Rate limit exceeded: Returns 429 TOO_MANY_REQUESTS
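The first three rules above can be sketched as a single function over already-attempted embeddings. Everything here is illustrative rather than taken from main.rs: the ImageError enum, the CompareResponse shape, and the is_match callback are assumptions. Rate limiting is assumed to happen before this function is reached.

```rust
/// Why a single image could not be turned into an embedding.
#[derive(Debug, Clone, PartialEq)]
pub enum ImageError {
    InvalidImage,
    NoFaceDetected,
}

/// Hypothetical response shape: names of the candidates that matched.
#[derive(Debug, Default, PartialEq)]
pub struct CompareResponse {
    pub matches: Vec<String>,
}

/// Apply the degradation rules: a bad target yields an empty response,
/// and a bad candidate is skipped without failing the whole request.
pub fn compare_all(
    target: Result<Vec<f32>, ImageError>,
    candidates: Vec<(String, Result<Vec<f32>, ImageError>)>,
    is_match: impl Fn(&[f32], &[f32]) -> bool,
) -> CompareResponse {
    // Invalid target image: return an empty matches array right away.
    let target = match target {
        Ok(embedding) => embedding,
        Err(_) => return CompareResponse::default(),
    };
    let mut matches = Vec::new();
    for (name, embedding) in candidates {
        match embedding {
            Ok(e) if is_match(&target, &e) => matches.push(name),
            // Valid face, but not similar enough.
            Ok(_) => {}
            // Invalid image or no face detected: treat as a non-match
            // and keep processing the remaining candidates.
            Err(_) => continue,
        }
    }
    CompareResponse { matches }
}
```

The early return for a bad target also reflects the "Early Returns" optimization: no candidate images need to be processed once the target is known to be unusable.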