Overview

Iris is built as a stateless, high-performance face recognition API using Rust, the Axum web framework, and OpenCV. The architecture is designed for speed, security, and scalability, with zero data persistence.
Every request is processed independently in RAM: no database, no file storage, and no logging of user images.

System Components

The application is structured around three core modules:

  • FaceEngine: neural network models for face detection and recognition
  • Request Handler: HTTP endpoints and request-processing logic
  • Stats & Rate Limiting: performance monitoring and API protection

Application State

The AppState struct (main.rs:28-33) holds shared components:
main.rs
#[derive(Clone)]
struct AppState {
    engine: Arc<Mutex<FaceEngine>>,
    limiter: SharedRateLimiter,
    stats: RequestStats,
}
  • engine: Thread-safe reference to the face processing engine
  • limiter: IP-based rate limiter (5 req/sec, burst of 10)
  • stats: Real-time request metrics

Data Flow

1. Request Arrival: the client sends a POST request to /compare with a target face URL and a list of people to match against.
2. Rate Limiting: middleware checks the IP-based quota before processing (main.rs:35-45).
3. Image Download: both data URIs and HTTP URLs are decoded into OpenCV Mat objects (main.rs:47-60).
4. Face Detection: the YuNet detector locates faces in both the target and candidate images (face.rs:26-28).
5. Face Recognition: the SFace recognizer extracts 128-dimensional embeddings and computes cosine similarity (face.rs:32-36).
6. Response: matches above the 0.363 threshold are returned, sorted by probability (main.rs:104-116).
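
Based on the fields the handler reads (payload.target_url, payload.people, person.name, person.image_url, and the matches array), the request and response bodies have roughly this shape. The struct definitions below are a sketch; the real ones live in models.rs and derive serde's Serialize/Deserialize for JSON:

```rust
// Illustrative shapes only; the real definitions are in models.rs.
struct Person { name: String, image_url: String }
struct CompareRequest { target_url: String, people: Vec<Person> }

struct MatchResult { name: String, probability: f32 }
struct CompareResponse { matches: Vec<MatchResult> }

// Build a sample request like the one /compare expects (hypothetical URLs).
fn build_example() -> CompareRequest {
    CompareRequest {
        target_url: "https://example.com/target.jpg".into(),
        people: vec![Person {
            name: "Alice".into(),
            image_url: "https://example.com/alice.jpg".into(),
        }],
    }
}

fn main() { println!("{} candidate(s)", build_example().people.len()); }
```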

Request Processing

The /compare endpoint (main.rs:62-117) follows this logic:
main.rs
let target_img = match download_and_decode(&payload.target_url).await {
    Ok(img) => img,
    Err(_) => return Json(CompareResponse { matches: vec![] }),
};
If the target URL is invalid or the image is corrupted, the endpoint returns empty matches immediately.
main.rs
let mut guard = state.engine.lock().await;
// Reborrow the detector and recognizer as independent mutable references
// so both can be used while the single engine lock is held.
let (det, rec) = unsafe {
    (
        &mut *(guard.detector.as_raw_mut() as *mut objdetect::FaceDetectorYN),
        &mut *(guard.recognizer.as_raw_mut() as *mut objdetect::FaceRecognizerSF)
    )
};

let mut target_embedding = None;
if let Ok(Some(emb)) = get_embedding(&target_img, det, rec) {
    target_embedding = Some(emb);
}
Lock the FaceEngine and extract the 128-dimensional feature vector from the target face.
main.rs
for person in payload.people {
    if let Ok(p_img) = download_and_decode(&person.image_url).await {
        // ... extract embedding and compare
        if let Ok(score) = rec.match_(&t_emb, &p_emb, objdetect::FaceRecognizerSF_DisType::FR_COSINE as i32) {
            if score > 0.363 {
                results.push(MatchResult {
                    name: person.name,
                    probability: (score.max(0.0) * 100.0).round(),
                });
            }
        }
    }
}
Iterate through all candidates, compute similarity scores, and collect matches above threshold.

Concurrency Model

Iris uses the Tokio async runtime to handle concurrent requests efficiently.
The FaceEngine is protected by an Arc<Mutex<FaceEngine>> to allow safe shared access across async tasks:
  • Arc: Enables multiple ownership across threads
  • Mutex: Ensures only one request processes face recognition at a time
  • async/await: Allows other tasks to run while waiting for I/O operations
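
The serialization this buys can be sketched with std threads and a plain counter standing in for the engine (the real code uses tokio::sync::Mutex with .lock().await, so waiting tasks yield instead of blocking a thread):

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Each "request" must take the lock before touching the engine, so
// engine access is serialized even though requests run concurrently.
fn run_requests(n: usize) -> u64 {
    let engine = Arc::new(Mutex::new(0u64)); // stand-in for FaceEngine
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let engine = Arc::clone(&engine);
            thread::spawn(move || { *engine.lock().unwrap() += 1; })
        })
        .collect();
    for h in handles { h.join().unwrap(); }
    let total = *engine.lock().unwrap();
    total
}

fn main() { println!("{}", run_requests(8)); } // prints 8
```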

Why Mutex Instead of RwLock?

Since face recognition modifies internal state in OpenCV models, we use Mutex rather than RwLock. All operations require mutable access to the detector and recognizer.

Performance Optimizations

  • Zero Allocation: images are decoded directly into OpenCV Mat objects without intermediate buffers
  • ONNX Runtime: pre-trained models use the ONNX format for fast inference
  • Connection Pooling: the reqwest client reuses HTTP connections when downloading images
  • Early Returns: invalid images or failed detections exit the pipeline immediately

Security Architecture

Rate Limiting

Implemented using the governor crate with per-IP quotas (main.rs:127-130):
main.rs
let quota = Quota::per_second(NonZeroU32::new(5).unwrap())
    .allow_burst(NonZeroU32::new(10).unwrap());
let limiter: SharedRateLimiter = Arc::new(RateLimiter::keyed(quota));
  • 5 requests/second sustained rate
  • Burst of 10 for occasional spikes
  • Per-IP address tracking
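
The quota semantics can be illustrated with a minimal token bucket (the real limiter is governor's GCRA-based RateLimiter keyed per client IP; this sketch only demonstrates the 5 req/s plus burst-of-10 numbers):

```rust
// Token-bucket sketch: capacity 10 (burst), refill 5 tokens/sec (sustained).
struct Bucket { tokens: f64, capacity: f64, refill_per_sec: f64 }

impl Bucket {
    fn new() -> Self {
        Bucket { tokens: 10.0, capacity: 10.0, refill_per_sec: 5.0 }
    }

    // `elapsed` is seconds since the previous call, passed in for determinism.
    fn allow(&mut self, elapsed: f64) -> bool {
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        if self.tokens >= 1.0 { self.tokens -= 1.0; true } else { false }
    }
}

fn main() {
    let mut b = Bucket::new();
    // A burst of 10 back-to-back requests passes; the 11th is rejected
    // until tokens refill at the sustained rate.
    let burst_ok = (0..10).all(|_| b.allow(0.0));
    let eleventh = b.allow(0.0);
    println!("{} {}", burst_ok, eleventh); // prints "true false"
}
```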

CORS Configuration

main.rs
let cors = CorsLayer::new()
    .allow_origin(Any)
    .allow_methods([Method::POST, Method::GET])
    .allow_headers([header::CONTENT_TYPE]);
The API allows requests from any origin. In production, restrict .allow_origin() to specific domains.

Deployment Architecture

Iris runs as a single self-contained binary:
main.rs
let port = 8080;
let listener = tokio::net::TcpListener::bind(format!("0.0.0.0:{}", port)).await?;
axum::serve(listener, app.into_make_service_with_connect_info::<SocketAddr>()).await?;
Deploy behind a reverse proxy (nginx, Caddy) for TLS termination and additional security layers.

Error Handling Strategy

The API follows a graceful degradation pattern:
  1. Invalid target image: Returns empty matches array
  2. Invalid candidate image: Skips that candidate, continues processing
  3. No face detected: Treats as non-match, continues
  4. Rate limit exceeded: Returns 429 TOO_MANY_REQUESTS
This ensures partial failures don’t crash the entire request.
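
The first three rules can be sketched with stub types in place of image downloads and embeddings (the Option fields below simulate a failed download; the real code calls download_and_decode and get_embedding):

```rust
// Stub candidate: `image: None` simulates a failed download or decode.
struct Candidate { name: String, image: Option<&'static str> }

fn compare_all(target: Option<&'static str>, people: Vec<Candidate>) -> Vec<String> {
    // Rule 1: invalid target image -> empty matches, no error.
    let Some(_target) = target else { return vec![]; };
    let mut matches = Vec::new();
    for p in people {
        // Rules 2 and 3: invalid candidate or no face -> skip, keep going.
        let Some(_img) = p.image else { continue; };
        matches.push(p.name); // stand-in for detect + embed + score
    }
    matches
}

fn main() {
    let people = vec![
        Candidate { name: "Alice".into(), image: Some("alice.jpg") },
        Candidate { name: "Bob".into(), image: None }, // skipped, not fatal
    ];
    println!("{:?}", compare_all(Some("target.jpg"), people)); // prints ["Alice"]
}
```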

Module Structure

src/
├── main.rs        # HTTP server, routing, middleware
├── face.rs        # Face detection and recognition logic
├── models.rs      # Request/response data structures
└── stats.rs       # Request statistics tracking
Each module has a single, well-defined responsibility following separation of concerns principles.