Facial analysis tasks
| Task | Description | Output |
|---|---|---|
| Detection | Locate faces in an image | Bounding boxes |
| Alignment | Normalize face crop to canonical pose | Aligned face image |
| Recognition | Match a face to a known identity | Identity or embedding |
| Verification | Decide if two faces are the same person | Yes / No + confidence |
| Attribute analysis | Estimate age, gender, emotion | Labels or continuous values |
Face recognition pipeline
A robust face recognition system follows four stages:
Detection
Detect all faces in the input image. Common detectors: MTCNN, RetinaFace, BlazeFace. Each returns a bounding box and facial landmark coordinates.
Alignment
Use the five-point landmarks (two eyes, nose, two mouth corners) to estimate a similarity transform (rotation, uniform scale, and translation), then warp the face into a standardized 112×112 crop. Alignment is critical: misaligned faces significantly hurt recognition accuracy.
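
As a rough illustration of the similarity transform, the sketch below fits rotation, uniform scale, and translation from five detected landmarks to a reference template in plain Python. The template coordinates are the values commonly used in ArcFace-style 112×112 pipelines and are an assumption here. The function names `fit_similarity` and `apply_similarity` and the example landmark positions are hypothetical.

```python
# Reference five-point template for a 112x112 crop (assumed values, commonly
# used in ArcFace-style pipelines). Order: left eye, right eye, nose,
# left mouth corner, right mouth corner.
REFERENCE_112 = [
    (38.2946, 51.6963),
    (73.5318, 51.5014),
    (56.0252, 71.7366),
    (41.5493, 92.3655),
    (70.7299, 92.2041),
]


def fit_similarity(src, dst):
    """Least-squares similarity transform mapping src points onto dst points.

    Points are treated as complex numbers, so the transform is z -> a * z + b,
    where a encodes rotation and uniform scale and b encodes translation.
    """
    zs = [complex(x, y) for x, y in src]
    ws = [complex(x, y) for x, y in dst]
    n = len(zs)
    z_mean = sum(zs) / n
    w_mean = sum(ws) / n
    den = sum(abs(z - z_mean) ** 2 for z in zs)
    if den == 0:
        raise ValueError("source landmarks must not all coincide")
    num = sum((w - w_mean) * (z - z_mean).conjugate() for z, w in zip(zs, ws))
    a = num / den
    b = w_mean - a * z_mean
    return a, b


def apply_similarity(a, b, point):
    """Map a single (x, y) point through the fitted transform."""
    w = a * complex(*point) + b
    return (w.real, w.imag)


# Hypothetical landmarks from a detector, in original image coordinates.
detected = [(120.0, 140.0), (190.0, 150.0), (150.0, 185.0),
            (125.0, 225.0), (180.0, 232.0)]
a, b = fit_similarity(detected, REFERENCE_112)
aligned_landmarks = [apply_similarity(a, b, p) for p in detected]
print("scale:", abs(a))
print("aligned landmarks:", aligned_landmarks)
```

The fitted pair `(a, b)` corresponds to the 2×3 affine matrix `[[a.real, -a.imag, b.real], [a.imag, a.real, b.imag]]`. An image library would then use this matrix to warp the actual pixels, for example with OpenCV's `cv2.warpAffine`.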
Embedding
Pass the aligned crop through a deep network (typically a ResNet-50 or ResNet-100 backbone) to produce a compact feature vector, typically 512-dimensional. Faces of the same identity produce embeddings that lie close together under a distance metric such as cosine or Euclidean distance.
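
The comparison that typically follows the embedding stage can be sketched as a nearest-neighbor search under cosine similarity. This is a minimal sketch with toy 3-dimensional vectors standing in for real 512-dimensional embeddings. The `identify` helper, the gallery contents, and the threshold of 0.4 are hypothetical, and in practice the threshold is tuned on validation data.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_similarity(u, v):
    """Cosine of the angle between two embeddings (1.0 means same direction)."""
    u, v = l2_normalize(u), l2_normalize(v)
    return sum(a * b for a, b in zip(u, v))


def identify(probe, gallery, threshold=0.4):
    """Return (name, score) of the best gallery match, or (None, score)
    when even the best match falls below the threshold."""
    best_name, best_score = None, -1.0
    for name, embedding in gallery.items():
        score = cosine_similarity(probe, embedding)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# Toy gallery: one enrolled embedding per identity (illustrative values only).
gallery = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
print(identify([0.9, 0.1, 0.0], gallery))  # close to alice's embedding
print(identify([0.0, 0.0, 1.0], gallery))  # unlike anyone in the gallery
```

Verification (deciding whether two faces belong to the same person) uses the same similarity score, compared directly against a threshold instead of searching a gallery.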
Deep face recognition: ArcFace and AdaFace
Classical softmax training does not enforce tight clustering of same-identity embeddings. Margin-based loss functions add a margin that moves each class's decision boundary closer to its class center, which forces same-identity embeddings into tighter clusters.
ArcFace loss
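ArcFace adds an angular margin $m$ to the target class angle:
$$
L = -\log \frac{e^{s \cos(\theta_{y} + m)}}{e^{s \cos(\theta_{y} + m)} + \sum_{j \neq y} e^{s \cos \theta_{j}}}
$$
where $s$ is a scaling factor, $\theta_{y}$ is the angle between the embedding and the target class weight vector, and $\theta_{j}$ is the angle between the embedding and the weight vector of class $j$.
AdaFace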
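
The fixed-margin ArcFace loss described above is the baseline that an adaptive margin modifies, so it may help to see it computed for one sample first. The sketch below is a minimal pure-Python version under stated assumptions. It takes the cosines between a normalized embedding and each normalized class weight vector as given. The defaults `s=64.0` and `m=0.5` follow values commonly reported for ArcFace. It omits the special handling real implementations apply when the target angle plus the margin exceeds π.

```python
import math


def arcface_loss(cosines, target, s=64.0, m=0.5):
    """ArcFace loss for one sample.

    cosines: cos(theta_j) between the embedding and each class weight vector.
    target:  index of the ground-truth class.
    s, m:    scaling factor and additive angular margin (assumed defaults).
    """
    logits = []
    for j, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))  # guard acos against rounding error
        if j == target:
            theta = math.acos(c)
            logits.append(s * math.cos(theta + m))  # margin on target only
        else:
            logits.append(s * c)
    # Cross-entropy via a numerically stable log-sum-exp.
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_sum_exp - logits[target]


# Illustrative cosines for a 3-class problem; class 0 is the true identity.
cosines = [0.8, 0.1, -0.2]
print("softmax (m=0):  ", arcface_loss(cosines, target=0, s=10.0, m=0.0))
print("ArcFace (m=0.5):", arcface_loss(cosines, target=0, s=10.0, m=0.5))
```

With the margin enabled, the same embedding incurs a larger loss, so training must pull it closer to its class center to reach the same loss value.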
AdaFace introduces an adaptive margin that scales with image quality. Low-quality images (blurry, occluded) receive a smaller margin, preventing the loss from forcing incorrect gradients on hard-to-recognize samples.
Social attribute analysis
Beyond identity, we can estimate social attributes from face crops:
Age estimation
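Age is typically framed as a regression or ordinal regression problem. The model predicts a continuous age value $\hat{a}$ from the face embedding $\mathbf{e}$, for example with a regression head $\hat{a} = f(\mathbf{e})$. Mean absolute error (MAE), the average of $|\hat{a}_i - a_i|$ over the test set, is the standard evaluation metric.
Gender classification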
Gender is typically framed as binary classification (male/female) on top of face embeddings. Modern models achieve over 95% accuracy on benchmark datasets, though they can exhibit demographic bias, which is an important ethical consideration.
Emotion recognition
Seven basic emotion categories (angry, disgust, fear, happy, neutral, sad, surprised) are predicted from face crops, often using lightweight CNNs fine-tuned on the AffectNet or RAF-DB datasets.

Social attribute models carry significant ethical risks, including demographic bias and potential misuse. When applying these methods, consider fairness, consent, and the limitations of automated attribute inference.
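
Two of the evaluation ideas in this section can be sketched in a few lines: MAE for age estimation, and a per-group accuracy breakdown as one simple way to surface demographic bias in a classifier. The function names, the group labels, and all data values below are hypothetical and purely illustrative. A real fairness audit would involve considerably more than this single check.

```python
from collections import defaultdict


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and true ages."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def accuracy_by_group(predicted, actual, groups):
    """Classification accuracy computed separately for each group label.

    A large gap between groups suggests the model performs unevenly
    across demographics.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for p, a, g in zip(predicted, actual, groups):
        total[g] += 1
        correct[g] += int(p == a)
    return {g: correct[g] / total[g] for g in total}


# Toy age predictions versus ground truth (illustrative values only).
print("MAE:", mean_absolute_error([25, 40, 31], [22, 45, 31]))

# Toy classifier outputs with a hypothetical group label per sample.
pred = ["m", "f", "f", "m"]
true = ["m", "f", "m", "m"]
group = ["A", "A", "B", "B"]
print("accuracy by group:", accuracy_by_group(pred, true, group))
```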
Resources
AdaFace Basic Example
Step-by-step Colab notebook for face recognition using AdaFace.
Exercise E07: Facial Analysis
Hands-on exercise covering face detection, alignment, and recognition.
VisionColab: Facial Analysis
Collection of facial analysis examples from the course repository.
Video: Facial Analysis Lecture (2021)
Recorded lecture on facial analysis techniques and applications.
