Overview

NFIQ2 uses a random forest classifier to combine native quality measures into unified quality scores. The random forest is a machine learning model trained on thousands of fingerprint images with known recognition performance outcomes.
The random forest model is the core of NFIQ2’s ability to predict recognition performance. It learns complex relationships between quality measures and match accuracy that would be difficult to encode manually.

Random Forest Fundamentals

What is a Random Forest?

A random forest is an ensemble learning method that:
  1. Trains multiple decision trees on different subsets of training data
  2. Combines tree predictions through averaging or voting
  3. Reduces overfitting by using randomized tree construction
  4. Handles high-dimensional features effectively
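The averaging step in point 2 can be sketched with a toy forest. This is a minimal illustration, not the OpenCV RTrees implementation: the Stump type (a tree with a single split) and forestPredict are hypothetical names introduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical one-split regression tree ("stump"); not an NFIQ2 or OpenCV type.
struct Stump {
    std::size_t featureIndex; // which feature the split tests
    double threshold;         // split point
    double leftValue;         // prediction when feature <= threshold
    double rightValue;        // prediction when feature > threshold

    double predict(const std::vector<double> &x) const {
        assert(featureIndex < x.size());
        return x[featureIndex] <= threshold ? leftValue : rightValue;
    }
};

// Ensemble prediction: the mean of the individual tree predictions.
double forestPredict(const std::vector<Stump> &forest,
                     const std::vector<double> &x) {
    assert(!forest.empty());
    double sum = 0.0;
    for (const Stump &tree : forest) {
        sum += tree.predict(x);
    }
    return sum / static_cast<double>(forest.size());
}
```

Because each tree sees the input independently, a single tree that splits badly on one feature is diluted by the others, which is the intuition behind the reduced overfitting in point 3.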

Why Random Forest for Quality Assessment?

Random forests are well suited to NFIQ2 because they:
  • Handle non-linear relationships between quality measures
  • Account for feature interactions automatically
  • Provide robust predictions even with correlated features
  • Train efficiently on large datasets
  • Generate interpretable feature importance rankings
Compared with neural networks, random forests typically need less hyperparameter tuning and often perform well with default settings.

Model Architecture

Core Components

The NFIQ2 random forest implementation consists of:
namespace NFIQ2::Prediction {
    /**
     * Random Forest Machine Learning model for generating
     * unified quality scores.
     */
    class RandomForestML {
    public:
        RandomForestML();
        
        // Initialize from external parameters file
        std::string initModule(
            const std::string &fileName,
            const std::string &fileHash
        );
        
        // Evaluate quality score from features
        void evaluate(
            const std::unordered_map<std::string, double> &features,
            double &qualityValue
        ) const;
        
    private:
        // OpenCV RTrees model
        cv::Ptr<cv::ml::RTrees> m_pTrainedRF;
    };
}

OpenCV RTrees Integration

NFIQ2 uses OpenCV’s RTrees (Random Trees) implementation:
// Internal model representation
cv::Ptr<cv::ml::RTrees> m_pTrainedRF;
Benefits:
  • Mature, well-tested implementation
  • Fast prediction performance
  • Cross-platform compatibility
  • Serialization support

Model Parameters

Parameter File Format

NFIQ2 random forest parameters are stored in a YAML file. Each model is also described by a model info file made of "Key = Value" lines. The model info for the default model is:
Name = Plain TIR + Ink
Trainer = National Institute of Standards and Technology
Description = Trained on plain optical bright-field total internal reflection and scanned ink plain impression fingerprints, as described in the NFIQ 2 report. This model can only be used with NFIQ 2 v2.3.
Version = 2.0.0
Path = nist_plain_tir-ink.yaml
Hash = b4a1e7586b3be906f9770e4b77768038
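To make the "Key = Value" layout above concrete, here is a minimal parsing sketch. It is not how NFIQ2::ModelInfo is implemented; parseModelInfo and trim are hypothetical helpers, and the sketch assumes one field per line with the first "=" acting as the separator.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>

// Strip leading and trailing spaces, tabs, and carriage returns.
static std::string trim(const std::string &s) {
    const char *ws = " \t\r";
    const auto first = s.find_first_not_of(ws);
    if (first == std::string::npos) {
        return "";
    }
    const auto last = s.find_last_not_of(ws);
    assert(last >= first);
    return s.substr(first, last - first + 1);
}

// Hypothetical helper (not part of NFIQ2): read "Key = Value" lines into a map.
std::unordered_map<std::string, std::string> parseModelInfo(std::istream &in) {
    std::unordered_map<std::string, std::string> fields;
    std::string line;
    while (std::getline(in, line)) {
        const auto eq = line.find('=');
        if (eq == std::string::npos) {
            continue; // skip lines without a separator
        }
        fields[trim(line.substr(0, eq))] = trim(line.substr(eq + 1));
    }
    return fields;
}
```

In application code, prefer the ModelInfo accessors described below; the sketch only shows what the file contains.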

Loading Model Parameters

#include <iostream>
#include <string>

#include <nfiq2.hpp>

// Load model from file with hash verification
std::string modelPath = "/path/to/nist_plain_tir-ink.yaml";
std::string modelHash = "b4a1e7586b3be906f9770e4b77768038";

NFIQ2::Algorithm algorithm(modelPath, modelHash);

// Verify model is loaded
if (algorithm.isInitialized()) {
    std::cout << "Model loaded successfully" << std::endl;
    std::cout << "Hash: " << algorithm.getParameterHash() << std::endl;
}

Model Information API

namespace NFIQ2 {
    /** Information about a random forest parameter model. */
    class ModelInfo {
    public:
        ModelInfo(const std::string &modelInfoFilePath);
        
        std::string getModelName() const;
        std::string getModelTrainer() const;
        std::string getModelDescription() const;
        std::string getModelVersion() const;
        std::string getModelPath() const;
        std::string getModelHash() const;
        
        // Model info file keys
        static const char ModelInfoKeyName[];
        static const char ModelInfoKeyTrainer[];
        static const char ModelInfoKeyDescription[];
        static const char ModelInfoKeyVersion[];
        static const char ModelInfoKeyPath[];
        static const char ModelInfoKeyHash[];
    };
}

Training Data

Dataset Characteristics

The NIST Plain TIR + Ink model is trained on:
Plain Impression Fingerprints:
  • Optical bright-field total internal reflection (TIR) captures
  • Scanned ink impressions
  • Both live-scan and offline (scanned) captures
Not Included:
  • Rolled impressions
  • Contactless captures
  • Mobile sensor captures
  • Latent prints
Image Characteristics:
  • Resolution: 500 PPI
  • Bit depth: 8-bit grayscale
  • Format: Decompressed raw pixel data
  • Encoding: ISO/IEC 39794-4:2019 canonical format
Training set includes:
  • High-quality captures (excellent ridge detail)
  • Medium-quality captures (typical operational quality)
  • Low-quality captures (marginal but usable)
  • Failed captures (for quality threshold calibration)
The set is balanced to reflect operational distributions.
Quality labels derived from:
  • Genuine match scores (same finger comparisons)
  • Impostor match scores (different finger comparisons)
  • Recognition performance metrics (FMR, FNMR)
  • Multi-system matching results
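The error rates named above follow the usual biometric definitions, sketched below. This is not NIST's labeling procedure; the sketch assumes similarity scores where higher means a stronger match, and the function names fnmr and fmr are chosen here for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// False non-match rate: fraction of genuine (same-finger) comparisons
// whose score falls below the decision threshold.
double fnmr(const std::vector<double> &genuineScores, double threshold) {
    assert(!genuineScores.empty());
    const auto misses = std::count_if(
        genuineScores.begin(), genuineScores.end(),
        [threshold](double s) { return s < threshold; });
    return static_cast<double>(misses) /
           static_cast<double>(genuineScores.size());
}

// False match rate: fraction of impostor (different-finger) comparisons
// whose score reaches or exceeds the decision threshold.
double fmr(const std::vector<double> &impostorScores, double threshold) {
    assert(!impostorScores.empty());
    const auto hits = std::count_if(
        impostorScores.begin(), impostorScores.end(),
        [threshold](double s) { return s >= threshold; });
    return static_cast<double>(hits) /
           static_cast<double>(impostorScores.size());
}
```

A quality label built from such rates ties an image to how it actually behaved in matching, which is what lets the forest predict recognition performance rather than visual appearance.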

Training Methodology

The model training process:
  1. Feature Extraction: Compute all native quality measures for training images
  2. Ground Truth Assignment: Link images to recognition performance outcomes
  3. Forest Training: Train random forest to predict performance from measures
  4. Validation: Test on held-out data to prevent overfitting
  5. Calibration: Map predictions to 0-100 quality score scale
The current model (v2.3) uses updated training methodology and expanded training data compared to earlier versions, resulting in improved prediction accuracy.

Friction Ridge Capture Technology (FCT) Codes

What are FCT Codes?

Friction Ridge Capture Technology (FCT) codes specify the sensor type used for fingerprint capture, as defined in ANSI/NIST-ITL 1-2011: Update 2015.

Common FCT Codes

FCT Code  Technology            Description
0         Unspecified           Default/unknown capture method
2         Optical TIR (bright)  Total internal reflection, bright field
3         Optical direct view   Direct optical imaging
8         Thermal               Heat-sensing
9         Capacitive            Electric field sensing
14        Electro-luminescent   Light-emitting polymer
The NIST Plain TIR + Ink model is primarily trained on FCT 0 (unspecified, includes ink) and FCT 2 (optical TIR) captures.

FCT in NFIQ2

namespace NFIQ2 {
    class Algorithm {
    public:
        /**
         * Obtain the friction ridge capture technology (FCT) specified
         * for the embedded random forest parameters.
         *
         * @return Embedded FCT specified.
         * @throw NFIQ2::Exception Parameters were not embedded or FCT was not specified.
         */
        unsigned int getEmbeddedFCT() const;
    };
}

Using FCT Information

// Check embedded FCT (the default constructor relies on embedded parameters)
NFIQ2::Algorithm algorithm;

if (algorithm.isEmbedded()) {
    try {
        unsigned int fct = algorithm.getEmbeddedFCT();
        
        switch (fct) {
            case 0:
                std::cout << "Model: Unspecified/Ink" << std::endl;
                break;
            case 2:
                std::cout << "Model: Optical TIR (bright field)" << std::endl;
                break;
            default:
                std::cout << "Model FCT: " << fct << std::endl;
        }
    } catch (const NFIQ2::Exception& e) {
        // Thrown when parameters were not embedded or no FCT was specified
        std::cout << "FCT not available: " << e.what() << std::endl;
    }
}
Using a model with the wrong FCT code may result in less accurate quality predictions. Always use a model trained on your sensor type when possible.
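One way to act on this warning is to compare the model's FCT with the FCT recorded for a capture before trusting a score. The sketch below is an assumption-laden illustration: FctMatch, compareFct, and describe are hypothetical names, not NFIQ2 APIs, and code 0 is treated as "unspecified" as in the table above.

```cpp
#include <cassert>

// Hypothetical result type; not part of NFIQ2.
enum class FctMatch { Match, Unspecified, Mismatch };

// Compare the FCT a model was built for with the FCT of a capture.
// Assumption: 0 means "unspecified", so compatibility cannot be decided.
FctMatch compareFct(unsigned int modelFct, unsigned int captureFct) {
    if (modelFct == 0 || captureFct == 0) {
        return FctMatch::Unspecified;
    }
    return modelFct == captureFct ? FctMatch::Match : FctMatch::Mismatch;
}

// Human-readable explanation suitable for a log line.
const char *describe(FctMatch m) {
    switch (m) {
    case FctMatch::Match:
        return "model and capture FCT agree";
    case FctMatch::Unspecified:
        return "FCT unspecified; compatibility unknown";
    case FctMatch::Mismatch:
        return "FCT mismatch; predictions may be less accurate";
    }
    assert(false && "unhandled FctMatch value");
    return "";
}
```

The model side of the comparison could come from getEmbeddedFCT(); where the capture's FCT comes from depends on your acquisition pipeline.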

Embedding Model Parameters

Why Embed Parameters?

Embedding random forest parameters in the library offers several advantages:
  • Simplified deployment: No external parameter files to distribute
  • Reduced I/O: Faster initialization (no file loading)
  • Security: Parameters cannot be swapped out by replacing a file on disk
  • Reliability: Eliminates missing file errors

Build-Time Embedding

Parameters are embedded during compilation:
# CMakeLists.txt configuration
option(EMBED_RANDOM_FOREST_PARAMETERS "Embed random forest parameters in library" OFF)

set(EMBEDDED_RANDOM_FOREST_PARAMETER_FCT "0" CACHE STRING
    "ANSI/NIST-ITL 1-2011: Update 2015 friction ridge capture technology (FRCT) code for parameters to embed")

if(EMBED_RANDOM_FOREST_PARAMETERS)
    message(STATUS "Embedding random forest parameters")
    add_definitions(-DNFIQ2_EMBED_RANDOM_FOREST_PARAMETERS)
    add_definitions(-DEMBEDDED_RANDOM_FOREST_PARAMETER_FCT=${EMBEDDED_RANDOM_FOREST_PARAMETER_FCT})
endif()

Building with Embedded Parameters

# Configure with embedded parameters (run from the source root)
cmake -S . -B build \
  -DEMBED_RANDOM_FOREST_PARAMETERS=ON \
  -DEMBEDDED_RANDOM_FOREST_PARAMETER_FCT=2

# Build
cmake --build build

Embedded vs. External Parameters

Advantages of embedded parameters:
  • No runtime file I/O
  • Faster initialization
  • Simpler deployment
  • Harder to tamper with than a separate file
Disadvantages:
  • Increases library size
  • Requires recompilation to change models
  • Single model per build
  • Limited flexibility
Best For:
  • Production deployments
  • Embedded systems
  • Containerized applications
  • Security-sensitive environments

Model Evaluation

Computing Quality Scores

The random forest evaluates features to produce quality scores:
namespace NFIQ2::Prediction {
    class RandomForestML {
    public:
        /**
         * Compute NFIQ2 quality score based on model and provided features.
         *
         * @param features Map of quality measure identifiers to values
         * @param qualityValue Output quality score
         */
        void evaluate(
            const std::unordered_map<std::string, double> &features,
            double &qualityValue
        ) const;
    };
}

Internal Prediction Flow

  1. Feature Vector Construction: Maps quality measure names to model input indices
  2. Tree Evaluation: Each decision tree produces a prediction
  3. Ensemble Aggregation: Tree predictions are averaged
  4. Score Normalization: Raw prediction mapped to [0, 100] scale
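Steps 1 and 4 of this flow can be sketched as follows. This is an illustration under stated assumptions, not the NFIQ2 source: toFeatureVector and toScore are hypothetical helpers, the column order is supplied by the caller, and rounding followed by clamping is only one plausible way to map a raw prediction onto [0, 100].

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Step 1: place named features into the fixed column order a model expects.
// `order` is a hypothetical list of names; the real order belongs to the model.
std::vector<double> toFeatureVector(
    const std::unordered_map<std::string, double> &features,
    const std::vector<std::string> &order) {
    std::vector<double> row;
    row.reserve(order.size());
    for (const std::string &name : order) {
        const auto it = features.find(name);
        if (it == features.end()) {
            throw std::invalid_argument("missing feature: " + name);
        }
        row.push_back(it->second);
    }
    assert(row.size() == order.size());
    return row;
}

// Step 4 (assumed form): round to the nearest integer, then clamp to [0, 100].
unsigned int toScore(double rawPrediction) {
    const double rounded = std::round(rawPrediction);
    const double clamped = std::min(100.0, std::max(0.0, rounded));
    return static_cast<unsigned int>(clamped);
}
```

Failing loudly on a missing feature matters because an unordered map carries no positional information: a silently skipped entry would shift every later column and feed the trees the wrong values.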

Algorithm Integration

The NFIQ2::Algorithm class wraps the random forest:
namespace NFIQ2 {
    class Algorithm {
    public:
        // Compute quality score from image
        unsigned int computeUnifiedQualityScore(
            const NFIQ2::FingerprintImageData &rawImage
        ) const;
        
        // Compute quality score from pre-computed algorithms
        unsigned int computeUnifiedQualityScore(
            const std::vector<std::shared_ptr<QualityMeasures::Algorithm>> &algorithms
        ) const;
        
        // Compute quality score from feature map
        unsigned int computeUnifiedQualityScore(
            const std::unordered_map<std::string, double> &features
        ) const;
    };
}

Model Versioning

Version History

NFIQ2 has released several model versions:
namespace NFIQ2::Identifiers {
    namespace UnifiedQualityScores {
        extern const char NFIQ2Rev0[];  // v2.0 - Initial release
        extern const char NFIQ2Rev1[];  // v2.1 - Refined training
        extern const char NFIQ2Rev2[];  // v2.2 - Expanded dataset
        extern const char NFIQ2Rev3[];  // v2.3 - Current version
    }
    
    namespace CBEFF {
        extern const unsigned int NFIQ2Rev0;  // CBEFF ID for v2.0
        extern const unsigned int NFIQ2Rev1;  // CBEFF ID for v2.1
        extern const unsigned int NFIQ2Rev2;  // CBEFF ID for v2.2
        extern const unsigned int NFIQ2Rev3;  // CBEFF ID for v2.3
    }
}

Version Compatibility

Important: Quality scores from different NFIQ2 versions are not directly comparable. Always document which version you’re using.

Checking Model Version

// Get model version from ModelInfo
NFIQ2::ModelInfo modelInfo("/path/to/model_info.txt");
std::string version = modelInfo.getModelVersion();

std::cout << "Model Version: " << version << std::endl;

// Get model hash for verification
std::string hash = modelInfo.getModelHash();
std::cout << "Model Hash: " << hash << std::endl;

Training Custom Models

When to Train Custom Models

Consider training a custom model if:
  • Your sensor type differs significantly from optical TIR
  • Your population has unique characteristics
  • You need quality predictions for specific use cases
  • You have ground-truth performance data from your system
Custom model training is an advanced topic. Contact NIST or consult the NFIQ2 technical report for guidance on training methodology.

Training Data Requirements

  • Minimum: 1,000+ fingerprint images with ground truth
  • Recommended: 5,000+ images with diverse quality distribution
  • Ground Truth: Match performance data from operational system
  • Validation Set: 20-30% held out for testing
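The held-out validation set can be produced with a seeded shuffle of sample indices. A minimal sketch follows, assuming samples are addressed by index; holdOutSplit is a hypothetical helper and not part of any NFIQ2 training tool.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper: shuffle sample indices with a fixed seed and hold out
// a fraction for validation. Returns {trainingIndices, validationIndices}.
std::pair<std::vector<std::size_t>, std::vector<std::size_t>>
holdOutSplit(std::size_t sampleCount, double validationFraction,
             unsigned int seed) {
    assert(validationFraction >= 0.0 && validationFraction <= 1.0);

    std::vector<std::size_t> indices(sampleCount);
    std::iota(indices.begin(), indices.end(), std::size_t{0});

    std::mt19937 rng(seed);
    std::shuffle(indices.begin(), indices.end(), rng);

    // Truncate toward zero so the validation set never exceeds the fraction.
    const auto validationCount = static_cast<std::size_t>(
        static_cast<double>(sampleCount) * validationFraction);
    const auto splitPoint =
        indices.end() - static_cast<std::ptrdiff_t>(validationCount);

    return {std::vector<std::size_t>(indices.begin(), splitPoint),
            std::vector<std::size_t>(splitPoint, indices.end())};
}
```

For fingerprint data it is usually worth splitting by subject rather than by image, so that images of the same finger do not land on both sides; the index-level split above does not handle that.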

Model Export Format

Custom models must be:
  • Trained using OpenCV RTrees
  • Exported to YAML format
  • Compatible with NFIQ2 feature naming conventions
  • Validated against standard test set

Best Practices

Always verify model parameter hash on loading:
try {
    NFIQ2::Algorithm algorithm(modelPath, expectedHash);
    std::string loadedHash = algorithm.getParameterHash();
    
    if (loadedHash != expectedHash) {
        std::cerr << "Hash mismatch!" << std::endl;
    }
} catch (const NFIQ2::Exception& e) {
    std::cerr << "Model load failed: " << e.what() << std::endl;
}
Model initialization is expensive. Reuse Algorithm instances:
// Initialize once
static NFIQ2::Algorithm algorithm;

// Reuse for all quality score computations
for (const auto& image : images) {
    unsigned int score = algorithm.computeUnifiedQualityScore(image);
}
Always record which model version produced quality scores:
  • Store model hash with quality scores in database
  • Include model version in log files
  • Document model changes in release notes
  • Maintain model file version control
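As a sketch of the first point, a stored record can carry the model identity next to each score. The ScoreRecord layout and toCsvLine are hypothetical, and the CSV form assumes no field contains a comma.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical record layout: keep the model identity with every stored score.
struct ScoreRecord {
    std::string imageId;
    unsigned int qualityScore;
    std::string modelVersion;
    std::string modelHash;
};

// Serialize as one comma-separated line (fields assumed to contain no commas).
std::string toCsvLine(const ScoreRecord &r) {
    // A record without a model hash cannot be traced back to its model.
    assert(!r.modelHash.empty());
    std::ostringstream out;
    out << r.imageId << ',' << r.qualityScore << ','
        << r.modelVersion << ',' << r.modelHash;
    return out.str();
}
```

The version and hash values could be taken from ModelInfo::getModelVersion() and Algorithm::getParameterHash() at the time the score is computed.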
Use models trained on your sensor type:
  • Check FCT code compatibility
  • Validate against your sensor’s images
  • Consider training custom model if default performance is poor

Next Steps

Quality Scores

Understand unified quality score interpretation

Quality Measures

Learn about input features to the model

Algorithm API

Complete Algorithm class documentation

ModelInfo API

Model information management
