Biometric Verification and Authentication Threshold

Verification is the final gate before a cryptographic key is released. When a new fMRI scan arrives, Neural Vault normalizes it with the enrolled user’s statistics, runs it through the trained Transformer encoder to obtain an embedding, and computes the cosine similarity of that embedding against the stored prototype vector. If the similarity exceeds the Equal Error Rate threshold determined during enrollment, the user is authenticated and a 256-bit key is derived. If it falls short, the session is rejected without any key material being produced.

Getting an Embedding from Raw Data

get_embedding in model.py is the inference entry point. It applies per-feature normalization using statistics (feat_mean, feat_std) that were computed at enrollment time — never from the current scan — then builds temporal sequences and runs the model:

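import numpy as np
import torch

def get_embedding(raw_data_np: np.ndarray) -> np.ndarray:
    """raw_data_np: (N, 20484) → mean embedding over all sequence windows"""
    # Normalize with enrollment-time statistics, never the current scan's
    normed = (raw_data_np - feat_mean) / feat_std
    seqs   = build_sequences(normed)
    if len(seqs) == 0:
        # Fewer rows than SEQ_CHUNK: repeat the rows to form one window
        pad   = np.tile(normed, (SEQ_CHUNK, 1))
        # Cast to float32 so the dtype matches the model weights
        seqs  = torch.from_numpy(pad).float().unsqueeze(0)
    with torch.no_grad():
        emb = model(seqs.to(DEVICE))
    # Average the per-window embeddings into one vector
    return emb.mean(dim=0).cpu().numpy()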

The zero-length guard (if len(seqs) == 0) handles the edge case where the input has fewer rows than SEQ_CHUNK. In that case the scan is tiled to reach the minimum window size. The final emb.mean(dim=0) collapses all sequence windows into a single representative embedding vector before returning it.

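The model outputs L2-normalized embeddings (every per-window vector has unit length), and for unit vectors the dot product equals the cosine similarity, so np.dot(emb, prototype_vec) needs no explicit cosine computation. One caveat applies: the mean of several unit vectors generally has a norm below 1, so the dot product is an exact cosine similarity only if the averaged embedding and the prototype are re-normalized to unit length (a step not shown in this excerpt). Otherwise the score is the cosine similarity scaled by the product of the two norms.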
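The dot-product identity depends on both vectors having unit length, and averaging unit vectors usually shortens them. The NumPy check below is a minimal sketch with made-up two-dimensional vectors (not Neural Vault data) that shows the difference between the raw dot product and the cosine similarity after re-normalizing.

```python
import numpy as np

# Two hypothetical unit-length window embeddings pointing in different directions.
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
mean_emb = (a + b) / 2.0            # analogous to emb.mean(dim=0)

prototype = np.array([1.0, 0.0])    # hypothetical unit-length prototype

raw_dot = float(np.dot(mean_emb, prototype))    # dot product of the un-normalized mean
mean_norm = float(np.linalg.norm(mean_emb))     # shorter than 1 after averaging

# Re-normalizing restores the "dot product = cosine similarity" identity.
unit_mean = mean_emb / mean_norm
cosine = float(np.dot(unit_mean, prototype))

print(raw_dot, mean_norm, cosine)
```

If scores computed this way are compared against a fixed threshold, the prototype and the query embedding should be normalized the same way at enrollment and at verification time.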

Verification Function

verify combines embedding extraction and similarity scoring into a single call:

def verify(raw_data_np: np.ndarray) -> float:
    """Returns cosine similarity to the user prototype."""
    emb = get_embedding(raw_data_np)
    # L2 normalized vectors, so dot product = cosine similarity
    similarity = np.dot(emb, prototype_vec)
    return float(similarity)

The return value is a scalar in [-1.0, 1.0]. A score near 1.0 means the scan is nearly identical to the enrolled prototype in embedding space; a score near 0.0 or below indicates an impostor or a badly degraded scan.

Batch Similarity Scoring for Benchmarking

verify_similarity in main.py evaluates the full score distribution over an embedding matrix at once, separating scores into genuine (same-class) and impostor (different-class) arrays using scipy.spatial.distance.cdist:

def verify_similarity(embeddings, prototypes, labels):
    genuine, impostor = [], []
    for c in range(N_CLASSES):
        ref = np.atleast_2d(prototypes[c])
        genuine.extend(
            cdist(ref, embeddings[labels == c], metric='cosine').flatten()
        )
        impostor.extend(
            cdist(ref, embeddings[labels != c], metric='cosine').flatten()
        )
    return np.array(genuine), np.array(impostor)

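cdist with metric='cosine' returns cosine distance (that is, 1 - cosine_similarity), so lower values indicate better matches. Because roc_curve treats higher scores as more likely to belong to the positive (genuine) class, these distances have to be converted back to similarities (1 - distance) before sweeping the ROC curve and locating the EER threshold, which is reported as a cosine similarity.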
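The relationship between cosine distance and cosine similarity can be illustrated without SciPy. The sketch below reimplements the cosine-distance definition in plain NumPy for one reference vector; cosine_distance is a hypothetical helper written for this example, and the vectors are toy values.

```python
import numpy as np

def cosine_distance(ref: np.ndarray, rows: np.ndarray) -> np.ndarray:
    """Cosine distance (1 - cosine similarity) between one reference vector and each row."""
    ref_unit = ref / np.linalg.norm(ref)
    rows_unit = rows / np.linalg.norm(rows, axis=1, keepdims=True)
    return 1.0 - rows_unit @ ref_unit

prototype = np.array([1.0, 0.0])
embeddings = np.array([
    [1.0, 0.0],    # same direction as the prototype
    [0.0, 1.0],    # orthogonal to the prototype
    [-1.0, 0.0],   # opposite direction
])

distances = cosine_distance(prototype, embeddings)   # lower = better match
similarities = 1.0 - distances                       # higher = better match

print(distances, similarities)
```

Distances range from 0 (identical direction) to 2 (opposite direction), while similarities range from 1 down to -1.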

EER Threshold Derivation

The Equal Error Rate threshold is the operating point at which the False Acceptance Rate equals the False Rejection Rate. It is computed from the ROC curve in main.py:

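from sklearn.metrics import roc_curve

# Label genuine comparisons 1 and impostor comparisons 0.
# roc_curve expects higher scores for the positive (genuine) class.
fpr_v, tpr_v, thresh_v = roc_curve(
    np.concatenate([np.ones_like(genuine_scores), np.zeros_like(impostor_scores)]),
    np.concatenate([genuine_scores, impostor_scores])
)
# Pick the threshold where FPR and FNR (= 1 - TPR) are closest
vault_threshold = float(thresh_v[np.argmin(np.abs(fpr_v - (1.0 - tpr_v)))])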

np.argmin(np.abs(fpr_v - (1.0 - tpr_v))) finds the threshold index where FPR and FNR (= 1 - TPR) are closest to each other. The benchmark pipeline reports this threshold at approximately 0.316.
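To make the EER search concrete, here is a NumPy-only sketch that sweeps candidate thresholds directly instead of calling roc_curve. It is an illustration rather than the code in main.py: eer_threshold is a hypothetical helper, the scores are toy similarities, and acceptance uses the strict score > threshold rule.

```python
import numpy as np

def eer_threshold(genuine: np.ndarray, impostor: np.ndarray):
    """Return (threshold, far, frr) at the point where FAR and FRR are closest.

    Scores are similarities: higher means more likely genuine.
    """
    candidates = np.unique(np.concatenate([genuine, impostor]))  # sorted ascending
    best = None
    for t in candidates:
        far = float(np.mean(impostor > t))    # impostors wrongly accepted
        frr = float(np.mean(genuine <= t))    # genuine users wrongly rejected
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, float(t), far, frr)
    return best[1], best[2], best[3]

# Toy similarity scores, one weak genuine match and one strong impostor.
genuine = np.array([0.9, 0.8, 0.7, 0.2])
impostor = np.array([0.1, 0.0, -0.1, 0.75])

thr, far, frr = eer_threshold(genuine, impostor)
print(thr, far, frr)
```

With finite samples FAR and FRR rarely coincide exactly, which is why both the sketch and the argmin expression above settle for the closest pair.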

Authentication Decision

The end-to-end authentication flow from model.py demonstrates how the threshold gates key derivation:

user_emb = get_embedding(norm_np)
score    = np.dot(user_emb, prototype_vec)

if score > eer_thr:
    key_bytes, key_hex = derive_key(user_emb)
    print("✓ AUTHENTICATED")
    print(f"  Derived key (256-bit)   : {key_hex}")
else:
    print("✗ REJECTED (score below EER threshold)")

No key material is derived or returned in the rejection branch — the derive_key call is gated entirely behind the threshold check.
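The gating pattern can be isolated from the model code. The sketch below is an assumption-laden illustration: authenticate and fake_derive are hypothetical names, and fake_derive is only a placeholder that records calls, not the real derive_key.

```python
from typing import Callable, Optional, Sequence

def authenticate(
    score: float,
    threshold: float,
    embedding: Sequence[float],
    derive: Callable[[Sequence[float]], bytes],
) -> Optional[bytes]:
    """Return key bytes on success, None on rejection.

    `derive` is called only after the threshold check passes, so a rejected
    session never produces key material.
    """
    if score > threshold:
        return derive(embedding)
    return None

# Hypothetical stand-in for derive_key that records whether it was called.
calls = []

def fake_derive(embedding: Sequence[float]) -> bytes:
    calls.append(tuple(embedding))
    return bytes(32)   # 32 zero bytes (256 bits), placeholder only

rejected = authenticate(0.10, 0.316, [0.1, 0.2], fake_derive)
accepted = authenticate(0.90, 0.316, [0.1, 0.2], fake_derive)
print(rejected, len(accepted), len(calls))
```

Returning None instead of an empty or dummy key makes it harder for a caller to use a rejected session's result by accident.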

Full Verification Flow

Normalize the incoming scan

Apply z-score normalization using the enrolled user’s stored feat_mean and feat_std:

normed = (raw_data_np - feat_mean) / feat_std

Using enrollment-time statistics ensures the normalization is consistent regardless of the current session’s signal distribution.

Build temporal sequences

Chunk the normalized scan into windows of length SEQ_CHUNK (5 frames):

seqs = build_sequences(normed)

If the scan is shorter than one window, tile it to meet the minimum length requirement.

Run the Transformer encoder

Pass the sequence tensor through the trained model to obtain L2-normalized embeddings, then average across all windows:

with torch.no_grad():
    emb = model(seqs.to(DEVICE))
user_emb = emb.mean(dim=0).cpu().numpy()

Compute cosine similarity

Compute the dot product of the query embedding with the enrolled prototype vector:

score = np.dot(user_emb, prototype_vec)

Because both vectors are L2-normalized, this equals their cosine similarity.

Apply the EER threshold

Compare the score against the EER threshold. Authenticate and derive a key if the score exceeds the threshold; reject otherwise:

if score > eer_thr:   # eer_thr ≈ 0.316
    key_bytes, key_hex = derive_key(user_emb)
    # proceed with key_bytes
else:
    # reject — do not derive key
    pass
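The steps above can be sketched end to end in NumPy. Everything with a _sketch suffix is a hypothetical stand-in: the real build_sequences may use a different stride or overlap, and the real encoder is a trained Transformer, not the linear projection used here. Feature and embedding sizes are toy values.

```python
import numpy as np

SEQ_CHUNK = 5  # window length in frames

def build_sequences_sketch(normed: np.ndarray) -> np.ndarray:
    """Assumed windowing: non-overlapping chunks of SEQ_CHUNK rows."""
    n_windows = normed.shape[0] // SEQ_CHUNK
    if n_windows == 0:
        # Too short for one window: repeat the rows, as in the tiling guard.
        return np.tile(normed, (SEQ_CHUNK, 1))[np.newaxis, ...]
    usable = normed[: n_windows * SEQ_CHUNK]
    return usable.reshape(n_windows, SEQ_CHUNK, normed.shape[1])

def encode_sketch(seqs: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Stand-in encoder: mean over time, linear projection, L2 normalization."""
    pooled = seqs.mean(axis=1)                       # (windows, features)
    emb = pooled @ weights                           # (windows, emb_dim)
    return emb / np.linalg.norm(emb, axis=1, keepdims=True)

def score_scan(raw, feat_mean, feat_std, weights, prototype_vec) -> float:
    normed = (raw - feat_mean) / feat_std            # normalize with enrollment stats
    seqs = build_sequences_sketch(normed)            # build temporal windows
    user_emb = encode_sketch(seqs, weights).mean(axis=0)   # encode, then average
    return float(np.dot(user_emb, prototype_vec))    # similarity score

rng = np.random.default_rng(0)
n_features, emb_dim = 8, 4                           # toy sizes, not 20484
feat_mean = np.zeros(n_features)
feat_std = np.ones(n_features)
weights = rng.normal(size=(n_features, emb_dim))

enroll_scan = rng.normal(size=(12, n_features))
enroll_seqs = build_sequences_sketch((enroll_scan - feat_mean) / feat_std)
prototype_vec = encode_sketch(enroll_seqs, weights).mean(axis=0)

score = score_scan(enroll_scan, feat_mean, feat_std, weights, prototype_vec)
short_seqs = build_sequences_sketch(rng.normal(size=(2, n_features)))
print(enroll_seqs.shape, short_seqs.shape, score)
```

The resulting score would then be compared against the threshold exactly as in the final step, with key derivation reserved for the accepting branch.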

Few-Shot Classification Evaluation

For multi-class scenarios, evaluate_fewshot classifies a query embedding by finding its nearest class prototype via Euclidean distance in embedding space:

def evaluate_fewshot(model, X_train, y_train, X_test, y_test):
    """Predicts query tokens based on minimum distance to generated anchors."""
    model.eval()
    with torch.no_grad():
        # Add a sequence dimension of length 1 to each sample
        X_tr = torch.from_numpy(X_train).float().unsqueeze(1)
        y_tr = torch.from_numpy(y_train).long()
        X_te = torch.from_numpy(X_test).float().unsqueeze(1)

        emb_tr = model(X_tr)
        emb_te = model(X_te)

        # One prototype per class: the mean of that class's support embeddings
        prototypes = torch.stack([emb_tr[y_tr == c].mean(0) for c in range(N_CLASSES)])
        dists = euclidean_dist(emb_te, prototypes)
        # .cpu() is a no-op on CPU and keeps the conversion safe on other devices
        preds = torch.argmin(dists, dim=1).cpu().numpy()
        probs = F.softmax(-dists, dim=1).cpu().numpy()
    return preds, probs

Prototypes are computed live from the support set (X_train) within each episode rather than from pre-stored values. torch.argmin(dists, dim=1) selects the class whose prototype is closest; F.softmax(-dists, dim=1) converts distances to probabilities for ROC-AUC computation.
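The nearest-prototype rule can be reproduced in NumPy on toy data. This is a sketch under assumptions: euclidean_dist_sketch is a hypothetical helper that uses plain (not squared) Euclidean distance, which may differ from the project's euclidean_dist, and the two-dimensional points stand in for real embeddings.

```python
import numpy as np

def euclidean_dist_sketch(queries: np.ndarray, prototypes: np.ndarray) -> np.ndarray:
    """Pairwise Euclidean distances, shape (n_queries, n_classes)."""
    diff = queries[:, np.newaxis, :] - prototypes[np.newaxis, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def softmax(x: np.ndarray) -> np.ndarray:
    shifted = x - x.max(axis=1, keepdims=True)   # subtract row max for numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

# Toy support set: two samples per class.
support = np.array([[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 0.8]])
support_labels = np.array([0, 0, 1, 1])
prototypes = np.stack([support[support_labels == c].mean(axis=0) for c in range(2)])

queries = np.array([[0.1, 0.1], [0.9, 1.0]])
dists = euclidean_dist_sketch(queries, prototypes)
preds = dists.argmin(axis=1)     # closest prototype wins
probs = softmax(-dists)          # smaller distance -> larger probability

print(prototypes, preds, probs)
```

Negating the distances before the softmax is what turns "closest" into "most probable", so the predicted class always has the largest probability in its row.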

Benchmark Metrics

Running 40 few-shot episodes on 5classpreds.csv produced the following benchmark results:

Metric	Value
Accuracy (40 episodes)	98.12%
Macro F1	0.9810
ROC-AUC	0.9995
FAR @ EER	≈ 0.75%
FRR @ EER	≈ 0.75%
EER threshold (cosine sim)	≈ 0.316

The score distributions show clear separation between genuine and impostor populations: genuine cosine similarities cluster at high positive values while impostor similarities spread around zero, reflecting the Gaussian noise baseline used for impostor simulation in model.py.

The EER threshold is the balanced operating point, but it is not always the right choice for a deployment. Setting the threshold below the EER value lowers the False Rejection Rate at the cost of a higher False Acceptance Rate, which is appropriate when user experience is the priority and the downstream key usage has secondary security controls. Setting it above the EER tightens security but rejects more legitimate users. Use the full ROC curve (fpr_v, tpr_v, thresh_v) to select a threshold that matches the security-usability tradeoff required by the application.
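One way to choose a non-EER operating point is to fix a maximum acceptable FAR and take the lowest threshold that satisfies it. The sketch below does this with a direct sweep in NumPy; threshold_for_far is a hypothetical helper, the scores are toy similarities, and acceptance uses the strict score > threshold rule.

```python
import numpy as np

def threshold_for_far(genuine: np.ndarray, impostor: np.ndarray, max_far: float):
    """Lowest threshold whose FAR is at or below max_far, with the resulting rates."""
    for t in np.unique(np.concatenate([genuine, impostor])):   # ascending order
        far = float(np.mean(impostor > t))
        if far <= max_far:
            frr = float(np.mean(genuine <= t))
            return float(t), far, frr
    raise ValueError("no threshold meets the FAR target")

genuine = np.array([0.9, 0.8, 0.7, 0.2])
impostor = np.array([0.1, 0.0, -0.1, 0.75])

strict = threshold_for_far(genuine, impostor, max_far=0.0)    # security first
lenient = threshold_for_far(genuine, impostor, max_far=0.5)   # usability first
print(strict, lenient)
```

In this toy data the strict setting eliminates false accepts but rejects half of the genuine attempts, while the lenient setting accepts every genuine attempt and half of the impostors.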
