Neural Vault’s core encoder is NeuralVaultFewShot, a compact Transformer network trained to map fMRI feature vectors into a metric space where embeddings from the same identity cluster tightly together and embeddings from different identities are pushed apart. Training uses the triplet loss objective and episodic sampling: each mini-batch contains anchor, positive, and negative samples assembled on the fly from the labeled dataset. The result is a unit-hypersphere embedding space where cosine distance directly measures biometric similarity.
Architecture
NeuralVaultFewShot composes four stages: a linear feature projection, sinusoidal positional encoding, a stack of Transformer encoder layers, and a multi-layer embedding head that projects the temporally-pooled representation down to latent_dim dimensions before L2 normalization.
PositionalEncoding uses the classic sinusoidal formulation from “Attention Is All You Need”. Even dimensions receive sine encodings and odd dimensions receive cosine encodings at geometrically spaced frequencies. The encoding table is registered as a buffer rather than a parameter, so it moves to the correct device with .to(device) without participating in backpropagation.
norm_first=True: applies layer normalization before the attention sub-layer (Pre-LN), which improves gradient flow in shallow stacks.
dim_feedforward=d_model * 4: follows the standard 4× expansion ratio used in the original Transformer paper.
h.mean(dim=1): mean pooling over the time axis aggregates information from all frames into a single context vector before the embedding head.
F.normalize(..., p=2, dim=1): L2-normalizes the output so every embedding lies on the unit hypersphere. On this manifold, dot products equal cosine similarities, making verification a simple inner product.
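The four stages and the design choices listed above can be sketched as follows. This is a minimal illustration, not the project's actual source: the class name FewShotEncoderSketch, the default sizes (d_model=64, nhead=4, num_layers=2, latent_dim=32, max_len=512), the two-layer head, and the (batch, time, features) input layout are all assumptions, and d_model is assumed to be even.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class PositionalEncoding(nn.Module):
    """Sinusoidal positional encoding stored as a non-trainable buffer."""

    def __init__(self, d_model: int, max_len: int = 512):
        super().__init__()
        position = torch.arange(max_len).unsqueeze(1).float()  # (max_len, 1)
        # Geometrically spaced frequencies, one per sin/cos pair
        div_term = torch.exp(
            torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model)
        )
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)  # even dimensions
        pe[:, 1::2] = torch.cos(position * div_term)  # odd dimensions
        # A buffer follows .to(device) but is not returned by parameters()
        self.register_buffer("pe", pe.unsqueeze(0))  # (1, max_len, d_model)

    def forward(self, x):
        # x: (batch, time, d_model)
        return x + self.pe[:, : x.size(1)]


class FewShotEncoderSketch(nn.Module):
    """Hypothetical stand-in for NeuralVaultFewShot (all sizes are assumed)."""

    def __init__(self, input_dim, d_model=64, nhead=4, num_layers=2, latent_dim=32):
        super().__init__()
        self.proj = nn.Linear(input_dim, d_model)  # stage 1: feature projection
        self.pos = PositionalEncoding(d_model)     # stage 2: positional encoding
        layer = nn.TransformerEncoderLayer(
            d_model=d_model,
            nhead=nhead,
            dim_feedforward=d_model * 4,  # 4x expansion ratio
            batch_first=True,
            norm_first=True,              # Pre-LN
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)  # stage 3
        self.head = nn.Sequential(                 # stage 4: embedding head
            nn.Linear(d_model, d_model),
            nn.ReLU(),
            nn.Linear(d_model, latent_dim),
        )

    def forward(self, x):
        # x: (batch, time, input_dim)
        h = self.encoder(self.pos(self.proj(x)))
        h = h.mean(dim=1)                             # mean pooling over time
        return F.normalize(self.head(h), p=2, dim=1)  # unit-norm embeddings
```

Because the output is unit-norm, the inner product of two embeddings produced this way is their cosine similarity.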
Triplet Loss
The training objective is the standard margin-based triplet loss with margin=0.3. F.relu ensures that only violated constraints contribute to the gradient: once a triplet is separated by at least the margin, it stops driving updates. The loss is averaged over the batch.
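In symbols, the per-triplet loss is max(0, d(a, p) - d(a, n) + margin), where d is a distance between embeddings. A minimal sketch follows, assuming d is the Euclidean distance computed with F.pairwise_distance; the source does not state which distance is used, and the function name triplet_loss_sketch is hypothetical.

```python
import torch
import torch.nn.functional as F


def triplet_loss_sketch(anchor, positive, negative, margin: float = 0.3):
    """Margin-based triplet loss averaged over the batch.

    Assumes Euclidean distance between (batch, dim) embeddings.
    """
    d_pos = F.pairwise_distance(anchor, positive)  # anchor-to-positive distance
    d_neg = F.pairwise_distance(anchor, negative)  # anchor-to-negative distance
    # relu zeroes out triplets already separated by at least the margin
    return F.relu(d_pos - d_neg + margin).mean()
```

A triplet whose negative sits much farther from the anchor than the positive contributes zero, while a triplet whose positive and negative are equally distant contributes exactly the margin.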
Training Workflow
Initialize model and optimizer
Instantiate NeuralVaultFewShot with input_dim set to the number of features in the scaled data matrix, and configure AdamW with lr=1e-4 and weight_decay=1e-4. class_indices caches a per-class index array upfront so episodic sampling during the loop avoids repeated np.where calls.
Episodic triplet sampling
Each epoch, sample batch_size anchor indices at random, then for each anchor select a positive from the same class and a negative from a different class. The self-exclusion check (same_class[same_class != idx]) prevents degenerate triplets where the anchor and positive are the same sample, which would contribute a zero positive distance. If a class has only one member, the anchor is reused as the positive. This fallback is safe but uninformative: the positive pair carries no signal for that triplet, which then only pushes the negative away.
Forward pass and loss computation
Run all three legs of the triplet through the model, compute embeddings, and evaluate the triplet loss:
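A minimal sketch of sampling and the forward pass on toy data. Only class_indices, the self-exclusion check, and the 0.3 margin come from the description above; the toy tensors, the TinyEncoder stand-in, the sample_triplets helper, and the use of Euclidean distance are assumptions made for illustration.

```python
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

rng = np.random.default_rng(0)
torch.manual_seed(0)

# Toy stand-in data: 3 identities, 8 samples each, 10 frames of 16 features
X = torch.randn(24, 10, 16)
y = np.repeat(np.arange(3), 8)

# Cache per-class index arrays once, so the loop never calls np.where again
class_indices = {c: np.where(y == c)[0] for c in np.unique(y)}


def sample_triplets(batch_size):
    """Return index arrays (anchors, positives, negatives)."""
    anchors = rng.integers(0, len(y), size=batch_size)
    positives, negatives = [], []
    for idx in anchors:
        same_class = class_indices[y[idx]]
        candidates = same_class[same_class != idx]  # self-exclusion check
        # Fall back to the anchor itself if its class has a single member
        positives.append(rng.choice(candidates) if len(candidates) else idx)
        other = rng.choice([c for c in class_indices if c != y[idx]])
        negatives.append(rng.choice(class_indices[other]))
    return anchors, np.array(positives), np.array(negatives)


class TinyEncoder(nn.Module):
    """Hypothetical stand-in encoder: mean-pool, project, L2-normalize."""

    def __init__(self, input_dim, latent_dim=8):
        super().__init__()
        self.fc = nn.Linear(input_dim, latent_dim)

    def forward(self, x):
        return F.normalize(self.fc(x.mean(dim=1)), p=2, dim=1)


model = TinyEncoder(input_dim=16)
a_idx, p_idx, n_idx = sample_triplets(batch_size=8)

# Run all three legs of the triplet through the same encoder
emb_a = model(X[torch.as_tensor(a_idx)])
emb_p = model(X[torch.as_tensor(p_idx)])
emb_n = model(X[torch.as_tensor(n_idx)])

d_pos = F.pairwise_distance(emb_a, emb_p)
d_neg = F.pairwise_distance(emb_a, emb_n)
loss = F.relu(d_pos - d_neg + 0.3).mean()
```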
Backward pass with gradient clipping
Backpropagate, clip gradients to prevent exploding norms, and step the optimizer:
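A minimal sketch of this step, assuming a toy linear encoder and random triplets in place of the real model and data. The AdamW settings and the clipping threshold of 1.0 follow the text; everything else is illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

model = nn.Linear(16, 8)  # toy stand-in for the encoder
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-4)


def embed(x):
    return F.normalize(model(x), p=2, dim=1)


# Random anchor / positive / negative batches, each of shape (4, 16)
anchor, positive, negative = torch.randn(3, 4, 16).unbind(0)

d_pos = F.pairwise_distance(embed(anchor), embed(positive))
d_neg = F.pairwise_distance(embed(anchor), embed(negative))
loss = F.relu(d_pos - d_neg + 0.3).mean()

optimizer.zero_grad()  # clear gradients from the previous step
loss.backward()        # backpropagate
# Scale all gradients together if their combined L2 norm exceeds 1.0;
# the return value is the norm measured before clipping
total_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()       # apply the update
```

Logging total_norm per step is a cheap way to see how often clipping is actually active.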
clip_grad_norm_ rescales the entire parameter gradient vector so its L2 norm does not exceed 1.0. This is particularly important in the early epochs when triplet distances are large and gradients can spike.
model.py Variant: Cosine Annealing and Noise Negatives
The model.py training loop uses an alternative triplet strategy: negatives are drawn from random Gaussian noise rather than from labeled impostor classes. This is appropriate for single-user enrollment where no labeled impostor data is available at training time. It also uses a cosine annealing learning rate schedule over 150 epochs:
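A minimal sketch of this variant. The 150 epochs, the starting learning rate of 1e-4, and the Gaussian-noise negatives come from the description; the toy data, the linear stand-in encoder, the choice of AdamW, the 0.3 margin, the batch size, and the use of unit-variance noise shaped like the input are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
EPOCHS = 150
MARGIN = 0.3  # assumed to match the margin of the main training loop

X_user = torch.randn(32, 16)  # toy stand-in for one user's feature vectors
encoder = nn.Linear(16, 8)    # toy stand-in for the real encoder

optimizer = torch.optim.AdamW(encoder.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=EPOCHS)


def embed(x):
    return F.normalize(encoder(x), p=2, dim=1)


lr_history = []
for epoch in range(EPOCHS):
    # Anchors and positives both come from the enrolled user
    anchor = X_user[torch.randint(0, len(X_user), (8,))]
    positive = X_user[torch.randint(0, len(X_user), (8,))]
    # Negatives are pure Gaussian noise with the same shape as the inputs
    negative = torch.randn_like(anchor)

    d_pos = F.pairwise_distance(embed(anchor), embed(positive))
    d_neg = F.pairwise_distance(embed(anchor), embed(negative))
    loss = F.relu(d_pos - d_neg + MARGIN).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    lr_history.append(scheduler.get_last_lr()[0])  # rate used this epoch
    scheduler.step()                               # advance the cosine schedule
```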
CosineAnnealingLR with T_max=150 smoothly decays the learning rate from 1e-4 to near zero over the full training run, helping the model converge to a tighter minimum without abrupt learning rate drops.
Few-Shot Evaluation
After training, model quality is assessed through episodic few-shot classification. evaluate_fewshot computes per-class prototype embeddings from a small support set and classifies queries by nearest prototype using Euclidean distance:
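A minimal sketch of nearest-prototype classification on precomputed embeddings. The helper names euclidean_dist_sketch and classify_by_prototype are hypothetical and are not the project's evaluate_fewshot; the sketch only illustrates the prototype averaging, the broadcast distance computation, and the softmax over negative distances.

```python
import torch


def euclidean_dist_sketch(x, y):
    """All pairwise Euclidean distances between x (n, d) and y (m, d)."""
    # Broadcasting (n, 1, d) against (1, m, d) yields an (n, m, d) difference
    diff = x.unsqueeze(1) - y.unsqueeze(0)
    return diff.pow(2).sum(dim=-1).sqrt()  # (n, m)


def classify_by_prototype(support_emb, support_labels, query_emb):
    """Assign each query to the class of its nearest prototype."""
    classes = torch.unique(support_labels)
    # One prototype per class: the mean of that class's support embeddings
    prototypes = torch.stack(
        [support_emb[support_labels == c].mean(dim=0) for c in classes]
    )
    dists = euclidean_dist_sketch(query_emb, prototypes)  # (n_query, n_class)
    probs = torch.softmax(-dists, dim=1)  # closer prototype, higher probability
    return classes[probs.argmax(dim=1)], probs


# Toy 4-shot episode with two well-separated classes
support = torch.tensor(
    [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
     [10.0, 10.0], [10.0, 11.0], [11.0, 10.0], [11.0, 11.0]]
)
labels = torch.tensor([0, 0, 0, 0, 1, 1, 1, 1])
queries = torch.tensor([[0.0, 0.0], [11.0, 11.0]])
preds, probs = classify_by_prototype(support, labels, queries)
```

The probs matrix has one row per query and one column per class, which is the shape a ROC-AUC routine expects for score inputs.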
euclidean_dist expands both tensors using broadcasting to compute all n × m pairwise distances in a single vectorized operation. The negative distance is passed through softmax to produce class probability estimates used for ROC-AUC calculation. Across 40 evaluation episodes with 4-shot support sets, Neural Vault achieves an accuracy of 98.12%, an F1 score of 0.9810, and a ROC-AUC of 0.9995.