Neural Vault Configuration Constants and Hyperparameters

Neural Vault’s behaviour is controlled by a set of module-level constants defined at the top of `main.py` and `model.py`. Functions in each file read these constants from module scope, so changing a constant takes effect for every function in that module without threading it through as an argument. The tables below list every constant, its default value, and guidance on when to modify it.

`main.py` Constants

These constants govern the benchmark pipeline, output directory layout, noise sweep levels, and visualization colours.

Constant	Default	Description
`SEED`	`42`	Random seed applied to both `numpy` and `torch` at import time. Change this to produce independent runs or to verify that results are not seed-dependent.
`N_CLASSES`	`5`	Number of stimulus classes (distinct video stimuli) in `5classpreds.csv`. If you record additional stimuli, increment this and re-run the pipeline.
`BENCHMARK_DIR`	`Path("benchmark")`	Root output directory relative to the working directory. All three sub-directories are created beneath it by `setup_environment()`.
`RESULTS_DIR`	`benchmark/results`	Destination for `keygen_benchmark_results.json`.
`VIZ_DIR`	`benchmark/visualizations`	Destination for `neuralvault_extended_dashboard.png`.
`REPORT_DIR`	`benchmark/reports`	Destination for `BENCHMARK_REPORT.md` and `benchmark_summary.json`.
`SNR_LEVELS`	`[30, 20, 15, 10, 5, 0]`	AWGN noise test levels in decibels, evaluated in descending order (cleanest signal first). Add lower values such as `-5` to test extreme degradation.
`ARTIFACT_LEVELS`	`[0.0, 0.05, 0.10, 0.15, 0.20, 0.30]`	Fractions of feature values corrupted by simulated motion artifacts. `0.0` is the clean baseline; `0.30` corrupts 30% of values.
`COLORS`	`["#00e5ff", "#ff4040", "#7cff6b", "#f5c518", "#ab47bc"]`	Hex colour codes assigned per class in PCA scatter plots and bar charts. One entry per class; extend the list when `N_CLASSES` is increased.
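
To make the `SNR_LEVELS` sweep concrete, the sketch below shows one common way to inject AWGN at a target SNR, using SNR_dB = 10 · log10(P_signal / P_noise). The helper names `add_awgn` and `measured_snr_db` are hypothetical, and the assumption that signal power is the mean squared value of the feature vector is ours; the actual noise routine in `main.py` may differ.

```python
import math
import random


def add_awgn(signal, snr_db, rng=None):
    """Return a noisy copy of `signal` at roughly `snr_db` dB SNR (hypothetical helper)."""
    rng = rng if rng is not None else random.Random()
    # Assumption: signal power is the mean squared value of the samples.
    signal_power = sum(x * x for x in signal) / len(signal)
    # Invert SNR_dB = 10 * log10(P_signal / P_noise) to get the noise power.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB between a clean signal and its noisy copy."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


SNR_LEVELS = [30, 20, 15, 10, 5, 0]
rng = random.Random(42)
clean = [math.sin(0.01 * i) for i in range(20000)]  # stand-in for a feature vector

for snr_db in SNR_LEVELS:
    noisy = add_awgn(clean, snr_db, rng)
    print(snr_db, round(measured_snr_db(clean, noisy), 2))
```

At 0 dB the noise power equals the signal power, which is why adding negative values such as `-5` pushes the test into a regime where noise dominates.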

`model.py` Constants

These constants define the enrolled-user verification pipeline: model architecture dimensions, sequence chunking, and benchmark generation parameters.

Constant	Default	Description
`SEED`	`42`	Random seed, set independently from `main.py`.
`LATENT_DIM`	`40`	Dimensionality of the output embedding space. This sets the dimension of the unit hypersphere on which genuine samples cluster. Smaller values compress the representation; larger values allow finer discrimination but require more training data.
`D_MODEL`	`256`	Transformer model width. All internal attention and feed-forward operations use this dimensionality. The `main.py` pipeline uses `128` for its `NeuralVaultFewShot` instance; `model.py` uses `256` for the enrolled-user model.
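`N_HEAD`	`8`	Number of self-attention heads. Must evenly divide `D_MODEL`. Reducing to `4` doubles the per-head width from 32 to 64 dimensions; in a standard multi-head attention layer the parameter count stays the same, so expect little change in inference cost on CPU-only hardware.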
`N_LAYERS`	`2`	Number of stacked Pre-LN Transformer encoder layers. The attention projections of each layer add `~4 × D_MODEL²` parameters, plus the feed-forward block; increasing to `4` roughly doubles the encoder's parameter count.
`SEQ_CHUNK`	`5`	Number of consecutive fMRI frames packed into one sequence for the Transformer’s temporal context window. Matches the `seq_len` argument of `build_sequences`.
`DRIFT_NOISE_SCALE`	`0.5`	Standard deviation of zero-mean Gaussian noise added to each genuine sample during benchmark score generation to simulate session-to-session signal drift. Set to `0.0` to measure zero-noise performance; increase toward `1.0` to stress-test robustness.
`N_IMPOSTORS`	`200`	Number of random Gaussian draws used as impostor samples when computing EER and d-prime in `model.py`. Increase to `1000`+ for tighter statistical estimates.
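
The divisibility constraint between `D_MODEL` and `N_HEAD` can be checked before any model is built. The following is a minimal sketch; `head_dim` is a hypothetical helper and is not part of `model.py`.

```python
D_MODEL = 256
N_HEAD = 8


def head_dim(d_model, n_head):
    """Per-head width; raises if the heads do not evenly divide the model width."""
    if d_model % n_head != 0:
        raise ValueError(f"D_MODEL={d_model} is not divisible by N_HEAD={n_head}")
    return d_model // n_head


print(head_dim(D_MODEL, N_HEAD))  # 32
print(head_dim(D_MODEL, 4))       # 64
```

Running a check like this right after editing the constants surfaces an invalid combination (for example `N_HEAD = 6`) immediately instead of at model construction time.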

Training Hyperparameters

`run_integrated_pipeline()` in `main.py`

The `main.py` training loop operates over the full labelled dataset with random triplet mining. These values are hard-coded inside `run_integrated_pipeline()` and can be adjusted by editing the function directly.

Parameter	Value	Description
`epochs`	`100`	Number of full passes through the triplet-mined training set. Increase for richer or larger datasets; 100 is sufficient for the TRIBEv2 five-class prediction set.
`batch_size`	`min(128, N)`	Number of triplets sampled per epoch, capped at the dataset size `N`. Reduce to `32` if GPU memory is constrained.
`lr`	`1e-4`	AdamW learning rate. This is a typical value for Transformer fine-tuning; lower it to `5e-5` if the loss oscillates.
`weight_decay`	`1e-4`	Decoupled weight-decay coefficient in AdamW (applied directly to the weights rather than as an L2 term in the loss). Helps prevent the embedding head from collapsing to a degenerate solution.
`triplet_margin`	`0.3`	Minimum enforced positive-negative distance gap in `NeuralVaultFewShot.triplet_loss`. Increase to `0.5` to enforce stricter separation; decrease if too many hard negatives cause slow convergence.
`grad_clip`	`1.0`	Maximum gradient norm passed to `nn.utils.clip_grad_norm_`. Prevents exploding gradients during early training when embeddings are random.
`n_episodes`	`40`	Number of few-shot evaluation episodes used to compute mean accuracy, F1, and ROC-AUC. Increase to `100` for lower-variance metric estimates.
`n_shot_train`	`4`	Number of support samples per class in each evaluation episode. This is the “4-shot” regime. Increase to `8` or `16` if more labeled data is available.
`n_query`	`4`	Number of query samples per class per episode used to measure classification performance.
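
The effect of `triplet_margin` is easiest to see on a toy example. The sketch below implements the usual hinge-style triplet loss, max(0, d(a, p) − d(a, n) + margin). It assumes Euclidean distance, which may not match the metric used inside `NeuralVaultFewShot.triplet_loss`, and the function names are illustrative only.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Zero once the negative is at least `margin` farther from the anchor than the positive."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


anchor = [1.0, 0.0]
positive = [0.9, 0.1]
easy_negative = [-1.0, 0.0]  # far from the anchor
hard_negative = [0.8, 0.2]   # only slightly farther than the positive

print(triplet_loss(anchor, positive, easy_negative))       # 0.0
print(triplet_loss(anchor, positive, hard_negative) > 0)   # True
print(triplet_loss(anchor, positive, hard_negative, margin=0.5)
      > triplet_loss(anchor, positive, hard_negative, margin=0.3))  # True
```

Easy negatives contribute no gradient, while a larger margin turns more triplets into active (non-zero) ones, which is the trade-off described in the `triplet_margin` row.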

`model.py` Training Loop

The `model.py` training loop targets a single enrolled user, using the user’s own sequences as anchors and positives and random noise as negatives. A cosine annealing schedule decays the learning rate across all epochs.

Parameter	Value	Description
`epochs`	`150`	Training epochs for the enrolled-user model. More epochs improve intra-user compactness but offer diminishing returns past the point where loss plateaus.
`lr`	`1e-4`	AdamW initial learning rate.
`weight_decay`	`1e-4`	AdamW decoupled weight-decay coefficient.
`scheduler`	`CosineAnnealingLR(T_max=150)`	Decays the learning rate from `lr` to near zero following a cosine curve over 150 epochs. Change `T_max` to match `epochs` if you adjust training length.
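
The shape of the schedule can be previewed without running training. The sketch below evaluates the closed-form cosine annealing curve, lr(t) = eta_min + 0.5 · (lr − eta_min) · (1 + cos(π · t / T_max)), with `eta_min = 0` assumed (the PyTorch default). The function name `cosine_annealed_lr` is hypothetical.

```python
import math


def cosine_annealed_lr(epoch, base_lr=1e-4, t_max=150, eta_min=0.0):
    """Closed-form cosine annealing value at a given epoch."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max))


for epoch in (0, 75, 150):
    print(epoch, cosine_annealed_lr(epoch))
```

The rate starts at `lr`, reaches half of it at the midpoint, and is close to zero at `T_max`. Under this closed form the rate climbs again if training continues past `T_max`, which is why the two values should be changed together.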

Tuning Guidance

When to change LATENT_DIM

`LATENT_DIM` controls the trade-off between representation capacity and generalisation. The default of `40` in `model.py` is appropriate for a single enrolled user with a small fMRI session. If you are working with multiple enrolled users simultaneously (multi-class scenario), consider increasing to `64` or `128` so that each class can carve out a distinct region of the hypersphere. For very simple two-class tasks, reducing to `16` can improve speed without hurting EER.

When to change epochs

Both training loops use a fixed epoch count rather than early stopping. If your loss curve plateaus before the final epoch (100 in `main.py`, 150 in `model.py`; visible in the E: Training Loss Curve dashboard panel), you can reduce `epochs` to save wall-clock time. Conversely, if the loss is still decreasing at the final epoch, double the epoch count and check whether EER improves. As a rule of thumb, `5 × (dataset_size / batch_size)` epochs is a reasonable starting point for new datasets.
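
One way to judge a plateau from logged losses rather than by eye is sketched below. `plateau_epoch`, its `window` and `rel_tol` defaults, and the synthetic loss curve are all assumptions for illustration; neither training loop ships such a helper.

```python
import math


def plateau_epoch(losses, window=10, rel_tol=0.01):
    """Return the first epoch whose loss improved by less than `rel_tol`
    (relative) compared with `window` epochs earlier, or None if none did."""
    for epoch in range(window, len(losses)):
        previous = losses[epoch - window]
        if previous <= 0:
            continue  # relative improvement is undefined for non-positive losses
        if (previous - losses[epoch]) / previous < rel_tol:
            return epoch
    return None


# Synthetic curve: exponential decay towards a floor of 0.1 over 150 epochs.
losses = [0.1 + 0.9 * math.exp(-epoch / 10) for epoch in range(150)]
print(plateau_epoch(losses))

# A loss that keeps halving never plateaus within the logged range.
print(plateau_epoch([2.0 ** -epoch for epoch in range(20)]))  # None
```

If the returned epoch is well below the configured count, the remaining epochs are likely spending wall-clock time for little gain; if the result is `None`, the loss was still improving when training stopped.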

When to change n_shot_train

`n_shot_train = 4` reflects the realistic constraint of limited labeled fMRI data. If your acquisition protocol yields more labeled sessions per stimulus class (for example 10 or 20), increase `n_shot_train` accordingly. Higher shot counts almost always improve few-shot accuracy and lower EER, so this is the most impactful knob to turn once more data is available.
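
Raising `n_shot_train` is only possible when every class has enough samples for both the support and query sets. The sketch below shows one plausible way to draw an episode with disjoint support and query indices; `sample_episode` is a hypothetical helper and the real episode sampler in `main.py` may be organised differently.

```python
import random


def sample_episode(labels, n_shot=4, n_query=4, rng=None):
    """Pick disjoint support and query indices, n_shot and n_query per class."""
    rng = rng if rng is not None else random.Random()
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    support, query = [], []
    needed = n_shot + n_query
    for label, indices in sorted(by_class.items()):
        if len(indices) < needed:
            raise ValueError(f"class {label} has {len(indices)} samples, needs {needed}")
        chosen = rng.sample(indices, needed)  # sampling without replacement
        support.extend(chosen[:n_shot])
        query.extend(chosen[n_shot:])
    return support, query


labels = [i % 5 for i in range(100)]  # 5 classes, 20 samples each (toy data)
support, query = sample_episode(labels, n_shot=4, n_query=4, rng=random.Random(42))
print(len(support), len(query))  # 20 20
```

With 20 samples per class, this toy setup supports at most `n_shot + n_query = 20`, so going to 16-shot with 4 queries works but 17-shot does not.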

When to change SEQ_CHUNK

`SEQ_CHUNK = 5` means the model sees 5 consecutive fMRI frames at once. Increase this value to give the Transformer more temporal context (useful if your stimuli have slow hemodynamic responses), but note that only `floor(N_SAMPLES / SEQ_CHUNK)` sequences are generated, so a large `SEQ_CHUNK` relative to `N_SAMPLES` drastically reduces the number of training sequences. Ensure `N_SAMPLES / SEQ_CHUNK >= 3` so that the triplet training loop can form valid triplets.
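
The arithmetic behind this constraint is short enough to check by hand. In the sketch below, `count_sequences` is a hypothetical helper and the sample count of 100 frames is an arbitrary example, not a value taken from the dataset.

```python
def count_sequences(n_samples, seq_chunk, min_sequences=3):
    """Number of non-overlapping chunks; raises if too few remain for triplets."""
    count = n_samples // seq_chunk  # floor division drops the incomplete tail
    if count < min_sequences:
        raise ValueError(
            f"only {count} sequences from {n_samples} frames with SEQ_CHUNK={seq_chunk}; "
            f"need at least {min_sequences}"
        )
    return count


print(count_sequences(100, 5))   # 20
print(count_sequences(100, 20))  # 5
```

Quadrupling `SEQ_CHUNK` from 5 to 20 cuts the training set from 20 sequences to 5, and a value of 40 would leave only 2, which fails the check.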

When to change DRIFT_NOISE_SCALE

In `model.py`, genuine scores are computed by adding zero-mean Gaussian noise with standard deviation `DRIFT_NOISE_SCALE` to each enrolled sample before calling `verify`. This simulates day-to-day variability in fMRI signal quality. If your acquisition protocol uses a tightly controlled environment and EER benchmarks seem pessimistic, reduce `DRIFT_NOISE_SCALE` to `0.2`. If you expect large inter-session variability (e.g., resting-state vs. task-based scans), increase it toward `1.0` to stress-test the system.
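
The drift perturbation itself can be sketched in a few lines. `add_drift` is a hypothetical name, and the example works on a plain list of floats rather than the tensors `model.py` presumably uses.

```python
import random
import statistics


def add_drift(sample, scale=0.5, rng=None):
    """Add zero-mean Gaussian noise with standard deviation `scale` to each value."""
    rng = rng if rng is not None else random.Random()
    return [x + rng.gauss(0.0, scale) for x in sample]


rng = random.Random(42)
enrolled = [0.0] * 10000  # flat toy sample so the noise is easy to measure

for scale in (0.0, 0.2, 0.5, 1.0):
    drifted = add_drift(enrolled, scale, rng)
    print(scale, round(statistics.pstdev(drifted), 2))
```

With `scale=0.0` the sample passes through unchanged, which corresponds to the zero-noise measurement mentioned in the constants table.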

Both `main.py` and `model.py` set `SEED` independently. If you change one, change the other to keep experiments fully reproducible.

When benchmarking on a new dataset, start by changing only `N_CLASSES` and `LATENT_DIM`. Leave all learning-rate and epoch settings at their defaults until you have a baseline EER, then tune from there.
