Neural Vault ingests TRIBEv2 cortical predictions stored in 5classpreds.csv, a long-format table where each row encodes one brain region’s prediction for a given video stimulus at a specific timestep. Before any model sees the data, four operations are applied in sequence: the long table is pivoted into a wide feature matrix keyed on (video, timestep), invalid values are sanitized, the matrix is z-score normalized with StandardScaler, and finally rows are chunked into fixed-length temporal windows compatible with the Transformer encoder.
Loading and Preparing Data
load_and_prepare_data() is the single entry point for raw data ingestion. It reads the CSV, assigns integer labels to each unique video name (sorted alphabetically for determinism), derives a feat_idx column by counting rows within each (video, timestep) group, and then pivots to produce a 2-D feature matrix X alongside a label vector y.
What each step does
- label_map: sorting df['video'].unique() before enumeration makes label assignment reproducible across runs, regardless of CSV row ordering.
- df.groupby(['video', 'timestep']).cumcount(): within each (video, timestep) group, cumcount() assigns a unique integer index to every row. This becomes the column index in the pivoted matrix, so each TRIBEv2 brain-region prediction occupies its own column.
- pd.pivot_table(...): reshapes the long-format table. The index=['video', 'timestep'] pairs each become a single row; the columns='feat_idx' values fan out into feature columns; values='prediction' fills each cell; aggfunc='first' handles any accidental duplicates.
- np.nan_to_num: TRIBEv2 predictions for regions with no signal may be NaN or ±inf. Replacing these with 0.0 / ±1.0 before scaling prevents StandardScaler from propagating poison values.
- StandardScaler().fit_transform(X): subtracts the column mean and divides by the column standard deviation, producing a zero-mean, unit-variance matrix. See Why normalization matters below.
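The steps above can be sketched roughly as follows. This is a reconstruction from the description, not the project's actual code. The column names video, timestep and prediction come from the text; the function signature (a DataFrame instead of a CSV path), the toy data, and the exact nan_to_num arguments (nan=0.0, posinf=1.0, neginf=-1.0) are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler


def load_and_prepare_data(df: pd.DataFrame):
    """Sketch: pivot a long-format prediction table into (X, y).

    Assumption: the real function reads the CSV itself; a DataFrame is
    accepted here so the example stays self-contained.
    """
    # Sorted unique names give a deterministic video -> integer label map.
    label_map = {name: i for i, name in enumerate(sorted(df["video"].unique()))}

    # Number the rows inside each (video, timestep) group: 0, 1, 2, ...
    df = df.copy()
    df["feat_idx"] = df.groupby(["video", "timestep"]).cumcount()

    # One row per (video, timestep), one column per feat_idx.
    wide = pd.pivot_table(
        df,
        index=["video", "timestep"],
        columns="feat_idx",
        values="prediction",
        aggfunc="first",
    )

    X = wide.to_numpy(dtype=np.float32)
    y = wide.index.get_level_values("video").map(label_map).to_numpy()

    # Assumed replacement values: NaN -> 0.0, +inf -> 1.0, -inf -> -1.0.
    X = np.nan_to_num(X, nan=0.0, posinf=1.0, neginf=-1.0)
    X = StandardScaler().fit_transform(X).astype(np.float32)
    return X, y


# Toy long-format table: 2 videos x 2 timesteps x 3 regions.
rows = [
    {"video": video, "timestep": t, "prediction": float(region + t)}
    for video in ["b_clip", "a_clip"]
    for t in range(2)
    for region in range(3)
]
toy = pd.DataFrame(rows)
toy.loc[0, "prediction"] = np.nan  # a region with no signal
toy.loc[5, "prediction"] = np.inf

X, y = load_and_prepare_data(toy)
print(X.shape, y)  # (4, 3) [0 0 1 1]
```

Note that a_clip receives label 0 even though it appears second in the table, because labels follow the sorted video names rather than row order.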
Building Sequence Windows
The Transformer encoder in model.py expects 3-D tensors of shape (B, T, F), that is, batches of temporal windows. build_sequences() chops the flat (N, F) matrix into non-overlapping windows of length SEQ_CHUNK (default 5 frames).
The data is first truncated to (len(data) // seq_len) * seq_len rows, so the total row count is an exact multiple of seq_len before the reshape. Any trailing rows that do not fill a complete window are dropped, and no padding is needed. The resulting tensor has dtype torch.float32 (inherited from the input np.float32 array).
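A minimal sketch of this windowing, assuming the input is a NumPy float32 array. The name build_sequences and the SEQ_CHUNK default come from the text; the body is an illustrative guess and may differ from the code in model.py.

```python
import numpy as np
import torch

SEQ_CHUNK = 5  # default window length stated in the text


def build_sequences(data: np.ndarray, seq_len: int = SEQ_CHUNK) -> torch.Tensor:
    """Sketch: split a flat (N, F) matrix into non-overlapping (B, T, F) windows."""
    n_usable = (len(data) // seq_len) * seq_len  # largest multiple of seq_len
    trimmed = data[:n_usable]                    # incomplete tail is dropped
    # from_numpy keeps the array dtype, so float32 in gives torch.float32 out.
    return torch.from_numpy(trimmed).reshape(
        n_usable // seq_len, seq_len, data.shape[1]
    )


flat = np.arange(12 * 3, dtype=np.float32).reshape(12, 3)  # N=12, F=3
windows = build_sequences(flat)
print(windows.shape)  # torch.Size([2, 5, 3]); rows 10 and 11 are dropped
```

With 12 rows and a window length of 5, only the first 10 rows are used, giving two windows.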
build_sequences is used in model.py for the enrollment and verification paths, where the full temporal context is available. In main.py, the simpler build_sequence_tensor wrapper is used during training because each data point is treated as a single-frame sequence.
Single-Frame Sequence Wrapper
main.py uses a lightweight wrapper that converts the flat (N, F) matrix directly into a 3-D tensor without temporal chunking, treating every sample as a sequence of length 1.
unsqueeze(1) inserts a length-1 time dimension, yielding shape (N, 1, F). The output is already 3-D, so NeuralVaultFewShot.forward’s x.dim() == 2 guard is not triggered — the guard exists only for callers that pass a bare (B, F) 2-D tensor directly, without going through build_sequence_tensor.
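The wrapper and the guard can be illustrated with a short sketch. build_sequence_tensor is named in the text, but its body here is an assumption. ensure_3d is a hypothetical stand-in for the x.dim() == 2 check; what the real guard in NeuralVaultFewShot.forward does with a 2-D input is not shown in the source, so the unsqueeze is a guess.

```python
import numpy as np
import torch


def build_sequence_tensor(X: np.ndarray) -> torch.Tensor:
    """Sketch: treat every sample as a length-1 sequence, (N, F) -> (N, 1, F)."""
    return torch.as_tensor(X, dtype=torch.float32).unsqueeze(1)


def ensure_3d(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in for the x.dim() == 2 guard in forward()."""
    if x.dim() == 2:        # bare (B, F) input
        x = x.unsqueeze(1)  # assumed behaviour: add a length-1 time axis
    return x


X = np.zeros((8, 16), dtype=np.float32)
seq = build_sequence_tensor(X)
print(seq.shape)                             # torch.Size([8, 1, 16])
print(ensure_3d(seq).shape)                  # unchanged: torch.Size([8, 1, 16])
print(ensure_3d(torch.from_numpy(X)).shape)  # torch.Size([8, 1, 16])
```

Both paths end up with the same (N, 1, F) shape; the guard only matters when a caller skips the wrapper.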
Data Shape Summary
The transformation chain from raw CSV to model-ready tensors proceeds as follows:

| Stage | Shape | Notes |
|---|---|---|
| Raw CSV | (rows,) long-format | One prediction per row |
| After pivot | (N_samples, N_features) | One sample per (video, timestep) pair |
| After np.nan_to_num + StandardScaler | (N_samples, N_features) | Float32, zero mean, unit variance |
| build_sequence_tensor output | (N_samples, 1, N_features) | Single-frame sequences for training |
| build_sequences output | (B, SEQ_CHUNK, N_features) | Multi-frame windows for enrollment/verification |
Why Z-Score Normalization Matters
fMRI and cortical prediction signals exhibit two systematic sources of amplitude variation that are irrelevant to identity:

- DC offset: baseline BOLD signal levels differ across brain regions and scanning sessions due to scanner drift and physiological variation. Subtracting the column mean removes this offset entirely.
- Amplitude scaling: cortical regions with higher prediction variance would otherwise dominate dot-product computations in the Transformer’s attention layers. Dividing by the column standard deviation puts all features on equal footing.
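Both effects can be checked numerically. The sketch below uses synthetic data (not TRIBEv2 output) and a hand-written zscore helper that mirrors what StandardScaler computes per column, using the population standard deviation. The second column is the first one with an arbitrary offset and gain applied, and the two become identical after normalization.

```python
import numpy as np


def zscore(X: np.ndarray) -> np.ndarray:
    """Column-wise z-score: subtract the mean, divide by the population std."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


rng = np.random.default_rng(0)
base = rng.normal(size=200)

# Column 1 carries the same signal as column 0, shifted by 50 and scaled by 8.
raw = np.column_stack([base, 50.0 + 8.0 * base])
z = zscore(raw)

same = np.allclose(z[:, 0], z[:, 1])
print(same)  # True: offset and gain no longer distinguish the columns
```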
In model.py, normalization is applied online during verification using statistics computed from the enrolled user’s data only: feat_mean and feat_std are stored at enrollment time and reused for every subsequent authentication. This prevents impostor data from influencing the normalization parameters.
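The enrollment-time statistics idea might look like the following sketch. Only the names feat_mean and feat_std come from the text. The helper functions fit_enrollment_stats and normalize_with_stats, the eps guard against zero variance, and the synthetic data are hypothetical and may not match model.py.

```python
import numpy as np


def fit_enrollment_stats(enroll_X: np.ndarray, eps: float = 1e-8):
    """Compute per-feature statistics from the enrolled user's data only."""
    feat_mean = enroll_X.mean(axis=0)
    feat_std = enroll_X.std(axis=0) + eps  # eps avoids division by zero
    return feat_mean, feat_std


def normalize_with_stats(X: np.ndarray, feat_mean, feat_std) -> np.ndarray:
    """Apply stored statistics; nothing is re-estimated from X itself."""
    return (X - feat_mean) / feat_std


rng = np.random.default_rng(0)
enrolled = rng.normal(loc=3.0, scale=2.0, size=(50, 4))
feat_mean, feat_std = fit_enrollment_stats(enrolled)  # stored at enrollment

# A later probe (genuine or impostor) is scaled with the stored values.
probe = rng.normal(loc=-1.0, scale=5.0, size=(10, 4))
probe_z = normalize_with_stats(probe, feat_mean, feat_std)
print(probe_z.shape)  # (10, 4)
```

Because the probe never contributes to feat_mean or feat_std, each probe row is normalized the same way regardless of what else is submitted alongside it.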