Training a self-supervised encoder on PTB-XL teaches it to capture the temporal structure of 12-lead ECG signals. But how well do those learned representations generalise beyond PTB-XL? Transfer evaluation on the MIT-BIH arrhythmia dataset answers this question by applying the pretrained encoder, without retraining on PTB-XL labels, to an entirely different ECG corpus with its own patient population, acquisition hardware, and annotation scheme.
Why Transfer to MIT-BIH?
Strong cross-dataset transfer is one of the most compelling arguments for self-supervised pretraining. A supervised model trained on PTB-XL diagnostic classes may overfit to PTB-XL-specific signal characteristics. An SSL encoder that generalises to MIT-BIH arrhythmia detection provides evidence that it has learned genuinely physiological ECG structure rather than dataset-specific artefacts.
About the MIT-BIH Arrhythmia Dataset
MIT-BIH is a widely used benchmark released by PhysioNet (MIT-BIH Arrhythmia Database v1.0.0). Key characteristics:
- 48 half-hour ambulatory ECG recordings from 47 subjects at Boston’s Beth Israel Hospital
- 2-lead recordings (typically MLII plus one of V1, V2, V4, or V5) sampled at 360 Hz
- Arrhythmia annotations at each beat, covering normal sinus rhythm and multiple abnormal rhythms
- Different distribution from PTB-XL: different record lengths (half-hour recordings versus PTB-XL's 10-second records), different leads, beat-level rather than record-level labels, and a distinct patient population
The transfer task is binary arrhythmia classification, matching the ECGClassifier(n_classes=1) used by transfer_mitbih.py.
MIT-BIH requires its own data folder setup. Download the dataset from PhysioNet and place the .hea, .dat, and .atr files under data/MIT-BIH/files/mitdb/1.0.0/. The MITBIHDataset loader expects this exact path structure.
Expected Folder Structure
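The layout described above can be sanity-checked before running the script. The following is a minimal sketch, assuming only the path and the three file extensions stated in this page; the helper name `find_incomplete_records` is hypothetical and not part of the repository.

```python
from pathlib import Path

# Expected layout, per the path given above (record names such as 100 and
# 101 follow the MIT-BIH naming scheme):
#   data/MIT-BIH/files/mitdb/1.0.0/
#       100.hea  100.dat  100.atr
#       101.hea  101.dat  101.atr
#       ...
REQUIRED_SUFFIXES = (".hea", ".dat", ".atr")


def find_incomplete_records(data_root):
    """Return {record_name: [missing suffixes]} for records lacking a file.

    `data_root` is the dataset root, e.g. "data/MIT-BIH".
    Hypothetical helper for illustration only.
    """
    record_dir = Path(data_root) / "files" / "mitdb" / "1.0.0"
    if not record_dir.is_dir():
        raise FileNotFoundError(f"Expected directory not found: {record_dir}")

    # Each header file defines one record; check that its companions exist.
    incomplete = {}
    for header in sorted(record_dir.glob("*.hea")):
        missing = [
            suffix
            for suffix in REQUIRED_SUFFIXES
            if not header.with_suffix(suffix).exists()
        ]
        if missing:
            incomplete[header.stem] = missing
    return incomplete
```

Calling `find_incomplete_records("data/MIT-BIH")` returns an empty dict when every record has its header, signal, and annotation file, and raises `FileNotFoundError` when the subdirectory itself is missing.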
Running the Transfer Evaluation
Prepare a pretrained checkpoint
Use either a raw SSL checkpoint (with an encoder key) or a fine-tuned checkpoint (with a model key containing encoder.* weights).
Run the transfer evaluation script
Point the script at the MIT-BIH data folder and your checkpoint.
For a fine-tuned checkpoint (with a model key), the script automatically extracts the encoder weights from the encoder.* sub-keys.
CLI Arguments
Root directory containing the MIT-BIH data files. Must have the files/mitdb/1.0.0/ subdirectory with .hea, .dat, and .atr files.
Path to a pretrained or fine-tuned checkpoint. Accepted formats:
- {"encoder": <state_dict>}: raw SSL checkpoint
- {"model": {"encoder.*": ..., "classifier.*": ...}}: fine-tuned ECGClassifier checkpoint
Inference batch size. 32 is typically sufficient for MIT-BIH inference.
Parsed by the argument parser but not used by the current main() implementation, which runs inference only. Reserved for future fine-tuning extensions.
Parsed by the argument parser but not used by the current main() implementation. Reserved for future fine-tuning extensions.
When set (and --use-ssl is not set), the encoder parameters have requires_grad = False. Has no effect on the inference output since main() does not run a training loop.
Indicates that the checkpoint is a raw SSL encoder (with an encoder key) rather than a full fine-tuned classifier.
Random seed passed to set_seed for reproducible weight initialisation and data shuffling.
How the Encoder Is Adapted
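Adapting the encoder starts with recovering its weights from whichever of the two accepted checkpoint formats was supplied. The sketch below illustrates one way to do that with plain dictionaries. The function name `extract_encoder_state` is hypothetical, and stripping the `encoder.` prefix is an assumption made so the result could be loaded into a standalone encoder; the actual logic in transfer_mitbih.py may differ.

```python
def extract_encoder_state(checkpoint):
    """Return encoder weights from either accepted checkpoint layout.

    Hypothetical helper for illustration; not the repository's code.
    """
    if "encoder" in checkpoint:
        # Raw SSL checkpoint: {"encoder": <state_dict>}
        return dict(checkpoint["encoder"])

    if "model" in checkpoint:
        # Fine-tuned checkpoint: keep only encoder.* entries and drop the
        # prefix (assumption) so keys match a standalone encoder module.
        prefix = "encoder."
        state = {
            key[len(prefix):]: value
            for key, value in checkpoint["model"].items()
            if key.startswith(prefix)
        }
        if not state:
            raise KeyError("no encoder.* weights found under the 'model' key")
        return state

    raise KeyError("checkpoint has neither an 'encoder' nor a 'model' key")
```

In practice the `checkpoint` dictionary would typically come from `torch.load(path, map_location="cpu")`, and the returned mapping would be passed to the encoder's `load_state_dict`.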
MIT-BIH provides 2-lead recordings, but the PTB-XL-pretrained encoder expects 12-channel input (ECGEncoder1DCNN(in_ch=12, ...)). The MITBIHDataset loader handles this by replicating the 2-lead signal across all 12 channels, preserving the encoder’s input dimensionality without re-architecture. The classification head is then trained from scratch for the binary arrhythmia task.
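The channel replication described above can be sketched with NumPy. This is an illustration under assumptions: the function name `replicate_leads` is hypothetical, and the alternating order (lead 0, lead 1, lead 0, lead 1, and so on) is one plausible choice; the exact replication scheme used by MITBIHDataset is not specified here.

```python
import numpy as np


def replicate_leads(signal, n_channels=12):
    """Tile a (n_leads, n_samples) array along the channel axis.

    Hypothetical helper; the ordering of replicated leads is an assumption.
    """
    signal = np.asarray(signal)
    if signal.ndim != 2:
        raise ValueError("expected an array of shape (n_leads, n_samples)")
    n_leads = signal.shape[0]
    if n_channels % n_leads != 0:
        raise ValueError("n_channels must be a multiple of the number of leads")
    # Repeat the whole lead block: rows become lead0, lead1, lead0, lead1, ...
    return np.tile(signal, (n_channels // n_leads, 1))


# Example: a 2-lead segment of 3600 samples (10 s at 360 Hz) becomes 12 x 3600.
two_lead = np.zeros((2, 3600), dtype=np.float32)
twelve_channel = replicate_leads(two_lead)
```

Replication keeps the pretrained first convolution usable as-is, at the cost of feeding it redundant channels that do not correspond to the 12 leads it saw during pretraining.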
What Transfer Success Tells Us
A high transfer AUROC (above 0.80) on MIT-BIH after minimal fine-tuning indicates that the SSL encoder has captured generalisable physiological ECG representations, not just the label distribution of PTB-XL. Specifically, it shows:
Morphological generalisation
The encoder responds to waveform shape (P-wave, QRS complex, T-wave) regardless of the acquisition system or lead configuration.
Temporal structure capture
Self-supervised pretraining on PTB-XL’s 10-second windows has encoded rhythm information that transfers to MIT-BIH’s ambulatory segments.
Domain shift robustness
Different patient demographics, recording hardware, and annotation protocols do not prevent the encoder from providing a useful feature basis.
Low-data adaptability
A binary head that reaches competitive performance with only 10 fine-tuning epochs suggests that the encoder’s features are readily separable for new tasks.
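The AUROC threshold mentioned above can be checked against model outputs with a small, dependency-free sketch. AUROC is the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, so the raw logits of a single-output head can be passed directly without applying a sigmoid. The function name `auroc` and the example values are illustrative and are not results from this project.

```python
def auroc(labels, scores):
    """Pairwise AUROC: P(score of a positive > score of a negative).

    Ties count as half a win. Runs in O(n_pos * n_neg), which is fine for
    a quick check but slow for very large evaluation sets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example with made-up scores: three of the four positive/negative
# pairs are ranked correctly.
example = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

For a full evaluation set, `sklearn.metrics.roc_auc_score(y_true, y_score)` computes the same quantity more efficiently.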