SSRL-ECG is distributed as a standard Python package with a `pyproject.toml`-based build. The recommended path is to clone the repository, create an isolated virtual environment, and install in editable mode so that every module under `src/ssrl_ecg/` is immediately importable without re-installing after code changes.
Prerequisites
Before you begin, ensure your system has:
- Python 3.9 or higher: required by `pyproject.toml`
- Git: to clone the repository
- pip 21+: ships with Python 3.9; upgrade with `pip install --upgrade pip` if needed
- CUDA toolkit (optional): for GPU-accelerated training; SSRL-ECG auto-detects the device via `choose_device()`
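The prerequisites above can be checked from Python before going further. The following is a minimal sketch using only the standard library; the helper names (`python_ok`, `git_available`, `pip_major_version`) are illustrative and are not part of SSRL-ECG.

```python
import shutil
import sys
from importlib.metadata import PackageNotFoundError, version

MIN_PYTHON = (3, 9)  # minimum stated in the prerequisites list
MIN_PIP_MAJOR = 21   # "pip 21+" from the prerequisites list


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the interpreter meets the minimum (major, minor) version."""
    return tuple(version_info[:2]) >= minimum


def git_available():
    """Return True if a `git` executable can be found on PATH."""
    return shutil.which("git") is not None


def pip_major_version():
    """Return pip's major version as an int, or None if pip is not installed."""
    try:
        return int(version("pip").split(".")[0])
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    pip_major = pip_major_version()
    print("Python >= 3.9:", python_ok())
    print("git on PATH:  ", git_available())
    print("pip >= 21:    ", pip_major is not None and pip_major >= MIN_PIP_MAJOR)
```

The CUDA toolkit is optional and is not checked here; device detection is covered in the GPU Setup section.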
Step-by-Step Installation
Clone the repository
Download the source code to your local machine.
Create and activate a virtual environment
Isolate SSRL-ECG’s dependencies from your system Python installation. After activation, your shell prompt will be prefixed with `(.venv)`.
Install the package in editable mode
Run the editable install from the repository root. This resolves all declared dependencies automatically.
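As a hedged illustration, the editable install can also be driven from Python, which guarantees that `pip` runs under the same interpreter as the active virtual environment. The sketch below assumes the standard `pip install -e .` form implied by the `-e` flag; the function names are hypothetical and nothing here is part of SSRL-ECG itself.

```python
import subprocess
import sys
from pathlib import Path


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


def editable_install_command(repo_root="."):
    """Build `python -m pip install -e <repo_root>` for the current interpreter."""
    return [sys.executable, "-m", "pip", "install", "-e", str(Path(repo_root))]


def run_editable_install(repo_root="."):
    """Run the editable install; refuse to touch a system-wide interpreter."""
    if not in_virtualenv():
        raise RuntimeError("No active virtual environment; activate .venv first.")
    # repo_root should be the directory that contains pyproject.toml.
    subprocess.run(editable_install_command(repo_root), check=True)
```

Calling `run_editable_install(".")` from the repository root performs the install; the guard on `in_virtualenv()` is a design choice to avoid modifying the system Python by accident.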
The `-e` (editable) flag installs the package by linking your Python environment to `src/ssrl_ecg/` rather than copying files, so any change you make to the source is reflected immediately, with no reinstall step required. If the package is not installed at all, Python cannot resolve `import ssrl_ecg` when running training scripts as `python -m ssrl_ecg.*`, because the code lives under `src/` and is not on the import path by default.
Installed Dependencies
The editable install pulls in all packages declared in `pyproject.toml`. No separate `pip install -r requirements.txt` step is needed.
| Package | Minimum Version | Purpose |
|---|---|---|
| `torch` | 2.2 | Neural network training and GPU acceleration |
| `numpy` | 1.24 | Numerical array operations |
| `pandas` | 2.0 | Metadata loading and DataFrame processing |
| `scikit-learn` | 1.3 | Metrics, stratified sampling, label encoding |
| `wfdb` | 4.1 | Reading PTB-XL and MIT-BIH WFDB-format signal files |
| `tqdm` | 4.66 | Progress bars for training loops |
| `pyyaml` | 6.0 | Configuration file parsing |
| `scipy` | 1.10 | Signal processing (bandpass filtering, resampling) |
| `matplotlib` | 3.5 | Plotting and figure generation |
| `seaborn` | 0.12 | Publication-ready statistical visualizations |
| `xgboost` | 2.0 | Gradient boosting baseline models |
| `lightgbm` | 4.0 | Gradient boosting baseline models |
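To confirm that the installed versions meet the minimums in the table, a small check along the following lines may help. This is a sketch, not project tooling: the minimums are copied from the table, the version parsing is deliberately simple (numeric components only), and the helper names are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions copied from the dependency table.
MINIMUMS = {
    "torch": "2.2", "numpy": "1.24", "pandas": "2.0", "scikit-learn": "1.3",
    "wfdb": "4.1", "tqdm": "4.66", "pyyaml": "6.0", "scipy": "1.10",
    "matplotlib": "3.5", "seaborn": "0.12", "xgboost": "2.0", "lightgbm": "4.0",
}


def parse(ver):
    """Turn '2.2.1+cu121' into (2, 2, 1), stopping at the first non-numeric part."""
    parts = []
    for piece in ver.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings after padding them to equal length."""
    a, b = parse(installed), parse(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_dependencies(minimums=MINIMUMS):
    """Map each package name to 'ok', 'too old (<version>)', or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = meets_minimum(installed, minimum)
        report[name] = "ok" if ok else f"too old ({installed})"
    return report


if __name__ == "__main__":
    for name, status in check_dependencies().items():
        print(f"{name:<14}{status}")
```

Pre-release or otherwise unusual version strings are only compared on their leading numeric components, so treat the result as a quick sanity check rather than a strict resolver.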
Dataset Setup
SSRL-ECG expects datasets to be present under a `data/` directory relative to wherever you run training commands. Create this structure before launching any training script.
Expected Folder Structure
In the expected layout, `ptbxl_database.csv` sits directly inside `data/PTB-XL/`.
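A quick way to verify this part of the layout before training is sketched below. It checks only the one path stated here (`data/PTB-XL/ptbxl_database.csv`); any other files or datasets the project expects are not covered, and `check_ptbxl_layout` is a hypothetical helper, not an SSRL-ECG function.

```python
from pathlib import Path


def check_ptbxl_layout(data_root="data"):
    """Return a list of problems; an empty list means the checked paths exist."""
    ptbxl_dir = Path(data_root) / "PTB-XL"
    metadata_csv = ptbxl_dir / "ptbxl_database.csv"
    problems = []
    if not ptbxl_dir.is_dir():
        problems.append(f"missing directory: {ptbxl_dir}")
    elif not metadata_csv.is_file():
        problems.append(f"missing file: {metadata_csv}")
    return problems


if __name__ == "__main__":
    # Run from the directory where training commands will be launched.
    issues = check_ptbxl_layout("data")
    print("layout OK" if not issues else "\n".join(issues))
```

Because the path is resolved relative to the current working directory, run the check from the same directory you will use for training.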
GPU Setup
SSRL-ECG selects the compute device automatically via the `choose_device()` utility, so no manual device configuration is required. The priority order is:
- CUDA: used if `torch.cuda.is_available()` returns `True`
- MPS: used on Apple Silicon Macs if CUDA is unavailable
- CPU: fallback for all other environments
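The priority order above can be sketched as follows. This is not the project's `choose_device()` implementation, only an illustration of the same CUDA, MPS, CPU ordering; `pick_device_name` and `detect_device_name` are hypothetical names, and the sketch reports `"cpu"` when PyTorch is not installed.

```python
def pick_device_name(cuda_available, mps_available):
    """Apply the documented priority: CUDA first, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device_name():
    """Ask PyTorch which backends are usable and apply the priority order."""
    try:
        import torch
    except ImportError:
        # Assumption for this sketch: without PyTorch, report the CPU fallback.
        return "cpu"
    cuda = torch.cuda.is_available()
    # torch.backends.mps is absent in older PyTorch releases, so guard the lookup.
    mps_backend = getattr(torch.backends, "mps", None)
    mps = mps_backend is not None and mps_backend.is_available()
    return pick_device_name(cuda, mps)


if __name__ == "__main__":
    print("Selected device:", detect_device_name())
```

Keeping the priority logic in a pure function separate from the PyTorch queries makes the ordering easy to test on machines without a GPU.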
If neither CUDA nor MPS is available, training will run on CPU. SimCLR pretraining for 20 epochs on PTB-XL is feasible on a modern multi-core CPU, but will be significantly slower than on a GPU. Consider reducing `--batch-size` to 64 if you encounter memory pressure on smaller GPUs.