SSRL-ECG is distributed as a standard Python package with a pyproject.toml-based build. The recommended path is to create an isolated virtual environment, clone the repository, and install in editable mode so that every module under src/ssrl_ecg/ is immediately importable without re-installing after code changes.

Prerequisites

Before you begin, ensure your system has:
  • Python 3.9 or higher: required by pyproject.toml
  • Git: to clone the repository
  • pip 21+: usually satisfied by the pip bundled with current Python installers; upgrade with pip install --upgrade pip if needed
  • CUDA toolkit (optional): for GPU-accelerated training; SSRL-ECG auto-detects the device via choose_device()
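
The prerequisites above can be checked from Python before cloning anything. The following is a minimal sketch, not part of SSRL-ECG: the helper names (python_ok, pip_ok, installed_pip_version) are hypothetical, and the thresholds are simply the ones stated in the list.

```python
import shutil
import sys
from importlib import metadata

MIN_PYTHON = (3, 9)   # minimum Python version from the prerequisites list
MIN_PIP_MAJOR = 21    # minimum pip major version from the prerequisites list


def python_ok(version_info=sys.version_info) -> bool:
    """Return True if the interpreter is at least Python 3.9."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def pip_ok(pip_version: str) -> bool:
    """Return True if a pip version string such as '23.2.1' has major >= 21."""
    major = pip_version.split(".")[0]
    return major.isdigit() and int(major) >= MIN_PIP_MAJOR


def installed_pip_version():
    """Return the installed pip version string, or None if pip is not found."""
    try:
        return metadata.version("pip")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    pip_version = installed_pip_version()
    print("Python >= 3.9:", python_ok())
    print("git on PATH:  ", shutil.which("git") is not None)
    print("pip >= 21:    ", pip_version is not None and pip_ok(pip_version))
```

Any line that prints False points at the prerequisite to fix before continuing. The CUDA toolkit is optional and is not checked here.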

Step-by-Step Installation

1. Clone the repository

Download the source code to your local machine.
git clone https://github.com/Tumo505/SSL-for-ECG-classification.git
cd SSL-for-ECG-classification

2. Create and activate a virtual environment

Isolate SSRL-ECG’s dependencies from your system Python installation.
python -m venv .venv
source .venv/bin/activate
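After activation, your shell prompt is prefixed with (.venv). The source command above applies to Linux and macOS shells; on Windows, run .venv\Scripts\activate instead.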

3. Install the package in editable mode

Run the editable install from the repository root. This resolves all declared dependencies automatically.
pip install -e .
The -e (editable) flag links your Python environment to the src/ssrl_ecg/ directory in the clone rather than copying files into site-packages, so any change you make to the source takes effect on the next import with no reinstall step. A regular pip install . also makes ssrl_ecg importable, but source edits then require a reinstall. If the package is not installed at all, Python cannot resolve import ssrl_ecg when training scripts are run as python -m ssrl_ecg.*.

4. Verify the installation

Confirm the package is importable and all core dependencies are satisfied.
python -c "import ssrl_ecg; print('OK')"
Expected output:
OK
If you see ModuleNotFoundError: No module named 'ssrl_ecg', ensure you ran pip install -e . from inside the repository root with your virtual environment active.
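
When the import check fails, it can help to see where Python would load the package from, without actually importing it. The sketch below uses only the standard library; module_location is a hypothetical helper name, not something SSRL-ECG provides.

```python
import importlib.util


def module_location(name: str):
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # For a regular package this is the path to its __init__.py.
    return spec.origin


if __name__ == "__main__":
    location = module_location("ssrl_ecg")
    if location is None:
        print("ssrl_ecg not found: activate the venv and run `pip install -e .`")
    else:
        print("ssrl_ecg resolves to", location)
```

With an editable install, the reported path would be expected to point into the clone's src/ssrl_ecg/ directory rather than into site-packages.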

Installed Dependencies

The editable install pulls in all packages declared in pyproject.toml. No separate pip install -r requirements.txt step is needed.
Package        Minimum Version   Purpose
torch          2.2               Neural network training and GPU acceleration
numpy          1.24              Numerical array operations
pandas         2.0               Metadata loading and DataFrame processing
scikit-learn   1.3               Metrics, stratified sampling, label encoding
wfdb           4.1               Reading PTB-XL and MIT-BIH WFDB-format signal files
tqdm           4.66              Progress bars for training loops
pyyaml         6.0               Configuration file parsing
scipy          1.10              Signal processing (bandpass filtering, resampling)
matplotlib     3.5               Plotting and figure generation
seaborn        0.12              Publication-ready statistical visualizations
xgboost        2.0               Gradient boosting baseline models
lightgbm       4.0               Gradient boosting baseline models
Optional extras are declared in pyproject.toml. Install the development extras (Black formatter + pytest) with pip install -e ".[dev]", or the full analysis stack with pip install -e ".[ml,analysis]".
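
To confirm that the active environment meets the minimum versions listed in the table, a short script can compare installed versions against them. This is a sketch under stated assumptions: the helper names are hypothetical, the minimums are copied from the table rather than read from pyproject.toml, and the comparison is simplified to the leading numeric components, so pre-release and local suffixes such as rc1 or +cu121 are ignored.

```python
import re
from importlib import metadata

# Minimum versions copied from the dependency table (not read from pyproject.toml).
MINIMUMS = {
    "torch": "2.2", "numpy": "1.24", "pandas": "2.0", "scikit-learn": "1.3",
    "wfdb": "4.1", "tqdm": "4.66", "pyyaml": "6.0", "scipy": "1.10",
    "matplotlib": "3.5", "seaborn": "0.12", "xgboost": "2.0", "lightgbm": "4.0",
}


def numeric_prefix(version: str) -> tuple:
    """Parse the leading dotted integers of a version string into a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    return tuple(int(part) for part in match.group().split(".")) if match else ()


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare versions numerically, padding with zeros so '2' equals '2.0'."""
    have, need = numeric_prefix(installed), numeric_prefix(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


if __name__ == "__main__":
    for name, minimum in MINIMUMS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            print(f"{name:<14} missing (need >= {minimum})")
            continue
        status = "ok" if meets_minimum(installed, minimum) else "too old"
        print(f"{name:<14} {installed:<16} {status}")
```

The comparison is numeric rather than string-based because a plain string comparison would rank 1.9 above 1.10.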

Dataset Setup

SSRL-ECG expects datasets to be present under a data/ directory relative to wherever you run training commands. Create this structure before launching any training script.

Expected Folder Structure

data/
├── PTB-XL/                        # Primary dataset (required)
│   ├── ptbxl_database.csv         # Patient metadata and fold assignments
│   ├── scp_statements.csv         # Diagnostic label definitions
│   ├── records100/                # 100 Hz downsampled recordings
│   │   ├── 00000/
│   │   │   ├── *.hea
│   │   │   └── *.dat
│   │   └── ...
│   └── records500/                # 500 Hz full-resolution recordings
│       └── ...
└── MIT-BIH/                       # Secondary dataset (optional)
    └── files/
        └── mitdb/
            └── 1.0.0/
                ├── *.hea
                ├── *.dat
                └── *.atr
PTB-XL is freely available from PhysioNet. Download and extract the archive so that ptbxl_database.csv sits directly inside data/PTB-XL/.
The --data-root argument in every training and evaluation script defaults to data/PTB-XL. If you store PTB-XL elsewhere, pass the correct path explicitly, for example --data-root /mnt/datasets/PTB-XL.
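
Before launching a training run, the PTB-XL layout can be sanity-checked with a few lines of Python. The sketch below is not part of SSRL-ECG; missing_ptbxl_entries is a hypothetical helper, and it treats all four entries shown directly under data/PTB-XL/ in the tree as expected. Whether both records100/ and records500/ are actually needed depends on your training configuration, so drop one from the tuple if it does not apply.

```python
from pathlib import Path

# Entries shown directly under data/PTB-XL/ in the folder tree.
# Assumption: all four are expected; adjust to match the sampling rate you use.
EXPECTED_ENTRIES = (
    "ptbxl_database.csv",
    "scp_statements.csv",
    "records100",
    "records500",
)


def missing_ptbxl_entries(data_root="data/PTB-XL") -> list:
    """Return the expected entries that do not exist under data_root."""
    root = Path(data_root)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_ptbxl_entries()
    if missing:
        print("Missing under data/PTB-XL:", ", ".join(missing))
    else:
        print("PTB-XL layout looks complete")
```

If PTB-XL lives elsewhere, pass the same path you would give to --data-root, for example missing_ptbxl_entries("/mnt/datasets/PTB-XL").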

GPU Setup

SSRL-ECG selects the compute device automatically via the choose_device() utility — no manual device configuration is required. The priority order is:
  1. CUDA — used if torch.cuda.is_available() returns True
  2. MPS — used on Apple Silicon Macs if CUDA is unavailable
  3. CPU — fallback for all other environments
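
The priority order above can be illustrated with a small sketch. This is not the project's choose_device() implementation, whose source is not shown here; pick_device is a hypothetical stand-in that only encodes the stated CUDA, then MPS, then CPU ordering.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return a device name following the CUDA > MPS > CPU priority order."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        # Both availability checks are standard PyTorch calls.
        device = pick_device(
            torch.cuda.is_available(),
            torch.backends.mps.is_available(),
        )
        print("Selected device:", device)
```

Keeping the decision in a pure function of two booleans makes the ordering easy to check without GPU hardware.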
Verify your GPU is visible to PyTorch:
python -c "import torch; print('CUDA available:', torch.cuda.is_available())"
nvidia-smi
If CUDA is unavailable, training will run on CPU. SimCLR pretraining for 20 epochs on PTB-XL is feasible on a modern multi-core CPU, but will be significantly slower than on a GPU. Consider reducing --batch-size to 64 if you encounter memory pressure on smaller GPUs.
