The ECGAugmentations class is the core augmentation engine in SSRL-ECG. It generates two independently augmented views of the same ECG signal for contrastive learning objectives such as SimCLR and BYOL. Every transform is designed around the physiological and recording properties of 12-lead ECGs, ensuring that augmented views remain medically plausible while being sufficiently diverse to drive representation learning.

Class Overview

from ssrl_ecg.augmentations import ECGAugmentations
ECGAugmentations applies a two-stage pipeline to each input signal: a weak stage (always applied, small perturbations) followed by a strong stage (applied with probability prob_strong, larger structural transforms). The same pipeline is run twice on every input to produce two independent views, x1 and x2.

Constructor

ECGAugmentations(signal_length=5000, sampling_rate=500, prob_strong=0.8)
signal_length
int
default:"5000"
Number of time-steps in the ECG signal. PTB-XL recordings at 500 Hz for 10 seconds yield 5000 samples. Adjust to match your dataset’s temporal resolution.
sampling_rate
int
default:"500"
Sampling rate in Hz. Used by frequency-domain augmentations (e.g. _augment_bandpass_variation) to correctly map frequency bins to physical units. Set to 100 for PTB-XL’s low-resolution variant.
prob_strong
float
default:"0.8"
Probability gate controlling whether the strong augmentation block runs on any given call. At 0.8, roughly four in five augmentation calls include the full set of structural transforms. Reduce to 0.5 during development for faster iteration.

__call__

x1, x2 = aug(x)
Produces two independently augmented views of the input signal by running the full weak + strong pipeline twice on cloned copies of x.
x
torch.Tensor
required
Input ECG signal. Accepts either:
  • 2D [channels, time] for a single sample (e.g. [12, 5000])
  • 3D [batch, channels, time] for a mini-batch (e.g. [32, 12, 5000])
The tensor is cloned before augmentation; the original is never modified.
Returns
x1
torch.Tensor
First augmented view. Shape matches the input exactly: [channels, time] if a 2D tensor was passed, [batch, channels, time] if 3D. Values are clamped to [-10, 10].
x2
torch.Tensor
Second augmented view, produced by an independent pass through the same pipeline. Stochastic operations ensure x1 ≠ x2 in virtually all cases.
Automatic 2D / 3D handling. When a 2D [channels, time] tensor is passed, the class silently inserts a batch dimension internally (unsqueeze(0)) so that all augmentations operate on the canonical 3D shape, then squeezes it back before returning. This means you can use the same augmenter in both dataset __getitem__ methods (single sample) and collated batch pipelines without any code changes.
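The 2D / 3D handling described above can be sketched as a small wrapper. This is a minimal illustration, not the library's code: with_batch_dim and add_noise are hypothetical names, and add_noise is only a stand-in for the real transforms.

```python
import torch

def with_batch_dim(transform, x):
    """Run `transform` on a 3D tensor, adding and removing a batch axis for 2D input."""
    squeeze_back = x.dim() == 2
    if squeeze_back:
        x = x.unsqueeze(0)           # [channels, time] -> [1, channels, time]
    out = transform(x.clone())       # clone so the caller's tensor is never modified
    return out.squeeze(0) if squeeze_back else out

def add_noise(x):
    # Hypothetical stand-in transform that expects [batch, channels, time]
    return x + 0.03 * torch.randn_like(x)

single = with_batch_dim(add_noise, torch.zeros(12, 5000))
batch = with_batch_dim(add_noise, torch.zeros(32, 12, 5000))
```

Because the batch axis is restored to its original state before returning, the same callable can serve both a dataset __getitem__ and a collated batch.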

Usage Example

from ssrl_ecg.augmentations import ECGAugmentations
import torch

aug = ECGAugmentations(signal_length=5000, sampling_rate=500, prob_strong=0.8)

# Single sample (2D input)
x_single = torch.randn(12, 5000)
x1, x2 = aug(x_single)
print(x1.shape)  # torch.Size([12, 5000])

# Batch (3D input)
x_batch = torch.randn(32, 12, 5000)
x1, x2 = aug(x_batch)
print(x1.shape)  # torch.Size([32, 12, 5000])

# Confirm views differ
print(torch.allclose(x1, x2))  # False
print(f"Value range: [{x1.min():.3f}, {x1.max():.3f}]")  # clamped to [-10, 10]
Set prob_strong=0.5 during development and architecture search so that the strong stage runs on about half of the calls instead of four in five, which reduces augmentation time. Restore it to 0.8 for full pretraining runs, where view diversity is critical for learning strong representations.

Augmentation Pipeline

The pipeline is divided into two sequential stages. The weak stage runs on every call; the strong stage runs with probability prob_strong.
The weak stage is entered unconditionally, and each transform inside it is gated by its own probability. These transforms introduce small perturbations that mimic natural recording variability without altering the underlying cardiac signal structure.

| Method | Probability | Simulates |
| --- | --- | --- |
| _weak_jitter | 0.9 | Sensor noise and electrical interference; adds Gaussian noise with std=0.03 |
| _weak_scaling | 0.8 | Recording gain drift across sessions; multiplies the whole signal by 1 ± 15% |
| _augment_channel_noise | 0.6 | Differing signal quality across ECG leads; adds per-channel Gaussian noise at 0.5–2% of each lead’s standard deviation |
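The stage gating can be sketched as follows. This is a simplified illustration under stated assumptions, not the class's actual code: weak_stage, strong_stage and two_views are hypothetical names, only two of the weak transforms are shown, and the strong stage is replaced by a placeholder circular shift.

```python
import torch

def weak_stage(x):
    # Each weak transform has its own probability gate (documented values)
    if torch.rand(1).item() < 0.9:
        x = x + 0.03 * torch.randn_like(x)                     # jitter, std=0.03
    if torch.rand(1).item() < 0.8:
        x = x * (1.0 + (2 * torch.rand(1).item() - 1) * 0.15)  # global scale, 1 +/- 15%
    return x

def strong_stage(x):
    # Placeholder for the structural transforms (assumption): a small circular shift
    shift = int(torch.randint(-50, 51, (1,)).item())
    return torch.roll(x, shifts=shift, dims=-1)

def two_views(x, prob_strong=0.8):
    views = []
    for _ in range(2):                          # two independent passes
        v = weak_stage(x.clone())
        if torch.rand(1).item() < prob_strong:  # strong-stage gate
            v = strong_stage(v)
        views.append(torch.clamp(v, -10, 10))
    return views[0], views[1]

x = torch.randn(4, 12, 5000)
x1, x2 = two_views(x)
```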

Augmentation Method Reference

_weak_jitter adds zero-mean Gaussian noise with standard deviation std (0.03 in the pipeline) to the entire signal tensor. It is applied with probability 0.9 on every call, making it the most consistently active augmentation.
Simulates: Thermal sensor noise, 50/60 Hz electrical interference, and quantisation noise in ADC circuits.
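# Equivalent behaviour (sketch of the internal logic)
import torch
x = torch.randn(12, 5000)                # example ECG tensor
if torch.rand(1).item() < 0.9:           # applied with probability 0.9
    x = x + torch.randn_like(x) * 0.03   # zero-mean Gaussian noise, std=0.03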
_weak_scaling multiplies the entire signal by a scalar drawn uniformly from [1 - scale_range, 1 + scale_range], i.e. [0.85, 1.15] at the default setting. Applied with probability 0.8.
Simulates: Gain drift between recording sessions, automatic gain control variation, and electrode impedance changes.
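A minimal sketch of the global scaling step, assuming a single factor is shared by the whole tensor (whether the real method draws one factor per call or per sample is not stated here). scale_sketch and its p parameter are hypothetical names.

```python
import torch

def scale_sketch(x, scale_range=0.15, p=0.8):
    """Multiply the whole signal by one factor in [1 - scale_range, 1 + scale_range]."""
    if torch.rand(1).item() >= p:       # probability gate
        return x
    factor = 1.0 + (2 * torch.rand(1).item() - 1) * scale_range
    return x * factor

scaled = scale_sketch(torch.ones(2, 12, 100), p=1.0)
```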
_augment_channel_noise treats each lead (each entry along the channels dimension) independently: it samples a noise level in [0.5%, 2%] of that lead’s empirical standard deviation and adds Gaussian noise at that level. Applied with probability 0.6 when channels > 1.
Simulates: The real-world situation where individual ECG leads have different signal-to-noise ratios depending on electrode contact quality and anatomical positioning.
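The per-lead noise can be sketched like this. It is an illustration only: channel_noise_sketch is a hypothetical name, the probability gate is omitted, and a [batch, channels, time] input is assumed.

```python
import torch

def channel_noise_sketch(x, low=0.005, high=0.02):
    """Add Gaussian noise at 0.5-2% of each lead's own standard deviation."""
    b, c, _ = x.shape
    lead_std = x.std(dim=-1, keepdim=True)                              # [b, c, 1]
    level = low + (high - low) * torch.rand(b, c, 1, device=x.device)   # fraction per lead
    return x + torch.randn_like(x) * level * lead_std

ecg = torch.randn(2, 12, 1000)
ecg[:, 0] = 0.0                      # a flat lead has zero std, so it receives no noise
noisy = channel_noise_sketch(ecg)
```

Scaling the noise by each lead's standard deviation keeps the perturbation proportional to the lead's own amplitude, so low-amplitude leads are not swamped.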
Constructs a random non-linear time mapping using num_points intermediate control points with ±5% jitter, then resamples the signal through the resulting index mapping. Applied with probability 0.5.
Simulates: Temporal distortions arising from heart rate variability, breathing-related signal elongation/compression, and slight recording speed errors.
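One way such a warp could be built is sketched below. Several details are assumptions: the jitter is taken as ±0.05 of the normalised duration, the mapping is piecewise linear, one warp is shared across the batch, and resampling uses linear interpolation. time_warp_sketch is a hypothetical name.

```python
import torch

def time_warp_sketch(x, num_points=5, jitter=0.05):
    length = x.shape[-1]
    # Evenly spaced control points in [0, 1]; interior ones are jittered, endpoints stay fixed
    src = torch.linspace(0, 1, num_points + 2)
    dst = src.clone()
    dst[1:-1] += (2 * torch.rand(num_points) - 1) * jitter
    dst = torch.sort(dst.clamp(0, 1)).values          # keep the mapping monotonic
    # Piecewise-linear interpolation of the control points at every output sample
    t = torch.linspace(0, 1, length)
    seg = torch.bucketize(t, src[1:-1].contiguous(), right=True)
    frac = (t - src[seg]) / (src[seg + 1] - src[seg])
    warped = dst[seg] + frac * (dst[seg + 1] - dst[seg])
    # Resample the signal at the warped (fractional) positions
    pos = (warped * (length - 1)).clamp(0, length - 1)
    lo = pos.floor().long().clamp(max=length - 2)
    w = pos - lo
    return x[..., lo] * (1 - w) + x[..., lo + 1] * w

wave = torch.sin(torch.linspace(0, 6.28, 200)).repeat(2, 3, 1)
warped_wave = time_warp_sketch(wave)
```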
Shifts the signal by a random integer number of samples in [-max_shift × L, +max_shift × L], where L is the signal length. The vacated end is zero-padded. Applied with probability 0.7.
Simulates: Variability in recording trigger timing, lead-on delays, and heart-rate-driven beat alignment differences between sessions.
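A minimal sketch of a zero-padded shift. The function names are hypothetical and the max_shift default of 0.1 is an assumed value, not one taken from the library.

```python
import torch

def shift_zero_pad(x, shift):
    """Shift along time by `shift` samples, zero-filling the vacated end."""
    out = torch.zeros_like(x)
    length = x.shape[-1]
    if shift > 0:
        out[..., shift:] = x[..., :length - shift]     # move right, zeros at the start
    elif shift < 0:
        out[..., :length + shift] = x[..., -shift:]    # move left, zeros at the end
    else:
        out.copy_(x)
    return out

def random_time_shift(x, max_shift=0.1):
    limit = int(max_shift * x.shape[-1])               # assumed default fraction
    shift = int(torch.randint(-limit, limit + 1, (1,)).item())
    return shift_zero_pad(x, shift)

ramp = torch.arange(1.0, 11.0).view(1, 1, 10)
right = shift_zero_pad(ramp, 3)
left = shift_zero_pad(ramp, -2)
```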
Masks one or more contiguous temporal segments totalling approximately dropout_ratio × L samples (L is the signal length), replacing the masked values with the global signal mean. Unlike standard i.i.d. dropout, the masked samples are temporally contiguous. Applied with probability 0.5.
Simulates: Signal interruptions from temporary electrode detachment, patient movement artefacts long enough to render a segment uninterpretable, and data transmission dropouts.
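The contiguous masking can be sketched as follows. The default values, the num_segments parameter, and the choice to mask the same time window on every lead are assumptions; segment_mask_sketch is a hypothetical name.

```python
import torch

def segment_mask_sketch(x, dropout_ratio=0.1, num_segments=2):
    """Replace contiguous time segments with the global signal mean."""
    length = x.shape[-1]
    seg_len = max(1, int(dropout_ratio * length / num_segments))
    fill = x.mean()                                  # global mean used as fill value
    out = x.clone()
    for _ in range(num_segments):
        start = int(torch.randint(0, length - seg_len + 1, (1,)).item())
        out[..., start:start + seg_len] = fill       # segments may overlap
    return out

sig = torch.randn(2, 12, 1000)
masked = segment_mask_sketch(sig)
```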
_augment_bandpass_variation applies a soft bandpass filter in the FFT domain with randomly sampled low-cut (0–5 Hz) and high-cut (50–250 Hz) corner frequencies. The transition bands use a linear ramp window. Applied with probability 0.5.
Simulates: Device-specific analogue filter designs. Clinical ECG devices nominally pass 0.5–100 Hz, but Holter monitors, consumer wearables, and research-grade devices differ significantly.
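A sketch of an FFT-domain bandpass with linear transition ramps. The ramp width ramp_hz and both function names are assumptions made for illustration; only the corner-frequency ranges come from the description above.

```python
import math
import torch

def soft_bandpass_sketch(x, sampling_rate=500, low_cut=0.5, high_cut=100.0, ramp_hz=2.0):
    length = x.shape[-1]
    freqs = torch.fft.rfftfreq(length, d=1.0 / sampling_rate)        # bin index -> Hz
    rise = ((freqs - (low_cut - ramp_hz)) / ramp_hz).clamp(0, 1)     # ramps up to 1 at low_cut
    fall = ((high_cut + ramp_hz - freqs) / ramp_hz).clamp(0, 1)      # ramps down after high_cut
    spectrum = torch.fft.rfft(x, dim=-1)
    return torch.fft.irfft(spectrum * rise * fall, n=length, dim=-1)

def random_bandpass(x, sampling_rate=500):
    low = 5.0 * torch.rand(1).item()                                 # 0-5 Hz
    high = min(50.0 + 200.0 * torch.rand(1).item(), sampling_rate / 2)  # 50-250 Hz, capped at Nyquist
    return soft_bandpass_sketch(x, sampling_rate, low, high)

t = torch.arange(5000) / 500.0
slow = torch.sin(2 * math.pi * 10 * t)
fast = torch.sin(2 * math.pi * 200 * t)
filtered = soft_bandpass_sketch((slow + fast).view(1, 1, -1))
```

In this example the 200 Hz component lies above the 100 Hz corner and is removed, while the 10 Hz component passes through.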
Removes a contiguous segment comprising 10–30% of the signal length and fills the gap with linear interpolation between the boundary samples. Applied with probability 0.5.
Simulates: Missing data windows due to electrode contact loss, corrupt storage blocks, or deliberate cropping of artefact-heavy regions in preprocessing pipelines.
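The gap-filling step can be sketched as below, assuming the same window is cut on every lead and sample. cutout_interp_sketch is a hypothetical name.

```python
import torch

def cutout_interp_sketch(x, min_frac=0.1, max_frac=0.3):
    """Cut a segment and bridge it linearly between its boundary samples."""
    length = x.shape[-1]
    seg_len = int(length * (min_frac + (max_frac - min_frac) * torch.rand(1).item()))
    start = int(torch.randint(1, length - seg_len - 1, (1,)).item())
    end = start + seg_len                      # gap covers [start, end)
    left = x[..., start - 1:start]             # last sample before the gap
    right = x[..., end:end + 1]                # first sample after the gap
    w = torch.arange(1, seg_len + 1, dtype=x.dtype) / (seg_len + 1)
    out = x.clone()
    out[..., start:end] = left + (right - left) * w
    return out

line = torch.arange(100.0).view(1, 1, 100)
bridged = cutout_interp_sketch(line)           # a straight line is reproduced unchanged
```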
Samples a mixing coefficient λ ~ Beta(alpha, alpha) and returns λ·x + (1−λ)·x_shuffled where x_shuffled is a randomly permuted copy of the batch. Requires batch > 1; silently skips otherwise. Applied with probability 0.4.
Simulates: Averaged multi-lead readings, physiologically plausible blends of two patients’ signals in a shared recording environment, and label-space interpolation for multi-label classification.
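A minimal sketch of the mixing formula above. The alpha default of 0.2 is an assumed value and mixup_sketch is a hypothetical name; a single λ shared across the batch is also an assumption.

```python
import torch

def mixup_sketch(x, alpha=0.2):
    """Blend each sample with a randomly chosen partner from the same batch."""
    if x.shape[0] < 2:                         # nothing to mix with: skip silently
        return x
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.shape[0])          # x_shuffled = x[perm]
    return lam * x + (1 - lam) * x[perm]

group = torch.randn(4, 12, 100)
mixed = mixup_sketch(group)
```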
Replaces a random temporal segment (spanning 10–33% of signal length) with the matching segment from a randomly selected other sample in the batch. Requires batch > 1. Applied with probability equal to cutmix_prob (called with 0.3 in the augmentation pipeline).
Simulates: Electrode cross-talk, stitched recordings from different sessions, and segment-level annotation noise.
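The segment swap can be sketched as follows. Sharing one window across the batch and picking donors by a cyclic offset are simplifying assumptions; cutmix_sketch is a hypothetical name and the probability gate is omitted.

```python
import torch

def cutmix_sketch(x, min_frac=0.10, max_frac=0.33):
    batch, _, length = x.shape
    if batch < 2:                              # needs a second sample to borrow from
        return x
    seg_len = int(length * (min_frac + (max_frac - min_frac) * torch.rand(1).item()))
    start = int(torch.randint(0, length - seg_len + 1, (1,)).item())
    # A nonzero cyclic offset guarantees every sample gets a different donor
    offset = int(torch.randint(1, batch, (1,)).item())
    donor = (torch.arange(batch) + offset) % batch
    out = x.clone()
    out[..., start:start + seg_len] = x[donor][..., start:start + seg_len]
    return out

levels = torch.arange(4.0).view(4, 1, 1).expand(4, 12, 1000).clone()
swapped = cutmix_sketch(levels)
```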
Injects 1–3 localised motion artefact bursts per sample. Each burst is a combination of a low-frequency sinusoid (0.5–2 Hz, amplitude 0.5–2.0) simulating baseline wander and a Gaussian noise burst (std 0.3) simulating high-frequency movement noise. Applied with probability 0.5.
Simulates: Patient movement during recording, respiration-induced baseline wander, cable movement artefacts, and electrode pop.
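A sketch of burst injection using the ranges quoted above. The burst duration (burst_frac) and the choice to share the sinusoid across leads while drawing noise per lead are assumptions; motion_artifact_sketch is a hypothetical name.

```python
import math
import torch

def motion_artifact_sketch(x, sampling_rate=500, burst_frac=0.1):
    batch, channels, length = x.shape
    burst_len = int(burst_frac * length)            # burst duration: assumed value
    t = torch.arange(burst_len) / sampling_rate     # seconds within the burst
    out = x.clone()
    for b in range(batch):
        for _ in range(int(torch.randint(1, 4, (1,)).item())):   # 1-3 bursts
            start = int(torch.randint(0, length - burst_len + 1, (1,)).item())
            freq = 0.5 + 1.5 * torch.rand(1).item()              # 0.5-2 Hz
            amp = 0.5 + 1.5 * torch.rand(1).item()               # amplitude 0.5-2.0
            wander = amp * torch.sin(2 * math.pi * freq * t)     # baseline wander
            noise = 0.3 * torch.randn(channels, burst_len)       # movement noise
            out[b, :, start:start + burst_len] += wander + noise
    return out

quiet = torch.zeros(2, 12, 5000)
disturbed = motion_artifact_sketch(quiet)
```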
Applies an independent scale (1 ± 10%) and DC offset (drawn from N(0, 0.05)) to each lead separately. Applied with probability 0.6 when channels > 1.
Simulates: Inter-lead amplitude variation from electrode placement differences, skin impedance heterogeneity, and device-specific per-channel gain calibration drift.
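The per-lead scale and offset can be sketched as below. Reading 0.05 as a standard deviation rather than a variance is an assumption, and lead_scale_offset_sketch is a hypothetical name.

```python
import torch

def lead_scale_offset_sketch(x, scale_range=0.10, offset_std=0.05):
    """Apply an independent gain and DC offset to every lead of every sample."""
    batch, channels, _ = x.shape
    scale = 1.0 + (2 * torch.rand(batch, channels, 1) - 1) * scale_range   # 1 +/- 10%
    offset = offset_std * torch.randn(batch, channels, 1)                  # DC shift per lead
    return x * scale + offset

flat = torch.ones(2, 12, 50)
shifted = lead_scale_offset_sketch(flat)
```

The trailing singleton dimension lets the gain and offset broadcast along time, so each lead keeps its waveform shape and only its amplitude and baseline change.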

Output Clamping

After both augmentation stages, all output values are hard-clamped to the range [-10, 10]:
return torch.clamp(x, -10, 10)
This guard prevents numerically unstable representations from propagating into the encoder, particularly when mixing augmentations stack additive noise on top of amplitude-scaled signals.
