Mammography datasets are small by deep-learning standards, and the visual appearance of cancer lesions varies significantly with tissue density, scanner model, and imaging angle. Without augmentation, a detector trained on a few hundred images tends to memorise the specific appearance of training cases and fails to generalise to new patients or institutions. MammoMix uses an albumentations pipeline that applies a diverse set of geometric and photometric transforms at training time to artificially expand the effective dataset size and expose the model to realistic imaging variation.
Full augmentation pipeline
The pipeline is defined in BreastCancerDataset.get_transforms in loader.py. It is applied only when split == 'train'; validation and test splits receive A.NoOp().
def get_transforms(self):
    if self.split == 'train':  # Apply augmentation if training
        return A.Compose([
            # Geometric transformations
            A.ElasticTransform(alpha=50, sigma=5, approximate=False, p=0.5),  # Elastic deformation to simulate tissue variability
            A.Perspective(scale=(0.05, 0.1), p=0.5),  # Perspective distortion to simulate different angles
            A.HorizontalFlip(p=0.5),  # Mirror image
            A.Rotate(limit=10, p=0.5),  # Small angles to avoid disrupting anatomical structure
            A.RandomScale(scale_limit=0.2, p=0.5),  # Random scaling to simulate different distances
            A.Affine(
                scale=(0.9, 1.1), translate_percent=(0.1, 0.1), rotate=(-10, 10), shear=(-5, 5),
                interpolation=1, p=0.5  # Affine transformation to simulate different angles and scales
            ),
            # Color and intensity transformations
            A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.5),
            A.GaussNoise(std_range=(0.05, 0.05), mean_range=(0.0, 0.0), per_channel=True, p=0.5),
            A.GaussianBlur(p=0.5),
        ], bbox_params=A.BboxParams(
            format='pascal_voc',  # [x_min, y_min, x_max, y_max]
            label_fields=['labels'],  # Labels for bounding boxes
            min_area=25,  # Drop boxes smaller than 25 pixels after augmentation
            min_visibility=0.1,  # Discard boxes with less than 10% visibility after augmentation
            clip=True  # Clip bounding boxes to image boundaries
        ))
    return A.Compose([A.NoOp()], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['labels'], clip=True))
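Because each of the nine transforms fires independently with p=0.5, most training samples receive several transforms at once. The back-of-the-envelope sketch below is not part of loader.py, and its names are illustrative. It assumes the enclosing A.Compose keeps its default probability of 1.0, and computes how often a sample passes through untouched and how many transforms are applied on average.

```python
# Illustrative arithmetic only; this is not code from loader.py.
NUM_TRANSFORMS = 9   # six geometric + three photometric transforms in the pipeline
P_EACH = 0.5         # every transform in the pipeline uses p=0.5

# Probability that none of the transforms fires for a given sample.
p_untouched = (1 - P_EACH) ** NUM_TRANSFORMS

# Expected number of transforms applied to a sample.
expected_applied = NUM_TRANSFORMS * P_EACH

print(f"untouched: {p_untouched:.4%}, expected transforms: {expected_applied}")
```

Under these assumptions roughly 0.2% of training samples are left unaugmented, and a typical sample passes through four or five transforms.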
ElasticTransform
A.ElasticTransform(alpha=50, sigma=5, approximate=False, p=0.5)
Applies a smooth, spatially varying displacement field to the image, simulating the natural elastic deformation of breast tissue under compression. alpha=50 controls displacement magnitude and sigma=5 controls the smoothness of the deformation. Elastic deformation is widely used in medical image segmentation and detection because real anatomical structures deform non-rigidly.
Perspective
A.Perspective(scale=(0.05, 0.1), p=0.5)
Applies a random four-point perspective warp. This simulates the effect of the X-ray source or detector not being perfectly orthogonal to the breast, which produces projective distortion in practice. scale=(0.05, 0.1) keeps the warp subtle enough to preserve anatomical integrity.
HorizontalFlip
A.HorizontalFlip(p=0.5)
Randomly mirrors the image left-to-right. Because mammograms are acquired from both the left and right breast, a horizontally flipped left-breast image closely resembles a right-breast image. The bounding boxes are mirrored together with the image, so this effectively doubles the usable training samples with no labelling cost.
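As a minimal sketch of what mirroring means for an annotation, the hypothetical helper below (hflip_box is not a MammoMix or albumentations function) flips a pascal_voc box by hand. It assumes absolute pixel coordinates and an image width given in pixels.

```python
def hflip_box(box, image_width):
    """Mirror a pascal_voc box [x_min, y_min, x_max, y_max] left-to-right."""
    x_min, y_min, x_max, y_max = box
    # The old right edge becomes the new left edge, and vice versa.
    return (image_width - x_max, y_min, image_width - x_min, y_max)


print(hflip_box((10, 20, 50, 60), image_width=100))  # (50, 20, 90, 60)
```

The y coordinates are unchanged and the box keeps its width, which is why a flip never causes a box to be filtered out.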
Rotate
A.Rotate(limit=10, p=0.5)
Rotates the image by a uniformly sampled angle in [-10°, +10°]. The small limit is intentional: large rotations would make the image anatomically implausible (a mammogram rotated 45° no longer looks like a clinical acquisition). Small rotations simulate slight patient positioning variation.
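Rotation also changes the annotations: an axis-aligned box can no longer hug a rotated lesion exactly. The sketch below illustrates the enclosing-box idea with a hypothetical rotate_box helper. It is not necessarily the method albumentations uses internally, and the box and image size are made-up values.

```python
import math


def rotate_box(box, angle_deg, image_size):
    """Rotate a pascal_voc box about the image centre and return the
    axis-aligned box that encloses the four rotated corners."""
    x_min, y_min, x_max, y_max = box
    width, height = image_size
    cx, cy = width / 2, height / 2
    cos_a = math.cos(math.radians(angle_deg))
    sin_a = math.sin(math.radians(angle_deg))
    corners = [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]
    rotated = [
        (cx + (x - cx) * cos_a - (y - cy) * sin_a,
         cy + (x - cx) * sin_a + (y - cy) * cos_a)
        for x, y in corners
    ]
    xs, ys = zip(*rotated)
    return (min(xs), min(ys), max(xs), max(ys))


def box_area(box):
    """Area of a pascal_voc box in square pixels."""
    return (box[2] - box[0]) * (box[3] - box[1])


original = (100, 100, 200, 160)  # hypothetical 100 x 60 px annotation
rotated = rotate_box(original, angle_deg=10, image_size=(512, 512))
print(box_area(rotated) / box_area(original))
```

For this 100 x 60 px box, a 10° rotation already enlarges the enclosing box by nearly 40%, which is one more reason to keep the rotation limit small.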
RandomScale
A.RandomScale(scale_limit=0.2, p=0.5)
Rescales the image by a random factor in [0.8, 1.2]. This simulates acquiring mammograms at slightly different distances from the X-ray source, which changes the apparent size of anatomical structures and lesions.
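Scaling interacts with the min_area threshold because box area changes with the square of the scale factor. The sketch below uses hypothetical helpers (sample_scale, scale_box, and box_area are illustrative names, not library functions) to show the worst case for scale_limit=0.2 on a made-up 6 x 6 px lesion box.

```python
import random


def sample_scale(scale_limit=0.2, rng=random):
    """Sample a scale factor in [1 - scale_limit, 1 + scale_limit]."""
    return 1.0 + rng.uniform(-scale_limit, scale_limit)


def scale_box(box, factor):
    """Scale every coordinate of a pascal_voc box by the same factor."""
    return tuple(coord * factor for coord in box)


def box_area(box):
    """Area of a pascal_voc box in square pixels."""
    x_min, y_min, x_max, y_max = box
    return (x_max - x_min) * (y_max - y_min)


small_lesion = (40, 40, 46, 46)        # hypothetical 6 x 6 px box, area 36
shrunk = scale_box(small_lesion, 0.8)  # smallest factor for scale_limit=0.2
print(box_area(small_lesion), box_area(shrunk))
```

The shrunk box measures about 4.8 x 4.8 px, or roughly 23 square pixels, which is below min_area=25, so this annotation would be dropped whenever the sampled factor is close to 0.8.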
Affine
A.Affine(
    scale=(0.9, 1.1), translate_percent=(0.1, 0.1), rotate=(-10, 10), shear=(-5, 5),
    interpolation=1, p=0.5
)
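Applies a combined affine transformation with independent control over scale, translation, rotation, and shear. This is a more general geometric augmentation that covers positioning artefacts not captured by the individual transforms above. interpolation=1 selects bilinear interpolation (OpenCV's cv2.INTER_LINEAR), which keeps edge quality reasonable. Note that albumentations reads a tuple passed to translate_percent as a sampling range, so (0.1, 0.1) should produce a constant 10% shift rather than a random shift within ±10%; use (-0.1, 0.1) if a symmetric random shift is intended.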
RandomBrightnessContrast
A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.5)
Randomly shifts pixel intensity (brightness) and the intensity range (contrast) by up to ±20%. Different mammography systems and exposure settings produce images with substantially different brightness and contrast profiles, so this transform helps the model generalise across scanners and acquisition protocols.
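To make the ±20% figures concrete, the sketch below applies a simple linear brightness/contrast model to a single 8-bit pixel. The adjust_pixel helper is hypothetical, and the exact arithmetic inside albumentations may differ; the sketch assumes contrast acts as a gain on the pixel value and brightness as an offset proportional to the maximum value.

```python
def adjust_pixel(value, contrast_delta, brightness_delta, max_value=255):
    """Linear model: gain on the pixel value, offset scaled by max_value."""
    gain = 1.0 + contrast_delta            # contrast_delta sampled in [-0.2, 0.2]
    offset = brightness_delta * max_value  # brightness_delta sampled in [-0.2, 0.2]
    adjusted = gain * value + offset
    return min(max(adjusted, 0), max_value)  # clip to the valid intensity range


# Strongest brightening the limits allow, applied to a mid-dark pixel.
print(adjust_pixel(100, contrast_delta=0.2, brightness_delta=0.2))
```

Under this model, bright tissue saturates quickly at the upper limit (a pixel at 250 clips to 255), so dense regions lose some detail whenever both deltas are strongly positive.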
GaussNoise
A.GaussNoise(std_range=(0.05, 0.05), mean_range=(0.0, 0.0), per_channel=True, p=0.5)
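Adds zero-mean Gaussian noise independently to each colour channel, simulating electronic sensor noise and quantum mottle that appear in low-dose or high-sensitivity mammography acquisitions. Because both ends of std_range are 0.05, the standard deviation is fixed rather than sampled; in recent albumentations releases this value is expressed as a fraction of the maximum pixel value. The level is meant to stay realistic without obscuring lesion detail.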
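The sketch below shows what a fixed noise level of 0.05 means in intensity units. It assumes the standard deviation is a fraction of the maximum pixel value (255 for 8-bit images), as in recent albumentations releases, and uses a hypothetical add_gaussian_noise helper on a plain list of pixels instead of an image array.

```python
import random


def add_gaussian_noise(pixels, std_fraction=0.05, max_value=255, seed=None):
    """Add zero-mean Gaussian noise whose std is a fraction of max_value."""
    rng = random.Random(seed)
    sigma = std_fraction * max_value  # 0.05 * 255 = 12.75 intensity levels
    noisy = [p + rng.gauss(0.0, sigma) for p in pixels]
    # Quantise to integers and clip to the valid intensity range.
    return [min(max(round(v), 0), max_value) for v in noisy]


flat_patch = [128] * 10_000  # hypothetical uniform grey patch
noisy_patch = add_gaussian_noise(flat_patch, seed=0)
mean = sum(noisy_patch) / len(noisy_patch)
std = (sum((v - mean) ** 2 for v in noisy_patch) / len(noisy_patch)) ** 0.5
print(round(mean, 1), round(std, 1))
```

The measured mean stays near 128 and the measured standard deviation near 12.75 grey levels, which is a visible but moderate amount of grain on an 8-bit image.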
GaussianBlur
A.GaussianBlur(p=0.5)
Applies Gaussian smoothing to simulate motion blur (from patient movement during exposure) or other loss of sharpness in the imaging chain. No kernel size or sigma is specified, so the library defaults apply. This encourages the model to detect cancer regions based on shape and location rather than high-frequency texture that may be absent in blurry acquisitions.
Bounding box parameters
The bbox_params argument propagates transforms to the bounding boxes alongside the image:
bbox_params=A.BboxParams(
    format='pascal_voc',  # [x_min, y_min, x_max, y_max]
    label_fields=['labels'],  # Labels for bounding boxes
    min_area=25,  # Drop boxes smaller than 25 pixels after augmentation
    min_visibility=0.1,  # Discard boxes with less than 10% visibility after augmentation
    clip=True  # Clip bounding boxes to image boundaries
)
format='pascal_voc': boxes are expressed as absolute pixel coordinates [x_min, y_min, x_max, y_max], matching the output of parse_voc_xml.
clip=True: after geometric transforms, boxes are clamped to the image boundary so no coordinate falls outside [0, W] or [0, H].
min_area=25: any box whose area after augmentation is smaller than 25 pixels is dropped. This prevents degenerate near-zero-area annotations from entering the loss computation.
min_visibility=0.1: any box that has less than 10% of its original area visible after augmentation (e.g. because it was cropped to the image edge) is dropped.
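The three rules above can be sketched in plain Python as follows. This is a simplified illustration with a hypothetical filter_box helper, not the albumentations implementation; in particular, it assumes visibility is measured as clipped area divided by the area of the transformed box before clipping, which may not match every library version.

```python
def filter_box(box, image_size, min_area=25, min_visibility=0.1):
    """Clip a pascal_voc box to the image and apply both thresholds.
    Returns the clipped box, or None if the box should be dropped."""
    width, height = image_size
    x_min, y_min, x_max, y_max = box
    full_area = max(x_max - x_min, 0) * max(y_max - y_min, 0)

    # clip=True: clamp every coordinate to the image boundary.
    cx_min, cx_max = max(0, min(x_min, width)), max(0, min(x_max, width))
    cy_min, cy_max = max(0, min(y_min, height)), max(0, min(y_max, height))
    visible_area = (cx_max - cx_min) * (cy_max - cy_min)

    if full_area == 0 or visible_area < min_area:
        return None  # min_area rule
    if visible_area / full_area < min_visibility:
        return None  # min_visibility rule
    return (cx_min, cy_min, cx_max, cy_max)


size = (512, 512)
print(filter_box((-50, 0, 50, 100), size))  # half visible: kept, clipped
print(filter_box((-95, 0, 5, 100), size))   # 5% visible: dropped
print(filter_box((10, 10, 14, 14), size))   # 16 px area: dropped
```

The second example shows why both thresholds matter: the clipped box still covers 500 square pixels, well above min_area, yet it represents only 5% of the lesion and is removed by min_visibility.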
Retry logic
If all bounding boxes are dropped by the augmentation pipeline (e.g. a combination of aggressive scaling and rotation pushes every annotation off-screen), __getitem__ retries the same index:
transformed = self.transforms(image=image, bboxes=bboxes, labels=labels)
labels = np.array(transformed['labels'], dtype=np.int64)
if len(transformed['labels']) <= 0: return self.__getitem__(idx) # Retry if no valid boxes after augmentation
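This ensures that every sample returned by the dataset contains at least one valid annotation, preventing the model from receiving supervision-free examples. Note that the retry in the snippet as shown is unbounded: a sample whose boxes can never survive the pipeline (for example, an image with no annotations at all) would recurse until Python raises a RecursionError.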
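One way to keep the same behaviour while limiting the number of attempts is a bounded loop with a fallback. The sketch below is an assumption-laden alternative, not MammoMix code: augment_with_retry, max_tries, and the fallback to the un-augmented sample are all hypothetical choices.

```python
def augment_with_retry(transform, image, bboxes, labels, max_tries=10):
    """Call `transform` until it returns at least one box, up to max_tries times.
    Falls back to the un-augmented sample if every attempt loses all boxes."""
    for _ in range(max_tries):
        out = transform(image=image, bboxes=bboxes, labels=labels)
        if len(out["labels"]) > 0:
            return out
    # Hypothetical fallback: return the original sample unchanged.
    return {"image": image, "bboxes": bboxes, "labels": labels}
```

Inside __getitem__ this could replace the recursive call, for example as transformed = augment_with_retry(self.transforms, image, bboxes, labels). The fallback trades a small amount of augmentation diversity for a guarantee that the call terminates.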
Validation and test behaviour
Validation and test splits bypass all augmentation:
return A.Compose([A.NoOp()], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['labels'], clip=True))
A.NoOp() is a no-operation transform; the image and boxes pass through unchanged. clip=True is still applied, which is a safe no-op for well-formed annotations but guards against any annotation that marginally exceeds the image boundary.
The min_area=25 and min_visibility=0.1 thresholds deserve careful tuning for your dataset. If your annotations include very small lesions (e.g. micro-calcifications spanning only a few pixels), a min_area of 25 may silently discard real cancer regions after aggressive scaling. Conversely, setting min_area too low keeps near-invisible annotations that add noise to the loss. Check the distribution of annotation areas in your training set before adjusting these values.
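A quick way to run that check is to count how many annotations could fall below min_area under the strongest combined downscale in the pipeline (0.8 from RandomScale times 0.9 from Affine). The sketch below uses a hypothetical boxes_at_risk helper and made-up boxes; it ignores clipping and the extra shrinkage that Perspective or ElasticTransform can cause, so treat the result as a lower bound.

```python
def boxes_at_risk(boxes, min_area=25, worst_scale=0.8 * 0.9):
    """Return the boxes whose area could fall below min_area after the
    strongest combined downscale (RandomScale 0.8 x Affine 0.9)."""
    area_factor = worst_scale ** 2  # area shrinks with the square of the scale
    at_risk = []
    for x_min, y_min, x_max, y_max in boxes:
        area = (x_max - x_min) * (y_max - y_min)
        if area * area_factor < min_area:
            at_risk.append((x_min, y_min, x_max, y_max))
    return at_risk


# Hypothetical annotations; replace with the boxes parsed from your training set.
train_boxes = [(10, 10, 16, 16), (30, 30, 90, 80), (5, 5, 12, 11)]
risky = boxes_at_risk(train_boxes)
print(f"{len(risky)} of {len(train_boxes)} boxes may be dropped")
```

With these limits the area factor is about 0.52, so any annotation smaller than roughly 48 square pixels can be discarded by an unlucky draw.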