loader.py provides the PyTorch Dataset implementation for mammography data and the collate function used with DataLoader. It handles split-based file discovery, VOC XML annotation parsing, Albumentations augmentation, and DETR-compatible encoding.
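As a rough illustration of the VOC XML parsing step mentioned above, the sketch below reads bounding boxes from a Pascal VOC annotation and converts them to COCO-style `[x, y, width, height]`. The helper names `parse_voc_boxes` and `voc_to_coco`, the sample XML, and the class name `"lesion"` are all hypothetical and are not taken from loader.py; the actual implementation may differ.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def parse_voc_boxes(xml_text: str) -> List[Tuple[str, List[float]]]:
    """Return (class_name, [xmin, ymin, xmax, ymax]) for every <object>."""
    root = ET.fromstring(xml_text.strip())
    results = []
    for obj in root.iter("object"):
        name = obj.findtext("name", default="")
        bndbox = obj.find("bndbox")
        if bndbox is None:
            continue  # skip objects without a bounding box
        box = [float(bndbox.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
        results.append((name, box))
    return results


def voc_to_coco(box: List[float]) -> List[float]:
    """Convert [xmin, ymin, xmax, ymax] to COCO [x, y, width, height]."""
    xmin, ymin, xmax, ymax = box
    return [xmin, ymin, xmax - xmin, ymax - ymin]


# Hypothetical annotation used only for demonstration.
SAMPLE_XML = """
<annotation>
  <size><width>640</width><height>800</height><depth>3</depth></size>
  <object>
    <name>lesion</name>
    <bndbox><xmin>100</xmin><ymin>150</ymin><xmax>220</xmax><ymax>300</ymax></bndbox>
  </object>
</annotation>
"""

voc_boxes = parse_voc_boxes(SAMPLE_XML)
coco_boxes = [voc_to_coco(box) for _, box in voc_boxes]
```

Keeping the parse step in `pascal_voc` order and converting to COCO only at encoding time matches the formats named on this page, but the exact point of conversion in loader.py is an assumption.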
BreastCancerDataset
A torch.utils.data.Dataset that loads mammography images and bounding-box annotations for DETR-based object detection models. Augmentation is applied automatically for the train split.
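The split validation and split-file lookup that the constructor performs (see Constructor parameters) might look roughly like the following. This is a minimal sketch: the function name `read_split_file` and the blank-line handling are assumptions, and only the path layout and the two exception types come from this page.

```python
import tempfile
from pathlib import Path
from typing import List

VALID_SPLITS = ("train", "val", "test")


def read_split_file(splits_dir: str, dataset_name: str, split: str) -> List[str]:
    """Return the relative image paths listed in {splits_dir}/{dataset_name}/{split}.txt."""
    if split not in VALID_SPLITS:
        raise ValueError(f"split must be one of {VALID_SPLITS}, got {split!r}")
    split_file = Path(splits_dir) / dataset_name / f"{split}.txt"
    if not split_file.is_file():
        raise FileNotFoundError(f"Split file not found: {split_file}")
    with split_file.open(encoding="utf-8") as fh:
        # One relative image path per line; blank lines are skipped (assumption).
        return [line.strip() for line in fh if line.strip()]


# Demonstration against a throwaway directory with made-up image paths.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "CSAW").mkdir()
    (Path(tmp) / "CSAW" / "val.txt").write_text(
        "images/a.png\nimages/b.png\n", encoding="utf-8"
    )
    val_paths = read_split_file(tmp, "CSAW", "val")
```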
Constructor parameters
split: Dataset split to load. Must be one of "train", "val", or "test". Raises ValueError for any other value. The "train" split activates the full Albumentations augmentation pipeline; "val" and "test" apply a no-op identity transform.
splits_dir: Path to the root directory that contains per-dataset split files. The constructor expects a file at {splits_dir}/{dataset_name}/{split}.txt. Each line in that file is a relative path to one image. Raises FileNotFoundError if the file does not exist.
dataset_name: Name of the dataset subdirectory inside splits_dir. Supported values used in MammoMix are "CSAW", "DMID", and "DDSM".
Image processor: A Hugging Face AutoImageProcessor instance (e.g. from hustvl/yolos-base or a DETR checkpoint). Used to resize, pad, and normalise images, and to encode COCO-format annotations into the tensors expected by DETR.
Return value of __getitem__
Each call to dataset[idx] returns a Python dict with the following fields.
pixel_values: Preprocessed image tensor of shape (3, H, W) after resizing, padding, and normalisation. The batch dimension from the image processor is squeezed out.
labels: DETR-compatible annotation dict produced by the image processor. Contains at minimum:
Training augmentation pipeline
When split="train", get_transforms() returns an albumentations.Compose with the following transforms applied to both the image and its bounding boxes:
| Transform | Key parameters |
|---|---|
| ElasticTransform | alpha=50, sigma=5, p=0.5 |
| Perspective | scale=(0.05, 0.1), p=0.5 |
| HorizontalFlip | p=0.5 |
| Rotate | limit=10, p=0.5 |
| RandomScale | scale_limit=0.2, p=0.5 |
| Affine | scale, translate, rotate, shear, p=0.5 |
| RandomBrightnessContrast | brightness_limit=0.2, contrast_limit=0.2, p=0.5 |
| GaussNoise | std_range=(0.05, 0.05), p=0.5 |
| GaussianBlur | p=0.5 |

Bounding-box parameters: format pascal_voc, min_area=25, min_visibility=0.1, clip=True. If augmentation removes all boxes, the item is retried automatically.
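The retry behaviour can be sketched without any augmentation library. The example below assumes the transform is a callable with an Albumentations-style keyword interface that returns a dict with `image` and `bboxes`. The function name `augment_with_retry`, the `max_retries` limit, and the fallback to the un-augmented sample are all hypothetical; this page does not state how loader.py bounds or terminates its retries.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple

Box = Sequence[float]  # pascal_voc order: [xmin, ymin, xmax, ymax]


def augment_with_retry(
    image: Any,
    boxes: List[Box],
    transform: Callable[..., Dict[str, Any]],
    max_retries: int = 10,  # hypothetical limit, not taken from loader.py
) -> Tuple[Any, List[Box]]:
    """Re-apply `transform` until at least one bounding box survives."""
    for _ in range(max_retries):
        out = transform(image=image, bboxes=boxes)
        if len(out["bboxes"]) > 0:
            return out["image"], list(out["bboxes"])
    # Assumed fallback: return the un-augmented sample.
    return image, list(boxes)


class DropFirstN:
    """Stand-in for an augmentation pipeline that loses every box N times."""

    def __init__(self, n: int) -> None:
        self.remaining = n
        self.calls = 0

    def __call__(self, image: Any, bboxes: List[Box]) -> Dict[str, Any]:
        self.calls += 1
        if self.remaining > 0:
            self.remaining -= 1
            return {"image": image, "bboxes": []}
        return {"image": image, "bboxes": bboxes}


fake_transform = DropFirstN(2)
_, kept = augment_with_retry("image-placeholder", [[10, 20, 60, 80]], fake_transform)
```

Here the stand-in transform discards all boxes on its first two calls, so the third attempt is the one that returns a usable sample.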
collate_fn
Collates a list of dataset samples into a batch suitable for a DataLoader.
Parameters
A list of sample dicts as returned by BreastCancerDataset.__getitem__. Each dict must contain pixel_values and labels, and may optionally contain pixel_mask.
Return value
pixel_values: Stacked image tensor of shape (B, 3, H, W) produced by torch.stack.
labels: List of per-image label dicts (length B). Kept as a Python list because each image may have a different number of bounding boxes and DETR expects this structure directly.
pixel_mask: Stacked attention mask of shape (B, H, W), present only when pixel_mask exists in the first sample. Each value is 1 for real pixels and 0 for padding.
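A minimal sketch of a collate function with the behaviour described above is shown below. It is written from this description rather than copied from loader.py, so details may differ. The sample tensors and the `"boxes"` key inside `labels` are illustrative placeholders.

```python
from typing import Any, Dict, List

import torch


def collate_fn(batch: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Stack images (and masks, if present); keep labels as a list."""
    collated: Dict[str, Any] = {
        "pixel_values": torch.stack([item["pixel_values"] for item in batch]),
        # Labels stay a list because each image can have a different box count.
        "labels": [item["labels"] for item in batch],
    }
    # The mask is only collated when the first sample carries one.
    if "pixel_mask" in batch[0]:
        collated["pixel_mask"] = torch.stack([item["pixel_mask"] for item in batch])
    return collated


# Two fake samples with 1 and 3 boxes respectively (illustrative data only).
samples = [
    {"pixel_values": torch.zeros(3, 32, 32), "labels": {"boxes": torch.rand(n, 4)}}
    for n in (1, 3)
]
collated_batch = collate_fn(samples)
```

Because torch.stack requires equally shaped tensors, this sketch only works when every sample in the batch has already been resized or padded to the same (H, W).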