Train YOLOS for breast cancer detection

MammoMix uses YOLOS (You Only Look at One Sequence), a vision-transformer-based object detector, to localize malignant lesions in mammography images. The train.py script wraps HuggingFace’s Trainer API around the BreastCancerDataset loader, logs metrics to W&B, and keeps the checkpoint with the highest eval_map_50, so the final model is the one with the best mean average precision at an IoU threshold of 0.50.

Quickstart

Run training by pointing --config at a YAML file and supplying the target dataset with --dataset:

python train.py --config configs/config_yolos.yaml --dataset CSAW

Overriding the epoch count

Pass --epoch to override the training.epochs value in the config without editing the file:

python train.py --config configs/config_yolos.yaml --dataset CSAW --epoch 50

The --epoch flag takes precedence over training.epochs in the YAML. All other hyperparameters are still read from the config file.
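The precedence rule can be sketched in a few lines. This is a minimal illustration, not the actual train.py code: the resolve_epochs helper and the value 30 for training.epochs are hypothetical, and the config dict stands in for the parsed YAML.

```python
import argparse


def resolve_epochs(config: dict, argv: list[str]) -> int:
    """Return the epoch count, letting --epoch override training.epochs."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--config", required=True)
    parser.add_argument("--dataset", required=True)
    parser.add_argument("--epoch", type=int, default=None)
    args = parser.parse_args(argv)
    # The CLI value wins only when it was actually supplied.
    if args.epoch is not None:
        return args.epoch
    return config["training"]["epochs"]


# Hypothetical stand-in for the parsed YAML config.
config = {"training": {"epochs": 30}}
base = ["--config", "configs/config_yolos.yaml", "--dataset", "CSAW"]

print(resolve_epochs(config, base))                      # 30
print(resolve_epochs(config, base + ["--epoch", "50"]))  # 50
```

Using None as the default makes it possible to tell "flag not given" apart from any explicit value.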

What happens during training

Dataset loading

BreastCancerDataset is instantiated for the train and val splits. The loader reads image paths from the split .txt files located under splits_dir, applies the HuggingFace AutoImageProcessor (resize + pad to max_size × max_size), and returns pixel values with COCO-format bounding box annotations for the single cancer class.
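As a rough sketch of the first step, the snippet below reads image paths from a split file and builds one COCO-style target of the kind HuggingFace detection image processors accept. The file name pattern <split>.txt, the read_split and make_target helpers, and the example box are assumptions for illustration; the real loader may differ.

```python
import tempfile
from pathlib import Path


def read_split(splits_dir: str, split: str) -> list[str]:
    """Read one image path per line from <splits_dir>/<split>.txt, skipping blanks."""
    split_file = Path(splits_dir) / f"{split}.txt"
    with split_file.open() as fh:
        return [line.strip() for line in fh if line.strip()]


def make_target(image_id: int, boxes_xywh: list[list[float]]) -> dict:
    """Build a COCO-style target; category 0 is the single cancer class."""
    return {
        "image_id": image_id,
        "annotations": [
            {"bbox": box, "category_id": 0, "area": box[2] * box[3], "iscrowd": 0}
            for box in boxes_xywh
        ],
    }


# Demo with a temporary split file (hypothetical paths).
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "train.txt").write_text("images/a.png\n\nimages/b.png\n")
    train_paths = read_split(tmp, "train")

target = make_target(0, [[110.0, 95.0, 40.0, 32.0]])
print(train_paths)
print(target["annotations"][0]["area"])
```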

Model initialization

AutoModelForObjectDetection.from_pretrained('hustvl/yolos-base') loads the pre-trained YOLOS weights from the HuggingFace Hub. Passing id2label={0: 'cancer'} changes the number of classes, and ignore_mismatched_sizes=True allows the layers whose shapes no longer match the checkpoint (the class-prediction head) to be re-initialized, while all other pre-trained weights are kept.
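A minimal sketch of this initialization follows. The build_model_kwargs and load_model helpers are hypothetical, and passing label2id alongside id2label is an assumption (the source only mentions id2label). The transformers import is deferred so the helper can be inspected without downloading weights.

```python
def build_model_kwargs(class_names: list[str]) -> dict:
    """Keyword arguments that resize the class head to the given labels."""
    id2label = dict(enumerate(class_names))
    label2id = {name: idx for idx, name in id2label.items()}  # assumption: not in the source
    return {
        "id2label": id2label,
        "label2id": label2id,
        # Lets from_pretrained re-initialize layers whose shapes changed.
        "ignore_mismatched_sizes": True,
    }


def load_model(checkpoint: str = "hustvl/yolos-base"):
    """Load YOLOS with a single-class head (downloads weights when called)."""
    from transformers import AutoModelForObjectDetection

    return AutoModelForObjectDetection.from_pretrained(
        checkpoint, **build_model_kwargs(["cancer"])
    )


kwargs = build_model_kwargs(["cancer"])
print(kwargs["id2label"])  # {0: 'cancer'}
```

Calling load_model() requires the transformers package and network access to the Hub.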

Trainer setup

HuggingFace Trainer is configured with eval_strategy="epoch" and save_strategy="epoch". Both evaluation and checkpointing happen at the end of every epoch. fp16 is enabled automatically when a CUDA GPU is detected.
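The settings above might be assembled as follows. This is a sketch under assumptions: build_training_kwargs, cuda_available, and the "checkpoints" directory are hypothetical names, and the eval_strategy keyword is the spelling used by recent transformers releases (older releases call it evaluation_strategy).

```python
def cuda_available() -> bool:
    """True when PyTorch is installed and sees a CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def build_training_kwargs(output_dir: str, epochs: int, use_fp16: bool) -> dict:
    """Keyword arguments intended for transformers.TrainingArguments."""
    return {
        "output_dir": output_dir,
        "num_train_epochs": epochs,
        "eval_strategy": "epoch",  # evaluate at the end of every epoch
        "save_strategy": "epoch",  # checkpoint at the end of every epoch
        "fp16": use_fp16,          # mixed precision only when a GPU is present
    }


training_kwargs = build_training_kwargs("checkpoints", 50, cuda_available())
print(training_kwargs)

# With transformers installed, the dict can be unpacked directly:
#   from transformers import TrainingArguments
#   args = TrainingArguments(**training_kwargs)
```

Keeping the evaluation and save strategies identical matters because best-checkpoint reloading needs a saved checkpoint for every evaluation.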

W&B logging

report_to="all" sends metrics to every logging integration installed in the environment, which includes Weights & Biases when the wandb package is available. The run is named {model}_{DATASET}_{DDMMYY} and logs are written to the directory specified by wandb.wandb_dir in the config. Set logging.use_wandb: false in the config to disable W&B reporting.

Best model selection

load_best_model_at_end=True combined with metric_for_best_model=eval_map_50 and greater_is_better=True means the trainer automatically reloads the checkpoint with the highest mAP@50 after training completes.
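Conceptually, the selection amounts to taking the maximum of the metric over the per-epoch evaluations. The sketch below mimics that logic outside the Trainer; pick_best is a hypothetical helper and the metric values are made-up illustrative numbers, not results from a real run.

```python
def pick_best(history: list[dict], metric: str = "eval_map_50",
              greater_is_better: bool = True) -> dict:
    """Return the evaluation record with the best value of `metric`."""
    chooser = max if greater_is_better else min
    return chooser(history, key=lambda record: record[metric])


# Illustrative per-epoch evaluation records (not real results).
history = [
    {"epoch": 1, "eval_map_50": 0.52},
    {"epoch": 2, "eval_map_50": 0.61},
    {"epoch": 3, "eval_map_50": 0.58},
]

best = pick_best(history)
print(best["epoch"])  # 2
```

Here the epoch-2 checkpoint would be reloaded even though training continued to epoch 3.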

Test evaluation

After training, a third BreastCancerDataset is created for the test split and passed to trainer.evaluate(). Results are printed to stdout:

=== Test Results ===
test_map: 0.423
test_map_50: 0.681
test_map_75: 0.391
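A report like the one above could be produced from the metrics dictionary returned by the evaluation call. The format_results helper is hypothetical, and the assumption is that evaluation was run with metric_key_prefix="test" so that keys start with test_; the values reuse the sample output shown above.

```python
def format_results(metrics: dict, prefix: str = "test") -> str:
    """Render float metrics whose keys start with `<prefix>_` as a small report."""
    lines = ["=== Test Results ==="]
    for key in sorted(metrics):
        value = metrics[key]
        if key.startswith(f"{prefix}_") and isinstance(value, float):
            lines.append(f"{key}: {value:.3f}")
    return "\n".join(lines)


# Stand-in for: metrics = trainer.evaluate(test_dataset, metric_key_prefix="test")
metrics = {"test_map": 0.423, "test_map_50": 0.681, "test_map_75": 0.391}
report = format_results(metrics)
print(report)
```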

Model output path

The trained model is saved one directory above the repository root, suffixed with the dataset name and the current date in DDMMYY format:

../yolos_{DATASET_NAME}_{DDMMYY}

For example, a CSAW run completed on 12 May 2026 saves to:

../yolos_CSAW_120526/
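The naming scheme can be reproduced with strftime. This is a small sketch: the output_dir helper is hypothetical, and the date is passed in explicitly so the example is reproducible.

```python
from datetime import date


def output_dir(dataset: str, when: date, model: str = "yolos") -> str:
    """Build ../{model}_{DATASET}_{DDMMYY} for the given date."""
    return f"../{model}_{dataset}_{when.strftime('%d%m%y')}"


path = output_dir("CSAW", date(2026, 5, 12))
print(path)  # ../yolos_CSAW_120526
```

Because the suffix only contains the date, two runs on the same dataset and day map to the same directory.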

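eval_map_50 is used as the best-model metric rather than the overall eval_map because mAP@50 counts a detection as correct once it overlaps the ground-truth box with an IoU of at least 0.50, which makes it more tolerant of small localization errors. That suits mammography, where bounding-box annotations are often imprecise. The overall eval_map typically also averages over stricter IoU thresholds and penalizes such boxes more heavily. In practice, selecting on mAP@50 may yield models that generalize better to unseen scans.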
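To make the IoU threshold concrete, the sketch below computes IoU for two COCO-style [x, y, width, height] boxes. The boxes are invented for illustration: a prediction shifted by 10 pixels still passes the 0.50 threshold but would fail a stricter 0.75 threshold.

```python
def iou_xywh(box_a: tuple, box_b: tuple) -> float:
    """Intersection over union for two [x, y, width, height] boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Overlap along each axis, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


ground_truth = (100.0, 100.0, 100.0, 100.0)
prediction = (110.0, 110.0, 100.0, 100.0)  # shifted 10 px in x and y

iou = iou_xywh(ground_truth, prediction)
print(round(iou, 3))        # 0.681
print(iou >= 0.50, iou >= 0.75)  # True False
```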

YOLOS-base requires approximately 8 GB of VRAM at batch_size=8 with gradient_accumulation_steps=2. If you encounter out-of-memory errors, reduce training.batch_size to 4 and check that fp16 is active, since it is only enabled automatically when a CUDA GPU is detected.
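When lowering the batch size, it can help to check the effective batch size, which is the per-step batch size multiplied by the number of accumulation steps. The arithmetic below is a general property of gradient accumulation rather than something the config enforces; raising gradient_accumulation_steps to 4 is a suggestion, not a documented default.

```python
def effective_batch_size(batch_size: int, grad_accum_steps: int, n_devices: int = 1) -> int:
    """Number of samples contributing to each optimizer step."""
    return batch_size * grad_accum_steps * n_devices


default = effective_batch_size(8, 2)    # documented setting
reduced = effective_batch_size(4, 2)    # batch size halved, accumulation unchanged
restored = effective_batch_size(4, 4)   # accumulation doubled to compensate

print(default, reduced, restored)  # 16 8 16
```

Doubling the accumulation steps keeps the optimizer seeing 16 samples per update while the per-step memory footprint stays at batch size 4.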
