MammoMix uses YOLOS (You Only Look at One Sequence), a vision-transformer-based object detector, to localize malignant lesions in mammography images. The train.py script wraps HuggingFace’s Trainer API around the BreastCancerDataset loader, handles W&B logging automatically, and saves the best checkpoint by eval_map_50, so you always end up with the model that scored the highest validation mAP at IoU 0.50.
Quickstart
Run training by pointing --config at a YAML file and supplying the target dataset with --dataset.
Overriding the epoch count
Pass --epoch to override the training.epochs value in the config without editing the file.
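A minimal sketch of how such an override could be wired up, assuming the script uses argparse and a config dict already loaded from YAML. The function names `parse_args` and `resolve_epochs`, the file name `config.yaml`, and the dataset name `EXAMPLE` are hypothetical placeholders, not taken from train.py.

```python
import argparse


def parse_args(argv=None):
    """Parse the three flags described on this page (hypothetical sketch)."""
    parser = argparse.ArgumentParser(description="Sketch of a train.py-style CLI")
    parser.add_argument("--config", required=True, help="Path to the YAML config file")
    parser.add_argument("--dataset", required=True, help="Target dataset name")
    parser.add_argument("--epoch", type=int, default=None,
                        help="Optional override for training.epochs")
    return parser.parse_args(argv)


def resolve_epochs(cfg, epoch_override):
    """Return the CLI value when given, otherwise fall back to the config."""
    if epoch_override is not None:
        return epoch_override
    return cfg["training"]["epochs"]


# Placeholder arguments; a plain dict stands in for the parsed YAML file.
args = parse_args(["--config", "config.yaml", "--dataset", "EXAMPLE", "--epoch", "5"])
cfg = {"training": {"epochs": 100}}
epochs = resolve_epochs(cfg, args.epoch)
```

Only the epoch count is replaced in this sketch; every other value would still come from `cfg`.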
The --epoch flag takes precedence over training.epochs in the YAML. All other hyperparameters are still read from the config file.
What happens during training
Dataset loading
BreastCancerDataset is instantiated for the train and val splits. The loader reads image paths from the split .txt files located under splits_dir, applies the HuggingFace AutoImageProcessor (resize + pad to max_size × max_size), and returns pixel values with COCO-format bounding box annotations for the single cancer class.
Model initialization
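AutoModelForObjectDetection.from_pretrained('hustvl/yolos-base') loads the pre-trained YOLOS weights from the HuggingFace Hub. Passing id2label={0: 'cancer'} sets up a single-class classification head, and ignore_mismatched_sizes=True lets the checkpoint load despite the changed label count, so only the layers whose shapes no longer match are re-initialized while the remaining pre-trained weights are kept.
Trainer setup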
HuggingFace Trainer is configured with eval_strategy="epoch" and save_strategy="epoch". Both evaluation and checkpointing happen at the end of every epoch. fp16 is enabled automatically when a CUDA GPU is detected.
W&B logging
report_to="all" sends metrics to every logging integration installed in the environment, including Weights & Biases. The run is named {model}_{DATASET}_{DDMMYY} and logs are written to the directory specified by wandb.wandb_dir in the config. Set logging.use_wandb: false in the config to disable W&B reporting.
Best model selection
load_best_model_at_end=True combined with metric_for_best_model=eval_map_50 and greater_is_better=True means the trainer automatically reloads the checkpoint with the highest mAP@50 after training completes.
Model output path
The trained model is saved one directory above the repository root, suffixed with the dataset name and the current date in DDMMYY format.
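As an illustration of the naming scheme, the sketch below builds such a path with pathlib and strftime. The helper name `build_output_dir`, the repository location `/work/MammoMix`, the `yolos` prefix, and the dataset name `EXAMPLE` are all assumptions made for this example; the exact directory name train.py produces may differ.

```python
from datetime import date
from pathlib import Path


def build_output_dir(repo_root: Path, model_name: str, dataset: str, today: date) -> Path:
    """Place the output directory next to the repository, one level above its root."""
    stamp = today.strftime("%d%m%y")  # DDMMYY, e.g. 7 March 2025 becomes "070325"
    return repo_root.parent / f"{model_name}_{dataset}_{stamp}"


# Hypothetical inputs; a fixed date keeps the example reproducible.
out_dir = build_output_dir(Path("/work/MammoMix"), "yolos", "EXAMPLE", date(2025, 3, 7))
```

In a real run the date would presumably come from `date.today()` rather than a fixed value.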
YOLOS-base requires approximately 8 GB of VRAM at batch_size=8 with gradient_accumulation_steps=2. Reduce training.batch_size to 4 if you encounter out-of-memory errors, or enable fp16 explicitly in the config.
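The Trainer settings described on this page can be collected as keyword arguments for transformers.TrainingArguments. The sketch below is an approximation under stated assumptions, not the code in train.py: the helper names, the config key training.gradient_accumulation_steps, and the mapping from logging.use_wandb to report_to are hypothetical. The CUDA check is passed in as a plain boolean so the example runs without a GPU or torch installed.

```python
def build_training_kwargs(cfg: dict, cuda_available: bool) -> dict:
    """Collect kwargs that could be passed as TrainingArguments(output_dir=..., **kwargs)."""
    training = cfg["training"]
    use_wandb = cfg.get("logging", {}).get("use_wandb", True)
    return {
        "num_train_epochs": training["epochs"],
        "per_device_train_batch_size": training["batch_size"],
        # Assumed config key; the page only states the value 2.
        "gradient_accumulation_steps": training["gradient_accumulation_steps"],
        "eval_strategy": "epoch",           # evaluate at the end of every epoch
        "save_strategy": "epoch",           # checkpoint at the end of every epoch
        "load_best_model_at_end": True,     # reload the best checkpoint when done
        "metric_for_best_model": "eval_map_50",
        "greater_is_better": True,          # higher mAP@50 is better
        "fp16": cuda_available,             # mixed precision only on a CUDA GPU
        "report_to": "all" if use_wandb else "none",  # assumed mapping
    }


def effective_batch_size(kwargs: dict) -> int:
    """Samples contributing to one optimizer step on a single device."""
    return kwargs["per_device_train_batch_size"] * kwargs["gradient_accumulation_steps"]


# A plain dict stands in for the parsed YAML config.
cfg = {
    "training": {"epochs": 100, "batch_size": 8, "gradient_accumulation_steps": 2},
    "logging": {"use_wandb": True},
}
kwargs = build_training_kwargs(cfg, cuda_available=False)
default_effective = effective_batch_size(kwargs)  # 8 * 2

# Low-memory variant: halve the batch size and double the accumulation steps.
low_mem_cfg = {**cfg, "training": {**cfg["training"], "batch_size": 4,
                                   "gradient_accumulation_steps": 4}}
low_mem_effective = effective_batch_size(build_training_kwargs(low_mem_cfg, False))  # 4 * 4
```

Dropping batch_size to 4 on its own halves the effective batch size. Doubling the accumulation steps at the same time, as in the low-memory variant, is one possible way to keep it unchanged while lowering peak memory.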