MammoMix supports two evaluation workflows: automatic evaluation hooked into the Hugging Face Trainer loop, and a standalone function for running inference and computing mAP on any test dataset.
Evaluation approaches
Automatic evaluation via Trainer
During training, pass the metrics function returned by get_eval_compute_metrics_fn to the Trainer as compute_metrics. The Trainer calls it after each evaluation pass with an EvalPrediction object containing batched predictions and ground-truth labels.
You must set eval_do_concat_batches=False in TrainingArguments. The compute_metrics function iterates over individual batches from evaluation_results.predictions and evaluation_results.label_ids. Concatenating batches before this step produces incorrect image-size tensors and breaks post-processing.
Standalone inference with mAP evaluation
Use run_model_inference_with_map (defined in evaluation.py) to evaluate any trained model against a test dataset outside the Trainer loop. This is the recommended path for final benchmark runs.
Internally the function:
- Wraps test_dataset in a DataLoader using collate_fn.
- Runs model.eval() and collects outputs under torch.no_grad().
- Delegates metric computation to calculate_custom_map_metrics.
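The three steps above can be sketched roughly as follows. This is an illustrative outline, not MammoMix's actual implementation: the helper name collect_outputs, the batch keys pixel_values and labels, the batch size, and the toy model and dataset are all assumptions. The metric step is left out because the signature of calculate_custom_map_metrics is not shown here.

```python
import torch
from torch.utils.data import DataLoader


def collect_outputs(model, test_dataset, collate_fn, batch_size=2, device="cpu"):
    """Hypothetical helper: run a model over a dataset and gather raw outputs."""
    loader = DataLoader(test_dataset, batch_size=batch_size, collate_fn=collate_fn)
    model.to(device)
    model.eval()  # switch off dropout and other training-only behaviour
    outputs, targets = [], []
    with torch.no_grad():  # gradients are not needed for evaluation
        for batch in loader:
            pixel_values = batch["pixel_values"].to(device)
            outputs.append(model(pixel_values))
            targets.append(batch["labels"])
    return outputs, targets


# Toy stand-ins so the sketch runs end to end (not MammoMix components).
def toy_collate_fn(items):
    return {
        "pixel_values": torch.stack([item["pixel_values"] for item in items]),
        "labels": [item["labels"] for item in items],
    }


toy_dataset = [
    {"pixel_values": torch.zeros(3, 8, 8), "labels": {"class_labels": torch.tensor([0])}}
    for _ in range(5)
]
toy_model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 8 * 8, 4))

outputs, targets = collect_outputs(toy_model, toy_dataset, toy_collate_fn)
```

Keeping outputs as a list of per-batch results, rather than one concatenated tensor, matches the per-batch iteration that the Trainer path relies on.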
get_eval_compute_metrics_fn
Defined in evaluation.py, this function returns compute_metrics with two fixed parameters:
| Parameter | Value | Purpose |
|---|---|---|
| threshold | 0.5 | Confidence cutoff: boxes below this score are discarded before mAP accumulation |
| id2label | {0: 'cancer'} | Single-class mapping used by the image processor during post-processing |
The returned function can be passed directly to Trainer(compute_metrics=...).
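One common way to fix parameters on a metrics callback is functools.partial. The sketch below illustrates that pattern only. Whether get_eval_compute_metrics_fn uses partial internally is an assumption, the factory name make_compute_metrics_fn is hypothetical, and the compute_metrics body is a placeholder that simply reports what it was given.

```python
from functools import partial


def compute_metrics(evaluation_results, threshold, id2label):
    """Placeholder body: the real function post-processes predictions and computes mAP."""
    return {"threshold": threshold, "num_classes": len(id2label)}


def make_compute_metrics_fn(threshold=0.5, id2label=None):
    """Hypothetical factory that binds the two fixed parameters from the table above."""
    id2label = id2label if id2label is not None else {0: "cancer"}
    return partial(compute_metrics, threshold=threshold, id2label=id2label)


eval_fn = make_compute_metrics_fn()

# The Trainer calls the callback with a single EvalPrediction argument;
# None stands in for it here because the placeholder body ignores it.
result = eval_fn(None)
```

Binding the parameters up front leaves a one-argument callable, which is the shape the Trainer expects for compute_metrics.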
Bounding box conversion: YOLO → Pascal VOC
Ground-truth labels are stored and fed to YOLOS in YOLO format: (x_center, y_center, width, height) normalised to [0, 1]. Before computing IoU-based metrics, MammoMix converts all boxes to Pascal VOC format: (x_min, y_min, x_max, y_max) in absolute pixel coordinates.
The conversion lives in evaluation.py, and the converted boxes are passed to torchmetrics.
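The arithmetic behind this conversion is short enough to show in plain Python. The function below is a minimal sketch for a single box, with a hypothetical name (yolo_to_pascal_voc); MammoMix's own helper may operate on tensors and batches instead.

```python
def yolo_to_pascal_voc(box, image_width, image_height):
    """Convert a normalised (x_center, y_center, width, height) box
    to absolute-pixel (x_min, y_min, x_max, y_max)."""
    x_center, y_center, width, height = box
    # Move from centre to corners, then scale from [0, 1] to pixels.
    x_min = (x_center - width / 2) * image_width
    y_min = (y_center - height / 2) * image_height
    x_max = (x_center + width / 2) * image_width
    y_max = (y_center + height / 2) * image_height
    return (x_min, y_min, x_max, y_max)


# A centred box covering 20% of the width and 40% of the height of a 200x100 image.
voc_box = yolo_to_pascal_voc((0.5, 0.5, 0.2, 0.4), image_width=200, image_height=100)
```

For this input the result is (80, 30, 120, 70) up to floating-point rounding. Note that x values scale by the image width and y values by the image height, which is why per-image sizes must be correct at this step.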
Output metrics
compute_metrics returns a dictionary filtered to keys that start with map. map_per_class is explicitly removed before returning because MammoMix is a single-class detector (cancer only). See Object detection metrics for a full explanation of each key.
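The filtering described here can be sketched in a few lines. The function name filter_map_metrics is hypothetical, the key names follow the output of torchmetrics' MeanAveragePrecision, and the values are placeholders for illustration only, not MammoMix results.

```python
def filter_map_metrics(metrics):
    """Keep keys that start with 'map' and drop the per-class entry."""
    filtered = {key: value for key, value in metrics.items() if key.startswith("map")}
    filtered.pop("map_per_class", None)  # redundant for a single-class detector
    return filtered


# Placeholder values; only the key names matter in this sketch.
raw_metrics = {
    "map": 0.0,
    "map_50": 0.0,
    "map_75": 0.0,
    "map_per_class": -1.0,
    "mar_1": 0.0,
    "mar_100": 0.0,
    "classes": [0],
}

filtered = filter_map_metrics(raw_metrics)
```

After filtering, only map, map_50, and map_75 remain from this sample dictionary; the recall keys (mar_*) and classes are dropped by the prefix check.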
ModelOutput dataclass
Post-processing via image_processor.post_process_object_detection requires a model output object with specific attributes. When running inference manually, MammoMix wraps raw tensors in a lightweight dataclass defined in evaluation.py. This dataclass stands in for YolosObjectDetectionOutput and satisfies the image processor’s interface without importing the full model output class.
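A minimal sketch of such a wrapper is shown below. The field set is an assumption based on the two attributes that Hugging Face object-detection post-processing reads (logits and pred_boxes); the actual dataclass in evaluation.py may declare tensor types or extra fields. Any is used here so the sketch runs without importing torch.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ModelOutput:
    """Minimal stand-in exposing the attributes read during post-processing (assumed field set)."""

    logits: Any      # class scores, typically shaped (batch, num_queries, num_classes + 1)
    pred_boxes: Any  # normalised (x_center, y_center, width, height) boxes


# Plain lists stand in for tensors in this illustration.
output = ModelOutput(logits=[[0.1, 0.9]], pred_boxes=[[0.5, 0.5, 0.2, 0.4]])
```

Because post-processing only relies on attribute access, any object with these two attributes can be passed in place of the full model output class.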