Documentation Index
Fetch the complete documentation index at: https://mintlify.com/tommyngx/MammoMix/llms.txt
Use this file to discover all available pages before exploring further.
evaluation.py provides mAP computation, HuggingFace Trainer-compatible metric callbacks, and a standalone inference loop for DETR-based mammography models.
ModelOutput
A lightweight dataclass that mirrors the output structure expected by AutoImageProcessor.post_process_object_detection. Use it to wrap raw logits and predicted boxes when calling the image processor outside a HuggingFace model forward pass.
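A minimal sketch of the wrapper, assuming the standard HuggingFace field names (`logits`, `pred_boxes`); the post-processing call in the comment is illustrative:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class ModelOutput:
    logits: Any      # class logits, shape (B, num_queries, num_classes + 1)
    pred_boxes: Any  # YOLO-normalised boxes, shape (B, num_queries, 4)

# Wrap raw tensors so the image processor can post-process them, e.g.:
# results = image_processor.post_process_object_detection(
#     ModelOutput(logits=raw_logits, pred_boxes=raw_boxes),
#     threshold=0.5,
#     target_sizes=[(640, 640)],
# )
```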
logits
Class logits produced by the detection head. Shape (B, num_queries, num_classes + 1), where the last dimension includes the no-object class.
pred_boxes
Predicted bounding boxes in YOLO normalised format (cx, cy, w, h). Shape (B, num_queries, 4).
convert_bbox_yolo_to_pascal
Converts bounding boxes from normalised YOLO format to absolute Pascal VOC coordinates.
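The coordinate transform can be sketched in plain Python; the actual function operates on torch Tensors, but the arithmetic is the same:

```python
def convert_bbox_yolo_to_pascal(boxes, image_size):
    """Convert normalised YOLO boxes (cx, cy, w, h) to absolute
    Pascal VOC boxes (x_min, y_min, x_max, y_max)."""
    height, width = image_size
    out = []
    for cx, cy, bw, bh in boxes:
        x_min = (cx - bw / 2) * width
        y_min = (cy - bh / 2) * height
        x_max = (cx + bw / 2) * width
        y_max = (cy + bh / 2) * height
        out.append((x_min, y_min, x_max, y_max))
    return out

convert_bbox_yolo_to_pascal([(0.5, 0.5, 0.5, 0.25)], (100, 200))
# → [(50.0, 37.5, 150.0, 62.5)]
```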
Parameters
Bounding boxes in YOLO format (cx, cy, w, h) with values normalised to [0, 1]. Shape (N, 4).
Target image dimensions as (height, width), used to scale the normalised coordinates to absolute pixel values.
Returns
Bounding boxes in Pascal VOC format (x_min, y_min, x_max, y_max) in absolute pixel coordinates. Shape (N, 4).
compute_metrics
Computes mean average precision (mAP) and related detection metrics from HuggingFace EvalPrediction objects. Decorated with @torch.no_grad().
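Conceptually, post-processing turns per-query logits into scored boxes and discards those below the confidence threshold. A pure-Python sketch of that filtering step, assuming a softmax over classes with the no-object class in the last column (the real work is delegated to post_process_object_detection and torchmetrics):

```python
import math

def filter_predictions(logits_rows, boxes_rows, threshold):
    """Keep boxes whose best non-background class score meets the threshold.
    logits_rows: per-query class logits, last column = no-object class.
    boxes_rows: the matching predicted boxes."""
    kept = []
    for logits, box in zip(logits_rows, boxes_rows):
        exps = [math.exp(v) for v in logits]
        total = sum(exps)
        probs = [e / total for e in exps]
        score = max(probs[:-1])          # exclude the no-object column
        label = probs.index(score)
        if score >= threshold:
            kept.append({"score": score, "label": label, "box": box})
    return kept
```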
Parameters
HuggingFace EvalPrediction object with .predictions (batched model outputs) and .label_ids (batched ground-truth annotations), as populated by Trainer.evaluate().
The same image processor used during training. Called with post_process_object_detection to convert raw model outputs to scored, filtered bounding boxes.
Confidence threshold for filtering predicted boxes before metric computation. Boxes with scores below this value are discarded.
Mapping from integer class id to string label name (e.g. {0: "cancer"}). Passed through to the image processor's post-processing step.
The spatial dimension used as both height and width when constructing target_sizes for post-processing. Should match the pad_size used during preprocessing.
Returns
Dictionary of mAP metrics with keys prefixed by "map". Common keys include "map", "map_50", "map_75", "map_small", "map_medium", and "map_large".
get_eval_compute_metrics_fn
Returns a functools.partial of compute_metrics pre-configured for MammoMix’s single-class cancer detection task. Pass the result directly as the compute_metrics argument to Trainer.
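The binding can be sketched with functools.partial; the stand-in compute_metrics signature below is illustrative (only threshold=0.5 and id2label={0: "cancer"} are confirmed by this page):

```python
import functools

def compute_metrics(eval_pred, image_processor=None, threshold=0.0, id2label=None):
    """Stand-in for the real metric function; signature is illustrative."""
    return {}

def get_eval_compute_metrics_fn(image_processor):
    # Pre-bind the single-class cancer-detection configuration so Trainer
    # can call the result with just an EvalPrediction.
    return functools.partial(
        compute_metrics,
        image_processor=image_processor,
        threshold=0.5,
        id2label={0: "cancer"},
    )
```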
Parameters
Image processor used to post-process predictions inside compute_metrics.
Returns
A partial of compute_metrics with threshold=0.5 and id2label={0: "cancer"} already bound. Accepts a single EvalPrediction argument.
calculate_custom_map_metrics
An alternative mAP implementation that processes raw model output objects directly, bypassing the EvalPrediction serialisation path used by compute_metrics. Useful when evaluating outside Trainer or when the standard path encounters tensor-shape issues.
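The exception-fallback behaviour can be sketched as follows, using the metric keys this function returns:

```python
ZERO_METRICS = {key: 0.0 for key in
                ("map", "map_50", "map_75", "map_small", "map_medium", "map_large")}

def with_zero_fallback(compute):
    """Run the metric computation; on any exception return zeroed metrics
    rather than propagating the error."""
    try:
        return compute()
    except Exception:
        return dict(ZERO_METRICS)
```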
All tensors are moved to CPU before computing metrics for compatibility with torchmetrics. Returns zeroed metrics as a fallback if any exception is raised.
Parameters
List of model output objects. Each element must expose .logits and .pred_boxes attributes (e.g. instances of ModelOutput or native HuggingFace model outputs).
List of ground-truth annotation dicts, each with keys "boxes" (YOLO-format Tensor) and "class_labels" (Tensor).
Image processor used to post-process predictions via post_process_object_detection with threshold=0.5.
The device on which intermediate tensors are created. Final metric computation is performed on CPU.
Image spatial dimension used as the target size during post-processing.
Returns
mAP metrics dict with the same keys as compute_metrics ("map", "map_50", "map_75", "map_small", "map_medium", "map_large"). All values are Python floats. Returns all zeros on error.
run_model_inference_with_map
End-to-end inference loop: iterates over a test dataset, collects model predictions, and returns mAP metrics. This is the primary entry point for evaluating a trained checkpoint on a held-out split.
Parameters
A trained HuggingFace detection model (e.g. AutoModelForObjectDetection). The function calls model.eval() before running inference and wraps the loop in torch.no_grad().
A BreastCancerDataset instance initialised with split="test". Wrapped in a DataLoader internally.
Image processor forwarded to calculate_custom_map_metrics for post-processing.
Device to move pixel_values and label tensors to before the forward pass.
Number of images per inference batch.
Returns
mAP metrics dict returned by calculate_custom_map_metrics, aggregated over the full test set.
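The overall loop structure can be sketched generically. Here model_fn stands in for the model forward pass; the real implementation additionally calls model.eval(), wraps the loop in torch.no_grad(), moves tensors to the target device, and hands the collected outputs to calculate_custom_map_metrics:

```python
def batched(items, batch_size):
    # Mimic the DataLoader batching used internally.
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

def run_model_inference(model_fn, dataset, batch_size=8):
    """Structural sketch of the inference loop: iterate over batches,
    run the model on each, and collect the outputs for metric computation."""
    outputs = []
    for batch in batched(dataset, batch_size):
        outputs.append(model_fn(batch))
    return outputs
```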