utils.py provides thin, stateless helper functions used across the MammoMix training and evaluation scripts. Functions cover YAML config loading, image processor initialisation, model architecture detection, and VOC XML annotation parsing.
load_config
Loads a YAML configuration file and returns its contents as a Python dictionary.
Parameters
Path to the YAML file to load. Passed directly to open(), so both relative and absolute paths are accepted.
Returns
Parsed YAML document as a Python dictionary. Keys and value types reflect the structure of the config file.
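A minimal sketch of what such a loader might look like, assuming PyYAML and its yaml.safe_load function as the YAML backend. The parameter name config_path and the sample keys are hypothetical, and the actual implementation in utils.py may differ.

```python
import os
import tempfile

import yaml  # PyYAML; assumed here to be the YAML backend


def load_config(config_path):
    """Read a YAML file and return its contents as a dict (illustrative sketch)."""
    with open(config_path, "r") as f:
        return yaml.safe_load(f)


# Usage with a throwaway file; the keys and values are made up for illustration.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.yaml")
    with open(path, "w") as f:
        f.write("model_name: hustvl/yolos-base\nmax_size: 640\n")
    config = load_config(path)
```

Scalar types follow YAML's own rules, so in this sketch max_size comes back as an int and model_name as a str.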
get_image_processor
Creates and returns an AutoImageProcessor configured for fixed-size padding, which is required by YOLOS and DETR models in MammoMix.
Parameters
HuggingFace model identifier or local path. Used as the source for the processor configuration. Supports any checkpoint compatible with AutoImageProcessor.from_pretrained.
Maximum height and width in pixels. Applied to both size (for resizing) and pad_size (for padding), so all images are resized to fit within a max_size × max_size canvas and padded to exactly that resolution.
Returns
Processor instance with the following flags set:
| Flag | Value |
|---|---|
| do_resize | True |
| do_pad | True |
| use_fast | True |
| size | {"max_height": max_size, "max_width": max_size} |
| pad_size | {"height": max_size, "width": max_size} |
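The combined effect of size and pad_size comes down to simple arithmetic. The sketch below is not the processor's code: it assumes an aspect-ratio-preserving resize with the shorter side rounded down, and fit_and_pad is a hypothetical helper name. The real processor's rounding may differ by a pixel.

```python
def fit_and_pad(width, height, max_size):
    """Return the resized (width, height) and the (horizontal, vertical) padding.

    Assumptions: the longer side is scaled to max_size, the shorter side is
    scaled by the same factor and floored, and padding fills the rest of the
    max_size x max_size canvas.
    """
    if height >= width:
        new_h = max_size
        new_w = width * max_size // height
    else:
        new_w = max_size
        new_h = height * max_size // width
    return (new_w, new_h), (max_size - new_w, max_size - new_h)


# A portrait 1000 x 2000 image with max_size=640 shrinks to 320 x 640,
# leaving 320 columns of padding and no padded rows.
resized, padding = fit_and_pad(1000, 2000, 640)
```

Every image therefore ends up with the same max_size × max_size tensor shape, whatever its original aspect ratio.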
get_model_type
Infers the model architecture family from the model identifier string.
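The documented behaviour is a case-sensitive substring check on the identifier, which can be sketched in a single expression. This is an illustration of that behaviour, not necessarily the exact code in utils.py, and the checkpoint names in the usage lines are examples only.

```python
def get_model_type(model_name):
    """Return "yolos" if the identifier contains "yolos", otherwise "detr"."""
    return "yolos" if "yolos" in model_name else "detr"


yolos_type = get_model_type("hustvl/yolos-base")
detr_type = get_model_type("facebook/detr-resnet-50")
# The check is case-sensitive, so an upper-case name falls through to "detr".
upper_type = get_model_type("my-runs/YOLOS-finetuned")
```

Because "detr" is the fallback, any identifier without a lower-case "yolos" in it is treated as DETR, including identifiers for unrelated architectures.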
Parameters
HuggingFace model identifier or local directory name. The function checks whether the string contains the substring "yolos" (case-sensitive).
Returns
"yolos" if "yolos" is found in model_name; otherwise "detr".
parse_voc_xml
Parses a Pascal VOC XML annotation file and extracts image metadata and bounding box coordinates.
Parameters
Absolute or relative path to a Pascal VOC XML file. The file must have a standard VOC structure with <filename>, <size>, and one or more <object> elements.
Returns
Parsed annotation data.
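As a rough illustration, a VOC parser built on the standard library's xml.etree.ElementTree could look like the sketch below. The parameter name xml_path and the returned keys ("filename", "width", "height", "bboxes", "name") are assumptions made for this example; only the "xmin", "ymin", "xmax", and "ymax" keys are confirmed by the xml2dicts description.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Tiny made-up annotation in the standard Pascal VOC layout.
SAMPLE = """<annotation>
  <filename>case_001.png</filename>
  <size><width>1000</width><height>2000</height><depth>1</depth></size>
  <object>
    <name>mass</name>
    <bndbox><xmin>120</xmin><ymin>340</ymin><xmax>260</xmax><ymax>505</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc_xml(xml_path):
    """Parse a VOC file into a dict (hypothetical return layout)."""
    root = ET.parse(xml_path).getroot()
    size = root.find("size")
    bboxes = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        entry = {"name": obj.findtext("name")}
        for key in ("xmin", "ymin", "xmax", "ymax"):
            # VOC coordinates are sometimes written as floats; store ints.
            entry[key] = int(float(box.findtext(key)))
        bboxes.append(entry)
    return {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "bboxes": bboxes,
    }


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "case_001.xml")
    with open(path, "w") as f:
        f.write(SAMPLE)
    parsed = parse_voc_xml(path)
```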
xml2dicts
Converts a list of raw VOC bounding-box dicts (as produced by parse_voc_xml) into the format expected by the DETR image processor.
width and height are accepted as parameters for future normalisation use, but are not applied in the current implementation. All coordinates remain in absolute pixel space.
Parameters
List of bounding-box dicts as returned by parse_voc_xml. Each dict must contain "xmin", "ymin", "xmax", and "ymax" keys.
Image width in pixels. Accepted for interface consistency; not used for coordinate normalisation.
Image height in pixels. Accepted for interface consistency; not used for coordinate normalisation.
Returns
List of annotation dicts, one per input bounding box.
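The exact keys of the returned dicts are not listed here. The sketch below assumes the COCO-style layout that the HuggingFace DETR image processor accepts for detection targets: "bbox" as [x, y, width, height] in pixels, plus "category_id", "area", and "iscrowd". The parameter name bboxes and the single class id 0 are hypothetical.

```python
def xml2dicts(bboxes, width, height):
    """Convert VOC corner boxes to COCO-style annotation dicts (assumed layout).

    width and height are accepted but unused, mirroring the documented
    behaviour: coordinates stay in absolute pixels.
    """
    annotations = []
    for box in bboxes:
        box_w = box["xmax"] - box["xmin"]
        box_h = box["ymax"] - box["ymin"]
        annotations.append({
            "bbox": [box["xmin"], box["ymin"], box_w, box_h],
            "category_id": 0,  # hypothetical single-class label
            "area": box_w * box_h,
            "iscrowd": 0,
        })
    return annotations


annotations = xml2dicts(
    [{"xmin": 120, "ymin": 340, "xmax": 260, "ymax": 505}],
    width=1000,
    height=2000,
)
```

The main point of the conversion is the change from VOC corner coordinates (xmin, ymin, xmax, ymax) to a corner-plus-size box, with one output dict per input box.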