momo.py merges three independent mammography datasets — CSAW, DMID, and DDSM — into a single YOLO-style directory tree. Each file is copied with a dataset-name prefix to avoid filename collisions, and per-split .txt manifests are updated to point to the merged paths.
merge_datasets
Copies all images and labels from each source dataset into a new unified directory, prefixing every filename with its source dataset name, and writes merged train.txt, val.txt, and test.txt manifests.
Parameters
input_dir
Path to the parent directory that contains the individual dataset folders (CSAW/, DMID/, DDSM/). The merged output directory is also created here.
output_name
Name of the output folder to create inside input_dir. For example, passing "MammoMix" creates {input_dir}/MammoMix/.
Side effects
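The function creates the merged tree under {input_dir}/{output_name}/: a {split}/images/ and a {split}/labels/ directory for each split, plus the merged train.txt, val.txt, and test.txt manifests.
File naming. Each copied file is renamed to {DATASET}_{original_filename}. For example, img001.jpg from CSAW becomes CSAW_img001.jpg. This applies to both image and label files.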
Manifest lines. Each line in the merged .txt files is an absolute path of the form {output_root}/{split}/images/{DATASET}_{filename}.
Missing datasets. If a source dataset folder does not exist inside input_dir, that dataset is skipped with a console warning and processing continues.
Missing splits. If a specific {dataset}/{split}/images or {dataset}/{split}/labels directory is absent, that split for that dataset is skipped with a console warning.
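The behavior described above can be approximated with a short sketch. This is not the momo.py source: the function name merge_datasets_sketch, the warning text, the placement of the manifests at the top of the output directory, and building the manifests from the copied images (rather than rewriting existing per-dataset manifests) are all assumptions made for illustration.

```python
import shutil
from pathlib import Path

DATASETS = ("CSAW", "DMID", "DDSM")
SPLITS = ("train", "val", "test")


def merge_datasets_sketch(input_dir, output_name):
    """Illustrative re-implementation of the documented behavior (hypothetical)."""
    input_root = Path(input_dir)
    output_root = (input_root / output_name).resolve()
    manifests = {split: [] for split in SPLITS}

    for dataset in DATASETS:
        dataset_dir = input_root / dataset
        if not dataset_dir.is_dir():
            # Missing dataset: warn and keep going.
            print(f"Warning: {dataset_dir} not found, skipping dataset")
            continue
        for split in SPLITS:
            src_images = dataset_dir / split / "images"
            src_labels = dataset_dir / split / "labels"
            if not (src_images.is_dir() and src_labels.is_dir()):
                # Missing split: warn and skip only this dataset/split pair.
                print(f"Warning: {dataset}/{split} is incomplete, skipping split")
                continue
            dst_images = output_root / split / "images"
            dst_labels = output_root / split / "labels"
            dst_images.mkdir(parents=True, exist_ok=True)
            dst_labels.mkdir(parents=True, exist_ok=True)
            for image in sorted(src_images.iterdir()):
                if image.is_file():
                    target = dst_images / f"{dataset}_{image.name}"
                    shutil.copy2(image, target)
                    # Manifest entries are absolute paths to the merged images.
                    manifests[split].append(str(target))
            for label in sorted(src_labels.iterdir()):
                if label.is_file():
                    shutil.copy2(label, dst_labels / f"{dataset}_{label.name}")

    # Assumption: one manifest per split, written at the top of the output folder.
    output_root.mkdir(parents=True, exist_ok=True)
    for split, lines in manifests.items():
        manifest_path = output_root / f"{split}.txt"
        manifest_path.write_text("".join(f"{line}\n" for line in lines))
    return output_root
```

Because every destination name carries the dataset prefix, two datasets can both contain img001.jpg without overwriting each other in the merged split directory.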
Example — directory layout before merging
Example — directory layout after merging
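As a rough illustration of how individual files move between the two layouts, the sketch below maps a few source paths to their merged destinations using the naming rule described earlier. The helper merged_location is hypothetical, and the sample files other than CSAW's img001.jpg are made up for the example.

```python
from pathlib import PurePosixPath


def merged_location(input_dir, output_name, dataset, split, kind, filename):
    """Return (source, destination) paths for one file (hypothetical helper)."""
    root = PurePosixPath(input_dir)
    source = root / dataset / split / kind / filename
    destination = root / output_name / split / kind / f"{dataset}_{filename}"
    return str(source), str(destination)


# Sample files; only CSAW's img001.jpg is taken from the text above.
examples = [
    ("CSAW", "train", "images", "img001.jpg"),
    ("CSAW", "train", "labels", "img001.txt"),
    ("DMID", "train", "images", "img001.jpg"),  # same name, different dataset
]

for example in examples:
    src, dst = merged_location("/data/mammography", "MammoMix", *example)
    print(f"{src} -> {dst}")
```

The first and third entries share a filename but end up as CSAW_img001.jpg and DMID_img001.jpg in the same merged directory.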
CLI usage
momo.py is also executable as a standalone script via argparse.
Arguments
Path to the parent directory containing CSAW/, DMID/, and DDSM/ subdirectories. Passed directly to merge_datasets(input_dir=...).
Name for the output merged dataset folder created inside input_dir. Passed directly to merge_datasets(output_name=...).
Example
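Running the script with /data/mammography as the input directory and MammoMix as the output name creates /data/mammography/MammoMix/ with the merged directory structure described above.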
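A minimal sketch of how such an argparse entry point might be wired is shown below. The flag spellings --input_dir and --output_name, the choice to make both required, and the helper name parse_cli are assumptions; the actual momo.py may name or default its arguments differently.

```python
import argparse


def parse_cli(argv=None):
    """Parse CLI arguments into keyword arguments for merge_datasets (sketch)."""
    parser = argparse.ArgumentParser(
        description="Merge CSAW, DMID, and DDSM into a single dataset folder."
    )
    # Assumed flag names, chosen to mirror the merge_datasets parameters.
    parser.add_argument(
        "--input_dir",
        required=True,
        help="Parent directory containing CSAW/, DMID/, and DDSM/.",
    )
    parser.add_argument(
        "--output_name",
        required=True,
        help="Name of the merged folder created inside input_dir.",
    )
    args = parser.parse_args(argv)
    return {"input_dir": args.input_dir, "output_name": args.output_name}


if __name__ == "__main__":
    kwargs = parse_cli()
    # In momo.py, this is where merge_datasets(**kwargs) would be called.
    print(kwargs)
```

Returning a plain dictionary keeps the parsing step separate from the file copying, so the argument handling can be checked without touching any dataset on disk.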