

mocae.py implements the Mixture of Calibrated Experts (MoCaE) ensemble strategy for mammography detection. Three dataset-specific YOLOS models each produce predictions; scores are calibrated by a RandomForestRegressor, then merged via Soft NMS and Score Voting.

soft_nms

Soft Non-Maximum Suppression — iteratively selects the highest-scoring box and decays the scores of nearby overlapping boxes instead of removing them outright. Because weak detections are down-weighted rather than discarded, it typically retains more boxes than greedy NMS.
from mocae import soft_nms
import torch

boxes  = torch.tensor([[10., 10., 50., 50.], [12., 12., 52., 52.], [200., 200., 300., 300.]])
scores = torch.tensor([0.9, 0.75, 0.6])

keep_boxes, keep_scores = soft_nms(boxes, scores, sigma_nms=0.1, iou_nms=0.65, method="gaussian")

Parameters

boxes
torch.Tensor
required
Detection boxes in (x1, y1, x2, y2) format. Shape (N, 4).
scores
torch.Tensor
required
Confidence scores for each box. Shape (N,).
sigma_nms
float
default:"0.1"
Gaussian decay parameter σ. Controls how sharply scores drop as IoU increases. Smaller values produce stronger suppression. Only used when method="gaussian".
iou_nms
float
default:"0.65"
IoU threshold for the linear decay method. Boxes with IoU above this value are linearly down-weighted. Only used when method="linear".
score_thresh
float
default:"0"
Boxes whose decayed score falls at or below this threshold are removed from consideration in subsequent iterations.
method
string
default:"gaussian"
Decay strategy. "gaussian" applies exp(-(iou² / σ)); "linear" applies 1 - iou for boxes above iou_nms.

Returns

keep_boxes
torch.Tensor
Surviving boxes after suppression. Shape (M, 4) where M ≤ N.
keep_scores
torch.Tensor
Decayed confidence scores corresponding to keep_boxes. Shape (M,).

score_voting

Refines bounding-box coordinates and scores using neighbourhood agreement. Each box is updated as a weighted average of all other boxes, where weights are a product of calibrated confidence and IoU-derived similarity. Self-influence is excluded.
from mocae import score_voting
import torch

boxes  = torch.tensor([[10., 10., 50., 50.], [12., 12., 52., 52.]])
scores = torch.tensor([0.85, 0.70])

refined_boxes, refined_scores = score_voting(boxes, scores, sigma_sv=0.1)

Parameters

boxes
torch.Tensor
required
Boxes in (x1, y1, x2, y2) format. Shape (N, 4). Returns the inputs unchanged if N == 0.
scores
torch.Tensor
required
Calibrated confidence scores. Shape (N,).
sigma_sv
float
default:"0.1"
Score Voting sigma parameter. Controls how steeply the IoU-based weight falls off. Smaller values mean only very high-IoU neighbours contribute significantly.

Returns

refined_boxes
torch.Tensor
Weighted-average box coordinates. Shape (N, 4). Division by zero is avoided by adding 1e-8 to the denominator.
refined_scores
torch.Tensor
Neighbourhood-agreement scores computed as a weighted average of neighbour confidence values. Shape (N,).
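The update rule can be sketched as below. The exact similarity kernel in mocae.score_voting is not documented here, so exp(-(1 - IoU)² / sigma_sv) is an assumption, and `score_voting_sketch` is an illustrative name.

```python
import torch

# Sketch of the Score Voting update described above, assuming a Gaussian
# IoU-similarity exp(-(1 - IoU)^2 / sigma_sv). Not the actual mocae code.
def score_voting_sketch(boxes, scores, sigma_sv=0.1, eps=1e-8):
    n = boxes.shape[0]
    if n == 0:
        return boxes, scores
    # pairwise IoU matrix
    x1 = torch.maximum(boxes[:, None, 0], boxes[None, :, 0])
    y1 = torch.maximum(boxes[:, None, 1], boxes[None, :, 1])
    x2 = torch.minimum(boxes[:, None, 2], boxes[None, :, 2])
    y2 = torch.minimum(boxes[:, None, 3], boxes[None, :, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    iou = inter / (area[:, None] + area[None, :] - inter)
    # weight of neighbour j on box i: confidence times IoU similarity, self excluded
    w = scores[None, :] * torch.exp(-((1 - iou) ** 2) / sigma_sv)
    w = w * (1 - torch.eye(n))
    denom = w.sum(dim=1, keepdim=True) + eps   # 1e-8 avoids division by zero
    refined_boxes = (w @ boxes) / denom
    refined_scores = (w * scores[None, :]).sum(dim=1) / denom.squeeze(1)
    return refined_boxes, refined_scores

boxes = torch.tensor([[10., 10., 50., 50.], [12., 12., 52., 52.]])
scores = torch.tensor([0.85, 0.70])
refined_boxes, refined_scores = score_voting_sketch(boxes, scores)
```

Note the degenerate two-box case: with self-influence excluded, each refined box is pulled entirely onto its single neighbour.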

combine_predictions

Runs the full MoCaE ensemble pipeline on the test split of a single dataset. For each batch, all expert models produce scored predictions; scores are calibrated; predictions are pooled across experts then refined with Soft NMS and Score Voting; and cumulative mAP is tracked with MeanAveragePrecision.
from mocae import combine_predictions

result = combine_predictions(
    image_processors=image_processors,   # dict[dataset_name -> AutoImageProcessor]
    models=models,                        # list of trained AutoModelForObjectDetection
    calibrators=calibrators,              # list of fitted RandomForestRegressor
    dataset_name="CSAW",
    splits_dir="/tmp/splits",
    batch_size=8,
    sigma_nms=0.08,
    iou_nms=0.65,
    score_thresh=0,
    method="gaussian",
)
print(result)  # {"map": ..., "map_50": ..., ...}

Parameters

image_processors
dict[str, AutoImageProcessor]
required
Mapping from dataset name (e.g. "CSAW") to its corresponding AutoImageProcessor. The processor for dataset_name is used when constructing the test dataset; all processors are iterated when decoding each expert’s predictions.
models
list[nn.Module]
required
List of expert models in the same order as calibrators. Each model is called with pixel_values and must return an output with logits and pred_boxes.
calibrators
list[RandomForestRegressor]
required
List of fitted sklearn.ensemble.RandomForestRegressor calibrators, one per model. Each calibrator predicts calibrated IoU-scores from a feature vector of [image_embedding (512-d), confidence (1-d)].
dataset_name
string
required
Name of the dataset to evaluate on (e.g. "CSAW", "DMID", or "DDSM"). Used to load the test split from splits_dir.
splits_dir
string
required
Root directory containing per-dataset split .txt files. Forwarded to BreastCancerDataset.
batch_size
int
default:"8"
Number of images per inference batch.
sigma_nms
float
default:"0.08"
Gaussian sigma passed to soft_nms. Also reused as sigma_sv in the subsequent score_voting call.
iou_nms
float
default:"0.65"
IoU threshold passed to soft_nms for the linear decay method.
score_thresh
float
default:"0"
Minimum score threshold for soft_nms.
method
string
default:"gaussian"
NMS decay method passed to soft_nms. "gaussian" or "linear".

Returns

result
dict
Output of torchmetrics.detection.MeanAveragePrecision.compute(). Contains keys such as "map", "map_50", "map_75", "map_small", "map_medium", "map_large".

build_calibrate_dataset

Populates three mutable lists — image_embeddings, confidences, and ious — by running a single expert model over a dataset split. Image features are extracted with a frozen ResNet-18 backbone. IoU values are computed against ground-truth boxes for in-domain data and set to 0 for out-of-domain data.
from mocae import build_calibrate_dataset

image_embeddings, confidences, ious = [], [], []

build_calibrate_dataset(
    config=config,                # dict with keys "model", "image_processor", "dataset_name"
    image_embeddings=image_embeddings,
    confidences=confidences,
    ious=ious,
    dataset_name="CSAW",
    batch_size=8,
    split="train",
)
This function mutates the lists passed in rather than returning new ones. To build a combined calibration corpus across all datasets, call it in a loop — or use combine_datasets, which does exactly that.

Parameters

config
dict
required
Expert configuration dict with keys "model" (the expert nn.Module), "image_processor" (AutoImageProcessor), and "dataset_name" (string). The "dataset_name" key is compared with dataset_name to decide whether IoU labels are real (in-domain) or zeroed (out-of-domain).
image_embeddings
list
required
Mutable list to append 512-dimensional ResNet-18 feature vectors (as NumPy arrays) to, one per predicted cancer box.
confidences
list
required
Mutable list to append scalar confidence scores (float) to, one per predicted cancer box.
ious
list
required
Mutable list to append IoU labels (float) to. For in-domain predictions, values are the max IoU against ground-truth boxes; for out-of-domain predictions, values are 0.
dataset_name
string
required
Name of the dataset split to load (e.g. "CSAW"). Used to construct a BreastCancerDataset.
batch_size
int
default:"8"
Batch size for the internal DataLoader.
split
string
default:"train"
Dataset split to iterate over. Typically "train" or "val".
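The in-domain/out-of-domain labelling rule can be sketched as follows. `max_iou_labels` is a hypothetical helper written for illustration; it is not part of mocae.py.

```python
import torch

# Illustrative sketch of the IoU labelling rule: in-domain predictions get
# their max IoU against ground truth, out-of-domain predictions get 0.
# `max_iou_labels` is a hypothetical helper, not part of mocae.py.
def max_iou_labels(pred_boxes, gt_boxes, in_domain):
    if not in_domain or gt_boxes.numel() == 0:
        return torch.zeros(pred_boxes.shape[0])
    x1 = torch.maximum(pred_boxes[:, None, 0], gt_boxes[None, :, 0])
    y1 = torch.maximum(pred_boxes[:, None, 1], gt_boxes[None, :, 1])
    x2 = torch.minimum(pred_boxes[:, None, 2], gt_boxes[None, :, 2])
    y2 = torch.minimum(pred_boxes[:, None, 3], gt_boxes[None, :, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area_p = (pred_boxes[:, 2] - pred_boxes[:, 0]) * (pred_boxes[:, 3] - pred_boxes[:, 1])
    area_g = (gt_boxes[:, 2] - gt_boxes[:, 0]) * (gt_boxes[:, 3] - gt_boxes[:, 1])
    iou = inter / (area_p[:, None] + area_g[None, :] - inter)
    # max IoU of each prediction against all ground-truth boxes
    return iou.max(dim=1).values

pred = torch.tensor([[0., 0., 10., 10.], [20., 20., 30., 30.]])
gt = torch.tensor([[0., 0., 10., 10.]])
in_domain_labels = max_iou_labels(pred, gt, in_domain=True)
out_domain_labels = max_iou_labels(pred, gt, in_domain=False)
```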

combine_datasets

Aggregates calibration data across all datasets for a single expert by calling build_calibrate_dataset once per dataset name, then concatenating the embeddings with the confidence scores to form the final feature matrix.
from mocae import combine_datasets

inputs, ious = combine_datasets(
    config=config,
    dataset_names=["CSAW", "DMID", "DDSM"],
    split="val",
)
# inputs.shape → (num_samples, 513)
# len(ious)    → num_samples

Parameters

config
dict
required
Expert configuration dict forwarded to build_calibrate_dataset. See the config parameter of that function for required keys.
dataset_names
list[str]
required
Ordered list of dataset names to iterate over (e.g. ["CSAW", "DMID", "DDSM"]).
split
string
default:"train"
Dataset split to use when collecting calibration data.

Returns

inputs
numpy.ndarray
Feature matrix of shape (num_samples, 513) where the first 512 columns are ResNet-18 image embeddings and the last column is the raw confidence score. Used as X when fitting a RandomForestRegressor.
ious
list[float]
Target IoU values of length num_samples. Used as y when fitting the calibrator.
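As a concrete sketch of the fitting step, the snippet below uses synthetic arrays standing in for combine_datasets output; the shapes and the [embedding, confidence] feature layout follow the description above, while the RandomForestRegressor hyperparameters are arbitrary.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-ins for combine_datasets output: 32 detections, each with
# a 512-d embedding, a raw confidence, and an IoU target.
rng = np.random.default_rng(0)
embeddings = rng.random((32, 512))            # stand-in for ResNet-18 features
confidences = rng.random(32)                  # stand-in for raw detection scores
features = np.hstack([embeddings, confidences[:, None]])   # shape (32, 513)
iou_targets = rng.random(32)                  # stand-in for IoU labels

# Fit the calibrator: X is the (N, 513) feature matrix, y the IoU targets
calibrator = RandomForestRegressor(n_estimators=10, random_state=0)
calibrator.fit(features, iou_targets)
calibrated = calibrator.predict(features)     # calibrated IoU-scores, one per box
```

The fitted calibrator is then one element of the calibrators list passed to combine_predictions.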
