

mocae.py implements the Mixture of Calibrated Experts (MoCaE) ensemble strategy for mammography detection. Three dataset-specific YOLOS models each produce predictions; scores are calibrated by a RandomForestRegressor, then merged via Soft NMS and Score Voting.

soft_nms

Soft Non-Maximum Suppression — iteratively selects the highest-scoring box and decays the scores of nearby overlapping boxes instead of removing them outright. Because weak detections are down-weighted rather than discarded, it typically retains more boxes than greedy NMS.
from mocae import soft_nms
import torch

boxes  = torch.tensor([[10., 10., 50., 50.], [12., 12., 52., 52.], [200., 200., 300., 300.]])
scores = torch.tensor([0.9, 0.75, 0.6])

keep_boxes, keep_scores = soft_nms(boxes, scores, sigma_nms=0.1, iou_nms=0.65, method="gaussian")

Parameters

boxes
torch.Tensor
required
Detection boxes in (x1, y1, x2, y2) format. Shape (N, 4).
scores
torch.Tensor
required
Confidence scores for each box. Shape (N,).
sigma_nms
float
default:"0.1"
Gaussian decay parameter σ. Controls how sharply scores drop as IoU increases. Smaller values produce stronger suppression. Only used when method="gaussian".
iou_nms
float
default:"0.65"
IoU threshold for the linear decay method. Boxes with IoU above this value are linearly down-weighted. Only used when method="linear".
score_thresh
float
default:"0"
Boxes whose decayed score falls at or below this threshold are removed from consideration in subsequent iterations.
method
string
default:"gaussian"
Decay strategy. "gaussian" applies exp(-(iou² / σ)); "linear" applies 1 - iou for boxes above iou_nms.

Returns

keep_boxes
torch.Tensor
Surviving boxes after suppression. Shape (M, 4) where M ≤ N.
keep_scores
torch.Tensor
Decayed confidence scores corresponding to keep_boxes. Shape (M,).

score_voting

Refines bounding-box coordinates and scores using neighbourhood agreement. Each box is updated as a weighted average of all other boxes, where weights are a product of calibrated confidence and IoU-derived similarity. Self-influence is excluded.
from mocae import score_voting
import torch

boxes  = torch.tensor([[10., 10., 50., 50.], [12., 12., 52., 52.]])
scores = torch.tensor([0.85, 0.70])

refined_boxes, refined_scores = score_voting(boxes, scores, sigma_sv=0.1)

Parameters

boxes
torch.Tensor
required
Boxes in (x1, y1, x2, y2) format. Shape (N, 4). Returns the inputs unchanged if N == 0.
scores
torch.Tensor
required
Calibrated confidence scores. Shape (N,).
sigma_sv
float
default:"0.1"
Score Voting sigma parameter. Controls how steeply the IoU-based weight falls off. Smaller values mean only very high-IoU neighbours contribute significantly.

Returns

refined_boxes
torch.Tensor
Weighted-average box coordinates. Shape (N, 4). Division by zero is avoided by adding 1e-8 to the denominator.
refined_scores
torch.Tensor
Neighbourhood-agreement scores computed as a weighted average of neighbour confidence values. Shape (N,).
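The update rule can be sketched as below. The exact similarity kernel in mocae.score_voting is not documented here, so exp(-(1 - IoU)² / sigma_sv) is an assumption, and `score_voting_sketch` is an illustrative name.

```python
import torch

# Sketch of the Score Voting update described above, assuming a Gaussian
# IoU-similarity exp(-(1 - IoU)^2 / sigma_sv). Not the actual mocae code.
def score_voting_sketch(boxes, scores, sigma_sv=0.1, eps=1e-8):
    n = boxes.shape[0]
    if n == 0:
        return boxes, scores
    # pairwise IoU matrix
    x1 = torch.maximum(boxes[:, None, 0], boxes[None, :, 0])
    y1 = torch.maximum(boxes[:, None, 1], boxes[None, :, 1])
    x2 = torch.minimum(boxes[:, None, 2], boxes[None, :, 2])
    y2 = torch.minimum(boxes[:, None, 3], boxes[None, :, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    iou = inter / (area[:, None] + area[None, :] - inter)
    # weight of neighbour j on box i: confidence times IoU similarity, self excluded
    w = scores[None, :] * torch.exp(-((1 - iou) ** 2) / sigma_sv)
    w = w * (1 - torch.eye(n))
    denom = w.sum(dim=1, keepdim=True) + eps   # 1e-8 avoids division by zero
    refined_boxes = (w @ boxes) / denom
    refined_scores = (w * scores[None, :]).sum(dim=1) / denom.squeeze(1)
    return refined_boxes, refined_scores

boxes = torch.tensor([[10., 10., 50., 50.], [12., 12., 52., 52.]])
scores = torch.tensor([0.85, 0.70])
refined_boxes, refined_scores = score_voting_sketch(boxes, scores)
```

Note the degenerate two-box case: with self-influence excluded, each refined box is pulled entirely onto its single neighbour.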

combine_predictions

Runs the full MoCaE ensemble pipeline on the test split of a single dataset. For each batch, all expert models produce scored predictions; scores are calibrated; predictions are pooled across experts then refined with Soft NMS and Score Voting; and cumulative mAP is tracked with MeanAveragePrecision.
from mocae import combine_predictions

result = combine_predictions(
    image_processors=image_processors,   # dict[dataset_name -> AutoImageProcessor]
    models=models,                        # list of trained AutoModelForObjectDetection
    calibrators=calibrators,              # list of fitted RandomForestRegressor
    dataset_name="CSAW",
    splits_dir="/tmp/splits",
    batch_size=8,
    sigma_nms=0.08,
    iou_nms=0.65,
    score_thresh=0,
    method="gaussian",
)
print(result)  # {"map": ..., "map_50": ..., ...}

Parameters

image_processors
dict[str, AutoImageProcessor]
required
Mapping from dataset name (e.g. "CSAW") to its corresponding AutoImageProcessor. The processor for dataset_name is used when constructing the test dataset; all processors are iterated when decoding each expert’s predictions.
models
list[nn.Module]
required
List of expert models in the same order as calibrators. Each model is called with pixel_values and must return an output with logits and pred_boxes.
calibrators
list[RandomForestRegressor]
required
List of fitted sklearn.ensemble.RandomForestRegressor calibrators, one per model. Each calibrator predicts calibrated IoU-scores from a feature vector of [image_embedding (512-d), confidence (1-d)].
dataset_name
string
required
Name of the dataset to evaluate on (e.g. "CSAW", "DMID", or "DDSM"). Used to load the test split from splits_dir.
splits_dir
string
required
Root directory containing per-dataset split .txt files. Forwarded to BreastCancerDataset.
batch_size
int
default:"8"
Number of images per inference batch.
sigma_nms
float
default:"0.08"
Gaussian sigma passed to soft_nms. Also reused as sigma_sv in the subsequent score_voting call.
iou_nms
float
default:"0.65"
IoU threshold passed to soft_nms for the linear decay method.
score_thresh
float
default:"0"
Minimum score threshold for soft_nms.
method
string
default:"gaussian"
NMS decay method passed to soft_nms. "gaussian" or "linear".

Returns

result
dict
Output of torchmetrics.detection.MeanAveragePrecision.compute(). Contains keys such as "map", "map_50", "map_75", "map_small", "map_medium", "map_large".

build_calibrate_dataset

Populates three mutable lists — image_embeddings, confidences, and ious — by running a single expert model over a dataset split. Image features are extracted with a frozen ResNet-18 backbone. IoU values are computed against ground-truth boxes for in-domain data and set to 0 for out-of-domain data.
from mocae import build_calibrate_dataset

image_embeddings, confidences, ious = [], [], []

build_calibrate_dataset(
    config=config,                # dict with keys "model", "image_processor", "dataset_name"
    image_embeddings=image_embeddings,
    confidences=confidences,
    ious=ious,
    dataset_name="CSAW",
    batch_size=8,
    split="train",
)
This function mutates the lists passed in rather than returning new ones. To build a combined calibration corpus across all datasets, call it in a loop — or use combine_datasets, which does exactly that.

Parameters

config
dict
required
Expert configuration dict with keys "model" (the expert nn.Module), "image_processor" (AutoImageProcessor), and "dataset_name" (string). The "dataset_name" key is compared with dataset_name to decide whether IoU labels are real (in-domain) or zeroed (out-of-domain).
image_embeddings
list
required
Mutable list to append 512-dimensional ResNet-18 feature vectors (as NumPy arrays) to, one per predicted cancer box.
confidences
list
required
Mutable list to append scalar confidence scores (float) to, one per predicted cancer box.
ious
list
required
Mutable list to append IoU labels (float) to. For in-domain predictions, values are the max IoU against ground-truth boxes; for out-of-domain predictions, values are 0.
dataset_name
string
required
Name of the dataset split to load (e.g. "CSAW"). Used to construct a BreastCancerDataset.
batch_size
int
default:"8"
Batch size for the internal DataLoader.
split
string
default:"train"
Dataset split to iterate over. Typically "train" or "val".
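The in-domain/out-of-domain labelling rule can be sketched as follows. `max_iou_labels` is a hypothetical helper written for illustration; it is not part of mocae.py.

```python
import torch

# Illustrative sketch of the IoU labelling rule: in-domain predictions get
# their max IoU against ground truth, out-of-domain predictions get 0.
# `max_iou_labels` is a hypothetical helper, not part of mocae.py.
def max_iou_labels(pred_boxes, gt_boxes, in_domain):
    if not in_domain or gt_boxes.numel() == 0:
        return torch.zeros(pred_boxes.shape[0])
    x1 = torch.maximum(pred_boxes[:, None, 0], gt_boxes[None, :, 0])
    y1 = torch.maximum(pred_boxes[:, None, 1], gt_boxes[None, :, 1])
    x2 = torch.minimum(pred_boxes[:, None, 2], gt_boxes[None, :, 2])
    y2 = torch.minimum(pred_boxes[:, None, 3], gt_boxes[None, :, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area_p = (pred_boxes[:, 2] - pred_boxes[:, 0]) * (pred_boxes[:, 3] - pred_boxes[:, 1])
    area_g = (gt_boxes[:, 2] - gt_boxes[:, 0]) * (gt_boxes[:, 3] - gt_boxes[:, 1])
    iou = inter / (area_p[:, None] + area_g[None, :] - inter)
    # max IoU of each prediction against all ground-truth boxes
    return iou.max(dim=1).values

pred = torch.tensor([[0., 0., 10., 10.], [20., 20., 30., 30.]])
gt = torch.tensor([[0., 0., 10., 10.]])
in_domain_labels = max_iou_labels(pred, gt, in_domain=True)
out_domain_labels = max_iou_labels(pred, gt, in_domain=False)
```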

combine_datasets

Aggregates calibration data across all datasets for a single expert by calling build_calibrate_dataset once per dataset name, then concatenating the embeddings with the confidence scores to form the final feature matrix.
from mocae import combine_datasets

inputs, ious = combine_datasets(
    config=config,
    dataset_names=["CSAW", "DMID", "DDSM"],
    split="val",
)
# inputs.shape → (num_samples, 513)
# len(ious)    → num_samples

Parameters

config
dict
required
Expert configuration dict forwarded to build_calibrate_dataset. See the config parameter of that function for required keys.
dataset_names
list[str]
required
Ordered list of dataset names to iterate over (e.g. ["CSAW", "DMID", "DDSM"]).
split
string
default:"train"
Dataset split to use when collecting calibration data.

Returns

inputs
numpy.ndarray
Feature matrix of shape (num_samples, 513) where the first 512 columns are ResNet-18 image embeddings and the last column is the raw confidence score. Used as X when fitting a RandomForestRegressor.
ious
list[float]
Target IoU values of length num_samples. Used as y when fitting the calibrator.
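As a concrete sketch of the fitting step, the snippet below uses synthetic arrays standing in for combine_datasets output; the shapes and the [embedding, confidence] feature layout follow the description above, while the RandomForestRegressor hyperparameters are arbitrary.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-ins for combine_datasets output: 32 detections, each with
# a 512-d embedding, a raw confidence, and an IoU target.
rng = np.random.default_rng(0)
embeddings = rng.random((32, 512))            # stand-in for ResNet-18 features
confidences = rng.random(32)                  # stand-in for raw detection scores
features = np.hstack([embeddings, confidences[:, None]])   # shape (32, 513)
iou_targets = rng.random(32)                  # stand-in for IoU labels

# Fit the calibrator: X is the (N, 513) feature matrix, y the IoU targets
calibrator = RandomForestRegressor(n_estimators=10, random_state=0)
calibrator.fit(features, iou_targets)
calibrated = calibrator.predict(features)     # calibrated IoU-scores, one per box
```

The fitted calibrator is then one element of the calibrators list passed to combine_predictions.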
