

MammoMix uses a single YAML file to control every aspect of training: dataset location, model selection, HuggingFace TrainingArguments, and Weights & Biases logging. Both train.py (YOLOS) and train_detrd.py (Deformable DETR) read the same YAML schema via utils.load_config. The config file is the single source of truth; CLI flags --config, --dataset, and --epoch can override individual values without editing the file.
CLI flags take precedence over YAML values. --dataset overrides dataset.name and --epoch overrides training.epochs. All other fields must be changed in the YAML file directly.
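The precedence rule above can be sketched as follows. This is a minimal illustration of the override logic, not the actual implementation: the helper name `apply_cli_overrides` and the in-line config dict are assumptions; in the real pipeline the dict comes from `utils.load_config`.

```python
import argparse

def apply_cli_overrides(config: dict, args: argparse.Namespace) -> dict:
    """Apply CLI precedence: --dataset and --epoch win over YAML values."""
    if args.dataset is not None:
        config["dataset"]["name"] = args.dataset
    if args.epoch is not None:
        config["training"]["epochs"] = args.epoch
    return config

parser = argparse.ArgumentParser()
parser.add_argument("--config", default="config_yolos.yaml")
parser.add_argument("--dataset", default=None)
parser.add_argument("--epoch", type=int, default=None)

# Simulated YAML contents (normally produced by utils.load_config)
config = {"dataset": {"name": "CSAW"}, "training": {"epochs": 30}}
args = parser.parse_args(["--dataset", "DMID", "--epoch", "50"])
config = apply_cli_overrides(config, args)
print(config["dataset"]["name"], config["training"]["epochs"])  # DMID 50
```

All other fields keep their YAML values, so the file remains the single source of truth for everything the CLI does not expose.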

Full example: config_yolos.yaml

# Training configuration for BreastDet

dataset:
  name: CSAW
  splits_dir: ../dataset
  max_size: 640

model:
  model_name: hustvl/yolos-base

data:
  train_dir: train
  val_dir: val
  test_dir: test
  batch_size: 8
  num_workers: 4
  image_size: 512

training:
  output_dir: ../tmp
  epochs: 30
  batch_size: 8
  learning_rate: 0.0001
  weight_decay: 0.0005
  warmup_ratio: 0.05
  lr_scheduler_type: cosine_with_restarts
  lr_scheduler_kwargs:
    num_cycles: 1
  eval_do_concat_batches: False
  evaluation_strategy: epoch
  save_strategy: epoch
  save_total_limit: 1
  logging_strategy: epoch
  load_best_model_at_end: True
  metric_for_best_model: eval_map_50
  greater_is_better: True
  dataloader_num_workers: 4
  gradient_accumulation_steps: 2
  remove_unused_columns: False

logging:
  use_wandb: True
  wandb_project: MammoMix
  log_interval: 10

seed: 42

wandb:
  wandb_dir: ../wandb

dataset section

Controls which dataset is loaded and how images are pre-processed before entering the model.
dataset.name
string
default:"CSAW"
Dataset identifier. Accepted values are CSAW, DMID, and DDSM. This value is passed as dataset_name to BreastCancerDataset and is also used to name W&B runs and model save directories. Overridden by the --dataset CLI flag.
dataset.splits_dir
string
required
Absolute or relative path to the directory that contains the split .txt files and the image subdirectories. train.py reads {splits_dir}/train, {splits_dir}/val, and {splits_dir}/test from this root.
dataset.max_size
number
default:"640"
Maximum image dimension (height and width) used by AutoImageProcessor when resizing and padding. Images are resized so that their longest side equals max_size, then zero-padded to a square of max_size × max_size. Use 800 for Deformable DETR to match its standard input resolution.
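The resize arithmetic can be illustrated with a small helper (the function name is hypothetical; the actual resizing is performed inside AutoImageProcessor):

```python
def resized_shape(height: int, width: int, max_size: int = 640):
    """Scale so the longest side equals max_size, preserving aspect ratio."""
    scale = max_size / max(height, width)
    return round(height * scale), round(width * scale)

# A 1024x768 mammogram at max_size=640:
h, w = resized_shape(1024, 768)
print(h, w)  # 640 480
# The processor then zero-pads the 640x480 result to a 640x640 square.
```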

model section

model.model_name
string
required
HuggingFace Hub model ID loaded by AutoModelForObjectDetection.from_pretrained. Common values:
| Model | ID |
| --- | --- |
| YOLOS-base | hustvl/yolos-base |
| Deformable DETR | SenseTime/deformable-detr |
| DETR-ResNet-50 | facebook/detr-resnet-50 |
utils.get_model_type uses this string to determine whether the dataset loader should return YOLOS or DETR-style pixel value tensors.
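A plausible sketch of the kind of check utils.get_model_type performs (the exact implementation and return values are assumptions based on the description above):

```python
def get_model_type(model_name: str) -> str:
    """Infer the pipeline family from the HuggingFace model ID (illustrative)."""
    return "yolos" if "yolos" in model_name.lower() else "detr"

print(get_model_type("hustvl/yolos-base"))          # yolos
print(get_model_type("SenseTime/deformable-detr"))  # detr
print(get_model_type("facebook/detr-resnet-50"))    # detr
```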

data section

Controls subdirectory names and DataLoader settings. These values are relative to dataset.splits_dir.
data.train_dir
string
default:"train"
Name of the training split subdirectory under splits_dir.
data.val_dir
string
default:"val"
Name of the validation split subdirectory under splits_dir.
data.test_dir
string
default:"test"
Name of the test split subdirectory under splits_dir. Used only for the final evaluation after training.
data.batch_size
number
default:"8"
DataLoader batch size. Note that training.batch_size is the value actually passed to TrainingArguments; this field is informational and may be used by custom DataLoader construction code.
data.num_workers
number
default:"4"
Number of DataLoader worker processes for the torch.utils.data.DataLoader constructed in train.py. Set to 0 when using Deformable DETR to avoid shared-memory conflicts.
data.image_size
number
default:"512"
Target image size used within the dataset loader prior to processor resizing. Acts as a pre-resize step before max_size is applied.

training section

All keys under training map directly to HuggingFace TrainingArguments parameters.
training.output_dir
string
default:"../tmp"
Directory where HuggingFace Trainer writes intermediate checkpoints. This is separate from the final model save path (../yolos_{DATASET}_{DDMMYY}).
training.epochs
number
default:"30"
Total number of training epochs. Overridden by the --epoch CLI flag.
training.batch_size
number
default:"8"
Per-device training and evaluation batch size passed to TrainingArguments as per_device_train_batch_size and per_device_eval_batch_size.
training.learning_rate
number
default:"0.0001"
Peak learning rate for the optimizer. The Deformable DETR pipeline overrides this in code to 0.0005; set it explicitly in your config if you want a different value.
training.weight_decay
number
default:"0.0005"
L2 regularization coefficient applied to all non-bias parameters.
training.warmup_ratio
number
default:"0.05"
Fraction of total training steps used for linear learning-rate warmup. 0.05 means 5% of all steps warm up from 0 to learning_rate.
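To see how the ratio translates into concrete step counts, here is the arithmetic with the example config's values (the dataset size of 1000 training images is an assumed figure for illustration):

```python
import math

# Values from the example config; num_train_images is assumed for illustration
num_train_images = 1000
batch_size = 8                  # training.batch_size
grad_accum = 2                  # training.gradient_accumulation_steps
epochs = 30                     # training.epochs
warmup_ratio = 0.05             # training.warmup_ratio

steps_per_epoch = math.ceil(num_train_images / (batch_size * grad_accum))
total_steps = steps_per_epoch * epochs
warmup_steps = int(total_steps * warmup_ratio)
print(steps_per_epoch, total_steps, warmup_steps)  # 63 1890 94
```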
training.lr_scheduler_type
string
default:"cosine_with_restarts"
Learning rate schedule. Accepted values are any HuggingFace SchedulerType string, e.g. cosine, cosine_with_restarts, linear, constant.
training.lr_scheduler_kwargs
object
Extra keyword arguments forwarded to the scheduler factory. For cosine_with_restarts, set num_cycles to control how many cosine cycles run over the training duration.
lr_scheduler_kwargs:
  num_cycles: 1
training.eval_do_concat_batches
boolean
default:"false"
When True, evaluation batches are concatenated before metric computation. Set to False (default) to compute metrics per-batch and average, which is more memory efficient.
training.evaluation_strategy
string
default:"epoch"
When to run evaluation. epoch evaluates after every epoch; steps evaluates every eval_steps steps.
training.save_strategy
string
default:"epoch"
When to save checkpoints. Must match evaluation_strategy when load_best_model_at_end=True.
training.save_total_limit
number
default:"1"
Maximum number of checkpoints to keep on disk. Older checkpoints are deleted automatically. The Deformable DETR config uses 2 to retain the previous-best checkpoint as a safety net.
training.logging_strategy
string
default:"epoch"
When to log training metrics. Use steps together with logging_steps for finer-grained W&B curves (the Deformable DETR pipeline logs every 10 steps).
training.load_best_model_at_end
boolean
default:"true"
When True, the trainer reloads the best checkpoint at the end of training before saving the final model and running test evaluation.
training.metric_for_best_model
string
default:"eval_map_50"
Validation metric used to rank checkpoints. Use eval_map_50 for YOLOS (computed by the custom compute_metrics function). Use eval_loss for Deformable DETR (since compute_metrics is not attached during training).
training.greater_is_better
boolean
default:"true"
Set to True when the best model has the highest metric value (e.g. eval_map_50). Set to False when the best model has the lowest value (e.g. eval_loss).
training.dataloader_num_workers
number
default:"4"
dataloader_num_workers passed to TrainingArguments. Controls the number of worker processes used by Trainer’s internal DataLoader (distinct from data.num_workers).
training.gradient_accumulation_steps
number
default:"2"
Number of forward passes to accumulate gradients over before performing an optimizer step. Effective batch size = batch_size × gradient_accumulation_steps. The Deformable DETR pipeline hardcodes this to 32.
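The effective batch size arithmetic from the paragraph above, using the example config's values:

```python
per_device_batch_size = 8          # training.batch_size
gradient_accumulation_steps = 2    # training.gradient_accumulation_steps

# Gradients accumulate over 2 forward passes before each optimizer step
effective_batch_size = per_device_batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 16
```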
training.remove_unused_columns
boolean
default:"false"
When False, HuggingFace Trainer passes all dataset columns to the model. Must be False for MammoMix because the dataset returns custom keys (pixel_values, labels) that are not automatically recognized.
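To illustrate why this must stay False, here is the rough shape of one dataset item (the values are simplified stand-ins, not the real tensors):

```python
# Illustrative shape of one dataset item; real values are torch tensors
item = {
    "pixel_values": [[0.0] * 4] * 4,  # stands in for a CHW image tensor
    "labels": {"class_labels": [0], "boxes": [[0.5, 0.5, 0.2, 0.3]]},
}

# With remove_unused_columns=True, Trainer would drop any column not named
# in the model's forward signature before collation, breaking the batch.
print(sorted(item.keys()))  # ['labels', 'pixel_values']
```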

logging section

logging.use_wandb
boolean
default:"true"
Enable or disable Weights & Biases integration. When True, report_to is set to include wandb in the training arguments.
logging.wandb_project
string
default:"MammoMix"
W&B project name. All runs for a given config are grouped under this project in the W&B dashboard.
logging.log_interval
number
default:"10"
Step interval intended for logging. Informational only: the actual logging cadence in TrainingArguments is governed by training.logging_strategy (and logging_steps when that strategy is steps).

wandb section

wandb.wandb_dir
string
Local filesystem path where W&B stores run artifacts, offline logs, and sync cache. Passed to TrainingArguments as logging_dir. If omitted, W&B defaults to ./wandb in the working directory.

deformable_detr section

This section is read only by train_detrd.py when loading the model; num_queries is currently the only key consumed.
deformable_detr.num_queries
number
default:"300"
Number of object queries for Deformable DETR. Passed to AutoModelForObjectDetection.from_pretrained during model loading in train_detrd.py.
Other keys under deformable_detr (num_feature_levels, dec_n_points, enc_n_points, with_box_refine, two_stage) are present in config_d_detr.yaml but are not read by train_detrd.py. The training hyperparameters for Deformable DETR are hardcoded in create_deformable_training_args and override the training section values.
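For reference, the portion of config_d_detr.yaml that train_detrd.py actually reads reduces to this fragment:

```yaml
deformable_detr:
  num_queries: 300   # the only key currently consumed by train_detrd.py
```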

Top-level seed

seed
number
default:"42"
Random seed documented in the config file for reference. Note: train.py does not currently read or apply this value — reproducibility in the data split is handled by the fixed random_state=42 in splitting.py.
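If you want a training script to actually honor seed, a minimal approach is to read it from the parsed config and seed the relevant RNGs at startup (this is a sketch, not what train.py currently does; only Python's stdlib RNG is shown, and torch/numpy would be seeded the same way):

```python
import random

def apply_seed(config: dict) -> None:
    """Seed Python's RNG from the config; torch/numpy would be seeded similarly."""
    random.seed(config.get("seed", 42))

apply_seed({"seed": 42})
first = random.random()
apply_seed({"seed": 42})
assert first == random.random()  # identical draws after re-seeding
```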
