DetectorPlacas pipeline.config Reference: SSD MobileNet v1

pipeline.config is the TensorFlow Object Detection API training configuration file used to train the DetectorPlacas model. It controls every aspect of training — from the model architecture and anchor generation strategy to the optimizer schedule, data augmentation, and evaluation protocol. All values documented here are extracted directly from the project’s pipeline.config file.

Model (SSD)

The top-level model block configures the SSD detector with MobileNet v1 as its backbone.

model {
  ssd {
    num_classes: 1
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    feature_extractor {
      type: "ssd_mobilenet_v1"
      depth_multiplier: 1.0
      min_depth: 16
      conv_hyperparams {
        regularizer {
          l2_regularizer {
            weight: 4e-05
          }
        }
        initializer {
          truncated_normal_initializer {
            mean: 0.0
            stddev: 0.03
          }
        }
        activation: RELU_6
        batch_norm {
          decay: 0.9997
          center: true
          scale: true
          epsilon: 0.001
          train: true
        }
      }
    }

Field	Value	Meaning
`num_classes`	`1`	The model detects a single class: license plate
`fixed_shape_resizer`	`300 × 300`	Every input image is resized to exactly 300 × 300 pixels before the feature extractor runs
`type`	`"ssd_mobilenet_v1"`	MobileNet v1 backbone — lightweight depthwise-separable convolutions
`depth_multiplier`	`1.0`	Full-width MobileNet; reduce below 1.0 to shrink the network at the cost of accuracy
`min_depth`	`16`	Minimum number of filters in any conv layer
`l2_regularizer.weight`	`4e-05`	L2 weight decay applied to all conv layers to limit overfitting
`truncated_normal_initializer`	mean=0.0, stddev=0.03	Weights are initialized from a truncated normal distribution
`activation`	`RELU_6`	Clamps activations to [0, 6] — well-suited for fixed-point quantization
`batch_norm.decay`	`0.9997`	Moving-average decay for batch normalization statistics
`batch_norm.epsilon`	`0.001`	Numerical stability constant added to the variance denominator

Matcher and Box Coder

The matcher determines how ground-truth boxes are assigned to anchor boxes during training. The box coder encodes bounding-box coordinates into the regression targets the model learns to predict.

    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }

Argmax matcher

Field	Value	Meaning
`matched_threshold`	`0.5`	Anchors with IoU ≥ 0.5 against a ground-truth box are treated as positives
`unmatched_threshold`	`0.5`	Anchors with IoU < 0.5 are treated as negatives
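`ignore_thresholds`	`false`	The matcher is built with the thresholds above rather than without them; because both thresholds are 0.5, no anchor falls into an intermediate “ignore” band
`negatives_lower_than_unmatched`	`true`	Anchors whose best IoU is below the unmatched threshold are the negatives; any anchor between the two thresholds would be ignored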
`force_match_for_each_row`	`true`	Guarantees every ground-truth box is matched to at least one anchor
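
The matching rules in this table can be sketched in a few lines of NumPy. This is a simplified illustration, not the Object Detection API's own implementation: the function name `argmax_match` and the return codes (`-1` for negative, `-2` for ignored) are assumptions made for this example, and it expects at least one ground-truth box.

```python
import numpy as np

def argmax_match(iou, matched_threshold=0.5, unmatched_threshold=0.5,
                 force_match_for_each_row=True):
    """Assign each anchor to a ground-truth box.

    iou: array of shape [num_ground_truth, num_anchors].
    Returns one entry per anchor: a ground-truth index (>= 0),
    -1 for a negative anchor, or -2 for an ignored anchor.
    """
    iou = np.asarray(iou, dtype=float)
    best_gt = iou.argmax(axis=0)      # best ground-truth box for each anchor
    best_iou = iou.max(axis=0)

    matches = best_gt.copy()
    # Below the unmatched threshold: negative.
    matches[best_iou < unmatched_threshold] = -1
    # Between the two thresholds: ignored (empty band when both are 0.5).
    in_band = (best_iou >= unmatched_threshold) & (best_iou < matched_threshold)
    matches[in_band] = -2

    if force_match_for_each_row:
        # Every ground-truth box claims its highest-IoU anchor,
        # even if that IoU is below matched_threshold.
        for gt_idx, anchor_idx in enumerate(iou.argmax(axis=1)):
            matches[anchor_idx] = gt_idx
    return matches

# Two ground-truth boxes, three anchors.
iou = [[0.7, 0.3, 0.10],
       [0.2, 0.4, 0.05]]
print(argmax_match(iou))  # anchor 1 is forced onto ground-truth box 1
```

With `force_match_for_each_row` disabled, anchor 1 would be a negative because its best IoU (0.4) is below 0.5, and the second ground-truth box would receive no anchor at all.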

Faster RCNN box coder

The coder scales the center-offset and log-size regression targets by the scale factors before computing the loss. Larger scale values amplify small errors and encourage the network to be more precise in that dimension.

Axis	Scale
y-center	`10.0`
x-center	`10.0`
height (log)	`5.0`
width (log)	`5.0`
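
As a rough illustration of how these scale factors enter the regression targets, the sketch below encodes one ground-truth box against one anchor. It assumes boxes are given as `(ymin, xmin, ymax, xmax)` and that targets are ordered `(ty, tx, th, tw)`; the function name `encode_box` is hypothetical and this is not the library's own code.

```python
import math

def encode_box(gt, anchor, y_scale=10.0, x_scale=10.0,
               height_scale=5.0, width_scale=5.0):
    """Encode a ground-truth box relative to an anchor.

    Both boxes are (ymin, xmin, ymax, xmax). Returns (ty, tx, th, tw).
    """
    def to_center(box):
        ymin, xmin, ymax, xmax = box
        h, w = ymax - ymin, xmax - xmin
        return ymin + h / 2.0, xmin + w / 2.0, h, w

    ycenter, xcenter, h, w = to_center(gt)
    ycenter_a, xcenter_a, ha, wa = to_center(anchor)

    # Center offsets are normalized by the anchor size, then scaled.
    ty = y_scale * (ycenter - ycenter_a) / ha
    tx = x_scale * (xcenter - xcenter_a) / wa
    # Sizes are encoded as scaled log-ratios.
    th = height_scale * math.log(h / ha)
    tw = width_scale * math.log(w / wa)
    return ty, tx, th, tw

# A box identical to its anchor encodes to all zeros.
print(encode_box((0.0, 0.0, 1.0, 1.0), (0.0, 0.0, 1.0, 1.0)))
```

A center offset of one tenth of the anchor height becomes a target of about 1.0 with `y_scale: 10.0`, which is why the larger center scales weight position errors more heavily than size errors.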

Box Predictor

The convolutional box predictor is applied on top of each SSD feature map layer to produce raw box offsets and class scores.

    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 1
        box_code_size: 4
        apply_sigmoid_to_scores: false
      }
    }

Field	Value	Meaning
`min_depth`	`0`	No minimum filter depth enforced on predictor layers
`max_depth`	`0`	No maximum filter depth limit — uses the feature map depth directly
`num_layers_before_predictor`	`0`	No extra conv layers inserted before the predictor head
`use_dropout`	`false`	Dropout is disabled; `dropout_keep_probability: 0.8` is defined but not applied
`kernel_size`	`1`	1×1 convolution for the predictor head
`box_code_size`	`4`	Four values per box: `[y_offset, x_offset, height_log, width_log]`
`apply_sigmoid_to_scores`	`false`	Raw logits are returned; sigmoid is applied in the post-processing step instead

Anchor Generator

The SSD anchor generator creates a set of default bounding boxes (anchors) at multiple scales and aspect ratios across six feature map layers. These anchors are the candidates the model refines during both training and inference.

    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }

Field	Value	Meaning
`num_layers`	`6`	Anchors are generated from 6 different feature map resolutions
`min_scale`	`0.2`	Smallest anchor size as a fraction of the input image (300 × 0.2 = 60 px)
`max_scale`	`0.95`	Largest anchor size as a fraction of the input image (300 × 0.95 = 285 px)
Aspect ratios	`1.0, 2.0, 0.5, 3.0, 0.3333`	Five aspect ratios per anchor location; covers square, wide, and tall plates

The scale for layer k is interpolated linearly between min_scale and max_scale, giving the model sensitivity to license plates at a wide range of distances from the camera.
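
The linear interpolation can be written out directly. The sketch below assumes evenly spaced scales across the six layers; the helper name `ssd_anchor_scales` is made up for this example, and the real generator may apply further per-layer adjustments that are not modeled here.

```python
def ssd_anchor_scales(num_layers=6, min_scale=0.2, max_scale=0.95):
    """Return one anchor scale per feature map layer, evenly spaced."""
    step = (max_scale - min_scale) / (num_layers - 1)
    return [min_scale + step * k for k in range(num_layers)]

scales = ssd_anchor_scales()
print([round(s, 4) for s in scales])   # [0.2, 0.35, 0.5, 0.65, 0.8, 0.95]
print([round(300 * s) for s in scales])  # [60, 105, 150, 195, 240, 285]
```

The second line converts each scale to pixels for the 300 × 300 input configured in `fixed_shape_resizer`.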

Post-Processing

After the model produces raw scores and box predictions, the post-processing step filters and deduplicates the results before they are returned as detections.

    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-08
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }

Field	Value	Meaning
`score_threshold`	`1e-08`	Effectively zero — almost no box is discarded by score alone before NMS
`iou_threshold`	`0.6`	Boxes that overlap an already-accepted detection by more than 60 % IoU are suppressed
`max_detections_per_class`	`100`	Upper bound on returned detections per class
`max_total_detections`	`100`	Upper bound on total returned detections across all classes
`score_converter`	`SIGMOID`	Raw logit scores are converted to [0, 1] probabilities via the sigmoid function

The per-script min_score_thresh values (0.65 for images, 0.55 for video, 0.50 for webcam) are applied after this post-processing step at visualization time. The NMS score_threshold here is intentionally very permissive so that thresholding is deferred to the calling script.
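
The two-stage filtering described above (permissive NMS in the graph, stricter thresholding in the calling script) can be sketched as follows. This is a simplified single-class NumPy illustration, not the API's `batch_non_max_suppression` op; the function names are hypothetical, boxes are assumed to be `(ymin, xmin, ymax, xmax)` with non-zero area, and 0.65 is the image-script threshold quoted above.

```python
import numpy as np

def iou_one_to_many(box, boxes):
    """IoU between one box and an array of boxes, all (ymin, xmin, ymax, xmax)."""
    ymin = np.maximum(box[0], boxes[:, 0])
    xmin = np.maximum(box[1], boxes[:, 1])
    ymax = np.minimum(box[2], boxes[:, 2])
    xmax = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(ymax - ymin, 0, None) * np.clip(xmax - xmin, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def postprocess(boxes, logits, score_threshold=1e-8, iou_threshold=0.6,
                max_detections=100, min_score_thresh=0.65):
    boxes = np.asarray(boxes, dtype=float)
    # score_converter: SIGMOID turns raw logits into [0, 1] scores.
    scores = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))

    # Greedy NMS: visit boxes from highest to lowest score.
    order = [i for i in np.argsort(-scores) if scores[i] > score_threshold]
    keep = []
    for i in order:
        if len(keep) == max_detections:
            break
        if not keep or np.all(iou_one_to_many(boxes[i], boxes[keep]) <= iou_threshold):
            keep.append(i)

    # Script-level threshold, applied after NMS.
    return [(boxes[i].tolist(), float(scores[i]))
            for i in keep if scores[i] >= min_score_thresh]

boxes = [[0, 0, 1, 1], [0, 0, 1, 0.9], [2, 2, 3, 3]]
logits = [3.0, 2.0, -1.0]
print(postprocess(boxes, logits))
```

In this example the second box is suppressed because it overlaps the first with IoU 0.9, and the third box survives NMS but is dropped by the script-level threshold because its sigmoid score is about 0.27.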

Loss

The loss block defines the objective functions for both localization (box regression) and classification, plus hard example mining to focus training on difficult samples.

    loss {
      localization_loss {
        weighted_smooth_l1 {}
      }
      classification_loss {
        weighted_sigmoid {}
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.5
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }

Loss functions

Loss	Type	Notes
Localization	`weighted_smooth_l1`	Smooth L1 (Huber) loss is less sensitive to outlier box predictions than L2 and, unlike plain L1, is smooth around zero error
Classification	`weighted_sigmoid`	Binary sigmoid cross-entropy — appropriate for a single-class detector
`classification_weight`	`1.0`	Multiplier applied to the classification loss
`localization_weight`	`1.0`	Multiplier applied to the localization loss; with both weights at 1.0 the two terms are weighted equally
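
For reference, the two loss functions and their weighted sum can be sketched per element as below. This is a minimal sketch assuming a Huber transition point of 1.0 and the usual numerically stable form of sigmoid cross-entropy; it ignores per-anchor weights and any normalization the training loop applies, and the function names are illustrative only.

```python
import numpy as np

def smooth_l1(pred, target, delta=1.0):
    """Quadratic for small errors, linear for large ones."""
    diff = np.abs(np.asarray(pred, dtype=float) - np.asarray(target, dtype=float))
    return np.where(diff < delta, 0.5 * diff ** 2, delta * (diff - 0.5 * delta))

def sigmoid_cross_entropy(logits, labels):
    """Numerically stable binary cross-entropy on raw logits."""
    x = np.asarray(logits, dtype=float)
    z = np.asarray(labels, dtype=float)
    return np.maximum(x, 0) - x * z + np.log1p(np.exp(-np.abs(x)))

def total_loss(loc_loss, cls_loss, localization_weight=1.0, classification_weight=1.0):
    """Weighted sum of the summed localization and classification losses."""
    return localization_weight * np.sum(loc_loss) + classification_weight * np.sum(cls_loss)

print(smooth_l1([0.5, 3.0], [0.0, 0.0]))   # [0.125 2.5]
print(sigmoid_cross_entropy([0.0], [1.0]))  # log(2), about 0.693
```

An error of 3.0 costs 2.5 under smooth L1 instead of 4.5 under a pure squared loss, which is the reduced outlier sensitivity noted in the table.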

Hard example miner

Online Hard Example Mining (OHEM) selects the most informative negative anchors for each training batch, preventing easy negatives from dominating the gradient.

Field	Value	Meaning
`num_hard_examples`	`3000`	Maximum number of hard examples kept per image, before the negative-to-positive ratio is enforced
`iou_threshold`	`0.5`	IoU threshold for the non-max suppression the miner runs while selecting hard examples; a candidate that overlaps an already-selected one by more than this is discarded
`loss_type`	`CLASSIFICATION`	Negatives are ranked by classification loss, not localization
`max_negatives_per_positive`	`3`	At most 3 negative anchors are kept for every positive anchor in an image (3:1 ratio)
`min_negatives_per_image`	`0`	No forced minimum; allows images with no positives to contribute zero negatives
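
The ratio constraint can be illustrated with a short sketch for a single image. It covers only the ranking by classification loss and the negative budget; it leaves out the `num_hard_examples` cap and the `iou_threshold` handling, and `select_hard_negatives` is a hypothetical helper, not an API function.

```python
import numpy as np

def select_hard_negatives(cls_loss, is_positive,
                          max_negatives_per_positive=3,
                          min_negatives_per_image=0):
    """Return indices of the negative anchors kept for one image."""
    cls_loss = np.asarray(cls_loss, dtype=float)
    is_positive = np.asarray(is_positive, dtype=bool)

    num_positives = int(is_positive.sum())
    budget = max(min_negatives_per_image,
                 max_negatives_per_positive * num_positives)

    # Rank negatives from highest to lowest classification loss.
    negatives = np.flatnonzero(~is_positive)
    ranked = negatives[np.argsort(-cls_loss[negatives])]
    return sorted(ranked[:budget].tolist())

losses = [0.1, 2.0, 0.5, 0.05, 1.5, 0.9]
positives = [True, False, False, False, False, False]
print(select_hard_negatives(losses, positives))  # [1, 4, 5]
```

With one positive anchor the budget is three negatives, so only the three highest-loss negatives are kept. An image with no positives keeps none, because `min_negatives_per_image` is 0.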

Training Configuration

The train_config block controls the training loop: batch size, data augmentation, optimizer, learning rate schedule, and the checkpoint to fine-tune from.

train_config {
  batch_size: 16
  data_augmentation_options {
    random_horizontal_flip {}
  }
  data_augmentation_options {
    ssd_random_crop {}
  }
  optimizer {
    adam_optimizer {
      learning_rate {
        manual_step_learning_rate {
          initial_learning_rate: 0.0002
          schedule {
            step: 4500
            learning_rate: 0.0001
          }
          schedule {
            step: 7000
            learning_rate: 8e-05
          }
          schedule {
            step: 10000
            learning_rate: 4e-05
          }
        }
      }
    }
    use_moving_average: false
  }
  fine_tune_checkpoint: "inference_graph/model.ckpt"
  from_detection_checkpoint: true
  num_steps: 200000
  load_all_detection_checkpoint_vars: true
}

Batch and augmentation

Field	Value	Meaning
`batch_size`	`16`	16 images per gradient update
`random_horizontal_flip`	—	Randomly mirrors images horizontally to improve left/right invariance
`ssd_random_crop`	—	Randomly crops the image using SSD-specific constraints that preserve object coverage

Adam optimizer and learning rate schedule

The training uses Adam with a 4-stage manually stepped learning rate schedule. The rate halves at step 4500, drops by a further 20 % at step 7000, and halves again at step 10000, which stabilizes training as it converges.

Step range	Learning rate
0 → 4499	`0.0002` (initial)
4500 → 6999	`0.0001`
7000 → 9999	`8e-05`
10000 → 200000	`4e-05`
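
The step table above amounts to a piecewise-constant lookup. A minimal sketch, with the boundaries and rates copied from the config and a hypothetical helper name:

```python
import bisect

BOUNDARIES = [4500, 7000, 10000]          # steps at which the rate changes
RATES = [0.0002, 0.0001, 8e-05, 4e-05]    # rate in effect for each stage

def learning_rate_at(step):
    """Return the learning rate in effect at a given global step."""
    return RATES[bisect.bisect_right(BOUNDARIES, step)]

for step in (0, 4499, 4500, 7000, 10000, 199999):
    print(step, learning_rate_at(step))
```

`bisect_right` counts how many boundaries are less than or equal to the step, so the new rate takes effect exactly at each boundary step.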

Checkpoint and fine-tuning

Field	Value	Meaning
`fine_tune_checkpoint`	`"inference_graph/model.ckpt"`	Starting weights — the model is fine-tuned from an existing detection checkpoint
`from_detection_checkpoint`	`true`	Declares that the checkpoint comes from an object detection model rather than an image classification model
`num_steps`	`200000`	Total number of training steps
`load_all_detection_checkpoint_vars`	`true`	Restores every variable from the checkpoint, including box predictor heads
`use_moving_average`	`false`	Moving-average weights are not applied during training

Evaluation

The eval_config block defines the evaluation protocol run after training to measure model quality.

eval_config {
  num_examples: 1100
  metrics_set: "coco_detection_metrics"
  use_moving_averages: false
}

Field	Value	Meaning
`num_examples`	`1100`	Number of images in the evaluation split
`metrics_set`	`"coco_detection_metrics"`	Reports COCO-style Average Precision (AP) at IoU thresholds 0.50:0.95
`use_moving_averages`	`false`	Evaluates the raw checkpoint weights rather than exponential moving-average weights

COCO AP provides a single summary metric (mAP@[0.50:0.95]) as well as breakdowns by object size (small, medium, large), which is particularly useful for license plates that can appear at varying distances.

The train_input_reader and eval_input_reader sections in pipeline.config contain absolute Windows paths — for example C:/Users/OKTOPUZ-SLID/Desktop/detector-placas/data/labelmap.pbtxt and C:/Users/OKTOPUZ-SLID/Desktop/detector-placas/data/train.record. You must update these paths to match your local file system before retraining the model on a different machine.
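
One way to update the paths is a plain text substitution of the old project root. The sketch below is only an illustration: `rewrite_data_paths` is a hypothetical helper, and the new root `/home/user/detector-placas` is a placeholder you would replace with your own location.

```python
def rewrite_data_paths(config_text, old_root, new_root):
    """Replace every occurrence of old_root in the config text with new_root."""
    return config_text.replace(old_root.rstrip("/"), new_root.rstrip("/"))

old_root = "C:/Users/OKTOPUZ-SLID/Desktop/detector-placas"
new_root = "/home/user/detector-placas"  # placeholder, adjust to your machine

line = 'C:/Users/OKTOPUZ-SLID/Desktop/detector-placas/data/train.record'
print(rewrite_data_paths(line, old_root, new_root))
# /home/user/detector-placas/data/train.record
```

To apply it to the real file, read pipeline.config as text, pass the contents through the function, and write the result back. Check the rewritten paths by hand before starting a training run.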
