pipeline.config is the TensorFlow Object Detection API training configuration file used to train the DetectorPlacas model. It controls every aspect of training — from the model architecture and anchor generation strategy to the optimizer schedule, data augmentation, and evaluation protocol. All values documented here are extracted directly from the project’s pipeline.config file.
The top-level model block configures the SSD detector with MobileNet v1 as its backbone.
model {
  ssd {
    num_classes: 1
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    feature_extractor {
      type: "ssd_mobilenet_v1"
      depth_multiplier: 1.0
      min_depth: 16
      conv_hyperparams {
        regularizer {
          l2_regularizer {
            weight: 4e-05
          }
        }
        initializer {
          truncated_normal_initializer {
            mean: 0.0
            stddev: 0.03
          }
        }
        activation: RELU_6
        batch_norm {
          decay: 0.9997
          center: true
          scale: true
          epsilon: 0.001
          train: true
        }
      }
    }
| Field | Value | Meaning |
| --- | --- | --- |
| num_classes | 1 | The model detects a single class: license plate |
| fixed_shape_resizer | 300 × 300 | Every input image is resized to exactly 300 × 300 pixels before the feature extractor runs |
| type | "ssd_mobilenet_v1" | MobileNet v1 backbone built from lightweight depthwise-separable convolutions |
| depth_multiplier | 1.0 | Full-width MobileNet; reduce below 1.0 to shrink the network at the cost of accuracy |
| min_depth | 16 | Minimum number of filters in any conv layer |
| l2_regularizer.weight | 4e-05 | L2 weight decay applied to all conv layers to limit overfitting |
| truncated_normal_initializer | mean=0.0, stddev=0.03 | Weights are initialized from a truncated normal distribution |
| activation | RELU_6 | Clamps activations to [0, 6], which suits fixed-point quantization |
| batch_norm.decay | 0.9997 | Moving-average decay for batch normalization statistics |
| batch_norm.epsilon | 0.001 | Numerical stability constant added to the variance denominator |
The matcher determines how ground-truth boxes are assigned to anchor boxes during training. The box coder encodes bounding-box coordinates into the regression targets the model learns to predict.
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
Argmax matcher
| Field | Value | Meaning |
| --- | --- | --- |
| matched_threshold | 0.5 | Anchors with IoU ≥ 0.5 against a ground-truth box are treated as positives |
| unmatched_threshold | 0.5 | Anchors with IoU < 0.5 are treated as negatives |
| ignore_thresholds | false | The matcher is built with the two thresholds above rather than threshold-free. Because both thresholds are 0.5, there is no intermediate "ignore" band between them |
| negatives_lower_than_unmatched | true | Anchors below the unmatched threshold are the negatives; anchors between the two thresholds, if any, would be ignored |
| force_match_for_each_row | true | Guarantees every ground-truth box is matched to at least one anchor |
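The matching rule in the table can be sketched in plain Python. This is a minimal illustration, assuming a precomputed IoU matrix; the function name `argmax_match` and the -1 / -2 return codes are choices made for this example and are not the Object Detection API's own interface.

```python
def argmax_match(iou, matched_threshold=0.5, unmatched_threshold=0.5,
                 force_match_for_each_row=True):
    """Assign each anchor to a ground-truth box (illustrative sketch).

    iou[g][a] is the IoU between ground-truth box g and anchor a.
    Returns one entry per anchor: the index of the matched ground-truth
    box, -1 for a negative anchor, or -2 for an ignored anchor.
    """
    num_gt = len(iou)
    num_anchors = len(iou[0]) if num_gt else 0
    matches = []
    for a in range(num_anchors):
        # Pick the ground-truth box that overlaps this anchor the most.
        best_gt = max(range(num_gt), key=lambda g: iou[g][a])
        best_iou = iou[best_gt][a]
        if best_iou >= matched_threshold:
            matches.append(best_gt)   # positive
        elif best_iou < unmatched_threshold:
            matches.append(-1)        # negative
        else:
            matches.append(-2)        # ignored band (empty when thresholds are equal)
    if force_match_for_each_row:
        # Every ground-truth box claims its best anchor, even below threshold.
        for g in range(num_gt):
            best_anchor = max(range(num_anchors), key=lambda a: iou[g][a])
            matches[best_anchor] = g
    return matches


# Two ground-truth boxes, three anchors.
example_iou = [[0.7, 0.3, 0.1],
               [0.2, 0.4, 0.05]]
print(argmax_match(example_iou))
```

In the example, the second ground-truth box never reaches IoU 0.5 with any anchor, so it is only matched because force_match_for_each_row is enabled.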
Faster RCNN box coder
The coder scales the center-offset and log-size regression targets by the scale factors before computing the loss. Larger scale values amplify small errors and encourage the network to be more precise in that dimension.
| Regression target | Scale |
| --- | --- |
| y-center | 10.0 |
| x-center | 10.0 |
| height (log) | 5.0 |
| width (log) | 5.0 |
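As a rough illustration of how those scale factors enter the regression targets, the sketch below encodes one box against one anchor using the usual Faster R-CNN parameterization. The helper name `encode_box` and the `(ymin, xmin, ymax, xmax)` tuple layout are assumptions made for this example.

```python
import math


def encode_box(box, anchor, scales=(10.0, 10.0, 5.0, 5.0)):
    """Encode a box relative to an anchor (illustrative sketch).

    box and anchor are (ymin, xmin, ymax, xmax) tuples.
    scales is (y_scale, x_scale, height_scale, width_scale).
    Returns the targets (ty, tx, th, tw).
    """
    def center_size(b):
        ymin, xmin, ymax, xmax = b
        h, w = ymax - ymin, xmax - xmin
        return ymin + h / 2.0, xmin + w / 2.0, h, w

    ycenter, xcenter, h, w = center_size(box)
    ycenter_a, xcenter_a, ha, wa = center_size(anchor)
    y_scale, x_scale, height_scale, width_scale = scales

    ty = y_scale * (ycenter - ycenter_a) / ha   # scaled center offset, y
    tx = x_scale * (xcenter - xcenter_a) / wa   # scaled center offset, x
    th = height_scale * math.log(h / ha)        # scaled log height ratio
    tw = width_scale * math.log(w / wa)         # scaled log width ratio
    return ty, tx, th, tw


anchor = (0.0, 0.0, 1.0, 1.0)
print(encode_box((0.0, 0.0, 2.0, 2.0), anchor))
```

A box identical to its anchor encodes to all zeros, so the network only has to predict deviations from the anchor.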
The convolutional box predictor is applied on top of each SSD feature map layer to produce raw box offsets and class scores.
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 1
        box_code_size: 4
        apply_sigmoid_to_scores: false
      }
    }
| Field | Value | Meaning |
| --- | --- | --- |
| min_depth | 0 | No minimum filter depth enforced on predictor layers |
| max_depth | 0 | No maximum filter depth limit; the feature map depth is used directly |
| num_layers_before_predictor | 0 | No extra conv layers inserted before the predictor head |
| use_dropout | false | Dropout is disabled; dropout_keep_probability: 0.8 is defined but not applied |
| kernel_size | 1 | 1×1 convolution for the predictor head |
| box_code_size | 4 | Four values per box: [y_offset, x_offset, height_log, width_log] |
| apply_sigmoid_to_scores | false | Raw logits are returned; sigmoid is applied in the post-processing step instead |
The SSD anchor generator creates a set of default bounding boxes (anchors) at multiple scales and aspect ratios across six feature map layers. These anchors are the candidates the model refines during both training and inference.
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }
| Field | Value | Meaning |
| --- | --- | --- |
| num_layers | 6 | Anchors are generated from 6 different feature map resolutions |
| min_scale | 0.2 | Smallest anchor size as a fraction of the input image (300 × 0.2 = 60 px) |
| max_scale | 0.95 | Largest anchor size as a fraction of the input image (300 × 0.95 = 285 px) |
| aspect_ratios | 1.0, 2.0, 0.5, 3.0, 0.3333 | Five aspect ratios per anchor location; covers square, wide, and tall plates |
The scale for layer k is interpolated linearly between min_scale and max_scale, giving the model sensitivity to license plates at a wide range of distances from the camera.
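The linear interpolation can be worked through in a few lines. This is a sketch under the assumption that layer k (counting from 0) gets scale min_scale + (max_scale - min_scale) · k / (num_layers - 1) and that an anchor with aspect ratio r has width scale · √r and height scale / √r; the function names are hypothetical, and the real generator may apply further per-layer details that are not modeled here.

```python
import math


def ssd_anchor_scales(num_layers=6, min_scale=0.2, max_scale=0.95):
    """Linearly interpolated anchor scale for each feature map layer."""
    step = (max_scale - min_scale) / (num_layers - 1)
    return [min_scale + step * k for k in range(num_layers)]


def anchor_shape(scale, aspect_ratio):
    """Normalized (width, height) of an anchor; aspect_ratio = width / height."""
    root = math.sqrt(aspect_ratio)
    return scale * root, scale / root


for k, scale in enumerate(ssd_anchor_scales()):
    # Anchor side length in pixels for a 300 x 300 input and aspect ratio 1.0.
    print(k, round(scale, 2), round(scale * 300), "px")
```

With the values from this config, the six layers get scales 0.2, 0.35, 0.5, 0.65, 0.8, and 0.95, which correspond to 60, 105, 150, 195, 240, and 285 pixels on a 300 × 300 input.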
After the model produces raw scores and box predictions, the post-processing step filters and deduplicates the results before they are returned as detections.
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-08
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
| Field | Value | Meaning |
| --- | --- | --- |
| score_threshold | 1e-08 | Effectively zero; almost no box is discarded by score alone before NMS |
| iou_threshold | 0.6 | Boxes whose IoU with an already-accepted detection exceeds 0.6 are suppressed |
| max_detections_per_class | 100 | Upper bound on returned detections per class |
| max_total_detections | 100 | Upper bound on total returned detections across all classes |
| score_converter | SIGMOID | Raw logit scores are converted to [0, 1] probabilities via the sigmoid function |
The per-script min_score_thresh values (0.65 for images, 0.55 for video, 0.50 for webcam) are applied after this post-processing step at visualization time. The NMS score_threshold here is intentionally very permissive so that thresholding is deferred to the calling script.
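The two-stage filtering described above (permissive NMS first, stricter per-script threshold afterwards) can be sketched for the single-class case as follows. This is a simplified greedy NMS written for illustration; the function names and the tuple-based box format are assumptions, not the API's batch implementation.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def iou(a, b):
    """IoU of two (ymin, xmin, ymax, xmax) boxes."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def postprocess(boxes, logits, score_threshold=1e-8, iou_threshold=0.6,
                max_detections=100):
    """Convert logits to scores, then run greedy NMS. Returns (score, box) pairs."""
    scored = [(sigmoid(l), b) for l, b in zip(logits, boxes)]
    candidates = sorted((sb for sb in scored if sb[0] > score_threshold),
                        key=lambda sb: sb[0], reverse=True)
    kept = []
    for score, box in candidates:
        # Keep a box only if it does not overlap a higher-scoring kept box too much.
        if all(iou(box, kept_box) <= iou_threshold for _, kept_box in kept):
            kept.append((score, box))
            if len(kept) == max_detections:
                break
    return kept


boxes = [(0, 0, 10, 10), (0, 0, 10, 9), (20, 20, 30, 30)]
logits = [2.0, 1.0, 0.0]
detections = postprocess(boxes, logits)
# Thresholding applied later by the calling script (0.65 is the image-script value).
visible = [d for d in detections if d[0] >= 0.65]
print(len(detections), len(visible))
```

The second box overlaps the first with IoU 0.9 and is suppressed; the third survives NMS with score 0.5 but falls below the 0.65 visualization threshold.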
The loss block defines the objective functions for both localization (box regression) and classification, plus hard example mining to focus training on difficult samples.
    loss {
      localization_loss {
        weighted_smooth_l1 {}
      }
      classification_loss {
        weighted_sigmoid {}
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.5
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
Loss functions
| Loss / field | Type / value | Notes |
| --- | --- | --- |
| Localization | weighted_smooth_l1 | Smooth L1 (Huber) loss is less sensitive to outlier box predictions than L2 and, unlike plain L1, is smooth around zero |
| Classification | weighted_sigmoid | Binary sigmoid cross-entropy, appropriate for a single-class detector |
| classification_weight | 1.0 | Classification and localization losses are weighted equally |
| localization_weight | 1.0 | |
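For reference, the two per-element loss shapes can be written out directly. This is a minimal sketch that assumes a Huber transition point of 1.0 and omits the per-anchor weighting; the function names are chosen for this example.

```python
import math


def smooth_l1(diff, delta=1.0):
    """Huber / smooth L1: quadratic near zero, linear for large errors."""
    a = abs(diff)
    if a < delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


def sigmoid_cross_entropy(logit, label):
    """Binary cross-entropy on a raw logit, in a numerically stable form."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def total_loss(loc_loss, cls_loss, localization_weight=1.0,
               classification_weight=1.0):
    """Weighted sum of the two terms, mirroring the weights in the config."""
    return localization_weight * loc_loss + classification_weight * cls_loss


print(smooth_l1(0.5), smooth_l1(3.0), sigmoid_cross_entropy(0.0, 1.0))
```

A regression error of 3.0 costs 2.5 under smooth L1 versus 4.5 under 0.5 · error², which is why large outliers pull less on the gradient.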
Hard example miner
Online Hard Example Mining (OHEM) selects the most informative negative anchors for each training batch, preventing easy negatives from dominating the gradient.
| Field | Value | Meaning |
| --- | --- | --- |
| num_hard_examples | 3000 | Consider up to 3000 anchors when selecting hard examples |
| iou_threshold | 0.5 | IoU threshold for the NMS step inside the miner: a candidate that overlaps an already selected hard example by more than 0.5 is discarded as a near-duplicate |
| loss_type | CLASSIFICATION | Negatives are ranked by classification loss, not localization |
| max_negatives_per_positive | 3 | At most 3 negative anchors are selected for every positive anchor in an image (3:1 ratio) |
| min_negatives_per_image | 0 | No forced minimum; an image with no positives contributes no negatives |
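The 3:1 negative-to-positive budget can be illustrated with a simplified per-image selection. This sketch ranks negatives by classification loss and keeps the top ones; it deliberately omits the miner's NMS step and the num_hard_examples cap, and the function name is hypothetical.

```python
def select_hard_negatives(cls_losses, is_positive,
                          max_negatives_per_positive=3,
                          min_negatives_per_image=0):
    """Return indices of the negative anchors kept for one image.

    cls_losses[i] is the classification loss of anchor i;
    is_positive[i] says whether anchor i was matched to a ground-truth box.
    """
    num_positives = sum(1 for p in is_positive if p)
    budget = max(min_negatives_per_image,
                 max_negatives_per_positive * num_positives)
    negatives = [i for i, p in enumerate(is_positive) if not p]
    # Hardest negatives first: highest classification loss.
    negatives.sort(key=lambda i: cls_losses[i], reverse=True)
    return sorted(negatives[:budget])


losses = [0.1, 0.9, 0.5, 0.3, 0.8, 0.7]
positives = [True, False, False, False, False, False]
print(select_hard_negatives(losses, positives))
```

With one positive anchor the budget is three negatives, so only the three highest-loss negatives are kept. With no positives and min_negatives_per_image set to 0, the result is empty.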
The train_config block controls the training loop: batch size, data augmentation, optimizer, learning rate schedule, and the checkpoint to fine-tune from.
train_config {
  batch_size: 16
  data_augmentation_options {
    random_horizontal_flip {}
  }
  data_augmentation_options {
    ssd_random_crop {}
  }
  optimizer {
    adam_optimizer {
      learning_rate {
        manual_step_learning_rate {
          initial_learning_rate: 0.0002
          schedule {
            step: 4500
            learning_rate: 0.0001
          }
          schedule {
            step: 7000
            learning_rate: 8e-05
          }
          schedule {
            step: 10000
            learning_rate: 4e-05
          }
        }
      }
    }
    use_moving_average: false
  }
  fine_tune_checkpoint: "inference_graph/model.ckpt"
  from_detection_checkpoint: true
  num_steps: 200000
  load_all_detection_checkpoint_vars: true
}
Batch and augmentation
| Field | Value | Meaning |
| --- | --- | --- |
| batch_size | 16 | 16 images per gradient update |
| random_horizontal_flip | {} | Randomly mirrors images horizontally to improve left/right invariance |
| ssd_random_crop | {} | Randomly crops the image using SSD-specific constraints that preserve object coverage |
Adam optimizer and learning rate schedule
Training uses Adam with a four-stage, manually stepped learning rate decay. The rate halves at step 4500, drops by a further 20 % at step 7000, and halves again at step 10000, which stabilizes training as it converges.
| Step range | Learning rate |
| --- | --- |
| 0 → 4499 | 0.0002 (initial) |
| 4500 → 6999 | 0.0001 |
| 7000 → 9999 | 8e-05 |
| 10000 → 200000 | 4e-05 |
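The schedule is a simple step lookup, sketched below with the boundaries and rates taken from the config. The function name `learning_rate_at` is hypothetical; the sketch assumes a new rate takes effect at the boundary step itself, consistent with the step ranges listed above.

```python
def learning_rate_at(step, initial=0.0002,
                     schedule=((4500, 0.0001), (7000, 8e-05), (10000, 4e-05))):
    """Return the learning rate in effect at a given training step."""
    rate = initial
    for boundary, value in schedule:
        if step >= boundary:
            rate = value  # later boundaries override earlier ones
    return rate


for step in (0, 4499, 4500, 7000, 10000, 199999):
    print(step, learning_rate_at(step))
```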
Checkpoint and fine-tuning
| Field | Value | Meaning |
| --- | --- | --- |
| fine_tune_checkpoint | "inference_graph/model.ckpt" | Starting weights: the model is fine-tuned from an existing detection checkpoint |
| from_detection_checkpoint | true | Declares that the checkpoint comes from an object detection model rather than an image classification model |
| num_steps | 200000 | Total number of training steps |
| load_all_detection_checkpoint_vars | true | Restores every variable from the checkpoint, including box predictor heads, rather than the feature extractor only |
| use_moving_average | false | The optimizer does not maintain moving averages of the weights during training |
The eval_config block defines the evaluation protocol run after training to measure model quality.
eval_config {
  num_examples: 1100
  metrics_set: "coco_detection_metrics"
  use_moving_averages: false
}
| Field | Value | Meaning |
| --- | --- | --- |
| num_examples | 1100 | Number of examples processed per evaluation run, set here to the size of the evaluation split |
| metrics_set | "coco_detection_metrics" | Reports COCO-style Average Precision (AP) at IoU thresholds 0.50:0.95 |
| use_moving_averages | false | Evaluates the raw checkpoint weights rather than exponential moving-average weights |
COCO AP provides a single summary metric (mAP@[0.50:0.95]) as well as breakdowns by object size (small, medium, large), which is particularly useful for license plates that can appear at varying distances.
The train_input_reader and eval_input_reader sections in pipeline.config contain absolute Windows paths — for example C:/Users/OKTOPUZ-SLID/Desktop/detector-placas/data/labelmap.pbtxt and C:/Users/OKTOPUZ-SLID/Desktop/detector-placas/data/train.record. You must update these paths to match your local file system before retraining the model on a different machine.
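One way to update the paths is a plain text replacement of the old project root. The sketch below is only an illustration: the helper name `rebase_paths`, the new root `/home/me/detector-placas`, and the two-line sample fragment are all hypothetical and do not reproduce the project's actual input reader sections.

```python
def rebase_paths(config_text, old_root, new_root):
    """Replace every occurrence of old_root with new_root in the config text."""
    return config_text.replace(old_root, new_root)


OLD_ROOT = "C:/Users/OKTOPUZ-SLID/Desktop/detector-placas"
NEW_ROOT = "/home/me/detector-placas"  # hypothetical location on the new machine

# Illustrative fragment only; the real file contains full input reader blocks.
sample = (
    'label_map_path: "C:/Users/OKTOPUZ-SLID/Desktop/detector-placas/data/labelmap.pbtxt"\n'
    'input_path: "C:/Users/OKTOPUZ-SLID/Desktop/detector-placas/data/train.record"\n'
)
updated = rebase_paths(sample, OLD_ROOT, NEW_ROOT)
print(updated)
```

To apply this to the real file, read pipeline.config as text, pass it through the helper, and write the result back, keeping a backup of the original.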