Model Architectures
Donkeycar provides several pre-built neural network architectures optimized for different use cases.
Linear Model
The simplest model, using linear activation for continuous steering and throttle outputs.
- 5 convolutional layers with dropout (24, 32, 64, 64, 64 filters)
- Flatten layer
- 2 fully connected layers (100, 50 neurons)
- 2 output neurons with linear activation (steering, throttle)
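The layer list above can be sketched in Keras. The kernel sizes and strides below (5×5 with stride 2 for the first three convs, 3×3 for the last two) follow the common PilotNet-style layout and are an assumption; check them against your Donkeycar version:

```python
from tensorflow.keras import Input, Model, layers

def build_linear(input_shape=(120, 160, 3), drop=0.2):
    """Sketch of the Linear pilot: 5 convs + 2 dense layers + 2 linear outputs."""
    x = inp = Input(shape=input_shape)
    # Conv filter counts from the docs: 24, 32, 64, 64, 64, each followed by dropout
    for filters, kernel, stride in [(24, 5, 2), (32, 5, 2), (64, 5, 2),
                                    (64, 3, 1), (64, 3, 1)]:
        x = layers.Conv2D(filters, kernel, strides=stride, activation='relu')(x)
        x = layers.Dropout(drop)(x)
    x = layers.Flatten()(x)
    x = layers.Dense(100, activation='relu')(x)
    x = layers.Dense(50, activation='relu')(x)
    # One linear unit each for steering and throttle
    steering = layers.Dense(1, activation='linear', name='steering')(x)
    throttle = layers.Dense(1, activation='linear', name='throttle')(x)
    return Model(inputs=inp, outputs=[steering, throttle])
```

The two 1-unit linear outputs let the model regress steering and throttle directly as continuous values.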
Categorical Model
Converts steering and throttle into discrete bins and trains with categorical cross-entropy.
- Same CNN base as the Linear model
- 2 output layers with softmax activation:
- Steering: 15 bins (covering -1.0 to 1.0)
- Throttle: 20 bins (covering throttle_range)
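The binning step can be illustrated with a small helper. The even bin widths and clipping behavior here are an illustrative assumption; Donkeycar ships its own utilities for this:

```python
def to_bin(value, num_bins=15, low=-1.0, high=1.0):
    """Map a continuous value in [low, high] to a bin index in [0, num_bins-1]."""
    value = max(low, min(high, value))          # clip out-of-range values
    idx = int((value - low) / (high - low) * num_bins)
    return min(idx, num_bins - 1)               # high endpoint falls in the last bin

def one_hot(idx, num_bins=15):
    """One-hot target vector for categorical cross-entropy."""
    vec = [0.0] * num_bins
    vec[idx] = 1.0
    return vec
```

With 15 steering bins over -1.0 to 1.0, full left maps to bin 0 and full right to bin 14.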
LSTM Model
Recurrent model that uses sequences of images for temporal reasoning.
- Time-distributed CNN layers on image sequences
- 2 LSTM layers (128 units each)
- Dense layers (128, 64, 10 neurons)
- 2 output neurons
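Both the LSTM and the 3D CNN below consume sliding windows of consecutive camera frames rather than single images. A minimal sketch of the windowing (sequence length 3 is just an example):

```python
def make_sequences(frames, seq_len=3):
    """Group consecutive frames into overlapping windows for a sequence model."""
    return [frames[i:i + seq_len] for i in range(len(frames) - seq_len + 1)]
```

Each window becomes one training sample, labeled with the steering/throttle of its last frame.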
3D CNN Model
Uses 3D convolutions over video sequences for spatiotemporal feature extraction.
- 4 Conv3D layers (16, 32, 64, 128 filters) with MaxPooling3D
- Flatten and batch normalization
- 2 dense layers (256 neurons) with dropout (0.5)
- 2 output neurons
Memory Model
Linear model augmented with recent steering/throttle history for smoother outputs.
- CNN base (same as Linear)
- Memory input: the last `mem_length` steering/throttle pairs
- Dense layers to process memory
- Concatenation of CNN and memory features
- Output layers with tanh/sigmoid activation
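The memory input can be maintained with a small ring buffer. `ControlMemory` and its zero-seeding are illustrative, not Donkeycar's internal class:

```python
from collections import deque

class ControlMemory:
    """Keep the last mem_length (steering, throttle) pairs as an extra model input."""
    def __init__(self, mem_length=3):
        # Seed with zeros so the input vector has a fixed size from the first frame
        self.buffer = deque([(0.0, 0.0)] * mem_length, maxlen=mem_length)

    def push(self, steering, throttle):
        self.buffer.append((steering, throttle))

    def vector(self):
        # Flatten to [s1, t1, s2, t2, ...] for the dense memory branch
        return [v for pair in self.buffer for v in pair]
```

At inference time, each prediction is pushed back into the buffer before the next frame.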
IMU Model
Combines camera images with IMU sensor data (accelerometer/gyroscope).
- Image branch: CNN layers
- IMU branch: 3 dense layers (14 neurons each)
- Concatenation of both branches
- 2 dense layers (50 neurons) with dropout
- 2 output neurons
IMU input keys: `['imu/acl_x', 'imu/acl_y', 'imu/acl_z', 'imu/gyr_x', 'imu/gyr_y', 'imu/gyr_z']`
Use case: Improves stability on rough terrain or aggressive driving.
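A tub record can be turned into the 6-element IMU input vector like this (the helper name is hypothetical; the keys are the ones listed above):

```python
def imu_record_to_vector(record):
    """Extract the six IMU channels, in key order, as the IMU-branch input."""
    keys = ['imu/acl_x', 'imu/acl_y', 'imu/acl_z',
            'imu/gyr_x', 'imu/gyr_y', 'imu/gyr_z']
    return [float(record[key]) for key in keys]
```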
Behavioral Model
Multi-task learning with different behaviors (e.g., left lane, right lane, obstacles).
- Image branch: CNN layers
- Behavior branch: Dense layers for one-hot behavior state
- Concatenation of branches
- Categorical outputs (15 angle bins, 20 throttle bins)
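The behavior state enters the network as a one-hot vector. A minimal sketch (the behavior names are examples; configure your own list):

```python
def behavior_one_hot(state, states=('left_lane', 'right_lane')):
    """Encode the active behavior as a one-hot vector for the behavior branch."""
    vec = [0.0] * len(states)
    vec[states.index(state)] = 1.0
    return vec
```

During driving, a button press typically cycles the active state, and the vector steers the shared network toward that behavior.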
Localizer Model
Predicts steering, throttle, and track location simultaneously.
- Shared CNN base
- 3 outputs: steering (linear), throttle (linear), location (softmax)
Training Pipeline
Training Command
Train a model from the command line, for example `donkey train --tub ./data --model ./models/mypilot.h5 --type linear`. Key options:
- `--tub`: comma-separated list of tub paths
- `--model`: output model path (`.h5` for TensorFlow, `.ckpt` for PyTorch)
- `--type`: model type (`linear`, `categorical`, `lstm`, `3d_cnn`, `memory`, `imu`, `behavioral`, `localizer`)
- `--transfer`: path to a model to use for transfer learning
- `--comment`: training description stored in the training database
Training Configuration
Key configuration parameters in `myconfig.py`:
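A sketch of the relevant settings; the parameter names follow the stock `myconfig.py` template, but check your own file, since the exact set varies by Donkeycar version:

```python
# myconfig.py -- illustrative values; verify names against your template
DEFAULT_MODEL_TYPE = 'linear'  # model type used when --type is not given
BATCH_SIZE = 128               # larger batches train faster but use more memory
TRAIN_TEST_SPLIT = 0.8         # fraction of records used for training
MAX_EPOCHS = 100
USE_EARLY_STOP = True
EARLY_STOP_PATIENCE = 5        # stop after this many epochs without improvement
MIN_DELTA = 0.0005             # minimum val-loss change that counts as improvement
```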
Training Process
The training pipeline performs these steps:
- Data Loading: Load tub data and split into train/validation sets
- Model Creation: Initialize the selected model architecture
- Compilation: Set optimizer, loss function, and metrics
- Augmentation: Apply image augmentations (optional)
- Training: Fit model with early stopping and checkpointing
- Export: Save best model and optionally convert to TFLite/TensorRT
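The first step, splitting records into train and validation sets, can be sketched as follows (a simplified stand-in for Donkeycar's own pipeline):

```python
import random

def split_records(records, train_frac=0.8, seed=0):
    """Shuffle tub records once, then split into train and validation sets."""
    rng = random.Random(seed)          # fixed seed makes the split reproducible
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]
```

The validation set is what early stopping and checkpointing monitor during step 5.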
Optimizer Configuration
Customize the optimizer and learning rate in your training script.
Image Augmentation
Augmentation helps prevent overfitting and improves generalization.
Configuration
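Built-in augmentations are enabled from `myconfig.py` (option names vary by version). To illustrate what such a transform does, here is a brightness jitter in plain NumPy; the function names are illustrative:

```python
import numpy as np

def adjust_brightness(image, factor):
    """Scale pixel intensities; factor > 1 brightens, < 1 darkens."""
    out = image.astype(np.float32) * factor
    return np.clip(out, 0, 255).astype(np.uint8)

def augment(image, rng=None):
    """Randomly vary brightness, mimicking lighting changes between runs."""
    rng = rng or np.random.default_rng()
    return adjust_brightness(image, rng.uniform(0.7, 1.3))
```

Applied only at training time, so the car still sees unmodified frames when driving.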
Custom Augmentation
Create a custom augmentation pipeline in your training script.
Transfer Learning
Use a pre-trained model as a starting point:
- Faster training convergence
- Better performance with limited data
- Reuse features learned from other tracks
PyTorch Training
Donkeycar also supports PyTorch with ResNet18 transfer learning.
PyTorch Model
- Pre-trained ResNet18 on ImageNet
- Feature extraction layers frozen
- Fine-tuned classifier layer
- PyTorch Lightning for training
Training PyTorch Model
PyTorch Data Pipeline
Model Selection Guide
| Model | Speed | Accuracy | Memory | Use Case |
|---|---|---|---|---|
| Linear | Fast | Good | Low | General purpose, fast inference |
| Categorical | Fast | Better | Low | Better confidence, discrete control |
| LSTM | Slow | Better | High | Temporal reasoning, smooth driving |
| 3D CNN | Very Slow | Best | Very High | Complex spatiotemporal patterns |
| Memory | Fast | Better | Low | Smooth control with history |
| IMU | Fast | Better | Low | Rough terrain, better stability |
| Behavioral | Fast | Good | Medium | Multiple driving modes |
| PyTorch ResNet | Medium | Best | Medium | Transfer learning, limited data |
Performance Tips
- Start with Linear model: Simple and fast, good baseline
- Try Categorical for better accuracy: Especially on complex tracks
- Use augmentation: Prevents overfitting on small datasets
- Monitor validation loss: Use early stopping to prevent overfitting
- Collect diverse data: Various lighting, positions, speeds
- Try transfer learning: Start from pre-trained model
- Tune batch size: Larger batches are faster but use more memory
- Use GPU: Significantly faster training (CUDA)
Model Inference
Use the trained model in your car, for example `python manage.py drive --model mypilot.h5`.
Training Output
Training produces:
- Model file: `mypilot.h5` (TensorFlow) or `mypilot.ckpt` (PyTorch)
- TFLite model: `mypilot.tflite` (for embedded devices)
- Training plot: `mypilot.png` (loss curves)
- Database entry: training metadata and history in `data/pilot_db.json`
Next Steps
- Calibration - Calibrate your car for better training data
- Get Driving - Use your trained model to drive
- Deep Learning - Advanced model architectures
