The `donkey tubplot` command creates plots comparing a model’s predictions (steering and throttle) against the actual user inputs from recorded tub data. This is essential for evaluating model performance and identifying areas for improvement.
Usage
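The general form is likely the following (the `--tub` and `--model` flag names are assumptions based on other donkey commands; check `donkey tubplot --help` for the exact syntax):

```shell
donkey tubplot --tub <tub_path> --model <model_path> [--limit <n>] [--type <model_type>] [--noshow]
```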
Options
- Tub path(s): Path(s) to tub directories to analyze. Multiple tubs can be specified.
- Model path: Path to the trained model to use for predictions.
- `--limit`: Maximum number of records to process. Default is 1000 records.
- `--type`: Model type to load (e.g., `linear`, `categorical`). If not specified, uses `DEFAULT_MODEL_TYPE` from the config.
- Config file: Location of the config file to use. Default is `./config.py`.
- `--noshow`: Save the plot without displaying it in a window. Useful for headless environments or batch processing.
What Gets Created
The command generates:
- An interactive plot window (unless `--noshow` is specified) with two subplots:
  - Steering plot: User angle vs. pilot angle over time
  - Throttle plot: User throttle vs. pilot throttle over time
- A PNG image file saved as `<model_path>_pred.png` containing the plots
Plot Features
Steering Subplot (Top)
- Blue line: User steering input (ground truth)
- Orange line: Model predicted steering
- Y-axis: Steering angle (-1.0 to 1.0)
- X-axis: Record index
Throttle Subplot (Bottom)
- Blue line: User throttle input (ground truth)
- Orange line: Model predicted throttle
- Y-axis: Throttle value (-1.0 to 1.0)
- X-axis: Record index
Plot Title
Includes:
- Tub path(s)
- Model path
- Model type
Examples
Basic plot with single tub
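A minimal invocation might look like this (the `--tub` and `--model` flag names are assumptions):

```shell
donkey tubplot --tub ./data/tub_1 --model ./models/mypilot.h5
```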
Plot with multiple tubs
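Multiple tubs could be passed as a comma-separated list (the exact multi-tub syntax is an assumption):

```shell
donkey tubplot --tub ./data/tub_1,./data/tub_2 --model ./models/mypilot.h5
```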
Process limited records
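Using the documented `--limit` option to cap the number of records processed (other flag names assumed):

```shell
donkey tubplot --tub ./data/tub_1 --model ./models/mypilot.h5 --limit 500
```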
Specify model type
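Passing the model type explicitly with the documented `--type` option:

```shell
donkey tubplot --tub ./data/tub_1 --model ./models/mypilot.h5 --type linear
```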
Save without displaying (headless mode)
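A headless run with `--noshow`; only the PNG file is written:

```shell
donkey tubplot --tub ./data/tub_1 --model ./models/mypilot.h5 --noshow
```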
Use custom config
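Pointing at a non-default config file (the `--config` flag name is an assumption based on other donkey commands):

```shell
donkey tubplot --tub ./data/tub_1 --model ./models/mypilot.h5 --config ./myconfig.py
```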
Process all available records
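Whether a sentinel value means "all records" is not documented here, so passing a `--limit` larger than the tub size is the safe assumption:

```shell
donkey tubplot --tub ./data/tub_1 --model ./models/mypilot.h5 --limit 100000
```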
Output Example
While processing, the command logs progress as it runs inference on each record. When it finishes, the plot window opens (unless you passed `--noshow`) and the PNG file is saved.
Interpreting the Plots
Good Model Performance
- Lines closely aligned: Predictions closely follow user inputs
- Smooth predictions: Model outputs are stable, not jittery
- Similar patterns: Model captures overall driving behavior
Signs of Problems
Overfitting
- Perfect match on training data
- Poor match on validation data
- Solution: Collect more diverse data, reduce model complexity
Underfitting
- Predictions don’t follow user inputs well
- Flat or unresponsive predictions
- Solution: Use more complex model, train longer, improve data quality
Lag
- Predictions delayed compared to user input
- Model reacts too slowly
- Solution: Check sequence length, reduce model latency
Oscillation
- Predictions jitter or oscillate
- Model output is unstable
- Solution: Add smoothing, improve training data, adjust learning rate
Bias
- Predictions consistently offset from user input
- Model steers too much left/right
- Solution: Check calibration, balance training data
Use Cases
Model Evaluation
Compare model accuracy after training.
Model Comparison
Evaluate multiple models on the same data, then compare the resulting `*_pred.png` files.
Identifying Problem Areas
Find where predictions diverge from user input.
Quick Validation
Rapidly check if training improved the model.
Debugging
Identify specific failure modes.
Analysis Workflow
- Train your model.
- Create plots for validation data.
- Analyze the plots:
  - Check steering accuracy
  - Check throttle stability
  - Identify systematic errors
- Iterate:
  - Collect more data for problem areas
  - Adjust model architecture or hyperparameters
  - Retrain and replot
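The first two workflow steps might look like this (command and flag names are assumptions based on current donkeycar releases; adjust paths to your project):

```shell
# Train on the recorded tub, then plot predictions against a held-out tub.
donkey train --tub ./data/train_tub --model ./models/mypilot.h5
donkey tubplot --tub ./data/val_tub --model ./models/mypilot.h5
```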
Advanced Analysis
Quantitative Metrics
For numerical error metrics, you can calculate:
- Mean Absolute Error (MAE): Average absolute difference
- Root Mean Square Error (RMSE): Emphasizes larger errors
- R² Score: How well predictions explain variance
These metrics are not reported by tubplot, but you can compute them from the saved data or by modifying the source code.
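A minimal NumPy sketch of those three metrics; the arrays here are toy stand-ins for the recorded user and predicted values, which tubplot does not expose directly:

```python
import numpy as np

def prediction_metrics(user, pilot):
    """Return (MAE, RMSE, R^2) comparing predicted values to recorded user values."""
    user = np.asarray(user, dtype=float)
    pilot = np.asarray(pilot, dtype=float)
    err = pilot - user
    mae = float(np.mean(np.abs(err)))           # average absolute difference
    rmse = float(np.sqrt(np.mean(err ** 2)))    # emphasizes larger errors
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((user - user.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot                  # fraction of variance explained
    return mae, rmse, r2

# Toy steering values standing in for a tub's user/pilot columns.
mae, rmse, r2 = prediction_metrics([0.0, 0.5, -0.5, 1.0], [0.1, 0.4, -0.6, 0.9])
```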
Custom Plots
The plot data is generated during inference. For custom analysis:
- Copy the tubplot code from `donkeycar/management/base.py`
- Modify the plotting section to add:
  - Error distribution histograms
  - Scatter plots of user vs. predicted values
  - Time-series of prediction errors
  - Confidence intervals
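As an illustration of the first two additions, a histogram and scatter plot can be drawn with matplotlib. The arrays below are synthetic stand-ins for the user/pilot values you would capture from the modified inference loop:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # off-screen rendering, like running with --noshow
import matplotlib.pyplot as plt

# Synthetic stand-ins for user/pilot steering captured during inference.
rng = np.random.default_rng(0)
user_angle = rng.uniform(-1.0, 1.0, 200)
pilot_angle = user_angle + rng.normal(0.0, 0.05, 200)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.hist(pilot_angle - user_angle, bins=30)          # error distribution
ax1.set_title("Steering error distribution")
ax1.set_xlabel("pilot - user")
ax2.scatter(user_angle, pilot_angle, s=8, alpha=0.5)  # user vs. predicted
ax2.plot([-1, 1], [-1, 1], "r--", label="perfect prediction")
ax2.set_title("User vs. predicted steering")
ax2.set_xlabel("user angle")
ax2.set_ylabel("pilot angle")
ax2.legend()
fig.savefig("custom_analysis.png")
```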
Troubleshooting
Model loading errors
- Verify model path is correct
- Ensure model file is not corrupted
- Check config matches model requirements
- Specify `--type` explicitly
Tub not found
- Check tub path is correct
- Verify tub contains valid data (manifest.json)
- Use absolute paths if relative paths fail
Plot display issues (Linux)
- May need X11 forwarding for SSH: `ssh -X`
- Or use `--noshow` and view the PNG file
- Install a matplotlib backend: `sudo apt-get install python3-tk`
Memory errors
- Reduce `--limit` to process fewer records
- Close other applications
- Use a machine with more RAM
Inference too slow
- Reduce `--limit`
- Use a GPU if available
- Ensure TensorFlow/PyTorch is properly installed
Plot looks empty or flat
- Check that tub contains valid user inputs
- Verify model is actually loaded (not using random weights)
- Ensure image normalization is correct
Tips
Efficient Evaluation
- Use a subset of data: Start with `--limit 100` for quick checks
- Automate comparison: Script multiple tubplot calls for batch analysis
- Save all plots: Use `--noshow` and compare PNG files side-by-side
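The automation tip can be sketched as a small shell loop (the `--tub` and `--model` flag names are assumptions):

```shell
# Compare several trained models on the same validation tub.
for model in ./models/*.h5; do
    donkey tubplot --tub ./data/val_tub --model "$model" --noshow
done
# Each run saves <model_path>_pred.png for side-by-side review.
```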
Representative Data
- Use validation tubs: Don’t evaluate on training data
- Test diverse scenarios: Include straight, curves, different speeds
- Check edge cases: Test on challenging sections
Continuous Monitoring
- Plot after each training: Track improvement over time
- Keep plot history: Archive PNG files with timestamps
- Document changes: Note what training parameters produced each plot
Next Steps
After analyzing plots:
- Identify weaknesses: Note where predictions are poor
- Collect targeted data: Record more examples of problem scenarios
- Visualize with video: Use `donkey makemovie` for visual analysis
- Check data distribution: Use `donkey tubhist` to analyze data balance
- Retrain: Incorporate findings into the next training iteration
- Test on car: Deploy and test in real-world conditions
