Overview
The notebook at `notebooks/inference.ipynb` provides an interactive environment for:
- Loading and visualizing multi-camera driving scenes
- Running model inference with customizable parameters
- Plotting predicted vs. ground truth trajectories
- Analyzing Chain-of-Causation reasoning traces
- Computing evaluation metrics (minADE)
Getting Started
Launch Jupyter
Start Jupyter Lab from your activated environment with `jupyter lab`, or, if you prefer the classic interface, run `jupyter notebook` instead.
Notebook Walkthrough
Cell 1: Import Dependencies
The notebook starts by importing required libraries:
- `torch`: PyTorch, for model loading and inference
- `mediapy`: for displaying multi-camera images
- `pandas`: for loading clip IDs
- `matplotlib`: for trajectory visualization
Cell 2: Load Model and Processor
This cell downloads the model weights (22 GB) on first run. The download is cached for future use.
Cell 3: Load and Prepare Data
Load a driving scene from the PhysicalAI-AV dataset.
Customizing the Clip ID
You can select different driving scenarios by:
- Using a specific clip ID: `clip_id = "030c760c-ae38-49aa-9ad8-f5650a545d26"`
- Choosing from the parquet file: `clip_id = clip_ids[INDEX]`
- The parquet file contains curated clip IDs from the dataset
Cell 4: Run Model Inference
Generate trajectory predictions with reasoning.
The notebook uses `copy.deepcopy(model_inputs)` to preserve the original inputs for potential re-runs with different parameters.
Visualizing Results
Cell 5: Display Multi-Camera Images
Visualize the input camera frames.
Cell 6: Plot Trajectory Predictions
Visualize predicted trajectories against ground truth.
Why the 90° Rotation?
The `rotate_90cc` function rotates the trajectory coordinates by 90 degrees counter-clockwise for better visualization. This transforms the vehicle coordinate system to a more intuitive top-down view where:
- Forward motion appears as upward movement on the plot
- The trajectory is easier to interpret visually
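As a sketch of what such a helper might look like (the notebook's actual `rotate_90cc` may operate on tensors or arrays rather than plain tuples), a 90° counter-clockwise rotation maps each point (x, y) to (-y, x):

```python
# Illustrative sketch of a rotate_90cc helper; the notebook's actual
# implementation (input types, array shapes) may differ.

def rotate_90cc(points):
    """Rotate (x, y) trajectory points 90 degrees counter-clockwise.

    The rotation maps (x, y) -> (-y, x), so the vehicle's forward
    axis (+x) ends up pointing "up" (+y) on the plot.
    """
    return [(-y, x) for x, y in points]

# A straight-ahead trajectory (forward along +x) ...
trajectory = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
# ... points upward (+y) after rotation, matching the top-down plot.
rotated = rotate_90cc(trajectory)
```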
Cell 7: Compute Evaluation Metrics
Calculate the minimum Average Displacement Error:
- Measures average distance between predicted and ground truth waypoints
- Lower values indicate better trajectory accuracy
- Typical values range from 0.5 to 3.0 meters depending on scene complexity
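The bullets above can be captured in a minimal reference implementation. This sketch assumes `predictions` is a list of sampled trajectories and each trajectory is a list of (x, y) waypoints aligned one-to-one with the ground truth; the notebook's own metric code may differ in input format:

```python
import math

def ade(pred, gt):
    """Average Euclidean distance between corresponding waypoints."""
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)

def min_ade(predictions, ground_truth):
    """Best (lowest) ADE over all sampled trajectories: the minADE metric."""
    return min(ade(pred, ground_truth) for pred in predictions)

gt = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
samples = [
    [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],  # drifts off to the side (ADE = 1.0)
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.5)],  # close to ground truth (ADE = 0.5/3)
]
score = min_ade(samples, gt)  # the closer sample wins: 0.5 / 3
```

Taking the minimum over samples (rather than the mean) rewards the model when at least one of its sampled futures matches what actually happened.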
Advanced Usage
Analyzing Multiple Scenarios
Loop through multiple clips to analyze model performance.
Comparing Different Sampling Strategies
Test how different parameters affect predictions.
Tips for Effective Use
Memory Management
- Clear GPU memory between runs: `torch.cuda.empty_cache()`
- Use `num_traj_samples=1` for initial experiments
- Restart the kernel if you encounter OOM errors
Reproducibility
- Set random seeds for consistent results: `torch.cuda.manual_seed_all(42)`
- Note that exact numerical reproducibility is not guaranteed across different GPU architectures
- Save model outputs for later analysis
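The effect of seeding is easy to demonstrate. The same principle applies to PyTorch via `torch.manual_seed` and `torch.cuda.manual_seed_all`; the sketch below uses the standard library's `random` so it runs anywhere:

```python
import random

def sample_run(seed, n=5):
    """Draw n pseudo-random values from an independently seeded generator."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

run_a = sample_run(42)
run_b = sample_run(42)  # same seed -> identical draws
run_c = sample_run(7)   # different seed -> different draws

assert run_a == run_b
assert run_a != run_c
```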
Performance Optimization
- Batch multiple clips together for faster processing
- Cache loaded models to avoid reloading
- Use `torch.autocast` for efficient mixed-precision inference
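For the first tip, batching starts with chunking the clip list. How a batch is actually fed to the model depends on the processor API, so this sketch (with a hypothetical `run_inference` call) only shows the chunking step:

```python
def batched(items, batch_size):
    """Yield successive batches of at most `batch_size` items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

clip_ids = [f"clip-{i:03d}" for i in range(10)]  # placeholder IDs
for batch in batched(clip_ids, 4):
    # run_inference(batch)  # hypothetical batched-inference call
    print(len(batch))       # 4, 4, 2
```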
Troubleshooting
If you encounter issues while running the notebook:
CUDA OOM Errors
Reduce `num_traj_samples` or restart the kernel to clear GPU memory
Import Errors
Ensure your environment is activated and all dependencies are installed
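A quick way to check the environment is to probe for the packages listed in the imports cell (the package names here match the import names; this is a generic check, not part of the notebook itself):

```python
import importlib.util

# Required packages from the notebook's import cell.
required = ["torch", "mediapy", "pandas", "matplotlib"]

# find_spec returns None for packages that are not installed.
missing = [name for name in required if importlib.util.find_spec(name) is None]
print("missing:", missing)  # an empty list means the environment is ready
```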
Dataset Access
Verify HuggingFace authentication and dataset access approval
Common Issues
See the full troubleshooting guide for detailed solutions
Next Steps
- Experiment with different clip IDs to see diverse driving scenarios
- Adjust sampling parameters to explore prediction diversity
- Analyze Chain-of-Causation reasoning to understand model decisions
- Compare predictions across multiple scenarios for performance evaluation