MuJoCo Simulation Debugging
Visualizing Contact Forces
Enable Contact Visualization
MuJoCo provides built-in contact force visualization.

Keyboard Shortcuts:
- F1: Toggle contact points
- F2: Toggle contact forces
- Space: Pause/resume
- Ctrl+P: Screenshot
Checking Joint Limits
Detect Joint Limit Violations
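A sketch of a limit check, written as a pure helper so it also works on logged data; with a live model, build the inputs from `model.jnt_range` and `data.qpos` via `model.jnt_qposadr`:

```python
# Sketch: flag joints that sit within `margin` rad of either limit.
import numpy as np

def find_limit_violations(qpos, jnt_range, margin=0.01):
    """Return indices of joints within `margin` of a limit.

    qpos      : (njnt,) joint positions (hinge/slide joints)
    jnt_range : (njnt, 2) lower/upper limit per joint
    """
    qpos = np.asarray(qpos, dtype=float)
    lo, hi = np.asarray(jnt_range, dtype=float).T
    near = (qpos - lo < margin) | (hi - qpos < margin)
    return np.flatnonzero(near).tolist()
```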
IK Solution Debugging
Visualize IK Targets and Solutions
Script: Test IK Reachability: see ~/workspace/source/debug_model.py. To add visualization markers for IK targets, see ~/workspace/source/ik.py:131 for the built-in test suite.

Logging Sensor Data
Record Simulation Data to CSV
RL Training Debugging
Monitoring Training Progress
TensorBoard Visualization
Launch TensorBoard to monitor training metrics.

Key Metrics to Watch:
- rollout/ep_rew_mean: Average episode reward. Should increase over time; plateaus indicate convergence or reward saturation.
- train/entropy_loss: Policy entropy. High entropy = exploration; it should decrease as the policy becomes more deterministic.
- train/policy_gradient_loss: Policy loss. Oscillations are normal; divergence indicates instability.
- train/value_loss: Critic loss. Should decrease and stabilize.
- rollout/ep_len_mean: Episode length. Increasing length = robot survives longer; max length = max_episode_steps.
If ep_len_mean stays near the maximum, the robot is not falling. If it is very short, check termination conditions.

Debugging Reward Functions
Log Reward Components
The environment returns detailed reward breakdowns.

Expected Behavior:
- forward_velocity: Dominates the reward (largest magnitude)
- lateral_velocity_penalty: Small negative value
- contact_pattern: Positive when the gait is correct
- stability: Negative penalty for tilting
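If the breakdown is exposed through the per-step `info` dict (an assumption; adapt the keys to the actual adaptive_gait_env implementation), the components can be summed per episode:

```python
# Sketch: accumulate per-component reward totals over an episode from a
# list of per-step info dicts. Key names mirror the components above and
# are assumptions about the env's info dict.
from collections import defaultdict

KEYS = ("forward_velocity", "lateral_velocity_penalty",
        "contact_pattern", "stability")

def accumulate_rewards(infos, keys=KEYS):
    """Sum each reward component over the episode's info dicts."""
    totals = defaultdict(float)
    for info in infos:
        for k in keys:
            totals[k] += info.get(k, 0.0)
    return dict(totals)
```

Printing the totals at episode end makes it obvious when one component (e.g. a penalty) silently dominates the others.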
See ~/workspace/source/envs/adaptive_gait_env.py:259 for the reward implementation.

Compare Policy vs Baseline
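The comparison can be sketched as follows, assuming a Gymnasium-style env/policy interface (names here are illustrative; the project's full version is in debug_model.py):

```python
# Sketch: compare episode returns of a trained policy against a
# zero-residual baseline.
import numpy as np

def episode_return(env, act_fn, max_steps=1000):
    obs, _ = env.reset(seed=0)
    total = 0.0
    for _ in range(max_steps):
        obs, reward, terminated, truncated, _ = env.step(act_fn(obs))
        total += reward
        if terminated or truncated:
            break
    return total

def compare_policy_vs_baseline(env, policy, action_dim):
    zero = lambda obs: np.zeros(action_dim)  # zero-residual baseline
    return {"policy": episode_return(env, policy),
            "baseline": episode_return(env, zero)}
```

If the trained policy does not clearly beat the zero-residual baseline, the residual is not contributing and the reward or observation design deserves a closer look.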
Script: ~/workspace/source/debug_model.py compares the trained policy against a zero-residual baseline; see ~/workspace/source/debug_model.py:1 for the full script.

Observation Space Validation
Check Observation Statistics
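A sketch of a statistics check over collected observations, flagging NaN/Inf values and badly scaled dimensions (two common sources of silent training failure); the `scale_limit` threshold is an illustrative choice:

```python
# Sketch: sanity-check observation statistics from random rollouts.
import numpy as np

def check_observations(obs_batch, scale_limit=100.0):
    """obs_batch: (n_samples, obs_dim) array-like. Returns a report dict."""
    obs = np.asarray(obs_batch, dtype=np.float64)
    return {
        "mean": obs.mean(axis=0),
        "std": obs.std(axis=0),
        "has_nan": bool(np.isnan(obs).any()),
        "has_inf": bool(np.isinf(obs).any()),
        # dimensions whose magnitude suggests missing normalization
        "unscaled_dims": np.flatnonzero(
            np.abs(obs).max(axis=0) > scale_limit).tolist(),
    }
```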
Action Distribution Analysis
Visualize Policy Actions
- Actions centered at 0: Policy relies mostly on baseline
- Wide distribution: Policy actively explores action space
- Saturated at ±1: Actions hitting limits (may need to adjust scaling)
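The statistics behind those interpretations can be computed with a short helper; a sketch, assuming actions are collected into an array during evaluation rollouts:

```python
# Sketch: summarize a policy's action distribution against the cases above.
import numpy as np

def action_summary(actions, limit=1.0, tol=1e-3):
    a = np.asarray(actions, dtype=np.float64)
    return {
        "mean": float(a.mean()),   # near 0 => policy leans on the baseline
        "std": float(a.std()),     # wide => policy actively explores
        # fraction of actions pinned at the +/- limit
        "saturated_frac": float(np.mean(np.abs(a) >= limit - tol)),
    }
```

A high `saturated_frac` is the numeric counterpart of the last bullet: actions are hitting the limits and the action scaling likely needs adjustment.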
ROS2 Debugging
Node Diagnostics
Check Running Nodes and Topics
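Beyond the ros2 CLI (`ros2 node list`, `ros2 topic list`), the graph can be probed from Python via rclpy; a sketch, with an arbitrary probe-node name:

```python
# Sketch: list ROS2 nodes and topics. The pure helper formats the
# listings so it can be checked without a running ROS graph.
def format_graph(node_names, topics):
    lines = ["Nodes:"] + [f"  {n}" for n in node_names]
    lines += ["Topics:"] + [f"  {name} [{', '.join(types)}]"
                            for name, types in topics]
    return "\n".join(lines)

if __name__ == "__main__":
    import rclpy
    rclpy.init()
    node = rclpy.create_node("graph_probe")  # arbitrary name
    print(format_graph(node.get_node_names(),
                       node.get_topic_names_and_types()))
    node.destroy_node()
    rclpy.shutdown()
```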
Debug Camera Feed
If the camera feed is not displaying, work through the following.

Common Issues:
- No messages: sim.py not running or camera rendering disabled
- Low frequency: MuJoCo simulation running too slowly (reduce the control rate)
- Black image: Check the camera position in robot.xml
Test Movement Commands
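A minimal movement test, assuming the simulator listens for geometry_msgs/Twist on `/cmd_vel` (an assumption; adjust the topic to the project's actual interface):

```python
# Sketch: publish one forward-walk command to /cmd_vel.
def make_twist_fields(vx=0.2, wz=0.0):
    """Pure helper: field values for a geometry_msgs/Twist command."""
    return {"linear.x": float(vx), "angular.z": float(wz)}

if __name__ == "__main__":
    import rclpy
    from geometry_msgs.msg import Twist

    rclpy.init()
    node = rclpy.create_node("cmd_test")
    pub = node.create_publisher(Twist, "/cmd_vel", 10)  # assumed topic
    msg = Twist()
    fields = make_twist_fields(vx=0.2)
    msg.linear.x = fields["linear.x"]
    msg.angular.z = fields["angular.z"]
    pub.publish(msg)
    node.get_logger().info("published test command")
    node.destroy_node()
    rclpy.shutdown()
```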
Service Call Debugging
- Check if sim.py is running
- Verify the ROS2 environment is sourced: source /opt/ros/jazzy/setup.bash
- Check for error messages in the sim.py terminal
Joystick Debugging
Test Joystick Input
Check pygame joystick detection.

Common Issues:
- No joystick detected: Check the USB connection; try ls /dev/input/js*
- Axes not responding: Check the axis mapping in gui/gui.py
- Buttons not working: Verify button indices with the pygame.joystick API
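A detection sketch with pygame; the report helper is pure so it can be exercised without hardware attached:

```python
# Sketch: enumerate joysticks and report their axis/button counts.
def joystick_report(infos):
    if not infos:
        return "No joystick detected"
    return "\n".join(
        f"{i}: {d['name']} ({d['axes']} axes, {d['buttons']} buttons)"
        for i, d in enumerate(infos))

if __name__ == "__main__":
    import pygame
    pygame.init()
    pygame.joystick.init()
    infos = []
    for i in range(pygame.joystick.get_count()):
        js = pygame.joystick.Joystick(i)
        infos.append({"name": js.get_name(),
                      "axes": js.get_numaxes(),
                      "buttons": js.get_numbuttons()})
    print(joystick_report(infos))
    pygame.quit()
```

Wiggle each axis and press each button while printing `js.get_axis(i)` / `js.get_button(i)` in a loop to recover the indices gui/gui.py expects.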
Performance Profiling
Profile Simulation Speed
- Reduce the number of contacts (simplify terrain)
- Lower constraint solver iterations: model.opt.iterations = 1
- Use GPU rendering: MUJOCO_GL=egl
Profile Training Throughput
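A rough throughput measurement with random actions, assuming a Gymnasium-style env; useful for sizing the number of parallel envs before a long run:

```python
# Sketch: measure environment steps per second under random actions.
import time

def env_steps_per_sec(env, n_steps=1000):
    env.reset(seed=0)
    start = time.perf_counter()
    for _ in range(n_steps):
        _, _, terminated, truncated, _ = env.step(env.action_space.sample())
        if terminated or truncated:
            env.reset()
    return n_steps / (time.perf_counter() - start)
```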
Common Issues and Solutions
Robot Falls Immediately
Checklist
- Check the initial pose: Verify the keyframe in robot.xml has valid joint angles
- Verify IK solutions: Ensure gait targets are reachable
- Check settle steps: The environment may need initialization time
- Inspect contact forces: Ensure the feet are making ground contact
Training Diverges
Checklist
- Reduce the learning rate
- Clip the gradient norm
- Increase the batch size
- Check reward scaling: Ensure reward components are balanced
- Add reward clipping
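The checklist can be turned into concrete settings; the values below are illustrative starting points (the key names mirror Stable-Baselines3 PPO kwargs), not tuned numbers for this project:

```python
# Sketch: hedged defaults for stabilizing PPO-style training, plus a
# simple reward clamp. All values are illustrative assumptions.
import numpy as np

STABILIZING_HYPERPARAMS = {
    "learning_rate": 1e-4,   # reduced from the common 3e-4 default
    "max_grad_norm": 0.5,    # gradient-norm clipping
    "batch_size": 256,       # larger batches smooth gradient noise
}

def clip_reward(reward, limit=10.0):
    """Clamp a scalar reward to [-limit, limit] before the update."""
    return float(np.clip(reward, -limit, limit))
```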
Slow Training
Optimizations
- Use SubprocVecEnv: Parallel environments on multi-core CPUs
- Reduce episode length
- Use a GPU
- Optimize network size
- Reduce checkpoint frequency
Debugging Tools Summary
MuJoCo Viewer
Built-in visualization with contact forces, joint positions, and camera views
TensorBoard
Monitor training metrics: rewards, losses, episode lengths
ROS2 CLI
Inspect topics, services, and nodes with ros2 command-line tools

Python Debugger

Use pdb or ipdb for interactive debugging: import pdb; pdb.set_trace()

Next Steps
Custom Terrains
Create custom heightfield terrains for testing
Extending Controllers
Add new gait parameters or behaviors