
Goal

In this guide, you’ll run the main comparison test that demonstrates the power of adaptive RL control. You’ll see three simulations side-by-side showing how the robot performs:
  1. Baseline on flat terrain - establishes ideal performance
  2. Baseline on rough terrain - shows degradation without adaptation
  3. Adaptive RL on rough terrain - demonstrates learned improvements
Time required: ~10 minutes (includes 3 × 17-second simulations + setup)

Prerequisites

Before starting, make sure you’ve completed the installation guide. You’ll need:
  • ROS2 Jazzy installed and sourced
  • Python dependencies from requirements.txt
  • The source code cloned to your machine

Step 1: Source ROS2 Environment

Every time you work with the robot, start by sourcing ROS2:
source /opt/ros/jazzy/setup.bash
Add this line to your ~/.bashrc to automatically source ROS2 in every terminal session.
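To add it once without creating duplicates, you can append it conditionally (a small sketch; adjust the path if your ROS2 install lives elsewhere):

```shell
# Append the source line to ~/.bashrc only if it is not already there
grep -qxF 'source /opt/ros/jazzy/setup.bash' ~/.bashrc || \
  echo 'source /opt/ros/jazzy/setup.bash' >> ~/.bashrc
```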

Step 2: Navigate to Project Directory

cd path/to/walk2
Verify you’re in the right place:
ls -1
You should see directories like model/, controllers/, envs/, tests/, etc.

Step 3: Run the Comparison Test

Execute the main comparison script with the pre-trained model:
python3 tests/compare_baseline_adaptive.py \
    --model runs/adaptive_gait_20251115_180640/final_model.zip \
    --normalize runs/adaptive_gait_20251115_180640/vec_normalize.pkl \
    --seconds 17
  • --model: Path to the trained PPO policy network (.zip file)
  • --normalize: Path to observation normalization statistics (.pkl file)
  • --seconds: Duration of each simulation (17 seconds gives clear results)

Step 4: Watch the Simulations

The script will run three simulations automatically:

Step 1: Baseline (Flat)

A MuJoCo viewer window opens showing the robot on flat terrain. The baseline gait controller moves the robot forward steadily. This simulation lasts 17 seconds.
Expected: smooth forward progression (~0.5 m traveled)

Step 2: Baseline (Rough)

The window closes and reopens with rough terrain. The same controller runs, but the irregular ground now disrupts the gait.
Expected: slower, less efficient movement (~0.3 m traveled, roughly 41% degradation)

Step 3: Adaptive RL (Rough)

Same rough terrain, but now the RL policy adds corrections on top of the baseline gait. Watch how the robot adjusts its gait dynamically.
Expected: much better performance (~3.2 m traveled, roughly 967% improvement)

Each simulation runs in fullscreen mode. Press ESC to exit early if needed, but note that this aborts the comparison.

Step 5: Review the Results

After all three simulations complete, you’ll see:

Terminal Output

A summary table with detailed metrics:
===============================================================================================
COMPARISON SUMMARY - THREE SIMULATIONS
===============================================================================================

Metric                         Step 1: Baseline     Step 2: Baseline     Step 3: Adaptive    
                               (Flat)               (Rough)              (Rough)             
-----------------------------------------------------------------------------------------------
Duration (s)                   17.00                17.00                17.00               
Data points                    9069                 8966                 7128                
Start X (m)                    0.000                0.000                0.000               
End X (m)                      0.506                0.299                3.191               
Distance traveled (m)          0.506                0.299                3.191               
Average velocity (m/s)         0.030                0.018                0.188               

Performance Comparison:
  Step 2 vs Step 1 (Rough vs Flat):      -40.9%
  Step 3 vs Step 2 (Adaptive vs Rough):  +967.2%
===============================================================================================
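The percentages in the summary are plain relative changes in distance traveled. Using the distances from the table above:

```python
# Distances traveled (m) from the summary table
d_flat, d_rough, d_adaptive = 0.506, 0.299, 3.191

# Relative change in distance, expressed as a percentage
degradation = (d_rough - d_flat) / d_flat * 100      # rough vs flat
improvement = (d_adaptive - d_rough) / d_rough * 100  # adaptive vs rough

print(f"{degradation:+.1f}%")  # -40.9%
print(f"{improvement:+.1f}%")  # +967.2%
```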

Comparison Plot

A three-panel visualization saved to tests/baseline_vs_adaptive_comparison.png. Each panel shows:
  • Time vs X position trajectory
  • Start (green) and end (red) markers
  • Distance traveled and average velocity
  • Percentage improvement/degradation

Trajectory Data

Three JSON files with complete trajectory data:
tests/trajectory_step1_baseline_flat.json
tests/trajectory_step2_baseline_rough.json  
tests/trajectory_step3_adaptive_rough.json
Each file contains timestamped position data:
{
  "duration": 17.0,
  "terrain": "rough",
  "mode": "adaptive",
  "trajectory": [
    {"time": 0.000, "x": 0.000, "y": 0.000, "z": 0.091},
    {"time": 0.002, "x": 0.001, "y": 0.000, "z": 0.091},
    ...
  ]
}
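The summary metrics can be recomputed directly from this schema. A minimal sketch, using an inline trajectory with the same fields as the saved JSON files:

```python
# Minimal trajectory in the same schema as the saved JSON files
# (only the first and last samples are needed for these two metrics)
data = {
    "duration": 17.0,
    "terrain": "rough",
    "mode": "adaptive",
    "trajectory": [
        {"time": 0.000, "x": 0.000, "y": 0.000, "z": 0.091},
        {"time": 17.000, "x": 3.191, "y": 0.000, "z": 0.091},
    ],
}

traj = data["trajectory"]
distance = traj[-1]["x"] - traj[0]["x"]       # net forward displacement
avg_velocity = distance / data["duration"]    # mean forward speed

print(f"{distance:.3f} m, {avg_velocity:.3f} m/s")  # 3.191 m, 0.188 m/s
```

For a real run, replace the inline dict with `json.load()` on one of the files listed above.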

Understanding the Results

Why Does Baseline Degrade on Rough Terrain?

The baseline controller uses open-loop gait generation:
# From gait_controller.py
GAIT_PARAMS = GaitParameters(
    body_height=0.05,
    step_length=0.06, 
    step_height=0.04,
    cycle_time=0.8
)
These fixed parameters work well on flat ground but fail to adapt when:
  • Ground height varies unexpectedly
  • Contact timing differs from predictions
  • Body orientation changes due to terrain
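"Open-loop" here means foot targets are a pure function of time: nothing about the terrain ever feeds back into them. A hypothetical sketch (the function and trajectory shape are illustrative, not the project's actual gait math; only the parameter values come from GAIT_PARAMS):

```python
import math

# Values from GAIT_PARAMS; the trajectory function below is illustrative
STEP_LENGTH, STEP_HEIGHT, CYCLE_TIME = 0.06, 0.04, 0.8

def foot_target(t: float) -> tuple:
    """Foot position offset (x, z) at time t -- depends only on the clock."""
    phase = (t % CYCLE_TIME) / CYCLE_TIME          # 0..1 within one gait cycle
    x = STEP_LENGTH * math.sin(2 * math.pi * phase)
    # Lift the foot only during the swing half of the cycle
    z = max(0.0, STEP_HEIGHT * math.sin(2 * math.pi * phase))
    return x, z
```

Because `t` is the only input, a bump that arrives mid-swing changes nothing about the commanded trajectory, which is exactly why performance degrades on rough ground.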

How Does Adaptive Control Help?

The RL policy learns to add residual corrections that compensate for terrain:
# From adaptive_gait_controller.py
class AdaptiveGaitController:
    def update(self, timestep: float, rl_action: np.ndarray) -> Dict[str, np.ndarray]:
        # Get baseline targets from the open-loop gait
        base_targets = self.base_controller.update(timestep)
        
        # Split the action vector evenly across legs and add the
        # scaled residuals to the baseline targets
        leg_residuals = np.split(rl_action, len(LEG_NAMES))
        for i, leg in enumerate(LEG_NAMES):
            base_targets[leg] += leg_residuals[i] * self.residual_scale
        
        return base_targets
The policy observes:
  • Body position, velocity, and orientation
  • Joint angles and velocities
  • Contact states for each foot
  • Recent command history
And learns to predict optimal corrections through trial and error during training.
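The observation is just those quantities concatenated into one flat vector. A hypothetical sketch of the assembly (field names, sizes, and the 12-joint / 4-leg layout are illustrative assumptions; the real environment defines its own layout):

```python
import numpy as np

def build_observation(body_pos, body_vel, body_quat,
                      joint_pos, joint_vel, foot_contacts, cmd_history):
    """Concatenate robot state into one flat policy input vector."""
    return np.concatenate([
        body_pos,             # (3,)  world position
        body_vel,             # (3,)  linear velocity
        body_quat,            # (4,)  orientation quaternion
        joint_pos,            # (12,) joint angles (4 legs x 3 joints, assumed)
        joint_vel,            # (12,) joint velocities
        foot_contacts,        # (4,)  binary contact flag per foot
        cmd_history.ravel(),  # recent commands, flattened
    ])

obs = build_observation(
    body_pos=np.zeros(3), body_vel=np.zeros(3),
    body_quat=np.array([1.0, 0.0, 0.0, 0.0]),
    joint_pos=np.zeros(12), joint_vel=np.zeros(12),
    foot_contacts=np.zeros(4), cmd_history=np.zeros((2, 12)),
)
print(obs.shape)  # (62,)
```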

Next Steps

Run Standalone Simulation

Try a simpler simulation without the comparison framework:
python3 height_control.py
Press Space to pause/resume.

Try GUI with Joystick

Run the full ROS2 stack with GUI control:

Terminal 1:
cd gui && python3 gui.py

Terminal 2:
python3 sim.py --terrain rough

Train Your Own Model

Train a new adaptive policy from scratch:
bash train_quick.sh
Takes ~1.5 hours with 12 CPU cores.

Explore the Code

Dive into the implementation:
  • gait_controller.py - Baseline diagonal gait
  • controllers/adaptive_gait_controller.py - RL wrapper
  • envs/adaptive_gait_env.py - Training environment
  • utils/ik.py - Inverse kinematics solver

Troubleshooting

If the script fails with import errors, install the Python dependencies:
pip install -r requirements.txt

If files aren't found, make sure you're running the script from the project root directory (walk2/), not from inside tests/. The pre-trained model should exist at:
runs/adaptive_gait_20251115_180640/final_model.zip

If the viewer fails to open, check your MUJOCO_GL environment variable:
unset MUJOCO_GL
If running over SSH without X11 forwarding, MuJoCo will default to headless rendering.

If the robot behaves erratically, this might indicate:
  1. Wrong terrain file (check console output)
  2. Incorrect IK parameters
  3. Corrupted model checkpoint
Try running just the baseline on flat terrain:
python3 height_control.py --terrain flat

If your numbers differ from the examples above, results can vary based on:
  • Random terrain generation (each rough terrain is unique)
  • Initial robot placement
  • Simulation timestep accuracy
Run the comparison multiple times to see typical ranges.

Visualization Tips

MuJoCo Viewer Controls

While simulations are running:
Action             Control
-----------------  -------------------
Rotate camera      Right-click + drag
Pan camera         Middle-click + drag
Zoom               Scroll wheel
Pause              Space bar
Reset view         Double-click
Exit fullscreen    ESC

Analyzing Trajectories

Load and plot trajectory data with Python:
import json
import matplotlib.pyplot as plt

# Load trajectory
with open('tests/trajectory_step3_adaptive_rough.json') as f:
    data = json.load(f)

# Extract time and position
times = [p['time'] for p in data['trajectory']]
x_pos = [p['x'] for p in data['trajectory']]

# Plot
plt.plot(times, x_pos)
plt.xlabel('Time (s)')
plt.ylabel('X Position (m)')
plt.title(f"{data['mode']} on {data['terrain']} terrain")
plt.show()

Learn More

  • MuJoCo Docs - physics engine documentation
  • Stable Baselines 3 - RL training framework
  • ROS2 Jazzy - ROS2 distribution docs
