
Goal

In this guide, you’ll run the main comparison test that demonstrates the power of adaptive RL control. You’ll see three simulations side-by-side showing how the robot performs:
  1. Baseline on flat terrain - establishes ideal performance
  2. Baseline on rough terrain - shows degradation without adaptation
  3. Adaptive RL on rough terrain - demonstrates learned improvements
Time required: ~10 minutes (includes 3 × 17-second simulations + setup)

Prerequisites

Before starting, make sure you’ve completed the installation guide. You’ll need:
  • ROS2 Jazzy installed and sourced
  • Python dependencies from requirements.txt
  • The source code cloned to your machine

Step 1: Source ROS2 Environment

Every time you work with the robot, start by sourcing ROS2:
source /opt/ros/jazzy/setup.bash
Add this line to your ~/.bashrc to automatically source ROS2 in every terminal session.
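To add it once without creating duplicates, you can append it conditionally (a small sketch; adjust the path if your ROS2 install lives elsewhere):

```shell
# Append the source line to ~/.bashrc only if it is not already there
grep -qxF 'source /opt/ros/jazzy/setup.bash' ~/.bashrc || \
  echo 'source /opt/ros/jazzy/setup.bash' >> ~/.bashrc
```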

Step 2: Navigate to Project Directory

cd path/to/walk2
Verify you’re in the right place:
ls -1
You should see directories like model/, controllers/, envs/, tests/, etc.

Step 3: Run the Comparison Test

Execute the main comparison script with the pre-trained model:
python3 tests/compare_baseline_adaptive.py \
    --model runs/adaptive_gait_20251115_180640/final_model.zip \
    --normalize runs/adaptive_gait_20251115_180640/vec_normalize.pkl \
    --seconds 17
  • --model: Path to the trained PPO policy network (.zip file)
  • --normalize: Path to observation normalization statistics (.pkl file)
  • --seconds: Duration of each simulation (17 seconds gives clear results)

Step 4: Watch the Simulations

The script will run three simulations automatically:

Step 1: Baseline (Flat)

A MuJoCo viewer window opens showing the robot on flat terrain. The baseline gait controller moves the robot forward steadily. This simulation lasts 17 seconds.
Expected: smooth forward progression (~0.5 m traveled)

Step 2: Baseline (Rough)

The window closes and reopens with rough terrain. The same controller runs, but the irregular ground now disrupts the gait.
Expected: slower, less efficient movement (~0.3 m traveled, roughly 41% degradation)

Step 3: Adaptive RL (Rough)

Same rough terrain, but now the RL policy adds corrections on top of the baseline gait. Watch how the robot adjusts its gait dynamically.
Expected: much better performance (~3.2 m traveled, roughly 967% improvement)

Each simulation runs in fullscreen mode. Press ESC to exit early if needed, but note that this aborts the comparison.

Step 5: Review the Results

After all three simulations complete, you’ll see:

Terminal Output

A summary table with detailed metrics:
===============================================================================================
COMPARISON SUMMARY - THREE SIMULATIONS
===============================================================================================

Metric                         Step 1: Baseline     Step 2: Baseline     Step 3: Adaptive    
                               (Flat)               (Rough)              (Rough)             
-----------------------------------------------------------------------------------------------
Duration (s)                   17.00                17.00                17.00               
Data points                    9069                 8966                 7128                
Start X (m)                    0.000                0.000                0.000               
End X (m)                      0.506                0.299                3.191               
Distance traveled (m)          0.506                0.299                3.191               
Average velocity (m/s)         0.030                0.018                0.188               

Performance Comparison:
  Step 2 vs Step 1 (Rough vs Flat):      -40.9%
  Step 3 vs Step 2 (Adaptive vs Rough):  +967.2%
===============================================================================================
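The percentages in the summary are plain relative changes in distance traveled. Using the distances from the table above:

```python
# Distances traveled (m) from the summary table
d_flat, d_rough, d_adaptive = 0.506, 0.299, 3.191

# Relative change in distance, expressed as a percentage
degradation = (d_rough - d_flat) / d_flat * 100      # rough vs flat
improvement = (d_adaptive - d_rough) / d_rough * 100  # adaptive vs rough

print(f"{degradation:+.1f}%")  # -40.9%
print(f"{improvement:+.1f}%")  # +967.2%
```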

Comparison Plot

A three-panel visualization saved to tests/baseline_vs_adaptive_comparison.png. Each panel shows:
  • Time vs X position trajectory
  • Start (green) and end (red) markers
  • Distance traveled and average velocity
  • Percentage improvement/degradation

Trajectory Data

Three JSON files with complete trajectory data:
tests/trajectory_step1_baseline_flat.json
tests/trajectory_step2_baseline_rough.json  
tests/trajectory_step3_adaptive_rough.json
Each file contains timestamped position data:
{
  "duration": 17.0,
  "terrain": "rough",
  "mode": "adaptive",
  "trajectory": [
    {"time": 0.000, "x": 0.000, "y": 0.000, "z": 0.091},
    {"time": 0.002, "x": 0.001, "y": 0.000, "z": 0.091},
    ...
  ]
}
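The summary metrics can be recomputed directly from this schema. A minimal sketch, using an inline trajectory with the same fields as the saved JSON files:

```python
# Minimal trajectory in the same schema as the saved JSON files
# (only the first and last samples are needed for these two metrics)
data = {
    "duration": 17.0,
    "terrain": "rough",
    "mode": "adaptive",
    "trajectory": [
        {"time": 0.000, "x": 0.000, "y": 0.000, "z": 0.091},
        {"time": 17.000, "x": 3.191, "y": 0.000, "z": 0.091},
    ],
}

traj = data["trajectory"]
distance = traj[-1]["x"] - traj[0]["x"]       # net forward displacement
avg_velocity = distance / data["duration"]    # mean forward speed

print(f"{distance:.3f} m, {avg_velocity:.3f} m/s")  # 3.191 m, 0.188 m/s
```

For a real run, replace the inline dict with `json.load()` on one of the files listed above.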

Understanding the Results

Why Does Baseline Degrade on Rough Terrain?

The baseline controller uses open-loop gait generation:
# From gait_controller.py
GAIT_PARAMS = GaitParameters(
    body_height=0.05,
    step_length=0.06, 
    step_height=0.04,
    cycle_time=0.8
)
These fixed parameters work well on flat ground but fail to adapt when:
  • Ground height varies unexpectedly
  • Contact timing differs from predictions
  • Body orientation changes due to terrain
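"Open-loop" here means foot targets are a pure function of time: nothing about the terrain ever feeds back into them. A hypothetical sketch (the function and trajectory shape are illustrative, not the project's actual gait math; only the parameter values come from GAIT_PARAMS):

```python
import math

# Values from GAIT_PARAMS; the trajectory function below is illustrative
STEP_LENGTH, STEP_HEIGHT, CYCLE_TIME = 0.06, 0.04, 0.8

def foot_target(t: float) -> tuple:
    """Foot position offset (x, z) at time t -- depends only on the clock."""
    phase = (t % CYCLE_TIME) / CYCLE_TIME          # 0..1 within one gait cycle
    x = STEP_LENGTH * math.sin(2 * math.pi * phase)
    # Lift the foot only during the swing half of the cycle
    z = max(0.0, STEP_HEIGHT * math.sin(2 * math.pi * phase))
    return x, z
```

Because `t` is the only input, a bump that arrives mid-swing changes nothing about the commanded trajectory, which is exactly why performance degrades on rough ground.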

How Does Adaptive Control Help?

The RL policy learns to add residual corrections that compensate for terrain:
# From adaptive_gait_controller.py
class AdaptiveGaitController:
    def update(self, timestep: float, rl_action: np.ndarray) -> Dict[str, np.ndarray]:
        # Get baseline targets from the open-loop gait
        base_targets = self.base_controller.update(timestep)
        
        # Split the action vector evenly across legs and add the
        # scaled residuals to the baseline targets
        leg_residuals = np.split(rl_action, len(LEG_NAMES))
        for i, leg in enumerate(LEG_NAMES):
            base_targets[leg] += leg_residuals[i] * self.residual_scale
        
        return base_targets
The policy observes:
  • Body position, velocity, and orientation
  • Joint angles and velocities
  • Contact states for each foot
  • Recent command history
And learns to predict optimal corrections through trial and error during training.
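The observation is just those quantities concatenated into one flat vector. A hypothetical sketch of the assembly (field names, sizes, and the 12-joint / 4-leg layout are illustrative assumptions; the real environment defines its own layout):

```python
import numpy as np

def build_observation(body_pos, body_vel, body_quat,
                      joint_pos, joint_vel, foot_contacts, cmd_history):
    """Concatenate robot state into one flat policy input vector."""
    return np.concatenate([
        body_pos,             # (3,)  world position
        body_vel,             # (3,)  linear velocity
        body_quat,            # (4,)  orientation quaternion
        joint_pos,            # (12,) joint angles (4 legs x 3 joints, assumed)
        joint_vel,            # (12,) joint velocities
        foot_contacts,        # (4,)  binary contact flag per foot
        cmd_history.ravel(),  # recent commands, flattened
    ])

obs = build_observation(
    body_pos=np.zeros(3), body_vel=np.zeros(3),
    body_quat=np.array([1.0, 0.0, 0.0, 0.0]),
    joint_pos=np.zeros(12), joint_vel=np.zeros(12),
    foot_contacts=np.zeros(4), cmd_history=np.zeros((2, 12)),
)
print(obs.shape)  # (62,)
```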

Next Steps

Run Standalone Simulation

Try a simpler simulation without the comparison framework:
python3 height_control.py
Press Space to pause/resume.

Try GUI with Joystick

Run the full ROS2 stack with GUI control:

Terminal 1:
cd gui && python3 gui.py

Terminal 2:
python3 sim.py --terrain rough

Train Your Own Model

Train a new adaptive policy from scratch:
bash train_quick.sh
Takes ~1.5 hours with 12 CPU cores.

Explore the Code

Dive into the implementation:
  • gait_controller.py - Baseline diagonal gait
  • controllers/adaptive_gait_controller.py - RL wrapper
  • envs/adaptive_gait_env.py - Training environment
  • utils/ik.py - Inverse kinematics solver

Troubleshooting

If the script fails with import errors, install the Python dependencies:
pip install -r requirements.txt

If files aren't found, make sure you're running the script from the project root directory (walk2/), not from inside tests/. The pre-trained model should exist at:
runs/adaptive_gait_20251115_180640/final_model.zip

If the viewer fails to open, check your MUJOCO_GL environment variable:
unset MUJOCO_GL
If running over SSH without X11 forwarding, MuJoCo will default to headless rendering.

If the robot behaves erratically, this might indicate:
  1. Wrong terrain file (check console output)
  2. Incorrect IK parameters
  3. Corrupted model checkpoint
Try running just the baseline on flat terrain:
python3 height_control.py --terrain flat

If your numbers differ from the examples above, results can vary based on:
  • Random terrain generation (each rough terrain is unique)
  • Initial robot placement
  • Simulation timestep accuracy
Run the comparison multiple times to see typical ranges.

Visualization Tips

MuJoCo Viewer Controls

While simulations are running:
Action             Control
-----------------  -------------------
Rotate camera      Right-click + drag
Pan camera         Middle-click + drag
Zoom               Scroll wheel
Pause              Space bar
Reset view         Double-click
Exit fullscreen    ESC

Analyzing Trajectories

Load and plot trajectory data with Python:
import json
import matplotlib.pyplot as plt

# Load trajectory
with open('tests/trajectory_step3_adaptive_rough.json') as f:
    data = json.load(f)

# Extract time and position
times = [p['time'] for p in data['trajectory']]
x_pos = [p['x'] for p in data['trajectory']]

# Plot
plt.plot(times, x_pos)
plt.xlabel('Time (s)')
plt.ylabel('X Position (m)')
plt.title(f"{data['mode']} on {data['terrain']} terrain")
plt.show()

Learn More

  • MuJoCo Docs - physics engine documentation
  • Stable Baselines 3 - RL training framework
  • ROS2 Jazzy - ROS2 distribution docs
