
Adaptive Quadruped Robot with Reinforcement Learning

A 12-DOF quadruped robot simulation powered by MuJoCo physics and PPO-based adaptive gait control. Train intelligent controllers that learn to navigate rough terrain through reinforcement learning.

Training Episode
Forward Velocity: +0.188 m/s
Improvement vs Baseline: +967%
PPO Training: 30M timesteps • 12 parallel envs

Quick Start

Get the quadruped robot simulation running in minutes

Step 1: Install dependencies

Install ROS2 Jazzy and Python dependencies. The project requires MuJoCo for physics simulation, Stable-Baselines3 for RL training, and ROS2 for real-time control.
# Install ROS2 Jazzy (Ubuntu 24.04)
sudo apt install ros-jazzy-desktop python3-colcon-common-extensions

# Clone repository and install Python dependencies
git clone https://github.com/stm32f303ret6/tesis.git
cd tesis
pip install -r requirements.txt
Step 2: Run the baseline comparison test

Execute the main test that compares baseline kinematic control vs adaptive RL-based control on rough terrain.
python3 tests/compare_baseline_adaptive.py \
    --model runs/adaptive_gait_20251115_180640/final_model.zip \
    --normalize runs/adaptive_gait_20251115_180640/vec_normalize.pkl \
    --seconds 17
The test runs three simulations and outputs a comparison table:
===============================================================================
COMPARISON SUMMARY - THREE SIMULATIONS
===============================================================================

Metric                         Step 1: Baseline     Step 2: Baseline     Step 3: Adaptive    
                               (Flat)               (Rough)              (Rough)             
-------------------------------------------------------------------------------
Duration (s)                   17.00                17.00                17.00               
Distance traveled (m)          0.506                0.299                3.191               
Average velocity (m/s)         0.030                0.018                0.188               

Performance Comparison:
  Step 2 vs Step 1 (Rough vs Flat):      -40.9%
  Step 3 vs Step 2 (Adaptive vs Rough):  +967.2%
===============================================================================
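The two comparison percentages are simply the relative change in distance traveled between runs. A minimal sketch, using the distances from the summary table above:

```python
def percent_change(new, old):
    """Relative change of `new` vs `old`, in percent."""
    return (new - old) / old * 100.0

# Distances traveled (m), copied from the comparison table
d_flat, d_rough, d_adaptive = 0.506, 0.299, 3.191

print(f"Step 2 vs Step 1 (Rough vs Flat):     {percent_change(d_rough, d_flat):+.1f}%")
print(f"Step 3 vs Step 2 (Adaptive vs Rough): {percent_change(d_adaptive, d_rough):+.1f}%")
```

Running this reproduces the -40.9% and +967.2% figures printed by the test.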
Step 3: Visualize with GUI and joystick

Launch the PyQt5 GUI with ROS2 integration for interactive control. This requires two terminals.

Terminal 1 - GUI:
source /opt/ros/jazzy/setup.bash
cd gui
python3 gui.py
Terminal 2 - Simulation:
source /opt/ros/jazzy/setup.bash
python3 sim.py --terrain rough
Use your gamepad’s analog stick to control forward/backward movement.

Explore by Topic

Deep dive into the robot’s architecture and capabilities

Robot Design

12-DOF quadruped with parallel SCARA leg mechanism and 5-bar linkage. Learn about the custom OpenSCAD CAD design.

Gait Control

Diagonal trot gait with Bézier curve swing trajectories. Understand the state machine and phase transitions.
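A quadratic Bézier curve is a common way to shape a swing-foot trajectory. This is a minimal sketch, not the project's actual controller; the clearance value and foothold coordinates are illustrative placeholders:

```python
def bezier_swing(t, start, end, clearance=0.03):
    """Quadratic Bézier swing trajectory for a foot in the sagittal plane.
    t in [0, 1] is the swing phase; start/end are (x, z) footholds.
    The control point is lifted by 2*clearance above the higher foothold,
    so the curve's apex at t = 0.5 clears the ground by ~`clearance`."""
    (x0, z0), (x2, z2) = start, end
    x1 = 0.5 * (x0 + x2)                # control point x: stride midpoint
    z1 = max(z0, z2) + 2.0 * clearance  # lifted control point
    s = 1.0 - t
    x = s * s * x0 + 2.0 * s * t * x1 + t * t * x2
    z = s * s * z0 + 2.0 * s * t * z1 + t * t * z2
    return x, z
```

Sampling t over [0, 1] during the swing phase of each diagonal leg pair yields a smooth lift-and-place arc between footholds.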

Inverse Kinematics

3DOF parallel SCARA IK solver for leg positioning. Explore the geometric solutions and working modes.
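The planar part of a 5-bar (parallel SCARA) IK has a closed-form geometric solution: each motor angle comes from the direction to the foot plus a law-of-cosines elbow angle. A hedged sketch, where the link lengths and motor spacing are illustrative placeholders rather than the robot's real dimensions:

```python
import math

def five_bar_ik(x, y, l1=0.06, l2=0.10, d=0.04):
    """Planar IK for a 5-bar parallel SCARA leg (sketch with assumed geometry).
    Motors sit at (-d/2, 0) and (+d/2, 0); each drives a proximal link of
    length l1, and the two distal links of length l2 meet at the foot (x, y).
    Returns the two motor angles in radians; the sign on the elbow angle
    selects the working mode for each motor."""
    thetas = []
    for xm, sign in ((-d / 2.0, 1.0), (d / 2.0, -1.0)):
        dx = x - xm
        r = math.hypot(dx, y)             # motor-to-foot distance
        if not abs(l1 - l2) <= r <= l1 + l2:
            raise ValueError("foot target out of reach")
        phi = math.atan2(y, dx)           # direction from motor to foot
        # Law of cosines: interior angle between the proximal link and
        # the motor-to-foot line
        alpha = math.acos((l1 * l1 + r * r - l2 * l2) / (2.0 * l1 * r))
        thetas.append(phi + sign * alpha)
    return tuple(thetas)
```

The solution can be checked by forward kinematics: placing each knee at the motor position plus l1 along the solved angle, the knee must sit exactly l2 away from the foot.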

Reinforcement Learning

PPO-based adaptive control with residual corrections. Learn how the policy learns to navigate rough terrain.
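The core idea of residual control is that the policy does not replace the nominal gait; it outputs small, bounded corrections added on top of it. A minimal sketch of that combination step, assuming a hypothetical clipping bound (`limit` is not the project's actual value):

```python
def apply_residuals(base_angles, residuals, limit=0.15):
    """Residual-control sketch: clip the policy's corrections to
    [-limit, +limit] radians and add them to the nominal gait's
    joint targets. Clipping keeps the policy from overriding the
    baseline gait outright, so early (untrained) actions stay safe."""
    assert len(base_angles) == len(residuals)
    return [b + max(-limit, min(limit, r))
            for b, r in zip(base_angles, residuals)]
```

For example, a residual of 0.3 rad would be clipped to 0.15 rad before being added to the baseline joint target.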

Usage Guides

Step-by-step tutorials for common workflows

Running Simulations

Run standalone simulations with MuJoCo viewer

Training Models

Train new adaptive gait policies with PPO

ROS2 Integration

Set up ROS2 nodes for real-time control

GUI & Joystick

Use the PyQt5 GUI with gamepad control

Baseline vs Adaptive

Compare performance metrics and results

Key Features

What makes this quadruped robot unique

Parallel SCARA Mechanism

Custom 5-bar linkage design with 3DOF per leg. OpenSCAD CAD models ready for 3D printing.

Adaptive RL Control

PPO policy learns gait parameter adaptation and residual corrections for rough terrain navigation.

MuJoCo Physics

High-fidelity physics simulation with configurable terrains (flat and heightfield rough terrain).

ROS2 Integration

Real-time control with ROS2 Jazzy. Camera streaming, telemetry, and command topics.

Ready to train your own adaptive controller?

Follow the training guide to learn how to customize gait parameters, configure PPO hyperparameters, and evaluate your trained policies on custom terrains.

Start Training