Predict mode loads a pretrained SAC checkpoint together with its paired VecNormalize statistics file and runs the agent deterministically for up to 1,000,000 steps without performing any gradient updates. The script puts the VecNormalize wrapper into evaluation mode (env.training = False, env.norm_reward = False), so the running statistics stay frozen at the values saved during training and rewards are reported unnormalized. The cumulative reward and episode index are printed to stdout as each episode completes, which makes it straightforward to benchmark a policy across worlds.
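To make the idea of frozen statistics concrete, here is a minimal pure-Python sketch of a running normalizer with a training switch. The class name RunningNorm and its fields are hypothetical; the class only mimics the idea behind the env.training flag and is not the Stable-Baselines3 implementation.

```python
import math


class RunningNorm:
    """Hypothetical stand-in for a running observation normalizer."""

    def __init__(self, epsilon: float = 1e-8):
        self.mean = 0.0
        self.m2 = 0.0          # running sum of squared deviations (Welford)
        self.count = 0
        self.epsilon = epsilon
        self.training = True   # mirrors the idea of env.training

    def update(self, x: float) -> None:
        # Welford's online update of the mean and the squared deviations.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def var(self) -> float:
        # Population variance; defaults to 1.0 before any sample is seen.
        return self.m2 / self.count if self.count else 1.0

    def normalize(self, x: float) -> float:
        # Statistics only move while training is enabled.
        if self.training:
            self.update(x)
        return (x - self.mean) / math.sqrt(self.var + self.epsilon)


norm = RunningNorm()
for sample in [1.0, 2.0, 3.0, 4.0]:
    norm.normalize(sample)   # training phase: statistics are updated

norm.training = False        # predict mode: freeze the statistics
frozen = (norm.mean, norm.var, norm.count)
norm.normalize(100.0)        # an outlier no longer shifts the statistics
assert (norm.mean, norm.var, norm.count) == frozen
```

With the flag off, every observation seen at inference time is scaled by the same mean and variance the policy saw at the end of training, which is why the statistics file has to come from the same run as the checkpoint.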

Prerequisites

Before launching the inference script, two things must be true:

Gazebo Simulation Running

The correct Gazebo world must be open and the rover model spawned. Launch with:
source /opt/ros/humble/setup.bash
source ~/src/RoboTerrain/ros2_ws/install/setup.bash
ros2 launch roverrobotics_gazebo 4wd_rover_gazebo.launch.py

Position Bridge Active

The ign_ros2_Nav2_topics.py bridge must be publishing ground-truth pose on /rover/pose_array. See the section below for the exact command.

Running the Position Bridge

The position bridge translates Ignition Gazebo pose data into a ROS 2 PoseArray message that the environment subscribes to. Run this in a dedicated terminal before starting inference:
cd ros2_ws/src/pose_topic
python ign_ros2_Nav2_topics.py inspect rover_zero4wd
Replace inspect with the name of whichever world you have launched in Gazebo (maze, island, etc.). The second argument rover_zero4wd is the model name as registered in the Gazebo entity registry.
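Before starting inference it can help to confirm that the pose topic exists. The sketch below parses the text printed by ros2 topic list and looks for /rover/pose_array. The helper names topic_is_listed and bridge_is_up are hypothetical, and the sample output is made up for illustration.

```python
import subprocess


def topic_is_listed(topic_list_output: str, topic: str = "/rover/pose_array") -> bool:
    """Return True if `topic` appears as a full line in `ros2 topic list` output."""
    listed = {line.strip() for line in topic_list_output.splitlines()}
    return topic in listed


def bridge_is_up(timeout_s: float = 5.0) -> bool:
    """Run `ros2 topic list` (requires a sourced ROS 2 shell) and check for the pose topic."""
    result = subprocess.run(
        ["ros2", "topic", "list"],
        capture_output=True, text=True, timeout=timeout_s, check=True,
    )
    return topic_is_listed(result.stdout)


# Illustrative output only; real topic names depend on the launch file.
sample = "/clock\n/rover/pose_array\n/scan\n"
print(topic_is_listed(sample))            # True
print(topic_is_listed("/clock\n/scan"))   # False
```

A listed topic only shows that some node has created it, not that messages are flowing. Running ros2 topic echo /rover/pose_array in a spare terminal is a more direct check that the bridge is publishing.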

Inference Commands

Inspection World (Pretrained Checkpoint)

The repository ships a pretrained checkpoint for the inspection world under trained_agents/. Use it directly:
cd ros2_ws/src/sb3/

python sb3_SAC.py \
  --mode predict \
  --load True \
  --world inspect \
  --vision False \
  --checkpoint_name trained_agents/sac_inspect.zip \
  --normalize_stats trained_agents/sac_inspect_normalize.pkl

Maze World

python sb3_SAC.py \
  --mode predict \
  --load True \
  --world maze \
  --vision False \
  --checkpoint_name checkpoints/sac_maze_20250126_1430_500000_steps.zip \
  --normalize_stats checkpoints/sac_maze_20250126_1430_500000_steps_normalize.pkl

Island / Moon World

python sb3_SAC.py \
  --mode predict \
  --load True \
  --world island \
  --vision False \
  --checkpoint_name checkpoints/sac_island_20250126_1430_500000_steps.zip \
  --normalize_stats checkpoints/sac_island_20250126_1430_500000_steps_normalize.pkl
--vision False selects the standard LIDAR+pose RoverEnv path in sb3_SAC.py. To run inference with the fused camera observation produced by the Active Vision pipeline, pass --vision True and ensure inference.py is writing to shared memory first.
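The three commands above differ only in the world name and the checkpoint path. As a sketch, the helper below assembles the same argument list in Python. It assumes the statistics file sits next to the checkpoint with _normalize.pkl in place of .zip, as in the examples above; the function name build_predict_command is hypothetical.

```python
from pathlib import PurePosixPath


def build_predict_command(world: str, checkpoint: str, vision: bool = False) -> list[str]:
    """Assemble the sb3_SAC.py predict-mode argument list for one world."""
    ckpt = PurePosixPath(checkpoint)
    if ckpt.suffix != ".zip":
        raise ValueError(f"expected a .zip checkpoint, got {checkpoint!r}")
    # Assumption: the stats file is '<checkpoint stem>_normalize.pkl' in the same directory.
    stats = ckpt.with_name(f"{ckpt.stem}_normalize.pkl")
    return [
        "python", "sb3_SAC.py",
        "--mode", "predict",
        "--load", "True",
        "--world", world,
        "--vision", str(vision),
        "--checkpoint_name", str(ckpt),
        "--normalize_stats", str(stats),
    ]


cmd = build_predict_command("inspect", "trained_agents/sac_inspect.zip")
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run with cwd set to ros2_ws/src/sb3, which keeps the checkpoint and its statistics file paired when benchmarking several worlds in a row.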

Predict Mode Behavior

The predict loop in sb3_SAC.py runs for a fixed budget of 1,000,000 steps and resets automatically at each episode boundary:
obs = env.reset()
episode_rewards = 0
num_episodes = 0

for _ in range(1_000_000):
    action, _states = model.predict(obs, deterministic=True)
    obs, rewards, done, info = env.step(action)
    episode_rewards += rewards[0]

    if done:
        print(f"Episode {num_episodes} finished with reward {episode_rewards}")
        obs = env.reset()
        episode_rewards = 0
        num_episodes += 1
Key characteristics:
| Property | Value |
| --- | --- |
| Action selection | deterministic=True: the policy's mean action is used, with no sampling noise |
| Max steps | 1,000,000 (regardless of episode length) |
| Episode reset trigger | done=True from any environment termination condition |
| Reward display | Cumulative episode reward printed after each episode |
| Normalization updates | Disabled (env.training = False, env.norm_reward = False) |
env.training = False and env.norm_reward = False are set automatically by sb3_SAC.py when --mode predict is used together with --load True. You do not need to set these flags manually.
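The episode bookkeeping in the loop above can be exercised without Gazebo. The sketch below swaps in a stub vectorized environment and a stub policy, both hypothetical (StubVecEnv and StubModel), that only imitate the call shapes used by the loop: step returns one-element batches, and predict returns an action plus a state placeholder.

```python
class StubVecEnv:
    """Hypothetical single-env stand-in: every episode lasts `episode_len` steps."""

    def __init__(self, episode_len: int = 3):
        self.episode_len = episode_len
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0]

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_len
        # One-element lists mimic the batch dimension of a vectorized env.
        return [float(self.t)], [1.0], [done], [{}]


class StubModel:
    """Hypothetical policy that always returns the same action."""

    def predict(self, obs, deterministic=True):
        return [0.0], None


def run_predict(model, env, max_steps: int):
    """Same bookkeeping as the predict loop, returning per-episode rewards."""
    obs = env.reset()
    episode_reward, finished = 0.0, []
    for _ in range(max_steps):
        action, _states = model.predict(obs, deterministic=True)
        obs, rewards, dones, infos = env.step(action)
        episode_reward += rewards[0]
        if dones[0]:
            print(f"Episode {len(finished)} finished with reward {episode_reward}")
            finished.append(episode_reward)
            obs = env.reset()
            episode_reward = 0.0
    return finished


totals = run_predict(StubModel(), StubVecEnv(episode_len=3), max_steps=10)
```

With a budget of 10 steps and 3-step episodes, three episodes are reported and the unfinished fourth is never printed. The same applies to the real loop: whatever episode is in progress when the step budget runs out does not appear in the output.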

Switching Worlds for Inference

Each world requires its own matching checkpoint (the position ranges and terrain dynamics differ significantly between worlds). To switch:
1. Close the current Gazebo session

Stop the running simulation (Ctrl+C on the launch terminal).
2. Edit the launch file to select the new world

nano ros2_ws/src/roverrobotics_ros2/roverrobotics_gazebo/launch/4wd_rover_gazebo.launch.py
# Uncomment the desired world line in DeclareLaunchArgument() around line 24
3. Rebuild and relaunch

cd ros2_ws/
colcon build
ros2 launch roverrobotics_gazebo 4wd_rover_gazebo.launch.py
4. Restart the position bridge with the new world name

python ign_ros2_Nav2_topics.py maze rover_zero4wd
5. Run inference with the matching checkpoint

python sb3_SAC.py \
  --mode predict \
  --load True \
  --world maze \
  --checkpoint_name checkpoints/sac_maze_<timestamp>_steps.zip \
  --normalize_stats checkpoints/sac_maze_<timestamp>_steps_normalize.pkl
Using a checkpoint trained on one world for inference in a different world will generally fail, because the position ranges, terrain, and obstacle distributions are incompatible. Make sure the --world flag, the world running in Gazebo, and the world the checkpoint was trained in are all the same.
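A small guard can catch a mismatch before the simulator is involved. This sketch assumes checkpoints follow the sac_<world> naming seen in the examples above; the function name checkpoint_matches_world is hypothetical, and a filename check reflects a naming convention only, not what the checkpoint was actually trained on.

```python
from pathlib import PurePosixPath


def checkpoint_matches_world(checkpoint: str, world: str) -> bool:
    """Return True if the checkpoint filename starts with 'sac_<world>'.

    Assumption: files are named like 'sac_maze_..._steps.zip' or 'sac_inspect.zip'.
    """
    stem = PurePosixPath(checkpoint).stem
    prefix = f"sac_{world}"
    # Require the world name to end at '_' or at the end of the stem,
    # so 'sac_mazelike.zip' does not match the world 'maze'.
    return stem == prefix or stem.startswith(prefix + "_")


print(checkpoint_matches_world("trained_agents/sac_inspect.zip", "inspect"))  # True
print(checkpoint_matches_world("trained_agents/sac_inspect.zip", "maze"))     # False
```

Calling this just before launching sb3_SAC.py and aborting on False is cheaper than discovering the mismatch from a rover that drives into the first obstacle.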
