Predict mode loads a pretrained SAC checkpoint together with its paired VecNormalize statistics file and runs the agent deterministically for up to 1,000,000 steps without performing any gradient updates. The VecNormalize wrapper switches to evaluation mode automatically (env.training = False, env.norm_reward = False), so the running statistics are frozen at the values saved during training. Episode rewards and counts are printed to stdout as each episode completes, making it straightforward to benchmark a policy across worlds.
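To make the effect of frozen statistics concrete, here is a minimal, self-contained sketch of a running normalizer with a training switch. It is an illustration only, not the VecNormalize implementation: the class name RunningNormalizer, the scalar observation, and the epsilon default are assumptions made for the example.

```python
import math


class RunningNormalizer:
    """Toy scalar normalizer (hypothetical) with a training/evaluation switch."""

    def __init__(self, epsilon=1e-8):
        self.mean = 0.0
        self.var = 1.0
        self.count = 0
        self.epsilon = epsilon
        self.training = True  # plays the role of the env.training flag

    def update(self, x):
        # Incremental (Welford-style) update of mean and population variance.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.var += (delta * (x - self.mean) - self.var) / self.count

    def normalize(self, x):
        # Statistics only move while training is enabled.
        if self.training:
            self.update(x)
        return (x - self.mean) / math.sqrt(self.var + self.epsilon)


norm = RunningNormalizer()
for sample in [1.0, 2.0, 3.0]:
    norm.normalize(sample)      # training phase: statistics follow the data

norm.training = False           # evaluation phase: statistics are frozen
frozen = (norm.mean, norm.var, norm.count)
norm.normalize(100.0)           # an outlier no longer shifts mean or variance
```

With training disabled, every observation is scaled by the same saved mean and variance, which is what keeps the inputs seen at inference consistent with those seen by the policy during training.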
Prerequisites
Before launching the inference script, two things must be true:
Gazebo Simulation Running
The correct Gazebo world must be open and the rover model spawned. Launch with:
Position Bridge Active
The ign_ros2_Nav2_topics.py bridge must be publishing ground-truth pose on /rover/pose_array. See the section below for the exact command.
Running the Position Bridge
The position bridge translates Ignition Gazebo pose data into a ROS 2 PoseArray message that the environment subscribes to. Run this in a dedicated terminal before starting inference:
Replace inspect with the name of whichever world you have launched in Gazebo (maze, island, etc.). The second argument, rover_zero4wd, is the model name as registered in the Gazebo entity registry.
Inference Commands
Inspection World (Pretrained Checkpoint)
The repository ships a pretrained checkpoint for the inspection world under trained_agents/. Use it directly:
Maze World
Island / Moon World
--vision False selects the standard LIDAR+pose RoverEnv path in sb3_SAC.py. To run inference with the fused camera observation produced by the Active Vision pipeline, pass --vision True and ensure inference.py is writing to shared memory first.
Predict Mode Behavior
The predict loop in sb3_SAC.py runs for a fixed budget of 1,000,000 steps and resets automatically at each episode boundary:
| Property | Value |
|---|---|
| Action selection | deterministic=True: the policy's mean action is used, with no sampling noise |
| Max steps | 1,000,000 (regardless of episode length) |
| Episode reset trigger | done=True from any environment termination condition |
| Reward display | Cumulative episode reward printed after each episode |
| Normalization updates | Disabled (env.training = False, env.norm_reward = False) |
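The behavior in the table can be sketched as a plain loop with a fixed step budget. This is a simplified illustration rather than the code in sb3_SAC.py: ToyEnv, run_predict, and the three-value step() return are hypothetical stand-ins, and the policy is any callable that maps an observation to an action.

```python
class ToyEnv:
    """Hypothetical stand-in environment: each episode ends after a fixed number of steps."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        obs = float(self.t)
        reward = 1.0
        done = self.t >= self.episode_length
        return obs, reward, done


def run_predict(env, policy, max_steps):
    """Step the environment for max_steps, resetting at each episode boundary."""
    episode_rewards = []
    episode_reward = 0.0
    obs = env.reset()
    for _ in range(max_steps):
        action = policy(obs)              # deterministic action, no sampling
        obs, reward, done = env.step(action)
        episode_reward += reward
        if done:
            episode_rewards.append(episode_reward)
            print(f"Episode {len(episode_rewards)} reward: {episode_reward:.2f}")
            episode_reward = 0.0
            obs = env.reset()
    return episode_rewards


# Two full episodes fit into a 12-step budget; the unfinished third is not reported.
rewards = run_predict(ToyEnv(episode_length=5), lambda obs: 0.0, max_steps=12)
```

The step budget, not the episode count, bounds the run, so an episode still in progress when the budget runs out is never reported. Stable-Baselines3 vectorized environments reset themselves when an episode ends, so the explicit reset() call here is only needed for the toy environment.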
env.training = False and env.norm_reward = False are set automatically by sb3_SAC.py when --mode predict is used together with --load True. You do not need to set these flags manually.
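One way such flags could be parsed and mapped to the evaluation settings is sketched below. The actual argument parser in sb3_SAC.py is not shown on this page, so str2bool, build_parser, evaluation_flags, the train choice for --mode, and all defaults are assumptions made for illustration. The explicit string-to-boolean converter matters because argparse with type=bool would treat the string "False" as true.

```python
import argparse


def str2bool(value):
    """Parse 'True'/'False' style strings; bool('False') would be truthy."""
    text = value.strip().lower()
    if text in ("true", "1", "yes"):
        return True
    if text in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    # Flag names follow this page; choices and defaults are assumed.
    parser = argparse.ArgumentParser(description="Hypothetical flag parser sketch")
    parser.add_argument("--mode", choices=["train", "predict"], default="train")
    parser.add_argument("--load", type=str2bool, default=False)
    parser.add_argument("--vision", type=str2bool, default=False)
    return parser


def evaluation_flags(args):
    """Return the normalization settings implied by the flags, or None."""
    if args.mode == "predict" and args.load:
        return {"training": False, "norm_reward": False}
    return None


args = build_parser().parse_args(
    ["--mode", "predict", "--load", "True", "--vision", "False"]
)
flags = evaluation_flags(args)
```

Here flags holds the two settings that would be applied to the normalization wrapper, and args.vision is a real boolean False rather than a non-empty string.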