Skip to main content

Documentation Index

Fetch the complete documentation index at: https://mintlify.com/jackvice/RoboTerrain/llms.txt

Use this file to discover all available pages before exploring further.

RoboTerrain ships three Gymnasium-compatible environments that share a common ROS 2 / Gazebo Fortress backbone but differ in observation modality and action structure. RoverEnv is the baseline: it feeds LIDAR, pose, IMU, target, and velocity data to a continuous-action SAC agent and is the fastest to train. RoverEnvFused replaces raw sensor arrays with a 3-channel fused image (grayscale + monocular depth + spatiotemporal attention heatmap) produced by the Active Vision pipeline, making it the right choice when pedestrian-aware perception matters. RoverEnvFused (Discrete) keeps the fused image observation but collapses the action space to a small discrete set, which suits algorithms like DQN or the DreamerV3 gaze-control head. Choose RoverEnv for rapid prototyping and LIDAR-first navigation; choose the fused variants when integrating camera-based social awareness.

RoverEnv

RoverEnv is defined in ros2_ws/src/sb3/environments/rover_env.py. It subscribes to LIDAR, IMU, wheel odometry, and a pose-array topic, and publishes Twist commands at each step.

Constructor Signature

RoverEnv(
    size=(64, 64),
    length=2000,
    scan_topic='/scan',
    imu_topic='/imu/data',
    cmd_vel_topic='/cmd_vel',
    camera_topic='/camera/image_raw',
    world_n='inspect',
    connection_check_timeout=30,
    lidar_points=32,
    max_lidar_range=12.0,
)
ParameterTypeDefaultDescription
sizetuple(64, 64)Reserved image buffer size (unused in base env)
lengthint2000Maximum steps per episode
scan_topicstr'/scan'ROS 2 LIDAR topic
imu_topicstr'/imu/data'ROS 2 IMU topic
cmd_vel_topicstr'/cmd_vel'Velocity command topic
camera_topicstr'/camera/image_raw'ROS 2 camera topic (image buffer, unused in base env)
world_nstr'inspect'Active Gazebo world — 'inspect', 'moon' (island), or 'maze'
connection_check_timeoutint30Seconds to wait for an initial LIDAR scan before flagging simulation mode
lidar_pointsint32Number of downsampled LIDAR points
max_lidar_rangefloat12.0Maximum LIDAR range in metres

Observation Space

The observation is a gymnasium.spaces.Dict with five keys:
observation_space = spaces.Dict({
    'lidar': spaces.Box(
        low=0,
        high=12.0,           # max_lidar_range
        shape=(32,),         # lidar_points
        dtype=np.float32
    ),
    'pose': spaces.Box(
        low=np.array([-30.0, -30.0, -10.0]),
        high=np.array([ 30.0,  30.0,  10.0]),
        dtype=np.float32     # shape (3,): x, y, z
    ),
    'imu': spaces.Box(
        low=np.array([-np.pi, -np.pi, -np.pi]),
        high=np.array([ np.pi,  np.pi,  np.pi]),
        dtype=np.float32     # shape (3,): pitch, roll, yaw
    ),
    'target': spaces.Box(
        low=np.array([0,      -np.pi]),
        high=np.array([100,    np.pi]),
        shape=(2,),
        dtype=np.float32     # [distance_m, relative_angle_rad]
    ),
    'velocities': spaces.Box(
        low=np.array([-10.0, -10.0]),
        high=np.array([ 10.0,  10.0]),
        shape=(2,),
        dtype=np.float32     # [linear_vel, angular_vel]
    ),
})
KeyShapeRangeDescription
lidar(32,)[0, 12.0]Downsampled min-segment LIDAR ranges (m)
pose(3,)[-30, 30] x/y, [-10, 10] zGround-truth position from pose-array bridge
imu(3,)[-π, π]Pitch, roll, yaw from IMU quaternion
target(2,)distance [0, 100], angle [-π, π]Distance (m) and relative bearing to goal
velocities(2,)[-10, 10]Wheel-odometry linear and angular velocities

Action Space

action_space = spaces.Box(
    low=np.array([-1.0, -np.pi]),
    high=np.array([ 1.0,  np.pi]),
    dtype=np.float32
)
# action[0] = speed        (-1.0 to 1.0)
# action[1] = desired_heading (-π to π, relative to current yaw)
The heading action is a relative offset added to the current yaw. A PID heading controller converts the resulting absolute desired heading into an angular velocity command (see PID Heading Controller below).

Supported Worlds

WorldAliasDescription
'inspect'Industrial inspection zone with solar panels and structures
'moon''island'Island/lunar terrain with open rocky ground
'maze'Walled maze for structured navigation

Safety Mechanisms

The environment enforces four termination conditions that end the episode early:
  • Flip detection — episode ends immediately if |roll| or |pitch| exceeds 1.48 rad (≈ 85°) (checked via the local constant FLIP_THRESHOLD = 1.48 inside is_robot_flipped()).
  • Stuck detection — episode ends if the total Euclidean displacement over the last stuck_window steps is less than stuck_threshold (0.001 m over 600 steps by default in rover_env.py).
  • Out-of-bounds detection — episode ends if the rover’s x or y position exits the world-specific boundary rectangle.
  • Stuck-in-collision termination — episode ends when self.collision_count > self.stuck_window (i.e., collision_count exceeds 600). This fires at the top of step() and resets collision_count to zero.

Reward and Penalties

EventValueNotes
Goal reached (distance < 0.5 m)+100New goal randomised; episode continues
Collision (LIDAR min < 0.3 m)-0.5Per-step penalty
Stuck (displacement < 0.001 m over stuck_window steps)-25.0Source field: stuck_penilty (intentional typo in source)
Flip / stuck-in-collision-20Source field: general_penility (intentional typo in source)
Out-of-bounds-10too_far_away_penilty
Distance progress rewarddistance_delta × 0.3 × 1.5Positive when moving toward goal
Linear velocity bonuslinear_vel × 0.0025Encourages forward motion
Turn penalty`-0.0015 ×angular_vel`Discourages spinning in place

RoverEnvFused

RoverEnvFused is defined in ros2_ws/src/sb3/environments/leo_rover_env_fused.py. It extends the base environment’s navigation logic with a shared-memory fused image observation produced by the Active Vision pipeline, replacing raw sensor arrays with a compact 3-channel representation.

Constructor Signature

RoverEnvFused(
    size=(96, 96),
    length=6000,
    scan_topic='/scan',
    imu_topic='/imu/data',
    cmd_vel_topic='/cmd_vel',
    world_n='inspect',
    lidar_points=32,
    max_lidar_range=12.0,
    rl_obs_name='rl_observation',
)
ParameterTypeDefaultDescription
sizetuple(96, 96)Height × width of the fused observation image
lengthint6000Maximum steps per episode (3× longer than base)
world_nstr'inspect'Gazebo world; 'island' is silently remapped to 'moon'
rl_obs_namestr'rl_observation'Shared-memory block name written by inference.py

Fused Observation Channels

The observation is a (96, 96, 3) float32 array in [0, 1] read from shared memory. Each channel carries distinct semantic content:
ChannelContentSource
0 — GrayscaleRectified fisheye view of the forward scenefisheye_ros2_mem_share.py
1 — DepthMonocular depth estimate (normalized)inference.py depth head
2 — AttentionSpatiotemporal pedestrian occupancy heatmapAttention model checkpoint
RoverEnvFused is designed exclusively for the Active Vision / DreamerV3 pipeline. inference.py must be running and writing to shared memory before the environment’s reset() is called.
The fused environment preserves the same PID heading controller, safety mechanisms, and reward function as RoverEnv. Episode length is set to 6000 steps to accommodate the slower perception stack.

RoverEnvFused (Discrete)

The discrete variant is defined in ros2_ws/src/sb3/environments/discrete_rover_env_fused.py. It inherits the RoverEnvFused class name and fused 3-channel observation but replaces the continuous Box action space with a Discrete action space. This variant is used as the gaze-control head in the DreamerV3 joint policy: the RL agent selects one of a small set of camera pan angles (−60°, −30°, 0°, +30°, +60°) simultaneously with its locomotion commands.
# Discrete action → pan angle mapping
pan_offsets = [-60, -30, 0, +30, +60]  # degrees
The discrete environment is imported by sb3_SAC.py at runtime when --vision True is passed, overriding the continuous fused environment.

World-Specific Configuration

Each world uses different spawn and boundary rectangles. These are set automatically from the world_n constructor argument.

inspect (Industrial Inspection Zone)

rand_x_range      = (-27, -19)   # rover spawn x (m)
rand_y_range      = (-27, -19)   # rover spawn y (m)
rand_goal_x_range = (-27, -19)   # goal x (m)
rand_goal_y_range = (-27, -19)   # goal y (m)

too_far_away_low_x  = -29
too_far_away_high_x = -13
too_far_away_low_y  = -29
too_far_away_high_y = -17

moon / island (Lunar Island Terrain)

rand_x_range      = (-4, 4)
rand_y_range      = (-4, 4)
rand_goal_x_range = (-4, 4)
rand_goal_y_range = (-4, 4)

too_far_away_low_x  = -20
too_far_away_high_x =  10
too_far_away_low_y  = -20
too_far_away_high_y =  20

maze (default)

rand_x_range      = (-4, 4)
rand_y_range      = (-4, 4)
rand_goal_x_range = (-4, 4)
rand_goal_y_range = (-4, 4)

too_far_away_low_x  = -30
too_far_away_high_x =  30
too_far_away_low_y  = -30
too_far_away_high_y =  30
On each reset(), both the rover spawn position and the goal position are independently sampled uniformly from the ranges above.

PID Heading Controller

Rather than sending angular velocity directly to the robot, the environment uses a PID controller to track the desired absolute heading computed from the policy’s relative heading action.
# Parameters (set in __init__)
Kp = 2.0   # Proportional gain
Ki = 0.0   # Integral gain
Kd = 0.1   # Derivative gain
max_angular_velocity = 7.0  # rad/s clamp
At each step the controller:
  1. Computes desired_heading = current_yaw + action[1]
  2. Calculates the wrapped heading error: error = atan2(sin(desired - current), cos(desired - current))
  3. Runs a PID update with elapsed wall-clock dt
  4. Clips the output to [-7.0, 7.0] rad/s before publishing to /cmd_vel
def heading_controller(self, desired_heading, current_heading):
    dt = time.time() - self.last_time
    self.last_time = time.time()

    error = math.atan2(
        math.sin(desired_heading - current_heading),
        math.cos(desired_heading - current_heading)
    )
    self.integral_error += error * dt
    derivative_error = (error - self.last_error) / dt
    self.last_error = error

    control = (
        self.Kp * error
        + self.Ki * self.integral_error
        + self.Kd * derivative_error
    )
    return np.clip(control, -self.max_angular_velocity, self.max_angular_velocity)
The derivative term (Kd=0.1) damps oscillation when the rover overshoots a sharp heading change. If you observe excessive hunting behaviour in a new world, lower Kd or reduce max_angular_velocity.

Build docs developers (and LLMs) love