Documentation Index
Fetch the complete documentation index at: https://mintlify.com/jackvice/RoboTerrain/llms.txt
Use this file to discover all available pages before exploring further.
RoverEnvFused is a Gymnasium-compatible environment that replaces the raw LIDAR sensor stack of RoverEnv with a rich 3-channel fused image observation delivered via POSIX shared memory. Each 96×96 frame is composed of three co-registered visual channels: a grayscale camera view, a YOLO pedestrian-detection mask, and a monocular depth map. This design lets the environment consume pre-processed, inference-ready observations from a separate Active Vision pipeline process without the latency of inter-process ROS2 messaging. The continuous-action variant is implemented in ros2_ws/src/sb3/environments/leo_rover_env_fused.py.
Constructor
Height and width of the fused observation image in pixels. Internally stored as
rl_obs_height = 96 and rl_obs_width = 96. Must match the dimensions written by the Active Vision inference pipeline into shared memory.Maximum number of environment steps per episode — three times longer than
RoverEnv’s default of 2000. The larger budget accommodates the richer observation space and slower per-step wall-clock time from image-based policies.ROS2 topic name for
sensor_msgs/LaserScan messages. No LIDAR subscriber is created in RoverEnvFused; this parameter is accepted for API compatibility but has no effect on the running environment.ROS2 topic name for
sensor_msgs/Imu messages. The IMU subscriber is disabled by default (commented out in source); orientation is instead extracted from the /rover/pose_array quaternion inside pose_array_callback.ROS2 topic for publishing
geometry_msgs/Twist velocity commands. The heading PID output and the agent’s requested speed are combined into a single Twist message per step() call.Simulated world name. Accepted values:
'inspect', 'moon', 'maze'. The string 'island' is silently remapped to 'moon' at construction time via if world_n == 'island': self.world_name = 'moon'. The world name controls spawn ranges, goal ranges, and out-of-bounds thresholds identically to RoverEnv.Seconds to wait for initial sensor data when verifying the robot connection. The robot connection check is currently disabled in
RoverEnvFused; this parameter is reserved for future use. If re-enabled, _check_robot_connection returns False after the timeout elapses.Downsampled LIDAR resolution. The
lidar_data buffer is initialised but no LIDAR subscriber is created; this parameter is present for API compatibility.Maximum valid LIDAR range in metres. Used to size the
lidar_data buffer.Name of the POSIX shared memory segment written by the Active Vision inference pipeline. The environment attaches to this segment at construction time using
multiprocessing.shared_memory.SharedMemory. If the segment does not exist the process calls exit(1) with a descriptive error.Passing
world_n='island' is equivalent to world_n='moon'. The mapping if world_n == 'island': self.world_name = 'moon' is applied unconditionally in __init__, so all internal paths, pose-service endpoints, and spawn ranges use 'moon'.Observation Space
The observation is agymnasium.spaces.Dict with five keys. The primary key is fused_image, which carries the 3-channel visual observation sourced from shared memory:
Fused Image Channels
| Channel Index | Content | Source |
|---|---|---|
[:, :, 0] | Grayscale camera image | RGB camera → greyscale conversion |
[:, :, 1] | Pedestrian detection mask (heatmap) | YOLO inference output |
[:, :, 2] | Monocular depth map | Depth estimation model |
[0.0, 1.0] as float32 by the inference pipeline before being written to shared memory.
Auxiliary Observation Keys
| Key | Shape | Description |
|---|---|---|
pose | (3,) | Ground-truth rover position [x, y, z] from /rover/pose_array. |
imu | (3,) | Orientation [pitch, roll, yaw] in radians extracted from the pose quaternion. |
target | (2,) | [distance_m, relative_angle_rad] to the current navigation goal. |
velocities | (2,) | [linear_velocity, angular_velocity] from /odometry/wheels. |
Shared Memory Interface
RoverEnvFused reads its primary observation from a POSIX shared memory segment rather than a ROS2 topic. This approach eliminates serialisation overhead and allows the inference pipeline to run in a separate process (potentially on a GPU) at its own frame rate.
Memory Layout
Read Protocol
get_fused_observation() polls the shared memory in a busy-wait loop with 1 ms sleep intervals. It compares the current timestamp at offset 0 against the last-seen timestamp. When both the timestamp has changed and the valid flag equals 1, it reads the observation array, reshapes it to (96, 96, 3), and returns a copy.
Attaching at Construction
RoverEnvFused never creates or destroys the segment. The resource-tracker entry is unregistered immediately after attach to suppress Python’s shutdown warning about inherited shared memory handles.
Action Space
The action space uses a continuousBox with a minimum linear speed of −0.6 m/s (limiting the rover’s reverse capability compared to RoverEnv’s −1.0):
| Index | Meaning | Range |
|---|---|---|
action[0] | Target linear speed (m/s) | [-0.6, 1.0] |
action[1] | Desired heading offset (rad) | [-π, π] |
Methods
step(action) -> tuple
- Checks for flip, stuck-in-collision, stuck, and out-of-bounds conditions with associated terminal penalties.
- Decodes
actionintospeedandrelative_heading_command. - Computes absolute target heading (
current_yaw + relative_heading_command) and callsheading_controller()to get angular velocity. - Publishes a
Twistcommand. - Calls
get_observation(), which blocks onget_fused_observation()until a fresh shared-memory frame is available. - Calls
task_reward(observation)— passes the full observation dict so the reward can inspect the pedestrian heatmap channel. - Spins the ROS2 node once (
timeout_sec=0.001) to flush callbacks.
info dict keys: steps, total_steps, reward.
reset(seed=None, options=None) -> tuple
ign service CLI, randomises the goal position, spins the ROS2 node for 100 × 0.1 s to let the pose settle, sleeps 1 s, then returns the initial observation (which includes a fresh fused image read from shared memory).
get_observation() -> dict
Returns the current observation dictionary:
get_target_info() -> np.ndarray
Computes and returns [distance, relative_angle] to the current goal. Distance is Euclidean in metres; angle is relative to the rover’s current yaw, normalised to [-π, π].
task_reward(observation) -> float
Computes the scalar reward. Unlike RoverEnv, this variant accepts the full observation dict so it can read the pedestrian heatmap channel (observation['fused_image'][:, :, 1]). Key constants:
| Constant | Value |
|---|---|
success_distance | 0.5 m |
goal_reward | 100 |
distance_delta_scale | 0.9 |
heatmap_center_scale | 0.1 |
stuck_penalty | −25.0 |
too_far_away_penilty | −10.0 |
heatmap_center_scale × sum(center_4_columns_of_heatmap) from the reward each step, discouraging the rover from moving towards detected pedestrians.
is_robot_flipped() -> Union[str, bool]
Returns 'roll_left', 'roll_right', 'pitch_forward', or 'pitch_backward' when either absolute angle exceeds flip_threshold = 1.48 rad (~85°). Returns False otherwise.
heading_controller(desired_heading, current_heading) -> float
PID controller for yaw tracking. Identical gains to RoverEnv:
| Parameter | Value |
|---|---|
Kp | 2.0 |
Ki | 0.0 |
Kd | 0.1 |
| Max output | ±7.0 rad/s |
too_far_away() -> bool
Returns True when the rover exits the world-specific bounding box. Triggers episode termination with too_far_away_penilty = -10.0.
close()
- Calls
self.rl_obs_shm.close()to detach from the shared memory segment (does not unlink it — the inference pipeline owns the segment lifetime). - Destroys the ROS2 node and calls
rclpy.shutdown().
pose_array_callback(msg)
Subscriber callback for /rover/pose_array. Extracts the first pose, stores position in self.rover_position, and extracts roll/pitch/yaw from the quaternion via transforms3d.euler.quat2euler. This callback also serves as the orientation source since the IMU subscriber is disabled.
odom_callback(msg)
Subscriber callback for /odometry/wheels (nav_msgs/Odometry). Stores msg.twist.twist.linear.x and msg.twist.twist.angular.z in self.current_linear_velocity and self.current_angular_velocity.
Robot Connection Check
_check_robot_connection(timeout) is defined but disabled (commented out) in RoverEnvFused.__init__. When re-enabled it polls lidar_data for non-zero values and returns False after the timeout elapses without detecting sensor activity.
Stuck Detection
RoverEnvFused uses the same position_history mechanism as RoverEnv but with different thresholds:
| Parameter | Value |
|---|---|
stuck_window | 5000 steps |
stuck_threshold | 0.0001 m |
stuck_penalty | −25.0 |
position_history reaches 5000 entries, step() checks whether the displacement between the oldest and newest recorded position is less than 0.0001 m. If so, the episode terminates with the stuck penalty.
Usage with Active Vision Pipeline
The following snippet shows the recommended pattern for launchingRoverEnvFused alongside DreamerV3 or a Stable-Baselines3 policy:
When using DreamerV3, configure the encoder to consume the
Dict space directly. The fused_image key should be routed to the convolutional encoder; the scalar keys (pose, imu, target, velocities) should be processed by an MLP encoder.