Documentation Index
Fetch the complete documentation index at: https://mintlify.com/jackvice/RoboTerrain/llms.txt
Use this file to discover all available pages before exploring further.
RoverEnv is the foundational Gymnasium-compatible environment for training a rover agent on point-navigation tasks in RoboTerrain simulations. It bridges ROS2 sensor topics — LIDAR scans, IMU orientation, wheel odometry, and ground-truth pose — into a structured observation dict that reinforcement learning algorithms such as Stable-Baselines3 can consume directly. The environment supports three distinct simulated worlds (inspect, moon, and maze), each with its own spawn bounds and out-of-bounds thresholds. Source: ros2_ws/src/sb3/environments/rover_env.py.
Constructor
Dimensions of the internal image buffer allocated for the grayscale camera channel. This buffer is reserved for vision extensions; the default observation space does not include the raw camera image.
Maximum number of environment steps per episode. When
self._step >= length the done flag is set to True and the episode terminates normally (truncated is always False).ROS2 topic name for the
sensor_msgs/LaserScan LIDAR messages. The subscriber is created in __init__ and data is received asynchronously via lidar_callback.ROS2 topic name for the
sensor_msgs/Imu messages. Roll, pitch, and yaw are extracted from the quaternion and stored for inclusion in the imu observation key.ROS2 topic name for the
geometry_msgs/Twist velocity command publisher. Each step() call publishes a Twist message with the computed linear and angular velocities to this topic.ROS2 topic name for the raw camera image stream. Used to pre-allocate the image buffer; not currently included in the default Dict observation space.
Name of the simulated world. Accepted values are
'inspect', 'moon', and 'maze'. The world name controls spawn-position ranges, goal-position ranges, and out-of-bounds thresholds. It is also embedded in the Ignition Gazebo service path (/world/<world_n>/set_pose) used for pose resets.Number of seconds to wait for the first LIDAR scan before concluding that no physical robot is connected. If the timeout is reached without receiving a scan,
_check_robot_connection returns False and the environment logs a warning and continues in simulation-only mode.Number of LIDAR points in the downsampled scan. The raw scan is divided into
lidar_points equal segments and the minimum range in each segment is kept, so the agent sees a compact worst-case radial profile.Maximum valid LIDAR range in metres. Infinity and NaN values in the raw scan are clipped to this value before downsampling. Also used as the
high bound of the lidar observation sub-space.Observation Space
RoverEnv exposes a gymnasium.spaces.Dict observation containing five named sub-spaces:
| Key | Shape | Description |
|---|---|---|
lidar | (32,) | Downsampled LIDAR scan; each element is the minimum range (m) within a sweep segment. Clipped to [0, max_lidar_range]. |
pose | (3,) | Ground-truth rover position [x, y, z] in metres, sourced from /rover/pose_array. |
imu | (3,) | Orientation [pitch, roll, yaw] in radians derived from the IMU quaternion via transforms3d.euler.quat2euler. |
target | (2,) | Navigation target info [distance_m, relative_angle_rad]. Distance is Euclidean to the current goal; angle is relative to the rover’s current yaw. |
velocities | (2,) | Current [linear_velocity, angular_velocity] from wheel odometry (/odometry/wheels). |
Action Space
| Index | Meaning | Range |
|---|---|---|
action[0] | Target linear speed (m/s) | [-1.0, 1.0] |
action[1] | Desired heading offset (rad) | [-π, π] |
step(), the desired heading offset is added to the current yaw to produce an absolute target heading, which is then tracked by the on-board PID heading controller. The resulting angular velocity command is clipped to ±7.0 rad/s before being published on cmd_vel_topic.
Methods
step(action) -> tuple
- Checks for flip, stuck-in-collision, out-of-bounds, and stuck conditions — terminating early with the appropriate penalty if any are triggered.
- Waits (spin-once loop) until a fresh LIDAR scan has been received.
- Converts
actioninto aTwistcommand (linear speed + PID-computed angular velocity) and publishes it. - Calls
task_reward()to compute the scalar reward. - Increments
_step; setsdone = Truewhen_step >= length. - Returns
(observation, reward, done, False, info).
info dict keys: steps, total_steps, reward.
reset(seed=None, options=None) -> tuple
- Publishes a zero-velocity
Twistto stop the rover. - Samples a random
(x, y)spawn position withinrand_x_range/rand_y_rangefor the current world. - Samples a random yaw in
[-π, π]and converts to a quaternion. - Calls the Ignition Gazebo
ign serviceCLI to teleport the rover model. - Samples a new random goal position within
rand_goal_x_range/rand_goal_y_range. - Spins the ROS2 node for 100 × 0.1 s to let pose settle, then sleeps 1 s.
- Returns the initial observation and an empty info dict.
Optional RNG seed forwarded to the parent
gym.Env.reset().Currently unused; reserved for future configuration overrides.
get_observation() -> dict
Returns the current observation dictionary without advancing the simulation:
get_target_info() -> np.ndarray
[distance, relative_angle] to the current goal:
- distance — Euclidean distance in metres between the rover’s current
(x, y)and the target(target_positions_x, target_positions_y). - relative_angle — Angle in radians from the rover’s current heading to the bearing towards the target, normalised to
[-π, π]viaatan2.
task_reward() -> float
Computes the scalar reward for the current step. See the Reward Function section for the full breakdown.
is_robot_flipped() -> Union[str, bool]
|roll| or |pitch| exceeds the flip threshold of 1.48 rad (~85°), otherwise returns False. The direction string indicates which axis and which direction the flip occurred in.
heading_controller(desired_heading, current_heading) -> float
desired_heading. The heading error is computed as atan2(sin(Δ), cos(Δ)) to handle wrap-around correctly.
| PID Parameter | Value |
|---|---|
Kp | 2.0 |
Ki | 0.0 |
Kd | 0.1 |
| Max output | ±7.0 rad/s |
Absolute target heading in radians. Computed inside
step() as current_yaw + action[1].Current rover yaw in radians as reported by the IMU.
too_far_away() -> bool
Returns True when the rover’s (x, y) position falls outside the world-specific bounding box. When True, step() terminates the episode immediately with the too_far_away_penilty reward (-10.0).
render()
No-op. The environment does not produce a display output; all visualisation is handled externally through Gazebo and RViz.
close()
self.node.destroy_node()) and calls rclpy.shutdown(). Always call this when training is finished to avoid dangling ROS2 handles.
pose_array_callback(msg)
Subscriber callback for /rover/pose_array (geometry_msgs/PoseArray). Extracts the first pose from the array, updates self.current_pose, and stores the position as a (3,) float32 NumPy array in self.rover_position.
lidar_callback(msg)
Subscriber callback for the sensor_msgs/LaserScan topic. Converts the raw range array to float32, replaces inf and NaN values with max_lidar_range, clips all values to [0, max_lidar_range], downsamples to lidar_points by taking the minimum in each segment, and sets self._received_scan = True to signal step() that fresh data is available.
imu_callback(msg)
Subscriber callback for sensor_msgs/Imu. Normalises the quaternion, converts it to Euler angles using transforms3d.euler.quat2euler with axes='sxyz', and stores pitch, roll, and yaw in self.current_pitch, self.current_roll, and self.current_yaw.
odom_callback(msg)
Subscriber callback for /odometry/wheels (nav_msgs/Odometry). Stores msg.twist.twist.linear.x and msg.twist.twist.angular.z in self.current_linear_velocity and self.current_angular_velocity respectively.
Robot Connection Check
At the end of__init__, _check_robot_connection(timeout=connection_check_timeout) is called. It spins the ROS2 node in a loop, waiting for the first LIDAR scan to arrive. If self._received_scan becomes True before the timeout elapses, it returns True. If the timeout (default 30 s) is exceeded without receiving any scan data, it returns False and the environment logs a warning and continues in simulation-only mode.
Stuck Detection
RoverEnv maintains a position_history list that records the rover’s (x, y) coordinates at every step. The list is capped at stuck_window = 600 entries. Once the list is full, step() checks whether the total displacement between the oldest and newest recorded position is less than stuck_threshold = 0.001 m. If so, the episode terminates immediately with stuck_penilty = -25.0.
Reward Function
task_reward() is called once per step() and returns a single scalar. The components are evaluated in priority order:
Goal Reward
success_distance). After granting the reward the target is relocated to a new random position within the world’s goal bounds and the episode continues.
Distance Reward
|distance_delta| > 0.001 and the heading alignment reward is positive (i.e., the rover is broadly facing the goal). If either condition fails a small idle penalty of −0.03 is applied instead.
Velocity Bonus
Turn Penalty
Collision Penalty
collision_threshold). The episode does not terminate on a single collision; persistent collision over collision_window steps triggers the general penalty instead.
Stuck Penalty
stuck_window). The episode ends immediately.
General Penalty
collision_window steps.
Too-Far Penalty
too_far_away() is True. The episode ends immediately.
World-Specific Parameters
Each world name sets a distinct set of geographic boundaries used for random spawn and goal placement, as well as out-of-bounds detection.| Parameter | inspect | moon | maze (default) |
|---|---|---|---|
rand_goal_x_range | (-27, -19) | (-4, 4) | (-4, 4) |
rand_goal_y_range | (-27, -19) | (-4, 4) | (-4, 4) |
rand_x_range | (-27, -19) | (-4, 4) | (-4, 4) |
rand_y_range | (-27, -19) | (-4, 4) | (-4, 4) |
too_far_away_low_x | -29 | -20 | -30 |
too_far_away_high_x | -13 | 10 | 30 |
too_far_away_low_y | -29 | -20 | -30 |
too_far_away_high_y | -17 | 20 | 30 |
too_far_away_penilty | -10.0 | -10.0 | -10.0 |
The
inspect world uses elevated spawn heights. When both x < -24.5 and y < -24.5, the rover is spawned at z = 6.5 instead of the standard z = 5.5 to clear terrain features in the corner of the inspection zone.