Skip to main content

Documentation Index

Fetch the complete documentation index at: https://mintlify.com/MRRP-lab/arm-demos/llms.txt

Use this file to discover all available pages before exploring further.

The colman_perception package provides the sensor and perception layer for the Colman arm. It contains nodes that drive the OAK-D Pro depth camera over USB using the DepthAI SDK, run YOLO11 segmentation on the RGB stream, and prepare stereo camera info messages for the downstream depth pipeline. AprilTag configuration is also housed here and consumed by colman_bringup.

Nodes

CameraPublisher

Drives the OAK-D Pro camera directly via depthai and publishes the undistorted RGB stream along with calibrated CameraInfo to ROS 2. Camera intrinsics are read live from the device calibration so no separate calibration file is required.
PropertyValue
Node namecamera_publisher
Resolution1280 × 720
Camera FPS15 Hz (set on the DepthAI pipeline)
Timer rate30 Hz (publishes only when a new frame is available)
Frame IDoakd_pro_camera_rgb_optical_frame_wrist
Distortion modelplumb_bob with zero distortion coefficients (undistortion applied in hardware)
Published topics:
TopicTypeDescription
/oakd_pro/rgb/imagesensor_msgs/ImageUndistorted BGR8 RGB frames
/oakd_pro/rgb/camera_infosensor_msgs/CameraInfoIntrinsic matrix from device calibration
The node uses qos_profile_sensor_data (best-effort, volatile) on both publishers so downstream nodes like apriltag_ros and the YOLO detector receive images without delivery guarantees blocking the pipeline.
ros2 run colman_perception camera_publisher
CameraPublisher requires the OAK-D Pro to be connected via USB before launch. The node closes the DepthAI device cleanly on KeyboardInterrupt or node destruction to avoid leaving the USB device in a locked state.

DetectionNode

Runs YOLO11 medium segmentation (yolo11m-seg.pt) on the OAK-D Pro RGB stream and publishes annotated images for visualisation in RViz2 or rqt_image_view.
PropertyValue
Node nameyolo_node
Modelyolo11m-seg.pt (instance segmentation)
Inference rate20 Hz timer
Subscribed topics:
TopicTypeDescription
/oakd_pro/rgb/imagesensor_msgs/ImageIncoming RGB frames from CameraPublisher
Published topics:
TopicTypeDescription
/yolo/annotatedsensor_msgs/ImageBGR8 image with bounding boxes and masks overlaid
The node caches the latest frame in self.latest_image on subscription and runs inference inside a 20 Hz timer callback, decoupling the detection rate from the camera publish rate. If no image has been received yet, a warning is throttled to once every 5 seconds.
ros2 run colman_perception detection
# or via launch:
ros2 launch colman_bringup yolo.launch.py

CameraInfoRepublisher

Adapts raw CameraInfo messages from the OAK-D Pro stereo cameras for use with stereo_image_proc. The raw messages lack a stereo baseline in their projection matrix; this node injects P[3] = -fx * baseline_m for the right camera and zeroes P[3] for the left camera, which is the ROS stereo convention.
PropertyValue
Node namecamera_info_republisher
Parameters:
baseline_m
float
default:"0.075"
Stereo baseline in metres. Matches the physical distance between the OAK-D Pro left and right sensors.
left_camera_info_in
string
default:"/oakd_pro/left/camera_info"
Input topic for left camera info.
right_camera_info_in
string
default:"/oakd_pro/right/camera_info"
Input topic for right camera info.
left_camera_info_out
string
default:"/oakd_pro/left/camera_info_calibrated"
Output topic for corrected left camera info.
right_camera_info_out
string
default:"/oakd_pro/right/camera_info_calibrated"
Output topic for corrected right camera info.
left_frame_id
string
default:""
If non-empty, overrides the frame_id in the output left camera info header.
right_frame_id
string
default:""
If non-empty, overrides the frame_id in the output right camera info header.
The node also normalises the ROI fields: if roi.width or roi.height is zero (as is common when no ROI is set), it fills them with the image width and height, and resets offsets to zero. depth.launch.py configures this node to remap both output topics to /oakd_pro/left/camera_info and /oakd_pro/right/camera_info so image_proc and stereo_image_proc receive them transparently.
ros2 launch colman_bringup depth.launch.py

AprilTag configuration

The file config/apriltag.yaml is loaded by apriltag.launch.py and configures the apriltag_ros node for the Colman demo workspace.
apriltag:
  ros__parameters:
    image_transport: raw
    family: 36h11
    size: 0.10           # 10 cm physical tag size
    max_hamming: 0       # no error correction bits — high precision
    detector:
      threads: 2
      decimate: 2.0      # 2× image decimation for speed
      blur: 0.0
      refine: true
      sharpening: 0.25
      debug: false
    pose_estimation_method: "pnp"
    tag:
      ids:    [1]
      frames: ["tag_1"]
      sizes:  [0.10]
ParameterValueNotes
family36h11Standard AprilTag family used on the Colman workspace tags
size0.10 mPhysical side length of the printed tag
max_hamming0Rejects any detection with bit errors; maximises pose accuracy
decimate2.0Halves image resolution before detection to reduce CPU load
pose_estimation_methodpnpOpenCV PnP solver for 6-DOF pose estimation
Tag ID 1 frametag_1TF frame name published for the single workspace tag
If you add more tags to the workspace, extend the ids, frames, and sizes lists. Each tag can have its own size, which is useful if you use different-scale tags for different objects.

Topic summary

TopicDirectionNodeType
/oakd_pro/rgb/imagePublishedCameraPublishersensor_msgs/Image
/oakd_pro/rgb/camera_infoPublishedCameraPublishersensor_msgs/CameraInfo
/oakd_pro/rgb/imageSubscribedDetectionNodesensor_msgs/Image
/yolo/annotatedPublishedDetectionNodesensor_msgs/Image
/oakd_pro/left/camera_infoSubscribedCameraInfoRepublishersensor_msgs/CameraInfo
/oakd_pro/right/camera_infoSubscribedCameraInfoRepublishersensor_msgs/CameraInfo

Build docs developers (and LLMs) love