This guide shows how to fine-tune GR00T for point navigation tasks using datasets generated by COMPASS, NVIDIA’s framework for learning navigation policies.

Dataset preparation

To generate and prepare the dataset, follow the COMPASS GR00T post-training guide:
1. Train residual RL specialists

Use COMPASS to train task-specific RL policies.

2. Collect distillation data

Collect specialist distillation data from the trained policies.

3. Convert to LeRobot format

Convert the HDF5 dataset to the GR00T LeRobot format:
python scripts/hdf5_to_lerobot_episodic.py

Quick start dataset

For a quick start, a pre-collected G1 robot dataset is available:
wget https://huggingface.co/nvidia/COMPASS/resolve/main/gr00t_post_training_g1.zip
unzip gr00t_post_training_g1.zip
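
If `unzip` rejects the file, the download may be an HTML page rather than the archive itself (Hugging Face `/blob/` URLs serve the web view, while `/resolve/` URLs serve the raw file). The sketch below is one way to check the file before extracting it; the helper name `looks_like_zip` is hypothetical and is not part of COMPASS or GR00T.

```python
import zipfile
from pathlib import Path


def looks_like_zip(path: str) -> bool:
    """Return True if `path` exists and is a readable zip archive."""
    p = Path(path)
    return p.is_file() and zipfile.is_zipfile(str(p))


if __name__ == "__main__":
    archive = "gr00t_post_training_g1.zip"
    if looks_like_zip(archive):
        # List the top-level entries without extracting anything.
        with zipfile.ZipFile(archive) as zf:
            top_level = sorted({name.split("/")[0] for name in zf.namelist()})
        print("Archive looks valid; top-level entries:", top_level)
    else:
        print("Not a zip archive; re-download using the raw file URL.")
```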

Modality configuration

The point navigation task uses the following modalities defined in modality.json:

Input modalities

| Modality | Key | Indices | Dimension | Description |
|---|---|---|---|---|
| Video | ego_view | - | H×W×3 | Ego-centric RGB camera image |
| State | speed | [0, 1) | 1 | Robot forward speed |
| State | route | [1, 41) | 40 | Route segments in robot frame (10 segments × 4 values) |
| State | goal_heading | [41, 43) | 2 | Goal heading direction (cos θ, sin θ) |
| Language | task_description | - | - | “Robot Navigation Task” |
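
The state indices above describe a single flat 43-element vector. As an illustration only, the sketch below splits such a vector into its named parts using the half-open ranges from the table, and recovers the heading angle from the (cos θ, sin θ) pair. The names `STATE_SLICES`, `split_state`, and `heading_angle` are hypothetical helpers, not part of the GR00T API, and the layout of the actual modality.json file is not assumed here.

```python
import math

# Half-open [start, end) index ranges copied from the input-modality table.
STATE_SLICES = {
    "speed": (0, 1),
    "route": (1, 41),
    "goal_heading": (41, 43),
}
STATE_DIM = 43


def split_state(state):
    """Split a flat 43-element state vector into named components."""
    if len(state) != STATE_DIM:
        raise ValueError(f"expected {STATE_DIM} values, got {len(state)}")
    return {key: list(state[start:end]) for key, (start, end) in STATE_SLICES.items()}


def heading_angle(goal_heading):
    """Recover theta (radians) from a (cos theta, sin theta) pair."""
    cos_t, sin_t = goal_heading
    return math.atan2(sin_t, cos_t)


if __name__ == "__main__":
    # Made-up example: speed 0.5, all-zero route, goal heading of 90 degrees.
    state = [0.5] + [0.0] * 40 + [0.0, 1.0]
    parts = split_state(state)
    print(parts["speed"], len(parts["route"]), heading_angle(parts["goal_heading"]))
```

Encoding the heading as a (cos θ, sin θ) pair avoids the discontinuity at ±π that a raw angle would have; `atan2` maps the pair back to an angle when one is needed.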

Output modalities

| Modality | Key | Indices | Dimension | Description |
|---|---|---|---|---|
| Action | vel_cmd | [0, 3) | 3 | Velocity command (vx, vy, ωz) |

The route modality encodes 10 waypoint segments, with each segment represented by 4 values: x_start, y_start, x_end, y_end in the robot’s local frame.
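
To make the route layout concrete, here is a minimal sketch that groups the 40 flat route values into 10 segments of (x_start, y_start, x_end, y_end) and sums their lengths. It assumes the segments are stored back to back in that order, as described above. `decode_route` and `route_length` are hypothetical helper names, and the example values are invented.

```python
import math

NUM_SEGMENTS = 10
VALUES_PER_SEGMENT = 4


def decode_route(route):
    """Group a flat 40-value route into 10 (x_start, y_start, x_end, y_end) tuples."""
    expected = NUM_SEGMENTS * VALUES_PER_SEGMENT
    if len(route) != expected:
        raise ValueError(f"expected {expected} values, got {len(route)}")
    return [
        tuple(route[i:i + VALUES_PER_SEGMENT])
        for i in range(0, expected, VALUES_PER_SEGMENT)
    ]


def route_length(segments):
    """Sum the Euclidean lengths of all segments (in the dataset's own units)."""
    return sum(math.hypot(x1 - x0, y1 - y0) for x0, y0, x1, y1 in segments)


if __name__ == "__main__":
    # Invented route: ten unit-length segments straight ahead along +x.
    flat = []
    for k in range(NUM_SEGMENTS):
        flat += [float(k), 0.0, float(k + 1), 0.0]
    segments = decode_route(flat)
    print(segments[0], route_length(segments))
```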

Fine-tuning

Run the fine-tuning script after updating paths:
uv run bash examples/PointNav/finetune_point_nav.sh
Before running, edit the script to set these arguments for your setup:
  • --dataset-path: Path to the converted LeRobot format dataset
  • --output-dir: Directory to save checkpoints

Evaluation

1. Launch inference server

Start the GR00T policy server:
uv run python gr00t/eval/run_gr00t_server.py \
    --model-path <path/to/checkpoint> \
    --embodiment-tag NEW_EMBODIMENT \
    --device cuda:0 \
    --host 0.0.0.0 \
    --port 8888

2. Run COMPASS evaluation

Follow the COMPASS evaluation instructions to evaluate the fine-tuned model.
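
Before starting the COMPASS evaluation, it can help to confirm that the policy server is accepting connections on the configured port. The sketch below only tests TCP reachability with the Python standard library; it makes no assumption about the server's message protocol and does not send an inference request. `is_port_open` is a hypothetical helper name.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # The server binds to 0.0.0.0, so connect via localhost (or the machine's
    # IP address when checking from another host). Port 8888 matches the
    # launch command above.
    print("server reachable:", is_port_open("127.0.0.1", 8888))
```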

Results

Task success rate on 640 randomized test cases:
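| Model | In-distribution | Out-of-distribution |
|---|---|---|
| GR00T N1.6 | 86.3% | 76.5% |
| GR00T N1.5 | 86.1% | 77.6% |
| COMPASS (baseline) | 84.7% | 45.6% |

Both GR00T models outperform the COMPASS baseline on out-of-distribution scenarios by about 31 to 32 percentage points (76.5% for N1.6 and 77.6% for N1.5 vs. 45.6%), while in-distribution success rates are within 2 points of each other. GR00T therefore loses far less performance than the baseline when moving to unseen scenarios.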
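
The comparison can be reproduced with simple arithmetic on the reported success rates. The sketch below computes, for each model, the drop from in-distribution to out-of-distribution, and each model's out-of-distribution gain over the baseline, all in percentage points. The numbers are taken from the results above; the function names are illustrative.

```python
# (in-distribution %, out-of-distribution %) as reported in the results.
RESULTS = {
    "GR00T N1.6": (86.3, 76.5),
    "GR00T N1.5": (86.1, 77.6),
    "COMPASS (baseline)": (84.7, 45.6),
}
BASELINE = "COMPASS (baseline)"


def generalization_drop(model):
    """In-distribution minus out-of-distribution success, in percentage points."""
    in_dist, out_dist = RESULTS[model]
    return round(in_dist - out_dist, 1)


def ood_gain_over_baseline(model):
    """Out-of-distribution success minus the baseline's, in percentage points."""
    return round(RESULTS[model][1] - RESULTS[BASELINE][1], 1)


if __name__ == "__main__":
    for name in RESULTS:
        print(name, generalization_drop(name), ood_gain_over_baseline(name))
    # GR00T N1.6 9.8 30.9
    # GR00T N1.5 8.5 32.0
    # COMPASS (baseline) 39.1 0.0
```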

Additional resources
