This guide demonstrates how to fine-tune GR00T on your own robot data and configuration. We provide a complete example for the SO-100 robot under examples/SO100, which uses demo_data/cube_to_bowl_5 as the demo dataset.
Prepare your data
Prepare your data in GR00T-flavored LeRobot v2 format by following the data preparation guide.
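As a quick sanity check before training, you can verify that your dataset directory contains the files the loader expects. The layout below is an assumption based on the common LeRobot v2 convention plus GR00T's `modality.json`; the authoritative layout is defined in the data preparation guide, so adjust the paths to match it.

```python
from pathlib import Path

# NOTE: illustrative sketch only -- the authoritative layout is defined in the
# data preparation guide. These paths follow the usual LeRobot v2 convention
# plus GR00T's modality.json; verify them against your own dataset.
EXPECTED = [
    "meta/info.json",        # dataset-level metadata (fps, features, ...)
    "meta/episodes.jsonl",   # one record per episode
    "meta/tasks.jsonl",      # task / language annotations
    "meta/modality.json",    # GR00T-specific modality layout
    "data/chunk-000",        # parquet episodes, chunked
    "videos/chunk-000",      # mp4 videos per camera key
]

def check_dataset(root: str) -> list[str]:
    """Return the expected entries that are missing under `root`."""
    base = Path(root)
    return [rel for rel in EXPECTED if not (base / rel).exists()]

missing = check_dataset("./demo_data/cube_to_bowl_5")
print("missing:", missing)
```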
Prepare your modality configuration
Define your own modality configuration. Below is an example configuration that corresponds to the demo data:
from gr00t.configs.data.embodiment_configs import register_modality_config
from gr00t.data.types import ModalityConfig, ActionConfig, ActionRepresentation, ActionType, ActionFormat
from gr00t.data.embodiment_tags import EmbodimentTag

so100_config = {
    "video": ModalityConfig(
        delta_indices=[0],
        modality_keys=[
            "front",
            "wrist",
        ],
    ),
    "state": ModalityConfig(
        delta_indices=[0],
        modality_keys=[
            "single_arm",
            "gripper",
        ],
    ),
    "action": ModalityConfig(
        delta_indices=list(range(0, 16)),
        modality_keys=[
            "single_arm",
            "gripper",
        ],
        action_configs=[
            # single_arm
            ActionConfig(
                rep=ActionRepresentation.RELATIVE,
                type=ActionType.NON_EEF,
                format=ActionFormat.DEFAULT,
            ),
            # gripper
            ActionConfig(
                rep=ActionRepresentation.ABSOLUTE,
                type=ActionType.NON_EEF,
                format=ActionFormat.DEFAULT,
            ),
        ],
    ),
    "language": ModalityConfig(
        delta_indices=[0],
        modality_keys=["annotation.human.action.task_description"],
    ),
}

register_modality_config(so100_config, embodiment_tag=EmbodimentTag.NEW_EMBODIMENT)
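One way to read the `delta_indices` fields above: they are frame offsets relative to the current timestep *t*, so `[0]` selects only the current observation while `list(range(0, 16))` selects an action chunk covering the next 16 steps. The helper below is an illustrative sketch of that indexing (an assumption drawn from the config, not the actual gr00t implementation):

```python
# Illustrative sketch of delta_indices as offsets from the current timestep t.
# This is an assumption about the semantics, not gr00t's actual sampling code.
def select_frames(trajectory, t, delta_indices):
    """Pick trajectory frames at offsets t + d, clamping to trajectory bounds."""
    last = len(trajectory) - 1
    return [trajectory[min(max(t + d, 0), last)] for d in delta_indices]

traj = list(range(100))  # stand-in for 100 recorded frames

state = select_frames(traj, t=40, delta_indices=[0])                   # current frame only
action_chunk = select_frames(traj, t=40, delta_indices=list(range(0, 16)))

print(state)                                       # [40]
print(len(action_chunk), action_chunk[-1])         # 16 55
```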
Register your modality configuration under the EmbodimentTag.NEW_EMBODIMENT tag.
Run fine-tuning
Use gr00t/experiment/launch_finetune.py as the entry point. Ensure that the uv environment is activated before launching.
View available arguments
python gr00t/experiment/launch_finetune.py --help
Execute fine-tuning
# Configure for single GPU
export NUM_GPUS=1
CUDA_VISIBLE_DEVICES=0 python \
    gr00t/experiment/launch_finetune.py \
    --base-model-path nvidia/GR00T-N1.6-3B \
    --dataset-path ./demo_data/cube_to_bowl_5 \
    --embodiment-tag NEW_EMBODIMENT \
    --modality-config-path examples/SO100/so100_config.py \
    --num-gpus $NUM_GPUS \
    --output-dir /tmp/so100 \
    --save-total-limit 5 \
    --save-steps 2000 \
    --max-steps 2000 \
    --use-wandb \
    --global-batch-size 32 \
    --color-jitter-params brightness 0.3 contrast 0.4 saturation 0.5 hue 0.08 \
    --dataloader-num-workers 4
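Note that `--global-batch-size 32` on a single GPU implies either a per-device batch of 32 or gradient accumulation over smaller micro-batches. The arithmetic below sketches how a global batch is typically split; `accumulation_steps` is a hypothetical helper for illustration, not part of the launcher, so check `--help` for the actual batching flags.

```python
# Back-of-the-envelope: how a global batch size of 32 could be realized on
# different GPU counts. Hypothetical helper -- the launcher's actual batching
# logic may differ; see `--help` for the real flags.
def accumulation_steps(global_batch_size, num_gpus, per_device_batch_size):
    per_optimizer_step = num_gpus * per_device_batch_size
    if global_batch_size % per_optimizer_step != 0:
        raise ValueError("global batch must be divisible by num_gpus * per-device batch")
    return global_batch_size // per_optimizer_step

print(accumulation_steps(32, num_gpus=1, per_device_batch_size=8))  # 4
print(accumulation_steps(32, num_gpus=4, per_device_batch_size=8))  # 1
```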
Run open-loop evaluation
After fine-tuning, evaluate the model’s performance using open-loop evaluation:
python gr00t/eval/open_loop_eval.py \
    --dataset-path ./demo_data/cube_to_bowl_5 \
    --embodiment-tag NEW_EMBODIMENT \
    --model-path /tmp/so100/checkpoint-2000 \
    --traj-ids 0 \
    --action-horizon 16 \
    --steps 400 \
    --modality-keys single_arm gripper
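For intuition about the cost of this evaluation: if each predicted action chunk is executed to completion before the model is queried again (an assumption about open-loop rollout, not a statement about `open_loop_eval.py` internals), then `--steps 400` with `--action-horizon 16` corresponds to 400 / 16 = 25 inference calls per trajectory:

```python
import math

# Assumes each predicted chunk of `action_horizon` steps is fully executed
# before the next model query; the eval script may re-plan differently.
def num_inference_calls(steps, action_horizon):
    return math.ceil(steps / action_horizon)

print(num_inference_calls(steps=400, action_horizon=16))  # 25
```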

Key parameters

| Parameter | Description |
| --- | --- |
| `--base-model-path` | Path to the pre-trained base model checkpoint |
| `--dataset-path` | Path to your training dataset |
| `--embodiment-tag` | Tag identifying your robot embodiment |
| `--modality-config-path` | Path to a user-specified modality config (required only for the NEW_EMBODIMENT tag) |
| `--output-dir` | Directory where checkpoints are saved |
| `--save-steps` | Save a checkpoint every N steps |
| `--max-steps` | Total number of training steps |
| `--use-wandb` | Enable Weights & Biases logging for experiment tracking |
| `--global-batch-size` | Global batch size across all GPUs |
| `--color-jitter-params` | Color jitter augmentation parameters |
| `--dataloader-num-workers` | Number of data loading workers |
For best results, use the largest batch size your hardware supports and train for a few thousand steps.

Hardware performance

  • We recommend a single H100 or L40 node for optimal fine-tuning performance
  • Other hardware configurations (e.g., A6000) will also work but may require longer training time
  • Optimal batch size depends on your hardware and which model components are being tuned

Training variance

Users may observe some variance in post-training results across runs, even when using the same configuration, seed, and dropout settings. In our experiments, we have observed performance differences as large as 5-6% between runs. This variance may be attributed to non-deterministic operations in image augmentations or other stochastic components.

Dataloader optimization

When training a model, you can trade off dataloading speed against memory usage via several command-line arguments:
python gr00t/experiment/launch_finetune.py \
    ... \
    --num-shards-per-epoch 100 \
    --dataloader-num-workers 2 \
    --shard-size 512
If memory (VRAM) is limited, you can reduce all the values above to lower memory usage. To make shard sampling closer to IID, reduce the episode_sampling_rate to 0.05 or lower.
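To see why these knobs trade speed for memory, the arithmetic below sketches one plausible model: an epoch covers `num_shards_per_epoch × shard_size` samples, and each worker holds roughly one shard in memory at a time. This is an assumption about how the sharded dataloader buffers data, not a description of gr00t internals.

```python
# Rough memory intuition for the sharded dataloader. Assumption: each worker
# decodes roughly one shard at a time; gr00t's actual buffering may differ.
def samples_per_epoch(num_shards_per_epoch, shard_size):
    return num_shards_per_epoch * shard_size

def approx_buffered_samples(num_workers, shard_size):
    return num_workers * shard_size

print(samples_per_epoch(100, 512))       # 51200 samples per epoch
print(approx_buffered_samples(2, 512))   # ~1024 samples held in worker buffers
```

Under this model, halving `--shard-size` roughly halves per-worker memory while leaving the epoch length controllable via `--num-shards-per-epoch`.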

Advanced configuration

For more extensive fine-tuning configuration, use gr00t/experiment/launch_train.py instead to launch the training process with full control over all training parameters.
