The launch_finetune.py script provides a streamlined interface for fine-tuning pretrained GR00T models on your own datasets. It handles model loading, data configuration, and distributed training setup.

Usage

python -m gr00t.experiment.launch_finetune \
  --base-model-path <path-to-checkpoint> \
  --dataset-path <path-to-dataset> \
  --embodiment-tag <embodiment> \
  --output-dir ./outputs

Parameters

Data and model paths

base-model-path
str
required
Path to the pretrained base model checkpoint (e.g., Hugging Face model hub or local directory).
dataset-path
str
required
Path to the dataset root directory containing trajectory data for fine-tuning.
embodiment-tag
EmbodimentTag
required
Identifier specifying which embodiment (robot configuration) this fine-tuning run targets.
modality-config-path
str | None
default:"None"
Path to a Python file defining the modality configuration for the given embodiment. If None, uses the pre-registered modality config in gr00t/configs/data/embodiment_configs.py.
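A custom modality config file is a small Python module describing which data streams the embodiment provides. The exact schema is defined by the configs in gr00t/configs/data/embodiment_configs.py; the sketch below is hypothetical (the key names `video`, `state`, and `action` are assumptions, not confirmed by this page) and shows only the general shape such a file might take.

```python
# Hypothetical sketch of a modality config file. The real schema lives in
# gr00t/configs/data/embodiment_configs.py and may use different key names.
MODALITY_CONFIG = {
    "video": {"cameras": ["front"]},            # camera streams (assumed)
    "state": {"keys": ["joint_positions"]},     # proprioceptive inputs (assumed)
    "action": {"keys": ["joint_velocities"]},   # action targets (assumed)
}
```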

Model tuning flags

tune-llm
bool
default:"False"
If True, fine-tune the language model (LLM) backbone during training.
tune-visual
bool
default:"False"
If True, fine-tune the visual encoder (e.g., ViT or CNN backbone).
tune-projector
bool
default:"True"
If True, fine-tune the multimodal projector layers that map vision/language features to a shared space.
tune-diffusion-model
bool
default:"True"
If True, fine-tune the diffusion-based action decoder (if present in the model).
state-dropout-prob
float
default:"0.0"
Dropout probability applied to state inputs for regularization during training.
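State dropout randomly suppresses the proprioceptive state input so the policy does not over-rely on it. The helper below is an illustrative sketch of that idea (the function name and the all-zeros masking strategy are assumptions; the script's internal implementation may differ).

```python
import random

def apply_state_dropout(state, prob, rng=random):
    """Zero out the state input with probability `prob`.

    Illustrative sketch of state-input dropout; not the script's
    actual implementation.
    """
    if rng.random() < prob:
        return [0.0] * len(state)
    return state
```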

Data augmentation

random-rotation-angle
int | None
default:"None"
Maximum rotation angle (in degrees) for random rotation augmentation of input images.
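A max-angle parameter of this kind is typically used to draw a per-sample angle uniformly from the symmetric interval around zero. A minimal sketch of that sampling, assuming uniform sampling in [-max, +max] (the page does not specify the distribution):

```python
import random

def sample_rotation_angle(max_angle_deg, rng=random):
    """Draw a rotation angle uniformly from [-max_angle_deg, +max_angle_deg].

    Assumes a symmetric uniform distribution; the script may sample
    differently.
    """
    return rng.uniform(-max_angle_deg, max_angle_deg)
```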
color-jitter-params
dict[str, float] | None
default:"None"
Parameters for color jitter augmentation on images. Expected keys include:
  • brightness: float
  • contrast: float
  • saturation: float
  • hue: float
Example: {"brightness": 0.4, "contrast": 0.4, "saturation": 0.4, "hue": 0.1}. If None, applies the default color jitter augmentation from the pretrained model.
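On the command line this dict is passed as a JSON string (see the augmentation example on this page). A sketch of how such a string could be parsed and validated; the function name and validation logic here are illustrative, not the script's own:

```python
import json

ALLOWED_KEYS = {"brightness", "contrast", "saturation", "hue"}

def parse_color_jitter(arg):
    """Parse a --color-jitter-params JSON string into a dict of floats.

    Illustrative helper; the script's own parsing/validation may differ.
    """
    params = json.loads(arg)
    unknown = set(params) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"Unknown color jitter keys: {sorted(unknown)}")
    return {k: float(v) for k, v in params.items()}
```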

Training configuration

global-batch-size
int
default:"64"
Total effective batch size across all GPUs and accumulation steps.
dataloader-num-workers
int
default:"2"
Number of parallel worker processes used for data loading.
learning-rate
float
default:"1e-4"
Initial learning rate for optimizer.
gradient-accumulation-steps
int
default:"1"
Number of micro-batches over which gradients are accumulated before each optimizer update step.
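The three batching flags are related by a simple identity: the per-GPU micro-batch size equals the global batch size divided by (number of GPUs × accumulation steps). The helper below just demonstrates that arithmetic (the function name is illustrative; the script computes this internally):

```python
def per_gpu_batch_size(global_batch_size, num_gpus, grad_accum_steps):
    """Per-GPU micro-batch size implied by the three batching flags.

    Illustrative arithmetic only; the script derives this itself.
    """
    micro, rem = divmod(global_batch_size, num_gpus * grad_accum_steps)
    if rem:
        raise ValueError("global batch size must be divisible by "
                         "num_gpus * gradient_accumulation_steps")
    return micro
```

For example, the default global batch size of 64 on 4 GPUs with 2 accumulation steps yields a micro-batch of 8 per GPU per forward pass.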
output-dir
str
default:"./outputs"
Directory where model checkpoints, logs, and outputs are saved.
save-steps
int
default:"1000"
Frequency (in training steps) at which to save checkpoints.
save-total-limit
int
default:"5"
Maximum number of checkpoints to keep before older ones are deleted.
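With save-steps 1000 and save-total-limit 5, the run keeps a rolling window of the five most recent checkpoints. A sketch of that rotation policy, assuming newest-first retention (the function name is hypothetical):

```python
def prune_checkpoints(checkpoint_steps, save_total_limit):
    """Split saved checkpoint steps into (keep, delete) lists,
    retaining only the newest `save_total_limit` checkpoints.

    Sketch of save-total-limit behavior; not the script's own code.
    """
    ordered = sorted(checkpoint_steps)
    keep = ordered[-save_total_limit:]
    delete = ordered[:-save_total_limit]
    return keep, delete
```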
num-gpus
int
default:"1"
Number of GPUs available for distributed or single-node training.
use-wandb
bool
default:"False"
If True, log metrics and artifacts to Weights & Biases (wandb). The project name is finetune-gr00t-n1d6; you must be logged in to wandb to view the logs.
max-steps
int
default:"10000"
Total number of training steps to run before stopping.
weight-decay
float
default:"1e-5"
Weight decay coefficient for optimizer (L2 regularization).
warmup-ratio
float
default:"0.05"
Proportion of total training steps used for learning rate warm-up.
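With the defaults (max-steps 10000, warmup-ratio 0.05), the learning rate ramps up over the first 500 steps. The sketch below assumes linear warm-up followed by a constant rate; the script's actual scheduler (including any post-warmup decay) is not specified on this page and may differ.

```python
def warmup_lr(step, base_lr=1e-4, max_steps=10000, warmup_ratio=0.05):
    """Learning rate under linear warm-up, then constant.

    Assumes a linear ramp; the script's scheduler may decay after warm-up.
    """
    warmup_steps = int(warmup_ratio * max_steps)  # 500 with the defaults
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr
```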
shard-size
int
default:"1024"
Size of each dataset shard used during preloading.
episode-sampling-rate
float
default:"0.1"
Sampling rate applied to episodes when constructing the training dataset.
num-shards-per-epoch
int
default:"100000"
Number of dataset shards to use per epoch. Reduce this value if VRAM is limited.

Examples

Basic fine-tuning

python -m gr00t.experiment.launch_finetune \
  --base-model-path nvidia/Eagle-Block2A-2B-v2 \
  --dataset-path /data/my_robot_dataset \
  --embodiment-tag FRANKA_PANDA \
  --num-gpus 1

Fine-tuning with data augmentation

python -m gr00t.experiment.launch_finetune \
  --base-model-path ./checkpoints/base_model \
  --dataset-path /data/my_robot_dataset \
  --embodiment-tag UR5 \
  --random-rotation-angle 15 \
  --color-jitter-params '{"brightness": 0.4, "contrast": 0.4, "saturation": 0.4, "hue": 0.1}' \
  --num-gpus 4

Fine-tuning with custom learning parameters

python -m gr00t.experiment.launch_finetune \
  --base-model-path nvidia/Eagle-Block2A-2B-v2 \
  --dataset-path /data/my_robot_dataset \
  --embodiment-tag FRANKA_PANDA \
  --learning-rate 5e-5 \
  --global-batch-size 128 \
  --max-steps 20000 \
  --save-steps 500 \
  --use-wandb

Fine-tuning with all model components

python -m gr00t.experiment.launch_finetune \
  --base-model-path nvidia/Eagle-Block2A-2B-v2 \
  --dataset-path /data/my_robot_dataset \
  --embodiment-tag FRANKA_PANDA \
  --tune-llm \
  --tune-visual \
  --tune-projector \
  --tune-diffusion-model \
  --num-gpus 8

Environment variables

  • LOGURU_LEVEL: Controls logging verbosity (default: INFO)

Notes

  • The script automatically sets up the model with these configurations:
    • Model: nvidia/Eagle-Block2A-2B-v2
    • Optimizer: adamw_torch
    • Wandb project: finetune-gr00t-n1d6
    • Relative action mode enabled
    • Eagle collator enabled
  • If a custom modality config is provided, it will be loaded from the specified path
  • Download cache is disabled by default for datasets
