TL;DR: Add a meta/modality.json file to your LeRobot v2 dataset and follow the schema below.
LeRobot v2 requirements
If you already have a dataset in the LeRobot v2 format, you can skip this section. If you have a dataset in the LeRobot v3.0 format, use the conversion script:
Structure requirements
The folder should follow this structure:
Video observations
The videos folder contains the MP4 files associated with each episode. Requirements:
- Must be stored as MP4 files
- Should be named using the format: observation.images.<video_name>
- Use episode_00000X.mp4 naming, where X indicates the episode number
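
The naming rules above can be sketched as a small path builder. This is a minimal illustration, not part of the GR00T tooling: the helper name episode_video_path, the dataset root, the ego_view camera name, and the chunk-000 directory level are all assumptions made for the example.

```python
from pathlib import Path


def episode_video_path(root, video_name, episode_index, chunk="chunk-000"):
    """Build the expected MP4 path for one episode of one video stream."""
    # Assumption: videos are grouped under a chunk directory such as "chunk-000".
    folder = f"observation.images.{video_name}"
    # episode_00000X.mp4: the episode number zero-padded to six digits.
    filename = f"episode_{episode_index:06d}.mp4"
    return Path(root) / "videos" / chunk / folder / filename


# Hypothetical dataset root and camera name.
path = episode_video_path("my_dataset", "ego_view", 3)
print(path.as_posix())
# my_dataset/videos/chunk-000/observation.images.ego_view/episode_000003.mp4
```

Keeping the path logic in one function makes it easy to check every episode against the naming convention before training.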
Data files
The data folder contains all of the parquet files associated with each episode. Each parquet file contains:
- State information: stored as observation.state (1D concatenated array of all state modalities)
- Action: stored as action (1D concatenated array of all action modalities)
- Timestamp: stored as timestamp (floating-point number giving the starting time)
- Annotations: stored as annotation.<annotation_source>.<annotation_type>(.<annotation_name>)
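
A quick way to sanity-check a parquet file's column names against the list above is sketched below. It works on a plain list of names, so it does not depend on any parquet library. The helper check_columns and the sample column names are hypothetical, and the pattern assumes that annotation name segments contain no dots.

```python
import re

# Columns every episode file is expected to carry, per the list above.
REQUIRED_COLUMNS = {"observation.state", "action", "timestamp"}

# annotation.<annotation_source>.<annotation_type>(.<annotation_name>)
# Assumption: each segment is non-empty and contains no dots.
ANNOTATION_PATTERN = re.compile(r"^annotation\.[^.]+\.[^.]+(\.[^.]+)?$")


def check_columns(columns):
    """Return (missing required, well-formed annotation, malformed annotation) columns."""
    columns = list(columns)
    missing = sorted(REQUIRED_COLUMNS - set(columns))
    annotations = [c for c in columns if ANNOTATION_PATTERN.match(c)]
    malformed = [
        c for c in columns
        if c.startswith("annotation.") and not ANNOTATION_PATTERN.match(c)
    ]
    return missing, annotations, malformed


# Hypothetical column names for illustration only.
columns = [
    "observation.state",
    "action",
    "timestamp",
    "annotation.human.action.task_description",
    "annotation.human",  # too few segments, reported as malformed
]
missing, annotations, malformed = check_columns(columns)
print(missing, annotations, malformed)
```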
Example parquet file
Here is a sample from the cube_to_bowl dataset:
Meta files
meta/tasks.jsonl
Contains a list of all the tasks in the entire dataset:
meta/episodes.jsonl
Contains a list of all the episodes in the entire dataset:
GR00T LeRobot specific requirements
The meta/modality.json configuration
GR00T requires an additional metadata file, meta/modality.json, that is not present in the standard LeRobot format. This file provides detailed metadata about state and action modalities, enabling:
- Separate data storage and interpretation: State and action are stored as concatenated float32 arrays, with metadata to interpret them as distinct fields
- Video: Stored as separate files, with the configuration allowing them to be renamed to a standardized format
- Annotations: Keeps track of all annotation fields
- Fine-grained splitting: Divides the state and action arrays into more semantically meaningful fields
- Clear mapping: Explicit mapping of data dimensions
- Sophisticated data transformations: Supports field-specific normalization and rotation transformations during training
Schema
All indices are zero-based and follow Python’s array slicing convention ([start:end]).
Example modality.json
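
As a minimal sketch of how such a file could be used, the snippet below slices a concatenated state vector into named fields. Under the [start:end] convention, start is inclusive and end is exclusive. The field names (arm, gripper), the index ranges, and the "start"/"end" key names are assumptions made for illustration, not values taken from a real dataset.

```python
import json

# Hypothetical configuration: field names and index ranges are made up, and the
# "start"/"end" keys are an assumption based on the [start:end] convention.
modality_json = """
{
  "state": {
    "arm": {"start": 0, "end": 7},
    "gripper": {"start": 7, "end": 8}
  },
  "action": {
    "arm": {"start": 0, "end": 7},
    "gripper": {"start": 7, "end": 8}
  }
}
"""


def split_fields(vector, fields):
    """Slice a concatenated 1D array into named fields using [start:end]."""
    return {name: vector[spec["start"]:spec["end"]] for name, spec in fields.items()}


modality = json.loads(modality_json)
state = [0.1 * i for i in range(8)]  # stand-in for one observation.state row
parts = split_fields(state, modality["state"])
print({name: len(values) for name, values in parts.items()})
# {'arm': 7, 'gripper': 1}
```

Because the ranges are half-open, adjacent fields share a boundary index (7 here) without overlapping, and the field lengths sum to the length of the concatenated array.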
GR00T LeRobot extensions to standard LeRobot
GR00T LeRobot is a flavor of the standard LeRobot format with more opinionated requirements:
- Computes meta/stats.json and meta/relative_stats.json for each dataset automatically
- Proprioceptive states must always be included in the observation.state keys
- Supports multi-channel annotation formats (e.g., coarse-grained, fine-tuned), allowing users to add as many annotation channels as needed via the annotation.<annotation_source>.<annotation_type> key
- Requires the additional metadata file meta/modality.json
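
The first item in the list above refers to dataset statistics that are computed automatically. The exact contents of meta/stats.json are not described here, so the following is only a rough sketch of what per-dimension statistics over observation.state rows could look like. The choice of mean, standard deviation, minimum, and maximum, and the helper name per_dimension_stats, are assumptions.

```python
import statistics


def per_dimension_stats(rows):
    """Summarize each dimension of a list of equally sized 1D arrays."""
    columns = list(zip(*rows))  # transpose: one tuple of values per dimension
    return {
        "mean": [statistics.fmean(c) for c in columns],
        "std": [statistics.pstdev(c) for c in columns],  # population std
        "min": [min(c) for c in columns],
        "max": [max(c) for c in columns],
    }


# Three made-up state rows with two dimensions each.
rows = [[0.0, 1.0], [2.0, 1.0], [4.0, 1.0]]
stats = per_dimension_stats(rows)
print(stats["mean"], stats["min"], stats["max"])
# [2.0, 1.0] [0.0, 1.0] [4.0, 1.0]
```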
Multiple annotation support
To support multiple annotations within a single parquet file, users may add extra columns to the parquet file. These columns should be treated the same way as the task_index column in the original LeRobot v2 dataset:
In LeRobot v2, actual language descriptions are stored in a row of the meta/tasks.jsonl file, while the parquet file stores only the corresponding index in the task_index column. GR00T follows the same convention and stores the corresponding index for each annotation in the annotation.<annotation_source>.<annotation_type> column.
Although the task_index column may still be used for the default annotation, a dedicated column annotation.<annotation_source>.<annotation_type> is required to ensure it is loadable by the custom data loader.
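
The index convention described above can be sketched as a lookup from stored indices back to text. The sample meta/tasks.jsonl lines, the "task_index" and "task" key names, and the annotation.human.validity column are assumptions for illustration. The actual data loader may resolve indices differently.

```python
import json

# Hypothetical meta/tasks.jsonl content, one JSON object per line.
tasks_jsonl = """\
{"task_index": 0, "task": "pick up the cube and place it in the bowl"}
{"task_index": 1, "task": "valid"}
"""


def load_tasks(text):
    """Map each task index to its language description."""
    lookup = {}
    for line in text.splitlines():
        if line.strip():
            record = json.loads(line)
            lookup[record["task_index"]] = record["task"]
    return lookup


tasks = load_tasks(tasks_jsonl)

# Hypothetical parquet row: each column stores an index, not the text itself.
row = {"task_index": 0, "annotation.human.validity": 1}
resolved = {
    key: tasks[value]
    for key, value in row.items()
    if key == "task_index" or key.startswith("annotation.")
}
print(resolved)
```

Storing only indices keeps the parquet files compact, since each distinct description is written once in meta/tasks.jsonl regardless of how many frames refer to it.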