TorchRL ships built-in loaders for the most widely used offline RL and robot-learning datasets. Every loader extends BaseDatasetExperienceReplay, which is itself a TensorDictReplayBuffer, so offline data plugs directly into the same training loop as any online buffer: call dataset.sample(batch_size) to get a TensorDict, apply transforms via dataset.append_transform(t), and checkpoint with dataset.dumps(path). Data is stored in TorchRL’s TED (TorchRL Episode Data) format: a TensorDict of transitions with fields observation, action, next.observation, next.reward, next.done, next.terminated, and next.truncated.
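To make the layout concrete, here is a minimal pure-Python sketch of how one episode could be flattened into TED-style fields. It uses plain dicts and lists instead of TensorDict, and the helper name episode_to_ted is hypothetical; it illustrates the field layout only, not TorchRL's conversion code.

```python
def episode_to_ted(observations, actions, rewards, terminated_at_end=True):
    """Flatten one episode into TED-style fields (illustrative helper, not a TorchRL API).

    `observations` holds len(actions) + 1 entries: the extra final entry is the
    observation reached after the last action.
    """
    n = len(actions)
    if len(observations) != n + 1 or len(rewards) != n:
        raise ValueError("expected n + 1 observations and n rewards for n actions")
    last = n - 1
    terminated = [terminated_at_end and t == last for t in range(n)]
    truncated = [(not terminated_at_end) and t == last for t in range(n)]
    return {
        "observation": observations[:-1],
        "action": actions,
        "next": {
            # Everything produced by the step lives under "next".
            "observation": observations[1:],
            "reward": rewards,
            "terminated": terminated,
            "truncated": truncated,
            "done": [a or b for a, b in zip(terminated, truncated)],
        },
    }


ted = episode_to_ted(
    observations=["s0", "s1", "s2", "s3"],
    actions=["a0", "a1", "a2"],
    rewards=[0.0, 0.0, 1.0],
)
print(ted["next"]["done"])
```

Each index t describes one transition: the root entries are what the agent saw and did at step t, and the entries under "next" are the outcome of that step.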
Most dataset classes have optional external dependencies (e.g., d4rl, minari, huggingface_hub). Import errors are caught at module load time and re-raised with a helpful message when the class is instantiated. See the individual class sections below for the required package.

Base Class: BaseDatasetExperienceReplay

All dataset loaders inherit from this class, which extends TensorDictReplayBuffer with:
  • Automatic download on first instantiation (controlled by download).
  • A TensorStorage backed by memory-mapped tensors for zero-copy disk access.
  • An ImmutableDatasetWriter to prevent accidental writes.
  • A class-level available_datasets property listing all downloadable dataset IDs.
Import path: torchrl.data.datasets.common.BaseDatasetExperienceReplay

D4RL Datasets

D4RLExperienceReplay

Wraps the D4RL offline RL benchmark: offline datasets collected from MuJoCo locomotion tasks, AntMaze, Adroit hand manipulation, and Kitchen. Data can be loaded through the d4rl package or, without it, by direct HTTP download.
Import path: torchrl.data.datasets.D4RLExperienceReplay
Requires: d4rl package (optional; pass direct_download=True to skip it)
Parameters:
  • dataset_id (str, required): Dataset identifier, e.g. "hopper-medium-v2", "maze2d-umaze-v1", "pen-expert-v1". See D4RLExperienceReplay.available_datasets for the full list.
  • batch_size (int, required): Batch size returned by sample(). Can be overridden at call time: dataset.sample(64).
  • sampler (Sampler): Index sampler. Defaults to RandomSampler.
  • transform (Transform): Transform applied after each sample.
  • split_trajs (bool, default: False): Split transitions into individual trajectories (padded to equal length) using the done signal (done = terminated | truncated).
  • from_env (bool, default: False): Collect the dataset from a live d4rl environment rather than a pre-saved HDF5 file. Requires the d4rl package.
  • direct_download (bool | None, default: None): Download raw HDF5 files without the d4rl package. If None, falls back to direct_download=True when d4rl is not installed. Incompatible with from_env=True.
  • use_truncated_as_done (bool, default: True): Set done = terminated | truncated. When False, only terminated is used.
  • root (Path | str): Root directory for cached datasets. Defaults to ~/.cache/torchrl/d4rl.
  • download (bool | str, default: True): Download if not cached. Pass "force" to overwrite an existing cache.
from torchrl.data.datasets import D4RLExperienceReplay

# Load the medium Hopper dataset (downloads on first run)
data = D4RLExperienceReplay("hopper-medium-v2", batch_size=256)
print(len(data))   # ~1,000,000 transitions

sample = data.sample()
print(sample)
# TensorDict with keys: observation, action, next/observation,
#                        next/reward, next/done, next/terminated, ...
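The effect of split_trajs and use_truncated_as_done can be pictured with the pure-Python sketch below. It works on a list of dicts rather than a TensorDict, and the function split_trajectories, its pad_value argument, and the returned validity mask are assumptions made for illustration, not the library's implementation.

```python
def split_trajectories(transitions, use_truncated_as_done=True, pad_value=None):
    """Group a flat list of transitions into equal-length, padded trajectories."""
    trajectories, current = [], []
    for step in transitions:
        current.append(step)
        # done = terminated | truncated, or terminated only when the flag is off.
        done = step["terminated"] or (use_truncated_as_done and step["truncated"])
        if done:
            trajectories.append(current)
            current = []
    if current:  # keep a trailing trajectory that never reached a done flag
        trajectories.append(current)

    max_len = max((len(t) for t in trajectories), default=0)
    padded = [t + [pad_value] * (max_len - len(t)) for t in trajectories]
    # The mask marks real steps (True) versus padding (False).
    mask = [[True] * len(t) + [False] * (max_len - len(t)) for t in trajectories]
    return padded, mask


flat = [
    {"reward": 1.0, "terminated": False, "truncated": False},
    {"reward": 0.5, "terminated": False, "truncated": True},   # time limit reached
    {"reward": 0.2, "terminated": False, "truncated": False},
    {"reward": 0.1, "terminated": False, "truncated": False},
    {"reward": 0.0, "terminated": True, "truncated": False},   # true terminal state
]

padded, mask = split_trajectories(flat)
print([sum(row) for row in mask])

_, mask_terminated_only = split_trajectories(flat, use_truncated_as_done=False)
print([sum(row) for row in mask_terminated_only])
```

With truncation counted as done, the five transitions split into trajectories of two and three steps, with the shorter one padded. With terminated only, they form a single five-step trajectory.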

Atari DQN Dataset

AtariDQNExperienceReplay

The DQN Replay Dataset: 5 independent training runs of a DQN agent on each of 60 Atari 2600 games. Each run spans 200 million frames, i.e. 50 million agent steps at frame-skip 4, and is chunked into 50 shards of 1 million transitions each. Data is formatted on the fly at sample time to avoid storing redundant (obs, next_obs) pairs.
Import path: torchrl.data.datasets.AtariDQNExperienceReplay
Requires: gsutil (for downloading from Google Cloud Storage)
Parameters:
  • dataset_id (str, required): Identifier of the form "<game>/<run>", e.g. "Pong/1", "Breakout/5". See AtariDQNExperienceReplay.available_datasets.
  • batch_size (int, required): Batch size for sample().
  • root (Path | str): Root directory for cached files. Defaults to ~/.cache/torchrl/atari.
  • download (bool | str, default: True): Download on first use. Pass "force" to re-download.
  • sampler (Sampler): Index sampler. Supports SliceSampler for sequence sampling.
  • num_slices (int): If provided, samples through a SliceSampler that returns this many sub-sequences per batch.
  • slice_len (int): Fixed sub-sequence length. Mutually exclusive with num_slices.
from torchrl.data.datasets import AtariDQNExperienceReplay

# Load run 1 of Pong (downloads ~10 GB on first use)
data = AtariDQNExperienceReplay(
    dataset_id="Pong/1",
    batch_size=32,
    num_slices=8,   # 8 sub-sequences per batch, each 32 / 8 = 4 steps long
)

sample = data.sample()
print(sample["observation"].shape)  # e.g. torch.Size([32, 84, 84]) for grayscale frames
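The relationship between batch_size and num_slices can be illustrated with a small pure-Python sketch of slice sampling. It assumes that each slice has batch_size / num_slices steps and must stay inside a single episode; the function sample_slices and its arguments are hypothetical and do not mirror SliceSampler's internals.

```python
import random


def sample_slices(episode_lengths, batch_size, num_slices, seed=0):
    """Draw `num_slices` contiguous index ranges, each contained in one episode."""
    if batch_size % num_slices:
        raise ValueError("batch_size must be divisible by num_slices")
    slice_len = batch_size // num_slices
    rng = random.Random(seed)

    # Offset of each episode's first transition in the flat storage.
    starts, offset = [], 0
    for length in episode_lengths:
        starts.append(offset)
        offset += length

    # Episodes shorter than one slice cannot be sampled from.
    eligible = [i for i, length in enumerate(episode_lengths) if length >= slice_len]
    if not eligible:
        raise ValueError("no episode is long enough for the requested slice length")

    batch = []
    for _ in range(num_slices):
        ep = rng.choice(eligible)
        first = starts[ep] + rng.randint(0, episode_lengths[ep] - slice_len)
        batch.append(list(range(first, first + slice_len)))
    return batch


slices = sample_slices(episode_lengths=[10, 3, 25], batch_size=32, num_slices=8)
print(len(slices), len(slices[0]))
```

This sketch draws every slice independently, so two slices in the same batch may cover overlapping indices.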

Minari Datasets

MinariExperienceReplay

The Minari dataset library from the Farama Foundation. Covers robotics (Gymnasium-Robotics, D4RL-compatible MuJoCo), manipulation, and navigation tasks.
Import path: torchrl.data.datasets.MinariExperienceReplay
Requires: minari package (pip install minari)
Parameters:
  • dataset_id (str, required): Dataset ID from the Minari registry, e.g. "door-human-v1", "pen-expert-v1". Use MinariExperienceReplay.available_datasets to list all registered IDs.
  • batch_size (int, required): Batch size for sample().
  • root (Path | str): Root directory for cached datasets. Defaults to ~/.cache/torchrl/minari.
  • download (bool | str, default: True): Download on first use. Pass "force" to overwrite an existing cache.
  • split_trajs (bool, default: False): Split into individual trajectory tensors using done = truncated | terminated.
  • load_from_local_minari (bool, default: False): Load directly from the local Minari cache (~/.minari/datasets) without a network request. Useful for custom or private datasets.
  • transform (Transform): Post-processing transform applied to every sampled batch.
from torchrl.data.datasets import MinariExperienceReplay

# download="force" overwrites any cached copy; omit it to reuse an existing cache
data = MinariExperienceReplay("door-human-v1", batch_size=32, download="force")

for batch in data:
    # batch keys: observation, action,
    #              next/observation, next/reward, next/done, ...
    print(batch["action"].shape)  # e.g. torch.Size([32, 28])
    break
Minari datasets may contain text/non-tensor info entries. TorchRL currently discards non-tensor data from these entries. If your training requires text metadata, open an issue on TorchRL’s GitHub.

LeRobot Datasets

LeRobotExperienceReplay

LeRobot is Hugging Face’s dataset collection for robot manipulation. Datasets are hosted on the Hugging Face Hub and include video observations, proprioceptive state, language instructions, and actions from real and simulated robot arms.
Import path: torchrl.data.datasets.LeRobotExperienceReplay
Requires: huggingface_hub, datasets (pip install huggingface_hub datasets)
Parameters:
  • dataset_id (str, required): Hugging Face Hub repository ID, e.g. "lerobot/pusht", "lerobot/aloha_sim_transfer_cube_human".
  • batch_size (int, required): Batch size for sample().
  • root (Path | str): Local cache directory. Defaults to ~/.cache/torchrl/lerobot.
  • download (bool | str, default: True): Download from the Hub on first use.
  • sampler (Sampler): Index sampler. Defaults to SliceSampler for sequence-aware sampling.
  • transform (Transform): Post-processing transform. Use torchrl.envs.transforms.DecodeVideoTransform to decode lazy VideoClipRef frames on the fly.
  • key_map (dict[str, NestedKey]): Override the default column-to-TensorDict key mapping. The default maps "action" → "action", "observation.state" → ("observation", "state"), "observation.images.<cam>" → ("observation", "image", "<cam>"), etc.
  • num_slices (int): Number of episode slices per batch. Passed to the default SliceSampler.
  • slice_len (int): Length of each sampled slice. Mutually exclusive with num_slices.
from torchrl.data.datasets import LeRobotExperienceReplay

data = LeRobotExperienceReplay(
    "lerobot/pusht",
    batch_size=64,
    num_slices=8,  # 8 trajectories per batch, each of length 8
)

sample = data.sample()
print(sample["observation", "state"].shape)   # e.g. torch.Size([64, 2])
print(sample["action"].shape)                  # e.g. torch.Size([64, 2])
print(sample["episode"].shape)                 # episode IDs

lerobot_columns_to_tensordict

Utility function that converts a dictionary of LeRobot-style columnar data (e.g., from a datasets.Dataset row batch) into a canonical VLA-format TensorDict.
from torchrl.data.datasets.lerobot import lerobot_columns_to_tensordict
import torch

columns = {
    "observation.state": torch.zeros(4, 7),
    "observation.images.top": torch.zeros(4, 3, 8, 8, dtype=torch.uint8),
    "action": torch.zeros(4, 7),
    "episode_index": torch.tensor([0, 0, 1, 1]),
    "task": ["pick", "pick", "place", "place"],
}

td = lerobot_columns_to_tensordict(columns)
print(td["observation", "state"].shape)     # torch.Size([4, 7])
print(td["observation", "image", "top"].shape)  # torch.Size([4, 3, 8, 8])
print(td.get("language_instruction").tolist())  # ['pick', 'pick', 'place', 'place']
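The column-to-key mapping used in the LeRobot examples can be sketched in plain Python as follows. The rules cover only the cases shown in this section; the "task" to "language_instruction" rule is inferred from the example above, and default_key_for and resolve_keys are hypothetical helpers rather than TorchRL functions.

```python
IMAGE_PREFIX = "observation.images."


def default_key_for(column):
    """Map a dotted LeRobot column name to a (possibly nested) key."""
    if column.startswith(IMAGE_PREFIX):
        camera = column[len(IMAGE_PREFIX):]
        return ("observation", "image", camera)
    if column == "observation.state":
        return ("observation", "state")
    if column == "task":
        return "language_instruction"  # inferred from the example, not confirmed
    return column  # e.g. "action" stays "action"


def resolve_keys(columns, key_map=None):
    """Apply user overrides first, then fall back to the default rule."""
    key_map = key_map or {}
    return {col: key_map.get(col, default_key_for(col)) for col in columns}


columns = ["observation.state", "observation.images.top", "action", "task"]
print(resolve_keys(columns))

# An override replaces only the entry it names.
print(resolve_keys(columns, key_map={"observation.state": ("proprio",)}))
```

Nested keys are written as tuples, which is the same convention the examples use when indexing, as in td["observation", "state"].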

Open X-Embodiment

OpenXExperienceReplay

The Open X-Embodiment dataset collection: over 1 million real robot trajectories spanning 22 robot embodiments and 527 skills. Backed by TensorFlow Datasets (TFDS) and streamed lazily via TorchRL’s RLDS adapter.
Import path: torchrl.data.datasets.OpenXExperienceReplay
Requires: tensorflow_datasets, tensorflow (pip install tensorflow_datasets)
Parameters:
  • dataset_id (str, required): TFDS dataset name, e.g. "fractal20220817_data", "bridge", "kuka". Full list at OpenXExperienceReplay.available_datasets.
  • batch_size (int, required): Batch size for sample().
  • root (Path | str): Local TFDS data directory.
  • download (bool | str, default: True): Download and prepare data on first use.
  • sampler (Sampler): Index sampler.
  • split (str, default: "train"): Dataset split. Most Open X datasets only have "train".
from torchrl.data.datasets import OpenXExperienceReplay

data = OpenXExperienceReplay(
    "fractal20220817_data",
    batch_size=32,
    root="/data/openx",
)

sample = data.sample()
print(sample.keys())

Roboset

RobosetExperienceReplay

RoboSet: a large-scale multi-task robot manipulation dataset distributed with the RoboHive project.
Import path: torchrl.data.datasets.RobosetExperienceReplay
Requires: h5py (pip install h5py)
Parameters:
  • dataset_id (str, required): Dataset ID. See RobosetExperienceReplay.available_datasets.
  • batch_size (int, required): Batch size for sample().
  • root (Path | str): Cache directory. Defaults to ~/.cache/torchrl/roboset.
  • download (bool | str, default: True): Download on first use.

GenDGRL

GenDGRLExperienceReplay

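The Gen-DGRL dataset: offline trajectories from procedurally generated Procgen games, released with the paper "The Generalization Gap in Offline Reinforcement Learning" to study generalization in offline RL.
Import path: torchrl.data.datasets.GenDGRLExperienceReplay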
Parameters:
  • dataset_id (str, required): Dataset split identifier. See GenDGRLExperienceReplay.available_datasets.
  • batch_size (int, required): Batch size for sample().
  • root (Path | str): Cache directory. Defaults to ~/.cache/torchrl/gen_dgrl.
  • download (bool | str, default: True): Download on first use.

VD4RL

VD4RLExperienceReplay

V-D4RL: a pixel-based counterpart of D4RL, with transitions collected from DMControl environments across D4RL-style quality tiers (random, medium, medium-replay, expert).
Import path: torchrl.data.datasets.VD4RLExperienceReplay
Parameters:
  • dataset_id (str, required): Dataset ID, e.g. "cheetah_run/medium/64px". See VD4RLExperienceReplay.available_datasets.
  • batch_size (int, required): Batch size for sample().
  • root (Path | str): Cache directory. Defaults to ~/.cache/torchrl/vd4rl.

Integration with TensorDictReplayBuffer

Every dataset loader is itself a TensorDictReplayBuffer. You can mix offline and online data by combining two buffers with ReplayBufferEnsemble, or by using OfflineToOnlineReplayBuffer:
from torchrl.data import (
    OfflineToOnlineReplayBuffer,
    TensorDictReplayBuffer,
    LazyTensorStorage,
)
from torchrl.data.datasets import D4RLExperienceReplay

offline = D4RLExperienceReplay("hopper-medium-v2", batch_size=128)
online  = TensorDictReplayBuffer(
    storage=LazyTensorStorage(max_size=100_000),
    batch_size=128,
)

# 50% offline data, 50% online data
mixed = OfflineToOnlineReplayBuffer(
    replay_buffer=offline,
    storage=online._storage,
    batch_size=256,
)
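The 50/50 mix mentioned in the comment above amounts to drawing a fixed fraction of every batch from each source. The pure-Python sketch below shows that idea on plain lists; sample_mixed and offline_fraction are hypothetical names, and this is not a description of how OfflineToOnlineReplayBuffer or ReplayBufferEnsemble work internally.

```python
import random


def sample_mixed(offline, online, batch_size, offline_fraction=0.5, seed=0):
    """Draw a batch with a fixed share of offline and online transitions."""
    rng = random.Random(seed)
    n_offline = round(batch_size * offline_fraction)
    if not online:
        # Before any online data has been collected, fall back to offline only.
        n_offline = batch_size
    n_online = batch_size - n_offline

    batch = rng.choices(offline, k=n_offline)  # sampling with replacement
    if n_online:
        batch += rng.choices(online, k=n_online)
    rng.shuffle(batch)
    return batch


offline_data = [("offline", i) for i in range(1000)]
online_data = [("online", i) for i in range(50)]

batch = sample_mixed(offline_data, online_data, batch_size=256)
n_from_offline = sum(1 for source, _ in batch if source == "offline")
print(n_from_offline, len(batch) - n_from_offline)
```

Falling back to offline-only batches while the online buffer is empty is one possible design choice; a training loop could equally wait until enough online data exists.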
You can also add transforms to any dataset for preprocessing:
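from torchrl.data.datasets import D4RLExperienceReplay
from torchrl.envs import ObservationNorm, RewardScaling, Compose

data = D4RLExperienceReplay("hopper-medium-v2", batch_size=256)
data.append_transform(
    Compose(
        # loc=0.0 and scale=1.0 are placeholders that leave observations unchanged;
        # substitute statistics computed from the dataset to actually normalize.
        ObservationNorm(loc=0.0, scale=1.0, in_keys=["observation"]),
        # Rewards live under ("next", "reward") in sampled batches.
        RewardScaling(loc=0.0, scale=0.01, in_keys=[("next", "reward")]),
    )
)

sample = data.sample()
print(sample["observation"].mean())   # unchanged by the placeholder normalization
print(sample["next", "reward"].abs().max())  # rewards scaled by 0.01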
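To get observations with roughly zero mean and unit variance, loc and scale have to come from the data. The sketch below derives them in plain Python, assuming the transform applies the affine form y = x * scale + loc; that convention and the helper affine_standardizer are assumptions of this example, so check the transform's documentation before reusing the numbers.

```python
import statistics


def affine_standardizer(values, eps=1e-8):
    """Return (loc, scale) such that x * scale + loc has zero mean and unit std."""
    mean = statistics.fmean(values)
    std = max(statistics.pstdev(values), eps)  # guard against constant features
    scale = 1.0 / std
    loc = -mean / std
    return loc, scale


observations = [2.0, 4.0, 6.0, 8.0]  # stand-in for one observation feature
loc, scale = affine_standardizer(observations)
normalized = [x * scale + loc for x in observations]

print(statistics.fmean(normalized), statistics.pstdev(normalized))
```

In practice the statistics would be computed per feature over a large sample of the dataset rather than over four scalar values.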

Optional Dependencies Summary

D4RL

Package: d4rl (optional; direct_download=True works without it)
Install: pip install d4rl, or use direct_download=True
Datasets: MuJoCo locomotion, AntMaze, Adroit, Kitchen

Minari

Package: minari
Install: pip install minari
Datasets: Robotics, manipulation, navigation (Farama Foundation)

LeRobot

Package: huggingface_hub, datasets
Install: pip install huggingface_hub datasets
Datasets: Real and simulated robot manipulation (Hugging Face Hub)

Open X-Embodiment

Package: tensorflow_datasets, tensorflow
Install: pip install tensorflow_datasets
Datasets: 527 skills across 22 robot embodiments (RLDS format)
All dataset loaders cache their data as memory-mapped TensorDicts under ~/.cache/torchrl/<dataset_name>/ by default. Set the root argument or the TORCHRL_DATA_PATH environment variable to redirect the cache to a custom location — for example, a fast local NVMe drive or a shared network filesystem.
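One way to picture the cache lookup is the plain-Python sketch below. The precedence (explicit root first, then TORCHRL_DATA_PATH, then the default) and the choice to append the dataset name to the environment path are assumptions of this example, and resolve_cache_root is a hypothetical helper, not a TorchRL function.

```python
import os
from pathlib import Path


def resolve_cache_root(dataset_name, root=None, env=None):
    """Pick the cache directory for a dataset (assumed precedence, see lead-in)."""
    env = os.environ if env is None else env
    if root is not None:
        # An explicit root argument wins over everything else.
        return Path(root).expanduser()
    env_root = env.get("TORCHRL_DATA_PATH")
    if env_root:
        return Path(env_root).expanduser() / dataset_name
    return Path.home() / ".cache" / "torchrl" / dataset_name


print(resolve_cache_root("d4rl", env={}))
print(resolve_cache_root("d4rl", env={"TORCHRL_DATA_PATH": "/mnt/nvme"}))
print(resolve_cache_root("d4rl", root="/data/offline-rl"))
```

Passing env explicitly, as done here, keeps the example independent of whatever is set in the current shell.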
