Offline RL Dataset Loaders and Experience Replay Classes

TorchRL ships built-in loaders for the most widely used offline RL and robot-learning datasets. Every loader extends BaseDatasetExperienceReplay, which is itself a TensorDictReplayBuffer, so offline data plugs directly into the same training loop as any online buffer: call dataset.sample(batch_size) to get a TensorDict, apply transforms via dataset.append_transform(t), and checkpoint with dataset.dumps(path). Data is stored in TorchRL’s TED (TorchRL Episode Data) format: a TensorDict with observation and action at the root, and the post-step observation, reward, done, terminated, and truncated entries under the nested next key (for example ("next", "reward")).

Most dataset classes have optional external dependencies (e.g., d4rl, minari, lerobot). Import errors are caught at module load time and re-raised with a helpful message when the class is instantiated. See the individual class sections below for the required package.

Base Class: `BaseDatasetExperienceReplay`

All dataset loaders inherit from this class, which extends TensorDictReplayBuffer with:

Automatic download on first instantiation (controlled by download).
A TensorStorage backed by memory-mapped tensors for zero-copy disk access.
An ImmutableDatasetWriter to prevent accidental writes.
A class-level available_datasets property listing all downloadable dataset IDs.

Import path: torchrl.data.datasets.common.BaseDatasetExperienceReplay

D4RL Datasets

`D4RLExperienceReplay`

Wraps the D4RL offline RL benchmark (offline datasets collected from MuJoCo locomotion tasks, AntMaze, Adroit hand manipulation, and Kitchen). Data can be loaded either through the d4rl package or, without it, by direct HTTP download.

Import path: torchrl.data.datasets.D4RLExperienceReplay
Requires: d4rl package (optional; pass direct_download=True to skip)

dataset_id (str, required): Dataset identifier, e.g. "hopper-medium-v2", "maze2d-umaze-v1", "pen-expert-v1". See D4RLExperienceReplay.available_datasets for the full list.

batch_size (int, required): Batch size returned by sample(). Can be overridden at call time: dataset.sample(64).

sampler (Sampler): Index sampler. Defaults to RandomSampler.

transform (Transform): Transform applied after each sample.

split_trajs (bool, default: False): Split transitions into individual trajectories (padded to equal length) using the done signal (done = terminated | truncated).

from_env (bool, default: False): Collect the dataset from a live d4rl environment rather than a pre-saved HDF5 file. Requires the d4rl package.

direct_download (bool | None, default: None): Download raw HDF5 files without the d4rl package. If None, falls back to direct_download=True when d4rl is not installed. Incompatible with from_env=True.

use_truncated_as_done (bool, default: True): Set done = terminated | truncated. When False, only terminated is used.

root (Path | str): Root directory for cached datasets. Defaults to ~/.cache/torchrl/d4rl.

download (bool | str, default: True): Download if not cached. Pass "force" to overwrite an existing cache.
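The split_trajs and use_truncated_as_done options can be pictured with a small pure-Python sketch. It is not TorchRL's implementation: the helpers compute_done and split_trajectories are hypothetical, plain lists stand in for tensors, and the padding value and mask layout are assumptions.

```python
def compute_done(terminated, truncated, use_truncated_as_done=True):
    """done = terminated | truncated, or terminated alone when the flag is off."""
    if use_truncated_as_done:
        return [t or tr for t, tr in zip(terminated, truncated)]
    return list(terminated)


def split_trajectories(values, done, pad_value=0):
    """Cut a flat sequence after every done flag and pad to equal length.

    Returns (padded, mask), where mask marks the real (non-padded) entries.
    """
    trajs, current = [], []
    for value, is_done in zip(values, done):
        current.append(value)
        if is_done:
            trajs.append(current)
            current = []
    if current:  # trailing transitions without a final done flag
        trajs.append(current)
    max_len = max(len(t) for t in trajs)
    padded = [t + [pad_value] * (max_len - len(t)) for t in trajs]
    mask = [[True] * len(t) + [False] * (max_len - len(t)) for t in trajs]
    return padded, mask


rewards = [1, 2, 3, 4, 5, 6]
terminated = [False, False, True, False, False, False]
truncated = [False, False, False, False, True, False]

done = compute_done(terminated, truncated)
padded, mask = split_trajectories(rewards, done)
print(padded)  # [[1, 2, 3], [4, 5, 0], [6, 0, 0]]

# Ignoring truncation merges the last two segments into one trajectory.
done_term_only = compute_done(terminated, truncated, use_truncated_as_done=False)
padded_term_only, _ = split_trajectories(rewards, done_term_only)
print(padded_term_only)  # [[1, 2, 3], [4, 5, 6]]
```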

The three examples below show basic usage, usage with transforms, and loading without d4rl.

from torchrl.data.datasets import D4RLExperienceReplay

# Load the medium Hopper dataset (downloads on first run)
data = D4RLExperienceReplay("hopper-medium-v2", batch_size=256)
print(len(data))   # ~1,000,000 transitions

sample = data.sample()
print(sample)
# TensorDict with keys: observation, action, next/observation,
#                        next/reward, next/done, next/terminated, ...

from torchrl.data.datasets import D4RLExperienceReplay
from torchrl.envs import ObservationNorm

data = D4RLExperienceReplay(
    "hopper-medium-v2",
    batch_size=256,
    split_trajs=True,
)
data.append_transform(
    ObservationNorm(loc=0.0, scale=1.0, in_keys=["observation"])
)

for batch in data:  # epoch-style iteration
    train_step(batch)  # train_step is a placeholder for your own update function

from torchrl.data.datasets import D4RLExperienceReplay

# Direct download — no d4rl installation required
data = D4RLExperienceReplay(
    "walker2d-medium-replay-v2",
    batch_size=512,
    direct_download=True,
)

Atari DQN Dataset

`AtariDQNExperienceReplay`

The DQN Replay Dataset: five training runs of DQN on each of 60 Atari 2600 games, each run spanning 200 million frames (50 million agent steps at frame-skip 4). Each run is chunked into 50 shards of 1 million transitions each; data is formatted on the fly at sample time to avoid storing redundant (obs, next_obs) pairs.

Import path: torchrl.data.datasets.AtariDQNExperienceReplay
Requires: gsutil (for downloading from Google Cloud Storage)
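The idea of building (obs, next_obs) pairs at sample time rather than storing both can be sketched as follows. This is an illustration under stated assumptions, not the loader's actual code: frames are stored once in a flat list, episode_ends is a hypothetical set of indices marking the last frame of each episode, and sample_transitions is a made-up helper.

```python
import random


def sample_transitions(frames, episode_ends, batch_size, rng):
    """Pair each sampled frame with its successor at sample time.

    Only indices whose successor belongs to the same episode are eligible,
    so no pair ever straddles an episode boundary.
    """
    valid = [i for i in range(len(frames) - 1) if i not in episode_ends]
    indices = [rng.choice(valid) for _ in range(batch_size)]
    return [
        {"observation": frames[i], "next_observation": frames[i + 1]}
        for i in indices
    ]


# Integers stand in for image frames; two episodes end at indices 4 and 9.
frames = list(range(10))
episode_ends = {4, 9}

batch = sample_transitions(frames, episode_ends, batch_size=6, rng=random.Random(0))
for item in batch:
    print(item)
```

Each frame is stored once, yet every sampled item still carries both the current and the next observation.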

dataset_id (str, required): "<game>/<run>", e.g. "Pong/1", "Breakout/5". See AtariDQNExperienceReplay.available_datasets.

batch_size (int, required): Batch size for sample().

root (Path | str): Root directory for cached files. Defaults to ~/.cache/torchrl/atari.

download (bool | str, default: True): Download on first use. Pass "force" to re-download.

sampler (Sampler): Index sampler. Supports SliceSampler for sequence sampling.

num_slices (int): If provided, wraps the buffer in a SliceSampler returning this many non-overlapping sub-sequences per batch.

slice_len (int): Fixed sub-sequence length. Mutually exclusive with num_slices.
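The relationship between batch_size, num_slices, and slice_len can be worked through with a short sketch. The helper resolve_slices is hypothetical, and the requirement that batch_size divide evenly is an assumption made here for the arithmetic, not a statement about the loader's internals.

```python
def resolve_slices(batch_size, num_slices=None, slice_len=None):
    """Return (num_slices, slice_len) for a batch, or None if neither is set."""
    if num_slices is not None and slice_len is not None:
        raise ValueError("num_slices and slice_len are mutually exclusive.")
    if num_slices is None and slice_len is None:
        return None  # no sequence sampling requested
    if num_slices is not None:
        if batch_size % num_slices:
            raise ValueError("batch_size must be divisible by num_slices.")
        return num_slices, batch_size // num_slices
    if batch_size % slice_len:
        raise ValueError("batch_size must be divisible by slice_len.")
    return batch_size // slice_len, slice_len


by_count = resolve_slices(32, num_slices=8)
by_length = resolve_slices(32, slice_len=16)
print(by_count)   # (8, 4): 8 sub-sequences of 4 steps each
print(by_length)  # (2, 16): 2 sub-sequences of 16 steps each
```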

from torchrl.data.datasets import AtariDQNExperienceReplay

# Load run 1 of Pong (downloads ~10 GB on first use)
data = AtariDQNExperienceReplay(
    dataset_id="Pong/1",
    batch_size=32,
    num_slices=8,   # 8 non-overlapping sub-sequences per batch
)

sample = data.sample()
print(sample["observation"].shape)  # torch.Size([32, 1, 84, 84]) — grayscale frames

Minari Datasets

`MinariExperienceReplay`

Loads datasets from Minari, the Farama Foundation's dataset library, which covers robotics (Gymnasium-Robotics, D4RL-compatible MuJoCo), manipulation, and navigation tasks.

Import path: torchrl.data.datasets.MinariExperienceReplay
Requires: minari package (pip install minari)

dataset_id (str, required): Dataset ID from the Minari registry, e.g. "door-human-v1", "pen-expert-v1". Use MinariExperienceReplay.available_datasets to list all registered IDs.

batch_size (int, required): Batch size for sample().

root (Path | str): Root directory for cached datasets. Defaults to ~/.cache/torchrl/minari.

download (bool | str, default: True): Download on first use. Pass "force" to overwrite an existing cache.

split_trajs (bool, default: False): Split into individual trajectory tensors using done = truncated | terminated.

load_from_local_minari (bool, default: False): Load directly from the local Minari cache (~/.minari/datasets) without a network request. Useful for custom or private datasets.

transform (Transform): Post-processing transform applied to every sampled batch.

from torchrl.data.datasets import MinariExperienceReplay

data = MinariExperienceReplay("door-human-v1", batch_size=32, download="force")

for batch in data:
    # batch keys: observation, action,
    #              next/observation, next/reward, next/done, ...
    print(batch["action"].shape)  # e.g. torch.Size([32, 28])
    break

Minari datasets may contain text/non-tensor info entries. TorchRL currently discards non-tensor data from these entries. If your training requires text metadata, open an issue on TorchRL’s GitHub.

LeRobot Datasets

`LeRobotExperienceReplay`

LeRobot is Hugging Face’s dataset collection for robot manipulation. Datasets are hosted on the Hugging Face Hub and include video observations, proprioceptive state, language instructions, and actions from real and simulated robot arms.

Import path: torchrl.data.datasets.LeRobotExperienceReplay
Requires: huggingface_hub, datasets (pip install huggingface_hub datasets)

dataset_id (str, required): Hugging Face Hub repository ID, e.g. "lerobot/pusht", "lerobot/aloha_sim_transfer_cube_human".

batch_size (int, required): Batch size for sample().

root (Path | str): Local cache directory. Defaults to ~/.cache/torchrl/lerobot.

download (bool | str, default: True): Download from the Hub on first use.

sampler (Sampler): Index sampler. Defaults to SliceSampler for sequence-aware sampling.

transform (Transform): Post-processing transform. Use torchrl.envs.transforms.DecodeVideoTransform to decode lazy VideoClipRef frames on the fly.

key_map (dict[str, NestedKey]): Override the default column-to-TensorDict key mapping. The default maps "action" → "action", "observation.state" → ("observation", "state"), "observation.images.<cam>" → ("observation", "image", "<cam>"), etc.

num_slices (int): Number of episode slices per batch. Passed to the default SliceSampler.

slice_len (int): Length of each sampled slice. Mutually exclusive with num_slices.
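The default column-to-key mapping and the effect of a key_map override can be approximated in plain Python. This is only a sketch: default_key and map_columns are hypothetical helpers, and the fallback rule for dotted names not listed above (split on dots into a tuple) is an assumption.

```python
def default_key(column):
    """Map a dotted LeRobot column name to a string or nested (tuple) key."""
    parts = column.split(".")
    # "observation.images.<cam>" -> ("observation", "image", "<cam>")
    if len(parts) == 3 and parts[:2] == ["observation", "images"]:
        return ("observation", "image", parts[2])
    if len(parts) == 1:
        return column  # e.g. "action" stays "action"
    return tuple(parts)  # assumed fallback, e.g. "observation.state"


def map_columns(columns, key_map=None):
    """Apply key_map overrides first, then fall back to the default rule."""
    key_map = key_map or {}
    return {col: key_map.get(col, default_key(col)) for col in columns}


columns = ["action", "observation.state", "observation.images.top"]

defaults = map_columns(columns)
print(defaults)

# Override a single column while leaving the others on the default mapping.
overridden = map_columns(columns, key_map={"observation.state": "state"})
print(overridden["observation.state"])  # state
```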

The two examples below show basic usage and on-the-fly video decoding.

from torchrl.data.datasets import LeRobotExperienceReplay

data = LeRobotExperienceReplay(
    "lerobot/pusht",
    batch_size=64,
    num_slices=8,  # 8 trajectories per batch, each of length 8
)

sample = data.sample()
print(sample["observation", "state"].shape)   # e.g. torch.Size([64, 2])
print(sample["action"].shape)                  # e.g. torch.Size([64, 2])
print(sample["episode"].shape)                 # episode IDs

from torchrl.data.datasets import LeRobotExperienceReplay
from torchrl.envs.transforms import DecodeVideoTransform

data = LeRobotExperienceReplay(
    "lerobot/aloha_sim_transfer_cube_human",
    batch_size=32,
)
# Decode video frames lazily — only on sample
data.append_transform(
    DecodeVideoTransform(in_keys=[("observation", "image", "top")])
)

sample = data.sample()
print(sample["observation", "image", "top"].shape)
# torch.Size([32, C, H, W]) — decoded video frames

`lerobot_columns_to_tensordict`

Utility function that converts a dictionary of LeRobot-style columnar data (e.g., from a datasets.Dataset row batch) into a canonical VLA-format TensorDict.

from torchrl.data.datasets.lerobot import lerobot_columns_to_tensordict
import torch

columns = {
    "observation.state": torch.zeros(4, 7),
    "observation.images.top": torch.zeros(4, 3, 8, 8, dtype=torch.uint8),
    "action": torch.zeros(4, 7),
    "episode_index": torch.tensor([0, 0, 1, 1]),
    "task": ["pick", "pick", "place", "place"],
}

td = lerobot_columns_to_tensordict(columns)
print(td["observation", "state"].shape)     # torch.Size([4, 7])
print(td["observation", "image", "top"].shape)  # torch.Size([4, 3, 8, 8])
print(td.get("language_instruction").tolist())  # ['pick', 'pick', 'place', 'place']

Open X-Embodiment

`OpenXExperienceReplay`

The Open X-Embodiment dataset collection: over 1 million real robot trajectories spanning 22 robot embodiments and 527 skills. Backed by TensorFlow Datasets (TFDS) and streamed lazily via TorchRL’s RLDS adapter.

Import path: torchrl.data.datasets.OpenXExperienceReplay
Requires: tensorflow_datasets, tensorflow (pip install tensorflow_datasets)

dataset_id (str, required): TFDS dataset name, e.g. "fractal20220817_data", "bridge", "kuka". Full list at OpenXExperienceReplay.available_datasets.

batch_size (int, required): Batch size for sample().

root (Path | str): Local TFDS data directory.

download (bool | str, default: True): Download and prepare data on first use.

sampler (Sampler): Index sampler.

split (str, default: "train"): Dataset split. Most Open X datasets only have "train".

from torchrl.data.datasets import OpenXExperienceReplay

data = OpenXExperienceReplay(
    "fractal20220817_data",
    batch_size=32,
    root="/data/openx",
)

sample = data.sample()
print(sample.keys())

Roboset

`RobosetExperienceReplay`

Roboset is a large-scale multi-task robot manipulation dataset from the RoboHive project.

Import path: torchrl.data.datasets.RobosetExperienceReplay
Requires: h5py (pip install h5py)

dataset_id (str, required): Dataset ID. See RobosetExperienceReplay.available_datasets.

batch_size (int, required): Batch size for sample().

root (Path | str): Cache directory. Defaults to ~/.cache/torchrl/roboset.

download (bool | str, default: True): Download on first use.

GenDGRL

`GenDGRLExperienceReplay`

The GenDGRL dataset is a synthetic, procedurally generated dataset for goal-conditioned RL research.

Import path: torchrl.data.datasets.GenDGRLExperienceReplay

dataset_id (str, required): Dataset split identifier. See GenDGRLExperienceReplay.available_datasets.

batch_size (int, required): Batch size for sample().

root (Path | str): Cache directory. Defaults to ~/.cache/torchrl/gen_dgrl.

download (bool | str, default: True): Download on first use.

VD4RL

`VD4RLExperienceReplay`

VD4RL provides pixel-based D4RL transitions collected from DMControl environments, matching the standard D4RL quality tiers (random, medium, medium-replay, expert).

Import path: torchrl.data.datasets.VD4RLExperienceReplay

dataset_id (str, required): Dataset ID, e.g. "cheetah_run/medium/64px". See VD4RLExperienceReplay.available_datasets.

batch_size (int, required): Batch size for sample().

root (Path | str): Cache directory. Defaults to ~/.cache/torchrl/vd4rl.

Integration with `TensorDictReplayBuffer`

Dataset loaders return TensorDictReplayBuffer instances. You can mix offline and online data by combining two buffers with ReplayBufferEnsemble, or by using OfflineToOnlineReplayBuffer:

from torchrl.data import (
    OfflineToOnlineReplayBuffer,
    TensorDictReplayBuffer,
    LazyTensorStorage,
)
from torchrl.data.datasets import D4RLExperienceReplay

offline = D4RLExperienceReplay("hopper-medium-v2", batch_size=128)
online  = TensorDictReplayBuffer(
    storage=LazyTensorStorage(max_size=100_000),
    batch_size=128,
)

# 50% offline data, 50% online data
mixed = OfflineToOnlineReplayBuffer(
    replay_buffer=offline,
    storage=online._storage,
    batch_size=256,
)
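To make the idea of a fixed offline/online mixing ratio concrete, here is a library-free sketch of drawing one mixed batch. It does not reproduce how ReplayBufferEnsemble or OfflineToOnlineReplayBuffer work internally: sample_mixed and offline_fraction are hypothetical names, and lists of tagged tuples stand in for the two buffers.

```python
import random


def sample_mixed(offline, online, batch_size, offline_fraction=0.5, rng=random):
    """Draw a batch with a fixed share of offline and online transitions."""
    n_offline = round(batch_size * offline_fraction)
    if not online:  # nothing collected online yet: fall back to offline only
        n_offline = batch_size
    n_online = batch_size - n_offline
    batch = [rng.choice(offline) for _ in range(n_offline)]
    batch += [rng.choice(online) for _ in range(n_online)]
    rng.shuffle(batch)  # avoid a fixed offline-then-online ordering
    return batch


offline_data = [("offline", i) for i in range(1000)]
online_data = [("online", i) for i in range(50)]

rng = random.Random(0)
mixed_batch = sample_mixed(offline_data, online_data, batch_size=256, rng=rng)
n_off = sum(1 for source, _ in mixed_batch if source == "offline")
print(n_off, len(mixed_batch) - n_off)  # 128 128

# Before any online data exists, the whole batch comes from the offline set.
warmup_batch = sample_mixed(offline_data, [], batch_size=256, rng=rng)
```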

You can also add transforms to any dataset for preprocessing:

from torchrl.data.datasets import D4RLExperienceReplay
from torchrl.envs import ObservationNorm, RewardScaling, Compose

data = D4RLExperienceReplay("hopper-medium-v2", batch_size=256)
data.append_transform(
    Compose(
        ObservationNorm(loc=0.0, scale=1.0, in_keys=["observation"]),
        RewardScaling(loc=0.0, scale=0.01),
    )
)

sample = data.sample()
print(sample["observation"].mean())   # unchanged here: loc=0.0, scale=1.0 is an identity normalization
print(sample["next", "reward"].abs().max())  # scaled

Optional Dependencies Summary

D4RL

Package: d4rl (optional; direct_download=True works without it)
Install: pip install d4rl or use direct_download=True
Datasets: MuJoCo locomotion, AntMaze, Adroit, Kitchen

Minari

Package: minari
Install: pip install minari
Datasets: Robotics, manipulation, navigation (Farama Foundation)

LeRobot

Package: huggingface_hub, datasets
Install: pip install huggingface_hub datasets
Datasets: Real and simulated robot manipulation (Hugging Face Hub)

Open X-Embodiment

Package: tensorflow_datasets, tensorflow
Install: pip install tensorflow_datasets
Datasets: 527 skills across 22 robot embodiments (RLDS format)

All dataset loaders cache their data as memory-mapped TensorDicts under ~/.cache/torchrl/<dataset_name>/ by default. Set the root argument or the TORCHRL_DATA_PATH environment variable to redirect the cache to a custom location — for example, a fast local NVMe drive or a shared network filesystem.
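The cache-location rules above can be summarized in a small sketch. The helper resolve_root is hypothetical, and two details are assumptions not stated in the text: that an explicit root takes precedence over TORCHRL_DATA_PATH, and that the dataset name is appended to the environment-variable path.

```python
import os
from pathlib import Path


def resolve_root(dataset_name, root=None, env=None):
    """Pick a cache directory: explicit root, then env var, then the default."""
    env = os.environ if env is None else env
    if root is not None:
        return Path(root).expanduser()  # assumed to win over the env var
    base = env.get("TORCHRL_DATA_PATH")
    if base:
        return Path(base).expanduser() / dataset_name  # assumed layout
    return Path.home() / ".cache" / "torchrl" / dataset_name


explicit = resolve_root("d4rl", root="/data/d4rl_cache", env={})
from_env = resolve_root("d4rl", env={"TORCHRL_DATA_PATH": "/mnt/nvme"})
default = resolve_root("d4rl", env={})

print(explicit)
print(from_env)
print(default)
```

Passing env explicitly keeps the sketch deterministic; leaving it out reads the real process environment.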
