Tensor Specs: Typing Actions, Observations, and Rewards

TorchRL specs are metadata containers that describe what a tensor should look like — its shape, dtype, device, and the mathematical domain it lives in. They are attached to every environment under observation_spec, action_spec, reward_spec, and done_spec, and are propagated automatically through transforms and wrappers. Specs serve three main purposes: shape and dtype checking (catching bugs before a backward pass), automatic action sampling via spec.rand() (random agents, exploration layers, and testing), and transform validation (ensuring that a transform’s output remains in the expected domain).

The spec class names were shortened in recent TorchRL versions. The canonical names (Bounded, Categorical, OneHot, Composite, Unbounded, etc.) are exported from torchrl.data. The legacy long names (BoundedTensorSpec, DiscreteTensorSpec, CompositeSpec, …) are no longer available and should not be used.

Why Specs Matter

from torchrl.data import Bounded, Categorical, Composite
import torch

# Define what an environment's spaces look like
obs_spec   = Bounded(low=-1.0, high=1.0, shape=(8,), dtype=torch.float32)
action_spec = Categorical(n=4, shape=())

# Sample a random observation and action — no environment needed
obs    = obs_spec.rand()     # tensor in [-1, 1]^8
action = action_spec.rand()  # tensor in {0, 1, 2, 3}

# Check membership
print(obs_spec.is_in(obs))     # True
print(action_spec.is_in(torch.tensor(10)))  # False — 10 ≥ n=4

# Project an out-of-domain value back into the spec
clipped = obs_spec.project(torch.tensor([2.0, -2.0, 0.0, 0.5, 0.5, 0.5, 0.5, 0.5]))

Base Class: `TensorSpec`

All spec classes inherit from TensorSpec. It is an abstract dataclass with four core fields: shape, space, dtype, and device. Import path: torchrl.data.TensorSpec

Common Operations

Every concrete spec supports these operations regardless of domain:

Operation	Description
`spec.rand(shape=None)`	Return a uniformly random tensor in the spec’s domain. For unbounded specs, samples from a standard normal.
`spec.zero(shape=None)`	Return a zero-filled tensor. Does not validate that `0` belongs to the domain.
`spec.one(shape=None)`	Return a ones-filled tensor.
`spec.encode(val)`	Cast a numpy array, list, or tensor to the spec’s dtype/device.
`spec.is_in(val)`	Return `True` if `val` satisfies all constraints (shape, dtype, bounds).
`spec.project(val)`	Clamp or otherwise map `val` back into the valid domain.
`spec.assert_is_in(val)`	Like `is_in`, but raises `AssertionError` on failure.
`spec.to(dest)`	Cast to a different device or dtype.
`spec.expand(*shape)`	Broadcast the spec to a larger batch shape.
`spec.squeeze(dim)` / `spec.unsqueeze(dim)`	Remove or add singleton dimensions.
`spec.clone()`	Deep-copy the spec.
`spec.sample(shape=None)`	Alias for `rand`.

Continuous Specs

`Bounded` / `BoundedContinuous`

A spec for tensors whose values lie in a closed box [low, high]. The constructor dispatches automatically to BoundedContinuous for floating-point dtypes and BoundedDiscrete for integer dtypes. Pass domain="continuous" or domain="discrete" to override this. Import path: torchrl.data.Bounded (subclass: torchrl.data.BoundedContinuous)

low

float | torch.Tensor | np.ndarray

required

Lower bound. Must be broadcastable with high and shape.

high

float | torch.Tensor | np.ndarray

required

Upper bound. Must be broadcastable with low and shape.

shape

torch.Size | int

Shape of the tensors produced by this spec. Inferred from low/high if not provided.

dtype

torch.dtype | str

Tensor dtype. Defaults to torch.get_default_dtype() (usually float32). Determines whether BoundedContinuous or BoundedDiscrete is instantiated.

device

str | int | torch.device

Device for generated tensors.

domain

str

Override automatic domain detection: "continuous" or "discrete".

from torchrl.data import Bounded
import torch

# Continuous spec for a 4-dimensional observation in [-1, 1]
obs_spec = Bounded(low=-1.0, high=1.0, shape=(4,), dtype=torch.float32)
print(obs_spec.rand())   # tensor in [-1, 1]^4
print(obs_spec.is_in(torch.zeros(4)))   # True
print(obs_spec.is_in(torch.full((4,), 2.0)))  # False

# Per-dimension bounds via tensors
action_spec = Bounded(
    low=torch.tensor([-1.0, -0.5]),
    high=torch.tensor([1.0, 0.5]),
)

`Unbounded` / `UnboundedContinuous`

A spec for tensors with no lower or upper bound. The internal box spans the full range of the dtype (torch.finfo.min to torch.finfo.max for floats). For floating-point dtypes, rand() draws from N(0, 1). The constructor dispatches to UnboundedContinuous for float dtypes and UnboundedDiscrete for integer dtypes. Import path: torchrl.data.Unbounded (subclasses: torchrl.data.UnboundedContinuous, torchrl.data.UnboundedDiscrete)

shape

torch.Size | int

default:"torch.Size([1])"

Shape of tensors. Default is a single-element vector (1,).

dtype

torch.dtype | str

Tensor dtype. Floating-point → UnboundedContinuous; integer → UnboundedDiscrete.

device

str | int | torch.device

Device for generated tensors.

from torchrl.data import Unbounded, UnboundedContinuous, UnboundedDiscrete
import torch

reward_spec = Unbounded(shape=(1,), dtype=torch.float32)
# Equivalent to:
reward_spec = UnboundedContinuous(shape=(1,))

count_spec = UnboundedDiscrete(shape=(1,), dtype=torch.int64)
print(reward_spec.rand())  # draws from N(0, 1)

Discrete Specs

`Categorical`

A scalar-indexed categorical spec. Values are integers in {0, 1, ..., n-1}. More memory-efficient than OneHot for large categorical variables. Use n=-1 for an environment whose action space is dynamically sized; call spec.set_provisional_n(k) before sampling. Import path: torchrl.data.Categorical

n

int

required

Number of possible outcomes. Use -1 for a dynamically-sized space.

shape

torch.Size

Shape of the output tensor. Defaults to torch.Size([]) (scalar).

dtype

torch.dtype | str

default:"torch.int64"

Tensor dtype.

device

str | int | torch.device

Device for generated tensors.

mask

torch.Tensor | None

Boolean mask of shape broadcastable to (*shape, n). False prevents an outcome from being sampled.

from torchrl.data import Categorical
import torch

# Discrete action space with 6 actions
action_spec = Categorical(n=6)
print(action_spec.rand())       # tensor in {0..5}
print(action_spec.rand((4,)))   # 4 independent samples

# Masking: only allow actions 0, 2, 4
mask = torch.tensor([True, False, True, False, True, False])
action_spec.update_mask(mask)
print(action_spec.rand())  # always 0, 2, or 4

`OneHot`

One-hot encoding of a categorical variable. The last dimension of the tensor has size n and exactly one element is True (or 1). Enables differentiable indexing via tensor multiplication. Import path: torchrl.data.OneHot

n

int

required

Number of categories. The last dimension of every sample is n.

shape

torch.Size

Total shape including the one-hot dimension. If provided, shape[-1] must equal n.

dtype

torch.dtype | str

default:"torch.bool"

Tensor dtype. Common choices: torch.bool, torch.int64, torch.float32.

device

str | int | torch.device

Device for generated tensors.

mask

torch.Tensor | None

Boolean mask to prevent specific outcomes from being sampled.

from torchrl.data import OneHot
import torch

spec = OneHot(n=5, dtype=torch.float32)
sample = spec.rand()
print(sample)          # e.g. tensor([0., 0., 1., 0., 0.])
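print(sample.sum())    # tensor(1.): exactly one entry is set

# Convert between one-hot and categorical:
categ = spec.to_categorical(sample)  # e.g. tensor(2)
# Map the index back to one-hot through the equivalent Categorical spec
back  = spec.to_categorical_spec().to_one_hot(categ)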

`MultiOneHot`

Concatenation of multiple one-hot vectors along the last dimension. Useful for multi-dimensional discrete action spaces (e.g., multiple discrete sub-actions at each step). Import path: torchrl.data.MultiOneHot

nvec

Sequence[int]

required

Number of categories for each sub-dimension. The total length of the output tensor is sum(nvec).

shape

torch.Size

Total shape of the sampled tensors. If provided, shape[-1] must equal sum(nvec).

dtype

torch.dtype

default:"torch.bool"

Tensor dtype.

device

str | int | torch.device

Device for generated tensors.

mask

torch.Tensor | None

Flat boolean mask whose last dimension has size sum(nvec); any leading dimensions are batch dimensions.

from torchrl.data import MultiOneHot
import torch

# Two independent sub-actions: 3 choices and 2 choices
spec = MultiOneHot(nvec=(3, 2))
sample = spec.rand()
print(sample)        # e.g. tensor([False, True, False, True, False])
print(sample.shape)  # torch.Size([5])  (= 3+2)

# Decode back to integer indices
indices = spec.to_categorical(sample)  # tensor([1, 0])

`MultiCategorical`

Integer-valued counterpart of MultiOneHot. Each element of the output vector is an independent category index. More memory-efficient than MultiOneHot for large action spaces. Import path: torchrl.data.MultiCategorical

nvec

Sequence[int]

required

Number of categories for each independent sub-action.

shape

torch.Size

Total shape of the sampled tensors. If provided, shape[-1] must equal len(nvec).

dtype

torch.dtype

default:"torch.int64"

Tensor dtype.

device

str | int | torch.device

Device for generated tensors.

mask

torch.Tensor | None

Boolean mask whose last dimension has size sum(nvec), following one-hot flattening conventions; any leading dimensions are batch dimensions.

`Binary`

A spec for binary vectors where each element is independently 0 or 1. Unlike OneHot, multiple elements may be active simultaneously. Uses torch.int8 by default. Import path: torchrl.data.Binary

n

int

Length of the binary vector. Either n or shape must be provided. If both are given, shape[-1] must equal n.

shape

torch.Size

Total tensor shape. shape[-1] determines the vector length.

dtype

torch.dtype

default:"torch.int8"

Tensor dtype. Also supports torch.bool.

device

str | int | torch.device

Device for generated tensors.

from torchrl.data import Binary
import torch

spec = Binary(n=4, shape=(2, 4))
print(spec.rand())
# e.g. tensor([[0, 1, 1, 0],
#               [1, 1, 1, 1]], dtype=torch.int8)

`Choice`

A spec that draws samples from a fixed set of allowed values. Unlike Categorical, values can be arbitrary tensors rather than indices. Import path: torchrl.data.Choice

values

torch.Tensor

required

The allowable tensor values, stacked along the first dimension.

shape

torch.Size

Leading batch dimensions.

dtype

torch.dtype

Tensor dtype. Inferred from values if not provided.

device

str | int | torch.device

Device for generated tensors.

Composite Spec

`Composite`

A dictionary-like container for nested specs. Analogous to TensorDict but for spec metadata. Each leaf is a TensorSpec; leaves can be None to indicate unconstrained entries. The shape attribute acts like a TensorDict’s batch_size — it is the common leading shape for all leaf specs. Import path: torchrl.data.Composite

**kwargs

key=TensorSpec

Named specs as keyword arguments.

shape

tuple | torch.Size

Batch shape shared by all leaves.

device

torch.device | None

Shared device constraint. None (default) allows leaves on different devices.

data_cls

type

The TensorDict subclass (TensorDict, a tensorclass, etc.) that should be enforced in the environment.

from torchrl.data import Composite, Bounded, Categorical, Unbounded
import torch

# Environment observation spec
obs_spec = Composite(
    pixels=Bounded(
        low=torch.zeros(3, 84, 84, dtype=torch.uint8),
        high=torch.full((3, 84, 84), 255, dtype=torch.uint8),
    ),
    proprioception=Bounded(low=-10.0, high=10.0, shape=(27,)),
)

# Nested composite specs are created automatically for tuple keys:
full_spec = Composite(
    {("next", "observation"): Bounded(low=-1.0, high=1.0, shape=(8,))},
    action=Categorical(n=4),
)

# Sampling returns a TensorDict:
td = obs_spec.rand()
print(type(td))          # tensordict.TensorDict
print(td["pixels"].shape)  # torch.Size([3, 84, 84])
print(td["proprioception"].shape)  # torch.Size([27])

# Iterate over keys:
for key, spec in obs_spec.items():
    print(key, spec.shape)

Key methods:

Method	Description
`spec[key]` / `spec[key] = value`	Get or set a leaf spec by key (string or tuple of strings).
`spec.keys(include_nested, leaves_only)`	Iterate over keys.
`spec.values()` / `spec.items()`	Iterate over values or key-value pairs.
`spec.rand()`	Sample a TensorDict with one random tensor per leaf.
`spec.zero()`	Return a zero-filled TensorDict.
`spec.is_in(td)`	Check that every leaf of `td` satisfies its spec.
`spec.project(td)`	Clamp all out-of-domain leaves back into their specs.
`spec.encode(td)`	Encode all leaves.

Non-Tensor Spec

`NonTensor`

A spec for non-tensor data (strings, Python objects, metadata). Used for NonTensorData leaves in TensorDicts, e.g., task descriptions or environment info dicts. Import path: torchrl.data.NonTensor

shape

torch.Size

Batch shape.

device

str | int | torch.device

Device. Non-tensor data is typically device-agnostic.

Spec Operations Reference

from torchrl.data import Bounded, Composite
import torch

spec = Bounded(low=-1.0, high=1.0, shape=(4,))

# --- Shape manipulation ---
batched = spec.expand(8, 4)    # shape (8, 4), same bounds
unsqueezed = spec.unsqueeze(0) # shape (1, 4)

# --- Type and device ---
gpu_spec = spec.to("cuda")
f64_spec = spec.to(torch.float64)

# --- Sampling ---
single = spec.rand()           # shape (4,)
batch  = spec.rand((32,))      # shape (32, 4)
zeros  = spec.zero((8,))       # shape (8, 4)

# --- Validation ---
val = torch.randn(4) * 2       # may be out of bounds
print(spec.is_in(val))         # possibly False
clipped = spec.project(val)    # clamps to [-1, 1]
spec.assert_is_in(clipped)     # raises AssertionError if fails

# --- encode (numpy/list → tensor) ---
import numpy as np
arr = np.array([0.1, -0.2, 0.5, 0.3])
tensor = spec.encode(arr)      # torch.Tensor on spec.device

# --- Composite spec operations ---
comp = Composite(obs=spec, action=Bounded(low=0.0, high=1.0, shape=(2,)))
td = comp.rand()
comp.assert_is_in(td)

Using Specs with Environments

Specs are automatically defined on every torchrl.envs.EnvBase subclass and can be inspected or modified:

from torchrl.envs import GymEnv, TransformedEnv, ObservationNorm

env = GymEnv("HalfCheetah-v4")

# Inspect the observation spec
print(env.observation_spec)
print(env.action_spec)
print(env.reward_spec)

# Collect a short rollout with random actions
td = env.rollout(10)
# Check the environment against its specs (this runs its own short rollout)
env.check_env_specs()  # validates observation_spec, action_spec, reward_spec

# Specs propagate through transforms:
tenv = TransformedEnv(env, ObservationNorm(loc=0, scale=1))
print(tenv.observation_spec)  # updated by the transform

Use torchrl.envs.utils.make_composite_from_td(tensordict) to automatically generate a Composite spec from an observed TensorDict. This is a convenient starting point when building a custom environment.
