The helper module provides utility functions for processing data and preparing inputs for the Alpamayo R1 model.

create_message

Constructs a message structure with images for chain-of-thought reasoning.
def create_message(frames: torch.Tensor) -> list[dict]

Parameters
  • frames (torch.Tensor, required): Input frames with shape (N, C, H, W), where:
      • N: number of frames
      • C: number of channels (3 for RGB)
      • H: image height
      • W: image width

Returns
  • list[dict]: A list of message dictionaries in chat format containing:
      • a system message with the driving-assistant role
      • a user message with images and trajectory-history placeholders
      • an assistant message with the chain-of-thought start token

Example

import torch
from alpamayo_r1 import helper

# Load data from dataset
data = load_physical_aiavdataset(clip_id, t0_us=5_100_000)

# Flatten camera and frame dimensions
frames = data["image_frames"].flatten(0, 1)

# Create message structure
messages = helper.create_message(frames)
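The returned list follows the three-role chat layout described above. The sketch below is illustrative only: the role ordering mirrors the documented structure, but every string in it is a stand-in, since the actual prompt text, trajectory placeholders, and start token are internal to the library.

```python
def create_message_sketch(num_frames: int) -> list[dict]:
    # Illustrative only: the role layout mirrors helper.create_message,
    # but all strings below are stand-ins, not the library's real prompts.
    return [
        {"role": "system",
         "content": [{"type": "text", "text": "<driving assistant system prompt>"}]},
        {"role": "user",
         "content": [{"type": "image"} for _ in range(num_frames)]
                    + [{"type": "text", "text": "<trajectory history placeholder>"}]},
        {"role": "assistant",
         "content": [{"type": "text", "text": "<chain-of-thought start token>"}]},
    ]

messages = create_message_sketch(num_frames=4)
```

One image entry is emitted per input frame, so a frames tensor with N = 4 yields four image entries in the user turn.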

get_processor

Get the processor for the Qwen3-VL-2B-Instruct model with a custom tokenizer.
def get_processor(tokenizer: AutoTokenizer) -> AutoProcessor

Parameters
  • tokenizer (AutoTokenizer, required): The tokenizer to use with the processor, typically obtained from the model's tokenizer.

Returns
  • AutoProcessor: An AutoProcessor configured with:
      • min_pixels: 163840
      • max_pixels: 196608
      • the custom tokenizer passed as input

Example

from alpamayo_r1.models.alpamayo_r1 import AlpamayoR1
from alpamayo_r1 import helper

model = AlpamayoR1.from_pretrained("nvidia/Alpamayo-R1-10B")
processor = helper.get_processor(model.tokenizer)

# Apply chat template
inputs = processor.apply_chat_template(
    messages,
    tokenize=True,
    return_dict=True,
    return_tensors="pt"
)

to_device

Recursively cast data to the specified device and dtype.
def to_device(
    data: Any,
    device: str | torch.device | None = None,
    dtype: torch.dtype | None = None
) -> Any

Parameters
  • data (Any, required): Data to transfer. Can be:
      • torch.Tensor: transferred directly to the device/dtype
      • dict: all values processed recursively
      • list/tuple: all elements processed recursively
      • other types: returned unchanged
  • device (str | torch.device | None, default: None): Target device (e.g., "cuda", "cpu", or a torch.device object)
  • dtype (torch.dtype | None, default: None): Target dtype (e.g., torch.bfloat16, torch.float32)

Returns
  • Any: The input data with all tensors transferred to the specified device and dtype

Example

from alpamayo_r1 import helper
import torch

model_inputs = {
    "tokenized_data": inputs,
    "ego_history_xyz": data["ego_history_xyz"],
    "ego_history_rot": data["ego_history_rot"],
}

# Transfer all inputs to CUDA
model_inputs = helper.to_device(model_inputs, "cuda")

# Transfer with specific dtype
model_inputs = helper.to_device(model_inputs, "cuda", torch.bfloat16)
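The recursive traversal described above can be sketched as follows. This is a simplified re-implementation for illustration, not the library's actual source; it duck-types on a `.to(...)` method, which torch.Tensor provides, so tensors are cast while other leaves pass through unchanged.

```python
def to_device_sketch(data, device=None, dtype=None):
    """Simplified sketch of to_device's recursion (not the library's source)."""
    if hasattr(data, "to"):  # torch.Tensor (and tensor-likes) expose .to()
        kwargs = {}
        if device is not None:
            kwargs["device"] = device
        if dtype is not None:
            kwargs["dtype"] = dtype
        return data.to(**kwargs)
    if isinstance(data, dict):
        return {k: to_device_sketch(v, device, dtype) for k, v in data.items()}
    if isinstance(data, (list, tuple)):
        return type(data)(to_device_sketch(v, device, dtype) for v in data)
    return data  # non-tensor leaves (str, int, None, ...) are returned unchanged
```

Because the recursion rebuilds dicts, lists, and tuples as it goes, arbitrarily nested input batches come back with the same shape, only with their tensors moved.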

Constants

The module defines the following constants:
  • MIN_PIXELS: 163840 - Minimum number of pixels for image processing
  • MAX_PIXELS: 196608 - Maximum number of pixels for image processing
  • BASE_PROCESSOR_NAME: "Qwen/Qwen3-VL-2B-Instruct" - Base processor model name
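These pixel bounds are consistent with a 32-pixel patch grid (163840 = 160 · 32² and 196608 = 192 · 32²). Assuming the processor applies the aspect-preserving resize scheme used by Qwen-VL-style processors (an assumption about internals, not something this module documents), the effective image resolution can be sketched as:

```python
import math

FACTOR = 32          # assumption: 16-px patches with 2x2 merging
MIN_PIXELS = 163840  # 160 * 32 * 32
MAX_PIXELS = 196608  # 192 * 32 * 32

def smart_resize(h, w, factor=FACTOR, min_pixels=MIN_PIXELS, max_pixels=MAX_PIXELS):
    """Sketch of the aspect-preserving resize used by Qwen-VL-style processors."""
    # Snap each side to a multiple of the patch-grid factor.
    h_bar = round(h / factor) * factor
    w_bar = round(w / factor) * factor
    # Scale down if the image exceeds the pixel budget...
    if h_bar * w_bar > max_pixels:
        beta = math.sqrt((h * w) / max_pixels)
        h_bar = math.floor(h / beta / factor) * factor
        w_bar = math.floor(w / beta / factor) * factor
    # ...or up if it falls below the minimum.
    elif h_bar * w_bar < min_pixels:
        beta = math.sqrt(min_pixels / (h * w))
        h_bar = math.ceil(h * beta / factor) * factor
        w_bar = math.ceil(w * beta / factor) * factor
    return h_bar, w_bar
```

Under these assumptions, a 720x1280 frame would be resized to land within the [MIN_PIXELS, MAX_PIXELS] budget while keeping its aspect ratio.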
