Class: ReasoningVLA
The `ReasoningVLA` class is the base model for reasoning-enabled Vision-Language-Action tasks. It combines a vision-language model (VLM) backbone with trajectory tokenization capabilities.
Inherits from: PreTrainedModel, TrajectoryFusionMixin
Location: alpamayo_r1.models.base_model.ReasoningVLA
Constructor
Parameters:
- Configuration object containing VLM settings, trajectory tokenizer configurations, and model parameters.
- Dictionary of pretrained PyTorch modules. Can include:
  - `"vlm"`: Pretrained vision-language model
  - `"traj_tokenizer"`: Pretrained trajectory tokenizer
- Original vocabulary size of the VLM before adding trajectory tokens.
- Whether to log total and trainable parameter counts during initialization.
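The module dictionary accepted by the constructor can be sketched as below. The `"vlm"` and `"traj_tokenizer"` keys come from the description above; the variable names and loading calls are illustrative stand-ins, not the real API:

```python
# Illustrative fragment only: the dict keys are documented above, but the
# objects assigned to them here are hypothetical placeholders.
modules = {
    "vlm": pretrained_vlm,             # e.g., a Qwen3-VL model instance
    "traj_tokenizer": traj_tokenizer,  # pretrained trajectory tokenizer
}
```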
Class Methods
from_pretrained_submodules
Parameters:
- Configuration object specifying the VLM to load and tokenizer settings.
Returns:
- Initialized `ReasoningVLA` model with pretrained VLM backbone and tokenizers loaded from the paths specified in the config.
Instance Methods
fuse_traj_tokens
Parameters:
- Input token IDs tensor with shape `[B, n_token]` containing placeholder trajectory tokens.
- Dictionary containing trajectory data:
  - `ego_history_xyz`: History positions, `[B, n_traj, T, 3]`
  - `ego_history_rot`: History rotations, `[B, n_traj, T, ...]`
  - `ego_future_xyz`: (Optional) Future positions
  - `ego_future_rot`: (Optional) Future rotations
Returns:
- Input IDs with trajectory placeholder tokens replaced by actual encoded trajectory tokens. Shape: `[B, n_token]`.

get_input_embeddings
The embedding layer from the VLM’s language model.
get_output_embeddings
The output embedding layer from the VLM.
tie_weights
Ties the input and output embedding weights of the underlying VLM (standard `PreTrainedModel` behavior).
Attributes
- The vision-language model backbone (e.g., `Qwen3VLForConditionalGeneration`).
- Tokenizer with trajectory tokens and special tokens added.
- Trajectory tokenizer for encoding future trajectories to discrete tokens.
- Trajectory tokenizer for encoding history trajectories. Defaults to `traj_tokenizer` if not separately configured.
- Mapping of special token names to their token IDs.
- Original vocabulary size before adding trajectory tokens.
Example Usage
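The repository's exact loading and inference calls are not reproduced here. As a self-contained illustration of the fusion mechanism described under `fuse_traj_tokens`, the sketch below replaces placeholder trajectory token IDs one-for-one with encoded trajectory token IDs, preserving sequence length (mirroring the `[B, n_token]` input and output shape). All token IDs and the toy "tokenizer" are hypothetical stand-ins, not the model's real values:

```python
# Self-contained sketch of the placeholder-fusion idea behind fuse_traj_tokens.
# PLACEHOLDER_ID and TRAJ_TOKEN_BASE are hypothetical example values.

PLACEHOLDER_ID = 151700   # stands in for the <|traj_history|> placeholder ID
TRAJ_TOKEN_BASE = 151710  # stands in for the ID of the first <i0> trajectory token

def encode_trajectory(xyz):
    """Toy 'trajectory tokenizer': bucket each x coordinate into a small vocab."""
    return [TRAJ_TOKEN_BASE + int(x) % 8 for x, _, _ in xyz]

def fuse(input_ids, traj_ids):
    """Replace each placeholder ID with the next encoded trajectory token ID,
    keeping the sequence length unchanged."""
    it = iter(traj_ids)
    return [next(it) if tok == PLACEHOLDER_ID else tok for tok in input_ids]

history_xyz = [(0.0, 0.0, 0.0), (1.0, 0.1, 0.0), (2.0, 0.3, 0.0)]
prompt_ids = [101, 102, PLACEHOLDER_ID, PLACEHOLDER_ID, PLACEHOLDER_ID, 103]
fused = fuse(prompt_ids, encode_trajectory(history_xyz))
print(fused)
```

In the real model the placeholders are written into the prompt by the tokenizer's special tokens and the replacement runs over batched tensors; this sketch only shows the one-for-one substitution that keeps the `[B, n_token]` shape intact.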
Special Tokens
The model adds the following special tokens to the tokenizer:
- `<|traj_history|>`: History trajectory placeholder
- `<|traj_future|>`: Future trajectory placeholder
- `<|traj_history_start|>`: History trajectory start marker
- `<|traj_history_end|>`: History trajectory end marker
- `<|traj_future_start|>`: Future trajectory start marker
- `<|traj_future_end|>`: Future trajectory end marker

These tokens are added when `add_special_tokens=True` in the config.
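The six special-token strings follow a regular history/future naming pattern and can be built programmatically. The snippet below constructs only the strings themselves; the token IDs a real tokenizer would assign are not shown:

```python
# Build the six special-token strings for history/future trajectories,
# matching the names listed above.
SPECIAL_TOKENS = [
    f"<|traj_{kind}{suffix}|>"
    for kind in ("history", "future")
    for suffix in ("", "_start", "_end")
]
print(SPECIAL_TOKENS)
```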
Notes
- The model automatically resizes the VLM's token embeddings to accommodate trajectory tokens.
- Trajectory tokens are discrete tokens of the form `<i0>`, `<i1>`, …, `<i{vocab_size-1}>`.
- The `TrajectoryFusionMixin` provides the `fuse_traj_tokens` functionality.
- Currently supports Qwen3-VL as the VLM backend.
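The discrete trajectory token strings described in the notes follow the pattern `<i0>` through `<i{vocab_size-1}>` and can be generated directly. The `vocab_size` value below is a hypothetical example, not the model's actual setting:

```python
# Generate the discrete trajectory token strings <i0>, <i1>, ..., <i{vocab_size-1}>.
# vocab_size = 8 is an illustrative value only.
vocab_size = 8
traj_tokens = [f"<i{i}>" for i in range(vocab_size)]
print(traj_tokens)
```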