helper module provides utility functions for processing data and preparing inputs for the Alpamayo R1 model.
create_message
Constructs a message structure with images for chain-of-thought reasoning.Input frames with shape
(N, C, H, W) where:- N: number of frames
- C: number of channels (3 for RGB)
- H: image height
- W: image width
A list of message dictionaries in chat format containing:
- System message with driving assistant role
- User message with images and trajectory history placeholders
- Assistant message with chain-of-thought start token
Example
get_processor
Get the processor for the Qwen3-VL-2B-Instruct model with custom tokenizer.The tokenizer to use with the processor. Typically obtained from the model’s tokenizer.
An AutoProcessor configured with:
min_pixels: 163840max_pixels: 196608- Custom tokenizer from input parameter
Example
to_device
Recursively cast data to the specified device and dtype.Data to transfer. Can be:
torch.Tensor: directly transferred to device/dtypedict: recursively processes all valueslist/tuple: recursively processes all elements- Other types: returned unchanged
Target device (e.g.,
"cuda", "cpu", or torch.device object)Target dtype (e.g.,
torch.bfloat16, torch.float32)Data with all tensors transferred to the specified device and dtype
Example
Constants
The module defines the following constants:MIN_PIXELS: 163840 - Minimum number of pixels for image processingMAX_PIXELS: 196608 - Maximum number of pixels for image processingBASE_PROCESSOR_NAME: “Qwen/Qwen3-VL-2B-Instruct” - Base processor model name