Built-in model implementations live in src/models/. Each model exposes a build_graph (or equivalent) function that registers the full computation graph into a Graph instance. You then compile that graph, load weights, and run inference or training, all on the GPU.
SmolLM2
Decoder-only transformer language model with GQA, RoPE, RMSNorm, and SwiGLU. Includes a 135M preset and KV-cache graphs for efficient autoregressive decoding.
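To make the GQA part concrete, here is a minimal standalone sketch of how query heads share KV heads (the head counts are illustrative, e.g. 9 query heads over 3 KV heads; this is not the library's actual attention code):

```rust
/// Map a query head to the KV head it shares under grouped-query attention:
/// with n_q query heads and n_kv KV heads, consecutive groups of
/// n_q / n_kv query heads read the same KV head.
fn kv_head_for_query(q: usize, n_q: usize, n_kv: usize) -> usize {
    assert!(n_kv > 0 && n_q % n_kv == 0, "query heads must split evenly over KV heads");
    q / (n_q / n_kv)
}

fn main() {
    // 9 query heads over 3 KV heads: groups of 3 share one KV head.
    let map: Vec<usize> = (0..9).map(|q| kv_head_for_query(q, 9, 3)).collect();
    assert_eq!(map, vec![0, 0, 0, 1, 1, 1, 2, 2, 2]);
}
```

Sharing KV heads this way is what shrinks the KV cache relative to full multi-head attention, which is why it pairs well with the KV-cache decoding graphs mentioned above.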
SmolVLM2
Vision-language model combining a SigLIP-style ViT encoder with a LLaMA-3-variant text decoder. A pixel-shuffle connector bridges vision and text.
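A pixel-shuffle connector trades sequence length for channel width. The following standalone sketch shows that space-to-depth rearrangement on a token grid (shapes and the shuffle factor are illustrative, not the model's actual dimensions):

```rust
/// Space-to-depth rearrangement behind a pixel-shuffle connector: each r x r
/// patch of vision tokens is merged into a single token whose channel
/// dimension grows by r*r, so the sequence length shrinks by r*r.
fn pixel_shuffle(tokens: &[Vec<f32>], h: usize, w: usize, r: usize) -> Vec<Vec<f32>> {
    assert_eq!(tokens.len(), h * w, "expected one token per grid cell");
    assert!(r > 0 && h % r == 0 && w % r == 0, "grid must divide by the shuffle factor");
    let mut out = Vec::with_capacity((h / r) * (w / r));
    for by in 0..h / r {
        for bx in 0..w / r {
            // Concatenate the channels of the r x r patch into one token.
            let mut merged = Vec::with_capacity(tokens[0].len() * r * r);
            for dy in 0..r {
                for dx in 0..r {
                    merged.extend_from_slice(&tokens[(by * r + dy) * w + (bx * r + dx)]);
                }
            }
            out.push(merged);
        }
    }
    out
}

fn main() {
    // A 2x2 grid of 1-channel tokens collapses to a single 4-channel token.
    let tokens = vec![vec![0.0], vec![1.0], vec![2.0], vec![3.0]];
    let out = pixel_shuffle(&tokens, 2, 2, 2);
    assert_eq!(out, vec![vec![0.0, 1.0, 2.0, 3.0]]);
}
```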
SmolVLA
Vision-language-action model for robotics. Extends SmolVLM2 with an action expert that alternates self-attention over action tokens and cross-attention to VLM hidden states.
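The alternating layer structure can be pictured as a per-layer schedule. The even/odd interleaving below is an assumption for illustration only; the real model's exact schedule may differ:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum AttnKind {
    SelfAttn,  // action tokens attend to each other
    CrossAttn, // action tokens attend to VLM hidden states
}

/// One plausible alternation for an action expert: even-indexed layers
/// self-attend, odd-indexed layers cross-attend. Hypothetical schedule,
/// not taken from the actual model definition.
fn layer_schedule(n_layers: usize) -> Vec<AttnKind> {
    (0..n_layers)
        .map(|i| if i % 2 == 0 { AttnKind::SelfAttn } else { AttnKind::CrossAttn })
        .collect()
}

fn main() {
    let sched = layer_schedule(4);
    assert_eq!(
        sched,
        vec![AttnKind::SelfAttn, AttnKind::CrossAttn, AttnKind::SelfAttn, AttnKind::CrossAttn]
    );
}
```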
Stable Diffusion UNet
SD 1.5-compatible U-Net for denoising diffusion. Encoder/decoder with ResBlocks, GroupNorm, SiLU, and skip connections. Includes a training benchmark with MSE loss.
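For reference, the MSE loss used by the training benchmark is plain mean squared error. A standalone sketch (not the library's actual loss op):

```rust
/// Mean squared error over flattened prediction/target buffers:
/// mean((pred - target)^2).
fn mse(pred: &[f32], target: &[f32]) -> f32 {
    assert_eq!(pred.len(), target.len(), "shapes must match");
    let sum: f32 = pred.iter().zip(target).map(|(p, t)| (p - t) * (p - t)).sum();
    sum / pred.len() as f32
}

fn main() {
    // Errors of 0 and 2 give squared errors 0 and 4; their mean is 2.
    assert_eq!(mse(&[1.0, 2.0], &[1.0, 4.0]), 2.0);
}
```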
Common pattern

All built-in models follow the same four-step pattern:

1. Define a config: instantiate the model's config struct, either from a built-in preset or by setting fields manually.
2. Build the graph: call the model's build_graph function, passing a mutable Graph reference and the config. The function registers all parameters and ops and returns the output NodeId.
3. Compile and load weights: compile the graph into a Session, then load HuggingFace-compatible safetensors weights. Linear layer weights stored in [out, in] order must be transposed on load; each model provides a transposed_weight_names helper for this.
4. Run: execute the compiled Session for inference or training, entirely on the GPU.

HuggingFace-compatible parameter names
Every built-in model registers its parameters under the same names as the corresponding HuggingFace safetensors checkpoint. SmolLM2, for example, can load a checkpoint fetched via SafeTensorsModel::download(REPO_ID) without any name remapping.
HuggingFace stores linear layer weights in [out_features, in_features] order, but Meganeura's matmul op expects [in_features, out_features]. Each model therefore provides a transposed_weight_names function that returns the list of tensors that must be transposed on load.
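The transpose itself is a simple row-major rearrangement. A standalone sketch of what happens to each listed weight on load (independent of the library's actual loader):

```rust
/// Transpose a row-major [out_features, in_features] weight buffer (the
/// HuggingFace layout) into row-major [in_features, out_features].
fn transpose(w: &[f32], out_f: usize, in_f: usize) -> Vec<f32> {
    assert_eq!(w.len(), out_f * in_f, "buffer does not match the given shape");
    let mut t = vec![0.0; w.len()];
    for o in 0..out_f {
        for i in 0..in_f {
            // Element (o, i) of the source becomes element (i, o) of the result.
            t[i * out_f + o] = w[o * in_f + i];
        }
    }
    t
}

fn main() {
    // The 2x3 matrix [[1,2,3],[4,5,6]] becomes the 3x2 matrix [[1,4],[2,5],[3,6]].
    let w = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
    assert_eq!(transpose(&w, 2, 3), vec![1.0, 4.0, 2.0, 5.0, 3.0, 6.0]);
}
```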