Meganeura is a Rust library for training and running neural networks on any GPU — from laptops to edge devices — without CUDA dependencies. Models are defined as declarative computation graphs, automatically differentiated, e-graph optimized, and compiled to static GPU dispatch sequences with zero JIT warmup.

Quick Start

Train your first model in minutes with a working MNIST example.

System Requirements

Supported GPUs, drivers, and platforms for Vulkan and Metal backends.

Concepts

Learn how computation graphs, e-graph optimization, and autodiff work together.

API Reference

Full reference for Graph, Session, Trainer, and all neural network layers.

Why Meganeura?

Portable

Runs on Linux, Windows, macOS, iOS, and Android via Vulkan and Metal — no CUDA required.

Zero compile time

No JIT warmup. The execution plan is compiled once at graph build time and runs instantly.

E-graph optimized

Equality saturation via egglog automatically fuses kernels like SwiGLU, MatMul+Add, and RmsNorm.
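To illustrate the idea behind such a fusion rule (this is a toy standalone sketch, not Meganeura's internal representation or its egglog rules), here is the MatMul+Add rewrite expressed as a one-pass term rewrite over a small expression tree:

```rust
// Toy illustration of the rewrite an e-graph rule like
//   (Add (MatMul x w) b)  =>  (FusedMatMulAdd x w b)
// expresses. Equality saturation applies such rules exhaustively over an
// e-graph; this sketch just does a single bottom-up pass over a tree.
#[derive(Debug, PartialEq)]
enum Expr {
    Input(&'static str),
    MatMul(Box<Expr>, Box<Expr>),
    Add(Box<Expr>, Box<Expr>),
    FusedMatMulAdd(Box<Expr>, Box<Expr>, Box<Expr>),
}

// Rewrite Add(MatMul(x, w), b) into a single fused node, recursing first
// so nested occurrences are also fused.
fn fuse(e: Expr) -> Expr {
    match e {
        Expr::Add(lhs, rhs) => match fuse(*lhs) {
            Expr::MatMul(x, w) => Expr::FusedMatMulAdd(x, w, Box::new(fuse(*rhs))),
            other => Expr::Add(Box::new(other), Box::new(fuse(*rhs))),
        },
        Expr::MatMul(a, b) => Expr::MatMul(Box::new(fuse(*a)), Box::new(fuse(*b))),
        other => other,
    }
}

fn main() {
    let e = Expr::Add(
        Box::new(Expr::MatMul(
            Box::new(Expr::Input("x")),
            Box::new(Expr::Input("w")),
        )),
        Box::new(Expr::Input("b")),
    );
    // Prints: FusedMatMulAdd(Input("x"), Input("w"), Input("b"))
    println!("{:?}", fuse(e));
}
```

A real e-graph keeps both the original and rewritten forms and picks the cheapest at extraction time, so fusion never loses the unfused fallback.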

Get started

1. Add Meganeura to your project

Add the dependency to your Cargo.toml:
Cargo.toml
[dependencies]
meganeura = { git = "https://github.com/kvark/meganeura" }
2. Define a computation graph

Build your model as a declarative graph of operations:
use meganeura::Graph;

let mut g = Graph::new();
// Batch of 32 flattened 28x28 MNIST images and their one-hot labels.
let x = g.input("x", &[32, 784]);
let labels = g.input("labels", &[32, 10]);

// First layer: 784 -> 128, with bias and ReLU.
let w1 = g.parameter("w1", &[784, 128]);
let b1 = g.parameter("b1", &[128]);
let h = g.matmul(x, w1);
let h = g.bias_add(h, b1);
let h = g.relu(h);

// Second layer: 128 -> 10 class logits.
let w2 = g.parameter("w2", &[128, 10]);
let logits = g.matmul(h, w2);
let loss = g.cross_entropy_loss(logits, labels);
g.set_outputs(vec![loss]);
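The shapes thread through the graph in the usual matrix-multiply way: a `[32, 784]` batch times a `[784, 128]` weight gives `[32, 128]`, and so on down to the `[32, 10]` logits that pair with the labels. A quick dependency-free check of that arithmetic (illustrative helper only, not a Meganeura API):

```rust
// Shape bookkeeping for the two-layer MLP above.
// matmul([m, k], [k, n]) -> [m, n]; bias_add broadcasts over the last axis.
fn matmul_shape(a: [usize; 2], b: [usize; 2]) -> [usize; 2] {
    assert_eq!(a[1], b[0], "inner dimensions must match");
    [a[0], b[1]]
}

fn main() {
    let x = [32, 784]; // batch of 32 flattened 28x28 images
    let h = matmul_shape(x, [784, 128]); // hidden activations: [32, 128]
    let logits = matmul_shape(h, [128, 10]); // class scores: [32, 10]
    assert_eq!(logits, [32, 10]); // matches the [32, 10] labels input
    println!("{:?} -> {:?} -> {:?}", x, h, logits);
}
```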
3. Build a training session

Meganeura runs autodiff, e-graph optimization, and GPU compilation automatically:
use meganeura::build_session;

let mut session = build_session(&g);
4. Train your model

Use the Trainer to run the training loop:
use meganeura::{TrainConfig, Trainer};

let config = TrainConfig { learning_rate: 0.01, ..Default::default() };
let mut trainer = Trainer::new(session, config);
// `loader` yields training batches; it is constructed elsewhere
// (see the MNIST example in the Quick Start).
let history = trainer.train(&mut loader, 10);
println!("final loss: {:.4}", history.final_loss().unwrap());
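Conceptually, each training step applies a gradient-descent update, w ← w − learning_rate · ∂L/∂w, using the gradients autodiff produced. A dependency-free sketch of that update on a one-parameter toy loss (this illustrates the rule, not Meganeura's optimizer):

```rust
// Gradient descent on L(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
// Each step is the same update a trainer applies to every parameter:
//   w <- w - learning_rate * dL/dw
fn main() {
    let learning_rate = 0.01;
    let mut w = 0.0_f64;
    for _ in 0..1000 {
        let grad = 2.0 * (w - 3.0);
        w -= learning_rate * grad;
    }
    // w converges toward the minimizer at 3.0
    println!("w = {:.4}", w);
}
```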

Built-in model support

Meganeura ships with pre-built graph definitions for popular architectures. Load HuggingFace weights directly and run inference immediately.

SmolLM2

Decoder-only language model with GQA, RoPE, and SwiGLU FFN.

SmolVLM2

Vision-language model with cross-attention between image and text tokens.

Stable Diffusion UNet

Diffusion model UNet with Conv2d, GroupNorm, and cross-attention.
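The SwiGLU feed-forward block mentioned for SmolLM2 is one of the patterns the e-graph optimizer fuses into a single kernel. Its elementwise core is out = swish(gate) · up, where swish(z) = z · sigmoid(z). A scalar illustration of that formula (the fused GPU kernel applies it per element; this is not library code):

```rust
// Elementwise core of the SwiGLU activation:
//   swiglu(gate, up) = swish(gate) * up,  swish(z) = z * sigmoid(z)
fn sigmoid(z: f64) -> f64 {
    1.0 / (1.0 + (-z).exp())
}

fn swiglu(gate: f64, up: f64) -> f64 {
    gate * sigmoid(gate) * up
}

fn main() {
    // swish(1.0) ≈ 0.7311, times 2.0 ≈ 1.4621
    println!("{:.4}", swiglu(1.0, 2.0));
}
```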

HuggingFace Integration

Load safetensors weights from any HuggingFace Hub repository.
