Meganeura provides both high-level nn:: structs for common normalization patterns and low-level Graph ops for direct use in custom architectures.

nn::RmsNorm

RMS normalization: scales the input by the inverse RMS, then multiplies element-wise by a learned weight.

Formula: y = x / sqrt(mean(x²) + eps) * weight

Fields

- weight (NodeId, required): Scale parameter of shape [dim].
- eps (f32, required): Small constant added to the denominator for numerical stability.
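The formula can be checked with a small self-contained Rust sketch. It uses plain slices rather than Meganeura's Graph/NodeId types, and it assumes row-major [seq, dim] storage with statistics taken over the last dimension, consistent with the shapes documented here:

```rust
/// Reference semantics of RMS norm on a row-major [seq, dim] tensor:
/// each row is scaled by 1 / sqrt(mean(x²) + eps), then multiplied
/// element-wise by `weight` (shape [dim]).
fn rms_norm_ref(x: &[f32], weight: &[f32], dim: usize, eps: f32) -> Vec<f32> {
    x.chunks(dim)
        .flat_map(|row| {
            let mean_sq = row.iter().map(|v| v * v).sum::<f32>() / dim as f32;
            let inv_rms = 1.0 / (mean_sq + eps).sqrt();
            row.iter()
                .zip(weight)
                .map(|(v, w)| v * inv_rms * w)
                .collect::<Vec<_>>()
        })
        .collect()
}

fn main() {
    // One row of [seq=1, dim=2] with unit weight: the output row has RMS ≈ 1.
    let y = rms_norm_ref(&[3.0, 4.0], &[1.0, 1.0], 2, 0.0);
    let rms = (y.iter().map(|v| v * v).sum::<f32>() / 2.0).sqrt();
    assert!((rms - 1.0).abs() < 1e-5);
    println!("y = {:?}, rms = {}", y, rms);
}
```

Note that, unlike layer norm, no mean is subtracted: only the scale of each row changes.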

RmsNorm::new

Parameters

- g (&mut Graph, required): The computation graph to register the parameter into.
- name (&str, required): Name for the weight parameter.
- dim (usize, required): Feature dimension (last dimension of the input).
- eps (f32, required): Stability epsilon (e.g. 1e-5 or 1e-6).
let norm = nn::RmsNorm::new(&mut g, "model.layers.0.input_layernorm.weight", 512, 1e-5);

forward

Parameters

- g (&mut Graph, required): The computation graph to append ops to.
- x (NodeId, required): 2D input tensor of shape [seq, dim].

Returns

- NodeId: Normalized output tensor of shape [seq, dim].
let norm = nn::RmsNorm::new(&mut g, "rms", 512, 1e-5);
let y = norm.forward(&mut g, x);
// y shape: [seq, 512]

nn::LayerNorm

Layer normalization with both a learned scale (weight) and a learned shift (bias).

Formula: y = (x - mean(x)) / sqrt(var(x) + eps) * weight + bias

Fields

- weight (NodeId, required): Scale parameter of shape [dim]. Registered as {name}.weight.
- bias (NodeId, required): Shift parameter of shape [dim]. Registered as {name}.bias.
- eps (f32, required): Small constant added to the variance for numerical stability.
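As with RMS norm, the formula can be sketched in plain std-only Rust, independent of the Graph API, assuming row-major [seq, dim] storage and per-row statistics over the last dimension:

```rust
/// Reference semantics of layer norm on a row-major [seq, dim] tensor:
/// each row is shifted to zero mean and scaled to unit variance, then
/// transformed by the learned per-feature `weight` and `bias`.
fn layer_norm_ref(x: &[f32], weight: &[f32], bias: &[f32], dim: usize, eps: f32) -> Vec<f32> {
    x.chunks(dim)
        .flat_map(|row| {
            let mean = row.iter().sum::<f32>() / dim as f32;
            let var = row.iter().map(|v| (v - mean).powi(2)).sum::<f32>() / dim as f32;
            let inv_std = 1.0 / (var + eps).sqrt();
            row.iter()
                .zip(weight.iter().zip(bias))
                .map(|(v, (w, b))| (v - mean) * inv_std * w + b)
                .collect::<Vec<_>>()
        })
        .collect()
}

fn main() {
    // Row [1, 3]: mean 2, variance 1, so the normalized row is [-1, 1].
    let y = layer_norm_ref(&[1.0, 3.0], &[1.0, 1.0], &[0.0, 0.0], 2, 0.0);
    assert!((y[0] + 1.0).abs() < 1e-5 && (y[1] - 1.0).abs() < 1e-5);
    println!("y = {:?}", y);
}
```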

LayerNorm::new

Parameters

- g (&mut Graph, required): The computation graph to register parameters into.
- name (&str, required): Name prefix. Registers {name}.weight and {name}.bias.
- dim (usize, required): Feature dimension (last dimension of the input).
- eps (f32, required): Stability epsilon (e.g. 1e-5).
let norm = nn::LayerNorm::new(&mut g, "encoder.layer_norm", 768, 1e-5);

forward

Parameters

- g (&mut Graph, required): The computation graph to append ops to.
- x (NodeId, required): 2D input tensor of shape [seq, dim].

Returns

- NodeId: Normalized output tensor of shape [seq, dim].
let norm = nn::LayerNorm::new(&mut g, "ln", 768, 1e-5);
let y = norm.forward(&mut g, x);
// y shape: [seq, 768]

Graph normalization ops

g.rms_norm

Applies RMS normalization directly.
Parameters

- x (NodeId, required): 2D input tensor of shape [seq, dim].
- weight (NodeId, required): 1D weight tensor of shape [dim].
- eps (f32, required): Stability epsilon.

Returns

- NodeId: Normalized tensor of shape [seq, dim].
let w = g.parameter("rms_w", &[512]);
let y = g.rms_norm(x, w, 1e-5);

g.layer_norm

Applies standard layer normalization.
Parameters

- x (NodeId, required): 2D input tensor of shape [seq, dim].
- weight (NodeId, required): 1D scale tensor of shape [dim].
- bias (NodeId, required): 1D shift tensor of shape [dim].
- eps (f32, required): Stability epsilon.

Returns

- NodeId: Normalized tensor of shape [seq, dim].
let w = g.parameter("ln.weight", &[768]);
let b = g.parameter("ln.bias", &[768]);
let y = g.layer_norm(x, w, b, 1e-5);

g.group_norm

Group normalization over a flat NCHW tensor. Divides the channels dimension into num_groups groups, each normalized independently.
Parameters

- x (NodeId, required): Flat 1D input tensor representing [N, C, H, W] in NCHW order (total size N*C*H*W).
- weight (NodeId, required): Scale parameter of shape [C].
- bias (NodeId, required): Shift parameter of shape [C].
- batch (u32, required): Batch size N.
- channels (u32, required): Number of channels C. Must be divisible by num_groups.
- spatial (u32, required): Spatial size H * W.
- num_groups (u32, required): Number of groups to divide the channels into.
- eps (f32, required): Stability epsilon.

Returns

- NodeId: Normalized flat tensor of the same shape as the input.
let w = g.parameter("gn.weight", &[64]);
let b = g.parameter("gn.bias", &[64]);
// x is flat [N * 64 * 32 * 32]
let y = g.group_norm(x, w, b, batch, 64, 32 * 32, 32, 1e-5);
During inference optimization, the compiler may automatically fuse adjacent GroupNorm + SiLU sequences into a single GroupNormSilu kernel. This fusion is applied transparently and does not require changes to your graph construction code.
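The grouping, and what the fused GroupNorm + SiLU kernel computes, can be sketched in plain std-only Rust, independent of the Graph API. Taking the mean and variance over each group's (C / num_groups) * spatial elements is an assumption consistent with the parameter descriptions above:

```rust
/// Reference semantics of group norm on a flat NCHW buffer: the C
/// channels are split into `num_groups` groups, and each group's
/// (C / num_groups) * spatial elements are normalized with their own
/// mean and variance before the per-channel weight/bias is applied.
fn group_norm_ref(
    x: &[f32], weight: &[f32], bias: &[f32],
    batch: usize, channels: usize, spatial: usize,
    num_groups: usize, eps: f32,
) -> Vec<f32> {
    let cpg = channels / num_groups; // channels per group
    let mut y = vec![0.0f32; x.len()];
    for n in 0..batch {
        for g in 0..num_groups {
            // Flat indices of every element in this (batch, group) slice.
            let idx: Vec<usize> = (0..cpg)
                .flat_map(|c| {
                    let ch = g * cpg + c;
                    (0..spatial).map(move |s| (n * channels + ch) * spatial + s)
                })
                .collect();
            let m = idx.len() as f32;
            let mean = idx.iter().map(|&i| x[i]).sum::<f32>() / m;
            let var = idx.iter().map(|&i| (x[i] - mean).powi(2)).sum::<f32>() / m;
            let inv_std = 1.0 / (var + eps).sqrt();
            for &i in &idx {
                let ch = (i / spatial) % channels; // recover the channel index
                y[i] = (x[i] - mean) * inv_std * weight[ch] + bias[ch];
            }
        }
    }
    y
}

/// SiLU activation, x * sigmoid(x); a fused GroupNorm + SiLU kernel
/// computes silu(group_norm(x)) in a single pass.
fn silu(v: f32) -> f32 { v / (1.0 + (-v).exp()) }

fn main() {
    // [N=1, C=2, H*W=2] with 2 groups: each channel normalizes on its own.
    let x = [1.0, 3.0, 2.0, 6.0];
    let y = group_norm_ref(&x, &[1.0, 1.0], &[0.0, 0.0], 1, 2, 2, 2, 0.0);
    assert!(y.iter().zip(&[-1.0, 1.0, -1.0, 1.0]).all(|(a, b)| (a - b).abs() < 1e-5));
    let fused: Vec<f32> = y.iter().map(|&v| silu(v)).collect();
    println!("group_norm = {:?}, group_norm + silu = {:?}", y, fused);
}
```

With num_groups = 1 this reduces to per-sample layer norm over [C, H, W]; with num_groups = C it reduces to instance norm.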
