LoRe model classes

LoRe’s three nn.Module classes form the core of the reward basis learning pipeline. LoRe and LoRe_regularized jointly learn a shared basis matrix V (shape [num_features, K]) and a per-user weight matrix W (shape [N, K]) from preference data. PersonalizeBatch holds V fixed and adapts only the user weights, which enables fast personalization for new users at inference time.

LoRe

LoRe performs joint reward basis learning with optional L2 regularization toward a reference SFT vector V_sft. It uses a single Adam optimizer over all parameters and is the simpler of the two basis-learning modules.

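from utils import LoRe

# V_final: reference SFT reward direction, shape [num_features] or [num_features, 1]
# N: number of users; train_features: list of N tensors, each of shape [m_i, num_features]
model = LoRe(
    V_sft=V_final,
    alpha=0.01,
    num_classes=N,
    num_features=4096,
    num_basis_vectors=5,
    num_iterations=1000,
    learning_rate=0.5,
)
# Returns the softmax-normalized W [N, K] and the raw basis V [num_features, K]
W, V = model.train(train_features)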

Constructor parameters

V_sft

torch.Tensor

required

Reference reward direction from the SFT model, shape [num_features] or [num_features, 1]. Used as the L2 regularization target for each basis vector. Pass the final linear layer weights of your reward model.

alpha

float

required

L2 regularization strength. Controls how strongly each column of V is pulled toward V_sft. Set to 0 to disable regularization entirely.

num_classes

int

required

Number of users (training population size N). Determines the number of rows in the learned weight matrix W.

num_features

int

required

Dimensionality of the embedding space F. Must match the feature dimension of your training tensors (e.g., 4096 for Llama-3.1-8B embeddings).

num_basis_vectors

int

required

Number of basis vectors K, i.e. the rank of the factorization. Larger values increase expressivity but also raise the compute cost and the risk of overfitting.

num_iterations

int

default: 1000

Number of Adam gradient steps. The model does not use early stopping; training always runs for exactly this many steps.

learning_rate

float

default: 0.01

Learning rate for the Adam optimizer shared by W and V.

Learned parameters

Parameter	Shape	Description
`W`	`[num_classes, num_basis_vectors]`	Raw logits for per-user basis weights. Each row of `softmax(W, dim=1)` is one user's mixture over basis vectors and lies on the probability simplex.
`V`	`[num_features, num_basis_vectors]`	Reward basis matrix. Each column is a direction in embedding space.
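As a minimal sketch of how the two learned parameters combine, the snippet below uses random stand-in tensors with toy sizes (not trained values). The per-user reward direction `V @ softmax(w)` follows the formula given for PersonalizeBatch further down; that LoRe combines them the same way is an assumption.

```python
import torch

torch.manual_seed(0)
# Toy sizes chosen for illustration only.
num_classes, num_features, num_basis_vectors = 3, 8, 2

W = torch.randn(num_classes, num_basis_vectors)   # raw logits, one row per user
V = torch.randn(num_features, num_basis_vectors)  # one basis direction per column

# Each row of `weights` is non-negative and sums to 1.
weights = torch.softmax(W, dim=1)                 # [num_classes, num_basis_vectors]

# Column c is the reward direction for user c: a convex mix of the columns of V.
user_rewards = V @ weights.T                      # [num_features, num_classes]
```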

Methods

def forward(X: list[Tensor]) -> tuple[Tensor, Tensor]

Computes the negative log-likelihood over all users plus L2 regularization. X is a list of N tensors, each shape [m_i, F], where m_i is the number of preference pairs for user i. Returns (nll, reg).
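The loss described above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `lore_loss_sketch` is a hypothetical helper, each row of `X[i]` is assumed to be a feature difference for one preference pair, the NLL is assumed to take the same log-sigmoid form documented for PersonalizeBatch (without any logit scaling), and the L2 penalty is assumed to be a summed squared distance between each column of V and V_sft.

```python
import torch
import torch.nn.functional as F

def lore_loss_sketch(X, W, V, V_sft, alpha):
    """Return (nll, reg) for per-user preference data X (hypothetical helper)."""
    weights = torch.softmax(W, dim=1)              # [N, K]
    nll = torch.zeros(())
    for i, X_i in enumerate(X):                    # X_i: [m_i, F]
        v_i = V @ weights[i]                       # user i's reward direction, [F]
        logits = X_i @ v_i                         # [m_i]
        nll = nll - F.logsigmoid(logits).mean()    # average over user i's pairs
    # Assumed L2 form: pull every column of V toward the reference vector.
    reg = alpha * ((V - V_sft.reshape(-1, 1)) ** 2).sum()
    return nll, reg

torch.manual_seed(0)
X = [torch.randn(5, 8), torch.randn(3, 8)]         # two users, 5 and 3 pairs
W, V, V_sft = torch.randn(2, 4), torch.randn(8, 4), torch.randn(8)
nll, reg = lore_loss_sketch(X, W, V, V_sft, alpha=0.01)
```

Setting `alpha=0` makes `reg` vanish, which matches the documented way to disable regularization.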

def train(x: list[Tensor]) -> tuple[Tensor, Tensor]

Runs num_iterations steps of Adam, minimizing nll + reg. Returns (softmax(W, dim=1), V): the softmax-normalized weight matrix and the raw basis matrix.

train() calls self.to(device) internally, so you do not need to move the model to GPU before calling it. The returned V is not detached; call .detach() before passing it downstream if you do not want gradients to flow.

LoRe_regularized

LoRe_regularized is the production variant used for PRISM experiments. Key differences from LoRe: it uses cosine similarity regularization instead of L2, maintains separate Adam optimizers for W and V with alternating updates, applies a warmup schedule for alpha, and prunes unused basis vectors after training.

from utils import LoRe_regularized

model = LoRe_regularized(
    V_sft=V_final,
    alpha=1e4,
    num_classes=N,
    num_features=4096,
    num_basis_vectors=25,
    num_iterations=20000,
    learning_rate=0.5,
)
W_kept, V_kept = model.train(train_features)

Constructor parameters

V_sft

torch.Tensor

required

Reference reward direction from the SFT model. Normalized once at construction time (F.normalize(V_sft, dim=0)) and stored as self.V_sft_norm. Used for cosine similarity regularization.

alpha

float

required

Maximum regularization coefficient reached after the warmup period. The actual coefficient applied at each step is computed by _alpha_at_step(step).

num_classes

int

required

Number of users (training population size N).

num_features

int

required

Embedding dimensionality F. Hard-coded to 4096 inside solve_regularized_simplex; ensure your embeddings match.

num_basis_vectors

int

required

Initial number of basis vectors before pruning. After training, vectors whose maximum softmax weight across all users is below 1e-2 are discarded.

num_iterations

int

default: 1000

Total training steps. run_regularized passes 20000; run passes 1000.

learning_rate

float

default: 0.01

Learning rate used by both optimizer_W and optimizer_V.

Methods

@staticmethod
def _prepare_batch(X: list[Tensor]) -> tuple[Tensor, Tensor]

Packs a list of per-user tensors into a single concatenated tensor X_cat of shape [N_total, F], where N_total is the total number of preference pairs across all users, and a user-index label vector y of shape [N_total] (values 0..C-1, where C = num_classes). Called once before the training loop to avoid repeated concatenation.
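A minimal sketch of this packing step, assuming `y[j]` holds the index of the user that row `j` of `X_cat` came from. `prepare_batch_sketch` is a hypothetical stand-in, not the library's method.

```python
import torch

def prepare_batch_sketch(X):
    """Concatenate per-user tensors and build a matching user-index vector."""
    X_cat = torch.cat(X, dim=0)                                  # [N_total, F]
    y = torch.cat([
        torch.full((X_i.shape[0],), i, dtype=torch.long)         # user i repeated m_i times
        for i, X_i in enumerate(X)
    ])                                                           # [N_total]
    return X_cat, y

torch.manual_seed(0)
X = [torch.randn(3, 4), torch.randn(2, 4)]                       # two users, 3 and 2 pairs
X_cat, y = prepare_batch_sketch(X)

# With packed data, per-pair logits need no Python loop over users:
V = torch.randn(4, 3)
weights = torch.softmax(torch.randn(2, 3), dim=1)                # [C, K]
logits = ((X_cat @ V) * weights[y]).sum(dim=1)                   # [N_total]
```

Indexing `weights[y]` selects the right user's mixture for every row, so the result equals `X[i] @ (V @ weights[i])` computed user by user.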

def _forward_from_packed(
    X_cat: Tensor,
    y: Tensor,
    alpha_curr: float,
) -> tuple[Tensor, Tensor, float]

Computes (nll, reg, entropy_loss) from pre-packed data. Cosine regularization is only applied when alpha_curr > 0:

reg = mean(1 - cosine_similarity(V[:, i], V_sft[:, i]))  for i in range(K)
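The cosine penalty can be sketched as below. The sketch assumes V_sft is a single reference direction that is compared against every column of V; `cosine_reg_sketch` is a hypothetical helper name.

```python
import torch
import torch.nn.functional as F

def cosine_reg_sketch(V, V_sft):
    """Mean of (1 - cosine similarity) between each column of V and V_sft."""
    V_sft_norm = F.normalize(V_sft.reshape(-1, 1), dim=0)    # [F, 1], unit length
    cos = F.cosine_similarity(V, V_sft_norm.expand_as(V), dim=0)  # [K]
    return (1.0 - cos).mean()

V_sft = torch.tensor([1.0, 2.0, 3.0])
V_aligned = torch.stack([2.0 * V_sft, 5.0 * V_sft], dim=1)   # columns parallel to V_sft
reg_aligned = cosine_reg_sketch(V_aligned, V_sft)            # close to 0
reg_opposed = cosine_reg_sketch(-V_aligned, V_sft)           # close to 2
```

Because cosine similarity ignores vector length, this penalty constrains only the direction of each basis vector, unlike the L2 penalty used by LoRe.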

def _alpha_at_step(step: int) -> float

Warmup schedule for the regularization coefficient:

Step range	Returned alpha
`< 0.2 * num_iterations`	`0.0`
`0.2 * num_iterations` to `0.8 * num_iterations`	Linear ramp from `0.0` to `alpha`
`>= 0.8 * num_iterations`	`alpha`
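The schedule in the table can be written as a small pure-Python function. This is a sketch of the documented behavior; `alpha_at_step_sketch` is a hypothetical free function that takes num_iterations and alpha as arguments rather than reading them from the model.

```python
def alpha_at_step_sketch(step, num_iterations, alpha):
    """Piecewise-linear warmup: 0, then a linear ramp, then the full alpha."""
    start = 0.2 * num_iterations   # ramp begins here
    end = 0.8 * num_iterations     # ramp ends here
    if step < start:
        return 0.0
    if step >= end:
        return float(alpha)
    # Fraction of the ramp completed, in [0, 1).
    return alpha * (step - start) / (end - start)

# Sample the schedule at a few steps for num_iterations=1000, alpha=1e4.
schedule = [alpha_at_step_sketch(s, 1000, 1e4) for s in (0, 200, 500, 800, 999)]
```

During the first 20% of training the model fits the preference data without the cosine penalty, and the penalty reaches full strength only for the last 20%.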

def train(X: list[Tensor]) -> tuple[Tensor, Tensor]

Alternating-update training loop. Each step:

1. Freeze V, update W with NLL only (alpha_curr=0).
2. Freeze W, update V with NLL + alpha_curr * cosine_reg.
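The two alternating steps can be sketched on toy data as follows. The names optimizer_W and optimizer_V come from the parameter description above; everything else (one user, the toy sizes, the fixed alpha_curr, and freezing by `.detach()`) is an assumption made for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
X = torch.randn(16, 4)                            # toy preference features for one user
W = torch.nn.Parameter(torch.zeros(1, 2))         # one user, two basis vectors
V = torch.nn.Parameter(torch.randn(4, 2) * 0.1)
V_ref = F.normalize(torch.ones(4, 1), dim=0)      # stand-in for the normalized V_sft

optimizer_W = torch.optim.Adam([W], lr=0.1)
optimizer_V = torch.optim.Adam([V], lr=0.1)

def nll(W_, V_):
    v = V_ @ torch.softmax(W_, dim=1)[0]          # reward direction for the single user
    return -F.logsigmoid(X @ v).mean()

def cosine_reg(V_):
    return (1 - F.cosine_similarity(V_, V_ref.expand_as(V_), dim=0)).mean()

alpha_curr = 0.5                                  # fixed here; the real loop uses the warmup schedule
for step in range(20):
    # Step 1: update W only. V is detached, so it receives no gradient.
    optimizer_W.zero_grad()
    nll(W, V.detach()).backward()
    optimizer_W.step()

    # Step 2: update V only, with the cosine penalty added to the NLL.
    optimizer_V.zero_grad()
    (nll(W.detach(), V) + alpha_curr * cosine_reg(V)).backward()
    optimizer_V.step()
```

Keeping separate optimizers means each parameter group has its own Adam moment estimates, so the regularization gradient on V never affects the update statistics for W.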

After training, prunes basis vectors where max_c softmax(W)[c, i] < 1e-2. Returns (W_kept, V_kept) with shapes [N, K_kept] and [F, K_kept].
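The pruning rule can be sketched as below, assuming a basis vector is kept when at least one user assigns it a softmax weight of 1e-2 or more. `prune_basis_sketch` is a hypothetical helper; whether the library renormalizes the kept weights is not stated here, and the sketch does not.

```python
import torch

def prune_basis_sketch(W_logits, V, threshold=1e-2):
    """Drop basis vectors that no user weights at or above `threshold`."""
    weights = torch.softmax(W_logits, dim=1)            # [N, K]
    keep = weights.max(dim=0).values >= threshold       # [K] boolean mask
    return weights[:, keep], V[:, keep]                 # [N, K_kept], [F, K_kept]

# Two users, three basis vectors; the third gets negligible weight from both users.
W_logits = torch.tensor([[5.0, 5.0, -10.0],
                         [5.0, 0.0, -10.0]])
V = torch.randn(4, 3)
W_kept, V_kept = prune_basis_sketch(W_logits, V)
```

In this example the second basis vector survives even though user 1 barely uses it, because user 0 gives it a weight of about 0.5.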

Use run_regularized() rather than instantiating LoRe_regularized directly. It handles the checkpoint-saving pattern and constructs the model with the correct num_features=4096.

PersonalizeBatch

PersonalizeBatch adapts new users to a fixed reward basis V. Only per-user weight vectors w[i] are learned; V is passed as an argument and receives no gradient. This makes it suitable for few-shot personalization after basis training is complete.

from utils import PersonalizeBatch, learn_multiple_few_shot

# After basis training:
W_few_shot = learn_multiple_few_shot(
    train_features_unseen,
    V=V_joint.detach(),
    num_iterations=500,
    learning_rate=0.1,
)

Constructor parameters

num_classes

int

required

Number of users to personalize simultaneously. Sets the length of the ParameterList.

num_features

int

required

Embedding dimensionality F. Only used internally to confirm tensor shapes; not stored as an attribute used during forward.

num_basis_vectors

int

required

Dimensionality K of each user weight vector. Must match the number of columns in V.

num_iterations

int

default: 1000

Number of Adam gradient steps for user adaptation.

learning_rate

float

default: 0.01

Learning rate for the shared Adam optimizer over all w[i].

Learned parameters

Parameter	Shape	Description
`w`	`ParameterList` of `num_classes` vectors, each `[num_basis_vectors]`	Raw logits for each user’s mixture over basis vectors. `softmax(w[i])` is the per-user weight.

Methods

def forward(X: list[Tensor], V: Tensor) -> Tensor

Computes summed NLL across all users. For user i:

V_w = V @ softmax(w[i])        # [F]
logits = X[i] @ V_w / 100.0   # [m_i]
nll_i = -mean(log(sigmoid(logits)))

V is not modified; pass V.detach() to prevent accidental gradient flow into the basis.

def train(X: list[Tensor], V: Tensor) -> list[Tensor]

Runs num_iterations Adam steps. Returns a list of len(X) detached softmax weight vectors, one per user, each shape [num_basis_vectors].
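Putting forward and train together, few-shot adaptation against a fixed basis can be sketched as follows. `personalize_sketch` is a hypothetical function that mirrors the documented formula (including the division by 100.0); the zero initialization of the logits and the toy data are assumptions.

```python
import torch
import torch.nn.functional as F

def personalize_sketch(X, V, num_iterations=200, learning_rate=0.1):
    """Learn one weight vector per user while the basis V stays fixed."""
    V = V.detach()                                       # no gradient flows into the basis
    K = V.shape[1]
    w = torch.nn.ParameterList(
        [torch.nn.Parameter(torch.zeros(K)) for _ in X]  # raw logits, one vector per user
    )
    optimizer = torch.optim.Adam(w.parameters(), lr=learning_rate)
    for _ in range(num_iterations):
        optimizer.zero_grad()
        loss = torch.zeros(())
        for i, X_i in enumerate(X):
            V_w = V @ torch.softmax(w[i], dim=0)         # [F]
            logits = X_i @ V_w / 100.0                   # [m_i]
            loss = loss - F.logsigmoid(logits).mean()    # NLL summed over users
        loss.backward()
        optimizer.step()
    return [torch.softmax(w_i, dim=0).detach() for w_i in w]

# Toy basis: column 0 rewards feature 0, column 1 rewards feature 1.
V = torch.zeros(4, 2)
V[0, 0], V[1, 1] = 100.0, 100.0
X = [torch.tensor([[1.0, 0.0, 0.0, 0.0]] * 3),           # user 0 prefers feature 0
     torch.tensor([[0.0, 1.0, 0.0, 0.0]] * 3)]           # user 1 prefers feature 1
weights = personalize_sketch(X, V, num_iterations=50)
```

Each user ends up with more weight on the basis vector that explains their preferences, while V itself is unchanged.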

PersonalizeBatch is most commonly accessed through the learn_multiple_few_shot() wrapper, which handles instantiation and device placement automatically.
