LoRe is a lightweight, modular framework for learning personalized reward models from multi-user preference data. Rather than training a single monolithic reward model that averages over all users, LoRe decomposes the reward signal into a shared low-rank basis and individual user weights — enabling the model to capture genuine diversity in what different people find valuable in language model outputs.

The problem with monolithic reward models

Standard reward model training aggregates preferences from many annotators into a single scalar signal. This works reasonably well when users agree, but human preferences are not uniform. A user who values conciseness may consistently prefer shorter summaries; another who values detail may prefer longer ones. A monolithic Bradley-Terry model trained on pooled data will find a compromise that satisfies neither.

LoRe addresses this directly: instead of one reward function, it learns a family of reward functions parameterized by a shared low-rank basis V (a linear transformation on fixed reward model embeddings) and per-user mixture weights W. Each user’s reward is a softmax-weighted combination of basis vectors, so the model can represent a diverse population of preferences while keeping the number of parameters manageable.
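
The compromise made by a pooled model can be seen in a toy calculation. The sketch below is not LoRe code; it assumes two hypothetical annotators who label the same summary pair in opposite directions and scans for the single reward gap that minimizes the pooled Bradley-Terry loss. The function name bt_nll and the numbers are illustrative only.

```python
import math

def bt_nll(gap, labels):
    """Mean Bradley-Terry negative log-likelihood for one response pair.

    gap:    r(summary_a) - r(summary_b) under a single shared reward model
    labels: +1 if an annotator preferred a, -1 if they preferred b
    """
    return sum(math.log1p(math.exp(-y * gap)) for y in labels) / len(labels)

# Hypothetical annotators: one prefers the concise summary (a),
# the other prefers the detailed summary (b).
pooled_labels = [+1, -1]

# Scan candidate reward gaps and keep the one with the lowest pooled loss.
candidates = [g / 10 for g in range(-30, 31)]
best_gap = min(candidates, key=lambda g: bt_nll(g, pooled_labels))

print(best_gap)                  # 0.0: the pooled optimum has no preference
print(bt_nll(best_gap, [+1]))    # about 0.693 (log 2): chance level for either user
```

With a gap of zero the pooled model predicts a 50/50 outcome for both annotators, even though each of them is perfectly consistent on their own.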

Key insight: low-rank reward decomposition

The core mathematical idea is to factorize the reward as:
r(x, user_i) = x · V · w_i
where:
  • x is the embedding difference between a chosen and a rejected response (extracted from a pretrained reward model backbone, e.g. Skywork/Skywork-Reward-Llama-3.1-8B-v0.2)
  • V ∈ ℝ^(d × K) is the shared low-rank basis matrix, with K basis vectors learned jointly across all users
  • w_i ∈ Δ^K is the per-user weight vector (on the probability simplex via softmax)
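
The factorization above can be written out in a few lines. This is a minimal pure-Python sketch under stated assumptions, not the implementation in utils.py: the names softmax, lore_reward, and user_logits are hypothetical, and the toy values of x and V (d = 3, K = 2) are made up for illustration.

```python
import math

def softmax(logits):
    """Map unconstrained logits to a point on the probability simplex."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def lore_reward(x, V, user_logits):
    """Compute x · V · w_i with w_i = softmax(user_logits).

    x:           embedding difference, length d
    V:           shared basis stored as d rows of K entries
    user_logits: unconstrained per-user parameters, length K
    """
    K = len(V[0])
    # Project x onto each basis vector: (x · V)_k = sum_j x_j * V_jk
    basis_scores = [sum(x[j] * V[j][k] for j in range(len(x))) for k in range(K)]
    w = softmax(user_logits)
    return sum(s * wk for s, wk in zip(basis_scores, w))

# Toy numbers, chosen only for illustration.
x = [1.0, -2.0, 0.5]
V = [[1.0, 0.0],
     [0.0, 1.0],
     [2.0, 0.0]]

r_first = lore_reward(x, V, [50.0, -50.0])   # nearly all weight on basis vector 1
r_second = lore_reward(x, V, [-50.0, 50.0])  # nearly all weight on basis vector 2
r_even = lore_reward(x, V, [0.0, 0.0])       # equal weight on both
```

Here x · V is [2.0, -2.0], so the first user scores the pair at about 2.0, the second at about -2.0, and the evenly weighted user at 0.0. The same pair of responses is ranked in opposite directions by two users who share the basis V and differ only in their weights.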
When K = 0 the model reduces to the pretrained reference reward. When K = 1 it is equivalent to a standard Bradley-Terry model. As K increases, the model gains capacity to represent more diverse preference patterns. The run() function in utils.py sweeps over a configurable K_list to find the right rank for your dataset.

Once the basis V is learned from seen users, new users can be personalized with only a handful of preference examples by fitting their weight vector w while keeping V fixed. This is the few-shot personalization setting.
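
As a rough illustration of the few-shot setting, the sketch below fits one new user's weights by plain gradient descent on the Bradley-Terry loss while the basis stays frozen. It is an assumption-laden toy, not the procedure in utils.py: fit_user_weights, its steps and lr defaults, and the example scores are all hypothetical, and each input row is assumed to already hold x · V for one preference pair.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fit_user_weights(basis_scores, steps=500, lr=0.5):
    """Fit one new user's simplex weights while the basis V stays frozen.

    basis_scores: one length-K list per preference pair, holding x · V
                  (x = chosen-minus-rejected embedding difference)
    Returns the fitted weight vector w on the probability simplex.
    """
    K = len(basis_scores[0])
    logits = [0.0] * K                          # start from uniform weights
    for _ in range(steps):
        w = softmax(logits)
        grad = [0.0] * K
        for s in basis_scores:
            r = sum(wk * sk for wk, sk in zip(w, s))   # reward gap for this pair
            coeff = -1.0 / (1.0 + math.exp(r))         # d/dr of log(1 + exp(-r))
            for k in range(K):
                # Chain rule through softmax: dr/dlogit_k = w_k * (s_k - r)
                grad[k] += coeff * w[k] * (s[k] - r)
        logits = [z - lr * g / len(basis_scores) for z, g in zip(logits, grad)]
    return softmax(logits)

# Four hypothetical preference pairs, already projected onto a K = 2 basis.
# Basis vector 1 agrees with this user's choices; basis vector 2 disagrees.
few_shot_scores = [[1.5, -1.0], [0.8, -0.5], [2.0, -1.2], [1.1, -0.9]]
w_new = fit_user_weights(few_shot_scores)
```

Only K numbers are fitted per new user, which is why a handful of examples can be enough. In this toy case the fitted weights concentrate on the first basis vector, the one that agrees with the user's four choices.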

Supported datasets

LoRe ships with experiment scripts for three benchmark datasets:
| Dataset | Domain | Script directory |
| --- | --- | --- |
| Reddit TLDR | Summarization preference modeling | RedditTLDR/ |
| PRISM | Multi-turn dialogue response preferences | PRISM/ |
| PersonalLLM | Open-ended LLM response personalization | PersonalLLM/ |
Each dataset directory follows the same three-script workflow: prepare.py → train_basis.py → vary_fewshot.py.

Features

  • Low-rank joint learning: learns basis V and per-user weights W simultaneously across the full user population
  • Few-shot personalization: adapts to new users using only a small number of preference examples by fitting their weight vector against the frozen basis
  • Regularization support: LoRe_regularized adds cosine similarity regularization to the SFT reference model, controlled by a warm-up schedule over training iterations
  • Modular design: dataset preprocessing, training, and evaluation are separated into composable scripts; utils.py exposes all core primitives
  • Evaluation on seen and unseen users: the run() pipeline automatically evaluates train accuracy, seen-user generalization to unseen prompts, and few-shot accuracy for unseen users
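
The accuracies mentioned in the last feature can be read as the fraction of preference pairs on which the chosen response receives the higher reward. The sketch below assumes that definition; preference_accuracy and the per-split reward gaps are hypothetical and are not taken from run().

```python
def preference_accuracy(reward_gaps):
    """Fraction of pairs where the chosen response gets the higher reward.

    reward_gaps: r(chosen) - r(rejected) for each evaluation pair; under the
                 factorization on this page that is x · V · w_i per pair
    """
    if not reward_gaps:
        raise ValueError("need at least one preference pair")
    correct = sum(1 for gap in reward_gaps if gap > 0)
    return correct / len(reward_gaps)

# Hypothetical reward gaps for the three evaluation settings.
splits = {
    "train": [0.9, 1.4, 0.2, -0.1],
    "seen users, unseen prompts": [0.7, -0.3, 0.5, 0.1],
    "unseen users, few-shot": [0.4, -0.2, -0.6, 0.3],
}
report = {name: preference_accuracy(gaps) for name, gaps in splits.items()}
```

Reporting the three numbers side by side separates fitting the training pairs, generalizing to new prompts for known users, and adapting to entirely new users.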

Paper and citation

LoRe is described in detail in the paper:
LoRe: Personalizing LLMs via Low-Rank Reward Modeling
Avinandan Bose, Zhihan Xiong, Yuejie Chi, Simon Shaolei Du, Lin Xiao, Maryam Fazel
arXiv:2504.14439 · 2025
@misc{bose2025lorepersonalizingllmslowrank,
      title={LoRe: Personalizing LLMs via Low-Rank Reward Modeling},
      author={Avinandan Bose and Zhihan Xiong and Yuejie Chi and Simon Shaolei Du and Lin Xiao and Maryam Fazel},
      year={2025},
      eprint={2504.14439},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2504.14439},
}

Where to go next

Quickstart

Run your first LoRe experiment on the Reddit TLDR dataset in three steps.

Installation

Set up Python, PyTorch, and all required dependencies.

Datasets

Learn how Reddit TLDR, PRISM, and PersonalLLM are preprocessed and structured.

API reference

Full documentation for utils.py classes and functions.

License

LoRe is released under the CC-BY-NC 4.0 license. Non-commercial use only. See the LICENSE file in the repository root for the full text.
