LIGER (Linked Inference of Genomic Experimental Relationships) uses integrative non-negative matrix factorization (iNMF) to identify shared and dataset-specific factors across multiple single-cell datasets. SeuratWrappers provides RunOptimizeALS() and RunQuantileNorm() to run LIGER directly on Seurat objects.

iNMF requires non-negative input, so LIGER scales data without centering. Pass do.center = FALSE to ScaleData() before running LIGER, and set split.by so that each dataset is scaled independently.

This workflow requires rliger version 0.5.0 or above. Install or update it from CRAN with install.packages('rliger').
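
For orientation, the iNMF objective described in the LIGER paper cited below can be written as follows. The symbols are notation introduced here for illustration, not identifiers from the SeuratWrappers API: E_i is the scaled expression matrix of dataset i (cells by genes), H_i holds its cell factor loadings (cells by k), W holds the shared gene factors, V_i holds the dataset-specific gene factors, d is the number of datasets, and λ corresponds to the lambda argument.

```latex
\min_{W,\, V_i,\, H_i \,\ge\, 0} \;
\sum_{i=1}^{d} \bigl\lVert E_i - H_i \,(W + V_i) \bigr\rVert_F^2
\;+\; \lambda \sum_{i=1}^{d} \bigl\lVert H_i V_i \bigr\rVert_F^2
```

The first sum is the reconstruction error; the second penalizes the dataset-specific component, which is why a larger λ pulls the datasets toward the shared factors.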

Citation

If you use LIGER in your work, please cite:
Single-Cell Multi-omic Integration Compares and Contrasts Features of Brain Cell Identity
Joshua Welch, Velina Kozareva, Ashley Ferreira, Charles Vanderburg, Carly Martin, Evan Z. Macosko
Cell, 2019
doi: 10.1016/j.cell.2019.05.006
GitHub: https://github.com/welch-lab/liger

Installation

# Install rliger from CRAN
install.packages('rliger')

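# Install SeuratWrappers from GitHub (requires the remotes package)
if (!requireNamespace('remotes', quietly = TRUE)) install.packages('remotes')
remotes::install_github('satijalab/seurat-wrappers')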

Workflow

1. Load libraries and data

library(rliger)
library(Seurat)
library(SeuratData)
library(SeuratWrappers)

InstallData("pbmcsca")
data("pbmcsca")

2. Normalize and identify variable features

pbmcsca <- NormalizeData(pbmcsca)
pbmcsca <- FindVariableFeatures(pbmcsca)

3. Scale data without centering

LIGER requires uncentered scaled data. Use split.by to scale each dataset subset separately.
pbmcsca <- ScaleData(pbmcsca, split.by = "Method", do.center = FALSE)
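
To illustrate why centering is switched off, here is a minimal base-R sketch on a made-up matrix. It uses base::scale() as a stand-in for ScaleData(), so the exact scaling factors are an assumption and may differ from Seurat's; the point is only that uncentered scaling keeps the values non-negative, while centering does not.

```r
# Toy normalized-expression matrix: 3 genes (rows) x 4 cells (columns), all >= 0.
# The values are invented for illustration.
expr <- matrix(c(0, 1, 2, 3,
                 0, 0, 4, 4,
                 1, 1, 1, 5),
               nrow = 3, byrow = TRUE,
               dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4)))

# base::scale() operates on columns, so transpose to scale each gene across cells.
scaled_uncentered <- t(scale(t(expr), center = FALSE, scale = TRUE))
scaled_centered   <- t(scale(t(expr), center = TRUE,  scale = TRUE))

min(scaled_uncentered) >= 0  # TRUE: non-negativity is preserved
min(scaled_centered) < 0     # TRUE: centering introduces negative values
```

Non-negative matrix factorization cannot take the centered matrix as input, which is the reason for do.center = FALSE.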

4. Run iNMF factorization with RunOptimizeALS

RunOptimizeALS() factorizes the scaled data using alternating least squares (ALS) and stores the result as the iNMF_raw reduction.
pbmcsca <- RunOptimizeALS(pbmcsca, k = 20, lambda = 5, split.by = "Method")
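
To make the roles of k and lambda more concrete, the following base-R sketch evaluates an iNMF-style objective on random non-negative matrices. It is not the ALS solver and does not call rliger; the function inmf_objective() and all variable names are hypothetical, and the objective form (reconstruction error plus lambda times a penalty on the dataset-specific part) is taken from the LIGER paper rather than from the package source.

```r
set.seed(1)
k <- 3                      # number of factors
n_genes <- 6
n_cells <- c(a = 5, b = 4)  # two hypothetical datasets

rand_nonneg <- function(nr, nc) matrix(runif(nr * nc), nr, nc)

E <- lapply(n_cells, function(n) rand_nonneg(n, n_genes))  # cells x genes
H <- lapply(n_cells, function(n) rand_nonneg(n, k))        # cells x k
V <- lapply(n_cells, function(n) rand_nonneg(k, n_genes))  # dataset-specific, k x genes
W <- rand_nonneg(k, n_genes)                               # shared, k x genes

# sum_i ||E_i - H_i (W + V_i)||_F^2 + lambda * sum_i ||H_i V_i||_F^2
inmf_objective <- function(E, H, W, V, lambda) {
  recon   <- sum(mapply(function(e, h, v) sum((e - h %*% (W + v))^2), E, H, V))
  penalty <- sum(mapply(function(h, v) sum((h %*% v)^2), H, V))
  recon + lambda * penalty
}

obj_1 <- inmf_objective(E, H, W, V, lambda = 1)
obj_5 <- inmf_objective(E, H, W, V, lambda = 5)
obj_5 > obj_1  # TRUE: the same dataset-specific factors cost more at higher lambda
```

An optimizer minimizing this quantity has a stronger incentive to shrink the dataset-specific factors as lambda grows.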

5. Quantile normalize the joint embeddings

RunQuantileNorm() aligns factor loadings across datasets via quantile normalization, producing the final integrated iNMF reduction.
pbmcsca <- RunQuantileNorm(pbmcsca, split.by = "Method")
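
The idea behind quantile normalization can be sketched in base R for a single factor. This is a deliberately simplified illustration, not rliger's procedure: it ignores the per-cluster handling and kNN refinement, and the simulated loadings and variable names are made up. It maps the quantiles of one dataset's loadings onto those of a reference dataset with a piecewise-linear function.

```r
set.seed(42)
ref   <- rexp(200, rate = 1)    # loadings on one factor in the reference dataset
other <- rexp(150, rate = 0.5)  # same factor in another dataset, on a larger scale

probs   <- seq(0, 1, length.out = 51)  # 50 quantile bins
q_ref   <- quantile(ref, probs)
q_other <- quantile(other, probs)

# Map each value of `other` through the quantile-to-quantile correspondence.
aligned <- approx(x = q_other, y = q_ref, xout = other, rule = 2)$y

range(ref)
range(aligned)  # now spans the same range as the reference
```

Because the mapping is monotone, the ordering of cells along the factor is unchanged; only the distribution is matched to the reference.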

6. Cluster and visualize

Optionally run Louvain clustering on the integrated embedding, then compute UMAP.
pbmcsca <- FindNeighbors(pbmcsca, reduction = "iNMF", dims = 1:20)
pbmcsca <- FindClusters(pbmcsca, resolution = 0.3)
pbmcsca <- RunUMAP(pbmcsca, dims = 1:ncol(pbmcsca[["iNMF"]]), reduction = "iNMF")
DimPlot(pbmcsca, group.by = c("Method", "ident", "CellType"), ncol = 3)
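
Besides inspecting the plot, one rough numeric check of integration is to tabulate clusters against the dataset label. The sketch below uses invented vectors in place of Idents(pbmcsca) and pbmcsca$Method, and the labels methodA and methodB are hypothetical; it flags clusters whose cells all come from a single method.

```r
# Hypothetical stand-ins for cluster identities and the Method metadata column.
cluster <- factor(c(0, 0, 0, 1, 1, 1, 2, 2, 2, 2))
method  <- factor(c("methodA", "methodB", "methodA",
                    "methodA", "methodB", "methodB",
                    "methodA", "methodA", "methodA", "methodA"))

counts <- table(cluster, method)
mixing <- prop.table(counts, margin = 1)  # per-cluster fraction of each method

# Clusters populated by only one method
single_method <- rownames(counts)[rowSums(counts > 0) == 1]
single_method  # "2"
```

A single-method cluster is not necessarily an integration failure, since it may be a cell type that only one dataset captured, but it is a useful place to look first.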

Examples

Interferon-stimulated and control PBMC

InstallData("ifnb")
data("ifnb")
ifnb <- NormalizeData(ifnb)
ifnb <- FindVariableFeatures(ifnb)
ifnb <- ScaleData(ifnb, split.by = "stim", do.center = FALSE)
ifnb <- RunOptimizeALS(ifnb, k = 20, lambda = 5, split.by = "stim")
ifnb <- RunQuantileNorm(ifnb, split.by = "stim")
ifnb <- FindNeighbors(ifnb, reduction = "iNMF", dims = 1:20)
ifnb <- FindClusters(ifnb, resolution = 0.55)
ifnb <- RunUMAP(ifnb, dims = 1:ncol(ifnb[["iNMF"]]), reduction = "iNMF")
DimPlot(ifnb, group.by = c("stim", "ident", "seurat_annotations"), ncol = 3)

Eight human pancreatic islet datasets

InstallData("panc8")
data("panc8")
panc8 <- NormalizeData(panc8)
panc8 <- FindVariableFeatures(panc8)
panc8 <- ScaleData(panc8, split.by = "replicate", do.center = FALSE)
panc8 <- RunOptimizeALS(panc8, k = 20, lambda = 5, split.by = "replicate")
panc8 <- RunQuantileNorm(panc8, split.by = "replicate")
panc8 <- FindNeighbors(panc8, reduction = "iNMF", dims = 1:20)
panc8 <- FindClusters(panc8, resolution = 0.4)
panc8 <- RunUMAP(panc8, dims = 1:ncol(panc8[["iNMF"]]), reduction = "iNMF")
DimPlot(panc8, group.by = c("replicate", "ident", "celltype"), ncol = 3)

Functions

RunOptimizeALS

Runs iNMF factorization via alternating least squares on a merged Seurat object. Stores per-dataset factor loading matrices in the tool slot (accessible with Tool()), and combined cell embeddings in the iNMF_raw reduction by default.
object (Seurat, required): A merged Seurat object. Data must be scaled (without centering) before calling this function.
k (integer, required): Number of factors (latent dimensions) to compute.
split.by (character, default "orig.ident"): Metadata column used to split cells into per-dataset subsets for factorization.
lambda (numeric, default 5): Regularization parameter that weights the penalty on dataset-specific factors. Higher values penalize dataset-specific effects more strongly, so the datasets align more closely.
thresh (numeric, default 1e-6): Convergence threshold. Optimization stops when the relative change in the objective between iterations falls below this value.
max.iters (integer, default 30): Maximum number of ALS iterations.
nrep (integer, default 1): Number of factorization restarts. The run with the lowest final objective is retained.
rand.seed (integer, default 1): Random seed for reproducibility.
reduction.name (character, default "iNMF_raw"): Name under which the raw iNMF embedding is stored.
reduction.key (character, default "riNMF_"): Key prefix for the raw iNMF reduction dimensions.
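
The interplay of thresh and nrep can be sketched in a few lines of base R. The relative-change test below is an assumption about how the threshold is applied, the helper has_converged() is hypothetical, and the objective values are invented; none of this is taken from the rliger source.

```r
# Relative-change convergence test (assumed form).
has_converged <- function(obj_prev, obj, thresh = 1e-6) {
  abs(obj_prev - obj) / mean(c(obj_prev, obj)) < thresh
}

has_converged(100, 99)          # FALSE: roughly a 1% change
has_converged(100, 100 - 1e-5)  # TRUE: roughly a 1e-7 relative change

# With nrep > 1, the restart with the lowest final objective is kept.
# These final objectives are made up for illustration.
final_objectives <- c(seed1 = 152.4, seed2 = 149.8, seed3 = 151.1)
best <- names(which.min(final_objectives))
best  # "seed2"
```

Because the iNMF objective is non-convex, different seeds can end at different local minima, which is what makes restarts worthwhile.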

RunQuantileNorm

Aligns iNMF factor loadings across datasets using quantile normalization. Produces the final integrated embedding stored in the iNMF reduction by default. Also assigns cluster identities to cells via Idents().
object (Seurat, required): A Seurat object after running RunOptimizeALS().
split.by (character, default "orig.ident"): Metadata column identifying which dataset each cell belongs to.
reduction (character, default "iNMF_raw"): Name of the raw iNMF reduction to normalize.
quantiles (integer, default 50): Number of quantile bins used in the normalization procedure.
ref_dataset (character, default NULL): Name of the reference dataset for alignment. When NULL, the largest dataset is used.
min_cells (integer, default 20): Minimum number of cells required in a cluster for it to be used in alignment.
knn_k (integer, default 20): Number of nearest neighbors used in the kNN graph for quantile normalization.
do.center (logical, default FALSE): Whether to center embeddings before normalization. Should match the centering used in ScaleData().
max_sample (integer, default 1000): Maximum number of cells to sample per dataset when computing quantiles.
eps (numeric, default 0.9): Error bound (epsilon) for the approximate nearest neighbor search.
refine.knn (logical, default TRUE): Whether to refine the kNN graph after initial construction.
reduction.name (character, default "iNMF"): Name under which the normalized embedding is stored.
reduction.key (character, default "iNMF_"): Key prefix for the normalized iNMF reduction dimensions.
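
As a rough picture of how cluster identities can come out of a factorization, the LIGER paper describes assigning each cell to the factor on which it loads most heavily. The base-R sketch below shows only that step on an invented matrix; it is an assumption that this mirrors what RunQuantileNorm() does internally, and it omits any kNN-based refinement.

```r
# Toy cell-by-factor matrix (4 cells, 3 factors); the values are made up.
H <- matrix(c(0.9, 0.1, 0.0,
              0.2, 0.7, 0.1,
              0.0, 0.3, 0.6,
              0.8, 0.1, 0.1),
            ncol = 3, byrow = TRUE,
            dimnames = list(paste0("cell", 1:4), paste0("factor", 1:3)))

# Assign each cell to the index of its largest factor loading.
clusters <- factor(apply(H, 1, which.max), levels = seq_len(ncol(H)))
table(clusters)
```

Here cell1 and cell4 fall into cluster 1, cell2 into cluster 2, and cell3 into cluster 3.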

Deprecated Functions

RunSNF() and RunQuantileAlignSNF() are deprecated. Both now redirect to RunQuantileNorm(). Use RunQuantileNorm() directly in all new workflows.
