GLM-PCA applies a generalized linear model framework to perform dimensionality reduction directly on raw count data. Traditional PCA requires normalized and log-transformed counts, which can introduce artifacts, particularly by distorting the mean-variance relationship present in sequencing data. GLM-PCA avoids this by modeling counts under a Poisson or negative binomial likelihood.
Reference
Townes, F. W., Hicks, S. C., Aryee, M. J., & Irizarry, R. A. (2019). Feature selection and dimension reduction for single-cell RNA-Seq based on a multinomial model. Genome Biology. https://doi.org/10.1186/s13059-019-1861-6
Source: willtownes/glmpca · CRAN
Installation
The glmpca package must be installed before using RunGLMPCA(). It is available on both CRAN and GitHub.
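The installation steps can be sketched as follows. The CRAN call is the standard route. The GitHub and Bioconductor lines are left commented out because they assume the remotes and BiocManager helper packages are available, and scry is only needed if you want deviance-based feature selection.

```r
# glmpca from CRAN
install.packages("glmpca")

# Alternatively, the development version from GitHub (assumes the remotes package)
# remotes::install_github("willtownes/glmpca")

# SeuratWrappers, which provides RunGLMPCA(), is installed from GitHub
# remotes::install_github("satijalab/seurat-wrappers")

# Optional: scry, for deviance-based feature selection (assumes BiocManager)
# BiocManager::install("scry")
```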
Why GLM-PCA
Conventional scRNA-seq workflows normalize raw counts and apply a log transformation before PCA. This pipeline:
- Distorts the mean-variance relationship of count data.
- Can inflate the contribution of lowly-expressed genes.
- Introduces a systematic bias when counts are sparse.
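To make the first point concrete, here is a small base-R simulation. It is an illustration only: the mean levels and sample size are arbitrary choices, and it uses plain Poisson draws rather than real sequencing data. It compares the variance of simulated counts before and after a log1p transformation.

```r
set.seed(1)

# Arbitrary mean expression levels, from very sparse to highly expressed
means <- c(0.1, 1, 10, 100)

# Simulate 10,000 Poisson counts per mean level (one column per level)
counts <- sapply(means, function(m) rpois(10000, lambda = m))
colnames(counts) <- paste0("mean_", means)

# For Poisson data the variance of the raw counts tracks the mean
raw_var <- apply(counts, 2, var)

# After log1p the variance still depends on the mean, just in a different way
log_var <- apply(log1p(counts), 2, var)

print(rbind(raw_var, log_var))
```

The log transformation does not stabilize the variance across expression levels. A count model such as the one used by GLM-PCA instead treats the mean-variance relationship as part of the likelihood.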
Key function
RunGLMPCA() — Runs GLM-PCA on a Seurat object and stores the result as a DimReduc object. It uses the counts slot of the specified assay as input.
Example
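A minimal sketch of a typical call. The pbmc3k dataset from the SeuratData package is used as a stand-in; the dataset, the number of variable features, and the choice of 10 dimensions are assumptions made for illustration and are not taken from this page.

```r
library(Seurat)
library(SeuratData)      # assumption: used here only to obtain example data
library(SeuratWrappers)

# Load an example dataset containing raw counts
InstallData("pbmc3k")
pbmc3k <- LoadData("pbmc3k")
pbmc3k <- UpdateSeuratObject(pbmc3k)  # harmless if the object is already current

# RunGLMPCA() defaults to the variable features, so select them first
pbmc3k <- FindVariableFeatures(pbmc3k, nfeatures = 2000, verbose = FALSE)

# Fit GLM-PCA on the raw counts; 10 latent dimensions is an arbitrary choice
ndims <- 10
pbmc3k <- RunGLMPCA(pbmc3k, L = ndims)

# The embedding is stored as a DimReduc object
head(Embeddings(pbmc3k, reduction = "glmpca"))
```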
Visualize results
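One possible way to inspect the result is to build a neighbor graph, cluster, and compute a UMAP on top of the GLM-PCA embedding. The helper below is hypothetical (its name and defaults are not part of SeuratWrappers) and assumes the object already holds a reduction stored under the name "glmpca" with at least `ndims` dimensions.

```r
# Hypothetical helper: cluster and embed cells using an existing "glmpca" reduction.
# Functions are namespace-qualified so the definition does not require library() calls.
embed_glmpca_umap <- function(object, ndims = 10) {
  object <- Seurat::FindNeighbors(object, reduction = "glmpca",
                                  dims = 1:ndims, verbose = FALSE)
  object <- Seurat::FindClusters(object, verbose = FALSE)
  object <- Seurat::RunUMAP(object, reduction = "glmpca",
                            dims = 1:ndims, verbose = FALSE)
  object
}

# Usage, assuming `obj` is a Seurat object on which RunGLMPCA() has been run:
# obj <- embed_glmpca_umap(obj, ndims = 10)
# Seurat::DimPlot(obj, reduction = "umap")
```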
Parameters
object
A Seurat object. Must contain raw counts in the counts slot of the target assay.
L
Number of dimensions (latent factors) to return.
features
Features to use. Defaults to the variable features identified by FindVariableFeatures(). Providing a curated list (e.g., top deviance genes) is recommended for best results.
assay
Assay to use. Defaults to the default assay of the Seurat object.
reduction.name
Name under which the resulting DimReduc object is stored in the Seurat object.
reduction.key
Prefix for the column names of the GLM-PCA embedding dimensions.
...
Additional arguments passed directly to glmpca::glmpca(). Use this to set the fam argument (e.g., fam = "nb" for negative binomial) or other model options.

GLM-PCA reads from the counts slot, not the data (normalized) slot. You do not need to run NormalizeData() before RunGLMPCA(): the model handles normalization implicitly.
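The sketch below combines two of the options described above: a curated feature list and a model family passed through `...`. The helper name, the default of 2000 features, and the reduction name and key are hypothetical choices for illustration. It assumes the scry package is installed for deviance-based ranking. Newer SeuratObject versions prefer `layer = "counts"` over `slot = "counts"` in GetAssayData().

```r
# Hypothetical helper: rank genes by deviance, then fit a negative binomial GLM-PCA.
run_glmpca_nb <- function(object, n_features = 2000, L = 10) {
  # Raw counts from the default assay (GLM-PCA works on counts, not normalized data)
  counts <- Seurat::GetAssayData(object, slot = "counts")

  # Per-gene deviance; larger values indicate more informative genes
  devs <- scry::devianceFeatureSelection(counts)
  top_dev <- head(rownames(counts)[order(devs, decreasing = TRUE)], n_features)

  # fam is not an argument of RunGLMPCA() itself; it is forwarded to glmpca::glmpca()
  SeuratWrappers::RunGLMPCA(
    object,
    features = top_dev,
    L = L,
    fam = "nb",
    reduction.name = "glmpca_nb",
    reduction.key = "GLMPCNB_"
  )
}

# Usage, assuming `obj` is a Seurat object with raw counts:
# obj <- run_glmpca_nb(obj, n_features = 2000, L = 10)
```

Storing the result under a separate reduction name keeps it alongside a default Poisson fit, so the two embeddings can be compared on the same object.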