CoGAPS (Coordinated Gene Activity in Pattern Sets) applies Bayesian non-negative matrix factorization (NMF) to decompose a gene expression matrix into a set of latent patterns and their associated gene weights. Each pattern captures a coordinated program of gene activity, which can correspond to cell types, lineages, or biological processes.
Citation: Stein-O’Brien et al. (2019). Decomposing cell identity for transfer learning across cellular measurements, platforms, tissues, and species. Cell Systems. doi: 10.1016/j.cels.2019.04.004
Source: Bioconductor CoGAPS
Installation
Key Function
RunCoGAPS() — Runs CoGAPS on the expression data from a Seurat object and stores the resulting cell embeddings and gene loadings as a DimReduc object named "CoGAPS".
How It Works
CoGAPS factorizes the log-normalized expression matrix into two non-negative matrices:
- Sample factors (cells × patterns): how strongly each cell expresses each pattern, stored as the reduction’s cell embeddings
- Feature loadings (genes × patterns): which genes drive each pattern, stored as the reduction’s feature loadings
The number of patterns (nPatterns) is a key hyperparameter. Fewer patterns capture broad lineage differences; more patterns can resolve finer cell-type distinctions and subtypes. CoGAPS is computationally intensive for large datasets and benefits from distributed or parallel execution.
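As a rough illustration of the factorization described above, the following sketch applies a log2(x + 1) transform to a toy count matrix and factorizes it with plain multiplicative-update NMF. It is written in standard-library Python purely to show the shapes involved. CoGAPS itself is an R/Bioconductor package that uses a Bayesian sampler, so this is not its algorithm, and the helper names (`matmul`, `transpose`, `nmf`), the toy data, and the iteration count are all assumptions made for this example.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def nmf(x, n_patterns, n_iter=200, seed=0, eps=1e-9):
    """Plain NMF via multiplicative updates (NOT the CoGAPS Bayesian sampler).

    x is genes x cells. Returns (loadings, embeddings), where loadings is
    genes x patterns, embeddings is cells x patterns, and x is approximated
    by loadings times the transpose of embeddings.
    """
    rng = random.Random(seed)
    n_genes, n_cells = len(x), len(x[0])
    w = [[rng.random() + eps for _ in range(n_patterns)] for _ in range(n_genes)]
    e = [[rng.random() + eps for _ in range(n_patterns)] for _ in range(n_cells)]
    xt = transpose(x)
    for _ in range(n_iter):
        # Update cell embeddings: E <- E * (X^T W) / (E W^T W)
        num, den = matmul(xt, w), matmul(e, matmul(transpose(w), w))
        e = [[e[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_patterns)]
             for i in range(n_cells)]
        # Update gene loadings: W <- W * (X E) / (W E^T E)
        num, den = matmul(x, e), matmul(w, matmul(transpose(e), e))
        w = [[w[g][k] * num[g][k] / (den[g][k] + eps) for k in range(n_patterns)]
             for g in range(n_genes)]
    return w, e

# Toy count matrix: 12 genes x 8 cells (values are arbitrary).
rng = random.Random(1)
counts = [[rng.randint(0, 20) for _ in range(8)] for _ in range(12)]
log_expr = [[math.log2(c + 1) for c in row] for row in counts]

loadings, embeddings = nmf(log_expr, n_patterns=3)
print(len(loadings), len(loadings[0]))      # 12 3  (genes x patterns)
print(len(embeddings), len(embeddings[0]))  # 8 3   (cells x patterns)
```

Both factors stay non-negative because each update multiplies a non-negative value by a non-negative ratio. This mirrors the two outputs listed above: one row per cell in the embeddings and one row per gene in the loadings.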
RunCoGAPS Parameters
The Seurat object to run CoGAPS on.
Assay to pull expression data from.
Slot within the assay to use. Data is log2-transformed internally (log2(x + 1)) before being passed to CoGAPS.
A CogapsParams object for specifying CoGAPS settings such as nPatterns, nIterations, singleCell, sparseOptimization, and distributed mode settings. If NULL, CoGAPS runs with default parameters.
Path for a temporary .mtx file used when running in distributed mode. Set to TRUE to auto-generate a temp file path. Required for distributed/genome-wide runs on large datasets.
Name of the DimReduc object to store in the Seurat object.
Key prefix for the CoGAPS reduction dimensions (e.g., CoGAPS_1, CoGAPS_2).
Workflow
Local run (small datasets / exploratory)
Quick exploratory runs can use a small number of iterations.
For robust results, 50,000+ iterations are recommended. Expect runtimes of several hours for large datasets. Consider using cloud computing for production runs.
Cloud / distributed run (large datasets)
Use a CogapsParams object to configure distributed execution.
10 patterns — resolve cell types
Increasing nPatterns allows CoGAPS to identify finer-grained cell type distinctions and subtypes.
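To illustrate why the number of patterns matters, the sketch below fits plain multiplicative-update NMF (not the CoGAPS Bayesian sampler) to a toy matrix containing two lineages, each split into two subtypes, and compares the reconstruction error for 2 and 4 patterns. The helper names, the toy matrix, and the iteration count are assumptions made for this example only.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def nmf_error(x, n_patterns, n_iter=300, seed=0, eps=1e-9):
    """Fit plain multiplicative-update NMF; return the squared reconstruction error."""
    rng = random.Random(seed)
    w = [[rng.random() + eps for _ in range(n_patterns)] for _ in x]     # genes x patterns
    e = [[rng.random() + eps for _ in range(n_patterns)] for _ in x[0]]  # cells x patterns
    xt = transpose(x)
    for _ in range(n_iter):
        # Update cell embeddings, then gene loadings; both stay non-negative.
        num, den = matmul(xt, w), matmul(e, matmul(transpose(w), w))
        e = [[ev * n / (d + eps) for ev, n, d in zip(er, nr, dr)]
             for er, nr, dr in zip(e, num, den)]
        num, den = matmul(x, e), matmul(w, matmul(transpose(e), e))
        w = [[wv * n / (d + eps) for wv, n, d in zip(wr, nr, dr)]
             for wr, nr, dr in zip(w, num, den)]
    fitted = matmul(w, transpose(e))
    return sum((a - b) ** 2 for xr, fr in zip(x, fitted) for a, b in zip(xr, fr))

# Toy genes x cells matrix: two lineages (cells 0-3 and cells 4-7).
# Within each lineage, two subtypes differ in which genes are high (5) or low (2).
lineage = [[5, 5, 2, 2], [5, 5, 2, 2], [2, 2, 5, 5], [2, 2, 5, 5]]
zeros = [0, 0, 0, 0]
expr = [row + zeros for row in lineage] + [zeros + row for row in lineage]

for k in (2, 4):
    print(k, "patterns -> squared error", round(nmf_error(expr, k), 3))
```

On data structured like this, two patterns only have room for the lineage-level split, so the subtype differences remain in the residual. Four patterns have enough capacity to represent each subtype separately.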
Visualizing CoGAPS Patterns
CoGAPS results are stored as a standard Seurat DimReduc object and can be used with all standard Seurat visualization functions.
Scatter plots of pattern dimensions
Violin plots of pattern activity per cluster
Each CoGAPS dimension represents a pattern. Violin plots show how strongly a pattern is active across cell type clusters.
Advanced Options
Custom uncertainty matrix
By default, CoGAPS assumes the uncertainty of each data entry is 10% of its value. You can provide a custom uncertainty matrix.
Parallel execution
The nThreads argument enables multi-threaded execution without affecting the mathematics of the factorization.
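Returning to the custom uncertainty matrix described under Advanced Options, the following sketch shows what such a matrix looks like and how it can change the weight each entry carries in a fit. It builds a matrix with the same shape as the data where each entry is 10% of the corresponding value, then scores a hypothetical reconstruction with a chi-squared-style weighted residual. The function names, the floor applied to zero entries, and the use of a chi-squared score are assumptions made for illustration. This is not the CoGAPS API and is not claimed to match its internals.

```python
def fractional_uncertainty(data, fraction=0.1, floor=0.1):
    """Uncertainty as a fraction of each value.

    The floor is an assumption of this sketch: it keeps zero entries
    from receiving zero uncertainty, which would divide by zero below.
    """
    return [[max(fraction * v, floor) for v in row] for row in data]

def chi_squared(data, fitted, sigma):
    """Sum of squared residuals, each scaled by its uncertainty."""
    return sum(((d - f) / s) ** 2
               for d_row, f_row, s_row in zip(data, fitted, sigma)
               for d, f, s in zip(d_row, f_row, s_row))

data = [[10.0, 0.0], [4.0, 2.0]]    # toy genes x cells values
fitted = [[9.0, 0.5], [4.0, 3.0]]   # hypothetical reconstruction

default_sigma = fractional_uncertainty(data)
# Custom matrix: treat the second gene as twice as noisy as the default.
custom_sigma = [default_sigma[0], [2 * s for s in default_sigma[1]]]

print(chi_squared(data, fitted, default_sigma))
print(chi_squared(data, fitted, custom_sigma))
```

Doubling the uncertainty for a gene cuts the contribution of its residuals by a factor of four, so entries marked as noisier pull on the fit less strongly.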