CIPR (Cell Identity Predictor using Reference) automates cell cluster annotation in scRNA-seq experiments by comparing cluster marker genes or average expression profiles against a panel of curated reference datasets. It provides both logFC-based and correlation-based scoring methods and includes 7 built-in reference datasets covering human and mouse immune cell types.
Citation: Ekiz et al. (2020) CIPR: a web-based R/shiny app and R package to annotate cell clusters in single cell RNA sequencing experiments. BMC Bioinformatics. doi: 10.1186/s12859-020-3538-2
Source: atakanekiz/CIPR-Package (GitHub)
Installation
How It Works
CIPR accepts either differential expression results (allmarkers from FindAllMarkers) or average expression profiles (avgexp from AverageExpression) and scores them against a reference dataset. Two families of scoring methods are available:
LogFC comparison methods compare cluster marker logFC profiles against reference-derived logFC values:
- logfc_dot_product: sum of pairwise logFC products (recommended)
- logfc_spearman: rank correlation of logFC values
- logfc_pearson: linear correlation of logFC values

All-genes correlation methods correlate cluster average expression against reference expression across all genes:
- all_genes_spearman: Spearman rank correlation (robust across technologies)
- all_genes_pearson: Pearson linear correlation (useful with custom references)
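The two scoring families can be illustrated with a small, language-agnostic sketch. The Python below is not CIPR's implementation (CIPR is an R package); it only shows, on made-up toy values, what "sum of pairwise logFC products" and a rank or linear correlation over shared genes look like. The function names, the toy logFC values, and the choice to intersect gene sets are all assumptions made for illustration.

```python
from math import sqrt


def dot_product_score(cluster_logfc, ref_logfc):
    """Sum of pairwise logFC products over genes present in both profiles."""
    shared = cluster_logfc.keys() & ref_logfc.keys()
    return sum(cluster_logfc[g] * ref_logfc[g] for g in shared)


def pearson(xs, ys):
    """Pearson linear correlation of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


def correlation_score(cluster, ref, method=spearman):
    """Correlate two gene -> value profiles over their shared genes."""
    shared = sorted(cluster.keys() & ref.keys())
    return method([cluster[g] for g in shared], [ref[g] for g in shared])


# Toy profiles (invented values, for illustration only).
cluster = {"CD3E": 2.0, "CD19": -1.5, "NKG7": 0.5}
ref_t_cell = {"CD3E": 1.8, "CD19": -1.0, "NKG7": 0.2}
ref_b_cell = {"CD3E": -1.2, "CD19": 2.5, "NKG7": -0.3}

for name, ref in [("T cell", ref_t_cell), ("B cell", ref_b_cell)]:
    print(name, dot_product_score(cluster, ref), correlation_score(cluster, ref))
```

In this toy case the cluster scores higher against the T cell profile than the B cell profile under both families. Unlike the correlations, the dot product also grows with the magnitude of the logFC values.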
SeuratWrappers provides integration between Seurat and CIPR. All CIPR analysis functions (CIPR()) are called directly from the CIPR package on Seurat-derived inputs. There are no additional wrapper functions beyond standard Seurat preprocessing steps.
Available Reference Datasets
| Reference | reference argument |
|---|---|
| Immunological Genome Project (ImmGen) | "immgen" |
| Presorted cell RNAseq (various tissues) | "mmrnaseq" |
| Blueprint/ENCODE | "blueprint" |
| Human Primary Cell Atlas | "hpca" |
| Database of Immune Cell Expression (DICE) | "dice" |
| Hematopoietic differentiation | "hema" |
| Presorted cell RNAseq (PBMC) | "hsrnaseq" |
| User-provided custom reference | "custom" |
Workflow
Generate CIPR inputs
CIPR supports two input types. Prepare one or both depending on the scoring methods you intend to use.
For logFC comparison methods, run FindAllMarkers on the clustered Seurat object.
For all-genes correlation methods, compute cluster-average expression with AverageExpression.
Run CIPR
Visualize the PBMC clusters using DimPlot before annotating.
Run CIPR with the logFC dot product method against the sorted human PBMC RNAseq reference ("hsrnaseq").
CIPR saves results to the global objects CIPR_top_results (top 5 matches per cluster) and CIPR_all_results (full scoring table).
All-Genes Correlation Method
The all-genes approach correlates overall cluster expression against each reference sample, regardless of differential expression status. This is conceptually similar to SingleR and scMCA.
Subsetting the Reference
When using logFC comparison methods, excluding irrelevant reference cell types sharpens discrimination between closely related subtypes.
Filtering Lowly Variable Genes
Genes with low expression variance across the reference have weak discriminatory power. Use keep_top_var to restrict analysis to the top N% most variable reference genes.
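The idea behind variance filtering can be sketched independently of CIPR. The Python below is an illustration only, not the package's code: the helper name `keep_top_variable`, the use of population variance as the variability measure, the rounding rule, and the toy gene names and values are all assumptions.

```python
from math import ceil
from statistics import pvariance


def keep_top_variable(reference, top_percent):
    """Return the genes in the top `top_percent` by variance across samples.

    reference: dict mapping gene -> list of expression values, one per
    reference sample. Hypothetical helper, not part of CIPR.
    """
    if not 0 < top_percent <= 100:
        raise ValueError("top_percent must be in (0, 100]")
    variances = {gene: pvariance(values) for gene, values in reference.items()}
    # Assumed rounding rule: round up, and always keep at least one gene.
    n_keep = max(1, ceil(len(variances) * top_percent / 100))
    ranked = sorted(variances, key=variances.get, reverse=True)
    return set(ranked[:n_keep])


# Toy reference with four genes measured in four samples (invented values).
toy_reference = {
    "GeneA": [0, 10, 0, 10],  # highly variable
    "GeneB": [5, 5, 5, 5],    # constant, carries no discriminating signal
    "GeneC": [1, 2, 1, 2],
    "GeneD": [0, 4, 0, 4],
}

print(sorted(keep_top_variable(toy_reference, 50)))
```

A constant gene such as GeneB contributes nothing to telling reference cell types apart, which is why dropping low-variance genes before scoring can sharpen the comparison.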