miQC jointly models mitochondrial read percentage and library complexity using a two-distribution mixture model, enabling probabilistic rather than threshold-based identification of compromised cells. This is particularly useful for archived or tumor tissues where fixed mitochondrial cutoffs are often too stringent or too lenient.
Citation: Hippen et al. (2021) miQC: An adaptive probabilistic framework for quality control of single-cell RNA-sequencing data. bioRxiv. doi: 10.1101/2021.03.03.433798
Source: greenelab/miQC (Bioconductor)
Installation
The SeuratWrappers adaptation of miQC requires only the flexmix package. The miQC Bioconductor package provides the reference implementation, but SeuratWrappers calls flexmix directly.
Key Functions
- RunMiQC(): Fits a two-distribution mixture model and assigns each cell a probability of being compromised. Stores results in object metadata.
- PlotMiQC(): Visualizes the fitted mixture model overlaid on a scatter plot of mitochondrial percentage vs. unique gene count.
How It Works
miQC assumes that a scRNA-seq dataset contains two populations of cells: intact cells (low mitochondrial reads, higher gene counts) and compromised cells (high mitochondrial reads, lower gene counts). It fits a two-component mixture model to the joint distribution of percent.mt and nFeature_RNA, then computes a posterior probability for each cell of belonging to the compromised component.
Cells above a configurable posterior.cutoff are labeled for removal. This approach adapts to each dataset’s specific quality profile rather than requiring a universal threshold.
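The idea can be sketched directly with flexmix. This is a rough illustration under stated assumptions, not the SeuratWrappers source: `md` is a hypothetical data frame with `percent.mt` and `nFeature_RNA` columns, the helper names are made up for this example, the compromised component is assumed to be the one with the larger intercept, and 0.75 is only an illustrative cutoff.

```r
# Sketch only: fit a two-component linear mixture with flexmix.
# `md` is a hypothetical data frame with columns percent.mt and nFeature_RNA.
fit_two_component_sketch <- function(md) {
  model <- flexmix::flexmix(percent.mt ~ nFeature_RNA, data = md, k = 2)

  # Assumption: the compromised component is the one with the larger
  # intercept, i.e. higher mitochondrial percentage at low gene counts.
  # The first row of the parameter matrix holds the intercepts.
  intercepts <- flexmix::parameters(model)[1, ]
  compromised <- which.max(intercepts)

  # Posterior probability of each cell belonging to that component.
  flexmix::posterior(model)[, compromised]
}

# Turn posterior probabilities into keep/discard labels.
# The default cutoff here is illustrative, not a recommendation.
label_by_cutoff <- function(prob_compromised, posterior.cutoff = 0.75) {
  ifelse(prob_compromised > posterior.cutoff, "discard", "keep")
}
```

Calling `label_by_cutoff(fit_two_component_sketch(md))` would then give one label per row of `md`. In practice RunMiQC handles both steps and the failure cases for you.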
Workflow
Calculate mitochondrial percentage
miQC requires percent.mt and nFeature_RNA to be present in the object metadata. nFeature_RNA is computed automatically by CreateSeuratObject. Calculate percent.mt with PercentageFeatureSet. For human data, mitochondrial genes start with MT-. For mouse data, use mt-.
Inspect the distribution before running the model, for example with FeatureScatter. Look for a distinctive triangular shape: a wide range of mitochondrial percentages at lower gene counts tapering to low mitochondrial percentage at higher gene counts. If this pattern is absent, the two-distribution assumption may not hold for your data.
Run the miQC mixture model
RunMiQC stores its results in two metadata columns:
- miQC.probability: posterior probability of belonging to the compromised distribution
- miQC.keep: "keep" or "discard" decision based on posterior.cutoff
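The steps so far can be sketched as below. The helper names and the `seu` argument are hypothetical, and the calls are wrapped in functions and namespace-qualified so that nothing runs until you pass in your own Seurat object. The cutoff of 0.75 is an illustrative value, not a recommendation.

```r
# Hypothetical helper: scatter plot used to look for the triangular shape.
# `seu` is any Seurat object with percent.mt already in its metadata.
plot_qc_scatter <- function(seu) {
  Seurat::FeatureScatter(seu, feature1 = "nFeature_RNA", feature2 = "percent.mt")
}

# Hypothetical helper: compute percent.mt, then fit the miQC mixture model.
prepare_and_run_miqc <- function(seu, mito.pattern = "^MT-", posterior.cutoff = 0.75) {
  # Percentage of counts from mitochondrial genes.
  # "^MT-" matches human gene names; use "^mt-" for mouse.
  seu[["percent.mt"]] <- Seurat::PercentageFeatureSet(seu, pattern = mito.pattern)

  # Fit the model; this adds miQC.probability and miQC.keep to the metadata.
  SeuratWrappers::RunMiQC(
    seu,
    percent.mt = "percent.mt",
    nFeature_RNA = "nFeature_RNA",
    posterior.cutoff = posterior.cutoff,
    model.type = "linear"
  )
}
```

A typical call would be `seu <- prepare_and_run_miqc(seu)`, after which `head(seu[[]])` should show the two new columns.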
Visualize the model
Plot the mixture model with cells colored by their compromise probability (color.by = "miQC.probability"), or visualize the keep/discard decisions directly (color.by = "miQC.keep").
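Both views can be produced as in the following sketch. The helper names and `seu` are hypothetical; `seu` is assumed to have already been processed with RunMiQC. The filtering helper is a natural follow-up rather than something PlotMiQC does itself.

```r
# Hypothetical helper: return both plots for an object processed with RunMiQC.
plot_miqc_views <- function(seu) {
  list(
    # Continuous gradient of the posterior probability.
    probability = SeuratWrappers::PlotMiQC(seu, color.by = "miQC.probability"),
    # Categorical keep/discard decision.
    decision = SeuratWrappers::PlotMiQC(seu, color.by = "miQC.keep")
  )
}

# Hypothetical helper: drop the cells that RunMiQC marked as "discard".
keep_intact_cells <- function(seu) {
  subset(seu, subset = miQC.keep == "keep")
}
```

It is usually worth looking at both plots before filtering, since a poorly fitted model shows up there first.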
RunMiQC Parameters
- object: The Seurat object to run miQC on.
- percent.mt: Name of the metadata column containing the percentage of reads attributed to mitochondrial genes.
- nFeature_RNA: Name of the metadata column containing the number of unique genes detected per cell.
- posterior.cutoff: Posterior probability threshold for the compromised distribution. Cells with probability above this value are marked as "discard". Must be between 0 and 1. When processing multiple samples from the same experiment, use the same cutoff across all samples for consistency.
- model.type: Type of mixture model to fit. Options:
  - "linear": linear mixture model (recommended)
  - "spline": b-spline mixture model
  - "polynomial": second-degree polynomial mixture model
- model.slot: Name of the misc slot in the Seurat object where the fitted flexmix model is stored.
- backup.option: Fallback strategy when flexmix fails to fit a two-cluster model. Options:
  - "percentile": filter by backup.percentile of the mitochondrial distribution
  - "percent": filter by a fixed backup.percent mitochondrial cutoff
  - "pass": return the object unchanged without miQC stats
  - "halt": stop with an error
- backup.percentile: Percentile cutoff for mitochondrial percentage when backup.option = "percentile".
- backup.percent: Fixed mitochondrial percentage cutoff when backup.option = "percent".
- verbose: Whether to print progress messages.
PlotMiQC Parameters
- seurat_object: A Seurat object that has already been processed with RunMiQC.
- percent.mt: Name of the metadata column with mitochondrial percentage.
- nFeature_RNA: Name of the metadata column with unique gene counts.
- model.slot: The misc slot where the flexmix model was stored during RunMiQC.
- color.by: Metadata column to use for coloring points. Common choices are "miQC.probability" (continuous gradient) and "miQC.keep" (categorical).
Non-linear Models
For datasets where a linear relationship between mitochondrial percentage and gene count does not hold, RunMiQC supports b-spline and polynomial models via the model.type parameter.
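A minimal sketch of switching model types follows. The helper name and `seu` are hypothetical, and `seu` is assumed to already carry percent.mt and nFeature_RNA in its metadata.

```r
# Hypothetical helper: refit miQC with a non-linear model.
# model.type must be one of "spline" or "polynomial"; "spline" is the default here.
run_miqc_nonlinear <- function(seu, model.type = c("spline", "polynomial")) {
  model.type <- match.arg(model.type)
  SeuratWrappers::RunMiQC(
    seu,
    percent.mt = "percent.mt",
    nFeature_RNA = "nFeature_RNA",
    model.type = model.type
  )
}
```

Comparing the PlotMiQC output for the linear and non-linear fits is a reasonable way to decide whether the extra flexibility is warranted.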
Handling Model Failures
Some datasets, particularly very clean ones, do not have a meaningful second population of compromised cells, so flexmix may fail to find two clusters. RunMiQC will issue a warning and fall back to the strategy set by backup.option.
Use FeatureScatter to check whether the two-distribution assumption is appropriate for your data.
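The sketch below shows one way to choose a fallback, plus a base R illustration of what the "percentile" and "percent" options amount to. The helper names are hypothetical, and the illustration rests on assumptions not confirmed by this page: that backup.percentile is expressed on a 0 to 1 scale, and that cells strictly above the cutoff are discarded. The values 0.99 and 5 are illustrative only.

```r
# Hypothetical helper: run miQC with an explicit fallback strategy.
run_miqc_with_fallback <- function(seu, backup.percent = 5) {
  SeuratWrappers::RunMiQC(
    seu,
    percent.mt = "percent.mt",
    nFeature_RNA = "nFeature_RNA",
    backup.option = "percent",
    backup.percent = backup.percent
  )
}

# Base R illustration of the two filtering fallbacks (not the package source).
# Assumes backup.percentile is a probability in [0, 1].
fallback_keep <- function(percent.mt,
                          backup.option = c("percentile", "percent"),
                          backup.percentile = 0.99,
                          backup.percent = 5) {
  backup.option <- match.arg(backup.option)
  cutoff <- switch(
    backup.option,
    percentile = stats::quantile(percent.mt, probs = backup.percentile, names = FALSE),
    percent = backup.percent
  )
  # Assumption: cells strictly above the cutoff are discarded.
  ifelse(percent.mt > cutoff, "discard", "keep")
}
```

For example, `fallback_keep(c(1, 2, 10), "percent", backup.percent = 5)` labels only the third cell as "discard".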
Inspecting Model Parameters
The raw flexmix model is stored in the object’s misc slot and can be accessed directly.
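One way to pull the model out is sketched below. The helper name and `seu` are hypothetical, and the slot name "flexmix_model" is an assumption; pass whatever you supplied as model.slot to RunMiQC.

```r
# Hypothetical helper: retrieve the stored flexmix model and summarise it.
# The default slot name is an assumption; match it to your RunMiQC call.
inspect_miqc_model <- function(seu, model.slot = "flexmix_model") {
  model <- SeuratObject::Misc(seu, slot = model.slot)
  list(
    # Fitted coefficients and sigma, one column per mixture component.
    parameters = flexmix::parameters(model),
    # Posterior probabilities for the first few cells.
    posterior = utils::head(flexmix::posterior(model))
  )
}
```

If the slot is empty, RunMiQC most likely fell back to its backup strategy and no mixture model was stored.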