scGate: Marker-Based Cell Type Purification for Single-Cell Sequencing
A common bioinformatics task in single-cell data analysis is to purify a cell type or cell population of interest from heterogeneous datasets. 'scGate' automates marker-based purification of specific cell populations, without requiring training data or reference gene expression profiles. Briefly, 'scGate' takes as input: i) a gene expression matrix stored in a 'Seurat' object and ii) a “gating model” (GM), consisting of a set of marker genes that define the cell population of interest. The GM can be as simple as a single marker gene, or a combination of positive and negative markers. More complex GMs can be constructed hierarchically, akin to the gating strategies employed in flow cytometry. 'scGate' evaluates the strength of signature marker expression in each cell using the rank-based method 'UCell', and then performs k-nearest neighbor (kNN) smoothing by calculating the mean 'UCell' score across neighboring cells. kNN smoothing aims to compensate for the high sparsity of scRNA-seq data. Finally, a universal threshold over the kNN-smoothed signature scores is applied in binary decision trees generated from the user-provided gating model, to annotate each cell as either “pure” or “impure” with respect to the cell population of interest. See the related publication Andreatta et al. (2022) <doi:10.1093/bioinformatics/btac141>.
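The smoothing and thresholding steps described above can be illustrated with a minimal Python sketch. This is not the 'scGate' implementation: it assumes per-cell signature scores have already been computed (standing in for 'UCell' scores), uses plain Euclidean distance on made-up 2D coordinates to find neighbors, and reduces the decision tree to a single level with one positive and one negative signature. The function names `knn_smooth` and `gate`, the value `k=2`, and the threshold `0.2` are all hypothetical choices made for illustration.

```python
import math


def knn_smooth(scores, coords, k):
    """Replace each cell's score with the mean over itself and its k nearest cells."""
    smoothed = []
    for i, ci in enumerate(coords):
        # Rank all other cells by Euclidean distance to cell i.
        order = sorted(
            (j for j in range(len(coords)) if j != i),
            key=lambda j: math.dist(ci, coords[j]),
        )
        neighborhood = [i] + order[:k]
        smoothed.append(sum(scores[j] for j in neighborhood) / len(neighborhood))
    return smoothed


def gate(pos_scores, neg_scores, coords, k=2, threshold=0.2):
    """Label a cell 'pure' if its smoothed positive score exceeds the threshold
    and its smoothed negative score does not (a one-level gating model)."""
    pos = knn_smooth(pos_scores, coords, k)
    neg = knn_smooth(neg_scores, coords, k)
    return [
        "pure" if p > threshold and n <= threshold else "impure"
        for p, n in zip(pos, neg)
    ]


# Toy data: two well-separated groups of three cells each (illustrative values only).
coords = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
pos_scores = [0.6, 0.0, 0.5, 0.0, 0.1, 0.0]  # cell 1 mimics a dropout
neg_scores = [0.0, 0.0, 0.0, 0.5, 0.4, 0.6]

labels = gate(pos_scores, neg_scores, coords)
print(labels)
```

In this toy example the second cell has a raw positive score of zero, yet it is still labeled “pure” because its neighbors express the signature. That is the effect kNN smoothing is meant to have on sparse data.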
Depends: R (≥ 4.3.0)
Imports: Seurat (≥ 4.0.0), UCell (≥ 2.6.0), dplyr, stats, utils, methods, patchwork, ggridges, reshape2, ggplot2, BiocParallel
Suggests: ggparty, partykit, knitr, rmarkdown
Changsheng Li [aut],
Maintainer: Massimo Andreatta <massimo.andreatta at unil.ch>
Citation: scGate citation info