Performs quality control filtering on single-cell RNA-seq expression data based on:
- Gene counts per cell (detected features)
- UMI counts per cell (library size)
- Mitochondrial gene percentage
The function applies user-defined thresholds to remove low-quality cells and optionally generates diagnostic plots of the QC metric distributions.
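The three QC metrics listed above can be illustrated with a minimal base-R sketch. It is not the package's implementation: the toy matrix is made up, and identifying human mitochondrial genes by the "MT-" prefix is an assumed convention that the function itself may or may not follow.

```r
# Toy raw count matrix: 5 genes x 3 cells (hypothetical data)
counts <- matrix(
  c(10, 0, 3,
     0, 5, 2,
     4, 0, 0,
     6, 1, 0,
     0, 4, 5),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    c("ACTB", "GAPDH", "MT-CO1", "MT-ND1", "CD3E"),
    c("cell1", "cell2", "cell3")
  )
)

# Assumption: human mitochondrial genes are those whose symbol starts with "MT-"
mito_genes <- grepl("^MT-", rownames(counts))

n_genes   <- colSums(counts > 0)  # detected features per cell
n_umi     <- colSums(counts)      # library size per cell
mito_frac <- colSums(counts[mito_genes, , drop = FALSE]) / n_umi  # fraction in [0, 1]

qc_metrics <- data.frame(n_genes, n_umi, mito_frac)
print(qc_metrics)
```

In this sketch the mitochondrial content is expressed as a fraction between 0 and 1, which matches the scale of the default mito_high = 0.1.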
Usage
Cell_qc(
  expression_profile,
  sample_information_cellType = NULL,
  sample_information_timePoint = NULL,
  species = "Hs",
  gene_low = 500,
  gene_high = 10000,
  mito_high = 0.1,
  umi_low = 1500,
  umi_high = Inf,
  logNormalize = TRUE,
  plot = FALSE,
  plot_path = "./quality_control.pdf"
)
Arguments
- expression_profile
Raw count matrix (genes x cells)
- sample_information_cellType
Optional cell type annotations (named vector)
- sample_information_timePoint
Optional time point annotations (named vector)
- species
Species specification ("Hs" for human, "Mm" for mouse)
- gene_low
Minimum genes required per cell (default: 500)
- gene_high
Maximum genes allowed per cell (default: 10000)
- mito_high
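Maximum mitochondrial content allowed per cell (default: 0.1; the default value suggests a fraction between 0 and 1, i.e. 10%, rather than a 0-100 percentage)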
- umi_low
Minimum UMIs required per cell (default: 1500)
- umi_high
Maximum UMIs allowed per cell (default: Inf)
- logNormalize
Whether to log-transform the normalized counts (default: TRUE)
- plot
Whether to generate QC diagnostic plots (default: FALSE)
- plot_path
Output path for QC plots (default: "./quality_control.pdf")
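As a rough illustration of how the threshold arguments interact, the sketch below builds a keep/discard mask from hypothetical per-cell metrics, using the default values from the Usage section. Whether Cell_qc treats the bounds as inclusive or strict is not stated here, so the inclusive comparisons are an assumption.

```r
# Hypothetical per-cell QC metrics for four cells
n_genes   <- c(cellA = 1200, cellB = 300,  cellC = 2500, cellD = 12000)
n_umi     <- c(cellA = 4000, cellB = 900,  cellC = 8000, cellD = 60000)
mito_frac <- c(cellA = 0.04, cellB = 0.02, cellC = 0.25, cellD = 0.05)

# Default thresholds as listed in the Usage section
gene_low  <- 500
gene_high <- 10000
umi_low   <- 1500
umi_high  <- Inf
mito_high <- 0.1

# Assumption: bounds are inclusive; the real function may use strict inequalities
keep <- n_genes >= gene_low & n_genes <= gene_high &
  n_umi >= umi_low & n_umi <= umi_high &
  mito_frac <= mito_high

# Names of the cells that pass all three filters
passing_cells <- names(keep)[keep]
print(passing_cells)
```

Here cellB fails the lower gene and UMI bounds, cellC exceeds mito_high, and cellD exceeds gene_high, so only cellA is retained. Leaving umi_high at Inf effectively disables the upper UMI bound.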
Value
A list containing:
- expression_profile: Filtered and normalized expression matrix
- sample_information_cellType: Filtered cell type annotations (if provided)
- sample_information_timePoint: Filtered time point annotations (if provided)
Details
Key processing steps:
- Calculates the mitochondrial gene percentage per cell (species-aware)
- Normalizes each cell to a total of 10,000 counts (CP10K)
- Applies a logarithmic transformation (optional)
- Filters cells based on the user-defined thresholds
- Removes mitochondrial genes from the final output
- Preserves sample metadata when provided, filtered to the retained cells
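The processing steps above can be sketched end to end in base R. This is an illustrative approximation, not the package's code: the toy data, the "MT-" gene prefix, the use of log1p (natural log of x + 1) for the log transformation, and the toy filter thresholds are all assumptions.

```r
# Toy raw counts: 4 genes x 3 cells (hypothetical data)
counts <- matrix(
  c(50, 10, 0,
    30, 20, 5,
    15, 60, 1,
     5, 10, 4),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("ACTB", "CD3E", "MT-CO1", "GAPDH"),
                  c("cell1", "cell2", "cell3"))
)
cell_type <- c(cell1 = "T", cell2 = "B", cell3 = "T")  # hypothetical annotations

# 1. Mitochondrial fraction per cell (assumed "MT-" prefix for human genes)
is_mito   <- grepl("^MT-", rownames(counts))
lib_size  <- colSums(counts)
mito_frac <- colSums(counts[is_mito, , drop = FALSE]) / lib_size

# 2. Scale every cell (column) to a total of 10,000 counts (CP10K)
cp10k <- sweep(counts, 2, lib_size, "/") * 1e4

# 3. Optional log transformation (log1p is an assumed choice)
log_expr <- log1p(cp10k)

# 4. Filter cells with toy thresholds (chosen for this small example only)
keep <- mito_frac <= 0.2 & lib_size >= 50

# 5. Drop mitochondrial genes and keep only the passing cells
filtered <- log_expr[!is_mito, keep, drop = FALSE]

# 6. Subset the metadata to the retained cells by name
cell_type_filtered <- cell_type[colnames(filtered)]

print(filtered)
print(cell_type_filtered)
```

Subsetting the annotations by cell name in the last step is why the annotation arguments are described as named vectors: names keep the metadata aligned with the matrix columns after filtering.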
Examples
if (FALSE) { # \dontrun{
# Load example data
library(SingleCellExperiment)
data(QueryCellData)
# Extract expression matrix (assuming counts are in 'counts' assay)
counts <- assay(QueryCellData, "counts")
# Run QC with default parameters
qc_results <- Cell_qc(
  expression_profile = counts,
  species = "Hs"
)
# Run QC with custom thresholds and plotting
qc_results <- Cell_qc(
  expression_profile = counts,
  species = "Hs",
  gene_low = 600,
  mito_high = 0.2,
  plot = TRUE,
  plot_path = "qc_plots.pdf"
)
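# With cell type annotations
# The argument expects a named vector; naming by cell ID is an assumption here
cell_types <- colData(QueryCellData)$cell_type1
names(cell_types) <- colnames(QueryCellData)
qc_results <- Cell_qc(
  expression_profile = counts,
  sample_information_cellType = cell_types,
  species = "Hs"
)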
} # }