
Performs quality control filtering on single-cell RNA-seq expression data based on:

  • Gene counts per cell (detected features)

  • UMI counts per cell (library size)

  • Mitochondrial gene percentage

The function applies user-defined thresholds to remove low-quality cells and can optionally generate diagnostic plots of the QC metric distributions.
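As a rough illustration of the three metrics listed above, the following base R sketch computes them on a toy count matrix. The toy values, the variable names, and the "^MT-" gene-name prefix used to flag mitochondrial genes are assumptions made for this example; they are not taken from the Cell_qc source.

```r
# Toy count matrix: 5 genes x 4 cells (made-up values, filled column by column)
counts <- matrix(
  c(10, 0, 3, 5, 2,   # cell1
     0, 0, 1, 0, 9,   # cell2
     4, 6, 2, 1, 0,   # cell3
     7, 1, 0, 2, 1),  # cell4
  nrow = 5,
  dimnames = list(
    c("GeneA", "GeneB", "GeneC", "MT-CO1", "MT-ND1"),
    paste0("cell", 1:4)
  )
)

# Detected features: number of genes with a non-zero count in each cell
n_genes <- colSums(counts > 0)

# Library size: total UMI count in each cell
n_umi <- colSums(counts)

# Mitochondrial fraction: share of each cell's counts from mitochondrial genes
# (assumption: human mitochondrial gene symbols start with "MT-")
is_mito <- grepl("^MT-", rownames(counts))
mito_frac <- colSums(counts[is_mito, , drop = FALSE]) / n_umi

# Collect the per-cell metrics in one table for inspection
qc <- data.frame(n_genes = n_genes, n_umi = n_umi, mito_frac = mito_frac)
print(qc)
```

Inspecting a table like this before choosing thresholds can help avoid discarding most of a dataset with limits that are too strict for the protocol used.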

Usage

Cell_qc(
  expression_profile,
  sample_information_cellType = NULL,
  sample_information_timePoint = NULL,
  species = "Hs",
  gene_low = 500,
  gene_high = 10000,
  mito_high = 0.1,
  umi_low = 1500,
  umi_high = Inf,
  logNormalize = TRUE,
  plot = FALSE,
  plot_path = "./quality_control.pdf"
)

Arguments

expression_profile

Raw count matrix (genes x cells)

sample_information_cellType

Optional cell type annotations (named vector)

sample_information_timePoint

Optional time point annotations (named vector)

species

Species specification ("Hs" for human, "Mm" for mouse)

gene_low

Minimum genes required per cell (default: 500)

gene_high

Maximum genes allowed per cell (default: 10000)

mito_high

Maximum mitochondrial fraction allowed, given as a proportion (default: 0.1, i.e. 10%)

umi_low

Minimum UMIs required per cell (default: 1500)

umi_high

Maximum UMIs allowed per cell (default: Inf)

logNormalize

Whether to apply log transformation (default: TRUE)

plot

Whether to generate QC diagnostic plots (default: FALSE)

plot_path

Output path for QC plots (default: "./quality_control.pdf")
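The annotation arguments are described as named vectors. A minimal sketch of how such vectors could be built is shown below; it assumes that the names are cell identifiers matching the column names of the count matrix, which the argument descriptions do not state explicitly. All identifiers and labels here are made up.

```r
# Hypothetical cell identifiers (assumed to match the matrix column names)
cell_ids <- c("cell1", "cell2", "cell3")

# Toy count matrix with the cell identifiers as column names
counts <- matrix(
  0, nrow = 2, ncol = 3,
  dimnames = list(c("GeneA", "GeneB"), cell_ids)
)

# Build a named vector by assigning names after creation ...
cell_types <- c("T cell", "B cell", "T cell")
names(cell_types) <- cell_ids

# ... or in one step with setNames()
time_points <- setNames(c("0h", "6h", "6h"), cell_ids)

# Check that the annotations line up with the matrix columns
stopifnot(identical(names(cell_types), colnames(counts)))
stopifnot(identical(names(time_points), colnames(counts)))

# Named vectors can be subset by cell identifier, for example to the
# cells that remain after filtering
kept <- c("cell1", "cell3")
print(cell_types[kept])
```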

Value

A list containing:

  • expression_profile: Filtered and normalized expression matrix

  • sample_information_cellType: Filtered cell type annotations (if provided)

  • sample_information_timePoint: Filtered time point annotations (if provided)

Details

Key processing steps:

  • Calculates mitochondrial gene percentage (species-aware)

  • Normalizes counts to 10,000 counts per cell (CP10K)

  • Applies logarithmic transformation (optional)

  • Filters cells based on user-defined thresholds

  • Removes mitochondrial genes from final output

  • Preserves sample metadata when provided
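The steps above can be sketched in base R as follows. This is not the Cell_qc implementation: the function name qc_sketch, the "^MT-" mitochondrial pattern, the use of log1p (natural log with a pseudocount of 1), the inclusive threshold comparisons, and the toy data are all assumptions made for illustration. The sketch also assumes a dense base R matrix.

```r
# Hypothetical helper mirroring the listed steps
qc_sketch <- function(counts, mito_pattern = "^MT-",
                      gene_low = 500, gene_high = 10000,
                      mito_high = 0.1, umi_low = 1500, umi_high = Inf,
                      log_normalize = TRUE) {
  # Per-cell QC metrics computed from the raw counts
  is_mito <- grepl(mito_pattern, rownames(counts))
  n_umi <- colSums(counts)
  n_genes <- colSums(counts > 0)
  mito_frac <- colSums(counts[is_mito, , drop = FALSE]) / n_umi

  # Scale every cell to 10,000 total counts (CP10K), then optionally log-transform
  norm <- sweep(counts, 2, n_umi, "/") * 1e4
  if (log_normalize) norm <- log1p(norm)

  # Keep cells that pass all thresholds; cells with undefined metrics are dropped
  keep <- n_genes >= gene_low & n_genes <= gene_high &
    n_umi >= umi_low & n_umi <= umi_high &
    mito_frac <= mito_high
  keep <- !is.na(keep) & keep

  # Drop mitochondrial genes and failing cells from the output
  norm[!is_mito, keep, drop = FALSE]
}

# Toy data: 5 genes x 4 cells (made-up values, filled column by column)
counts <- matrix(
  c(10, 0, 3, 5, 2,   # cell1: mitochondrial fraction 0.35
     0, 0, 1, 0, 9,   # cell2: only 2 detected genes
     4, 6, 2, 1, 0,   # cell3
     7, 1, 0, 2, 1),  # cell4
  nrow = 5,
  dimnames = list(
    c("GeneA", "GeneB", "GeneC", "MT-CO1", "MT-ND1"),
    paste0("cell", 1:4)
  )
)

# Thresholds scaled down to suit the toy data
filtered <- qc_sketch(counts, gene_low = 3, umi_low = 11, mito_high = 0.3)
print(dim(filtered))
```

With these toy thresholds, cell1 fails the mitochondrial limit and cell2 fails the minimum gene count, so only cell3 and cell4 remain, and the two mitochondrial genes are removed from the rows.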

Author

Bin Duan (binduan@sjtu.edu.cn)

Examples

if (FALSE) { # \dontrun{
# Load example data
library(SingleCellExperiment)
data(QueryCellData)

# Extract expression matrix (assuming counts are in 'counts' assay)
counts <- assay(QueryCellData, "counts")

# Run QC with default parameters
qc_results <- Cell_qc(
  expression_profile = counts,
  species = "Hs"
)

# Run QC with custom thresholds and plotting
qc_results <- Cell_qc(
  expression_profile = counts,
  species = "Hs",
  gene_low = 600,
  mito_high = 0.2,
  plot = TRUE,
  plot_path = "qc_plots.pdf"
)

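# With cell type annotations (a named vector; names are assumed to be the cell IDs)
cell_types <- colData(QueryCellData)$cell_type1
names(cell_types) <- colnames(QueryCellData)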
qc_results <- Cell_qc(
  expression_profile = counts,
  sample_information_cellType = cell_types,
  species = "Hs"
)
} # }