Predicts cell types for query single-cell RNA-seq data using a pre-trained scLearn model. Implements a voting-based assignment system with multiple quality control checks.

Usage

scLearn_cell_assignment(
  scLearn_model_learning_result,
  expression_profile_query,
  vote_rate = 0.6,
  diff = 0.05,
  threshold_use = FALSE
)

Arguments

scLearn_model_learning_result

A trained scLearn model object containing:

  • high_varGene_names: Vector of high-variance genes

  • trans_matrix_learned: Transformation matrix/matrices

  • feature_matrix_learned: Reference feature matrix/matrices

  • simi_threshold_learned: Correlation threshold(s)

expression_profile_query

Query expression matrix (genes x cells)

vote_rate

Minimum vote proportion for consensus (default: 0.6)

diff

Minimum correlation difference between top candidates (default: 0.05)

threshold_use

Logical; whether to apply the correlation threshold (default: FALSE)
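To make the roles of diff and threshold_use more concrete, here is a minimal sketch for a single query cell. The helper assign_one_cell, its threshold argument, and the correlation values are hypothetical and are not part of scLearn; the package's internal logic may differ.

```r
# Hypothetical helper, NOT part of scLearn: illustrates how a margin (`diff`)
# and an optional correlation threshold could gate one cell's assignment.
assign_one_cell <- function(cors, diff = 0.05, threshold = NULL) {
  # cors: named numeric vector, one correlation per reference cell type
  ranked <- sort(cors, decreasing = TRUE)
  best <- names(ranked)[1]

  # Optional threshold check (assumed analogue of threshold_use = TRUE)
  if (!is.null(threshold) && ranked[[1]] < threshold) {
    return("unassigned")
  }

  # Margin check: the top two candidates must differ by at least `diff`
  if (length(ranked) > 1 && (ranked[[1]] - ranked[[2]]) < diff) {
    return("unassigned")
  }
  best
}

# Made-up correlations for one query cell
cors <- c(T_cell = 0.82, B_cell = 0.80, NK_cell = 0.41)

loose  <- assign_one_cell(cors, diff = 0.01)       # margin of 0.02 passes
strict <- assign_one_cell(cors, diff = 0.05)       # margin of 0.02 fails
gated  <- assign_one_cell(cors, diff = 0.01, threshold = 0.9)
```

Under these assumptions, raising diff or enabling a threshold trades fewer assignments for higher confidence in the ones that remain.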

Value

A data frame with prediction results containing:

  • Query_cell_id: Cell identifiers

  • Predict_cell_type: Predicted cell type or "unassigned"

  • Additional_information: Quality flags (present only when a single transformation matrix is used)
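As a rough illustration of working with the returned data frame, the sketch below builds a mock result by hand using the documented column names. The cell types, the use of NA for cells without a flag, and the pairing of flags with "unassigned" are assumptions made for the example, not documented behaviour.

```r
# Mock result mirroring the documented columns (values are invented)
predictions <- data.frame(
  Query_cell_id = paste0("cell", 1:5),
  Predict_cell_type = c("T_cell", "unassigned", "B_cell", "unassigned", "T_cell"),
  Additional_information = c(NA, "Novel_Cell", NA, "Too similar", NA),
  stringsAsFactors = FALSE
)

# Fraction of query cells left without a label
unassigned_frac <- mean(predictions$Predict_cell_type == "unassigned")

# Count each quality flag, ignoring cells without one
flag_counts <- table(predictions$Additional_information, useNA = "no")

# Keep only confidently labelled cells for downstream analysis
assigned <- predictions[predictions$Predict_cell_type != "unassigned", ]
```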

Details

The prediction process involves:

  • Feature matching between query data and model features

  • Application of multiple transformation matrices (if available)

  • Correlation-based cell type assignment

  • Voting mechanism for consensus prediction

  • Novel cell type detection and quality checks
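The voting step can be pictured with a small sketch. It assumes each transformation matrix yields one candidate label per query cell and that the most frequent label wins when its share reaches vote_rate. The function vote_consensus and the toy labels are hypothetical and not taken from the package source.

```r
# Hypothetical helper, NOT part of scLearn: majority vote across the
# per-matrix predictions for each query cell.
vote_consensus <- function(votes, vote_rate = 0.6) {
  # votes: character matrix, rows = query cells, columns = one prediction
  # per transformation matrix
  apply(votes, 1, function(v) {
    tab <- table(v)
    top <- names(tab)[which.max(tab)]
    # Accept the top label only if its share of votes reaches vote_rate
    if (max(tab) / length(v) >= vote_rate) top else "unassigned"
  })
}

# Toy votes from five transformation matrices for two cells
votes <- rbind(
  cell1 = c("T_cell", "T_cell", "T_cell", "B_cell", "T_cell"),
  cell2 = c("T_cell", "B_cell", "NK_cell", "B_cell", "T_cell")
)

consensus <- vote_consensus(votes, vote_rate = 0.6)
```

Here cell1 receives 4 of 5 votes for one label and is assigned, while cell2 has no label above 2 of 5 votes and stays "unassigned".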

Three possible outcomes for each cell:

  • Specific cell type assignment

  • "unassigned" (low confidence)

  • A quality flag (Gene_Missing, Novel_Cell, or Too similar)
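Because the feature-matching step depends on the query sharing genes with the model, it can help to check gene overlap before running the assignment. The sketch below uses toy gene names and a toy matrix. In practice the model genes would presumably come from high_varGene_names and the query from expression_profile_query. How the Gene_Missing flag is actually triggered is not documented here, so this is only a pre-check, not a reproduction of that logic.

```r
# Toy stand-ins for the model's high-variance genes and a query matrix
model_genes <- c("CD3D", "CD19", "NKG7", "MS4A1")
query <- matrix(
  rpois(9, 5), nrow = 3,
  dimnames = list(c("CD3D", "NKG7", "GAPDH"), paste0("cell", 1:3))
)

# Genes the model expects that the query does and does not provide
shared <- intersect(model_genes, rownames(query))
missing_genes <- setdiff(model_genes, rownames(query))

# Share of model genes covered by the query
coverage <- length(shared) / length(model_genes)

# Restrict the query to the shared genes, keeping the matrix shape
query_matched <- query[shared, , drop = FALSE]
```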

See also

scLearn_model_learning for the model training function

Author

Bin Duan (binduan@sjtu.edu.cn)

Examples

if (FALSE) { # \dontrun{
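# Attach scLearn and SingleCellExperiment (which provides logcounts())
library(scLearn)
library(SingleCellExperiment)

# Load example scLearn model and query data
data(scLearn_model)
data(QueryCellData)

# Get the query expression matrix (genes x cells);
# assumes QueryCellData is a SingleCellExperiment with a logcounts assay
query_data <- logcounts(QueryCellData)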

# Basic prediction
predictions <- scLearn_cell_assignment(
  scLearn_model_learning_result = scLearn_model,
  expression_profile_query = query_data
)

# Strict prediction with threshold
strict_pred <- scLearn_cell_assignment(
  scLearn_model_learning_result = scLearn_model,
  expression_profile_query = query_data,
  vote_rate = 0.7,
  diff = 0.1,
  threshold_use = TRUE
)

# Examine results
table(predictions$Predict_cell_type)
head(predictions)
} # }