Generates a 2D visualization of high-dimensional data using t-SNE or UMAP dimensionality reduction, with options for cluster labeling and aesthetic customization.

Usage

DrawCluster(
  data,
  label = NULL,
  point_size = 1,
  method = c("tsne", "umap"),
  draw_cluster_text = TRUE,
  calculated = TRUE,
  pca = TRUE,
  perplexity = 100,
  plot = TRUE,
  seed = 1
)

Arguments

data

Input data matrix (features x samples), or pre-computed 2D coordinates with one row per sample when calculated = FALSE

label

Vector of cluster labels for each sample (optional)

point_size

Size of points in the plot (default: 1)

method

Dimensionality reduction method: "tsne" or "umap" (default: "tsne")

draw_cluster_text

Whether to display cluster labels at median positions (default: TRUE)

calculated

Whether to compute the dimensionality reduction (TRUE) or use the coordinates provided in data (FALSE) (default: TRUE)

pca

Whether to perform PCA preprocessing for t-SNE (default: TRUE)

perplexity

t-SNE perplexity parameter (default: 100)

plot

Whether to display the plot (default: TRUE)

seed

Random seed for reproducibility (default: 1)
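
A note on choosing perplexity: the Rtsne package stops with an error when 3 * perplexity is larger than the number of samples minus one, so the default of 100 may be too high for small datasets. The sketch below shows one way to cap the value before calling DrawCluster. The helper max_perplexity is hypothetical and not part of this package, and it assumes the samples (columns of data) are the points passed to Rtsne.

```r
# Hypothetical helper (not part of the package): largest perplexity that
# satisfies Rtsne's check 3 * perplexity <= n_samples - 1.
max_perplexity <- function(n_samples) {
  floor((n_samples - 1) / 3)
}

# Example sample count; in practice this would be ncol(data) for a
# features x samples matrix.
n_samples <- 150

# Keep the documented default of 100 unless the dataset is too small for it.
perplexity <- min(100, max_perplexity(n_samples))
print(perplexity)
```

The resulting value can then be passed through the perplexity argument.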

Value

A list containing:

  • p: ggplot2 object of the visualization

  • x: Data frame with coordinates (V1 and V2 columns)

  • cell_group: Factor vector of cluster labels

Details

Key features:

  • Supports both t-SNE (t-Distributed Stochastic Neighbor Embedding) and UMAP (Uniform Manifold Approximation and Projection)

  • Option to use pre-computed coordinates or calculate new embeddings

  • Automatic cluster labeling at median positions

  • Customizable plotting parameters and reproducible results via seed setting

The function returns both the plot object and the coordinates for further analysis.
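
As an illustration of the median-based label placement listed above, the following base R sketch computes one label position per cluster from a set of 2D coordinates. It is not the function's actual implementation. The coords and cell_group objects are small made-up stand-ins for the x and cell_group elements of the returned list.

```r
# Made-up stand-in for the `x` element of the returned list:
# two embedding columns named V1 and V2, one row per sample.
coords <- data.frame(
  V1 = c(0, 1, 2, 10, 11, 12),
  V2 = c(0, 2, 1, 5, 7, 6)
)

# Made-up cluster labels, one per sample.
cell_group <- factor(c("A", "A", "A", "B", "B", "B"))

# The median of V1 and V2 within each cluster gives one label position
# per cluster.
label_pos <- aggregate(coords, by = list(cluster = cell_group), FUN = median)
print(label_pos)
```

A data frame like label_pos could then be supplied to a text layer (for example ggplot2::geom_text) to draw the cluster names on top of the points.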

See also

Rtsne for t-SNE implementation, umap for UMAP implementation

Author

Bin Duan (binduan@sjtu.edu.cn)

Examples

if (FALSE) { # \dontrun{
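# Using example data from the scLearn package
# (assumes QueryCellData is a SingleCellExperiment object)
library(scLearn)
library(SingleCellExperiment)  # provides logcounts() and colData()
data(QueryCellData)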

# Get normalized expression matrix
norm_expr <- logcounts(QueryCellData)
cell_types <- colData(QueryCellData)$cell_type1

# t-SNE visualization
tsne_res <- DrawCluster(
  data = norm_expr,
  label = cell_types,
  method = "tsne",
  perplexity = 30,
  point_size = 1.5
)

# UMAP visualization without cluster text labels drawn on the plot
umap_res <- DrawCluster(
  data = norm_expr,
  label = cell_types,
  method = "umap",
  draw_cluster_text = FALSE,
  point_size = 2
)

# Using pre-computed coordinates
# (random values here, as a placeholder for a real embedding)
precomputed_coords <- data.frame(V1 = rnorm(ncol(norm_expr)),
                                 V2 = rnorm(ncol(norm_expr)))
DrawCluster(
  data = precomputed_coords,
  label = cell_types,
  calculated = FALSE
)
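
# Working with the returned list (a sketch; the element names p, x and
# cell_group are those listed in the Value section)
head(tsne_res$x)               # embedding coordinates, columns V1 and V2
table(tsne_res$cell_group)     # number of samples per cluster label
tsne_res$p + ggplot2::ggtitle("t-SNE of QueryCellData")  # customize the plot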
} # }