Computes a pairwise similarity or distance matrix between samples (the columns of the input matrix) using one of several metrics, with support for parallel computation. Suitable for large-scale genomic data.
Usage
correlation(
matrix,
method = c("pearson", "spearman", "cosin", "euclidean"),
cpu_num = 8
)
Value
A symmetric similarity/distance matrix with dimensions ncol(matrix) x ncol(matrix), where row and column names match the input matrix column names.
Details
Available similarity/distance measures:
Pearson correlation (linear relationship)
Spearman correlation (rank-based relationship)
Cosine similarity (angle between vectors)
Euclidean distance (geometric distance)
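As a rough illustration of what the four measures mean for column-wise samples, the sketch below computes each one on a toy matrix with base R only. It is not the implementation used by correlation(); the toy matrix m and the result names are assumptions made for this example.

```r
# Toy matrix: 5 features (rows) x 3 samples (columns); hypothetical data
set.seed(1)
m <- matrix(rnorm(15), nrow = 5, dimnames = list(NULL, c("s1", "s2", "s3")))

# Pearson and Spearman: cor() correlates columns by default
pearson  <- cor(m, method = "pearson")
spearman <- cor(m, method = "spearman")

# Cosine similarity: column dot products scaled by the columns' L2 norms
norms  <- sqrt(colSums(m^2))
cosine <- crossprod(m) / outer(norms, norms)

# Euclidean distance: dist() compares rows, so transpose first
euclid <- as.matrix(dist(t(m), method = "euclidean"))
```

Each result is a 3 x 3 matrix labelled by sample name. The first three are similarities (larger means more alike), whereas the Euclidean result is a distance (smaller means more alike).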
The function uses the parallel package to accelerate calculations for large matrices.
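One way such a computation could be spread over workers with the parallel package is sketched below. This is an assumption about the general approach, not the code behind correlation(): the work is split by column, each worker correlates one sample against all samples, and the pieces are bound back together. The names m, cpu_num, cols, and par_cor are hypothetical.

```r
library(parallel)

# Toy matrix: 10 features x 4 samples; hypothetical data
set.seed(1)
m <- matrix(rnorm(40), nrow = 10, dimnames = list(NULL, paste0("s", 1:4)))

cpu_num <- 2  # assumed worker count for this sketch
cl <- makeCluster(cpu_num)

# Each task computes one column of the result: the Pearson correlation
# of sample j against every sample. The matrix is passed explicitly so
# that it is shipped to the workers.
cols <- parLapply(cl, seq_len(ncol(m)), function(j, mat) {
  stats::cor(mat, mat[, j], method = "pearson")[, 1]
}, mat = m)
stopCluster(cl)

# Reassemble the per-sample vectors into a square, labelled matrix
par_cor <- do.call(cbind, cols)
dimnames(par_cor) <- list(colnames(m), colnames(m))

# Compare against the serial result
all.equal(par_cor, cor(m))
```

For a matrix this small the cluster start-up cost outweighs any gain; splitting the work only pays off when the number of samples is large.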
See also
cor
for correlation calculations,
makeCluster
for parallel computation setup
Examples
if (FALSE) { # \dontrun{
  # Using example data from the scLearn package
  library(scLearn)
  library(SingleCellExperiment)  # assumed source of logcounts()
  data(QueryCellData)

  # Get the normalized expression matrix
  norm_expr <- logcounts(QueryCellData)

  # Calculate Pearson correlation between samples
  cor_mat <- correlation(
    matrix = norm_expr,
    method = "pearson",
    cpu_num = 4
  )

  # Calculate cosine similarity between samples
  cos_mat <- correlation(
    matrix = norm_expr,
    method = "cosin",
    cpu_num = 2
  )

  # Visualize the results as a symmetric heatmap
  heatmap(cor_mat, symm = TRUE)
} # }