
Performs residual-based SHAP analysis for a selected feature, identifying outliers based on SHAP residuals and visualizing them in both spatial and dependence plots.

Usage

SpaPheno_SHAP_residual_analysis(
  shap_df,
  feature_name,
  coordinate_df,
  size = 1,
  title = "SHAP Residual Distribution"
)

Arguments

shap_df

A data.frame containing SHAP results with at least the columns: feature, phi, feature.value, and sample.

feature_name

Character; the feature (e.g., gene) to analyze.

coordinate_df

A data.frame of spatial coordinates with rownames matching sample IDs; must contain columns named X and Y.

size

Numeric; point size for plotting. Default is 1.

title

Character; title for the spatial residual plot. Default is "SHAP Residual Distribution".
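As a rough illustration of the input shapes described above, the following builds toy inputs and checks them against the documented requirements. The sample IDs, feature names, and values are hypothetical, and the long layout (one row per sample and feature) is an assumption rather than something this page states.

```r
# Toy SHAP table (assumed long format: one row per sample/feature pair)
shap_df <- data.frame(
  sample        = rep(c("cell_1", "cell_2", "cell_3"), each = 2),
  feature       = rep(c("feature_A", "feature_B"), times = 3),
  feature.value = c(0.10, 0.90, 0.40, 0.60, 0.75, 0.25),
  phi           = c(0.02, -0.01, 0.05, 0.00, 0.08, -0.03)
)

# Toy coordinates: rownames are the sample IDs, columns are X and Y
coordinate_df <- data.frame(
  X = c(1.0, 2.5, 3.0),
  Y = c(0.5, 1.5, 2.0),
  row.names = c("cell_1", "cell_2", "cell_3")
)

# Checks mirroring the documented requirements
required_cols <- c("feature", "phi", "feature.value", "sample")
inputs_ok <- all(required_cols %in% colnames(shap_df)) &&
  all(c("X", "Y") %in% colnames(coordinate_df)) &&
  all(shap_df$sample %in% rownames(coordinate_df))
inputs_ok
```

Running a check like this before calling the function can catch mismatched sample IDs early.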

Value

A list containing:

  • spatial_plot: A ggplot2 object showing spatial distribution of SHAP residuals.

  • dependence_plot: A SHAP dependence plot with residual outlier groups.

  • residual_table: A data.frame with residuals, Z-scores, and group annotations.

Details

This function regresses the SHAP values (phi) of the selected feature on its feature values to obtain residuals, then standardizes the residuals to Z-scores to identify spatial units whose SHAP effect is unusually high or low relative to what the raw feature value (e.g., expression) alone would predict. The results are visualized both spatially and as an enhanced SHAP dependence plot.

Residual outliers are defined as:

  • Z ≥ 2: High residual

  • Z ≤ -2: Low residual

  • Otherwise: Normal
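The classification above can be sketched in base R. This is a minimal illustration, not the package's implementation: it assumes an ordinary linear fit (`lm`) of `phi` on `feature.value` and standardization with `scale()`. The toy data and the column names `phi_resid` and `group` are hypothetical; `phi_resid_z` follows the name used in the example below.

```r
set.seed(1)

# Toy SHAP table for a single feature (hypothetical data)
toy <- data.frame(
  sample        = paste0("cell_", 1:200),
  feature.value = runif(200)
)
toy$phi <- 0.5 * toy$feature.value + rnorm(200, sd = 0.05)
toy$phi[1] <- toy$phi[1] + 1  # inject one obvious high outlier

# Regress SHAP values on feature values (the linear fit is an assumption)
fit <- lm(phi ~ feature.value, data = toy)
toy$phi_resid <- as.numeric(residuals(fit))

# Standardize the residuals to Z-scores
toy$phi_resid_z <- as.numeric(scale(toy$phi_resid))

# Apply the thresholds listed above
toy$group <- ifelse(
  toy$phi_resid_z >= 2, "High residual",
  ifelse(toy$phi_resid_z <= -2, "Low residual", "Normal")
)

table(toy$group)
```

With a fixed cutoff of |Z| ≥ 2, a small share of units is expected to be flagged even when residuals are roughly normal, so flagged units are best read as candidates for closer inspection.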

Author

Bin Duan

Examples

if (FALSE) { # \dontrun{

data("osmFISH_metadata_cellType")
data("osmFISH_bulk_decon")
data("osmFISH_bulk_pheno")
data("osmFISH_bulk_coordinate")

PhenoResult <- SpatialPhenoMap(
  bulk_decon = osmFISH_bulk_decon,
  bulk_pheno = osmFISH_bulk_pheno,
  family = "binomial",
  coord = osmFISH_bulk_coordinate,
  resolution = "single_cell",
  sample_information_cellType = osmFISH_metadata_cellType
)

pred_result <- PhenoResult$pred_score
phenoPlus <- row.names(pred_result[pred_result$label %in% "phenotype+", ])
model <- PhenoResult$model
X <- as.data.frame(PhenoResult$cell_type_distribution[phenoPlus, ])

# This step can take a very long time
shap_test_plus <- compute_shap_spatial(
  model = model,
  X_bulk = as.data.frame(osmFISH_bulk_decon),
  y_bulk = osmFISH_bulk_pheno,
  X_spatial = X)

resi_result <- SpaPheno_SHAP_residual_analysis(
  shap_df = shap_test_plus,
  feature_name = "Perivascular.Macrophages",
  # Assumes the coordinates loaded above; rownames must match the sample IDs in shap_df
  coordinate_df = osmFISH_bulk_coordinate,
  size = 0.8
)
resi_hot <- resi_result$residual_table
head(resi_hot[order(abs(resi_hot$phi_resid_z), decreasing = TRUE), ], 5)
SpaPheno_SHAP_waterfall_plot(shap_test_plus, "cell_5593", top_n = 10)
resi_result$dependence_plot
resi_result$spatial_plot

} # }