Giotto implements three algorithms for enrichment analysis of lower-resolution spatial expression datasets: PAGE, RANK, and hypergeometric. The aim of enrichment analysis is to use continuous values to represent the likelihood that a cell type of interest is present at a given spatial location, which may contain multiple cells.
The PAGE method uses Parametric Analysis of Gene Set Enrichment (PAGE) to evaluate cell type enrichment at each spatial location. Signature genes of the cell types of interest are used as input. An enrichment score is calculated for each spatial location from the expression of the signature genes. A p-value can also be calculated with a permutation test.
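To make the scoring step concrete, here is a minimal base R sketch of a PAGE-style z-score. It assumes the standard PAGE statistic, (S_m - mu) * sqrt(m) / sigma, computed on per-gene fold changes relative to the mean across locations; the toy matrix, the gene and spot names, and the helper `page_score` are all hypothetical, and Giotto's internal implementation may differ in its details.

```r
# Hypothetical toy data: 6 genes x 3 spatial locations (log-scale expression).
expr <- matrix(
  c(5, 1, 1,
    6, 1, 2,
    1, 7, 1,
    1, 6, 2,
    2, 2, 2,
    1, 1, 1),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("gene", 1:6), paste0("spot", 1:3))
)

# Signature genes for one hypothetical cell type.
signature_genes <- c("gene1", "gene2")

# Fold change of each gene at each spot relative to its mean across spots.
# Expression is assumed to be on a log scale, so subtraction gives a log fold change.
fold_change <- expr - rowMeans(expr)

# PAGE-style z-score for one spatial location (hypothetical helper).
page_score <- function(fc_column, genes) {
  m     <- length(genes)            # number of signature genes
  mu    <- mean(fc_column)          # mean fold change over all genes
  sigma <- sd(fc_column)            # standard deviation over all genes
  s_m   <- mean(fc_column[genes])   # mean fold change over signature genes
  (s_m - mu) * sqrt(m) / sigma
}

# One enrichment score per spatial location.
scores <- apply(fold_change, 2, page_score, genes = signature_genes)
print(scores)
```

In this toy example the signature genes are most highly expressed at spot1, so spot1 receives the largest score.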
The RANK method uses a rank-based approach to calculate an enrichment score for the cell types of interest. A single-cell expression matrix and the corresponding cell type labels are used as input. Unlike PAGE, RANK does not require signature gene selection. Based on the gene expression patterns in the single-cell RNA-seq data, RANK evaluates the presence of each cell type at each spatial location. A p-value can also be calculated with a permutation test.
The hypergeometric method uses a hypergeometric test to evaluate the cell type distribution across spatial locations, based on signature genes of the cell types of interest. The enrichment score is calculated as -log10(p-value) of the hypergeometric test.
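As a worked illustration of the -log10(p-value) score, the following base R sketch runs one hypergeometric test for a single location and cell type. The counts are made up, and the way genes are called "highly expressed" at a location is an assumption of this sketch, not a statement about how Giotto selects them.

```r
# Hypothetical counts for one spatial location and one cell type.
n_genes_total   <- 1000  # genes in the dataset
n_signature     <- 50    # signature genes for the cell type
n_top_expressed <- 100   # genes called highly expressed at this location (assumed cutoff)
n_overlap       <- 15    # signature genes among the highly expressed genes

# P(overlap >= n_overlap) under the hypergeometric null.
# phyper() gives P(X <= q), so the upper tail starts at q = n_overlap - 1.
p_value <- phyper(
  q = n_overlap - 1,
  m = n_signature,
  n = n_genes_total - n_signature,
  k = n_top_expressed,
  lower.tail = FALSE
)

# Enrichment score as described in the text.
enrichment_score <- -log10(p_value)
print(enrichment_score)
```

A larger overlap than expected by chance gives a smaller p-value and therefore a larger enrichment score.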
makeSignMatrixPAGE = function(sign_names, sign_list)
This function converts a list of signature genes (e.g. for cell types or processes) into a binary matrix that can be used with the PAGE enrichment option. Each cell type or process should have a vector of genes specific to that cell type or process. These vectors must be combined into a list (sign_list). The names of the cell types or processes in the list must also be provided (sign_names).
The signature matrix is a binary (0/1) matrix. Rows are genes and columns are cell types.
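The shape of such a matrix can be illustrated in base R. This is only a sketch of the expected result, not Giotto's implementation; the cell type names and marker lists are toy inputs chosen for illustration.

```r
# Toy inputs: one vector of signature genes per cell type.
sign_names <- c("astrocyte", "neuron")
sign_list <- list(
  c("Gfap", "Aqp4"),
  c("Snap25", "Syt1")
)

# All genes that appear in at least one signature become the rows.
all_genes <- unique(unlist(sign_list))

# One 0/1 column per cell type: 1 if the gene is in that signature, else 0.
sign_matrix <- sapply(sign_list, function(genes) as.integer(all_genes %in% genes))
dimnames(sign_matrix) <- list(all_genes, sign_names)
print(sign_matrix)
```

With Giotto loaded, the corresponding call would be makeSignMatrixPAGE(sign_names = sign_names, sign_list = sign_list), following the signature given above.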
Once the signature matrix is created, the next step is to run the runPAGEEnrich function.
runPAGEEnrich <- function(gobject, sign_matrix, expression_values = c('normalized', 'scaled', 'custom'), reverse_log_scale = FALSE, logbase = 2, output_enrichment = c('original', 'zscore'), p_value = FALSE, n_times = 1000, name = NULL, return_gobject = TRUE)
expression_values = c('normalized', 'scaled', 'custom')
logbase = 2
The output is a data frame containing the enrichment score for each spatial location and cell type. In addition, if p_value = TRUE is specified, a second data frame with the p-values is also reported.
makeSignMatrixRank <- function(sc_matrix, sc_cluster_ids, gobject = NULL, ties.method = c("random", "max"))
This function makes a rank-based cell type gene signature matrix from an scRNA-seq dataset. The signature matrix contains N vectors, where N is the number of cell types. For each cell type, the vector is a ranked list of genes ordered by log2(mean_expr + 1) - log2(av_expr + 1), where mean_expr is the average expression within the cell type and av_expr is the average expression across all cells. When two values share the same rank, ties.method is used to break the tie.
sc_matrix should be the gene expression matrix in raw form. sc_cluster_ids is the cluster annotation column. ties.method is the tie-breaking method used when assigning ranks.
rank_matrix <- makeSignMatrixRank(sc_matrix = cere_rnaseq2@raw_exprs, sc_cluster_ids = pDataDT(cere_rnaseq2)$leiden, ties.method = "random")
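The ranking criterion behind makeSignMatrixRank can be sketched in base R on a toy matrix. This is an illustration of the log2(mean_expr + 1) - log2(av_expr + 1) criterion only. The toy data are hypothetical, the sketch assumes that rank 1 is given to the gene with the highest fold value, and it uses ties.method = "max" so that the result is deterministic.

```r
# Hypothetical toy scRNA-seq data: 4 genes x 6 cells (raw counts).
sc_matrix <- matrix(
  c(10, 12,  9,  0,  1,  0,
     0,  1,  0,  8, 11,  9,
     5,  4,  6,  5,  6,  4,
     2,  0,  1,  0,  0,  1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:6))
)
sc_cluster_ids <- c("A", "A", "A", "B", "B", "B")

# Average expression of each gene across all cells.
av_expr <- rowMeans(sc_matrix)

# Average expression of each gene within each cell type (genes x cell types).
cell_types <- unique(sc_cluster_ids)
mean_expr <- sapply(cell_types, function(ct) {
  rowMeans(sc_matrix[, sc_cluster_ids == ct, drop = FALSE])
})

# Criterion from the text; av_expr is recycled down each column.
log_fold <- log2(mean_expr + 1) - log2(av_expr + 1)

# Rank genes within each cell type (assumption: rank 1 = highest fold value).
toy_rank_matrix <- apply(-log_fold, 2, rank, ties.method = "max")
print(toy_rank_matrix)
```

Here gene1 is ranked first for cell type A and gene2 is ranked first for cell type B, because each is expressed almost exclusively in that cell type.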
The next step is to call the runRankEnrich() function.
runRankEnrich <- function(gobject, sign_matrix, expression_values = c('normalized', 'raw', 'scaled', 'custom'), reverse_log_scale = FALSE, logbase = 2, output_enrichment = c('original', 'zscore'), ties.method = c("random", "max"), p_value = FALSE, n_times = 1000, name = NULL, return_gobject = TRUE, rbp_p = 0.99, num_agg = 100)
ties.method = c("random", "max")
rbp_p = 0.99
Slide_test <- runRankEnrich(Slide_test, sign_matrix = rank_matrix, expression_values = "normalized", reverse_log_scale = FALSE, logbase = 2, output_enrichment = "original", name = "rank", rbp_p = 0.99, num_agg = 100, ties.method = "random")
createSpatialEnrich = function(gobject, enrich_method = 'hypergeometric', sign_matrix, p_value = FALSE, n_times = 1000 …)
The input is sign_matrix, a binary (0/1) signature matrix with one column per cell type. A p-value can be calculated by setting p_value = TRUE.