Hierarchical Clustering of Enriched Terms

hierarchical_term_clustering(
  kappa_mat,
  enrichment_res,
  num_clusters = NULL,
  use_description = FALSE,
  clu_method = "average",
  plot_hmap = FALSE,
  plot_dend = TRUE
)

## Arguments

kappa_mat

matrix of kappa statistics (output of create_kappa_matrix)

enrichment_res

data frame of pathfindR enrichment results. Required columns are "Term_Description" (if use_description = TRUE) or "ID" (if use_description = FALSE), "Down_regulated", and "Up_regulated". If use_active_snw_genes = TRUE, "non_Signif_Snw_Genes" must also be provided.

num_clusters

number of clusters to be formed (default = NULL). If NULL, the number of clusters is chosen as the value that yields the highest average silhouette width.

use_description

boolean to indicate whether term descriptions (in the "Term_Description" column) should be used instead of term IDs (in the "ID" column) (default = FALSE)

clu_method

the agglomeration method to be used (default = "average", see hclust)

plot_hmap

boolean to indicate whether to plot the kappa statistics clustering heatmap or not (default = FALSE)

plot_dend

boolean to indicate whether to plot the clustering dendrogram partitioned into the optimal number of clusters (default = TRUE)

## Value

a vector of cluster assignments, one for each enriched term in the enrichment results.

## Details

The function first performs hierarchical clustering of the enriched terms in enrichment_res using the kappa statistics, defining the distance between two terms as 1 - kappa_statistic. If num_clusters is specified, the dendrogram is cut into that many clusters. Otherwise, the dendrogram is cut into k = 2, 3, ..., n - 1 clusters (where n is the number of terms), and the optimal number of clusters is the k value that yields the highest average silhouette width.
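
The procedure described above can be sketched in a few lines of base R plus the cluster package. This is a minimal illustration under stated assumptions, not the package's actual implementation: the toy matrix kappa_toy, its values, and the term names are hypothetical, and only stats::hclust, stats::cutree, and cluster::silhouette are used.

```r
library(cluster)  # provides silhouette()

# Toy symmetric kappa matrix for five hypothetical terms (illustrative values only)
terms <- paste0("term_", 1:5)
kappa_toy <- matrix(
  c(1.0, 0.8, 0.7, 0.1, 0.0,
    0.8, 1.0, 0.6, 0.2, 0.1,
    0.7, 0.6, 1.0, 0.1, 0.2,
    0.1, 0.2, 0.1, 1.0, 0.9,
    0.0, 0.1, 0.2, 0.9, 1.0),
  nrow = 5, byrow = TRUE, dimnames = list(terms, terms)
)

# Distance between two terms is defined as 1 - kappa
term_dist <- as.dist(1 - kappa_toy)
clu <- stats::hclust(term_dist, method = "average")

# Cut the dendrogram into k = 2, ..., n - 1 clusters and record the
# average silhouette width obtained for each k
k_values <- 2:(nrow(kappa_toy) - 1)
avg_sil <- sapply(k_values, function(k) {
  sil <- cluster::silhouette(stats::cutree(clu, k = k), term_dist)
  mean(sil[, "sil_width"])
})

# The k with the highest average silhouette width is taken as optimal
best_k <- k_values[which.max(avg_sil)]
clusters <- stats::cutree(clu, k = best_k)
```

In this toy matrix, terms 1 to 3 are mutually similar and terms 4 and 5 are similar to each other, so the silhouette criterion selects two clusters. Supplying a fixed number of clusters corresponds to skipping the search and calling cutree once with that value.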

## Examples

if (FALSE) {
hierarchical_term_clustering(kappa_mat, enrichment_res)
hierarchical_term_clustering(kappa_mat, enrichment_res, clu_method = "complete")
}