`R/clustering_functions.R`

`hierarchical_term_clustering.Rd`

Hierarchical Clustering of Enriched Terms

```
hierarchical_term_clustering(
  kappa_mat,
  enrichment_res,
  num_clusters = NULL,
  use_description = FALSE,
  clu_method = "average",
  plot_hmap = FALSE,
  plot_dend = TRUE
)
```

- kappa_mat
matrix of kappa statistics (output of `create_kappa_matrix`)

- enrichment_res
data frame of pathfindR enrichment results. Must-have columns are "Term_Description" (if `use_description = TRUE`) or "ID" (if `use_description = FALSE`), "Down_regulated", and "Up_regulated". If `use_active_snw_genes = TRUE`, "non_Signif_Snw_Genes" must also be provided.

- num_clusters
number of clusters to be formed (default = `NULL`). If `NULL`, the optimal number of clusters is determined as the number which yields the highest average silhouette width.

- use_description
Boolean argument to indicate whether term descriptions (in the "Term_Description" column) should be used. (default = `FALSE`)

- clu_method
the agglomeration method to be used (default = "average", see `hclust`)

- plot_hmap
boolean to indicate whether to plot the kappa statistics clustering heatmap or not (default = FALSE)

- plot_dend
boolean to indicate whether to plot the clustering dendrogram partitioned into the optimal number of clusters (default = TRUE)

a vector of cluster assignments, one for each enriched term in the enrichment results.

The function first performs hierarchical clustering of the enriched terms in `enrichment_res` using the kappa statistics (defining the distance as `1 - kappa_statistic`). Next, the clustering dendrogram is cut into k = 2, 3, ..., n - 1 clusters (where n is the number of terms). If `num_clusters` is not specified, the optimal number of clusters is determined as the k value which yields the highest average silhouette width.
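The procedure described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the package's internal code: `toy_kappa` is a made-up matrix standing in for the output of `create_kappa_matrix`, and the silhouette widths are computed with `cluster::silhouette`, which may differ from the routine pathfindR uses internally.

```r
# Illustrative sketch only: toy_kappa is a hypothetical, hand-made matrix,
# not real output of create_kappa_matrix().
library(cluster)  # provides silhouette(); assumed here, not confirmed as pathfindR's choice

terms <- paste0("term", 1:6)
toy_kappa <- matrix(0.1, nrow = 6, ncol = 6, dimnames = list(terms, terms))
toy_kappa[1:3, 1:3] <- 0.8  # terms 1-3 overlap strongly
toy_kappa[4:6, 4:6] <- 0.7  # terms 4-6 overlap strongly
diag(toy_kappa) <- 1

# Distance is defined as 1 - kappa statistic
dist_mat <- as.dist(1 - toy_kappa)
clu <- hclust(dist_mat, method = "average")

# Cut the dendrogram at k = 2, ..., n - 1 and record the average silhouette width
n <- nrow(toy_kappa)
k_values <- 2:(n - 1)
avg_sil <- sapply(k_values, function(k) {
  sil <- silhouette(cutree(clu, k = k), dist_mat)
  mean(sil[, "sil_width"])
})

# The k with the highest average silhouette width is taken as the optimal number
best_k <- k_values[which.max(avg_sil)]
clusters <- cutree(clu, k = best_k)  # named vector: one cluster label per term
```

Supplying `num_clusters` corresponds to skipping the search over `k_values` and cutting the dendrogram at the requested k directly.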

```
if (FALSE) {
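  # default settings: average linkage, number of clusters chosen by silhouette width
  hierarchical_term_clustering(kappa_mat, enrichment_res)
  # complete linkage; the agglomeration method is passed via `clu_method`
  hierarchical_term_clustering(kappa_mat, enrichment_res, clu_method = "complete")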
}
```