Calculate Agglomerated Scores of Enriched Terms for Each Subject

  cases = NULL,
  use_description = FALSE,
  plot_hmap = TRUE,



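A data frame that must contain 3 of the columns below: the two gene columns, plus either the description column or the ID column, depending on use_description: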


Description of the enriched term (necessary if use_description = TRUE)


ID of the enriched term (necessary if use_description = FALSE)


the up-regulated genes in the input involved in the given term's gene set, comma-separated


the down-regulated genes in the input involved in the given term's gene set, comma-separated


the experiment (e.g., gene expression/methylation) matrix. Columns are samples and rows are genes. Column names must contain sample names and row names must contain the gene symbols.


(Optional) A vector of sample names that are cases in the case/control experiment. (default = NULL)


Boolean argument to indicate whether term descriptions (in the "Term_Description" column) should be used. (default = FALSE)


Boolean value to indicate whether or not to draw the heatmap plot of the scores. (default = TRUE)


Additional arguments for plot_scores, controlling the aesthetics of the heatmap plot


Matrix of agglomerated scores of each enriched term per sample. Columns are samples, rows are enriched terms. Optionally, displays a heatmap of this matrix.

Conceptual Background

For an experiment matrix (containing expression, methylation, etc. values), the rows of which are genes and the columns of which are samples, we denote:

  • E as a matrix of size m x n

  • G as the set of all genes in the experiment, G = E_{i.}, i ∈ [1, m]

  • S as the set of all samples in the experiment, S = E_{.j}, j ∈ [1, n]

We next define the gene score matrix GS (the standardized experiment matrix, also of size m x n) as:

GS_{gs} = (E_{gs} - ē_g) / s_g

where g ∈ G, s ∈ S, ē_g is the mean of all values for gene g, and s_g is the standard deviation of all values for gene g.
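
The standardization above can be sketched in a few lines of base R. This is a minimal illustration, not the function's actual implementation: the toy matrix E and its values are made up, and the sketch assumes the sample standard deviation (R's sd(), which divides by n - 1), which the text does not specify.

```r
# Toy experiment matrix E: 3 genes (rows) x 4 samples (columns); values are made up
E <- matrix(
  c(1, 2, 3, 4,
    2, 4, 6, 8,
    5, 5, 6, 8),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("s", 1:4))
)

# Per-gene (row-wise) mean and standard deviation
# Assumption: sample standard deviation, as computed by sd()
gene_means <- rowMeans(E)
gene_sds <- apply(E, 1, sd)

# Standardize each row; the length-m vectors are recycled down the columns,
# so element [g, s] is paired with the mean and sd of gene g
GS <- (E - gene_means) / gene_sds
```

After this step every row of GS has mean 0 and standard deviation 1, so genes measured on different scales contribute comparably to the term scores.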

We next denote by T a set of terms (where each t ∈ T is a set of term-related genes, i.e., t = {g_x, ..., g_y} ⊂ G) and finally define the agglomerated term score matrix TS (of size |T| x n, where rows correspond to terms and columns correspond to samples) as:

TS_{ts} = (1/|t|) ∑_{g ∈ t} GS_{gs}, where t ∈ T and s ∈ S.
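
As a rough sketch of the term score definition, the averaging can be written in base R as below. The matrix GS, the term names, and the gene names are hypothetical toy data. Dropping genes that are absent from GS is an assumption made here to enforce t ⊂ G; it is not necessarily how the function itself handles them.

```r
# Toy standardized gene score matrix GS: 3 genes x 2 samples; values are made up
GS <- matrix(
  c( 1, -1,
     0,  2,
    -1,  3),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2"))
)

# Hypothetical terms, each given as a set of gene symbols
terms <- list(
  term1 = c("geneA", "geneB"),
  term2 = c("geneB", "geneC", "geneX")  # geneX is not a row of GS
)

# For each term, average the scores of its genes within every sample
TS <- t(sapply(terms, function(genes) {
  # Assumption: genes missing from GS are dropped so that t is a subset of G
  genes <- intersect(genes, rownames(GS))
  colMeans(GS[genes, , drop = FALSE])
}))
```

TS has one row per term and one column per sample. With these toy values, term1 scores 0.5 in both samples, while term2 scores -0.5 in s1 and 2.5 in s2.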


# Compute the term-by-sample score matrix without drawing the heatmap
score_matrix <- score_terms(RA_output, RA_exp_mat, plot_hmap = FALSE)