Perform Active Subnetwork Search
active_snw_search(
  input_for_search,
  pin_name_path = "Biogrid",
  snws_file = "active_snws",
  dir_for_parallel_run = NULL,
  score_quan_thr = 0.8,
  sig_gene_thr = 0.02,
  search_method = "GR",
  seedForRandom = 1234,
  silent_option = TRUE,
  use_all_positives = FALSE,
  geneInitProbs = 0.1,
  saTemp0 = 1,
  saTemp1 = 0.01,
  saIter = 10000,
  gaPop = 400,
  gaIter = 10000,
  gaThread = 5,
  gaCrossover = 1,
  gaMut = 0,
  grMaxDepth = 1,
  grSearchDepth = 1,
  grOverlap = 0.5,
  grSubNum = 1000
)
input_for_search: the input data that active subnetwork search uses. The input must be a data frame containing at least these 2 columns:
  Gene Symbol
  p value obtained through a test, e.g. differential expression/methylation
pin_name_path: name of the chosen PIN or absolute/path/to/PIN.sif. If a PIN name, it must be one of c('Biogrid', 'STRING', 'GeneMania', 'IntAct', 'KEGG', 'mmu_STRING'). If path/to/PIN.sif, the file must comply with the PIN specifications (default = 'Biogrid').
snws_file: name for the active subnetwork search output data, without file extension (default = 'active_snws').
dir_for_parallel_run: (previously created) directory for a parallel run iteration. Used in the wrapper function (see ?run_pathfindR) (default = NULL).
score_quan_thr: active subnetwork score quantile threshold. Must be between 0 and 1, or set to -1 to disable filtering (default = 0.8).
sig_gene_thr: threshold for the minimum proportion of significant genes in the subnetwork (default = 0.02). If the number of genes to use as threshold is calculated to be < 2 (e.g. 50 significant genes x 0.01 = 0.5), the threshold number is set to 2.
search_method: algorithm to use when performing active subnetwork search. Options are greedy search (GR), simulated annealing (SA) or genetic algorithm (GA) (default = 'GR').
seedForRandom: seed for reproducibility while running the Java modules; applies to GR and SA (default = 1234).
silent_option: boolean value indicating whether to print messages to the console (FALSE) or to redirect them to a temporary file (TRUE) during active subnetwork search (default = TRUE). This option was added because console messages are printed out of order during parallel runs.
use_all_positives: if TRUE, in GA, adds an individual with all positive nodes; in SA, initializes the candidate solution with all positive nodes (default = FALSE).
geneInitProbs: for SA and GA, probability of adding a gene to the initial solution (default = 0.1).
saTemp0: initial temperature for SA (default = 1.0).
saTemp1: final temperature for SA (default = 0.01).
saIter: iteration number for SA (default = 10000).
gaPop: population size for GA (default = 400).
gaIter: iteration number for GA (default = 10000).
gaThread: number of threads to be used in GA (default = 5).
gaCrossover: for GA, applies crossover with the given probability (default = 1, i.e. always perform crossover).
gaMut: for GA, applies mutation with the given mutation rate (default = 0, i.e. mutation off).
grMaxDepth: maximum depth in greedy search, 0 for no limit (default = 1).
grSearchDepth: search depth in greedy search (default = 1).
grOverlap: overlap threshold for results of greedy search (default = 0.5).
grSubNum: number of subnetworks to be presented in the results (default = 1000).
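The data frame expected as input_for_search can be assembled and sanity-checked in base R before the search is run. The following is a minimal sketch, not part of the package: the gene symbols and p values are made up, the column names GENE and P_VALUE follow the example on this page, and the checks are assumptions about what a well-formed input looks like rather than the validation the function itself performs.

```r
# Toy input: one row per gene, with a gene symbol and a p value
# (values are illustrative only)
input_df <- data.frame(
  GENE = c("TP53", "BRCA1", "EGFR"),
  P_VALUE = c(0.001, 0.02, 0.04),
  stringsAsFactors = FALSE
)

# Assumed sanity checks, not the package's own validation
stopifnot(
  is.data.frame(input_df),
  all(c("GENE", "P_VALUE") %in% colnames(input_df)),
  is.character(input_df$GENE),
  is.numeric(input_df$P_VALUE),
  all(input_df$P_VALUE >= 0 & input_df$P_VALUE <= 1),
  anyDuplicated(input_df$GENE) == 0
)
```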
A list of the genes in each identified active subnetwork whose score is greater than the `score_quan_thr`th quantile of the subnetwork scores and which contains at least the proportion `sig_gene_thr` of significant genes.
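The two filters described for the return value can be illustrated on toy data. This is a sketch of one reading of the documented behaviour, not the package's implementation: the subnetworks, scores and gene names are invented, and the assumption is that the gene-count threshold is the number of significant genes times sig_gene_thr, raised to 2 when smaller.

```r
# Toy data; every name and value here is illustrative
sig_genes <- paste0("G", 1:50)  # 50 significant genes
snws <- list(
  snw1 = list(genes = c("G1", "G2", "G3", "X1"), score = 4.1),
  snw2 = list(genes = c("G4", "X2"), score = 3.4),
  snw3 = list(genes = c("G5", "G6"), score = 0.5),
  snw4 = list(genes = c("G7", "G8", "G9"), score = 2.8),
  snw5 = list(genes = c("G10", "G11"), score = 1.2)
)
score_quan_thr <- 0.8
sig_gene_thr <- 0.02

scores <- vapply(snws, function(s) s$score, numeric(1))

# Score cutoff: the score_quan_thr-th quantile; -1 disables the filter
score_cutoff <- if (score_quan_thr == -1) {
  -Inf
} else {
  unname(quantile(scores, score_quan_thr))
}

# Gene-count threshold: values below 2 are raised to 2 (assumed reading)
min_sig <- max(2, length(sig_genes) * sig_gene_thr)

# Number of significant genes in each subnetwork
n_sig <- vapply(snws, function(s) sum(s$genes %in% sig_genes), integer(1))

# Keep subnetworks that pass both filters and return their gene lists
keep <- scores > score_cutoff & n_sig >= min_sig
kept <- lapply(snws[keep], function(s) s$genes)
```

With these toy scores only the top-scoring subnetwork exceeds the 0.8 quantile, so `kept` holds a single gene vector.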
processed_df <- example_pathfindR_input[1:15, -2]
colnames(processed_df) <- c('GENE', 'P_VALUE')
GR_snws <- active_snw_search(
  input_for_search = processed_df,
  pin_name_path = 'KEGG',
  search_method = 'GR',
  score_quan_thr = 0.8
)
#> Found 2 active subnetworks
#>
# clean-up
unlink('active_snw_search', recursive = TRUE)
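The same example can be adapted to simulated annealing by changing search_method and passing the SA-specific arguments. The sketch below uses only arguments and default values listed on this page; the guard with requireNamespace and the use of do.call are illustrative choices, and it is assumed that pathfindR and a working Java installation are available when the call is actually run.

```r
# Simulated annealing settings, taken from the documented defaults
sa_args <- list(
  pin_name_path = "KEGG",
  search_method = "SA",
  saTemp0 = 1,
  saTemp1 = 0.01,
  saIter = 10000,
  geneInitProbs = 0.1,
  seedForRandom = 1234
)

# Run the search only when pathfindR is installed
if (requireNamespace("pathfindR", quietly = TRUE)) {
  library(pathfindR)
  processed_df <- example_pathfindR_input[1:15, -2]
  colnames(processed_df) <- c("GENE", "P_VALUE")
  SA_snws <- do.call(
    active_snw_search,
    c(list(input_for_search = processed_df), sa_args)
  )
  # clean-up, as in the example above
  unlink("active_snw_search", recursive = TRUE)
}
```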