├── .Rbuildignore
├── .gitignore
├── DESCRIPTION
├── NAMESPACE
├── NEWS.md
├── R
    ├── ActivePathways.r
    ├── cytoscape.r
    ├── gmt.r
    ├── merge_p.r
    └── statistical_tests.r
├── README.md
├── inst
    └── extdata
    │   ├── Adenocarcinoma_scores_subset.tsv
    │   ├── Differential_expression_rna_protein.tsv
    │   ├── enrichmentMap__legend.pdf
    │   ├── enrichmentMap__pathways.gmt
    │   ├── enrichmentMap__pathways.txt
    │   ├── enrichmentMap__subgroups.txt
    │   ├── hsapiens_REAC_subset.gmt
    │   └── hsapiens_REAC_subset2.gmt
├── man
    ├── ActivePathways.Rd
    ├── DPM.Rd
    ├── GMT.Rd
    ├── brownsMethod.Rd
    ├── columnSignificance.Rd
    ├── enrichmentAnalysis.Rd
    ├── export_as_CSV.Rd
    ├── hypergeometric.Rd
    ├── makeBackground.Rd
    ├── merge_p_values.Rd
    ├── orderedHypergeometric.Rd
    └── prepareCytoscape.Rd
├── tests
    ├── testthat.R
    └── testthat
    │   ├── helper.r
    │   ├── hsapiens_REAC_subset.gmt
    │   ├── test.gmt
    │   ├── test_columnContribution.r
    │   ├── test_columnSignificance.r
    │   ├── test_cytoscape.r
    │   ├── test_data.txt
    │   ├── test_data_rna_protein.tsv
    │   ├── test_enrichmentAnalysis.r
    │   ├── test_export_CSV.r
    │   ├── test_merge_p_values.r
    │   ├── test_orderedHypergeometric.r
    │   ├── test_return.r
    │   └── test_validation.r
└── vignettes
    ├── ActivePathways-vignette.Rmd
    ├── CreateEnrichmentMapDialogue_V2.png
    ├── ImportStep_V2.png
    ├── LegendView.png
    ├── LegendView_Custom.png
    ├── LegendView_RColorBrewer.png
    ├── NetworkStep1_V2.png
    ├── NetworkStep2_V2.png
    ├── PropertiesDropDown2_V2.png
    ├── StylePanel_V2.png
    ├── border_line_type.jpg
    ├── legend.png
    ├── lineplot_tutorial.png
    ├── new_map.png
    └── set_aesthetic.jpg


/.Rbuildignore:
--------------------------------------------------------------------------------
1 | ^.*\.Rproj$
2 | ^\.Rproj\.user$
3 | ^self_testing$
4 | ^\.Rhistory$
5 | ^\.gitignore$
6 | ^Notes$
7 | ^README.md$
8 | ^results_ActivePathways.csv$
9 | 


--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | .Rproj
2 | .Rproj.user
3 | .Rhistory
4 | .RData
5 | .DS_Store
6 | ActivePathways.Rproj
7 | 


--------------------------------------------------------------------------------
/DESCRIPTION:
--------------------------------------------------------------------------------
 1 | Package: ActivePathways
 2 | Title: Integrative Pathway Enrichment Analysis of Multivariate Omics Data
 3 | Version: 2.0.5
 4 | Authors@R: c(person("Juri", "Reimand", email = "juri.reimand@utoronto.ca", role = c("aut", "cre")),
 5 |            person("Jonathan", "Barenboim", email = "jon.barenboim@gmail.com", role = "ctb"),
 6 |            person("Mykhaylo", "Slobodyanyuk", email = "michael.slobodyanyuk@oicr.on.ca", role = "aut"))
 7 | Description: Framework for analysing multiple omics datasets in the context of molecular pathways, biological processes and other types of gene sets. The package uses p-value merging to combine gene- or protein-level signals, followed by ranked hypergeometric tests to determine enriched pathways and processes. Genes can be integrated using directional constraints that reflect how the input datasets are expected to interact with one another. This approach allows researchers to interpret a series of omics datasets in the context of known biology and gene function, and discover associations that are only apparent when several datasets are combined. The recent version of the package is part of the following publication: Directional integration and pathway enrichment analysis for multi-omics data. Slobodyanyuk M^, Bahcheli AT^, Klein ZP, Bayati M, Strug LJ, Reimand J. Nature Communications (2024) <doi:10.1038/s41467-024-49986-4>.
 8 | Depends: R (>= 3.6)
 9 | Imports:
10 |     data.table,
11 |     ggplot2
12 | License: GPL-3
13 | URL: 
14 | BugReports: https://github.com/reimandlab/ActivePathways/issues
15 | Encoding: UTF-8
16 | LazyData: true
17 | RoxygenNote: 7.3.1
18 | Suggests: testthat,
19 |     knitr,
20 |     rmarkdown,
21 |     RColorBrewer
22 | VignetteBuilder: knitr
23 | 


--------------------------------------------------------------------------------
/NAMESPACE:
--------------------------------------------------------------------------------
 1 | # Generated by roxygen2: do not edit by hand
 2 | 
 3 | S3method("$",GMT)
 4 | S3method("[",GMT)
 5 | S3method("[[",GMT)
 6 | S3method(print,GMT)
 7 | export(ActivePathways)
 8 | export(DPM)
 9 | export(brownsMethod)
10 | export(export_as_CSV)
11 | export(is.GMT)
12 | export(makeBackground)
13 | export(merge_p_values)
14 | export(orderedHypergeometric)
15 | export(read.GMT)
16 | export(write.GMT)
17 | import(data.table)
18 | import(ggplot2)
19 | 


--------------------------------------------------------------------------------
/NEWS.md:
--------------------------------------------------------------------------------
 1 | ### ActivePathways 2.0.5
 2 | * Fixed a minor bug when exporting results from the ActivePathways() output as a data.table into a csv file. The bug occurred when all results unfiltered by statistical significance values were exported, resulting in NULL values in gene overlap columns of resulting tables. These NULL entries are now first converted to an empty string inside the export_as_CSV() function.
 3 | 
 4 | ### ActivePathways 2.0.4
 5 | * Minor update to ensure the 'scores' and 'scores_direction' matrices have the same number of rows, and that the gene row names in 'scores' are in the same order as 'scores_direction'. The method now reports an error and terminates if two matrices are misaligned.
 6 | 
 7 | ### ActivePathways 2.0.3
 8 | * Minor updates in documentation and code examples. 
 9 | 
10 | ### ActivePathways 2.0.2
11 | * Separated the directional P-value merging methods into separate functions. 'Fisher_directional', 'DPM', 'Stouffer_directional', and 'Strube_directional' methods perform directional integration. The 'scores_direction' and 'constraints_vector' parameters must be provided.
12 | 
13 | ### ActivePathways 2.0.1
14 | * Changed how very small P-values are processed before P-value merging. P-values of '0' or anything less than '1e-300' are converted to '1e-300'.
15 | 
16 | ### ActivePathways 2.0.0
17 | * Incorporated 'scores_direction' and 'constraints_vector' parameters to ActivePathways() and merge_p_values() to account for the direction between datasets when performing p-value merging.
18 | * Added the 'Stouffer' and 'Strube' p-value merging methods as alternatives to 'Fisher' and 'Brown', respectively. 
19 | * Changed the naming convention of parameters and objects, substituting a period '.' for an underscore '_'. 
20 | 
21 | ### ActivePathways 1.1.1
22 | * Added an option for the colours specified in the "custom_colors" parameter to be provided in any order, as long as the vector "names()" match the column names of the "scores" matrix. 
23 | * Fixed an error where the "combined" contribution label was absent from the legend.pdf ActivePathways output file. 
24 | 
25 | ### ActivePathways 1.1.0
26 | * Updated the filtering procedure of gene sets in the GMT file when a custom gene background is provided. Given a background gene list, the GMT gene sets are first modified to only include the genes of the background list, and second, the gene sets are filtered by gene set size. Gene sets lacking any genes from the background list are removed. This update will result in a more lenient multiple testing correction in analyses with a custom background gene list.
27 | 
28 | ### ActivePathways 1.0.4
29 | * Added three new parameters to the ActivePathways and prepareCytoscape functions. These include "color_palette", "custom_colors" and "color_integrated_only" to provide more options for node coloring in Cytoscape. 
30 | 
31 | ### ActivePathways 1.0.3
32 | * Removed dependency used for testing for CRAN compliance.
33 | 
34 | ### ActivePathways 1.0.2
35 | * Renamed package to ActivePathways from activePathways for consistency 
36 | with function and publication
37 | * Added new function export_as_CSV(res, file_name) to save data in 
38 | spreadsheet-friendly formats
39 | * Updated README-file with an actionable step-by-step tutorial
40 | * Changed logic of creating files for Enrichment Map: the user can provide 
41 | the parameter "cytoscape.file.tag" for creating the required files. If the 
42 | parameter is NA (default), no files are created. No directories are created. 
43 | * Removed the parameter "return.all" as it was redundant with the 
44 | parameter "significance".
45 | * Removed the parameter "reanalyze" to simplify the package and leave the structuring 
46 | of results up to the user.
47 | * Removed the dependency on the R package metap. As a result, only Fisher's and Brown's p-value 
48 | merging options are available.
49 | * Updated the vignette that now describes the ActivePathways package as well as the following steps of visualising results as enrichment maps in Cytoscape. 
50 | 

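The 2.0.1 entry above describes flooring very small P-values before merging. A minimal base-R sketch of that idea follows. The helper name `floor_p_values` is hypothetical and not part of the package, and the package's internal implementation may differ.

```r
# Hypothetical helper (not a package function): replace P-values below
# 1e-300, including exact zeros, with 1e-300 so that log(p) stays finite.
floor_p_values <- function(p, floor = 1e-300) {
  stopifnot(is.numeric(p), all(p >= 0 & p <= 1))
  p[p < floor] <- floor
  p
}

p <- c(0, 1e-320, 1e-10, 0.5)
floored <- floor_p_values(p)
print(floored)
print(is.finite(log(floored)))
```

Flooring rather than dropping these values keeps the most significant genes in the analysis while avoiding infinite test statistics in log-based merging methods.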

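The 2.0.4 entry above requires 'scores' and 'scores_direction' to have the same number of rows in the same gene order. One way a user might align the two matrices before calling ActivePathways() is sketched below in base R. The toy matrices and gene names are illustrative assumptions, not package data.

```r
# Toy inputs with the same genes in a different row order (illustrative only).
scores <- matrix(c(0.01, 0.20, 0.03, 0.40, 0.05, 0.60), ncol = 2,
                 dimnames = list(c("TP53", "KRAS", "EGFR"), c("rna", "protein")))
scores_direction <- matrix(c(1.5, -2.0, 0.7, 0.3, -1.1, 2.2), ncol = 2,
                           dimnames = list(c("EGFR", "TP53", "KRAS"), c("rna", "protein")))

# Keep genes present in both matrices, then reorder rows and columns of
# scores_direction so that they match scores exactly.
shared <- intersect(rownames(scores), rownames(scores_direction))
scores <- scores[shared, , drop = FALSE]
scores_direction <- scores_direction[shared, colnames(scores), drop = FALSE]

# Both matrices now share identical row and column names.
stopifnot(identical(dimnames(scores), dimnames(scores_direction)))
```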
--------------------------------------------------------------------------------
/R/ActivePathways.r:
--------------------------------------------------------------------------------
  1 | #' ActivePathways
  2 | #'
  3 | #' @param scores A numerical matrix of p-values where each row is a gene and
  4 | #'   each column represents an omics dataset (evidence). Rownames correspond to the genes 
  5 | #'   and colnames to the datasets. All values must be 0<=p<=1. We recommend converting 
  6 | #'   missing values to ones. 
  7 | #' @param gmt A GMT object to be used for enrichment analysis. If a filename, a
  8 | #'   GMT object will be read from the file.
  9 | #' @param background A character vector of gene names to be used as a
 10 | #'   statistical background. By default, the background is all genes that appear
 11 | #'   in \code{gmt}.
 12 | #' @param geneset_filter A numeric vector of length two giving the lower and 
 13 | #'   upper limits for the number of genes annotated to each pathway in gmt.
 14 | #'   Pathways with a geneset shorter than \code{geneset_filter[1]} or longer
 15 | #'   than \code{geneset_filter[2]} will be removed. Set either value to NA
 16 | #'   to not enforce a minimum or maximum value, or set \code{geneset_filter} to 
 17 | #'   \code{NULL} to skip filtering.
 18 | #' @param cutoff A maximum merged p-value for a gene to be used for analysis.
 19 | #'   Any genes with merged, unadjusted \code{p > cutoff} will be discarded 
 20 | #'   before testing.
 21 | #' @param significant Significance cutoff for selecting enriched pathways. Pathways with
 22 | #'   \code{adjusted_p_val <= significant} will be selected as results.
 23 | #' @param merge_method Statistical method to merge p-values. See section on Merging P-Values
 24 | #' @param correction_method Statistical method to correct p-values. See
 25 | #'   \code{\link[stats]{p.adjust}} for details.
 26 | #' @param cytoscape_file_tag The directory and/or file prefix to which the output files
 27 | #'   for generating enrichment maps should be written. If NA, files will not be written. 
 28 | #' @param color_palette Color palette from RColorBrewer::brewer.pal to color each
 29 | #'   column in the scores matrix. If NULL grDevices::rainbow is used by default.
 30 | #' @param custom_colors A character vector of custom colors for each column in the scores matrix.
 31 | #' @param color_integrated_only A character vector of length 1 specifying the color of the 
 32 | #'   "combined" pathway contribution.
 33 | #' @param scores_direction A numerical matrix of log2 transformed fold-change values where each row is a
 34 | #'   gene and each column represents a dataset (evidence). Rownames correspond to the genes
 35 | #'   and colnames to the datasets. We recommend converting missing values to zero. 
 36 | #'   Must have the same dimensions as the scores parameter. Datasets without directional information should be set to 0.
 37 | #' @param constraints_vector A numerical vector of +1, -1 or 0 values corresponding to the user-defined
 38 | #'   directional relationship between columns in scores_direction. Datasets without directional information should
 39 | #'   be set to 0.
 40 | #'
 41 | #' @return A data.table of terms (enriched pathways) containing the following columns:
 42 | #'   \describe{
 43 | #'     \item{term_id}{The database ID of the term}
 44 | #'     \item{term_name}{The full name of the term}
 45 | #'     \item{adjusted_p_val}{The associated p-value, adjusted for multiple testing}
 46 | #'     \item{term_size}{The number of genes annotated to the term}
 47 | #'     \item{overlap}{A character vector of the genes enriched in the term}
 48 | #'     \item{evidence}{Columns of \code{scores} (i.e., omics datasets) that contributed 
 49 | #'          individually to the enrichment of the term. Each input column is evaluated 
 50 | #'          separately for enrichments and added to the evidence if the term is found.}
 51 | #'   }
 52 | #'
 53 | #' @section Merging P-values:
 54 | #' To obtain a single p-value for each gene across the multiple omics datasets considered, 
 55 | #' the p-values in \code{scores} are merged row-wise using a data fusion approach of p-value merging. 
 56 | #' The eight available methods are:
 57 | #' \describe{
 58 | #'  \item{Fisher}{Fisher's method assumes p-values are uniformly
 59 | #'  distributed and performs a chi-squared test on the statistic sum(-2 log(p)).
 60 | #'  This method is most appropriate when the columns in \code{scores} are
 61 | #'  independent.}
 62 | #'  \item{Fisher_directional}{Fisher's method modification that allows for 
 63 | #'  directional information to be incorporated with the \code{scores_direction}
 64 | #'  and \code{constraints_vector} parameters.}
 65 | #'  \item{Brown}{Brown's method extends Fisher's method by accounting for the
 66 | #'  covariance in the columns of \code{scores}. It is more appropriate when the
 67 | #'  tests of significance used to create the columns in \code{scores} are not
 68 | #'  necessarily independent. The Brown's method is therefore recommended for 
 69 | #'  many omics integration approaches.}
 70 | #'  \item{DPM}{DPM extends Brown's method by incorporating directional information
 71 | #'  using the \code{scores_direction} and \code{constraints_vector} parameters.}
 72 | #'  \item{Stouffer}{Stouffer's method assumes p-values are uniformly distributed
 73 | #'  and transforms p-values into a Z-score using the cumulative distribution function of a
 74 | #'  standard normal distribution. This method is appropriate when the columns in \code{scores}
 75 | #'   are independent.}
 76 | #'  \item{Stouffer_directional}{Stouffer's method modification that allows for 
 77 | #'  directional information to be incorporated with the \code{scores_direction}
 78 | #'  and \code{constraints_vector} parameters.}
 79 | #'  \item{Strube}{Strube's method extends Stouffer's method by accounting for the 
 80 | #'  covariance in the columns of \code{scores}.}
 81 | #'  \item{Strube_directional}{Strube's method modification that allows for 
 82 | #'  directional information to be incorporated with the \code{scores_direction}
 83 | #'  and \code{constraints_vector} parameters.}
 84 | #' }
 85 | #'
 86 | #' @section Cytoscape:
 87 | #'   To visualize and interpret enriched pathways, ActivePathways provides an option
 88 | #'   to further analyse results as enrichment maps in the Cytoscape software. 
 89 | #'   If \code{!is.na(cytoscape_file_tag)}, four files will be written that can be used 
 90 | #'   to build enrichment maps. This requires the EnrichmentMap and enhancedGraphics apps.
 91 | #'
 92 | #' The four files written are:
 93 | #'   \describe{
 94 | #'     \item{pathways.txt}{A list of significant terms and the
 95 | #'     associated p-value. Only terms with \code{adjusted_p_val <= significant} are
 96 | #'     written to this file.}
 97 | #'     \item{subgroups.txt}{A matrix indicating whether the significant terms (pathways)
 98 | #'     were also found to be significant when considering only one column from
 99 | #'     \code{scores}. A one indicates that the term was found to be significant 
100 | #'     when only p-values in that column were used to select genes.}
101 | #'     \item{pathways.gmt}{A shortened version of the supplied GMT
102 | #'     file, containing only the significantly enriched terms in pathways.txt. }
103 | #'     \item{legend.pdf}{A legend with colours matching contributions
104 | #'     from columns in \code{scores}.}
105 | #'   }
106 | #'
107 | #'   How to use: Create an enrichment map in Cytoscape with the file of terms
108 | #'   (pathways.txt) and the shortened gmt file
109 | #'   (pathways.gmt). Upload the subgroups file (subgroups.txt) as a table
110 | #'   using the menu File > Import > Table from File. To paint nodes according 
111 | #'   to the type of supporting evidence, use the 'style'
112 | #'   panel, set image/Chart1 to use the column `instruct` and the passthrough
113 | #'   mapping type. Make sure the app enhancedGraphics is installed. 
114 | #'   Lastly, use the file legend.pdf as a reference for colors in the enrichment map.
115 | #'
116 | #' @examples
117 | #'     fname_scores <- system.file("extdata", "Adenocarcinoma_scores_subset.tsv", 
118 | #'          package = "ActivePathways")
119 | #'     fname_GMT = system.file("extdata", "hsapiens_REAC_subset.gmt",
120 | #'          package = "ActivePathways")
121 | #'
122 | #'     dat <- as.matrix(read.table(fname_scores, header = TRUE, row.names = 'Gene'))
123 | #'     dat[is.na(dat)] <- 1
124 | #'
125 | #'     ActivePathways(dat, fname_GMT)
126 | #'
127 | #' @import data.table
128 | #'
129 | #' @export
130 | 
131 | ActivePathways <-  function(scores, gmt, background = makeBackground(gmt),
132 |                             geneset_filter = c(5, 1000), cutoff = 0.1, significant = 0.05,
133 |                             merge_method = c("Fisher", "Fisher_directional", "Brown", "DPM", "Stouffer",
134 |                                             "Stouffer_directional", "Strube", "Strube_directional"),
135 |                             correction_method = c("holm", "fdr", "hochberg", "hommel",
136 |                                                   "bonferroni", "BH", "BY", "none"),
137 |                             cytoscape_file_tag = NA, color_palette = NULL, custom_colors = NULL, 
138 |                             color_integrated_only = "#FFFFF0", scores_direction = NULL, 
139 |                             constraints_vector = NULL) {
140 |    
141 |    merge_method <- match.arg(merge_method)
142 |    correction_method <- match.arg(correction_method)
143 |    
144 |    ##### Validation #####
145 |    # scores
146 |    if (!(is.matrix(scores) && is.numeric(scores))) stop("scores must be a numeric matrix")
147 |    if (any(is.na(scores))) stop("scores cannot contain missing values, we recommend replacing NA with 1 or removing")
148 |    if (any(scores < 0) || any(scores > 1)) stop("All values in scores must be in [0,1]")
149 |    if (any(duplicated(rownames(scores)))) stop("scores matrix contains duplicated genes - rownames must be unique")
150 |    
151 |    # scores_direction and constraints_vector
152 |    if (xor(!is.null(scores_direction),!is.null(constraints_vector))) stop("Both scores_direction and constraints_vector must be provided")
153 |    if (!is.null(scores_direction) && !is.null(constraints_vector)){
154 |       if (!(is.numeric(constraints_vector) && is.vector(constraints_vector))) stop("constraints_vector must be a numeric vector")
155 |       if (any(!constraints_vector %in% c(1,-1,0))) stop("constraints_vector must contain the values: 1, -1 or 0")
156 |       if (!(is.matrix(scores_direction) && is.numeric(scores_direction))) stop("scores_direction must be a numeric matrix")
157 |       if (any(is.na(scores_direction))) stop("scores_direction cannot contain missing values, we recommend replacing NA with 0 or removing")
158 |       if (nrow(scores) != nrow(scores_direction)) stop("scores and scores_direction must have the same number of rows")
159 |       if (any(!rownames(scores_direction) %in% rownames(scores))) stop ("scores_direction gene names must match scores genes")
160 |       if (any(rownames(scores) != rownames(scores_direction))) stop("scores genes should be in the same order as scores_direction genes")
161 |       if (is.null(colnames(scores)) || is.null(colnames(scores_direction))) stop("column names must be provided to scores and scores_direction")
162 |       if (any(!colnames(scores_direction) %in% colnames(scores))) stop("scores_direction column names must match scores column names")
163 |       if (length(constraints_vector) != length(colnames(scores_direction))) stop("constraints_vector should have the same number of entries as columns in scores_direction")
164 |       if (merge_method %in% c("Fisher","Brown","Stouffer","Strube")) stop("Only DPM, Fisher_directional, Stouffer_directional, and Strube_directional methods support directional integration")
165 |       if (any(constraints_vector %in% 0) &&  !all(scores_direction[,constraints_vector %in% 0] == 0)) 
166 |          stop("scores_direction entries must be set to 0's for columns that do not contain directional information")
167 |       if (!is.null(names(constraints_vector))){
168 |          if (!identical(names(constraints_vector), colnames(scores_direction)) || !identical(names(constraints_vector), colnames(scores))) {
169 |             stop("the constraints_vector entries should match the order of scores and scores_direction columns")
170 |          }}}
171 |    
172 |    # cutoff and significant
173 |    stopifnot(length(cutoff) == 1)
174 |    stopifnot(is.numeric(cutoff))
175 |    if (cutoff < 0 || cutoff > 1) stop("cutoff must be a value in [0,1]")
176 |    stopifnot(length(significant) == 1)
177 |    stopifnot(is.numeric(significant))
178 |    if (significant < 0 || significant > 1) stop("significant must be a value in [0,1]")
179 |    
180 |    # gmt
181 |    if (!is.GMT(gmt)) gmt <- read.GMT(gmt)
182 |    if (length(gmt) == 0) stop("No pathways in gmt made the geneset_filter")
183 |    if (!(is.character(background) && is.vector(background))) {
184 |       stop("background must be a character vector")
185 |    } 
186 |    
187 |    # geneset_filter
188 |    if (!is.null(geneset_filter)) {
189 |       if (!(is.numeric(geneset_filter) && is.vector(geneset_filter))) {
190 |          stop("geneset_filter must be a numeric vector")
191 |       }
192 |       if (length(geneset_filter) != 2) stop("geneset_filter must be length 2")
193 |       if (!is.numeric(geneset_filter)) stop("geneset_filter must be numeric")
194 |       if (any(geneset_filter < 0, na.rm=TRUE)) stop("geneset_filter limits must be positive")
195 |    }
196 |    
197 |    # custom_colors
198 |    if (!is.null(custom_colors)){
199 |       if(!(is.character(custom_colors) && is.vector(custom_colors))){
200 |          stop("colors must be provided as a character vector")   
201 |       } 
202 |       if(length(colnames(scores)) != length(custom_colors)) stop("incorrect number of colors is provided")
203 |    }
204 |    if (!is.null(custom_colors) && !is.null(color_palette)){
205 |       stop("Both custom_colors and color_palette are provided. Specify only one of these parameters for node coloring.")
206 |    }
207 |    
208 |    if (!is.null(names(custom_colors))){
209 |       if (!all(names(custom_colors) %in% colnames(scores))){
210 |          stop("names() of the custom colors vector should match the scores column names")
211 |       }
212 |    }
213 |    
214 |    # color_palette
215 |    if (!is.null(color_palette)){
216 |       if (!(color_palette %in% rownames(RColorBrewer::brewer.pal.info))) stop("palette must be from the RColorBrewer package")
217 |    }
218 |    
219 |    # color_integrated_only
220 |    if(!(is.character(color_integrated_only) && is.vector(color_integrated_only))){
221 |       stop("color must be provided as a character vector")   
222 |    } 
223 |    if(1 != length(color_integrated_only)) stop("only a single color must be specified")
224 |    
225 |    # contribution
226 |    contribution <- TRUE
227 |    if (ncol(scores) == 1) {
228 |       contribution <- FALSE
229 |       message("scores matrix contains only one column. Column contributions will not be calculated")
230 |    }
231 |    
232 |    ##### filtering and sorting ####
233 |    
234 |    # Remove any genes not found in the background
235 |    orig_length <- nrow(scores)
236 |    scores <- scores[rownames(scores) %in% background, , drop=FALSE]
237 |    if(!is.null(scores_direction)){
238 |       scores_direction <- scores_direction[rownames(scores_direction) %in% background, , drop=FALSE]
239 |    }
240 |    if (nrow(scores) == 0) {
241 |       stop("scores does not contain any genes in the background")
242 |    }
243 |    if (nrow(scores) < orig_length) {
244 |       message(paste(orig_length - nrow(scores), "rows were removed from scores",
245 |                     "because they are not found in the background"))
246 |    }
247 |    
248 |    
249 |    # Filter the GMT
250 |    if (!all(background %in% unique(unlist(sapply(gmt, "[", c(3)))))){
251 |       background_genes <- lapply(sapply(gmt, "[", c(3)), intersect, background)
252 |       background_genes <- background_genes[lapply(background_genes,length) > 0]
253 |       gmt <- gmt[names(sapply(gmt,"[",c(3))) %in% names(background_genes)]
254 |       for (i in seq_along(gmt)) {
255 |          gmt[[i]]$genes <- background_genes[[i]]
256 |       }
257 |    }
258 |    
259 |    if(!is.null(geneset_filter)) {
260 |       orig_length <- length(gmt)
261 |       if (!is.na(geneset_filter[1])) {
262 |          gmt <- Filter(function(x) length(x$genes) >= geneset_filter[1], gmt)
263 |       }
264 |       if (!is.na(geneset_filter[2])) {
265 |          gmt <- Filter(function(x) length(x$genes) <= geneset_filter[2], gmt)
266 |       }
267 |       if (length(gmt) == 0) stop("No pathways in gmt made the geneset_filter")
268 |       if (length(gmt) < orig_length) {
269 |          message(paste(orig_length - length(gmt), "terms were removed from gmt", 
270 |                        "because they did not make the geneset_filter"))
271 |       }
272 |    }
273 |    
274 |    # merge p-values to get a single score for each gene and remove any genes
275 |    # that don't make the cutoff
276 |    merged_scores <- merge_p_values(scores, merge_method, scores_direction, constraints_vector)
277 |    merged_scores <- merged_scores[merged_scores <= cutoff]
278 |    
279 |    if (length(merged_scores) == 0) stop("No genes made the cutoff")
280 |    
281 |    # Sort genes by p-value
282 |    ordered_scores <- names(merged_scores)[order(merged_scores)]
283 |    
284 |    ##### enrichmentAnalysis and column contribution #####
285 |    
286 |    res <- enrichmentAnalysis(ordered_scores, gmt, background)
287 |    adjusted_p <- stats::p.adjust(res$adjusted_p_val, method = correction_method)
288 |    res[, "adjusted_p_val" := adjusted_p]
289 |    
290 |    significant_indeces <- which(res$adjusted_p_val <= significant)
291 |    if (length(significant_indeces) == 0) {
292 |       warning("No significant terms were found")
293 |       return()
294 |    }
295 |    
296 |    if (contribution) {
297 |       sig_cols <- columnSignificance(scores, gmt, background, cutoff,
298 |                                      significant, correction_method, res$adjusted_p_val)
299 |       res <- cbind(res, sig_cols[, -1])
300 |    } else {
301 |       sig_cols <- NULL
302 |    }
303 |    
304 |    # if significant results were found and a cytoscape file tag was provided,
305 |    # proceed with writing files in the working directory
306 |    if (length(significant_indeces) > 0 && !is.na(cytoscape_file_tag)) {
307 |       prepareCytoscape(res[significant_indeces, c("term_id", "term_name", "adjusted_p_val")],
308 |                        gmt[significant_indeces], 
309 |                        cytoscape_file_tag,
310 |                        sig_cols[significant_indeces,], color_palette, custom_colors, color_integrated_only)
311 |    }
312 |    
313 |    res[significant_indeces]
314 | }
315 | 
316 | 
317 | #' Perform pathway enrichment analysis on an ordered list of genes
318 | #'
319 | #' @param genelist character vector of gene names, in decreasing order
320 | #'   of significance
321 | #' @param gmt GMT object
322 | #' @param background character vector of gene names. List of all genes being used
323 | #'   as a statistical background
324 | #'
325 | #' @return a data.table of terms with the following columns:
326 | #'   \describe{
327 | #'     \item{term_id}{The id of the term}
328 | #'     \item{term_name}{The full name of the term}
329 | #'     \item{adjusted_p_val}{The associated p-value adjusted for multiple testing}
330 | #'     \item{term_size}{The number of genes annotated to the term}
331 | #'     \item{overlap}{A character vector of the genes that overlap between the
332 | #'        term and the query}
333 | #'   }
334 | #' @keywords internal
335 | enrichmentAnalysis <- function(genelist, gmt, background) {
336 |    dt <- data.table(term_id=names(gmt))
337 |    
338 |    for (i in seq_along(gmt)) {
339 |       term <- gmt[[i]]
340 |       tmp <- orderedHypergeometric(genelist, background, term$genes)
341 |       overlap <- genelist[1:tmp$ind]
342 |       overlap <- overlap[overlap %in% term$genes]
343 |       if (length(overlap) == 0) overlap <- c()
344 |       set(dt, i, 'term_name', term$name)
345 |       set(dt, i, 'adjusted_p_val', tmp$p_val)
346 |       set(dt, i, 'term_size', length(term$genes))
347 |       set(dt, i, 'overlap', list(list(overlap)))
348 |    }
349 |    dt
350 | }
351 | 
352 | #' Determine which terms are found to be significant using each column
353 | #' individually. 
354 | #'
355 | #' @inheritParams ActivePathways
356 | #' @param pvals p-value for the pathways calculated by ActivePathways
357 | #'
358 | #' @return a data.table with columns 'term_id' and a column for each column
359 | #' in \code{scores}, indicating whether each term (pathway) was found to be
360 | #' significant or not when considering only that column. For each term, 
361 | #' either report the list of related genes if that term was significant, or NA if not. 
362 | 
363 | columnSignificance <- function(scores, gmt, background, cutoff, significant, correction_method, pvals) {
364 |    dt <- data.table(term_id=names(gmt), evidence=NA)
365 |    for (col in colnames(scores)) {
366 |       col_scores <- scores[, col, drop=TRUE]
367 |       col_scores <- col_scores[col_scores <= cutoff]
368 |       col_scores <- names(col_scores)[order(col_scores)]
369 |       
370 |       res <- enrichmentAnalysis(col_scores, gmt, background)
371 |       set(res, i = NULL, "adjusted_p_val", stats::p.adjust(res$adjusted_p_val, correction_method))
372 |       set(res, i = which(res$adjusted_p_val > significant), "overlap", list(list(NA)))
373 |       set(dt, i=NULL, col, res$overlap)
374 |    }
375 |    
376 |    ev_names = colnames(dt[,-1:-2])
377 |    set_evidence <- function(x) {
378 |       ev <- ev_names[!is.na(dt[x, -1:-2])]
379 |       if(length(ev) == 0) {
380 |          if (pvals[x] <= significant) {
381 |             ev <- 'combined'
382 |          } else {
383 |             ev <- 'none'
384 |          }
385 |       }
386 |       ev
387 |    }
388 |    evidence <- lapply(1:nrow(dt), set_evidence)
389 |    
390 |    set(dt, i=NULL, "evidence", evidence)
391 |    colnames(dt)[-1:-2] = paste0("Genes_", colnames(dt)[-1:-2])
392 |    
393 |    dt
394 | }
395 | 
396 | #' Export the results from ActivePathways as a comma-separated values (CSV) file. 
397 | #'
398 | #' @param res the data.table object with ActivePathways results.
399 | #' @param file_name location and name of the CSV file to write to.
400 | #' @export
401 | #'
402 | #' @examples
403 | #'     fname_scores <- system.file("extdata", "Adenocarcinoma_scores_subset.tsv", 
404 | #'          package = "ActivePathways")
405 | #'     fname_GMT = system.file("extdata", "hsapiens_REAC_subset.gmt",
406 | #'          package = "ActivePathways")
407 | #'
408 | #'     dat <- as.matrix(read.table(fname_scores, header = TRUE, row.names = 'Gene'))
409 | #'     dat[is.na(dat)] <- 1
410 | #'
411 | #'     res <- ActivePathways(dat, fname_GMT)
412 | #'\donttest{
413 | #'     export_as_CSV(res, "results_ActivePathways.csv")
414 | #'}
415 | export_as_CSV = function (res, file_name) {
416 |    overlap_index <- which(grepl("overlap", colnames(res), fixed=TRUE))
417 |    dataset_indices <- which(grepl("Genes_", colnames(res), fixed=TRUE))
418 |    for (i in c(overlap_index, dataset_indices)){
419 |       res[[i]] <- sapply(res[[i]], function(x) paste(x, collapse = "|"))
420 |    }
421 |    data.table::fwrite(res, file_name)	
422 | } 
423 | 
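A minimal sketch of the flattening step in export_as_CSV, assuming a toy list of gene vectors in place of a real ActivePathways result. The gene names are hypothetical. Each list element is collapsed to one "|"-separated string so that the column can be written as a plain CSV field.

```r
# Hypothetical list column, shaped like the 'overlap' or 'Genes_*' columns.
overlap <- list(c("TP53", "BRCA1"), "EGFR", NA)

# Collapse each element to a single string, as export_as_CSV does per column.
flat <- sapply(overlap, function(x) paste(x, collapse = "|"))

print(flat)
```

Note that an NA entry becomes the literal string "NA" after paste().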


--------------------------------------------------------------------------------
/R/cytoscape.r:
--------------------------------------------------------------------------------
  1 | #' Prepare files for building an enrichment map network visualization in Cytoscape
  2 | #'
  3 | #' This function writes four text files that are used to build a network using
  4 | #' Cytoscape and the EnrichmentMap app. The files are prefixed with \code{cytoscape_file_tag}. 
  5 | #'   The four files written are:
  6 | #'   \describe{
  7 | #'     \item{pathways.txt}{A list of significant terms and the
  8 | #'     associated p-value. Only terms with \code{adjusted_p_val <= significant} are
  9 | #'     written to this file}
 10 | #'     \item{subgroups.txt}{A matrix indicating whether the significant
 11 | #'     pathways are found to be significant when considering only one column (i.e., type of omics evidence) from
 12 | #'     \code{scores}. A 1 indicates that the term is significant when only that
 13 | #'     column is used for enrichment analysis}
 14 | #'     \item{pathways.gmt}{A shortened version of the supplied GMT
 15 | #'     file, containing only the terms in pathways.txt.}
 16 | #'     \item{legend.pdf}{A legend with colours matching contributions
 17 | #'     from columns in \code{scores}}
 18 | #'   }
 19 | #'
 20 | #' @param terms A data.table object with the columns 'term_id', 'term_name', 'adjusted_p_val'. 
 21 | #' @param gmt An abridged GMT object containing only the pathways that were
 22 | #' found to be significant in the ActivePathways analysis.
 23 | #' @param cytoscape_file_tag The user-defined file prefix and/or directory defining the location of the files.
 24 | #' @param col_significance A data.table object with a column 'term_id' and a column
 25 | #' for each type of omics evidence indicating whether a term was also found to be significant or not
 26 | #' when considering only the genes and p-values in the corresponding column of the \code{scores} matrix.
 27 | #' If a term was not found to be significant, NA is shown in the column; otherwise the relevant list of genes is shown.
 28 | #' @param color_palette Color palette from RColorBrewer::brewer.pal to color each
 29 | #' column in the scores matrix. If NULL grDevices::rainbow is used by default.
 30 | #' @param custom_colors A character vector of custom colors for each column in the scores matrix.
 31 | #' @param color_integrated_only A character vector of length 1 specifying the color of the "combined" pathway contribution. 
 32 | #' @import ggplot2
 33 | #'
 34 | #' @return None
 35 | 
 36 | prepareCytoscape <- function(terms, 
 37 |                              gmt, 
 38 |                              cytoscape_file_tag, 
 39 |                              col_significance, color_palette = NULL, custom_colors = NULL, color_integrated_only = "#FFFFF0") {
 40 |   if (!is.null(col_significance)) {
 41 |     
 42 |     # Obtain the name of each omics dataset and incorporate a 'combined' contribution
 43 |     tests <- colnames(col_significance)[3:length(colnames(col_significance))]
 44 |     tests <- substr(tests, 7, 100)
 45 |     tests <- append(tests, "combined")
 46 |     
 47 |     # Create a matrix of ones and zeros, where columns are omics datasets + 'combined'
 48 |     # and rows are enriched pathways
 49 |     rows <- 1:nrow(col_significance)
 50 |     evidence_columns = do.call(rbind, lapply(col_significance$evidence,
 51 |                                              function(x) 0+(tests %in% x)))
 52 |     colnames(evidence_columns) = tests
 53 |     col_significance = cbind(col_significance[,"term_id"], evidence_columns)
 54 |     
 55 |     # Acquire colours from grDevices::rainbow or RColorBrewer::brewer.pal if custom colors are not provided  
 56 |     if (is.null(color_palette) && is.null(custom_colors)) {
 57 |       col_colors <- grDevices::rainbow(length(tests))
 58 |     } else if (!is.null(custom_colors)){
 59 |         if (!is.null(names(custom_colors))){
 60 |           custom_colors <- custom_colors[order(match(names(custom_colors),tests))]
 61 |       }
 62 |         custom_colors <- append(custom_colors, color_integrated_only, after = match("combined",tests))
 63 |         col_colors <- custom_colors
 64 |     } else {
 65 |       col_colors <- RColorBrewer::brewer.pal(length(tests),color_palette)
 66 |     }
 67 |     col_colors <- replace(col_colors, match("combined",tests),color_integrated_only)
 68 |     if (!is.null(names(col_colors))){
 69 |       names(col_colors)[length(col_colors)] <- "combined"
 70 |     }
 71 |     
 72 |     instruct_str <- paste('piechart:',
 73 |                           ' attributelist="', 
 74 |                           paste(tests, collapse=','),
 75 |                           '" colorlist="', 
 76 |                           paste(col_colors, collapse=','), 
 77 |                           '" showlabels=FALSE', sep='')
 78 |     col_significance[, "instruct" := instruct_str]
 79 |     
 80 |     # Writing the Files
 81 |     utils::write.table(terms, 
 82 |                 file=paste0(cytoscape_file_tag, "pathways.txt"), 
 83 |                 row.names=FALSE, 
 84 |                 sep="\t", 
 85 |                 quote=FALSE)
 86 |     utils::write.table(col_significance, 
 87 |                 file=paste0(cytoscape_file_tag, "subgroups.txt"), 
 88 |                 row.names=FALSE, 
 89 |                 sep="\t", 
 90 |                 quote=FALSE)
 91 |     write.GMT(gmt, 
 92 |               paste0(cytoscape_file_tag, "pathways.gmt"))
 93 |     
 94 |     # Making a Legend
 95 |       dummy_plot = ggplot(data.frame("tests" = factor(tests, levels = tests),
 96 |                                      "value" = 1), aes(tests, fill = tests)) +
 97 |         geom_bar() +
 98 |         scale_fill_manual(name = "Contribution", values=col_colors)
 99 | 
100 |       grDevices::pdf(file = NULL) # Suppressing Blank Display Device from ggplot_gtable
101 |       dummy_table = ggplot_gtable(ggplot_build(dummy_plot))
102 |       grDevices::dev.off()
103 | 
104 |       legend = dummy_table$grobs[[which(sapply(dummy_table$grobs, function(x) x$name) == "guide-box")]]
105 |       
106 |       # Estimating height & width
107 |       legend_height = ifelse(length(tests) > 20, 
108 |                              5.5, 
109 |                              length(tests)*0.25+1)
110 |       legend_width = ifelse(length(tests) > 20, 
111 |                             ceiling(length(tests)/20)*(max(nchar(tests))*0.05+1), 
112 |                             max(nchar(tests))*0.05+1)
113 |       ggsave(legend,
114 |              device = "pdf",
115 |              filename = paste0(cytoscape_file_tag, "legend.pdf"), 
116 |              height = legend_height, 
117 |              width = legend_width, 
118 |              scale = 1)
119 |     
120 |   } else {
121 |     utils::write.table(terms, 
122 |                 file=paste0(cytoscape_file_tag, "pathways.txt"),
123 |                 row.names=FALSE, 
124 |                 sep="\t", 
125 |                 quote=FALSE)
126 |     write.GMT(gmt, 
127 |               paste0(cytoscape_file_tag, "pathways.gmt"))
128 |   }
129 | }
130 | 
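A minimal sketch of how prepareCytoscape turns the 'evidence' list into the matrix of ones and zeros written to subgroups.txt. The dataset names and evidence entries are hypothetical. Each pathway's evidence vector is matched against the ordered list of tests.

```r
# Hypothetical omics datasets plus the 'combined' contribution.
tests <- c("rna", "protein", "combined")

# Hypothetical evidence per pathway, shaped like col_significance$evidence.
evidence <- list(c("rna", "protein"), "combined", "rna")

# One row per pathway, one column per test; 0 + (...) converts TRUE/FALSE to 1/0.
evidence_columns <- do.call(rbind, lapply(evidence, function(x) 0 + (tests %in% x)))
colnames(evidence_columns) <- tests

print(evidence_columns)
```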


--------------------------------------------------------------------------------
/R/gmt.r:
--------------------------------------------------------------------------------
  1 | #' Read and Write GMT files
  2 | #'
  3 | #' Functions to read and write Gene Matrix Transposed (GMT) files and to test if
  4 | #' an object inherits from GMT.
  5 | #'
  6 | #' A GMT file describes gene sets, such as biological terms and pathways. GMT files are 
  7 | #' tab delimited text files. Each row of a GMT file contains a single term with its 
  8 | #' database ID and a term name, followed by all the genes annotated to the term.
  9 | #'
 10 | #' @format
 11 | #' A GMT object is a named list of terms, where each term is a list with the items:
 12 | #' \describe{
 13 | #'     \item{id}{The term ID.}
 14 | #'     \item{name}{The full name or description of the term.}
 15 | #'     \item{genes}{A character vector of genes annotated to this term.}
 16 | #'   }
 17 | #' @rdname GMT
 18 | #' @name GMT
 19 | #' @aliases GMT gmt
 20 | #'
 21 | #' @param filename Location of the gmt file.
 22 | #' @param gmt A GMT object.
 23 | #' @param x The object to test.
 24 | #'
 25 | #' @return \code{read.GMT} returns a GMT object. \cr
 26 | #' \code{write.GMT} returns NULL. \cr
 27 | #' \code{is.GMT} returns TRUE if \code{x} is a GMT object, else FALSE.
 28 | #'
 29 | #'
 30 | #' @examples
 31 | #'   fname_GMT <- system.file("extdata", "hsapiens_REAC_subset.gmt", package = "ActivePathways")
 32 | #'   gmt <- read.GMT(fname_GMT)
 33 | #'   gmt[1:10]
 34 | #'   gmt[[1]]
 35 | #'   gmt[[1]]$id
 36 | #'   gmt[[1]]$genes
 37 | #'   gmt[[1]]$name
 38 | #'   gmt$`REAC:1630316`
 39 | #' @export
 40 | read.GMT <- function(filename) {
 41 |     gmt <- strsplit(readLines(filename), '\t')
 42 |     names(gmt) <- sapply(gmt, `[`, 1)
 43 |     gmt <- lapply(gmt, function(x) { list(id=x[1], name=x[2], genes=x[-c(1,2)]) })
 44 |     class(gmt) <- 'GMT'
 45 |     gmt
 46 | }
 47 | 
 48 | #' @rdname GMT
 49 | #' @export
 50 | write.GMT <- function(gmt, filename) {
 51 |     if (!is.GMT(gmt)) stop("gmt is not a valid GMT object")
 52 |     sink(filename)
 53 |     for (term in gmt) {
 54 |         cat(term$id, term$name, paste(term$genes, collapse="\t"), sep = "\t")
 55 |         cat("\n")
 56 |     }
 57 |     sink()
 58 | }
 59 | 
 60 | #' Make a background list of genes (i.e., the statistical universe) based on all the terms (gene sets, pathways) considered. 
 61 | #'
 62 | #' Returns a character vector of all genes in a GMT object.
 63 | #'
 64 | #' @param gmt A \link{GMT} object.
 65 | #' @return A character vector containing all genes in GMT.
 66 | #' @export
 67 | #'
 68 | #' @examples
 69 | #'   fname_GMT <- system.file("extdata", "hsapiens_REAC_subset.gmt", package = "ActivePathways")
 70 | #'   gmt <- read.GMT(fname_GMT)
 71 | #'   makeBackground(gmt)[1:10]
 72 | makeBackground <- function(gmt) {
 73 |     if (!is.GMT(gmt)) stop('gmt is not a valid GMT object')
 74 |     unlist(Reduce(function(x, y) union(x, y$genes), gmt, gmt[[1]]$genes))
 75 | }
 76 | 
 77 | #####  Subsetting functions #####
 78 | # Treat as a list but return an object of "GMT" class
 79 | #' @export
 80 | `[.GMT` <- function(x, i) {
 81 |     x <- unclass(x)
 82 |     res <- x[i]
 83 |     class(res) <- c('GMT')
 84 |     res
 85 | }
 86 | #' @export
 87 | `[[.GMT` <- function(x, i, exact = TRUE) {
 88 |     x <- unclass(x)
 89 |     x[[i, exact = exact]]
 90 | }
 91 | 
 92 | #' @export
 93 | `$.GMT` <- function(x, i) {
 94 |     x[[i]]
 95 | }
 96 | 
 97 | #' @export
 98 | #' @rdname GMT
 99 | is.GMT <- function(x) inherits(x, 'GMT')
100 | 
101 | # Print a GMT object
102 | #' @export
103 | print.GMT <- function(x, ...) {
104 |     num_lines <- min(length(x), getOption("max.print", 99999))
105 |     num_trunc <- length(x) - num_lines
106 |     cat(sapply(x[1:num_lines], function(a) paste(a$id, "-", a$name, "\n",
107 |                                                  paste(a$genes, collapse=", "), '\n\n')))
108 |     if (num_trunc == 1) {
109 |         cat('[ reached getOption("max.print") -- omitted 1 term ]')
110 |     } else if (num_trunc > 1) {
111 |         cat(paste('[ reached getOption("max.print") -- omitted', num_trunc, 'terms ]'))
112 |     }
113 | }
114 | 
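A minimal sketch of the parsing done by read.GMT and the gene union done by makeBackground, assuming two hypothetical in-memory GMT lines instead of a file. The term IDs and genes are invented for illustration, and the GMT class is left out to keep the example self-contained.

```r
# Two hypothetical tab-delimited GMT rows: id, name, then genes.
gmt_lines <- c("REAC:1\tPathway one\tTP53\tBRCA1",
               "REAC:2\tPathway two\tEGFR\tTP53")

# Split each row on tabs and build the id/name/genes structure.
fields <- strsplit(gmt_lines, "\t")
terms <- lapply(fields, function(x) list(id = x[1], name = x[2], genes = x[-c(1, 2)]))
names(terms) <- vapply(terms, function(term) term$id, character(1))

# Background: every gene annotated to at least one term, without duplicates.
background <- unique(unlist(lapply(terms, function(term) term$genes)))

print(terms[["REAC:1"]]$genes)
print(background)
```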


--------------------------------------------------------------------------------
/R/merge_p.r:
--------------------------------------------------------------------------------
  1 | #' Merge a list or matrix of p-values
  2 | #'
  3 | #' @param scores Either a list/vector of p-values or a matrix where each column is a test.
  4 | #' @param method Method to merge p-values. See 'methods' section below.
  5 | #' @param scores_direction Either a vector of log2 transformed fold-change values or a matrix where each column is a test. 
  6 | #' Must have the same dimensions as the scores parameter. Datasets without directional information should be set to 0. 
  7 | #' @param constraints_vector  A numerical vector of +1 or -1 values corresponding to the user-defined
  8 | #'   directional relationship between the columns in scores_direction. Datasets without directional information should
  9 | #'   be set to 0. 
 10 | #'
 11 | #' @return If \code{scores} is a vector or list, returns a single merged p-value. If \code{scores} is a
 12 | #'   matrix, returns a vector of p-values merged by row, named by the row names of \code{scores}.
 13 | #'
 14 | #' @section Methods:
 15 | #' Eight methods are available to merge a list of p-values:
 16 | #' \describe{
 17 | #'  \item{Fisher}{Fisher's method (default) assumes that p-values are uniformly
 18 | #'  distributed and performs a chi-squared test on the statistic sum(-2 log(p)).
 19 | #'  This method is most appropriate when the columns in \code{scores} are
 20 | #'  independent.}
 21 | #'  \item{Fisher_directional}{Fisher's method modification that allows for 
 22 | #'  directional information to be incorporated with the \code{scores_direction}
 23 | #'  and \code{constraints_vector} parameters.}
 24 | #'  \item{Brown}{Brown's method extends Fisher's method by accounting for the
 25 | #'  covariance in the columns of \code{scores}. It is more appropriate when the
 26 | #'  tests of significance used to create the columns in \code{scores} are not
 27 | #'  necessarily independent. Note that the "Brown" method cannot be used with a 
 28 | #'  single list of p-values. In that case Brown's method reduces to Fisher's 
 29 | #'  method, so the "Fisher" method should be used instead.}
 30 | #'  \item{DPM}{DPM extends Brown's method by incorporating directional information
 31 | #'  using the \code{scores_direction} and \code{constraints_vector} parameters.}
 32 | #'  \item{Stouffer}{Stouffer's method assumes p-values are uniformly distributed
 33 | #'  and transforms p-values into a Z-score using the cumulative distribution function of a
 34 | #'  standard normal distribution. This method is appropriate when the columns in \code{scores}
 35 | #'   are independent.}
 36 | #'  \item{Stouffer_directional}{Stouffer's method modification that allows for 
 37 | #'  directional information to be incorporated with the \code{scores_direction}
 38 | #'  and \code{constraints_vector} parameters.}
 39 | #'  \item{Strube}{Strube's method extends Stouffer's method by accounting for the 
 40 | #'  covariance in the columns of \code{scores}.}
 41 | #'  \item{Strube_directional}{Strube's method modification that allows for 
 42 | #'  directional information to be incorporated with the \code{scores_direction}
 43 | #'  and \code{constraints_vector} parameters.}
 44 | #'   
 45 | #' }
 46 | #'
 47 | #' @examples
 48 | #'   merge_p_values(c(0.05, 0.09, 0.01))
 49 | #'   merge_p_values(list(a=0.01, b=1, c=0.0015, d=0.025), method='Fisher')
 50 | #'   merge_p_values(matrix(data=c(0.03, 0.061, 0.48, 0.052), nrow = 2), method='Brown')
 51 | #' 
 52 | #' @export
 53 | merge_p_values <- function(scores, method = "Fisher", scores_direction = NULL, 
 54 |                            constraints_vector = NULL) {
 55 |   
 56 |   ##### Validation #####
 57 |   # scores
 58 |   if (is.list(scores)) scores <- unlist(scores, recursive=FALSE)
 59 |   if (!(is.vector(scores) || is.matrix(scores))) stop("scores must be a matrix or vector")
 60 |   if (any(is.na(scores))) stop("scores cannot contain missing values, we recommend replacing NA with 1 or removing")
 61 |   if (!is.numeric(scores)) stop("scores must be numeric")
 62 |   if (any(scores < 0 | scores > 1)) stop("All values in scores must be in [0,1]")
 63 |   
 64 |   # scores_direction and constraints_vector
 65 |   if (xor(!is.null(scores_direction),!is.null(constraints_vector))) stop("Both scores_direction and constraints_vector must be provided")
 66 |   if (!is.null(scores_direction) && !is.null(constraints_vector)){
 67 |     if (!(is.numeric(constraints_vector) && is.vector(constraints_vector))) stop("constraints_vector must be a numeric vector")
 68 |     if (any(!constraints_vector %in% c(1,-1,0))) stop("constraints_vector must contain the values: 1, -1 or 0")
 69 |     if (!(is.vector(scores_direction) || is.matrix(scores_direction))) stop("scores_direction must be a matrix or vector")
 70 |     if (!all(class(scores_direction) == class(scores))) stop("scores and scores_direction must be the same data type")
 71 |     if (any(is.na(scores_direction))) stop("scores_direction cannot contain missing values, we recommend replacing NA with 0 or removing")
 72 |     if (!is.numeric(scores_direction)) stop("scores_direction must be numeric")
 73 |     if (method %in% c("Fisher","Brown","Stouffer","Strube")) stop("Only DPM, Fisher_directional, Stouffer_directional, and Strube_directional methods support directional integration")
 74 |     
 75 |     if (is.matrix(scores_direction)){
 76 |       if (nrow(scores) != nrow(scores_direction)) stop("scores and scores_direction must have the same number of rows")
 77 |       if (any(!rownames(scores_direction) %in% rownames(scores))) stop ("scores_direction gene names must match scores genes")
 78 |       if (any(rownames(scores) != rownames(scores_direction))) stop("scores genes should be in the same order as scores_direction genes")
 79 |       if (is.null(colnames(scores)) || is.null(colnames(scores_direction))) stop("column names must be provided to scores and scores_direction")
 80 |       if (any(!colnames(scores_direction) %in% colnames(scores))) stop("scores_direction column names must match scores column names")
 81 |       if (length(constraints_vector) != length(colnames(scores_direction))) stop("constraints_vector should have the same number of entries as columns in scores_direction")
 82 |       if (any(constraints_vector %in% 0) &&  !all(scores_direction[,constraints_vector %in% 0] == 0)) 
 83 |         stop("scores_direction entries must be set to 0's for columns that do not contain directional information")
 84 |       if (!is.null(names(constraints_vector))){
 85 |         if (!isTRUE(all.equal(names(constraints_vector), colnames(scores_direction))) || !isTRUE(all.equal(names(constraints_vector), colnames(scores)))){
 86 |           stop("the constraints_vector entries should match the order of scores and scores_direction columns")
 87 |         }}}
 88 |     
 89 |     if (is.vector(scores_direction)){
 90 |       if (length(constraints_vector) != length(scores_direction)) stop("constraints_vector should have the same number of entries as scores_direction")
 91 |       if (length(scores_direction) != length(scores)) stop("scores_direction should have the same number of entries as scores")
 92 |       if (any(constraints_vector %in% 0) &&  !all(scores_direction[constraints_vector %in% 0] == 0)) 
 93 |         stop("scores_direction entries that do not contain directional information must be set to 0's")
 94 |       if (!is.null(names(constraints_vector))){
 95 |         if (!isTRUE(all.equal(names(constraints_vector), names(scores_direction))) || (!is.null(names(scores)) && !isTRUE(all.equal(names(constraints_vector), names(scores))))){
 96 |           stop("the constraints_vector entries should match the order of scores and scores_direction")
 97 |         }}}}
 98 |   
 99 |   # method
100 |   if (!method %in% c("Fisher", "Fisher_directional", "Brown", "DPM", "Stouffer", "Stouffer_directional", "Strube", "Strube_directional")){
101 |     stop("Only Fisher, Brown, Stouffer and Strube methods are currently supported for non-directional analysis. ",
102 |          "Only DPM, Fisher_directional, Stouffer_directional, and Strube_directional are supported for directional analysis")
103 |   }
104 |   if (method %in% c("Fisher_directional", "DPM", "Stouffer_directional", "Strube_directional") && 
105 |       is.null(scores_direction)){
106 |     stop("scores_direction and constraints_vector must be provided for directional analyses")
107 |   }
108 |   
109 |   
110 |   ##### Merge P-values #####
111 |   
112 |   # Methods to merge p-values from a scores vector
113 |   if (is.vector(scores)){
114 |     if (method == "Brown" || method == "Strube" || method == "DPM" || method == "Strube_directional") {
115 |       stop("Brown's, DPM, Strube's, and Strube_directional methods cannot be used with a single list of p-values")
116 |     }
117 |     
118 |     # Convert 0 or very small p-values to 1e-300
119 |     if(min(scores) < 1e-300){  
120 |       message(paste('warning: p-values smaller than ', 1e-300, ' are replaced with ', 1e-300))
121 |       scores <- sapply(scores, function(x) ifelse (x < 1e-300, 1e-300, x))
122 |     }
123 |     
124 |     if (method == "Fisher"){
125 |       p_fisher <- stats::pchisq(fishersMethod(scores),2*length(scores), lower.tail = FALSE)
126 |       return(p_fisher)
127 |     }
128 |     
129 |     if (method == "Fisher_directional"){
130 |       p_fisher <- stats::pchisq(fishersDirectional(scores, scores_direction,constraints_vector),
131 |                                 2*length(scores), lower.tail = FALSE)
132 |       return(p_fisher)
133 |     } 
134 |     
135 |     if (method == "Stouffer"){
136 |       p_stouffer <- 2*stats::pnorm(-1*abs(stouffersMethod(scores)))
137 |       return(p_stouffer)
138 |     } 
139 |     if (method == "Stouffer_directional"){
140 |       p_stouffer <- 2*stats::pnorm(-1*abs(stouffersDirectional(scores,scores_direction,constraints_vector)))
141 |       return(p_stouffer)
142 |     } 
143 |   }
144 |   
145 |   # If scores is a matrix with one column, then no p-value merging can be done 
146 |   if (ncol(scores) == 1) return (scores[, 1, drop=TRUE])
147 |   
148 |   # If scores is a matrix with multiple columns, apply the following methods
149 |   if(min(scores) < 1e-300){
150 |     message(paste('warning: p-values smaller than ', 1e-300, ' are replaced with ', 1e-300))
151 |     scores <- apply(scores, c(1,2), function(x) ifelse (x < 1e-300, 1e-300, x))
152 |   }
153 |   
154 |   if (method == "Fisher"){
155 |     fisher_merged <- c()
156 |     for(i in 1:length(scores[,1])) {
157 |       p_fisher <- stats::pchisq(fishersMethod(scores[i,]), 2*length(scores[i,]), lower.tail = FALSE)
158 |       fisher_merged <- c(fisher_merged, p_fisher)
159 |     }
160 |     names(fisher_merged) <- rownames(scores)
161 |     return(fisher_merged)
162 |   }
163 |   if (method == "Fisher_directional"){
164 |     fisher_merged <- c()
165 |     for(i in 1:length(scores[,1])) {
166 |       p_fisher <- stats::pchisq(fishersDirectional(scores[i,], scores_direction[i,], constraints_vector),
167 |                                 2*length(scores[i,]), lower.tail = FALSE)
168 |       fisher_merged <- c(fisher_merged,p_fisher)
169 |     }
170 |     names(fisher_merged) <- rownames(scores)
171 |     return(fisher_merged)
172 |   }
173 |   if (method == "Brown") {
174 |     cov_matrix <- calculateCovariances(t(scores))
175 |     brown_merged <- brownsMethod(scores, cov_matrix = cov_matrix)
176 |     return(brown_merged)
177 |   }
178 |   if (method == "DPM") {
179 |     cov_matrix <- calculateCovariances(t(scores))
180 |     dpm_merged <- DPM(scores, cov_matrix = cov_matrix, scores_direction = scores_direction,
181 |                       constraints_vector = constraints_vector)
182 |     return(dpm_merged)
183 |   }
184 |   if (method == "Stouffer"){
185 |     stouffer_merged <- c()
186 |     for(i in 1:length(scores[,1])){
187 |       p_stouffer <- 2*stats::pnorm(-1*abs(stouffersMethod(scores[i,])))
188 |       stouffer_merged <- c(stouffer_merged,p_stouffer)
189 |     }
190 |     names(stouffer_merged) <- rownames(scores)
191 |     return(stouffer_merged)
192 |   }
193 |   if (method == "Stouffer_directional"){
194 |     stouffer_merged <- c()
195 |     for(i in 1:length(scores[,1])){
196 |       p_stouffer <- 2*stats::pnorm(-1*abs(stouffersDirectional(scores[i,], scores_direction[i,],constraints_vector)))
197 |       stouffer_merged <- c(stouffer_merged,p_stouffer)
198 |     }
199 |     names(stouffer_merged) <- rownames(scores)
200 |     return(stouffer_merged)
201 |   }
202 |   if (method == "Strube"){
203 |     strube_merged <- strubesMethod(scores)
204 |     return(strube_merged)
205 |   }
206 |   if (method == "Strube_directional"){
207 |     strube_merged <- strubesDirectional(scores,scores_direction,constraints_vector)
208 |     return(strube_merged)
209 |   }
210 | }
211 | 
212 | 
213 | fishersMethod <- function(p_values) {
214 |   chisq_values <- -2*log(p_values)
215 |   sum(chisq_values)
216 | }
217 | 
218 | 
219 | fishersDirectional <- function(p_values, scores_direction, constraints_vector) {
220 |   # Sum the directional chi-squared values
221 |   directionality <- constraints_vector * scores_direction/abs(scores_direction)
222 |   p_values_directional <- p_values[!is.na(directionality)]
223 |   chisq_directional <- abs(-2 * sum(log(p_values_directional)*directionality[!is.na(directionality)]))
224 |   
225 |   # Sum the non-directional chi-squared values
226 |   chisq_nondirectional <- abs(-2 * sum(log(p_values[is.na(directionality)])))
227 |   
228 |   # Combine both
229 |   sum(c(chisq_directional, chisq_nondirectional))
230 | }
231 | 
232 | 
233 | #' Merge p-values using Brown's method.
234 | #'
235 | #' @param p_values A matrix of m x n p-values.
236 | #' @param data_matrix An m x n matrix representing m tests and n samples. NA's are not allowed.
237 | #' @param cov_matrix A pre-calculated covariance matrix of \code{data_matrix}. This is more
238 | #'   efficient when making many calls with the same data_matrix.
239 | #'   At least one of \code{data_matrix} and \code{cov_matrix} must be given. If both are supplied,
240 | #'   \code{data_matrix} is ignored.
241 | #' @return A p-value vector representing the merged significance of multiple p-values.
242 | #' @export
243 | 
244 | # Based on the R package EmpiricalBrownsMethod
245 | # https://github.com/IlyaLab/CombiningDependentPvaluesUsingEBM/blob/master/R/EmpiricalBrownsMethod/R/ebm.R
246 | # Only significant differences are the removal of extra_info and allowing a
247 | # pre-calculated covariance matrix
248 | # 
249 | brownsMethod <- function(p_values, data_matrix = NULL, cov_matrix = NULL) {
250 |   if (missing(data_matrix) && missing(cov_matrix)) {
251 |     stop ("Either data_matrix or cov_matrix must be supplied")
252 |   }
253 |   if (!(missing(data_matrix) || missing(cov_matrix))) {
254 |     message("Both data_matrix and cov_matrix were supplied. Ignoring data_matrix")
255 |   }
256 |   if (missing(cov_matrix)) cov_matrix <- calculateCovariances(data_matrix)
257 |   
258 |   N <- ncol(cov_matrix)
259 |   expected <- 2 * N
260 |   cov_sum <- 2 * sum(cov_matrix[lower.tri(cov_matrix, diag=FALSE)])
261 |   var <- (4 * N) + cov_sum
262 |   sf <- var / (2 * expected)
263 |   
264 |   df <- (2 * expected^2) / var
265 |   if (df > 2 * N) {
266 |     df <- 2 * N
267 |     sf <- 1
268 |   }
269 |   
270 |   # Acquiring the unadjusted chi-squared values from Fisher's method
271 |   fisher_chisq <- c()
272 |   for(i in 1:length(p_values[,1])) {
273 |     fisher_chisq <- c(fisher_chisq, fishersMethod(p_values[i,]))
274 |   }
275 |   
276 |   # Adjusted p-value
277 |   p_brown <- stats::pchisq(df=df, q=fisher_chisq/sf, lower.tail=FALSE)
278 |   names(p_brown) <- rownames(p_values)
279 |   p_brown
280 | }
281 | 
282 | 
283 | #' Merge p-values using the DPM method.
284 | #'
285 | #' @param p_values A matrix of m x n p-values.
286 | #' @param data_matrix An m x n matrix representing m tests and n samples. NA's are not allowed.
287 | #' @param cov_matrix A pre-calculated covariance matrix of \code{data_matrix}. This is more
288 | #'   efficient when making many calls with the same data_matrix.
289 | #'   At least one of \code{data_matrix} and \code{cov_matrix} must be given. If both are supplied,
290 | #'   \code{data_matrix} is ignored.
291 | #' @param scores_direction A matrix of log2 fold-change values. Datasets without directional information should be set to 0. 
292 | #' @param constraints_vector  A numerical vector of +1 or -1 values corresponding to the user-defined
293 | #'   directional relationship between columns in scores_direction. Datasets without directional information should
294 | #'   be set to 0.
295 | #' @return A p-value vector representing the merged significance of multiple p-values.
296 | #' @export
297 | 
298 | 
299 | DPM <- function(p_values, data_matrix = NULL, cov_matrix = NULL,
300 |                 scores_direction, constraints_vector) {
301 |   if (missing(data_matrix) && missing(cov_matrix)) {
302 |     stop ("Either data_matrix or cov_matrix must be supplied")
303 |   }
304 |   if (!(missing(data_matrix) || missing(cov_matrix))) {
305 |     message("Both data_matrix and cov_matrix were supplied. Ignoring data_matrix")
306 |   }
307 |   if (missing(cov_matrix)) cov_matrix <- calculateCovariances(data_matrix)
308 |   
309 |   N <- ncol(cov_matrix)
310 |   expected <- 2 * N
311 |   cov_sum <- 2 * sum(cov_matrix[lower.tri(cov_matrix, diag=FALSE)])
312 |   var <- (4 * N) + cov_sum
313 |   sf <- var / (2 * expected)
314 |   
315 |   df <- (2 * expected^2) / var
316 |   if (df > 2 * N) {
317 |     df <- 2 * N
318 |     sf <- 1
319 |   }
320 |   
321 |   # Acquiring the unadjusted chi-squared value from Fisher's method
322 |   fisher_chisq <- c()
323 |   for(i in 1:length(p_values[,1])) {
324 |     fisher_chisq <- c(fisher_chisq, fishersDirectional(p_values[i,], scores_direction[i,],constraints_vector))
325 |   }
326 |   
327 |   # Adjusted p-value
328 |   p_dpm <- stats::pchisq(df=df, q=fisher_chisq/sf, lower.tail=FALSE)
329 |   names(p_dpm) <- rownames(p_values)
330 |   p_dpm
331 | }
332 | 
333 | 
334 | stouffersMethod <- function (p_values){
335 |   k = length(p_values)
336 |   z_values <- stats::qnorm(p_values/2)
337 |   sum(z_values)/sqrt(k)
338 | }
339 | 
340 | 
341 | stouffersDirectional <- function (p_values, scores_direction, constraints_vector){
342 |   k = length(p_values)
343 |   
344 |   # Sum the directional z-values
345 |   directionality <- constraints_vector * scores_direction/abs(scores_direction)
346 |   p_values_directional <- p_values[!is.na(directionality)]
347 |   z_directional <- abs(sum(stats::qnorm(p_values_directional/2)*directionality[!is.na(directionality)]))
348 |   
349 |   # Sum the non-directional z-values
350 |   z_nondirectional <- abs(sum(stats::qnorm(p_values[is.na(directionality)]/2)))
351 |   
352 |   # Combine both
353 |   z_values <- c(z_directional, z_nondirectional)
354 |   sum(z_values)/sqrt(k)
355 | }
356 | 
357 | 
358 | strubesMethod <- function (p_values){
359 |   # Acquiring the unadjusted z-value from Stouffer's method
360 |   stouffer_z <- c()
361 |   for(i in 1:length(p_values[,1])){
362 |     stouffer_z <- c(stouffer_z,stouffersMethod(p_values[i,]))
363 |   }
364 |   
365 |   # Correlation matrix
366 |   cor_mtx <- stats::cor(p_values, use = "complete.obs")
367 |   cor_mtx[is.na(cor_mtx)] <- 0
368 |   cor_mtx <- abs(cor_mtx)
369 |   
370 |   # Adjusted p-value
371 |   k = length(p_values[1,])
372 |   adjusted_z <- stouffer_z * sqrt(k) / sqrt(sum(cor_mtx))
373 |   p_strube <- 2*stats::pnorm(-1*abs(adjusted_z))
374 |   names(p_strube) <- rownames(p_values)
375 |   p_strube
376 | }
377 | 
378 | 
379 | strubesDirectional <- function (p_values, scores_direction, constraints_vector){
380 |   # Acquiring the unadjusted z-value from Stouffer's method
381 |   stouffer_z <- c()
382 |   for(i in 1:length(p_values[,1])){
383 |     stouffer_z <- c(stouffer_z,stouffersDirectional(p_values[i,], scores_direction[i,],constraints_vector))
384 |   }
385 |   
386 |   # Correlation matrix
387 |   cor_mtx <- stats::cor(p_values, use = "complete.obs")
388 |   cor_mtx[is.na(cor_mtx)] <- 0
389 |   cor_mtx <- abs(cor_mtx)
390 |   
391 |   # Adjusted p-value
392 |   k = length(p_values[1,])
393 |   adjusted_z <- stouffer_z * sqrt(k) / sqrt(sum(cor_mtx))
394 |   p_strube <- 2*stats::pnorm(-1*abs(adjusted_z))
395 |   names(p_strube) <- rownames(p_values)
396 |   p_strube
397 | }
398 | 
399 | 
400 | transformData <- function(dat) {
401 |   # If all values in dat are the same (equal to y), return dat. The covariance
402 |   # matrix will be the zero matrix, and Brown's method gives the p-value as y.
403 |   # Otherwise (dat - dvm) / dvsd is NaN and ecdf throws an error.
404 |   if (isTRUE(all.equal(min(dat), max(dat)))) return(dat)
405 |   
406 |   dvm <- mean(dat, na.rm=TRUE)
407 |   dvsd <- pop.sd(dat)
408 |   s <- (dat - dvm) / dvsd
409 |   distr <- stats::ecdf(s)
410 |   sapply(s, function(a) -2 * log(distr(a)))
411 | }
412 | 
413 | 
414 | calculateCovariances <- function(data_matrix) {
415 |   transformed_data_matrix <- apply(data_matrix, 1, transformData)
416 |   stats::cov(transformed_data_matrix)
417 | }
418 | 
419 | 
420 | pop.var <- function(x) stats::var(x, na.rm=TRUE) * (length(x) - 1) / length(x)
421 | pop.sd <- function(x) sqrt(pop.var(x))
422 | 


--------------------------------------------------------------------------------
/R/statistical_tests.r:
--------------------------------------------------------------------------------
 1 | #' Hypergeometric test
 2 | #'
 3 | #' Perform a hypergeometric test, also known as the Fisher's exact test, on a 2x2 contingency
 4 | #' table with the alternative hypothesis set to 'greater'. In this application, the test finds the
 5 | #' probability that counts[1, 1] or more genes would be found to be annotated to a term (pathway),
 6 | #' assuming the null hypothesis of genes being distributed randomly to terms. 
 7 | #'
 8 | #' @param counts A 2x2 numerical matrix representing a contingency table.
 9 | #'
10 | #' @return a p-value of enrichment of genes in a term or pathway. 
11 | hypergeometric <- function(counts) {
12 |     if (any(counts < 0)) stop('counts contains negative values. Something went very wrong.')
13 |     m <- counts[1, 1] + counts[2, 1]
14 |     n <- counts[1, 2] + counts[2, 2]
15 |     k <- counts[1, 1] + counts[1, 2]
16 |     x <- counts[1, 1]
17 |     stats::phyper(x-1, m, n, k, lower.tail=FALSE)
18 | }
19 | 
20 | 
21 | #' Ordered Hypergeometric Test
22 | #'
23 | #' Perform a series of hypergeometric tests (a.k.a. Fisher's Exact tests), on a ranked list of genes ordered
24 | #' by significance against a list of annotation genes. The hypergeometric tests are executed with 
25 | #' increasingly larger numbers of genes representing the top genes in order of decreasing scores. 
26 | #' The lowest p-value of the series is returned as the optimal enriched intersection of the ranked list of genes
27 | #' and the biological term (pathway). 
28 | #'
29 | #' @param genelist Character vector of gene names, assumed to be ordered by decreasing importance. 
30 | #' For example, the genes could be ranked by decreasing significance of differential expression. 
31 | #' @param background Character vector of gene names. List of all genes used as a statistical background (i.e., the universe).
32 | #' @param annotations Character vector of gene names. A gene set representing a functional term, process or biological pathway. 
33 | #'
34 | #' @return a list with the items:
35 | #'   \describe{
36 | #'     \item{p_val}{The lowest obtained p-value}
37 | #'     \item{ind}{The index of \code{genelist} such that \code{genelist[1:ind]}
38 | #'       gives the lowest p-value}
39 | #'  }
40 | #' @export
41 | #'
42 | #' @examples
43 | #'    orderedHypergeometric(c('HERC2', 'SP100'), c('PHC2', 'BLM', 'XPC', 'SMC3', 'HERC2', 'SP100'),
44 | #'                          c('HERC2', 'PHC2', 'BLM'))
45 | orderedHypergeometric <- function(genelist, background, annotations) {
46 |     # Only test subsets of genelist that end with a gene in annotations since
47 |     # these are the only tests for which the p-value can decrease
48 |     which_in <- which(genelist %in% annotations)
49 |     if (length(which_in) == 0) return(list(p_val=1, ind=1))
50 | 
51 |     # Construct the counts matrix for the first which_in[1] genes
52 |     gl <- genelist[1:which_in[1]]
53 |     cl <- setdiff(background, gl)
54 |     genelist0 <- length(gl) - 1
55 |     complement1 <- length(which(cl %in% annotations))
56 |     complement0 <- length(cl) - complement1
57 |     counts <- matrix(data=c(1, genelist0, complement1, complement0), nrow=2)
58 |     scores <- hypergeometric(counts)
59 | 
60 |     if (length(which_in) == 1) return(list(p_val=scores, ind=which_in[1]))
61 | 
62 |     # Update counts and recalculate score for the rest of the indices in which_in
63 |     # The genes in genelist[(which_in[i-1]+1):which_in[i]] are added to the genes
64 |     # being tested and removed from the complement. Of these, 1 will always be
65 |     # in annotations and the rest will not. Therefore we can just modify the
66 |     # contingency table rather than recounting which genes are in annotations
67 |     for (i in 2:length(which_in)) {
68 |         diff <- which_in[i] - which_in[i-1]
69 |         counts[1, 1] <- i
70 |         counts[2, 1] <- counts[2, 1] + diff - 1
71 |         counts[1, 2] <- counts[1, 2] - 1
72 |         counts[2, 2] <- counts[2, 2] - diff + 1
73 |         scores[i] <- hypergeometric(counts)
74 |     }
75 | 
76 |     # Return the lowest p-value and the associated index
77 |     min_score <- min(scores)
78 |     
79 |     ind = which_in[max(which(scores==min_score))]
80 |     p_val = min_score
81 |     
82 |     list(p_val=p_val, ind=ind)
83 | }
84 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | # ActivePathways - integrative pathway analysis of multi-omics data
  2 | 
  3 | 
  4 | **July 28th 2024: ActivePathways version 2.0.5 is now available on CRAN and GitHub. It fixes a minor bug in exporting unfiltered results as CSV files. The major update 2.0 provides the directional p-value merging (DPM) method described in our recent publication.**
  5 | 
  6 | ActivePathways is a tool for multivariate pathway enrichment analysis that identifies gene sets, such as pathways or Gene Ontology terms, that are over-represented in a list or matrix of genes. ActivePathways uses a data fusion method to combine multiple omics datasets, prioritises genes based on the significance and direction of signals from the omics datasets, and performs pathway enrichment analysis of these prioritised genes. We can find pathways and genes supported by single or multiple omics datasets, as well as additional genes and pathways that are only apparent through data integration and remain undetected in any single dataset alone. 
  7 | 
  8 | The new version of ActivePathways is described in our recent publication.
  9 | 
 10 | Mykhaylo Slobodyanyuk^, Alexander T. Bahcheli^, Zoe P. Klein, Masroor Bayati, Lisa J. Strug, Jüri Reimand. Directional integration and pathway enrichment analysis for multi-omics data. *Nature Communications* 15, 5690 (2024). (^ - co-first authors)
 11 | <https://www.nature.com/articles/s41467-024-49986-4>
 12 | <https://pubmed.ncbi.nlm.nih.gov/38971800/>
 13 | 
 14 | The first version of ActivePathways was published in Nature Communications with the PCAWG Pan-Cancer project. 
 15 | 
 16 | Marta Paczkowska^, Jonathan Barenboim^, Nardnisa Sintupisut, Natalie S. Fox, Helen Zhu, Diala Abd-Rabbo, Miles W. Mee, Paul C. Boutros, PCAWG Drivers and Functional Interpretation Working Group, PCAWG Consortium, Juri Reimand. Integrative pathway enrichment analysis of multivariate omics data. *Nature Communications* 11, 735 (2020) (^ - co-first authors)
 17 | <https://www.nature.com/articles/s41467-019-13983-9> 
 18 | <https://pubmed.ncbi.nlm.nih.gov/32024846/>
 19 | 
 20 | The package version 2.0.3 used in the DPM preprint and manuscript is archived on Zenodo: <https://zenodo.org/records/12118089>.
 21 | 
 22 | ## Installation
 23 | 
 24 | Package tested with: MacOS 14, Windows 11, Ubuntu 20.04.
 25 | 
 26 | Software dependencies: data.table, ggplot2, testthat, knitr, rmarkdown, RColorBrewer.
 27 | 
 28 | Installation time: less than 2 minutes.
 29 | 
 30 | #### From CRAN: ActivePathways 2.0.5 is currently the most recent version
 31 | Open R and run `install.packages('ActivePathways')`
 32 | 
 33 | #### Using devtools on our GitHub repository
 34 | Using the R package `devtools`, run
 35 | `devtools::install_github('https://github.com/reimandlab/ActivePathways', build_vignettes = TRUE)`
 36 | 
 37 | #### From source on our GitHub repository
 38 | Clone the repository, for example using `git clone https://github.com/reimandlab/ActivePathways.git`. 
 39 | 
 40 | Open R in the directory where you cloned the package and run `install.packages("ActivePathways", repos = NULL, type = "source")`
 41 | 
 42 | 
 43 | 
 44 | ## Using ActivePathways
 45 | 
 46 | See the vignette for more details. Run `browseVignettes(package='ActivePathways')` in R.
 47 | 
 48 | 
 49 | ### Examples
 50 | 
 51 | The simplest use of ActivePathways requires only a data table and a GMT file. The data table is a matrix of p-values with genes/transcripts/proteins as rows and omics datasets as columns. The GMT [(Gene Matrix Transposed)](https://software.broadinstitute.org/cancer/software/gsea/wiki/index.php/Data_formats#GMT:_Gene_Matrix_Transposed_file_format_.28.2A.gmt.29) file provides the list of gene sets. 
 52 | 
 53 | * The data table must be a numerical matrix. For a single gene list, a one-column matrix can be used. The matrix cannot contain any missing values, and one conservative option is to re-assign all missing values as 1s, indicating our confidence that the missing P-values are always insignificant. Alternatively, one may consider removing genes with NA values.
 54 | 
 55 | * Gene sets in the form of a GMT file can be acquired from multiple [sources](https://baderlab.org/GeneSets) such as Gene Ontology, Reactome and others. For better accuracy and statistical power these pathway databases should be combined. Acquiring an [up-to-date GMT file](http://download.baderlab.org/EM_Genesets/current_release/) is essential to avoid using unreliable outdated annotations [(see this paper)](https://www.nature.com/articles/nmeth.3963). 
 56 | 
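The two options for handling missing values described above can be sketched in base R as follows. This is a minimal illustration on a toy matrix: the gene names and p-values are invented, and only base R functions are used.

```R
# Toy p-value matrix; gene names and values are invented for illustration.
pvals <- matrix(
    c(0.01, NA, 0.20,
      0.03, 0.50, NA),
    ncol = 2,
    dimnames = list(c("GENE_A", "GENE_B", "GENE_C"), c("rna", "protein")))

# Option 1 (conservative): treat every missing p-value as non-significant.
pvals_filled <- pvals
pvals_filled[is.na(pvals_filled)] <- 1

# Option 2: drop genes that have a missing value in any dataset.
# drop = FALSE keeps the result a matrix even if a single row remains.
pvals_complete <- pvals[stats::complete.cases(pvals), , drop = FALSE]
```

Option 1 keeps all genes in the analysis, while option 2 shrinks the matrix to genes measured in every dataset, so the two choices can lead to different sets of tested genes.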
 57 | ```R
 58 | 
 59 | library(ActivePathways)
 60 | 
 61 | ##
 62 | # Run an example using the data files included in the ActivePathways package. 
 63 | # This basic example does not incorporate directionality. 
 64 | ##
 65 | 
 66 | fname_scores <- system.file("extdata", "Adenocarcinoma_scores_subset.tsv", 
 67 | 		package = "ActivePathways")
 68 | fname_GMT <- system.file("extdata", "hsapiens_REAC_subset.gmt", 
 69 | 		package = "ActivePathways")
 70 | 
 71 | ##
 72 | # Numeric matrix of p-values is required as input. 
 73 | # NA values are converted to P = 1.
 74 | ##
 75 | 
 76 | scores <- read.table(fname_scores, header = TRUE, row.names = 'Gene')
 77 | scores <- as.matrix(scores)
 78 | scores[is.na(scores)] <- 1
 79 | 
 80 | 
 81 | ##
 82 | # Main call of ActivePathways function:
 83 | ##
 84 | 
 85 | enriched_pathways <- ActivePathways(scores, fname_GMT) 
 86 | 
 87 | #35 terms were removed from gmt because they did not make the geneset_filter
 88 | #91 rows were removed from scores because they are not found in the background
 89 | 
 90 | 
 91 | ##
 92 | # List the first few enriched pathways identified by ActivePathways
 93 | ##
 94 | 
 95 | enriched_pathways[1:3,]
 96 | 
 97 | #        term_id         term_name adjusted_p_val term_size
 98 | #1: REAC:2424491   DAP12 signaling   4.491268e-05       358
 99 | #2:  REAC:422475     Axon guidance   2.028966e-02       555
100 | #3:  REAC:177929 Signaling by EGFR   6.245734e-04       366
101 | #                                   overlap       evidence
102 | #1:     TP53,PIK3CA,KRAS,PTEN,BRAF,NRAS,...            CDS
103 | #2: PIK3CA,KRAS,BRAF,NRAS,CALM2,RPS6KA3,... X3UTR,promCore
104 | #3:     TP53,PIK3CA,KRAS,PTEN,BRAF,NRAS,...            CDS
105 | #                            Genes_X3UTR Genes_X5UTR
106 | #1:                                   NA          NA
107 | #2: CALM2,ARPC2,RHOA,NUMB,CALM1,ACTB,...          NA
108 | #3:                                   NA          NA
109 | #                             Genes_CDS
110 | #1: TP53,PTEN,KRAS,PIK3CA,BRAF,NRAS,...
111 | #2:                                  NA
112 | #3: TP53,PTEN,KRAS,PIK3CA,BRAF,NRAS,...
113 | #                                Genes_promCore
114 | #1:                                          NA
115 | #2: EFNA1,IQGAP1,COL4A1,SCN2B,RPS6KA3,CALM2,...
116 | #3:                                          NA
117 | 
118 | ##
119 | # Show enriched genes of the first pathway 'DAP12 signaling'. 
120 | # The column `overlap` displays genes of the integrated dataset (from 
121 | # data fusion, i.e., p-value merging) that occur in the given pathway.
122 | # Genes are ranked by joint significance across input omics datasets.
123 | ##
124 | 
125 | enriched_pathways[["overlap"]][[1]]
126 | # [1] "TP53"   "PIK3CA" "KRAS"   "PTEN"   "BRAF"   "NRAS"   "B2M"    "CALM2"
127 | # [9] "CDKN1A" "CDKN1B"
128 | 
129 | ##
130 | # Save the resulting pathways as a Comma-Separated Values (CSV) file
131 | # for spreadsheets and computational pipelines.
132 | # The data.table object cannot be saved directly as text.
133 | ##
134 | 
135 | export_as_CSV(enriched_pathways, "enriched_pathways.csv")
136 | 
137 | 
138 | ## 
139 | # Examine a few lines of the two major types of input
140 | ##
141 | 
142 | ##
143 | # The scores matrix includes p-values for genes (rows) 
144 | #   and evidence of different omics datasets (columns).
145 | # This dataset includes predicted cancer driver mutations
146 | #   in gene coding/CDS, 5'UTR, 3'UTR, and core promoter sequences
147 | ##
148 | 
149 | head(scores, n = 3)
150 | 
151 | #         X3UTR      X5UTR       CDS  promCore
152 | #A2M  1.0000000 0.33396764 0.9051708 0.4499201
153 | #AAAS 1.0000000 0.42506012 0.7047723 0.7257641
154 | #ABAT 0.9664126 0.04202735 0.7600985 0.1903789
155 | 
156 | ##
157 | # GMT files include functional gene sets (pathways, processes).
158 | # Each tab-separated line represents a gene set: 
159 | #   gene set ID, description followed by gene symbols.
160 | # Gene symbols in the scores table and the GMT file need to match. 
161 | # NB: this GMT file is a small subset of the real GMT file built for testing. 
162 | #     It should not be used for real analyses. 
163 | ##
164 | 
165 | readLines(fname_GMT)[11:13]
166 | 
167 | #[1] "REAC:3656535\tTGFBR1 LBD Mutants in Cancer\tTGFB1\tFKBP1A\tTGFBR2\tTGFBR1\t"
168 | #[2] "REAC:73927\tDepurination\tOGG1\tMPG\tMUTYH\t"
169 | #[3] "REAC:5602410\tTLR3 deficiency - HSE\tTLR3\t" 
170 | 
171 | 
172 | ```
173 | 
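To make the GMT layout described in the comments above concrete, here is a small sketch that splits one GMT line into its parts. The package has its own GMT handling; `parse_gmt_line()` is a hypothetical helper written only for this illustration, and the input line is copied from the example output above.

```R
# One GMT line copied from the example output above.
gmt_line <- "REAC:73927\tDepurination\tOGG1\tMPG\tMUTYH\t"

# parse_gmt_line() is a hypothetical helper, not a function of the package.
parse_gmt_line <- function(line) {
    fields <- strsplit(line, "\t", fixed = TRUE)[[1]]
    fields <- fields[nzchar(fields)]  # drop empty fields left by trailing tabs
    list(id = fields[1], name = fields[2], genes = fields[-(1:2)])
}

term <- parse_gmt_line(gmt_line)
term$genes
```

The gene symbols returned here are the values that need to match the row names of the scores matrix.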
174 | ### Examples - Directional integration of multi-omics data
175 | 
176 | ActivePathways 2.0 extends our integrative pathway analysis framework significantly. Users can now provide directional assumptions of input omics datasets for more accurate analyses. This allows us to prioritise genes and pathways where certain directional assumptions are met, and penalise those where the assumptions are violated. 
177 | 
178 | For example, fold-change in protein expression would be expected to associate positively with mRNA fold-change of the corresponding gene, while negative associations would be unexpected and indicate more-complex situations or potential false positives. We can instruct the pathway analysis to prioritise positively-associated protein/mRNA pairs and penalise negative associations (or vice versa). 
179 | 
180 | Two additional inputs are included in ActivePathways that allow diverse multi-omics analyses. These inputs are optional. 
181 | 
182 | The scores_direction and constraints_vector parameters are provided in the merge_p_values() and ActivePathways() functions to incorporate this directional penalty into the data fusion and pathway enrichment analyses. 
183 | 
184 | The parameter constraints_vector is a vector that allows the user to represent the expected relationship between the input omics datasets. The vector size is n_datasets. Values include +1, -1, and 0. The constraints_vector should reflect the expected *relative* directional relationship between datasets. For example, the constraints_vector values c(-1,1) and c(1,-1) are functionally identical. When combining datasets that contain both directional datatypes (eg gene or protein expression, gene promoter methylation) and non-directional datatypes (eg gene mutational burden, ChIP-seq), we can define the relative relationship between directional datatypes with the values 1 and -1 while setting the value of non-directional datatypes to 0.
185 | 
186 | The parameter scores_direction is a matrix that reflects the directions that the genes/transcripts/proteins show in the data. The matrix size is n_genes * n_datasets, the same as the P-value matrix. This is a numeric matrix, but only the signs of the values are accounted for. 
187 | 
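The claim that `c(-1,1)` and `c(1,-1)` are functionally identical can be illustrated with a small base R sketch. It is loosely modelled on the directional z-score sum used in the package's Stouffer-type merging code and is not the DPM computation itself. `directional_z()` is a hypothetical helper, it covers only directional datasets (no zeros in the constraints), and the p-values are those of PIK3R4 from the example dataset.

```R
# P-values and observed directions of one gene (PIK3R4 in the example data).
p <- c(rna = 0.001266285, protein = 0.002791135)
direction <- c(rna = 1, protein = -1)

# Hypothetical helper: signed sum of z-scores, made sign-free with abs().
directional_z <- function(p_values, scores_direction, constraints_vector) {
    directionality <- constraints_vector * sign(scores_direction)
    abs(sum(stats::qnorm(p_values / 2) * directionality))
}

# Only the relative relationship matters: both vectors expect opposite signs.
z_opposite_a <- directional_z(p, direction, c(1, -1))
z_opposite_b <- directional_z(p, direction, c(-1, 1))

# Expecting the same sign conflicts with the data, so the z-scores cancel.
z_same <- directional_z(p, direction, c(1, 1))
```

Because the absolute value is taken after summing, flipping every sign in the constraints leaves the result unchanged, whereas a constraint that contradicts the observed directions makes the two signals partly cancel.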
188 | #### Directional data integration at the gene level
189 | 
190 | ```R 
191 | 
192 | ##
193 | # load a dataset of P-values and fold-changes for mRNA and protein levels
194 | # this dataset is embedded in the package
195 | ##
196 | fname_data_matrix <- system.file('extdata', 
197 | 		'Differential_expression_rna_protein.tsv',
198 | 		package = 'ActivePathways')
199 | pvals_FCs <- read.table(fname_data_matrix, header = TRUE, sep = '\t')
200 |                  
201 | # examine a few example genes
202 | example_genes <- c('ACTN4','PIK3R4','PPIL1','NELFE','LUZP1','ITGB2')
203 | pvals_FCs[pvals_FCs$gene %in% example_genes,]
204 | 
205 | #       gene     rna_pval rna_log2fc protein_pval protein_log2fc
206 | #73   PIK3R4 1.266285e-03  1.1557077 2.791135e-03     -0.8344799
207 | #74    PPIL1 1.276838e-03 -1.1694221 1.199303e-04     -1.1193605
208 | #606   NELFE 1.447553e-02 -0.9120687 1.615592e-05     -1.2630114
209 | #4048  LUZP1 3.253382e-05  1.5830796 4.129125e-02      0.5791377
210 | #4050  ITGB2 4.584450e-05  1.6472117 1.327997e-01      0.4221579
211 | #4052  ACTN4 5.725503e-05  1.5531533 8.238317e-07      1.4279158
212 | 
213 | ##
214 | # create a matrix of gene/protein P-values 
215 | # where the columns are different omics datasets (mRNA, protein)
216 | # and the rows are genes. 
217 | ##
218 | 
219 | pval_matrix <- data.frame(
220 | 		row.names = pvals_FCs$gene, 
221 | 		rna = pvals_FCs$rna_pval, 
222 | 		protein = pvals_FCs$protein_pval)
223 | pval_matrix <- as.matrix(pval_matrix)
224 | 
225 | ##
226 | # examine a few genes in the P-value matrix
227 | ##
228 | 
229 | pval_matrix[example_genes,]
230 | #                rna      protein
231 | #ACTN4  5.725503e-05 8.238317e-07
232 | #PIK3R4 1.266285e-03 2.791135e-03
233 | #PPIL1  1.276838e-03 1.199303e-04
234 | #NELFE  1.447553e-02 1.615592e-05
235 | #LUZP1  3.253382e-05 4.129125e-02
236 | #ITGB2  4.584450e-05 1.327997e-01
237 | 
238 | ##
239 | # convert missing values to P = 1
240 | ##
241 | 
242 | pval_matrix[is.na(pval_matrix)] <- 1
243 | 
244 | ##
245 | # Create a matrix of gene/protein directions 
246 | # similarly to the P-value matrix (i.e., scores_direction)
247 | ##
248 | 
249 | dir_matrix <- data.frame(
250 | 		row.names = pvals_FCs$gene, 
251 | 		rna = pvals_FCs$rna_log2fc, 
252 | 		protein = pvals_FCs$protein_log2fc)
253 | dir_matrix <- as.matrix(dir_matrix)
254 | 
255 | ##
256 | # ActivePathways only uses the signs of the direction values (ie +1 or -1).
257 | ##
258 | 
259 | dir_matrix <- sign(dir_matrix)
260 | 
261 | ##
262 | # if directions are missing (NA), we recommend setting the values to zero 
263 | ##
264 | 
265 | dir_matrix[is.na(dir_matrix)] <- 0
266 | 
267 | ##
268 | # examine a few genes in the direction matrix
269 | ##
270 | 
271 | dir_matrix[example_genes,]
272 | #       rna protein
273 | #ACTN4    1       1
274 | #PIK3R4   1      -1
275 | #PPIL1   -1      -1
276 | #NELFE   -1      -1
277 | #LUZP1    1       1
278 | #ITGB2    1       1
279 | 
280 | ##
281 | # This matrix has to be accompanied by a vector that 
282 | # provides the expected relationship between the
283 | # different datasets. Here, mRNA levels and protein 
284 | # levels are expected to have consistent directions:
285 | # either both positive or both negative (eg log fold-change).  
286 | ##
287 | 
288 | constraints_vector <- c(1,1)
289 | 
290 | ##
291 | # Alternatively, we can use another vector to prioritise 
292 | # genes/proteins where the directions are the opposite.
293 | ##
294 | 
295 | # constraints_vector <- c(1,-1)
296 | 
297 | ##
298 | # Now we merge the P-values of the two datasets 
299 | # using directional assumptions and compare these 
300 | # with the plain non-directional merging. 
301 | # The top 5 scoring genes differ if we penalise genes
302 | # where this directional logic is violated: 
303 | # While 4 of 5 genes retain significance, the gene PIK3R4 is penalised. 
304 | # Interestingly, as a consequence of penalising PIK3R4, 
305 | # other genes such as ITGB2 move up in rank.  
306 | ##
307 | 
308 | directional_merged_pvals <- merge_p_values(pval_matrix, 
309 | 		method = "DPM", dir_matrix, constraints_vector)
310 | 
311 | merged_pvals <- merge_p_values(pval_matrix, method = "Brown")
312 | 
313 | 
314 | sort(merged_pvals)[1:5]
315 | #       ACTN4        PPIL1        NELFE        LUZP1       PIK3R4 
316 | #1.168708e-09 2.556067e-06 3.804646e-06 1.950607e-05 4.790125e-05 
317 | 
318 | 
319 | sort(directional_merged_pvals)[1:5]
320 | #       ACTN4        PPIL1        NELFE        LUZP1        ITGB2 
321 | #1.168708e-09 2.556067e-06 3.804646e-06 1.950607e-05 7.920157e-05 
322 | 
323 | ##
324 | # PIK3R4 is penalised because the fold-changes of its mRNA and 
325 | # protein levels are significant and have the opposite signs:
326 | ##
327 | 
328 | pvals_FCs[pvals_FCs$gene == "PIK3R4",]
329 | #     gene    rna_pval rna_log2fc protein_pval protein_log2fc
330 | #73 PIK3R4 0.001266285   1.155708  0.002791135     -0.8344799
331 | 
332 | pval_matrix["PIK3R4",]
333 | #        rna     protein
334 | #0.001266285 0.002791135
335 | 
336 | dir_matrix["PIK3R4",]
337 | #    rna protein
338 | #      1      -1
339 | 
340 | merged_pvals["PIK3R4"]
341 | #      PIK3R4
342 | #4.790125e-05
343 | 
344 | directional_merged_pvals["PIK3R4"]
345 | #   PIK3R4
346 | #0.8122527
347 | 
348 | ```
349 | To assess the impact of the directional penalty on the merged gene P-values, we create a plot showing directional results on the y axis and non-directional results on the x axis. Blue dots are prioritised hits, red dots are penalised.
350 | 
351 | ```R
352 | lineplot_df <- data.frame(original = -log10(merged_pvals),
353 | 			  modified = -log10(directional_merged_pvals))
354 | library(ggplot2)  # provides ggplot() and the geoms used below
355 | ggplot(lineplot_df) +
356 | 	geom_point(size = 2.4, shape = 19,
357 | 		aes(original, modified,
358 | 		    color = ifelse(original <= -log10(0.05),"gray",
359 |                                     ifelse(modified > -log10(0.05),"#1F449C","#F05039")))) +
360 | 	labs(title = "",
361 | 		 x ="Merged -log10(P)",
362 | 		 y = "Directional Merged -log10(P)") + 
363 |             geom_hline(yintercept = 1.301, linetype = "dashed",
364 | 		       col = 'black', size = 0.5) +
365 |             geom_vline(xintercept = 1.301, linetype = "dashed",
366 | 		       col = "black", size = 0.5) + 
367 |             geom_abline(size = 0.5, slope = 1,intercept = 0) +
368 | 	    scale_color_identity()
369 | 	    
370 | ```
371 | 
372 | ![](vignettes/lineplot_tutorial.png)
373 | 
374 | #### Pathway-level insight
375 | To explore how changes on the individual gene level impact biological pathways, we can compare results before and after incorporating a directional penalty.
376 | 
377 | ```R 
378 | 
379 | ##
380 | # use the example GMT file embedded in the package
381 | ##
382 | 
383 | fname_GMT2 <- system.file("extdata", "hsapiens_REAC_subset2.gmt", 
384 | 		package = "ActivePathways")
385 | ##
386 | # Integrative pathway enrichment analysis with no directionality
387 | ##
388 | enriched_pathways <- ActivePathways(
389 | 		pval_matrix, gmt = fname_GMT2, cytoscape_file_tag = "Original_")
390 | 
391 | ##
392 | # Directional integration and pathway enrichment analysis
393 | # this analysis uses the directional coefficients and constraints_vector from 
394 | # the gene-based analysis described above
395 | ##
396 | 
397 | constraints_vector
398 | # [1] 1 1
399 | 
400 | dir_matrix[example_genes,]
401 | #       rna protein
402 | #ACTN4    1       1
403 | #PIK3R4   1      -1
404 | #PPIL1   -1      -1
405 | #NELFE   -1      -1
406 | #LUZP1    1       1
407 | #ITGB2    1       1
408 | 
409 | enriched_pathways_directional <- ActivePathways(
410 | 		pval_matrix, gmt = fname_GMT2, cytoscape_file_tag = "Directional_",
411 | 		merge_method = "DPM", scores_direction = dir_matrix, constraints_vector = constraints_vector)
412 | 		
413 | ## 
414 | # Examine the pathways that are lost when 
415 | # directional information is incorporated in the data integration
416 | ## 
417 | 
418 | pathways_lost_in_directional_integration = 
419 | 		setdiff(enriched_pathways$term_id, enriched_pathways_directional$term_id)
420 | pathways_lost_in_directional_integration
421 | #[1] "REAC:R-HSA-3858494" "REAC:R-HSA-69206"   "REAC:R-HSA-69242"
422 | #[4] "REAC:R-HSA-9013149"
423 | 
424 | enriched_pathways[enriched_pathways$term_id %in% pathways_lost_in_directional_integration,] 
425 | #              term_id                              term_name adjusted_p_val
426 | #1: REAC:R-HSA-3858494 Beta-catenin independent WNT signaling    0.013437464
427 | #2:   REAC:R-HSA-69206                        G1/S Transition    0.026263457
428 | #3:   REAC:R-HSA-69242                                S Phase    0.009478766
429 | #4: REAC:R-HSA-9013149                      RAC1 GTPase cycle    0.047568911
430 | #   term_size                                  overlap    evidence
431 | #1:       143 PSMA5,PSMB4,PSMC5,PSMD11,PSMA8,GNG13,... rna,protein
432 | #2:       130  PSMA5,PSMB4,CDK4,PSMC5,PSMD11,CCNB1,...     protein
433 | #3:       162   PSMA5,PSMB4,RFC3,CDK4,PSMC5,PSMD11,...    combined
434 | #4:       184 SRGAP1,TIAM1,BAIAP2,FMNL1,DOCK9,PAK3,...         rna
435 | #                                      Genes_rna
436 | #1:       GNG13,PSMC1,PSMA5,PSMB4,ITPR3,DVL1,...
437 | #2:                                           NA
438 | #3:                                           NA
439 | #4: SRGAP1,TIAM1,FMNL1,ARHGAP30,FARP2,DOCK10,...
440 | #                               Genes_protein
441 | #1: PSMA8,PSMD11,PSMA5,PRKG1,PSMD10,PSMB4,...
442 | #2:   PSMD11,PSMA5,PSMD10,PSMB4,CDK7,ORC2,...
443 | #3:                                        NA
444 | #4:                                        NA
445 | 
446 | 
447 | ##
448 | # An example of a lost pathway is Beta-catenin independent WNT signaling. 
449 | # Out of the 34 genes that contribute to this pathway enrichment, 
450 | # 10 genes are in directional conflict. The enrichment is no longer 
451 | # identified when these genes are penalised due to the conflicting 
452 | # log2 fold-change directions.
453 | ##
454 | 
455 | wnt_pathway_id <- "REAC:R-HSA-3858494"
456 | enriched_pathway_genes <- unlist(
457 | 		enriched_pathways[enriched_pathways$term_id == wnt_pathway_id,]$overlap)
458 | enriched_pathway_genes
459 | # [1] "PSMA5"  "PSMB4"  "PSMC5"  "PSMD11" "PSMA8"  "GNG13"  "SMURF1" "PSMC1"
460 | # [9] "PSMA4"  "PLCB2"  "PRKG1"  "PSMD4"  "PSMD1"  "PSMD10" "PSMA6"  "PSMA2"
461 | #[17] "PSMA1"  "PRKCA"  "PSMC6"  "RHOA"   "PSMB3"  "PSMB1"  "PSME3"  "ITPR3"
462 | #[25] "AGO4"   "DVL3"   "PSMA3"  "PPP3R1" "DVL1"   "CLTA"   "PSME2"  "CALM1"
463 | #[33] "PSMD6"  "PSMB6"
464 | 
465 | ##
466 | # examine the pathway genes that have directional disagreement and 
467 | # contribute to the lack of pathway enrichment in the directional analysis
468 | ##
469 | 
470 | pathway_gene_pvals = pval_matrix[enriched_pathway_genes,]
471 | pathway_gene_directions = dir_matrix[enriched_pathway_genes,]
472 | 
473 | directional_conflict_genes = names(which(
474 | 		pathway_gene_directions[,1] != pathway_gene_directions[,2] &
475 | 		pathway_gene_directions[,1] != 0 & pathway_gene_directions[,2] != 0))
476 | 
477 | pathway_gene_pvals[directional_conflict_genes,]
478 | #              rna     protein
479 | #PSMD11 0.34121101 0.002094310
480 | #PSMA8  0.55510836 0.001415197
481 | #SMURF1 0.03353629 0.042995333
482 | #PSMD1  0.04650877 0.100178048
483 | #RHOA   0.01786687 0.474628084
484 | #PSME3  0.07148904 0.130184883
485 | #ITPR3  0.01660850 0.589929787
486 | #DVL3   0.46381447 0.022535743
487 | #PSME2  0.03274707 0.514351089
488 | #PSMB6  0.02863259 0.677224905
489 | 
490 | pathway_gene_directions[directional_conflict_genes,]
491 | #       rna protein
492 | #PSMD11   1      -1
493 | #PSMA8    1      -1
494 | #SMURF1   1      -1
495 | #PSMD1    1      -1
496 | #RHOA    -1       1
497 | #PSME3    1      -1
498 | #ITPR3    1      -1
499 | #DVL3     1      -1
500 | #PSME2   -1       1
501 | #PSMB6   -1       1
502 | 
503 | length(directional_conflict_genes)
504 | #[1] 10
505 | 
506 | 
507 | ```
508 | To visualise differences in biological pathways between ActivePathways analyses with or without a directional penalty, we combine both outputs into a single enrichment map for [plotting](#visualising-directional-impact-with-node-borders).
509 | 
510 | More thorough documentation of the ActivePathways function can be found in R with `?ActivePathways`, and complete tutorials can be found with `browseVignettes(package='ActivePathways')`.
511 | 
512 | 
513 | # Visualising pathway enrichment results using enrichment maps in Cytoscape
514 | 
515 | Cytoscape provides powerful tools to visualise the enriched pathways from `ActivePathways` as a network (i.e., an enrichment map). `ActivePathways` provides the files needed for building enrichment maps in Cytoscape. To create these files, supply a file prefix to the argument `cytoscape_file_tag` in the ActivePathways() function. The prefix can be a path to an existing writable directory.
516 |  
517 | ```R
518 | res <- ActivePathways(scores, fname_GMT, cytoscape_file_tag = "enrichmentMap__")
519 | ```
520 | Four files are written using the prefix:
521 | 
522 | * `enrichmentMap__pathways.txt`: a table of significant terms and the associated adjusted P-values. Terms include molecular pathways, biological processes, and other gene sets. Note that only terms with `adjusted_p_val <= significant` are written.
523 | 
524 | * `enrichmentMap__subgroups.txt`: a matrix indicating which columns of the input matrix (i.e., which omics datasets) contributed to the discovery of each pathway. These values correspond to the `evidence` evaluation of input omics datasets discussed above. A value of one indicates the pathway was also detectable using a specific input omics dataset. A value of zero indicates otherwise. This file will not be generated if the input matrix is a single-column matrix of scores (just one omics dataset).
525 | 
526 | * `enrichmentMap__pathways.gmt`: a shortened version of the supplied GMT file, containing only the significant pathways detected by `ActivePathways`. 
527 | 
528 | * `enrichmentMap__legend.pdf`: a PDF file containing a legend, with colors corresponding to the different omics datasets visualised in the enrichment map. It serves as a reference when reading the generated enrichment map.
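
A quick way to check the exported term table before loading it into Cytoscape is to read it back into R. The sketch below is illustrative only: it parses three rows copied from the example `enrichmentMap__pathways.txt` in `inst/extdata`, passed inline through `text =` so that the snippet runs without any files. The object names `pathways_txt` and `pathways` are placeholders, not part of the package.

```{r}
# Three rows copied from the example enrichmentMap__pathways.txt (tab-separated).
pathways_txt <- paste(
  "term_id\tterm_name\tadjusted_p_val",
  "REAC:2424491\tDAP12 signaling\t4.49126833230489e-05",
  "REAC:422475\tAxon guidance\t0.00046259184602835",
  "REAC:177929\tSignaling by EGFR\t0.000619750411866923",
  sep = "\n"
)

# For a real run, replace `text = pathways_txt` with the path to the written file.
pathways <- read.delim(text = pathways_txt, stringsAsFactors = FALSE)

# Order terms from most to least significant.
pathways <- pathways[order(pathways$adjusted_p_val), ]
print(pathways)
```

In practice, the first argument would be the path built from the prefix given to `cytoscape_file_tag`, for example `"enrichmentMap__pathways.txt"`.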
529 | 
530 | ## Creating enrichment maps using results of ActivePathways 
531 | 
532 | Pathway enrichment analysis often leads to complex and redundant results. Enrichment maps are network-based visualisations of pathway enrichment analyses. Enrichment maps can be generated in the Cytoscape software using the EnrichmentMap app. **The enhancedGraphics app is also required**. See the vignette for details: `browseVignettes(package='ActivePathways')`.
533 | 
534 | 
535 | ## Required software
536 | 
537 | 1.	Cytoscape, see <https://cytoscape.org/download.html>
538 | 2.	EnrichmentMap app of Cytoscape, see menu Apps>App manager or <https://apps.cytoscape.org/apps/enrichmentmap> 
539 | 3.	EnhancedGraphics app of Cytoscape, see menu Apps>App manager or <https://apps.cytoscape.org/apps/enhancedGraphics> 
540 | 
541 | ## Creating the enrichment map
542 | 
543 | * Open the Cytoscape software. 
544 | * Select *Apps -> EnrichmentMap*. 
545 | * In the following dialogue, click the button `+` *Add Data Set from Files* in the top left corner of the dialogue.
546 | * Change the Analysis Type to Generic/gProfiler/Enrichr.
547 | * Upload the files `enrichmentMap__pathways.txt` and `enrichmentMap__pathways.gmt` in the *Enrichments* and *GMT* fields, respectively. 
548 | * Click the checkbox *Show Advanced Options* and set *Cutoff* to 0.6.
549 | * Then click *Build* in the bottom-right corner to create the enrichment map. 
550 | 
551 | ![](vignettes/CreateEnrichmentMapDialogue_V2.png)
552 | 
553 | ![](vignettes/NetworkStep1_V2.png)
554 | 
555 | 
556 | ## Colour the nodes of the network to visualise supporting omics datasets
557 | 
558 | The file `enrichmentMap__subgroups.txt` needs to be imported into Cytoscape directly in order to color nodes (i.e., terms) according to their source omics datasets. To import the file, select the menu option *File -> Import -> Table from File* and select the file `enrichmentMap__subgroups.txt`. In the following dialogue, select *To a Network Collection* in the dropdown menu *Where to Import Table Data*. Click OK to proceed. 
559 | 
560 | ![](vignettes/ImportStep_V2.png)
561 | 
562 | Cytoscape uses the imported information to color nodes like a pie chart. To enable this, click the *Style* tab in the left control panel and select the *Image/Chart 1* property through a series of dropdown menus (*Properties -> Paint -> Custom Paint 1 -> Image/Chart 1*). 
563 | 
564 | ![](vignettes/PropertiesDropDown2_V2.png)
565 | 
566 | The *Image/Chart 1* property now appears in the Style control panel. Click the triangle on the right, then set the *Column* to *instruct* and the *Mapping Type* to *Passthrough Mapping*. 
567 | 
568 | ![](vignettes/StylePanel_V2.png)
569 | 
570 | This step colours the nodes corresponding to the enriched pathways according to the supporting omics datasets, based on the scores matrix initially analysed in `ActivePathways`. 
571 | 
572 | ![](vignettes/NetworkStep2_V2.png)
573 | 
574 | `ActivePathways` generates a color legend in the file `enrichmentMap__legend.pdf` that shows which colors correspond to which omics datasets. 
575 | 
576 | ![](vignettes/LegendView.png)
577 | 
578 | Note that one of the colors corresponds to a subset of enriched pathways with *combined* evidence. These terms were only detected through data fusion and P-value merging, and not with any of the input datasets individually. This exemplifies the added value of integrative multi-omics pathway enrichment analysis. 
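
To see how many terms each evidence type supports, the subgroups table can be summarised in R. This is a minimal sketch, assuming the column layout of the example `enrichmentMap__subgroups.txt` in `inst/extdata`; it uses four rows copied from that file, with the `instruct` column omitted for brevity. The object names are placeholders.

```{r}
# Four rows copied from the example enrichmentMap__subgroups.txt,
# with the `instruct` column left out to keep the example short.
subgroups_txt <- paste(
  "term_id\tX3UTR\tX5UTR\tCDS\tpromCore\tcombined",
  "REAC:2424491\t0\t0\t1\t0\t0",
  "REAC:422475\t1\t0\t0\t1\t0",
  "REAC:5654699\t0\t0\t0\t0\t1",
  "REAC:4420097\t1\t0\t1\t0\t0",
  sep = "\n"
)
subgroups <- read.delim(text = subgroups_txt, stringsAsFactors = FALSE)

# Number of terms supported by each evidence type.
evidence_counts <- colSums(subgroups[, -1])
print(evidence_counts)

# Terms detected only after merging P-values across datasets.
combined_only <- subgroups$term_id[subgroups$combined == 1]
print(combined_only)
```

Terms listed in `combined_only` are the ones shown in the *combined* color in the enrichment map.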
579 | 
580 | ## Visualising directional impact with node borders
581 | 
582 | From the drop-down Properties menu, select *Border Line Type*.
583 | 
584 | ![](vignettes/border_line_type.jpg)
585 | 
586 | Set *Column* to *directional impact* and *Mapping Type* to *Discrete Mapping*. Now we can compare findings between a non-directional and a directional method. We highlight pathways that were shared (0), lost (1), and gained (2) between the approaches. Here, we have solid lines for the shared pathways, dots for the lost pathways, and vertical lines for the gained pathways. Border widths can be adjusted in the *Border Width* property, again with discrete mapping.
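
The 0/1/2 coding can be illustrated with a short R sketch. It assumes that "lost" means a term was significant only without the directional penalty and "gained" means it was significant only with the penalty. The term IDs and object names below are hypothetical and serve only to show the logic.

```{r}
# Hypothetical term IDs from two analyses of the same data.
terms_nondirectional <- c("REAC:1", "REAC:2", "REAC:3")
terms_directional    <- c("REAC:2", "REAC:3", "REAC:4")

all_terms <- union(terms_nondirectional, terms_directional)

# 0 = shared, 1 = lost with the directional penalty, 2 = gained with it.
directional_impact <- ifelse(
  all_terms %in% terms_nondirectional & all_terms %in% terms_directional, 0,
  ifelse(all_terms %in% terms_nondirectional, 1, 2)
)

impact_table <- data.frame(term_id = all_terms, directional_impact)
print(impact_table)
```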
587 | 
588 | ![](vignettes/set_aesthetic.jpg)
589 | 
590 | This step changes node borders in the aggregated enrichment map, depicting the additional information provided by directional impact.
591 | 
592 | ![](vignettes/new_map.png)
593 | 
594 | ![](vignettes/legend.png)
595 | 
596 | ## Alternative node coloring
597 | 
598 | For a more diverse range of colors, ActivePathways supports any color palette from RColorBrewer. To use one, pass its name to the `color_palette` parameter.
599 | ```{r}
600 | res <- ActivePathways(scores, gmt_file,
601 | 		      cytoscape_file_tag = "enrichmentMap__",
602 | 		      color_palette = "Pastel1")
603 | ```
604 | ![](vignettes/LegendView_RColorBrewer.png)
605 | 
606 | Alternatively, the `custom_colors` parameter accepts a character vector that sets the color of each dataset manually. The vector must contain as many colors as there are columns in the scores matrix.
607 | ```{r}
608 | res <- ActivePathways(scores, gmt_file,
609 | 		      cytoscape_file_tag = "enrichmentMap__",
610 | 		      custom_colors = c("violet","green","orange","red"))
611 | ```
612 | ![](vignettes/LegendView_Custom.png)
613 | 
614 | To change the color of the *combined* contribution, pass a color to the `color_integrated_only` parameter (default `"#FFFFF0"`).
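
A minimal sketch of setting this color is shown below. The grey value `"#808080"` is an arbitrary illustrative choice, and the output prefix in `tempdir()` is an assumption made so that the example does not write into the working directory. The call is guarded so that the snippet is skipped when the package is not installed.

```{r}
# Colour for pathways supported only by the merged ("combined") evidence.
# "#808080" (grey) is an arbitrary illustrative choice.
integrated_color <- "#808080"

if (requireNamespace("ActivePathways", quietly = TRUE)) {
  library(ActivePathways)

  # Example data shipped with the package.
  fname_scores <- system.file("extdata", "Adenocarcinoma_scores_subset.tsv",
                              package = "ActivePathways")
  fname_GMT <- system.file("extdata", "hsapiens_REAC_subset.gmt",
                           package = "ActivePathways")

  scores <- as.matrix(read.table(fname_scores, header = TRUE, row.names = "Gene"))
  scores[is.na(scores)] <- 1

  # Write the Cytoscape files to a temporary directory.
  res <- ActivePathways(scores, fname_GMT,
                        cytoscape_file_tag = file.path(tempdir(), "enrichmentMap__"),
                        color_integrated_only = integrated_color)
}
```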
615 | 
616 | **If the coloring of nodes did not work in Cytoscape after setting the options in the Style panel, check that the EnhancedGraphics Cytoscape app is installed.**
617 | 
618 | ## References
619 | 
620 | * See the vignette for more details: `browseVignettes(package='ActivePathways')`.
621 | 
622 | * Mykhaylo Slobodyanyuk^, Alexander T. Bahcheli^, Zoe P. Klein, Masroor Bayati, Lisa J. Strug, Jüri Reimand. Directional integration and pathway enrichment analysis for multi-omics data. Nature Communications (2024) (^ - co-first authors) <https://www.nature.com/articles/s41467-024-49986-4> <https://pubmed.ncbi.nlm.nih.gov/38971800/>.
623 | 
624 | * Integrative Pathway Enrichment Analysis of Multivariate Omics Data. Paczkowska M^, Barenboim J^, Sintupisut N, Fox NS, Zhu H, Abd-Rabbo D, Mee MW, Boutros PC, PCAWG Drivers and Functional Interpretation Working Group; Reimand J, PCAWG Consortium. Nature Communications (2020) (^ - co-first authors) <https://pubmed.ncbi.nlm.nih.gov/32024846/> <https://doi.org/10.1038/s41467-019-13983-9>.
625 | 
626 | * Pathway Enrichment Analysis and Visualization of Omics Data Using g:Profiler, GSEA, Cytoscape and EnrichmentMap. Reimand J^, Isserlin R^, Voisin V, Kucera M, Tannus-Lopes C, Rostamianfar A, Wadi L, Meyer M, Wong J, Xu C, Merico D, Bader GD. Nature Protocols (2019) (^ - co-first authors) <https://pubmed.ncbi.nlm.nih.gov/30664679/> <https://doi.org/10.1038/s41596-018-0103-9>.
627 | 


--------------------------------------------------------------------------------
/inst/extdata/enrichmentMap__legend.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/reimandlab/ActivePathways/2cd1931cfc750e96282533bbd0928b5273f5035f/inst/extdata/enrichmentMap__legend.pdf


--------------------------------------------------------------------------------
/inst/extdata/enrichmentMap__pathways.txt:
--------------------------------------------------------------------------------
 1 | term_id	term_name	adjusted_p_val
 2 | REAC:2424491	DAP12 signaling	4.49126833230489e-05
 3 | REAC:422475	Axon guidance	0.00046259184602835
 4 | REAC:177929	Signaling by EGFR	0.000619750411866923
 5 | REAC:2559583	Cellular Senescence	6.59544666946083e-05
 6 | REAC:5654699	SHC-mediated cascade:FGFR2	0.0284446360025451
 7 | REAC:2428924	IGF1R signaling cascade	0.00472765617433565
 8 | REAC:167044	Signalling to RAS	0.00725674706700891
 9 | REAC:5654700	FRS-mediated FGFR2 signaling	0.0038328186164029
10 | REAC:187687	Signalling to ERKs	0.0110412411912787
11 | REAC:180336	SHC1 events in EGFR signaling	0.0036184063767845
12 | REAC:4420097	VEGFA-VEGFR2 Pathway	0.00399592999211922
13 | REAC:112399	IRS-mediated signalling	0.00376912151604491
14 | REAC:5654712	FRS-mediated FGFR4 signaling	0.0038328186164029
15 | REAC:180292	GAB1 signalosome	0.011469483859716
16 | REAC:2262752	Cellular responses to stress	0.000906522444094916
17 | REAC:198203	PI3K/AKT activation	0.00491448952862143
18 | REAC:194138	Signaling by VEGF	0.0061305631670633
19 | REAC:1257604	PIP3 activates AKT signaling	0.0108481449904663
20 | REAC:212436	Generic Transcription Pathway	0.00945486548608516
21 | REAC:74752	Signaling by Insulin receptor	0.00701199797493265
22 | REAC:186797	Signaling by PDGF	0.000806572551897785
23 | REAC:112412	SOS-mediated signalling	0.0036184063767845
24 | REAC:3214842	HDMs demethylate histones	0.00775152941354889
25 | REAC:1236394	Signaling by ERBB4	0.000365080271370282
26 | REAC:449147	Signaling by Interleukins	0.00289259718940152
27 | REAC:1433557	Signaling by SCF-KIT	0.000332224908895709
28 | REAC:5654695	PI-3K cascade:FGFR2	0.0108481449904663
29 | REAC:5654736	Signaling by FGFR1	0.000430526596880612
30 | REAC:2172127	DAP12 interactions	6.09352718405167e-05
31 | REAC:5655253	Signaling by FGFR2 in disease	0.0200793711374132
32 | REAC:5673000	RAF activation	0.00404111415860685
33 | REAC:448424	Interleukin-17 signaling	0.00746476885651692
34 | REAC:190236	Signaling by FGFR	0.000160119485897304
35 | 


--------------------------------------------------------------------------------
/inst/extdata/enrichmentMap__subgroups.txt:
--------------------------------------------------------------------------------
 1 | term_id	X3UTR	X5UTR	CDS	promCore	combined	instruct
 2 | REAC:2424491	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
 3 | REAC:422475	1	0	0	1	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
 4 | REAC:177929	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
 5 | REAC:2559583	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
 6 | REAC:5654699	0	0	0	0	1	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
 7 | REAC:2428924	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
 8 | REAC:167044	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
 9 | REAC:5654700	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
10 | REAC:187687	0	0	0	0	1	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
11 | REAC:180336	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
12 | REAC:4420097	1	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
13 | REAC:112399	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
14 | REAC:5654712	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
15 | REAC:180292	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
16 | REAC:2262752	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
17 | REAC:198203	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
18 | REAC:194138	1	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
19 | REAC:1257604	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
20 | REAC:212436	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
21 | REAC:74752	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
22 | REAC:186797	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
23 | REAC:112412	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
24 | REAC:3214842	0	0	0	0	1	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
25 | REAC:1236394	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
26 | REAC:449147	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
27 | REAC:1433557	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
28 | REAC:5654695	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
29 | REAC:5654736	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
30 | REAC:2172127	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
31 | REAC:5655253	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
32 | REAC:5673000	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
33 | REAC:448424	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
34 | REAC:190236	0	0	1	0	0	piechart: attributelist="X3UTR,X5UTR,CDS,promCore,combined" colorlist="#FF0000,#CCFF00,#00FF66,#0066FF,#FFFFF0" showlabels=FALSE
35 | 


--------------------------------------------------------------------------------
/man/ActivePathways.Rd:
--------------------------------------------------------------------------------
  1 | % Generated by roxygen2: do not edit by hand
  2 | % Please edit documentation in R/ActivePathways.r
  3 | \name{ActivePathways}
  4 | \alias{ActivePathways}
  5 | \title{ActivePathways}
  6 | \usage{
  7 | ActivePathways(
  8 |   scores,
  9 |   gmt,
 10 |   background = makeBackground(gmt),
 11 |   geneset_filter = c(5, 1000),
 12 |   cutoff = 0.1,
 13 |   significant = 0.05,
 14 |   merge_method = c("Fisher", "Fisher_directional", "Brown", "DPM", "Stouffer",
 15 |     "Stouffer_directional", "Strube", "Strube_directional"),
 16 |   correction_method = c("holm", "fdr", "hochberg", "hommel", "bonferroni", "BH", "BY",
 17 |     "none"),
 18 |   cytoscape_file_tag = NA,
 19 |   color_palette = NULL,
 20 |   custom_colors = NULL,
 21 |   color_integrated_only = "#FFFFF0",
 22 |   scores_direction = NULL,
 23 |   constraints_vector = NULL
 24 | )
 25 | }
 26 | \arguments{
 27 | \item{scores}{A numerical matrix of p-values where each row is a gene and
 28 | each column represents an omics dataset (evidence). Rownames correspond to the genes 
 29 | and colnames to the datasets. All values must be 0<=p<=1. We recommend converting 
 30 | missing values to ones.}
 31 | 
 32 | \item{gmt}{A GMT object to be used for enrichment analysis. If a filename, a
 33 | GMT object will be read from the file.}
 34 | 
 35 | \item{background}{A character vector of gene names to be used as a
 36 | statistical background. By default, the background is all genes that appear
 37 | in \code{gmt}.}
 38 | 
 39 | \item{geneset_filter}{A numeric vector of length two giving the lower and 
 40 | upper limits for the size of the annotated geneset to pathways in gmt.
 41 | Pathways with a geneset shorter than \code{geneset_filter[1]} or longer
 42 | than \code{geneset_filter[2]} will be removed. Set either value to NA
 43 | to not enforce a minimum or maximum value, or set \code{geneset_filter} to 
 44 | \code{NULL} to skip filtering.}
 45 | 
 46 | \item{cutoff}{A maximum merged p-value for a gene to be used for analysis.
 47 | Any genes with merged, unadjusted \code{p > cutoff} will be discarded 
 48 | before testing.}
 49 | 
 50 | \item{significant}{Significance cutoff for selecting enriched pathways. Pathways with
 51 | \code{adjusted_p_val <= significant} will be selected as results.}
 52 | 
 53 | \item{merge_method}{Statistical method to merge p-values. See the section Merging P-values.}
 54 | 
 55 | \item{correction_method}{Statistical method to correct p-values. See
 56 | \code{\link[stats]{p.adjust}} for details.}
 57 | 
 58 | \item{cytoscape_file_tag}{The directory and/or file prefix to which the output files
 59 | for generating enrichment maps should be written. If NA, files will not be written.}
 60 | 
 61 | \item{color_palette}{Color palette from RColorBrewer::brewer.pal to color each
 62 | column in the scores matrix. If NULL grDevices::rainbow is used by default.}
 63 | 
 64 | \item{custom_colors}{A character vector of custom colors for each column in the scores matrix.}
 65 | 
 66 | \item{color_integrated_only}{A character vector of length 1 specifying the color of the 
 67 | "combined" pathway contribution.}
 68 | 
 69 | \item{scores_direction}{A numerical matrix of log2 transformed fold-change values where each row is a
 70 | gene and each column represents a dataset (evidence). Rownames correspond to the genes
 71 | and colnames to the datasets. We recommend converting missing values to zero. 
 72 | Must contain the same dimensions as the scores parameter. Datasets without directional information should be set to 0.}
 73 | 
 74 | \item{constraints_vector}{A numerical vector of +1 or -1 values corresponding to the user-defined
 75 | directional relationship between columns in scores_direction. Datasets without directional information should
 76 | be set to 0.}
 77 | }
 78 | \value{
 79 | A data.table of terms (enriched pathways) containing the following columns:
 80 |   \describe{
 81 |     \item{term_id}{The database ID of the term}
 82 |     \item{term_name}{The full name of the term}
 83 |     \item{adjusted_p_val}{The associated p-value, adjusted for multiple testing}
 84 |     \item{term_size}{The number of genes annotated to the term}
 85 |     \item{overlap}{A character vector of the genes enriched in the term}
 86 |     \item{evidence}{Columns of \code{scores} (i.e., omics datasets) that contributed 
 87 |          individually to the enrichment of the term. Each input column is evaluated 
 88 |          separately for enrichments and added to the evidence if the term is found.}
 89 |   }
 90 | }
 91 | \description{
 92 | ActivePathways
 93 | }
 94 | \section{Merging P-values}{
 95 | 
 96 | To obtain a single p-value for each gene across the multiple omics datasets considered, 
 97 | the p-values in \code{scores} are merged row-wise using a data fusion approach of p-value merging. 
 98 | The eight available methods are:
 99 | \describe{
100 |  \item{Fisher}{Fisher's method assumes p-values are uniformly
101 |  distributed and performs a chi-squared test on the statistic sum(-2 log(p)).
102 |  This method is most appropriate when the columns in \code{scores} are
103 |  independent.}
104 |  \item{Fisher_directional}{Fisher's method modification that allows for 
105 |  directional information to be incorporated with the \code{scores_direction}
106 |  and \code{constraints_vector} parameters.}
107 |  \item{Brown}{Brown's method extends Fisher's method by accounting for the
108 |  covariance in the columns of \code{scores}. It is more appropriate when the
109 |  tests of significance used to create the columns in \code{scores} are not
110 |  necessarily independent. The Brown's method is therefore recommended for 
111 |  many omics integration approaches.}
112 |  \item{DPM}{DPM extends Brown's method by incorporating directional information
113 |  using the \code{scores_direction} and \code{constraints_vector} parameters.}
114 |  \item{Stouffer}{Stouffer's method assumes p-values are uniformly distributed
115 |  and transforms p-values into a Z-score using the cumulative distribution function of a
116 |  standard normal distribution. This method is appropriate when the columns in \code{scores}
117 |   are independent.}
118 |  \item{Stouffer_directional}{Stouffer's method modification that allows for 
119 |  directional information to be incorporated with the \code{scores_direction}
120 |  and \code{constraints_vector} parameters.}
121 |  \item{Strube}{Strube's method extends Stouffer's method by accounting for the 
122 |  covariance in the columns of \code{scores}.}
123 |  \item{Strube_directional}{Strube's method modification that allows for 
124 |  directional information to be incorporated with the \code{scores_direction}
125 |  and \code{constraints_vector} parameters.}
126 | }
127 | }
128 | 
129 | \section{Cytoscape}{
130 | 
131 |   To visualize and interpret enriched pathways, ActivePathways provides an option
132 |   to further analyse results as enrichment maps in the Cytoscape software. 
133 |   If \code{!is.na(cytoscape_file_tag)}, four files will be written that can be used 
134 |   to build enrichment maps. This requires the EnrichmentMap and enhancedGraphics apps.
135 | 
136 | The four files written are:
137 |   \describe{
138 |     \item{pathways.txt}{A list of significant terms and the
139 |     associated p-value. Only terms with \code{adjusted_p_val <= significant} are
140 |     written to this file.}
141 |     \item{subgroups.txt}{A matrix indicating whether the significant terms (pathways)
142 |     were also found to be significant when considering only one column from
143 |     \code{scores}. A one indicates that term was found to be significant 
144 |     when only p-values in that column were used to select genes.}
145 |     \item{pathways.gmt}{A shortened version of the supplied GMT
146 |     file, containing only the significantly enriched terms in pathways.txt.}
147 |     \item{legend.pdf}{A legend with colours matching contributions
148 |     from columns in \code{scores}.}
149 |   }
150 | 
151 |   How to use: Create an enrichment map in Cytoscape with the file of terms
152 |   (pathways.txt) and the shortened gmt file
153 |   (pathways.gmt). Upload the subgroups file (subgroups.txt) as a table
154 |   using the menu File > Import > Table from File. To paint nodes according 
155 |   to the type of supporting evidence, use the Style
156 |   panel, set Image/Chart 1 to use the column \code{instruct} and the passthrough
157 |   mapping type. Make sure the app enhancedGraphics is installed. 
158 |   Lastly, use the file legend.pdf as a reference for colors in the enrichment map.
159 | }
160 | 
161 | \examples{
162 |     fname_scores <- system.file("extdata", "Adenocarcinoma_scores_subset.tsv", 
163 |          package = "ActivePathways")
164 |     fname_GMT = system.file("extdata", "hsapiens_REAC_subset.gmt",
165 |          package = "ActivePathways")
166 | 
167 |     dat <- as.matrix(read.table(fname_scores, header = TRUE, row.names = 'Gene'))
168 |     dat[is.na(dat)] <- 1
169 | 
170 |     ActivePathways(dat, fname_GMT)
171 | 
172 | }
173 | 


--------------------------------------------------------------------------------
/man/DPM.Rd:
--------------------------------------------------------------------------------
 1 | % Generated by roxygen2: do not edit by hand
 2 | % Please edit documentation in R/merge_p.r
 3 | \name{DPM}
 4 | \alias{DPM}
 5 | \title{Merge p-values using the DPM method.}
 6 | \usage{
 7 | DPM(
 8 |   p_values,
 9 |   data_matrix = NULL,
10 |   cov_matrix = NULL,
11 |   scores_direction,
12 |   constraints_vector
13 | )
14 | }
15 | \arguments{
16 | \item{p_values}{A matrix of m x n p-values.}
17 | 
18 | \item{data_matrix}{An m x n matrix representing m tests and n samples. NA's are not allowed.}
19 | 
20 | \item{cov_matrix}{A pre-calculated covariance matrix of \code{data_matrix}. This is more
21 | efficient when making many calls with the same data_matrix.
22 | Only one of \code{data_matrix} and \code{cov_matrix} must be given. If both are supplied,
23 | \code{data_matrix} is ignored.}
24 | 
25 | \item{scores_direction}{A matrix of log2 fold-change values. Datasets without directional information should be set to 0.}
26 | 
27 | \item{constraints_vector}{A numerical vector of +1 or -1 values corresponding to the user-defined
28 | directional relationship between columns in scores_direction. Datasets without directional information should
29 | be set to 0.}
30 | }
31 | \value{
32 | A p-value vector representing the merged significance of multiple p-values.
33 | }
34 | \description{
35 | Merge p-values using the DPM method.
36 | }
37 | 


--------------------------------------------------------------------------------
/man/GMT.Rd:
--------------------------------------------------------------------------------
 1 | % Generated by roxygen2: do not edit by hand
 2 | % Please edit documentation in R/gmt.r
 3 | \name{GMT}
 4 | \alias{GMT}
 5 | \alias{read.GMT}
 6 | \alias{gmt}
 7 | \alias{write.GMT}
 8 | \alias{is.GMT}
 9 | \title{Read and Write GMT files}
10 | \format{
11 | A GMT object is a named list of terms, where each term is a list with the items:
12 | \describe{
13 |     \item{id}{The term ID.}
14 |     \item{name}{The full name or description of the term.}
15 |     \item{genes}{A character vector of genes annotated to this term.}
16 |   }
17 | }
18 | \usage{
19 | read.GMT(filename)
20 | 
21 | write.GMT(gmt, filename)
22 | 
23 | is.GMT(x)
24 | }
25 | \arguments{
26 | \item{filename}{Location of the gmt file.}
27 | 
28 | \item{gmt}{A GMT object.}
29 | 
30 | \item{x}{The object to test.}
31 | }
32 | \value{
33 | \code{read.GMT} returns a GMT object. \cr
34 | \code{write.GMT} returns NULL. \cr
35 | \code{is.GMT} returns TRUE if \code{x} is a GMT object, else FALSE.
36 | }
37 | \description{
38 | Functions to read and write Gene Matrix Transposed (GMT) files and to test if
39 | an object inherits from GMT.
40 | }
41 | \details{
42 | A GMT file describes gene sets, such as biological terms and pathways. GMT files are 
43 | tab delimited text files. Each row of a GMT file contains a single term with its 
44 | database ID and a term name, followed by all the genes annotated to the term.
45 | }
46 | \examples{
47 |   fname_GMT <- system.file("extdata", "hsapiens_REAC_subset.gmt", package = "ActivePathways")
48 |   gmt <- read.GMT(fname_GMT)
49 |   gmt[1:10]
50 |   gmt[[1]]
51 |   gmt[[1]]$id
52 |   gmt[[1]]$genes
53 |   gmt[[1]]$name
54 |   gmt$`REAC:1630316`
55 | }
56 | 


--------------------------------------------------------------------------------
/man/brownsMethod.Rd:
--------------------------------------------------------------------------------
 1 | % Generated by roxygen2: do not edit by hand
 2 | % Please edit documentation in R/merge_p.r
 3 | \name{brownsMethod}
 4 | \alias{brownsMethod}
 5 | \title{Merge p-values using the Brown's method.}
 6 | \usage{
 7 | brownsMethod(p_values, data_matrix = NULL, cov_matrix = NULL)
 8 | }
 9 | \arguments{
10 | \item{p_values}{A matrix of m x n p-values.}
11 | 
12 | \item{data_matrix}{An m x n matrix representing m tests and n samples. NA's are not allowed.}
13 | 
14 | \item{cov_matrix}{A pre-calculated covariance matrix of \code{data_matrix}. This is more
15 | efficient when making many calls with the same data_matrix.
16 | Only one of \code{data_matrix} and \code{cov_matrix} must be given. If both are supplied,
17 | \code{data_matrix} is ignored.}
18 | }
19 | \value{
20 | A p-value vector representing the merged significance of multiple p-values.
21 | }
22 | \description{
23 | Merge p-values using the Brown's method.
24 | }
25 | 


--------------------------------------------------------------------------------
/man/columnSignificance.Rd:
--------------------------------------------------------------------------------
 1 | % Generated by roxygen2: do not edit by hand
 2 | % Please edit documentation in R/ActivePathways.r
 3 | \name{columnSignificance}
 4 | \alias{columnSignificance}
 5 | \title{Determine which terms are found to be significant using each column
 6 | individually.}
 7 | \usage{
 8 | columnSignificance(
 9 |   scores,
10 |   gmt,
11 |   background,
12 |   cutoff,
13 |   significant,
14 |   correction_method,
15 |   pvals
16 | )
17 | }
18 | \arguments{
19 | \item{scores}{A numerical matrix of p-values where each row is a gene and
20 | each column represents an omics dataset (evidence). Rownames correspond to the genes 
21 | and colnames to the datasets. All values must be 0<=p<=1. We recommend converting 
22 | missing values to ones.}
23 | 
24 | \item{gmt}{A GMT object to be used for enrichment analysis. If a filename, a
25 | GMT object will be read from the file.}
26 | 
27 | \item{background}{A character vector of gene names to be used as a
28 | statistical background. By default, the background is all genes that appear
29 | in \code{gmt}.}
30 | 
31 | \item{cutoff}{A maximum merged p-value for a gene to be used for analysis.
32 | Any genes with merged, unadjusted \code{p > cutoff} will be discarded 
33 | before testing.}
34 | 
35 | \item{significant}{Significance cutoff for selecting enriched pathways. Pathways with
36 | \code{adjusted_p_val <= significant} will be selected as results.}
37 | 
38 | \item{correction_method}{Statistical method to correct p-values. See
39 | \code{\link[stats]{p.adjust}} for details.}
40 | 
41 | \item{pvals}{p-value for the pathways calculated by ActivePathways}
42 | }
43 | \value{
44 | a data.table with columns 'term_id' and a column for each column
45 | in \code{scores}, indicating whether each term (pathway) was found to be
46 | significant or not when considering only that column. For each term, 
47 | either report the list of related genes if that term was significant, or NA if not.
48 | }
49 | \description{
50 | Determine which terms are found to be significant using each column
51 | individually.
52 | }
53 | 


--------------------------------------------------------------------------------
/man/enrichmentAnalysis.Rd:
--------------------------------------------------------------------------------
 1 | % Generated by roxygen2: do not edit by hand
 2 | % Please edit documentation in R/ActivePathways.r
 3 | \name{enrichmentAnalysis}
 4 | \alias{enrichmentAnalysis}
 5 | \title{Perform pathway enrichment analysis on an ordered list of genes}
 6 | \usage{
 7 | enrichmentAnalysis(genelist, gmt, background)
 8 | }
 9 | \arguments{
10 | \item{genelist}{character vector of gene names, in decreasing order
11 | of significance}
12 | 
13 | \item{gmt}{GMT object}
14 | 
15 | \item{background}{character vector of gene names. List of all genes being used
16 | as a statistical background}
17 | }
18 | \value{
19 | a data.table of terms with the following columns:
20 |   \describe{
21 |     \item{term_id}{The id of the term}
22 |     \item{term_name}{The full name of the term}
23 |     \item{adjusted_p_val}{The associated p-value adjusted for multiple testing}
24 |     \item{term_size}{The number of genes annotated to the term}
25 |     \item{overlap}{A character vector of the genes that overlap between the
26 |        term and the query}
27 |   }
28 | }
29 | \description{
30 | Perform pathway enrichment analysis on an ordered list of genes
31 | }
32 | \keyword{internal}
33 | 


--------------------------------------------------------------------------------
/man/export_as_CSV.Rd:
--------------------------------------------------------------------------------
 1 | % Generated by roxygen2: do not edit by hand
 2 | % Please edit documentation in R/ActivePathways.r
 3 | \name{export_as_CSV}
 4 | \alias{export_as_CSV}
 5 | \title{Export the results from ActivePathways as a comma-separated values (CSV) file.}
 6 | \usage{
 7 | export_as_CSV(res, file_name)
 8 | }
 9 | \arguments{
10 | \item{res}{the data.table object with ActivePathways results.}
11 | 
12 | \item{file_name}{location and name of the CSV file to write to.}
13 | }
14 | \description{
15 | Export the results from ActivePathways as a comma-separated values (CSV) file.
16 | }
17 | \examples{
18 |     fname_scores <- system.file("extdata", "Adenocarcinoma_scores_subset.tsv", 
19 |          package = "ActivePathways")
20 |     fname_GMT <- system.file("extdata", "hsapiens_REAC_subset.gmt",
21 |          package = "ActivePathways")
22 | 
23 |     dat <- as.matrix(read.table(fname_scores, header = TRUE, row.names = 'Gene'))
24 |     dat[is.na(dat)] <- 1
25 | 
26 |     res <- ActivePathways(dat, fname_GMT)
27 | \donttest{
28 |     export_as_CSV(res, "results_ActivePathways.csv")
29 | }
30 | }
31 | 


--------------------------------------------------------------------------------
/man/hypergeometric.Rd:
--------------------------------------------------------------------------------
 1 | % Generated by roxygen2: do not edit by hand
 2 | % Please edit documentation in R/statistical_tests.r
 3 | \name{hypergeometric}
 4 | \alias{hypergeometric}
 5 | \title{Hypergeometric test}
 6 | \usage{
 7 | hypergeometric(counts)
 8 | }
 9 | \arguments{
10 | \item{counts}{A 2x2 numerical matrix representing a contingency table.}
11 | }
12 | \value{
13 | a p-value of enrichment of genes in a term or pathway.
14 | }
15 | \description{
16 | Perform a hypergeometric test, also known as the Fisher's exact test, on a 2x2 contingency
17 | table with the alternative hypothesis set to 'greater'. In this application, the test finds the
18 | probability that counts[1, 1] or more genes would be found to be annotated to a term (pathway),
19 | assuming the null hypothesis of genes being distributed randomly to terms.
20 | }
21 | 
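22 | % Illustrative sketch only, not generated by roxygen2: it reproduces the test
23 | % described above with base R and does not call the package function. The
24 | % table values and the row/column layout are assumptions made for illustration.
25 | \examples{
26 |   # Hypothetical 2x2 contingency table (filled column-wise):
27 |   # rows = in query / not in query, columns = in term / not in term
28 |   counts <- matrix(c(8, 12, 40, 940), nrow = 2)
29 |   # One-sided Fisher's exact test with alternative 'greater'
30 |   p_fisher <- stats::fisher.test(counts, alternative = "greater")$p.value
31 |   # Same probability as the upper tail of the hypergeometric distribution:
32 |   # P(X >= counts[1, 1]) when drawing sum(counts[1, ]) genes at random
33 |   p_phyper <- stats::phyper(counts[1, 1] - 1, sum(counts[, 1]), sum(counts[, 2]),
34 |                             sum(counts[1, ]), lower.tail = FALSE)
35 |   all.equal(p_fisher, p_phyper)
36 | }
37 | 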


--------------------------------------------------------------------------------
/man/makeBackground.Rd:
--------------------------------------------------------------------------------
 1 | % Generated by roxygen2: do not edit by hand
 2 | % Please edit documentation in R/gmt.r
 3 | \name{makeBackground}
 4 | \alias{makeBackground}
 5 | \title{Make a background list of genes (i.e., the statistical universe) based on all the terms (gene sets, pathways) considered.}
 6 | \usage{
 7 | makeBackground(gmt)
 8 | }
 9 | \arguments{
10 | \item{gmt}{A \link{GMT} object.}
11 | }
12 | \value{
13 | A character vector containing all genes in GMT.
14 | }
15 | \description{
16 | Returns a character vector of all genes in a GMT object.
17 | }
18 | \examples{
19 |   fname_GMT <- system.file("extdata", "hsapiens_REAC_subset.gmt", package = "ActivePathways")
20 |   gmt <- read.GMT(fname_GMT)
21 |   makeBackground(gmt)[1:10]
22 | }
23 | 


--------------------------------------------------------------------------------
/man/merge_p_values.Rd:
--------------------------------------------------------------------------------
 1 | % Generated by roxygen2: do not edit by hand
 2 | % Please edit documentation in R/merge_p.r
 3 | \name{merge_p_values}
 4 | \alias{merge_p_values}
 5 | \title{Merge a list or matrix of p-values}
 6 | \usage{
 7 | merge_p_values(
 8 |   scores,
 9 |   method = "Fisher",
10 |   scores_direction = NULL,
11 |   constraints_vector = NULL
12 | )
13 | }
14 | \arguments{
15 | \item{scores}{Either a list/vector of p-values or a matrix where each column is a test.}
16 | 
17 | \item{method}{Method to merge p-values. See 'methods' section below.}
18 | 
19 | \item{scores_direction}{Either a vector of log2 transformed fold-change values or a matrix where each column is a test. 
20 | Must have the same dimensions as the \code{scores} parameter. Datasets without directional information should be set to 0.}
21 | 
22 | \item{constraints_vector}{A numerical vector of +1 or -1 values giving the user-defined
23 | directional relationship between the columns in \code{scores_direction}. Entries for datasets without
24 | directional information should be set to 0.}
25 | }
26 | \value{
27 | If \code{scores} is a vector or list, returns a number. If \code{scores} is a
28 |   matrix, returns a named list of p-values merged by row.
29 | }
30 | \description{
31 | Merge a list or matrix of p-values
32 | }
33 | \section{Methods}{
34 | 
35 | Eight methods are available to merge a list of p-values:
36 | \describe{
37 |  \item{Fisher}{Fisher's method (default) assumes that p-values are uniformly
38 |  distributed and performs a chi-squared test on the statistic sum(-2 log(p)).
39 |  This method is most appropriate when the columns in \code{scores} are
40 |  independent.}
41 |  \item{Fisher_directional}{Fisher's method modification that allows for 
42 |  directional information to be incorporated with the \code{scores_direction}
43 |  and \code{constraints_vector} parameters.}
44 |  \item{Brown}{Brown's method extends Fisher's method by accounting for the
45 |  covariance in the columns of \code{scores}. It is more appropriate when the
46 |  tests of significance used to create the columns in \code{scores} are not
47 |  necessarily independent. Note that the "Brown" method cannot be used with a 
48 |  single list of p-values. In that case Brown's method is identical 
49 |  to Fisher's method, so the "Fisher" method should be used instead.}
50 |  \item{DPM}{DPM extends Brown's method by incorporating directional information
51 |  using the \code{scores_direction} and \code{constraints_vector} parameters.}
52 |  \item{Stouffer}{Stouffer's method assumes p-values are uniformly distributed
53 |  and transforms p-values into a Z-score using the cumulative distribution function of a
54 |  standard normal distribution. This method is appropriate when the columns in \code{scores}
55 |   are independent.}
56 |  \item{Stouffer_directional}{Stouffer's method modification that allows for 
57 |  directional information to be incorporated with the \code{scores_direction}
58 |  and \code{constraints_vector} parameters.}
59 |  \item{Strube}{Strube's method extends Stouffer's method by accounting for the 
60 |  covariance in the columns of \code{scores}.}
61 |  \item{Strube_directional}{Strube's method modification that allows for 
62 |  directional information to be incorporated with the \code{scores_direction}
63 |  and \code{constraints_vector} parameters.}
64 |   
65 | }
66 | }
67 | 
68 | \examples{
69 |   merge_p_values(c(0.05, 0.09, 0.01))
70 |   merge_p_values(list(a=0.01, b=1, c=0.0015, d=0.025), method='Fisher')
71 |   merge_p_values(matrix(data=c(0.03, 0.061, 0.48, 0.052), nrow = 2), method='Brown')
72 | 
73 | }
74 | 


--------------------------------------------------------------------------------
/man/orderedHypergeometric.Rd:
--------------------------------------------------------------------------------
 1 | % Generated by roxygen2: do not edit by hand
 2 | % Please edit documentation in R/statistical_tests.r
 3 | \name{orderedHypergeometric}
 4 | \alias{orderedHypergeometric}
 5 | \title{Ordered Hypergeometric Test}
 6 | \usage{
 7 | orderedHypergeometric(genelist, background, annotations)
 8 | }
 9 | \arguments{
10 | \item{genelist}{Character vector of gene names, assumed to be ordered by decreasing importance. 
11 | For example, the genes could be ranked by decreasing significance of differential expression.}
12 | 
13 | \item{background}{Character vector of gene names. List of all genes used as a statistical background (i.e., the universe).}
14 | 
15 | \item{annotations}{Character vector of gene names. A gene set representing a functional term, process or biological pathway.}
16 | }
17 | \value{
18 | a list with the items:
19 |   \describe{
20 |     \item{p_val}{The lowest obtained p-value}
21 |     \item{ind}{The index of \code{genelist} such that \code{genelist[1:ind]}
22 |       gives the lowest p-value}
23 |  }
24 | }
25 | \description{
26 | Perform a series of hypergeometric tests (a.k.a. Fisher's Exact tests), on a ranked list of genes ordered
27 | by significance against a list of annotation genes. The hypergeometric tests are executed with 
28 | increasingly larger numbers of genes representing the top genes in order of decreasing scores. 
29 | The lowest p-value of the series is returned as the optimal enriched intersection of the ranked list of genes
30 | and the biological term (pathway).
31 | }
32 | \examples{
33 |    orderedHypergeometric(c('HERC2', 'SP100'), c('PHC2', 'BLM', 'XPC', 'SMC3', 'HERC2', 'SP100'),
34 |                          c('HERC2', 'PHC2', 'BLM'))
35 | }
36 | 


--------------------------------------------------------------------------------
/man/prepareCytoscape.Rd:
--------------------------------------------------------------------------------
 1 | % Generated by roxygen2: do not edit by hand
 2 | % Please edit documentation in R/cytoscape.r
 3 | \name{prepareCytoscape}
 4 | \alias{prepareCytoscape}
 5 | \title{Prepare files for building an enrichment map network visualization in Cytoscape}
 6 | \usage{
 7 | prepareCytoscape(
 8 |   terms,
 9 |   gmt,
10 |   cytoscape_file_tag,
11 |   col_significance,
12 |   color_palette = NULL,
13 |   custom_colors = NULL,
14 |   color_integrated_only = "#FFFFF0"
15 | )
16 | }
17 | \arguments{
18 | \item{terms}{A data.table object with the columns 'term_id', 'term_name', 'adjusted_p_val'.}
19 | 
20 | \item{gmt}{An abridged GMT object containing only the pathways that were
21 | found to be significant in the ActivePathways analysis.}
22 | 
23 | \item{cytoscape_file_tag}{The user-defined file prefix and/or directory defining the location of the files.}
24 | 
25 | \item{col_significance}{A data.table object with a column 'term_id' and a column
26 | for each type of omics evidence indicating whether a term was also found to be significant or not
27 | when considering only the genes and p-values in the corresponding column of the \code{scores} matrix.
28 | If a term was not found to be significant for a column, that column contains NA; otherwise it contains the relevant list of genes.}
29 | 
30 | \item{color_palette}{Color palette from RColorBrewer::brewer.pal to color each
31 | column in the scores matrix. If NULL grDevices::rainbow is used by default.}
32 | 
33 | \item{custom_colors}{A character vector of custom colors for each column in the scores matrix.}
34 | 
35 | \item{color_integrated_only}{A character vector of length 1 specifying the color of the "combined" pathway contribution.}
36 | }
37 | \value{
38 | None
39 | }
40 | \description{
41 | This function writes four files that are used to build a network using
42 | Cytoscape and the EnrichmentMap app. The files are prefixed with \code{cytoscape_file_tag}. 
43 |   The four files written are:
44 |   \describe{
45 |     \item{pathways.txt}{A list of significant terms and the
46 |     associated p-value. Only terms with \code{adjusted_p_val <= significant} are
47 |     written to this file}
48 |     \item{subgroups.txt}{A matrix indicating whether the significant
49 |     pathways are found to be significant when considering only one column (i.e., type of omics evidence) from
50 |     \code{scores}. A 1 indicates that the term is significant when only that
51 |     column is used for enrichment analysis}
52 |     \item{pathways.gmt}{A shortened version of the supplied GMT
53 |     file, containing only the terms in pathways.txt.}
54 |     \item{legend.pdf}{A legend with colours matching contributions
55 |     from columns in \code{scores}}
56 |   }
57 | }
58 | 


--------------------------------------------------------------------------------
/tests/testthat.R:
--------------------------------------------------------------------------------
1 | library(testthat)
2 | library(ActivePathways)
3 | 
4 | test_check("ActivePathways")
5 | 


--------------------------------------------------------------------------------
/tests/testthat/helper.r:
--------------------------------------------------------------------------------
 1 | # Prepare testing data
 2 | gmt <- read.GMT('test.gmt')
 3 | gmt_reac <- read.GMT('hsapiens_REAC_subset.gmt')
 4 | dat <- as.matrix(read.table('test_data.txt', header=TRUE, row.names='Gene'))
 5 | dat[is.na(dat)] <- 1
 6 | background <- makeBackground(gmt)
 7 | 
 8 | # filenames for cytoscape
 9 | CStag = "CS_files"
10 | 
11 | # Prepare testing data for scores_direction and constraints_vector
12 | df <- read.table('test_data_rna_protein.tsv', header = TRUE, row.names = "gene", sep = '\t')
13 | scores_test <- data.frame(row.names = rownames(df), rna = df$rna_pval, protein = df$protein_pval)
14 | scores_test <- as.matrix(scores_test)
15 | scores_test[is.na(scores_test)] <- 1
16 | direction_test <- data.frame(row.names = rownames(df), rna = df$rna_log2fc, protein = df$protein_log2fc)
17 | direction_test <- as.matrix(direction_test)
18 | direction_test[is.na(direction_test)] <- 0
19 | constraints_vector_test <- c(1,1)
20 | 
21 | # Run ActivePathways quickly
22 | run_ap_short <- function(dat) ActivePathways(dat[,1, drop = FALSE], gmt[1:3], cutoff=1, significant=1)
23 | run_ap_short_contribution <- function(dat) ActivePathways(dat, gmt[1:3], cutoff=1, significant=1)
24 | run_ap <- function(scores_test,direction_test,constraints_vector_test) ActivePathways(scores=scores_test,
25 |                                                                                       merge_method="DPM",
26 |                                                                                       gmt_reac, cutoff=1, significant=1,
27 |                                                                                       scores_direction=direction_test,
28 |                                                                                       constraints_vector=constraints_vector_test)
29 | 
30 | 
31 | 
32 | # Data for testing enrichmentAnalysis
33 | ea_gmt <- gmt[1:4]
34 | ea_gmt[[1]]$genes <- c('PHC2', 'XPC', 'BLM')
35 | ea_gmt[[2]]$genes <- c('HERC2', 'SP100', 'BLM')
36 | ea_gmt[[3]]$genes <- c('HERC2', 'XPC')
37 | ea_gmt[[4]]$genes <- c('XPC')
38 | ea_genelist <- c('HERC2', 'SP100', 'BLM')
39 | ea_genelist2 <- c('HERC2', letters, 'XPC')
40 | ea_background <- makeBackground(ea_gmt)
41 | ea_background2 <- c(ea_background, letters)
42 | 
43 | # Expectation to test if two lists contain the same items, ignoring order
44 | expect_setequal <- function(actual, expected) {
45 |     # Sets are equal if neither object contains items missing from the other
46 |     differences <- union(setdiff(actual, expected), setdiff(expected, actual))
47 |     sets_equal  <- length(differences) == 0
48 |     message     <- paste("Sets not equal. First difference was:", differences[1])
49 |     expect(sets_equal, message)
50 |     invisible(actual)
51 | }
52 | 


--------------------------------------------------------------------------------
/tests/testthat/test.gmt:
--------------------------------------------------------------------------------
  1 | REAC:450513	Tristetraprolin (TTP, ZFP36) binds and destabilizes mRNA	EXOSC9	XRN1	EXOSC8	EXOSC2	EXOSC6	DCP1A	EXOSC3	YWHAB	DIS3	DCP2	EXOSC1	EXOSC4	EXOSC5	TNPO1	ZFP36	EXOSC7	MAPKAPK2	
  2 | REAC:912631	Regulation of signaling by CBL	PIK3R3	PIK3R1	VAV1	AL358075.4	BLNK	LYN	CBL	SYK	PIK3CA	FYN	GRB2	RAPGEF1	AL672043.1	CRK	YES1	CRKL	PIK3CB	PIK3R2	HCK	PIK3CD	
  3 | REAC:434316	Fatty Acids bound to GPR40 (FFAR1) regulate insulin secretion	GNA14	GNA11	GNAQ	PLCB2	PLCB1	GNA15	FFAR1	PLCB3	
  4 | REAC:446199	Synthesis of Dolichyl-phosphate	DOLPP1	DHDDS	DOLK	MVD	NUS1	SRD5A3	
  5 | REAC:3304356	SMAD2/3 Phosphorylation Motif Mutants in Cancer	TGFB1	SMAD2	SMAD3	TGFBR2	ZFYVE9	TGFBR1	
  6 | REAC:5213460	RIPK1-mediated regulated necrosis	BIRC3	FASLG	FAS	MLKL	TNFRSF10B	TRAF2	RIPK3	FADD	CASP8	RIPK1	TNFRSF10A	XIAP	TRADD	TNFSF10	BIRC2	CFLAR	
  7 | REAC:5635851	GLI proteins bind promoters of Hh responsive genes to promote transcription	PTCH1	PTCH2	GLI3	GLI1	BOC	HHIP	GLI2	
  8 | REAC:2644605	FBXW7 Mutants and NOTCH1 in Cancer	SKP1	CUL1	FBXW7	RBX1	NOTCH1	
  9 | REAC:2024101	CS/DS degradation	DCN	HYAL3	HEXA	NCAN	NAT6	ARSB	HYAL1	BGN	HEXB	CSPG5	VCAN	CSPG4	BCAN	IDS	AC244197.3	IDUA	
 10 | REAC:5358752	T41 mutants of beta-catenin aren't phosphorylated	AXIN1	PPP2R5C	PPP2R5B	CSNK1A1	PPP2R1B	PPP2R5E	PPP2R5A	PPP2R5D	AMER1	GSK3B	PPP2CA	APC	PPP2CB	CTNNB1	PPP2R1A	
 11 | REAC:3595172	Defective CHST3 causes SEDCJD	VCAN	CSPG4	NCAN	BGN	DCN	CSPG5	BCAN	
 12 | REAC:167826	The fatty acid cycling model	SLC25A27	SLC25A14	UCP2	UCP3	UCP1	
 13 | REAC:77348	Beta oxidation of octanoyl-CoA to hexanoyl-CoA	ACADM	HADHA	HADHB	HADH	ECHS1	
 14 | REAC:5083625	Defective GALNT3 causes familial hyperphosphatemic tumoral calcinosis (HFTC)	MUC2	MUC5B	MUC19	MUC3A	MUC12	MUC7	MUC6	MUC5AC	MUC21	MUC15	MUCL1	MUC4	MUC20	MUC1	MUC13	MUC16	MUC17	
 15 | REAC:1912420	Pre-NOTCH Processing in Golgi	ATP2A3	MFNG	RAB6A	B4GALT1	ST3GAL3	LFNG	NOTCH1	TMED2	ATP2A1	RFNG	NOTCH4	ATP2A2	NOTCH3	NOTCH2	SEL1L	ST3GAL6	ST3GAL4	FURIN	
 16 | REAC:5358493	Synthesis of diphthamide-EEF2	DPH3	EEF2	DPH5	DNAJC24	DPH6	DPH7	DPH1	DPH2	
 17 | REAC:210746	Regulation of gene expression in endocrine-committed (NEUROG3+) progenitor cells	NKX2-2	PAX4	NEUROG3	NEUROD1	INSM1	
 18 | REAC:2142712	Synthesis of 12-eicosatetraenoic acid derivatives	ALOX15	ALOXE3	ALOX12B	GPX4	GPX2	GPX1	ALOX12	
 19 | REAC:75157	FasL/ CD95L signaling	CASP8	FAS	FASLG	FADD	CASP10	
 20 | REAC:2022857	Keratan sulfate degradation	ACAN	FMOD	OGN	HEXA	KERA	HEXB	GLB1L	GNS	OMD	PRELP	LUM	GLB1	
 21 | REAC:111932	CaMK IV-mediated phosphorylation of CREB	CAMK4	CALM3	CALM1	CALM2	CREB1	
 22 | REAC:2470946	Cohesin Loading onto Chromatin	STAG1	SMC1A	STAG2	PDS5A	WAPL	NIPBL	SMC3	RAD21	PDS5B	MAU2	
 23 | REAC:5603041	IRAK4 deficiency (TLR2/4)	IRAK4	TLR4	TLR6	TIRAP	CD36	MYD88	TLR2	BTK	CD14	LY96	TLR1	
 24 | REAC:388479	Vasopressin-like receptors	AVPR1B	AVPR1A	OXTR	OXT	AVP	AVPR2	
 25 | REAC:111448	Activation of NOXA and translocation to mitochondria	TFDP2	TFDP1	TP53	E2F1	PMAIP1	
 26 | REAC:180689	APOBEC3G mediated resistance to HIV-1 infection	HMGA1	BANF1	PPIA	PSIP1	APOBEC3G	
 27 | REAC:5083632	Defective C1GALT1C1 causes Tn polyagglutination syndrome (TNPS)	MUC19	MUC5B	MUC2	MUC3A	MUC21	MUC5AC	MUC6	MUC7	MUC12	MUC15	MUC4	MUCL1	MUC20	MUC13	MUC1	MUC17	MUC16	
 28 | REAC:5625900	RHO GTPases activate CIT	KIF14	RHOC	RNASE1	RAC1	RHOB	MYH9	MYL12B	MYH11	MYL9	CDKN1B	PRC1	RHOA	MYL6	MYH10	DLG4	MYH14	CIT	
 29 | REAC:4839748	AMER1 mutants destabilize the destruction complex	PPP2CB	PPP2R1A	GSK3B	AMER1	PPP2R5D	APC	PPP2CA	PPP2R1B	PPP2R5A	PPP2R5E	PPP2R5C	PPP2R5B	AXIN1	CSNK1A1	
 30 | REAC:3656225	Defective CHST6 causes MCDC1	KERA	PRELP	LUM	OMD	ACAN	OGN	FMOD	
 31 | REAC:427601	Multifunctional anion exchangers	SLC26A1	SLC26A4	SLC26A7	SLC26A11	SLC26A2	SLC26A3	SLC26A6	SLC26A9	SLC5A12	
 32 | REAC:2142770	Synthesis of 15-eicosatetraenoic acid derivatives	GPX1	PTGS2	GPX4	GPX2	ALOX15	ALOX15B	
 33 | REAC:372708	p130Cas linkage to MAPK signaling for integrins	FN1	TLN1	ITGA2B	SRC	ITGB3	FGA	VWF	PTK2	FGG	RAP1B	FGB	APBB1IP	RAP1A	CRK	BCAR1	
 34 | REAC:77588	SLBP Dependent Processing of Replication-Dependent Histone Pre-mRNAs	NCBP1	SNRPD3	SLBP	NCBP2	SNRPF	LSM11	SNRPE	ZNF473	SNRPB	LSM10	SNRPG	
 35 | REAC:193634	Axonal growth inhibition (RHOA activation)	ARHGDIA	ARHGEF1	MAG	RHOA	LINGO1	NGFR	OMG	RTN4	
 36 | REAC:3315487	SMAD2/3 MH2 Domain Mutants in Cancer	TGFBR1	ZFYVE9	TGFBR2	SMAD4	SMAD3	SMAD2	TGFB1	
 37 | REAC:164843	2-LTR circle formation	XRCC4	XRCC5	XRCC6	HMGA1	PSIP1	BANF1	LIG4	
 38 | REAC:622312	Inflammasomes	NLRP3	HSP90AB1	P2RX7	PSTPIP1	MEFV	TXNIP	SUGT1	BCL2L1	APP	CASP1	NLRP1	PANX1	BCL2	NLRC4	PYCARD	AIM2	TXN	
 39 | REAC:442380	Zinc influx into cells by the SLC39 gene family	SLC39A7	SLC39A10	SLC39A14	SLC39A4	SLC39A1	SLC39A8	SLC39A6	SLC39A5	SLC39A3	SLC39A2	
 40 | REAC:418889	Ligand-independent caspase activation via DCC	DCC	APPL1	CASP9	DAPK2	MAGED1	UNC5B	UNC5A	CASP3	DAPK3	DAPK1	
 41 | REAC:975110	TRAF6 mediated IRF7 activation in TLR7/8 or 9 signaling	IRF7	TLR9	TRAF6	UBE2N	IRAK1	TLR8	DHX36	MYD88	UBE2V1	TLR7	IRAK4	
 42 | REAC:2485179	Activation of the phototransduction cascade	GNAT1	CNGA1	SLC24A1	RHO	PDE6B	PDE6A	CNGB1	SAG	PDE6G	GNB1	GNGT1	
 43 | REAC:168799	Neurotoxicity of clostridium toxins	STX1B	SV2A	SNAP25	SV2B	STX1A	VAMP1	VAMP2	SYT1	SV2C	SYT2	
 44 | REAC:196819	Vitamin B1 (thiamin) metabolism	SLC19A2	TPK1	THTPA	SLC25A19	SLC19A3	
 45 | REAC:196780	Biotin transport and metabolism	MCCC1	ACACB	ACACA	HLCS	BTD	PC	PDZD11	PCCB	MCCC2	PCCA	SLC5A6	
 46 | REAC:5666185	RHO GTPases Activate Rhotekin and Rhophilins	ROPN1	LIN7B	RHOC	RHOA	RTKN	RHPN2	RHPN1	RHOB	TAX1BP3	
 47 | REAC:110329	Cleavage of the damaged pyrimidine 	SMUG1	TDG	UNG	MBD4	NTHL1	NEIL1	OGG1	NEIL2	
 48 | REAC:1236977	Endosomal/Vacuolar pathway	LNPEP	HLA-H	HLA-G	CTSS	CTSL	HLA-F	CTSV	HLA-B	HLA-A	HLA-C	B2M	HLA-E	
 49 | REAC:1663150	The activation of arylsulfatases	SUMF2	ARSE	ARSI	ARSK	ARSG	ARSF	SUMF1	STS	ARSA	ARSH	ARSJ	ARSB	ARSD	
 50 | REAC:2562578	TRIF-mediated programmed cell death	TICAM1	CD14	TICAM2	TLR3	LY96	CASP8	FADD	RIPK3	TLR4	RIPK1	
 51 | REAC:3000484	Scavenging by Class F Receptors	HSPH1	HYOU1	SCARF1	APOB	HSP90AA1	CALR	
 52 | REAC:2028269	Signaling by Hippo	STK4	AMOTL2	STK3	YAP1	LATS2	TJP1	WWTR1	AMOT	DVL2	LATS1	YWHAE	CASP3	MOB1B	NPHP4	MOB1A	WWC1	SAV1	TJP2	YWHAB	AMOTL1	
 53 | REAC:190861	Gap junction assembly	GJA1	GJB1	GJB7	GJD4	GJD3	GJC1	GJB5	GJA10	GJD2	GJA8	GJB6	GJB3	GJC2	GJA4	GJA5	GJA9	GJB4	GJA3	GJB2	
 54 | REAC:74713	IRS activation	INS-IGF2	INS	IRS2	GRB10	IRS1	INSR	
 55 | REAC:391906	Leukotriene receptors	GPR17	CYSLTR2	LTB4R2	LTB4R	CYSLTR1	
 56 | REAC:6803207	TP53 Regulates Transcription of Caspase Activators and Caspases	TP53	PIDD1	APAF1	CASP2	TP63	CASP1	TP73	NLRC4	CRADD	ATM	CASP10	CASP6	
 57 | REAC:3595174	Defective CHST14 causes EDS, musculocontractural type	NCAN	BGN	CSPG4	VCAN	DCN	CSPG5	BCAN	
 58 | REAC:193670	p75NTR negatively regulates cell cycle via SC1	NGFR	NGF	HDAC2	PRDM4	HDAC3	HDAC1	
 59 | REAC:193144	Estrogen biosynthesis	CYP19A1	HSD17B11	HSD17B1	HSD17B14	AKR1B15	HSD17B2	
 60 | REAC:1433559	Regulation of KIT signaling	SOS1	GRB2	SOCS1	LCK	KIT	FYN	SH2B3	SRC	YES1	KITLG	PRKCA	PTPN6	CBL	LYN	SOCS6	SH2B2	
 61 | REAC:209560	NF-kB is activated and signals survival	TRAF6	UBB	SQSTM1	RPS27A	NFKBIA	NFKB1	NGFR	UBC	IKBKB	RELA	UBA52	NGF	IRAK1	
 62 | REAC:4839744	truncated APC mutants destabilize the destruction complex	CSNK1A1	PPP2R5C	PPP2R5B	AXIN1	PPP2R5E	PPP2R5A	PPP2R1B	APC	PPP2CA	GSK3B	PPP2R5D	AMER1	PPP2R1A	PPP2CB	
 63 | REAC:1839117	Signaling by cytosolic FGFR1 fusion mutants	GAB2	CPSF6	GRB2	STAT5B	PIK3CA	FGFR1OP2	MYO18A	BCR	ZMYM2	CNTRL	STAT1	PIK3R1	STAT5A	FGFR1OP	STAT3	TRIM24	LRRFIP1	CUX1	
 64 | REAC:622323	Presynaptic nicotinic acetylcholine receptors	CHRNA3	CHRNA6	CHRNA1	CHRNB3	CHRNG	CHRNA2	CHRNB2	CHRNA5	CHRNA4	CHRNE	CHRNB4	CHRND	
 65 | REAC:5368598	Negative regulation of TCF-dependent signaling by DVL-interacting proteins	DVL2	CXXC4	DVL1	CCDC88C	DVL3	
 66 | REAC:5656121	Translesion synthesis by POLI	RPA1	POLI	RFC5	RPA3	REV1	RPA2	RFC3	RFC2	UBC	UBA52	UBB	RPS27A	MAD2L2	PCNA	RFC1	REV3L	RFC4	
 67 | REAC:444209	Free fatty acid receptors	FFAR3	FFAR2	GPR31	FFAR4	FFAR1	
 68 | REAC:159424	Conjugation of carboxylic acids	GLYATL2	ACSM5	ACSM1	GLYAT	GLYATL3	ACSM2B	ACSM2A	ACSM4	GLYATL1	
 69 | REAC:2408517	SeMet incorporation into proteins	AIMP1	QARS	RARS	EPRS	MARS	KARS	DARS	AIMP2	IARS	EEF1E1	LARS	
 70 | REAC:5603027	IKBKG deficiency causes anhidrotic ectodermal dysplasia with immunodeficiency (EDA-ID) (via TLR)	NFKB1	NFKBIA	CHUK	RELA	NFKB2	NFKBIB	IKBKG	IKBKB	
 71 | REAC:112411	MAPK1 (ERK2) activation	MAPK1	JAK2	MAP2K2	PTPN11	TYK2	IL6R	JAK1	IL6ST	IL6	
 72 | REAC:193048	Androgen biosynthesis	SRD5A2	HSD17B3	HSD3B1	CGA	HSD17B12	LHB	SRD5A3	SRD5A1	POMC	CYP17A1	HSD3B2	
 73 | REAC:879518	Transport of organic anions	SLCO2B1	SLCO1B3	SLCO1A2	SLCO1C1	SLCO1B1	AVP	SLCO3A1	SLCO2A1	SLCO4C1	SLC16A2	SLCO4A1	ALB	
 74 | REAC:203641	NOSTRIN mediated eNOS trafficking	WASL	CAV1	NOS3	NOSTRIN	DNM2	
 75 | REAC:2660826	Constitutive Signaling by NOTCH1 t(7;9)(NOTCH1:M1580_K2555) Translocation Mutant	JAG2	NOTCH1	DLL4	DLL1	ADAM17	JAG1	ADAM10	
 76 | REAC:209822	Glycoprotein hormones	FSHB	CGB5	INHBC	LHB	INHBA	CGB3	TSHB	CGA	CGB8	INHA	INHBE	INHBB	
 77 | REAC:189451	Heme biosynthesis	FECH	COX10	UROD	ALAS1	PPOX	HMBS	COX15	ALAS2	UROS	CPOX	ALAD	
 78 | REAC:5637815	Signaling by Ligand-Responsive EGFR Variants in Cancer	PIK3CA	UBB	GRB2	SOS1	GAB1	NRAS	RPS27A	HSP90AA1	HRAS	KRAS	CDC37	EGFR	SHC1	PIK3R1	EGF	UBC	PLCG1	CBL	UBA52	
 79 | REAC:75205	Dissolution of Fibrin Clot	PLG	SERPINB8	PLAUR	PLAT	SERPINB2	SERPINF2	SERPINB6	S100A10	HRG	SERPINE2	PLAU	ANXA2	SERPINE1	
 80 | REAC:111471	Apoptotic factor-mediated response	CASP7	CYCS	CASP3	APAF1	CASP9	XIAP	DIABLO	
 81 | REAC:2408508	Metabolism of ingested SeMet, Sec, MeSec into H2Se	AHCY	CBS	GNMT	MAT1A	NNMT	HNMT	SCLY	CTH	
 82 | REAC:200425	Import of palmitoyl-CoA into the mitochondrial matrix	PRKAG2	PRKAB2	CPT1B	PPARD	THRSP	CPT2	MID1IP1	PRKAA2	RXRA	CPT1A	SLC22A5	SLC25A20	
 83 | REAC:176974	Unwinding of DNA	MCM3	MCM6	CDC45	MCM5	GINS3	MCM2	MCM4	MCM8	GINS2	GINS1	GINS4	MCM7	
 84 | REAC:5684264	MAP3K8 (TPL2)-dependent MAPK1/3 activation	UBB	CUL1	MAP3K8	FBXW11	MAP2K4	MAP2K1	RPS27A	IKBKG	CHUK	SKP1	TNIP2	BTRC	NFKB1	IKBKB	UBC	UBA52	
 85 | REAC:69416	Dimerization of procaspase-8	RIPK1	FADD	FAS	CASP8	FASLG	TRAF2	CFLAR	TNFRSF10B	TNFRSF10A	TNFSF10	TRADD	
 86 | REAC:174495	Synthesis And Processing Of GAG, GAGPOL Polyproteins	RPS27A	UBAP1	VPS37B	VPS37D	UBB	VPS37C	MVB12B	MVB12A	NMT2	UBA52	TSG101	VPS28	UBC	VPS37A	
 87 | REAC:2022377	Metabolism of Angiotensinogen to Angiotensins	C9ORF3	GZMH	MME	CPB1	ACE2	CTSZ	ACE	CPB2	ENPEP	AGT	REN	CTSD	ATP6AP2	CTSG	ANPEP	CPA3	CMA1	
 88 | REAC:561048	Organic anion transport	SLC22A8	SLC22A7	SLC22A11	SLC22A6	SLC22A12	
 89 | REAC:189085	Digestion of dietary carbohydrate	AMY2A	AMY2B	MGAM	AMY1A	LCT	AMY1B	AMY1C	TREH	SI	
 90 | REAC:4793953	Defective B4GALT1 causes B4GALT1-CDG (CDG-2d)	LUM	KERA	PRELP	OMD	FMOD	OGN	ACAN	
 91 | REAC:8857538	PTK6 promotes HIF1A stabilization	HIF1A	PTK6	HBEGF	LINC01139	GPNMB	LRRK2	EGFR	
 92 | REAC:111931	PKA-mediated phosphorylation of CREB	ADCY7	ADCY2	ADCY9	PRKAR1A	PRKAR2B	ADCY5	ADCY4	ADCY1	PRKACA	ADCY6	PRKACG	PRKAR1B	PRKACB	PRKAR2A	CREB1	ADCY8	ADCY3	
 93 | REAC:2197563	NOTCH2 intracellular domain regulates transcription	FCER2	HES1	CREB1	EP300	RBPJ	MAML1	GZMB	MAML3	HES5	NOTCH2	MAMLD1	MAML2	
 94 | REAC:111957	Cam-PDE 1 activation	CALM3	CALM1	PDE1C	CALM2	PDE1A	PDE1B	
 95 | REAC:5676934	Protein repair	MSRB2	MSRA	PCMT1	MSRB1	MSRB3	TXN	
 96 | REAC:4420332	Defective B3GALT6 causes EDSP2 and SEMDJL1	SDC2	BCAN	VCAN	CSPG4	GPC3	CSPG5	GPC5	SDC1	SDC3	BGN	AGRN	HSPG2	SDC4	GPC1	NCAN	GPC6	GPC2	DCN	GPC4	
 97 | REAC:428540	Activation of Rac	RAC1	PAK3	BUB1B-PAK6	SOS2	ROBO1	PAK6	PAK1	NCK2	RNASE1	PAK2	GPC1	PAK5	SOS1	NCK1	PAK4	SLIT2	
 98 | REAC:549132	Organic cation/anion/zwitterion transport	SLC22A1	RSC1A1	SLC22A15	SLC22A18	SLC22A5	SLC22A12	SLC22A6	SLC22A4	SLC22A11	RUNX1	SLC22A7	SLC22A2	SLC22A16	SLC22A3	SLC22A8	
 99 | REAC:1839130	Signaling by activated point mutants of FGFR3	FGF4	FGF2	FGF18	FGF8	FGFR3	FGF16	FGF23	FGF9	FGF20	FGF5	FGF17	FGF1	
100 | REAC:5688849	Defective CSF2RB causes pulmonary surfactant metabolism dysfunction 5 (SMDP5)	SFTPA2	SFTPB	CSF2RB	CSF2RA	SFTA3	SFTPC	SFTPD	SFTPA1	
101 | 


--------------------------------------------------------------------------------
/tests/testthat/test_columnContribution.r:
--------------------------------------------------------------------------------
 1 | context('columnContribution function')
 2 | 
 3 | 
 4 | test_that('Column Contribution ratio is correct', {
 5 |     col <- colnames(dat)[1]
 6 | 
 7 |     res <- ActivePathways(dat, gmt, significant = 1)
 8 |     res_just_column <- ActivePathways(dat[, 1, drop = FALSE], gmt, significant = 1)
 9 |     expect_equal(res_just_column$overlap, res[[paste0("Genes_", col)]])
10 | })
11 | 


--------------------------------------------------------------------------------
/tests/testthat/test_columnSignificance.r:
--------------------------------------------------------------------------------
 1 | context('columnSignificance Function')
 2 | 
 3 | 
 4 | test_that('columnSignificance agrees with testing individual columns', {
 5 |     col <- colnames(dat)[1]
 6 | 
 7 |     res1 <- columnSignificance(dat, gmt, background, 0.1, 0.05, 'holm', rep(0.05, length(gmt)))
 8 |     res2 <- ActivePathways(dat[, col, drop = FALSE], gmt, 
 9 |     		correction_method = 'holm', geneset_filter = NULL)
10 | 
11 |     # Pathways that are significant according to columnSignificance
12 |     comp1 <- res1$term_id[ sapply(res1[[paste0("Genes_", col)]], function(x) !all(is.na(x))) ]
13 | 
14 |     expect_true(setequal(comp1, res2$term_id))
15 | })
16 | 
17 | test_that('Column names of columnSignificance result is correct', {
18 |     res <- columnSignificance(dat, gmt, background, 0.1, 0.05, 'holm', rep(0.05, length(gmt)))
19 |     expect_equal(colnames(res), c('term_id', 'evidence', paste0("Genes_", colnames(dat))))
20 | })
21 | 


--------------------------------------------------------------------------------
/tests/testthat/test_cytoscape.r:
--------------------------------------------------------------------------------
 1 | context("Validation of Cytoscape Files and Test that the files are written")
 2 | 
 3 | 
 4 | test_that("cytoscape_filenames specified", {
 5 | 
 6 |     expect_error(ActivePathways(dat, gmt, cytoscape_file_tag = CStag, significant = 1), NA)
 7 |     expect_message(ActivePathways(dat[,1, drop = FALSE], gmt, cytoscape_file_tag = CStag, significant = 1),
 8 | 			"scores matrix contains only one column. Column contributions will not be calculated", 
 9 | 			fixed = TRUE)
10 | })
11 | 
12 | 
13 | test_that("Cytoscape files are written", {
14 | 	
15 | 	CS_fnames = paste0(CStag, c("pathways.txt", "subgroups.txt", "pathways.gmt", "legend.pdf"))
16 | 	
17 |     suppressWarnings(file.remove(CS_fnames))
18 |     ActivePathways(dat, gmt, cytoscape_file_tag = CStag, significant = 0.9, cutoff = 1)
19 |     expect_equal(file.exists(CS_fnames), c(TRUE, TRUE, TRUE, TRUE))
20 | 
21 |     suppressWarnings(file.remove(CS_fnames))
22 |     ActivePathways(dat[,1, drop = FALSE], gmt, cytoscape_file_tag = CStag, significant = 0.9, cutoff = 1)
23 |     expect_equal(file.exists(CS_fnames), c(TRUE, FALSE, TRUE, FALSE))
24 | 
25 |     suppressWarnings(file.remove(CS_fnames))
26 |  	suppressWarnings(ActivePathways(dat, gmt, cytoscape_file_tag = CStag, significant = 0))
27 |     expect_equal(file.exists(CS_fnames), c(FALSE, FALSE, FALSE, FALSE))
28 | 
29 |     suppressWarnings(file.remove(CS_fnames))
30 |     suppressWarnings(ActivePathways(dat, gmt, cytoscape_file_tag = NA))
31 |     expect_equal(file.exists(CS_fnames), c(FALSE, FALSE, FALSE, FALSE))
32 | 
33 |     suppressWarnings(file.remove(CS_fnames))
34 | })
35 | 


--------------------------------------------------------------------------------
/tests/testthat/test_data.txt:
--------------------------------------------------------------------------------
  1 | Gene	cds	promoter	enhancer
  2 | YWHAB	0.181759849432863	0.573086408819819	0.0801841709343925
  3 | PIK3R1	0.602307993749334	0.207969196040606	0.663872931721775
  4 | LYN	0.470537676163286	0.674624033841617	0.267676513421447
  5 | CBL	0.550952588371561	0.116561937843907	0.209648324944454
  6 | PIK3CA	0.0540076617030512	0.0290456574381137	0.382803198202891
  7 | FYN	NaN	0.673002840468483	0.0162136835995857
  8 | GRB2	0.286803788768734	0.134829611925831	0.0631235526924534
  9 | CRK	0.719999151213222	0.0655267271138409	0.185142055481746
 10 | YES1	0.578556194515032	0.124656734114965	0.228050611170295
 11 | FFAR1	0.00303599268752283	0.0636714240840086	0.0492807721449381
 12 | SRD5A3	0.288074096116829	0.0358067339897629	0.123682154635008
 13 | TGFB1	0.0612956812817711	0.0610529986408099	0.323547128645997
 14 | SMAD2	0.568142036584863	0.0947513326601436	NaN
 15 | SMAD3	0.176208589757222	0.203036016818637	0.621776050218014
 16 | TGFBR2	0.215742489942542	0.0625617834021545	0.323493076750721
 17 | ZFYVE9	0.0224740590393185	0.388509516563473	0.403654829321766
 18 | TGFBR1	0.538995651513955	0.00104396926512306	0.00660456963077945
 19 | FASLG	0.826268064502935	0.570385053029772	0.614228033751506
 20 | FAS	0.129042188048758	0.20908063560038	0.00372549509946649
 21 | TNFRSF10B	0.0787923912677442	0.0574973243105529	0.386117968844315
 22 | TRAF2	0.361874014385668	0.22442794052215	0.219061256254318
 23 | RIPK3	0.136275282672798	0.0490289357618559	0.139143487766408
 24 | FADD	0.140990912559879	0.328272367301317	0.0514925745297319
 25 | CASP8	0.491529744931601	0.300376305357171	0.108679339959835
 26 | RIPK1	0.342228607177507	0.168452777932317	0.294468711326761
 27 | TNFRSF10A	0.462360632575749	0.0348048284650228	0.337221848126996
 28 | XIAP	0.280093566017082	0.231719806343191	0.215701169481379
 29 | TRADD	0.163648477250899	0.0241196800137681	0.223511220917899
 30 | TNFSF10	0.295085663460857	0.0143604971451271	0.619212329533342
 31 | CFLAR	0.509505723632392	0.0577364266381152	0.0473195089397215
 32 | SKP1	0.440620422727399	0.653130867621918	0.0107270688797775
 33 | CUL1	0.69308480218818	0.00752388811354779	0.0031797652177559
 34 | rty	0.187285536302794	0.397634913189841	0.0120889812105367
 35 | DCN	0.172936536608252	0.00167639656635933	0.0340881081725078
 36 | HEXA	0.33022116641044	0.931526053768471	0.27134635449122
 37 | NCAN	0.18238263117686	0.299748397560577	0.474814417973435
 38 | ARSB	0.23111211631101	0.34311876315205	0.533967882602817
 39 | zxc	0.284584475728181	0.351512281944281	0.592723365828653
 40 | HEXB	0.648283156662765	0.239973120916298	0.086056202465542
 41 | CSPG5	0.659346099292787	0.0487821028774078	0.358557947073338
 42 | VCAN	0.0686878027536848	0.491104585717545	0.0293370810717161
 43 | fgh	0.0509080834579079	0.187641833440054	0.901662542354382
 44 | BCAN	0.787631766257433	0.225030615489533	0.0374954394106719
 45 | qwe	0.326165001483189	0.227565735120978	0.236017704377374
 46 | PPP2R5C	0.0256205616326249	0.237770242497341	0.703038785186687
 47 | PPP2R5B	0.668138052886544	0.133410161184475	0.16182500769308
 48 | CSNK1A1	0.095428723884147	0.496964753033534	0.399608162580152
 49 | asd	0.00682200165226474	0.081460833287022	0.0066516898093222
 50 | PPP2R5E	0.218919950042612	0.570025483190649	NaN
 51 | PPP2R5A	0.111667365168111	0.0697380171623881	0.016823949197242
 52 | PPP2R5D	0.52537266912045	0.063956104952396	0.0804431493305053
 53 | AMER1	0.0635450420358293	0.152257093354487	0.245071437603002
 54 | GSK3B	0.508044461840615	0.255655872271908	0.593995143698125
 55 | PPP2CA	0.00626565432006055	0.124474012488303	0.240561275336224
 56 | APC	0.362197080075817	0.411662319752638	0.132461065475608
 57 | PPP2CB	0.205843495067803	0.635863265436401	0.378447105687548
 58 | PPP2R1A	0.0985574849943051	0.588028264171712	0.255382889202054
 59 | MUC2	0.405105057017385	0.154423601969691	0.387027882147937
 60 | MUC5B	0.174949507957768	0.535187173560105	0.00204348738970449
 61 | MUC19	0.106674213966976	0.674560255787201	0.278825827268245
 62 | MUC3A	0.0117273200509688	0.160412973337993	0.295858270553739
 63 | MUC12	0.0495774824192386	0.786175224347218	0.419916319436669
 64 | MUC7	0.636459665502816	0.580504673105255	NaN
 65 | MUC6	0.395838210813141	0.0481391996153076	0.108968288063993
 66 | MUC5AC	0.279811607955896	NaN	0.197066465382849
 67 | MUC21	0.175460048777475	0.274879467257154	0.150139895113747
 68 | MUC15	0.333821054134149	0.228122891010576	0.0111420352647364
 69 | MUCL1	0.379818836725266	0.220906795424679	0.535082329455193
 70 | MUC4	0.904550466894486	0.0527523599134175	0.237503425736444
 71 | MUC20	0.179841497617397	0.096855549952906	0.0337274434852643
 72 | MUC1	0.886684840037558	0.149391315008235	0.696308352577933
 73 | MUC13	0.711293035984503	0.5343909634298	0.0179609866278288
 74 | MUC16	0.423176647603287	0.132447232115612	0.0677308970025233
 75 | MUC17	0.500910607124346	0.173990975121157	0.0248499519296264
 76 | NOTCH2	0.266290030147768	0.753648701095022	0.0762196409571534
 77 | ALOX15	0.118226541780289	0.155017854300937	0.361492803953278
 78 | GPX4	0.00843781888462744	0.336992751483621	0.270437507231239
 79 | GPX2	0.741130866151387	0.564265427419989	0.251102107978515
 80 | GPX1	0.0030950654526386	0.108904545446975	0.10684374021452
 81 | CASP10	0.0554205690135064	0.102944850372978	0.0826767820182217
 82 | ACAN	0.784223631802517	0.0841097031577299	0.10120852124552
 83 | FMOD	0.743174281732335	0.129713597785276	0.326095776862619
 84 | OGN	0.0581407235165238	0.386195902945662	0.122358916983575
 85 | KERA	0.0937110839782113	0.0107751800483757	0.371142369002521
 86 | OMD	0.68830180098311	0.460768175970274	0.00994624933071638
 87 | PRELP	0.0468314374082694	0.387730193019547	0.31670497294304
 88 | LUM	0.155238545178946	0.450677720254535	0.300686039230981
 89 | CALM3	0.294082753617955	0.0973512997693561	0.0757295548189794
 90 | CALM1	0.0162164472903082	0.0146462814141986	0.0883705169615348
 91 | CALM2	0.28867463584822	NaN	0.476320391605935
 92 | CREB1	0.100589558043417	0.00441554136643305	0.193774879008658
 93 | IRAK4	0.00187477740643946	0.385542838344349	0.317205656228724
 94 | TLR4	0.113487144001835	0.0509786886990751	0.144689976899324
 95 | MYD88	0.0531301110581846	NaN	0.0970400569269218
 96 | CD14	0.262592331139313	0.164184846544988	0.442698832228852
 97 | LY96	0.586673957860476	0.400579860275021	0.318368637777921
 98 | AVP	0.0205120962953295	0.13026984986983	0.27948032541776
 99 | TP53	0.227437347116785	0.116820408172034	0.0672024581528924
100 | HMGA1	0.175266313377645	0.195205552944568	0.55227284759047
101 | BANF1	0.00862024755915763	0.404321216235234	NaN
102 | PSIP1	0.0315422441663664	0.260179410362494	0.0193001785639445
103 | RHOC	0.46282590745913	0.0800188748102712	0.38912825821739
104 | RNASE1	0.391651541447617	0.296354310690846	0.375541368175919
105 | RAC1	0.0497605816406729	0.473192699224232	0.00785975981540628
106 | RHOB	0.15536420963248	0.00831086962178004	0.59812700514789
107 | RHOA	0.173077362901154	0.0654017654847227	0.135245141152461
108 | SRC	0.212359437153009	0.0122535219756262	0.507312746796874
109 | NGFR	0.322377774987638	0.181379556715301	0.352457843008307
110 | CASP1	0.308680244342293	0.216286729560113	0.0126951770622138
111 | NLRC4	0.223874641936647	0.530559156350974	0.158444218992153
112 | TXN	0.210539932795287	0.00727720974666793	0.227056053007002
113 | CASP9	0.249826923228618	0.368385912313289	0.302337101590475
114 | CASP3	0.38916880865594	0.270582585454458	0.394530508850117
115 | TRAF6	0.00499926355836968	0.757390671934309	0.314606112914376
116 | IRAK1	0.0368293027963075	0.213197299245414	0.0213526357655416
117 | HSP90AA1	0.33071309399982	0.084418215333913	0.11277048299554
118 | DVL2	0.497118703694093	0.177351832995164	0.0176069871193889
119 | APAF1	0.00638770741570969	0.0280786242682516	0.00253383433249355
120 | NGF	0.701357809121609	0.224052417264222	0.205976732317063
121 | SOS1	0.43215780379985	NaN	0.0398484022264086
122 | UBB	0.402179697719774	0.155442919318031	0.0912067445566898
123 | RPS27A	0.763321282798381	0.105533938373264	0.278925020638349
124 | NFKBIA	0.322306480886568	0.881330034849885	0.0149157670047246
125 | NFKB1	0.750516107024684	0.524829883640447	0.186042826898887
126 | UBC	0.0333904103535825	0.34319503837792	0.142243777205839
127 | IKBKB	0.426156664406726	0.158072265510563	0.153501809385075
128 | RELA	0.631518450016278	0.0187994230118444	0.691485574359667
129 | UBA52	0.477150317357163	0.0345409139103193	0.337136959982154
130 | CHUK	0.33664114526757	0.11167441558727	0.0591199470884607
131 | IKBKG	0.0273153924865842	0.0485566114907025	0.216172251093514
132 | CGA	0.0304592868066794	0.330087367618374	0.0704885247069825
133 | LHB	0.116528473112306	0.22983749350818	0.215878879567404
134 | EGFR	0.245749468620911	0.428237554651949	0.157023897185237
135 | SLC22A5	0.221200111634252	0.64464870188557	0.0176161706232961
136 | SLC22A8	0.452956386587425	0.0075086814505186	0.382419380566935
137 | SLC22A7	0.416167622932431	0.270523771951322	0.126673352104803
138 | SLC22A11	0.576297231810438	0.421769362201008	0.137786002942712
139 | SLC22A6	0.280550250201698	0.623302031329024	0.109465727804541
140 | SLC22A12	0.421394167036946	0.0769854744091124	0.00664221909901664
141 | GPC1	0.140697512299858	0.12154599641336	NaN
142 | 


--------------------------------------------------------------------------------
/tests/testthat/test_data_rna_protein.tsv:
--------------------------------------------------------------------------------
  1 | gene	rna_pval	rna_log2fc	protein_pval	protein_log2fc
  2 | TBX20	0.000357525	-1.105589466	NA	NA
  3 | TPRG1	0.000450743	1.692824405	0.015971771	0.605260822
  4 | TAGLN3	0.000586889	-0.934074291	NA	NA
  5 | COL2A1	0.000756495	-1.11660567	0.893790743	-0.144821024
  6 | TMEM220	0.000756495	-1.591869755	NA	NA
  7 | SLC17A2	0.001814256	-1.646219828	NA	NA
  8 | DAPK2	0.001944503	-1.236295967	0.045711209	-0.507885739
  9 | MAMDC2	0.001944503	-1.698242123	0.076839798	-1.964908518
 10 | DCLK1	0.002421783	-1.038307475	0.015971771	-1.127911513
 11 | KHNYN	0.002995212	1.142517108	0.005479684	-1.017432094
 12 | TMEM55B	0.002995212	0.808984306	0.978697383	0.291851953
 13 | HUS1B	0.002995212	-1.005525351	NA	NA
 14 | SLC22A10	0.003010222	0.74624427	NA	NA
 15 | PPP1R13B	0.003684403	1.153023448	0.018784685	0.98526776
 16 | CHD8	0.003684403	1.318268287	0.37602023	0.620428325
 17 | SYPL1	0.00450474	1.231264698	0.05232352	1.005819252
 18 | CRLS1	0.00450474	-1.292625821	NA	NA
 19 | HSD3B7	0.00450474	1.090055947	0.111888112	0.311957592
 20 | RAB2B	0.00450474	1.171185417	0.768886795	-0.067745508
 21 | RAMP1	0.00450474	-1.203869726	NA	NA
 22 | IZUMO1	0.00450474	0.977368481	NA	NA
 23 | NARF	0.005479684	-0.82458169	0.688522201	0.177777166
 24 | DNAJA4	0.005479684	-1.004742924	0.086761533	-0.846859147
 25 | DTWD2	0.005479684	1.072007456	1	0.012433014
 26 | TMEM198	0.005479684	-0.867986706	NA	NA
 27 | GIP	0.005648776	-1.021586722	NA	NA
 28 | DGKK	0.005828361	-0.982368597	NA	NA
 29 | AP1G2	0.006628079	1.37972769	0.294537238	0.253267234
 30 | TAF7L	0.006628079	-0.744959654	NA	NA
 31 | C19orf57	0.006628079	1.427460164	NA	NA
 32 | JSRP1	0.006628079	-1.013067608	0.122517932	-0.894223144
 33 | RBM12B	0.006628079	1.1387136	0.347492645	0.525667888
 34 | POLN	0.006628079	-0.547246086	NA	NA
 35 | ASCL1	0.007082013	-1.045419331	NA	NA
 36 | ARID4A	0.00797954	1.070053132	0.059675788	0.484819373
 37 | RBBP4	0.00797954	-1.805544172	1	-0.224044148
 38 | SYN1	0.00797954	-1.240076496	1	-0.243574681
 39 | MYH15	0.00797954	-0.903393125	0.076839798	-0.512752607
 40 | IPO11	0.00797954	1.088966953	0.574292637	0.361459873
 41 | MEPCE	0.00797954	1.518069613	0.320321905	0.720907206
 42 | RHBDF2	0.00797954	0.723731174	0.034513009	1.036065362
 43 | RNF145	0.00797954	0.960588002	0.218848025	0.607235045
 44 | NOP9	0.00797954	1.008246249	0.006628079	0.983550708
 45 | CPNE7	0.00797954	1.154804721	NA	NA
 46 | TRPV5	0.008090787	0.644002696	NA	NA
 47 | SLC31A1	0.009556372	0.784097699	NA	NA
 48 | GPX7	0.009556372	-1.170961288	0.151872392	-0.939190446
 49 | ING1	0.009556372	-1.095231053	0.103708514	-1.206054698
 50 | PFKP	0.009556372	0.734171072	0.029822126	0.653130432
 51 | SCAMP1	0.009556372	1.350597405	0.978697383	0.657603859
 52 | FAF2	0.009556372	0.821466416	0.270123647	0.560733539
 53 | PLCE1	0.009556372	-1.109233066	NA	NA
 54 | FAR2	0.009556372	0.456814184	NA	NA
 55 | AP5M1	0.009556372	1.141371435	0.136614041	0.641827057
 56 | MRPS11	0.009556372	-1.335737706	0.810041728	0.127962304
 57 | DHRS1	0.009556372	0.864620275	0.076839798	0.786212305
 58 | SLFN13	0.009556372	-1.00963914	NA	NA
 59 | CADM2	0.009556372	-0.749532162	NA	NA
 60 | ZNF425	0.009556372	1.422190487	NA	NA
 61 | PGBD2	0.009556372	1.181635837	NA	NA
 62 | CCDC177	0.009761752	0.68807853	NA	NA
 63 | TRIM60	0.009972042	-1.545409372	NA	NA
 64 | USP10	0.011393573	1.028713556	0.136614041	0.837526835
 65 | GTF2IRD1	0.011393573	1.011279562	0.127428127	1.289800091
 66 | TOX4	0.011393573	0.889082427	0.469593677	-0.114132794
 67 | PRMT5	0.011393573	0.768809429	0.086761533	0.82702538
 68 | NCOA2	0.011393573	0.774338167	0.503319039	0.550695269
 69 | RUSC1	0.011393573	0.936215214	0.864616785	-0.092534527
 70 | MTURN	0.011393573	-1.635825951	0.674396811	-0.118331203
 71 | SMIM20	0.011393573	-0.794011156	0.025666212	-1.113577341
 72 | ADPRM	0.011393573	-1.211928747	NA	NA
 73 | MAPK4	0.011393573	-0.870785701	NA	NA
 74 | WNT5B	0.011393573	-1.065140524	NA	NA
 75 | MAP3K14	0.011393573	-1.177319935	NA	NA
 76 | HBM	0.01275843	1.427961962	0.236985237	1.160652335
 77 | CA7	0.013151255	-0.594299142	NA	NA
 78 | RALYL	0.013166999	-1.173671993	NA	NA
 79 | PSG3	0.013257371	0.798473654	NA	NA
 80 | CROCC2	0.013310573	-1.035260571	0.294537238	1.917226435
 81 | AP2B1	0.013518835	-1.219757499	0.893790743	0.128839961
 82 | CUX1	0.013518835	0.75074608	0.015971771	1.011067942
 83 | FGD1	0.013518835	-0.807987005	NA	NA
 84 | GAS8	0.013518835	1.311705956	NA	NA
 85 | EIF4H	0.013518835	0.586479532	0.122517932	0.253250419
 86 | FBP2	0.013518835	-0.822130547	0.225413534	-0.694679452
 87 | SOCS2	0.013518835	-1.358713506	NA	NA
 88 | LRRC49	0.013518835	-0.666150035	NA	NA
 89 | PANK3	0.013518835	1.334098652	0.405904275	0.705984706
 90 | EMC6	0.013518835	-0.750929476	1	1.347986834
 91 | EPPK1	0.013518835	1.156332217	0.059675788	0.744371633
 92 | TRIM41	0.013518835	1.186057465	0.630528914	0.142162735
 93 | CRY1	0.013518835	-0.977021758	NA	NA
 94 | FXYD1	0.013518835	-0.767150995	NA	NA
 95 | TTLL1	0.013518835	-0.789939528	NA	NA
 96 | ZYG11A	0.013518835	0.992600549	NA	NA
 97 | RAD51L1	0.013518835	0.922141757	NA	NA
 98 | FIGN	0.013518835	-0.743659684	NA	NA
 99 | CCDC62	0.013518835	-0.761940866	NA	NA
100 | SIM1	0.014359259	-0.816299684	NA	NA
101 | CALML3	0.015971771	0.974592667	0.015971771	1.16270074
102 | MTTP	0.015971771	-1.492207502	0.628571429	-1.815562057
103 | EIF5B	0.015971771	0.710031225	0.893790743	-0.145090034
104 | SCFD1	0.015971771	1.062474486	0.009556372	1.090109309
105 | HBP1	0.015971771	1.120776017	NA	NA
106 | TMOD2	0.015971771	-0.866905749	0.109548295	-0.620059409
107 | DBR1	0.015971771	0.972127105	0.247091129	0.287829105
108 | TBC1D7	0.015971771	-1.733317374	0.955089355	-0.168800252
109 | FUNDC2	0.015971771	-0.924000602	0.018784685	-0.877012634
110 | EFCAB1	0.015971771	-1.329640051	NA	NA
111 | REEP4	0.015971771	0.37786569	0.649510605	0.183245443
112 | COL21A1	0.015971771	-0.612811973	0.572760573	-0.865929997
113 | RNF170	0.015971771	1.286652391	0.766432484	0.037020852
114 | ATG10	0.015971771	1.135243111	NA	NA
115 | CORO6	0.015971771	-1.168547316	0.437101706	-0.820529902
116 | SLC5A12	0.015971771	0.710350342	NA	NA
117 | RALGAPA1	0.015971771	0.30796241	0.002995212	1.138278243
118 | SRPK3	0.015971771	-1.138555211	NA	NA
119 | DEFA4	0.01616114	1.048160071	0.039795012	0.952835518
120 | CCK	0.01793474	-0.722583633	NA	NA
121 | F12	0.018784685	0.861768435	0.611412419	-0.46230423
122 | FNTA	0.018784685	0.72152704	0.20508663	0.312886485
123 | HIST1H1A	0.018784685	-0.82293859	0.025666212	-0.659115125
124 | IFI27	0.018784685	0.793442768	NA	NA
125 | CD99	0.018784685	-0.862031929	0.005479684	-0.96600878
126 | XPNPEP2	0.018784685	-0.907463709	1	-0.031775315
127 | CLINT1	0.018784685	0.311587342	0.018784685	0.712626692
128 | CTCF	0.018784685	1.132315778	0.469593677	-0.484562394
129 | EML2	0.018784685	0.822731442	0.039795012	0.843028384
130 | LRP10	0.018784685	1.312518564	0.00797954	1.271599045
131 | SUSD5	0.018784685	-1.446855602	NA	NA
132 | MAT2B	0.018784685	0.900001515	0.045711209	0.63366498
133 | SERPINA10	0.018784685	-1.133937437	0.405904275	-0.229143393
134 | ARMCX1	0.018784685	-0.869225137	0.018794774	-1.14734795
135 | CDKAL1	0.018784685	-1.679050116	0.247091129	-0.616640144
136 | PARP16	0.018784685	-0.467599741	0.31959707	-0.208408305
137 | C1GALT1	0.018784685	-0.775887675	1	0.182597728
138 | TRMT5	0.018784685	0.720214087	0.076839798	0.600977224
139 | C14orf93	0.018784685	1.285617223	NA	NA
140 | ZNF655	0.018784685	1.721029817	0.089209552	0.459080313
141 | NLRX1	0.018784685	1.058067919	0.20508663	0.415301806
142 | EFHC2	0.018784685	-0.992127041	NA	NA
143 | C1orf21	0.018784685	-0.893406586	0.623878701	-0.539503655
144 | TPGS1	0.018784685	-0.427021579	NA	NA
145 | LRRN4	0.018784685	-1.020970634	NA	NA
146 | TRIM7	0.018784685	0.860553697	0.036522301	1.517764677
147 | SLC39A3	0.018784685	-1.081669893	NA	NA
148 | SPIRE2	0.018784685	0.66259784	NA	NA
149 | FUT1	0.018784685	1.063330371	NA	NA
150 | DHRS4L1	0.018784685	1.122289655	NA	NA
151 | SLC13A2	0.019108754	0.548421006	NA	NA
152 | TMEM82	0.020817976	-0.722850121	NA	NA
153 | ADIPOQ	0.021751781	-0.822727063	0.574292637	-0.570247296
154 | BIN1	0.0220045	-1.189585728	0.109548295	-0.974888675
155 | ARHGAP5	0.0220045	0.851355772	0.076839798	0.834539815
156 | CYC1	0.0220045	1.338254833	0.574292637	0.107577951
157 | CYP24A1	0.0220045	1.151659865	0.93616176	0.095761425
158 | ESRRA	0.0220045	0.660113769	0.149964105	0.277820695
159 | HSD17B4	0.0220045	1.208505275	0.011393573	0.964397863
160 | HUS1	0.0220045	-0.887604587	0.059675788	-1.074655885
161 | LTA4H	0.0220045	0.605073618	0.029822126	0.656911544
162 | MSH3	0.0220045	0.595499727	0.37602023	0.195951859
163 | PAH	0.0220045	-1.050932983	NA	NA
164 | PSME1	0.0220045	1.085163107	0.151872392	1.021176065
165 | SLC20A2	0.0220045	0.90146853	0.437101706	0.555860354
166 | TST	0.0220045	0.495652934	0.270123647	0.048579392
167 | VDAC3	0.0220045	0.950739895	0.810041728	0.335234937
168 | WFS1	0.0220045	0.388992952	0.009556372	0.409441223
169 | DGKD	0.0220045	-0.993987219	0.674396811	0.682444501
170 | DGAT1	0.0220045	0.848706873	0.180652681	-0.442313433
171 | VPS4B	0.0220045	0.870010447	0.015971771	0.789952594
172 | SCRN1	0.0220045	-0.995964561	0.002421783	-1.178174456
173 | NFASC	0.0220045	-0.836618651	0.571428571	-0.477647981
174 | PAMR1	0.0220045	-1.04891043	0.650549746	0.131151306
175 | TMEM98	0.0220045	-0.93615148	NA	NA
176 | FBXO4	0.0220045	-1.317559593	0.009556372	-1.139629931
177 | TINF2	0.0220045	0.702787892	0.125942685	0.630886873
178 | NMD3	0.0220045	0.979023773	0.097642444	0.771109828
179 | MINDY4	0.0220045	-0.714932152	0.4	-1.332866921
180 | LCA5	0.0220045	-1.122629113	NA	NA
181 | FAM45A	0.0220045	0.644317329	0.37602023	0.45236434
182 | C6orf132	0.0220045	0.83027806	0.05232352	0.722427323
183 | TMPRSS11D	0.0220045	0.72611518	0.405904275	0.300725076
184 | FAM189A2	0.0220045	-0.779077616	NA	NA
185 | KCNB1	0.0220045	-1.135530768	NA	NA
186 | TRIM71	0.0220045	-0.390862224	NA	NA
187 | BCL2L2-PABPN1	0.0220045	1.142463011	NA	NA
188 | ZKSCAN7	0.0220045	-0.76199694	NA	NA
189 | KIAA1377	0.0220045	-1.108424161	NA	NA
190 | HCAR3	0.0220045	0.581568191	NA	NA
191 | EGF	0.025666212	-0.619128093	NA	NA
192 | GOLGA2	0.025666212	0.771911066	0.538245486	0.20414804
193 | GPR39	0.025666212	-0.697353915	NA	NA
194 | HES1	0.025666212	0.902489219	0.78594874	0.549922842
195 | IGF1	0.025666212	-1.054153995	0.638888889	-0.300969816
196 | PNN	0.025666212	0.854128804	0.574292637	-0.291719599
197 | RAB3B	0.025666212	-1.348240928	0.320321905	-0.635523855
198 | RABGGTA	0.025666212	0.867452733	0.097642444	0.763373148
199 | SRP54	0.025666212	0.575056607	0.20508663	0.535352723
200 | ZBTB16	0.025666212	-1.021094326	NA	NA
201 | PLA2G7	0.025666212	-1.180397876	0.000980713	1.445835206
202 | CLDN2	0.025666212	-1.145554858	NA	NA
203 | WDR47	0.025666212	0.725264358	0.005479684	0.99459523
204 | ICOSLG	0.025666212	-0.706157977	NA	NA
205 | SACS	0.025666212	-1.773699496	0.270123647	-1.290294442
206 | RANGRF	0.025666212	-0.712027992	0.565783927	-0.447246023
207 | UGGT1	0.025666212	-0.684258871	0.294537238	-0.451493623
208 | AKR1B10	0.025666212	0.736256055	0.018784685	0.749245676
209 | HOMEZ	0.025666212	1.241970113	0.075414781	1.013902997
210 | FAM129B	0.025666212	1.008986306	0.039795012	0.701572406
211 | DCAF11	0.025666212	0.954693826	0.347492645	0.864227715
212 | SFXN1	0.025666212	1.950901287	0.186070034	1.060301035
213 | CAVIN3	0.025666212	-1.259241046	0.136614041	-0.514935324
214 | DCBLD2	0.025666212	-1.010200191	0.458520823	-0.518327588
215 | GPAT4	0.025666212	0.634149344	0.776210828	0.19256231
216 | POTEI	0.025666212	-1.269284259	0.503319039	0.399375873
217 | PPP4R4	0.025666212	-0.722077425	0.111888112	0.446241484
218 | ASIC2	0.025666212	-0.95651339	NA	NA
219 | ATRNL1	0.025666212	-0.883351194	NA	NA
220 | DIRAS1	0.025666212	-0.846225689	NA	NA
221 | FAM171B	0.025666212	-1.151664498	NA	NA
222 | LMTK3	0.025666212	0.879223878	NA	NA
223 | NRXN1	0.025666212	-1.155990484	NA	NA
224 | TMEM127	0.025666212	0.776671103	NA	NA
225 | YY2	0.025666212	-0.790033094	NA	NA
226 | HSF4	0.025666212	0.848922808	NA	NA
227 | SLC22A9	0.027571262	0.601907638	NA	NA
228 | GUCA2A	0.027976658	-1.233709192	NA	NA
229 | KHDRBS2	0.029545414	-0.729337219	NA	NA
230 | CDK7	0.029822126	0.717408259	0.503319039	-0.003201871
231 | CEACAM8	0.029822126	0.92764653	0.151872392	0.486575438
232 | CYP17A1	0.029822126	-0.517859844	NA	NA
233 | ECI1	0.029822126	0.997321563	0.045711209	0.577918021
234 | DMXL1	0.029822126	0.942172393	0.001944503	0.944059703
235 | FOXC1	0.029822126	-0.876245738	0.690511445	-0.162206933
236 | GSTZ1	0.029822126	0.839064803	0.109548295	0.55781303
237 | KCNQ1	0.029822126	1.241307002	0.437562438	-0.505915781
238 | MTRR	0.029822126	-0.772622893	0.893790743	0.00029816
239 | MYBPC1	0.029822126	-0.741461098	0.109548295	-0.709638265
240 | RP9	0.029822126	-0.53249663	0.44283318	-0.064119047
241 | STAT5B	0.029822126	-0.838160572	0.611412419	-0.05131994
242 | THRSP	0.029822126	-1.070123136	NA	NA
243 | ARHGEF7	0.029822126	-0.83108183	0.067832625	-0.699966957
244 | MTMR6	0.029822126	-0.597017158	0.20508663	-0.7680694
245 | CDC42BPB	0.029822126	0.632109389	0.247091129	0.968694876
246 | PJA2	0.029822126	1.76674202	0.029822126	3.631251948
247 | RCOR1	0.029822126	0.524906095	0.05232352	0.65783498
248 | CBLC	0.029822126	0.720283994	0.054750046	0.907683978
249 | ARGLU1	0.029822126	-1.061546865	0.768886795	-0.231311495
250 | NAXD	0.029822126	-0.799851541	0.437101706	-0.466816789
251 | PDXP	0.029822126	-1.118598413	0.122517932	-0.585684256
252 | FIGNL1	0.029822126	-0.799517155	0.574292637	0.276193984
253 | PCYOX1L	0.029822126	0.922707856	0.097642444	1.720411254
254 | DBNDD1	0.029822126	0.853383409	NA	NA
255 | TCTN1	0.029822126	-0.513004375	NA	NA
256 | ITIH5	0.029822126	-1.166672347	0.00797954	-1.086591356
257 | SSH2	0.029822126	-1.123319274	0.38650761	-0.806467388
258 | FOPNL	0.029822126	0.66878713	NA	NA
259 | TMEM30B	0.029822126	-0.026183708	NA	NA
260 | LEMD2	0.029822126	-0.814490056	0.029822126	-0.831200423
261 | CA11	0.029822126	1.012893105	0.832944833	-0.029635882
262 | HTR2A	0.029822126	-0.781494287	NA	NA
263 | KCNA6	0.029822126	-0.765476197	NA	NA
264 | SAMD14	0.029822126	-0.973650526	NA	NA
265 | TNF	0.029822126	-0.914800779	NA	NA
266 | SCN4A	0.029822126	-0.834432686	NA	NA
267 | CCNB3	0.029822126	-1.024624824	NA	NA
268 | EXTL1	0.029822126	-0.476942125	NA	NA
269 | BARX1	0.029822126	-0.564863949	NA	NA
270 | SPP2	0.029834637	1.224432494	0.690511445	0.156005773
271 | GAGE2E	0.029834637	1.583140873	0.851719324	0.287340998
272 | ACTL8	0.031049234	0.947267059	0.114285714	1.139548513
273 | PPEF2	0.031639945	0.350530754	NA	NA
274 | CLRN3	0.032694975	-0.786013419	NA	NA
275 | ARVCF	0.034513009	-0.28878153	0.076839798	-0.471929965
276 | ATP5B	0.034513009	1.283139251	0.768886795	-0.103179028
277 | ADGRB1	0.034513009	0.551521379	NA	NA
278 | CACNB1	0.034513009	-1.132236922	0.768886795	-0.561325178
279 | GRB7	0.034513009	0.820494179	0.059675788	0.718004563
280 | KIF3C	0.034513009	-0.516679957	0.151872392	-0.590733048
281 | MAP3K1	0.034513009	0.648220135	0.098617585	0.876780763
282 | MYO7B	0.034513009	-1.489369136	NA	NA
283 | NDUFB3	0.034513009	0.942576391	0.37602023	-0.065719394
284 | PRKCA	0.034513009	-0.555980508	0.067832625	-0.983469881
285 | SCG5	0.034513009	-1.084599746	NA	NA
286 | PABPN1	0.034513009	0.795814607	NA	NA
287 | NEMF	0.034513009	-0.048227574	0.168348749	0.121192656
288 | CNOT8	0.034513009	1.08673033	0.574292637	0.142202779
289 | FAM13A	0.034513009	-0.675793845	NA	NA
290 | PAXIP1	0.034513009	0.662050916	0.0220045	1.192916982
291 | GTPBP4	0.034513009	0.669593891	0.469593677	0.163213392
292 | MYEF2	0.034513009	-1.325128799	0.186070034	-1.831217261
293 | HDGFL3	0.034513009	-0.988133695	0.136614041	-0.784387515
294 | BORCS6	0.034513009	-0.596910571	0.978697383	0.150097775
295 | RFWD3	0.034513009	1.097444934	NA	NA
296 | EAPP	0.034513009	0.33467958	0.574292637	0.069426313
297 | ECHDC1	0.034513009	-0.825820557	0.097642444	-0.557956764
298 | RBM25	0.034513009	0.700942626	0.294537238	0.578090304
299 | ENGASE	0.034513009	-1.272550498	0.37602023	-0.695045321
300 | THSD4	0.034513009	-0.680615312	0.649510605	0.08296643
301 | WDR61	0.034513009	-1.156926971	0.688522201	-0.290526632
302 | L3MBTL3	0.034513009	-0.851380686	0.035827949	-0.914739299
303 | MYOM3	0.034513009	-0.764256686	0.168348749	-0.709368019
304 | SPIN4	0.034513009	-1.080150253	NA	NA
305 | ATP6V0D2	0.034513009	0.99958212	NA	NA
306 | INAFM2	0.034513009	-0.717788811	NA	NA
307 | NAALAD2	0.034513009	-0.82289103	NA	NA
308 | ADCY2	0.034513009	-0.977049108	NA	NA
309 | BRINP1	0.034513009	-0.816822634	NA	NA
310 | DCUN1D2	0.034513009	-0.647168739	NA	NA
311 | DHRS4L2	0.034513009	1.449805344	NA	NA
312 | C20ORF26	0.034513009	-1.189640274	NA	NA
313 | ADCY5	0.039795012	-0.651252415	NA	NA
314 | SERPINH1	0.039795012	-1.21097985	0.405904275	-0.941129078
315 | CSNK1A1	0.039795012	0.776147709	0.122517932	0.596504981
316 | CSNK1G3	0.039795012	0.864514937	0.93616176	0.129762887
317 | DLD	0.039795012	0.806401832	0.270123647	0.275045981
318 | EFEMP1	0.039795012	-0.832409223	0.247091129	-0.480885831
319 | GPD1	0.039795012	-0.78445361	0.109548295	-0.694408228
320 | HSPA1L	0.039795012	-0.894905096	0.649510605	0.131199462
321 | ITPR2	0.039795012	-0.692725971	0.122517932	-0.816765168
322 | MPZ	0.039795012	-2.752270396	0.039795012	-0.994593646
323 | MTHFD1	0.039795012	0.624434563	0.151872392	0.634283566
324 | MYBPC2	0.039795012	-0.871399419	0.034513009	-0.870032612
325 | NPM1	0.039795012	0.565764768	0.186070034	0.485280263
326 | PLAT	0.039795012	0.685971318	0.20508663	0.353659204
327 | POLB	0.039795012	0.818134497	0.320321905	0.316010598
328 | RAF1	0.039795012	1.111704393	0.168348749	1.454881036
329 | RDH5	0.039795012	-0.914158262	NA	NA
330 | S100A9	0.039795012	0.625982826	0.045711209	0.268649605
331 | S100A12	0.039795012	0.67518107	0.025666212	0.897236487
332 | HLTF	0.039795012	0.763470346	0.045711209	0.866193423
333 | TP63	0.039795012	0.683873932	0.20508663	0.676795971
334 | CYP7B1	0.039795012	-0.664634301	0.003684403	-1.249145944
335 | SLC25A17	0.039795012	-0.948418917	0.722342673	0.090684908
336 | SLU7	0.039795012	1.077910187	0.086761533	0.552436048
337 | WASF3	0.039795012	-0.936502889	0.93616176	-0.019477191
338 | SNRNP27	0.039795012	-0.561702735	0.086761533	-0.554640111
339 | ITGB1BP2	0.039795012	-1.04415137	0.948729289	-0.077492877
340 | TMCO6	0.039795012	1.195371806	0.463403263	0.15696866
341 | MYDGF	0.039795012	-0.57973669	0.649510605	0.025193216
342 | TMX4	0.039795012	-0.773290876	0.086761533	-0.839913139
343 | WDR19	0.039795012	-0.094038305	0.109548295	-0.382523145
344 | PLEKHA2	0.039795012	0.927808852	0.067832625	0.444131475
345 | FKBP10	0.039795012	-0.805704253	0.469593677	-0.744502273
346 | KCTD14	0.039795012	-1.128630873	0.810041728	-1.257278315
347 | SAT2	0.039795012	-0.720497196	0.405904275	-0.747623943
348 | DTD2	0.039795012	1.530944369	0.076839798	0.52092256
349 | ERP27	0.039795012	-0.661313449	NA	NA
350 | CHCHD4	0.039795012	0.357638682	0.320321905	0.169422049
351 | GRPEL2	0.039795012	0.72070874	0.270123647	0.560033871
352 | NRK	0.039795012	-1.003083126	0.004096213	-2.198015889
353 | RFLNB	0.039795012	-0.893534297	0.088293857	0.286211365
354 | CCDC9B	0.039795012	-0.675971046	0.649510605	-0.659561902
355 | HACD4	0.039795012	-0.600886689	NA	NA
356 | CDK11A	0.039795012	1.037339358	0.20508663	0.432442214
357 | RORC	0.039795012	-0.554863717	NA	NA
358 | SH3RF2	0.039795012	0.968825306	0.072024691	0.949101285
359 | KIFC2	0.039795012	0.535560156	NA	NA
360 | LRP3	0.039795012	-0.987289465	NA	NA
361 | TRPM3	0.039795012	-0.799898414	NA	NA
362 | ULK2	0.039795012	-2.233954345	NA	NA
363 | SMIM15	0.039795012	0.547421946	NA	NA
364 | MXI1	0.039795012	0.94461409	NA	NA
365 | EYA1	0.039795012	-0.781897056	NA	NA
366 | SEC16B	0.039795012	-0.738029433	NA	NA
367 | LONRF3	0.039795012	-1.090501791	NA	NA
368 | KCNH5	0.040597074	0.685617795	NA	NA
369 | KLHL33	0.041220825	-0.550658021	NA	NA
370 | HMGCLL1	0.04133803	-0.518453521	NA	NA
371 | XAGE2	0.045295054	-1.565172561	NA	NA
372 | AARS	0.045711209	0.553615181	0.122517932	0.5500708
373 | ART3	0.045711209	-1.225786998	0.315151515	-1.0834237
374 | CAPN1	0.045711209	0.646412481	0.086761533	0.631288119
375 | RCC1	0.045711209	0.77600977	0.93616176	0.204211254
376 | CYP51A1	0.045711209	0.362820339	0.93616176	0.068043127
377 | PHC2	0.045711209	-1.630758311	0.225413534	-0.1250244
378 | GM2A	0.045711209	0.909252838	0.978697383	0.239526145
379 | HNRNPAB	0.045711209	0.656769193	0.37602023	-0.169481376
380 | ITGAE	0.045711209	-0.537837254	0.810041728	0.308307694
381 | KRT13	0.045711209	0.750261117	0.018784685	1.039614316
382 | PITX1	0.045711209	0.956409679	0.015971771	1.589687565
383 | PPP3CA	0.045711209	-0.74625322	0.045711209	-0.630252974
384 | PSMC2	0.045711209	1.31747828	0.122517932	0.856237559
385 | PTX3	0.045711209	-0.910625409	0.810041728	0.288983993
386 | RBP2	0.045711209	0.731318626	NA	NA
387 | S100A8	0.045711209	0.530925996	0.006628079	0.57114929
388 | SLC7A2	0.045711209	-0.739340279	NA	NA
389 | SPTBN1	0.045711209	-0.889868051	0.247091129	-0.846156642
390 | STYX	0.045711209	0.734460792	0.125541126	0.089115416
391 | TCF7	0.045711209	-0.760099593	1	-0.339344957
392 | UBE2G2	0.045711209	-0.632388776	0.649510605	0.021765304
393 | ZNF136	0.045711209	-0.719672989	NA	NA
394 | SLC43A1	0.045711209	-1.869062941	NA	NA
395 | ENC1	0.045711209	-1.178077807	NA	NA
396 | BTRC	0.045711209	0.16220394	NA	NA
397 | USP13	0.045711209	-0.601228072	0.05232352	-0.660709659
398 | IRF9	0.045711209	1.043068914	0.039795012	1.148263075
399 | TUBGCP3	0.045711209	-0.429019743	0.034513009	-0.808442861
400 | WDHD1	0.045711209	0.611711185	0.086761533	0.637415529
401 | RHOBTB3	0.045711209	-1.37356497	NA	NA
402 | TBC1D9B	0.045711209	0.835208726	0.347492645	0.185417989
403 | N4BP3	0.045711209	1.158699803	0.538245486	-0.096491108
404 | SDF2L1	0.045711209	-0.850359915	0.574292637	0.084765797
405 | CCDC9	0.045711209	0.66807693	0.247091129	0.630753022
406 | GOLGA7	0.045711209	0.588777365	0.136614041	0.426756319
407 | FAM8A1	0.045711209	-0.534656325	0.197808115	-0.836992917
408 | SH3TC1	0.045711209	-0.744389602	0.768886795	-0.246788132
409 | RBM28	0.045711209	0.724368622	0.893790743	0.274292152
410 | RSAD1	0.045711209	-0.771828283	0.60952381	0.412231478
411 | MIS18BP1	0.045711209	-0.747909753	0.335664336	-0.155844504
412 | CENPN	0.045711209	0.673167011	0.778865579	0.285349943
413 | CTTNBP2NL	0.045711209	0.626355776	0.076839798	0.617989143
414 | CLTRN	0.045711209	-0.851537416	NA	NA
415 | SRR	0.045711209	-0.691950029	0.059234883	-0.834253563
416 | ZNF106	0.045711209	-0.903889697	0.846998719	0.101090761
417 | RUFY1	0.045711209	0.932995631	0.005479684	0.759156984
418 | CCDC115	0.045711209	1.116927725	0.039795012	0.779625771
419 | SCRN2	0.045711209	-0.73511872	0.186070034	-0.602675344
420 | ANKRD40	0.045711209	-1.153013624	0.070238585	-0.669341506
421 | CXorf40A	0.045711209	-0.675008159	0.755050505	-0.543254465
422 | RMI2	0.045711209	0.682104735	0.688911089	-0.407829423
423 | ZNF526	0.045711209	0.807707675	1	-0.073834295
424 | UBLCP1	0.045711209	0.719950306	0.097642444	0.30488639
425 | MOSPD2	0.045711209	-0.553994524	0.688522201	-0.137724547
426 | PLCXD3	0.045711209	-0.595259729	NA	NA
427 | MAST4	0.045711209	1.089172734	0.076839798	0.654830622
428 | PHLDB3	0.045711209	0.896317211	0.029822126	0.909743033
429 | ARMCX4	0.045711209	-0.75530203	NA	NA
430 | CELSR3	0.045711209	0.539737052	NA	NA
431 | OMG	0.045711209	-0.695696171	NA	NA
432 | GRIN2C	0.045711209	-0.835381614	NA	NA
433 | FNTB	0.045711209	0.540012678	NA	NA
434 | CCZ1B	0.045711209	-0.777086647	NA	NA
435 | KCNH7	0.045711209	-3.244430944	NA	NA
436 | B4GALT6	0.045711209	-1.263078613	NA	NA
437 | FBXO24	0.045711209	0.746589638	NA	NA
438 | LRRC43	0.045711209	-0.765784453	NA	NA
439 | NXPE1	0.045711209	-0.686872578	NA	NA
440 | UGT2B4	0.045965915	-0.901677493	NA	NA
441 | HS3ST4	0.046682742	-0.698299805	0.177777778	1.10616322
442 | LRRC14B	0.047064403	-0.460242353	NA	NA
443 | CPA2	0.049771984	-0.788299252	NA	NA
444 | GABBR2	0.050169199	-0.809696673	NA	NA
445 | SIX3	0.050928434	-0.506402257	NA	NA
446 | CACNA1S	0.05232352	-0.672505139	0.768886795	-0.060026461
447 | CAPN6	0.05232352	-0.893177722	1	-0.318557076
448 | CAPZA1	0.05232352	0.255055831	0.086761533	0.479885627
449 | EPHA1	0.05232352	0.607019081	0.768886795	0.124069886
450 | ESRRG	0.05232352	-0.90337727	NA	NA
451 | FKBP3	0.05232352	-0.869191896	0.136614041	-1.00777756
452 | BLOC1S1	0.05232352	-0.563624694	0.405904275	-0.225353741
453 | GTF2A1	0.05232352	0.553110076	0.978697383	-0.089194076
454 | HPN	0.05232352	-0.948812655	NA	NA
455 | IDH1	0.05232352	1.221109848	0.097642444	1.061969094
456 | LSAMP	0.05232352	-0.782624715	0.005620386	-0.899563797
457 | NARS	0.05232352	0.472389924	0.097642444	0.367744916
458 | NPR1	0.05232352	-0.613934096	0.133333333	1.227104972
459 | SERPINA5	0.05232352	-1.031962163	0.151872392	-1.115540894
460 | PROX1	0.05232352	-0.154908799	0.202020202	0.129473866
461 | PSMC6	0.05232352	0.692472155	0.039795012	0.831791831
462 | SCO1	0.05232352	-1.015603881	0.018784685	-0.958615937
463 | SOD3	0.05232352	-0.844217815	0.109548295	-0.571521752
464 | TFAP2C	0.05232352	0.436639236	0.347492645	0.213036183
465 | TPD52	0.05232352	0.362377927	0.168348749	0.158733081
466 | UCP1	0.05232352	-0.568136764	NA	NA
467 | CTNNAL1	0.05232352	-0.710408163	0.086761533	-0.398881335
468 | MYOM2	0.05232352	-1.117877346	0.05232352	-0.807723081
469 | STOML1	0.05232352	-0.90166518	0.380952381	-0.971166133
470 | VPS9D1	0.05232352	1.226804618	0.016161616	1.446408809
471 | ZSCAN12	0.05232352	-0.557567714	NA	NA
472 | KPTN	0.05232352	0.767002241	0.202967311	0.583699445
473 | DDX42	0.05232352	-0.909405413	0.347492645	-0.271502699
474 | HABP4	0.05232352	-1.453223776	0.574292637	-0.987195417
475 | RFTN1	0.05232352	-0.750816346	0.015971771	-0.994821355
476 | FBXO46	0.05232352	0.855604981	NA	NA
477 | PYGO1	0.05232352	-0.607265547	NA	NA
478 | ARHGEF16	0.05232352	0.865790882	0.186070034	0.869048136
479 | PACSIN1	0.05232352	-1.095115699	0.097642444	-0.924780627
480 | DCTN4	0.05232352	0.882414074	0.649510605	0.571080565
481 | PIGG	0.05232352	-0.133347609	0.247091129	-0.64990835
482 | CHTF8	0.05232352	0.942639755	0.775624376	0.138198302
483 | WDR70	0.05232352	-0.879132826	0.122517932	-1.310253867
484 | RNLS	0.05232352	-0.582210045	0.343434343	0.304644102
485 | ISY1	0.05232352	0.481851941	0.688522201	-0.098566832
486 | RELCH	0.05232352	0.630986424	0.437101706	0.618175604
487 | MRPS9	0.05232352	0.904047243	0.538245486	0.322843993
488 | SOWAHC	0.05232352	0.861338825	0.018784685	1.273956604
489 | TNIP2	0.05232352	-0.518296309	0.611412419	-0.28349582
490 | PAAF1	0.05232352	-0.946705377	0.503319039	-0.337209398
491 | HYI	0.05232352	-0.372390134	0.225413534	-0.52736661
492 | ARMC10	0.05232352	1.651834221	0.574292637	0.856225525
493 | SLC9A7	0.05232352	-0.740705146	0.051218939	-0.60583178
494 | MYO18B	0.05232352	-0.857535041	0.538245486	-0.405578441
495 | UNK	0.05232352	-0.542720909	0.347492645	-0.209853396
496 | PXYLP1	0.05232352	-0.775144494	NA	NA
497 | PPP1R14A	0.05232352	-0.581823181	0.136614041	-0.527614232
498 | VSTM2L	0.05232352	-0.915128205	NA	NA
499 | C22orf39	0.05232352	-0.487653761	NA	NA
500 | MTPN	0.05232352	0.533682717	0.270123647	0.485724883
501 | 


--------------------------------------------------------------------------------
/tests/testthat/test_enrichmentAnalysis.r:
--------------------------------------------------------------------------------
 1 | context("Test the enrichmentAnalysis function")
 2 | 
 3 | 
 4 | test_that('Overlap Found by enrichmentAnalysis is correct', {
 5 |     res <- enrichmentAnalysis(ea_genelist, ea_gmt, ea_background)
 6 |     res2 <- enrichmentAnalysis(ea_genelist2, ea_gmt, ea_background2)
 7 | 
 8 |     expect_true(setequal(res[[1, 'overlap']], c('BLM')))
 9 |     expect_true(setequal(res[[2, 'overlap']], c('HERC2', 'SP100', 'BLM')))
10 |     expect_true(setequal(res2[[3, 'overlap']], c('HERC2')))
11 |     expect_true(setequal(res[[4, 'overlap']], NULL))
12 | })
13 | 


--------------------------------------------------------------------------------
/tests/testthat/test_export_CSV.r:
--------------------------------------------------------------------------------
 1 | context("Export of results as CSV file")
 2 | 
 3 | test_that("CSV file structure is expected for single evidence", {
 4 |       
 5 |       res = ActivePathways(dat[, 1, drop = FALSE], gmt)
 6 |       CSV_fname = "res.csv"
 7 |       suppressWarnings(file.remove(CSV_fname))
 8 |     
 9 |       export_as_CSV(res, CSV_fname)
10 |       res_from_CSV = read.csv(CSV_fname, stringsAsFactors = FALSE)
11 |       suppressWarnings(file.remove(CSV_fname))
12 |     
13 |       expect_equal(colnames(res_from_CSV), colnames(res))
14 |       expect_equal(res_from_CSV$term_id, res$term_id)
15 | 
16 | })
17 | 
18 | 
19 | test_that("CSV file structure is expected for multiple evidence", {
20 | 
21 |       res = ActivePathways(dat, gmt)
22 |       CSV_fname = "res.csv"
23 |       suppressWarnings(file.remove(CSV_fname))
24 |     
25 |       export_as_CSV(res, CSV_fname)
26 |       res_from_CSV = read.csv(CSV_fname, stringsAsFactors = FALSE)
27 |       suppressWarnings(file.remove(CSV_fname))
28 |     
29 |       expect_equal(colnames(res_from_CSV), colnames(res))
30 |       expect_equal(res_from_CSV$term_id, res$term_id)
31 | 
32 | })
33 | 
34 | 
35 | test_that("CSV file is exported when there are NULL entries", {
36 |       
37 |       res = ActivePathways(scores=scores_test, gmt=gmt_reac, significant=1)
38 |       CSV_fname = "res.csv"
39 |       suppressWarnings(file.remove(CSV_fname))
40 |       
41 |       export_as_CSV(res, CSV_fname)
42 |       res_from_CSV = read.csv(CSV_fname, stringsAsFactors = FALSE)
43 |       suppressWarnings(file.remove(CSV_fname))
44 |       
45 |       # convert res overlap column values to a string type
46 |       res$overlap <- sapply(res$overlap, function(x) paste(x, collapse = "|"))
47 |       expect_equal(res_from_CSV$overlap, res$overlap)
48 | 
49 | })
50 | 


--------------------------------------------------------------------------------
/tests/testthat/test_merge_p_values.r:
--------------------------------------------------------------------------------
  1 | context("merge_p_values function")
  2 | 
  3 | test_list <- list(a=0.01, b=0.06, c=0.8, d=0.0001, e=0, f=1)
  4 | test_matrix <- matrix(c(0.01, 0.06, 0.08, 0.0001, 0, 1), ncol=2)
  5 | 
  6 | comparison_list <- test_list
  7 | comparison_list[[5]] = 1e-300
  8 | 
  9 | test_matrix = matrix(unlist(test_list), ncol = 2)
 10 | comparison_matrix = matrix(unlist(comparison_list), ncol = 2)
 11 | 
 12 | test_that("scores is a numeric matrix or list with valid p-values", {
 13 |    
 14 |    expect_error(merge_p_values(unlist(test_list), "Fisher"), NA)
 15 |    expect_error(merge_p_values(test_list, "Brown"), 
 16 |                 "Brown's, DPM, Strube's, and Strube_directional methods cannot be used with a single list of p-values")
 17 |    expect_error(merge_p_values(unlist(test_list), "Brown"), 
 18 |                 "Brown's, DPM, Strube's, and Strube_directional methods cannot be used with a single list of p-values")
 19 |    
 20 |    
 21 |    
 22 |    test_list[[1]] <- -0.1
 23 |    expect_error(merge_p_values(test_list), 'All values in scores must be in [0,1]', fixed=TRUE)
 24 |    test_list[[1]] <- 1.1
 25 |    expect_error(merge_p_values(test_list), 'All values in scores must be in [0,1]', fixed=TRUE)
 26 |    test_list[[1]] <- NA
 27 |    expect_error(merge_p_values(test_list), 'scores cannot contain missing values, we recommend replacing NA with 1 or removing')
 28 |    test_list[[1]] <- 'c'
 29 |    expect_error(merge_p_values(test_list), 'scores must be numeric')
 30 |    
 31 |    
 32 |    test_matrix[1, 1] <- NA
 33 |    expect_error(merge_p_values(test_matrix), 'scores cannot contain missing values, we recommend replacing NA with 1 or removing')
 34 |    test_matrix[1, 1] <- -0.1
 35 |    expect_error(merge_p_values(test_matrix), "All values in scores must be in [0,1]", fixed=TRUE)
 36 |    test_matrix[1, 1] <- 1.1
 37 |    expect_error(merge_p_values(test_matrix), "All values in scores must be in [0,1]", fixed=TRUE)
 38 |    test_matrix[1, 1] <- 'a'
 39 |    expect_error(merge_p_values(test_matrix), 'scores must be numeric')
 40 |    
 41 |    
 42 | })
 43 | 
 44 | 
 45 | test_direction_vector <- c(1,-1)
 46 | 
 47 | 
 48 | test_that("Merged p-values are correct", {
 49 |    
 50 |    this_tolerance = 1e-7
 51 |    answer1 = c(1.481551e-05, 4.167534e-299, 9.785148e-01)
 52 |    answer2 = c(2.52747e-05, 0.00000e+00, 9.73873e-01)
 53 |    answer3 = 7.147579e-296
 54 |    
 55 |    expect_equal(merge_p_values(test_matrix, "Fisher"), answer1, tolerance = this_tolerance)
 56 |    
 57 |    expect_equal(merge_p_values(test_matrix, "Brown"), answer2, tolerance = this_tolerance)
 58 |    
 59 |    expect_equal(merge_p_values(test_matrix[, 1, drop=FALSE], "Fisher"), test_matrix[, 1, drop=TRUE])
 60 |    expect_equal(merge_p_values(test_matrix[, 1, drop=FALSE], "Brown"), test_matrix[, 1, drop=TRUE])
 61 |    
 62 |    expect_equal(merge_p_values(test_list, "Fisher"), answer3, tolerance = this_tolerance)
 63 |    
 64 |    test_pval_vector <- c(0.05,0.01)
 65 |    test_direction_vector <- c(1,-1)
 66 |    constraints_vector1 <- c(-1,1)
 67 |    constraints_vector2 <- c(1,-1)
 68 |    
 69 |    expect_equal(merge_p_values(test_pval_vector,"Fisher_directional",test_direction_vector,constraints_vector1),
 70 |                 merge_p_values(test_pval_vector,"Fisher_directional",test_direction_vector,constraints_vector2))
 71 |    
 72 |    inflated_pvals <- c(1, 1e-400)
 73 |    threshold_pvals <- c(1, 1e-300)
 74 |    expect_equal(merge_p_values(inflated_pvals), merge_p_values(threshold_pvals))
 75 |    
 76 |    inflated_pval_matrix = matrix(c(1, 1, 1e-320, 1e-310), ncol = 2)
 77 |    threshold_pval_matrix = matrix(c(1, 1, 1e-300, 1e-300), ncol = 2)
 78 |    expect_equal(merge_p_values(inflated_pval_matrix), merge_p_values(threshold_pval_matrix))
 79 |    
 80 | })
 81 | 
 82 | 
 83 | test_matrix <- matrix(c(0.01, 0.06, 0.08, 0.0001, 0, 1), ncol=2)
 84 | test_direction_matrix <- matrix(c(1,-1,1,-1,-1,1), ncol=2)
 85 | constraints_vector <- c(1,1)
 86 | 
 87 | colnames(test_matrix) <- c("RNA", "Protein") 
 88 | rownames(test_matrix) <- c("TP53", "CHRNA1","PTEN")
 89 | 
 90 | colnames(test_direction_matrix) <- colnames(test_matrix)
 91 | rownames(test_direction_matrix) <- rownames(test_matrix)
 92 | 
 93 | test_that("scores_direction and constraints_vector are valid", {
 94 |    
 95 |    expect_error(merge_p_values(test_matrix, "Fisher_directional",test_direction_matrix),'Both scores_direction and constraints_vector must be provided')
 96 |    expect_error(merge_p_values(test_matrix, "Fisher_directional",constraints_vector = constraints_vector),'Both scores_direction and constraints_vector must be provided')
 97 |    expect_error(merge_p_values(test_matrix, "Fisher_directional", test_direction_matrix, c(1,"b")), 'constraints_vector must be a numeric vector')
 98 |    expect_error(merge_p_values(test_matrix, "Fisher_directional", test_direction_matrix, c(1,0)), "scores_direction entries must be set to 0's for columns that do not contain directional information")
 99 |    expect_error(merge_p_values(test_matrix, "Fisher_directional", test_direction_matrix, c(1,5)), "constraints_vector must contain the values: 1, -1 or 0")
100 |    
101 |    test_dir <- as.vector(test_direction_matrix)
102 |    expect_error(merge_p_values(test_matrix, "Fisher_directional", test_dir, c(1,1)), 'scores and scores_direction must be the same data type')
103 |    
104 |    test_dir <- test_direction_matrix
105 |    test_dir[1,1] <- NA
106 |    expect_error(merge_p_values(test_matrix, "Fisher_directional", test_dir, c(1,1)), 'scores_direction cannot contain missing values, we recommend replacing NA with 0 or removing')
107 |    
108 |    test_dir <- test_direction_matrix
109 |    test_dir[1,1] <- 'a'
110 |    expect_error(merge_p_values(test_matrix, "Fisher_directional", test_dir, c(1,1)), 'scores_direction must be numeric')
111 |    
112 |    test_dir <- test_direction_matrix
113 |    colnames(test_dir) <- NULL
114 |    expect_error(merge_p_values(test_matrix, "Fisher_directional", test_dir, c(1,1)), 'column names must be provided to scores and scores_direction')
115 |    
116 |    test_m <- test_matrix[1:2,]
117 |    expect_error(merge_p_values(test_m, "Fisher_directional", test_direction_matrix, c(1,1)), 'scores and scores_direction must have the same number of rows')
118 |    
119 |    test_m <- test_matrix
120 |    rownames(test_m) <- c("TP53", "GENE2", "GENE3")
121 |    expect_error(merge_p_values(test_m, "Fisher_directional", test_direction_matrix, c(1,1)), 'scores_direction gene names must match scores genes')
122 |    
123 |    test_m <- test_matrix
124 |    rownames(test_m) <- c("CHRNA1","TP53","PTEN")
125 |    expect_error(merge_p_values(test_m, "Fisher_directional", test_direction_matrix, c(1,1)), 'scores genes should be in the same order as scores_direction genes')
126 |    
127 |    test_m <- test_matrix
128 |    colnames(test_m) <- c("RNA","Mutation")
129 |    expect_error(merge_p_values(test_m, "Fisher_directional", test_direction_matrix, c(1,1)),
130 |                 'scores_direction column names must match scores column names')
131 |    
132 |    expect_error(merge_p_values(test_matrix, "Fisher_directional", test_direction_matrix, c(1,1,-1)),
133 |                 'constraints_vector should have the same number of entries as columns in scores_direction')
134 |    
135 |    names(constraints_vector) <- c("Protein","RNA")
136 |    expect_error(merge_p_values(test_matrix, "Fisher_directional", test_direction_matrix, constraints_vector),
137 |                 'the constraints_vector entries should match the order of scores and scores_direction columns')
138 | })
139 | 
140 | 
141 | 
142 | test_that("P-value merging methods are correct", {
143 |    expect_error(merge_p_values(c(0.05,0.10), "Fisher", c(1,1), c(1,1)),
144 |                 'Only DPM, Fisher_directional, Stouffer_directional, and Strube_directional methods support directional integration')
145 |    
146 |    expect_error(merge_p_values(c(0.05,0.10), "Tippett"),
147 |                 'Only Fisher, Brown, Stouffer and Strube methods are currently supported for non-directional analysis. 
148 |              And only DPM, Fisher_directional, Stouffer_directional, and Strube_directional are supported for directional analysis')
149 |    
150 |    expect_error(merge_p_values(c(0.05,0.10), "Fisher_directional"),
151 |                 'scores_direction and constraints_vector must be provided for directional analyses')
152 |    
153 | })
154 | 


--------------------------------------------------------------------------------
/tests/testthat/test_orderedHypergeometric.r:
--------------------------------------------------------------------------------
 1 | context("Ordered Hypergeometric Statistical Test")
 2 | 
 3 | 
 4 | test_that('hypergeometric gives the same results as fisher.test', {
 5 |     counts <- matrix(c(0,0,0,0), nrow=2)
 6 |     expect_equal(hypergeometric(counts), fisher.test(counts, alternative='greater')$p.value)
 7 | 
 8 |     counts <- matrix(c(0, 5, 16, 1683), nrow=2)
 9 |     expect_equal(hypergeometric(counts), fisher.test(counts, alternative='greater')$p.value)
10 | 
11 |     counts <- matrix(c(2, 2, 2, 2), nrow=2)
12 |     expect_equal(hypergeometric(counts), fisher.test(counts, alternative='greater')$p.value)
13 | })
14 | 
15 | 
16 | test_that('orderedHypergeometric returns the lowest p-value and correct index', {
17 |     genelist <- c('HERC2', 'SP100', 'BLM')
18 |     background <- c('PHC2', 'BLM', 'XPC', 'SMC3', 'HERC2', 'SP100')
19 |     annotations <- c('HERC2', 'PHC2', 'BLM')
20 | 
21 |     # Reference implementation: p-value for one prefix of the ranked gene list,
22 |     # from the 2x2 table of (in prefix / not in prefix) x (annotated / not annotated).
23 |     get_pvalue <- function(genes) {
24 |         complement <- setdiff(background, genes)
25 |         genelist1 <- sum(genes %in% annotations)
26 |         genelist0 <- length(genes) - genelist1
27 |         complement1 <- sum(complement %in% annotations)
28 |         complement0 <- length(complement) - complement1
29 |         counts <- matrix(c(genelist1, genelist0, complement1, complement0), 2)
30 |         hypergeometric(counts)
31 |     }
32 |     # Evaluate every prefix genelist[1:i] and keep the first minimum.
33 |     p_values <- sapply(seq_along(genelist), function(i) get_pvalue(genelist[1:i]))
34 | 
35 |     smallest_value <- min(p_values)
36 |     smallest_index <- match(smallest_value, p_values)
37 |     expected <- list(p_val=smallest_value, ind=smallest_index)
38 | 
39 |     expect_equal(orderedHypergeometric(genelist, background, annotations), expected)
40 | })
41 | 


--------------------------------------------------------------------------------
/tests/testthat/test_return.r:
--------------------------------------------------------------------------------
 1 | context("Format of the returned object and output files")
 2 | 
 3 | 
 4 | test_that("Column names of data.table are correct", {
 5 |     expect_equal(colnames(run_ap_short(dat)),
 6 |                  c('term_id', 'term_name', 'adjusted_p_val', 'term_size', 'overlap'))
 7 |     expect_equal(colnames(run_ap_short_contribution(dat)),
 8 |                  c('term_id', 'term_name', 'adjusted_p_val', 'term_size', 'overlap', 
 9 |                  'evidence', 'Genes_cds', 'Genes_promoter', 'Genes_enhancer'))
10 | })
11 | 
12 | 
13 | test_that("All results or only significant ones are returned", {
14 |     expect_true(all(ActivePathways(dat, gmt, cytoscape_file_tag = NA)$adjusted_p_val <= 0.05))
15 |     expect_equal(nrow(ActivePathways(dat, gmt, cytoscape_file_tag = NA, significant = 1)), length(gmt))
16 | })
17 | 
18 | 
19 | test_that("No significant results are found", {
20 |     expect_warning(res1 <- ActivePathways(dat, gmt, cytoscape_file_tag = NA, significant = 0),
21 |                    "No significant terms were found", fixed = TRUE)
22 |     expect_equal(nrow(res1), NULL)
23 | })
24 | 
25 | 
26 | 


--------------------------------------------------------------------------------
/tests/testthat/test_validation.r:
--------------------------------------------------------------------------------
  1 | context("Validation on input to ActivePathways")
  2 | 
  3 | 
  4 | test_that("scores is a numeric matrix with valid p-values", {
  5 |   dat2 <- dat
  6 |   dat2[1, 1] <- 'a'
  7 |   expect_error(run_ap_short(dat2), 'scores must be a numeric matrix')
  8 |   
  9 |   dat2 <- dat
 10 |   dat2[1, 1] <- NA
 11 |   expect_error(run_ap_short(dat2), 'scores cannot contain missing values, we recommend replacing NA with 1 or removing')
 12 |   
 13 |   dat2[1, 1] <- -0.1
 14 |   expect_error(run_ap_short(dat2), "All values in scores must be in [0,1]", fixed=TRUE)
 15 |   
 16 |   dat2[1, 1] <- 1.1
 17 |   expect_error(run_ap_short(dat2), "All values in scores must be in [0,1]", fixed=TRUE)
 18 |   
 19 |   dat2[1, 1] <- 1
 20 |   expect_error(run_ap_short(dat2), NA)
 21 |   
 22 |   dat2[1, 1] <- 0
 23 |   expect_error(run_ap_short(dat2), NA)
 24 | })
 25 | 
 26 | test_that("scores_direction and constraints_vector have valid input",{
 27 |   
 28 |   expect_error(run_ap(scores_test, direction_test,NULL),'Both scores_direction and constraints_vector must be provided')
 29 |   expect_error(run_ap(scores_test, NULL,constraints_vector = constraints_vector_test),'Both scores_direction and constraints_vector must be provided')
 30 |   
 31 |   constraints_vector <- c('a','b')
 32 |   expect_error(run_ap(scores_test,direction_test,constraints_vector), 'constraints_vector must be a numeric vector')
 33 |   expect_error(run_ap(scores_test, direction_test, c(1,0)), "scores_direction entries must be set to 0's for columns that do not contain directional information")
 34 |   
 35 |   dir_test <- direction_test
 36 |   dir_test[1,1] <- NA
 37 |   expect_error(run_ap(scores_test,dir_test,constraints_vector_test), 'scores_direction cannot contain missing values, we recommend replacing NA with 0 or removing')
 38 |   
 39 |   dir_test <- direction_test
 40 |   dir_test[1,1] <- 'a'
 41 |   expect_error(run_ap(scores_test,dir_test,constraints_vector_test), 'scores_direction must be a numeric matrix')
 42 |   
 43 |   dir_test <- direction_test[1:3,]
 44 |   expect_error(run_ap(scores_test,dir_test,constraints_vector_test), 'scores and scores_direction must have the same number of rows')
 45 |   
 46 |   dir_test <- direction_test
 47 |   rownames(dir_test) <- seq_len(nrow(direction_test))
 48 |   expect_error(run_ap(scores_test,dir_test,constraints_vector_test), 'scores_direction gene names must match scores genes')
 49 |   
 50 |   dir_test <- direction_test
 51 |   rownames(dir_test) <- rev(rownames(direction_test))
 52 |   expect_error(run_ap(scores_test,dir_test,constraints_vector_test), 'scores genes should be in the same order as scores_direction genes')
 53 |   
 54 |   dir_test <- direction_test
 55 |   colnames(dir_test) <- NULL
 56 |   expect_error(run_ap(scores_test,dir_test,constraints_vector_test), 'column names must be provided to scores and scores_direction')
 57 |   
 58 |   constraints_vector <- c(1,1,-1)
 59 |   expect_error(run_ap(scores_test,direction_test,constraints_vector), 
 60 |                'constraints_vector should have the same number of entries as columns in scores_direction')
 61 |   
 62 |   constraints_vector <- c(1,-1)
 63 |   names(constraints_vector) <- c("protein","rna")
 64 |   expect_error(run_ap(scores_test,direction_test,constraints_vector), 
 65 |                'the constraints_vector entries should match the order of scores and scores_direction columns')
 66 |   
 67 |   dir_test <- direction_test
 68 |   colnames(dir_test) <- c("rna","Mutation")
 69 |   expect_error(run_ap(scores_test,dir_test,constraints_vector_test), 
 70 |                'scores_direction column names must match scores column names')
 71 |   
 72 |   constraints_vector <- c(1,0)
 73 |   expect_error(run_ap(scores_test, direction_test, constraints_vector), "scores_direction entries must be set to 0's for columns that do not contain directional information")
 74 | })
 75 | 
 76 | test_that("significant is valid", {
 77 |   expect_error(ActivePathways(dat, gmt, significant=-0.1),
 78 |                "significant must be a value in [0,1]", fixed=TRUE)
 79 |   expect_error(ActivePathways(dat, gmt, significant = 1.1),
 80 |                "significant must be a value in [0,1]", fixed=TRUE)
 81 |   expect_error(ActivePathways(dat, gmt, significant=NULL),
 82 |                "length(significant) == 1 is not TRUE", fixed=TRUE)
 83 |   expect_error(ActivePathways(dat, gmt, significant=c(1,2)),
 84 |                "length(significant) == 1 is not TRUE", fixed=TRUE)
 85 |   expect_error(ActivePathways(dat, gmt, significant='qwe'),
 86 |                "is.numeric(significant) is not TRUE", fixed=TRUE)
 87 |   expect_warning(ActivePathways(dat, gmt, significant = 0),
 88 |                  "No significant terms were found")
 89 |   expect_error(ActivePathways(dat, gmt, significant = 1), NA)
 90 | })
 91 | 
 92 | 
 93 | test_that("cutoff is valid", {
 94 |   expect_error(ActivePathways(dat, gmt, cutoff=-0.1),
 95 |                "cutoff must be a value in [0,1]", fixed=TRUE)
 96 |   expect_error(ActivePathways(dat, gmt, cutoff = 1.1),
 97 |                "cutoff must be a value in [0,1]", fixed=TRUE)
 98 |   expect_error(ActivePathways(dat, gmt, cutoff=NULL),
 99 |                "length(cutoff) == 1 is not TRUE", fixed=TRUE)
100 |   expect_error(ActivePathways(dat, gmt, cutoff=c(1,2)),
101 |                "length(cutoff) == 1 is not TRUE", fixed=TRUE)
102 |   expect_error(ActivePathways(dat, gmt, cutoff='qwe'),
103 |                "is.numeric(cutoff) is not TRUE", fixed=TRUE)
104 |   expect_error(ActivePathways(dat, gmt, cutoff=0),
105 |                "No genes made the cutoff", fixed=TRUE)
106 |   expect_error(ActivePathways(dat, gmt, cutoff=1), NA)
107 | })
108 | 
109 | 
110 | test_that("background is a character vector", {
111 |   error_msg <- "background must be a character vector"
112 |   expect_error(ActivePathways(dat, gmt, background=c(1,5,2)), error_msg)
113 |   expect_error(ActivePathways(dat, gmt, background=matrix(c('a', 'b', 'c', 'd'), 2)), error_msg)
114 | })
115 | 
116 | 
117 | test_that("genes not found in background are removed", {
118 |   expect_message(ActivePathways(dat, gmt, background=rownames(dat)[-(1:10)], significant=1, cutoff=1),
119 |                  "10 rows were removed from scores because they are not found in the background")
120 |   expect_error(ActivePathways(dat, gmt, background='qwerty'),
121 |                "scores does not contain any genes in the background")
122 | })
123 | 
124 | test_that("geneset_filter is a numeric vector of length 2", {
125 |   expect_error(ActivePathways(dat, gmt, geneset_filter=1), 
126 |                "geneset_filter must be length 2")
127 |   expect_error(ActivePathways(dat, gmt, geneset_filter=list(1,2)),
128 |                "geneset_filter must be a numeric vector")
129 |   expect_error(ActivePathways(dat, gmt, geneset_filter=c('q', 2)),
130 |                "geneset_filter must be a numeric vector")
131 |   expect_error(ActivePathways(dat, gmt, geneset_filter=c(1, -2)),
132 |                "geneset_filter limits must be positive")
133 |   expect_error(ActivePathways(dat, gmt, geneset_filter=c(0, 0)), 
134 |                "No pathways in gmt made the geneset_filter", fixed=TRUE)
135 |   expect_message(ActivePathways(dat, gmt, geneset_filter=c(NA, 10)),
136 |                  "[0-9]+ terms were removed from gmt because they did not make the geneset_filter")
137 |   expect_error(ActivePathways(dat, gmt, geneset_filter=c(0, NA)), NA)
138 |   expect_error(ActivePathways(dat, gmt, geneset_filter=NULL), NA)
139 | }) 
140 | 
141 | test_that("custom colors is a character vector that is equal in length to the number of columns in scores",{
142 |   expect_error(ActivePathways(scores = dat, gmt = gmt, custom_colors = list("red","blue", "green")),
143 |                "colors must be provided as a character vector",fixed = TRUE)  
144 |   expect_error(ActivePathways(scores = dat, gmt = gmt, custom_colors = c("red","blue")),
145 |                "incorrect number of colors is provided",fixed = TRUE)
146 |   
147 |   incorrect_color_names <- c("red","blue", "green")
148 |   names(incorrect_color_names) <- c("promoter","lds", "enhancer")
149 |   expect_error(ActivePathways(scores = dat, gmt = gmt, custom_colors = incorrect_color_names),
150 |                "names() of the custom colors vector should match the scores column names",fixed = TRUE) 
151 | })
152 | 
153 | test_that("color palette is from the RColorBrewer package",{
154 |   expect_error(ActivePathways(scores = dat, gmt = gmt, color_palette = "flamingo"),
155 |                "palette must be from the RColorBrewer package",fixed = TRUE)
156 | })
157 | 
158 | test_that("color palette and custom colors parameters are never specified together",{
159 |   expect_error(ActivePathways(scores = dat, gmt = gmt, color_palette = "Pastel1", custom_colors = c("red","blue", "green")),
160 |                "Both custom_colors and color_palette are provided. Specify only one of these parameters for node coloring.",fixed = TRUE)
161 | })
162 | 
163 | test_that("color_integrated_only is a character vector of length 1",{
164 |   expect_error(ActivePathways(scores = dat, gmt = gmt, color_integrated_only = list(1,2,3)),
165 |                "color must be provided as a character vector",fixed = TRUE)
166 |   expect_error(ActivePathways(scores = dat, gmt = gmt, color_integrated_only = c("red","blue")),
167 |                "only a single color must be specified",fixed = TRUE)
168 | })
169 | 


--------------------------------------------------------------------------------
/vignettes/CreateEnrichmentMapDialogue_V2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/reimandlab/ActivePathways/2cd1931cfc750e96282533bbd0928b5273f5035f/vignettes/CreateEnrichmentMapDialogue_V2.png


--------------------------------------------------------------------------------
/vignettes/ImportStep_V2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/reimandlab/ActivePathways/2cd1931cfc750e96282533bbd0928b5273f5035f/vignettes/ImportStep_V2.png


--------------------------------------------------------------------------------
/vignettes/LegendView.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/reimandlab/ActivePathways/2cd1931cfc750e96282533bbd0928b5273f5035f/vignettes/LegendView.png


--------------------------------------------------------------------------------
/vignettes/LegendView_Custom.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/reimandlab/ActivePathways/2cd1931cfc750e96282533bbd0928b5273f5035f/vignettes/LegendView_Custom.png


--------------------------------------------------------------------------------
/vignettes/LegendView_RColorBrewer.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/reimandlab/ActivePathways/2cd1931cfc750e96282533bbd0928b5273f5035f/vignettes/LegendView_RColorBrewer.png


--------------------------------------------------------------------------------
/vignettes/NetworkStep1_V2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/reimandlab/ActivePathways/2cd1931cfc750e96282533bbd0928b5273f5035f/vignettes/NetworkStep1_V2.png


--------------------------------------------------------------------------------
/vignettes/NetworkStep2_V2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/reimandlab/ActivePathways/2cd1931cfc750e96282533bbd0928b5273f5035f/vignettes/NetworkStep2_V2.png


--------------------------------------------------------------------------------
/vignettes/PropertiesDropDown2_V2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/reimandlab/ActivePathways/2cd1931cfc750e96282533bbd0928b5273f5035f/vignettes/PropertiesDropDown2_V2.png


--------------------------------------------------------------------------------
/vignettes/StylePanel_V2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/reimandlab/ActivePathways/2cd1931cfc750e96282533bbd0928b5273f5035f/vignettes/StylePanel_V2.png


--------------------------------------------------------------------------------
/vignettes/border_line_type.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/reimandlab/ActivePathways/2cd1931cfc750e96282533bbd0928b5273f5035f/vignettes/border_line_type.jpg


--------------------------------------------------------------------------------
/vignettes/legend.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/reimandlab/ActivePathways/2cd1931cfc750e96282533bbd0928b5273f5035f/vignettes/legend.png


--------------------------------------------------------------------------------
/vignettes/lineplot_tutorial.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/reimandlab/ActivePathways/2cd1931cfc750e96282533bbd0928b5273f5035f/vignettes/lineplot_tutorial.png


--------------------------------------------------------------------------------
/vignettes/new_map.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/reimandlab/ActivePathways/2cd1931cfc750e96282533bbd0928b5273f5035f/vignettes/new_map.png


--------------------------------------------------------------------------------
/vignettes/set_aesthetic.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/reimandlab/ActivePathways/2cd1931cfc750e96282533bbd0928b5273f5035f/vignettes/set_aesthetic.jpg


--------------------------------------------------------------------------------