├── .Rbuildignore ├── .github └── issue_template.md ├── .gitignore ├── DESCRIPTION ├── Makefile ├── NAMESPACE ├── NEWS ├── NEWS.md ├── R ├── 00-AllClasses.R ├── AllGenerics.R ├── DOSE-package.R ├── accessor.R ├── build_Anno.R ├── clusterSim.R ├── doSim.R ├── enrichDGN.R ├── enrichDGNv.R ├── enrichDO.R ├── enrichDisease.R ├── enrichNCG.R ├── enricher_internal.R ├── geneSim.R ├── gseAnalyzer.R ├── gsea.R ├── gsfilter.R ├── mclusterSim.R ├── parse_ratio.R ├── print.R ├── setReadable.R ├── simplot.R ├── utilities.R └── zzz.R ├── README.Rmd ├── README.md ├── data ├── DGN_EXTID2PATHID.rda ├── DGN_PATHID2EXTID.rda ├── DGN_PATHID2NAME.rda ├── NCG_EXTID2PATHID.rda ├── NCG_PATHID2EXTID.rda ├── NCG_PATHID2NAME.rda ├── VDGN_EXTID2PATHID.rda ├── VDGN_PATHID2EXTID.rda ├── VDGN_PATHID2NAME.rda └── geneList.rda ├── inst ├── CITATION └── extdata │ ├── build_DGN_Anno.R │ ├── build_NCG_Anno.R │ └── preparing.geneList.R ├── man ├── DOSE-package.Rd ├── DataSet.Rd ├── EXTID2NAME.Rd ├── GSEA_internal.Rd ├── clusterSim.Rd ├── compareClusterResult-class.Rd ├── computeIC.Rd ├── doseSim.Rd ├── enrichDGN.Rd ├── enrichDGNv.Rd ├── enrichDO.Rd ├── enrichNCG.Rd ├── enrichResult-class.Rd ├── enricher_internal.Rd ├── gene2DO.Rd ├── geneID.Rd ├── geneInCategory.Rd ├── geneSim.Rd ├── gseDGN.Rd ├── gseDO.Rd ├── gseNCG.Rd ├── gseaResult-class.Rd ├── gsfilter.Rd ├── mclusterSim.Rd ├── parse_ratio.Rd ├── reexports.Rd ├── setReadable.Rd ├── show-methods.Rd ├── simplot.Rd ├── summary-methods.Rd └── theme_dose.Rd ├── tests ├── testthat.R └── testthat │ ├── test-doSim.R │ └── test-geneSim.R └── vignettes └── DOSE.Rmd /.Rbuildignore: -------------------------------------------------------------------------------- 1 | inst/extdata/buildAnnoData.R 2 | inst/extdata/preparing.geneList.R 3 | inst/extdatao 4 | .svnignore 5 | ^.*\.DS_Store 6 | Makefile 7 | README.Rmd 8 | .travis.yml 9 | appveyor.yml 10 | docs 11 | mkdocs 12 | .github 13 | gh-pages 14 | 
-------------------------------------------------------------------------------- /.github/issue_template.md: -------------------------------------------------------------------------------- 1 | ### Prerequisites 2 | 3 | + [ ] Have you read [Feedback](https://guangchuangyu.github.io/dose/#feedback) and followed the [guide](https://guangchuangyu.github.io/2016/07/how-to-bug-author/)? 4 | * [ ] make sure you are using the latest release version 5 | * [ ] read the [documents](https://guangchuangyu.github.io/dose/documentation/) 6 | * [ ] google your question/issue 7 | 8 | ### Describe your issue 9 | 10 | * [ ] Make a reproducible example (*e.g.* [1](https://gist.github.com/talonsensei/e1fad082657054207f249ec98f0920eb)) 11 | * [ ] your code should contain comments to describe the problem (*e.g.* what was expected and what actually happened?) 12 | 13 | 14 | ### Ask in the right place 15 | 16 | * [ ] for bugs or feature requests, post here (GitHub issue) 17 | * [ ] for questions, please post to [Bioconductor](https://support.bioconductor.org/) or [Biostars](https://www.biostars.org/) with tag `DOSE` 18 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .DS_Store 2 | vignettes/.DS_Store 3 | .svn 4 | *~ 5 | .Rhistory 6 | docs/__pycache__ 7 | docs/__init__.py 8 | __init__.pyc 9 | HPO 10 | gh-pages 11 | -------------------------------------------------------------------------------- /DESCRIPTION: -------------------------------------------------------------------------------- 1 | Package: DOSE 2 | Type: Package 3 | Title: Disease Ontology Semantic and Enrichment analysis 4 | Version: 4.3.0 5 | Authors@R: c( person(given = "Guangchuang", family = "Yu", email = "guangchuangyu@gmail.com", role = c("aut", "cre")), 6 | person(given = "Li-Gen", family = "Wang", email = "reeganwang020@gmail.com", role = "ctb"), 7 | person(given = "Vladislav", family = "Petyuk", email = 
"petyuk@gmail.com", role = "ctb"), 8 | person(given = "Giovanni", family = "Dall'Olio", email = "giovanni.dallolio@upf.edu", role = "ctb")) 9 | Maintainer: Guangchuang Yu 10 | Description: This package implements five methods proposed by 11 | Resnik, Schlicker, Jiang, Lin and Wang respectively 12 | for measuring semantic similarities among DO terms and 13 | gene products. Enrichment analyses including hypergeometric 14 | model and gene set enrichment analysis are also implemented 15 | for discovering disease associations of high-throughput 16 | biological data. 17 | Depends: 18 | R (>= 3.5.0) 19 | Imports: 20 | AnnotationDbi, 21 | BiocParallel, 22 | fgsea, 23 | ggplot2, 24 | GOSemSim (>= 2.31.2), 25 | methods, 26 | qvalue, 27 | reshape2, 28 | stats, 29 | utils, 30 | yulab.utils (>= 0.1.6) 31 | Suggests: 32 | prettydoc, 33 | clusterProfiler, 34 | gson (>= 0.0.5), 35 | knitr, 36 | memoise, 37 | org.Hs.eg.db, 38 | rmarkdown, 39 | testthat 40 | VignetteBuilder: knitr 41 | ByteCompile: true 42 | License: Artistic-2.0 43 | Encoding: UTF-8 44 | URL: https://yulab-smu.top/contribution-knowledge-mining/ 45 | BugReports: https://github.com/GuangchuangYu/DOSE/issues 46 | Packaged: 2011-12-28 08:16:14 UTC; root 47 | biocViews: Annotation, Visualization, MultipleComparison, GeneSetEnrichment, 48 | Pathways, Software 49 | RoxygenNote: 7.3.2 50 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | PKGNAME := $(shell sed -n "s/Package: *\([^ ]*\)/\1/p" DESCRIPTION) 2 | PKGVERS := $(shell sed -n "s/Version: *\([^ ]*\)/\1/p" DESCRIPTION) 3 | PKGSRC := $(shell basename `pwd`) 4 | BIOCVER := RELEASE_3_21 5 | 6 | all: rd check clean 7 | 8 | alldocs: rd readme 9 | 10 | rd: 11 | Rscript -e 'roxygen2::roxygenise(".")' 12 | 13 | readme: 14 | Rscript -e 'rmarkdown::render("README.Rmd", encoding="UTF-8")' 15 | 16 | build: 17 | #cd ..;\ 18 | #R CMD build $(PKGSRC) 19 | 
Rscript -e 'devtools::build()' 20 | 21 | build2: 22 | cd ..;\ 23 | R CMD build --no-build-vignettes $(PKGSRC) 24 | 25 | install: 26 | cd ..;\ 27 | R CMD INSTALL $(PKGNAME)_$(PKGVERS).tar.gz 28 | 29 | check: 30 | ##cd ..;\ 31 | ## Rscript -e 'rcmdcheck::rcmdcheck("$(PKGNAME)_$(PKGVERS).tar.gz")' 32 | Rscript -e 'devtools::check()' 33 | 34 | check2: build 35 | cd ..;\ 36 | R CMD check $(PKGNAME)_$(PKGVERS).tar.gz 37 | 38 | bioccheck: 39 | cd ..;\ 40 | Rscript -e 'BiocCheck::BiocCheck("$(PKGNAME)_$(PKGVERS).tar.gz")' 41 | 42 | clean: 43 | cd ..;\ 44 | $(RM) -r $(PKGNAME).Rcheck/ 45 | 46 | gitmaintain: 47 | git gc --auto;\ 48 | git prune -v;\ 49 | git fsck --full 50 | 51 | update: 52 | git fetch --all;\ 53 | git checkout devel;\ 54 | git merge upstream/devel;\ 55 | git merge origin/devel 56 | 57 | 58 | push: 59 | git push upstream devel;\ 60 | git push origin devel 61 | 62 | rmrelease: 63 | git branch -D $(BIOCVER) 64 | 65 | release: 66 | git checkout $(BIOCVER);\ 67 | git fetch --all 68 | 69 | biocinit: 70 | git remote add upstream git@git.bioconductor.org:packages/$(PKGNAME).git;\ 71 | git fetch --all 72 | -------------------------------------------------------------------------------- /NAMESPACE: -------------------------------------------------------------------------------- 1 | # Generated by roxygen2: do not edit by hand 2 | 3 | S3method("$",enrichResult) 4 | S3method("$",gseaResult) 5 | S3method("[",enrichResult) 6 | S3method("[",gseaResult) 7 | S3method("[[",enrichResult) 8 | S3method("[[",gseaResult) 9 | S3method(as.data.frame,enrichResult) 10 | S3method(as.data.frame,gseaResult) 11 | S3method(dim,enrichResult) 12 | S3method(dim,gseaResult) 13 | S3method(geneID,compareClusterResult) 14 | S3method(geneID,enrichResult) 15 | S3method(geneID,gseaResult) 16 | S3method(geneInCategory,compareClusterResult) 17 | S3method(geneInCategory,enrichResult) 18 | S3method(geneInCategory,gseaResult) 19 | S3method(head,enrichResult) 20 | S3method(head,gseaResult) 21 | 
S3method(tail,enrichResult) 22 | S3method(tail,gseaResult) 23 | export(EXTID2NAME) 24 | export(clusterSim) 25 | export(doSim) 26 | export(doseSim) 27 | export(enrichDGN) 28 | export(enrichDGNv) 29 | export(enrichDO) 30 | export(enrichNCG) 31 | export(facet_grid) 32 | export(gene2DO) 33 | export(geneID) 34 | export(geneInCategory) 35 | export(geneSim) 36 | export(gseDGN) 37 | export(gseDO) 38 | export(gseNCG) 39 | export(gsfilter) 40 | export(mclusterSim) 41 | export(parse_ratio) 42 | export(setReadable) 43 | export(simplot) 44 | export(theme_dose) 45 | exportClasses(compareClusterResult) 46 | exportClasses(enrichResult) 47 | exportClasses(gseaResult) 48 | exportMethods(show) 49 | exportMethods(summary) 50 | importClassesFrom(GOSemSim,GOSemSimDATA) 51 | importClassesFrom(methods,data.frame) 52 | importFrom(BiocParallel,MulticoreParam) 53 | importFrom(BiocParallel,bplapply) 54 | importFrom(BiocParallel,multicoreWorkers) 55 | importFrom(GOSemSim,combineScores) 56 | importFrom(GOSemSim,load_OrgDb) 57 | importFrom(GOSemSim,termSim) 58 | importFrom(fgsea,fgsea) 59 | importFrom(ggplot2,aes) 60 | importFrom(ggplot2,element_blank) 61 | importFrom(ggplot2,element_text) 62 | importFrom(ggplot2,facet_grid) 63 | importFrom(ggplot2,geom_text) 64 | importFrom(ggplot2,geom_tile) 65 | importFrom(ggplot2,ggplot) 66 | importFrom(ggplot2,margin) 67 | importFrom(ggplot2,scale_fill_gradient) 68 | importFrom(ggplot2,scale_x_discrete) 69 | importFrom(ggplot2,scale_y_discrete) 70 | importFrom(ggplot2,theme) 71 | importFrom(ggplot2,theme_bw) 72 | importFrom(ggplot2,xlab) 73 | importFrom(ggplot2,ylab) 74 | importFrom(methods,is) 75 | importFrom(methods,new) 76 | importFrom(methods,show) 77 | importFrom(qvalue,qvalue) 78 | importFrom(reshape2,melt) 79 | importFrom(stats,p.adjust) 80 | importFrom(stats,phyper) 81 | importFrom(stats,setNames) 82 | importFrom(utils,head) 83 | importFrom(utils,setTxtProgressBar) 84 | importFrom(utils,str) 85 | importFrom(utils,tail) 86 | 
importFrom(utils,txtProgressBar) 87 | importFrom(yulab.utils,yulab_msg) 88 | importMethodsFrom(AnnotationDbi,columns) 89 | importMethodsFrom(AnnotationDbi,exists) 90 | importMethodsFrom(AnnotationDbi,get) 91 | importMethodsFrom(AnnotationDbi,keys) 92 | importMethodsFrom(AnnotationDbi,keytypes) 93 | importMethodsFrom(AnnotationDbi,select) 94 | importMethodsFrom(AnnotationDbi,toTable) 95 | -------------------------------------------------------------------------------- /NEWS: -------------------------------------------------------------------------------- 1 | CHANGES IN VERSION 3.5.2 2 | ------------------------ 3 | o bug fixed of gseaScores <2018-04-18, Wed> 4 | + https://github.com/GuangchuangYu/DOSE/issues/23 5 | o mv web site to https://guangchuangyu.github.io/software/DOSE 6 | 7 | CHANGES IN VERSION 3.3.2 8 | ------------------------ 9 | o new project site using blogdown <2017-09-28, Thu> 10 | o ridgeplot for gseaResult <2017-08-17, Thu> 11 | 12 | CHANGES IN VERSION 3.3.1 13 | ------------------------ 14 | o throw warning instead of error when fail to `setReadable`. 
<2017-06-28, Wed> 15 | + https://github.com/GuangchuangYu/clusterProfiler/issues/91 16 | o better msg when using wrong ID types in GSEA <2017-06-01, Thu> 17 | + https://support.bioconductor.org/p/96512/#96516 18 | 19 | CHANGES IN VERSION 3.2.0 20 | ------------------------ 21 | o BioC 3.5 release <2017-04-26, Wed> 22 | 23 | CHANGES IN VERSION 3.1.3 24 | ------------------------ 25 | o output expected sample gene ID when input gene ID type not match <2017-03-27, Mon> 26 | o dotplot for gseaResult <2016-11-23, Fri> 27 | + https://github.com/GuangchuangYu/DOSE/issues/20 28 | 29 | CHANGES IN VERSION 3.1.2 30 | ------------------------ 31 | o in gseaplot, call grid.newpage only if dev.interactive() <2016-11-16, Wed> 32 | o change minGSSize < geneSet_size & geneSet_size < maxGSSize to 33 | minGSSize <= geneSet_size & geneSet_size <= maxGSSize <2016-11-16, Wed> 34 | o fixed show method issus of unknown setType in clusterProfiler::GSEA output <2016-11-15, Tue> 35 | o throw more friendly error msg if fail to determine setType automatically in setReadable function <2016-11-15, Tue> 36 | + https://support.bioconductor.org/p/89445/#89479 37 | o apply minGSSize and maxGSSize to fgsea <2016-11-14, Mon> 38 | o change summary to as.data.frame in internal calls to prevent warning message <2016-11-14, Mon> 39 | 40 | CHANGES IN VERSION 3.1.1 41 | ------------------------ 42 | o update startup message <2016-11-09, Wed> 43 | o fixed parallel in Windows (not supported) <2016-10-24, Mon> 44 | + https://github.com/GuangchuangYu/DOSE/issues/16 45 | o options(DOSE_workers = x) to set using x cores for GSEA analysis is removed. <2016-10-24, Mon> 46 | instead let MulticoreParam() to decide how many cores (can be set by `options(mc.cores = x)`). 
47 | 48 | CHANGES IN VERSION 3.0.0 49 | ------------------------ 50 | o BioC 3.4 released <2016-10-18, Tue> 51 | 52 | CHANGES IN VERSION 2.11.12 53 | ------------------------ 54 | o deprecate summary method and use as.data.frame instead <2016-09-29, Thu> 55 | 56 | CHANGES IN VERSION 2.11.11 57 | ------------------------ 58 | o export geneID and geneInCategory <2016-09-19, Mon> 59 | 60 | CHANGES IN VERSION 2.11.10 61 | ------------------------ 62 | o geneID and geneInCategory accessor functions <2016-09-19, Mon> 63 | 64 | CHANGES IN VERSION 2.11.9 65 | ------------------------ 66 | o enrichMap works with result contains only one term <2016-08-15, Wed> 67 | + https://github.com/GuangchuangYu/DOSE/issues/15 68 | o build_Anno now works with tibble <2016-08-18, Tue> 69 | 70 | CHANGES IN VERSION 2.11.8 71 | ------------------------ 72 | o add unit test <2016-08-15, Mon> 73 | 74 | CHANGES IN VERSION 2.11.7 75 | ------------------------ 76 | o change for meshes packages <2016-08-11, Thu> 77 | 78 | CHANGES IN VERSION 2.11.6 79 | ------------------------ 80 | o user can use options(DOSE_workers = x) to set using x cores for GSEA analysis <2016-08-02, Tue> 81 | o support DisGeNET enrichment analyses <2016-08-01, Mon> 82 | + enrichDGN, enrichDGNv, gseDGN 83 | o update vignettes <2016-07-29, Fri> 84 | 85 | CHANGES IN VERSION 2.11.5 86 | ------------------------ 87 | o enrichMap now output igraph object that can be viewed using other software like networkD3 <2016-07-25, Mon> 88 | o dim methods for enrichResult and gseaResult <2016-07-22, Fri> 89 | o $ methods for enrichResult and gseaResult <2016-07-20, Wed> 90 | o switch from parallel to BiocParallel <2017-07-07, Thu> 91 | o [, head and tail methods for enrichResult and gseaResult <2016-07-06, Wed> 92 | o change according to GOSemSim <2016-07-05, Tue> 93 | 94 | CHANGES IN VERSION 2.11.4 95 | ------------------------ 96 | o 'title' parameter for gseaplot <2016-07-04, Mon> 97 | + contributed by https://github.com/pedrostrusso 98 
| + https://github.com/GuangchuangYu/DOSE/pull/13 99 | o 'by' parameter in GSEA_internal, by default by = 'fgsea' <2016-07-04, Mon> 100 | + by = 'fgsea', use GSEA algorithm implemented in fgsea 101 | + by = 'DOSE', use GSEA algorithm implemented in DOSE 102 | o leading edge analysis for GSEA <2016-07-04, Mon> 103 | 104 | CHANGES IN VERSION 2.11.3 105 | ------------------------ 106 | o output igraph object in cnetplot <2016-06-21, Tue> 107 | o upsetplot generics <2016-06-14, Tue> 108 | o [[ methods for enrichResult and gseaResult for accessing gene set <2016-06-14, Tue> 109 | 110 | CHANGES IN VERSION 2.11.2 111 | ------------------------ 112 | o use byte compiler <2016-05-18, Wed> 113 | o https://github.com/Bioconductor-mirror/DOSE/commit/6c508c6a6816f465bb372f30f4ab99c839d81767 114 | 115 | CHANGES IN VERSION 2.11.1 116 | ------------------------ 117 | o https://github.com/Bioconductor-mirror/DOSE/commit/7e87d01e671ce1b5fbe974c06b796b1a2970f11c 118 | 119 | CHANGES IN VERSION 2.10.0 120 | ------------------------ 121 | o BioC 3.3 released <2016-05-05, Thu> 122 | 123 | CHANGES IN VERSION 2.9.7 124 | ------------------------ 125 | o barplot accepts x and colorBy parameters as in dotplot <2016-04-13, Wed> 126 | o gsfilter function for restricting result with minimal and maximal gene set size <2016-03-31, Thu> 127 | + see https://github.com/GuangchuangYu/clusterProfiler/issues/46 128 | 129 | CHANGES IN VERSION 2.9.6 130 | ------------------------ 131 | o add maxGSSize parameter for hypergeometric test <2016-03-09, Wed> 132 | o add maxGSSize parameter in GSEA analysis. <2016-03-09, Wed> 133 | + Default of 500 is fairly standard for GSEA analysis. 134 | + Usually if the geneset > 500, its probability of being called significant by GSEA rises quite dramatically. 
135 | o fixed R check <2016-03-05, Sat> 136 | o update ReactomePA citation info <2016-02-17, Wed> 137 | 138 | CHANGES IN VERSION 2.9.5 139 | ------------------------ 140 | o upset plot for enrichResult object <2016-01-22, Fri> 141 | 142 | CHANGES IN VERSION 2.9.4 143 | ------------------------ 144 | o bug fixed in scaling category sizes in enrichMap <2016-01-10, Sun> 145 | + use setSize in gseResult 146 | 147 | CHANGES IN VERSION 2.9.3 148 | ------------------------ 149 | o update enrichMap to scale category sizes <2016-01-04, Mon> 150 | o update 'show' methods of enrichResult and gseaResult <2015-12-29, Tue> 151 | 152 | CHANGES IN VERSION 2.9.2 153 | ------------------------ 154 | o re-design internal function <2015-12-20, Sun> 155 | 156 | CHANGES IN VERSION 2.9.1 157 | ------------------------ 158 | o GSEA: test bimodal separately <2015-10-28, Wed> 159 | o add NES column in GSEA result <2015-10-28, Wed> 160 | o use NES instead of ES in calculating p-values. <2015-10-28, Wed> 161 | o duplicated gene IDs in enrich.internal is not allow. add `unique` to remove duplicated ID. <2015-10-20, Tue> 162 | + see https://github.com/GuangchuangYu/clusterProfiler/issues/27 163 | 164 | CHANGES IN VERSION 2.9.0 165 | ------------------------ 166 | o BioC 3.3 branch 167 | 168 | CHANGES IN VERSION 2.7.12 169 | ------------------------ 170 | o output a list of data frames by enrichMap via invisible, so that the graph info can be viewed by other tools, eg. Cytoscape. 
<2015-09-30, Wed> 171 | 172 | CHANGES IN VERSION 2.7.11 173 | ------------------------ 174 | o check whether input geneList is sorted <2015-09-22, Tue> 175 | o order first followed by showCategory subsetting in barplot <2015-09-08, Tue> 176 | + https://support.bioconductor.org/p/71917/#71939 177 | o bug fixed in EXTID2NAME, keytype is TAIR for arabidopsis and ORF for malaria <2015-08-26, Wed> 178 | 179 | CHANGES IN VERSION 2.7.10 180 | ------------------------ 181 | o add Giovanni Dall'Olio as contributor in author list. <2015-07-21, Tue> 182 | o update NCG data to version cancergenes_4.9.0_20150720 contributed by Giovanni Dall'Olio. 183 | https://github.com/GuangchuangYu/DOSE/pull/8 <2015-07-21, Tue> 184 | o geneInCategory may simplify to vector by R in very rare case, 185 | which violate the assumption of list in S4 validation checking. 186 | This issue was fixed. <2015-07-19, Sun> 187 | 188 | CHANGES IN VERSION 2.7.9 189 | ------------------------ 190 | o add citation of ChIPseeker <2015-07-09, Thu> 191 | o add 'Disease analysis of NGS data' section in vignette <2015-06-29, Mon> 192 | o convert vignette from Rnw to Rmd <2015-06-24, Wed> 193 | 194 | CHANGES IN VERSION 2.7.8 195 | ------------------------ 196 | o dotplot for enrichResult <2015-06-21, Sun> 197 | 198 | CHANGES IN VERSION 2.7.7 199 | ------------------------ 200 | o speed up by using sample.int instead of sample <2015-06-04, Thu> 201 | o add seed = FALSE to control reproduciblility of gsea function. 
202 | to make result reproducible, explicitly set seed=TRUE <2015-06-04, Thu> 203 | - contributed by Vlad, see https://github.com/GuangchuangYu/DOSE/pull/5/ 204 | - modified by Guangchuang 205 | 206 | CHANGES IN VERSION 2.7.6 207 | ------------------------ 208 | o bug fixed in cnetplot legend contributed by Vladislav Petyuk <2015-05-28, Thu> 209 | 210 | CHANGES IN VERSION 2.7.5 211 | ------------------------ 212 | o update vignette <2015-05-15, Fri> 213 | 214 | CHANGES IN VERSION 2.7.4 215 | ------------------------ 216 | o update permutation test with pvalue = (K+1)/(N+1) instead of K/N, 217 | so that p value will never be 0 <2015-05-12, Tue> 218 | 219 | CHANGES IN VERSION 2.7.3 220 | ------------------------ 221 | o add setType slot in gseaResult <2015-05-15, Tue> 222 | o add universe and geneSets slots in enrichResult <2015-05-05, Tue> 223 | 224 | CHANGES IN VERSION 2.7.2 225 | ------------------------ 226 | o add vertex.label.font parameter in enrichMap <2015-04-27, Mon> 227 | 228 | CHANGES IN VERSION 2.7.1 229 | ------------------------ 230 | o update NCG description in enrichNCG <2015-04-22, Wed> 231 | 232 | CHANGES IN VERSION 2.5.12 233 | ------------------------ 234 | o report NA in qvalue column when it fail to calculate instead of throw error <2015-03-09, Thu> 235 | o update IC data <2015-03-08, Wed> 236 | 237 | CHANGES IN VERSION 2.5.11 238 | ------------------------ 239 | o implement clusterSim and mclusterSim <2015-03-07, Tue> 240 | 241 | CHANGES IN VERSION 2.5.10 242 | ------------------------ 243 | o implement enrichNCG and now gseAnalyzer accept setType = "NCG" <2015-04-01, Wed> 244 | see http://ncg.kcl.ac.uk 245 | o change use_internal_data parameter to ... 
in all S3 generics and methods <2015-04-01, Wed> 246 | 247 | CHANGES IN VERSION 2.5.9 248 | ------------------------ 249 | o extend gseAnalyzer to support use_internal_data parameter <2015-03-29, Sun> 250 | 251 | CHANGES IN VERSION 2.5.8 252 | ------------------------ 253 | o keep order of barplot, sorted by pvalue by defalt <2015-02-11, Wed> 254 | 255 | CHANGES IN VERSION 2.5.7 256 | ------------------------ 257 | o fixed typo in vignette <2015-02-10, Tue> 258 | 259 | CHANGES IN VERSION 2.5.6 260 | ------------------------ 261 | o add organismMapper to satisfy the change of enrichKEGG <2015-01-28, Wed> 262 | 263 | CHANGES IN VERSION 2.5.5 264 | ------------------------ 265 | o introduce use_internal_data parameter for enrichKEGG of clusterProfiler <2015-01-27, Tue> 266 | 267 | CHANGES IN VERSION 2.5.3 268 | ------------------------ 269 | o update vignette <2015-01-27, Tue> 270 | 271 | CHANGES IN VERSION 2.5.1 272 | ------------------------ 273 | o add CITATION <2015-01-19, Mon> 274 | 275 | CHANGES IN VERSION 2.3.6 276 | ------------------------ 277 | o add readable parameter in simplot <2014-10-09, Thu> 278 | 279 | CHANGES IN VERSION 2.3.5 280 | ------------------------ 281 | o implement enrichMap <2014-07-28, Wed> 282 | 283 | CHANGES IN VERSION 2.3.4 284 | ------------------------ 285 | o return ggplot2 objects in gseaplot <2014-07-21, Mon> 286 | 287 | CHANGES IN VERSION 2.3.3 288 | ------------------------ 289 | o geneSim can only accept one gene ID vector and perform as mgeneSim in GOSemSim <2014-06-23, Mon> 290 | o update man files <2014-06-23, Mon> 291 | 292 | CHANGES IN VERSION 2.3.2 293 | ------------------------ 294 | o bug fixed in scaleNodeColor <2014-06-08, Sun> 295 | 296 | CHANGES IN VERSION 2.3.1 297 | ------------------------ 298 | o upgrade man file according to roxygen2 (ver 4.0.0) . <2014-05-16, Fri> 299 | 300 | CHANGES IN VERSION 1.99.6 301 | ------------------------ 302 | o bug fixed in EXTID2NAME. 
<2013-09-28, Sat> 303 | 304 | CHANGES IN VERSION 1.99.5 305 | ------------------------ 306 | o fixed in calculating M when only one categroy presented, the object was matrix insted of list. <2013-09-16, Mon> 307 | 308 | CHANGES IN VERSION 1.99.4 309 | ------------------------ 310 | o export gsea function <2013-07-10, Wed> 311 | 312 | CHANGES IN VERSION 1.99.3 313 | ------------------------ 314 | o extend EXTID2NAME to support 20 species <2013-07-09, Tue> 315 | o update vignette. <2013-07-09, Tue> 316 | 317 | CHANGES IN VERSION 1.99.1 318 | ------------------------ 319 | o convert vignette file to knitr Sweave. <2013-06-24, Mon> 320 | 321 | CHANGES IN VERSION 1.99.0 322 | ------------------------ 323 | o extent ggplot to support enrichResult by implementing fortify method. <2013-05-22, Wed> 324 | o re-implement barplot.enrichResult. <2013-05-23, Thu> 325 | o enrich.internal support user specifiy background by parameter universe. <2013-05-24, Fri> 326 | o implement Gene Set Enrichment Analysis algorithm. <2013-05-29, Wed> 327 | o change setReadable to support groupGO of clusterProfiler. <2013-05-29, Wed> 328 | o fixed mclapply not support Windows platform issue. <2013-05-30, Fri> 329 | o rename logFC parameter to foldChange. <2013-06-13, Thu> 330 | 331 | CHANGES IN VERSION 1.7.1 332 | ------------------------ 333 | o use geom_bar(stat="identity") instead of geom_bar() in barplot for explicitly mapping y value. <2013-05-08, Wed> 334 | o bug fixed when qvalue can't calculated. <2013-05-02, Thu> 335 | o bug fixed of enrich.internal, drop those genes that without annotation when calculating sample gene number. 
<2013-05-02, Thu> 336 | o change some code to satisfy ReactomePA <2013-03-27, Wed> 337 | 338 | CHANGES IN VERSION 1.5.1 339 | ------------------------ 340 | o bug fixed in enrich.internal, now return NA rather than throw error, if gene have no ontology annotation <2013-01-22, Fri> 341 | 342 | CHANGES IN VERSION 1.3.2 343 | ------------------------ 344 | o update codes of plot functions accompaning with ggplot2 (version 0.9.2) <2012-09-06, Thu> 345 | o update cnetplot corresponding to igraph version 0.6 <2012-07-11, Wed> 346 | o parameter showCategory now support term ID vector <2012-07-11, Wed> 347 | o import termSim and combineScores from GOSemSim. <2012-06,14, Thu> 348 | o optimize setReadable <2012-05-09, Wed> 349 | o bug fixed of setReadable method. For those unmapped genes, return the original gene ID. <2012-05-15, Tue> 350 | o fill color in barplot for Ontology classification. <2012-05-22, Tue> 351 | o update color scale of cnetplot <2012-06-18, Mon> 352 | 353 | CHANGES IN VERSION 1.3.1 354 | ------------------------ 355 | o update color scheme of cnetplot <2012-04-10, Tue> 356 | o add simplot for plotting semantic similarity matrix <2012-04-20, Fri> 357 | o bug fixed of combineScore function for DO semantic similarity matrix 358 | which containing NA rows of NA coloumns <2012-04-20, Fri> 359 | o export doSim and geneSim functions <2012-04-20, Fri> 360 | 361 | CHANGES IN VERSION 1.1.12 362 | ------------------------ 363 | o implement barplot for enrichResult <2012-03-18, Sun> 364 | o bug fixed for setReadable method <2012-03-19, Mon> 365 | o add logFC parameter for cnet plot, support color gene nodes by 366 | their expression value (log fold change) <2012-03-21, Wed> 367 | o add mapping entrezgene ID and gene Name for organisms 368 | other than human, mouse and yeast. <2012-03-22, Thu> 369 | o bug fixed for attempt to name logFC, when it is NULL. <2012-03-22, Thu> 370 | o optimized readable method. 
<2012-03-26, Mon> 371 | 372 | CHANGES IN VERSION 1.1.11 373 | ------------------------ 374 | o setReadable method for mapping gene ID to gene Symbol in enrichResult instance. <2012-03-12, Mon> 375 | o export method show. <2012-03-12, Mon> 376 | 377 | CHANGES IN VERSION 1.1.10 378 | ------------------------ 379 | o import plot summary from stats4, for BiocGenerics (version 0.1.10) removed them <2012-03-03, Sat> 380 | o Add DO2ALLEG and EG2ALLDO, for mapping undirecte annotation. <2012-03-03, Sat> 381 | o update vignette <2012-03-06, Tue> 382 | 383 | CHANGES IN VERSION 1.1.9 384 | ------------------------ 385 | o fixed BibTeX database file .bib. 386 | month = , must be month = someMonth, or totally deleted, 387 | leave it blank will cause texi2dvi failed. <2012-03-01, Thu> 388 | o update IC data and DO-EG mapping data. <2012-03-01, Thu> 389 | 390 | CHANGES IN VERSION 1.1.8 391 | ------------------------ 392 | o update vignette, add semantic similarity algorithms' details. <2012-02-28, Tue> 393 | 394 | CHANGES IN VERSION 1.1.7 395 | ------------------------ 396 | o fixed warnings concerning documents of plot generics. <2012-02-27, Mon> 397 | o import summary generic from BiocGenerics instead of stats4. <2012-02-27, Mon> 398 | 399 | CHANGES IN VERSION 1.1.6 400 | ------------------------ 401 | o defined S3 generic for ALLEXTID, EXTID2NAME, EXTID2TERMID, TERM2NAME, 402 | and TERMID2EXTID. <2012-02-26, Sun> 403 | o update roxygen and regenerate man file. <2012-02-26, Sun> 404 | o import S4 generics of plot from BiocGenerics. <2012-02-26, Sun> 405 | 406 | CHANGES IN VERSION 1.1.5 407 | ------------------------ 408 | o add S4 method of plot, which accept parameter type = "cnet", 409 | and call cnetplot.enrichResult method. <2012-02-23, Thu> 410 | o add S3 method cnetplot.enrichResult for plotting enrichResult object. 411 | <2012-02-23, Thu> 412 | o define cnetplot function for category-gene network visualization. 
<2012-02-23, Thu> 413 | o remove generic definition of show and summary, 414 | import show from methods and summary from stats4 <2012-02-23, Thu> 415 | o redefine functions as S3 methods for mapping ID among gene and Term. 416 | this will make enrich.internal which calling these mapping function more robust 417 | <2012-02-23, Thu> 418 | 419 | CHANGES IN VERSION 1.1.4 420 | ------------------------ 421 | o add Enrichment Analysis session in vignette. <2012-02-22, Wed> 422 | o optimize enrichDO, ten time faster. <2012-02-22, Wed> 423 | o separate code of enrichDO to enrich.internal, make it more general, 424 | and can be applied to other ontology. <2012-02-22, Wed> 425 | o rename enrichDOResult class to enrichResult and add slot geneInCategory. <2012-02-22, Wed> 426 | o export infoContentMethod and wangMethod. <2012-02-22, Wed> 427 | 428 | CHANGES IN VERSION 1.1.3 429 | ------------------------ 430 | o update infoContentMethod to make it consistent between DOSE and GOSemSim. <2011-12-31, Sat> 431 | 432 | CHANGES IN VERSION 1.1.2 433 | ------------------------ 434 | o change to using roxygen for generating Rd docs 435 | 436 | CHANGES IN VERSION 1.1.1 437 | ------------------------ 438 | o add function rebuildAnnoData 439 | o update Disease-Gene Mapping data 440 | 441 | CHANGES IN VERSION 0.99.7 442 | ------------------------ 443 | o fixed bug in .SemVal_internal 444 | 445 | CHANGES IN VERSION 0.99.6 446 | ------------------------ 447 | o Use qvalue instead of fdrtool to calculate qvalues. 
448 | 449 | -------------------------------------------------------------------------------- /NEWS.md: -------------------------------------------------------------------------------- 1 | # DOSE 4.2.0 2 | 3 | + Bioconductor RELEASE_3_21 (2025-04-17, Thu) 4 | 5 | # DOSE 4.0.0 6 | 7 | + Bioconductor RELEASE_3_20 (2024-10-30, Wed) 8 | 9 | # DOSE 3.99.1 10 | 11 | + return NULL in GSEA if not genes can be mapped (2024-08-26, Mon, ReactomePA#43) 12 | 13 | # DOSE 3.31.4 14 | 15 | + remove `mpoSim()` and `hopSim()` (2024-08-21, Wed) 16 | 17 | # DOSE 3.31.3 18 | 19 | + remove `gseMPO()` and `gseHPO()`, instead using e.g., `gseDO(ont="HPO")` (2024-08-15, Thu) 20 | + remove `enrichMPO()` and `enrichHPO()`, instead using e.g., `enrichDO(ont="HPO")` 21 | + unify API and removing the usages of MPO.db and HPO.db 22 | + update DO-gene mapping data (2024-08-13, Tue) 23 | + remove `HDO.db` and use new API in GOSemSim (v>=2.31.1) (2024-08-13, Tue) 24 | + use `yulab.utils::yulab_msg()` for startup message (2024-07-26, Fri) 25 | 26 | # DOSE 3.31.2 27 | 28 | + add `FoldEnrichment`, `RichFactor` and `zScore` in ORA result (2024-06-13, Thu) 29 | 30 | # DOSE 3.31.1 31 | 32 | + fixed bug in `options(enrichment_force_universe=TRUE)` (2024-05-16, Thu) 33 | - 34 | 35 | # DOSE 3.30.0 36 | 37 | + Bioconductor RELEASE_3_19 (2024-05-15, Wed) 38 | 39 | # DOSE 3.29.2 40 | 41 | + fix bugs in `get_ont_info()` and `get_dose_data()`: wrong object name and wrong data type (2023-11-30, Thu) 42 | 43 | # DOSE 3.29.1 44 | 45 | + mv 'MPO.db' and 'HPO.db' from 'Imports' to 'Suggests' and fixed bugs (2023-11-18, Sat) 46 | 47 | # DOSE 3.28.0 48 | 49 | + Bioconductor RELEASE_3_18 (2023-10-25, Wed) 50 | 51 | # DOSE 3.27.3 52 | 53 | + update `TERM2NAME()` to return term if corresponding name not found. 
(2023-10-09, Mon) 54 | 55 | # DOSE 3.27.2 56 | 57 | + use 'MPO.db' and 'HPO.db' to support phenotype ontology for mouse and human (2023-06-30, Fri) 58 | 59 | # DOSE 3.27.1 60 | 61 | + `options(enrichment_force_universe = TRUE)` will force enrichment analysis to intersect the `universe` with gene sets (2023-05-03, Wed) 62 | + use `inherits` to judge the class of objects (2022-11-20, Sun) 63 | + test whether slot in `GSON` object is NULL (e.g., `GSON@keytype`) when assigning it to enrichment result (2022-11-07, Mon) 64 | 65 | # DOSE 3.26.0 66 | 67 | + Bioconductor RELEASE_3_17 (2023-05-03, Wed) 68 | 69 | # DOSE 3.24.0 70 | 71 | + Bioconductor RELEASE_3_16 (2022-11-02, Wed) 72 | 73 | # DOSE 3.23.3 74 | 75 | + replace `DO.db` with `HDO.db` (2022-10-7, Fri) 76 | + add values of `organism`, `keytype` and `setType` for `GSEA_internal()` (2022-09-21, Wed) 77 | + add values of `organism`, `keytype` and `ontology` for `enricher_internal()` (2022-09-21, Wed) 78 | + move `inst/extdata/parse-obo.R` to `HDO.db` package (2022-08-29, Mon) 79 | + rename `qvalues` to `qvalue` in `gseaResult` object (2022-08-29, Mon) 80 | 81 | # DOSE 3.23.2 82 | 83 | + Support `GSON` object in `GSEA_internal()` (2022-06-08, Wed) 84 | 85 | # DOSE 3.23.1 86 | 87 | + Support `GSON` object in `enricher_internal()` (2022-06-06, Mon) 88 | 89 | # DOSE 3.22.0 90 | 91 | + Bioconductor 3.15 release 92 | 93 | # DOSE 3.21.2 94 | 95 | + enable `setReadable` for compareCluster (GSEA algorithm) result (2021-12-13, Mon) 96 | + update the default order of GSEA result (2021-12-09, Thu) 97 | - if p.adjust is identical, sorted by `abs(NES)` 98 | 99 | 100 | # DOSE 3.21.1 101 | 102 | + update DisGeNET and NCG data (2021-11-14, Sun) 103 | - DisGeNET v7: 21671 genes, 30170 diseases and 1134942 gene-disease associations 104 | - 194515 variants, 14155 diseases and 369554 variant-disease associations 105 | - NCG v7: 3177 cancer genes, 130 diseases and 6095 gene-disease associations 106 | 107 | # DOSE 3.20.0 108 | 109 | + 
Bioconductor 3.14 release 110 | 111 | # DOSE 3.19.4 112 | 113 | + update `clusterProfiler` citation (2021-09-30, Thu) 114 | + update error message of `enricher_internal` (2021-9-3, Fri) 115 | 116 | # DOSE 3.19.3 117 | 118 | + update DisGeNET and NCG data (2021-8-16, Mon) 119 | 120 | # DOSE 3.19.2 121 | 122 | + bug fixed, change 'is.na(path2name)' to 'all(is.na(path2name))' (2021-06-21, Mon) 123 | 124 | # DOSE 3.19.1 125 | 126 | + add `dr` slot to `compareClusterResult`, `enrichResult` and `gseaResult` (2021-5-21, Fri) 127 | 128 | # DOSE 3.18 129 | 130 | + Bioconductor 3.13 release 131 | 132 | # DOSE 3.17 133 | 134 | + support setting seed for fgsea method if e.g. `gseGO(seed = TRUE)` (2020-10-28, Wed) 135 | - 136 | 137 | # DOSE 3.16.0 138 | 139 | + Bioconductor 3.12 release (2020-10-28, Wed) 140 | 141 | # DOSE 3.15.4 142 | 143 | + update `setReadable` and `geneInCategory` methods for `compareClusterResult` object (2020-10-12, Mon) 144 | 145 | # DOSE 3.15.3 146 | 147 | + allow passing additional parameters to fgsea (2020-10-09, Fri) 148 | - 149 | + add `termsim` and `method` slots to `compareClusterResult`, `enrichResult` and `gseaResult` 150 | - 151 | 152 | # DOSE 3.15.2 153 | 154 | + update [NCG](http://ncg.kcl.ac.uk/download.php#) and [DGN](https://www.disgenet.org/downloads) data (2020-10-09, Thu) 155 | 156 | # DOSE 3.14.0 157 | 158 | + Bioconductor 3.11 release 159 | 160 | # DOSE 3.13.2 161 | 162 | + fixed issue caused by R v4.0.0 (2020-03-12, Thu) 163 | - length > 1 in coercion to logical 164 | - 165 | 166 | # DOSE 3.13.1 167 | 168 | + remove `S4Vectors` dependencies (2019-12-19, Thu) 169 | + extend `setReadable` to support `compareClusterResult` (2019-12-02, Mon) 170 | + add `gene2Symbol`, `keytype` and `readable` slots for `compareClusterResult` 171 | + move `compareClusterResult` class definition from `clusterProfiler` (2019-11-01, Fri) 172 | 173 | # DOSE 3.12.0 174 | 175 | + Bioconductor 3.10 release 176 | 177 | # DOSE 3.11.2 178 | 179 | + ignore `universe`
and print a message if users accidentally pass wrong input (2019-10-24, Thu) 180 | - 181 | + gene with minimal ES value (NES < 0) will be reported in `core_enrichment` (2019-07-31, Wed) 182 | 183 | # DOSE 3.11.1 184 | 185 | + `build_Anno` is now compatible with `tibble` (2019-05-28, Tue) 186 | 187 | # DOSE 3.10.0 188 | 189 | + Bioconductor 3.9 release 190 | 191 | # DOSE 3.9.4 192 | 193 | + export `parse_ratio` (2019-03-29, Tue) 194 | 195 | # DOSE 3.9.4 196 | 197 | + fixed a bug in `get_enriched` (2019-01-14, Mon) 198 | - 199 | 200 | # DOSE 3.9.2 201 | 202 | + move enrichment vignettes to [clusterProfiler-book](https://yulab-smu.github.io/clusterProfiler-book) (2019-01-10, Thu) 203 | 204 | # DOSE 3.9.1 205 | 206 | + `asis` parameter in `[.enrichResult` and `[.gseaResult` (2018-12-24, Mon) 207 | - 208 | 209 | # DOSE 3.8 210 | 211 | + Bioconductor 3.8 release 212 | 213 | # DOSE 3.7.1 214 | 215 | + S3 accessor methods only return enriched terms. (2018-06-20, Wed) 216 | -------------------------------------------------------------------------------- /R/00-AllClasses.R: -------------------------------------------------------------------------------- 1 | ##' Class "compareClusterResult" 2 | ##' This class represents the comparison result of gene clusters by GO 3 | ##' categories at specific level or GO enrichment analysis. 4 | ##' 5 | ##' 6 | ##' @name compareClusterResult-class 7 | ##' @aliases compareClusterResult-class show,compareClusterResult-method 8 | ##' summary,compareClusterResult-method plot,compareClusterResult-method 9 | ##' @docType class 10 | ##' @slot compareClusterResult cluster comparing result 11 | ##' @slot geneClusters a list of genes 12 | ##' @slot fun one of groupGO, enrichGO and enrichKEGG 13 | ##' @slot gene2Symbol gene ID to Symbol 14 | ##' @slot keytype Gene ID type 15 | ##' @slot readable logical flag of gene ID in symbol or not.
16 | ##' @slot .call function call 17 | ##' @slot termsim Similarity between term 18 | ##' @slot method method of calculating the similarity between nodes 19 | ##' @slot dr dimension reduction result 20 | ##' @exportClass compareClusterResult 21 | ##' @author Guangchuang Yu \url{https://yulab-smu.top} 22 | ##' @exportClass compareClusterResult 23 | ##' @seealso 24 | ##' \code{\linkS4class{enrichResult}} 25 | ##' @keywords classes 26 | setClass("compareClusterResult", 27 | representation = representation( 28 | compareClusterResult = "data.frame", 29 | geneClusters = "list", 30 | fun = "character", 31 | gene2Symbol = "character", 32 | keytype = "character", 33 | readable = "logical", 34 | .call = "call", 35 | termsim = "matrix", 36 | method = "character", 37 | dr = "list" 38 | ) 39 | ) 40 | 41 | ##' Class "enrichResult" 42 | ##' This class represents the result of enrichment analysis. 43 | ##' 44 | ##' 45 | ##' @name enrichResult-class 46 | ##' @aliases enrichResult-class 47 | ##' show,enrichResult-method summary,enrichResult-method 48 | ##' 49 | ##' @docType class 50 | ##' @slot result enrichment analysis 51 | ##' @slot pvalueCutoff pvalueCutoff 52 | ##' @slot pAdjustMethod pvalue adjust method 53 | ##' @slot qvalueCutoff qvalueCutoff 54 | ##' @slot organism only "human" supported 55 | ##' @slot ontology biological ontology 56 | ##' @slot gene Gene IDs 57 | ##' @slot keytype Gene ID type 58 | ##' @slot universe background gene 59 | ##' @slot gene2Symbol mapping gene to Symbol 60 | ##' @slot geneSets gene sets 61 | ##' @slot readable logical flag of gene ID in symbol or not. 
62 | ##' @slot termsim Similarity between terms 63 | ##' @slot method method of calculating the similarity between nodes 64 | ##' @slot dr dimension reduction result 65 | ##' @exportClass enrichResult 66 | ##' @author Guangchuang Yu \url{https://yulab-smu.top} 67 | ##' @seealso \code{\link{enrichDO}} 68 | ##' @keywords classes 69 | setClass("enrichResult", 70 | representation=representation( 71 | result = "data.frame", 72 | pvalueCutoff = "numeric", 73 | pAdjustMethod = "character", 74 | qvalueCutoff = "numeric", 75 | organism = "character", 76 | ontology = "character", 77 | gene = "character", 78 | keytype = "character", 79 | universe = "character", 80 | gene2Symbol = "character", 81 | geneSets = "list", 82 | readable = "logical", 83 | termsim = "matrix", 84 | method = "character", 85 | dr = "list" 86 | ), 87 | prototype=prototype(readable = FALSE) 88 | ) 89 | 90 | 91 | 92 | ##' Class "gseaResult" 93 | ##' This class represents the result of GSEA analysis 94 | ##' 95 | ##' 96 | ##' @name gseaResult-class 97 | ##' @aliases gseaResult-class 98 | ##' show,gseaResult-method summary,gseaResult-method 99 | ##' 100 | ##' @docType class 101 | ##' @slot result GSEA analysis result 102 | ##' @slot organism organism 103 | ##' @slot setType setType 104 | ##' @slot geneSets geneSets 105 | ##' @slot geneList ordered (ranked) geneList 106 | ##' @slot keytype ID type of gene 107 | ##' @slot permScores permutation scores 108 | ##' @slot params parameters 109 | ##' @slot gene2Symbol gene ID to Symbol 110 | ##' @slot readable whether to convert gene IDs to symbols 111 | ##' @slot dr dimension reduction result 112 | ##' @exportClass gseaResult 113 | ##' @author Guangchuang Yu \url{https://yulab-smu.top} 114 | ##' @keywords classes 115 | setClass("gseaResult", 116 | representation = representation( 117 | result = "data.frame", 118 | organism = "character", 119 | setType = "character", 120 | geneSets = "list", 121 | geneList = "numeric", 122 | keytype = "character", 123 | permScores = "matrix", 124 |
params = "list", 125 | gene2Symbol = "character", 126 | readable = "logical", 127 | termsim = "matrix", 128 | method = "character", 129 | dr = "list" 130 | ) 131 | ) 132 | 133 | -------------------------------------------------------------------------------- /R/AllGenerics.R: -------------------------------------------------------------------------------- 1 | #' geneID generic 2 | #' 3 | #' @param x enrichResult object 4 | #' @return 'geneID' return the 'geneID' column of the enriched result which can be converted to data.frame via 'as.data.frame' 5 | #' @export 6 | #' @examples 7 | #' data(geneList, package="DOSE") 8 | #' de <- names(geneList)[1:100] 9 | #' x <- enrichDO(de) 10 | #' geneID(x) 11 | geneID <- function(x) { 12 | UseMethod("geneID", x) 13 | } 14 | 15 | #' geneInCategory generic 16 | #' 17 | #' @param x enrichResult 18 | #' @return 'geneInCategory' return a list of genes, by spliting the input gene vector to enriched functional categories 19 | #' @export 20 | #' @examples 21 | #' data(geneList, package="DOSE") 22 | #' de <- names(geneList)[1:100] 23 | #' x <- enrichDO(de) 24 | #' geneInCategory(x) 25 | geneInCategory <- function(x) { 26 | UseMethod("geneInCategory", x) 27 | } 28 | 29 | -------------------------------------------------------------------------------- /R/DOSE-package.R: -------------------------------------------------------------------------------- 1 | #' @keywords internal 2 | "_PACKAGE" 3 | 4 | 5 | #' Datasets 6 | #' 7 | #' Information content and DO term to entrez gene IDs mapping 8 | #' 9 | #' 10 | #' @name DataSet 11 | #' @aliases geneList 12 | #' NCG_EXTID2PATHID NCG_PATHID2EXTID NCG_PATHID2NAME 13 | #' DGN_EXTID2PATHID DGN_PATHID2EXTID DGN_PATHID2NAME 14 | #' VDGN_EXTID2PATHID VDGN_PATHID2EXTID VDGN_PATHID2NAME 15 | #' @docType data 16 | #' @keywords datasets 17 | NULL 18 | -------------------------------------------------------------------------------- /R/accessor.R: 
-------------------------------------------------------------------------------- 1 | ##' @method as.data.frame enrichResult 2 | ##' @export 3 | as.data.frame.enrichResult <- function(x, ...) { 4 | x <- get_enriched(x) 5 | as.data.frame(x@result, ...) 6 | } 7 | 8 | ##' @method as.data.frame gseaResult 9 | ##' @export 10 | as.data.frame.gseaResult <- function(x, ...) { 11 | as.data.frame(x@result, ...) 12 | } 13 | 14 | ##' @method geneID enrichResult 15 | ##' @export 16 | geneID.enrichResult <- function(x) as.character(x@result$geneID) 17 | 18 | ##' @method geneID gseaResult 19 | ##' @export 20 | geneID.gseaResult <- function(x) as.character(x@result$core_enrichment) 21 | 22 | ##' @method geneID compareClusterResult 23 | ##' @export 24 | geneID.compareClusterResult <- function(x) as.character(x@compareClusterResult$geneID) 25 | 26 | 27 | ##' @method geneInCategory enrichResult 28 | ##' @export 29 | ##' @importFrom stats setNames 30 | geneInCategory.enrichResult <- function(x) 31 | setNames(strsplit(geneID(x), "/", fixed=TRUE), rownames(x@result)) 32 | 33 | ##' @method geneInCategory gseaResult 34 | ##' @export 35 | geneInCategory.gseaResult <- function(x) 36 | setNames(strsplit(geneID(x), "/", fixed=TRUE), rownames(x@result)) 37 | 38 | ##' @method geneInCategory compareClusterResult 39 | ##' @export 40 | geneInCategory.compareClusterResult <- function(x) { 41 | ########## v1 42 | ## setNames(strsplit(geneID(x), "/", fixed=TRUE), 43 | ## paste(x@compareClusterResult$Cluster, 44 | ## x@compareClusterResult$ID, sep= "-")) 45 | ## setNames(strsplit(geneID(x), "/", fixed=TRUE), x@compareClusterResult$ID) 46 | 47 | ########## v2 48 | ## Cluster <- NULL 49 | ## clusters <- as.character(unique(x@compareClusterResult$Cluster)) 50 | ## list_new <- setNames(lapply(clusters, 51 | ## function(j) dplyr::filter(x@compareClusterResult, Cluster == j)), clusters) 52 | 53 | ## for(i in seq_len(length(list_new))) { 54 | ## list_new[[i]] <- 
setNames(strsplit(as.character(list_new[[i]]$geneID), 55 | ## "/", fixed=TRUE), 56 | ## list_new[[i]]$ID) 57 | ## } 58 | ## return(list_new) 59 | 60 | 61 | x <- as.data.frame(x) 62 | # reslist <- split(x@compareClusterResult, x@compareClusterResult$Cluster) 63 | reslist <- split(x, x$Cluster) 64 | if ("core_enrichment" %in% colnames(x)) { 65 | res <- lapply(reslist, function(y) 66 | setNames( 67 | strsplit(y$core_enrichment, split="/", fixed=TRUE), 68 | y$ID 69 | )) 70 | } else { 71 | res <- lapply(reslist, function(y) 72 | setNames( 73 | strsplit(y$geneID, split="/", fixed=TRUE), 74 | y$ID 75 | )) 76 | } 77 | 78 | res[vapply(res, length, numeric(1)) != 0] 79 | } 80 | 81 | 82 | ##' @method [ enrichResult 83 | ##' @export 84 | `[.enrichResult` <- function(x, i, j, asis = FALSE, ...) { 85 | x <- get_enriched(x) 86 | y <- x@result[i, j, ...] 87 | if (!asis) 88 | return(y) 89 | x@result <- y 90 | return(x) 91 | } 92 | 93 | ##' @method [ gseaResult 94 | ##' @export 95 | `[.gseaResult` <- function(x, i, j, asis = FALSE, ...) { 96 | y <- x@result[i, j, ...] 
97 | if (!asis) 98 | return(y) 99 | x@result <- y 100 | return(x) 101 | } 102 | 103 | 104 | ##' @method $ enrichResult 105 | ##' @export 106 | `$.enrichResult` <- function(x, name) { 107 | x <- get_enriched(x) 108 | x@result[, name] 109 | } 110 | 111 | ##' @method $ gseaResult 112 | ##' @export 113 | `$.gseaResult` <- function(x, name) { 114 | x@result[, name] 115 | } 116 | 117 | 118 | 119 | ##' @method [[ enrichResult 120 | ##' @export 121 | `[[.enrichResult` <- function(x, i) { 122 | gc <- geneInCategory(x) 123 | if (!i %in% names(gc)) 124 | stop("input term not found...") 125 | gc[[i]] 126 | } 127 | 128 | 129 | ##' @method [[ gseaResult 130 | ##' @export 131 | `[[.gseaResult` <- function(x, i) { 132 | gc <- geneInCategory(x) 133 | if (!i %in% names(gc)) 134 | stop("input term not found...") 135 | gc[[i]] 136 | } 137 | 138 | 139 | ##' @importFrom utils head 140 | ##' @method head enrichResult 141 | ##' @export 142 | head.enrichResult <- function(x, n=6L, ...) { 143 | x <- get_enriched(x) 144 | head(x@result, n, ...) 145 | } 146 | 147 | ##' @method head gseaResult 148 | ##' @export 149 | head.gseaResult <- function(x, n=6L, ...) { 150 | head(x@result, n, ...) 151 | } 152 | 153 | ##' @importFrom utils tail 154 | ##' @method tail enrichResult 155 | ##' @export 156 | tail.enrichResult <- function(x, n=6L, ...) { 157 | x <- get_enriched(x) 158 | tail(x@result, n, ...) 159 | } 160 | 161 | ##' @method tail gseaResult 162 | ##' @export 163 | tail.gseaResult <- function(x, n=6L, ...) { 164 | tail(x@result, n, ...) 
165 | } 166 | 167 | ##' @method dim enrichResult 168 | ##' @export 169 | dim.enrichResult <- function(x) { 170 | x <- get_enriched(x) 171 | dim(x@result) 172 | } 173 | 174 | ##' @method dim gseaResult 175 | ##' @export 176 | dim.gseaResult <- function(x) { 177 | dim(x@result) 178 | } 179 | 180 | 181 | 182 | 183 | ##' summary method for \code{gseaResult} instance 184 | ##' 185 | ##' 186 | ##' @name summary 187 | ##' @docType methods 188 | ##' @rdname summary-methods 189 | ##' 190 | ##' @title summary method 191 | ##' @return A data frame 192 | ##' @exportMethod summary 193 | ##' @usage summary(object, ...) 194 | ##' @author Guangchuang Yu \url{https://guangchuangyu.github.io} 195 | setMethod("summary", signature(object="gseaResult"), 196 | function(object, ...) { 197 | warning("summary method to convert the object to data.frame is deprecated, please use as.data.frame instead.") 198 | return(as.data.frame(object, ...)) 199 | } 200 | ) 201 | 202 | 203 | ##' summary method for \code{enrichResult} instance 204 | ##' 205 | ##' 206 | ##' @name summary 207 | ##' @docType methods 208 | ##' @rdname summary-methods 209 | ##' 210 | ##' @title summary method 211 | ##' @param object A \code{enrichResult} instance. 212 | ##' @param ... additional parameter 213 | ##' @return A data frame 214 | ##' @exportMethod summary 215 | ##' @usage summary(object, ...) 216 | ##' @author Guangchuang Yu \url{http://guangchuangyu.github.io} 217 | setMethod("summary", signature(object="enrichResult"), 218 | function(object, ...) 
{ 219 | warning("summary method to convert the object to data.frame is deprecated, please use as.data.frame instead.") 220 | return(as.data.frame(object, ...)) 221 | } 222 | ) 223 | 224 | 225 | -------------------------------------------------------------------------------- /R/build_Anno.R: -------------------------------------------------------------------------------- 1 | build_Anno <- function(path2gene, path2name) { 2 | if (!exists(".Anno_clusterProfiler_Env", envir = .GlobalEnv)) { 3 | pos <- 1 4 | envir <- as.environment(pos) 5 | assign(".Anno_clusterProfiler_Env", new.env(), envir = envir) 6 | } 7 | Anno_clusterProfiler_Env <- get(".Anno_clusterProfiler_Env", envir= .GlobalEnv) 8 | 9 | # if(class(path2gene[[2]]) == 'list') { 10 | if (inherits(path2gene[[2]], "list")){ 11 | ## to compatible with tibble 12 | path2gene <- cbind(rep(path2gene[[1]], 13 | times = vapply(path2gene[[2]], length, numeric(1))), 14 | unlist(path2gene[[2]])) 15 | } 16 | 17 | path2gene <- as.data.frame(path2gene) 18 | path2gene <- path2gene[!is.na(path2gene[,1]), ] 19 | path2gene <- path2gene[!is.na(path2gene[,2]), ] 20 | path2gene <- unique(path2gene) 21 | 22 | PATHID2EXTID <- split(as.character(path2gene[,2]), as.character(path2gene[,1])) 23 | EXTID2PATHID <- split(as.character(path2gene[,1]), as.character(path2gene[,2])) 24 | 25 | assign("PATHID2EXTID", PATHID2EXTID, envir = Anno_clusterProfiler_Env) 26 | assign("EXTID2PATHID", EXTID2PATHID, envir = Anno_clusterProfiler_Env) 27 | 28 | if ( missing(path2name) || is.null(path2name) || all(is.na(path2name))) { 29 | assign("PATHID2NAME", NULL, envir = Anno_clusterProfiler_Env) 30 | } else { 31 | path2name <- as.data.frame(path2name) 32 | path2name <- path2name[!is.na(path2name[,1]), ] 33 | path2name <- path2name[!is.na(path2name[,2]), ] 34 | path2name <- unique(path2name) 35 | PATH2NAME <- as.character(path2name[,2]) 36 | names(PATH2NAME) <- as.character(path2name[,1]) 37 | assign("PATHID2NAME", PATH2NAME, envir = 
Anno_clusterProfiler_Env) 38 | } 39 | return(Anno_clusterProfiler_Env) 40 | } 41 | -------------------------------------------------------------------------------- /R/clusterSim.R: -------------------------------------------------------------------------------- 1 | ##' semantic similarity between two gene clusters 2 | ##' 3 | ##' given two gene clusters, this function calculates semantic similarity between them. 4 | ##' 5 | ##' @title clusterSim 6 | ##' @param cluster1 a vector of gene IDs 7 | ##' @param cluster2 another vector of gene IDs 8 | ##' @param organism one of "hsa" and "mmu" 9 | ##' @param ont one of "HDO", "HPO" and "MPO" 10 | ##' @param measure One of "Resnik", "Lin", "Rel", "Jiang" and "Wang" methods. 11 | ##' @param combine One of "max", "avg", "rcmax", "BMA" methods, for combining the semantic similarity scores of multiple terms 12 | ##' @return similarity 13 | ##' @importFrom GOSemSim combineScores 14 | ##' @export 15 | ##' @author Yu Guangchuang 16 | ##' @examples 17 | ##' \dontrun{ 18 | ##' cluster1 <- c("835", "5261","241", "994") 19 | ##' cluster2 <- c("307", "308", "317", "321", "506", "540", "378", "388", "396") 20 | ##' clusterSim(cluster1, cluster2) 21 | ##' } 22 | clusterSim <- function(cluster1, 23 | cluster2, 24 | ont = "HDO", 25 | organism = "hsa", 26 | measure="Wang", 27 | combine="BMA") { 28 | if (ont == "DO") ont <- 'HDO' 29 | 30 | do1 <- sapply(cluster1, gene2DO, organism = organism) 31 | do2 <- sapply(cluster2, gene2DO, organism = organism) 32 | 33 | do1 <- unlist(do1) 34 | do2 <- unlist(do2) 35 | 36 | res <- doseSim(DOID1 = do1, DOID2 = do2, measure = measure, ont = ont) 37 | combineScores(res, combine) 38 | } 39 | -------------------------------------------------------------------------------- /R/doSim.R: -------------------------------------------------------------------------------- 1 | ##' measuring similarities between two DO term vectors. 2 | ##' 3 | ##' provide two term vectors, this function will calculate their similarities.
4 | ##' @title doseSim 5 | ##' @rdname doseSim 6 | ##' @param DOID1 DO term, MPO term or HPO term vector 7 | ##' @param DOID2 DO term, MPO term or HPO term vector 8 | ##' @param ont one of "HDO", "HPO" and "MPO" 9 | ##' @param measure one of "Wang", "Resnik", "Rel", "Jiang", "Lin", and "TCSS". 10 | ##' @return score matrix 11 | ##' @importFrom GOSemSim termSim 12 | ##' @export 13 | ##' @author Guangchuang Yu \url{https://yulab-smu.top} 14 | doseSim <- function(DOID1, 15 | DOID2, 16 | measure="Wang", 17 | ont = "HDO") { 18 | ont <- match.arg(ont, c("DO", "HDO", "MPO", "HPO")) 19 | 20 | if (ont == "DO") ont <- 'HDO' 21 | 22 | processTCSS <- FALSE 23 | if (measure == "TCSS") { 24 | processTCSS <- TRUE 25 | } 26 | 27 | scores <- GOSemSim::termSim( 28 | DOID1, 29 | DOID2, 30 | semdata2(processTCSS = processTCSS, ont = ont), 31 | measure 32 | ) 33 | 34 | if(length(scores) == 1) 35 | scores <- as.numeric(scores) 36 | 37 | return(scores) 38 | } 39 | 40 | ##' @rdname doseSim 41 | ##' @export 42 | doSim <- doseSim 43 | -------------------------------------------------------------------------------- /R/enrichDGN.R: -------------------------------------------------------------------------------- 1 | ##' Enrichment analysis based on DisGeNET (\url{http://www.disgenet.org/}) 2 | ##' 3 | ##' given a vector of genes, this function will return the enriched DisGeNET 4 | ##' categories with FDR control 5 | ##' 6 | ##' 7 | ##' @inheritParams enrichNCG 8 | ##' @return A \code{enrichResult} instance 9 | ##' @export 10 | ##' @references Janet et al. (2015) DisGeNET: a discovery platform for the dynamical exploration of human diseases and their genes.
\emph{Database} bav028 11 | ##' \url{http://database.oxfordjournals.org/content/2015/bav028.long} 12 | ##' @author Guangchuang Yu 13 | enrichDGN <- function(gene, 14 | pvalueCutoff = 0.05, 15 | pAdjustMethod = "BH", 16 | universe, 17 | minGSSize = 10, 18 | maxGSSize = 500, 19 | qvalueCutoff = 0.2, 20 | readable = FALSE){ 21 | 22 | enrichDisease(gene = gene, 23 | pvalueCutoff = pvalueCutoff, 24 | pAdjustMethod = pAdjustMethod, 25 | universe = universe, 26 | minGSSize = minGSSize, 27 | maxGSSize = maxGSSize, 28 | qvalueCutoff = qvalueCutoff, 29 | readable = readable, 30 | ontology = "DisGeNET") 31 | 32 | } 33 | 34 | get_DGN_data <- function() { 35 | if (!exists(".DOSEEnv", envir = .GlobalEnv)) .initial() 36 | .DOSEEnv <- get(".DOSEEnv", envir = .GlobalEnv) 37 | 38 | if (!exists(".DGN_DOSE_Env", envir=.DOSEEnv)) { 39 | tryCatch(utils::data(list="DGN_EXTID2PATHID", package="DOSE")) 40 | tryCatch(utils::data(list="DGN_PATHID2EXTID", package="DOSE")) 41 | tryCatch(utils::data(list="DGN_PATHID2NAME", package="DOSE")) 42 | EXTID2PATHID <- DGN_EXTID2PATHID <- get("DGN_EXTID2PATHID") 43 | PATHID2EXTID <- DGN_PATHID2EXTID <- get("DGN_PATHID2EXTID") 44 | PATHID2NAME <- DGN_PATHID2NAME <- get("DGN_PATHID2NAME") 45 | 46 | rm(DGN_EXTID2PATHID, envir = .GlobalEnv) 47 | rm(DGN_PATHID2EXTID, envir = .GlobalEnv) 48 | rm(DGN_PATHID2NAME, envir = .GlobalEnv) 49 | 50 | assign(".DGN_DOSE_Env", new.env(), envir = .DOSEEnv) 51 | .DGN_DOSE_Env <- get(".DGN_DOSE_Env", envir = .DOSEEnv) 52 | assign("EXTID2PATHID", EXTID2PATHID, envir = .DGN_DOSE_Env) 53 | assign("PATHID2EXTID", PATHID2EXTID, envir = .DGN_DOSE_Env) 54 | assign("PATHID2NAME", PATHID2NAME, envir = .DGN_DOSE_Env) 55 | } 56 | get(".DGN_DOSE_Env", envir = .DOSEEnv) 57 | } 58 | 59 | 60 | -------------------------------------------------------------------------------- /R/enrichDGNv.R: -------------------------------------------------------------------------------- 1 | ##' Enrichment analysis based on the DisGeNET (\url{http://www.disgenet.org/}) 2 | ##'
3 | ##' given a vector of SNPs, this function will return the enriched DisGeNET 4 | ##' categories with FDR control 5 | ##' 6 | ##' 7 | ##' @title enrichDGNv 8 | ##' @param snp a vector of SNPs 9 | ##' @param pvalueCutoff pvalue cutoff 10 | ##' @param pAdjustMethod one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none" 11 | ##' @param universe background genes 12 | ##' @param minGSSize minimal size of genes annotated by DisGeNET category for testing 13 | ##' @param maxGSSize maximal size of each geneSet for analyzing 14 | ##' @param qvalueCutoff qvalue cutoff 15 | ##' @param readable whether mapping gene ID to gene Name 16 | ##' @return A \code{enrichResult} instance 17 | ##' @export 18 | ##' @references Janet et al. (2015) DisGeNET: a discovery platform for the dynamical exploration of human diseases and their genes. \emph{Database} bav028 19 | ##' \url{http://database.oxfordjournals.org/content/2015/bav028.long} 20 | ##' @author Guangchuang Yu 21 | enrichDGNv <- function(snp, 22 | pvalueCutoff = 0.05, 23 | pAdjustMethod = "BH", 24 | universe, 25 | minGSSize = 10, 26 | maxGSSize = 500, 27 | qvalueCutoff = 0.2, 28 | readable = FALSE){ 29 | enrichDisease(gene = snp, 30 | pvalueCutoff = pvalueCutoff, 31 | pAdjustMethod = pAdjustMethod, 32 | universe = universe, 33 | minGSSize = minGSSize, 34 | maxGSSize = maxGSSize, 35 | qvalueCutoff = qvalueCutoff, 36 | readable = readable, 37 | ontology = "snpDisGeNET") 38 | 39 | } 40 | 41 | get_VDGN_data <- function() { 42 | if (!exists(".DOSEEnv", envir = .GlobalEnv)) .initial() 43 | .DOSEEnv <- get(".DOSEEnv", envir = .GlobalEnv) 44 | 45 | if (!exists(".VDGN_DOSE_Env", envir=.DOSEEnv)) { 46 | tryCatch(utils::data(list="VDGN_EXTID2PATHID", package="DOSE")) 47 | tryCatch(utils::data(list="VDGN_PATHID2EXTID", package="DOSE")) 48 | tryCatch(utils::data(list="VDGN_PATHID2NAME", package="DOSE")) 49 | EXTID2PATHID <- VDGN_EXTID2PATHID <- get("VDGN_EXTID2PATHID") 50 | PATHID2EXTID <- VDGN_PATHID2EXTID <- get("VDGN_PATHID2EXTID") 51 |
PATHID2NAME <- VDGN_PATHID2NAME <- get("VDGN_PATHID2NAME") 52 | 53 | rm(VDGN_EXTID2PATHID, envir = .GlobalEnv) 54 | rm(VDGN_PATHID2EXTID, envir = .GlobalEnv) 55 | rm(VDGN_PATHID2NAME, envir = .GlobalEnv) 56 | 57 | assign(".VDGN_DOSE_Env", new.env(), envir = .DOSEEnv) 58 | .VDGN_DOSE_Env <- get(".VDGN_DOSE_Env", envir = .DOSEEnv) 59 | assign("EXTID2PATHID", EXTID2PATHID, envir = .VDGN_DOSE_Env) 60 | assign("PATHID2EXTID", PATHID2EXTID, envir = .VDGN_DOSE_Env) 61 | assign("PATHID2NAME", PATHID2NAME, envir = .VDGN_DOSE_Env) 62 | } 63 | get(".VDGN_DOSE_Env", envir = .DOSEEnv) 64 | } 65 | 66 | 67 | -------------------------------------------------------------------------------- /R/enrichDO.R: -------------------------------------------------------------------------------- 1 | ##' DO Enrichment Analysis 2 | ##' 3 | ##' Given a vector of genes, this function will return the enrichment DO 4 | ##' categories with FDR control. 5 | ##' 6 | ##' @rdname enrichDO 7 | ##' @param ont one of "HDO", "HPO" or "MPO". 8 | ##' @param organism one of "hsa" and "mmu" 9 | ##' @inheritParams enrichNCG 10 | ##' @return A \code{enrichResult} instance. 
11 | ##' @export 12 | ##' @seealso \code{\link{enrichResult-class}} 13 | ##' @author Guangchuang Yu \url{https://yulab-smu.top} 14 | ##' @keywords manip 15 | ##' @examples 16 | ##' 17 | ##' data(geneList) 18 | ##' gene = names(geneList)[geneList > 1] 19 | ##' yy = enrichDO(gene, pvalueCutoff=0.05) 20 | ##' head(yy) 21 | ##' 22 | enrichDO <- function(gene, ont="HDO", 23 | organism = "hsa", 24 | pvalueCutoff=0.05, 25 | pAdjustMethod="BH", 26 | universe, 27 | minGSSize = 10, 28 | maxGSSize = 500, 29 | qvalueCutoff=0.2, 30 | readable = FALSE){ 31 | 32 | enrichDisease(gene = gene, 33 | organism = organism, 34 | pvalueCutoff = pvalueCutoff, 35 | pAdjustMethod = pAdjustMethod, 36 | universe = universe, 37 | minGSSize = minGSSize, 38 | maxGSSize = maxGSSize, 39 | qvalueCutoff = qvalueCutoff, 40 | readable = readable, 41 | ontology = ont) 42 | } 43 | 44 | 45 | 46 | -------------------------------------------------------------------------------- /R/enrichDisease.R: -------------------------------------------------------------------------------- 1 | enrichDisease <- function(gene, 2 | organism = "hsa", 3 | pvalueCutoff = 0.05, 4 | pAdjustMethod = "BH", 5 | universe, 6 | minGSSize = 10, 7 | maxGSSize = 500, 8 | qvalueCutoff = 0.2, 9 | readable = FALSE, 10 | ontology){ 11 | 12 | organism <- match.arg(organism, c("hsa", "mmu")) 13 | 14 | annoData <- get_anno_data(ontology) 15 | 16 | res <- enricher_internal(gene = gene, 17 | pvalueCutoff = pvalueCutoff, 18 | pAdjustMethod = pAdjustMethod, 19 | universe = universe, 20 | minGSSize = minGSSize, 21 | maxGSSize = maxGSSize, 22 | qvalueCutoff = qvalueCutoff, 23 | USER_DATA = annoData) 24 | 25 | if (is.null(res)) 26 | return(res) 27 | if (organism == "hsa") { 28 | res@organism <- "Homo sapiens" 29 | } else { 30 | res@organism <- "Mus musculus" 31 | } 32 | 33 | res@keytype <- "ENTREZID" 34 | res@ontology <- ontology 35 | 36 | if(readable) { 37 | if (organism == "hsa") { 38 | res <- setReadable(res, 'org.Hs.eg.db') 39 | } else { 40 |
res <- setReadable(res, 'org.Mm.eg.db') 41 | } 42 | } 43 | return(res) 44 | } 45 | 46 | 47 | get_anno_data <- function(ontology) { 48 | if (ontology == "NCG") { 49 | annoData <- get_NCG_data() 50 | } else if (ontology == "DisGeNET") { 51 | annoData <- get_DGN_data() 52 | } else if (ontology == "snpDisGeNET") { 53 | annoData <- get_VDGN_data() 54 | } else if (ontology %in% c("HDO", "MPO", "HPO")) { 55 | annoData <- get_dose_data(ontology) 56 | } else { 57 | stop("ontology not supported yet...") 58 | } 59 | 60 | return(annoData) 61 | } 62 | 63 | get_dose_data <- function(ontology = "HPO") { 64 | .DOSEEnv <- get_dose_env() 65 | .env <- sprintf(".%s_DOSE_Env", ontology) 66 | if (exists(.env, envir=.DOSEEnv)) { 67 | res <- get(.env, envir = .DOSEEnv) 68 | return(res) 69 | } 70 | 71 | assign(.env, new.env(), envir = .DOSEEnv) 72 | ret_env <- get(.env, envir = .DOSEEnv) 73 | 74 | TERM2ALLEG <- get_ont2allgene(ontology) 75 | EG2ALLTERM <- get_gene2allont(ontology) 76 | 77 | termmap <- GOSemSim:::get_onto_data( 78 | ontology, 79 | table="term", 80 | output = "data.frame") 81 | 82 | PATH2NAME.df <- unique(termmap) 83 | PATH2NAME <- setNames(PATH2NAME.df[,2], PATH2NAME.df[,1]) 84 | 85 | assign("EXTID2PATHID", EG2ALLTERM, envir = ret_env) 86 | assign("PATHID2EXTID", TERM2ALLEG, envir = ret_env) 87 | assign("PATHID2NAME", PATH2NAME, envir = ret_env) 88 | 89 | return(ret_env) 90 | } 91 | 92 | -------------------------------------------------------------------------------- /R/enrichNCG.R: -------------------------------------------------------------------------------- 1 | ##' Enrichment analysis based on the Network of Cancer Genes database (http://ncg.kcl.ac.uk/) 2 | ##' 3 | ##' given a vector of genes, this function will return the enriched NCG 4 | ##' categories with FDR control 5 | ##' 6 | ##' 7 | ##' @title enrichNCG 8 | ##' @param gene a vector of entrez gene id 9 | ##' @param pvalueCutoff pvalue cutoff 10 | ##' @param pAdjustMethod one of "holm", "hochberg", "hommel",
"bonferroni", "BH", "BY", "fdr", "none" 11 | ##' @param universe background genes 12 | ##' @param minGSSize minimal size of genes annotated by NCG category for testing 13 | ##' @param maxGSSize maximal size of each geneSet for analyzing 14 | ##' @param qvalueCutoff qvalue cutoff 15 | ##' @param readable whether mapping gene ID to gene Name 16 | ##' @return A \code{enrichResult} instance 17 | ##' @export 18 | ##' @author Guangchuang Yu 19 | enrichNCG <- function(gene, 20 | pvalueCutoff = 0.05, 21 | pAdjustMethod = "BH", 22 | universe, 23 | minGSSize = 10, 24 | maxGSSize = 500, 25 | qvalueCutoff = 0.2, 26 | readable = FALSE){ 27 | 28 | enrichDisease(gene = gene, 29 | pvalueCutoff = pvalueCutoff, 30 | pAdjustMethod = pAdjustMethod, 31 | universe = universe, 32 | minGSSize = minGSSize, 33 | maxGSSize = maxGSSize, 34 | qvalueCutoff = qvalueCutoff, 35 | readable = readable, 36 | ontology = "NCG") 37 | } 38 | 39 | get_NCG_data <- function() { 40 | if (!exists(".DOSEEnv", envir = .GlobalEnv)) .initial() 41 | .DOSEEnv <- get(".DOSEEnv", envir = .GlobalEnv) 42 | 43 | if (!exists(".NCG_DOSE_Env", envir=.DOSEEnv)) { 44 | tryCatch(utils::data(list="NCG_EXTID2PATHID", package="DOSE")) 45 | tryCatch(utils::data(list="NCG_PATHID2EXTID", package="DOSE")) 46 | tryCatch(utils::data(list="NCG_PATHID2NAME", package="DOSE")) 47 | EXTID2PATHID <- NCG_EXTID2PATHID <- get("NCG_EXTID2PATHID") 48 | PATHID2EXTID <- NCG_PATHID2EXTID <- get("NCG_PATHID2EXTID") 49 | PATHID2NAME <- NCG_PATHID2NAME <- get("NCG_PATHID2NAME") 50 | 51 | rm(NCG_EXTID2PATHID, envir = .GlobalEnv) 52 | rm(NCG_PATHID2EXTID, envir = .GlobalEnv) 53 | rm(NCG_PATHID2NAME, envir = .GlobalEnv) 54 | 55 | assign(".NCG_DOSE_Env", new.env(), envir = .DOSEEnv) 56 | .NCG_DOSE_Env <- get(".NCG_DOSE_Env", envir = .DOSEEnv) 57 | assign("EXTID2PATHID", EXTID2PATHID, envir = .NCG_DOSE_Env) 58 | assign("PATHID2EXTID", PATHID2EXTID, envir = .NCG_DOSE_Env) 59 | assign("PATHID2NAME", PATHID2NAME, envir = .NCG_DOSE_Env) 60 | } 61 | 62 | get(".NCG_DOSE_Env", envir
= .DOSEEnv) 63 | } 64 | 65 | 66 | -------------------------------------------------------------------------------- /R/enricher_internal.R: -------------------------------------------------------------------------------- 1 | ##' internal method for enrichment analysis 2 | ##' 3 | ##' using the hypergeometric model 4 | ##' @title enrich.internal 5 | ##' @param gene a vector of Entrez gene IDs. 6 | ##' @param pvalueCutoff Cutoff value of pvalue. 7 | ##' @param pAdjustMethod one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none" 8 | ##' @param universe background genes. If missing, all annotated genes are used; if supplied, it is intersected with the genes that have annotations. 9 | ##' Users can set `options(enrichment_force_universe = TRUE)` to use the supplied 'universe' unchanged. 10 | ##' @param minGSSize minimal size of genes annotated by Ontology term for testing. 11 | ##' @param maxGSSize maximal size of each geneSet for analyzing 12 | ##' @param qvalueCutoff cutoff of qvalue 13 | ##' @param USER_DATA ontology information 14 | ##' @return A \code{enrichResult} instance. 
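##' @details
##' A minimal sketch of the test, using made-up counts rather than package data:
##' with \code{N} annotated background genes, \code{M} of them in a term,
##' \code{n} annotated query genes and \code{k} query genes in the term, the
##' p-value is the upper tail P(X >= k) of the hypergeometric distribution,
##' which can be obtained as \code{phyper(k - 1, M, N - M, n, lower.tail = FALSE)}.
##' @examples
##' ## Toy counts (hypothetical, for illustration only; not package data)
##' N <- 10; M <- 3; n <- 2; k <- 2
##' ## P(X >= k): both query genes fall in the term,
##' ## i.e. choose(3, 2) / choose(10, 2) = 1/15
##' p <- stats::phyper(k - 1, M, N - M, n, lower.tail = FALSE)
##' all.equal(p, 1 / 15)
##' ## fold enrichment defined as (k / M) * N / n
##' (k / M) * N / n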
15 | ##' @importClassesFrom methods data.frame 16 | ##' @importFrom qvalue qvalue 17 | ##' @importFrom methods new 18 | ##' @importFrom stats phyper 19 | ##' @importFrom stats p.adjust 20 | ##' @keywords manip 21 | ##' @author Guangchuang Yu \url{https://yulab-smu.top} 22 | enricher_internal <- function(gene, 23 | pvalueCutoff, 24 | pAdjustMethod="BH", 25 | universe = NULL, 26 | minGSSize=10, 27 | maxGSSize=500, 28 | qvalueCutoff=0.2, 29 | USER_DATA){ 30 | 31 | ## query external ID to Term ID 32 | gene <- as.character(unique(gene)) 33 | qExtID2TermID <- EXTID2TERMID(gene, USER_DATA) 34 | qTermID <- unlist(qExtID2TermID) 35 | if (is.null(qTermID)) { 36 | message("--> No gene can be mapped....") 37 | if (inherits(USER_DATA, "environment")) { 38 | p2e <- get("PATHID2EXTID", envir=USER_DATA) 39 | sg <- unique(unlist(p2e[1:10])) 40 | } else { 41 | sg <- unique(USER_DATA@gsid2gene$gene[1:100]) 42 | } 43 | sg <- sample(sg, min(length(sg), 6)) 44 | message("--> Expected input gene ID: ", paste0(sg, collapse=',')) 45 | 46 | message("--> return NULL...") 47 | return(NULL) 48 | } 49 | 50 | ## Term ID -- query external ID association list. 
51 | qExtID2TermID.df <- data.frame(extID=rep(names(qExtID2TermID), 52 | times=lapply(qExtID2TermID, length)), 53 | termID=qTermID) 54 | qExtID2TermID.df <- unique(qExtID2TermID.df) 55 | 56 | qTermID2ExtID <- with(qExtID2TermID.df, 57 | split(as.character(extID), as.character(termID))) 58 | 59 | extID <- ALLEXTID(USER_DATA) 60 | if (missing(universe)) 61 | universe <- NULL 62 | if(!is.null(universe)) { 63 | if (is.character(universe)) { 64 | force_universe <- getOption("enrichment_force_universe", FALSE) 65 | if (force_universe) { 66 | extID <- universe 67 | } else { 68 | extID <- intersect(extID, universe) 69 | } 70 | } else { 71 | ## https://github.com/YuLab-SMU/clusterProfiler/issues/217 72 | message("`universe` is not in character and will be ignored...") 73 | } 74 | } 75 | 76 | qTermID2ExtID <- lapply(qTermID2ExtID, intersect, extID) 77 | 78 | ## Term ID annotate query external ID 79 | qTermID <- unique(names(qTermID2ExtID)) 80 | 81 | 82 | termID2ExtID <- TERMID2EXTID(qTermID, USER_DATA) 83 | termID2ExtID <- lapply(termID2ExtID, intersect, extID) 84 | 85 | geneSets <- termID2ExtID 86 | 87 | idx <- get_geneSet_index(termID2ExtID, minGSSize, maxGSSize) 88 | 89 | if (sum(idx) == 0) { 90 | msg <- paste("No gene sets have size between", minGSSize, "and", maxGSSize, "...") 91 | message(msg) 92 | message("--> return NULL...") 93 | return (NULL) 94 | } 95 | 96 | termID2ExtID <- termID2ExtID[idx] 97 | qTermID2ExtID <- qTermID2ExtID[idx] 98 | qTermID <- unique(names(qTermID2ExtID)) 99 | 100 | ## prepare parameter for hypergeometric test 101 | k <- sapply(qTermID2ExtID, length) 102 | k <- k[qTermID] 103 | M <- sapply(termID2ExtID, length) 104 | M <- M[qTermID] 105 | 106 | N <- rep(length(extID), length(M)) 107 | ## n <- rep(length(gene), length(M)) ## those genes that have no annotation should drop. 
108 | n <- rep(length(qExtID2TermID), length(M)) 109 | args.df <- data.frame(numWdrawn=k-1, ## White balls drawn, minus one so the upper tail is P(X >= k) 110 | numW=M, ## White balls 111 | numB=N-M, ## Black balls 112 | numDrawn=n) ## balls drawn 113 | 114 | 115 | ## calculate p-values based on the hypergeometric model 116 | pvalues <- apply(args.df, 1, function(n) 117 | phyper(n[1], n[2], n[3], n[4], lower.tail=FALSE) 118 | ) 119 | 120 | ## gene ratio and background ratio 121 | #GeneRatio <- apply(data.frame(a=k, b=n), 1, function(x) 122 | # paste(x[1], "/", x[2], sep="", collapse="") 123 | # ) 124 | #BgRatio <- apply(data.frame(a=M, b=N), 1, function(x) 125 | # paste(x[1], "/", x[2], sep="", collapse="") 126 | # ) 127 | 128 | GeneRatio <- sprintf("%s/%s", k, n) 129 | BgRatio <- sprintf("%s/%s", M, N) 130 | RichFactor <- k / M 131 | FoldEnrichment <- RichFactor * N / n 132 | 133 | # mu is the mean and sigma the variance of the hypergeometric distribution; the z-score divides by sqrt(sigma) 134 | ## https://en.wikipedia.org/wiki/Hypergeometric_distribution 135 | mu <- M * n / N 136 | sigma <- mu * (N - n) * (N - M) / N / (N-1) 137 | zScore <- (k - mu)/sqrt(sigma) 138 | Over <- data.frame(ID = as.character(qTermID), 139 | GeneRatio = GeneRatio, 140 | BgRatio = BgRatio, 141 | RichFactor = RichFactor, 142 | FoldEnrichment = FoldEnrichment, 143 | zScore = zScore, 144 | pvalue = pvalues, 145 | stringsAsFactors = FALSE) 146 | 147 | p.adj <- p.adjust(Over$pvalue, method=pAdjustMethod) 148 | qobj <- tryCatch(qvalue(p=Over$pvalue, lambda=0.05, pi0.method="bootstrap"), error=function(e) NULL) 149 | 150 | # if (class(qobj) == "qvalue") { 151 | if (inherits(qobj, "qvalue")) { 152 | qvalues <- qobj$qvalues 153 | } else { 154 | qvalues <- NA 155 | } 156 | 157 | geneID <- sapply(qTermID2ExtID, function(i) paste(i, collapse="/")) 158 | geneID <- geneID[qTermID] 159 | Over <- data.frame(Over, 160 | p.adjust = p.adj, 161 | qvalue = qvalues, 162 | geneID = geneID, 163 | Count = k, 164 | stringsAsFactors = FALSE) 165 | 166 | Description <- 
TERM2NAME(qTermID, USER_DATA) 167 | 168 | if (length(qTermID) != length(Description)) { 169 | idx <- qTermID %in% names(Description) 170 | Over <- Over[idx,] 171 | } 172 | Over$Description <- Description 173 | nc <- ncol(Over) 174 | Over <- Over[, c(1,nc, 2:(nc-1))] 175 | 176 | 177 | Over <- Over[order(pvalues),] 178 | 179 | 180 | Over$ID <- as.character(Over$ID) 181 | Over$Description <- as.character(Over$Description) 182 | 183 | row.names(Over) <- as.character(Over$ID) 184 | 185 | x <- new("enrichResult", 186 | result = Over, 187 | pvalueCutoff = pvalueCutoff, 188 | pAdjustMethod = pAdjustMethod, 189 | qvalueCutoff = qvalueCutoff, 190 | gene = as.character(gene), 191 | universe = extID, 192 | geneSets = geneSets, 193 | organism = "UNKNOWN", 194 | keytype = "UNKNOWN", 195 | ontology = "UNKNOWN", 196 | readable = FALSE 197 | ) 198 | if (inherits(USER_DATA, "GSON")) { 199 | if (!is.null(USER_DATA@keytype)) { 200 | x@keytype <- USER_DATA@keytype 201 | } 202 | if (!is.null(USER_DATA@species)) { 203 | x@organism <- USER_DATA@species 204 | } 205 | if (!is.null(USER_DATA@gsname)) { 206 | x@ontology <- gsub(".*;", "", USER_DATA@gsname) 207 | } 208 | } 209 | return (x) 210 | } 211 | 212 | 213 | get_enriched <- function(object) { 214 | 215 | Over <- object@result 216 | 217 | pvalueCutoff <- object@pvalueCutoff 218 | if (length(pvalueCutoff) != 0) { 219 | ## if groupGO result, numeric(0) 220 | Over <- Over[ Over$pvalue <= pvalueCutoff, ] 221 | Over <- Over[ Over$p.adjust <= pvalueCutoff, ] 222 | } 223 | 224 | qvalueCutoff <- object@qvalueCutoff 225 | if (length(qvalueCutoff) != 0) { 226 | if (! 
any(is.na(Over$qvalue))) { 227 | if (length(qvalueCutoff) > 0) 228 | Over <- Over[ Over$qvalue <= qvalueCutoff, ] 229 | } 230 | } 231 | 232 | object@result <- Over 233 | return(object) 234 | } 235 | 236 | 237 | EXTID2TERMID <- function(gene, USER_DATA) { 238 | if (inherits(USER_DATA, "environment")) { 239 | EXTID2PATHID <- get("EXTID2PATHID", envir = USER_DATA) 240 | 241 | qExtID2Path <- EXTID2PATHID[gene] 242 | } else if (inherits(USER_DATA, "GSON")) { 243 | gsid2gene <- USER_DATA@gsid2gene 244 | qExtID2Path <- setNames(lapply(gene, function(x) { 245 | subset(gsid2gene, gsid2gene$gene == x)[["gsid"]] 246 | }), gene) 247 | } else { 248 | stop("not supported") 249 | } 250 | 251 | len <- sapply(qExtID2Path, length) 252 | notZero.idx <- len != 0 253 | qExtID2Path <- qExtID2Path[notZero.idx] 254 | 255 | return(qExtID2Path) 256 | } 257 | 258 | ALLEXTID <- function(USER_DATA) { 259 | if (inherits(USER_DATA, "environment")) { 260 | PATHID2EXTID <- get("PATHID2EXTID", envir = USER_DATA) 261 | res <- unique(unlist(PATHID2EXTID)) 262 | } else if (inherits(USER_DATA, "GSON")) { 263 | gsid2gene <- USER_DATA@gsid2gene 264 | res <- unique(gsid2gene$gene) 265 | } else { 266 | stop("not supported") 267 | } 268 | 269 | return(res) 270 | } 271 | 272 | 273 | TERMID2EXTID <- function(term, USER_DATA) { 274 | if (inherits(USER_DATA, "environment")) { 275 | PATHID2EXTID <- get("PATHID2EXTID", envir = USER_DATA) 276 | res <- PATHID2EXTID[term] 277 | } else if (inherits(USER_DATA, "GSON")) { 278 | gsid2gene <- USER_DATA@gsid2gene 279 | res <- setNames(lapply(term, function(x) { 280 | subset(gsid2gene, gsid2gene$gsid == x)[["gene"]] 281 | }), term) 282 | } else { 283 | stop("not supported") 284 | } 285 | 286 | return(res) 287 | } 288 | 289 | TERM2NAME <- function(term, USER_DATA) { 290 | if (inherits(USER_DATA, "environment")) { 291 | PATHID2NAME <- get("PATHID2NAME", envir = USER_DATA) 292 | #if (is.null(PATHID2NAME) || is.na(PATHID2NAME)) { 293 | if (is.null(PATHID2NAME) || 
all(is.na(PATHID2NAME))) { 294 | return(as.character(term)) 295 | } 296 | res <- PATHID2NAME[term] 297 | i <- is.na(res) 298 | res[i] <- term[i] 299 | } else if (inherits(USER_DATA, "GSON")) { 300 | gsid2name <- USER_DATA@gsid2name 301 | i <- match(term, gsid2name$gsid) 302 | j <- !is.na(i) 303 | res <- term 304 | res[j] <- gsid2name$name[i[j]] 305 | } else { 306 | res <- as.character(term) 307 | } 308 | 309 | names(res) <- term 310 | return(res) 311 | } 312 | 313 | get_geneSet_index <- function(geneSets, minGSSize, maxGSSize) { 314 | if (is.null(minGSSize) || is.na(minGSSize)) 315 | minGSSize <- 1 316 | if (is.null(maxGSSize) || is.na(maxGSSize)) 317 | maxGSSize <- Inf #.Machine$integer.max 318 | 319 | ## logical index of the gene sets whose size 320 | ## lies within [minGSSize, maxGSSize] 321 | geneSet_size <- sapply(geneSets, length) 322 | idx <- minGSSize <= geneSet_size & geneSet_size <= maxGSSize 323 | return(idx) 324 | } 325 | -------------------------------------------------------------------------------- /R/geneSim.R: -------------------------------------------------------------------------------- 1 | ##' measuring similarities between two gene vectors. 2 | ##' 3 | ##' given two Entrez gene vectors, this function calculates their pairwise similarity. 4 | ##' @title geneSim 5 | ##' @param geneID1 entrez gene vector 6 | ##' @param geneID2 entrez gene vector 7 | ##' @param organism one of "hsa" and "mmu" 8 | ##' @param ont one of "HDO" and "MPO" 9 | ##' @param measure one of "Wang", "Resnik", "Rel", "Jiang", and "Lin". 10 | ##' @param combine One of "max", "avg", "rcmax", "BMA" methods, for combining semantic similarity scores of multiple DO terms associated with gene/protein. 
11 | ##' @return score matrix 12 | ##' @importFrom GOSemSim combineScores 13 | ##' @export 14 | ##' @examples 15 | ##' g <- c("835", "5261","241", "994") 16 | ##' geneSim(g) 17 | ##' @author Guangchuang Yu \url{https://yulab-smu.top} 18 | geneSim <- function(geneID1, 19 | geneID2=NULL, 20 | ont = "HDO", 21 | organism = "hsa", 22 | measure="Wang", 23 | combine="BMA") { 24 | 25 | if (ont == "DO") ont <- 'HDO' 26 | 27 | DOID1 <- lapply(geneID1, gene2DO, organism = organism, ont = ont) 28 | if (is.null(geneID2)) { 29 | geneID2 <- geneID1 30 | DOID2 <- DOID1 31 | } else { 32 | DOID2 <- lapply(geneID2, gene2DO, organism = organism, ont = ont) 33 | } 34 | 35 | m <- length(geneID1) 36 | n <- length(geneID2) 37 | scores <- matrix(NA, nrow=m, ncol=n) 38 | rownames(scores) <- geneID1 39 | colnames(scores) <- geneID2 40 | 41 | for (i in 1:m) { 42 | if (length(geneID1) == length(geneID2) && all(geneID1 == geneID2)) { 43 | nn <- i 44 | flag <- TRUE 45 | } else { 46 | flag <- FALSE 47 | nn <- n 48 | } 49 | 50 | for (j in 1:nn) { 51 | if(any(!is.na(DOID1[[i]])) && any(!is.na(DOID2[[j]]))) { 52 | s <- doseSim(DOID1[[i]], 53 | DOID2[[j]], 54 | measure = measure, 55 | ont = ont 56 | ) 57 | scores[i,j] = combineScores(s, combine) 58 | if (flag == TRUE && j != i) { 59 | scores[j, i] <- scores[i,j] 60 | } 61 | } 62 | } 63 | } 64 | if (nrow(scores) == 1 & ncol(scores) == 1) 65 | scores = as.numeric(scores) 66 | return(scores) 67 | } 68 | -------------------------------------------------------------------------------- /R/gseAnalyzer.R: -------------------------------------------------------------------------------- 1 | gseDisease <- function(geneList, 2 | organism = "hsa", 3 | exponent=1, 4 | minGSSize = 10, 5 | maxGSSize = 500, 6 | eps = 1e-10, 7 | pvalueCutoff=0.05, 8 | pAdjustMethod="BH", 9 | verbose=TRUE, 10 | seed=FALSE, 11 | by = 'fgsea', 12 | ontology, 13 | ...) 
{ 14 | 15 | annoData <- get_anno_data(ontology) 16 | 17 | res <- GSEA_internal(geneList = geneList, 18 | exponent = exponent, 19 | minGSSize = minGSSize, 20 | maxGSSize = maxGSSize, 21 | eps = eps, 22 | pvalueCutoff = pvalueCutoff, 23 | pAdjustMethod = pAdjustMethod, 24 | verbose = verbose, 25 | seed = seed, 26 | USER_DATA = annoData, 27 | by = by, 28 | ...) 29 | 30 | if (is.null(res)) 31 | return(res) 32 | 33 | if (organism == "hsa") { 34 | res@organism <- "Homo sapiens" 35 | } else { 36 | res@organism <- "Mus musculus" 37 | } 38 | res@setType <- ontology 39 | res@keytype <- "ENTREZID" 40 | return(res) 41 | } 42 | 43 | ##' DO Gene Set Enrichment Analysis 44 | ##' 45 | ##' 46 | ##' perform gene set enrichment analysis 47 | ##' @param geneList named numeric vector of gene-level statistics, sorted in decreasing order 48 | ##' @param ont one of "HDO", "HPO" or "MPO" 49 | ##' @param organism one of "hsa" and "mmu" 50 | ##' @param exponent weight of each step 51 | ##' @param minGSSize minimal size of each geneSet for analyzing 52 | ##' @param maxGSSize maximal size of each geneSet for analyzing 53 | ##' @param pvalueCutoff pvalue Cutoff 54 | ##' @param pAdjustMethod p value adjustment method 55 | ##' @param verbose print message or not 56 | ##' @param seed logical, whether to set a seed inside the function to make the result reproducible 57 | ##' @param by one of 'fgsea' or 'DOSE' 58 | ##' @param ... other parameters 59 | ##' @return gseaResult object 60 | ##' @export 61 | ##' @author Yu Guangchuang 62 | ##' @keywords manip 63 | gseDO <- function(geneList, 64 | ont = "HDO", 65 | organism = "hsa", 66 | exponent=1, 67 | minGSSize = 10, 68 | maxGSSize = 500, 69 | pvalueCutoff=0.05, 70 | pAdjustMethod="BH", 71 | verbose=TRUE, 72 | seed=FALSE, 73 | by = 'fgsea', 74 | ...) { 75 | 76 | ## forward `organism` so that the result is labelled with the requested species 77 | gseDisease(geneList = geneList, organism = organism, 78 | exponent = exponent, 79 | minGSSize = minGSSize, 80 | maxGSSize = maxGSSize, 81 | pvalueCutoff = pvalueCutoff, 82 | pAdjustMethod = pAdjustMethod, 83 | verbose = verbose, 84 | seed = seed, 85 | by = by, 86 | ontology = ont, 87 | ...) 
88 | 89 | } 90 | 91 | ##' NCG Gene Set Enrichment Analysis 92 | ##' 93 | ##' 94 | ##' perform gsea analysis 95 | ##' @inheritParams gseDO 96 | ##' @return gseaResult object 97 | ##' @export 98 | ##' @author Yu Guangchuang 99 | ##' @keywords manip 100 | gseNCG <- function(geneList, 101 | exponent=1, 102 | minGSSize = 10, 103 | maxGSSize = 500, 104 | pvalueCutoff=0.05, 105 | pAdjustMethod="BH", 106 | verbose=TRUE, 107 | seed=FALSE, 108 | by = 'fgsea', 109 | ...) { 110 | 111 | 112 | gseDisease(geneList = geneList, 113 | exponent = exponent, 114 | minGSSize = minGSSize, 115 | maxGSSize = maxGSSize, 116 | pvalueCutoff = pvalueCutoff, 117 | pAdjustMethod = pAdjustMethod, 118 | verbose = verbose, 119 | seed = seed, 120 | by = by, 121 | ontology = "NCG", 122 | ...) 123 | 124 | 125 | 126 | } 127 | 128 | ##' DisGeNET Gene Set Enrichment Analysis 129 | ##' 130 | ##' 131 | ##' perform gsea analysis 132 | ##' @inheritParams gseDO 133 | ##' @return gseaResult object 134 | ##' @export 135 | ##' @author Yu Guangchuang 136 | ##' @keywords manip 137 | gseDGN <- function(geneList, 138 | exponent=1, 139 | minGSSize = 10, 140 | maxGSSize = 500, 141 | pvalueCutoff=0.05, 142 | pAdjustMethod="BH", 143 | verbose=TRUE, 144 | seed=FALSE, 145 | by = 'fgsea', 146 | ...) { 147 | 148 | 149 | gseDisease(geneList = geneList, 150 | exponent = exponent, 151 | minGSSize = minGSSize, 152 | maxGSSize = maxGSSize, 153 | pvalueCutoff = pvalueCutoff, 154 | pAdjustMethod = pAdjustMethod, 155 | verbose = verbose, 156 | seed = seed, 157 | by = by, 158 | ontology = "DisGeNET", 159 | ...) 160 | } 161 | -------------------------------------------------------------------------------- /R/gsea.R: -------------------------------------------------------------------------------- 1 | ##' @importFrom fgsea fgsea 2 | GSEA_fgsea <- function(geneList, 3 | exponent, 4 | nPerm, 5 | minGSSize, 6 | maxGSSize, 7 | eps, 8 | pvalueCutoff, 9 | pAdjustMethod, 10 | verbose, 11 | seed=FALSE, 12 | USER_DATA, 13 | ...) 
{ 14 | 15 | if(verbose) { 16 | message("using 'fgsea' for GSEA analysis, please cite Korotkevich et al (2019).\n") 17 | message("preparing geneSet collections...") 18 | } 19 | 20 | geneSets <- getGeneSet(USER_DATA) 21 | if (!check_gene_id(geneList, geneSets)) return(NULL) 22 | 23 | if(verbose) 24 | message("GSEA analysis...") 25 | 26 | if (seed) 27 | set.seed(.Random.seed) 28 | 29 | if(missing(nPerm)){ 30 | tmp_res <- fgsea(pathways=geneSets, 31 | stats=geneList, 32 | minSize=minGSSize, 33 | maxSize=maxGSSize, 34 | eps=eps, 35 | gseaParam=exponent, 36 | nproc = 0, 37 | ...) 38 | } else { 39 | warning("We do not recommend using the nPerm parameter in ", 40 | "current and future releases") 41 | tmp_res <- fgsea(pathways=geneSets, 42 | stats=geneList, 43 | nperm=nPerm, 44 | minSize=minGSSize, 45 | maxSize=maxGSSize, 46 | gseaParam=exponent, 47 | nproc = 0, 48 | ...) 49 | 50 | } 51 | 52 | p.adj <- p.adjust(tmp_res$pval, method=pAdjustMethod) 53 | qvalues <- calculate_qvalue(tmp_res$pval) 54 | 55 | Description <- TERM2NAME(tmp_res$pathway, USER_DATA) 56 | 57 | if(missing(nPerm)){ 58 | params <- list(pvalueCutoff = pvalueCutoff, 59 | eps = eps, 60 | pAdjustMethod = pAdjustMethod, 61 | exponent = exponent, 62 | minGSSize = minGSSize, 63 | maxGSSize = maxGSSize 64 | ) 65 | } else { 66 | params <- list(pvalueCutoff = pvalueCutoff, 67 | nPerm = nPerm, 68 | pAdjustMethod = pAdjustMethod, 69 | exponent = exponent, 70 | minGSSize = minGSSize, 71 | maxGSSize = maxGSSize 72 | ) 73 | } 74 | 75 | 76 | res <- data.frame( 77 | ID = as.character(tmp_res$pathway), 78 | Description = unname(Description), 79 | setSize = tmp_res$size, 80 | enrichmentScore = tmp_res$ES, 81 | NES = tmp_res$NES, 82 | pvalue = tmp_res$pval, 83 | p.adjust = p.adj, 84 | qvalue = qvalues, 85 | stringsAsFactors = FALSE 86 | ) 87 | 88 | res <- res[!is.na(res$pvalue),] 89 | res <- res[ res$pvalue <= pvalueCutoff, ] 90 | res <- res[ res$p.adjust <= pvalueCutoff, ] 91 | idx <- order(res$p.adjust, -abs(res$NES), decreasing 
= FALSE) 92 | res <- res[idx, ] 93 | 94 | if (nrow(res) == 0) { 95 | message("no term enriched under specific pvalueCutoff...") 96 | return( 97 | new("gseaResult", 98 | result = res, 99 | geneSets = geneSets, 100 | geneList = geneList, 101 | params = params, 102 | readable = FALSE 103 | ) 104 | ) 105 | } 106 | 107 | row.names(res) <- res$ID 108 | observed_info <- lapply(geneSets[res$ID], function(gs) 109 | gseaScores(geneSet=gs, 110 | geneList=geneList, 111 | exponent=exponent) 112 | ) 113 | 114 | if (verbose) 115 | message("leading edge analysis...") 116 | 117 | ledge <- leading_edge(observed_info) 118 | 119 | res$rank <- ledge$rank 120 | res$leading_edge <- ledge$leading_edge 121 | res$core_enrichment <- sapply(ledge$core_enrichment, paste0, collapse='/') 122 | 123 | if (verbose) 124 | message("done...") 125 | 126 | new("gseaResult", 127 | result = res, 128 | geneSets = geneSets, 129 | geneList = geneList, 130 | params = params, 131 | readable = FALSE 132 | ) 133 | } 134 | 135 | ##' generic function for gene set enrichment analysis 136 | ##' 137 | ##' 138 | ##' @title GSEA_internal 139 | ##' @param geneList order ranked geneList 140 | ##' @param exponent weight of each step 141 | ##' @param minGSSize minimal size of each geneSet for analyzing 142 | ##' @param maxGSSize maximal size of each geneSet for analyzing 143 | ##' @param eps This parameter sets the boundary for calculating the p value. 144 | ##' @param pvalueCutoff p value Cutoff 145 | ##' @param pAdjustMethod p value adjustment method 146 | ##' @param verbose print message or not 147 | ##' @param seed set seed inside the function to make result reproducible. FALSE by default. 148 | ##' @param USER_DATA annotation data 149 | ##' @param by one of 'fgsea' or 'DOSE' 150 | ##' @param ... 
other parameter 151 | ##' @return gseaResult object 152 | ##' @author Yu Guangchuang 153 | GSEA_internal <- function(geneList, 154 | exponent, 155 | minGSSize, 156 | maxGSSize, 157 | eps, 158 | pvalueCutoff, 159 | pAdjustMethod, 160 | verbose, 161 | seed=FALSE, 162 | USER_DATA, 163 | by="fgsea", 164 | ...) { 165 | 166 | by <- match.arg(by, c("fgsea", "DOSE")) 167 | if (!is.sorted(geneList)) 168 | stop("geneList should be a decreasing sorted vector...") 169 | if (by == 'fgsea') { 170 | .GSEA <- GSEA_fgsea 171 | } else { 172 | .GSEA <- GSEA_DOSE 173 | } 174 | 175 | res <- .GSEA(geneList = geneList, 176 | exponent = exponent, 177 | minGSSize = minGSSize, 178 | maxGSSize = maxGSSize, 179 | eps = eps, 180 | pvalueCutoff = pvalueCutoff, 181 | pAdjustMethod = pAdjustMethod, 182 | verbose = verbose, 183 | seed = seed, 184 | USER_DATA = USER_DATA, 185 | ...) 186 | 187 | res@organism <- "UNKNOWN" 188 | res@setType <- "UNKNOWN" 189 | res@keytype <- "UNKNOWN" 190 | if (inherits(USER_DATA, "GSON")) { 191 | if (!is.null(USER_DATA@keytype)) { 192 | res@keytype <- USER_DATA@keytype 193 | } 194 | if (!is.null(USER_DATA@species)) { 195 | res@organism <- USER_DATA@species 196 | } 197 | if (!is.null(USER_DATA@gsname)) { 198 | res@setType <- gsub(".*;", "", USER_DATA@gsname) 199 | } 200 | } 201 | return(res) 202 | } 203 | 204 | ##' @importFrom utils setTxtProgressBar 205 | ##' @importFrom utils txtProgressBar 206 | ##' @importFrom stats p.adjust 207 | ##' @importFrom BiocParallel bplapply 208 | ##' @importFrom BiocParallel MulticoreParam 209 | ## @importFrom BiocParallel bpisup 210 | ## @importFrom BiocParallel bpstart 211 | ## @importFrom BiocParallel bpstop 212 | ##' @importFrom BiocParallel multicoreWorkers 213 | GSEA_DOSE <- function(geneList, 214 | exponent, 215 | nPerm, 216 | minGSSize, 217 | maxGSSize, 218 | pvalueCutoff, 219 | pAdjustMethod, 220 | verbose, 221 | seed=FALSE, 222 | USER_DATA, 223 | ...) 
{ 224 | 225 | if(verbose) 226 | message("preparing geneSet collections...") 227 | geneSets <- getGeneSet(USER_DATA) 228 | if (!check_gene_id(geneList, geneSets)) return(NULL) 229 | 230 | 231 | selected.gs <- geneSet_filter(geneSets, geneList, minGSSize, maxGSSize) 232 | 233 | if (is.null(selected.gs)) 234 | return(NULL) 235 | 236 | 237 | if (verbose) 238 | message("calculating observed enrichment scores...") 239 | 240 | observed_info <- lapply(selected.gs, function(gs) 241 | gseaScores(geneSet=gs, 242 | geneList=geneList, 243 | exponent=exponent) 244 | ) 245 | observedScore <- sapply(observed_info, function(x) x$ES) 246 | 247 | if (verbose) { 248 | message("calculating permutation scores...") 249 | } 250 | if (seed) { 251 | seeds <- sample.int(nPerm) 252 | } 253 | 254 | ## if (!bpisup()) { 255 | ## bpstart(MulticoreParam(multicoreWorkers(), progressbar=verbose)) 256 | ## on.exit(bpstop()) 257 | ## } 258 | 259 | permScores <- bplapply(1:nPerm, function(i) { 260 | if (seed) 261 | set.seed(seeds[i]) 262 | perm.gseaEScore(geneList, selected.gs, exponent) 263 | }) 264 | 265 | permScores <- do.call("cbind", permScores) 266 | 267 | rownames(permScores) <- names(selected.gs) 268 | 269 | pos.m <- apply(permScores, 1, function(x) mean(x[x >= 0])) 270 | neg.m <- apply(permScores, 1, function(x) abs(mean(x[x < 0]))) 271 | 272 | 273 | normalized_ES <- function(ES, pos.m, neg.m) { 274 | s <- sign(ES) 275 | m <- numeric(length(ES)) 276 | m[s==1] <- pos.m[s==1] 277 | m[s==-1] <- neg.m[s==-1] 278 | ES/m 279 | } 280 | 281 | NES <- normalized_ES(observedScore, pos.m, neg.m) 282 | 283 | permScores <- apply(permScores, 2, normalized_ES, pos.m=pos.m, neg.m=neg.m) 284 | 285 | if (verbose) 286 | message("calculating p values...") 287 | pvals <- sapply(seq_along(observedScore), function(i) { 288 | if( is.na(NES[i]) ) { 289 | NA 290 | } else if ( NES[i] >= 0 ) { 291 | (sum(permScores[i, ] >= NES[i]) +1) / (sum(permScores[i,] >= 0) +1) 292 | } else { # NES[i] < 0 293 | (sum(permScores[i, ] 
<= NES[i]) +1) / (sum(permScores[i,] < 0) +1) 294 | } 295 | 296 | }) 297 | p.adj <- p.adjust(pvals, method=pAdjustMethod) 298 | qvalues <- calculate_qvalue(pvals) 299 | 300 | gs.name <- names(selected.gs) 301 | Description <- TERM2NAME(gs.name, USER_DATA) 302 | 303 | params <- list(pvalueCutoff = pvalueCutoff, 304 | nPerm = nPerm, 305 | pAdjustMethod = pAdjustMethod, 306 | exponent = exponent, 307 | minGSSize = minGSSize, 308 | maxGSSize = maxGSSize 309 | ) 310 | 311 | 312 | res <- data.frame( 313 | ID = as.character(gs.name), 314 | Description = Description, 315 | setSize = sapply(selected.gs, length), 316 | enrichmentScore = observedScore, 317 | NES = NES, 318 | pvalue = pvals, 319 | p.adjust = p.adj, 320 | qvalue = qvalues, 321 | stringsAsFactors = FALSE 322 | ) 323 | 324 | res <- res[!is.na(res$pvalue),] 325 | res <- res[ res$pvalue <= pvalueCutoff, ] 326 | res <- res[ res$p.adjust <= pvalueCutoff, ] 327 | idx <- order(res$p.adjust, -abs(res$NES), decreasing = FALSE) 328 | res <- res[idx, ] 329 | 330 | if (nrow(res) == 0) { 331 | message("no term enriched under specific pvalueCutoff...") 332 | return( 333 | new("gseaResult", 334 | result = res, 335 | geneSets = geneSets, 336 | geneList = geneList, 337 | params = params, 338 | readable = FALSE 339 | ) 340 | ) 341 | } 342 | 343 | row.names(res) <- res$ID 344 | observed_info <- observed_info[res$ID] 345 | 346 | if (verbose) 347 | message("leading edge analysis...") 348 | 349 | ledge <- leading_edge(observed_info) 350 | 351 | res$rank <- ledge$rank 352 | res$leading_edge <- ledge$leading_edge 353 | res$core_enrichment <- sapply(ledge$core_enrichment, paste0, collapse='/') 354 | 355 | 356 | if (verbose) 357 | message("done...") 358 | 359 | new("gseaResult", 360 | result = res, 361 | geneSets = geneSets, 362 | geneList = geneList, 363 | permScores = permScores, 364 | params = params, 365 | readable = FALSE 366 | ) 367 | 368 | } 369 | 370 | 371 | leading_edge <- function(observed_info) { 372 | core_enrichment <- 
lapply(observed_info, function(x) { 373 | runningES <- x$runningES 374 | runningES <- runningES[runningES$position == 1,] 375 | ES <- x$ES 376 | if (ES >= 0) { 377 | i <- which.max(runningES$runningScore) 378 | leading_gene <- runningES$gene[1:i] 379 | } else { 380 | i <- which.min(runningES$runningScore) 381 | leading_gene <- runningES$gene[-c(1:(i-1))] 382 | } 383 | return(leading_gene) 384 | }) 385 | 386 | rank <- sapply(observed_info, function(x) { 387 | runningES <- x$runningES 388 | ES <- x$ES 389 | if (ES >= 0) { 390 | rr <- which.max(runningES$runningScore) 391 | } else { 392 | i <- which.min(runningES$runningScore) 393 | rr <- nrow(runningES) - i + 1 394 | } 395 | return(rr) 396 | }) 397 | 398 | tags <- sapply(observed_info, function(x) { 399 | runningES <- x$runningES 400 | runningES <- runningES[runningES$position == 1,] 401 | ES <- x$ES 402 | if (ES >= 0) { 403 | i <- which.max(runningES$runningScore) 404 | res <- i/nrow(runningES) 405 | } else { 406 | i <- which.min(runningES$runningScore) 407 | res <- (nrow(runningES) - i + 1)/nrow(runningES) 408 | } 409 | return(res) 410 | }) 411 | 412 | ll <- sapply(observed_info, function(x) { 413 | runningES <- x$runningES 414 | ES <- x$ES 415 | if (ES >= 0) { 416 | i <- which.max(runningES$runningScore) 417 | res <- i/nrow(runningES) 418 | } else { 419 | i <- which.min(runningES$runningScore) 420 | res <- (nrow(runningES) - i + 1)/nrow(runningES) 421 | } 422 | return(res) 423 | }) 424 | 425 | N <- nrow(observed_info[[1]]$runningES) 426 | setSize <- sapply(observed_info, function(x) sum(x$runningES$position)) 427 | signal <- tags * (1-ll) * (N / (N - setSize)) 428 | 429 | tags <- paste0(round(tags * 100), "%") 430 | ll <- paste0(round(ll * 100), "%") 431 | signal <- paste0(round(signal * 100), "%") 432 | leading_edge <- paste0('tags=', tags, ", list=", ll, ", signal=", signal) 433 | 434 | res <- list(rank = rank, 435 | tags = tags, 436 | list = ll, 437 | signal = signal, 438 | leading_edge = leading_edge, 439 | 
core_enrichment = core_enrichment) 440 | return(res) 441 | } 442 | 443 | ## GSEA algorithm (Subramanian et al. PNAS 2005) 444 | ## INPUTs to GSEA 445 | ## 1. Expression data set D with N genes and k samples. 446 | ## 2. Ranking procedure to produce Gene List L. 447 | ## Includes a correlation (or other ranking metric) 448 | ## and a phenotype or profile of interest C. 449 | ## 3. An exponent p to control the weight of the step. 450 | ## 4. Independently derived Gene Set S of N_H genes (e.g., a pathway). 451 | ## Enrichment Score ES. 452 | ## 2. Evaluate the fraction of genes in S ("hits") weighted 453 | ## by their correlation and the fraction of genes not in S ("miss") 454 | ## present up to a given position i in L. 455 | gseaScores <- function(geneList, geneSet, exponent=1, fortify=FALSE) { 456 | ################################################################### 457 | ## geneList ## 458 | ## ## 459 | ## 1. Rank order the N genes in D to form L = { g_1, ... , g_N} ## 460 | ## according to the correlation, r(g_j)=r_j, ## 461 | ## of their expression profiles with C. ## 462 | ## ## 463 | ################################################################### 464 | 465 | ################################################################### 466 | ## exponent ## 467 | ## ## 468 | ## An exponent p to control the weight of the step. ## 469 | ## When p = 0, Enrichment Score ( ES(S) ) reduces to ## 470 | ## the standard Kolmogorov-Smirnov statistic. ## 471 | ## When p = 1, we are weighting the genes in S ## 472 | ## by their correlation with C normalized ## 473 | ## by the sum of the correlations over all of the genes in S. ## 474 | ## ## 475 | ################################################################### 476 | 477 | ## genes defined in geneSet should appear in geneList. 
478 | ## this is a must, see https://github.com/GuangchuangYu/DOSE/issues/23 479 | geneSet <- intersect(geneSet, names(geneList)) 480 | 481 | N <- length(geneList) 482 | Nh <- length(geneSet) 483 | 484 | Phit <- Pmiss <- numeric(N) 485 | hits <- names(geneList) %in% geneSet ## logical 486 | 487 | Phit[hits] <- abs(geneList[hits])^exponent 488 | NR <- sum(Phit) 489 | Phit <- cumsum(Phit/NR) 490 | 491 | Pmiss[!hits] <- 1/(N-Nh) 492 | Pmiss <- cumsum(Pmiss) 493 | 494 | runningES <- Phit - Pmiss 495 | 496 | ## ES is the maximum deviation from zero of Phit-Pmiss 497 | max.ES <- max(runningES) 498 | min.ES <- min(runningES) 499 | if( abs(max.ES) > abs(min.ES) ) { 500 | ES <- max.ES 501 | } else { 502 | ES <- min.ES 503 | } 504 | 505 | df <- data.frame(x=seq_along(runningES), 506 | runningScore=runningES, 507 | position=as.integer(hits) 508 | ) 509 | 510 | if(fortify==TRUE) { 511 | return(df) 512 | } 513 | 514 | df$gene = names(geneList) 515 | res <- list(ES=ES, runningES = df) 516 | return(res) 517 | } 518 | 519 | perm.geneList <- function(geneList) { 520 | ## perm.idx <- sample(seq_along(geneList), length(geneList), replace=FALSE) 521 | perm.idx <- sample.int(length(geneList)) 522 | perm.geneList <- geneList 523 | names(perm.geneList) <- names(geneList)[perm.idx] 524 | return(perm.geneList) 525 | } 526 | 527 | perm.gseaEScore <- function(geneList, geneSets, exponent=1) { 528 | geneList <- perm.geneList(geneList) 529 | res <- sapply(1:length(geneSets), function(i) 530 | gseaScores(geneSet=geneSets[[i]], 531 | geneList=geneList, 532 | exponent=exponent)$ES 533 | ) 534 | return(res) 535 | } 536 | 537 | 538 | geneSet_filter <- function(geneSets, geneList, minGSSize, maxGSSize) { 539 | geneSets <- sapply(geneSets, intersect, names(geneList)) 540 | 541 | gs.idx <- get_geneSet_index(geneSets, minGSSize, maxGSSize) 542 | nGeneSet <- sum(gs.idx) 543 | 544 | if ( nGeneSet == 0 ) { 545 | msg <- paste0("No gene set have size between [", minGSSize, ", ", maxGSSize, "]...") 546 | 
message(msg) 547 | message("--> return NULL...") 548 | return(NULL) 549 | } 550 | geneSets[gs.idx] 551 | } 552 | 553 | -------------------------------------------------------------------------------- /R/gsfilter.R: -------------------------------------------------------------------------------- 1 | ##' filter enriched result by gene set size or gene count 2 | ##' 3 | ##' 4 | ##' @title gsfilter 5 | ##' @param x instance of enrichResult or compareClusterResult 6 | ##' @param by one of 'GSSize' or 'Count' 7 | ##' @param min minimal size 8 | ##' @param max maximal size 9 | ##' @importFrom methods is 10 | ##' @return update object 11 | ##' @export 12 | ##' @author Guangchuang Yu 13 | gsfilter <- function(x, by="GSSize", min=NA, max=NA) { 14 | by <- match.arg(by, c("GSSize", "Count")) 15 | if (is(x, "enrichResult")) { 16 | result <- x@result 17 | } else { 18 | result <- x@compareClusterResult 19 | } 20 | 21 | if (by == "GSSize") { 22 | n <- as.numeric(gsub("^(\\d+)/\\d+$",'\\1', as.character(result$BgRatio))) 23 | } else { 24 | n <- result$Count 25 | } 26 | if (!is.na(min)) { 27 | min_lidx <- n >= min 28 | } else { 29 | min_lidx <- TRUE 30 | } 31 | 32 | if (!is.na(max)) { 33 | max_lidx <- n <= max 34 | } else { 35 | max_lidx <- TRUE 36 | } 37 | 38 | idx <- min_lidx & max_lidx 39 | 40 | if (is(x, "enrichResult")) { 41 | x@result <- result[idx,] 42 | } else { 43 | x@compareClusterResult <- result[idx,] 44 | } 45 | return(x) 46 | } 47 | -------------------------------------------------------------------------------- /R/mclusterSim.R: -------------------------------------------------------------------------------- 1 | ##' Pairwise semantic similarity for a list of gene clusters 2 | ##' 3 | ##' 4 | ##' @title mclusterSim 5 | ##' @param clusters A list of gene clusters 6 | ##' @param organism organism 7 | ##' @param ont one of "HDO", "HPO" and "MPO" 8 | ##' @param measure one of "Wang", "Resnik", "Rel", "Jiang", and "Lin". 
9 | ##' @param combine One of "max", "avg", "rcmax", "BMA" methods, for combining semantic similarity scores of multiple DO terms associated with gene/protein. 10 | ##' @return similarity matrix 11 | ##' @importFrom GOSemSim combineScores 12 | ##' @export 13 | ##' @author Guangchuang Yu 14 | ##' @examples 15 | ##' \dontrun{ 16 | ##' cluster1 <- c("835", "5261","241") 17 | ##' cluster2 <- c("578","582") 18 | ##' cluster3 <- c("307", "308", "317") 19 | ##' clusters <- list(a=cluster1, b=cluster2, c=cluster3) 20 | ##' mclusterSim(clusters, measure="Wang") 21 | ##' } 22 | mclusterSim <- function(clusters, 23 | ont = "HDO", 24 | organism = "hsa", 25 | measure="Wang", 26 | combine="BMA") { 27 | if (ont == "DO") ont <- 'HDO' 28 | 29 | cluster_dos <- list() 30 | for (i in seq_along(clusters)) { 31 | cluster_dos[[i]] <- unlist(sapply(clusters[[i]], gene2DO, organism = organism, ont = ont)) ## map genes with the same ontology that doseSim uses below 32 | } 33 | n <- length(clusters) 34 | scores <- matrix(NA, nrow=n, ncol=n) 35 | rownames(scores) <- names(clusters) 36 | colnames(scores) <- names(clusters) 37 | 38 | for (i in seq_along(cluster_dos)) { 39 | do1 <- cluster_dos[[i]] 40 | do1 <- do1[!is.na(do1)] 41 | for (j in 1:i) { 42 | do2 <- cluster_dos[[j]] 43 | do2 <- do2[!is.na(do2)] 44 | if (length(do1) != 0 && length(do2) != 0) { 45 | s <- doseSim(do1, do2, measure = measure, ont = ont) 46 | scores[i,j] <- combineScores(s, combine) 47 | if (i != j) { 48 | scores[j, i] <- scores[i, j] 49 | } 50 | } 51 | } 52 | } 53 | removeRowNA <- apply(!is.na(scores), 1, sum)>0 54 | removeColNA <- apply(!is.na(scores), 2, sum)>0 55 | return(scores[removeRowNA, removeColNA]) 56 | } 57 | -------------------------------------------------------------------------------- /R/parse_ratio.R: -------------------------------------------------------------------------------- 1 | ##' parse character ratio to double value, such as 1/5 to 0.2 2 | ##' 3 | ##' 4 | ##' @title parse_ratio 5 | ##' @param ratio character vector of ratio to parse 6 | ##' @return A numeric vector
(double) of parsed ratio 7 | ##' @export 8 | ##' @author Guangchuang Yu 9 | parse_ratio <- function(ratio) { 10 | ratio <- sub("^\\s*", "", as.character(ratio)) 11 | ratio <- sub("\\s*$", "", ratio) 12 | numerator <- as.numeric(sub("/\\d+$", "", ratio)) 13 | denominator <- as.numeric(sub("^\\d+/", "", ratio)) 14 | return(numerator/denominator) 15 | } 16 | 17 | -------------------------------------------------------------------------------- /R/print.R: -------------------------------------------------------------------------------- 1 | ##' show method for \code{gseaResult} instance 2 | ##' 3 | ##' @name show 4 | ##' @docType methods 5 | ##' @rdname show-methods 6 | ##' 7 | ##' @title show method 8 | ##' @return message 9 | ##' @importFrom methods show 10 | ##' @exportMethod show 11 | ##' @usage show(object) 12 | ##' @author Guangchuang Yu \url{https://yulab-smu.top} 13 | setMethod("show", signature(object="gseaResult"), 14 | function (object){ 15 | params <- object@params 16 | cat("#\n# Gene Set Enrichment Analysis\n#\n") 17 | cat("#...@organism", "\t", object@organism, "\n") 18 | cat("#...@setType", "\t", object@setType, "\n") 19 | kt <- object@keytype 20 | if (kt != "UNKNOWN") { 21 | cat("#...@keytype", "\t", kt, "\n") 22 | } 23 | 24 | cat("#...@geneList", "\t") 25 | str(object@geneList) 26 | cat("#...nPerm", "\t", params$nPerm, "\n") 27 | cat("#...pvalues adjusted by", paste0("'", params$pAdjustMethod, "'"), 28 | paste0("with cutoff <", params$pvalueCutoff), "\n") 29 | cat(paste0("#...", nrow(object@result)), "enriched terms found\n") 30 | str(object@result) 31 | cat("#...Citation\n") 32 | print_citation_msg(object@setType) 33 | } 34 | ) 35 | 36 | 37 | ##' show method for \code{enrichResult} instance 38 | ##' 39 | ##' @name show 40 | ##' @docType methods 41 | ##' @rdname show-methods 42 | ##' 43 | ##' @title show method 44 | ##' @param object A \code{enrichResult} instance. 
45 | ##' @return message 46 | ##' @importFrom utils str 47 | ##' @importFrom methods show 48 | ##' @exportMethod show 49 | ##' @usage show(object) 50 | ##' @author Guangchuang Yu \url{https://yulab-smu.top} 51 | setMethod("show", signature(object="enrichResult"), 52 | function (object){ 53 | 54 | cat("#\n# over-representation test\n#\n") 55 | cat("#...@organism", "\t", object@organism, "\n") 56 | cat("#...@ontology", "\t", object@ontology, "\n") 57 | kt <- object@keytype 58 | if (kt != "UNKNOWN") { 59 | cat("#...@keytype", "\t", kt, "\n") 60 | } 61 | cat("#...@gene", "\t") 62 | str(object@gene) 63 | cat("#...pvalues adjusted by", paste0("'", object@pAdjustMethod, "'"), 64 | paste0("with cutoff <", object@pvalueCutoff), "\n") 65 | object <- get_enriched(object) 66 | n <- nrow(object@result) 67 | cat(paste0("#...", n), "enriched terms found\n") 68 | if (n > 0) str(object@result) 69 | cat("#...Citation\n") 70 | print_citation_msg(object@ontology) 71 | } 72 | ) 73 | 74 | 75 | print_citation_msg <- function(ontology) { 76 | refs <- yulab.utils:::ref_knownledge() 77 | 78 | if (ontology == "HDO" || ontology == "NCG") { 79 | citation_msg <- refs["DOSE"] 80 | } else if (ontology == "Reactome") { 81 | citation_msg <- refs["ReactomePA"] 82 | } else if (ontology == "MeSH") { 83 | citation_msg <- refs["meshes"] 84 | } else { 85 | citation_msg <- refs["clusterProfiler_Innovation2024"] 86 | } 87 | cat(citation_msg, "\n\n") 88 | } 89 | 90 | -------------------------------------------------------------------------------- /R/setReadable.R: -------------------------------------------------------------------------------- 1 | ##' mapping geneID to gene Symbol 2 | ##' 3 | ##' 4 | ##' @title setReadable 5 | ##' @param x enrichResult Object 6 | ##' @param OrgDb OrgDb 7 | ##' @param keyType keyType of gene 8 | ##' @return enrichResult Object 9 | ##' @author Yu Guangchuang 10 | ##' @export 11 | setReadable <- function(x, OrgDb, keyType="auto") { 12 | OrgDb <- load_OrgDb(OrgDb) 13 | if 
(!'SYMBOL' %in% columns(OrgDb)) { 14 | warning("Fail to convert input geneID to SYMBOL since no SYMBOL information available in the provided OrgDb...") 15 | } 16 | 17 | if (!(is(x, "enrichResult") || is(x, "groupGOResult") || is(x, "gseaResult") || is(x,"compareClusterResult"))) 18 | stop("input should be an 'enrichResult' , 'gseaResult' or 'compareClusterResult' object...") 19 | 20 | isGSEA <- FALSE 21 | isCompare <- FALSE 22 | if (is(x, 'gseaResult')) 23 | isGSEA <- TRUE 24 | 25 | if (is(x, 'compareClusterResult')) 26 | isCompare <- TRUE 27 | 28 | if (keyType == "auto") { 29 | keyType <- x@keytype 30 | if (keyType == 'UNKNOWN') { 31 | stop("can't determine keyType automatically; need to set 'keyType' explicitly...") 32 | } 33 | } 34 | 35 | if (x@readable) 36 | return(x) 37 | 38 | gc <- geneInCategory(x) 39 | if (isGSEA) { 40 | genes <- names(x@geneList) 41 | } else if (isCompare) { 42 | if ("core_enrichment" %in% colnames(as.data.frame(x))) { 43 | geneslist <- x@geneClusters 44 | names(geneslist) <- NULL 45 | genes <- unique(names(unlist(geneslist))) 46 | } else { 47 | genes <- unique(unlist(x@geneClusters)) 48 | } 49 | } else { 50 | genes <- x@gene 51 | } 52 | 53 | gn <- EXTID2NAME(OrgDb, genes, keyType) 54 | 55 | 56 | if(isCompare) { 57 | gc2 <- list() 58 | k <- 1 59 | for(i in seq_len(length(gc))) { 60 | for(j in seq_len(length(gc[[i]]))) { 61 | gc2[[k]] <- gc[[i]][[j]] 62 | names(gc2)[k] <- paste(names(gc)[[i]], names(gc[[i]])[j], sep="-") 63 | k <- k + 1 64 | } 65 | } 66 | gc <- gc2 67 | gc <- lapply(gc, function(i) gn[i]) 68 | res <- x@compareClusterResult 69 | gc <- gc[paste(res$Cluster, res$ID, sep= "-")] 70 | } else { 71 | gc <- lapply(gc, function(i) gn[i]) 72 | res <- x@result 73 | gc <- gc[as.character(res$ID)] 74 | } 75 | 76 | ## names(gc) should be identical to res$ID 77 | 78 | ## gc <- gc[as.character(res$ID)] 79 | 80 | 81 | geneID <- sapply(gc, paste0, collapse="/") 82 | # if (isGSEA) { 83 | if ("core_enrichment" %in% colnames(as.data.frame(x))) { 
84 | res$core_enrichment <- unlist(geneID) 85 | } else { 86 | res$geneID <- unlist(geneID) 87 | } 88 | x@gene2Symbol <- gn 89 | x@keytype <- keyType 90 | x@readable <- TRUE 91 | if(isCompare){ 92 | x@compareClusterResult <- res 93 | } else { 94 | x@result <- res 95 | } 96 | 97 | 98 | return(x) 99 | } 100 | 101 | 102 | # geneInCategory2 <- function(x){ 103 | # setNames(strsplit(geneID(x), "/", fixed=TRUE), 104 | # paste(x@compareClusterResult$Cluster, 105 | # x@compareClusterResult$ID, sep= "-")) 106 | # } 107 | -------------------------------------------------------------------------------- /R/simplot.R: -------------------------------------------------------------------------------- 1 | ##' plotting similarity matrix 2 | ##' 3 | ##' 4 | ##' @title simplot 5 | ##' @param sim similarity matrix 6 | ##' @param xlab xlab 7 | ##' @param ylab ylab 8 | ##' @param color.low color of low value 9 | ##' @param color.high color of high value 10 | ##' @param labs logical, add text label or not 11 | ##' @param digits number of decimal places used when rounding the labels 12 | ##' @param labs.size label size 13 | ##' @param font.size font size 14 | ##' @return ggplot object 15 | ##' @importFrom ggplot2 ggplot 16 | ##' @importFrom ggplot2 aes 17 | ##' @importFrom ggplot2 geom_tile 18 | ##' @importFrom ggplot2 geom_text 19 | ##' @importFrom ggplot2 scale_fill_gradient 20 | ##' @importFrom ggplot2 scale_x_discrete 21 | ##' @importFrom ggplot2 scale_y_discrete 22 | ##' @importFrom ggplot2 theme 23 | ##' @importFrom ggplot2 element_blank 24 | ##' @importFrom ggplot2 element_text 25 | ##' @importFrom ggplot2 xlab 26 | ##' @importFrom ggplot2 ylab 27 | ##' @importFrom reshape2 melt 28 | ##' @export 29 | ##' @author Yu Guangchuang 30 | simplot <- function(sim, xlab="", ylab="", color.low="white", color.high="red", labs=TRUE, digits=2, labs.size=3, font.size=14){ 31 | sim.df <- as.data.frame(sim) 32 | ## if(readable == TRUE) { 33 | ##
rownames(sim.df) <- TERM2NAME(rownames(sim.df)) 34 | ## colnames(sim.df) <- TERM2NAME(colnames(sim.df)) 35 | ## } 36 | rn <- row.names(sim.df) 37 | 38 | sim.df <- cbind(ID=rownames(sim.df), sim.df) 39 | sim.df <- melt(sim.df) 40 | 41 | sim.df[,1] <- factor(sim.df[,1], levels=rev(rn)) 42 | if (labs == TRUE) { 43 | ## lbs <- c(apply(round(sim, digits), 2, as.character)) 44 | sim.df$label <- as.character(round(sim.df$value, digits)) 45 | } 46 | variable <- ID <- value <- label <- NULL ## to satisfy codetools 47 | if (labs == TRUE) 48 | p <- ggplot(sim.df, aes(variable, ID, fill=value, label=label)) 49 | else 50 | p <- ggplot(sim.df, aes(variable, ID, fill=value)) 51 | 52 | p <- p + geom_tile(color="black")+ 53 | scale_fill_gradient(low=color.low, high=color.high) + 54 | scale_x_discrete(expand=c(0,0)) + 55 | scale_y_discrete(expand=c(0,0))+ 56 | theme(axis.ticks=element_blank()) 57 | if (labs == TRUE) 58 | p <- p+geom_text(size=labs.size) 59 | p <- p+theme_dose(font.size) 60 | p <- p + theme(axis.text.x=element_text(hjust=0, angle=-90)) + 61 | theme(axis.text.y=element_text(hjust=0)) 62 | p <- p+theme(legend.title=element_blank()) 63 | ##geom_point(aes(size=value)) 64 | p <- p+xlab(xlab)+ylab(ylab) 65 | 66 | ## if (readable == TRUE) { 67 | ## p <- p + theme(axis.text.y = element_text(hjust=1)) 68 | ## } 69 | p <- p + theme(axis.text.x = element_text(vjust=0.5)) 70 | return(p) 71 | } 72 | 73 | 74 | ##' ggplot theme of DOSE 75 | ##' 76 | ##' @title theme_dose 77 | ##' @param font.size font size 78 | ##' @return ggplot theme 79 | ##' @importFrom ggplot2 theme_bw 80 | ##' @importFrom ggplot2 theme 81 | ##' @importFrom ggplot2 element_text 82 | ##' @importFrom ggplot2 margin 83 | ##' @examples 84 | ##' library(ggplot2) 85 | ##' qplot(1:10) + theme_dose() 86 | ##' @export 87 | theme_dose <- function(font.size=14) { 88 | theme_bw() + 89 | theme(axis.text.x = element_text(colour = "black", 90 | size = font.size, vjust =1 ), 91 | axis.text.y = element_text(colour = "black", 92 
| size = font.size, hjust =1 ), 93 | axis.title = element_text(margin=margin(10, 5, 0, 0), 94 | color = "black", 95 | size = font.size), 96 | axis.title.y = element_text(angle=90) 97 | ) 98 | } 99 | -------------------------------------------------------------------------------- /R/utilities.R: -------------------------------------------------------------------------------- 1 | get_dose_env <- function() { 2 | if (!exists(".DOSEEnv")) { 3 | .initial() 4 | } 5 | get(".DOSEEnv") 6 | } 7 | 8 | .initial <- function() { 9 | pos <- 1 10 | envir <- as.environment(pos) 11 | assign(".DOSEEnv", new.env(), envir = envir) 12 | } 13 | 14 | # https://github.com/YuLab-SMU/ReactomePA/issues/43 15 | check_gene_id <- function(geneList, geneSets) { 16 | if (all(!names(geneList) %in% unique(unlist(geneSets)))) { 17 | sg <- unlist(geneSets[1:10]) 18 | sg <- sample(sg, min(length(sg), 6)) 19 | message("--> Expected input gene ID: ", paste0(sg, collapse=',')) 20 | message("--> No gene can be mapped....") 21 | return(FALSE) 22 | } 23 | return(TRUE) 24 | } 25 | 26 | 27 | ## @importFrom S4Vectors metadata 28 | get_organism <- function(OrgDb) { 29 | OrgDb <- load_OrgDb(OrgDb) 30 | ## md <- S4Vectors::metadata(OrgDb) 31 | ## md[md[,1] == "ORGANISM", 2] 32 | AnnotationDbi::species(OrgDb) 33 | } 34 | 35 | 36 | calculate_qvalue <- function(pvals) { 37 | if (length(pvals) == 0) 38 | return(numeric(0)) 39 | 40 | qobj <- tryCatch(qvalue(pvals, lambda=0.05, pi0.method="bootstrap"), error=function(e) NULL) 41 | 42 | # if (class(qobj) == "qvalue") { 43 | if (inherits(qobj, "qvalue")) { 44 | qvalues <- qobj$qvalues 45 | } else { 46 | qvalues <- NA 47 | } 48 | return(qvalues) 49 | } 50 | 51 |
67 | ##' compute information content 68 | ##' 69 | ##' 70 | ##' @title compute information content 71 | ##' @param ont one of "HDO", "HPO" and "MPO" 72 | ##' @return named numeric vector of information content, one value per ontology term 73 | ##' @importMethodsFrom AnnotationDbi toTable 74 | ##' @author Guangchuang Yu \url{https://yulab-smu.top} 75 | computeIC <- function(ont="HDO"){ 76 | DO2EG <- get_ont2gene(ont) 77 | Offsprings <- GOSemSim:::getOffsprings(ont) 78 | 79 | docount <- unlist(lapply(DO2EG, length)) 80 | doids <- names(docount) 81 | 82 | cnt <- docount[doids] + sapply(doids, function(i) sum(docount[Offsprings[[i]]], na.rm=TRUE)) 83 | names(cnt) <- doids 84 | p <- cnt/sum(docount) 85 | 86 | ## IC of DO terms was quantified as the negative log likelihood. 87 | IC <- -log(p) 88 | return(IC) 89 | } 90 | 91 | 92 | ##' convert a gene ID to its associated DO terms 93 | ##' 94 | ##' 95 | ##' @title convert Gene ID to DO Terms 96 | ##' @param gene entrez gene ID 97 | ##' @param organism organism 98 | ##' @param ont ont 99 | ##' @return DO Terms 100 | ##' @importMethodsFrom AnnotationDbi get 101 | ##' @importMethodsFrom AnnotationDbi exists 102 | ##' @export 103 | ##' @author Guangchuang Yu \url{https://yulab-smu.top} 104 | gene2DO <- function(gene, organism = "hsa", ont = "HDO") { 105 | gene <- as.character(gene) 106 | 107 | EG2DO <- get_gene2ont(ont) 108 | 109 | DO <- EG2DO[[gene]] 110 | DO <- unlist(DO) 111 | if (is.null(DO)) { 112 | return(NA) 113 | } 114 | if (sum(!is.na(DO)) == 0) { 115 | return(NA) 116 | } 117 | DO <- DO[!is.na(DO)] 118 | if (length(DO) == 0) { 119 | return(NA) 120 | } 121 | return(DO) 122 | } 123 | 124 | process_tcss <- getFromNamespace("process_tcss", "GOSemSim") 125 | 126 | ##' @importClassesFrom GOSemSim GOSemSimDATA 127 | semdata <- function(processTCSS = FALSE, ont = "HDO") { 128 | IC <- new("GOSemSimDATA", 129 | ont = ont, 130 | IC = computeIC(ont = ont)) 131 | 132 | if (processTCSS) { 133 | ## pass the IC vector but keep `IC` as the GOSemSimDATA object, so the slot assignment is valid
134 | IC@tcssdata <- process_tcss(ont = ont, IC = IC@IC, cutoff = NULL) 135 | } 136 | 137 | IC 138 | } 139 | 140 | semdata2 <- memoise::memoise(semdata) 141 | 142 | 143 | get_ont2gene <- function(ontology, output = "list") { 144 | gene2ont <- get_gene2ont(ontology, output = "data.frame") 145 | if (output == "data.frame") { 146 | return(gene2ont[, 2:1]) 147 | } 148 | 149 | split(as.character(gene2ont[,1]), as.character(gene2ont[,2])) 150 | } 151 | 152 | get_gene2ont <- function(ontology, output = "list") { 153 | ont2gene <- GOSemSim:::get_onto_data(ontology, table = "ont2gene", output = 'data.frame') 154 | anc <- GOSemSim:::getAncestors(ontology) 155 | idx <- ont2gene[,1] %in% names(anc) 156 | ont2gene <- unique(ont2gene[idx, ]) 157 | 158 | if (output == "data.frame") { 159 | return(ont2gene[, 2:1]) 160 | } 161 | 162 | split(as.character(ont2gene[,1]), as.character(ont2gene[,2])) 163 | } 164 | 165 | get_gene2allont <- function(ontology, output = "list") { 166 | GOSemSim:::get_onto_data(ontology, table = "gene2allont", output = output) 167 | } 168 | 169 | get_ont2allgene <- function(ontology, output = "list") { 170 | gene2allont <- GOSemSim:::get_onto_data(ontology, table = "gene2allont", output = "data.frame") 171 | if (output == "data.frame") { 172 | return(gene2allont[, 2:1]) 173 | } 174 | 175 | split(as.character(gene2allont[,1]), as.character(gene2allont[,2])) 176 | } 177 | 178 | ## ##' get all entrezgene ID of a specific organism 179 | ## ##' 180 | ## ##' 181 | ## ##' @title getALLEG 182 | ## ##' @param organism species 183 | ## ##' @return entrez gene ID vector 184 | ## ##' @export 185 | ## ##' @author Yu Guangchuang 186 | ## getALLEG <- function(organism) { 187 | ## annoDb <- getDb(organism) 188 | ## require(annoDb, character.only = TRUE) 189 | ## annoDb <- eval(parse(text=annoDb)) 190 | ## eg=keys(annoDb, keytype="ENTREZID") 191 | ## return(eg) 192 | ## } 193 | 194 | 195 | ##' mapping gene ID to gene Symbol 196 | ##' 197 | ##' 198 | ##' @title EXTID2NAME 199 |
##' @param OrgDb OrgDb 200 | ##' @param geneID entrez gene ID 201 | ##' @param keytype keytype 202 | ##' @return gene symbol 203 | ##' @importMethodsFrom AnnotationDbi select 204 | ##' @importMethodsFrom AnnotationDbi keys 205 | ##' @importMethodsFrom AnnotationDbi columns 206 | ##' @importMethodsFrom AnnotationDbi keytypes 207 | ##' @importFrom GOSemSim load_OrgDb 208 | ##' @export 209 | ##' @author Guangchuang Yu \url{https://yulab-smu.top} 210 | EXTID2NAME <- function(OrgDb, geneID, keytype) { 211 | OrgDb <- load_OrgDb(OrgDb) 212 | kt <- keytypes(OrgDb) 213 | if (! keytype %in% kt) { 214 | stop("keytype is not supported...") 215 | } 216 | 217 | gn.df <- suppressMessages(select(OrgDb, keys=geneID, keytype=keytype, columns="SYMBOL")) 218 | gn.df <- unique(gn.df) 219 | colnames(gn.df) <- c("GeneID", "SYMBOL") 220 | 221 | unmap_geneID <- geneID[!geneID %in% gn.df$GeneID] 222 | if (length(unmap_geneID) != 0) { 223 | unmap_geneID.df = data.frame(GeneID = unmap_geneID, 224 | SYMBOL = unmap_geneID) 225 | gn.df <- rbind(gn.df, unmap_geneID.df) 226 | } 227 | 228 | gn <- gn.df$SYMBOL 229 | names(gn) <- gn.df$GeneID 230 | return(gn) 231 | } 232 | 233 | ## EXTID2NAME <- function(geneID, organism) { 234 | ## if (length(geneID) == 0) { 235 | ## return("") 236 | ## } 237 | ## if (organism == "worm") { 238 | ## organism = "celegans" 239 | ## warning("'worm' is deprecated, please use 'celegans' instead...") 240 | ## } 241 | ## organism <- organismMapper(organism) 242 | 243 | ## supported_Org <- getSupported_Org() 244 | ## if (organism %in% supported_Org) { 245 | ## ## kk <- getALLEG(organism) 246 | ## ## unmap_geneID <- geneID[! 
geneID %in% kk] 247 | ## ## map_geneID <- geneID[geneID %in% kk] 248 | 249 | ## ## if (length(map_geneID) == 0) { 250 | ## ## warning("the input geneID is not entrezgeneID, and cannot be mapped") 251 | ## ## names(geneID) <- geneID 252 | ## ## return (geneID) 253 | ## ## } 254 | ## annoDb <- getDb(organism) 255 | ## require(annoDb, character.only = TRUE) 256 | ## annoDb <- eval(parse(text=annoDb)) 257 | ## if (organism == "yeast" || organism == "malaria") { 258 | ## gn.df <- select(annoDb, keys=geneID,keytype="ORF", columns="GENENAME") 259 | ## } else if (organism == "arabidopsis") { 260 | ## gn.df <- select(annoDb, keys=geneID,keytype="TAIR", columns="SYMBOL") 261 | ## } else { 262 | ## gn.df <- select(annoDb, keys=geneID,keytype="ENTREZID", columns="SYMBOL") 263 | ## } 264 | ## gn.df <- unique(gn.df) 265 | ## colnames(gn.df) <- c("ENTREZID", "SYMBOL") 266 | 267 | ## unmap_geneID <- geneID[!geneID %in% gn.df$ENTREZID] 268 | ## if (length(unmap_geneID) != 0) { 269 | ## unmap_geneID.df = data.frame(ENTREZID= unmap_geneID, SYMBOL=unmap_geneID) 270 | ## gn.df <- rbind(gn.df, unmap_geneID.df) 271 | ## } 272 | 273 | ## gn <- gn.df$SYMBOL 274 | ## names(gn) <- gn.df$ENTREZID 275 | ## ##gn <- unique(gn[!is.na(gn)]) 276 | ## } else { 277 | ## oldwd <- getwd() 278 | ## if(organism == "D39") { 279 | ## dir <- system.file("extdata/D39/", package="clusterProfiler") 280 | ## setwd(dir) 281 | ## } 282 | ## if(organism == "M5005") { 283 | ## dir <- system.file("extdata/M5005/", package="clusterProfiler") 284 | ## setwd(dir) 285 | ## } 286 | 287 | ## if (file.exists("geneTable.rda")) { 288 | ## geneTable <- NULL # to satisfy codetools 289 | ## load("geneTable.rda") 290 | ## idx <- geneTable$GeneID %in% geneID 291 | ## eg.gn <- geneTable[idx, c("GeneID", "GeneName", "Locus")] 292 | ## eg.gn[eg.gn[,2] == "-",2] <- eg.gn[eg.gn[,2] == "-",3] 293 | ## ##eg.gn <- eg.gn[,c(1,2)] 294 | ## gn <- eg.gn$GeneName 295 | ## names(gn) <- as.character(eg.gn$GeneID) 296 | ## setwd(oldwd) 297 | ## 
} else { 298 | ## setwd(oldwd) 299 | ## warning("Have no annotation found for the input geneID") 300 | ## return(geneID) 301 | ## } 302 | ## } 303 | ## return(gn) 304 | ## } 305 | 306 | 307 | 308 | is.sorted <- function(x, decreasing=TRUE) { 309 | all( sort(x, decreasing=decreasing) == x ) 310 | } 311 | 312 | getGeneSet <- function(USER_DATA) { 313 | if (inherits(USER_DATA, "environment")) { 314 | res <- get("PATHID2EXTID", envir = USER_DATA) 315 | } else if (inherits(USER_DATA, "GSON")) { 316 | gsid2gene <- USER_DATA@gsid2gene 317 | res <- split(gsid2gene$gene, gsid2gene$gsid) 318 | } else { 319 | stop("not supported") 320 | } 321 | return(res) 322 | } 323 | 324 | 325 | ##' @importFrom ggplot2 facet_grid 326 | ##' @export 327 | ggplot2::facet_grid 328 | -------------------------------------------------------------------------------- /R/zzz.R: -------------------------------------------------------------------------------- 1 | ##' @importFrom yulab.utils yulab_msg 2 | .onAttach <- function(libname, pkgname) { 3 | packageStartupMessage(yulab_msg(pkgname)) 4 | 5 | .initial() 6 | } 7 | -------------------------------------------------------------------------------- /README.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | output: 3 | md_document: 4 | variant: gfm 5 | html_preview: false 6 | --- 7 | 8 | 9 | ```{r echo=FALSE, results="hide", message=FALSE} 10 | library("badger") 11 | library("yulab.utils") 12 | ``` 13 | 14 | 15 | # DOSE: Disease Ontology Semantic and Enrichment analysis 16 | 17 | `r badge_bioc_release("DOSE", "green")` 18 | `r badge_devel("guangchuangyu/DOSE", "green")` 19 | [![Bioc](http://www.bioconductor.org/shields/years-in-bioc/DOSE.svg)](https://www.bioconductor.org/packages/devel/bioc/html/DOSE.html#since) 20 | [![codecov](https://codecov.io/gh/GuangchuangYu/DOSE/branch/master/graph/badge.svg)](https://codecov.io/gh/GuangchuangYu/DOSE/) 21 | 22 | 23 | 24 | [![Project Status: Active - The project has 
reached a stable, usable state and is being actively developed.](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active) 25 | [![platform](http://www.bioconductor.org/shields/availability/devel/DOSE.svg)](https://www.bioconductor.org/packages/devel/bioc/html/DOSE.html#archives) 26 | [![Build Status](http://www.bioconductor.org/shields/build/devel/bioc/DOSE.svg)](https://bioconductor.org/checkResults/devel/bioc-LATEST/DOSE/) 27 | `r badge_bioc_download("DOSE", "total", "blue")` 28 | `r badge_bioc_download("DOSE", "month", "blue")` 29 | 30 | 31 | ```{r comment="", echo=FALSE, results='asis'} 32 | cat(packageDescription('DOSE')$Description) 33 | ``` 34 | 35 | 36 | ## :writing_hand: Authors 37 | 38 | Guangchuang YU 39 | 40 | School of Basic Medical Sciences, Southern Medical University 41 | 42 | 43 | Learn more at . 44 | 45 | Please cite the following article when using `DOSE`: 46 | 47 | ```{r comment="", echo=FALSE, results='asis'} 48 | cat(yulab.utils:::ref_knownledge()["DOSE"], ".\n", sep="") 49 | ``` 50 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # DOSE: Disease Ontology Semantic and Enrichment analysis 2 | 3 | [![](https://img.shields.io/badge/release%20version-3.30.2-green.svg)](https://www.bioconductor.org/packages/DOSE) 4 | [![](https://img.shields.io/badge/devel%20version-3.31.3-green.svg)](https://github.com/guangchuangyu/DOSE) 5 | [![Bioc](http://www.bioconductor.org/shields/years-in-bioc/DOSE.svg)](https://www.bioconductor.org/packages/devel/bioc/html/DOSE.html#since) 6 | [![codecov](https://codecov.io/gh/GuangchuangYu/DOSE/branch/master/graph/badge.svg)](https://codecov.io/gh/GuangchuangYu/DOSE/) 7 | 8 | [![Project Status: Active - The project has reached a stable, usable 9 | state and is being actively 10 | 
developed.](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active) 11 | [![platform](http://www.bioconductor.org/shields/availability/devel/DOSE.svg)](https://www.bioconductor.org/packages/devel/bioc/html/DOSE.html#archives) 12 | [![Build 13 | Status](http://www.bioconductor.org/shields/build/devel/bioc/DOSE.svg)](https://bioconductor.org/checkResults/devel/bioc-LATEST/DOSE/) 14 | [![](https://img.shields.io/badge/download-836274/total-blue.svg)](https://bioconductor.org/packages/stats/bioc/DOSE) 15 | [![](https://img.shields.io/badge/download-20017/month-blue.svg)](https://bioconductor.org/packages/stats/bioc/DOSE) 16 | 17 | This package implements five methods proposed by Resnik, Schlicker, 18 | Jiang, Lin and Wang respectively for measuring semantic similarities 19 | among DO terms and gene products. Enrichment analyses including 20 | hypergeometric model and gene set enrichment analysis are also 21 | implemented for discovering disease associations of high-throughput 22 | biological data. 23 | 24 | ## :writing_hand: Authors 25 | 26 | Guangchuang YU 27 | 28 | School of Basic Medical Sciences, Southern Medical University 29 | 30 | Learn more at . 31 | 32 | Please cite the following article when using `DOSE`: 33 | 34 | Guangchuang Yu, Li-Gen Wang, Guang-Rong Yan, Qing-Yu He. DOSE: an 35 | R/Bioconductor package for Disease Ontology Semantic and Enrichment 36 | analysis. Bioinformatics. 2015, 31(4):608-609. 
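As a rough illustration of the hypergeometric model mentioned in the package description above, the following base-R sketch computes an over-representation p-value for a single disease term. The counts are invented for the example and the variable names are not part of the DOSE API. The package's own enrichment functions also handle annotation lookup, gene set size filtering and multiple-testing correction, all of which are left out here.

```r
## Minimal sketch of a hypergeometric over-representation test (base R only).
## All counts are made-up illustration values, not output from DOSE.
N <- 1000  # background genes that carry any annotation
M <- 40    # background genes annotated to the disease term
n <- 50    # genes in the query list
k <- 8     # query genes annotated to the disease term

## P(X >= k): chance of drawing at least k annotated genes in n draws
pvalue <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)

## Ratios written in "count/total" text form
gene_ratio <- paste0(k, "/", n)
bg_ratio <- paste0(M, "/", N)

## How many times more frequent the term is in the query than in the background
fold_enrichment <- (k / n) / (M / N)
```

Passing `k - 1` rather than `k` makes the upper tail inclusive, so the result is the probability of observing `k` or more annotated genes. The "count/total" strings follow the same text form that the package's `parse_ratio` helper converts back to a number.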
37 | -------------------------------------------------------------------------------- /data/DGN_EXTID2PATHID.rda: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YuLab-SMU/DOSE/34a63655d1c24c4e855a669e61880187a28a7a1a/data/DGN_EXTID2PATHID.rda -------------------------------------------------------------------------------- /data/DGN_PATHID2EXTID.rda: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YuLab-SMU/DOSE/34a63655d1c24c4e855a669e61880187a28a7a1a/data/DGN_PATHID2EXTID.rda -------------------------------------------------------------------------------- /data/DGN_PATHID2NAME.rda: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YuLab-SMU/DOSE/34a63655d1c24c4e855a669e61880187a28a7a1a/data/DGN_PATHID2NAME.rda -------------------------------------------------------------------------------- /data/NCG_EXTID2PATHID.rda: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YuLab-SMU/DOSE/34a63655d1c24c4e855a669e61880187a28a7a1a/data/NCG_EXTID2PATHID.rda -------------------------------------------------------------------------------- /data/NCG_PATHID2EXTID.rda: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YuLab-SMU/DOSE/34a63655d1c24c4e855a669e61880187a28a7a1a/data/NCG_PATHID2EXTID.rda -------------------------------------------------------------------------------- /data/NCG_PATHID2NAME.rda: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YuLab-SMU/DOSE/34a63655d1c24c4e855a669e61880187a28a7a1a/data/NCG_PATHID2NAME.rda -------------------------------------------------------------------------------- /data/VDGN_EXTID2PATHID.rda: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/YuLab-SMU/DOSE/34a63655d1c24c4e855a669e61880187a28a7a1a/data/VDGN_EXTID2PATHID.rda -------------------------------------------------------------------------------- /data/VDGN_PATHID2EXTID.rda: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YuLab-SMU/DOSE/34a63655d1c24c4e855a669e61880187a28a7a1a/data/VDGN_PATHID2EXTID.rda -------------------------------------------------------------------------------- /data/VDGN_PATHID2NAME.rda: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YuLab-SMU/DOSE/34a63655d1c24c4e855a669e61880187a28a7a1a/data/VDGN_PATHID2NAME.rda -------------------------------------------------------------------------------- /data/geneList.rda: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YuLab-SMU/DOSE/34a63655d1c24c4e855a669e61880187a28a7a1a/data/geneList.rda -------------------------------------------------------------------------------- /inst/CITATION: -------------------------------------------------------------------------------- 1 | citHeader("Please cite G. Yu (2015) for using DOSE. In addition, please cite G. Yu (2012) when using compareCluster in clusterProfiler package, G. Yu (2015) when applying enrichment analysis to NGS data by using ChIPseeker and G. 
Yu (2010) when using GOSemSim for GO semantic similarity analysis") 2 | 3 | citEntry(entry = "ARTICLE", 4 | title = "DOSE: an R/Bioconductor package for Disease Ontology Semantic and Enrichment analysis", 5 | author = c( 6 | person("Guangchuang", "Yu"), 7 | person("Li-Gen", "Wang"), 8 | person("Guang-Rong", "Yan"), 9 | person("Qing-Yu", "He") 10 | ), 11 | journal = "Bioinformatics", 12 | year = "2015", 13 | volume = "31", 14 | number = "4", 15 | pages = "608-609", 16 | PMID = "", 17 | url = "http://bioinformatics.oxfordjournals.org/content/31/4/608", 18 | doi = "10.1093/bioinformatics/btu684", 19 | textVersion = paste("Guangchuang Yu, Li-Gen Wang, Guang-Rong Yan, Qing-Yu He.", 20 | "DOSE: an R/Bioconductor package for Disease Ontology Semantic and Enrichment analysis.", 21 | "Bioinformatics 2015 31(4):608-609") 22 | ) 23 | -------------------------------------------------------------------------------- /inst/extdata/build_DGN_Anno.R: -------------------------------------------------------------------------------- 1 | # download from https://www.disgenet.org/downloads 2 | # 1. ALL gene-disease associations 3 | # 2. 
ALL variant-disease associations 4 | #x <- read.delim("all_gene_disease_associations.tsv", comment.char="#", stringsAsFactor=F) 5 | x <- read.delim("all_gene_disease_associations.tsv", comment.char="#", stringsAsFactor=F, encoding = "latin1") 6 | d2n <- unique(x[, c("diseaseId", "diseaseName")]) 7 | d2g <- unique(x[, c("diseaseId", "geneId")]) 8 | 9 | .DGN_DOSE_Env <- DOSE:::build_Anno(d2g, d2n) 10 | 11 | 12 | ## save(.DGN_DOSE_Env, file="DGN_DOSE_Env.rda", compress='xz') 13 | 14 | DGN_EXTID2PATHID = get("EXTID2PATHID", envir=.DGN_DOSE_Env) 15 | DGN_PATHID2EXTID = get("PATHID2EXTID", envir = .DGN_DOSE_Env) 16 | DGN_PATHID2NAME = get("PATHID2NAME", envir = .DGN_DOSE_Env) 17 | 18 | ## Warning: found non-ASCII strings 19 | ## 'Primary Sjgren's syndrome' in object 'DGN_PATHID2NAME' 20 | ## 'Secondary Sjgren's syndrome' in object 'DGN_PATHID2NAME' 21 | ## 'Henoch-Schnlein nephritis' in object 'DGN_PATHID2NAME' 22 | ## 23 | ## Sjögren 24 | ## 25 | ## DGN_PATHID2NAME['umls:C0151449'] <- "Primary Sjogren's syndrome" 26 | ## DGN_PATHID2NAME['umls:C0151450'] <- "Secondary Sjogren's syndrome" 27 | ## DGN_PATHID2NAME['umls:C0403528'] <- 'Henoch-Schonlein nephritis' 28 | 29 | DGN_PATHID2NAME <- iconv(DGN_PATHID2NAME, "ASCII", "UTF-8") 30 | 31 | save(DGN_EXTID2PATHID, file = "DGN_EXTID2PATHID.rda", compress='xz') 32 | save(DGN_PATHID2EXTID, file="DGN_PATHID2EXTID.rda", compress='xz') 33 | save(DGN_PATHID2NAME, file="DGN_PATHID2NAME.rda", compress='xz') 34 | 35 | 36 | 37 | y <- read.delim("all_variant_disease_associations.tsv", comment.char="#", stringsAsFactor=F) 38 | d2n <- unique(y[, c("diseaseId", "diseaseName")]) 39 | d2s <- unique(y[, c("diseaseId", "snpId")]) 40 | 41 | 42 | .VDGN_DOSE_Env <- DOSE:::build_Anno(d2s, d2n) 43 | 44 | ## save(.VDGN_DOSE_Env, file="VDGN_DOSE_Env.rda", compress='xz') 45 | 46 | 47 | VDGN_EXTID2PATHID = get("EXTID2PATHID", envir=.VDGN_DOSE_Env) 48 | VDGN_PATHID2EXTID = get("PATHID2EXTID", envir = .VDGN_DOSE_Env) 49 | VDGN_PATHID2NAME = 
get("PATHID2NAME", envir = .VDGN_DOSE_Env) 50 | 51 | VDGN_PATHID2NAME <- iconv(VDGN_PATHID2NAME, "ASCII", "UTF-8") 52 | 53 | save(VDGN_EXTID2PATHID, file = "VDGN_EXTID2PATHID.rda", compress='xz') 54 | save(VDGN_PATHID2EXTID, file="VDGN_PATHID2EXTID.rda", compress='xz') 55 | save(VDGN_PATHID2NAME, file="VDGN_PATHID2NAME.rda", compress='xz') 56 | 57 | 58 | -------------------------------------------------------------------------------- /inst/extdata/build_NCG_Anno.R: -------------------------------------------------------------------------------- 1 | # download from http://ncg.kcl.ac.uk/download.php 2 | # NCG 6.0: All cancer genes -> •List of 2372 cancer genes and supporting literature 3 | # NCG 7.0: List of all 3347 cancer drivers and their annotation and supporting evidence 4 | #x=read.delim("NCG6_cancergenes.tsv", stringsAsFactor=F) 5 | x=read.delim("NCG6_cancergenes.tsv", stringsAsFactor=F, encoding = "latin1") 6 | path2gene <- x[, c("cancer_type", "entrez")] 7 | path2gene <- path2gene[path2gene[,1] != '',] 8 | 9 | ## gene2name <- x[, c("entrez", "symbol")] 10 | 11 | path2name=NULL 12 | 13 | .NCG_DOSE_Env <- DOSE:::build_Anno(path2gene, path2name) 14 | 15 | NCG_EXTID2PATHID = get("EXTID2PATHID", envir=.NCG_DOSE_Env) 16 | NCG_PATHID2EXTID = get("PATHID2EXTID", envir = .NCG_DOSE_Env) 17 | NCG_PATHID2NAME = get("PATHID2NAME", envir = .NCG_DOSE_Env) 18 | 19 | save(NCG_EXTID2PATHID, file = "NCG_EXTID2PATHID.rda", compress='xz') 20 | save(NCG_PATHID2EXTID, file="NCG_PATHID2EXTID.rda", compress='xz') 21 | save(NCG_PATHID2NAME, file="NCG_PATHID2NAME.rda", compress='xz') 22 | 23 | 24 | 25 | 26 | 27 | -------------------------------------------------------------------------------- /inst/extdata/preparing.geneList.R: -------------------------------------------------------------------------------- 1 | require(breastCancerMAINZ) 2 | data(mainz) 3 | 4 | require("hgu133a.db") 5 | 6 | require(siggenes) 7 | 8 | clmainz=pData(mainz)$grade 9 | 10 | dd <- exprs(mainz) 11 | g1 <- 
dd[,clmainz == 1] 12 | g3 <- dd[,clmainz == 3] 13 | geneList <- exp(rowMeans(g3))/exp(rowMeans(g1)) 14 | geneList <- sort(geneList, decreasing=TRUE) 15 | geneList <- log(geneList, base=2) 16 | 17 | eg <- mget(names(geneList), hgu133aENTREZID, ifnotfound=NA) 18 | 19 | gg <- data.frame(probe=names(geneList), val = geneList) 20 | eg.df <- data.frame(probe=names(eg), eg=unlist(eg)) 21 | 22 | xx <- merge(gg, eg.df, by.x="probe", by.y="probe") 23 | xx <- xx[,-1] 24 | xx <- unique(xx) 25 | xx <- xx[!is.na(xx[,2]),] 26 | 27 | require(plyr) 28 | yy <- ddply(xx, .(eg), function(x) data.frame(val=mean(x$val))) 29 | 30 | geneList <- yy$val 31 | names(geneList) <- yy$eg 32 | geneList <- sort(geneList, decreasing=TRUE) 33 | 34 | save(geneList, file="geneList.rda") 35 | -------------------------------------------------------------------------------- /man/DOSE-package.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/DOSE-package.R 3 | \docType{package} 4 | \name{DOSE-package} 5 | \alias{DOSE} 6 | \alias{DOSE-package} 7 | \title{DOSE: Disease Ontology Semantic and Enrichment analysis} 8 | \description{ 9 | This package implements five methods proposed by Resnik, Schlicker, Jiang, Lin and Wang respectively for measuring semantic similarities among DO terms and gene products. Enrichment analyses including hypergeometric model and gene set enrichment analysis are also implemented for discovering disease associations of high-throughput biological data. 
10 | } 11 | \seealso{ 12 | Useful links: 13 | \itemize{ 14 | \item \url{https://yulab-smu.top/biomedical-knowledge-mining-book/} 15 | \item Report bugs at \url{https://github.com/GuangchuangYu/DOSE/issues} 16 | } 17 | 18 | } 19 | \author{ 20 | \strong{Maintainer}: Guangchuang Yu \email{guangchuangyu@gmail.com} 21 | 22 | Other contributors: 23 | \itemize{ 24 | \item Li-Gen Wang \email{reeganwang020@gmail.com} [contributor] 25 | \item Vladislav Petyuk \email{petyuk@gmail.com} [contributor] 26 | \item Giovanni Dall'Olio \email{giovanni.dallolio@upf.edu} [contributor] 27 | } 28 | 29 | } 30 | \keyword{internal} 31 | -------------------------------------------------------------------------------- /man/DataSet.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/DOSE-package.R 3 | \docType{data} 4 | \name{DataSet} 5 | \alias{DataSet} 6 | \alias{geneList} 7 | \alias{NCG_EXTID2PATHID} 8 | \alias{NCG_PATHID2EXTID} 9 | \alias{NCG_PATHID2NAME} 10 | \alias{DGN_EXTID2PATHID} 11 | \alias{DGN_PATHID2EXTID} 12 | \alias{DGN_PATHID2NAME} 13 | \alias{VDGN_EXTID2PATHID} 14 | \alias{VDGN_PATHID2EXTID} 15 | \alias{VDGN_PATHID2NAME} 16 | \title{Datasets} 17 | \description{ 18 | Information content and DO term to entrez gene IDs mapping 19 | } 20 | \keyword{datasets} 21 | -------------------------------------------------------------------------------- /man/EXTID2NAME.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/utilities.R 3 | \name{EXTID2NAME} 4 | \alias{EXTID2NAME} 5 | \title{EXTID2NAME} 6 | \usage{ 7 | EXTID2NAME(OrgDb, geneID, keytype) 8 | } 9 | \arguments{ 10 | \item{OrgDb}{OrgDb} 11 | 12 | \item{geneID}{entrez gene ID} 13 | 14 | \item{keytype}{keytype} 15 | } 16 | \value{ 17 | gene symbol 18 | } 19 | \description{ 20 | mapping gene ID to gene 
Symbol 21 | } 22 | \author{ 23 | Guangchuang Yu \url{https://yulab-smu.top} 24 | } 25 | -------------------------------------------------------------------------------- /man/GSEA_internal.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/gsea.R 3 | \name{GSEA_internal} 4 | \alias{GSEA_internal} 5 | \title{GSEA_internal} 6 | \usage{ 7 | GSEA_internal( 8 | geneList, 9 | exponent, 10 | minGSSize, 11 | maxGSSize, 12 | eps, 13 | pvalueCutoff, 14 | pAdjustMethod, 15 | verbose, 16 | seed = FALSE, 17 | USER_DATA, 18 | by = "fgsea", 19 | ... 20 | ) 21 | } 22 | \arguments{ 23 | \item{geneList}{order ranked geneList} 24 | 25 | \item{exponent}{weight of each step} 26 | 27 | \item{minGSSize}{minimal size of each geneSet for analyzing} 28 | 29 | \item{maxGSSize}{maximal size of each geneSet for analyzing} 30 | 31 | \item{eps}{This parameter sets the boundary for calculating the p value.} 32 | 33 | \item{pvalueCutoff}{p value Cutoff} 34 | 35 | \item{pAdjustMethod}{p value adjustment method} 36 | 37 | \item{verbose}{print message or not} 38 | 39 | \item{seed}{set seed inside the function to make result reproducible. 
FALSE by default.} 40 | 41 | \item{USER_DATA}{annotation data} 42 | 43 | \item{by}{one of 'fgsea' or 'DOSE'} 44 | 45 | \item{...}{other parameter} 46 | } 47 | \value{ 48 | gseaResult object 49 | } 50 | \description{ 51 | generic function for gene set enrichment analysis 52 | } 53 | \author{ 54 | Yu Guangchuang 55 | } 56 | -------------------------------------------------------------------------------- /man/clusterSim.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/clusterSim.R 3 | \name{clusterSim} 4 | \alias{clusterSim} 5 | \title{clusterSim} 6 | \usage{ 7 | clusterSim( 8 | cluster1, 9 | cluster2, 10 | ont = "HDO", 11 | organism = "hsa", 12 | measure = "Wang", 13 | combine = "BMA" 14 | ) 15 | } 16 | \arguments{ 17 | \item{cluster1}{a vector of gene IDs} 18 | 19 | \item{cluster2}{another vector of gene IDs} 20 | 21 | \item{ont}{one of "HDO", "HPO" and "MPO"} 22 | 23 | \item{organism}{one of "hsa" and "mmu"} 24 | 25 | \item{measure}{One of "Resnik", "Lin", "Rel", "Jiang" and "Wang" methods.} 26 | 27 | \item{combine}{One of "max", "avg", "rcmax", "BMA" methods, for combining} 28 | } 29 | \value{ 30 | similarity 31 | } 32 | \description{ 33 | semantic similarity between two gene clusters 34 | } 35 | \details{ 36 | given two gene clusters, this function calculates semantic similarity between them. 
37 | } 38 | \examples{ 39 | \dontrun{ 40 | cluster1 <- c("835", "5261","241", "994") 41 | cluster2 <- c("307", "308", "317", "321", "506", "540", "378", "388", "396") 42 | clusterSim(cluster1, cluster2) 43 | } 44 | } 45 | \author{ 46 | Yu Guangchuang 47 | } 48 | -------------------------------------------------------------------------------- /man/compareClusterResult-class.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/00-AllClasses.R 3 | \docType{class} 4 | \name{compareClusterResult-class} 5 | \alias{compareClusterResult-class} 6 | \alias{show,compareClusterResult-method} 7 | \alias{summary,compareClusterResult-method} 8 | \alias{plot,compareClusterResult-method} 9 | \title{Class "compareClusterResult" 10 | This class represents the comparison result of gene clusters by GO 11 | categories at specific level or GO enrichment analysis.} 12 | \description{ 13 | Class "compareClusterResult" 14 | This class represents the comparison result of gene clusters by GO 15 | categories at specific level or GO enrichment analysis. 
16 | } 17 | \section{Slots}{ 18 | 19 | \describe{ 20 | \item{\code{compareClusterResult}}{cluster comparing result} 21 | 22 | \item{\code{geneClusters}}{a list of genes} 23 | 24 | \item{\code{fun}}{one of groupGO, enrichGO and enrichKEGG} 25 | 26 | \item{\code{gene2Symbol}}{gene ID to Symbol} 27 | 28 | \item{\code{keytype}}{Gene ID type} 29 | 30 | \item{\code{readable}}{logical flag of gene ID in symbol or not.} 31 | 32 | \item{\code{.call}}{function call} 33 | 34 | \item{\code{termsim}}{Similarity between terms} 35 | 36 | \item{\code{method}}{method of calculating the similarity between nodes} 37 | 38 | \item{\code{dr}}{dimension reduction result} 39 | }} 40 | 41 | \seealso{ 42 | \code{\linkS4class{enrichResult}} 43 | } 44 | \author{ 45 | Guangchuang Yu \url{https://yulab-smu.top} 46 | } 47 | \keyword{classes} 48 | -------------------------------------------------------------------------------- /man/computeIC.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/utilities.R 3 | \name{computeIC} 4 | \alias{computeIC} 5 | \title{compute information content} 6 | \usage{ 7 | computeIC(ont = "HDO") 8 | } 9 | \arguments{ 10 | \item{ont}{one of "HDO", "HPO" and "MPO"} 11 | } 12 | \description{ 13 | compute information content 14 | } 15 | \author{ 16 | Guangchuang Yu \url{https://yulab-smu.top} 17 | } 18 | -------------------------------------------------------------------------------- /man/doseSim.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/doSim.R 3 | \name{doseSim} 4 | \alias{doseSim} 5 | \alias{doSim} 6 | \title{doseSim} 7 | \usage{ 8 | doseSim(DOID1, DOID2, measure = "Wang", ont = "HDO") 9 | 10 | doSim(DOID1, DOID2, measure = "Wang", ont = "HDO") 11 | } 12 | \arguments{ 13 | \item{DOID1}{DO term, MPO term or HPO term vector} 14 | 
15 | \item{DOID2}{DO term, MPO term or HPO term vector} 16 | 17 | \item{measure}{one of "Wang", "Resnik", "Rel", "Jiang", "Lin", and "TCSS".} 18 | 19 | \item{ont}{one of "HDO", "HPO" and "MPO"} 20 | } 21 | \value{ 22 | score matrix 23 | } 24 | \description{ 25 | measuring similarities between two DO term vectors. 26 | } 27 | \details{ 28 | provide two term vectors, this function will calculate their similarities. 29 | } 30 | \author{ 31 | Guangchuang Yu \url{https://yulab-smu.top} 32 | } 33 | -------------------------------------------------------------------------------- /man/enrichDGN.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/enrichDGN.R 3 | \name{enrichDGN} 4 | \alias{enrichDGN} 5 | \title{Enrichment analysis based on the DisGeNET (\url{http://www.disgenet.org/})} 6 | \usage{ 7 | enrichDGN( 8 | gene, 9 | pvalueCutoff = 0.05, 10 | pAdjustMethod = "BH", 11 | universe, 12 | minGSSize = 10, 13 | maxGSSize = 500, 14 | qvalueCutoff = 0.2, 15 | readable = FALSE 16 | ) 17 | } 18 | \arguments{ 19 | \item{gene}{a vector of entrez gene id} 20 | 21 | \item{pvalueCutoff}{pvalue cutoff} 22 | 23 | \item{pAdjustMethod}{one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"} 24 | 25 | \item{universe}{background genes} 26 | 27 | \item{minGSSize}{minimal size of genes annotated by NCG category for testing} 28 | 29 | \item{maxGSSize}{maximal size of each geneSet for analyzing} 30 | 31 | \item{qvalueCutoff}{qvalue cutoff} 32 | 33 | \item{readable}{whether mapping gene ID to gene Name} 34 | } 35 | \value{ 36 | A \code{enrichResult} instance 37 | } 38 | \description{ 39 | given a vector of genes, this function will return the enrichment NCG 40 | categories with FDR control 41 | } 42 | \references{ 43 | Janet et al. (2015) DisGeNET: a discovery platform for the dynamical exploration of human diseases and their genes. 
\emph{Database} bav028 44 | \url{http://database.oxfordjournals.org/content/2015/bav028.long} 45 | } 46 | \author{ 47 | Guangchuang Yu 48 | } 49 | -------------------------------------------------------------------------------- /man/enrichDGNv.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/enrichDGNv.R 3 | \name{enrichDGNv} 4 | \alias{enrichDGNv} 5 | \title{enrichDGN} 6 | \usage{ 7 | enrichDGNv( 8 | snp, 9 | pvalueCutoff = 0.05, 10 | pAdjustMethod = "BH", 11 | universe, 12 | minGSSize = 10, 13 | maxGSSize = 500, 14 | qvalueCutoff = 0.2, 15 | readable = FALSE 16 | ) 17 | } 18 | \arguments{ 19 | \item{snp}{a vector of SNP} 20 | 21 | \item{pvalueCutoff}{pvalue cutoff} 22 | 23 | \item{pAdjustMethod}{one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"} 24 | 25 | \item{universe}{background genes} 26 | 27 | \item{minGSSize}{minimal size of genes annotated by NCG category for testing} 28 | 29 | \item{maxGSSize}{maximal size of each geneSet for analyzing} 30 | 31 | \item{qvalueCutoff}{qvalue cutoff} 32 | 33 | \item{readable}{whether mapping gene ID to gene Name} 34 | } 35 | \value{ 36 | A \code{enrichResult} instance 37 | } 38 | \description{ 39 | Enrichment analysis based on the DisGeNET (\url{http://www.disgenet.org/}) 40 | } 41 | \details{ 42 | given a vector of genes, this function will return the enrichment NCG 43 | categories with FDR control 44 | } 45 | \references{ 46 | Janet et al. (2015) DisGeNET: a discovery platform for the dynamical exploration of human diseases and their genes. 
\emph{Database} bav028 47 | \url{http://database.oxfordjournals.org/content/2015/bav028.long} 48 | } 49 | \author{ 50 | Guangchuang Yu 51 | } 52 | -------------------------------------------------------------------------------- /man/enrichDO.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/enrichDO.R 3 | \name{enrichDO} 4 | \alias{enrichDO} 5 | \title{DO Enrichment Analysis} 6 | \usage{ 7 | enrichDO( 8 | gene, 9 | ont = "HDO", 10 | organism = "hsa", 11 | pvalueCutoff = 0.05, 12 | pAdjustMethod = "BH", 13 | universe, 14 | minGSSize = 10, 15 | maxGSSize = 500, 16 | qvalueCutoff = 0.2, 17 | readable = FALSE 18 | ) 19 | } 20 | \arguments{ 21 | \item{gene}{a vector of entrez gene id} 22 | 23 | \item{ont}{one of "HDO", "HPO" or "MPO".} 24 | 25 | \item{organism}{one of "hsa" and "mmu"} 26 | 27 | \item{pvalueCutoff}{pvalue cutoff} 28 | 29 | \item{pAdjustMethod}{one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"} 30 | 31 | \item{universe}{background genes} 32 | 33 | \item{minGSSize}{minimal size of genes annotated by NCG category for testing} 34 | 35 | \item{maxGSSize}{maximal size of each geneSet for analyzing} 36 | 37 | \item{qvalueCutoff}{qvalue cutoff} 38 | 39 | \item{readable}{whether mapping gene ID to gene Name} 40 | } 41 | \value{ 42 | A \code{enrichResult} instance. 43 | } 44 | \description{ 45 | Given a vector of genes, this function will return the enrichment DO 46 | categories with FDR control. 
47 | } 48 | \examples{ 49 | 50 | data(geneList) 51 | gene = names(geneList)[geneList > 1] 52 | yy = enrichDO(gene, pvalueCutoff=0.05) 53 | summary(yy) 54 | 55 | } 56 | \seealso{ 57 | \code{\link{enrichResult-class}} 58 | } 59 | \author{ 60 | Guangchuang Yu \url{https://yulab-smu.top} 61 | } 62 | \keyword{manip} 63 | -------------------------------------------------------------------------------- /man/enrichNCG.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/enrichNCG.R 3 | \name{enrichNCG} 4 | \alias{enrichNCG} 5 | \title{enrichNCG} 6 | \usage{ 7 | enrichNCG( 8 | gene, 9 | pvalueCutoff = 0.05, 10 | pAdjustMethod = "BH", 11 | universe, 12 | minGSSize = 10, 13 | maxGSSize = 500, 14 | qvalueCutoff = 0.2, 15 | readable = FALSE 16 | ) 17 | } 18 | \arguments{ 19 | \item{gene}{a vector of entrez gene id} 20 | 21 | \item{pvalueCutoff}{pvalue cutoff} 22 | 23 | \item{pAdjustMethod}{one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"} 24 | 25 | \item{universe}{background genes} 26 | 27 | \item{minGSSize}{minimal size of genes annotated by NCG category for testing} 28 | 29 | \item{maxGSSize}{maximal size of each geneSet for analyzing} 30 | 31 | \item{qvalueCutoff}{qvalue cutoff} 32 | 33 | \item{readable}{whether mapping gene ID to gene Name} 34 | } 35 | \value{ 36 | A \code{enrichResult} instance 37 | } 38 | \description{ 39 | Enrichment analysis based on the Network of Cancer Genes database (http://ncg.kcl.ac.uk/) 40 | } 41 | \details{ 42 | given a vector of genes, this function will return the enrichment NCG 43 | categories with FDR control 44 | } 45 | \author{ 46 | Guangchuang Yu 47 | } 48 | -------------------------------------------------------------------------------- /man/enrichResult-class.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 
| % Please edit documentation in R/00-AllClasses.R 3 | \docType{class} 4 | \name{enrichResult-class} 5 | \alias{enrichResult-class} 6 | \alias{show,enrichResult-method} 7 | \alias{summary,enrichResult-method} 8 | \title{Class "enrichResult" 9 | This class represents the result of enrichment analysis.} 10 | \description{ 11 | Class "enrichResult" 12 | This class represents the result of enrichment analysis. 13 | } 14 | \section{Slots}{ 15 | 16 | \describe{ 17 | \item{\code{result}}{enrichment analysis} 18 | 19 | \item{\code{pvalueCutoff}}{pvalueCutoff} 20 | 21 | \item{\code{pAdjustMethod}}{pvalue adjust method} 22 | 23 | \item{\code{qvalueCutoff}}{qvalueCutoff} 24 | 25 | \item{\code{organism}}{only "human" supported} 26 | 27 | \item{\code{ontology}}{biological ontology} 28 | 29 | \item{\code{gene}}{Gene IDs} 30 | 31 | \item{\code{keytype}}{Gene ID type} 32 | 33 | \item{\code{universe}}{background gene} 34 | 35 | \item{\code{gene2Symbol}}{mapping gene to Symbol} 36 | 37 | \item{\code{geneSets}}{gene sets} 38 | 39 | \item{\code{readable}}{logical flag of gene ID in symbol or not.} 40 | 41 | \item{\code{termsim}}{Similarity between term} 42 | 43 | \item{\code{method}}{method of calculating the similarity between nodes} 44 | 45 | \item{\code{dr}}{dimension reduction result} 46 | }} 47 | 48 | \seealso{ 49 | \code{\link{enrichDO}} 50 | } 51 | \author{ 52 | Guangchuang Yu \url{https://yulab-smu.top} 53 | } 54 | \keyword{classes} 55 | -------------------------------------------------------------------------------- /man/enricher_internal.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/enricher_internal.R 3 | \name{enricher_internal} 4 | \alias{enricher_internal} 5 | \title{enrich.internal} 6 | \usage{ 7 | enricher_internal( 8 | gene, 9 | pvalueCutoff, 10 | pAdjustMethod = "BH", 11 | universe = NULL, 12 | minGSSize = 10, 13 | maxGSSize = 500, 14 | 
qvalueCutoff = 0.2, 15 | USER_DATA 16 | ) 17 | } 18 | \arguments{ 19 | \item{gene}{a vector of entrez gene id.} 20 | 21 | \item{pvalueCutoff}{Cutoff value of pvalue.} 22 | 23 | \item{pAdjustMethod}{one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"} 24 | 25 | \item{universe}{background genes, default is the intersection of the 'universe' with genes that have annotations. 26 | Users can set `options(enrichment_force_universe = TRUE)` to force the 'universe' untouched.} 27 | 28 | \item{minGSSize}{minimal size of genes annotated by Ontology term for testing.} 29 | 30 | \item{maxGSSize}{maximal size of each geneSet for analyzing} 31 | 32 | \item{qvalueCutoff}{cutoff of qvalue} 33 | 34 | \item{USER_DATA}{ontology information} 35 | } 36 | \value{ 37 | An \code{enrichResult} instance. 38 | } 39 | \description{ 40 | internal method for enrichment analysis 41 | } 42 | \details{ 43 | using the hypergeometric model 44 | } 45 | \author{ 46 | Guangchuang Yu \url{https://yulab-smu.top} 47 | } 48 | \keyword{manip} 49 | -------------------------------------------------------------------------------- /man/gene2DO.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/utilities.R 3 | \name{gene2DO} 4 | \alias{gene2DO} 5 | \title{convert Gene ID to DO Terms} 6 | \usage{ 7 | gene2DO(gene, organism = "hsa", ont = "HDO") 8 | } 9 | \arguments{ 10 | \item{gene}{entrez gene ID} 11 | 12 | \item{organism}{organism} 13 | 14 | \item{ont}{ont} 15 | } 16 | \value{ 17 | DO Terms 18 | } 19 | \description{ 20 | given a gene ID, this function converts it to the corresponding DO Terms 21 | } 22 | \author{ 23 | Guangchuang Yu \url{https://yulab-smu.top} 24 | } 25 | -------------------------------------------------------------------------------- /man/geneID.Rd: -------------------------------------------------------------------------------- 1 | % Generated by
roxygen2: do not edit by hand 2 | % Please edit documentation in R/AllGenerics.R 3 | \name{geneID} 4 | \alias{geneID} 5 | \title{geneID generic} 6 | \usage{ 7 | geneID(x) 8 | } 9 | \arguments{ 10 | \item{x}{enrichResult object} 11 | } 12 | \value{ 13 | 'geneID' returns the 'geneID' column of the enriched result, which can be converted to a data.frame via 'as.data.frame' 14 | } 15 | \description{ 16 | geneID generic 17 | } 18 | \examples{ 19 | data(geneList, package="DOSE") 20 | de <- names(geneList)[1:100] 21 | x <- enrichDO(de) 22 | geneID(x) 23 | } 24 | -------------------------------------------------------------------------------- /man/geneInCategory.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/AllGenerics.R 3 | \name{geneInCategory} 4 | \alias{geneInCategory} 5 | \title{geneInCategory generic} 6 | \usage{ 7 | geneInCategory(x) 8 | } 9 | \arguments{ 10 | \item{x}{enrichResult} 11 | } 12 | \value{ 13 | 'geneInCategory' returns a list of genes, obtained by splitting the input gene vector into the enriched functional categories 14 | } 15 | \description{ 16 | geneInCategory generic 17 | } 18 | \examples{ 19 | data(geneList, package="DOSE") 20 | de <- names(geneList)[1:100] 21 | x <- enrichDO(de) 22 | geneInCategory(x) 23 | } 24 | -------------------------------------------------------------------------------- /man/geneSim.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/geneSim.R 3 | \name{geneSim} 4 | \alias{geneSim} 5 | \title{geneSim} 6 | \usage{ 7 | geneSim( 8 | geneID1, 9 | geneID2 = NULL, 10 | ont = "HDO", 11 | organism = "hsa", 12 | measure = "Wang", 13 | combine = "BMA" 14 | ) 15 | } 16 | \arguments{ 17 | \item{geneID1}{entrez gene vector} 18 | 19 | \item{geneID2}{entrez gene vector} 20 | 21 | \item{ont}{one of "HDO" and "MPO"} 22 | 
23 | \item{organism}{one of "hsa" and "mmu"} 24 | 25 | \item{measure}{one of "Wang", "Resnik", "Rel", "Jiang", and "Lin".} 26 | 27 | \item{combine}{One of "max", "avg", "rcmax", "BMA" methods, for combining semantic similarity scores of multiple DO terms associated with gene/protein.} 28 | } 29 | \value{ 30 | score matrix 31 | } 32 | \description{ 33 | measuring similarities between two gene vectors. 34 | } 35 | \details{ 36 | given two entrez gene vectors, this function will calculate their similarity. 37 | } 38 | \examples{ 39 | g <- c("835", "5261","241", "994") 40 | geneSim(g) 41 | } 42 | \author{ 43 | Guangchuang Yu \url{https://yulab-smu.top} 44 | } 45 | -------------------------------------------------------------------------------- /man/gseDGN.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/gseAnalyzer.R 3 | \name{gseDGN} 4 | \alias{gseDGN} 5 | \title{DisGeNET Gene Set Enrichment Analysis} 6 | \usage{ 7 | gseDGN( 8 | geneList, 9 | exponent = 1, 10 | minGSSize = 10, 11 | maxGSSize = 500, 12 | pvalueCutoff = 0.05, 13 | pAdjustMethod = "BH", 14 | verbose = TRUE, 15 | seed = FALSE, 16 | by = "fgsea", 17 | ... 
18 | ) 19 | } 20 | \arguments{ 21 | \item{geneList}{order ranked geneList} 22 | 23 | \item{exponent}{weight of each step} 24 | 25 | \item{minGSSize}{minimal size of each geneSet for analyzing} 26 | 27 | \item{maxGSSize}{maximal size of each geneSet for analyzing} 28 | 29 | \item{pvalueCutoff}{pvalue Cutoff} 30 | 31 | \item{pAdjustMethod}{p value adjustment method} 32 | 33 | \item{verbose}{print message or not} 34 | 35 | \item{seed}{logical} 36 | 37 | \item{by}{one of 'fgsea' or 'DOSE'} 38 | 39 | \item{...}{other parameter} 40 | } 41 | \value{ 42 | gseaResult object 43 | } 44 | \description{ 45 | perform gsea analysis 46 | } 47 | \author{ 48 | Yu Guangchuang 49 | } 50 | \keyword{manip} 51 | -------------------------------------------------------------------------------- /man/gseDO.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/gseAnalyzer.R 3 | \name{gseDO} 4 | \alias{gseDO} 5 | \title{DO Gene Set Enrichment Analysis} 6 | \usage{ 7 | gseDO( 8 | geneList, 9 | ont = "HDO", 10 | organism = "hsa", 11 | exponent = 1, 12 | minGSSize = 10, 13 | maxGSSize = 500, 14 | pvalueCutoff = 0.05, 15 | pAdjustMethod = "BH", 16 | verbose = TRUE, 17 | seed = FALSE, 18 | by = "fgsea", 19 | ... 
20 | ) 21 | } 22 | \arguments{ 23 | \item{geneList}{order ranked geneList} 24 | 25 | \item{ont}{one of "HDO", "HPO" or "MPO"} 26 | 27 | \item{organism}{one of "hsa" and "mmu"} 28 | 29 | \item{exponent}{weight of each step} 30 | 31 | \item{minGSSize}{minimal size of each geneSet for analyzing} 32 | 33 | \item{maxGSSize}{maximal size of each geneSet for analyzing} 34 | 35 | \item{pvalueCutoff}{pvalue Cutoff} 36 | 37 | \item{pAdjustMethod}{p value adjustment method} 38 | 39 | \item{verbose}{print message or not} 40 | 41 | \item{seed}{logical} 42 | 43 | \item{by}{one of 'fgsea' or 'DOSE'} 44 | 45 | \item{...}{other parameter} 46 | } 47 | \value{ 48 | gseaResult object 49 | } 50 | \description{ 51 | perform gsea analysis 52 | } 53 | \author{ 54 | Yu Guangchuang 55 | } 56 | \keyword{manip} 57 | -------------------------------------------------------------------------------- /man/gseNCG.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/gseAnalyzer.R 3 | \name{gseNCG} 4 | \alias{gseNCG} 5 | \title{NCG Gene Set Enrichment Analysis} 6 | \usage{ 7 | gseNCG( 8 | geneList, 9 | exponent = 1, 10 | minGSSize = 10, 11 | maxGSSize = 500, 12 | pvalueCutoff = 0.05, 13 | pAdjustMethod = "BH", 14 | verbose = TRUE, 15 | seed = FALSE, 16 | by = "fgsea", 17 | ... 
18 | ) 19 | } 20 | \arguments{ 21 | \item{geneList}{order ranked geneList} 22 | 23 | \item{exponent}{weight of each step} 24 | 25 | \item{minGSSize}{minimal size of each geneSet for analyzing} 26 | 27 | \item{maxGSSize}{maximal size of each geneSet for analyzing} 28 | 29 | \item{pvalueCutoff}{pvalue Cutoff} 30 | 31 | \item{pAdjustMethod}{p value adjustment method} 32 | 33 | \item{verbose}{print message or not} 34 | 35 | \item{seed}{logical} 36 | 37 | \item{by}{one of 'fgsea' or 'DOSE'} 38 | 39 | \item{...}{other parameter} 40 | } 41 | \value{ 42 | gseaResult object 43 | } 44 | \description{ 45 | perform gsea analysis 46 | } 47 | \author{ 48 | Yu Guangchuang 49 | } 50 | \keyword{manip} 51 | -------------------------------------------------------------------------------- /man/gseaResult-class.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/00-AllClasses.R 3 | \docType{class} 4 | \name{gseaResult-class} 5 | \alias{gseaResult-class} 6 | \alias{gseahResult-class} 7 | \alias{show,gseaResult-method} 8 | \alias{summary,gseaResult-method} 9 | \title{Class "gseaResult" 10 | This class represents the result of GSEA analysis} 11 | \description{ 12 | Class "gseaResult" 13 | This class represents the result of GSEA analysis 14 | } 15 | \section{Slots}{ 16 | 17 | \describe{ 18 | \item{\code{result}}{GSEA analysis} 19 | 20 | \item{\code{organism}}{organism} 21 | 22 | \item{\code{setType}}{setType} 23 | 24 | \item{\code{geneSets}}{geneSets} 25 | 26 | \item{\code{geneList}}{order ranked geneList} 27 | 28 | \item{\code{keytype}}{ID type of gene} 29 | 30 | \item{\code{permScores}}{permutation scores} 31 | 32 | \item{\code{params}}{parameters} 33 | 34 | \item{\code{gene2Symbol}}{gene ID to Symbol} 35 | 36 | \item{\code{readable}}{whether to convert gene ID to symbol} 37 | 38 | \item{\code{dr}}{dimension reduction result} 39 | }} 40 | 41 | \author{ 42 | Guangchuang Yu
\url{https://yulab-smu.top} 43 | } 44 | \keyword{classes} 45 | -------------------------------------------------------------------------------- /man/gsfilter.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/gsfilter.R 3 | \name{gsfilter} 4 | \alias{gsfilter} 5 | \title{gsfilter} 6 | \usage{ 7 | gsfilter(x, by = "GSSize", min = NA, max = NA) 8 | } 9 | \arguments{ 10 | \item{x}{instance of enrichResult or compareClusterResult} 11 | 12 | \item{by}{one of 'GSSize' or 'Count'} 13 | 14 | \item{min}{minimal size} 15 | 16 | \item{max}{maximal size} 17 | } 18 | \value{ 19 | updated object 20 | } 21 | \description{ 22 | filter enriched result by gene set size or gene count 23 | } 24 | \author{ 25 | Guangchuang Yu 26 | } 27 | -------------------------------------------------------------------------------- /man/mclusterSim.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mclusterSim.R 3 | \name{mclusterSim} 4 | \alias{mclusterSim} 5 | \title{mclusterSim} 6 | \usage{ 7 | mclusterSim( 8 | clusters, 9 | ont = "HDO", 10 | organism = "hsa", 11 | measure = "Wang", 12 | combine = "BMA" 13 | ) 14 | } 15 | \arguments{ 16 | \item{clusters}{A list of gene clusters} 17 | 18 | \item{ont}{one of "HDO", "HPO" and "MPO"} 19 | 20 | \item{organism}{organism} 21 | 22 | \item{measure}{one of "Wang", "Resnik", "Rel", "Jiang", and "Lin".} 23 | 24 | \item{combine}{One of "max", "avg", "rcmax", "BMA" methods, for combining semantic similarity scores of multiple DO terms associated with gene/protein.} 25 | } 26 | \value{ 27 | similarity matrix 28 | } 29 | \description{ 30 | Pairwise semantic similarity for a list of gene clusters 31 | } 32 | \examples{ 33 | \dontrun{ 34 | cluster1 <- c("835", "5261","241") 35 | cluster2 <- c("578","582") 36 | cluster3 <- c("307", 
"308", "317") 37 | clusters <- list(a=cluster1, b=cluster2, c=cluster3) 38 | mclusterSim(clusters, measure="Wang") 39 | } 40 | } 41 | \author{ 42 | Guangchuang Yu 43 | } 44 | -------------------------------------------------------------------------------- /man/parse_ratio.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/parse_ratio.R 3 | \name{parse_ratio} 4 | \alias{parse_ratio} 5 | \title{parse_ratio} 6 | \usage{ 7 | parse_ratio(ratio) 8 | } 9 | \arguments{ 10 | \item{ratio}{character vector of ratio to parse} 11 | } 12 | \value{ 13 | A numeric vector (double) of parsed ratio 14 | } 15 | \description{ 16 | parse character ratio to double value, such as 1/5 to 0.2 17 | } 18 | \author{ 19 | Guangchuang Yu 20 | } 21 | -------------------------------------------------------------------------------- /man/reexports.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/utilities.R 3 | \docType{import} 4 | \name{reexports} 5 | \alias{reexports} 6 | \alias{facet_grid} 7 | \title{Objects exported from other packages} 8 | \keyword{internal} 9 | \description{ 10 | These objects are imported from other packages. Follow the links 11 | below to see their documentation. 
12 | 13 | \describe{ 14 | \item{ggplot2}{\code{\link[ggplot2]{facet_grid}}} 15 | }} 16 | 17 | -------------------------------------------------------------------------------- /man/setReadable.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/setReadable.R 3 | \name{setReadable} 4 | \alias{setReadable} 5 | \title{setReadable} 6 | \usage{ 7 | setReadable(x, OrgDb, keyType = "auto") 8 | } 9 | \arguments{ 10 | \item{x}{enrichResult Object} 11 | 12 | \item{OrgDb}{OrgDb} 13 | 14 | \item{keyType}{keyType of gene} 15 | } 16 | \value{ 17 | enrichResult Object 18 | } 19 | \description{ 20 | mapping geneID to gene Symbol 21 | } 22 | \author{ 23 | Yu Guangchuang 24 | } 25 | -------------------------------------------------------------------------------- /man/show-methods.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/print.R 3 | \docType{methods} 4 | \name{show} 5 | \alias{show} 6 | \title{show method} 7 | \usage{ 8 | show(object) 9 | 10 | show(object) 11 | } 12 | \arguments{ 13 | \item{object}{A \code{enrichResult} instance.} 14 | } 15 | \value{ 16 | message 17 | 18 | message 19 | } 20 | \description{ 21 | show method for \code{gseaResult} instance 22 | 23 | show method for \code{enrichResult} instance 24 | } 25 | \author{ 26 | Guangchuang Yu \url{https://yulab-smu.top} 27 | } 28 | -------------------------------------------------------------------------------- /man/simplot.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/simplot.R 3 | \name{simplot} 4 | \alias{simplot} 5 | \title{simplot} 6 | \usage{ 7 | simplot( 8 | sim, 9 | xlab = "", 10 | ylab = "", 11 | color.low = "white", 12 | color.high = "red", 13 | labs = 
TRUE, 14 | digits = 2, 15 | labs.size = 3, 16 | font.size = 14 17 | ) 18 | } 19 | \arguments{ 20 | \item{sim}{similarity matrix} 21 | 22 | \item{xlab}{xlab} 23 | 24 | \item{ylab}{ylab} 25 | 26 | \item{color.low}{color of low value} 27 | 28 | \item{color.high}{color of high value} 29 | 30 | \item{labs}{logical, add text label or not} 31 | 32 | \item{digits}{number of digits for rounding} 33 | 34 | \item{labs.size}{label size} 35 | 36 | \item{font.size}{font size} 37 | } 38 | \value{ 39 | ggplot object 40 | } 41 | \description{ 42 | plotting similarity matrix 43 | } 44 | \author{ 45 | Yu Guangchuang 46 | } 47 | -------------------------------------------------------------------------------- /man/summary-methods.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/accessor.R 3 | \docType{methods} 4 | \name{summary} 5 | \alias{summary} 6 | \title{summary method} 7 | \usage{ 8 | summary(object, ...) 9 | 10 | summary(object, ...) 
11 | } 12 | \arguments{ 13 | \item{object}{A \code{enrichResult} instance.} 14 | 15 | \item{...}{additional parameter} 16 | } 17 | \value{ 18 | A data frame 19 | 20 | A data frame 21 | } 22 | \description{ 23 | summary method for \code{gseaResult} instance 24 | 25 | summary method for \code{enrichResult} instance 26 | } 27 | \author{ 28 | Guangchuang Yu \url{https://guangchuangyu.github.io} 29 | 30 | Guangchuang Yu \url{http://guangchuangyu.github.io} 31 | } 32 | -------------------------------------------------------------------------------- /man/theme_dose.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/simplot.R 3 | \name{theme_dose} 4 | \alias{theme_dose} 5 | \title{theme_dose} 6 | \usage{ 7 | theme_dose(font.size = 14) 8 | } 9 | \arguments{ 10 | \item{font.size}{font size} 11 | } 12 | \value{ 13 | ggplot theme 14 | } 15 | \description{ 16 | ggplot theme of DOSE 17 | } 18 | \examples{ 19 | library(ggplot2) 20 | qplot(1:10) + theme_dose() 21 | } 22 | -------------------------------------------------------------------------------- /tests/testthat.R: -------------------------------------------------------------------------------- 1 | library(testthat) 2 | library(DOSE) 3 | 4 | test_check("DOSE") 5 | 6 | -------------------------------------------------------------------------------- /tests/testthat/test-doSim.R: -------------------------------------------------------------------------------- 1 | library(DOSE) 2 | 3 | context("doSim") 4 | 5 | test_that("doSim", { 6 | res <- sapply(c("Wang", "Lin", "Jiang", "Resnik", "Rel"), function(mm) { 7 | doSim("DOID:1002", "DOID:10003", measure=mm) 8 | }) 9 | expect_true(all(res >= 0) && all(res <= 1)) 10 | }) 11 | -------------------------------------------------------------------------------- /tests/testthat/test-geneSim.R: -------------------------------------------------------------------------------- 
1 | library(DOSE) 2 | 3 | context("geneSim") 4 | 5 | test_that("geneSim", { 6 | res <- sapply(c("Wang", "Lin", "Jiang", "Resnik", "Rel"), function(mm) geneSim('2524', '3070', measure=mm)) 7 | expect_true(all(res >=0) && all(res <=1)) 8 | }) 9 | 10 | -------------------------------------------------------------------------------- /vignettes/DOSE.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "DOSE: Disease Ontology Semantic and Enrichment analysis" 3 | author: 4 | - name: Guangchuang Yu 5 | email: guangchuangyu@gmail.com 6 | affiliation: Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University 7 | date: "`r Sys.Date()`" 8 | biblio-style: apalike 9 | output: 10 | prettydoc::html_pretty: 11 | toc: true 12 | theme: cayman 13 | highlight: github 14 | pdf_document: 15 | toc: true 16 | vignette: > 17 | %\VignetteIndexEntry{DOSE} 18 | %\VignettePackage{DOSE} 19 | % \VignetteEngine{knitr::rmarkdown} 20 | % \usepackage[utf8]{inputenc} 21 | %\VignetteEncoding{UTF-8} 22 | --- 23 | 24 | ```{r style, echo=FALSE, results="asis", message=FALSE} 25 | knitr::opts_chunk$set(tidy = FALSE, 26 | warning = FALSE, 27 | message = FALSE) 28 | 29 | Biocpkg <- function (pkg) { 30 | sprintf("[%s](http://bioconductor.org/packages/%s)", pkg, pkg) 31 | } 32 | 33 | ``` 34 | 35 | 36 | ```{r echo=FALSE, results='hide', message=FALSE} 37 | library(DOSE) 38 | ``` 39 | 40 | 41 | # Vignette 42 | 43 | Please go to for the full vignette. 44 | 45 | 46 | 47 | # Citation 48 | 49 | If you use `r Biocpkg("DOSE")` in published research, please cite G. Yu (2015). 50 | 51 | 52 | __*G Yu*__, LG Wang, GR Yan, QY He. DOSE: an R/Bioconductor package for Disease Ontology Semantic and Enrichment analysis. __*Bioinformatics*__ 2015, 31(4):608-609. . 53 | 54 | 55 | # Need help? 
56 | 57 | 58 | If you have questions/issues, please post 59 | to [Bioconductor support site](https://support.bioconductor.org/) and tag your 60 | post with *DOSE*. 61 | 62 | 63 | 64 | --------------------------------------------------------------------------------