├── DESCRIPTION
├── NAMESPACE
├── NEWS
├── QUESTIONS
├── R
    ├── KEGGREST.R
    ├── parsers.R
    └── utilities.R
├── README.md
├── inst
    └── unitTests
    │   └── test_KEGGREST.R
├── man
    ├── keggCompounds.Rd
    ├── keggConv.Rd
    ├── keggFind.Rd
    ├── keggGet.Rd
    ├── keggInfo.Rd
    ├── keggLink.Rd
    ├── keggList.Rd
    ├── listDatabases.Rd
    └── mark.pathway.by.objects.Rd
├── tests
    └── KEGGREST_unit_tests.R
└── vignettes
    └── KEGGREST-vignette.Rmd


/DESCRIPTION:
--------------------------------------------------------------------------------
 1 | Package: KEGGREST
 2 | Version: 1.49.0
 3 | Title:
 4 |     Client-side REST access to the Kyoto Encyclopedia of Genes and Genomes (KEGG)
 5 | Authors@R: c(
 6 |     person("Dan", "Tenenbaum", role = "aut"),
 7 |     person("Bioconductor Package", "Maintainer", role = c("aut", "cre"),
 8 |            email = "maintainer@bioconductor.org"),
 9 |     person("Martin", "Morgan", role = "ctb"),
10 |     person("Kozo", "Nishida", role = "ctb"),
11 |     person("Marcel", "Ramos", role = "ctb"),
12 |     person("Kristina", "Riemer", role = "ctb"),
13 |     person("Lori", "Shepherd", role = "ctb"),
14 |     person("Jeremy", "Volkening", role = "ctb")
15 |     )
16 | Depends: R (>= 3.5.0)
17 | Imports: methods, httr, png, Biostrings
18 | Suggests: RUnit, BiocGenerics, BiocStyle, knitr, markdown
19 | Description:
20 |     A package that provides a client interface to the Kyoto
21 |     Encyclopedia of Genes and Genomes (KEGG) REST API. Only
22 |     for academic use by academic users belonging to academic
23 |     institutions (see <https://www.kegg.jp/kegg/rest/>).
24 |     Note that KEGGREST is based on KEGGSOAP by J. Zhang, R. Gentleman,
25 |     and Marc Carlson, and KEGG (python package) by Aurelien Mazurie.
26 | URL: https://bioconductor.org/packages/KEGGREST
27 | BugReports: https://github.com/Bioconductor/KEGGREST/issues
28 | License: Artistic-2.0
29 | VignetteBuilder: knitr
30 | biocViews: Annotation, Pathways, ThirdPartyClient, KEGG
31 | RoxygenNote: 7.1.1
32 | Date: 2024-06-17
33 | 


--------------------------------------------------------------------------------
/NAMESPACE:
--------------------------------------------------------------------------------
 1 | 
 2 | importFrom(utils, download.file, head)
 3 | importFrom(httr, GET, POST, http_status, content, stop_for_status)
 4 | importFrom(png, readPNG, writePNG)
 5 | importFrom(Biostrings, readAAStringSet, readDNAStringSet,
 6 |     DNAStringSet, AAStringSet)
 7 | import(methods)
 8 | 
 9 | export(
10 |     keggInfo,
11 |     keggList,
12 |     listDatabases,
13 |     keggFind,
14 |     keggGet,
15 |     keggCompounds,
16 |     keggConv,
17 |     keggLink,
18 |     mark.pathway.by.objects,
19 |     color.pathway.by.objects
20 | )
21 | 
22 | 


--------------------------------------------------------------------------------
/NEWS:
--------------------------------------------------------------------------------
 1 | CHANGES IN VERSION 1.46.0
 2 | -----------------------
 3 | 
 4 | BUG FIXES
 5 | 
 6 |     o 1.45.1 Fix keggFind URL to use '+' instead of spaces.
 7 | 
 8 | CHANGES IN VERSION 1.42.0
 9 | -----------------------
10 | 
11 | SIGNIFICANT USER-VISIBLE CHANGES
12 | 
13 |     o `keggCompounds` lists compound IDs for a given pathway (@KristinaRiemer,
14 |     #6).
15 | 
16 | BUG FIXES
17 | 
18 |     o Update URL path in `.get.kegg.url` from `tmp` to `kegg` subfolder.
19 | 
20 | CHANGES IN VERSION 1.37.0
21 | -----------------------
22 | 
23 | BUG CORRECTION
24 | 
25 |     o 1.37.1 Fixes new endpoint
26 |     o 1.37.2 Change http to https; fixes Windows error
27 | 
28 | CHANGES IN VERSION 1.0.0
29 | -----------------------
30 | 
31 | SIGNIFICANT USER-VISIBLE CHANGES
32 | 
33 |     o Package introduced.
34 | 
35 | NEW FEATURES
36 | 
37 |     o Package introduced.
38 | 


--------------------------------------------------------------------------------
/QUESTIONS:
--------------------------------------------------------------------------------
  1 | Questions for the KEGG team.
  2 | 
  3 | 
  4 | ** No apparent replacement found for old APIs
  5 | 
  6 | Is there a new programmatic way to call the old SOAP API "get_motifs_by_gene"?
  7 | I know I can do it manually with a request like this:
  8 | 
  9 | http://www.kegg.jp/ssdb-bin/ssdb_motif?kid=eco%3Ab0002&lib=pfam
 10 | 
 11 | But then I have to scrape the page.
 12 | 
 13 | I have a similar question about "get_genes_by_motifs", which I can also do
 14 | manually from http://www.kegg.jp/kegg/ssdb/. Also, the SOAP API had
 15 | "start" and "max_results" arguments; is there an equivalent?
 16 | 
 17 | About the SOAP APIs "get_best_neighbors_by_gene" and 
 18 | "get_best_best_neighbors_by_gene"; it looks like similar 
 19 | functionality is provided by http://www.kegg.jp/kegg/ssdb/ but again,
 20 | is there a more programmatic way to do it?
 21 | 
 22 | I’d also like to replace the old API "get_paralogs_by_gene"
 23 | which it seems like I can also do from the 
 24 | http://www.kegg.jp/kegg/ssdb/ page.
 25 | With all of these SSDB functions, is there support for the 
 26 | "start" and "max_results" arguments?
 27 | 
 28 | Although I can search compounds by mass,
 29 | (example: http://rest.kegg.jp/find/compound/174.05/exact_mass/),
 30 | I can’t seem to search glycans by mass as I could with the SOAP API
 31 | "search_glycans_by_mass". Is there an equivalent function in the REST API?
 32 | 
 33 | There does not seem to be an equivalent to the SOAP API
 34 | "search_compounds_by_subcomp". Is there a replacement?
 35 | 
 36 | What about "search_glycans_by_kcam" and the more general-purpose 
 37 | "bget"? Is there a way in the REST API to return flat-file records,
 38 | similar to what "bget" did?
 39 | 
 40 | Two other "missing" functions seem to be "get_ko_by_ko_class"
 41 | and "get_genes_by_ko_class". Are there replacements for these?
 42 | 
 43 | Is the SOAP function get_html_of_colored_pathway_by_elements
 44 | any different from e.g.
 45 | http://www.kegg.jp/kegg-bin/show_pathway?eco00260/b0002%09%23ff0000,%2300ff00/c00263%09%23ffff00,yellow
 46 | ?
 47 | 
 48 | Is there a REST implementation of the SOAP functions
 49 | get_element_relations_by_pathway and get_elements_by_pathway?
 50 | 
 51 | 
 52 | ** Results of new APIs differ from old
 53 | 
 54 | Calling the SOAP API "get_enzymes_by_pathway" with
 55 | "pathway_id" equal to "path:eco00020" yields a result of 14 
 56 | enzymes; what seems to be the equivalent in REST,
 57 | http://rest.kegg.jp/link/enzyme/path:eco00020 returns nothing.
 58 | Is this expected? Have the data changed?
 59 | 
 60 | Similarly, calling "get_compounds_by_pathway" with a 
 61 | "pathway_id" argument of "path:eco00020" returns 20 compounds 
 62 | in SOAP, but the REST equivalent (?),
 63 | http://rest.kegg.jp/link/compound/path%3aeco00020 returns nothing.
 64 | 
 65 | Calling this REST API with a different argument:
 66 | http://rest.kegg.jp/link/compound/path:map00010
 67 | does return some results.
 68 | 
 69 | Calling the SOAP API "get_kos_by_pathway" with a "pathway_id"
 70 | argument of "path:hsa00010" returns 36 results, but the seeming REST 
 71 | equivalent, http://rest.kegg.jp/link/ko/path%3ahsa00010 returns nothing.
 72 | 
 73 | 
 74 | ** Missing Arguments
 75 | 
 76 | The SOAP API "get_genes_by_organism" has "start" and "max_results"
 77 | arguments. The REST equivalent, which seems to be, for example 
 78 | http://rest.kegg.jp/list/hsa does not appear to have such arguments.
 79 | Is there a way to paginate results that come back from the REST server?
 80 | 
 81 | It looks like I can call an equivalent of the old 
 82 | "get_genes_by_ko" API, as follows:
 83 | http://rest.kegg.jp/link/genes/ko:K12524
 84 | But in the old API, I could filter by organism (e.g. "eco"). 
 85 | Also, the SOAP API would return annotations, for example, for
 86 | eco:b0002 it would return 
 87 | "thrA; fused aspartokinase I and homoserine dehydrogenase I 
 88 | (EC:2.7.2.4 1.1.1.3); K12524 bifunctional aspartokinase / homoserine 
 89 | dehydrogenase 1 [EC:2.7.2.4 1.1.1.3] ". 
 90 | 
 91 | Is there a way I can get these annotations back in a REST query?
 92 | It seems I only get two columns back, the ko ID and gene ID.
 93 | 
 94 | Similarly, the SOAP API "get_pathways_by_kos" had an "org" 
 95 | argument to filter by "organism". The REST equivalent, for 
 96 | example http://rest.kegg.jp/link/pathway/ko:K00016+ko:K00382
 97 | does not seem to have this option, and the results do not allow
 98 | me to do my own filtering (that is, they do not have three-letter
 99 | organism codes).
100 | 
101 | Also, the results differ between SOAP and REST.
102 | In SOAP, calling "get_pathways_by_kos" with a "ko_id_list"
103 | argument of ko:K00016 and ko:K00382, and an "org" argument 
104 | of "hsa", returns path:hsa00010 and path:hsa00620, but the results of 
105 | http://rest.kegg.jp/link/pathway/ko:K00016+ko:K00382 do not
106 | include these items.
107 | 
108 | 


--------------------------------------------------------------------------------
/R/KEGGREST.R:
--------------------------------------------------------------------------------
  1 | keggInfo <- function(database)
  2 | {
  3 |     ## FIXME return an object instead of a character vector
  4 |     url <- sprintf("%s/info/%s", .getRootUrl(), database)
  5 |     .getUrl(url, .textParser)
  6 | }
  7 | 
  8 | 
  9 | keggList <- function(database, organism)
 10 | {
 11 |     database <- paste(database, collapse="+")
 12 |     if (missing(organism))
 13 |         url <- sprintf("%s/list/%s", .getRootUrl(), database)
 14 |     else
 15 |         url <- sprintf("%s/list/%s/%s", .getRootUrl(), database, organism)
 16 |     if (database == "organism")
 17 |         return(.organismListParser(url))
 18 |     .getUrl(url, .listParser, nameColumn=1, valueColumn=2)
 19 | }
 20 | 
 21 | keggFind <- function(database, query,
 22 |     option=c("formula", "exact_mass", "mol_weight"))
 23 | {
 24 |     if(missing(database))
 25 |         stop("'database' argument is required")
 26 |     if (!missing(option))
 27 |         option <- match.arg(option)
 28 |     if (is.integer(query) && length(query) > 1)
 29 |         query <- sprintf("%s-%s", min(query), max(query))
 30 |     query <- gsub("\\s", "+", query)
 31 |     query <- paste(query, collapse="+")
 32 |     url <- sprintf("%s/find/%s/%s", .getRootUrl(), database, query)
 33 |     if (!missing(option))
 34 |         url <- sprintf("%s/%s", url, option)
 35 |     .getUrl(url, .listParser, nameColumn=1, valueColumn=2)
 36 | }
 37 | 
 38 | 
 39 | keggGet <- function(dbentries,
 40 |     option=c("aaseq", "ntseq", "mol", "kcf", "image", "kgml"))
 41 | {
 42 |     if (length(dbentries) > 10)
 43 |         warning(paste("More than 10 inputs supplied, only the first",
 44 |             "10 results will be returned."))
 45 |     dbentries <- paste(dbentries, collapse="+")
 46 |     url <- sprintf("%s/get/%s", .getRootUrl(), dbentries)
 47 |     if (!missing(option))
 48 |     {
 49 |         url <- sprintf("%s/%s", url, option)
 50 | 
 51 |         if (option == "image")
 52 |             return(content(GET(url), type="image/png"))
 53 |         if (option %in% c("aaseq", "ntseq"))
 54 |         {
 55 |             t <- tempfile()
 56 |             cat(.getUrl(url, .textParser), file=t)
 57 |             if (option == "aaseq")
 58 |                 return(readAAStringSet(t))
 59 |             else if (option == "ntseq")
 60 |                 return(readDNAStringSet(t))
 61 |         }
 62 |         if (option %in% c("mol", "kcf", "kgml"))
 63 |             return(.getUrl(url, .textParser))
 64 |     }
 65 |     if (grepl("^br:", dbentries[1]))
 66 |         return(.getUrl(url, .textParser))
 67 |     .getUrl(url, .flatFileParser)
 68 | }
 69 | 
 70 | keggCompounds <- function(pathwayID)
 71 | {
 72 |     url <- sprintf("%s/link/cpd/%s", .getRootUrl(), pathwayID)
 73 |     .getUrl(url, .compoundParser)
 74 | }
 75 | 
 76 | .keggConv <- function(target, source)
 77 | {
 78 |     query <- paste(source, collapse = "+")
 79 |     url <- sprintf("%s/conv/%s/%s", .getRootUrl(), target, query)
 80 |     .getUrl(url, .listParser, nameColumn = 1, valueColumn = 2)
 81 | }
 82 | 
 83 | keggConv <- function (target, source, querySize = 100)
 84 | {
 85 |     groups <- .splitInGroups(source, querySize)
 86 |     answer <- lapply(groups, .keggConv, target = target)
 87 |     as(unlist(answer), "character")
 88 | }
 89 | 
 90 | keggLink <- function(target, source)
 91 | {
 92 |     if (missing(source))
 93 |     {
 94 |         url <- sprintf("%s/link/%s",
 95 |             .getGenomeUrl(), target)
 96 |         .getUrl(url, .matrixParser, ncol=3)
 97 |     } else {
 98 |         url <- sprintf("%s/link/%s/%s",
 99 |             .getRootUrl(), target, paste(source, collapse="+"))
100 |         .getUrl(url, .listParser, nameColumn=1, valueColumn=2)
101 | 
102 |     }
103 |     ## FIXME?? keggLink("pathway",c("hsa:10458", "ece:Z5100"))
104 |     ## returns a list with duplicate names
105 | }
106 | 
107 | 
108 | listDatabases <- function()
109 | {
110 |     c("pathway", "brite", "module", "ko", "genome", "vg", "ag", "compound",
111 |           "glycan", "reaction", "rclass", "enzyme", "disease", "drug",
112 |           "dgroup", "environ", "genes", "ligand", "kegg")
113 | }
114 | 
115 | ## This is not strictly speaking an API supported by the KEGG REST
116 | ## server, but it seems useful, and does not use SOAP, so I'm leaving it in.
117 | mark.pathway.by.objects <- function(pathway.id, object.id.list)
118 | {
119 |     ## example: http://www.kegg.jp/pathway/eco00260+b0002+c00263
120 |     pathway.id <- sub("^path:", "", pathway.id)
121 |     if (!missing(object.id.list)) {
122 |         object.id.list <- paste(object.id.list, collapse="+")
123 |         pathway.id <- sprintf("%s+%s", pathway.id, object.id.list)
124 |     }
125 |     url <- sprintf("https://www.kegg.jp/pathway/%s", pathway.id)
126 |     .get.kegg.url(url)
127 | }
128 | 
129 | ## This is not strictly speaking an API supported by the KEGG REST
130 | ## server, but it seems useful, and does not use SOAP, so I'm leaving it in.
131 | color.pathway.by.objects <- function(pathway.id, object.id.list,
132 |     fg.color.list, bg.color.list)
133 | {
134 |     ## example: http://www.kegg.jp/kegg-bin/show_pathway?eco00260/b0002%09%23ff0000,%2300ff00/c00263%09%23ffff00,yellow
135 |     ## also works to include organism code in gene IDs
136 |     ## (but don't include path: in pathway id)
137 |     ## documentation here: http://www.kegg.jp/kegg/rest/weblink.html
138 |     ## and here: http://www.kegg.jp/kegg/tool/map_pathway2.html
139 | 
140 |     ## Nov 2020: refactored to use form POST due to issues with long URLs when
141 |     ## large identifier lists are passed.
142 | 
143 |     pathway.id <- sub("^path:", "", pathway.id)
144 |     if (!(length(object.id.list)==length(fg.color.list) &&
145 |           length(fg.color.list) == length(bg.color.list))) {
146 |         stop(paste("object.id.list, fg.color.list, and bg.color.list must",
147 |             "all be the same length."))
148 |     }
149 | 
150 |     # format identifier/color list as expected by server
151 |     payload <- paste(
152 |         c("#ids", object.id.list),
153 |         c("cols", paste(bg.color.list, fg.color.list, sep=',')),
154 |         sep="\t",
155 |         collapse="\n"
156 |     )
157 | 
158 |     # fetch KEGG page from server, via a 302 redirect handled by httr
159 |     # transparently
160 |     res <- POST(
161 |         url = "https://www.kegg.jp/kegg-bin/show_pathway",
162 |         body = list(
163 |             map = pathway.id,
164 |             multi_query = payload,
165 |             mode = 'color'
166 |         ),
167 |         encode="multipart"
168 |     )
169 |     res <- content(res, "text")
170 | 
171 |     # extract image URL from page
172 |     img_matches <- regexpr(
173 |         "(?<=<img src=\")[^\"]+",
174 |         res,
175 |         perl=TRUE
176 |     )
177 |     img_url <- regmatches(res, img_matches)
178 |     if (length(img_url) < 1) {
179 |         stop(
180 |             "'color.pathway.by.objects()' ",
181 |             "failed to extract KEGG image path from response."
182 |         )
183 |     }
184 |     if (length(img_url) > 1) {
185 |         stop(
186 |             "'color.pathway.by.objects()' ",
187 |             "unexpectedly matched multiple KEGG image paths in response."
188 |         )
189 |     }
190 |     sprintf("https://www.kegg.jp%s", img_url)
191 | 
192 | }
193 | 
194 | 
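
The query handling in `keggFind` above can be tried offline, without contacting the KEGG server. The following is a minimal sketch that mirrors the steps in the source: an integer vector collapses to a `min-max` range, whitespace becomes `+`, multiple terms are joined with `+`, and an optional search option is appended as a path segment. The helper name `build_find_url` and its `root` parameter are hypothetical and not part of the package, and the sketch deliberately skips the percent-encoding that `.cleanUrl` applies later in `.getUrl`.

```r
# Hypothetical helper (not part of KEGGREST): rebuilds the URL that
# keggFind() assembles, without sending any request.
build_find_url <- function(database, query, option = NULL,
                           root = "https://rest.kegg.jp")
{
    # An integer vector of length > 1 is turned into a "min-max" range.
    if (is.integer(query) && length(query) > 1)
        query <- sprintf("%s-%s", min(query), max(query))
    # Whitespace inside a term becomes "+", and terms are joined with "+".
    query <- gsub("\\s", "+", query)
    query <- paste(query, collapse = "+")
    url <- sprintf("%s/find/%s/%s", root, database, query)
    # The search option, if given, is appended as a trailing path segment.
    if (!is.null(option))
        url <- sprintf("%s/%s", url, option)
    url
}

print(build_find_url("genes", c("shiga", "toxin")))
print(build_find_url("compound", 300:310, "mol_weight"))
```

Note that `300:310` is an integer vector, so it takes the range branch; a numeric vector such as `c(300, 310)` would instead be joined with `+`, which matches the `is.integer()` test in the source.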


--------------------------------------------------------------------------------
/R/parsers.R:
--------------------------------------------------------------------------------
  1 | 
  2 | .matrixParser <- function(txt, ncol)
  3 | {
  4 |     lines <- strsplit(txt, "\n")[[1]]
  5 |     split <- strsplit(lines, "\t")
  6 |     u <- unlist(split)
  7 |     matrix(u, ncol=ncol, byrow=TRUE)
  8 | }
  9 | 
 10 | 
 11 | .organismListParser <- function(url)
 12 | {
 13 |     lines <- readLines(url)
 14 |     split <- strsplit(lines, "\t")
 15 |     u <- unlist(split)
 16 |     m <- matrix(u, ncol=4, byrow=TRUE)
 17 |     colnames(m) <-  c("T.number", "organism", "species", "phylogeny")
 18 |     m
 19 | }
 20 | 
 21 | .get_parser_NAME <- function(entry)
 22 | {
 23 |     ret <- list()
 24 |     for (value in names(entry))
 25 |     {
 26 |         ret[[value]] <- gsub("^;|;$", "", entry[[value]])
 27 |     }
 28 |     ret
 29 | }
 30 | 
 31 | .get_parser_ENTRY <- function(entry)
 32 | {
 33 |     segs <- strsplit(unlist(entry[[1]]), "   +")[[1]]
 34 |     ret <- c(segs[1])
 35 |     names(ret) <- segs[2]
 36 |     ret
 37 | }
 38 | 
 39 | 
 40 | .get_parser_REFERENCE <- function(refs)
 41 | {
 42 |     ret <- list()
 43 |     thisref <- list()
 44 |     for (i in seq_along(refs)) {
 45 |     #sapply(refs, function(item) {
 46 |         item <- refs[[i]]
 47 |         if (item$refField == "REFERENCE")
 48 |         {
 49 |           if (length(thisref) > 0)
 50 |             ret <- c(ret, list(thisref))
 51 |           thisref <- list(id=item$value)
 52 |         } else {
 53 |           if (is.null(thisref[[item$refField]]))
 54 |             thisref[[item$refField]] <- list()
 55 |           thisref[[item$refField]] <- c(thisref[[item$refField]], 
 56 |                                         item$value)
 57 |         }
 58 |     #})
 59 |     }
 60 |     ret <- c(ret, list(thisref))
 61 |     ret
 62 | }
 63 | 
 64 | 
 65 | .get_parser_key_value <- function(entry)
 66 | {
 67 |     content <- c()
 68 |     names <- c()
 69 |     lines <- unlist(strsplit(unname(unlist(entry)), "\n", fixed=TRUE))
 70 |     for (line in lines)
 71 |     {
 72 |         tmp <- strsplit(line, "  ", fixed=TRUE)[[1]]
 73 |         key <- tmp[1]
 74 |         value <- paste(tmp[2:length(tmp)], collapse="  ")
 75 |         if (is.na(value))
 76 |             value <- ""
 77 |         content <- c(content, .strip(value))
 78 |         names <- c(names, .strip(key))
 79 |     }
 80 |     names(content) <- names
 81 |     content
 82 | }
 83 | 
 84 | .get_parser_list <- function(entry)
 85 | {
 86 |     unname(unlist(strsplit(unlist(entry), " {2,}")))
 87 | }
 88 | 
 89 | .get_parser_list_or_key_value <- function(entry)
 90 | {
 91 |     x <- unlist(entry)
 92 |     if (any(grepl(" {2,}", x)))
 93 |         .get_parser_key_value(entry)
 94 |     else
 95 |         .get_parser_list(entry)
 96 | ##        unlist(unname(sapply(entry, strsplit, " ")))
 97 | }
 98 | 
 99 | 
100 | .get_parser_biostring <- function(entry, type)
101 | {
102 |     ntseq <- unname(unlist(entry))
103 |     tmp <- ntseq[2:length(ntseq)]
104 |     seq <- paste(tmp, collapse="")
105 |     if (type=="AAStringSet")
106 |         AAStringSet(seq)
107 |     else if (type == "DNAStringSet")
108 |         DNAStringSet(seq)
109 | }
110 | 
111 | 
112 | .flatFileParser <- function(txt)
113 | {
114 |     entry <- list()
115 |     refs <- list()
116 |     allEntries <- c()
117 |     last_field <- NULL
118 |     lines <- strsplit(.strip(txt), "\n", fixed=TRUE)[[1]]
119 |     ffrec <- flatFileRecordGen()
120 |     for (line in lines)
121 |     {
122 |         if (line == "///")
123 |         {
124 |             ffrec$flush()
125 |             for (name in ffrec$names())
126 |             {
127 |                 item <- ffrec$get(name)
128 |                 if (name == "ENTRY")
129 |                     ffrec$set("ENTRY", .get_parser_ENTRY(item))
130 |                 if (name %in% c("ENZYME", "MARKER", "ALL_REAC",
131 |                     "RELATEDPAIR", "DBLINKS", "DRUG", "GENE"))
132 |                     ffrec$set(name, .get_parser_list(item))
133 |                 if (name %in% c("PATHWAY", "ORTHOLOGY", "PATHWAY_MAP", "MODULE",
134 |                     "DISEASE", "REL_PATHWAY", "COMPOUND",
135 |                     "REACTION", "ORGANISM"))
136 |                     {
137 |                         ffrec$set(name, .get_parser_key_value(item))
138 |                     }
139 |                 if (name %in% c("REACTION"))
140 |                 {
141 |                     ffrec$set(name, .get_parser_list_or_key_value(item))
142 |                 }
143 |                 item <- ffrec$get(name)
144 |                 if(length(item) == 1 && "list" %in% class(item))
145 |                 {
146 |                     item <- unlist(item)
147 |                     item <- unname(item)
148 |                     ffrec$set(name, item)
149 |                 }
150 |             }
151 |             if ("NTSEQ" %in% ffrec$names())
152 |             {
153 |                 ffrec$set("NTSEQ",
154 |                     .get_parser_biostring(ffrec$get("NTSEQ"), "DNAStringSet"))
155 |             }
156 |             if ("AASEQ" %in% ffrec$names())
157 |             {
158 |                 ffrec$set("AASEQ",
159 |                     .get_parser_biostring(ffrec$get("AASEQ"), "AAStringSet"))
160 |             }
161 | 
162 |             ## dreaded copy-and-append pattern
163 |             allEntries <- c(allEntries, list(ffrec$getFields()))
164 |             ffrec <- flatFileRecordGen()
165 |         } else {
166 |             subfield <- NULL
167 |             tmp <- strsplit(line, "", fixed=TRUE)[[1]]
168 |             fs <- tmp[1:12]
169 |             fs <- fs[!is.na(fs)]
170 |             first12 <- .strip(paste(fs, collapse=""))
171 |             if(is.na(tmp[13]))
172 |                 value <- ""
173 |             else
174 |                 value <- .rstrip(paste(tmp[13:length(tmp)], collapse=""))
175 |             if (!grepl("^ ", line))
176 |             {
177 |                 field <- strsplit(line, " ", fixed=TRUE)[[1]][1]
178 |                 ffrec$setField(field)
179 |             } else {
180 |                 if (first12 != "")
181 |                 {
182 |                     subfield <- first12
183 |                     ffrec$setSubfield(first12)
184 |                 }
185 |             }
186 |             ffrec$setBody(value)
187 |         }
188 |     }
189 |     allEntries
190 | }
191 | 
192 | .listParser <- function(txt, valueColumn, nameColumn)
193 | {
194 |     lines <- strsplit(txt, "\n", fixed=TRUE)[[1]]
195 |     splits <- strsplit(lines, "\t", fixed=TRUE)
196 |     len <- lengths(splits)
197 |     ret <- character(length(len))
198 |     idx <- len >= valueColumn
199 |     ret[idx]  <- sapply(splits[idx], "[[", valueColumn)
200 |     if (!missing(nameColumn)) {
201 |         idx <- len >= nameColumn
202 |         nms <- character(length(len))
203 |         nms[idx] <- sapply(splits[idx], "[[", nameColumn)
204 |         names(ret) <- nms
205 |     }
206 |     ret
207 | }
208 | 
209 | 
210 | .textParser <- function(txt)
211 | {
212 |     txt
213 | }
214 | 
215 | 
216 | flatFileRecordGen <- setRefClass("KEGGFlatFileRecord", 
217 |     fields=list("fields"="list",
218 |         lastField="character",
219 |         lastSubfield="character",
220 |         lastReference="list",
221 |         references="list"),
222 |     methods=list(
223 |         initialize=function()
224 |         {
225 |             .self$fields <- list()
226 |             .self$references <- list()
227 |             .self$lastField <- character(0)
228 |             .self$lastSubfield <- character(0)
229 |             .self$lastReference <- list()
230 |         },
231 |         setField=function(field)
232 |         {
233 |             .self$flush()
234 |             .self$lastField <- field
235 |             .self$lastSubfield <- character(0)
236 |             .self
237 |         },
238 |         setSubfield=function(subfield)
239 |         {
240 |             .self$lastSubfield <- subfield
241 |             .self
242 |         },
243 |         setBody=function(body)
244 |         {
245 |             if (!is.null(.self$lastField) && !is.na(.self$lastField) && .self$lastField == "REFERENCE")
246 |             {
247 |                 if(length(.self$lastSubfield))
248 |                 {
249 |                     if(is.null(.self$lastReference[[.self$lastSubfield]]))
250 |                         .self$lastReference[[.self$lastSubfield]] <- c()
251 |                     .self$lastReference[[.self$lastSubfield]] <- c(
252 |                         .self$lastReference[[.self$lastSubfield]],
253 |                         body)
254 |                 } else {
255 |                     if(is.null(.self$lastReference[[.self$lastField]]))
256 |                         .self$lastReference[[.self$lastField]] <- c()
257 |                     .self$lastReference[[.self$lastField]] <- c(
258 |                         .self$lastReference[[.self$lastField]],
259 |                         body)
260 |                 }
261 |             } else{
262 |                 if (is.null(.self$fields[[.self$lastField]]))
263 |                     .self$fields[[.self$lastField]] <- list()
264 | 
265 |                 if(length(.self$lastSubfield))
266 |                 {
267 |                     if(is.null(.self$fields[[.self$lastField]][[.self$lastSubfield]]))
268 |                         .self$fields[[.self$lastField]][[.self$lastSubfield]] <- c()
269 |                     .self$fields[[.self$lastField]][[.self$lastSubfield]] <- c(
270 |                         .self$fields[[.self$lastField]][[.self$lastSubfield]],
271 |                         body
272 |                     )
273 |                 } else {
274 |                     if (is.null(.self$fields[[.self$lastField]][[.self$lastField]]))
275 |                         .self$fields[[.self$lastField]][[.self$lastField]] <- c()
276 |                     if (!is.null(.self$lastField) && !is.na(.self$lastField))
277 |                         .self$fields[[.self$lastField]][[.self$lastField]] <- c(
278 |                         .self$fields[[.self$lastField]][[.self$lastField]], body)
279 |                 }
280 |             }
281 |             .self
282 |         },
283 |         flush = function()
284 |         {
285 |             .self$fields[["///"]] <- NULL
286 |             if (length(.self$lastReference))
287 |             {
288 |                 .self$references[[length(.self$references)+1]] <- .self$lastReference
289 |                 .self$lastReference <- list()
290 |             }
291 |             .self
292 |         },
293 |         names = function()
294 |         {
295 |             nms <- base::names(.self$fields)
296 |             if (length(.self$references))
297 |                 nms <- c(nms, "REFERENCE")
298 |             nms
299 |         },
300 |         get = function(name)
301 |         {
302 |             if (name == "REFERENCE")
303 |                 return(.self$references)
304 |             return(.self$fields[[name]])
305 |         },
306 |         set = function(name, value)
307 |         {
308 |             .self$fields[[name]] <- value
309 |             .self
310 |         }, getFields = function()
311 |         {
312 |             f <- .self$fields
313 |             if (length(.self$references))
314 |                 f[["REFERENCE"]] <- .self$references
315 |             f
316 |         }
317 |     )
318 | )
319 | 
320 | .compoundParser <- function(txt)
321 | {
322 |     cmptxt <- unlist(txt)
323 |     lines <- strsplit(cmptxt, "\n")
324 |     cmps <- gsub(".*cpd:", "", unlist(lines))
325 |     cmps
326 | }
327 | 
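
The behaviour of `.listParser` on tab-delimited responses can be illustrated without a network call. The sketch below is a stand-in with the hypothetical name `parse_list`; it follows the same logic as the source but uses `vapply` in place of `sapply`, and the input text is made up for illustration rather than taken from a real KEGG response. It shows that a record lacking the value column yields an empty string rather than an error.

```r
# Hypothetical stand-in for the internal .listParser (same logic,
# vapply() used instead of sapply() for a type-stable result).
parse_list <- function(txt, valueColumn, nameColumn)
{
    lines <- strsplit(txt, "\n", fixed = TRUE)[[1]]
    splits <- strsplit(lines, "\t", fixed = TRUE)
    len <- lengths(splits)
    # Start with empty strings; fill only records that have the column.
    ret <- character(length(len))
    idx <- len >= valueColumn
    ret[idx] <- vapply(splits[idx], `[[`, character(1), valueColumn)
    if (!missing(nameColumn)) {
        idx <- len >= nameColumn
        nms <- character(length(len))
        nms[idx] <- vapply(splits[idx], `[[`, character(1), nameColumn)
        names(ret) <- nms
    }
    ret
}

# Made-up response text; the second record has no value column.
txt <- "id:1\tfirst entry\nid:2\nid:3\tthird entry"
res <- parse_list(txt, valueColumn = 2, nameColumn = 1)
print(res)
```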


--------------------------------------------------------------------------------
/R/utilities.R:
--------------------------------------------------------------------------------
 1 | 
 2 | .getRootUrl <- function()
 3 | {
 4 |     getOption("KEGG_REST_URL", "https://rest.kegg.jp")
 5 | }
 6 | 
 7 | .getGenomeUrl <- function()
 8 | {
 9 |     getOption("KEGG_GENOME_URL", "http://rest.genome.jp")
10 | }
11 | 
12 | .printf <- function(...) message(noquote(sprintf(...)))
13 | 
14 | .cleanUrl <- function(url)
15 | {
16 |      url <- gsub(" ", "%20", url, fixed=TRUE)
17 |      url <- gsub("#", "%23", url, fixed=TRUE)
18 |      url <- gsub(":", "%3a", url, fixed=TRUE)
19 |      sub("http(s)*%3a//", "http\\1://", url)
20 | }
21 | 
22 | .getUrl <- function(url, parser, ...)
23 | {
24 |     url <- .cleanUrl(url)
25 |     debug <- getOption("KEGGREST_DEBUG", FALSE)
26 |     if (debug)
27 |         .printf("url == %s", url)
28 |     response <- GET(url)
29 |     stop_for_status(response)
30 |     content <- .strip(content(response, "text"))
31 |     if (nchar(content) == 0)
32 |         return(character(0))
33 |     do.call(parser, list(content, ...))
34 | }
35 | 
36 | .strip <- function(str)
37 | {
38 |     gsub("^\\s+|\\s+$", "", str)
39 | }
40 | 
41 | .rstrip <- function(str)
42 | {
43 |     gsub("\\s+$", "", str)
44 | }
45 | 
46 | .lstrip <- function(str)
47 | {
48 |     gsub("^\\s+", "", str)
49 | }
50 | 
51 | .get.kegg.url <- function(url)
52 | {
53 |     res <- GET(url)
54 |     stop_for_status(res, "GET KEGG pathway URL")
55 |     content <- content(res, type="text", encoding = "UTF-8")
56 |     lines <- strsplit(content, "\n", fixed=TRUE)[[1]]
57 |     urlLine <- grep("<img src=\"/kegg", lines, value=TRUE)
58 |     path <- strsplit(urlLine, '"', fixed=TRUE)[[1]][2]
59 |     sprintf("https://www.kegg.jp%s", path)
60 | }
61 | 
62 | .splitInGroups <- function(x, n)
63 | {
64 |     groups <- seq_len(ceiling(length(x) / n))
65 |     members <- head(rep(groups, each = n), length(x))
66 |     unname(split(x, members))
67 | }
68 | 
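
The batching that `keggConv` performs through `.splitInGroups` can be sketched as follows. The helper is copied from the source under the hypothetical name `split_in_groups`, and the identifiers are made up for illustration. Each batch is then joined with `+`, which is how `.keggConv` forms one query string per request.

```r
# Copy of the internal helper shown above, renamed for illustration.
split_in_groups <- function(x, n)
{
    groups <- seq_len(ceiling(length(x) / n))
    # Assign each element a group number, n elements per group.
    members <- head(rep(groups, each = n), length(x))
    unname(split(x, members))
}

# Seven made-up identifiers, batched three at a time.
ids <- sprintf("xyz:%d", 1:7)
batches <- split_in_groups(ids, 3)

# Each batch becomes one "+"-joined query string, as in .keggConv().
queries <- vapply(batches, paste, character(1), collapse = "+")
print(queries)
```

With the default `querySize = 100`, a call to `keggConv` with 250 source identifiers would therefore issue three requests under this scheme.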


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | [<img src="https://www.bioconductor.org/images/logo/jpg/bioconductor_logo_rgb.jpg" width="200" align="right"/>](https://bioconductor.org/)
2 | 
3 | **KEGGREST** is an R/Bioconductor package that provides a client interface to the Kyoto Encyclopedia of Genes and Genomes (KEGG) REST API.
4 | 
5 | See https://bioconductor.org/packages/KEGGREST for more information including how to install the release version of the package (please refrain from installing directly from GitHub).
6 | 
7 | 


--------------------------------------------------------------------------------
/inst/unitTests/test_KEGGREST.R:
--------------------------------------------------------------------------------
  1 | library(KEGGREST)
  2 | library(RUnit)
  3 | 
  4 | ## checker helper
  5 | .checkLOL <- function(res)
  6 | {
  7 |     all(checkTrue(class(res)=="list"),
  8 |         checkTrue(class(res[[1]])=="list"),
  9 |         checkTrue(length(res) > 0))
 10 | }
 11 | 
 12 | .checkCharVec <- function(res)
 13 | {
 14 |     all(checkTrue(class(res)=="character"),
 15 |         checkTrue(length(res) > 0))
 16 | }
 17 | 
 18 | .checkPlainText <- function(res)
 19 | {
 20 |     all(checkTrue(class(res)=="character"),
 21 |         checkTrue(length(res) == 1))
 22 | }
 23 | 
 24 | .checkNamedCharVec <- function(res)
 25 | {
 26 |     .checkCharVec(res) &&
 27 |         checkTrue(length(names(res)) > 0)
 28 | }
 29 | 
 30 | .checkUnnamedCharVec <- function(res)
 31 | {
 32 |     .checkCharVec(res) &&
 33 |         is.null(names(res))
 34 | }
 35 | 
 36 | test_keggInfo <- function()
 37 | {
 38 |     res <- keggInfo("kegg")
 39 |     .checkPlainText(res)
 40 |     res <- keggInfo("pathway")
 41 |     .checkPlainText(res)
 42 |     res <- keggInfo("hsa")
 43 |     .checkPlainText(res)
 44 | 
 45 | }
 46 | 
 47 | test_keggList <- function()
 48 | {
 49 |     res <- keggList("pathway")
 50 |     .checkCharVec(res)
 51 |     res <- keggList("pathway", "hsa")
 52 |     .checkCharVec(res)
 53 |     res <- keggList("organism")
 54 |     checkTrue("matrix" %in% class(res))
 55 |     checkTrue("hsa" %in% res[, "organism"])
 56 |     res <- keggList("hsa")
 57 |     .checkCharVec(res)
 58 |     res <- keggList("T01001")
 59 |     .checkCharVec(res)
 60 |     res <- keggList(c("hsa:10458", "ece:Z5100"))
 61 |     .checkCharVec(res)
 62 |     res <- keggList(c("cpd:C01290","gl:G00092"))
 63 |     .checkCharVec(res)
 64 |     res <- keggList(c("C01290+G00092"))
 65 |     .checkCharVec(res)
 66 | }
 67 | 
 68 | ## The thorough thing to do would be to hit /list/x for each
 69 | ## x in listDatabases, but that might slam KEGG too hard and
 70 | ## make them mad. Instead we hit /info. KEGG does not like
 71 | ## /info/organism for some reason so we will test /list/organism.
 72 | ## NOTE: rpair (RP ids) was discontinued in 2016.
 73 | test_listDatabases <- function()
 74 | {
 75 |     dbs <- listDatabases()
 76 |     for (db in dbs)
 77 |     {
 78 |         if (all(db != c("organism", "rpair", "environ")))  # environ by vince may 5 2021
 79 |         {
 80 |             res <- keggInfo(db)
 81 |             .checkPlainText(res)
 82 |         }
 83 |     }
 84 |     res <- keggList("organism")
 85 |     checkTrue("matrix" %in% class(res))
 86 | }
 87 | 
 88 | 
 89 | test_keggFind <- function()
 90 | {
 91 |     res <- keggFind("genes", c("shiga", "toxin"))
 92 |     .checkCharVec(res)
 93 |     res <- keggFind("genes", "shiga toxin")
 94 |     .checkCharVec(res)
 95 |     res <- keggFind("compound", "C7H10O5", "formula")
 96 |     .checkCharVec(res)
 97 |     res <- keggFind("compound", "O5C7", "formula")
 98 |     .checkCharVec(res)
 99 |     res <- keggFind("compound", 174.05, "exact_mass")
100 |     .checkCharVec(res)
101 |     res <- keggFind("compound", 300:310, "mol_weight")
102 |     .checkCharVec(res)
103 | }
104 | 
105 | test_keggGet <- function()
106 | {
107 |     res <- keggGet(c("cpd:C01290", "gl:G00092"))
108 |     .checkLOL(res)
109 |     res <- keggGet(c("C01290", "G00092"))
110 |     .checkLOL(res)
111 |     res <- keggGet(c("hsa:10458", "ece:Z5100"))
112 |     .checkLOL(res)
113 |     res <- keggGet("ec:1.1.1.1")
114 |     .checkLOL(res)
115 |     .checkLOL(res[[1]]$REFERENCE)
116 |     res <- keggGet(c("hsa:10458", "ece:Z5100"), "aaseq")
117 |     checkTrue("AAStringSet" %in% class(res))
118 |     res <- keggGet(c("hsa:10458", "ece:Z5100"), "ntseq")
119 |     checkTrue("DNAStringSet" %in% class(res))
120 |     png <- keggGet("hsa05130", "image")
121 |     checkTrue("array" %in% class(png))
122 | }
123 | 
124 | test_keggGet_2 <- function()
125 | {
126 |     res <- keggGet("br:br08901")
127 |     .checkCharVec(res)
128 |     res <- keggGet(c("br:br08901", "ece:Z5100"))
129 |     .checkCharVec(res)
130 |     res <- keggGet(c("ece:Z5100", "br:br08901"))
131 |     .checkLOL(res)
132 |     res <- keggGet("path:map00010")
133 |     res <- res[[1]]
134 | #    .checkNamedCharVec(res$DISEASE)
135 |     res <- keggGet("md:M00001")
136 |     .checkNamedCharVec(res[[1]]$REACTION)
137 |     .checkNamedCharVec(res[[1]]$ORTHOLOGY)
138 |     res <- keggGet("ds:H00001")
139 |     .checkLOL(res)
140 |     .checkUnnamedCharVec(res[[1]]$GENE)
141 |     res <- keggGet("dr:D00001")
142 |     x <- res[[1]]$PRODUCT
143 |     checkTrue(all(names(x) == c("PRODUCT","GENERIC")))
144 |     checkTrue(grepl("^ ", res[[1]]$BRITE[2]))
145 | #    res <- keggGet("ev:E00001")
146 | #[1] "http://rest.kegg.jp/get/ev:E00001"
147 | #Browse[1]> zz = GET(url)
148 | #Browse[1]> httr::content(zz)
149 | #NULL
150 | #Browse[1]> zz
151 | #Response [http://rest.kegg.jp/get/ev:E00001]
152 | #  Date: 2021-05-05 12:33
153 | #  Status: 404
154 | #  Content-Type: text/plain
155 | #<EMPTY BODY>
156 | #
157 | #    .checkCharVec(res[[1]]$CATEGORY)
158 |     res <- keggGet("ko:K00001")
159 |     checkTrue(names(res[[1]]$ENTRY) == "KO")
160 |     ## DBLINK parser?
161 |     res <- keggGet("genome:T00001")
162 |     x <- res[[1]]$CHROMOSOME
163 |     checkTrue(all(names(x) == c("CHROMOSOME", "SEQUENCE", "LENGTH")))
164 |     x <- res[[1]]$TAXONOMY
165 |     checkTrue(all(names(x) == c("TAXONOMY", "LINEAGE")))
166 |     res <- keggGet("mgnm:T30001")
167 |     ## metagenome has multiple TAXONOMY sections! fixme
168 |     .checkCharVec(res[[1]]$ANNOTATION)
169 |     ## Changed from hsa:645954; that one doesn't seem to exist!
170 |     res <- keggGet("hsa:10460")
171 |     .checkNamedCharVec(res[[1]]$ORGANISM)
172 |     ## IS DNAStringSet the best object for a nucleotide sequence? fixme
173 |     checkTrue(class(res[[1]]$NTSEQ) %in% "DNAStringSet")
174 |     res <- keggGet("cpd:C00001")
175 |     .checkUnnamedCharVec(res[[1]]$REACTION)
176 |     checkTrue(length(res[[1]]$REACTION) > 300)
177 |     res <- keggGet("gl:G00001")
178 |     checkTrue("COMPOSITION" %in% names(res[[1]]))
179 |     res <- keggGet("rn:R00001")
180 |     checkTrue("EQUATION" %in% names(res[[1]]))
181 |     res <- keggGet("rc:RC00001")
182 |     .checkUnnamedCharVec(res[[1]]$REACTION)
183 |     res <- keggGet("ec:1.1.1.1")
184 |     .checkUnnamedCharVec(res[[1]]$REACTION)
185 |     .checkUnnamedCharVec(res[[1]]$ALL_REAC) ## not ideal fixme (?)
186 |     #res <- keggGet("vgnm:NC_018104")
187 |     #checkTrue(is.na(names(res[[1]]$ENTRY))) # not ideal fixme
188 |     res <- keggGet("hsa:10458")
189 |     checkTrue("AAStringSet" %in% class(res[[1]]$AASEQ))
190 |     checkTrue("DNAStringSet" %in% class(res[[1]]$NTSEQ))
191 |     # fixme do something with CODON_USAGE?
192 | 
193 | 
194 | }
195 | 
196 | test_splitInGroups <- function()
197 | {
198 |     .splitInGroups <- KEGGREST:::.splitInGroups
199 |     checkIdentical(.splitInGroups(character(), 3), list())
200 |     checkIdentical(.splitInGroups(1:5, 3), list(1:3, 4:5))
201 |     checkIdentical(.splitInGroups(1:6, 3), list(1:3, 4:6))
202 |     checkIdentical(.splitInGroups(1:7, 3), list(1:3, 4:6, 7L))
203 | }
204 | 
205 | test_keggConv <- function()
206 | {
207 |     res <- keggConv("eco", "ncbi-geneid")
208 |     .checkCharVec(res)
209 |     res <- keggConv("ncbi-geneid", "eco")
210 |     .checkCharVec(res)
211 |     res <- keggConv("ncbi-proteinid", c("hsa:10458", "ece:Z5100"))
212 |     .checkCharVec(res)
213 | }
214 | 
215 | test_keggLink <- function()
216 | {
217 |     res <- keggLink("pathway", "hsa")
218 |     .checkCharVec(res)
219 |     res <- keggLink("hsa", "pathway")
220 |     .checkCharVec(res)
221 |     res <- keggLink("pathway", c("hsa:10458", "ece:Z5100"))
222 |     .checkCharVec(res)
223 | }
224 | 
225 | test_mark_and_color_pathways_by_objects  <- function(){
226 |   url <- mark.pathway.by.objects("path:eco00260",
227 |                                  c("eco:b0002", "eco:c00263"))
228 |   .checkCharVec(url)
229 |   checkTrue(grepl("^https://", url))
230 |   res <- httr::GET(url)
231 |   checkTrue( httr::http_type(res) == 'image/png' )
232 |   url <- color.pathway.by.objects("path:eco00260",
233 |                                   c("eco:b0002", "eco:c00263"),
234 |                                   c("#ff0000", "#00ff00"),
235 |                                   c("#ffff00", "yellow"))
236 |   .checkCharVec(url)
237 |   checkTrue(grepl("^https://", url))
238 |   res <- httr::GET(url)
239 |   checkTrue( httr::http_type(res) == 'image/png' )
240 | }
241 | 
242 | 
243 | test_reference_parser <- function()
244 | {
245 |     res <- keggGet("path:map00010")[[1]]
246 |     refs <- res$REFERENCE[[1]]
247 |     checkTrue(length(refs) > 0)
248 | }
249 | 
250 | test_keggCompounds <- function() {
251 |     result <- c(
252 |         "C00011", "C00042", "C00090", "C00146", "C00160", "C00530",
253 |         "C00682", "C01407", "C02124", "C02222", "C02375", "C02575", "C02625",
254 |         "C02814", "C02933", "C03434", "C03572", "C03585", "C03664", "C03918",
255 |         "C04091", "C04431", "C04522", "C04706", "C04729", "C05618", "C06328",
256 |         "C06329", "C06594", "C06596", "C06597", "C06598", "C06599", "C06600",
257 |         "C06601", "C06602", "C06603", "C06755", "C06988", "C06989", "C06990",
258 |         "C07075", "C07088", "C07089", "C07090", "C07091", "C07092", "C07093",
259 |         "C07094", "C07095", "C07096", "C07097", "C07098", "C07099", "C07100",
260 |         "C07101", "C07102", "C07103", "C11352", "C12831", "C12832", "C12833",
261 |         "C12834", "C12835", "C12836", "C12837", "C12838", "C14419", "C14450",
262 |         "C16181", "C16182", "C16266", "C18236", "C18238", "C18240", "C18241",
263 |         "C18242", "C18243", "C18244", "C18933", "C21103", "C21104", "C21105"
264 |     )
265 |     checkTrue(
266 |         all(
267 |             result %in% keggCompounds("map00361")
268 |         )
269 |     )
270 | }
271 | 
272 | 


--------------------------------------------------------------------------------
/man/keggCompounds.Rd:
--------------------------------------------------------------------------------
 1 | \name{keggCompounds}
 2 | \alias{keggCompounds}
 3 | \title{
 4 | Get list of compound IDs for a pathway
 5 | }
 6 | \description{
 7 | Get list of compound IDs for a pathway.
 8 | }
 9 | \usage{
10 | keggCompounds(pathwayID)
11 | }
12 | \arguments{
13 |   \item{pathwayID}{
14 |   A KEGG pathway identifier consisting of the prefix \code{map} followed by a five-digit number.
15 | }
16 | 
17 | }
18 | \value{
19 | A list of KEGG compound identifiers
20 | }
21 | \references{
22 |   \url{https://www.genome.jp/kegg/pathway.html}
23 | }
24 | \author{
25 | Dan Tenenbaum, Kristina Riemer
26 | }
27 | \examples{
28 | keggCompounds("map00361")
29 | }
30 | \keyword{ compounds }
31 | 


--------------------------------------------------------------------------------
/man/keggConv.Rd:
--------------------------------------------------------------------------------
 1 | \name{keggConv}
 2 | \alias{keggConv}
 3 | \alias{conv}
 4 | \alias{bconv}
 5 | \title{
 6 | Convert KEGG identifiers to/from outside identifiers
 7 | }
 8 | \description{
 9 | Convert KEGG identifiers to/from outside identifiers.
10 | }
11 | \usage{
12 | keggConv(target, source, querySize = 100)
13 | }
14 | \arguments{
15 |   \item{target}{
16 |   A KEGG organism code, T number, or one of the external
17 |   databases \code{ncbi-gi}, \code{ncbi-geneid}, \code{ncbi-proteinid},
18 |   \code{uniprot}, or
19 |   (for chemical substance identifiers)
20 |   \code{drug}, \code{compound}, \code{glycan}, \code{pubchem},
21 |   or \code{chebi}.
22 | }
23 | 
24 |   \item{source}{
25 |   Same as \code{target}, but may also be a list of KEGG identifiers
26 |   representing internal or external names.
27 | }
28 | 
29 |   \item{querySize}{
30 |   Empirically, KEGG limits queries to 100 source identifiers per query.
31 |   This argument enables larger queries by dividing \code{source} into
32 |   sub-queries of no more than \code{querySize} identifiers.
33 | }
34 | 
35 | }
36 | \value{
37 | A named character vector.
38 | }
39 | \references{
40 |   \url{https://www.kegg.jp/kegg/docs/keggapi.html}
41 | }
42 | \author{
43 | Dan Tenenbaum
44 | }
45 | \examples{
46 | ## conversion from NCBI GeneID to KEGG ID for E. coli genes
47 | head(keggConv("eco", "ncbi-geneid"))
48 | head(keggConv("ncbi-geneid", "eco")) ## opposite direction
49 | 
50 | ## conversion from KEGG ID to NCBI ProteinID
51 | head(keggConv("ncbi-proteinid", c("hsa:10458", "ece:Z5100")))
52 | 
53 | ## conversion from NCBI GeneID to KEGG ID when the organism code is not known:
54 | head(keggConv("genes", "ncbi-geneid:3113320"))
55 | }
56 | \keyword{ conv }
57 | 


--------------------------------------------------------------------------------
/man/keggFind.Rd:
--------------------------------------------------------------------------------
 1 | \name{keggFind}
 2 | \alias{keggFind}
 3 | \title{
 4 | Finds entries with matching query keywords or other query data in a given 
 5 | database
 6 | }
 7 | \description{
 8 | Finds entries with matching query keywords or other query data in a given 
 9 | database.
10 | }
11 | \usage{
12 | keggFind(database, query, option = c("formula", "exact_mass", 
13 |     "mol_weight")) 
14 | }
15 | \arguments{
16 |   \item{database}{
17 |   Either the name of a single KEGG database (list available via
18 |   \code{\link{listDatabases}()}), a "T number" genome identifier,
19 |   or a KEGG organism code (lists of both available via
20 |   \code{keggList("organism")}).
21 | }
22 |   \item{query}{
23 |   One or more keywords, or a range of integers representing 
24 |   molecular weights.
25 |   If \code{query} includes identifiers not known to KEGG, 
26 |   the results will not contain any information about those identifiers.
27 | }
28 |   \item{option}{
29 |     \code{Optional.} If \code{database} is \code{compound} or \code{drug},
30 |     \code{option} can be \code{formula}, \code{exact_mass}, or
31 |     \code{mol_weight}.
32 |     Chemical formula search is a partial match irrespective of the
33 |     order of atoms given. 
34 |     The exact mass (or molecular weight) is checked by rounding off to the
35 |     same decimal place as the query data.
36 | }
37 | }
38 | \value{
39 | A named character vector.
40 | }
41 | \references{
42 | \url{https://www.kegg.jp/kegg/docs/keggapi.html}
43 | }
44 | \author{
45 | Dan Tenenbaum
46 | }
47 | 
48 | 
49 | \examples{
50 | res <-
51 |     keggFind("genes", c("shiga", "toxin")) ## for keywords "shiga" and "toxin"
52 | length(res)
53 | head(res)
54 | res <- keggFind("genes", "shiga toxin")    ## for keywords "shiga toxin"
55 | length(res)
56 | head(res)
57 | keggFind("compound", "C7H10O5", "formula") ## for chemical formula "C7H10O5"
58 | res <- keggFind("compound", "O5C7", "formula") ## for chemical formula
59 |                                            ## containing "O5" and "C7" 
60 | length(res)
61 | head(res)
62 | keggFind("compound", 174.05, "exact_mass") ## for 174.045
63 |                                            ## =< exact mass < 174.055
64 | res <- keggFind("compound", 300:310, "mol_weight") ## for 300 =<
65 |                                            ## molecular weight =< 310
66 | length(res)
67 | head(res)
68 | }
69 | \keyword{ find }
70 | 


--------------------------------------------------------------------------------
/man/keggGet.Rd:
--------------------------------------------------------------------------------
 1 | \name{keggGet}
 2 | \alias{keggGet}
 3 | \title{
 4 | Retrieves given database entries
 5 | }
 6 | \description{
 7 | Retrieves given database entries.
 8 | }
 9 | \usage{
10 | keggGet(dbentries, option = c("aaseq", "ntseq", "mol", "kcf", 
11 |     "image", "kgml"))
12 | }
13 | %- maybe also 'usage' for other objects documented here.
14 | \arguments{
15 |   \item{dbentries}{
16 |   One or more (up to a maximum of 10) KEGG identifiers.
17 | }
18 |   \item{option}{
19 |     \code{Optional.} Option governing the format of the output.
20 |     \code{aaseq} is an amino acid sequence, \code{ntseq} is a nucleotide
21 |     sequence. \code{image} returns an object which can be written
22 |     to a PNG file, \code{kgml} returns a KGML document.
23 | }
24 | }
25 | \details{
26 | Retrieves all entries from the KEGG database for a set of KEGG identifiers.
27 | 
28 | \code{keggGet()} can only return 10 result sets at once (this limitation
29 | is on the server side). If you supply more than 10 inputs to \code{keggGet()},
30 | \code{KEGGREST} will warn that only the first 10 results will be returned.
31 | }
32 | \value{
33 | A list wrapping a KEGG flat file.
34 | If \code{option} is \code{aaseq}, an \code{AAStringSet} object.
35 | If \code{option} is \code{ntseq}, a \code{DNAStringSet} object.
36 | If \code{option} is \code{image}, an object which can be written
37 | to a PNG file.
38 | If \code{option} is \code{kgml}, a KGML document.
39 | }
40 | \references{
41 |   \url{https://www.kegg.jp/kegg/docs/keggapi.html}
42 | }
43 | \author{
44 | Dan Tenenbaum
45 | }
46 | \examples{
47 | res <- keggGet(c("cpd:C01290", "gl:G00092")) ## retrieves a compound entry
48 |                                     ## and a glycan entry
49 | str(res)
50 | res <- keggGet(c("C01290", "G00092")) ## same as above, without prefixes
51 | str(res)
52 | res <- keggGet(c("hsa:10458", "ece:Z5100")) ## retrieves a human gene entry
53 |                                     ## and an E.coli O157 gene entry
54 | str(res)
55 | res <- keggGet(c("hsa:10458", "ece:Z5100"), "aaseq") ## retrieves amino
56 |                                     ## acid sequences of a human gene and an 
57 |                                     ## E.coli O157 gene
58 | png <- keggGet("hsa05130", "image") ## retrieves the image file of a
59 |                                     ## pathway map
60 | t <- tempfile()
61 | library(png)
62 | writePNG(png, t)
63 | res <- keggGet("hsa05130", "kgml")
64 | str(res)
65 | }
66 | \keyword{ get }
67 | 


--------------------------------------------------------------------------------
/man/keggInfo.Rd:
--------------------------------------------------------------------------------
 1 | \name{keggInfo}
 2 | \alias{keggInfo}
 3 | \alias{info}
 4 | \title{
 5 | Displays the current statistics of a given database
 6 | }
 7 | \description{
 8 | Displays statistics of a given database, such as number of
 9 | entries, version, release date, and source. 
10 | }
11 | \usage{
12 | keggInfo(database)
13 | }
14 | \arguments{
15 |   \item{database}{
16 |   Either a KEGG database (list available via \code{\link{listDatabases}()}),
17 |   a KEGG organism code (list available by calling \code{\link{keggList}()}
18 |   with the \code{organism} argument), or a T number (list available by
19 |   calling \code{\link{keggList}()} with the \code{genome} argument).
20 |   
21 | }
22 | }
23 | \value{
24 | A character vector containing statistics about \code{database}.
25 | }
26 | \references{
27 |   \url{https://www.kegg.jp/kegg/docs/keggapi.html}
28 | }
29 | \author{
30 | Dan Tenenbaum
31 | }
32 | \examples{
33 | res <- keggInfo("kegg") ## displays the current statistics of the KEGG database
34 | cat(res)
35 | res <- keggInfo("pathway") ## displays the number of pathway entries, including both
36 |                     ## the reference and organism-specific pathways
37 | cat(res)
38 | res <- keggInfo("hsa") ## displays the number of gene entries for the
39 |                     ## KEGG organism Homo sapiens
40 | cat(res)
41 | }
42 | \keyword{ info }
43 | \keyword{ metadata }
44 | 


--------------------------------------------------------------------------------
/man/keggLink.Rd:
--------------------------------------------------------------------------------
 1 | \name{keggLink}
 2 | \alias{keggLink}
 3 | \alias{link}
 4 | \title{
 5 | Find related entries by using database cross-references
 6 | }
 7 | \description{
 8 | Find related entries by using database cross-references.
 9 | }
10 | \usage{
11 | keggLink(target, source)
12 | }
13 | \arguments{
14 |   \item{target}{
15 |   Either the name of a single KEGG database (list available via
16 |   \code{\link{listDatabases}()}), a "T number" genome identifier,
17 |   or a KEGG organism code (lists of both available via
18 |   \code{keggList("organism")}).
19 | }
20 |   \item{source}{
21 |   The same as \code{target}, but may also be one or more
22 |   KEGG identifiers.
23 | }
24 | }
25 | \details{
26 | Many of the old KEGGSOAP functions whose names
27 | started with 'get', such as \code{get.pathways.by.genes} and
28 | \code{get.pathways.by.reactions},
29 | are replaced by using \code{keggLink} (see examples).
30 | 
31 | 
32 | 
33 | }
34 | \value{
35 | A named character vector.
36 | }
37 | \references{
38 |   \url{https://www.kegg.jp/kegg/docs/keggapi.html}
39 | }
40 | \author{
41 | Dan Tenenbaum
42 | }
43 | \examples{
44 | res <- keggLink("pathway", "hsa") ## KEGG pathways linked from each of
45 |           ## the human genes equivalent to 'get.genes.by.pathway' in KEGGSOAP
46 | length(res)
47 | head(res)
48 | res <- keggLink("hsa", "pathway") ## human genes linked from each of the
49 |           ## KEGG pathways equivalent to 'get.pathways.by.genes' in KEGGSOAP
50 | keggLink("pathway", c("hsa:10458", "ece:Z5100")) ## KEGG pathways
51 |           ## linked from a human gene and an E. coli O157 gene
52 | res <- keggLink("hsa:126") ## LinkDB search shows all KEGG
53 |           ## resources related to hsa:126
54 | head(res)
55 | }
56 | \keyword{ link }
57 | 


--------------------------------------------------------------------------------
/man/keggList.Rd:
--------------------------------------------------------------------------------
 1 | \name{keggList}
 2 | \alias{keggList}
 3 | \title{
 4 | Returns a list of entry identifiers and associated definition for a given
 5 | database or a given set of database entries.
 6 | %%  ~~function to do ... ~~
 7 | }
 8 | \description{
 9 | Returns a list of entry identifiers and associated definition for a given
10 | database or a given set of database entries.
11 | }
12 | \usage{
13 | keggList(database, organism)
14 | }
15 | %- maybe also 'usage' for other objects documented here.
16 | \arguments{
17 |   \item{database}{
18 | %%     ~~Describe \code{x} here~~
19 | Either a KEGG database (list available via \code{\link{listDatabases}()}),
20 | a KEGG organism code (list available via \code{\link{keggList}()} with the
21 | \code{organism} argument), a T number (list available via
22 | \code{\link{keggList}()} with the \code{genome} argument), or a character
23 | vector of KEGG identifiers.
24 | }
25 |   \item{organism}{
26 |   \code{Optional.} A KEGG organism identifier (list available via
27 |   \code{\link{keggList}()} with the \code{organism} argument).
28 | }
29 | }
30 | \value{
31 | A named character vector containing entry identifiers and
32 | associated definition.
33 | }
34 | \references{
35 |   \url{https://www.kegg.jp/kegg/docs/keggapi.html}
36 | }
37 | \author{
38 | Dan Tenenbaum
39 | }
40 | \examples{
41 | res <- keggList("pathway") ## returns the list of reference pathways
42 | length(res)
43 | head(res) 
44 | res <- keggList("pathway", "hsa") ## returns the list of human pathways
45 | length(res)
46 | head(res)
47 | res <- keggList("organism") ## returns the list of KEGG organisms with
48 |                      ## taxonomic classification
49 | nrow(res)
50 | head(res)
51 | res <- keggList("hsa")  ## returns the entire list of human genes
52 | length(res)
53 | head(res)
54 | ## keggList("T01001") ## same as above
55 | keggList(c("hsa:10458", "ece:Z5100")) ## returns the list of a human gene
56 |                                       ## and an E.coli O157 gene
57 | keggList(c("cpd:C01290","gl:G00092")) ## returns the list of a compound entry
58 |                                       ## and a glycan entry
59 | keggList(c("C01290+G00092")) ## same as above (prefixes are not necessary)
60 | }
61 | \keyword{ list }
62 | 


--------------------------------------------------------------------------------
/man/listDatabases.Rd:
--------------------------------------------------------------------------------
 1 | \name{listDatabases}
 2 | \alias{listDatabases}
 3 | \title{
 4 | Lists the KEGG databases which may be searched
 5 | }
 6 | \description{
 7 | Lists the KEGG databases which may be searched. In most cases,
 8 | you can also use a KEGG organism name or T number (genome identifier)
 9 | as a database name.
10 | }
11 | \usage{
12 | listDatabases()
13 | }
14 | \value{
15 | A character vector of database names.
16 | }
17 | \references{
18 |   \url{https://www.kegg.jp/kegg/docs/keggapi.html}
19 | }
20 | \author{
21 | Dan Tenenbaum
22 | }
23 | \seealso{
24 | \code{\link{keggList}}
25 | }
26 | \examples{
27 | listDatabases()
28 | res <- keggList("organism") ## list all organisms
29 | nrow(res)
30 | head(res)
31 | res <- keggList("hsa") ## list all human genes
32 | length(res)
33 | head(res)
34 | ## keggList("T01001") ## list all human genes
35 | res <- keggList("genome") ## list all genome identifiers
36 | length(res)
37 | head(res)
38 | }
39 | \keyword{ database }
40 | \keyword{ databases }
41 | 


--------------------------------------------------------------------------------
/man/mark.pathway.by.objects.Rd:
--------------------------------------------------------------------------------
 1 | \name{mark.pathway.by.objects}
 2 | \alias{mark.pathway.by.objects}
 3 | \alias{color.pathway.by.objects}
 4 | 
 5 | \title{Client-side interface to obtain a URL for a KEGG pathway diagram
 6 | with a given set of genes marked}
 7 | \description{
 8 |   Given a KEGG pathway id and a set of KEGG gene ids, the functions
 9 |   return the URL of a KEGG pathway diagram with the elements
10 |   corresponding to the genes marked in red or in a specified color.
11 | }
12 | \usage{
13 | mark.pathway.by.objects(pathway.id, object.id.list)
14 | color.pathway.by.objects(pathway.id, object.id.list,
15 |                                      fg.color.list, bg.color.list)
16 | }
17 | 
18 | \arguments{
19 |   \item{pathway.id}{\code{pathway.id} a character string for a KEGG
20 |     pathway id. KEGG pathway ids consist of the string path followed by
21 |     a colon, a three-letter code for the organism of concern, and then
22 |     a number (e.g. "path:eco00020"). The three-letter organism code
23 |     consists of the first letter of the genus name and the first two
24 |     letters of the species name of the scientific name of the organism
25 |     of concern}
26 |   \item{object.id.list}{\code{object.id.list} a vector of character
27 |     strings for KEGG gene ids. KEGG gene ids normally consist of
28 |     three letters followed by a colon and then several
29 |     digits. The three letters are from the first letter of the genus
30 |     name and the first two letters of the species name of the scientific
31 |     name of the organism of concern (e.g. hsa:111 for Homo sapiens)}
32 |   \item{fg.color.list}{\code{fg.color.list} a vector of two character
33 |     strings to indicate the color for the text and border, respectively,
34 |     of the objects in a pathway diagram. The strings can either be a
35 |     color code like #ff0000 or a color name like yellow}
36 |   \item{bg.color.list}{\code{bg.color.list} a vector of character
37 |     strings of the same length as \code{object.id.list} to indicate the
38 |     background color of the objects in a pathway diagram. The strings
39 |     can either be a color code like #ff0000 or a color name like yellow}
40 | }
41 | \details{
42 |   This function only returns the URL of the KEGG pathway diagram. Use
43 |   the function \code{\link{browseURL}} to view the diagram.
44 | 
45 |   These functions are not part of the KEGG REST API; they are provided
46 |   because they existed in \code{KEGGSOAP} and an alternative implementation
47 |   was possible.
48 | }
49 | \value{
50 |   These functions return a character string containing the URL.
51 | }
52 | \references{\url{https://www.kegg.jp/kegg/docs/keggapi.html}}
53 | \author{Jianhua Zhang}
54 | 
55 | \seealso{\code{\link{browseURL}}}
56 | \examples{
57 |  url <- mark.pathway.by.objects(
58 |     "path:eco00260", c("eco:b0002", "eco:c00263")
59 | )
60 | if(interactive()){
61 |     browseURL(url)
62 | }
63 | url <- color.pathway.by.objects(
64 |     "path:eco00260", c("eco:b0002", "eco:c00263"),
65 |     c("#ff0000", "#00ff00"),
66 |     c("#ffff00", "yellow")
67 | )
68 | }
69 | \keyword{ datasets }
70 | 
71 | 


--------------------------------------------------------------------------------
/tests/KEGGREST_unit_tests.R:
--------------------------------------------------------------------------------
1 | BiocGenerics:::testPackage("KEGGREST")
2 | 


--------------------------------------------------------------------------------
/vignettes/KEGGREST-vignette.Rmd:
--------------------------------------------------------------------------------
  1 | ---
  2 | title: "Accessing the KEGG REST API"
  3 | date: "`r format(Sys.Date(), '%B %d, %Y')`"
  4 | vignette: >
  5 |   %\VignetteIndexEntry{Accessing the KEGG REST API}
  6 |   %\VignetteEngine{knitr::rmarkdown}
  7 |   %\VignetteEncoding{UTF-8}
  8 | output:
  9 |   BiocStyle::html_document:
 10 |     toc: true
 11 | ---
 12 | 
 13 | ```{r setup, echo=FALSE}
 14 | library(knitr)
 15 | options(width=80)
 16 | ```
 17 | ```{r wrap-hook, echo=FALSE}
 18 | hook_output = knit_hooks$get('output')
 19 | knit_hooks$set(output = function(x, options) {
 20 |   # this hook is used only when the linewidth option is not NULL
 21 |   if (!is.null(n <- options$linewidth)) {
 22 |     x = knitr:::split_lines(x)
 23 |     # any lines wider than n should be wrapped
 24 |     if (any(nchar(x) > n)) x = strwrap(x, width = n)
 25 |     x = paste(x, collapse = '\n')
 26 |   }
 27 |   hook_output(x, options)
 28 | })
 29 | ```
 30 | 
 31 | # KEGGREST
 32 | 
 33 | [KEGG](https://www.kegg.jp/kegg/)
 34 | is a database resource for understanding high-level functions
 35 | and utilities of the biological system, such as the cell, the organism
 36 | and the ecosystem, from molecular-level information, especially
 37 | large-scale molecular datasets generated by genome sequencing and
 38 | other high-throughput experimental technologies.
 39 | 
 40 | `KEGGREST` allows access to the
 41 | [KEGG REST API](https://www.kegg.jp/kegg/docs/keggapi.html). Since
 42 | KEGG disabled the KEGG SOAP server
 43 | on December 31, 2012 (which means the `KEGGSOAP` package will no
 44 | longer work), `KEGGREST` serves as a replacement.
 45 | 
 46 | The interface to `KEGGREST` is simpler and in some ways more
 47 | powerful than `KEGGSOAP`; however, not all the functionality
 48 | that was available through the SOAP API has been exposed
 49 | in the REST API. If and when more functionality is exposed
 50 | on the server side, this package will be updated to take
 51 | advantage of it.
 52 | 
 53 | **Restriction: The KEGG API is provided for academic use by academic
 54 | users belonging to academic institutions. See https://www.kegg.jp/kegg/rest/
 55 | for more information.**
 56 | 
 57 | ## Installation
 58 | 
 59 | You can install `KEGGREST` from Bioconductor with:
 60 | 
 61 | ```{r install,eval=FALSE}
 62 | if (!require("BiocManager", quietly=TRUE))
 63 |     install.packages("BiocManager")
 64 | 
 65 | BiocManager::install("KEGGREST")
 66 | ```
 67 | 
 68 | ## Overview
 69 | 
 70 | The KEGG REST API is built on some simple operations:
 71 | `info`, `list`, `find`, `get`, `conv`, and `link`.
 72 | The corresponding `R` functions in `KEGGREST` are:
 73 | `keggInfo()`, `keggList()`, `keggFind()`, `keggGet()`,
 74 | `keggConv()`, and `keggLink()`.
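
As a rough illustration of how these operations map onto REST requests, the sketch below assembles request URLs of the form `<base>/<operation>/<arguments>`. The helper name `keggUrl()` and its `base` argument are hypothetical and are not part of `KEGGREST`; multiple identifiers are joined with `+`, as in the KEGG API documentation. The block is shown for reading only and is not evaluated when the vignette is built.

```r
## Hypothetical helper, not exported by KEGGREST: assemble a KEGG REST URL.
keggUrl <- function(operation, ..., base = "https://rest.kegg.jp") {
    ## Restrict to the six operations listed above.
    operation <- match.arg(operation,
                           c("info", "list", "find", "get", "conv", "link"))
    ## Each argument becomes one path segment; vectors are joined with "+".
    segments <- vapply(list(...), paste, character(1), collapse = "+")
    paste(c(base, operation, segments), collapse = "/")
}

keggUrl("info", "pathway")
keggUrl("get", c("hsa:10458", "ece:Z5100"), "aaseq")
```

In practice you would call `keggInfo()`, `keggGet()` and friends rather than build URLs by hand; the sketch only makes the correspondence between function and operation explicit.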
 75 | 
 76 | 
 77 | # Exploring KEGG Resources with `keggList()`
 78 | 
 79 | KEGG exposes a number of databases. To get an idea of
 80 | what is available, run `listDatabases()`:
 81 | 
 82 | ```{r listDatabases}
 83 | library(KEGGREST)
 84 | listDatabases()
 85 | ```
 86 | You can use these databases in further queries. Note that in many
 87 | cases you can also use a three- or four-letter KEGG organism code or a
 88 | "T number" (genome identifier) in the same place you would use 
 89 | one of these database names.
 90 | 
 91 | You can obtain the list of organisms available in KEGG with
 92 | the `keggList()` function:
 93 | 
 94 | ```{r get_organisms}
 95 | org <- keggList("organism")
 96 | head(org)
 97 | ```
 98 | 
 99 | From `KEGGREST`'s point of view, you've just asked KEGG
100 | to show you the name of every entry in the "organism" database.
101 | 
102 | Therefore, the complete list of entities that can be
103 | queried with `KEGGREST` can be obtained as follows:
104 | 
105 | ```{r list_queryables}
106 | queryables <- c(listDatabases(), org[,1], org[,2])
107 | ```
108 | 
109 | You could also ask for every entry in the "hsa" (_Homo sapiens_)
110 | database as follows:
111 | 
112 | ```{r query_hsa, eval=FALSE}
113 | keggList("hsa")
114 | ```
115 | 
116 | # Get specific entries with `keggGet()`
117 | 
118 | Once you have a list of specific KEGG identifiers, use
119 | `keggGet()` to get more information about them. Here we look up
120 | a human gene and an _E. coli_ O157 gene:
121 | 
122 | ```{r keggGet}
123 | query <- keggGet(c("hsa:10458", "ece:Z5100"))
124 | ```
125 | 
126 | As expected, this returns two items:
127 | 
128 | ```{r querylength}
129 | length(query)
130 | ```
131 | 
132 | Behind the scenes, `KEGGREST` downloaded and parsed a KEGG
133 | [flat file](https://www.kegg.jp/kegg/rest/dbentry.html), which you
134 | can now explore:
135 | 
136 | ```{r explore}
137 | names(query[[1]])
138 | query[[1]]$ENTRY
139 | query[[1]]$DBLINKS
140 | ```
141 | 
142 | `keggGet()` can also return amino acid sequences as `AAStringSet` objects
143 | (from the `Biostrings` package):
144 | 
145 | ```{r aaseq}
146 | keggGet(c("hsa:10458", "ece:Z5100"), "aaseq") ## retrieves amino acid sequences
147 | ```
148 | 
149 | ...or `DNAStringSet` objects if `option` is `ntseq`:
150 | 
151 | ```{r ntseq}
152 | keggGet(c("hsa:10458", "ece:Z5100"), "ntseq") ## retrieves nucleotide sequences
153 | ```
154 | 
155 | 
156 | 
157 | `keggGet()` can also return images:
158 | ```{r png}
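159 | img <- keggGet("hsa05130", "image")   ## pathway map, read in as a PNG array
160 | pngFile <- tempfile(fileext = ".png") ## extension helps viewers recognize the file
161 | library(png)
162 | writePNG(img, pngFile)
163 | if (interactive()) browseURL(pngFile)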
164 | ```
165 | 
166 | __NOTE__: `keggGet()` can only return 10 result sets at once (this limitation
167 | is on the server side). If you supply more than 10 inputs to `keggGet()`, 
168 | `KEGGREST` will warn that only the first 10 results will be returned.
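
One way to work within the 10-entry limit described above is to split a longer
vector of identifiers into groups of at most 10 and call `keggGet()` once per
group. The following is only a minimal sketch: `chunkIds()` and `keggGetAll()`
are hypothetical helpers, not part of `KEGGREST`, and the sketch assumes the
default flat-file output, where each call returns a list of parsed entries.

```{r batched_get, eval=FALSE}
## Hypothetical helper (not part of KEGGREST): split a vector of
## identifiers into groups of at most `size` elements.
chunkIds <- function(ids, size = 10) {
    split(ids, ceiling(seq_along(ids) / size))
}

## Hypothetical helper (not part of KEGGREST): call keggGet() once per
## group and concatenate the resulting lists of entries.
keggGetAll <- function(ids) {
    results <- lapply(chunkIds(ids), function(chunk) KEGGREST::keggGet(chunk))
    do.call(c, unname(results))
}

## Usage, where `ids` is a character vector of KEGG identifiers:
## entries <- keggGetAll(ids)
```

Each group triggers a separate request to the KEGG server, so a long list of
identifiers results in many requests.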
169 | 
170 | # Search by keywords with `keggFind()`
171 | 
172 | You can search for two separate keywords ("shiga" and "toxin" in this case):
173 | 
174 | ```{r separate_keywords, linewidth=80}
175 | head(keggFind("genes", c("shiga", "toxin")))
176 | ```
177 | 
178 | Or search for the two words together:
179 | 
180 | ```{r keyphrase, linewidth=80}
181 | head(keggFind("genes", "shiga toxin"))
182 | ```
183 | 
184 | Search for a chemical formula:
185 | ```{r formula}
186 | head(keggFind("compound", "C7H10O5", "formula"))
187 | ```
188 | Search for a chemical formula containing "O5" and "C7":
189 | ```{r formula2}
190 | head(keggFind("compound", "O5C7", "formula"))
191 | ```
192 | 
193 | You can search for compounds with a particular exact mass:
194 | 
195 | ```{r exact_mass}
196 | keggFind("compound", 174.05, "exact_mass")
197 | ```
198 | 
199 | Because we've supplied a number with two decimal digits of precision,
200 | KEGG will find all compounds with exact mass between 174.045 and 174.055.
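
The arithmetic behind that range can be sketched in a few lines of `R`.
`massWindow()` is a hypothetical helper written for this illustration, not part
of `KEGGREST`; it only reproduces the rounding interval described above and
does not contact KEGG.

```{r mass_window}
## Hypothetical helper (not part of KEGGREST): the interval of masses
## that round to `mass` at the given number of decimal digits.
massWindow <- function(mass, digits) {
    half <- 0.5 * 10^(-digits)   # half of the last retained decimal place
    c(lower = mass - half, upper = mass + half)
}

massWindow(174.05, 2)
```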
201 | 
202 | Integer ranges can be used to find compounds by molecular weight:
203 | 
204 | ```{r mol_weight}
205 | head(keggFind("compound", 300:310, "mol_weight"))
206 | ```
207 | 
208 | # Convert identifiers with `keggConv()`
209 | 
210 | `keggConv()` converts between KEGG identifiers and identifiers from outside databases.
211 | 
212 | You can either specify fully qualified identifiers:
213 | 
214 | ```{r conv_with_ids}
215 | keggConv("ncbi-proteinid", c("hsa:10458", "ece:Z5100"))
216 | ```
217 | 
218 | ...or get the mapping for an entire species (the first argument is the target, the second the source):
219 | 
220 | ```{r conv_species_kegg_to_geneid}
221 | head(keggConv("eco", "ncbi-geneid"))
222 | ```
223 | 
224 | Reversing the arguments does the opposite mapping:
225 | 
226 | ```{r conv_species_geneid_to_kegg}
227 | head(keggConv("ncbi-geneid", "eco"))
228 | 
229 | ```
230 | 
231 | # Link across databases with `keggLink()`
232 | 
233 | Most of the `KEGGSOAP` functions whose names started with
234 | "get", for example `get.pathways.by.genes()`, can be replaced
235 | with the `keggLink()` function. Here we query all pathways
236 | for human:
237 | 
238 | ```{r keggLink}
239 | head(keggLink("pathway", "hsa"))
240 | ```
241 | 
242 | ...but you can also specify one or more genes (from multiple species):
243 | ```{r keggLink2}
244 | keggLink("pathway", c("hsa:10458", "ece:Z5100"))
245 | ```
246 | 


--------------------------------------------------------------------------------