├── .Rbuildignore ├── .github ├── CONTRIBUTING.md ├── dependabot.yml ├── issue_template.md ├── pull_request_template.md └── workflows │ └── R-check.yaml ├── .gitignore ├── CRAN-SUBMISSION ├── DESCRIPTION ├── LICENSE ├── LICENSE.md ├── NAMESPACE ├── NEWS.md ├── R ├── gn_debug.R ├── gn_parse.R ├── gn_parse_tidy.R ├── gn_version.R ├── gnparser.R ├── install_gnparser.R ├── rgnparser-package.R └── zzz.R ├── README.Rmd ├── README.md ├── codemeta.json ├── cran-comments.md ├── inst └── precompile.R ├── man ├── figures │ ├── lifecycle-archived.svg │ ├── lifecycle-defunct.svg │ ├── lifecycle-deprecated.svg │ ├── lifecycle-experimental.svg │ ├── lifecycle-maturing.svg │ ├── lifecycle-questioning.svg │ ├── lifecycle-soft-deprecated.svg │ ├── lifecycle-stable.svg │ └── lifecycle-superseded.svg ├── gn_debug.Rd ├── gn_parse.Rd ├── gn_parse_tidy.Rd ├── gn_version.Rd ├── install_gnparser.Rd └── rgnparser-package.Rd ├── revdep ├── README.md ├── cran.md ├── failures.md └── problems.md ├── rgnparser.Rproj ├── tests ├── testthat.R └── testthat │ ├── test-gn_parse.R │ ├── test-gn_parse_tidy.R │ └── test-gn_version.R └── vignettes ├── .gitignore ├── rgnparser.Rmd └── rgnparser.Rmd.orig /.Rbuildignore: -------------------------------------------------------------------------------- 1 | ^.*\.Rproj$ 2 | ^\.Rproj\.user$ 3 | README.Rmd 4 | ^notes\.md$ 5 | man-roxygen 6 | ^cran-comments\.md$ 7 | .github 8 | ^LICENSE\.md$ 9 | vignettes/rgnparser.Rmd 10 | ^codemeta\.json$ 11 | notes-egs.R 12 | get_gnparser_url.R 13 | revdep 14 | ^CRAN-SUBMISSION$ 15 | -------------------------------------------------------------------------------- /.github/CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # CONTRIBUTING # 2 | 3 | ### Fixing typos 4 | 5 | Small typos or grammatical errors in documentation may be edited directly using 6 | the GitHub web interface, so long as the changes are made in the _source_ file. 
7 | 8 | * YES: you edit a roxygen comment in a `.R` file below `R/`. 9 | * NO: you edit an `.Rd` file below `man/`. 10 | 11 | ### Prerequisites 12 | 13 | Before you make a substantial pull request, you should always file an issue and 14 | make sure someone from the team agrees that it’s a problem. If you’ve found a 15 | bug, create an associated issue and illustrate the bug with a minimal 16 | [reprex](https://www.tidyverse.org/help/#reprex). 17 | 18 | ### Pull request process 19 | 20 | * We recommend that you create a Git branch for each pull request (PR). 21 | * Look at the GitHub Actions build status before and after making changes. 22 | The `README` should contain badges for any continuous integration services used 23 | by the package. 24 | * We recommend the tidyverse [style guide](http://style.tidyverse.org). 25 | You can use the [styler](https://CRAN.R-project.org/package=styler) package to 26 | apply these styles, but please don't restyle code that has nothing to do with 27 | your PR. 28 | * We use [roxygen2](https://cran.r-project.org/package=roxygen2). 29 | * We use [testthat](https://cran.r-project.org/package=testthat). Contributions 30 | with test cases included are easier to accept. 31 | * For user-facing changes, add a bullet to the top of `NEWS.md` below the 32 | current development version header describing the changes made followed by your 33 | GitHub username, and links to relevant issue(s)/PR(s). 34 | 35 | ### Code of Conduct 36 | 37 | Please note that the `rgnparser` project is released with a 38 | [Contributor Code of Conduct](https://ropensci.org/code-of-conduct/). By contributing to this project, you agree to abide by its terms. 39 | 40 | ### See rOpenSci [contributing guide](https://devguide.ropensci.org/contributingguide.html) 41 | for further details. 42 | 43 | ### Discussion forum 44 | 45 | Check out our [discussion forum](https://discuss.ropensci.org) if you think your issue requires a longer form discussion.
46 | -------------------------------------------------------------------------------- /.github/dependabot.yml: -------------------------------------------------------------------------------- 1 | # To get started with Dependabot version updates, you'll need to specify which 2 | # package ecosystems to update and where the package manifests are located. 3 | # Please see the documentation for all configuration options: 4 | # https://docs.github.com/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file 5 | 6 | version: 2 7 | updates: 8 | - package-ecosystem: "github-actions" 9 | directory: "/" 10 | schedule: 11 | interval: "monthly" 12 | -------------------------------------------------------------------------------- /.github/issue_template.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 |
Session Info 8 | 9 | ```r 10 | 11 | ``` 12 |
13 | -------------------------------------------------------------------------------- /.github/pull_request_template.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ## Description 10 | 11 | 12 | ## Related Issue 13 | 16 | 17 | ## Example 18 | 20 | 21 | 23 | 24 | -------------------------------------------------------------------------------- /.github/workflows/R-check.yaml: -------------------------------------------------------------------------------- 1 | # Workflow derived from https://github.com/r-lib/actions/tree/v2/examples 2 | # Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help 3 | on: 4 | push: 5 | branches: [main, master] 6 | pull_request: 7 | branches: [main, master] 8 | 9 | name: R-CMD-check 10 | 11 | jobs: 12 | R-CMD-check: 13 | runs-on: ${{ matrix.config.os }} 14 | 15 | name: ${{ matrix.config.os }} (${{ matrix.config.r }}) 16 | 17 | strategy: 18 | fail-fast: false 19 | matrix: 20 | config: 21 | - {os: macos-latest, r: 'release'} 22 | - {os: windows-latest, r: 'release'} 23 | - {os: ubuntu-latest, r: 'devel', http-user-agent: 'release'} 24 | - {os: ubuntu-latest, r: 'release'} 25 | - {os: ubuntu-latest, r: 'oldrel-1'} 26 | 27 | env: 28 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 29 | R_KEEP_PKG_SOURCE: yes 30 | 31 | steps: 32 | - uses: actions/checkout@v3 33 | 34 | - uses: r-lib/actions/setup-pandoc@v2 35 | 36 | - uses: r-lib/actions/setup-r@v2 37 | with: 38 | r-version: ${{ matrix.config.r }} 39 | http-user-agent: ${{ matrix.config.http-user-agent }} 40 | use-public-rspm: true 41 | 42 | - uses: r-lib/actions/setup-r-dependencies@v2 43 | with: 44 | extra-packages: any::rcmdcheck 45 | needs: check 46 | 47 | - uses: r-lib/actions/check-r-package@v2 48 | with: 49 | upload-snapshots: true -------------------------------------------------------------------------------- /.gitignore: 
-------------------------------------------------------------------------------- 1 | ^.*\.Rproj$ 2 | ^\.Rproj\.user$ 3 | .Rproj.user 4 | 5 | .Rhistory 6 | ^notes\.md$ 7 | inst/doc 8 | notes-egs.R 9 | get_gnparser_url.R 10 | 11 | revdep/checks.noindex/ 12 | revdep/library.noindex/ 13 | revdep/data.sqlite 14 | 15 | .DS_Store 16 | -------------------------------------------------------------------------------- /CRAN-SUBMISSION: -------------------------------------------------------------------------------- 1 | Version: 0.2.6 2 | Date: 2023-02-01 08:32:27 UTC 3 | SHA: 67779d3a0813d03a2458fdc2305c3b9b3a7e6f41 4 | -------------------------------------------------------------------------------- /DESCRIPTION: -------------------------------------------------------------------------------- 1 | Package: rgnparser 2 | Title: Parse Scientific Names 3 | Description: Parse scientific names using 'gnparser' 4 | (), written in Go. 'gnparser' 5 | parses scientific names into their component parts; it utilizes 6 | a Parsing Expression Grammar specifically for scientific names. 
7 | Version: 0.3.0 8 | Authors@R: c( 9 | person("Scott", "Chamberlain", role = "aut", 10 | email = "sckott@protonmail.com", 11 | comment = c(ORCID = "0000-0003-1444-9135")), 12 | person("Joel H.", "Nitta", role = c("aut","cre"), 13 | email = "joelnitta@gmail.com", 14 | comment = c(ORCID = "0000-0003-4719-7472")), 15 | person("Alban", "Sagouis", role = "aut", 16 | email = "alban.sagouis@idiv.de", 17 | comment = c(ORCID = "0000-0002-3827-1063")) 18 | ) 19 | License: MIT + file LICENSE 20 | URL: https://docs.ropensci.org/rgnparser/, https://github.com/ropensci/rgnparser 21 | BugReports: https://github.com/ropensci/rgnparser/issues 22 | Roxygen: list(markdown = TRUE) 23 | Encoding: UTF-8 24 | Language: en-US 25 | SystemRequirements: gnparser () 26 | Imports: 27 | sys, 28 | tibble, 29 | jsonlite, 30 | readr, 31 | lifecycle 32 | Suggests: 33 | testthat 34 | RoxygenNote: 7.2.3 35 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | YEAR: 2020 2 | COPYRIGHT HOLDER: Scott Chamberlain 3 | -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | # MIT License 2 | 3 | Copyright (c) 2020 Scott Chamberlain 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 
14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /NAMESPACE: -------------------------------------------------------------------------------- 1 | # Generated by roxygen2: do not edit by hand 2 | 3 | export(gn_debug) 4 | export(gn_parse) 5 | export(gn_parse_tidy) 6 | export(gn_version) 7 | export(install_gnparser) 8 | importFrom(jsonlite,fromJSON) 9 | importFrom(lifecycle,deprecated) 10 | importFrom(readr,read_csv) 11 | importFrom(sys,exec_internal) 12 | importFrom(tibble,as_tibble) 13 | -------------------------------------------------------------------------------- /NEWS.md: -------------------------------------------------------------------------------- 1 | rgnparser 0.3.0 2 | =============== 3 | 4 | ### NEW FEATURES 5 | 6 | * Support was added for gnparser arguments `cultivar`, `capitalize`, `diaereses` (#31) 7 | 8 | ### MISC 9 | 10 | * The default number of threads was changed to 1 (#32) 11 | 12 | * Website links and badges fixed (#28, #30) 13 | 14 | ### DEFUNCT 15 | 16 | * `install_gnparser()` has been deprecated to stay in compliance with CRAN policies ("Packages should not write in the user’s home filespace, nor anywhere else on the file system apart from the R session’s temporary directory") 17 | 18 | ### BUG FIX 19 | 20 | * A bug was fixed where the `ignore_tags` argument was not respected (#31) 21 | 22 | rgnparser 0.2.6 23 | =============== 24 | ### BUG FIX 25 | `install_gnparser()` now downloads the correct binary file 
depending on the system it installs 'gnparser' on. 26 | 27 | rgnparser 0.2.0 28 | =============== 29 | 30 | ### NEW FEATURES 31 | 32 | * A new gnparser version (v1) is out. In addition, gnparser has moved from GitLab to GitHub, which also required changes to `install_gnparser()` because we download Go binaries from the gnparser source repository (#7) 33 | * As part of the new gnparser version, arguments have changed in `gn_parse()` and `gn_parse_tidy()`: `format` has been removed. `batch_size` and `ignore_tags` were added to both functions, while `details` was added to `gn_parse()` only. See docs for details. (#11) 34 | * gnparser v1 or greater is now required (#10) 35 | 36 | ### DEFUNCT 37 | 38 | * `gn_debug()` is now defunct. The gnparser command for this function was removed in gnparser v1 (#9) 39 | 40 | ### BUG FIXES 41 | 42 | * `gn_version()` was broken with the new gnparser version, fixed now (#8) 43 | 44 | rgnparser 0.1.0 45 | =============== 46 | 47 | ### NEW FEATURES 48 | 49 | * First submission to CRAN. 50 | -------------------------------------------------------------------------------- /R/gn_debug.R: -------------------------------------------------------------------------------- 1 | #' gn_debug 2 | #' 3 | #' DEFUNCT 4 | #' 5 | #' @export 6 | #' @param ... ignored 7 | gn_debug <- function(...) { 8 | .Defunct(msg="gn_debug is defunct, functionality was removed from gnparser") 9 | } 10 | -------------------------------------------------------------------------------- /R/gn_parse.R: -------------------------------------------------------------------------------- 1 | #' gn_parse 2 | #' 3 | #' extract names using gnparser 4 | #' 5 | #' @export 6 | #' @param x (character) vector of scientific names. required 7 | #' @param threads (integer/numeric) number of threads to run for parallel 8 | #' processing. Setting to `NULL` will use all threads available.
default: `1` 9 | #' @param batch_size (integer/numeric) maximum number of names in a 10 | #' batch sent for processing. default: `NULL` 11 | #' @param cultivar (logical) adds support for botanical cultivars like 12 | #' `Sarracenia flava 'Maxima'` and graft-chimaeras like `+ Crataegomespilus`. 13 | #' default: `FALSE` 14 | #' @param capitalize (logical) capitalizes the first letter of name-strings. 15 | #' default: `FALSE` 16 | #' @param diaereses (logical) preserves diaereses within names, e.g. 17 | #' `Leptochloöpsis virgata`. The stemmed canonical name will be generated 18 | #' without diaereses. default: `FALSE` 19 | #' @param ignore_tags (logical) ignore HTML entities and tags when 20 | #' parsing. default: `FALSE` 21 | #' @param details (logical) return more details for a parsed name. default: `FALSE` 22 | #' @return a list 23 | #' @examples 24 | #' trys <- function(x) try(x, silent=TRUE) 25 | #' if (interactive()) { 26 | #' x <- c("Quadrella steyermarkii (Standl.) Iltis & Cornejo", 27 | #' "Parus major Linnaeus, 1788", "Helianthus annuus var.
texanus") 28 | #' trys(gn_parse(x[1])) 29 | #' trys(gn_parse(x[2])) 30 | #' trys(gn_parse(x[3])) 31 | #' trys(gn_parse(x)) 32 | #' # details 33 | #' w <- trys(gn_parse(x, details = TRUE)) 34 | #' w[[1]]$details # details for one name 35 | #' lapply(w, "[[", "details") # details for all names 36 | #' z <- trys(gn_parse(x, details = FALSE)) # compared to regular 37 | #' z 38 | #' } 39 | gn_parse <- function( 40 | x, 41 | threads = 1, 42 | batch_size = NULL, 43 | ignore_tags = FALSE, 44 | cultivar = FALSE, 45 | capitalize = FALSE, 46 | diaereses = FALSE, 47 | details = FALSE) { 48 | 49 | gnparser_exists() 50 | ver_check(1) 51 | assert(x, "character") 52 | file <- tempfile(fileext = ".txt") 53 | on.exit(unlink(file)) 54 | cat(x, file = file, sep = "\n") 55 | res <- parse_one(file, "compact", threads, batch_size, 56 | ignore_tags, cultivar, capitalize, diaereses, details) 57 | lapply(strsplit(res, "\n")[[1]], jsonlite::fromJSON) 58 | } 59 | -------------------------------------------------------------------------------- /R/gn_parse_tidy.R: -------------------------------------------------------------------------------- 1 | #' gn_parse_tidy 2 | #' 3 | #' extract names using gnparser into a tidy tibble 4 | #' 5 | #' @export 6 | #' @inheritParams gn_parse 7 | #' @return a data.frame 8 | #' @details This function focuses on a data.frame result that's easy 9 | #' to munge downstream - note that this function does not do additional 10 | #' details as does [gn_parse()]. 11 | #' @examples 12 | #' trys <- function(x) try(x, silent=TRUE) 13 | #' if (interactive()) { 14 | #' x <- c("Quadrella steyermarkii (Standl.) Iltis & Cornejo", 15 | #' "Parus major Linnaeus, 1788", "Helianthus annuus var. 
texanus") 16 | #' trys(gn_parse_tidy(x)) 17 | #' } 18 | gn_parse_tidy <- function( 19 | x, 20 | threads = 1, 21 | batch_size = NULL, 22 | cultivar = FALSE, 23 | capitalize = FALSE, 24 | diaereses = FALSE, 25 | ignore_tags = FALSE) { 26 | 27 | gnparser_exists() 28 | ver_check(1) 29 | assert(x, "character") 30 | file <- tempfile(fileext = ".txt") 31 | on.exit(unlink(file)) 32 | cat(x, file = file, sep = "\n") 33 | readcsv( 34 | parse_one( 35 | file, 36 | threads = threads, 37 | batch_size = batch_size, 38 | cultivar = cultivar, 39 | capitalize = capitalize, 40 | diaereses = diaereses, 41 | ignore_tags = ignore_tags 42 | ) 43 | ) 44 | } 45 | -------------------------------------------------------------------------------- /R/gn_version.R: -------------------------------------------------------------------------------- 1 | #' gn_version 2 | #' 3 | #' get gnparser version information 4 | #' 5 | #' @export 6 | #' @return named list, with `version` and `build` 7 | #' @examples 8 | #' trys <- function(x) try(x, silent=TRUE) 9 | #' if (interactive()) { 10 | #' trys(gn_version()) 11 | #' } 12 | gn_version <- function() { 13 | gnparser_exists() 14 | # z <- gnparser_cmd("-V", error = FALSE) 15 | z <- gnparser_cmd("-V", error = FALSE) 16 | if (z$status != 0) z <- gnparser_cmd("-v", error = FALSE) 17 | err_chk(z) 18 | process_version_string(z$stdout) 19 | # txt <- rawToChar(z$stdout) 20 | # txt <- strsplit(txt, "\n")[[1]] 21 | # unlist(lapply(txt[nzchar(txt)], function(w) { 22 | # tmp <- gsub("\\s", "", strsplit(w, ":\\s")[[1]]) 23 | # stats::setNames(list(tmp[2]), tmp[1]) 24 | # }), FALSE) 25 | } 26 | -------------------------------------------------------------------------------- /R/gnparser.R: -------------------------------------------------------------------------------- 1 | #' @noRd 2 | #' @param ... Arguments to be passed to `sys::exec_internal('gnparser', ...)` 3 | gnparser_cmd = function(...) { 4 | sys::exec_internal(find_gnparser(), ...) 
5 | } 6 | 7 | # find an executable from PATH, APPDATA, system.file(), ~/bin, etc 8 | find_exec = function(cmd, dir, info = '') { 9 | for (d in bin_paths(dir)) { 10 | exec = if (is_windows()) paste0(cmd, ".exe") else cmd 11 | path = file.path(d, exec) 12 | if (utils::file_test("-x", path)) break else path = '' 13 | } 14 | path2 = Sys.which(cmd) 15 | if (path == '' || same_path(path, path2)) { 16 | if (path2 == '') stop(cmd, ' not found. ', info, call. = FALSE) 17 | return(cmd) # do not use the full path of the command 18 | } else { 19 | if (path2 != '') warning( 20 | 'Found ', cmd, ' at "', path, '" and "', path2, '". The former will be used. ', 21 | "If you don't need both copies, you may delete/uninstall one." 22 | ) 23 | } 24 | normalizePath(path) 25 | } 26 | 27 | gnpenv <- new.env() 28 | find_gnparser = local({ 29 | gnpenv$path = NULL # cache the path to gnparser 30 | function() { 31 | if (is.null(gnpenv$path)) gnpenv$path <- find_exec( 32 | 'gnparser', 'gnparser', 'You need to install gnparser' 33 | ) 34 | gnpenv$path 35 | } 36 | }) 37 | 38 | parse_one <- function(x, format = NULL, threads = NULL, 39 | batch_size = NULL, ignore_tags = NULL, cultivar = FALSE, 40 | capitalize = FALSE, diaereses = FALSE, details = FALSE) { 41 | 42 | assert(format, "character") 43 | assert(threads, c("integer", "numeric")) # NULL OK 44 | assert(batch_size, c("integer", "numeric")) 45 | assert(ignore_tags, "logical") 46 | assert(cultivar, "logical") 47 | assert(capitalize, "logical") 48 | assert(diaereses, "logical") 49 | assert(details, "logical") 50 | 51 | args <- character(0) 52 | if (!is.null(format)) args <- c(args, "--format", format) 53 | if (!is.null(threads)) args <- c(args, "--jobs", threads) 54 | if (!is.null(batch_size)) args <- c(args, "--batch_size", batch_size) 55 | if (ignore_tags) args <- c(args, "--ignore_tags") 56 | if (cultivar) args <- c(args, "--cultivar") 57 | if (capitalize) args <- c(args, "--capitalize") 58 | if (diaereses) args <- c(args, "--diaereses") 59 
| if (details) args <- c(args, "--details") 60 | z <- gnparser_cmd(c(args, x), error = FALSE) 61 | err_chk(z) 62 | return(rawToChar(z$stdout)) 63 | } 64 | 65 | gnparser_exists <- function() { 66 | check_gnp <- gnparser_cmd() 67 | if (check_gnp$status != 0) stop("You need to install gnparser") 68 | return(TRUE) 69 | } 70 | 71 | process_version_string <- function(x) { 72 | txt <- rawToChar(x) 73 | txt <- strsplit(txt, "\n")[[1]] 74 | unlist(lapply(txt[nzchar(txt)], function(w) { 75 | tmp <- gsub("\\s", "", strsplit(w, ":\\s")[[1]]) 76 | stats::setNames(list(tmp[2]), tmp[1]) 77 | }), FALSE) 78 | } 79 | 80 | ver_check <- function(version) { 81 | # ver <- gnparser_cmd("-V", error = FALSE) 82 | # if (ver$status != 0) ver <- gnparser_cmd("-v", error = FALSE) 83 | # ver <- process_version_string(ver$stdout) 84 | ver <- gn_version() 85 | ver_first_num <- as.numeric(substring(gsub("v|\\.", "", ver$version), 1, 1)) 86 | if (ver_first_num < version) stop("you need to install `gnparser` v1 or greater") 87 | return(TRUE) 88 | } 89 | 90 | # from xfun::same_path 91 | same_path <- function(p1, p2, ...) { 92 | normalize_path(p1, ...) == normalize_path(p2, ...) 93 | } 94 | # from xfun::normalize_path 95 | normalize_path <- function(path, winslash = "/", must_work = FALSE) { 96 | res = normalizePath(path, winslash = winslash, mustWork = must_work) 97 | if (is_windows()) res[is.na(path)] = NA 98 | res 99 | } 100 | -------------------------------------------------------------------------------- /R/install_gnparser.R: -------------------------------------------------------------------------------- 1 | #' Install gnparser 2 | #' 3 | #' @description 4 | #' `r lifecycle::badge("deprecated")` 5 | #' 6 | #' ### Reason for deprecating 7 | #' The function used to download the appropriate `gnparser` executable for your 8 | #' platform and try to copy it to a system directory so \pkg{rgnparser} can run 9 | #' the `gnparser` command. 
10 | #' This function was deprecated to stay in compliance with CRAN policies 11 | #' ("Packages should not write in the user’s home filespace, nor anywhere else 12 | #' on the file system apart from the R session’s temporary directory") 13 | #' 14 | #' ### Solution 15 | #' Please install `gnparser` by hand. 16 | #' For Linux and Mac users, installing with your usual package manager such as 17 | #' Homebrew is the easiest option; see the `gnparser` documentation for more details: 18 | #' \url{https://github.com/gnames/gnparser#installation} 19 | #' @param version The gnparser version number, e.g., `1.0.0`; the default 20 | #' `latest` means the latest version (fetched from GitHub releases). 21 | #' Alternatively, this argument can take a file path of the zip archive or 22 | #' tarball of gnparser that has already been downloaded from GitHub, 23 | #' in which case it will not be downloaded again. The minimum version 24 | #' is `v1.0.0` because gnparser v1 introduced breaking changes - and 25 | #' we don't support older versions of gnparser here. 26 | #' @param force Whether to install gnparser even if it has already been 27 | #' installed. This may be useful when upgrading gnparser.
28 | #' @export 29 | 30 | install_gnparser = function(version, force) { 31 | lifecycle::deprecate_stop( 32 | when = "0.3.0", 33 | what = "install_gnparser()", 34 | details = "Please see help page for deprecation reason and solution.") 35 | } 36 | -------------------------------------------------------------------------------- /R/rgnparser-package.R: -------------------------------------------------------------------------------- 1 | #' @title rgnparser 2 | #' @description Parse scientific names using gnparser 3 | #' @importFrom jsonlite fromJSON 4 | #' @importFrom lifecycle deprecated 5 | #' @importFrom readr read_csv 6 | #' @importFrom sys exec_internal 7 | #' @importFrom tibble as_tibble 8 | #' @name rgnparser-package 9 | #' @aliases rgnparser 10 | #' @docType package 11 | #' @keywords package 12 | NULL 13 | -------------------------------------------------------------------------------- /R/zzz.R: -------------------------------------------------------------------------------- 1 | 2 | assert <- function(x, y) { 3 | if (!is.null(x)) { 4 | if (!inherits(x, y)) { 5 | stop(deparse(substitute(x)), " must be of class ", 6 | paste0(y, collapse = ", "), call. = FALSE) 7 | } 8 | } 9 | } 10 | 11 | last <- function(x) x[length(x)] 12 | 13 | err_chk <- function(z) { 14 | if (z$status != 0) { 15 | err <- rawToChar(z$stderr) 16 | err <- gsub("Error: ", "", err) 17 | # language replacement 18 | err <- gsub("-l detect", "language=\"detect\"", err) 19 | stop(err, call. 
= FALSE) 20 | } 21 | } 22 | 23 | readcsv <- function(x) { 24 | df <- readr::read_csv(x) 25 | stats::setNames(df, tolower(names(df))) 26 | } 27 | 28 | # from xfun 29 | is_windows <- function() .Platform$OS.type == "windows" 30 | is_macos <- function() unname(Sys.info()["sysname"] == "Darwin") 31 | dir_exists <- function(x) utils::file_test("-d", x) 32 | pkg_file = function(..., mustWork = TRUE) { 33 | system.file(..., package = 'rgnparser', mustWork = mustWork) 34 | } 35 | 36 | bin_paths <- function(dir = 'gnparser') { 37 | if (is_windows()) { 38 | path <- Sys.getenv('APPDATA', '') 39 | path <- if (dir_exists(path)) file.path(path, dir) 40 | } else if (is_macos()) { 41 | path <- '~/Library/Application Support' 42 | path <- if (dir_exists(path)) file.path(path, dir) 43 | path <- c('/usr/local/bin', path) 44 | } else { 45 | path <- c('~/bin', '/snap/bin', '/var/lib/snapd/snap/bin') 46 | } 47 | path <- c(path, pkg_file(dir, mustWork = FALSE)) 48 | return(path) 49 | } 50 | -------------------------------------------------------------------------------- /README.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | output: github_document 3 | --- 4 | rgnparser 5 | ========= 6 | 7 | ```{r, echo=FALSE} 8 | knitr::opts_chunk$set( 9 | collapse = TRUE, 10 | comment = "#>", 11 | warning = FALSE, 12 | message = FALSE 13 | ) 14 | ``` 15 | 16 | [![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active) 17 | [![R-check](https://github.com/ropensci/rgnparser/workflows/R-CMD-check/badge.svg)](https://github.com/ropensci/rgnparser/actions/) 18 | [![rstudio mirror downloads](https://cranlogs.r-pkg.org/badges/rgnparser)](https://github.com/r-hub/cranlogs.app) 19 | [![cran version](https://www.r-pkg.org/badges/version/rgnparser)](https://cran.r-project.org/package=rgnparser) 20 | 21 | **rgnparser**: Parse 
Scientific Names 22 | 23 | Docs: https://docs.ropensci.org/rgnparser/ 24 | 25 | ## Installation 26 | 27 | ```{r eval=FALSE} 28 | install.packages("rgnparser") 29 | # OR 30 | remotes::install_github("ropensci/rgnparser") 31 | ``` 32 | 33 | ```{r} 34 | library("rgnparser") 35 | ``` 36 | 37 | ## Install gnparser 38 | 39 | The command line tool written in Go, gnparser, is required to use this package. 40 | 41 | Instructions for installation can be found at the gnparser repo (https://github.com/gnames/gnparser#installation) 42 | 43 | ## Meta 44 | 45 | * Please [report any issues or bugs](https://github.com/ropensci/rgnparser/issues). 46 | * License: MIT 47 | * Get citation information for `rgnparser` in R doing `citation(package = 'rgnparser')` 48 | * Please note that this package is released with a [Contributor Code of Conduct](https://ropensci.org/code-of-conduct/). By contributing to this project, you agree to abide by its terms. 49 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | # rgnparser 3 | 4 | [![Project Status: Active – The project has reached a stable, usable 5 | state and is being actively 6 | developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active) 7 | [![R-check](https://github.com/ropensci/rgnparser/workflows/R-CMD-check/badge.svg)](https://github.com/ropensci/rgnparser/actions/) 8 | [![rstudio mirror 9 | downloads](https://cranlogs.r-pkg.org/badges/rgnparser)](https://github.com/r-hub/cranlogs.app) 10 | [![cran 11 | version](https://www.r-pkg.org/badges/version/rgnparser)](https://cran.r-project.org/package=rgnparser) 12 | 13 | **rgnparser**: Parse Scientific Names 14 | 15 | Docs: 16 | 17 | ## Installation 18 | 19 | ``` r 20 | install.packages("rgnparser") 21 | # OR 22 | remotes::install_github("ropensci/rgnparser") 23 | ``` 24 | 25 | ``` r 26 | library("rgnparser") 27 | ``` 
28 | 29 | ## Install gnparser 30 | 31 | The command line tool written in Go, gnparser, is required to use this 32 | package. 33 | 34 | Instructions for installation can be found at the gnparser repo 35 | () 36 | 37 | ## Meta 38 | 39 | - Please [report any issues or 40 | bugs](https://github.com/ropensci/rgnparser/issues). 41 | - License: MIT 42 | - Get citation information for `rgnparser` in R doing 43 | `citation(package = 'rgnparser')` 44 | - Please note that this package is released with a [Contributor Code of 45 | Conduct](https://ropensci.org/code-of-conduct/). By contributing to 46 | this project, you agree to abide by its terms. 47 | -------------------------------------------------------------------------------- /codemeta.json: -------------------------------------------------------------------------------- 1 | { 2 | "@context": "https://doi.org/10.5063/schema/codemeta-2.0", 3 | "@type": "SoftwareSourceCode", 4 | "identifier": "rgnparser", 5 | "description": "Parse scientific names using 'gnparser' (), written in Go. 
'gnparser' parses scientific names into their component parts; it utilizes a Parsing Expression Grammar specifically for scientific names.", 6 | "name": "rgnparser: Parse Scientific Names", 7 | "relatedLink": "https://docs.ropensci.org/rgnparser/", 8 | "codeRepository": "https://github.com/ropensci/rgnparser", 9 | "issueTracker": "https://github.com/ropensci/rgnparser/issues", 10 | "license": "https://spdx.org/licenses/MIT", 11 | "version": "0.3.0", 12 | "programmingLanguage": { 13 | "@type": "ComputerLanguage", 14 | "name": "R", 15 | "url": "https://r-project.org" 16 | }, 17 | "runtimePlatform": "R version 4.3.1 (2023-06-16)", 18 | "provider": { 19 | "@id": "https://cran.r-project.org", 20 | "@type": "Organization", 21 | "name": "Comprehensive R Archive Network (CRAN)", 22 | "url": "https://cran.r-project.org" 23 | }, 24 | "author": [ 25 | { 26 | "@type": "Person", 27 | "givenName": "Scott", 28 | "familyName": "Chamberlain", 29 | "email": "sckott@protonmail.com", 30 | "@id": "https://orcid.org/0000-0003-1444-9135" 31 | }, 32 | { 33 | "@type": "Person", 34 | "givenName": "Joel H.", 35 | "familyName": "Nitta", 36 | "email": "joelnitta@gmail.com", 37 | "@id": "https://orcid.org/0000-0003-4719-7472" 38 | }, 39 | { 40 | "@type": "Person", 41 | "givenName": "Alban", 42 | "familyName": "Sagouis", 43 | "email": "alban.sagouis@idiv.de", 44 | "@id": "https://orcid.org/0000-0002-3827-1063" 45 | } 46 | ], 47 | "maintainer": [ 48 | { 49 | "@type": "Person", 50 | "givenName": "Joel H.", 51 | "familyName": "Nitta", 52 | "email": "joelnitta@gmail.com", 53 | "@id": "https://orcid.org/0000-0003-4719-7472" 54 | } 55 | ], 56 | "softwareSuggestions": [ 57 | { 58 | "@type": "SoftwareApplication", 59 | "identifier": "testthat", 60 | "name": "testthat", 61 | "provider": { 62 | "@id": "https://cran.r-project.org", 63 | "@type": "Organization", 64 | "name": "Comprehensive R Archive Network (CRAN)", 65 | "url": "https://cran.r-project.org" 66 | }, 67 | "sameAs": 
"https://CRAN.R-project.org/package=testthat" 68 | } 69 | ], 70 | "softwareRequirements": { 71 | "1": { 72 | "@type": "SoftwareApplication", 73 | "identifier": "sys", 74 | "name": "sys", 75 | "provider": { 76 | "@id": "https://cran.r-project.org", 77 | "@type": "Organization", 78 | "name": "Comprehensive R Archive Network (CRAN)", 79 | "url": "https://cran.r-project.org" 80 | }, 81 | "sameAs": "https://CRAN.R-project.org/package=sys" 82 | }, 83 | "2": { 84 | "@type": "SoftwareApplication", 85 | "identifier": "tibble", 86 | "name": "tibble", 87 | "provider": { 88 | "@id": "https://cran.r-project.org", 89 | "@type": "Organization", 90 | "name": "Comprehensive R Archive Network (CRAN)", 91 | "url": "https://cran.r-project.org" 92 | }, 93 | "sameAs": "https://CRAN.R-project.org/package=tibble" 94 | }, 95 | "3": { 96 | "@type": "SoftwareApplication", 97 | "identifier": "jsonlite", 98 | "name": "jsonlite", 99 | "provider": { 100 | "@id": "https://cran.r-project.org", 101 | "@type": "Organization", 102 | "name": "Comprehensive R Archive Network (CRAN)", 103 | "url": "https://cran.r-project.org" 104 | }, 105 | "sameAs": "https://CRAN.R-project.org/package=jsonlite" 106 | }, 107 | "4": { 108 | "@type": "SoftwareApplication", 109 | "identifier": "readr", 110 | "name": "readr", 111 | "provider": { 112 | "@id": "https://cran.r-project.org", 113 | "@type": "Organization", 114 | "name": "Comprehensive R Archive Network (CRAN)", 115 | "url": "https://cran.r-project.org" 116 | }, 117 | "sameAs": "https://CRAN.R-project.org/package=readr" 118 | }, 119 | "SystemRequirements": "gnparser ()" 120 | }, 121 | "fileSize": "20.915KB", 122 | "releaseNotes": "https://github.com/ropensci/rgnparser/blob/master/NEWS.md", 123 | "readme": "https://github.com/ropensci/rgnparser/blob/main/README.md", 124 | "contIntegration": "https://github.com/ropensci/rgnparser/actions/", 125 | "developmentStatus": "https://www.repostatus.org/#active", 126 | "keywords": ["r", "rstats", "taxonomy", "r-package"] 
127 | } 128 | -------------------------------------------------------------------------------- /cran-comments.md: -------------------------------------------------------------------------------- 1 | ## Test environments 2 | * local - Darwin, R 4.3.1 3 | * win-builder (release and devel) 4 | * r hub - Ubuntu Linux 20.04.1 LTS, R-release, GCC 5 | * r hub - Fedora Linux, R-devel, clang, gfortran 6 | 7 | ## R CMD check results 8 | 9 | 0 errors | 0 warnings | 0 notes 10 | 11 | ----- 12 | 13 | Since the last submission, the most significant change is the deprecation of the install_gnparser() function, which did not respect CRAN rules ("Packages should not write in the user’s home filespace, nor anywhere else on the file system apart from the R session’s temporary directory"). The function is still exported but now raises a deprecation error via lifecycle::deprecate_stop(). Reverse dependencies were checked and showed no problems, even though the package bdc uses rgnparser. 14 | 15 | This submission also fixes a bug in the functions 'gn_parse()' and 'gn_parse_tidy()' that caused the argument `ignore_tags` to be... ignored. 16 | 17 | We, Joel Nitta (main maintainer) and Alban Sagouis (co-maintainer), thank you for taking the time to review 'rgnparser'. 
18 | 19 | Best wishes, 20 | Joel Nitta, Alban Sagouis 21 | -------------------------------------------------------------------------------- /inst/precompile.R: -------------------------------------------------------------------------------- 1 | # Precompile vignettes (from .Rmd.orig to .Rmd) 2 | # Need to do this locally because CI won't have gnparser 3 | 4 | # Load package in working state 5 | # not with library() 6 | library(devtools) 7 | library(knitr) 8 | load_all() 9 | 10 | knit("vignettes/rgnparser.Rmd.orig", "vignettes/rgnparser.Rmd") 11 | 12 | build_vignettes() -------------------------------------------------------------------------------- /man/figures/lifecycle-archived.svg: -------------------------------------------------------------------------------- 1 | [SVG badge; markup not preserved. Text: "lifecycle: archived"] -------------------------------------------------------------------------------- /man/figures/lifecycle-defunct.svg: -------------------------------------------------------------------------------- 1 | [SVG badge; markup not preserved. Text: "lifecycle: defunct"] -------------------------------------------------------------------------------- /man/figures/lifecycle-deprecated.svg: -------------------------------------------------------------------------------- 1 | [SVG badge; markup not preserved. Text: "lifecycle: deprecated"] -------------------------------------------------------------------------------- /man/figures/lifecycle-experimental.svg: -------------------------------------------------------------------------------- 1 | [SVG badge; markup not preserved. Text: "lifecycle: experimental"] 
-------------------------------------------------------------------------------- /man/figures/lifecycle-maturing.svg: -------------------------------------------------------------------------------- 1 | [SVG badge; markup not preserved. Text: "lifecycle: maturing"] -------------------------------------------------------------------------------- /man/figures/lifecycle-questioning.svg: -------------------------------------------------------------------------------- 1 | [SVG badge; markup not preserved. Text: "lifecycle: questioning"] -------------------------------------------------------------------------------- /man/figures/lifecycle-soft-deprecated.svg: -------------------------------------------------------------------------------- 1 | [SVG badge; markup not preserved. Text: "lifecycle: soft-deprecated"] -------------------------------------------------------------------------------- /man/figures/lifecycle-stable.svg: -------------------------------------------------------------------------------- 1 | [SVG badge; markup not preserved. Text: "lifecycle: stable"] -------------------------------------------------------------------------------- /man/figures/lifecycle-superseded.svg: -------------------------------------------------------------------------------- 1 | [SVG badge; markup not preserved. Text: "lifecycle: superseded"] -------------------------------------------------------------------------------- /man/gn_debug.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit 
documentation in R/gn_debug.R 3 | \name{gn_debug} 4 | \alias{gn_debug} 5 | \title{gn_debug} 6 | \usage{ 7 | gn_debug(...) 8 | } 9 | \arguments{ 10 | \item{...}{ignored} 11 | } 12 | \description{ 13 | DEFUNCT 14 | } 15 | -------------------------------------------------------------------------------- /man/gn_parse.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/gn_parse.R 3 | \name{gn_parse} 4 | \alias{gn_parse} 5 | \title{gn_parse} 6 | \usage{ 7 | gn_parse( 8 | x, 9 | threads = 1, 10 | batch_size = NULL, 11 | ignore_tags = FALSE, 12 | cultivar = FALSE, 13 | capitalize = FALSE, 14 | diaereses = FALSE, 15 | details = FALSE 16 | ) 17 | } 18 | \arguments{ 19 | \item{x}{(character) vector of scientific names. required} 20 | 21 | \item{threads}{(integer/numeric) number of threads to run for parallel 22 | processing. Setting to \code{NULL} will use all threads available. default: \code{1}} 23 | 24 | \item{batch_size}{(integer/numeric) maximum number of names in a 25 | batch sent for processing. default: \code{NULL}} 26 | 27 | \item{ignore_tags}{(logical) ignore HTML entities and tags when 28 | parsing. default: \code{FALSE}} 29 | 30 | \item{cultivar}{(logical) adds support for botanical cultivars like 31 | \verb{Sarracenia flava 'Maxima'} and graft-chimaeras like \code{+ Crataegomespilus}. 32 | default: \code{FALSE}} 33 | 34 | \item{capitalize}{(logical) capitalizes the first letter of name-strings. 35 | default: \code{FALSE}} 36 | 37 | \item{diaereses}{(logical) preserves diaereses within names, e.g. 38 | \verb{Leptochloöpsis virgata}. The stemmed canonical name will be generated 39 | without diaereses. 
default: \code{FALSE}} 40 | 41 | \item{details}{(logical) return more details for each parsed name. default: \code{FALSE}} 42 | } 43 | \value{ 44 | a list 45 | } 46 | \description{ 47 | extract names using gnparser 48 | } 49 | \examples{ 50 | trys <- function(x) try(x, silent=TRUE) 51 | if (interactive()) { 52 | x <- c("Quadrella steyermarkii (Standl.) Iltis & Cornejo", 53 | "Parus major Linnaeus, 1788", "Helianthus annuus var. texanus") 54 | trys(gn_parse(x[1])) 55 | trys(gn_parse(x[2])) 56 | trys(gn_parse(x[3])) 57 | trys(gn_parse(x)) 58 | # details 59 | w <- trys(gn_parse(x, details = TRUE)) 60 | w[[1]]$details # details for one name 61 | lapply(w, "[[", "details") # details for all names 62 | z <- trys(gn_parse(x, details = FALSE)) # compared to regular 63 | z 64 | } 65 | } 66 | -------------------------------------------------------------------------------- /man/gn_parse_tidy.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/gn_parse_tidy.R 3 | \name{gn_parse_tidy} 4 | \alias{gn_parse_tidy} 5 | \title{gn_parse_tidy} 6 | \usage{ 7 | gn_parse_tidy( 8 | x, 9 | threads = 1, 10 | batch_size = NULL, 11 | cultivar = FALSE, 12 | capitalize = FALSE, 13 | diaereses = FALSE, 14 | ignore_tags = FALSE 15 | ) 16 | } 17 | \arguments{ 18 | \item{x}{(character) vector of scientific names. required} 19 | 20 | \item{threads}{(integer/numeric) number of threads to run for parallel 21 | processing. Setting to \code{NULL} will use all threads available. default: \code{1}} 22 | 23 | \item{batch_size}{(integer/numeric) maximum number of names in a 24 | batch sent for processing. default: \code{NULL}} 25 | 26 | \item{cultivar}{(logical) adds support for botanical cultivars like 27 | \verb{Sarracenia flava 'Maxima'} and graft-chimaeras like \code{+ Crataegomespilus}. 28 | default: \code{FALSE}} 29 | 30 | \item{capitalize}{(logical) capitalizes the first letter of name-strings. 
31 | default: \code{FALSE}} 32 | 33 | \item{diaereses}{(logical) preserves diaereses within names, e.g. 34 | \verb{Leptochloöpsis virgata}. The stemmed canonical name will be generated 35 | without diaereses. default: \code{FALSE}} 36 | 37 | \item{ignore_tags}{(logical) ignore HTML entities and tags when 38 | parsing. default: \code{FALSE}} 39 | } 40 | \value{ 41 | a data.frame 42 | } 43 | \description{ 44 | extract names using gnparser into a tidy tibble 45 | } 46 | \details{ 47 | This function focuses on a data.frame result that's easy 48 | to munge downstream - note that this function does not do additional 49 | details as does \code{\link[=gn_parse]{gn_parse()}}. 50 | } 51 | \examples{ 52 | trys <- function(x) try(x, silent=TRUE) 53 | if (interactive()) { 54 | x <- c("Quadrella steyermarkii (Standl.) Iltis & Cornejo", 55 | "Parus major Linnaeus, 1788", "Helianthus annuus var. texanus") 56 | trys(gn_parse_tidy(x)) 57 | } 58 | } 59 | -------------------------------------------------------------------------------- /man/gn_version.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/gn_version.R 3 | \name{gn_version} 4 | \alias{gn_version} 5 | \title{gn_version} 6 | \usage{ 7 | gn_version() 8 | } 9 | \value{ 10 | named list, with \code{version} and \code{build} 11 | } 12 | \description{ 13 | get gnparser version information 14 | } 15 | \examples{ 16 | trys <- function(x) try(x, silent=TRUE) 17 | if (interactive()) { 18 | trys(gn_version()) 19 | } 20 | } 21 | -------------------------------------------------------------------------------- /man/install_gnparser.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/install_gnparser.R 3 | \name{install_gnparser} 4 | \alias{install_gnparser} 5 | \title{Install gnparser} 6 | \usage{ 7 | 
install_gnparser(version, force) 8 | } 9 | \arguments{ 10 | \item{version}{The gnparser version number, e.g., \verb{1.0.0}; the default 11 | \code{latest} means the latest version (fetched from GitLab releases). 12 | Alternatively, this argument can take a file path of the zip archive or 13 | tarball of gnparser that has already been downloaded from GitLab, 14 | in which case it will not be downloaded again. The minimum version 15 | is \code{v1.0.0} because gnparser v1 introduced breaking changes - and 16 | we don't support older versions of gnparser here.} 17 | 18 | \item{force}{Whether to install gnparser even if it has already been 19 | installed. This may be useful when upgrading gnparser.} 20 | } 21 | \description{ 22 | \ifelse{html}{\href{https://lifecycle.r-lib.org/articles/stages.html#deprecated}{\figure{lifecycle-deprecated.svg}{options: alt='[Deprecated]'}}}{\strong{[Deprecated]}} 23 | \subsection{Reason for deprecating}{ 24 | 25 | The function used to download the appropriate \code{gnparser} executable for your 26 | platform and try to copy it to a system directory so \pkg{rgnparser} can run 27 | the \code{gnparser} command. 28 | This function was deprecated to stay in compliance with CRAN policies 29 | ("Packages should not write in the user’s home filespace, nor anywhere else 30 | on the file system apart from the R session’s temporary directory") 31 | } 32 | 33 | \subsection{Solution}{ 34 | 35 | Please install \code{gnparser} by hand. 
36 | For Linux and Mac users, installing with your usual package manager such as 37 | homebrew is the easiest, see \code{gnparser} documentation for more details: 38 | \url{https://github.com/gnames/gnparser#installation} 39 | } 40 | } 41 | -------------------------------------------------------------------------------- /man/rgnparser-package.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/rgnparser-package.R 3 | \docType{package} 4 | \name{rgnparser-package} 5 | \alias{rgnparser-package} 6 | \alias{rgnparser} 7 | \title{rgnparser} 8 | \description{ 9 | Parse scientific names using gnparser 10 | } 11 | \keyword{package} 12 | -------------------------------------------------------------------------------- /revdep/README.md: -------------------------------------------------------------------------------- 1 | # Platform 2 | 3 | |field |value | 4 | |:--------|:----------------------------------------| 5 | |version |R version 4.3.1 (2023-06-16) | 6 | |os |macOS Sonoma 14.1.1 | 7 | |system |aarch64, darwin20 | 8 | |ui |RStudio | 9 | |language |(EN) | 10 | |collate |en_US.UTF-8 | 11 | |ctype |en_US.UTF-8 | 12 | |tz |Europe/Berlin | 13 | |date |2023-12-04 | 14 | |rstudio |2023.09.1+494 Desert Sunflower (desktop) | 15 | |pandoc |3.1.8 @ /opt/homebrew/bin/pandoc | 16 | 17 | # Dependencies 18 | 19 | |package |old |new |Δ | 20 | |:-----------|:-----|:-----|:--| 21 | |rgnparser |0.2.6 |0.3.0 |* | 22 | |cpp11 |NA |0.4.7 |* | 23 | |lifecycle |NA |1.0.4 |* | 24 | |prettyunits |NA |1.2.0 |* | 25 | |rlang |NA |1.1.2 |* | 26 | |vctrs |NA |0.6.5 |* | 27 | |withr |NA |2.5.2 |* | 28 | 29 | # Revdeps 30 | 31 | -------------------------------------------------------------------------------- /revdep/cran.md: -------------------------------------------------------------------------------- 1 | ## revdepcheck results 2 | 3 | We checked 2 reverse dependencies, comparing R 
CMD check results across CRAN and dev versions of this package. 4 | 5 | * We saw 0 new problems 6 | * We failed to check 0 packages 7 | 8 | -------------------------------------------------------------------------------- /revdep/failures.md: -------------------------------------------------------------------------------- 1 | *Wow, no problems at all. :)* -------------------------------------------------------------------------------- /revdep/problems.md: -------------------------------------------------------------------------------- 1 | *Wow, no problems at all. :)* -------------------------------------------------------------------------------- /rgnparser.Rproj: -------------------------------------------------------------------------------- 1 | Version: 1.0 2 | 3 | RestoreWorkspace: Default 4 | SaveWorkspace: Default 5 | AlwaysSaveHistory: Default 6 | 7 | EnableCodeIndexing: Yes 8 | UseSpacesForTab: Yes 9 | NumSpacesForTab: 3 10 | Encoding: UTF-8 11 | 12 | RnwWeave: Sweave 13 | LaTeX: pdfLaTeX 14 | 15 | AutoAppendNewline: Yes 16 | StripTrailingWhitespace: Yes 17 | 18 | BuildType: Package 19 | PackageUseDevtools: Yes 20 | PackageInstallArgs: --no-multiarch --with-keep.source 21 | PackageCheckArgs: --as-cran 22 | -------------------------------------------------------------------------------- /tests/testthat.R: -------------------------------------------------------------------------------- 1 | library(testthat) 2 | library(rgnparser) 3 | 4 | test_check("rgnparser") 5 | -------------------------------------------------------------------------------- /tests/testthat/test-gn_parse.R: -------------------------------------------------------------------------------- 1 | skip_on_cran() 2 | skip_on_ci() 3 | 4 | test_that("gn_parse", { 5 | x <- c("Quadrella steyermarkii (Standl.) Iltis & Cornejo", 6 | "Parus major Linnaeus, 1788", "Helianthus annuus var. 
texanus") 7 | w <- gn_parse(x) 8 | vapply(w, "[[", "", "normalized") 9 | 10 | expect_is(w, "list") 11 | for (i in w) expect_is(i, "list") 12 | 13 | expect_is(w[[1]]$parsed, "logical") 14 | expect_is(w[[1]]$verbatim, "character") 15 | expect_is(w[[1]]$normalized, "character") 16 | expect_is(w[[1]]$cardinality, "integer") 17 | expect_is(w[[1]]$canonical, "list") 18 | expect_null(w[[1]]$details) # used to be a thing, removed at some point 19 | }) 20 | 21 | test_that("cultivar arg works", { 22 | with_cult <- gn_parse("Sarracenia flava 'Maxima'", cultivar = TRUE) 23 | without_cult <- gn_parse("Sarracenia flava 'Maxima'", cultivar = FALSE) 24 | expect_equal( 25 | with_cult[[1]]$canonical$simple, "Sarracenia flava ‘Maxima’" 26 | ) 27 | expect_equal( 28 | without_cult[[1]]$canonical$simple, "Sarracenia flava" 29 | ) 30 | }) 31 | 32 | test_that("capitalize arg works", { 33 | with_capital <- gn_parse("parus major", capitalize = TRUE) 34 | without_capital <- gn_parse( 35 | "parus major", capitalize = FALSE) 36 | expect_equal( 37 | with_capital[[1]]$canonical$simple, "Parus major" 38 | ) 39 | # Without capitalization, name cannot be parsed 40 | expect_equal( 41 | without_capital[[1]]$parsed, FALSE 42 | ) 43 | }) 44 | 45 | test_that("diaereses arg works", { 46 | with_dia <- gn_parse("Leptochloöpsis virgata", diaereses = TRUE) 47 | without_dia <- gn_parse("Leptochloöpsis virgata", diaereses = FALSE) 48 | expect_equal( 49 | with_dia[[1]]$canonical$simple, "Leptochloöpsis virgata" 50 | ) 51 | expect_equal( 52 | without_dia[[1]]$canonical$simple, "Leptochlooepsis virgata" 53 | ) 54 | }) 55 | -------------------------------------------------------------------------------- /tests/testthat/test-gn_parse_tidy.R: -------------------------------------------------------------------------------- 1 | skip_on_cran() 2 | skip_on_ci() 3 | 4 | test_that("gn_parse_tidy", { 5 | x <- c("Quadrella steyermarkii (Standl.) Iltis & Cornejo", 6 | "Parus major Linnaeus, 1788", "Helianthus annuus var. 
texanus") 7 | z <- gn_parse_tidy(x) 8 | 9 | expect_is(z, "data.frame") 10 | expect_is(z, "tbl") 11 | # names are all lowercase 12 | expect_equal(names(z), tolower(names(z))) 13 | expect_equal(sort(z$verbatim), sort(x)) 14 | 15 | name <- "Parus major Linnaeus, 1788" 16 | w <- gn_parse_tidy(name) 17 | expect_equal(w$verbatim, name) 18 | expect_equal(w$authorship, "Linnaeus 1788") 19 | }) 20 | -------------------------------------------------------------------------------- /tests/testthat/test-gn_version.R: -------------------------------------------------------------------------------- 1 | skip_on_cran() 2 | skip_on_ci() 3 | 4 | test_that("gn_version", { 5 | expect_is(gn_version(), "list") 6 | expect_named(gn_version(), c("version", "build")) 7 | expect_is(gn_version()$version, "character") 8 | expect_is(gn_version()$build, "character") 9 | }) 10 | -------------------------------------------------------------------------------- /vignettes/.gitignore: -------------------------------------------------------------------------------- 1 | *.html 2 | *.R 3 | -------------------------------------------------------------------------------- /vignettes/rgnparser.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "rgnparser: Parse Scientific Names" 3 | author: "Scott Chamberlain, Joel H. Nitta" 4 | date: "2023-09-15" 5 | output: rmarkdown::html_vignette 6 | --- 7 | 8 | 9 | 10 | An R interface to `gnparser` (https://github.com/gnames/gnparser). 11 | 12 | ## Installation 13 | 14 | 15 | ```r 16 | install.packages("rgnparser") 17 | # OR 18 | remotes::install_github("ropensci/rgnparser") 19 | ``` 20 | 21 | 22 | ```r 23 | library(rgnparser) 24 | ``` 25 | 26 | ## Install gnparser 27 | 28 | The command line tool written in Go, gnparser, is required to use this package. 29 | 30 | Instructions for installation can be found at the gnparser repo 31 | (https://github.com/gnames/gnparser). 
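Since the package depends on the external gnparser executable, it can be useful to confirm that the binary is discoverable before calling any parsing function. The following is a minimal sketch using only base R. `has_gnparser()` is a hypothetical helper written for this illustration, not a function exported by rgnparser, and it assumes the executable is named `gnparser` and is located through the PATH.

```r
# Hypothetical helper (not part of rgnparser): returns TRUE when an
# executable with the given name can be found on the PATH.
has_gnparser <- function(cmd = "gnparser") {
  # Sys.which() returns an empty string when the command is not found
  nzchar(Sys.which(cmd)[[1]])
}

if (has_gnparser()) {
  message("gnparser found at: ", Sys.which("gnparser"))
} else {
  message("gnparser not found; install it following the gnparser documentation")
}
```

A check like this only reports whether something named `gnparser` is on the PATH; `gn_version()` remains the way to see which version is actually being run.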
32 | 33 | ## gn_version() 34 | 35 | Check the gnparser version: 36 | 37 | 38 | ```r 39 | gn_version() 40 | #> $version 41 | #> [1] "v1.7.1" 42 | #> 43 | #> $build 44 | #> [1] "" 45 | ``` 46 | 47 | ## gn_parse_tidy 48 | 49 | Output a data.frame with more minimal information 50 | 51 | 52 | ```r 53 | x <- c("Quadrella steyermarkii (Standl.) Iltis & Cornejo", 54 | "Parus major Linnaeus, 1788", "Helianthus annuus var. texanus") 55 | gn_parse_tidy(x) 56 | #> Rows: 3 Columns: 9 57 | #> ── Column specification ──────────────────────────────────────── 58 | #> Delimiter: "," 59 | #> chr (6): Id, Verbatim, CanonicalStem, CanonicalSimple, CanonicalFull, Authorship 60 | #> dbl (3): Cardinality, Year, Quality 61 | #> 62 | #> ℹ Use `spec()` to retrieve the full column specification for this data. 63 | #> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message. 64 | #> # A tibble: 3 × 9 65 | #> id verbatim cardinality canonicalstem canonicalsimple canonicalfull authorship year quality 66 | #> <chr> <chr> <dbl> <chr> <chr> <chr> <chr> <dbl> <dbl> 67 | #> 1 fbd1b4fe-f8ed-5390-9cb1-e0f798691b1e Quadrella steyermar… 2 Quadrella st… Quadrella stey… Quadrella st… (Standl.)… NA 4 68 | #> 2 e4e1d462-d332-583d-97a1-09735712f04d Parus major Linnaeu… 2 Parus maior Parus major Parus major Linnaeus … 1788 1 69 | #> 3 e571bae4-9e3f-5481-9b53-f614d536066c Helianthus annuus v… 3 Helianthus a… Helianthus ann… Helianthus a… <NA> NA 1 70 | ``` 71 | 72 | It's pretty fast, thanks to gnparser of course 73 | 74 | 75 | ```r 76 | n <- 10000L 77 | # get random scientific names from taxize 78 | spp <- taxize::names_list(rank = "species", size = n) 79 | timed <- system.time(gn_parse_tidy(spp)) 80 | #> Rows: 10000 Columns: 9 81 | #> ── Column specification ──────────────────────────────────────── 82 | #> Delimiter: "," 83 | #> chr (5): Id, Verbatim, CanonicalStem, CanonicalSimple, CanonicalFull 84 | #> dbl (2): Cardinality, Quality 85 | #> lgl (2): Authorship, Year 86 | #> 87 | #> ℹ Use `spec()` to retrieve the full column specification for 
this data. 88 | #> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message. 89 | timed 90 | #> user system elapsed 91 | #> 0.458 0.091 0.244 92 | ``` 93 | 94 | Just 0.244 sec for 10000 names 95 | 96 | ## gn_parse 97 | 98 | output a list of lists with more detailed information 99 | 100 | 101 | ```r 102 | x <- c("Quadrella steyermarkii (Standl.) Iltis & Cornejo", 103 | "Parus major Linnaeus, 1788", "Helianthus annuus var. texanus") 104 | gn_parse(x) 105 | #> [[1]] 106 | #> [[1]]$parsed 107 | #> [1] TRUE 108 | #> 109 | #> [[1]]$quality 110 | #> [1] 4 111 | #> 112 | #> [[1]]$qualityWarnings 113 | #> quality warning 114 | #> 1 4 Unparsed tail 115 | #> 116 | #> [[1]]$verbatim 117 | #> [1] "Quadrella steyermarkii (Standl.) Iltis & Cornejo" 118 | #> 119 | #> [[1]]$normalized 120 | #> [1] "Quadrella steyermarkii (Standl.) Iltis" 121 | #> 122 | #> [[1]]$canonical 123 | #> [[1]]$canonical$stemmed 124 | #> [1] "Quadrella steyermark" 125 | #> 126 | #> [[1]]$canonical$simple 127 | #> [1] "Quadrella steyermarkii" 128 | #> 129 | #> [[1]]$canonical$full 130 | #> [1] "Quadrella steyermarkii" 131 | #> 132 | #> 133 | #> [[1]]$cardinality 134 | #> [1] 2 135 | #> 136 | #> [[1]]$authorship 137 | #> [[1]]$authorship$verbatim 138 | #> [1] "(Standl.) Iltis" 139 | #> 140 | #> [[1]]$authorship$normalized 141 | #> [1] "(Standl.) Iltis" 142 | #> 143 | #> [[1]]$authorship$authors 144 | #> [1] "Standl." 
"Iltis" 145 | #> 146 | #> 147 | #> [[1]]$tail 148 | #> [1] " & Cornejo" 149 | #> 150 | #> [[1]]$id 151 | #> [1] "fbd1b4fe-f8ed-5390-9cb1-e0f798691b1e" 152 | #> 153 | #> [[1]]$parserVersion 154 | #> [1] "v1.7.1" 155 | #> 156 | #> 157 | #> [[2]] 158 | #> [[2]]$parsed 159 | #> [1] TRUE 160 | #> 161 | #> [[2]]$quality 162 | #> [1] 1 163 | #> 164 | #> [[2]]$verbatim 165 | #> [1] "Parus major Linnaeus, 1788" 166 | #> 167 | #> [[2]]$normalized 168 | #> [1] "Parus major Linnaeus 1788" 169 | #> 170 | #> [[2]]$canonical 171 | #> [[2]]$canonical$stemmed 172 | #> [1] "Parus maior" 173 | #> 174 | #> [[2]]$canonical$simple 175 | #> [1] "Parus major" 176 | #> 177 | #> [[2]]$canonical$full 178 | #> [1] "Parus major" 179 | #> 180 | #> 181 | #> [[2]]$cardinality 182 | #> [1] 2 183 | #> 184 | #> [[2]]$authorship 185 | #> [[2]]$authorship$verbatim 186 | #> [1] "Linnaeus, 1788" 187 | #> 188 | #> [[2]]$authorship$normalized 189 | #> [1] "Linnaeus 1788" 190 | #> 191 | #> [[2]]$authorship$year 192 | #> [1] "1788" 193 | #> 194 | #> [[2]]$authorship$authors 195 | #> [1] "Linnaeus" 196 | #> 197 | #> 198 | #> [[2]]$id 199 | #> [1] "e4e1d462-d332-583d-97a1-09735712f04d" 200 | #> 201 | #> [[2]]$parserVersion 202 | #> [1] "v1.7.1" 203 | #> 204 | #> 205 | #> [[3]] 206 | #> [[3]]$parsed 207 | #> [1] TRUE 208 | #> 209 | #> [[3]]$quality 210 | #> [1] 1 211 | #> 212 | #> [[3]]$verbatim 213 | #> [1] "Helianthus annuus var. texanus" 214 | #> 215 | #> [[3]]$normalized 216 | #> [1] "Helianthus annuus var. texanus" 217 | #> 218 | #> [[3]]$canonical 219 | #> [[3]]$canonical$stemmed 220 | #> [1] "Helianthus annu texan" 221 | #> 222 | #> [[3]]$canonical$simple 223 | #> [1] "Helianthus annuus texanus" 224 | #> 225 | #> [[3]]$canonical$full 226 | #> [1] "Helianthus annuus var. 
texanus" 227 | #> 228 | #> 229 | #> [[3]]$cardinality 230 | #> [1] 3 231 | #> 232 | #> [[3]]$id 233 | #> [1] "e571bae4-9e3f-5481-9b53-f614d536066c" 234 | #> 235 | #> [[3]]$parserVersion 236 | #> [1] "v1.7.1" 237 | ``` 238 | 239 | [gnparser]: https://github.com/gnames/gnparser 240 | 241 | 242 | -------------------------------------------------------------------------------- /vignettes/rgnparser.Rmd.orig: -------------------------------------------------------------------------------- 1 | --- 2 | title: "rgnparser: Parse Scientific Names" 3 | author: "Scott Chamberlain, Joel H. Nitta" 4 | date: "`r Sys.Date()`" 5 | output: rmarkdown::html_vignette 6 | --- 7 | 8 | ```{r setup-hide, include = FALSE} 9 | knitr::opts_chunk$set( 10 | collapse = TRUE, 11 | comment = "#>" 12 | ) 13 | # Increase width for printing tibbles 14 | # (will change back to default at end) 15 | old <- options(width = 140) 16 | ``` 17 | 18 | An R interface to `gnparser` (https://github.com/gnames/gnparser). 19 | 20 | ## Installation 21 | 22 | ```{r eval=FALSE} 23 | install.packages("rgnparser") 24 | # OR 25 | remotes::install_github("ropensci/rgnparser") 26 | ``` 27 | 28 | ```{r setup} 29 | library(rgnparser) 30 | ``` 31 | 32 | ## Install gnparser 33 | 34 | The command line tool written in Go, gnparser, is required to use this package. 35 | 36 | Instructions for installation can be found at the gnparser repo 37 | (https://github.com/gnames/gnparser). 38 | 39 | ## gn_version() 40 | 41 | Check the gnparser version: 42 | 43 | ```{r} 44 | gn_version() 45 | ``` 46 | 47 | ## gn_parse_tidy 48 | 49 | Output a data.frame with more minimal information 50 | 51 | ```{r} 52 | x <- c("Quadrella steyermarkii (Standl.) Iltis & Cornejo", 53 | "Parus major Linnaeus, 1788", "Helianthus annuus var. 
texanus") 54 | gn_parse_tidy(x) 55 | ``` 56 | 57 | It's pretty fast, thanks to gnparser of course 58 | 59 | ```{r} 60 | n <- 10000L 61 | # get random scientific names from taxize 62 | spp <- taxize::names_list(rank = "species", size = n) 63 | timed <- system.time(gn_parse_tidy(spp)) 64 | timed 65 | ``` 66 | 67 | Just `r timed[["elapsed"]]` sec for `r n` names 68 | 69 | ## gn_parse 70 | 71 | output a list of lists with more detailed information 72 | 73 | ```{r output.lines=1:15} 74 | x <- c("Quadrella steyermarkii (Standl.) Iltis & Cornejo", 75 | "Parus major Linnaeus, 1788", "Helianthus annuus var. texanus") 76 | gn_parse(x) 77 | ``` 78 | 79 | [gnparser]: https://github.com/gnames/gnparser 80 | 81 | ```{r, include = FALSE} 82 | # Reset options 83 | options(old) 84 | ``` --------------------------------------------------------------------------------
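The vignette describes `gn_parse()` as returning a list of lists, one element per name. As a rough illustration of how such a structure might be flattened with base R, the sketch below builds a small mock by hand, with field names and values copied from the vignette output above and trimmed to a few fields. It does not call gnparser, and the real return value contains more fields than shown here.

```r
# Hand-built mock of two gn_parse() results, trimmed to a few fields
# copied from the vignette output; no call to gnparser is made here.
w <- list(
  list(
    parsed = TRUE,
    verbatim = "Parus major Linnaeus, 1788",
    canonical = list(stemmed = "Parus maior", simple = "Parus major"),
    cardinality = 2L
  ),
  list(
    parsed = TRUE,
    verbatim = "Helianthus annuus var. texanus",
    canonical = list(stemmed = "Helianthus annu texan",
                     simple = "Helianthus annuus texanus"),
    cardinality = 3L
  )
)

# Pull one top-level field per name, in the same style as the package tests
verbatim <- vapply(w, "[[", "", "verbatim")

# Nested fields need an explicit accessor function
simple <- vapply(w, function(i) i$canonical$simple, character(1))

data.frame(verbatim = verbatim, canonical_simple = simple)
```

When only a flat table is needed, `gn_parse_tidy()` already returns one; a manual approach like this is mainly useful for fields that exist only in the nested `gn_parse()` result.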