├── .Rbuildignore ├── .github ├── .gitignore ├── CONTRIBUTING.md ├── issue_template.md ├── pull_request_template.md └── workflows │ ├── R-CMD-check.yaml │ └── test-coverage.yaml ├── .gitignore ├── CODE_OF_CONDUCT.md ├── DESCRIPTION ├── DataPackageR.Rproj ├── LICENSE ├── LICENSE.md ├── NAMESPACE ├── NEWS.md ├── R ├── DataPackageR-defunct.R ├── DataPackageR-package.R ├── autodoc.R ├── build.R ├── dataversion.R ├── digests.R ├── environments.R ├── ignore.R ├── load_save.R ├── logger.R ├── mergeDocumentation.R ├── parseDocumentation.R ├── processData.R ├── prompt.R ├── qualify_changes.R ├── rmarkdown_functions.R ├── skeleton.R ├── use.R ├── yamlR.R └── zzz.R ├── README.Rmd ├── README.md ├── bibliography.bib ├── codecov.yml ├── codemeta.json ├── cran-comments.md ├── inst ├── WORDLIST └── extdata │ └── tests │ ├── extra.Rmd │ ├── raw_data │ └── testdata.csv │ ├── rfileTest.R │ ├── rfileTest_noheader.R │ ├── subsetCars.Rmd │ └── subsetCars.html ├── man ├── DataPackageR-defunct.Rd ├── DataPackageR-package.Rd ├── DataPackageR_options.Rd ├── assert_data_version.Rd ├── construct_yml_config.Rd ├── data_version.Rd ├── datapackage_skeleton.Rd ├── datapackager_object_read.Rd ├── document.Rd ├── package_build.Rd ├── project_data_path.Rd ├── project_extdata_path.Rd ├── project_path.Rd ├── use_data_object.Rd ├── use_ignore.Rd ├── use_processing_script.Rd ├── use_raw_dataset.Rd └── yaml.Rd ├── revdep └── .gitignore ├── tests ├── spelling.R ├── testthat.R └── testthat │ ├── setup.R │ ├── test-DataPackageR.R │ ├── test-build-locations.R │ ├── test-conditional-build.R │ ├── test-data-name-change.R │ ├── test-data-version.R │ ├── test-datapackager-object-read.R │ ├── test-document.R │ ├── test-edge-cases.R │ ├── test-ignore.R │ ├── test-logger.R │ ├── test-manual-version-bump.R │ ├── test-news-update.R │ ├── test-phantom_loading.R │ ├── test-pkg_description.R │ ├── test-project-path.R │ ├── test-r-processing.R │ ├── test-skeleton-data-dependencies.R │ ├── test-skeleton-edgecases.R │ ├── 
test-skeleton.R │ ├── test-source_r_folder_functions.R │ ├── test-updating-datapackager-version.R │ ├── test-use_raw_data.R │ ├── test-version-bump.R │ ├── test-version-management-edge-cases.R │ ├── test-yaml-config.R │ ├── test-yaml-manipulation.R │ └── test-yaml.R └── vignettes ├── .gitignore ├── Using_DataPackageR.Rmd └── YAML_Configuration_Details.Rmd /.Rbuildignore: -------------------------------------------------------------------------------- 1 | ^Meta$ 2 | ^doc$ 3 | ^.*\.Rproj$ 4 | ^\.Rproj\.user$ 5 | ^README\.Rmd$ 6 | ^codecov\.yml$ 7 | ^CODE_OF_CONDUCT\.md$ 8 | bibliography.bib 9 | ^codemeta\.json$ 10 | ^\.github$ 11 | ^revdep$ 12 | ^cran-comments\.md$ 13 | ^DataPackageR\.Rproj$ 14 | ^CRAN-SUBMISSION$ 15 | ^LICENSE\.md$ 16 | -------------------------------------------------------------------------------- /.github/.gitignore: -------------------------------------------------------------------------------- 1 | *.html 2 | -------------------------------------------------------------------------------- /.github/CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # CONTRIBUTING # 2 | 3 | ### Fixing typos 4 | 5 | Small typos or grammatical errors in documentation may be edited directly using 6 | the GitHub web interface, so long as the changes are made in the _source_ file. 7 | 8 | * YES: you edit a roxygen comment in a `.R` file below `R/`. 9 | * NO: you should not edit an `.Rd` file below `man/`. 10 | 11 | ### Prerequisites 12 | 13 | Before you make a substantial pull request, you should always file an issue and 14 | make sure someone from the team agrees that it’s a problem. If you’ve found a 15 | bug, create an associated issue and illustrate the bug with a minimal 16 | [reprex](https://www.tidyverse.org/help/#reprex). 
17 | 18 | ### Pull request process 19 | 20 | * We are using the Git commit workflow found here: 21 | https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow 22 | * We recommend that you create a Git branch for each pull request (PR) and keep 23 | issues addressed to one issue per branch. 24 | * Look at the git workflow checks before and after making changes. 25 | The `README` should contain badges for any continuous integration services used 26 | by the package. 27 | * We recommend the tidyverse [style guide](https://style.tidyverse.org). 28 | You can use the [styler](https://CRAN.R-project.org/package=styler) package to 29 | apply these styles, but please don't restyle code that has nothing to do with 30 | your PR. 31 | * We use [roxygen2](https://cran.r-project.org/package=roxygen2). 32 | * We use [testthat](https://cran.r-project.org/package=testthat). Contributions 33 | with test cases included are easier to accept. 34 | * For user-facing changes, add a bullet to the top of `NEWS.md` below the 35 | current development version header describing the changes made followed by your 36 | GitHub username, and links to relevant issue(s)/PR(s). 37 | 38 | ### Code of Conduct 39 | 40 | Please note that the DataPackageR project is released with a 41 | [Contributor Code of Conduct](CODE_OF_CONDUCT.md). By contributing to this 42 | project you agree to abide by its terms. 43 | 44 | ### See rOpenSci [contributing guide](https://ropensci.github.io/dev_guide/contributingguide.html) 45 | for further details. 46 | 47 | ### Style pointers for vignettes 48 | 49 | * Headings are in sentence case and contain a period at the end of the heading when appropriate. 50 | * Conjunctions are excluded from text. 
51 | * Multiline function calls use the following whitespace schema: 52 | 53 | ``` R 54 | myFunction <- aFunctionCall( 55 | input1 = "this", 56 | input2 = "that" 57 | ) 58 | ``` 59 | 60 | * Vignette file names are snake cased with capital first letters: Like_This.Rmd 61 | * Vignette `VignetteIndexEntry` are in sentence case: My new vignette 62 | 63 | ### Discussion forum 64 | 65 | Check out our [discussion forum](https://discuss.ropensci.org) if you think your issue requires a longer form discussion. 66 | 67 | ### Prefer to Email? 68 | 69 | Email the person listed as maintainer in the `DESCRIPTION` file of this repo. 70 | 71 | Though note that private discussions over email don't help others - of course email is totally warranted if it's a sensitive problem of any kind. 72 | 73 | ### Thanks for contributing! 74 | 75 | This contributing guide is adapted from the tidyverse contributing guide available at https://raw.githubusercontent.com/r-lib/usethis/master/inst/templates/tidy-contributing.md 76 | -------------------------------------------------------------------------------- /.github/issue_template.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 |
Session Info 6 | 7 | ```r 8 | 9 | ``` 10 |
11 | -------------------------------------------------------------------------------- /.github/pull_request_template.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | ## Description 8 | 9 | 10 | ## Related Issue 11 | 14 | 15 | ## Example 16 | 18 | 19 | 21 | -------------------------------------------------------------------------------- /.github/workflows/R-CMD-check.yaml: -------------------------------------------------------------------------------- 1 | # Workflow derived from https://github.com/r-lib/actions/tree/v2/examples 2 | # Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help 3 | on: 4 | push: 5 | branches: [main, master, develop] 6 | pull_request: 7 | branches: [main, master, develop] 8 | 9 | name: R-CMD-check 10 | 11 | jobs: 12 | R-CMD-check: 13 | runs-on: ${{ matrix.config.os }} 14 | 15 | name: ${{ matrix.config.os }} (${{ matrix.config.r }}) 16 | 17 | strategy: 18 | fail-fast: false 19 | matrix: 20 | config: 21 | - {os: macos-latest, r: 'release'} 22 | - {os: windows-latest, r: 'release'} 23 | - {os: ubuntu-latest, r: 'devel', http-user-agent: 'release'} 24 | - {os: ubuntu-latest, r: 'release'} 25 | - {os: ubuntu-latest, r: 'oldrel-1'} 26 | - {os: ubuntu-latest, r: '4.0.4'} 27 | 28 | env: 29 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 30 | R_KEEP_PKG_SOURCE: yes 31 | 32 | steps: 33 | - uses: actions/checkout@v4 34 | 35 | - uses: r-lib/actions/setup-pandoc@v2 36 | 37 | - uses: r-lib/actions/setup-r@v2 38 | with: 39 | r-version: ${{ matrix.config.r }} 40 | http-user-agent: ${{ matrix.config.http-user-agent }} 41 | use-public-rspm: true 42 | 43 | - uses: r-lib/actions/setup-tinytex@v2 44 | 45 | - uses: r-lib/actions/setup-r-dependencies@v2 46 | with: 47 | extra-packages: any::rcmdcheck 48 | needs: check 49 | 50 | - uses: r-lib/actions/check-r-package@v2 51 | with: 52 | upload-snapshots: true 53 | build_args: 
'c("--no-manual","--compact-vignettes=gs+qpdf")' 54 | -------------------------------------------------------------------------------- /.github/workflows/test-coverage.yaml: -------------------------------------------------------------------------------- 1 | # Workflow derived from https://github.com/r-lib/actions/tree/v2/examples 2 | # Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help 3 | on: 4 | push: 5 | branches: [main, master, develop] 6 | pull_request: 7 | branches: [main, master, develop] 8 | 9 | name: test-coverage 10 | 11 | permissions: read-all 12 | 13 | jobs: 14 | test-coverage: 15 | runs-on: ubuntu-latest 16 | env: 17 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 18 | 19 | steps: 20 | - uses: actions/checkout@v4 21 | 22 | - uses: r-lib/actions/setup-pandoc@v2 23 | 24 | - uses: r-lib/actions/setup-r@v2 25 | with: 26 | use-public-rspm: true 27 | 28 | - uses: r-lib/actions/setup-tinytex@v2 29 | 30 | - uses: r-lib/actions/setup-r-dependencies@v2 31 | with: 32 | extra-packages: any::covr, any::xml2 33 | needs: coverage 34 | 35 | - name: Test coverage 36 | run: | 37 | cov <- covr::package_coverage( 38 | quiet = FALSE, 39 | clean = FALSE, 40 | install_path = file.path(normalizePath(Sys.getenv("RUNNER_TEMP"), winslash = "/"), "package") 41 | ) 42 | covr::to_cobertura(cov) 43 | shell: Rscript {0} 44 | 45 | - uses: codecov/codecov-action@v4 46 | with: 47 | fail_ci_if_error: ${{ github.event_name != 'pull_request' && true || false }} 48 | file: ./cobertura.xml 49 | plugin: noop 50 | disable_search: true 51 | token: ${{ secrets.CODECOV_TOKEN }} 52 | 53 | - name: Show testthat output 54 | if: always() 55 | run: | 56 | ## -------------------------------------------------------------------- 57 | find '${{ runner.temp }}/package' -name 'testthat.Rout*' -exec cat '{}' \; || true 58 | shell: bash 59 | 60 | - name: Upload test results 61 | if: failure() 62 | uses: actions/upload-artifact@v4 63 | with: 64 | name: 
coverage-test-failures 65 | path: ${{ runner.temp }}/package 66 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .Rproj.user 2 | .Rhistory 3 | .RData 4 | /revdep/.cache.rds 5 | .DS_Store 6 | .httr-oauth 7 | inst/doc 8 | /doc/ 9 | /Meta/ 10 | -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Contributor Code of Conduct 2 | 3 | As contributors and maintainers of this project, we pledge to respect all people who 4 | contribute through reporting issues, posting feature requests, updating documentation, 5 | submitting pull requests or patches, and other activities. 6 | 7 | We are committed to making participation in this project a harassment-free experience for 8 | everyone, regardless of level of experience, gender, gender identity and expression, 9 | sexual orientation, disability, personal appearance, body size, race, ethnicity, age, or religion. 10 | 11 | Examples of unacceptable behavior by participants include the use of sexual language or 12 | imagery, derogatory comments or personal attacks, trolling, public or private harassment, 13 | insults, or other unprofessional conduct. 14 | 15 | Project maintainers have the right and responsibility to remove, edit, or reject comments, 16 | commits, code, wiki edits, issues, and other contributions that are not aligned to this 17 | Code of Conduct. Project maintainers who do not follow the Code of Conduct may be removed 18 | from the project team. 19 | 20 | Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by 21 | opening an issue or contacting one or more of the project maintainers. 
22 | 23 | This Code of Conduct is adapted from the Contributor Covenant 24 | (http://contributor-covenant.org), version 1.0.0, available at 25 | http://contributor-covenant.org/version/1/0/0/ 26 | -------------------------------------------------------------------------------- /DESCRIPTION: -------------------------------------------------------------------------------- 1 | Type: Package 2 | Package: DataPackageR 3 | Title: Construct Reproducible Analytic Data Sets as R Packages 4 | Version: 0.16.1 5 | Authors@R: c( 6 | person("Greg", "Finak", , "greg.finak@gmail.com", role = c("aut", "cph"), 7 | comment = "Original author and creator of DataPackageR"), 8 | person("Paul", "Obrecht", role = "ctb"), 9 | person("Ellis", "Hughes", , "ellishughes@live.com", role = "ctb", 10 | comment = c(ORCID = "0000-0003-0637-4436")), 11 | person("Jimmy", "Fulp", , "williamjfulp@gmail.com", role = "ctb"), 12 | person("Marie", "Vendettuoli", role = "ctb", 13 | comment = c(ORCID = "0000-0001-9321-1410")), 14 | person("Dave", "Slager", , "dslager@fredhutch.org", role = c("ctb", "cre"), 15 | comment = c(ORCID = "0000-0003-2525-2039")), 16 | person("Jason", "Taylor", , "jmtaylor@fredhutch.org", role = "ctb"), 17 | person("Kara", "Woo", role = "rev", 18 | comment = "Kara reviewed the package for rOpenSci, see "), 19 | person("William", "Landau", role = "rev", 20 | comment = "William reviewed the package for rOpenSci, see ") 21 | ) 22 | Description: A framework to help construct R data packages in a 23 | reproducible manner. Potentially time consuming processing of raw data 24 | sets into analysis ready data sets is done in a reproducible manner 25 | and decoupled from the usual 'R CMD build' process so that data sets 26 | can be processed into R objects in the data package and the data 27 | package can then be shared, built, and installed by others without the 28 | need to repeat computationally costly data processing. 
The package 29 | maintains data provenance by turning the data processing scripts into 30 | package vignettes, as well as enforcing documentation and version 31 | checking of included data objects. Data packages can be version 32 | controlled on 'GitHub', and used to share data for manuscripts, 33 | collaboration and reproducible research. 34 | License: MIT + file LICENSE 35 | URL: https://github.com/ropensci/DataPackageR, 36 | https://docs.ropensci.org/DataPackageR/ 37 | BugReports: https://github.com/ropensci/DataPackageR/issues 38 | Depends: 39 | R (>= 3.5.0) 40 | Imports: 41 | cli, 42 | desc, 43 | digest, 44 | futile.logger, 45 | knitr, 46 | pkgbuild, 47 | pkgload, 48 | rmarkdown, 49 | roxygen2, 50 | rprojroot, 51 | usethis, 52 | utils, 53 | yaml 54 | Suggests: 55 | covr, 56 | data.tree, 57 | spelling, 58 | testthat, 59 | withr 60 | VignetteBuilder: 61 | knitr 62 | Encoding: UTF-8 63 | Language: en-US 64 | RoxygenNote: 7.3.2 65 | SystemRequirements: pandoc - https://pandoc.org 66 | -------------------------------------------------------------------------------- /DataPackageR.Rproj: -------------------------------------------------------------------------------- 1 | Version: 1.0 2 | 3 | RestoreWorkspace: Default 4 | SaveWorkspace: Default 5 | AlwaysSaveHistory: Default 6 | 7 | EnableCodeIndexing: Yes 8 | UseSpacesForTab: Yes 9 | NumSpacesForTab: 2 10 | Encoding: UTF-8 11 | 12 | RnwWeave: Sweave 13 | LaTeX: pdfLaTeX 14 | 15 | AutoAppendNewline: Yes 16 | StripTrailingWhitespace: Yes 17 | 18 | BuildType: Package 19 | PackageUseDevtools: Yes 20 | PackageInstallArgs: --no-multiarch --with-keep.source 21 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | YEAR: 2024 2 | COPYRIGHT HOLDER: Greg Finak 3 | -------------------------------------------------------------------------------- /LICENSE.md: 
-------------------------------------------------------------------------------- 1 | # MIT License 2 | 3 | Copyright (c) 2024 Greg Finak 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /NAMESPACE: -------------------------------------------------------------------------------- 1 | # Generated by roxygen2: do not edit by hand 2 | 3 | export(assert_data_version) 4 | export(construct_yml_config) 5 | export(dataVersion) 6 | export(data_version) 7 | export(datapackage.skeleton) 8 | export(datapackage_skeleton) 9 | export(datapackager_object_read) 10 | export(document) 11 | export(keepDataObjects) 12 | export(package_build) 13 | export(project_data_path) 14 | export(project_extdata_path) 15 | export(project_path) 16 | export(use_data_object) 17 | export(use_ignore) 18 | export(use_processing_script) 19 | export(use_raw_dataset) 20 | export(yml_add_files) 21 | export(yml_add_objects) 22 | export(yml_disable_compile) 23 | export(yml_enable_compile) 24 | export(yml_find) 25 | export(yml_list_files) 26 | export(yml_list_objects) 27 | export(yml_remove_files) 28 | export(yml_remove_objects) 29 | export(yml_write) 30 | importFrom(desc,desc) 31 | importFrom(futile.logger,INFO) 32 | importFrom(futile.logger,TRACE) 33 | importFrom(futile.logger,appender.console) 34 | importFrom(futile.logger,appender.file) 35 | importFrom(futile.logger,appender.tee) 36 | importFrom(futile.logger,flog.appender) 37 | importFrom(futile.logger,flog.debug) 38 | importFrom(futile.logger,flog.error) 39 | importFrom(futile.logger,flog.fatal) 40 | importFrom(futile.logger,flog.info) 41 | importFrom(futile.logger,flog.logger) 42 | importFrom(futile.logger,flog.threshold) 43 | importFrom(futile.logger,flog.trace) 44 | importFrom(futile.logger,flog.warn) 45 | importFrom(knitr,knit) 46 | importFrom(knitr,spin) 47 | importFrom(rmarkdown,pandoc_available) 48 | importFrom(rmarkdown,render) 49 | importFrom(rprojroot,is_r_package) 50 | importFrom(usethis,create_package) 51 | importFrom(usethis,proj_get) 52 | importFrom(usethis,proj_set) 53 | importFrom(usethis,use_build_ignore) 54 | 
importFrom(usethis,use_data_raw) 55 | importFrom(usethis,use_directory) 56 | importFrom(usethis,use_rstudio) 57 | importFrom(yaml,as.yaml) 58 | importFrom(yaml,read_yaml) 59 | importFrom(yaml,write_yaml) 60 | importFrom(yaml,yaml.load_file) 61 | -------------------------------------------------------------------------------- /NEWS.md: -------------------------------------------------------------------------------- 1 | # DataPackageR 0.16.1 2 | 3 | ## Minor user-facing improvements 4 | * Suppress warning from project*_path() functions when file does not exist (#160) 5 | 6 | ## Maintenance 7 | * Remove dependency on crayon package, switch to cli package (#159) 8 | * Remove invalid ORCID placeholder created by usethis that caused failing CRAN checks (#162) 9 | 10 | # DataPackageR 0.16.0 11 | 12 | ## Bug fixes 13 | 14 | * Throw error if data object name is same as package name. Fixes (#62) 15 | * Fix triplicate warnings when data version changes (#128) 16 | * Fix bug where automatic updates to NEWS.md and the log file did not reflect the actual changes to data objects (#67) 17 | - The actual data packaging and versioning behavior was as intended, but now the NEWS.md and log file reflect that behavior 18 | * Remove inconsistently auto-generated 'see also' links from package and data object help files 19 | 20 | ## Significant user-facing changes 21 | 22 | * `document()` now defaults to `install = FALSE`, like `package_build()` (#130) 23 | 24 | ## Minor user-facing improvements 25 | 26 | * Tweaks to PDF manual and vignette index titles 27 | * Tweaks to spacing in console outputs and log file 28 | * Suppress repetitive log file lines about updating NEWS.md 29 | * Remove `assertthat`, `devtools`, `stringr`, and `purrr` from Imports 30 | * Add `pkgbuild` and `pkgload` to Imports 31 | * Move `withr` from Imports to Suggests 32 | * Make some long-deprecated functions officially Defunct 33 | - `dataVersion()`, renamed years ago to `data_version()` 34 | - 
`datapackage.skeleton()`, renamed years ago to `datapackage_skeleton()` 35 | - `keepDataObjects()` 36 | * Keep existing DataPackageR_verbose option if set outside of package 37 | 38 | ## Internal improvements 39 | 40 | * Refactor and split up internal omnibus function `DataPackageR()` 41 | - New internal functions 42 | - `do_doc()` 43 | - `do_digests()` 44 | - `validate_pkg_name()` 45 | - `validate_package_skeleton()` 46 | - `validate_DataVersion()` 47 | - `validate_description()` 48 | - `validate_yml()` 49 | - `get_yml_objects()` 50 | - `get_yml_r_files()` 51 | - Deleted internal functions 52 | - `read.description()`, now uses `desc::desc()` 53 | - `read_pkg_description()`, now uses `desc::desc()` 54 | - `comment()`, now uses `roxygen2::parse_file()` 55 | - Simplify internal functions 56 | - `.increment_data_version()` 57 | - `.digest_data_env()` 58 | - `.save_data()` 59 | - `.doc_parse()` 60 | - `.check_dataversion_string()` 61 | - Use base R `compareVersion()` and class `package_version` to simplify code involving the DataVersion and do additional sanity checks 62 | * Miscellaneous cleanup and maintenance 63 | * Make unit tests more independent from each other 64 | 65 | 66 | # DataPackageR 0.15.9 67 | 68 | ## Bug fixes 69 | 70 | * Fix bug where `.Rprofile` options setting for `DataPackageR_interact` was overwritten upon package load 71 | * Fix bugs where document() and package_build() left the data package attached 72 | * Fix unit test inter-dependencies 73 | * Make package template depend on R >= 3.5.0, suppressing `.rda` serialization warnings upon install 74 | * Close a file connection, suppressing warnings about orphaned connections 75 | * Fix broken test that led to archiving on CRAN 76 | * Fix package documentation method for data packages and DataPackageR itself (r-lib/roxygen2#1491) 77 | 78 | ## Minor improvements 79 | 80 | * Various maintenance tweaks 81 | * Documentation improvements 82 | * Update maintainer and contact info 83 | * New global option 
`DataPackageR_verbose` to suppress console output, e.g. during unit testing 84 | 85 | # DataPackageR 0.15.8 86 | * Fix to datapackager_object_read that was causing a test to break. `get` needs to have `inherits=FALSE`. 87 | * Other fixes for `usethis` 1.6.0 88 | * Fixes to tests that were failing on CRAN 89 | * In `package_build`, remove `devtools::reload` and put `devtools::unload` in front of `install.packages` 90 | * In `document`, remove `devtools::reload` and put in `devtools::unload` and `install.packages` 91 | 92 | # DataPackageR 0.15.7 93 | * Fix test and vignette bugs related to upcoming version of usethis (1.5) 94 | 95 | 96 | # DataPackageR 0.15.6 97 | * Fix bug in vignette and code that writes to user space during CRAN checks. 98 | 99 | # DataPackageR 0.15.4.900 100 | * Fix a bug in update_news. 101 | * Create the NEWS file if it doesn't exist. 102 | 103 | 104 | # DataPackageR 0.15.4 105 | * New CRAN Release 106 | 107 | # DataPackageR 0.15.3.9000 108 | 109 | ## Features and enhancements 110 | * Reduce the console output from logging. (ropensci/DataPackageR/issues/50) 111 | * Create a new logger that logs at different thresholds to console and to file (ropensci/DataPackageR/issues/50) 112 | * Default on build is not to install. 113 | * Hide console output from Rmd render. 114 | * Nicer messages describing data sets that are created (ropensci/DataPackageR/issues/51) 115 | * Write deleted, changed, and added data objects to the NEWS file automatically. 116 | * Add option to overwrite (or not) via use_processing_script. Provide warning. 117 | * Add use_ignore() to ignore files and data sets in .Rbuildignore and .gitignore and add an `ignore` argument to use_raw_dataset(). 118 | 119 | ## Bug fixes 120 | * code argument no longer required for construct_yml_config 121 | * Fix the documentation for datapackager_object_read() and "Migrating old packages". 
122 | * Copy over vignettes generated as pdfs into the package inst/doc 123 | * Data objects are incrementally stored during the build process, into the render_root directory specified in the datapackager.yml config file. 124 | 125 | # DataPackageR 0.15.3 126 | * conditional tests when pandoc is missing (ropensci/DataPackageR/issues/46) 127 | * add use_data_object and use_processing_script (ropensci/DataPackageR/issues/44) 128 | * allow datapackage_skeleton to be called without files or data objects for interactive construction. (ropensci/DataPackageR/issues/44) 129 | 130 | # DataPackageR 0.15.2 131 | * Add pandoc to SystemRequirements (ropensci/DataPackageR/issues/46) 132 | * Add use_raw_dataset() method (and tests) to add data sets to inst/extdata interactively. (ropensci/DataPackageR/issues/44) 133 | 134 | # DataPackageR 0.15.1.9000 135 | * Development version 136 | 137 | # DataPackageR 0.15.1 138 | - Fix CRAN notes. 139 | 140 | # DataPackageR 0.15.0 141 | - Prepare for CRAN submission. 142 | 143 | 144 | # DataPackageR 0.14.9 145 | 146 | - Moving towards rOpenSci compliance 147 | - NEWS.md updated with description of changes to data sets when version is bumped (or new package is created). 148 | - Output of "next steps" for user when package is built 149 | - New `document()` function to rebuild docs from `documentation.R` in `data-raw` without rebuilding the whole package. 150 | - Improved package test. 151 | - R scripts processed properly into vignettes. 152 | - Packages installed and loaded after build to make vignettes and data sets accessible in same R session. 153 | 154 | 155 | # DataPackageR 0.13.6 156 | 157 | - Added a NEWS file. 158 | - Cleaned up the examples. 159 | - Snake case for all exported functions. 160 | 161 | # DataPackageR 0.13.3 162 | 163 | - Added the `render_root` property to the YAML configuration. Specifies where `render()` processing is done, instead of the `data-raw` directory. 
164 | -------------------------------------------------------------------------------- /R/DataPackageR-defunct.R: -------------------------------------------------------------------------------- 1 | ## Defunct functions in DataPackageR 2 | #' @title Defunct functions in package \pkg{DataPackageR}. 3 | #' @description These functions are defunct and no longer supported. 4 | #' Calling them will result in an error. 5 | #' 6 | #' When possible, alternatives are suggested. 7 | #' 8 | #' @name DataPackageR-defunct 9 | #' @param ... All arguments are now ignored. 10 | #' @returns Defunct function. No return value. 11 | NULL 12 | 13 | #' @rdname DataPackageR-defunct 14 | #' @export 15 | datapackage.skeleton <- function(...) { 16 | .Defunct('datapackage_skeleton()', package = 'DataPackageR') 17 | } 18 | 19 | #' @rdname DataPackageR-defunct 20 | #' @export 21 | dataVersion <- function(...) { 22 | .Defunct('data_version()', package = 'DataPackageR') 23 | } 24 | 25 | #' @rdname DataPackageR-defunct 26 | #' @export 27 | keepDataObjects <- function(...) { 28 | .Defunct('datapackager.yml', package = 'DataPackageR') 29 | } 30 | -------------------------------------------------------------------------------- /R/DataPackageR-package.R: -------------------------------------------------------------------------------- 1 | #' DataPackageR 2 | #' 3 | #' A framework to automate the processing, tidying and packaging of raw data into analysis-ready 4 | #' data sets as R packages. 5 | #' 6 | #' DataPackageR will automate running of data processing code, 7 | #' storing tidied data sets in an R package, producing 8 | #' data documentation stubs, tracking data object fingerprints (md5 hash) 9 | #' and tracking and incrementing a "DataVersion" string 10 | #' in the DESCRIPTION file of the package when raw data or data 11 | #' objects change. 12 | #' Code to perform the data processing is passed to DataPackageR by the user. 
13 | #' The user also specifies the names of the tidy data objects to be stored, 14 | #' documented and tracked in the final package. Raw data should be read from 15 | #' "inst/extdata" but large raw data files can be read from sources external 16 | #' to the package source tree. 17 | #' 18 | #' Configuration is controlled via the datapackager.yml file created at the package root. 19 | #' Its properties include a list of R and Rmd files that are to be rendered / sourced and 20 | #' which read data and do the actual processing. 21 | #' It also includes a list of R object names created by those files. These objects 22 | #' are stored in the final package and accessible via the \code{data()} API. 23 | #' The documentation for these objects is accessible via "?object-name", and md5 24 | #' fingerprints of these objects are created and tracked. 25 | #' 26 | #' The Rmd and R files used to process the objects are transformed into vignettes 27 | #' accessible in the final package so that the processing is fully documented. 28 | #' 29 | #' A DATADIGEST file in the package source keeps track of the data object fingerprints. 30 | #' A DataVersion string is added to the package DESCRIPTION file and updated when these 31 | #' objects are updated or changed on subsequent builds. 32 | #' 33 | #' Once the package is built and installed, the data objects created in the package are accessible via 34 | #' the \code{data()} API. 35 | #' Calling \code{datapackage_skeleton()} and passing in R / Rmd file names, and R object names 36 | #' constructs a skeleton data package source tree and an associated \code{datapackager.yml} file. 37 | #' 38 | #' Calling \code{package_build()} sets the build process in motion. 39 | #' @examples 40 | #' # A simple Rmd file that creates one data object 41 | #' # named "tbl". 
42 | #' if(rmarkdown::pandoc_available()){ 43 | #' f <- tempdir() 44 | #' f <- file.path(f,"foo.Rmd") 45 | #' con <- file(f) 46 | #' writeLines("```{r}\n tbl = data.frame(1:10) \n```\n",con=con) 47 | #' close(con) 48 | #' 49 | #' # construct a data package skeleton named "MyDataPackage" and pass 50 | #' # in the Rmd file name with full path, and the name of the object(s) it 51 | #' # creates. 52 | #' 53 | #' pname <- basename(tempfile()) 54 | #' datapackage_skeleton(name=pname, 55 | #' path=tempdir(), 56 | #' force = TRUE, 57 | #' r_object_names = "tbl", 58 | #' code_files = f) 59 | #' 60 | #' # call package_build to run the "foo.Rmd" processing and 61 | #' # build a data package. 62 | #' package_build(file.path(tempdir(), pname), install = FALSE) 63 | #' 64 | #' # "install" the data package 65 | #' pkgload::load_all(file.path(tempdir(), pname)) 66 | #' 67 | #' # read the data version 68 | #' data_version(pname) 69 | #' 70 | #' # list the data sets in the package. 71 | #' data(package = pname) 72 | #' 73 | #' # The data objects are in the package source under "/data" 74 | #' list.files(pattern="rda", path = file.path(tempdir(),pname,"data"), full = TRUE) 75 | #' 76 | #' # The documentation that needs to be edited is in "/R" 77 | #' list.files(pattern="R", path = file.path(tempdir(), pname,"R"), full = TRUE) 78 | #' readLines(list.files(pattern="R", path = file.path(tempdir(),pname,"R"), full = TRUE)) 79 | #' # view the documentation with 80 | #' ?tbl 81 | #' } 82 | #' @name DataPackageR-package 83 | #' @keywords internal 84 | '_PACKAGE' 85 | 86 | ## usethis namespace: start 87 | ## usethis namespace: end 88 | NULL 89 | 90 | #' Options consulted by DataPackageR 91 | #' 92 | #' @description User-configurable options consulted by DataPackageR, which 93 | #' provide a mechanism for setting default behaviors for various functions. 94 | #' 95 | #' If the built-in defaults don't suit you, set one or more of these options. 
96 | #' Typically, this is done in the \code{.Rprofile} startup file, which you can open 97 | #' for editing with \code{usethis::edit_r_profile()} - this will set the specified 98 | #' options for all future R sessions. The following setting is recommended to 99 | #' not be prompted upon each package build for a NEWS update: 100 | #' 101 | #' \code{options(DataPackageR_interact = FALSE)} 102 | #' 103 | #' @section Options for the DataPackageR package: 104 | #' 105 | #' - \code{DataPackageR_interact}: Upon package load, this defaults to the value of 106 | #' \code{interactive()}, unless the option has been previously set (e.g., in 107 | #' \code{.Rprofile}). TRUE prompts user interactively for a NEWS update on 108 | #' \code{package_build()}. See the example above and the 109 | #' \href{https://ropensci.org/blog/2018/09/18/datapackager/}{rOpenSci blog 110 | #' post} for more details on how to set this to FALSE, which will never prompt 111 | #' user for a NEWS update. FALSE is also the setting used for DataPackageR 112 | #' internal package tests. 113 | #' 114 | #' - \code{DataPackageR_verbose}: Default upon package load is TRUE. FALSE suppresses 115 | #' all console output and is currently only used for automated 116 | #' unit tests of the DataPackageR package. 117 | #' 118 | #' - \code{DataPackageR_packagebuilding}: Default upon package load is FALSE. This 119 | #' option is used internally for package operations and changing it is not 120 | #' recommended. 
121 | #' 122 | #' @name DataPackageR_options 123 | NULL 124 | -------------------------------------------------------------------------------- /R/autodoc.R: -------------------------------------------------------------------------------- 1 | 2 | 3 | # function .doc_autogen() automates the creation of a basic roxygen template for 4 | # the package and each object in objects_to_keep arguments are pname and ds2kp, 5 | # normally defined in datasets.R pname is name of package, ds2kp is list of 6 | # objects to save in data package 7 | .doc_autogen <- function(pname, ds2kp, env, path, name = "documentation.R") { 8 | 9 | # create default file to be edited and 10 | # renamed manually by user, who then rebuilds package 11 | tempfilename <- file.path(path, name) 12 | if (file.exists(tempfilename)) { 13 | file.remove(tempfilename) 14 | } 15 | 16 | # create Roxygen documentation for data package 17 | on.exit(close(con)) 18 | con <- file(tempfilename, open = "w") 19 | writeLines( 20 | c( 21 | .rc( 22 | c( 23 | pname, 24 | paste0("A data package for ", pname, "."), 25 | paste0("@aliases ", pname, "-package"), 26 | "@title Package Title", 27 | paste0("@name ", pname), 28 | "@description A description of the data package", 29 | paste0( 30 | "@details Use \\code{", 31 | "data(package='", pname, "')$", 32 | "results[, 3]} to ", 33 | "see a list of available ", 34 | "data sets in this data package" 35 | ), 36 | " and/or DataPackageR::load_all", 37 | "_datasets() to load them." 
38 | ) 39 | ), 40 | "'_PACKAGE'\n\n\n" 41 | ), con 42 | ) 43 | 44 | # Cycle through the objects and create Roxygen documentation for each one 45 | for (ds in ds2kp) { 46 | type <- class(get(ds, envir = env))[1] 47 | writeLines( 48 | .rc(c( 49 | "Detailed description of the data", 50 | paste("@name", ds), 51 | "@docType data", 52 | "@title Descriptive data title", 53 | paste0( 54 | "@format a \\code{", type, 55 | "} containing the following fields:" 56 | ), 57 | "\\describe{" 58 | )), con 59 | ) 60 | # set up documentation template 61 | # for each field, using \item{varname}{} 62 | # with a blank description to fill in 63 | for (var in names(get(ds, envir = env))) { 64 | writeLines( 65 | .rc(paste0("\\item{", var, "}{}")), 66 | con 67 | ) 68 | } 69 | writeLines( 70 | c( 71 | .rc( 72 | c( 73 | "}", 74 | paste0( 75 | "@source The data comes from", 76 | "________________________." 77 | ) 78 | ) 79 | ), 80 | "NULL\n\n\n" 81 | ), con 82 | ) 83 | } 84 | } 85 | -------------------------------------------------------------------------------- /R/build.R: -------------------------------------------------------------------------------- 1 | 2 | 3 | #' Pre-process, document and build a data package 4 | #' 5 | #' Combines the preprocessing, documentation, and build steps into one. 6 | #' 7 | #' @param packageName \code{character} path to package source directory. Defaults to the current path when NULL. 8 | #' @param vignettes \code{logical} specify whether to build vignettes. Default FALSE. 9 | #' @param log log level \code{INFO,WARN,DEBUG,FATAL} 10 | #' @param deps \code{logical} should we pass data objects into subsequent scripts? Default TRUE 11 | #' @param install \code{logical} automatically install and load the package after building. Default FALSE 12 | #' @param ... additional arguments passed to \code{install.packages} when \code{install=TRUE}. 13 | #' @returns Character vector. File path of the built package. 
14 | #' @importFrom usethis use_build_ignore use_rstudio proj_set use_directory 15 | #' @importFrom rprojroot is_r_package 16 | #' @importFrom rmarkdown pandoc_available 17 | #' @importFrom yaml read_yaml 18 | #' @importFrom futile.logger flog.logger flog.trace appender.file flog.debug flog.info flog.warn flog.error flog.fatal flog.appender flog.threshold INFO TRACE appender.console appender.tee 19 | #' @importFrom knitr knit spin 20 | #' @details Note that if \code{package_build} returns an error when rendering an \code{.Rmd} 21 | #' internally, but that same \code{.Rmd} can be run successfully manually using \code{rmarkdown::render}, 22 | #' then the following code facilitates debugging. Set \code{options(error = function(){ sink(); recover()})} 23 | #' before running \code{package_build} . This will enable examination of the active function calls at the time of the error, 24 | #' with output printed to the console rather than \code{knitr}'s default sink. 25 | #' After debugging, evaluate \code{options(error = NULL)} to revert to default error handling. 26 | #' See section "22.5.3 RMarkdown" at \url{ https://adv-r.hadley.nz/debugging.html} for more details. 27 | #' @export 28 | #' @examples 29 | #' if(rmarkdown::pandoc_available()){ 30 | #' f <- tempdir() 31 | #' f <- file.path(f,"foo.Rmd") 32 | #' con <- file(f) 33 | #' writeLines("```{r}\n tbl = data.frame(1:10) \n```\n",con=con) 34 | #' close(con) 35 | #' pname <- basename(tempfile()) 36 | #' datapackage_skeleton(name=pname, 37 | #' path=tempdir(), 38 | #' force = TRUE, 39 | #' r_object_names = "tbl", 40 | #' code_files = f) 41 | #' 42 | #' package_build(file.path(tempdir(),pname), install = FALSE) 43 | #' } 44 | package_build <- function(packageName = NULL, 45 | vignettes = FALSE, 46 | log = INFO, 47 | deps = TRUE, 48 | install = FALSE, 49 | ...) { 50 | .multilog_setup(LOGFILE = NULL) 51 | # flog.appender(appender.console()) 52 | if (is.null(packageName)) { 53 | packageName <- "." 
54 | # use normalizePath 55 | package_path <- normalizePath(packageName, winslash = "/") 56 | packageName <- basename(package_path) 57 | # Is this a package root? 58 | if (!is_r_package$find_file() == package_path) { 59 | flog.fatal(paste0(package_path, 60 | " is not an R package root directory"), 61 | name = "console") 62 | stop("exiting", call. = FALSE) 63 | } 64 | } else { 65 | package_path <- normalizePath(packageName, winslash = "/") 66 | if (!file.exists(package_path)) { 67 | flog.fatal(paste0("Non existent package ", packageName), name = "console") 68 | stop("exiting", call. = FALSE) 69 | } 70 | packageName <- basename(package_path) 71 | } 72 | # This should always be a proper name of a directory, either current or a 73 | # subdirectory 74 | tryCatch({is_r_package$find_file(path = package_path)}, 75 | error = function(cond){ 76 | flog.fatal(paste0( 77 | package_path, 78 | " is not a valid R package directory beneath ", 79 | getwd() 80 | ), name = "console") 81 | stop("exiting", call. = FALSE) 82 | } 83 | ) 84 | 85 | # Check that directory name matches package name 86 | validate_pkg_name(package_path) 87 | 88 | # Return success if we've processed everything 89 | success <- 90 | DataPackageR(arg = package_path, deps = deps) 91 | ifelse(success, 92 | .multilog_trace("DataPackageR succeeded"), 93 | .multilog_warn("DataPackageR failed") 94 | ) 95 | .multilog_trace("Building documentation") 96 | local({ 97 | on.exit({ 98 | if (packageName %in% names(utils::sessionInfo()$otherPkgs)){ 99 | pkgload::unload(packageName) 100 | } 101 | }) 102 | roxygen2::roxygenize(package_path, clean = TRUE) 103 | }) 104 | .multilog_trace("Building package") 105 | location <- pkgbuild::build(path = package_path, 106 | dest_path = dirname(package_path), 107 | vignettes = vignettes, 108 | quiet = ! 
getOption('DataPackageR_verbose', TRUE) 109 | ) 110 | # try to install and then reload the package in the current session 111 | if (install) { 112 | utils::install.packages(location, repos = NULL, type = "source", ...) 113 | } 114 | .next_steps() 115 | return(location) 116 | } 117 | 118 | .next_steps <- function() { 119 | if (! getOption('DataPackageR_verbose', TRUE)) return(invisible(NULL)) 120 | cat(cli::col_green(cli::style_bold("Next Steps")), "\n") # nolint 121 | cat(cli::col_white(cli::col_yellow(cli::style_bold("1. Update your package documentation.")), "\n")) # nolint 122 | cat(cli::col_white(" - Edit the documentation.R file in the package source", cli::col_green("data-raw"), "subdirectory and update the roxygen markup."), "\n") # nolint 123 | cat(cli::col_white(" - Rebuild the package documentation with ", cli::col_red("document()"), "."), "\n") # nolint 124 | cat(cli::col_white(cli::col_yellow(cli::style_bold("2. Add your package to source control.")), "\n")) # nolint 125 | cat(cli::col_white(" - Call ", cli::col_red("git init ."), " in the package source root directory."), "\n") # nolint 126 | cat(cli::col_white(" - ", cli::col_red("git add"), " the package files."), "\n") # nolint 127 | cat(cli::col_white(" - ", cli::col_red("git commit"), " your new package."), "\n") # nolint 128 | cat(cli::col_white(" - Set up a github repository for your package."), "\n") # nolint 129 | cat(cli::col_white(" - Add the github repository as a remote of your local package repository."), "\n") # nolint 130 | cat(cli::col_white(" - ", cli::col_red("git push"), " your local repository to github."), "\n") # nolint 131 | } 132 | 133 | #' Check that pkg name inferred from pkg path is same as pkg name in DESCRIPTION 134 | #' 135 | #' @param package_path Package path 136 | #' 137 | #' @returns Package name (character) if validated 138 | #' @noRd 139 | validate_pkg_name <- function(package_path){ 140 | desc_pkg_name <- desc::desc( 141 | file = file.path(package_path, 
'DESCRIPTION') 142 | )$get("Package") 143 | path_pkg_name <- basename(package_path) 144 | if (desc_pkg_name != path_pkg_name){ 145 | err_msg <- paste("Data package name in DESCRIPTION does not match", 146 | "name of the data package directory") 147 | flog.fatal(err_msg, name = "console") 148 | stop(err_msg, call. = FALSE) 149 | } 150 | desc_pkg_name 151 | } 152 | -------------------------------------------------------------------------------- /R/dataversion.R: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | #' Get the DataVersion for a package 5 | #' 6 | #' Retrieves the DataVersion of a package if available 7 | #' @param pkg \code{character} the package name 8 | #' @param lib.loc \code{character} path to library location. 9 | #' @seealso \code{\link[utils]{packageVersion}} 10 | #' @rdname data_version 11 | #' @returns Object of class 'package_version' and 'numeric_version' specifying the DataVersion of the package 12 | #' @export 13 | #' @examples 14 | #' if(rmarkdown::pandoc_available()){ 15 | #' f <- tempdir() 16 | #' f <- file.path(f,"foo.Rmd") 17 | #' con <- file(f) 18 | #' writeLines("```{r}\n vec = 1:10 \n```\n",con=con) 19 | #' close(con) 20 | #' pname <- basename(tempfile()) 21 | #' datapackage_skeleton(name = pname, 22 | #' path=tempdir(), 23 | #' force = TRUE, 24 | #' r_object_names = "vec", 25 | #' code_files = f) 26 | #' 27 | #' package_build(file.path(tempdir(),pname), install = FALSE) 28 | #' 29 | #' pkgload::load_all(file.path(tempdir(),pname)) 30 | #' data_version(pname) 31 | #' } 32 | data_version <- function(pkg, lib.loc = NULL) { 33 | res <- suppressWarnings(utils::packageDescription(pkg, 34 | lib.loc = lib.loc, 35 | fields = "DataVersion" 36 | )) 37 | if (!is.na(res)) { 38 | package_version(res) 39 | } else { 40 | stop(gettextf( 41 | paste0( 42 | "package %s not found ", 43 | "or has no DataVersion string" 44 | ), 45 | sQuote(pkg) 46 | ), 47 | domain = NA 48 | ) 49 | } 50 | } 51 | 52 | 
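# Illustrative sketch only, not part of the DataPackageR source. data_version()
# above returns a 'package_version' object, and base R compares such objects
# component by component. The helper name meets_min_version() is hypothetical
# and is introduced here purely to show how a caller might test for a minimum
# DataVersion.

```r
# Hypothetical helper (an assumption for illustration, not a DataPackageR function):
# returns TRUE when `found` is at least `required`.
meets_min_version <- function(found, required) {
  # Coerce both inputs to "numeric_version" so that "0.10.0" is treated as
  # newer than "0.9.0"; a plain character comparison would get this wrong.
  as.numeric_version(found) >= as.numeric_version(required)
}

meets_min_version("0.10.0", "0.9.0")  # TRUE
meets_min_version("0.1.0", "0.2.0")   # FALSE

# utils::compareVersion() expresses the same ordering as -1, 0 or 1.
utils::compareVersion("0.1.1", "0.1.0")  # 1
```

# In a real session the first argument would typically be the value returned
# by data_version("SomeDataPackage"), where the package name is a placeholder.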
.increment_data_version <- 53 | function(pkg_desc, new_data_digest, which = "patch") { 54 | which_options <- c("major", "minor", "patch") 55 | if (!which %in% which_options) { 56 | stop( 57 | paste0( 58 | "version component to increment ", 59 | "is misspecified in ", 60 | ".increment_data_version, ", 61 | "package DataPackageR" 62 | ) 63 | ) 64 | } 65 | verstring <- validate_DataVersion(pkg_desc$get('DataVersion')) 66 | # convert back into package_version after validation 67 | # to be able to use base R subsetting facilities 68 | verstring <- as.package_version(verstring) 69 | m <- match(which, which_options) 70 | verstring[1, m] <- as.integer(verstring[1, m]) + 1L 71 | verstring <- validate_DataVersion(verstring) 72 | pkg_desc$set('DataVersion', verstring) 73 | new_data_digest[["DataVersion"]] <- verstring 74 | list(pkg_description = pkg_desc, new_data_digest = new_data_digest) 75 | } 76 | 77 | #' Assert that a data version in a data package matches an expectation. 78 | #' 79 | #' @param data_package_name \code{character} Name of the package. 80 | #' @param version_string \code{character} Version string in "x.y.z" format. 81 | #' @param acceptable \code{character} one of "equal", "equal_or_greater", describing what version match is acceptable. 82 | #' @param ... additional arguments passed to data_version (such as lib.loc) 83 | #' @details Tests the DataVersion string in \code{data_package_name} against \code{version_string}, comparing the major, minor and patch components. 84 | #' @return invisible \code{logical} TRUE if success, otherwise stop on mismatch. 85 | #' @details 86 | #' Tests "data_package_name version equal version_string" or "data_package_name version equal_or_greater version_string". 
87 | #' @export 88 | #' @examples 89 | #' if(rmarkdown::pandoc_available()){ 90 | #' f <- tempdir() 91 | #' f <- file.path(f, "foo.Rmd") 92 | #' con <- file(f) 93 | #' writeLines("```{r}\n vec = 1:10 \n```\n",con = con) 94 | #' close(con) 95 | #' pname <- basename(tempfile()) 96 | #' datapackage_skeleton(name = pname, 97 | #' path=tempdir(), 98 | #' force = TRUE, 99 | #' r_object_names = "vec", 100 | #' code_files = f) 101 | #' package_build(file.path(tempdir(),pname), install = FALSE) 102 | #' 103 | #' pkgload::load_all(file.path(tempdir(),pname)) 104 | #' 105 | #' assert_data_version(data_package_name = pname,version_string = "0.1.0",acceptable = "equal") 106 | #' } 107 | assert_data_version <- 108 | function(data_package_name = NULL, 109 | version_string = NULL, 110 | acceptable = "equal", 111 | ...) { 112 | acceptable <- match.arg(acceptable, c("equal", "equal_or_greater")) 113 | pkg_version <- data_version(pkg = data_package_name,...) 114 | required_version <- as.numeric_version(version_string) 115 | base <- 116 | max(10, max( 117 | .find_base(pkg_version), 118 | .find_base(required_version) 119 | )) + 1 120 | if ((acceptable == "equal_or_greater") & 121 | ( 122 | .mk_version_numeric(pkg_version, base = base) >= 123 | .mk_version_numeric(required_version, base = base) 124 | )) { 125 | invisible(TRUE) 126 | } else if ((acceptable == "equal") & 127 | ( 128 | .mk_version_numeric(pkg_version, base = base) == 129 | .mk_version_numeric(required_version, base = base) 130 | )) { 131 | invisible(TRUE) 132 | } else { 133 | stop( 134 | paste0( 135 | "Found ", 136 | data_package_name, 137 | " ", 138 | pkg_version, 139 | " but ", 140 | ifelse(acceptable == "equal", " == ", " >= "), 141 | required_version, 142 | " is required." 
143 | ) 144 | ) 145 | } 146 | } 147 | 148 | .find_base <- function(v) { 149 | max( 150 | as.numeric(v[1, 1]), 151 | as.numeric(v[1, 2]), 152 | as.numeric(v[1, 3]) 153 | ) 154 | } 155 | 156 | .mk_version_numeric <- function(x, base = 10) { 157 | as.numeric(x[1, 1]) * base^2 + 158 | as.numeric(x[1, 2]) * base^1 + 159 | as.numeric(x[1, 3]) * base^ 160 | 0 161 | } 162 | -------------------------------------------------------------------------------- /R/digests.R: -------------------------------------------------------------------------------- 1 | .save_digest <- function(data_digest, path = NULL) { 2 | write.dcf(data_digest, file.path(path, "DATADIGEST")) 3 | } 4 | 5 | #' Check dataversion string 6 | #' 7 | #' @param new_data_digest New data digest list with element named "DataVersion" 8 | #' containing a valid DataVersion 9 | #' @param old_data_digest Old data digest list with element named "DataVersion" 10 | #' containing a valid DataVersion 11 | #' @returns Character, one of "lower", "equal", or "higher", describing the new 12 | #' DataVersion relative to the old DataVersion. 13 | #' @noRd 14 | .check_dataversion_string <- function(new_data_digest, old_data_digest) { 15 | new <- validate_DataVersion(new_data_digest[["DataVersion"]]) 16 | old <- validate_DataVersion(old_data_digest[["DataVersion"]]) 17 | comp <- utils::compareVersion(new, old) 18 | txt <- c(lower = -1L, equal = 0L, higher = 1L) 19 | names(txt[which(txt == comp)]) 20 | } 21 | 22 | .compare_digests <- function(old_digest, new_digest) { 23 | # Returns FALSE when any existing data has changed, new data is added, or 24 | # data is removed, else return TRUE. 
Use .multilog_trace for all changes since 25 | # this is standard behavior during package re-build, and changes are already 26 | # output to the console by .qualify_changes() 27 | 28 | old_digest[['DataVersion']] <- NULL 29 | new_digest[['DataVersion']] <- NULL 30 | old_digest <- unlist(old_digest) 31 | new_digest <- unlist(new_digest) 32 | added <- setdiff(names(new_digest), names(old_digest)) 33 | removed <- setdiff(names(old_digest), names(new_digest)) 34 | common <- intersect(names(old_digest), names(new_digest)) 35 | changed <- common[new_digest[common] != old_digest[common]] 36 | out <- TRUE 37 | for(name in changed){ 38 | .multilog_trace(paste(name, "has changed.")) 39 | out <- FALSE 40 | } 41 | 42 | for(name in removed){ 43 | .multilog_trace(paste(name, "was removed.")) 44 | out <- FALSE 45 | } 46 | 47 | for(name in added){ 48 | .multilog_trace(paste(name, "was added.")) 49 | out <- FALSE 50 | } 51 | 52 | return(out) 53 | } 54 | 55 | .combine_digests <- function(new, old) { 56 | intersection <- intersect(names(new), names(old)) 57 | difference <- setdiff(names(new), names(old)) 58 | rdifference <- setdiff(names(old), names(new)) 59 | combined <- c(new[intersection], old[rdifference], new[difference]) 60 | combined[["DataVersion"]] <- new[["DataVersion"]] 61 | return(combined) 62 | } 63 | 64 | .parse_data_digest <- function(pkg_dir = NULL) { 65 | digest <- NULL 66 | if (file.exists(file.path(pkg_dir, "DATADIGEST"))) { 67 | ret <- read.dcf(file.path(pkg_dir, "DATADIGEST")) 68 | digest <- as.list(as.character(ret)) 69 | names(digest) <- colnames(ret) 70 | } 71 | return(digest) 72 | } 73 | 74 | .digest_data_env <- function(object_names, dataenv, DataVersion) { 75 | new_data_digest <- list() 76 | new_data_digest[["DataVersion"]] <- validate_DataVersion(DataVersion) 77 | data_objects <- lapply(object_names, function(obj) { 78 | digest::digest(dataenv[[obj]]) 79 | }) 80 | names(data_objects) <- object_names 81 | new_data_digest <- c(new_data_digest, data_objects) 
82 | return(new_data_digest) 83 | } 84 | 85 | 86 | # function .rc() prepends '#'' to a string or character vector 87 | .rc <- function(strvec) { 88 | paste("#'", strvec) 89 | } 90 | -------------------------------------------------------------------------------- /R/environments.R: -------------------------------------------------------------------------------- 1 | 2 | #' Read an object created in a previously run processing script. 3 | #' 4 | #' @param name \code{character} the name of the object. Must be a 5 | #' name available in the configuration objects. Other objects are not saved. 6 | #' @details This function is only accessible within an R or Rmd file processed by DataPackageR. 7 | #' It searches for an environment named \code{ENVS} within the current environment, 8 | #' that holds the object with the given \code{name}. Such an environment is constructed and populated 9 | #' with objects specified in the yaml \code{objects} property and passed along 10 | #' to subsequent R and Rmd files as DataPackageR processes them in order. 11 | #' @return An R object. 12 | #' @export 13 | #' @examples 14 | #' \donttest{ 15 | #' if(rmarkdown::pandoc_available()){ 16 | #' ENVS <- new.env() # ENVS would be in the environment 17 | #' # where the data processing is run. It is 18 | #' # handled automatically by the package. 19 | #' assign("find_me", 100, ENVS) #This is done automatically by DataPackageR 20 | #' 21 | #' find_me <- datapackager_object_read("find_me") # This would appear in an Rmd processed by 22 | #' # DataPackageR to access the object named "find_me" created 23 | #' # by a previous script. "find_me" would also need to 24 | #' # appear in the objects property of datapackager.yml 25 | #' } 26 | #' } 27 | datapackager_object_read <- function(name) { 28 | 29 | # get(name, get("ENVS", parent.frame())) 30 | 31 | #when the datapackage is being build, it will only read the object from the environment. 
When 32 | # in interactive mode, it should try to load the objects previously generated. 33 | # It will preferentially read the object from the temporary folder, but if that does not exist, it will read the 34 | # object from the data folder 35 | buildingPackage<-getOption("DataPackageR_packagebuilding",TRUE) 36 | 37 | 38 | object<-try(get(name, get("ENVS", parent.frame(),inherits=FALSE),inherits=FALSE),silent=TRUE) 39 | 40 | if( !buildingPackage && inherits(object,"try-error")){ 41 | #if the package is not being built and the object is not found in the "ENVS" environment 42 | temp_folder_path<-file.path(tempdir(),yml_find(usethis::proj_get())[["configuration"]][["render_root"]]$tmp) 43 | 44 | if(file.exists(objectPath<-file.path(temp_folder_path,paste0(name,".rds")))){ 45 | if (getOption('DataPackageR_verbose', TRUE)){ 46 | message('loading ',name,' from temporary folder from previous build attempt.') 47 | } 48 | object<-readRDS(objectPath) 49 | 50 | }else if(file.exists(objectPath<-file.path(project_data_path(),paste0(name,".rda")))){ 51 | if (getOption('DataPackageR_verbose', TRUE)){ 52 | message('loading ',name,' from data directory.') 53 | print(objectPath) 54 | } 55 | load_env<-new.env() 56 | load(objectPath,envir = load_env) 57 | object<-load_env[[ls(load_env)[1]]] 58 | 59 | }else{ 60 | stop(paste(name,'not found!')) 61 | 62 | } 63 | }else if(inherits(object,"try-error")){ 64 | #if the package is being built and the object is not found in the "ENVS" environment, 65 | # pass on the original error message 66 | stop(object[1]) 67 | 68 | } 69 | 70 | return(object) 71 | } 72 | -------------------------------------------------------------------------------- /R/ignore.R: -------------------------------------------------------------------------------- 1 | #' Ignore specific files by git and R build. 2 | #' 3 | #' @param file \code{character} File to ignore. 4 | #' @param path \code{character} Path to the file. 5 | #' 6 | #' @return invisibly returns 0. 
7 | #' @export 8 | #' 9 | #' @examples 10 | #' datapackage_skeleton(name="test",path = tempdir()) 11 | #' use_ignore("foo", ".") 12 | use_ignore <- function(file = NULL, path = NULL){ 13 | if (is.null(file)) { 14 | message("No file name provided to ignore.") 15 | return(invisible(0)) 16 | } 17 | usethis::use_build_ignore(files = file.path(path,basename(file)), escape = TRUE) 18 | usethis::use_git_ignore(ignores = basename(file), directory = path) 19 | invisible(0) 20 | } -------------------------------------------------------------------------------- /R/load_save.R: -------------------------------------------------------------------------------- 1 | 2 | .save_data <- function(new_data_digest, 3 | DataVersion, 4 | object_names, 5 | dataenv, 6 | old_data_digest = NULL, 7 | masterfile = NULL, 8 | pkg_path = NULL) { 9 | DataVersion <- validate_DataVersion(DataVersion) 10 | .save_digest(new_data_digest, path = pkg_path) 11 | .multilog_trace("Saving to data") 12 | # TODO get the names of each data object and save them separately. Provide a 13 | # function to load all. 
14 | for (i in seq_along(object_names)) { 15 | obj <- object_names[i] 16 | data_save_rda_path <- file.path(pkg_path, "data", paste0(obj, ".rda")) 17 | save(list = obj, file = data_save_rda_path, envir = dataenv) 18 | } 19 | # Update description file 20 | to_update <- desc::desc(file = file.path(pkg_path, "DESCRIPTION")) 21 | to_update$set("DataVersion", DataVersion) 22 | to_update$set("Date", format(Sys.time(), "%Y-%m-%d")) 23 | to_update$write() 24 | } 25 | -------------------------------------------------------------------------------- /R/logger.R: -------------------------------------------------------------------------------- 1 | .multilog_info <- function(msg) { 2 | flog.info(msg, name = "console") 3 | flog.info(msg, name = "logfile") 4 | } 5 | .multilog_trace <- function(msg) { 6 | flog.trace(msg, name = "console") 7 | flog.trace(msg, name = "logfile") 8 | } 9 | .multilog_warn <- function(msg) { 10 | flog.warn(msg, name = "console") 11 | flog.warn(msg, name = "logfile") 12 | } 13 | .multilog_debug <- function(msg) { 14 | flog.debug(msg, name = "console") 15 | flog.debug(msg, name = "logfile") 16 | } 17 | .multilog_fatal <- function(msg) { 18 | flog.fatal(msg, name = "console") 19 | flog.fatal(msg, name = "logfile") 20 | } 21 | .multilog_error <- function(msg) { 22 | flog.error(msg, name = "console") 23 | flog.error(msg, name = "logfile") 24 | } 25 | 26 | .multilog_thresold <- function(console = INFO, logfile = TRACE) { 27 | flog.threshold(console, name = "console") 28 | flog.threshold(logfile, name = "logfile") 29 | } 30 | 31 | select_console_appender <- function(){ 32 | if (getOption('DataPackageR_verbose', TRUE)){ 33 | appender.console() 34 | } else { 35 | # quiet console appender 36 | function(line) { } 37 | } 38 | } 39 | 40 | .multilog_setup <- function(LOGFILE = NULL) { 41 | if (!is.null(LOGFILE)) { 42 | if (file.exists(LOGFILE)){ 43 | # initial newline to separate from previous run log entries 44 | cat("\n", file = LOGFILE, append = TRUE) 45 | } 46 | 
flog.logger( 47 | name = "logfile", 48 | appender = appender.file(LOGFILE), 49 | threshold = TRACE 50 | ) 51 | } 52 | flog.logger( 53 | name = "console", 54 | appender = select_console_appender(), 55 | threshold = INFO 56 | ) 57 | } 58 | -------------------------------------------------------------------------------- /R/mergeDocumentation.R: -------------------------------------------------------------------------------- 1 | .doc_merge <- function(old, new) { 2 | merged <- list() 3 | oldnames <- names(old) 4 | newnames <- names(new) 5 | for (i in oldnames) { 6 | if (i %in% newnames) { 7 | merged[[i]] <- new[[i]] 8 | } else { 9 | merged[[i]] <- old[[i]] 10 | } 11 | } 12 | for (i in newnames) { 13 | if (!i %in% oldnames) { 14 | merged[[i]] <- new[[i]] 15 | } 16 | } 17 | return(merged) 18 | } 19 | -------------------------------------------------------------------------------- /R/parseDocumentation.R: -------------------------------------------------------------------------------- 1 | .doc_parse <- function(all_r_files) { 2 | lst <- lapply(all_r_files, 3 | function(file){ 4 | file_lines <- readLines(file) 5 | roxy_blocks <- roxygen2::parse_file(file, env = NULL) 6 | roxy_block_list <- lapply(roxy_blocks, function(roxy_block){ 7 | # length(deparse(...) - 1 is to keep call intact if it's > 1 line 8 | # but 1 line only is the typical case (e.g. 
NULL, '_PACKAGE') 9 | end_line <- roxy_block$line + length(deparse(roxy_block$call)) - 1L 10 | roxy_lines <- vapply(roxy_block$tag, 11 | function(tag) as.integer(tag$line), 1L) 12 | roxy_line_seq <- seq(from = min(roxy_lines), to = end_line) 13 | out_lines <- file_lines[roxy_line_seq] 14 | # 2 line breaks between roxygen2 sections in final file write 15 | c(out_lines, rep("", 2)) 16 | }) 17 | names(roxy_block_list) <- vapply(roxy_blocks, 18 | roxygen2::block_get_tag_value, 19 | "", tag = 'name') 20 | roxy_block_list 21 | } 22 | ) 23 | Reduce(c, lst) 24 | } 25 | -------------------------------------------------------------------------------- /R/prompt.R: -------------------------------------------------------------------------------- 1 | .prompt_user_for_change_description <- 2 | function(interact = getOption( 3 | "DataPackageR_interact", 4 | interactive() 5 | )) { 6 | if (interactive()&interact) { 7 | cat(cli::col_cyan("Enter a text description of the changes for the NEWS.md file.\n")) #nocov 8 | }else{ 9 | if (getOption('DataPackageR_verbose', TRUE)){ 10 | cat(cli::col_cyan("Non-interactive NEWS.md file update.\n")) 11 | } 12 | } 13 | change_description <- 14 | ifelse( 15 | interactive() & interact, 16 | readline(prompt = "+ "), 17 | "Package built in non-interactive mode" 18 | ) 19 | return(change_description) 20 | } 21 | 22 | .update_news_md <- function(version = "Version Not Provided", 23 | interact = getOption( 24 | "DataPackageR_interact", 25 | interactive() 26 | )) { 27 | news_file <- .newsfile() 28 | change_description <- 29 | .prompt_user_for_change_description(interact = interact) 30 | news_con <- file(news_file, open = "r+") 31 | news_file_data <- readLines(news_con) 32 | writeLines( 33 | text = paste0("DataVersion: ", version), 34 | con = news_con, 35 | sep = "\n" 36 | ) 37 | writeLines("=======================", 38 | con = news_con, 39 | sep = "\n" 40 | ) 41 | writeLines(c(change_description, ""), 42 | con = news_con, 43 | sep = "\n" 44 | ) 45 | 
writeLines(news_file_data, 46 | con = news_con, 47 | sep = "\n" 48 | ) 49 | flush(news_con) 50 | close(news_con) 51 | } 52 | 53 | .newsfile <- function() { 54 | newsfile <- file.path(usethis::proj_get(), "NEWS.md") 55 | if (!file.exists(newsfile)) { 56 | .multilog_trace("NEWS.md file not found, creating!") 57 | file.create(newsfile) 58 | } 59 | return(newsfile) 60 | } 61 | 62 | .update_news_changed_objects <- function(objectlist) { 63 | news_file <- .newsfile() 64 | news_con <- file(news_file, open = "r+") 65 | news_file_data <- readLines(news_con) 66 | header_1 <- grep("DataVersion", news_file_data)[1] 67 | # header_2 <- grep("DataVersion", news_file_data)[2] 68 | ul_1 <- grep("=====", news_file_data)[1] 69 | # ul_2 <- grep("=====", news_file_data)[2] 70 | stopifnot(header_1 == ul_1 - 1) 71 | # stopifnot(header_2 == ul_2 - 1) 72 | header <- news_file_data[header_1:ul_1] 73 | news_file_data <- news_file_data[-c(header_1:ul_1)] 74 | #write header 75 | writeLines( 76 | text = header, 77 | con = news_con, 78 | sep = "\n" 79 | ) 80 | #write changes 81 | added <- objectlist[["added"]] 82 | deleted <- objectlist[["deleted"]] 83 | changed <- objectlist[["changed"]] 84 | 85 | .write_changes <- function(string, news_con, what = NULL) { 86 | if (length(string) != 0) { 87 | if (getOption('DataPackageR_verbose', TRUE)){ 88 | cat(cli::col_cyan(paste0("* ",what,": ",string,"\n")), sep = "") 89 | } 90 | writeLines(text = paste0("* ",what,": ", string), 91 | con = news_con, 92 | sep = "\n") 93 | } 94 | } 95 | .write_changes(added, news_con, "Added") 96 | .write_changes(deleted, news_con, "Deleted") 97 | .write_changes(changed, news_con, "Changed") 98 | 99 | #write the rest of the data 100 | writeLines(news_file_data, 101 | con = news_con, 102 | sep = "\n" 103 | ) 104 | flush(news_con) 105 | close(news_con) 106 | } 107 | -------------------------------------------------------------------------------- /R/qualify_changes.R: 
--------------------------------------------------------------------------------
.qualify_changes <- function(new, old) {
  # don't need DataVersion here
  new[["DataVersion"]] <- NULL
  old[["DataVersion"]] <- NULL
  new <- unlist(new)
  old <- unlist(old)
  added <- setdiff(names(new), names(old))
  deleted <- setdiff(names(old), names(new))
  common <- intersect(names(new), names(old))
  # test for equality
  changed <- common[new[common] != old[common]]
  list(added = added,
       deleted = deleted,
       changed = changed)
}

--------------------------------------------------------------------------------
/R/rmarkdown_functions.R:
--------------------------------------------------------------------------------
read_file <- function(path) {
  n <- file.info(path)$size
  readChar(path, n, TRUE)
}

--------------------------------------------------------------------------------
/R/skeleton.R:
--------------------------------------------------------------------------------
#' @importFrom usethis create_package
.codefile_validate <- function(code_files) {
  # do they exist?
  if (! all(file.exists(code_files))){
    stop("code_files do not all exist!")
  }
  # are they .R or .Rmd files?
  if (! all(grepl(".*\\.r$", tolower(code_files)) |
            grepl(".*\\.rmd$", tolower(code_files)))){
    stop("code files are not Rmd or R files!")
  }
}

#' Create a Data Package skeleton for use with DataPackageR.
#'
#' Creates a package skeleton directory structure for use with DataPackageR.
#' Adds the DataVersion string to DESCRIPTION, creates the DATADIGEST file, and the data-raw directory.
#' Updates the Read-and-delete-me file to reflect the additional necessary steps.
#' @name datapackage_skeleton
#' @param name \code{character} name of the package to create.
#' @rdname datapackage_skeleton
#' @param path A \code{character} path where the package is located.
See \code{\link[utils]{package.skeleton}} 23 | #' @param force \code{logical} Force the package skeleton to be recreated even if it exists. see \code{\link[utils]{package.skeleton}} 24 | #' @param code_files Optional \code{character} vector of paths to Rmd files that process raw data 25 | #' into R objects. 26 | #' @param r_object_names \code{vector} of quoted r object names , tables, etc. created when the files in \code{code_files} are run. 27 | #' @param raw_data_dir \code{character} pointing to a raw data directory. Will be moved with all its subdirectories to "inst/extdata" 28 | #' @param dependencies \code{vector} of \code{character}, paths to R files that will be moved to "data-raw" but not included in the yaml config file. e.g., dependency scripts. 29 | #' @returns No return value, called for side effects 30 | #' @examples 31 | #' if(rmarkdown::pandoc_available()){ 32 | #' f <- tempdir() 33 | #' f <- file.path(f,"foo.Rmd") 34 | #' con <- file(f) 35 | #' writeLines("```{r}\n tbl = data.frame(1:10) \n```\n",con=con) 36 | #' close(con) 37 | #' pname <- basename(tempfile()) 38 | #' datapackage_skeleton(name = pname, 39 | #' path = tempdir(), 40 | #' force = TRUE, 41 | #' r_object_names = "tbl", 42 | #' code_files = f) 43 | #' } 44 | #' @export 45 | datapackage_skeleton <- 46 | function(name = NULL, 47 | path = ".", 48 | force = FALSE, 49 | code_files = character(), 50 | r_object_names = character(), 51 | raw_data_dir = character(), 52 | dependencies = character()) { 53 | if (! getOption('DataPackageR_verbose', TRUE)){ 54 | old_usethis_quiet <- getOption('usethis.quiet') 55 | on.exit(options(usethis.quiet = old_usethis_quiet)) 56 | options(usethis.quiet = TRUE) 57 | } 58 | if (is.null(name)) { 59 | stop("Must supply a package name", call. = FALSE) 60 | } 61 | # if (length(r_object_names) == 0) { 62 | # stop("You must specify r_object_names", call. = FALSE) 63 | # } 64 | # if (length(code_files) == 0) { 65 | # stop("You must specify code_files", call. 
= FALSE) 66 | # } 67 | if (force) { 68 | unlink(file.path(path, name), recursive = TRUE, force = TRUE) 69 | } 70 | package_path <- usethis::create_package( 71 | path = file.path(path, name), 72 | # fields override for usethis 3.0.0 ORCID placeholder, errors out in R 4.5 73 | # https://github.com/r-lib/usethis/issues/2059 74 | fields = list( 75 | `Authors@R` = paste0( 76 | "person(\"First\", \"Last\", email = \"first.last", 77 | "@example.com\", role = c(\"aut\", \"cre\"))" 78 | ) 79 | ), rstudio = FALSE, open = FALSE 80 | ) 81 | # compatibility between usethis 1.4 and 1.5. 82 | if(is.character(package_path)){ 83 | usethis::proj_set(package_path) 84 | }else{ 85 | # create the rest of the necessary elements in the package 86 | package_path <- file.path(path, name) 87 | } 88 | description <- 89 | desc::desc(file = file.path(package_path, "DESCRIPTION")) 90 | description$set("DataVersion" = "0.1.0") 91 | description$set("Version" = "1.0") 92 | description$set("Package" = name) 93 | description$set_dep("R", "Depends", ">= 3.5.0") 94 | description$set("Roxygen" = "list(markdown = TRUE)") 95 | description$write() 96 | .done(paste0("Added DataVersion string to ", cli::col_blue("'DESCRIPTION'"))) 97 | 98 | usethis::use_directory("data-raw") 99 | usethis::use_directory("data") 100 | usethis::use_directory("inst/extdata") 101 | # .done("Created data and data-raw directories") 102 | 103 | con <- 104 | file(file.path(package_path, "Read-and-delete-me"), open = "w") 105 | writeLines( 106 | c( 107 | "Edit the DESCRIPTION file to reflect", 108 | "the contents of your package.", 109 | "Optionally put your raw data under", 110 | "'inst/extdata/'. If the datasets are large,", 111 | "they may reside elsewhere outside the package", 112 | "source tree. If you passed R and Rmd files to", 113 | "datapackage_skeleton(), they should now appear in 'data-raw'.", 114 | "When you call package_build(), your datasets will", 115 | "be automatically documented. 
Edit datapackager.yml to", 116 | "add additional files / data objects to the package.", 117 | "After building, you should edit data-raw/documentation.R", 118 | "to fill in dataset documentation details and rebuild.", 119 | "", 120 | "NOTES", 121 | "If your code relies on other packages,", 122 | "add those to the @import tag of the roxygen markup.", 123 | "The R object names you wish to make available", 124 | "(and document) in the package must match", 125 | "the roxygen @name tags, must be listed", 126 | "in the yml file, and must not have the same name", 127 | "as the name of your data package." 128 | ), 129 | con 130 | ) 131 | close(con) 132 | 133 | 134 | # Rather than copy, read in, modify (as needed), and write. 135 | # process the string 136 | .copy_files_to_data_raw <- function(x, obj = c("code", "dependencies")) { 137 | if (length(x) != 0) { 138 | .codefile_validate(x) 139 | # copy them over 140 | obj <- match.arg(obj, c("code", "dependencies")) 141 | for (y in x) { 142 | file.copy(y, file.path(package_path, "data-raw"), overwrite = TRUE) 143 | .done(paste0("Copied ", basename(y), 144 | " into ", cli::col_blue("'data-raw'"))) 145 | } 146 | } 147 | } 148 | 149 | .copy_data_to_inst_extdata <- function(x) { 150 | if (length(x) != 0) { 151 | # copy them over 152 | file.copy(x, file.path(package_path, "inst/extdata"), 153 | recursive = TRUE, overwrite = TRUE 154 | ) 155 | .done(paste0("Moved data into ", cli::col_blue("'inst/extdata'"))) 156 | } 157 | } 158 | .copy_files_to_data_raw(code_files, obj = "code") 159 | .copy_files_to_data_raw(dependencies, obj = "dependencies") 160 | .copy_data_to_inst_extdata(raw_data_dir) 161 | 162 | yml <- construct_yml_config(code = code_files, data = r_object_names) 163 | yaml::write_yaml(yml, file = file.path(package_path, "datapackager.yml")) 164 | .done(paste0("configured ", cli::col_blue("'datapackager.yml'"), " file")) 165 | 166 | 167 | oldrdfiles <- 168 | list.files( 169 | path = file.path(package_path, "man"), 170 | 
pattern = "Rd", 171 | full.names = TRUE 172 | ) 173 | file.remove(file.path(package_path, "NAMESPACE")) 174 | oldrdafiles <- 175 | list.files( 176 | path = file.path(package_path, "data"), 177 | pattern = "rda", 178 | full.names = TRUE 179 | ) 180 | oldrfiles <- 181 | list.files( 182 | path = file.path(package_path, "R"), 183 | pattern = "R", 184 | full.names = TRUE 185 | ) 186 | file.remove(oldrdafiles) 187 | file.remove(oldrfiles) 188 | file.remove(oldrdfiles) 189 | invisible(NULL) 190 | } 191 | 192 | .done <- function(...) { 193 | .bullet(paste0(...), bullet = cli::col_green("\u2714")) 194 | } 195 | 196 | .bullet <- function(lines, bullet) { 197 | lines <- paste0(bullet, " ", lines) 198 | .cat_line(lines) 199 | } 200 | 201 | .cat_line <- function(...) { 202 | if (getOption('DataPackageR_verbose', TRUE)) cat(..., "\n", sep = "") 203 | } 204 | -------------------------------------------------------------------------------- /R/use.R: -------------------------------------------------------------------------------- 1 | #' Add a raw data set to inst/extdata 2 | #' 3 | #' The file or directory specified by \code{path} will be moved into 4 | #' the inst/extdata directory. 5 | #' 6 | #' @param path \code{character} path to file or directory. 7 | #' @param ignore \code{logical} whether to ignore the path or file in git and R build. 8 | #' 9 | #' @return invisibly returns TRUE for success. Stops on failure. 
10 | #' @importFrom usethis proj_get proj_set create_package use_data_raw 11 | #' @export 12 | #' 13 | #' @examples 14 | #' if(rmarkdown::pandoc_available()){ 15 | #' myfile <- tempfile() 16 | #' file <- system.file("extdata", "tests", "extra.Rmd", 17 | #' package = "DataPackageR") 18 | #' raw_data <- system.file("extdata", "tests", "raw_data", 19 | #' package = "DataPackageR") 20 | #' datapackage_skeleton( 21 | #' name = "datatest", 22 | #' path = tempdir(), 23 | #' code_files = file, 24 | #' force = TRUE, 25 | #' r_object_names = "data") 26 | #' use_raw_dataset(raw_data) 27 | #' } 28 | use_raw_dataset <- function(path = NULL, ignore = FALSE) { 29 | if (is.null(path)) { 30 | stop("You must provide a full path to a file or directory.") 31 | } 32 | proj_path <- usethis::proj_get() 33 | if (!utils::file_test("-d", file.path(proj_path, "inst", "extdata"))) { 34 | stop(paste0("inst/extdata doesn't exist in ", proj_path), call. = FALSE) 35 | } 36 | raw_file <- normalizePath(path) 37 | if (utils::file_test("-f", raw_file)) { 38 | file.copy( 39 | from = raw_file, 40 | to = file.path(proj_path, "inst", "extdata"), 41 | overwrite = TRUE 42 | ) 43 | if (ignore) { 44 | # inst/extdata is a path relative to the project root 45 | # as needed by git_ignore 46 | use_ignore(basename(raw_file), path = file.path("inst", "extdata")) 47 | } 48 | return(invisible(TRUE)) 49 | } else if (utils::file_test("-d", raw_file)) { 50 | file.copy( 51 | from = raw_file, 52 | to = file.path(proj_path, "inst", "extdata"), 53 | recursive = TRUE, overwrite = TRUE 54 | ) 55 | if (ignore) { 56 | #should work and the directory should be ignored 57 | use_ignore(basename(raw_file), path = file.path("inst", "extdata")) 58 | } 59 | return(invisible(TRUE)) 60 | } else { 61 | stop("path must be a path to an existing file or directory.") 62 | } 63 | } 64 | 65 | 66 | #' Add a processing script to a data package. 
#'
#' The Rmd or R file specified by \code{file} will be copied into
#' the data-raw directory. It will also be added to the yml configuration file.
#' Any existing file by that name will be overwritten when \code{overwrite} is set to TRUE.
#'
#' @param file \code{character} path to an existing file or name of a new R or Rmd file to create.
#' @param title \code{character} title of the processing script for the yaml header. Used only if the file is being created.
#' @param author \code{character} author name for the yaml header. Used only if the file is being created.
#' @param overwrite \code{logical} default FALSE. Overwrite existing file of the same name.
#'
#' @return invisibly returns TRUE for success. Stops on failure.
#' @importFrom usethis proj_get proj_set create_package use_data_raw
#' @export
#'
#' @examples
#' if(rmarkdown::pandoc_available()){
#' myfile <- tempfile()
#' file <- system.file("extdata", "tests", "extra.Rmd",
#'                      package = "DataPackageR")
#' datapackage_skeleton(
#'   name = "datatest",
#'   path = tempdir(),
#'   code_files = file,
#'   force = TRUE,
#'   r_object_names = "data")
#' use_processing_script(file = "newScript.Rmd",
#'     title = "Processing a new dataset",
#'     author = "Y.N. Here.")
#' }
use_processing_script <- function(file = NULL, title = NULL, author = NULL, overwrite = FALSE) {
  if (is.null(file)) {
    stop("You must provide a full path to a file.")
  }
  proj_path <- usethis::proj_get()
  if (!utils::file_test("-d", file.path(proj_path, "data-raw"))) {
    stop(paste0("data-raw doesn't exist in ", proj_path), call. 
= FALSE) 103 | } 104 | #check if the given file or directory already exists 105 | if (utils::file_test("-f",file.path(proj_path,"data-raw",file))|utils::file_test("-d",file.path(proj_path,"data-raw",file))) { #nolint 106 | if (overwrite) { 107 | .bullet(paste0("Courtesy warning: ", basename(file), " exists in ",cli::col_blue("'data-raw'"),", and ",cli::col_red("WILL")," be overwritten."),bullet = cli::col_red("\u2622")) #nolint 108 | } else { 109 | .bullet(paste0("Courtesy warning: ", basename(file), " exists in ",cli::col_blue("'data-raw'"),", and ",cli::col_red("WILL NOT")," be overwritten."),bullet = cli::col_red("\u2622")) #nolint 110 | } 111 | } 112 | raw_file <- suppressWarnings(normalizePath(file)) 113 | 114 | if (utils::file_test("-f", raw_file)) { 115 | # test if it's an R or Rmd file. 116 | if (!(grepl("\\.rmd$", tolower(raw_file)) | 117 | grepl("\\.r$", tolower(raw_file)))) { 118 | stop("file must be an .R or .Rmd.") 119 | } 120 | file.copy( 121 | from = raw_file, 122 | to = file.path(proj_path, "data-raw"), 123 | overwrite = overwrite 124 | ) 125 | # add it to the yaml 126 | yml <- yml_find(path = proj_path) 127 | yml <- yml_add_files(yml, basename(raw_file)) 128 | yml_write(yml) 129 | 130 | invisible(TRUE) 131 | } else if (utils::file_test("-d", raw_file)) { 132 | stop("path argument must be a path to a file, not a directory.") 133 | } else if ((!grepl("/", raw_file) & 134 | !grepl("^\\.", raw_file)) & 135 | (grepl("\\.r$", tolower(raw_file)) | 136 | grepl("\\.rmd$", tolower(raw_file)))) { 137 | # we have a valid file name and should create it. 
138 | if (file.exists(file.path(proj_path, "data-raw", basename(raw_file))) && 139 | !overwrite) { 140 | .bullet(paste0("Skipping file creation: pass overwrite = TRUE to use_processing_script()"), bullet = cli::col_red("\u2622")) #nolint 141 | } else { 142 | if (getOption('DataPackageR_verbose', TRUE)){ 143 | cat("Attempting to create ", raw_file) 144 | } 145 | file.create(file.path(proj_path, "data-raw", basename(raw_file))) 146 | .update_header(file.path(proj_path, 147 | "data-raw", 148 | basename(raw_file)), 149 | title = title, 150 | author = author) 151 | } 152 | # add it to the yaml. 153 | yml <- yml_find(path = proj_path) 154 | yml <- yml_add_files(yml, basename(raw_file)) 155 | yml_write(yml) 156 | 157 | invisible(TRUE) 158 | } else { 159 | stop("path argument must be a path to an existing file or a new file name, cannot begin with a dot '.' and must end in R or Rmd (case insensitive).") # nolint 160 | } 161 | } 162 | 163 | 164 | 165 | #' Add a data object to a data package. 166 | #' 167 | #' The data object will be added to the yml configuration file. 168 | #' @param object_name Name of the data object. Should be created by a processing script in data-raw. \code{character} vector of length 1. 169 | #' 170 | #' @return invisibly returns TRUE for success. 
#' @export
#'
#' @examples
#' if(rmarkdown::pandoc_available()){
#' myfile <- tempfile()
#' file <- system.file("extdata", "tests", "extra.Rmd",
#'                      package = "DataPackageR")
#' datapackage_skeleton(
#'   name = "datatest",
#'   path = tempdir(),
#'   code_files = file,
#'   force = TRUE,
#'   r_object_names = "data")
#' use_data_object(object_name = "newobject")
#' }
#'
use_data_object <- function(object_name = NULL) {
  if (is.null(object_name)) {
    # object_name is NULL here, so name the argument literally in the message
    stop("object_name cannot be NULL.")
  } else if (!is.character(object_name) || length(object_name) != 1) {
    stop("object_name must be a character vector of length 1.")
  } else {
    proj_path <- usethis::proj_get()
    yml <- yml_find(path = proj_path)
    yml <- yml_add_objects(yml, objects = object_name)
    yml_write(yml)
    invisible(TRUE)
  }
}


.update_header <- function(file = NULL,
                           title = NULL,
                           author = NULL) {
  file_contents <- readLines(file)
  if (grepl("\\.r$", tolower(file))) {
    # get the front matter as comments.
208 | partitioned_file <- .partition_r_front_matter(file_contents) 209 | } else { 210 | partitioned_file <- .partition_rmd_front_matter(file_contents) 211 | } 212 | if (!is.null(partitioned_file$front_matter)) { 213 | front_matter <- .parse_yaml_front_matter( 214 | gsub("#'\\s*", "", partitioned_file$front_matter)) 215 | } else { 216 | front_matter <- list() 217 | } 218 | { 219 | if (!is.null(title)) { 220 | front_matter$title <- title 221 | } 222 | if (!is.null(author)) { 223 | front_matter$author <- author 224 | } 225 | 226 | front_matter <- yaml::as.yaml(front_matter) 227 | front_matter <- 228 | ifelse( 229 | grepl("\\.r$", tolower(file)), 230 | gsub( 231 | "#' $", "", 232 | gsub( 233 | "\n", "\n#' ", 234 | paste0("#' ", front_matter) 235 | ) 236 | ), 237 | front_matter 238 | ) 239 | 240 | # open the file for writing. 241 | connection <- file(file, open = "w+") 242 | # write the header 243 | writeLines(ifelse(grepl("\\.r$", tolower(file)), 244 | "#' ---", "---"), con = connection) 245 | writeLines(front_matter, con = connection, sep = "") 246 | writeLines(ifelse(grepl("\\.r$", tolower(file)), 247 | "#' ---", "---"), con = connection) 248 | # write the body 249 | if (!is.null(partitioned_file$body)) { 250 | writeLines(partitioned_file$body, con = connection) 251 | } 252 | # close the file 253 | close(connection) 254 | } 255 | } 256 | 257 | .partition_r_front_matter <- function(input_lines) { 258 | validate_front_matter <- function(delimiters) { 259 | if (length(delimiters) >= 2 && (delimiters[2] - delimiters[1] > 260 | 1) && grepl("^#'\\s*---\\s*$", input_lines[delimiters[1]])) { 261 | if (delimiters[1] == 1) { 262 | TRUE 263 | } else { 264 | .is_blank(input_lines[1:delimiters[1] - 1]) 265 | } 266 | } 267 | else { 268 | FALSE 269 | } 270 | } 271 | delimiters <- grep("^(#'\\s*---|\\.\\.\\.)\\s*$", input_lines) 272 | if (validate_front_matter(delimiters)) { 273 | front_matter <- input_lines[(delimiters[1]):(delimiters[2])] 274 | input_body <- c() 275 | if 
(delimiters[1] > 1) { 276 | input_body <- c(input_body, input_lines[1:delimiters[1] - 277 | 1]) 278 | } 279 | if (delimiters[2] < length(input_lines)) { 280 | input_body <- c(input_body, input_lines[-(1:delimiters[2])]) 281 | } 282 | list(front_matter = front_matter, body = input_body) 283 | } 284 | else { 285 | list(front_matter = NULL, body = input_lines) 286 | } 287 | } 288 | 289 | 290 | .partition_rmd_front_matter <- function(input_lines) { 291 | validate_front_matter <- function(delimiters) { 292 | if (length(delimiters) >= 2 && (delimiters[2] - delimiters[1] > 293 | 1) && grepl("^---\\s*$", input_lines[delimiters[1]])) { 294 | if (delimiters[1] == 1) { 295 | TRUE 296 | } else { 297 | .is_blank(input_lines[1:delimiters[1] - 1]) 298 | } 299 | } 300 | else { 301 | FALSE 302 | } 303 | } 304 | delimiters <- grep("^(---|\\.\\.\\.)\\s*$", input_lines) 305 | if (validate_front_matter(delimiters)) { 306 | front_matter <- input_lines[(delimiters[1]):(delimiters[2])] 307 | input_body <- c() 308 | if (delimiters[1] > 1) { 309 | input_body <- c(input_body, input_lines[1:delimiters[1] - 310 | 1]) 311 | } 312 | if (delimiters[2] < length(input_lines)) { 313 | input_body <- c(input_body, input_lines[-(1:delimiters[2])]) 314 | } 315 | list(front_matter = front_matter, body = input_body) 316 | } 317 | else { 318 | list(front_matter = NULL, body = input_lines) 319 | } 320 | } 321 | 322 | 323 | .parse_yaml_front_matter <- function(front_matter) { 324 | if (length(front_matter) > 2) { 325 | front_matter <- front_matter[2:(length(front_matter) - 326 | 1)] 327 | front_matter <- paste(front_matter, collapse = "\n") 328 | .validate_front_matter(front_matter) 329 | parsed_yaml <- .yaml_load_utf8(front_matter) 330 | if (is.list(parsed_yaml)) { 331 | parsed_yaml 332 | } else { 333 | list() 334 | } 335 | } 336 | else { 337 | list() 338 | } 339 | } 340 | 341 | .validate_front_matter <- function(front_matter) { 342 | front_matter <- .trim_trailing_ws(front_matter) 343 | if (grepl(":$", 
front_matter)) { 344 | stop("Invalid YAML front matter (ends with ':')", call. = FALSE) 345 | } 346 | } 347 | 348 | .trim_trailing_ws <- function(x) { 349 | sub("\\s+$", "", x) 350 | } 351 | 352 | .yaml_load_utf8 <- function(string, ...) { 353 | string <- paste(string, collapse = "\n") 354 | if (utils::packageVersion("yaml") >= "2.1.14") { 355 | yaml::yaml.load(string, ...) 356 | } 357 | else { 358 | .mark_utf8(yaml::yaml.load(enc2utf8(string), ...)) #nocov 359 | } 360 | } 361 | 362 | .mark_utf8 <- function(x) { 363 | if (is.character(x)) { 364 | Encoding(x) <- "UTF-8" 365 | return(x) 366 | } 367 | if (!is.list(x)) { 368 | return(x) 369 | } 370 | attrs <- attributes(x) 371 | res <- lapply(x, .mark_utf8) 372 | attributes(res) <- attrs 373 | names(res) <- .mark_utf8(names(res)) 374 | res 375 | } 376 | 377 | .is_blank <- function(x) { 378 | if (length(x)) { 379 | all(grepl("^\\s*$", x)) 380 | } else { 381 | TRUE 382 | } 383 | } 384 | -------------------------------------------------------------------------------- /R/yamlR.R: -------------------------------------------------------------------------------- 1 | #' Edit DataPackageR yaml configuration 2 | #' 3 | #' @rdname yaml 4 | #' @param path Path to the data package source or path to write config file (for \code{yml_write}) 5 | #' @return A yaml configuration structured as an R nested list. 6 | #' @description Edit a yaml configuration file via an API. 7 | #' @details Add, remove files and objects, enable or disable parsing of specific files, list objects or files in a yaml config, or write a config back to a package. 
8 | #' @importFrom yaml yaml.load_file as.yaml write_yaml 9 | #' @export 10 | #' 11 | #' @examples 12 | #' if(rmarkdown::pandoc_available()){ 13 | #' f <- tempdir() 14 | #' f <- file.path(f,"foo.Rmd") 15 | #' con <- file(f) 16 | #' writeLines("```{r}\n vec = 1:10\n```\n",con=con) 17 | #' close(con) 18 | #' pname <- basename(tempfile()) 19 | #' datapackage_skeleton(name=pname, 20 | #' path = tempdir(), 21 | #' force = TRUE, 22 | #' r_object_names = "vec", 23 | #' code_files = f) 24 | #' yml <- yml_find(file.path(tempdir(),pname)) 25 | #' yml <- yml_add_files(yml,"foo.Rmd") 26 | #' yml_list_files(yml) 27 | #' yml <- yml_disable_compile(yml,"foo.Rmd") 28 | #' yml <- yml_enable_compile(yml,"foo.Rmd") 29 | #' yml <- yml_add_objects(yml,"data1") 30 | #' yml_list_objects(yml) 31 | #' yml <- yml_remove_objects(yml,"data1") 32 | #' yml <- yml_remove_files(yml,"foo.Rmd") 33 | #' } 34 | yml_find <- function(path) { 35 | path <- normalizePath(path, winslash = "/") 36 | config_yml <- is_r_package$find_file("datapackager.yml", path = path) 37 | if (!file.exists(config_yml)) { 38 | stop("Can't find a datapackager.yml config at ", 39 | dirname(config_yml), 40 | call. = FALSE 41 | ) 42 | } 43 | config <- yaml::yaml.load_file(config_yml) 44 | attr(config, "path") <- config_yml 45 | return(config) 46 | } 47 | 48 | #' @rdname yaml 49 | #' @param config an R representation of the datapackager.yml config, returned by yml_find, or a path to the package root. 
50 | #' @export 51 | yml_add_files <- function(config, filenames) { 52 | if (is.character(config)) { 53 | # assume config is a package root path 54 | config <- yml_find(config) 55 | } 56 | for (i in filenames) { 57 | if (is.null(config[["configuration"]][["files"]][[i]])) { 58 | config[["configuration"]][["files"]][[i]] <- list() 59 | # config[["configuration"]][["files"]][[i]]$name <- i 60 | config[["configuration"]][["files"]][[i]]$enabled <- TRUE 61 | } 62 | } 63 | if (getOption('DataPackageR_verbose', TRUE)) cat(yaml::as.yaml(config)) 64 | return(config) 65 | } 66 | 67 | #' @rdname yaml 68 | #' @param filenames A vector of filenames. 69 | #' @export 70 | yml_disable_compile <- function(config, filenames) { 71 | if (is.character(config)) { 72 | # assume config is a package root path 73 | config <- yml_find(config) 74 | } 75 | for (i in filenames) { 76 | if (!is.null(config[["configuration"]][["files"]][[i]])) { 77 | config[["configuration"]][["files"]][[i]]$enabled <- FALSE 78 | } 79 | } 80 | return(config) 81 | } 82 | 83 | #' @rdname yaml 84 | #' @export 85 | yml_enable_compile <- function(config, filenames) { 86 | if (is.character(config)) { 87 | # assume config is a package root path 88 | config <- yml_find(config) 89 | } 90 | for (i in filenames) { 91 | if (!is.null(config[["configuration"]][["files"]][[i]])) { 92 | config[["configuration"]][["files"]][[i]]$enabled <- TRUE 93 | } 94 | } 95 | return(config) 96 | } 97 | 98 | 99 | #' @rdname yaml 100 | #' @param objects A vector of R object names. 
101 | #' @export 102 | yml_add_objects <- function(config, objects) { 103 | if (is.character(config)) { 104 | # assume config is a package root path 105 | config <- yml_find(config) 106 | } 107 | config[["configuration"]][["objects"]] <- 108 | unique(c( 109 | config[["configuration"]][["objects"]], 110 | objects 111 | )) 112 | if (getOption('DataPackageR_verbose', TRUE)) cat(yaml::as.yaml(config)) 113 | return(config) 114 | } 115 | 116 | 117 | #' @rdname yaml 118 | #' @export 119 | yml_list_objects <- function(config) { 120 | if (is.character(config)) { 121 | # assume config is a package root path 122 | config <- yml_find(config) 123 | } 124 | if (getOption('DataPackageR_verbose', TRUE)){ 125 | cat("\n") 126 | cat(config[["configuration"]][["objects"]]) 127 | } 128 | invisible(config[["configuration"]][["objects"]]) 129 | } 130 | 131 | #' @rdname yaml 132 | #' @export 133 | yml_list_files <- function(config) { 134 | if (is.character(config)) { 135 | # assume config is a package root path 136 | config <- yml_find(config) 137 | } 138 | if (getOption('DataPackageR_verbose', TRUE)){ 139 | cat("\n") 140 | cat(names(config[["configuration"]][["files"]])) 141 | } 142 | invisible(names(config[["configuration"]][["files"]])) 143 | } 144 | 145 | #' @rdname yaml 146 | #' @export 147 | yml_remove_objects <- function(config, objects) { 148 | if (is.character(config)) { 149 | # assume config is a package root path 150 | config <- yml_find(config) 151 | } 152 | config[["configuration"]][["objects"]] <- 153 | setdiff( 154 | config[["configuration"]][["objects"]], 155 | objects 156 | ) 157 | if (getOption('DataPackageR_verbose', TRUE)) cat(yaml::as.yaml(config)) 158 | return(config) 159 | } 160 | 161 | #' @rdname yaml 162 | #' @export 163 | yml_remove_files <- function(config, filenames) { 164 | if (is.character(config)) { 165 | # assume config is a package root path 166 | config <- yml_find(config) 167 | } 168 | for (i in filenames) { 169 | if 
(!is.null(config[["configuration"]][["files"]][[i]])) {
      config[["configuration"]][["files"]][[i]] <- NULL
    }
  }
  if (getOption('DataPackageR_verbose', TRUE)) cat(yaml::as.yaml(config))
  return(config)
}

#' @rdname yaml
#' @export
yml_write <- function(config, path = NULL) {
  if (is.character(config)) {
    stop(
      paste0(
        "config must be a datapackager.yml configuration",
        " in R object representation, as read by yml_find()"
      ),
      call. = FALSE
    )
  }
  if (is.null(path)) {
    path <-
      attr(config, "path")
  } else {
    path <- file.path(path, "datapackager.yml")
  }
  yaml::write_yaml(config, file = path)
}


.create_tmpdir_render_root <- function(sub = NULL) {
  if (is.null(sub)) {
    sub <- as.character(as.integer(stats::runif(1) * 1000000))
  }
  render_root <- file.path(tempdir(), sub)
  tempdir_exists <-
    try(normalizePath(dirname(render_root),
      winslash = "/",
      mustWork = TRUE
    ),
    silent = TRUE
    )
  if (!dir.exists(render_root)) {
    dir.create(render_root, recursive = TRUE, showWarnings = FALSE)
  }
  render_root <- normalizePath(render_root, winslash = "/", mustWork = TRUE)
  return(render_root)
}

#' Construct a datapackager.yml configuration
#'
#' @param code A vector of filenames
#' @param data A vector of quoted object names
#' @param render_root The root directory where the package data processing code will be rendered.
#' Defaults to a randomly named subdirectory of \code{tempdir()}.
#' @return a datapackager.yml configuration represented as an R object
#' @description Constructs a datapackager.yml configuration object from a vector of file names and a vector of object names (all quoted).
#' Can be written to disk via \code{yml_write}.
#' By default, \code{render_root} is set to a randomly named subdirectory of \code{tempdir()}.
#' @examples
#' conf <- construct_yml_config(code = c('file1.rmd','file2.rmd'), data=c('object1','object2'))
#' tmp <- normalizePath(tempdir(), winslash = "/")
#' yml_write(conf,path=tmp)
#' @export
construct_yml_config <- function(code = NULL, data = NULL, render_root = NULL) {
  if (!is.null(code)) {
    code <- basename(code)
  }
  files <- vector(length = length(code), mode = "list")
  names(files) <- code
  for (i in code) {
    files[[i]]$enabled <- TRUE
  }

  # create render root at a temporary directory.
  # this will be stored in the yaml. What if we restart?
  # see processData - it gets validated and created if not existing.
  # would prefer to have something like "NULL" or "tmp" specify a default to a
  # temporary directory. But also have a consistent subdirectory beneath it.
  # currently not consistent, since we are randomly
  # generating a subdirectory name.
  # we could use "tmp: subdir" and construct the path.

  yml <- list(configuration = list(files = files, objects = data))
  if (is.null(render_root)) {
    render_root <- .create_tmpdir_render_root()
    yml[["configuration"]]$render_root$tmp <- basename(render_root)
  } else {
    # keep the path as supplied: on failure render_root holds a try-error
    # object, which cannot be used to report the missing directory
    requested_root <- render_root
    render_root <-
      try(normalizePath(render_root,
        winslash = "/",
        mustWork = TRUE
      ),
      silent = TRUE
      )
    if (inherits(render_root, "try-error")) {
      .multilog_fatal(paste0(
        requested_root,
        " doesn't exist!"
      ))
      stop("error", call. 
= FALSE)
    }
    yml[["configuration"]]$render_root <- render_root
  }
  return(yml)
}

.get_render_root <- function(x) {
  if ("tmp" %in% names(x$configuration$render_root)) {
    sub <- x$configuration$render_root$tmp
    render_root <- .create_tmpdir_render_root(sub)
    return(render_root)
  } else if (length(x$configuration$render_root) != 0) {
    return(x$configuration$render_root)
  } else {
    .multilog_fatal("render_root is not set in yaml")
    stop("error", call. = FALSE)
  }
}

--------------------------------------------------------------------------------
/R/zzz.R:
--------------------------------------------------------------------------------
.onLoad <- function(libname, pkgname) {
  # keeping this first option hardcoded on load for now
  options("DataPackageR_packagebuilding" = FALSE)
  # respect previous user setting for options if set
  op <- options()
  op.DataPackageR <- list(
    DataPackageR_interact = interactive(),
    DataPackageR_verbose = TRUE
  )
  toset <- !(names(op.DataPackageR) %in% names(op))
  if (any(toset)) options(op.DataPackageR[toset])
  invisible()
}

--------------------------------------------------------------------------------
/README.Rmd:
--------------------------------------------------------------------------------
---
title: README
output: github_document
bibliography: bibliography.bib
editor_options:
  chunk_output_type: inline
---


```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# DataPackageR

DataPackageR is used to reproducibly process raw data into packaged, analysis-ready data sets.
22 | 23 | 24 | [![CRAN](https://www.r-pkg.org/badges/version/DataPackageR)]( https://CRAN.R-project.org/package=DataPackageR) 25 | [![R-CMD-check](https://github.com/ropensci/DataPackageR/workflows/R-CMD-check/badge.svg)](https://github.com/ropensci/DataPackageR/actions) 26 | [![Codecov test coverage](https://codecov.io/gh/ropensci/DataPackageR/branch/main/graph/badge.svg)](https://app.codecov.io/gh/ropensci/DataPackageR?branch=main) 27 | [![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active) 28 | [![](https://badges.ropensci.org/230_status.svg)](https://github.com/ropensci/software-review/issues/230) 29 | [![DOI](https://zenodo.org/badge/29267435.svg)](https://doi.org/10.5281/zenodo.1292095) 30 | 31 | 32 | ## Installation 33 | 34 | You can install the latest [CRAN](https://cran.r-project.org/package=DataPackageR) release of DataPackageR with: 35 | 36 | ```{r, eval=FALSE} 37 | install.packages("DataPackageR") 38 | ``` 39 | 40 | You can install the latest development version of DataPackageR from [GitHub](https://github.com/ropensci/DataPackageR) with: 41 | 42 | ```{r, eval=FALSE} 43 | library(remotes) 44 | remotes::install_github("ropensci/DataPackageR") 45 | ``` 46 | 47 | ## What problems does DataPackageR tackle? 48 | 49 | You have diverse raw data sets that you need to preprocess and tidy in order to: 50 | 51 | - Perform data analysis 52 | - Write a report 53 | - Publish a paper 54 | - Share data with colleagues and collaborators 55 | - Save time in the future when you return to this project but have forgotten all about what you did. 56 | 57 | ### Why package data sets? 58 | 59 | **Definition:** A *data package* is a formal R package whose sole purpose is to contain, access, and / or document data sets. 
60 | 61 | - **Reproducibility.** 62 | 63 | As described [elsewhere](https://github.com/ropensci/rrrpkg), packaging your data promotes reproducibility. 64 | R's packaging infrastructure promotes unit testing, documentation, and a reproducible build system, among many other benefits. 65 | Coopting it for packaging data sets is a natural fit. 66 | 67 | - **Collaboration.** 68 | 69 | A data set packaged in R is easy to distribute and share among collaborators, and is easy to install and use. 70 | All the hard work you've put into documenting and standardizing the tidy data set comes right along with the data package. 71 | 72 | - **Documentation.** 73 | 74 | R's package system allows us to document data objects. What's more, the `roxygen2` package makes this very easy to do with [markup tags](https://r-pkgs.org/data.html). 75 | That documentation is the equivalent of a data dictionary and can be extremely valuable when returning to a project after a period of time. 76 | 77 | - **Convenience.** 78 | 79 | Data pre-processing can be time consuming, depending on the data type, and raw data sets may be too large to share conveniently in a packaged format. 80 | Packaging and sharing the small, tidied data saves users computing time and time spent waiting for downloads. 81 | 82 | ## Challenges 83 | 84 | - **Package size limits.** 85 | 86 | R packages have a 10MB size limit, at least on [CRAN](https://cran.r-project.org/web/packages/policies.html). Bioconductor [ExperimentHub](http://contributions.bioconductor.org/data.html#data) may be able to support larger data packages. 87 | 88 | Sharing large volumes of raw data in an R package format is still not ideal, and there are public biological data repositories better suited for raw data: e.g., [GEO](https://www.ncbi.nlm.nih.gov/geo/), [SRA](https://www.ncbi.nlm.nih.gov/sra), [ImmPort](https://www.immport.org/), [ImmuneSpace](https://immunespace.org/), [FlowRepository](http://flowrepository.org/). 
89 | 90 | Tools like [datastorr](https://github.com/traitecoevo/datastorr) can help with this and we hope to integrate this into DataPackageR in the future. 91 | 92 | - **Manual effort** 93 | 94 | There is still a substantial manual effort to set up the correct directory structures for an R data package. This can dissuade many individuals, particularly new users who have never built an R package, from going this route. 95 | 96 | - **Scale** 97 | 98 | Setting up and building R data packages by hand is a workable solution for a small project or a small number of projects, but when dealing with many projects each involving many data sets, tools are needed to help automate the process. 99 | 100 | ## DataPackageR 101 | 102 | DataPackageR provides a number of benefits when packaging your data. 103 | 104 | - It aims to automate away much of the tedium of packaging data sets without getting too much in the way, and keeps your processing workflow reproducible. 105 | 106 | - It sets up the necessary package structure and files for a data package. 107 | 108 | - It allows you to keep the large, raw data and only ship the packaged tidy data, saving the space and time that consumers of your data set would otherwise spend downloading and re-processing it. 109 | 110 | - It maintains a reproducible record (vignettes) of the data processing along with the package. Consumers of the data package can verify how the processing was done, increasing confidence in your data. 111 | 112 | - It automates construction of the documentation and maintains a data set version and an md5 fingerprint of each data object in the package. If the data changes and the package is rebuilt, the data version is automatically updated. 113 | 114 | ## Blog Post - building packages interactively. 115 | 116 | See this [rOpenSci blog post](https://ropensci.org/blog/2018/09/18/datapackager/) on how to build data packages interactively using DataPackageR. 
117 | This uses several new interfaces: `use_data_object()`, `use_processing_script()` and `use_raw_dataset()` to build up a data package, rather than assuming 118 | the user has all the code and data ready to go for `datapackage_skeleton()`. 119 | 120 | ## Example 121 | 122 | ```{r minimal_example} 123 | library(DataPackageR) 124 | 125 | # Let's reproducibly package up 126 | # the rows of the built-in cars dataset 127 | # with speed > 20. 128 | # Our dataset will be called cars_over_20. 129 | # There are four steps: 130 | 131 | # 1. Get the code file that turns the raw data 132 | # into our packaged and processed analysis-ready dataset. 133 | # This is in a file called subsetCars.Rmd located in extdata/tests of the installed DataPackageR package. 134 | # For your own projects you would write your own Rmd processing file. 135 | processing_code <- system.file( 136 | "extdata", "tests", "subsetCars.Rmd", package = "DataPackageR" 137 | ) 138 | 139 | # 2. Create the package framework. 140 | # We pass in the Rmd file in the `processing_code` variable and the name of the data object it creates ("cars_over_20"). 141 | # The new package is called "mtcars20". 142 | datapackage_skeleton( 143 | "mtcars20", force = TRUE, 144 | code_files = processing_code, 145 | r_object_names = "cars_over_20", 146 | path = tempdir()) 147 | 148 | # 3. Run the preprocessing code to build the cars_over_20 data set 149 | # and reproducibly enclose it in the mtcars20 package. 150 | # packageName is the full path to the package source directory created at step 2. 151 | # You'll be prompted for a text description (one line) of the changes you're making. 152 | # These will be added to the NEWS.md file along with the DataVersion in the package source directory. 153 | # If the build is run in non-interactive mode, the description will read 154 | # "Package built in non-interactive mode". You may update it later. 
155 | package_build(packageName = file.path(tempdir(),"mtcars20")) 156 | 157 | # Update the autogenerated roxygen documentation in data-raw/documentation.R. 158 | # edit(file.path(tempdir(),"mtcars20","R","mtcars20.R")) 159 | 160 | # 4. Rebuild the documentation. 161 | document(file.path(tempdir(),"mtcars20")) 162 | 163 | # Let's use the package we just created. 164 | # During actual use, the temporary library does not need to be specified. 165 | temp_lib <- file.path(tempdir(),"lib") 166 | dir.create(temp_lib) 167 | install.packages(file.path(tempdir(),"mtcars20_1.0.tar.gz"), 168 | type = "source", repos = NULL, lib = temp_lib) 169 | library(mtcars20, lib.loc = temp_lib) 170 | data("cars_over_20") # load the data 171 | 172 | # We have our dataset! 173 | # Since we preprocessed it, 174 | # it is clean and under the 5 MB limit for data in packages. 175 | cars_over_20 176 | 177 | ?cars_over_20 # See the documentation you wrote in data-raw/documentation.R. 178 | 179 | # We can easily check the version of the data 180 | data_version("mtcars20") 181 | 182 | # You can use an assert to check the data version in reports and 183 | # analyses that use the packaged data. 184 | assert_data_version(data_package_name = "mtcars20", 185 | version_string = "0.1.0", 186 | acceptable = "equal") 187 | ``` 188 | 189 | ### Reading external data from within R / Rmd processing scripts. 190 | 191 | When creating a data package, your processing scripts will need to read your raw data sets in order to process them. 192 | These data sets can be stored in `inst/extdata` of the data package source tree, or elsewhere outside the package source tree. 193 | In order to have portable and reproducible code, you should not use absolute paths to the raw data. 194 | Instead, `DataPackageR` provides several APIs to access the data package project root directory, the `inst/extdata` subdirectory, and the `data` subdirectory. 
195 | 196 | ```{r, eval = FALSE} 197 | # This returns the datapackage source 198 | # root directory. 199 | # In an R or Rmd processing script this can be used to build a path to a directory that is external to the package, for 200 | # example if we are dealing with very large data sets where data cannot be packaged. 201 | DataPackageR::project_path() 202 | 203 | # This returns the 204 | # inst/extdata directory. 205 | # Raw data sets that are included in the package should be placed there. 206 | # They can be read from that location, which is returned by: 207 | DataPackageR::project_extdata_path() 208 | 209 | # This returns the path to the datapackage 210 | # data directory. This can be used to access 211 | # stored data objects already created and saved in `data` from 212 | # other processing scripts. 213 | DataPackageR::project_data_path() 214 | ``` 215 | 216 | 217 | ## Vignettes 218 | 219 | [yaml configuration guide](https://docs.ropensci.org/DataPackageR/articles/YAML_Configuration_Details.html) 220 | 221 | [a more detailed technical vignette](https://docs.ropensci.org/DataPackageR/articles/Using_DataPackageR.html) 222 | 223 | ## Preprint and publication 224 | 225 | The publication describing the package (Finak *et al.*, 2018) is now available at [Gates Open Research](https://gatesopenresearch.org/articles/2-31/v2). 226 | 227 | 228 | The preprint is on [bioRxiv](https://doi.org/10.1101/342907). 229 | 230 | ## Similar work 231 | 232 | DataPackageR is for processing raw data into tidy data sets and bundling them into R packages. (Note: [datapack](https://github.com/ropensci/datapack) is a **different package** that is used to "create, send and load data from common repositories such as DataONE into the R environment".) 
233 | 234 | 235 | There are a number of tools out there that address similar and complementary problems: 236 | 237 | - **datastorr** 238 | [github repo](https://github.com/traitecoevo/datastorr) 239 | 240 | Simple data retrieval and versioning using GitHub to store data. 241 | 242 | - Caches downloads and uses GitHub releases to version data. 243 | - Deals consistently with translating the file stored online into a loaded data object. 244 | - Accesses multiple versions of the data at once. 245 | 246 | `datastorr` could be used with DataPackageR to store / access remote raw data sets, or to remotely store / access tidied data that are too large to fit in the package itself. 247 | 248 | - **fst** 249 | [github repo](https://github.com/fstpackage/fst) 250 | 251 | `fst` provides lightning fast serialization of data frames. 252 | 253 | - **The modern data package** 254 | [pdf](https://github.com/noamross/2018-04-18-rstats-nyc/blob/master/Noam_Ross_ModernDataPkg_rstatsnyc_2018-04-20.pdf) 255 | 256 | A presentation from \@noamross touching on modern tools for open science and reproducibility. Discusses `datastorr` and `fst` as well as standardized metadata and documentation. 257 | 258 | - **rrrpkg** 259 | [github repo](https://github.com/ropensci/rrrpkg) 260 | 261 | A document from rOpenSci describing using an R package as a research compendium. Based on ideas originally introduced by Robert Gentleman and Duncan Temple Lang (Gentleman and Temple Lang, 2004). 262 | 263 | - **template** 264 | [github repo](https://github.com/ropensci/rrrpkg) 265 | 266 | An R package template for data packages. 267 | 268 | See the [publication](#preprint-and-publication) for further discussion. 269 | 270 | ## Code of conduct 271 | 272 | Please note that this project is released with a [Contributor Code of Conduct](https://github.com/ropensci/DataPackageR/blob/main/CODE_OF_CONDUCT.md). 273 | By participating in this project you agree to abide by its terms. 274 | 275 | ### References 276 | 277 | 1. 
Gentleman, Robert, and Duncan Temple Lang. 2004. “Statistical Analyses and Reproducible Research.” Bioconductor Project Working Papers. bepress. 278 | 279 | 2. Finak G, Mayer B, Fulp W, et al. DataPackageR: Reproducible data preprocessing, standardization and sharing using R/Bioconductor for collaborative data analysis. Gates Open Res 2018, 2:31 280 | (DOI: 10.12688/gatesopenres.12832.1) 281 | 282 | 283 | 284 | [![ropensci_footer](https://ropensci.org/public_images/ropensci_footer.png)](https://ropensci.org) 285 | 286 | 287 | -------------------------------------------------------------------------------- /bibliography.bib: -------------------------------------------------------------------------------- 1 | 2 | @ARTICLE{Gentleman2004-oj, 3 | title = "Statistical Analyses and Reproducible Research", 4 | author = "Gentleman, Robert and Lang, Duncan Temple", 5 | journal = "Bioconductor Project Working Papers", 6 | publisher = "bepress", 7 | series = "Bioconductor Project Working Papers", 8 | year = 2004 9 | } 10 | 11 | @UNPUBLISHED{Finak2018-tu, 12 | title = "{DataPackageR}: Reproducible data preprocessing, standardization 13 | and sharing using {R/Bioconductor} for collaborative data 14 | analysis", 15 | author = "Finak, Greg and Mayer, Bryan and Fulp, William and Obrecht, Paul 16 | and Sato, Alicia and Chung, Eva and Holman, Drienna and Gottardo, 17 | Raphael", 18 | journal = "bioRxiv", 19 | pages = "342907", 20 | month = jun, 21 | year = 2018, 22 | language = "en" 23 | } 24 | 25 | @Article{ 10.12688/gatesopenres.12832.2, 26 | AUTHOR = { Finak, G and Mayer, B and Fulp, W and Obrecht, P and Sato, A and Chung, E and Holman, D and Gottardo, R}, 27 | TITLE = {DataPackageR: Reproducible data preprocessing, standardization and sharing using R/Bioconductor for collaborative data analysis [version 2; referees: 2 approved, 1 approved with reservations] 28 | }, 29 | JOURNAL = {Gates Open Research}, 30 | VOLUME = {2}, 31 | YEAR = {2018}, 
32 | NUMBER = {31}, 33 | DOI = {10.12688/gatesopenres.12832.2} 34 | } -------------------------------------------------------------------------------- /codecov.yml: -------------------------------------------------------------------------------- 1 | comment: false 2 | 3 | coverage: 4 | status: 5 | project: 6 | default: 7 | target: auto 8 | threshold: 1% 9 | informational: true 10 | patch: 11 | default: 12 | target: auto 13 | threshold: 1% 14 | informational: true 15 | -------------------------------------------------------------------------------- /codemeta.json: -------------------------------------------------------------------------------- 1 | { 2 | "@context": "https://doi.org/10.5063/schema/codemeta-2.0", 3 | "@type": "SoftwareSourceCode", 4 | "identifier": "DataPackageR", 5 | "description": "A framework to help construct R data packages in a reproducible manner. Potentially time consuming processing of raw data sets into analysis ready data sets is done in a reproducible manner and decoupled from the usual 'R CMD build' process so that data sets can be processed into R objects in the data package and the data package can then be shared, built, and installed by others without the need to repeat computationally costly data processing. The package maintains data provenance by turning the data processing scripts into package vignettes, as well as enforcing documentation and version checking of included data objects. 
Data packages can be version controlled on 'GitHub', and used to share data for manuscripts, collaboration and reproducible research.", 6 | "name": "DataPackageR: Construct Reproducible Analytic Data Sets as R Packages", 7 | "relatedLink": ["https://docs.ropensci.org/DataPackageR/", "https://CRAN.R-project.org/package=DataPackageR"], 8 | "codeRepository": "https://github.com/ropensci/DataPackageR", 9 | "issueTracker": "https://github.com/ropensci/DataPackageR/issues", 10 | "license": "https://spdx.org/licenses/MIT", 11 | "version": "0.16.1", 12 | "programmingLanguage": { 13 | "@type": "ComputerLanguage", 14 | "name": "R", 15 | "url": "https://r-project.org" 16 | }, 17 | "runtimePlatform": "R version 4.4.1 (2024-06-14)", 18 | "provider": { 19 | "@id": "https://cran.r-project.org", 20 | "@type": "Organization", 21 | "name": "Comprehensive R Archive Network (CRAN)", 22 | "url": "https://cran.r-project.org" 23 | }, 24 | "author": [ 25 | { 26 | "@type": "Person", 27 | "givenName": "Greg", 28 | "familyName": "Finak", 29 | "email": "greg.finak@gmail.com" 30 | } 31 | ], 32 | "contributor": [ 33 | { 34 | "@type": "Person", 35 | "givenName": "Paul", 36 | "familyName": "Obrecht" 37 | }, 38 | { 39 | "@type": "Person", 40 | "givenName": "Ellis", 41 | "familyName": "Hughes", 42 | "email": "ellishughes@live.com", 43 | "@id": "https://orcid.org/0000-0003-0637-4436" 44 | }, 45 | { 46 | "@type": "Person", 47 | "givenName": "Jimmy", 48 | "familyName": "Fulp", 49 | "email": "williamjfulp@gmail.com" 50 | }, 51 | { 52 | "@type": "Person", 53 | "givenName": "Marie", 54 | "familyName": "Vendettuoli", 55 | "@id": "https://orcid.org/0000-0001-9321-1410" 56 | }, 57 | { 58 | "@type": "Person", 59 | "givenName": "Dave", 60 | "familyName": "Slager", 61 | "email": "dslager@fredhutch.org", 62 | "@id": "https://orcid.org/0000-0003-2525-2039" 63 | }, 64 | { 65 | "@type": "Person", 66 | "givenName": "Jason", 67 | "familyName": "Taylor", 68 | "email": "jmtaylor@fredhutch.org" 69 | } 70 | ], 71 | 
"copyrightHolder": [ 72 | { 73 | "@type": "Person", 74 | "givenName": "Greg", 75 | "familyName": "Finak", 76 | "email": "greg.finak@gmail.com" 77 | } 78 | ], 79 | "maintainer": [ 80 | { 81 | "@type": "Person", 82 | "givenName": "Dave", 83 | "familyName": "Slager", 84 | "email": "dslager@fredhutch.org", 85 | "@id": "https://orcid.org/0000-0003-2525-2039" 86 | } 87 | ], 88 | "softwareSuggestions": [ 89 | { 90 | "@type": "SoftwareApplication", 91 | "identifier": "covr", 92 | "name": "covr", 93 | "provider": { 94 | "@id": "https://cran.r-project.org", 95 | "@type": "Organization", 96 | "name": "Comprehensive R Archive Network (CRAN)", 97 | "url": "https://cran.r-project.org" 98 | }, 99 | "sameAs": "https://CRAN.R-project.org/package=covr" 100 | }, 101 | { 102 | "@type": "SoftwareApplication", 103 | "identifier": "data.tree", 104 | "name": "data.tree", 105 | "provider": { 106 | "@id": "https://cran.r-project.org", 107 | "@type": "Organization", 108 | "name": "Comprehensive R Archive Network (CRAN)", 109 | "url": "https://cran.r-project.org" 110 | }, 111 | "sameAs": "https://CRAN.R-project.org/package=data.tree" 112 | }, 113 | { 114 | "@type": "SoftwareApplication", 115 | "identifier": "spelling", 116 | "name": "spelling", 117 | "provider": { 118 | "@id": "https://cran.r-project.org", 119 | "@type": "Organization", 120 | "name": "Comprehensive R Archive Network (CRAN)", 121 | "url": "https://cran.r-project.org" 122 | }, 123 | "sameAs": "https://CRAN.R-project.org/package=spelling" 124 | }, 125 | { 126 | "@type": "SoftwareApplication", 127 | "identifier": "testthat", 128 | "name": "testthat", 129 | "provider": { 130 | "@id": "https://cran.r-project.org", 131 | "@type": "Organization", 132 | "name": "Comprehensive R Archive Network (CRAN)", 133 | "url": "https://cran.r-project.org" 134 | }, 135 | "sameAs": "https://CRAN.R-project.org/package=testthat" 136 | }, 137 | { 138 | "@type": "SoftwareApplication", 139 | "identifier": "withr", 140 | "name": "withr", 141 | 
"provider": { 142 | "@id": "https://cran.r-project.org", 143 | "@type": "Organization", 144 | "name": "Comprehensive R Archive Network (CRAN)", 145 | "url": "https://cran.r-project.org" 146 | }, 147 | "sameAs": "https://CRAN.R-project.org/package=withr" 148 | } 149 | ], 150 | "softwareRequirements": { 151 | "1": { 152 | "@type": "SoftwareApplication", 153 | "identifier": "R", 154 | "name": "R", 155 | "version": ">= 3.5.0" 156 | }, 157 | "2": { 158 | "@type": "SoftwareApplication", 159 | "identifier": "cli", 160 | "name": "cli", 161 | "provider": { 162 | "@id": "https://cran.r-project.org", 163 | "@type": "Organization", 164 | "name": "Comprehensive R Archive Network (CRAN)", 165 | "url": "https://cran.r-project.org" 166 | }, 167 | "sameAs": "https://CRAN.R-project.org/package=cli" 168 | }, 169 | "3": { 170 | "@type": "SoftwareApplication", 171 | "identifier": "desc", 172 | "name": "desc", 173 | "provider": { 174 | "@id": "https://cran.r-project.org", 175 | "@type": "Organization", 176 | "name": "Comprehensive R Archive Network (CRAN)", 177 | "url": "https://cran.r-project.org" 178 | }, 179 | "sameAs": "https://CRAN.R-project.org/package=desc" 180 | }, 181 | "4": { 182 | "@type": "SoftwareApplication", 183 | "identifier": "digest", 184 | "name": "digest", 185 | "provider": { 186 | "@id": "https://cran.r-project.org", 187 | "@type": "Organization", 188 | "name": "Comprehensive R Archive Network (CRAN)", 189 | "url": "https://cran.r-project.org" 190 | }, 191 | "sameAs": "https://CRAN.R-project.org/package=digest" 192 | }, 193 | "5": { 194 | "@type": "SoftwareApplication", 195 | "identifier": "futile.logger", 196 | "name": "futile.logger", 197 | "provider": { 198 | "@id": "https://cran.r-project.org", 199 | "@type": "Organization", 200 | "name": "Comprehensive R Archive Network (CRAN)", 201 | "url": "https://cran.r-project.org" 202 | }, 203 | "sameAs": "https://CRAN.R-project.org/package=futile.logger" 204 | }, 205 | "6": { 206 | "@type": "SoftwareApplication", 207 | 
"identifier": "knitr", 208 | "name": "knitr", 209 | "provider": { 210 | "@id": "https://cran.r-project.org", 211 | "@type": "Organization", 212 | "name": "Comprehensive R Archive Network (CRAN)", 213 | "url": "https://cran.r-project.org" 214 | }, 215 | "sameAs": "https://CRAN.R-project.org/package=knitr" 216 | }, 217 | "7": { 218 | "@type": "SoftwareApplication", 219 | "identifier": "pkgbuild", 220 | "name": "pkgbuild", 221 | "provider": { 222 | "@id": "https://cran.r-project.org", 223 | "@type": "Organization", 224 | "name": "Comprehensive R Archive Network (CRAN)", 225 | "url": "https://cran.r-project.org" 226 | }, 227 | "sameAs": "https://CRAN.R-project.org/package=pkgbuild" 228 | }, 229 | "8": { 230 | "@type": "SoftwareApplication", 231 | "identifier": "pkgload", 232 | "name": "pkgload", 233 | "provider": { 234 | "@id": "https://cran.r-project.org", 235 | "@type": "Organization", 236 | "name": "Comprehensive R Archive Network (CRAN)", 237 | "url": "https://cran.r-project.org" 238 | }, 239 | "sameAs": "https://CRAN.R-project.org/package=pkgload" 240 | }, 241 | "9": { 242 | "@type": "SoftwareApplication", 243 | "identifier": "rmarkdown", 244 | "name": "rmarkdown", 245 | "provider": { 246 | "@id": "https://cran.r-project.org", 247 | "@type": "Organization", 248 | "name": "Comprehensive R Archive Network (CRAN)", 249 | "url": "https://cran.r-project.org" 250 | }, 251 | "sameAs": "https://CRAN.R-project.org/package=rmarkdown" 252 | }, 253 | "10": { 254 | "@type": "SoftwareApplication", 255 | "identifier": "roxygen2", 256 | "name": "roxygen2", 257 | "provider": { 258 | "@id": "https://cran.r-project.org", 259 | "@type": "Organization", 260 | "name": "Comprehensive R Archive Network (CRAN)", 261 | "url": "https://cran.r-project.org" 262 | }, 263 | "sameAs": "https://CRAN.R-project.org/package=roxygen2" 264 | }, 265 | "11": { 266 | "@type": "SoftwareApplication", 267 | "identifier": "rprojroot", 268 | "name": "rprojroot", 269 | "provider": { 270 | "@id": 
"https://cran.r-project.org", 271 | "@type": "Organization", 272 | "name": "Comprehensive R Archive Network (CRAN)", 273 | "url": "https://cran.r-project.org" 274 | }, 275 | "sameAs": "https://CRAN.R-project.org/package=rprojroot" 276 | }, 277 | "12": { 278 | "@type": "SoftwareApplication", 279 | "identifier": "usethis", 280 | "name": "usethis", 281 | "provider": { 282 | "@id": "https://cran.r-project.org", 283 | "@type": "Organization", 284 | "name": "Comprehensive R Archive Network (CRAN)", 285 | "url": "https://cran.r-project.org" 286 | }, 287 | "sameAs": "https://CRAN.R-project.org/package=usethis" 288 | }, 289 | "13": { 290 | "@type": "SoftwareApplication", 291 | "identifier": "utils", 292 | "name": "utils" 293 | }, 294 | "14": { 295 | "@type": "SoftwareApplication", 296 | "identifier": "yaml", 297 | "name": "yaml", 298 | "provider": { 299 | "@id": "https://cran.r-project.org", 300 | "@type": "Organization", 301 | "name": "Comprehensive R Archive Network (CRAN)", 302 | "url": "https://cran.r-project.org" 303 | }, 304 | "sameAs": "https://CRAN.R-project.org/package=yaml" 305 | }, 306 | "SystemRequirements": "pandoc - https://pandoc.org" 307 | }, 308 | "fileSize": "860.52KB", 309 | "releaseNotes": "https://github.com/ropensci/DataPackageR/blob/master/NEWS.md", 310 | "readme": "https://github.com/ropensci/DataPackageR/blob/main/README.md", 311 | "contIntegration": ["https://github.com/ropensci/DataPackageR/actions", "https://app.codecov.io/gh/ropensci/DataPackageR?branch=main"], 312 | "developmentStatus": "https://www.repostatus.org/#active", 313 | "review": { 314 | "@type": "Review", 315 | "url": "https://github.com/ropensci/software-review/issues/230", 316 | "provider": "https://ropensci.org" 317 | }, 318 | "keywords": ["r", "r-package", "reproducibility", "rstats", "peer-reviewed"] 319 | } 320 | -------------------------------------------------------------------------------- /cran-comments.md: 
-------------------------------------------------------------------------------- 1 | ## R CMD check results 2 | 3 | 0 errors ✔ | 0 warnings ✔ | 1 note ✖ 4 | 5 | ❯ checking CRAN incoming feasibility ... [139s] NOTE 6 | Maintainer: 'Dave Slager ' 7 | 8 | ## Reverse dependency checks 9 | 10 | This package has no reverse dependencies. 11 | -------------------------------------------------------------------------------- /inst/WORDLIST: -------------------------------------------------------------------------------- 1 | CMD 2 | Codecov 3 | Coopting 4 | DATADIGEST 5 | DOI 6 | DataONE 7 | DataVersion 8 | ExperimentHub 9 | FlowRepository 10 | Hadley 11 | ImmPort 12 | ImmuneSpace 13 | ORCID 14 | Pre 15 | Preprint 16 | README 17 | RMarkdown 18 | Rbuildignore 19 | Reproducibility 20 | Rmd 21 | SRA 22 | SystemRequirements 23 | Wickham's 24 | YAML 25 | al 26 | bepress 27 | bioRxiv 28 | cli 29 | config 30 | csv 31 | datapack 32 | datapackage 33 | datapackager 34 | datastorr 35 | et 36 | extdata 37 | fst 38 | gatesopenres 39 | github 40 | gitignore 41 | https 42 | incrementing 43 | loc 44 | md 45 | mtcars 46 | mydata 47 | onboarding 48 | pandoc 49 | pre 50 | preprint 51 | preprocess 52 | rOpenSci 53 | repo 54 | reproducibility 55 | reproducibly 56 | rmarkdown 57 | ropensci 58 | roxygen 59 | rrrpkg 60 | useR 61 | usethis 62 | yaml 63 | yml 64 | -------------------------------------------------------------------------------- /inst/extdata/tests/extra.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "extra" 3 | author: "Greg Finak" 4 | date: "5/30/2018" 5 | output: html_document 6 | --- 7 | 8 | ```{r setup, include=FALSE} 9 | knitr::opts_chunk$set(echo = TRUE) 10 | ``` 11 | 12 | This file is processed second in the `datapackager.yml` file. It therefore has access to the data objects 13 | created by `subsetCars.Rmd`, the file that is processed first in the `datapackager.yml`. 
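The processing order described above comes from the `files` entry of the configuration. As a rough illustration only, the base-R sketch below mirrors the list-building logic shown in `construct_yml_config()` in `R/yamlR.R`; it does not call DataPackageR and writes nothing to disk. Listing `pressure` as a second object is an assumption made for this example.

```r
# Sketch of the configuration structure, built by hand in base R.
# The order of `code` is the order in which the files are processed.
code <- c("subsetCars.Rmd", "extra.Rmd")

# One named entry per file, each flagged as enabled.
files <- vector(mode = "list", length = length(code))
names(files) <- code
for (i in code) {
  files[[i]]$enabled <- TRUE
}

# Objects to keep; "pressure" is an assumed name used for illustration.
config <- list(
  configuration = list(
    files = files,
    objects = c("cars_over_20", "pressure")
  )
)

# The file names appear in processing order.
names(config$configuration$files)
```

Because `extra.Rmd` comes after `subsetCars.Rmd` in this list, anything `subsetCars.Rmd` creates and declares under `objects` already exists by the time `extra.Rmd` runs.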
14 | 15 | ## Reading objects from previously run files 16 | 17 | In order to read previously processed objects, we use the `datapackager_object_read()` API from `DataPackageR`: 18 | 19 | 20 | ```{r reading_from_environments} 21 | library(DataPackageR) 22 | cars_over_20 = datapackager_object_read("cars_over_20") 23 | print(cars_over_20) 24 | ``` 25 | 26 | This API reads from an environment named `ENVS`, containing `subsetCars` and any other previously built data set objects. It is passed into the render environment of `extra.Rmd` by DataPackageR at the `render()` call. 27 | 28 | ## Additional data objects 29 | 30 | This file will add the pressure data set to the example. 31 | 32 | ```{r} 33 | data(pressure, envir = environment()) 34 | plot(pressure) 35 | ``` 36 | 37 | -------------------------------------------------------------------------------- /inst/extdata/tests/raw_data/testdata.csv: -------------------------------------------------------------------------------- 1 | 1, name,value 2 | 2, Red, 50 3 | 3, Blue, 200 4 | 4, Green, 10 -------------------------------------------------------------------------------- /inst/extdata/tests/rfileTest.R: -------------------------------------------------------------------------------- 1 | #' --- 2 | #' title: Sample report from R script 3 | #' author: Greg Finak 4 | #' date: August 1, 2018 5 | #' output: pdf_document 6 | #' --- 7 | data <- runif(100) 8 | -------------------------------------------------------------------------------- /inst/extdata/tests/rfileTest_noheader.R: -------------------------------------------------------------------------------- 1 | data <- runif(100) 2 | -------------------------------------------------------------------------------- /inst/extdata/tests/subsetCars.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "A Test Document for DataPackageR" 3 | author: "Greg Finak" 4 | output: html_document 5 | --- 6 | 7 | ```{r include=FALSE} 8 | 
knitr::opts_chunk$set(echo = TRUE) 9 | ``` 10 | 11 | This is a simple Rmd file that demonstrates how DataPackageR processes Rmarkdown files and creates data sets 12 | that are then stored in an R data package. 13 | 14 | In the `datapackager.yml` for this example, this file is listed first, and therefore processed first. 15 | 16 | This particular document simply subsets the `cars` data set: 17 | 18 | ```{r cars} 19 | summary(cars) 20 | dim(cars) 21 | ``` 22 | 23 | `cars` is a data frame with 50 rows and two columns. The `?cars` documentation specifies that it consists of speed and stopping distances of cars. 24 | 25 | Let's say, for some reason, we are only interested in the stopping distances of cars traveling faster than 20 miles per hour. 26 | 27 | ```{r} 28 | cars_over_20 = subset(cars, speed > 20) 29 | ``` 30 | 31 | The data frame `cars_over_20` now holds this information. 32 | 33 | # Storing data set objects and making them accessible to other processing scripts. 34 | 35 | When DataPackageR processes this file, it creates this `cars_over_20` object. After processing the file it does several things: 36 | 37 | 1. It compares the objects in the rmarkdown render environment of `subsetCars.Rmd` against the objects listed in the `datapackager.yml` file `objects` property. 38 | 2. It finds `cars_over_20` is listed there, so it stores it in a new environment. 39 | 3. That environment is passed to subsequent R and Rmd files. Specifically when the `extra.Rmd` file is processed, it has access to an environment object that holds all the `objects` (defined in the yaml config) that have already been created and processed. This environment is passed into subsequent scripts at the `render()` call. 40 | 41 | All of the above is done automatically. The user only needs to list the objects to be stored and passed to other scripts in the `datapackager.yml` file. 42 | 43 | The `datapackager_object_read()` API can be used to retrieve these objects from the environment. 
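The three steps above can be sketched in plain base R. This is only an illustration of the idea, not DataPackageR's actual implementation: `render_env`, `object_store`, `declared_objects` and `read_stored_object()` are hypothetical names invented for this example.

```r
# Names the user declared under `objects` in the yaml config (assumed here).
declared_objects <- c("cars_over_20")

# Step 1: run the processing code in its own environment,
# standing in for the rmarkdown render environment.
render_env <- new.env()
evalq(cars_over_20 <- subset(cars, speed > 20), envir = render_env)
evalq(scratch <- 1:3, envir = render_env)  # an undeclared helper object

# Step 2: keep only the objects that were declared.
object_store <- new.env()
for (nm in intersect(ls(render_env), declared_objects)) {
  assign(nm, get(nm, envir = render_env), envir = object_store)
}

# Step 3: a later script looks an object up by name.
read_stored_object <- function(name, store = object_store) {
  get(name, envir = store, inherits = FALSE)
}

head(read_stored_object("cars_over_20"))
```

The undeclared `scratch` object never reaches `object_store`, which is the point of listing object names in the configuration: only declared objects are carried forward to later scripts.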
44 | 45 | ### Storing objects in the data package 46 | 47 | In addition to passing around an environment to subsequent scripts, the `cars_over_20` object is stored in the data package `/data` directory as an `rda` file. 48 | 49 | Note that this is all done automatically. The user does not need to explicitly save anything; they only need to list the objects to be stored in the `datapackager.yml`. 50 | 51 | This object is then accessible in the resulting package via the `data()` API, and its documentation is accessible via `?cars_over_20`. 52 | 53 | ### Data object documentation 54 | 55 | The documentation for the `cars_over_20` object is created in a `subsetCars.R` file in the `/R` directory of the data package. 56 | 57 | While the data object documentation stub is created automatically, it must be edited by the user to provide additional details about the data object. 58 | 59 | 60 | -------------------------------------------------------------------------------- /man/DataPackageR-defunct.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/DataPackageR-defunct.R 3 | \name{DataPackageR-defunct} 4 | \alias{DataPackageR-defunct} 5 | \alias{datapackage.skeleton} 6 | \alias{dataVersion} 7 | \alias{keepDataObjects} 8 | \title{Defunct functions in package \pkg{DataPackageR}.} 9 | \usage{ 10 | datapackage.skeleton(...) 11 | 12 | dataVersion(...) 13 | 14 | keepDataObjects(...) 15 | } 16 | \arguments{ 17 | \item{...}{All arguments are now ignored.} 18 | } 19 | \value{ 20 | Defunct function. No return value. 21 | } 22 | \description{ 23 | These functions are defunct and no longer supported. 24 | Calling them will result in an error. 25 | 26 | When possible, alternatives are suggested. 
27 | } 28 | -------------------------------------------------------------------------------- /man/DataPackageR-package.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/DataPackageR-package.R 3 | \docType{package} 4 | \name{DataPackageR-package} 5 | \alias{DataPackageR} 6 | \alias{DataPackageR-package} 7 | \title{DataPackageR} 8 | \description{ 9 | A framework to automate the processing, tidying and packaging of raw data into analysis-ready 10 | data sets as R packages. 11 | } 12 | \details{ 13 | DataPackageR will automate running of data processing code, 14 | storing tidied data sets in an R package, producing 15 | data documentation stubs, tracking data object finger prints (md5 hash) 16 | and tracking and incrementing a "DataVersion" string 17 | in the DESCRIPTION file of the package when raw data or data 18 | objects change. 19 | Code to perform the data processing is passed to DataPackageR by the user. 20 | The user also specifies the names of the tidy data objects to be stored, 21 | documented and tracked in the final package. Raw data should be read from 22 | "inst/extdata" but large raw data files can be read from sources external 23 | to the package source tree. 24 | 25 | Configuration is controlled via the datapackager.yml file created at the package root. 26 | Its properties include a list of R and Rmd files that are to be rendered / sourced and 27 | which read data and do the actual processing. 28 | It also includes a list of r object names created by those files. These objects 29 | are stored in the final package and accessible via the \code{data()} API. 30 | The documentation for these objects is accessible via "?object-name", and md5 31 | fingerprints of these objects are created and tracked. 
32 | 33 | The Rmd and R files used to process the objects are transformed into vignettes 34 | accessible in the final package so that the processing is fully documented. 35 | 36 | A DATADIGEST file in the package source keeps track of the data object fingerprints. 37 | A DataVersion string is added to the package DESCRIPTION file and updated when these 38 | objects are updated or changed on subsequent builds. 39 | 40 | Once the package is built and installed, the data objects created in the package are accessible via 41 | the \code{data()} API. 42 | Calling \code{datapackage_skeleton()} and passing in R / Rmd file names, and r object names 43 | constructs a skeleton data package source tree and an associated \code{datapackager.yml} file. 44 | 45 | Calling \code{package_build()} sets the build process in motion. 46 | } 47 | \examples{ 48 | # A simple Rmd file that creates one data object 49 | # named "tbl". 50 | if(rmarkdown::pandoc_available()){ 51 | f <- tempdir() 52 | f <- file.path(f,"foo.Rmd") 53 | con <- file(f) 54 | writeLines("```{r}\n tbl = data.frame(1:10) \n```\n",con=con) 55 | close(con) 56 | 57 | # construct a data package skeleton named "MyDataPackage" and pass 58 | # in the Rmd file name with full path, and the name of the object(s) it 59 | # creates. 60 | 61 | pname <- basename(tempfile()) 62 | datapackage_skeleton(name=pname, 63 | path=tempdir(), 64 | force = TRUE, 65 | r_object_names = "tbl", 66 | code_files = f) 67 | 68 | # call package_build to run the "foo.Rmd" processing and 69 | # build a data package. 70 | package_build(file.path(tempdir(), pname), install = FALSE) 71 | 72 | # "install" the data package 73 | pkgload::load_all(file.path(tempdir(), pname)) 74 | 75 | # read the data version 76 | data_version(pname) 77 | 78 | # list the data sets in the package. 
79 | data(package = pname) 80 | 81 | # The data objects are in the package source under "/data" 82 | list.files(pattern="rda", path = file.path(tempdir(),pname,"data"), full = TRUE) 83 | 84 | # The documentation that needs to be edited is in "/R" 85 | list.files(pattern="R", path = file.path(tempdir(), pname,"R"), full = TRUE) 86 | readLines(list.files(pattern="R", path = file.path(tempdir(),pname,"R"), full = TRUE)) 87 | # view the documentation with 88 | ?tbl 89 | } 90 | } 91 | \seealso{ 92 | Useful links: 93 | \itemize{ 94 | \item \url{https://github.com/ropensci/DataPackageR} 95 | \item \url{https://docs.ropensci.org/DataPackageR/} 96 | \item Report bugs at \url{https://github.com/ropensci/DataPackageR/issues} 97 | } 98 | 99 | } 100 | \author{ 101 | \strong{Maintainer}: Dave Slager \email{dslager@fredhutch.org} (\href{https://orcid.org/0000-0003-2525-2039}{ORCID}) [contributor] 102 | 103 | Authors: 104 | \itemize{ 105 | \item Greg Finak \email{greg.finak@gmail.com} (Original author and creator of DataPackageR) [copyright holder] 106 | } 107 | 108 | Other contributors: 109 | \itemize{ 110 | \item Paul Obrecht [contributor] 111 | \item Ellis Hughes \email{ellishughes@live.com} (\href{https://orcid.org/0000-0003-0637-4436}{ORCID}) [contributor] 112 | \item Jimmy Fulp \email{williamjfulp@gmail.com} [contributor] 113 | \item Marie Vendettuoli (\href{https://orcid.org/0000-0001-9321-1410}{ORCID}) [contributor] 114 | \item Jason Taylor \email{jmtaylor@fredhutch.org} [contributor] 115 | \item Kara Woo (Kara reviewed the package for rOpenSci, see ) [reviewer] 116 | \item William Landau (William reviewed the package for rOpenSci, see ) [reviewer] 117 | } 118 | 119 | } 120 | \keyword{internal} 121 | -------------------------------------------------------------------------------- /man/DataPackageR_options.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in 
R/DataPackageR-package.R 3 | \name{DataPackageR_options} 4 | \alias{DataPackageR_options} 5 | \title{Options consulted by DataPackageR} 6 | \description{ 7 | User-configurable options consulted by DataPackageR, which 8 | provide a mechanism for setting default behaviors for various functions. 9 | 10 | If the built-in defaults don't suit you, set one or more of these options. 11 | Typically, this is done in the \code{.Rprofile} startup file, which you can open 12 | for editing with \code{usethis::edit_r_profile()} - this will set the specified 13 | options for all future R sessions. The following setting is recommended to 14 | not be prompted upon each package build for a NEWS update: 15 | 16 | \code{options(DataPackageR_interact = FALSE)} 17 | } 18 | \section{Options for the DataPackageR package}{ 19 | 20 | 21 | - \code{DataPackageR_interact}: Upon package load, this defaults to the value of 22 | \code{interactive()}, unless the option has been previously set (e.g., in 23 | \code{.Rprofile}). TRUE prompts user interactively for a NEWS update on 24 | \code{package_build()}. See the example above and the 25 | \href{https://ropensci.org/blog/2018/09/18/datapackager/}{rOpenSci blog 26 | post} for more details on how to set this to FALSE, which will never prompt 27 | user for a NEWS update. FALSE is also the setting used for DataPackageR 28 | internal package tests. 29 | 30 | - \code{DataPackageR_verbose}: Default upon package load is TRUE. FALSE suppresses 31 | all console output and is currently only used for automated 32 | unit tests of the DataPackageR package. 33 | 34 | - \code{DataPackageR_packagebuilding}: Default upon package load is FALSE. This 35 | option is used internally for package operations and changing it is not 36 | recommended. 
37 | } 38 | 39 | -------------------------------------------------------------------------------- /man/assert_data_version.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/dataversion.R 3 | \name{assert_data_version} 4 | \alias{assert_data_version} 5 | \title{Assert that a data version in a data package matches an expectation.} 6 | \usage{ 7 | assert_data_version( 8 | data_package_name = NULL, 9 | version_string = NULL, 10 | acceptable = "equal", 11 | ... 12 | ) 13 | } 14 | \arguments{ 15 | \item{data_package_name}{\code{character} Name of the package.} 16 | 17 | \item{version_string}{\code{character} Version string in "x.y.z" format.} 18 | 19 | \item{acceptable}{\code{character} one of "equal", "equal_or_greater", describing what version match is acceptable.} 20 | 21 | \item{...}{additional arguments passed to data_version (such as lib.loc)} 22 | } 23 | \value{ 24 | invisible \code{logical} TRUE if success, otherwise stop on mismatch. 25 | } 26 | \description{ 27 | Assert that a data version in a data package matches an expectation. 28 | } 29 | \details{ 30 | Tests the DataVersion string in \code{data_package_name} against \code{version_string} testing the major, minor and revision portion. 31 | 32 | Tests "data_package_name version equal version_string" or "data_package_name version equal_or_greater version_string". 
33 | } 34 | \examples{ 35 | if(rmarkdown::pandoc_available()){ 36 | f <- tempdir() 37 | f <- file.path(f, "foo.Rmd") 38 | con <- file(f) 39 | writeLines("```{r}\n vec = 1:10 \n```\n",con = con) 40 | close(con) 41 | pname <- basename(tempfile()) 42 | datapackage_skeleton(name = pname, 43 | path=tempdir(), 44 | force = TRUE, 45 | r_object_names = "vec", 46 | code_files = f) 47 | package_build(file.path(tempdir(),pname), install = FALSE) 48 | 49 | pkgload::load_all(file.path(tempdir(),pname)) 50 | 51 | assert_data_version(data_package_name = pname,version_string = "0.1.0",acceptable = "equal") 52 | } 53 | } 54 | -------------------------------------------------------------------------------- /man/construct_yml_config.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/yamlR.R 3 | \name{construct_yml_config} 4 | \alias{construct_yml_config} 5 | \title{Construct a datapackager.yml configuration} 6 | \usage{ 7 | construct_yml_config(code = NULL, data = NULL, render_root = NULL) 8 | } 9 | \arguments{ 10 | \item{code}{A vector of filenames} 11 | 12 | \item{data}{A vector of quoted object names} 13 | 14 | \item{render_root}{The root directory where the package data processing code will be rendered. 15 | Defaults to a randomly generated named subdirectory of \code{tempdir()}.} 16 | } 17 | \value{ 18 | a datapackager.yml configuration represented as an R object 19 | } 20 | \description{ 21 | Constructs a datapackager.yml configuration object from a vector of file names and a vector of object names (all quoted). 22 | Can be written to disk via \code{yml_write}. 23 | \code{render_root} is set to a randomly generated named subdirectory of \code{tempdir()}. 
24 | } 25 | \examples{ 26 | conf <- construct_yml_config(code = c('file1.rmd','file2.rmd'), data=c('object1','object2')) 27 | tmp <- normalizePath(tempdir(), winslash = "/") 28 | yml_write(conf,path=tmp) 29 | } 30 | -------------------------------------------------------------------------------- /man/data_version.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/dataversion.R 3 | \name{data_version} 4 | \alias{data_version} 5 | \title{Get the DataVersion for a package} 6 | \usage{ 7 | data_version(pkg, lib.loc = NULL) 8 | } 9 | \arguments{ 10 | \item{pkg}{\code{character} the package name} 11 | 12 | \item{lib.loc}{\code{character} path to library location.} 13 | } 14 | \value{ 15 | Object of class 'package_version' and 'numeric_version' specifying the DataVersion of the package 16 | } 17 | \description{ 18 | Retrieves the DataVersion of a package if available 19 | } 20 | \examples{ 21 | if(rmarkdown::pandoc_available()){ 22 | f <- tempdir() 23 | f <- file.path(f,"foo.Rmd") 24 | con <- file(f) 25 | writeLines("```{r}\n vec = 1:10 \n```\n",con=con) 26 | close(con) 27 | pname <- basename(tempfile()) 28 | datapackage_skeleton(name = pname, 29 | path=tempdir(), 30 | force = TRUE, 31 | r_object_names = "vec", 32 | code_files = f) 33 | 34 | package_build(file.path(tempdir(),pname), install = FALSE) 35 | 36 | pkgload::load_all(file.path(tempdir(),pname)) 37 | data_version(pname) 38 | } 39 | } 40 | \seealso{ 41 | \code{\link[utils]{packageVersion}} 42 | } 43 | -------------------------------------------------------------------------------- /man/datapackage_skeleton.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/skeleton.R 3 | \name{datapackage_skeleton} 4 | \alias{datapackage_skeleton} 5 | \title{Create a Data Package skeleton for 
use with DataPackageR.} 6 | \usage{ 7 | datapackage_skeleton( 8 | name = NULL, 9 | path = ".", 10 | force = FALSE, 11 | code_files = character(), 12 | r_object_names = character(), 13 | raw_data_dir = character(), 14 | dependencies = character() 15 | ) 16 | } 17 | \arguments{ 18 | \item{name}{\code{character} name of the package to create.} 19 | 20 | \item{path}{A \code{character} path where the package is located. See \code{\link[utils]{package.skeleton}}} 21 | 22 | \item{force}{\code{logical} Force the package skeleton to be recreated even if it exists. see \code{\link[utils]{package.skeleton}}} 23 | 24 | \item{code_files}{Optional \code{character} vector of paths to Rmd files that process raw data 25 | into R objects.} 26 | 27 | \item{r_object_names}{\code{vector} of quoted r object names , tables, etc. created when the files in \code{code_files} are run.} 28 | 29 | \item{raw_data_dir}{\code{character} pointing to a raw data directory. Will be moved with all its subdirectories to "inst/extdata"} 30 | 31 | \item{dependencies}{\code{vector} of \code{character}, paths to R files that will be moved to "data-raw" but not included in the yaml config file. e.g., dependency scripts.} 32 | } 33 | \value{ 34 | No return value, called for side effects 35 | } 36 | \description{ 37 | Creates a package skeleton directory structure for use with DataPackageR. 38 | Adds the DataVersion string to DESCRIPTION, creates the DATADIGEST file, and the data-raw directory. 39 | Updates the Read-and-delete-me file to reflect the additional necessary steps. 
40 | } 41 | \examples{ 42 | if(rmarkdown::pandoc_available()){ 43 | f <- tempdir() 44 | f <- file.path(f,"foo.Rmd") 45 | con <- file(f) 46 | writeLines("```{r}\n tbl = data.frame(1:10) \n```\n",con=con) 47 | close(con) 48 | pname <- basename(tempfile()) 49 | datapackage_skeleton(name = pname, 50 | path = tempdir(), 51 | force = TRUE, 52 | r_object_names = "tbl", 53 | code_files = f) 54 | } 55 | } 56 | -------------------------------------------------------------------------------- /man/datapackager_object_read.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/environments.R 3 | \name{datapackager_object_read} 4 | \alias{datapackager_object_read} 5 | \title{Read an object created in a previously run processing script.} 6 | \usage{ 7 | datapackager_object_read(name) 8 | } 9 | \arguments{ 10 | \item{name}{\code{character} the name of the object. Must be a 11 | name available in the configuration objects. Other objects are not saved.} 12 | } 13 | \value{ 14 | An R object. 15 | } 16 | \description{ 17 | Read an object created in a previously run processing script. 18 | } 19 | \details{ 20 | This function is only accessible within an R or Rmd file processed by DataPackageR. 21 | It searches for an environment named \code{ENVS} within the current environment, 22 | that holds the object with the given \code{name}. Such an environment is constructed and populated 23 | with objects specified in the yaml \code{objects} property and passed along 24 | to subsequent R and Rmd files as DataPackageR processes them in order. 25 | } 26 | \examples{ 27 | \donttest{ 28 | if(rmarkdown::pandoc_available()){ 29 | ENVS <- new.env() # ENVS would be in the environment 30 | # where the data processing is run. It is 31 | # handled automatically by the package. 
32 | assign("find_me", 100, ENVS) #This is done automatically by DataPackageR 33 | 34 | find_me <- datapackager_object_read("find_me") # This would appear in an Rmd processed by 35 | # DataPackageR to access the object named "find_me" created 36 | # by a previous script. "find_me" would also need to 37 | # appear in the objects property of datapackager.yml 38 | } 39 | } 40 | } 41 | -------------------------------------------------------------------------------- /man/document.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/processData.R 3 | \name{document} 4 | \alias{document} 5 | \title{Build documentation for a data package using DataPackageR.} 6 | \usage{ 7 | document(path = ".", install = FALSE, ...) 8 | } 9 | \arguments{ 10 | \item{path}{\code{character} the path to the data package source root.} 11 | 12 | \item{install}{\code{logical} install the package. (default FALSE)} 13 | 14 | \item{...}{additional arguments to \code{install}} 15 | } 16 | \value{ 17 | Called for side effects. Returns TRUE on successful exit. 18 | } 19 | \description{ 20 | Build documentation for a data package using DataPackageR. 21 | } 22 | \examples{ 23 | # A simple Rmd file that creates one data object 24 | # named "tbl". 25 | if(rmarkdown::pandoc_available()){ 26 | f <- tempdir() 27 | f <- file.path(f,"foo.Rmd") 28 | con <- file(f) 29 | writeLines("```{r}\n tbl = data.frame(1:10) \n```\n",con=con) 30 | close(con) 31 | \donttest{ 32 | # construct a data package skeleton named "MyDataPackage" and pass 33 | # in the Rmd file name with full path, and the name of the object(s) it 34 | # creates. 35 | 36 | pname <- basename(tempfile()) 37 | datapackage_skeleton(name=pname, 38 | path=tempdir(), 39 | force = TRUE, 40 | r_object_names = "tbl", 41 | code_files = f) 42 | 43 | # call package_build to run the "foo.Rmd" processing and 44 | # build a data package. 
45 | package_build(file.path(tempdir(), pname), install = FALSE) 46 | document(path = file.path(tempdir(), pname), install = FALSE) 47 | } 48 | } 49 | } 50 | -------------------------------------------------------------------------------- /man/package_build.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/build.R 3 | \name{package_build} 4 | \alias{package_build} 5 | \title{Pre-process, document and build a data package} 6 | \usage{ 7 | package_build( 8 | packageName = NULL, 9 | vignettes = FALSE, 10 | log = INFO, 11 | deps = TRUE, 12 | install = FALSE, 13 | ... 14 | ) 15 | } 16 | \arguments{ 17 | \item{packageName}{\code{character} path to package source directory. Defaults to the current path when NULL.} 18 | 19 | \item{vignettes}{\code{logical} specify whether to build vignettes. Default FALSE.} 20 | 21 | \item{log}{log level \code{INFO,WARN,DEBUG,FATAL}} 22 | 23 | \item{deps}{\code{logical} should we pass data objects into subsequent scripts? Default TRUE} 24 | 25 | \item{install}{\code{logical} automatically install and load the package after building. Default FALSE} 26 | 27 | \item{...}{additional arguments passed to \code{install.packages} when \code{install=TRUE}.} 28 | } 29 | \value{ 30 | Character vector. File path of the built package. 31 | } 32 | \description{ 33 | Combines the preprocessing, documentation, and build steps into one. 34 | } 35 | \details{ 36 | Note that if \code{package_build} returns an error when rendering an \code{.Rmd} 37 | internally, but that same \code{.Rmd} can be run successfully manually using \code{rmarkdown::render}, 38 | then the following code facilitates debugging. Set \code{options(error = function(){ sink(); recover()})} 39 | before running \code{package_build} . 
This will enable examination of the active function calls at the time of the error, 40 | with output printed to the console rather than \code{knitr}'s default sink. 41 | After debugging, evaluate \code{options(error = NULL)} to revert to default error handling. 42 | See section "22.5.3 RMarkdown" at \url{ https://adv-r.hadley.nz/debugging.html} for more details. 43 | } 44 | \examples{ 45 | if(rmarkdown::pandoc_available()){ 46 | f <- tempdir() 47 | f <- file.path(f,"foo.Rmd") 48 | con <- file(f) 49 | writeLines("```{r}\n tbl = data.frame(1:10) \n```\n",con=con) 50 | close(con) 51 | pname <- basename(tempfile()) 52 | datapackage_skeleton(name=pname, 53 | path=tempdir(), 54 | force = TRUE, 55 | r_object_names = "tbl", 56 | code_files = f) 57 | 58 | package_build(file.path(tempdir(),pname), install = FALSE) 59 | } 60 | } 61 | -------------------------------------------------------------------------------- /man/project_data_path.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/processData.R 3 | \name{project_data_path} 4 | \alias{project_data_path} 5 | \title{Get DataPackageR data path} 6 | \usage{ 7 | project_data_path(file = NULL) 8 | } 9 | \arguments{ 10 | \item{file}{\code{character} or \code{NULL} (default).} 11 | } 12 | \value{ 13 | \code{character} 14 | } 15 | \description{ 16 | Get DataPackageR data path 17 | } 18 | \details{ 19 | Returns the path to the data package data subdirectory, or 20 | constructs a path to a file in the data subdirectory from the 21 | file argument. 
22 | } 23 | \examples{ 24 | if(rmarkdown::pandoc_available()){ 25 | project_data_path( file = "data.rda" ) 26 | } 27 | } 28 | -------------------------------------------------------------------------------- /man/project_extdata_path.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/processData.R 3 | \name{project_extdata_path} 4 | \alias{project_extdata_path} 5 | \title{Get DataPackageR extdata path} 6 | \usage{ 7 | project_extdata_path(file = NULL) 8 | } 9 | \arguments{ 10 | \item{file}{\code{character} or \code{NULL} (default).} 11 | } 12 | \value{ 13 | \code{character} 14 | } 15 | \description{ 16 | Get DataPackageR extdata path 17 | } 18 | \details{ 19 | Returns the path to the data package extdata subdirectory, or 20 | constructs a path to a file in the extdata subdirectory from the 21 | file argument. 22 | } 23 | \examples{ 24 | if(rmarkdown::pandoc_available()){ 25 | project_extdata_path(file = "mydata.csv") 26 | } 27 | } 28 | -------------------------------------------------------------------------------- /man/project_path.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/processData.R 3 | \name{project_path} 4 | \alias{project_path} 5 | \title{Get DataPackageR Project Root Path} 6 | \usage{ 7 | project_path(file = NULL) 8 | } 9 | \arguments{ 10 | \item{file}{\code{character} or \code{NULL} (default).} 11 | } 12 | \value{ 13 | \code{character} 14 | } 15 | \description{ 16 | Get DataPackageR Project Root Path 17 | } 18 | \details{ 19 | Returns the path to the data package project root, or 20 | constructs a path to a file in the project root from the 21 | file argument. 
22 | } 23 | \examples{ 24 | if(rmarkdown::pandoc_available()){ 25 | project_path( file = "DESCRIPTION" ) 26 | } 27 | } 28 | -------------------------------------------------------------------------------- /man/use_data_object.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/use.R 3 | \name{use_data_object} 4 | \alias{use_data_object} 5 | \title{Add a data object to a data package.} 6 | \usage{ 7 | use_data_object(object_name = NULL) 8 | } 9 | \arguments{ 10 | \item{object_name}{Name of the data object. Should be created by a processing script in data-raw. \code{character} vector of length 1.} 11 | } 12 | \value{ 13 | invisibly returns TRUE for success. 14 | } 15 | \description{ 16 | The data object will be added to the yml configuration file. 17 | } 18 | \examples{ 19 | if(rmarkdown::pandoc_available()){ 20 | myfile <- tempfile() 21 | file <- system.file("extdata", "tests", "extra.Rmd", 22 | package = "DataPackageR") 23 | datapackage_skeleton( 24 | name = "datatest", 25 | path = tempdir(), 26 | code_files = file, 27 | force = TRUE, 28 | r_object_names = "data") 29 | use_data_object(object_name = "newobject") 30 | } 31 | 32 | } 33 | -------------------------------------------------------------------------------- /man/use_ignore.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/ignore.R 3 | \name{use_ignore} 4 | \alias{use_ignore} 5 | \title{Ignore specific files by git and R build.} 6 | \usage{ 7 | use_ignore(file = NULL, path = NULL) 8 | } 9 | \arguments{ 10 | \item{file}{\code{character} File to ignore.} 11 | 12 | \item{path}{\code{character} Path to the file.} 13 | } 14 | \value{ 15 | invisibly returns 0. 16 | } 17 | \description{ 18 | Ignore specific files by git and R build. 
19 | } 20 | \examples{ 21 | datapackage_skeleton(name="test",path = tempdir()) 22 | use_ignore("foo", ".") 23 | } 24 | -------------------------------------------------------------------------------- /man/use_processing_script.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/use.R 3 | \name{use_processing_script} 4 | \alias{use_processing_script} 5 | \title{Add a processing script to a data package.} 6 | \usage{ 7 | use_processing_script( 8 | file = NULL, 9 | title = NULL, 10 | author = NULL, 11 | overwrite = FALSE 12 | ) 13 | } 14 | \arguments{ 15 | \item{file}{\code{character} path to an existing file or name of a new R or Rmd file to create.} 16 | 17 | \item{title}{\code{character} title of the processing script for the yaml header. Used only if file is being created.} 18 | 19 | \item{author}{\code{character} author name for the yaml header. Used only if the file is being created.} 20 | 21 | \item{overwrite}{\code{logical} default FALSE. Overwrite existing file of the same name.} 22 | } 23 | \value{ 24 | invisibly returns TRUE for success. Stops on failure. 25 | } 26 | \description{ 27 | The Rmd or R file or directory specified by \code{file} will be moved into 28 | the data-raw directory. It will also be added to the yml configuration file. 29 | Any existing file by that name will be overwritten when overwrite is set to TRUE 30 | } 31 | \examples{ 32 | if(rmarkdown::pandoc_available()){ 33 | myfile <- tempfile() 34 | file <- system.file("extdata", "tests", "extra.Rmd", 35 | package = "DataPackageR") 36 | datapackage_skeleton( 37 | name = "datatest", 38 | path = tempdir(), 39 | code_files = file, 40 | force = TRUE, 41 | r_object_names = "data") 42 | use_processing_script(file = "newScript.Rmd", 43 | title = "Processing a new dataset", 44 | author = "Y.N. 
Here.") 45 | } 46 | } 47 | -------------------------------------------------------------------------------- /man/use_raw_dataset.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/use.R 3 | \name{use_raw_dataset} 4 | \alias{use_raw_dataset} 5 | \title{Add a raw data set to inst/extdata} 6 | \usage{ 7 | use_raw_dataset(path = NULL, ignore = FALSE) 8 | } 9 | \arguments{ 10 | \item{path}{\code{character} path to file or directory.} 11 | 12 | \item{ignore}{\code{logical} whether to ignore the path or file in git and R build.} 13 | } 14 | \value{ 15 | invisibly returns TRUE for success. Stops on failure. 16 | } 17 | \description{ 18 | The file or directory specified by \code{path} will be moved into 19 | the inst/extdata directory. 20 | } 21 | \examples{ 22 | if(rmarkdown::pandoc_available()){ 23 | myfile <- tempfile() 24 | file <- system.file("extdata", "tests", "extra.Rmd", 25 | package = "DataPackageR") 26 | raw_data <- system.file("extdata", "tests", "raw_data", 27 | package = "DataPackageR") 28 | datapackage_skeleton( 29 | name = "datatest", 30 | path = tempdir(), 31 | code_files = file, 32 | force = TRUE, 33 | r_object_names = "data") 34 | use_raw_dataset(raw_data) 35 | } 36 | } 37 | -------------------------------------------------------------------------------- /man/yaml.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/yamlR.R 3 | \name{yml_find} 4 | \alias{yml_find} 5 | \alias{yml_add_files} 6 | \alias{yml_disable_compile} 7 | \alias{yml_enable_compile} 8 | \alias{yml_add_objects} 9 | \alias{yml_list_objects} 10 | \alias{yml_list_files} 11 | \alias{yml_remove_objects} 12 | \alias{yml_remove_files} 13 | \alias{yml_write} 14 | \title{Edit DataPackageR yaml configuration} 15 | \usage{ 16 | yml_find(path) 17 | 18 | 
yml_add_files(config, filenames) 19 | 20 | yml_disable_compile(config, filenames) 21 | 22 | yml_enable_compile(config, filenames) 23 | 24 | yml_add_objects(config, objects) 25 | 26 | yml_list_objects(config) 27 | 28 | yml_list_files(config) 29 | 30 | yml_remove_objects(config, objects) 31 | 32 | yml_remove_files(config, filenames) 33 | 34 | yml_write(config, path = NULL) 35 | } 36 | \arguments{ 37 | \item{path}{Path to the data package source or path to write config file (for \code{yml_write})} 38 | 39 | \item{config}{an R representation of the datapackager.yml config, returned by yml_find, or a path to the package root.} 40 | 41 | \item{filenames}{A vector of filenames.} 42 | 43 | \item{objects}{A vector of R object names.} 44 | } 45 | \value{ 46 | A yaml configuration structured as an R nested list. 47 | } 48 | \description{ 49 | Edit a yaml configuration file via an API. 50 | } 51 | \details{ 52 | Add, remove files and objects, enable or disable parsing of specific files, list objects or files in a yaml config, or write a config back to a package. 
53 | } 54 | \examples{ 55 | if(rmarkdown::pandoc_available()){ 56 | f <- tempdir() 57 | f <- file.path(f,"foo.Rmd") 58 | con <- file(f) 59 | writeLines("```{r}\n vec = 1:10\n```\n",con=con) 60 | close(con) 61 | pname <- basename(tempfile()) 62 | datapackage_skeleton(name=pname, 63 | path = tempdir(), 64 | force = TRUE, 65 | r_object_names = "vec", 66 | code_files = f) 67 | yml <- yml_find(file.path(tempdir(),pname)) 68 | yml <- yml_add_files(yml,"foo.Rmd") 69 | yml_list_files(yml) 70 | yml <- yml_disable_compile(yml,"foo.Rmd") 71 | yml <- yml_enable_compile(yml,"foo.Rmd") 72 | yml <- yml_add_objects(yml,"data1") 73 | yml_list_objects(yml) 74 | yml <- yml_remove_objects(yml,"data1") 75 | yml <- yml_remove_files(yml,"foo.Rmd") 76 | } 77 | } 78 | -------------------------------------------------------------------------------- /revdep/.gitignore: -------------------------------------------------------------------------------- 1 | checks 2 | library 3 | checks.noindex 4 | library.noindex 5 | cloud.noindex 6 | data.sqlite 7 | *.html 8 | -------------------------------------------------------------------------------- /tests/spelling.R: -------------------------------------------------------------------------------- 1 | if (requireNamespace("spelling", quietly = TRUE)) { 2 | spelling::spell_check_test(vignettes = TRUE, 3 | error = FALSE, skip_on_cran = TRUE) 4 | } 5 | -------------------------------------------------------------------------------- /tests/testthat.R: -------------------------------------------------------------------------------- 1 | library(testthat) 2 | library(DataPackageR) 3 | # Test only if pandoc is available. 
4 | if (rmarkdown::pandoc_available()) { 5 | test_check("DataPackageR") 6 | } 7 | -------------------------------------------------------------------------------- /tests/testthat/setup.R: -------------------------------------------------------------------------------- 1 | # use these options for testthat tests, interactively or non-interactively, 2 | # and restore previously set options when tests are finished 3 | # https://testthat.r-lib.org/articles/special-files.html 4 | withr::local_options( 5 | list( 6 | DataPackageR_interact = FALSE, 7 | DataPackageR_packagebuilding = FALSE, 8 | DataPackageR_verbose = FALSE 9 | ), 10 | .local_envir = teardown_env() 11 | ) 12 | -------------------------------------------------------------------------------- /tests/testthat/test-DataPackageR.R: -------------------------------------------------------------------------------- 1 | test_that("Error on data object same name as data package", { 2 | file <- system.file("extdata", "tests", "extra.Rmd", package = "DataPackageR") 3 | td <- withr::local_tempdir() 4 | pp <- 'pressure' 5 | datapackage_skeleton(name = pp, path = td, 6 | code_files = file, r_object_names = pp) 7 | err_msg <- "Data object not allowed to have same name as data package" 8 | expect_error(package_build(file.path(td, pp)), err_msg) 9 | }) 10 | -------------------------------------------------------------------------------- /tests/testthat/test-build-locations.R: -------------------------------------------------------------------------------- 1 | context("building packages") 2 | 3 | test_that("package can be built from different locations", { 4 | file <- system.file("extdata", "tests", "subsetCars.Rmd", 5 | package = "DataPackageR" 6 | ) 7 | datapackage_skeleton( 8 | name = "subsetCars", 9 | path = tempdir(), 10 | code_files = c(file), 11 | force = TRUE, 12 | r_object_names = "cars_over_20" 13 | ) 14 | expect_equal( 15 | basename( 16 | package_build( 17 | file.path(tempdir(), "subsetCars") 18 | ) 19 | ), 20 | 
"subsetCars_1.0.tar.gz" 21 | ) 22 | 23 | old <- 24 | setwd(file.path(tempdir(), "subsetCars")) # nolint 25 | on.exit(setwd(old)) # nolint 26 | expect_equal(basename(package_build(".")), "subsetCars_1.0.tar.gz") 27 | suppressWarnings(expect_error(package_build("subsetCars"))) 28 | 29 | unlink(file.path(tempdir(), "subsetCars"), 30 | recursive = TRUE, 31 | force = TRUE 32 | ) 33 | }) 34 | 35 | test_that("Error on data pkg dirname different from data pkg name", { 36 | td <- withr::local_tempdir() 37 | sn <- 'skelname' 38 | not_sn <- paste0('not_', sn) 39 | datapackage_skeleton(name = sn, path = td) 40 | file.rename(from = file.path(td, sn), to = file.path(td, not_sn)) 41 | err_msg <- paste("Data package name in DESCRIPTION does not match", 42 | "name of the data package directory") 43 | expect_error(package_build(file.path(td, not_sn)), err_msg) 44 | }) 45 | 46 | test_that("properly handle relative render_root path from yaml config", { 47 | # A lightly modified version of Jason's reprex 48 | withr::with_tempdir({ 49 | datapackage_skeleton("new") 50 | 51 | utils::write.csv(data.frame(x=1:10), 52 | file.path('new', 'inst', 'extdata', 'ext.csv'), 53 | row.names=F) 54 | 55 | x <- "x <- read.csv(file.path('inst', 'extdata', 'ext.csv'))" 56 | writeLines(x, file.path('new', 'data-raw', 'x.R')) 57 | 58 | config <- yml_add_files("new", "x.R") 59 | config <- yml_add_objects(config, "x") 60 | config <- yml_write(config, "new") 61 | 62 | yml <- yaml::read_yaml(file.path("new", "datapackager.yml")) 63 | yml$configuration$render_root$tmp <- NULL 64 | yml$configuration$render_root <- "./" 65 | yaml::write_yaml(yml, file.path("new", "datapackager.yml")) 66 | 67 | expect_error(package_build()) 68 | 69 | withr::with_dir('new', { 70 | expect_no_error(package_build()) 71 | }) 72 | }) 73 | }) 74 | -------------------------------------------------------------------------------- /tests/testthat/test-conditional-build.R: 
-------------------------------------------------------------------------------- 1 | 2 | context("conditional build") 3 | test_that("can add a data item", { 4 | file <- system.file("extdata", "tests", "subsetCars.Rmd", 5 | package = "DataPackageR" 6 | ) 7 | file2 <- system.file("extdata", "tests", "extra.Rmd", 8 | package = "DataPackageR" 9 | ) 10 | expect_null( 11 | datapackage_skeleton( 12 | name = "subsetCars", 13 | path = tempdir(), 14 | code_files = c(file, file2), 15 | force = TRUE, 16 | r_object_names = c("cars_over_20") 17 | ) 18 | ) 19 | package_build(file.path(tempdir(), "subsetCars")) 20 | expect_equal( 21 | list.files(file.path(tempdir(), "subsetCars", "data")), 22 | "cars_over_20.rda" 23 | ) 24 | expect_true(all( 25 | c("subsetCars", "cars_over_20") %in% 26 | names(DataPackageR:::.doc_parse( 27 | list.files(file.path(tempdir(), "subsetCars", "R"), 28 | full.names = TRUE 29 | ) 30 | )) 31 | )) 32 | unlink(file.path(tempdir(), "subsetCars"), 33 | recursive = TRUE, 34 | force = TRUE 35 | ) 36 | setwd(tempdir()) 37 | try(usethis::proj_set(NULL),silent = TRUE) #wrap in try for usethis 1.4 vs 1.5 38 | }) 39 | -------------------------------------------------------------------------------- /tests/testthat/test-data-name-change.R: -------------------------------------------------------------------------------- 1 | test_that("data object can be renamed", { 2 | 3 | addData <- function(dataset, pname){ 4 | fil <- sprintf("data(%s, envir=environment())", dataset) 5 | writeLines(fil, file.path(tempdir(), sprintf("%s/data-raw/%s.R", pname, dataset))) 6 | 7 | yml <- yml_add_files(file.path(tempdir(), pname), c(sprintf("%s.R", dataset))) 8 | yml <- yml_add_objects(yml, dataset) 9 | yml_write(yml) 10 | 11 | package_build(file.path(tempdir(), pname)) 12 | } 13 | 14 | changeName <- function(old_dataset_name, new_dataset_name, pname){ 15 | process_path <- file.path(tempdir(), sprintf("%s/data-raw/%s.R", pname, old_dataset_name)) 16 | fil <- c(readLines(process_path), 
sprintf("%s <- %s", new_dataset_name, old_dataset_name)) 17 | writeLines(fil, process_path) 18 | 19 | yml <- yml_remove_objects(file.path(tempdir(), pname), old_dataset_name) 20 | yml <- yml_add_objects(yml, new_dataset_name) 21 | yml_write(yml) 22 | 23 | package_build(file.path(tempdir(), pname)) 24 | } 25 | 26 | removeName <- function(dataset_name, script, pname){ 27 | process_path <- file.path(tempdir(), sprintf("%s/data-raw/%s", pname, script)) 28 | fil <- gsub(paste0("^", dataset_name, ".+$"), "", readLines(process_path)) 29 | writeLines(fil, process_path) 30 | 31 | yml <- yml_remove_objects(file.path(tempdir(), pname), dataset_name) 32 | yml_write(yml) 33 | 34 | package_build(file.path(tempdir(), pname)) 35 | } 36 | 37 | ## test change when one object is present 38 | pname <- "nameChangeTest1" 39 | datapackage_skeleton(pname, tempdir(), force = TRUE) 40 | addData("mtcars", pname) 41 | expect_no_error(changeName("mtcars", "mtcars2", pname)) 42 | expect_error(removeName("mtcars2", "mtcars.R", pname), "exiting") 43 | 44 | ## test change when two objects are present 45 | pname <- "nameChangeTest2" 46 | datapackage_skeleton(pname, tempdir(), force = TRUE) 47 | addData("mtcars", pname) 48 | addData("iris", pname) 49 | expect_no_error(changeName("mtcars", "mtcars2", pname)) 50 | expect_no_error(removeName("mtcars2", "mtcars.R", pname)) 51 | 52 | ## test change when more than 2 objects are present 53 | pname <- "nameChangeTest3" 54 | datapackage_skeleton(pname, tempdir(), force = TRUE) 55 | addData("mtcars", pname) 56 | addData("iris", pname) 57 | addData("ToothGrowth", pname) 58 | expect_no_error(changeName("mtcars", "mtcars2", pname)) 59 | expect_no_error(removeName("mtcars2", "mtcars.R", pname)) 60 | 61 | }) 62 | -------------------------------------------------------------------------------- /tests/testthat/test-data-version.R: -------------------------------------------------------------------------------- 1 | 2 | context("data version strings") 3 | 
test_that("assert_data_version", { 4 | f <- tempdir() 5 | f <- file.path(f, "foo.Rmd") 6 | con <- file(f) 7 | writeLines( 8 | c("---", 9 | 'title: "foo"', 10 | "---", 11 | "", 12 | "```{r}", 13 | "tbl = table(sample(1:10,1000,replace=TRUE))", 14 | "```" 15 | ), 16 | con = con 17 | ) 18 | close(con) 19 | pname <- basename(tempfile()) 20 | datapackage_skeleton( 21 | name = pname, 22 | path = normalizePath(tempdir()), 23 | force = TRUE, 24 | r_object_names = "tbl", 25 | code_files = f 26 | ) 27 | package_build(file.path(tempdir(), pname)) 28 | on.exit(pkgload::unload(pname)) 29 | pkgload::load_all(file.path(tempdir(), pname)) 30 | suppressWarnings(expect_true( 31 | data_version(pkg = pname) == numeric_version("0.1.0") 32 | )) 33 | expect_true( 34 | assert_data_version( 35 | data_package_name = pname, 36 | version_string = "0.1.0", 37 | acceptable = "equal" 38 | ) 39 | ) 40 | expect_true( 41 | assert_data_version( 42 | data_package_name = pname, 43 | version_string = "0.1.0", 44 | acceptable = "equal_or_greater" 45 | ) 46 | ) 47 | expect_true( 48 | assert_data_version( 49 | data_package_name = pname, 50 | version_string = "0.0.0", 51 | acceptable = "equal_or_greater" 52 | ) 53 | ) 54 | expect_true( 55 | assert_data_version( 56 | data_package_name = pname, 57 | version_string = "0.0.11", 58 | acceptable = "equal_or_greater" 59 | ) 60 | ) 61 | expect_error( 62 | assert_data_version( 63 | data_package_name = pname, 64 | version_string = "1.0.0", 65 | acceptable = "equal_or_greater" 66 | ) 67 | ) 68 | expect_error( 69 | assert_data_version( 70 | data_package_name = pname, 71 | version_string = "1.1.0", 72 | acceptable = "equal_or_greater" 73 | ) 74 | ) 75 | expect_error( 76 | assert_data_version( 77 | data_package_name = pname, 78 | version_string = "0.1.1", 79 | acceptable = "equal_or_greater" 80 | ) 81 | ) 82 | expect_error( 83 | assert_data_version( 84 | data_package_name = pname, 85 | version_string = "0.1.1", 86 | acceptable = "equal" 87 | ) 88 | ) 89 | expect_error( 
90 | assert_data_version( 91 | data_package_name = pname, 92 | version_string = "1.0.0", 93 | acceptable = "equal" 94 | ) 95 | ) 96 | expect_error( 97 | assert_data_version( 98 | data_package_name = pname, 99 | version_string = "1.1.0", 100 | acceptable = "equal" 101 | ) 102 | ) 103 | expect_error( 104 | assert_data_version( 105 | data_package_name = pname, 106 | version_string = "0.2.0", 107 | acceptable = "equal_or_greater" 108 | ) 109 | ) 110 | expect_true( 111 | assert_data_version( 112 | data_package_name = pname, 113 | version_string = "0.0.10000001", 114 | acceptable = "equal_or_greater" 115 | ) 116 | ) 117 | }) 118 | -------------------------------------------------------------------------------- /tests/testthat/test-datapackager-object-read.R: -------------------------------------------------------------------------------- 1 | context("datapackager_object_read") 2 | test_that("data objects can be read across scripts", { 3 | # file <- system.file("extdata", "tests", "subsetCars.Rmd", 4 | # package = "DataPackageR" 5 | # ) 6 | # datapackage_skeleton( 7 | # name = "subsetCars", 8 | # path = tempdir(), 9 | # code_files = c(file), 10 | # force = TRUE, 11 | # r_object_names = "cars_over_20" 12 | # ) 13 | # 14 | ENVS <- new.env() 15 | dataenv <- new.env() 16 | assign("foo", 100, ENVS) 17 | assign("ENVS", ENVS, dataenv) 18 | assign( 19 | "datapackager_object_read", 20 | datapackager_object_read, 21 | dataenv 22 | ) 23 | expect_equal( 24 | eval( 25 | datapackager_object_read("foo"), 26 | dataenv 27 | ), 28 | 100 29 | ) 30 | # expect_true(is.character(DataPackageR::project_path())) 31 | # expect_true(is.character(DataPackageR::project_data_path())) 32 | # expect_true(is.character(DataPackageR::project_extdata_path())) 33 | }) 34 | 35 | 36 | test_that("data objects are saved incrementally in render_root", { 37 | file <- system.file("extdata", "tests", "subsetCars.Rmd", 38 | package = "DataPackageR" 39 | ) 40 | datapackage_skeleton( 41 | name = "subsetCars", 42 | path 
= tempdir(), 43 | code_files = c(file), 44 | force = TRUE, 45 | r_object_names = "cars_over_20" 46 | ) 47 | package_build( 48 | file.path(tempdir(), "subsetCars") 49 | ) 50 | expect_true(utils::file_test("-f",file.path(tempdir(),DataPackageR::yml_find(file.path(tempdir(),"subsetCars"))[["configuration"]][["render_root"]][["tmp"]],"cars_over_20.rds"))) 51 | }) 52 | 53 | 54 | test_that("data objects can be read from render_root or the data dir", { 55 | file <- system.file("extdata", "tests", "subsetCars.Rmd", 56 | package = "DataPackageR" 57 | ) 58 | datapackage_skeleton( 59 | name = "subsetCars", 60 | path = tempdir(), 61 | code_files = c(file), 62 | force = TRUE, 63 | r_object_names = "cars_over_20" 64 | ) 65 | package_build( 66 | file.path(tempdir(), "subsetCars") 67 | ) 68 | 69 | # copy the data file under a name that is absent from the temporary render_root directory, so datapackager_object_read is forced to look in the data dir 70 | file.copy(file.path(tempdir(),"subsetCars","data","cars_over_20.rda"), 71 | file.path(tempdir(), "subsetCars","data","cars_over_20_2.rda")) 72 | 73 | original<-readRDS(file.path(tempdir(),DataPackageR::yml_find(file.path(tempdir(),"subsetCars"))[["configuration"]][["render_root"]][["tmp"]],"cars_over_20.rds")) 74 | 75 | expect_identical(suppressMessages(datapackager_object_read("cars_over_20")),original) 76 | expect_identical(datapackager_object_read("cars_over_20_2"),original) 77 | 78 | # check that reading tries the ENVS environment when DataPackageR_packagebuilding is TRUE 79 | options("DataPackageR_packagebuilding" = TRUE) 80 | on.exit({options("DataPackageR_packagebuilding" = FALSE)}) 81 | 82 | expect_error(datapackager_object_read("cars_over_20")) 83 | 84 | }) 85 | -------------------------------------------------------------------------------- /tests/testthat/test-document.R: -------------------------------------------------------------------------------- 1 | context("build documentation") 2 | test_that("documentation is built via document()", { 3 | file <- system.file("extdata", "tests", 
"subsetCars.Rmd", 4 | package = "DataPackageR" 5 | ) 6 | local({ 7 | tempdir <- withr::local_tempdir() 8 | datapackage_skeleton( 9 | name = "subsetCars", 10 | path = tempdir, 11 | code_files = c(file), 12 | force = TRUE, 13 | r_object_names = "cars_over_20" 14 | ) 15 | temp_libpath <- file.path(tempdir, "lib") 16 | dir.create(temp_libpath) 17 | package_build(file.path(tempdir, "subsetCars")) 18 | expect_true(document( 19 | file.path(tempdir, "subsetCars"), 20 | lib = temp_libpath, 21 | quiet = ! getOption('DataPackageR_verbose', TRUE))) 22 | docfile <- readLines(file.path( 23 | tempdir, 24 | "subsetCars", "data-raw", "documentation.R" 25 | )) 26 | connection <- file(file.path( 27 | tempdir, 28 | "subsetCars", "data-raw", "documentation.R" 29 | ), 30 | open = "w+" 31 | ) 32 | writeLines( 33 | text = 34 | c(docfile, " 35 | #' Use roxygen to document a package. 36 | #' 37 | #' This is dummy documentation used to test markdown documentation 38 | #' for [roxygen2::roxygenize()] in the `subsetCars`` test package. 39 | #' 40 | #' @name testmarkdownroxygen 41 | #' @param none there are no parameters 42 | #' this is a link to a function: [document()] 43 | #' @seealso [DataPackageR::document()], `browseVignettes(\"subsetCars\")` 44 | #' @md 45 | NULL 46 | "), 47 | con = connection 48 | ) 49 | flush(connection) 50 | close(connection) 51 | expect_true( 52 | document(file.path(tempdir, "subsetCars"), 53 | install = TRUE, 54 | lib = temp_libpath, 55 | quiet = ! 
getOption('DataPackageR_verbose', TRUE)) 56 | ) 57 | v <- vignette(package = "subsetCars", lib.loc = temp_libpath) 58 | expect_equal(v$results[, "Item"], "subsetCars") 59 | unlink(file.path(tempdir, "subsetCars"), 60 | recursive = TRUE, 61 | force = TRUE 62 | ) 63 | try(usethis::proj_set(NULL),silent = TRUE) #wrap in try for usethis 1.4 vs 1.5 64 | }) 65 | }) 66 | 67 | -------------------------------------------------------------------------------- /tests/testthat/test-ignore.R: -------------------------------------------------------------------------------- 1 | context("ignore") 2 | 3 | test_that("use_ignore works", { 4 | file <- system.file("extdata", "tests", "subsetCars.Rmd", 5 | package = "DataPackageR" 6 | ) 7 | local({ 8 | tempdir <- withr::local_tempdir() 9 | expect_null( 10 | datapackage_skeleton( 11 | name = "subsetCars", 12 | path = tempdir, 13 | code_files = c(file), 14 | force = TRUE, 15 | r_object_names = c("cars_over_20") 16 | ) 17 | ) 18 | use_ignore(file = "mydata.csv", path = file.path("inst", "extdata")) 19 | expect_true( 20 | 'mydata.csv' %in% readLines( 21 | file.path(tempdir, 'subsetCars', 'inst', 'extdata', '.gitignore') 22 | ) 23 | ) 24 | expect_true( 25 | '^inst/extdata/mydata\\.csv$' %in% readLines( 26 | file.path(tempdir, 'subsetCars', '.Rbuildignore') 27 | ) 28 | ) 29 | expect_message(use_ignore(),"No file name provided to ignore.") 30 | }) 31 | }) 32 | -------------------------------------------------------------------------------- /tests/testthat/test-logger.R: -------------------------------------------------------------------------------- 1 | context("logger") 2 | withr::with_options(list(DataPackageR_verbose = TRUE),{ 3 | test_that(".multilog_setup", { 4 | expect_null(DataPackageR:::.multilog_setup(file.path(tempdir(), "test.log"))) 5 | }) 6 | test_that(".multilog_threshold", { 7 | expect_null(DataPackageR:::.multilog_thresold(INFO, TRACE)) 8 | }) 9 | test_that(".multilog_info", { 10 | 
expect_output(DataPackageR:::.multilog_info("message"), "INFO .* message") 11 | expect_true(utils::file_test("-f", file.path(tempdir(), "test.log"))) 12 | }) 13 | test_that(".multilog_error", { 14 | expect_output(DataPackageR:::.multilog_error("message"), "ERROR .* message") 15 | }) 16 | test_that(".multilog_trace", { 17 | expect_silent(DataPackageR:::.multilog_trace("message")) 18 | expect_true(length(grep(pattern = "TRACE", 19 | readLines(file.path(tempdir(), 20 | "test.log")))) > 0) 21 | }) 22 | test_that(".multilog_warn", { 23 | expect_output(DataPackageR:::.multilog_warn("message"), "WARN") 24 | }) 25 | test_that(".multilog_debug", { 26 | expect_silent(DataPackageR:::.multilog_debug("message")) 27 | expect_true(length(grep(pattern = "DEBUG", 28 | readLines(file.path(tempdir(), 29 | "test.log")))) > 0) 30 | }) 31 | }) 32 | -------------------------------------------------------------------------------- /tests/testthat/test-manual-version-bump.R: -------------------------------------------------------------------------------- 1 | 2 | context("version string bump") 3 | test_that("manual bump version when data unchanged", { 4 | file <- system.file("extdata", "tests", "subsetCars.Rmd", 5 | package = "DataPackageR" 6 | ) 7 | file2 <- system.file("extdata", "tests", "extra.Rmd", 8 | package = "DataPackageR" 9 | ) 10 | expect_null( 11 | datapackage_skeleton( 12 | name = "subsetCars", 13 | path = tempdir(), 14 | code_files = c(file), 15 | force = TRUE, 16 | r_object_names = c("cars_over_20") 17 | ) 18 | ) 19 | package_build(file.path(tempdir(), "subsetCars")) 20 | pkg <- desc::desc(file.path(tempdir(), "subsetCars")) 21 | pkg$set("DataVersion", "0.2.0") 22 | pkg$write() 23 | package_build(file.path(tempdir(), "subsetCars")) 24 | unlink(file.path(tempdir(), "subsetCars"), 25 | recursive = TRUE, 26 | force = TRUE 27 | ) 28 | }) 29 | -------------------------------------------------------------------------------- /tests/testthat/test-news-update.R: 
-------------------------------------------------------------------------------- 1 | context("news file") 2 | test_that("news file is created", { 3 | file <- system.file("extdata", "tests", "subsetCars.Rmd", 4 | package = "DataPackageR" 5 | ) 6 | datapackage_skeleton( 7 | name = "subsetCars", 8 | path = tempdir(), 9 | code_files = c(file), 10 | force = TRUE, 11 | r_object_names = c("cars_over_20") 12 | ) 13 | package_build(file.path(tempdir(), "subsetCars")) 14 | news_lines <- readLines(file.path(tempdir(), "subsetCars", "NEWS.md")) 15 | expect_true(sum(grepl("Package built in non-interactive mode", news_lines)) == 1) # nolint 16 | unlink(file.path(tempdir(), "subsetCars"), 17 | recursive = TRUE, 18 | force = TRUE 19 | ) 20 | expect_equal(DataPackageR:::.prompt_user_for_change_description(), "Package built in non-interactive mode") # nolint 21 | }) 22 | -------------------------------------------------------------------------------- /tests/testthat/test-phantom_loading.R: -------------------------------------------------------------------------------- 1 | testthat::test_that( 2 | "no phantom package loading from roxygenise() or associated warnings", 3 | { 4 | # test README.Rmd sequence that led to warnings 5 | processing_code <- system.file( 6 | "extdata", "tests", "subsetCars.Rmd", package = "DataPackageR" 7 | ) 8 | pkg_name <- "mtcars20" 9 | on.exit( 10 | if (pkg_name %in% names(utils::sessionInfo()$otherPkgs)){ 11 | pkgload::unload(pkg_name) 12 | } 13 | ) 14 | # remove this directory on exit 15 | temp_dir <- withr::local_tempdir() 16 | pkg_path <- file.path(temp_dir, pkg_name) 17 | 18 | datapackage_skeleton( 19 | pkg_name, force = TRUE, 20 | code_files = processing_code, 21 | r_object_names = "cars_over_20", 22 | path = temp_dir) 23 | 24 | expect_no_warning(package_build(pkg_path, install = FALSE)) 25 | # test phantom pkg loading side effect from roxygen2::roxygenise() 26 | expect_false( 27 | res1 <- pkg_name %in% names(utils::sessionInfo()$otherPkgs) 28 | ) 29 
| 30 | # reset for next test 31 | if (res1) pkgload::unload(pkg_name) 32 | 33 | # test phantom pkg loading side effect from roxygen2::roxygenise() 34 | expect_no_warning(document(pkg_path, install = FALSE)) 35 | expect_false( 36 | res2 <- pkg_name %in% names(utils::sessionInfo()$otherPkgs) 37 | ) 38 | } 39 | ) 40 | -------------------------------------------------------------------------------- /tests/testthat/test-pkg_description.R: -------------------------------------------------------------------------------- 1 | 2 | context("documentation") 3 | test_that("can_read_pkg_description, data_version", { 4 | td <- withr::local_tempdir() 5 | file <- system.file("extdata", "tests", "subsetCars.Rmd", 6 | package = "DataPackageR" 7 | ) 8 | file2 <- system.file("extdata", "tests", "extra.Rmd", 9 | package = "DataPackageR" 10 | ) 11 | datapackage_skeleton( 12 | name = "subsetCars", 13 | path = td, 14 | code_files = c(file, file2), 15 | force = TRUE, 16 | r_object_names = c("cars_over_20", "pressure") 17 | ) 18 | td_sc <- file.path(td, "subsetCars") 19 | # validate package description 20 | d <- desc::desc(td_sc) 21 | on.exit(pkgload::unload("subsetCars")) 22 | pkgload::load_all(td_sc) 23 | expected_version <- 24 | structure(list(c(0L, 1L, 0L)), 25 | class = c("package_version", "numeric_version") 26 | ) 27 | expect_equal(data_version("subsetCars"), expected_version) 28 | }) 29 | -------------------------------------------------------------------------------- /tests/testthat/test-project-path.R: -------------------------------------------------------------------------------- 1 | context("project paths") 2 | file <- system.file("extdata", "tests", "subsetCars.Rmd", 3 | package = "DataPackageR" 4 | ) 5 | datapackage_skeleton( 6 | name = "subsetCars", 7 | path = tempdir(), 8 | code_files = c(file), 9 | force = TRUE, 10 | r_object_names = c("cars_over_20") 11 | ) 12 | package_build(file.path(tempdir(), "subsetCars")) 13 | usethis::proj_set(file.path(tempdir(), "subsetCars")) 14 
| test_that("path functions throw no warning when file does not exist", { 15 | expect_no_warning(project_path('zzzZZZ333.txt')) 16 | expect_no_warning(project_extdata_path('zzzZZZ333.txt')) 17 | expect_no_warning(project_data_path('zzzZZZ333.txt')) 18 | }) 19 | test_that("project_path works with file arguments", { 20 | expect_equal(project_path("DESCRIPTION"), expected = file.path(usethis::proj_get(), "DESCRIPTION")) # nolint 21 | }) 22 | test_that("project_data_path works with file arguments", { 23 | expect_equal(project_data_path("cars_over_20.rda"), expected = file.path(usethis::proj_get(), "data", "cars_over_20.rda")) # nolint 24 | }) 25 | test_that("project_extdata_path works with file arguments", { 26 | expect_equal(project_extdata_path("Logfiles/processing.log"), expected = file.path(usethis::proj_get(), "inst", "extdata", "Logfiles", "processing.log")) # nolint 27 | }) 28 | unlink(file.path(tempdir(), "subsetCars"), 29 | recursive = TRUE, 30 | force = TRUE 31 | ) 32 | try(usethis::proj_set(path = NULL), silent = TRUE) #compatibility between usethis 1.4 and 1.5 33 | 34 | -------------------------------------------------------------------------------- /tests/testthat/test-r-processing.R: -------------------------------------------------------------------------------- 1 | context("R file script processing to vignette") 2 | test_that("R file processing works and creates vignettes", { 3 | 4 | file <- system.file("extdata", "tests", "rfileTest.R", 5 | package = "DataPackageR" 6 | ) 7 | local({ 8 | tempdir <- withr::local_tempdir() 9 | datapackage_skeleton( 10 | name = "rfiletest", 11 | path = tempdir, 12 | code_files = c(file), 13 | force = TRUE, 14 | r_object_names = "data" 15 | ) 16 | 17 | temp_libpath <- file.path(tempdir,"lib") 18 | dir.create(temp_libpath) 19 | 20 | expect_equal( 21 | basename(package_build( 22 | file.path(tempdir, "rfiletest"), 23 | install = TRUE, 24 | lib = temp_libpath, 25 | quiet = ! 
getOption('DataPackageR_verbose', TRUE) 26 | )), 27 | "rfiletest_1.0.tar.gz" 28 | ) 29 | 30 | v <- vignette(package = "rfiletest", lib.loc = temp_libpath) 31 | 32 | expect_equal(v$results[, "Item"], "rfileTest") 33 | expect_true(utils::file_test("-f", file.path(tempdir,"rfiletest","inst","doc","rfileTest.pdf"))) 34 | unlink(file.path(tempdir, "rfiletest"), 35 | recursive = TRUE, 36 | force = TRUE 37 | ) 38 | remove.packages("rfiletest", lib = temp_libpath) 39 | }) 40 | 41 | file <- system.file("extdata", "tests", "rfileTest_noheader.R", 42 | package = "DataPackageR" 43 | ) 44 | local({ 45 | tempdir <- withr::local_tempdir() 46 | datapackage_skeleton( 47 | name = "rfiletest", 48 | path = tempdir, 49 | code_files = c(file), 50 | force = TRUE, 51 | r_object_names = "data" 52 | ) 53 | temp_libpath <- file.path(tempdir,"lib") 54 | dir.create(temp_libpath) 55 | expect_equal( 56 | basename( 57 | package_build( 58 | file.path(tempdir, "rfiletest"), 59 | install = TRUE, 60 | lib = temp_libpath, 61 | quiet = ! 
getOption('DataPackageR_verbose', TRUE) 62 | ) 63 | ), 64 | "rfiletest_1.0.tar.gz" 65 | ) 66 | v <- vignette(package = "rfiletest", lib.loc = temp_libpath) 67 | expect_equal(v$results[, "Item"], "rfileTest_noheader") 68 | unlink(file.path(tempdir, "rfiletest"), 69 | recursive = TRUE, 70 | force = TRUE 71 | ) 72 | try(usethis::proj_set(NULL),silent = TRUE) #wrap in try for usethis 1.4 vs 1.5 73 | remove.packages("rfiletest",lib = temp_libpath) 74 | }) 75 | }) 76 | -------------------------------------------------------------------------------- /tests/testthat/test-skeleton-data-dependencies.R: -------------------------------------------------------------------------------- 1 | context("skeleton") 2 | test_that("data, code, and dependencies are moved into place by skeleton", { 3 | file <- system.file("extdata", "tests", "extra.Rmd", 4 | package = "DataPackageR" 5 | ) 6 | ancillary <- system.file("extdata", "tests", "rfileTest.R", 7 | package = "DataPackageR" 8 | ) 9 | raw_data <- system.file("extdata", "tests", "raw_data", 10 | package = "DataPackageR" 11 | ) 12 | expect_null( 13 | datapackage_skeleton( 14 | name = "datatest", 15 | path = tempdir(), 16 | code_files = c(file), 17 | force = TRUE, 18 | r_object_names = "data", 19 | raw_data_dir = raw_data, 20 | dependencies = ancillary 21 | ) 22 | ) 23 | expect_true( 24 | file.exists( 25 | normalizePath( 26 | file.path( 27 | tempdir(), 28 | "datatest", 29 | "inst", 30 | "extdata", 31 | "raw_data", 32 | "testdata.csv" 33 | ), 34 | winslash = "/" 35 | ) 36 | ) 37 | ) 38 | expect_true( 39 | file.exists( 40 | normalizePath( 41 | file.path( 42 | tempdir(), 43 | "datatest", 44 | "data-raw", 45 | "extra.Rmd" 46 | ), 47 | winslash = "/" 48 | ) 49 | ) 50 | ) 51 | expect_true( 52 | file.exists( 53 | normalizePath( 54 | file.path( 55 | tempdir(), 56 | "datatest", 57 | "data-raw", 58 | "rfileTest.R" 59 | ), 60 | winslash = "/" 61 | ) 62 | ) 63 | ) 64 | unlink(file.path(tempdir(), "datatest"), 65 | recursive = TRUE, 66 | force = 
TRUE 67 | ) 68 | try(usethis::proj_set(NULL),silent = TRUE) #wrap in try for usethis 1.4 vs 1.5 69 | }) 70 | -------------------------------------------------------------------------------- /tests/testthat/test-skeleton-edgecases.R: -------------------------------------------------------------------------------- 1 | context("datapackage_skeleton") 2 | test_that("datapackage_skeleton errors with no name arg", { 3 | file <- system.file("extdata", "tests", "subsetCars.Rmd", 4 | package = "DataPackageR" 5 | ) 6 | file2 <- system.file("extdata", "tests", "extra.Rmd", 7 | package = "DataPackageR" 8 | ) 9 | expect_error( 10 | datapackage_skeleton( 11 | name = NULL, 12 | path = tempdir(), 13 | code_files = c(file1, file2), 14 | force = TRUE, 15 | r_object_names = c("cars_over_20", "pressure") 16 | ) 17 | ) 18 | expect_error( 19 | datapackage_skeleton( 20 | name = "mtcars20", 21 | path = tempdir(), 22 | code_files = c(file1, file2), 23 | force = TRUE 24 | ) 25 | ) 26 | expect_null( 27 | datapackage_skeleton( 28 | name = "mtcars20", 29 | path = tempdir(), 30 | force = TRUE, 31 | r_object_names = c("cars_over_20", "pressure") 32 | ) 33 | ) 34 | }) 35 | -------------------------------------------------------------------------------- /tests/testthat/test-skeleton.R: -------------------------------------------------------------------------------- 1 | context("datapackage skeleton") 2 | 3 | test_that("datapackage skeleton builds correct structure", { 4 | file <- system.file("extdata", "tests", "subsetCars.Rmd", 5 | package = "DataPackageR" 6 | ) 7 | # normalizePath(tempdir(), winslash = "/", mustWork = TRUE) 8 | 9 | expect_null( 10 | datapackage_skeleton( 11 | name = "subsetCars", 12 | path = tempdir(), 13 | code_files = c(file), 14 | force = TRUE, 15 | r_object_names = "cars_over_20" 16 | ) 17 | ) 18 | unlink(file.path(tempdir(), "subsetCars"), 19 | recursive = TRUE, 20 | force = TRUE 21 | ) 22 | }) 23 | 
-------------------------------------------------------------------------------- /tests/testthat/test-source_r_folder_functions.R: -------------------------------------------------------------------------------- 1 | 2 | context("sourcing functions from the R folder") 3 | test_that("functions in the R folder are available to processing scripts", { 4 | 5 | 6 | library(testthat); library(DataPackageR) 7 | file <- system.file("extdata", "tests", "subsetCars.Rmd", 8 | package = "DataPackageR" 9 | ) 10 | 11 | expect_null( 12 | datapackage_skeleton( 13 | name = "testRSourcing", 14 | path = tempdir(), 15 | code_files = file, 16 | force = TRUE, 17 | r_object_names = c("cars_over_20") 18 | ) 19 | ) 20 | 21 | package_build(file.path(tempdir(), "testRSourcing")) 22 | 23 | path_rmd <- paste0( tempdir(), "/testRSourcing/data-raw" ) 24 | path_rmd <- normalizePath( path_rmd ) 25 | wd_old <- getwd() 26 | setwd(path_rmd) 27 | fileConn<-file("depRmd.Rmd") 28 | writeLines(c("---\ntitle: Rmd to test loading of /R functions\n---\n`r test_func(3)`"), fileConn) 29 | close(fileConn) 30 | setwd(wd_old) 31 | 32 | path_rmd <- normalizePath( file.path( tempdir(), "testRSourcing", "data-raw", "depRmd.Rmd" ) ) 33 | 34 | path_pkg <- normalizePath( file.path( tempdir(), "testRSourcing" ) ) 35 | yml <- yml_find( path_pkg ) 36 | yml <- yml_add_files( yml, "depRmd.Rmd" ) 37 | yml_write( yml ) 38 | expect_error( package_build(file.path(tempdir(), "testRSourcing"))) 39 | 40 | 41 | path_r <- paste0( tempdir(), "/testRSourcing/R" ) 42 | path_r <- normalizePath( path_r ) 43 | wd_old <- getwd() 44 | setwd(path_r) 45 | fileConn<-file("test_func.R") 46 | writeLines(c("test_func <- function(x) x^2"), fileConn) 47 | close(fileConn) 48 | setwd(wd_old) 49 | 50 | package_build(file.path(tempdir(), "testRSourcing")) 51 | path_rmd <- file.path( tempdir(), "testRSourcing", "inst", "doc" ) 52 | path_rmd <- normalizePath( path_rmd ) 53 | expect_true( "depRmd.html" %in% 54 | list.files( path_rmd ) ) 55 | expect_true( "depRmd.Rmd" %in% 56 | list.files( path_rmd ) ) 57 | 58 | 
unlink(file.path(tempdir(), "testRSourcing"), 59 | recursive = TRUE, 60 | force = TRUE 61 | ) 62 | setwd(tempdir()) 63 | }) 64 | -------------------------------------------------------------------------------- /tests/testthat/test-updating-datapackager-version.R: -------------------------------------------------------------------------------- 1 | context("updating datapackager API version") 2 | test_that("can update", { 3 | 4 | #setup, build example package 5 | file <- system.file("extdata", "tests", "subsetCars.Rmd", 6 | package = "DataPackageR" 7 | ) 8 | file2 <- system.file("extdata", "tests", "extra.Rmd", 9 | package = "DataPackageR" 10 | ) 11 | expect_null( 12 | datapackage_skeleton( 13 | name = "subsetCars", 14 | path = tempdir(), 15 | code_files = c(file, file2), 16 | force = TRUE, 17 | r_object_names = c("cars_over_20") 18 | ) 19 | ) 20 | package_build(file.path(tempdir(), "subsetCars")) 21 | 22 | #remove news.md and modify with the digest so it thinks there has been an update when rebuilt 23 | file.remove(file.path(tempdir(), "subsetCars", "NEWS.md")) 24 | oldDigest<-DataPackageR:::.parse_data_digest(file.path(tempdir(),"subsetCars")) 25 | oldDigest$cars_over_20<-"123456789" 26 | DataPackageR:::.save_digest(oldDigest,file.path(tempdir(),"subsetCars")) 27 | 28 | 29 | expect_no_warning(build_res <- package_build(file.path(tempdir(), "subsetCars"))) 30 | expect_identical( 31 | build_res, 32 | normalizePath(file.path(tempdir(),"subsetCars_1.0.tar.gz"),winslash = "/") 33 | )#if it passes, it returns the path to the tar file? 
34 | 35 | }) 36 | -------------------------------------------------------------------------------- /tests/testthat/test-use_raw_data.R: -------------------------------------------------------------------------------- 1 | context("test-use_raw_data") 2 | 3 | test_that("use_raw_data works as expected", { 4 | myfile <- tempfile() 5 | file <- system.file("extdata", "tests", "extra.Rmd", 6 | package = "DataPackageR" 7 | ) 8 | ancillary <- system.file("extdata", "tests", "rfileTest.R", 9 | package = "DataPackageR" 10 | ) 11 | raw_data <- system.file("extdata", "tests", "raw_data", 12 | package = "DataPackageR" 13 | ) 14 | expect_null( 15 | datapackage_skeleton( 16 | name = "subsetCars20", 17 | path = tempdir(), 18 | code_files = c(file), 19 | force = TRUE, 20 | r_object_names = "data", 21 | raw_data_dir = raw_data, 22 | dependencies = ancillary 23 | ) 24 | ) 25 | file.create(myfile) 26 | unlink( 27 | file.path(tempdir(), "subsetCars20", "inst"), 28 | force = TRUE, 29 | recursive = TRUE 30 | ) 31 | expect_error(use_raw_dataset(myfile)) 32 | dir.create(file.path( 33 | tempdir(), 34 | "subsetCars20", "inst", "extdata" 35 | ), 36 | recursive = TRUE 37 | ) 38 | expect_true(use_raw_dataset(myfile)) 39 | expect_true(use_raw_dataset(myfile, ignore = TRUE)) 40 | expect_true(utils::file_test( 41 | "-f", 42 | file.path( 43 | tempdir(), 44 | "subsetCars20", 45 | "inst", 46 | "extdata", 47 | basename(myfile) 48 | ) 49 | )) 50 | expect_error(use_raw_dataset()) 51 | expect_error(suppressWarnings(use_raw_dataset("foobar"))) 52 | expect_true(use_raw_dataset(file.path(tempdir(), "subsetCars20", "R"), ignore = TRUE)) 53 | }) 54 | 55 | 56 | test_that("use_processing_script works as expected", { 57 | myfile <- tempfile() 58 | file <- system.file("extdata", "tests", "extra.Rmd", 59 | package = "DataPackageR" 60 | ) 61 | expect_null( 62 | datapackage_skeleton( 63 | name = "subsetCars20", 64 | path = tempdir(), 65 | code_files = c(file), 66 | force = TRUE, 67 | r_object_names = "data" 68 | ) 69 
| ) 70 | expect_true(use_processing_script("newScript.Rmd")) 71 | expect_true(use_processing_script("newScript.Rmd", overwrite = TRUE)) 72 | expect_false(any(grepl("foo",readLines(normalizePath(file.path(tempdir(),"subsetCars20","data-raw","newScript.Rmd"), winslash = "/"))))) 73 | expect_true(use_processing_script("newScript.Rmd", title = "foo", overwrite = FALSE)) 74 | expect_false(any(grepl("foo",readLines(normalizePath(file.path(tempdir(),"subsetCars20","data-raw","newScript.Rmd"), winslash = "/"))))) 75 | expect_true(use_processing_script("newScript.Rmd", title = "foo", overwrite = TRUE)) 76 | expect_true(any(grepl("foo",readLines(normalizePath(file.path(tempdir(),"subsetCars20","data-raw","newScript.Rmd"), winslash = "/"))))) 77 | 78 | expect_true(use_processing_script("newScript.Rmd", 79 | title = "foo", author = "bar", overwrite = TRUE)) 80 | expect_true(use_processing_script("newScript.Rmd", author = "bar", overwrite = TRUE)) 81 | 82 | expect_true(use_processing_script("newScript.R", overwrite = TRUE)) 83 | expect_true(use_processing_script("newScript.R", title = "foo", overwrite = TRUE)) 84 | expect_true(use_processing_script("newScript.R", 85 | title = "foo", author = "bar", overwrite = TRUE)) 86 | expect_true(use_processing_script("newScript.R", author = "bar", overwrite = TRUE)) 87 | expect_equal(readLines( 88 | file.path(tempdir(), "subsetCars20", "data-raw", "newScript.Rmd") 89 | )[2], "author: bar") 90 | expect_equal(readLines( 91 | file.path(tempdir(), "subsetCars20", "data-raw", "newScript.R") 92 | )[2], "#' author: bar") 93 | expect_error(use_processing_script(file = NULL)) 94 | unlink( 95 | file.path(tempdir(), "subsetCars20", "data-raw"), 96 | force = TRUE, 97 | recursive = TRUE 98 | ) 99 | expect_error(use_processing_script(file = "newScript.R", overwrite = TRUE)) 100 | dir.create(file.path(tempdir(), "subsetCars20", "data-raw")) 101 | expect_error(use_processing_script(file = "newScript.foo", overwrite = TRUE)) 102 | 
expect_error(use_processing_script(".")) 103 | file.create(file.path(tempdir(), "foo.csv")) 104 | expect_error(use_processing_script(file.path(tempdir(), "foo.csv"), overwrite = TRUE)) 105 | file.create(file.path(tempdir(), "foo.R")) 106 | expect_true(use_processing_script(file.path(tempdir(), "foo.R"), overwrite = TRUE)) 107 | }) 108 | 109 | test_that("use_data_object works as expected", { 110 | myfile <- tempfile() 111 | file <- system.file("extdata", "tests", "extra.Rmd", 112 | package = "DataPackageR" 113 | ) 114 | expect_null( 115 | datapackage_skeleton( 116 | name = "subsetCars20", 117 | path = tempdir(), 118 | code_files = c(file), 119 | force = TRUE, 120 | r_object_names = "data" 121 | ) 122 | ) 123 | expect_true(use_data_object("newobject")) 124 | expect_error(use_data_object(object_name = NULL)) 125 | expect_error(use_data_object(object_name = 1)) 126 | expect_error(use_data_object(object_name = c("a","b"))) 127 | }) 128 | 129 | test_that(".update_header", { 130 | con <- file(file.path(tempdir(), "foo.R"), open = "wt") 131 | writeLines( 132 | text = c("#' ---", "#' title: My Title", "#' author: My Name", "#' ---"), 133 | con = con 134 | ) 135 | close(con) 136 | DataPackageR:::.update_header( 137 | file = file.path(tempdir(), "foo.R"), 138 | title = "new title", 139 | author = "new author" 140 | ) 141 | expect_equal( 142 | readLines(file.path(tempdir(), "foo.R")), 143 | c("#' ---", "#' title: new title", "#' author: new author", "#' ---") 144 | ) 145 | 146 | con <- file(file.path(tempdir(), "foo.Rmd"), open = "wt") 147 | writeLines(text = 148 | c("---", "title: My Title", "author: My Name", "---"), 149 | con = con) 150 | close(con) 151 | DataPackageR:::.update_header( 152 | file = file.path(tempdir(), "foo.Rmd"), 153 | title = "new title", 154 | author = "new author" 155 | ) 156 | expect_equal( 157 | readLines(file.path(tempdir(), "foo.Rmd")), 158 | c("---", "title: new title", "author: new author", "---") 159 | ) 160 | }) 161 | 162 | 
test_that(".partition_r_front_matter", { 163 | test_string1 <- 164 | c("#' ---\n", "#' input: in", "#' output out", "#' ---") 165 | test_string2 <- c("#' ---\n", "#' input: in", "#' output out") 166 | test_string3 <- c("#' input: in", "#' output out") 167 | test_string4 <- c("#' input: in", "#' output out", "#' ---") 168 | test_string5 <- 169 | c(" ", "#' ---\n", "#' input: in", "#' output out", "#' ---") 170 | 171 | expect_equal( 172 | DataPackageR:::.partition_r_front_matter(test_string1)$body, 173 | NULL 174 | ) 175 | expect_equal( 176 | is.null( 177 | DataPackageR:::.partition_r_front_matter(test_string1)$front_matter 178 | ), 179 | FALSE 180 | ) 181 | expect_equal( 182 | is.null( 183 | DataPackageR:::.partition_r_front_matter(test_string2)$body 184 | ), 185 | FALSE 186 | ) 187 | expect_equal( 188 | DataPackageR:::.partition_r_front_matter(test_string2)$front_matter, 189 | NULL 190 | ) 191 | 192 | expect_equal(DataPackageR:::.partition_r_front_matter( 193 | c("#' ---","author:Greg Finak","#' ---","body")), 194 | list(front_matter = c("#' ---","author:Greg Finak","#' ---"), 195 | body = "body")) 196 | 197 | expect_equal( 198 | is.null( 199 | DataPackageR:::.partition_r_front_matter(test_string3)$body 200 | ), 201 | FALSE 202 | ) 203 | expect_equal( 204 | DataPackageR:::.partition_r_front_matter(test_string3)$front_matter, 205 | NULL 206 | ) 207 | expect_equal( 208 | is.null( 209 | DataPackageR:::.partition_r_front_matter(test_string4)$body 210 | ), 211 | FALSE 212 | ) 213 | expect_equal( 214 | DataPackageR:::.partition_r_front_matter(test_string4)$front_matter, 215 | NULL 216 | ) 217 | expect_equal( 218 | DataPackageR:::.partition_r_front_matter(test_string5)$body, 219 | " " 220 | ) 221 | expect_equal( 222 | is.null( 223 | DataPackageR:::.partition_r_front_matter(test_string5)$front_matter 224 | ), 225 | FALSE 226 | ) 227 | }) 228 | 229 | test_that(".partition_rmd_front_matter", { 230 | test_string1 <- c("---\n", " input: in", " output out", "---") 231 | 
test_string2 <- c("---\n", " input: in", " output out") 232 | test_string3 <- c(" input: in", " output out") 233 | test_string4 <- c("input: in", " output out", " ---") 234 | test_string5 <- c(" ", "---\n", " input: in", " output out", "---") 235 | 236 | expect_equal( 237 | DataPackageR:::.partition_rmd_front_matter(test_string1)$body, 238 | NULL 239 | ) 240 | expect_equal( 241 | is.null( 242 | DataPackageR:::.partition_rmd_front_matter(test_string1)$front_matter 243 | ), 244 | FALSE 245 | ) 246 | expect_equal( 247 | is.null( 248 | DataPackageR:::.partition_rmd_front_matter(test_string2)$body 249 | ), 250 | FALSE 251 | ) 252 | expect_equal( 253 | DataPackageR:::.partition_rmd_front_matter(test_string2)$front_matter, 254 | NULL 255 | ) 256 | expect_equal( 257 | is.null( 258 | DataPackageR:::.partition_rmd_front_matter(test_string3)$body 259 | ), 260 | FALSE 261 | ) 262 | expect_equal( 263 | DataPackageR:::.partition_rmd_front_matter(test_string3)$front_matter, 264 | NULL 265 | ) 266 | expect_equal( 267 | is.null( 268 | DataPackageR:::.partition_rmd_front_matter(test_string4)$body 269 | ), 270 | FALSE 271 | ) 272 | expect_equal( 273 | DataPackageR:::.partition_rmd_front_matter(test_string4)$front_matter, 274 | NULL 275 | ) 276 | 277 | expect_equal(DataPackageR:::.partition_rmd_front_matter( 278 | c("---","author:Greg Finak","---","body")), 279 | list(front_matter = c("---","author:Greg Finak","---"), 280 | body = "body")) 281 | 282 | expect_equal( 283 | is.null( 284 | DataPackageR:::.partition_rmd_front_matter(test_string5)$front_matter 285 | ), 286 | FALSE 287 | ) 288 | expect_equal( 289 | DataPackageR:::.partition_rmd_front_matter(test_string5)$body, 290 | " " 291 | ) 292 | }) 293 | 294 | 295 | test_that(".parse_yaml_front_matter", { 296 | test_string1 <- c("---", " input: in", " output: out", "---\n") 297 | test_string2 <- c("---", " input: in", " output: out\n") 298 | test_string3 <- c(" input: in", " output: out\n") 299 | test_string4 <- c("input: in", " output: 
out", "---\n") 300 | test_string5 <- 301 | c(" ", "---", " input: in", " output: out", "---\n") 302 | test_string6 <- c("---", "input: in ", "output: out", "test:") 303 | expect_null(DataPackageR:::.validate_front_matter( 304 | paste0(test_string1, collapse = "\n"))) 305 | expect_error(DataPackageR:::.validate_front_matter( 306 | paste0(test_string6, collapse = "\n"))) 307 | 308 | expect_equal( 309 | (DataPackageR:::.parse_yaml_front_matter(test_string1)), 310 | list(input = "in", output = "out") 311 | ) 312 | expect_equal( 313 | (DataPackageR:::.parse_yaml_front_matter(test_string2)), 314 | list(input = "in") 315 | ) 316 | expect_equal((DataPackageR:::.parse_yaml_front_matter(test_string3)), list()) 317 | expect_equal( 318 | (DataPackageR:::.parse_yaml_front_matter(test_string4)), 319 | list(output = "out") 320 | ) 321 | expect_equal( 322 | (DataPackageR:::.parse_yaml_front_matter(test_string5)), 323 | list(input = "in", output = "out") 324 | ) 325 | expect_equal( 326 | (DataPackageR:::.parse_yaml_front_matter(test_string6)), 327 | list(input = "in", output = "out") 328 | ) 329 | expect_equal(DataPackageR:::.parse_yaml_front_matter(c("foo:bar","yes:no","if:then")),list()) 330 | }) 331 | 332 | test_that(".mark_utf8", { 333 | expect_equal(DataPackageR:::.mark_utf8("\320\274"), "м") 334 | expect_equal(DataPackageR:::.mark_utf8(list("\320\274")), list("м")) 335 | }) 336 | 337 | test_that(".is_blank", { 338 | expect_true(DataPackageR:::.is_blank(x = "")) 339 | expect_true(DataPackageR:::.is_blank(character(0))) 340 | }) 341 | 342 | -------------------------------------------------------------------------------- /tests/testthat/test-version-bump.R: -------------------------------------------------------------------------------- 1 | context("version bump") 2 | test_that("auto bump version when data unchanged", { 3 | file <- system.file("extdata", "tests", "subsetCars.Rmd", 4 | package = "DataPackageR" 5 | ) 6 | file2 <- system.file("extdata", "tests", "extra.Rmd", 7 | 
package = "DataPackageR" 8 | ) 9 | expect_null( 10 | datapackage_skeleton( 11 | name = "subsetCars", 12 | path = tempdir(), 13 | code_files = c(file), 14 | force = TRUE, 15 | r_object_names = c("cars_over_20") 16 | ) 17 | ) 18 | package_build(file.path(tempdir(), "subsetCars")) 19 | pkg <- desc::desc(file.path(tempdir(), "subsetCars")) 20 | pkg$set("DataVersion", "0.0.0") 21 | pkg$write() 22 | package_build(file.path(tempdir(), "subsetCars")) 23 | unlink(file.path(tempdir(), "subsetCars"), 24 | recursive = TRUE, 25 | force = TRUE 26 | ) 27 | }) 28 | -------------------------------------------------------------------------------- /tests/testthat/test-version-management-edge-cases.R: -------------------------------------------------------------------------------- 1 | 2 | context("Data Version management") 3 | test_that("data changes but version out of sync", { 4 | file <- system.file("extdata", "tests", "subsetCars.Rmd", 5 | package = "DataPackageR" 6 | ) 7 | file2 <- system.file("extdata", "tests", "extra.Rmd", 8 | package = "DataPackageR" 9 | ) 10 | expect_null( 11 | datapackage_skeleton( 12 | name = "subsetCars", 13 | path = tempdir(), 14 | code_files = c(file), 15 | force = TRUE, 16 | r_object_names = c("cars_over_20") 17 | ) 18 | ) 19 | package_build(file.path(tempdir(), "subsetCars")) 20 | news_lines <- readLines(file.path(tempdir(), "subsetCars","NEWS.md")) 21 | expect_true(sum(grepl("Added: cars_over_20", news_lines)) == 1) 22 | config <- yml_find(file.path(tempdir(), "subsetCars")) 23 | config <- yml_add_files(config, "extra.Rmd") 24 | config <- yml_add_objects(config, "pressure") 25 | file.copy(file2, file.path(tempdir(), "subsetCars", "data-raw")) 26 | yml_write(config) 27 | package_build(file.path(tempdir(), "subsetCars")) 28 | news_lines <- readLines(file.path(tempdir(), "subsetCars","NEWS.md")) 29 | expect_false(any(grepl("Changed: cars_over_20", news_lines))) 30 | expect_false(any(grepl("Deleted: cars_over_20", news_lines))) 31 | 
expect_true(sum(grepl("Added: pressure", news_lines)) == 1) 32 | unlink(file.path(tempdir(), "subsetCars"), 33 | recursive = TRUE, 34 | force = TRUE 35 | ) 36 | }) 37 | -------------------------------------------------------------------------------- /tests/testthat/test-yaml-config.R: -------------------------------------------------------------------------------- 1 | context("conditional build") 2 | test_that("can add a file", { 3 | file <- system.file("extdata", "tests", "subsetCars.Rmd", 4 | package = "DataPackageR" 5 | ) 6 | file2 <- system.file("extdata", "tests", "extra.Rmd", 7 | package = "DataPackageR" 8 | ) 9 | expect_null( 10 | datapackage_skeleton( 11 | name = "subsetCars", 12 | path = tempdir(), 13 | code_files = c(file), 14 | force = TRUE, 15 | r_object_names = c("cars_over_20") 16 | ) 17 | ) 18 | package_build(file.path(tempdir(), "subsetCars")) 19 | expect_equal( 20 | list.files(file.path(tempdir(), "subsetCars", "data")), 21 | "cars_over_20.rda" 22 | ) 23 | expect_true(all( 24 | c("subsetCars", "cars_over_20") %in% 25 | names(DataPackageR:::.doc_parse( 26 | list.files(file.path(tempdir(), "subsetCars", "R"), full.names = TRUE) 27 | )) 28 | )) 29 | config <- yml_find(file.path(tempdir(), "subsetCars")) 30 | config <- yml_add_files(config, "extra.Rmd") 31 | yml_write(config) 32 | file.copy(from = file2, file.path(tempdir(), "subsetCars", "data-raw")) 33 | expect_equal( 34 | basename(package_build(file.path( 35 | tempdir(), 36 | "subsetCars" 37 | ))), 38 | "subsetCars_1.0.tar.gz" 39 | ) 40 | expect_equal( 41 | names(DataPackageR:::.doc_parse( 42 | list.files(file.path( 43 | tempdir(), 44 | "subsetCars", 45 | "R" 46 | ), 47 | full.names = TRUE 48 | ) 49 | )), 50 | c("subsetCars", "cars_over_20") 51 | ) 52 | config <- yml_add_objects(config, "pressure") 53 | yml_write(config) 54 | expect_equal( 55 | basename(package_build(file.path( 56 | tempdir(), 57 | "subsetCars" 58 | ))), 59 | "subsetCars_1.0.tar.gz" 60 | ) 61 | expect_equal( 62 | 
names(DataPackageR:::.doc_parse( 63 | list.files(file.path( 64 | tempdir(), 65 | "subsetCars", 66 | "R" 67 | ), 68 | full.names = TRUE 69 | ) 70 | )), 71 | c("subsetCars", "cars_over_20", "pressure") 72 | ) 73 | expect_equal( 74 | basename(list.files( 75 | file.path(tempdir(), "subsetCars", "data"), 76 | full.names = TRUE 77 | )), 78 | c("cars_over_20.rda", "pressure.rda") 79 | ) 80 | unlink(file.path(tempdir(), "subsetCars"), 81 | recursive = TRUE, 82 | force = TRUE 83 | ) 84 | }) 85 | -------------------------------------------------------------------------------- /tests/testthat/test-yaml-manipulation.R: -------------------------------------------------------------------------------- 1 | context("yaml config manipulation") 2 | test_that("can remove a data item", { 3 | file <- system.file("extdata", "tests", "subsetCars.Rmd", 4 | package = "DataPackageR" 5 | ) 6 | file2 <- system.file("extdata", "tests", "extra.Rmd", 7 | package = "DataPackageR" 8 | ) 9 | expect_null( 10 | datapackage_skeleton( 11 | name = "subsetCars", 12 | path = tempdir(), 13 | code_files = c(file, file2), 14 | force = TRUE, 15 | r_object_names = c("cars_over_20", "pressure") 16 | ) 17 | ) 18 | package_build(file.path(tempdir(), "subsetCars")) 19 | # have we saved the new object? 
20 | config <- yml_find(file.path(tempdir(), "subsetCars")) 21 | config <- yml_disable_compile(config, basename(file2)) 22 | yml_write(config) 23 | package_build(file.path(tempdir(), "subsetCars")) 24 | expect_equal( 25 | list.files(file.path(tempdir(), "subsetCars", "data")), 26 | c("cars_over_20.rda", "pressure.rda") 27 | ) 28 | expect_true(all( 29 | c("subsetCars", "cars_over_20", "pressure") %in% 30 | names(DataPackageR:::.doc_parse( 31 | list.files(file.path(tempdir(), "subsetCars", "R"), 32 | full.names = TRUE 33 | ) 34 | )) 35 | )) 36 | unlink(file.path(tempdir(), "subsetCars"), 37 | recursive = TRUE, 38 | force = TRUE 39 | ) 40 | }) 41 | -------------------------------------------------------------------------------- /tests/testthat/test-yaml.R: -------------------------------------------------------------------------------- 1 | context("yaml config manipulation") 2 | test_that("yaml reading, adding, removing, listing, and writing", { 3 | file <- system.file("extdata", "tests", "subsetCars.Rmd", 4 | package = "DataPackageR" 5 | ) 6 | expect_null( 7 | datapackage_skeleton( 8 | name = "subsetCars", 9 | path = tempdir(), 10 | code_files = c(file), 11 | force = TRUE, 12 | r_object_names = "cars_over_20" 13 | ) 14 | ) 15 | 16 | test_config <- 17 | structure(list( 18 | configuration = list( 19 | files = list( 20 | subsetCars.Rmd = 21 | list(enabled = TRUE) 22 | ), 23 | objects = "cars_over_20", 24 | render_root = "dummy" 25 | ) 26 | )) 27 | config <- yml_find(file.path(tempdir(), "subsetCars")) 28 | config$configuration$render_root <- "dummy" 29 | attr(test_config, "path") <- attr(config, "path") 30 | expect_identical(config, test_config) 31 | 32 | config <- yml_add_files(config, "extra.Rmd") 33 | test_config <- 34 | structure(list( 35 | configuration = list( 36 | files = list( 37 | subsetCars.Rmd = list(enabled = TRUE), 38 | extra.Rmd = list(enabled = TRUE) 39 | ), 40 | objects = "cars_over_20", 41 | render_root = "dummy" 42 | ) 43 | ), path = 
"/private/var/folders/jh/x0h3v3pd4dd497g3gtzsm8500000gn/T/Rtmp7DyEjM/subsetCars/datapackager.yml") # nolint 44 | attr(test_config, "path") <- attr(config, "path") 45 | config$configuration$render_root <- "dummy" 46 | 47 | expect_identical(config, test_config) 48 | 49 | config <- yml_remove_files(config, "foo_file") 50 | test_config <- 51 | structure(list( 52 | configuration = list( 53 | files = list( 54 | subsetCars.Rmd = list(enabled = TRUE), 55 | extra.Rmd = list(enabled = TRUE) 56 | ), 57 | objects = "cars_over_20", 58 | render_root = "dummy" 59 | ) 60 | ), path = "/private/var/folders/jh/x0h3v3pd4dd497g3gtzsm8500000gn/T/Rtmp7DyEjM/subsetCars/datapackager.yml") # nolint 61 | attr(test_config, "path") <- attr(config, "path") 62 | config$configuration$render_root <- "dummy" 63 | expect_identical(config, test_config) 64 | 65 | 66 | config <- yml_add_objects(config, "foo_obj") 67 | test_config <- 68 | structure(list( 69 | configuration = list( 70 | files = list( 71 | subsetCars.Rmd = list(enabled = TRUE), 72 | extra.Rmd = list(enabled = TRUE) 73 | ), 74 | objects = c( 75 | "cars_over_20", 76 | "foo_obj" 77 | ), 78 | render_root = "dummy" 79 | ) 80 | ), path = "/private/var/folders/jh/x0h3v3pd4dd497g3gtzsm8500000gn/T/Rtmp7DyEjM/subsetCars/datapackager.yml") # nolint 81 | 82 | attr(test_config, "path") <- attr(config, "path") 83 | config$configuration$render_root <- "dummy" 84 | expect_identical(config, test_config) 85 | 86 | 87 | config <- yml_remove_objects(config, "foo_obj") 88 | test_config <- 89 | structure(list( 90 | configuration = list( 91 | files = list( 92 | subsetCars.Rmd = list(enabled = TRUE), 93 | extra.Rmd = list(enabled = TRUE) 94 | ), 95 | objects = "cars_over_20", 96 | render_root = "dummy" 97 | ) 98 | ), path = "/private/var/folders/jh/x0h3v3pd4dd497g3gtzsm8500000gn/T/Rtmp7DyEjM/subsetCars/datapackager.yml") # nolint 99 | attr(test_config, "path") <- attr(config, "path") 100 | config$configuration$render_root <- "dummy" 101 | 
expect_identical(config, test_config) 102 | 103 | 104 | list <- yml_list_files(config) 105 | expect_identical( 106 | list, 107 | c("subsetCars.Rmd", "extra.Rmd") 108 | ) 109 | 110 | 111 | 112 | list <- yml_list_objects(config) 113 | expect_identical(list, "cars_over_20") 114 | 115 | # still the same after writing? 116 | yml_write(config) 117 | test_config <- 118 | structure(list( 119 | configuration = list( 120 | files = list( 121 | subsetCars.Rmd = list(enabled = TRUE), 122 | extra.Rmd = list(enabled = TRUE) 123 | ), 124 | objects = "cars_over_20", 125 | render_root = "dummy" 126 | ) 127 | ), path = "/private/var/folders/jh/x0h3v3pd4dd497g3gtzsm8500000gn/T/Rtmp7DyEjM/subsetCars/datapackager.yml") # nolint 128 | 129 | config <- yml_find(file.path(tempdir(), "subsetCars")) 130 | attr(test_config, "path") <- attr(config, "path") 131 | config$configuration$render_root <- "dummy" 132 | expect_identical(config, test_config) 133 | unlink(file.path(tempdir(), "subsetCars"), 134 | recursive = TRUE, 135 | force = TRUE 136 | ) 137 | }) 138 | -------------------------------------------------------------------------------- /vignettes/.gitignore: -------------------------------------------------------------------------------- 1 | *.R 2 | *.md 3 | *.html 4 | -------------------------------------------------------------------------------- /vignettes/YAML_Configuration_Details.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "YAML Configuration Details" 3 | date: "`r Sys.Date()`" 4 | output: 5 | rmarkdown::html_vignette: 6 | keep_md: TRUE 7 | toc: yes 8 | bibliography: bibliography.bib 9 | vignette: > 10 | %\VignetteIndexEntry{YAML Configuration Details} 11 | %\VignetteEngine{knitr::rmarkdown} 12 | %\usepackage[utf8]{inputenc} 13 | %\usepackage{graphicx} 14 | editor_options: 15 | chunk_output_type: inline 16 | --- 17 | 18 | # Configuring and controlling DataPackageR builds. 
19 | 20 | Data package builds are controlled using the `datapackager.yml` file. This file is created in the package source tree when the user creates a package using `datapackage_skeleton()`. It is automatically populated with the names of the `code_files` and `r_object_names` passed to `datapackage_skeleton()`. 21 | 22 | ## The `datapackager.yml` file. 23 | 24 | The structure of a correctly formatted `datapackager.yml` file is shown below: 25 | 26 | ```{r, echo = FALSE, results = 'hide', eval = rmarkdown::pandoc_available()} 27 | library(DataPackageR) 28 | library(yaml) 29 | yml <- DataPackageR::construct_yml_config(code = "subsetCars.Rmd", data = "cars_over_20") 30 | ``` 31 | 32 | ```{r, echo = FALSE, comment="", eval = rmarkdown::pandoc_available()} 33 | cat(yaml::as.yaml(yml)) 34 | ``` 35 | 36 | ## YAML config file properties. 37 | 38 | The main section of the file is the `configuration:` section. 39 | 40 | It has three properties: 41 | 42 | - `files:` 43 | 44 | The files (`R` or `Rmd`) to be processed by DataPackageR. They are processed in the order shown. Users running multi-script workflows with dependencies between the scripts need to ensure the files are listed in the correct order. Here `subsetCars.Rmd` is the only file to process. The name is transformed to an absolute path within the package. 45 | 46 | Each file itself has just one property: 47 | 48 | - `enabled:` 49 | 50 | A logical (`yes` or `no`) flag indicating whether the file should be rendered during the build or skipped. 51 | 52 | This is useful for 'turning off' long-running processing tasks if they have not changed. Disabling processing of a file will not overwrite existing documentation or data objects created during previous builds. 53 | 54 | - `objects:` 55 | 56 | The names of the data objects created by the processing files, to be stored in the package. These names are compared against the objects created in the render environment by each file. The names must match. 
57 | 58 | - `render_root:` 59 | 60 | The directory where the `Rmd` or `R` files will be rendered. It defaults to a randomly named subdirectory of `tempdir()`. Rendering in a single location allows workflows that use multiple scripts and create file system artifacts to function correctly by simply writing to and reading from the working directory. 61 | 62 | ## Editing the YAML config file. 63 | 64 | The structure of the YAML is simple enough to understand but complex enough that it can be challenging to edit manually. 65 | 66 | DataPackageR provides a number of API calls to construct, read, modify, and write the YAML config file. 67 | 68 | ### YAML config API calls. 69 | 70 | #### `construct_yml_config` 71 | 72 | Make an R object representing a YAML config file. 73 | 74 | ##### Example 75 | The YAML config shown above was created by: 76 | ```{r, eval = rmarkdown::pandoc_available()} 77 | # Note: datapackage_skeleton() does this automatically. 78 | # The user does not usually need to call 79 | # construct_yml_config() 80 | yml <- DataPackageR::construct_yml_config( 81 | code = "subsetCars.Rmd", 82 | data = "cars_over_20" 83 | ) 84 | ``` 85 | 86 | 87 | #### `yml_find` 88 | 89 | Read a YAML config file from a package path into an R object. 90 | 91 | ##### Example 92 | 93 | Read the YAML config file from the `mtcars20` example. 94 | 95 | ```{r eval=FALSE} 96 | # returns an R object representation of 97 | # the config file. 98 | mtcars20_config <- yml_find( 99 | file.path(tempdir(),"mtcars20") 100 | ) 101 | ``` 102 | 103 | #### `yml_list_objects` 104 | 105 | List the `objects` in a config read by `yml_find`. 106 | 107 | ##### Example 108 | 109 | ```{r, comment="", eval = rmarkdown::pandoc_available()} 110 | yml_list_objects(yml) 111 | ``` 112 | 113 | #### `yml_list_files` 114 | 115 | List the `files` in a config read by `yml_find`. 
116 | 117 | ##### Example 118 | 119 | ```{r, comment="", eval = rmarkdown::pandoc_available()} 120 | yml_list_files(yml) 121 | ``` 122 | 123 | #### `yml_disable_compile` 124 | 125 | Disable compilation of named files in a config read by `yml_find`. 126 | 127 | ##### Example 128 | 129 | ```{r, comment="", echo = 1, eval = rmarkdown::pandoc_available()} 130 | yml_disabled <- yml_disable_compile( 131 | yml, 132 | filenames = "subsetCars.Rmd" 133 | ) 134 | cat(as.yaml(yml_disabled)) 135 | ``` 136 | 137 | #### `yml_enable_compile` 138 | 139 | Enable compilation of named files in a config read by `yml_find`. 140 | 141 | ##### Example 142 | 143 | ```{r, comment="", echo = 1, eval = rmarkdown::pandoc_available()} 144 | yml_enabled <- yml_enable_compile( 145 | yml, 146 | filenames = "subsetCars.Rmd" 147 | ) 148 | cat(as.yaml(yml_enabled)) 149 | ``` 150 | 151 | #### `yml_add_files` 152 | 153 | Add named files to a config read by `yml_find`. 154 | 155 | ##### Example 156 | 157 | ```{r, comment="", echo = 1, eval = rmarkdown::pandoc_available()} 158 | yml_twofiles <- yml_add_files( 159 | yml, 160 | filenames = "anotherFile.Rmd" 161 | ) 162 | # cat(as.yaml(yml_twofiles)) 163 | ``` 164 | 165 | #### `yml_add_objects` 166 | 167 | Add named objects to a config read by `yml_find`. 168 | 169 | ##### Example 170 | 171 | ```{r, comment="", echo = 1, eval = rmarkdown::pandoc_available()} 172 | yml_twoobj <- yml_add_objects( 173 | yml_twofiles, 174 | objects = "another_object" 175 | ) 176 | # cat(as.yaml(yml_twoobj)) 177 | ``` 178 | 179 | #### `yml_remove_files` 180 | 181 | Remove named files from a config read by `yml_find`. 182 | 183 | ##### Example 184 | 185 | ```{r, comment="", echo = 1, eval = rmarkdown::pandoc_available()} 186 | yml_twoobj <- yml_remove_files( 187 | yml_twoobj, 188 | filenames = "anotherFile.Rmd" 189 | ) 190 | # cat(as.yaml(yml_twoobj)) 191 | ``` 192 | 193 | #### `yml_remove_objects` 194 | 195 | Remove named objects from a config read by `yml_find`. 
196 | 197 | ##### Example 198 | 199 | ```{r, comment="", echo = 1, eval = rmarkdown::pandoc_available()} 200 | yml_oneobj <- yml_remove_objects( 201 | yml_twoobj, 202 | objects = "another_object" 203 | ) 204 | # cat(as.yaml(yml_oneobj)) 205 | ``` 206 | 207 | #### `yml_write` 208 | 209 | Write a modified config to its package path. 210 | 211 | ##### Example 212 | 213 | ```{r, eval = FALSE} 214 | yml_write(yml_oneobj, path = "path_to_package") 215 | ``` 216 | 217 | A config read by `yml_find()` carries a `path` attribute 218 | recording where it was read from, so the user does not need to pass a `path` to `yml_write` for such a config. 219 | --------------------------------------------------------------------------------