├── .Rbuildignore
├── .covrignore
├── .github
│   ├── .gitignore
│   ├── CODE_OF_CONDUCT.md
│   ├── CONTRIBUTING.md
│   └── workflows
│       ├── R-CMD-check.yaml
│       ├── check-no-suggests.yaml
│       ├── lock.yaml
│       ├── pkgdown.yaml
│       ├── pr-commands.yaml
│       └── test-coverage.yaml
├── .gitignore
├── .vscode
│   ├── extensions.json
│   └── settings.json
├── DESCRIPTION
├── LICENSE
├── LICENSE.md
├── MAINTENANCE.md
├── NAMESPACE
├── NEWS.md
├── R
│   ├── action.R
│   ├── broom.R
│   ├── butcher.R
│   ├── compat-purrr.R
│   ├── control.R
│   ├── extract.R
│   ├── fit-action-model.R
│   ├── fit.R
│   ├── generics.R
│   ├── import-standalone-obj-type.R
│   ├── import-standalone-types-check.R
│   ├── post-action-tailor.R
│   ├── pre-action-case-weights.R
│   ├── pre-action-formula.R
│   ├── pre-action-recipe.R
│   ├── pre-action-variables.R
│   ├── predict.R
│   ├── pull.R
│   ├── reexports.R
│   ├── sparsevctrs.R
│   ├── stage.R
│   ├── survival-censoring-weights.R
│   ├── utils.R
│   ├── workflow.R
│   ├── workflows-package.R
│   └── zzz.R
├── README.Rmd
├── README.md
├── _pkgdown.yml
├── air.toml
├── codecov.yml
├── man
│   ├── add_case_weights.Rd
│   ├── add_formula.Rd
│   ├── add_model.Rd
│   ├── add_recipe.Rd
│   ├── add_tailor.Rd
│   ├── add_variables.Rd
│   ├── augment.workflow.Rd
│   ├── control_workflow.Rd
│   ├── extract-workflow.Rd
│   ├── figures
│   │   ├── lifecycle-deprecated.svg
│   │   ├── lifecycle-soft-deprecated.svg
│   │   └── logo.png
│   ├── fit-workflow.Rd
│   ├── glance.workflow.Rd
│   ├── is_trained_workflow.Rd
│   ├── predict-workflow.Rd
│   ├── reexports.Rd
│   ├── rmd
│   │   ├── add-formula.Rmd
│   │   └── indicators.Rmd
│   ├── tidy.workflow.Rd
│   ├── workflow-butcher.Rd
│   ├── workflow-extractors.Rd
│   ├── workflow.Rd
│   ├── workflows-internals.Rd
│   └── workflows-package.Rd
├── pkgdown
│   └── favicon
│       ├── apple-touch-icon-120x120.png
│       ├── apple-touch-icon-152x152.png
│       ├── apple-touch-icon-180x180.png
│       ├── apple-touch-icon-60x60.png
│       ├── apple-touch-icon-76x76.png
│       ├── apple-touch-icon.png
│       ├── favicon-16x16.png
│       ├── favicon-32x32.png
│       └── favicon.ico
├── revdep
│   ├── .gitignore
│   ├── README.md
│   ├── cran.md
│   ├── email.yml
│   ├── failures.md
│   └── problems.md
├── tests
│   ├── testthat.R
│   └── testthat
│       ├── _snaps
│       │   ├── broom.md
│       │   ├── butcher.md
│       │   ├── control.md
│       │   ├── extract.md
│       │   ├── fit-action-model.md
│       │   ├── fit.md
│       │   ├── post-action-tailor.md
│       │   ├── pre-action-case-weights.md
│       │   ├── pre-action-formula.md
│       │   ├── pre-action-recipe.md
│       │   ├── pre-action-variables.md
│       │   ├── predict.md
│       │   ├── printing.md
│       │   ├── pull.md
│       │   ├── sparsevctrs.md
│       │   └── workflow.md
│       ├── helper-extract_parameter_set.R
│       ├── helper-lifecycle.R
│       ├── helper-sparsevctrs.R
│       ├── helper-tunable.R
│       ├── test-broom.R
│       ├── test-butcher.R
│       ├── test-control.R
│       ├── test-extract.R
│       ├── test-fit-action-model.R
│       ├── test-fit.R
│       ├── test-generics.R
│       ├── test-post-action-tailor.R
│       ├── test-pre-action-case-weights.R
│       ├── test-pre-action-formula.R
│       ├── test-pre-action-recipe.R
│       ├── test-pre-action-variables.R
│       ├── test-predict.R
│       ├── test-printing.R
│       ├── test-pull.R
│       ├── test-sparsevctrs.R
│       └── test-workflow.R
├── vignettes
│   ├── extras
│   │   └── getting-started.Rmd
│   └── stages.Rmd
└── workflows.Rproj
/.Rbuildignore: -------------------------------------------------------------------------------- 1 | ^workflows\.Rproj$ 2 | ^\.Rproj\.user$ 3 | ^LICENSE\.md$ 4 | ^\.travis\.yml$ 5 | ^README\.Rmd$ 6 | ^codecov\.yml$ 7 | ^\.covrignore$ 8 | ^_pkgdown\.yml$ 9 | ^docs$ 10 | ^pkgdown$ 11 | ^\.github$ 12 | ^CRAN-RELEASE$ 13 | ^CODE_OF_CONDUCT\.md$ 14 | ^revdep$ 15 | ^CRAN-SUBMISSION$ 16 | ^MAINTENANCE\.md$ 17 | ^[\.]?air\.toml$ 18 | ^\.vscode$ 19 | -------------------------------------------------------------------------------- /.covrignore: -------------------------------------------------------------------------------- 1 | R/deprec-*.R 2 | R/compat-*.R 3 | -------------------------------------------------------------------------------- /.github/.gitignore: -------------------------------------------------------------------------------- 1 | *.html 2 | -------------------------------------------------------------------------------- /.github/CODE_OF_CONDUCT.md: 
-------------------------------------------------------------------------------- 1 | # Contributor Covenant Code of Conduct 2 | 3 | ## Our Pledge 4 | 5 | We as members, contributors, and leaders pledge to make participation in our 6 | community a harassment-free experience for everyone, regardless of age, body 7 | size, visible or invisible disability, ethnicity, sex characteristics, gender 8 | identity and expression, level of experience, education, socio-economic status, 9 | nationality, personal appearance, race, caste, color, religion, or sexual 10 | identity and orientation. 11 | 12 | We pledge to act and interact in ways that contribute to an open, welcoming, 13 | diverse, inclusive, and healthy community. 14 | 15 | ## Our Standards 16 | 17 | Examples of behavior that contributes to a positive environment for our 18 | community include: 19 | 20 | * Demonstrating empathy and kindness toward other people 21 | * Being respectful of differing opinions, viewpoints, and experiences 22 | * Giving and gracefully accepting constructive feedback 23 | * Accepting responsibility and apologizing to those affected by our mistakes, 24 | and learning from the experience 25 | * Focusing on what is best not just for us as individuals, but for the overall 26 | community 27 | 28 | Examples of unacceptable behavior include: 29 | 30 | * The use of sexualized language or imagery, and sexual attention or advances of 31 | any kind 32 | * Trolling, insulting or derogatory comments, and personal or political attacks 33 | * Public or private harassment 34 | * Publishing others' private information, such as a physical or email address, 35 | without their explicit permission 36 | * Other conduct which could reasonably be considered inappropriate in a 37 | professional setting 38 | 39 | ## Enforcement Responsibilities 40 | 41 | Community leaders are responsible for clarifying and enforcing our standards of 42 | acceptable behavior and will take appropriate and fair corrective action in 43 
| response to any behavior that they deem inappropriate, threatening, offensive, 44 | or harmful. 45 | 46 | Community leaders have the right and responsibility to remove, edit, or reject 47 | comments, commits, code, wiki edits, issues, and other contributions that are 48 | not aligned to this Code of Conduct, and will communicate reasons for moderation 49 | decisions when appropriate. 50 | 51 | ## Scope 52 | 53 | This Code of Conduct applies within all community spaces, and also applies when 54 | an individual is officially representing the community in public spaces. 55 | Examples of representing our community include using an official e-mail address, 56 | posting via an official social media account, or acting as an appointed 57 | representative at an online or offline event. 58 | 59 | ## Enforcement 60 | 61 | Instances of abusive, harassing, or otherwise unacceptable behavior may be 62 | reported to the community leaders responsible for enforcement at codeofconduct@posit.co. 63 | All complaints will be reviewed and investigated promptly and fairly. 64 | 65 | All community leaders are obligated to respect the privacy and security of the 66 | reporter of any incident. 67 | 68 | ## Enforcement Guidelines 69 | 70 | Community leaders will follow these Community Impact Guidelines in determining 71 | the consequences for any action they deem in violation of this Code of Conduct: 72 | 73 | ### 1. Correction 74 | 75 | **Community Impact**: Use of inappropriate language or other behavior deemed 76 | unprofessional or unwelcome in the community. 77 | 78 | **Consequence**: A private, written warning from community leaders, providing 79 | clarity around the nature of the violation and an explanation of why the 80 | behavior was inappropriate. A public apology may be requested. 81 | 82 | ### 2. Warning 83 | 84 | **Community Impact**: A violation through a single incident or series of 85 | actions. 86 | 87 | **Consequence**: A warning with consequences for continued behavior. 
No 88 | interaction with the people involved, including unsolicited interaction with 89 | those enforcing the Code of Conduct, for a specified period of time. This 90 | includes avoiding interactions in community spaces as well as external channels 91 | like social media. Violating these terms may lead to a temporary or permanent 92 | ban. 93 | 94 | ### 3. Temporary Ban 95 | 96 | **Community Impact**: A serious violation of community standards, including 97 | sustained inappropriate behavior. 98 | 99 | **Consequence**: A temporary ban from any sort of interaction or public 100 | communication with the community for a specified period of time. No public or 101 | private interaction with the people involved, including unsolicited interaction 102 | with those enforcing the Code of Conduct, is allowed during this period. 103 | Violating these terms may lead to a permanent ban. 104 | 105 | ### 4. Permanent Ban 106 | 107 | **Community Impact**: Demonstrating a pattern of violation of community 108 | standards, including sustained inappropriate behavior, harassment of an 109 | individual, or aggression toward or disparagement of classes of individuals. 110 | 111 | **Consequence**: A permanent ban from any sort of public interaction within the 112 | community. 113 | 114 | ## Attribution 115 | 116 | This Code of Conduct is adapted from the [Contributor Covenant][homepage], 117 | version 2.1, available at 118 | <https://www.contributor-covenant.org/version/2/1/code_of_conduct.html>. 119 | 120 | Community Impact Guidelines were inspired by 121 | [Mozilla's code of conduct enforcement ladder](https://github.com/mozilla/inclusion). 122 | 123 | For answers to common questions about this code of conduct, see the FAQ at 124 | <https://www.contributor-covenant.org/faq>. Translations are available at <https://www.contributor-covenant.org/translations>. 
125 | 126 | [homepage]: https://www.contributor-covenant.org 127 | -------------------------------------------------------------------------------- /.github/CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing to tidymodels 2 | 3 | For more detailed information about contributing to tidymodels packages, see our [**development contributing guide**](https://www.tidymodels.org/contribute/). 4 | 5 | ## Documentation 6 | 7 | Typos or grammatical errors in documentation may be edited directly using the GitHub web interface, as long as the changes are made in the _source_ file. 8 | 9 | * YES ✅: you edit a roxygen comment in an `.R` file in the `R/` directory. 10 | * NO 🚫: you edit an `.Rd` file in the `man/` directory. 11 | 12 | We use [roxygen2](https://cran.r-project.org/package=roxygen2), with [Markdown syntax](https://cran.r-project.org/web/packages/roxygen2/vignettes/rd-formatting.html), for documentation. 13 | 14 | ## Code 15 | 16 | Before you submit a pull request on a tidymodels package, always file an issue and confirm the tidymodels team agrees with your idea and is happy with your basic proposal. 17 | 18 | The [tidymodels packages](https://www.tidymodels.org/packages/) work together. Each package contains its own unit tests, while integration tests and other tests using all the packages are contained in [extratests](https://github.com/tidymodels/extratests). 19 | 20 | * We recommend that you create a Git branch for each pull request (PR). 21 | * Look at the build status before and after making changes. The `README` contains badges for any continuous integration services used by the package. 22 | * New code should follow the tidyverse [style guide](http://style.tidyverse.org). You can use the [styler](https://CRAN.R-project.org/package=styler) package to apply these styles, but please don't restyle code that has nothing to do with your PR. 
23 | * For user-facing changes, add a bullet to the top of `NEWS.md` below the current development version header describing the changes made followed by your GitHub username, and links to relevant issue(s)/PR(s). 24 | * We use [testthat](https://cran.r-project.org/package=testthat). Contributions with test cases included are easier to accept. 25 | * If your contribution spans the use of more than one package, consider building [extratests](https://github.com/tidymodels/extratests) with your changes to check for breakages and/or adding new tests there. Let us know in your PR if you ran these extra tests. 26 | 27 | ### Code of Conduct 28 | 29 | This project is released with a [Contributor Code of Conduct](https://contributor-covenant.org/version/2/0/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms. 30 | -------------------------------------------------------------------------------- /.github/workflows/R-CMD-check.yaml: -------------------------------------------------------------------------------- 1 | # Workflow derived from https://github.com/r-lib/actions/tree/v2/examples 2 | # Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help 3 | # 4 | # NOTE: This workflow is overkill for most R packages and 5 | # check-standard.yaml is likely a better choice. 6 | # usethis::use_github_action("check-standard") will install it. 
7 | on: 8 | push: 9 | branches: [main, master] 10 | pull_request: 11 | 12 | name: R-CMD-check.yaml 13 | 14 | permissions: read-all 15 | 16 | jobs: 17 | R-CMD-check: 18 | runs-on: ${{ matrix.config.os }} 19 | 20 | name: ${{ matrix.config.os }} (${{ matrix.config.r }}) 21 | 22 | strategy: 23 | fail-fast: false 24 | matrix: 25 | config: 26 | - {os: macos-latest, r: 'release'} 27 | 28 | - {os: windows-latest, r: 'release'} 29 | # use 4.0 or 4.1 to check with rtools40's older compiler 30 | - {os: windows-latest, r: 'oldrel-4'} 31 | 32 | - {os: ubuntu-latest, r: 'devel', http-user-agent: 'release'} 33 | - {os: ubuntu-latest, r: 'release'} 34 | - {os: ubuntu-latest, r: 'oldrel-1'} 35 | - {os: ubuntu-latest, r: 'oldrel-2'} 36 | - {os: ubuntu-latest, r: 'oldrel-3'} 37 | - {os: ubuntu-latest, r: 'oldrel-4'} 38 | 39 | env: 40 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 41 | R_KEEP_PKG_SOURCE: yes 42 | 43 | steps: 44 | - uses: actions/checkout@v4 45 | 46 | - uses: r-lib/actions/setup-pandoc@v2 47 | 48 | - uses: r-lib/actions/setup-r@v2 49 | with: 50 | r-version: ${{ matrix.config.r }} 51 | http-user-agent: ${{ matrix.config.http-user-agent }} 52 | use-public-rspm: true 53 | 54 | - uses: r-lib/actions/setup-r-dependencies@v2 55 | with: 56 | extra-packages: any::rcmdcheck 57 | needs: check 58 | 59 | - uses: r-lib/actions/check-r-package@v2 60 | with: 61 | upload-snapshots: true 62 | build_args: 'c("--no-manual","--compact-vignettes=gs+qpdf")' 63 | -------------------------------------------------------------------------------- /.github/workflows/check-no-suggests.yaml: -------------------------------------------------------------------------------- 1 | # Workflow derived from https://github.com/r-lib/actions/tree/v2/examples 2 | # Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help 3 | # 4 | # NOTE: This workflow only directly installs "hard" dependencies, i.e. Depends, 5 | # Imports, and LinkingTo dependencies. 
Notably, Suggests dependencies are never 6 | # installed, with the exception of testthat, knitr, and rmarkdown. The cache is 7 | # never used to avoid accidentally restoring a cache containing a suggested 8 | # dependency. 9 | on: 10 | push: 11 | branches: [main, master] 12 | pull_request: 13 | branches: [main, master] 14 | 15 | name: check-no-suggests.yaml 16 | 17 | permissions: read-all 18 | 19 | jobs: 20 | check-no-suggests: 21 | runs-on: ${{ matrix.config.os }} 22 | 23 | name: ${{ matrix.config.os }} (${{ matrix.config.r }}) 24 | 25 | strategy: 26 | fail-fast: false 27 | matrix: 28 | config: 29 | - {os: ubuntu-latest, r: 'release'} 30 | 31 | env: 32 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 33 | R_KEEP_PKG_SOURCE: yes 34 | 35 | steps: 36 | - uses: actions/checkout@v4 37 | 38 | - uses: r-lib/actions/setup-pandoc@v2 39 | 40 | - uses: r-lib/actions/setup-r@v2 41 | with: 42 | r-version: ${{ matrix.config.r }} 43 | http-user-agent: ${{ matrix.config.http-user-agent }} 44 | use-public-rspm: true 45 | 46 | - uses: r-lib/actions/setup-r-dependencies@v2 47 | with: 48 | dependencies: '"hard"' 49 | cache: false 50 | extra-packages: | 51 | any::rcmdcheck 52 | any::testthat 53 | any::knitr 54 | any::rmarkdown 55 | needs: check 56 | 57 | - uses: r-lib/actions/check-r-package@v2 58 | with: 59 | upload-snapshots: true 60 | build_args: 'c("--no-manual","--compact-vignettes=gs+qpdf")' 61 | -------------------------------------------------------------------------------- /.github/workflows/lock.yaml: -------------------------------------------------------------------------------- 1 | name: 'Lock Threads' 2 | 3 | on: 4 | schedule: 5 | - cron: '0 0 * * *' 6 | 7 | jobs: 8 | lock: 9 | runs-on: ubuntu-latest 10 | steps: 11 | - uses: dessant/lock-threads@v2 12 | with: 13 | github-token: ${{ github.token }} 14 | issue-lock-inactive-days: '14' 15 | # issue-exclude-labels: '' 16 | # issue-lock-labels: 'outdated' 17 | issue-lock-comment: > 18 | This issue has been automatically locked. 
If you believe you have 19 | found a related problem, please file a new issue (with a reprex: 20 | <https://reprex.tidyverse.org/>) and link to this issue. 21 | issue-lock-reason: '' 22 | pr-lock-inactive-days: '14' 23 | # pr-exclude-labels: 'wip' 24 | pr-lock-labels: '' 25 | pr-lock-comment: > 26 | This pull request has been automatically locked. If you believe you 27 | have found a related problem, please file a new issue (with a reprex: 28 | <https://reprex.tidyverse.org/>) and link to this issue. 29 | pr-lock-reason: '' 30 | # process-only: 'issues' 31 | -------------------------------------------------------------------------------- /.github/workflows/pkgdown.yaml: -------------------------------------------------------------------------------- 1 | # Workflow derived from https://github.com/r-lib/actions/tree/v2/examples 2 | # Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help 3 | on: 4 | push: 5 | branches: [main, master] 6 | pull_request: 7 | release: 8 | types: [published] 9 | workflow_dispatch: 10 | 11 | name: pkgdown.yaml 12 | 13 | permissions: read-all 14 | 15 | jobs: 16 | pkgdown: 17 | runs-on: ubuntu-latest 18 | # Only restrict concurrency for non-PR jobs 19 | concurrency: 20 | group: pkgdown-${{ github.event_name != 'pull_request' || github.run_id }} 21 | env: 22 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 23 | permissions: 24 | contents: write 25 | steps: 26 | - uses: actions/checkout@v4 27 | 28 | - uses: r-lib/actions/setup-pandoc@v2 29 | 30 | - uses: r-lib/actions/setup-r@v2 31 | with: 32 | use-public-rspm: true 33 | 34 | - uses: r-lib/actions/setup-r-dependencies@v2 35 | with: 36 | extra-packages: any::pkgdown, local::. 
37 | needs: website 38 | 39 | - name: Build site 40 | run: pkgdown::build_site_github_pages(new_process = FALSE, install = FALSE) 41 | shell: Rscript {0} 42 | 43 | - name: Deploy to GitHub pages 🚀 44 | if: github.event_name != 'pull_request' 45 | uses: JamesIves/github-pages-deploy-action@v4.5.0 46 | with: 47 | clean: false 48 | branch: gh-pages 49 | folder: docs 50 | -------------------------------------------------------------------------------- /.github/workflows/pr-commands.yaml: -------------------------------------------------------------------------------- 1 | # Workflow derived from https://github.com/r-lib/actions/tree/v2/examples 2 | # Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help 3 | on: 4 | issue_comment: 5 | types: [created] 6 | 7 | name: pr-commands.yaml 8 | 9 | permissions: read-all 10 | 11 | jobs: 12 | document: 13 | if: ${{ github.event.issue.pull_request && (github.event.comment.author_association == 'MEMBER' || github.event.comment.author_association == 'OWNER') && startsWith(github.event.comment.body, '/document') }} 14 | name: document 15 | runs-on: ubuntu-latest 16 | env: 17 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 18 | permissions: 19 | contents: write 20 | steps: 21 | - uses: actions/checkout@v4 22 | 23 | - uses: r-lib/actions/pr-fetch@v2 24 | with: 25 | repo-token: ${{ secrets.GITHUB_TOKEN }} 26 | 27 | - uses: r-lib/actions/setup-r@v2 28 | with: 29 | use-public-rspm: true 30 | 31 | - uses: r-lib/actions/setup-r-dependencies@v2 32 | with: 33 | extra-packages: any::roxygen2 34 | needs: pr-document 35 | 36 | - name: Document 37 | run: roxygen2::roxygenise() 38 | shell: Rscript {0} 39 | 40 | - name: commit 41 | run: | 42 | git config --local user.name "$GITHUB_ACTOR" 43 | git config --local user.email "$GITHUB_ACTOR@users.noreply.github.com" 44 | git add man/\* NAMESPACE 45 | git commit -m 'Document' 46 | 47 | - uses: r-lib/actions/pr-push@v2 48 | with: 49 | repo-token: ${{ 
secrets.GITHUB_TOKEN }} 50 | 51 | style: 52 | if: ${{ github.event.issue.pull_request && (github.event.comment.author_association == 'MEMBER' || github.event.comment.author_association == 'OWNER') && startsWith(github.event.comment.body, '/style') }} 53 | name: style 54 | runs-on: ubuntu-latest 55 | env: 56 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 57 | permissions: 58 | contents: write 59 | steps: 60 | - uses: actions/checkout@v4 61 | 62 | - uses: r-lib/actions/pr-fetch@v2 63 | with: 64 | repo-token: ${{ secrets.GITHUB_TOKEN }} 65 | 66 | - uses: r-lib/actions/setup-r@v2 67 | 68 | - name: Install dependencies 69 | run: install.packages("styler") 70 | shell: Rscript {0} 71 | 72 | - name: Style 73 | run: styler::style_pkg() 74 | shell: Rscript {0} 75 | 76 | - name: commit 77 | run: | 78 | git config --local user.name "$GITHUB_ACTOR" 79 | git config --local user.email "$GITHUB_ACTOR@users.noreply.github.com" 80 | git add \*.R 81 | git commit -m 'Style' 82 | 83 | - uses: r-lib/actions/pr-push@v2 84 | with: 85 | repo-token: ${{ secrets.GITHUB_TOKEN }} 86 | -------------------------------------------------------------------------------- /.github/workflows/test-coverage.yaml: -------------------------------------------------------------------------------- 1 | # Workflow derived from https://github.com/r-lib/actions/tree/v2/examples 2 | # Need help debugging build failures? 
Start at https://github.com/r-lib/actions#where-to-find-help 3 | on: 4 | push: 5 | branches: [main, master] 6 | pull_request: 7 | 8 | name: test-coverage.yaml 9 | 10 | permissions: read-all 11 | 12 | jobs: 13 | test-coverage: 14 | runs-on: ubuntu-latest 15 | env: 16 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 17 | 18 | steps: 19 | - uses: actions/checkout@v4 20 | 21 | - uses: r-lib/actions/setup-r@v2 22 | with: 23 | use-public-rspm: true 24 | 25 | - uses: r-lib/actions/setup-r-dependencies@v2 26 | with: 27 | extra-packages: any::covr, any::xml2 28 | needs: coverage 29 | 30 | - name: Test coverage 31 | run: | 32 | cov <- covr::package_coverage( 33 | quiet = FALSE, 34 | clean = FALSE, 35 | install_path = file.path(normalizePath(Sys.getenv("RUNNER_TEMP"), winslash = "/"), "package") 36 | ) 37 | print(cov) 38 | covr::to_cobertura(cov) 39 | shell: Rscript {0} 40 | 41 | - uses: codecov/codecov-action@v5 42 | with: 43 | # Fail if error if not on PR, or if on PR and token is given 44 | fail_ci_if_error: ${{ github.event_name != 'pull_request' || secrets.CODECOV_TOKEN }} 45 | files: ./cobertura.xml 46 | plugins: noop 47 | disable_search: true 48 | token: ${{ secrets.CODECOV_TOKEN }} 49 | 50 | - name: Show testthat output 51 | if: always() 52 | run: | 53 | ## -------------------------------------------------------------------- 54 | find '${{ runner.temp }}/package' -name 'testthat.Rout*' -exec cat '{}' \; || true 55 | shell: bash 56 | 57 | - name: Upload test results 58 | if: failure() 59 | uses: actions/upload-artifact@v4 60 | with: 61 | name: coverage-test-failures 62 | path: ${{ runner.temp }}/package 63 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .Rproj.user 2 | .Rhistory 3 | .RData 4 | docs 5 | .DS_Store 6 | -------------------------------------------------------------------------------- /.vscode/extensions.json: 
-------------------------------------------------------------------------------- 1 | { 2 | "recommendations": [ 3 | "Posit.air-vscode" 4 | ] 5 | } 6 | -------------------------------------------------------------------------------- /.vscode/settings.json: -------------------------------------------------------------------------------- 1 | { 2 | "[r]": { 3 | "editor.formatOnSave": true, 4 | "editor.defaultFormatter": "Posit.air-vscode" 5 | } 6 | } 7 | -------------------------------------------------------------------------------- /DESCRIPTION: -------------------------------------------------------------------------------- 1 | Package: workflows 2 | Title: Modeling Workflows 3 | Version: 1.2.0.9000 4 | Authors@R: c( 5 | person("Davis", "Vaughan", , "davis@posit.co", role = "aut"), 6 | person("Simon", "Couch", , "simon.couch@posit.co", role = c("aut", "cre"), 7 | comment = c(ORCID = "0000-0001-5676-5107")), 8 | person("Posit Software, PBC", role = c("cph", "fnd"), 9 | comment = c(ROR = "03wc8by49")) 10 | ) 11 | Description: Managing both a 'parsnip' model and a preprocessor, such as a 12 | model formula or recipe from 'recipes', can often be challenging. The 13 | goal of 'workflows' is to streamline this process by bundling the 14 | model alongside the preprocessor, all within the same object. 
15 | License: MIT + file LICENSE 16 | URL: https://github.com/tidymodels/workflows, 17 | https://workflows.tidymodels.org 18 | BugReports: https://github.com/tidymodels/workflows/issues 19 | Depends: 20 | R (>= 4.1) 21 | Imports: 22 | cli (>= 3.3.0), 23 | generics (>= 0.1.2), 24 | glue (>= 1.6.2), 25 | hardhat (>= 1.3.1.9000), 26 | lifecycle (>= 1.0.3), 27 | modelenv (>= 0.1.0), 28 | parsnip (>= 1.2.1.9000), 29 | recipes (>= 1.0.10.9000), 30 | rlang (>= 1.1.0), 31 | sparsevctrs (>= 0.1.0.9003), 32 | tidyselect (>= 1.2.0), 33 | vctrs (>= 0.4.1), 34 | withr 35 | Suggests: 36 | butcher (>= 0.2.0), 37 | covr, 38 | dials (>= 1.0.0), 39 | glmnet, 40 | knitr, 41 | magrittr, 42 | Matrix, 43 | methods, 44 | modeldata (>= 1.0.0), 45 | probably, 46 | rmarkdown, 47 | tailor (>= 0.0.0.9001), 48 | testthat (>= 3.0.0) 49 | VignetteBuilder: 50 | knitr 51 | Remotes: 52 | r-lib/sparsevctrs, 53 | tidymodels/dials, 54 | tidymodels/parsnip, 55 | tidymodels/probably, 56 | tidymodels/recipes, 57 | tidymodels/tailor 58 | Config/Needs/website:dplyr, ggplot2, tidyr, tidyverse/tidytemplate, 59 | yardstick 60 | Config/testthat/edition: 3 61 | Config/usethis/last-upkeep: 2025-04-25 62 | Encoding: UTF-8 63 | Roxygen: list(markdown = TRUE) 64 | RoxygenNote: 7.3.2 65 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | YEAR: 2025 2 | COPYRIGHT HOLDER: workflows authors 3 | -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | # MIT License 2 | 3 | Copyright (c) 2025 workflows authors 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, 
modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /MAINTENANCE.md: -------------------------------------------------------------------------------- 1 | ## Current state 2 | 3 | Workflows is stable. 4 | It currently supports all implemented model types in tidymodels, including those in censored or modeltime. 5 | Often if it looks like a model type cannot be supported with the typical workflows model, the user can supply a "model formula" that gets passed directly through to parsnip as a workaround (i.e. `add_model(formula = )`). 6 | 7 | The general model of workflows is that it is split into 3 stages: `pre`, `fit`, and `post`. 8 | 9 | - `pre` controls the preprocessing, and is further divided into "actions". 10 | 11 | - The formula, recipes, and variables actions correspond to the 3 preprocessor types in hardhat. 12 | You can only use one of these per workflow. 13 | 14 | - The case weights action controls how case weights are extracted from the data and passed on to the parsnip model. 15 | Internally we force this action to run before the preprocessor actions. 16 | 17 | - `fit` controls the model fit. 
18 | There is only 1 "model" action here, and I do not anticipate any more actions in this stage of the workflow. 19 | 20 | - `post` controls the postprocessing. 21 | There is only 1 "tailor" action here, though there may be others in the future. 22 | 23 | Once a workflow is specified, `fit()` is called to fit all of the "actions". 24 | It loops through the actions in the workflow, and calls `fit()` on each of the actions as well (there are S3 methods for `fit()` for each action). 25 | This is similar to recipes, where each step has a `prep()` method. 26 | 27 | Keep in mind that people do save their fitted workflows and reload them for prediction, which has considerations for backwards compatibility. 28 | Any time you add a new feature, or change an existing one, you will need to keep in mind whether or not old workflows saved to disk will continue to run with the new version of workflows. 29 | Historically this has been more of a problem for hardhat, so if the backwards compatibility issues seem like a hardhat problem, then I would suggest adding backwards compatibility tests to hardhat directly instead. 30 | 31 | ## Known issues 32 | 33 | - I think that we still don't support `parsnip::multi_predict()`. At the time I remember not seeing a clear way to integrate this, but maybe the landscape has changed since then. 34 | 35 | ## Future directions 36 | 37 | The only known feature we want to add to workflows is support for postprocessing. 38 | As mentioned above, this requires some tooling in the probably package, along with tight integration to tune (which will likely need a 3rd inner loop to control tuning over the postprocessing options). 
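The staged model that MAINTENANCE.md describes (`pre`, `fit`, `post`, each made of "actions") can be sketched with the package's exported API. This is a minimal illustration, not part of the repository; `mtcars` and `linear_reg()` are just convenient stand-ins for a real preprocessor/model pairing:

```r
library(workflows)
library(parsnip)

# `pre` stage: one preprocessor action (formula, recipe, OR variables),
# `fit` stage: the single parsnip "model" action
wf <- workflow() |>
  add_formula(mpg ~ cyl + disp) |>
  add_model(linear_reg())

# fit() loops over the actions in order: pre, then fit, then post (if any)
wf_fit <- fit(wf, data = mtcars)

is_trained_workflow(wf_fit)   # TRUE once all actions have been fit
predict(wf_fit, new_data = mtcars)
```

A recipe (`add_recipe()`) or tidyselect variables (`add_variables()`) could replace `add_formula()` here, and a postprocessor would be attached with `add_tailor()`; only one preprocessor action is allowed per workflow.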
39 | -------------------------------------------------------------------------------- /NAMESPACE: -------------------------------------------------------------------------------- 1 | # Generated by roxygen2: do not edit by hand 2 | 3 | S3method(.censoring_weights_graf,workflow) 4 | S3method(add_action_impl,action_fit) 5 | S3method(add_action_impl,action_post) 6 | S3method(add_action_impl,action_pre) 7 | S3method(augment,workflow) 8 | S3method(check_conflicts,action_formula) 9 | S3method(check_conflicts,action_recipe) 10 | S3method(check_conflicts,action_variables) 11 | S3method(check_conflicts,default) 12 | S3method(extract_fit_engine,workflow) 13 | S3method(extract_fit_parsnip,workflow) 14 | S3method(extract_fit_time,workflow) 15 | S3method(extract_mold,workflow) 16 | S3method(extract_parameter_dials,workflow) 17 | S3method(extract_parameter_set_dials,workflow) 18 | S3method(extract_postprocessor,workflow) 19 | S3method(extract_preprocessor,workflow) 20 | S3method(extract_recipe,workflow) 21 | S3method(extract_spec_parsnip,workflow) 22 | S3method(fit,action_case_weights) 23 | S3method(fit,action_formula) 24 | S3method(fit,action_model) 25 | S3method(fit,action_recipe) 26 | S3method(fit,action_tailor) 27 | S3method(fit,action_variables) 28 | S3method(fit,workflow) 29 | S3method(glance,workflow) 30 | S3method(predict,workflow) 31 | S3method(print,control_workflow) 32 | S3method(print,workflow) 33 | S3method(required_pkgs,workflow) 34 | S3method(tidy,workflow) 35 | S3method(tunable,workflow) 36 | S3method(tune_args,workflow) 37 | export(.fit_finalize) 38 | export(.fit_model) 39 | export(.fit_post) 40 | export(.fit_pre) 41 | export(.workflow_includes_calibration) 42 | export(add_case_weights) 43 | export(add_formula) 44 | export(add_model) 45 | export(add_recipe) 46 | export(add_tailor) 47 | export(add_variables) 48 | export(control_workflow) 49 | export(extract_fit_engine) 50 | export(extract_fit_parsnip) 51 | export(extract_fit_time) 52 | export(extract_mold) 53 | 
export(extract_parameter_dials) 54 | export(extract_parameter_set_dials) 55 | export(extract_postprocessor) 56 | export(extract_preprocessor) 57 | export(extract_recipe) 58 | export(extract_spec_parsnip) 59 | export(fit) 60 | export(is_trained_workflow) 61 | export(pull_workflow_fit) 62 | export(pull_workflow_mold) 63 | export(pull_workflow_prepped_recipe) 64 | export(pull_workflow_preprocessor) 65 | export(pull_workflow_spec) 66 | export(remove_case_weights) 67 | export(remove_formula) 68 | export(remove_model) 69 | export(remove_recipe) 70 | export(remove_tailor) 71 | export(remove_variables) 72 | export(required_pkgs) 73 | export(update_case_weights) 74 | export(update_formula) 75 | export(update_model) 76 | export(update_recipe) 77 | export(update_tailor) 78 | export(update_variables) 79 | export(workflow) 80 | export(workflow_variables) 81 | import(rlang) 82 | importFrom(cli,cli_abort) 83 | importFrom(cli,cli_inform) 84 | importFrom(cli,cli_warn) 85 | importFrom(cli,qty) 86 | importFrom(generics,augment) 87 | importFrom(generics,fit) 88 | importFrom(generics,glance) 89 | importFrom(generics,required_pkgs) 90 | importFrom(generics,tidy) 91 | importFrom(generics,tunable) 92 | importFrom(generics,tune_args) 93 | importFrom(hardhat,extract_fit_engine) 94 | importFrom(hardhat,extract_fit_parsnip) 95 | importFrom(hardhat,extract_fit_time) 96 | importFrom(hardhat,extract_mold) 97 | importFrom(hardhat,extract_parameter_dials) 98 | importFrom(hardhat,extract_parameter_set_dials) 99 | importFrom(hardhat,extract_postprocessor) 100 | importFrom(hardhat,extract_preprocessor) 101 | importFrom(hardhat,extract_recipe) 102 | importFrom(hardhat,extract_spec_parsnip) 103 | importFrom(lifecycle,deprecated) 104 | importFrom(parsnip,.censoring_weights_graf) 105 | importFrom(parsnip,fit_xy) 106 | importFrom(stats,predict) 107 | -------------------------------------------------------------------------------- /R/action.R: 
-------------------------------------------------------------------------------- 1 | add_action <- function(x, action, name, ..., call = caller_env()) { 2 | validate_is_workflow(x, call = call) 3 | 4 | check_conflicts(action, x, call = call) 5 | 6 | add_action_impl(x, action, name, call = call) 7 | } 8 | 9 | # ------------------------------------------------------------------------------ 10 | 11 | add_action_impl <- function(x, action, name, ..., call = caller_env()) { 12 | check_dots_empty() 13 | UseMethod("add_action_impl", action) 14 | } 15 | 16 | #' @export 17 | add_action_impl.action_pre <- function( 18 | x, 19 | action, 20 | name, 21 | ..., 22 | call = caller_env() 23 | ) { 24 | check_singleton(x$pre$actions, name, call = call) 25 | x$pre <- add_action_to_stage(x$pre, action, name, order_stage_pre()) 26 | x 27 | } 28 | 29 | #' @export 30 | add_action_impl.action_fit <- function( 31 | x, 32 | action, 33 | name, 34 | ..., 35 | call = caller_env() 36 | ) { 37 | check_singleton(x$fit$actions, name, call = call) 38 | x$fit <- add_action_to_stage(x$fit, action, name, order_stage_fit()) 39 | x 40 | } 41 | 42 | #' @export 43 | add_action_impl.action_post <- function( 44 | x, 45 | action, 46 | name, 47 | ..., 48 | call = caller_env() 49 | ) { 50 | check_singleton(x$post$actions, name, call = call) 51 | x$post <- add_action_to_stage(x$post, action, name, order_stage_post()) 52 | x 53 | } 54 | 55 | # ------------------------------------------------------------------------------ 56 | 57 | order_stage_pre <- function() { 58 | # Case weights must come before preprocessor 59 | c( 60 | c("case_weights"), 61 | c("formula", "recipe", "variables") 62 | ) 63 | } 64 | 65 | order_stage_fit <- function() { 66 | "model" 67 | } 68 | 69 | order_stage_post <- function() { 70 | "tailor" 71 | } 72 | 73 | # ------------------------------------------------------------------------------ 74 | 75 | add_action_to_stage <- function(stage, action, name, order) { 76 | actions <- c(stage$actions, 
list2(!!name := action)) 77 | 78 | # Apply required ordering for this stage 79 | order <- intersect(order, names(actions)) 80 | actions <- actions[order] 81 | 82 | stage$actions <- actions 83 | 84 | stage 85 | } 86 | 87 | # ------------------------------------------------------------------------------ 88 | 89 | # `check_conflicts()` allows us to check that no other action interferes 90 | # with the current action. For instance, we can't have a formula action with 91 | # a recipe action. 92 | 93 | check_conflicts <- function(action, x, ..., call = caller_env()) { 94 | check_dots_empty() 95 | UseMethod("check_conflicts") 96 | } 97 | 98 | #' @export 99 | check_conflicts.default <- function(action, x, ..., call = caller_env()) { 100 | invisible(action) 101 | } 102 | 103 | # ------------------------------------------------------------------------------ 104 | 105 | check_singleton <- function(actions, name, ..., call = caller_env()) { 106 | check_dots_empty() 107 | 108 | if (name %in% names(actions)) { 109 | cli_abort( 110 | "A `{name}` action has already been added to this workflow.", 111 | call = call 112 | ) 113 | } 114 | 115 | invisible(actions) 116 | } 117 | 118 | # ------------------------------------------------------------------------------ 119 | 120 | new_action_pre <- function(..., subclass = character()) { 121 | new_action(..., subclass = c(subclass, "action_pre")) 122 | } 123 | 124 | new_action_fit <- function(..., subclass = character()) { 125 | new_action(..., subclass = c(subclass, "action_fit")) 126 | } 127 | 128 | new_action_post <- function(..., subclass = character()) { 129 | new_action(..., subclass = c(subclass, "action_post")) 130 | } 131 | 132 | is_action_pre <- function(x) { 133 | inherits(x, "action_pre") 134 | } 135 | 136 | is_action_fit <- function(x) { 137 | inherits(x, "action_fit") 138 | } 139 | 140 | is_action_post <- function(x) { 141 | inherits(x, "action_post") 142 | } 143 | 144 | #
------------------------------------------------------------------------------ 145 | 146 | # An `action` is a list of objects that define how to perform a specific action, 147 | # such as working with a recipe, or formula terms, or a model. 148 | 149 | new_action <- function(..., subclass = character()) { 150 | data <- list2(...) 151 | 152 | if (!is_uniquely_named(data)) { 153 | cli_abort("All elements of `...` must be uniquely named.", .internal = TRUE) 154 | } 155 | 156 | structure(data, class = c(subclass, "action")) 157 | } 158 | 159 | is_action <- function(x) { 160 | inherits(x, "action") 161 | } 162 | 163 | # ------------------------------------------------------------------------------ 164 | 165 | is_list_of_actions <- function(x) { 166 | x <- compact(x) 167 | 168 | all(map_lgl(x, is_action)) 169 | } 170 | -------------------------------------------------------------------------------- /R/broom.R: -------------------------------------------------------------------------------- 1 | #' Tidy a workflow 2 | #' 3 | #' @description 4 | #' This is a [generics::tidy()] method for a workflow that calls `tidy()` on 5 | #' either the underlying parsnip model or the recipe, depending on the value 6 | #' of `what`. 7 | #' 8 | #' `x` must be a fitted workflow, resulting in a fitted parsnip model or prepped 9 | #' recipe that you want to tidy. 10 | #' 11 | #' @details 12 | #' To tidy the unprepped recipe, use [extract_preprocessor()] and call `tidy()` 13 | #' on that directly. 14 | #' 15 | #' @param x A workflow 16 | #' 17 | #' @param what A single string. Either `"model"` or `"recipe"` to select 18 | #' which part of the workflow to tidy. Defaults to tidying the model. 19 | #' 20 | #' @param ... Arguments passed on to methods 21 | #' 22 | #' @export 23 | tidy.workflow <- function(x, what = "model", ...) { 24 | what <- arg_match(what, values = c("model", "recipe")) 25 | 26 | if (identical(what, "model")) { 27 | x <- extract_fit_parsnip(x) 28 | out <- tidy(x, ...)
29 | return(out) 30 | } 31 | 32 | if (identical(what, "recipe")) { 33 | x <- extract_recipe(x) 34 | out <- tidy(x, ...) 35 | return(out) 36 | } 37 | 38 | cli_abort( 39 | "{.arg what} must be {.val model} or {.val recipe}.", 40 | .internal = TRUE 41 | ) 42 | } 43 | 44 | # ------------------------------------------------------------------------------ 45 | 46 | #' Glance at a workflow model 47 | #' 48 | #' @description 49 | #' This is a [generics::glance()] method for a workflow that calls `glance()` on 50 | #' the underlying parsnip model. 51 | #' 52 | #' `x` must be a trained workflow, resulting in a fitted parsnip model to 53 | #' `glance()` at. 54 | #' 55 | #' @param x A workflow 56 | #' 57 | #' @param ... Arguments passed on to methods 58 | #' 59 | #' @export 60 | #' @examples 61 | #' if (rlang::is_installed(c("broom", "modeldata"))) { 62 | #' 63 | #' library(parsnip) 64 | #' library(magrittr) 65 | #' library(modeldata) 66 | #' 67 | #' data("attrition") 68 | #' 69 | #' model <- logistic_reg() |> 70 | #' set_engine("glm") 71 | #' 72 | #' wf <- workflow() |> 73 | #' add_model(model) |> 74 | #' add_formula( 75 | #' Attrition ~ BusinessTravel + YearsSinceLastPromotion + OverTime 76 | #' ) 77 | #' 78 | #' # Workflow must be trained to call `glance()` 79 | #' try(glance(wf)) 80 | #' 81 | #' wf_fit <- fit(wf, attrition) 82 | #' 83 | #' glance(wf_fit) 84 | #' 85 | #' } 86 | glance.workflow <- function(x, ...) { 87 | x <- extract_fit_parsnip(x) 88 | glance(x, ...) 89 | } 90 | 91 | # ------------------------------------------------------------------------------ 92 | 93 | #' Augment data with predictions 94 | #' 95 | #' @description 96 | #' This is a [generics::augment()] method for a workflow that calls 97 | #' `augment()` on the underlying parsnip model with `new_data`. 98 | #' 99 | #' `x` must be a trained workflow, resulting in a fitted parsnip model to 100 | #' `augment()` with.
101 | #' 102 | #' `new_data` will be preprocessed using the preprocessor in the workflow, 103 | #' and that preprocessed data will be used to generate predictions. The 104 | #' final result will contain the original `new_data` with new columns containing 105 | #' the prediction information. 106 | #' 107 | #' @param x A workflow 108 | #' 109 | #' @param new_data A data frame of predictors 110 | #' 111 | #' @param ... Arguments passed on to methods 112 | #' 113 | #' @return `new_data` with new prediction-specific columns. 114 | #' 115 | #' @param eval_time For censored regression models, a vector of time points at 116 | #' which the survival probability is estimated. See 117 | #' [parsnip::augment.model_fit()] for more details. 118 | #' 119 | #' @export 120 | #' @examples 121 | #' if (rlang::is_installed(c("broom", "modeldata"))) { 122 | #' 123 | #' library(parsnip) 124 | #' library(magrittr) 125 | #' library(modeldata) 126 | #' 127 | #' data("attrition") 128 | #' 129 | #' model <- logistic_reg() |> 130 | #' set_engine("glm") 131 | #' 132 | #' wf <- workflow() |> 133 | #' add_model(model) |> 134 | #' add_formula( 135 | #' Attrition ~ BusinessTravel + YearsSinceLastPromotion + OverTime 136 | #' ) 137 | #' 138 | #' wf_fit <- fit(wf, attrition) 139 | #' 140 | #' augment(wf_fit, attrition) 141 | #' 142 | #' } 143 | augment.workflow <- function(x, new_data, eval_time = NULL, ...)
{ 144 | fit <- extract_fit_parsnip(x) 145 | mold <- extract_mold(x) 146 | 147 | # supply outcomes to `augment.model_fit()` if possible (#131) 148 | outcomes <- FALSE 149 | if (length(fit$preproc$y_var) > 0) { 150 | outcomes <- all(fit$preproc$y_var %in% names(new_data)) 151 | } 152 | 153 | # `augment.model_fit()` requires the pre-processed `new_data` 154 | forged <- hardhat::forge( 155 | new_data, 156 | blueprint = mold$blueprint, 157 | outcomes = outcomes 158 | ) 159 | 160 | if (outcomes) { 161 | new_data_forged <- vctrs::vec_cbind(forged$predictors, forged$outcomes) 162 | } else { 163 | new_data_forged <- forged$predictors 164 | } 165 | 166 | new_data_forged <- prepare_augment_new_data(new_data_forged) 167 | out <- augment(fit, new_data_forged, eval_time = eval_time, ...) 168 | 169 | if (has_postprocessor_tailor(x)) { 170 | post <- extract_postprocessor(x) 171 | out <- predict(post, new_data = out) 172 | } 173 | 174 | augment_columns <- setdiff( 175 | names(out), 176 | names(new_data_forged) 177 | ) 178 | 179 | out <- out[augment_columns] 180 | 181 | # Return original `new_data` with new prediction columns 182 | out <- vctrs::vec_cbind(out, new_data) 183 | 184 | out 185 | } 186 | 187 | prepare_augment_new_data <- function(x) { 188 | # `augment()` works best with a data frame of predictors, 189 | # so we need to undo any matrix/sparse matrix compositions that 190 | # were returned from `hardhat::forge()` (#148) 191 | if (is.data.frame(x)) { 192 | x 193 | } else if (is.matrix(x)) { 194 | as.data.frame(x) 195 | } else if (inherits(x, "dgCMatrix")) { 196 | x <- as.matrix(x) 197 | as.data.frame(x) 198 | } else { 199 | cli_abort( 200 | "Unknown predictor type returned by {.fun forge_predictors}.", 201 | .internal = TRUE 202 | ) 203 | } 204 | } 205 | -------------------------------------------------------------------------------- /R/compat-purrr.R: -------------------------------------------------------------------------------- 1 | # nocov start - compat-purrr (last 
updated: rlang 0.3.2.9000) 2 | 3 | # This file serves as a reference for compatibility functions for 4 | # purrr. They are not drop-in replacements but allow a similar style 5 | # of programming. This is useful in cases where purrr is too heavy a 6 | # package to depend on. Please find the most recent version in rlang's 7 | # repository. 8 | 9 | map <- function(.x, .f, ...) { 10 | lapply(.x, .f, ...) 11 | } 12 | map_mold <- function(.x, .f, .mold, ...) { 13 | out <- vapply(.x, .f, .mold, ..., USE.NAMES = FALSE) 14 | names(out) <- names(.x) 15 | out 16 | } 17 | map_lgl <- function(.x, .f, ...) { 18 | map_mold(.x, .f, logical(1), ...) 19 | } 20 | map_int <- function(.x, .f, ...) { 21 | map_mold(.x, .f, integer(1), ...) 22 | } 23 | map_dbl <- function(.x, .f, ...) { 24 | map_mold(.x, .f, double(1), ...) 25 | } 26 | map_chr <- function(.x, .f, ...) { 27 | map_mold(.x, .f, character(1), ...) 28 | } 29 | map_cpl <- function(.x, .f, ...) { 30 | map_mold(.x, .f, complex(1), ...) 31 | } 32 | 33 | walk <- function(.x, .f, ...) { 34 | map(.x, .f, ...) 35 | invisible(.x) 36 | } 37 | 38 | pluck <- function(.x, .f) { 39 | map(.x, `[[`, .f) 40 | } 41 | pluck_lgl <- function(.x, .f) { 42 | map_lgl(.x, `[[`, .f) 43 | } 44 | pluck_int <- function(.x, .f) { 45 | map_int(.x, `[[`, .f) 46 | } 47 | pluck_dbl <- function(.x, .f) { 48 | map_dbl(.x, `[[`, .f) 49 | } 50 | pluck_chr <- function(.x, .f) { 51 | map_chr(.x, `[[`, .f) 52 | } 53 | pluck_cpl <- function(.x, .f) { 54 | map_cpl(.x, `[[`, .f) 55 | } 56 | 57 | map2 <- function(.x, .y, .f, ...) { 58 | out <- mapply(.f, .x, .y, MoreArgs = list(...), SIMPLIFY = FALSE) 59 | if (length(out) == length(.x)) { 60 | set_names(out, names(.x)) 61 | } else { 62 | set_names(out, NULL) 63 | } 64 | } 65 | map2_lgl <- function(.x, .y, .f, ...) { 66 | as.vector(map2(.x, .y, .f, ...), "logical") 67 | } 68 | map2_int <- function(.x, .y, .f, ...) { 69 | as.vector(map2(.x, .y, .f, ...), "integer") 70 | } 71 | map2_dbl <- function(.x, .y, .f, ...) 
{ 72 | as.vector(map2(.x, .y, .f, ...), "double") 73 | } 74 | map2_chr <- function(.x, .y, .f, ...) { 75 | as.vector(map2(.x, .y, .f, ...), "character") 76 | } 77 | map2_cpl <- function(.x, .y, .f, ...) { 78 | as.vector(map2(.x, .y, .f, ...), "complex") 79 | } 80 | 81 | args_recycle <- function(args) { 82 | lengths <- map_int(args, length) 83 | n <- max(lengths) 84 | 85 | stopifnot(all(lengths == 1L | lengths == n)) 86 | to_recycle <- lengths == 1L 87 | args[to_recycle] <- map(args[to_recycle], function(x) rep.int(x, n)) 88 | 89 | args 90 | } 91 | pmap <- function(.l, .f, ...) { 92 | args <- args_recycle(.l) 93 | do.call( 94 | "mapply", 95 | c( 96 | FUN = list(quote(.f)), 97 | args, 98 | MoreArgs = quote(list(...)), 99 | SIMPLIFY = FALSE, 100 | USE.NAMES = FALSE 101 | ) 102 | ) 103 | } 104 | 105 | probe <- function(.x, .p, ...) { 106 | if (is_logical(.p)) { 107 | stopifnot(length(.p) == length(.x)) 108 | .p 109 | } else { 110 | map_lgl(.x, .p, ...) 111 | } 112 | } 113 | 114 | keep <- function(.x, .f, ...) { 115 | .x[probe(.x, .f, ...)] 116 | } 117 | discard <- function(.x, .p, ...) { 118 | sel <- probe(.x, .p, ...) 119 | .x[is.na(sel) | !sel] 120 | } 121 | map_if <- function(.x, .p, .f, ...) { 122 | matches <- probe(.x, .p) 123 | .x[matches] <- map(.x[matches], .f, ...) 124 | .x 125 | } 126 | 127 | compact <- function(.x) { 128 | Filter(length, .x) 129 | } 130 | 131 | transpose <- function(.l) { 132 | inner_names <- names(.l[[1]]) 133 | if (is.null(inner_names)) { 134 | fields <- seq_along(.l[[1]]) 135 | } else { 136 | fields <- set_names(inner_names) 137 | } 138 | 139 | map(fields, function(i) { 140 | map(.l, .subset2, i) 141 | }) 142 | } 143 | 144 | every <- function(.x, .p, ...) { 145 | for (i in seq_along(.x)) { 146 | if (!rlang::is_true(.p(.x[[i]], ...))) { 147 | return(FALSE) 148 | } 149 | } 150 | TRUE 151 | } 152 | some <- function(.x, .p, ...) 
{ 153 | for (i in seq_along(.x)) { 154 | if (rlang::is_true(.p(.x[[i]], ...))) { 155 | return(TRUE) 156 | } 157 | } 158 | FALSE 159 | } 160 | negate <- function(.p) { 161 | function(...) !.p(...) 162 | } 163 | 164 | reduce <- function(.x, .f, ..., .init) { 165 | f <- function(x, y) .f(x, y, ...) 166 | Reduce(f, .x, init = .init) 167 | } 168 | reduce_right <- function(.x, .f, ..., .init) { 169 | f <- function(x, y) .f(y, x, ...) 170 | Reduce(f, .x, init = .init, right = TRUE) 171 | } 172 | accumulate <- function(.x, .f, ..., .init) { 173 | f <- function(x, y) .f(x, y, ...) 174 | Reduce(f, .x, init = .init, accumulate = TRUE) 175 | } 176 | accumulate_right <- function(.x, .f, ..., .init) { 177 | f <- function(x, y) .f(y, x, ...) 178 | Reduce(f, .x, init = .init, right = TRUE, accumulate = TRUE) 179 | } 180 | 181 | detect <- function(.x, .f, ..., .right = FALSE, .p = is_true) { 182 | for (i in index(.x, .right)) { 183 | if (.p(.f(.x[[i]], ...))) { 184 | return(.x[[i]]) 185 | } 186 | } 187 | NULL 188 | } 189 | detect_index <- function(.x, .f, ..., .right = FALSE, .p = is_true) { 190 | for (i in index(.x, .right)) { 191 | if (.p(.f(.x[[i]], ...))) { 192 | return(i) 193 | } 194 | } 195 | 0L 196 | } 197 | index <- function(x, right = FALSE) { 198 | idx <- seq_along(x) 199 | if (right) { 200 | idx <- rev(idx) 201 | } 202 | idx 203 | } 204 | 205 | imap <- function(.x, .f, ...) { 206 | map2(.x, vec_index(.x), .f, ...) 207 | } 208 | vec_index <- function(x) { 209 | names(x) %||% seq_along(x) 210 | } 211 | 212 | # nocov end 213 | -------------------------------------------------------------------------------- /R/control.R: -------------------------------------------------------------------------------- 1 | #' Control object for a workflow 2 | #' 3 | #' `control_workflow()` holds the control parameters for a workflow. 4 | #' 5 | #' @param control_parsnip A parsnip control object. If `NULL`, a default control 6 | #' argument is constructed from [parsnip::control_parsnip()]. 
7 | #' 8 | #' @return 9 | #' A `control_workflow` object for tweaking the workflow fitting process. 10 | #' 11 | #' @export 12 | #' @examples 13 | #' control_workflow() 14 | control_workflow <- function(control_parsnip = NULL) { 15 | control_parsnip <- check_control_parsnip(control_parsnip) 16 | 17 | data <- list( 18 | control_parsnip = control_parsnip 19 | ) 20 | 21 | structure(data, class = "control_workflow") 22 | } 23 | 24 | #' @export 25 | print.control_workflow <- function(x, ...) { 26 | cat("<control_workflow>\n") 27 | invisible(x) 28 | } 29 | 30 | check_control_parsnip <- function(x, ..., call = caller_env()) { 31 | check_dots_empty() 32 | 33 | if (is.null(x)) { 34 | x <- parsnip::control_parsnip() 35 | } 36 | 37 | if (!inherits(x, "control_parsnip")) { 38 | cli_abort( 39 | "{.arg control_parsnip} must be a {.cls control_parsnip} object.", 40 | call = call 41 | ) 42 | } 43 | 44 | x 45 | } 46 | 47 | is_control_workflow <- function(x) { 48 | inherits(x, "control_workflow") 49 | } 50 | -------------------------------------------------------------------------------- /R/fit-action-model.R: -------------------------------------------------------------------------------- 1 | #' Add a model to a workflow 2 | #' 3 | #' @description 4 | #' - `add_model()` adds a parsnip model to the workflow. 5 | #' 6 | #' - `remove_model()` removes the model specification as well as any fitted 7 | #' model object. Any extra formulas are also removed. 8 | #' 9 | #' - `update_model()` first removes the model then adds the new specification to 10 | #' the workflow. 11 | #' 12 | #' @details 13 | #' `add_model()` is a required step to construct a minimal workflow. 14 | #' 15 | #' @includeRmd man/rmd/indicators.Rmd details 16 | #' 17 | #' @inheritParams rlang::args_dots_empty 18 | #' 19 | #' @param x A workflow. 20 | #' 21 | #' @param spec A parsnip model specification. 22 | #' 23 | #' @param formula An optional formula override to specify the terms of the 24 | #' model.
Typically, the terms are extracted from the formula or recipe 25 | #' preprocessing methods. However, some models (like survival and bayesian 26 | #' models) use the formula not to preprocess, but to specify the structure 27 | #' of the model. In those cases, a formula specifying the model structure 28 | #' must be passed unchanged into the model call itself. This argument is 29 | #' used for those purposes. 30 | #' 31 | #' @return 32 | #' `x`, updated with either a new or removed model. 33 | #' 34 | #' @export 35 | #' @examples 36 | #' library(parsnip) 37 | #' 38 | #' lm_model <- linear_reg() 39 | #' lm_model <- set_engine(lm_model, "lm") 40 | #' 41 | #' regularized_model <- set_engine(lm_model, "glmnet") 42 | #' 43 | #' workflow <- workflow() 44 | #' workflow <- add_model(workflow, lm_model) 45 | #' workflow 46 | #' 47 | #' workflow <- add_formula(workflow, mpg ~ .) 48 | #' workflow 49 | #' 50 | #' remove_model(workflow) 51 | #' 52 | #' fitted <- fit(workflow, data = mtcars) 53 | #' fitted 54 | #' 55 | #' remove_model(fitted) 56 | #' 57 | #' remove_model(workflow) 58 | #' 59 | #' update_model(workflow, regularized_model) 60 | #' update_model(fitted, regularized_model) 61 | add_model <- function(x, spec, ..., formula = NULL) { 62 | check_dots_empty() 63 | action <- new_action_model(spec, formula) 64 | add_action(x, action, "model") 65 | } 66 | 67 | #' @rdname add_model 68 | #' @export 69 | remove_model <- function(x) { 70 | validate_is_workflow(x) 71 | 72 | if (!has_spec(x)) { 73 | cli_warn("The workflow has no model to remove.") 74 | } 75 | 76 | new_workflow( 77 | pre = x$pre, 78 | fit = new_stage_fit(), 79 | post = new_stage_post(actions = x$post$actions), 80 | trained = FALSE 81 | ) 82 | } 83 | 84 | 85 | #' @rdname add_model 86 | #' @export 87 | update_model <- function(x, spec, ..., formula = NULL) { 88 | check_dots_empty() 89 | x <- remove_model(x) 90 | add_model(x, spec, formula = formula) 91 | } 92 | 93 | # 
------------------------------------------------------------------------------ 94 | 95 | #' @export 96 | fit.action_model <- function(object, workflow, control, ...) { 97 | if (!is_control_workflow(control)) { 98 | cli_abort( 99 | "{.arg control} must be a workflows control object created 100 | by {.fun control_workflow}." 101 | ) 102 | } 103 | 104 | control_parsnip <- control$control_parsnip 105 | 106 | spec <- object$spec 107 | formula <- object$formula 108 | 109 | mold <- extract_mold0(workflow) 110 | case_weights <- extract_case_weights0(workflow) 111 | 112 | if (is.null(formula)) { 113 | fit <- fit_from_xy(spec, mold, case_weights, control_parsnip) 114 | } else { 115 | fit <- fit_from_formula(spec, mold, case_weights, control_parsnip, formula) 116 | } 117 | 118 | workflow$fit$fit <- fit 119 | 120 | # Only the workflow is returned 121 | workflow 122 | } 123 | 124 | fit_from_xy <- function(spec, mold, case_weights, control_parsnip) { 125 | fit_xy( 126 | spec, 127 | x = mold$predictors, 128 | y = mold$outcomes, 129 | case_weights = case_weights, 130 | control = control_parsnip 131 | ) 132 | } 133 | 134 | fit_from_formula <- function( 135 | spec, 136 | mold, 137 | case_weights, 138 | control_parsnip, 139 | formula 140 | ) { 141 | data <- cbind(mold$outcomes, mold$predictors) 142 | 143 | fit( 144 | spec, 145 | formula = formula, 146 | data = data, 147 | case_weights = case_weights, 148 | control = control_parsnip 149 | ) 150 | } 151 | 152 | extract_mold0 <- function(workflow) { 153 | mold <- workflow$pre$mold 154 | 155 | if (is.null(mold)) { 156 | cli_abort( 157 | "No mold exists. `workflow` pre stage has not been run.", 158 | .internal = TRUE 159 | ) 160 | } 161 | 162 | mold 163 | } 164 | 165 | extract_case_weights0 <- function(workflow) { 166 | if (!has_case_weights(workflow)) { 167 | return(NULL) 168 | } 169 | 170 | case_weights <- workflow$pre$case_weights 171 | 172 | if (is_null(case_weights)) { 173 | cli_abort( 174 | "No case weights exist. 
`workflow` pre stage has not been run.", 175 | .internal = TRUE 176 | ) 177 | } 178 | 179 | case_weights 180 | } 181 | 182 | # ------------------------------------------------------------------------------ 183 | 184 | new_action_model <- function(spec, formula, ..., call = caller_env()) { 185 | check_dots_empty() 186 | 187 | if (!is_model_spec(spec)) { 188 | cli_abort("{.arg spec} must be a {.cls model_spec}.", call = call) 189 | } 190 | 191 | mode <- spec$mode 192 | 193 | if (is_string(mode, string = "unknown")) { 194 | message <- 195 | c( 196 | "{.arg spec} must have a known mode.", 197 | i = "Set the mode of `spec` by using {.fun parsnip::set_mode} or by setting 198 | the mode directly in the parsnip specification function." 199 | ) 200 | 201 | cli_abort(message, call = call) 202 | } 203 | 204 | if (!is.null(formula) && !is_formula(formula)) { 205 | cli_abort("{.arg formula} must be a formula, or {.code NULL}.", call = call) 206 | } 207 | 208 | if (!parsnip::spec_is_loaded(spec = spec) && inherits(spec, "model_spec")) { 209 | parsnip::prompt_missing_implementation( 210 | spec = spec, 211 | prompt = cli_abort, 212 | call = call 213 | ) 214 | } 215 | 216 | new_action_fit(spec = spec, formula = formula, subclass = "action_model") 217 | } 218 | -------------------------------------------------------------------------------- /R/generics.R: -------------------------------------------------------------------------------- 1 | #' @export 2 | required_pkgs.workflow <- function(x, infra = TRUE, ...) 
{ 3 | out <- character() 4 | 5 | if (has_spec(x)) { 6 | model <- extract_spec_parsnip(x) 7 | pkgs <- generics::required_pkgs(model, infra = infra) 8 | out <- c(pkgs, out) 9 | } 10 | 11 | if (has_preprocessor_recipe(x)) { 12 | preprocessor <- extract_preprocessor(x) 13 | 14 | # This also has the side effect of loading recipes, ensuring that its 15 | # S3 methods for `required_pkgs()` are registered 16 | if (!is_installed("recipes")) { 17 | cli_abort( 18 | "The {.pkg recipes} package must be installed to compute the 19 | {.fun required_pkgs} of a workflow with a recipe preprocessor." 20 | ) 21 | } 22 | 23 | pkgs <- generics::required_pkgs(preprocessor, infra = infra) 24 | out <- c(pkgs, out) 25 | } 26 | 27 | out <- unique(out) 28 | out 29 | } 30 | 31 | #' @export 32 | tune_args.workflow <- function(object, ...) { 33 | model <- extract_spec_parsnip(object) 34 | 35 | param_data <- generics::tune_args(model) 36 | 37 | if (has_preprocessor_recipe(object)) { 38 | recipe <- extract_preprocessor(object) 39 | recipe_param_data <- generics::tune_args(recipe) 40 | param_data <- vctrs::vec_rbind(param_data, recipe_param_data) 41 | } 42 | 43 | if (has_postprocessor_tailor(object)) { 44 | tailor <- extract_postprocessor(object) 45 | tailor_param_data <- generics::tune_args(tailor) 46 | param_data <- vctrs::vec_rbind(param_data, tailor_param_data) 47 | } 48 | 49 | param_data 50 | } 51 | 52 | #' @export 53 | tunable.workflow <- function(x, ...) 
{ 54 | model <- extract_spec_parsnip(x) 55 | param_data <- generics::tunable(model) 56 | 57 | if (has_preprocessor_recipe(x)) { 58 | recipe <- extract_preprocessor(x) 59 | recipe_param_data <- generics::tunable(recipe) 60 | 61 | param_data <- vctrs::vec_rbind(param_data, recipe_param_data) 62 | } 63 | 64 | if (has_postprocessor_tailor(x)) { 65 | tailor <- extract_postprocessor(x) 66 | tailor_param_data <- generics::tunable(tailor) 67 | 68 | param_data <- vctrs::vec_rbind(param_data, tailor_param_data) 69 | } 70 | 71 | param_data 72 | } 73 | -------------------------------------------------------------------------------- /R/pre-action-case-weights.R: -------------------------------------------------------------------------------- 1 | #' Add case weights to a workflow 2 | #' 3 | #' @description 4 | #' This family of functions revolves around selecting a column of `data` to use 5 | #' for _case weights_. This column must be one of the allowed case weight types, 6 | #' such as [hardhat::frequency_weights()] or [hardhat::importance_weights()]. 7 | #' Specifically, it must return `TRUE` from [hardhat::is_case_weights()]. The 8 | #' underlying model will decide whether the type of case weights you have 9 | #' supplied is applicable. 10 | #' 11 | #' - `add_case_weights()` specifies the column that will be interpreted as 12 | #' case weights in the model. This column must be present in the `data` 13 | #' supplied to [fit()][fit.workflow()]. 14 | #' 15 | #' - `remove_case_weights()` removes the case weights. Additionally, if the 16 | #' model has already been fit, then the fit is removed. 17 | #' 18 | #' - `update_case_weights()` first removes the case weights, then replaces them 19 | #' with the new ones. 20 | #' 21 | #' @details 22 | #' For formula and variable preprocessors, the case weights `col` is removed 23 | #' from the data before the preprocessor is evaluated.
This allows you to use 24 | #' formulas like `y ~ .` or tidyselection like `everything()` without fear of 25 | #' accidentally selecting the case weights column. 26 | #' 27 | #' For recipe preprocessors, the case weights `col` is not removed and is 28 | #' passed along to the recipe. Typically, your recipe will include steps that 29 | #' can utilize case weights. 30 | #' 31 | #' @param x A workflow 32 | #' 33 | #' @param col A single unquoted column name specifying the case weights for 34 | #' the model. This must be a classed case weights column, as determined by 35 | #' [hardhat::is_case_weights()]. 36 | #' 37 | #' @export 38 | #' @examples 39 | #' library(parsnip) 40 | #' library(magrittr) 41 | #' library(hardhat) 42 | #' 43 | #' mtcars2 <- mtcars 44 | #' mtcars2$gear <- frequency_weights(mtcars2$gear) 45 | #' 46 | #' spec <- linear_reg() |> 47 | #' set_engine("lm") 48 | #' 49 | #' wf <- workflow() |> 50 | #' add_case_weights(gear) |> 51 | #' add_formula(mpg ~ .) |> 52 | #' add_model(spec) 53 | #' 54 | #' wf <- fit(wf, mtcars2) 55 | #' 56 | #' # Notice that the case weights (gear) aren't included in the predictors 57 | #' extract_mold(wf)$predictors 58 | #' 59 | #' # Strip them out of the workflow, which also resets the model 60 | #' remove_case_weights(wf) 61 | add_case_weights <- function(x, col) { 62 | col <- enquo(col) 63 | action <- new_action_case_weights(col) 64 | # Ensures that case-weight actions are always before preprocessor actions 65 | add_action(x, action, "case_weights") 66 | } 67 | 68 | #' @rdname add_case_weights 69 | #' @export 70 | remove_case_weights <- function(x) { 71 | validate_is_workflow(x) 72 | 73 | if (!has_case_weights(x)) { 74 | cli_warn("The workflow has no case weights specification to remove.") 75 | } 76 | 77 | actions <- x$pre$actions 78 | actions[["case_weights"]] <- NULL 79 | 80 | new_workflow( 81 | pre = new_stage_pre(actions = actions), 82 | fit = new_stage_fit(actions = x$fit$actions), 83 | post = new_stage_post(actions = 
x$post$actions), 84 | trained = FALSE 85 | ) 86 | } 87 | 88 | #' @rdname add_case_weights 89 | #' @export 90 | update_case_weights <- function(x, col) { 91 | x <- remove_case_weights(x) 92 | add_case_weights(x, {{ col }}) 93 | } 94 | 95 | # ------------------------------------------------------------------------------ 96 | 97 | #' @export 98 | fit.action_case_weights <- function(object, workflow, data, ...) { 99 | col <- object$col 100 | 101 | loc <- eval_select_case_weights(col, data) 102 | 103 | case_weights <- data[[loc]] 104 | 105 | if (!hardhat::is_case_weights(case_weights)) { 106 | cli_abort(c( 107 | "{.arg col} must select a classed case weights column, as determined by 108 | {.fun hardhat::is_case_weights}.", 109 | "i" = "For example, it could be a column created by 110 | {.fun hardhat::frequency_weights} or 111 | {.fun hardhat::importance_weights}." 112 | )) 113 | } 114 | 115 | # Remove case weights for formula/variable preprocessors so `y ~ .` and 116 | # `everything()` don't pick up the weights column. Recipe preprocessors 117 | # likely need the case weights columns so we don't remove them in that case. 118 | # They will be automatically tagged by the recipe with a `"case_weights"` 119 | # role, so they won't be considered predictors during `bake()`, meaning 120 | # that passing them through should be harmless. 
121 | remove <- 122 | has_preprocessor_formula(workflow) || 123 | has_preprocessor_variables(workflow) 124 | 125 | if (remove) { 126 | data[[loc]] <- NULL 127 | } 128 | 129 | workflow$pre <- new_stage_pre( 130 | actions = workflow$pre$actions, 131 | mold = NULL, 132 | case_weights = case_weights 133 | ) 134 | 135 | # All pre steps return the `workflow` and `data` 136 | list(workflow = workflow, data = data) 137 | } 138 | 139 | # ------------------------------------------------------------------------------ 140 | 141 | new_action_case_weights <- function(col) { 142 | if (!is_quosure(col)) { 143 | cli_abort("{.arg col} must be a quosure.", .internal = TRUE) 144 | } 145 | 146 | new_action_pre( 147 | col = col, 148 | subclass = "action_case_weights" 149 | ) 150 | } 151 | 152 | # ------------------------------------------------------------------------------ 153 | 154 | extract_case_weights_col <- function(x) { 155 | x$pre$actions$case_weights$col 156 | } 157 | 158 | eval_select_case_weights <- function(col, data, ..., call = caller_env()) { 159 | check_dots_empty() 160 | 161 | # `col` is saved as a quosure, so it carries along the evaluation environment 162 | env <- empty_env() 163 | 164 | loc <- tidyselect::eval_select( 165 | expr = col, 166 | data = data, 167 | env = env, 168 | error_call = call 169 | ) 170 | 171 | if (length(loc) != 1L) { 172 | message <- paste0( 173 | "{.arg col} must specify exactly one column from 174 | {.arg data} to extract case weights from." 175 | ) 176 | 177 | cli_abort(message, call = call) 178 | } 179 | 180 | loc 181 | } 182 | -------------------------------------------------------------------------------- /R/pre-action-formula.R: -------------------------------------------------------------------------------- 1 | #' Add formula terms to a workflow 2 | #' 3 | #' @description 4 | #' - `add_formula()` specifies the terms of the model through the usage of a 5 | #' formula. 
6 | #' 7 | #' - `remove_formula()` removes the formula as well as any downstream objects 8 | #' that might get created after the formula is used for preprocessing, such as 9 | #' terms. Additionally, if the model has already been fit, then the fit is 10 | #' removed. 11 | #' 12 | #' - `update_formula()` first removes the formula, then replaces the previous 13 | #' formula with the new one. Any model that has already been fit based on this 14 | #' formula will need to be refit. 15 | #' 16 | #' @details 17 | #' To fit a workflow, exactly one of [add_formula()], [add_recipe()], or 18 | #' [add_variables()] _must_ be specified. 19 | #' 20 | #' @includeRmd man/rmd/add-formula.Rmd details 21 | #' 22 | #' @param x A workflow 23 | #' 24 | #' @param formula A formula specifying the terms of the model. It is advised to 25 | #' not do preprocessing in the formula, and instead use a recipe if that is 26 | #' required. 27 | #' 28 | #' @param ... Not used. 29 | #' 30 | #' @param blueprint A hardhat blueprint used for fine tuning the preprocessing. 31 | #' 32 | #' If `NULL`, [hardhat::default_formula_blueprint()] is used and is passed 33 | #' arguments that best align with the model present in the workflow. 34 | #' 35 | #' Note that preprocessing done here is separate from preprocessing that 36 | #' might be done by the underlying model. For example, if a blueprint with 37 | #' `indicators = "none"` is specified, no dummy variables will be created by 38 | #' hardhat, but if the underlying model requires a formula interface that 39 | #' internally uses [stats::model.matrix()], factors will still be expanded to 40 | #' dummy variables by the model. 41 | #' 42 | #' @return 43 | #' `x`, updated with either a new or removed formula preprocessor. 
44 | #' 45 | #' @export 46 | #' @examples 47 | #' workflow <- workflow() 48 | #' workflow <- add_formula(workflow, mpg ~ cyl) 49 | #' workflow 50 | #' 51 | #' remove_formula(workflow) 52 | #' 53 | #' update_formula(workflow, mpg ~ disp) 54 | add_formula <- function(x, formula, ..., blueprint = NULL) { 55 | check_dots_empty() 56 | action <- new_action_formula(formula, blueprint) 57 | add_action(x, action, "formula") 58 | } 59 | 60 | #' @rdname add_formula 61 | #' @export 62 | remove_formula <- function(x) { 63 | validate_is_workflow(x) 64 | 65 | if (!has_preprocessor_formula(x)) { 66 | cli_warn("The workflow has no formula preprocessor to remove.") 67 | } 68 | 69 | actions <- x$pre$actions 70 | actions[["formula"]] <- NULL 71 | 72 | new_workflow( 73 | pre = new_stage_pre(actions = actions), 74 | fit = new_stage_fit(actions = x$fit$actions), 75 | post = new_stage_post(actions = x$post$actions), 76 | trained = FALSE 77 | ) 78 | } 79 | 80 | #' @rdname add_formula 81 | #' @export 82 | update_formula <- function(x, formula, ..., blueprint = NULL) { 83 | check_dots_empty() 84 | x <- remove_formula(x) 85 | add_formula(x, formula, blueprint = blueprint) 86 | } 87 | 88 | # ------------------------------------------------------------------------------ 89 | 90 | #' @export 91 | fit.action_formula <- function(object, workflow, data, ...) { 92 | formula <- object$formula 93 | blueprint <- object$blueprint 94 | 95 | if (sparsevctrs::has_sparse_elements(data)) { 96 | cli::cli_abort( 97 | "Sparse data cannot be used with the formula interface. Please use 98 | {.fn add_recipe} or {.fn add_variables} instead." 99 | ) 100 | } 101 | 102 | # TODO - Strip out the formula environment at some time? 
103 | mold <- hardhat::mold(formula, data, blueprint = blueprint) 104 | 105 | check_for_offset(mold) 106 | 107 | workflow$pre <- new_stage_pre( 108 | actions = workflow$pre$actions, 109 | mold = mold, 110 | case_weights = workflow$pre$case_weights 111 | ) 112 | 113 | # All pre steps return the `workflow` and `data` 114 | list(workflow = workflow, data = data) 115 | } 116 | 117 | check_for_offset <- function(mold, ..., call = caller_env()) { 118 | check_dots_empty() 119 | 120 | # `hardhat::mold()` specially detects offsets in the formula preprocessor and 121 | # places them in an "extras" slot. This is useful for modeling package 122 | # authors, but we don't want users to provide an offset in the formula 123 | # supplied to `add_formula()` because "extra" columns aren't passed on to 124 | # parsnip. They should use a model formula instead (#162). 125 | offset <- mold$extras$offset 126 | 127 | if (!is.null(offset)) { 128 | message <- c( 129 | "Can't use an offset in the formula supplied to {.fun add_formula}.", 130 | "i" = "Instead, specify offsets through a model formula 131 | in {.code add_model(formula = )}." 
132 | ) 133 | 134 | cli_abort(message, call = call) 135 | } 136 | } 137 | 138 | # ------------------------------------------------------------------------------ 139 | 140 | #' @export 141 | check_conflicts.action_formula <- function( 142 | action, 143 | x, 144 | ..., 145 | call = caller_env() 146 | ) { 147 | pre <- x$pre 148 | 149 | if (has_action(pre, "recipe")) { 150 | cli_abort( 151 | "A formula cannot be added when a recipe already exists.", 152 | call = call 153 | ) 154 | } 155 | if (has_action(pre, "variables")) { 156 | cli_abort( 157 | "A formula cannot be added when variables already exist.", 158 | call = call 159 | ) 160 | } 161 | 162 | invisible(action) 163 | } 164 | 165 | # ------------------------------------------------------------------------------ 166 | 167 | new_action_formula <- function(formula, blueprint, ..., call = caller_env()) { 168 | check_dots_empty() 169 | 170 | if (!is_formula(formula)) { 171 | cli_abort("{.arg formula} must be a formula.", call = call) 172 | } 173 | 174 | # `NULL` blueprints are finalized at fit time 175 | if (!is_null(blueprint) && !is_formula_blueprint(blueprint)) { 176 | cli_abort( 177 | "{.arg blueprint} must be a hardhat {.cls formula_blueprint}.", 178 | call = call 179 | ) 180 | } 181 | 182 | new_action_pre( 183 | formula = formula, 184 | blueprint = blueprint, 185 | subclass = "action_formula" 186 | ) 187 | } 188 | 189 | is_formula_blueprint <- function(x) { 190 | inherits(x, "formula_blueprint") 191 | } 192 | -------------------------------------------------------------------------------- /R/predict.R: -------------------------------------------------------------------------------- 1 | #' Predict from a workflow 2 | #' 3 | #' @description 4 | #' This is the `predict()` method for a fit workflow object. The nice thing 5 | #' about predicting from a workflow is that it will: 6 | #' 7 | #' - Preprocess `new_data` using the preprocessing method specified when the 8 | #' workflow was created and fit. 
This is accomplished using 9 | #' [hardhat::forge()], which will apply any formula preprocessing or call 10 | #' [recipes::bake()] if a recipe was supplied. 11 | #' 12 | #' - Call [parsnip::predict.model_fit()] for you using the underlying fit 13 | #' parsnip model. 14 | #' 15 | #' @inheritParams parsnip::predict.model_fit 16 | #' 17 | #' @param object A workflow that has been fit by [fit.workflow()] 18 | #' 19 | #' @param new_data A data frame containing the new predictors to preprocess 20 | #' and predict on. If using a recipe preprocessor, you should not call 21 | #' [recipes::bake()] on `new_data` before passing to this function. 22 | #' 23 | #' @return 24 | #' A data frame of model predictions, with as many rows as `new_data` has. 25 | #' 26 | #' @name predict-workflow 27 | #' @export 28 | #' @examplesIf rlang::is_installed("recipes") 29 | #' library(parsnip) 30 | #' library(recipes) 31 | #' library(magrittr) 32 | #' 33 | #' training <- mtcars[1:20, ] 34 | #' testing <- mtcars[21:32, ] 35 | #' 36 | #' model <- linear_reg() |> 37 | #' set_engine("lm") 38 | #' 39 | #' workflow <- workflow() |> 40 | #' add_model(model) 41 | #' 42 | #' recipe <- recipe(mpg ~ cyl + disp, training) |> 43 | #' step_log(disp) 44 | #' 45 | #' workflow <- add_recipe(workflow, recipe) 46 | #' 47 | #' fit_workflow <- fit(workflow, training) 48 | #' 49 | #' # This will automatically `bake()` the recipe on `testing`, 50 | #' # applying the log step to `disp`, and then fit the regression. 51 | #' predict(fit_workflow, testing) 52 | predict.workflow <- function( 53 | object, 54 | new_data, 55 | type = NULL, 56 | opts = list(), 57 | ... 58 | ) { 59 | workflow <- object 60 | 61 | if (!is_trained_workflow(workflow)) { 62 | cli_abort(c( 63 | "Can't predict on an untrained workflow.", 64 | "i" = "Do you need to call {.fun fit}?" 
65 | )) 66 | } 67 | 68 | if (is_sparse_matrix(new_data)) { 69 | new_data <- sparsevctrs::coerce_to_sparse_tibble( 70 | new_data, 71 | call = rlang::caller_env(0) 72 | ) 73 | } 74 | 75 | fit <- extract_fit_parsnip(workflow) 76 | new_data <- forge_predictors(new_data, workflow) 77 | 78 | if (!has_postprocessor(workflow)) { 79 | return(predict(fit, new_data, type = type, opts = opts, ...)) 80 | } 81 | 82 | # use `augment()` rather than `predict()` to get all possible prediction `type`s (#234). 83 | fit_aug <- augment(fit, new_data, opts = opts, ...) 84 | 85 | post <- extract_postprocessor(workflow) 86 | predict(post, fit_aug)[predict_type_column_names(type, post$columns)] 87 | } 88 | 89 | forge_predictors <- function(new_data, workflow) { 90 | mold <- extract_mold(workflow) 91 | forged <- hardhat::forge(new_data, blueprint = mold$blueprint) 92 | forged$predictors 93 | } 94 | 95 | predict_type_column_names <- function( 96 | type, 97 | tailor_columns, 98 | call = caller_env() 99 | ) { 100 | check_string(type, allow_null = TRUE, call = call) 101 | 102 | if (is.null(type)) { 103 | return(tailor_columns$estimate) 104 | } 105 | 106 | switch( 107 | type, 108 | numeric = , 109 | class = tailor_columns$estimate, 110 | prob = tailor_columns$probabilities, 111 | cli::cli_abort( 112 | "Unsupported prediction {.arg type} {.val {type}} for a workflow with a postprocessor.", 113 | call = call 114 | ) 115 | ) 116 | } 117 | -------------------------------------------------------------------------------- /R/pull.R: -------------------------------------------------------------------------------- 1 | #' Extract elements of a workflow 2 | #' 3 | #' @description 4 | #' 5 | #' `r lifecycle::badge("soft-deprecated")` 6 | #' 7 | #' Please use the `extract_*()` functions instead of these 8 | #' (e.g. [extract_mold()]). 9 | #' 10 | #' These functions extract various elements from a workflow object. If they do 11 | #' not exist yet, an error is thrown.
12 | #' 13 | #' - `pull_workflow_preprocessor()` returns the formula, recipe, or variable 14 | #' expressions used for preprocessing. 15 | #' 16 | #' - `pull_workflow_spec()` returns the parsnip model specification. 17 | #' 18 | #' - `pull_workflow_fit()` returns the parsnip model fit. 19 | #' 20 | #' - `pull_workflow_mold()` returns the preprocessed "mold" object returned 21 | #' from [hardhat::mold()]. It contains information about the preprocessing, 22 | #' including either the prepped recipe or the formula terms object. 23 | #' 24 | #' - `pull_workflow_prepped_recipe()` returns the prepped recipe. It is 25 | #' extracted from the mold object returned from `pull_workflow_mold()`. 26 | #' 27 | #' @param x A workflow 28 | #' 29 | #' @return 30 | #' The extracted value from the workflow, `x`, as described in the description 31 | #' section. 32 | #' 33 | #' @name workflow-extractors 34 | #' @keywords internal 35 | #' @examplesIf rlang::is_installed("recipes") 36 | #' library(parsnip) 37 | #' library(recipes) 38 | #' library(magrittr) 39 | #' 40 | #' model <- linear_reg() |> 41 | #' set_engine("lm") 42 | #' 43 | #' recipe <- recipe(mpg ~ cyl + disp, mtcars) |> 44 | #' step_log(disp) 45 | #' 46 | #' base_wf <- workflow() |> 47 | #' add_model(model) 48 | #' 49 | #' recipe_wf <- add_recipe(base_wf, recipe) 50 | #' formula_wf <- add_formula(base_wf, mpg ~ cyl + log(disp)) 51 | #' variable_wf <- add_variables(base_wf, mpg, c(cyl, disp)) 52 | #' 53 | #' fit_recipe_wf <- fit(recipe_wf, mtcars) 54 | #' fit_formula_wf <- fit(formula_wf, mtcars) 55 | #' 56 | #' # The preprocessor is a recipe, formula, or a list holding the 57 | #' # tidyselect expressions identifying the outcomes/predictors 58 | #' pull_workflow_preprocessor(recipe_wf) 59 | #' pull_workflow_preprocessor(formula_wf) 60 | #' pull_workflow_preprocessor(variable_wf) 61 | #' 62 | #' # The `spec` is the parsnip spec before it has been fit. 63 | #' # The `fit` is the fit parsnip model.
64 | #' pull_workflow_spec(fit_formula_wf) 65 | #' pull_workflow_fit(fit_formula_wf) 66 | #' 67 | #' # The mold is returned from `hardhat::mold()`, and contains the 68 | #' # predictors, outcomes, and information about the preprocessing 69 | #' # for use on new data at `predict()` time. 70 | #' pull_workflow_mold(fit_recipe_wf) 71 | #' 72 | #' # A useful shortcut is to extract the prepped recipe from the workflow 73 | #' pull_workflow_prepped_recipe(fit_recipe_wf) 74 | #' 75 | #' # That is identical to 76 | #' identical( 77 | #' pull_workflow_mold(fit_recipe_wf)$blueprint$recipe, 78 | #' pull_workflow_prepped_recipe(fit_recipe_wf) 79 | #' ) 80 | NULL 81 | 82 | #' @rdname workflow-extractors 83 | #' @export 84 | pull_workflow_preprocessor <- function(x) { 85 | lifecycle::deprecate_warn( 86 | "0.2.3", 87 | "pull_workflow_preprocessor()", 88 | "extract_preprocessor()" 89 | ) 90 | validate_is_workflow(x) 91 | extract_preprocessor(x) 92 | } 93 | 94 | #' @rdname workflow-extractors 95 | #' @export 96 | pull_workflow_spec <- function(x) { 97 | lifecycle::deprecate_warn( 98 | "0.2.3", 99 | "pull_workflow_spec()", 100 | "extract_spec_parsnip()" 101 | ) 102 | validate_is_workflow(x) 103 | extract_spec_parsnip(x) 104 | } 105 | 106 | #' @rdname workflow-extractors 107 | #' @export 108 | pull_workflow_fit <- function(x) { 109 | lifecycle::deprecate_warn( 110 | "0.2.3", 111 | "pull_workflow_fit()", 112 | "extract_fit_parsnip()" 113 | ) 114 | validate_is_workflow(x) 115 | extract_fit_parsnip(x) 116 | } 117 | 118 | #' @rdname workflow-extractors 119 | #' @export 120 | pull_workflow_mold <- function(x) { 121 | lifecycle::deprecate_warn("0.2.3", "pull_workflow_mold()", "extract_mold()") 122 | validate_is_workflow(x) 123 | extract_mold(x) 124 | } 125 | 126 | #' @rdname workflow-extractors 127 | #' @export 128 | pull_workflow_prepped_recipe <- function(x) { 129 | lifecycle::deprecate_warn( 130 | "0.2.3", 131 | "pull_workflow_prepped_recipe()", 132 | "extract_recipe()" 133 | ) 134 | 
validate_is_workflow(x) 135 | extract_recipe(x) 136 | } 137 | -------------------------------------------------------------------------------- /R/reexports.R: -------------------------------------------------------------------------------- 1 | #' @importFrom hardhat extract_spec_parsnip 2 | #' @export 3 | hardhat::extract_spec_parsnip 4 | #' 5 | #' @importFrom hardhat extract_recipe 6 | #' @export 7 | hardhat::extract_recipe 8 | #' 9 | #' @importFrom hardhat extract_fit_parsnip 10 | #' @export 11 | hardhat::extract_fit_parsnip 12 | #' 13 | #' @importFrom hardhat extract_fit_engine 14 | #' @export 15 | hardhat::extract_fit_engine 16 | #' 17 | #' @importFrom hardhat extract_mold 18 | #' @export 19 | hardhat::extract_mold 20 | #' 21 | #' @importFrom hardhat extract_preprocessor 22 | #' @export 23 | hardhat::extract_preprocessor 24 | #' 25 | #' @importFrom hardhat extract_postprocessor 26 | #' @export 27 | hardhat::extract_postprocessor 28 | #' 29 | #' @importFrom hardhat extract_parameter_set_dials 30 | #' @export 31 | hardhat::extract_parameter_set_dials 32 | #' 33 | #' @importFrom hardhat extract_parameter_dials 34 | #' @export 35 | hardhat::extract_parameter_dials 36 | #' 37 | #' @importFrom hardhat extract_fit_time 38 | #' @export 39 | hardhat::extract_fit_time 40 | 41 | #' @importFrom generics required_pkgs 42 | #' @export 43 | generics::required_pkgs 44 | 45 | #' @importFrom generics fit 46 | #' @export 47 | generics::fit 48 | -------------------------------------------------------------------------------- /R/sparsevctrs.R: -------------------------------------------------------------------------------- 1 | is_sparse_matrix <- function(x) { 2 | methods::is(x, "sparseMatrix") 3 | } 4 | 5 | # This function takes a workflow and its data. If the model supports sparse data 6 | # And there is a recipe, then it uses `should_use_sparsity()` to determine 7 | # whether all the `sparse = "auto"` should be turned to `"yes"` or `"no"` in the 8 | # recipe. 
9 | # 10 | # Done using flow chart in https://github.com/tidymodels/workflows/issues/271 11 | toggle_sparsity <- function(object, data) { 12 | if ( 13 | allow_sparse(object$fit$actions$model$spec) && 14 | has_preprocessor_recipe(object) 15 | ) { 16 | est_sparsity <- recipes::.recipes_estimate_sparsity( 17 | extract_preprocessor(object) 18 | ) 19 | 20 | toggle_sparse <- should_use_sparsity( 21 | est_sparsity, 22 | extract_spec_parsnip(object)$engine, 23 | nrow(data) 24 | ) 25 | 26 | object$pre$actions$recipe$recipe <- recipes::.recipes_toggle_sparse_args( 27 | object$pre$actions$recipe$recipe, 28 | choice = toggle_sparse 29 | ) 30 | } 31 | 32 | object 33 | } 34 | 35 | allow_sparse <- function(x) { 36 | if (inherits(x, "model_fit")) { 37 | x <- x$spec 38 | } 39 | res <- parsnip::get_from_env(paste0(class(x)[1], "_encoding")) 40 | all(res$allow_sparse_x[res$engine == x$engine]) 41 | } 42 | 43 | # This function was created using the output of a mars model fit on the 44 | # simulation data generated in `analysis/time_analysis.R` 45 | # https://github.com/tidymodels/benchmark-sparsity-threshold 46 | # 47 | # The model was extracted using {tidypredict} and hand-tuned for speed. 48 | # 49 | # The model was fit on `sparsity`, `engine` and `n_rows` and the outcome was 50 | # `log_fold` which is defined as 51 | # `log(time to fit with dense data / time to fit with sparse data)`. 52 | # Values above 0 reflect longer fit times for dense data, hence we want to 53 | # use sparse data. 54 | # 55 | # At this time the only engines that support sparse data are glmnet, LiblineaR, 56 | # ranger, and xgboost, which is why they are the only ones listed here. 57 | # This is fine as this code will only run if `allow_sparse()` returns `TRUE`, 58 | # which only happens for these engines. 59 | # 60 | # Ranger is hard-coded to always return "no" since it appears to use the same 61 | # algorithm for sparse and dense data, resulting in identical times.
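#
# As a rough illustration (hand-picked hypothetical inputs, not benchmark
# results), the decision reduces to the sign of the predicted `log_fold`:
#
#   should_use_sparsity(0.99, "glmnet", 100000)  # positive log fold -> "yes"
#   should_use_sparsity(0.10, "glmnet", 1000)    # negative log fold -> "no"
#   should_use_sparsity(0.99, "ranger", 100000)  # ranger is always "no"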
62 | should_use_sparsity <- function(sparsity, engine, n_rows) { 63 | if (is.null(engine) || engine == "ranger") { 64 | return("no") 65 | } 66 | 67 | log_fold <- -0.599333138645995 + 68 | ifelse(sparsity < 0.836601307189543, 0.836601307189543 - sparsity, 0) * 69 | -0.541581853008009 + 70 | ifelse(n_rows < 16000, 16000 - n_rows, 0) * 3.23980908942813e-05 + 71 | ifelse(n_rows > 16000, n_rows - 16000, 0) * -2.81001152147355e-06 + 72 | ifelse(sparsity > 0.836601307189543, sparsity - 0.836601307189543, 0) * 73 | 9.82444255114058 + 74 | ifelse(sparsity > 0.836601307189543, sparsity - 0.836601307189543, 0) * 75 | ifelse(n_rows > 8000, n_rows - 8000, 0) * 76 | 7.27456967763306e-05 + 77 | ifelse(sparsity > 0.836601307189543, sparsity - 0.836601307189543, 0) * 78 | ifelse(n_rows < 8000, 8000 - n_rows, 0) * 79 | -0.000798307404212627 80 | 81 | if (engine == "xgboost") { 82 | log_fold <- log_fold + 83 | ifelse(sparsity < 0.984615384615385, 0.984615384615385 - sparsity, 0) * 84 | 0.113098025073806 + 85 | ifelse(n_rows < 8000, 8000 - n_rows, 0) * -9.77914237255269e-05 + 86 | ifelse(n_rows > 8000, n_rows - 8000, 0) * 3.22657666511869e-06 + 87 | ifelse(sparsity > 0.984615384615385, sparsity - 0.984615384615385, 0) * 88 | 41.5180348086939 + 89 | 0.913457808326756 90 | } 91 | 92 | if (engine == "LiblineaR") { 93 | log_fold <- log_fold + 94 | ifelse(sparsity > 0.836601307189543, sparsity - 0.836601307189543, 0) * 95 | -5.39592564852111 96 | } 97 | 98 | ifelse(log_fold > 0, "yes", "no") 99 | } 100 | -------------------------------------------------------------------------------- /R/stage.R: -------------------------------------------------------------------------------- 1 | new_stage_pre <- function( 2 | actions = new_named_list(), 3 | mold = NULL, 4 | case_weights = NULL 5 | ) { 6 | if (!is.null(mold) && !is.list(mold)) { 7 | cli_abort( 8 | "{.arg mold} must be a result of calling {.fun hardhat::mold}.", 9 | .internal = TRUE 10 | ) 11 | } 12 | 13 | if (!is_null(case_weights) && 
!hardhat::is_case_weights(case_weights)) { 14 | cli_abort( 15 | "{.arg case_weights} must be a true case weights column.", 16 | .internal = TRUE 17 | ) 18 | } 19 | 20 | new_stage( 21 | actions = actions, 22 | mold = mold, 23 | case_weights = case_weights, 24 | subclass = "stage_pre" 25 | ) 26 | } 27 | 28 | new_stage_fit <- function(actions = new_named_list(), fit = NULL) { 29 | if (!is.null(fit) && !is_model_fit(fit)) { 30 | cli_abort("{.arg fit} must be a {.cls model_fit}.", .internal = TRUE) 31 | } 32 | 33 | new_stage(actions = actions, fit = fit, subclass = "stage_fit") 34 | } 35 | 36 | new_stage_post <- function(actions = new_named_list(), fit = NULL) { 37 | if (!is.null(fit) && !is_tailor(fit)) { 38 | cli_abort("{.arg fit} must be a fitted {.cls tailor}.", .internal = TRUE) 39 | } 40 | 41 | new_stage(actions, fit = fit, subclass = "stage_post") 42 | } 43 | 44 | # ------------------------------------------------------------------------------ 45 | 46 | # A `stage` is a collection of `action`s 47 | 48 | # There are 3 stages that actions can fall into: 49 | # - pre 50 | # - fit 51 | # - post 52 | 53 | new_stage <- function(actions = new_named_list(), ..., subclass = character()) { 54 | if (!is_list_of_actions(actions)) { 55 | cli_abort("{.arg actions} must be a list of actions.", .internal = TRUE) 56 | } 57 | 58 | if (!is_uniquely_named(actions)) { 59 | cli_abort("{.arg actions} must be uniquely named.", .internal = TRUE) 60 | } 61 | 62 | fields <- list2(...) 
63 | 64 | if (!is_uniquely_named(fields)) { 65 | cli_abort("`...` must be uniquely named.", .internal = TRUE) 66 | } 67 | 68 | fields <- list2(actions = actions, !!!fields) 69 | 70 | structure(fields, class = c(subclass, "stage")) 71 | } 72 | 73 | # ------------------------------------------------------------------------------ 74 | 75 | is_stage <- function(x) { 76 | inherits(x, "stage") 77 | } 78 | 79 | has_action <- function(stage, name) { 80 | name %in% names(stage$actions) 81 | } 82 | 83 | # ------------------------------------------------------------------------------ 84 | 85 | new_named_list <- function() { 86 | # To standardize results for testing. 87 | # Mainly applicable when `[[<-` removes all elements from a named list and 88 | # leaves a named list behind that we want to compare against. 89 | set_names(list(), character()) 90 | } 91 | -------------------------------------------------------------------------------- /R/survival-censoring-weights.R: -------------------------------------------------------------------------------- 1 | #' @export 2 | .censoring_weights_graf.workflow <- function( 3 | object, 4 | predictions, 5 | cens_predictors = NULL, 6 | trunc = 0.05, 7 | eps = 10^-10, 8 | ... 
9 | ) { 10 | if (is.null(object$fit$fit)) { 11 | cli_abort("The workflow does not have a model fit object.") 12 | } 13 | .censoring_weights_graf( 14 | object$fit$fit, 15 | predictions, 16 | cens_predictors, 17 | trunc, 18 | eps 19 | ) 20 | } 21 | -------------------------------------------------------------------------------- /R/utils.R: -------------------------------------------------------------------------------- 1 | is_uniquely_named <- function(x) { 2 | if (length(x) > 0) { 3 | is_named(x) && !anyDuplicated(names(x)) 4 | } else { 5 | TRUE 6 | } 7 | } 8 | 9 | is_model_fit <- function(x) { 10 | inherits(x, "model_fit") || modelenv::is_unsupervised_fit(x) 11 | } 12 | 13 | is_model_spec <- function(x) { 14 | inherits(x, "model_spec") || modelenv::is_unsupervised_spec(x) 15 | } 16 | 17 | validate_recipes_available <- function(..., call = caller_env()) { 18 | check_dots_empty() 19 | 20 | if (!requireNamespace("recipes", quietly = TRUE)) { 21 | cli_abort( 22 | "The {.pkg recipes} package must be available to add a recipe.", 23 | call = call 24 | ) 25 | } 26 | 27 | invisible() 28 | } 29 | 30 | validate_tailor_available <- function(..., call = caller_env()) { 31 | check_dots_empty() 32 | 33 | if (!requireNamespace("tailor", quietly = TRUE)) { 34 | cli_abort( 35 | "The {.pkg tailor} package must be available to add a tailor.", 36 | call = call 37 | ) 38 | } 39 | 40 | invisible() 41 | } 42 | 43 | # ------------------------------------------------------------------------------ 44 | 45 | # https://github.com/r-lib/tidyselect/blob/10e00cea2fff3585fc827b6a7eb5e172acadbb2f/R/utils.R#L109 46 | vec_index_invert <- function(x) { 47 | if (vec_index_is_empty(x)) { 48 | TRUE 49 | } else { 50 | -x 51 | } 52 | } 53 | 54 | vec_index_is_empty <- function(x) { 55 | !length(x) || all(x == 0L) 56 | } 57 | 58 | # ------------------------------------------------------------------------------ 59 | 60 | validate_is_workflow <- function(x, ..., arg = "`x`", call = caller_env()) { 61 | 
check_dots_empty() 62 | 63 | if (!is_workflow(x)) { 64 | cli_abort( 65 | "{arg} must be a workflow, not a {.cls {class(x)[[1]]}}.", 66 | call = call 67 | ) 68 | } 69 | 70 | invisible(x) 71 | } 72 | 73 | # ------------------------------------------------------------------------------ 74 | 75 | has_preprocessor_recipe <- function(x) { 76 | "recipe" %in% names(x$pre$actions) 77 | } 78 | 79 | has_preprocessor_formula <- function(x) { 80 | "formula" %in% names(x$pre$actions) 81 | } 82 | 83 | has_preprocessor_variables <- function(x) { 84 | "variables" %in% names(x$pre$actions) 85 | } 86 | 87 | has_case_weights <- function(x) { 88 | "case_weights" %in% names(x$pre$actions) 89 | } 90 | 91 | has_mold <- function(x) { 92 | !is.null(x$pre$mold) 93 | } 94 | 95 | has_spec <- function(x) { 96 | "model" %in% names(x$fit$actions) 97 | } 98 | 99 | has_fit <- function(x) { 100 | !is.null(x$fit$fit) 101 | } 102 | 103 | has_postprocessor <- function(x) { 104 | has_postprocessor_tailor(x) 105 | } 106 | 107 | has_postprocessor_tailor <- function(x) { 108 | "tailor" %in% names(x$post$actions) 109 | } 110 | 111 | has_blueprint <- function(x) { 112 | if (has_preprocessor_formula(x)) { 113 | !is.null(x$pre$actions$formula$blueprint) 114 | } else if (has_preprocessor_recipe(x)) { 115 | !is.null(x$pre$actions$recipe$blueprint) 116 | } else if (has_preprocessor_variables(x)) { 117 | !is.null(x$pre$actions$variables$blueprint) 118 | } else { 119 | cli_abort( 120 | "{.arg x} must have a preprocessor to check for a blueprint.", 121 | .internal = TRUE 122 | ) 123 | } 124 | } 125 | -------------------------------------------------------------------------------- /R/workflows-package.R: -------------------------------------------------------------------------------- 1 | #' @keywords internal 2 | "_PACKAGE" 3 | 4 | # The following block is used by usethis to automatically manage 5 | # roxygen namespace tags. Modify with care! 
6 | ## usethis namespace: start 7 | #' 8 | #' @import rlang 9 | #' @importFrom cli cli_inform cli_warn cli_abort qty 10 | #' @importFrom generics augment 11 | #' @importFrom generics glance 12 | #' @importFrom generics tidy 13 | #' @importFrom generics tune_args 14 | #' @importFrom generics tunable 15 | #' @importFrom lifecycle deprecated 16 | #' @importFrom parsnip .censoring_weights_graf 17 | #' @importFrom parsnip fit_xy 18 | #' @importFrom stats predict 19 | ## usethis namespace: end 20 | NULL 21 | -------------------------------------------------------------------------------- /R/zzz.R: -------------------------------------------------------------------------------- 1 | # nocov start 2 | 3 | .onLoad <- function(libname, pkgname) { 4 | ns <- rlang::ns_env("workflows") 5 | 6 | vctrs::s3_register("butcher::axe_call", "workflow") 7 | vctrs::s3_register("butcher::axe_ctrl", "workflow") 8 | vctrs::s3_register("butcher::axe_data", "workflow") 9 | vctrs::s3_register("butcher::axe_env", "workflow") 10 | vctrs::s3_register("butcher::axe_fitted", "workflow") 11 | } 12 | 13 | dummy_withr <- function() withr::defer 14 | 15 | # nocov end 16 | -------------------------------------------------------------------------------- /README.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | output: github_document 3 | --- 4 | 5 | 6 | 7 | ```{r, include = FALSE} 8 | knitr::opts_chunk$set( 9 | collapse = TRUE, 10 | comment = "#>", 11 | fig.path = "man/figures/README-", 12 | out.width = "100%" 13 | ) 14 | ``` 15 | 16 | # workflows A teal-colored hexagonal logo. The word WORKFLOWS is centered inside of a diagram of circular cycle, with a magrittr pipe on the top and a directed graph on the bottom. 
17 | 18 | 19 | [![Codecov test coverage](https://codecov.io/gh/tidymodels/workflows/branch/main/graph/badge.svg)](https://app.codecov.io/gh/tidymodels/workflows?branch=main) 20 | [![R-CMD-check](https://github.com/tidymodels/workflows/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/tidymodels/workflows/actions/workflows/R-CMD-check.yaml) 21 | 22 | 23 | ## What is a workflow? 24 | 25 | A workflow is an object that can bundle together your pre-processing, modeling, and post-processing requests. For example, if you have a `recipe` and `parsnip` model, these can be combined into a workflow. The advantages are: 26 | 27 | * You don't have to keep track of separate objects in your workspace. 28 | 29 | * The recipe prepping, model fitting, and postprocessor estimation (which may include data splitting) can be executed using a single call to `fit()`. 30 | 31 | * If you have custom tuning parameter settings, these can be defined using a simpler interface when combined with [tune](https://github.com/tidymodels/tune). 32 | 33 | ## Installation 34 | 35 | You can install workflows from CRAN with: 36 | 37 | ``` r 38 | install.packages("workflows") 39 | ``` 40 | 41 | 42 | You can install the development version from [GitHub](https://github.com/) with: 43 | 44 | ``` r 45 | # install.packages("pak") 46 | pak::pak("tidymodels/workflows") 47 | ``` 48 | 49 | ## Example 50 | 51 | Suppose you were modeling data on cars. Say...the fuel efficiency of 32 cars. You know that the relationship between engine displacement and miles-per-gallon is nonlinear, and you would like to model that as a spline before adding it to a Bayesian linear regression model. 
You might have a recipe to specify the spline: 52 | 53 | ```{r spline-rec, eval = FALSE} 54 | library(recipes) 55 | library(parsnip) 56 | library(workflows) 57 | 58 | spline_cars <- recipe(mpg ~ ., data = mtcars) |> 59 | step_ns(disp, deg_free = 10) 60 | ``` 61 | 62 | and a model object: 63 | 64 | ```{r car-mod, eval = FALSE} 65 | bayes_lm <- linear_reg() |> 66 | set_engine("stan") 67 | ``` 68 | 69 | To use these, you would generally run: 70 | 71 | ```{r car-fit, eval = FALSE} 72 | spline_cars_prepped <- prep(spline_cars, mtcars) 73 | bayes_lm_fit <- fit(bayes_lm, mpg ~ ., data = juice(spline_cars_prepped)) 74 | ``` 75 | 76 | You can't predict on new samples using `bayes_lm_fit` without the prepped version of `spline_cars` around. You also might have other models and recipes in your workspace. This might lead to getting them mixed up or forgetting to save the model/recipe pair that you are most interested in. 77 | 78 | workflows makes this easier by combining these objects together: 79 | 80 | ```{r wflow, eval = FALSE} 81 | car_wflow <- workflow() |> 82 | add_recipe(spline_cars) |> 83 | add_model(bayes_lm) 84 | ``` 85 | 86 | Now you can prepare the recipe and estimate the model via a single call to `fit()`: 87 | 88 | ```{r wflow-fit, eval = FALSE} 89 | car_wflow_fit <- fit(car_wflow, data = mtcars) 90 | ``` 91 | 92 | You can alter existing workflows using `update_recipe()` / `update_model()` and `remove_recipe()` / `remove_model()`. 93 | 94 | 95 | ## Contributing 96 | 97 | This project is released with a [Contributor Code of Conduct](https://contributor-covenant.org/version/2/0/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms. 98 | 99 | - For questions and discussions about tidymodels packages, modeling, and machine learning, please [post on Posit Community](https://community.rstudio.com/new-topic?category_id=15&tags=tidymodels,question).
100 | 101 | - If you think you have encountered a bug, please [submit an issue](https://github.com/tidymodels/workflows/issues). 102 | 103 | - Either way, learn how to create and share a [reprex](https://reprex.tidyverse.org/articles/articles/learn-reprex.html) (a minimal, reproducible example), to clearly communicate about your code. 104 | 105 | - Check out further details on [contributing guidelines for tidymodels packages](https://www.tidymodels.org/contribute/) and [how to get help](https://www.tidymodels.org/help/). 106 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | # workflows A teal-colored hexagonal logo. The word WORKFLOWS is centered inside of a diagram of circular cycle, with a magrittr pipe on the top and a directed graph on the bottom. 5 | 6 | 7 | 8 | [![Codecov test 9 | coverage](https://codecov.io/gh/tidymodels/workflows/branch/main/graph/badge.svg)](https://app.codecov.io/gh/tidymodels/workflows?branch=main) 10 | [![R-CMD-check](https://github.com/tidymodels/workflows/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/tidymodels/workflows/actions/workflows/R-CMD-check.yaml) 11 | 12 | 13 | ## What is a workflow? 14 | 15 | A workflow is an object that can bundle together your pre-processing, 16 | modeling, and post-processing requests. For example, if you have a 17 | `recipe` and `parsnip` model, these can be combined into a workflow. The 18 | advantages are: 19 | 20 | - You don’t have to keep track of separate objects in your workspace. 21 | 22 | - The recipe prepping, model fitting, and postprocessor estimation 23 | (which may include data splitting) can be executed using a single call 24 | to `fit()`. 25 | 26 | - If you have custom tuning parameter settings, these can be defined 27 | using a simpler interface when combined with 28 | [tune](https://github.com/tidymodels/tune). 
29 | 30 | ## Installation 31 | 32 | You can install workflows from CRAN with: 33 | 34 | ``` r 35 | install.packages("workflows") 36 | ``` 37 | 38 | You can install the development version from 39 | [GitHub](https://github.com/) with: 40 | 41 | ``` r 42 | # install.packages("pak") 43 | pak::pak("tidymodels/workflows") 44 | ``` 45 | 46 | ## Example 47 | 48 | Suppose you were modeling data on cars. Say…the fuel efficiency of 32 49 | cars. You know that the relationship between engine displacement and 50 | miles-per-gallon is nonlinear, and you would like to model that as a 51 | spline before adding it to a Bayesian linear regression model. You might 52 | have a recipe to specify the spline: 53 | 54 | ``` r 55 | library(recipes) 56 | library(parsnip) 57 | library(workflows) 58 | 59 | spline_cars <- recipe(mpg ~ ., data = mtcars) |> 60 | step_ns(disp, deg_free = 10) 61 | ``` 62 | 63 | and a model object: 64 | 65 | ``` r 66 | bayes_lm <- linear_reg() |> 67 | set_engine("stan") 68 | ``` 69 | 70 | To use these, you would generally run: 71 | 72 | ``` r 73 | spline_cars_prepped <- prep(spline_cars, mtcars) 74 | bayes_lm_fit <- fit(bayes_lm, mpg ~ ., data = juice(spline_cars_prepped)) 75 | ``` 76 | 77 | You can’t predict on new samples using `bayes_lm_fit` without the 78 | prepped version of `spline_cars` around. You also might have other 79 | models and recipes in your workspace. This might lead to getting them 80 | mixed up or forgetting to save the model/recipe pair that you are most 81 | interested in.
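As a sketch of what that bookkeeping looks like (using the objects created above; like the other chunks here, not evaluated):

``` r
# Manually predicting on new samples: the prepped recipe must be applied
# first, and both objects have to be kept in sync by hand
new_cars <- bake(spline_cars_prepped, new_data = head(mtcars))
predict(bayes_lm_fit, new_data = new_cars)
```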
82 | 83 | workflows makes this easier by combining these objects together: 84 | 85 | ``` r 86 | car_wflow <- workflow() |> 87 | add_recipe(spline_cars) |> 88 | add_model(bayes_lm) 89 | ``` 90 | 91 | Now you can prepare the recipe and estimate the model via a single call 92 | to `fit()`: 93 | 94 | ``` r 95 | car_wflow_fit <- fit(car_wflow, data = mtcars) 96 | ``` 97 | 98 | You can alter existing workflows using `update_recipe()` / 99 | `update_model()` and `remove_recipe()` / `remove_model()`. 100 | 101 | ## Contributing 102 | 103 | This project is released with a [Contributor Code of 104 | Conduct](https://contributor-covenant.org/version/2/0/CODE_OF_CONDUCT.html). 105 | By contributing to this project, you agree to abide by its terms. 106 | 107 | - For questions and discussions about tidymodels packages, modeling, and 108 | machine learning, please [post on Posit 109 | Community](https://community.rstudio.com/new-topic?category_id=15&tags=tidymodels,question). 110 | 111 | - If you think you have encountered a bug, please [submit an 112 | issue](https://github.com/tidymodels/workflows/issues). 113 | 114 | - Either way, learn how to create and share a 115 | [reprex](https://reprex.tidyverse.org/articles/articles/learn-reprex.html) 116 | (a minimal, reproducible example), to clearly communicate about your 117 | code. 118 | 119 | - Check out further details on [contributing guidelines for tidymodels 120 | packages](https://www.tidymodels.org/contribute/) and [how to get 121 | help](https://www.tidymodels.org/help/). 
122 | -------------------------------------------------------------------------------- /_pkgdown.yml: -------------------------------------------------------------------------------- 1 | url: https://workflows.tidymodels.org/ 2 | 3 | template: 4 | package: tidytemplate 5 | bootstrap: 5 6 | bslib: 7 | primary: "#CA225E" 8 | includes: 9 | in_header: | 10 | 11 | 12 | development: 13 | mode: auto 14 | 15 | figures: 16 | fig.width: 8 17 | fig.height: 5.75 18 | 19 | navbar: 20 | left: 21 | - text: Workflow Stages 22 | href: articles/stages.html 23 | - text: Getting Started 24 | href: articles/extras/getting-started.html 25 | - text: News 26 | href: news/index.html 27 | - text: Reference 28 | href: reference/index.html 29 | 30 | -------------------------------------------------------------------------------- /air.toml: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tidymodels/workflows/835ee3574688f23375ee52b4fb1bd060b44b0e57/air.toml -------------------------------------------------------------------------------- /codecov.yml: -------------------------------------------------------------------------------- 1 | comment: false 2 | 3 | coverage: 4 | status: 5 | project: 6 | default: 7 | target: auto 8 | threshold: 1% 9 | informational: true 10 | patch: 11 | default: 12 | target: auto 13 | threshold: 1% 14 | informational: true 15 | -------------------------------------------------------------------------------- /man/add_case_weights.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/pre-action-case-weights.R 3 | \name{add_case_weights} 4 | \alias{add_case_weights} 5 | \alias{remove_case_weights} 6 | \alias{update_case_weights} 7 | \title{Add case weights to a workflow} 8 | \usage{ 9 | add_case_weights(x, col) 10 | 11 | remove_case_weights(x) 12 | 13 | update_case_weights(x, col) 14 | } 15 
| \arguments{ 16 | \item{x}{A workflow} 17 | 18 | \item{col}{A single unquoted column name specifying the case weights for 19 | the model. This must be a classed case weights column, as determined by 20 | \code{\link[hardhat:is_case_weights]{hardhat::is_case_weights()}}.} 21 | } 22 | \description{ 23 | This family of functions revolves around selecting a column of \code{data} to use 24 | for \emph{case weights}. This column must be one of the allowed case weight types, 25 | such as \code{\link[hardhat:frequency_weights]{hardhat::frequency_weights()}} or \code{\link[hardhat:importance_weights]{hardhat::importance_weights()}}. 26 | Specifically, it must return \code{TRUE} from \code{\link[hardhat:is_case_weights]{hardhat::is_case_weights()}}. The 27 | underlying model will decide whether the type of case weights you have 28 | supplied is applicable. 29 | \itemize{ 30 | \item \code{add_case_weights()} specifies the column that will be interpreted as 31 | case weights in the model. This column must be present in the \code{data} 32 | supplied to \link[=fit.workflow]{fit()}. 33 | \item \code{remove_case_weights()} removes the case weights. Additionally, if the 34 | model has already been fit, then the fit is removed. 35 | \item \code{update_case_weights()} first removes the case weights, then replaces them 36 | with the new ones. 37 | } 38 | } 39 | \details{ 40 | For formula and variable preprocessors, the case weights \code{col} is removed 41 | from the data before the preprocessor is evaluated. This allows you to use 42 | formulas like \code{y ~ .} or tidyselection like \code{everything()} without fear of 43 | accidentally selecting the case weights column. 44 | 45 | For recipe preprocessors, the case weights \code{col} is not removed and is 46 | passed along to the recipe. Typically, your recipe will include steps that 47 | can utilize case weights.
48 | } 49 | \examples{ 50 | library(parsnip) 51 | library(magrittr) 52 | library(hardhat) 53 | 54 | mtcars2 <- mtcars 55 | mtcars2$gear <- frequency_weights(mtcars2$gear) 56 | 57 | spec <- linear_reg() |> 58 | set_engine("lm") 59 | 60 | wf <- workflow() |> 61 | add_case_weights(gear) |> 62 | add_formula(mpg ~ .) |> 63 | add_model(spec) 64 | 65 | wf <- fit(wf, mtcars2) 66 | 67 | # Notice that the case weights (gear) aren't included in the predictors 68 | extract_mold(wf)$predictors 69 | 70 | # Strip them out of the workflow, which also resets the model 71 | remove_case_weights(wf) 72 | } 73 | -------------------------------------------------------------------------------- /man/add_recipe.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/pre-action-recipe.R 3 | \name{add_recipe} 4 | \alias{add_recipe} 5 | \alias{remove_recipe} 6 | \alias{update_recipe} 7 | \title{Add a recipe to a workflow} 8 | \usage{ 9 | add_recipe(x, recipe, ..., blueprint = NULL) 10 | 11 | remove_recipe(x) 12 | 13 | update_recipe(x, recipe, ..., blueprint = NULL) 14 | } 15 | \arguments{ 16 | \item{x}{A workflow} 17 | 18 | \item{recipe}{A recipe created using \code{\link[recipes:recipe]{recipes::recipe()}}. The recipe 19 | should not have been trained already with \code{\link[recipes:prep]{recipes::prep()}}; workflows 20 | will handle training internally.} 21 | 22 | \item{...}{Not used.} 23 | 24 | \item{blueprint}{A hardhat blueprint used for fine tuning the preprocessing. 25 | 26 | If \code{NULL}, \code{\link[hardhat:default_recipe_blueprint]{hardhat::default_recipe_blueprint()}} is used. 27 | 28 | Note that preprocessing done here is separate from preprocessing that 29 | might be done automatically by the underlying model.} 30 | } 31 | \value{ 32 | \code{x}, updated with either a new or removed recipe preprocessor. 
33 | } 34 | \description{ 35 | \itemize{ 36 | \item \code{add_recipe()} specifies the terms of the model and any preprocessing that 37 | is required through the usage of a recipe. 38 | \item \code{remove_recipe()} removes the recipe as well as any downstream objects 39 | that might get created after the recipe is used for preprocessing, such as 40 | the prepped recipe. Additionally, if the model has already been fit, then 41 | the fit is removed. 42 | \item \code{update_recipe()} first removes the recipe, then replaces the previous 43 | recipe with the new one. Any model that has already been fit based on this 44 | recipe will need to be refit. 45 | } 46 | } 47 | \details{ 48 | To fit a workflow, exactly one of \code{\link[=add_formula]{add_formula()}}, \code{\link[=add_recipe]{add_recipe()}}, or 49 | \code{\link[=add_variables]{add_variables()}} \emph{must} be specified. 50 | } 51 | \examples{ 52 | \dontshow{if (rlang::is_installed("recipes")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} 53 | library(recipes) 54 | library(magrittr) 55 | 56 | recipe <- recipe(mpg ~ cyl, mtcars) |> 57 | step_log(cyl) 58 | 59 | workflow <- workflow() |> 60 | add_recipe(recipe) 61 | 62 | workflow 63 | 64 | remove_recipe(workflow) 65 | 66 | update_recipe(workflow, recipe(mpg ~ cyl, mtcars)) 67 | \dontshow{\}) # examplesIf} 68 | } 69 | -------------------------------------------------------------------------------- /man/add_tailor.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/post-action-tailor.R 3 | \name{add_tailor} 4 | \alias{add_tailor} 5 | \alias{remove_tailor} 6 | \alias{update_tailor} 7 | \title{Add a tailor to a workflow} 8 | \usage{ 9 | add_tailor(x, tailor, ...) 10 | 11 | remove_tailor(x) 12 | 13 | update_tailor(x, tailor, ...) 
14 | } 15 | \arguments{ 16 | \item{x}{A workflow} 17 | 18 | \item{tailor}{A tailor created using \code{\link[tailor:tailor]{tailor::tailor()}}. The tailor 19 | should not have been trained already with \code{\link[tailor:reexports]{tailor::fit()}}; workflows 20 | will handle training internally.} 21 | 22 | \item{...}{Not used.} 23 | } 24 | \value{ 25 | \code{x}, updated with either a new or removed tailor postprocessor. 26 | } 27 | \description{ 28 | \itemize{ 29 | \item \code{add_tailor()} specifies post-processing steps to apply through the 30 | usage of a tailor. 31 | \item \code{remove_tailor()} removes the tailor as well as any downstream objects 32 | that might get created after the tailor is used for post-processing, such as 33 | the fitted tailor. 34 | \item \code{update_tailor()} first removes the tailor, then replaces the previous 35 | tailor with the new one. 36 | } 37 | } 38 | \section{Data Usage}{ 39 | 40 | 41 | While preprocessors and models are trained on data in the usual sense, 42 | postprocessors are trained on \emph{predictions} on data. When a workflow 43 | is fitted, the user typically supplies training data with the \code{data} argument. 44 | When workflows don't contain a postprocessor that requires training, 45 | users can pass all of the available data to the \code{data} argument to train the 46 | preprocessor and model. However, in the case where a postprocessor must be 47 | trained as well, allotting all of the available data to the \code{data} argument 48 | to train the preprocessor and model would leave no data 49 | to train the postprocessor with---if that were the case, workflows 50 | would need to \code{predict()} from the preprocessor and model on the same \code{data} 51 | that they were trained on, with the postprocessor then training on those 52 | predictions.
Predictions on data that a model was trained on likely follow 53 | different distributions than predictions on unseen data; thus, workflows must 54 | split up the supplied \code{data} into two training sets, where the first is used to 55 | train the preprocessor and model and the second, called the "calibration set," 56 | is passed to that trained preprocessor and model to generate predictions, 57 | which then form the training data for the postprocessor. 58 | 59 | When fitting a workflow with a postprocessor that requires training 60 | (i.e. one that returns \code{TRUE} in \code{.workflow_includes_calibration(workflow)}), users 61 | must pass two data arguments--the usual \code{fit.workflow(data)} will be used 62 | to train the preprocessor and model while \code{fit.workflow(calibration)} will 63 | be used to train the postprocessor. 64 | 65 | In some situations, randomly splitting \code{fit.workflow(data)} (with 66 | \code{rsample::initial_split()}, for example) is sufficient to prevent data 67 | leakage. However, \code{fit.workflow(data)} could also have arisen as: 68 | 69 | \if{html}{\out{
}}\preformatted{boots <- rsample::bootstraps(some_other_data) 70 | split <- rsample::get_rsplit(boots, 1) 71 | data <- rsample::analysis(split) 72 | }\if{html}{\out{
}} 73 | 74 | In this case, some of the rows in \code{data} will be duplicated. Thus, randomly 75 | allotting some of them to train the preprocessor and model and others to train 76 | the postprocessor would likely result in the same rows appearing in both 77 | datasets, with the preprocessor and model generating predictions on 78 | rows they've seen before. Similarly problematic situations could arise in the 79 | context of other resampling schemes, like time-based splits. 80 | In general, use the \code{rsample::inner_split()} function to prevent data 81 | leakage when resampling; when workflows with postprocessors that require 82 | training are passed to the tune package, this is handled internally. 83 | } 84 | 85 | \examples{ 86 | \dontshow{if (rlang::is_installed(c("tailor", "probably"))) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} 87 | library(tailor) 88 | library(magrittr) 89 | 90 | tailor <- tailor() 91 | tailor_1 <- adjust_probability_threshold(tailor, .1) 92 | 93 | workflow <- workflow() |> 94 | add_tailor(tailor_1) 95 | 96 | workflow 97 | 98 | remove_tailor(workflow) 99 | 100 | update_tailor(workflow, adjust_probability_threshold(tailor, .2)) 101 | \dontshow{\}) # examplesIf} 102 | } 103 | -------------------------------------------------------------------------------- /man/add_variables.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/pre-action-variables.R 3 | \name{add_variables} 4 | \alias{add_variables} 5 | \alias{remove_variables} 6 | \alias{update_variables} 7 | \alias{workflow_variables} 8 | \title{Add variables to a workflow} 9 | \usage{ 10 | add_variables(x, outcomes, predictors, ..., blueprint = NULL, variables = NULL) 11 | 12 | remove_variables(x) 13 | 14 | update_variables( 15 | x, 16 | outcomes, 17 | predictors, 18 | ..., 19 | blueprint = NULL, 20 | variables = NULL 21 | ) 22 | 23 |
workflow_variables(outcomes, predictors) 24 | } 25 | \arguments{ 26 | \item{x}{A workflow} 27 | 28 | \item{outcomes, predictors}{Tidyselect expressions specifying the terms 29 | of the model. \code{outcomes} is evaluated first, and then all outcome columns 30 | are removed from the data before \code{predictors} is evaluated. 31 | See \link[tidyselect:language]{tidyselect::select_helpers} for the full range of possible ways to 32 | specify terms.} 33 | 34 | \item{...}{Not used.} 35 | 36 | \item{blueprint}{A hardhat blueprint used for fine tuning the preprocessing. 37 | 38 | If \code{NULL}, \code{\link[hardhat:default_xy_blueprint]{hardhat::default_xy_blueprint()}} is used. 39 | 40 | Note that preprocessing done here is separate from preprocessing that 41 | might be done by the underlying model.} 42 | 43 | \item{variables}{An alternative specification of \code{outcomes} and \code{predictors}, 44 | useful for supplying variables programmatically. 45 | \itemize{ 46 | \item If \code{NULL}, this argument is unused, and \code{outcomes} and \code{predictors} are 47 | used to specify the variables. 48 | \item Otherwise, this must be the result of calling \code{workflow_variables()} to 49 | create a standalone variables object. In this case, \code{outcomes} and 50 | \code{predictors} are completely ignored. 51 | }} 52 | } 53 | \value{ 54 | \itemize{ 55 | \item \code{add_variables()} returns \code{x} with a new variables preprocessor. 56 | \item \code{remove_variables()} returns \code{x} after resetting any model fit and 57 | removing the variables preprocessor. 58 | \item \code{update_variables()} returns \code{x} after removing the variables preprocessor, 59 | and then re-specifying it with new variables. 60 | \item \code{workflow_variables()} returns a 'workflow_variables' object containing 61 | both the \code{outcomes} and \code{predictors}. 
62 | } 63 | } 64 | \description{ 65 | \itemize{ 66 | \item \code{add_variables()} specifies the terms of the model through the usage of 67 | \link[tidyselect:language]{tidyselect::select_helpers} for the \code{outcomes} and \code{predictors}. 68 | \item \code{remove_variables()} removes the variables. Additionally, if the model 69 | has already been fit, then the fit is removed. 70 | \item \code{update_variables()} first removes the variables, then replaces the 71 | previous variables with the new ones. Any model that has already been 72 | fit based on the original variables will need to be refit. 73 | \item \code{workflow_variables()} bundles \code{outcomes} and \code{predictors} into a single 74 | variables object, which can be supplied to \code{add_variables()}. 75 | } 76 | } 77 | \details{ 78 | To fit a workflow, exactly one of \code{\link[=add_formula]{add_formula()}}, \code{\link[=add_recipe]{add_recipe()}}, or 79 | \code{\link[=add_variables]{add_variables()}} \emph{must} be specified. 80 | } 81 | \examples{ 82 | library(parsnip) 83 | 84 | spec_lm <- linear_reg() 85 | spec_lm <- set_engine(spec_lm, "lm") 86 | 87 | workflow <- workflow() 88 | workflow <- add_model(workflow, spec_lm) 89 | 90 | # Add terms with tidyselect expressions. 91 | # Outcomes are specified before predictors. 92 | workflow1 <- add_variables( 93 | workflow, 94 | outcomes = mpg, 95 | predictors = c(cyl, disp) 96 | ) 97 | 98 | workflow1 <- fit(workflow1, mtcars) 99 | workflow1 100 | 101 | # Removing the variables of a fit workflow will also remove the model 102 | remove_variables(workflow1) 103 | 104 | # Variables can also be updated 105 | update_variables(workflow1, mpg, starts_with("d")) 106 | 107 | # The `outcomes` are removed before the `predictors` expression 108 | # is evaluated. This allows you to easily specify the predictors 109 | # as "everything except the outcomes". 
110 | workflow2 <- add_variables(workflow, mpg, everything()) 111 | workflow2 <- fit(workflow2, mtcars) 112 | extract_mold(workflow2)$predictors 113 | 114 | # Variables can also be added from the result of a call to 115 | # `workflow_variables()`, which creates a standalone variables object 116 | variables <- workflow_variables(mpg, c(cyl, disp)) 117 | workflow3 <- add_variables(workflow, variables = variables) 118 | fit(workflow3, mtcars) 119 | } 120 | -------------------------------------------------------------------------------- /man/augment.workflow.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/broom.R 3 | \name{augment.workflow} 4 | \alias{augment.workflow} 5 | \title{Augment data with predictions} 6 | \usage{ 7 | \method{augment}{workflow}(x, new_data, eval_time = NULL, ...) 8 | } 9 | \arguments{ 10 | \item{x}{A workflow} 11 | 12 | \item{new_data}{A data frame of predictors} 13 | 14 | \item{eval_time}{For censored regression models, a vector of time points at 15 | which the survival probability is estimated. See 16 | \code{\link[parsnip:augment]{parsnip::augment.model_fit()}} for more details.} 17 | 18 | \item{...}{Arguments passed on to methods} 19 | } 20 | \value{ 21 | \code{new_data} with new prediction-specific columns. 22 | } 23 | \description{ 24 | This is a \code{\link[generics:augment]{generics::augment()}} method for a workflow that calls 25 | \code{augment()} on the underlying parsnip model with \code{new_data}. 26 | 27 | \code{x} must be a trained workflow, resulting in a fitted parsnip model to 28 | \code{augment()} with. 29 | 30 | \code{new_data} will be preprocessed using the preprocessor in the workflow, 31 | and that preprocessed data will be used to generate predictions. The 32 | final result will contain the original \code{new_data} with new columns containing 33 | the prediction information.
34 | } 35 | \examples{ 36 | if (rlang::is_installed("broom")) { 37 | 38 | library(parsnip) 39 | library(magrittr) 40 | library(modeldata) 41 | 42 | data("attrition") 43 | 44 | model <- logistic_reg() |> 45 | set_engine("glm") 46 | 47 | wf <- workflow() |> 48 | add_model(model) |> 49 | add_formula( 50 | Attrition ~ BusinessTravel + YearsSinceLastPromotion + OverTime 51 | ) 52 | 53 | wf_fit <- fit(wf, attrition) 54 | 55 | augment(wf_fit, attrition) 56 | 57 | } 58 | } 59 | -------------------------------------------------------------------------------- /man/control_workflow.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/control.R 3 | \name{control_workflow} 4 | \alias{control_workflow} 5 | \title{Control object for a workflow} 6 | \usage{ 7 | control_workflow(control_parsnip = NULL) 8 | } 9 | \arguments{ 10 | \item{control_parsnip}{A parsnip control object. If \code{NULL}, a default control 11 | argument is constructed from \code{\link[parsnip:control_parsnip]{parsnip::control_parsnip()}}.} 12 | } 13 | \value{ 14 | A \code{control_workflow} object for tweaking the workflow fitting process. 15 | } 16 | \description{ 17 | \code{control_workflow()} holds the control parameters for a workflow. 
18 | } 19 | \examples{ 20 | control_workflow() 21 | } 22 | -------------------------------------------------------------------------------- /man/extract-workflow.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/extract.R 3 | \name{extract-workflow} 4 | \alias{extract-workflow} 5 | \alias{extract_spec_parsnip.workflow} 6 | \alias{extract_recipe.workflow} 7 | \alias{extract_fit_parsnip.workflow} 8 | \alias{extract_fit_engine.workflow} 9 | \alias{extract_mold.workflow} 10 | \alias{extract_preprocessor.workflow} 11 | \alias{extract_postprocessor.workflow} 12 | \alias{extract_parameter_set_dials.workflow} 13 | \alias{extract_parameter_dials.workflow} 14 | \alias{extract_fit_time.workflow} 15 | \title{Extract elements of a workflow} 16 | \usage{ 17 | \method{extract_spec_parsnip}{workflow}(x, ...) 18 | 19 | \method{extract_recipe}{workflow}(x, ..., estimated = TRUE) 20 | 21 | \method{extract_fit_parsnip}{workflow}(x, ...) 22 | 23 | \method{extract_fit_engine}{workflow}(x, ...) 24 | 25 | \method{extract_mold}{workflow}(x, ...) 26 | 27 | \method{extract_preprocessor}{workflow}(x, ...) 28 | 29 | \method{extract_postprocessor}{workflow}(x, estimated = TRUE, ...) 30 | 31 | \method{extract_parameter_set_dials}{workflow}(x, ...) 32 | 33 | \method{extract_parameter_dials}{workflow}(x, parameter, ...) 34 | 35 | \method{extract_fit_time}{workflow}(x, summarize = TRUE, ...) 36 | } 37 | \arguments{ 38 | \item{x}{A workflow} 39 | 40 | \item{...}{Not currently used.} 41 | 42 | \item{estimated}{A logical for whether the original (unfit) recipe or the 43 | fitted recipe should be returned. 
This argument should be named.} 44 | 45 | \item{parameter}{A single string for the parameter ID.} 46 | 47 | \item{summarize}{A logical for whether the elapsed fit time should be returned as a 48 | single row or multiple rows.} 49 | } 50 | \value{ 51 | The extracted value from the object, \code{x}, as described in the description 52 | section. 53 | } 54 | \description{ 55 | These functions extract various elements from a workflow object. If they do 56 | not exist yet, an error is thrown. 57 | \itemize{ 58 | \item \code{extract_preprocessor()} returns the formula, recipe, or variable 59 | expressions used for preprocessing. 60 | \item \code{extract_spec_parsnip()} returns the parsnip model specification. 61 | \item \code{extract_fit_parsnip()} returns the parsnip model fit object. 62 | \item \code{extract_fit_engine()} returns the engine specific fit embedded within 63 | a parsnip model fit. For example, when using \code{\link[parsnip:linear_reg]{parsnip::linear_reg()}} 64 | with the \code{"lm"} engine, this returns the underlying \code{lm} object. 65 | \item \code{extract_mold()} returns the preprocessed "mold" object returned 66 | from \code{\link[hardhat:mold]{hardhat::mold()}}. It contains information about the preprocessing, 67 | including either the prepped recipe, the formula terms object, or 68 | variable selectors. 69 | \item \code{extract_recipe()} returns the recipe. The \code{estimated} argument specifies 70 | whether the fitted or original recipe is returned. 71 | \item \code{extract_parameter_dials()} returns a single dials parameter object. 72 | \item \code{extract_parameter_set_dials()} returns a set of dials parameter objects. 73 | \item \code{extract_fit_time()} returns a tibble with elapsed fit times. The fit 74 | times correspond to the time for the parsnip engine or recipe steps to fit 75 | (or their sum if \code{summarize = TRUE}) and do not include other portions of 76 | the elapsed time in \code{\link[=fit.workflow]{fit.workflow()}}. 
77 | } 78 | } 79 | \details{ 80 | Extracting the underlying engine fit can be helpful for describing the 81 | model (via \code{print()}, \code{summary()}, \code{plot()}, etc.) or for variable 82 | importance/explainers. 83 | 84 | However, users should not invoke the \code{predict()} method on an extracted 85 | model. There may be preprocessing operations that \code{workflows} has executed on 86 | the data prior to giving it to the model. Bypassing these can lead to errors 87 | or silently generating incorrect predictions. 88 | 89 | \emph{Good}: 90 | 91 | \if{html}{\out{
}}\preformatted{workflow_fit |> predict(new_data) 92 | }\if{html}{\out{
}} 93 | 94 | \emph{Bad}: 95 | 96 | \if{html}{\out{
}}\preformatted{workflow_fit |> extract_fit_engine() |> predict(new_data) 97 | # or 98 | workflow_fit |> extract_fit_parsnip() |> predict(new_data) 99 | }\if{html}{\out{
}} 100 | } 101 | \examples{ 102 | \dontshow{if (rlang::is_installed("recipes")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} 103 | library(parsnip) 104 | library(recipes) 105 | library(magrittr) 106 | 107 | model <- linear_reg() |> 108 | set_engine("lm") 109 | 110 | recipe <- recipe(mpg ~ cyl + disp, mtcars) |> 111 | step_log(disp) 112 | 113 | base_wf <- workflow() |> 114 | add_model(model) 115 | 116 | recipe_wf <- add_recipe(base_wf, recipe) 117 | formula_wf <- add_formula(base_wf, mpg ~ cyl + log(disp)) 118 | variable_wf <- add_variables(base_wf, mpg, c(cyl, disp)) 119 | 120 | fit_recipe_wf <- fit(recipe_wf, mtcars) 121 | fit_formula_wf <- fit(formula_wf, mtcars) 122 | 123 | # The preprocessor is a recipe, formula, or a list holding the 124 | # tidyselect expressions identifying the outcomes/predictors 125 | extract_preprocessor(recipe_wf) 126 | extract_preprocessor(formula_wf) 127 | extract_preprocessor(variable_wf) 128 | 129 | # The `spec` is the parsnip spec before it has been fit. 130 | # The `fit` is the fitted parsnip model. 131 | extract_spec_parsnip(fit_formula_wf) 132 | extract_fit_parsnip(fit_formula_wf) 133 | extract_fit_engine(fit_formula_wf) 134 | 135 | # The mold is returned from `hardhat::mold()`, and contains the 136 | # predictors, outcomes, and information about the preprocessing 137 | # for use on new data at `predict()` time. 
138 | extract_mold(fit_recipe_wf) 139 | 140 | # A useful shortcut is to extract the fitted recipe from the workflow 141 | extract_recipe(fit_recipe_wf) 142 | 143 | # That is identical to 144 | identical( 145 | extract_mold(fit_recipe_wf)$blueprint$recipe, 146 | extract_recipe(fit_recipe_wf) 147 | ) 148 | \dontshow{\}) # examplesIf} 149 | } 150 | -------------------------------------------------------------------------------- /man/figures/lifecycle-deprecated.svg: -------------------------------------------------------------------------------- 1 | lifecyclelifecycledeprecateddeprecated -------------------------------------------------------------------------------- /man/figures/lifecycle-soft-deprecated.svg: -------------------------------------------------------------------------------- 1 | lifecyclelifecyclesoft-deprecatedsoft-deprecated -------------------------------------------------------------------------------- /man/figures/logo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tidymodels/workflows/835ee3574688f23375ee52b4fb1bd060b44b0e57/man/figures/logo.png -------------------------------------------------------------------------------- /man/glance.workflow.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/broom.R 3 | \name{glance.workflow} 4 | \alias{glance.workflow} 5 | \title{Glance at a workflow model} 6 | \usage{ 7 | \method{glance}{workflow}(x, ...) 8 | } 9 | \arguments{ 10 | \item{x}{A workflow} 11 | 12 | \item{...}{Arguments passed on to methods} 13 | } 14 | \description{ 15 | This is a \code{\link[generics:glance]{generics::glance()}} method for a workflow that calls \code{glance()} on 16 | the underlying parsnip model. 17 | 18 | \code{x} must be a trained workflow, resulting in a fitted parsnip model to 19 | \code{glance()} at. 
20 | } 21 | \examples{ 22 | if (rlang::is_installed(c("broom", "modeldata"))) { 23 | 24 | library(parsnip) 25 | library(magrittr) 26 | library(modeldata) 27 | 28 | data("attrition") 29 | 30 | model <- logistic_reg() |> 31 | set_engine("glm") 32 | 33 | wf <- workflow() |> 34 | add_model(model) |> 35 | add_formula( 36 | Attrition ~ BusinessTravel + YearsSinceLastPromotion + OverTime 37 | ) 38 | 39 | # Workflow must be trained to call `glance()` 40 | try(glance(wf)) 41 | 42 | wf_fit <- fit(wf, attrition) 43 | 44 | glance(wf_fit) 45 | 46 | } 47 | } 48 | -------------------------------------------------------------------------------- /man/is_trained_workflow.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/workflow.R 3 | \name{is_trained_workflow} 4 | \alias{is_trained_workflow} 5 | \title{Determine if a workflow has been trained} 6 | \usage{ 7 | is_trained_workflow(x) 8 | } 9 | \arguments{ 10 | \item{x}{A workflow.} 11 | } 12 | \value{ 13 | A single logical indicating if the workflow has been trained or not. 14 | } 15 | \description{ 16 | A trained workflow is one that has gone through \code{\link[=fit.workflow]{fit()}}, 17 | which preprocesses the underlying data, and fits the parsnip model. 
18 | } 19 | \examples{ 20 | \dontshow{if (rlang::is_installed("recipes")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} 21 | library(parsnip) 22 | library(recipes) 23 | library(magrittr) 24 | 25 | rec <- recipe(mpg ~ cyl, mtcars) 26 | 27 | mod <- linear_reg() 28 | mod <- set_engine(mod, "lm") 29 | 30 | wf <- workflow() |> 31 | add_recipe(rec) |> 32 | add_model(mod) 33 | 34 | # Before any preprocessing or model fitting has been done 35 | is_trained_workflow(wf) 36 | 37 | wf <- fit(wf, mtcars) 38 | 39 | # After all preprocessing and model fitting 40 | is_trained_workflow(wf) 41 | \dontshow{\}) # examplesIf} 42 | } 43 | -------------------------------------------------------------------------------- /man/predict-workflow.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/predict.R 3 | \name{predict-workflow} 4 | \alias{predict-workflow} 5 | \alias{predict.workflow} 6 | \title{Predict from a workflow} 7 | \usage{ 8 | \method{predict}{workflow}(object, new_data, type = NULL, opts = list(), ...) 9 | } 10 | \arguments{ 11 | \item{object}{A workflow that has been fit by \code{\link[=fit.workflow]{fit.workflow()}}} 12 | 13 | \item{new_data}{A data frame containing the new predictors to preprocess 14 | and predict on. If using a recipe preprocessor, you should not call 15 | \code{\link[recipes:bake]{recipes::bake()}} on \code{new_data} before passing to this function.} 16 | 17 | \item{type}{A single character value or \code{NULL}. Possible values 18 | are \code{"numeric"}, \code{"class"}, \code{"prob"}, \code{"conf_int"}, \code{"pred_int"}, 19 | \code{"quantile"}, \code{"time"}, \code{"hazard"}, \code{"survival"}, or \code{"raw"}. 
When \code{NULL}, 20 | \code{predict()} will choose an appropriate value based on the model's mode.} 21 | 22 | \item{opts}{A list of optional arguments to the underlying 23 | predict function that will be used when \code{type = "raw"}. The 24 | list should not include options for the model object or the 25 | new data being predicted.} 26 | 27 | \item{...}{Additional \code{parsnip}-related options, depending on the 28 | value of \code{type}. Arguments to the underlying model's prediction 29 | function cannot be passed here (use the \code{opts} argument instead). 30 | Possible arguments are: 31 | \itemize{ 32 | \item \code{interval}: for \code{type} equal to \code{"survival"} or \code{"quantile"}, should 33 | interval estimates be added, if available? Options are \code{"none"} 34 | and \code{"confidence"}. 35 | \item \code{level}: for \code{type} equal to \code{"conf_int"}, \code{"pred_int"}, or \code{"survival"}, 36 | this is the parameter for the tail area of the intervals 37 | (e.g. confidence level for confidence intervals). 38 | Default value is \code{0.95}. 39 | \item \code{std_error}: for \code{type} equal to \code{"conf_int"} or \code{"pred_int"}, add 40 | the standard error of fit or prediction (on the scale of the 41 | linear predictors). Default value is \code{FALSE}. 42 | \item \code{quantile}: for \code{type} equal to \code{"quantile"}, the quantiles of the 43 | distribution. Default is \code{(1:9)/10}. 44 | \item \code{eval_time}: for \code{type} equal to \code{"survival"} or \code{"hazard"}, the 45 | time points at which the survival probability or hazard is estimated. 46 | }} 47 | } 48 | \value{ 49 | A data frame of model predictions, with as many rows as \code{new_data} has. 50 | } 51 | \description{ 52 | This is the \code{predict()} method for a fitted workflow object. 
The nice thing 53 | about predicting from a workflow is that it will: 54 | \itemize{ 55 | \item Preprocess \code{new_data} using the preprocessing method specified when the 56 | workflow was created and fit. This is accomplished using 57 | \code{\link[hardhat:forge]{hardhat::forge()}}, which will apply any formula preprocessing or call 58 | \code{\link[recipes:bake]{recipes::bake()}} if a recipe was supplied. 59 | \item Call \code{\link[parsnip:predict.model_fit]{parsnip::predict.model_fit()}} for you using the underlying fit 60 | parsnip model. 61 | } 62 | } 63 | \examples{ 64 | \dontshow{if (rlang::is_installed("recipes")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} 65 | library(parsnip) 66 | library(recipes) 67 | library(magrittr) 68 | 69 | training <- mtcars[1:20, ] 70 | testing <- mtcars[21:32, ] 71 | 72 | model <- linear_reg() |> 73 | set_engine("lm") 74 | 75 | workflow <- workflow() |> 76 | add_model(model) 77 | 78 | recipe <- recipe(mpg ~ cyl + disp, training) |> 79 | step_log(disp) 80 | 81 | workflow <- add_recipe(workflow, recipe) 82 | 83 | fit_workflow <- fit(workflow, training) 84 | 85 | # This will automatically `bake()` the recipe on `testing`, 86 | # applying the log step to `disp`, and then fit the regression. 
87 | predict(fit_workflow, testing) 88 | \dontshow{\}) # examplesIf} 89 | } 90 | -------------------------------------------------------------------------------- /man/reexports.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/reexports.R 3 | \docType{import} 4 | \name{reexports} 5 | \alias{reexports} 6 | \alias{extract_spec_parsnip} 7 | \alias{extract_recipe} 8 | \alias{extract_fit_parsnip} 9 | \alias{extract_fit_engine} 10 | \alias{extract_mold} 11 | \alias{extract_preprocessor} 12 | \alias{extract_postprocessor} 13 | \alias{extract_parameter_set_dials} 14 | \alias{extract_parameter_dials} 15 | \alias{extract_fit_time} 16 | \alias{required_pkgs} 17 | \alias{fit} 18 | \title{Objects exported from other packages} 19 | \keyword{internal} 20 | \description{ 21 | These objects are imported from other packages. Follow the links 22 | below to see their documentation. 23 | 24 | \describe{ 25 | \item{generics}{\code{\link[generics]{fit}}, \code{\link[generics]{required_pkgs}}} 26 | 27 | \item{hardhat}{\code{\link[hardhat:hardhat-extract]{extract_fit_engine}}, \code{\link[hardhat:hardhat-extract]{extract_fit_parsnip}}, \code{\link[hardhat:hardhat-extract]{extract_fit_time}}, \code{\link[hardhat:hardhat-extract]{extract_mold}}, \code{\link[hardhat:hardhat-extract]{extract_parameter_dials}}, \code{\link[hardhat:hardhat-extract]{extract_parameter_set_dials}}, \code{\link[hardhat:hardhat-extract]{extract_postprocessor}}, \code{\link[hardhat:hardhat-extract]{extract_preprocessor}}, \code{\link[hardhat:hardhat-extract]{extract_recipe}}, \code{\link[hardhat:hardhat-extract]{extract_spec_parsnip}}} 28 | }} 29 | 30 | -------------------------------------------------------------------------------- /man/rmd/add-formula.Rmd: -------------------------------------------------------------------------------- 1 | # Formula Handling 2 | 3 | ```{r start, include = FALSE} 4 
| options(width = 70) 5 | 6 | library(parsnip) 7 | library(workflows) 8 | library(magrittr) 9 | library(modeldata) 10 | library(hardhat) 11 | library(splines) 12 | ``` 13 | 14 | Note that, for different models, the formula given to `add_formula()` might be handled in different ways, depending on the parsnip model being used. For example, a random forest model fit using ranger would not convert any factor predictors to binary indicator variables. This is consistent with what `ranger::ranger()` would do, but is inconsistent with what `stats::model.matrix()` would do. 15 | 16 | The documentation for parsnip models provides details about how the data given in the formula are encoded for the model if they diverge from the standard `model.matrix()` methodology. Our goal is to be consistent with how the underlying model package works. 17 | 18 | ## How is this formula used? 19 | 20 | To demonstrate, the example below uses `lm()` to fit a model. The formula given to `add_formula()` is used to create the model matrix and that is what is passed to `lm()` with a simple formula of `body_mass_g ~ .`: 21 | 22 | ```{r pre-encoded-fit} 23 | library(parsnip) 24 | library(workflows) 25 | library(magrittr) 26 | library(modeldata) 27 | library(hardhat) 28 | 29 | data(penguins) 30 | 31 | lm_mod <- linear_reg() |> 32 | set_engine("lm") 33 | 34 | lm_wflow <- workflow() |> 35 | add_model(lm_mod) 36 | 37 | pre_encoded <- lm_wflow |> 38 | add_formula(body_mass_g ~ species + island + bill_depth_mm) |> 39 | fit(data = penguins) 40 | 41 | pre_encoded_parsnip_fit <- pre_encoded |> 42 | extract_fit_parsnip() 43 | 44 | pre_encoded_fit <- pre_encoded_parsnip_fit$fit 45 | 46 | # The `lm()` formula is *not* the same as the `add_formula()` formula: 47 | pre_encoded_fit 48 | ``` 49 | 50 | This can affect how the results are analyzed. 
For example, to get sequential hypothesis tests, each individual term is tested: 51 | 52 | ```{r pre-encoded-anova} 53 | anova(pre_encoded_fit) 54 | ``` 55 | 56 | ## Overriding the default encodings 57 | 58 | Users can override the model-specific encodings by using a hardhat blueprint. The blueprint can specify how factors are encoded and whether intercepts are included. As an example, if you use a formula and would like the data to be passed to a model untouched: 59 | 60 | ```{r blueprint-fit} 61 | minimal <- default_formula_blueprint(indicators = "none", intercept = FALSE) 62 | 63 | un_encoded <- lm_wflow |> 64 | add_formula( 65 | body_mass_g ~ species + island + bill_depth_mm, 66 | blueprint = minimal 67 | ) |> 68 | fit(data = penguins) 69 | 70 | un_encoded_parsnip_fit <- un_encoded |> 71 | extract_fit_parsnip() 72 | 73 | un_encoded_fit <- un_encoded_parsnip_fit$fit 74 | 75 | un_encoded_fit 76 | ``` 77 | 78 | While this looks the same, the raw columns were given to `lm()` and that function created the dummy variables. Because of this, the sequential ANOVA tests groups of parameters to get column-level p-values: 79 | 80 | ```{r blueprint-anova} 81 | anova(un_encoded_fit) 82 | ``` 83 | 84 | ## Overriding the default model formula 85 | 86 | Additionally, the formula passed to the underlying model can also be customized. In this case, the `formula` argument of `add_model()` can be used. 
To demonstrate, a spline function will be used for the bill depth: 87 | 88 | ```{r extra-formula-fit} 89 | library(splines) 90 | 91 | custom_formula <- workflow() |> 92 | add_model( 93 | lm_mod, 94 | formula = body_mass_g ~ species + island + ns(bill_depth_mm, 3) 95 | ) |> 96 | add_formula( 97 | body_mass_g ~ species + island + bill_depth_mm, 98 | blueprint = minimal 99 | ) |> 100 | fit(data = penguins) 101 | 102 | custom_parsnip_fit <- custom_formula |> 103 | extract_fit_parsnip() 104 | 105 | custom_fit <- custom_parsnip_fit$fit 106 | 107 | custom_fit 108 | ``` 109 | 110 | ## Altering the formula 111 | 112 | Finally, when a formula is updated or removed from a fitted workflow, the corresponding model fit is removed. 113 | 114 | ```{r remove} 115 | custom_formula_no_fit <- update_formula(custom_formula, body_mass_g ~ species) 116 | 117 | try(extract_fit_parsnip(custom_formula_no_fit)) 118 | ``` 119 | -------------------------------------------------------------------------------- /man/rmd/indicators.Rmd: -------------------------------------------------------------------------------- 1 | # Indicator Variable Details 2 | 3 | ```{r echo=FALSE} 4 | options(cli.width = 70, width = 70, cli.unicode = FALSE) 5 | 6 | # Load them early on so package conflict messages don't show up 7 | suppressPackageStartupMessages({ 8 | library(parsnip) 9 | library(recipes) 10 | library(workflows) 11 | library(modeldata) 12 | }) 13 | ``` 14 | 15 | 16 | Some modeling functions in R create indicator/dummy variables from categorical data when you use a model formula, and some do not. When you specify and fit a model with a `workflow()`, parsnip and workflows match and reproduce the underlying behavior of the user-specified model's computational engine. 17 | 18 | ## Formula Preprocessor 19 | 20 | In the [modeldata::Sacramento] data set of real estate prices, the `type` variable has three levels: `"Residential"`, `"Condo"`, and `"Multi-Family"`. 
This base `workflow()` contains a formula added via [add_formula()] to predict property price from property type, square footage, number of beds, and number of baths: 21 | 22 | ```{r} 23 | set.seed(123) 24 | 25 | library(parsnip) 26 | library(recipes) 27 | library(workflows) 28 | library(modeldata) 29 | 30 | data("Sacramento") 31 | 32 | base_wf <- workflow() |> 33 | add_formula(price ~ type + sqft + beds + baths) 34 | ``` 35 | 36 | This first model does create dummy/indicator variables: 37 | 38 | ```{r} 39 | lm_spec <- linear_reg() |> 40 | set_engine("lm") 41 | 42 | base_wf |> 43 | add_model(lm_spec) |> 44 | fit(Sacramento) 45 | ``` 46 | 47 | There are **five** independent variables in the fitted model for this OLS linear regression. With this model type and engine, the factor predictor `type` of the real estate properties was converted to two binary predictors, `typeMulti_Family` and `typeResidential`. (The third type, for condos, does not need its own column because it is the baseline level). 48 | 49 | This second model does not create dummy/indicator variables: 50 | 51 | ```{r} 52 | rf_spec <- rand_forest() |> 53 | set_mode("regression") |> 54 | set_engine("ranger") 55 | 56 | base_wf |> 57 | add_model(rf_spec) |> 58 | fit(Sacramento) 59 | ``` 60 | 61 | Note that there are **four** independent variables in the fitted model for this ranger random forest. With this model type and engine, indicator variables were not created for the `type` of real estate property being sold. Tree-based models such as random forest models can handle factor predictors directly, and don't need any conversion to numeric binary variables. 62 | 63 | ## Recipe Preprocessor 64 | 65 | When you specify a model with a `workflow()` and a recipe preprocessor via [add_recipe()], the _recipe_ controls whether dummy variables are created or not; the recipe overrides any underlying behavior from the model's computational engine. 
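As a quick sketch of that control (reusing the `Sacramento` data and `rf_spec` defined above; the recipe object name here is illustrative), adding `step_dummy()` to the recipe makes even the ranger engine receive indicator columns, since the recipe performs the encoding before the data reach the model:

```{r}
rec_with_dummies <- recipe(price ~ type + sqft + beds + baths, data = Sacramento) |>
  step_dummy(all_nominal_predictors())

workflow() |>
  add_recipe(rec_with_dummies) |>
  add_model(rf_spec) |>
  fit(Sacramento)
```

With this recipe, the fitted random forest should report five independent variables rather than four, because the recipe (not the engine) determined the encoding of `type`.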
66 | -------------------------------------------------------------------------------- /man/tidy.workflow.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/broom.R 3 | \name{tidy.workflow} 4 | \alias{tidy.workflow} 5 | \title{Tidy a workflow} 6 | \usage{ 7 | \method{tidy}{workflow}(x, what = "model", ...) 8 | } 9 | \arguments{ 10 | \item{x}{A workflow} 11 | 12 | \item{what}{A single string. Either \code{"model"} or \code{"recipe"} to select 13 | which part of the workflow to tidy. Defaults to tidying the model.} 14 | 15 | \item{...}{Arguments passed on to methods} 16 | } 17 | \description{ 18 | This is a \code{\link[generics:tidy]{generics::tidy()}} method for a workflow that calls \code{tidy()} on 19 | either the underlying parsnip model or the recipe, depending on the value 20 | of \code{what}. 21 | 22 | \code{x} must be a fitted workflow, resulting in a fitted parsnip model or prepped 23 | recipe that you want to tidy. 24 | } 25 | \details{ 26 | To tidy the unprepped recipe, use \code{\link[=extract_preprocessor]{extract_preprocessor()}} and call \code{tidy()} on 27 | that directly. 28 | } 29 | -------------------------------------------------------------------------------- /man/workflow-butcher.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/butcher.R 3 | \name{workflow-butcher} 4 | \alias{workflow-butcher} 5 | \alias{axe_call.workflow} 6 | \alias{axe_ctrl.workflow} 7 | \alias{axe_data.workflow} 8 | \alias{axe_env.workflow} 9 | \alias{axe_fitted.workflow} 10 | \title{Butcher methods for a workflow} 11 | \usage{ 12 | axe_call.workflow(x, verbose = FALSE, ...) 13 | 14 | axe_ctrl.workflow(x, verbose = FALSE, ...) 15 | 16 | axe_data.workflow(x, verbose = FALSE, ...) 17 | 18 | axe_env.workflow(x, verbose = FALSE, ...) 
19 | 20 | axe_fitted.workflow(x, verbose = FALSE, ...) 21 | } 22 | \arguments{ 23 | \item{x}{A workflow.} 24 | 25 | \item{verbose}{Should information be printed about how much memory is freed 26 | from butchering?} 27 | 28 | \item{...}{Extra arguments possibly used by underlying methods.} 29 | } 30 | \description{ 31 | These methods allow you to use the butcher package to reduce the size of 32 | a workflow. After calling \code{butcher::butcher()} on a workflow, the only 33 | guarantee is that you will still be able to \code{predict()} from that workflow. 34 | Other functions may not work as expected. 35 | } 36 | -------------------------------------------------------------------------------- /man/workflow-extractors.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/pull.R 3 | \name{workflow-extractors} 4 | \alias{workflow-extractors} 5 | \alias{pull_workflow_preprocessor} 6 | \alias{pull_workflow_spec} 7 | \alias{pull_workflow_fit} 8 | \alias{pull_workflow_mold} 9 | \alias{pull_workflow_prepped_recipe} 10 | \title{Extract elements of a workflow} 11 | \usage{ 12 | pull_workflow_preprocessor(x) 13 | 14 | pull_workflow_spec(x) 15 | 16 | pull_workflow_fit(x) 17 | 18 | pull_workflow_mold(x) 19 | 20 | pull_workflow_prepped_recipe(x) 21 | } 22 | \arguments{ 23 | \item{x}{A workflow} 24 | } 25 | \value{ 26 | The extracted value from the workflow, \code{x}, as described in the description 27 | section. 28 | } 29 | \description{ 30 | \ifelse{html}{\href{https://lifecycle.r-lib.org/articles/stages.html#soft-deprecated}{\figure{lifecycle-soft-deprecated.svg}{options: alt='[Soft-deprecated]'}}}{\strong{[Soft-deprecated]}} 31 | 32 | Please use the \verb{extract_*()} functions instead of these 33 | (e.g. \code{\link[=extract_mold]{extract_mold()}}). 34 | 35 | These functions extract various elements from a workflow object. 
If the requested element does 36 | not exist yet, an error is thrown. 37 | \itemize{ 38 | \item \code{pull_workflow_preprocessor()} returns the formula, recipe, or variable 39 | expressions used for preprocessing. 40 | \item \code{pull_workflow_spec()} returns the parsnip model specification. 41 | \item \code{pull_workflow_fit()} returns the parsnip model fit. 42 | \item \code{pull_workflow_mold()} returns the preprocessed "mold" object returned 43 | from \code{\link[hardhat:mold]{hardhat::mold()}}. It contains information about the preprocessing, 44 | including either the prepped recipe or the formula terms object. 45 | \item \code{pull_workflow_prepped_recipe()} returns the prepped recipe. It is 46 | extracted from the mold object returned from \code{pull_workflow_mold()}. 47 | } 48 | } 49 | \examples{ 50 | \dontshow{if (rlang::is_installed("recipes")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} 51 | library(parsnip) 52 | library(recipes) 53 | library(magrittr) 54 | 55 | model <- linear_reg() |> 56 | set_engine("lm") 57 | 58 | recipe <- recipe(mpg ~ cyl + disp, mtcars) |> 59 | step_log(disp) 60 | 61 | base_wf <- workflow() |> 62 | add_model(model) 63 | 64 | recipe_wf <- add_recipe(base_wf, recipe) 65 | formula_wf <- add_formula(base_wf, mpg ~ cyl + log(disp)) 66 | variable_wf <- add_variables(base_wf, mpg, c(cyl, disp)) 67 | 68 | fit_recipe_wf <- fit(recipe_wf, mtcars) 69 | fit_formula_wf <- fit(formula_wf, mtcars) 70 | 71 | # The preprocessor is a recipe, formula, or a list holding the 72 | # tidyselect expressions identifying the outcomes/predictors 73 | pull_workflow_preprocessor(recipe_wf) 74 | pull_workflow_preprocessor(formula_wf) 75 | pull_workflow_preprocessor(variable_wf) 76 | 77 | # The `spec` is the parsnip spec before it has been fit. 78 | # The `fit` is the fitted parsnip model. 
79 | pull_workflow_spec(fit_formula_wf) 80 | pull_workflow_fit(fit_formula_wf) 81 | 82 | # The mold is returned from `hardhat::mold()`, and contains the 83 | # predictors, outcomes, and information about the preprocessing 84 | # for use on new data at `predict()` time. 85 | pull_workflow_mold(fit_recipe_wf) 86 | 87 | # A useful shortcut is to extract the prepped recipe from the workflow 88 | pull_workflow_prepped_recipe(fit_recipe_wf) 89 | 90 | # That is identical to 91 | identical( 92 | pull_workflow_mold(fit_recipe_wf)$blueprint$recipe, 93 | pull_workflow_prepped_recipe(fit_recipe_wf) 94 | ) 95 | \dontshow{\}) # examplesIf} 96 | } 97 | \keyword{internal} 98 | -------------------------------------------------------------------------------- /man/workflows-internals.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/fit.R 3 | \name{.workflow_includes_calibration} 4 | \alias{.workflow_includes_calibration} 5 | \alias{workflows-internals} 6 | \alias{.fit_pre} 7 | \alias{.fit_model} 8 | \alias{.fit_post} 9 | \alias{.fit_finalize} 10 | \title{Internal workflow functions} 11 | \usage{ 12 | .workflow_includes_calibration(workflow) 13 | 14 | .fit_pre(workflow, data) 15 | 16 | .fit_model(workflow, control) 17 | 18 | .fit_post(workflow, data) 19 | 20 | .fit_finalize(workflow) 21 | } 22 | \arguments{ 23 | \item{workflow}{A workflow 24 | 25 | For \code{.fit_pre()}, this should be a fresh workflow. 26 | 27 | For \code{.fit_model()}, this should be a workflow that has already been trained 28 | through \code{.fit_pre()}. 
29 | 30 | For \code{.fit_finalize()}, this should be a workflow that has been through 31 | both \code{.fit_pre()} and \code{.fit_model()}.} 32 | 33 | \item{data}{A data frame of predictors and outcomes to use when fitting the 34 | workflow} 35 | 36 | \item{control}{A \code{\link[=control_workflow]{control_workflow()}} object} 37 | } 38 | \description{ 39 | \code{.fit_pre()}, \code{.fit_model()}, and \code{.fit_finalize()} are internal workflow 40 | functions for \emph{partially} fitting a workflow object. They are only exported 41 | for usage by the tuning package, \href{https://github.com/tidymodels/tune}{tune}, 42 | and the general user should never need to worry about them. 43 | } 44 | \examples{ 45 | \dontshow{if (rlang::is_installed("recipes")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} 46 | library(parsnip) 47 | library(recipes) 48 | library(magrittr) 49 | 50 | model <- linear_reg() |> 51 | set_engine("lm") 52 | 53 | wf_unfit <- workflow() |> 54 | add_model(model) |> 55 | add_formula(mpg ~ cyl + log(disp)) 56 | 57 | wf_fit_pre <- .fit_pre(wf_unfit, mtcars) 58 | wf_fit_model <- .fit_model(wf_fit_pre, control_workflow()) 59 | wf_fit <- .fit_finalize(wf_fit_model) 60 | 61 | # Notice that fitting through the model doesn't mark the 62 | # workflow as being "trained" 63 | wf_fit_model 64 | 65 | # Finalizing the workflow marks it as "trained" 66 | wf_fit 67 | 68 | # Which allows you to predict from it 69 | try(predict(wf_fit_model, mtcars)) 70 | 71 | predict(wf_fit, mtcars) 72 | \dontshow{\}) # examplesIf} 73 | } 74 | \keyword{internal} 75 | -------------------------------------------------------------------------------- /man/workflows-package.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/workflows-package.R 3 | \docType{package} 4 | \name{workflows-package} 5 | \alias{workflows} 6 | \alias{workflows-package} 7 | 
\title{workflows: Modeling Workflows} 8 | \description{ 9 | \if{html}{\figure{logo.png}{options: style='float: right' alt='logo' width='120'}} 10 | 11 | Managing both a 'parsnip' model and a preprocessor, such as a model formula or recipe from 'recipes', can often be challenging. The goal of 'workflows' is to streamline this process by bundling the model alongside the preprocessor, all within the same object. 12 | } 13 | \seealso{ 14 | Useful links: 15 | \itemize{ 16 | \item \url{https://github.com/tidymodels/workflows} 17 | \item \url{https://workflows.tidymodels.org} 18 | \item Report bugs at \url{https://github.com/tidymodels/workflows/issues} 19 | } 20 | 21 | } 22 | \author{ 23 | \strong{Maintainer}: Simon Couch \email{simon.couch@posit.co} (\href{https://orcid.org/0000-0001-5676-5107}{ORCID}) 24 | 25 | Authors: 26 | \itemize{ 27 | \item Davis Vaughan \email{davis@posit.co} 28 | } 29 | 30 | Other contributors: 31 | \itemize{ 32 | \item Posit Software, PBC (03wc8by49) [copyright holder, funder] 33 | } 34 | 35 | } 36 | \keyword{internal} 37 | -------------------------------------------------------------------------------- /pkgdown/favicon/apple-touch-icon-120x120.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tidymodels/workflows/835ee3574688f23375ee52b4fb1bd060b44b0e57/pkgdown/favicon/apple-touch-icon-120x120.png -------------------------------------------------------------------------------- /pkgdown/favicon/apple-touch-icon-152x152.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tidymodels/workflows/835ee3574688f23375ee52b4fb1bd060b44b0e57/pkgdown/favicon/apple-touch-icon-152x152.png -------------------------------------------------------------------------------- /pkgdown/favicon/apple-touch-icon-180x180.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/tidymodels/workflows/835ee3574688f23375ee52b4fb1bd060b44b0e57/pkgdown/favicon/apple-touch-icon-180x180.png -------------------------------------------------------------------------------- /pkgdown/favicon/apple-touch-icon-60x60.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tidymodels/workflows/835ee3574688f23375ee52b4fb1bd060b44b0e57/pkgdown/favicon/apple-touch-icon-60x60.png -------------------------------------------------------------------------------- /pkgdown/favicon/apple-touch-icon-76x76.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tidymodels/workflows/835ee3574688f23375ee52b4fb1bd060b44b0e57/pkgdown/favicon/apple-touch-icon-76x76.png -------------------------------------------------------------------------------- /pkgdown/favicon/apple-touch-icon.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tidymodels/workflows/835ee3574688f23375ee52b4fb1bd060b44b0e57/pkgdown/favicon/apple-touch-icon.png -------------------------------------------------------------------------------- /pkgdown/favicon/favicon-16x16.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tidymodels/workflows/835ee3574688f23375ee52b4fb1bd060b44b0e57/pkgdown/favicon/favicon-16x16.png -------------------------------------------------------------------------------- /pkgdown/favicon/favicon-32x32.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tidymodels/workflows/835ee3574688f23375ee52b4fb1bd060b44b0e57/pkgdown/favicon/favicon-32x32.png -------------------------------------------------------------------------------- /pkgdown/favicon/favicon.ico: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/tidymodels/workflows/835ee3574688f23375ee52b4fb1bd060b44b0e57/pkgdown/favicon/favicon.ico -------------------------------------------------------------------------------- /revdep/.gitignore: -------------------------------------------------------------------------------- 1 | checks 2 | library 3 | checks.noindex 4 | library.noindex 5 | data.sqlite 6 | *.html 7 | cloud.noindex 8 | -------------------------------------------------------------------------------- /revdep/README.md: -------------------------------------------------------------------------------- 1 | # Revdeps 2 | 3 | ## Failed to check (2) 4 | 5 | |package |version |error |warning |note | 6 | |:-------|:-------|:-----|:-------|:----| 7 | |NA |? | | | | 8 | |NA |? | | | | 9 | 10 | ## New problems (3) 11 | 12 | |package |version |error |warning |note | 13 | |:------------------|:-------|:------|:-------|:----| 14 | |[modeltime](problems.md#modeltime)|1.2.2 |__+1__ | | | 15 | |[modeltime.ensemble](problems.md#modeltimeensemble)|1.0.1 |__+1__ | |1 | 16 | |[modeltime.resample](problems.md#modeltimeresample)|0.2.1 | |__+1__ |1 | 17 | 18 | -------------------------------------------------------------------------------- /revdep/cran.md: -------------------------------------------------------------------------------- 1 | ## revdepcheck results 2 | 3 | We checked 29 reverse dependencies (27 from CRAN + 2 from Bioconductor), comparing R CMD check results across CRAN and dev versions of this package. 4 | 5 | * We saw 3 new problems 6 | * We failed to check 0 packages 7 | 8 | Issues with CRAN packages are summarised below. 9 | 10 | ### New problems 11 | (This reports the first line of each new failure) 12 | 13 | * modeltime 14 | checking tests ... ERROR 15 | 16 | * modeltime.ensemble 17 | checking tests ... ERROR 18 | 19 | * modeltime.resample 20 | checking re-building of vignette outputs ... 
WARNING 21 | 22 | -------------------------------------------------------------------------------- /revdep/email.yml: -------------------------------------------------------------------------------- 1 | release_date: ??? 2 | rel_release_date: ??? 3 | my_news_url: ??? 4 | release_version: ??? 5 | release_details: ??? 6 | -------------------------------------------------------------------------------- /revdep/failures.md: -------------------------------------------------------------------------------- 1 | # NA 2 | 3 |
4 | 5 | * Version: NA 6 | * GitHub: NA 7 | * Source code: https://github.com/cran/NA 8 | * Number of recursive dependencies: 0 9 | 10 | Run `cloud_details(, "NA")` for more info 11 | 12 |
13 | 14 | ## Error before installation 15 | 16 | ### Devel 17 | 18 | ``` 19 | 20 | 21 | 22 | 23 | 24 | 25 | ``` 26 | ### CRAN 27 | 28 | ``` 29 | 30 | 31 | 32 | 33 | 34 | 35 | ``` 36 | # NA 37 | 38 |
39 | 40 | * Version: NA 41 | * GitHub: NA 42 | * Source code: https://github.com/cran/NA 43 | * Number of recursive dependencies: 0 44 | 45 | Run `cloud_details(, "NA")` for more info 46 | 47 |
48 | 49 | ## Error before installation 50 | 51 | ### Devel 52 | 53 | ``` 54 | 55 | 56 | 57 | 58 | 59 | 60 | ``` 61 | ### CRAN 62 | 63 | ``` 64 | 65 | 66 | 67 | 68 | 69 | 70 | ``` 71 | -------------------------------------------------------------------------------- /revdep/problems.md: -------------------------------------------------------------------------------- 1 | # modeltime 2 | 3 |
4 | 5 | * Version: 1.2.2 6 | * GitHub: https://github.com/business-science/modeltime 7 | * Source code: https://github.com/cran/modeltime 8 | * Date/Publication: 2022-06-07 21:50:02 UTC 9 | * Number of recursive dependencies: 243 10 | 11 | Run `cloud_details(, "modeltime")` for more info 12 | 13 |
14 | 15 | ## Newly broken 16 | 17 | * checking tests ... ERROR 18 | ``` 19 | Running ‘testthat.R’ 20 | Running the tests in ‘tests/testthat.R’ failed. 21 | Last 13 lines of output: 22 | Backtrace: 23 | ▆ 24 | 1. ├─... |> fit(data_set) at test-panel-data.R:33:0 25 | 2. ├─generics::fit(., data_set) 26 | 3. ├─workflows::add_recipe(., recipe_spec |> step_rm(date)) 27 | 4. │ └─workflows:::add_action(x, action, "recipe") 28 | 5. │ └─workflows:::validate_is_workflow(x, call = call) 29 | 6. │ └─workflows:::is_workflow(x) 30 | 7. └─workflows::add_model(., svm_rbf() |> set_engine("kernlab")) 31 | 8. └─workflows:::new_action_model(spec, formula) 32 | 9. └─rlang::abort(message, call = call) 33 | 34 | [ FAIL 2 | WARN 2 | SKIP 22 | PASS 477 ] 35 | Error: Test failures 36 | Execution halted 37 | ``` 38 | 39 | # modeltime.ensemble 40 | 41 |
42 | 43 | * Version: 1.0.1 44 | * GitHub: https://github.com/business-science/modeltime.ensemble 45 | * Source code: https://github.com/cran/modeltime.ensemble 46 | * Date/Publication: 2022-06-09 12:20:02 UTC 47 | * Number of recursive dependencies: 214 48 | 49 | Run `cloud_details(, "modeltime.ensemble")` for more info 50 | 51 |
52 | 53 | ## Newly broken 54 | 55 | * checking tests ... ERROR 56 | ``` 57 | Running ‘testthat.R’ 58 | Running the tests in ‘tests/testthat.R’ failed. 59 | Last 13 lines of output: 60 | Backtrace: 61 | ▆ 62 | 1. ├─... |> fit(data_set) at test-panel-data.R:28:0 63 | 2. ├─generics::fit(., data_set) 64 | 3. ├─workflows::add_recipe(., recipe_spec |> step_rm(date)) 65 | 4. │ └─workflows:::add_action(x, action, "recipe") 66 | 5. │ └─workflows:::validate_is_workflow(x, call = call) 67 | 6. │ └─workflows:::is_workflow(x) 68 | 7. └─workflows::add_model(., boost_tree() |> set_engine("xgboost")) 69 | 8. └─workflows:::new_action_model(spec, formula) 70 | 9. └─rlang::abort(message, call = call) 71 | 72 | [ FAIL 2 | WARN 16 | SKIP 5 | PASS 52 ] 73 | Error: Test failures 74 | Execution halted 75 | ``` 76 | 77 | ## In both 78 | 79 | * checking dependencies in R code ... NOTE 80 | ``` 81 | Namespace in Imports field not imported from: ‘parsnip’ 82 | All declared Imports should be used. 83 | ``` 84 | 85 | # modeltime.resample 86 | 87 |
88 | 89 | * Version: 0.2.1 90 | * GitHub: https://github.com/business-science/modeltime.resample 91 | * Source code: https://github.com/cran/modeltime.resample 92 | * Date/Publication: 2022-06-07 14:30:03 UTC 93 | * Number of recursive dependencies: 212 94 | 95 | Run `cloud_details(, "modeltime.resample")` for more info 96 | 97 |
98 | 99 | ## Newly broken 100 | 101 | * checking re-building of vignette outputs ... WARNING 102 | ``` 103 | Error(s) in re-building vignettes: 104 | --- re-building ‘getting-started.Rmd’ using rmarkdown 105 | ── Attaching packages ────────────────────────────────────── tidymodels 1.0.0 ── 106 | ✔ broom 1.0.1 ✔ recipes 1.0.1 107 | ✔ dials 1.0.0 ✔ rsample 1.1.0 108 | ✔ dplyr 1.0.10 ✔ tibble 3.1.8 109 | ✔ ggplot2 3.3.6 ✔ tidyr 1.2.1 110 | ✔ infer 1.0.3 ✔ tune 1.0.0 111 | ✔ modeldata 1.0.1 ✔ workflows 1.1.0 112 | ✔ parsnip 1.0.1 ✔ workflowsets 1.0.0 113 | ... 114 | Error: processing vignette 'panel-data.Rmd' failed with diagnostics: 115 | `spec` must have a known mode. 116 | ℹ Set the mode of `spec` by using `parsnip::set_mode()` or by setting the mode directly in the parsnip specification function. 117 | --- failed re-building ‘panel-data.Rmd’ 118 | 119 | SUMMARY: processing the following file failed: 120 | ‘panel-data.Rmd’ 121 | 122 | Error: Vignette re-building failed. 123 | Execution halted 124 | ``` 125 | 126 | ## In both 127 | 128 | * checking dependencies in R code ... NOTE 129 | ``` 130 | Namespaces in Imports field not imported from: 131 | ‘crayon’ ‘dials’ ‘glue’ ‘parsnip’ 132 | All declared Imports should be used. 133 | ``` 134 | -------------------------------------------------------------------------------- /tests/testthat.R: -------------------------------------------------------------------------------- 1 | library(testthat) 2 | library(workflows) 3 | 4 | test_check("workflows") 5 | -------------------------------------------------------------------------------- /tests/testthat/_snaps/broom.md: -------------------------------------------------------------------------------- 1 | # can't tidy the model of an unfit workflow 2 | 3 | Code 4 | tidy(x) 5 | Condition 6 | Error in `extract_fit_parsnip()`: 7 | ! Can't extract a model fit from an untrained workflow. 8 | i Do you need to call `fit()`? 
9 | 10 | # can't tidy the recipe of an unfit workflow 11 | 12 | Code 13 | tidy(x, what = "recipe") 14 | Condition 15 | Error in `extract_recipe()`: 16 | ! The workflow must have a recipe preprocessor. 17 | 18 | --- 19 | 20 | Code 21 | tidy(x, what = "recipe") 22 | Condition 23 | Error in `extract_mold()`: 24 | ! Can't extract a mold from an untrained workflow. 25 | i Do you need to call `fit()`? 26 | 27 | # can't glance at the model of an unfit workflow 28 | 29 | Can't extract a model fit from an untrained workflow. 30 | i Do you need to call `fit()`? 31 | 32 | # can't augment with the model of an unfit workflow 33 | 34 | Can't extract a model fit from an untrained workflow. 35 | i Do you need to call `fit()`? 36 | 37 | -------------------------------------------------------------------------------- /tests/testthat/_snaps/butcher.md: -------------------------------------------------------------------------------- 1 | # fails if not a fitted workflow 2 | 3 | Code 4 | butcher::butcher(workflow()) 5 | Condition 6 | Error in `extract_fit_parsnip()`: 7 | ! Can't extract a model fit from an untrained workflow. 8 | i Do you need to call `fit()`? 9 | 10 | -------------------------------------------------------------------------------- /tests/testthat/_snaps/control.md: -------------------------------------------------------------------------------- 1 | # parsnip control is validated 2 | 3 | Code 4 | control_workflow(control_parsnip = 1) 5 | Condition 6 | Error in `control_workflow()`: 7 | ! `control_parsnip` must be a <control_parsnip> object. 8 | 9 | -------------------------------------------------------------------------------- /tests/testthat/_snaps/extract.md: -------------------------------------------------------------------------------- 1 | # error if no preprocessor 2 | 3 | Code 4 | extract_preprocessor(workflow()) 5 | Condition 6 | Error in `extract_preprocessor()`: 7 | ! The workflow does not have a preprocessor.
8 | 9 | # error if not a workflow 10 | 11 | Code 12 | extract_preprocessor(1) 13 | Condition 14 | Error in `UseMethod()`: 15 | ! no applicable method for 'extract_preprocessor' applied to an object of class "c('double', 'numeric')" 16 | 17 | --- 18 | 19 | Code 20 | extract_spec_parsnip(1) 21 | Condition 22 | Error in `UseMethod()`: 23 | ! no applicable method for 'extract_spec_parsnip' applied to an object of class "c('double', 'numeric')" 24 | 25 | --- 26 | 27 | Code 28 | extract_fit_parsnip(1) 29 | Condition 30 | Error in `UseMethod()`: 31 | ! no applicable method for 'extract_fit_parsnip' applied to an object of class "c('double', 'numeric')" 32 | 33 | --- 34 | 35 | Code 36 | extract_mold(1) 37 | Condition 38 | Error in `UseMethod()`: 39 | ! no applicable method for 'extract_mold' applied to an object of class "c('double', 'numeric')" 40 | 41 | --- 42 | 43 | Code 44 | extract_recipe(1) 45 | Condition 46 | Error in `UseMethod()`: 47 | ! no applicable method for 'extract_recipe' applied to an object of class "c('double', 'numeric')" 48 | 49 | # error if no spec 50 | 51 | Code 52 | extract_spec_parsnip(workflow()) 53 | Condition 54 | Error in `extract_spec_parsnip()`: 55 | ! The workflow does not have a model spec. 56 | 57 | # error if no parsnip fit 58 | 59 | Code 60 | extract_fit_parsnip(workflow()) 61 | Condition 62 | Error in `extract_fit_parsnip()`: 63 | ! Can't extract a model fit from an untrained workflow. 64 | i Do you need to call `fit()`? 65 | 66 | # error if no mold 67 | 68 | Code 69 | extract_mold(workflow()) 70 | Condition 71 | Error in `extract_mold()`: 72 | ! Can't extract a mold from an untrained workflow. 73 | i Do you need to call `fit()`? 74 | 75 | --- 76 | 77 | Code 78 | extract_recipe(workflow) 79 | Condition 80 | Error in `extract_mold()`: 81 | ! Can't extract a mold from an untrained workflow. 82 | i Do you need to call `fit()`? 
83 | 84 | # can extract a prepped recipe 85 | 86 | Code 87 | extract_recipe(workflow, FALSE) 88 | Condition 89 | Error in `extract_recipe()`: 90 | ! `...` must be empty. 91 | x Problematic argument: 92 | * ..1 = FALSE 93 | i Did you forget to name an argument? 94 | 95 | --- 96 | 97 | Code 98 | extract_recipe(workflow, estimated = "yes please") 99 | Condition 100 | Error in `extract_recipe()`: 101 | ! `estimated` must be a single `TRUE` or `FALSE`. 102 | 103 | # error if no recipe preprocessor 104 | 105 | Code 106 | extract_recipe(workflow()) 107 | Condition 108 | Error in `extract_recipe()`: 109 | ! The workflow must have a recipe preprocessor. 110 | 111 | # extract parameter set from workflow with potentially conflicting ids (#266) 112 | 113 | Code 114 | extract_parameter_set_dials(wflow) 115 | Condition 116 | Error in `extract_parameter_set_dials()`: 117 | x Element id should have unique values. 118 | i Duplicates exist for item: threshold 119 | 120 | -------------------------------------------------------------------------------- /tests/testthat/_snaps/fit-action-model.md: -------------------------------------------------------------------------------- 1 | # model is validated 2 | 3 | Code 4 | add_model(workflow(), 1) 5 | Condition 6 | Error in `add_model()`: 7 | ! `spec` must be a <model_spec>. 8 | 9 | # model must contain a known mode (#160) 10 | 11 | Code 12 | add_model(workflow, mod) 13 | Condition 14 | Error in `add_model()`: 15 | ! `spec` must have a known mode. 16 | i Set the mode of `spec` by using `parsnip::set_mode()` or by setting the mode directly in the parsnip specification function. 17 | 18 | # prompt on spec without a loaded implementation (#174) 19 | 20 | Code 21 | add_model(workflow, mod) 22 | Condition 23 | Error in `add_model()`: 24 | ! parsnip could not locate an implementation for `bag_tree` regression model specifications. 25 | i The parsnip extension package baguette implements support for this specification.
26 | i Please install (if needed) and load to continue. 27 | 28 | --- 29 | 30 | Code 31 | workflow(spec = mod) 32 | Condition 33 | Error in `add_model()`: 34 | ! parsnip could not locate an implementation for `bag_tree` regression model specifications. 35 | i The parsnip extension package baguette implements support for this specification. 36 | i Please install (if needed) and load to continue. 37 | 38 | # cannot add two models 39 | 40 | Code 41 | add_model(workflow, mod) 42 | Condition 43 | Error in `add_model()`: 44 | ! A `model` action has already been added to this workflow. 45 | 46 | -------------------------------------------------------------------------------- /tests/testthat/_snaps/fit.md: -------------------------------------------------------------------------------- 1 | # missing `data` argument has a nice error 2 | 3 | Code 4 | fit(workflow) 5 | Condition 6 | Error in `fit()`: 7 | ! `data` must be provided to fit a workflow. 8 | 9 | # invalid `control` argument has a nice error 10 | 11 | Code 12 | fit(workflow, mtcars, control = control) 13 | Condition 14 | Error in `fit()`: 15 | ! `control` must be a workflows control object created by `control_workflow()`. 16 | 17 | # cannot fit without a pre stage 18 | 19 | Code 20 | fit(workflow, mtcars) 21 | Condition 22 | Error in `.fit_pre()`: 23 | ! The workflow must have a formula, recipe, or variables preprocessor. 24 | i Provide one with `add_formula()`, `add_recipe()`, or `add_variables()`. 25 | 26 | # cannot fit without a fit stage 27 | 28 | Code 29 | fit(workflow, mtcars) 30 | Condition 31 | Error in `.fit_pre()`: 32 | ! The workflow must have a model. 33 | i Provide one with `add_model()`. 34 | 35 | # fit.workflow confirms compatibility of object and calibration 36 | 37 | Code 38 | res <- fit(workflow, mtcars, calibration = mtcars) 39 | Condition 40 | Warning in `fit()`: 41 | The workflow does not require a `calibration` set to train but one was supplied. 
42 | 43 | --- 44 | 45 | Code 46 | fit(workflow, mtcars) 47 | Condition 48 | Error in `fit()`: 49 | ! The workflow requires a `calibration` set to train but none was supplied. 50 | 51 | # can `predict()` from workflow fit from individual pieces 52 | 53 | Code 54 | predict(workflow_model, mtcars) 55 | Condition 56 | Error in `predict()`: 57 | ! Can't predict on an untrained workflow. 58 | i Do you need to call `fit()`? 59 | 60 | -------------------------------------------------------------------------------- /tests/testthat/_snaps/post-action-tailor.md: -------------------------------------------------------------------------------- 1 | # postprocessor is validated 2 | 3 | Code 4 | add_tailor(workflow(), 1) 5 | Condition 6 | Error in `add_tailor()`: 7 | ! `tailor` must be a tailor. 8 | 9 | # cannot add two postprocessors 10 | 11 | Code 12 | add_tailor(workflow, post) 13 | Condition 14 | Error in `add_tailor()`: 15 | ! A `tailor` action has already been added to this workflow. 16 | 17 | -------------------------------------------------------------------------------- /tests/testthat/_snaps/pre-action-case-weights.md: -------------------------------------------------------------------------------- 1 | # case weights + recipe doesn't allow the recipe to drop the case weights column 2 | 3 | Code 4 | fit(wf, df) 5 | Condition 6 | Error in `fit()`: 7 | ! No columns with a "case_weights" role exist in the data after processing the recipe. 8 | i Did you remove or modify the case weights while processing the recipe? 9 | 10 | # case weights + recipe doesn't allow the recipe to adjust the case weights column class 11 | 12 | Code 13 | fit(wf, df) 14 | Condition 15 | Error in `fit()`: 16 | ! The column with a recipes role of "case_weights" must be a classed case weights column, as determined by `hardhat::is_case_weights()`. 17 | i Did you modify the case weights while processing the recipe? 
18 | 19 | # case weights + recipe doesn't allow the recipe to change the name of the case weights column 20 | 21 | Code 22 | fit(wf, df) 23 | Condition 24 | Error in `fit()`: 25 | ! Can't select columns that don't exist. 26 | x Column `w` doesn't exist. 27 | 28 | # case weights `col` can't select >1 columns in `data` 29 | 30 | Code 31 | fit(wf, mtcars) 32 | Condition 33 | Error in `fit()`: 34 | ! `col` must specify exactly one column from `data` to extract case weights from. 35 | 36 | # case weights must inherit from the base case weights class 37 | 38 | Code 39 | fit(wf, df) 40 | Condition 41 | Error in `fit()`: 42 | ! `col` must select a classed case weights column, as determined by `hardhat::is_case_weights()`. 43 | i For example, it could be a column created by `hardhat::frequency_weights()` or `hardhat::importance_weights()`. 44 | 45 | -------------------------------------------------------------------------------- /tests/testthat/_snaps/pre-action-formula.md: -------------------------------------------------------------------------------- 1 | # formula is validated 2 | 3 | Code 4 | add_formula(workflow(), 1) 5 | Condition 6 | Error in `add_formula()`: 7 | ! `formula` must be a formula. 8 | 9 | # cannot add a formula if a recipe already exists 10 | 11 | Code 12 | add_formula(workflow, mpg ~ cyl) 13 | Condition 14 | Error in `add_formula()`: 15 | ! A formula cannot be added when a recipe already exists. 16 | 17 | # cannot add a formula if variables already exist 18 | 19 | Code 20 | add_formula(workflow, mpg ~ cyl) 21 | Condition 22 | Error in `add_formula()`: 23 | ! A formula cannot be added when variables already exist. 24 | 25 | # cannot add two formulas 26 | 27 | Code 28 | add_formula(workflow, mpg ~ cyl) 29 | Condition 30 | Error in `add_formula()`: 31 | ! A `formula` action has already been added to this workflow. 
32 | 33 | # can't pass an `offset()` through `add_formula()` (#162) 34 | 35 | Code 36 | fit(workflow, data = df) 37 | Condition 38 | Error in `fit()`: 39 | ! Can't use an offset in the formula supplied to `add_formula()`. 40 | i Instead, specify offsets through a model formula in `add_model(formula = )`. 41 | 42 | # can only use a 'formula_blueprint' blueprint 43 | 44 | Code 45 | add_formula(workflow, mpg ~ cyl, blueprint = blueprint) 46 | Condition 47 | Error in `add_formula()`: 48 | ! `blueprint` must be a hardhat <formula_blueprint>. 49 | 50 | -------------------------------------------------------------------------------- /tests/testthat/_snaps/pre-action-recipe.md: -------------------------------------------------------------------------------- 1 | # recipe is validated 2 | 3 | Code 4 | add_recipe(workflow(), 1) 5 | Condition 6 | Error in `add_recipe()`: 7 | ! `recipe` must be a recipe. 8 | 9 | # cannot add a recipe if a formula already exists 10 | 11 | Code 12 | add_recipe(workflow, rec) 13 | Condition 14 | Error in `add_recipe()`: 15 | ! A recipe cannot be added when a formula already exists. 16 | 17 | # cannot add a recipe if variables already exist 18 | 19 | Code 20 | add_recipe(workflow, rec) 21 | Condition 22 | Error in `add_recipe()`: 23 | ! A recipe cannot be added when variables already exist. 24 | 25 | # cannot add a recipe if recipe is trained 26 | 27 | Code 28 | add_recipe(workflow, rec) 29 | Condition 30 | Error in `add_recipe()`: 31 | ! Can't add a trained recipe to a workflow. 32 | 33 | # cannot add two recipe 34 | 35 | Code 36 | add_recipe(workflow, rec) 37 | Condition 38 | Error in `add_recipe()`: 39 | ! A `recipe` action has already been added to this workflow. 40 | 41 | # can only use a 'recipe_blueprint' blueprint 42 | 43 | Code 44 | add_recipe(workflow, rec, blueprint = blueprint) 45 | Condition 46 | Error in `add_recipe()`: 47 | ! `blueprint` must be a hardhat <recipe_blueprint>.
48 | 49 | -------------------------------------------------------------------------------- /tests/testthat/_snaps/pre-action-variables.md: -------------------------------------------------------------------------------- 1 | # cannot add variables if a recipe already exists 2 | 3 | Code 4 | add_variables(wf, y, x) 5 | Condition 6 | Error in `add_variables()`: 7 | ! Variables cannot be added when a recipe already exists. 8 | 9 | # cannot add variables if a formula already exist 10 | 11 | Code 12 | add_variables(wf, y, x) 13 | Condition 14 | Error in `add_variables()`: 15 | ! Variables cannot be added when a formula already exists. 16 | 17 | # informative error if either `predictors` or `outcomes` aren't provided (#144) 18 | 19 | Code 20 | add_variables(workflow(), outcomes = mpg) 21 | Condition 22 | Error in `workflow_variables()`: 23 | ! `predictors` can't be missing. 24 | 25 | --- 26 | 27 | Code 28 | add_variables(workflow(), predictors = mpg) 29 | Condition 30 | Error in `workflow_variables()`: 31 | ! `outcomes` can't be missing. 32 | 33 | # cannot add two variables 34 | 35 | Code 36 | add_variables(workflow, mpg, cyl) 37 | Condition 38 | Error in `add_variables()`: 39 | ! A `variables` action has already been added to this workflow. 40 | 41 | --- 42 | 43 | Code 44 | add_variables(workflow, variables = workflow_variables(mpg, cyl)) 45 | Condition 46 | Error in `add_variables()`: 47 | ! A `variables` action has already been added to this workflow. 48 | 49 | # can only use a 'xy_blueprint' blueprint 50 | 51 | Code 52 | add_variables(workflow, mpg, cyl, blueprint = blueprint) 53 | Condition 54 | Error in `add_variables()`: 55 | ! `blueprint` must be a hardhat <xy_blueprint>.
56 | 57 | -------------------------------------------------------------------------------- /tests/testthat/_snaps/predict.md: -------------------------------------------------------------------------------- 1 | # workflow must have been `fit()` before prediction can be done 2 | 3 | Code 4 | predict(workflow(), mtcars) 5 | Condition 6 | Error in `predict()`: 7 | ! Can't predict on an untrained workflow. 8 | i Do you need to call `fit()`? 9 | 10 | # predict(type) is respected with a postprocessor (#251) 11 | 12 | Code 13 | predict(wflow_fit, d[1:5, ], type = "boop") 14 | Condition 15 | Error in `predict()`: 16 | ! Unsupported prediction `type` "boop" for a workflow with a postprocessor. 17 | 18 | -------------------------------------------------------------------------------- /tests/testthat/_snaps/pull.md: -------------------------------------------------------------------------------- 1 | # error if no preprocessor 2 | 3 | Code 4 | pull_workflow_preprocessor(workflow()) 5 | Condition 6 | Error in `extract_preprocessor()`: 7 | ! The workflow does not have a preprocessor. 8 | 9 | # error if not a workflow 10 | 11 | Code 12 | pull_workflow_preprocessor(1) 13 | Condition 14 | Error in `pull_workflow_preprocessor()`: 15 | ! `x` must be a workflow, not a <numeric>. 16 | 17 | --- 18 | 19 | Code 20 | pull_workflow_spec(1) 21 | Condition 22 | Error in `pull_workflow_spec()`: 23 | ! `x` must be a workflow, not a <numeric>. 24 | 25 | --- 26 | 27 | Code 28 | pull_workflow_fit(1) 29 | Condition 30 | Error in `pull_workflow_fit()`: 31 | ! `x` must be a workflow, not a <numeric>. 32 | 33 | --- 34 | 35 | Code 36 | pull_workflow_mold(1) 37 | Condition 38 | Error in `pull_workflow_mold()`: 39 | ! `x` must be a workflow, not a <numeric>. 40 | 41 | --- 42 | 43 | Code 44 | pull_workflow_prepped_recipe(1) 45 | Condition 46 | Error in `pull_workflow_prepped_recipe()`: 47 | ! `x` must be a workflow, not a <numeric>.
48 | 49 | # `pull_workflow_preprocessor()` is soft-deprecated 50 | 51 | Code 52 | x <- pull_workflow_preprocessor(workflow) 53 | Condition 54 | Warning: 55 | `pull_workflow_preprocessor()` was deprecated in workflows 0.2.3. 56 | i Please use `extract_preprocessor()` instead. 57 | 58 | # error if no spec 59 | 60 | Code 61 | pull_workflow_spec(workflow()) 62 | Condition 63 | Error in `extract_spec_parsnip()`: 64 | ! The workflow does not have a model spec. 65 | 66 | # `pull_workflow_spec()` is soft-deprecated 67 | 68 | Code 69 | x <- pull_workflow_spec(workflow) 70 | Condition 71 | Warning: 72 | `pull_workflow_spec()` was deprecated in workflows 0.2.3. 73 | i Please use `extract_spec_parsnip()` instead. 74 | 75 | # error if no fit 76 | 77 | Code 78 | pull_workflow_fit(workflow()) 79 | Condition 80 | Error in `extract_fit_parsnip()`: 81 | ! Can't extract a model fit from an untrained workflow. 82 | i Do you need to call `fit()`? 83 | 84 | # `pull_workflow_fit()` is soft-deprecated 85 | 86 | Code 87 | x <- pull_workflow_fit(workflow) 88 | Condition 89 | Warning: 90 | `pull_workflow_fit()` was deprecated in workflows 0.2.3. 91 | i Please use `extract_fit_parsnip()` instead. 92 | 93 | # error if no mold 94 | 95 | Code 96 | pull_workflow_mold(workflow()) 97 | Condition 98 | Error in `extract_mold()`: 99 | ! Can't extract a mold from an untrained workflow. 100 | i Do you need to call `fit()`? 101 | 102 | --- 103 | 104 | Code 105 | pull_workflow_prepped_recipe(workflow) 106 | Condition 107 | Error in `extract_mold()`: 108 | ! Can't extract a mold from an untrained workflow. 109 | i Do you need to call `fit()`? 110 | 111 | # `pull_workflow_mold()` is soft-deprecated 112 | 113 | Code 114 | x <- pull_workflow_mold(workflow) 115 | Condition 116 | Warning: 117 | `pull_workflow_mold()` was deprecated in workflows 0.2.3. 118 | i Please use `extract_mold()` instead. 
119 | 120 | # error if no recipe preprocessor 121 | 122 | Code 123 | pull_workflow_prepped_recipe(workflow()) 124 | Condition 125 | Error in `extract_recipe()`: 126 | ! The workflow must have a recipe preprocessor. 127 | 128 | # `pull_workflow_prepped_recipe()` is soft-deprecated 129 | 130 | Code 131 | x <- pull_workflow_prepped_recipe(workflow) 132 | Condition 133 | Warning: 134 | `pull_workflow_prepped_recipe()` was deprecated in workflows 0.2.3. 135 | i Please use `extract_recipe()` instead. 136 | 137 | -------------------------------------------------------------------------------- /tests/testthat/_snaps/sparsevctrs.md: -------------------------------------------------------------------------------- 1 | # sparse tibble can be passed to `fit() - formula 2 | 3 | Code 4 | wf_fit <- fit(wf_spec, hotel_data) 5 | Condition 6 | Error in `fit()`: 7 | ! Sparse data cannot be used with the formula interface. Please use `add_recipe()` or `add_variables()` instead. 8 | 9 | # sparse matrices can be passed to `fit() - recipe 10 | 11 | Code 12 | wf_fit <- fit(wf_spec, hotel_data) 13 | Output 14 | sparsevctrs: Sparse vector materialized 15 | 16 | # sparse matrices can be passed to `fit() - formula 17 | 18 | Code 19 | wf_fit <- fit(wf_spec, hotel_data) 20 | Condition 21 | Error in `fit()`: 22 | ! Sparse data cannot be used with the formula interface. Please use `add_recipe()` or `add_variables()` instead. 23 | 24 | # sparse matrices can be passed to `fit() - xy 25 | 26 | Code 27 | wf_fit <- fit(wf_spec, hotel_data) 28 | Output 29 | sparsevctrs: Sparse vector materialized 30 | 31 | # fit() errors if sparse matrix has no colnames 32 | 33 | Code 34 | fit(wf_spec, hotel_data) 35 | Condition 36 | Error in `fit()`: 37 | ! `x` must have column names. 
38 | 39 | -------------------------------------------------------------------------------- /tests/testthat/_snaps/workflow.md: -------------------------------------------------------------------------------- 1 | # workflow must be the first argument when adding actions 2 | 3 | Code 4 | add_formula(1, mpg ~ cyl) 5 | Condition 6 | Error in `add_formula()`: 7 | ! `x` must be a workflow, not a <numeric>. 8 | 9 | --- 10 | 11 | Code 12 | add_recipe(1, rec) 13 | Condition 14 | Error in `add_recipe()`: 15 | ! `x` must be a workflow, not a <numeric>. 16 | 17 | --- 18 | 19 | Code 20 | add_model(1, mod) 21 | Condition 22 | Error in `add_model()`: 23 | ! `x` must be a workflow, not a <numeric>. 24 | 25 | # model spec is validated 26 | 27 | Code 28 | workflow(spec = 1) 29 | Condition 30 | Error in `add_model()`: 31 | ! `spec` must be a <model_spec>. 32 | 33 | # preprocessor is validated 34 | 35 | Code 36 | workflow(preprocessor = 1) 37 | Condition 38 | Error in `workflow()`: 39 | ! `preprocessor` must be a formula, recipe, or a set of workflow variables. 40 | 41 | # constructor validates input 42 | 43 | Code 44 | new_workflow(pre = 1) 45 | Condition 46 | Error in `new_workflow()`: 47 | ! `pre` must be a `stage`. 48 | 49 | --- 50 | 51 | Code 52 | new_workflow(fit = 1) 53 | Condition 54 | Error in `new_workflow()`: 55 | ! `fit` must be a `stage`. 56 | 57 | --- 58 | 59 | Code 60 | new_workflow(post = 1) 61 | Condition 62 | Error in `new_workflow()`: 63 | ! `post` must be a `stage`. 64 | 65 | --- 66 | 67 | Code 68 | new_workflow(trained = 1) 69 | Condition 70 | Error in `new_workflow()`: 71 | ! `trained` must be a single logical value. 72 | 73 | # input must be a workflow 74 | 75 | `x` must be a workflow, not a <numeric>.
76 | 77 | -------------------------------------------------------------------------------- /tests/testthat/helper-extract_parameter_set.R: -------------------------------------------------------------------------------- 1 | check_parameter_set_tibble <- function(x) { 2 | expect_equal( 3 | names(x), 4 | c("name", "id", "source", "component", "component_id", "object") 5 | ) 6 | expect_equal(class(x$name), "character") 7 | expect_equal(class(x$id), "character") 8 | expect_equal(class(x$source), "character") 9 | expect_equal(class(x$component), "character") 10 | expect_equal(class(x$component_id), "character") 11 | expect_true(!any(duplicated(x$id))) 12 | 13 | expect_equal(class(x$object), "list") 14 | obj_check <- map_lgl( 15 | x$object, 16 | function(.x) inherits(.x, "param") | all(is.na(.x)) 17 | ) 18 | expect_true(all(obj_check)) 19 | 20 | invisible(TRUE) 21 | } 22 | -------------------------------------------------------------------------------- /tests/testthat/helper-lifecycle.R: -------------------------------------------------------------------------------- 1 | local_lifecycle_quiet <- function(frame = caller_env()) { 2 | local_options(lifecycle_verbosity = "quiet", .frame = frame) 3 | } 4 | -------------------------------------------------------------------------------- /tests/testthat/helper-sparsevctrs.R: -------------------------------------------------------------------------------- 1 | # ------------------------------------------------------------------------------ 2 | # For sparse tibble testing 3 | 4 | sparse_hotel_rates <- function(tibble = FALSE) { 5 | if (!rlang::is_installed("modeldata")) { 6 | return() 7 | } 8 | 9 | # 99.2 sparsity 10 | hotel_rates <- modeldata::hotel_rates 11 | 12 | prefix_colnames <- function(x, prefix) { 13 | colnames(x) <- paste(colnames(x), prefix, sep = "_") 14 | x 15 | } 16 | 17 | dummies_country <- hardhat::fct_encode_one_hot(hotel_rates$country) 18 | dummies_company <- hardhat::fct_encode_one_hot(hotel_rates$company) 19 | 
dummies_agent <- hardhat::fct_encode_one_hot(hotel_rates$agent) 20 | 21 | res <- cbind( 22 | hotel_rates["avg_price_per_room"], 23 | prefix_colnames(dummies_country, "country"), 24 | prefix_colnames(dummies_company, "company"), 25 | prefix_colnames(dummies_agent, "agent") 26 | ) 27 | 28 | res <- as.matrix(res) 29 | res <- Matrix::Matrix(res, sparse = TRUE) 30 | 31 | if (tibble) { 32 | res <- sparsevctrs::coerce_to_sparse_tibble(res) 33 | 34 | # materialize outcome 35 | withr::local_options("sparsevctrs.verbose_materialize" = NULL) 36 | res$avg_price_per_room <- res$avg_price_per_room[] 37 | } 38 | 39 | res 40 | } 41 | -------------------------------------------------------------------------------- /tests/testthat/helper-tunable.R: -------------------------------------------------------------------------------- 1 | check_tunable <- function(x) { 2 | expect_equal( 3 | names(x), 4 | c("name", "call_info", "source", "component", "component_id") 5 | ) 6 | expect_equal(class(x$name), "character") 7 | expect_equal(class(x$call_info), "list") 8 | expect_equal(class(x$source), "character") 9 | expect_equal(class(x$component), "character") 10 | expect_equal(class(x$component_id), "character") 11 | 12 | for (i in seq_along(x$call_info)) { 13 | check_call_info(x$call_info[[i]]) 14 | } 15 | 16 | invisible(TRUE) 17 | } 18 | 19 | check_call_info <- function(x) { 20 | if (all(is.null(x))) { 21 | # it is possible that engine parameter do not have call info 22 | return(invisible(TRUE)) 23 | } 24 | expect_true(all(c("pkg", "fun") %in% names(x))) 25 | expect_equal(class(x$pkg), "character") 26 | expect_equal(class(x$fun), "character") 27 | invisible(TRUE) 28 | } 29 | -------------------------------------------------------------------------------- /tests/testthat/test-control.R: -------------------------------------------------------------------------------- 1 | test_that("can create a basic workflow control object", { 2 | expect_s3_class(control_workflow(), "control_workflow") 3 | }) 
4 | 5 | test_that("default parsnip control is created", { 6 | expect_equal(control_workflow()$control_parsnip, parsnip::control_parsnip()) 7 | }) 8 | 9 | test_that("parsnip control is validated", { 10 | expect_snapshot(error = TRUE, { 11 | control_workflow(control_parsnip = 1) 12 | }) 13 | }) 14 | -------------------------------------------------------------------------------- /tests/testthat/test-fit-action-model.R: -------------------------------------------------------------------------------- 1 | test_that("can add a model to a workflow", { 2 | mod <- parsnip::linear_reg() 3 | mod <- parsnip::set_engine(mod, "lm") 4 | 5 | workflow <- workflow() 6 | workflow <- add_model(workflow, mod) 7 | 8 | expect_s3_class(workflow$fit$actions$model, "action_model") 9 | }) 10 | 11 | test_that("model is validated", { 12 | expect_snapshot(error = TRUE, add_model(workflow(), 1)) 13 | }) 14 | 15 | test_that("model must contain a known mode (#160)", { 16 | mod <- parsnip::decision_tree() 17 | 18 | workflow <- workflow() 19 | 20 | expect_snapshot(error = TRUE, { 21 | add_model(workflow, mod) 22 | }) 23 | }) 24 | 25 | test_that("prompt on spec without a loaded implementation (#174)", { 26 | mod <- parsnip::bag_tree() |> 27 | parsnip::set_mode("regression") 28 | 29 | workflow <- workflow() 30 | 31 | expect_snapshot(error = TRUE, add_model(workflow, mod)) 32 | expect_snapshot(error = TRUE, workflow(spec = mod)) 33 | }) 34 | 35 | skip_if_not_installed("recipes") 36 | 37 | test_that("cannot add two models", { 38 | mod <- parsnip::linear_reg() 39 | mod <- parsnip::set_engine(mod, "lm") 40 | 41 | workflow <- workflow() 42 | workflow <- add_model(workflow, mod) 43 | 44 | expect_snapshot(error = TRUE, add_model(workflow, mod)) 45 | }) 46 | 47 | test_that("can provide a model formula override", { 48 | # disp is in the recipe, but excluded from the model formula 49 | rec <- recipes::recipe(mpg ~ cyl + disp, mtcars) 50 | rec <- recipes::step_center(rec, cyl) 51 | 52 | mod <- 
parsnip::linear_reg() 53 | mod <- parsnip::set_engine(mod, "lm") 54 | 55 | workflow <- workflow() 56 | workflow <- add_recipe(workflow, rec) 57 | workflow <- add_model(workflow, mod, formula = mpg ~ cyl) 58 | 59 | result <- fit(workflow, mtcars) 60 | 61 | expect_equal( 62 | c("(Intercept)", "cyl"), 63 | names(result$fit$fit$fit$coefficients) 64 | ) 65 | }) 66 | 67 | test_that("model formula override can contain `offset()` (#162)", { 68 | df <- vctrs::data_frame( 69 | y = c(1.5, 2.5, 3.5, 1, 3), 70 | x = c(2, 6, 7, 3, 6), 71 | o = c(1.1, 2, 3, .5, 2) 72 | ) 73 | 74 | lm_model <- parsnip::linear_reg() 75 | lm_model <- parsnip::set_engine(lm_model, "lm") 76 | 77 | workflow <- workflow() 78 | workflow <- add_model(workflow, lm_model, formula = y ~ x + offset(o)) 79 | workflow <- add_variables(workflow, y, c(x, o)) 80 | 81 | result <- fit(workflow, data = df) 82 | lm_result <- hardhat::extract_fit_engine(result) 83 | 84 | expect_named(lm_result$coefficients, c("(Intercept)", "x")) 85 | expect_identical(attr(lm_result$terms, "offset"), 3L) 86 | }) 87 | 88 | test_that("remove a model", { 89 | lm_model <- parsnip::linear_reg() 90 | lm_model <- parsnip::set_engine(lm_model, "lm") 91 | 92 | workflow_no_model <- workflow() 93 | workflow_no_model <- add_formula(workflow_no_model, mpg ~ cyl) 94 | 95 | workflow_with_model <- add_model(workflow_no_model, lm_model) 96 | workflow_removed_model <- remove_model(workflow_with_model) 97 | 98 | expect_equal(workflow_no_model$fit, workflow_removed_model$fit) 99 | }) 100 | 101 | test_that("remove a model after model fit", { 102 | lm_model <- parsnip::linear_reg() 103 | lm_model <- parsnip::set_engine(lm_model, "lm") 104 | 105 | workflow_no_model <- workflow() 106 | workflow_no_model <- add_formula(workflow_no_model, mpg ~ cyl) 107 | 108 | workflow_with_model <- add_model(workflow_no_model, lm_model) 109 | workflow_with_model <- fit(workflow_with_model, data = mtcars) 110 | 111 | workflow_removed_model <- remove_model(workflow_with_model) 
112 | 113 | expect_equal(workflow_no_model$fit, workflow_removed_model$fit) 114 | }) 115 | 116 | test_that("update a model", { 117 | lm_model <- parsnip::linear_reg() 118 | lm_model <- parsnip::set_engine(lm_model, "lm") 119 | glmn_model <- parsnip::set_engine(lm_model, "glmnet") 120 | 121 | workflow <- workflow() 122 | workflow <- add_formula(workflow, mpg ~ cyl) 123 | workflow <- add_model(workflow, lm_model) 124 | workflow <- update_model(workflow, glmn_model) 125 | 126 | expect_equal(workflow$fit$actions$model$spec$engine, "glmnet") 127 | }) 128 | 129 | 130 | test_that("update a model after model fit", { 131 | lm_model <- parsnip::linear_reg() 132 | lm_model <- parsnip::set_engine(lm_model, "lm") 133 | no_model <- parsnip::set_engine(lm_model, "lm", model = FALSE) 134 | 135 | workflow <- workflow() 136 | workflow <- add_model(workflow, no_model) 137 | workflow <- add_formula(workflow, mpg ~ cyl) 138 | 139 | workflow <- fit(workflow, data = mtcars) 140 | workflow <- update_model(workflow, lm_model) 141 | 142 | # Should no longer have `model = FALSE` engine arg 143 | engine_args <- workflow$fit$actions$model$spec$eng_args 144 | expect_false(any(names(engine_args) == "model")) 145 | 146 | # The fitted model should be removed 147 | expect_null(workflow$fit$fit) 148 | }) 149 | -------------------------------------------------------------------------------- /tests/testthat/test-post-action-tailor.R: -------------------------------------------------------------------------------- 1 | skip_if_not_installed("probably") 2 | skip_if_not_installed("tailor") 3 | 4 | test_that("can add a postprocessor to a workflow", { 5 | post <- tailor::tailor() 6 | 7 | workflow <- workflow() 8 | workflow <- add_tailor(workflow, post) 9 | 10 | expect_s3_class(workflow$post$actions$tailor, "action_tailor") 11 | }) 12 | 13 | test_that("postprocessor is validated", { 14 | expect_snapshot(error = TRUE, add_tailor(workflow(), 1)) 15 | }) 16 | 17 | test_that("cannot add two postprocessors", { 18 
| post <- tailor::tailor() 19 | 20 | workflow <- workflow() 21 | workflow <- add_tailor(workflow, post) 22 | 23 | expect_snapshot(error = TRUE, add_tailor(workflow, post)) 24 | }) 25 | 26 | test_that("remove a postprocessor", { 27 | post <- tailor::tailor() 28 | 29 | workflow_no_post <- workflow() 30 | workflow_no_post <- add_formula(workflow_no_post, mpg ~ cyl) 31 | 32 | workflow_with_post <- add_tailor(workflow_no_post, post) 33 | workflow_removed_post <- remove_tailor(workflow_with_post) 34 | 35 | expect_equal(workflow_no_post$post, workflow_removed_post$post) 36 | }) 37 | 38 | test_that("remove a postprocessor after postprocessor fit", { 39 | post <- tailor::tailor() 40 | 41 | workflow_no_post <- workflow() 42 | workflow_no_post <- add_formula(workflow_no_post, mpg ~ cyl) 43 | workflow_no_post <- add_model(workflow_no_post, parsnip::linear_reg()) 44 | 45 | workflow_with_post <- add_tailor(workflow_no_post, post) 46 | workflow_with_post <- fit(workflow_with_post, data = mtcars) 47 | 48 | workflow_removed_post <- remove_tailor(workflow_with_post) 49 | 50 | expect_equal(workflow_no_post$post, workflow_removed_post$post) 51 | }) 52 | 53 | test_that("update a postprocessor", { 54 | post <- tailor::tailor() 55 | post2 <- tailor::adjust_numeric_range(post, 0, Inf) 56 | 57 | workflow <- workflow() 58 | workflow <- add_tailor(workflow, post) 59 | workflow <- update_tailor(workflow, post2) 60 | 61 | expect_length(workflow$post$actions$tailor$tailor$adjustments, 1) 62 | }) 63 | 64 | test_that("update a postprocessor after postprocessor fit", { 65 | post <- tailor::tailor() 66 | post2 <- tailor::adjust_numeric_range(post, 0, Inf) 67 | 68 | workflow_no_post <- workflow() 69 | workflow_no_post <- add_formula(workflow_no_post, mpg ~ cyl) 70 | workflow_no_post <- add_model(workflow_no_post, parsnip::linear_reg()) 71 | 72 | workflow_with_post <- add_tailor(workflow_no_post, post) 73 | workflow_with_post <- fit(workflow_with_post, data = mtcars) 74 | 75 | workflow_with_post_new 
<- update_tailor(workflow_with_post, post2) 76 | 77 | expect_length( 78 | workflow_with_post_new$post$actions$tailor$tailor$adjustments, 79 | 1 80 | ) 81 | 82 | # Note that the fitted model and preprocessor can remain; the new 83 | # postprocessor will not affect it (#225) 84 | expect_equal(workflow_with_post$fit, workflow_with_post_new$fit) 85 | }) 86 | 87 | test_that("postprocessor fit aligns with manually fitted version (no calibration)", { 88 | skip_if_not_installed("modeldata") 89 | 90 | # create example data 91 | y <- seq(0, 7, .1) 92 | dat <- data.frame(y = y, x = y + (y - 3)^2) 93 | 94 | # construct workflows 95 | post <- tailor::tailor() 96 | post <- tailor::adjust_numeric_range(post, 0, 5) 97 | 98 | wflow_simple <- workflow(y ~ ., parsnip::linear_reg()) 99 | wflow_post <- add_tailor(wflow_simple, post) 100 | 101 | # train workflow 102 | wf_simple_fit <- fit(wflow_simple, dat) 103 | wf_post_fit <- fit(wflow_post, dat) 104 | 105 | # ...verify predictions are the same as training the post-proc separately 106 | wflow_simple_preds <- augment(wf_simple_fit, dat) 107 | post_trained <- fit(post, wflow_simple_preds, y, .pred) 108 | wflow_manual_preds <- predict(post_trained, wflow_simple_preds) 109 | 110 | wflow_post_preds <- predict(wf_post_fit, dat) 111 | 112 | expect_equal(wflow_manual_preds[".pred"], wflow_post_preds) 113 | expect_false(all(wflow_simple_preds[".pred"] == wflow_manual_preds[".pred"])) 114 | }) 115 | 116 | test_that("postprocessor fit aligns with manually fitted version (with calibration)", { 117 | skip_if_not_installed("modeldata") 118 | skip_if_not_installed("mgcv") 119 | 120 | # create example data 121 | y <- seq(0, 7, .1) 122 | dat <- data.frame(y = y, x = y + (y - 3)^2) 123 | 124 | dat_data <- dat[1:40, ] 125 | dat_calibration <- dat[41:71, ] 126 | 127 | # construct workflows 128 | post <- tailor::tailor() 129 | post <- tailor::adjust_numeric_calibration(post, "linear") 130 | 131 | wflow_simple <- workflow(y ~ ., parsnip::linear_reg()) 132 
| wflow_post <- add_tailor(wflow_simple, post) 133 | 134 | # train workflows 135 | wf_simple_fit <- fit(wflow_simple, dat_data) 136 | wf_post_fit <- fit(wflow_post, dat_data, calibration = dat_calibration) 137 | 138 | # ...verify predictions are the same as training the post-proc separately. 139 | # note that this test naughtily re-predicts on the calibration set. 140 | wflow_simple_preds <- augment(wf_simple_fit, dat_calibration) 141 | post_trained <- fit(post, wflow_simple_preds, y, .pred) 142 | wflow_manual_preds <- predict(post_trained, wflow_simple_preds) 143 | 144 | wflow_post_preds <- predict(wf_post_fit, dat_calibration) 145 | 146 | expect_equal(wflow_manual_preds[".pred"], wflow_post_preds) 147 | 148 | # okay if some predictions are the same, but we wouldn't expect all of them to be 149 | expect_false(all(wflow_simple_preds[".pred"] == wflow_manual_preds[".pred"])) 150 | }) 151 | -------------------------------------------------------------------------------- /tests/testthat/test-pre-action-formula.R: -------------------------------------------------------------------------------- 1 | skip_if_not_installed("recipes") 2 | 3 | test_that("can add a formula to a workflow", { 4 | workflow <- workflow() 5 | workflow <- add_formula(workflow, mpg ~ cyl) 6 | 7 | expect_s3_class(workflow$pre$actions$formula, "action_formula") 8 | }) 9 | 10 | test_that("formula is validated", { 11 | expect_snapshot(error = TRUE, add_formula(workflow(), 1)) 12 | }) 13 | 14 | test_that("cannot add a formula if a recipe already exists", { 15 | rec <- recipes::recipe(mpg ~ cyl, mtcars) 16 | 17 | workflow <- workflow() 18 | workflow <- add_recipe(workflow, rec) 19 | 20 | expect_snapshot(error = TRUE, add_formula(workflow, mpg ~ cyl)) 21 | }) 22 | 23 | test_that("cannot add a formula if variables already exist", { 24 | workflow <- workflow() 25 | workflow <- add_variables(workflow, y, x) 26 | 27 | expect_snapshot(error = TRUE, add_formula(workflow, mpg ~ cyl)) 28 | }) 29 | 30 | 
test_that("formula preprocessing is executed upon `fit()`", { 31 | mod <- parsnip::linear_reg() 32 | mod <- parsnip::set_engine(mod, "lm") 33 | 34 | workflow <- workflow() 35 | workflow <- add_formula(workflow, mpg ~ log(cyl)) 36 | workflow <- add_model(workflow, mod) 37 | 38 | result <- fit(workflow, mtcars) 39 | 40 | expect_equal( 41 | result$pre$mold$outcomes$mpg, 42 | mtcars$mpg 43 | ) 44 | 45 | expect_equal( 46 | result$pre$mold$predictors$`log(cyl)`, 47 | log(mtcars$cyl) 48 | ) 49 | }) 50 | 51 | test_that("cannot add two formulas", { 52 | workflow <- workflow() 53 | workflow <- add_formula(workflow, mpg ~ cyl) 54 | 55 | expect_snapshot(error = TRUE, add_formula(workflow, mpg ~ cyl)) 56 | }) 57 | 58 | test_that("remove a formula", { 59 | workflow_no_formula <- workflow() 60 | workflow_with_formula <- add_formula(workflow_no_formula, mpg ~ cyl) 61 | workflow_removed_formula <- remove_formula(workflow_with_formula) 62 | 63 | expect_equal(workflow_no_formula$pre, workflow_removed_formula$pre) 64 | }) 65 | 66 | test_that("remove a formula after model fit", { 67 | lm_model <- parsnip::linear_reg() 68 | lm_model <- parsnip::set_engine(lm_model, "lm") 69 | 70 | workflow_no_formula <- workflow() 71 | workflow_no_formula <- add_model(workflow_no_formula, lm_model) 72 | 73 | workflow_with_formula <- add_formula(workflow_no_formula, mpg ~ cyl) 74 | workflow_with_formula <- fit(workflow_with_formula, data = mtcars) 75 | 76 | workflow_removed_formula <- remove_formula(workflow_with_formula) 77 | 78 | expect_equal(workflow_no_formula$pre, workflow_removed_formula$pre) 79 | }) 80 | 81 | test_that("removing a formula doesn't remove case weights", { 82 | wf <- workflow() 83 | wf <- add_formula(wf, mpg ~ .) 
84 | wf <- add_case_weights(wf, disp) 85 | 86 | wf <- remove_formula(wf) 87 | 88 | expect_identical(names(wf$pre$actions), "case_weights") 89 | }) 90 | 91 | test_that("update a formula", { 92 | workflow <- workflow() 93 | workflow <- add_formula(workflow, mpg ~ cyl) 94 | workflow <- update_formula(workflow, mpg ~ disp) 95 | 96 | expect_equal(workflow$pre$actions$formula$formula, mpg ~ disp) 97 | }) 98 | 99 | test_that("update a formula after model fit", { 100 | lm_model <- parsnip::linear_reg() 101 | lm_model <- parsnip::set_engine(lm_model, "lm") 102 | 103 | workflow <- workflow() 104 | workflow <- add_model(workflow, lm_model) 105 | workflow <- add_formula(workflow, mpg ~ cyl) 106 | 107 | workflow <- fit(workflow, data = mtcars) 108 | 109 | # Should clear fitted model 110 | workflow <- update_formula(workflow, mpg ~ disp) 111 | 112 | expect_equal(workflow$pre$actions$formula$formula, mpg ~ disp) 113 | 114 | expect_equal(workflow$fit$actions$model$spec, lm_model) 115 | expect_null(workflow$pre$mold) 116 | }) 117 | 118 | test_that("can pass a blueprint through to hardhat::mold()", { 119 | lm_model <- parsnip::linear_reg() 120 | lm_model <- parsnip::set_engine(lm_model, "lm") 121 | 122 | blueprint <- hardhat::default_formula_blueprint(intercept = TRUE) 123 | 124 | workflow <- workflow() 125 | workflow <- add_model(workflow, lm_model) 126 | workflow <- add_formula(workflow, mpg ~ cyl, blueprint = blueprint) 127 | 128 | workflow <- fit(workflow, data = mtcars) 129 | 130 | expect_true("(Intercept)" %in% colnames(workflow$pre$mold$predictors)) 131 | expect_equal(workflow$pre$actions$formula$blueprint, blueprint) 132 | expect_true(workflow$pre$mold$blueprint$intercept) 133 | }) 134 | 135 | test_that("can't pass an `offset()` through `add_formula()` (#162)", { 136 | df <- vctrs::data_frame( 137 | y = c(1.5, 2.5, 3.5, 1, 3), 138 | x = c(2, 6, 7, 3, 6), 139 | o = c(1.1, 2, 3, .5, 2) 140 | ) 141 | 142 | lm_model <- parsnip::linear_reg() 143 | lm_model <- 
parsnip::set_engine(lm_model, "lm") 144 | 145 | workflow <- workflow() 146 | workflow <- add_model(workflow, lm_model) 147 | workflow <- add_formula(workflow, y ~ x + offset(o)) 148 | 149 | expect_snapshot(error = TRUE, { 150 | fit(workflow, data = df) 151 | }) 152 | }) 153 | 154 | test_that("can only use a 'formula_blueprint' blueprint", { 155 | blueprint <- hardhat::default_recipe_blueprint() 156 | 157 | workflow <- workflow() 158 | 159 | expect_snapshot( 160 | error = TRUE, 161 | add_formula(workflow, mpg ~ cyl, blueprint = blueprint) 162 | ) 163 | }) 164 | -------------------------------------------------------------------------------- /tests/testthat/test-pre-action-recipe.R: -------------------------------------------------------------------------------- 1 | skip_if_not_installed("recipes") 2 | 3 | test_that("can add a recipe to a workflow", { 4 | rec <- recipes::recipe(mpg ~ cyl, mtcars) 5 | 6 | workflow <- workflow() 7 | workflow <- add_recipe(workflow, rec) 8 | 9 | expect_s3_class(workflow$pre$actions$recipe, "action_recipe") 10 | }) 11 | 12 | test_that("recipe is validated", { 13 | expect_snapshot(error = TRUE, add_recipe(workflow(), 1)) 14 | }) 15 | 16 | test_that("cannot add a recipe if a formula already exists", { 17 | rec <- recipes::recipe(mpg ~ cyl, mtcars) 18 | 19 | workflow <- workflow() 20 | workflow <- add_formula(workflow, mpg ~ cyl) 21 | 22 | expect_snapshot(error = TRUE, add_recipe(workflow, rec)) 23 | }) 24 | 25 | test_that("cannot add a recipe if variables already exist", { 26 | rec <- recipes::recipe(mpg ~ cyl, mtcars) 27 | 28 | workflow <- workflow() 29 | workflow <- add_variables(workflow, y, x) 30 | 31 | expect_snapshot(error = TRUE, add_recipe(workflow, rec)) 32 | }) 33 | 34 | test_that("cannot add a recipe if recipe is trained", { 35 | rec <- recipes::recipe(mpg ~ cyl, mtcars) |> recipes::prep() 36 | 37 | workflow <- workflow() 38 | 39 | expect_snapshot(error = TRUE, add_recipe(workflow, rec)) 40 | }) 41 | 42 | test_that("remove a 
recipe", { 43 | rec <- recipes::recipe(mpg ~ cyl, mtcars) 44 | 45 | workflow_no_recipe <- workflow() 46 | workflow_with_recipe <- add_recipe(workflow_no_recipe, rec) 47 | workflow_removed_recipe <- remove_recipe(workflow_with_recipe) 48 | 49 | expect_equal(workflow_no_recipe$pre, workflow_removed_recipe$pre) 50 | }) 51 | 52 | test_that("remove a recipe after model fit", { 53 | lm_model <- parsnip::linear_reg() 54 | lm_model <- parsnip::set_engine(lm_model, "lm") 55 | 56 | rec <- recipes::recipe(mpg ~ cyl, mtcars) 57 | 58 | workflow_no_recipe <- workflow() 59 | workflow_no_recipe <- add_model(workflow_no_recipe, lm_model) 60 | 61 | workflow_with_recipe <- add_recipe(workflow_no_recipe, rec) 62 | workflow_with_recipe <- fit(workflow_with_recipe, data = mtcars) 63 | 64 | workflow_removed_recipe <- remove_recipe(workflow_with_recipe) 65 | 66 | expect_equal(workflow_no_recipe$pre, workflow_removed_recipe$pre) 67 | }) 68 | 69 | test_that("removing a recipe doesn't remove case weights", { 70 | rec <- recipes::recipe(mpg ~ cyl, mtcars) 71 | 72 | wf <- workflow() 73 | wf <- add_recipe(wf, rec) 74 | wf <- add_case_weights(wf, disp) 75 | 76 | wf <- remove_recipe(wf) 77 | 78 | expect_identical(names(wf$pre$actions), "case_weights") 79 | }) 80 | 81 | test_that("update a recipe", { 82 | rec <- recipes::recipe(mpg ~ cyl, mtcars) 83 | rec2 <- recipes::recipe(mpg ~ disp, mtcars) 84 | 85 | workflow <- workflow() 86 | workflow <- add_recipe(workflow, rec) 87 | workflow <- update_recipe(workflow, rec2) 88 | 89 | expect_equal(workflow$pre$actions$recipe$recipe, rec2) 90 | }) 91 | 92 | test_that("update a recipe after model fit", { 93 | rec <- recipes::recipe(mpg ~ cyl, mtcars) 94 | rec2 <- recipes::recipe(mpg ~ disp, mtcars) 95 | 96 | lm_model <- parsnip::linear_reg() 97 | lm_model <- parsnip::set_engine(lm_model, "lm") 98 | 99 | workflow <- workflow() 100 | workflow <- add_model(workflow, lm_model) 101 | workflow <- add_recipe(workflow, rec) 102 | 103 | workflow <- fit(workflow, data
= mtcars) 104 | 105 | # Should clear fitted model 106 | workflow <- update_recipe(workflow, rec2) 107 | 108 | expect_equal(workflow$pre$actions$recipe$recipe, rec2) 109 | 110 | expect_equal(workflow$fit$actions$model$spec, lm_model) 111 | expect_null(workflow$pre$mold) 112 | }) 113 | 114 | test_that("recipe is prepped upon `fit()`", { 115 | rec <- recipes::recipe(mpg ~ cyl, mtcars) 116 | rec <- recipes::step_center(rec, cyl) 117 | 118 | mod <- parsnip::linear_reg() 119 | mod <- parsnip::set_engine(mod, "lm") 120 | 121 | workflow <- workflow() 122 | workflow <- add_recipe(workflow, rec) 123 | workflow <- add_model(workflow, mod) 124 | 125 | result <- fit(workflow, mtcars) 126 | 127 | expect_equal( 128 | result$pre$mold$outcomes$mpg, 129 | mtcars$mpg 130 | ) 131 | 132 | expect_equal( 133 | result$pre$mold$predictors$cyl, 134 | mtcars$cyl - mean(mtcars$cyl) 135 | ) 136 | 137 | center_step <- result$pre$mold$blueprint$recipe$steps[[1]] 138 | 139 | expect_true(recipes::is_trained(center_step)) 140 | }) 141 | 142 | test_that("cannot add two recipes", { 143 | rec <- recipes::recipe(mpg ~ cyl, mtcars) 144 | 145 | workflow <- workflow() 146 | workflow <- add_recipe(workflow, rec) 147 | 148 | expect_snapshot(error = TRUE, add_recipe(workflow, rec)) 149 | }) 150 | 151 | test_that("can pass a blueprint through to hardhat::mold()", { 152 | rec <- recipes::recipe(mpg ~ cyl, mtcars) 153 | 154 | lm_model <- parsnip::linear_reg() 155 | lm_model <- parsnip::set_engine(lm_model, "lm") 156 | 157 | blueprint <- hardhat::default_recipe_blueprint(intercept = TRUE) 158 | 159 | workflow <- workflow() 160 | workflow <- add_model(workflow, lm_model) 161 | workflow <- add_recipe(workflow, rec, blueprint = blueprint) 162 | 163 | workflow <- fit(workflow, data = mtcars) 164 | 165 | expect_true("(Intercept)" %in% colnames(workflow$pre$mold$predictors)) 166 | expect_equal(workflow$pre$actions$recipe$blueprint, blueprint) 167 | expect_true(workflow$pre$mold$blueprint$intercept) 168 | }) 169 | 170 |
test_that("can only use a 'recipe_blueprint' blueprint", { 171 | rec <- recipes::recipe(mpg ~ cyl, mtcars) 172 | blueprint <- hardhat::default_formula_blueprint() 173 | 174 | workflow <- workflow() 175 | 176 | expect_snapshot( 177 | error = TRUE, 178 | add_recipe(workflow, rec, blueprint = blueprint) 179 | ) 180 | }) 181 | -------------------------------------------------------------------------------- /tests/testthat/test-predict.R: -------------------------------------------------------------------------------- 1 | skip_if_not_installed("recipes") 2 | 3 | test_that("can predict from a workflow", { 4 | mod <- parsnip::linear_reg() 5 | mod <- parsnip::set_engine(mod, "lm") 6 | 7 | workflow <- workflow() 8 | workflow <- add_formula(workflow, mpg ~ cyl) 9 | workflow <- add_model(workflow, mod) 10 | 11 | fit_workflow <- fit(workflow, mtcars) 12 | 13 | result <- predict(fit_workflow, mtcars) 14 | 15 | expect_s3_class(result, "tbl_df") 16 | expect_equal(nrow(result), 32) 17 | }) 18 | 19 | test_that("workflow must have been `fit()` before prediction can be done", { 20 | expect_snapshot(error = TRUE, predict(workflow(), mtcars)) 21 | }) 22 | 23 | test_that("formula preprocessing is done to the `new_data`", { 24 | mod <- parsnip::linear_reg() 25 | mod <- parsnip::set_engine(mod, "lm") 26 | 27 | workflow <- workflow() 28 | workflow <- add_formula(workflow, mpg ~ log(cyl)) 29 | workflow <- add_model(workflow, mod) 30 | 31 | fit_workflow <- fit(workflow, mtcars) 32 | 33 | result1 <- predict(fit_workflow, mtcars) 34 | 35 | # pre-log the data 36 | mtcars_with_log <- mtcars 37 | mtcars_with_log$cyl <- log(mtcars_with_log$cyl) 38 | 39 | workflow <- workflow() 40 | workflow <- add_formula(workflow, mpg ~ cyl) 41 | workflow <- add_model(workflow, mod) 42 | 43 | fit_workflow <- fit(workflow, mtcars_with_log) 44 | 45 | result2 <- predict(fit_workflow, mtcars_with_log) 46 | 47 | expect_equal(result1, result2) 48 | }) 49 | 50 | test_that("recipe preprocessing is done to the 
`new_data`", { 51 | mod <- parsnip::linear_reg() 52 | mod <- parsnip::set_engine(mod, "lm") 53 | 54 | rec <- recipes::recipe(mpg ~ cyl, mtcars) 55 | rec <- recipes::step_log(rec, cyl) 56 | 57 | workflow <- workflow() 58 | workflow <- add_recipe(workflow, rec) 59 | workflow <- add_model(workflow, mod) 60 | 61 | fit_workflow <- fit(workflow, mtcars) 62 | 63 | result1 <- predict(fit_workflow, mtcars) 64 | 65 | # pre-log the data 66 | mtcars_with_log <- mtcars 67 | mtcars_with_log$cyl <- log(mtcars_with_log$cyl) 68 | 69 | workflow <- workflow() 70 | workflow <- add_formula(workflow, mpg ~ cyl) 71 | workflow <- add_model(workflow, mod) 72 | 73 | fit_workflow <- fit(workflow, mtcars_with_log) 74 | 75 | result2 <- predict(fit_workflow, mtcars_with_log) 76 | 77 | expect_equal(result1, result2) 78 | }) 79 | 80 | test_that("`new_data` must have all of the original predictors", { 81 | mod <- parsnip::linear_reg() 82 | mod <- parsnip::set_engine(mod, "lm") 83 | 84 | rec <- recipes::recipe(mpg ~ cyl, mtcars) 85 | rec <- recipes::step_log(rec, cyl) 86 | 87 | workflow <- workflow() 88 | workflow <- add_recipe(workflow, rec) 89 | workflow <- add_model(workflow, mod) 90 | 91 | fit_workflow <- fit(workflow, mtcars) 92 | 93 | cars_no_cyl <- mtcars 94 | cars_no_cyl$cyl <- NULL 95 | 96 | # This error comes from hardhat, so we don't snapshot it 97 | expect_error(predict(fit_workflow, cars_no_cyl)) 98 | }) 99 | 100 | test_that("blueprint will get passed on to hardhat::forge()", { 101 | train <- data.frame( 102 | y = c(1L, 5L, 3L, 4L), 103 | x = factor(c("x", "y", "x", "y")) 104 | ) 105 | 106 | test <- data.frame( 107 | x = factor(c("x", "y", "z")) 108 | ) 109 | 110 | spec <- parsnip::linear_reg() 111 | spec <- parsnip::set_engine(spec, "lm") 112 | 113 | bp1 <- hardhat::default_formula_blueprint( 114 | intercept = TRUE, 115 | allow_novel_levels = FALSE 116 | ) 117 | bp2 <- hardhat::default_formula_blueprint( 118 | intercept = TRUE, 119 | allow_novel_levels = TRUE 120 | ) 121 | 122 | 
workflow <- workflow() 123 | workflow <- add_model(workflow, spec) 124 | 125 | workflow1 <- add_formula(workflow, y ~ x, blueprint = bp1) 126 | workflow2 <- add_formula(workflow, y ~ x, blueprint = bp2) 127 | 128 | mod1 <- fit(workflow1, train) 129 | mod2 <- fit(workflow2, train) 130 | 131 | # Warning from hardhat, so we don't snapshot it 132 | expect_warning(pred1 <- predict(mod1, test)) 133 | expect_no_warning(pred2 <- predict(mod2, test)) 134 | 135 | expect_identical( 136 | pred1[[".pred"]], 137 | c(2, 4.5, NA) 138 | ) 139 | 140 | expect_identical( 141 | pred2[[".pred"]], 142 | c(2, 4.5, 2) 143 | ) 144 | }) 145 | 146 | test_that("monitoring: no double intercept due to dot expansion in model formula #210", { 147 | mod <- parsnip::linear_reg() 148 | mod <- parsnip::set_engine(mod, "lm") 149 | 150 | # model formula includes a dot to mean "everything available after the preprocessing formula" 151 | workflow <- workflow() 152 | workflow <- add_model(workflow, mod, formula = mpg ~ .) 153 | 154 | blueprint_with_intercept <- hardhat::default_formula_blueprint( 155 | intercept = TRUE 156 | ) 157 | workflow_with_intercept <- add_formula( 158 | workflow, 159 | mpg ~ hp + disp, 160 | blueprint = blueprint_with_intercept 161 | ) 162 | fit_with_intercept <- fit(workflow_with_intercept, mtcars) 163 | 164 | # The dot expansion used to include the intercept column, added via the blueprint, as a regular predictor. 165 | # `parsnip:::prepare_data()` removed this column, so lm's predict method errored. 166 | # Now it gets removed before fitting (lm will handle the intercept itself), 167 | # so lm()'s predict method won't error anymore here.
(tidymodels/parsnip#1033) 168 | expect_no_error(predict(fit_with_intercept, mtcars)) 169 | }) 170 | 171 | test_that("predict(type) is respected with a postprocessor (#251)", { 172 | skip_if_not_installed("tailor") 173 | # create example data 174 | y <- seq(0, 7, .1) 175 | d <- data.frame( 176 | y = as.factor(ifelse(y > 3.5, "yes", "no")), 177 | x = y + (y - 3)^2 178 | ) 179 | wflow <- workflow(y ~ ., parsnip::logistic_reg(), tailor::tailor()) 180 | wflow_fit <- fit(wflow, d) 181 | 182 | pred_class <- predict(wflow_fit, d[1:5, ], type = "class") 183 | pred_prob <- predict(wflow_fit, d[1:5, ], type = "prob") 184 | pred_null <- predict(wflow_fit, d[1:5, ]) 185 | 186 | expect_named(pred_class, ".pred_class") 187 | expect_named(pred_prob, c(".pred_no", ".pred_yes"), ignore.order = TRUE) 188 | expect_equal(pred_class, pred_null) 189 | 190 | expect_snapshot(error = TRUE, predict(wflow_fit, d[1:5, ], type = "boop")) 191 | }) 192 | -------------------------------------------------------------------------------- /tests/testthat/test-printing.R: -------------------------------------------------------------------------------- 1 | skip_if_not_installed("recipes") 2 | 3 | test_that("can print empty workflow", { 4 | expect_snapshot(workflow()) 5 | }) 6 | 7 | test_that("can print workflow with recipe", { 8 | rec <- recipes::recipe(mtcars) 9 | expect_snapshot(add_recipe(workflow(), rec)) 10 | }) 11 | 12 | test_that("can print workflow with formula", { 13 | expect_snapshot(add_formula(workflow(), y ~ x)) 14 | }) 15 | 16 | test_that("can print workflow with variables", { 17 | expect_snapshot(add_variables(workflow(), y, c(x1, x2))) 18 | }) 19 | 20 | test_that("can print workflow with model", { 21 | model <- parsnip::linear_reg() 22 | model <- parsnip::set_engine(model, "lm") 23 | 24 | expect_snapshot(add_model(workflow(), model)) 25 | }) 26 | 27 | test_that("can print workflow with model with engine specific args", { 28 | model <- parsnip::linear_reg(penalty = 0.01) 29 | model <- 
parsnip::set_engine(model, "glmnet", dfmax = 5) 30 | 31 | expect_snapshot(add_model(workflow(), model)) 32 | }) 33 | 34 | test_that("can print workflow with fit model", { 35 | model <- parsnip::linear_reg() 36 | model <- parsnip::set_engine(model, "lm") 37 | 38 | workflow <- workflow() 39 | workflow <- add_formula(workflow, mpg ~ cyl) 40 | workflow <- add_model(workflow, model) 41 | 42 | expect_snapshot(fit(workflow, mtcars)) 43 | }) 44 | 45 | test_that("can print workflow with >10 recipe steps", { 46 | rec <- recipes::recipe(mpg ~ cyl, mtcars) 47 | rec <- recipes::step_log(rec, cyl) 48 | rec <- recipes::step_log(rec, cyl) 49 | rec <- recipes::step_log(rec, cyl) 50 | rec <- recipes::step_log(rec, cyl) 51 | rec <- recipes::step_log(rec, cyl) 52 | rec <- recipes::step_log(rec, cyl) 53 | rec <- recipes::step_log(rec, cyl) 54 | rec <- recipes::step_log(rec, cyl) 55 | rec <- recipes::step_log(rec, cyl) 56 | rec <- recipes::step_log(rec, cyl) 57 | rec <- recipes::step_log(rec, cyl) 58 | 59 | expect_snapshot(add_recipe(workflow(), rec)) 60 | 61 | rec <- recipes::step_log(rec, cyl) 62 | 63 | expect_snapshot(add_recipe(workflow(), rec)) 64 | }) 65 | 66 | test_that("can print workflow with just case weights", { 67 | workflow <- workflow() 68 | workflow <- add_case_weights(workflow, disp) 69 | 70 | expect_snapshot(workflow) 71 | }) 72 | 73 | test_that("can print workflow with case weights, preprocessor, and model", { 74 | model <- parsnip::linear_reg() 75 | model <- parsnip::set_engine(model, "lm") 76 | 77 | workflow <- workflow() 78 | workflow <- add_formula(workflow, mpg ~ .) 
79 | workflow <- add_case_weights(workflow, disp) 80 | workflow <- add_model(workflow, model) 81 | 82 | expect_snapshot(workflow) 83 | }) 84 | 85 | test_that("can print workflow with postprocessor", { 86 | skip_if_not_installed("tailor") 87 | 88 | post <- tailor::tailor() 89 | workflow <- workflow() 90 | workflow <- add_postprocessor(workflow, post) 91 | 92 | expect_snapshot(workflow) 93 | }) 94 | -------------------------------------------------------------------------------- /tests/testthat/test-workflow.R: -------------------------------------------------------------------------------- 1 | skip_if_not_installed("recipes") 2 | 3 | # ------------------------------------------------------------------------------ 4 | # workflow() 5 | 6 | test_that("can create a basic workflow", { 7 | workflow <- workflow() 8 | 9 | expect_s3_class(workflow, "workflow") 10 | 11 | expect_s3_class(workflow$pre, "stage_pre") 12 | expect_s3_class(workflow$fit, "stage_fit") 13 | expect_s3_class(workflow$post, "stage_post") 14 | 15 | expect_equal(workflow$pre$actions, new_named_list()) 16 | expect_equal(workflow$pre$mold, NULL) 17 | 18 | expect_equal(workflow$fit$actions, new_named_list()) 19 | expect_equal(workflow$fit$fit, NULL) 20 | 21 | expect_equal(workflow$post$actions, new_named_list()) 22 | }) 23 | 24 | test_that("workflow must be the first argument when adding actions", { 25 | rec <- recipes::recipe(mpg ~ cyl, mtcars) 26 | mod <- parsnip::linear_reg() 27 | 28 | expect_snapshot(error = TRUE, add_formula(1, mpg ~ cyl)) 29 | expect_snapshot(error = TRUE, add_recipe(1, rec)) 30 | expect_snapshot(error = TRUE, add_model(1, mod)) 31 | }) 32 | 33 | test_that("can add a model spec directly to a workflow", { 34 | mod <- parsnip::linear_reg() 35 | workflow <- workflow(spec = mod) 36 | 37 | expect_identical(workflow$fit$actions$model$spec, mod) 38 | }) 39 | 40 | test_that("can add a preprocessor directly to a workflow", { 41 | preprocessor <- recipes::recipe(mpg ~ cyl, mtcars) 42 | workflow 
<- workflow(preprocessor) 43 | expect_identical(workflow$pre$actions$recipe$recipe, preprocessor) 44 | 45 | preprocessor <- mpg ~ cyl 46 | workflow <- workflow(preprocessor) 47 | expect_identical(workflow$pre$actions$formula$formula, preprocessor) 48 | 49 | preprocessor <- workflow_variables(mpg, cyl) 50 | workflow <- workflow(preprocessor) 51 | expect_identical(workflow$pre$actions$variables$variables, preprocessor) 52 | }) 53 | 54 | test_that("model spec is validated", { 55 | expect_snapshot(error = TRUE, workflow(spec = 1)) 56 | }) 57 | 58 | test_that("preprocessor is validated", { 59 | expect_snapshot(error = TRUE, workflow(preprocessor = 1)) 60 | }) 61 | 62 | # ------------------------------------------------------------------------------ 63 | # new_workflow() 64 | 65 | test_that("constructor validates input", { 66 | expect_snapshot(error = TRUE, new_workflow(pre = 1)) 67 | expect_snapshot(error = TRUE, new_workflow(fit = 1)) 68 | expect_snapshot(error = TRUE, new_workflow(post = 1)) 69 | 70 | expect_snapshot(error = TRUE, new_workflow(trained = 1)) 71 | }) 72 | 73 | # ------------------------------------------------------------------------------ 74 | # is_trained_workflow() 75 | 76 | test_that("can check if a workflow is trained", { 77 | rec <- recipes::recipe(mpg ~ cyl, mtcars) 78 | mod <- parsnip::linear_reg() 79 | mod <- parsnip::set_engine(mod, "lm") 80 | 81 | wf <- workflow() 82 | wf <- add_recipe(wf, rec) 83 | wf <- add_model(wf, mod) 84 | 85 | expect_false(is_trained_workflow(wf)) 86 | wf <- fit(wf, mtcars) 87 | expect_true(is_trained_workflow(wf)) 88 | }) 89 | 90 | test_that("input must be a workflow", { 91 | expect_snapshot_error(is_trained_workflow(1)) 92 | }) 93 | -------------------------------------------------------------------------------- /vignettes/stages.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Workflow Stages" 3 | vignette: > 4 | %\VignetteEngine{knitr::rmarkdown} 5 | 
%\VignetteIndexEntry{Workflow Stages} 6 | output: 7 | knitr:::html_vignette: 8 | toc: yes 9 | --- 10 | 11 | ```{r} 12 | #| label: setup 13 | #| include: false 14 | knitr::opts_chunk$set( 15 | digits = 3, 16 | collapse = TRUE, 17 | comment = "#>" 18 | ) 19 | options(digits = 3) 20 | ``` 21 | 22 | Workflows encompasses the three main stages of the modeling _process_: pre-processing of data, model fitting, and post-processing of results. This page enumerates the operations implemented to date for each stage. 23 | 24 | ## Pre-processing 25 | 26 | The three elements allowed for pre-processing are: 27 | 28 | * A standard [model formula](https://cran.r-project.org/doc/manuals/r-release/R-intro.html#Formulae-for-statistical-models) via `add_formula()`. 29 | 30 | * A tidyselect interface via `add_variables()` that [strictly preserves the class](https://www.tidyverse.org/blog/2020/09/workflows-0-2-0/) of your columns. 31 | 32 | * A recipe object via `add_recipe()`. 33 | 34 | You can use only one of these three in a given workflow. 35 | 36 | ## Model Fitting 37 | 38 | `parsnip` model specifications are the only option here, specified via `add_model()`. 39 | 40 | When using a preprocessor, you may need an additional formula for special model terms (e.g. for mixed models or generalized linear models). In these cases, specify that formula using `add_model()`'s `formula` argument, which will be passed to the underlying model when `fit()` is called. 41 | 42 | ## Post-processing 43 | 44 | `tailor` post-processors are the only option here, specified via `add_tailor()`. Examples of post-processing model predictions include applying a probability threshold for two-class problems, calibrating probability estimates, and truncating the possible range of predictions.
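To make the stages concrete, here is a minimal sketch of assembling and fitting a workflow. It uses `mtcars` purely for illustration; any one of the three preprocessors shown can start the workflow, but only one may be attached at a time.

```r
library(workflows)

# Each of the three preprocessor types, added to an empty workflow:
wf_formula <- add_formula(workflow(), mpg ~ cyl)
wf_variables <- add_variables(workflow(), outcomes = mpg, predictors = c(cyl, disp))

rec <- recipes::recipe(mpg ~ cyl, mtcars)
wf_recipe <- add_recipe(workflow(), rec)

# A parsnip model specification supplies the fit stage:
spec <- parsnip::linear_reg()
spec <- parsnip::set_engine(spec, "lm")

wf <- add_model(wf_recipe, spec)

# fit() runs the preprocessor, then fits the model on the processed data:
wf_fit <- fit(wf, mtcars)
is_trained_workflow(wf_fit)
```

Preprocessor, model spec, and (optionally) postprocessor can also be passed directly to `workflow()` itself, as in `workflow(rec, spec)`.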
45 | -------------------------------------------------------------------------------- /workflows.Rproj: -------------------------------------------------------------------------------- 1 | Version: 1.0 2 | 3 | RestoreWorkspace: No 4 | SaveWorkspace: No 5 | AlwaysSaveHistory: Default 6 | 7 | EnableCodeIndexing: Yes 8 | UseSpacesForTab: Yes 9 | NumSpacesForTab: 2 10 | Encoding: UTF-8 11 | 12 | RnwWeave: Sweave 13 | LaTeX: pdfLaTeX 14 | 15 | AutoAppendNewline: Yes 16 | StripTrailingWhitespace: Yes 17 | 18 | BuildType: Package 19 | PackageUseDevtools: Yes 20 | PackageInstallArgs: --no-multiarch --with-keep.source 21 | PackageRoxygenize: rd,collate,namespace 22 | --------------------------------------------------------------------------------