├── .github └── workflows │ ├── README.md │ ├── pr-close-signal.yaml │ ├── pr-comment.yaml │ ├── pr-post-remove-branch.yaml │ ├── pr-preflight.yaml │ ├── pr-receive.yaml │ ├── sandpaper-main.yaml │ ├── sandpaper-version.txt │ ├── update-cache.yaml │ └── update-workflows.yaml ├── .gitignore ├── CODE_OF_CONDUCT.md ├── CONTRIBUTING.md ├── LICENSE.md ├── README.md ├── config.yaml ├── episodes ├── basic-targets.Rmd ├── branch.Rmd ├── cache.Rmd ├── fig │ ├── 03-qmd-workflow.png │ ├── basic-rstudio-project.png │ ├── basic-rstudio-wizard.png │ └── lifecycle-visnetwork.png ├── files.Rmd ├── files │ ├── lesson_functions.R │ ├── packages.R │ ├── plans │ │ ├── README.md │ │ ├── plan_0.R │ │ ├── plan_1.R │ │ ├── plan_10.R │ │ ├── plan_11.R │ │ ├── plan_2.R │ │ ├── plan_2b.R │ │ ├── plan_3.R │ │ ├── plan_4.R │ │ ├── plan_5.R │ │ ├── plan_6.R │ │ ├── plan_6b.R │ │ ├── plan_7.R │ │ ├── plan_8.R │ │ └── plan_9.R │ └── tar_functions │ │ ├── README.md │ │ ├── augment_with_mod_name.R │ │ ├── augment_with_mod_name_slow.R │ │ ├── clean_penguin_data.R │ │ ├── glance_with_mod_name.R │ │ ├── glance_with_mod_name_slow.R │ │ ├── model_augment.R │ │ ├── model_augment_slow.R │ │ ├── model_glance.R │ │ ├── model_glance_orig.R │ │ ├── model_glance_slow.R │ │ └── write_lines_file.R ├── functions.Rmd ├── introduction.Rmd ├── lifecycle.Rmd ├── organization.Rmd ├── packages.Rmd ├── parallel.Rmd └── quarto.Rmd ├── index.md ├── instructors └── instructor-notes.md ├── learners ├── reference.md └── setup.md ├── links.md ├── profiles └── learner-profiles.md ├── renv ├── activate.R ├── profile └── profiles │ └── lesson-requirements │ ├── renv.lock │ └── renv │ ├── .gitignore │ └── settings.json ├── site └── README.md └── targets-workshop.Rproj /.github/workflows/README.md: -------------------------------------------------------------------------------- 1 | # Carpentries Workflows 2 | 3 | This directory contains workflows to be used for Lessons using the {sandpaper} 4 | lesson infrastructure. 
Two of these workflows require R (`sandpaper-main.yaml` 5 | and `pr-receive.yaml`) and the rest are bots to handle pull request management. 6 | 7 | These workflows will likely change as {sandpaper} evolves, so it is important to 8 | keep them up-to-date. To do this in your lesson, you can run the following in your 9 | R console: 10 | 11 | ```r 12 | # Install/Update sandpaper 13 | options(repos = c(carpentries = "https://carpentries.r-universe.dev/", 14 | CRAN = "https://cloud.r-project.org")) 15 | install.packages("sandpaper") 16 | 17 | # update the workflows in your lesson 18 | library("sandpaper") 19 | update_github_workflows() 20 | ``` 21 | 22 | Inside this folder, you will find a file called `sandpaper-version.txt`, which 23 | will contain a version number for sandpaper. This will be used in the future to 24 | alert you if a workflow update is needed. 25 | 26 | What follows are the descriptions of the workflow files: 27 | 28 | ## Deployment 29 | 30 | ### 01 Build and Deploy (sandpaper-main.yaml) 31 | 32 | This is the main driver that will only act on the main branch of the repository. 33 | This workflow does the following: 34 | 35 | 1. checks out the lesson 36 | 2. provisions the following resources 37 | - R 38 | - pandoc 39 | - lesson infrastructure (stored in a cache) 40 | - lesson dependencies if needed (stored in a cache) 41 | 3. builds the lesson via `sandpaper:::ci_deploy()` 42 | 43 | #### Caching 44 | 45 | This workflow has two caches; one cache is for the lesson infrastructure and 46 | the other is for the lesson dependencies if the lesson contains rendered 47 | content. These caches are invalidated by new versions of the infrastructure and 48 | the `renv.lock` file, respectively. If there is a problem with the cache, 49 | manual invalidation is necessary. 
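For example, manual invalidation can be scripted from the command line. This is a sketch only: it computes a fresh date-stamped value for the `CACHE_VERSION` secret; the `gh secret set` call (shown as a comment) assumes an installed and authenticated GitHub CLI and is not part of the official workflow files:

```shell
# Compute today's date in ISO format (YYYY-MM-DD) to use as the new
# CACHE_VERSION value; changing this secret invalidates all of the caches.
new_version="$(date +%F)"
echo "CACHE_VERSION will be set to: ${new_version}"
# With the GitHub CLI installed and authenticated, the secret could then be
# updated with (not run here):
#   gh secret set CACHE_VERSION --body "${new_version}"
```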
You will need maintain access to the repository, 50 | and you can either go to the Actions tab and [click on the caches button to find 51 | and invalidate the failing cache](https://github.blog/changelog/2022-10-20-manage-caches-in-your-actions-workflows-from-web-interface/) 52 | or set the `CACHE_VERSION` secret to the current date (which will 53 | invalidate all of the caches). 54 | 55 | ## Updates 56 | 57 | ### Setup Information 58 | 59 | These workflows run on a schedule and at the maintainer's request. Because they 60 | create pull requests that update workflows and require the downstream actions to run, 61 | they need a special repository/organization secret token called 62 | `SANDPAPER_WORKFLOW`, and it must have the `public_repo` and `workflow` scopes. 63 | 64 | This can be an individual user's token, OR it can come from a trusted bot account. If you 65 | have a repository in one of the official Carpentries accounts, then you do not 66 | need to worry about this token being present because the Carpentries Core Team 67 | will take care of supplying this token. 68 | 69 | If you want to use your personal account, you can go to 70 | 71 | to create a token. Once you have created your token, you should copy it to your 72 | clipboard and then go to your repository's settings > secrets > actions and 73 | create or edit the `SANDPAPER_WORKFLOW` secret, pasting in the generated token. 74 | 75 | If you do not specify your token correctly, the runs will not fail; instead, they will 76 | give you instructions for providing the token for your repository. 77 | 78 | ### 02 Maintain: Update Workflow Files (update-workflows.yaml) 79 | 80 | The {sandpaper} repository was designed to do as much as possible to separate 81 | the tools from the content. For local builds, this is absolutely true, but 82 | there is a minor issue when it comes to workflow files: they must live inside 83 | the repository. 84 | 85 | This workflow ensures that the workflow files are up-to-date. 
The way it works is 86 | to download the `update-workflows.sh` script from GitHub and run it. The script 87 | will do the following: 88 | 89 | 1. check the recorded version of sandpaper against the current version on GitHub 90 | 2. update the files if there is a difference in versions 91 | 92 | After the files are updated, if there are any changes, they are pushed to a 93 | branch called `update/workflows` and a pull request is created. Maintainers are 94 | encouraged to review the changes and accept the pull request if the outputs 95 | are okay. 96 | 97 | This update is run weekly or on demand. 98 | 99 | ### 03 Maintain: Update Package Cache (update-cache.yaml) 100 | 101 | For lessons that have generated content, we use {renv} to ensure that the output 102 | is stable. This is controlled by a single lockfile, which documents the packages 103 | needed for the lesson and their version numbers. This workflow is skipped in 104 | lessons that do not have generated content. 105 | 106 | Because the lessons need to remain current with the package ecosystem, it's a 107 | good idea to make sure these packages can be updated periodically. The 108 | update cache workflow will do this by checking for updates, applying them in a 109 | branch called `update/packages`, and creating a pull request with _only the 110 | lockfile changed_. 111 | 112 | From here, the markdown documents will be rebuilt and you can inspect what has 113 | changed based on how the packages have updated. 114 | 115 | ## Pull Request and Review Management 116 | 117 | Because our lessons execute code, pull requests are a security risk for any 118 | lesson and thus have security measures associated with them. 
**Do not merge any 119 | pull request that does not pass checks or that the bots have not commented on.** 120 | 121 | These workflows all go together and are described in the following 122 | diagram and the sections below: 123 | 124 | ![Graph representation of a pull request](https://carpentries.github.io/sandpaper/articles/img/pr-flow.dot.svg) 125 | 126 | ### Pre Flight Pull Request Validation (pr-preflight.yaml) 127 | 128 | This workflow runs every time a pull request is created, and its purpose is to 129 | validate that the pull request is okay to run. This means checking the following: 130 | 131 | 1. The pull request does not contain modified workflow files 132 | 2. If the pull request contains modified workflow files, it does not contain 133 | modified content files (such as a situation where @carpentries-bot will 134 | make an automated pull request) 135 | 3. The pull request does not contain an invalid commit hash (e.g. from a fork 136 | that was made before a lesson was transitioned from styles to use the 137 | workbench). 138 | 139 | Once the checks are finished, a comment is issued to the pull request, which 140 | will allow maintainers to determine if it is safe to run the 141 | "Receive Pull Request" workflow from new contributors. 142 | 143 | ### Receive Pull Request (pr-receive.yaml) 144 | 145 | **Note of caution:** This workflow runs arbitrary code submitted by anyone who creates a 146 | pull request. GitHub has safeguarded the token used in this workflow to have no 147 | privileges in the repository, but we have taken precautions to protect against 148 | spoofing. 149 | 150 | This workflow is triggered with every push to a pull request. If this workflow 151 | is already running and a new push is sent to the pull request, the workflow 152 | running from the previous push will be cancelled and a new workflow run will be 153 | started. 154 | 155 | The first step of this workflow is to check if it is valid (e.g. 
that no 156 | workflow files have been modified). If any workflow files have been 157 | modified, a comment is made indicating that the workflow will not run. If 158 | both a workflow file and lesson content are modified, an error will occur. 159 | 160 | The second step (if valid) is to build the generated content from the pull 161 | request. This builds the content and uploads three artifacts: 162 | 163 | 1. The pull request number (pr) 164 | 2. A summary of changes after the rendering process (diff) 165 | 3. The rendered files (built) 166 | 167 | Because this workflow builds generated content, it follows the same general 168 | process as the `sandpaper-main` workflow with the same caching mechanisms. 169 | 170 | The artifacts produced are used by the next workflow. 171 | 172 | ### Comment on Pull Request (pr-comment.yaml) 173 | 174 | This workflow is triggered if the `pr-receive.yaml` workflow is successful. 175 | The steps in this workflow are: 176 | 177 | 1. Test if the workflow run is valid and comment its validity on the 178 | pull request. 179 | 2. If it is valid: create an orphan branch with two commits: the current state 180 | of the repository and the proposed changes. 181 | 3. If it is valid: update the pull request comment with the summary of changes. 182 | 183 | Importantly: if the pull request is invalid, the branch is not created, so any 184 | malicious code is not published. 185 | 186 | From here, the maintainer can request changes from the author and eventually 187 | either merge or reject the PR. When this happens, if the PR was valid, the 188 | preview branch needs to be deleted. 189 | 190 | ### Send Close PR Signal (pr-close-signal.yaml) 191 | 192 | Triggered any time a pull request is closed. This emits an artifact that is the 193 | pull request number, for use by the next workflow. 194 | 195 | ### Remove Pull Request Branch (pr-post-remove-branch.yaml) 196 | 197 | Triggered by `pr-close-signal.yaml`. 
This removes the temporary branch associated with 198 | the pull request (if it was created). 199 | -------------------------------------------------------------------------------- /.github/workflows/pr-close-signal.yaml: -------------------------------------------------------------------------------- 1 | name: "Bot: Send Close Pull Request Signal" 2 | 3 | on: 4 | pull_request: 5 | types: 6 | [closed] 7 | 8 | jobs: 9 | send-close-signal: 10 | name: "Send closing signal" 11 | runs-on: ubuntu-22.04 12 | if: ${{ github.event.action == 'closed' }} 13 | steps: 14 | - name: "Create PRtifact" 15 | run: | 16 | mkdir -p ./pr 17 | printf ${{ github.event.number }} > ./pr/NUM 18 | - name: Upload Diff 19 | uses: actions/upload-artifact@v4 20 | with: 21 | name: pr 22 | path: ./pr 23 | -------------------------------------------------------------------------------- /.github/workflows/pr-comment.yaml: -------------------------------------------------------------------------------- 1 | name: "Bot: Comment on the Pull Request" 2 | 3 | # read-write repo token 4 | # access to secrets 5 | on: 6 | workflow_run: 7 | workflows: ["Receive Pull Request"] 8 | types: 9 | - completed 10 | 11 | concurrency: 12 | group: pr-${{ github.event.workflow_run.pull_requests[0].number }} 13 | cancel-in-progress: true 14 | 15 | 16 | jobs: 17 | # Pull requests are valid if: 18 | # - they match the sha of the workflow run head commit 19 | # - they are open 20 | # - no .github files were committed 21 | test-pr: 22 | name: "Test if pull request is valid" 23 | runs-on: ubuntu-22.04 24 | if: > 25 | github.event.workflow_run.event == 'pull_request' && 26 | github.event.workflow_run.conclusion == 'success' 27 | outputs: 28 | is_valid: ${{ steps.check-pr.outputs.VALID }} 29 | payload: ${{ steps.check-pr.outputs.payload }} 30 | number: ${{ steps.get-pr.outputs.NUM }} 31 | msg: ${{ steps.check-pr.outputs.MSG }} 32 | steps: 33 | - name: 'Download PR artifact' 34 | id: dl 35 | uses: 
carpentries/actions/download-workflow-artifact@main 36 | with: 37 | run: ${{ github.event.workflow_run.id }} 38 | name: 'pr' 39 | 40 | - name: "Get PR Number" 41 | if: ${{ steps.dl.outputs.success == 'true' }} 42 | id: get-pr 43 | run: | 44 | unzip pr.zip 45 | echo "NUM=$(<./NR)" >> $GITHUB_OUTPUT 46 | 47 | - name: "Fail if PR number was not present" 48 | id: bad-pr 49 | if: ${{ steps.dl.outputs.success != 'true' }} 50 | run: | 51 | echo '::error::A pull request number was not recorded. The pull request that triggered this workflow is likely malicious.' 52 | exit 1 53 | - name: "Get Invalid Hashes File" 54 | id: hash 55 | run: | 56 | echo "json<> $GITHUB_OUTPUT 59 | - name: "Check PR" 60 | id: check-pr 61 | if: ${{ steps.dl.outputs.success == 'true' }} 62 | uses: carpentries/actions/check-valid-pr@main 63 | with: 64 | pr: ${{ steps.get-pr.outputs.NUM }} 65 | sha: ${{ github.event.workflow_run.head_sha }} 66 | headroom: 3 # if it's within the last three commits, we can keep going, because it's likely rapid-fire 67 | invalid: ${{ fromJSON(steps.hash.outputs.json)[github.repository] }} 68 | fail_on_error: true 69 | 70 | # Create an orphan branch on this repository with two commits 71 | # - the current HEAD of the md-outputs branch 72 | # - the output from running the current HEAD of the pull request through 73 | # the md generator 74 | create-branch: 75 | name: "Create Git Branch" 76 | needs: test-pr 77 | runs-on: ubuntu-22.04 78 | if: ${{ needs.test-pr.outputs.is_valid == 'true' }} 79 | env: 80 | NR: ${{ needs.test-pr.outputs.number }} 81 | permissions: 82 | contents: write 83 | steps: 84 | - name: 'Checkout md outputs' 85 | uses: actions/checkout@v4 86 | with: 87 | ref: md-outputs 88 | path: built 89 | fetch-depth: 1 90 | 91 | - name: 'Download built markdown' 92 | id: dl 93 | uses: carpentries/actions/download-workflow-artifact@main 94 | with: 95 | run: ${{ github.event.workflow_run.id }} 96 | name: 'built' 97 | 98 | - if: ${{ steps.dl.outputs.success == 'true' }} 
99 | run: unzip built.zip 100 | 101 | - name: "Create orphan and push" 102 | if: ${{ steps.dl.outputs.success == 'true' }} 103 | run: | 104 | cd built/ 105 | git config --local user.email "actions@github.com" 106 | git config --local user.name "GitHub Actions" 107 | CURR_HEAD=$(git rev-parse HEAD) 108 | git checkout --orphan md-outputs-PR-${NR} 109 | git add -A 110 | git commit -m "source commit: ${CURR_HEAD}" 111 | ls -A | grep -v '^.git$' | xargs -I _ rm -r '_' 112 | cd .. 113 | unzip -o -d built built.zip 114 | cd built 115 | git add -A 116 | git commit --allow-empty -m "differences for PR #${NR}" 117 | git push -u --force --set-upstream origin md-outputs-PR-${NR} 118 | 119 | # Comment on the Pull Request with a link to the branch and the diff 120 | comment-pr: 121 | name: "Comment on Pull Request" 122 | needs: [test-pr, create-branch] 123 | runs-on: ubuntu-22.04 124 | if: ${{ needs.test-pr.outputs.is_valid == 'true' }} 125 | env: 126 | NR: ${{ needs.test-pr.outputs.number }} 127 | permissions: 128 | pull-requests: write 129 | steps: 130 | - name: 'Download comment artifact' 131 | id: dl 132 | uses: carpentries/actions/download-workflow-artifact@main 133 | with: 134 | run: ${{ github.event.workflow_run.id }} 135 | name: 'diff' 136 | 137 | - if: ${{ steps.dl.outputs.success == 'true' }} 138 | run: unzip ${{ github.workspace }}/diff.zip 139 | 140 | - name: "Comment on PR" 141 | id: comment-diff 142 | if: ${{ steps.dl.outputs.success == 'true' }} 143 | uses: carpentries/actions/comment-diff@main 144 | with: 145 | pr: ${{ env.NR }} 146 | path: ${{ github.workspace }}/diff.md 147 | 148 | # Comment if the PR is open and matches the SHA, but the workflow files have 149 | # changed 150 | comment-changed-workflow: 151 | name: "Comment if workflow files have changed" 152 | needs: test-pr 153 | runs-on: ubuntu-22.04 154 | if: ${{ always() && needs.test-pr.outputs.is_valid == 'false' }} 155 | env: 156 | NR: ${{ github.event.workflow_run.pull_requests[0].number }} 157 | 
body: ${{ needs.test-pr.outputs.msg }} 158 | permissions: 159 | pull-requests: write 160 | steps: 161 | - name: 'Check for spoofing' 162 | id: dl 163 | uses: carpentries/actions/download-workflow-artifact@main 164 | with: 165 | run: ${{ github.event.workflow_run.id }} 166 | name: 'built' 167 | 168 | - name: 'Alert if spoofed' 169 | id: spoof 170 | if: ${{ steps.dl.outputs.success == 'true' }} 171 | run: | 172 | echo 'body<> $GITHUB_ENV 173 | echo '' >> $GITHUB_ENV 174 | echo '## :x: DANGER :x:' >> $GITHUB_ENV 175 | echo 'This pull request has modified workflows that created output. Close this now.' >> $GITHUB_ENV 176 | echo '' >> $GITHUB_ENV 177 | echo 'EOF' >> $GITHUB_ENV 178 | 179 | - name: "Comment on PR" 180 | id: comment-diff 181 | uses: carpentries/actions/comment-diff@main 182 | with: 183 | pr: ${{ env.NR }} 184 | body: ${{ env.body }} 185 | -------------------------------------------------------------------------------- /.github/workflows/pr-post-remove-branch.yaml: -------------------------------------------------------------------------------- 1 | name: "Bot: Remove Temporary PR Branch" 2 | 3 | on: 4 | workflow_run: 5 | workflows: ["Bot: Send Close Pull Request Signal"] 6 | types: 7 | - completed 8 | 9 | jobs: 10 | delete: 11 | name: "Delete branch from Pull Request" 12 | runs-on: ubuntu-22.04 13 | if: > 14 | github.event.workflow_run.event == 'pull_request' && 15 | github.event.workflow_run.conclusion == 'success' 16 | permissions: 17 | contents: write 18 | steps: 19 | - name: 'Download artifact' 20 | uses: carpentries/actions/download-workflow-artifact@main 21 | with: 22 | run: ${{ github.event.workflow_run.id }} 23 | name: pr 24 | - name: "Get PR Number" 25 | id: get-pr 26 | run: | 27 | unzip pr.zip 28 | echo "NUM=$(<./NUM)" >> $GITHUB_OUTPUT 29 | - name: 'Remove branch' 30 | uses: carpentries/actions/remove-branch@main 31 | with: 32 | pr: ${{ steps.get-pr.outputs.NUM }} 33 | 
-------------------------------------------------------------------------------- /.github/workflows/pr-preflight.yaml: -------------------------------------------------------------------------------- 1 | name: "Pull Request Preflight Check" 2 | 3 | on: 4 | pull_request_target: 5 | branches: 6 | ["main"] 7 | types: 8 | ["opened", "synchronize", "reopened"] 9 | 10 | jobs: 11 | test-pr: 12 | name: "Test if pull request is valid" 13 | if: ${{ github.event.action != 'closed' }} 14 | runs-on: ubuntu-22.04 15 | outputs: 16 | is_valid: ${{ steps.check-pr.outputs.VALID }} 17 | permissions: 18 | pull-requests: write 19 | steps: 20 | - name: "Get Invalid Hashes File" 21 | id: hash 22 | run: | 23 | echo "json<> $GITHUB_OUTPUT 26 | - name: "Check PR" 27 | id: check-pr 28 | uses: carpentries/actions/check-valid-pr@main 29 | with: 30 | pr: ${{ github.event.number }} 31 | invalid: ${{ fromJSON(steps.hash.outputs.json)[github.repository] }} 32 | fail_on_error: true 33 | - name: "Comment result of validation" 34 | id: comment-diff 35 | if: ${{ always() }} 36 | uses: carpentries/actions/comment-diff@main 37 | with: 38 | pr: ${{ github.event.number }} 39 | body: ${{ steps.check-pr.outputs.MSG }} 40 | -------------------------------------------------------------------------------- /.github/workflows/pr-receive.yaml: -------------------------------------------------------------------------------- 1 | name: "Receive Pull Request" 2 | 3 | on: 4 | pull_request: 5 | types: 6 | [opened, synchronize, reopened] 7 | 8 | concurrency: 9 | group: ${{ github.ref }} 10 | cancel-in-progress: true 11 | 12 | jobs: 13 | test-pr: 14 | name: "Record PR number" 15 | if: ${{ github.event.action != 'closed' }} 16 | runs-on: ubuntu-22.04 17 | outputs: 18 | is_valid: ${{ steps.check-pr.outputs.VALID }} 19 | steps: 20 | - name: "Record PR number" 21 | id: record 22 | if: ${{ always() }} 23 | run: | 24 | echo ${{ github.event.number }} > ${{ github.workspace }}/NR # 2022-03-02: artifact name fixed to be NR 25 | 
- name: "Upload PR number" 26 | id: upload 27 | if: ${{ always() }} 28 | uses: actions/upload-artifact@v4 29 | with: 30 | name: pr 31 | path: ${{ github.workspace }}/NR 32 | - name: "Get Invalid Hashes File" 33 | id: hash 34 | run: | 35 | echo "json<> $GITHUB_OUTPUT 38 | - name: "echo output" 39 | run: | 40 | echo "${{ steps.hash.outputs.json }}" 41 | - name: "Check PR" 42 | id: check-pr 43 | uses: carpentries/actions/check-valid-pr@main 44 | with: 45 | pr: ${{ github.event.number }} 46 | invalid: ${{ fromJSON(steps.hash.outputs.json)[github.repository] }} 47 | 48 | build-md-source: 49 | name: "Build markdown source files if valid" 50 | needs: test-pr 51 | runs-on: ubuntu-22.04 52 | if: ${{ needs.test-pr.outputs.is_valid == 'true' }} 53 | env: 54 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 55 | RENV_PATHS_ROOT: ~/.local/share/renv/ 56 | CHIVE: ${{ github.workspace }}/site/chive 57 | PR: ${{ github.workspace }}/site/pr 58 | MD: ${{ github.workspace }}/site/built 59 | steps: 60 | - name: "Check Out Main Branch" 61 | uses: actions/checkout@v4 62 | 63 | - name: "Check Out Staging Branch" 64 | uses: actions/checkout@v4 65 | with: 66 | ref: md-outputs 67 | path: ${{ env.MD }} 68 | 69 | - name: "Set up R" 70 | uses: r-lib/actions/setup-r@v2 71 | with: 72 | use-public-rspm: true 73 | install-r: false 74 | 75 | - name: "Set up Pandoc" 76 | uses: r-lib/actions/setup-pandoc@v2 77 | 78 | - name: "Setup Lesson Engine" 79 | uses: carpentries/actions/setup-sandpaper@main 80 | with: 81 | cache-version: ${{ secrets.CACHE_VERSION }} 82 | 83 | - name: "Setup Package Cache" 84 | uses: carpentries/actions/setup-lesson-deps@main 85 | with: 86 | cache-version: ${{ secrets.CACHE_VERSION }} 87 | 88 | - name: "Validate and Build Markdown" 89 | id: build-site 90 | run: | 91 | sandpaper::package_cache_trigger(TRUE) 92 | sandpaper::validate_lesson(path = '${{ github.workspace }}') 93 | sandpaper:::build_markdown(path = '${{ github.workspace }}', quiet = FALSE) 94 | shell: Rscript {0} 95 | 96 | - 
name: "Generate Artifacts" 97 | id: generate-artifacts 98 | run: | 99 | sandpaper:::ci_bundle_pr_artifacts( 100 | repo = '${{ github.repository }}', 101 | pr_number = '${{ github.event.number }}', 102 | path_md = '${{ env.MD }}', 103 | path_pr = '${{ env.PR }}', 104 | path_archive = '${{ env.CHIVE }}', 105 | branch = 'md-outputs' 106 | ) 107 | shell: Rscript {0} 108 | 109 | - name: "Upload PR" 110 | uses: actions/upload-artifact@v4 111 | with: 112 | name: pr 113 | path: ${{ env.PR }} 114 | overwrite: true 115 | 116 | - name: "Upload Diff" 117 | uses: actions/upload-artifact@v4 118 | with: 119 | name: diff 120 | path: ${{ env.CHIVE }} 121 | retention-days: 1 122 | 123 | - name: "Upload Build" 124 | uses: actions/upload-artifact@v4 125 | with: 126 | name: built 127 | path: ${{ env.MD }} 128 | retention-days: 1 129 | 130 | - name: "Teardown" 131 | run: sandpaper::reset_site() 132 | shell: Rscript {0} 133 | -------------------------------------------------------------------------------- /.github/workflows/sandpaper-main.yaml: -------------------------------------------------------------------------------- 1 | name: "01 Build and Deploy Site" 2 | 3 | on: 4 | push: 5 | branches: 6 | - main 7 | - master 8 | schedule: 9 | - cron: '0 0 * * 2' 10 | workflow_dispatch: 11 | inputs: 12 | name: 13 | description: 'Who triggered this build?' 
14 | required: true 15 | default: 'Maintainer (via GitHub)' 16 | reset: 17 | description: 'Reset cached markdown files' 18 | required: false 19 | default: false 20 | type: boolean 21 | jobs: 22 | full-build: 23 | name: "Build Full Site" 24 | 25 | # 2024-10-01: ubuntu-latest is now 24.04 and R is not installed by default in the runner image 26 | # pin to 22.04 for now 27 | runs-on: ubuntu-22.04 28 | permissions: 29 | checks: write 30 | contents: write 31 | pages: write 32 | env: 33 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 34 | RENV_PATHS_ROOT: ~/.local/share/renv/ 35 | steps: 36 | 37 | - name: "Checkout Lesson" 38 | uses: actions/checkout@v4 39 | 40 | - name: "Set up R" 41 | uses: r-lib/actions/setup-r@v2 42 | with: 43 | use-public-rspm: true 44 | install-r: false 45 | 46 | - name: "Set up Pandoc" 47 | uses: r-lib/actions/setup-pandoc@v2 48 | 49 | - name: "Setup Lesson Engine" 50 | uses: carpentries/actions/setup-sandpaper@main 51 | with: 52 | cache-version: ${{ secrets.CACHE_VERSION }} 53 | 54 | - name: "Setup Package Cache" 55 | uses: carpentries/actions/setup-lesson-deps@main 56 | with: 57 | cache-version: ${{ secrets.CACHE_VERSION }} 58 | 59 | - name: "Deploy Site" 60 | run: | 61 | reset <- "${{ github.event.inputs.reset }}" == "true" 62 | sandpaper::package_cache_trigger(TRUE) 63 | sandpaper:::ci_deploy(reset = reset) 64 | shell: Rscript {0} 65 | -------------------------------------------------------------------------------- /.github/workflows/sandpaper-version.txt: -------------------------------------------------------------------------------- 1 | 0.16.11 2 | -------------------------------------------------------------------------------- /.github/workflows/update-cache.yaml: -------------------------------------------------------------------------------- 1 | name: "03 Maintain: Update Package Cache" 2 | 3 | on: 4 | workflow_dispatch: 5 | inputs: 6 | name: 7 | description: 'Who triggered this build (enter github username to tag yourself)?' 
8 | required: true 9 | default: 'monthly run' 10 | schedule: 11 | # Run every tuesday 12 | - cron: '0 0 * * 2' 13 | 14 | jobs: 15 | preflight: 16 | name: "Preflight Check" 17 | runs-on: ubuntu-22.04 18 | outputs: 19 | ok: ${{ steps.check.outputs.ok }} 20 | steps: 21 | - id: check 22 | run: | 23 | if [[ ${{ github.event_name }} == 'workflow_dispatch' ]]; then 24 | echo "ok=true" >> $GITHUB_OUTPUT 25 | echo "Running on request" 26 | # using single brackets here to avoid 08 being interpreted as octal 27 | # https://github.com/carpentries/sandpaper/issues/250 28 | elif [ `date +%d` -le 7 ]; then 29 | # If the Tuesday lands in the first week of the month, run it 30 | echo "ok=true" >> $GITHUB_OUTPUT 31 | echo "Running on schedule" 32 | else 33 | echo "ok=false" >> $GITHUB_OUTPUT 34 | echo "Not Running Today" 35 | fi 36 | 37 | check_renv: 38 | name: "Check if We Need {renv}" 39 | runs-on: ubuntu-22.04 40 | needs: preflight 41 | if: ${{ needs.preflight.outputs.ok == 'true'}} 42 | outputs: 43 | needed: ${{ steps.renv.outputs.exists }} 44 | steps: 45 | - name: "Checkout Lesson" 46 | uses: actions/checkout@v4 47 | - id: renv 48 | run: | 49 | if [[ -d renv ]]; then 50 | echo "exists=true" >> $GITHUB_OUTPUT 51 | fi 52 | 53 | check_token: 54 | name: "Check SANDPAPER_WORKFLOW token" 55 | runs-on: ubuntu-22.04 56 | needs: check_renv 57 | if: ${{ needs.check_renv.outputs.needed == 'true' }} 58 | outputs: 59 | workflow: ${{ steps.validate.outputs.wf }} 60 | repo: ${{ steps.validate.outputs.repo }} 61 | steps: 62 | - name: "validate token" 63 | id: validate 64 | uses: carpentries/actions/check-valid-credentials@main 65 | with: 66 | token: ${{ secrets.SANDPAPER_WORKFLOW }} 67 | 68 | update_cache: 69 | name: "Update Package Cache" 70 | needs: check_token 71 | if: ${{ needs.check_token.outputs.repo== 'true' }} 72 | runs-on: ubuntu-22.04 73 | env: 74 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 75 | RENV_PATHS_ROOT: ~/.local/share/renv/ 76 | steps: 77 | 78 | - name: "Checkout Lesson" 79 | 
uses: actions/checkout@v4 80 | 81 | - name: "Set up R" 82 | uses: r-lib/actions/setup-r@v2 83 | with: 84 | use-public-rspm: true 85 | install-r: false 86 | 87 | - name: "Update {renv} deps and determine if a PR is needed" 88 | id: update 89 | uses: carpentries/actions/update-lockfile@main 90 | with: 91 | cache-version: ${{ secrets.CACHE_VERSION }} 92 | 93 | - name: Create Pull Request 94 | id: cpr 95 | if: ${{ steps.update.outputs.n > 0 }} 96 | uses: carpentries/create-pull-request@main 97 | with: 98 | token: ${{ secrets.SANDPAPER_WORKFLOW }} 99 | delete-branch: true 100 | branch: "update/packages" 101 | commit-message: "[actions] update ${{ steps.update.outputs.n }} packages" 102 | title: "Update ${{ steps.update.outputs.n }} packages" 103 | body: | 104 | :robot: This is an automated build 105 | 106 | This will update ${{ steps.update.outputs.n }} packages in your lesson with the following versions: 107 | 108 | ``` 109 | ${{ steps.update.outputs.report }} 110 | ``` 111 | 112 | :stopwatch: In a few minutes, a comment will appear that will show you how the output has changed based on these updates. 113 | 114 | If you want to inspect these changes locally, you can use the following code to check out a new branch: 115 | 116 | ```bash 117 | git fetch origin update/packages 118 | git checkout update/packages 119 | ``` 120 | 121 | - Auto-generated by [create-pull-request][1] on ${{ steps.update.outputs.date }} 122 | 123 | [1]: https://github.com/carpentries/create-pull-request/tree/main 124 | labels: "type: package cache" 125 | draft: false 126 | -------------------------------------------------------------------------------- /.github/workflows/update-workflows.yaml: -------------------------------------------------------------------------------- 1 | name: "02 Maintain: Update Workflow Files" 2 | 3 | on: 4 | workflow_dispatch: 5 | inputs: 6 | name: 7 | description: 'Who triggered this build (enter github username to tag yourself)?' 
8 | required: true 9 | default: 'weekly run' 10 | clean: 11 | description: 'Workflow files/file extensions to clean (no wildcards, enter "" for none)' 12 | required: false 13 | default: '.yaml' 14 | schedule: 15 | # Run every Tuesday 16 | - cron: '0 0 * * 2' 17 | 18 | jobs: 19 | check_token: 20 | name: "Check SANDPAPER_WORKFLOW token" 21 | runs-on: ubuntu-22.04 22 | outputs: 23 | workflow: ${{ steps.validate.outputs.wf }} 24 | repo: ${{ steps.validate.outputs.repo }} 25 | steps: 26 | - name: "validate token" 27 | id: validate 28 | uses: carpentries/actions/check-valid-credentials@main 29 | with: 30 | token: ${{ secrets.SANDPAPER_WORKFLOW }} 31 | 32 | update_workflow: 33 | name: "Update Workflow" 34 | runs-on: ubuntu-22.04 35 | needs: check_token 36 | if: ${{ needs.check_token.outputs.workflow == 'true' }} 37 | steps: 38 | - name: "Checkout Repository" 39 | uses: actions/checkout@v4 40 | 41 | - name: Update Workflows 42 | id: update 43 | uses: carpentries/actions/update-workflows@main 44 | with: 45 | clean: ${{ github.event.inputs.clean }} 46 | 47 | - name: Create Pull Request 48 | id: cpr 49 | if: "${{ steps.update.outputs.new }}" 50 | uses: carpentries/create-pull-request@main 51 | with: 52 | token: ${{ secrets.SANDPAPER_WORKFLOW }} 53 | delete-branch: true 54 | branch: "update/workflows" 55 | commit-message: "[actions] update sandpaper workflow to version ${{ steps.update.outputs.new }}" 56 | title: "Update Workflows to Version ${{ steps.update.outputs.new }}" 57 | body: | 58 | :robot: This is an automated build 59 | 60 | Update Workflows from sandpaper version ${{ steps.update.outputs.old }} -> ${{ steps.update.outputs.new }} 61 | 62 | - Auto-generated by [create-pull-request][1] on ${{ steps.update.outputs.date }} 63 | 64 | [1]: https://github.com/carpentries/create-pull-request/tree/main 65 | labels: "type: template and tools" 66 | draft: false 67 | -------------------------------------------------------------------------------- /.gitignore: 
-------------------------------------------------------------------------------- 1 | # sandpaper files 2 | episodes/*html 3 | site/* 4 | !site/README.md 5 | 6 | # History files 7 | .Rhistory 8 | .Rapp.history 9 | 10 | # Session Data files 11 | .RData 12 | 13 | # User-specific files 14 | .Ruserdata 15 | 16 | # Example code in package build process 17 | *-Ex.R 18 | 19 | # Output files from R CMD build 20 | /*.tar.gz 21 | 22 | # Output files from R CMD check 23 | /*.Rcheck/ 24 | 25 | # RStudio files 26 | .Rproj.user/ 27 | 28 | # produced vignettes 29 | vignettes/*.html 30 | vignettes/*.pdf 31 | 32 | # OAuth2 token, see https://github.com/hadley/httr/releases/tag/v0.3 33 | .httr-oauth 34 | 35 | # knitr and R markdown default cache directories 36 | *_cache/ 37 | /cache/ 38 | 39 | # Temporary files created by R markdown 40 | *.utf8.md 41 | *.knit.md 42 | 43 | # R Environment Variables 44 | .Renviron 45 | 46 | # pkgdown site 47 | docs/ 48 | 49 | # translation temp files 50 | po/*~ 51 | 52 | # renv detritus 53 | renv/sandbox/ 54 | 55 | # vscode settings 56 | .vscode 57 | -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Contributor Code of Conduct" 3 | --- 4 | 5 | As contributors and maintainers of this project, 6 | we pledge to follow the [The Carpentries Code of Conduct][coc]. 7 | 8 | Instances of abusive, harassing, or otherwise unacceptable behavior 9 | may be reported by following our [reporting guidelines][coc-reporting]. 
10 | 11 | 12 | [coc-reporting]: https://docs.carpentries.org/topic_folders/policies/incident-reporting.html 13 | [coc]: https://docs.carpentries.org/topic_folders/policies/code-of-conduct.html 14 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | ## Contributing 2 | 3 | [The Carpentries][cp-site] ([Software Carpentry][swc-site], [Data 4 | Carpentry][dc-site], and [Library Carpentry][lc-site]) are open source 5 | projects, and we welcome contributions of all kinds: new lessons, fixes to 6 | existing material, bug reports, and reviews of proposed changes are all 7 | welcome. 8 | 9 | ### Contributor Agreement 10 | 11 | By contributing, you agree that we may redistribute your work under [our 12 | license](LICENSE.md). In exchange, we will address your issues and/or assess 13 | your change proposal as promptly as we can, and help you become a member of our 14 | community. Everyone involved in [The Carpentries][cp-site] agrees to abide by 15 | our [code of conduct](CODE_OF_CONDUCT.md). 16 | 17 | ### How to Contribute 18 | 19 | The easiest way to get started is to file an issue to tell us about a spelling 20 | mistake, some awkward wording, or a factual error. This is a good way to 21 | introduce yourself and to meet some of our community members. 22 | 23 | 1. If you do not have a [GitHub][github] account, you can [send us comments by 24 | email][contact]. However, we will be able to respond more quickly if you use 25 | one of the other methods described below. 26 | 27 | 2. If you have a [GitHub][github] account, or are willing to [create 28 | one][github-join], but do not know how to use Git, you can report problems 29 | or suggest improvements by [creating an issue][issues]. This allows us to 30 | assign the item to someone and to respond to it in a threaded discussion. 31 | 32 | 3. 
If you are comfortable with Git, and would like to add or change material, 33 | you can submit a pull request (PR). Instructions for doing this are 34 | [included below](#using-github). 35 | 36 | Note: if you want to build the website locally, please refer to [The Workbench 37 | documentation][template-doc]. 38 | 39 | ### Where to Contribute 40 | 41 | 1. If you wish to change this lesson, add issues and pull requests here. 42 | 2. If you wish to change the template used for workshop websites, please refer 43 | to [The Workbench documentation][template-doc]. 44 | 45 | 46 | ### What to Contribute 47 | 48 | There are many ways to contribute, from writing new exercises and improving 49 | existing ones to updating or filling in the documentation and submitting [bug 50 | reports][issues] about things that do not work, are not clear, or are missing. 51 | If you are looking for ideas, please see [the list of issues for this 52 | repository][repo], or the issues for [Data Carpentry][dc-issues], [Library 53 | Carpentry][lc-issues], and [Software Carpentry][swc-issues] projects. 54 | 55 | Comments on issues and reviews of pull requests are just as welcome: we are 56 | smarter together than we are on our own. **Reviews from novices and newcomers 57 | are particularly valuable**: it's easy for people who have been using these 58 | lessons for a while to forget how impenetrable some of this material can be, so 59 | fresh eyes are always welcome. 60 | 61 | ### What *Not* to Contribute 62 | 63 | Our lessons already contain more material than we can cover in a typical 64 | workshop, so we are usually *not* looking for more concepts or tools to add to 65 | them. As a rule, if you want to introduce a new idea, you must (a) estimate how 66 | long it will take to teach and (b) explain what you would take out to make room 67 | for it. The first encourages contributors to be honest about requirements; the 68 | second, to think hard about priorities. 
69 | 70 | We are also not looking for exercises or other material that only run on one 71 | platform. Our workshops typically contain a mixture of Windows, macOS, and 72 | Linux users; in order to be usable, our lessons must run equally well on all 73 | three. 74 | 75 | ### Using GitHub 76 | 77 | If you choose to contribute via GitHub, you may want to look at [How to 78 | Contribute to an Open Source Project on GitHub][how-contribute]. In brief, we 79 | use [GitHub flow][github-flow] to manage changes: 80 | 81 | 1. Create a new branch in your desktop copy of this repository for each 82 | significant change. 83 | 2. Commit the change in that branch. 84 | 3. Push that branch to your fork of this repository on GitHub. 85 | 4. Submit a pull request from that branch to the [upstream repository][repo]. 86 | 5. If you receive feedback, make changes on your desktop and push to your 87 | branch on GitHub: the pull request will update automatically. 88 | 89 | NB: The published copy of the lesson is usually in the `main` branch. 90 | 91 | Each lesson has a team of maintainers who review issues and pull requests or 92 | encourage others to do so. The maintainers are community volunteers, and have 93 | final say over what gets merged into the lesson. 94 | 95 | ### Other Resources 96 | 97 | The Carpentries is a global organisation with volunteers and learners all over 98 | the world. We share values of inclusivity and a passion for sharing knowledge, 99 | teaching and learning. There are several ways to connect with The Carpentries 100 | community listed on the [Carpentries website][cp-site], including via social 101 | media, Slack, newsletters, and email lists. You can also [reach us by 102 | email][contact].
103 | 104 | [repo]: https://github.com/joelnitta/targets-workshop 105 | [contact]: mailto:team@carpentries.org 106 | [cp-site]: https://carpentries.org/ 107 | [dc-issues]: https://github.com/issues?q=user%3Adatacarpentry 108 | [dc-lessons]: https://datacarpentry.org/lessons/ 109 | [dc-site]: https://datacarpentry.org/ 110 | [discuss-list]: https://lists.software-carpentry.org/listinfo/discuss 111 | [github]: https://github.com 112 | [github-flow]: https://guides.github.com/introduction/flow/ 113 | [github-join]: https://github.com/join 114 | [how-contribute]: https://egghead.io/series/how-to-contribute-to-an-open-source-project-on-github 115 | [issues]: https://carpentries.org/help-wanted-issues/ 116 | [lc-issues]: https://github.com/issues?q=user%3ALibraryCarpentry 117 | [swc-issues]: https://github.com/issues?q=user%3Aswcarpentry 118 | [swc-lessons]: https://software-carpentry.org/lessons/ 119 | [swc-site]: https://software-carpentry.org/ 120 | [lc-site]: https://librarycarpentry.org/ 121 | [template-doc]: https://carpentries.github.io/workbench/ 122 | -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Licenses" 3 | --- 4 | 5 | ## Instructional Material 6 | 7 | All Carpentries (Software Carpentry, Data Carpentry, and Library Carpentry) 8 | instructional material is made available under the [Creative Commons 9 | Attribution license][cc-by-human]. The following is a human-readable summary of 10 | (and not a substitute for) the [full legal text of the CC BY 4.0 11 | license][cc-by-legal]. 12 | 13 | You are free: 14 | 15 | - to **Share**---copy and redistribute the material in any medium or format 16 | - to **Adapt**---remix, transform, and build upon the material 17 | 18 | for any purpose, even commercially. 19 | 20 | The licensor cannot revoke these freedoms as long as you follow the license 21 | terms. 
22 | 23 | Under the following terms: 24 | 25 | - **Attribution**---You must give appropriate credit (mentioning that your work 26 | is derived from work that is Copyright (c) The Carpentries and, where 27 | practical, linking to ), provide a [link to the 28 | license][cc-by-human], and indicate if changes were made. You may do so in 29 | any reasonable manner, but not in any way that suggests the licensor endorses 30 | you or your use. 31 | 32 | - **No additional restrictions**---You may not apply legal terms or 33 | technological measures that legally restrict others from doing anything the 34 | license permits. With the understanding that: 35 | 36 | Notices: 37 | 38 | * You do not have to comply with the license for elements of the material in 39 | the public domain or where your use is permitted by an applicable exception 40 | or limitation. 41 | * No warranties are given. The license may not give you all of the permissions 42 | necessary for your intended use. For example, other rights such as publicity, 43 | privacy, or moral rights may limit how you use the material. 44 | 45 | ## Software 46 | 47 | Except where otherwise noted, the example programs and other software provided 48 | by The Carpentries are made available under the [OSI][osi]-approved [MIT 49 | license][mit-license]. 50 | 51 | Permission is hereby granted, free of charge, to any person obtaining a copy of 52 | this software and associated documentation files (the "Software"), to deal in 53 | the Software without restriction, including without limitation the rights to 54 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 55 | of the Software, and to permit persons to whom the Software is furnished to do 56 | so, subject to the following conditions: 57 | 58 | The above copyright notice and this permission notice shall be included in all 59 | copies or substantial portions of the Software. 
60 | 61 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 62 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 63 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 64 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 65 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 66 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 67 | SOFTWARE. 68 | 69 | ## Trademark 70 | 71 | "The Carpentries", "Software Carpentry", "Data Carpentry", and "Library 72 | Carpentry" and their respective logos are registered trademarks of [Community 73 | Initiatives][ci]. 74 | 75 | [cc-by-human]: https://creativecommons.org/licenses/by/4.0/ 76 | [cc-by-legal]: https://creativecommons.org/licenses/by/4.0/legalcode 77 | [mit-license]: https://opensource.org/licenses/mit-license.html 78 | [ci]: https://communityin.org/ 79 | [osi]: https://opensource.org 80 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Introduction to the "targets" Package for Reproducible Data Analysis in R 2 | 3 | This is a pre-alpha lesson about the [targets](https://github.com/ropensci/targets) R package built using [The Carpentries Workbench][workbench]. 4 | 5 | The lesson website is here: https://carpentries-incubator.github.io/targets-workshop/ 6 | 7 | [workbench]: https://carpentries.github.io/sandpaper-docs/ 8 | 9 | Materials licensed under [CC-BY 4.0](LICENSE.md) by the authors 10 | -------------------------------------------------------------------------------- /config.yaml: -------------------------------------------------------------------------------- 1 | #------------------------------------------------------------ 2 | # Values for this lesson. 
3 | #------------------------------------------------------------ 4 | 5 | # Which carpentry is this (swc, dc, lc, or cp)? 6 | # swc: Software Carpentry 7 | # dc: Data Carpentry 8 | # lc: Library Carpentry 9 | # cp: Carpentries (to use for instructor training for instance) 10 | # incubator: The Carpentries Incubator 11 | carpentry: 'incubator' 12 | 13 | # Overall title for pages. 14 | title: 'Introduction to targets' 15 | 16 | # Date the lesson was created (YYYY-MM-DD, this is empty by default) 17 | created: ~ 18 | 19 | # Comma-separated list of keywords for the lesson 20 | keywords: 'reproducibility, data, targets, R' 21 | 22 | # Life cycle stage of the lesson 23 | # possible values: pre-alpha, alpha, beta, stable 24 | life_cycle: 'pre-alpha' 25 | 26 | # License of the lesson 27 | license: 'CC-BY 4.0' 28 | 29 | # Link to the source repository for this lesson 30 | source: 'https://github.com/carpentries-incubator/targets-workshop' 31 | 32 | # Default branch of your lesson 33 | branch: 'main' 34 | 35 | # Who to contact if there are any issues 36 | contact: 'joelnitta@gmail.com' 37 | 38 | # Navigation ------------------------------------------------ 39 | # 40 | # Use the following menu items to specify the order of 41 | # individual pages in each dropdown section. Leave blank to 42 | # include all pages in the folder. 
43 | # 44 | # Example ------------- 45 | # 46 | # episodes: 47 | # - introduction.md 48 | # - first-steps.md 49 | # 50 | # learners: 51 | # - setup.md 52 | # 53 | # instructors: 54 | # - instructor-notes.md 55 | # 56 | # profiles: 57 | # - one-learner.md 58 | # - another-learner.md 59 | 60 | # Order of episodes in your lesson 61 | episodes: 62 | - introduction.Rmd 63 | - basic-targets.Rmd 64 | - functions.Rmd 65 | - cache.Rmd 66 | - lifecycle.Rmd 67 | - organization.Rmd 68 | - packages.Rmd 69 | - files.Rmd 70 | - branch.Rmd 71 | - parallel.Rmd 72 | - quarto.Rmd 73 | 74 | # Information for Learners 75 | learners: 76 | 77 | # Information for Instructors 78 | instructors: 79 | 80 | # Learner Profiles 81 | profiles: 82 | 83 | # Customisation --------------------------------------------- 84 | # 85 | # This space below is where custom yaml items (e.g. pinning 86 | # sandpaper and varnish versions) should live 87 | 88 | 89 | -------------------------------------------------------------------------------- /episodes/basic-targets.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: 'First targets Workflow' 3 | teaching: 30 4 | exercises: 10 5 | --- 6 | 7 | :::::::::::::::::::::::::::::::::::::: questions 8 | 9 | - What are best practices for organizing analyses? 10 | - What is a `_targets.R` file for? 11 | - What is the content of the `_targets.R` file? 12 | - How do you run a workflow? 
13 | 14 | :::::::::::::::::::::::::::::::::::::::::::::::: 15 | 16 | ::::::::::::::::::::::::::::::::::::: objectives 17 | 18 | - Create a project in RStudio 19 | - Explain the purpose of the `_targets.R` file 20 | - Write a basic `_targets.R` file 21 | - Use a `_targets.R` file to run a workflow 22 | 23 | :::::::::::::::::::::::::::::::::::::::::::::::: 24 | 25 | ::::::::::::::::::::::::::::::::::::: {.instructor} 26 | 27 | Episode summary: First chance to get hands dirty by writing a very simple workflow 28 | 29 | ::::::::::::::::::::::::::::::::::::: 30 | 31 | ```{r} 32 | #| label: setup 33 | #| echo: FALSE 34 | #| message: FALSE 35 | #| warning: FALSE 36 | library(targets) 37 | 38 | if (interactive()) { 39 | setwd("episodes") 40 | } 41 | 42 | source("files/lesson_functions.R") 43 | ``` 44 | 45 | ## Create a project 46 | 47 | ### About projects 48 | 49 | `targets` uses the "project" concept for organizing analyses: all of the files needed for a given project are put in a single folder, the project folder. 50 | The project folder has additional subfolders for organization, such as folders for data, code, and results. 51 | 52 | By using projects, it makes it straightforward to re-orient yourself if you return to an analysis after time spent elsewhere. 53 | This wouldn't be a problem if we only ever work on one thing at a time until completion, but that is almost never the case. 54 | It is hard to remember what you were doing when you come back to a project after working on something else (a phenomenon called "context switching"). 55 | By using a standardized organization system, you will reduce confusion and lost time... in other words, you are increasing reproducibility! 56 | 57 | This workshop will use RStudio, since it also works well with the project organization concept. 58 | 59 | ### Create a project in RStudio 60 | 61 | Let's start a new project using RStudio. 62 | 63 | Click "File", then select "New Project". 
64 | 65 | This will open the New Project Wizard, a set of menus to help you set up the project. 66 | 67 | ![The New Project Wizard](fig/basic-rstudio-wizard.png){alt="Screenshot of RStudio New Project Wizard menu"} 68 | 69 | In the Wizard, click the first option, "New Directory", since we are making a brand-new project from scratch. 70 | Click "New Project" in the next menu. 71 | In "Directory name", enter a name that helps you remember the purpose of the project, such as "targets-demo" (follow best practices for naming files and folders). 72 | Under "Create project as a subdirectory of...", click the "Browse" button to select a directory to put the project. 73 | We recommend putting it on your Desktop so you can easily find it. 74 | 75 | You can leave "Create a git repository" and "Use renv with this project" unchecked, but these are both excellent tools to improve reproducibility, and you should consider learning them and using them in the future, if you don't already. 76 | They can be enabled at any later time, so you don't need to worry about trying to use them immediately. 77 | 78 | Once you work through these steps, your RStudio session should look like this: 79 | 80 | ![Your newly created project](fig/basic-rstudio-project.png){alt="Screenshot of RStudio with a newly created project called 'targets-demo' open containing a single file, 'targets-demo.Rproj'"} 81 | 82 | Our project now contains a single file, created by RStudio: `targets-demo.Rproj`. You should not edit this file by hand. Its purpose is to tell RStudio that this is a project folder and to store some RStudio settings (if you use version-control software, it is OK to commit this file). Also, you can open the project by double clicking on the `.Rproj` file in your file explorer (try it by quitting RStudio then navigating in your file browser to your Desktop, opening the "targets-demo" folder, and double clicking `targets-demo.Rproj`). 
83 | 84 | OK, now that our project is set up, we are (almost) ready to start using `targets`! 85 | 86 | ## Background: non-`targets` version 87 | 88 | First though, to get familiar with the functions and packages we'll use, let's run the code like you would in a "normal" R script without using `targets`. 89 | 90 | Recall that we are using the `palmerpenguins` R package to obtain the data. 91 | This package actually includes two variations of the dataset: one is an external CSV file with the raw data, and another is the cleaned data loaded into R. 92 | In real life you probably have externally stored raw data, so **let's use the raw penguin data** as the starting point for our analysis too. 93 | 94 | The `path_to_file()` function in `palmerpenguins` provides the path to the raw data CSV file (it is inside the `palmerpenguins` R package source code that you downloaded to your computer when you installed the package). 95 | 96 | ```{r} 97 | #| label: normal-r-path 98 | library(palmerpenguins) 99 | 100 | # Get path to CSV file 101 | penguins_csv_file <- path_to_file("penguins_raw.csv") 102 | 103 | penguins_csv_file 104 | ``` 105 | 106 | We will use the `tidyverse` set of packages for loading and manipulating the data. We don't have time to cover all the details about using `tidyverse` now, but if you want to learn more about it, please see the ["Manipulating, analyzing and exporting data with tidyverse" lesson](https://datacarpentry.org/R-ecology-lesson/03-dplyr.html), or the Carpentries Incubator lesson [R and the tidyverse for working with datasets](https://carpentries-incubator.github.io/r-tidyverse-4-datasets/). 107 | 108 | Let's load the data with `read_csv()`.
109 | 110 | ```{r} 111 | #| label: normal-r-load-show 112 | #| eval: false 113 | library(tidyverse) 114 | 115 | # Read CSV file into R 116 | penguins_data_raw <- read_csv(penguins_csv_file) 117 | 118 | penguins_data_raw 119 | ``` 120 | 121 | ```{r} 122 | #| label: normal-r-load-hide 123 | #| echo: false 124 | suppressPackageStartupMessages(library(tidyverse)) 125 | 126 | # Read CSV file into R 127 | penguins_data_raw <- read_csv(penguins_csv_file) 128 | 129 | penguins_data_raw 130 | ``` 131 | 132 | We see the raw data has some awkward column names with spaces (these are hard to type out and can easily lead to mistakes in the code), and far more columns than we need. 133 | For the purposes of this analysis, we only need species name, bill length, and bill depth. 134 | In the raw data, the rather technical term "culmen" is used to refer to the bill. 135 | 136 | ![Illustration of bill (culmen) length and depth. Artwork by @allison_horst.](https://allisonhorst.github.io/palmerpenguins/reference/figures/culmen_depth.png) 137 | 138 | Let's clean up the data to make it easier to use for downstream analyses. 139 | We will also remove any rows with missing data, because this could cause errors for some functions later. 140 | 141 | ```{r} 142 | #| label: normal-r-clean 143 | 144 | # Clean up raw data 145 | penguins_data <- penguins_data_raw |> 146 | # Rename columns for easier typing and 147 | # subset to only the columns needed for analysis 148 | select( 149 | species = Species, 150 | bill_length_mm = `Culmen Length (mm)`, 151 | bill_depth_mm = `Culmen Depth (mm)` 152 | ) |> 153 | # Delete rows with missing data 154 | drop_na() 155 | 156 | penguins_data 157 | ``` 158 | 159 | We have not run the full analysis yet, but this is enough to get us started with the transition to using `targets`. 
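To recap, here is the non-`targets` version gathered into a single script. Nothing in it is new; it is just the code from the chunks above collected in one place, which will make it easier to compare with the `targets` version we write next:

```r
library(palmerpenguins)
library(tidyverse)

# Get path to CSV file
penguins_csv_file <- path_to_file("penguins_raw.csv")

# Read CSV file into R
penguins_data_raw <- read_csv(penguins_csv_file)

# Clean up raw data
penguins_data <- penguins_data_raw |>
  # Rename columns for easier typing and
  # subset to only the columns needed for analysis
  select(
    species = Species,
    bill_length_mm = `Culmen Length (mm)`,
    bill_depth_mm = `Culmen Depth (mm)`
  ) |>
  # Delete rows with missing data
  drop_na()
```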
160 | 161 | ## `targets` version 162 | 163 | ### About the `_targets.R` file 164 | 165 | One major difference between a typical R data analysis and a `targets` project is that the latter must include a special file, called `_targets.R`, in the main project folder (the "project root"). 166 | 167 | The `_targets.R` file includes the specification of the workflow: these are the directions for R to run your analysis, kind of like a recipe. 168 | By using the `_targets.R` file, **you won't have to remember to run specific scripts in a certain order**; instead, R will do it for you! 169 | This is a **huge win**, both for your future self and anybody else trying to reproduce your analysis. 170 | 171 | ### Writing the initial `_targets.R` file 172 | 173 | We will now start to write a `_targets.R` file. Fortunately, `targets` comes with a function to help us do this. 174 | 175 | In the R console, first load the `targets` package with `library(targets)`, then run the command `tar_script()`. 176 | 177 | ```{r} 178 | #| label: start-targets-show 179 | #| eval: FALSE 180 | library(targets) 181 | tar_script() 182 | ``` 183 | 184 | Nothing will happen in the console, but in the file viewer, you should see a new file, `_targets.R`, appear. Open it using the File menu or by clicking on it. 185 | 186 | ```{r} 187 | #| label: start-targets-hide 188 | #| eval: true 189 | #| echo: false 190 | #| results: "asis" 191 | plan_0_dir <- make_tempdir() 192 | pushd(plan_0_dir) 193 | tar_script() 194 | default_script <- readr::read_lines("_targets.R") 195 | popd() 196 | 197 | cat("```{.r}\n") 198 | cat(default_script, sep = "\n") 199 | cat("```") 200 | ``` 201 | 202 | Don't worry about the details of this file. 203 | Instead, notice that it includes three main parts: 204 | 205 | - Loading packages with `library()` 206 | - Defining a custom function with `function()` 207 | - Defining a list with `list()`. 208 | 209 | You may not have used `function()` before.
If not, that's OK; we will cover this in more detail in the [next episode](episodes/functions.Rmd), so we will ignore it for now. 211 | 212 | The last part, the list, is the **most important part** of the `_targets.R` file. 213 | It defines the steps in the workflow. 214 | The `_targets.R` file **must always end with this list**. 215 | 216 | Furthermore, each item in the list is a call to the `tar_target()` function. 217 | The first argument of `tar_target()` is the name of the target to build, and the second argument is the command used to build it. 218 | Note that the name of the target is **unquoted**, that is, it is written without any surrounding quotation marks. 219 | 220 | ## Modifying `_targets.R` to run the example analysis 221 | 222 | First, let's load all of the packages we need for our workflow. 223 | Add `library(tidyverse)` and `library(palmerpenguins)` to the top of `_targets.R` after `library(targets)`. 224 | 225 | Next, we can delete the `function()` statement since we won't be using that just yet (we will come back to custom functions soon!). 226 | 227 | The last, and trickiest, part is correctly defining the workflow in the list at the end of the file. 228 | 229 | From [the non-`targets` version](#background-non-targets-version), you can see we have three steps so far: 230 | 231 | 1. Define the path to the CSV file with the raw penguins data. 232 | 2. Read the CSV file. 233 | 3. Clean the raw data. 234 | 235 | Each of these will be one item in the list. 236 | Furthermore, we need to write each item using the `tar_target()` function. 237 | Recall that we write the `tar_target()` function by writing the **name of the target to build** first and the **command to build it** second. 238 | 239 | ::::::::::::::::::::::::::::::::::::: {.callout} 240 | 241 | ## Choosing good target names 242 | 243 | The name of each target could be anything you like, but it is strongly recommended to **choose names that reflect what the target actually contains**.
244 | 245 | For example, `penguins_data_raw` for the raw data loaded from the CSV file and not `x`. 246 | 247 | Your future self will thank you! 248 | 249 | :::::::::::::::::::::::::::::::::::::::::: 250 | 251 | ::::::::::::::::::::::::::::::::::::: {.challenge} 252 | 253 | ## Challenge: Use `tar_target()` 254 | 255 | Can you use `tar_target()` to define the first step in the workflow (setting the path to the CSV file with the penguins data)? 256 | 257 | :::::::::::::::::::::::::::::::::: {.solution} 258 | 259 | ```{r} 260 | #| label: challenge-solution-1 261 | #| eval: false 262 | tar_target(name = penguins_csv_file, command = path_to_file("penguins_raw.csv")) 263 | ``` 264 | 265 | The first two arguments of `tar_target()` are the **name** of the target, followed by the **command** to build it. 266 | 267 | These arguments are used so frequently we will typically omit the argument names, instead writing it like this: 268 | 269 | ```{r} 270 | #| label: challenge-solution-2 271 | #| eval: false 272 | tar_target(penguins_csv_file, path_to_file("penguins_raw.csv")) 273 | ``` 274 | 275 | :::::::::::::::::::::::::::::::::: 276 | 277 | :::::::::::::::::::::::::::::::::::::::::: 278 | 279 | Now that we've seen how to define the first target, let's continue and add the rest. 280 | 281 | Once you've done that, this is how `_targets.R` should look: 282 | 283 | ```{r} 284 | #| label = "targets-show-workflow", 285 | #| eval = FALSE, 286 | #| code = readLines("files/plans/plan_0.R")[2:22] 287 | ``` 288 | 289 | I have set `show_col_types = FALSE` in `read_csv()` because we know from the earlier code that the column types were set correctly by default (character for species and numeric for bill length and depth), so we don't need to see the warning it would otherwise issue. 290 | 291 | ## Run the workflow 292 | 293 | Now that we have a workflow, we can run it with the `tar_make()` function. 
294 | Try running it, and you should see something like this: 295 | 296 | ```{r} 297 | #| label: targets-run 298 | #| eval: true 299 | #| echo: [3] 300 | pushd(make_tempdir()) 301 | write_example_plan("plan_0.R") 302 | tar_make() 303 | popd() 304 | ``` 305 | 306 | Congratulations, you've run your first workflow with `targets`! 307 | 308 | ::::::::::::::::::::::::::::::::::::: {.callout} 309 | 310 | ## The workflow cannot be run interactively 311 | 312 | You may be used to running R code interactively by selecting lines and pressing the "Run" button (or using the keyboard shortcut) in RStudio or your IDE of choice. 313 | 314 | You *could* run the list at the end of `_targets.R` this way, but it will not execute the workflow (it will return a list instead). 315 | 316 | **The only way to run the workflow is with `tar_make()`.** 317 | 318 | You do not need to select and run anything interactively in `_targets.R`. 319 | In fact, you do not even need to have the `_targets.R` file open to run the workflow with `tar_make()`---try it for yourself! 320 | 321 | Similarly, **you must not write `tar_make()` in the `_targets.R` file**; you should only use `tar_make()` as a direct command at the R console. 322 | 323 | :::::::::::::::::::::::::::::::::::::::::: 324 | 325 | Remember, now that we are using `targets`, **the only thing you need to do to replicate your analysis is run `tar_make()`**. 326 | 327 | This is true no matter how long or complicated your analysis becomes.
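By the way, where did the results go? `targets` saves the value of each target in a special `_targets/` folder inside the project (we will look at this storage in a later episode). To take a quick look at a finished target from the R console, use `tar_read()` or `tar_load()`:

```r
library(targets)

# Print the value of a single target (here, the cleaned data)
# without creating an object in your environment
tar_read(penguins_data)

# Or load a target into the global environment as an object
# with the same name as the target
tar_load(penguins_data_raw)
```

Like `tar_make()`, these are meant to be run at the console, not written into `_targets.R`.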
328 | 329 | ::::::::::::::::::::::::::::::::::::: keypoints 330 | 331 | - Projects help keep our analyses organized so we can easily re-run them later 332 | - Use the RStudio Project Wizard to create projects 333 | - The `_targets.R` file is a special file that must be included in all `targets` projects, and defines the workflow 334 | - Use `tar_script()` to create a default `_targets.R` file 335 | - Use `tar_make()` to run the workflow 336 | 337 | :::::::::::::::::::::::::::::::::::::::::::::::: 338 | -------------------------------------------------------------------------------- /episodes/branch.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: 'Branching' 3 | teaching: 30 4 | exercises: 2 5 | --- 6 | 7 | :::::::::::::::::::::::::::::::::::::: questions 8 | 9 | - How can we specify many targets without typing everything out? 10 | 11 | :::::::::::::::::::::::::::::::::::::::::::::::: 12 | 13 | ::::::::::::::::::::::::::::::::::::: objectives 14 | 15 | - Be able to specify targets using branching 16 | 17 | :::::::::::::::::::::::::::::::::::::::::::::::: 18 | 19 | ::::::::::::::::::::::::::::::::::::: instructor 20 | 21 | Episode summary: Show how to use branching 22 | 23 | ::::::::::::::::::::::::::::::::::::: 24 | 25 | ```{r} 26 | #| label: setup 27 | #| echo: FALSE 28 | #| message: FALSE 29 | #| warning: FALSE 30 | library(targets) 31 | library(tarchetypes) 32 | library(broom) 33 | 34 | # sandpaper renders this lesson from episodes/ 35 | # need to emulate this behavior during interactive development 36 | # would be preferable to use here::here() but it doesn't work for some reason 37 | if (interactive()) { 38 | setwd("episodes") 39 | } 40 | 41 | source("files/lesson_functions.R") 42 | 43 | # Increase width for printing tibbles 44 | options(width = 140) 45 | ``` 46 | 47 | ## Why branching?
48 | 49 | One of the major strengths of `targets` is the ability to define many targets from a single line of code ("branching"). 50 | This not only saves you typing, but also **reduces the risk of errors**, since there is less chance of making a typo. 51 | 52 | ## Types of branching 53 | 54 | There are two types of branching: **dynamic branching** and **static branching**. 55 | "Branching" refers to the idea that you can provide a single specification for how to make targets (the "pattern"), and `targets` generates multiple targets from it ("branches"). 56 | "Dynamic" means that the branches that result from the pattern do not have to be defined ahead of time---they are a dynamic result of the code. 57 | 58 | In this workshop, we will only cover dynamic branching since it is generally easier to write (static branching requires use of [meta-programming](https://books.ropensci.org/targets/static.html#metaprogramming), an advanced topic). For more information about each and when you might want to use one or the other (or some combination of the two), [see the `targets` package manual](https://books.ropensci.org/targets/dynamic.html). 59 | 60 | ## Example without branching 61 | 62 | To see how this works, let's continue our analysis of the `palmerpenguins` dataset. 63 | 64 | **Our hypothesis is that bill depth decreases with bill length.** 65 | We will test this hypothesis with a linear model. 66 | 67 | For example, this is a model of bill depth dependent on bill length: 68 | 69 | ```{r} 70 | #| label: example-lm 71 | #| eval: FALSE 72 | lm(bill_depth_mm ~ bill_length_mm, data = penguins_data) 73 | ``` 74 | 75 | We can add this to our pipeline. 
We will call it the `combined_model` because it combines all the species together without distinction: 76 | 77 | ```{r} 78 | #| label = "example-lm-pipeline-show", 79 | #| eval = FALSE, 80 | #| code = readLines("files/plans/plan_4.R")[2:19] 81 | ``` 82 | 83 | ```{r} 84 | #| label: example-lm-pipeline-hide 85 | #| echo: false 86 | plan_4_dir <- make_tempdir() 87 | pushd(plan_4_dir) 88 | write_example_plan("plan_3.R") 89 | tar_make(reporter = "silent") 90 | write_example_plan("plan_4.R") 91 | tar_make() 92 | popd() 93 | ``` 94 | 95 | Let's have a look at the model. We will use the `glance()` function from the `broom` package. Unlike base R `summary()`, this function returns output as a tibble (the tidyverse equivalent of a dataframe), which as we will see later is quite useful for downstream analyses. 96 | 97 | ```{r} 98 | #| label: example-lm-pipeline-inspect-show 99 | #| eval: true 100 | #| echo: [2, 3, 4] 101 | pushd(plan_4_dir) 102 | library(broom) 103 | tar_load(combined_model) 104 | glance(combined_model) 105 | popd() 106 | ``` 107 | 108 | Notice the small *P*-value. 109 | This seems to indicate that the model is highly significant. 110 | 111 | But wait a moment... is this really an appropriate model? Recall that there are three species of penguins in the dataset. It is possible that the relationship between bill depth and length **varies by species**. 112 | 113 | Let's try making one model *per* species (three models total) to see how that does (this is technically not the correct statistical approach, but our focus here is to learn `targets`, not statistics). 114 | 115 | Now our workflow is getting more complicated. 
This is what a workflow for such an analysis might look like **without branching** (make sure to add `library(broom)` to `packages.R`): 116 | 117 | ```{r} 118 | #| label = "example-model-show-1", 119 | #| eval = FALSE, 120 | #| code = readLines("files/plans/plan_5.R")[2:36] 121 | ``` 122 | 123 | ```{r} 124 | #| label: example-model-hide-1 125 | #| echo: false 126 | plan_5_dir <- make_tempdir() 127 | pushd(plan_5_dir) 128 | # simulate already running the plan once 129 | write_example_plan("plan_4.R") 130 | tar_make(reporter = "silent") 131 | write_example_plan("plan_5.R") 132 | tar_make() 133 | popd() 134 | ``` 135 | 136 | Let's look at the summary of one of the models: 137 | 138 | ```{r} 139 | #| label: example-model-show-2 140 | #| eval: true 141 | #| echo: [2] 142 | pushd(plan_5_dir) 143 | tar_read(adelie_summary) 144 | popd() 145 | ``` 146 | 147 | So this way of writing the pipeline works, but is repetitive: we have to call `glance()` each time we want to obtain summary statistics for each model. 148 | Furthermore, each summary target (`adelie_summary`, etc.) is explicitly named and typed out manually. 149 | It would be fairly easy to make a typo and end up with the wrong model being summarized. 150 | 151 | Before moving on, let's define another **custom function**: `model_glance()`. 152 | You will need to write custom functions frequently when using `targets`, so it's good to get used to it! 153 | 154 | As the name `model_glance()` suggests (it is good to write functions with names that indicate their purpose), this will build a model then immediately run `glance()` on it. 155 | The reason for doing so is that we get a **dataframe as a result**, which is very helpful for branching, as we will see in the next section. 
156 | Save this in `R/functions.R`: 157 | 158 | ```{r} 159 | #| label = "model-glance", 160 | #| eval = FALSE, 161 | #| code = readLines("files/tar_functions/model_glance_orig.R") 162 | ``` 163 | 164 | ## Example with branching 165 | 166 | ### First attempt 167 | 168 | Let's see how to write the same plan using **dynamic branching** (after running it, we will go through the new version in detail to understand each step): 169 | 170 | ```{r} 171 | #| label = "example-model-show-3", 172 | #| eval = FALSE, 173 | #| code = readLines("files/plans/plan_6.R")[2:28] 174 | ``` 175 | 176 | What is going on here? 177 | 178 | First, let's look at the messages provided by `tar_make()`. 179 | 180 | ```{r} 181 | #| label: example-model-hide-3 182 | #| echo: false 183 | plan_6_dir <- make_tempdir() 184 | pushd(plan_6_dir) 185 | # simulate already running the plan once 186 | write_example_plan("plan_5.R") 187 | tar_make(reporter = "silent") 188 | # run version of plan that uses `model_glance_orig()` (doesn't include species 189 | # names in output) 190 | write_example_plan("plan_6b.R") 191 | tar_make() 192 | example_branch_name <- tar_branch_names(species_summary, 1) 193 | popd() 194 | ``` 195 | 196 | There is a series of smaller targets (branches) that are each named like `r example_branch_name`, then one overall `species_summary` target. 197 | That is the result of specifying targets using branching: the smaller targets are the "branches" that comprise the overall target. 198 | Since `targets` has no way of knowing ahead of time how many branches there will be or what they represent, it names each one using this series of numbers and letters (the "hash"). 199 | `targets` builds each branch one at a time, then combines them into the overall target. 
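If you want to match a hash back to the branch it represents, `targets` includes helper functions for inspecting branches. Here is a brief sketch, assuming the plan above has already been run with `tar_make()`:

```r
library(targets)

# Show the automatically generated names of the first two branches
tar_branch_names(species_summary, 1:2)

# Read the result of just the first branch, rather than the
# combined `species_summary` target
tar_read(species_summary, branches = 1)
```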
200 | 201 | Next, let's look in more detail at how the workflow is set up, starting with how we set up the data: 202 | 203 | ```{r} 204 | #| label = "model-def", 205 | #| code = readLines("files/plans/plan_6.R")[14:19], 206 | #| eval = FALSE 207 | ``` 208 | 209 | Unlike the non-branching version, we added a step that **groups the data**. 210 | This is because dynamic branching is similar to the [`tidyverse` approach](https://dplyr.tidyverse.org/articles/grouping.html) of applying the same function to a grouped dataframe. 211 | So we use the `tar_group_by()` function to specify the groups in our input data: one group per species. 212 | 213 | Next, take a look at the command to build the target `species_summary`. 214 | 215 | ```{r} 216 | #| label = "model-summaries", 217 | #| code = readLines("files/plans/plan_6.R")[22:27], 218 | #| eval = FALSE 219 | ``` 220 | 221 | As before, the first argument to `tar_target()` is the name of the target to build, and the second is the command to build it. 222 | 223 | Here, we apply our custom `model_glance()` function to each group (in other words, each species) in `penguins_data_grouped`. 224 | 225 | Finally, there is an argument we haven't seen before, `pattern`, which indicates that this target should be built using dynamic branching. 226 | `map` means to apply the function to each group of the input data (`penguins_data_grouped`) sequentially. 227 | 228 | Now that we understand how the branching workflow is constructed, let's inspect the output: 229 | 230 | ```{r} 231 | #| label: example-model-show-4 232 | #| eval: FALSE 233 | tar_read(species_summary) 234 | ``` 235 | 236 | ```{r} 237 | #| label: example-model-hide-4 238 | #| echo: FALSE 239 | pushd(plan_6_dir) 240 | tar_read(species_summary) 241 | popd() 242 | ``` 243 | 244 | The model summary statistics are all included in a single dataframe. 
245 | 246 | But there's one problem: **we can't tell which row came from which species!** It would be unwise to assume that they are in the same order as the input data. 247 | 248 | This is due to the way dynamic branching works: by default, there is no information about the provenance of each target preserved in the output. 249 | 250 | How can we fix this? 251 | 252 | ### Second attempt 253 | 254 | The key to obtaining useful output from branching pipelines is to include the necessary information in the output of each individual branch. 255 | Here, we want to know the species that corresponds to each row of the model summaries. 256 | 257 | We can achieve this by modifying our `model_glance()` function. Be sure to save it after modifying it to include a column for species: 258 | 259 | ```{r} 260 | #| label: example-model-show-5 261 | #| eval: FALSE 262 | #| file: files/tar_functions/model_glance.R 263 | ``` 264 | 265 | Our new pipeline looks exactly the same as before; we have made a modification, but to a **function**, not the pipeline. 266 | 267 | Since `targets` tracks the contents of each custom function, it realizes that it needs to recompute `species_summary` and runs this target again with the newly modified function. 268 | 269 | ```{r} 270 | #| label: example-model-hide-6 271 | #| echo: FALSE 272 | pushd(plan_6_dir) 273 | write_example_plan("plan_6.R") 274 | tar_make() 275 | popd() 276 | ``` 277 | 278 | And this time, when we load `species_summary`, we can tell which model corresponds to which row (the `.before = 1` in `mutate()` ensures that it shows up before the other columns). 279 | 280 | ```{r} 281 | #| label: example-model-7 282 | #| echo: [2] 283 | #| warning: false 284 | pushd(plan_6_dir) 285 | tar_read(species_summary) 286 | popd() 287 | ``` 288 | 289 | Next we will add one more target, a prediction of bill depth based on each model. These will be needed for plotting the models in the report. 
290 | Such a prediction can be obtained with the `augment()` function of the `broom` package, so we will create a custom function that outputs predicted points as a dataframe, much like we did for the model summaries. 291 | 292 | 293 | ::::::::::::::::::::::::::::::::::::: {.challenge} 294 | 295 | ## Challenge: Add model predictions to the workflow 296 | 297 | Can you add the model predictions using `augment()`? You will need to define a custom function just like we did for `glance()`. 298 | 299 | :::::::::::::::::::::::::::::::::: {.solution} 300 | 301 | Define the new function as `model_augment()`. It is the same as `model_glance()`, but uses `augment()` instead of `glance()`: 302 | 303 | ```{r} 304 | #| label: example-model-augment-func 305 | #| eval: FALSE 306 | #| file: files/tar_functions/model_augment.R 307 | ``` 308 | 309 | Add the step to the workflow: 310 | 311 | ```{r} 312 | #| label = "example-model-augment-show", 313 | #| code = readLines("files/plans/plan_7.R")[2:36], 314 | #| eval = FALSE 315 | ``` 316 | 317 | :::::::::::::::::::::::::::::::::: 318 | 319 | ::::::::::::::::::::::::::::::::::::: 320 | 321 | ### Further simplify the workflow 322 | 323 | You may have noticed that we can further simplify the workflow: there is no need to have separate `penguins_data` and `penguins_data_grouped` dataframes. 324 | In general it is best to keep the number of named objects as small as possible to make it easier to reason about your code. 
325 | Let's combine the cleaning and grouping steps into a single command: 326 | 327 | ```{r} 328 | #| label = "example-model-show-8", 329 | #| eval = FALSE, 330 | #| code = readLines("files/plans/plan_8.R")[2:34] 331 | ``` 332 | 333 | And run it once more: 334 | 335 | ```{r} 336 | #| label: example-model-hide-8 337 | #| echo: false 338 | pushd(plan_6_dir) 339 | # simulate already running the plan once 340 | write_example_plan("plan_7.R") 341 | tar_make(reporter = "silent") 342 | # run the version of the plan that combines the cleaning and grouping 343 | # steps into a single target 344 | write_example_plan("plan_8.R") 345 | tar_make() 346 | popd() 347 | ``` 348 | 349 | ::::::::::::::::::::::::::::::::::::: {.callout} 350 | 351 | ## Best practices for branching 352 | 353 | Dynamic branching is designed to work well with **dataframes** (it can also use [lists](https://books.ropensci.org/targets/dynamic.html#list-iteration), but that is more advanced, so we recommend using dataframes when possible). 354 | 355 | It is recommended to write your custom functions to accept dataframes as input and return them as output, and always include any necessary metadata as a column or columns. 356 | 357 | ::::::::::::::::::::::::::::::::::::: 358 | 359 | ::::::::::::::::::::::::::::::::::::: {.challenge} 360 | 361 | ## Challenge: What other kinds of patterns are there? 362 | 363 | So far, we have only used a single function in conjunction with the `pattern` argument, `map()`, which applies the function to each element of its input in sequence. 364 | 365 | Can you think of any other ways you might want to apply a branching pattern? 
366 | 367 | :::::::::::::::::::::::::::::::::: {.solution} 368 | 369 | Some other ways of applying branching patterns include: 370 | 371 | - crossing: one branch per combination of elements (`cross()` function) 372 | - slicing: one branch for each of a manually selected set of elements (`slice()` function) 373 | - sampling: one branch for each of a randomly selected set of elements (`sample()` function) 374 | 375 | You can [find out more about different branching patterns in the `targets` manual](https://books.ropensci.org/targets/dynamic.html#patterns). 376 | 377 | :::::::::::::::::::::::::::::::::: 378 | 379 | ::::::::::::::::::::::::::::::::::::: 380 | 381 | ::::::::::::::::::::::::::::::::::::: keypoints 382 | 383 | - Dynamic branching creates multiple targets with a single command 384 | - You usually need to write custom functions so that the output of the branches includes necessary metadata 385 | 386 | :::::::::::::::::::::::::::::::::::::::::::::::: 387 | -------------------------------------------------------------------------------- /episodes/cache.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: 'Loading Workflow Objects' 3 | teaching: 10 4 | exercises: 2 5 | --- 6 | 7 | ```{r} 8 | #| label: setup 9 | #| echo: FALSE 10 | #| message: FALSE 11 | #| warning: FALSE 12 | library(targets) 13 | source("files/lesson_functions.R") 14 | ``` 15 | 16 | :::::::::::::::::::::::::::::::::::::: questions 17 | 18 | - Where does the workflow happen? 19 | - How can we inspect the objects built by the workflow? 
20 | 21 | :::::::::::::::::::::::::::::::::::::::::::::::: 22 | 23 | ::::::::::::::::::::::::::::::::::::: objectives 24 | 25 | - Explain where `targets` runs the workflow and why 26 | - Be able to load objects built by the workflow into your R session 27 | 28 | :::::::::::::::::::::::::::::::::::::::::::::::: 29 | 30 | ::::::::::::::::::::::::::::::::::::: instructor 31 | 32 | Episode summary: Show how to get at the objects that we built 33 | 34 | ::::::::::::::::::::::::::::::::::::: 35 | 36 | ## Where does the workflow happen? 37 | 38 | So we just finished running our workflow. 39 | Now you probably want to look at its output. 40 | But, if we just call the name of the object (for example, `penguins_data`), we get an error. 41 | ```{r} 42 | #| label: error 43 | penguins_data 44 | ``` 45 | 46 | Where are the results of our workflow? 47 | 48 | ::::::::::::::::::::::::::::::::::::: instructor 49 | 50 | - To reinforce the concept of `targets` running in a separate R session, you may want to pretend trying to run `penguins_data`, then feigning surprise when it doesn't work and using it as a teaching moment (errors are pedagogy!). 51 | 52 | :::::::::::::::::::::::::::::::::::::::::::::::: 53 | 54 | We don't see the workflow results because `targets` **runs the workflow in a separate R session** that we can't interact with. 55 | This is for reproducibility---the objects built by the workflow should only depend on the code in your project, not any commands you may have interactively given to R. 56 | 57 | Fortunately, `targets` has two functions that can be used to load objects built by the workflow into our current session, `tar_load()` and `tar_read()`. 58 | Let's see how these work. 59 | 60 | ## tar_load() 61 | 62 | `tar_load()` loads an object built by the workflow into the current session. 63 | Its first argument is the name of the object you want to load. 64 | Let's use this to load `penguins_data` and get an overview of the data with `summary()`. 
65 | 66 | ```{r} 67 | #| label: targets-run-hide 68 | #| echo: FALSE 69 | # When building the Rmd, each instance of the workflow is isolated, so need 70 | # to re-run 71 | plan_1_dir <- make_tempdir() 72 | pushd(plan_1_dir) 73 | write_example_plan("plan_1.R") 74 | tar_make(reporter = "silent") 75 | penguins_csv_file_hide <- tar_read(penguins_csv_file) 76 | penguins_data_hide <- tar_read(penguins_data) 77 | popd() 78 | ``` 79 | 80 | ```{r} 81 | #| label: targets-load 82 | #| echo: [2, 3] 83 | pushd(plan_1_dir) 84 | tar_load(penguins_data) 85 | summary(penguins_data) 86 | popd() 87 | ``` 88 | 89 | Note that `tar_load()` is used for its **side-effect**---loading the desired object into the current R session. 90 | It doesn't actually return a value. 91 | 92 | ## tar_read() 93 | 94 | `tar_read()` is similar to `tar_load()` in that it is used to retrieve objects built by the workflow, but unlike `tar_load()`, it returns them directly as output. 95 | 96 | Let's try it with `penguins_csv_file`. 97 | 98 | ```{r} 99 | #| label: targets-read-show 100 | #| echo: [2] 101 | pushd(plan_1_dir) 102 | tar_read(penguins_csv_file) 103 | popd() 104 | ``` 105 | 106 | We immediately see the contents of `penguins_csv_file`. 107 | But it has not been loaded into the environment. 108 | If you try to run `penguins_csv_file` now, you will get an error: 109 | 110 | ```{r} 111 | #| label: error-2 112 | penguins_csv_file 113 | ``` 114 | 115 | ## When to use which function 116 | 117 | `tar_load()` tends to be more useful when you want to load objects and do things with them. 118 | `tar_read()` is more useful when you just want to immediately inspect an object. 119 | 120 | ## The targets cache 121 | 122 | If you close your R session, then re-start it and use `tar_load()` or `tar_read()`, you will notice that it can still load the workflow objects. 123 | In other words, the workflow output is **saved across R sessions**. 124 | How is this possible? 
125 | 126 | You may have noticed a new folder has appeared in your project, called `_targets`. 127 | This is the **targets cache**. 128 | It contains all of the workflow output; that is how we can load the targets built by the workflow even after quitting then restarting R. 129 | 130 | **You should not edit the contents of the cache by hand** (with one exception). 131 | Doing so would make your analysis non-reproducible. 132 | 133 | The one exception to this rule is a special subfolder called `_targets/user`. 134 | This folder does not exist by default. 135 | You can create it if you want, and put whatever you want inside. 136 | 137 | Generally, `_targets/user` is a good place to store files that are not code, like data and output. 138 | 139 | Note that if you don't have anything in `_targets/user` that you need to keep around, it is possible to "reset" your workflow by simply deleting the entire `_targets` folder. Of course, this means you will need to run everything over again, so don't do this lightly! 
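Rather than deleting the folder by hand, you can also use the `tar_destroy()` function from `targets`, which by default removes the entire `_targets` folder (see `?tar_destroy` for more selective options). A quick sketch:

```r
library(targets)

# Remove the entire _targets cache; prompts for confirmation
# when run interactively
tar_destroy()

# Be more selective, e.g. delete only the stored target objects:
tar_destroy(destroy = "objects")
```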
140 | 141 | ::::::::::::::::::::::::::::::::::::: keypoints 142 | 143 | - `targets` workflows are run in a separate, non-interactive R session 144 | - `tar_load()` loads a workflow object into the current R session 145 | - `tar_read()` reads a workflow object and returns its value 146 | - The `_targets` folder is the cache and generally should not be edited by hand 147 | 148 | :::::::::::::::::::::::::::::::::::::::::::::::: 149 | -------------------------------------------------------------------------------- /episodes/fig/03-qmd-workflow.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/carpentries-incubator/targets-workshop/10b9a63c208b2802fe0855c4a2d2ad9d6104276f/episodes/fig/03-qmd-workflow.png -------------------------------------------------------------------------------- /episodes/fig/basic-rstudio-project.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/carpentries-incubator/targets-workshop/10b9a63c208b2802fe0855c4a2d2ad9d6104276f/episodes/fig/basic-rstudio-project.png -------------------------------------------------------------------------------- /episodes/fig/basic-rstudio-wizard.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/carpentries-incubator/targets-workshop/10b9a63c208b2802fe0855c4a2d2ad9d6104276f/episodes/fig/basic-rstudio-wizard.png -------------------------------------------------------------------------------- /episodes/fig/lifecycle-visnetwork.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/carpentries-incubator/targets-workshop/10b9a63c208b2802fe0855c4a2d2ad9d6104276f/episodes/fig/lifecycle-visnetwork.png -------------------------------------------------------------------------------- /episodes/files.Rmd: 
-------------------------------------------------------------------------------- 1 | --- 2 | title: 'Working with External Files' 3 | teaching: 10 4 | exercises: 2 5 | --- 6 | 7 | :::::::::::::::::::::::::::::::::::::: questions 8 | 9 | - How can we load external data? 10 | 11 | :::::::::::::::::::::::::::::::::::::::::::::::: 12 | 13 | ::::::::::::::::::::::::::::::::::::: objectives 14 | 15 | - Be able to load external data into a workflow 16 | - Configure the workflow to rerun if the contents of the external data change 17 | 18 | :::::::::::::::::::::::::::::::::::::::::::::::: 19 | 20 | ::::::::::::::::::::::::::::::::::::: instructor 21 | 22 | Episode summary: Show how to read and write external files 23 | 24 | ::::::::::::::::::::::::::::::::::::: 25 | 26 | ```{r} 27 | #| label: setup 28 | #| echo: FALSE 29 | #| message: FALSE 30 | #| warning: FALSE 31 | library(targets) 32 | library(tarchetypes) 33 | source("files/lesson_functions.R") 34 | ``` 35 | 36 | ## Treating external files as a dependency 37 | 38 | Almost all workflows will start by importing data, which is typically stored as an external file. 39 | 40 | As a simple example, let's create an external data file in RStudio with the "New File" menu option. Enter a single line of text, "Hello World", and save it as a text file called "hello.txt" in `_targets/user/data/`. 41 | 42 | We will read in the contents of this file and store it as `some_data` in the workflow by writing the following plan and running `tar_make()`: 43 | 44 | ::::::::::::::::::::::::::::::::::::: {.callout} 45 | 46 | ## Save your progress 47 | 48 | You can only have one active `_targets.R` file at a time in a given project. 49 | 50 | We are about to create a new `_targets.R` file, but you probably don't want to lose your progress in the one we have been working on so far (the penguins bill analysis). You can temporarily rename that one to something like `_targets_old.R` so that you don't overwrite it with the new example `_targets.R` file below. 
Then, rename it back when you are ready to work on it again. 51 | 52 | ::::::::::::::::::::::::::::::::::::: 53 | 54 | ```{r} 55 | #| label: example-file-show-1 56 | #| eval: FALSE 57 | library(targets) 58 | library(tarchetypes) 59 | 60 | tar_plan( 61 | some_data = readLines("_targets/user/data/hello.txt") 62 | ) 63 | ``` 64 | 65 | ```{r} 66 | #| label: example-file-hide-1 67 | #| echo: FALSE 68 | tar_dir({ 69 | fs::dir_create("_targets/user/data") 70 | writeLines("Hello World", "_targets/user/data/hello.txt") 71 | write_example_plan(chunk = "example-file-show-1") 72 | tar_make() 73 | }) 74 | ``` 75 | 76 | If we inspect the contents of `some_data` with `tar_read(some_data)`, it will contain the string `"Hello World"` as expected. 77 | 78 | Now say we edit "hello.txt", perhaps add some text: "Hello World. How are you?". Edit this in the RStudio text editor and save it. Now run the pipeline again. 79 | 80 | ```{r} 81 | #| label = "example-file-show-2", 82 | #| eval = FALSE, 83 | #| code = knitr::knit_code$get("example-file-show-1") 84 | ``` 85 | 86 | ```{r} 87 | #| label: example-file-hide-2 88 | #| echo: FALSE 89 | tar_dir({ 90 | fs::dir_create("_targets/user/data") 91 | writeLines("Hello World", "_targets/user/data/hello.txt") 92 | write_example_plan(chunk = "example-file-show-1") 93 | tar_make(reporter = "silent") 94 | writeLines("Hello World. How are you?", "_targets/user/data/hello.txt") 95 | tar_make() 96 | }) 97 | ``` 98 | 99 | The target `some_data` was skipped, even though the contents of the file changed. 100 | 101 | That is because right now, `targets` is only tracking the **name** of the file, not its contents. We need to use a special function for that, `tar_file()` from the `tarchetypes` package. `tar_file()` will calculate the "hash" of a file---a unique digital signature that is determined by the file's contents. If the contents change, the hash will change, and this will be detected by `targets`. 
102 | 103 | ```{r} 104 | #| label: example-file-show-3 105 | #| eval: FALSE 106 | library(targets) 107 | library(tarchetypes) 108 | 109 | tar_plan( 110 | tar_file(data_file, "_targets/user/data/hello.txt"), 111 | some_data = readLines(data_file) 112 | ) 113 | ``` 114 | 115 | ```{r} 116 | #| label: example-file-hide-3 117 | #| echo: FALSE 118 | tar_dir({ 119 | fs::dir_create("_targets/user/data") 120 | writeLines("Hello World", "_targets/user/data/hello.txt") 121 | write_example_plan(chunk = "example-file-show-3") 122 | tar_make(reporter = "silent") 123 | writeLines("Hello World. How are you?", "_targets/user/data/hello.txt") 124 | tar_make() 125 | }) 126 | ``` 127 | 128 | This time we see that `targets` does successfully re-build `some_data` as expected. 129 | 130 | ## A shortcut (or, About target factories) 131 | 132 | However, also notice that this means we need to write two targets instead of one: one target to track the contents of the file (`data_file`), and one target to store what we load from the file (`some_data`). 133 | 134 | It turns out that this is a common pattern in `targets` workflows, so `tarchetypes` provides a shortcut to express this more concisely, `tar_file_read()`. 135 | 136 | ```{r} 137 | #| label: example-file-show-4 138 | #| eval: FALSE 139 | library(targets) 140 | library(tarchetypes) 141 | 142 | tar_plan( 143 | tar_file_read( 144 | hello, 145 | "_targets/user/data/hello.txt", 146 | readLines(!!.x) 147 | ) 148 | ) 149 | ``` 150 | 151 | Let's inspect this pipeline with `tar_manifest()`: 152 | 153 | ```{r} 154 | #| label: example-file-show-5 155 | #| eval: FALSE 156 | tar_manifest() 157 | ``` 158 | 159 | ```{r} 160 | #| label: example-file-hide-5 161 | #| echo: FALSE 162 | tar_dir({ 163 | # Emulate what the learner is doing 164 | fs::dir_create("_targets/user/data") 165 | # Old (longer) version: 166 | writeLines("Hello World. 
How are you?", "_targets/user/data/hello.txt") 167 | # Make it again with the shorter version 168 | write_example_plan(chunk = "example-file-show-4") 169 | tar_manifest() 170 | }) 171 | ``` 172 | 173 | Notice that even though we only specified one target in the pipeline (`hello`, with `tar_file_read()`), the pipeline actually includes **two** targets, `hello_file` and `hello`. 174 | 175 | That is because `tar_file_read()` is a special function called a **target factory**, so-called because it makes **multiple** targets at once. One of the main purposes of the `tarchetypes` package is to provide target factories to make writing pipelines easier and less error-prone. 176 | 177 | ## Non-standard evaluation 178 | 179 | What is the deal with the `!!.x`? That may look unfamiliar even if you are used to using R. It is known as "non-standard evaluation," and gets used in some special contexts. We don't have time to go into the details now, but just remember that you will need to use this special notation with `tar_file_read()`. If you forget how to write it (this happens frequently!) look at the examples in the help file by running `?tar_file_read`. 180 | 181 | ## Other data loading functions 182 | 183 | Although we used `readLines()` as an example here, you can use the same pattern for other functions that load data from external files, such as `readr::read_csv()`, `readxl::read_excel()`, and others (for example, `read_csv(!!.x)`, `read_excel(!!.x)`, etc.). 184 | 185 | This is generally recommended so that your pipeline stays up to date with your input data. 186 | 187 | ::::::::::::::::::::::::::::::::::::: {.challenge} 188 | 189 | ## Challenge: Use `tar_file_read()` with the penguins example 190 | 191 | We didn't know about `tar_file_read()` yet when we started on the penguins bill analysis. 192 | 193 | How can you use `tar_file_read()` to load the CSV file while tracking its contents? 
194 | 195 | :::::::::::::::::::::::::::::::::: {.solution} 196 | 197 | ```{r} 198 | #| label = "tar-file-read-answer-show", 199 | #| eval = FALSE, 200 | #| code = readLines("files/plans/plan_3.R")[2:12] 201 | ``` 202 | 203 | ```{r} 204 | #| label: tar-file-read-answer-hide 205 | #| echo: FALSE 206 | tar_dir({ 207 | # New workflow 208 | write_example_plan("plan_3.R") 209 | # Run it 210 | tar_make() 211 | }) 212 | ``` 213 | 214 | :::::::::::::::::::::::::::::::::: 215 | 216 | ::::::::::::::::::::::::::::::::::::: 217 | 218 | ## Writing out data 219 | 220 | Writing to files is similar to loading in files: we will use the `tar_file()` function. There is one important caveat: in this case, the second argument of `tar_file()` (the command to build the target) **must return the path to the file**. Not all functions that write files do this (some return nothing; these treat the output file as a side-effect of running the function), so you may need to define a custom function that writes out the file and then returns its path. 221 | 222 | Let's do this for `writeLines()`, the R function that writes character data to a file. 
Normally, its output would be `NULL` (nothing), as we can see here: 223 | 224 | ```{r} 225 | #| label: write-data-show-1 226 | #| eval: false 227 | x <- writeLines("some text", "test.txt") 228 | x 229 | ``` 230 | 231 | ```{r} 232 | #| label: write-data-hide-1 233 | #| echo: false 234 | x <- writeLines("some text", "test.txt") 235 | x 236 | fs::file_delete("test.txt") 237 | ``` 238 | 239 | Here is our modified function that writes character data to a file and returns the name of the file (the `...` means "pass the rest of these arguments to `writeLines()`"): 240 | 241 | ```{r} 242 | #| label: write-data-func 243 | #| file: files/tar_functions/write_lines_file.R 244 | ``` 245 | 246 | Let's try it out: 247 | 248 | ```{r} 249 | #| label: write-data-show-2 250 | #| eval: false 251 | x <- write_lines_file("some text", "test.txt") 252 | x 253 | ``` 254 | 255 | ```{r} 256 | #| label: write-data-hide-2 257 | #| echo: false 258 | x <- write_lines_file("some text", "test.txt") 259 | x 260 | fs::file_delete("test.txt") 261 | ``` 262 | 263 | We can now use this in a pipeline. For example let's change the text to upper case then write it out again: 264 | 265 | ```{r} 266 | #| label: example-file-show-6 267 | #| eval: false 268 | library(targets) 269 | library(tarchetypes) 270 | 271 | source("R/functions.R") 272 | 273 | tar_plan( 274 | tar_file_read( 275 | hello, 276 | "_targets/user/data/hello.txt", 277 | readLines(!!.x) 278 | ), 279 | hello_caps = toupper(hello), 280 | tar_file( 281 | hello_caps_out, 282 | write_lines_file(hello_caps, "_targets/user/results/hello_caps.txt") 283 | ) 284 | ) 285 | ``` 286 | 287 | ```{r} 288 | #| label: example-file-hide-6 289 | #| echo: false 290 | tar_dir({ 291 | fs::dir_create("_targets/user/data") 292 | fs::dir_create("_targets/user/results") 293 | writeLines("Hello World. 
How are you?", "_targets/user/data/hello.txt") 294 | write_example_plan(chunk = "example-file-show-6") 295 | tar_make() 296 | }) 297 | ``` 298 | 299 | Take a look at `hello_caps.txt` in the `results` folder and verify it is as you expect. 300 | 301 | ::::::::::::::::::::::::::::::::::::: {.challenge} 302 | 303 | ## Challenge: What happens to file output if it's modified? 304 | 305 | Delete or change the contents of `hello_caps.txt` in the `results` folder. 306 | What do you think will happen when you run `tar_make()` again? 307 | Try it and see. 308 | 309 | :::::::::::::::::::::::::::::::::: {.solution} 310 | 311 | `targets` detects that `hello_caps_out` has changed (is "invalidated"), and re-runs the code to make it, thus writing out `hello_caps.txt` to `results` again. 312 | 313 | So this way of writing out results makes your pipeline more robust: we have a guarantee that the contents of the file in `results` are generated solely by the code in your plan. 314 | 315 | :::::::::::::::::::::::::::::::::: 316 | 317 | ::::::::::::::::::::::::::::::::::::: 318 | 319 | ::::::::::::::::::::::::::::::::::::: keypoints 320 | 321 | - `tarchetypes::tar_file()` tracks the contents of a file 322 | - Use `tarchetypes::tar_file_read()` in combination with data loading functions like `read_csv()` to keep the pipeline in sync with your input data 323 | - Use `tarchetypes::tar_file()` in combination with a function that writes to a file and returns its path to write out data 324 | 325 | :::::::::::::::::::::::::::::::::::::::::::::::: 326 | -------------------------------------------------------------------------------- /episodes/files/lesson_functions.R: -------------------------------------------------------------------------------- 1 | # Functions used in the lesson `.Rmd` files, but that learners 2 | # aren't exposed to, and aren't used inside the `targets` pipelines 3 | 4 | make_tempdir <- function() { 5 | x <- tempfile() 6 | dir.create(x, showWarnings = FALSE) 7 | x 8 | } 9 | 10
| files_root <- normalizePath("files") 11 | plan_root <- file.path(files_root, "plans") 12 | utility_funcs <- file.path(files_root, "tar_functions") |> 13 | list.files(full.names = TRUE, pattern = "\\.R$") |> 14 | lapply(readLines) |> 15 | unlist() 16 | package_script <- file.path(files_root, "packages.R") 17 | 18 | #' @param file The path to another file to use as a workflow 19 | #' @param chunk The chunk name to use as a targets workflow 20 | write_example_plan <- function(file = NULL, chunk = NULL) { 21 | # Write the utility functions into the R/ directory 22 | 23 | if (!dir.exists("R")) { 24 | dir.create("R") 25 | 26 | # Write the functions.R script 27 | file.path("R", "functions.R") |> 28 | writeLines(utility_funcs, con = _) 29 | 30 | # Copy the packages.R script 31 | file.path("R", "packages.R") |> 32 | file.copy(from = package_script, to = _) 33 | } 34 | 35 | # Write the workflow 36 | if (!is.null(file)) { 37 | file.path(plan_root, file) |> 38 | file.copy(from = _, to = "_targets.R", overwrite = TRUE) 39 | } 40 | if (!is.null(chunk)) { 41 | writeLines(text = knitr::knit_code$get(chunk), con = "_targets.R") 42 | } 43 | 44 | invisible() 45 | } 46 | 47 | directory_stack <- getwd() 48 | 49 | pushd <- function(dir) { 50 | directory_stack <<- c(dir, directory_stack) 51 | setwd(directory_stack[1]) 52 | invisible() 53 | } 54 | 55 | popd <- function() { 56 | directory_stack <<- directory_stack[-1] 57 | setwd(directory_stack[1]) 58 | invisible() 59 | } 60 | -------------------------------------------------------------------------------- /episodes/files/packages.R: -------------------------------------------------------------------------------- 1 | library(targets) 2 | library(tarchetypes) 3 | library(palmerpenguins) 4 | library(tidyverse) 5 | library(broom) 6 | library(htmlwidgets) 7 | -------------------------------------------------------------------------------- /episodes/files/plans/README.md: 
-------------------------------------------------------------------------------- 1 | Plans that are re-used between multiple episodes are placed here -------------------------------------------------------------------------------- /episodes/files/plans/plan_0.R: -------------------------------------------------------------------------------- 1 | options(tidyverse.quiet = TRUE) 2 | library(targets) 3 | library(tidyverse) 4 | library(palmerpenguins) 5 | 6 | list( 7 | tar_target(penguins_csv_file, path_to_file("penguins_raw.csv")), 8 | tar_target( 9 | penguins_data_raw, 10 | read_csv(penguins_csv_file, show_col_types = FALSE) 11 | ), 12 | tar_target( 13 | penguins_data, 14 | penguins_data_raw |> 15 | select( 16 | species = Species, 17 | bill_length_mm = `Culmen Length (mm)`, 18 | bill_depth_mm = `Culmen Depth (mm)` 19 | ) |> 20 | drop_na() 21 | ) 22 | ) 23 | -------------------------------------------------------------------------------- /episodes/files/plans/plan_1.R: -------------------------------------------------------------------------------- 1 | options(tidyverse.quiet = TRUE) 2 | library(targets) 3 | library(tidyverse) 4 | library(palmerpenguins) 5 | 6 | clean_penguin_data <- function(penguins_data_raw) { 7 | penguins_data_raw |> 8 | select( 9 | species = Species, 10 | bill_length_mm = `Culmen Length (mm)`, 11 | bill_depth_mm = `Culmen Depth (mm)` 12 | ) |> 13 | drop_na() 14 | } 15 | 16 | list( 17 | tar_target(penguins_csv_file, path_to_file("penguins_raw.csv")), 18 | tar_target(penguins_data_raw, read_csv( 19 | penguins_csv_file, show_col_types = FALSE)), 20 | tar_target(penguins_data, clean_penguin_data(penguins_data_raw)) 21 | ) 22 | -------------------------------------------------------------------------------- /episodes/files/plans/plan_10.R: -------------------------------------------------------------------------------- 1 | options(tidyverse.quiet = TRUE) 2 | suppressPackageStartupMessages(library(crew)) 3 | source("R/functions.R") 4 | 
source("R/packages.R") 5 | 6 | # Set up parallelization 7 | library(crew) 8 | tar_option_set( 9 | controller = crew_controller_local(workers = 2) 10 | ) 11 | 12 | tar_plan( 13 | # Load raw data 14 | tar_file_read( 15 | penguins_data_raw, 16 | path_to_file("penguins_raw.csv"), 17 | read_csv(!!.x, show_col_types = FALSE) 18 | ), 19 | # Clean and group data 20 | tar_group_by( 21 | penguins_data, 22 | clean_penguin_data(penguins_data_raw), 23 | species 24 | ), 25 | # Get summary of combined model with all species together 26 | combined_summary = model_glance_slow(penguins_data), 27 | # Get summary of one model per species 28 | tar_target( 29 | species_summary, 30 | model_glance_slow(penguins_data), 31 | pattern = map(penguins_data) 32 | ), 33 | # Get predictions of combined model with all species together 34 | combined_predictions = model_augment_slow(penguins_data), 35 | # Get predictions of one model per species 36 | tar_target( 37 | species_predictions, 38 | model_augment_slow(penguins_data), 39 | pattern = map(penguins_data) 40 | ) 41 | ) 42 | -------------------------------------------------------------------------------- /episodes/files/plans/plan_11.R: -------------------------------------------------------------------------------- 1 | options(tidyverse.quiet = TRUE) 2 | source("R/functions.R") 3 | source("R/packages.R") 4 | 5 | tar_plan( 6 | # Load raw data 7 | tar_file_read( 8 | penguins_data_raw, 9 | path_to_file("penguins_raw.csv"), 10 | read_csv(!!.x, show_col_types = FALSE) 11 | ), 12 | # Clean and group data 13 | tar_group_by( 14 | penguins_data, 15 | clean_penguin_data(penguins_data_raw), 16 | species 17 | ), 18 | # Get summary of combined model with all species together 19 | combined_summary = model_glance(penguins_data), 20 | # Get summary of one model per species 21 | tar_target( 22 | species_summary, 23 | model_glance(penguins_data), 24 | pattern = map(penguins_data) 25 | ), 26 | # Get predictions of combined model with all species together 27 | 
combined_predictions = model_augment(penguins_data), 28 | # Get predictions of one model per species 29 | tar_target( 30 | species_predictions, 31 | model_augment(penguins_data), 32 | pattern = map(penguins_data) 33 | ), 34 | # Generate report 35 | tar_quarto( 36 | penguin_report, 37 | path = "penguin_report.qmd", 38 | quiet = FALSE 39 | ) 40 | ) 41 | -------------------------------------------------------------------------------- /episodes/files/plans/plan_2.R: -------------------------------------------------------------------------------- 1 | options(tidyverse.quiet = TRUE) 2 | source("R/packages.R") 3 | source("R/functions.R") 4 | 5 | list( 6 | tar_target(penguins_csv_file, path_to_file('penguins_raw.csv')), 7 | tar_target(penguins_data_raw, read_csv( 8 | penguins_csv_file, show_col_types = FALSE)), 9 | tar_target(penguins_data, clean_penguin_data(penguins_data_raw)) 10 | ) 11 | -------------------------------------------------------------------------------- /episodes/files/plans/plan_2b.R: -------------------------------------------------------------------------------- 1 | options(tidyverse.quiet = TRUE) 2 | source("R/packages.R") 3 | source("R/functions.R") 4 | 5 | tar_plan( 6 | penguins_csv_file = path_to_file("penguins_raw.csv"), 7 | penguins_data_raw = read_csv(penguins_csv_file, show_col_types = FALSE), 8 | penguins_data = clean_penguin_data(penguins_data_raw) 9 | ) 10 | -------------------------------------------------------------------------------- /episodes/files/plans/plan_3.R: -------------------------------------------------------------------------------- 1 | options(tidyverse.quiet = TRUE) 2 | source("R/packages.R") 3 | source("R/functions.R") 4 | 5 | tar_plan( 6 | tar_file_read( 7 | penguins_data_raw, 8 | path_to_file("penguins_raw.csv"), 9 | read_csv(!!.x, show_col_types = FALSE) 10 | ), 11 | penguins_data = clean_penguin_data(penguins_data_raw) 12 | ) 13 | -------------------------------------------------------------------------------- 
/episodes/files/plans/plan_4.R: -------------------------------------------------------------------------------- 1 | options(tidyverse.quiet = TRUE) 2 | source("R/packages.R") 3 | source("R/functions.R") 4 | 5 | tar_plan( 6 | # Load raw data 7 | tar_file_read( 8 | penguins_data_raw, 9 | path_to_file("penguins_raw.csv"), 10 | read_csv(!!.x, show_col_types = FALSE) 11 | ), 12 | # Clean data 13 | penguins_data = clean_penguin_data(penguins_data_raw), 14 | # Build model 15 | combined_model = lm( 16 | bill_depth_mm ~ bill_length_mm, 17 | data = penguins_data 18 | ) 19 | ) 20 | -------------------------------------------------------------------------------- /episodes/files/plans/plan_5.R: -------------------------------------------------------------------------------- 1 | options(tidyverse.quiet = TRUE) 2 | source("R/packages.R") 3 | source("R/functions.R") 4 | 5 | tar_plan( 6 | # Load raw data 7 | tar_file_read( 8 | penguins_data_raw, 9 | path_to_file("penguins_raw.csv"), 10 | read_csv(!!.x, show_col_types = FALSE) 11 | ), 12 | # Clean data 13 | penguins_data = clean_penguin_data(penguins_data_raw), 14 | # Build models 15 | combined_model = lm( 16 | bill_depth_mm ~ bill_length_mm, 17 | data = penguins_data 18 | ), 19 | adelie_model = lm( 20 | bill_depth_mm ~ bill_length_mm, 21 | data = filter(penguins_data, species == "Adelie") 22 | ), 23 | chinstrap_model = lm( 24 | bill_depth_mm ~ bill_length_mm, 25 | data = filter(penguins_data, species == "Chinstrap") 26 | ), 27 | gentoo_model = lm( 28 | bill_depth_mm ~ bill_length_mm, 29 | data = filter(penguins_data, species == "Gentoo") 30 | ), 31 | # Get model summaries 32 | combined_summary = glance(combined_model), 33 | adelie_summary = glance(adelie_model), 34 | chinstrap_summary = glance(chinstrap_model), 35 | gentoo_summary = glance(gentoo_model) 36 | ) 37 | -------------------------------------------------------------------------------- /episodes/files/plans/plan_6.R: 
-------------------------------------------------------------------------------- 1 | options(tidyverse.quiet = TRUE) 2 | source("R/packages.R") 3 | source("R/functions.R") 4 | 5 | tar_plan( 6 | # Load raw data 7 | tar_file_read( 8 | penguins_data_raw, 9 | path_to_file("penguins_raw.csv"), 10 | read_csv(!!.x, show_col_types = FALSE) 11 | ), 12 | # Clean data 13 | penguins_data = clean_penguin_data(penguins_data_raw), 14 | # Group data 15 | tar_group_by( 16 | penguins_data_grouped, 17 | penguins_data, 18 | species 19 | ), 20 | # Build combined model with all species together 21 | combined_summary = model_glance(penguins_data), 22 | # Build one model per species 23 | tar_target( 24 | species_summary, 25 | model_glance(penguins_data_grouped), 26 | pattern = map(penguins_data_grouped) 27 | ) 28 | ) 29 | -------------------------------------------------------------------------------- /episodes/files/plans/plan_6b.R: -------------------------------------------------------------------------------- 1 | options(tidyverse.quiet = TRUE) 2 | source("R/packages.R") 3 | source("R/functions.R") 4 | 5 | tar_plan( 6 | # Load raw data 7 | tar_file_read( 8 | penguins_data_raw, 9 | path_to_file("penguins_raw.csv"), 10 | read_csv(!!.x, show_col_types = FALSE) 11 | ), 12 | # Clean data 13 | penguins_data = clean_penguin_data(penguins_data_raw), 14 | # Group data 15 | tar_group_by( 16 | penguins_data_grouped, 17 | penguins_data, 18 | species 19 | ), 20 | # Build combined model with all species together 21 | combined_summary = model_glance_orig(penguins_data), 22 | # Build one model per species 23 | tar_target( 24 | species_summary, 25 | model_glance_orig(penguins_data_grouped), 26 | pattern = map(penguins_data_grouped) 27 | ) 28 | ) 29 | -------------------------------------------------------------------------------- /episodes/files/plans/plan_7.R: -------------------------------------------------------------------------------- 1 | options(tidyverse.quiet = TRUE) 2 | 
source("R/functions.R") 3 | source("R/packages.R") 4 | 5 | tar_plan( 6 | # Load raw data 7 | tar_file_read( 8 | penguins_data_raw, 9 | path_to_file("penguins_raw.csv"), 10 | read_csv(!!.x, show_col_types = FALSE) 11 | ), 12 | # Clean data 13 | penguins_data = clean_penguin_data(penguins_data_raw), 14 | # Group data 15 | tar_group_by( 16 | penguins_data_grouped, 17 | penguins_data, 18 | species 19 | ), 20 | # Get summary of combined model with all species together 21 | combined_summary = model_glance(penguins_data), 22 | # Get summary of one model per species 23 | tar_target( 24 | species_summary, 25 | model_glance(penguins_data_grouped), 26 | pattern = map(penguins_data_grouped) 27 | ), 28 | # Get predictions of combined model with all species together 29 | combined_predictions = model_augment(penguins_data_grouped), 30 | # Get predictions of one model per species 31 | tar_target( 32 | species_predictions, 33 | model_augment(penguins_data_grouped), 34 | pattern = map(penguins_data_grouped) 35 | ) 36 | ) 37 | -------------------------------------------------------------------------------- /episodes/files/plans/plan_8.R: -------------------------------------------------------------------------------- 1 | options(tidyverse.quiet = TRUE) 2 | source("R/functions.R") 3 | source("R/packages.R") 4 | 5 | tar_plan( 6 | # Load raw data 7 | tar_file_read( 8 | penguins_data_raw, 9 | path_to_file("penguins_raw.csv"), 10 | read_csv(!!.x, show_col_types = FALSE) 11 | ), 12 | # Clean and group data 13 | tar_group_by( 14 | penguins_data, 15 | clean_penguin_data(penguins_data_raw), 16 | species 17 | ), 18 | # Get summary of combined model with all species together 19 | combined_summary = model_glance(penguins_data), 20 | # Get summary of one model per species 21 | tar_target( 22 | species_summary, 23 | model_glance(penguins_data), 24 | pattern = map(penguins_data) 25 | ), 26 | # Get predictions of combined model with all species together 27 | combined_predictions = 
model_augment(penguins_data), 28 | # Get predictions of one model per species 29 | tar_target( 30 | species_predictions, 31 | model_augment(penguins_data), 32 | pattern = map(penguins_data) 33 | ) 34 | ) 35 | -------------------------------------------------------------------------------- /episodes/files/plans/plan_9.R: -------------------------------------------------------------------------------- 1 | options(tidyverse.quiet = TRUE) 2 | suppressPackageStartupMessages(library(crew)) 3 | source("R/functions.R") 4 | source("R/packages.R") 5 | 6 | # Set up parallelization 7 | library(crew) 8 | tar_option_set( 9 | controller = crew_controller_local(workers = 2) 10 | ) 11 | 12 | tar_plan( 13 | # Load raw data 14 | tar_file_read( 15 | penguins_data_raw, 16 | path_to_file("penguins_raw.csv"), 17 | read_csv(!!.x, show_col_types = FALSE) 18 | ), 19 | # Clean and group data 20 | tar_group_by( 21 | penguins_data, 22 | clean_penguin_data(penguins_data_raw), 23 | species 24 | ), 25 | # Get summary of combined model with all species together 26 | combined_summary = model_glance(penguins_data), 27 | # Get summary of one model per species 28 | tar_target( 29 | species_summary, 30 | model_glance(penguins_data), 31 | pattern = map(penguins_data) 32 | ), 33 | # Get predictions of combined model with all species together 34 | combined_predictions = model_augment(penguins_data), 35 | # Get predictions of one model per species 36 | tar_target( 37 | species_predictions, 38 | model_augment(penguins_data), 39 | pattern = map(penguins_data) 40 | ) 41 | ) 42 | -------------------------------------------------------------------------------- /episodes/files/tar_functions/README.md: -------------------------------------------------------------------------------- 1 | These are functions that are used inside the targets pipelines. 2 | All of them are automatically included in every plan written by `write_example_plan`.
3 | However, they are split into separate files so they can be included as code chunks and thereby shown to the learners. 4 | -------------------------------------------------------------------------------- /episodes/files/tar_functions/augment_with_mod_name.R: -------------------------------------------------------------------------------- 1 | augment_with_mod_name <- function(model_in_list) { 2 | model_name <- names(model_in_list) 3 | model <- model_in_list[[1]] 4 | augment(model) |> 5 | mutate(model_name = model_name) 6 | } 7 | -------------------------------------------------------------------------------- /episodes/files/tar_functions/augment_with_mod_name_slow.R: -------------------------------------------------------------------------------- 1 | augment_with_mod_name_slow <- function(model_in_list) { 2 | Sys.sleep(4) 3 | model_name <- names(model_in_list) 4 | model <- model_in_list[[1]] 5 | broom::augment(model) |> 6 | mutate(model_name = model_name) 7 | } 8 | -------------------------------------------------------------------------------- /episodes/files/tar_functions/clean_penguin_data.R: -------------------------------------------------------------------------------- 1 | clean_penguin_data <- function(penguins_data_raw) { 2 | penguins_data_raw |> 3 | select( 4 | species = Species, 5 | bill_length_mm = `Culmen Length (mm)`, 6 | bill_depth_mm = `Culmen Depth (mm)` 7 | ) |> 8 | drop_na() |> 9 | # Split "species" apart on spaces, and only keep the first word 10 | separate(species, into = "species", extra = "drop") 11 | } 12 | -------------------------------------------------------------------------------- /episodes/files/tar_functions/glance_with_mod_name.R: -------------------------------------------------------------------------------- 1 | glance_with_mod_name <- function(model_in_list) { 2 | model_name <- names(model_in_list) 3 | model <- model_in_list[[1]] 4 | glance(model) |> 5 | mutate(model_name = model_name) 6 | } 7 |
-------------------------------------------------------------------------------- /episodes/files/tar_functions/glance_with_mod_name_slow.R: -------------------------------------------------------------------------------- 1 | glance_with_mod_name_slow <- function(model_in_list) { 2 | Sys.sleep(4) 3 | model_name <- names(model_in_list) 4 | model <- model_in_list[[1]] 5 | broom::glance(model) |> 6 | mutate(model_name = model_name) 7 | } 8 | -------------------------------------------------------------------------------- /episodes/files/tar_functions/model_augment.R: -------------------------------------------------------------------------------- 1 | model_augment <- function(penguins_data) { 2 | # Make model 3 | model <- lm( 4 | bill_depth_mm ~ bill_length_mm, 5 | data = penguins_data) 6 | # Get species name 7 | species_name <- unique(penguins_data$species) 8 | # If this is the combined dataset with multiple 9 | # species, change name to 'combined' 10 | if (length(species_name) > 1) { 11 | species_name <- "combined" 12 | } 13 | # Get model summary and add species name 14 | augment(model) |> 15 | mutate(species = species_name, .before = 1) 16 | } 17 | -------------------------------------------------------------------------------- /episodes/files/tar_functions/model_augment_slow.R: -------------------------------------------------------------------------------- 1 | model_augment_slow <- function(penguins_data) { 2 | Sys.sleep(4) 3 | # Make model 4 | model <- lm( 5 | bill_depth_mm ~ bill_length_mm, 6 | data = penguins_data) 7 | # Get species name 8 | species_name <- unique(penguins_data$species) 9 | # If this is the combined dataset with multiple 10 | # species, change name to 'combined' 11 | if (length(species_name) > 1) { 12 | species_name <- "combined" 13 | } 14 | # Get model summary and add species name 15 | augment(model) |> 16 | mutate(species = species_name, .before = 1) 17 | } 18 |
-------------------------------------------------------------------------------- /episodes/files/tar_functions/model_glance.R: -------------------------------------------------------------------------------- 1 | model_glance <- function(penguins_data) { 2 | # Make model 3 | model <- lm( 4 | bill_depth_mm ~ bill_length_mm, 5 | data = penguins_data) 6 | # Get species name 7 | species_name <- unique(penguins_data$species) 8 | # If this is the combined dataset with multiple 9 | # species, change name to 'combined' 10 | if (length(species_name) > 1) { 11 | species_name <- "combined" 12 | } 13 | # Get model summary and add species name 14 | glance(model) |> 15 | mutate(species = species_name, .before = 1) 16 | } 17 | -------------------------------------------------------------------------------- /episodes/files/tar_functions/model_glance_orig.R: -------------------------------------------------------------------------------- 1 | model_glance_orig <- function(penguins_data) { 2 | model <- lm( 3 | bill_depth_mm ~ bill_length_mm, 4 | data = penguins_data) 5 | broom::glance(model) 6 | } 7 | -------------------------------------------------------------------------------- /episodes/files/tar_functions/model_glance_slow.R: -------------------------------------------------------------------------------- 1 | model_glance_slow <- function(penguins_data) { 2 | Sys.sleep(4) 3 | # Make model 4 | model <- lm( 5 | bill_depth_mm ~ bill_length_mm, 6 | data = penguins_data) 7 | # Get species name 8 | species_name <- unique(penguins_data$species) 9 | # If this is the combined dataset with multiple 10 | # species, change name to 'combined' 11 | if (length(species_name) > 1) { 12 | species_name <- "combined" 13 | } 14 | # Get model summary and add species name 15 | glance(model) |> 16 | mutate(species = species_name, .before = 1) 17 | } 18 | -------------------------------------------------------------------------------- /episodes/files/tar_functions/write_lines_file.R:
-------------------------------------------------------------------------------- 1 | write_lines_file <- function(text, file, ...) { 2 | writeLines(text = text, con = file, ...) 3 | file 4 | } 5 | -------------------------------------------------------------------------------- /episodes/functions.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: 'A Brief Introduction to Functions' 3 | teaching: 30 4 | exercises: 10 5 | --- 6 | 7 | :::::::::::::::::::::::::::::::::::::: questions 8 | 9 | - What are functions? 10 | - Why should we know how to write them? 11 | - What are the main components of a function? 12 | 13 | :::::::::::::::::::::::::::::::::::::::::::::::: 14 | 15 | ::::::::::::::::::::::::::::::::::::: objectives 16 | 17 | - Understand the usefulness of custom functions 18 | - Understand the basic concepts around writing functions 19 | 20 | :::::::::::::::::::::::::::::::::::::::::::::::: 21 | 22 | ::::::::::::::::::::::::::::::::::::: {.instructor} 23 | 24 | Episode summary: A very brief introduction to functions, when you have learners who have no experience with them. 25 | 26 | :::::::::::::::::::::::::::::::::::::::::::::::: 27 | 28 | ```{r} 29 | #| label: setup 30 | #| echo: FALSE 31 | #| message: FALSE 32 | #| warning: FALSE 33 | library(targets) 34 | 35 | if (interactive()) { 36 | setwd("episodes") 37 | } 38 | 39 | source("files/lesson_functions.R") 40 | ``` 41 | 42 | ## About functions 43 | 44 | We are used to thinking of functions in R as things that come from packages. You find, install, and use specialized functions from packages to get your work done. 45 | 46 | But you can, and arguably should, be writing your own functions too! 47 | Functions are a great way of making it easy to repeat the same operation but with different settings. 48 | How many times have you copy-pasted the exact same code in your script, only to change a couple of things (a variable, an input, etc.)
before running it again? 49 | Only to later discover that there was an error in the code, which you then had to remember to fix in all the places where you copied that code. 50 | 51 | By writing functions, you can reduce this back-and-forth and create a more efficient workflow for yourself. 52 | When you find the bug, you fix it in a single place (the function you made), and each subsequent call of that function will now be fixed. 53 | 54 | Furthermore, `targets` makes extensive use of custom functions, so a basic understanding of how they work is very important for using it successfully. 55 | 56 | ### Writing a function 57 | 58 | There is not much difference between writing your own function and writing other code in R; you are still coding in R! 59 | Let's imagine we want to convert the millimeter measurements in the penguins data to centimeters. 60 | 61 | ```{r} 62 | #| label: targets-functions-problem 63 | #| message: FALSE 64 | library(palmerpenguins) 65 | library(tidyverse) 66 | 67 | penguins |> 68 | mutate( 69 | bill_length_cm = bill_length_mm / 10, 70 | bill_depth_cm = bill_depth_mm / 10 71 | ) 72 | 73 | ``` 74 | 75 | This is not a complicated operation, but we might want to make a convenient custom function that can do this conversion for us anyway. 76 | 77 | To write a function, you need to use the `function()` function. 78 | Inside its parentheses, we provide the function's input arguments; in the curly braces `{}` that follow, we write what the function will do with those arguments. 79 | The object name we assign this to will become the function's name.
80 | 81 | ```{r} 82 | #| label: targets-functions-skeleton 83 | #| eval: false 84 | my_function <- function(argument1, argument2) { 85 | # the things the function will do 86 | } 87 | # call the function 88 | my_function(1, "something") 89 | ``` 90 | 91 | For our mm to cm conversion the function would look like so: 92 | 93 | ```{r} 94 | #| label: targets-functions-cm 95 | mm2cm <- function(x) { 96 | x / 10 97 | } 98 | ``` 99 | 100 | Our custom function will now transform any numerical input by dividing it by 10. 101 | 102 | Let's try it out: 103 | 104 | ```{r} 105 | #| label: targets-functions-cm-use 106 | penguins |> 107 | mutate( 108 | bill_length_cm = mm2cm(bill_length_mm), 109 | bill_depth_cm = mm2cm(bill_depth_mm) 110 | ) 111 | ``` 112 | 113 | Congratulations, you've created and used your first custom function! 114 | 115 | ### Make a function from existing code 116 | 117 | Many times, we might already have a piece of code that we'd like to use to create a function. 118 | For instance, we've copy-pasted a section of code several times and realize that this piece of code is repetitive, so a function is in order. 119 | Or, you are converting your workflow to `targets`, and need to change your script into a series of functions that `targets` will call. 120 | 121 | Recall the code snippet we had to clean our penguins data: 122 | 123 | ```{r} 124 | #| label: code-to-convert-to-function 125 | #| eval: false 126 | penguins_data_raw |> 127 | select( 128 | species = Species, 129 | bill_length_mm = `Culmen Length (mm)`, 130 | bill_depth_mm = `Culmen Depth (mm)` 131 | ) |> 132 | drop_na() 133 | ``` 134 | 135 | We need to adapt this code to become a function, and this function needs a single argument, which is the dataset it should clean. 
136 | 137 | It should look like this: 138 | ```{r} 139 | #| label: clean-data-function 140 | clean_penguin_data <- function(penguins_data_raw) { 141 | penguins_data_raw |> 142 | select( 143 | species = Species, 144 | bill_length_mm = `Culmen Length (mm)`, 145 | bill_depth_mm = `Culmen Depth (mm)` 146 | ) |> 147 | drop_na() 148 | } 149 | ``` 150 | 151 | Add this function to `_targets.R` after the part where you load packages with `library()` and before the list at the end. 152 | 153 | ::::::::::::::::: callout 154 | 155 | # RStudio function extraction 156 | 157 | RStudio also has a handy helper to extract a function from a piece of code. 158 | Once you have basic familiarity with functions, it may help you figure out the necessary input when turning code into a function. 159 | 160 | To use it, highlight the piece of code you want to make into a function. 161 | In our case, that is the entire pipeline from `penguins_data_raw` to the `drop_na()` statement. 162 | Once you have done this, in RStudio go to the "Code" section in the top bar, and select "Extract function" from the list. 163 | A prompt will open asking you to hit enter, and the extracted function will appear in your script where the cursor was. 164 | 165 | The extracted function will not work as-is, however, because it treats more things as arguments than it needs. 166 | This is because the tidyverse uses non-standard evaluation, which lets us write unquoted column names inside `select()`. 167 | The function extractor thinks that all unquoted (or back-ticked) text in the code is a reference to an object. 168 | You will need to do some manual cleaning to get the function working, which is why it's more convenient if you already have a little experience with functions. 169 | 170 | :::::::::::::::::: 171 | 172 | ::::::::::::::::::::::::::::::::::::: {.challenge} 173 | 174 | ## Challenge: Write a function that takes a numerical vector and returns its mean divided by 10.
175 | 176 | :::::::::::::::::::::::::::::::::: {.solution} 177 | 178 | ```{r} 179 | #| label: write-function-answer 180 | vecmean <- function(x) { 181 | mean(x) / 10 182 | } 183 | ``` 184 | 185 | :::::::::::::::::::::::::::::::::: 186 | 187 | ::::::::::::::::::::::::::::::::::::: 188 | 189 | ## Using functions in the workflow 190 | 191 | Now that we've defined our custom data cleaning function, we can put it to use in the workflow. 192 | 193 | Can you see how this might be done? 194 | 195 | We need to delete the corresponding code from the last `tar_target()` and replace it with a call to the new function. 196 | 197 | Modify the workflow to look like this: 198 | 199 | ```{r} 200 | #| label = "targets-show-fun-add", 201 | #| eval = FALSE, 202 | #| code = readLines("files/plans/plan_1.R")[2:21] 203 | ``` 204 | 205 | We should run the workflow again with `tar_make()` to make sure it is up-to-date: 206 | 207 | ```{r} 208 | #| label: targets-run-fun 209 | #| eval: true 210 | #| echo: [5] 211 | pushd(make_tempdir()) 212 | write_example_plan("plan_0.R") 213 | tar_make(reporter = "silent") 214 | write_example_plan("plan_1.R") 215 | tar_make() 216 | popd() 217 | ``` 218 | 219 | We will learn more soon about the messages that `targets` prints out. 220 | 221 | ## Functions make it easier to reason about code 222 | 223 | Notice that now the list of targets at the end is starting to look like a high-level summary of your analysis. 224 | 225 | This is another advantage of using custom functions: **functions allow us to separate the details of each workflow step from the overall workflow**. 226 | 227 | To understand the overall workflow, you don't need to know all of the details about how the data were cleaned; you just need to know that there was a cleaning step. 228 | On the other hand, if you do need to go back and delve into the specifics of the data cleaning, you only need to pay attention to what happens inside that function, and you can ignore the rest of the workflow.
229 | **This makes it easier to reason about the code**, and will lead to fewer bugs and ultimately save you time and mental energy. 230 | 231 | Here we have only scratched the surface of functions, and you will likely need to get more help in learning about them. 232 | For more information, we recommend reading this episode in the R Novice lesson from Carpentries that is [all about functions](https://swcarpentry.github.io/r-novice-gapminder/10-functions.html). 233 | 234 | ::::::::::::::::::::::::::::::::::::: keypoints 235 | 236 | - Functions are crucial when repeating the same code many times with minor differences 237 | - RStudio's "Extract function" tool can help you get started with converting code into functions 238 | - Functions are an essential part of how `targets` works. 239 | 240 | :::::::::::::::::::::::::::::::::::::::::::::::: 241 | -------------------------------------------------------------------------------- /episodes/introduction.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Introduction" 3 | teaching: 10 4 | exercises: 2 5 | --- 6 | 7 | :::::::::::::::::::::::::::::::::::::: questions 8 | 9 | - Why should we care about reproducibility? 10 | - How can `targets` help us achieve reproducibility? 11 | 12 | :::::::::::::::::::::::::::::::::::::::::::::::: 13 | 14 | ::::::::::::::::::::::::::::::::::::: objectives 15 | 16 | - Explain why reproducibility is important for science 17 | - Describe the features of `targets` that enhance reproducibility 18 | 19 | :::::::::::::::::::::::::::::::::::::::::::::::: 20 | 21 | ::::::::::::::::::::::::::::::::::::: {.instructor} 22 | 23 | Episode summary: Introduce the idea of reproducibility and why / who would want to use `targets` 24 | 25 | ::::::::::::::::::::::::::::::::::::: 26 | 27 | ## What is reproducibility? 28 | 29 | Reproducibility is the ability for others (including your future self) to reproduce your analysis. 
30 | 31 | We can only have confidence in the results of scientific analyses if they can be reproduced. 32 | 33 | However, reproducibility is not a binary concept (not reproducible vs. reproducible); rather, there is a scale from **less** reproducible to **more** reproducible. 34 | 35 | `targets` goes a long way toward making your analyses **more reproducible**. 36 | 37 | Other practices you can use to further enhance reproducibility include controlling your computing environment with tools like Docker, conda, or renv, but we don't have time to cover those in this workshop. 38 | 39 | ## What is `targets`? 40 | 41 | `targets` is a workflow management package for the R programming language developed and maintained by Will Landau. 42 | 43 | The major features of `targets` include: 44 | 45 | - **Automation** of workflow 46 | - **Caching** of workflow steps 47 | - **Batch creation** of workflow steps 48 | - **Parallelization** at the level of the workflow 49 | 50 | This allows you to do the following: 51 | 52 | - return to a project after working on something else and immediately pick up where you left off without confusion or trying to remember what you were doing 53 | - change the workflow, then only re-run the parts that are affected by the change 54 | - massively scale up the workflow without changing individual functions 55 | 56 | ... and of course, it will help others reproduce your analysis. 57 | 58 | ## Who should use `targets`? 59 | 60 | `targets` is by no means the only workflow management software. 61 | There is a large number of similar tools, each with varying features and use-cases. 62 | For example, [snakemake](https://snakemake.readthedocs.io/en/stable/) is a popular workflow tool for Python, and [`make`](https://www.gnu.org/software/make/) is a tool that has been around for a very long time for automating bash scripts. 63 | `targets` is designed to work specifically with R, so it makes the most sense to use it if you primarily use R, or intend to.
64 | If you mostly code with other tools, you may want to consider an alternative. 65 | 66 | The **goal** of this workshop is to **learn how to use `targets` to carry out reproducible data analysis in R**. 67 | 68 | ## Where to get more information 69 | 70 | `targets` is a sophisticated package and there is a lot more to learn than we can cover in this workshop. 71 | 72 | Here are some recommended resources for continuing on your `targets` journey: 73 | 74 | - [The `targets` R package user manual](https://books.ropensci.org/targets/) by the author of `targets`, Will Landau, should be considered required reading for anyone seriously interested in `targets`. 75 | - [The `targets` discussion board](https://github.com/ropensci/targets/discussions) is a great place for asking questions and getting help. Before you ask a question though, be sure to [read the policy on asking for help](https://books.ropensci.org/targets/help.html). 76 | - [The `targets` package webpage](https://docs.ropensci.org/targets/) includes documentation of all `targets` functions. 77 | - [The `tarchetypes` package webpage](https://docs.ropensci.org/tarchetypes/) includes documentation of all `tarchetypes` functions. You will almost certainly use `tarchetypes` along with `targets`, so it's good to consult both. 78 | - [Reproducible computation at scale in R with `targets`](https://github.com/wlandau/targets-tutorial) is a tutorial by Will Landau analyzing customer churn with Keras. 79 | - [Recorded talks](https://github.com/ropensci/targets#recorded-talks) and [example projects](https://github.com/ropensci/targets#example-projects) listed on the `targets` README. 80 | 81 | ## About the example dataset 82 | 83 | For this workshop, we will analyze an example dataset of measurements taken on adult foraging Adélie, Chinstrap, and Gentoo penguins observed on islands in the Palmer Archipelago, Antarctica. 84 | 85 | The data are available from the `palmerpenguins` R package.
You can get more information about the data by running `?palmerpenguins`. 86 | 87 | ![The three species of penguins in the `palmerpenguins` dataset. Artwork by @allison_horst.](https://allisonhorst.github.io/palmerpenguins/reference/figures/lter_penguins.png) 88 | 89 | The goal of the analysis is to determine the relationship between bill length and depth by using linear models. 90 | 91 | We will gradually build up the analysis through this lesson, but you can see the final version at . 92 | 93 | ::::::::::::::::::::::::::::::::::::: keypoints 94 | 95 | - We can only have confidence in the results of scientific analyses if they can be reproduced by others (including your future self) 96 | - `targets` helps achieve reproducibility by automating workflow 97 | - `targets` is designed for use with the R programming language 98 | - The example dataset for this workshop includes measurements taken on penguins in Antarctica 99 | 100 | :::::::::::::::::::::::::::::::::::::::::::::::: 101 | -------------------------------------------------------------------------------- /episodes/lifecycle.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: 'The Workflow Lifecycle' 3 | teaching: 10 4 | exercises: 2 5 | --- 6 | 7 | :::::::::::::::::::::::::::::::::::::: questions 8 | 9 | - What happens if we re-run a workflow? 10 | - How does `targets` know what steps to re-run? 11 | - How can we inspect the state of the workflow? 12 | 13 | :::::::::::::::::::::::::::::::::::::::::::::::: 14 | 15 | ::::::::::::::::::::::::::::::::::::: objectives 16 | 17 | - Explain how `targets` helps increase efficiency 18 | - Be able to inspect a workflow to see what parts are outdated 19 | 20 | :::::::::::::::::::::::::::::::::::::::::::::::: 21 | 22 | ::::::::::::::::::::::::::::::::::::: {.instructor} 23 | 24 | Episode summary: Demonstrate typical cycle of running `targets`: make, inspect, adjust, make... 
25 | 26 | ::::::::::::::::::::::::::::::::::::: 27 | 28 | ```{r} 29 | #| label: setup 30 | #| echo: FALSE 31 | #| message: FALSE 32 | #| warning: FALSE 33 | library(targets) 34 | library(visNetwork) 35 | source("files/lesson_functions.R") 36 | ``` 37 | 38 | ## Re-running the workflow 39 | 40 | One of the features of `targets` is that it maximizes efficiency by only running the parts of the workflow that need to be run. 41 | 42 | This is easiest to understand by trying it yourself. Let's try running the workflow again: 43 | 44 | ```{r} 45 | #| label: targets-run 46 | #| echo: [5] 47 | # Each tar_script is fresh, so need to run once to catch up to learners 48 | pushd(make_tempdir()) 49 | write_example_plan("plan_1.R") 50 | tar_make(reporter = "silent") 51 | tar_make() 52 | popd() 53 | ``` 54 | 55 | Remember how the first time we ran the pipeline, `targets` printed out a list of each target as it was being built? 56 | 57 | This time, it tells us it is skipping those targets; they have already been built, so there's no need to run that code again. 58 | 59 | Remember, the fastest code is the code you don't have to run! 60 | 61 | ## Re-running the workflow after modification 62 | 63 | What happens when we change one part of the workflow then run it again? 64 | 65 | Say that we decide the species names should be shorter. 66 | Right now they include the common name and the scientific name, but we really only need the first part of the common name to distinguish them. 67 | 68 | Edit `_targets.R` so that the `clean_penguin_data()` function looks like this: 69 | 70 | ```{r} 71 | #| label: new-func 72 | #| eval: FALSE 73 | #| file: files/tar_functions/clean_penguin_data.R 74 | ``` 75 | 76 | Then run it again. 
77 | 78 | ```{r} 79 | #| label: targets-run-2 80 | #| echo: [6] 81 | plan_2_dir <- make_tempdir() 82 | pushd(plan_2_dir) 83 | write_example_plan("plan_1.R") 84 | tar_make(reporter = "silent") 85 | write_example_plan("plan_2.R") 86 | tar_make() 87 | popd() 88 | ``` 89 | 90 | What happened? 91 | 92 | This time, it skipped `penguins_csv_file` and `penguins_data_raw` and only ran `penguins_data`. 93 | 94 | Of course, since our example workflow is so short we don't even notice the amount of time saved. 95 | But imagine using this in a series of computationally intensive analysis steps. 96 | The ability to automatically skip steps results in a massive increase in efficiency. 97 | 98 | ::::::::::::::::::::::::::::::::::::: challenge 99 | 100 | ## Challenge 1: Inspect the output 101 | 102 | How can you inspect the contents of `penguins_data`? 103 | 104 | :::::::::::::::::::::::::::::::::: solution 105 | 106 | With `tar_read(penguins_data)` or by running `tar_load(penguins_data)` followed by `penguins_data`. 107 | 108 | :::::::::::::::::::::::::::::::::::::::::::: 109 | 110 | ::::::::::::::::::::::::::::::::::::::::::::::: 111 | 112 | ## Under the hood 113 | 114 | How does `targets` keep track of which targets are up-to-date vs. outdated? 115 | 116 | For each target in the workflow (items in the list at the end of the `_targets.R` file) and any custom functions used in the workflow, `targets` calculates a **hash value**, or a unique combination of letters and digits that represents an object in the computer's memory. 117 | You can think of the hash value (or "hash" for short) as **a unique fingerprint** for a target or function. 118 | 119 | The first time you run `tar_make()`, `targets` calculates the hashes for each target and function as it runs the code and stores them in the targets cache (the `_targets` folder). 120 | Then, for each subsequent call of `tar_make()`, it calculates the hashes again and compares them to the stored values.
121 | It detects which have changed, and this is how it knows which targets are out of date. 122 | 123 | :::::::::::::::::::::::::::::::::::::::: callout 124 | 125 | ## Where the hashes live 126 | 127 | If you are curious about what the hashes look like, you can see them in the file `_targets/meta/meta`, but **do not edit this file by hand**---that would ruin your workflow! 128 | 129 | :::::::::::::::::::::::::::::::::::::::: 130 | 131 | This information is used in combination with the dependency relationships (in other words, how each target depends on the others) to re-run the workflow in the most efficient way possible: code is only run for targets that need to be re-built, and others are skipped. 132 | 133 | ## Visualizing the workflow 134 | 135 | Typically, you will be making edits to various places in your code, adding new targets, and running the workflow periodically. 136 | It is good to be able to visualize the state of the workflow. 137 | 138 | This can be done with `tar_visnetwork()`: 139 | 140 | ```{r} 141 | #| label: targets-run-hide-3 142 | #| echo: [5] 143 | #| results: "asis" 144 | #| eval: FALSE 145 | # TODO: Change #| eval to TRUE when 146 | # https://github.com/carpentries/sandpaper/issues/443 147 | # is resolved 148 | pushd(plan_2_dir) 149 | tar_visnetwork() 150 | popd() 151 | ``` 152 | 153 | ![](fig/lifecycle-visnetwork.png){alt="Visualization of the targets workflow, showing 'penguins_data' connected by lines to 'penguins_data_raw', 'penguins_csv_file' and 'clean_penguin_data'"} 154 | 155 | You should see the network show up in the plot area of RStudio. 156 | 157 | It is an HTML widget, so you can zoom in and out (this isn't important for the current example since it is so small, but is useful for larger, "real-life" workflows). 158 | 159 | Here, we see that all of the targets are dark green, indicating that they are up-to-date and would be skipped if we were to run the workflow again.
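A closely related function is `tar_glimpse()`, which draws only the dependency graph without checking whether each target is up to date, so it can render faster for large pipelines:

```r
library(targets)

# Dependency graph only; does not check target status,
# so it is quicker than tar_visnetwork() for big workflows
tar_glimpse()
```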
160 | 161 | ::::::::::::::::::::::::::::::::::::: prereq 162 | 163 | ## Installing visNetwork 164 | 165 | You may encounter an error message `The package "visNetwork" is required.` 166 | 167 | In this case, install it first with `install.packages("visNetwork")`. 168 | 169 | :::::::::::::::::::::::::::::::::::::::::::::::: 170 | 171 | ::::::::::::::::::::::::::::::::::::: challenge 172 | 173 | ## Challenge 2: What else can the visualization tell us? 174 | 175 | Modify the workflow in `_targets.R`, then run `tar_visnetwork()` again **without** running `tar_make()`. 176 | What color indicates that a target is out of date? 177 | 178 | :::::::::::::::::::::::::::::::::: solution 179 | 180 | Light blue indicates the target is out of date. 181 | 182 | Depending on how you modified the code, any or all of the targets may now be light blue. 183 | 184 | :::::::::::::::::::::::::::::::::::::::::::: 185 | 186 | ::::::::::::::::::::::::::::::::::::::::::::::: 187 | 188 | ::::::::::::::::::::::::::::::::::::: callout 189 | 190 | ## 'Outdated' does not always mean 'will be run' 191 | 192 | A target appearing as light blue ("outdated") in the network visualization does not guarantee that it will be re-built during the next run. Rather, it means that **at least one of the targets that it depends on has changed**. 193 | 194 | For example, if the workflow state looked like this: 195 | 196 | `A -> B* -> C -> D` 197 | 198 | where the `*` indicates that `B` has changed compared to the last time the workflow was run, the network visualization will show `B`, `C`, and `D` all as light blue. 199 | 200 | But if re-running the workflow results in the exact same value for `C` as before, `D` will not be re-run (will be "skipped"). 201 | 202 | Most of the time, a single change will cascade to the rest of the downstream targets and cause them to be re-built, but this is not always the case.
`targets` has no way of knowing ahead of time what the actual output will be, so it cannot provide a network visualization that completely predicts the future! 203 | 204 | ::::::::::::::::::::::::::::::::::::::::::::::: 205 | 206 | ## Other ways to check workflow status 207 | 208 | The visualization is very useful, but sometimes you may be working on a server that doesn't provide graphical output, or you just want a quick textual summary of the workflow. 209 | There are some other useful functions that can do that. 210 | 211 | `tar_outdated()` lists only the outdated targets; that is, targets that will be built during the next run, or that depend on such a target. 212 | If everything is up to date, it will return a zero-length character vector (`character(0)`). 213 | 214 | ```{r} 215 | #| label: targets-outdated 216 | #| echo: [2] 217 | pushd(plan_2_dir) 218 | tar_outdated() 219 | popd() 220 | ``` 221 | 222 | `tar_progress()` shows the current status of the workflow as a dataframe. 223 | You may find it helpful to further manipulate the dataframe to obtain useful summaries of the workflow, for example using `dplyr` (such data manipulation is beyond the scope of this lesson but the instructor may demonstrate its use). 224 | 225 | ```{r} 226 | #| label: targets-progress 227 | #| echo: [2] 228 | pushd(plan_2_dir) 229 | tar_progress() 230 | popd() 231 | ``` 232 | 233 | ## Granular control of targets 234 | 235 | It is possible to only make a particular target instead of running the entire workflow. 236 | 237 | To do this, type the name of the target you wish to build inside the parentheses of `tar_make()` (note that any targets required by the one you specify will also be built). 238 | For example, `tar_make(penguins_data_raw)` would **only** build `penguins_data_raw`, not `penguins_data`. 239 | 240 | Furthermore, if you want to manually "reset" a target and make it appear out-of-date, you can do so with `tar_invalidate()`.
This means that target (and any that depend on it) will be re-run next time. 241 | 242 | Let's give this a try. Remember that our pipeline is currently up to date, so `tar_make()` will skip everything: 243 | 244 | ```{r} 245 | #| label: targets-progress-show-2 246 | #| eval: true 247 | #| echo: [2] 248 | pushd(plan_2_dir) 249 | tar_make() 250 | popd() 251 | ``` 252 | 253 | Let's invalidate `penguins_data` and run it again: 254 | 255 | ```{r} 256 | #| label: targets-progress-show-3 257 | #| eval: true 258 | #| echo: [2, 3] 259 | pushd(plan_2_dir) 260 | tar_invalidate(penguins_data) 261 | tar_make() 262 | popd() 263 | ``` 264 | 265 | If you want to reset **everything** and start fresh, you can use `tar_invalidate(everything())` (`tar_invalidate()` [accepts `tidyselect` expressions](https://docs.ropensci.org/targets/reference/tar_invalidate.html) to specify target names). 266 | 267 | **Caution should be exercised** when using granular methods like this, though, since you may end up with your workflow in an unexpected state. The surest way to maintain an up-to-date workflow is to run `tar_make()` frequently. 268 | 269 | ## How this all works in practice 270 | 271 | In practice, you will likely be switching between running the workflow with `tar_make()`, loading the targets you built with `tar_load()`, and editing your custom functions by running code in an interactive R session. It takes some time to get used to it, but soon you will feel that your code isn't "real" until it is embedded in a `targets` workflow. 
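As a rough sketch of that cycle (assuming the penguins pipeline from this episode is the active `_targets.R` in your project), a typical session might look like:

```r
library(targets)

tar_make()               # run the workflow; only outdated targets rebuild
tar_load(penguins_data)  # load a built target into your interactive session
summary(penguins_data)   # explore it

# ...edit clean_penguin_data() in your functions file, then:
tar_make()               # re-run; unaffected targets are skipped
```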
272 | 273 | ::::::::::::::::::::::::::::::::::::: keypoints 274 | 275 | - `targets` only runs the steps that have been affected by a change to the code 276 | - `tar_visnetwork()` shows the current state of the workflow as a network 277 | - `tar_progress()` shows the current state of the workflow as a data frame 278 | - `tar_outdated()` lists outdated targets 279 | - `tar_invalidate()` can be used to invalidate (re-run) specific targets 280 | 281 | :::::::::::::::::::::::::::::::::::::::::::::::: 282 | -------------------------------------------------------------------------------- /episodes/organization.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: 'Best Practices for targets Project Organization' 3 | teaching: 10 4 | exercises: 2 5 | --- 6 | 7 | :::::::::::::::::::::::::::::::::::::: questions 8 | 9 | - What are best practices for organizing `targets` projects? 10 | - How does the organization of a `targets` workflow differ from a script-based analysis? 11 | 12 | :::::::::::::::::::::::::::::::::::::::::::::::: 13 | 14 | ::::::::::::::::::::::::::::::::::::: objectives 15 | 16 | - Explain how to organize `targets` projects for maximal reproducibility 17 | - Understand how to use functions in the context of `targets` 18 | 19 | :::::::::::::::::::::::::::::::::::::::::::::::: 20 | 21 | ::::::::::::::::::::::::::::::::::::: instructor 22 | 23 | Episode summary: Demonstrate best-practices for project organization 24 | 25 | ::::::::::::::::::::::::::::::::::::: 26 | 27 | ```{r} 28 | #| label: setup 29 | #| echo: FALSE 30 | #| message: FALSE 31 | #| warning: FALSE 32 | library(targets) 33 | library(tarchetypes) 34 | source("files/lesson_functions.R") 35 | ``` 36 | 37 | ## A simpler way to write workflow plans 38 | 39 | The default way to specify targets in the plan is with the `tar_target()` function. 40 | But this way of writing plans can be a bit verbose. 
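As a reminder of what that verbosity looks like, here is a schematic plan written entirely with `tar_target()` (the file and target names are hypothetical, not the penguins plan):

```r
library(targets)

# Every target needs a full tar_target() call, wrapped in list()
list(
  tar_target(raw_data, read.csv("data.csv")),  # hypothetical input file
  tar_target(clean_data, na.omit(raw_data))
)
```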
41 | 42 | There is an alternative provided by the `tarchetypes` package, also written by the creator of `targets`, Will Landau. 43 | 44 | ::::::::::::::::::::::::::::::::::::: prereq 45 | 46 | ## Install `tarchetypes` 47 | 48 | If you haven't done so yet, install `tarchetypes` with `install.packages("tarchetypes")`. 49 | 50 | ::::::::::::::::::::::::::::::::::::: 51 | 52 | The purpose of `tarchetypes` is to provide various shortcuts that make writing `targets` pipelines easier. 53 | We will introduce just one for now, `tar_plan()`. This is used in place of `list()` at the end of the `_targets.R` script. 54 | By using `tar_plan()`, instead of specifying targets with `tar_target()`, we can use a syntax like this: `target_name = target_command`. 55 | 56 | Let's edit the penguins workflow to use the `tar_plan()` syntax: 57 | 58 | 59 | ```{r} 60 | #| label = "tar-plan-show-1", 61 | #| eval = FALSE, 62 | #| code = c(readLines("files/packages.R")[1:4], "\n", readLines("files/tar_functions/clean_penguin_data.R"), "\n", readLines("files/plans/plan_2b.R")[5:9]) 63 | ``` 64 | 65 | I think it is easier to read, do you? 66 | 67 | Notice that `tar_plan()` does not mean you have to write *all* targets this way; you can still use the `tar_target()` format within `tar_plan()`. 68 | That is because `=`, while short and easy to read, does not provide all of the customization that `targets` is capable of. 69 | This doesn't matter so much for now, but it will become important when you start to create more advanced `targets` workflows. 70 | 71 | ## Organizing files and folders 72 | 73 | So far, we have been doing everything with a single `_targets.R` file. 74 | This is OK for a small workflow, but does not work very well when the workflow gets bigger. 75 | There are better ways to organize your code.
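To preview where this is headed, the layout we will build in this section looks roughly like this (a sketch; only the name and location of `_targets.R` are fixed requirements):

```
.
├── _targets.R        # workflow plan (must stay in the project root)
├── R/
│   ├── functions.R   # custom functions
│   └── packages.R    # library() calls
└── _targets/         # targets cache (usually not version controlled)
    └── user/
        ├── data/     # input data files
        └── results/  # output files
```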
76 | 77 | First, let's create a directory called `R` to store R code *other than* `_targets.R` (remember, `_targets.R` must be placed in the overall project directory, not in a subdirectory). 78 | Create a new R file in `R/` called `functions.R`. 79 | This is where we will put our custom functions. 80 | Let's go ahead and put `clean_penguin_data()` in there now and save it. 81 | 82 | Similarly, let's put the `library()` calls in their own script in `R/` called `packages.R` (this isn't the only way to do it though; see the ["Managing Packages" episode](https://joelnitta.github.io/targets-workshop/packages.html) for alternative approaches). 83 | 84 | We will also need to modify our `_targets.R` script to call these scripts with `source`: 85 | 86 | ```{r} 87 | #| label = "tar-plan-show-2", 88 | #| eval = FALSE, 89 | #| code = readLines("files/plans/plan_2b.R")[2:9] 90 | ``` 91 | 92 | Now `_targets.R` is much more streamlined: it is focused just on the workflow and immediately tells us what happens in each step. 93 | 94 | Finally, let's make some directories for storing data and output---files that are not code. 95 | Create a new directory inside the targets cache called `user`: `_targets/user`. 96 | Within `user`, create two more directories, `data` and `results`. 97 | (If you use version control, you will probably want to ignore the `_targets` directory). 98 | 99 | ## A word about functions 100 | 101 | We mentioned custom functions earlier in the lesson, but this is an important topic that deserves further clarification. 102 | If you are used to analyzing data in R with a series of scripts instead of a single workflow like `targets`, you may not write many functions (using the `function()` function). 103 | 104 | This is a major difference from `targets`. 105 | It would be quite difficult to write an efficient `targets` pipeline without the use of custom functions, because each target you build has to be the output of a single command. 
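For example (a hypothetical illustration, not part of the penguins analysis), wrapping several steps in a custom function means that building the target is still a single command:

```r
# Hypothetical: three cleaning/summary steps collapsed into one function
summarize_weights <- function(raw) {
  cleaned <- raw[!is.na(raw$weight_g), ]        # drop missing weights
  cleaned$weight_kg <- cleaned$weight_g / 1000  # convert g to kg
  aggregate(weight_kg ~ species, data = cleaned, FUN = mean)
}

# In _targets.R, the whole step is then one command:
# tar_target(weight_summary, summarize_weights(raw_data))
```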
106 | 107 | We don't have time in this curriculum to cover how to write functions in R, but the [Software Carpentry lesson](https://swcarpentry.github.io/r-novice-gapminder/10-functions) is recommended for reviewing this topic. 108 | 109 | Another major difference is that **each target must have a unique name**. 110 | You may be used to writing code that looks like this: 111 | 112 | ```{r} 113 | #| eval: FALSE 114 | #| label: example-script 115 | 116 | # Store a person's height in cm, then convert to inches 117 | height <- 160 118 | height <- height / 2.54 119 | ``` 120 | 121 | You would get an error if you tried to run the equivalent targets pipeline: 122 | 123 | ```{r} 124 | #| eval: FALSE 125 | #| label: example-bad-pipeline-show 126 | #| echo: [-1, -2] 127 | library(targets) 128 | library(tarchetypes) 129 | tar_plan( 130 | height = 160, 131 | height = height / 2.54 132 | ) 133 | ``` 134 | 135 | ```{r} 136 | #| echo: FALSE 137 | #| label: example-bad-pipeline-hide 138 | #| error: true 139 | tar_dir({ 140 | write_example_plan(chunk = "example-bad-pipeline-show") 141 | tar_make() 142 | }) 143 | ``` 144 | 145 | **A major part of working with `targets` pipelines is writing custom functions that are the right size.** 146 | They should not be so small that each is just a single line of code; this would make your pipeline difficult to understand and maintain. 147 | On the other hand, they should not be so big that each has large numbers of inputs and is thus overly sensitive to changes. 148 | 149 | Striking this balance is more of an art than a science, and only comes with practice. I find a good rule of thumb is no more than three inputs per target.
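As a hypothetical illustration of that balance, using the bill measurements from our example analysis:

```r
# Probably too small: a one-line wrapper that adds little clarity
drop_na_bills <- function(d) d[!is.na(d$bill_length_mm), ]

# A more useful size: one meaningful analysis step with a few clear inputs
model_bills_by_species <- function(penguin_data, species_name) {
  species_data <- penguin_data[penguin_data$species == species_name, ]
  lm(bill_depth_mm ~ bill_length_mm, data = species_data)
}
```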
150 | 151 | ::::::::::::::::::::::::::::::::::::: keypoints 152 | 153 | - Put code in the `R/` folder 154 | - Put functions in `R/functions.R` 155 | - Specify packages in `R/packages.R` 156 | - Put other miscellaneous files in `_targets/user` 157 | - Writing functions is a key skill for `targets` pipelines 158 | 159 | :::::::::::::::::::::::::::::::::::::::::::::::: 160 | 161 | -------------------------------------------------------------------------------- /episodes/packages.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: 'Managing Packages' 3 | teaching: 10 4 | exercises: 2 5 | --- 6 | 7 | :::::::::::::::::::::::::::::::::::::: questions 8 | 9 | - How should I manage packages for my `targets` project? 10 | 11 | :::::::::::::::::::::::::::::::::::::::::::::::: 12 | 13 | ::::::::::::::::::::::::::::::::::::: objectives 14 | 15 | - Demonstrate best practices for managing packages 16 | 17 | :::::::::::::::::::::::::::::::::::::::::::::::: 18 | 19 | ::::::::::::::::::::::::::::::::::::: instructor 20 | 21 | Episode summary: Show how to load packages and maintain package versions 22 | 23 | ::::::::::::::::::::::::::::::::::::: 24 | 25 | ```{r} 26 | #| label: setup 27 | #| echo: FALSE 28 | #| message: FALSE 29 | #| warning: FALSE 30 | library(targets) 31 | library(tarchetypes) 32 | source("files/lesson_functions.R") 33 | ``` 34 | 35 | ## Loading packages 36 | 37 | Almost every R analysis relies on packages for functions beyond those available in base R. 38 | 39 | There are three main ways to load packages in `targets` workflows. 40 | 41 | ### Method 1: `library()` {#method-1} 42 | 43 | This is the method you are almost certainly most familiar with, and is the method we have been using by default so far. 44 | 45 | Like any other R script, include `library()` calls near the top of the `_targets.R` script.
Alternatively (and as the recommended best practice for project organization), you can put all of the `library()` calls in a separate script---this is typically called `packages.R` and stored in the `R/` directory of your project. 46 | 47 | The potential downside to this approach is that if you have a long list of packages to load, certain functions like `tar_visnetwork()`, `tar_outdated()`, etc., may take an unnecessarily long time to run because they have to load all the packages, even though they don't necessarily use them. 48 | 49 | ### Method 2: `tar_option_set()` {#method-2} 50 | 51 | In this method, use the `tar_option_set()` function in `_targets.R` to specify the packages to load when running the workflow. 52 | 53 | This will be demonstrated using the pre-cleaned dataset from the `palmerpenguins` package. Let's say we want to filter it down to just data for the Adelie penguin. 54 | 55 | ::::::::::::::::::::::::::::::::::::: {.callout} 56 | 57 | ## Save your progress 58 | 59 | You can only have one active `_targets.R` file at a time in a given project. 60 | 61 | We are about to create a new `_targets.R` file, but you probably don't want to lose your progress in the one we have been working on so far (the penguins bill analysis). You can temporarily rename that one to something like `_targets_old.R` so that you don't overwrite it with the new example `_targets.R` file below. Then, rename them when you are ready to work on it again. 
62 | 63 | ::::::::::::::::::::::::::::::::::::: 64 | 65 | This is what using the `tar_option_set()` method looks like: 66 | 67 | ```{r} 68 | #| eval: FALSE 69 | #| label: load-pkg-show 70 | library(targets) 71 | library(tarchetypes) 72 | 73 | tar_option_set(packages = c("dplyr", "palmerpenguins")) 74 | 75 | tar_plan( 76 | adelie_data = filter(penguins, species == "Adelie") 77 | ) 78 | ``` 79 | 80 | ```{r} 81 | #| echo: FALSE 82 | #| label: load-pkg-hide 83 | tar_dir({ 84 | write_example_plan(chunk = "load-pkg-show") 85 | tar_make() 86 | }) 87 | ``` 88 | 89 | This method gets around the slow-downs that may sometimes be experienced with Method 1. 90 | 91 | ### Method 3: `packages` argument of `tar_target()` {#method-3} 92 | 93 | The main function for defining targets, `tar_target()`, includes a `packages` argument that will load the specified packages **only for that target**. 94 | 95 | Here is how we could use this method, modified from the same example as above. 96 | 97 | ```{r} 98 | #| eval: FALSE 99 | #| label: load-pkg-show-2 100 | library(targets) 101 | library(tarchetypes) 102 | 103 | tar_plan( 104 | tar_target( 105 | adelie_data, 106 | filter(penguins, species == "Adelie"), 107 | packages = c("dplyr", "palmerpenguins") 108 | ) 109 | ) 110 | ``` 111 | 112 | ```{r} 113 | #| echo: FALSE 114 | #| label: load-pkg-hide-2 115 | tar_dir({ 116 | write_example_plan(chunk="load-pkg-show-2") 117 | tar_make() 118 | }) 119 | ``` 120 | 121 | This can be more memory efficient in some cases than loading all packages, since not every target is always made during a typical run of the workflow. 122 | But it can be tedious to remember and specify packages needed on a per-target basis. 123 | 124 | ### One more option 125 | 126 | Another alternative that does not actually involve loading packages is to specify the package associated with each function by using the `::` notation, for example, `dplyr::mutate()`. 127 | This means you can **avoid loading packages altogether**.
128 | 129 | Here is how to write the plan using this method: 130 | 131 | ```{r} 132 | #| eval: FALSE 133 | #| label: load-pkg-show-3 134 | library(targets) 135 | library(tarchetypes) 136 | 137 | tar_plan( 138 | adelie_data = dplyr::filter(palmerpenguins::penguins, species == "Adelie") 139 | ) 140 | ``` 141 | 142 | ```{r} 143 | #| echo: FALSE 144 | #| label: load-pkg-hide-3 145 | tar_dir({ 146 | write_example_plan(chunk = "load-pkg-show-3") 147 | tar_make() 148 | }) 149 | ``` 150 | 151 | The benefit of this approach is that the origins of all functions are explicit, so you could browse your code (for example, by looking at its source on GitHub) and immediately know where all the functions come from. 152 | The downside is that it is rather verbose because you need to type the package name every time you use one of its functions. 153 | 154 | ### Which is the right way? 155 | 156 | **There is no "right" answer about how to load packages**---it is a matter of what works best for your particular situation. 157 | 158 | Often a reasonable approach is to load your most commonly used packages with `library()` (such as `tidyverse`) in `packages.R`, then use `::` notation for less frequently used functions whose origins you may otherwise forget. 159 | 160 | ## Maintaining package versions 161 | 162 | ### Tracking of custom functions vs. functions from packages 163 | 164 | A critical thing to understand about `targets` is that **it only tracks custom functions and targets**, not functions provided by packages. 165 | 166 | However, the content of packages can change, and packages typically get updated on a regular basis. **The output of your workflow may depend not only on the packages you use, but also on their versions**. 167 | 168 | Therefore, it is a good idea to track package versions. 169 | 170 | ### About `renv` 171 | 172 | Fortunately, you don't have to do this by hand: there are R packages available that can help automate this process.
We recommend [renv](https://rstudio.github.io/renv/index.html), but there are others available as well (e.g., [groundhog](https://groundhogr.com/)). We don't have the time to cover detailed usage of `renv` in this lesson. To get started with `renv`, see the ["Introduction to renv" vignette](https://rstudio.github.io/renv/articles/renv.html). 173 | 174 | You can generally use `renv` with a `targets` project the same way you would with any other R project. However, there is one exception: if you load packages using `tar_option_set()` or the `packages` argument of `tar_target()` ([Method 2](#method-2) or [Method 3](#method-3), respectively), `renv` will not detect them (because it expects packages to be loaded with `library()`, `require()`, etc.). 175 | 176 | The solution in this case is to use the [`tar_renv()` function](https://docs.ropensci.org/targets/reference/tar_renv.html). This will write a separate file with `library()` calls for each package used in the workflow so that `renv` will properly detect them. 177 | 178 | ### Selective tracking of functions from packages 179 | 180 | Because `targets` doesn't track functions from packages, if you update a package and the contents of one of its functions change, `targets` **will not re-build the target that was generated by that function**. 181 | 182 | However, it is possible to change this behavior on a per-package basis. 183 | This is best done only for a small number of packages, since tracking too many would add excessive computational overhead when `targets` calculates dependencies. 184 | For example, you may want to do this if you are using your own custom package that you update frequently. 185 | 186 | The way to do so is by using `tar_option_set()`, specifying the **same** package name in both `packages` and `imports`. Here is a modified version of the earlier code that demonstrates this for `dplyr` and `palmerpenguins`.
187 | 188 | ```{r} 189 | #| eval: FALSE 190 | #| label: load-pkg-show-4 191 | library(targets) 192 | library(tarchetypes) 193 | 194 | tar_option_set( 195 |   packages = c("dplyr", "palmerpenguins"), 196 |   imports = c("dplyr", "palmerpenguins") 197 | ) 198 | 199 | tar_plan( 200 |   adelie_data = filter(penguins, species == "Adelie") 201 | ) 202 | ``` 203 | 204 | If we were to re-install either `dplyr` or `palmerpenguins` and one of the functions used in the pipeline changed (for example, `filter()`), any target depending on that function would be rebuilt. 205 | 206 | ## Resolving namespace conflicts 207 | 208 | There is one final best practice to mention related to packages: resolving namespace conflicts. 209 | 210 | "Namespace" refers to the idea that a set of names only needs to be unique **within a particular context**. 211 | For example, all the function names of a package have to be unique, but only within that package. 212 | Function names can be duplicated across packages. 213 | 214 | As you may imagine, this can cause confusion. 215 | For example, the `filter()` function appears in both the `stats` package and the `dplyr` package, but does completely different things in each. 216 | This is a **namespace conflict**: how do we know which `filter()` we are talking about? 217 | 218 | The `conflicted` package can help prevent such confusion by stopping you if you try to use an ambiguous function, and helping you be explicit about which package to use. 219 | We don't have time to cover the details here, but you can read more about how to use `conflicted` at its [website](https://conflicted.r-lib.org/). 220 | 221 | When you use `conflicted`, you will typically run a series of commands to explicitly resolve namespace conflicts, like `conflicts_prefer(dplyr::filter)` (this tells R that we want to use `filter` from `dplyr`, not `stats`).
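To make the conflict concrete, here is a quick sketch comparing the two functions directly (this example is ours and is not part of the lesson's pipeline):

```r
# dplyr::filter() subsets rows of a data frame:
dplyr::filter(palmerpenguins::penguins, species == "Adelie")

# stats::filter() applies a linear filter to a time series,
# for example a 3-point moving average:
stats::filter(1:10, rep(1 / 3, 3))

# With both packages attached, a bare filter() call silently uses
# whichever package was attached last; conflicted turns that
# ambiguity into an explicit error until you state a preference.
```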
222 | 223 | To use this in a `targets` workflow, you should put all calls to `conflicts_prefer` in a special file called `.Rprofile` that is located in the main folder of your project. This will ensure that the conflicts are always resolved for each target. 224 | 225 | The recommended way to edit your `.Rprofile` is to use `usethis::edit_r_profile("project")`. 226 | This will open `.Rprofile` in your editor, where you can edit it and save it. 227 | 228 | For example, your `.Rprofile` could include this: 229 | 230 | ```{r} 231 | #| eval: false 232 | library(conflicted) 233 | conflicts_prefer(dplyr::filter) 234 | ``` 235 | 236 | Note that you don't need to run `source()` to run the code in `.Rprofile`. 237 | It will always get run at the start of each R session automatically. 238 | 239 | ::::::::::::::::::::::::::::::::::::: keypoints 240 | 241 | - There are multiple ways to load packages with `targets` 242 | - `targets` only tracks user-defined functions, not packages 243 | - Use `renv` to manage package versions 244 | - Use the `conflicted` package to manage namespace conflicts 245 | 246 | :::::::::::::::::::::::::::::::::::::::::::::::: 247 | -------------------------------------------------------------------------------- /episodes/parallel.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: 'Parallel Processing' 3 | teaching: 15 4 | exercises: 2 5 | --- 6 | 7 | :::::::::::::::::::::::::::::::::::::: questions 8 | 9 | - How can we build targets in parallel? 
10 | 11 | :::::::::::::::::::::::::::::::::::::::::::::::: 12 | 13 | ::::::::::::::::::::::::::::::::::::: objectives 14 | 15 | - Be able to build targets in parallel 16 | 17 | :::::::::::::::::::::::::::::::::::::::::::::::: 18 | 19 | ::::::::::::::::::::::::::::::::::::: instructor 20 | 21 | Episode summary: Show how to use parallel processing 22 | 23 | ::::::::::::::::::::::::::::::::::::: 24 | 25 | ```{r} 26 | #| label: setup 27 | #| echo: FALSE 28 | #| message: FALSE 29 | #| warning: FALSE 30 | library(targets) 31 | library(tarchetypes) 32 | library(broom) 33 | 34 | if (interactive()) { 35 |   setwd("episodes") 36 | } 37 | 38 | source("files/lesson_functions.R") 39 | 40 | # Increase width for printing tibbles 41 | options(width = 140) 42 | ``` 43 | 44 | Once a pipeline starts to include many targets, you may want to think about parallel processing. 45 | This takes advantage of multiple processors in your computer to build multiple targets at the same time. 46 | 47 | ::::::::::::::::::::::::::::::::::::: {.callout} 48 | 49 | ## When to use parallel processing 50 | 51 | Parallel processing should only be used if your workflow has independent tasks---if your workflow only consists of a linear sequence of targets, then there is nothing to parallelize. 52 | Most workflows that use branching can benefit from parallelism. 53 | 54 | ::::::::::::::::::::::::::::::::::::: 55 | 56 | `targets` includes support for high-performance computing, cloud computing, and various parallel backends. 57 | Here, we assume you are running this analysis on a laptop and so will use a relatively simple backend. 58 | If you are interested in high-performance computing, [see the `targets` manual](https://books.ropensci.org/targets/hpc.html). 59 | 60 | ### Set up workflow 61 | 62 | To enable parallel processing with `crew`, you only need to load the `crew` package, then tell `targets` to use it with `tar_option_set()`.
63 | Specifically, the following lines enable `crew` and tell it to use 2 parallel workers. 64 | You can increase this number on more powerful machines: 65 | 66 | ```r 67 | library(crew) 68 | tar_option_set( 69 |   controller = crew_controller_local(workers = 2) 70 | ) 71 | ``` 72 | 73 | Make these changes to the penguins analysis. 74 | It should now look like this: 75 | 76 | ```{r} 77 | #| label = "example-model-show-setup", 78 | #| eval = FALSE, 79 | #| code = readLines("files/plans/plan_9.R")[3:41] 80 | ``` 81 | 82 | There is still one more thing we need to modify, only for the purposes of this demo: if we ran the analysis in parallel now, we wouldn't notice any difference in compute time because the functions are so fast. 83 | 84 | So let's make "slow" versions of `model_glance()` and `model_augment()` using the `Sys.sleep()` function, which just tells the computer to wait some number of seconds. 85 | This will simulate a long-running computation and enable us to see the difference between running sequentially and in parallel. 86 | 87 | Add these functions to `functions.R` (you can copy-paste the original ones, then modify them): 88 | 89 | ```{r} 90 | #| label: slow-funcs 91 | #| eval: false 92 | #| file: 93 | #|   - files/tar_functions/model_glance_slow.R 94 | #|   - files/tar_functions/model_augment_slow.R 95 | ``` 96 | 97 | Then, change the plan to use the "slow" versions of the functions: 98 | 99 | ```{r} 100 | #| label = "example-model-show-9", 101 | #| eval = FALSE, 102 | #| code = readLines("files/plans/plan_10.R")[3:41] 103 | ``` 104 | 105 | Finally, run the pipeline with `tar_make()` as normal.
106 | 107 | ```{r} 108 | #| label: example-model-hide-9 109 | #| warning: false 110 | #| message: false 111 | #| echo: false 112 | 113 | plan_10_dir <- make_tempdir() 114 | pushd(plan_10_dir) 115 | write_example_plan("plan_9.R") 116 | tar_make(reporter = "silent") 117 | write_example_plan("plan_10.R") 118 | tar_make() 119 | popd() 120 | ``` 121 | 122 | Notice that although the time required to build each individual target is about 4 seconds, the total time to run the entire workflow is less than the sum of the individual target times! That is proof that processes are running in parallel **and saving you time**. 123 | 124 | The unique and powerful thing about targets is that **we did not need to change our custom function to run it in parallel**. We only adjusted *the workflow*. This means it is relatively easy to refactor (modify) a workflow for running sequentially locally or running in parallel in a high-performance context. 125 | 126 | Now that we have demonstrated how this works, you can change your analysis plan back to the original versions of the functions you wrote. 127 | 128 | ::::::::::::::::::::::::::::::::::::: keypoints 129 | 130 | - Dynamic branching creates multiple targets with a single command 131 | - You usually need to write custom functions so that the output of the branches includes necessary metadata 132 | - Parallel computing works at the level of the workflow, not the function 133 | 134 | :::::::::::::::::::::::::::::::::::::::::::::::: 135 | -------------------------------------------------------------------------------- /episodes/quarto.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: 'Reproducible Reports with Quarto' 3 | teaching: 10 4 | exercises: 2 5 | --- 6 | 7 | :::::::::::::::::::::::::::::::::::::: questions 8 | 9 | - How can we create reproducible reports? 
10 | 11 | :::::::::::::::::::::::::::::::::::::::::::::::: 12 | 13 | ::::::::::::::::::::::::::::::::::::: objectives 14 | 15 | - Be able to generate a report using `targets` 16 | 17 | :::::::::::::::::::::::::::::::::::::::::::::::: 18 | 19 | ::::::::::::::::::::::::::::::::::::: instructor 20 | 21 | Episode summary: Show how to write reports with Quarto 22 | 23 | ::::::::::::::::::::::::::::::::::::: 24 | 25 | ```{r} 26 | #| label: setup 27 | #| echo: FALSE 28 | #| message: FALSE 29 | #| warning: FALSE 30 | library(targets) 31 | library(tarchetypes) 32 | library(quarto) # don't actually need to load, but put here so renv catches it 33 | 34 | if (interactive()) { 35 | setwd("episodes") 36 | } 37 | 38 | source("files/lesson_functions.R") 39 | 40 | # Increase width for printing tibbles 41 | options(width = 140) 42 | ``` 43 | 44 | ## Copy-paste vs. dynamic documents 45 | 46 | Typically, you will want to communicate the results of a data analysis to a broader audience. 47 | 48 | You may have done this before by copying and pasting statistics, plots, and other results into a text document or presentation. 49 | This may be fine if you only ever do the analysis once. 50 | But that is rarely the case---it is much more likely that you will tweak parts of the analysis or add new data and re-run your pipeline. 51 | With the copy-paste method, you'd have to remember what results changed and manually make sure everything is up-to-date. 52 | This is a perilous exercise! 53 | 54 | Fortunately, `targets` provides functions for keeping a document in sync with pipeline results, so you can avoid such pitfalls. 55 | The main tool we will use to generate documents is **Quarto**. 56 | Quarto can be used separately from `targets` (and is a large topic on its own), but it also happens to be an excellent way to dynamically generate reports with `targets`. 
57 | 58 | Quarto allows you to insert the results of R code directly into your documents, so that there is no danger of copy-and-paste mistakes. 59 | Furthermore, it can generate output from the same underlying script in multiple formats, including PDF, HTML, and Microsoft Word. 60 | 61 | ::::::::::::::::::::::::::::::::::::: {.prereq} 62 | 63 | ## Installing Quarto 64 | 65 | As of v2022.07.1, [RStudio comes with Quarto](https://docs.posit.co/ide/user/ide/guide/documents/quarto-project.html), so you don't need to install it separately. If you can't run Quarto from RStudio, we recommend installing the latest version of RStudio. 66 | 67 | ::::::::::::::::::::::::::::::::::::: 68 | 69 | ## About Quarto files 70 | 71 | `.qmd` (or `.Qmd`) is the extension for Quarto files, and stands for "Quarto markdown". 72 | Quarto files invert the normal way of writing code and comments: in a typical R script, all text is assumed to be R code, unless you preface it with a `#` to show that it is a comment. 73 | In Quarto, all text is assumed to be prose, and you use special notation to indicate which lines are R code to be evaluated. 74 | Once the code is evaluated, the results get inserted into a final, rendered document, which can take one of various formats. 75 | 76 | ![Quarto workflow](fig/03-qmd-workflow.png) 77 | 78 | We don't have the time to go into the details of Quarto during this lesson, but recommend the ["Introduction to Reproducible Publications with RStudio" incubator (in-development) lesson](https://ucsbcarpentry.github.io/Reproducible-Publications-with-RStudio-Quarto/) for more on this topic. 79 | 80 | ## Recommended workflow 81 | 82 | Dynamic documents like Quarto (or R Markdown, the predecessor to Quarto) can actually be used to manage data analysis pipelines. 83 | But that is not recommended, because it doesn't scale well and lacks the sophisticated dependency tracking offered by `targets`.
84 | 85 | Our suggested approach is to conduct the vast majority of data analysis (in other words, the "heavy lifting") in the `targets` pipeline, then use the Quarto document to **summarize** and **plot** the results. 86 | 87 | ## Report on bill size in penguins 88 | 89 | Continuing our penguin bill size analysis, let's write a report evaluating each model. 90 | 91 | To save time, the report is already available at . 92 | 93 | Copy the [raw code from here](https://raw.githubusercontent.com/joelnitta/penguins-targets/main/penguin_report.qmd) and save it as a new file `penguin_report.qmd` in your project folder (you may also be able to right click in your browser and select "Save As"). 94 | 95 | Then, add one more target to the pipeline using the `tar_quarto()` function like this: 96 | 97 | ```{r} 98 | #| label = "example-penguins-show-1", 99 | #| eval = FALSE, 100 | #| code = readLines("files/plans/plan_11.R")[2:40] 101 | ``` 102 | 103 | ```{r} 104 | #| label: example-penguins-hide-1 105 | #| echo: FALSE 106 | #| eval: FALSE 107 | 108 | # FIXME 109 | # Skip eval until can figure out how to install quarto CLI in whatever is 110 | # compiling the lesson 111 | 112 | tar_dir({ 113 | library(quarto) 114 | readr::read_lines("https://raw.githubusercontent.com/joelnitta/penguins-targets/main/penguin_report.qmd") |> 115 | readr::write_lines("penguin_report.qmd") 116 | # Run it 117 | write_example_plan("plan_8.R") 118 | tar_make(reporter = "silent") 119 | write_example_plan("plan_11.R") 120 | tar_make() 121 | }) 122 | ``` 123 | 124 | The function to generate the report is `tar_quarto()`, from the `tarchetypes` package. 125 | 126 | As you can see, the "heavy" analysis of running the models is done in the workflow, then there is a single call to render the report at the end with `tar_quarto()`. 127 | 128 | ## How does `targets` know when to render the report? 
129 | 130 | It is not immediately apparent just from this how `targets` knows to generate the report **at the end of the workflow** (recall that build order is not determined by the order of how targets are written in the workflow, but rather by their dependencies). 131 | `penguin_report` does not appear to depend on any of the other targets, since they do not show up in the `tar_quarto()` call. 132 | 133 | How does this work? 134 | 135 | The answer lies **inside** the `penguin_report.qmd` file. Let's look at the start of the file: 136 | 137 | ```{r} 138 | #| label: show-penguin-report-qmd 139 | #| echo: FALSE 140 | #| results: 'asis' 141 | 142 | penguin_qmd <- readr::read_lines("https://raw.githubusercontent.com/joelnitta/penguins-targets/main/penguin_report.qmd") 143 | 144 | cat("````{.markdown}\n") 145 | cat(penguin_qmd[1:24], sep = "\n") 146 | cat("\n````") 147 | ``` 148 | 149 | The lines in between `---` and `---` at the very beginning are called the "YAML header", and contain directions about how to render the document. 150 | 151 | The R code to be executed is specified by the lines between `` ```{r} `` and `` ``` ``. This is called a "code chunk", since it is a portion of code interspersed within prose text. 152 | 153 | Take a closer look at the R code chunk. Notice the use of `targets::tar_load()`. Do you remember what that function does? It loads the targets built during the workflow. 154 | 155 | Now things should make a bit more sense: `targets` knows that the report depends on the targets built during the workflow like `combined_summary` and `species_summary` **because they are loaded in the report with `tar_load()`.** 156 | 157 | ## Generating dynamic content 158 | 159 | The call to `tar_load()` at the start of `penguin_report.qmd` is really the key to generating an up-to-date report---once those are loaded from the workflow, we know that they are in sync with the data, and can use them to produce "polished" text and plots. 
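The general pattern inside such a report can be sketched as follows (the target name `my_summary` and its `mean_value` column are hypothetical; `penguin_report.qmd` applies the same idea to its own targets):

````{.markdown}
```{r}
#| label: load-targets
# Load a target built by the workflow (hypothetical target name)
targets::tar_load(my_summary)
```

The mean value across all samples was `r round(my_summary$mean_value, 2)`.
````

Because the report reads `my_summary` with `tar_load()`, `targets` treats the report as depending on that target, and the rendered text and plots stay in sync with the pipeline.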
160 | 161 | ::::::::::::::::::::::::::::::::::::: {.challenge} 162 | 163 | ## Challenge: Spot the dynamic contents 164 | 165 | Read through `penguin_report.qmd` and try to find instances where the targets built during the workflow (`combined_summary`, etc.) are used to dynamically produce text and plots. 166 | 167 | :::::::::::::::::::::::::::::::::: {.solution} 168 | 169 | - In the code chunk labeled `results-stats`, statistics from the models like *R* squared are extracted, then inserted into the text with in-line code like `` `r knitr::inline_expr("combined_r2")` ``. 170 | 171 | - There are two figures, one for the combined model and one for the separate models (code chunks labeled `fig-combined-plot` and `fig-separate-plot`, respectively). These are built using the points predicted from the model in `combined_predictions` and `species_predictions`. 172 | 173 | :::::::::::::::::::::::::::::::::: 174 | 175 | ::::::::::::::::::::::::::::::::::::: 176 | 177 | You should also interactively run the code in `penguin_report.qmd` to better understand what is going on, starting with `tar_load()`. In fact, that is how this report was written: the code was run in an interactive session, and saved to the report as it was gradually tweaked to obtain the desired results. 178 | 179 | The best way to learn this approach to generating reports is to **try it yourself**. 180 | 181 | So your final Challenge is to construct a `targets` workflow using your own data and generate a report. Good luck! 
182 | 183 | ::::::::::::::::::::::::::::::::::::: keypoints 184 | 185 | - `tarchetypes::tar_quarto()` is used to render Quarto documents 186 | - You should load targets within the Quarto document using `tar_load()` and `tar_read()` 187 | - It is recommended to do heavy computations in the main targets workflow, and lighter formatting and plot generation in the Quarto document 188 | 189 | :::::::::::::::::::::::::::::::::::::::::::::::: 190 | -------------------------------------------------------------------------------- /index.md: -------------------------------------------------------------------------------- 1 | --- 2 | site: sandpaper::sandpaper_site 3 | --- 4 | 5 | This is a lesson about how to use the [targets](https://docs.ropensci.org/targets/) R package for maintaining efficient data analysis workflows. 6 | -------------------------------------------------------------------------------- /instructors/instructor-notes.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: 'Instructor Notes' 3 | --- 4 | 5 | ## General notes 6 | 7 | The examples gradually build up to a [full analysis](https://github.com/joelnitta/penguins-targets) of the [Palmer Penguins dataset](https://allisonhorst.github.io/palmerpenguins/). However, there are a few places where completely different code is demonstrated to explain certain concepts. Since a given `targets` project can only have one `_targets.R` file, this means the participants may have to delete their existing `_targets.R` file and write a new one to follow along with the examples. This may cause frustration if they can't keep a record of what they have done so far. One solution would be to save the old `_targets.R` file as `_targets_old.R` or similar, then rename it when it should be run again. 8 | 9 | ## Optional episodes: 10 | The "Function" episode is an optional episode and will depend on the learners coming to your workshop. 
11 | We would recommend asking for a show of hands (or stickies) to see who has experience with functions, and if you have learners who do not, run this episode. 12 | 13 | `targets` relies so heavily on functions that we believe the topic is worth a little extra time: learners inexperienced with functions will quickly fall behind and will not feel empowered to use `targets` at the end of the workshop if they don't get a short introduction. 14 | 15 | -------------------------------------------------------------------------------- /learners/reference.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: 'Reference' 3 | --- 4 | 5 | ## Glossary 6 | 7 | branch 8 | : A set of targets that are programmatically defined in the `targets` workflow 9 | 10 | reproducibility 11 | : The ability for others (including your future self) to be able to re-run an analysis and obtain the same results 12 | 13 | target 14 | : An object built by the `targets` workflow 15 | 16 | -------------------------------------------------------------------------------- /learners/setup.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Setup 3 | --- 4 | 5 | ## Local setup 6 | 7 | Follow these instructions to install the required software on your computer. 8 | 9 | - [Download and install the latest version of R](https://www.r-project.org/). 10 | - [Download and install RStudio](https://www.rstudio.com/products/rstudio/download/#download). RStudio is an application (an integrated development environment or IDE) that facilitates the use of R and offers a number of nice additional features, including the [Quarto](https://quarto.org/) publishing system. You will need the free Desktop version for your computer.
11 | - Install the necessary R packages with the following command: 12 | 13 | ```r 14 | install.packages( 15 | c( 16 | "conflicted", 17 | "crew", 18 | "palmerpenguins", 19 | "quarto", 20 | "tarchetypes", 21 | "targets", 22 | "tidyverse", 23 | "visNetwork" 24 | ) 25 | ) 26 | ``` 27 | 28 | ## Alternative: In the cloud 29 | 30 | There is a [Posit Cloud](https://posit.cloud/) instance with RStudio and all necessary packages pre-installed available, so you don't need to install anything on your own computer. You may need to create an account (free). 31 | 32 | Click this link to open: 33 | -------------------------------------------------------------------------------- /links.md: -------------------------------------------------------------------------------- 1 | 5 | 6 | [pandoc]: https://pandoc.org/MANUAL.html 7 | [r-markdown]: https://rmarkdown.rstudio.com/ 8 | [rstudio]: https://www.rstudio.com/ 9 | [carpentries-workbench]: https://carpentries.github.io/sandpaper-docs/ 10 | 11 | -------------------------------------------------------------------------------- /profiles/learner-profiles.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Learner Profiles 3 | --- 4 | 5 | These are fictional examples of the sort of learner expected to take this workshop. 6 | 7 | **Dayja** is a graduate student in evolutionary biology. 8 | She is familiar with R and writes many R scripts to conduct her analyses, but she often finds that it is difficult to remember which scripts to run in what order when she updates her data. 9 | 10 | **Jessie** is an undergraduate who is using Quarto to write their graduate thesis. 11 | They want to make sure all the results presented in the thesis come directly from code to avoid any errors and to make it easier for submission to a journal later. 12 | 13 | **Vincent** is a post-doc in bioinformatics. 14 | He has to orchestrate large workflows that run the same set of steps over many samples. 
15 | He wants to simplify his code to avoid repetition. 16 | -------------------------------------------------------------------------------- /renv/activate.R: -------------------------------------------------------------------------------- 1 | 2 | local({ 3 | 4 | # the requested version of renv 5 | version <- "1.1.2" 6 | attr(version, "sha") <- NULL 7 | 8 | # the project directory 9 | project <- Sys.getenv("RENV_PROJECT") 10 | if (!nzchar(project)) 11 | project <- getwd() 12 | 13 | # use start-up diagnostics if enabled 14 | diagnostics <- Sys.getenv("RENV_STARTUP_DIAGNOSTICS", unset = "FALSE") 15 | if (diagnostics) { 16 | start <- Sys.time() 17 | profile <- tempfile("renv-startup-", fileext = ".Rprof") 18 | utils::Rprof(profile) 19 | on.exit({ 20 | utils::Rprof(NULL) 21 | elapsed <- signif(difftime(Sys.time(), start, units = "auto"), digits = 2L) 22 | writeLines(sprintf("- renv took %s to run the autoloader.", format(elapsed))) 23 | writeLines(sprintf("- Profile: %s", profile)) 24 | print(utils::summaryRprof(profile)) 25 | }, add = TRUE) 26 | } 27 | 28 | # figure out whether the autoloader is enabled 29 | enabled <- local({ 30 | 31 | # first, check config option 32 | override <- getOption("renv.config.autoloader.enabled") 33 | if (!is.null(override)) 34 | return(override) 35 | 36 | # if we're being run in a context where R_LIBS is already set, 37 | # don't load -- presumably we're being run as a sub-process and 38 | # the parent process has already set up library paths for us 39 | rcmd <- Sys.getenv("R_CMD", unset = NA) 40 | rlibs <- Sys.getenv("R_LIBS", unset = NA) 41 | if (!is.na(rlibs) && !is.na(rcmd)) 42 | return(FALSE) 43 | 44 | # next, check environment variables 45 | # prefer using the configuration one in the future 46 | envvars <- c( 47 | "RENV_CONFIG_AUTOLOADER_ENABLED", 48 | "RENV_AUTOLOADER_ENABLED", 49 | "RENV_ACTIVATE_PROJECT" 50 | ) 51 | 52 | for (envvar in envvars) { 53 | envval <- Sys.getenv(envvar, unset = NA) 54 | if (!is.na(envval)) 55 | 
return(tolower(envval) %in% c("true", "t", "1")) 56 | } 57 | 58 | # enable by default 59 | TRUE 60 | 61 | }) 62 | 63 | # bail if we're not enabled 64 | if (!enabled) { 65 | 66 | # if we're not enabled, we might still need to manually load 67 | # the user profile here 68 | profile <- Sys.getenv("R_PROFILE_USER", unset = "~/.Rprofile") 69 | if (file.exists(profile)) { 70 | cfg <- Sys.getenv("RENV_CONFIG_USER_PROFILE", unset = "TRUE") 71 | if (tolower(cfg) %in% c("true", "t", "1")) 72 | sys.source(profile, envir = globalenv()) 73 | } 74 | 75 | return(FALSE) 76 | 77 | } 78 | 79 | # avoid recursion 80 | if (identical(getOption("renv.autoloader.running"), TRUE)) { 81 | warning("ignoring recursive attempt to run renv autoloader") 82 | return(invisible(TRUE)) 83 | } 84 | 85 | # signal that we're loading renv during R startup 86 | options(renv.autoloader.running = TRUE) 87 | on.exit(options(renv.autoloader.running = NULL), add = TRUE) 88 | 89 | # signal that we've consented to use renv 90 | options(renv.consent = TRUE) 91 | 92 | # load the 'utils' package eagerly -- this ensures that renv shims, which 93 | # mask 'utils' packages, will come first on the search path 94 | library(utils, lib.loc = .Library) 95 | 96 | # unload renv if it's already been loaded 97 | if ("renv" %in% loadedNamespaces()) 98 | unloadNamespace("renv") 99 | 100 | # load bootstrap tools 101 | ansify <- function(text) { 102 | if (renv_ansify_enabled()) 103 | renv_ansify_enhanced(text) 104 | else 105 | renv_ansify_default(text) 106 | } 107 | 108 | renv_ansify_enabled <- function() { 109 | 110 | override <- Sys.getenv("RENV_ANSIFY_ENABLED", unset = NA) 111 | if (!is.na(override)) 112 | return(as.logical(override)) 113 | 114 | pane <- Sys.getenv("RSTUDIO_CHILD_PROCESS_PANE", unset = NA) 115 | if (identical(pane, "build")) 116 | return(FALSE) 117 | 118 | testthat <- Sys.getenv("TESTTHAT", unset = "false") 119 | if (tolower(testthat) %in% "true") 120 | return(FALSE) 121 | 122 | iderun <- 
Sys.getenv("R_CLI_HAS_HYPERLINK_IDE_RUN", unset = "false") 123 | if (tolower(iderun) %in% "false") 124 | return(FALSE) 125 | 126 | TRUE 127 | 128 | } 129 | 130 | renv_ansify_default <- function(text) { 131 | text 132 | } 133 | 134 | renv_ansify_enhanced <- function(text) { 135 | 136 | # R help links 137 | pattern <- "`\\?(renv::(?:[^`])+)`" 138 | replacement <- "`\033]8;;x-r-help:\\1\a?\\1\033]8;;\a`" 139 | text <- gsub(pattern, replacement, text, perl = TRUE) 140 | 141 | # runnable code 142 | pattern <- "`(renv::(?:[^`])+)`" 143 | replacement <- "`\033]8;;x-r-run:\\1\a\\1\033]8;;\a`" 144 | text <- gsub(pattern, replacement, text, perl = TRUE) 145 | 146 | # return ansified text 147 | text 148 | 149 | } 150 | 151 | renv_ansify_init <- function() { 152 | 153 | envir <- renv_envir_self() 154 | if (renv_ansify_enabled()) 155 | assign("ansify", renv_ansify_enhanced, envir = envir) 156 | else 157 | assign("ansify", renv_ansify_default, envir = envir) 158 | 159 | } 160 | 161 | `%||%` <- function(x, y) { 162 | if (is.null(x)) y else x 163 | } 164 | 165 | catf <- function(fmt, ..., appendLF = TRUE) { 166 | 167 | quiet <- getOption("renv.bootstrap.quiet", default = FALSE) 168 | if (quiet) 169 | return(invisible()) 170 | 171 | msg <- sprintf(fmt, ...) 172 | cat(msg, file = stdout(), sep = if (appendLF) "\n" else "") 173 | 174 | invisible(msg) 175 | 176 | } 177 | 178 | header <- function(label, 179 | ..., 180 | prefix = "#", 181 | suffix = "-", 182 | n = min(getOption("width"), 78)) 183 | { 184 | label <- sprintf(label, ...) 
185 | n <- max(n - nchar(label) - nchar(prefix) - 2L, 8L) 186 | if (n <= 0) 187 | return(paste(prefix, label)) 188 | 189 | tail <- paste(rep.int(suffix, n), collapse = "") 190 | paste0(prefix, " ", label, " ", tail) 191 | 192 | } 193 | 194 | heredoc <- function(text, leave = 0) { 195 | 196 | # remove leading, trailing whitespace 197 | trimmed <- gsub("^\\s*\\n|\\n\\s*$", "", text) 198 | 199 | # split into lines 200 | lines <- strsplit(trimmed, "\n", fixed = TRUE)[[1L]] 201 | 202 | # compute common indent 203 | indent <- regexpr("[^[:space:]]", lines) 204 | common <- min(setdiff(indent, -1L)) - leave 205 | text <- paste(substring(lines, common), collapse = "\n") 206 | 207 | # substitute in ANSI links for executable renv code 208 | ansify(text) 209 | 210 | } 211 | 212 | bootstrap <- function(version, library) { 213 | 214 | friendly <- renv_bootstrap_version_friendly(version) 215 | section <- header(sprintf("Bootstrapping renv %s", friendly)) 216 | catf(section) 217 | 218 | # attempt to download renv 219 | catf("- Downloading renv ... ", appendLF = FALSE) 220 | withCallingHandlers( 221 | tarball <- renv_bootstrap_download(version), 222 | error = function(err) { 223 | catf("FAILED") 224 | stop("failed to download:\n", conditionMessage(err)) 225 | } 226 | ) 227 | catf("OK") 228 | on.exit(unlink(tarball), add = TRUE) 229 | 230 | # now attempt to install 231 | catf("- Installing renv ... 
", appendLF = FALSE) 232 | withCallingHandlers( 233 | status <- renv_bootstrap_install(version, tarball, library), 234 | error = function(err) { 235 | catf("FAILED") 236 | stop("failed to install:\n", conditionMessage(err)) 237 | } 238 | ) 239 | catf("OK") 240 | 241 | # add empty line to break up bootstrapping from normal output 242 | catf("") 243 | 244 | return(invisible()) 245 | } 246 | 247 | renv_bootstrap_tests_running <- function() { 248 | getOption("renv.tests.running", default = FALSE) 249 | } 250 | 251 | renv_bootstrap_repos <- function() { 252 | 253 | # get CRAN repository 254 | cran <- getOption("renv.repos.cran", "https://cloud.r-project.org") 255 | 256 | # check for repos override 257 | repos <- Sys.getenv("RENV_CONFIG_REPOS_OVERRIDE", unset = NA) 258 | if (!is.na(repos)) { 259 | 260 | # check for RSPM; if set, use a fallback repository for renv 261 | rspm <- Sys.getenv("RSPM", unset = NA) 262 | if (identical(rspm, repos)) 263 | repos <- c(RSPM = rspm, CRAN = cran) 264 | 265 | return(repos) 266 | 267 | } 268 | 269 | # check for lockfile repositories 270 | repos <- tryCatch(renv_bootstrap_repos_lockfile(), error = identity) 271 | if (!inherits(repos, "error") && length(repos)) 272 | return(repos) 273 | 274 | # retrieve current repos 275 | repos <- getOption("repos") 276 | 277 | # ensure @CRAN@ entries are resolved 278 | repos[repos == "@CRAN@"] <- cran 279 | 280 | # add in renv.bootstrap.repos if set 281 | default <- c(FALLBACK = "https://cloud.r-project.org") 282 | extra <- getOption("renv.bootstrap.repos", default = default) 283 | repos <- c(repos, extra) 284 | 285 | # remove duplicates that might've snuck in 286 | dupes <- duplicated(repos) | duplicated(names(repos)) 287 | repos[!dupes] 288 | 289 | } 290 | 291 | renv_bootstrap_repos_lockfile <- function() { 292 | 293 | lockpath <- Sys.getenv("RENV_PATHS_LOCKFILE", unset = "renv.lock") 294 | if (!file.exists(lockpath)) 295 | return(NULL) 296 | 297 | lockfile <- tryCatch(renv_json_read(lockpath), error 
= identity) 298 | if (inherits(lockfile, "error")) { 299 | warning(lockfile) 300 | return(NULL) 301 | } 302 | 303 | repos <- lockfile$R$Repositories 304 | if (length(repos) == 0) 305 | return(NULL) 306 | 307 | keys <- vapply(repos, `[[`, "Name", FUN.VALUE = character(1)) 308 | vals <- vapply(repos, `[[`, "URL", FUN.VALUE = character(1)) 309 | names(vals) <- keys 310 | 311 | return(vals) 312 | 313 | } 314 | 315 | renv_bootstrap_download <- function(version) { 316 | 317 | sha <- attr(version, "sha", exact = TRUE) 318 | 319 | methods <- if (!is.null(sha)) { 320 | 321 | # attempting to bootstrap a development version of renv 322 | c( 323 | function() renv_bootstrap_download_tarball(sha), 324 | function() renv_bootstrap_download_github(sha) 325 | ) 326 | 327 | } else { 328 | 329 | # attempting to bootstrap a release version of renv 330 | c( 331 | function() renv_bootstrap_download_tarball(version), 332 | function() renv_bootstrap_download_cran_latest(version), 333 | function() renv_bootstrap_download_cran_archive(version) 334 | ) 335 | 336 | } 337 | 338 | for (method in methods) { 339 | path <- tryCatch(method(), error = identity) 340 | if (is.character(path) && file.exists(path)) 341 | return(path) 342 | } 343 | 344 | stop("All download methods failed") 345 | 346 | } 347 | 348 | renv_bootstrap_download_impl <- function(url, destfile) { 349 | 350 | mode <- "wb" 351 | 352 | # https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17715 353 | fixup <- 354 | Sys.info()[["sysname"]] == "Windows" && 355 | substring(url, 1L, 5L) == "file:" 356 | 357 | if (fixup) 358 | mode <- "w+b" 359 | 360 | args <- list( 361 | url = url, 362 | destfile = destfile, 363 | mode = mode, 364 | quiet = TRUE 365 | ) 366 | 367 | if ("headers" %in% names(formals(utils::download.file))) { 368 | headers <- renv_bootstrap_download_custom_headers(url) 369 | if (length(headers) && is.character(headers)) 370 | args$headers <- headers 371 | } 372 | 373 | do.call(utils::download.file, args) 374 | 375 | } 376 
| 377 | renv_bootstrap_download_custom_headers <- function(url) { 378 | 379 | headers <- getOption("renv.download.headers") 380 | if (is.null(headers)) 381 | return(character()) 382 | 383 | if (!is.function(headers)) 384 | stopf("'renv.download.headers' is not a function") 385 | 386 | headers <- headers(url) 387 | if (length(headers) == 0L) 388 | return(character()) 389 | 390 | if (is.list(headers)) 391 | headers <- unlist(headers, recursive = FALSE, use.names = TRUE) 392 | 393 | ok <- 394 | is.character(headers) && 395 | is.character(names(headers)) && 396 | all(nzchar(names(headers))) 397 | 398 | if (!ok) 399 | stop("invocation of 'renv.download.headers' did not return a named character vector") 400 | 401 | headers 402 | 403 | } 404 | 405 | renv_bootstrap_download_cran_latest <- function(version) { 406 | 407 | spec <- renv_bootstrap_download_cran_latest_find(version) 408 | type <- spec$type 409 | repos <- spec$repos 410 | 411 | baseurl <- utils::contrib.url(repos = repos, type = type) 412 | ext <- if (identical(type, "source")) 413 | ".tar.gz" 414 | else if (Sys.info()[["sysname"]] == "Windows") 415 | ".zip" 416 | else 417 | ".tgz" 418 | name <- sprintf("renv_%s%s", version, ext) 419 | url <- paste(baseurl, name, sep = "/") 420 | 421 | destfile <- file.path(tempdir(), name) 422 | status <- tryCatch( 423 | renv_bootstrap_download_impl(url, destfile), 424 | condition = identity 425 | ) 426 | 427 | if (inherits(status, "condition")) 428 | return(FALSE) 429 | 430 | # report success and return 431 | destfile 432 | 433 | } 434 | 435 | renv_bootstrap_download_cran_latest_find <- function(version) { 436 | 437 | # check whether binaries are supported on this system 438 | binary <- 439 | getOption("renv.bootstrap.binary", default = TRUE) && 440 | !identical(.Platform$pkgType, "source") && 441 | !identical(getOption("pkgType"), "source") && 442 | Sys.info()[["sysname"]] %in% c("Darwin", "Windows") 443 | 444 | types <- c(if (binary) "binary", "source") 445 | 446 | # iterate 
over types + repositories 447 | for (type in types) { 448 | for (repos in renv_bootstrap_repos()) { 449 | 450 | # build arguments for utils::available.packages() call 451 | args <- list(type = type, repos = repos) 452 | 453 | # add custom headers if available -- note that 454 | # utils::available.packages() will pass this to download.file() 455 | if ("headers" %in% names(formals(utils::download.file))) { 456 | headers <- renv_bootstrap_download_custom_headers(repos) 457 | if (length(headers) && is.character(headers)) 458 | args$headers <- headers 459 | } 460 | 461 | # retrieve package database 462 | db <- tryCatch( 463 | as.data.frame( 464 | do.call(utils::available.packages, args), 465 | stringsAsFactors = FALSE 466 | ), 467 | error = identity 468 | ) 469 | 470 | if (inherits(db, "error")) 471 | next 472 | 473 | # check for compatible entry 474 | entry <- db[db$Package %in% "renv" & db$Version %in% version, ] 475 | if (nrow(entry) == 0) 476 | next 477 | 478 | # found it; return spec to caller 479 | spec <- list(entry = entry, type = type, repos = repos) 480 | return(spec) 481 | 482 | } 483 | } 484 | 485 | # if we got here, we failed to find renv 486 | fmt <- "renv %s is not available from your declared package repositories" 487 | stop(sprintf(fmt, version)) 488 | 489 | } 490 | 491 | renv_bootstrap_download_cran_archive <- function(version) { 492 | 493 | name <- sprintf("renv_%s.tar.gz", version) 494 | repos <- renv_bootstrap_repos() 495 | urls <- file.path(repos, "src/contrib/Archive/renv", name) 496 | destfile <- file.path(tempdir(), name) 497 | 498 | for (url in urls) { 499 | 500 | status <- tryCatch( 501 | renv_bootstrap_download_impl(url, destfile), 502 | condition = identity 503 | ) 504 | 505 | if (identical(status, 0L)) 506 | return(destfile) 507 | 508 | } 509 | 510 | return(FALSE) 511 | 512 | } 513 | 514 | renv_bootstrap_download_tarball <- function(version) { 515 | 516 | # if the user has provided the path to a tarball via 517 | # an environment variable, 
then use it 518 | tarball <- Sys.getenv("RENV_BOOTSTRAP_TARBALL", unset = NA) 519 | if (is.na(tarball)) 520 | return() 521 | 522 | # allow directories 523 | if (dir.exists(tarball)) { 524 | name <- sprintf("renv_%s.tar.gz", version) 525 | tarball <- file.path(tarball, name) 526 | } 527 | 528 | # bail if it doesn't exist 529 | if (!file.exists(tarball)) { 530 | 531 | # let the user know we weren't able to honour their request 532 | fmt <- "- RENV_BOOTSTRAP_TARBALL is set (%s) but does not exist." 533 | msg <- sprintf(fmt, tarball) 534 | warning(msg) 535 | 536 | # bail 537 | return() 538 | 539 | } 540 | 541 | catf("- Using local tarball '%s'.", tarball) 542 | tarball 543 | 544 | } 545 | 546 | renv_bootstrap_github_token <- function() { 547 | for (envvar in c("GITHUB_TOKEN", "GITHUB_PAT", "GH_TOKEN")) { 548 | envval <- Sys.getenv(envvar, unset = NA) 549 | if (!is.na(envval)) 550 | return(envval) 551 | } 552 | } 553 | 554 | renv_bootstrap_download_github <- function(version) { 555 | 556 | enabled <- Sys.getenv("RENV_BOOTSTRAP_FROM_GITHUB", unset = "TRUE") 557 | if (!identical(enabled, "TRUE")) 558 | return(FALSE) 559 | 560 | # prepare download options 561 | token <- renv_bootstrap_github_token() 562 | if (is.null(token)) 563 | token <- "" 564 | 565 | if (nzchar(Sys.which("curl")) && nzchar(token)) { 566 | fmt <- "--location --fail --header \"Authorization: token %s\"" 567 | extra <- sprintf(fmt, token) 568 | saved <- options("download.file.method", "download.file.extra") 569 | options(download.file.method = "curl", download.file.extra = extra) 570 | on.exit(do.call(base::options, saved), add = TRUE) 571 | } else if (nzchar(Sys.which("wget")) && nzchar(token)) { 572 | fmt <- "--header=\"Authorization: token %s\"" 573 | extra <- sprintf(fmt, token) 574 | saved <- options("download.file.method", "download.file.extra") 575 | options(download.file.method = "wget", download.file.extra = extra) 576 | on.exit(do.call(base::options, saved), add = TRUE) 577 | } 578 | 579 | url 
<- file.path("https://api.github.com/repos/rstudio/renv/tarball", version) 580 | name <- sprintf("renv_%s.tar.gz", version) 581 | destfile <- file.path(tempdir(), name) 582 | 583 | status <- tryCatch( 584 | renv_bootstrap_download_impl(url, destfile), 585 | condition = identity 586 | ) 587 | 588 | if (!identical(status, 0L)) 589 | return(FALSE) 590 | 591 | renv_bootstrap_download_augment(destfile) 592 | 593 | return(destfile) 594 | 595 | } 596 | 597 | # Add Sha to DESCRIPTION. This is stop gap until #890, after which we 598 | # can use renv::install() to fully capture metadata. 599 | renv_bootstrap_download_augment <- function(destfile) { 600 | sha <- renv_bootstrap_git_extract_sha1_tar(destfile) 601 | if (is.null(sha)) { 602 | return() 603 | } 604 | 605 | # Untar 606 | tempdir <- tempfile("renv-github-") 607 | on.exit(unlink(tempdir, recursive = TRUE), add = TRUE) 608 | untar(destfile, exdir = tempdir) 609 | pkgdir <- dir(tempdir, full.names = TRUE)[[1]] 610 | 611 | # Modify description 612 | desc_path <- file.path(pkgdir, "DESCRIPTION") 613 | desc_lines <- readLines(desc_path) 614 | remotes_fields <- c( 615 | "RemoteType: github", 616 | "RemoteHost: api.github.com", 617 | "RemoteRepo: renv", 618 | "RemoteUsername: rstudio", 619 | "RemotePkgRef: rstudio/renv", 620 | paste("RemoteRef: ", sha), 621 | paste("RemoteSha: ", sha) 622 | ) 623 | writeLines(c(desc_lines[desc_lines != ""], remotes_fields), con = desc_path) 624 | 625 | # Re-tar 626 | local({ 627 | old <- setwd(tempdir) 628 | on.exit(setwd(old), add = TRUE) 629 | 630 | tar(destfile, compression = "gzip") 631 | }) 632 | invisible() 633 | } 634 | 635 | # Extract the commit hash from a git archive. Git archives include the SHA1 636 | # hash as the comment field of the tarball pax extended header 637 | # (see https://www.kernel.org/pub/software/scm/git/docs/git-archive.html) 638 | # For GitHub archives this should be the first header after the default one 639 | # (512 byte) header. 
640 | renv_bootstrap_git_extract_sha1_tar <- function(bundle) { 641 | 642 | # open the bundle for reading 643 | # We use gzcon for everything because (from ?gzcon) 644 | # > Reading from a connection which does not supply a 'gzip' magic 645 | # > header is equivalent to reading from the original connection 646 | conn <- gzcon(file(bundle, open = "rb", raw = TRUE)) 647 | on.exit(close(conn)) 648 | 649 | # The default pax header is 512 bytes long and the first pax extended header 650 | # with the comment should be 51 bytes long 651 | # `52 comment=` (11 chars) + 40 byte SHA1 hash 652 | len <- 0x200 + 0x33 653 | res <- rawToChar(readBin(conn, "raw", n = len)[0x201:len]) 654 | 655 | if (grepl("^52 comment=", res)) { 656 | sub("52 comment=", "", res) 657 | } else { 658 | NULL 659 | } 660 | } 661 | 662 | renv_bootstrap_install <- function(version, tarball, library) { 663 | 664 | # attempt to install it into project library 665 | dir.create(library, showWarnings = FALSE, recursive = TRUE) 666 | output <- renv_bootstrap_install_impl(library, tarball) 667 | 668 | # check for successful install 669 | status <- attr(output, "status") 670 | if (is.null(status) || identical(status, 0L)) 671 | return(status) 672 | 673 | # an error occurred; report it 674 | header <- "installation of renv failed" 675 | lines <- paste(rep.int("=", nchar(header)), collapse = "") 676 | text <- paste(c(header, lines, output), collapse = "\n") 677 | stop(text) 678 | 679 | } 680 | 681 | renv_bootstrap_install_impl <- function(library, tarball) { 682 | 683 | # invoke using system2 so we can capture and report output 684 | bin <- R.home("bin") 685 | exe <- if (Sys.info()[["sysname"]] == "Windows") "R.exe" else "R" 686 | R <- file.path(bin, exe) 687 | 688 | args <- c( 689 | "--vanilla", "CMD", "INSTALL", "--no-multiarch", 690 | "-l", shQuote(path.expand(library)), 691 | shQuote(path.expand(tarball)) 692 | ) 693 | 694 | system2(R, args, stdout = TRUE, stderr = TRUE) 695 | 696 | } 697 | 698 | 
renv_bootstrap_platform_prefix <- function() { 699 | 700 | # construct version prefix 701 | version <- paste(R.version$major, R.version$minor, sep = ".") 702 | prefix <- paste("R", numeric_version(version)[1, 1:2], sep = "-") 703 | 704 | # include SVN revision for development versions of R 705 | # (to avoid sharing platform-specific artefacts with released versions of R) 706 | devel <- 707 | identical(R.version[["status"]], "Under development (unstable)") || 708 | identical(R.version[["nickname"]], "Unsuffered Consequences") 709 | 710 | if (devel) 711 | prefix <- paste(prefix, R.version[["svn rev"]], sep = "-r") 712 | 713 | # build list of path components 714 | components <- c(prefix, R.version$platform) 715 | 716 | # include prefix if provided by user 717 | prefix <- renv_bootstrap_platform_prefix_impl() 718 | if (!is.na(prefix) && nzchar(prefix)) 719 | components <- c(prefix, components) 720 | 721 | # build prefix 722 | paste(components, collapse = "/") 723 | 724 | } 725 | 726 | renv_bootstrap_platform_prefix_impl <- function() { 727 | 728 | # if an explicit prefix has been supplied, use it 729 | prefix <- Sys.getenv("RENV_PATHS_PREFIX", unset = NA) 730 | if (!is.na(prefix)) 731 | return(prefix) 732 | 733 | # if the user has requested an automatic prefix, generate it 734 | auto <- Sys.getenv("RENV_PATHS_PREFIX_AUTO", unset = NA) 735 | if (is.na(auto) && getRversion() >= "4.4.0") 736 | auto <- "TRUE" 737 | 738 | if (auto %in% c("TRUE", "True", "true", "1")) 739 | return(renv_bootstrap_platform_prefix_auto()) 740 | 741 | # empty string on failure 742 | "" 743 | 744 | } 745 | 746 | renv_bootstrap_platform_prefix_auto <- function() { 747 | 748 | prefix <- tryCatch(renv_bootstrap_platform_os(), error = identity) 749 | if (inherits(prefix, "error") || prefix %in% "unknown") { 750 | 751 | msg <- paste( 752 | "failed to infer current operating system", 753 | "please file a bug report at https://github.com/rstudio/renv/issues", 754 | sep = "; " 755 | ) 756 | 757 | 
warning(msg) 758 | 759 | } 760 | 761 | prefix 762 | 763 | } 764 | 765 | renv_bootstrap_platform_os <- function() { 766 | 767 | sysinfo <- Sys.info() 768 | sysname <- sysinfo[["sysname"]] 769 | 770 | # handle Windows + macOS up front 771 | if (sysname == "Windows") 772 | return("windows") 773 | else if (sysname == "Darwin") 774 | return("macos") 775 | 776 | # check for os-release files 777 | for (file in c("/etc/os-release", "/usr/lib/os-release")) 778 | if (file.exists(file)) 779 | return(renv_bootstrap_platform_os_via_os_release(file, sysinfo)) 780 | 781 | # check for redhat-release files 782 | if (file.exists("/etc/redhat-release")) 783 | return(renv_bootstrap_platform_os_via_redhat_release()) 784 | 785 | "unknown" 786 | 787 | } 788 | 789 | renv_bootstrap_platform_os_via_os_release <- function(file, sysinfo) { 790 | 791 | # read /etc/os-release 792 | release <- utils::read.table( 793 | file = file, 794 | sep = "=", 795 | quote = c("\"", "'"), 796 | col.names = c("Key", "Value"), 797 | comment.char = "#", 798 | stringsAsFactors = FALSE 799 | ) 800 | 801 | vars <- as.list(release$Value) 802 | names(vars) <- release$Key 803 | 804 | # get os name 805 | os <- tolower(sysinfo[["sysname"]]) 806 | 807 | # read id 808 | id <- "unknown" 809 | for (field in c("ID", "ID_LIKE")) { 810 | if (field %in% names(vars) && nzchar(vars[[field]])) { 811 | id <- vars[[field]] 812 | break 813 | } 814 | } 815 | 816 | # read version 817 | version <- "unknown" 818 | for (field in c("UBUNTU_CODENAME", "VERSION_CODENAME", "VERSION_ID", "BUILD_ID")) { 819 | if (field %in% names(vars) && nzchar(vars[[field]])) { 820 | version <- vars[[field]] 821 | break 822 | } 823 | } 824 | 825 | # join together 826 | paste(c(os, id, version), collapse = "-") 827 | 828 | } 829 | 830 | renv_bootstrap_platform_os_via_redhat_release <- function() { 831 | 832 | # read /etc/redhat-release 833 | contents <- readLines("/etc/redhat-release", warn = FALSE) 834 | 835 | # infer id 836 | id <- if (grepl("centos", 
contents, ignore.case = TRUE)) 837 | "centos" 838 | else if (grepl("redhat", contents, ignore.case = TRUE)) 839 | "redhat" 840 | else 841 | "unknown" 842 | 843 | # try to find a version component (very hacky) 844 | version <- "unknown" 845 | 846 | parts <- strsplit(contents, "[[:space:]]")[[1L]] 847 | for (part in parts) { 848 | 849 | nv <- tryCatch(numeric_version(part), error = identity) 850 | if (inherits(nv, "error")) 851 | next 852 | 853 | version <- nv[1, 1] 854 | break 855 | 856 | } 857 | 858 | paste(c("linux", id, version), collapse = "-") 859 | 860 | } 861 | 862 | renv_bootstrap_library_root_name <- function(project) { 863 | 864 | # use project name as-is if requested 865 | asis <- Sys.getenv("RENV_PATHS_LIBRARY_ROOT_ASIS", unset = "FALSE") 866 | if (asis) 867 | return(basename(project)) 868 | 869 | # otherwise, disambiguate based on project's path 870 | id <- substring(renv_bootstrap_hash_text(project), 1L, 8L) 871 | paste(basename(project), id, sep = "-") 872 | 873 | } 874 | 875 | renv_bootstrap_library_root <- function(project) { 876 | 877 | prefix <- renv_bootstrap_profile_prefix() 878 | 879 | path <- Sys.getenv("RENV_PATHS_LIBRARY", unset = NA) 880 | if (!is.na(path)) 881 | return(paste(c(path, prefix), collapse = "/")) 882 | 883 | path <- renv_bootstrap_library_root_impl(project) 884 | if (!is.null(path)) { 885 | name <- renv_bootstrap_library_root_name(project) 886 | return(paste(c(path, prefix, name), collapse = "/")) 887 | } 888 | 889 | renv_bootstrap_paths_renv("library", project = project) 890 | 891 | } 892 | 893 | renv_bootstrap_library_root_impl <- function(project) { 894 | 895 | root <- Sys.getenv("RENV_PATHS_LIBRARY_ROOT", unset = NA) 896 | if (!is.na(root)) 897 | return(root) 898 | 899 | type <- renv_bootstrap_project_type(project) 900 | if (identical(type, "package")) { 901 | userdir <- renv_bootstrap_user_dir() 902 | return(file.path(userdir, "library")) 903 | } 904 | 905 | } 906 | 907 | renv_bootstrap_validate_version <- 
function(version, description = NULL) { 908 | 909 | # resolve description file 910 | # 911 | # avoid passing lib.loc to `packageDescription()` below, since R will 912 | # use the loaded version of the package by default anyhow. note that 913 | # this function should only be called after 'renv' is loaded 914 | # https://github.com/rstudio/renv/issues/1625 915 | description <- description %||% packageDescription("renv") 916 | 917 | # check whether requested version 'version' matches loaded version of renv 918 | sha <- attr(version, "sha", exact = TRUE) 919 | valid <- if (!is.null(sha)) 920 | renv_bootstrap_validate_version_dev(sha, description) 921 | else 922 | renv_bootstrap_validate_version_release(version, description) 923 | 924 | if (valid) 925 | return(TRUE) 926 | 927 | # the loaded version of renv doesn't match the requested version; 928 | # give the user instructions on how to proceed 929 | dev <- identical(description[["RemoteType"]], "github") 930 | remote <- if (dev) 931 | paste("rstudio/renv", description[["RemoteSha"]], sep = "@") 932 | else 933 | paste("renv", description[["Version"]], sep = "@") 934 | 935 | # display both loaded version + sha if available 936 | friendly <- renv_bootstrap_version_friendly( 937 | version = description[["Version"]], 938 | sha = if (dev) description[["RemoteSha"]] 939 | ) 940 | 941 | fmt <- heredoc(" 942 | renv %1$s was loaded from project library, but this project is configured to use renv %2$s. 943 | - Use `renv::record(\"%3$s\")` to record renv %1$s in the lockfile. 944 | - Use `renv::restore(packages = \"renv\")` to install renv %2$s into the project library. 
945 | ") 946 | catf(fmt, friendly, renv_bootstrap_version_friendly(version), remote) 947 | 948 | FALSE 949 | 950 | } 951 | 952 | renv_bootstrap_validate_version_dev <- function(version, description) { 953 | 954 | expected <- description[["RemoteSha"]] 955 | if (!is.character(expected)) 956 | return(FALSE) 957 | 958 | pattern <- sprintf("^\\Q%s\\E", version) 959 | grepl(pattern, expected, perl = TRUE) 960 | 961 | } 962 | 963 | renv_bootstrap_validate_version_release <- function(version, description) { 964 | expected <- description[["Version"]] 965 | is.character(expected) && identical(expected, version) 966 | } 967 | 968 | renv_bootstrap_hash_text <- function(text) { 969 | 970 | hashfile <- tempfile("renv-hash-") 971 | on.exit(unlink(hashfile), add = TRUE) 972 | 973 | writeLines(text, con = hashfile) 974 | tools::md5sum(hashfile) 975 | 976 | } 977 | 978 | renv_bootstrap_load <- function(project, libpath, version) { 979 | 980 | # try to load renv from the project library 981 | if (!requireNamespace("renv", lib.loc = libpath, quietly = TRUE)) 982 | return(FALSE) 983 | 984 | # warn if the version of renv loaded does not match 985 | renv_bootstrap_validate_version(version) 986 | 987 | # execute renv load hooks, if any 988 | hooks <- getHook("renv::autoload") 989 | for (hook in hooks) 990 | if (is.function(hook)) 991 | tryCatch(hook(), error = warnify) 992 | 993 | # load the project 994 | renv::load(project) 995 | 996 | TRUE 997 | 998 | } 999 | 1000 | renv_bootstrap_profile_load <- function(project) { 1001 | 1002 | # if RENV_PROFILE is already set, just use that 1003 | profile <- Sys.getenv("RENV_PROFILE", unset = NA) 1004 | if (!is.na(profile) && nzchar(profile)) 1005 | return(profile) 1006 | 1007 | # check for a profile file (nothing to do if it doesn't exist) 1008 | path <- renv_bootstrap_paths_renv("profile", profile = FALSE, project = project) 1009 | if (!file.exists(path)) 1010 | return(NULL) 1011 | 1012 | # read the profile, and set it if it exists 1013 | contents 
<- readLines(path, warn = FALSE) 1014 | if (length(contents) == 0L) 1015 | return(NULL) 1016 | 1017 | # set RENV_PROFILE 1018 | profile <- contents[[1L]] 1019 | if (!profile %in% c("", "default")) 1020 | Sys.setenv(RENV_PROFILE = profile) 1021 | 1022 | profile 1023 | 1024 | } 1025 | 1026 | renv_bootstrap_profile_prefix <- function() { 1027 | profile <- renv_bootstrap_profile_get() 1028 | if (!is.null(profile)) 1029 | return(file.path("profiles", profile, "renv")) 1030 | } 1031 | 1032 | renv_bootstrap_profile_get <- function() { 1033 | profile <- Sys.getenv("RENV_PROFILE", unset = "") 1034 | renv_bootstrap_profile_normalize(profile) 1035 | } 1036 | 1037 | renv_bootstrap_profile_set <- function(profile) { 1038 | profile <- renv_bootstrap_profile_normalize(profile) 1039 | if (is.null(profile)) 1040 | Sys.unsetenv("RENV_PROFILE") 1041 | else 1042 | Sys.setenv(RENV_PROFILE = profile) 1043 | } 1044 | 1045 | renv_bootstrap_profile_normalize <- function(profile) { 1046 | 1047 | if (is.null(profile) || profile %in% c("", "default")) 1048 | return(NULL) 1049 | 1050 | profile 1051 | 1052 | } 1053 | 1054 | renv_bootstrap_path_absolute <- function(path) { 1055 | 1056 | substr(path, 1L, 1L) %in% c("~", "/", "\\") || ( 1057 | substr(path, 1L, 1L) %in% c(letters, LETTERS) && 1058 | substr(path, 2L, 3L) %in% c(":/", ":\\") 1059 | ) 1060 | 1061 | } 1062 | 1063 | renv_bootstrap_paths_renv <- function(..., profile = TRUE, project = NULL) { 1064 | renv <- Sys.getenv("RENV_PATHS_RENV", unset = "renv") 1065 | root <- if (renv_bootstrap_path_absolute(renv)) NULL else project 1066 | prefix <- if (profile) renv_bootstrap_profile_prefix() 1067 | components <- c(root, renv, prefix, ...) 
1068 | paste(components, collapse = "/") 1069 | } 1070 | 1071 | renv_bootstrap_project_type <- function(path) { 1072 | 1073 | descpath <- file.path(path, "DESCRIPTION") 1074 | if (!file.exists(descpath)) 1075 | return("unknown") 1076 | 1077 | desc <- tryCatch( 1078 | read.dcf(descpath, all = TRUE), 1079 | error = identity 1080 | ) 1081 | 1082 | if (inherits(desc, "error")) 1083 | return("unknown") 1084 | 1085 | type <- desc$Type 1086 | if (!is.null(type)) 1087 | return(tolower(type)) 1088 | 1089 | package <- desc$Package 1090 | if (!is.null(package)) 1091 | return("package") 1092 | 1093 | "unknown" 1094 | 1095 | } 1096 | 1097 | renv_bootstrap_user_dir <- function() { 1098 | dir <- renv_bootstrap_user_dir_impl() 1099 | path.expand(chartr("\\", "/", dir)) 1100 | } 1101 | 1102 | renv_bootstrap_user_dir_impl <- function() { 1103 | 1104 | # use local override if set 1105 | override <- getOption("renv.userdir.override") 1106 | if (!is.null(override)) 1107 | return(override) 1108 | 1109 | # use R_user_dir if available 1110 | tools <- asNamespace("tools") 1111 | if (is.function(tools$R_user_dir)) 1112 | return(tools$R_user_dir("renv", "cache")) 1113 | 1114 | # try using our own backfill for older versions of R 1115 | envvars <- c("R_USER_CACHE_DIR", "XDG_CACHE_HOME") 1116 | for (envvar in envvars) { 1117 | root <- Sys.getenv(envvar, unset = NA) 1118 | if (!is.na(root)) 1119 | return(file.path(root, "R/renv")) 1120 | } 1121 | 1122 | # use platform-specific default fallbacks 1123 | if (Sys.info()[["sysname"]] == "Windows") 1124 | file.path(Sys.getenv("LOCALAPPDATA"), "R/cache/R/renv") 1125 | else if (Sys.info()[["sysname"]] == "Darwin") 1126 | "~/Library/Caches/org.R-project.R/R/renv" 1127 | else 1128 | "~/.cache/R/renv" 1129 | 1130 | } 1131 | 1132 | renv_bootstrap_version_friendly <- function(version, shafmt = NULL, sha = NULL) { 1133 | sha <- sha %||% attr(version, "sha", exact = TRUE) 1134 | parts <- c(version, sprintf(shafmt %||% " [sha: %s]", substring(sha, 1L, 7L))) 
1135 | paste(parts, collapse = "") 1136 | } 1137 | 1138 | renv_bootstrap_exec <- function(project, libpath, version) { 1139 | if (!renv_bootstrap_load(project, libpath, version)) 1140 | renv_bootstrap_run(project, libpath, version) 1141 | } 1142 | 1143 | renv_bootstrap_run <- function(project, libpath, version) { 1144 | 1145 | # perform bootstrap 1146 | bootstrap(version, libpath) 1147 | 1148 | # exit early if we're just testing bootstrap 1149 | if (!is.na(Sys.getenv("RENV_BOOTSTRAP_INSTALL_ONLY", unset = NA))) 1150 | return(TRUE) 1151 | 1152 | # try again to load 1153 | if (requireNamespace("renv", lib.loc = libpath, quietly = TRUE)) { 1154 | return(renv::load(project = project)) 1155 | } 1156 | 1157 | # failed to download or load renv; warn the user 1158 | msg <- c( 1159 | "Failed to find an renv installation: the project will not be loaded.", 1160 | "Use `renv::activate()` to re-initialize the project." 1161 | ) 1162 | 1163 | warning(paste(msg, collapse = "\n"), call. = FALSE) 1164 | 1165 | } 1166 | 1167 | renv_json_read <- function(file = NULL, text = NULL) { 1168 | 1169 | jlerr <- NULL 1170 | 1171 | # if jsonlite is loaded, use that instead 1172 | if ("jsonlite" %in% loadedNamespaces()) { 1173 | 1174 | json <- tryCatch(renv_json_read_jsonlite(file, text), error = identity) 1175 | if (!inherits(json, "error")) 1176 | return(json) 1177 | 1178 | jlerr <- json 1179 | 1180 | } 1181 | 1182 | # otherwise, fall back to the default JSON reader 1183 | json <- tryCatch(renv_json_read_default(file, text), error = identity) 1184 | if (!inherits(json, "error")) 1185 | return(json) 1186 | 1187 | # report an error 1188 | if (!is.null(jlerr)) 1189 | stop(jlerr) 1190 | else 1191 | stop(json) 1192 | 1193 | } 1194 | 1195 | renv_json_read_jsonlite <- function(file = NULL, text = NULL) { 1196 | text <- paste(text %||% readLines(file, warn = FALSE), collapse = "\n") 1197 | jsonlite::fromJSON(txt = text, simplifyVector = FALSE) 1198 | } 1199 | 1200 | renv_json_read_patterns <- 
function() { 1201 | 1202 | list( 1203 | 1204 | # objects 1205 | list("{", "\t\n\tobject(\t\n\t"), 1206 | list("}", "\t\n\t)\t\n\t"), 1207 | 1208 | # arrays 1209 | list("[", "\t\n\tarray(\t\n\t"), 1210 | list("]", "\n\t\n)\n\t\n"), 1211 | 1212 | # maps 1213 | list(":", "\t\n\t=\t\n\t") 1214 | 1215 | ) 1216 | 1217 | } 1218 | 1219 | renv_json_read_envir <- function() { 1220 | 1221 | envir <- new.env(parent = emptyenv()) 1222 | 1223 | envir[["+"]] <- `+` 1224 | envir[["-"]] <- `-` 1225 | 1226 | envir[["object"]] <- function(...) { 1227 | result <- list(...) 1228 | names(result) <- as.character(names(result)) 1229 | result 1230 | } 1231 | 1232 | envir[["array"]] <- list 1233 | 1234 | envir[["true"]] <- TRUE 1235 | envir[["false"]] <- FALSE 1236 | envir[["null"]] <- NULL 1237 | 1238 | envir 1239 | 1240 | } 1241 | 1242 | renv_json_read_remap <- function(object, patterns) { 1243 | 1244 | # repair names if necessary 1245 | if (!is.null(names(object))) { 1246 | 1247 | nms <- names(object) 1248 | for (pattern in patterns) 1249 | nms <- gsub(pattern[[2L]], pattern[[1L]], nms, fixed = TRUE) 1250 | names(object) <- nms 1251 | 1252 | } 1253 | 1254 | # repair strings if necessary 1255 | if (is.character(object)) { 1256 | for (pattern in patterns) 1257 | object <- gsub(pattern[[2L]], pattern[[1L]], object, fixed = TRUE) 1258 | } 1259 | 1260 | # recurse for other objects 1261 | if (is.recursive(object)) 1262 | for (i in seq_along(object)) 1263 | object[i] <- list(renv_json_read_remap(object[[i]], patterns)) 1264 | 1265 | # return remapped object 1266 | object 1267 | 1268 | } 1269 | 1270 | renv_json_read_default <- function(file = NULL, text = NULL) { 1271 | 1272 | # read json text 1273 | text <- paste(text %||% readLines(file, warn = FALSE), collapse = "\n") 1274 | 1275 | # convert into something the R parser will understand 1276 | patterns <- renv_json_read_patterns() 1277 | transformed <- text 1278 | for (pattern in patterns) 1279 | transformed <- gsub(pattern[[1L]], 
pattern[[2L]], transformed, fixed = TRUE) 1280 | 1281 | # parse it 1282 | rfile <- tempfile("renv-json-", fileext = ".R") 1283 | on.exit(unlink(rfile), add = TRUE) 1284 | writeLines(transformed, con = rfile) 1285 | json <- parse(rfile, keep.source = FALSE, srcfile = NULL)[[1L]] 1286 | 1287 | # evaluate in safe environment 1288 | result <- eval(json, envir = renv_json_read_envir()) 1289 | 1290 | # fix up strings if necessary 1291 | renv_json_read_remap(result, patterns) 1292 | 1293 | } 1294 | 1295 | 1296 | # load the renv profile, if any 1297 | renv_bootstrap_profile_load(project) 1298 | 1299 | # construct path to library root 1300 | root <- renv_bootstrap_library_root(project) 1301 | 1302 | # construct library prefix for platform 1303 | prefix <- renv_bootstrap_platform_prefix() 1304 | 1305 | # construct full libpath 1306 | libpath <- file.path(root, prefix) 1307 | 1308 | # run bootstrap code 1309 | renv_bootstrap_exec(project, libpath, version) 1310 | 1311 | invisible() 1312 | 1313 | }) 1314 | -------------------------------------------------------------------------------- /renv/profile: -------------------------------------------------------------------------------- 1 | lesson-requirements 2 | -------------------------------------------------------------------------------- /renv/profiles/lesson-requirements/renv/.gitignore: -------------------------------------------------------------------------------- 1 | library/ 2 | local/ 3 | cellar/ 4 | lock/ 5 | python/ 6 | sandbox/ 7 | staging/ 8 | -------------------------------------------------------------------------------- /renv/profiles/lesson-requirements/renv/settings.json: -------------------------------------------------------------------------------- 1 | { 2 | "bioconductor.version": null, 3 | "external.libraries": [], 4 | "ignored.packages": [], 5 | "package.dependency.fields": [ 6 | "Imports", 7 | "Depends", 8 | "LinkingTo" 9 | ], 10 | "ppm.enabled": null, 11 | "ppm.ignored.urls": [], 12 | "r.version": null, 
13 | "snapshot.type": "implicit", 14 | "use.cache": true, 15 | "vcs.ignore.cellar": true, 16 | "vcs.ignore.library": true, 17 | "vcs.ignore.local": true, 18 | "vcs.manage.ignores": true 19 | } 20 | -------------------------------------------------------------------------------- /site/README.md: -------------------------------------------------------------------------------- 1 | This directory contains rendered lesson materials. Please do not edit files 2 | here. 3 | -------------------------------------------------------------------------------- /targets-workshop.Rproj: -------------------------------------------------------------------------------- 1 | Version: 1.0 2 | 3 | RestoreWorkspace: No 4 | SaveWorkspace: No 5 | AlwaysSaveHistory: Default 6 | 7 | EnableCodeIndexing: Yes 8 | UseSpacesForTab: Yes 9 | NumSpacesForTab: 2 10 | Encoding: UTF-8 11 | 12 | RnwWeave: Sweave 13 | LaTeX: pdfLaTeX 14 | 15 | AutoAppendNewline: Yes 16 | StripTrailingWhitespace: Yes 17 | LineEndingConversion: Posix 18 | 19 | BuildType: Website 20 | --------------------------------------------------------------------------------