├── .Rbuildignore
├── .editorconfig
├── .gitattributes
├── .github
│   └── workflows
│       ├── README.md
│       ├── pr-close-signal.yaml
│       ├── pr-comment.yaml
│       ├── pr-post-remove-branch.yaml
│       ├── pr-preflight.yaml
│       ├── pr-receive.yaml
│       ├── sandpaper-main.yaml
│       ├── sandpaper-version.txt
│       ├── update-cache.yaml
│       ├── update-workflows.yaml
│       └── workbench-beta-phase.yml
├── .gitignore
├── .zenodo.json
├── CITATION
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE.md
├── README.md
├── about.md
├── config.yaml
├── episodes
│   ├── 01-raster-structure.Rmd
│   ├── 02-raster-plot.Rmd
│   ├── 03-raster-reproject-in-r.Rmd
│   ├── 04-raster-calculations-in-r.Rmd
│   ├── 05-raster-multi-band-in-r.Rmd
│   ├── 06-vector-open-shapefile-in-r.Rmd
│   ├── 07-vector-shapefile-attributes-in-r.Rmd
│   ├── 08-vector-plot-shapefiles-custom-legend.Rmd
│   ├── 09-vector-when-data-dont-line-up-crs.Rmd
│   ├── 10-vector-csv-to-shapefile-in-r.Rmd
│   ├── 11-vector-raster-integration.Rmd
│   ├── 12-time-series-raster.Rmd
│   ├── 13-plot-time-series-rasters-in-r.Rmd
│   ├── 14-extract-ndvi-from-rasters-in-r.Rmd
│   ├── data
│   │   └── .gitignore
│   ├── fig
│   │   ├── BufferCircular.png
│   │   ├── BufferSquare.png
│   │   ├── Utm-zones-USA.svg
│   │   ├── dc-spatial-raster
│   │   │   ├── GreennessOverTime.jpg
│   │   │   ├── RGBSTack_1.jpg
│   │   │   ├── UTM_zones_18-19.jpg
│   │   │   ├── imageStretch_dark.jpg
│   │   │   ├── imageStretch_light.jpg
│   │   │   ├── lidarTree-height.png
│   │   │   ├── raster_concept.png
│   │   │   ├── raster_resolution.png
│   │   │   ├── single_multi_raster.png
│   │   │   └── spatial_extent.png
│   │   ├── dc-spatial-vector
│   │   │   ├── pnt_line_poly.png
│   │   │   └── spatial_extent.png
│   │   └── map_usa_different_projections.jpg
│   └── setup.R
├── index.md
├── instructors
│   └── instructor-notes.md
├── learners
│   ├── discuss.md
│   ├── reference.md
│   └── setup.md
├── profiles
│   └── learner-profiles.md
├── r-raster-vector-geospatial.Rproj
├── renv
│   ├── activate.R
│   ├── profile
│   └── profiles
│       └── lesson-requirements
│           ├── renv.lock
│           └── renv
│               └── .gitignore
└── site
    └── README.md

/.Rbuildignore:
-------------------------------------------------------------------------------- 1 | ^renv$ 2 | ^renv\.lock$ 3 | ^\.travis\.yml$ 4 | ^appveyor\.yml$ 5 | ^tic\.R$ 6 | -------------------------------------------------------------------------------- /.editorconfig: -------------------------------------------------------------------------------- 1 | root = true 2 | 3 | [*] 4 | charset = utf-8 5 | insert_final_newline = true 6 | trim_trailing_whitespace = true 7 | 8 | [*.md] 9 | indent_size = 2 10 | indent_style = space 11 | max_line_length = 100 # Please keep this in sync with bin/lesson_check.py! 12 | trim_trailing_whitespace = false # keep trailing spaces in markdown - 2+ spaces are translated to a hard break (<br/>
) 13 | 14 | [*.r] 15 | max_line_length = 80 16 | 17 | [*.py] 18 | indent_size = 4 19 | indent_style = space 20 | max_line_length = 79 21 | 22 | [*.sh] 23 | end_of_line = lf 24 | 25 | [Makefile] 26 | indent_style = tab 27 | -------------------------------------------------------------------------------- /.gitattributes: -------------------------------------------------------------------------------- 1 | *.py linguist-vendored 2 | *.html linguist-vendored 3 | bin/* linguist-vendored 4 | assets/* linguist-vendored 5 | *.R linguist-vendored=false 6 | assets/css/lesson.scss linguist-vendored 7 | -------------------------------------------------------------------------------- /.github/workflows/README.md: -------------------------------------------------------------------------------- 1 | # Carpentries Workflows 2 | 3 | This directory contains workflows to be used for lessons using the {sandpaper} 4 | lesson infrastructure. Two of these workflows require R (`sandpaper-main.yaml` 5 | and `pr-receive.yaml`); the rest are bots that handle pull request management. 6 | 7 | These workflows will likely change as {sandpaper} evolves, so it is important to 8 | keep them up-to-date. To do this in your lesson, you can run the following in your 9 | R console: 10 | 11 | ```r 12 | # Install/Update sandpaper 13 | options(repos = c(carpentries = "https://carpentries.r-universe.dev/", 14 | CRAN = "https://cloud.r-project.org")) 15 | install.packages("sandpaper") 16 | 17 | # update the workflows in your lesson 18 | library("sandpaper") 19 | update_github_workflows() 20 | ``` 21 | 22 | Inside this folder, you will find a file called `sandpaper-version.txt`, which 23 | contains a version number for sandpaper. This will be used in the future to 24 | alert you if a workflow update is needed.
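
The version comparison behind such an alert can be sketched in plain shell with `sort -V`, which orders version strings numerically. The version values below are placeholders for illustration; this is only a sketch of the idea, not the mechanism {sandpaper} itself uses:

```sh
# Sketch: decide whether a recorded sandpaper version is behind a release.
# Placeholder values; a real check would read sandpaper-version.txt.
recorded="0.16.12"
latest="0.16.13"

# `sort -V` sorts version strings; the first line is the older version.
oldest=$(printf '%s\n%s\n' "$recorded" "$latest" | sort -V | head -n 1)

if [ "$recorded" != "$latest" ] && [ "$oldest" = "$recorded" ]; then
  echo "update needed: $recorded -> $latest"
else
  echo "workflows are up to date"
fi
```

In practice you never run this by hand; `update_github_workflows()` performs the check and applies the update for you.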
25 | 26 | What follows are the descriptions of the workflow files: 27 | 28 | ## Deployment 29 | 30 | ### 01 Build and Deploy (sandpaper-main.yaml) 31 | 32 | This is the main driver that will only act on the main branch of the repository. 33 | This workflow does the following: 34 | 35 | 1. checks out the lesson 36 | 2. provisions the following resources 37 | - R 38 | - pandoc 39 | - lesson infrastructure (stored in a cache) 40 | - lesson dependencies if needed (stored in a cache) 41 | 3. builds the lesson via `sandpaper:::ci_deploy()` 42 | 43 | #### Caching 44 | 45 | This workflow has two caches; one cache is for the lesson infrastructure and 46 | the other is for the lesson dependencies if the lesson contains rendered 47 | content. These caches are invalidated by new versions of the infrastructure and 48 | the `renv.lock` file, respectively. If there is a problem with the cache, 49 | manual invalidation is necessary. You will need maintain access to the repository; 50 | you can either go to the actions tab and [click on the caches button to find 51 | and invalidate the failing cache](https://github.blog/changelog/2022-10-20-manage-caches-in-your-actions-workflows-from-web-interface/) 52 | or set the `CACHE_VERSION` secret to the current date (which will 53 | invalidate all of the caches). 54 | 55 | ## Updates 56 | 57 | ### Setup Information 58 | 59 | These workflows run on a schedule and at the maintainer's request. Because they 60 | create pull requests that update workflows and require the downstream actions to run, 61 | they need a special repository/organization secret token called 62 | `SANDPAPER_WORKFLOW`, and it must have the `public_repo` and `workflow` scopes. 63 | 64 | This can be an individual user token, or it can be a token from a trusted bot account.
If you 65 | have a repository in one of the official Carpentries accounts, then you do not 66 | need to worry about this token being present because the Carpentries Core Team 67 | will take care of supplying this token. 68 | 69 | If you want to use your personal account, you can create a token under your 70 | GitHub account settings (Developer settings > Personal access tokens). 71 | Once you have created your token, you should copy it to your 72 | clipboard and then go to your repository's settings > secrets > actions and 73 | create or edit the `SANDPAPER_WORKFLOW` secret, pasting in the generated token. 74 | 75 | If you do not specify your token correctly, the runs will not fail, but they will 76 | give you instructions to provide the token for your repository. 77 | 78 | ### 02 Maintain: Update Workflow Files (update-workflows.yaml) 79 | 80 | The {sandpaper} repository was designed to do as much as possible to separate 81 | the tools from the content. For local builds, this is absolutely true, but 82 | there is a minor issue when it comes to workflow files: they must live inside 83 | the repository. 84 | 85 | This workflow ensures that the workflow files are up-to-date. The way it works is 86 | to download the update-workflows.sh script from GitHub and run it. The script 87 | will do the following: 88 | 89 | 1. check the recorded version of sandpaper against the current version on GitHub 90 | 2. update the files if there is a difference in versions 91 | 92 | After the files are updated, if there are any changes, they are pushed to a 93 | branch called `update/workflows` and a pull request is created. Maintainers are 94 | encouraged to review the changes and accept the pull request if the outputs 95 | are okay. 96 | 97 | This update is run weekly or on demand. 98 | 99 | ### 03 Maintain: Update Package Cache (update-cache.yaml) 100 | 101 | For lessons that have generated content, we use {renv} to ensure that the output 102 | is stable.
This is controlled by a single lockfile which documents the packages 103 | needed for the lesson and their version numbers. This workflow is skipped in 104 | lessons that do not have generated content. 105 | 106 | Because the lessons need to remain current with the package ecosystem, it's a 107 | good idea to make sure these packages can be updated periodically. The 108 | update cache workflow will do this by checking for updates, applying them in a 109 | branch called `update/packages` and creating a pull request with _only the 110 | lockfile changed_. 111 | 112 | From here, the markdown documents will be rebuilt and you can inspect what has 113 | changed based on how the packages have updated. 114 | 115 | ## Pull Request and Review Management 116 | 117 | Because our lessons execute code, pull requests are a security risk for any 118 | lesson and thus have security measures associated with them. **Do not merge any 119 | pull requests that do not pass checks or that have not been commented on by the bots.** 120 | 121 | These workflows all go together and are described in the following 122 | diagram and the sections below: 123 | 124 | ![Graph representation of a pull request](https://carpentries.github.io/sandpaper/articles/img/pr-flow.dot.svg) 125 | 126 | ### Pre-Flight Pull Request Validation (pr-preflight.yaml) 127 | 128 | This workflow runs every time a pull request is created, and its purpose is to 129 | validate that the pull request is okay to run. This means checking the following: 130 | 131 | 1. The pull request does not contain modified workflow files 132 | 2. If the pull request contains modified workflow files, it does not contain 133 | modified content files (such as a situation where @carpentries-bot will 134 | make an automated pull request) 135 | 3. The pull request does not contain an invalid commit hash (e.g. from a fork 136 | that was made before a lesson was transitioned from styles to the 137 | workbench).
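
These checks boil down to an inspection of the changed file paths; the actual logic lives in the `check-valid-pr` action in the carpentries/actions repository. The shell function below is only a rough, hypothetical sketch of the first two rules (the function name and file lists are illustrative, not the real implementation):

```sh
# Hypothetical sketch of the preflight rule: a pull request that modifies
# workflow files must not also modify lesson content files.
# This is NOT the real check-valid-pr implementation.
classify_pr() {
  changed="$1"  # newline-separated list of changed file paths
  # grep -c counts matching lines; `|| true` keeps a zero count from
  # aborting the script when grep exits non-zero on no matches.
  workflows=$(printf '%s\n' "$changed" | grep -c '^\.github/workflows/' || true)
  content=$(printf '%s\n' "$changed" | grep -cv '^\.github/workflows/' || true)
  if [ "$workflows" -gt 0 ] && [ "$content" -gt 0 ]; then
    echo "invalid"  # mixed workflow and content changes
  else
    echo "valid"    # content-only, or workflow-only (e.g. a bot update)
  fi
}

classify_pr "episodes/01-raster-structure.Rmd"        # prints "valid"
classify_pr ".github/workflows/sandpaper-main.yaml
episodes/01-raster-structure.Rmd"                     # prints "invalid"
```

The third rule (invalid commit hashes) additionally requires the list of known-bad hashes that the workflows download, so it is not sketched here.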
138 | 139 | Once the checks are finished, a comment is issued to the pull request, which 140 | will allow maintainers to determine if it is safe to run the 141 | "Receive Pull Request" workflow from new contributors. 142 | 143 | ### Receive Pull Request (pr-receive.yaml) 144 | 145 | **Note of caution:** This workflow runs arbitrary code submitted by anyone who creates a 146 | pull request. GitHub has safeguarded the token used in this workflow so that it has no 147 | privileges in the repository, but we have taken precautions to protect against 148 | spoofing. 149 | 150 | This workflow is triggered on every push to a pull request. If this workflow 151 | is already running and a new push is sent to the pull request, the workflow 152 | run from the previous push will be cancelled and a new workflow run will be 153 | started. 154 | 155 | The first step of this workflow is to check if the pull request is valid (e.g. that no 156 | workflow files have been modified). If workflow files have been 157 | modified, a comment is made indicating that the workflow will not be run. If 158 | both a workflow file and lesson content are modified, an error will occur. 159 | 160 | The second step (if valid) is to build the generated content from the pull 161 | request. This builds the content and uploads three artifacts: 162 | 163 | 1. The pull request number (pr) 164 | 2. A summary of changes after the rendering process (diff) 165 | 3. The rendered files (build) 166 | 167 | Because this workflow builds generated content, it follows the same general 168 | process as the `sandpaper-main` workflow with the same caching mechanisms. 169 | 170 | The artifacts produced are used by the next workflow. 171 | 172 | ### Comment on Pull Request (pr-comment.yaml) 173 | 174 | This workflow is triggered if the `pr-receive.yaml` workflow is successful. 175 | The steps in this workflow are: 176 | 177 | 1. Test if the workflow run is valid and comment the validity of the workflow on the 178 | pull request. 179 | 2.
If it is valid: create an orphan branch with two commits: the current state 180 | of the repository and the proposed changes. 181 | 3. If it is valid: update the pull request comment with the summary of changes 182 | 183 | Importantly: if the pull request is invalid, the branch is not created, so any 184 | malicious code is not published. 185 | 186 | From here, the maintainer can request changes from the author and eventually 187 | either merge or reject the PR. When this happens, if the PR was valid, the 188 | preview branch needs to be deleted. 189 | 190 | ### Send Close PR Signal (pr-close-signal.yaml) 191 | 192 | Triggered any time a pull request is closed. This emits an artifact containing the 193 | pull request number for the next action. 194 | 195 | ### Remove Pull Request Branch (pr-post-remove-branch.yaml) 196 | 197 | Triggered by `pr-close-signal.yaml`. This removes the temporary branch associated with 198 | the pull request (if it was created). 199 | -------------------------------------------------------------------------------- /.github/workflows/pr-close-signal.yaml: -------------------------------------------------------------------------------- 1 | name: "Bot: Send Close Pull Request Signal" 2 | 3 | on: 4 | pull_request: 5 | types: 6 | [closed] 7 | 8 | jobs: 9 | send-close-signal: 10 | name: "Send closing signal" 11 | runs-on: ubuntu-22.04 12 | if: ${{ github.event.action == 'closed' }} 13 | steps: 14 | - name: "Create PRtifact" 15 | run: | 16 | mkdir -p ./pr 17 | printf ${{ github.event.number }} > ./pr/NUM 18 | - name: Upload Diff 19 | uses: actions/upload-artifact@v4 20 | with: 21 | name: pr 22 | path: ./pr 23 | -------------------------------------------------------------------------------- /.github/workflows/pr-comment.yaml: -------------------------------------------------------------------------------- 1 | name: "Bot: Comment on the Pull Request" 2 | 3 | # read-write repo token 4 | # access to secrets 5 | on: 6 | workflow_run: 7 | workflows: 
["Receive Pull Request"] 8 | types: 9 | - completed 10 | 11 | concurrency: 12 | group: pr-${{ github.event.workflow_run.pull_requests[0].number }} 13 | cancel-in-progress: true 14 | 15 | 16 | jobs: 17 | # Pull requests are valid if: 18 | # - they match the sha of the workflow run head commit 19 | # - they are open 20 | # - no .github files were committed 21 | test-pr: 22 | name: "Test if pull request is valid" 23 | runs-on: ubuntu-22.04 24 | if: > 25 | github.event.workflow_run.event == 'pull_request' && 26 | github.event.workflow_run.conclusion == 'success' 27 | outputs: 28 | is_valid: ${{ steps.check-pr.outputs.VALID }} 29 | payload: ${{ steps.check-pr.outputs.payload }} 30 | number: ${{ steps.get-pr.outputs.NUM }} 31 | msg: ${{ steps.check-pr.outputs.MSG }} 32 | steps: 33 | - name: 'Download PR artifact' 34 | id: dl 35 | uses: carpentries/actions/download-workflow-artifact@main 36 | with: 37 | run: ${{ github.event.workflow_run.id }} 38 | name: 'pr' 39 | 40 | - name: "Get PR Number" 41 | if: ${{ steps.dl.outputs.success == 'true' }} 42 | id: get-pr 43 | run: | 44 | unzip pr.zip 45 | echo "NUM=$(<./NR)" >> $GITHUB_OUTPUT 46 | 47 | - name: "Fail if PR number was not present" 48 | id: bad-pr 49 | if: ${{ steps.dl.outputs.success != 'true' }} 50 | run: | 51 | echo '::error::A pull request number was not recorded. The pull request that triggered this workflow is likely malicious.' 
52 | exit 1 53 | - name: "Get Invalid Hashes File" 54 | id: hash 55 | run: | 56 | echo "json<> $GITHUB_OUTPUT 59 | - name: "Check PR" 60 | id: check-pr 61 | if: ${{ steps.dl.outputs.success == 'true' }} 62 | uses: carpentries/actions/check-valid-pr@main 63 | with: 64 | pr: ${{ steps.get-pr.outputs.NUM }} 65 | sha: ${{ github.event.workflow_run.head_sha }} 66 | headroom: 3 # if it's within the last three commits, we can keep going, because it's likely rapid-fire 67 | invalid: ${{ fromJSON(steps.hash.outputs.json)[github.repository] }} 68 | fail_on_error: true 69 | 70 | # Create an orphan branch on this repository with two commits 71 | # - the current HEAD of the md-outputs branch 72 | # - the output from running the current HEAD of the pull request through 73 | # the md generator 74 | create-branch: 75 | name: "Create Git Branch" 76 | needs: test-pr 77 | runs-on: ubuntu-22.04 78 | if: ${{ needs.test-pr.outputs.is_valid == 'true' }} 79 | env: 80 | NR: ${{ needs.test-pr.outputs.number }} 81 | permissions: 82 | contents: write 83 | steps: 84 | - name: 'Checkout md outputs' 85 | uses: actions/checkout@v4 86 | with: 87 | ref: md-outputs 88 | path: built 89 | fetch-depth: 1 90 | 91 | - name: 'Download built markdown' 92 | id: dl 93 | uses: carpentries/actions/download-workflow-artifact@main 94 | with: 95 | run: ${{ github.event.workflow_run.id }} 96 | name: 'built' 97 | 98 | - if: ${{ steps.dl.outputs.success == 'true' }} 99 | run: unzip built.zip 100 | 101 | - name: "Create orphan and push" 102 | if: ${{ steps.dl.outputs.success == 'true' }} 103 | run: | 104 | cd built/ 105 | git config --local user.email "actions@github.com" 106 | git config --local user.name "GitHub Actions" 107 | CURR_HEAD=$(git rev-parse HEAD) 108 | git checkout --orphan md-outputs-PR-${NR} 109 | git add -A 110 | git commit -m "source commit: ${CURR_HEAD}" 111 | ls -A | grep -v '^.git$' | xargs -I _ rm -r '_' 112 | cd .. 
113 | unzip -o -d built built.zip 114 | cd built 115 | git add -A 116 | git commit --allow-empty -m "differences for PR #${NR}" 117 | git push -u --force --set-upstream origin md-outputs-PR-${NR} 118 | 119 | # Comment on the Pull Request with a link to the branch and the diff 120 | comment-pr: 121 | name: "Comment on Pull Request" 122 | needs: [test-pr, create-branch] 123 | runs-on: ubuntu-22.04 124 | if: ${{ needs.test-pr.outputs.is_valid == 'true' }} 125 | env: 126 | NR: ${{ needs.test-pr.outputs.number }} 127 | permissions: 128 | pull-requests: write 129 | steps: 130 | - name: 'Download comment artifact' 131 | id: dl 132 | uses: carpentries/actions/download-workflow-artifact@main 133 | with: 134 | run: ${{ github.event.workflow_run.id }} 135 | name: 'diff' 136 | 137 | - if: ${{ steps.dl.outputs.success == 'true' }} 138 | run: unzip ${{ github.workspace }}/diff.zip 139 | 140 | - name: "Comment on PR" 141 | id: comment-diff 142 | if: ${{ steps.dl.outputs.success == 'true' }} 143 | uses: carpentries/actions/comment-diff@main 144 | with: 145 | pr: ${{ env.NR }} 146 | path: ${{ github.workspace }}/diff.md 147 | 148 | # Comment if the PR is open and matches the SHA, but the workflow files have 149 | # changed 150 | comment-changed-workflow: 151 | name: "Comment if workflow files have changed" 152 | needs: test-pr 153 | runs-on: ubuntu-22.04 154 | if: ${{ always() && needs.test-pr.outputs.is_valid == 'false' }} 155 | env: 156 | NR: ${{ github.event.workflow_run.pull_requests[0].number }} 157 | body: ${{ needs.test-pr.outputs.msg }} 158 | permissions: 159 | pull-requests: write 160 | steps: 161 | - name: 'Check for spoofing' 162 | id: dl 163 | uses: carpentries/actions/download-workflow-artifact@main 164 | with: 165 | run: ${{ github.event.workflow_run.id }} 166 | name: 'built' 167 | 168 | - name: 'Alert if spoofed' 169 | id: spoof 170 | if: ${{ steps.dl.outputs.success == 'true' }} 171 | run: | 172 | echo 'body<> $GITHUB_ENV 173 | echo '' >> $GITHUB_ENV 174 | echo '## 
:x: DANGER :x:' >> $GITHUB_ENV 175 | echo 'This pull request has modified workflows that created output. Close this now.' >> $GITHUB_ENV 176 | echo '' >> $GITHUB_ENV 177 | echo 'EOF' >> $GITHUB_ENV 178 | 179 | - name: "Comment on PR" 180 | id: comment-diff 181 | uses: carpentries/actions/comment-diff@main 182 | with: 183 | pr: ${{ env.NR }} 184 | body: ${{ env.body }} 185 | -------------------------------------------------------------------------------- /.github/workflows/pr-post-remove-branch.yaml: -------------------------------------------------------------------------------- 1 | name: "Bot: Remove Temporary PR Branch" 2 | 3 | on: 4 | workflow_run: 5 | workflows: ["Bot: Send Close Pull Request Signal"] 6 | types: 7 | - completed 8 | 9 | jobs: 10 | delete: 11 | name: "Delete branch from Pull Request" 12 | runs-on: ubuntu-22.04 13 | if: > 14 | github.event.workflow_run.event == 'pull_request' && 15 | github.event.workflow_run.conclusion == 'success' 16 | permissions: 17 | contents: write 18 | steps: 19 | - name: 'Download artifact' 20 | uses: carpentries/actions/download-workflow-artifact@main 21 | with: 22 | run: ${{ github.event.workflow_run.id }} 23 | name: pr 24 | - name: "Get PR Number" 25 | id: get-pr 26 | run: | 27 | unzip pr.zip 28 | echo "NUM=$(<./NUM)" >> $GITHUB_OUTPUT 29 | - name: 'Remove branch' 30 | uses: carpentries/actions/remove-branch@main 31 | with: 32 | pr: ${{ steps.get-pr.outputs.NUM }} 33 | -------------------------------------------------------------------------------- /.github/workflows/pr-preflight.yaml: -------------------------------------------------------------------------------- 1 | name: "Pull Request Preflight Check" 2 | 3 | on: 4 | pull_request_target: 5 | branches: 6 | ["main"] 7 | types: 8 | ["opened", "synchronize", "reopened"] 9 | 10 | jobs: 11 | test-pr: 12 | name: "Test if pull request is valid" 13 | if: ${{ github.event.action != 'closed' }} 14 | runs-on: ubuntu-22.04 15 | outputs: 16 | is_valid: ${{ 
steps.check-pr.outputs.VALID }} 17 | permissions: 18 | pull-requests: write 19 | steps: 20 | - name: "Get Invalid Hashes File" 21 | id: hash 22 | run: | 23 | echo "json<> $GITHUB_OUTPUT 26 | - name: "Check PR" 27 | id: check-pr 28 | uses: carpentries/actions/check-valid-pr@main 29 | with: 30 | pr: ${{ github.event.number }} 31 | invalid: ${{ fromJSON(steps.hash.outputs.json)[github.repository] }} 32 | fail_on_error: true 33 | - name: "Comment result of validation" 34 | id: comment-diff 35 | if: ${{ always() }} 36 | uses: carpentries/actions/comment-diff@main 37 | with: 38 | pr: ${{ github.event.number }} 39 | body: ${{ steps.check-pr.outputs.MSG }} 40 | -------------------------------------------------------------------------------- /.github/workflows/pr-receive.yaml: -------------------------------------------------------------------------------- 1 | name: "Receive Pull Request" 2 | 3 | on: 4 | pull_request: 5 | types: 6 | [opened, synchronize, reopened] 7 | 8 | concurrency: 9 | group: ${{ github.ref }} 10 | cancel-in-progress: true 11 | 12 | jobs: 13 | test-pr: 14 | name: "Record PR number" 15 | if: ${{ github.event.action != 'closed' }} 16 | runs-on: ubuntu-22.04 17 | outputs: 18 | is_valid: ${{ steps.check-pr.outputs.VALID }} 19 | steps: 20 | - name: "Record PR number" 21 | id: record 22 | if: ${{ always() }} 23 | run: | 24 | echo ${{ github.event.number }} > ${{ github.workspace }}/NR # 2022-03-02: artifact name fixed to be NR 25 | - name: "Upload PR number" 26 | id: upload 27 | if: ${{ always() }} 28 | uses: actions/upload-artifact@v4 29 | with: 30 | name: pr 31 | path: ${{ github.workspace }}/NR 32 | - name: "Get Invalid Hashes File" 33 | id: hash 34 | run: | 35 | echo "json<> $GITHUB_OUTPUT 38 | - name: "echo output" 39 | run: | 40 | echo "${{ steps.hash.outputs.json }}" 41 | - name: "Check PR" 42 | id: check-pr 43 | uses: carpentries/actions/check-valid-pr@main 44 | with: 45 | pr: ${{ github.event.number }} 46 | invalid: ${{ 
fromJSON(steps.hash.outputs.json)[github.repository] }} 47 | 48 | build-md-source: 49 | name: "Build markdown source files if valid" 50 | needs: test-pr 51 | runs-on: ubuntu-22.04 52 | if: ${{ needs.test-pr.outputs.is_valid == 'true' }} 53 | env: 54 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 55 | RENV_PATHS_ROOT: ~/.local/share/renv/ 56 | CHIVE: ${{ github.workspace }}/site/chive 57 | PR: ${{ github.workspace }}/site/pr 58 | MD: ${{ github.workspace }}/site/built 59 | steps: 60 | - name: "Check Out Main Branch" 61 | uses: actions/checkout@v4 62 | 63 | - name: "Check Out Staging Branch" 64 | uses: actions/checkout@v4 65 | with: 66 | ref: md-outputs 67 | path: ${{ env.MD }} 68 | 69 | - name: "Set up R" 70 | uses: r-lib/actions/setup-r@v2 71 | with: 72 | use-public-rspm: true 73 | install-r: false 74 | 75 | - name: "Set up Pandoc" 76 | uses: r-lib/actions/setup-pandoc@v2 77 | 78 | - name: "Setup Lesson Engine" 79 | uses: carpentries/actions/setup-sandpaper@main 80 | with: 81 | cache-version: ${{ secrets.CACHE_VERSION }} 82 | 83 | - name: "Setup Package Cache" 84 | uses: carpentries/actions/setup-lesson-deps@main 85 | with: 86 | cache-version: ${{ secrets.CACHE_VERSION }} 87 | 88 | - name: "Validate and Build Markdown" 89 | id: build-site 90 | run: | 91 | sandpaper::package_cache_trigger(TRUE) 92 | sandpaper::validate_lesson(path = '${{ github.workspace }}') 93 | sandpaper:::build_markdown(path = '${{ github.workspace }}', quiet = FALSE) 94 | shell: Rscript {0} 95 | 96 | - name: "Generate Artifacts" 97 | id: generate-artifacts 98 | run: | 99 | sandpaper:::ci_bundle_pr_artifacts( 100 | repo = '${{ github.repository }}', 101 | pr_number = '${{ github.event.number }}', 102 | path_md = '${{ env.MD }}', 103 | path_pr = '${{ env.PR }}', 104 | path_archive = '${{ env.CHIVE }}', 105 | branch = 'md-outputs' 106 | ) 107 | shell: Rscript {0} 108 | 109 | - name: "Upload PR" 110 | uses: actions/upload-artifact@v4 111 | with: 112 | name: pr 113 | path: ${{ env.PR }} 114 | overwrite: 
true 115 | 116 | - name: "Upload Diff" 117 | uses: actions/upload-artifact@v4 118 | with: 119 | name: diff 120 | path: ${{ env.CHIVE }} 121 | retention-days: 1 122 | 123 | - name: "Upload Build" 124 | uses: actions/upload-artifact@v4 125 | with: 126 | name: built 127 | path: ${{ env.MD }} 128 | retention-days: 1 129 | 130 | - name: "Teardown" 131 | run: sandpaper::reset_site() 132 | shell: Rscript {0} 133 | -------------------------------------------------------------------------------- /.github/workflows/sandpaper-main.yaml: -------------------------------------------------------------------------------- 1 | name: "01 Build and Deploy Site" 2 | 3 | on: 4 | push: 5 | branches: 6 | - main 7 | - master 8 | schedule: 9 | - cron: '0 0 * * 2' 10 | workflow_dispatch: 11 | inputs: 12 | name: 13 | description: 'Who triggered this build?' 14 | required: true 15 | default: 'Maintainer (via GitHub)' 16 | reset: 17 | description: 'Reset cached markdown files' 18 | required: false 19 | default: false 20 | type: boolean 21 | jobs: 22 | full-build: 23 | name: "Build Full Site" 24 | 25 | # 2024-10-01: ubuntu-latest is now 24.04 and R is not installed by default in the runner image 26 | # pin to 22.04 for now 27 | runs-on: ubuntu-22.04 28 | permissions: 29 | checks: write 30 | contents: write 31 | pages: write 32 | env: 33 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 34 | RENV_PATHS_ROOT: ~/.local/share/renv/ 35 | steps: 36 | 37 | - name: "Checkout Lesson" 38 | uses: actions/checkout@v4 39 | 40 | - name: "Set up R" 41 | uses: r-lib/actions/setup-r@v2 42 | with: 43 | use-public-rspm: true 44 | install-r: false 45 | 46 | - name: "Set up Pandoc" 47 | uses: r-lib/actions/setup-pandoc@v2 48 | 49 | - name: "Setup Lesson Engine" 50 | uses: carpentries/actions/setup-sandpaper@main 51 | with: 52 | cache-version: ${{ secrets.CACHE_VERSION }} 53 | 54 | - name: "Setup Package Cache" 55 | uses: carpentries/actions/setup-lesson-deps@main 56 | with: 57 | cache-version: ${{ secrets.CACHE_VERSION }} 
58 | 59 | - name: "Deploy Site" 60 | run: | 61 | reset <- "${{ github.event.inputs.reset }}" == "true" 62 | sandpaper::package_cache_trigger(TRUE) 63 | sandpaper:::ci_deploy(reset = reset) 64 | shell: Rscript {0} 65 | -------------------------------------------------------------------------------- /.github/workflows/sandpaper-version.txt: -------------------------------------------------------------------------------- 1 | 0.16.12 2 | -------------------------------------------------------------------------------- /.github/workflows/update-cache.yaml: -------------------------------------------------------------------------------- 1 | name: "03 Maintain: Update Package Cache" 2 | 3 | on: 4 | workflow_dispatch: 5 | inputs: 6 | name: 7 | description: 'Who triggered this build (enter github username to tag yourself)?' 8 | required: true 9 | default: 'monthly run' 10 | schedule: 11 | # Run every tuesday 12 | - cron: '0 0 * * 2' 13 | 14 | jobs: 15 | preflight: 16 | name: "Preflight Check" 17 | runs-on: ubuntu-22.04 18 | outputs: 19 | ok: ${{ steps.check.outputs.ok }} 20 | steps: 21 | - id: check 22 | run: | 23 | if [[ ${{ github.event_name }} == 'workflow_dispatch' ]]; then 24 | echo "ok=true" >> $GITHUB_OUTPUT 25 | echo "Running on request" 26 | # using single brackets here to avoid 08 being interpreted as octal 27 | # https://github.com/carpentries/sandpaper/issues/250 28 | elif [ `date +%d` -le 7 ]; then 29 | # If the Tuesday lands in the first week of the month, run it 30 | echo "ok=true" >> $GITHUB_OUTPUT 31 | echo "Running on schedule" 32 | else 33 | echo "ok=false" >> $GITHUB_OUTPUT 34 | echo "Not Running Today" 35 | fi 36 | 37 | check_renv: 38 | name: "Check if We Need {renv}" 39 | runs-on: ubuntu-22.04 40 | needs: preflight 41 | if: ${{ needs.preflight.outputs.ok == 'true'}} 42 | outputs: 43 | needed: ${{ steps.renv.outputs.exists }} 44 | steps: 45 | - name: "Checkout Lesson" 46 | uses: actions/checkout@v4 47 | - id: renv 48 | run: | 49 | if [[ -d renv ]]; then 
50 | echo "exists=true" >> $GITHUB_OUTPUT 51 | fi 52 | 53 | check_token: 54 | name: "Check SANDPAPER_WORKFLOW token" 55 | runs-on: ubuntu-22.04 56 | needs: check_renv 57 | if: ${{ needs.check_renv.outputs.needed == 'true' }} 58 | outputs: 59 | workflow: ${{ steps.validate.outputs.wf }} 60 | repo: ${{ steps.validate.outputs.repo }} 61 | steps: 62 | - name: "validate token" 63 | id: validate 64 | uses: carpentries/actions/check-valid-credentials@main 65 | with: 66 | token: ${{ secrets.SANDPAPER_WORKFLOW }} 67 | 68 | update_cache: 69 | name: "Update Package Cache" 70 | needs: check_token 71 | if: ${{ needs.check_token.outputs.repo== 'true' }} 72 | runs-on: ubuntu-22.04 73 | env: 74 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 75 | RENV_PATHS_ROOT: ~/.local/share/renv/ 76 | steps: 77 | 78 | - name: "Checkout Lesson" 79 | uses: actions/checkout@v4 80 | 81 | - name: "Set up R" 82 | uses: r-lib/actions/setup-r@v2 83 | with: 84 | use-public-rspm: true 85 | install-r: false 86 | 87 | - name: "Update {renv} deps and determine if a PR is needed" 88 | id: update 89 | uses: carpentries/actions/update-lockfile@main 90 | with: 91 | cache-version: ${{ secrets.CACHE_VERSION }} 92 | 93 | - name: Create Pull Request 94 | id: cpr 95 | if: ${{ steps.update.outputs.n > 0 }} 96 | uses: carpentries/create-pull-request@main 97 | with: 98 | token: ${{ secrets.SANDPAPER_WORKFLOW }} 99 | delete-branch: true 100 | branch: "update/packages" 101 | commit-message: "[actions] update ${{ steps.update.outputs.n }} packages" 102 | title: "Update ${{ steps.update.outputs.n }} packages" 103 | body: | 104 | :robot: This is an automated build 105 | 106 | This will update ${{ steps.update.outputs.n }} packages in your lesson with the following versions: 107 | 108 | ``` 109 | ${{ steps.update.outputs.report }} 110 | ``` 111 | 112 | :stopwatch: In a few minutes, a comment will appear that will show you how the output has changed based on these updates. 
113 | 114 | If you want to inspect these changes locally, you can use the following code to check out a new branch: 115 | 116 | ```bash 117 | git fetch origin update/packages 118 | git checkout update/packages 119 | ``` 120 | 121 | - Auto-generated by [create-pull-request][1] on ${{ steps.update.outputs.date }} 122 | 123 | [1]: https://github.com/carpentries/create-pull-request/tree/main 124 | labels: "type: package cache" 125 | draft: false 126 | -------------------------------------------------------------------------------- /.github/workflows/update-workflows.yaml: -------------------------------------------------------------------------------- 1 | name: "02 Maintain: Update Workflow Files" 2 | 3 | on: 4 | workflow_dispatch: 5 | inputs: 6 | name: 7 | description: 'Who triggered this build (enter github username to tag yourself)?' 8 | required: true 9 | default: 'weekly run' 10 | clean: 11 | description: 'Workflow files/file extensions to clean (no wildcards, enter "" for none)' 12 | required: false 13 | default: '.yaml' 14 | schedule: 15 | # Run every Tuesday 16 | - cron: '0 0 * * 2' 17 | 18 | jobs: 19 | check_token: 20 | name: "Check SANDPAPER_WORKFLOW token" 21 | runs-on: ubuntu-22.04 22 | outputs: 23 | workflow: ${{ steps.validate.outputs.wf }} 24 | repo: ${{ steps.validate.outputs.repo }} 25 | steps: 26 | - name: "validate token" 27 | id: validate 28 | uses: carpentries/actions/check-valid-credentials@main 29 | with: 30 | token: ${{ secrets.SANDPAPER_WORKFLOW }} 31 | 32 | update_workflow: 33 | name: "Update Workflow" 34 | runs-on: ubuntu-22.04 35 | needs: check_token 36 | if: ${{ needs.check_token.outputs.workflow == 'true' }} 37 | steps: 38 | - name: "Checkout Repository" 39 | uses: actions/checkout@v4 40 | 41 | - name: Update Workflows 42 | id: update 43 | uses: carpentries/actions/update-workflows@main 44 | with: 45 | clean: ${{ github.event.inputs.clean }} 46 | 47 | - name: Create Pull Request 48 | id: cpr 49 | if: "${{ steps.update.outputs.new }}" 50 | 
uses: carpentries/create-pull-request@main 51 | with: 52 | token: ${{ secrets.SANDPAPER_WORKFLOW }} 53 | delete-branch: true 54 | branch: "update/workflows" 55 | commit-message: "[actions] update sandpaper workflow to version ${{ steps.update.outputs.new }}" 56 | title: "Update Workflows to Version ${{ steps.update.outputs.new }}" 57 | body: | 58 | :robot: This is an automated build 59 | 60 | Update Workflows from sandpaper version ${{ steps.update.outputs.old }} -> ${{ steps.update.outputs.new }} 61 | 62 | - Auto-generated by [create-pull-request][1] on ${{ steps.update.outputs.date }} 63 | 64 | [1]: https://github.com/carpentries/create-pull-request/tree/main 65 | labels: "type: template and tools" 66 | draft: false 67 | -------------------------------------------------------------------------------- /.github/workflows/workbench-beta-phase.yml: -------------------------------------------------------------------------------- 1 | name: "Deploy to AWS" 2 | 3 | on: 4 | workflow_run: 5 | workflows: ["01 Build and Deploy Site"] 6 | types: 7 | - completed 8 | workflow_dispatch: 9 | 10 | jobs: 11 | preflight: 12 | name: "Preflight Check" 13 | runs-on: ubuntu-latest 14 | outputs: 15 | ok: ${{ steps.check.outputs.ok }} 16 | folder: ${{ steps.check.outputs.folder }} 17 | steps: 18 | - id: check 19 | run: | 20 | if [[ -z "${{ secrets.DISTRIBUTION }}" || -z "${{ secrets.AWS_ACCESS_KEY_ID }}" || -z "${{ secrets.AWS_SECRET_ACCESS_KEY }}" ]]; then 21 | echo ":information_source: No site configured" >> $GITHUB_STEP_SUMMARY 22 | echo "" >> $GITHUB_STEP_SUMMARY 23 | echo 'To deploy the preview on AWS, you need the `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` and `DISTRIBUTION` secrets set up' >> $GITHUB_STEP_SUMMARY 24 | else 25 | echo "::set-output name=folder::"$(sed -E 's^.+/(.+)^\1^' <<< ${{ github.repository }}) 26 | echo "::set-output name=ok::true" 27 | fi 28 | 29 | full-build: 30 | name: "Deploy to AWS" 31 | needs: [preflight] 32 | if: ${{ needs.preflight.outputs.ok }} 33 | 
runs-on: ubuntu-latest 34 | steps: 35 | 36 | - name: "Checkout site folder" 37 | uses: actions/checkout@v3 38 | with: 39 | ref: 'gh-pages' 40 | path: 'source' 41 | 42 | - name: "Deploy to Bucket" 43 | uses: jakejarvis/s3-sync-action@v0.5.1 44 | with: 45 | args: --acl public-read --follow-symlinks --delete --exclude '.git/*' 46 | env: 47 | AWS_S3_BUCKET: preview.carpentries.org 48 | AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }} 49 | AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }} 50 | SOURCE_DIR: 'source' 51 | DEST_DIR: ${{ needs.preflight.outputs.folder }} 52 | 53 | - name: "Invalidate CloudFront" 54 | uses: chetan/invalidate-cloudfront-action@master 55 | env: 56 | PATHS: /* 57 | AWS_REGION: 'us-east-1' 58 | DISTRIBUTION: ${{ secrets.DISTRIBUTION }} 59 | AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }} 60 | AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }} 61 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # sandpaper files 2 | episodes/*html 3 | site/* 4 | !site/README.md 5 | 6 | # History files 7 | .Rhistory 8 | .Rapp.history 9 | # Session Data files 10 | .RData 11 | # User-specific files 12 | .Ruserdata 13 | # Example code in package build process 14 | *-Ex.R 15 | # Output files from R CMD build 16 | /*.tar.gz 17 | # Output files from R CMD check 18 | /*.Rcheck/ 19 | # RStudio files 20 | .Rproj.user/ 21 | # produced vignettes 22 | vignettes/*.html 23 | vignettes/*.pdf 24 | # OAuth2 token, see https://github.com/hadley/httr/releases/tag/v0.3 25 | .httr-oauth 26 | # knitr and R markdown default cache directories 27 | *_cache/ 28 | /cache/ 29 | # Temporary files created by R markdown 30 | *.utf8.md 31 | *.knit.md 32 | # R Environment Variables 33 | .Renviron 34 | # pkgdown site 35 | docs/ 36 | # translation temp files 37 | po/*~ 38 | # renv detritus 39 | renv/sandbox/ 40 | _site 41 | .DS_Store 42 | 
.Rproj.user 43 | _episodes_rmd/data/NEON-DS-Airborne-Remote-Sensing 44 | _episodes_rmd/data/NEON-DS-Site-Layout-Files 45 | _episodes_rmd/data/NEON-DS-Landsat-NDVI 46 | _episodes_rmd/data/NEON-DS-Met-Time-Series 47 | _episodes_rmd/chm_ov_SJER.tif 48 | _episodes_rmd/meanNDVI_SJER_2011.csv 49 | *.pyc 50 | *~ 51 | .ipynb_checkpoints 52 | .sass-cache 53 | .jekyll-cache/ 54 | .jekyll-metadata 55 | __pycache__ 56 | .bundle/ 57 | .vendor/ 58 | vendor/ 59 | .docker-vendor/ 60 | Gemfile.lock 61 | .*history -------------------------------------------------------------------------------- /.zenodo.json: -------------------------------------------------------------------------------- 1 | { 2 | "contributors": [ 3 | { 4 | "type": "Editor", 5 | "name": "Jemma Stachelek", 6 | "orcid": "0000-0002-5924-2464" 7 | }, 8 | { 9 | "type": "Editor", 10 | "name": "Drake Asberry" 11 | }, 12 | { 13 | "type": "Editor", 14 | "name": "Ivo Agbor Arrey", 15 | "orcid": "0000-0002-5311-3813" 16 | } 17 | ], 18 | "creators": [ 19 | { 20 | "name": "Jemma Stachelek", 21 | "orcid": "0000-0002-5924-2464" 22 | }, 23 | { 24 | "name": "Erin Alison Becker", 25 | "orcid": "0000-0002-6832-0233" 26 | }, 27 | { 28 | "name": "Drake Asberry" 29 | }, 30 | { 31 | "name": "Matt Strimas-Mackey", 32 | "orcid": "0000-0001-8929-7776" 33 | }, 34 | { 35 | "name": "Annajiat Alim Rasel", 36 | "orcid": "0000-0003-0198-3734" 37 | }, 38 | { 39 | "name": "Angela Li", 40 | "orcid": "0000-0002-8956-419X" 41 | }, 42 | { 43 | "name": "Ryan Avery" 44 | }, 45 | { 46 | "name": "kcarini", 47 | "orcid": "0000-0002-9630-0432" 48 | }, 49 | { 50 | "name": "Kunal Marwaha", 51 | "orcid": "0000-0001-9084-6971" 52 | }, 53 | { 54 | "name": "mneilson-usgs" 55 | }, 56 | { 57 | "name": "Adam H. 
Sparks", 58 | "orcid": "0000-0002-0061-8359" 59 | }, 60 | { 61 | "name": "bart1" 62 | }, 63 | { 64 | "name": "Christian Boldsen Knudsen", 65 | "orcid": "0000-0002-9816-768X" 66 | }, 67 | { 68 | "name": "Daniel Kerchner", 69 | "orcid": "0000-0002-5921-2193" 70 | }, 71 | { 72 | "name": "Darya P Vanichkina", 73 | "orcid": "0000-0002-0406-164X" 74 | }, 75 | { 76 | "name": "Pérez-Suárez", 77 | "orcid": "0000-0003-0784-6909" 78 | }, 79 | { 80 | "name": "Jon Jablonski" 81 | }, 82 | { 83 | "name": "Michael Liou" 84 | }, 85 | { 86 | "name": "Natalia Morandeira", 87 | "orcid": "0000-0003-3674-2981" 88 | }, 89 | { 90 | "name": "Rob Williams", 91 | "orcid": "0000-0001-9259-3883" 92 | } 93 | ], 94 | "license": { 95 | "id": "CC-BY-4.0" 96 | } 97 | } -------------------------------------------------------------------------------- /CITATION: -------------------------------------------------------------------------------- 1 | Data Carpentry Introduction to Geospatial Raster and Vector Data with R 2 | Leah Wasser; Megan A. Jones; Jemma Stachelek; Lachlan Deer; Zack Brym; Lauren O'Brien; Ana Costa Conrado; Aateka Shashank; Kristina Riemer; Anne Fouilloux; Juan Fung; Marchand; Tracy Teal; Sergio Marconi; James Holmquist; Mike Smorul; Punam Amratia; Erin Becker; Katrin Leinweber 3 | Editors: Jemma Stachelek; Lauren O'Brien; Jane Wyngaard 4 | https://doi.org/10.5281/zenodo.1404424 5 | -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Contributor Code of Conduct" 3 | --- 4 | 5 | As contributors and maintainers of this project, 6 | we pledge to follow the [The Carpentries Code of Conduct][coc]. 7 | 8 | Instances of abusive, harassing, or otherwise unacceptable behavior 9 | may be reported by following our [reporting guidelines][coc-reporting]. 
10 | 11 | 12 | [coc-reporting]: https://docs.carpentries.org/topic_folders/policies/incident-reporting.html 13 | [coc]: https://docs.carpentries.org/topic_folders/policies/code-of-conduct.html 14 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | ## Contributing 2 | 3 | [The Carpentries][cp-site] ([Software Carpentry][swc-site], [Data 4 | Carpentry][dc-site], and [Library Carpentry][lc-site]) are open source 5 | projects, and we welcome contributions of all kinds: new lessons, fixes to 6 | existing material, bug reports, and reviews of proposed changes are all 7 | welcome. 8 | 9 | ### Contributor Agreement 10 | 11 | By contributing, you agree that we may redistribute your work under [our 12 | license](LICENSE.md). In exchange, we will address your issues and/or assess 13 | your change proposal as promptly as we can, and help you become a member of our 14 | community. Everyone involved in [The Carpentries][cp-site] agrees to abide by 15 | our [code of conduct](CODE_OF_CONDUCT.md). 16 | 17 | ### How to Contribute 18 | 19 | The easiest way to get started is to file an issue to tell us about a spelling 20 | mistake, some awkward wording, or a factual error. This is a good way to 21 | introduce yourself and to meet some of our community members. 22 | 23 | 1. If you do not have a [GitHub][github] account, you can [send us comments by 24 | email][contact]. However, we will be able to respond more quickly if you use 25 | one of the other methods described below. 26 | 27 | 2. If you have a [GitHub][github] account, or are willing to [create 28 | one][github-join], but do not know how to use Git, you can report problems 29 | or suggest improvements by [creating an issue][issues]. This allows us to 30 | assign the item to someone and to respond to it in a threaded discussion. 31 | 32 | 3. 
If you are comfortable with Git, and would like to add or change material, 33 | you can submit a pull request (PR). Instructions for doing this are 34 | [included below](#using-github). 35 | 36 | Note: if you want to build the website locally, please refer to [The Workbench 37 | documentation][template-doc]. 38 | 39 | ### Where to Contribute 40 | 41 | 1. If you wish to change this lesson, add issues and pull requests here. 42 | 2. If you wish to change the template used for workshop websites, please refer 43 | to [The Workbench documentation][template-doc]. 44 | 45 | 46 | ### What to Contribute 47 | 48 | There are many ways to contribute, from writing new exercises and improving 49 | existing ones to updating or filling in the documentation and submitting [bug 50 | reports][issues] about things that do not work, are not clear, or are missing. 51 | If you are looking for ideas, please see [the list of issues for this 52 | repository][repo], or the issues for [Data Carpentry][dc-issues], [Library 53 | Carpentry][lc-issues], and [Software Carpentry][swc-issues] projects. 54 | 55 | Comments on issues and reviews of pull requests are just as welcome: we are 56 | smarter together than we are on our own. **Reviews from novices and newcomers 57 | are particularly valuable**: it's easy for people who have been using these 58 | lessons for a while to forget how impenetrable some of this material can be, so 59 | fresh eyes are always welcome. 60 | 61 | ### What *Not* to Contribute 62 | 63 | Our lessons already contain more material than we can cover in a typical 64 | workshop, so we are usually *not* looking for more concepts or tools to add to 65 | them. As a rule, if you want to introduce a new idea, you must (a) estimate how 66 | long it will take to teach and (b) explain what you would take out to make room 67 | for it. The first encourages contributors to be honest about requirements; the 68 | second, to think hard about priorities. 
69 | 70 | We are also not looking for exercises or other material that only run on one 71 | platform. Our workshops typically contain a mixture of Windows, macOS, and 72 | Linux users; in order to be usable, our lessons must run equally well on all 73 | three. 74 | 75 | ### Using GitHub 76 | 77 | If you choose to contribute via GitHub, you may want to look at [How to 78 | Contribute to an Open Source Project on GitHub][how-contribute]. In brief, we 79 | use [GitHub flow][github-flow] to manage changes: 80 | 81 | 1. Create a new branch in your desktop copy of this repository for each 82 | significant change. 83 | 2. Commit the change in that branch. 84 | 3. Push that branch to your fork of this repository on GitHub. 85 | 4. Submit a pull request from that branch to the [upstream repository][repo]. 86 | 5. If you receive feedback, make changes on your desktop and push to your 87 | branch on GitHub: the pull request will update automatically. 88 | 89 | NB: The published copy of the lesson is usually in the `main` branch. 90 | 91 | Each lesson has a team of maintainers who review issues and pull requests or 92 | encourage others to do so. The maintainers are community volunteers, and have 93 | final say over what gets merged into the lesson. 94 | 95 | ### Other Resources 96 | 97 | The Carpentries is a global organisation with volunteers and learners all over 98 | the world. We share values of inclusivity and a passion for sharing knowledge, 99 | teaching and learning. There are several ways to connect with The Carpentries 100 | community, including via social 101 | media, Slack, newsletters, and email lists. You can also [reach us by 102 | email][contact]. 
103 | 104 | [repo]: https://github.com/datacarpentry/r-raster-vector-geospatial 105 | [contact]: mailto:team@carpentries.org 106 | [cp-site]: https://carpentries.org/ 107 | [dc-issues]: https://github.com/issues?q=user%3Adatacarpentry 108 | [dc-lessons]: https://datacarpentry.org/lessons/ 109 | [dc-site]: https://datacarpentry.org/ 110 | [discuss-list]: https://lists.software-carpentry.org/listinfo/discuss 111 | [github]: https://github.com 112 | [github-flow]: https://guides.github.com/introduction/flow/ 113 | [github-join]: https://github.com/join 114 | [how-contribute]: https://egghead.io/series/how-to-contribute-to-an-open-source-project-on-github 115 | [issues]: https://carpentries.org/help-wanted-issues/ 116 | [lc-issues]: https://github.com/issues?q=user%3ALibraryCarpentry 117 | [swc-issues]: https://github.com/issues?q=user%3Aswcarpentry 118 | [swc-lessons]: https://software-carpentry.org/lessons/ 119 | [swc-site]: https://software-carpentry.org/ 120 | [lc-site]: https://librarycarpentry.org/ 121 | [template-doc]: https://carpentries.github.io/workbench/ 122 | -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Licenses" 3 | --- 4 | 5 | ## Instructional Material 6 | 7 | All Software Carpentry, Data Carpentry, and Library Carpentry instructional material is 8 | made available under the [Creative Commons Attribution 9 | license][cc-by-human]. The following is a human-readable summary of 10 | (and not a substitute for) the [full legal text of the CC BY 4.0 11 | license][cc-by-legal]. 12 | 13 | You are free: 14 | 15 | * to **Share**---copy and redistribute the material in any medium or format 16 | * to **Adapt**---remix, transform, and build upon the material 17 | 18 | for any purpose, even commercially. 19 | 20 | The licensor cannot revoke these freedoms as long as you follow the 21 | license terms. 
22 | 23 | Under the following terms: 24 | 25 | * **Attribution**---You must give appropriate credit (mentioning that 26 | your work is derived from work that is Copyright © Software 27 | Carpentry and, where practical, linking to 28 | http://software-carpentry.org/), provide a [link to the 29 | license][cc-by-human], and indicate if changes were made. You may do 30 | so in any reasonable manner, but not in any way that suggests the 31 | licensor endorses you or your use. 32 | 33 | **No additional restrictions**---You may not apply legal terms or 34 | technological measures that legally restrict others from doing 35 | anything the license permits. With the understanding that: 36 | 37 | Notices: 38 | 39 | * You do not have to comply with the license for elements of the 40 | material in the public domain or where your use is permitted by an 41 | applicable exception or limitation. 42 | * No warranties are given. The license may not give you all of the 43 | permissions necessary for your intended use. For example, other 44 | rights such as publicity, privacy, or moral rights may limit how you 45 | use the material. 46 | 47 | ## Software 48 | 49 | Except where otherwise noted, the example programs and other software 50 | provided by Software Carpentry and Data Carpentry are made available under the 51 | [OSI][osi]-approved 52 | [MIT license][mit-license]. 53 | 54 | Permission is hereby granted, free of charge, to any person obtaining 55 | a copy of this software and associated documentation files (the 56 | "Software"), to deal in the Software without restriction, including 57 | without limitation the rights to use, copy, modify, merge, publish, 58 | distribute, sublicense, and/or sell copies of the Software, and to 59 | permit persons to whom the Software is furnished to do so, subject to 60 | the following conditions: 61 | 62 | The above copyright notice and this permission notice shall be 63 | included in all copies or substantial portions of the Software. 
64 | 65 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 66 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 67 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 68 | NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE 69 | LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION 70 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION 71 | WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 72 | 73 | ## Trademark 74 | 75 | "The Carpentries", "Software Carpentry", "Data Carpentry", and "Library 76 | Carpentry" and their respective logos are registered trademarks of 77 | [The Carpentries, Inc.][carpentries]. 78 | 79 | [cc-by-human]: https://creativecommons.org/licenses/by/4.0/ 80 | [cc-by-legal]: https://creativecommons.org/licenses/by/4.0/legalcode 81 | [mit-license]: https://opensource.org/licenses/mit-license.html 82 | [carpentries]: https://carpentries.org 83 | [osi]: https://opensource.org 84 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | [![DOI](https://zenodo.org/badge/44772343.svg)](https://zenodo.org/badge/latestdoi/44772343) 2 | [![01 Build and Deploy Site](https://github.com/datacarpentry/r-raster-vector-geospatial/actions/workflows/sandpaper-main.yaml/badge.svg)](https://github.com/datacarpentry/r-raster-vector-geospatial/actions/workflows/sandpaper-main.yaml) 3 | [![Create a Slack Account with us](https://img.shields.io/badge/Create_Slack_Account-The_Carpentries-071159.svg)](https://swc-slack-invite.herokuapp.com/) 4 | [![Slack Status](https://img.shields.io/badge/Slack_Channel-dc--geospatial-E01563.svg)](https://swcarpentry.slack.com/messages/C9ME7G5RD) 5 | 6 | # R for Raster and Vector Data 7 | 8 | ## Contributing to lesson development 9 | 10 | - The lesson files to be edited are in the `_episodes` folder. 
This repository uses the `main` branch for development. 11 | - You can visualize the changes locally with the [sandpaper](https://github.com/carpentries/sandpaper) R package by executing either the `sandpaper::serve()` or `sandpaper::build_lesson()` commands. In the former case, the site will be rendered at [http://localhost:4321](http://localhost:4321). 12 | - Each time you push a change to GitHub, GitHub Actions rebuilds the lesson, and when it's successful (look for the green badge at the top of the README file), it publishes the result at [http://www.datacarpentry.org/r-raster-vector-geospatial/](http://www.datacarpentry.org/r-raster-vector-geospatial/) 13 | - Note: any manual commit to `gh-pages` will be erased and lost during the automated build and deploy cycle operated by GitHub Actions. 14 | 15 | ### Lesson Maintainers: 16 | 17 | - [Jemma Stachelek][stachelek_jemma] 18 | - [Ivo Arrey][arreyves] 19 | - Drake Asberry 20 | - [Jon Jablonski][jonjab] 21 | 22 | [stachelek_jemma]: https://carpentries.org/instructors/#jsta 23 | [arreyves]: https://carpentries.org/instructors/#arreyves 24 | -------------------------------------------------------------------------------- /about.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | description: "A site devoted to open science and open data." 4 | Tags: [] 5 | permalink: about/ 6 | image: 7 | feature: NEONCarpentryHeader_2.png 8 | credit: National Ecological Observatory Network (NEON) 9 | creditlink: http://www.neoninc.org 10 | --- 11 | 12 | 13 | ## About the NEON / Data Carpentry Hackathon 14 | 15 | The National Ecological Observatory Network (NEON) is hosting a 3-day lesson-building hackathon to develop a suite of NEON / Data Carpentry data tutorials and corresponding assessment instruments. 
The tutorials and assessment instruments will be used to teach fundamental big data skills needed to work efficiently with large spatio-temporal data using open tools, such as R, Python, and PostgreSQL. 16 | 17 | Learn more about the Hackathon on the NEON website. 18 | 19 | -------------------------------------------------------------------------------- /config.yaml: -------------------------------------------------------------------------------- 1 | #------------------------------------------------------------ 2 | # Values for this lesson. 3 | #------------------------------------------------------------ 4 | 5 | # Which carpentry is this (swc, dc, lc, or cp)? 6 | # swc: Software Carpentry 7 | # dc: Data Carpentry 8 | # lc: Library Carpentry 9 | # cp: Carpentries (to use for instructor training for instance) 10 | # incubator: The Carpentries Incubator 11 | carpentry: 'dc' 12 | 13 | # Overall title for pages. 14 | title: 'Introduction to Geospatial Raster and Vector Data with R' 15 | 16 | # Date the lesson was created (YYYY-MM-DD, this is empty by default) 17 | created: '2015-10-22' 18 | 19 | # Comma-separated list of keywords for the lesson 20 | keywords: 'software, data, lesson, The Carpentries' 21 | 22 | # Life cycle stage of the lesson 23 | # possible values: pre-alpha, alpha, beta, stable 24 | life_cycle: 'transition-step-2' 25 | 26 | # License of the lesson 27 | license: 'CC-BY 4.0' 28 | 29 | # Link to the source repository for this lesson 30 | source: 'https://github.com/datacarpentry/r-raster-vector-geospatial/' 31 | 32 | # Default branch of your lesson 33 | branch: 'main' 34 | 35 | # Who to contact if there are any issues 36 | contact: 'team@carpentries.org' 37 | 38 | # Navigation ------------------------------------------------ 39 | # 40 | # Use the following menu items to specify the order of 41 | # individual pages in each dropdown section. Leave blank to 42 | # include all pages in the folder. 
43 | # 44 | # Example ------------- 45 | # 46 | # episodes: 47 | # - introduction.md 48 | # - first-steps.md 49 | # 50 | # learners: 51 | # - setup.md 52 | # 53 | # instructors: 54 | # - instructor-notes.md 55 | # 56 | # profiles: 57 | # - one-learner.md 58 | # - another-learner.md 59 | 60 | # Order of episodes in your lesson 61 | episodes: 62 | - 01-raster-structure.Rmd 63 | - 02-raster-plot.Rmd 64 | - 03-raster-reproject-in-r.Rmd 65 | - 04-raster-calculations-in-r.Rmd 66 | - 05-raster-multi-band-in-r.Rmd 67 | - 06-vector-open-shapefile-in-r.Rmd 68 | - 07-vector-shapefile-attributes-in-r.Rmd 69 | - 08-vector-plot-shapefiles-custom-legend.Rmd 70 | - 09-vector-when-data-dont-line-up-crs.Rmd 71 | - 10-vector-csv-to-shapefile-in-r.Rmd 72 | - 11-vector-raster-integration.Rmd 73 | - 12-time-series-raster.Rmd 74 | - 13-plot-time-series-rasters-in-r.Rmd 75 | - 14-extract-ndvi-from-rasters-in-r.Rmd 76 | 77 | # Information for Learners 78 | learners: 79 | 80 | # Information for Instructors 81 | instructors: 82 | 83 | # Learner Profiles 84 | profiles: 85 | 86 | # Customisation --------------------------------------------- 87 | # 88 | # This space below is where custom yaml items (e.g. pinning 89 | # sandpaper and varnish versions) should live 90 | 91 | 92 | url: 'https://datacarpentry.org/r-raster-vector-geospatial' 93 | -------------------------------------------------------------------------------- /episodes/02-raster-plot.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: Plot Raster Data 3 | teaching: 40 4 | exercises: 30 5 | source: Rmd 6 | --- 7 | 8 | ```{r setup, echo=FALSE} 9 | source("setup.R") 10 | ``` 11 | 12 | ::::::::::::::::::::::::::::::::::::::: objectives 13 | 14 | - Build customized plots for a single band raster using the `ggplot2` package. 15 | - Layer a raster dataset on top of a hillshade to create an elegant basemap. 
16 | 17 | :::::::::::::::::::::::::::::::::::::::::::::::::: 18 | 19 | :::::::::::::::::::::::::::::::::::::::: questions 20 | 21 | - How can I create categorized or customized maps of raster data? 22 | - How can I customize the color scheme of a raster image? 23 | - How can I layer raster data in a single image? 24 | 25 | :::::::::::::::::::::::::::::::::::::::::::::::::: 26 | 27 | ```{r load-libraries, echo=FALSE, results="hide", message=FALSE, warning=FALSE} 28 | library(terra) 29 | library(ggplot2) 30 | library(dplyr) 31 | ``` 32 | 33 | ```{r load-data, echo=FALSE} 34 | # Learners will have this data loaded from earlier episode 35 | # DSM data for Harvard Forest 36 | DSM_HARV <- 37 | rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/DSM/HARV_dsmCrop.tif") 38 | 39 | DSM_HARV_df <- as.data.frame(DSM_HARV, xy = TRUE) 40 | ``` 41 | 42 | :::::::::::::::::::::::::::::::::::::::::: prereq 43 | 44 | ## Things You'll Need To Complete This Episode 45 | 46 | See the [lesson homepage](.) for detailed information about the software, 47 | data, and other prerequisites you will need to work through the examples in this episode. 48 | 49 | 50 | :::::::::::::::::::::::::::::::::::::::::::::::::: 51 | 52 | ## Plot Raster Data in R 53 | 54 | This episode covers how to plot a raster in R using the `ggplot2` 55 | package with customized coloring schemes. 56 | It also covers how to layer a raster on top of a hillshade to produce an 57 | elegant map. We will continue working with the Digital Surface Model (DSM) 58 | raster for the NEON Harvard Forest Field Site. 59 | 60 | ## Plotting Data Using Breaks 61 | 62 | In the previous episode, we viewed our data using a continuous color ramp. For 63 | clarity and visibility of the plot, we may prefer to view the data "symbolized" 64 | or colored according to ranges of values. This is comparable to a "classified" 65 | map. To do this, we need to tell `ggplot` how many groups to break our data 66 | into, and where those breaks should be. 
To make these decisions, it is useful 67 | to first explore the distribution of the data using a bar plot. To begin with, 68 | we will use `dplyr`'s `mutate()` function combined with `cut()` to split the 69 | data into 3 bins. 70 | 71 | ```{r histogram-breaks-ggplot} 72 | 73 | DSM_HARV_df <- DSM_HARV_df %>% 74 | mutate(fct_elevation = cut(HARV_dsmCrop, breaks = 3)) 75 | 76 | ggplot() + 77 | geom_bar(data = DSM_HARV_df, aes(fct_elevation)) 78 | 79 | ``` 80 | 81 | If we want to know the cutoff values for the groups, we can ask for the unique 82 | values of `fct_elevation`: 83 | 84 | ```{r unique-breaks} 85 | unique(DSM_HARV_df$fct_elevation) 86 | ``` 87 | 88 | And we can get the count of values in each group using `dplyr`'s `count()` function: 89 | 90 | ```{r breaks-count} 91 | DSM_HARV_df %>% 92 | count(fct_elevation) 93 | ``` 94 | 95 | We might prefer to customize the cutoff values for these groups. 96 | Let's round the cutoff values so that we have groups for the ranges of 97 | 301–350 m, 351–400 m, and 401–450 m. 98 | To implement this, we will give `mutate()` a numeric vector of break points 99 | instead of the number of breaks we want. 100 | 101 | ```{r custom-bins} 102 | custom_bins <- c(300, 350, 400, 450) 103 | 104 | DSM_HARV_df <- DSM_HARV_df %>% 105 | mutate(fct_elevation_2 = cut(HARV_dsmCrop, breaks = custom_bins)) 106 | 107 | unique(DSM_HARV_df$fct_elevation_2) 108 | ``` 109 | 110 | ::::::::::::::::::::::::::::::::::::::::: callout 111 | 112 | ## Data Tips 113 | 114 | Note that when we assign break values, a set of 4 values will result in 3 bins 115 | of data. 116 | 117 | The bin intervals are shown using `(` to mean exclusive and `]` to mean 118 | inclusive. For example: `(305, 342]` means "greater than 305 and up to and including 342". 
119 | 120 | 121 | :::::::::::::::::::::::::::::::::::::::::::::::::: 122 | 123 | And now we can plot our bar plot again, using the new groups: 124 | 125 | ```{r histogram-custom-breaks} 126 | ggplot() + 127 | geom_bar(data = DSM_HARV_df, aes(fct_elevation_2)) 128 | ``` 129 | 130 | And we can get the count of values in each group in the same way we did before: 131 | 132 | ```{r break-count-custom} 133 | DSM_HARV_df %>% 134 | count(fct_elevation_2) 135 | ``` 136 | 137 | We can use those groups to plot our raster data, with each group being a 138 | different color: 139 | 140 | ```{r raster-with-breaks} 141 | ggplot() + 142 | geom_raster(data = DSM_HARV_df, aes(x = x, y = y, fill = fct_elevation_2)) + 143 | coord_quickmap() 144 | ``` 145 | 146 | The plot above uses the default colors inside `ggplot` for raster objects. 147 | We can specify our own colors to make the plot look a little nicer. 148 | R has a built-in set of terrain colors, available through 149 | the `terrain.colors()` function. 150 | Since we have three bins, we want to create a 3-color palette: 151 | 152 | ```{r terrain-colors} 153 | terrain.colors(3) 154 | ``` 155 | 156 | The `terrain.colors()` function returns *hex colors* - 157 | each of these character strings represents a color. 158 | To use these in our map, we pass them to the 159 | `scale_fill_manual()` function. 160 | 161 | ```{r ggplot-breaks-customcolors} 162 | 163 | ggplot() + 164 | geom_raster(data = DSM_HARV_df, aes(x = x, y = y, 165 | fill = fct_elevation_2)) + 166 | scale_fill_manual(values = terrain.colors(3)) + 167 | coord_quickmap() 168 | ``` 169 | 170 | ### More Plot Formatting 171 | 172 | If we need to create multiple plots using the same color palette, we can create 173 | an R object (`my_col`) for the set of colors that we want to use. We can then 174 | quickly change the palette across all plots by modifying the `my_col` object, 175 | rather than each individual plot. 
176 | 177 | We can label the x- and y-axes of our plot using `xlab` and `ylab`. 178 | We can also give the legend a more meaningful title by passing a value 179 | to the `name` argument of the `scale_fill_manual()` function. 180 | 181 | ```{r add-ggplot-labels} 182 | 183 | my_col <- terrain.colors(3) 184 | 185 | ggplot() + 186 | geom_raster(data = DSM_HARV_df, aes(x = x, y = y, 187 | fill = fct_elevation_2)) + 188 | scale_fill_manual(values = my_col, name = "Elevation") + 189 | coord_quickmap() 190 | ``` 191 | 192 | Alternatively, we can turn off the labels of both axes by passing `element_blank()` to 193 | the relevant part of the `theme()` function. 194 | 195 | ```{r turn-off-axes} 196 | ggplot() + 197 | geom_raster(data = DSM_HARV_df, aes(x = x, y = y, 198 | fill = fct_elevation_2)) + 199 | scale_fill_manual(values = my_col, name = "Elevation") + 200 | theme(axis.title = element_blank()) + 201 | coord_quickmap() 202 | ``` 203 | 204 | ::::::::::::::::::::::::::::::::::::::: challenge 205 | 206 | ## Challenge: Plot Using Custom Breaks 207 | 208 | Create a plot of the Harvard Forest Digital Surface Model (DSM) that has: 209 | 210 | 1. Six classified ranges of values (break points) that are evenly divided among 211 | the range of pixel values. 212 | 2. Axis labels. 213 | 3. A plot title. 
214 | 215 | ::::::::::::::: solution 216 | 217 | ## Answers 218 | 219 | ```{r challenge-code-plotting} 220 | 221 | DSM_HARV_df <- DSM_HARV_df %>% 222 | mutate(fct_elevation_6 = cut(HARV_dsmCrop, breaks = 6)) 223 | 224 | my_col <- terrain.colors(6) 225 | 226 | ggplot() + 227 | geom_raster(data = DSM_HARV_df , aes(x = x, y = y, 228 | fill = fct_elevation_6)) + 229 | scale_fill_manual(values = my_col, name = "Elevation") + 230 | ggtitle("Classified Elevation Map - NEON Harvard Forest Field Site") + 231 | xlab("UTM Easting Coordinate (m)") + 232 | ylab("UTM Northing Coordinate (m)") + 233 | coord_quickmap() 234 | ``` 235 | 236 | ::::::::::::::::::::::::: 237 | 238 | :::::::::::::::::::::::::::::::::::::::::::::::::: 239 | 240 | ## Layering Rasters 241 | 242 | We can layer a raster on top of a hillshade raster for the same area, and use a 243 | transparency factor to create a 3-dimensional shaded effect. A 244 | hillshade is a raster that maps the shadows and texture that you would see from 245 | above when viewing terrain. 246 | We will add a custom color, making the plot grey. 
247 | 248 | First we need to read in our DSM hillshade data and view the structure: 249 | 250 | ```{r} 251 | DSM_hill_HARV <- 252 | rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/DSM/HARV_DSMhill.tif") 253 | 254 | DSM_hill_HARV 255 | ``` 256 | 257 | Next we convert it to a dataframe, so that we can plot it using `ggplot2`: 258 | 259 | ```{r} 260 | DSM_hill_HARV_df <- as.data.frame(DSM_hill_HARV, xy = TRUE) 261 | 262 | str(DSM_hill_HARV_df) 263 | ``` 264 | 265 | Now we can plot the hillshade data: 266 | 267 | ```{r raster-hillshade} 268 | ggplot() + 269 | geom_raster(data = DSM_hill_HARV_df, 270 | aes(x = x, y = y, alpha = HARV_DSMhill)) + 271 | scale_alpha(range = c(0.15, 0.65), guide = "none") + 272 | coord_quickmap() 273 | ``` 274 | 275 | ::::::::::::::::::::::::::::::::::::::::: callout 276 | 277 | ## Data Tips 278 | 279 | Turn off, or hide, the legend on a plot by adding `guide = "none"` 280 | to a `scale_something()` function or by setting 281 | `theme(legend.position = "none")`. 282 | 283 | The alpha value determines how transparent the colors will be (0 being 284 | fully transparent, 1 being fully opaque). 285 | 286 | 287 | :::::::::::::::::::::::::::::::::::::::::::::::::: 288 | 289 | We can layer another raster on top of our hillshade by adding another call to 290 | the `geom_raster()` function. Let's overlay `DSM_HARV` on top of `DSM_hill_HARV`.
291 | 292 | ```{r overlay-hillshade} 293 | ggplot() + 294 | geom_raster(data = DSM_HARV_df , 295 | aes(x = x, y = y, 296 | fill = HARV_dsmCrop)) + 297 | geom_raster(data = DSM_hill_HARV_df, 298 | aes(x = x, y = y, 299 | alpha = HARV_DSMhill)) + 300 | scale_fill_viridis_c() + 301 | scale_alpha(range = c(0.15, 0.65), guide = "none") + 302 | ggtitle("Elevation with hillshade") + 303 | coord_quickmap() 304 | ``` 305 | 306 | ::::::::::::::::::::::::::::::::::::::: challenge 307 | 308 | ## Challenge: Create DTM \& DSM for SJER 309 | 310 | Use the files in the `data/NEON-DS-Airborne-Remote-Sensing/SJER/` directory to 311 | create a Digital Terrain Model map and Digital Surface Model map of the San 312 | Joaquin Experimental Range field site. 313 | 314 | Make sure to: 315 | 316 | - include hillshade in the maps, 317 | - label axes on the DSM map and exclude them from the DTM map, 318 | - include a title for each map, 319 | - experiment with various alpha values and color palettes to represent the 320 | data. 
321 | 322 | ::::::::::::::: solution 323 | 324 | ## Answers 325 | 326 | ```{r challenge-hillshade-layering, echo=TRUE} 327 | # CREATE DSM MAPS 328 | 329 | # import DSM data 330 | DSM_SJER <- 331 | rast("data/NEON-DS-Airborne-Remote-Sensing/SJER/DSM/SJER_dsmCrop.tif") 332 | # convert to a df for plotting 333 | DSM_SJER_df <- as.data.frame(DSM_SJER, xy = TRUE) 334 | 335 | # import DSM hillshade 336 | DSM_hill_SJER <- 337 | rast("data/NEON-DS-Airborne-Remote-Sensing/SJER/DSM/SJER_dsmHill.tif") 338 | # convert to a df for plotting 339 | DSM_hill_SJER_df <- as.data.frame(DSM_hill_SJER, xy = TRUE) 340 | 341 | # Build Plot 342 | ggplot() + 343 | geom_raster(data = DSM_SJER_df, 344 | aes(x = x, y = y, 345 | fill = SJER_dsmCrop), 346 | alpha = 0.8 # a constant alpha belongs outside aes() 347 | ) + 348 | geom_raster(data = DSM_hill_SJER_df, 349 | aes(x = x, y = y, 350 | alpha = SJER_dsmHill) 351 | ) + 352 | scale_fill_viridis_c() + 353 | guides(fill = guide_colorbar()) + 354 | scale_alpha(range = c(0.4, 0.7), guide = "none") + 355 | # remove grey background and grid lines 356 | theme_bw() + 357 | theme(panel.grid.major = element_blank(), 358 | panel.grid.minor = element_blank()) + 359 | xlab("UTM Easting Coordinate (m)") + 360 | ylab("UTM Northing Coordinate (m)") + 361 | ggtitle("DSM with Hillshade") + 362 | coord_quickmap() 363 | 364 | # CREATE DTM MAP 365 | # import DTM 366 | DTM_SJER <- 367 | rast("data/NEON-DS-Airborne-Remote-Sensing/SJER/DTM/SJER_dtmCrop.tif") 368 | DTM_SJER_df <- as.data.frame(DTM_SJER, xy = TRUE) 369 | 370 | # DTM Hillshade 371 | DTM_hill_SJER <- 372 | rast("data/NEON-DS-Airborne-Remote-Sensing/SJER/DTM/SJER_dtmHill.tif") 373 | DTM_hill_SJER_df <- as.data.frame(DTM_hill_SJER, xy = TRUE) 374 | 375 | ggplot() + 376 | geom_raster(data = DTM_SJER_df, 377 | aes(x = x, y = y, 378 | fill = SJER_dtmCrop), 379 | alpha = 0.8 # alpha must lie between 0 and 1 380 | ) + 381 | geom_raster(data = DTM_hill_SJER_df, 382 | aes(x = x, y = y, 383 | alpha = SJER_dtmHill) 384 | ) + 385 | scale_fill_viridis_c() + 386 | guides(fill =
guide_colorbar()) + 387 | scale_alpha(range = c(0.4, 0.7), guide = "none") + 388 | theme_bw() + 389 | theme(panel.grid.major = element_blank(), 390 | panel.grid.minor = element_blank()) + 391 | theme(axis.title.x = element_blank(), 392 | axis.title.y = element_blank()) + 393 | ggtitle("DTM with Hillshade") + 394 | coord_quickmap() 395 | ``` 396 | 397 | ::::::::::::::::::::::::: 398 | 399 | :::::::::::::::::::::::::::::::::::::::::::::::::: 400 | 401 | 402 | 403 | :::::::::::::::::::::::::::::::::::::::: keypoints 404 | 405 | - Continuous data ranges can be grouped into categories using `mutate()` and `cut()`. 406 | - Use built-in `terrain.colors()` or set your preferred color scheme manually. 407 | - Layer rasters on top of one another by using the `alpha` aesthetic. 408 | 409 | :::::::::::::::::::::::::::::::::::::::::::::::::: 410 | 411 | 412 | -------------------------------------------------------------------------------- /episodes/03-raster-reproject-in-r.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: Reproject Raster Data 3 | teaching: 40 4 | exercises: 20 5 | source: Rmd 6 | --- 7 | 8 | ```{r setup, echo=FALSE} 9 | source("setup.R") 10 | ``` 11 | 12 | ::::::::::::::::::::::::::::::::::::::: objectives 13 | 14 | - Reproject a raster in R. 15 | 16 | :::::::::::::::::::::::::::::::::::::::::::::::::: 17 | 18 | :::::::::::::::::::::::::::::::::::::::: questions 19 | 20 | - How do I work with raster data sets that are in different projections? 21 | 22 | :::::::::::::::::::::::::::::::::::::::::::::::::: 23 | 24 | ```{r load-libraries, echo=FALSE, results="hide", message=FALSE, warning=FALSE} 25 | library(terra) 26 | library(ggplot2) 27 | library(dplyr) 28 | ``` 29 | 30 | :::::::::::::::::::::::::::::::::::::::::: prereq 31 | 32 | ## Things You'll Need To Complete This Episode 33 | 34 | See the [lesson homepage](.) 
for detailed information about the software, 35 | data, and other prerequisites you will need to work through the examples in 36 | this episode. 37 | 38 | 39 | :::::::::::::::::::::::::::::::::::::::::::::::::: 40 | 41 | Sometimes we encounter raster datasets that do not "line up" when plotted or 42 | analyzed. Rasters that don't line up are most often in different Coordinate 43 | Reference Systems (CRSs). This episode explains how to deal with rasters in 44 | different, known CRSs. It will walk through reprojecting rasters in R using 45 | the `project()` function in the `terra` package. 46 | 47 | ## Raster Projection in R 48 | 49 | In the [Plot Raster Data in R](02-raster-plot/) 50 | episode, we learned how to layer a raster file on top of a hillshade for a 51 | nice-looking basemap. In that episode, all of our data were in the same CRS. What 52 | happens when things don't line up? 53 | 54 | For this episode, we will be working with the Harvard Forest Digital Terrain 55 | Model data. This differs from the surface model data we've been working with so 56 | far in that the digital surface model (DSM) includes the tops of trees, while 57 | the digital terrain model (DTM) shows the ground level. 58 | 59 | We'll be looking at another model (the canopy height model) in 60 | [a later episode](04-raster-calculations-in-r/) and will see how to calculate 61 | the CHM from the DSM and DTM. Here, we will create a map of the Harvard Forest 62 | Digital Terrain Model (`DTM_HARV`) draped or layered on top of the hillshade 63 | (`DTM_hill_HARV`). 64 | The hillshade layer maps the terrain using light and shadow to create a 65 | 3D-looking image, based on a hypothetical illumination of the ground level. 66 | 67 | ![](fig/dc-spatial-raster/lidarTree-height.png){alt='Source: National Ecological Observatory Network (NEON).'} 68 | 69 | First, we need to import the DTM and DTM hillshade data.
70 | 71 | ```{r import-DTM-hillshade} 72 | DTM_HARV <- 73 | rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/DTM/HARV_dtmCrop.tif") 74 | 75 | DTM_hill_HARV <- 76 | rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/DTM/HARV_DTMhill_WGS84.tif") 77 | ``` 78 | 79 | Next, we will convert each of these datasets to a dataframe for 80 | plotting with `ggplot`. 81 | 82 | ```{r} 83 | DTM_HARV_df <- as.data.frame(DTM_HARV, xy = TRUE) 84 | 85 | DTM_hill_HARV_df <- as.data.frame(DTM_hill_HARV, xy = TRUE) 86 | ``` 87 | 88 | Now we can create a map of the DTM layered over the hillshade. 89 | 90 | ```{r} 91 | ggplot() + 92 | geom_raster(data = DTM_HARV_df , 93 | aes(x = x, y = y, 94 | fill = HARV_dtmCrop)) + 95 | geom_raster(data = DTM_hill_HARV_df, 96 | aes(x = x, y = y, 97 | alpha = HARV_DTMhill_WGS84)) + 98 | scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) + 99 | coord_quickmap() 100 | ``` 101 | 102 | Our results are curious - neither the Digital Terrain Model (`DTM_HARV_df`) 103 | nor the DTM Hillshade (`DTM_hill_HARV_df`) plotted. 104 | Let's try to plot the DTM on its own to make sure there are data there. 105 | 106 | ```{r plot-DTM} 107 | ggplot() + 108 | geom_raster(data = DTM_HARV_df, 109 | aes(x = x, y = y, 110 | fill = HARV_dtmCrop)) + 111 | scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) + 112 | coord_quickmap() 113 | ``` 114 | 115 | Our DTM seems to contain data and plots just fine. 116 | 117 | Next we plot the DTM Hillshade on its own to see whether everything is OK. 118 | 119 | ```{r plot-DTM-hill} 120 | ggplot() + 121 | geom_raster(data = DTM_hill_HARV_df, 122 | aes(x = x, y = y, 123 | alpha = HARV_DTMhill_WGS84)) + 124 | coord_quickmap() 125 | ``` 126 | 127 | If we look at the axes, we can see that the projections of the two rasters are 128 | different. 129 | When this is the case, `ggplot` won't render the image. It won't even throw an 130 | error message to tell you something has gone wrong. 
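Before inspecting the CRSs by eye, a quick programmatic check can confirm a mismatch. The `terra` package provides the `same.crs()` function for comparing the CRSs of two objects. A small sketch (an aside, not a required step in this lesson):

```{r check-crs-match, eval=FALSE}
# returns TRUE only if the two rasters use equivalent CRSs;
# for these two rasters (UTM vs. WGS84) it returns FALSE
same.crs(DTM_HARV, DTM_hill_HARV)
```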
We can look at the Coordinate 131 | Reference Systems (CRSs) of the DTM and the hillshade data to see how they 132 | differ. 133 | 134 | ::::::::::::::::::::::::::::::::::::::: challenge 135 | 136 | ## Exercise 137 | 138 | View the CRS for each of these two datasets. What projection 139 | does each use? 140 | 141 | ::::::::::::::: solution 142 | 143 | ## Solution 144 | 145 | ```{r explore-crs} 146 | # view crs for DTM 147 | crs(DTM_HARV, parse = TRUE) 148 | 149 | # view crs for hillshade 150 | crs(DTM_hill_HARV, parse = TRUE) 151 | ``` 152 | 153 | `DTM_HARV` is in the UTM projection, with units of meters. 154 | `DTM_hill_HARV` is in 155 | `Geographic WGS84` - which is represented by latitude and longitude values. 156 | 157 | 158 | 159 | ::::::::::::::::::::::::: 160 | 161 | :::::::::::::::::::::::::::::::::::::::::::::::::: 162 | 163 | Because the two rasters are in different CRSs, they don't line up when plotted 164 | in R. We need to reproject (or change the projection of) `DTM_hill_HARV` into 165 | the UTM CRS. Alternatively, we could reproject `DTM_HARV` into WGS84. 166 | 167 | ## Reproject Rasters 168 | 169 | We can use the `project()` function to reproject a raster into a new CRS. 170 | Keep in mind that reprojection only works when the raster you want to reproject 171 | already has a defined CRS; `project()` cannot be used if no 172 | CRS is defined. Luckily for us, `DTM_hill_HARV` has a defined CRS. 173 | 174 | ::::::::::::::::::::::::::::::::::::::::: callout 175 | 176 | ## Data Tip 177 | 178 | When we reproject a raster, we move it from one "grid" to another. Thus, we are 179 | modifying the data! Keep this in mind as we work with raster data. 180 | 181 | 182 | :::::::::::::::::::::::::::::::::::::::::::::::::: 183 | 184 | To use the `project()` function, we need to define two things: 185 | 186 | 1. the object we want to reproject and 187 | 2. the CRS that we want to reproject it to.
188 | 189 | The syntax is `project(RasterObject, crs)`. 190 | 191 | We want the CRS of our hillshade to match the `DTM_HARV` raster. We can thus 192 | assign the CRS of our `DTM_HARV` to our hillshade within the `project()` 193 | function as follows: `crs(DTM_HARV)`. 194 | Note that we are using the `project()` function on the raster object, 195 | not the data frame we use for plotting with `ggplot`. 196 | 197 | First we will reproject our `DTM_hill_HARV` raster data to match the `DTM_HARV` 198 | raster CRS: 199 | 200 | ```{r reproject-raster} 201 | DTM_hill_UTMZ18N_HARV <- project(DTM_hill_HARV, 202 | crs(DTM_HARV)) 203 | ``` 204 | 205 | Now we can compare the CRS of our original DTM hillshade and our new DTM 206 | hillshade, to see how they are different. 207 | 208 | ```{r} 209 | crs(DTM_hill_UTMZ18N_HARV, parse = TRUE) 210 | crs(DTM_hill_HARV, parse = TRUE) 211 | ``` 212 | 213 | We can also compare the extent of the two objects. 214 | 215 | ```{r} 216 | ext(DTM_hill_UTMZ18N_HARV) 217 | ext(DTM_hill_HARV) 218 | ``` 219 | 220 | Notice in the output above that the `crs()` of `DTM_hill_UTMZ18N_HARV` is now 221 | UTM. However, the extent values of `DTM_hill_UTMZ18N_HARV` are different from 222 | those of `DTM_hill_HARV`. 223 | 224 | ::::::::::::::::::::::::::::::::::::::: challenge 225 | 226 | ## Challenge: Extent Change with CRS Change 227 | 228 | Why do you think the two extents differ? 229 | 230 | ::::::::::::::: solution 231 | 232 | ## Answers 233 | 234 | The extent for DTM\_hill\_UTMZ18N\_HARV is in UTM coordinates, so the extent is 235 | in meters. The extent for DTM\_hill\_HARV is in lat/long, so the extent is expressed in 236 | decimal degrees. 237 | 238 | 239 | 240 | ::::::::::::::::::::::::: 241 | 242 | :::::::::::::::::::::::::::::::::::::::::::::::::: 243 | 244 | ## Deal with Raster Resolution 245 | 246 | Let's next have a look at the resolution of our reprojected hillshade versus 247 | our original data.
248 | 249 | ```{r view-resolution} 250 | res(DTM_hill_UTMZ18N_HARV) 251 | res(DTM_HARV) 252 | ``` 253 | 254 | These two resolutions are different, but they're representing the same data. We 255 | can tell R to force our newly reprojected raster to be 1m x 1m resolution by 256 | adding the argument `res = 1` to the `project()` call. In the 257 | example below, we ensure the resolutions match by passing `res(DTM_HARV)` 258 | rather than a hard-coded value. 259 | 260 | ```{r reproject-assign-resolution} 261 | DTM_hill_UTMZ18N_HARV <- project(DTM_hill_HARV, 262 | crs(DTM_HARV), 263 | res = res(DTM_HARV)) 264 | ``` 265 | 266 | Now both our resolutions and our CRSs match, so we can plot these two data sets 267 | together. Let's double-check our resolution to be sure: 268 | 269 | ```{r} 270 | res(DTM_hill_UTMZ18N_HARV) 271 | res(DTM_HARV) 272 | ``` 273 | 274 | For plotting with `ggplot()`, we will need to create a dataframe from our newly 275 | reprojected raster. 276 | 277 | ```{r make-df-projected-raster} 278 | DTM_hill_HARV_2_df <- as.data.frame(DTM_hill_UTMZ18N_HARV, xy = TRUE) 279 | ``` 280 | 281 | We can now create a plot of this data. 282 | 283 | ```{r plot-projected-raster} 284 | ggplot() + 285 | geom_raster(data = DTM_HARV_df, 286 | aes(x = x, y = y, 287 | fill = HARV_dtmCrop)) + 288 | geom_raster(data = DTM_hill_HARV_2_df, 289 | aes(x = x, y = y, 290 | alpha = HARV_DTMhill_WGS84)) + 291 | scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) + 292 | coord_quickmap() 293 | ``` 294 | 295 | We have now successfully draped the Digital Terrain Model on top of our 296 | hillshade to produce a nice-looking, textured map! 297 | 298 | ::::::::::::::::::::::::::::::::::::::: challenge 299 | 300 | ## Challenge: Reproject, then Plot a Digital Terrain Model 301 | 302 | Create a map of the 303 | [San Joaquin Experimental Range](https://www.neonscience.org/field-sites/field-sites-map/SJER) 304 | field site using the `SJER_DSMhill_WGS84.tif` and `SJER_dsmCrop.tif` files.
305 | 306 | Reproject the data as necessary to make things line up! 307 | 308 | ::::::::::::::: solution 309 | 310 | ## Answers 311 | 312 | ```{r challenge-code-reprojection, echo=TRUE} 313 | # import DSM 314 | DSM_SJER <- 315 | rast("data/NEON-DS-Airborne-Remote-Sensing/SJER/DSM/SJER_dsmCrop.tif") 316 | # import DSM hillshade 317 | DSM_hill_SJER_WGS <- 318 | rast("data/NEON-DS-Airborne-Remote-Sensing/SJER/DSM/SJER_DSMhill_WGS84.tif") 319 | 320 | # reproject raster (SJER is in UTM Zone 11N) 321 | DSM_hill_UTMZ11N_SJER <- project(DSM_hill_SJER_WGS, 322 | crs(DSM_SJER), 323 | res = 1) 324 | 325 | # convert to data.frames 326 | DSM_SJER_df <- as.data.frame(DSM_SJER, xy = TRUE) 327 | 328 | DSM_hill_SJER_df <- as.data.frame(DSM_hill_UTMZ11N_SJER, xy = TRUE) 329 | 330 | ggplot() + 331 | geom_raster(data = DSM_hill_SJER_df, 332 | aes(x = x, y = y, 333 | alpha = SJER_DSMhill_WGS84) 334 | ) + 335 | geom_raster(data = DSM_SJER_df, 336 | aes(x = x, y = y, 337 | fill = SJER_dsmCrop), 338 | alpha = 0.8 339 | ) + 340 | scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) + 341 | coord_quickmap() 342 | ``` 343 | 344 | ::::::::::::::::::::::::: 345 | 346 | If you completed the San Joaquin plotting challenge in the 347 | [Plot Raster Data in R](02-raster-plot/) 348 | episode, how does the map you just created compare to that map? 349 | 350 | ::::::::::::::: solution 351 | 352 | ## Answers 353 | 354 | The maps look identical, which is expected: the only difference is that 355 | this one was reprojected from WGS84 to UTM prior to plotting. 356 | 357 | 358 | 359 | ::::::::::::::::::::::::: 360 | 361 | :::::::::::::::::::::::::::::::::::::::::::::::::: 362 | 363 | 364 | 365 | :::::::::::::::::::::::::::::::::::::::: keypoints 366 | 367 | - In order to plot two raster data sets together, they must be in the same CRS. 368 | - Use the `project()` function to convert between CRSs.
369 | 370 | :::::::::::::::::::::::::::::::::::::::::::::::::: 371 | 372 | 373 | -------------------------------------------------------------------------------- /episodes/04-raster-calculations-in-r.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: Raster Calculations 3 | teaching: 40 4 | exercises: 20 5 | source: Rmd 6 | --- 7 | 8 | ```{r setup, echo=FALSE} 9 | source("setup.R") 10 | ``` 11 | 12 | ::::::::::::::::::::::::::::::::::::::: objectives 13 | 14 | - Perform a subtraction between two rasters using raster math. 15 | - Perform a more efficient subtraction between two rasters using the `terra` `lapp()` function. 16 | - Export raster data as a GeoTIFF file. 17 | 18 | :::::::::::::::::::::::::::::::::::::::::::::::::: 19 | 20 | :::::::::::::::::::::::::::::::::::::::: questions 21 | 22 | - How do I subtract one raster from another and extract pixel values for defined locations? 23 | 24 | :::::::::::::::::::::::::::::::::::::::::::::::::: 25 | 26 | ```{r load-libraries, echo=FALSE, results="hide", message=FALSE, warning=FALSE} 27 | library(terra) 28 | library(ggplot2) 29 | library(dplyr) 30 | ``` 31 | 32 | ```{r load-data, echo=FALSE} 33 | # Learners will have these data loaded from earlier episode 34 | # DSM data for Harvard Forest 35 | DSM_HARV <- 36 | rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/DSM/HARV_dsmCrop.tif") 37 | 38 | DSM_HARV_df <- as.data.frame(DSM_HARV, xy = TRUE) 39 | 40 | # DTM data for Harvard Forest 41 | DTM_HARV <- 42 | rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/DTM/HARV_dtmCrop.tif") 43 | 44 | DTM_HARV_df <- as.data.frame(DTM_HARV, xy = TRUE) 45 | 46 | # DSM data for SJER 47 | DSM_SJER <- 48 | rast("data/NEON-DS-Airborne-Remote-Sensing/SJER/DSM/SJER_dsmCrop.tif") 49 | 50 | DSM_SJER_df <- as.data.frame(DSM_SJER, xy = TRUE) 51 | 52 | # DTM data for SJER 53 | DTM_SJER <- 54 | rast("data/NEON-DS-Airborne-Remote-Sensing/SJER/DTM/SJER_dtmCrop.tif") 55 | 56 | DTM_SJER_df <-
as.data.frame(DTM_SJER, xy = TRUE) 57 | ``` 58 | 59 | :::::::::::::::::::::::::::::::::::::::::: prereq 60 | 61 | ## Things You'll Need To Complete This Episode 62 | 63 | See the [lesson homepage](.) for detailed information about the software, 64 | data, and other prerequisites you will need to work through the examples in 65 | this episode. 66 | 67 | 68 | :::::::::::::::::::::::::::::::::::::::::::::::::: 69 | 70 | We often want to combine two or more rasters and perform calculations on them 71 | to create a new output raster. This episode covers how to subtract one raster from 72 | another using basic raster math and the `lapp()` function. It also covers 73 | how to extract pixel values from a set of locations - for example a buffer 74 | region around plot locations at a field site. 75 | 76 | ## Raster Calculations in R 77 | 78 | We often want to perform calculations on two or more rasters to create a new 79 | output raster. For example, if we are interested in mapping the heights of 80 | trees across an entire field site, we might want to calculate the difference 81 | between the Digital Surface Model (DSM, tops of trees) and the Digital Terrain 82 | Model (DTM, ground level). The resulting dataset is referred to as a Canopy 83 | Height Model (CHM) and represents the actual height of trees, buildings, etc. 84 | with the influence of ground elevation removed. 85 | 86 | ![](fig/dc-spatial-raster/lidarTree-height.png){alt='Source: National Ecological Observatory Network (NEON)'} 87 | 88 | ::::::::::::::::::::::::::::::::::::::::: callout 89 | 90 | ## More Resources 91 | 92 | - Check out more on LiDAR CHM, DTM and DSM in this NEON Data Skills overview tutorial: 93 | [What is a CHM, DSM and DTM? About Gridded, Raster LiDAR Data](https://www.neonscience.org/chm-dsm-dtm-gridded-lidar-data).
94 | 95 | 96 | :::::::::::::::::::::::::::::::::::::::::::::::::: 97 | 98 | ### Load the Data 99 | 100 | For this episode, we will use the DTM and DSM from the NEON Harvard Forest 101 | Field site and San Joaquin Experimental Range, which we already have loaded 102 | from previous episodes. 103 | 104 | ::::::::::::::::::::::::::::::::::::::: challenge 105 | 106 | ## Exercise 107 | 108 | Use the `describe()` function to view information about the DTM and DSM data 109 | files. Do the two rasters have the same or different CRSs and resolutions? Do 110 | they both have defined minimum and maximum values? 111 | 112 | ::::::::::::::: solution 113 | 114 | ## Solution 115 | 116 | ```{r} 117 | describe("data/NEON-DS-Airborne-Remote-Sensing/HARV/DTM/HARV_dtmCrop.tif") 118 | describe("data/NEON-DS-Airborne-Remote-Sensing/HARV/DSM/HARV_dsmCrop.tif") 119 | ``` 120 | 121 | ::::::::::::::::::::::::: 122 | 123 | :::::::::::::::::::::::::::::::::::::::::::::::::: 124 | 125 | We've already loaded and worked with these two data files in 126 | earlier episodes. Let's plot them each once more to remind ourselves 127 | what this data looks like. 
First we'll plot the DTM elevation data: 128 | 129 | ```{r harv-dtm-plot} 130 | ggplot() + 131 | geom_raster(data = DTM_HARV_df , 132 | aes(x = x, y = y, fill = HARV_dtmCrop)) + 133 | scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) + 134 | coord_quickmap() 135 | ``` 136 | 137 | And then the DSM elevation data: 138 | 139 | ```{r harv-dsm-plot} 140 | ggplot() + 141 | geom_raster(data = DSM_HARV_df , 142 | aes(x = x, y = y, fill = HARV_dsmCrop)) + 143 | scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) + 144 | coord_quickmap() 145 | ``` 146 | 147 | ## Two Ways to Perform Raster Calculations 148 | 149 | We can calculate the difference between two rasters in two different ways: 150 | 151 | - by directly subtracting the two rasters in R using raster math 152 | 153 | or for more efficient processing - particularly if our rasters are large and/or 154 | the calculations we are performing are complex: 155 | 156 | - using the `lapp()` function. 157 | 158 | ## Raster Math \& Canopy Height Models 159 | 160 | We can perform raster calculations by subtracting (or adding, 161 | multiplying, etc) two rasters. In the geospatial world, we call this 162 | "raster math". 163 | 164 | Let's subtract the DTM from the DSM to create a Canopy Height Model. 165 | After subtracting, let's create a dataframe so we can plot with `ggplot`. 166 | 167 | ```{r raster-math} 168 | CHM_HARV <- DSM_HARV - DTM_HARV 169 | 170 | CHM_HARV_df <- as.data.frame(CHM_HARV, xy = TRUE) 171 | ``` 172 | 173 | We can now plot the output CHM. 174 | 175 | ```{r harv-chm-plot} 176 | ggplot() + 177 | geom_raster(data = CHM_HARV_df , 178 | aes(x = x, y = y, fill = HARV_dsmCrop)) + 179 | scale_fill_gradientn(name = "Canopy Height", colors = terrain.colors(10)) + 180 | coord_quickmap() 181 | ``` 182 | 183 | Let's have a look at the distribution of values in our newly created 184 | Canopy Height Model (CHM). 
185 | 186 | ```{r create-hist} 187 | ggplot(CHM_HARV_df) + 188 | geom_histogram(aes(HARV_dsmCrop)) 189 | ``` 190 | 191 | Notice that the range of values for the output CHM is between 0 and 30 meters. 192 | Does this make sense for trees in Harvard Forest? 193 | 194 | ::::::::::::::::::::::::::::::::::::::: challenge 195 | 196 | ## Challenge: Explore CHM Raster Values 197 | 198 | It's often a good idea to explore the range of values in a raster dataset just 199 | like we might explore a dataset that we collected in the field. 200 | 201 | 1. What are the minimum and maximum values for the Harvard Forest Canopy Height Model (`CHM_HARV`) that we just created? 202 | 2. What are two ways you can check this range of data for `CHM_HARV`? 203 | 3. What is the distribution of all the pixel values in the CHM? 204 | 4. Plot a histogram with 6 bins instead of the default and change the color of the histogram. 205 | 5. Plot the `CHM_HARV` raster using breaks that make sense for the data. Include an appropriate color palette for the data, a plot title, and no axis ticks / labels. 206 | 207 | ::::::::::::::: solution 208 | 209 | ## Answers 210 | 211 | 1) There are missing values in our data, so we need to specify 212 | `na.rm = TRUE`. 213 | 214 | ```{r} 215 | min(CHM_HARV_df$HARV_dsmCrop, na.rm = TRUE) 216 | max(CHM_HARV_df$HARV_dsmCrop, na.rm = TRUE) 217 | ``` 218 | 219 | 2) Possible ways include: 220 | 221 | - Create a histogram 222 | - Use the `min()`, `max()`, and `range()` functions. 223 | - Print the object and look at the `values` attribute.
224 | 225 | 3) 226 | ```{r chm-harv-hist} 227 | ggplot(CHM_HARV_df) + 228 | geom_histogram(aes(HARV_dsmCrop)) 229 | ``` 230 | 231 | 4) 232 | ```{r chm-harv-hist-green} 233 | ggplot(CHM_HARV_df) + 234 | geom_histogram(aes(HARV_dsmCrop), colour="black", 235 | fill="darkgreen", bins = 6) 236 | ``` 237 | 238 | 5) 239 | ```{r chm-harv-raster} 240 | custom_bins <- c(0, 10, 20, 30, 40) 241 | CHM_HARV_df <- CHM_HARV_df %>% 242 | mutate(canopy_discrete = cut(HARV_dsmCrop, 243 | breaks = custom_bins)) 244 | 245 | ggplot() + 246 | geom_raster(data = CHM_HARV_df, aes(x = x, y = y, 247 | fill = canopy_discrete)) + 248 | scale_fill_manual(values = terrain.colors(4)) + 249 | coord_quickmap() 250 | ``` 251 | 252 | ::::::::::::::::::::::::: 253 | 254 | :::::::::::::::::::::::::::::::::::::::::::::::::: 255 | 256 | ## Efficient Raster Calculations 257 | 258 | Raster math, like we just did, is an appropriate approach to raster calculations 259 | if: 260 | 261 | 1. The rasters we are using are small in size. 262 | 2. The calculations we are performing are simple. 263 | 264 | However, raster math is a less efficient approach as computation becomes more 265 | complex or as file sizes become large. 266 | 267 | The `lapp()` function takes two or more rasters and applies a function to 268 | them using efficient processing methods. The syntax is 269 | 270 | `outputRaster <- lapp(x, fun = functionName)` 271 | 272 | where `x` can be either a SpatRaster or a SpatRasterDataset, which is an 273 | object that holds several rasters. See `help(sds)`. 274 | 275 | ::::::::::::::::::::::::::::::::::::::::: callout 276 | 277 | ## Data Tip 278 | 279 | To create a SpatRasterDataset, we call the function `sds`, which can take a list 280 | of raster objects (each one created by calling `rast`). 281 | 282 | :::::::::::::::::::::::::::::::::::::::::::::::::: 283 | 284 | Let's perform the same subtraction calculation that we did above using 285 | raster math, this time using the `lapp()` function.
286 | 287 | ::::::::::::::::::::::::::::::::::::::::: callout 288 | 289 | ## Data Tip 290 | 291 | A custom function consists of a defined set of commands performed on an input 292 | object. Custom functions are particularly useful for tasks that need to be 293 | repeated over and over in the code. A simplified syntax for writing a custom 294 | function in R is: 295 | `function_name <- function(variable1, variable2) { WhatYouWantDone, WhatToReturn }` 296 | 297 | 298 | :::::::::::::::::::::::::::::::::::::::::::::::::: 299 | 300 | ```{r raster-overlay} 301 | CHM_ov_HARV <- lapp(sds(list(DSM_HARV, DTM_HARV)), 302 | fun = function(r1, r2) { return(r1 - r2) }) 303 | ``` 304 | 305 | Next we need to convert our new object to a data frame for plotting with 306 | `ggplot`. 307 | 308 | ```{r} 309 | CHM_ov_HARV_df <- as.data.frame(CHM_ov_HARV, xy = TRUE) 310 | ``` 311 | 312 | Now we can plot the CHM: 313 | 314 | ```{r harv-chm-overlay} 315 | ggplot() + 316 | geom_raster(data = CHM_ov_HARV_df, 317 | aes(x = x, y = y, fill = HARV_dsmCrop)) + 318 | scale_fill_gradientn(name = "Canopy Height", colors = terrain.colors(10)) + 319 | coord_quickmap() 320 | ``` 321 | 322 | How do the plots of the CHM created with manual raster math and the `lapp()` 323 | function compare? 324 | 325 | ## Export a GeoTIFF 326 | 327 | Now that we've created a new raster, let's export the data as a GeoTIFF 328 | file using 329 | the `writeRaster()` function. 330 | 331 | When we write this raster object to a GeoTIFF file we'll name it 332 | `CHM_HARV.tiff`. This name allows us to quickly remember both what the data 333 | contain (CHM data) and where they are from (HARVard Forest). The `writeRaster()` function 334 | by default writes the output file to your working directory unless you specify a 335 | full file path. 336 | 337 | We will specify the output format (`filetype = "GTiff"`) and the no data value 338 | (`NAflag = -9999`). We will also tell R to overwrite any data that is already in a file of the same 339 | name.
340 | 341 | ```{r write-raster, eval=FALSE} 342 | writeRaster(CHM_ov_HARV, "CHM_HARV.tiff", 343 | filetype = "GTiff", 344 | overwrite = TRUE, 345 | NAflag = -9999) 346 | ``` 347 | 348 | ### writeRaster() Options 349 | 350 | The function arguments that we used above include: 351 | 352 | - **filetype:** specify that the format will be `GTiff` or GeoTIFF. 353 | - **overwrite:** If TRUE, R will overwrite any existing file with the same 354 | name in the specified directory. USE THIS SETTING WITH CAUTION! 355 | - **NAflag:** set the GeoTIFF tag for `NoDataValue` to -9999, the National 356 | Ecological Observatory Network's (NEON) standard `NoDataValue`. 357 | 358 | ::::::::::::::::::::::::::::::::::::::: challenge 359 | 360 | ## Challenge: Explore the NEON San Joaquin Experimental Range Field Site 361 | 362 | Data are often more interesting and powerful when we compare them across 363 | various locations. Let's compare some data collected over Harvard Forest to 364 | data collected in Southern California. The 365 | [NEON San Joaquin Experimental Range (SJER) field site](https://www.neonscience.org/field-sites/field-sites-map/SJER) 366 | located in Southern California has a very different ecosystem and climate than 367 | the 368 | [NEON Harvard Forest Field Site](https://www.neonscience.org/field-sites/field-sites-map/HARV) 369 | in Massachusetts. 370 | 371 | Import the SJER DSM and DTM raster files and create a Canopy Height Model. 372 | Then compare the two sites. Be sure to name your R objects and outputs 373 | carefully, as follows: objectType\_SJER (e.g. `DSM_SJER`). This will help you 374 | keep track of data from different sites! 375 | 376 | 0. You should have the DSM and DTM data for the SJER site already 377 | loaded from the 378 | [Plot Raster Data in R](02-raster-plot/) 379 | episode. Don't forget to check the CRSs and units of the data. 380 | 1. Create a CHM from the two raster layers and check to make sure the data 381 | are what you expect. 382 | 2.
Plot the CHM from SJER. 383 | 3. Export the SJER CHM as a GeoTIFF. 384 | 4. Compare the vegetation structure of the Harvard Forest and San Joaquin 385 | Experimental Range. 386 | 387 | ::::::::::::::: solution 388 | 389 | ## Answers 390 | 391 | 1) Use the `lapp()` function to subtract the two rasters \& create the CHM. 392 | 393 | ```{r} 394 | CHM_ov_SJER <- lapp(sds(list(DSM_SJER, DTM_SJER)), 395 | fun = function(r1, r2){ return(r1 - r2) }) 396 | ``` 397 | 398 | Convert the output to a dataframe: 399 | 400 | ```{r} 401 | CHM_ov_SJER_df <- as.data.frame(CHM_ov_SJER, xy = TRUE) 402 | ``` 403 | 404 | Create a histogram to check that the data distribution makes sense: 405 | 406 | ```{r sjer-chm-overlay-hist} 407 | ggplot(CHM_ov_SJER_df) + 408 | geom_histogram(aes(SJER_dsmCrop)) 409 | ``` 410 | 411 | 2) Create a plot of the CHM: 412 | 413 | ```{r sjer-chm-overlay-raster} 414 | ggplot() + 415 | geom_raster(data = CHM_ov_SJER_df, 416 | aes(x = x, y = y, 417 | fill = SJER_dsmCrop) 418 | ) + 419 | scale_fill_gradientn(name = "Canopy Height", 420 | colors = terrain.colors(10)) + 421 | coord_quickmap() 422 | ``` 423 | 424 | 3) Export the CHM object to a file: 425 | 426 | ```{r} 427 | writeRaster(CHM_ov_SJER, "chm_ov_SJER.tiff", 428 | filetype = "GTiff", 429 | overwrite = TRUE, 430 | NAflag = -9999) 431 | ``` 432 | 433 | 4) Compare the SJER and HARV CHMs. 434 | Tree heights are much shorter in SJER. You can confirm this by 435 | looking at the histograms of the two CHMs. 436 | 437 | ```{r compare-chm-harv-sjer} 438 | ggplot(CHM_HARV_df) + 439 | geom_histogram(aes(HARV_dsmCrop)) 440 | 441 | ggplot(CHM_ov_SJER_df) + 442 | geom_histogram(aes(SJER_dsmCrop)) 443 | ``` 444 | 445 | ::::::::::::::::::::::::: 446 | 447 | :::::::::::::::::::::::::::::::::::::::::::::::::: 448 | 449 | 450 | 451 | :::::::::::::::::::::::::::::::::::::::: keypoints 452 | 453 | - Rasters can be computed on using mathematical functions. 
454 | - The `lapp()` function provides an efficient way to do raster math. 455 | - The `writeRaster()` function can be used to write raster data to a file. 456 | 457 | :::::::::::::::::::::::::::::::::::::::::::::::::: 458 | 459 | 460 | -------------------------------------------------------------------------------- /episodes/05-raster-multi-band-in-r.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: Work with Multi-Band Rasters 3 | teaching: 40 4 | exercises: 20 5 | source: Rmd 6 | --- 7 | 8 | ```{r setup, echo=FALSE} 9 | source("setup.R") 10 | ``` 11 | 12 | ::::::::::::::::::::::::::::::::::::::: objectives 13 | 14 | - Identify a single vs. a multi-band raster file. 15 | - Import multi-band rasters into R using the `terra` package. 16 | - Plot multi-band color image rasters in R using the `ggplot` package. 17 | 18 | :::::::::::::::::::::::::::::::::::::::::::::::::: 19 | 20 | :::::::::::::::::::::::::::::::::::::::: questions 21 | 22 | - How can I visualize individual and multiple bands in a raster object? 23 | 24 | :::::::::::::::::::::::::::::::::::::::::::::::::: 25 | 26 | ```{r load-libraries, echo=FALSE, results="hide", message=FALSE, warning=FALSE} 27 | library(terra) 28 | library(ggplot2) 29 | library(dplyr) 30 | ``` 31 | 32 | :::::::::::::::::::::::::::::::::::::::::: prereq 33 | 34 | ## Things You'll Need To Complete This Episode 35 | 36 | See the [lesson homepage](.) for detailed information about the software, data, 37 | and other prerequisites you will need to work through the examples in this 38 | episode. 39 | 40 | 41 | :::::::::::::::::::::::::::::::::::::::::::::::::: 42 | 43 | We introduced multi-band raster data in 44 | [an earlier episode](https://datacarpentry.org/organization-geospatial/01-intro-raster-data). 45 | This episode explores how to import and plot a multi-band raster in R. 
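Before we load the NEON imagery, it can help to see the single- versus multi-band distinction on a small example. The sketch below builds a throwaway 3-band raster in memory; the `toy_rgb` object and its band names are made up for illustration and are not part of the NEON data.

```{r toy-multi-band}
library(terra)

# A tiny in-memory 3-band raster standing in for an RGB image.
# The first 4 values fill band 1, the next 4 fill band 2, and so on.
toy_rgb <- rast(nrows = 2, ncols = 2, nlyrs = 3, vals = 1:12)
names(toy_rgb) <- c("red", "green", "blue")

nlyr(toy_rgb)      # the full object holds 3 bands
nlyr(toy_rgb[[1]]) # subsetting with [[ ]] returns a single band
```

The same `nlyr()` check works on any `SpatRaster`, no matter how it was created.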
46 | 47 | ## Getting Started with Multi-Band Data in R 48 | 49 | In this episode, the multi-band data that we are working with is imagery 50 | collected using the 51 | [NEON Airborne Observation Platform](https://www.neonscience.org/data-collection/airborne-remote-sensing) 52 | high resolution camera over the 53 | [NEON Harvard Forest field site](https://www.neonscience.org/field-sites/field-sites-map/HARV). 54 | Each RGB image is a 3-band raster. The same steps would apply to working with a 55 | multi-spectral image with 4 or more bands - like Landsat imagery. 56 | 57 | By using the `rast()` function along with the `lyrs` parameter, we can read 58 | specific raster bands (e.g. the first one); omitting this parameter would 59 | instead read all bands. 60 | 61 | ```{r read-single-band} 62 | RGB_band1_HARV <- 63 | rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/RGB_Imagery/HARV_RGB_Ortho.tif", 64 | lyrs = 1) 65 | ``` 66 | 67 | We need to convert this data to a data frame in order to plot it with `ggplot`. 68 | 69 | ```{r} 70 | RGB_band1_HARV_df <- as.data.frame(RGB_band1_HARV, xy = TRUE) 71 | ``` 72 | 73 | ```{r harv-rgb-band1} 74 | ggplot() + 75 | geom_raster(data = RGB_band1_HARV_df, 76 | aes(x = x, y = y, alpha = HARV_RGB_Ortho_1)) + 77 | coord_quickmap() 78 | ``` 79 | 80 | ::::::::::::::::::::::::::::::::::::::: challenge 81 | 82 | ## Challenge 83 | 84 | View the attributes of this band. What are its dimensions, CRS, resolution, min 85 | and max values, and band number? 86 | 87 | ::::::::::::::: solution 88 | 89 | ## Solution 90 | 91 | ```{r} 92 | RGB_band1_HARV 93 | ``` 94 | 95 | Notice that when we look at the attributes of this band, we see: 96 | `dimensions : 2317, 3073, 1 (nrow, ncol, nlyr)` 97 | 98 | This is R telling us that we read only one of its bands.
99 | 100 | 101 | 102 | ::::::::::::::::::::::::: 103 | 104 | :::::::::::::::::::::::::::::::::::::::::::::::::: 105 | 106 | ::::::::::::::::::::::::::::::::::::::::: callout 107 | 108 | ## Data Tip 109 | 110 | The number of bands in a raster file can also be determined 111 | using the `describe()` function: the syntax is `describe(sources(RGB_band1_HARV))`. 112 | 113 | 114 | :::::::::::::::::::::::::::::::::::::::::::::::::: 115 | 116 | ### Image Raster Data Values 117 | 118 | As we saw in the previous exercise, this raster contains values between 0 and 119 | 255. These values represent degrees of brightness associated with the image 120 | band. In the case of an RGB image (red, green and blue), band 1 is the red band. 121 | When we plot the red band, larger numbers (towards 255) represent pixels with 122 | more red in them (a strong red reflection). Smaller numbers (towards 0) 123 | represent pixels with less red in them (less red was reflected). To plot an RGB 124 | image, we mix red + green + blue values into one single color to create a full 125 | color image - similar to the color image a digital camera creates. 126 | 127 | ### Import A Specific Band 128 | 129 | We can use the `rast()` function to import specific bands in our raster object 130 | by specifying which band we want with `lyrs = N` (N represents the band number we 131 | want to work with). To import the green band, we would use `lyrs = 2`.
132 | 133 | ```{r read-specific-band} 134 | RGB_band2_HARV <- 135 | rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/RGB_Imagery/HARV_RGB_Ortho.tif", 136 | lyrs = 2) 137 | ``` 138 | 139 | We can convert this data to a data frame and plot it the same way we plotted the red band: 140 | 141 | ```{r} 142 | RGB_band2_HARV_df <- as.data.frame(RGB_band2_HARV, xy = TRUE) 143 | ``` 144 | 145 | ```{r rgb-harv-band2} 146 | ggplot() + 147 | geom_raster(data = RGB_band2_HARV_df, 148 | aes(x = x, y = y, alpha = HARV_RGB_Ortho_2)) + 149 | coord_equal() 150 | ``` 151 | 152 | ::::::::::::::::::::::::::::::::::::::: challenge 153 | 154 | ## Challenge: Making Sense of Single Band Images 155 | 156 | Compare the plots of band 1 (red) and band 2 (green). Is the forested area 157 | darker or lighter in band 2 (the green band) compared to band 1 (the red band)? 158 | 159 | ::::::::::::::: solution 160 | 161 | ## Solution 162 | 163 | We'd expect a *brighter* value for the forest in band 2 (green) than in band 1 164 | (red) because the leaves on trees most often appear "green" - healthy leaves 165 | reflect MORE green light than red light. 166 | 167 | 168 | 169 | ::::::::::::::::::::::::: 170 | 171 | :::::::::::::::::::::::::::::::::::::::::::::::::: 172 | 173 | ## Raster Stacks in R 174 | 175 | Next, we will work with all three image bands (red, green and blue) as an R 176 | raster object. We will then plot a 3-band composite, or full color, image. 177 | 178 | To bring in all bands of a multi-band raster, we use the `rast()` function. 179 | 180 | ```{r intro-to-raster-stacks} 181 | RGB_stack_HARV <- 182 | rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/RGB_Imagery/HARV_RGB_Ortho.tif") 183 | ``` 184 | 185 | Let's preview the attributes of our stack object: 186 | 187 | ```{r} 188 | RGB_stack_HARV 189 | ``` 190 | 191 | We can view the attributes of each band in the stack in a single output.
For 192 | example, if we had hundreds of bands, we could specify which band we'd like to 193 | view attributes for using an index value: 194 | 195 | ```{r} 196 | RGB_stack_HARV[[2]] 197 | ``` 198 | 199 | We can also use the `ggplot` functions to plot the data in any layer of our 200 | raster object. Remember, we need to convert to a data frame first. 201 | 202 | ```{r} 203 | RGB_stack_HARV_df <- as.data.frame(RGB_stack_HARV, xy = TRUE) 204 | ``` 205 | 206 | Each band in our RasterStack gets its own column in the data frame. Thus we have: 207 | 208 | ```{r} 209 | str(RGB_stack_HARV_df) 210 | ``` 211 | 212 | Let's create a histogram of the first band: 213 | 214 | ```{r rgb-harv-hist-band1} 215 | ggplot() + 216 | geom_histogram(data = RGB_stack_HARV_df, aes(HARV_RGB_Ortho_1)) 217 | ``` 218 | 219 | And a raster plot of the second band: 220 | 221 | ```{r rgb-harv-plot-band2} 222 | ggplot() + 223 | geom_raster(data = RGB_stack_HARV_df, 224 | aes(x = x, y = y, alpha = HARV_RGB_Ortho_2)) + 225 | coord_quickmap() 226 | ``` 227 | 228 | We can access any individual band in the same way. 229 | 230 | ### Create A Three Band Image 231 | 232 | To render a final three band, colored image in R, we use the `plotRGB()` function. 233 | 234 | This function allows us to: 235 | 236 | 1. Identify what bands we want to render in the red, green and blue regions. 237 | The `plotRGB()` function defaults to a 1=red, 2=green, and 3=blue band 238 | order. However, you can define what bands you'd like to plot manually. 239 | Manual definition of bands is useful if you have, for example a 240 | near-infrared band and want to create a color infrared image. 241 | 2. Adjust the `stretch` of the image to increase or decrease contrast. 242 | 243 | Let's plot our 3-band image. Note that we can use the `plotRGB()` function 244 | directly with our RasterStack object (we don't need a dataframe as this 245 | function isn't part of the `ggplot2` package). 
246 | 247 | ```{r plot-rgb-image} 248 | plotRGB(RGB_stack_HARV, 249 | r = 1, g = 2, b = 3) 250 | ``` 251 | 252 | The image above looks pretty good. We can explore whether applying a stretch to 253 | the image might improve clarity and contrast using `stretch="lin"` or 254 | `stretch="hist"`. 255 | 256 | ![](fig/dc-spatial-raster/imageStretch_dark.jpg){alt='Image Stretch'} 257 | 258 | When the range of pixel brightness values is closer to 0, a darker image is 259 | rendered by default. We can stretch the values to extend to the full 0-255 260 | range of potential values to increase the visual contrast of the image. 261 | 262 | ![](fig/dc-spatial-raster/imageStretch_light.jpg){alt='Image Stretch light'} 263 | 264 | When the range of pixel brightness values is closer to 255, a lighter image is 265 | rendered by default. We can stretch the values to extend to the full 0-255 266 | range of potential values to increase the visual contrast of the image. 267 | 268 | ```{r plot-rbg-image-linear} 269 | plotRGB(RGB_stack_HARV, 270 | r = 1, g = 2, b = 3, 271 | scale = 800, 272 | stretch = "lin") 273 | ``` 274 | 275 | ```{r plot-rgb-image-hist} 276 | plotRGB(RGB_stack_HARV, 277 | r = 1, g = 2, b = 3, 278 | scale = 800, 279 | stretch = "hist") 280 | ``` 281 | 282 | In this case, the stretch doesn't significantly enhance the contrast of our 283 | image, since the reflectance (or brightness) values are already well 284 | distributed between 0 and 255. 285 | 286 | ::::::::::::::::::::::::::::::::::::::: challenge 287 | 288 | ## Challenge - NoData Values 289 | 290 | Let's explore what happens with NoData values when working with RasterStack 291 | objects and using the `plotRGB()` function. We will use the 292 | `HARV_Ortho_wNA.tif` GeoTIFF file in the 293 | `NEON-DS-Airborne-Remote-Sensing/HARV/RGB_Imagery/` directory. 294 | 295 | 1. View the file's attributes. Are there `NoData` values assigned for this file? 296 | 2. If so, what is the `NoData` value? 297 | 3.
How many bands does it have? 298 | 4. Load the multi-band raster file into R. 299 | 5. Plot the object as a true color image. 300 | 6. What happened to the black edges in the data? 301 | 7. What does this tell us about the difference in the data structure between 302 | `HARV_Ortho_wNA.tif` and `HARV_RGB_Ortho.tif` (the R object `RGB_stack_HARV`)? How can 303 | you check? 304 | 305 | ::::::::::::::: solution 306 | 307 | ## Answers 308 | 309 | 1) First we use the `describe()` function to view the data attributes. 310 | 311 | ```{r} 312 | describe("data/NEON-DS-Airborne-Remote-Sensing/HARV/RGB_Imagery/HARV_Ortho_wNA.tif") 313 | ``` 314 | 315 | 2) From the output above, we see that there are `NoData` values and they are 316 | assigned the value of -9999. 317 | 318 | 3) The data has three bands. 319 | 320 | 4) To read in the file, we will use the `rast()` function: 321 | 322 | ```{r} 323 | HARV_NA <- 324 | rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/RGB_Imagery/HARV_Ortho_wNA.tif") 325 | ``` 326 | 327 | 5) We can plot the data with the `plotRGB()` function: 328 | 329 | ```{r harv-na-rgb} 330 | plotRGB(HARV_NA, 331 | r = 1, g = 2, b = 3) 332 | ``` 333 | 334 | 6) The black edges are not plotted. 335 | 336 | 7) Both data sets have `NoData` values. However, in `RGB_stack_HARV` the `NoData` 337 | value is not defined in the GeoTIFF tags, so R renders those pixels as black because 338 | their reflectance values are 0. The edge pixels in `HARV_Ortho_wNA.tif` are set to 339 | -9999, which R renders as `NA`. 340 | 341 | ```{r} 342 | describe("data/NEON-DS-Airborne-Remote-Sensing/HARV/RGB_Imagery/HARV_RGB_Ortho.tif") 343 | ``` 344 | 345 | ::::::::::::::::::::::::: 346 | 347 | :::::::::::::::::::::::::::::::::::::::::::::::::: 348 | 349 | ::::::::::::::::::::::::::::::::::::::::: callout 350 | 351 | ## Data Tip 352 | 353 | We can create a raster object from several individual single-band GeoTIFFs 354 | too. We will do this in a later episode, 355 | [Raster Time Series Data in R](12-time-series-raster/).
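As a preview of that workflow, here is a minimal sketch that stacks two single-band GeoTIFFs into one `SpatRaster` by passing both file paths to `rast()`. The rasters and temporary file names below are made up for illustration; any single-band files with matching extents and resolutions behave the same way.

```{r combine-single-band-tifs}
library(terra)

# Write two small single-band rasters to temporary GeoTIFF files
r1 <- rast(nrows = 2, ncols = 2, vals = 1:4)
r2 <- rast(nrows = 2, ncols = 2, vals = 5:8)
f1 <- file.path(tempdir(), "band1.tif")
f2 <- file.path(tempdir(), "band2.tif")
writeRaster(r1, f1, overwrite = TRUE)
writeRaster(r2, f2, overwrite = TRUE)

# Reading both paths at once stacks them into one multi-band object
combined <- rast(c(f1, f2))
nlyr(combined)
```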
356 | 357 | 358 | :::::::::::::::::::::::::::::::::::::::::::::::::: 359 | 360 | ## SpatRaster in R 361 | 362 | The R SpatRaster object type can handle rasters with multiple bands. 363 | The SpatRaster only holds parameters that describe the properties of raster 364 | data that is located somewhere on our computer. 365 | 366 | A SpatRasterDataset object can hold references to sub-datasets, that is, 367 | SpatRaster objects. In most cases, we can work with a SpatRaster in the same 368 | way we might work with a SpatRasterDataset. 369 | 370 | ::::::::::::::::::::::::::::::::::::::::: callout 371 | 372 | ## More Resources 373 | 374 | You can read the help for the `rast()` and `sds()` functions by typing `?rast` 375 | or `?sds`. 376 | 377 | 378 | :::::::::::::::::::::::::::::::::::::::::::::::::: 379 | 380 | 381 | We can build a SpatRasterDataset using a SpatRaster or a list of SpatRaster: 382 | 383 | ```{r} 384 | RGB_sds_HARV <- sds(RGB_stack_HARV) 385 | RGB_sds_HARV <- sds(list(RGB_stack_HARV, RGB_stack_HARV)) 386 | ``` 387 | 388 | We can retrieve the SpatRaster objects from a SpatRasterDataset using 389 | subsetting: 390 | 391 | ```{r} 392 | RGB_sds_HARV[[1]] 393 | RGB_sds_HARV[[2]] 394 | ``` 395 | 396 | 397 | ::::::::::::::::::::::::::::::::::::::: challenge 398 | 399 | ## Challenge: What Functions Can Be Used on an R Object of a particular class? 400 | 401 | We can view various functions (or methods) available to use on an R object with 402 | `methods(class=class(objectNameHere))`. Use this to figure out: 403 | 404 | 1. What methods can be used on the `RGB_stack_HARV` object? 405 | 2. What methods can be used on a single band within `RGB_stack_HARV`? 406 | 3. Why do you think there isn't a difference? 
407 | 408 | ::::::::::::::: solution 409 | 410 | ## Answers 411 | 412 | 1) We can see a list of all of the methods available for our 413 | RasterStack object: 414 | 415 | ```{r} 416 | methods(class=class(RGB_stack_HARV)) 417 | ``` 418 | 419 | 2) And compare that with the methods available for a single band: 420 | 421 | ```{r} 422 | methods(class=class(RGB_stack_HARV[[1]])) 423 | ``` 424 | 425 | 3) A SpatRaster is the same no matter its number of bands. 426 | 427 | 428 | 429 | ::::::::::::::::::::::::: 430 | 431 | :::::::::::::::::::::::::::::::::::::::::::::::::: 432 | 433 | 434 | 435 | :::::::::::::::::::::::::::::::::::::::: keypoints 436 | 437 | - A single raster file can contain multiple bands or layers. 438 | - Use the `rast()` function to load all bands in a multi-layer raster file into R. 439 | - Individual bands within a SpatRaster can be accessed, analyzed, and visualized using the same functions no matter how many bands it holds. 440 | 441 | :::::::::::::::::::::::::::::::::::::::::::::::::: 442 | 443 | 444 | -------------------------------------------------------------------------------- /episodes/06-vector-open-shapefile-in-r.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: Open and Plot Vector Layers 3 | teaching: 20 4 | exercises: 10 5 | source: Rmd 6 | --- 7 | 8 | ```{r setup, echo=FALSE} 9 | source("setup.R") 10 | ``` 11 | 12 | ::::::::::::::::::::::::::::::::::::::: objectives 13 | 14 | - Know the difference between point, line, and polygon vector elements. 15 | - Load point, line, and polygon vector layers into R. 16 | - Access the attributes of a spatial object in R. 17 | 18 | :::::::::::::::::::::::::::::::::::::::::::::::::: 19 | 20 | :::::::::::::::::::::::::::::::::::::::: questions 21 | 22 | - How can I distinguish between and visualize point, line and polygon vector data? 
23 | 24 | :::::::::::::::::::::::::::::::::::::::::::::::::: 25 | 26 | ```{r load-libraries, echo=FALSE, results="hide", warning=FALSE, message=FALSE} 27 | library(terra) 28 | library(ggplot2) 29 | library(dplyr) 30 | library(sf) 31 | ``` 32 | 33 | :::::::::::::::::::::::::::::::::::::::::: prereq 34 | 35 | ## Things You'll Need To Complete This Episode 36 | 37 | See the [lesson homepage](.) for detailed information about the software, data, 38 | and other prerequisites you will need to work through the examples in this 39 | episode. 40 | 41 | 42 | :::::::::::::::::::::::::::::::::::::::::::::::::: 43 | 44 | Starting with this episode, we will be moving from working with raster data to 45 | working with vector data. In this episode, we will open and plot point, line 46 | and polygon vector data loaded from ESRI's `shapefile` format into R. These data refer to 47 | the 48 | [NEON Harvard Forest field site](https://www.neonscience.org/field-sites/field-sites-map/HARV), 49 | which we have been working with in previous episodes. In later episodes, we 50 | will learn how to work with raster and vector data together and combine them 51 | into a single plot. 52 | 53 | ## Import Vector Data 54 | 55 | We will use the `sf` package to work with vector data in R. We will also use 56 | the `terra` package, which has been loaded in previous episodes, so we can 57 | explore raster and vector spatial metadata using similar commands. Make sure 58 | you have the `sf` library loaded. 
59 | 60 | ```{r load-sf, results="hide", eval=FALSE, message=FALSE} 61 | library(sf) 62 | ``` 63 | 64 | The vector layers that we will import from ESRI's `shapefile` format are: 65 | 66 | - A polygon vector layer representing our field site boundary, 67 | - A line vector layer representing roads, and 68 | - A point vector layer representing the location of the [Fisher flux tower](https://www.neonscience.org/data-collection/flux-tower-measurements) 69 | located at the [NEON Harvard Forest field site](https://www.neonscience.org/field-sites/field-sites-map/HARV). 70 | 71 | The first vector layer that we will open contains the boundary of our study area 72 | (or our Area Of Interest or AOI, hence the name `aoiBoundary`). To import 73 | a vector layer from an ESRI `shapefile` we use the `sf` function `st_read()`. `st_read()` 74 | requires the file path to the ESRI `shapefile`. 75 | 76 | Let's import our AOI: 77 | 78 | ```{r Import-Shapefile} 79 | aoi_boundary_HARV <- st_read( 80 | "data/NEON-DS-Site-Layout-Files/HARV/HarClip_UTMZ18.shp") 81 | ``` 82 | 83 | ## Vector Layer Metadata \& Attributes 84 | 85 | When we import the `HarClip_UTMZ18` vector layer from an ESRI `shapefile` into R (as our 86 | `aoi_boundary_HARV` object), the `st_read()` function automatically stores 87 | information about the data. We are particularly interested in the geospatial 88 | metadata, describing the format, CRS, extent, and other components of the 89 | vector data, and the attributes which describe properties associated with each 90 | individual vector object. 91 | 92 | ::::::::::::::::::::::::::::::::::::::::: callout 93 | 94 | ## Data Tip 95 | 96 | The [Explore and Plot by Vector Layer Attributes](07-vector-shapefile-attributes-in-r/) 97 | episode provides more information on both metadata and attributes 98 | and using attributes to subset and plot data. 
99 | 100 | 101 | :::::::::::::::::::::::::::::::::::::::::::::::::: 102 | 103 | ## Spatial Metadata 104 | 105 | Key metadata for all vector layers includes: 106 | 107 | 1. **Object Type:** the class of the imported object. 108 | 2. **Coordinate Reference System (CRS):** the projection of the data. 109 | 3. **Extent:** the spatial extent (i.e. geographic area that the vector layer 110 | covers) of the data. Note that the spatial extent for a vector layer 111 | represents the combined extent for all individual objects in the vector layer. 112 | 113 | We can view metadata of a vector layer using the `st_geometry_type()`, `st_crs()` and 114 | `st_bbox()` functions. First, let's view the geometry type for our AOI 115 | vector layer: 116 | 117 | ```{r} 118 | st_geometry_type(aoi_boundary_HARV) 119 | ``` 120 | 121 | Our `aoi_boundary_HARV` is a polygon spatial object. The 18 levels shown below our 122 | output list the possible categories of the geometry type. Now let's check what 123 | CRS this data is in: 124 | 125 | ```{r} 126 | st_crs(aoi_boundary_HARV) 127 | ``` 128 | 129 | Our data is in the CRS **UTM zone 18N**. The CRS is critical to interpreting the 130 | spatial object's extent values as it specifies units. To find the extent of our AOI, we 131 | can use the `st_bbox()` function: 132 | 133 | ```{r} 134 | st_bbox(aoi_boundary_HARV) 135 | ``` 136 | 137 | The spatial extent of a vector layer or R spatial object represents the geographic 138 | "edge" or location that is the furthest north, south, east, and west. Thus it 139 | represents the overall geographic coverage of the spatial object. Image Source: 140 | National Ecological Observatory Network (NEON).
141 | 142 | ![](fig/dc-spatial-vector/spatial_extent.png){alt='Extent image'} 143 | 144 | Lastly, we can view all of the metadata and attributes for this R spatial 145 | object by printing it to the screen: 146 | 147 | ```{r} 148 | aoi_boundary_HARV 149 | ``` 150 | 151 | ## Spatial Data Attributes 152 | 153 | We introduced the idea of spatial data attributes in 154 | [an earlier lesson](https://datacarpentry.org/organization-geospatial/02-intro-vector-data). 155 | Now we will explore how to use spatial data attributes stored in our data to 156 | plot different features. 157 | 158 | ## Plot a vector layer 159 | 160 | Next, let's visualize the data in our `sf` object using the `ggplot` package. 161 | Unlike with raster data, we do not need to convert vector data to a dataframe 162 | before plotting with `ggplot`. 163 | 164 | We're going to customize our boundary plot by setting the size, color, and fill 165 | for our plot. When plotting `sf` objects with `ggplot2`, you need to use the 166 | `coord_sf()` coordinate system. 167 | 168 | ```{r plot-shapefile} 169 | ggplot() + 170 | geom_sf(data = aoi_boundary_HARV, size = 3, color = "black", fill = "cyan1") + 171 | ggtitle("AOI Boundary Plot") + 172 | coord_sf() 173 | ``` 174 | 175 | ::::::::::::::::::::::::::::::::::::::: challenge 176 | 177 | ## Challenge: Import Line and Point Vector Layers 178 | 179 | Using the steps above, import the HARV\_roads and HARVtower\_UTM18N vector layers into 180 | R. Call the HARV\_roads object `lines_HARV` and the HARVtower\_UTM18N 181 | `point_HARV`. 182 | 183 | Answer the following questions: 184 | 185 | 1. What type of R spatial object is created when you import each layer? 186 | 187 | 2. What is the CRS and extent for each object? 188 | 189 | 3. Do the files contain points, lines, or polygons? 190 | 191 | 4. How many spatial objects are in each file? 
192 | 193 | ::::::::::::::: solution 194 | 195 | ## Answers 196 | 197 | First we import the data: 198 | 199 | ```{r import-point-line, echo=TRUE} 200 | lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp") 201 | point_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARVtower_UTM18N.shp") 202 | ``` 203 | 204 | Then we check its class: 205 | 206 | ```{r} 207 | class(lines_HARV) 208 | class(point_HARV) 209 | ``` 210 | 211 | We also check the CRS and extent of each object: 212 | 213 | ```{r} 214 | st_crs(lines_HARV) 215 | st_bbox(lines_HARV) 216 | st_crs(point_HARV) 217 | st_bbox(point_HARV) 218 | ``` 219 | 220 | To see the number of objects in each file, we can look at the output from when 221 | we read these objects into R. `lines_HARV` contains 13 features (all lines) and 222 | `point_HARV` contains only one point. 223 | 224 | 225 | 226 | ::::::::::::::::::::::::: 227 | 228 | :::::::::::::::::::::::::::::::::::::::::::::::::: 229 | 230 | 231 | 232 | :::::::::::::::::::::::::::::::::::::::: keypoints 233 | 234 | - Metadata for vector layers include geometry type, CRS, and extent. 235 | - Load spatial objects into R with the `st_read()` function. 236 | - Spatial objects can be plotted directly with `ggplot` using the `geom_sf()` 237 | function. No need to convert to a dataframe. 238 | 239 | :::::::::::::::::::::::::::::::::::::::::::::::::: 240 | 241 | 242 | -------------------------------------------------------------------------------- /episodes/08-vector-plot-shapefiles-custom-legend.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: Plot Multiple Vector Layers 3 | teaching: 30 4 | exercises: 15 5 | source: Rmd 6 | --- 7 | 8 | ```{r setup, echo=FALSE} 9 | source("setup.R") 10 | ``` 11 | 12 | ::::::::::::::::::::::::::::::::::::::: objectives 13 | 14 | - Plot multiple vector layers in the same plot. 15 | - Apply custom symbols to spatial objects in a plot. 
16 | - Create a multi-layered plot with raster and vector data. 17 | 18 | :::::::::::::::::::::::::::::::::::::::::::::::::: 19 | 20 | :::::::::::::::::::::::::::::::::::::::: questions 21 | 22 | - How can I create map compositions with custom legends using ggplot? 23 | - How can I plot raster and vector data together? 24 | 25 | :::::::::::::::::::::::::::::::::::::::::::::::::: 26 | 27 | ```{r load-libraries, echo=FALSE, results="hide", message=FALSE} 28 | library(terra) 29 | library(ggplot2) 30 | library(dplyr) 31 | library(sf) 32 | ``` 33 | 34 | ```{r load-data, echo=FALSE, results="hide", warning=FALSE} 35 | # learners will have this data loaded from an earlier episode 36 | aoi_boundary_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HarClip_UTMZ18.shp") 37 | lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp") 38 | point_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARVtower_UTM18N.shp") 39 | CHM_HARV <- rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/CHM/HARV_chmCrop.tif") 40 | CHM_HARV_df <- as.data.frame(CHM_HARV, xy = TRUE) 41 | road_colors <- c("blue", "green", "navy", "purple") 42 | ``` 43 | 44 | :::::::::::::::::::::::::::::::::::::::::: prereq 45 | 46 | ## Things You'll Need To Complete This Episode 47 | 48 | See the [lesson homepage](.) for detailed information about the software, data, 49 | and other prerequisites you will need to work through the examples in this 50 | episode. 51 | 52 | 53 | :::::::::::::::::::::::::::::::::::::::::::::::::: 54 | 55 | This episode builds upon 56 | [the previous episode](07-vector-shapefile-attributes-in-r/) 57 | to work with vector layers in R and explore how to plot multiple 58 | vector layers. It also covers how to plot raster and vector data together on the 59 | same plot. 60 | 61 | ## Load the Data 62 | 63 | To work with vector data in R, we can use the `sf` library. 
The `terra` 64 | package also allows us to explore metadata using similar commands for both 65 | raster and vector files. Make sure that you have these packages loaded. 66 | 67 | We will continue to work with the three ESRI `shapefile` that we loaded in the 68 | [Open and Plot Vector Layers in R](06-vector-open-shapefile-in-r/) episode. 69 | 70 | ## Plotting Multiple Vector Layers 71 | 72 | In the [previous episode](07-vector-shapefile-attributes-in-r/), we learned how 73 | to plot information from a single vector layer and do some plot customization 74 | including adding a custom legend. However, what if we want to create a more 75 | complex plot with many vector layers and unique symbols that need to be 76 | represented clearly in a legend? 77 | 78 | Now, let's create a plot that combines our tower location (`point_HARV`), site 79 | boundary (`aoi_boundary_HARV`) and roads (`lines_HARV`) spatial objects. We 80 | will need to build a custom legend as well. 81 | 82 | To begin, we will create a plot with the site boundary as the first layer. Then 83 | layer the tower location and road data on top using `+`. 84 | 85 | ```{r plot-many-shapefiles} 86 | ggplot() + 87 | geom_sf(data = aoi_boundary_HARV, fill = "grey", color = "grey") + 88 | geom_sf(data = lines_HARV, aes(color = TYPE), size = 1) + 89 | geom_sf(data = point_HARV) + 90 | ggtitle("NEON Harvard Forest Field Site") + 91 | coord_sf() 92 | ``` 93 | 94 | Next, let's build a custom legend using the symbology (the colors and symbols) 95 | that we used to create the plot above. For example, it might be good if the 96 | lines were symbolized as lines. In the previous episode, you may have noticed 97 | that the default legend behavior for `geom_sf` is to draw a 'patch' for each 98 | legend entry. If you want the legend to draw lines or points, you need to add 99 | an instruction to the `geom_sf` call - in this case, `show.legend = 'line'`. 
100 | 101 | ```{r plot-custom-shape} 102 | ggplot() + 103 | geom_sf(data = aoi_boundary_HARV, fill = "grey", color = "grey") + 104 | geom_sf(data = lines_HARV, aes(color = TYPE), 105 | show.legend = "line", size = 1) + 106 | geom_sf(data = point_HARV, aes(fill = Sub_Type), color = "black") + 107 | scale_color_manual(values = road_colors) + 108 | scale_fill_manual(values = "black") + 109 | ggtitle("NEON Harvard Forest Field Site") + 110 | coord_sf() 111 | ``` 112 | 113 | Now lets adjust the legend titles by passing a `name` to the respective `color` 114 | and `fill` palettes. 115 | 116 | ```{r create-custom-legend} 117 | ggplot() + 118 | geom_sf(data = aoi_boundary_HARV, fill = "grey", color = "grey") + 119 | geom_sf(data = point_HARV, aes(fill = Sub_Type)) + 120 | geom_sf(data = lines_HARV, aes(color = TYPE), show.legend = "line", 121 | size = 1) + 122 | scale_color_manual(values = road_colors, name = "Line Type") + 123 | scale_fill_manual(values = "black", name = "Tower Location") + 124 | ggtitle("NEON Harvard Forest Field Site") + 125 | coord_sf() 126 | ``` 127 | 128 | Finally, it might be better if the points were symbolized as a symbol. We can 129 | customize this using `shape` parameters in our call to `geom_sf`: 16 is a point 130 | symbol, 15 is a box. 131 | 132 | ::::::::::::::::::::::::::::::::::::::::: callout 133 | 134 | ## Data Tip 135 | 136 | To view a short list of `shape` symbols, 137 | type `?pch` into the R console. 
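As a quick sketch, the handful of shape codes used in this lesson can also be previewed in a single base R plot; the `lesson_shapes` vector below is just an illustrative name.

```{r pch-preview}
# 15 = filled square, 16 = filled circle; 21 and 22 are a circle and a
# square whose fill and outline colors can be set separately
lesson_shapes <- c(square = 15, circle = 16,
                   fillable_circle = 21, fillable_square = 22)
plot(seq_along(lesson_shapes), rep(1, length(lesson_shapes)),
     pch = lesson_shapes, cex = 3, xlab = "", ylab = "")
```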
138 | 139 | 140 | :::::::::::::::::::::::::::::::::::::::::::::::::: 141 | 142 | ```{r custom-symbols} 143 | ggplot() + 144 | geom_sf(data = aoi_boundary_HARV, fill = "grey", color = "grey") + 145 | geom_sf(data = point_HARV, aes(fill = Sub_Type), shape = 15) + 146 | geom_sf(data = lines_HARV, aes(color = TYPE), 147 | show.legend = "line", size = 1) + 148 | scale_color_manual(values = road_colors, name = "Line Type") + 149 | scale_fill_manual(values = "black", name = "Tower Location") + 150 | ggtitle("NEON Harvard Forest Field Site") + 151 | coord_sf() 152 | ``` 153 | 154 | ::::::::::::::::::::::::::::::::::::::: challenge 155 | 156 | ## Challenge: Plot Polygon by Attribute 157 | 158 | 1. Using the `NEON-DS-Site-Layout-Files/HARV/PlotLocations_HARV.shp` ESRI `shapefile`, 159 | create a map of study plot locations, with each point colored by the soil 160 | type (`soilTypeOr`). How many different soil types are there at this 161 | particular field site? Overlay this layer on top of the `lines_HARV` layer 162 | (the roads). Create a custom legend that applies line symbols to lines and 163 | point symbols to the points. 164 | 165 | 2. Modify the plot above. Tell R to plot each point using a different 166 | `shape` value for each soil type. 167 | 168 | ::::::::::::::: solution 169 | 170 | ## Answers 171 | 172 | First we need to read in the data and see how many unique soil types are represented 173 | in the `soilTypeOr` attribute. 174 | 175 | ```{r} 176 | plot_locations <- 177 | st_read("data/NEON-DS-Site-Layout-Files/HARV/PlotLocations_HARV.shp") 178 | 179 | plot_locations$soilTypeOr <- as.factor(plot_locations$soilTypeOr) 180 | levels(plot_locations$soilTypeOr) 181 | ``` 182 | 183 | Next we can create a new color palette with one color for each soil type. 184 | 185 | ```{r} 186 | blue_orange <- c("cornflowerblue", "darkorange") 187 | ``` 188 | 189 | Finally, we will create our plot.
190 | 191 | ```{r harv-plot-locations-bg} 192 | ggplot() + 193 | geom_sf(data = lines_HARV, aes(color = TYPE), show.legend = "line") + 194 | geom_sf(data = plot_locations, aes(fill = soilTypeOr), 195 | shape = 21, show.legend = 'point') + 196 | scale_color_manual(name = "Line Type", values = road_colors, 197 | guide = guide_legend(override.aes = list(linetype = "solid", 198 | shape = NA))) + 199 | scale_fill_manual(name = "Soil Type", values = blue_orange, 200 | guide = guide_legend(override.aes = list(linetype = "blank", shape = 21, 201 | colour = "black"))) + 202 | ggtitle("NEON Harvard Forest Field Site") + 203 | coord_sf() 204 | ``` 205 | 206 | If we want each soil type to be shown with a different symbol, we can give 207 | multiple values to the `values` argument of `scale_shape_manual()`. 208 | 209 | ```{r harv-plot-locations-pch} 210 | ggplot() + 211 | geom_sf(data = lines_HARV, aes(color = TYPE), show.legend = "line", size = 1) + 212 | geom_sf(data = plot_locations, aes(fill = soilTypeOr, shape = soilTypeOr), 213 | show.legend = 'point', size = 3) + 214 | scale_shape_manual(name = "Soil Type", values = c(21, 22)) + 215 | scale_color_manual(name = "Line Type", values = road_colors, 216 | guide = guide_legend(override.aes = list(linetype = "solid", shape = NA))) + 217 | scale_fill_manual(name = "Soil Type", values = blue_orange, 218 | guide = guide_legend(override.aes = list(linetype = "blank", shape = c(21, 22), color = "black"))) + 219 | ggtitle("NEON Harvard Forest Field Site") + 220 | coord_sf() 221 | ``` 222 | 223 | ::::::::::::::::::::::::: 224 | 225 | :::::::::::::::::::::::::::::::::::::::::::::::::: 226 | 227 | ::::::::::::::::::::::::::::::::::::::: challenge 228 | 229 | ## Challenge: Plot Raster \& Vector Data Together 230 | 231 | You can plot vector data layered on top of raster data using the `+` to add a 232 | layer in `ggplot`.
Create a plot that uses the NEON AOI Canopy Height Model 233 | `data/NEON-DS-Airborne-Remote-Sensing/HARV/CHM/HARV_chmCrop.tif` as a base 234 | layer. On top of the CHM, please add: 235 | 236 | - The study site AOI. 237 | - Roads. 238 | - The tower location. 239 | 240 | Be sure to give your plot a meaningful title. 241 | 242 | ::::::::::::::: solution 243 | 244 | ## Answers 245 | 246 | ```{r challenge-vector-raster-overlay, echo=TRUE} 247 | ggplot() + 248 | geom_raster(data = CHM_HARV_df, aes(x = x, y = y, fill = HARV_chmCrop)) + 249 | geom_sf(data = lines_HARV, color = "black") + 250 | geom_sf(data = aoi_boundary_HARV, color = "grey20", size = 1) + 251 | geom_sf(data = point_HARV, pch = 8) + 252 | ggtitle("NEON Harvard Forest Field Site w/ Canopy Height Model") + 253 | coord_sf() 254 | ``` 255 | 256 | ::::::::::::::::::::::::: 257 | 258 | :::::::::::::::::::::::::::::::::::::::::::::::::: 259 | 260 | 261 | 262 | :::::::::::::::::::::::::::::::::::::::: keypoints 263 | 264 | - Use the `+` operator to add multiple layers to a ggplot. 265 | - Multi-layered plots can combine raster and vector datasets. 266 | - Use the `show.legend` argument to set legend symbol types. 267 | - Use the `scale_fill_manual()` function to set legend colors. 268 | 269 | :::::::::::::::::::::::::::::::::::::::::::::::::: 270 | 271 | 272 | -------------------------------------------------------------------------------- /episodes/09-vector-when-data-dont-line-up-crs.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: Handling Spatial Projection & CRS 3 | teaching: 30 4 | exercises: 20 5 | source: Rmd 6 | --- 7 | 8 | ```{r setup, echo=FALSE} 9 | source("setup.R") 10 | ``` 11 | 12 | ::::::::::::::::::::::::::::::::::::::: objectives 13 | 14 | - Plot vector objects with different CRSs in the same plot. 
15 | 16 | :::::::::::::::::::::::::::::::::::::::::::::::::: 17 | 18 | :::::::::::::::::::::::::::::::::::::::: questions 19 | 20 | - What do I do when vector data don't line up? 21 | 22 | :::::::::::::::::::::::::::::::::::::::::::::::::: 23 | 24 | ```{r load-libraries, echo=FALSE, results="hide", message=FALSE, warning=FALSE} 25 | library(terra) 26 | library(sf) 27 | library(ggplot2) 28 | library(dplyr) 29 | ``` 30 | 31 | :::::::::::::::::::::::::::::::::::::::::: prereq 32 | 33 | ## Things You'll Need To Complete This Episode 34 | 35 | See the [lesson homepage](.) for detailed information about the software, data, 36 | and other prerequisites you will need to work through the examples in this 37 | episode. 38 | 39 | 40 | :::::::::::::::::::::::::::::::::::::::::::::::::: 41 | 42 | In [an earlier episode](03-raster-reproject-in-r/) 43 | we learned how to handle a situation where you have two different files with 44 | raster data in different projections. Now we will apply those same principles 45 | to working with vector data. 46 | We will create a base map of our study site using United States state and 47 | country boundary information accessed from the 48 | [United States Census Bureau](https://www.census.gov/geo/maps-data/data/cbf/cbf_state.html). 49 | We will learn how to map vector data that are in different CRSs and thus don't 50 | line up on a map. 51 | 52 | We will continue to work with the three ESRI `shapefiles` that we loaded in the 53 | [Open and Plot Vector Layers in R](06-vector-open-shapefile-in-r/) episode. 
54 | 55 | ```{r load-data, echo=FALSE, results="hide", warning=FALSE, message=FALSE} 56 | # learners will have this data loaded from previous episodes 57 | aoi_boundary_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HarClip_UTMZ18.shp") 58 | lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp") 59 | point_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARVtower_UTM18N.shp") 60 | CHM_HARV <- rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/CHM/HARV_chmCrop.tif") 61 | CHM_HARV_df <- as.data.frame(CHM_HARV, xy = TRUE) 62 | roadColors <- c("blue", "green", "grey", "purple")[lines_HARV$TYPE] 63 | ``` 64 | 65 | ## Working With Spatial Data From Different Sources 66 | 67 | We often need to gather spatial datasets from different sources and/or data 68 | that cover different spatial extents. 69 | These data are often in different Coordinate Reference Systems (CRSs). 70 | 71 | Some reasons for data being in different CRSs include: 72 | 73 | 1. The data are stored in a particular CRS convention used by the data provider 74 | (for example, a government agency). 75 | 2. The data are stored in a particular CRS that is customized to a region. For 76 | instance, many states in the US prefer to use a State Plane projection 77 | customized for that state. 78 | 79 | ![Maps of the United States using data in different projections. Source: opennews.org, from: https://media.opennews.org/cache/06/37/0637aa2541b31f526ad44f7cb2db7b6c.jpg](fig/map_usa_different_projections.jpg){alt='Maps of the United States using data in different projections.'} 80 | 81 | Notice the differences in shape associated with each different projection. 82 | These differences are a direct result of the calculations used to "flatten" the 83 | data onto a 2-dimensional map. Often data are stored purposefully in a 84 | particular projection that optimizes the relative shape and size of surrounding 85 | geographic boundaries (states, counties, countries, etc).
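We can see the effect of a CRS directly by expressing one location in two different systems with `st_transform()`. This is a minimal sketch using a made-up coordinate near the study area, not one of the lesson's data files:

```r
library(sf)

# A made-up point near Harvard Forest, in geographic WGS84 (EPSG:4326)
pt_longlat <- st_sfc(st_point(c(-72.17, 42.54)), crs = 4326)

# The same location expressed in UTM zone 18N (EPSG:32618)
pt_utm <- st_transform(pt_longlat, crs = 32618)

st_coordinates(pt_longlat) # decimal degrees
st_coordinates(pt_utm)     # meters - note the much larger numbers
```

The coordinates describe the same spot on the ground; only the reference system (and therefore the numbers) changes.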
86 | 87 | In this episode we will learn how to identify and manage spatial data in 88 | different projections. We will learn how to reproject the data so that they are 89 | in the same projection to support plotting / mapping. Note that these skills 90 | are also required for any geoprocessing / spatial analysis. Data need to be in 91 | the same CRS to ensure accurate results. 92 | 93 | We will continue to use the `sf` and `terra` packages in this episode. 94 | 95 | ## Import US Boundaries - Census Data 96 | 97 | There are many good sources of boundary base layers that we can use to create a 98 | basemap. Some R packages even have these base layers built in to support quick 99 | and efficient mapping. In this episode, we will use boundary layers for the 100 | contiguous United States, provided by the 101 | [United States Census Bureau](https://www.census.gov/geo/maps-data/data/cbf/cbf_state.html). 102 | It is useful to have vector layers in ESRI's `shapefile` format to work with because we can add additional 103 | attributes to them if need be - for project specific mapping. 104 | 105 | ## Read US Boundary File 106 | 107 | We will use the `st_read()` function to import the 108 | `/US-Boundary-Layers/US-State-Boundaries-Census-2014` layer into R. This layer 109 | contains the boundaries of all contiguous states in the U.S. Please note that 110 | these data have been modified and reprojected from the original data downloaded 111 | from the Census website to support the learning goals of this episode. 112 | 113 | ```{r read-shp} 114 | state_boundary_US <- st_read("data/NEON-DS-Site-Layout-Files/US-Boundary-Layers/US-State-Boundaries-Census-2014.shp") %>% 115 | st_zm() 116 | ``` 117 | 118 | Next, let's plot the U.S. states data: 119 | 120 | ```{r find-coordinates} 121 | ggplot() + 122 | geom_sf(data = state_boundary_US) + 123 | ggtitle("Map of Contiguous US State Boundaries") + 124 | coord_sf() 125 | ``` 126 | 127 | ## U.S. 
Boundary Layer 128 | 129 | We can add a boundary layer of the United States to our map - to make it look 130 | nicer. We will import 131 | `NEON-DS-Site-Layout-Files/US-Boundary-Layers/US-Boundary-Dissolved-States`. 132 | 133 | ```{r} 134 | country_boundary_US <- st_read("data/NEON-DS-Site-Layout-Files/US-Boundary-Layers/US-Boundary-Dissolved-States.shp") %>% 135 | st_zm() 136 | ``` 137 | 138 | If we specify a thicker line width using `size = 5` for the border layer, it 139 | will make our map pop! We will also manually set the colors of the state 140 | boundaries and country boundaries. 141 | 142 | ```{r us-boundaries-thickness} 143 | ggplot() + 144 | geom_sf(data = state_boundary_US, color = "gray60") + 145 | geom_sf(data = country_boundary_US, color = "black", alpha = 0.25, size = 5) + 146 | ggtitle("Map of Contiguous US State Boundaries") + 147 | coord_sf() 148 | ``` 149 | 150 | Next, let's add the location of a flux tower where our study area is. 151 | As we are adding these layers, take note of the CRS of each object. 152 | First let's look at the CRS of our tower location object: 153 | 154 | ```{r crs-sleuthing-1} 155 | st_crs(point_HARV)$proj4string 156 | ``` 157 | 158 | Our `proj4` string for `point_HARV` specifies the UTM projection as follows: 159 | 160 | `+proj=utm +zone=18 +datum=WGS84 +units=m +no_defs` 161 | 162 | - **proj=utm:** the projection is UTM; UTM has several zones. 163 | - **zone=18:** the zone is 18 164 | - **datum=WGS84:** the datum WGS84 (the datum refers to the 0,0 reference for 165 | the coordinate system used in the projection) 166 | - **units=m:** the units for the coordinates are in METERS. 167 | 168 | Note that the `zone` is unique to the UTM projection. Not all CRSs will have a 169 | zone.
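Note that `proj4` strings are only one way of describing a CRS; many modern workflows use the more compact EPSG codes instead. As a quick aside (assuming the `sf` package is loaded), the code 32618 corresponds to UTM zone 18N on the WGS84 datum:

```r
library(sf)

# EPSG:32618 encodes the same CRS as the proj4 string above;
# printing its proj4string should show "+proj=utm +zone=18 ..."
st_crs(32618)$proj4string
```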
170 | 171 | Let's check the CRS of our state and country boundary objects: 172 | 173 | ```{r crs-sleuthing-2} 174 | st_crs(state_boundary_US)$proj4string 175 | st_crs(country_boundary_US)$proj4string 176 | ``` 177 | 178 | Our `proj4` string for `state_boundary_US` and `country_boundary_US` specifies 179 | the lat/long projection as follows: 180 | 181 | `+proj=longlat +datum=WGS84 +no_defs` 182 | 183 | 184 | - **proj=longlat:** the data are in a geographic (latitude and longitude) 185 | coordinate system 186 | - **datum=WGS84:** the datum WGS84 (the datum refers to the 0,0 reference for 187 | the coordinate system used in the projection) 188 | - **no_defs:** ensures that no defaults are used, but this is now obsolete 189 | 190 | Note that there are no specified units above. This is because this geographic 191 | coordinate reference system is in latitude and longitude, which is most often 192 | recorded in decimal degrees. 193 | 194 | ::::::::::::::::::::::::::::::::::::::::: callout 195 | 196 | ## Data Tip 197 | 198 | The last portion of each `proj4` string could potentially be something like 199 | `+towgs84=0,0,0 `. This is a conversion factor that is used if a datum 200 | conversion is required. We will not deal with datums in this episode series. 201 | 202 | 203 | :::::::::::::::::::::::::::::::::::::::::::::::::: 204 | 205 | ## CRS Units - View Object Extent 206 | 207 | Next, let's view the extent or spatial coverage for the `point_HARV` spatial 208 | object compared to the `state_boundary_US` object. 209 | 210 | First we'll look at the extent for our study site: 211 | 212 | ```{r view-extent-1} 213 | st_bbox(point_HARV) 214 | ``` 215 | 216 | And then the extent for the state boundary data. 217 | 218 | ```{r view-extent-2} 219 | st_bbox(state_boundary_US) 220 | ``` 221 | 222 | Note the difference in the units for each object. The extent for 223 | `state_boundary_US` is in latitude and longitude which yields smaller numbers 224 | representing decimal degree units.
Our tower location point is in UTM and is 225 | represented in meters. 226 | 227 | ::::::::::::::::::::::::::::::::::::::::: callout 228 | 229 | ## Proj4 \& CRS Resources 230 | 231 | - [Official PROJ library documentation](https://proj4.org/) 232 | - [More information on the proj4 format.](https://proj.maptools.org/faq.html) 233 | - [A fairly comprehensive list of CRSs by format.](https://spatialreference.org) 234 | - To view a list of datum conversion factors type: 235 | `sf_proj_info(type = "datum")` into the R console. However, the results would 236 | depend on the underlying version of the PROJ library. 237 | 238 | 239 | :::::::::::::::::::::::::::::::::::::::::::::::::: 240 | 241 | ## Reproject Vector Data or No? 242 | 243 | We saw in [an earlier episode](03-raster-reproject-in-r/) that when working 244 | with raster data in different CRSs, we needed to convert all objects to the 245 | same CRS. We can do the same thing with our vector data - however, we don't 246 | need to! When using the `ggplot2` package, `ggplot` automatically converts all 247 | objects to the same CRS before plotting. 248 | This means we can plot our three data sets together without doing any 249 | conversion: 250 | 251 | ```{r layer-point-on-states} 252 | ggplot() + 253 | geom_sf(data = state_boundary_US, color = "gray60") + 254 | geom_sf(data = country_boundary_US, size = 5, alpha = 0.25, color = "black") + 255 | geom_sf(data = point_HARV, shape = 19, color = "purple") + 256 | ggtitle("Map of Contiguous US State Boundaries") + 257 | coord_sf() 258 | ``` 259 | 260 | ::::::::::::::::::::::::::::::::::::::: challenge 261 | 262 | ## Challenge - Plot Multiple Layers of Spatial Data 263 | 264 | Create a map of the North Eastern United States as follows: 265 | 266 | 1. Import and plot `Boundary-US-State-NEast.shp`. Adjust line width as 267 | necessary. 268 | 2. Layer the Fisher Tower (in the NEON Harvard Forest site) point location 269 | `point_HARV` onto the plot. 270 | 3. Add a title. 271 | 4.
Add a legend that shows both the state boundary (as a line) and the Tower 272 | location point. 273 | 274 | ::::::::::::::: solution 275 | 276 | ## Answers 277 | 278 | ```{r ne-states-harv} 279 | NE.States.Boundary.US <- st_read("data/NEON-DS-Site-Layout-Files/US-Boundary-Layers/Boundary-US-State-NEast.shp") %>% 280 | st_zm() 281 | 282 | ggplot() + 283 | geom_sf(data = NE.States.Boundary.US, aes(color ="color"), 284 | show.legend = "line") + 285 | scale_color_manual(name = "", labels = "State Boundary", 286 | values = c("color" = "gray18")) + 287 | geom_sf(data = point_HARV, aes(shape = "shape"), color = "purple") + 288 | scale_shape_manual(name = "", labels = "Fisher Tower", 289 | values = c("shape" = 19)) + 290 | ggtitle("Fisher Tower location") + 291 | theme(legend.background = element_rect(color = NA)) + 292 | coord_sf() 293 | ``` 294 | 295 | ::::::::::::::::::::::::: 296 | 297 | :::::::::::::::::::::::::::::::::::::::::::::::::: 298 | 299 | 300 | 301 | :::::::::::::::::::::::::::::::::::::::: keypoints 302 | 303 | - `ggplot2` automatically converts all objects in a plot to the same CRS. 304 | - Still be aware of the CRS and extent for each object. 305 | 306 | :::::::::::::::::::::::::::::::::::::::::::::::::: 307 | 308 | 309 | -------------------------------------------------------------------------------- /episodes/10-vector-csv-to-shapefile-in-r.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: Convert from .csv to a Vector Layer 3 | teaching: 40 4 | exercises: 20 5 | source: Rmd 6 | --- 7 | 8 | ```{r setup, echo=FALSE} 9 | source("setup.R") 10 | ``` 11 | 12 | ::::::::::::::::::::::::::::::::::::::: objectives 13 | 14 | - Import .csv files containing x,y coordinate locations into R as a data frame. 15 | - Convert a data frame to a spatial object. 16 | - Export a spatial object to a text file. 
17 | 18 | :::::::::::::::::::::::::::::::::::::::::::::::::: 19 | 20 | :::::::::::::::::::::::::::::::::::::::: questions 21 | 22 | - How can I import CSV files as vector layers in R? 23 | 24 | :::::::::::::::::::::::::::::::::::::::::::::::::: 25 | 26 | ```{r load-libraries, echo=FALSE, results="hide", message=FALSE, warning=FALSE} 27 | library(terra) 28 | library(ggplot2) 29 | library(dplyr) 30 | library(sf) 31 | ``` 32 | 33 | ```{r load-data, echo=FALSE, results="hide"} 34 | # Learners will have this data loaded from earlier episodes 35 | lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp") 36 | aoi_boundary_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HarClip_UTMZ18.shp") 37 | country_boundary_US <- st_read("data/NEON-DS-Site-Layout-Files/US-Boundary-Layers/US-Boundary-Dissolved-States.shp") 38 | point_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARVtower_UTM18N.shp") 39 | ``` 40 | 41 | :::::::::::::::::::::::::::::::::::::::::: prereq 42 | 43 | ## Things You'll Need To Complete This Episode 44 | 45 | See the [lesson homepage](.) for detailed information about the software, data, 46 | and other prerequisites you will need to work through the examples in this 47 | episode. 48 | 49 | 50 | :::::::::::::::::::::::::::::::::::::::::::::::::: 51 | 52 | This episode will review how to import spatial points stored in `.csv` (Comma 53 | Separated Value) format into R as an `sf` spatial object. We will also 54 | reproject data imported from an ESRI `shapefile` format, export the reprojected data as an ESRI `shapefile`, and plot raster and vector data as layers in the same plot. 55 | 56 | ## Spatial Data in Text Format 57 | 58 | The `HARV_PlotLocations.csv` file contains `x, y` (point) locations for the study 59 | plots where NEON collects data on 60 | [vegetation and other ecological metrics](https://www.neonscience.org/data-collection/terrestrial-organismal-sampling).
61 | We would like to: 62 | 63 | - Create a map of these plot locations. 64 | - Export the data in an ESRI `shapefile` format to share with our colleagues. This 65 | `shapefile` can be imported into most GIS software. 66 | - Create a map showing vegetation height with plot locations layered on top. 67 | 68 | Spatial data are sometimes stored in a text file format (`.txt` or `.csv`). If 69 | the text file has an associated `x` and `y` location column, then we can 70 | convert it into an `sf` spatial object. The `sf` object allows us to store both 71 | the `x,y` values that represent the coordinate location of each point and the 72 | associated attribute data - or columns describing each feature in the spatial 73 | object. 74 | 75 | We will continue using the `sf` and `terra` packages in this episode. 76 | 77 | ## Import .csv 78 | 79 | To begin let's import a `.csv` file that contains plot coordinate `x, y` 80 | locations at the NEON Harvard Forest Field Site (`HARV_PlotLocations.csv`) and 81 | look at the structure of that new object: 82 | 83 | ```{r read-csv} 84 | plot_locations_HARV <- 85 | read.csv("data/NEON-DS-Site-Layout-Files/HARV/HARV_PlotLocations.csv") 86 | 87 | str(plot_locations_HARV) 88 | ``` 89 | 90 | We now have a data frame that contains 21 locations (rows) and 16 variables 91 | (attributes). Note that all of our character data was imported into R as 92 | character (text) data. Next, let's explore the dataframe to determine whether 93 | it contains columns with coordinate values. If we are lucky, our `.csv` will 94 | contain columns labeled: 95 | 96 | - "X" and "Y" OR 97 | - Latitude and Longitude OR 98 | - easting and northing (UTM coordinates) 99 | 100 | Let's check out the column names of our dataframe. 101 | 102 | ```{r find-coordinates} 103 | names(plot_locations_HARV) 104 | ``` 105 | 106 | ## Identify X,Y Location Columns 107 | 108 | Our column names include several fields that might contain spatial information. 
109 | The `plot_locations_HARV$easting` and `plot_locations_HARV$northing` columns 110 | contain coordinate values. We can confirm this by looking at the first six rows 111 | of our data. 112 | 113 | ```{r check-out-coordinates} 114 | head(plot_locations_HARV$easting) 115 | head(plot_locations_HARV$northing) 116 | ``` 117 | 118 | We have coordinate values in our data frame. In order to convert our data frame 119 | to an `sf` object, we also need to know the CRS associated with those 120 | coordinate values. 121 | 122 | There are several ways to figure out the CRS of spatial data in text format. 123 | 124 | 1. We can check the file metadata in hopes that the CRS was recorded in the 125 | data. 126 | 2. We can explore the file itself to see if CRS information is embedded in the 127 | file header or somewhere in the data columns. 128 | 129 | Following the `easting` and `northing` columns, there is a `geodeticDa` and a 130 | `utmZone` column. These appear to contain CRS information (`datum` and 131 | `projection`). Let's view those next. 132 | 133 | ```{r view-CRS-info} 134 | head(plot_locations_HARV$geodeticDa) 135 | head(plot_locations_HARV$utmZone) 136 | ``` 137 | 138 | It is not typical to store CRS information in a column. But this particular 139 | file contains CRS information this way. The `geodeticDa` and `utmZone` columns 140 | contain the information that helps us determine the CRS: 141 | 142 | - `geodeticDa`: WGS84 -- this is geodetic datum WGS84 143 | - `utmZone`: 18 144 | 145 | In 146 | [When Vector Data Don't Line Up - Handling Spatial Projection \& CRS in R](09-vector-when-data-dont-line-up-crs.html) 147 | we learned about the components of a `proj4` string. We have everything we need 148 | to assign a CRS to our data frame. 
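As an aside, because UTM Zone 18N on the WGS84 datum corresponds to EPSG code 32618, the CRS can also be handed to `st_as_sf()` as that code directly rather than copied from another object. A minimal sketch using a tiny made-up data frame (not the lesson data):

```r
library(sf)

# A tiny made-up data frame with UTM zone 18N easting/northing values
df <- data.frame(easting  = c(731000, 731500),
                 northing = c(4713000, 4713500),
                 plot     = c("A", "B"))

# crs accepts an EPSG code in place of a proj4 string or crs object
pts <- st_as_sf(df, coords = c("easting", "northing"), crs = 32618)
st_crs(pts)$epsg
```

Copying the CRS from an existing object, as the lesson does below, has the advantage of guaranteeing an exact match with the rest of your data.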
149 | 150 | To create the `proj4` associated with UTM Zone 18 WGS84 we can look up the 151 | projection on the 152 | [Spatial Reference website](https://spatialreference.org/ref/epsg/32618/), 153 | which contains a list of CRS formats for each projection. From here, we can 154 | extract the 155 | [proj4 string for UTM Zone 18N WGS84](https://spatialreference.org/ref/epsg/32618/proj4.txt). 156 | 157 | However, if we have other data in the UTM Zone 18N projection, it's much easier 158 | to use the `st_crs()` function to extract the CRS in `proj4` format from that 159 | object and assign it to our new spatial object. We've seen this CRS before with 160 | our Harvard Forest study site (`point_HARV`). 161 | 162 | ```{r explore-units} 163 | st_crs(point_HARV) 164 | ``` 165 | 166 | The output above shows that the points vector layer is in UTM zone 18N. We can 167 | thus use the CRS from that spatial object to convert our non-spatial dataframe 168 | into an `sf` object. 169 | 170 | Next, let's create a `crs` object that we can use to define the CRS of our `sf` 171 | object when we create it. 172 | 173 | ```{r crs-object} 174 | utm18nCRS <- st_crs(point_HARV) 175 | utm18nCRS 176 | 177 | class(utm18nCRS) 178 | ``` 179 | 180 | ## .csv to sf object 181 | 182 | Next, let's convert our dataframe into an `sf` object. To do this, we need to 183 | specify: 184 | 185 | 1. The columns containing X (`easting`) and Y (`northing`) coordinate values 186 | 2. The CRS that the column coordinates represent (units are included in the CRS) - stored in our `utm18nCRS` object. 187 | 188 | We will use the `st_as_sf()` function to perform the conversion. 189 | 190 | ```{r convert-csv-shapefile} 191 | plot_locations_sp_HARV <- st_as_sf(plot_locations_HARV, 192 | coords = c("easting", "northing"), 193 | crs = utm18nCRS) 194 | ``` 195 | 196 | We should double check the CRS to make sure it is correct.
197 | 198 | ```{r} 199 | st_crs(plot_locations_sp_HARV) 200 | ``` 201 | 202 | ## Plot Spatial Object 203 | 204 | Now that we have a spatial R object, we can plot our newly created spatial object. 205 | 206 | ```{r plot-data-points} 207 | ggplot() + 208 | geom_sf(data = plot_locations_sp_HARV) + 209 | ggtitle("Map of Plot Locations") 210 | ``` 211 | 212 | ## Plot Extent 213 | 214 | In 215 | [Open and Plot Vector Layers in R](06-vector-open-shapefile-in-r.html) 216 | we learned about spatial object extent. When we plot several spatial layers in 217 | R using `ggplot`, all of the layers of the plot are considered in setting the 218 | boundaries of the plot. To show this, let's plot our `aoi_boundary_HARV` object 219 | with our vegetation plots. 220 | 221 | ```{r plot-data} 222 | ggplot() + 223 | geom_sf(data = aoi_boundary_HARV) + 224 | geom_sf(data = plot_locations_sp_HARV) + 225 | ggtitle("AOI Boundary Plot") 226 | ``` 227 | 228 | When we plot the two layers together, `ggplot` sets the plot boundaries so that 229 | they are large enough to include all of the data included in all of the layers. 230 | That's really handy! 231 | 232 | ::::::::::::::::::::::::::::::::::::::: challenge 233 | 234 | ## Challenge - Import \& Plot Additional Points 235 | 236 | We want to add two phenology plots to our existing map of vegetation plot 237 | locations. 238 | 239 | Import the .csv: `HARV/HARV_2NewPhenPlots.csv` into R and do the following: 240 | 241 | 1. Find the X and Y coordinate locations. Which value is X and which value is 242 | Y? 243 | 2. These data were collected in a geographic coordinate system (WGS84). Convert 244 | the dataframe into an `sf` object. 245 | 3. Plot the new points with the plot location points from above. Be sure to add 246 | a legend. Use a different symbol for the 2 new points! 247 | 248 | If you have extra time, feel free to add roads and other layers to your map!
249 | 250 | ::::::::::::::: solution 251 | 252 | ## Answers 253 | 254 | 1) 255 | First we will read in the new csv file and look at the data structure. 256 | 257 | ```{r} 258 | newplot_locations_HARV <- 259 | read.csv("data/NEON-DS-Site-Layout-Files/HARV/HARV_2NewPhenPlots.csv") 260 | str(newplot_locations_HARV) 261 | ``` 262 | 263 | 2) 264 | The US boundary data we worked with previously is in a geographic WGS84 CRS. We 265 | can use that data to establish a CRS for this data. First we will extract the 266 | CRS from the `country_boundary_US` object and confirm that it is WGS84. 267 | 268 | ```{r} 269 | geogCRS <- st_crs(country_boundary_US) 270 | geogCRS 271 | ``` 272 | 273 | Then we will convert our new data to a spatial dataframe, using the `geogCRS` 274 | object as our CRS. 275 | 276 | ```{r} 277 | newPlot.Sp.HARV <- st_as_sf(newplot_locations_HARV, 278 | coords = c("decimalLon", "decimalLat"), 279 | crs = geogCRS) 280 | ``` 281 | 282 | Next we'll confirm that the CRS for our new object is correct. 283 | 284 | ```{r} 285 | st_crs(newPlot.Sp.HARV) 286 | ``` 287 | 288 | We will be adding these new data points to the plot we created before. The data 289 | for the earlier plot was in UTM. Since we're using `ggplot`, it will reproject 290 | the data for us. 291 | 292 | 3) Now we can create our plot. 293 | 294 | ```{r plot-locations-harv-orange} 295 | ggplot() + 296 | geom_sf(data = plot_locations_sp_HARV, color = "orange") + 297 | geom_sf(data = newPlot.Sp.HARV, color = "lightblue") + 298 | ggtitle("Map of All Plot Locations") 299 | ``` 300 | 301 | ::::::::::::::::::::::::: 302 | 303 | :::::::::::::::::::::::::::::::::::::::::::::::::: 304 | 305 | ## Export to an ESRI `shapefile` 306 | 307 | We can write an R spatial object to an ESRI `shapefile` using the `st_write` function 308 | in `sf`. 
To do this we need the following arguments: 309 | 310 | - the name of the spatial object (`plot_locations_sp_HARV`) 311 | - the directory where we want to save our ESRI `shapefile` (use `getwd()` for the 312 | current working directory, or specify a different path) 313 | - the name of the new ESRI `shapefile` (`PlotLocations_HARV`) 314 | - the driver which specifies the file format (ESRI Shapefile) 315 | 316 | We can now export the spatial object as an ESRI `shapefile`. 317 | 318 | ```{r write-shapefile, warnings="hide", eval=FALSE} 319 | st_write(plot_locations_sp_HARV, 320 | "data/PlotLocations_HARV.shp", driver = "ESRI Shapefile") 321 | ``` 322 | 323 | 324 | 325 | :::::::::::::::::::::::::::::::::::::::: keypoints 326 | 327 | - Know the projection (if any) of your point data prior to converting to a 328 | spatial object. 329 | - Convert a data frame to an `sf` object using the `st_as_sf()` function. 330 | - Export an `sf` object to an ESRI `shapefile` using the `st_write()` function. 331 | 332 | :::::::::::::::::::::::::::::::::::::::::::::::::: 333 | 334 | 335 | -------------------------------------------------------------------------------- /episodes/11-vector-raster-integration.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: Manipulate Raster Data 3 | teaching: 40 4 | exercises: 20 5 | source: Rmd 6 | --- 7 | 8 | ```{r setup, echo=FALSE} 9 | source("setup.R") 10 | ``` 11 | 12 | ::::::::::::::::::::::::::::::::::::::: objectives 13 | 14 | - Crop a raster to the extent of a vector layer. 15 | - Extract values from a raster that correspond to a vector file overlay. 16 | 17 | :::::::::::::::::::::::::::::::::::::::::::::::::: 18 | 19 | :::::::::::::::::::::::::::::::::::::::: questions 20 | 21 | - How can I crop raster objects to vector objects, and extract the summary of 22 | raster pixels?
23 | 24 | :::::::::::::::::::::::::::::::::::::::::::::::::: 25 | 26 | ```{r load-libraries, echo=FALSE, results="hide", message=FALSE, warning=FALSE} 27 | library(sf) 28 | library(terra) 29 | library(ggplot2) 30 | library(dplyr) 31 | ``` 32 | 33 | ```{r load-data, echo=FALSE, results="hide"} 34 | # Learners will have this data loaded from earlier episodes 35 | point_HARV <- 36 | st_read("data/NEON-DS-Site-Layout-Files/HARV/HARVtower_UTM18N.shp") 37 | lines_HARV <- 38 | st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp") 39 | aoi_boundary_HARV <- 40 | st_read("data/NEON-DS-Site-Layout-Files/HARV/HarClip_UTMZ18.shp") 41 | 42 | # CHM 43 | CHM_HARV <- 44 | rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/CHM/HARV_chmCrop.tif") 45 | 46 | CHM_HARV_df <- as.data.frame(CHM_HARV, xy = TRUE) 47 | 48 | # plot locations 49 | plot_locations_HARV <- 50 | read.csv("data/NEON-DS-Site-Layout-Files/HARV/HARV_PlotLocations.csv") 51 | utm18nCRS <- st_crs(point_HARV) 52 | plot_locations_sp_HARV <- st_as_sf(plot_locations_HARV, 53 | coords = c("easting", "northing"), 54 | crs = utm18nCRS) 55 | ``` 56 | 57 | :::::::::::::::::::::::::::::::::::::::::: prereq 58 | 59 | ## Things You'll Need To Complete This Episode 60 | 61 | See the [lesson homepage](.) for detailed information about the software, data, 62 | and other prerequisites you will need to work through the examples in this 63 | episode. 64 | 65 | 66 | :::::::::::::::::::::::::::::::::::::::::::::::::: 67 | 68 | This episode explains how to crop a raster using the extent of a vector 69 | layer. We will also cover how to extract values from a raster that occur 70 | within a set of polygons, or in a buffer (surrounding) region around a set of 71 | points. 72 | 73 | ## Crop a Raster to Vector Extent 74 | 75 | We often work with spatial layers that have different spatial extents. 
The
76 | spatial extent of a vector layer or R spatial object represents the geographic
77 | "edge" or location that is the farthest north, south, east, and west. Thus it
78 | represents the overall geographic coverage of the spatial object.
79 | 
80 | ![](fig/dc-spatial-vector/spatial_extent.png){alt='Extent illustration'} Image Source: National
81 | Ecological Observatory Network (NEON)
82 | 
83 | The graphic below illustrates the extent of several of the spatial layers that
84 | we have worked with in this workshop:
85 | 
86 | - Area of interest (AOI) -- blue
87 | - Roads and trails -- purple
88 | - Vegetation plot locations (marked with white dots) -- black
89 | - A canopy height model (CHM) in GeoTIFF format -- green
90 | 
91 | ```{r view-extents, echo=FALSE, results="hide"}
92 | # code not shown, for demonstration purposes only
93 | # create CHM as a vector layer
94 | CHM_HARV_sp <- st_as_sf(CHM_HARV_df, coords = c("x", "y"), crs = utm18nCRS)
95 | # approximate the bounding box with a random sample of raster points
96 | CHM_rand_sample <- sample_n(CHM_HARV_sp, 10000)
97 | lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp")
98 | plots_HARV <-
99 |   st_read("data/NEON-DS-Site-Layout-Files/HARV/PlotLocations_HARV.shp")
100 | ```
101 | 
102 | ```{r compare-data-extents, echo=FALSE}
103 | # code not shown, for demonstration purposes only
104 | ggplot() +
105 |   geom_sf(data = st_convex_hull(st_union(CHM_rand_sample)), fill = "green") +
106 |   geom_sf(data = st_convex_hull(st_union(lines_HARV)),
107 |           fill = "purple", alpha = 0.2) +
108 |   geom_sf(data = lines_HARV, aes(color = TYPE), size = 1) +
109 |   geom_sf(data = aoi_boundary_HARV, fill = "blue") +
110 |   geom_sf(data = st_convex_hull(st_union(plot_locations_sp_HARV)),
111 |           fill = "black", alpha = 0.4) +
112 |   geom_sf(data = plots_HARV, color = "white") +
113 |   theme(legend.position = "none") +
114 |   coord_sf()
115 | 
116 | ```
117 | 
118 | Frequent use cases of cropping a raster file include reducing
file size and
119 | creating maps. Sometimes we have a raster file that is much larger than our
120 | study area or area of interest. It is often more efficient to crop the raster
121 | to the extent of our study area to reduce file sizes as we process our data.
122 | Cropping a raster can also be useful when creating pretty maps so that the
123 | raster layer matches the extent of the desired vector layers.
124 | 
125 | ## Crop a Raster Using Vector Extent
126 | 
127 | We can use the `crop()` function to crop a raster to the extent of another
128 | spatial object. To do this, we need to specify the raster to be cropped and the
129 | spatial object that will be used to crop the raster. R will use the `extent` of
130 | the spatial object as the cropping boundary.
131 | 
132 | To illustrate this, we will crop the Canopy Height Model (CHM) to only include
133 | the area of interest (AOI). Let's start by plotting the full extent of the CHM
134 | data and overlay where the AOI falls within it. The boundaries of the AOI will
135 | be colored blue, and we use `fill = NA` to make the area transparent.
136 | 
137 | ```{r crop-by-vector-extent}
138 | ggplot() +
139 |   geom_raster(data = CHM_HARV_df, aes(x = x, y = y, fill = HARV_chmCrop)) +
140 |   scale_fill_gradientn(name = "Canopy Height", colors = terrain.colors(10)) +
141 |   geom_sf(data = aoi_boundary_HARV, color = "blue", fill = NA) +
142 |   coord_sf()
143 | ```
144 | 
145 | Now that we have visualized the area of the CHM we want to subset, we can
146 | perform the cropping operation. We are going to use the `crop()` function from
147 | the terra package to create a new object with only the portion of the CHM data
148 | that falls within the boundaries of the AOI.
149 | 
150 | ```{r}
151 | CHM_HARV_Cropped <- crop(x = CHM_HARV, y = aoi_boundary_HARV)
152 | ```
153 | 
154 | Now we can plot the cropped CHM data, along with a bounding box showing the
155 | full CHM extent.
However, remember, since this is raster data, we need to
156 | convert it to a data frame in order to plot it using `ggplot`. To get the
157 | bounding box from the CHM, the `st_bbox()` function will extract the 4 corners
158 | of the rectangle that encompasses all the features contained in this object.
159 | The `st_as_sfc()` function converts these 4 coordinates into a polygon that we
160 | can plot:
161 | 
162 | ```{r show-cropped-area}
163 | CHM_HARV_Cropped_df <- as.data.frame(CHM_HARV_Cropped, xy = TRUE)
164 | 
165 | ggplot() +
166 |   geom_sf(data = st_as_sfc(st_bbox(CHM_HARV)), fill = "green",
167 |           color = "green", alpha = .2) +
168 |   geom_raster(data = CHM_HARV_Cropped_df,
169 |               aes(x = x, y = y, fill = HARV_chmCrop)) +
170 |   scale_fill_gradientn(name = "Canopy Height", colors = terrain.colors(10)) +
171 |   coord_sf()
172 | ```
173 | 
174 | The plot above shows that the full CHM extent (plotted in green) is much larger
175 | than the resulting cropped raster. Our new cropped CHM now has the same extent
176 | as the `aoi_boundary_HARV` object that was used as a crop extent (blue border
177 | below).
178 | 
179 | ```{r view-crop-extent}
180 | ggplot() +
181 |   geom_raster(data = CHM_HARV_Cropped_df,
182 |               aes(x = x, y = y, fill = HARV_chmCrop)) +
183 |   geom_sf(data = aoi_boundary_HARV, color = "blue", fill = NA) +
184 |   scale_fill_gradientn(name = "Canopy Height", colors = terrain.colors(10)) +
185 |   coord_sf()
186 | ```
187 | 
188 | We can look at the extent of all of our other objects for this field site.
189 | 
190 | ```{r view-extent}
191 | st_bbox(CHM_HARV)
192 | st_bbox(CHM_HARV_Cropped)
193 | st_bbox(aoi_boundary_HARV)
194 | st_bbox(plot_locations_sp_HARV)
195 | ```
196 | 
197 | Our plot location extent is not the largest, but it is larger than the AOI
198 | boundary. It would be nice to see our vegetation plot locations plotted on top
199 | of the Canopy Height Model information.
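These bounding boxes can also be compared numerically rather than by eye. Below is a rough sketch of one way to do that; it assumes the objects above are loaded, and `bbox_area()` is a throwaway helper we define here for illustration, not an sf function.

```{r compare-bbox-areas}
# Helper (defined here for illustration): the area covered by an
# object's bounding box, in the square units of its CRS
bbox_area <- function(obj) {
  bb <- st_bbox(obj)
  unname((bb["xmax"] - bb["xmin"]) * (bb["ymax"] - bb["ymin"]))
}

# How many times larger is the full CHM footprint than the AOI?
bbox_area(CHM_HARV) / bbox_area(aoi_boundary_HARV)
```

A ratio well above 1 tells us the AOI covers only a small fraction of the full raster footprint.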
199 | 
200 | ::::::::::::::::::::::::::::::::::::::: challenge
201 | 
202 | ## Challenge: Crop to Vector Points Extent
203 | 
204 | 1. Crop the Canopy Height Model to the extent of the study plot locations.
205 | 2. Plot the vegetation plot location points on top of the Canopy Height Model.
206 | 
207 | ::::::::::::::: solution
208 | 
209 | ## Answers
210 | 
211 | ```{r challenge-code-crop-raster-points}
212 | 
213 | CHM_plots_HARVcrop <- crop(x = CHM_HARV, y = plot_locations_sp_HARV)
214 | 
215 | CHM_plots_HARVcrop_df <- as.data.frame(CHM_plots_HARVcrop, xy = TRUE)
216 | 
217 | ggplot() +
218 |   geom_raster(data = CHM_plots_HARVcrop_df,
219 |               aes(x = x, y = y, fill = HARV_chmCrop)) +
220 |   scale_fill_gradientn(name = "Canopy Height", colors = terrain.colors(10)) +
221 |   geom_sf(data = plot_locations_sp_HARV) +
222 |   coord_sf()
223 | ```
224 | 
225 | :::::::::::::::::::::::::
226 | 
227 | ::::::::::::::::::::::::::::::::::::::::::::::::::
228 | 
229 | In the plot above, created in the challenge, all the vegetation plot locations
230 | (black dots) appear on the Canopy Height Model raster layer except for one,
231 | which is situated in the blank space to the left of the map. Why?
232 | 
233 | A modification of the first figure in this episode is below, showing the
234 | relative extents of all the spatial objects. Notice that the extent for our
235 | vegetation plot layer (black) extends further west than the extent of our CHM
236 | raster (bright green). The `crop()` function will make a raster extent smaller;
237 | it will not expand the extent in areas where there are no data. Thus, the
238 | extent of our vegetation plot layer will still extend further west than the
239 | extent of our (cropped) raster data (dark green).
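We can also confirm this programmatically. The sketch below assumes the cropped raster and plot locations from the challenge above are loaded; it tests each plot location against the cropped raster's bounding box.

```{r find-points-outside-crop}
# Convert the cropped CHM's bounding box to a polygon we can test against
chm_crop_bbox <- st_as_sfc(st_bbox(CHM_plots_HARVcrop))

# TRUE/FALSE for each plot location: does it fall on the raster's footprint?
inside <- st_intersects(plot_locations_sp_HARV, chm_crop_bbox,
                        sparse = FALSE)[, 1]

# The plot location(s) that fall outside the cropped raster
plot_locations_sp_HARV[!inside, ]
```

The row(s) returned should correspond to the lone point plotted in the blank space west of the raster.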
240 | 
241 | ```{r, echo=FALSE}
242 | # code not shown, demonstration only
243 | # create CHM_plots_HARVcrop as a vector layer
244 | CHM_plots_HARVcrop_sp <- st_as_sf(CHM_plots_HARVcrop_df, coords = c("x", "y"),
245 |                                   crs = utm18nCRS)
246 | # approximate the bounding box with a random sample of raster points
247 | CHM_plots_HARVcrop_sp_rand_sample <- sample_n(CHM_plots_HARVcrop_sp, 10000)
248 | ```
249 | 
250 | ```{r repeat-compare-data-extents, ref.label="compare-data-extents", echo=FALSE}
251 | ```
252 | 
253 | ## Define an Extent
254 | 
255 | So far, we have used a vector layer to crop the extent of a raster dataset.
256 | Alternatively, we can also use the `ext()` function to define an extent to be
257 | used as a cropping boundary. This creates a new object of class `SpatExtent`.
258 | Here we will provide the `ext()` function with our xmin, xmax, ymin, and ymax
259 | (in that order).
260 | 
261 | ```{r}
262 | new_extent <- ext(732161.2, 732238.7, 4713249, 4713333)
263 | class(new_extent)
264 | ```
265 | 
266 | ::::::::::::::::::::::::::::::::::::::::: callout
267 | 
268 | ## Data Tip
269 | 
270 | The extent can be created from a numeric vector (as shown above), a matrix, or
271 | a list. For more details see the `ext()` function help file
272 | (`?terra::ext`).
273 | 
274 | 
275 | ::::::::::::::::::::::::::::::::::::::::::::::::::
276 | 
277 | Once we have defined our new extent, we can use the `crop()` function to crop
278 | our raster to this extent object.
279 | 
280 | ```{r crop-using-drawn-extent}
281 | CHM_HARV_manual_cropped <- crop(x = CHM_HARV, y = new_extent)
282 | ```
283 | 
284 | To plot this data using `ggplot()` we need to convert it to a data frame.
285 | 
286 | ```{r}
287 | CHM_HARV_manual_cropped_df <- as.data.frame(CHM_HARV_manual_cropped, xy = TRUE)
288 | ```
289 | 
290 | Now we can plot this cropped data. We will show the AOI boundary on the same
291 | plot for scale.
292 | 
293 | ```{r show-manual-crop-area}
294 | ggplot() +
295 |   geom_sf(data = aoi_boundary_HARV, color = "blue", fill = NA) +
296 |   geom_raster(data = CHM_HARV_manual_cropped_df,
297 |               aes(x = x, y = y, fill = HARV_chmCrop)) +
298 |   scale_fill_gradientn(name = "Canopy Height", colors = terrain.colors(10)) +
299 |   coord_sf()
300 | ```
301 | 
302 | ## Extract Raster Pixel Values Using Vector Polygons
303 | 
304 | Often we want to extract values from a raster layer for particular locations -
305 | for example, plot locations that we are sampling on the ground. We can extract
306 | all pixel values within 20m of our x,y point of interest. These can then be
307 | summarized into some value of interest (e.g. mean, maximum, total).
308 | 
309 | ![](fig/BufferSquare.png){alt='Image shows raster information extraction using 20m polygon boundary.'}
310 | Image Source: National Ecological Observatory Network (NEON)
311 | 
312 | To do this in R, we use the `extract()` function. The `extract()` function
313 | requires:
314 | 
315 | - The raster that we wish to extract values from,
316 | - The vector layer containing the polygons that we wish to use as a boundary
317 |   or boundaries,
318 | - Optionally, `raw = FALSE`, which tells it to store the output values in a
319 |   data frame.
320 | 
321 | We will begin by extracting all canopy height pixel values located within our
322 | `aoi_boundary_HARV` polygon, which surrounds the tower located at the NEON
323 | Harvard Forest field site.
324 | 
325 | ```{r extract-from-raster}
326 | tree_height <- extract(x = CHM_HARV, y = aoi_boundary_HARV, raw = FALSE)
327 | 
328 | str(tree_height)
329 | ```
330 | 
331 | When we use the `extract()` function, R extracts the value for each pixel
332 | located within the boundary of the polygon being used to perform the extraction
333 | - in this case the `aoi_boundary_HARV` object (a single polygon). Here, the
334 | function extracted values from 18,450 pixels.
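As a rough sanity check on that pixel count (a sketch, assuming the objects above are loaded), the number of extracted pixels should be close to the polygon's area divided by the area of a single pixel:

```{r check-extract-pixel-count}
# One row per extracted pixel
nrow(tree_height)

# Expected pixel count: polygon area (m^2) divided by the area of one pixel,
# taken from the raster's x and y resolution
as.numeric(st_area(aoi_boundary_HARV)) / prod(res(CHM_HARV))
```

Small differences between the two numbers can occur because pixels along the polygon's edge are only partially covered.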
335 | 
336 | We can create a histogram of tree height values within the boundary to better
337 | understand the structure or height distribution of trees at our site. We will
338 | use the column `HARV_chmCrop` from our data frame as our x values, as this
339 | column represents the tree heights for each pixel.
340 | 
341 | ```{r view-extract-histogram}
342 | ggplot() +
343 |   geom_histogram(data = tree_height, aes(x = HARV_chmCrop)) +
344 |   ggtitle("Histogram of CHM Height Values (m)") +
345 |   xlab("Tree Height") +
346 |   ylab("Frequency of Pixels")
347 | ```
348 | 
349 | We can also use the `summary()` function to view descriptive statistics
350 | including min, max, and mean height values. These values help us better
351 | understand vegetation at our field site.
352 | 
353 | ```{r}
354 | summary(tree_height$HARV_chmCrop)
355 | ```
356 | 
357 | ## Summarize Extracted Raster Values
358 | 
359 | We often want to extract summary values from a raster. We can tell R the type
360 | of summary statistic we are interested in using the `fun =` argument. Let's
361 | extract a mean height value for our AOI.
362 | 
363 | ```{r summarize-extract}
364 | mean_tree_height_AOI <- extract(x = CHM_HARV, y = aoi_boundary_HARV,
365 |                                 fun = mean)
366 | 
367 | mean_tree_height_AOI
368 | ```
369 | 
370 | It appears that the mean height value, extracted from our LiDAR-derived
371 | canopy height model, is 22.43 meters.
372 | 
373 | ## Extract Data Using x,y Locations
374 | 
375 | We can also extract pixel values from a raster by defining a buffer or area
376 | surrounding individual point locations using the `st_buffer()` function. To do
377 | this we define the summary argument (`fun = mean`) and the buffer distance
378 | (`dist = 20`), which represents the radius of a circular region around each
379 | point. By default, the units of the buffer are the same units as the data's
380 | CRS. All pixels that are touched by the buffer region are included in the
381 | extract.
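Before handing the buffer to `extract()`, it can help to inspect what `st_buffer()` actually returns. The sketch below assumes `point_HARV` is loaded and that its CRS units are meters.

```{r inspect-buffer}
# Buffer the tower point by 20 m (the CRS units here are meters)
tower_buffer <- st_buffer(point_HARV, dist = 20)

st_geometry_type(tower_buffer)  # the point has become a POLYGON
st_area(tower_buffer)           # roughly pi * 20^2, i.e. about 1257 m^2
```

The polygon is a many-sided approximation of a circle, so its area comes out slightly under the exact pi \* r^2 value.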
382 | 383 | ![](fig/BufferCircular.png){alt='Image shows raster information extraction using 20m buffer region.'} 384 | Image Source: National Ecological Observatory Network (NEON) 385 | 386 | Let's put this into practice by figuring out the mean tree height in the 20m 387 | around the tower location (`point_HARV`). 388 | 389 | ```{r extract-point-to-buffer} 390 | mean_tree_height_tower <- extract(x = CHM_HARV, 391 | y = st_buffer(point_HARV, dist = 20), 392 | fun = mean) 393 | 394 | mean_tree_height_tower 395 | ``` 396 | 397 | ::::::::::::::::::::::::::::::::::::::: challenge 398 | 399 | ## Challenge: Extract Raster Height Values For Plot Locations 400 | 401 | 1) Use the plot locations object (`plot_locations_sp_HARV`) to extract an 402 | average tree height for the area within 20m of each vegetation plot location 403 | in the study area. Because there are multiple plot locations, there will be 404 | multiple averages returned. 405 | 406 | 2) Create a plot showing the mean tree height of each area. 407 | 408 | ::::::::::::::: solution 409 | 410 | ## Answers 411 | 412 | ```{r hist-tree-height-veg-plot} 413 | # extract data at each plot location 414 | mean_tree_height_plots_HARV <- extract(x = CHM_HARV, 415 | y = st_buffer(plot_locations_sp_HARV, 416 | dist = 20), 417 | fun = mean) 418 | 419 | # view data 420 | mean_tree_height_plots_HARV 421 | 422 | # plot data 423 | ggplot(data = mean_tree_height_plots_HARV, aes(ID, HARV_chmCrop)) + 424 | geom_col() + 425 | ggtitle("Mean Tree Height at each Plot") + 426 | xlab("Plot ID") + 427 | ylab("Tree Height (m)") 428 | ``` 429 | 430 | ::::::::::::::::::::::::: 431 | 432 | :::::::::::::::::::::::::::::::::::::::::::::::::: 433 | 434 | 435 | 436 | :::::::::::::::::::::::::::::::::::::::: keypoints 437 | 438 | - Use the `crop()` function to crop a raster object. 439 | - Use the `extract()` function to extract pixels from a raster object that fall 440 | within a particular extent boundary. 
441 | - Use the `ext()` function to define an extent. 442 | 443 | :::::::::::::::::::::::::::::::::::::::::::::::::: 444 | 445 | 446 | -------------------------------------------------------------------------------- /episodes/13-plot-time-series-rasters-in-r.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: Create Publication-quality Graphics 3 | teaching: 40 4 | exercises: 20 5 | source: Rmd 6 | --- 7 | 8 | ```{r setup, echo=FALSE} 9 | source("setup.R") 10 | ``` 11 | 12 | ::::::::::::::::::::::::::::::::::::::: objectives 13 | 14 | - Assign custom names to bands in a RasterStack. 15 | - Customize raster plots using the `ggplot2` package. 16 | 17 | :::::::::::::::::::::::::::::::::::::::::::::::::: 18 | 19 | :::::::::::::::::::::::::::::::::::::::: questions 20 | 21 | - How can I create a publication-quality graphic and customize plot parameters? 22 | 23 | :::::::::::::::::::::::::::::::::::::::::::::::::: 24 | 25 | ```{r load-libraries, echo=FALSE, results="hide", message=FALSE} 26 | library(terra) 27 | library(ggplot2) 28 | library(dplyr) 29 | library(reshape) 30 | library(RColorBrewer) 31 | library(scales) 32 | ``` 33 | 34 | ```{r load-data, echo=FALSE, results="hide"} 35 | # learners will have this data loaded from the previous episode 36 | 37 | all_NDVI_HARV <- list.files("data/NEON-DS-Landsat-NDVI/HARV/2011/NDVI", 38 | full.names = TRUE, pattern = ".tif$") 39 | 40 | # Create a time series raster stack 41 | NDVI_HARV_stack <- rast(all_NDVI_HARV) 42 | # NOTE: Fix the bands' names so they don't start with a number! 
43 | names(NDVI_HARV_stack) <- paste0("X", names(NDVI_HARV_stack)) 44 | 45 | # apply scale factor 46 | NDVI_HARV_stack <- NDVI_HARV_stack/10000 47 | 48 | # convert to a df for plotting 49 | NDVI_HARV_stack_df <- as.data.frame(NDVI_HARV_stack, xy = TRUE) %>% 50 | # Then reshape data to stack all the X*_HARV_ndvi_crop columns into 51 | # one single column called 'variable' 52 | melt(id.vars = c('x','y')) 53 | ``` 54 | 55 | :::::::::::::::::::::::::::::::::::::::::: prereq 56 | 57 | ## Things You'll Need To Complete This Episode 58 | 59 | See the [lesson homepage](.) for detailed information about the software, data, 60 | and other prerequisites you will need to work through the examples in this 61 | episode. 62 | 63 | 64 | :::::::::::::::::::::::::::::::::::::::::::::::::: 65 | 66 | This episode covers how to customize your raster plots using the `ggplot2` 67 | package in R to create publication-quality plots. 68 | 69 | ## Before and After 70 | 71 | In [the previous episode](12-time-series-raster/), we learned how to plot 72 | multi-band raster data in R using the `facet_wrap()` function. This created a 73 | separate panel in our plot for each raster band. The plot we created together 74 | is shown below: 75 | 76 | ```{r levelplot-time-series-before, echo=FALSE} 77 | # code not shown, demonstration only 78 | ggplot() + 79 | geom_raster(data = NDVI_HARV_stack_df , aes(x = x, y = y, fill = value)) + 80 | facet_wrap(~variable) + 81 | ggtitle("Landsat NDVI", subtitle = "NEON Harvard Forest") 82 | ``` 83 | 84 | Although this plot is informative, it isn't something we would expect to see in 85 | a journal publication. The x and y-axis labels aren't informative. There is a 86 | lot of unnecessary gray background and the titles of each panel don't clearly 87 | state that the number refers to the Julian day the data was collected. In this 88 | episode, we will customize this plot above to produce a publication quality 89 | graphic. We will go through these steps iteratively. 
When we're done, we will
90 | have created the plot shown below.
91 | 
92 | ```{r levelplot-time-series-after, echo=FALSE}
93 | # code not shown, demonstration only
94 | 
95 | raster_names <- names(NDVI_HARV_stack)
96 | raster_names <- gsub("_HARV_ndvi_crop", "", raster_names)
97 | raster_names <- gsub("X", "Day ", raster_names)
98 | labels_names <- setNames(raster_names, unique(NDVI_HARV_stack_df$variable))
99 | green_colors <- brewer.pal(9, "YlGn") %>%
100 |   colorRampPalette()
101 | 
102 | ggplot() +
103 |   geom_raster(data = NDVI_HARV_stack_df , aes(x = x, y = y, fill = value)) +
104 |   facet_wrap(~variable, nrow = 3, ncol = 5,
105 |              labeller = labeller(variable = labels_names)) +
106 |   ggtitle("Landsat NDVI - Julian Days", subtitle = "Harvard Forest 2011") +
107 |   theme_void() +
108 |   theme(plot.title = element_text(hjust = 0.5, face = "bold"),
109 |         plot.subtitle = element_text(hjust = 0.5)) +
110 |   scale_fill_gradientn(name = "NDVI", colours = green_colors(20))
111 | 
112 | # cleanup
113 | rm(raster_names, labels_names, green_colors)
114 | ```
115 | 
116 | ## Adjust the Plot Theme
117 | 
118 | The first thing we will do to our plot is remove the x and y-axis labels and
119 | axis ticks, as these are unnecessary and make our plot look messy. We can do
120 | this by setting the plot theme to `void`.
121 | 
122 | ```{r adjust-theme}
123 | ggplot() +
124 |   geom_raster(data = NDVI_HARV_stack_df , aes(x = x, y = y, fill = value)) +
125 |   facet_wrap(~variable) +
126 |   ggtitle("Landsat NDVI", subtitle = "NEON Harvard Forest") +
127 |   theme_void()
128 | ```
129 | 
130 | Next we will center our plot title and subtitle. We need to do this **after**
131 | the `theme_void()` layer, because R interprets the `ggplot` layers in order. If
132 | we first tell R to center our plot title, and then set the theme to `void`, any
133 | adjustments we've made to the plot theme will be overwritten by the
134 | `theme_void()` function.
So first we make the theme `void` and then we center 135 | the title. We center both the title and subtitle by using the `theme()` 136 | function and setting the `hjust` parameter to 0.5. The `hjust` parameter stands 137 | for "horizontal justification" and takes any value between 0 and 1. A setting 138 | of 0 indicates left justification and a setting of 1 indicates right 139 | justification. 140 | 141 | ```{r adjust-theme-2} 142 | ggplot() + 143 | geom_raster(data = NDVI_HARV_stack_df , aes(x = x, y = y, fill = value)) + 144 | facet_wrap(~variable) + 145 | ggtitle("Landsat NDVI", subtitle = "NEON Harvard Forest") + 146 | theme_void() + 147 | theme(plot.title = element_text(hjust = 0.5), 148 | plot.subtitle = element_text(hjust = 0.5)) 149 | ``` 150 | 151 | ::::::::::::::::::::::::::::::::::::::: challenge 152 | 153 | ## Challenge 154 | 155 | Change the plot title (but not the subtitle) to bold font. You can (and 156 | should!) use the help menu in RStudio or any internet resources to figure out 157 | how to change this setting. 158 | 159 | ::::::::::::::: solution 160 | 161 | ## Answers 162 | 163 | Learners can find this information in the help files for the `theme()` 164 | function. The parameter to set is called `face`. 165 | 166 | ```{r use-bold-face} 167 | ggplot() + 168 | geom_raster(data = NDVI_HARV_stack_df, 169 | aes(x = x, y = y, fill = value)) + 170 | facet_wrap(~ variable) + 171 | ggtitle("Landsat NDVI", subtitle = "NEON Harvard Forest") + 172 | theme_void() + 173 | theme(plot.title = element_text(hjust = 0.5, face = "bold"), 174 | plot.subtitle = element_text(hjust = 0.5)) 175 | ``` 176 | 177 | ::::::::::::::::::::::::: 178 | 179 | :::::::::::::::::::::::::::::::::::::::::::::::::: 180 | 181 | ## Adjust the Color Ramp 182 | 183 | Next, let's adjust the color ramp used to render the rasters. 
First, we can
184 | change the blue color ramp to a green one that is more visually suited to our
185 | NDVI (greenness) data, using the `colorRampPalette()` function in combination
186 | with a ColorBrewer palette, which requires loading the `RColorBrewer` package.
187 | Then we use `scale_fill_gradientn()` to pass the list of colours (here 20
188 | different colours) to ggplot.
189 | 
190 | First we need to create a set of colors to use. We will select a set of nine
191 | colors from the "YlGn" (yellow-green) color palette. This returns a set of hex
192 | color codes:
193 | 
194 | ```{r}
195 | library(RColorBrewer)
196 | brewer.pal(9, "YlGn")
197 | ```
198 | 
199 | Then we will pass those color codes to the `colorRampPalette()` function, which
200 | will interpolate from those colors a more nuanced color range.
201 | 
202 | ```{r}
203 | green_colors <- brewer.pal(9, "YlGn") %>%
204 |   colorRampPalette()
205 | ```
206 | 
207 | We can tell the `colorRampPalette()` function how many discrete colors within
208 | this color range to create. In our case, we will use 20 colors when we plot our
209 | graphic.
210 | 
211 | ```{r change-color-ramp}
212 | ggplot() +
213 |   geom_raster(data = NDVI_HARV_stack_df , aes(x = x, y = y, fill = value)) +
214 |   facet_wrap(~variable) +
215 |   ggtitle("Landsat NDVI", subtitle = "NEON Harvard Forest") +
216 |   theme_void() +
217 |   theme(plot.title = element_text(hjust = 0.5, face = "bold"),
218 |         plot.subtitle = element_text(hjust = 0.5)) +
219 |   scale_fill_gradientn(name = "NDVI", colours = green_colors(20))
220 | ```
221 | 
222 | The yellow to green color ramp visually represents NDVI well given it's a
223 | measure of greenness. Someone looking at the plot can quickly understand that
224 | pixels that are more green have a higher NDVI value.
225 | 
226 | ::::::::::::::::::::::::::::::::::::::::: callout
227 | 
228 | ## Data Tip
229 | 
230 | For all of the `brewer.pal` ramp names see the
231 | [brewerpal page](https://www.datavis.ca/sasmac/brewerpal.html).
232 | 
233 | 
234 | ::::::::::::::::::::::::::::::::::::::::::::::::::
235 | 
236 | ::::::::::::::::::::::::::::::::::::::::: callout
237 | 
238 | ## Data Tip
239 | 
240 | Cynthia Brewer, the creator of ColorBrewer, offers an online tool to help
241 | choose suitable color ramps, or to create your own.
242 | [ColorBrewer 2.0; Color Advice for Cartography](https://colorbrewer2.org/)
243 | 
244 | 
245 | ::::::::::::::::::::::::::::::::::::::::::::::::::
246 | 
247 | ## Refine Plot \& Tile Labels
248 | 
249 | Next, let's label each panel in our plot with the Julian day that the raster
250 | data for that panel was collected. The current names come from the band "layer
251 | names" stored in the `RasterStack` and the first part of each name is the
252 | Julian day.
253 | 
254 | To create a more meaningful label we can remove the "X" and replace it with
255 | "Day" using the `gsub()` function in R. The syntax is as follows:
256 | `gsub("PatternToFind", "ReplacementText", object)`.
257 | 
258 | First let's remove "\_HARV\_NDVI\_crop" from each label to make the labels
259 | shorter and remove repetition. To illustrate how this works, we will first
260 | look at the names for our `NDVI_HARV_stack` object:
261 | 
262 | ```{r}
263 | names(NDVI_HARV_stack)
264 | ```
265 | 
266 | Now we will use the `gsub()` function to find the character string
267 | "\_HARV\_ndvi\_crop" and replace it with a blank string (""). We will assign
268 | this output to a new object (`raster_names`) and look at that object to make
269 | sure our code is doing what we want it to.
270 | 
271 | ```{r}
272 | raster_names <- names(NDVI_HARV_stack)
273 | 
274 | raster_names <- gsub("_HARV_ndvi_crop", "", raster_names)
275 | raster_names
276 | ```
277 | 
278 | So far so good. Now we will use `gsub()` again to replace the "X" with the word
279 | "Day" followed by a space.
280 | 
281 | ```{r}
282 | raster_names <- gsub("X", "Day ", raster_names)
283 | raster_names
284 | ```
285 | 
286 | Our labels look good now.
Let's use them to build a named vector that matches
287 | each new label to the corresponding level of `variable` in our data frame:
288 | 
289 | ```{r}
290 | labels_names <- setNames(raster_names, unique(NDVI_HARV_stack_df$variable))
291 | ```
292 | 
293 | Once the names for each band have been reassigned, we can render our plot with
294 | the new labels using a `labeller`.
295 | 
296 | ```{r create-levelplot}
297 | ggplot() +
298 |   geom_raster(data = NDVI_HARV_stack_df , aes(x = x, y = y, fill = value)) +
299 |   facet_wrap(~variable, labeller = labeller(variable = labels_names)) +
300 |   ggtitle("Landsat NDVI", subtitle = "NEON Harvard Forest") +
301 |   theme_void() +
302 |   theme(plot.title = element_text(hjust = 0.5, face = "bold"),
303 |         plot.subtitle = element_text(hjust = 0.5)) +
304 |   scale_fill_gradientn(name = "NDVI", colours = green_colors(20))
305 | ```
306 | 
307 | ## Change Layout of Panels
308 | 
309 | We can adjust the columns of our plot by setting the number of columns `ncol`
310 | and the number of rows `nrow` in `facet_wrap`. Let's make our plot so that it
311 | has a width of five panels.
312 | 
313 | ```{r adjust-layout}
314 | ggplot() +
315 |   geom_raster(data = NDVI_HARV_stack_df , aes(x = x, y = y, fill = value)) +
316 |   facet_wrap(~variable, ncol = 5,
317 |              labeller = labeller(variable = labels_names)) +
318 |   ggtitle("Landsat NDVI", subtitle = "NEON Harvard Forest") +
319 |   theme_void() +
320 |   theme(plot.title = element_text(hjust = 0.5, face = "bold"),
321 |         plot.subtitle = element_text(hjust = 0.5)) +
322 |   scale_fill_gradientn(name = "NDVI", colours = green_colors(20))
323 | ```
324 | 
325 | Now we have a beautiful, publication-quality plot!
326 | 
327 | ::::::::::::::::::::::::::::::::::::::: challenge
328 | 
329 | ## Challenge: Divergent Color Ramps
330 | 
331 | When we used the `gsub()` function to modify the tile labels we replaced the
332 | beginning of each tile title with "Day". A more descriptive name could be
333 | "Julian Day". Update the plot above with the following changes:
334 | 
335 | 1.
Label each tile "Julian Day", with the Julian day value following.
335 | 2. Change the color ramp to a divergent brown to green color ramp.
336 | 
337 | **Questions:**
338 | Does having a divergent color ramp represent the data better than a sequential
339 | color ramp (like "YlGn")? Can you think of other data sets where a divergent
340 | color ramp may be best?
341 | 
342 | ::::::::::::::: solution
343 | 
344 | ## Answers
345 | 
346 | ```{r final-figure}
347 | raster_names <- gsub("Day", "Julian Day ", raster_names)
348 | labels_names <- setNames(raster_names, unique(NDVI_HARV_stack_df$variable))
349 | 
350 | brown_green_colors <- colorRampPalette(brewer.pal(9, "BrBG"))
351 | 
352 | ggplot() +
353 |   geom_raster(data = NDVI_HARV_stack_df , aes(x = x, y = y, fill = value)) +
354 |   facet_wrap(~variable, ncol = 5, labeller = labeller(variable = labels_names)) +
355 |   ggtitle("Landsat NDVI - Julian Days", subtitle = "Harvard Forest 2011") +
356 |   theme_void() +
357 |   theme(plot.title = element_text(hjust = 0.5, face = "bold"),
358 |         plot.subtitle = element_text(hjust = 0.5)) +
359 |   scale_fill_gradientn(name = "NDVI", colours = brown_green_colors(20))
360 | ```
361 | 
362 | For NDVI data, the sequential color ramp is better than the divergent as it is
363 | more akin to the process of greening up, which starts off at one end and just
364 | keeps increasing.
365 | 
366 | 
367 | 
368 | :::::::::::::::::::::::::
369 | 
370 | ::::::::::::::::::::::::::::::::::::::::::::::::::
371 | 
372 | 
373 | 
374 | :::::::::::::::::::::::::::::::::::::::: keypoints
375 | 
376 | - Use the `theme_void()` function for a clean background to your plot.
377 | - Use the `element_text()` function to adjust text size, font, and position.
378 | - Use the `brewer.pal()` function to create a custom color palette.
379 | - Use the `gsub()` function to do pattern matching and replacement in text.
380 | 381 | :::::::::::::::::::::::::::::::::::::::::::::::::: 382 | 383 | 384 | -------------------------------------------------------------------------------- /episodes/14-extract-ndvi-from-rasters-in-r.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: Derive Values from Raster Time Series 3 | teaching: 40 4 | exercises: 20 5 | source: Rmd 6 | --- 7 | 8 | ```{r setup, echo=FALSE} 9 | source("setup.R") 10 | ``` 11 | 12 | ::::::::::::::::::::::::::::::::::::::: objectives 13 | 14 | - Extract summary pixel values from a raster. 15 | - Save summary values to a .csv file. 16 | - Plot summary pixel values using `ggplot()`. 17 | - Compare NDVI values between two different sites. 18 | 19 | :::::::::::::::::::::::::::::::::::::::::::::::::: 20 | 21 | :::::::::::::::::::::::::::::::::::::::: questions 22 | 23 | - How can I calculate, extract, and export summarized raster pixel data? 24 | 25 | :::::::::::::::::::::::::::::::::::::::::::::::::: 26 | 27 | ```{r load-libraries, echo=FALSE, results="hide", message=FALSE, warning=FALSE} 28 | library(terra) 29 | library(ggplot2) 30 | library(dplyr) 31 | ``` 32 | 33 | ```{r load-data, echo=FALSE, results="hide"} 34 | # learners will have this data loaded from the previous episode 35 | 36 | all_NDVI_HARV <- list.files("data/NEON-DS-Landsat-NDVI/HARV/2011/NDVI", 37 | full.names = TRUE, pattern = ".tif$") 38 | 39 | # Create a time series raster stack 40 | NDVI_HARV_stack <- rast(all_NDVI_HARV) 41 | # NOTE: Fix the bands' names so they don't start with a number! 42 | names(NDVI_HARV_stack) <- paste0("X", names(NDVI_HARV_stack)) 43 | 44 | # apply scale factor 45 | NDVI_HARV_stack <- NDVI_HARV_stack/10000 46 | ``` 47 | 48 | :::::::::::::::::::::::::::::::::::::::::: prereq 49 | 50 | ## Things You'll Need To Complete This Episode 51 | 52 | See the [lesson homepage](.) 
for detailed information about the software, data, 53 | and other prerequisites you will need to work through the examples in this 54 | episode. 55 | 56 | 57 | :::::::::::::::::::::::::::::::::::::::::::::::::: 58 | 59 | In this episode, we will extract NDVI values from a raster time series dataset 60 | and plot them using the `ggplot2` package. 61 | 62 | ## Extract Summary Statistics From Raster Data 63 | 64 | We often want to extract summary values from raster data. For example, we might 65 | want to understand overall greenness across a field site or at each plot within 66 | a field site. These values can then be compared between different field sites 67 | and combined with other related metrics to support modeling and further 68 | analysis. 69 | 70 | ## Calculate Average NDVI 71 | 72 | Our goal in this episode is to create a dataframe that contains a single mean 73 | NDVI value for each raster in our time series. This value represents the mean 74 | NDVI value for this area on a given day. 75 | 76 | We can calculate the mean for each raster using the `global()` function. The 77 | `global()` function returns one mean value per layer, with each value 78 | associated with the name of the layer in the raster stack it was derived from. 79 | 80 | ```{r} 81 | avg_NDVI_HARV <- global(NDVI_HARV_stack, mean) 82 | avg_NDVI_HARV 83 | ``` 84 | 85 | The output is already a data frame (if it were not, we could convert it with 86 | `as.data.frame()`). It's a good idea to view the first few rows of our data 87 | frame with `head()` to make sure the structure is what we expect. 88 | 89 | ```{r} 90 | head(avg_NDVI_HARV) 91 | ``` 92 | 93 | We now have a data frame with row names that are based on the original file 94 | name and a mean NDVI value for each file. Next, let's clean up the column names 95 | in our data frame to make it easier for colleagues to work with our code. 96 | 97 | Let's change the NDVI column name to `meanNDVI`.
98 | 99 | ```{r view-dataframe-output} 100 | names(avg_NDVI_HARV) <- "meanNDVI" 101 | head(avg_NDVI_HARV) 102 | ``` 103 | 104 | The new column name doesn't remind us what site our data are from. While we 105 | are only working with one site now, we might want to compare several sites' 106 | worth of data in the future. Let's add a column to our dataframe called "site". 107 | 108 | ```{r insert-site-name} 109 | avg_NDVI_HARV$site <- "HARV" 110 | ``` 111 | 112 | This populates the new column with the site name - HARV. Let's also create a year 113 | column and populate it with 2011 - the year our data were collected. 114 | 115 | ```{r} 116 | avg_NDVI_HARV$year <- "2011" 117 | head(avg_NDVI_HARV) 118 | ``` 119 | 120 | We now have a dataframe that contains a row for each raster file processed, and 121 | columns for `meanNDVI`, `site`, and `year`. 122 | 123 | ## Extract Julian Day from row names 124 | 125 | We'd like to produce a plot where Julian days (the numeric day of the year, 126 | 1 - 365/366) are on the x-axis and NDVI is on the y-axis. To create this plot, 127 | we'll need a column that contains the Julian day value. 128 | 129 | One way to create a Julian day column is to use `gsub()` on the file name in 130 | each row. We can remove both the `X` and the `_HARV_ndvi_crop` to extract the 131 | Julian day value, just like we did in the 132 | [previous episode](13-plot-time-series-rasters-in-r/). 133 | 134 | This time we will use one additional trick to do both of these steps at the 135 | same time. In a regular expression, the vertical bar character ( `|` ) is 136 | equivalent to the word "or". Using this character in our search pattern allows 137 | us to search for more than one pattern in our text strings.
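To see how the `|` pattern behaves before applying it to our real row names, we can try it on a couple of made-up file names (the strings below are hypothetical, used only for illustration):

```{r}
# "X" OR "_HARV_ndvi_crop" is removed wherever either pattern matches
toy_names <- c("X005_HARV_ndvi_crop", "X037_HARV_ndvi_crop")
gsub("X|_HARV_ndvi_crop", "", toy_names)
# returns "005" "037"
```

Both replacements happen in a single call, which is exactly what we will do with the real row names below.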
138 | 139 | ```{r extract-julian-day} 140 | julianDays <- gsub("X|_HARV_ndvi_crop", "", row.names(avg_NDVI_HARV)) 141 | julianDays 142 | ``` 143 | 144 | Now that we've extracted the Julian days from our row names, we can add that 145 | data to the data frame as a column called "julianDay". 146 | 147 | ```{r} 148 | avg_NDVI_HARV$julianDay <- julianDays 149 | ``` 150 | 151 | Let's check the class of this new column: 152 | 153 | ```{r} 154 | class(avg_NDVI_HARV$julianDay) 155 | ``` 156 | 157 | ## Convert Julian Day to Date Class 158 | 159 | Currently, the values in the Julian day column are stored as class `character`. 160 | Storing these values as date objects is better - it makes plotting, subsetting, 161 | and working with our data easier. Let's convert them. We worked with data 162 | conversions [in an earlier episode](12-time-series-raster/). For an introduction to 163 | date-time classes, see the NEON Data Skills tutorial 164 | [Convert Date \& Time Data from Character Class to Date-Time Class (POSIX) in R](https://www.neonscience.org/dc-convert-date-time-POSIX-r). 165 | 166 | To convert a Julian day number to a date class, we need to set the origin, 167 | which is the day that our Julian days start counting from. Our data are from 168 | 2011 and we know that the USGS Landsat Team created Julian day values for this 169 | year. Therefore, the first day or "origin" for our Julian day count is 01 170 | January 2011. 171 | 172 | ```{r} 173 | origin <- as.Date("2011-01-01") 174 | ``` 175 | 176 | Next we convert the `julianDay` column from character to integer. 177 | 178 | ```{r} 179 | avg_NDVI_HARV$julianDay <- as.integer(avg_NDVI_HARV$julianDay) 180 | ``` 181 | 182 | Once we set the Julian day origin, we can add the Julian day value (as an 183 | integer) to the origin date. 184 | 185 | Note that when we convert our integer `julianDay` values to dates, we 186 | subtract 1. This is because the origin day, 01 January 2011, is itself 187 | Julian day 01. For a `julianDay` value of 05 (indicating the 5th of January), 188 | we cannot simply compute `origin + julianDay`, because `01 + 05 = 06`, or 189 | 06 January 190 | 2011. To correct this error, we subtract 1 to get the correct date, January 191 | 05 2011. 192 | 193 | ```{r} 194 | avg_NDVI_HARV$Date <- origin + (avg_NDVI_HARV$julianDay - 1) 195 | head(avg_NDVI_HARV$Date) 196 | ``` 197 | 198 | Since the origin date was originally set as a Date class object, the new `Date` 199 | column is also stored as class `Date`. 200 | 201 | ```{r} 202 | class(avg_NDVI_HARV$Date) 203 | ``` 204 | 205 | ::::::::::::::::::::::::::::::::::::::: challenge 206 | 207 | ## Challenge: NDVI for the San Joaquin Experimental Range 208 | 209 | We often want to compare two different sites. The National Ecological 210 | Observatory Network (NEON) also has a field site in Southern California at the 211 | [San Joaquin Experimental Range (SJER)](https://www.neonscience.org/field-sites/field-sites-map/SJER). 212 | 213 | For this challenge, create a dataframe containing the mean NDVI values and the 214 | Julian days the data were collected (in date format) for the NEON San Joaquin 215 | Experimental Range field site. NDVI data for SJER are located in the 216 | `NEON-DS-Landsat-NDVI/SJER/2011/NDVI` directory. 217 | 218 | ::::::::::::::: solution 219 | 220 | ## Answers 221 | 222 | First we will read in the NDVI data for the SJER field site. 223 | 224 | ```{r} 225 | NDVI_path_SJER <- "data/NEON-DS-Landsat-NDVI/SJER/2011/NDVI" 226 | 227 | all_NDVI_SJER <- list.files(NDVI_path_SJER, 228 | full.names = TRUE, 229 | pattern = ".tif$") 230 | 231 | NDVI_stack_SJER <- rast(all_NDVI_SJER) 232 | names(NDVI_stack_SJER) <- paste0("X", names(NDVI_stack_SJER)) 233 | 234 | NDVI_stack_SJER <- NDVI_stack_SJER/10000 235 | ``` 236 | 237 | Then we can calculate the mean values for each day and put that in a dataframe.
238 | 239 | ```{r} 240 | avg_NDVI_SJER <- as.data.frame(global(NDVI_stack_SJER, mean)) 241 | ``` 242 | 243 | Next we rename the NDVI column, and add site and year columns to our data. 244 | 245 | ```{r} 246 | names(avg_NDVI_SJER) <- "meanNDVI" 247 | avg_NDVI_SJER$site <- "SJER" 248 | avg_NDVI_SJER$year <- "2011" 249 | ``` 250 | 251 | Now we will create our Julian day column 252 | 253 | ```{r} 254 | julianDays_SJER <- gsub("X|_SJER_ndvi_crop", "", row.names(avg_NDVI_SJER)) 255 | origin <- as.Date("2011-01-01") 256 | avg_NDVI_SJER$julianDay <- as.integer(julianDays_SJER) 257 | 258 | avg_NDVI_SJER$Date <- origin + (avg_NDVI_SJER$julianDay - 1) 259 | 260 | head(avg_NDVI_SJER) 261 | ``` 262 | 263 | ::::::::::::::::::::::::: 264 | 265 | :::::::::::::::::::::::::::::::::::::::::::::::::: 266 | 267 | ## Plot NDVI Using ggplot 268 | 269 | We now have a clean dataframe with properly scaled NDVI and Julian days. Let's 270 | plot our data. 271 | 272 | ```{r ggplot-data} 273 | ggplot(avg_NDVI_HARV, aes(julianDay, meanNDVI)) + 274 | geom_point() + 275 | ggtitle("Landsat Derived NDVI - 2011", 276 | subtitle = "NEON Harvard Forest Field Site") + 277 | xlab("Julian Days") + ylab("Mean NDVI") 278 | ``` 279 | 280 | ::::::::::::::::::::::::::::::::::::::: challenge 281 | 282 | ## Challenge: Plot San Joaquin Experimental Range Data 283 | 284 | Create a complementary plot for the SJER data. Plot the data points in a 285 | different color. 
286 | 287 | ::::::::::::::: solution 288 | 289 | ## Answers 290 | 291 | ```{r avg-ndvi-sjer} 292 | ggplot(avg_NDVI_SJER, aes(julianDay, meanNDVI)) + 293 | geom_point(colour = "springgreen4") + 294 | ggtitle("Landsat Derived NDVI - 2011", subtitle = "NEON SJER Field Site") + 295 | xlab("Julian Day") + ylab("Mean NDVI") 296 | ``` 297 | 298 | ::::::::::::::::::::::::: 299 | 300 | :::::::::::::::::::::::::::::::::::::::::::::::::: 301 | 302 | ## Compare NDVI from Two Different Sites in One Plot 303 | 304 | Comparing plots is often easiest when they are placed side by side - or, even 305 | better, when both sets of data are drawn in the same plot. We can do this by 306 | merging the two data sets together. The data frames must have the same number 307 | of columns and exactly the same column names to be merged. 308 | 309 | ```{r merge-df-single-plot} 310 | NDVI_HARV_SJER <- rbind(avg_NDVI_HARV, avg_NDVI_SJER) 311 | ``` 312 | 313 | Now we can plot both datasets on the same plot. 314 | 315 | ```{r ndvi-harv-sjer-comp} 316 | ggplot(NDVI_HARV_SJER, aes(x = julianDay, y = meanNDVI, colour = site)) + 317 | geom_point(aes(group = site)) + 318 | geom_line(aes(group = site)) + 319 | ggtitle("Landsat Derived NDVI - 2011", 320 | subtitle = "Harvard Forest vs San Joaquin") + 321 | xlab("Julian Day") + ylab("Mean NDVI") 322 | ``` 323 | 324 | ::::::::::::::::::::::::::::::::::::::: challenge 325 | 326 | ## Challenge: Plot NDVI with date 327 | 328 | Plot the SJER and HARV data in one plot but use date, rather than Julian day, 329 | on the x-axis.
330 | 331 | ::::::::::::::: solution 332 | 333 | ## Answers 334 | 335 | ```{r ndvi-harv-sjer-date} 336 | ggplot(NDVI_HARV_SJER, aes(x = Date, y = meanNDVI, colour = site)) + 337 | geom_point(aes(group = site)) + 338 | geom_line(aes(group = site)) + 339 | ggtitle("Landsat Derived NDVI - 2011", 340 | subtitle = "Harvard Forest vs San Joaquin") + 341 | xlab("Date") + ylab("Mean NDVI") 342 | ``` 343 | 344 | ::::::::::::::::::::::::: 345 | 346 | :::::::::::::::::::::::::::::::::::::::::::::::::: 347 | 348 | ## Remove Outlier Data 349 | 350 | As we look at these plots, we see variation in greenness across the year. 351 | However, the pattern is interrupted by a few points where NDVI quickly drops 352 | towards 0 during a time period when we might expect the vegetation to have a 353 | higher greenness value. Is the vegetation truly senescent or gone, or are these 354 | outlier values that should be removed from the data? 355 | 356 | We've seen in [an earlier episode](12-time-series-raster/) that data points 357 | with very low NDVI values can be associated with images that are filled with 358 | clouds. Thus, we can attribute the low NDVI values to high levels of cloud 359 | cover. Is the same thing happening at SJER? 360 | 361 | ```{r view-all-rgb-SJER, echo=FALSE} 362 | # code not shown, demonstration only 363 | # open up the cropped files 364 | rgb.allCropped.SJER <- list.files("data/NEON-DS-Landsat-NDVI/SJER/2011/RGB/", 365 | full.names = TRUE, 366 | pattern = ".tif$") 367 | # create a layout 368 | par(mfrow = c(5, 4)) 369 | 370 | # plot each cropped RGB image 371 | # note that there is an issue with one of the rasters 372 | # NEON-DS-Landsat-NDVI/SJER/2011/RGB/254_SJER_landRGB.tif has a blue band with no range 373 | # thus you can't apply a stretch to it. The code below skips the stretch for 374 | # that one image.
You could automate this by testing the range of each band in each image 375 | 376 | for (aFile in rgb.allCropped.SJER) 377 | {NDVI.rastStack <- rast(aFile) 378 | if (aFile == "data/NEON-DS-Landsat-NDVI/SJER/2011/RGB//254_SJER_landRGB.tif") 379 | {plotRGB(NDVI.rastStack) } 380 | else { plotRGB(NDVI.rastStack, stretch = "lin") } 381 | } 382 | 383 | # reset layout 384 | par(mfrow = c(1, 1)) 385 | ``` 386 | 387 | Without significant additional processing, we will not be able to retrieve a 388 | strong vegetation signal from a remotely sensed image that is 389 | predominantly cloud covered. Thus, these points are likely bad data points. 390 | Let's remove them. 391 | 392 | First, we will identify the good data points that should be retained. One way 393 | to do this is by identifying a threshold value. All values below that threshold 394 | will be removed from our analysis. We will use 0.1 as an example for this 395 | episode. We can then use the `subset()` function to remove outlier data points 396 | (below our identified threshold). 397 | 398 | ::::::::::::::::::::::::::::::::::::::::: callout 399 | 400 | ## Data Tip 401 | 402 | Thresholding, or removing outlier data, can be tricky business. In this case, 403 | we can be confident that some of our NDVI values are not valid due to cloud 404 | cover. However, a threshold value may not always be sufficient given that 0.1 405 | could be a valid NDVI value in some areas. This is where decision-making should 406 | be fueled by practical scientific knowledge of the data and the desired 407 | outcomes! 408 | 409 | 410 | :::::::::::::::::::::::::::::::::::::::::::::::::: 411 | 412 | ```{r remove-bad-values} 413 | avg_NDVI_HARV_clean <- subset(avg_NDVI_HARV, meanNDVI > 0.1) 414 | avg_NDVI_HARV_clean$meanNDVI < 0.1 # confirm that no values below 0.1 remain 415 | ``` 416 | 417 | Now we can create another plot without the suspect data.
418 | 419 | ```{r plot-clean-HARV} 420 | ggplot(avg_NDVI_HARV_clean, aes(x = julianDay, y = meanNDVI)) + 421 | geom_point() + 422 | ggtitle("Landsat Derived NDVI - 2011", 423 | subtitle = "NEON Harvard Forest Field Site") + 424 | xlab("Julian Days") + ylab("Mean NDVI") 425 | ``` 426 | 427 | Now our outlier data points are removed and the pattern of "green-up" and 428 | "brown-down" makes more sense. 429 | 430 | ## Write NDVI data to a .csv File 431 | 432 | We can write our final NDVI dataframe out to a text format to quickly share 433 | with a colleague or to reuse for analysis or visualization purposes. We will 434 | export in Comma-Separated Values (.csv) file format because it is usable in many 435 | different tools and across platforms (Mac, PC, etc.). 436 | 437 | We will use `write.csv()` to write a specified dataframe to a `.csv` file. 438 | Unless you designate a different directory, the output file will be saved in 439 | your working directory. 440 | 441 | Before saving our file, let's view the format to make sure it is what we want 442 | as an output format. 443 | 444 | ```{r write-csv} 445 | head(avg_NDVI_HARV_clean) 446 | ``` 447 | 448 | It looks like we have a series of `row.names` that we do not need because we 449 | have this information stored in individual columns in our data frame. Let's 450 | remove the row names. 451 | 452 | ```{r drop-rownames-write-csv} 453 | row.names(avg_NDVI_HARV_clean) <- NULL 454 | head(avg_NDVI_HARV_clean) 455 | ``` 456 | 457 | ```{r, eval=FALSE} 458 | write.csv(avg_NDVI_HARV_clean, file = "meanNDVI_HARV_2011.csv") 459 | ``` 460 | 461 | ::::::::::::::::::::::::::::::::::::::: challenge 462 | 463 | ## Challenge: Write to .csv 464 | 465 | 1. Create an NDVI .csv file for the NEON SJER field site that is comparable with 466 | the one we just created for the Harvard Forest. Be sure to inspect for 467 | questionable values before writing any data to a .csv file. 468 | 2.
Create an NDVI .csv file that includes data from both field sites. 469 | 470 | ::::::::::::::: solution 471 | 472 | ## Answers 473 | 474 | ```{r} 475 | avg_NDVI_SJER_clean <- subset(avg_NDVI_SJER, meanNDVI > 0.1) 476 | row.names(avg_NDVI_SJER_clean) <- NULL 477 | head(avg_NDVI_SJER_clean) 478 | write.csv(avg_NDVI_SJER_clean, file = "meanNDVI_SJER_2011.csv") 479 | ``` 480 | For the second part of the challenge, we can combine the two cleaned data frames with `rbind()` and write the result out in the same way, for example with `write.csv(rbind(avg_NDVI_HARV_clean, avg_NDVI_SJER_clean), file = "meanNDVI_HARV_SJER_2011.csv")` (this file name is just a suggestion). 481 | ::::::::::::::::::::::::: 482 | 483 | :::::::::::::::::::::::::::::::::::::::::::::::::: 484 | 485 | 486 | 487 | :::::::::::::::::::::::::::::::::::::::: keypoints 488 | 489 | - Use the `global()` function to calculate summary statistics for cells in a 490 | raster object. 491 | - In a regular expression, the vertical bar (`|`) means `or`. 492 | - Use the `rbind()` function to combine data frames that have the same column 493 | names. 494 | 495 | :::::::::::::::::::::::::::::::::::::::::::::::::: 496 | 497 | 498 | -------------------------------------------------------------------------------- /episodes/data/.gitignore: -------------------------------------------------------------------------------- 1 | * 2 | */ 3 | !.gitignore 4 | -------------------------------------------------------------------------------- /episodes/fig/BufferCircular.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/BufferCircular.png -------------------------------------------------------------------------------- /episodes/fig/BufferSquare.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/BufferSquare.png -------------------------------------------------------------------------------- /episodes/fig/dc-spatial-raster/GreennessOverTime.jpg: --------------------------------------------------------------------------------
https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/dc-spatial-raster/GreennessOverTime.jpg -------------------------------------------------------------------------------- /episodes/fig/dc-spatial-raster/RGBSTack_1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/dc-spatial-raster/RGBSTack_1.jpg -------------------------------------------------------------------------------- /episodes/fig/dc-spatial-raster/UTM_zones_18-19.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/dc-spatial-raster/UTM_zones_18-19.jpg -------------------------------------------------------------------------------- /episodes/fig/dc-spatial-raster/imageStretch_dark.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/dc-spatial-raster/imageStretch_dark.jpg -------------------------------------------------------------------------------- /episodes/fig/dc-spatial-raster/imageStretch_light.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/dc-spatial-raster/imageStretch_light.jpg -------------------------------------------------------------------------------- /episodes/fig/dc-spatial-raster/lidarTree-height.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/dc-spatial-raster/lidarTree-height.png -------------------------------------------------------------------------------- /episodes/fig/dc-spatial-raster/raster_concept.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/dc-spatial-raster/raster_concept.png -------------------------------------------------------------------------------- /episodes/fig/dc-spatial-raster/raster_resolution.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/dc-spatial-raster/raster_resolution.png -------------------------------------------------------------------------------- /episodes/fig/dc-spatial-raster/single_multi_raster.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/dc-spatial-raster/single_multi_raster.png -------------------------------------------------------------------------------- /episodes/fig/dc-spatial-raster/spatial_extent.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/dc-spatial-raster/spatial_extent.png -------------------------------------------------------------------------------- /episodes/fig/dc-spatial-vector/pnt_line_poly.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/dc-spatial-vector/pnt_line_poly.png -------------------------------------------------------------------------------- /episodes/fig/dc-spatial-vector/spatial_extent.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/dc-spatial-vector/spatial_extent.png -------------------------------------------------------------------------------- /episodes/fig/map_usa_different_projections.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/map_usa_different_projections.jpg -------------------------------------------------------------------------------- /episodes/setup.R: -------------------------------------------------------------------------------- 1 | options(timeout = max(300, getOption('timeout'))) 2 | ## file structure 3 | 4 | if (! file.exists("data/NEON-DS-Site-Layout-Files")) { 5 | dest <- tempfile() 6 | download.file("https://ndownloader.figshare.com/files/3708751", dest, 7 | mode = "wb") 8 | unzip(dest, exdir = "data") 9 | } 10 | 11 | if (! file.exists("data/NEON-DS-Airborne-Remote-Sensing")) { 12 | dest <- tempfile() 13 | download.file("https://ndownloader.figshare.com/files/3701578", dest, 14 | mode = "wb") 15 | unzip(dest, exdir = "data") 16 | } 17 | 18 | if (! file.exists("data/NEON-DS-Met-Time-Series")) { 19 | dest <- tempfile() 20 | download.file("https://ndownloader.figshare.com/files/3701572", dest, 21 | mode = "wb") 22 | unzip(dest, exdir = "data") 23 | } 24 | 25 | if (! 
file.exists("data/NEON-DS-Landsat-NDVI")) { 26 | dest <- tempfile() 27 | download.file("https://ndownloader.figshare.com/files/4933582", dest, 28 | mode = "wb") 29 | unzip(dest, exdir = "data") 30 | } 31 | 32 | if (! file.exists("data/Global/Boundaries/ne_110m_graticules_all")) { 33 | dest <- tempfile() 34 | download.file("https://naciscdn.org/naturalearth/110m/physical/ne_110m_graticules_all.zip", 35 | dest, mode = "wb") 36 | unzip(dest, exdir = "data/Global/Boundaries/ne_110m_graticules_all") 37 | } 38 | 39 | if (! file.exists("data/Global/Boundaries/ne_110m_land")) { 40 | dest <- tempfile() 41 | download.file("https://naciscdn.org/naturalearth/110m/physical/ne_110m_land.zip", 42 | dest, mode = "wb") 43 | unzip(dest, exdir = "data/Global/Boundaries/ne_110m_land") 44 | } 45 | -------------------------------------------------------------------------------- /index.md: -------------------------------------------------------------------------------- 1 | --- 2 | site: sandpaper::sandpaper_site 3 | --- 4 | 5 | **Lesson Authors:** Leah A. Wasser, Megan A. Jones, Zack Brym, Kristina Riemer, Jason Williams, Jeff Hollister, Mike Smorul, Jemma Stachelek 6 | 7 | 8 | 9 | The episodes in this lesson cover how to open, work with, and plot 10 | vector and raster-format spatial data in R. Additional topics include 11 | working with spatial metadata (extent and coordinate reference systems), 12 | reprojecting spatial data, and working with raster time series data. 13 | 14 | :::::::::::::::::::::::::::::::::::::::::: prereq 15 | 16 | ## Prerequisites 17 | 18 | Data Carpentry's teaching is hands-on, so participants are encouraged 19 | to use their own computers to ensure the proper setup of tools for an 20 | efficient workflow. To most effectively use these materials, please 21 | make sure to download the data and install everything before 22 | working through this lesson. 23 | 24 | ### R Skill Level 25 | 26 | This lesson assumes you have some knowledge of `R`. 
If you've never 27 | used `R` before, or need a refresher, start with our 28 | [Introduction to R for Geospatial Data](http://www.datacarpentry.org/r-intro-geospatial/) 29 | lesson. 30 | 31 | ### Geospatial Skill Level 32 | 33 | This lesson assumes you have some knowledge of geospatial data types 34 | and common file formats. If you have never worked with geospatial 35 | data before, or need a refresher, start with our 36 | [Introduction to Geospatial Concepts](http://www.datacarpentry.org/organization-geospatial/) 37 | lesson. 38 | 39 | ### Install Software and Download Data 40 | 41 | For installation instructions and to download the data used in this 42 | lesson, see the 43 | [Geospatial Workshop Overview](http://www.datacarpentry.org/geospatial-workshop/#setup). 44 | 45 | ### Setup RStudio Project 46 | 47 | Make sure you have set up an RStudio project for this lesson, as 48 | described in the 49 | [setup instructions](http://www.datacarpentry.org/geospatial-workshop/#setup), 50 | and that your working directory is correctly set. 51 | 52 | 53 | :::::::::::::::::::::::::::::::::::::::::::::::::: 54 | 55 | 56 | -------------------------------------------------------------------------------- /instructors/instructor-notes.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Instructor Notes 3 | --- 4 | 5 | 6 | ## Instructor notes 7 | 8 | ## Lesson motivation and learning objectives 9 | 10 | This lesson is designed to introduce learners to the fundamental principles and skills for working with 11 | raster and vector geospatial data in R. It begins by introducing the structure and simple plotting of 12 | raster data. It then covers re-projection of raster data, performing raster math, and working with multi-band 13 | raster data. After introducing raster data, the lesson moves into working with vector data. Line, point, and 14 | polygon shapefiles are included in the data.
Learners will plot multiple raster and/or vector layers 15 | in a single plot, and learn how to customize plot elements such as legends and titles. They will 16 | also learn how to read data in from a CSV-formatted file and re-format it to a shapefile. Lastly, learners 17 | will work with a multi-layered raster data set representing time series data and extract summary statistics 18 | from this data. 19 | 20 | ## Lesson design 21 | 22 | #### Overall comments 23 | 24 | - As of the initial release of this lesson (August 2018), the timing is set to be the same for each episode. This 25 | is very likely incorrect and will need to be updated as these lessons are taught. If you teach this lesson, 26 | please put in an issue or PR to suggest an updated timing scheme! 27 | 28 | - The code examples presented in each episode assume that the learners still have all of the data and packages 29 | loaded from all previous episodes in this lesson. If learners close out of their R session during the breaks or 30 | at the end of the first day, they will need to either save the workspace or reload the data and packages. 31 | Because of this, it is essential that learners save their code to a script throughout the lesson. 32 | 33 | #### [1 Intro to Raster Data in R](01-raster-structure.md) 34 | 35 | - Be sure to introduce the datasets that will be used in this lesson. There are many data files. It may 36 | be helpful to draw a diagram on the board showing the types of data that will be plotted and analyzed 37 | throughout the lesson. 38 | - If the [Introduction to Geospatial Concepts](https://datacarpentry.org/organization-geospatial/) lesson was 39 | included in your workshop, learners will have been introduced to the GDAL library. It will be useful to make 40 | the connection back to that lesson explicitly.
41 | - If the [Introduction to R for Geospatial Data](https://datacarpentry.org/r-intro-geospatial/) lesson was included 42 | in your workshop, learners will be familiar with the idea of packages and with most of the functions used 43 | in this lesson. 44 | - The Dealing with Missing Data and Bad Data Values in Rasters sections have several plots showing alternative ways of displaying missing 45 | data. The code for generating these plots is **not** shared with the learners, as it relies on many functions 46 | they have not yet learned. For these and other plots with hidden demonstration code, show the images in the 47 | lesson page while discussing those examples. 48 | - Be sure to draw a distinction between the DTM and the DSM files, as these two datasets will be used 49 | throughout the lesson. 50 | 51 | #### [2 Plot Raster Data in R](02-raster-plot.md) 52 | 53 | - `geom_bar()` is a new geom for the learners. They were introduced to `geom_col()` in the [Introduction to R for Geospatial Data](https://datacarpentry.org/r-intro-geospatial/) lesson. 54 | - `dplyr` syntax should be familiar to your learners from the [Introduction to R for Geospatial Data](https://datacarpentry.org/r-intro-geospatial/) lesson. 55 | - This may be the first time learners are exposed to hex colors, so be sure to explain that concept. 56 | - Starting in this episode and continuing throughout the lesson, the `ggplot` calls can be very long. Be sure 57 | to explicitly describe each step of the function call and what it is doing for the overall plot. 58 | 59 | #### [3 Reproject Raster Data in R](03-raster-reproject-in-r.md) 60 | 61 | - No notes yet. Please add your tips and comments! 62 | 63 | #### [4 Raster Calculations in R](04-raster-calculations-in-r.md) 64 | 65 | - The `overlay()` function syntax is fairly complex compared to other function calls the learners have seen. 66 | Be sure to explain it in detail. 
67 | 68 | #### [5 Work With Multi-Band Rasters in R](05-raster-multi-band-in-r.md) 69 | 70 | - No notes yet. Please add your tips and comments! 71 | 72 | #### [6 Open and Plot Shapefiles in R](06-vector-open-shapefile-in-r.md) 73 | 74 | - Learners may have heard of the `sp` package. If it comes up, explain that `sf` is a 75 | more modern update of `sp`. 76 | - There is a known bug in the `geom_sf()` function that leads to an intermittent error on some platforms. 77 | If you see the following error message, try to re-run your plotting command and it should work. 78 | The `ggplot` development team is working on fixing this bug. 79 | 80 | * Error message * 81 | 82 | ```error 83 | Error in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : 84 | polygon edge not found 85 | ``` 86 | 87 | #### [7 Explore and Plot by Shapefile Attributes](07-vector-shapefile-attributes-in-r.md) 88 | 89 | - No notes yet. Please add your tips and comments! 90 | 91 | #### [8 Plot Multiple Vector Layers](08-vector-plot-shapefiles-custom-legend.md) 92 | 93 | - No notes yet. Please add your tips and comments! 94 | 95 | #### [9 Handling Spatial Projection \& CRS in R](09-vector-when-data-dont-line-up-crs.md) 96 | 97 | - Note that, although `ggplot` automatically reprojects vector data when plotting multiple shapefiles with 98 | different projections together, it is still important to be aware of the CRSs of your data and to keep track 99 | of how they are being transformed. 100 | 101 | #### [10 Convert from .csv to a Vector Layer](10-vector-csv-to-shapefile-in-r.md) 102 | 103 | - No notes yet. Please add your tips and comments! 104 | 105 | #### [11 Manipulate Raster Data](11-vector-raster-integration.md) 106 | 107 | - Learners have not yet been exposed to the `melt()` function in this workshop. They will need to have 108 | the syntax explained. 109 | - This is the first instance of a faceted plot in this workshop. 
110 | 111 | #### [12 Raster Time Series Data](12-time-series-raster.md) 112 | 113 | - No notes yet. Please add your tips and comments! 114 | 115 | #### [13 Create Publication-quality Graphics](13-plot-time-series-rasters-in-r.md) 116 | 117 | - Be sure to show learners the before and after plots to motivate the complexity of the 118 | `ggplot` calls that will be used in this episode. 119 | 120 | #### [14 Derive Values from Raster Time Series](14-extract-ndvi-from-rasters-in-r.md) 121 | 122 | - This is the first time in the workshop that learners will have worked with date data. 123 | 124 | #### Concluding remarks 125 | 126 | - No notes yet. Please add your tips and comments! 127 | 128 | ## Technical tips and tricks 129 | 130 | - Leave about 30 minutes at the start of each workshop and another 15 minutes 131 | at the start of each session for technical difficulties like WiFi and 132 | installing things (even if you asked students to install in advance; longer if 133 | not). 134 | 135 | - Don't worry about being correct or knowing the material back-to-front. Use 136 | mistakes as teaching moments: the most vital skill you can impart is how to 137 | debug and recover from unexpected errors. 138 | 139 | ## Scheduling tips 140 | 141 | - You will almost certainly not have enough time to teach this entire curriculum. If pressed for time, 142 | here is one possible shortened schedule you can use (used for a four half-day workshop in May 2022): 143 | - Day 1: Workshop intro, installation, troubleshooting. Episodes 1-5 of Introduction to R for Geospatial Data. 144 | Skip everything in Episode 3 after "Vectors and Type Coercion", but keep Challenge 4. Skip everything in 145 | Episode 4 starting at "Adding columns and rows in data frames". Only include the "Data frames" section of Episode 5. 146 | You can introduce factors on the fly in the rest of the curriculum.
147 | - Day 2: Episodes 6-8 of Introduction to R for Geospatial Data, Episodes 6-8 of R for Raster and Vector Data (as far 148 | as you get in Episode 8). 149 | - Day 3: Episodes 8-10 of R for Raster and Vector Data, Episodes 1-2 of R for Raster and Vector Data. 150 | - Day 4: Episodes 3 and 11 of R for Raster and Vector Data (and whatever else you'd like to cover), workshop conclusion. 151 | - It is a good idea to start your teaching with **vector data** (which is more immediately relevant to a greater number of 152 | researchers, particularly those outside of environmental sciences), then move to raster data if there is extra time. 153 | - Skip Introduction to Geospatial Concepts. Spend at most 30 minutes reviewing its content, as it is currently not 154 | an interactive curriculum. You can cover most of the concepts within the R for Raster and Vector Data curriculum. 155 | - Covering Episode 10 immediately after Episode 3 can be helpful to solidify the concepts of projections. 156 | 157 | ## Common problems 158 | 159 | - Pre-installation for this curriculum is particularly important because geospatial data and software are large and can take 160 | a very long time to download during a workshop. Make sure everything is installed and downloaded ahead of time. 161 | - TBA. Instructors, please add other situations you encounter here.
162 | 163 | 164 | 165 | 166 | -------------------------------------------------------------------------------- /learners/discuss.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Discussion 3 | --- 4 | 5 | FIXME 6 | 7 | 8 | 9 | 10 | -------------------------------------------------------------------------------- /learners/reference.md: -------------------------------------------------------------------------------- 1 | --- 2 | {} 3 | --- 4 | 5 | ## References 6 | 7 | - [CRAN Spatial Task View](https://cran.r-project.org/web/views/Spatial.html) 8 | 9 | - [Geocomputation with R](http://robinlovelace.net/geocompr/) 10 | 11 | - [sf package vignettes](https://r-spatial.github.io/sf/articles/) 12 | 13 | - [Wikipedia shapefile page](https://en.wikipedia.org/wiki/Shapefile) 14 | 15 | - [`R` color palettes documentation](https://stat.ethz.ch/R-manual/R-devel/library/grDevices/html/palettes.html) 16 | 17 | 18 | -------------------------------------------------------------------------------- /learners/setup.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Setup 3 | --- 4 | 5 | This lesson is designed to be taught in conjunction with other lessons 6 | in the [Data Carpentry Geospatial workshop](http://www.datacarpentry.org/geospatial-workshop/). 7 | For information about required software, and to access the datasets used 8 | in this lesson, see the 9 | [setup instructions](https://datacarpentry.org/geospatial-workshop/#setup) 10 | on the workshop homepage. 11 | 12 | 13 | -------------------------------------------------------------------------------- /profiles/learner-profiles.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: FIXME 3 | --- 4 | 5 | This is a placeholder file. Please add content here. 
6 | -------------------------------------------------------------------------------- /r-raster-vector-geospatial.Rproj: -------------------------------------------------------------------------------- 1 | Version: 1.0 2 | ProjectId: 32618674-7875-479d-aff6-55bf7903a906 3 | 4 | RestoreWorkspace: Default 5 | SaveWorkspace: Default 6 | AlwaysSaveHistory: Default 7 | 8 | EnableCodeIndexing: Yes 9 | UseSpacesForTab: Yes 10 | NumSpacesForTab: 2 11 | Encoding: UTF-8 12 | 13 | RnwWeave: Sweave 14 | LaTeX: pdfLaTeX 15 | 16 | BuildType: Website 17 | -------------------------------------------------------------------------------- /renv/profile: -------------------------------------------------------------------------------- 1 | lesson-requirements 2 | -------------------------------------------------------------------------------- /renv/profiles/lesson-requirements/renv/.gitignore: -------------------------------------------------------------------------------- 1 | library/ 2 | local/ 3 | cellar/ 4 | lock/ 5 | python/ 6 | sandbox/ 7 | staging/ 8 | -------------------------------------------------------------------------------- /site/README.md: -------------------------------------------------------------------------------- 1 | This directory contains rendered lesson materials. Please do not edit files 2 | here. 3 | --------------------------------------------------------------------------------