├── .Rbuildignore
├── .editorconfig
├── .gitattributes
├── .github
│   └── workflows
│       ├── README.md
│       ├── pr-close-signal.yaml
│       ├── pr-comment.yaml
│       ├── pr-post-remove-branch.yaml
│       ├── pr-preflight.yaml
│       ├── pr-receive.yaml
│       ├── sandpaper-main.yaml
│       ├── sandpaper-version.txt
│       ├── update-cache.yaml
│       ├── update-workflows.yaml
│       └── workbench-beta-phase.yml
├── .gitignore
├── .zenodo.json
├── CITATION
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE.md
├── README.md
├── about.md
├── config.yaml
├── episodes
│   ├── 01-raster-structure.Rmd
│   ├── 02-raster-plot.Rmd
│   ├── 03-raster-reproject-in-r.Rmd
│   ├── 04-raster-calculations-in-r.Rmd
│   ├── 05-raster-multi-band-in-r.Rmd
│   ├── 06-vector-open-shapefile-in-r.Rmd
│   ├── 07-vector-shapefile-attributes-in-r.Rmd
│   ├── 08-vector-plot-shapefiles-custom-legend.Rmd
│   ├── 09-vector-when-data-dont-line-up-crs.Rmd
│   ├── 10-vector-csv-to-shapefile-in-r.Rmd
│   ├── 11-vector-raster-integration.Rmd
│   ├── 12-time-series-raster.Rmd
│   ├── 13-plot-time-series-rasters-in-r.Rmd
│   ├── 14-extract-ndvi-from-rasters-in-r.Rmd
│   ├── data
│   │   └── .gitignore
│   ├── fig
│   │   ├── BufferCircular.png
│   │   ├── BufferSquare.png
│   │   ├── Utm-zones-USA.svg
│   │   ├── dc-spatial-raster
│   │   │   ├── GreennessOverTime.jpg
│   │   │   ├── RGBSTack_1.jpg
│   │   │   ├── UTM_zones_18-19.jpg
│   │   │   ├── imageStretch_dark.jpg
│   │   │   ├── imageStretch_light.jpg
│   │   │   ├── lidarTree-height.png
│   │   │   ├── raster_concept.png
│   │   │   ├── raster_resolution.png
│   │   │   ├── single_multi_raster.png
│   │   │   └── spatial_extent.png
│   │   ├── dc-spatial-vector
│   │   │   ├── pnt_line_poly.png
│   │   │   └── spatial_extent.png
│   │   └── map_usa_different_projections.jpg
│   └── setup.R
├── index.md
├── instructors
│   └── instructor-notes.md
├── learners
│   ├── discuss.md
│   ├── reference.md
│   └── setup.md
├── profiles
│   └── learner-profiles.md
├── r-raster-vector-geospatial.Rproj
├── renv
│   ├── activate.R
│   ├── profile
│   └── profiles
│       └── lesson-requirements
│           ├── renv.lock
│           └── renv
│               └── .gitignore
└── site
    └── README.md
/.Rbuildignore:
--------------------------------------------------------------------------------
1 | ^renv$
2 | ^renv\.lock$
3 | ^\.travis\.yml$
4 | ^appveyor\.yml$
5 | ^tic\.R$
6 |
--------------------------------------------------------------------------------
/.editorconfig:
--------------------------------------------------------------------------------
1 | root = true
2 |
3 | [*]
4 | charset = utf-8
5 | insert_final_newline = true
6 | trim_trailing_whitespace = true
7 |
8 | [*.md]
9 | indent_size = 2
10 | indent_style = space
11 | max_line_length = 100 # Please keep this in sync with bin/lesson_check.py!
12 | trim_trailing_whitespace = false # keep trailing spaces in markdown - 2+ spaces are translated to a hard break (<br/>)
13 |
14 | [*.r]
15 | max_line_length = 80
16 |
17 | [*.py]
18 | indent_size = 4
19 | indent_style = space
20 | max_line_length = 79
21 |
22 | [*.sh]
23 | end_of_line = lf
24 |
25 | [Makefile]
26 | indent_style = tab
27 |
--------------------------------------------------------------------------------
/.gitattributes:
--------------------------------------------------------------------------------
1 | *.py linguist-vendored
2 | *.html linguist-vendored
3 | bin/* linguist-vendored
4 | assets/* linguist-vendored
5 | *.R linguist-vendored=false
6 | assets/css/lesson.scss linguist-vendored
7 |
--------------------------------------------------------------------------------
/.github/workflows/README.md:
--------------------------------------------------------------------------------
1 | # Carpentries Workflows
2 |
3 | This directory contains workflows to be used for Lessons using the {sandpaper}
4 | lesson infrastructure. Two of these workflows require R (`sandpaper-main.yaml`
5 | and `pr-receive.yaml`) and the rest are bots to handle pull request management.
6 |
7 | These workflows will likely change as {sandpaper} evolves, so it is important to
8 | keep them up-to-date. To do this in your lesson you can do the following in your
9 | R console:
10 |
11 | ```r
12 | # Install/Update sandpaper
13 | options(repos = c(carpentries = "https://carpentries.r-universe.dev/",
14 | CRAN = "https://cloud.r-project.org"))
15 | install.packages("sandpaper")
16 |
17 | # update the workflows in your lesson
18 | library("sandpaper")
19 | update_github_workflows()
20 | ```
21 |
22 | Inside this folder, you will find a file called `sandpaper-version.txt`, which
23 | will contain a version number for sandpaper. This will be used in the future to
24 | alert you if a workflow update is needed.
25 |
26 | What follows are the descriptions of the workflow files:
27 |
28 | ## Deployment
29 |
30 | ### 01 Build and Deploy (sandpaper-main.yaml)
31 |
32 | This is the main driver that will only act on the main branch of the repository.
33 | This workflow does the following:
34 |
35 | 1. checks out the lesson
36 | 2. provisions the following resources
37 | - R
38 | - pandoc
39 | - lesson infrastructure (stored in a cache)
40 | - lesson dependencies if needed (stored in a cache)
41 | 3. builds the lesson via `sandpaper:::ci_deploy()`
42 |
43 | #### Caching
44 |
45 | This workflow has two caches; one cache is for the lesson infrastructure and
46 | the other is for the lesson dependencies if the lesson contains rendered
47 | content. These caches are invalidated by new versions of the infrastructure and
48 | the `renv.lock` file, respectively. If there is a problem with the cache,
49 | manual invalidation is necessary. You will need maintainer access to the repository
50 | and can either go to the actions tab and [click on the caches button to find
51 | and invalidate the failing cache](https://github.blog/changelog/2022-10-20-manage-caches-in-your-actions-workflows-from-web-interface/)
52 | or set the `CACHE_VERSION` secret to the current date (which will
53 | invalidate all of the caches).
54 |
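If you prefer the command line, the `CACHE_VERSION` route can be scripted. A minimal sketch, assuming the GitHub CLI (`gh`) is installed and authenticated; the actual `gh secret set` call is shown as a comment because it needs repository access, and `OWNER/REPO` is a placeholder:

```bash
# Use today's date as the new cache version; every manual reset then
# produces a unique value that invalidates all existing caches.
CACHE_VERSION=$(date +%F)   # YYYY-MM-DD
echo "CACHE_VERSION=${CACHE_VERSION}"
# With the GitHub CLI, the secret could then be updated like this:
#   gh secret set CACHE_VERSION --repo OWNER/REPO --body "${CACHE_VERSION}"
```

Any string that changes on each reset would work; the date is simply easy to remember and to audit later.
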
55 | ## Updates
56 |
57 | ### Setup Information
58 |
59 | These workflows run on a schedule and at the maintainer's request. Because they
60 | create pull requests that update workflows/require the downstream actions to run,
61 | they need a special repository/organization secret token called
62 | `SANDPAPER_WORKFLOW` and it must have the `public_repo` and `workflow` scope.
63 |
64 | This can be an individual user token, OR it can be a trusted bot account. If you
65 | have a repository in one of the official Carpentries accounts, then you do not
66 | need to worry about this token being present because the Carpentries Core Team
67 | will take care of supplying this token.
68 |
69 | If you want to use your personal account: you can go to
70 |
71 | to create a token. Once you have created your token, you should copy it to your
72 | clipboard and then go to your repository's settings > secrets > actions and
73 | create or edit the `SANDPAPER_WORKFLOW` secret, pasting in the generated token.
74 |
75 | If you do not specify your token correctly, the runs will not fail and they will
76 | give you instructions to provide the token for your repository.
77 |
78 | ### 02 Maintain: Update Workflow Files (update-workflows.yaml)
79 |
80 | The {sandpaper} repository was designed to do as much as possible to separate
81 | the tools from the content. For local builds, this is absolutely true, but
82 | there is a minor issue when it comes to workflow files: they must live inside
83 | the repository.
84 |
85 | This workflow ensures that the workflow files are up-to-date. The way it works is
86 | to download the update-workflows.sh script from GitHub and run it. The script
87 | will do the following:
88 |
89 | 1. check the recorded version of sandpaper against the current version on github
90 | 2. update the files if there is a difference in versions
91 |
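The version check in step 1 amounts to comparing two version strings. A stand-alone sketch of that comparison (the file content and the "latest" version below are made-up example values; the real script fetches the latest version from GitHub):

```bash
# Recreate the recorded-version file locally for demonstration purposes.
mkdir -p .github/workflows
printf '0.16.12\n' > .github/workflows/sandpaper-version.txt

recorded=$(cat .github/workflows/sandpaper-version.txt)
latest='0.16.12'   # placeholder for the version published upstream

if [ "$recorded" = "$latest" ]; then
  echo "workflows are up to date ($recorded)"
else
  echo "workflow update needed: $recorded -> $latest"
fi
```

When the strings differ, the script replaces the workflow files and the recorded version, which is what produces the diff for the pull request.
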
92 | After the files are updated, if there are any changes, they are pushed to a
93 | branch called `update/workflows` and a pull request is created. Maintainers are
94 | encouraged to review the changes and accept the pull request if the outputs
95 | are okay.
96 |
97 | This update is run weekly or on demand.
98 |
99 | ### 03 Maintain: Update Package Cache (update-cache.yaml)
100 |
101 | For lessons that have generated content, we use {renv} to ensure that the output
102 | is stable. This is controlled by a single lockfile which documents the packages
103 | needed for the lesson and the version numbers. This workflow is skipped in
104 | lessons that do not have generated content.
105 |
106 | Because the lessons need to remain current with the package ecosystem, it's a
107 | good idea to make sure these packages can be updated periodically. The
108 | update cache workflow will do this by checking for updates, applying them in a
109 | branch called `updates/packages` and creating a pull request with _only the
110 | lockfile changed_.
111 |
112 | From here, the markdown documents will be rebuilt and you can inspect what has
113 | changed based on how the packages have updated.
114 |
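Because `renv.lock` is plain JSON, the proposed version changes are easy to inspect even outside of R. A small stand-alone sketch (the lockfile content below is a made-up two-package example; real lockfiles are much larger):

```bash
# Write a toy lockfile to demonstrate its shape.
cat > renv.lock <<'EOF'
{
  "R": {"Version": "4.3.1"},
  "Packages": {
    "sf":    {"Package": "sf",    "Version": "1.0-14"},
    "terra": {"Package": "terra", "Version": "1.7-55"}
  }
}
EOF
# List the pinned versions. In a real lesson checkout you would instead
# diff the branch against main, e.g.: git diff main -- renv.lock
grep '"Version"' renv.lock
```
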
115 | ## Pull Request and Review Management
116 |
117 | Because our lessons execute code, pull requests are a security risk for any
118 | lesson and thus have security measures associated with them. **Do not merge any
119 | pull requests that do not pass checks or that the bots have not commented on.**
120 |
121 | This series of workflows all go together and are described in the following
122 | diagram and the below sections:
123 |
124 | 
125 |
126 | ### Pre Flight Pull Request Validation (pr-preflight.yaml)
127 |
128 | This workflow runs every time a pull request is created and its purpose is to
129 | validate that the pull request is okay to run. This means the following things:
130 |
131 | 1. The pull request does not contain modified workflow files
132 | 2. If the pull request contains modified workflow files, it does not contain
133 | modified content files (such as a situation where @carpentries-bot will
134 | make an automated pull request)
135 | 3. The pull request does not contain an invalid commit hash (e.g. from a fork
136 | that was made before a lesson was transitioned from styles to use the
137 | workbench).
138 |
139 | Once the checks are finished, a comment is issued to the pull request, which
140 | will allow maintainers to determine if it is safe to run the
141 | "Receive Pull Request" workflow from new contributors.
142 |
143 | ### Receive Pull Request (pr-receive.yaml)
144 |
145 | **Note of caution:** This workflow runs arbitrary code by anyone who creates a
146 | pull request. GitHub has safeguarded the token used in this workflow to have no
147 | privileges in the repository, but we have taken precautions to protect against
148 | spoofing.
149 |
150 | This workflow is triggered with every push to a pull request. If this workflow
151 | is already running and a new push is sent to the pull request, the workflow
152 | running from the previous push will be cancelled and a new workflow run will be
153 | started.
154 |
155 | The first step of this workflow is to check if it is valid (e.g. that no
156 | workflow files have been modified). If there are workflow files that have been
157 | modified, a comment is made indicating that the workflow will not be run. If
158 | both a workflow file and lesson content are modified, an error will occur.
159 |
160 | The second step (if valid) is to build the generated content from the pull
161 | request. This builds the content and uploads three artifacts:
162 |
163 | 1. The pull request number (pr)
164 | 2. A summary of changes after the rendering process (diff)
165 | 3. The rendered files (build)
166 |
167 | Because this workflow builds generated content, it follows the same general
168 | process as the `sandpaper-main` workflow with the same caching mechanisms.
169 |
170 | The artifacts produced are used by the next workflow.
171 |
172 | ### Comment on Pull Request (pr-comment.yaml)
173 |
174 | This workflow is triggered if the `pr-receive.yaml` workflow is successful.
175 | The steps in this workflow are:
176 |
177 | 1. Test if the workflow is valid and comment the validity of the workflow to the
178 | pull request.
179 | 2. If it is valid: create an orphan branch with two commits: the current state
180 | of the repository and the proposed changes.
181 | 3. If it is valid: update the pull request comment with the summary of changes
182 |
183 | Importantly: if the pull request is invalid, the branch is not created so any
184 | malicious code is not published.
185 |
186 | From here, the maintainer can request changes from the author and eventually
187 | either merge or reject the PR. When this happens, if the PR was valid, the
188 | preview branch needs to be deleted.
189 |
190 | ### Send Close PR Signal (pr-close-signal.yaml)
191 |
192 | Triggered any time a pull request is closed. This emits an artifact that is the
193 | pull request number for the next action.
194 |
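The hand-off between this workflow and the next is just a number written to a file inside the artifact. A minimal local sketch of that round trip (42 is a placeholder PR number):

```bash
# Write the PR number into an artifact directory, as pr-close-signal.yaml does.
mkdir -p ./pr
printf '42' > ./pr/NUM
# Read it back, as pr-post-remove-branch.yaml does after downloading
# and unzipping the artifact.
NUM=$(cat ./pr/NUM)
echo "NUM=${NUM}"
```
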
195 | ### Remove Pull Request Branch (pr-post-remove-branch.yaml)
196 |
197 | Triggered by `pr-close-signal.yaml`. This removes the temporary branch associated with
198 | the pull request (if it was created).
199 |
--------------------------------------------------------------------------------
/.github/workflows/pr-close-signal.yaml:
--------------------------------------------------------------------------------
1 | name: "Bot: Send Close Pull Request Signal"
2 |
3 | on:
4 | pull_request:
5 | types:
6 | [closed]
7 |
8 | jobs:
9 | send-close-signal:
10 | name: "Send closing signal"
11 | runs-on: ubuntu-22.04
12 | if: ${{ github.event.action == 'closed' }}
13 | steps:
14 | - name: "Create PRtifact"
15 | run: |
16 | mkdir -p ./pr
17 | printf ${{ github.event.number }} > ./pr/NUM
18 | - name: Upload Diff
19 | uses: actions/upload-artifact@v4
20 | with:
21 | name: pr
22 | path: ./pr
23 |
--------------------------------------------------------------------------------
/.github/workflows/pr-comment.yaml:
--------------------------------------------------------------------------------
1 | name: "Bot: Comment on the Pull Request"
2 |
3 | # read-write repo token
4 | # access to secrets
5 | on:
6 | workflow_run:
7 | workflows: ["Receive Pull Request"]
8 | types:
9 | - completed
10 |
11 | concurrency:
12 | group: pr-${{ github.event.workflow_run.pull_requests[0].number }}
13 | cancel-in-progress: true
14 |
15 |
16 | jobs:
17 | # Pull requests are valid if:
18 | # - they match the sha of the workflow run head commit
19 | # - they are open
20 | # - no .github files were committed
21 | test-pr:
22 | name: "Test if pull request is valid"
23 | runs-on: ubuntu-22.04
24 | if: >
25 | github.event.workflow_run.event == 'pull_request' &&
26 | github.event.workflow_run.conclusion == 'success'
27 | outputs:
28 | is_valid: ${{ steps.check-pr.outputs.VALID }}
29 | payload: ${{ steps.check-pr.outputs.payload }}
30 | number: ${{ steps.get-pr.outputs.NUM }}
31 | msg: ${{ steps.check-pr.outputs.MSG }}
32 | steps:
33 | - name: 'Download PR artifact'
34 | id: dl
35 | uses: carpentries/actions/download-workflow-artifact@main
36 | with:
37 | run: ${{ github.event.workflow_run.id }}
38 | name: 'pr'
39 |
40 | - name: "Get PR Number"
41 | if: ${{ steps.dl.outputs.success == 'true' }}
42 | id: get-pr
43 | run: |
44 | unzip pr.zip
45 | echo "NUM=$(<./NR)" >> $GITHUB_OUTPUT
46 |
47 | - name: "Fail if PR number was not present"
48 | id: bad-pr
49 | if: ${{ steps.dl.outputs.success != 'true' }}
50 | run: |
51 | echo '::error::A pull request number was not recorded. The pull request that triggered this workflow is likely malicious.'
52 | exit 1
53 | - name: "Get Invalid Hashes File"
54 | id: hash
55 | run: |
56 | echo "json<<EOF
57 | $(curl -sL https://files.carpentries.org/invalid-hashes.json)
58 | EOF" >> $GITHUB_OUTPUT
59 | - name: "Check PR"
60 | id: check-pr
61 | if: ${{ steps.dl.outputs.success == 'true' }}
62 | uses: carpentries/actions/check-valid-pr@main
63 | with:
64 | pr: ${{ steps.get-pr.outputs.NUM }}
65 | sha: ${{ github.event.workflow_run.head_sha }}
66 | headroom: 3 # if it's within the last three commits, we can keep going, because it's likely rapid-fire
67 | invalid: ${{ fromJSON(steps.hash.outputs.json)[github.repository] }}
68 | fail_on_error: true
69 |
70 | # Create an orphan branch on this repository with two commits
71 | # - the current HEAD of the md-outputs branch
72 | # - the output from running the current HEAD of the pull request through
73 | # the md generator
74 | create-branch:
75 | name: "Create Git Branch"
76 | needs: test-pr
77 | runs-on: ubuntu-22.04
78 | if: ${{ needs.test-pr.outputs.is_valid == 'true' }}
79 | env:
80 | NR: ${{ needs.test-pr.outputs.number }}
81 | permissions:
82 | contents: write
83 | steps:
84 | - name: 'Checkout md outputs'
85 | uses: actions/checkout@v4
86 | with:
87 | ref: md-outputs
88 | path: built
89 | fetch-depth: 1
90 |
91 | - name: 'Download built markdown'
92 | id: dl
93 | uses: carpentries/actions/download-workflow-artifact@main
94 | with:
95 | run: ${{ github.event.workflow_run.id }}
96 | name: 'built'
97 |
98 | - if: ${{ steps.dl.outputs.success == 'true' }}
99 | run: unzip built.zip
100 |
101 | - name: "Create orphan and push"
102 | if: ${{ steps.dl.outputs.success == 'true' }}
103 | run: |
104 | cd built/
105 | git config --local user.email "actions@github.com"
106 | git config --local user.name "GitHub Actions"
107 | CURR_HEAD=$(git rev-parse HEAD)
108 | git checkout --orphan md-outputs-PR-${NR}
109 | git add -A
110 | git commit -m "source commit: ${CURR_HEAD}"
111 | ls -A | grep -v '^.git$' | xargs -I _ rm -r '_'
112 | cd ..
113 | unzip -o -d built built.zip
114 | cd built
115 | git add -A
116 | git commit --allow-empty -m "differences for PR #${NR}"
117 | git push -u --force --set-upstream origin md-outputs-PR-${NR}
118 |
119 | # Comment on the Pull Request with a link to the branch and the diff
120 | comment-pr:
121 | name: "Comment on Pull Request"
122 | needs: [test-pr, create-branch]
123 | runs-on: ubuntu-22.04
124 | if: ${{ needs.test-pr.outputs.is_valid == 'true' }}
125 | env:
126 | NR: ${{ needs.test-pr.outputs.number }}
127 | permissions:
128 | pull-requests: write
129 | steps:
130 | - name: 'Download comment artifact'
131 | id: dl
132 | uses: carpentries/actions/download-workflow-artifact@main
133 | with:
134 | run: ${{ github.event.workflow_run.id }}
135 | name: 'diff'
136 |
137 | - if: ${{ steps.dl.outputs.success == 'true' }}
138 | run: unzip ${{ github.workspace }}/diff.zip
139 |
140 | - name: "Comment on PR"
141 | id: comment-diff
142 | if: ${{ steps.dl.outputs.success == 'true' }}
143 | uses: carpentries/actions/comment-diff@main
144 | with:
145 | pr: ${{ env.NR }}
146 | path: ${{ github.workspace }}/diff.md
147 |
148 | # Comment if the PR is open and matches the SHA, but the workflow files have
149 | # changed
150 | comment-changed-workflow:
151 | name: "Comment if workflow files have changed"
152 | needs: test-pr
153 | runs-on: ubuntu-22.04
154 | if: ${{ always() && needs.test-pr.outputs.is_valid == 'false' }}
155 | env:
156 | NR: ${{ github.event.workflow_run.pull_requests[0].number }}
157 | body: ${{ needs.test-pr.outputs.msg }}
158 | permissions:
159 | pull-requests: write
160 | steps:
161 | - name: 'Check for spoofing'
162 | id: dl
163 | uses: carpentries/actions/download-workflow-artifact@main
164 | with:
165 | run: ${{ github.event.workflow_run.id }}
166 | name: 'built'
167 |
168 | - name: 'Alert if spoofed'
169 | id: spoof
170 | if: ${{ steps.dl.outputs.success == 'true' }}
171 | run: |
172 | echo 'body<<EOF' >> $GITHUB_ENV
173 | echo '' >> $GITHUB_ENV
174 | echo '## :x: DANGER :x:' >> $GITHUB_ENV
175 | echo 'This pull request has modified workflows that created output. Close this now.' >> $GITHUB_ENV
176 | echo '' >> $GITHUB_ENV
177 | echo 'EOF' >> $GITHUB_ENV
178 |
179 | - name: "Comment on PR"
180 | id: comment-diff
181 | uses: carpentries/actions/comment-diff@main
182 | with:
183 | pr: ${{ env.NR }}
184 | body: ${{ env.body }}
185 |
--------------------------------------------------------------------------------
/.github/workflows/pr-post-remove-branch.yaml:
--------------------------------------------------------------------------------
1 | name: "Bot: Remove Temporary PR Branch"
2 |
3 | on:
4 | workflow_run:
5 | workflows: ["Bot: Send Close Pull Request Signal"]
6 | types:
7 | - completed
8 |
9 | jobs:
10 | delete:
11 | name: "Delete branch from Pull Request"
12 | runs-on: ubuntu-22.04
13 | if: >
14 | github.event.workflow_run.event == 'pull_request' &&
15 | github.event.workflow_run.conclusion == 'success'
16 | permissions:
17 | contents: write
18 | steps:
19 | - name: 'Download artifact'
20 | uses: carpentries/actions/download-workflow-artifact@main
21 | with:
22 | run: ${{ github.event.workflow_run.id }}
23 | name: pr
24 | - name: "Get PR Number"
25 | id: get-pr
26 | run: |
27 | unzip pr.zip
28 | echo "NUM=$(<./NUM)" >> $GITHUB_OUTPUT
29 | - name: 'Remove branch'
30 | uses: carpentries/actions/remove-branch@main
31 | with:
32 | pr: ${{ steps.get-pr.outputs.NUM }}
33 |
--------------------------------------------------------------------------------
/.github/workflows/pr-preflight.yaml:
--------------------------------------------------------------------------------
1 | name: "Pull Request Preflight Check"
2 |
3 | on:
4 | pull_request_target:
5 | branches:
6 | ["main"]
7 | types:
8 | ["opened", "synchronize", "reopened"]
9 |
10 | jobs:
11 | test-pr:
12 | name: "Test if pull request is valid"
13 | if: ${{ github.event.action != 'closed' }}
14 | runs-on: ubuntu-22.04
15 | outputs:
16 | is_valid: ${{ steps.check-pr.outputs.VALID }}
17 | permissions:
18 | pull-requests: write
19 | steps:
20 | - name: "Get Invalid Hashes File"
21 | id: hash
22 | run: |
23 | echo "json<<EOF
24 | $(curl -sL https://files.carpentries.org/invalid-hashes.json)
25 | EOF" >> $GITHUB_OUTPUT
26 | - name: "Check PR"
27 | id: check-pr
28 | uses: carpentries/actions/check-valid-pr@main
29 | with:
30 | pr: ${{ github.event.number }}
31 | invalid: ${{ fromJSON(steps.hash.outputs.json)[github.repository] }}
32 | fail_on_error: true
33 | - name: "Comment result of validation"
34 | id: comment-diff
35 | if: ${{ always() }}
36 | uses: carpentries/actions/comment-diff@main
37 | with:
38 | pr: ${{ github.event.number }}
39 | body: ${{ steps.check-pr.outputs.MSG }}
40 |
--------------------------------------------------------------------------------
/.github/workflows/pr-receive.yaml:
--------------------------------------------------------------------------------
1 | name: "Receive Pull Request"
2 |
3 | on:
4 | pull_request:
5 | types:
6 | [opened, synchronize, reopened]
7 |
8 | concurrency:
9 | group: ${{ github.ref }}
10 | cancel-in-progress: true
11 |
12 | jobs:
13 | test-pr:
14 | name: "Record PR number"
15 | if: ${{ github.event.action != 'closed' }}
16 | runs-on: ubuntu-22.04
17 | outputs:
18 | is_valid: ${{ steps.check-pr.outputs.VALID }}
19 | steps:
20 | - name: "Record PR number"
21 | id: record
22 | if: ${{ always() }}
23 | run: |
24 | echo ${{ github.event.number }} > ${{ github.workspace }}/NR # 2022-03-02: artifact name fixed to be NR
25 | - name: "Upload PR number"
26 | id: upload
27 | if: ${{ always() }}
28 | uses: actions/upload-artifact@v4
29 | with:
30 | name: pr
31 | path: ${{ github.workspace }}/NR
32 | - name: "Get Invalid Hashes File"
33 | id: hash
34 | run: |
35 | echo "json<<EOF
36 | $(curl -sL https://files.carpentries.org/invalid-hashes.json)
37 | EOF" >> $GITHUB_OUTPUT
38 | - name: "echo output"
39 | run: |
40 | echo "${{ steps.hash.outputs.json }}"
41 | - name: "Check PR"
42 | id: check-pr
43 | uses: carpentries/actions/check-valid-pr@main
44 | with:
45 | pr: ${{ github.event.number }}
46 | invalid: ${{ fromJSON(steps.hash.outputs.json)[github.repository] }}
47 |
48 | build-md-source:
49 | name: "Build markdown source files if valid"
50 | needs: test-pr
51 | runs-on: ubuntu-22.04
52 | if: ${{ needs.test-pr.outputs.is_valid == 'true' }}
53 | env:
54 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
55 | RENV_PATHS_ROOT: ~/.local/share/renv/
56 | CHIVE: ${{ github.workspace }}/site/chive
57 | PR: ${{ github.workspace }}/site/pr
58 | MD: ${{ github.workspace }}/site/built
59 | steps:
60 | - name: "Check Out Main Branch"
61 | uses: actions/checkout@v4
62 |
63 | - name: "Check Out Staging Branch"
64 | uses: actions/checkout@v4
65 | with:
66 | ref: md-outputs
67 | path: ${{ env.MD }}
68 |
69 | - name: "Set up R"
70 | uses: r-lib/actions/setup-r@v2
71 | with:
72 | use-public-rspm: true
73 | install-r: false
74 |
75 | - name: "Set up Pandoc"
76 | uses: r-lib/actions/setup-pandoc@v2
77 |
78 | - name: "Setup Lesson Engine"
79 | uses: carpentries/actions/setup-sandpaper@main
80 | with:
81 | cache-version: ${{ secrets.CACHE_VERSION }}
82 |
83 | - name: "Setup Package Cache"
84 | uses: carpentries/actions/setup-lesson-deps@main
85 | with:
86 | cache-version: ${{ secrets.CACHE_VERSION }}
87 |
88 | - name: "Validate and Build Markdown"
89 | id: build-site
90 | run: |
91 | sandpaper::package_cache_trigger(TRUE)
92 | sandpaper::validate_lesson(path = '${{ github.workspace }}')
93 | sandpaper:::build_markdown(path = '${{ github.workspace }}', quiet = FALSE)
94 | shell: Rscript {0}
95 |
96 | - name: "Generate Artifacts"
97 | id: generate-artifacts
98 | run: |
99 | sandpaper:::ci_bundle_pr_artifacts(
100 | repo = '${{ github.repository }}',
101 | pr_number = '${{ github.event.number }}',
102 | path_md = '${{ env.MD }}',
103 | path_pr = '${{ env.PR }}',
104 | path_archive = '${{ env.CHIVE }}',
105 | branch = 'md-outputs'
106 | )
107 | shell: Rscript {0}
108 |
109 | - name: "Upload PR"
110 | uses: actions/upload-artifact@v4
111 | with:
112 | name: pr
113 | path: ${{ env.PR }}
114 | overwrite: true
115 |
116 | - name: "Upload Diff"
117 | uses: actions/upload-artifact@v4
118 | with:
119 | name: diff
120 | path: ${{ env.CHIVE }}
121 | retention-days: 1
122 |
123 | - name: "Upload Build"
124 | uses: actions/upload-artifact@v4
125 | with:
126 | name: built
127 | path: ${{ env.MD }}
128 | retention-days: 1
129 |
130 | - name: "Teardown"
131 | run: sandpaper::reset_site()
132 | shell: Rscript {0}
133 |
--------------------------------------------------------------------------------
/.github/workflows/sandpaper-main.yaml:
--------------------------------------------------------------------------------
1 | name: "01 Build and Deploy Site"
2 |
3 | on:
4 | push:
5 | branches:
6 | - main
7 | - master
8 | schedule:
9 | - cron: '0 0 * * 2'
10 | workflow_dispatch:
11 | inputs:
12 | name:
13 | description: 'Who triggered this build?'
14 | required: true
15 | default: 'Maintainer (via GitHub)'
16 | reset:
17 | description: 'Reset cached markdown files'
18 | required: false
19 | default: false
20 | type: boolean
21 | jobs:
22 | full-build:
23 | name: "Build Full Site"
24 |
25 | # 2024-10-01: ubuntu-latest is now 24.04 and R is not installed by default in the runner image
26 | # pin to 22.04 for now
27 | runs-on: ubuntu-22.04
28 | permissions:
29 | checks: write
30 | contents: write
31 | pages: write
32 | env:
33 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
34 | RENV_PATHS_ROOT: ~/.local/share/renv/
35 | steps:
36 |
37 | - name: "Checkout Lesson"
38 | uses: actions/checkout@v4
39 |
40 | - name: "Set up R"
41 | uses: r-lib/actions/setup-r@v2
42 | with:
43 | use-public-rspm: true
44 | install-r: false
45 |
46 | - name: "Set up Pandoc"
47 | uses: r-lib/actions/setup-pandoc@v2
48 |
49 | - name: "Setup Lesson Engine"
50 | uses: carpentries/actions/setup-sandpaper@main
51 | with:
52 | cache-version: ${{ secrets.CACHE_VERSION }}
53 |
54 | - name: "Setup Package Cache"
55 | uses: carpentries/actions/setup-lesson-deps@main
56 | with:
57 | cache-version: ${{ secrets.CACHE_VERSION }}
58 |
59 | - name: "Deploy Site"
60 | run: |
61 | reset <- "${{ github.event.inputs.reset }}" == "true"
62 | sandpaper::package_cache_trigger(TRUE)
63 | sandpaper:::ci_deploy(reset = reset)
64 | shell: Rscript {0}
65 |
--------------------------------------------------------------------------------
/.github/workflows/sandpaper-version.txt:
--------------------------------------------------------------------------------
1 | 0.16.12
2 |
--------------------------------------------------------------------------------
/.github/workflows/update-cache.yaml:
--------------------------------------------------------------------------------
1 | name: "03 Maintain: Update Package Cache"
2 |
3 | on:
4 | workflow_dispatch:
5 | inputs:
6 | name:
7 | description: 'Who triggered this build (enter github username to tag yourself)?'
8 | required: true
9 | default: 'monthly run'
10 | schedule:
11 | # Run every tuesday
12 | - cron: '0 0 * * 2'
13 |
14 | jobs:
15 | preflight:
16 | name: "Preflight Check"
17 | runs-on: ubuntu-22.04
18 | outputs:
19 | ok: ${{ steps.check.outputs.ok }}
20 | steps:
21 | - id: check
22 | run: |
23 | if [[ ${{ github.event_name }} == 'workflow_dispatch' ]]; then
24 | echo "ok=true" >> $GITHUB_OUTPUT
25 | echo "Running on request"
26 | # using single brackets here to avoid 08 being interpreted as octal
27 | # https://github.com/carpentries/sandpaper/issues/250
28 | elif [ `date +%d` -le 7 ]; then
29 | # If the Tuesday lands in the first week of the month, run it
30 | echo "ok=true" >> $GITHUB_OUTPUT
31 | echo "Running on schedule"
32 | else
33 | echo "ok=false" >> $GITHUB_OUTPUT
34 | echo "Not Running Today"
35 | fi
36 |
37 | check_renv:
38 | name: "Check if We Need {renv}"
39 | runs-on: ubuntu-22.04
40 | needs: preflight
41 | if: ${{ needs.preflight.outputs.ok == 'true'}}
42 | outputs:
43 | needed: ${{ steps.renv.outputs.exists }}
44 | steps:
45 | - name: "Checkout Lesson"
46 | uses: actions/checkout@v4
47 | - id: renv
48 | run: |
49 | if [[ -d renv ]]; then
50 | echo "exists=true" >> $GITHUB_OUTPUT
51 | fi
52 |
53 | check_token:
54 | name: "Check SANDPAPER_WORKFLOW token"
55 | runs-on: ubuntu-22.04
56 | needs: check_renv
57 | if: ${{ needs.check_renv.outputs.needed == 'true' }}
58 | outputs:
59 | workflow: ${{ steps.validate.outputs.wf }}
60 | repo: ${{ steps.validate.outputs.repo }}
61 | steps:
62 | - name: "validate token"
63 | id: validate
64 | uses: carpentries/actions/check-valid-credentials@main
65 | with:
66 | token: ${{ secrets.SANDPAPER_WORKFLOW }}
67 |
68 | update_cache:
69 | name: "Update Package Cache"
70 | needs: check_token
71 | if: ${{ needs.check_token.outputs.repo == 'true' }}
72 | runs-on: ubuntu-22.04
73 | env:
74 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
75 | RENV_PATHS_ROOT: ~/.local/share/renv/
76 | steps:
77 |
78 | - name: "Checkout Lesson"
79 | uses: actions/checkout@v4
80 |
81 | - name: "Set up R"
82 | uses: r-lib/actions/setup-r@v2
83 | with:
84 | use-public-rspm: true
85 | install-r: false
86 |
87 | - name: "Update {renv} deps and determine if a PR is needed"
88 | id: update
89 | uses: carpentries/actions/update-lockfile@main
90 | with:
91 | cache-version: ${{ secrets.CACHE_VERSION }}
92 |
93 | - name: Create Pull Request
94 | id: cpr
95 | if: ${{ steps.update.outputs.n > 0 }}
96 | uses: carpentries/create-pull-request@main
97 | with:
98 | token: ${{ secrets.SANDPAPER_WORKFLOW }}
99 | delete-branch: true
100 | branch: "update/packages"
101 | commit-message: "[actions] update ${{ steps.update.outputs.n }} packages"
102 | title: "Update ${{ steps.update.outputs.n }} packages"
103 | body: |
104 | :robot: This is an automated build
105 |
106 | This will update ${{ steps.update.outputs.n }} packages in your lesson with the following versions:
107 |
108 | ```
109 | ${{ steps.update.outputs.report }}
110 | ```
111 |
112 | :stopwatch: In a few minutes, a comment will appear that will show you how the output has changed based on these updates.
113 |
114 | If you want to inspect these changes locally, you can use the following code to check out a new branch:
115 |
116 | ```bash
117 | git fetch origin update/packages
118 | git checkout update/packages
119 | ```
120 |
121 | - Auto-generated by [create-pull-request][1] on ${{ steps.update.outputs.date }}
122 |
123 | [1]: https://github.com/carpentries/create-pull-request/tree/main
124 | labels: "type: package cache"
125 | draft: false
126 |
--------------------------------------------------------------------------------
/.github/workflows/update-workflows.yaml:
--------------------------------------------------------------------------------
1 | name: "02 Maintain: Update Workflow Files"
2 |
3 | on:
4 | workflow_dispatch:
5 | inputs:
6 | name:
7 | description: 'Who triggered this build (enter github username to tag yourself)?'
8 | required: true
9 | default: 'weekly run'
10 | clean:
11 | description: 'Workflow files/file extensions to clean (no wildcards, enter "" for none)'
12 | required: false
13 | default: '.yaml'
14 | schedule:
15 | # Run every Tuesday
16 | - cron: '0 0 * * 2'
17 |
18 | jobs:
19 | check_token:
20 | name: "Check SANDPAPER_WORKFLOW token"
21 | runs-on: ubuntu-22.04
22 | outputs:
23 | workflow: ${{ steps.validate.outputs.wf }}
24 | repo: ${{ steps.validate.outputs.repo }}
25 | steps:
26 | - name: "validate token"
27 | id: validate
28 | uses: carpentries/actions/check-valid-credentials@main
29 | with:
30 | token: ${{ secrets.SANDPAPER_WORKFLOW }}
31 |
32 | update_workflow:
33 | name: "Update Workflow"
34 | runs-on: ubuntu-22.04
35 | needs: check_token
36 | if: ${{ needs.check_token.outputs.workflow == 'true' }}
37 | steps:
38 | - name: "Checkout Repository"
39 | uses: actions/checkout@v4
40 |
41 | - name: Update Workflows
42 | id: update
43 | uses: carpentries/actions/update-workflows@main
44 | with:
45 | clean: ${{ github.event.inputs.clean }}
46 |
47 | - name: Create Pull Request
48 | id: cpr
49 | if: "${{ steps.update.outputs.new }}"
50 | uses: carpentries/create-pull-request@main
51 | with:
52 | token: ${{ secrets.SANDPAPER_WORKFLOW }}
53 | delete-branch: true
54 | branch: "update/workflows"
55 | commit-message: "[actions] update sandpaper workflow to version ${{ steps.update.outputs.new }}"
56 | title: "Update Workflows to Version ${{ steps.update.outputs.new }}"
57 | body: |
58 | :robot: This is an automated build
59 |
60 | Update Workflows from sandpaper version ${{ steps.update.outputs.old }} -> ${{ steps.update.outputs.new }}
61 |
62 | - Auto-generated by [create-pull-request][1] on ${{ steps.update.outputs.date }}
63 |
64 | [1]: https://github.com/carpentries/create-pull-request/tree/main
65 | labels: "type: template and tools"
66 | draft: false
67 |
--------------------------------------------------------------------------------
/.github/workflows/workbench-beta-phase.yml:
--------------------------------------------------------------------------------
1 | name: "Deploy to AWS"
2 |
3 | on:
4 | workflow_run:
5 | workflows: ["01 Build and Deploy Site"]
6 | types:
7 | - completed
8 | workflow_dispatch:
9 |
10 | jobs:
11 | preflight:
12 | name: "Preflight Check"
13 | runs-on: ubuntu-latest
14 | outputs:
15 | ok: ${{ steps.check.outputs.ok }}
16 | folder: ${{ steps.check.outputs.folder }}
17 | steps:
18 | - id: check
19 | run: |
20 | if [[ -z "${{ secrets.DISTRIBUTION }}" || -z "${{ secrets.AWS_ACCESS_KEY_ID }}" || -z "${{ secrets.AWS_SECRET_ACCESS_KEY }}" ]]; then
21 | echo ":information_source: No site configured" >> $GITHUB_STEP_SUMMARY
22 | echo "" >> $GITHUB_STEP_SUMMARY
23 | echo 'To deploy the preview on AWS, you need the `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` and `DISTRIBUTION` secrets set up' >> $GITHUB_STEP_SUMMARY
24 | else
26 |             echo "folder=$(sed -E 's^.+/(.+)^\1^' <<< '${{ github.repository }}')" >> $GITHUB_OUTPUT
27 |             echo "ok=true" >> $GITHUB_OUTPUT
27 | fi
28 |
29 | full-build:
30 | name: "Deploy to AWS"
31 | needs: [preflight]
32 | if: ${{ needs.preflight.outputs.ok }}
33 | runs-on: ubuntu-latest
34 | steps:
35 |
36 | - name: "Checkout site folder"
37 | uses: actions/checkout@v3
38 | with:
39 | ref: 'gh-pages'
40 | path: 'source'
41 |
42 | - name: "Deploy to Bucket"
43 | uses: jakejarvis/s3-sync-action@v0.5.1
44 | with:
45 | args: --acl public-read --follow-symlinks --delete --exclude '.git/*'
46 | env:
47 | AWS_S3_BUCKET: preview.carpentries.org
48 | AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
49 | AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
50 | SOURCE_DIR: 'source'
51 | DEST_DIR: ${{ needs.preflight.outputs.folder }}
52 |
53 | - name: "Invalidate CloudFront"
54 | uses: chetan/invalidate-cloudfront-action@master
55 | env:
56 | PATHS: /*
57 | AWS_REGION: 'us-east-1'
58 | DISTRIBUTION: ${{ secrets.DISTRIBUTION }}
59 | AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
60 | AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
61 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | # sandpaper files
2 | episodes/*html
3 | site/*
4 | !site/README.md
5 |
6 | # History files
7 | .Rhistory
8 | .Rapp.history
9 | # Session Data files
10 | .RData
11 | # User-specific files
12 | .Ruserdata
13 | # Example code in package build process
14 | *-Ex.R
15 | # Output files from R CMD build
16 | /*.tar.gz
17 | # Output files from R CMD check
18 | /*.Rcheck/
19 | # RStudio files
20 | .Rproj.user/
21 | # produced vignettes
22 | vignettes/*.html
23 | vignettes/*.pdf
24 | # OAuth2 token, see https://github.com/hadley/httr/releases/tag/v0.3
25 | .httr-oauth
26 | # knitr and R markdown default cache directories
27 | *_cache/
28 | /cache/
29 | # Temporary files created by R markdown
30 | *.utf8.md
31 | *.knit.md
32 | # R Environment Variables
33 | .Renviron
34 | # pkgdown site
35 | docs/
36 | # translation temp files
37 | po/*~
38 | # renv detritus
39 | renv/sandbox/
40 | _site
41 | .DS_Store
42 | .Rproj.user
43 | _episodes_rmd/data/NEON-DS-Airborne-Remote-Sensing
44 | _episodes_rmd/data/NEON-DS-Site-Layout-Files
45 | _episodes_rmd/data/NEON-DS-Landsat-NDVI
46 | _episodes_rmd/data/NEON-DS-Met-Time-Series
47 | _episodes_rmd/chm_ov_SJER.tif
48 | _episodes_rmd/meanNDVI_SJER_2011.csv
49 | *.pyc
50 | *~
51 | .ipynb_checkpoints
52 | .sass-cache
53 | .jekyll-cache/
54 | .jekyll-metadata
55 | __pycache__
56 | .bundle/
57 | .vendor/
58 | vendor/
59 | .docker-vendor/
60 | Gemfile.lock
61 | .*history
--------------------------------------------------------------------------------
/.zenodo.json:
--------------------------------------------------------------------------------
1 | {
2 | "contributors": [
3 | {
4 | "type": "Editor",
5 | "name": "Jemma Stachelek",
6 | "orcid": "0000-0002-5924-2464"
7 | },
8 | {
9 | "type": "Editor",
10 | "name": "Drake Asberry"
11 | },
12 | {
13 | "type": "Editor",
14 | "name": "Ivo Agbor Arrey",
15 | "orcid": "0000-0002-5311-3813"
16 | }
17 | ],
18 | "creators": [
19 | {
20 | "name": "Jemma Stachelek",
21 | "orcid": "0000-0002-5924-2464"
22 | },
23 | {
24 | "name": "Erin Alison Becker",
25 | "orcid": "0000-0002-6832-0233"
26 | },
27 | {
28 | "name": "Drake Asberry"
29 | },
30 | {
31 | "name": "Matt Strimas-Mackey",
32 | "orcid": "0000-0001-8929-7776"
33 | },
34 | {
35 | "name": "Annajiat Alim Rasel",
36 | "orcid": "0000-0003-0198-3734"
37 | },
38 | {
39 | "name": "Angela Li",
40 | "orcid": "0000-0002-8956-419X"
41 | },
42 | {
43 | "name": "Ryan Avery"
44 | },
45 | {
46 | "name": "kcarini",
47 | "orcid": "0000-0002-9630-0432"
48 | },
49 | {
50 | "name": "Kunal Marwaha",
51 | "orcid": "0000-0001-9084-6971"
52 | },
53 | {
54 | "name": "mneilson-usgs"
55 | },
56 | {
57 | "name": "Adam H. Sparks",
58 | "orcid": "0000-0002-0061-8359"
59 | },
60 | {
61 | "name": "bart1"
62 | },
63 | {
64 | "name": "Christian Boldsen Knudsen",
65 | "orcid": "0000-0002-9816-768X"
66 | },
67 | {
68 | "name": "Daniel Kerchner",
69 | "orcid": "0000-0002-5921-2193"
70 | },
71 | {
72 | "name": "Darya P Vanichkina",
73 | "orcid": "0000-0002-0406-164X"
74 | },
75 | {
76 | "name": "Pérez-Suárez",
77 | "orcid": "0000-0003-0784-6909"
78 | },
79 | {
80 | "name": "Jon Jablonski"
81 | },
82 | {
83 | "name": "Michael Liou"
84 | },
85 | {
86 | "name": "Natalia Morandeira",
87 | "orcid": "0000-0003-3674-2981"
88 | },
89 | {
90 | "name": "Rob Williams",
91 | "orcid": "0000-0001-9259-3883"
92 | }
93 | ],
94 | "license": {
95 | "id": "CC-BY-4.0"
96 | }
97 | }
--------------------------------------------------------------------------------
/CITATION:
--------------------------------------------------------------------------------
1 | Data Carpentry Introduction to Geospatial Raster and Vector Data with R
2 | Leah Wasser; Megan A. Jones; Jemma Stachelek; Lachlan Deer; Zack Brym; Lauren O'Brien; Ana Costa Conrado; Aateka Shashank; Kristina Riemer; Anne Fouilloux; Juan Fung; Marchand; Tracy Teal; Sergio Marconi; James Holmquist; Mike Smorul; Punam Amratia; Erin Becker; Katrin Leinweber
3 | Editors: Jemma Stachelek; Lauren O'Brien; Jane Wyngaard
4 | https://doi.org/10.5281/zenodo.1404424
5 |
--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Contributor Code of Conduct"
3 | ---
4 |
5 | As contributors and maintainers of this project,
6 | we pledge to follow [The Carpentries Code of Conduct][coc].
7 |
8 | Instances of abusive, harassing, or otherwise unacceptable behavior
9 | may be reported by following our [reporting guidelines][coc-reporting].
10 |
11 |
12 | [coc-reporting]: https://docs.carpentries.org/topic_folders/policies/incident-reporting.html
13 | [coc]: https://docs.carpentries.org/topic_folders/policies/code-of-conduct.html
14 |
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | ## Contributing
2 |
3 | [The Carpentries][cp-site] ([Software Carpentry][swc-site], [Data
4 | Carpentry][dc-site], and [Library Carpentry][lc-site]) are open source
5 | projects, and we welcome contributions of all kinds: new lessons, fixes to
6 | existing material, bug reports, and reviews of proposed changes are all
7 | welcome.
8 |
9 | ### Contributor Agreement
10 |
11 | By contributing, you agree that we may redistribute your work under [our
12 | license](LICENSE.md). In exchange, we will address your issues and/or assess
13 | your change proposal as promptly as we can, and help you become a member of our
14 | community. Everyone involved in [The Carpentries][cp-site] agrees to abide by
15 | our [code of conduct](CODE_OF_CONDUCT.md).
16 |
17 | ### How to Contribute
18 |
19 | The easiest way to get started is to file an issue to tell us about a spelling
20 | mistake, some awkward wording, or a factual error. This is a good way to
21 | introduce yourself and to meet some of our community members.
22 |
23 | 1. If you do not have a [GitHub][github] account, you can [send us comments by
24 | email][contact]. However, we will be able to respond more quickly if you use
25 | one of the other methods described below.
26 |
27 | 2. If you have a [GitHub][github] account, or are willing to [create
28 | one][github-join], but do not know how to use Git, you can report problems
29 | or suggest improvements by [creating an issue][issues]. This allows us to
30 | assign the item to someone and to respond to it in a threaded discussion.
31 |
32 | 3. If you are comfortable with Git, and would like to add or change material,
33 | you can submit a pull request (PR). Instructions for doing this are
34 | [included below](#using-github).
35 |
36 | Note: if you want to build the website locally, please refer to [The Workbench
37 | documentation][template-doc].
38 |
39 | ### Where to Contribute
40 |
41 | 1. If you wish to change this lesson, add issues and pull requests here.
42 | 2. If you wish to change the template used for workshop websites, please refer
43 | to [The Workbench documentation][template-doc].
44 |
45 |
46 | ### What to Contribute
47 |
48 | There are many ways to contribute, from writing new exercises and improving
49 | existing ones to updating or filling in the documentation and submitting [bug
50 | reports][issues] about things that do not work, are not clear, or are missing.
51 | If you are looking for ideas, please see [the list of issues for this
52 | repository][repo], or the issues for [Data Carpentry][dc-issues], [Library
53 | Carpentry][lc-issues], and [Software Carpentry][swc-issues] projects.
54 |
55 | Comments on issues and reviews of pull requests are just as welcome: we are
56 | smarter together than we are on our own. **Reviews from novices and newcomers
57 | are particularly valuable**: it's easy for people who have been using these
58 | lessons for a while to forget how impenetrable some of this material can be, so
59 | fresh eyes are always welcome.
60 |
61 | ### What *Not* to Contribute
62 |
63 | Our lessons already contain more material than we can cover in a typical
64 | workshop, so we are usually *not* looking for more concepts or tools to add to
65 | them. As a rule, if you want to introduce a new idea, you must (a) estimate how
66 | long it will take to teach and (b) explain what you would take out to make room
67 | for it. The first encourages contributors to be honest about requirements; the
68 | second, to think hard about priorities.
69 |
70 | We are also not looking for exercises or other material that only run on one
71 | platform. Our workshops typically contain a mixture of Windows, macOS, and
72 | Linux users; in order to be usable, our lessons must run equally well on all
73 | three.
74 |
75 | ### Using GitHub
76 |
77 | If you choose to contribute via GitHub, you may want to look at [How to
78 | Contribute to an Open Source Project on GitHub][how-contribute]. In brief, we
79 | use [GitHub flow][github-flow] to manage changes:
80 |
81 | 1. Create a new branch in your desktop copy of this repository for each
82 | significant change.
83 | 2. Commit the change in that branch.
84 | 3. Push that branch to your fork of this repository on GitHub.
85 | 4. Submit a pull request from that branch to the [upstream repository][repo].
86 | 5. If you receive feedback, make changes on your desktop and push to your
87 | branch on GitHub: the pull request will update automatically.
88 |
89 | NB: The published copy of the lesson is usually in the `main` branch.
90 |
91 | Each lesson has a team of maintainers who review issues and pull requests or
92 | encourage others to do so. The maintainers are community volunteers, and have
93 | final say over what gets merged into the lesson.
94 |
95 | ### Other Resources
96 |
97 | The Carpentries is a global organisation with volunteers and learners all over
98 | the world. We share values of inclusivity and a passion for sharing knowledge,
99 | teaching and learning. There are several ways to connect with The Carpentries
100 | community, including via social
101 | media, Slack, newsletters, and email lists. You can also [reach us by
102 | email][contact].
103 |
104 | [repo]: https://github.com/datacarpentry/r-raster-vector-geospatial
105 | [contact]: mailto:team@carpentries.org
106 | [cp-site]: https://carpentries.org/
107 | [dc-issues]: https://github.com/issues?q=user%3Adatacarpentry
108 | [dc-lessons]: https://datacarpentry.org/lessons/
109 | [dc-site]: https://datacarpentry.org/
110 | [discuss-list]: https://lists.software-carpentry.org/listinfo/discuss
111 | [github]: https://github.com
112 | [github-flow]: https://guides.github.com/introduction/flow/
113 | [github-join]: https://github.com/join
114 | [how-contribute]: https://egghead.io/series/how-to-contribute-to-an-open-source-project-on-github
115 | [issues]: https://carpentries.org/help-wanted-issues/
116 | [lc-issues]: https://github.com/issues?q=user%3ALibraryCarpentry
117 | [swc-issues]: https://github.com/issues?q=user%3Aswcarpentry
118 | [swc-lessons]: https://software-carpentry.org/lessons/
119 | [swc-site]: https://software-carpentry.org/
120 | [lc-site]: https://librarycarpentry.org/
121 | [template-doc]: https://carpentries.github.io/workbench/
122 |
--------------------------------------------------------------------------------
/LICENSE.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Licenses"
3 | ---
4 |
5 | ## Instructional Material
6 |
7 | All Software Carpentry, Data Carpentry, and Library Carpentry instructional material is
8 | made available under the [Creative Commons Attribution
9 | license][cc-by-human]. The following is a human-readable summary of
10 | (and not a substitute for) the [full legal text of the CC BY 4.0
11 | license][cc-by-legal].
12 |
13 | You are free:
14 |
15 | * to **Share**---copy and redistribute the material in any medium or format
16 | * to **Adapt**---remix, transform, and build upon the material
17 |
18 | for any purpose, even commercially.
19 |
20 | The licensor cannot revoke these freedoms as long as you follow the
21 | license terms.
22 |
23 | Under the following terms:
24 |
25 | * **Attribution**---You must give appropriate credit (mentioning that
26 | your work is derived from work that is Copyright © Software
27 | Carpentry and, where practical, linking to
28 | http://software-carpentry.org/), provide a [link to the
29 | license][cc-by-human], and indicate if changes were made. You may do
30 | so in any reasonable manner, but not in any way that suggests the
31 | licensor endorses you or your use.
32 |
33 | **No additional restrictions**---You may not apply legal terms or
34 | technological measures that legally restrict others from doing
35 | anything the license permits. With the understanding that:
36 |
37 | Notices:
38 |
39 | * You do not have to comply with the license for elements of the
40 | material in the public domain or where your use is permitted by an
41 | applicable exception or limitation.
42 | * No warranties are given. The license may not give you all of the
43 | permissions necessary for your intended use. For example, other
44 | rights such as publicity, privacy, or moral rights may limit how you
45 | use the material.
46 |
47 | ## Software
48 |
49 | Except where otherwise noted, the example programs and other software
50 | provided by Software Carpentry and Data Carpentry are made available under the
51 | [OSI][osi]-approved
52 | [MIT license][mit-license].
53 |
54 | Permission is hereby granted, free of charge, to any person obtaining
55 | a copy of this software and associated documentation files (the
56 | "Software"), to deal in the Software without restriction, including
57 | without limitation the rights to use, copy, modify, merge, publish,
58 | distribute, sublicense, and/or sell copies of the Software, and to
59 | permit persons to whom the Software is furnished to do so, subject to
60 | the following conditions:
61 |
62 | The above copyright notice and this permission notice shall be
63 | included in all copies or substantial portions of the Software.
64 |
65 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
66 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
67 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
68 | NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
69 | LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
70 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
71 | WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
72 |
73 | ## Trademark
74 |
75 | "The Carpentries", "Software Carpentry", "Data Carpentry", and "Library
76 | Carpentry" and their respective logos are registered trademarks of
77 | [The Carpentries, Inc.][carpentries].
78 |
79 | [cc-by-human]: https://creativecommons.org/licenses/by/4.0/
80 | [cc-by-legal]: https://creativecommons.org/licenses/by/4.0/legalcode
81 | [mit-license]: https://opensource.org/licenses/mit-license.html
82 | [carpentries]: https://carpentries.org
83 | [osi]: https://opensource.org
84 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | [](https://zenodo.org/badge/latestdoi/44772343)
2 | [](https://github.com/datacarpentry/r-raster-vector-geospatial/actions/workflows/sandpaper-main.yaml)
3 | [](https://swc-slack-invite.herokuapp.com/)
4 | [](https://swcarpentry.slack.com/messages/C9ME7G5RD)
5 |
6 | # R for Raster and Vector Data
7 |
8 | ## Contributing to lesson development
9 |
10 | - The lesson files to be edited are in the `episodes` folder. This repository uses the `main` branch for development.
11 | - You can visualize the changes locally with the [sandpaper](https://github.com/carpentries/sandpaper) R package by executing either the `sandpaper::serve()` or `sandpaper::build_lesson()` commands. In the former case, the site will be rendered at [http://localhost:4321](http://localhost:4321).
12 | - Each time you push a change to GitHub, GitHub Actions rebuilds the lesson, and when it's successful (look for the green badge at the top of the README file), it publishes the result at [http://www.datacarpentry.org/r-raster-vector-geospatial/](http://www.datacarpentry.org/r-raster-vector-geospatial/)
13 | - Note: any manual commit to `gh-pages` will be erased and lost during the automated build and deploy cycle operated by GitHub Actions.
14 |
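For example, to preview your changes locally before pushing (a minimal sketch; it assumes the `sandpaper` package mentioned above is already installed):

```r
# Render the lesson and serve a live preview at http://localhost:4321
sandpaper::serve()

# Alternatively, build the site once without serving it
sandpaper::build_lesson()
```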
15 | ### Lesson Maintainers:
16 |
17 | - [Jemma Stachelek][stachelek_jemma]
18 | - [Ivo Arrey][arreyves]
19 | - Drake Asberry
20 | - Jon Jablonski
21 |
22 | [stachelek_jemma]: https://carpentries.org/instructors/#jsta
23 | [arreyves]: https://carpentries.org/instructors/#arreyves
24 |
--------------------------------------------------------------------------------
/about.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: page
3 | description: "A site devoted to open science and open data."
4 | Tags: []
5 | permalink: about/
6 | image:
7 | feature: NEONCarpentryHeader_2.png
8 | credit: National Ecological Observatory Network (NEON)
9 | creditlink: http://www.neoninc.org
10 | ---
11 |
12 |
13 | ## About the NEON / Data Carpentry Hackathon
14 |
15 | The National Ecological Observatory Network (NEON) is hosting a 3-day lesson-building hackathon to develop a suite of NEON / Data Carpentry data tutorials and corresponding assessment instruments. The tutorials and assessment instruments will be used to teach fundamental big data skills needed to work efficiently with large spatio-temporal data using open tools, such as R, Python, and PostgreSQL.
16 |
17 | Learn more about the Hackathon on the NEON website.
18 |
19 |
--------------------------------------------------------------------------------
/config.yaml:
--------------------------------------------------------------------------------
1 | #------------------------------------------------------------
2 | # Values for this lesson.
3 | #------------------------------------------------------------
4 |
5 | # Which carpentry is this (swc, dc, lc, or cp)?
6 | # swc: Software Carpentry
7 | # dc: Data Carpentry
8 | # lc: Library Carpentry
9 | # cp: Carpentries (to use for instructor training for instance)
10 | # incubator: The Carpentries Incubator
11 | carpentry: 'dc'
12 |
13 | # Overall title for pages.
14 | title: 'Introduction to Geospatial Raster and Vector Data with R'
15 |
16 | # Date the lesson was created (YYYY-MM-DD, this is empty by default)
17 | created: '2015-10-22'
18 |
19 | # Comma-separated list of keywords for the lesson
20 | keywords: 'software, data, lesson, The Carpentries'
21 |
22 | # Life cycle stage of the lesson
23 | # possible values: pre-alpha, alpha, beta, stable
24 | life_cycle: 'transition-step-2'
25 |
26 | # License of the lesson
27 | license: 'CC-BY 4.0'
28 |
29 | # Link to the source repository for this lesson
30 | source: 'https://github.com/datacarpentry/r-raster-vector-geospatial/'
31 |
32 | # Default branch of your lesson
33 | branch: 'main'
34 |
35 | # Who to contact if there are any issues
36 | contact: 'team@carpentries.org'
37 |
38 | # Navigation ------------------------------------------------
39 | #
40 | # Use the following menu items to specify the order of
41 | # individual pages in each dropdown section. Leave blank to
42 | # include all pages in the folder.
43 | #
44 | # Example -------------
45 | #
46 | # episodes:
47 | # - introduction.md
48 | # - first-steps.md
49 | #
50 | # learners:
51 | # - setup.md
52 | #
53 | # instructors:
54 | # - instructor-notes.md
55 | #
56 | # profiles:
57 | # - one-learner.md
58 | # - another-learner.md
59 |
60 | # Order of episodes in your lesson
61 | episodes:
62 | - 01-raster-structure.Rmd
63 | - 02-raster-plot.Rmd
64 | - 03-raster-reproject-in-r.Rmd
65 | - 04-raster-calculations-in-r.Rmd
66 | - 05-raster-multi-band-in-r.Rmd
67 | - 06-vector-open-shapefile-in-r.Rmd
68 | - 07-vector-shapefile-attributes-in-r.Rmd
69 | - 08-vector-plot-shapefiles-custom-legend.Rmd
70 | - 09-vector-when-data-dont-line-up-crs.Rmd
71 | - 10-vector-csv-to-shapefile-in-r.Rmd
72 | - 11-vector-raster-integration.Rmd
73 | - 12-time-series-raster.Rmd
74 | - 13-plot-time-series-rasters-in-r.Rmd
75 | - 14-extract-ndvi-from-rasters-in-r.Rmd
76 |
77 | # Information for Learners
78 | learners:
79 |
80 | # Information for Instructors
81 | instructors:
82 |
83 | # Learner Profiles
84 | profiles:
85 |
86 | # Customisation ---------------------------------------------
87 | #
88 | # This space below is where custom yaml items (e.g. pinning
89 | # sandpaper and varnish versions) should live
90 |
91 |
92 | url: 'https://datacarpentry.org/r-raster-vector-geospatial'
93 |
--------------------------------------------------------------------------------
/episodes/02-raster-plot.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: Plot Raster Data
3 | teaching: 40
4 | exercises: 30
5 | source: Rmd
6 | ---
7 |
8 | ```{r setup, echo=FALSE}
9 | source("setup.R")
10 | ```
11 |
12 | ::::::::::::::::::::::::::::::::::::::: objectives
13 |
14 | - Build customized plots for a single band raster using the `ggplot2` package.
15 | - Layer a raster dataset on top of a hillshade to create an elegant basemap.
16 |
17 | ::::::::::::::::::::::::::::::::::::::::::::::::::
18 |
19 | :::::::::::::::::::::::::::::::::::::::: questions
20 |
21 | - How can I create categorized or customized maps of raster data?
22 | - How can I customize the color scheme of a raster image?
23 | - How can I layer raster data in a single image?
24 |
25 | ::::::::::::::::::::::::::::::::::::::::::::::::::
26 |
27 | ```{r load-libraries, echo=FALSE, results="hide", message=FALSE, warning=FALSE}
28 | library(terra)
29 | library(ggplot2)
30 | library(dplyr)
31 | ```
32 |
33 | ```{r load-data, echo=FALSE}
34 | # Learners will have this data loaded from earlier episode
35 | # DSM data for Harvard Forest
36 | DSM_HARV <-
37 | rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/DSM/HARV_dsmCrop.tif")
38 |
39 | DSM_HARV_df <- as.data.frame(DSM_HARV, xy = TRUE)
40 | ```
41 |
42 | :::::::::::::::::::::::::::::::::::::::::: prereq
43 |
44 | ## Things You'll Need To Complete This Episode
45 |
46 | See the [lesson homepage](.) for detailed information about the software,
47 | data, and other prerequisites you will need to work through the examples in this episode.
48 |
49 |
50 | ::::::::::::::::::::::::::::::::::::::::::::::::::
51 |
52 | ## Plot Raster Data in R
53 |
54 | This episode covers how to plot a raster in R using the `ggplot2`
55 | package with customized coloring schemes.
56 | It also covers how to layer a raster on top of a hillshade to produce an
57 | elegant map. We will continue working with the Digital Surface Model (DSM)
58 | raster for the NEON Harvard Forest Field Site.
59 |
60 | ## Plotting Data Using Breaks
61 |
62 | In the previous episode, we viewed our data using a continuous color ramp. For
63 | clarity and visibility of the plot, we may prefer to view the data "symbolized"
64 | or colored according to ranges of values. This is comparable to a "classified"
65 | map. To do this, we need to tell `ggplot` how many groups to break our data
66 | into, and where those breaks should be. To make these decisions, it is useful
67 | to first explore the distribution of the data using a bar plot. To begin with,
68 | we will use `dplyr`'s `mutate()` function combined with `cut()` to split the
69 | data into 3 bins.
70 |
71 | ```{r histogram-breaks-ggplot}
72 |
73 | DSM_HARV_df <- DSM_HARV_df %>%
74 | mutate(fct_elevation = cut(HARV_dsmCrop, breaks = 3))
75 |
76 | ggplot() +
77 | geom_bar(data = DSM_HARV_df, aes(fct_elevation))
78 |
79 | ```
80 |
81 | If we want to know the cutoff values for the groups, we can ask for the unique
82 | values of `fct_elevation`:
83 |
84 | ```{r unique-breaks}
85 | unique(DSM_HARV_df$fct_elevation)
86 | ```
87 |
88 | And we can get the count of values in each group using `dplyr`'s `count()` function:
89 |
90 | ```{r breaks-count}
91 | DSM_HARV_df %>%
92 | count(fct_elevation)
93 | ```
94 |
95 | We might prefer to customize the cutoff values for these groups.
96 | Let's round the cutoff values so that we have groups for the ranges of
97 | 301–350 m, 351–400 m, and 401–450 m.
98 | To implement this we will give `mutate()` a numeric vector of break points
99 | instead of the number of breaks we want.
100 |
101 | ```{r custom-bins}
102 | custom_bins <- c(300, 350, 400, 450)
103 |
104 | DSM_HARV_df <- DSM_HARV_df %>%
105 | mutate(fct_elevation_2 = cut(HARV_dsmCrop, breaks = custom_bins))
106 |
107 | unique(DSM_HARV_df$fct_elevation_2)
108 | ```
109 |
110 | ::::::::::::::::::::::::::::::::::::::::: callout
111 |
112 | ## Data Tips
113 |
114 | Note that when we supply break values, a set of 4 values will result in 3 bins
115 | of data.
116 |
117 | The bin intervals are shown using `(` to mean exclusive and `]` to mean
118 | inclusive. For example: `(305, 342]` means "greater than 305 and up to and including 342".
119 |
120 |
121 | ::::::::::::::::::::::::::::::::::::::::::::::::::
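The interval notation above can be checked directly at the console. This is a quick sketch with arbitrary values, not part of the lesson data:

```{r cut-interval-demo}
# Four break points produce three bins; "(" excludes, "]" includes
vals <- c(305, 306, 342, 343)
cut(vals, breaks = c(268, 305, 342, 379))
```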
122 |
123 | And now we can plot our bar plot again, using the new groups:
124 |
125 | ```{r histogram-custom-breaks}
126 | ggplot() +
127 | geom_bar(data = DSM_HARV_df, aes(fct_elevation_2))
128 | ```
129 |
130 | And we can get the count of values in each group in the same way we did before:
131 |
132 | ```{r break-count-custom}
133 | DSM_HARV_df %>%
134 | count(fct_elevation_2)
135 | ```
136 |
137 | We can use those groups to plot our raster data, with each group being a
138 | different color:
139 |
140 | ```{r raster-with-breaks}
141 | ggplot() +
142 | geom_raster(data = DSM_HARV_df , aes(x = x, y = y, fill = fct_elevation_2)) +
143 | coord_quickmap()
144 | ```
145 |
146 | The plot above uses the default colors inside `ggplot` for raster objects.
147 | We can specify our own colors to make the plot look a little nicer.
148 | R has a built-in set of colors for plotting terrain, available through
149 | the `terrain.colors()` function.
150 | Since we have three bins, we want to create a 3-color palette:
151 |
152 | ```{r terrain-colors}
153 | terrain.colors(3)
154 | ```
155 |
156 | The `terrain.colors()` function returns *hex colors* -
157 | each of these character strings represents a color.
158 | To use these in our map, we pass them to the
159 | `scale_fill_manual()` function.
160 |
161 | ```{r ggplot-breaks-customcolors}
162 |
163 | ggplot() +
164 | geom_raster(data = DSM_HARV_df , aes(x = x, y = y,
165 | fill = fct_elevation_2)) +
166 | scale_fill_manual(values = terrain.colors(3)) +
167 | coord_quickmap()
168 | ```
169 |
170 | ### More Plot Formatting
171 |
172 | If we need to create multiple plots using the same color palette, we can create
173 | an R object (`my_col`) for the set of colors that we want to use. We can then
174 | quickly change the palette across all plots by modifying the `my_col` object,
175 | rather than each individual plot.
176 |
177 | We can label the x- and y-axes of our plot too using `xlab` and `ylab`.
178 | We can also give the legend a more meaningful title by passing a value
179 | to the `name` argument of the `scale_fill_manual()` function.
180 |
181 | ```{r add-ggplot-labels}
182 |
183 | my_col <- terrain.colors(3)
184 |
185 | ggplot() +
186 | geom_raster(data = DSM_HARV_df , aes(x = x, y = y,
187 | fill = fct_elevation_2)) +
188 | scale_fill_manual(values = my_col, name = "Elevation") +
189 | coord_quickmap()
190 | ```
191 |
192 | Or we can also turn off the labels of both axes by passing `element_blank()` to
193 | the relevant part of the `theme()` function.
194 |
195 | ```{r turn-off-axes}
196 | ggplot() +
197 | geom_raster(data = DSM_HARV_df , aes(x = x, y = y,
198 | fill = fct_elevation_2)) +
199 | scale_fill_manual(values = my_col, name = "Elevation") +
200 | theme(axis.title = element_blank()) +
201 | coord_quickmap()
202 | ```
203 |
204 | ::::::::::::::::::::::::::::::::::::::: challenge
205 |
206 | ## Challenge: Plot Using Custom Breaks
207 |
208 | Create a plot of the Harvard Forest Digital Surface Model (DSM) that has:
209 |
210 | 1. Six classified ranges of values (break points) that are evenly divided among
211 | the range of pixel values.
212 | 2. Axis labels.
213 | 3. A plot title.
214 |
215 | ::::::::::::::: solution
216 |
217 | ## Answers
218 |
219 | ```{r challenge-code-plotting}
220 |
221 | DSM_HARV_df <- DSM_HARV_df %>%
222 | mutate(fct_elevation_6 = cut(HARV_dsmCrop, breaks = 6))
223 |
224 | my_col <- terrain.colors(6)
225 |
226 | ggplot() +
227 | geom_raster(data = DSM_HARV_df , aes(x = x, y = y,
228 | fill = fct_elevation_6)) +
229 | scale_fill_manual(values = my_col, name = "Elevation") +
230 | ggtitle("Classified Elevation Map - NEON Harvard Forest Field Site") +
231 | xlab("UTM Easting Coordinate (m)") +
232 | ylab("UTM Northing Coordinate (m)") +
233 | coord_quickmap()
234 | ```
235 |
236 | :::::::::::::::::::::::::
237 |
238 | ::::::::::::::::::::::::::::::::::::::::::::::::::
239 |
240 | ## Layering Rasters
241 |
242 | We can layer a raster on top of a hillshade raster for the same area, and use a
243 | transparency factor to create a 3-dimensional shaded effect. A
244 | hillshade is a raster that maps the shadows and texture that you would see from
245 | above when viewing terrain.
246 | We will plot the hillshade in grey, using the `alpha` aesthetic.
247 |
248 | First we need to read in our DSM hillshade data and view the structure:
249 |
250 | ```{r}
251 | DSM_hill_HARV <-
252 | rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/DSM/HARV_DSMhill.tif")
253 |
254 | DSM_hill_HARV
255 | ```
256 |
257 | Next we convert it to a dataframe, so that we can plot it using `ggplot2`:
258 |
259 | ```{r}
260 | DSM_hill_HARV_df <- as.data.frame(DSM_hill_HARV, xy = TRUE)
261 |
262 | str(DSM_hill_HARV_df)
263 | ```
264 |
265 | Now we can plot the hillshade data:
266 |
267 | ```{r raster-hillshade}
268 | ggplot() +
269 | geom_raster(data = DSM_hill_HARV_df,
270 | aes(x = x, y = y, alpha = HARV_DSMhill)) +
271 | scale_alpha(range = c(0.15, 0.65), guide = "none") +
272 | coord_quickmap()
273 | ```
274 |
275 | ::::::::::::::::::::::::::::::::::::::::: callout
276 |
277 | ## Data Tips
278 |
279 | Turn off, or hide, the legend on a plot by adding `guide = "none"`
280 | to a `scale_something()` function or by setting
281 | `theme(legend.position = "none")`.
282 |
283 | The alpha value determines how transparent the colors will be (0 being
284 | transparent, 1 being opaque).
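For example, the theme-based approach applied to the hillshade plot above would look like this (a sketch, not evaluated here):

```{r hide-legend-sketch, eval=FALSE}
ggplot() +
  geom_raster(data = DSM_hill_HARV_df,
              aes(x = x, y = y, alpha = HARV_DSMhill)) +
  scale_alpha(range = c(0.15, 0.65)) +
  # hide all legends for this plot
  theme(legend.position = "none") +
  coord_quickmap()
```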
285 |
286 |
287 | ::::::::::::::::::::::::::::::::::::::::::::::::::
288 |
289 | We can layer another raster on top of our hillshade by adding another call to
290 | the `geom_raster()` function. Let's overlay `DSM_HARV` on top of `DSM_hill_HARV`.
291 |
292 | ```{r overlay-hillshade}
293 | ggplot() +
294 | geom_raster(data = DSM_HARV_df ,
295 | aes(x = x, y = y,
296 | fill = HARV_dsmCrop)) +
297 | geom_raster(data = DSM_hill_HARV_df,
298 | aes(x = x, y = y,
299 | alpha = HARV_DSMhill)) +
300 | scale_fill_viridis_c() +
301 | scale_alpha(range = c(0.15, 0.65), guide = "none") +
302 | ggtitle("Elevation with hillshade") +
303 | coord_quickmap()
304 | ```
305 |
306 | ::::::::::::::::::::::::::::::::::::::: challenge
307 |
308 | ## Challenge: Create DTM \& DSM for SJER
309 |
310 | Use the files in the `data/NEON-DS-Airborne-Remote-Sensing/SJER/` directory to
311 | create a Digital Terrain Model map and Digital Surface Model map of the San
312 | Joaquin Experimental Range field site.
313 |
314 | Make sure to:
315 |
316 | - include hillshade in the maps,
317 | - label axes on the DSM map and exclude them from the DTM map,
318 | - include a title for each map,
319 | - experiment with various alpha values and color palettes to represent the
320 | data.
321 |
322 | ::::::::::::::: solution
323 |
324 | ## Answers
325 |
326 | ```{r challenge-hillshade-layering, echo=TRUE}
327 | # CREATE DSM MAPS
328 |
329 | # import DSM data
330 | DSM_SJER <-
331 | rast("data/NEON-DS-Airborne-Remote-Sensing/SJER/DSM/SJER_dsmCrop.tif")
332 | # convert to a df for plotting
333 | DSM_SJER_df <- as.data.frame(DSM_SJER, xy = TRUE)
334 |
335 | # import DSM hillshade
336 | DSM_hill_SJER <-
337 | rast("data/NEON-DS-Airborne-Remote-Sensing/SJER/DSM/SJER_dsmHill.tif")
338 | # convert to a df for plotting
339 | DSM_hill_SJER_df <- as.data.frame(DSM_hill_SJER, xy = TRUE)
340 |
341 | # Build Plot
342 | ggplot() +
343 | geom_raster(data = DSM_SJER_df ,
344 | aes(x = x, y = y,
345 | fill = SJER_dsmCrop,
346 | alpha = 0.8)
347 | ) +
348 | geom_raster(data = DSM_hill_SJER_df,
349 | aes(x = x, y = y,
350 | alpha = SJER_dsmHill)
351 | ) +
352 | scale_fill_viridis_c() +
353 | guides(fill = guide_colorbar()) +
354 | scale_alpha(range = c(0.4, 0.7), guide = "none") +
355 | # remove grey background and grid lines
356 | theme_bw() +
357 | theme(panel.grid.major = element_blank(),
358 | panel.grid.minor = element_blank()) +
359 | xlab("UTM Easting Coordinate (m)") +
360 | ylab("UTM Northing Coordinate (m)") +
361 | ggtitle("DSM with Hillshade") +
362 | coord_quickmap()
363 |
364 | # CREATE DTM MAP
365 | # import DTM
366 | DTM_SJER <-
367 | rast("data/NEON-DS-Airborne-Remote-Sensing/SJER/DTM/SJER_dtmCrop.tif")
368 | DTM_SJER_df <- as.data.frame(DTM_SJER, xy = TRUE)
369 |
370 | # DTM Hillshade
371 | DTM_hill_SJER <-
372 | rast("data/NEON-DS-Airborne-Remote-Sensing/SJER/DTM/SJER_dtmHill.tif")
373 | DTM_hill_SJER_df <- as.data.frame(DTM_hill_SJER, xy = TRUE)
374 |
375 | ggplot() +
376 | geom_raster(data = DTM_SJER_df ,
377 | aes(x = x, y = y,
378 | fill = SJER_dtmCrop,
379 | alpha = 2.0)
380 | ) +
381 | geom_raster(data = DTM_hill_SJER_df,
382 | aes(x = x, y = y,
383 | alpha = SJER_dtmHill)
384 | ) +
385 | scale_fill_viridis_c() +
386 | guides(fill = guide_colorbar()) +
387 | scale_alpha(range = c(0.4, 0.7), guide = "none") +
388 | theme_bw() +
389 | theme(panel.grid.major = element_blank(),
390 | panel.grid.minor = element_blank()) +
391 | theme(axis.title.x = element_blank(),
392 | axis.title.y = element_blank()) +
393 | ggtitle("DTM with Hillshade") +
394 | coord_quickmap()
395 | ```
396 |
397 | :::::::::::::::::::::::::
398 |
399 | ::::::::::::::::::::::::::::::::::::::::::::::::::
400 |
401 |
402 |
403 | :::::::::::::::::::::::::::::::::::::::: keypoints
404 |
405 | - Continuous data ranges can be grouped into categories using `mutate()` and `cut()`.
406 | - Use built-in `terrain.colors()` or set your preferred color scheme manually.
407 | - Layer rasters on top of one another by using the `alpha` aesthetic.
408 |
409 | ::::::::::::::::::::::::::::::::::::::::::::::::::
410 |
411 |
412 |
--------------------------------------------------------------------------------
/episodes/03-raster-reproject-in-r.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: Reproject Raster Data
3 | teaching: 40
4 | exercises: 20
5 | source: Rmd
6 | ---
7 |
8 | ```{r setup, echo=FALSE}
9 | source("setup.R")
10 | ```
11 |
12 | ::::::::::::::::::::::::::::::::::::::: objectives
13 |
14 | - Reproject a raster in R.
15 |
16 | ::::::::::::::::::::::::::::::::::::::::::::::::::
17 |
18 | :::::::::::::::::::::::::::::::::::::::: questions
19 |
20 | - How do I work with raster data sets that are in different projections?
21 |
22 | ::::::::::::::::::::::::::::::::::::::::::::::::::
23 |
24 | ```{r load-libraries, echo=FALSE, results="hide", message=FALSE, warning=FALSE}
25 | library(terra)
26 | library(ggplot2)
27 | library(dplyr)
28 | ```
29 |
30 | :::::::::::::::::::::::::::::::::::::::::: prereq
31 |
32 | ## Things You'll Need To Complete This Episode
33 |
34 | See the [lesson homepage](.) for detailed information about the software,
35 | data, and other prerequisites you will need to work through the examples in
36 | this episode.
37 |
38 |
39 | ::::::::::::::::::::::::::::::::::::::::::::::::::
40 |
41 | Sometimes we encounter raster datasets that do not "line up" when plotted or
42 | analyzed. Rasters that don't line up are most often in different Coordinate
43 | Reference Systems (CRS). This episode explains how to deal with rasters in
44 | different, known CRSs. It will walk through reprojecting rasters in R using
45 | the `project()` function in the `terra` package.
46 |
47 | ## Raster Projection in R
48 |
49 | In the [Plot Raster Data in R](02-raster-plot/)
50 | episode, we learned how to layer a raster file on top of a hillshade for a nice
51 | looking basemap. In that episode, all of our data were in the same CRS. What
52 | happens when things don't line up?
53 |
54 | For this episode, we will be working with the Harvard Forest Digital Terrain
55 | Model data. This differs from the surface model data we've been working with so
56 | far in that the digital surface model (DSM) includes the tops of trees, while
57 | the digital terrain model (DTM) shows the ground level.
58 |
59 | We'll be looking at another model (the canopy height model) in
60 | [a later episode](04-raster-calculations-in-r/) and will see how to calculate
61 | the CHM from the DSM and DTM. Here, we will create a map of the Harvard Forest
62 | Digital Terrain Model (`DTM_HARV`) draped or layered on top of the hillshade
63 | (`DTM_hill_HARV`).
64 | The hillshade layer maps the terrain using light and shadow to create a
65 | 3D-looking image, based on a hypothetical illumination of the ground level.
66 |
67 | {alt='Source: National Ecological Observatory Network (NEON).'}
68 |
69 | First, we need to import the DTM and DTM hillshade data.
70 |
71 | ```{r import-DTM-hillshade}
72 | DTM_HARV <-
73 | rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/DTM/HARV_dtmCrop.tif")
74 |
75 | DTM_hill_HARV <-
76 | rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/DTM/HARV_DTMhill_WGS84.tif")
77 | ```
78 |
79 | Next, we will convert each of these datasets to a dataframe for
80 | plotting with `ggplot`.
81 |
82 | ```{r}
83 | DTM_HARV_df <- as.data.frame(DTM_HARV, xy = TRUE)
84 |
85 | DTM_hill_HARV_df <- as.data.frame(DTM_hill_HARV, xy = TRUE)
86 | ```
87 |
88 | Now we can create a map of the DTM layered over the hillshade.
89 |
90 | ```{r}
91 | ggplot() +
92 | geom_raster(data = DTM_HARV_df ,
93 | aes(x = x, y = y,
94 | fill = HARV_dtmCrop)) +
95 | geom_raster(data = DTM_hill_HARV_df,
96 | aes(x = x, y = y,
97 | alpha = HARV_DTMhill_WGS84)) +
98 | scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) +
99 | coord_quickmap()
100 | ```
101 |
102 | Our results are curious - neither the Digital Terrain Model (`DTM_HARV_df`)
103 | nor the DTM Hillshade (`DTM_hill_HARV_df`) plotted.
104 | Let's try to plot the DTM on its own to make sure there are data there.
105 |
106 | ```{r plot-DTM}
107 | ggplot() +
108 | geom_raster(data = DTM_HARV_df,
109 | aes(x = x, y = y,
110 | fill = HARV_dtmCrop)) +
111 | scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) +
112 | coord_quickmap()
113 | ```
114 |
115 | Our DTM seems to contain data and plots just fine.
116 |
117 | Next we plot the DTM Hillshade on its own to see whether everything is OK.
118 |
119 | ```{r plot-DTM-hill}
120 | ggplot() +
121 | geom_raster(data = DTM_hill_HARV_df,
122 | aes(x = x, y = y,
123 | alpha = HARV_DTMhill_WGS84)) +
124 | coord_quickmap()
125 | ```
126 |
127 | If we look at the axes, we can see that the projections of the two rasters are
128 | different.
129 | When this is the case, `ggplot` won't render the image. It won't even throw an
130 | error message to tell you something has gone wrong. We can look at Coordinate
131 | Reference Systems (CRSs) of the DTM and the hillshade data to see how they
132 | differ.
133 |
134 | ::::::::::::::::::::::::::::::::::::::: challenge
135 |
136 | ## Exercise
137 |
138 | View the CRS for each of these two datasets. What projection
139 | does each use?
140 |
141 | ::::::::::::::: solution
142 |
143 | ## Solution
144 |
145 | ```{r explore-crs}
146 | # view crs for DTM
147 | crs(DTM_HARV, parse = TRUE)
148 |
149 | # view crs for hillshade
150 | crs(DTM_hill_HARV, parse = TRUE)
151 | ```
152 |
153 | `DTM_HARV` is in the UTM projection, with units of meters.
154 | `DTM_hill_HARV` is in
155 | `Geographic WGS84` - which is represented by latitude and longitude values.
156 |
157 |
158 |
159 | :::::::::::::::::::::::::
160 |
161 | ::::::::::::::::::::::::::::::::::::::::::::::::::
162 |
163 | Because the two rasters are in different CRSs, they don't line up when plotted
164 | in R. We need to reproject (or change the projection of) `DTM_hill_HARV` into
165 | the UTM CRS. Alternatively, we could reproject `DTM_HARV` into WGS84.
166 |
167 | ## Reproject Rasters
168 |
169 | We can use the `project()` function to reproject a raster into a new CRS.
170 | Keep in mind that reprojection only works when you first have a defined CRS
171 | for the raster object that you want to reproject. It cannot be used if no
172 | CRS is defined. Lucky for us, the `DTM_hill_HARV` has a defined CRS.
173 |
174 | ::::::::::::::::::::::::::::::::::::::::: callout
175 |
176 | ## Data Tip
177 |
178 | When we reproject a raster, we move it from one "grid" to another. Thus, we are
179 | modifying the data! Keep this in mind as we work with raster data.
180 |
181 |
182 | ::::::::::::::::::::::::::::::::::::::::::::::::::
183 |
184 | To use the `project()` function, we need to define two things:
185 |
186 | 1. the object we want to reproject and
187 | 2. the CRS that we want to reproject it to.
188 |
189 | The syntax is `project(RasterObject, crs)`
190 |
191 | We want the CRS of our hillshade to match the `DTM_HARV` raster. We can thus
192 | assign the CRS of our `DTM_HARV` to our hillshade within the `project()`
193 | function as follows: `crs(DTM_HARV)`.
194 | Note that we are using the `project()` function on the raster object,
195 | not the data frame we use for plotting with `ggplot`.
196 |
197 | First we will reproject our `DTM_hill_HARV` raster data to match the `DTM_HARV`
198 | raster CRS:
199 |
200 | ```{r reproject-raster}
201 | DTM_hill_UTMZ18N_HARV <- project(DTM_hill_HARV,
202 | crs(DTM_HARV))
203 | ```
204 |
205 | Now we can compare the CRS of our original DTM hillshade and our new DTM
206 | hillshade, to see how they are different.
207 |
208 | ```{r}
209 | crs(DTM_hill_UTMZ18N_HARV, parse = TRUE)
210 | crs(DTM_hill_HARV, parse = TRUE)
211 | ```
212 |
213 | We can also compare the extent of the two objects.
214 |
215 | ```{r}
216 | ext(DTM_hill_UTMZ18N_HARV)
217 | ext(DTM_hill_HARV)
218 | ```
219 |
220 | Notice in the output above that the `crs()` of `DTM_hill_UTMZ18N_HARV` is now
221 | UTM. However, the extent values of `DTM_hill_UTMZ18N_HARV` are different from
222 | `DTM_hill_HARV`.
223 |
224 | ::::::::::::::::::::::::::::::::::::::: challenge
225 |
226 | ## Challenge: Extent Change with CRS Change
227 |
228 | Why do you think the two extents differ?
229 |
230 | ::::::::::::::: solution
231 |
232 | ## Answers
233 |
234 | The extent for DTM\_hill\_UTMZ18N\_HARV is in UTM coordinates, so the extent
235 | is given in meters. The extent for DTM\_hill\_HARV is in lat/long, so the
236 | extent is expressed in decimal degrees.
237 |
238 |
239 |
240 | :::::::::::::::::::::::::
241 |
242 | ::::::::::::::::::::::::::::::::::::::::::::::::::
243 |
244 | ## Deal with Raster Resolution
245 |
246 | Let's next have a look at the resolution of our reprojected hillshade versus
247 | our original data.
248 |
249 | ```{r view-resolution}
250 | res(DTM_hill_UTMZ18N_HARV)
251 | res(DTM_HARV)
252 | ```
253 |
254 | These two resolutions are different, but they're representing the same data. We
255 | can tell R to force our newly reprojected raster to be 1m x 1m resolution by
256 | adding the argument `res = 1` to the `project()` function call. In the
257 | example below, we ensure a resolution match by passing `res(DTM_HARV)`
258 | instead of a hard-coded value.
259 |
260 | ```{r reproject-assign-resolution}
261 | DTM_hill_UTMZ18N_HARV <- project(DTM_hill_HARV,
262 | crs(DTM_HARV),
263 | res = res(DTM_HARV))
264 | ```
265 |
266 | Now both our resolutions and our CRSs match, so we can plot these two data sets
267 | together. Let's double-check our resolution to be sure:
268 |
269 | ```{r}
270 | res(DTM_hill_UTMZ18N_HARV)
271 | res(DTM_HARV)
272 | ```
273 |
274 | For plotting with `ggplot()`, we will need to create a dataframe from our newly
275 | reprojected raster.
276 |
277 | ```{r make-df-projected-raster}
278 | DTM_hill_HARV_2_df <- as.data.frame(DTM_hill_UTMZ18N_HARV, xy = TRUE)
279 | ```
280 |
281 | We can now create a plot of this data.
282 |
283 | ```{r plot-projected-raster}
284 | ggplot() +
285 | geom_raster(data = DTM_HARV_df ,
286 | aes(x = x, y = y,
287 | fill = HARV_dtmCrop)) +
288 | geom_raster(data = DTM_hill_HARV_2_df,
289 | aes(x = x, y = y,
290 | alpha = HARV_DTMhill_WGS84)) +
291 | scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) +
292 | coord_quickmap()
293 | ```
294 |
295 | We have now successfully draped the Digital Terrain Model on top of our
296 | hillshade to produce a nice looking, textured map!
297 |
298 | ::::::::::::::::::::::::::::::::::::::: challenge
299 |
300 | ## Challenge: Reproject, then Plot a Digital Terrain Model
301 |
302 | Create a map of the
303 | [San Joaquin Experimental Range](https://www.neonscience.org/field-sites/field-sites-map/SJER)
304 | field site using the `SJER_DSMhill_WGS84.tif` and `SJER_dsmCrop.tif` files.
305 |
306 | Reproject the data as necessary to make things line up!
307 |
308 | ::::::::::::::: solution
309 |
310 | ## Answers
311 |
312 | ```{r challenge-code-reprojection, echo=TRUE}
313 | # import DSM
314 | DSM_SJER <-
315 | rast("data/NEON-DS-Airborne-Remote-Sensing/SJER/DSM/SJER_dsmCrop.tif")
316 | # import DSM hillshade
317 | DSM_hill_SJER_WGS <-
318 | rast("data/NEON-DS-Airborne-Remote-Sensing/SJER/DSM/SJER_DSMhill_WGS84.tif")
319 |
320 | # reproject raster
321 | DSM_hill_UTMZ18N_SJER <- project(DSM_hill_SJER_WGS,
322 | crs(DSM_SJER),
323 | res = 1)
324 |
325 | # convert to data.frames
326 | DSM_SJER_df <- as.data.frame(DSM_SJER, xy = TRUE)
327 |
328 | DSM_hill_SJER_df <- as.data.frame(DSM_hill_UTMZ18N_SJER, xy = TRUE)
329 |
330 | ggplot() +
331 | geom_raster(data = DSM_hill_SJER_df,
332 | aes(x = x, y = y,
333 | alpha = SJER_DSMhill_WGS84)
334 | ) +
335 | geom_raster(data = DSM_SJER_df,
336 | aes(x = x, y = y,
337 | fill = SJER_dsmCrop,
338 | alpha=0.8)
339 | ) +
340 | scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) +
341 | coord_quickmap()
342 | ```
343 |
344 | :::::::::::::::::::::::::
345 |
346 | If you completed the San Joaquin plotting challenge in the
347 | [Plot Raster Data in R](02-raster-plot/)
348 | episode, how does the map you just created compare to that map?
349 |
350 | ::::::::::::::: solution
351 |
352 | ## Answers
353 |
354 | The maps look identical, which is what we would expect: the only difference
355 | is that this one was reprojected from WGS84 to UTM prior to plotting.
356 |
357 |
358 |
359 | :::::::::::::::::::::::::
360 |
361 | ::::::::::::::::::::::::::::::::::::::::::::::::::
362 |
363 |
364 |
365 | :::::::::::::::::::::::::::::::::::::::: keypoints
366 |
367 | - In order to plot two raster data sets together, they must be in the same CRS.
368 | - Use the `project()` function to convert between CRSs.
369 |
370 | ::::::::::::::::::::::::::::::::::::::::::::::::::
371 |
372 |
373 |
--------------------------------------------------------------------------------
/episodes/04-raster-calculations-in-r.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: Raster Calculations
3 | teaching: 40
4 | exercises: 20
5 | source: Rmd
6 | ---
7 |
8 | ```{r setup, echo=FALSE}
9 | source("setup.R")
10 | ```
11 |
12 | ::::::::::::::::::::::::::::::::::::::: objectives
13 |
14 | - Perform a subtraction between two rasters using raster math.
15 | - Perform a more efficient subtraction between two rasters using the `terra` package's `lapp()` function.
16 | - Export raster data as a GeoTIFF file.
17 |
18 | ::::::::::::::::::::::::::::::::::::::::::::::::::
19 |
20 | :::::::::::::::::::::::::::::::::::::::: questions
21 |
22 | - How do I subtract one raster from another and extract pixel values for defined locations?
23 |
24 | ::::::::::::::::::::::::::::::::::::::::::::::::::
25 |
26 | ```{r load-libraries, echo=FALSE, results="hide", message=FALSE, warning=FALSE}
27 | library(terra)
28 | library(ggplot2)
29 | library(dplyr)
30 | ```
31 |
32 | ```{r load-data, echo=FALSE}
33 | # Learners will have these data loaded from earlier episode
34 | # DSM data for Harvard Forest
35 | DSM_HARV <-
36 | rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/DSM/HARV_dsmCrop.tif")
37 |
38 | DSM_HARV_df <- as.data.frame(DSM_HARV, xy = TRUE)
39 |
40 | # DTM data for Harvard Forest
41 | DTM_HARV <-
42 | rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/DTM/HARV_dtmCrop.tif")
43 |
44 | DTM_HARV_df <- as.data.frame(DTM_HARV, xy = TRUE)
45 |
46 | # DSM data for SJER
47 | DSM_SJER <-
48 | rast("data/NEON-DS-Airborne-Remote-Sensing/SJER/DSM/SJER_dsmCrop.tif")
49 |
50 | DSM_SJER_df <- as.data.frame(DSM_SJER, xy = TRUE)
51 |
52 | # DTM data for SJER
53 | DTM_SJER <-
54 | rast("data/NEON-DS-Airborne-Remote-Sensing/SJER/DTM/SJER_dtmCrop.tif")
55 |
56 | DTM_SJER_df <- as.data.frame(DTM_SJER, xy = TRUE)
57 | ```
58 |
59 | :::::::::::::::::::::::::::::::::::::::::: prereq
60 |
61 | ## Things You'll Need To Complete This Episode
62 |
63 | See the [lesson homepage](.) for detailed information about the software,
64 | data, and other prerequisites you will need to work through the examples in
65 | this episode.
66 |
67 |
68 | ::::::::::::::::::::::::::::::::::::::::::::::::::
69 |
70 | We often want to combine values from, and perform calculations on, rasters to
71 | create a new output raster. This episode covers how to subtract one raster from
72 | another using basic raster math and the `lapp()` function. It also covers
73 | how to extract pixel values from a set of locations - for example a buffer
74 | region around plot locations at a field site.
75 |
76 | ## Raster Calculations in R
77 |
78 | We often want to perform calculations on two or more rasters to create a new
79 | output raster. For example, if we are interested in mapping the heights of
80 | trees across an entire field site, we might want to calculate the difference
81 | between the Digital Surface Model (DSM, tops of trees) and the Digital Terrain
82 | Model (DTM, ground level). The resulting dataset is referred to as a Canopy
83 | Height Model (CHM) and represents the actual height of trees, buildings, etc.
84 | with the influence of ground elevation removed.
85 |
86 | {alt='Source: National Ecological Observatory Network (NEON)'}
87 |
88 | ::::::::::::::::::::::::::::::::::::::::: callout
89 |
90 | ## More Resources
91 |
92 | - Check out more on LiDAR CHM, DTM and DSM in this NEON Data Skills overview tutorial:
93 | [What is a CHM, DSM and DTM? About Gridded, Raster LiDAR Data](https://www.neonscience.org/chm-dsm-dtm-gridded-lidar-data).
94 |
95 |
96 | ::::::::::::::::::::::::::::::::::::::::::::::::::
97 |
98 | ### Load the Data
99 |
100 | For this episode, we will use the DTM and DSM from the NEON Harvard Forest
101 | Field site and San Joaquin Experimental Range, which we already have loaded
102 | from previous episodes.
103 |
104 | ::::::::::::::::::::::::::::::::::::::: challenge
105 |
106 | ## Exercise
107 |
108 | Use the `describe()` function to view information about the DTM and DSM data
109 | files. Do the two rasters have the same or different CRSs and resolutions? Do
110 | they both have defined minimum and maximum values?
111 |
112 | ::::::::::::::: solution
113 |
114 | ## Solution
115 |
116 | ```{r}
117 | describe("data/NEON-DS-Airborne-Remote-Sensing/HARV/DTM/HARV_dtmCrop.tif")
118 | describe("data/NEON-DS-Airborne-Remote-Sensing/HARV/DSM/HARV_dsmCrop.tif")
119 | ```
120 |
121 | :::::::::::::::::::::::::
122 |
123 | ::::::::::::::::::::::::::::::::::::::::::::::::::
124 |
125 | We've already loaded and worked with these two data files in
126 | earlier episodes. Let's plot them each once more to remind ourselves
127 | what this data looks like. First we'll plot the DTM elevation data:
128 |
129 | ```{r harv-dtm-plot}
130 | ggplot() +
131 | geom_raster(data = DTM_HARV_df ,
132 | aes(x = x, y = y, fill = HARV_dtmCrop)) +
133 | scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) +
134 | coord_quickmap()
135 | ```
136 |
137 | And then the DSM elevation data:
138 |
139 | ```{r harv-dsm-plot}
140 | ggplot() +
141 | geom_raster(data = DSM_HARV_df ,
142 | aes(x = x, y = y, fill = HARV_dsmCrop)) +
143 | scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) +
144 | coord_quickmap()
145 | ```
146 |
147 | ## Two Ways to Perform Raster Calculations
148 |
149 | We can calculate the difference between two rasters in two different ways:
150 |
151 | - by directly subtracting the two rasters in R using raster math
152 |
153 | or for more efficient processing - particularly if our rasters are large and/or
154 | the calculations we are performing are complex:
155 |
156 | - using the `lapp()` function.
157 |
158 | ## Raster Math \& Canopy Height Models
159 |
160 | We can perform raster calculations by subtracting (or adding,
161 | multiplying, etc) two rasters. In the geospatial world, we call this
162 | "raster math".
163 |
164 | Let's subtract the DTM from the DSM to create a Canopy Height Model.
165 | After subtracting, let's create a dataframe so we can plot with `ggplot`.
166 |
167 | ```{r raster-math}
168 | CHM_HARV <- DSM_HARV - DTM_HARV
169 |
170 | CHM_HARV_df <- as.data.frame(CHM_HARV, xy = TRUE)
171 | ```
172 |
173 | We can now plot the output CHM.
174 |
175 | ```{r harv-chm-plot}
176 | ggplot() +
177 | geom_raster(data = CHM_HARV_df ,
178 | aes(x = x, y = y, fill = HARV_dsmCrop)) +
179 | scale_fill_gradientn(name = "Canopy Height", colors = terrain.colors(10)) +
180 | coord_quickmap()
181 | ```
182 |
183 | Let's have a look at the distribution of values in our newly created
184 | Canopy Height Model (CHM).
185 |
186 | ```{r create-hist}
187 | ggplot(CHM_HARV_df) +
188 | geom_histogram(aes(HARV_dsmCrop))
189 | ```
190 |
191 | Notice that the range of values for the output CHM is between 0 and 30 meters.
192 | Does this make sense for trees in Harvard Forest?
193 |
194 | ::::::::::::::::::::::::::::::::::::::: challenge
195 |
196 | ## Challenge: Explore CHM Raster Values
197 |
198 | It's often a good idea to explore the range of values in a raster dataset just
199 | like we might explore a dataset that we collected in the field.
200 |
201 | 1. What are the minimum and maximum values for the Harvard Forest Canopy Height Model (`CHM_HARV`) that we just created?
202 | 2. What are two ways you can check this range of data for `CHM_HARV`?
203 | 3. What is the distribution of all the pixel values in the CHM?
204 | 4. Plot a histogram with 6 bins instead of the default and change the color of the histogram.
205 | 5. Plot the `CHM_HARV` raster using breaks that make sense for the data. Include an appropriate color palette for the data, plot title and no axes ticks / labels.
206 |
207 | ::::::::::::::: solution
208 |
209 | ## Answers
210 |
211 | 1) There are missing values in our data, so we need to specify
212 | `na.rm = TRUE`.
213 |
214 | ```{r}
215 | min(CHM_HARV_df$HARV_dsmCrop, na.rm = TRUE)
216 | max(CHM_HARV_df$HARV_dsmCrop, na.rm = TRUE)
217 | ```
218 |
219 | 2) Possible ways include:
220 |
221 | - Create a histogram
222 | - Use the `min()`, `max()`, and `range()` functions.
223 | - Print the object and look at the `values` attribute.
224 |
225 | 3)
226 | ```{r chm-harv-hist}
227 | ggplot(CHM_HARV_df) +
228 | geom_histogram(aes(HARV_dsmCrop))
229 | ```
230 |
231 | 4)
232 | ```{r chm-harv-hist-green}
233 | ggplot(CHM_HARV_df) +
234 | geom_histogram(aes(HARV_dsmCrop), colour="black",
235 | fill="darkgreen", bins = 6)
236 | ```
237 |
238 | 5)
239 | ```{r chm-harv-raster}
240 | custom_bins <- c(0, 10, 20, 30, 40)
241 | CHM_HARV_df <- CHM_HARV_df %>%
242 | mutate(canopy_discrete = cut(HARV_dsmCrop,
243 | breaks = custom_bins))
244 |
245 | ggplot() +
246 | geom_raster(data = CHM_HARV_df , aes(x = x, y = y,
247 | fill = canopy_discrete)) +
248 | scale_fill_manual(values = terrain.colors(4)) +
249 | coord_quickmap()
250 | ```
251 |
252 | :::::::::::::::::::::::::
253 |
254 | ::::::::::::::::::::::::::::::::::::::::::::::::::
255 |
256 | ## Efficient Raster Calculations
257 |
258 | Raster math, like we just did, is an appropriate approach to raster calculations
259 | if:
260 |
261 | 1. The rasters we are using are small in size.
262 | 2. The calculations we are performing are simple.
263 |
264 | However, raster math is a less efficient approach as computation becomes more
265 | complex or as file sizes become large.
266 |
267 | The `lapp()` function takes two or more rasters and applies a function to
268 | them using efficient processing methods. The syntax is
269 |
270 | `outputRaster <- lapp(x, fun=functionName)`
271 |
272 | where `x` can be either a SpatRaster or a SpatRasterDataset, which is an
273 | object that holds multiple rasters. See `help(sds)`.
274 |
275 | ::::::::::::::::::::::::::::::::::::::::: callout
276 |
277 | ## Data Tip
278 |
279 | To create a SpatRasterDataset, we call the function `sds` which can take a list
280 | of raster objects (each one created by calling `rast`).
281 |
282 | ::::::::::::::::::::::::::::::::::::::::::::::::::
283 |
284 | Let's perform the same subtraction calculation that we calculated above using
285 | raster math, using the `lapp()` function.
286 |
287 | ::::::::::::::::::::::::::::::::::::::::: callout
288 |
289 | ## Data Tip
290 |
291 | A custom function consists of a defined set of commands performed on an input
292 | object. Custom functions are particularly useful for tasks that need to be
293 | repeated over and over in the code. A simplified syntax for writing a custom
294 | function in R is:
295 | `function_name <- function(variable1, variable2) { WhatYouWantDone, WhatToReturn}`
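For example, a minimal custom function that subtracts its second argument from its first (the same operation we are about to apply with `lapp()`) could be written and tested like this (a sketch; the function name is illustrative):

```{r custom-function-sketch, eval=FALSE}
# define a custom function taking two inputs
subtract_rasters <- function(r1, r2) {
  return(r1 - r2)
}

# test it on plain numbers before using it on rasters
subtract_rasters(5, 2)
```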
296 |
297 |
298 | ::::::::::::::::::::::::::::::::::::::::::::::::::
299 |
300 | ```{r raster-overlay}
301 | CHM_ov_HARV <- lapp(sds(list(DSM_HARV, DTM_HARV)),
302 | fun = function(r1, r2) { return(r1 - r2) })
303 | ```
304 |
305 | Next we need to convert our new object to a data frame for plotting with
306 | `ggplot`.
307 |
308 | ```{r}
309 | CHM_ov_HARV_df <- as.data.frame(CHM_ov_HARV, xy = TRUE)
310 | ```
311 |
312 | Now we can plot the CHM:
313 |
314 | ```{r harv-chm-overlay}
315 | ggplot() +
316 | geom_raster(data = CHM_ov_HARV_df,
317 | aes(x = x, y = y, fill = HARV_dsmCrop)) +
318 | scale_fill_gradientn(name = "Canopy Height", colors = terrain.colors(10)) +
319 | coord_quickmap()
320 | ```
321 |
322 | How do the plots of the CHM created with manual raster math and the `lapp()`
323 | function compare?
324 |
325 | ## Export a GeoTIFF
326 |
327 | Now that we've created a new raster, let's export the data as a GeoTIFF
328 | file using
329 | the `writeRaster()` function.
330 |
331 | When we write this raster object to a GeoTIFF file we'll name it
332 | `CHM_HARV.tiff`. This name allows us to quickly remember both what the data
333 | contains (CHM data) and where it is from (HARVard Forest). The `writeRaster()` function
334 | by default writes the output file to your working directory unless you specify a
335 | full file path.
336 |
337 | We will specify the output format (`"GTiff"`) and the no data value (`NAflag = -9999`).
338 | We will also tell R to overwrite any data that is already in a file of the same
339 | name.
340 |
341 | ```{r write-raster, eval=FALSE}
342 | writeRaster(CHM_ov_HARV, "CHM_HARV.tiff",
343 | filetype="GTiff",
344 | overwrite=TRUE,
345 | NAflag=-9999)
346 | ```
347 |
348 | ### writeRaster() Options
349 |
350 | The function arguments that we used above include:
351 |
352 | - **filetype:** specify that the format will be `GTiff` or GeoTIFF.
353 | - **overwrite:** If TRUE, R will overwrite any existing file with the same
354 | name in the specified directory. USE THIS SETTING WITH CAUTION!
355 | - **NAflag:** set the GeoTIFF tag for `NoDataValue` to -9999, the National
356 | Ecological Observatory Network's (NEON) standard `NoDataValue`.
357 |
358 | ::::::::::::::::::::::::::::::::::::::: challenge
359 |
360 | ## Challenge: Explore the NEON San Joaquin Experimental Range Field Site
361 |
362 | Data are often more interesting and powerful when we compare them across
363 | various locations. Let's compare some data collected over Harvard Forest to
364 | data collected in Southern California. The
365 | [NEON San Joaquin Experimental Range (SJER) field site](https://www.neonscience.org/field-sites/field-sites-map/SJER)
366 | located in Southern California has a very different ecosystem and climate than
367 | the
368 | [NEON Harvard Forest Field Site](https://www.neonscience.org/field-sites/field-sites-map/HARV)
369 | in Massachusetts.
370 |
371 | Import the SJER DSM and DTM raster files and create a Canopy Height Model.
372 | Then compare the two sites. Be sure to name your R objects and outputs
373 | carefully, as follows: objectType\_SJER (e.g. `DSM_SJER`). This will help you
374 | keep track of data from different sites!
375 |
376 | 0. You should have the DSM and DTM data for the SJER site already
377 | loaded from the
378 | [Plot Raster Data in R](02-raster-plot/)
379 |    episode. Don't forget to check the CRSs and units of the data.
380 | 1. Create a CHM from the two raster layers and check to make sure the data
381 | are what you expect.
382 | 2. Plot the CHM from SJER.
383 | 3. Export the SJER CHM as a GeoTIFF.
384 | 4. Compare the vegetation structure of the Harvard Forest and San Joaquin
385 | Experimental Range.
386 |
387 | ::::::::::::::: solution
388 |
389 | ## Answers
390 |
391 | 1) Use the `lapp()` function to subtract the two rasters \& create the CHM.
392 |
393 | ```{r}
394 | CHM_ov_SJER <- lapp(sds(list(DSM_SJER, DTM_SJER)),
395 | fun = function(r1, r2){ return(r1 - r2) })
396 | ```
397 |
398 | Convert the output to a data frame:
399 |
400 | ```{r}
401 | CHM_ov_SJER_df <- as.data.frame(CHM_ov_SJER, xy = TRUE)
402 | ```
403 |
404 | Create a histogram to check that the data distribution makes sense:
405 |
406 | ```{r sjer-chm-overlay-hist}
407 | ggplot(CHM_ov_SJER_df) +
408 | geom_histogram(aes(SJER_dsmCrop))
409 | ```
410 |
411 | 2) Create a plot of the CHM:
412 |
413 | ```{r sjer-chm-overlay-raster}
414 | ggplot() +
415 | geom_raster(data = CHM_ov_SJER_df,
416 | aes(x = x, y = y,
417 | fill = SJER_dsmCrop)
418 | ) +
419 | scale_fill_gradientn(name = "Canopy Height",
420 | colors = terrain.colors(10)) +
421 | coord_quickmap()
422 | ```
423 |
424 | 3) Export the CHM object to a file:
425 |
426 | ```{r}
427 | writeRaster(CHM_ov_SJER, "chm_ov_SJER.tiff",
428 | filetype = "GTiff",
429 | overwrite = TRUE,
430 | NAflag = -9999)
431 | ```
432 |
433 | 4) Compare the SJER and HARV CHMs.
434 | Tree heights are much shorter in SJER. You can confirm this by
435 | looking at the histograms of the two CHMs.
436 |
437 | ```{r compare-chm-harv-sjer}
438 | ggplot(CHM_ov_HARV_df) +
439 |   geom_histogram(aes(HARV_dsmCrop))
440 |
441 | ggplot(CHM_ov_SJER_df) +
442 | geom_histogram(aes(SJER_dsmCrop))
443 | ```
444 |
445 | :::::::::::::::::::::::::
446 |
447 | ::::::::::::::::::::::::::::::::::::::::::::::::::
448 |
449 |
450 |
451 | :::::::::::::::::::::::::::::::::::::::: keypoints
452 |
453 | - Rasters can be operated on with mathematical functions.
454 | - The `lapp()` function provides an efficient way to do raster math.
455 | - The `writeRaster()` function can be used to write raster data to a file.
456 |
457 | ::::::::::::::::::::::::::::::::::::::::::::::::::
458 |
459 |
460 |
--------------------------------------------------------------------------------
/episodes/05-raster-multi-band-in-r.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: Work with Multi-Band Rasters
3 | teaching: 40
4 | exercises: 20
5 | source: Rmd
6 | ---
7 |
8 | ```{r setup, echo=FALSE}
9 | source("setup.R")
10 | ```
11 |
12 | ::::::::::::::::::::::::::::::::::::::: objectives
13 |
14 | - Identify a single vs. a multi-band raster file.
15 | - Import multi-band rasters into R using the `terra` package.
16 | - Plot multi-band color image rasters in R using the `ggplot` package.
17 |
18 | ::::::::::::::::::::::::::::::::::::::::::::::::::
19 |
20 | :::::::::::::::::::::::::::::::::::::::: questions
21 |
22 | - How can I visualize individual and multiple bands in a raster object?
23 |
24 | ::::::::::::::::::::::::::::::::::::::::::::::::::
25 |
26 | ```{r load-libraries, echo=FALSE, results="hide", message=FALSE, warning=FALSE}
27 | library(terra)
28 | library(ggplot2)
29 | library(dplyr)
30 | ```
31 |
32 | :::::::::::::::::::::::::::::::::::::::::: prereq
33 |
34 | ## Things You'll Need To Complete This Episode
35 |
36 | See the [lesson homepage](.) for detailed information about the software, data,
37 | and other prerequisites you will need to work through the examples in this
38 | episode.
39 |
40 |
41 | ::::::::::::::::::::::::::::::::::::::::::::::::::
42 |
43 | We introduced multi-band raster data in
44 | [an earlier episode](https://datacarpentry.org/organization-geospatial/01-intro-raster-data).
45 | This episode explores how to import and plot a multi-band raster in R.
46 |
47 | ## Getting Started with Multi-Band Data in R
48 |
49 | In this episode, the multi-band data that we are working with is imagery
50 | collected using the
51 | [NEON Airborne Observation Platform](https://www.neonscience.org/data-collection/airborne-remote-sensing)
52 | high resolution camera over the
53 | [NEON Harvard Forest field site](https://www.neonscience.org/field-sites/field-sites-map/HARV).
54 | Each RGB image is a 3-band raster. The same steps would apply to working with a
55 | multi-spectral image with 4 or more bands - like Landsat imagery.
56 |
57 | By using the `rast()` function along with the `lyrs` parameter, we can read
58 | in specific raster bands (e.g. the first one); omitting this parameter reads
59 | in all bands instead.
60 |
61 | ```{r read-single-band}
62 | RGB_band1_HARV <-
63 | rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/RGB_Imagery/HARV_RGB_Ortho.tif",
64 | lyrs = 1)
65 | ```
66 |
67 | We need to convert this data to a data frame in order to plot it with `ggplot`.
68 |
69 | ```{r}
70 | RGB_band1_HARV_df <- as.data.frame(RGB_band1_HARV, xy = TRUE)
71 | ```
72 |
73 | ```{r harv-rgb-band1}
74 | ggplot() +
75 | geom_raster(data = RGB_band1_HARV_df,
76 | aes(x = x, y = y, alpha = HARV_RGB_Ortho_1)) +
77 | coord_quickmap()
78 | ```
79 |
80 | ::::::::::::::::::::::::::::::::::::::: challenge
81 |
82 | ## Challenge
83 |
84 | View the attributes of this band. What are its dimensions, CRS, resolution, min
85 | and max values, and band number?
86 |
87 | ::::::::::::::: solution
88 |
89 | ## Solution
90 |
91 | ```{r}
92 | RGB_band1_HARV
93 | ```
94 |
95 | Notice that when we look at the attributes of this band, we see:
96 | `dimensions : 2317, 3073, 1 (nrow, ncol, nlyr)`
97 |
98 | This is R telling us that we read only one of its bands.
99 |
100 |
101 |
102 | :::::::::::::::::::::::::
103 |
104 | ::::::::::::::::::::::::::::::::::::::::::::::::::
105 |
106 | ::::::::::::::::::::::::::::::::::::::::: callout
107 |
108 | ## Data Tip
109 |
110 | The number of bands in a raster file can also be determined using the
111 | `describe()` function: the syntax is `describe(sources(RGB_band1_HARV))`.
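As a sketch of that tip:

```r
# Report GDAL metadata, including the band count, for the underlying file
describe(sources(RGB_band1_HARV))
```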
112 |
113 |
114 | ::::::::::::::::::::::::::::::::::::::::::::::::::
115 |
116 | ### Image Raster Data Values
117 |
118 | As we saw in the previous exercise, this raster contains values between 0 and
119 | 255. These values represent degrees of brightness associated with the image
120 | band. In the case of a RGB image (red, green and blue), band 1 is the red band.
121 | When we plot the red band, larger numbers (towards 255) represent pixels with
122 | more red in them (a strong red reflection). Smaller numbers (towards 0)
123 | represent pixels with less red in them (less red was reflected). To plot an RGB
124 | image, we mix red + green + blue values into one single color to create a full
125 | color image - similar to the color image a digital camera creates.
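Base R's `rgb()` function illustrates this mixing of band values into one color (a standalone sketch, unrelated to our raster objects):

```r
# Mix red, green and blue intensities (0-255) into a single hex color
rgb(255, 0, 0, maxColorValue = 255)   # all red, no green or blue: "#FF0000"
rgb(34, 139, 34, maxColorValue = 255) # a forest-like green: "#228B22"
```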
126 |
127 | ### Import A Specific Band
128 |
129 | We can use the `rast()` function to import specific bands in our raster object
130 | by specifying which band we want with `lyrs = N` (N represents the band number we
131 | want to work with). To import the green band, we would use `lyrs = 2`.
132 |
133 | ```{r read-specific-band}
134 | RGB_band2_HARV <-
135 | rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/RGB_Imagery/HARV_RGB_Ortho.tif",
136 | lyrs = 2)
137 | ```
138 |
139 | We can convert this data to a data frame and plot the same way we plotted the red band:
140 |
141 | ```{r}
142 | RGB_band2_HARV_df <- as.data.frame(RGB_band2_HARV, xy = TRUE)
143 | ```
144 |
145 | ```{r rgb-harv-band2}
146 | ggplot() +
147 | geom_raster(data = RGB_band2_HARV_df,
148 | aes(x = x, y = y, alpha = HARV_RGB_Ortho_2)) +
149 | coord_equal()
150 | ```
151 |
152 | ::::::::::::::::::::::::::::::::::::::: challenge
153 |
154 | ## Challenge: Making Sense of Single Band Images
155 |
156 | Compare the plots of band 1 (red) and band 2 (green). Is the forested area
157 | darker or lighter in band 2 (the green band) compared to band 1 (the red band)?
158 |
159 | ::::::::::::::: solution
160 |
161 | ## Solution
162 |
163 | We'd expect a *brighter* value for the forest in band 2 (green) than in band 1
164 | (red) because the leaves on most trees appear "green" - healthy leaves
165 | reflect MORE green light than red light.
166 |
167 |
168 |
169 | :::::::::::::::::::::::::
170 |
171 | ::::::::::::::::::::::::::::::::::::::::::::::::::
172 |
173 | ## Raster Stacks in R
174 |
175 | Next, we will work with all three image bands (red, green and blue) as an R
176 | raster object. We will then plot a 3-band composite, or full color, image.
177 |
178 | To bring in all bands of a multi-band raster, we use the `rast()` function.
179 |
180 | ```{r intro-to-raster-stacks}
181 | RGB_stack_HARV <-
182 | rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/RGB_Imagery/HARV_RGB_Ortho.tif")
183 | ```
184 |
185 | Let's preview the attributes of our stack object:
186 |
187 | ```{r}
188 | RGB_stack_HARV
189 | ```
190 |
191 | We can also view the attributes of a single band in the stack. For
192 | example, if we had hundreds of bands, we could specify which band we'd like to
193 | view attributes for using an index value:
194 |
195 | ```{r}
196 | RGB_stack_HARV[[2]]
197 | ```
198 |
199 | We can also use the `ggplot` functions to plot the data in any layer of our
200 | raster object. Remember, we need to convert to a data frame first.
201 |
202 | ```{r}
203 | RGB_stack_HARV_df <- as.data.frame(RGB_stack_HARV, xy = TRUE)
204 | ```
205 |
206 | Each band in our raster stack gets its own column in the data frame. Thus we have:
207 |
208 | ```{r}
209 | str(RGB_stack_HARV_df)
210 | ```
211 |
212 | Let's create a histogram of the first band:
213 |
214 | ```{r rgb-harv-hist-band1}
215 | ggplot() +
216 | geom_histogram(data = RGB_stack_HARV_df, aes(HARV_RGB_Ortho_1))
217 | ```
218 |
219 | And a raster plot of the second band:
220 |
221 | ```{r rgb-harv-plot-band2}
222 | ggplot() +
223 | geom_raster(data = RGB_stack_HARV_df,
224 | aes(x = x, y = y, alpha = HARV_RGB_Ortho_2)) +
225 | coord_quickmap()
226 | ```
227 |
228 | We can access any individual band in the same way.
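For example, a sketch for the third (blue) band, using the `HARV_RGB_Ortho_3` column we saw in the data frame:

```r
# Plot the third (blue) band the same way as the first two
ggplot() +
  geom_raster(data = RGB_stack_HARV_df,
              aes(x = x, y = y, alpha = HARV_RGB_Ortho_3)) +
  coord_quickmap()
```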
229 |
230 | ### Create A Three Band Image
231 |
232 | To render a final three band, colored image in R, we use the `plotRGB()` function.
233 |
234 | This function allows us to:
235 |
236 | 1. Identify what bands we want to render in the red, green and blue regions.
237 | The `plotRGB()` function defaults to a 1=red, 2=green, and 3=blue band
238 | order. However, you can define what bands you'd like to plot manually.
239 | Manual definition of bands is useful if you have, for example a
240 | near-infrared band and want to create a color infrared image.
241 | 2. Adjust the `stretch` of the image to increase or decrease contrast.
242 |
243 | Let's plot our 3-band image. Note that we can use the `plotRGB()` function
244 | directly with our SpatRaster object (we don't need a data frame, as this
245 | function isn't part of the `ggplot2` package).
246 |
247 | ```{r plot-rgb-image}
248 | plotRGB(RGB_stack_HARV,
249 | r = 1, g = 2, b = 3)
250 | ```
251 |
252 | The image above looks pretty good. We can explore whether applying a stretch to
253 | the image might improve clarity and contrast using `stretch="lin"` or
254 | `stretch="hist"`.
255 |
256 | ![](fig/dc-spatial-raster/imageStretch_dark.jpg){alt='Image Stretch'}
257 |
258 | When the range of pixel brightness values is closer to 0, a darker image is
259 | rendered by default. We can stretch the values to extend to the full 0-255
260 | range of potential values to increase the visual contrast of the image.
261 |
262 | ![](fig/dc-spatial-raster/imageStretch_light.jpg){alt='Image Stretch light'}
263 |
264 | When the range of pixel brightness values is closer to 255, a lighter image is
265 | rendered by default. We can stretch the values to extend to the full 0-255
266 | range of potential values to increase the visual contrast of the image.
267 |
268 | ```{r plot-rbg-image-linear}
269 | plotRGB(RGB_stack_HARV,
270 | r = 1, g = 2, b = 3,
271 | scale = 800,
272 | stretch = "lin")
273 | ```
274 |
275 | ```{r plot-rgb-image-hist}
276 | plotRGB(RGB_stack_HARV,
277 | r = 1, g = 2, b = 3,
278 | scale = 800,
279 | stretch = "hist")
280 | ```
281 |
282 | In this case, the stretch doesn't significantly enhance the contrast of our
283 | image, because the reflectance (or brightness) values are already well
284 | distributed between 0 and 255.
285 |
286 | ::::::::::::::::::::::::::::::::::::::: challenge
287 |
288 | ## Challenge - NoData Values
289 |
290 | Let's explore what happens with NoData values when working with RasterStack
291 | objects and using the `plotRGB()` function. We will use the
292 | `HARV_Ortho_wNA.tif` GeoTIFF file in the
293 | `NEON-DS-Airborne-Remote-Sensing/HARV/RGB_Imagery/` directory.
294 |
295 | 1. View the file's attributes. Are there `NoData` values assigned for this file?
296 | 2. If so, what is the `NoData` Value?
297 | 3. How many bands does it have?
298 | 4. Load the multi-band raster file into R.
299 | 5. Plot the object as a true color image.
300 | 6. What happened to the black edges in the data?
301 | 7. What does this tell us about the difference in the data structure between
302 |    `HARV_Ortho_wNA.tif` and `HARV_RGB_Ortho.tif` (R object `RGB_stack_HARV`)?
303 |    How can you check?
304 |
305 | ::::::::::::::: solution
306 |
307 | ## Answers
308 |
309 | 1) First we use the `describe()` function to view the data attributes.
310 |
311 | ```{r}
312 | describe("data/NEON-DS-Airborne-Remote-Sensing/HARV/RGB_Imagery/HARV_Ortho_wNA.tif")
313 | ```
314 |
315 | 2) From the output above, we see that there are `NoData` values and they are
316 | assigned the value of -9999.
317 |
318 | 3) The data has three bands.
319 |
320 | 4) To read in the file, we will use the `rast()` function:
321 |
322 | ```{r}
323 | HARV_NA <-
324 | rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/RGB_Imagery/HARV_Ortho_wNA.tif")
325 | ```
326 |
327 | 5) We can plot the data with the `plotRGB()` function:
328 |
329 | ```{r harv-na-rgb}
330 | plotRGB(HARV_NA,
331 | r = 1, g = 2, b = 3)
332 | ```
333 |
334 | 6) The black edges are not plotted.
335 |
336 | 7) Both data sets have `NoData` values. However, in `HARV_RGB_Ortho.tif` (our
337 | `RGB_stack_HARV` object) the `NoData` value is not defined in the GeoTIFF
338 | tags, so R renders those edge pixels as black because their reflectance
339 | values are 0. The black edges in the other file are defined as -9999, and R
340 | renders them as `NA`.
340 |
341 | ```{r}
342 | describe("data/NEON-DS-Airborne-Remote-Sensing/HARV/RGB_Imagery/HARV_RGB_Ortho.tif")
343 | ```
344 |
345 | :::::::::::::::::::::::::
346 |
347 | ::::::::::::::::::::::::::::::::::::::::::::::::::
348 |
349 | ::::::::::::::::::::::::::::::::::::::::: callout
350 |
351 | ## Data Tip
352 |
353 | We can create a raster object from several, individual single-band GeoTIFFs
354 | too. We will do this in a later episode,
355 | [Raster Time Series Data in R](12-time-series-raster/).
356 |
357 |
358 | ::::::::::::::::::::::::::::::::::::::::::::::::::
359 |
360 | ## SpatRaster in R
361 |
362 | The R SpatRaster object type can handle rasters with multiple bands. A
363 | SpatRaster holds only the parameters that describe the properties of the
364 | raster data, which itself is located somewhere on our computer.
365 |
366 | A SpatRasterDataset object can hold references to sub-datasets, that is,
367 | SpatRaster objects. In most cases, we can work with a SpatRaster in the same
368 | way we might work with a SpatRasterDataset.
369 |
370 | ::::::::::::::::::::::::::::::::::::::::: callout
371 |
372 | ## More Resources
373 |
374 | You can read the help for the `rast()` and `sds()` functions by typing `?rast`
375 | or `?sds`.
376 |
377 |
378 | ::::::::::::::::::::::::::::::::::::::::::::::::::
379 |
380 |
381 | We can build a SpatRasterDataset using a SpatRaster or a list of SpatRaster objects:
382 |
383 | ```{r}
384 | RGB_sds_HARV <- sds(RGB_stack_HARV)
385 | RGB_sds_HARV <- sds(list(RGB_stack_HARV, RGB_stack_HARV))
386 | ```
387 |
388 | We can retrieve the SpatRaster objects from a SpatRasterDataset using
389 | subsetting:
390 |
391 | ```{r}
392 | RGB_sds_HARV[[1]]
393 | RGB_sds_HARV[[2]]
394 | ```
395 |
396 |
397 | ::::::::::::::::::::::::::::::::::::::: challenge
398 |
399 | ## Challenge: What Functions Can Be Used on an R Object of a particular class?
400 |
401 | We can view various functions (or methods) available to use on an R object with
402 | `methods(class=class(objectNameHere))`. Use this to figure out:
403 |
404 | 1. What methods can be used on the `RGB_stack_HARV` object?
405 | 2. What methods can be used on a single band within `RGB_stack_HARV`?
406 | 3. Why do you think there isn't a difference?
407 |
408 | ::::::::::::::: solution
409 |
410 | ## Answers
411 |
412 | 1) We can see a list of all of the methods available for our
413 | SpatRaster object:
414 |
415 | ```{r}
416 | methods(class=class(RGB_stack_HARV))
417 | ```
418 |
419 | 2) And compare that with the methods available for a single band:
420 |
421 | ```{r}
422 | methods(class=class(RGB_stack_HARV[[1]]))
423 | ```
424 |
425 | 3) A SpatRaster is the same class no matter how many bands it has, so the same methods apply.
426 |
427 |
428 |
429 | :::::::::::::::::::::::::
430 |
431 | ::::::::::::::::::::::::::::::::::::::::::::::::::
432 |
433 |
434 |
435 | :::::::::::::::::::::::::::::::::::::::: keypoints
436 |
437 | - A single raster file can contain multiple bands or layers.
438 | - Use the `rast()` function to load all bands in a multi-layer raster file into R.
439 | - Individual bands within a SpatRaster can be accessed, analyzed, and visualized using the same functions no matter how many bands it holds.
440 |
441 | ::::::::::::::::::::::::::::::::::::::::::::::::::
442 |
443 |
444 |
--------------------------------------------------------------------------------
/episodes/06-vector-open-shapefile-in-r.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: Open and Plot Vector Layers
3 | teaching: 20
4 | exercises: 10
5 | source: Rmd
6 | ---
7 |
8 | ```{r setup, echo=FALSE}
9 | source("setup.R")
10 | ```
11 |
12 | ::::::::::::::::::::::::::::::::::::::: objectives
13 |
14 | - Know the difference between point, line, and polygon vector elements.
15 | - Load point, line, and polygon vector layers into R.
16 | - Access the attributes of a spatial object in R.
17 |
18 | ::::::::::::::::::::::::::::::::::::::::::::::::::
19 |
20 | :::::::::::::::::::::::::::::::::::::::: questions
21 |
22 | - How can I distinguish between and visualize point, line and polygon vector data?
23 |
24 | ::::::::::::::::::::::::::::::::::::::::::::::::::
25 |
26 | ```{r load-libraries, echo=FALSE, results="hide", warning=FALSE, message=FALSE}
27 | library(terra)
28 | library(ggplot2)
29 | library(dplyr)
30 | library(sf)
31 | ```
32 |
33 | :::::::::::::::::::::::::::::::::::::::::: prereq
34 |
35 | ## Things You'll Need To Complete This Episode
36 |
37 | See the [lesson homepage](.) for detailed information about the software, data,
38 | and other prerequisites you will need to work through the examples in this
39 | episode.
40 |
41 |
42 | ::::::::::::::::::::::::::::::::::::::::::::::::::
43 |
44 | Starting with this episode, we will be moving from working with raster data to
45 | working with vector data. In this episode, we will open and plot point, line
46 | and polygon vector data loaded from ESRI's `shapefile` format into R. These data refer to
47 | the
48 | [NEON Harvard Forest field site](https://www.neonscience.org/field-sites/field-sites-map/HARV),
49 | which we have been working with in previous episodes. In later episodes, we
50 | will learn how to work with raster and vector data together and combine them
51 | into a single plot.
52 |
53 | ## Import Vector Data
54 |
55 | We will use the `sf` package to work with vector data in R. We will also use
56 | the `terra` package, which has been loaded in previous episodes, so we can
57 | explore raster and vector spatial metadata using similar commands. Make sure
58 | you have the `sf` library loaded.
59 |
60 | ```{r load-sf, results="hide", eval=FALSE, message=FALSE}
61 | library(sf)
62 | ```
63 |
64 | The vector layers that we will import from ESRI's `shapefile` format are:
65 |
66 | - A polygon vector layer representing our field site boundary,
67 | - A line vector layer representing roads, and
68 | - A point vector layer representing the location of the [Fisher flux tower](https://www.neonscience.org/data-collection/flux-tower-measurements)
69 | located at the [NEON Harvard Forest field site](https://www.neonscience.org/field-sites/field-sites-map/HARV).
70 |
71 | The first vector layer that we will open contains the boundary of our study area
72 | (or our Area Of Interest or AOI, hence the name `aoiBoundary`). To import
73 | a vector layer from an ESRI `shapefile` we use the `sf` function `st_read()`. `st_read()`
74 | requires the file path to the ESRI `shapefile`.
75 |
76 | Let's import our AOI:
77 |
78 | ```{r Import-Shapefile}
79 | aoi_boundary_HARV <- st_read(
80 | "data/NEON-DS-Site-Layout-Files/HARV/HarClip_UTMZ18.shp")
81 | ```
82 |
83 | ## Vector Layer Metadata \& Attributes
84 |
85 | When we import the `HarClip_UTMZ18` vector layer from an ESRI `shapefile` into R (as our
86 | `aoi_boundary_HARV` object), the `st_read()` function automatically stores
87 | information about the data. We are particularly interested in the geospatial
88 | metadata, describing the format, CRS, extent, and other components of the
89 | vector data, and the attributes which describe properties associated with each
90 | individual vector object.
91 |
92 | ::::::::::::::::::::::::::::::::::::::::: callout
93 |
94 | ## Data Tip
95 |
96 | The [Explore and Plot by Vector Layer Attributes](07-vector-shapefile-attributes-in-r/)
97 | episode provides more information on both metadata and attributes
98 | and using attributes to subset and plot data.
99 |
100 |
101 | ::::::::::::::::::::::::::::::::::::::::::::::::::
102 |
103 | ## Spatial Metadata
104 |
105 | Key metadata for all vector layers includes:
106 |
107 | 1. **Object Type:** the class of the imported object.
108 | 2. **Coordinate Reference System (CRS):** the projection of the data.
109 | 3. **Extent:** the spatial extent (i.e. geographic area that the vector layer
110 | covers) of the data. Note that the spatial extent for a vector layer
111 | represents the combined extent for all individual objects in the vector layer.
112 |
113 | We can view metadata of a vector layer using the `st_geometry_type()`, `st_crs()` and
114 | `st_bbox()` functions. First, let's view the geometry type for our AOI
115 | vector layer:
116 |
117 | ```{r}
118 | st_geometry_type(aoi_boundary_HARV)
119 | ```
120 |
121 | Our `aoi_boundary_HARV` is a polygon spatial object. The 18 levels shown in the
122 | output list the possible geometry type categories. Now let's check what
123 | CRS this data is in:
124 |
125 | ```{r}
126 | st_crs(aoi_boundary_HARV)
127 | ```
128 |
129 | Our data is in the CRS **UTM zone 18N**. The CRS is critical to interpreting
130 | the spatial object's extent values, as it specifies units. To find the extent
131 | of our AOI, we can use the `st_bbox()` function:
132 |
133 | ```{r}
134 | st_bbox(aoi_boundary_HARV)
135 | ```
136 |
137 | The spatial extent of a vector layer or R spatial object represents the geographic
138 | "edge" or location that is the furthest north, south, east and west. Thus it
139 | represents the overall geographic coverage of the spatial object. Image Source:
140 | National Ecological Observatory Network (NEON).
141 |
142 | ![](fig/dc-spatial-vector/spatial_extent.png){alt='Extent image'}
143 |
144 | Lastly, we can view all of the metadata and attributes for this R spatial
145 | object by printing it to the screen:
146 |
147 | ```{r}
148 | aoi_boundary_HARV
149 | ```
150 |
151 | ## Spatial Data Attributes
152 |
153 | We introduced the idea of spatial data attributes in
154 | [an earlier lesson](https://datacarpentry.org/organization-geospatial/02-intro-vector-data).
155 | Now we will explore how to use spatial data attributes stored in our data to
156 | plot different features.
157 |
158 | ## Plot a Vector Layer
159 |
160 | Next, let's visualize the data in our `sf` object using the `ggplot` package.
161 | Unlike with raster data, we do not need to convert vector data to a dataframe
162 | before plotting with `ggplot`.
163 |
164 | We're going to customize our boundary plot by setting the size, color, and fill
165 | for our plot. When plotting `sf` objects with `ggplot2`, you need to use the
166 | `coord_sf()` coordinate system.
167 |
168 | ```{r plot-shapefile}
169 | ggplot() +
170 | geom_sf(data = aoi_boundary_HARV, size = 3, color = "black", fill = "cyan1") +
171 | ggtitle("AOI Boundary Plot") +
172 | coord_sf()
173 | ```
174 |
175 | ::::::::::::::::::::::::::::::::::::::: challenge
176 |
177 | ## Challenge: Import Line and Point Vector Layers
178 |
179 | Using the steps above, import the HARV\_roads and HARVtower\_UTM18N vector layers into
180 | R. Call the HARV\_roads object `lines_HARV` and the HARVtower\_UTM18N
181 | `point_HARV`.
182 |
183 | Answer the following questions:
184 |
185 | 1. What type of R spatial object is created when you import each layer?
186 |
187 | 2. What is the CRS and extent for each object?
188 |
189 | 3. Do the files contain points, lines, or polygons?
190 |
191 | 4. How many spatial objects are in each file?
192 |
193 | ::::::::::::::: solution
194 |
195 | ## Answers
196 |
197 | First we import the data:
198 |
199 | ```{r import-point-line, echo=TRUE}
200 | lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp")
201 | point_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARVtower_UTM18N.shp")
202 | ```
203 |
204 | Then we check their classes:
205 |
206 | ```{r}
207 | class(lines_HARV)
208 | class(point_HARV)
209 | ```
210 |
211 | We also check the CRS and extent of each object:
212 |
213 | ```{r}
214 | st_crs(lines_HARV)
215 | st_bbox(lines_HARV)
216 | st_crs(point_HARV)
217 | st_bbox(point_HARV)
218 | ```
219 |
220 | To see the number of objects in each file, we can look at the output from when
221 | we read these objects into R. `lines_HARV` contains 13 features (all lines) and
222 | `point_HARV` contains only one point.
223 |
224 |
225 |
226 | :::::::::::::::::::::::::
227 |
228 | ::::::::::::::::::::::::::::::::::::::::::::::::::
229 |
230 |
231 |
232 | :::::::::::::::::::::::::::::::::::::::: keypoints
233 |
234 | - Metadata for vector layers include geometry type, CRS, and extent.
235 | - Load spatial objects into R with the `st_read()` function.
236 | - Spatial objects can be plotted directly with `ggplot` using the `geom_sf()`
237 | function. No need to convert to a dataframe.
238 |
239 | ::::::::::::::::::::::::::::::::::::::::::::::::::
240 |
241 |
242 |
--------------------------------------------------------------------------------
/episodes/08-vector-plot-shapefiles-custom-legend.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: Plot Multiple Vector Layers
3 | teaching: 30
4 | exercises: 15
5 | source: Rmd
6 | ---
7 |
8 | ```{r setup, echo=FALSE}
9 | source("setup.R")
10 | ```
11 |
12 | ::::::::::::::::::::::::::::::::::::::: objectives
13 |
14 | - Plot multiple vector layers in the same plot.
15 | - Apply custom symbols to spatial objects in a plot.
16 | - Create a multi-layered plot with raster and vector data.
17 |
18 | ::::::::::::::::::::::::::::::::::::::::::::::::::
19 |
20 | :::::::::::::::::::::::::::::::::::::::: questions
21 |
22 | - How can I create map compositions with custom legends using ggplot?
23 | - How can I plot raster and vector data together?
24 |
25 | ::::::::::::::::::::::::::::::::::::::::::::::::::
26 |
27 | ```{r load-libraries, echo=FALSE, results="hide", message=FALSE}
28 | library(terra)
29 | library(ggplot2)
30 | library(dplyr)
31 | library(sf)
32 | ```
33 |
34 | ```{r load-data, echo=FALSE, results="hide", warning=FALSE}
35 | # learners will have this data loaded from an earlier episode
36 | aoi_boundary_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HarClip_UTMZ18.shp")
37 | lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp")
38 | point_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARVtower_UTM18N.shp")
39 | CHM_HARV <- rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/CHM/HARV_chmCrop.tif")
40 | CHM_HARV_df <- as.data.frame(CHM_HARV, xy = TRUE)
41 | road_colors <- c("blue", "green", "navy", "purple")
42 | ```
43 |
44 | :::::::::::::::::::::::::::::::::::::::::: prereq
45 |
46 | ## Things You'll Need To Complete This Episode
47 |
48 | See the [lesson homepage](.) for detailed information about the software, data,
49 | and other prerequisites you will need to work through the examples in this
50 | episode.
51 |
52 |
53 | ::::::::::::::::::::::::::::::::::::::::::::::::::
54 |
55 | This episode builds upon
56 | [the previous episode](07-vector-shapefile-attributes-in-r/)
57 | to work with vector layers in R and explore how to plot multiple
58 | vector layers. It also covers how to plot raster and vector data together on the
59 | same plot.
60 |
61 | ## Load the Data
62 |
63 | To work with vector data in R, we can use the `sf` library. The `terra`
64 | package also allows us to explore metadata using similar commands for both
65 | raster and vector files. Make sure that you have these packages loaded.
66 |
67 | We will continue to work with the three ESRI `shapefile` layers that we loaded in the
68 | [Open and Plot Vector Layers in R](06-vector-open-shapefile-in-r/) episode.
69 |
70 | ## Plotting Multiple Vector Layers
71 |
72 | In the [previous episode](07-vector-shapefile-attributes-in-r/), we learned how
73 | to plot information from a single vector layer and do some plot customization
74 | including adding a custom legend. However, what if we want to create a more
75 | complex plot with many vector layers and unique symbols that need to be
76 | represented clearly in a legend?
77 |
78 | Now, let's create a plot that combines our tower location (`point_HARV`), site
79 | boundary (`aoi_boundary_HARV`) and roads (`lines_HARV`) spatial objects. We
80 | will need to build a custom legend as well.
81 |
82 | To begin, we will create a plot with the site boundary as the first layer. Then
83 | layer the tower location and road data on top using `+`.
84 |
85 | ```{r plot-many-shapefiles}
86 | ggplot() +
87 | geom_sf(data = aoi_boundary_HARV, fill = "grey", color = "grey") +
88 | geom_sf(data = lines_HARV, aes(color = TYPE), size = 1) +
89 | geom_sf(data = point_HARV) +
90 | ggtitle("NEON Harvard Forest Field Site") +
91 | coord_sf()
92 | ```
93 |
94 | Next, let's build a custom legend using the symbology (the colors and symbols)
95 | that we used to create the plot above. For example, it might be good if the
96 | lines were symbolized as lines. In the previous episode, you may have noticed
97 | that the default legend behavior for `geom_sf` is to draw a 'patch' for each
98 | legend entry. If you want the legend to draw lines or points, you need to add
99 | an instruction to the `geom_sf` call - in this case, `show.legend = 'line'`.
100 |
101 | ```{r plot-custom-shape}
102 | ggplot() +
103 | geom_sf(data = aoi_boundary_HARV, fill = "grey", color = "grey") +
104 | geom_sf(data = lines_HARV, aes(color = TYPE),
105 | show.legend = "line", size = 1) +
106 | geom_sf(data = point_HARV, aes(fill = Sub_Type), color = "black") +
107 | scale_color_manual(values = road_colors) +
108 | scale_fill_manual(values = "black") +
109 | ggtitle("NEON Harvard Forest Field Site") +
110 | coord_sf()
111 | ```
112 |
113 | Now let's adjust the legend titles by passing a `name` to the respective `color`
114 | and `fill` palettes.
115 |
116 | ```{r create-custom-legend}
117 | ggplot() +
118 | geom_sf(data = aoi_boundary_HARV, fill = "grey", color = "grey") +
119 | geom_sf(data = point_HARV, aes(fill = Sub_Type)) +
120 | geom_sf(data = lines_HARV, aes(color = TYPE), show.legend = "line",
121 | size = 1) +
122 | scale_color_manual(values = road_colors, name = "Line Type") +
123 | scale_fill_manual(values = "black", name = "Tower Location") +
124 | ggtitle("NEON Harvard Forest Field Site") +
125 | coord_sf()
126 | ```
127 |
128 | Finally, it might be better if the tower location point were drawn with a
129 | distinct symbol. We can customize this using the `shape` parameter in our call
130 | to `geom_sf`: 16 is a filled circle, 15 is a filled square.
131 |
132 | ::::::::::::::::::::::::::::::::::::::::: callout
133 |
134 | ## Data Tip
135 |
136 | To view a short list of `shape` symbols,
137 | type `?pch` into the R console.
138 |
139 |
140 | ::::::::::::::::::::::::::::::::::::::::::::::::::
141 |
142 | ```{r custom-symbols}
143 | ggplot() +
144 | geom_sf(data = aoi_boundary_HARV, fill = "grey", color = "grey") +
145 | geom_sf(data = point_HARV, aes(fill = Sub_Type), shape = 15) +
146 | geom_sf(data = lines_HARV, aes(color = TYPE),
147 | show.legend = "line", size = 1) +
148 | scale_color_manual(values = road_colors, name = "Line Type") +
149 | scale_fill_manual(values = "black", name = "Tower Location") +
150 | ggtitle("NEON Harvard Forest Field Site") +
151 | coord_sf()
152 | ```
153 |
154 | ::::::::::::::::::::::::::::::::::::::: challenge
155 |
156 | ## Challenge: Plot Polygon by Attribute
157 |
158 | 1. Using the `NEON-DS-Site-Layout-Files/HARV/PlotLocations_HARV.shp` ESRI `shapefile`,
159 | create a map of study plot locations, with each point colored by the soil
160 | type (`soilTypeOr`). How many different soil types are there at this
161 | particular field site? Overlay this layer on top of the `lines_HARV` layer
162 | (the roads). Create a custom legend that applies line symbols to lines and
163 | point symbols to the points.
164 |
165 | 2. Modify the plot above. Tell R to plot each point using a different symbol
166 | (`shape` value) for each soil type.
167 |
168 | ::::::::::::::: solution
169 |
170 | ## Answers
171 |
172 | First we need to read in the data and see how many unique soils are represented
173 | in the `soilTypeOr` attribute.
174 |
175 | ```{r}
176 | plot_locations <-
177 | st_read("data/NEON-DS-Site-Layout-Files/HARV/PlotLocations_HARV.shp")
178 |
179 | plot_locations$soilTypeOr <- as.factor(plot_locations$soilTypeOr)
180 | levels(plot_locations$soilTypeOr)
181 | ```
182 |
183 | Next we can create a new color palette with one color for each soil type.
184 |
185 | ```{r}
186 | blue_orange <- c("cornflowerblue", "darkorange")
187 | ```
188 |
189 | Finally, we will create our plot.
190 |
191 | ```{r harv-plot-locations-bg}
192 | ggplot() +
193 | geom_sf(data = lines_HARV, aes(color = TYPE), show.legend = "line") +
194 | geom_sf(data = plot_locations, aes(fill = soilTypeOr),
195 | shape = 21, show.legend = 'point') +
196 | scale_color_manual(name = "Line Type", values = road_colors,
197 | guide = guide_legend(override.aes = list(linetype = "solid",
198 | shape = NA))) +
199 | scale_fill_manual(name = "Soil Type", values = blue_orange,
200 | guide = guide_legend(override.aes = list(linetype = "blank", shape = 21,
201 | colour = "black"))) +
202 | ggtitle("NEON Harvard Forest Field Site") +
203 | coord_sf()
204 | ```
205 |
206 | If we want each soil type to be shown with a different symbol, we can give
207 | multiple values to the `values` argument of `scale_shape_manual()`.
208 |
209 | ```{r harv-plot-locations-pch}
210 | ggplot() +
211 | geom_sf(data = lines_HARV, aes(color = TYPE), show.legend = "line", size = 1) +
212 | geom_sf(data = plot_locations, aes(fill = soilTypeOr, shape = soilTypeOr),
213 | show.legend = 'point', size = 3) +
214 | scale_shape_manual(name = "Soil Type", values = c(21, 22)) +
215 | scale_color_manual(name = "Line Type", values = road_colors,
216 | guide = guide_legend(override.aes = list(linetype = "solid", shape = NA))) +
217 | scale_fill_manual(name = "Soil Type", values = blue_orange,
218 | guide = guide_legend(override.aes = list(linetype = "blank", shape = c(21, 22), color = "black"))) +
219 | ggtitle("NEON Harvard Forest Field Site") +
220 | coord_sf()
221 | ```
222 |
223 | :::::::::::::::::::::::::
224 |
225 | ::::::::::::::::::::::::::::::::::::::::::::::::::
226 |
227 | ::::::::::::::::::::::::::::::::::::::: challenge
228 |
229 | ## Challenge: Plot Raster \& Vector Data Together
230 |
231 | You can plot vector data layered on top of raster data using the `+` to add a
232 | layer in `ggplot`. Create a plot that uses the NEON AOI Canopy Height Model
233 | `data/NEON-DS-Airborne-Remote-Sensing/HARV/CHM/HARV_chmCrop.tif` as a base
234 | layer. On top of the CHM, please add:
235 |
236 | - The study site AOI.
237 | - Roads.
238 | - The tower location.
239 |
240 | Be sure to give your plot a meaningful title.
241 |
242 | ::::::::::::::: solution
243 |
244 | ## Answers
245 |
246 | ```{r challenge-vector-raster-overlay, echo=TRUE}
247 | ggplot() +
248 | geom_raster(data = CHM_HARV_df, aes(x = x, y = y, fill = HARV_chmCrop)) +
249 | geom_sf(data = lines_HARV, color = "black") +
250 | geom_sf(data = aoi_boundary_HARV, color = "grey20", size = 1) +
251 | geom_sf(data = point_HARV, pch = 8) +
252 | ggtitle("NEON Harvard Forest Field Site w/ Canopy Height Model") +
253 | coord_sf()
254 | ```
255 |
256 | :::::::::::::::::::::::::
257 |
258 | ::::::::::::::::::::::::::::::::::::::::::::::::::
259 |
260 |
261 |
262 | :::::::::::::::::::::::::::::::::::::::: keypoints
263 |
264 | - Use the `+` operator to add multiple layers to a ggplot.
265 | - Multi-layered plots can combine raster and vector datasets.
266 | - Use the `show.legend` argument to set legend symbol types.
267 | - Use the `scale_fill_manual()` function to set legend colors.
268 |
269 | ::::::::::::::::::::::::::::::::::::::::::::::::::
270 |
271 |
272 |
--------------------------------------------------------------------------------
/episodes/09-vector-when-data-dont-line-up-crs.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: Handling Spatial Projection & CRS
3 | teaching: 30
4 | exercises: 20
5 | source: Rmd
6 | ---
7 |
8 | ```{r setup, echo=FALSE}
9 | source("setup.R")
10 | ```
11 |
12 | ::::::::::::::::::::::::::::::::::::::: objectives
13 |
14 | - Plot vector objects with different CRSs in the same plot.
15 |
16 | ::::::::::::::::::::::::::::::::::::::::::::::::::
17 |
18 | :::::::::::::::::::::::::::::::::::::::: questions
19 |
20 | - What do I do when vector data don't line up?
21 |
22 | ::::::::::::::::::::::::::::::::::::::::::::::::::
23 |
24 | ```{r load-libraries, echo=FALSE, results="hide", message=FALSE, warning=FALSE}
25 | library(terra)
26 | library(sf)
27 | library(ggplot2)
28 | library(dplyr)
29 | ```
30 |
31 | :::::::::::::::::::::::::::::::::::::::::: prereq
32 |
33 | ## Things You'll Need To Complete This Episode
34 |
35 | See the [lesson homepage](.) for detailed information about the software, data,
36 | and other prerequisites you will need to work through the examples in this
37 | episode.
38 |
39 |
40 | ::::::::::::::::::::::::::::::::::::::::::::::::::
41 |
42 | In [an earlier episode](03-raster-reproject-in-r/)
43 | we learned how to handle a situation where you have two different files with
44 | raster data in different projections. Now we will apply those same principles
45 | to working with vector data.
46 | We will create a base map of our study site using United States state and
47 | country boundary information accessed from the
48 | [United States Census Bureau](https://www.census.gov/geo/maps-data/data/cbf/cbf_state.html).
49 | We will learn how to map vector data that are in different CRSs and thus don't
50 | line up on a map.
51 |
52 | We will continue to work with the three ESRI `shapefiles` that we loaded in the
53 | [Open and Plot Vector Layers in R](06-vector-open-shapefile-in-r/) episode.
54 |
55 | ```{r load-data, echo=FALSE, results="hide", warning=FALSE, message=FALSE}
56 | # learners will have this data loaded from previous episodes
57 | aoi_boundary_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HarClip_UTMZ18.shp")
58 | lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp")
59 | point_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARVtower_UTM18N.shp")
60 | CHM_HARV <- rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/CHM/HARV_chmCrop.tif")
61 | CHM_HARV_df <- as.data.frame(CHM_HARV, xy = TRUE)
62 | roadColors <- c("blue", "green", "grey", "purple")[lines_HARV$TYPE]
63 | ```
64 |
65 | ## Working With Spatial Data From Different Sources
66 |
67 | We often need to gather spatial datasets from different sources and/or data
68 | that cover different spatial extents.
69 | These data are often in different Coordinate Reference Systems (CRSs).
70 |
71 | Some reasons for data being in different CRSs include:
72 |
73 | 1. The data are stored in a particular CRS convention used by the data provider
74 | (for example, a government agency).
75 | 2. The data are stored in a particular CRS that is customized to a region. For
76 | instance, many states in the US prefer to use a State Plane projection
77 | customized for that state.
78 |
79 | ![](fig/map_usa_different_projections.jpg){alt='Maps of the United States using data in different projections.'}
80 |
81 | Notice the differences in shape associated with each different projection.
82 | These differences are a direct result of the calculations used to "flatten" the
83 | data onto a 2-dimensional map. Often data are stored purposefully in a
84 | particular projection that optimizes the relative shape and size of surrounding
85 | geographic boundaries (states, counties, countries, etc).
86 |
87 | In this episode we will learn how to identify and manage spatial data in
88 | different projections. We will learn how to reproject the data so that they are
89 | in the same projection to support plotting / mapping. Note that these skills
90 | are also required for any geoprocessing / spatial analysis. Data need to be in
91 | the same CRS to ensure accurate results.
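Before any geoprocessing step, it is worth confirming whether two layers share a CRS. A minimal sketch, assuming the HARV objects from the earlier episodes are loaded (not evaluated here):

```{r check-crs-match, eval=FALSE}
# st_crs() objects can be compared directly with ==
st_crs(point_HARV) == st_crs(lines_HARV)

# If the CRSs differ, reproject one layer to match the other:
# point_reproj <- st_transform(point_HARV, st_crs(lines_HARV))
```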
92 |
93 | We will continue to use the `sf` and `terra` packages in this episode.
94 |
95 | ## Import US Boundaries - Census Data
96 |
97 | There are many good sources of boundary base layers that we can use to create a
98 | basemap. Some R packages even have these base layers built in to support quick
99 | and efficient mapping. In this episode, we will use boundary layers for the
100 | contiguous United States, provided by the
101 | [United States Census Bureau](https://www.census.gov/geo/maps-data/data/cbf/cbf_state.html).
102 | It is useful to have vector layers in ESRI's `shapefile` format because we can
103 | add additional attributes to them if need be - for project-specific mapping.
104 |
105 | ## Read US Boundary File
106 |
107 | We will use the `st_read()` function to import the
108 | `/US-Boundary-Layers/US-State-Boundaries-Census-2014` layer into R. This layer
109 | contains the boundaries of all contiguous states in the U.S. Please note that
110 | these data have been modified and reprojected from the original data downloaded
111 | from the Census website to support the learning goals of this episode.
112 |
113 | ```{r read-shp}
114 | state_boundary_US <- st_read("data/NEON-DS-Site-Layout-Files/US-Boundary-Layers/US-State-Boundaries-Census-2014.shp") %>%
115 | st_zm()
116 | ```
117 |
118 | Next, let's plot the U.S. states data:
119 |
120 | ```{r find-coordinates}
121 | ggplot() +
122 | geom_sf(data = state_boundary_US) +
123 | ggtitle("Map of Contiguous US State Boundaries") +
124 | coord_sf()
125 | ```
126 |
127 | ## U.S. Boundary Layer
128 |
129 | We can add a boundary layer of the United States to our map - to make it look
130 | nicer. We will import
131 | `NEON-DS-Site-Layout-Files/US-Boundary-Layers/US-Boundary-Dissolved-States`.
132 |
133 | ```{r}
134 | country_boundary_US <- st_read("data/NEON-DS-Site-Layout-Files/US-Boundary-Layers/US-Boundary-Dissolved-States.shp") %>%
135 | st_zm()
136 | ```
137 |
138 | If we specify a thicker line width using `size = 5` for the border layer, it
139 | will make our map pop! We will also manually set the colors of the state
140 | boundaries and country boundaries.
141 |
142 | ```{r us-boundaries-thickness}
143 | ggplot() +
144 | geom_sf(data = state_boundary_US, color = "gray60") +
145 |   geom_sf(data = country_boundary_US, color = "black", alpha = 0.25, size = 5) +
146 | ggtitle("Map of Contiguous US State Boundaries") +
147 | coord_sf()
148 | ```
149 |
150 | Next, let's add the location of a flux tower where our study area is.
151 | As we are adding these layers, take note of the CRS of each object.
152 | First let's look at the CRS of our tower location object:
153 |
154 | ```{r crs-sleuthing-1}
155 | st_crs(point_HARV)$proj4string
156 | ```
157 |
158 | Our project string for `point_HARV` specifies the UTM projection as follows:
159 |
160 | `+proj=utm +zone=18 +datum=WGS84 +units=m +no_defs`
161 |
162 | - **proj=utm:** the projection is UTM; UTM has several zones.
163 | - **zone=18:** the zone is 18
164 | - **datum=WGS84:** the datum WGS84 (the datum refers to the 0,0 reference for
165 | the coordinate system used in the projection)
166 | - **units=m:** the units for the coordinates are in METERS.
167 |
168 | Note that the `zone` is unique to the UTM projection. Not all CRSs will have a
169 | zone.
170 |
171 | Let's check the CRS of our state and country boundary objects:
172 |
173 | ```{r crs-sleuthing-2}
174 | st_crs(state_boundary_US)$proj4string
175 | st_crs(country_boundary_US)$proj4string
176 | ```
177 |
178 | Our project string for `state_boundary_US` and `country_boundary_US` specifies
179 | the lat/long projection as follows:
180 |
181 | `+proj=longlat +datum=WGS84 +no_defs`
182 |
183 |
184 | - **proj=longlat:** the data are in a geographic (latitude and longitude)
185 | coordinate system
186 | - **datum=WGS84:** the datum WGS84 (the datum refers to the 0,0 reference for
187 | the coordinate system used in the projection)
188 | - **no_defs:** ensures that no defaults are used, but this is now obsolete
189 |
190 | Note that there are no specified units above. This is because this geographic
191 | coordinate reference system is in latitude and longitude which is most often
192 | recorded in decimal degrees.
193 |
194 | ::::::::::::::::::::::::::::::::::::::::: callout
195 |
196 | ## Data Tip
197 |
198 | The last portion of each `proj4` string could potentially be something like
199 | `+towgs84=0,0,0`. This is a conversion factor that is used if a datum
200 | conversion is required. We will not deal with datums in this episode series.
201 |
202 |
203 | ::::::::::::::::::::::::::::::::::::::::::::::::::
204 |
205 | ## CRS Units - View Object Extent
206 |
207 | Next, let's view the extent or spatial coverage for the `point_HARV` spatial
208 | object compared to the `state_boundary_US` object.
209 |
210 | First we'll look at the extent for our study site:
211 |
212 | ```{r view-extent-1}
213 | st_bbox(point_HARV)
214 | ```
215 |
216 | And then the extent for the state boundary data.
217 |
218 | ```{r view-extent-2}
219 | st_bbox(state_boundary_US)
220 | ```
221 |
222 | Note the difference in the units for each object. The extent for
223 | `state_boundary_US` is in latitude and longitude which yields smaller numbers
224 | representing decimal degree units. Our tower location point is in UTM and is
225 | represented in meters.
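Although `ggplot2` will handle this difference for us below, we can also reproject explicitly with `st_transform()` and watch the units change. A sketch, assuming the objects above are loaded (not evaluated here):

```{r compare-bbox-units, eval=FALSE}
# Reproject the tower point from UTM (meters) into the lat/long CRS
# of the state boundaries, then compare the two bounding boxes.
point_HARV_geog <- st_transform(point_HARV, st_crs(state_boundary_US))

st_bbox(point_HARV)       # coordinates in meters (UTM Zone 18N)
st_bbox(point_HARV_geog)  # coordinates in decimal degrees
```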
226 |
227 | ::::::::::::::::::::::::::::::::::::::::: callout
228 |
229 | ## Proj4 \& CRS Resources
230 |
231 | - [Official PROJ library documentation](https://proj4.org/)
232 | - [More information on the proj4 format.](https://proj.maptools.org/faq.html)
233 | - [A fairly comprehensive list of CRSs by format.](https://spatialreference.org)
234 | - To view a list of datum conversion factors type:
235 | `sf_proj_info(type = "datum")` into the R console. However, the results would
236 | depend on the underlying version of the PROJ library.
237 |
238 |
239 | ::::::::::::::::::::::::::::::::::::::::::::::::::
240 |
241 | ## Reproject Vector Data or No?
242 |
243 | We saw in [an earlier episode](03-raster-reproject-in-r/) that when working
244 | with raster data in different CRSs, we needed to convert all objects to the
245 | same CRS. We can do the same thing with our vector data - however, we don't
246 | need to! When using the `ggplot2` package, `ggplot` automatically converts all
247 | objects to the same CRS before plotting.
248 | This means we can plot our three data sets together without doing any
249 | conversion:
250 |
251 | ```{r layer-point-on-states}
252 | ggplot() +
253 | geom_sf(data = state_boundary_US, color = "gray60") +
254 | geom_sf(data = country_boundary_US, size = 5, alpha = 0.25, color = "black") +
255 | geom_sf(data = point_HARV, shape = 19, color = "purple") +
256 | ggtitle("Map of Contiguous US State Boundaries") +
257 | coord_sf()
258 | ```
259 |
260 | ::::::::::::::::::::::::::::::::::::::: challenge
261 |
262 | ## Challenge - Plot Multiple Layers of Spatial Data
263 |
264 | Create a map of the North Eastern United States as follows:
265 |
266 | 1. Import and plot `Boundary-US-State-NEast.shp`. Adjust line width as
267 | necessary.
268 | 2. Layer the Fisher Tower (in the NEON Harvard Forest site) point location
269 | `point_HARV` onto the plot.
270 | 3. Add a title.
271 | 4. Add a legend that shows both the state boundary (as a line) and the Tower
272 | location point.
273 |
274 | ::::::::::::::: solution
275 |
276 | ## Answers
277 |
278 | ```{r ne-states-harv}
279 | NE.States.Boundary.US <- st_read("data/NEON-DS-Site-Layout-Files/US-Boundary-Layers/Boundary-US-State-NEast.shp") %>%
280 | st_zm()
281 |
282 | ggplot() +
283 | geom_sf(data = NE.States.Boundary.US, aes(color ="color"),
284 | show.legend = "line") +
285 | scale_color_manual(name = "", labels = "State Boundary",
286 | values = c("color" = "gray18")) +
287 | geom_sf(data = point_HARV, aes(shape = "shape"), color = "purple") +
288 | scale_shape_manual(name = "", labels = "Fisher Tower",
289 | values = c("shape" = 19)) +
290 | ggtitle("Fisher Tower location") +
291 | theme(legend.background = element_rect(color = NA)) +
292 | coord_sf()
293 | ```
294 |
295 | :::::::::::::::::::::::::
296 |
297 | ::::::::::::::::::::::::::::::::::::::::::::::::::
298 |
299 |
300 |
301 | :::::::::::::::::::::::::::::::::::::::: keypoints
302 |
303 | - `ggplot2` automatically converts all objects in a plot to the same CRS.
304 | - Still be aware of the CRS and extent for each object.
305 |
306 | ::::::::::::::::::::::::::::::::::::::::::::::::::
307 |
308 |
309 |
--------------------------------------------------------------------------------
/episodes/10-vector-csv-to-shapefile-in-r.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: Convert from .csv to a Vector Layer
3 | teaching: 40
4 | exercises: 20
5 | source: Rmd
6 | ---
7 |
8 | ```{r setup, echo=FALSE}
9 | source("setup.R")
10 | ```
11 |
12 | ::::::::::::::::::::::::::::::::::::::: objectives
13 |
14 | - Import .csv files containing x,y coordinate locations into R as a data frame.
15 | - Convert a data frame to a spatial object.
16 | - Export a spatial object to a text file.
17 |
18 | ::::::::::::::::::::::::::::::::::::::::::::::::::
19 |
20 | :::::::::::::::::::::::::::::::::::::::: questions
21 |
22 | - How can I import CSV files as vector layers in R?
23 |
24 | ::::::::::::::::::::::::::::::::::::::::::::::::::
25 |
26 | ```{r load-libraries, echo=FALSE, results="hide", message=FALSE, warning=FALSE}
27 | library(terra)
28 | library(ggplot2)
29 | library(dplyr)
30 | library(sf)
31 | ```
32 |
33 | ```{r load-data, echo=FALSE, results="hide"}
34 | # Learners will have this data loaded from earlier episodes
35 | lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp")
36 | aoi_boundary_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HarClip_UTMZ18.shp")
37 | country_boundary_US <- st_read("data/NEON-DS-Site-Layout-Files/US-Boundary-Layers/US-Boundary-Dissolved-States.shp")
38 | point_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARVtower_UTM18N.shp")
39 | ```
40 |
41 | :::::::::::::::::::::::::::::::::::::::::: prereq
42 |
43 | ## Things You'll Need To Complete This Episode
44 |
45 | See the [lesson homepage](.) for detailed information about the software, data,
46 | and other prerequisites you will need to work through the examples in this
47 | episode.
48 |
49 |
50 | ::::::::::::::::::::::::::::::::::::::::::::::::::
51 |
52 | This episode will review how to import spatial points stored in `.csv` (Comma
53 | Separated Value) format into R as an `sf` spatial object. We will also
54 | reproject data imported from an ESRI `shapefile` format, export the reprojected data as an ESRI `shapefile`, and plot raster and vector data as layers in the same plot.
55 |
56 | ## Spatial Data in Text Format
57 |
58 | The `HARV_PlotLocations.csv` file contains `x, y` (point) locations for study
59 | plots where NEON collects data on
60 | [vegetation and other ecological metrics](https://www.neonscience.org/data-collection/terrestrial-organismal-sampling).
61 | We would like to:
62 |
63 | - Create a map of these plot locations.
64 | - Export the data in an ESRI `shapefile` format to share with our colleagues. This
65 | `shapefile` can be imported into most GIS software.
66 | - Create a map showing vegetation height with plot locations layered on top.
67 |
68 | Spatial data are sometimes stored in a text file format (`.txt` or `.csv`). If
69 | the text file has an associated `x` and `y` location column, then we can
70 | convert it into an `sf` spatial object. The `sf` object allows us to store both
71 | the `x,y` values that represent the coordinate location of each point and the
72 | associated attribute data - or columns describing each feature in the spatial
73 | object.
74 |
75 | We will continue using the `sf` and `terra` packages in this episode.
76 |
77 | ## Import .csv
78 |
79 | To begin let's import a `.csv` file that contains plot coordinate `x, y`
80 | locations at the NEON Harvard Forest Field Site (`HARV_PlotLocations.csv`) and
81 | look at the structure of that new object:
82 |
83 | ```{r read-csv}
84 | plot_locations_HARV <-
85 | read.csv("data/NEON-DS-Site-Layout-Files/HARV/HARV_PlotLocations.csv")
86 |
87 | str(plot_locations_HARV)
88 | ```
89 |
90 | We now have a data frame that contains 21 locations (rows) and 16 variables
91 | (attributes). Note that all of our character data was imported into R as
92 | character (text) data. Next, let's explore the dataframe to determine whether
93 | it contains columns with coordinate values. If we are lucky, our `.csv` will
94 | contain columns labeled:
95 |
96 | - "X" and "Y" OR
97 | - Latitude and Longitude OR
98 | - easting and northing (UTM coordinates)
99 |
100 | Let's check out the column names of our dataframe.
101 |
102 | ```{r find-coordinates}
103 | names(plot_locations_HARV)
104 | ```
105 |
106 | ## Identify X,Y Location Columns
107 |
108 | Our column names include several fields that might contain spatial information.
109 | The `plot_locations_HARV$easting` and `plot_locations_HARV$northing` columns
110 | contain coordinate values. We can confirm this by looking at the first six rows
111 | of our data.
112 |
113 | ```{r check-out-coordinates}
114 | head(plot_locations_HARV$easting)
115 | head(plot_locations_HARV$northing)
116 | ```
117 |
118 | We have coordinate values in our data frame. In order to convert our data frame
119 | to an `sf` object, we also need to know the CRS associated with those
120 | coordinate values.
121 |
122 | There are several ways to figure out the CRS of spatial data in text format.
123 |
124 | 1. We can check the file metadata in hopes that the CRS was recorded in the
125 | data.
126 | 2. We can explore the file itself to see if CRS information is embedded in the
127 | file header or somewhere in the data columns.
128 |
129 | Following the `easting` and `northing` columns, there is a `geodeticDa` and a
130 | `utmZone` column. These appear to contain CRS information (`datum` and
131 | `projection`). Let's view those next.
132 |
133 | ```{r view-CRS-info}
134 | head(plot_locations_HARV$geodeticDa)
135 | head(plot_locations_HARV$utmZone)
136 | ```
137 |
138 | It is not typical to store CRS information in a column. But this particular
139 | file contains CRS information this way. The `geodeticDa` and `utmZone` columns
140 | contain the information that helps us determine the CRS:
141 |
142 | - `geodeticDa`: WGS84 -- this is geodetic datum WGS84
143 | - `utmZone`: 18
144 |
145 | In
146 | [When Vector Data Don't Line Up - Handling Spatial Projection \& CRS in R](09-vector-when-data-dont-line-up-crs.html)
147 | we learned about the components of a `proj4` string. We have everything we need
148 | to assign a CRS to our data frame.
149 |
150 | To create the `proj4` associated with UTM Zone 18 WGS84 we can look up the
151 | projection on the
152 | [Spatial Reference website](https://spatialreference.org/ref/epsg/32618/),
153 | which contains a list of CRS formats for each projection. From here, we can
154 | extract the
155 | [proj4 string for UTM Zone 18N WGS84](https://spatialreference.org/ref/epsg/32618/proj4.txt).
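As an alternative to copying a `proj4` string by hand, `st_crs()` also accepts an EPSG code directly. A short sketch (the object name here is illustrative, not used elsewhere in the episode):

```{r crs-from-epsg, eval=FALSE}
# EPSG 32618 identifies UTM Zone 18N / WGS84, so the CRS can be
# built from the code rather than a pasted proj4 string.
crs_utm18n <- st_crs(32618)
crs_utm18n$proj4string
```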
156 |
157 | However, if we have other data in the UTM Zone 18N projection, it's much easier
158 | to use the `st_crs()` function to extract the CRS in `proj4` format from that
159 | object and assign it to our new spatial object. We've seen this CRS before with
160 | our Harvard Forest study site (`point_HARV`).
161 |
162 | ```{r explore-units}
163 | st_crs(point_HARV)
164 | ```
165 |
166 | The output above shows that the points vector layer is in UTM zone 18N. We can
167 | thus use the CRS from that spatial object to convert our non-spatial dataframe
168 | into an `sf` object.
169 |
170 | Next, let's create a `crs` object that we can use to define the CRS of our `sf`
171 | object when we create it.
172 |
173 | ```{r crs-object}
174 | utm18nCRS <- st_crs(point_HARV)
175 | utm18nCRS
176 |
177 | class(utm18nCRS)
178 | ```
179 |
180 | ## .csv to sf object
181 |
182 | Next, let's convert our dataframe into an `sf` object. To do this, we need to
183 | specify:
184 |
185 | 1. The columns containing X (`easting`) and Y (`northing`) coordinate values
186 | 2. The CRS that the coordinate values represent (units are included in the CRS) - stored in our `utm18nCRS` object.
187 |
188 | We will use the `st_as_sf()` function to perform the conversion.
189 |
190 | ```{r convert-csv-shapefile}
191 | plot_locations_sp_HARV <- st_as_sf(plot_locations_HARV,
192 | coords = c("easting", "northing"),
193 | crs = utm18nCRS)
194 | ```
195 |
196 | We should double check the CRS to make sure it is correct.
197 |
198 | ```{r}
199 | st_crs(plot_locations_sp_HARV)
200 | ```
201 |
202 | ## Plot Spatial Object
203 |
204 | Now that we have a spatial R object, we can plot our newly created points.
205 |
206 | ```{r plot-data-points}
207 | ggplot() +
208 | geom_sf(data = plot_locations_sp_HARV) +
209 | ggtitle("Map of Plot Locations")
210 | ```
211 |
212 | ## Plot Extent
213 |
214 | In
215 | [Open and Plot Vector Layers in R](06-vector-open-shapefile-in-r.html)
216 | we learned about spatial object extent. When we plot several spatial layers in
217 | R using `ggplot`, all of the layers of the plot are considered in setting the
218 | boundaries of the plot. To show this, let's plot our `aoi_boundary_HARV` object
219 | with our vegetation plots.
220 |
221 | ```{r plot-data}
222 | ggplot() +
223 | geom_sf(data = aoi_boundary_HARV) +
224 | geom_sf(data = plot_locations_sp_HARV) +
225 | ggtitle("AOI Boundary Plot")
226 | ```
227 |
228 | When we plot the two layers together, `ggplot` sets the plot boundaries so that
229 | they are large enough to include all of the data included in all of the layers.
230 | That's really handy!
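If we instead want to zoom in on part of the data, we can override the automatic extent by passing limits to `coord_sf()`. A sketch assuming the objects above are loaded (not evaluated here):

```{r coord-sf-limits, eval=FALSE}
# Use the AOI bounding box to limit the plot extent, even though
# plot_locations_sp_HARV extends beyond it.
aoi_bbox <- st_bbox(aoi_boundary_HARV)

ggplot() +
  geom_sf(data = aoi_boundary_HARV) +
  geom_sf(data = plot_locations_sp_HARV) +
  coord_sf(xlim = c(aoi_bbox["xmin"], aoi_bbox["xmax"]),
           ylim = c(aoi_bbox["ymin"], aoi_bbox["ymax"])) +
  ggtitle("Plot Locations Within the AOI Extent")
```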
231 |
232 | ::::::::::::::::::::::::::::::::::::::: challenge
233 |
234 | ## Challenge - Import \& Plot Additional Points
235 |
236 | We want to add two phenology plots to our existing map of vegetation plot
237 | locations.
238 |
239 | Import the .csv: `HARV/HARV_2NewPhenPlots.csv` into R and do the following:
240 |
241 | 1. Find the X and Y coordinate locations. Which value is X and which value is
242 | Y?
243 | 2. These data were collected in a geographic coordinate system (WGS84). Convert
244 | the dataframe into an `sf` object.
245 | 3. Plot the new points with the plot location points from above. Be sure to add
246 | a legend. Use a different symbol for the 2 new points!
247 |
248 | If you have extra time, feel free to add roads and other layers to your map!
249 |
250 | ::::::::::::::: solution
251 |
252 | ## Answers
253 |
254 | 1)
255 | First we will read in the new csv file and look at the data structure.
256 |
257 | ```{r}
258 | newplot_locations_HARV <-
259 | read.csv("data/NEON-DS-Site-Layout-Files/HARV/HARV_2NewPhenPlots.csv")
260 | str(newplot_locations_HARV)
261 | ```
262 |
263 | 2)
264 | The US boundary data we worked with previously is in a geographic WGS84 CRS. We
265 | can use that data to establish a CRS for this data. First we will extract the
266 | CRS from the `country_boundary_US` object and confirm that it is WGS84.
267 |
268 | ```{r}
269 | geogCRS <- st_crs(country_boundary_US)
270 | geogCRS
271 | ```
272 |
273 | Then we will convert our new data to a spatial dataframe, using the `geogCRS`
274 | object as our CRS.
275 |
276 | ```{r}
277 | newPlot.Sp.HARV <- st_as_sf(newplot_locations_HARV,
278 | coords = c("decimalLon", "decimalLat"),
279 | crs = geogCRS)
280 | ```
281 |
282 | Next we'll confirm that the CRS for our new object is correct.
283 |
284 | ```{r}
285 | st_crs(newPlot.Sp.HARV)
286 | ```
287 |
288 | We will be adding these new data points to the plot we created before. The data
289 | for the earlier plot was in UTM. Since we're using `ggplot`, it will reproject
290 | the data for us.
291 |
292 | 3) Now we can create our plot.
293 |
294 | ```{r plot-locations-harv-orange}
295 | ggplot() +
296 | geom_sf(data = plot_locations_sp_HARV, color = "orange") +
297 | geom_sf(data = newPlot.Sp.HARV, color = "lightblue") +
298 | ggtitle("Map of All Plot Locations")
299 | ```
300 |
301 | :::::::::::::::::::::::::
302 |
303 | ::::::::::::::::::::::::::::::::::::::::::::::::::
304 |
305 | ## Export to an ESRI `shapefile`
306 |
307 | We can write an R spatial object to an ESRI `shapefile` using the `st_write` function
308 | in `sf`. To do this we need the following arguments:
309 |
310 | - the name of the spatial object (`plot_locations_sp_HARV`)
311 | - the directory where we want to save our ESRI `shapefile` (use the current
312 |   working directory, `getwd()`, or specify a different path)
313 | - the name of the new ESRI `shapefile` (`PlotLocations_HARV`)
314 | - the driver which specifies the file format (ESRI Shapefile)
315 |
316 | We can now export the spatial object as an ESRI `shapefile`.
317 |
318 | ```{r write-shapefile, warning=FALSE, eval=FALSE}
319 | st_write(plot_locations_sp_HARV,
320 | "data/PlotLocations_HARV.shp", driver = "ESRI Shapefile")
321 | ```
322 |
323 |
324 |
325 | :::::::::::::::::::::::::::::::::::::::: keypoints
326 |
327 | - Know the projection (if any) of your point data prior to converting to a
328 | spatial object.
329 | - Convert a data frame to an `sf` object using the `st_as_sf()` function.
330 | - Export an `sf` object to an ESRI `shapefile` using the `st_write()` function.
331 |
332 | ::::::::::::::::::::::::::::::::::::::::::::::::::
333 |
334 |
335 |
--------------------------------------------------------------------------------
/episodes/11-vector-raster-integration.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: Manipulate Raster Data
3 | teaching: 40
4 | exercises: 20
5 | source: Rmd
6 | ---
7 |
8 | ```{r setup, echo=FALSE}
9 | source("setup.R")
10 | ```
11 |
12 | ::::::::::::::::::::::::::::::::::::::: objectives
13 |
14 | - Crop a raster to the extent of a vector layer.
15 | - Extract values from a raster that correspond to a vector file overlay.
16 |
17 | ::::::::::::::::::::::::::::::::::::::::::::::::::
18 |
19 | :::::::::::::::::::::::::::::::::::::::: questions
20 |
21 | - How can I crop raster objects to vector objects, and extract the summary of
22 | raster pixels?
23 |
24 | ::::::::::::::::::::::::::::::::::::::::::::::::::
25 |
26 | ```{r load-libraries, echo=FALSE, results="hide", message=FALSE, warning=FALSE}
27 | library(sf)
28 | library(terra)
29 | library(ggplot2)
30 | library(dplyr)
31 | ```
32 |
33 | ```{r load-data, echo=FALSE, results="hide"}
34 | # Learners will have this data loaded from earlier episodes
35 | point_HARV <-
36 | st_read("data/NEON-DS-Site-Layout-Files/HARV/HARVtower_UTM18N.shp")
37 | lines_HARV <-
38 | st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp")
39 | aoi_boundary_HARV <-
40 | st_read("data/NEON-DS-Site-Layout-Files/HARV/HarClip_UTMZ18.shp")
41 |
42 | # CHM
43 | CHM_HARV <-
44 | rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/CHM/HARV_chmCrop.tif")
45 |
46 | CHM_HARV_df <- as.data.frame(CHM_HARV, xy = TRUE)
47 |
48 | # plot locations
49 | plot_locations_HARV <-
50 | read.csv("data/NEON-DS-Site-Layout-Files/HARV/HARV_PlotLocations.csv")
51 | utm18nCRS <- st_crs(point_HARV)
52 | plot_locations_sp_HARV <- st_as_sf(plot_locations_HARV,
53 | coords = c("easting", "northing"),
54 | crs = utm18nCRS)
55 | ```
56 |
57 | :::::::::::::::::::::::::::::::::::::::::: prereq
58 |
59 | ## Things You'll Need To Complete This Episode
60 |
61 | See the [lesson homepage](.) for detailed information about the software, data,
62 | and other prerequisites you will need to work through the examples in this
63 | episode.
64 |
65 |
66 | ::::::::::::::::::::::::::::::::::::::::::::::::::
67 |
68 | This episode explains how to crop a raster using the extent of a vector
69 | layer. We will also cover how to extract values from a raster that occur
70 | within a set of polygons, or in a buffer (surrounding) region around a set of
71 | points.
72 |
73 | ## Crop a Raster to Vector Extent
74 |
75 | We often work with spatial layers that have different spatial extents. The
76 | spatial extent of a vector layer or R spatial object represents the geographic
77 | "edge" or location that is the furthest north, south, east, and west. Thus it
78 | represents the overall geographic coverage of the spatial object.
79 |
80 | {alt='Extent illustration'} Image Source: National
81 | Ecological Observatory Network (NEON)
82 |
83 | The graphic below illustrates the extent of several of the spatial layers that
84 | we have worked with in this workshop:
85 |
86 | - Area of interest (AOI) -- blue
87 | - Roads and trails -- purple
88 | - Vegetation plot locations (marked with white dots) -- black
89 | - A canopy height model (CHM) in GeoTIFF format -- green
90 |
91 | ```{r view-extents, echo=FALSE, results="hide"}
92 | # code not shown, for demonstration purposes only
93 | # create CHM as a vector layer
94 | CHM_HARV_sp <- st_as_sf(CHM_HARV_df, coords = c("x", "y"), crs = utm18nCRS)
95 | # approximate the boundary box with a random sample of raster points
96 | CHM_rand_sample <- sample_n(CHM_HARV_sp, 10000)
97 | lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp")
98 | plots_HARV <-
99 | st_read("data/NEON-DS-Site-Layout-Files/HARV/PlotLocations_HARV.shp")
100 | ```
101 |
102 | ```{r compare-data-extents, echo=FALSE}
103 | # code not shown, for demonstration purposes only
104 | ggplot() +
105 | geom_sf(data = st_convex_hull(st_union(CHM_rand_sample)), fill = "green") +
106 | geom_sf(data = st_convex_hull(st_union(lines_HARV)),
107 | fill = "purple", alpha = 0.2) +
108 | geom_sf(data = lines_HARV, aes(color = TYPE), size = 1) +
109 | geom_sf(data = aoi_boundary_HARV, fill = "blue") +
110 | geom_sf(data = st_convex_hull(st_union(plot_locations_sp_HARV)),
111 | fill = "black", alpha = 0.4) +
112 | geom_sf(data = plots_HARV, color = "white") +
113 | theme(legend.position = "none") +
114 | coord_sf()
115 |
116 | ```
117 |
118 | Frequent use cases of cropping a raster file include reducing file size and
119 | creating maps. Sometimes we have a raster file that is much larger than our
120 | study area or area of interest. It is often more efficient to crop the raster
121 | to the extent of our study area to reduce file sizes as we process our data.
122 | Cropping a raster can also be useful when creating pretty maps so that the
123 | raster layer matches the extent of the desired vector layers.
124 |
125 | ## Crop a Raster Using Vector Extent
126 |
127 | We can use the `crop()` function to crop a raster to the extent of another
128 | spatial object. To do this, we need to specify the raster to be cropped and the
129 | spatial object that will be used to crop the raster. R will use the `extent` of
130 | the spatial object as the cropping boundary.
131 |
132 | To illustrate this, we will crop the Canopy Height Model (CHM) to only include
133 | the area of interest (AOI). Let's start by plotting the full extent of the CHM
134 | data and overlay where the AOI falls within it. The boundaries of the AOI will
135 | be colored blue, and we use `fill = NA` to make the area transparent.
136 |
137 | ```{r crop-by-vector-extent}
138 | ggplot() +
139 | geom_raster(data = CHM_HARV_df, aes(x = x, y = y, fill = HARV_chmCrop)) +
140 | scale_fill_gradientn(name = "Canopy Height", colors = terrain.colors(10)) +
141 | geom_sf(data = aoi_boundary_HARV, color = "blue", fill = NA) +
142 | coord_sf()
143 | ```
144 |
145 | Now that we have visualized the area of the CHM we want to subset, we can
146 | perform the cropping operation. We are going to use the `crop()` function from
147 | the `terra` package to create a new object with only the portion of the CHM
148 | data that falls within the boundaries of the AOI.
149 |
150 | ```{r}
151 | CHM_HARV_Cropped <- crop(x = CHM_HARV, y = aoi_boundary_HARV)
152 | ```
153 |
154 | Now we can plot the cropped CHM data, along with a boundary box showing the
155 | full CHM extent. However, remember, since this is raster data, we need to
156 | convert to a data frame in order to plot using `ggplot`. To get the boundary
157 | box from the CHM, the `st_bbox()` function will extract the 4 corners of the
158 | rectangle encompassing all features in this object. The `st_as_sfc()` function converts
159 | these 4 coordinates into a polygon that we can plot:
160 |
161 | ```{r show-cropped-area}
162 | CHM_HARV_Cropped_df <- as.data.frame(CHM_HARV_Cropped, xy = TRUE)
163 |
164 | ggplot() +
165 | geom_sf(data = st_as_sfc(st_bbox(CHM_HARV)), fill = "green",
166 | color = "green", alpha = .2) +
167 | geom_raster(data = CHM_HARV_Cropped_df,
168 | aes(x = x, y = y, fill = HARV_chmCrop)) +
169 | scale_fill_gradientn(name = "Canopy Height", colors = terrain.colors(10)) +
170 | coord_sf()
171 | ```
172 |
173 | The plot above shows that the full CHM extent (plotted in green) is much larger
174 | than the resulting cropped raster. Our new cropped CHM now has the same extent
175 | as the `aoi_boundary_HARV` object that was used as a crop extent (blue border
176 | below).
177 |
178 | ```{r view-crop-extent}
179 | ggplot() +
180 | geom_raster(data = CHM_HARV_Cropped_df,
181 | aes(x = x, y = y, fill = HARV_chmCrop)) +
182 | geom_sf(data = aoi_boundary_HARV, color = "blue", fill = NA) +
183 | scale_fill_gradientn(name = "Canopy Height", colors = terrain.colors(10)) +
184 | coord_sf()
185 | ```
186 |
187 | We can look at the extent of all of our other objects for this field site.
188 |
189 | ```{r view-extent}
190 | st_bbox(CHM_HARV)
191 | st_bbox(CHM_HARV_Cropped)
192 | st_bbox(aoi_boundary_HARV)
193 | st_bbox(plot_locations_sp_HARV)
194 | ```
195 |
196 | Our plot location extent is not the largest but is larger than the AOI
197 | Boundary. It would be nice to see our vegetation plot locations plotted on top
198 | of the Canopy Height Model information.
199 |
200 | ::::::::::::::::::::::::::::::::::::::: challenge
201 |
202 | ## Challenge: Crop to Vector Points Extent
203 |
204 | 1. Crop the Canopy Height Model to the extent of the study plot locations.
205 | 2. Plot the vegetation plot location points on top of the Canopy Height Model.
206 |
207 | ::::::::::::::: solution
208 |
209 | ## Answers
210 |
211 | ```{r challenge-code-crop-raster-points}
212 |
213 | CHM_plots_HARVcrop <- crop(x = CHM_HARV, y = plot_locations_sp_HARV)
214 |
215 | CHM_plots_HARVcrop_df <- as.data.frame(CHM_plots_HARVcrop, xy = TRUE)
216 |
217 | ggplot() +
218 | geom_raster(data = CHM_plots_HARVcrop_df,
219 | aes(x = x, y = y, fill = HARV_chmCrop)) +
220 | scale_fill_gradientn(name = "Canopy Height", colors = terrain.colors(10)) +
221 | geom_sf(data = plot_locations_sp_HARV) +
222 | coord_sf()
223 | ```
224 |
225 | :::::::::::::::::::::::::
226 |
227 | ::::::::::::::::::::::::::::::::::::::::::::::::::
228 |
229 | In the plot above, created in the challenge, all the vegetation plot locations
230 | (black dots) appear on the Canopy Height Model raster layer except for one. One
231 | is situated on the blank space to the left of the map. Why?
232 |
233 | A modification of the first figure in this episode is below, showing the
234 | relative extents of all the spatial objects. Notice that the extent for our
235 | vegetation plot layer (black) extends further west than the extent of our CHM
236 | raster (bright green). The `crop()` function will make a raster extent smaller;
237 | it will not expand the extent in areas where there are no data. Thus, the
238 | extent of our vegetation plot layer will still extend further west than the
239 | extent of our (cropped) raster data (dark green).
240 |
241 | ```{r, echo=FALSE}
242 | # code not shown, demonstration only
243 | # create CHM_plots_HARVcrop as a vector layer
244 | CHM_plots_HARVcrop_sp <- st_as_sf(CHM_plots_HARVcrop_df, coords = c("x", "y"),
245 | crs = utm18nCRS)
246 | # approximate the boundary box with random sample of raster points
247 | CHM_plots_HARVcrop_sp_rand_sample = sample_n(CHM_plots_HARVcrop_sp, 10000)
248 | ```
249 |
250 | ```{r repeat-compare-data-extents, ref.label="compare-data-extents", echo=FALSE}
251 | ```
252 |
253 | ## Define an Extent
254 |
255 | So far, we have used a vector layer to crop the extent of a raster dataset.
256 | Alternatively, we can also use the `ext()` function to define an extent to be
257 | used as a cropping boundary. This creates a new object of class `SpatExtent`.
258 | Here we will provide the `ext()` function with our xmin, xmax, ymin, and ymax
259 | (in that order).
260 |
261 | ```{r}
262 | new_extent <- ext(732161.2, 732238.7, 4713249, 4713333)
263 | class(new_extent)
264 | ```
265 |
266 | ::::::::::::::::::::::::::::::::::::::::: callout
267 |
268 | ## Data Tip
269 |
270 | The extent can be created from a numeric vector (as shown above), a matrix, or
271 | a list. For more details see the `ext()` function help file
272 | (`?terra::ext`).
273 |
274 |
275 | ::::::::::::::::::::::::::::::::::::::::::::::::::
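As a quick illustration (a sketch using the same coordinates as above), the same extent can also be built from a single numeric vector:

```{r}
# build a SpatExtent from a numeric vector of xmin, xmax, ymin, ymax
vec_extent <- ext(c(732161.2, 732238.7, 4713249, 4713333))
vec_extent
```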
276 |
277 | Once we have defined our new extent, we can use the `crop()` function to crop
278 | our raster to this extent object.
279 |
280 | ```{r crop-using-drawn-extent}
281 | CHM_HARV_manual_cropped <- crop(x = CHM_HARV, y = new_extent)
282 | ```
283 |
284 | To plot this data using `ggplot()` we need to convert it to a dataframe.
285 |
286 | ```{r}
287 | CHM_HARV_manual_cropped_df <- as.data.frame(CHM_HARV_manual_cropped, xy = TRUE)
288 | ```
289 |
290 | Now we can plot this cropped data. We will show the AOI boundary on the same
291 | plot for scale.
292 |
293 | ```{r show-manual-crop-area}
294 | ggplot() +
295 | geom_sf(data = aoi_boundary_HARV, color = "blue", fill = NA) +
296 | geom_raster(data = CHM_HARV_manual_cropped_df,
297 | aes(x = x, y = y, fill = HARV_chmCrop)) +
298 | scale_fill_gradientn(name = "Canopy Height", colors = terrain.colors(10)) +
299 | coord_sf()
300 | ```
301 |
302 | ## Extract Raster Pixel Values Using Vector Polygons
303 |
304 | Often we want to extract values from a raster layer for particular locations -
305 | for example, plot locations that we are sampling on the ground. We can extract
306 | all pixel values within 20m of our x,y point of interest. These can then be
307 | summarized into some value of interest (e.g. mean, maximum, total).
308 |
309 | {alt='Image shows raster information extraction using 20m polygon boundary.'}
310 | Image Source: National Ecological Observatory Network (NEON)
311 |
312 | To do this in R, we use the `extract()` function. The `extract()` function
313 | requires:
314 |
315 | - The raster that we wish to extract values from,
316 | - The vector layer containing the polygons that we wish to use as a boundary or
317 | boundaries,
318 | - Optionally, we can tell it to store the output values in a data frame by
319 |   setting `raw = FALSE`.
320 |
321 | We will begin by extracting all canopy height pixel values located within our
322 | `aoi_boundary_HARV` polygon which surrounds the tower located at the NEON
323 | Harvard Forest field site.
324 |
325 | ```{r extract-from-raster}
326 | tree_height <- extract(x = CHM_HARV, y = aoi_boundary_HARV, raw = FALSE)
327 |
328 | str(tree_height)
329 | ```
330 |
331 | When we use the `extract()` function, R extracts the value for each pixel
332 | located within the boundary of the polygon being used to perform the extraction
333 | - in this case the `aoi_boundary_HARV` object (a single polygon). Here, the
334 | function extracted values from 18,450 pixels.
335 |
336 | We can create a histogram of tree height values within the boundary to better
337 | understand the structure or height distribution of trees at our site. We will
338 | use the column `HARV_chmCrop` from our data frame as our x values, as this
339 | column represents the tree heights for each pixel.
340 |
341 | ```{r view-extract-histogram}
342 | ggplot() +
343 | geom_histogram(data = tree_height, aes(x = HARV_chmCrop)) +
344 | ggtitle("Histogram of CHM Height Values (m)") +
345 | xlab("Tree Height") +
346 | ylab("Frequency of Pixels")
347 | ```
348 |
349 | We can also use the `summary()` function to view descriptive statistics
350 | including min, max, and mean height values. These values help us better
351 | understand vegetation at our field site.
352 |
353 | ```{r}
354 | summary(tree_height$HARV_chmCrop)
355 | ```
356 |
357 | ## Summarize Extracted Raster Values
358 |
359 | We often want to extract summary values from a raster. We can tell R the type
360 | of summary statistic we are interested in using the `fun =` argument. Let's
361 | extract a mean height value for our AOI.
362 |
363 | ```{r summarize-extract}
364 | mean_tree_height_AOI <- extract(x = CHM_HARV, y = aoi_boundary_HARV,
365 | fun = mean)
366 |
367 | mean_tree_height_AOI
368 | ```
369 |
370 | It appears that the mean height value, extracted from our LiDAR-derived
371 | canopy height model, is 22.43 meters.
372 |
373 | ## Extract Data using x,y Locations
374 |
375 | We can also extract pixel values from a raster by defining a buffer or area
376 | surrounding individual point locations using the `st_buffer()` function. To do
377 | this we define the summary argument (`fun = mean`) and the buffer distance
378 | (`dist = 20`) which represents the radius of a circular region around each
379 | point. By default, the units of the buffer are the same units as the data's
380 | CRS. All pixels that are touched by the buffer region are included in the
381 | extract.
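To see what the buffer itself looks like, we can build one around the tower point and check its area; since this CRS uses meters, a 20 m radius should give roughly pi \* 20^2 (about 1257) square meters:

```{r}
# a sketch: buffer the tower point by 20 m and check the polygon's area;
# st_buffer() approximates a circle, so the area is close to pi * 20^2
tower_buffer <- st_buffer(point_HARV, dist = 20)
st_area(tower_buffer)
```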
382 |
383 | {alt='Image shows raster information extraction using 20m buffer region.'}
384 | Image Source: National Ecological Observatory Network (NEON)
385 |
386 | Let's put this into practice by figuring out the mean tree height in the 20m
387 | around the tower location (`point_HARV`).
388 |
389 | ```{r extract-point-to-buffer}
390 | mean_tree_height_tower <- extract(x = CHM_HARV,
391 | y = st_buffer(point_HARV, dist = 20),
392 | fun = mean)
393 |
394 | mean_tree_height_tower
395 | ```
396 |
397 | ::::::::::::::::::::::::::::::::::::::: challenge
398 |
399 | ## Challenge: Extract Raster Height Values For Plot Locations
400 |
401 | 1) Use the plot locations object (`plot_locations_sp_HARV`) to extract an
402 | average tree height for the area within 20m of each vegetation plot location
403 | in the study area. Because there are multiple plot locations, there will be
404 | multiple averages returned.
405 |
406 | 2) Create a plot showing the mean tree height of each area.
407 |
408 | ::::::::::::::: solution
409 |
410 | ## Answers
411 |
412 | ```{r hist-tree-height-veg-plot}
413 | # extract data at each plot location
414 | mean_tree_height_plots_HARV <- extract(x = CHM_HARV,
415 | y = st_buffer(plot_locations_sp_HARV,
416 | dist = 20),
417 | fun = mean)
418 |
419 | # view data
420 | mean_tree_height_plots_HARV
421 |
422 | # plot data
423 | ggplot(data = mean_tree_height_plots_HARV, aes(ID, HARV_chmCrop)) +
424 | geom_col() +
425 | ggtitle("Mean Tree Height at each Plot") +
426 | xlab("Plot ID") +
427 | ylab("Tree Height (m)")
428 | ```
429 |
430 | :::::::::::::::::::::::::
431 |
432 | ::::::::::::::::::::::::::::::::::::::::::::::::::
433 |
434 |
435 |
436 | :::::::::::::::::::::::::::::::::::::::: keypoints
437 |
438 | - Use the `crop()` function to crop a raster object.
439 | - Use the `extract()` function to extract pixels from a raster object that fall
440 | within a particular extent boundary.
441 | - Use the `ext()` function to define an extent.
442 |
443 | ::::::::::::::::::::::::::::::::::::::::::::::::::
444 |
445 |
446 |
--------------------------------------------------------------------------------
/episodes/13-plot-time-series-rasters-in-r.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: Create Publication-quality Graphics
3 | teaching: 40
4 | exercises: 20
5 | source: Rmd
6 | ---
7 |
8 | ```{r setup, echo=FALSE}
9 | source("setup.R")
10 | ```
11 |
12 | ::::::::::::::::::::::::::::::::::::::: objectives
13 |
14 | - Assign custom names to bands in a RasterStack.
15 | - Customize raster plots using the `ggplot2` package.
16 |
17 | ::::::::::::::::::::::::::::::::::::::::::::::::::
18 |
19 | :::::::::::::::::::::::::::::::::::::::: questions
20 |
21 | - How can I create a publication-quality graphic and customize plot parameters?
22 |
23 | ::::::::::::::::::::::::::::::::::::::::::::::::::
24 |
25 | ```{r load-libraries, echo=FALSE, results="hide", message=FALSE}
26 | library(terra)
27 | library(ggplot2)
28 | library(dplyr)
29 | library(reshape)
30 | library(RColorBrewer)
31 | library(scales)
32 | ```
33 |
34 | ```{r load-data, echo=FALSE, results="hide"}
35 | # learners will have this data loaded from the previous episode
36 |
37 | all_NDVI_HARV <- list.files("data/NEON-DS-Landsat-NDVI/HARV/2011/NDVI",
38 | full.names = TRUE, pattern = ".tif$")
39 |
40 | # Create a time series raster stack
41 | NDVI_HARV_stack <- rast(all_NDVI_HARV)
42 | # NOTE: Fix the bands' names so they don't start with a number!
43 | names(NDVI_HARV_stack) <- paste0("X", names(NDVI_HARV_stack))
44 |
45 | # apply scale factor
46 | NDVI_HARV_stack <- NDVI_HARV_stack/10000
47 |
48 | # convert to a df for plotting
49 | NDVI_HARV_stack_df <- as.data.frame(NDVI_HARV_stack, xy = TRUE) %>%
50 | # Then reshape data to stack all the X*_HARV_ndvi_crop columns into
51 | # one single column called 'variable'
52 | melt(id.vars = c('x','y'))
53 | ```
54 |
55 | :::::::::::::::::::::::::::::::::::::::::: prereq
56 |
57 | ## Things You'll Need To Complete This Episode
58 |
59 | See the [lesson homepage](.) for detailed information about the software, data,
60 | and other prerequisites you will need to work through the examples in this
61 | episode.
62 |
63 |
64 | ::::::::::::::::::::::::::::::::::::::::::::::::::
65 |
66 | This episode covers how to customize your raster plots using the `ggplot2`
67 | package in R to create publication-quality plots.
68 |
69 | ## Before and After
70 |
71 | In [the previous episode](12-time-series-raster/), we learned how to plot
72 | multi-band raster data in R using the `facet_wrap()` function. This created a
73 | separate panel in our plot for each raster band. The plot we created together
74 | is shown below:
75 |
76 | ```{r levelplot-time-series-before, echo=FALSE}
77 | # code not shown, demonstration only
78 | ggplot() +
79 | geom_raster(data = NDVI_HARV_stack_df , aes(x = x, y = y, fill = value)) +
80 | facet_wrap(~variable) +
81 | ggtitle("Landsat NDVI", subtitle = "NEON Harvard Forest")
82 | ```
83 |
84 | Although this plot is informative, it isn't something we would expect to see in
85 | a journal publication. The x and y-axis labels aren't informative. There is a
86 | lot of unnecessary gray background and the titles of each panel don't clearly
87 | state that the number refers to the Julian day the data was collected. In this
88 | episode, we will customize this plot above to produce a publication quality
89 | graphic. We will go through these steps iteratively. When we're done, we will
90 | have created the plot shown below.
91 |
92 | ```{r levelplot-time-series-after, echo=FALSE}
93 | # code not shown, demonstration only
94 |
95 | raster_names <- names(NDVI_HARV_stack)
96 | raster_names <- gsub("_HARV_ndvi_crop", "", raster_names)
97 | raster_names <- gsub("X", "Day ", raster_names)
98 | labels_names <- setNames(raster_names, unique(NDVI_HARV_stack_df$variable))
99 | green_colors <- brewer.pal(9, "YlGn") %>%
100 | colorRampPalette()
101 |
102 | ggplot() +
103 | geom_raster(data = NDVI_HARV_stack_df , aes(x = x, y = y, fill = value)) +
104 | facet_wrap(~variable, nrow = 3, ncol = 5,
105 | labeller = labeller(variable = labels_names)) +
106 | ggtitle("Landsat NDVI - Julian Days", subtitle = "Harvard Forest 2011") +
107 | theme_void() +
108 | theme(plot.title = element_text(hjust = 0.5, face = "bold"),
109 | plot.subtitle = element_text(hjust = 0.5)) +
110 | scale_fill_gradientn(name = "NDVI", colours = green_colors(20))
111 |
112 | # cleanup
113 | rm(raster_names, labels_names, green_colors)
114 | ```
115 |
116 | ## Adjust the Plot Theme
117 |
118 | The first thing we will do to our plot is remove the x and y-axis labels and axis
119 | ticks, as these are unnecessary and make our plot look messy. We can do this by
120 | setting the plot theme to `void`.
121 |
122 | ```{r adjust-theme}
123 | ggplot() +
124 | geom_raster(data = NDVI_HARV_stack_df , aes(x = x, y = y, fill = value)) +
125 | facet_wrap(~variable) +
126 | ggtitle("Landsat NDVI", subtitle = "NEON Harvard Forest") +
127 | theme_void()
128 | ```
129 |
130 | Next we will center our plot title and subtitle. We need to do this **after**
131 | the `theme_void()` layer, because R interprets the `ggplot` layers in order. If
132 | we first tell R to center our plot title, and then set the theme to `void`, any
133 | adjustments we've made to the plot theme will be over-written by the
134 | `theme_void()` function. So first we make the theme `void` and then we center
135 | the title. We center both the title and subtitle by using the `theme()`
136 | function and setting the `hjust` parameter to 0.5. The `hjust` parameter stands
137 | for "horizontal justification" and takes any value between 0 and 1. A setting
138 | of 0 indicates left justification and a setting of 1 indicates right
139 | justification.
140 |
141 | ```{r adjust-theme-2}
142 | ggplot() +
143 | geom_raster(data = NDVI_HARV_stack_df , aes(x = x, y = y, fill = value)) +
144 | facet_wrap(~variable) +
145 | ggtitle("Landsat NDVI", subtitle = "NEON Harvard Forest") +
146 | theme_void() +
147 | theme(plot.title = element_text(hjust = 0.5),
148 | plot.subtitle = element_text(hjust = 0.5))
149 | ```
150 |
151 | ::::::::::::::::::::::::::::::::::::::: challenge
152 |
153 | ## Challenge
154 |
155 | Change the plot title (but not the subtitle) to bold font. You can (and
156 | should!) use the help menu in RStudio or any internet resources to figure out
157 | how to change this setting.
158 |
159 | ::::::::::::::: solution
160 |
161 | ## Answers
162 |
163 | Learners can find this information in the help files for the `theme()`
164 | function. The parameter to set is called `face`.
165 |
166 | ```{r use-bold-face}
167 | ggplot() +
168 | geom_raster(data = NDVI_HARV_stack_df,
169 | aes(x = x, y = y, fill = value)) +
170 | facet_wrap(~ variable) +
171 | ggtitle("Landsat NDVI", subtitle = "NEON Harvard Forest") +
172 | theme_void() +
173 | theme(plot.title = element_text(hjust = 0.5, face = "bold"),
174 | plot.subtitle = element_text(hjust = 0.5))
175 | ```
176 |
177 | :::::::::::::::::::::::::
178 |
179 | ::::::::::::::::::::::::::::::::::::::::::::::::::
180 |
181 | ## Adjust the Color Ramp
182 |
183 | Next, let's adjust the color ramp used to render the rasters. First, we can
184 | change the blue color ramp to a green one that is more visually suited to our
185 | NDVI (greenness) data using the `colorRampPalette()` function in combination
186 | with a ColorBrewer palette, which requires loading the `RColorBrewer` library.
187 | Then we use `scale_fill_gradientn()` to pass the list of colours (here 20
188 | different colours) to `ggplot`.
189 |
190 | First we need to create a set of colors to use. We will select a set of nine
191 | colors from the "YlGn" (yellow-green) color palette. This returns a set of hex
192 | color codes:
193 |
194 | ```{r}
195 | library(RColorBrewer)
196 | brewer.pal(9, "YlGn")
197 | ```
198 |
199 | Then we will pass those color codes to the `colorRampPalette()` function, which
200 | will interpolate a more nuanced range of colors from them.
201 |
202 | ```{r}
203 | green_colors <- brewer.pal(9, "YlGn") %>%
204 | colorRampPalette()
205 | ```
206 |
207 | We can tell the `colorRampPalette()` function how many discrete colors within
208 | this color range to create. In our case, we will use 20 colors when we plot our
209 | graphic.
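Calling the palette function directly shows the interpolated hex codes it returns; for example:

```{r}
# a quick check: ask the palette function for 5 interpolated colors
green_colors(5)
```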
210 |
211 | ```{r change-color-ramp}
212 | ggplot() +
213 | geom_raster(data = NDVI_HARV_stack_df , aes(x = x, y = y, fill = value)) +
214 | facet_wrap(~variable) +
215 | ggtitle("Landsat NDVI", subtitle = "NEON Harvard Forest") +
216 | theme_void() +
217 | theme(plot.title = element_text(hjust = 0.5, face = "bold"),
218 | plot.subtitle = element_text(hjust = 0.5)) +
219 | scale_fill_gradientn(name = "NDVI", colours = green_colors(20))
220 | ```
221 |
222 | The yellow to green color ramp visually represents NDVI well given it's a
223 | measure of greenness. Someone looking at the plot can quickly understand that
224 | pixels that are more green have a higher NDVI value.
225 |
226 | ::::::::::::::::::::::::::::::::::::::::: callout
227 |
228 | ## Data Tip
229 |
230 | For all of the `brewer.pal` ramp names see the
231 | [brewerpal page](https://www.datavis.ca/sasmac/brewerpal.html).
232 |
233 |
234 | ::::::::::::::::::::::::::::::::::::::::::::::::::
235 |
236 | ::::::::::::::::::::::::::::::::::::::::: callout
237 |
238 | ## Data Tip
239 |
240 | Cynthia Brewer, the creator of ColorBrewer, offers an online tool to help
241 | choose suitable color ramps, or to create your own.
242 | [ColorBrewer 2.0; Color Advise for Cartography](https://colorbrewer2.org/)
243 |
244 |
245 | ::::::::::::::::::::::::::::::::::::::::::::::::::
246 |
247 | ## Refine Plot \& Tile Labels
248 |
249 | Next, let's label each panel in our plot with the Julian day that the raster
250 | data for that panel was collected. The current names come from the band "layer
251 | names" stored in the `RasterStack`, and the first part of each name is the
252 | Julian day.
253 |
254 | To create a more meaningful label we can remove the "X" and replace it with
255 | "Day" using the `gsub()` function in R. The syntax is as follows:
256 | `gsub("PatternToFind", "ReplacementText", object)`.
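A toy example illustrates the pattern:

```{r}
# replace every occurrence of "cat" with "dog"
gsub("cat", "dog", "the cat sat on the cat mat")
# returns "the dog sat on the dog mat"
```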
257 |
258 | First let's remove "\_HARV\_NDVI\_crop" from each label to make the labels
259 | shorter and remove repetition. To illustrate how this works, we will first
260 | look at the names for our `NDVI_HARV_stack` object:
261 |
262 | ```{r}
263 | names(NDVI_HARV_stack)
264 | ```
265 |
266 | Now we will use the `gsub()` function to find the character string
267 | "\_HARV\_ndvi\_crop" and replace it with a blank string (""). We will assign
268 | this output to a new object (`raster_names`) and look at that object to make
269 | sure our code is doing what we want it to.
270 |
271 | ```{r}
272 | raster_names <- names(NDVI_HARV_stack)
273 |
274 | raster_names <- gsub("_HARV_ndvi_crop", "", raster_names)
275 | raster_names
276 | ```
277 |
278 | So far so good. Now we will use `gsub()` again to replace the "X" with the word
279 | "Day" followed by a space.
280 |
281 | ```{r}
282 | raster_names <- gsub("X", "Day ", raster_names)
283 | raster_names
284 | ```
285 |
286 | Our labels look good now. Let's create a named vector linking these labels to the panel variables in our data frame:
287 |
288 | ```{r}
289 | labels_names <- setNames(raster_names, unique(NDVI_HARV_stack_df$variable))
290 | ```
291 |
292 | Once the names for each band have been reassigned, we can render our plot with
293 | the new labels using a `labeller`.
294 |
295 | ```{r create-levelplot}
296 | ggplot() +
297 | geom_raster(data = NDVI_HARV_stack_df , aes(x = x, y = y, fill = value)) +
298 | facet_wrap(~variable, labeller = labeller(variable = labels_names)) +
299 | ggtitle("Landsat NDVI", subtitle = "NEON Harvard Forest") +
300 | theme_void() +
301 | theme(plot.title = element_text(hjust = 0.5, face = "bold"),
302 | plot.subtitle = element_text(hjust = 0.5)) +
303 | scale_fill_gradientn(name = "NDVI", colours = green_colors(20))
304 | ```
305 |
306 | ## Change Layout of Panels
307 |
308 | We can adjust the columns of our plot by setting the number of columns `ncol`
309 | and the number of rows `nrow` in `facet_wrap`. Let's make our plot so that it
310 | has a width of five panels.
311 |
312 | ```{r adjust-layout}
313 | ggplot() +
314 | geom_raster(data = NDVI_HARV_stack_df , aes(x = x, y = y, fill = value)) +
315 | facet_wrap(~variable, ncol = 5,
316 | labeller = labeller(variable = labels_names)) +
317 | ggtitle("Landsat NDVI", subtitle = "NEON Harvard Forest") +
318 | theme_void() +
319 | theme(plot.title = element_text(hjust = 0.5, face = "bold"),
320 | plot.subtitle = element_text(hjust = 0.5)) +
321 | scale_fill_gradientn(name = "NDVI", colours = green_colors(20))
322 | ```
323 |
324 | Now we have a beautiful, publication quality plot!
325 |
326 | ::::::::::::::::::::::::::::::::::::::: challenge
327 |
328 | ## Challenge: Divergent Color Ramps
329 |
330 | When we used the `gsub()` function to modify the tile labels, we replaced the
331 | beginning of each tile title with "Day". A more descriptive name could be
332 | "Julian Day". Update the plot above with the following changes:
333 |
334 | 1. Label each tile "Julian Day", followed by the Julian day value.
335 | 2. Change the color ramp to a divergent brown to green color ramp.
336 |
337 | **Questions:**
338 | Does having a divergent color ramp represent the data better than a sequential
339 | color ramp (like "YlGn")? Can you think of other data sets where a divergent
340 | color ramp may be best?
341 |
342 | ::::::::::::::: solution
343 |
344 | ## Answers
345 |
346 | ```{r final-figure}
347 | raster_names <- gsub("Day","Julian Day ", raster_names)
348 | labels_names <- setNames(raster_names, unique(NDVI_HARV_stack_df$variable))
349 |
350 | brown_green_colors <- colorRampPalette(brewer.pal(9, "BrBG"))
351 |
352 | ggplot() +
353 | geom_raster(data = NDVI_HARV_stack_df , aes(x = x, y = y, fill = value)) +
354 | facet_wrap(~variable, ncol = 5, labeller = labeller(variable = labels_names)) +
355 | ggtitle("Landsat NDVI - Julian Days", subtitle = "Harvard Forest 2011") +
356 | theme_void() +
357 | theme(plot.title = element_text(hjust = 0.5, face = "bold"),
358 | plot.subtitle = element_text(hjust = 0.5)) +
359 | scale_fill_gradientn(name = "NDVI", colours = brown_green_colors(20))
360 | ```
361 |
362 | For NDVI data, the sequential color ramp is better than the divergent as it is
363 | more akin to the process of greening up, which starts off at one end and just
364 | keeps increasing.
365 |
366 |
367 |
368 | :::::::::::::::::::::::::
369 |
370 | ::::::::::::::::::::::::::::::::::::::::::::::::::
371 |
372 |
373 |
374 | :::::::::::::::::::::::::::::::::::::::: keypoints
375 |
376 | - Use the `theme_void()` function for a clean background to your plot.
377 | - Use the `element_text()` function to adjust text size, font, and position.
378 | - Use the `brewer.pal()` function to create a custom color palette.
379 | - Use the `gsub()` function to do pattern matching and replacement in text.
380 |
381 | ::::::::::::::::::::::::::::::::::::::::::::::::::
382 |
383 |
384 |
--------------------------------------------------------------------------------
/episodes/14-extract-ndvi-from-rasters-in-r.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: Derive Values from Raster Time Series
3 | teaching: 40
4 | exercises: 20
5 | source: Rmd
6 | ---
7 |
8 | ```{r setup, echo=FALSE}
9 | source("setup.R")
10 | ```
11 |
12 | ::::::::::::::::::::::::::::::::::::::: objectives
13 |
14 | - Extract summary pixel values from a raster.
15 | - Save summary values to a .csv file.
16 | - Plot summary pixel values using `ggplot()`.
17 | - Compare NDVI values between two different sites.
18 |
19 | ::::::::::::::::::::::::::::::::::::::::::::::::::
20 |
21 | :::::::::::::::::::::::::::::::::::::::: questions
22 |
23 | - How can I calculate, extract, and export summarized raster pixel data?
24 |
25 | ::::::::::::::::::::::::::::::::::::::::::::::::::
26 |
27 | ```{r load-libraries, echo=FALSE, results="hide", message=FALSE, warning=FALSE}
28 | library(terra)
29 | library(ggplot2)
30 | library(dplyr)
31 | ```
32 |
33 | ```{r load-data, echo=FALSE, results="hide"}
34 | # learners will have this data loaded from the previous episode
35 |
36 | all_NDVI_HARV <- list.files("data/NEON-DS-Landsat-NDVI/HARV/2011/NDVI",
37 | full.names = TRUE, pattern = ".tif$")
38 |
39 | # Create a time series raster stack
40 | NDVI_HARV_stack <- rast(all_NDVI_HARV)
41 | # NOTE: Fix the bands' names so they don't start with a number!
42 | names(NDVI_HARV_stack) <- paste0("X", names(NDVI_HARV_stack))
43 |
44 | # apply scale factor
45 | NDVI_HARV_stack <- NDVI_HARV_stack/10000
46 | ```
47 |
48 | :::::::::::::::::::::::::::::::::::::::::: prereq
49 |
50 | ## Things You'll Need To Complete This Episode
51 |
52 | See the [lesson homepage](.) for detailed information about the software, data,
53 | and other prerequisites you will need to work through the examples in this
54 | episode.
55 |
56 |
57 | ::::::::::::::::::::::::::::::::::::::::::::::::::
58 |
59 | In this episode, we will extract NDVI values from a raster time series dataset
60 | and plot them using the `ggplot2` package.
61 |
62 | ## Extract Summary Statistics From Raster Data
63 |
64 | We often want to extract summary values from raster data. For example, we might
65 | want to understand overall greenness across a field site or at each plot within
66 | a field site. These values can then be compared between different field sites
67 | and combined with other related metrics to support modeling and further
68 | analysis.
69 |
70 | ## Calculate Average NDVI
71 |
72 | Our goal in this episode is to create a dataframe that contains a single, mean
73 | NDVI value for each raster in our time series. This value represents the mean
74 | NDVI value for this area on a given day.
75 |
76 | We can calculate the mean for each raster using the `global()` function. The
77 | `global()` function produces a data frame with one row per layer, where each
78 | row name is the name of the layer that the value was derived from.
79 |
80 | ```{r}
81 | avg_NDVI_HARV <- global(NDVI_HARV_stack, mean)
82 | avg_NDVI_HARV
83 | ```
84 |
85 | The output is a data frame (if it were not, we could convert it with
86 | `as.data.frame()`). It's a good idea to view the first few rows of our data
87 | frame with `head()` to make sure the structure is what we expect.
88 |
89 | ```{r}
90 | head(avg_NDVI_HARV)
91 | ```
92 |
93 | We now have a data frame with row names that are based on the original file
94 | name and a mean NDVI value for each file. Next, let's clean up the column names
95 | in our data frame to make it easier for colleagues to work with our code.
96 |
97 | Let's change the NDVI column name to `meanNDVI`.
98 |
99 | ```{r view-dataframe-output}
100 | names(avg_NDVI_HARV) <- "meanNDVI"
101 | head(avg_NDVI_HARV)
102 | ```
103 |
104 | The new column name doesn't remind us which site our data are from. While we
105 | are only working with one site now, we might want to compare several sites'
106 | worth of data in the future. Let's add a column to our dataframe called "site".
107 |
108 | ```{r insert-site-name}
109 | avg_NDVI_HARV$site <- "HARV"
110 | ```
111 |
112 | This populates the column with the site name, HARV. Let's also create a year
113 | column and populate it with 2011 - the year our data were collected.
114 |
115 | ```{r}
116 | avg_NDVI_HARV$year <- "2011"
117 | head(avg_NDVI_HARV)
118 | ```
119 |
120 | We now have a dataframe that contains a row for each raster file processed, and
121 | columns for `meanNDVI`, `site`, and `year`.
122 |
123 | ## Extract Julian Day from row names
124 |
125 | We'd like to produce a plot where Julian days (the numeric day of the year,
126 | 1 - 365/366) are on the x-axis and NDVI is on the y-axis. To create this plot,
127 | we'll need a column that contains the Julian day value.
128 |
129 | One way to create a Julian day column is to use `gsub()` on the file name in
130 | each row. We can replace both the `X` and the `_HARV_NDVI_crop` to extract the
131 | Julian Day value, just like we did in the
132 | [previous episode](13-plot-time-series-rasters-in-r/).
133 |
134 | This time we will use one additional trick to do both of these steps at the
135 | same time. The vertical bar character ( `|` ) is equivalent to the word "or".
136 | Using this character in our search pattern allows us to search for more than
137 | one pattern in our text strings.
138 |
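As a quick illustration with a made-up file name, the alternation pattern removes whichever alternative matches:

```{r}
# either "X" or "_HARV_ndvi_crop" is removed wherever it occurs
gsub("X|_HARV_ndvi_crop", "", "X005_HARV_ndvi_crop")  # returns "005"
```
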
139 | ```{r extract-julian-day}
140 | julianDays <- gsub("X|_HARV_ndvi_crop", "", row.names(avg_NDVI_HARV))
141 | julianDays
142 | ```
143 |
144 | Now that we've extracted the Julian days from our row names, we can add that
145 | data to the data frame as a column called "julianDay".
146 |
147 | ```{r}
148 | avg_NDVI_HARV$julianDay <- julianDays
149 | ```
150 |
151 | Let's check the class of this new column:
152 |
153 | ```{r}
154 | class(avg_NDVI_HARV$julianDay)
155 | ```
156 |
157 | ## Convert Julian Day to Date Class
158 |
159 | Currently, the values in the Julian day column are stored as class `character`.
160 | Storing these values as a `Date` object makes plotting, subsetting, and other
161 | operations easier. Let's convert them. We worked with data conversions
162 | [in an earlier episode](12-time-series-raster/). For an introduction to
163 | date-time classes, see the NEON Data Skills tutorial
164 | [Convert Date \& Time Data from Character Class to Date-Time Class (POSIX) in R](https://www.neonscience.org/dc-convert-date-time-POSIX-r).
165 |
166 | To convert a Julian day number to a date class, we need to set the origin,
167 | which is the day that our Julian days start counting from. Our data is from
168 | 2011 and we know that the USGS Landsat Team created Julian day values for this
169 | year. Therefore, the first day or "origin" for our Julian day count is 01
170 | January 2011.
171 |
172 | ```{r}
173 | origin <- as.Date("2011-01-01")
174 | ```
175 |
176 | Next we convert the `julianDay` column from character to integer.
177 |
178 | ```{r}
179 | avg_NDVI_HARV$julianDay <- as.integer(avg_NDVI_HARV$julianDay)
180 | ```
181 |
182 | Once we set the Julian day origin, we can add the Julian day value (as an
183 | integer) to the origin date.
184 |
185 | Note that when we convert our integer `julianDay` values to dates, we
186 | subtract 1. This is because the origin date, 01 January 2011, already
187 | corresponds to Julian day 01. For Julian day 05 (the 5th of January), we
188 | cannot simply compute `origin + julianDay`, because adding 5 days to 01
189 | January 2011 gives 06 January 2011 - one day too late. Subtracting 1 gives
190 | the correct date, 05 January 2011.
192 |
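To see why the subtraction is needed, here is a quick check of the arithmetic (the dates here are purely illustrative):

```{r}
origin <- as.Date("2011-01-01")
origin + 5        # "2011-01-06": one day too late for Julian day 5
origin + (5 - 1)  # "2011-01-05": correct
```
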
193 | ```{r}
194 | avg_NDVI_HARV$Date<- origin + (avg_NDVI_HARV$julianDay - 1)
195 | head(avg_NDVI_HARV$Date)
196 | ```
197 |
198 | Since the origin date was originally set as a Date class object, the new `Date`
199 | column is also stored as class `Date`.
200 |
201 | ```{r}
202 | class(avg_NDVI_HARV$Date)
203 | ```
204 |
205 | ::::::::::::::::::::::::::::::::::::::: challenge
206 |
207 | ## Challenge: NDVI for the San Joaquin Experimental Range
208 |
209 | We often want to compare two different sites. The National Ecological
210 | Observatory Network (NEON) also has a field site in Southern California at the
211 | [San Joaquin Experimental Range (SJER)](https://www.neonscience.org/field-sites/field-sites-map/SJER).
212 |
213 | For this challenge, create a dataframe containing the mean NDVI values and the
214 | Julian days the data was collected (in date format) for the NEON San Joaquin
215 | Experimental Range field site. NDVI data for SJER are located in the
216 | `NEON-DS-Landsat-NDVI/SJER/2011/NDVI` directory.
217 |
218 | ::::::::::::::: solution
219 |
220 | ## Answers
221 |
222 | First we will read in the NDVI data for the SJER field site.
223 |
224 | ```{r}
225 | NDVI_path_SJER <- "data/NEON-DS-Landsat-NDVI/SJER/2011/NDVI"
226 |
227 | all_NDVI_SJER <- list.files(NDVI_path_SJER,
228 | full.names = TRUE,
229 | pattern = ".tif$")
230 |
231 | NDVI_stack_SJER <- rast(all_NDVI_SJER)
232 | names(NDVI_stack_SJER) <- paste0("X", names(NDVI_stack_SJER))
233 |
234 | NDVI_stack_SJER <- NDVI_stack_SJER/10000
235 | ```
236 |
237 | Then we can calculate the mean values for each day and put that in a dataframe.
238 |
239 | ```{r}
240 | avg_NDVI_SJER <- as.data.frame(global(NDVI_stack_SJER, mean))
241 | ```
242 |
243 | Next we rename the NDVI column, and add site and year columns to our data.
244 |
245 | ```{r}
246 | names(avg_NDVI_SJER) <- "meanNDVI"
247 | avg_NDVI_SJER$site <- "SJER"
248 | avg_NDVI_SJER$year <- "2011"
249 | ```
250 |
251 | Now we will create our Julian day column:
252 |
253 | ```{r}
254 | julianDays_SJER <- gsub("X|_SJER_ndvi_crop", "", row.names(avg_NDVI_SJER))
255 | origin <- as.Date("2011-01-01")
256 | avg_NDVI_SJER$julianDay <- as.integer(julianDays_SJER)
257 |
258 | avg_NDVI_SJER$Date <- origin + (avg_NDVI_SJER$julianDay - 1)
259 |
260 | head(avg_NDVI_SJER)
261 | ```
262 |
263 | :::::::::::::::::::::::::
264 |
265 | ::::::::::::::::::::::::::::::::::::::::::::::::::
266 |
267 | ## Plot NDVI Using ggplot
268 |
269 | We now have a clean dataframe with properly scaled NDVI and Julian days. Let's
270 | plot our data.
271 |
272 | ```{r ggplot-data}
273 | ggplot(avg_NDVI_HARV, aes(julianDay, meanNDVI)) +
274 | geom_point() +
275 | ggtitle("Landsat Derived NDVI - 2011",
276 | subtitle = "NEON Harvard Forest Field Site") +
277 | xlab("Julian Days") + ylab("Mean NDVI")
278 | ```
279 |
280 | ::::::::::::::::::::::::::::::::::::::: challenge
281 |
282 | ## Challenge: Plot San Joaquin Experimental Range Data
283 |
284 | Create a complementary plot for the SJER data. Plot the data points in a
285 | different color.
286 |
287 | ::::::::::::::: solution
288 |
289 | ## Answers
290 |
291 | ```{r avg-ndvi-sjer}
292 | ggplot(avg_NDVI_SJER, aes(julianDay, meanNDVI)) +
293 | geom_point(colour = "SpringGreen4") +
294 | ggtitle("Landsat Derived NDVI - 2011", subtitle = "NEON SJER Field Site") +
295 | xlab("Julian Day") + ylab("Mean NDVI")
296 | ```
297 |
298 | :::::::::::::::::::::::::
299 |
300 | ::::::::::::::::::::::::::::::::::::::::::::::::::
301 |
302 | ## Compare NDVI from Two Different Sites in One Plot
303 |
304 | Comparison is often easiest when the two plots are side by side or, even
305 | better, when both sets of data are drawn in the same plot. We can do the
306 | latter by combining the two data sets. The data frames must have the same
307 | number of columns and exactly the same column names to be combined this way.
308 |
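As a small illustration with made-up data frames, `rbind()` stacks rows when the column names match:

```{r}
df_a <- data.frame(meanNDVI = 0.50, site = "A")
df_b <- data.frame(meanNDVI = 0.70, site = "B")
rbind(df_a, df_b)  # a single data frame with two rows
```
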
309 | ```{r merge-df-single-plot}
310 | NDVI_HARV_SJER <- rbind(avg_NDVI_HARV, avg_NDVI_SJER)
311 | ```
312 |
313 | Now we can plot both datasets on the same plot.
314 |
315 | ```{r ndvi-harv-sjer-comp}
316 | ggplot(NDVI_HARV_SJER, aes(x = julianDay, y = meanNDVI, colour = site)) +
317 | geom_point(aes(group = site)) +
318 | geom_line(aes(group = site)) +
319 | ggtitle("Landsat Derived NDVI - 2011",
320 | subtitle = "Harvard Forest vs San Joaquin") +
321 | xlab("Julian Day") + ylab("Mean NDVI")
322 | ```
323 |
324 | ::::::::::::::::::::::::::::::::::::::: challenge
325 |
326 | ## Challenge: Plot NDVI with date
327 |
328 | Plot the SJER and HARV data in one plot but use date, rather than Julian day,
329 | on the x-axis.
330 |
331 | ::::::::::::::: solution
332 |
333 | ## Answers
334 |
335 | ```{r ndvi-harv-sjer-date}
336 | ggplot(NDVI_HARV_SJER, aes(x = Date, y = meanNDVI, colour = site)) +
337 | geom_point(aes(group = site)) +
338 | geom_line(aes(group = site)) +
339 | ggtitle("Landsat Derived NDVI - 2011",
340 | subtitle = "Harvard Forest vs San Joaquin") +
341 | xlab("Date") + ylab("Mean NDVI")
342 | ```
343 |
344 | :::::::::::::::::::::::::
345 |
346 | ::::::::::::::::::::::::::::::::::::::::::::::::::
347 |
348 | ## Remove Outlier Data
349 |
350 | As we look at these plots we see variation in greenness across the year.
351 | However, the pattern is interrupted by a few points where NDVI quickly drops
352 | towards 0 during a time period when we might expect the vegetation to have a
353 | higher greenness value. Is the vegetation truly senescent or gone, or are
354 | these outlier values that should be removed from the data?
355 |
356 | We've seen in [an earlier episode](12-time-series-raster/) that data points
357 | with very low NDVI values can be associated with images that are filled with
358 | clouds. Thus, we can attribute the low NDVI values to high levels of cloud
359 | cover. Is the same thing happening at SJER?
360 |
361 | ```{r view-all-rgb-SJER, echo=FALSE}
362 | # code not shown, demonstration only
363 | # open up the cropped files
364 | rgb.allCropped.SJER <- list.files("data/NEON-DS-Landsat-NDVI/SJER/2011/RGB/",
365 | full.names=TRUE,
366 | pattern = ".tif$")
367 | # create a layout
368 | par(mfrow = c(5, 4))
369 |
370 | # Plot each cropped RGB image
371 | # note that there is an issue with one of the rasters
372 | # NEON-DS-Landsat-NDVI/SJER/2011/RGB/254_SJER_landRGB.tif has a blue band with no range
373 | # thus you can't apply a stretch to it. The code below skips the stretch for
374 | # that one image. You could automate this by testing the range of each band in each image
375 |
376 | for (aFile in rgb.allCropped.SJER) {
377 |   NDVI.rastStack <- rast(aFile)
378 |   # skip the stretch for the one image with a no-range blue band
379 |   if (aFile == "data/NEON-DS-Landsat-NDVI/SJER/2011/RGB//254_SJER_landRGB.tif") {
380 |     plotRGB(NDVI.rastStack)
381 |   } else plotRGB(NDVI.rastStack, stretch = "lin")
382 | }
382 |
383 | # reset layout
384 | par(mfrow=c(1, 1))
385 | ```
386 |
387 | Without significant additional processing, we will not be able to retrieve a
388 | strong vegetation signal from a remotely sensed image that is predominantly
389 | cloud covered. Thus, these points are likely bad data points.
390 | Let's remove them.
391 |
392 | First, we will identify the good data points that should be retained. One way
393 | to do this is by identifying a threshold value. All values below that threshold
394 | will be removed from our analysis. We will use 0.1 as an example for this
395 | episode. We can then use the `subset()` function to remove outlier data points
396 | (below our identified threshold).
397 |
398 | ::::::::::::::::::::::::::::::::::::::::: callout
399 |
400 | ## Data Tip
401 |
402 | Thresholding, or removing outlier data, can be tricky business. In this case,
403 | we can be confident that some of our NDVI values are not valid due to cloud
404 | cover. However, a threshold value may not always be sufficient given that 0.1
405 | could be a valid NDVI value in some areas. This is where decision-making should
406 | be fueled by practical scientific knowledge of the data and the desired
407 | outcomes!
408 |
409 |
410 | ::::::::::::::::::::::::::::::::::::::::::::::::::
411 |
412 | ```{r remove-bad-values}
413 | avg_NDVI_HARV_clean <- subset(avg_NDVI_HARV, meanNDVI > 0.1)
414 | avg_NDVI_HARV_clean$meanNDVI < 0.1  # confirm no values below the threshold remain
415 | ```
416 |
417 | Now we can create another plot without the suspect data.
418 |
419 | ```{r plot-clean-HARV}
420 | ggplot(avg_NDVI_HARV_clean, aes(x = julianDay, y = meanNDVI)) +
421 | geom_point() +
422 | ggtitle("Landsat Derived NDVI - 2011",
423 | subtitle = "NEON Harvard Forest Field Site") +
424 | xlab("Julian Days") + ylab("Mean NDVI")
425 | ```
426 |
427 | Now our outlier data points are removed and the pattern of "green-up" and
428 | "brown-down" makes more sense.
429 |
430 | ## Write NDVI data to a .csv File
431 |
432 | We can write our final NDVI dataframe out to a text format, to quickly share
433 | with a colleague or to reuse for analysis or visualization purposes. We will
434 | export in Comma Separated Value (.csv) file format because it is usable in many
435 | different tools and across platforms (Mac, PC, etc.).
436 |
437 | We will use `write.csv()` to write a specified dataframe to a `.csv` file.
438 | Unless you designate a different directory, the output file will be saved in
439 | your working directory.
440 |
441 | Before saving our file, let's view the format to make sure it is what we want
442 | as an output format.
443 |
444 | ```{r write-csv}
445 | head(avg_NDVI_HARV_clean)
446 | ```
447 |
448 | It looks like we have a series of `row.names` that we do not need because we
449 | have this information stored in individual columns in our data frame. Let's
450 | remove the row names.
451 |
452 | ```{r drop-rownames-write-csv}
453 | row.names(avg_NDVI_HARV_clean) <- NULL
454 | head(avg_NDVI_HARV_clean)
455 | ```
456 |
457 | ```{r, eval=FALSE}
458 | write.csv(avg_NDVI_HARV_clean, file="meanNDVI_HARV_2011.csv")
459 | ```
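As an alternative to setting the row names to `NULL` beforehand, `write.csv()` can drop them at write time via its `row.names` argument:

```{r, eval=FALSE}
write.csv(avg_NDVI_HARV_clean, file = "meanNDVI_HARV_2011.csv",
          row.names = FALSE)
```
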
460 |
461 | ::::::::::::::::::::::::::::::::::::::: challenge
462 |
463 | ## Challenge: Write to .csv
464 |
465 | 1. Create a NDVI .csv file for the NEON SJER field site that is comparable with
466 | the one we just created for the Harvard Forest. Be sure to inspect for
467 | questionable values before writing any data to a .csv file.
468 | 2. Create a NDVI .csv file that includes data from both field sites.
469 |
470 | ::::::::::::::: solution
471 |
472 | ## Answers
473 |
474 | ```{r}
475 | avg_NDVI_SJER_clean <- subset(avg_NDVI_SJER, meanNDVI > 0.1)
476 | row.names(avg_NDVI_SJER_clean) <- NULL
477 | head(avg_NDVI_SJER_clean)
478 | write.csv(avg_NDVI_SJER_clean, file = "meanNDVI_SJER_2011.csv")
479 | ```
480 |
481 | :::::::::::::::::::::::::
482 |
483 | ::::::::::::::::::::::::::::::::::::::::::::::::::
484 |
485 |
486 |
487 | :::::::::::::::::::::::::::::::::::::::: keypoints
488 |
489 | - Use the `global()` function to calculate summary statistics for cells in a
490 | raster object.
491 | - In regular expressions, the vertical bar (`|`) means "or".
492 | - Use the `rbind()` function to combine data frames that have the same column
493 | names.
494 |
495 | ::::::::::::::::::::::::::::::::::::::::::::::::::
496 |
497 |
498 |
--------------------------------------------------------------------------------
/episodes/data/.gitignore:
--------------------------------------------------------------------------------
1 | *
2 | */
3 | !.gitignore
4 |
--------------------------------------------------------------------------------
/episodes/fig/BufferCircular.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/BufferCircular.png
--------------------------------------------------------------------------------
/episodes/fig/BufferSquare.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/BufferSquare.png
--------------------------------------------------------------------------------
/episodes/fig/dc-spatial-raster/GreennessOverTime.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/dc-spatial-raster/GreennessOverTime.jpg
--------------------------------------------------------------------------------
/episodes/fig/dc-spatial-raster/RGBSTack_1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/dc-spatial-raster/RGBSTack_1.jpg
--------------------------------------------------------------------------------
/episodes/fig/dc-spatial-raster/UTM_zones_18-19.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/dc-spatial-raster/UTM_zones_18-19.jpg
--------------------------------------------------------------------------------
/episodes/fig/dc-spatial-raster/imageStretch_dark.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/dc-spatial-raster/imageStretch_dark.jpg
--------------------------------------------------------------------------------
/episodes/fig/dc-spatial-raster/imageStretch_light.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/dc-spatial-raster/imageStretch_light.jpg
--------------------------------------------------------------------------------
/episodes/fig/dc-spatial-raster/lidarTree-height.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/dc-spatial-raster/lidarTree-height.png
--------------------------------------------------------------------------------
/episodes/fig/dc-spatial-raster/raster_concept.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/dc-spatial-raster/raster_concept.png
--------------------------------------------------------------------------------
/episodes/fig/dc-spatial-raster/raster_resolution.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/dc-spatial-raster/raster_resolution.png
--------------------------------------------------------------------------------
/episodes/fig/dc-spatial-raster/single_multi_raster.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/dc-spatial-raster/single_multi_raster.png
--------------------------------------------------------------------------------
/episodes/fig/dc-spatial-raster/spatial_extent.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/dc-spatial-raster/spatial_extent.png
--------------------------------------------------------------------------------
/episodes/fig/dc-spatial-vector/pnt_line_poly.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/dc-spatial-vector/pnt_line_poly.png
--------------------------------------------------------------------------------
/episodes/fig/dc-spatial-vector/spatial_extent.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/dc-spatial-vector/spatial_extent.png
--------------------------------------------------------------------------------
/episodes/fig/map_usa_different_projections.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/datacarpentry/r-raster-vector-geospatial/1d84ea9812af46c4fd730474aedb5066d6dcdc3c/episodes/fig/map_usa_different_projections.jpg
--------------------------------------------------------------------------------
/episodes/setup.R:
--------------------------------------------------------------------------------
1 | options(timeout = max(300, getOption('timeout')))
2 | ## file structure
3 |
4 | if (! file.exists("data/NEON-DS-Site-Layout-Files")) {
5 | dest <- tempfile()
6 | download.file("https://ndownloader.figshare.com/files/3708751", dest,
7 | mode = "wb")
8 | unzip(dest, exdir = "data")
9 | }
10 |
11 | if (! file.exists("data/NEON-DS-Airborne-Remote-Sensing")) {
12 | dest <- tempfile()
13 | download.file("https://ndownloader.figshare.com/files/3701578", dest,
14 | mode = "wb")
15 | unzip(dest, exdir = "data")
16 | }
17 |
18 | if (! file.exists("data/NEON-DS-Met-Time-Series")) {
19 | dest <- tempfile()
20 | download.file("https://ndownloader.figshare.com/files/3701572", dest,
21 | mode = "wb")
22 | unzip(dest, exdir = "data")
23 | }
24 |
25 | if (! file.exists("data/NEON-DS-Landsat-NDVI")) {
26 | dest <- tempfile()
27 | download.file("https://ndownloader.figshare.com/files/4933582", dest,
28 | mode = "wb")
29 | unzip(dest, exdir = "data")
30 | }
31 |
32 | if (! file.exists("data/Global/Boundaries/ne_110m_graticules_all")) {
33 | dest <- tempfile()
34 | download.file("https://naciscdn.org/naturalearth/110m/physical/ne_110m_graticules_all.zip",
35 | dest, mode = "wb")
36 | unzip(dest, exdir = "data/Global/Boundaries/ne_110m_graticules_all")
37 | }
38 |
39 | if (! file.exists("data/Global/Boundaries/ne_110m_land")) {
40 | dest <- tempfile()
41 | download.file("https://naciscdn.org/naturalearth/110m/physical/ne_110m_land.zip",
42 | dest, mode = "wb")
43 | unzip(dest, exdir = "data/Global/Boundaries/ne_110m_land")
44 | }
45 |
--------------------------------------------------------------------------------
/index.md:
--------------------------------------------------------------------------------
1 | ---
2 | site: sandpaper::sandpaper_site
3 | ---
4 |
5 | **Lesson Authors:** Leah A. Wasser, Megan A. Jones, Zack Brym, Kristina Riemer, Jason Williams, Jeff Hollister, Mike Smorul, Jemma Stachelek
6 |
7 |
8 |
9 | The episodes in this lesson cover how to open, work with, and plot
10 | vector and raster-format spatial data in R. Additional topics include
11 | working with spatial metadata (extent and coordinate reference systems),
12 | reprojecting spatial data, and working with raster time series data.
13 |
14 | :::::::::::::::::::::::::::::::::::::::::: prereq
15 |
16 | ## Prerequisites
17 |
18 | Data Carpentry's teaching is hands-on, so participants are encouraged
19 | to use their own computers to ensure the proper setup of tools for an
20 | efficient workflow. To most effectively use these materials, please
21 | make sure to download the data and install everything before
22 | working through this lesson.
23 |
24 | ### R Skill Level
25 |
26 | This lesson assumes you have some knowledge of `R`. If you've never
27 | used `R` before, or need a refresher, start with our
28 | [Introduction to R for Geospatial Data](http://www.datacarpentry.org/r-intro-geospatial/)
29 | lesson.
30 |
31 | ### Geospatial Skill Level
32 |
33 | This lesson assumes you have some knowledge of geospatial data types
34 | and common file formats. If you have never worked with geospatial
35 | data before, or need a refresher, start with our
36 | [Introduction to Geospatial Concepts](http://www.datacarpentry.org/organization-geospatial/)
37 | lesson.
38 |
39 | ### Install Software and Download Data
40 |
41 | For installation instructions and to download the data used in this
42 | lesson, see the
43 | [Geospatial Workshop Overview](http://www.datacarpentry.org/geospatial-workshop/#setup).
44 |
45 | ### Setup RStudio Project
46 |
47 | Make sure you have set up a RStudio project for this lesson, as
48 | described in the
49 | [setup instructions](http://www.datacarpentry.org/geospatial-workshop/#setup)
50 | and that your working directory is correctly set.
51 |
52 |
53 | ::::::::::::::::::::::::::::::::::::::::::::::::::
54 |
55 |
56 |
--------------------------------------------------------------------------------
/instructors/instructor-notes.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: Instructor Notes
3 | ---
4 |
5 |
6 | ## Instructor notes
7 |
8 | ## Lesson motivation and learning objectives
9 |
10 | This lesson is designed to introduce learners to the fundamental principles and skills for working with
11 | raster and vector geospatial data in R. It begins by introducing the structure of and simple plotting of
12 | raster data. It then covers re-projection of raster data, performing raster math, and working with multi-band
13 | raster data. After introducing raster data, the lesson moves into working with vector data. Line, point, and
14 | polygon shapefiles are included in the data. Learners will plot multiple raster and/or vector layers
15 | in a single plot, and learn how to customize plot elements such as legends and titles. They will
16 | also learn how to read data in from a CSV-formatted file and convert it to a shapefile. Lastly, learners
17 | will work with a multi-layered raster dataset representing time series data and extract summary statistics
18 | from these data.
19 |
20 | ## Lesson design
21 |
22 | #### Overall comments
23 |
24 | - As of the initial release of this lesson (August 2018), the timing is set to be the same for each episode. This
25 | is very likely incorrect and will need to be updated as these lessons are taught. If you teach this lesson,
26 | please open an issue or PR to suggest an updated timing scheme!
27 |
28 | - The code examples presented in each episode assume that the learners still have all of the data and packages
29 | loaded from all previous episodes in this lesson. If learners close out of their R session during the breaks or
30 | at the end of the first day, they will need to either save the workspace or reload the data and packages.
31 | Because of this, it is essential that learners save their code to a script throughout the lesson.
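If learners do restart R, a short "session restart" block like the following can get them going again. This is a sketch only: the package names are the ones used across these episodes, and the file path shown is a placeholder, not the lesson's actual path.

```r
# Reload the packages used in these episodes (names assumed from the lesson):
library(raster)   # raster data structures and I/O
library(sf)       # vector (simple features) data
library(ggplot2)  # plotting
library(dplyr)    # data manipulation

# Then re-run the lines of the saved script that read in the data,
# for example (placeholder path, not the real lesson path):
# DSM_HARV <- raster("data/path-to/HARV_dsmCrop.tif")
```

This is another reason to insist that learners keep their code in a script: the restart then amounts to re-sourcing the top of that script.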
32 |
33 | #### [1 Intro to Raster Data in R](01-raster-structure.md)
34 |
35 | - Be sure to introduce the datasets that will be used in this lesson. There are many data files. It may
36 | be helpful to draw a diagram on the board showing the types of data that will be plotted and analyzed
37 | throughout the lesson.
38 | - If the [Introduction to Geospatial Concepts](https://datacarpentry.org/organization-geospatial/) lesson was
39 | included in your workshop, learners will have been introduced to the GDAL library. It will be useful to make
40 | the connection back to that lesson explicitly.
41 | - If the [Introduction to R for Geospatial Data](https://datacarpentry.org/r-intro-geospatial/) lesson was included
42 | in your workshop, learners will be familiar with the idea of packages and with most of the functions used
43 | in this lesson.
44 | - The "Dealing with Missing Data" and "Bad Data Values in Rasters" sections have several plots showing
45 | alternative ways of displaying missing data. The code for generating these plots is **not** shared with the
46 | learners, as it relies on many functions they have not yet learned. For these and other plots with hidden
47 | demonstration code, show the images in the lesson page while discussing those examples.
48 | - Be sure to draw a distinction between the DTM and the DSM files, as these two datasets will be used
49 | throughout the lesson.
50 |
51 | #### [2 Plot Raster Data in R](02-raster-plot.md)
52 |
53 | - `geom_bar()` is a new geom for the learners. They were introduced to `geom_col()` in the [Introduction to R for Geospatial Data](https://datacarpentry.org/r-intro-geospatial/) lesson.
54 | - `dplyr` syntax should be familiar to your learners from the [Introduction to R for Geospatial Data](https://datacarpentry.org/r-intro-geospatial/) lesson.
55 | - This may be the first time learners are exposed to hex colors, so be sure to explain that concept.
56 | - Starting in this episode and continuing throughout the lesson, the `ggplot` calls can be very long. Be sure
57 | to explicitly describe each step of the function call and what it is doing for the overall plot.
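When introducing hex colors, a quick base-R demonstration can make the encoding concrete; `col2rgb()` shows the red/green/blue values a hex string encodes. (The second color is an arbitrary example value, not one from the lesson.)

```r
# Each pair of hex digits encodes one channel (red, green, blue), 00-FF:
col2rgb("#FF0000")  # red = 255, green = 0, blue = 0
col2rgb("#1B9E77")  # red = 27, green = 158, blue = 119
```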
58 |
59 | #### [3 Reproject Raster Data in R](03-raster-reproject-in-r.md)
60 |
61 | - No notes yet. Please add your tips and comments!
62 |
63 | #### [4 Raster Calculations in R](04-raster-calculations-in-r.md)
64 |
65 | - The `overlay()` function syntax is fairly complex compared to other function calls the learners have seen.
66 | Be sure to explain it in detail.
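A stripped-down sketch of the `overlay()` syntax may help before showing the lesson's real call: the input rasters come first, and `fun` is applied cell-by-cell to the matched values. The toy layers below are stand-ins for the lesson's DSM and DTM, not the actual data.

```r
library(raster)

# Two small toy rasters standing in for the DSM and DTM:
surface <- raster(matrix(10:13, nrow = 2))
terrain <- raster(matrix(rep(5, 4), nrow = 2))

# overlay() passes the corresponding cell values of each input raster
# to `fun`, here computing a canopy-height-style difference:
chm <- overlay(surface, terrain, fun = function(r1, r2) { r1 - r2 })
values(chm)  # each cell is surface minus terrain
```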
67 |
68 | #### [5 Work With Multi-Band Rasters in R](05-raster-multi-band-in-r.md)
69 |
70 | - No notes yet. Please add your tips and comments!
71 |
72 | #### [6 Open and Plot Shapefiles in R](06-vector-open-shapefile-in-r.md)
73 |
74 | - Learners may have heard of the `sp` package. If it comes up, explain that `sf` is a
75 | more modern replacement for `sp`.
76 | - There is a known bug in the `geom_sf()` function that leads to an intermittent error on some platforms.
77 | If you see the following error message, try to re-run your plotting command and it should work.
78 | The `ggplot` development team is working on fixing this bug.
79 |
80 | **Error message:**
81 |
82 | ```error
83 | Error in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, :
84 | polygon edge not found
85 | ```
86 |
87 | #### [7 Explore and Plot by Shapefile Attributes](07-vector-shapefile-attributes-in-r.md)
88 |
89 | - No notes yet. Please add your tips and comments!
90 |
91 | #### [8 Plot Multiple Vector Layers](08-vector-plot-shapefiles-custom-legend.md)
92 |
93 | - No notes yet. Please add your tips and comments!
94 |
95 | #### [9 Handling Spatial Projection \& CRS in R](09-vector-when-data-dont-line-up-crs.md)
96 |
97 | - Note that, although `ggplot` automatically reprojects vector data when plotting multiple shapefiles with
98 | different projections together, it is still important to be aware of the CRSs of your data and to keep track
99 | of how they are being transformed.
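A short illustration of inspecting and explicitly transforming a CRS with `sf` can reinforce this point. The `nc` shapefile below ships with `sf` and is used purely as a stand-in for the lesson's data; the target EPSG code is an arbitrary example.

```r
library(sf)

# Example data bundled with sf, standing in for the lesson's shapefiles:
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

st_crs(nc)$epsg                          # inspect the current CRS
nc_utm <- st_transform(nc, crs = 32617)  # reproject explicitly (UTM zone 17N)
st_crs(nc_utm)$epsg                      # confirm the new CRS
```

Emphasize that `st_transform()` makes the reprojection explicit and trackable, whereas `ggplot`'s on-the-fly reprojection happens silently.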
100 |
101 | #### [10 Convert from .csv to a Vector Layer](10-vector-csv-to-shapefile-in-r.md)
102 |
103 | - No notes yet. Please add your tips and comments!
104 |
105 | #### [11 Manipulate Raster Data](11-vector-raster-integration.md)
106 |
107 | - Learners have not yet been exposed to the `melt()` function in this workshop. They will need to have
108 | the syntax explained.
109 | - This is the first instance of a faceted plot in this workshop.
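One way to preview the `melt()` syntax on a toy data frame before applying it to the raster values (the column names here are invented for illustration, not taken from the lesson data):

```r
library(reshape2)

# A small wide-format data frame standing in for extracted raster values:
df <- data.frame(site = c("A", "B"),
                 height_2011 = c(10, 20),
                 height_2012 = c(12, 24))

# melt() keeps the id column(s) and stacks the rest into variable/value pairs:
melt(df, id.vars = "site", variable.name = "year", value.name = "height")
```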
110 |
111 | #### [12 Raster Time Series Data](12-time-series-raster.md)
112 |
113 | - No notes yet. Please add your tips and comments!
114 |
115 | #### [13 Create Publication-quality Graphics](13-plot-time-series-rasters-in-r.md)
116 |
117 | - Be sure to show learners the before and after plots to motivate the complexity of the
118 | `ggplot` calls that will be used in this episode.
119 |
120 | #### [14 Derive Values from Raster Time Series](14-extract-ndvi-from-rasters-in-r.md)
121 |
122 | - This is the first time in the workshop that learners will have worked with date data.
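A minimal base-R sketch of date handling can be shown on the board or in the console before the episode's real data (the date strings here are arbitrary examples):

```r
# Parse date strings into Date objects; the format string matches ISO-style dates:
dates <- as.Date(c("2011-07-01", "2011-07-17"), format = "%Y-%m-%d")
class(dates)         # "Date"
dates[2] - dates[1]  # a difftime of 16 days
```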
123 |
124 | #### Concluding remarks
125 |
126 | - No notes yet. Please add your tips and comments!
127 |
128 | ## Technical tips and tricks
129 |
130 | - Leave about 30 minutes at the start of each workshop, and another 15 minutes
131 | at the start of each session, for technical difficulties like WiFi and
132 | installing things (even if you asked students to install in advance; budget longer
133 | if not).
134 |
135 | - Don't worry about being correct or knowing the material back-to-front. Use
136 | mistakes as teaching moments: the most vital skill you can impart is how to
137 | debug and recover from unexpected errors.
138 |
139 | ## Scheduling tips
140 |
141 | - You will almost certainly not have enough time to teach this entire curriculum. If pressed for time,
142 | here is one possible shortened schedule you can use (used in a 4 half-day curriculum in May 2022):
143 | - Day 1: Workshop intro, installation, troubleshooting. Episodes 1-5 of Introduction to R for Geospatial Data.
Skip everything in Episode 3 after "Vectors and Type Coercion", but keep Challenge 4. Skip everything in
145 | Episode 4 starting at "Adding columns and rows in data frames". Only include the "Data frames" section of Episode 5.
146 | You can introduce factors on-the-fly in the rest of the curriculum.
147 | - Day 2: Episodes 6-8 of Introduction to R for Geospatial Data, Episodes 6-8 of R for Raster and Vector Data (as far
148 | as you get in Episode 8).
149 | - Day 3: Episodes 8-10 of R for Raster and Vector Data, Episodes 1-2 of R for Raster and Vector Data.
150 | - Day 4: Episodes 3, 11 of Raster and Vector Data (and whatever else you'd like to cover), workshop conclusion.
151 | - It is a good idea to start your teaching with **vector data** (which is more immediately relevant to a greater number of
152 | researchers, particularly those outside of environmental sciences), then move to raster data if there is extra time.
153 | - Skip Introduction to Geospatial Concepts. Spend at most 30 minutes reviewing things as this is currently not
154 | an interactive curriculum. Most of the concepts you can cover within the R for Raster and Vector Data curriculum.
155 | - Covering Episode 10 immediately after Episode 3 can be helpful to solidify the concepts of projections.
156 |
157 | ## Common problems
158 |
159 | - Pre-installation for this curriculum is particularly important because geospatial data and software are large and can take
160 | a very long time to download and install during a workshop. Make sure everything is installed and downloaded ahead of time.
161 | - TBA - Instructors please add other situations you encounter here.
162 |
163 |
164 |
165 |
166 |
--------------------------------------------------------------------------------
/learners/discuss.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: Discussion
3 | ---
4 |
5 | FIXME
6 |
7 |
8 |
9 |
10 |
--------------------------------------------------------------------------------
/learners/reference.md:
--------------------------------------------------------------------------------
1 | ---
2 | {}
3 | ---
4 |
5 | ## References
6 |
7 | - [CRAN Spatial Task View](https://cran.r-project.org/web/views/Spatial.html)
8 |
9 | - [Geocomputation with R](http://robinlovelace.net/geocompr/)
10 |
11 | - [sf package vignettes](https://r-spatial.github.io/sf/articles/)
12 |
13 | - [Wikipedia shapefile page](https://en.wikipedia.org/wiki/Shapefile)
14 |
15 | - [`R` color palettes documentation](https://stat.ethz.ch/R-manual/R-devel/library/grDevices/html/palettes.html)
16 |
17 |
18 |
--------------------------------------------------------------------------------
/learners/setup.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: Setup
3 | ---
4 |
5 | This lesson is designed to be taught in conjunction with other lessons
6 | in the [Data Carpentry Geospatial workshop](http://www.datacarpentry.org/geospatial-workshop/).
7 | For information about required software, and to access the datasets used
8 | in this lesson, see the
9 | [setup instructions](https://datacarpentry.org/geospatial-workshop/#setup)
10 | on the workshop homepage.
11 |
12 |
13 |
--------------------------------------------------------------------------------
/profiles/learner-profiles.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: FIXME
3 | ---
4 |
5 | This is a placeholder file. Please add content here.
6 |
--------------------------------------------------------------------------------
/r-raster-vector-geospatial.Rproj:
--------------------------------------------------------------------------------
1 | Version: 1.0
2 | ProjectId: 32618674-7875-479d-aff6-55bf7903a906
3 |
4 | RestoreWorkspace: Default
5 | SaveWorkspace: Default
6 | AlwaysSaveHistory: Default
7 |
8 | EnableCodeIndexing: Yes
9 | UseSpacesForTab: Yes
10 | NumSpacesForTab: 2
11 | Encoding: UTF-8
12 |
13 | RnwWeave: Sweave
14 | LaTeX: pdfLaTeX
15 |
16 | BuildType: Website
17 |
--------------------------------------------------------------------------------
/renv/profile:
--------------------------------------------------------------------------------
1 | lesson-requirements
2 |
--------------------------------------------------------------------------------
/renv/profiles/lesson-requirements/renv/.gitignore:
--------------------------------------------------------------------------------
1 | library/
2 | local/
3 | cellar/
4 | lock/
5 | python/
6 | sandbox/
7 | staging/
8 |
--------------------------------------------------------------------------------
/site/README.md:
--------------------------------------------------------------------------------
1 | This directory contains rendered lesson materials. Please do not edit files
2 | here.
3 |
--------------------------------------------------------------------------------