├── .github
│   └── workflows
│       ├── README.md
│       ├── pr-close-signal.yaml
│       ├── pr-comment.yaml
│       ├── pr-post-remove-branch.yaml
│       ├── pr-preflight.yaml
│       ├── pr-receive.yaml
│       ├── sandpaper-main.yaml
│       ├── sandpaper-version.txt
│       ├── update-cache.yaml
│       └── update-workflows.yaml
├── .gitignore
├── .zenodo.json
├── AUTHORS
├── CITATION
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE.md
├── README.md
├── config.yaml
├── fig
│   ├── logging-onto-cloud-new-key-pair_1.png
│   ├── logging-onto-cloud-new-key-pair_2.png
│   ├── logging-onto-cloud-security-group_1.png
│   ├── logging-onto-cloud-security-group_2.png
│   ├── logging-onto-cloud-security-group_3.png
│   ├── logging-onto-cloud-summary.png
│   ├── logging-onto-cloud_1.png
│   ├── logging-onto-cloud_1b.png
│   ├── logging-onto-cloud_2.png
│   ├── logging-onto-cloud_3.png
│   ├── logging-onto-cloud_3b.png
│   ├── logging-onto-cloud_5.png
│   ├── logging-onto-cloud_6.png
│   └── logging-onto-cloud_7.png
├── index.md
├── instructors
│   ├── AMI-setup.md
│   ├── faq.md
│   ├── instructor-notes.md
│   └── teaching_demos.md
├── learners
│   ├── reference.md
│   └── setup.md
├── profiles
│   └── learner-profiles.md
└── site
    └── README.md
--------------------------------------------------------------------------------
/.github/workflows/README.md:
--------------------------------------------------------------------------------
1 | # Carpentries Workflows 2 | 3 | This directory contains workflows to be used for Lessons using the {sandpaper} 4 | lesson infrastructure. Two of these workflows require R (`sandpaper-main.yaml` 5 | and `pr-receive.yaml`) and the rest are bots to handle pull request management. 6 | 7 | These workflows will likely change as {sandpaper} evolves, so it is important to 8 | keep them up-to-date.
To do this in your lesson, you can run the following in your 9 | R console: 10 | 11 | ```r 12 | # Install/Update sandpaper 13 | options(repos = c(carpentries = "https://carpentries.r-universe.dev/", 14 | CRAN = "https://cloud.r-project.org")) 15 | install.packages("sandpaper") 16 | 17 | # update the workflows in your lesson 18 | library("sandpaper") 19 | update_github_workflows() 20 | ``` 21 | 22 | Inside this folder, you will find a file called `sandpaper-version.txt`, which 23 | will contain a version number for sandpaper. This will be used in the future to 24 | alert you if a workflow update is needed. 25 | 26 | What follows are the descriptions of the workflow files: 27 | 28 | ## Deployment 29 | 30 | ### 01 Build and Deploy (sandpaper-main.yaml) 31 | 32 | This is the main driver that will only act on the main branch of the repository. 33 | This workflow does the following: 34 | 35 | 1. checks out the lesson 36 | 2. provisions the following resources 37 | - R 38 | - pandoc 39 | - lesson infrastructure (stored in a cache) 40 | - lesson dependencies if needed (stored in a cache) 41 | 3. builds the lesson via `sandpaper:::ci_deploy()` 42 | 43 | #### Caching 44 | 45 | This workflow has two caches; one cache is for the lesson infrastructure and 46 | the other is for the lesson dependencies if the lesson contains rendered 47 | content. These caches are invalidated by new versions of the infrastructure and 48 | the `renv.lock` file, respectively. If there is a problem with the cache, 49 | manual invalidation is necessary. You will need maintain access to the repository; 50 | you can either go to the Actions tab and [click on the caches button to find 51 | and invalidate the failing cache](https://github.blog/changelog/2022-10-20-manage-caches-in-your-actions-workflows-from-web-interface/) 52 | or set the `CACHE_VERSION` secret to the current date (which will 53 | invalidate all of the caches).
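The `CACHE_VERSION` bump can also be scripted. A minimal sketch, assuming the GitHub CLI (`gh`) is installed and authenticated for the repository (the date format and the fallback message are illustrative choices, not part of the workflows themselves):

```bash
# Invalidate all of the lesson's caches by setting the CACHE_VERSION
# repository secret to today's date, so every cache key changes on the
# next workflow run.
CACHE_VERSION="$(date +%Y-%m-%d)"
if command -v gh >/dev/null 2>&1; then
  # Requires an authenticated gh session with access to this repository.
  gh secret set CACHE_VERSION --body "$CACHE_VERSION"
else
  echo "gh not installed; set CACHE_VERSION=${CACHE_VERSION} manually under Settings > Secrets and variables > Actions"
fi
```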
54 | 55 | ## Updates 56 | 57 | ### Setup Information 58 | 59 | These workflows run on a schedule and at the maintainer's request. Because they 60 | create pull requests that update workflows/require the downstream actions to run, 61 | they need a special repository/organization secret token called 62 | `SANDPAPER_WORKFLOW` and it must have the `public_repo` and `workflow` scopes. 63 | 64 | This can be an individual user token, OR it can be a token from a trusted bot account. If you 65 | have a repository in one of the official Carpentries accounts, then you do not 66 | need to worry about this token being present because the Carpentries Core Team 67 | will take care of supplying this token. 68 | 69 | If you want to use your personal account: you can go to 70 | 71 | to create a token. Once you have created your token, you should copy it to your 72 | clipboard and then go to your repository's settings > secrets > actions and 73 | create or edit the `SANDPAPER_WORKFLOW` secret, pasting in the generated token. 74 | 75 | If you do not specify your token correctly, the runs will not fail; instead, they will 76 | give you instructions to provide the token for your repository. 77 | 78 | ### 02 Maintain: Update Workflow Files (update-workflows.yaml) 79 | 80 | The {sandpaper} repository was designed to do as much as possible to separate 81 | the tools from the content. For local builds, this is absolutely true, but 82 | there is a minor issue when it comes to workflow files: they must live inside 83 | the repository. 84 | 85 | This workflow ensures that the workflow files are up-to-date. The way it works is 86 | to download the update-workflows.sh script from GitHub and run it. The script 87 | will do the following: 88 | 89 | 1. check the recorded version of sandpaper against the current version on GitHub 90 | 2.
update the files if there is a difference in versions 91 | 92 | After the files are updated, if there are any changes, they are pushed to a 93 | branch called `update/workflows` and a pull request is created. Maintainers are 94 | encouraged to review the changes and accept the pull request if the outputs 95 | are okay. 96 | 97 | This update is run weekly or on demand. 98 | 99 | ### 03 Maintain: Update Package Cache (update-cache.yaml) 100 | 101 | For lessons that have generated content, we use {renv} to ensure that the output 102 | is stable. This is controlled by a single lockfile which documents the packages 103 | needed for the lesson and their version numbers. This workflow is skipped in 104 | lessons that do not have generated content. 105 | 106 | Because the lessons need to remain current with the package ecosystem, it's a 107 | good idea to make sure these packages can be updated periodically. The 108 | update cache workflow will do this by checking for updates, applying them in a 109 | branch called `update/packages` and creating a pull request with _only the 110 | lockfile changed_. 111 | 112 | From here, the markdown documents will be rebuilt and you can inspect what has 113 | changed based on how the packages have updated. 114 | 115 | ## Pull Request and Review Management 116 | 117 | Because our lessons execute code, pull requests are a security risk for any 118 | lesson and thus have security measures associated with them.
**Do not merge any 119 | pull request that does not pass checks or that does not have bot comments on it.** 120 | 121 | These workflows all go together and are described in the following 122 | diagram and the sections below: 123 | 124 | ![Graph representation of a pull request](https://carpentries.github.io/sandpaper/articles/img/pr-flow.dot.svg) 125 | 126 | ### Pre Flight Pull Request Validation (pr-preflight.yaml) 127 | 128 | This workflow runs every time a pull request is created and its purpose is to 129 | validate that the pull request is okay to run. This means checking the following things: 130 | 131 | 1. The pull request does not contain modified workflow files 132 | 2. If the pull request contains modified workflow files, it does not contain 133 | modified content files (such as a situation where @carpentries-bot will 134 | make an automated pull request) 135 | 3. The pull request does not contain an invalid commit hash (e.g. from a fork 136 | that was made before a lesson was transitioned from styles to use the 137 | workbench). 138 | 139 | Once the checks are finished, a comment is issued to the pull request, which 140 | will allow maintainers to determine if it is safe to run the 141 | "Receive Pull Request" workflow from new contributors. 142 | 143 | ### Receive Pull Request (pr-receive.yaml) 144 | 145 | **Note of caution:** This workflow runs arbitrary code submitted by anyone who creates a 146 | pull request. GitHub has safeguarded the token used in this workflow to have no 147 | privileges in the repository, and we have taken additional precautions to protect against 148 | spoofing. 149 | 150 | This workflow is triggered with every push to a pull request. If this workflow 151 | is already running and a new push is sent to the pull request, the workflow 152 | running from the previous push will be cancelled and a new workflow run will be 153 | started. 154 | 155 | The first step of this workflow is to check if it is valid (e.g.
that no 156 | workflow files have been modified). If there are workflow files that have been 157 | modified, a comment is made indicating that the workflow will not be run. If 158 | both a workflow file and lesson content are modified, an error will occur. 159 | 160 | The second step (if valid) is to build the generated content from the pull 161 | request. This builds the content and uploads three artifacts: 162 | 163 | 1. The pull request number (pr) 164 | 2. A summary of changes after the rendering process (diff) 165 | 3. The rendered files (build) 166 | 167 | Because this workflow builds generated content, it follows the same general 168 | process as the `sandpaper-main` workflow with the same caching mechanisms. 169 | 170 | The artifacts produced are used by the next workflow. 171 | 172 | ### Comment on Pull Request (pr-comment.yaml) 173 | 174 | This workflow is triggered if the `pr-receive.yaml` workflow is successful. 175 | The steps in this workflow are: 176 | 177 | 1. Test if the workflow is valid and comment the validity of the workflow on the 178 | pull request. 179 | 2. If it is valid: create an orphan branch with two commits: the current state 180 | of the repository and the proposed changes. 181 | 3. If it is valid: update the pull request comment with the summary of changes. 182 | 183 | Importantly: if the pull request is invalid, the branch is not created, so any 184 | malicious code is not published. 185 | 186 | From here, the maintainer can request changes from the author and eventually 187 | either merge or reject the PR. When this happens, if the PR was valid, the 188 | preview branch needs to be deleted. 189 | 190 | ### Send Close PR Signal (pr-close-signal.yaml) 191 | 192 | Triggered any time a pull request is closed. This emits an artifact containing the 193 | pull request number for the next action. 194 | 195 | ### Remove Pull Request Branch (pr-post-remove-branch.yaml) 196 | 197 | Triggered by `pr-close-signal.yaml`.
This removes the temporary branch associated with 198 | the pull request (if it was created). 199 | -------------------------------------------------------------------------------- /.github/workflows/pr-close-signal.yaml: -------------------------------------------------------------------------------- 1 | name: "Bot: Send Close Pull Request Signal" 2 | 3 | on: 4 | pull_request: 5 | types: 6 | [closed] 7 | 8 | jobs: 9 | send-close-signal: 10 | name: "Send closing signal" 11 | runs-on: ubuntu-22.04 12 | if: ${{ github.event.action == 'closed' }} 13 | steps: 14 | - name: "Create PRtifact" 15 | run: | 16 | mkdir -p ./pr 17 | printf ${{ github.event.number }} > ./pr/NUM 18 | - name: Upload Diff 19 | uses: actions/upload-artifact@v4 20 | with: 21 | name: pr 22 | path: ./pr 23 | -------------------------------------------------------------------------------- /.github/workflows/pr-comment.yaml: -------------------------------------------------------------------------------- 1 | name: "Bot: Comment on the Pull Request" 2 | 3 | # read-write repo token 4 | # access to secrets 5 | on: 6 | workflow_run: 7 | workflows: ["Receive Pull Request"] 8 | types: 9 | - completed 10 | 11 | concurrency: 12 | group: pr-${{ github.event.workflow_run.pull_requests[0].number }} 13 | cancel-in-progress: true 14 | 15 | 16 | jobs: 17 | # Pull requests are valid if: 18 | # - they match the sha of the workflow run head commit 19 | # - they are open 20 | # - no .github files were committed 21 | test-pr: 22 | name: "Test if pull request is valid" 23 | runs-on: ubuntu-22.04 24 | if: > 25 | github.event.workflow_run.event == 'pull_request' && 26 | github.event.workflow_run.conclusion == 'success' 27 | outputs: 28 | is_valid: ${{ steps.check-pr.outputs.VALID }} 29 | payload: ${{ steps.check-pr.outputs.payload }} 30 | number: ${{ steps.get-pr.outputs.NUM }} 31 | msg: ${{ steps.check-pr.outputs.MSG }} 32 | steps: 33 | - name: 'Download PR artifact' 34 | id: dl 35 | uses: 
carpentries/actions/download-workflow-artifact@main 36 | with: 37 | run: ${{ github.event.workflow_run.id }} 38 | name: 'pr' 39 | 40 | - name: "Get PR Number" 41 | if: ${{ steps.dl.outputs.success == 'true' }} 42 | id: get-pr 43 | run: | 44 | unzip pr.zip 45 | echo "NUM=$(<./NR)" >> $GITHUB_OUTPUT 46 | 47 | - name: "Fail if PR number was not present" 48 | id: bad-pr 49 | if: ${{ steps.dl.outputs.success != 'true' }} 50 | run: | 51 | echo '::error::A pull request number was not recorded. The pull request that triggered this workflow is likely malicious.' 52 | exit 1 53 | - name: "Get Invalid Hashes File" 54 | id: hash 55 | run: | 56 | echo "json<> $GITHUB_OUTPUT 59 | - name: "Check PR" 60 | id: check-pr 61 | if: ${{ steps.dl.outputs.success == 'true' }} 62 | uses: carpentries/actions/check-valid-pr@main 63 | with: 64 | pr: ${{ steps.get-pr.outputs.NUM }} 65 | sha: ${{ github.event.workflow_run.head_sha }} 66 | headroom: 3 # if it's within the last three commits, we can keep going, because it's likely rapid-fire 67 | invalid: ${{ fromJSON(steps.hash.outputs.json)[github.repository] }} 68 | fail_on_error: true 69 | 70 | # Create an orphan branch on this repository with two commits 71 | # - the current HEAD of the md-outputs branch 72 | # - the output from running the current HEAD of the pull request through 73 | # the md generator 74 | create-branch: 75 | name: "Create Git Branch" 76 | needs: test-pr 77 | runs-on: ubuntu-22.04 78 | if: ${{ needs.test-pr.outputs.is_valid == 'true' }} 79 | env: 80 | NR: ${{ needs.test-pr.outputs.number }} 81 | permissions: 82 | contents: write 83 | steps: 84 | - name: 'Checkout md outputs' 85 | uses: actions/checkout@v4 86 | with: 87 | ref: md-outputs 88 | path: built 89 | fetch-depth: 1 90 | 91 | - name: 'Download built markdown' 92 | id: dl 93 | uses: carpentries/actions/download-workflow-artifact@main 94 | with: 95 | run: ${{ github.event.workflow_run.id }} 96 | name: 'built' 97 | 98 | - if: ${{ steps.dl.outputs.success == 'true' }} 
99 | run: unzip built.zip 100 | 101 | - name: "Create orphan and push" 102 | if: ${{ steps.dl.outputs.success == 'true' }} 103 | run: | 104 | cd built/ 105 | git config --local user.email "actions@github.com" 106 | git config --local user.name "GitHub Actions" 107 | CURR_HEAD=$(git rev-parse HEAD) 108 | git checkout --orphan md-outputs-PR-${NR} 109 | git add -A 110 | git commit -m "source commit: ${CURR_HEAD}" 111 | ls -A | grep -v '^.git$' | xargs -I _ rm -r '_' 112 | cd .. 113 | unzip -o -d built built.zip 114 | cd built 115 | git add -A 116 | git commit --allow-empty -m "differences for PR #${NR}" 117 | git push -u --force --set-upstream origin md-outputs-PR-${NR} 118 | 119 | # Comment on the Pull Request with a link to the branch and the diff 120 | comment-pr: 121 | name: "Comment on Pull Request" 122 | needs: [test-pr, create-branch] 123 | runs-on: ubuntu-22.04 124 | if: ${{ needs.test-pr.outputs.is_valid == 'true' }} 125 | env: 126 | NR: ${{ needs.test-pr.outputs.number }} 127 | permissions: 128 | pull-requests: write 129 | steps: 130 | - name: 'Download comment artifact' 131 | id: dl 132 | uses: carpentries/actions/download-workflow-artifact@main 133 | with: 134 | run: ${{ github.event.workflow_run.id }} 135 | name: 'diff' 136 | 137 | - if: ${{ steps.dl.outputs.success == 'true' }} 138 | run: unzip ${{ github.workspace }}/diff.zip 139 | 140 | - name: "Comment on PR" 141 | id: comment-diff 142 | if: ${{ steps.dl.outputs.success == 'true' }} 143 | uses: carpentries/actions/comment-diff@main 144 | with: 145 | pr: ${{ env.NR }} 146 | path: ${{ github.workspace }}/diff.md 147 | 148 | # Comment if the PR is open and matches the SHA, but the workflow files have 149 | # changed 150 | comment-changed-workflow: 151 | name: "Comment if workflow files have changed" 152 | needs: test-pr 153 | runs-on: ubuntu-22.04 154 | if: ${{ always() && needs.test-pr.outputs.is_valid == 'false' }} 155 | env: 156 | NR: ${{ github.event.workflow_run.pull_requests[0].number }} 157 | 
body: ${{ needs.test-pr.outputs.msg }} 158 | permissions: 159 | pull-requests: write 160 | steps: 161 | - name: 'Check for spoofing' 162 | id: dl 163 | uses: carpentries/actions/download-workflow-artifact@main 164 | with: 165 | run: ${{ github.event.workflow_run.id }} 166 | name: 'built' 167 | 168 | - name: 'Alert if spoofed' 169 | id: spoof 170 | if: ${{ steps.dl.outputs.success == 'true' }} 171 | run: | 172 | echo 'body<> $GITHUB_ENV 173 | echo '' >> $GITHUB_ENV 174 | echo '## :x: DANGER :x:' >> $GITHUB_ENV 175 | echo 'This pull request has modified workflows that created output. Close this now.' >> $GITHUB_ENV 176 | echo '' >> $GITHUB_ENV 177 | echo 'EOF' >> $GITHUB_ENV 178 | 179 | - name: "Comment on PR" 180 | id: comment-diff 181 | uses: carpentries/actions/comment-diff@main 182 | with: 183 | pr: ${{ env.NR }} 184 | body: ${{ env.body }} 185 | -------------------------------------------------------------------------------- /.github/workflows/pr-post-remove-branch.yaml: -------------------------------------------------------------------------------- 1 | name: "Bot: Remove Temporary PR Branch" 2 | 3 | on: 4 | workflow_run: 5 | workflows: ["Bot: Send Close Pull Request Signal"] 6 | types: 7 | - completed 8 | 9 | jobs: 10 | delete: 11 | name: "Delete branch from Pull Request" 12 | runs-on: ubuntu-22.04 13 | if: > 14 | github.event.workflow_run.event == 'pull_request' && 15 | github.event.workflow_run.conclusion == 'success' 16 | permissions: 17 | contents: write 18 | steps: 19 | - name: 'Download artifact' 20 | uses: carpentries/actions/download-workflow-artifact@main 21 | with: 22 | run: ${{ github.event.workflow_run.id }} 23 | name: pr 24 | - name: "Get PR Number" 25 | id: get-pr 26 | run: | 27 | unzip pr.zip 28 | echo "NUM=$(<./NUM)" >> $GITHUB_OUTPUT 29 | - name: 'Remove branch' 30 | uses: carpentries/actions/remove-branch@main 31 | with: 32 | pr: ${{ steps.get-pr.outputs.NUM }} 33 | 
-------------------------------------------------------------------------------- /.github/workflows/pr-preflight.yaml: -------------------------------------------------------------------------------- 1 | name: "Pull Request Preflight Check" 2 | 3 | on: 4 | pull_request_target: 5 | branches: 6 | ["main"] 7 | types: 8 | ["opened", "synchronize", "reopened"] 9 | 10 | jobs: 11 | test-pr: 12 | name: "Test if pull request is valid" 13 | if: ${{ github.event.action != 'closed' }} 14 | runs-on: ubuntu-22.04 15 | outputs: 16 | is_valid: ${{ steps.check-pr.outputs.VALID }} 17 | permissions: 18 | pull-requests: write 19 | steps: 20 | - name: "Get Invalid Hashes File" 21 | id: hash 22 | run: | 23 | echo "json<> $GITHUB_OUTPUT 26 | - name: "Check PR" 27 | id: check-pr 28 | uses: carpentries/actions/check-valid-pr@main 29 | with: 30 | pr: ${{ github.event.number }} 31 | invalid: ${{ fromJSON(steps.hash.outputs.json)[github.repository] }} 32 | fail_on_error: true 33 | - name: "Comment result of validation" 34 | id: comment-diff 35 | if: ${{ always() }} 36 | uses: carpentries/actions/comment-diff@main 37 | with: 38 | pr: ${{ github.event.number }} 39 | body: ${{ steps.check-pr.outputs.MSG }} 40 | -------------------------------------------------------------------------------- /.github/workflows/pr-receive.yaml: -------------------------------------------------------------------------------- 1 | name: "Receive Pull Request" 2 | 3 | on: 4 | pull_request: 5 | types: 6 | [opened, synchronize, reopened] 7 | 8 | concurrency: 9 | group: ${{ github.ref }} 10 | cancel-in-progress: true 11 | 12 | jobs: 13 | test-pr: 14 | name: "Record PR number" 15 | if: ${{ github.event.action != 'closed' }} 16 | runs-on: ubuntu-22.04 17 | outputs: 18 | is_valid: ${{ steps.check-pr.outputs.VALID }} 19 | steps: 20 | - name: "Record PR number" 21 | id: record 22 | if: ${{ always() }} 23 | run: | 24 | echo ${{ github.event.number }} > ${{ github.workspace }}/NR # 2022-03-02: artifact name fixed to be NR 25 | 
- name: "Upload PR number" 26 | id: upload 27 | if: ${{ always() }} 28 | uses: actions/upload-artifact@v4 29 | with: 30 | name: pr 31 | path: ${{ github.workspace }}/NR 32 | - name: "Get Invalid Hashes File" 33 | id: hash 34 | run: | 35 | echo "json<> $GITHUB_OUTPUT 38 | - name: "echo output" 39 | run: | 40 | echo "${{ steps.hash.outputs.json }}" 41 | - name: "Check PR" 42 | id: check-pr 43 | uses: carpentries/actions/check-valid-pr@main 44 | with: 45 | pr: ${{ github.event.number }} 46 | invalid: ${{ fromJSON(steps.hash.outputs.json)[github.repository] }} 47 | 48 | build-md-source: 49 | name: "Build markdown source files if valid" 50 | needs: test-pr 51 | runs-on: ubuntu-22.04 52 | if: ${{ needs.test-pr.outputs.is_valid == 'true' }} 53 | env: 54 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 55 | RENV_PATHS_ROOT: ~/.local/share/renv/ 56 | CHIVE: ${{ github.workspace }}/site/chive 57 | PR: ${{ github.workspace }}/site/pr 58 | MD: ${{ github.workspace }}/site/built 59 | steps: 60 | - name: "Check Out Main Branch" 61 | uses: actions/checkout@v4 62 | 63 | - name: "Check Out Staging Branch" 64 | uses: actions/checkout@v4 65 | with: 66 | ref: md-outputs 67 | path: ${{ env.MD }} 68 | 69 | - name: "Set up R" 70 | uses: r-lib/actions/setup-r@v2 71 | with: 72 | use-public-rspm: true 73 | install-r: false 74 | 75 | - name: "Set up Pandoc" 76 | uses: r-lib/actions/setup-pandoc@v2 77 | 78 | - name: "Setup Lesson Engine" 79 | uses: carpentries/actions/setup-sandpaper@main 80 | with: 81 | cache-version: ${{ secrets.CACHE_VERSION }} 82 | 83 | - name: "Setup Package Cache" 84 | uses: carpentries/actions/setup-lesson-deps@main 85 | with: 86 | cache-version: ${{ secrets.CACHE_VERSION }} 87 | 88 | - name: "Validate and Build Markdown" 89 | id: build-site 90 | run: | 91 | sandpaper::package_cache_trigger(TRUE) 92 | sandpaper::validate_lesson(path = '${{ github.workspace }}') 93 | sandpaper:::build_markdown(path = '${{ github.workspace }}', quiet = FALSE) 94 | shell: Rscript {0} 95 | 96 | - 
name: "Generate Artifacts" 97 | id: generate-artifacts 98 | run: | 99 | sandpaper:::ci_bundle_pr_artifacts( 100 | repo = '${{ github.repository }}', 101 | pr_number = '${{ github.event.number }}', 102 | path_md = '${{ env.MD }}', 103 | path_pr = '${{ env.PR }}', 104 | path_archive = '${{ env.CHIVE }}', 105 | branch = 'md-outputs' 106 | ) 107 | shell: Rscript {0} 108 | 109 | - name: "Upload PR" 110 | uses: actions/upload-artifact@v4 111 | with: 112 | name: pr 113 | path: ${{ env.PR }} 114 | overwrite: true 115 | 116 | - name: "Upload Diff" 117 | uses: actions/upload-artifact@v4 118 | with: 119 | name: diff 120 | path: ${{ env.CHIVE }} 121 | retention-days: 1 122 | 123 | - name: "Upload Build" 124 | uses: actions/upload-artifact@v4 125 | with: 126 | name: built 127 | path: ${{ env.MD }} 128 | retention-days: 1 129 | 130 | - name: "Teardown" 131 | run: sandpaper::reset_site() 132 | shell: Rscript {0} 133 | -------------------------------------------------------------------------------- /.github/workflows/sandpaper-main.yaml: -------------------------------------------------------------------------------- 1 | name: "01 Build and Deploy Site" 2 | 3 | on: 4 | push: 5 | branches: 6 | - main 7 | - master 8 | schedule: 9 | - cron: '0 0 * * 2' 10 | workflow_dispatch: 11 | inputs: 12 | name: 13 | description: 'Who triggered this build?' 
14 | required: true 15 | default: 'Maintainer (via GitHub)' 16 | reset: 17 | description: 'Reset cached markdown files' 18 | required: false 19 | default: false 20 | type: boolean 21 | jobs: 22 | full-build: 23 | name: "Build Full Site" 24 | 25 | # 2024-10-01: ubuntu-latest is now 24.04 and R is not installed by default in the runner image 26 | # pin to 22.04 for now 27 | runs-on: ubuntu-22.04 28 | permissions: 29 | checks: write 30 | contents: write 31 | pages: write 32 | env: 33 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 34 | RENV_PATHS_ROOT: ~/.local/share/renv/ 35 | steps: 36 | 37 | - name: "Checkout Lesson" 38 | uses: actions/checkout@v4 39 | 40 | - name: "Set up R" 41 | uses: r-lib/actions/setup-r@v2 42 | with: 43 | use-public-rspm: true 44 | install-r: false 45 | 46 | - name: "Set up Pandoc" 47 | uses: r-lib/actions/setup-pandoc@v2 48 | 49 | - name: "Setup Lesson Engine" 50 | uses: carpentries/actions/setup-sandpaper@main 51 | with: 52 | cache-version: ${{ secrets.CACHE_VERSION }} 53 | 54 | - name: "Setup Package Cache" 55 | uses: carpentries/actions/setup-lesson-deps@main 56 | with: 57 | cache-version: ${{ secrets.CACHE_VERSION }} 58 | 59 | - name: "Deploy Site" 60 | run: | 61 | reset <- "${{ github.event.inputs.reset }}" == "true" 62 | sandpaper::package_cache_trigger(TRUE) 63 | sandpaper:::ci_deploy(reset = reset) 64 | shell: Rscript {0} 65 | -------------------------------------------------------------------------------- /.github/workflows/sandpaper-version.txt: -------------------------------------------------------------------------------- 1 | 0.16.9 2 | -------------------------------------------------------------------------------- /.github/workflows/update-cache.yaml: -------------------------------------------------------------------------------- 1 | name: "03 Maintain: Update Package Cache" 2 | 3 | on: 4 | workflow_dispatch: 5 | inputs: 6 | name: 7 | description: 'Who triggered this build (enter github username to tag yourself)?' 
8 | required: true 9 | default: 'monthly run' 10 | schedule: 11 | # Run every tuesday 12 | - cron: '0 0 * * 2' 13 | 14 | jobs: 15 | preflight: 16 | name: "Preflight Check" 17 | runs-on: ubuntu-22.04 18 | outputs: 19 | ok: ${{ steps.check.outputs.ok }} 20 | steps: 21 | - id: check 22 | run: | 23 | if [[ ${{ github.event_name }} == 'workflow_dispatch' ]]; then 24 | echo "ok=true" >> $GITHUB_OUTPUT 25 | echo "Running on request" 26 | # using single brackets here to avoid 08 being interpreted as octal 27 | # https://github.com/carpentries/sandpaper/issues/250 28 | elif [ `date +%d` -le 7 ]; then 29 | # If the Tuesday lands in the first week of the month, run it 30 | echo "ok=true" >> $GITHUB_OUTPUT 31 | echo "Running on schedule" 32 | else 33 | echo "ok=false" >> $GITHUB_OUTPUT 34 | echo "Not Running Today" 35 | fi 36 | 37 | check_renv: 38 | name: "Check if We Need {renv}" 39 | runs-on: ubuntu-22.04 40 | needs: preflight 41 | if: ${{ needs.preflight.outputs.ok == 'true'}} 42 | outputs: 43 | needed: ${{ steps.renv.outputs.exists }} 44 | steps: 45 | - name: "Checkout Lesson" 46 | uses: actions/checkout@v4 47 | - id: renv 48 | run: | 49 | if [[ -d renv ]]; then 50 | echo "exists=true" >> $GITHUB_OUTPUT 51 | fi 52 | 53 | check_token: 54 | name: "Check SANDPAPER_WORKFLOW token" 55 | runs-on: ubuntu-22.04 56 | needs: check_renv 57 | if: ${{ needs.check_renv.outputs.needed == 'true' }} 58 | outputs: 59 | workflow: ${{ steps.validate.outputs.wf }} 60 | repo: ${{ steps.validate.outputs.repo }} 61 | steps: 62 | - name: "validate token" 63 | id: validate 64 | uses: carpentries/actions/check-valid-credentials@main 65 | with: 66 | token: ${{ secrets.SANDPAPER_WORKFLOW }} 67 | 68 | update_cache: 69 | name: "Update Package Cache" 70 | needs: check_token 71 | if: ${{ needs.check_token.outputs.repo== 'true' }} 72 | runs-on: ubuntu-22.04 73 | env: 74 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 75 | RENV_PATHS_ROOT: ~/.local/share/renv/ 76 | steps: 77 | 78 | - name: "Checkout Lesson" 79 | 
uses: actions/checkout@v4 80 | 81 | - name: "Set up R" 82 | uses: r-lib/actions/setup-r@v2 83 | with: 84 | use-public-rspm: true 85 | install-r: false 86 | 87 | - name: "Update {renv} deps and determine if a PR is needed" 88 | id: update 89 | uses: carpentries/actions/update-lockfile@main 90 | with: 91 | cache-version: ${{ secrets.CACHE_VERSION }} 92 | 93 | - name: Create Pull Request 94 | id: cpr 95 | if: ${{ steps.update.outputs.n > 0 }} 96 | uses: carpentries/create-pull-request@main 97 | with: 98 | token: ${{ secrets.SANDPAPER_WORKFLOW }} 99 | delete-branch: true 100 | branch: "update/packages" 101 | commit-message: "[actions] update ${{ steps.update.outputs.n }} packages" 102 | title: "Update ${{ steps.update.outputs.n }} packages" 103 | body: | 104 | :robot: This is an automated build 105 | 106 | This will update ${{ steps.update.outputs.n }} packages in your lesson with the following versions: 107 | 108 | ``` 109 | ${{ steps.update.outputs.report }} 110 | ``` 111 | 112 | :stopwatch: In a few minutes, a comment will appear that will show you how the output has changed based on these updates. 113 | 114 | If you want to inspect these changes locally, you can use the following code to check out a new branch: 115 | 116 | ```bash 117 | git fetch origin update/packages 118 | git checkout update/packages 119 | ``` 120 | 121 | - Auto-generated by [create-pull-request][1] on ${{ steps.update.outputs.date }} 122 | 123 | [1]: https://github.com/carpentries/create-pull-request/tree/main 124 | labels: "type: package cache" 125 | draft: false 126 | -------------------------------------------------------------------------------- /.github/workflows/update-workflows.yaml: -------------------------------------------------------------------------------- 1 | name: "02 Maintain: Update Workflow Files" 2 | 3 | on: 4 | workflow_dispatch: 5 | inputs: 6 | name: 7 | description: 'Who triggered this build (enter github username to tag yourself)?' 
8 | required: true 9 | default: 'weekly run' 10 | clean: 11 | description: 'Workflow files/file extensions to clean (no wildcards, enter "" for none)' 12 | required: false 13 | default: '.yaml' 14 | schedule: 15 | # Run every Tuesday 16 | - cron: '0 0 * * 2' 17 | 18 | jobs: 19 | check_token: 20 | name: "Check SANDPAPER_WORKFLOW token" 21 | runs-on: ubuntu-22.04 22 | outputs: 23 | workflow: ${{ steps.validate.outputs.wf }} 24 | repo: ${{ steps.validate.outputs.repo }} 25 | steps: 26 | - name: "validate token" 27 | id: validate 28 | uses: carpentries/actions/check-valid-credentials@main 29 | with: 30 | token: ${{ secrets.SANDPAPER_WORKFLOW }} 31 | 32 | update_workflow: 33 | name: "Update Workflow" 34 | runs-on: ubuntu-22.04 35 | needs: check_token 36 | if: ${{ needs.check_token.outputs.workflow == 'true' }} 37 | steps: 38 | - name: "Checkout Repository" 39 | uses: actions/checkout@v4 40 | 41 | - name: Update Workflows 42 | id: update 43 | uses: carpentries/actions/update-workflows@main 44 | with: 45 | clean: ${{ github.event.inputs.clean }} 46 | 47 | - name: Create Pull Request 48 | id: cpr 49 | if: "${{ steps.update.outputs.new }}" 50 | uses: carpentries/create-pull-request@main 51 | with: 52 | token: ${{ secrets.SANDPAPER_WORKFLOW }} 53 | delete-branch: true 54 | branch: "update/workflows" 55 | commit-message: "[actions] update sandpaper workflow to version ${{ steps.update.outputs.new }}" 56 | title: "Update Workflows to Version ${{ steps.update.outputs.new }}" 57 | body: | 58 | :robot: This is an automated build 59 | 60 | Update Workflows from sandpaper version ${{ steps.update.outputs.old }} -> ${{ steps.update.outputs.new }} 61 | 62 | - Auto-generated by [create-pull-request][1] on ${{ steps.update.outputs.date }} 63 | 64 | [1]: https://github.com/carpentries/create-pull-request/tree/main 65 | labels: "type: template and tools" 66 | draft: false 67 | -------------------------------------------------------------------------------- /.gitignore: 
-------------------------------------------------------------------------------- 1 | # sandpaper files 2 | episodes/*html 3 | site/* 4 | !site/README.md 5 | 6 | # History files 7 | .Rhistory 8 | .Rapp.history 9 | # Session Data files 10 | .RData 11 | # User-specific files 12 | .Ruserdata 13 | # Example code in package build process 14 | *-Ex.R 15 | # Output files from R CMD build 16 | /*.tar.gz 17 | # Output files from R CMD check 18 | /*.Rcheck/ 19 | # RStudio files 20 | .Rproj.user/ 21 | # produced vignettes 22 | vignettes/*.html 23 | vignettes/*.pdf 24 | # OAuth2 token, see https://github.com/hadley/httr/releases/tag/v0.3 25 | .httr-oauth 26 | # knitr and R markdown default cache directories 27 | *_cache/ 28 | /cache/ 29 | # Temporary files created by R markdown 30 | *.utf8.md 31 | *.knit.md 32 | # R Environment Variables 33 | .Renviron 34 | # pkgdown site 35 | docs/ 36 | # translation temp files 37 | po/*~ 38 | # renv detritus 39 | renv/sandbox/ 40 | *.pyc 41 | *~ 42 | .DS_Store 43 | .ipynb_checkpoints 44 | .sass-cache 45 | __pycache__ 46 | _site 47 | .Rproj.user 48 | -------------------------------------------------------------------------------- /.zenodo.json: -------------------------------------------------------------------------------- 1 | { 2 | "contributors": [ 3 | { 4 | "type": "Editor", 5 | "name": "Anuj Guruacharya" 6 | }, 7 | { 8 | "type": "Editor", 9 | "name": "Travis Wrightsman" 10 | } 11 | ], 12 | "creators": [ 13 | { 14 | "name": "Erin Alison Becker", 15 | "orcid": "0000-0002-6832-0233" 16 | }, 17 | { 18 | "name": "Sarah LR Stevens", 19 | "orcid": "0000-0002-7040-548X" 20 | }, 21 | { 22 | "name": "Bianca Peterson" 23 | }, 24 | { 25 | "name": "Jake Cowper Szamosi", 26 | "orcid": "0000-0003-2106-0072" 27 | }, 28 | { 29 | "name": "Fotis E. 
Psomopoulos", 30 | "orcid": "0000-0002-0222-4273" 31 | }, 32 | { 33 | "name": "Travis Wrightsman", 34 | "orcid": "0000-0002-0904-6473" 35 | }, 36 | { 37 | "name": "Karen Word", 38 | "orcid": "0000-0002-7294-7231" 39 | }, 40 | { 41 | "name": "Murray Cadzow", 42 | "orcid": "0000-0002-2299-4136" 43 | }, 44 | { 45 | "name": "Sam Nooij", 46 | "orcid": "0000-0001-5892-5637" 47 | }, 48 | { 49 | "name": "Sangram Keshari Sahu", 50 | "orcid": "0000-0001-5010-9539" 51 | }, 52 | { 53 | "name": "Sarah M Brown", 54 | "orcid": "0000-0001-5728-0822" 55 | }, 56 | { 57 | "name": "Stephen Tahan" 58 | }, 59 | { 60 | "name": "Umar Ahmad" 61 | }, 62 | { 63 | "name": "Valerie Gartner" 64 | }, 65 | { 66 | "name": "Annajiat Alim Rasel", 67 | "orcid": "0000-0003-0198-3734" 68 | }, 69 | { 70 | "name": "Daniel Kerchner", 71 | "orcid": "0000-0002-5921-2193" 72 | }, 73 | { 74 | "name": "rosemm" 75 | } 76 | ], 77 | "license": { 78 | "id": "CC-BY-4.0" 79 | } 80 | } 81 | -------------------------------------------------------------------------------- /AUTHORS: -------------------------------------------------------------------------------- 1 | FIXME: list authors' names and email addresses. 2 | -------------------------------------------------------------------------------- /CITATION: -------------------------------------------------------------------------------- 1 | Please cite as: 2 | 3 | Erin Alison Becker, Tracy Teal, François Michonneau, Maneesha Sane, Taylor Reiter, Jason Williams, et al. (2019, June). 4 | datacarpentry/genomics-workshop: Data Carpentry: Genomics Workshop Overview, June 2019 (Version v2019.06.1). 5 | Zenodo. 
http://doi.org/10.5281/zenodo.3260309 6 | -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Contributor Code of Conduct" 3 | --- 4 | 5 | As contributors and maintainers of this project, 6 | we pledge to follow the [The Carpentries Code of Conduct][coc]. 7 | 8 | Instances of abusive, harassing, or otherwise unacceptable behavior 9 | may be reported by following our [reporting guidelines][coc-reporting]. 10 | 11 | [coc]: https://docs.carpentries.org/topic_folders/policies/code-of-conduct.html 12 | [coc-reporting]: https://docs.carpentries.org/topic_folders/policies/incident-reporting.html 13 | 14 | 15 | 16 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | ## Contributing 2 | 3 | [The Carpentries][cp-site] ([Software Carpentry][swc-site], [Data 4 | Carpentry][dc-site], and [Library Carpentry][lc-site]) are open source 5 | projects, and we welcome contributions of all kinds: new lessons, fixes to 6 | existing material, bug reports, and reviews of proposed changes are all 7 | welcome. 8 | 9 | ### Contributor Agreement 10 | 11 | By contributing, you agree that we may redistribute your work under [our 12 | license](LICENSE.md). In exchange, we will address your issues and/or assess 13 | your change proposal as promptly as we can, and help you become a member of our 14 | community. Everyone involved in [The Carpentries][cp-site] agrees to abide by 15 | our [code of conduct](CODE_OF_CONDUCT.md). 16 | 17 | ### How to Contribute 18 | 19 | The easiest way to get started is to file an issue to tell us about a spelling 20 | mistake, some awkward wording, or a factual error. This is a good way to 21 | introduce yourself and to meet some of our community members. 22 | 23 | 1. 
If you do not have a [GitHub][github] account, you can [send us comments by 24 | email][contact]. However, we will be able to respond more quickly if you use 25 | one of the other methods described below. 26 | 27 | 2. If you have a [GitHub][github] account, or are willing to [create 28 | one][github-join], but do not know how to use Git, you can report problems 29 | or suggest improvements by [creating an issue][repo-issues]. This allows us 30 | to assign the item to someone and to respond to it in a threaded discussion. 31 | 32 | 3. If you are comfortable with Git, and would like to add or change material, 33 | you can submit a pull request (PR). Instructions for doing this are 34 | [included below](#using-github). For inspiration about changes that need to 35 | be made, check out the [list of open issues][issues] across the Carpentries. 36 | 37 | Note: if you want to build the website locally, please refer to [The Workbench 38 | documentation][template-doc]. 39 | 40 | ### Where to Contribute 41 | 42 | 1. If you wish to change this lesson, add issues and pull requests here. 43 | 2. If you wish to change the template used for workshop websites, please refer 44 | to [The Workbench documentation][template-doc]. 45 | 46 | ### What to Contribute 47 | 48 | There are many ways to contribute, from writing new exercises and improving 49 | existing ones to updating or filling in the documentation and submitting [bug 50 | reports][issues] about things that do not work, are not clear, or are missing. 51 | If you are looking for ideas, please see [the list of issues for this 52 | repository][repo-issues], or the issues for [Data Carpentry][dc-issues], 53 | [Library Carpentry][lc-issues], and [Software Carpentry][swc-issues] projects. 54 | 55 | Comments on issues and reviews of pull requests are just as welcome: we are 56 | smarter together than we are on our own. 
**Reviews from novices and newcomers 57 | are particularly valuable**: it's easy for people who have been using these 58 | lessons for a while to forget how impenetrable some of this material can be, so 59 | fresh eyes are always welcome. 60 | 61 | ### What *Not* to Contribute 62 | 63 | Our lessons already contain more material than we can cover in a typical 64 | workshop, so we are usually *not* looking for more concepts or tools to add to 65 | them. As a rule, if you want to introduce a new idea, you must (a) estimate how 66 | long it will take to teach and (b) explain what you would take out to make room 67 | for it. The first encourages contributors to be honest about requirements; the 68 | second, to think hard about priorities. 69 | 70 | We are also not looking for exercises or other material that only run on one 71 | platform. Our workshops typically contain a mixture of Windows, macOS, and 72 | Linux users; in order to be usable, our lessons must run equally well on all 73 | three. 74 | 75 | ### Using GitHub 76 | 77 | If you choose to contribute via GitHub, you may want to look at [How to 78 | Contribute to an Open Source Project on GitHub][how-contribute]. In brief, we 79 | use [GitHub flow][github-flow] to manage changes: 80 | 81 | 1. Create a new branch in your desktop copy of this repository for each 82 | significant change. 83 | 2. Commit the change in that branch. 84 | 3. Push that branch to your fork of this repository on GitHub. 85 | 4. Submit a pull request from that branch to the [upstream repository][repo]. 86 | 5. If you receive feedback, make changes on your desktop and push to your 87 | branch on GitHub: the pull request will update automatically. 88 | 89 | NB: The published copy of the lesson is usually in the `main` branch. 90 | 91 | Each lesson has a team of maintainers who review issues and pull requests or 92 | encourage others to do so. 
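For readers new to Git, the first two GitHub-flow steps listed above can be sketched in a throwaway local repository. This is only an illustrative sketch: the branch name `fix-setup-typo` and the file `notes.md` are hypothetical placeholders, and pushing (step 3) and opening a pull request (step 4) additionally require a fork of the lesson on GitHub.

```bash
set -e
tmp="$(mktemp -d)"                                 # throwaway repository for illustration
cd "$tmp"
git init -q
git config user.name "Example Contributor"         # local identity so commits work anywhere
git config user.email "contributor@example.org"
git commit -q --allow-empty -m "initial commit"    # stand-in for the existing lesson history
git checkout -q -b fix-setup-typo                  # 1. one branch per significant change
echo "corrected wording" > notes.md
git add notes.md
git commit -q -m "Fix typo in setup instructions"  # 2. commit the change in that branch
git log --oneline -1
# 3. git push origin fix-setup-typo                (push the branch to your fork)
# 4. open a pull request on GitHub from that branch to the upstream repository
```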
The maintainers are community volunteers, and have 93 | final say over what gets merged into the lesson. 94 | 95 | ### Other Resources 96 | 97 | The Carpentries is a global organisation with volunteers and learners all over 98 | the world. We share values of inclusivity and a passion for sharing knowledge, 99 | teaching and learning. There are several ways to connect with The Carpentries 100 | community listed at [https://carpentries.org/connect/](https://carpentries.org/connect/) including via social 101 | media, slack, newsletters, and email lists. You can also [reach us by 102 | email][contact]. 103 | 104 | [cp-site]: https://carpentries.org/ 105 | [swc-site]: https://software-carpentry.org/ 106 | [dc-site]: https://datacarpentry.org/ 107 | [lc-site]: https://librarycarpentry.org/ 108 | [github]: https://github.com 109 | [contact]: mailto:team@carpentries.org 110 | [github-join]: https://github.com/join 111 | [repo-issues]: https://github.com/datacarpentry/genomics-workshop/issues 112 | [issues]: https://carpentries.org/help-wanted-issues/ 113 | [template-doc]: https://carpentries.github.io/workbench/ 114 | [dc-issues]: https://github.com/issues?q=user%3Adatacarpentry 115 | [lc-issues]: https://github.com/issues?q=user%3ALibraryCarpentry 116 | [swc-issues]: https://github.com/issues?q=user%3Aswcarpentry 117 | [how-contribute]: https://egghead.io/courses/how-to-contribute-to-an-open-source-project-on-github 118 | [github-flow]: https://guides.github.com/introduction/flow/ 119 | [repo]: https://github.com/datacarpentry/genomics-workshop 120 | 121 | 122 | 123 | -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Licenses" 3 | --- 4 | 5 | ## Instructional Material 6 | 7 | All Carpentries (Software Carpentry, Data Carpentry, and Library Carpentry) 8 | instructional material is made available under the [Creative Commons 9 | 
Attribution license][cc-by-human]. The following is a human-readable summary of 10 | (and not a substitute for) the [full legal text of the CC BY 4.0 11 | license][cc-by-legal]. 12 | 13 | You are free: 14 | 15 | - to **Share**\---copy and redistribute the material in any medium or format 16 | - to **Adapt**\---remix, transform, and build upon the material 17 | 18 | for any purpose, even commercially. 19 | 20 | The licensor cannot revoke these freedoms as long as you follow the license 21 | terms. 22 | 23 | Under the following terms: 24 | 25 | - **Attribution**\---You must give appropriate credit (mentioning that your work 26 | is derived from work that is Copyright (c) The Carpentries and, where 27 | practical, linking to [https://carpentries.org/](https://carpentries.org/)), provide a [link to the 28 | license][cc-by-human], and indicate if changes were made. You may do so in 29 | any reasonable manner, but not in any way that suggests the licensor endorses 30 | you or your use. 31 | 32 | - **No additional restrictions**\---You may not apply legal terms or 33 | technological measures that legally restrict others from doing anything the 34 | license permits. With the understanding that: 35 | 36 | Notices: 37 | 38 | - You do not have to comply with the license for elements of the material in 39 | the public domain or where your use is permitted by an applicable exception 40 | or limitation. 41 | - No warranties are given. The license may not give you all of the permissions 42 | necessary for your intended use. For example, other rights such as publicity, 43 | privacy, or moral rights may limit how you use the material. 44 | 45 | ## Software 46 | 47 | Except where otherwise noted, the example programs and other software provided 48 | by The Carpentries are made available under the [OSI][osi]\-approved [MIT 49 | license][mit-license]. 
50 | 51 | Permission is hereby granted, free of charge, to any person obtaining a copy of 52 | this software and associated documentation files (the "Software"), to deal in 53 | the Software without restriction, including without limitation the rights to 54 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 55 | of the Software, and to permit persons to whom the Software is furnished to do 56 | so, subject to the following conditions: 57 | 58 | The above copyright notice and this permission notice shall be included in all 59 | copies or substantial portions of the Software. 60 | 61 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 62 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 63 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 64 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 65 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 66 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 67 | SOFTWARE. 68 | 69 | ## Trademark 70 | 71 | "The Carpentries", "Software Carpentry", "Data Carpentry", and "Library 72 | Carpentry" and their respective logos are registered trademarks of 73 | [The Carpentries, Inc.][carpentries]. 74 | 75 | [cc-by-human]: https://creativecommons.org/licenses/by/4.0/ 76 | [cc-by-legal]: https://creativecommons.org/licenses/by/4.0/legalcode 77 | [mit-license]: https://opensource.org/licenses/mit-license.html 78 | [carpentries]: https://carpentries.org 79 | [osi]: https://opensource.org 80 | 81 | 82 | 83 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.3260309.svg)](https://doi.org/10.5281/zenodo.3260309) 2 | 3 | # Genomics Workshop 4 | 5 | Overview of the Genomics workshop. 
6 | 7 | ## Code of Conduct 8 | 9 | All participants should agree to abide by [The Carpentries Code of Conduct](https://docs.carpentries.org/topic_folders/policies/index_coc.html). 10 | 11 | ## Authors 12 | 13 | The Genomics workshop overview is authored and maintained by the [community](https://github.com/datacarpentry/genomics-workshop/network/members). 14 | 15 | ## Citation 16 | 17 | Please cite as: 18 | Erin Alison Becker, Tracy Teal, François Michonneau, Maneesha Sane, Taylor Reiter, Jason Williams, et al. (2019, June). datacarpentry/genomics-workshop: Data Carpentry: Genomics Workshop Overview, June 2019 (Version v2019.06.1). Zenodo. [http://doi.org/10.5281/zenodo.3260309](https://doi.org/10.5281/zenodo.3260309) 19 | 20 | 21 | -------------------------------------------------------------------------------- /config.yaml: -------------------------------------------------------------------------------- 1 | #------------------------------------------------------------ 2 | # Values for this lesson. 3 | #------------------------------------------------------------ 4 | 5 | # Which carpentry is this (swc, dc, lc, or cp)? 6 | # swc: Software Carpentry 7 | # dc: Data Carpentry 8 | # lc: Library Carpentry 9 | # cp: Carpentries (to use for instructor training for instance) 10 | # incubator: The Carpentries Incubator 11 | carpentry: 'dc' 12 | 13 | # Overall title for pages. 
14 | title: 'Genomics Workshop Overview' 15 | 16 | # Date the lesson was created (YYYY-MM-DD, this is empty by default) 17 | created: '2015-06-03' 18 | 19 | # Comma-separated list of keywords for the lesson 20 | keywords: 'software, data, lesson, The Carpentries' # FIXME 21 | 22 | # Life cycle stage of the lesson 23 | # possible values: pre-alpha, alpha, beta, stable 24 | life_cycle: 'stable' 25 | 26 | # License of the lesson 27 | license: 'CC-BY 4.0' 28 | 29 | # Link to the source repository for this lesson 30 | source: 'https://github.com/datacarpentry/genomics-workshop' 31 | 32 | # Default branch of your lesson 33 | branch: 'main' 34 | 35 | # Who to contact if there are any issues 36 | contact: 'team@carpentries.org' 37 | 38 | # Navigation ------------------------------------------------ 39 | # 40 | # Use the following menu items to specify the order of 41 | # individual pages in each dropdown section. Leave blank to 42 | # include all pages in the folder. 43 | # 44 | # Example ------------- 45 | # 46 | # episodes: 47 | # - introduction.md 48 | # - first-steps.md 49 | # 50 | # learners: 51 | # - setup.md 52 | # 53 | # instructors: 54 | # - instructor-notes.md 55 | # 56 | # profiles: 57 | # - one-learner.md 58 | # - another-learner.md 59 | 60 | # Order of episodes in your lesson 61 | episodes: 62 | - introduction.Rmd 63 | 64 | # Information for Learners 65 | learners: 66 | 67 | # Information for Instructors 68 | instructors: 69 | 70 | # Learner Profiles 71 | profiles: 72 | 73 | # Customisation --------------------------------------------- 74 | # 75 | # This space below is where custom yaml items (e.g. 
pinning 76 | # sandpaper and varnish versions) should live 77 | 78 | 79 | url: 'https://datacarpentry.github.io/genomics-workshop' 80 | analytics: 'carpentries' 81 | lang: 'en' 82 | overview: true 83 | -------------------------------------------------------------------------------- /fig/logging-onto-cloud-new-key-pair_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud-new-key-pair_1.png -------------------------------------------------------------------------------- /fig/logging-onto-cloud-new-key-pair_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud-new-key-pair_2.png -------------------------------------------------------------------------------- /fig/logging-onto-cloud-security-group_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud-security-group_1.png -------------------------------------------------------------------------------- /fig/logging-onto-cloud-security-group_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud-security-group_2.png -------------------------------------------------------------------------------- /fig/logging-onto-cloud-security-group_3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud-security-group_3.png 
-------------------------------------------------------------------------------- /fig/logging-onto-cloud-summary.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud-summary.png -------------------------------------------------------------------------------- /fig/logging-onto-cloud_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud_1.png -------------------------------------------------------------------------------- /fig/logging-onto-cloud_1b.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud_1b.png -------------------------------------------------------------------------------- /fig/logging-onto-cloud_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud_2.png -------------------------------------------------------------------------------- /fig/logging-onto-cloud_3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud_3.png -------------------------------------------------------------------------------- /fig/logging-onto-cloud_3b.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud_3b.png 
-------------------------------------------------------------------------------- /fig/logging-onto-cloud_5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud_5.png -------------------------------------------------------------------------------- /fig/logging-onto-cloud_6.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud_6.png -------------------------------------------------------------------------------- /fig/logging-onto-cloud_7.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud_7.png -------------------------------------------------------------------------------- /index.md: -------------------------------------------------------------------------------- 1 | --- 2 | site: sandpaper::sandpaper_site 3 | --- 4 | 5 | Data Carpentry's aim is to teach researchers basic concepts, skills, and tools for working 6 | with data so that they can get more done in less time, and with less pain. This workshop 7 | teaches data management and analysis for genomics research including: 8 | best practices for organization of bioinformatics projects and data, use of command-line 9 | utilities, use of command-line tools to analyze sequence quality and 10 | perform variant calling, and connecting to and using cloud computing. This workshop is designed to 11 | be taught over two full days of instruction. 12 | 13 | **Please note that workshop materials for working with Genomics data in R are in "alpha" development. 
These lessons are available for review and for informal teaching experiences, but are not yet part of The Carpentries' official lesson offerings.** 14 | 15 | Interested in teaching these materials? We have an [onboarding video](https://www.youtube.com/watch?v=zgdutO5tejo) and accompanying [slides](https://docs.google.com/presentation/d/1fLlT2lPv32DqCFpRPPdHZBNHiQTpK79wd5Z3nsFwL3s/edit#slide=id.p) available to prepare Instructors to teach these lessons. After watching this video, please contact [team@carpentries.org](mailto:team@carpentries.org) so that we can record your status as an onboarded Instructor. Instructors who have completed onboarding will be given priority status for teaching at centrally-organized Data Carpentry Genomics workshops. 16 | 17 | ::::::::::::::::::::::::::::::::::::::::: callout 18 | 19 | ## Frequently Asked Questions 20 | 21 | Read our [FAQ](/genomics-workshop/faq) to learn more about Data Carpentry's Genomics workshop, as an Instructor or a workshop host. 22 | 23 | :::::::::::::::::::::::::::::::::::::::::::::::::: 24 | 25 | :::::::::::::::::::::::::::::::::::::::::: prereq 26 | 27 | ## Getting Started 28 | 29 | This lesson assumes that learners have no prior experience with the tools covered in the workshop. 30 | However, learners are expected to have some familiarity with biological concepts, 31 | including the 32 | concept of genomic variation within a population. Participants should bring their own laptops and plan to participate actively. 33 | 34 | To get started, follow the directions in the [Setup](learners/setup.md) tab to 35 | get access to the required software and data for this workshop. 
36 | 37 | :::::::::::::::::::::::::::::::::::::::::::::::::: 38 | 39 | :::::::::::::::::::::::::::::::::::::::::: prereq 40 | 41 | ## Data 42 | 43 | This workshop uses data from a long term evolution experiment published in 2016: [Tempo and mode of genome evolution in a 50,000-generation experiment](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4988878/) by Tenaillon O, Barrick JE, Ribeck N, Deatherage DE, Blanchard JL, Dasgupta A, Wu GC, Wielgoss S, Cruveiller S, Médigue C, Schneider D, and Lenski RE. (doi: 10.1038/nature18959) 44 | 45 | All of the data used in this workshop can be [downloaded from Figshare](https://figshare.com/articles/Data_Carpentry_Genomics_beta_2_0/7726454). 46 | More information about this data is available on the [Data page](https://datacarpentry.org/organization-genomics/data). 47 | 48 | :::::::::::::::::::::::::::::::::::::::::::::::::: 49 | 50 | # Workshop Overview 51 | 52 | | Lesson | Overview | 53 | | ----------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------ | 54 | | [Project organization and management](https://datacarpentry.github.io/organization-genomics/) | Learn how to structure your metadata, organize and document your genomics data and bioinformatics workflow, and access data on the NCBI sequence read archive (SRA) database. | 55 | | [Introduction to the command line](https://datacarpentry.github.io/shell-genomics/) | Learn to navigate your file system, create, copy, move, and remove files and directories, and automate repetitive tasks using scripts and wildcards. | 56 | | [Data wrangling and processing](https://datacarpentry.github.io/wrangling-genomics/) | Use command-line tools to perform quality control, align reads to a reference genome, and identify and visualize between-sample variation. 
| 57 | | [Introduction to cloud computing for genomics](https://www.datacarpentry.org/cloud-genomics/) | Learn how to work with Amazon AWS cloud computing and how to transfer data between your local computer and cloud resources. | 58 | 59 | # Optional Additional Lessons 60 | 61 | | Lesson | Overview | 62 | | ----------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------ | 63 | | [Intro to R and RStudio for Genomics](https://datacarpentry.org/genomics-r-intro/) | Use R to analyze and visualize between-sample variation. | 64 | 65 | # Teaching Platform 66 | 67 | This workshop is designed to be run on pre-imaged Amazon Web Services (AWS) 68 | instances. All the software and data used in the workshop are hosted on an Amazon Machine Image (AMI). 69 | If you want to run your own instance of the server used for this workshop, follow the directions in the [Setup](learners/setup.md) tab. 70 | 71 | # Common Schedules 72 | 73 | ### Schedule A (2 days OR 4 half days) 74 | 75 | - Half-day 1: [Project organization and management](https://datacarpentry.github.io/organization-genomics/) \& [Introduction to the command line](https://datacarpentry.github.io/shell-genomics/) 76 | - Half-day 2: [Introduction to the command line](https://datacarpentry.github.io/shell-genomics/) (continued). 
77 | - Half-day 3 \& 4: [Data wrangling and processing](https://datacarpentry.github.io/wrangling-genomics/) 78 | 79 | ### Schedule B (2 days OR 4 half days) 80 | 81 | - Half-day 1: [Project organization and management](https://datacarpentry.github.io/organization-genomics/) \& [Introduction to the command line](https://datacarpentry.github.io/shell-genomics/) 82 | - Half-day 2: [Introduction to the command line](https://datacarpentry.github.io/shell-genomics/) (continued) 83 | - Half-day 3 \& 4: [Intro to R and RStudio for Genomics](https://datacarpentry.org/genomics-r-intro/) 84 | 85 | ### Schedule C (3 days OR 6 half days) 86 | 87 | - Half-day 1: [Project organization and management](https://datacarpentry.github.io/organization-genomics/) \& [Introduction to the command line](https://datacarpentry.github.io/shell-genomics/) 88 | - Half-day 2: [Introduction to the command line](https://datacarpentry.github.io/shell-genomics/) (continued) 89 | - Half-day 3 \& 4: [Data wrangling and processing](https://datacarpentry.github.io/wrangling-genomics/) 90 | - Half-day 5 \& 6: [Intro to R and RStudio for Genomics](https://datacarpentry.org/genomics-r-intro/) 91 | 92 | 93 | -------------------------------------------------------------------------------- /instructors/AMI-setup.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Launching your own AMI instances 3 | --- 4 | 5 | ::::::::::::::::::::::::::::::::::::::::: callout 6 | 7 | ## Do I need to create my own instances? 8 | 9 | **If you are:** 10 | 11 | - teaching at or attending a centrally organized Data 12 | Carpentry workshop, 13 | - a Maintainer for one of the Genomics lessons, 14 | - contributing to the Genomics lessons, or 15 | - teaching at a self-organized workshop 16 | 17 | The Carpentries staff will create AMI instances for you. Please contact 18 | [team@carpentries.org](mailto:team@carpentries.org).
19 | 20 | **If you are:** 21 | 22 | - working through these lessons on your own outside of a workshop, 23 | - practicing your skills after a workshop, or 24 | - using these lessons for a teaching demonstration as part of your Instructor checkout for The Carpentries, 25 | 26 | you will need to create your own AMI instances using the instructions below. The cost of using this AMI for a few days, with the 27 | t2.medium instance type, is about USD $1.20 per day. Data Carpentry has no control over AWS pricing structure and provides 28 | this cost estimate with no guarantees. Please see the [EC2 pricing page](https://aws.amazon.com/ec2/pricing/on-demand) for up-to-date information. 29 | 30 | :::::::::::::::::::::::::::::::::::::::::::::::::: 31 | 32 | ### Launching an instance on Amazon Web Services 33 | 34 | :::::::::::::::::::::::::::::::::::::::::: prereq 35 | 36 | ## Prerequisites 37 | 38 | - Form of payment (credit card) 39 | - Understanding of Amazon's billing and payment (See: [Getting started with AWS Billing and Cost Management](https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/billing-getting-started.html)) 40 | - You can use some of Amazon Web Services for free, or see if you qualify for an AWS Grant (See: [https://aws.amazon.com/grants/](https://aws.amazon.com/grants/)) if you are using AWS for education. The free level of service *will not* be sufficient for working with the amount of data we are using for our lessons. 41 | 42 | :::::::::::::::::::::::::::::::::::::::::::::::::: 43 | 44 | #### Create an AWS account 45 | 46 | 1\. Go to Amazon Web Services [https://aws.amazon.com/](https://aws.amazon.com/) 47 | 48 | 2\. Follow the button to sign up for an account - you will need to agree to Amazon's terms and conditions and provide credit card information. 49 | 50 | #### Sign into AWS and Launch an Instance 51 | 52 | 1\. Sign into the AWS EC2 Dashboard: [https://console.aws.amazon.com/ec2/](https://console.aws.amazon.com/ec2/) 53 | 54 | 2\.
Click the 'Launch Instance' button 55 | 56 | Screenshot of AWS EC2 dashboard showing location of launch instance button. 57 | 58 | 3\. Under 'Application and OS Images (Amazon Machine Image)' search for the AMI listed on this curriculum's [Setup page](https://datacarpentry.org/genomics-workshop/index.html#setup) 59 | 60 | Screenshot of AMI launch wizard showing search function. 61 | 62 | 4\. Click "Community AMIs", and then select that image 63 | 64 | Screenshot of AMI launch wizard showing Community AMI tab. 65 | 66 | 5\. Under 'Instance type' click "Compare instance types" and then select **t2.medium**; click "Select instance type" 67 | 68 | Screenshot of AMI launch wizard showing choosing t2.medium image type. 69 | 70 | Screenshot of AMI compare instance type page. 71 | 72 | 6\. Under 'Key pair (login)' click "Proceed without a key pair (not recommended)". A key pair is not necessary for this use case, as you will be using an account that is set up with limited access for learners. If you want to make changes to the instance (for example, installing additional software), you will need administrative access and will need to set up a key pair. Refer to [Amazon's user manual](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html) for information on key pair usage. 73 | 74 | Screenshot showing key pair settings box. 75 | 76 | 7\. Scroll down to 'Network settings'. If this is your first time working with this AMI on your 77 | AWS account, choose "create a new security group". Click "Edit". 78 | 79 | Screenshot of AMI launch wizard showing network settings box with 'Create security group' selected. 80 | 81 | 8\. Name your security group something descriptive (for example "DC-genomics-AMI") 82 | and enter a description into the description box (for example "to use with DC genomics AMI"). 83 | 84 | Your security group should now look like this: 85 | 86 | Screenshot of AMI launch wizard showing creating a new security group. 87 | 88 | 9\.
Click "Add security group rule". A new row will appear. Under 'Type' select "Custom TCP" and enter "8787" into the box labeled "Port Range". Under 89 | "Source type", select "Anywhere". You should now see a screen that looks like this: 90 | 91 | Screenshot of AMI launch wizard showing security group rules. 92 | 93 | 10\. Under 'Summary' on the right side of the screen, you should now see a screen that looks like this. Click "Launch Instance". 94 | 95 | Screenshot of AMI launch wizard showing the launch summary. 96 | 97 | Your instance will now be launched. You should follow the links to 'Create billing alerts' and then the instructions below 98 | for connecting to and terminating your Amazon Instance. 99 | 100 | :::::::::::::::: spoiler 101 | 102 | ## Connect to your Amazon Instance (MacOS/Linux) 103 | 104 | 1. Log into your AWS EC2 Dashboard [https://console.aws.amazon.com/ec2/](https://console.aws.amazon.com/ec2/) 105 | 106 | 2. You should see that you have one instance. To proceed, the instance state must be 'running' (if you just launched the instance it will take \<5 min for the instance to start running). 107 | 108 | Screenshot of AWS EC2 dashboard showing number of running instances. 109 | 110 | 3. At the bottom of the dashboard, you should see a **Public IPv4 DNS** which will look something like *ec2-18-212-60-130.compute-1.amazonaws.com*. Copy that address (you may wish to make a note of it, as you will need it each time you connect). 111 | 112 | Screenshot of AWS EC2 dashboard showing instance state as running. 113 | 114 | 4. You can now connect to your instance using 'ssh'. Your command will be something like this: 115 | 116 | ```bash 117 | $ ssh dcuser@ec2-18-212-60-130.compute-1.amazonaws.com 118 | ``` 119 | 120 | Use `dcuser` as the username, but be sure to replace `ec2-18-212-60-130.compute-1.amazonaws.com` with the DNS for your image. 
You may be notified that the authenticity of the host cannot be verified - if so, type 'yes' into the prompt to bypass the warning and continue connecting. 121 | 122 | 5. When prompted, enter the password `data4Carp` 123 | 124 | You should now be connected to your personal instance. You can confirm that you are in the correct location 125 | by using the `whoami` and `pwd` commands, which should yield the following results: 126 | 127 | ```bash 128 | $ whoami 129 | dcuser 130 | $ pwd 131 | /home/dcuser 132 | ``` 133 | 134 | ::::::::::::::::::::::::: 135 | 136 | :::::::::::::::: spoiler 137 | 138 | ## Connect to your Amazon instance (Windows) 139 | 140 | 1. Download the PuTTY application at: [https://the.earth.li/~sgtatham/putty/latest/x86/putty.exe](https://the.earth.li/~sgtatham/putty/latest/x86/putty.exe) 141 | 142 | 2. Log into your AWS EC2 Dashboard [https://console.aws.amazon.com/ec2/](https://console.aws.amazon.com/ec2/) 143 | 144 | 3. You should see that you have one instance. Make sure the instance state is 'running' (if you just launched the instance it will take \<5 min for the instance to start running) 145 | 146 | Screenshot of AWS EC2 dashboard showing number of running instances. 147 | 148 | 4. At the bottom of the dashboard, you should see a **Public IPv4 DNS** which will look something like *ec2-18-212-60-130.compute-1.amazonaws.com*. Copy that address (you may wish to make a note of it, as you will need it each time you connect). 149 | 150 | Screenshot of AWS EC2 dashboard showing instance state as running. 151 | 152 | 5. Start PuTTY. In the section 'Specify the destination you want to connect to' for 'Host Name (or IP address)' paste in the DNS address and click 'Open' 153 | 154 | 6. When prompted to login, enter 'dcuser'; you may be notified that the authenticity of the host cannot be verified - if so select "Yes" to bypass the warning and continue connecting 155 | 156 | 7. 
When prompted, enter the password `data4Carp` 157 | 158 | You should now be connected to your personal instance. You can confirm this by using the `whoami` and `pwd` commands, which should yield the following results: 159 | 160 | ```bash 161 | Last login: Thu Jul 30 13:21:08 2015 from 8.sub-70-197-200.myvzw.com 162 | $ whoami 163 | dcuser 164 | $ pwd 165 | /home/dcuser 166 | ``` 167 | 168 | ::::::::::::::::::::::::: 169 | 170 | #### Terminating your instance 171 | 172 | ::::::::::::::::::::::::::::::::::::::::: callout 173 | 174 | ## Very Important Warning - Avoid Unwanted Charges 175 | 176 | Please remember, for as long as this instance is running, you will 177 | be charged for your usage. You can see an estimate of the current 178 | charge from your AWS EC2 dashboard by clicking your name (Account 179 | name) on the upper right of the dashboard and selecting 'Billing 180 | \& Cost Management'. **DO NOT FORGET TO TERMINATE YOUR INSTANCE WHEN YOU ARE DONE** 181 | 182 | :::::::::::::::::::::::::::::::::::::::::::::::::: 183 | 184 | When you are finished with your instance, you must terminate it to avoid unwanted charges. Follow these steps. 185 | 186 | 1. Sign into AWS and go to the EC2 Dashboard: [https://console.aws.amazon.com/ec2/](https://console.aws.amazon.com/ec2/) 187 | 2. Under 'Resources' select 'Running Instances' 188 | 3. Select the instance you wish to terminate, then click 'Instance state' and select 'Terminate instance' 189 | 190 | Screenshot of AWS EC2 dashboard showing drop-down menu for terminating an instance. 191 | 192 | ::::::::::::::::::::::::::::::::::::::::: callout 193 | 194 | ## Warning 195 | 196 | Terminating an instance will delete any data on this instance, so you must move any data you wish to save off the instance. 197 | 198 | :::::::::::::::::::::::::::::::::::::::::::::::::: 199 | 200 | 4. Select 'Terminate' to terminate the instance. 
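As the warning above notes, terminating an instance deletes its data, so copy anything you want to keep to your local machine first. One option is `scp`, which accepts the same `dcuser` login and `data4Carp` password as `ssh`. The sketch below builds the command as a string so you can adapt it before running it: the DNS is the example address used earlier on this page, and `results` is a hypothetical directory name; substitute your instance's Public IPv4 DNS and the path you actually want to save.

```bash
# Sketch: build the scp command that copies a directory off the instance.
# Run the printed command on your LOCAL computer, not on the instance.
DNS="ec2-18-212-60-130.compute-1.amazonaws.com"  # replace with your instance's Public IPv4 DNS
CMD="scp -r dcuser@${DNS}:~/results ./results"   # "results" is a hypothetical directory
echo "$CMD"
```

The `-r` flag copies the directory recursively; `scp` will prompt for the `dcuser` password just as `ssh` does.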
201 | 202 | 203 | -------------------------------------------------------------------------------- /instructors/faq.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Frequently Asked Questions 3 | --- 4 | 5 | Thank you for your interest in hosting or teaching a Genomics workshop. Below you will find answers to some frequently asked questions about this curriculum. If the answer to your question doesn't appear, please contact [team@carpentries.org](mailto:team@carpentries.org). 6 | 7 | - [For Hosts](#hosts) 8 | - [For Instructors](#instructors) 9 | 10 | ## Hosts 11 | 12 | ### What does this workshop cover? 13 | 14 | This workshop teaches data management and analysis for genomics research including: best practices for organization of bioinformatics projects and data, use of command line utilities, use of command line tools to analyze sequence quality and perform variant calling, and connecting to and using cloud computing. 15 | 16 | ### What experience do learners need to have before this workshop? What will they be able to do by the end of the workshop? 17 | 18 | This lesson assumes no prior experience with the tools covered in the workshop. However, learners are expected to have some familiarity with biological concepts, including the concept of genomic variation within a population. By the end of the workshop, learners will be able to: 19 | 20 | - structure their metadata, organize and document their genomics data and bioinformatics workflow, and access data on the NCBI sequence read archive (SRA) database, 21 | - navigate their file systems, create, copy, move, and remove files and directories, and automate repetitive tasks using scripts and wildcards, 22 | - use command-line tools to perform quality control, align reads to a reference genome, and identify and visualize between-sample variation, 23 | - work with Amazon AWS cloud computing and transfer data between their local computer and cloud resources. 
24 | 25 | ### What are the software, hardware, and connectivity needs for this workshop? 26 | 27 | Learners will need to bring a laptop (not a tablet) with any spreadsheet program installed (e.g. LibreOffice, Microsoft Excel). Learners using a Windows machine will also need to download and install [PuTTY](https://www.putty.org/). There are no other hardware or software requirements. Learners will need a stable, strong internet connection in order to work on the remote computing system used for this workshop. 28 | 29 | ### My institution has its own compute cluster, or our research group uses a different cloud computing resource. Can we deliver the workshop using that system? 30 | 31 | To ensure a consistent workshop experience for learners and Instructors, all workshops organized by The Carpentries ("centrally-organized workshops") use our stable, community-tested curriculum and technical set-up. Currently, all centrally-organized Genomics workshops are taught using AWS, although we are interested in supporting other systems in the future. If you are interested in using a different platform to teach this curriculum in a self-organized workshop, all of our materials are publicly available and licensed [CC-BY](https://creativecommons.org/licenses/by/4.0/). For information about the difference between centrally-organized and self-organized workshops, and limitations on use of "The Carpentries" brand name and logo, see the [Teaching and Hosting](https://docs.carpentries.org/topic_folders/hosts_instructors/index.html) section of The Carpentries Handbook. 32 | 33 | ### What experience do helpers need to have for this workshop? 34 | 35 | Anyone who has some experience using the Bash shell can be an effective helper for this workshop. Helpers do not need to have experience working with genomics data or the specific command line tools taught in this workshop. 36 | 37 | ### I want to include the optional R lesson. Can I do that? 
38 | 39 | To ensure a consistent workshop experience for learners and Instructors, all workshops organized by The Carpentries ("centrally-organized workshops") use our stable, community-tested curriculum. The Genomics R lesson is still under development and we cannot guarantee that it will meet our high curricular standards. If you are interested in using this lesson in a self-organized workshop, all of our materials are publicly available and licensed CC-BY. For information about the difference between centrally-organized and self-organized workshops, and limitations on use of "The Carpentries" brand name and logo, see the [Teaching and Hosting](https://docs.carpentries.org/topic_folders/hosts_instructors/index.html) section of The Carpentries Handbook. 40 | 41 | ### Does the AWS image location matter? Do I need to set up an AMI in a different region if my workshop will be held outside of the Eastern US? 42 | 43 | We have run this workshop in locations across the United States and Europe with no noticeable difference in instance speed. If you experience any issues, please [let us know](mailto:team@carpentries.org). 44 | 45 | ### Where can I find more information about this workshop? 46 | 47 | For a full description of this workshop, including what content is covered, and what dataset we use to teach, visit the [Genomics Workshop Overview](https://datacarpentry.org/genomics-workshop/) page. 48 | 49 | ## Instructors 50 | 51 | ### What background and technological skills do I need to have to teach this workshop? 52 | 53 | You will need experience using a bash shell (the default shell on Mac OS and most Linux systems), including writing your own small bash scripts and running programs written by others from the command line. You do not need to have specifically used the command-line programs that are used in these lessons. 
You do not need to have prior experience working with Amazon Web Services (AWS), but some experience logging on to remote computers would be useful. You should have experience working with genomic sequences in FASTQ format. 54 | 55 | ### How can I prepare to teach this material? 56 | 57 | Each lesson has a set of Instructor Notes that provide information about the design of the lesson, commonly encountered problems, and technical tips and tricks. You can access Instructor Notes through the [main lessons page](https://datacarpentry.org/lessons/#genomics-workshop) (linked through the plus icon in the lesson table) or in the "Extras" menu on each individual lesson page. Instructor Notes are written collaboratively by our Instructors, so please contribute your own notes after your workshop! 58 | 59 | ### When will I have access to an AWS image to practice on? 60 | 61 | Each Instructor will receive connection information for an AMI instance approximately one week before the workshop. If you would like workshop helpers to also have access to an instance to prepare for the workshop, or if you would like access more than one week in advance of the workshop, please contact [team@carpentries.org](mailto:team@carpentries.org). 62 | 63 | ### How will my learners get connection and log-in information? 64 | 65 | The Carpentries Workshops and Instruction Team will provide Instructors with connection information for AMI instances the day before the workshop. Enough instances will be provided for each learner, and if requested, for each workshop helper, to have their own individual instance. Instructors can make connection information available to learners through the workshop Etherpad. 66 | 67 | ### I want to teach the optional R lesson. Can I do that? 68 | 69 | To ensure a consistent workshop experience for learners and Instructors, all workshops organized by The Carpentries ("centrally-organized workshops") use our stable, community-tested curriculum. 
The Genomics R lesson is still under development and we cannot guarantee that it will meet our high curricular standards. If you are interested in using this lesson in a self-organized workshop, all of our materials are publicly available and licensed CC-BY. For information about the difference between centrally-organized and self-organized workshops, and limitations on use of "The Carpentries" brand name and logo, see the [Teaching and Hosting](https://docs.carpentries.org/topic_folders/hosts_instructors/index.html) section of The Carpentries Handbook. 70 | 71 | ### Is there anything special I need to do if I'm teaching the optional R lesson? 72 | 73 | Nope! The AMI that we use for the standard lessons can also be used to teach the optional R lesson. 74 | 75 | ### What are common problems that arise during this workshop? 76 | 77 | The best place to get information about common problems that arise during workshops is in the Instructor Notes for each lesson. You can access Instructor Notes through the Extras menu in the top navigation bar that appears across the head of each lesson. Instructors are strongly encouraged to contribute back to the Instructor Notes based on their workshop experience. To contribute to the Instructor Notes, click the "Improve this page" menu option in the upper right corner of the Instructor Notes page. 78 | 79 | ### Does the AWS image location matter? Do I need to set up an AMI in a different region if my workshop will be held outside of the Eastern US? 80 | 81 | We have run this workshop in locations across the United States and Europe with no noticeable difference in instance speed. If you experience any issues, please [let us know](mailto:team@carpentries.org). 
82 | 83 | 84 | -------------------------------------------------------------------------------- /instructors/instructor-notes.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Instructor Notes 3 | --- 4 | 5 | ## Resources for Instructors 6 | 7 | We have an [onboarding video](https://www.youtube.com/watch?v=zgdutO5tejo) available to prepare Instructors to teach these lessons. 8 | The slides presented in this video are available at [https://tinyurl.com/y27swdvo](https://tinyurl.com/y27swdvo). 9 | After watching this video, please contact [team@carpentries.org](mailto:team@carpentries.org) so that we can record 10 | your status as an onboarded Instructor. Instructors who have completed onboarding will be given priority status for teaching at 11 | centrally-organized Carpentries workshops. 12 | 13 | ## Workshop Structure 14 | 15 | [Instructors, please add notes on your experience with the workshop structure here.] 16 | 17 | ## Technical tips and tricks 18 | 19 | #### Installation 20 | 21 | This workshop is designed to be run on pre-imaged Amazon Web Services (AWS) instances. See the 22 | [Setup page](https://datacarpentry.org/genomics-workshop/index.html#setup) for complete setup instructions. If you are 23 | teaching these lessons, and would like an AWS instance to practice on, please contact [team@carpentries.org](mailto:team@carpentries.org). 24 | 25 | ## Common problems 26 | 27 | This workshop introduces an analysis pipeline, where each step in that pipeline is dependent on the previous step. 28 | If a learner gets behind, or one of the steps doesn't work for them, they may not be able to catch up with the rest of the class. 29 | To help ensure that all learners are able to work through the whole process, we provide the solution files. 
This includes all 30 | of the output files for each step in the data processing pipeline, as well as the scripts that the learners write collaboratively 31 | with the Instructors throughout the workshop. These files are available on the AMI in `/home/dcuser/.solutions`. 32 | 33 | Similarly, if learners are unable to download the data files directly from the SRA as shown in the lesson (e.g. due to 34 | unstable internet), those files are available in the hidden backup directory (`/home/dcuser/.backup`). 35 | 36 | Make sure to tell your helpers about the `.solutions` and `.backup` directories so that they can use these resources to help 37 | learners catch up during the workshop. 38 | 39 | 40 | -------------------------------------------------------------------------------- /instructors/teaching_demos.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Teaching Demonstrations 3 | --- 4 | 5 | If you are an instructor in training and wish to use lessons from Data Carpentry's Genomics curriculum for your teaching demo, please read the instructions below to be sure you are prepared. You must follow these steps *before* your teaching demo, or you will be asked to reschedule. 6 | 7 |
8 | 9 | #### [Project Organization and Management for Genomics](https://datacarpentry.org/organization-genomics/) 10 | 11 | No special instructions. 12 | 13 |
14 | 15 | #### [Introduction to the Command Line for Genomics](https://datacarpentry.org/shell-genomics/) 16 | 17 | For your teaching demo, you may follow this lesson locally without an AMI instance. Note that this will 18 | require some changes to paths throughout the lesson. 19 | 20 | Use the following shell commands to download and unzip the necessary data files from FigShare. 21 | 22 | ```bash 23 | wget --output-document shell_data.tar.gz https://ndownloader.figshare.com/files/14417834 24 | tar -xzf shell_data.tar.gz 25 | ``` 26 | 27 |
28 | 29 | #### [Data Wrangling and Processing for Genomics](https://datacarpentry.org/wrangling-genomics/) 30 | 31 | Use [these instructions](https://datacarpentry.org/genomics-workshop/AMI-setup) to launch and connect to your own instance of the Data Carpentry Genomics AMI. This instance should cost you approximately US $1.20 per day. (This cost estimate is provided without any guarantee of accuracy and Data Carpentry assumes no liability for costs associated with your AMI instance(s).) 32 | 33 | Once you have connected to your AWS instance, use the shell commands below to ensure that the data directory is created, 34 | that the data is placed into the data directory, and that you are in the data directory before 35 | starting to operate on the data. 36 | 37 | ```bash 38 | mkdir -p ~/dc_workshop/data/untrimmed_fastq/ 39 | mv ~/.backup/untrimmed_fastq/* ~/dc_workshop/data/untrimmed_fastq/ 40 | cd ~/dc_workshop/data/untrimmed_fastq 41 | ``` 42 | 43 |
44 | 45 | #### [Introduction to Cloud Computing for Genomics](https://datacarpentry.org/cloud-genomics/) 46 | 47 | Use [these instructions](https://datacarpentry.org/genomics-workshop/AMI-setup) to launch and connect to your own instance of the Data Carpentry Genomics AMI. This instance should cost you approximately US $1.20 per day. (This cost estimate is provided without any guarantee of accuracy and Data Carpentry assumes no liability for costs associated with your AMI instance(s).) 48 | 49 |
50 | 51 | #### [Data Analysis and Visualization in R](https://datacarpentry.org/genomics-r-intro/) 52 | 53 | **DO NOT USE for demos.** This lesson is not yet stable. 54 | 55 |
56 | 57 | 58 | -------------------------------------------------------------------------------- /learners/reference.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: 'Glossary' 3 | --- 4 | 5 | ## Glossary 6 | 7 | FIXME 8 | 9 | 10 | 11 | -------------------------------------------------------------------------------- /learners/setup.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Setup 3 | --- 4 | 5 | # Overview 6 | 7 | This workshop is designed to be run on pre-imaged Amazon Web Services (AWS) instances. 8 | All of the data and most of the software used in the workshop are hosted on an 9 | Amazon Machine Image (AMI). 10 | Some additional software, detailed below, must be installed on your computer. 11 | 12 | Please follow the instructions below to prepare your computer for the workshop: 13 | 14 | - Required additional software + Option A 15 | **OR** 16 | - Required additional software + Option B 17 | 18 | ## Required additional software 19 | 20 | This lesson requires a working spreadsheet program. 21 | If you don't have a spreadsheet program already, you can use LibreOffice. 22 | It's a free, open source spreadsheet program. 23 | Directions for installing on Windows, Mac OS X, and Linux systems are included below. 24 | For Windows, you will also need to install either Git Bash, PuTTY, or the Ubuntu Subsystem. 25 | 26 | :::::::::::::::: spoiler 27 | 28 | ## Windows 29 | 30 | - Visit [the LibreOffice installation page](https://www.libreoffice.org/download/libreoffice-fresh/). 31 | The version for Windows should automatically be selected. 32 | Click Download Version X.X.X (whichever is the most recent version). 33 | You will go to a page that asks about a donation, but you don't need to make one. 34 | Your download should begin automatically. 35 | - Once the installer is downloaded, double click on it and LibreOffice should install. 
36 | - Download the [Git for Windows installer](https://git-for-windows.github.io/). 37 | Run the installer and follow the steps below: 38 | - Click on "Next" four times (two times if you've previously installed Git). 39 | You don't need to change anything in the Information, location, components, and start menu screens. 40 | - **From the dropdown menu select "Use the Nano editor by default" 41 | (NOTE: you will need to scroll up to find it) and click on "Next".** 42 | - On the page that says "Adjusting the name of the initial branch in new repositories", 43 | ensure that "Let Git decide" is selected. 44 | This will ensure the highest level of compatibility for our lessons. 45 | - Ensure that "Git from the command line and also from 3rd-party software" 46 | is selected and click on "Next". 47 | (If you don't do this Git Bash will not work properly, 48 | requiring you to remove the Git Bash installation, 49 | re-run the installer, and select the 50 | "Git from the command line and also from 3rd-party software" option.) 51 | - Ensure that "Use the native Windows Secure Channel Library" is selected and click on "Next". 52 | - Ensure that "Checkout Windows-style, commit Unix-style line endings" is selected and click on "Next". 53 | - **Ensure that "Use Windows' default console window" is selected and click on "Next".** 54 | - Ensure that "Default (fast-forward or merge)" is selected and click on "Next". 55 | - Ensure that "Git Credential Manager Core" is selected and click on "Next". 56 | - Ensure that "Enable file system caching" is selected and click on "Next". 57 | - Click on "Install". 58 | - Click on "Finish". 59 | - Check the settings for your "HOME" environment variable. 
60 | - If your "HOME" environment variable is not set (or you don't know what this is): 61 | - Open command prompt (Open Start Menu then type `cmd` and press [Enter]) 62 | - Type the following line into the command prompt window exactly as shown: `setx HOME "%USERPROFILE%"` 63 | - Press [Enter]; you should see `SUCCESS: Specified value was saved.` 64 | - Quit command prompt by typing `exit` then pressing [Enter] 65 | - An **alternative option** is to [install PuTTY](https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html). 66 | For most newer computers, click on putty-64bit-X.XX-installer.msi to download the 64-bit version. 67 | If you have an older laptop, you may need to get the 32-bit version putty-X.XX-installer.msi. 68 | If you aren't sure whether you need the 64 or 32 bit version, 69 | you can [check your laptop version](https://support.microsoft.com/en-us/help/15056/windows-32-64-bit-faq). 70 | Once the installer is downloaded, double click on it, and PuTTY should install. 71 | - **Another alternative option** is to use the Ubuntu Subsystem for Windows. 72 | This option is only available for Windows 10 - the Microsoft documentation provides 73 | [detailed instructions for installing it on Windows 10](https://docs.microsoft.com/en-us/windows/wsl/install-win10). 74 | 75 | ::::::::::::::::::::::::: 76 | 77 | :::::::::::::::: spoiler 78 | 79 | ## Mac OS X 80 | 81 | - Visit [the LibreOffice installation page](https://www.libreoffice.org/download/libreoffice-fresh/). 82 | The version for Mac should automatically be selected. 83 | Click Download Version X.X.X (whichever is the most recent version). 84 | You will go to a page that asks about a donation, but you don't need to make one. 85 | Your download should begin automatically. 86 | - Once the installer is downloaded, double click on it and LibreOffice should install. 
87 | 88 | ::::::::::::::::::::::::: 89 | 90 | :::::::::::::::: spoiler 91 | 92 | ## Linux 93 | 94 | - Visit [the LibreOffice installation page](https://www.libreoffice.org/download/libreoffice-fresh/). 95 | The version for Linux should automatically be selected. 96 | Click Download Version X.X.X (whichever is the most recent version). 97 | You will go to a page that asks about a donation, but you don't need to make one. 98 | Your download should begin automatically. 99 | - Once the installer is downloaded, double click on it and LibreOffice should install. 100 | 101 | ::::::::::::::::::::::::: 102 | 103 | ## Option A (**Recommended**): Using the lessons with Amazon Web Services (AWS) 104 | 105 | If you are signed up to take a Genomics Data Carpentry workshop, 106 | you do *not* need to worry about setting up an AMI instance. 107 | The Carpentries staff will create an instance for you and this will be provided to you at no cost. 108 | This is true for both self-organized and centrally-organized workshops. 109 | Your Instructor will provide instructions for connecting to the AMI instance at the workshop. 110 | 111 | If you would like to work through these lessons independently, outside of a workshop, 112 | you will need to start your own AMI instance. 113 | Follow these [instructions on creating an Amazon instance](https://datacarpentry.org/genomics-workshop/AMI-setup). 114 | Use the AMI `ami-07196848f138b4f29` (Data Carpentry Genomics with R 4.4) 115 | listed on the Community AMIs page. 116 | Please note that you must set your location as `N. Virginia` in order to access this community AMI. 117 | You can change your location in the upper right corner of the main AWS menu bar. 118 | The cost of using this AMI for a few days, 119 | with the t2.medium instance type is very low (about USD $1.50 per user, per day). 120 | Data Carpentry has *no* control over AWS pricing structure and provides this 121 | cost estimate with no guarantees. 
122 | Please read AWS documentation on pricing for up-to-date information. 123 | 124 | If you're an Instructor or Maintainer or want to contribute to these lessons, 125 | please [get in touch with us](mailto:team@carpentries.org) 126 | and we will start instances for you. 127 | 128 | ## Option B: Using the lessons on your local machine 129 | 130 | While not recommended, it is possible to work through the lessons on your local machine 131 | (i.e. without using AWS). 132 | To do this, you will need to install all of the software used in the workshop 133 | and obtain a copy of the dataset. 134 | Instructions for doing this are listed below. 135 | 136 | ### Data 137 | 138 | [The data used in this workshop is available on FigShare](https://figshare.com/articles/Data_Carpentry_Genomics_beta_2_0/7726454). 139 | Because this workshop works with real data, be aware that file sizes for the data are large. 140 | Please read the FigShare page for information about the data and access to the data files. 141 | 142 | More information about these data will be presented in 143 | [the first lesson of the workshop](https://www.datacarpentry.org/organization-genomics/data/). 144 | 145 | ### Software 146 | 147 | | Software | Version | Manual | Available for | Description | 148 | | -------- | ------- | ------ | --------------------- | --------------------------------------------------------------------- | 149 | | [FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) | 0\.11.9 | [Link](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/) | Linux, MacOS, Windows | Quality control tool for high throughput sequence data. | 150 | | [Trimmomatic](https://www.usadellab.org/cms/?page=trimmomatic) | 0\.39 | [Link](https://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/TrimmomaticManual_V0.32.pdf) | Linux, MacOS, Windows | A flexible read trimming tool for Illumina NGS data. 
| 151 | | [BWA](https://bio-bwa.sourceforge.net/) | 0\.7.17 | [Link](https://bio-bwa.sourceforge.net/bwa.shtml) | Linux, MacOS | Mapping DNA sequences against reference genome. | 152 | | [SAMtools](https://samtools.sourceforge.net/) | 1\.9 | [Link](https://www.htslib.org/doc/samtools.html) | Linux, MacOS | Utilities for manipulating alignments in the SAM format. | 153 | | [BCFtools](https://samtools.github.io/bcftools/) | 1\.9 | [Link](https://samtools.github.io/bcftools/bcftools.html) | Linux, MacOS | Utilities for variant calling and manipulating VCFs and BCFs. | 154 | | [IGV](https://software.broadinstitute.org/software/igv/home) | [Link](https://software.broadinstitute.org/software/igv/download) | [Link](https://software.broadinstitute.org/software/igv/UserGuide) | Linux, MacOS, Windows | Visualization and interactive exploration of large genomics datasets. | 155 | 156 | ### QuickStart Software Installation Instructions 157 | 158 | These are the QuickStart installation instructions. 159 | They assume familiarity with the command line and with installation in general. 160 | As there are different operating systems and many different versions of 161 | operating systems and environments, these may not work on your computer. 162 | If an installation doesn't work for you, please refer to the user guide for the tool, 163 | listed in the table above. 164 | 165 | We have installed software using [Conda](https://conda.io). 166 | Conda is a package manager that simplifies the installation process. 167 | Please first install Conda through the Miniconda installer (see below) before proceeding to the installation of individual tools. 168 | For more information on Miniconda, please refer to the Conda [documentation](https://conda.io/projects/conda/en/latest/user-guide/install/index.html). 
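If you prefer to manage the tools as a single environment rather than installing each one individually with `conda install` (as shown in the sections that follow), the versions from the table above can be collected into a Conda environment file. This is a sketch of our own, not part of the official setup; the file name `environment.yml` and the environment name `dc-genomics` are arbitrary choices.

```yaml
# Hypothetical environment.yml collecting the workshop tools.
# Create the environment with:  conda env create -f environment.yml
# Activate it with:             conda activate dc-genomics
name: dc-genomics
channels:
  - bioconda
  - conda-forge
dependencies:
  - fastqc=0.11.9
  - trimmomatic=0.39
  - bwa=0.7.17
  - samtools=1.9
  - bcftools=1.9
```

Note that IGV is not included here; install it separately following the download link in the table above.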
169 | 170 | ### Conda 171 | 172 | :::::::::::::::: spoiler 173 | 174 | ## Linux 175 | 176 | To install Conda, type: 177 | 178 | ```bash 179 | $ curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh 180 | $ bash Miniconda3-latest-Linux-x86_64.sh 181 | ``` 182 | 183 | Then, follow the instructions that you are prompted with on the screen to install Conda. 184 | 185 | ::::::::::::::::::::::::: 186 | 187 | :::::::::::::::: spoiler 188 | 189 | ## MacOS 190 | 191 | To install Conda, type: 192 | 193 | ```bash 194 | $ curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh 195 | $ bash Miniconda3-latest-MacOSX-x86_64.sh 196 | ``` 197 | 198 | Then, follow the instructions that you are prompted with on the screen to install Conda. 199 | 200 | ::::::::::::::::::::::::: 201 | 202 | ### FastQC 203 | 204 | :::::::::::::::: spoiler 205 | 206 | ## MacOS 207 | 208 | To install FastQC, type: 209 | 210 | ```bash 211 | $ conda install -c bioconda fastqc=0.11.9 212 | ``` 213 | 214 | ::::::::::::::::::::::::: 215 | 216 | :::::::::::::::: spoiler 217 | 218 | ## FastQC Source Code Installation 219 | 220 | If you prefer to install from source, follow the directions below: 221 | 222 | ```bash 223 | $ cd ~/src 224 | $ curl -O http://www.bioinformatics.babraham.ac.uk/projects/fastqc/fastqc_v0.11.9.zip 225 | $ unzip fastqc_v0.11.9.zip 226 | ``` 227 | 228 | Link the fastqc executable to the `~/bin` folder that 229 | you have already added to your PATH. 230 | 231 | ```bash 232 | $ ln -sf ~/src/FastQC/fastqc ~/bin/fastqc 233 | ``` 234 | 235 | Due to what seems to be a packaging error, 236 | the executable flag on the fastqc program is not set, 237 | so we need to set it ourselves. 
238 | 
239 | ```bash
240 | $ chmod +x ~/bin/fastqc
241 | ```
242 | 
243 | :::::::::::::::::::::::::
244 | 
245 | **Test your installation by running:**
246 | 
247 | ```bash
248 | $ fastqc -h
249 | ```
250 | 
251 | ### Trimmomatic
252 | 
253 | :::::::::::::::: spoiler
254 | 
255 | ## MacOS
256 | 
257 | ```bash
258 | $ conda install -c bioconda trimmomatic=0.39
259 | ```
260 | 
261 | :::::::::::::::::::::::::
262 | 
263 | :::::::::::::::: spoiler
264 | 
265 | ## Trimmomatic Source Code Installation
266 | 
267 | If you prefer to install from source, follow the directions below:
268 | 
269 | ```bash
270 | $ cd ~/src
271 | $ curl -O http://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/Trimmomatic-0.39.zip
272 | $ unzip Trimmomatic-0.39.zip
273 | ```
274 | 
275 | The program can be invoked via:
276 | 
277 | ```bash
278 | $ java -jar ~/src/Trimmomatic-0.39/trimmomatic-0.39.jar
279 | ```
280 | 
281 | The ~/src/Trimmomatic-0.39/adapters/ directory contains
282 | Illumina-specific adapter sequences:
283 | 
284 | ```bash
285 | $ ls ~/src/Trimmomatic-0.39/adapters/
286 | ```
287 | 
288 | :::::::::::::::::::::::::
289 | 
290 | **Test your installation by running** (assuming things are installed in ~/src):
291 | 
292 | ```bash
293 | $ java -jar ~/src/Trimmomatic-0.39/trimmomatic-0.39.jar
294 | ```
295 | 
296 | :::::::::::::::: spoiler
297 | 
298 | ## Simplify the Invocation (or Test Your Installation if You Installed with Miniconda3)
299 | 
300 | To simplify the invocation, you can create a wrapper script in the ~/bin folder:
301 | 
302 | ```bash
303 | $ echo '#!/bin/bash' > ~/bin/trimmomatic
304 | $ echo 'java -jar ~/src/Trimmomatic-0.39/trimmomatic-0.39.jar "$@"' >> ~/bin/trimmomatic
305 | $ chmod +x ~/bin/trimmomatic
306 | ```
307 | 
308 | Test your script by running:
309 | 
310 | ```bash
311 | $ trimmomatic
312 | ```
313 | 
314 | :::::::::::::::::::::::::
315 | 
316 | ### BWA
317 | 
318 | :::::::::::::::: spoiler
319 | 
320 | ## MacOS
321 | 
322 | ```bash
323 | $ conda install -c bioconda bwa=0.7.17=ha92aebf_3
324 | ```
325 | 
326 | :::::::::::::::::::::::::
327 | 
328 | :::::::::::::::: spoiler
329 | 
330 | ## BWA Source Code Installation
331 | 
332 | If you prefer to install from source, follow the instructions below:
333 | 
334 | ```bash
335 | $ cd ~/src
336 | $ curl -OL http://sourceforge.net/projects/bio-bwa/files/bwa-0.7.17.tar.bz2
337 | $ tar jxvf bwa-0.7.17.tar.bz2
338 | $ cd bwa-0.7.17
339 | $ make
340 | $ export PATH=~/src/bwa-0.7.17:$PATH
341 | ```
342 | 
343 | :::::::::::::::::::::::::
344 | 
345 | **Test your installation by running:**
346 | 
347 | ```bash
348 | $ bwa
349 | ```
350 | 
351 | ### SAMtools
352 | 
353 | :::::::::::::::: spoiler
354 | 
355 | ## MacOS
356 | 
357 | ```bash
358 | $ conda install -c bioconda samtools=1.9=h8ee4bcc_1
359 | ```
360 | 
361 | :::::::::::::::::::::::::
362 | 
363 | ::::::::::::::::::::::::::::::::::::::::: callout
364 | 
365 | ## SAMtools Versions
366 | 
367 | SAMtools has changed its command-line invocation (for the better),
368 | but this means that many tutorials on the web show older, obsolete usage.
369 | 
370 | Use SAMtools version 1.9 so that the commands we present in these lessons work as shown.
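
If you are not sure which version you currently have, you can check from the command line. This is a small sketch of that check; it assumes a POSIX shell, and note that on very old 0.1.x SAMtools releases the `--version` flag may not be supported:

```bash
# Print the installed SAMtools version (first line of `samtools --version`),
# or a short notice if samtools has not been installed yet.
if command -v samtools >/dev/null 2>&1; then
  samtools --version | head -n 1
else
  echo "samtools not found on PATH"
fi
```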
371 | 
372 | ::::::::::::::::::::::::::::::::::::::::::::::::::
373 | 
374 | :::::::::::::::: spoiler
375 | 
376 | ## SAMtools Source Code Installation
377 | 
378 | If you prefer to install from source, follow the instructions below:
379 | 
380 | ```bash
381 | $ cd ~/src
382 | $ curl -OL https://github.com/samtools/samtools/releases/download/1.9/samtools-1.9.tar.bz2
383 | $ tar jxvf samtools-1.9.tar.bz2
384 | $ cd samtools-1.9
385 | $ make
386 | ```
387 | 
388 | Add the directory to your PATH if necessary:
389 | 
390 | ```bash
391 | $ echo 'export PATH=~/src/samtools-1.9:$PATH' >> ~/.bashrc
392 | $ source ~/.bashrc
393 | ```
394 | 
395 | :::::::::::::::::::::::::
396 | 
397 | **Test your installation by running:**
398 | 
399 | ```bash
400 | $ samtools
401 | ```
402 | 
403 | ### BCFtools
404 | 
405 | :::::::::::::::: spoiler
406 | 
407 | ## MacOS
408 | 
409 | ```bash
410 | $ conda install -c bioconda bcftools=1.9
411 | ```
412 | 
413 | :::::::::::::::::::::::::
414 | 
415 | :::::::::::::::: spoiler
416 | 
417 | ## BCFtools Source Code Installation
418 | 
419 | If you prefer to install from source, follow the instructions below:
420 | 
421 | ```bash
422 | $ cd ~/src
423 | $ curl -OL https://github.com/samtools/bcftools/releases/download/1.9/bcftools-1.9.tar.bz2
424 | $ tar jxvf bcftools-1.9.tar.bz2
425 | $ cd bcftools-1.9
426 | $ make
427 | ```
428 | 
429 | Add the directory to your PATH if necessary:
430 | 
431 | ```bash
432 | $ echo 'export PATH=~/src/bcftools-1.9:$PATH' >> ~/.bashrc
433 | $ source ~/.bashrc
434 | ```
435 | 
436 | :::::::::::::::::::::::::
437 | 
438 | **Test your installation by running:**
439 | 
440 | ```bash
441 | $ bcftools
442 | ```
443 | 
444 | ### IGV
445 | 
446 | - [Download the IGV installation files](https://software.broadinstitute.org/software/igv/download)
447 | - [Install and run IGV using the instructions for your operating system](https://software.broadinstitute.org/software/igv/download).
448 | 449 | 450 | -------------------------------------------------------------------------------- /profiles/learner-profiles.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: FIXME 3 | --- 4 | 5 | This is a placeholder file. Please add content here. 6 | -------------------------------------------------------------------------------- /site/README.md: -------------------------------------------------------------------------------- 1 | This directory contains rendered lesson materials. Please do not edit files 2 | here. 3 | --------------------------------------------------------------------------------