├── .github
│   └── workflows
│       ├── README.md
│       ├── pr-close-signal.yaml
│       ├── pr-comment.yaml
│       ├── pr-post-remove-branch.yaml
│       ├── pr-preflight.yaml
│       ├── pr-receive.yaml
│       ├── sandpaper-main.yaml
│       ├── sandpaper-version.txt
│       ├── update-cache.yaml
│       └── update-workflows.yaml
├── .gitignore
├── .zenodo.json
├── AUTHORS
├── CITATION
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE.md
├── README.md
├── config.yaml
├── fig
│   ├── logging-onto-cloud-new-key-pair_1.png
│   ├── logging-onto-cloud-new-key-pair_2.png
│   ├── logging-onto-cloud-security-group_1.png
│   ├── logging-onto-cloud-security-group_2.png
│   ├── logging-onto-cloud-security-group_3.png
│   ├── logging-onto-cloud-summary.png
│   ├── logging-onto-cloud_1.png
│   ├── logging-onto-cloud_1b.png
│   ├── logging-onto-cloud_2.png
│   ├── logging-onto-cloud_3.png
│   ├── logging-onto-cloud_3b.png
│   ├── logging-onto-cloud_5.png
│   ├── logging-onto-cloud_6.png
│   └── logging-onto-cloud_7.png
├── index.md
├── instructors
│   ├── AMI-setup.md
│   ├── faq.md
│   ├── instructor-notes.md
│   └── teaching_demos.md
├── learners
│   ├── reference.md
│   └── setup.md
├── profiles
│   └── learner-profiles.md
└── site
    └── README.md
--------------------------------------------------------------------------------
/.github/workflows/README.md:
--------------------------------------------------------------------------------
1 | # Carpentries Workflows 2 | 3 | This directory contains workflows to be used for Lessons using the {sandpaper} 4 | lesson infrastructure. Two of these workflows require R (`sandpaper-main.yaml` 5 | and `pr-receive.yaml`) and the rest are bots to handle pull request management. 6 | 7 | These workflows will likely change as {sandpaper} evolves, so it is important to 8 | keep them up-to-date.
To do this in your lesson, you can run the following in your 9 | R console: 10 | 11 | ```r 12 | # Install/Update sandpaper 13 | options(repos = c(carpentries = "https://carpentries.r-universe.dev/", 14 | CRAN = "https://cloud.r-project.org")) 15 | install.packages("sandpaper") 16 | 17 | # update the workflows in your lesson 18 | library("sandpaper") 19 | update_github_workflows() 20 | ``` 21 | 22 | Inside this folder, you will find a file called `sandpaper-version.txt`, which 23 | will contain a version number for sandpaper. This will be used in the future to 24 | alert you if a workflow update is needed. 25 | 26 | What follows are the descriptions of the workflow files: 27 | 28 | ## Deployment 29 | 30 | ### 01 Build and Deploy (sandpaper-main.yaml) 31 | 32 | This is the main driver that will only act on the main branch of the repository. 33 | This workflow does the following: 34 | 35 | 1. checks out the lesson 36 | 2. provisions the following resources 37 | - R 38 | - pandoc 39 | - lesson infrastructure (stored in a cache) 40 | - lesson dependencies if needed (stored in a cache) 41 | 3. builds the lesson via `sandpaper:::ci_deploy()` 42 | 43 | #### Caching 44 | 45 | This workflow has two caches; one cache is for the lesson infrastructure and 46 | the other is for the lesson dependencies if the lesson contains rendered 47 | content. These caches are invalidated by new versions of the infrastructure and 48 | the `renv.lock` file, respectively. If there is a problem with the cache, 49 | manual invalidation is necessary. You will need maintain access to the repository; 50 | you can either go to the Actions tab and [click on the caches button to find 51 | and invalidate the failing cache](https://github.blog/changelog/2022-10-20-manage-caches-in-your-actions-workflows-from-web-interface/) 52 | or set the `CACHE_VERSION` secret to the current date (which will 53 | invalidate all of the caches).
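The `CACHE_VERSION` bump can also be scripted. A minimal sketch, assuming the GitHub CLI (`gh`) is installed and authenticated for the repository (the date format and the fallback message are illustrative choices, not part of the workflows themselves):

```bash
# Invalidate all of the lesson's caches by setting the CACHE_VERSION
# repository secret to today's date, so every cache key changes on the
# next workflow run.
CACHE_VERSION="$(date +%Y-%m-%d)"
if command -v gh >/dev/null 2>&1; then
  # Requires an authenticated gh session with access to this repository.
  gh secret set CACHE_VERSION --body "$CACHE_VERSION"
else
  echo "gh not installed; set CACHE_VERSION=${CACHE_VERSION} manually under Settings > Secrets and variables > Actions"
fi
```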
54 | 55 | ## Updates 56 | 57 | ### Setup Information 58 | 59 | These workflows run on a schedule and at the maintainer's request. Because they 60 | create pull requests that update workflows/require the downstream actions to run, 61 | they need a special repository/organization secret token called 62 | `SANDPAPER_WORKFLOW` and it must have the `public_repo` and `workflow` scopes. 63 | 64 | This can be an individual user token, OR it can be a token from a trusted bot account. If you 65 | have a repository in one of the official Carpentries accounts, then you do not 66 | need to worry about this token being present because the Carpentries Core Team 67 | will take care of supplying this token. 68 | 69 | If you want to use your personal account: you can go to 70 | 71 | to create a token. Once you have created your token, you should copy it to your 72 | clipboard and then go to your repository's settings > secrets > actions and 73 | create or edit the `SANDPAPER_WORKFLOW` secret, pasting in the generated token. 74 | 75 | If you do not specify your token correctly, the runs will not fail; instead, they will 76 | give you instructions to provide the token for your repository. 77 | 78 | ### 02 Maintain: Update Workflow Files (update-workflows.yaml) 79 | 80 | The {sandpaper} repository was designed to do as much as possible to separate 81 | the tools from the content. For local builds, this is absolutely true, but 82 | there is a minor issue when it comes to workflow files: they must live inside 83 | the repository. 84 | 85 | This workflow ensures that the workflow files are up-to-date. The way it works is 86 | to download the update-workflows.sh script from GitHub and run it. The script 87 | will do the following: 88 | 89 | 1. check the recorded version of sandpaper against the current version on GitHub 90 | 2.
update the files if there is a difference in versions 91 | 92 | After the files are updated, if there are any changes, they are pushed to a 93 | branch called `update/workflows` and a pull request is created. Maintainers are 94 | encouraged to review the changes and accept the pull request if the outputs 95 | are okay. 96 | 97 | This update is run weekly or on demand. 98 | 99 | ### 03 Maintain: Update Package Cache (update-cache.yaml) 100 | 101 | For lessons that have generated content, we use {renv} to ensure that the output 102 | is stable. This is controlled by a single lockfile which documents the packages 103 | needed for the lesson and their version numbers. This workflow is skipped in 104 | lessons that do not have generated content. 105 | 106 | Because the lessons need to remain current with the package ecosystem, it's a 107 | good idea to make sure these packages can be updated periodically. The 108 | update cache workflow will do this by checking for updates, applying them in a 109 | branch called `update/packages` and creating a pull request with _only the 110 | lockfile changed_. 111 | 112 | From here, the markdown documents will be rebuilt and you can inspect what has 113 | changed based on how the packages have updated. 114 | 115 | ## Pull Request and Review Management 116 | 117 | Because our lessons execute code, pull requests are a security risk for any 118 | lesson and thus have security measures associated with them.
**Do not merge any 119 | pull request that does not pass checks or that does not have bot comments on it.** 120 | 121 | These workflows all go together and are described in the following 122 | diagram and the sections below: 123 | 124 | ![Graph representation of a pull request](https://carpentries.github.io/sandpaper/articles/img/pr-flow.dot.svg) 125 | 126 | ### Pre Flight Pull Request Validation (pr-preflight.yaml) 127 | 128 | This workflow runs every time a pull request is created and its purpose is to 129 | validate that the pull request is okay to run. This means checking the following things: 130 | 131 | 1. The pull request does not contain modified workflow files 132 | 2. If the pull request contains modified workflow files, it does not contain 133 | modified content files (such as a situation where @carpentries-bot will 134 | make an automated pull request) 135 | 3. The pull request does not contain an invalid commit hash (e.g. from a fork 136 | that was made before a lesson was transitioned from styles to use the 137 | workbench). 138 | 139 | Once the checks are finished, a comment is issued to the pull request, which 140 | will allow maintainers to determine if it is safe to run the 141 | "Receive Pull Request" workflow from new contributors. 142 | 143 | ### Receive Pull Request (pr-receive.yaml) 144 | 145 | **Note of caution:** This workflow runs arbitrary code submitted by anyone who creates a 146 | pull request. GitHub has safeguarded the token used in this workflow to have no 147 | privileges in the repository, and we have taken additional precautions to protect against 148 | spoofing. 149 | 150 | This workflow is triggered with every push to a pull request. If this workflow 151 | is already running and a new push is sent to the pull request, the workflow 152 | running from the previous push will be cancelled and a new workflow run will be 153 | started. 154 | 155 | The first step of this workflow is to check if it is valid (e.g.
that no 156 | workflow files have been modified). If there are workflow files that have been 157 | modified, a comment is made indicating that the workflow will not be run. If 158 | both a workflow file and lesson content are modified, an error will occur. 159 | 160 | The second step (if valid) is to build the generated content from the pull 161 | request. This builds the content and uploads three artifacts: 162 | 163 | 1. The pull request number (pr) 164 | 2. A summary of changes after the rendering process (diff) 165 | 3. The rendered files (build) 166 | 167 | Because this workflow builds generated content, it follows the same general 168 | process as the `sandpaper-main` workflow with the same caching mechanisms. 169 | 170 | The artifacts produced are used by the next workflow. 171 | 172 | ### Comment on Pull Request (pr-comment.yaml) 173 | 174 | This workflow is triggered if the `pr-receive.yaml` workflow is successful. 175 | The steps in this workflow are: 176 | 177 | 1. Test if the workflow is valid and comment the validity of the workflow on the 178 | pull request. 179 | 2. If it is valid: create an orphan branch with two commits: the current state 180 | of the repository and the proposed changes. 181 | 3. If it is valid: update the pull request comment with the summary of changes. 182 | 183 | Importantly: if the pull request is invalid, the branch is not created, so any 184 | malicious code is not published. 185 | 186 | From here, the maintainer can request changes from the author and eventually 187 | either merge or reject the PR. When this happens, if the PR was valid, the 188 | preview branch needs to be deleted. 189 | 190 | ### Send Close PR Signal (pr-close-signal.yaml) 191 | 192 | Triggered any time a pull request is closed. This emits an artifact containing the 193 | pull request number for the next action. 194 | 195 | ### Remove Pull Request Branch (pr-post-remove-branch.yaml) 196 | 197 | Triggered by `pr-close-signal.yaml`.
This removes the temporary branch associated with 198 | the pull request (if it was created). 199 | -------------------------------------------------------------------------------- /.github/workflows/pr-close-signal.yaml: -------------------------------------------------------------------------------- 1 | name: "Bot: Send Close Pull Request Signal" 2 | 3 | on: 4 | pull_request: 5 | types: 6 | [closed] 7 | 8 | jobs: 9 | send-close-signal: 10 | name: "Send closing signal" 11 | runs-on: ubuntu-22.04 12 | if: ${{ github.event.action == 'closed' }} 13 | steps: 14 | - name: "Create PRtifact" 15 | run: | 16 | mkdir -p ./pr 17 | printf ${{ github.event.number }} > ./pr/NUM 18 | - name: Upload Diff 19 | uses: actions/upload-artifact@v4 20 | with: 21 | name: pr 22 | path: ./pr 23 | -------------------------------------------------------------------------------- /.github/workflows/pr-comment.yaml: -------------------------------------------------------------------------------- 1 | name: "Bot: Comment on the Pull Request" 2 | 3 | # read-write repo token 4 | # access to secrets 5 | on: 6 | workflow_run: 7 | workflows: ["Receive Pull Request"] 8 | types: 9 | - completed 10 | 11 | concurrency: 12 | group: pr-${{ github.event.workflow_run.pull_requests[0].number }} 13 | cancel-in-progress: true 14 | 15 | 16 | jobs: 17 | # Pull requests are valid if: 18 | # - they match the sha of the workflow run head commit 19 | # - they are open 20 | # - no .github files were committed 21 | test-pr: 22 | name: "Test if pull request is valid" 23 | runs-on: ubuntu-22.04 24 | if: > 25 | github.event.workflow_run.event == 'pull_request' && 26 | github.event.workflow_run.conclusion == 'success' 27 | outputs: 28 | is_valid: ${{ steps.check-pr.outputs.VALID }} 29 | payload: ${{ steps.check-pr.outputs.payload }} 30 | number: ${{ steps.get-pr.outputs.NUM }} 31 | msg: ${{ steps.check-pr.outputs.MSG }} 32 | steps: 33 | - name: 'Download PR artifact' 34 | id: dl 35 | uses: 
carpentries/actions/download-workflow-artifact@main 36 | with: 37 | run: ${{ github.event.workflow_run.id }} 38 | name: 'pr' 39 | 40 | - name: "Get PR Number" 41 | if: ${{ steps.dl.outputs.success == 'true' }} 42 | id: get-pr 43 | run: | 44 | unzip pr.zip 45 | echo "NUM=$(<./NR)" >> $GITHUB_OUTPUT 46 | 47 | - name: "Fail if PR number was not present" 48 | id: bad-pr 49 | if: ${{ steps.dl.outputs.success != 'true' }} 50 | run: | 51 | echo '::error::A pull request number was not recorded. The pull request that triggered this workflow is likely malicious.' 52 | exit 1 53 | - name: "Get Invalid Hashes File" 54 | id: hash 55 | run: | 56 | echo "json<> $GITHUB_OUTPUT 59 | - name: "Check PR" 60 | id: check-pr 61 | if: ${{ steps.dl.outputs.success == 'true' }} 62 | uses: carpentries/actions/check-valid-pr@main 63 | with: 64 | pr: ${{ steps.get-pr.outputs.NUM }} 65 | sha: ${{ github.event.workflow_run.head_sha }} 66 | headroom: 3 # if it's within the last three commits, we can keep going, because it's likely rapid-fire 67 | invalid: ${{ fromJSON(steps.hash.outputs.json)[github.repository] }} 68 | fail_on_error: true 69 | 70 | # Create an orphan branch on this repository with two commits 71 | # - the current HEAD of the md-outputs branch 72 | # - the output from running the current HEAD of the pull request through 73 | # the md generator 74 | create-branch: 75 | name: "Create Git Branch" 76 | needs: test-pr 77 | runs-on: ubuntu-22.04 78 | if: ${{ needs.test-pr.outputs.is_valid == 'true' }} 79 | env: 80 | NR: ${{ needs.test-pr.outputs.number }} 81 | permissions: 82 | contents: write 83 | steps: 84 | - name: 'Checkout md outputs' 85 | uses: actions/checkout@v4 86 | with: 87 | ref: md-outputs 88 | path: built 89 | fetch-depth: 1 90 | 91 | - name: 'Download built markdown' 92 | id: dl 93 | uses: carpentries/actions/download-workflow-artifact@main 94 | with: 95 | run: ${{ github.event.workflow_run.id }} 96 | name: 'built' 97 | 98 | - if: ${{ steps.dl.outputs.success == 'true' }} 
99 | run: unzip built.zip 100 | 101 | - name: "Create orphan and push" 102 | if: ${{ steps.dl.outputs.success == 'true' }} 103 | run: | 104 | cd built/ 105 | git config --local user.email "actions@github.com" 106 | git config --local user.name "GitHub Actions" 107 | CURR_HEAD=$(git rev-parse HEAD) 108 | git checkout --orphan md-outputs-PR-${NR} 109 | git add -A 110 | git commit -m "source commit: ${CURR_HEAD}" 111 | ls -A | grep -v '^.git$' | xargs -I _ rm -r '_' 112 | cd .. 113 | unzip -o -d built built.zip 114 | cd built 115 | git add -A 116 | git commit --allow-empty -m "differences for PR #${NR}" 117 | git push -u --force --set-upstream origin md-outputs-PR-${NR} 118 | 119 | # Comment on the Pull Request with a link to the branch and the diff 120 | comment-pr: 121 | name: "Comment on Pull Request" 122 | needs: [test-pr, create-branch] 123 | runs-on: ubuntu-22.04 124 | if: ${{ needs.test-pr.outputs.is_valid == 'true' }} 125 | env: 126 | NR: ${{ needs.test-pr.outputs.number }} 127 | permissions: 128 | pull-requests: write 129 | steps: 130 | - name: 'Download comment artifact' 131 | id: dl 132 | uses: carpentries/actions/download-workflow-artifact@main 133 | with: 134 | run: ${{ github.event.workflow_run.id }} 135 | name: 'diff' 136 | 137 | - if: ${{ steps.dl.outputs.success == 'true' }} 138 | run: unzip ${{ github.workspace }}/diff.zip 139 | 140 | - name: "Comment on PR" 141 | id: comment-diff 142 | if: ${{ steps.dl.outputs.success == 'true' }} 143 | uses: carpentries/actions/comment-diff@main 144 | with: 145 | pr: ${{ env.NR }} 146 | path: ${{ github.workspace }}/diff.md 147 | 148 | # Comment if the PR is open and matches the SHA, but the workflow files have 149 | # changed 150 | comment-changed-workflow: 151 | name: "Comment if workflow files have changed" 152 | needs: test-pr 153 | runs-on: ubuntu-22.04 154 | if: ${{ always() && needs.test-pr.outputs.is_valid == 'false' }} 155 | env: 156 | NR: ${{ github.event.workflow_run.pull_requests[0].number }} 157 | 
body: ${{ needs.test-pr.outputs.msg }} 158 | permissions: 159 | pull-requests: write 160 | steps: 161 | - name: 'Check for spoofing' 162 | id: dl 163 | uses: carpentries/actions/download-workflow-artifact@main 164 | with: 165 | run: ${{ github.event.workflow_run.id }} 166 | name: 'built' 167 | 168 | - name: 'Alert if spoofed' 169 | id: spoof 170 | if: ${{ steps.dl.outputs.success == 'true' }} 171 | run: | 172 | echo 'body<> $GITHUB_ENV 173 | echo '' >> $GITHUB_ENV 174 | echo '## :x: DANGER :x:' >> $GITHUB_ENV 175 | echo 'This pull request has modified workflows that created output. Close this now.' >> $GITHUB_ENV 176 | echo '' >> $GITHUB_ENV 177 | echo 'EOF' >> $GITHUB_ENV 178 | 179 | - name: "Comment on PR" 180 | id: comment-diff 181 | uses: carpentries/actions/comment-diff@main 182 | with: 183 | pr: ${{ env.NR }} 184 | body: ${{ env.body }} 185 | -------------------------------------------------------------------------------- /.github/workflows/pr-post-remove-branch.yaml: -------------------------------------------------------------------------------- 1 | name: "Bot: Remove Temporary PR Branch" 2 | 3 | on: 4 | workflow_run: 5 | workflows: ["Bot: Send Close Pull Request Signal"] 6 | types: 7 | - completed 8 | 9 | jobs: 10 | delete: 11 | name: "Delete branch from Pull Request" 12 | runs-on: ubuntu-22.04 13 | if: > 14 | github.event.workflow_run.event == 'pull_request' && 15 | github.event.workflow_run.conclusion == 'success' 16 | permissions: 17 | contents: write 18 | steps: 19 | - name: 'Download artifact' 20 | uses: carpentries/actions/download-workflow-artifact@main 21 | with: 22 | run: ${{ github.event.workflow_run.id }} 23 | name: pr 24 | - name: "Get PR Number" 25 | id: get-pr 26 | run: | 27 | unzip pr.zip 28 | echo "NUM=$(<./NUM)" >> $GITHUB_OUTPUT 29 | - name: 'Remove branch' 30 | uses: carpentries/actions/remove-branch@main 31 | with: 32 | pr: ${{ steps.get-pr.outputs.NUM }} 33 | 
-------------------------------------------------------------------------------- /.github/workflows/pr-preflight.yaml: -------------------------------------------------------------------------------- 1 | name: "Pull Request Preflight Check" 2 | 3 | on: 4 | pull_request_target: 5 | branches: 6 | ["main"] 7 | types: 8 | ["opened", "synchronize", "reopened"] 9 | 10 | jobs: 11 | test-pr: 12 | name: "Test if pull request is valid" 13 | if: ${{ github.event.action != 'closed' }} 14 | runs-on: ubuntu-22.04 15 | outputs: 16 | is_valid: ${{ steps.check-pr.outputs.VALID }} 17 | permissions: 18 | pull-requests: write 19 | steps: 20 | - name: "Get Invalid Hashes File" 21 | id: hash 22 | run: | 23 | echo "json<> $GITHUB_OUTPUT 26 | - name: "Check PR" 27 | id: check-pr 28 | uses: carpentries/actions/check-valid-pr@main 29 | with: 30 | pr: ${{ github.event.number }} 31 | invalid: ${{ fromJSON(steps.hash.outputs.json)[github.repository] }} 32 | fail_on_error: true 33 | - name: "Comment result of validation" 34 | id: comment-diff 35 | if: ${{ always() }} 36 | uses: carpentries/actions/comment-diff@main 37 | with: 38 | pr: ${{ github.event.number }} 39 | body: ${{ steps.check-pr.outputs.MSG }} 40 | -------------------------------------------------------------------------------- /.github/workflows/pr-receive.yaml: -------------------------------------------------------------------------------- 1 | name: "Receive Pull Request" 2 | 3 | on: 4 | pull_request: 5 | types: 6 | [opened, synchronize, reopened] 7 | 8 | concurrency: 9 | group: ${{ github.ref }} 10 | cancel-in-progress: true 11 | 12 | jobs: 13 | test-pr: 14 | name: "Record PR number" 15 | if: ${{ github.event.action != 'closed' }} 16 | runs-on: ubuntu-22.04 17 | outputs: 18 | is_valid: ${{ steps.check-pr.outputs.VALID }} 19 | steps: 20 | - name: "Record PR number" 21 | id: record 22 | if: ${{ always() }} 23 | run: | 24 | echo ${{ github.event.number }} > ${{ github.workspace }}/NR # 2022-03-02: artifact name fixed to be NR 25 | 
- name: "Upload PR number" 26 | id: upload 27 | if: ${{ always() }} 28 | uses: actions/upload-artifact@v4 29 | with: 30 | name: pr 31 | path: ${{ github.workspace }}/NR 32 | - name: "Get Invalid Hashes File" 33 | id: hash 34 | run: | 35 | echo "json<> $GITHUB_OUTPUT 38 | - name: "echo output" 39 | run: | 40 | echo "${{ steps.hash.outputs.json }}" 41 | - name: "Check PR" 42 | id: check-pr 43 | uses: carpentries/actions/check-valid-pr@main 44 | with: 45 | pr: ${{ github.event.number }} 46 | invalid: ${{ fromJSON(steps.hash.outputs.json)[github.repository] }} 47 | 48 | build-md-source: 49 | name: "Build markdown source files if valid" 50 | needs: test-pr 51 | runs-on: ubuntu-22.04 52 | if: ${{ needs.test-pr.outputs.is_valid == 'true' }} 53 | env: 54 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 55 | RENV_PATHS_ROOT: ~/.local/share/renv/ 56 | CHIVE: ${{ github.workspace }}/site/chive 57 | PR: ${{ github.workspace }}/site/pr 58 | MD: ${{ github.workspace }}/site/built 59 | steps: 60 | - name: "Check Out Main Branch" 61 | uses: actions/checkout@v4 62 | 63 | - name: "Check Out Staging Branch" 64 | uses: actions/checkout@v4 65 | with: 66 | ref: md-outputs 67 | path: ${{ env.MD }} 68 | 69 | - name: "Set up R" 70 | uses: r-lib/actions/setup-r@v2 71 | with: 72 | use-public-rspm: true 73 | install-r: false 74 | 75 | - name: "Set up Pandoc" 76 | uses: r-lib/actions/setup-pandoc@v2 77 | 78 | - name: "Setup Lesson Engine" 79 | uses: carpentries/actions/setup-sandpaper@main 80 | with: 81 | cache-version: ${{ secrets.CACHE_VERSION }} 82 | 83 | - name: "Setup Package Cache" 84 | uses: carpentries/actions/setup-lesson-deps@main 85 | with: 86 | cache-version: ${{ secrets.CACHE_VERSION }} 87 | 88 | - name: "Validate and Build Markdown" 89 | id: build-site 90 | run: | 91 | sandpaper::package_cache_trigger(TRUE) 92 | sandpaper::validate_lesson(path = '${{ github.workspace }}') 93 | sandpaper:::build_markdown(path = '${{ github.workspace }}', quiet = FALSE) 94 | shell: Rscript {0} 95 | 96 | - 
name: "Generate Artifacts" 97 | id: generate-artifacts 98 | run: | 99 | sandpaper:::ci_bundle_pr_artifacts( 100 | repo = '${{ github.repository }}', 101 | pr_number = '${{ github.event.number }}', 102 | path_md = '${{ env.MD }}', 103 | path_pr = '${{ env.PR }}', 104 | path_archive = '${{ env.CHIVE }}', 105 | branch = 'md-outputs' 106 | ) 107 | shell: Rscript {0} 108 | 109 | - name: "Upload PR" 110 | uses: actions/upload-artifact@v4 111 | with: 112 | name: pr 113 | path: ${{ env.PR }} 114 | overwrite: true 115 | 116 | - name: "Upload Diff" 117 | uses: actions/upload-artifact@v4 118 | with: 119 | name: diff 120 | path: ${{ env.CHIVE }} 121 | retention-days: 1 122 | 123 | - name: "Upload Build" 124 | uses: actions/upload-artifact@v4 125 | with: 126 | name: built 127 | path: ${{ env.MD }} 128 | retention-days: 1 129 | 130 | - name: "Teardown" 131 | run: sandpaper::reset_site() 132 | shell: Rscript {0} 133 | -------------------------------------------------------------------------------- /.github/workflows/sandpaper-main.yaml: -------------------------------------------------------------------------------- 1 | name: "01 Build and Deploy Site" 2 | 3 | on: 4 | push: 5 | branches: 6 | - main 7 | - master 8 | schedule: 9 | - cron: '0 0 * * 2' 10 | workflow_dispatch: 11 | inputs: 12 | name: 13 | description: 'Who triggered this build?' 
14 | required: true 15 | default: 'Maintainer (via GitHub)' 16 | reset: 17 | description: 'Reset cached markdown files' 18 | required: false 19 | default: false 20 | type: boolean 21 | jobs: 22 | full-build: 23 | name: "Build Full Site" 24 | 25 | # 2024-10-01: ubuntu-latest is now 24.04 and R is not installed by default in the runner image 26 | # pin to 22.04 for now 27 | runs-on: ubuntu-22.04 28 | permissions: 29 | checks: write 30 | contents: write 31 | pages: write 32 | env: 33 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 34 | RENV_PATHS_ROOT: ~/.local/share/renv/ 35 | steps: 36 | 37 | - name: "Checkout Lesson" 38 | uses: actions/checkout@v4 39 | 40 | - name: "Set up R" 41 | uses: r-lib/actions/setup-r@v2 42 | with: 43 | use-public-rspm: true 44 | install-r: false 45 | 46 | - name: "Set up Pandoc" 47 | uses: r-lib/actions/setup-pandoc@v2 48 | 49 | - name: "Setup Lesson Engine" 50 | uses: carpentries/actions/setup-sandpaper@main 51 | with: 52 | cache-version: ${{ secrets.CACHE_VERSION }} 53 | 54 | - name: "Setup Package Cache" 55 | uses: carpentries/actions/setup-lesson-deps@main 56 | with: 57 | cache-version: ${{ secrets.CACHE_VERSION }} 58 | 59 | - name: "Deploy Site" 60 | run: | 61 | reset <- "${{ github.event.inputs.reset }}" == "true" 62 | sandpaper::package_cache_trigger(TRUE) 63 | sandpaper:::ci_deploy(reset = reset) 64 | shell: Rscript {0} 65 | -------------------------------------------------------------------------------- /.github/workflows/sandpaper-version.txt: -------------------------------------------------------------------------------- 1 | 0.16.9 2 | -------------------------------------------------------------------------------- /.github/workflows/update-cache.yaml: -------------------------------------------------------------------------------- 1 | name: "03 Maintain: Update Package Cache" 2 | 3 | on: 4 | workflow_dispatch: 5 | inputs: 6 | name: 7 | description: 'Who triggered this build (enter github username to tag yourself)?' 
8 | required: true 9 | default: 'monthly run' 10 | schedule: 11 | # Run every tuesday 12 | - cron: '0 0 * * 2' 13 | 14 | jobs: 15 | preflight: 16 | name: "Preflight Check" 17 | runs-on: ubuntu-22.04 18 | outputs: 19 | ok: ${{ steps.check.outputs.ok }} 20 | steps: 21 | - id: check 22 | run: | 23 | if [[ ${{ github.event_name }} == 'workflow_dispatch' ]]; then 24 | echo "ok=true" >> $GITHUB_OUTPUT 25 | echo "Running on request" 26 | # using single brackets here to avoid 08 being interpreted as octal 27 | # https://github.com/carpentries/sandpaper/issues/250 28 | elif [ `date +%d` -le 7 ]; then 29 | # If the Tuesday lands in the first week of the month, run it 30 | echo "ok=true" >> $GITHUB_OUTPUT 31 | echo "Running on schedule" 32 | else 33 | echo "ok=false" >> $GITHUB_OUTPUT 34 | echo "Not Running Today" 35 | fi 36 | 37 | check_renv: 38 | name: "Check if We Need {renv}" 39 | runs-on: ubuntu-22.04 40 | needs: preflight 41 | if: ${{ needs.preflight.outputs.ok == 'true'}} 42 | outputs: 43 | needed: ${{ steps.renv.outputs.exists }} 44 | steps: 45 | - name: "Checkout Lesson" 46 | uses: actions/checkout@v4 47 | - id: renv 48 | run: | 49 | if [[ -d renv ]]; then 50 | echo "exists=true" >> $GITHUB_OUTPUT 51 | fi 52 | 53 | check_token: 54 | name: "Check SANDPAPER_WORKFLOW token" 55 | runs-on: ubuntu-22.04 56 | needs: check_renv 57 | if: ${{ needs.check_renv.outputs.needed == 'true' }} 58 | outputs: 59 | workflow: ${{ steps.validate.outputs.wf }} 60 | repo: ${{ steps.validate.outputs.repo }} 61 | steps: 62 | - name: "validate token" 63 | id: validate 64 | uses: carpentries/actions/check-valid-credentials@main 65 | with: 66 | token: ${{ secrets.SANDPAPER_WORKFLOW }} 67 | 68 | update_cache: 69 | name: "Update Package Cache" 70 | needs: check_token 71 | if: ${{ needs.check_token.outputs.repo== 'true' }} 72 | runs-on: ubuntu-22.04 73 | env: 74 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 75 | RENV_PATHS_ROOT: ~/.local/share/renv/ 76 | steps: 77 | 78 | - name: "Checkout Lesson" 79 | 
uses: actions/checkout@v4 80 | 81 | - name: "Set up R" 82 | uses: r-lib/actions/setup-r@v2 83 | with: 84 | use-public-rspm: true 85 | install-r: false 86 | 87 | - name: "Update {renv} deps and determine if a PR is needed" 88 | id: update 89 | uses: carpentries/actions/update-lockfile@main 90 | with: 91 | cache-version: ${{ secrets.CACHE_VERSION }} 92 | 93 | - name: Create Pull Request 94 | id: cpr 95 | if: ${{ steps.update.outputs.n > 0 }} 96 | uses: carpentries/create-pull-request@main 97 | with: 98 | token: ${{ secrets.SANDPAPER_WORKFLOW }} 99 | delete-branch: true 100 | branch: "update/packages" 101 | commit-message: "[actions] update ${{ steps.update.outputs.n }} packages" 102 | title: "Update ${{ steps.update.outputs.n }} packages" 103 | body: | 104 | :robot: This is an automated build 105 | 106 | This will update ${{ steps.update.outputs.n }} packages in your lesson with the following versions: 107 | 108 | ``` 109 | ${{ steps.update.outputs.report }} 110 | ``` 111 | 112 | :stopwatch: In a few minutes, a comment will appear that will show you how the output has changed based on these updates. 113 | 114 | If you want to inspect these changes locally, you can use the following code to check out a new branch: 115 | 116 | ```bash 117 | git fetch origin update/packages 118 | git checkout update/packages 119 | ``` 120 | 121 | - Auto-generated by [create-pull-request][1] on ${{ steps.update.outputs.date }} 122 | 123 | [1]: https://github.com/carpentries/create-pull-request/tree/main 124 | labels: "type: package cache" 125 | draft: false 126 | -------------------------------------------------------------------------------- /.github/workflows/update-workflows.yaml: -------------------------------------------------------------------------------- 1 | name: "02 Maintain: Update Workflow Files" 2 | 3 | on: 4 | workflow_dispatch: 5 | inputs: 6 | name: 7 | description: 'Who triggered this build (enter github username to tag yourself)?' 
8 | required: true 9 | default: 'weekly run' 10 | clean: 11 | description: 'Workflow files/file extensions to clean (no wildcards, enter "" for none)' 12 | required: false 13 | default: '.yaml' 14 | schedule: 15 | # Run every Tuesday 16 | - cron: '0 0 * * 2' 17 | 18 | jobs: 19 | check_token: 20 | name: "Check SANDPAPER_WORKFLOW token" 21 | runs-on: ubuntu-22.04 22 | outputs: 23 | workflow: ${{ steps.validate.outputs.wf }} 24 | repo: ${{ steps.validate.outputs.repo }} 25 | steps: 26 | - name: "validate token" 27 | id: validate 28 | uses: carpentries/actions/check-valid-credentials@main 29 | with: 30 | token: ${{ secrets.SANDPAPER_WORKFLOW }} 31 | 32 | update_workflow: 33 | name: "Update Workflow" 34 | runs-on: ubuntu-22.04 35 | needs: check_token 36 | if: ${{ needs.check_token.outputs.workflow == 'true' }} 37 | steps: 38 | - name: "Checkout Repository" 39 | uses: actions/checkout@v4 40 | 41 | - name: Update Workflows 42 | id: update 43 | uses: carpentries/actions/update-workflows@main 44 | with: 45 | clean: ${{ github.event.inputs.clean }} 46 | 47 | - name: Create Pull Request 48 | id: cpr 49 | if: "${{ steps.update.outputs.new }}" 50 | uses: carpentries/create-pull-request@main 51 | with: 52 | token: ${{ secrets.SANDPAPER_WORKFLOW }} 53 | delete-branch: true 54 | branch: "update/workflows" 55 | commit-message: "[actions] update sandpaper workflow to version ${{ steps.update.outputs.new }}" 56 | title: "Update Workflows to Version ${{ steps.update.outputs.new }}" 57 | body: | 58 | :robot: This is an automated build 59 | 60 | Update Workflows from sandpaper version ${{ steps.update.outputs.old }} -> ${{ steps.update.outputs.new }} 61 | 62 | - Auto-generated by [create-pull-request][1] on ${{ steps.update.outputs.date }} 63 | 64 | [1]: https://github.com/carpentries/create-pull-request/tree/main 65 | labels: "type: template and tools" 66 | draft: false 67 | -------------------------------------------------------------------------------- /.gitignore: 
-------------------------------------------------------------------------------- 1 | # sandpaper files 2 | episodes/*html 3 | site/* 4 | !site/README.md 5 | 6 | # History files 7 | .Rhistory 8 | .Rapp.history 9 | # Session Data files 10 | .RData 11 | # User-specific files 12 | .Ruserdata 13 | # Example code in package build process 14 | *-Ex.R 15 | # Output files from R CMD build 16 | /*.tar.gz 17 | # Output files from R CMD check 18 | /*.Rcheck/ 19 | # RStudio files 20 | .Rproj.user/ 21 | # produced vignettes 22 | vignettes/*.html 23 | vignettes/*.pdf 24 | # OAuth2 token, see https://github.com/hadley/httr/releases/tag/v0.3 25 | .httr-oauth 26 | # knitr and R markdown default cache directories 27 | *_cache/ 28 | /cache/ 29 | # Temporary files created by R markdown 30 | *.utf8.md 31 | *.knit.md 32 | # R Environment Variables 33 | .Renviron 34 | # pkgdown site 35 | docs/ 36 | # translation temp files 37 | po/*~ 38 | # renv detritus 39 | renv/sandbox/ 40 | *.pyc 41 | *~ 42 | .DS_Store 43 | .ipynb_checkpoints 44 | .sass-cache 45 | __pycache__ 46 | _site 47 | .Rproj.user 48 | -------------------------------------------------------------------------------- /.zenodo.json: -------------------------------------------------------------------------------- 1 | { 2 | "contributors": [ 3 | { 4 | "type": "Editor", 5 | "name": "Anuj Guruacharya" 6 | }, 7 | { 8 | "type": "Editor", 9 | "name": "Travis Wrightsman" 10 | } 11 | ], 12 | "creators": [ 13 | { 14 | "name": "Erin Alison Becker", 15 | "orcid": "0000-0002-6832-0233" 16 | }, 17 | { 18 | "name": "Sarah LR Stevens", 19 | "orcid": "0000-0002-7040-548X" 20 | }, 21 | { 22 | "name": "Bianca Peterson" 23 | }, 24 | { 25 | "name": "Jake Cowper Szamosi", 26 | "orcid": "0000-0003-2106-0072" 27 | }, 28 | { 29 | "name": "Fotis E. 
Psomopoulos", 30 | "orcid": "0000-0002-0222-4273" 31 | }, 32 | { 33 | "name": "Travis Wrightsman", 34 | "orcid": "0000-0002-0904-6473" 35 | }, 36 | { 37 | "name": "Karen Word", 38 | "orcid": "0000-0002-7294-7231" 39 | }, 40 | { 41 | "name": "Murray Cadzow", 42 | "orcid": "0000-0002-2299-4136" 43 | }, 44 | { 45 | "name": "Sam Nooij", 46 | "orcid": "0000-0001-5892-5637" 47 | }, 48 | { 49 | "name": "Sangram Keshari Sahu", 50 | "orcid": "0000-0001-5010-9539" 51 | }, 52 | { 53 | "name": "Sarah M Brown", 54 | "orcid": "0000-0001-5728-0822" 55 | }, 56 | { 57 | "name": "Stephen Tahan" 58 | }, 59 | { 60 | "name": "Umar Ahmad" 61 | }, 62 | { 63 | "name": "Valerie Gartner" 64 | }, 65 | { 66 | "name": "Annajiat Alim Rasel", 67 | "orcid": "0000-0003-0198-3734" 68 | }, 69 | { 70 | "name": "Daniel Kerchner", 71 | "orcid": "0000-0002-5921-2193" 72 | }, 73 | { 74 | "name": "rosemm" 75 | } 76 | ], 77 | "license": { 78 | "id": "CC-BY-4.0" 79 | } 80 | } 81 | -------------------------------------------------------------------------------- /AUTHORS: -------------------------------------------------------------------------------- 1 | FIXME: list authors' names and email addresses. 2 | -------------------------------------------------------------------------------- /CITATION: -------------------------------------------------------------------------------- 1 | Please cite as: 2 | 3 | Erin Alison Becker, Tracy Teal, François Michonneau, Maneesha Sane, Taylor Reiter, Jason Williams, et al. (2019, June). 4 | datacarpentry/genomics-workshop: Data Carpentry: Genomics Workshop Overview, June 2019 (Version v2019.06.1). 5 | Zenodo. 
http://doi.org/10.5281/zenodo.3260309 6 | -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Contributor Code of Conduct" 3 | --- 4 | 5 | As contributors and maintainers of this project, 6 | we pledge to follow the [The Carpentries Code of Conduct][coc]. 7 | 8 | Instances of abusive, harassing, or otherwise unacceptable behavior 9 | may be reported by following our [reporting guidelines][coc-reporting]. 10 | 11 | [coc]: https://docs.carpentries.org/topic_folders/policies/code-of-conduct.html 12 | [coc-reporting]: https://docs.carpentries.org/topic_folders/policies/incident-reporting.html 13 | 14 | 15 | 16 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | ## Contributing 2 | 3 | [The Carpentries][cp-site] ([Software Carpentry][swc-site], [Data 4 | Carpentry][dc-site], and [Library Carpentry][lc-site]) are open source 5 | projects, and we welcome contributions of all kinds: new lessons, fixes to 6 | existing material, bug reports, and reviews of proposed changes are all 7 | welcome. 8 | 9 | ### Contributor Agreement 10 | 11 | By contributing, you agree that we may redistribute your work under [our 12 | license](LICENSE.md). In exchange, we will address your issues and/or assess 13 | your change proposal as promptly as we can, and help you become a member of our 14 | community. Everyone involved in [The Carpentries][cp-site] agrees to abide by 15 | our [code of conduct](CODE_OF_CONDUCT.md). 16 | 17 | ### How to Contribute 18 | 19 | The easiest way to get started is to file an issue to tell us about a spelling 20 | mistake, some awkward wording, or a factual error. This is a good way to 21 | introduce yourself and to meet some of our community members. 22 | 23 | 1. 
If you do not have a [GitHub][github] account, you can [send us comments by 24 | email][contact]. However, we will be able to respond more quickly if you use 25 | one of the other methods described below. 26 | 27 | 2. If you have a [GitHub][github] account, or are willing to [create 28 | one][github-join], but do not know how to use Git, you can report problems 29 | or suggest improvements by [creating an issue][repo-issues]. This allows us 30 | to assign the item to someone and to respond to it in a threaded discussion. 31 | 32 | 3. If you are comfortable with Git, and would like to add or change material, 33 | you can submit a pull request (PR). Instructions for doing this are 34 | [included below](#using-github). For inspiration about changes that need to 35 | be made, check out the [list of open issues][issues] across the Carpentries. 36 | 37 | Note: if you want to build the website locally, please refer to [The Workbench 38 | documentation][template-doc]. 39 | 40 | ### Where to Contribute 41 | 42 | 1. If you wish to change this lesson, add issues and pull requests here. 43 | 2. If you wish to change the template used for workshop websites, please refer 44 | to [The Workbench documentation][template-doc]. 45 | 46 | ### What to Contribute 47 | 48 | There are many ways to contribute, from writing new exercises and improving 49 | existing ones to updating or filling in the documentation and submitting [bug 50 | reports][issues] about things that do not work, are not clear, or are missing. 51 | If you are looking for ideas, please see [the list of issues for this 52 | repository][repo-issues], or the issues for [Data Carpentry][dc-issues], 53 | [Library Carpentry][lc-issues], and [Software Carpentry][swc-issues] projects. 54 | 55 | Comments on issues and reviews of pull requests are just as welcome: we are 56 | smarter together than we are on our own. 
**Reviews from novices and newcomers 57 | are particularly valuable**: it's easy for people who have been using these 58 | lessons for a while to forget how impenetrable some of this material can be, so 59 | fresh eyes are always welcome. 60 | 61 | ### What *Not* to Contribute 62 | 63 | Our lessons already contain more material than we can cover in a typical 64 | workshop, so we are usually *not* looking for more concepts or tools to add to 65 | them. As a rule, if you want to introduce a new idea, you must (a) estimate how 66 | long it will take to teach and (b) explain what you would take out to make room 67 | for it. The first encourages contributors to be honest about requirements; the 68 | second, to think hard about priorities. 69 | 70 | We are also not looking for exercises or other material that only run on one 71 | platform. Our workshops typically contain a mixture of Windows, macOS, and 72 | Linux users; in order to be usable, our lessons must run equally well on all 73 | three. 74 | 75 | ### Using GitHub 76 | 77 | If you choose to contribute via GitHub, you may want to look at [How to 78 | Contribute to an Open Source Project on GitHub][how-contribute]. In brief, we 79 | use [GitHub flow][github-flow] to manage changes: 80 | 81 | 1. Create a new branch in your desktop copy of this repository for each 82 | significant change. 83 | 2. Commit the change in that branch. 84 | 3. Push that branch to your fork of this repository on GitHub. 85 | 4. Submit a pull request from that branch to the [upstream repository][repo]. 86 | 5. If you receive feedback, make changes on your desktop and push to your 87 | branch on GitHub: the pull request will update automatically. 88 | 89 | NB: The published copy of the lesson is usually in the `main` branch. 90 | 91 | Each lesson has a team of maintainers who review issues and pull requests or 92 | encourage others to do so. 
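For readers new to Git, the first two GitHub-flow steps listed above can be sketched in a throwaway local repository. This is only an illustrative sketch: the branch name `fix-setup-typo` and the file `notes.md` are hypothetical placeholders, and pushing (step 3) and opening a pull request (step 4) additionally require a fork of the lesson on GitHub.

```bash
set -e
tmp="$(mktemp -d)"                                 # throwaway repository for illustration
cd "$tmp"
git init -q
git config user.name "Example Contributor"         # local identity so commits work anywhere
git config user.email "contributor@example.org"
git commit -q --allow-empty -m "initial commit"    # stand-in for the existing lesson history
git checkout -q -b fix-setup-typo                  # 1. one branch per significant change
echo "corrected wording" > notes.md
git add notes.md
git commit -q -m "Fix typo in setup instructions"  # 2. commit the change in that branch
git log --oneline -1
# 3. git push origin fix-setup-typo                (push the branch to your fork)
# 4. open a pull request on GitHub from that branch to the upstream repository
```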
The maintainers are community volunteers, and have 93 | final say over what gets merged into the lesson. 94 | 95 | ### Other Resources 96 | 97 | The Carpentries is a global organisation with volunteers and learners all over 98 | the world. We share values of inclusivity and a passion for sharing knowledge, 99 | teaching and learning. There are several ways to connect with The Carpentries 100 | community listed at [https://carpentries.org/connect/](https://carpentries.org/connect/) including via social 101 | media, slack, newsletters, and email lists. You can also [reach us by 102 | email][contact]. 103 | 104 | [cp-site]: https://carpentries.org/ 105 | [swc-site]: https://software-carpentry.org/ 106 | [dc-site]: https://datacarpentry.org/ 107 | [lc-site]: https://librarycarpentry.org/ 108 | [github]: https://github.com 109 | [contact]: mailto:team@carpentries.org 110 | [github-join]: https://github.com/join 111 | [repo-issues]: https://github.com/datacarpentry/genomics-workshop/issues 112 | [issues]: https://carpentries.org/help-wanted-issues/ 113 | [template-doc]: https://carpentries.github.io/workbench/ 114 | [dc-issues]: https://github.com/issues?q=user%3Adatacarpentry 115 | [lc-issues]: https://github.com/issues?q=user%3ALibraryCarpentry 116 | [swc-issues]: https://github.com/issues?q=user%3Aswcarpentry 117 | [how-contribute]: https://egghead.io/courses/how-to-contribute-to-an-open-source-project-on-github 118 | [github-flow]: https://guides.github.com/introduction/flow/ 119 | [repo]: https://github.com/datacarpentry/genomics-workshop 120 | 121 | 122 | 123 | -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Licenses" 3 | --- 4 | 5 | ## Instructional Material 6 | 7 | All Carpentries (Software Carpentry, Data Carpentry, and Library Carpentry) 8 | instructional material is made available under the [Creative Commons 9 | 
Attribution license][cc-by-human]. The following is a human-readable summary of 10 | (and not a substitute for) the [full legal text of the CC BY 4.0 11 | license][cc-by-legal]. 12 | 13 | You are free: 14 | 15 | - to **Share**\---copy and redistribute the material in any medium or format 16 | - to **Adapt**\---remix, transform, and build upon the material 17 | 18 | for any purpose, even commercially. 19 | 20 | The licensor cannot revoke these freedoms as long as you follow the license 21 | terms. 22 | 23 | Under the following terms: 24 | 25 | - **Attribution**\---You must give appropriate credit (mentioning that your work 26 | is derived from work that is Copyright (c) The Carpentries and, where 27 | practical, linking to [https://carpentries.org/](https://carpentries.org/)), provide a [link to the 28 | license][cc-by-human], and indicate if changes were made. You may do so in 29 | any reasonable manner, but not in any way that suggests the licensor endorses 30 | you or your use. 31 | 32 | - **No additional restrictions**\---You may not apply legal terms or 33 | technological measures that legally restrict others from doing anything the 34 | license permits. With the understanding that: 35 | 36 | Notices: 37 | 38 | - You do not have to comply with the license for elements of the material in 39 | the public domain or where your use is permitted by an applicable exception 40 | or limitation. 41 | - No warranties are given. The license may not give you all of the permissions 42 | necessary for your intended use. For example, other rights such as publicity, 43 | privacy, or moral rights may limit how you use the material. 44 | 45 | ## Software 46 | 47 | Except where otherwise noted, the example programs and other software provided 48 | by The Carpentries are made available under the [OSI][osi]\-approved [MIT 49 | license][mit-license]. 
50 | 51 | Permission is hereby granted, free of charge, to any person obtaining a copy of 52 | this software and associated documentation files (the "Software"), to deal in 53 | the Software without restriction, including without limitation the rights to 54 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 55 | of the Software, and to permit persons to whom the Software is furnished to do 56 | so, subject to the following conditions: 57 | 58 | The above copyright notice and this permission notice shall be included in all 59 | copies or substantial portions of the Software. 60 | 61 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 62 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 63 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 64 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 65 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 66 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 67 | SOFTWARE. 68 | 69 | ## Trademark 70 | 71 | "The Carpentries", "Software Carpentry", "Data Carpentry", and "Library 72 | Carpentry" and their respective logos are registered trademarks of 73 | [The Carpentries, Inc.][carpentries]. 74 | 75 | [cc-by-human]: https://creativecommons.org/licenses/by/4.0/ 76 | [cc-by-legal]: https://creativecommons.org/licenses/by/4.0/legalcode 77 | [mit-license]: https://opensource.org/licenses/mit-license.html 78 | [carpentries]: https://carpentries.org 79 | [osi]: https://opensource.org 80 | 81 | 82 | 83 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.3260309.svg)](https://doi.org/10.5281/zenodo.3260309) 2 | 3 | # Genomics Workshop 4 | 5 | Overview of the Genomics workshop. 
6 | 7 | ## Code of Conduct 8 | 9 | All participants should agree to abide by [The Carpentries Code of Conduct](https://docs.carpentries.org/topic_folders/policies/index_coc.html). 10 | 11 | ## Authors 12 | 13 | The Genomics workshop overview is authored and maintained by the [community](https://github.com/datacarpentry/genomics-workshop/network/members). 14 | 15 | ## Citation 16 | 17 | Please cite as: 18 | Erin Alison Becker, Tracy Teal, François Michonneau, Maneesha Sane, Taylor Reiter, Jason Williams, et al. (2019, June). datacarpentry/genomics-workshop: Data Carpentry: Genomics Workshop Overview, June 2019 (Version v2019.06.1). Zenodo. [http://doi.org/10.5281/zenodo.3260309](https://doi.org/10.5281/zenodo.3260309) 19 | 20 | 21 | -------------------------------------------------------------------------------- /config.yaml: -------------------------------------------------------------------------------- 1 | #------------------------------------------------------------ 2 | # Values for this lesson. 3 | #------------------------------------------------------------ 4 | 5 | # Which carpentry is this (swc, dc, lc, or cp)? 6 | # swc: Software Carpentry 7 | # dc: Data Carpentry 8 | # lc: Library Carpentry 9 | # cp: Carpentries (to use for instructor training for instance) 10 | # incubator: The Carpentries Incubator 11 | carpentry: 'dc' 12 | 13 | # Overall title for pages. 
14 | title: 'Genomics Workshop Overview' 15 | 16 | # Date the lesson was created (YYYY-MM-DD, this is empty by default) 17 | created: '2015-06-03' 18 | 19 | # Comma-separated list of keywords for the lesson 20 | keywords: 'software, data, lesson, The Carpentries' # FIXME 21 | 22 | # Life cycle stage of the lesson 23 | # possible values: pre-alpha, alpha, beta, stable 24 | life_cycle: 'stable' 25 | 26 | # License of the lesson 27 | license: 'CC-BY 4.0' 28 | 29 | # Link to the source repository for this lesson 30 | source: 'https://github.com/datacarpentry/genomics-workshop' 31 | 32 | # Default branch of your lesson 33 | branch: 'main' 34 | 35 | # Who to contact if there are any issues 36 | contact: 'team@carpentries.org' 37 | 38 | # Navigation ------------------------------------------------ 39 | # 40 | # Use the following menu items to specify the order of 41 | # individual pages in each dropdown section. Leave blank to 42 | # include all pages in the folder. 43 | # 44 | # Example ------------- 45 | # 46 | # episodes: 47 | # - introduction.md 48 | # - first-steps.md 49 | # 50 | # learners: 51 | # - setup.md 52 | # 53 | # instructors: 54 | # - instructor-notes.md 55 | # 56 | # profiles: 57 | # - one-learner.md 58 | # - another-learner.md 59 | 60 | # Order of episodes in your lesson 61 | episodes: 62 | - introduction.Rmd 63 | 64 | # Information for Learners 65 | learners: 66 | 67 | # Information for Instructors 68 | instructors: 69 | 70 | # Learner Profiles 71 | profiles: 72 | 73 | # Customisation --------------------------------------------- 74 | # 75 | # This space below is where custom yaml items (e.g. 
pinning 76 | # sandpaper and varnish versions) should live 77 | 78 | 79 | url: 'https://datacarpentry.github.io/genomics-workshop' 80 | analytics: 'carpentries' 81 | lang: 'en' 82 | overview: true 83 | -------------------------------------------------------------------------------- /fig/logging-onto-cloud-new-key-pair_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud-new-key-pair_1.png -------------------------------------------------------------------------------- /fig/logging-onto-cloud-new-key-pair_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud-new-key-pair_2.png -------------------------------------------------------------------------------- /fig/logging-onto-cloud-security-group_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud-security-group_1.png -------------------------------------------------------------------------------- /fig/logging-onto-cloud-security-group_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud-security-group_2.png -------------------------------------------------------------------------------- /fig/logging-onto-cloud-security-group_3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud-security-group_3.png 
-------------------------------------------------------------------------------- /fig/logging-onto-cloud-summary.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud-summary.png -------------------------------------------------------------------------------- /fig/logging-onto-cloud_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud_1.png -------------------------------------------------------------------------------- /fig/logging-onto-cloud_1b.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud_1b.png -------------------------------------------------------------------------------- /fig/logging-onto-cloud_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud_2.png -------------------------------------------------------------------------------- /fig/logging-onto-cloud_3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud_3.png -------------------------------------------------------------------------------- /fig/logging-onto-cloud_3b.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud_3b.png 
-------------------------------------------------------------------------------- /fig/logging-onto-cloud_5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud_5.png -------------------------------------------------------------------------------- /fig/logging-onto-cloud_6.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud_6.png -------------------------------------------------------------------------------- /fig/logging-onto-cloud_7.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacarpentry/genomics-workshop/d813c0c0d2c69f5f298bf3d067743be2fd05f795/fig/logging-onto-cloud_7.png -------------------------------------------------------------------------------- /index.md: -------------------------------------------------------------------------------- 1 | --- 2 | site: sandpaper::sandpaper_site 3 | --- 4 | 5 | Data Carpentry's aim is to teach researchers basic concepts, skills, and tools for working 6 | with data so that they can get more done in less time, and with less pain. This workshop 7 | teaches data management and analysis for genomics research including: 8 | best practices for organization of bioinformatics projects and data, use of command-line 9 | utilities, use of command-line tools to analyze sequence quality and 10 | perform variant calling, and connecting to and using cloud computing. This workshop is designed to 11 | be taught over two full days of instruction. 12 | 13 | **Please note that workshop materials for working with Genomics data in R are in "alpha" development. 
These lessons are available for review and for informal teaching experiences, but are not yet part of The Carpentries' official lesson offerings.** 14 | 15 | Interested in teaching these materials? We have an [onboarding video](https://www.youtube.com/watch?v=zgdutO5tejo) and accompanying [slides](https://docs.google.com/presentation/d/1fLlT2lPv32DqCFpRPPdHZBNHiQTpK79wd5Z3nsFwL3s/edit#slide=id.p) available to prepare Instructors to teach these lessons. After watching this video, please contact [team@carpentries.org](mailto:team@carpentries.org) so that we can record your status as an onboarded Instructor. Instructors who have completed onboarding will be given priority status for teaching at centrally-organized Data Carpentry Genomics workshops. 16 | 17 | ::::::::::::::::::::::::::::::::::::::::: callout 18 | 19 | ## Frequently Asked Questions 20 | 21 | Read our [FAQ](/genomics-workshop/faq) to learn more about Data Carpentry's Genomics workshop, as an Instructor or a workshop host. 22 | 23 | :::::::::::::::::::::::::::::::::::::::::::::::::: 24 | 25 | :::::::::::::::::::::::::::::::::::::::::: prereq 26 | 27 | ## Getting Started 28 | 29 | This lesson assumes that learners have no prior experience with the tools covered in the workshop. 30 | However, learners are expected to have some familiarity with biological concepts, 31 | including the 32 | concept of genomic variation within a population. Participants should bring their own laptops and plan to participate actively. 33 | 34 | To get started, follow the directions in the [Setup](learners/setup.md) tab to 35 | get access to the required software and data for this workshop. 
36 | 37 | :::::::::::::::::::::::::::::::::::::::::::::::::: 38 | 39 | :::::::::::::::::::::::::::::::::::::::::: prereq 40 | 41 | ## Data 42 | 43 | This workshop uses data from a long term evolution experiment published in 2016: [Tempo and mode of genome evolution in a 50,000-generation experiment](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4988878/) by Tenaillon O, Barrick JE, Ribeck N, Deatherage DE, Blanchard JL, Dasgupta A, Wu GC, Wielgoss S, Cruveiller S, Médigue C, Schneider D, and Lenski RE. (doi: 10.1038/nature18959) 44 | 45 | All of the data used in this workshop can be [downloaded from Figshare](https://figshare.com/articles/Data_Carpentry_Genomics_beta_2_0/7726454). 46 | More information about this data is available on the [Data page](https://datacarpentry.org/organization-genomics/data). 47 | 48 | :::::::::::::::::::::::::::::::::::::::::::::::::: 49 | 50 | # Workshop Overview 51 | 52 | | Lesson | Overview | 53 | | ----------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------ | 54 | | [Project organization and management](https://datacarpentry.github.io/organization-genomics/) | Learn how to structure your metadata, organize and document your genomics data and bioinformatics workflow, and access data on the NCBI sequence read archive (SRA) database. | 55 | | [Introduction to the command line](https://datacarpentry.github.io/shell-genomics/) | Learn to navigate your file system, create, copy, move, and remove files and directories, and automate repetitive tasks using scripts and wildcards. | 56 | | [Data wrangling and processing](https://datacarpentry.github.io/wrangling-genomics/) | Use command-line tools to perform quality control, align reads to a reference genome, and identify and visualize between-sample variation. 
| 57 | | [Introduction to cloud computing for genomics](https://www.datacarpentry.org/cloud-genomics/) | Learn how to work with Amazon AWS cloud computing and how to transfer data between your local computer and cloud resources. | 58 | 59 | # Optional Additional Lessons 60 | 61 | | Lesson | Overview | 62 | | ----------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------ | 63 | | [Intro to R and RStudio for Genomics](https://datacarpentry.org/genomics-r-intro/) | Use R to analyze and visualize between-sample variation. | 64 | 65 | # Teaching Platform 66 | 67 | This workshop is designed to be run on pre-imaged Amazon Web Services (AWS) 68 | instances. All the software and data used in the workshop are hosted on an Amazon Machine Image (AMI). 69 | If you want to run your own instance of the server used for this workshop, follow the directions in the [Setup](learners/setup.md) tab. 70 | 71 | # Common Schedules 72 | 73 | ### Schedule A (2 days OR 4 half days) 74 | 75 | - Half-day 1: [Project organization and management](https://datacarpentry.github.io/organization-genomics/) \& [Introduction to the command line](https://datacarpentry.github.io/shell-genomics/) 76 | - Half-day 2: [Introduction to the command line](https://datacarpentry.github.io/shell-genomics/) (continued). 
77 | - Half-day 3 \& 4: [Data wrangling and processing](https://datacarpentry.github.io/wrangling-genomics/) 78 | 79 | ### Schedule B (2 days OR 4 half days) 80 | 81 | - Half-day 1: [Project organization and management](https://datacarpentry.github.io/organization-genomics/) \& [Introduction to the command line](https://datacarpentry.github.io/shell-genomics/) 82 | - Half-day 2: [Introduction to the command line](https://datacarpentry.github.io/shell-genomics/) (continued) 83 | - Half-day 3 \& 4: [Intro to R and RStudio for Genomics](https://datacarpentry.org/genomics-r-intro/) 84 | 85 | ### Schedule C (3 days OR 6 half days) 86 | 87 | - Half-day 1: [Project organization and management](https://datacarpentry.github.io/organization-genomics/) \& [Introduction to the command line](https://datacarpentry.github.io/shell-genomics/) 88 | - Half-day 2: [Introduction to the command line](https://datacarpentry.github.io/shell-genomics/) (continued) 89 | - Half-day 3 \& 4: [Data wrangling and processing](https://datacarpentry.github.io/wrangling-genomics/) 90 | - Half-day 5 \& 6: [Intro to R and RStudio for Genomics](https://datacarpentry.org/genomics-r-intro/) 91 | 92 | 93 | -------------------------------------------------------------------------------- /instructors/AMI-setup.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Launching your own AMI instances 3 | --- 4 | 5 | ::::::::::::::::::::::::::::::::::::::::: callout 6 | 7 | ## Do I need to create my own instances? 8 | 9 | **If you are:** 10 | 11 | - teaching at or attending a centrally organized Data 12 | Carpentry workshop, 13 | - a Maintainer for one of the Genomics lessons, 14 | - contributing to the Genomics lessons, or 15 | - teaching at a self-organized workshop 16 | 17 | The Carpentries staff will create AMI instances for you. Please contact 18 | [team@carpentries.org](mailto:team@carpentries.org).
19 | 20 | **If you are:** 21 | 22 | - working through these lessons on your own outside of a workshop, 23 | - practicing your skills after a workshop, or 24 | - using these lessons for a teaching demonstration as part of your Instructor checkout for The Carpentries, 25 | 26 | you will need to create your own AMI instances using the instructions below. The cost of using this AMI for a few days, with the 27 | t2.medium instance type, is about USD $1.20 per day. Data Carpentry has no control over AWS pricing structure and provides 28 | this cost estimate with no guarantees. Please see the [EC2 pricing page](https://aws.amazon.com/ec2/pricing/on-demand) for up-to-date information. 29 | 30 | :::::::::::::::::::::::::::::::::::::::::::::::::: 31 | 32 | ### Launching an instance on Amazon Web Services 33 | 34 | :::::::::::::::::::::::::::::::::::::::::: prereq 35 | 36 | ## Prerequisites 37 | 38 | - Form of payment (credit card) 39 | - Understanding of Amazon's billing and payment (See: [Getting started with AWS Billing and Cost Management](https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/billing-getting-started.html)) 40 | - You can use some of Amazon Web Services for free, or see if you qualify for an AWS Grant (See: [https://aws.amazon.com/grants/](https://aws.amazon.com/grants/)) if you are using AWS for education. The free level of service *will not* be sufficient for working with the amount of data we are using for our lessons. 41 | 42 | :::::::::::::::::::::::::::::::::::::::::::::::::: 43 | 44 | #### Create an AWS account 45 | 46 | 1\. Go to Amazon Web Services [https://aws.amazon.com/](https://aws.amazon.com/) 47 | 48 | 2\. Follow the button to sign up for an account - you will need to agree to Amazon's terms and conditions and provide credit card information. 49 | 50 | #### Sign into AWS and Launch an Instance 51 | 52 | 1\. Sign into the AWS EC2 Dashboard: [https://console.aws.amazon.com/ec2/](https://console.aws.amazon.com/ec2/) 53 | 54 | 2\.
Click the 'Launch Instance' button 55 | 56 | Screenshot of AWS EC2 dashboard showing location of launch instance button. 57 | 58 | 3\. Under 'Application and OS Images (Amazon Machine Image)' search for the AMI listed on this curriculum's [Setup page](https://datacarpentry.org/genomics-workshop/index.html#setup) 59 | 60 | Screenshot of AMI launch wizard showing search function. 61 | 62 | 4\. Click "Community AMIs", and then select that image 63 | 64 | Screenshot of AMI launch wizard showing Community AMI tab. 65 | 66 | 5\. Under 'Instance type' click "Compare instance types" and then select **t2.medium**; click "Select instance type" 67 | 68 | Screenshot of AMI launch wizard showing choosing t2.medium image type. 69 | 70 | Screenshot of AMI compare instance type page. 71 | 72 | 6\. Under 'Key pair (login)' click "Proceed without a key pair (not recommended)". A key pair is not necessary for this use case, as you will be using an account that is set up with limited access for learners. If you want to make changes to the instance (for example, installing additional software), you will need administrative access and will need to set up a key pair. Refer to [Amazon's user manual](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html) for information on key pair usage. 73 | 74 | Screenshot showing key pair settings box. 75 | 76 | 7\. Scroll down to 'Network settings'. If this is your first time working with this AMI on your 77 | AWS account, choose "create a new security group". Click "Edit". 78 | 79 | Screenshot of AMI launch wizard showing network settings box with 'Create security group' selected. 80 | 81 | 8\. Name your security group something descriptive (for example "DC-genomics-AMI") 82 | and enter a description into the description box (for example "to use with DC genomics AMI"). 83 | 84 | Your security group should now look like this: 85 | 86 | Screenshot of AMI launch wizard showing creating a new security group. 87 | 88 | 9\.
Click "Add security group rule". A new row will appear. Under 'Type' select "Custom TCP" and enter "8787" into the box labeled "Port Range". Under 89 | "Source type", select "Anywhere". You should now see a screen that looks like this: 90 | 91 | Screenshot of AMI launch wizard showing security group rules. 92 | 93 | 10\. Under 'Summary' on the right side of the screen, you should now see a screen that looks like this. Click "Launch Instance". 94 | 95 | Screenshot of AMI launch wizard showing the launch summary. 96 | 97 | Your instance will now be launched. You should follow the links to 'Create billing alerts' and then the instructions below 98 | for connecting to and terminating your Amazon Instance. 99 | 100 | :::::::::::::::: spoiler 101 | 102 | ## Connect to your Amazon Instance (MacOS/Linux) 103 | 104 | 1. Log into your AWS EC2 Dashboard [https://console.aws.amazon.com/ec2/](https://console.aws.amazon.com/ec2/) 105 | 106 | 2. You should see that you have one instance. To proceed, the instance state must be 'running' (if you just launched the instance it will take \<5 min for the instance to start running). 107 | 108 | Screenshot of AWS EC2 dashboard showing number of running instances. 109 | 110 | 3. At the bottom of the dashboard, you should see a **Public IPv4 DNS** which will look something like *ec2-18-212-60-130.compute-1.amazonaws.com*. Copy that address (you may wish to make a note of it, as you will need it each time you connect). 111 | 112 | Screenshot of AWS EC2 dashboard showing instance state as running. 113 | 114 | 4. You can now connect to your instance using 'ssh'. Your command will be something like this: 115 | 116 | ```bash 117 | $ ssh dcuser@ec2-18-212-60-130.compute-1.amazonaws.com 118 | ``` 119 | 120 | Use `dcuser` as the username, but be sure to replace `ec2-18-212-60-130.compute-1.amazonaws.com` with the DNS for your image. 
You may be notified that the authenticity of the host cannot be verified - if so, type 'yes' into the prompt to bypass the warning and continue connecting. 121 | 122 | 5. When prompted, enter the password `data4Carp` 123 | 124 | You should now be connected to your personal instance. You can confirm that you are in the correct location 125 | by using the `whoami` and `pwd` commands, which should yield the following results: 126 | 127 | ```bash 128 | $ whoami 129 | dcuser 130 | $ pwd 131 | /home/dcuser 132 | ``` 133 | 134 | ::::::::::::::::::::::::: 135 | 136 | :::::::::::::::: spoiler 137 | 138 | ## Connect to your Amazon instance (Windows) 139 | 140 | 1. Download the PuTTY application at: [https://the.earth.li/~sgtatham/putty/latest/x86/putty.exe](https://the.earth.li/~sgtatham/putty/latest/x86/putty.exe) 141 | 142 | 2. Log into your AWS EC2 Dashboard [https://console.aws.amazon.com/ec2/](https://console.aws.amazon.com/ec2/) 143 | 144 | 3. You should see that you have one instance. Make sure the instance state is 'running' (if you just launched the instance it will take \<5 min for the instance to start running) 145 | 146 | Screenshot of AWS EC2 dashboard showing number of running instances. 147 | 148 | 4. At the bottom of the dashboard, you should see a **Public IPv4 DNS** which will look something like *ec2-18-212-60-130.compute-1.amazonaws.com*. Copy that address (you may wish to make a note of it, as you will need it each time you connect). 149 | 150 | Screenshot of AWS EC2 dashboard showing instance state as running. 151 | 152 | 5. Start PuTTY. In the section 'Specify the destination you want to connect to' for 'Host Name (or IP address)' paste in the DNS address and click 'Open' 153 | 154 | 6. When prompted to login, enter 'dcuser'; you may be notified that the authenticity of the host cannot be verified - if so select "Yes" to bypass the warning and continue connecting 155 | 156 | 7. 
When prompted, enter the password `data4Carp` 157 | 158 | You should now be connected to your personal instance. You can confirm this by using the `whoami` and `pwd` commands, which should yield the following results: 159 | 160 | ```bash 161 | Last login: Thu Jul 30 13:21:08 2015 from 8.sub-70-197-200.myvzw.com 162 | $ whoami 163 | dcuser 164 | $ pwd 165 | /home/dcuser 166 | ``` 167 | 168 | ::::::::::::::::::::::::: 169 | 170 | #### Terminating your instance 171 | 172 | ::::::::::::::::::::::::::::::::::::::::: callout 173 | 174 | ## Very Important Warning - Avoid Unwanted Charges 175 | 176 | Please remember, for as long as this instance is running, you will 177 | be charged for your usage. You can see an estimate of the current 178 | charge from your AWS EC2 dashboard by clicking your name (Account 179 | name) on the upper right of the dashboard and selecting 'Billing 180 | \& Cost Management'. **DO NOT FORGET TO TERMINATE YOUR INSTANCE WHEN YOU ARE DONE** 181 | 182 | :::::::::::::::::::::::::::::::::::::::::::::::::: 183 | 184 | When you are finished with your instance, you must terminate it to avoid unwanted charges. Follow these steps. 185 | 186 | 1. Sign into AWS and go to the EC2 Dashboard: [https://console.aws.amazon.com/ec2/](https://console.aws.amazon.com/ec2/) 187 | 2. Under 'Resources' select 'Running Instances' 188 | 3. Select the instance you wish to terminate, then click 'Instance state' and select 'Terminate instance' 189 | 190 | Screenshot of AWS EC2 dashboard showing drop-down menu for terminating an instance. 191 | 192 | ::::::::::::::::::::::::::::::::::::::::: callout 193 | 194 | ## Warning 195 | 196 | Terminating an instance will delete any data on this instance, so you must move any data you wish to save off the instance. 197 | 198 | :::::::::::::::::::::::::::::::::::::::::::::::::: 199 | 200 | 4. Select 'Terminate' to terminate the instance. 
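As the warning above notes, terminating an instance deletes its data, so copy anything you want to keep to your local machine first. One option is `scp`, which accepts the same `dcuser` login and `data4Carp` password as `ssh`. The sketch below builds the command as a string so you can adapt it before running it: the DNS is the example address used earlier on this page, and `results` is a hypothetical directory name; substitute your instance's Public IPv4 DNS and the path you actually want to save.

```bash
# Sketch: build the scp command that copies a directory off the instance.
# Run the printed command on your LOCAL computer, not on the instance.
DNS="ec2-18-212-60-130.compute-1.amazonaws.com"  # replace with your instance's Public IPv4 DNS
CMD="scp -r dcuser@${DNS}:~/results ./results"   # "results" is a hypothetical directory
echo "$CMD"
```

The `-r` flag copies the directory recursively; `scp` will prompt for the `dcuser` password just as `ssh` does.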
201 | 202 | 203 | -------------------------------------------------------------------------------- /instructors/faq.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Frequently Asked Questions 3 | --- 4 | 5 | Thank you for your interest in hosting or teaching a Genomics workshop. Below you will find answers to some frequently asked questions about this curriculum. If the answer to your question doesn't appear, please contact [team@carpentries.org](mailto:team@carpentries.org). 6 | 7 | - [For Hosts](#hosts) 8 | - [For Instructors](#instructors) 9 | 10 | ## Hosts 11 | 12 | ### What does this workshop cover? 13 | 14 | This workshop teaches data management and analysis for genomics research including: best practices for organization of bioinformatics projects and data, use of command line utilities, use of command line tools to analyze sequence quality and perform variant calling, and connecting to and using cloud computing. 15 | 16 | ### What experience do learners need to have before this workshop? What will they be able to do by the end of the workshop? 17 | 18 | This lesson assumes no prior experience with the tools covered in the workshop. However, learners are expected to have some familiarity with biological concepts, including the concept of genomic variation within a population. By the end of the workshop, learners will be able to: 19 | 20 | - structure their metadata, organize and document their genomics data and bioinformatics workflow, and access data on the NCBI sequence read archive (SRA) database, 21 | - navigate their file systems, create, copy, move, and remove files and directories, and automate repetitive tasks using scripts and wildcards, 22 | - use command-line tools to perform quality control, align reads to a reference genome, and identify and visualize between-sample variation, 23 | - work with Amazon AWS cloud computing and transfer data between their local computer and cloud resources. 
24 | 25 | ### What are the software, hardware, and connectivity needs for this workshop? 26 | 27 | Learners will need to bring a laptop (not a tablet) with any spreadsheet program installed (e.g. LibreOffice, Microsoft Excel). Learners using a Windows machine will also need to download and install [PuTTY](https://www.putty.org/). There are no other hardware or software requirements. Learners will need a stable, strong internet connection in order to work on the remote computing system used for this workshop. 28 | 29 | ### My institution has its own compute cluster, or our research group uses a different cloud computing resource. Can we deliver the workshop using that system? 30 | 31 | To ensure a consistent workshop experience for learners and Instructors, all workshops organized by The Carpentries ("centrally-organized workshops") use our stable, community-tested curriculum and technical set-up. Currently, all centrally-organized Genomics workshops are taught using AWS, although we are interested in supporting other systems in the future. If you are interested in using a different platform to teach this curriculum in a self-organized workshop, all of our materials are publicly available and licensed [CC-BY](https://creativecommons.org/licenses/by/4.0/). For information about the difference between centrally-organized and self-organized workshops, and limitations on use of "The Carpentries" brand name and logo, see the [Teaching and Hosting](https://docs.carpentries.org/topic_folders/hosts_instructors/index.html) section of The Carpentries Handbook. 32 | 33 | ### What experience do helpers need to have for this workshop? 34 | 35 | Anyone who has some experience using the Bash shell can be an effective helper for this workshop. Helpers do not need to have experience working with genomics data or the specific command line tools taught in this workshop. 36 | 37 | ### I want to include the optional R lesson. Can I do that? 
38 | 39 | To ensure a consistent workshop experience for learners and Instructors, all workshops organized by The Carpentries ("centrally-organized workshops") use our stable, community-tested curriculum. The Genomics R lesson is still under development and we cannot guarantee that it will meet our high curricular standards. If you are interested in using this lesson in a self-organized workshop, all of our materials are publicly available and licensed CC-BY. For information about the difference between centrally-organized and self-organized workshops, and limitations on use of "The Carpentries" brand name and logo, see the [Teaching and Hosting](https://docs.carpentries.org/topic_folders/hosts_instructors/index.html) section of The Carpentries Handbook. 40 | 41 | ### Does the AWS image location matter? Do I need to set up an AMI in a different region if my workshop will be held outside of the Eastern US? 42 | 43 | We have run this workshop in locations across the United States and Europe with no noticeable difference in instance speed. If you experience any issues, please [let us know](mailto:team@carpentries.org). 44 | 45 | ### Where can I find more information about this workshop? 46 | 47 | For a full description of this workshop, including what content is covered, and what dataset we use to teach, visit the [Genomics Workshop Overview](https://datacarpentry.org/genomics-workshop/) page. 48 | 49 | ## Instructors 50 | 51 | ### What background and technological skills do I need to have to teach this workshop? 52 | 53 | You will need experience using a bash shell (the default shell on Mac OS and most Linux systems), including writing your own small bash scripts and running programs written by others from the command line. You do not need to have specifically used the command-line programs that are used in these lessons. 
You do not need to have prior experience working with Amazon Web Services (AWS), but some experience logging on to remote computers would be useful. You should have experience working with genomic sequences in FASTQ format. 54 | 55 | ### How can I prepare to teach this material? 56 | 57 | Each lesson has a set of Instructor Notes that provide information about the design of the lesson, commonly encountered problems, and technical tips and tricks. You can access Instructor Notes through the [main lessons page](https://datacarpentry.org/lessons/#genomics-workshop) (linked through the plus icon in the lesson table) or in the "Extras" menu on each individual lesson page. Instructor Notes are written collaboratively by our Instructors, so please contribute your own notes after your workshop! 58 | 59 | ### When will I have access to an AWS image to practice on? 60 | 61 | Each Instructor will receive connection information for an AMI instance approximately one week before the workshop. If you would like workshop helpers to also have access to an instance to prepare for the workshop, or if you would like access more than one week in advance of the workshop, please contact [team@carpentries.org](mailto:team@carpentries.org). 62 | 63 | ### How will my learners get connection and log-in information? 64 | 65 | The Carpentries Workshops and Instruction Team will provide Instructors with connection information for AMI instances the day before the workshop. Enough instances will be provided for each learner, and if requested, for each workshop helper, to have their own individual instance. Instructors can make connection information available to learners through the workshop Etherpad. 66 | 67 | ### I want to teach the optional R lesson. Can I do that? 68 | 69 | To ensure a consistent workshop experience for learners and Instructors, all workshops organized by The Carpentries ("centrally-organized workshops") use our stable, community-tested curriculum. 
The Genomics R lesson is still under development and we cannot guarantee that it will meet our high curricular standards. If you are interested in using this lesson in a self-organized workshop, all of our materials are publicly available and licensed CC-BY. For information about the difference between centrally-organized and self-organized workshops, and limitations on use of "The Carpentries" brand name and logo, see the [Teaching and Hosting](https://docs.carpentries.org/topic_folders/hosts_instructors/index.html) section of The Carpentries Handbook. 70 | 71 | ### Is there anything special I need to do if I'm teaching the optional R lesson? 72 | 73 | Nope! The AMI that we use for the standard lessons can also be used to teach the optional R lesson. 74 | 75 | ### What are common problems that arise during this workshop? 76 | 77 | The best place to get information about common problems that arise during workshops is in the Instructor Notes for each lesson. You can access Instructor Notes through the Extras menu in the top navigation bar that appears across the head of each lesson. Instructors are strongly encouraged to contribute back to the Instructor Notes based on their workshop experience. To contribute to the Instructor Notes, click the "Improve this page" menu option in the upper right corner of the Instructor Notes page. 78 | 79 | ### Does the AWS image location matter? Do I need to set up an AMI in a different region if my workshop will be held outside of the Eastern US? 80 | 81 | We have run this workshop in locations across the United States and Europe with no noticeable difference in instance speed. If you experience any issues, please [let us know](mailto:team@carpentries.org). 
82 | 83 | 84 | -------------------------------------------------------------------------------- /instructors/instructor-notes.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Instructor Notes 3 | --- 4 | 5 | ## Resources for Instructors 6 | 7 | We have an [onboarding video](https://www.youtube.com/watch?v=zgdutO5tejo) available to prepare Instructors to teach these lessons. 8 | The slides presented in this video are available at [https://tinyurl.com/y27swdvo](https://tinyurl.com/y27swdvo). 9 | After watching this video, please contact [team@carpentries.org](mailto:team@carpentries.org) so that we can record 10 | your status as an onboarded Instructor. Instructors who have completed onboarding will be given priority status for teaching at 11 | centrally-organized Carpentries workshops. 12 | 13 | ## Workshop Structure 14 | 15 | [Instructors, please add notes on your experience with the workshop structure here.] 16 | 17 | ## Technical tips and tricks 18 | 19 | #### Installation 20 | 21 | This workshop is designed to be run on pre-imaged Amazon Web Services (AWS) instances. See the 22 | [Setup page](https://datacarpentry.org/genomics-workshop/index.html#setup) for complete setup instructions. If you are 23 | teaching these lessons, and would like an AWS instance to practice on, please contact [team@carpentries.org](mailto:team@carpentries.org). 24 | 25 | ## Common problems 26 | 27 | This workshop introduces an analysis pipeline, where each step in that pipeline is dependent on the previous step. 28 | If a learner gets behind, or one of the steps doesn't work for them, they may not be able to catch up with the rest of the class. 29 | To help ensure that all learners are able to work through the whole process, we provide the solution files. 
This includes all 30 | of the output files for each step in the data processing pipeline, as well as the scripts that the learners write collaboratively 31 | with the Instructors throughout the workshop. These files are available on the AMI in `/home/dcuser/.solutions`. 32 | 33 | Similarly, if learners are unable to download the data files directly from the SRA as shown in the lesson (e.g. due to 34 | unstable internet), those files are available in the hidden backup directory (`/home/dcuser/.backup`). 35 | 36 | Make sure to tell your helpers about the `.solutions` and `.backup` directories so that they can use these resources to help 37 | learners catch up during the workshop. 38 | 39 | 40 | -------------------------------------------------------------------------------- /instructors/teaching_demos.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Teaching Demonstrations 3 | --- 4 | 5 | If you are an instructor in training and wish to use lessons from Data Carpentry's Genomics curriculum for your teaching demo, please read the instructions below to be sure you are prepared. You must follow these steps *before* your teaching demo, or you will be asked to reschedule. 6 | 7 |
8 | 9 | #### [Project Organization and Management for Genomics](https://datacarpentry.org/organization-genomics/) 10 | 11 | No special instructions. 12 | 13 |
14 | 15 | #### [Introduction to the Command Line for Genomics](https://datacarpentry.org/shell-genomics/) 16 | 17 | For your teaching demo, you may follow this lesson locally without an AMI instance. Note that this will 18 | require some changes to paths throughout the lesson. 19 | 20 | Use the following shell commands to download and unzip the necessary data files from FigShare. 21 | 22 | ```bash 23 | wget --output-document shell_data.tar.gz https://ndownloader.figshare.com/files/14417834 24 | tar -xzf shell_data.tar.gz 25 | ``` 26 | 27 |
28 | 29 | #### [Data Wrangling and Processing for Genomics](https://datacarpentry.org/wrangling-genomics/) 30 | 31 | Use [these instructions](https://datacarpentry.org/genomics-workshop/AMI-setup) to launch and connect to your own instance of the Data Carpentry Genomics AMI. This instance should cost you approximately US $1.20 per day. (This cost estimate is provided without any guarantee of accuracy and Data Carpentry assumes no liability for costs associated with your AMI instance(s).) 32 | 33 | Once you have connected to your AWS instance, use the shell commands below to ensure that the data directory is created, 34 | that the data is placed into the data directory, and that you are in the data directory before 35 | starting to operate on the data. 36 | 37 | ```bash 38 | mkdir -p ~/dc_workshop/data/untrimmed_fastq/ 39 | mv ~/.backup/untrimmed_fastq/* ~/dc_workshop/data/untrimmed_fastq/ 40 | cd ~/dc_workshop/data/untrimmed_fastq 41 | ``` 42 | 43 |
44 | 45 | #### [Introduction to Cloud Computing for Genomics](https://datacarpentry.org/cloud-genomics/) 46 | 47 | Use [these instructions](https://datacarpentry.org/genomics-workshop/AMI-setup) to launch and connect to your own instance of the Data Carpentry Genomics AMI. This instance should cost you approximately US $1.20 per day. (This cost estimate is provided without any guarantee of accuracy and Data Carpentry assumes no liability for costs associated with your AMI instance(s).) 48 | 49 |
50 | 51 | #### [Data Analysis and Visualization in R](https://datacarpentry.org/genomics-r-intro/) 52 | 53 | **DO NOT USE for demos.** This lesson is not yet stable. 54 | 55 |
56 | 57 | 58 | -------------------------------------------------------------------------------- /learners/reference.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: 'Glossary' 3 | --- 4 | 5 | ## Glossary 6 | 7 | FIXME 8 | 9 | 10 | 11 | -------------------------------------------------------------------------------- /learners/setup.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Setup 3 | --- 4 | 5 | # Overview 6 | 7 | This workshop is designed to be run on pre-imaged Amazon Web Services (AWS) instances. 8 | All of the data and most of the software used in the workshop are hosted on an 9 | Amazon Machine Image (AMI). 10 | Some additional software, detailed below, must be installed on your computer. 11 | 12 | Please follow the instructions below to prepare your computer for the workshop: 13 | 14 | - Required additional software + Option A 15 | **OR** 16 | - Required additional software + Option B 17 | 18 | ## Required additional software 19 | 20 | This lesson requires a working spreadsheet program. 21 | If you don't have a spreadsheet program already, you can use LibreOffice. 22 | It's a free, open source spreadsheet program. 23 | Directions for installing on Windows, Mac OS X, and Linux systems are included below. 24 | For Windows, you will also need to install either Git Bash, PuTTY, or the Ubuntu Subsystem. 25 | 26 | :::::::::::::::: spoiler 27 | 28 | ## Windows 29 | 30 | - Visit [the LibreOffice installation page](https://www.libreoffice.org/download/libreoffice-fresh/). 31 | The version for Windows should automatically be selected. 32 | Click Download Version X.X.X (whichever is the most recent version). 33 | You will go to a page that asks about a donation, but you don't need to make one. 34 | Your download should begin automatically. 35 | - Once the installer is downloaded, double click on it and LibreOffice should install. 
36 | - Download the [Git for Windows installer](https://git-for-windows.github.io/). 37 | Run the installer and follow the steps below: 38 | - Click on "Next" four times (two times if you've previously installed Git). 39 | You don't need to change anything in the Information, location, components, and start menu screens. 40 | - **From the dropdown menu select "Use the Nano editor by default" 41 | (NOTE: you will need to scroll up to find it) and click on "Next".** 42 | - On the page that says "Adjusting the name of the initial branch in new repositories", 43 | ensure that "Let Git decide" is selected. 44 | This will ensure the highest level of compatibility for our lessons. 45 | - Ensure that "Git from the command line and also from 3rd-party software" 46 | is selected and click on "Next". 47 | (If you don't do this Git Bash will not work properly, 48 | requiring you to remove the Git Bash installation, 49 | re-run the installer, and select the 50 | "Git from the command line and also from 3rd-party software" option.) 51 | - Ensure that "Use the native Windows Secure Channel Library" is selected and click on "Next". 52 | - Ensure that "Checkout Windows-style, commit Unix-style line endings" is selected and click on "Next". 53 | - **Ensure that "Use Windows' default console window" is selected and click on "Next".** 54 | - Ensure that "Default (fast-forward or merge)" is selected and click on "Next". 55 | - Ensure that "Git Credential Manager Core" is selected and click on "Next". 56 | - Ensure that "Enable file system caching" is selected and click on "Next". 57 | - Click on "Install". 58 | - Click on "Finish". 59 | - Check the settings for your "HOME" environment variable. 
60 | - If your "HOME" environment variable is not set (or you don't know what this is): 61 | - Open command prompt (Open Start Menu then type `cmd` and press [Enter]) 62 | - Type the following line into the command prompt window exactly as shown: `setx HOME "%USERPROFILE%"` 63 | - Press [Enter]; you should see `SUCCESS: Specified value was saved.` 64 | - Quit command prompt by typing `exit` then pressing [Enter] 65 | - An **alternative option** is to [install PuTTY](https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html). 66 | For most newer computers, click on putty-64bit-X.XX-installer.msi to download the 64-bit version. 67 | If you have an older laptop, you may need to get the 32-bit version putty-X.XX-installer.msi. 68 | If you aren't sure whether you need the 64 or 32 bit version, 69 | you can [check your laptop version](https://support.microsoft.com/en-us/help/15056/windows-32-64-bit-faq). 70 | Once the installer is downloaded, double click on it, and PuTTY should install. 71 | - **Another alternative option** is to use the Ubuntu Subsystem for Windows. 72 | This option is only available for Windows 10 - the Microsoft documentation provides 73 | [detailed instructions for installing it on Windows 10](https://docs.microsoft.com/en-us/windows/wsl/install-win10). 74 | 75 | ::::::::::::::::::::::::: 76 | 77 | :::::::::::::::: spoiler 78 | 79 | ## Mac OS X 80 | 81 | - Visit [the LibreOffice installation page](https://www.libreoffice.org/download/libreoffice-fresh/). 82 | The version for Mac should automatically be selected. 83 | Click Download Version X.X.X (whichever is the most recent version). 84 | You will go to a page that asks about a donation, but you don't need to make one. 85 | Your download should begin automatically. 86 | - Once the installer is downloaded, double click on it and LibreOffice should install. 
87 | 88 | ::::::::::::::::::::::::: 89 | 90 | :::::::::::::::: spoiler 91 | 92 | ## Linux 93 | 94 | - Visit [the LibreOffice installation page](https://www.libreoffice.org/download/libreoffice-fresh/). 95 | The version for Linux should automatically be selected. 96 | Click Download Version X.X.X (whichever is the most recent version). 97 | You will go to a page that asks about a donation, but you don't need to make one. 98 | Your download should begin automatically. 99 | - Once the installer is downloaded, double click on it and LibreOffice should install. 100 | 101 | ::::::::::::::::::::::::: 102 | 103 | ## Option A (**Recommended**): Using the lessons with Amazon Web Services (AWS) 104 | 105 | If you are signed up to take a Genomics Data Carpentry workshop, 106 | you do *not* need to worry about setting up an AMI instance. 107 | The Carpentries staff will create an instance for you and this will be provided to you at no cost. 108 | This is true for both self-organized and centrally-organized workshops. 109 | Your Instructor will provide instructions for connecting to the AMI instance at the workshop. 110 | 111 | If you would like to work through these lessons independently, outside of a workshop, 112 | you will need to start your own AMI instance. 113 | Follow these [instructions on creating an Amazon instance](https://datacarpentry.org/genomics-workshop/AMI-setup). 114 | Use the AMI `ami-07196848f138b4f29` (Data Carpentry Genomics with R 4.4) 115 | listed on the Community AMIs page. 116 | Please note that you must set your location as `N. Virginia` in order to access this community AMI. 117 | You can change your location in the upper right corner of the main AWS menu bar. 118 | The cost of using this AMI for a few days, 119 | with the t2.medium instance type is very low (about USD $1.50 per user, per day). 120 | Data Carpentry has *no* control over AWS pricing structure and provides this 121 | cost estimate with no guarantees. 
122 | Please read AWS documentation on pricing for up-to-date information. 123 | 124 | If you're an Instructor or Maintainer or want to contribute to these lessons, 125 | please [get in touch with us](mailto:team@carpentries.org) 126 | and we will start instances for you. 127 | 128 | ## Option B: Using the lessons on your local machine 129 | 130 | While not recommended, it is possible to work through the lessons on your local machine 131 | (i.e. without using AWS). 132 | To do this, you will need to install all of the software used in the workshop 133 | and obtain a copy of the dataset. 134 | Instructions for doing this are listed below. 135 | 136 | ### Data 137 | 138 | [The data used in this workshop is available on FigShare](https://figshare.com/articles/Data_Carpentry_Genomics_beta_2_0/7726454). 139 | Because this workshop works with real data, be aware that file sizes for the data are large. 140 | Please read the FigShare page for information about the data and access to the data files. 141 | 142 | More information about these data will be presented in 143 | [the first lesson of the workshop](https://www.datacarpentry.org/organization-genomics/data/). 144 | 145 | ### Software 146 | 147 | | Software | Version | Manual | Available for | Description | 148 | | -------- | ------- | ------ | --------------------- | --------------------------------------------------------------------- | 149 | | [FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) | 0\.11.9 | [Link](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/) | Linux, MacOS, Windows | Quality control tool for high throughput sequence data. | 150 | | [Trimmomatic](https://www.usadellab.org/cms/?page=trimmomatic) | 0\.39 | [Link](https://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/TrimmomaticManual_V0.32.pdf) | Linux, MacOS, Windows | A flexible read trimming tool for Illumina NGS data. 
| 151 | | [BWA](https://bio-bwa.sourceforge.net/) | 0\.7.17 | [Link](https://bio-bwa.sourceforge.net/bwa.shtml) | Linux, MacOS | Mapping DNA sequences against reference genome. | 152 | | [SAMtools](https://samtools.sourceforge.net/) | 1\.9 | [Link](https://www.htslib.org/doc/samtools.html) | Linux, MacOS | Utilities for manipulating alignments in the SAM format. | 153 | | [BCFtools](https://samtools.github.io/bcftools/) | 1\.9 | [Link](https://samtools.github.io/bcftools/bcftools.html) | Linux, MacOS | Utilities for variant calling and manipulating VCFs and BCFs. | 154 | | [IGV](https://software.broadinstitute.org/software/igv/home) | [Link](https://software.broadinstitute.org/software/igv/download) | [Link](https://software.broadinstitute.org/software/igv/UserGuide) | Linux, MacOS, Windows | Visualization and interactive exploration of large genomics datasets. | 155 | 156 | ### QuickStart Software Installation Instructions 157 | 158 | These are the QuickStart installation instructions. 159 | They assume familiarity with the command line and with installation in general. 160 | As there are different operating systems and many different versions of 161 | operating systems and environments, these may not work on your computer. 162 | If an installation doesn't work for you, please refer to the user guide for the tool, 163 | listed in the table above. 164 | 165 | We have installed software using [Conda](https://conda.io). 166 | Conda is a package manager that simplifies the installation process. 167 | Please first install Conda through the Miniconda installer (see below) before proceeding to the installation of individual tools. 168 | For more information on Miniconda, please refer to the Conda [documentation](https://conda.io/projects/conda/en/latest/user-guide/install/index.html). 
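If you prefer to manage the tools as a single environment rather than installing each one individually with `conda install` (as shown in the sections that follow), the versions from the table above can be collected into a Conda environment file. This is a sketch of our own, not part of the official setup; the file name `environment.yml` and the environment name `dc-genomics` are arbitrary choices.

```yaml
# Hypothetical environment.yml collecting the workshop tools.
# Create the environment with:  conda env create -f environment.yml
# Activate it with:             conda activate dc-genomics
name: dc-genomics
channels:
  - bioconda
  - conda-forge
dependencies:
  - fastqc=0.11.9
  - trimmomatic=0.39
  - bwa=0.7.17
  - samtools=1.9
  - bcftools=1.9
```

Note that IGV is not included here; install it separately following the download link in the table above.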
169 | 170 | ### Conda 171 | 172 | :::::::::::::::: spoiler 173 | 174 | ## Linux 175 | 176 | To install Conda, type: 177 | 178 | ```bash 179 | $ curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh 180 | $ bash Miniconda3-latest-Linux-x86_64.sh 181 | ``` 182 | 183 | Then, follow the instructions that you are prompted with on the screen to install Conda. 184 | 185 | ::::::::::::::::::::::::: 186 | 187 | :::::::::::::::: spoiler 188 | 189 | ## MacOS 190 | 191 | To install Conda, type: 192 | 193 | ```bash 194 | $ curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh 195 | $ bash Miniconda3-latest-MacOSX-x86_64.sh 196 | ``` 197 | 198 | Then, follow the instructions that you are prompted with on the screen to install Conda. 199 | 200 | ::::::::::::::::::::::::: 201 | 202 | ### FastQC 203 | 204 | :::::::::::::::: spoiler 205 | 206 | ## MacOS 207 | 208 | To install FastQC, type: 209 | 210 | ```bash 211 | $ conda install -c bioconda fastqc=0.11.9 212 | ``` 213 | 214 | ::::::::::::::::::::::::: 215 | 216 | :::::::::::::::: spoiler 217 | 218 | ## FastQC Source Code Installation 219 | 220 | If you prefer to install from source, follow the directions below: 221 | 222 | ```bash 223 | $ cd ~/src 224 | $ curl -O http://www.bioinformatics.babraham.ac.uk/projects/fastqc/fastqc_v0.11.9.zip 225 | $ unzip fastqc_v0.11.9.zip 226 | ``` 227 | 228 | Link the fastqc executable to the `~/bin` folder that 229 | you have already added to your PATH. 230 | 231 | ```bash 232 | $ ln -sf ~/src/FastQC/fastqc ~/bin/fastqc 233 | ``` 234 | 235 | Due to what seems to be a packaging error, 236 | the executable flag on the fastqc program is not set, 237 | so we need to set it ourselves. 
238 | 
239 | ```bash
240 | $ chmod +x ~/bin/fastqc
241 | ```
242 | 
243 | :::::::::::::::::::::::::
244 | 
245 | **Test your installation by running:**
246 | 
247 | ```bash
248 | $ fastqc -h
249 | ```
250 | 
251 | ### Trimmomatic
252 | 
253 | :::::::::::::::: spoiler
254 | 
255 | ## MacOS
256 | 
257 | ```bash
258 | $ conda install -c bioconda trimmomatic=0.39
259 | ```
260 | 
261 | :::::::::::::::::::::::::
262 | 
263 | :::::::::::::::: spoiler
264 | 
265 | ## Trimmomatic Source Code Installation
266 | 
267 | If you prefer to install from source, follow the directions below:
268 | 
269 | ```bash
270 | $ cd ~/src
271 | $ curl -O http://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/Trimmomatic-0.39.zip
272 | $ unzip Trimmomatic-0.39.zip
273 | ```
274 | 
275 | The program can be invoked via:
276 | 
277 | ```bash
278 | $ java -jar ~/src/Trimmomatic-0.39/trimmomatic-0.39.jar
279 | ```
280 | 
281 | The ~/src/Trimmomatic-0.39/adapters/ directory contains
282 | Illumina-specific adapter sequences:
283 | 
284 | ```bash
285 | $ ls ~/src/Trimmomatic-0.39/adapters/
286 | ```
287 | 
288 | :::::::::::::::::::::::::
289 | 
290 | **Test your installation by running** (assuming things are installed in ~/src):
291 | 
292 | ```bash
293 | $ java -jar ~/src/Trimmomatic-0.39/trimmomatic-0.39.jar
294 | ```
295 | 
296 | :::::::::::::::: spoiler
297 | 
298 | ## Simplify the Invocation (or Test Your Installation if You Installed with Miniconda3)
299 | 
300 | To simplify the invocation, you can create a wrapper script in the ~/bin folder:
301 | 
302 | ```bash
303 | $ echo '#!/bin/bash' > ~/bin/trimmomatic
304 | $ echo 'java -jar ~/src/Trimmomatic-0.39/trimmomatic-0.39.jar "$@"' >> ~/bin/trimmomatic
305 | $ chmod +x ~/bin/trimmomatic
306 | ```
307 | 
308 | Test your script by running:
309 | 
310 | ```bash
311 | $ trimmomatic
312 | ```
313 | 
314 | :::::::::::::::::::::::::
315 | 
316 | ### BWA
317 | 
318 | :::::::::::::::: spoiler
319 | 
320 | ## MacOS
321 | 
322 | ```bash
323 | $ conda install -c bioconda bwa=0.7.17=ha92aebf_3
324 | ```
325 | 
326 | :::::::::::::::::::::::::
327 | 
328 | :::::::::::::::: spoiler
329 | 
330 | ## BWA Source Code Installation
331 | 
332 | If you prefer to install from source, follow the instructions below:
333 | 
334 | ```bash
335 | $ cd ~/src
336 | $ curl -OL http://sourceforge.net/projects/bio-bwa/files/bwa-0.7.17.tar.bz2
337 | $ tar jxvf bwa-0.7.17.tar.bz2
338 | $ cd bwa-0.7.17
339 | $ make
340 | $ export PATH=~/src/bwa-0.7.17:$PATH
341 | ```
342 | 
343 | :::::::::::::::::::::::::
344 | 
345 | **Test your installation by running:**
346 | 
347 | ```bash
348 | $ bwa
349 | ```
350 | 
351 | ### SAMtools
352 | 
353 | :::::::::::::::: spoiler
354 | 
355 | ## MacOS
356 | 
357 | ```bash
358 | $ conda install -c bioconda samtools=1.9=h8ee4bcc_1
359 | ```
360 | 
361 | :::::::::::::::::::::::::
362 | 
363 | ::::::::::::::::::::::::::::::::::::::::: callout
364 | 
365 | ## SAMtools Versions
366 | 
367 | SAMtools has changed its command-line invocation (for the better),
368 | but this means that many tutorials on the web show older, obsolete usage.
369 | 
370 | Use SAMtools version 1.9 so that the commands we present in these lessons work as shown.
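
If you are not sure which version you currently have, you can check from the command line. This is a small sketch of that check; it assumes a POSIX shell, and note that on very old 0.1.x SAMtools releases the `--version` flag may not be supported:

```bash
# Print the installed SAMtools version (first line of `samtools --version`),
# or a short notice if samtools has not been installed yet.
if command -v samtools >/dev/null 2>&1; then
  samtools --version | head -n 1
else
  echo "samtools not found on PATH"
fi
```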
371 | 
372 | ::::::::::::::::::::::::::::::::::::::::::::::::::
373 | 
374 | :::::::::::::::: spoiler
375 | 
376 | ## SAMtools Source Code Installation
377 | 
378 | If you prefer to install from source, follow the instructions below:
379 | 
380 | ```bash
381 | $ cd ~/src
382 | $ curl -OL https://github.com/samtools/samtools/releases/download/1.9/samtools-1.9.tar.bz2
383 | $ tar jxvf samtools-1.9.tar.bz2
384 | $ cd samtools-1.9
385 | $ make
386 | ```
387 | 
388 | Add the directory to your PATH if necessary:
389 | 
390 | ```bash
391 | $ echo 'export PATH=~/src/samtools-1.9:$PATH' >> ~/.bashrc
392 | $ source ~/.bashrc
393 | ```
394 | 
395 | :::::::::::::::::::::::::
396 | 
397 | **Test your installation by running:**
398 | 
399 | ```bash
400 | $ samtools
401 | ```
402 | 
403 | ### BCFtools
404 | 
405 | :::::::::::::::: spoiler
406 | 
407 | ## MacOS
408 | 
409 | ```bash
410 | $ conda install -c bioconda bcftools=1.9
411 | ```
412 | 
413 | :::::::::::::::::::::::::
414 | 
415 | :::::::::::::::: spoiler
416 | 
417 | ## BCFtools Source Code Installation
418 | 
419 | If you prefer to install from source, follow the instructions below:
420 | 
421 | ```bash
422 | $ cd ~/src
423 | $ curl -OL https://github.com/samtools/bcftools/releases/download/1.9/bcftools-1.9.tar.bz2
424 | $ tar jxvf bcftools-1.9.tar.bz2
425 | $ cd bcftools-1.9
426 | $ make
427 | ```
428 | 
429 | Add the directory to your PATH if necessary:
430 | 
431 | ```bash
432 | $ echo 'export PATH=~/src/bcftools-1.9:$PATH' >> ~/.bashrc
433 | $ source ~/.bashrc
434 | ```
435 | 
436 | :::::::::::::::::::::::::
437 | 
438 | **Test your installation by running:**
439 | 
440 | ```bash
441 | $ bcftools
442 | ```
443 | 
444 | ### IGV
445 | 
446 | - [Download the IGV installation files](https://software.broadinstitute.org/software/igv/download)
447 | - [Install and run IGV using the instructions for your operating system](https://software.broadinstitute.org/software/igv/download).
448 | 449 | 450 | -------------------------------------------------------------------------------- /profiles/learner-profiles.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: FIXME 3 | --- 4 | 5 | This is a placeholder file. Please add content here. 6 | -------------------------------------------------------------------------------- /site/README.md: -------------------------------------------------------------------------------- 1 | This directory contains rendered lesson materials. Please do not edit files 2 | here. 3 | --------------------------------------------------------------------------------