├── .github └── workflows │ ├── README.md │ ├── pr-close-signal.yaml │ ├── pr-comment.yaml │ ├── pr-post-remove-branch.yaml │ ├── pr-preflight.yaml │ ├── pr-receive.yaml │ ├── sandpaper-main.yaml │ ├── sandpaper-version.txt │ ├── update-cache.yaml │ └── update-workflows.yaml ├── .gitignore ├── .travis.yml ├── AUTHORS ├── CITATION ├── CITATION.cff ├── CODEOWNERS ├── CODE_OF_CONDUCT.md ├── CONTRIBUTING.md ├── Gemfile ├── LICENSE.md ├── Makefile ├── README.md ├── config.yaml ├── episodes ├── .gitkeep ├── 01-intro.md ├── 02-variables.md ├── 03-ranges-arrays.md ├── 04-conditionals.md ├── 05-loops.md ├── 06-procedures.md ├── 07-commandargs.md ├── 08-timing.md ├── 11-parallel-intro.md ├── 12-fire-forget-tasks.md ├── 13-synchronization.md ├── 14-parallel-case-study.md ├── 21-locales.md └── 22-domains.md ├── hpc-chapel.Rproj ├── index.md ├── instructors └── instructor-notes.md ├── learners ├── reference.md └── setup.md ├── links.md ├── profiles └── learner-profiles.md └── site └── README.md /.github/workflows/README.md: -------------------------------------------------------------------------------- 1 | # Carpentries Workflows 2 | 3 | This directory contains workflows to be used for Lessons using the {sandpaper} 4 | lesson infrastructure. Two of these workflows require R (`sandpaper-main.yaml` 5 | and `pr-receive.yaml`) and the rest are bots to handle pull request management. 6 | 7 | These workflows will likely change as {sandpaper} evolves, so it is important to 8 | keep them up-to-date. To do this in your lesson you can do the following in your 9 | R console: 10 | 11 | ```r 12 | # Install/Update sandpaper 13 | options(repos = c(carpentries = "https://carpentries.r-universe.dev/", 14 | CRAN = "https://cloud.r-project.org")) 15 | install.packages("sandpaper") 16 | 17 | # update the workflows in your lesson 18 | library("sandpaper") 19 | update_github_workflows() 20 | ``` 21 | 22 | Inside this folder, you will find a file called `sandpaper-version.txt`, which 23 | will contain a version number for sandpaper. This will be used in the future to 24 | alert you if a workflow update is needed. 25 | 26 | What follows are the descriptions of the workflow files: 27 | 28 | ## Deployment 29 | 30 | ### 01 Build and Deploy (sandpaper-main.yaml) 31 | 32 | This is the main driver that will only act on the main branch of the repository. 33 | This workflow does the following: 34 | 35 | 1. checks out the lesson 36 | 2. provisions the following resources 37 | - R 38 | - pandoc 39 | - lesson infrastructure (stored in a cache) 40 | - lesson dependencies if needed (stored in a cache) 41 | 3. builds the lesson via `sandpaper:::ci_deploy()` 42 | 43 | #### Caching 44 | 45 | This workflow has two caches; one cache is for the lesson infrastructure and 46 | the other is for the lesson dependencies if the lesson contains rendered 47 | content. These caches are invalidated by new versions of the infrastructure and 48 | the `renv.lock` file, respectively. If there is a problem with the cache, 49 | manual invalidation is necessary. You will need maintainer access to the repository, 50 | and you can either go to the actions tab and [click on the caches button to find 51 | and invalidate the failing cache](https://github.blog/changelog/2022-10-20-manage-caches-in-your-actions-workflows-from-web-interface/) 52 | or set the `CACHE_VERSION` secret to the current date (which will 53 | invalidate all of the caches).
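For example, if you have the [GitHub CLI](https://cli.github.com/) installed and authenticated, the secret can be set from the command line. This is only a sketch; the repository slug is a placeholder you would replace with your own:

```bash
# Invalidate every cache by bumping the CACHE_VERSION secret to today's date.
# Replace <owner>/<lesson-repo> with your repository.
gh secret set CACHE_VERSION --repo <owner>/<lesson-repo> --body "$(date +%F)"
```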
54 | 55 | ## Updates 56 | 57 | ### Setup Information 58 | 59 | These workflows run on a schedule and at the maintainer's request. Because they 60 | create pull requests that update workflows/require the downstream actions to run, 61 | they need a special repository/organization secret token called 62 | `SANDPAPER_WORKFLOW` and it must have the `public_repo` and `workflow` scope. 63 | 64 | This can be an individual user token, OR it can be a trusted bot account. If you 65 | have a repository in one of the official Carpentries accounts, then you do not 66 | need to worry about this token being present because the Carpentries Core Team 67 | will take care of supplying this token. 68 | 69 | If you want to use your personal account: you can go to 70 | 71 | to create a token. Once you have created your token, you should copy it to your 72 | clipboard and then go to your repository's settings > secrets > actions and 73 | create or edit the `SANDPAPER_WORKFLOW` secret, pasting in the generated token. 74 | 75 | If you do not specify your token correctly, the runs will not fail and they will 76 | give you instructions to provide the token for your repository. 77 | 78 | ### 02 Maintain: Update Workflow Files (update-workflows.yaml) 79 | 80 | The {sandpaper} repository was designed to do as much as possible to separate 81 | the tools from the content. For local builds, this is absolutely true, but 82 | there is a minor issue when it comes to workflow files: they must live inside 83 | the repository. 84 | 85 | This workflow ensures that the workflow files are up-to-date. The way it works is 86 | to download the update-workflows.sh script from GitHub and run it. The script 87 | will do the following: 88 | 89 | 1. check the recorded version of sandpaper against the current version on GitHub 90 | 2. update the files if there is a difference in versions 91 | 92 | After the files are updated, if there are any changes, they are pushed to a 93 | branch called `update/workflows` and a pull request is created. Maintainers are 94 | encouraged to review the changes and accept the pull request if the outputs 95 | are okay. 96 | 97 | This update is run weekly or on demand. 98 | 99 | ### 03 Maintain: Update Package Cache (update-cache.yaml) 100 | 101 | For lessons that have generated content, we use {renv} to ensure that the output 102 | is stable. This is controlled by a single lockfile which documents the packages 103 | needed for the lesson and the version numbers. This workflow is skipped in 104 | lessons that do not have generated content. 105 | 106 | Because the lessons need to remain current with the package ecosystem, it's a 107 | good idea to make sure these packages can be updated periodically. The 108 | update cache workflow will do this by checking for updates, applying them in a 109 | branch called `update/packages` and creating a pull request with _only the 110 | lockfile changed_. 111 | 112 | From here, the markdown documents will be rebuilt and you can inspect what has 113 | changed based on how the packages have updated. 114 | 115 | ## Pull Request and Review Management 116 | 117 | Because our lessons execute code, pull requests are a security risk for any 118 | lesson and thus have security measures associated with them.
**Do not merge any 119 | pull requests that do not pass checks and do not have bots commented on them.** 120 | 121 | These workflows all go together and are described in the following 122 | diagram and the sections below: 123 | 124 | ![Graph representation of a pull request](https://carpentries.github.io/sandpaper/articles/img/pr-flow.dot.svg) 125 | 126 | ### Pre Flight Pull Request Validation (pr-preflight.yaml) 127 | 128 | This workflow runs every time a pull request is created and its purpose is to 129 | validate that the pull request is okay to run. This means the following things: 130 | 131 | 1. The pull request does not contain modified workflow files 132 | 2. If the pull request contains modified workflow files, it does not contain 133 | modified content files (such as a situation where @carpentries-bot will 134 | make an automated pull request) 135 | 3. The pull request does not contain an invalid commit hash (e.g. from a fork 136 | that was made before a lesson was transitioned from styles to use the 137 | workbench). 138 | 139 | Once the checks are finished, a comment is issued to the pull request, which 140 | will allow maintainers to determine if it is safe to run the 141 | "Receive Pull Request" workflow from new contributors. 142 | 143 | ### Receive Pull Request (pr-receive.yaml) 144 | 145 | **Note of caution:** This workflow runs arbitrary code by anyone who creates a 146 | pull request. GitHub has safeguarded the token used in this workflow to have no 147 | privileges in the repository, but we have taken precautions to protect against 148 | spoofing. 149 | 150 | This workflow is triggered with every push to a pull request. If this workflow 151 | is already running and a new push is sent to the pull request, the workflow 152 | running from the previous push will be cancelled and a new workflow run will be 153 | started. 154 | 155 | The first step of this workflow is to check if it is valid (e.g. that no 156 | workflow files have been modified). If there are workflow files that have been 157 | modified, a comment is made that indicates that the workflow is not run. If 158 | both a workflow file and lesson content are modified, an error will occur. 159 | 160 | The second step (if valid) is to build the generated content from the pull 161 | request. This builds the content and uploads three artifacts: 162 | 163 | 1. The pull request number (pr) 164 | 2. A summary of changes after the rendering process (diff) 165 | 3. The rendered files (build) 166 | 167 | Because this workflow builds generated content, it follows the same general 168 | process as the `sandpaper-main` workflow with the same caching mechanisms. 169 | 170 | The artifacts produced are used by the next workflow. 171 | 172 | ### Comment on Pull Request (pr-comment.yaml) 173 | 174 | This workflow is triggered if the `pr-receive.yaml` workflow is successful. 175 | The steps in this workflow are: 176 | 177 | 1. Test if the workflow is valid and comment the validity of the workflow to the 178 | pull request. 179 | 2. If it is valid: create an orphan branch with two commits: the current state 180 | of the repository and the proposed changes. 181 | 3. If it is valid: update the pull request comment with the summary of changes. 182 | 183 | Importantly: if the pull request is invalid, the branch is not created so any 184 | malicious code is not published. 185 | 186 | From here, the maintainer can request changes from the author and eventually 187 | either merge or reject the PR.
When this happens, if the PR was valid, the 188 | preview branch needs to be deleted. 189 | 190 | ### Send Close PR Signal (pr-close-signal.yaml) 191 | 192 | Triggered any time a pull request is closed. This emits an artifact that is the 193 | pull request number for the next action. 194 | 195 | ### Remove Pull Request Branch (pr-post-remove-branch.yaml) 196 | 197 | Triggered by `pr-close-signal.yaml`. This removes the temporary branch associated with 198 | the pull request (if it was created). 199 | -------------------------------------------------------------------------------- /.github/workflows/pr-close-signal.yaml: -------------------------------------------------------------------------------- 1 | name: "Bot: Send Close Pull Request Signal" 2 | 3 | on: 4 | pull_request: 5 | types: 6 | [closed] 7 | 8 | jobs: 9 | send-close-signal: 10 | name: "Send closing signal" 11 | runs-on: ubuntu-latest 12 | if: ${{ github.event.action == 'closed' }} 13 | steps: 14 | - name: "Create PRtifact" 15 | run: | 16 | mkdir -p ./pr 17 | printf ${{ github.event.number }} > ./pr/NUM 18 | - name: Upload Diff 19 | uses: actions/upload-artifact@v4 20 | with: 21 | name: pr 22 | path: ./pr 23 | 24 | -------------------------------------------------------------------------------- /.github/workflows/pr-comment.yaml: -------------------------------------------------------------------------------- 1 | name: "Bot: Comment on the Pull Request" 2 | 3 | # read-write repo token 4 | # access to secrets 5 | on: 6 | workflow_run: 7 | workflows: ["Receive Pull Request"] 8 | types: 9 | - completed 10 | 11 | concurrency: 12 | group: pr-${{ github.event.workflow_run.pull_requests[0].number }} 13 | cancel-in-progress: true 14 | 15 | 16 | jobs: 17 | # Pull requests are valid if: 18 | # - they match the sha of the workflow run head commit 19 | # - they are open 20 | # - no .github files were committed 21 | test-pr: 22 | name: "Test if pull request is valid" 23 | runs-on: ubuntu-latest 24 | if: > 25 | github.event.workflow_run.event == 'pull_request' && 26 | github.event.workflow_run.conclusion == 'success' 27 | outputs: 28 | is_valid: ${{ steps.check-pr.outputs.VALID }} 29 | payload: ${{ steps.check-pr.outputs.payload }} 30 | number: ${{ steps.get-pr.outputs.NUM }} 31 | msg: ${{ steps.check-pr.outputs.MSG }} 32 | steps: 33 | - name: 'Download PR artifact' 34 | id: dl 35 | uses: carpentries/actions/download-workflow-artifact@main 36 | with: 37 | run: ${{ github.event.workflow_run.id }} 38 | name: 'pr' 39 | 40 | - name: "Get PR Number" 41 | if: ${{ steps.dl.outputs.success == 'true' }} 42 | id: get-pr 43 | run: | 44 | unzip pr.zip 45 | echo "NUM=$(<./NR)" >> $GITHUB_OUTPUT 46 | 47 | - name: "Fail if PR number was not present" 48 | id: bad-pr 49 | if: ${{ steps.dl.outputs.success != 'true' }} 50 | run: | 51 | echo '::error::A pull request number was not recorded. The pull request that triggered this workflow is likely malicious.'
52 | exit 1 53 | - name: "Get Invalid Hashes File" 54 | id: hash 55 | run: | 56 | echo "json<> $GITHUB_OUTPUT 59 | - name: "Check PR" 60 | id: check-pr 61 | if: ${{ steps.dl.outputs.success == 'true' }} 62 | uses: carpentries/actions/check-valid-pr@main 63 | with: 64 | pr: ${{ steps.get-pr.outputs.NUM }} 65 | sha: ${{ github.event.workflow_run.head_sha }} 66 | headroom: 3 # if it's within the last three commits, we can keep going, because it's likely rapid-fire 67 | invalid: ${{ fromJSON(steps.hash.outputs.json)[github.repository] }} 68 | fail_on_error: true 69 | 70 | # Create an orphan branch on this repository with two commits 71 | # - the current HEAD of the md-outputs branch 72 | # - the output from running the current HEAD of the pull request through 73 | # the md generator 74 | create-branch: 75 | name: "Create Git Branch" 76 | needs: test-pr 77 | runs-on: ubuntu-latest 78 | if: ${{ needs.test-pr.outputs.is_valid == 'true' }} 79 | env: 80 | NR: ${{ needs.test-pr.outputs.number }} 81 | permissions: 82 | contents: write 83 | steps: 84 | - name: 'Checkout md outputs' 85 | uses: actions/checkout@v4 86 | with: 87 | ref: md-outputs 88 | path: built 89 | fetch-depth: 1 90 | 91 | - name: 'Download built markdown' 92 | id: dl 93 | uses: carpentries/actions/download-workflow-artifact@main 94 | with: 95 | run: ${{ github.event.workflow_run.id }} 96 | name: 'built' 97 | 98 | - if: ${{ steps.dl.outputs.success == 'true' }} 99 | run: unzip built.zip 100 | 101 | - name: "Create orphan and push" 102 | if: ${{ steps.dl.outputs.success == 'true' }} 103 | run: | 104 | cd built/ 105 | git config --local user.email "actions@github.com" 106 | git config --local user.name "GitHub Actions" 107 | CURR_HEAD=$(git rev-parse HEAD) 108 | git checkout --orphan md-outputs-PR-${NR} 109 | git add -A 110 | git commit -m "source commit: ${CURR_HEAD}" 111 | ls -A | grep -v '^.git$' | xargs -I _ rm -r '_' 112 | cd .. 
113 | unzip -o -d built built.zip 114 | cd built 115 | git add -A 116 | git commit --allow-empty -m "differences for PR #${NR}" 117 | git push -u --force --set-upstream origin md-outputs-PR-${NR} 118 | 119 | # Comment on the Pull Request with a link to the branch and the diff 120 | comment-pr: 121 | name: "Comment on Pull Request" 122 | needs: [test-pr, create-branch] 123 | runs-on: ubuntu-latest 124 | if: ${{ needs.test-pr.outputs.is_valid == 'true' }} 125 | env: 126 | NR: ${{ needs.test-pr.outputs.number }} 127 | permissions: 128 | pull-requests: write 129 | steps: 130 | - name: 'Download comment artifact' 131 | id: dl 132 | uses: carpentries/actions/download-workflow-artifact@main 133 | with: 134 | run: ${{ github.event.workflow_run.id }} 135 | name: 'diff' 136 | 137 | - if: ${{ steps.dl.outputs.success == 'true' }} 138 | run: unzip ${{ github.workspace }}/diff.zip 139 | 140 | - name: "Comment on PR" 141 | id: comment-diff 142 | if: ${{ steps.dl.outputs.success == 'true' }} 143 | uses: carpentries/actions/comment-diff@main 144 | with: 145 | pr: ${{ env.NR }} 146 | path: ${{ github.workspace }}/diff.md 147 | 148 | # Comment if the PR is open and matches the SHA, but the workflow files have 149 | # changed 150 | comment-changed-workflow: 151 | name: "Comment if workflow files have changed" 152 | needs: test-pr 153 | runs-on: ubuntu-latest 154 | if: ${{ always() && needs.test-pr.outputs.is_valid == 'false' }} 155 | env: 156 | NR: ${{ github.event.workflow_run.pull_requests[0].number }} 157 | body: ${{ needs.test-pr.outputs.msg }} 158 | permissions: 159 | pull-requests: write 160 | steps: 161 | - name: 'Check for spoofing' 162 | id: dl 163 | uses: carpentries/actions/download-workflow-artifact@main 164 | with: 165 | run: ${{ github.event.workflow_run.id }} 166 | name: 'built' 167 | 168 | - name: 'Alert if spoofed' 169 | id: spoof 170 | if: ${{ steps.dl.outputs.success == 'true' }} 171 | run: | 172 | echo 'body<> $GITHUB_ENV 173 | echo '' >> $GITHUB_ENV 174 | echo '## :x: DANGER :x:' >> $GITHUB_ENV 175 | echo 'This pull request has modified workflows that created output. Close this now.' 
>> $GITHUB_ENV 176 | echo '' >> $GITHUB_ENV 177 | echo 'EOF' >> $GITHUB_ENV 178 | 179 | - name: "Comment on PR" 180 | id: comment-diff 181 | uses: carpentries/actions/comment-diff@main 182 | with: 183 | pr: ${{ env.NR }} 184 | body: ${{ env.body }} 185 | 186 | -------------------------------------------------------------------------------- /.github/workflows/pr-post-remove-branch.yaml: -------------------------------------------------------------------------------- 1 | name: "Bot: Remove Temporary PR Branch" 2 | 3 | on: 4 | workflow_run: 5 | workflows: ["Bot: Send Close Pull Request Signal"] 6 | types: 7 | - completed 8 | 9 | jobs: 10 | delete: 11 | name: "Delete branch from Pull Request" 12 | runs-on: ubuntu-latest 13 | if: > 14 | github.event.workflow_run.event == 'pull_request' && 15 | github.event.workflow_run.conclusion == 'success' 16 | permissions: 17 | contents: write 18 | steps: 19 | - name: 'Download artifact' 20 | uses: carpentries/actions/download-workflow-artifact@main 21 | with: 22 | run: ${{ github.event.workflow_run.id }} 23 | name: pr 24 | - name: "Get PR Number" 25 | id: get-pr 26 | run: | 27 | unzip pr.zip 28 | echo "NUM=$(<./NUM)" >> $GITHUB_OUTPUT 29 | - name: 'Remove branch' 30 | uses: carpentries/actions/remove-branch@main 31 | with: 32 | pr: ${{ steps.get-pr.outputs.NUM }} 33 | -------------------------------------------------------------------------------- /.github/workflows/pr-preflight.yaml: -------------------------------------------------------------------------------- 1 | name: "Pull Request Preflight Check" 2 | 3 | on: 4 | pull_request_target: 5 | branches: 6 | ["main"] 7 | types: 8 | ["opened", "synchronize", "reopened"] 9 | 10 | jobs: 11 | test-pr: 12 | name: "Test if pull request is valid" 13 | if: ${{ github.event.action != 'closed' }} 14 | runs-on: ubuntu-latest 15 | outputs: 16 | is_valid: ${{ steps.check-pr.outputs.VALID }} 17 | permissions: 18 | pull-requests: write 19 | steps: 20 | - name: "Get Invalid Hashes File" 21 | id: hash 22 | run: | 23 | echo "json<> $GITHUB_OUTPUT 26 | - name: "Check PR" 27 | id: check-pr 28 | uses: carpentries/actions/check-valid-pr@main 29 | with: 30 | pr: ${{ github.event.number }} 31 | invalid: ${{ fromJSON(steps.hash.outputs.json)[github.repository] }} 32 | fail_on_error: true 33 | - name: "Comment result of validation" 34 | id: comment-diff 35 | if: ${{ always() }} 36 | uses: carpentries/actions/comment-diff@main 37 | with: 38 | pr: ${{ github.event.number }} 39 | body: ${{ steps.check-pr.outputs.MSG }} 40 | -------------------------------------------------------------------------------- /.github/workflows/pr-receive.yaml: -------------------------------------------------------------------------------- 1 | name: "Receive Pull Request" 2 | 3 | on: 4 | pull_request: 5 | types: 6 | [opened, synchronize, reopened] 7 | 8 | concurrency: 9 | group: ${{ github.ref }} 10 | cancel-in-progress: true 11 | 12 | jobs: 13 | test-pr: 14 | name: "Record PR number" 15 | if: ${{ github.event.action != 'closed' }} 16 | runs-on: ubuntu-latest 17 | outputs: 18 | is_valid: ${{ steps.check-pr.outputs.VALID }} 19 | steps: 20 | - name: "Record PR number" 21 | id: record 22 | if: ${{ always() }} 23 | run: | 24 | echo ${{ github.event.number }} > ${{ github.workspace }}/NR # 2022-03-02: artifact name fixed to be NR 25 | - name: "Upload PR number" 26 | id: upload 27 | if: ${{ always() }} 28 | uses: actions/upload-artifact@v4 29 | with: 30 | name: pr 31 | path: ${{ github.workspace }}/NR 32 | - name: "Get Invalid Hashes File" 33 | id: hash 34 | 
run: | 35 | echo "json<> $GITHUB_OUTPUT 38 | - name: "echo output" 39 | run: | 40 | echo "${{ steps.hash.outputs.json }}" 41 | - name: "Check PR" 42 | id: check-pr 43 | uses: carpentries/actions/check-valid-pr@main 44 | with: 45 | pr: ${{ github.event.number }} 46 | invalid: ${{ fromJSON(steps.hash.outputs.json)[github.repository] }} 47 | 48 | build-md-source: 49 | name: "Build markdown source files if valid" 50 | needs: test-pr 51 | runs-on: ubuntu-latest 52 | if: ${{ needs.test-pr.outputs.is_valid == 'true' }} 53 | env: 54 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 55 | RENV_PATHS_ROOT: ~/.local/share/renv/ 56 | CHIVE: ${{ github.workspace }}/site/chive 57 | PR: ${{ github.workspace }}/site/pr 58 | MD: ${{ github.workspace }}/site/built 59 | steps: 60 | - name: "Check Out Main Branch" 61 | uses: actions/checkout@v4 62 | 63 | - name: "Check Out Staging Branch" 64 | uses: actions/checkout@v4 65 | with: 66 | ref: md-outputs 67 | path: ${{ env.MD }} 68 | 69 | - name: "Set up R" 70 | uses: r-lib/actions/setup-r@v2 71 | with: 72 | use-public-rspm: true 73 | install-r: false 74 | 75 | - name: "Set up Pandoc" 76 | uses: r-lib/actions/setup-pandoc@v2 77 | 78 | - name: "Setup Lesson Engine" 79 | uses: carpentries/actions/setup-sandpaper@main 80 | with: 81 | cache-version: ${{ secrets.CACHE_VERSION }} 82 | 83 | - name: "Setup Package Cache" 84 | uses: carpentries/actions/setup-lesson-deps@main 85 | with: 86 | cache-version: ${{ secrets.CACHE_VERSION }} 87 | 88 | - name: "Validate and Build Markdown" 89 | id: build-site 90 | run: | 91 | sandpaper::package_cache_trigger(TRUE) 92 | sandpaper::validate_lesson(path = '${{ github.workspace }}') 93 | sandpaper:::build_markdown(path = '${{ github.workspace }}', quiet = FALSE) 94 | shell: Rscript {0} 95 | 96 | - name: "Generate Artifacts" 97 | id: generate-artifacts 98 | run: | 99 | sandpaper:::ci_bundle_pr_artifacts( 100 | repo = '${{ github.repository }}', 101 | pr_number = '${{ github.event.number }}', 102 | path_md = '${{ env.MD }}', 103 | path_pr = '${{ env.PR }}', 104 | path_archive = '${{ env.CHIVE }}', 105 | branch = 'md-outputs' 106 | ) 107 | shell: Rscript {0} 108 | 109 | - name: "Upload PR" 110 | uses: actions/upload-artifact@v4 111 | with: 112 | name: pr 113 | path: ${{ env.PR }} 114 | 115 | - name: "Upload Diff" 116 | uses: actions/upload-artifact@v4 117 | with: 118 | name: diff 119 | path: ${{ env.CHIVE }} 120 | retention-days: 1 121 | 122 | - name: "Upload Build" 123 | uses: actions/upload-artifact@v4 124 | with: 125 | name: built 126 | path: ${{ env.MD }} 127 | retention-days: 1 128 | 129 | - name: "Teardown" 130 | run: sandpaper::reset_site() 131 | shell: Rscript {0} 132 | -------------------------------------------------------------------------------- /.github/workflows/sandpaper-main.yaml: -------------------------------------------------------------------------------- 1 | name: "01 Build and Deploy Site" 2 | 3 | on: 4 | push: 5 | branches: 6 | - main 7 | - master 8 | schedule: 9 | - cron: '0 0 * * 2' 10 | workflow_dispatch: 11 | inputs: 12 | name: 13 | description: 'Who triggered this build?' 
14 | required: true 15 | default: 'Maintainer (via GitHub)' 16 | reset: 17 | description: 'Reset cached markdown files' 18 | required: false 19 | default: false 20 | type: boolean 21 | jobs: 22 | full-build: 23 | name: "Build Full Site" 24 | runs-on: ubuntu-latest 25 | permissions: 26 | checks: write 27 | contents: write 28 | pages: write 29 | env: 30 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 31 | RENV_PATHS_ROOT: ~/.local/share/renv/ 32 | steps: 33 | 34 | - name: "Checkout Lesson" 35 | uses: actions/checkout@v4 36 | 37 | - name: "Set up R" 38 | uses: r-lib/actions/setup-r@v2 39 | with: 40 | use-public-rspm: true 41 | install-r: false 42 | 43 | - name: "Set up Pandoc" 44 | uses: r-lib/actions/setup-pandoc@v2 45 | 46 | - name: "Setup Lesson Engine" 47 | uses: carpentries/actions/setup-sandpaper@main 48 | with: 49 | cache-version: ${{ secrets.CACHE_VERSION }} 50 | 51 | - name: "Setup Package Cache" 52 | uses: carpentries/actions/setup-lesson-deps@main 53 | with: 54 | cache-version: ${{ secrets.CACHE_VERSION }} 55 | 56 | - name: "Deploy Site" 57 | run: | 58 | reset <- "${{ github.event.inputs.reset }}" == "true" 59 | sandpaper::package_cache_trigger(TRUE) 60 | sandpaper:::ci_deploy(reset = reset) 61 | shell: Rscript {0} 62 | -------------------------------------------------------------------------------- /.github/workflows/sandpaper-version.txt: -------------------------------------------------------------------------------- 1 | 0.16.6 2 | -------------------------------------------------------------------------------- /.github/workflows/update-cache.yaml: -------------------------------------------------------------------------------- 1 | name: "03 Maintain: Update Package Cache" 2 | 3 | on: 4 | workflow_dispatch: 5 | inputs: 6 | name: 7 | description: 'Who triggered this build (enter github username to tag yourself)?' 
8 | required: true 9 | default: 'monthly run' 10 | schedule: 11 | # Run every tuesday 12 | - cron: '0 0 * * 2' 13 | 14 | jobs: 15 | preflight: 16 | name: "Preflight Check" 17 | runs-on: ubuntu-latest 18 | outputs: 19 | ok: ${{ steps.check.outputs.ok }} 20 | steps: 21 | - id: check 22 | run: | 23 | if [[ ${{ github.event_name }} == 'workflow_dispatch' ]]; then 24 | echo "ok=true" >> $GITHUB_OUTPUT 25 | echo "Running on request" 26 | # using single brackets here to avoid 08 being interpreted as octal 27 | # https://github.com/carpentries/sandpaper/issues/250 28 | elif [ `date +%d` -le 7 ]; then 29 | # If the Tuesday lands in the first week of the month, run it 30 | echo "ok=true" >> $GITHUB_OUTPUT 31 | echo "Running on schedule" 32 | else 33 | echo "ok=false" >> $GITHUB_OUTPUT 34 | echo "Not Running Today" 35 | fi 36 | 37 | check_renv: 38 | name: "Check if We Need {renv}" 39 | runs-on: ubuntu-latest 40 | needs: preflight 41 | if: ${{ needs.preflight.outputs.ok == 'true'}} 42 | outputs: 43 | needed: ${{ steps.renv.outputs.exists }} 44 | steps: 45 | - name: "Checkout Lesson" 46 | uses: actions/checkout@v4 47 | - id: renv 48 | run: | 49 | if [[ -d renv ]]; then 50 | echo "exists=true" >> $GITHUB_OUTPUT 51 | fi 52 | 53 | check_token: 54 | name: "Check SANDPAPER_WORKFLOW token" 55 | runs-on: ubuntu-latest 56 | needs: check_renv 57 | if: ${{ needs.check_renv.outputs.needed == 'true' }} 58 | outputs: 59 | workflow: ${{ steps.validate.outputs.wf }} 60 | repo: ${{ steps.validate.outputs.repo }} 61 | steps: 62 | - name: "validate token" 63 | id: validate 64 | uses: carpentries/actions/check-valid-credentials@main 65 | with: 66 | token: ${{ secrets.SANDPAPER_WORKFLOW }} 67 | 68 | update_cache: 69 | name: "Update Package Cache" 70 | needs: check_token 71 | if: ${{ needs.check_token.outputs.repo== 'true' }} 72 | runs-on: ubuntu-latest 73 | env: 74 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 75 | RENV_PATHS_ROOT: ~/.local/share/renv/ 76 | steps: 77 | 78 | - name: "Checkout Lesson" 79 | uses: actions/checkout@v4 80 | 81 | - name: "Set up R" 82 | uses: r-lib/actions/setup-r@v2 83 | with: 84 | use-public-rspm: true 85 | install-r: false 86 | 87 | - name: "Update {renv} deps and determine if a PR is needed" 88 | id: update 89 | uses: carpentries/actions/update-lockfile@main 90 | with: 91 | cache-version: ${{ secrets.CACHE_VERSION }} 92 | 93 | - name: Create Pull Request 94 | id: cpr 95 | if: ${{ steps.update.outputs.n > 0 }} 96 | uses: carpentries/create-pull-request@main 97 | with: 98 | token: ${{ secrets.SANDPAPER_WORKFLOW }} 99 | delete-branch: true 100 | branch: "update/packages" 101 | commit-message: "[actions] update ${{ steps.update.outputs.n }} packages" 102 | title: "Update ${{ steps.update.outputs.n }} packages" 103 | body: | 104 | :robot: This is an automated build 105 | 106 | This will update ${{ steps.update.outputs.n }} packages in your lesson with the following versions: 107 | 108 | ``` 109 | ${{ steps.update.outputs.report }} 110 | ``` 111 | 112 | :stopwatch: In a few minutes, a comment will appear that will show you how the output has changed based on these updates. 
113 | 114 | If you want to inspect these changes locally, you can use the following code to check out a new branch: 115 | 116 | ```bash 117 | git fetch origin update/packages 118 | git checkout update/packages 119 | ``` 120 | 121 | - Auto-generated by [create-pull-request][1] on ${{ steps.update.outputs.date }} 122 | 123 | [1]: https://github.com/carpentries/create-pull-request/tree/main 124 | labels: "type: package cache" 125 | draft: false 126 | -------------------------------------------------------------------------------- /.github/workflows/update-workflows.yaml: -------------------------------------------------------------------------------- 1 | name: "02 Maintain: Update Workflow Files" 2 | 3 | on: 4 | workflow_dispatch: 5 | inputs: 6 | name: 7 | description: 'Who triggered this build (enter github username to tag yourself)?' 8 | required: true 9 | default: 'weekly run' 10 | clean: 11 | description: 'Workflow files/file extensions to clean (no wildcards, enter "" for none)' 12 | required: false 13 | default: '.yaml' 14 | schedule: 15 | # Run every Tuesday 16 | - cron: '0 0 * * 2' 17 | 18 | jobs: 19 | check_token: 20 | name: "Check SANDPAPER_WORKFLOW token" 21 | runs-on: ubuntu-latest 22 | outputs: 23 | workflow: ${{ steps.validate.outputs.wf }} 24 | repo: ${{ steps.validate.outputs.repo }} 25 | steps: 26 | - name: "validate token" 27 | id: validate 28 | uses: carpentries/actions/check-valid-credentials@main 29 | with: 30 | token: ${{ secrets.SANDPAPER_WORKFLOW }} 31 | 32 | update_workflow: 33 | name: "Update Workflow" 34 | runs-on: ubuntu-latest 35 | needs: check_token 36 | if: ${{ needs.check_token.outputs.workflow == 'true' }} 37 | steps: 38 | - name: "Checkout Repository" 39 | uses: actions/checkout@v4 40 | 41 | - name: Update Workflows 42 | id: update 43 | uses: carpentries/actions/update-workflows@main 44 | with: 45 | clean: ${{ github.event.inputs.clean }} 46 | 47 | - name: Create Pull Request 48 | id: cpr 49 | if: "${{ steps.update.outputs.new }}" 50 | uses: carpentries/create-pull-request@main 51 | with: 52 | token: ${{ secrets.SANDPAPER_WORKFLOW }} 53 | delete-branch: true 54 | branch: "update/workflows" 55 | commit-message: "[actions] update sandpaper workflow to version ${{ steps.update.outputs.new }}" 56 | title: "Update Workflows to Version ${{ steps.update.outputs.new }}" 57 | body: | 58 | :robot: This is an automated build 59 | 60 | Update Workflows from sandpaper version ${{ steps.update.outputs.old }} -> ${{ steps.update.outputs.new }} 61 | 62 | - Auto-generated by [create-pull-request][1] on ${{ steps.update.outputs.date }} 63 | 64 | [1]: https://github.com/carpentries/create-pull-request/tree/main 65 | labels: "type: template and tools" 66 | draft: false 67 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # sandpaper files 2 | episodes/*html 3 | site/* 4 | !site/README.md 5 | 6 | # History files 7 | .Rhistory 8 | .Rapp.history 9 | 10 | # Session Data files 11 | .RData 12 | 13 | # User-specific files 14 | .Ruserdata 15 | 16 | # Example code in package build process 17 | *-Ex.R 18 | 19 | # Output files from R CMD build 20 | /*.tar.gz 21 | 22 | # Output files from R CMD check 23 | /*.Rcheck/ 24 | 25 | # RStudio files 26 | .Rproj.user/ 27 | 28 | # produced vignettes 29 | vignettes/*.html 30 | vignettes/*.pdf 31 | 32 | # OAuth2 token, see https://github.com/hadley/httr/releases/tag/v0.3 33 | .httr-oauth 34 | 35 | # knitr and R markdown 
default cache directories 36 | *_cache/ 37 | /cache/ 38 | 39 | # Temporary files created by R markdown 40 | *.utf8.md 41 | *.knit.md 42 | 43 | # R Environment Variables 44 | .Renviron 45 | 46 | # pkgdown site 47 | docs/ 48 | 49 | # translation temp files 50 | po/*~ 51 | 52 | -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | # Travis-CI config for https://github.com/hpc-carpentry/hpc-chapel 2 | # Results at https://travis-ci.org/github/hpc-carpentry/hpc-chapel 3 | 4 | dist: xenial 5 | language: python 6 | python: 3.7 7 | 8 | branches: 9 | only: 10 | - gh-pages 11 | - /.*/ 12 | 13 | before_install: 14 | install: 15 | script: 16 | 17 | jobs: 18 | include: 19 | - stage: "Check for typos and spelling mistakes" 20 | before_install: # Don't need everything to build the site 21 | install: 22 | pip install codespell 23 | script: 24 | codespell --skip="assets,*.svg,bin" --quiet-level=2 -L "rouge,dropse,namd,hist" 25 | - stage: "Build the site" 26 | before_install: 27 | - sudo apt-get update -y 28 | - rvm default 29 | - gem install bundler jekyll json kramdown 30 | - bundle config build.nokogiri --use-system-libraries 31 | - bundle install 32 | install: 33 | pip install pyyaml 34 | script: 35 | - make lesson-check-all 36 | - make --always-make site 37 | 38 | -------------------------------------------------------------------------------- /AUTHORS: -------------------------------------------------------------------------------- 1 | HPC Chapel is maintained by 2 | 3 | - [Alex Razoumov](mailto:alex.razoumov@westgrid.ca) 4 | 5 | It was written and edited by 6 | 7 | - [@razoumov](https://github.com/razoumov) 8 | - [@jcarzu](https://github.com/jcarzu) 9 | -------------------------------------------------------------------------------- /CITATION: -------------------------------------------------------------------------------- 1 | To reference this lesson, please cite: 2 | 3 | Razoumov A., Zuniga J. (2024). Introduction to High-Performance Computing in Chapel. https://www.hpc-carpentry.org/hpc-chapel 4 | -------------------------------------------------------------------------------- /CITATION.cff: -------------------------------------------------------------------------------- 1 | # This template CITATION.cff file was generated with cffinit. 2 | # Visit https://bit.ly/cffinit to replace its contents 3 | # with information about your lesson. 4 | # Remember to update this file periodically, 5 | # ensuring that the author list and other fields remain accurate. 6 | 7 | cff-version: 1.2.0 8 | title: Introduction to High-Performance Computing in Chapel 9 | message: >- 10 | Please cite this lesson using the information in this file 11 | when you refer to it in publications, and/or if you 12 | re-use, adapt, or expand on the content in your own 13 | training material. 14 | type: dataset 15 | authors: 16 | - given-names: Alex 17 | family-names: Razoumov 18 | email: alex.razoumov@westdri.ca 19 | affiliation: SFU 20 | - given-names: Juan 21 | family-names: Zuniga 22 | repository-code: 'https://github.com/hpc-carpentry/hpc-chapel' 23 | url: 'https://www.hpc-carpentry.org/hpc-chapel' 24 | abstract: >- 25 | This lesson is an introduction to high-performance 26 | computing using Chapel parallel language. 
27 | keywords: 28 | - 'Chapel, HPC, parallel' 29 | license: CC-BY-4.0 30 | -------------------------------------------------------------------------------- /CODEOWNERS: -------------------------------------------------------------------------------- 1 | # This file lists the contributors responsible for the 2 | # repository content. They will also be automatically 3 | # asked to review any pull request made in this repository. 4 | 5 | # Each line is a file pattern followed by one or more owners. 6 | # The sequence matters: later patterns take precedence. 7 | 8 | # FILES OWNERS 9 | * @hpc-carpentry/hpc-chapel-maintainers 10 | -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Contributor Code of Conduct" 3 | --- 4 | 5 | As contributors and maintainers of this project, 6 | we pledge to follow the [The Carpentries Code of Conduct][coc]. 7 | 8 | Instances of abusive, harassing, or otherwise unacceptable behavior 9 | may be reported by following our [reporting guidelines][coc-reporting]. 10 | 11 | 12 | [coc-reporting]: https://docs.carpentries.org/topic_folders/policies/incident-reporting.html 13 | [coc]: https://docs.carpentries.org/topic_folders/policies/code-of-conduct.html 14 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | ## Contributing 2 | 3 | [The Carpentries][cp-site] ([Software Carpentry][swc-site], [Data 4 | Carpentry][dc-site], and [Library Carpentry][lc-site]) are open source 5 | projects, and we welcome contributions of all kinds: new lessons, fixes to 6 | existing material, bug reports, and reviews of proposed changes are all 7 | welcome. 8 | 9 | ### Contributor Agreement 10 | 11 | By contributing, you agree that we may redistribute your work under [our 12 | license](LICENSE.md). In exchange, we will address your issues and/or assess 13 | your change proposal as promptly as we can, and help you become a member of our 14 | community. Everyone involved in [The Carpentries][cp-site] agrees to abide by 15 | our [code of conduct](CODE_OF_CONDUCT.md). 16 | 17 | ### How to Contribute 18 | 19 | The easiest way to get started is to file an issue to tell us about a spelling 20 | mistake, some awkward wording, or a factual error. This is a good way to 21 | introduce yourself and to meet some of our community members. 22 | 23 | 1. If you do not have a [GitHub][github] account, you can [send us comments by 24 | email][contact]. However, we will be able to respond more quickly if you use 25 | one of the other methods described below. 26 | 27 | 2. If you have a [GitHub][github] account, or are willing to [create 28 | one][github-join], but do not know how to use Git, you can report problems 29 | or suggest improvements by [creating an issue][repo-issues]. This allows us 30 | to assign the item to someone and to respond to it in a threaded discussion. 31 | 32 | 3. If you are comfortable with Git, and would like to add or change material, 33 | you can submit a pull request (PR). Instructions for doing this are 34 | [included below](#using-github). For inspiration about changes that need to 35 | be made, check out the [list of open issues][issues] across the Carpentries. 36 | 37 | Note: if you want to build the website locally, please refer to [The Workbench 38 | documentation][template-doc]. 
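As a rough sketch (assuming R, pandoc, and the workbench packages described in that documentation are already installed), a local preview can be started from the lesson root with:

```bash
# build the lesson and serve it locally with live reload
Rscript -e 'sandpaper::serve()'
```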
39 | 40 | ### Where to Contribute 41 | 42 | 1. If you wish to change this lesson, add issues and pull requests here. 43 | 2. If you wish to change the template used for workshop websites, please refer 44 | to [The Workbench documentation][template-doc]. 45 | 46 | 47 | ### What to Contribute 48 | 49 | There are many ways to contribute, from writing new exercises and improving 50 | existing ones to updating or filling in the documentation and submitting [bug 51 | reports][issues] about things that do not work, are not clear, or are missing. 52 | If you are looking for ideas, please see [the list of issues for this 53 | repository][repo-issues], or the issues for [Data Carpentry][dc-issues], 54 | [Library Carpentry][lc-issues], and [Software Carpentry][swc-issues] projects. 55 | 56 | Comments on issues and reviews of pull requests are just as welcome: we are 57 | smarter together than we are on our own. **Reviews from novices and newcomers 58 | are particularly valuable**: it's easy for people who have been using these 59 | lessons for a while to forget how impenetrable some of this material can be, so 60 | fresh eyes are always welcome. 61 | 62 | ### What *Not* to Contribute 63 | 64 | Our lessons already contain more material than we can cover in a typical 65 | workshop, so we are usually *not* looking for more concepts or tools to add to 66 | them. As a rule, if you want to introduce a new idea, you must (a) estimate how 67 | long it will take to teach and (b) explain what you would take out to make room 68 | for it. The first encourages contributors to be honest about requirements; the 69 | second, to think hard about priorities. 70 | 71 | We are also not looking for exercises or other material that only run on one 72 | platform. Our workshops typically contain a mixture of Windows, macOS, and 73 | Linux users; in order to be usable, our lessons must run equally well on all 74 | three. 75 | 76 | ### Using GitHub 77 | 78 | If you choose to contribute via GitHub, you may want to look at [How to 79 | Contribute to an Open Source Project on GitHub][how-contribute]. In brief, we 80 | use [GitHub flow][github-flow] to manage changes: 81 | 82 | 1. Create a new branch in your desktop copy of this repository for each 83 | significant change. 84 | 2. Commit the change in that branch. 85 | 3. Push that branch to your fork of this repository on GitHub. 86 | 4. Submit a pull request from that branch to the [upstream repository][repo]. 87 | 5. If you receive feedback, make changes on your desktop and push to your 88 | branch on GitHub: the pull request will update automatically. 89 | 90 | NB: The published copy of the lesson is usually in the `main` branch. 91 | 92 | Each lesson has a team of maintainers who review issues and pull requests or 93 | encourage others to do so. The maintainers are community volunteers, and have 94 | final say over what gets merged into the lesson. 95 | 96 | ### Other Resources 97 | 98 | The Carpentries is a global organisation with volunteers and learners all over 99 | the world. We share values of inclusivity and a passion for sharing knowledge, 100 | teaching and learning. There are several ways to connect with The Carpentries 101 | community listed at including via social 102 | media, slack, newsletters, and email lists. You can also [reach us by 103 | email][contact]. 
104 | 105 | [repo]: https://github.com/hpc-carpentry/hpc-chapel 106 | [repo-issues]: https://github.com/hpc-carpentry/hpc-chapel/issues 107 | [contact]: mailto:maintainers-hpc@lists.carpentries.org 108 | [cp-site]: https://carpentries.org/ 109 | [dc-issues]: https://github.com/issues?q=user%3Adatacarpentry 110 | [dc-lessons]: https://datacarpentry.org/lessons/ 111 | [dc-site]: https://datacarpentry.org/ 112 | [discuss-list]: https://carpentries.topicbox.com/groups/discuss 113 | [github]: https://github.com 114 | [github-flow]: https://guides.github.com/introduction/flow/ 115 | [github-join]: https://github.com/join 116 | [how-contribute]: https://egghead.io/courses/how-to-contribute-to-an-open-source-project-on-github 117 | [issues]: https://carpentries.org/help-wanted-issues/ 118 | [lc-issues]: https://github.com/issues?q=user%3ALibraryCarpentry 119 | [swc-issues]: https://github.com/issues?q=user%3Aswcarpentry 120 | [swc-lessons]: https://software-carpentry.org/lessons/ 121 | [swc-site]: https://software-carpentry.org/ 122 | [lc-site]: https://librarycarpentry.org/ 123 | [template-doc]: https://carpentries.github.io/workbench/ 124 | -------------------------------------------------------------------------------- /Gemfile: -------------------------------------------------------------------------------- 1 | source "https://rubygems.org" 2 | gem "github-pages", group: :jekyll_plugins 3 | gem "kramdown-parser-gfm" 4 | -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Licenses" 3 | --- 4 | 5 | ## Instructional Material 6 | 7 | All High Performance Computing Carpentry instructional material is 8 | made available under the [Creative Commons Attribution 9 | license][cc-by-human]. The following is a human-readable summary of 10 | (and not a substitute for) the full legal text of the [CC BY 4.0 11 | license][cc-by-legal]. 12 | 13 | ### You are free to 14 | 15 | **Share**---copy and redistribute the material in any medium or format for any 16 | purpose, even commercially. 17 | 18 | **Adapt**---remix, transform, and build upon the material for any purpose, even 19 | commercially. 20 | 21 | The licensor cannot revoke these freedoms as long as you follow the license 22 | terms. 23 | 24 | ### Under the following terms 25 | 26 | **Attribution**---You must give appropriate credit, provide a [link to the 27 | license][cc-by-human], and indicate if changes were made. You may do so in any 28 | reasonable manner, but not in any way that suggests the licensor endorses you 29 | or your use. 30 | 31 | **No additional restrictions**---You may not apply legal terms or technological 32 | measures that legally restrict others from doing anything the license permits. 33 | 34 | ### Notices 35 | 36 | You do not have to comply with the license for elements of the material in the 37 | public domain or where your use is permitted by an applicable exception or 38 | limitation. 39 | 40 | No warranties are given. The license may not give you all of the permissions 41 | necessary for your intended use. For example, other rights such as publicity, 42 | privacy, or moral rights may limit how you use the material. 43 | 44 | ## Software 45 | 46 | Except where otherwise noted, the example programs and other software provided 47 | by HPC Carpentry are made available under the [OSI][osi]-approved [MIT 48 | license][mit-license]. 
49 | 50 | ### MIT License 51 | 52 | Copyright © 2024 HPC Carpentry 53 | 54 | Permission is hereby granted, free of charge, to any person obtaining a copy of 55 | this software and associated documentation files (the "Software"), to deal in 56 | the Software without restriction, including without limitation the rights to 57 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 58 | of the Software, and to permit persons to whom the Software is furnished to do 59 | so, subject to the following conditions: 60 | 61 | The above copyright notice and this permission notice shall be included in all 62 | copies or substantial portions of the Software. 63 | 64 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 65 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 66 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 67 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 68 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 69 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 70 | SOFTWARE. 71 | 72 | ## Trademark 73 | 74 | "The Carpentries", "Software Carpentry", "Data Carpentry", and "Library 75 | Carpentry" and their respective logos are registered trademarks of [Community 76 | Initiatives][ci]. 77 | 78 | [cc-by-human]: https://creativecommons.org/licenses/by/4.0/ 79 | [cc-by-legal]: https://creativecommons.org/licenses/by/4.0/legalcode 80 | [mit-license]: https://opensource.org/licenses/mit-license.html 81 | [ci]: https://communityin.org/ 82 | [osi]: https://opensource.org 83 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | # Makefile to build HPC Chapel lesson locally 2 | # Docs: 3 | 4 | # Disable the browser, if none is set. 5 | export R_BROWSER := $(or $(R_BROWSER),"false") 6 | 7 | all: serve 8 | .PHONY: all build check clean serve 9 | 10 | serve: build 11 | Rscript -e "sandpaper::serve()" 12 | 13 | build: 14 | Rscript -e "sandpaper::build_lesson()" 15 | 16 | check: 17 | Rscript -e "sandpaper::check_lesson()" 18 | 19 | clean: 20 | Rscript -e "sandpaper::reset_site()" 21 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # HPC Chapel 2 | 3 | This lesson is focused on teaching the basics of high-performance computing 4 | (HPC). There are 4 primary components to this lesson. Each component is 5 | budgeted half a day's worth of teaching-time, resulting in a two day workshop. 6 | 7 | 1. UNIX fundamentals 8 | 2. Working on a cluster 9 | 3. Programming language introduction/review 10 | 4. Introduction to parallel programming 11 | 12 | Sections 3 and 4 (programming) will feature two programming languages: 13 | [Python](https://www.python.org/) and [Chapel](https://chapel-lang.org). There 14 | are strong arguments for both languages, and instructors will be able to choose 15 | which language they wish to teach in. 16 | 17 | ## Topic breakdown and todo list 18 | 19 | The lesson outline and rough breakdown of topics by lesson writer is in 20 | [lesson-outline.md](lesson-outline.md). The topics there will be initially 21 | generated by the lesson writer, and then reviewed by the rest of the group once 22 | complete. 
23 | 24 | ## Lesson writing instructions 25 | 26 | This is a fast overview of the Software Carpentry lesson template. This won't 27 | cover lesson style or formatting (address that during review?). 28 | 29 | For a full guide to the lesson template, see the [Software Carpentry example 30 | lesson](http://swcarpentry.github.io/lesson-example/). 31 | 32 | ### Lesson structure 33 | 34 | Software Carpentry lessons are generally episodic, with one clear concept for 35 | each episode ([example](http://swcarpentry.github.io/r-novice-gapminder/)). 36 | We've got 4 major sections, each section should be broken up into several 37 | episodes (perhaps the higher-level bullet points from the lesson outline?). 38 | 39 | An episode is just a markdown file that lives under the `_episodes` folder. 40 | Here is a link to a [markdown 41 | cheatsheet](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet) 42 | with most markdown syntax. Additionally, the Software Carpentry lesson template 43 | uses several extra bits of formatting- see here for a [full 44 | guide](http://swcarpentry.github.io/lesson-example/04-formatting/). The most 45 | significant change is the addition of a YAML header that adds metadata (key 46 | questions, lesson teaching times, etc.) and special syntax for code blocks, 47 | exercises, and the like. 48 | 49 | Episode names should be prefixed with a number of their section plus the number 50 | of their episode within that section. This is important because the Software 51 | Carpentry lesson template will auto-post our lessons in the order that they 52 | would sort in. As long as your lesson sorts into the correct order, it will 53 | appear in the correct order on the website. 54 | 55 | ### Publishing changes to GitHub + the GitHub pages website 56 | 57 | The lesson website is viewable at 58 | [hpc-carpentry.github.io/hpc-novice](hpc-carpentry.github.io/hpc-novice). 59 | 60 | The lesson website itself is auto-generated from the `gh-pages` branch of this 61 | repository. GitHub pages will rebuild the website as soon as you push to the 62 | GitHub `gh-pages` branch. Because of this `gh-pages` is considered the "master" 63 | branch. 64 | 65 | ### Previewing changes locally 66 | 67 | Obviously having to push to GitHub every time you want to view your changes to 68 | the website isn't very convenient. To preview the lesson locally, run `make 69 | serve`. You can then view the website at `localhost:4321` in your browser. 70 | Pages will be automatically regenerated every time you write to them. 71 | 72 | This process requires the R language and three R packages -- 73 | [sandpaper](https://carpentries.github.io/sandpaper), [pegboard](https://carpentries.github.io/pegboard), and 74 | [varnish](https://carpentries.github.io/varnish) -- that work together with R and [pandoc](https://pandoc.org) 75 | to manage and deploy Carpentries Lesson websites written in Markdown or R Markdown. 76 | 77 | You can find the setup instructions [here](https://carpentries.github.io/workbench). 
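As a minimal sketch (the linked setup instructions are the authoritative source), the three packages can be installed from the Carpentries R-universe with something like:

```bash
# install the Carpentries workbench packages used by `make serve`
Rscript -e 'install.packages(c("sandpaper", "pegboard", "varnish"),
  repos = c("https://carpentries.r-universe.dev/", "https://cloud.r-project.org"))'
```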
78 | 79 | ## Example lessons 80 | 81 | A couple links to example SWC workshop lessons for reference: 82 | 83 | * [Example Bash lesson](https://github.com/swcarpentry/shell-novice) 84 | * [Example Python lesson](https://github.com/swcarpentry/python-novice-inflammation) 85 | * [Example R lesson](https://github.com/swcarpentry/r-novice-gapminder) (uses R 86 | markdown files instead of markdown) 87 | 88 | 89 | -------------------------------------------------------------------------------- /config.yaml: -------------------------------------------------------------------------------- 1 | #------------------------------------------------------------ 2 | # Values for this lesson. 3 | #------------------------------------------------------------ 4 | 5 | # Which carpentry is this (swc, dc, lc, or cp)? 6 | # swc: Software Carpentry 7 | # dc: Data Carpentry 8 | # lc: Library Carpentry 9 | # cp: Carpentries (to use for instructor training for instance) 10 | # incubator: The Carpentries Incubator 11 | # 12 | # This option supports custom types so lessons can be branded 13 | # and themed with your own logo and alt-text (see `carpentry_description`) 14 | # See https://carpentries.github.io/sandpaper-docs/editing.html#adding-a-custom-logo 15 | carpentry: 'incubator' 16 | 17 | # Alt-text description of the lesson. 18 | carpentry_description: 'Introduction to parallel programming in Chapel' 19 | 20 | # Overall title for pages. 21 | title: 'Introduction to High-Performance Computing in Chapel' 22 | 23 | # Date the lesson was created (YYYY-MM-DD, this is empty by default) 24 | created: 2017-09-14 25 | 26 | # Comma-separated list of keywords for the lesson 27 | keywords: 'software, data, lesson, The Carpentries, HPC, Chapel' 28 | 29 | # Life cycle stage of the lesson 30 | # possible values: pre-alpha, alpha, beta, stable 31 | life_cycle: 'alpha' 32 | 33 | # License of the lesson 34 | license: 'CC-BY 4.0' 35 | 36 | # Link to the source repository for this lesson 37 | source: 'https://github.com/hpc-carpentry/hpc-chapel' 38 | 39 | # Default branch of your lesson 40 | branch: 'main' 41 | 42 | # Who to contact if there are any issues 43 | contact: 'maintainers-hpc@lists.carpentries.org' 44 | 45 | # Navigation ------------------------------------------------ 46 | # 47 | # Use the following menu items to specify the order of 48 | # individual pages in each dropdown section. Leave blank to 49 | # include all pages in the folder. 50 | # 51 | # Example ------------- 52 | # 53 | # episodes: 54 | # - introduction.md 55 | # - first-steps.md 56 | # 57 | # learners: 58 | # - setup.md 59 | # 60 | # instructors: 61 | # - instructor-notes.md 62 | # 63 | # profiles: 64 | # - one-learner.md 65 | # - another-learner.md 66 | 67 | # Order of episodes in your lesson 68 | episodes: 69 | - 01-intro.md 70 | - 02-variables.md 71 | - 03-ranges-arrays.md 72 | - 04-conditionals.md 73 | - 05-loops.md 74 | - 06-procedures.md 75 | - 07-commandargs.md 76 | - 08-timing.md 77 | - 11-parallel-intro.md 78 | - 12-fire-forget-tasks.md 79 | - 13-synchronization.md 80 | - 14-parallel-case-study.md 81 | - 21-locales.md 82 | - 22-domains.md 83 | # - introduction.md 84 | 85 | # Information for Learners 86 | learners: 87 | 88 | # Information for Instructors 89 | instructors: 90 | 91 | # Learner Profiles 92 | profiles: 93 | 94 | # Customisation --------------------------------------------- 95 | # 96 | # This space below is where custom yaml items (e.g. 
pinning 97 | # sandpaper and varnish versions) should live 98 | -------------------------------------------------------------------------------- /episodes/.gitkeep: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hpc-carpentry/hpc-chapel/0f86bde7d5f3f4a9fe5c25f2ce016ac444e4f434/episodes/.gitkeep -------------------------------------------------------------------------------- /episodes/01-intro.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Introduction to Chapel" 3 | teaching: 15 4 | exercises: 15 5 | --- 6 | 7 | :::::::::::::::::::::::::::::::::::::: questions 8 | - "What is Chapel and why is it useful?" 9 | :::::::::::::::::::::::::::::::::::::::::::::::: 10 | 11 | ::::::::::::::::::::::::::::::::::::: objectives 12 | - "Write and execute our first Chapel program." 13 | :::::::::::::::::::::::::::::::::::::::::::::::: 14 | 15 | **_Chapel_** is a modern, open-source programming language that supports HPC via high-level 16 | abstractions for data parallelism and task parallelism. These abstractions allow users to express parallel 17 | code in a natural, almost intuitive manner. In contrast with other high-level parallel languages, however, 18 | Chapel was designed around a _multi-resolution_ philosophy. This means that users can incrementally add more 19 | detail to their original code prototype, to optimise it to a particular computer as closely as required. 20 | 21 | In a nutshell, with Chapel we can write parallel code with the simplicity and readability of scripting 22 | languages such as Python or MATLAB, while achieving performance comparable to compiled languages like C or 23 | Fortran (+ traditional parallel libraries such as MPI or OpenMP). 24 | 25 | In this lesson we will learn the basic elements and syntax of the language; then we will study **_task 26 | parallelism_**, the first level of parallelism in Chapel, and finally we will use parallel data structures and 27 | **_data parallelism_**, which is the highest level of abstraction in parallel programming offered by Chapel. 28 | 29 | ## Getting started 30 | 31 | Chapel is a compiled language, which means that we must **_compile_** our **_source code_** to generate a 32 | **_binary_** or **_executable_** that we can then run on the computer. 33 | 34 | Chapel source code must be written in text files with the extension **_.chpl_**. Let's write a simple "hello 35 | world"-type program to demonstrate how we write Chapel code! Using your favourite text editor, create the file 36 | `hello.chpl` with the following content: 37 | 38 | ```chpl 39 | writeln('If we can see this, everything works!'); 40 | ``` 41 | 42 | This program can then be compiled with the following bash command: 43 | 44 | ```bash 45 | chpl --fast hello.chpl 46 | ``` 47 | 48 | The flag `--fast` tells the compiler to optimise the binary to run as fast as possible on the given 49 | architecture. By default, the compiler will produce a program with the same name 50 | as the source file. In our case, the program will be called `hello`. The `-o` 51 | option can be used to change the name of the generated binary. 52 | 53 | To run the code, you execute it as you would any other program: 54 | 55 | ```bash 56 | ./hello 57 | ``` 58 | ```output 59 | If we can see this, everything works! 60 | ``` 61 | 62 | ## Running on a cluster 63 | 64 | Depending on the code, it might utilise several or even all cores on the current node.
The command above 65 | implies that you are allowed to utilise all cores. This might not be the case on an HPC cluster, where a login 66 | node is shared by many people at the same time, and where it might not be a good idea to occupy all cores on a 67 | login node with CPU-intensive tasks. Instead, you will need to submit your Chapel run as a job to the 68 | scheduler asking for a specific number of CPU cores. 69 | 70 | Use `module avail chapel` to list Chapel packages on your HPC cluster, and select the best fit for Chapel, 71 | e.g. the single-locale Chapel module: 72 | 73 | ```bash 74 | module load chapel-multicore 75 | ``` 76 | 77 | Then, for running a test code on a cluster you would submit an interactive job to the queue 78 | 79 | ```bash 80 | salloc --time=0:30:0 --ntasks=1 --cpus-per-task=3 --mem-per-cpu=1000 --account=def-guest 81 | ``` 82 | 83 | and then inside that job compile and run the test code 84 | 85 | ```bash 86 | chpl --fast hello.chpl 87 | ./hello 88 | ``` 89 | 90 | For production jobs, you would compile the code and then submit a batch script to the queue: 91 | 92 | ```bash 93 | chpl --fast hello.chpl 94 | sbatch script.sh 95 | ``` 96 | 97 | where the script `script.sh` would set all Slurm variables and call the executable `mybinary`. 98 | 99 | ## Case study 100 | 101 | Along all the Chapel lessons we will be using the following _case study_ as the leading thread of the 102 | discussion. Essentially, we will be building, step by step, a Chapel code to solve the **_Heat transfer_** 103 | problem described below. Then we will parallelize the code to improve its performance. 104 | 105 | Suppose that we have a square metallic plate with some initial heat distribution or **_initial 106 | conditions_**. We want to simulate the evolution of the temperature across the plate when its border is in 107 | contact with a different heat distribution that we call the **_boundary conditions_**. 108 | 109 | The Laplace equation is the mathematical model for the evolution of the temperature in the plate. To solve 110 | this equation numerically, we need to **_discretise_** it, i.e. to consider the plate as a grid, or matrix of 111 | points, and to evaluate the temperature on each point at each iteration, according to the following 112 | **_difference equation_**: 113 | 114 | ```chpl 115 | temp_new[i,j] = 0.25 * (temp[i-1,j] + temp[i+1,j] + temp[i,j-1] + temp[i,j+1]) 116 | ``` 117 | 118 | Here `temp_new` stands for the new temperature at the current iteration, while `temp` contains the temperature calculated 119 | at the past iteration (or the initial conditions in case we are at the first iteration). The indices `i` and 120 | `j` indicate that we are working on the point of the grid located at the *i*th row and the *j*th column. 121 | 122 | So, our objective is to: 123 | 124 | > ## Goals 125 | > 1. Write a code to implement the difference equation above. The code should 126 | > have the following requirements: 127 | > 128 | > - It should work for any given number of rows and columns in the grid. 129 | > - It should run for a given number of iterations, or until the difference 130 | > between `temp_new` and `temp` is smaller than a given tolerance value. 131 | > - It should output the temperature at a desired position on the grid every 132 | > given number of iterations. 133 | > 134 | > 2. Use task parallelism to improve the performance of the code and run it in 135 | > the cluster 136 | > 3. Use data parallelism to improve the performance of the code and run it in 137 | > the cluster. 
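For reference, here is a minimal sketch of what the batch script `script.sh` mentioned earlier in this episode might look like. The specific Slurm directives are placeholders modelled on the interactive `salloc` example above (scheduler options and the account name will depend on your cluster), and the script simply runs the binary we compiled (`hello` in our example):

```bash
#!/bin/bash
#SBATCH --time=0:30:0          # maximum runtime
#SBATCH --ntasks=1             # a single task is enough for single-locale Chapel
#SBATCH --cpus-per-task=3      # CPU cores available to the program
#SBATCH --mem-per-cpu=1000     # memory per core, in MB
#SBATCH --account=def-guest    # replace with your own allocation
./hello                        # run the compiled executable
```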
138 | 139 | ::::::::::::::::::::::::::::::::::::: keypoints 140 | - "Chapel is a compiled language - any programs we make must be compiled with `chpl`." 141 | - "The `--fast` flag instructs the Chapel compiler to optimise our code." 142 | - "The `-o` flag tells the compiler what to name our output (otherwise it gets named after the source file)" 143 | :::::::::::::::::::::::::::::::::::::::::::::::: 144 | -------------------------------------------------------------------------------- /episodes/02-variables.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Basic syntax and variables" 3 | teaching: 15 4 | exercises: 15 5 | --- 6 | 7 | :::::::::::::::::::::::::::::::::::::: questions 8 | - "How do I write basic Chapel code?" 9 | :::::::::::::::::::::::::::::::::::::::::::::::: 10 | 11 | ::::::::::::::::::::::::::::::::::::: objectives 12 | - "Perform basic maths in Chapel." 13 | - "Understand Chapel's basic data types." 14 | - "Understand how to read and fix errors." 15 | - "Know how to define and use data stored as variables." 16 | :::::::::::::::::::::::::::::::::::::::::::::::: 17 | 18 | Using basic maths in Chapel is fairly intuitive. Try compiling the following code to see 19 | how the different mathematical operators work. 20 | 21 | ```chpl 22 | writeln(4 + 5); 23 | writeln(4 - 5); 24 | writeln(4 * 5); 25 | writeln(4 / 5); // integer division 26 | writeln(4.0 / 5.0); // floating-point division 27 | writeln(4 ** 5); // exponentiation 28 | ``` 29 | 30 | In this example, our code is called `operators.chpl`. You can compile it with the following commands: 31 | 32 | ```bash 33 | chpl operators.chpl --fast 34 | ./operators 35 | ``` 36 | 37 | You should see output that looks something like the following: 38 | 39 | ```output 40 | 9 41 | -1 42 | 20 43 | 0 44 | 0.8 45 | 1024 46 | ``` 47 | 48 | Code beginning with `//` is interpreted as a comment — it does not get run. Comments are very valuable 49 | when writing code, because they allow us to write notes to ourselves about what each piece of code does. You 50 | can also create block comments with `/*` and `*/`: 51 | 52 | ```chpl 53 | /* This is a block comment. 54 | It can span as many lines as you want! 55 | (like this) */ 56 | ``` 57 | 58 | ## Variables 59 | 60 | Granted, we probably want to do more than basic maths with Chapel. We will need to store the results of 61 | complex operations using variables. Variables in programming are not the same as the mathematical concept. In 62 | programming, a variable represents (or references) a location in the memory of the computer where we can store information or 63 | data while executing a program. A variable has three elements: 64 | 65 | 1. a **_name_** or label, to identify the variable 66 | 2. a **_type_**, that indicates the kind of data that we can store in it, and 67 | 3. a **_value_**, the actual information or data stored in the variable. 68 | 69 | Variables in Chapel are declared with the `var` or `const` keywords. When a variable declared as `const` is 70 | initialised, its value cannot be modified anymore during the execution of the program. What happens if we try to 71 | modify a constant variable like `test` below? 
72 | 73 | ```chpl 74 | const test = 100; 75 | test = 200; 76 | writeln('The value of test is: ', test); 77 | writeln(test / 4); 78 | ``` 79 | ```bash 80 | chpl variables.chpl 81 | ``` 82 | ```error 83 | variables.chpl:2: error: cannot assign to const variable 84 | ``` 85 | 86 | The compiler threw an error, and did not compile our program. This is a feature of compiled languages - if 87 | there is something wrong, we will typically see an error at compile-time, instead of while running 88 | it. Although we already kind of know why the error was caused (we tried to reassign the value of a `const` 89 | variable, which by definition cannot be changed), let's walk through the error as an example of how to 90 | troubleshoot our programs. 91 | 92 | - `variables.chpl:2:` indicates that the error was caused on line 2 of our `variables.chpl` file. 93 | 94 | - `error:` indicates that the issue was an error, and blocks compilation. Sometimes the compiler will just 95 | give us warning or information, not necessarily errors. When we see something that is not an error, we 96 | should carefully read the output and consider if it necessitates changing our code. Errors must be fixed, 97 | as they will block the code from compiling. 98 | 99 | - `cannot assign to const variable` indicates that we were trying to reassign a `const` variable, which is 100 | explicitly not allowed in Chapel. 101 | 102 | To fix this error, we can change `const` to `var` when declaring our `test` variable. `var` indicates a 103 | variable that can be reassigned. 104 | 105 | ```chpl 106 | var test = 100; 107 | test = 200; 108 | writeln('The value of test is: ', test); 109 | writeln(test / 4); 110 | ``` 111 | ```bash 112 | chpl variables.chpl 113 | ``` 114 | ```output 115 | The value of test is: 200 116 | 50 117 | ``` 118 | 119 | 120 | 121 | 122 | 123 | In Chapel, to initialize a variable we must specify the type of the variable, or initialise it in place with 124 | some value. The common variable types in Chapel are: 125 | 126 | - integer `int` (positive or negative whole numbers) 127 | - floating-point number `real` (decimal values) 128 | - Boolean `bool` (true or false) 129 | - string `string` (any type of text) 130 | 131 | These two variables below are initialized with the type. If no initial value is given, Chapel will initialise 132 | a variable with a default value depending on the declared type, for example 0 for integers and 0.0 for real 133 | variables. 
134 | 135 | ```chpl 136 | var counter: int; 137 | var delta: real; 138 | writeln("counter is ", counter, " and delta is ", delta); 139 | ``` 140 | ```bash 141 | chpl variables.chpl 142 | ./variables 143 | ``` 144 | ```output 145 | counter is 0 and delta is 0.0 146 | ``` 147 | 148 | If a variable is initialised with a value but without a type, Chapel will infer its type from the given 149 | initial value: 150 | 151 | ```chpl 152 | const test = 100; 153 | writeln('The value of test is ', test, ' and its type is ', test.type:string); 154 | ``` 155 | ```bash 156 | chpl variables.chpl 157 | ./variables 158 | ``` 159 | ```output 160 | The value of test is 100 and its type is int(64) 161 | ``` 162 | 163 | When initialising a variable, we can also assign its type in addition to its value: 164 | 165 | ```chpl 166 | const tolerance: real = 0.0001; 167 | const outputFrequency: int = 20; 168 | ``` 169 | 170 | ::::::::::::::::::::::::::::::::::::: callout 171 | 172 | Note that these two notations below are different, but produce the same result in the end: 173 | 174 | ```chpl 175 | var a: real = 10.0; // we specify both the type and the value 176 | var a = 10: real; // we specify only the value (10 converted to real) 177 | ``` 178 | 179 | :::::::::::::::::::::::::::::::::::::::::::::::: 180 | 181 | 182 | ::::::::::::::::::::::::::::::::::::: callout 183 | 184 | In the following code (saved as `variables.chpl`) we have not initialised the variable `test` before trying to 185 | use it in line 2: 186 | 187 | ```chpl 188 | const test; // declare 'test' variable 189 | writeln('The value of test is: ', test); 190 | ``` 191 | ```error 192 | variables.chpl:1: error: 'test' is not initialized and has no type 193 | variables.chpl:1: note: cannot find initialization point to split-init this variable 194 | variables.chpl:2: note: 'test' is used here before it is initialized 195 | ``` 196 | 197 | :::::::::::::::::::::::::::::::::::::::::::::::: 198 | 199 | Now we know how to set, use, and change a variable, as well as the implications of using `var` and `const`. We 200 | also know how to read and interpret errors. 201 | 202 | Let's practice defining variables and use this as the starting point of our simulation code. The code will be 203 | stored in the file `base_solution.chpl`. We will be solving the heat transfer problem introduced in the 204 | previous section, starting with some initial temperature and computing a new temperature at each iteration. We 205 | will then compute the greatest difference between the old and the new temperature and will check if it is 206 | smaller than a preset `tolerance`. If no, we will continue iterating. If yes, we will stop iterations and will 207 | print the final temperature. We will also stop iterations if we reach the maximum number of iterations 208 | `niter`. 209 | 210 | Our grid will be of size `rows` by `cols`, and every `outputFrequency`th iteration we will print temperature 211 | at coordinates `x` and `y`. 212 | 213 | The variable `delta` will store the greatest difference in temperature from one iteration to another. The 214 | variable `tmp` will store some temporary results when computing the temperatures. 
215 | 216 | Let's define our variables: 217 | 218 | ```chpl 219 | const rows = 100; // number of rows in the grid 220 | const cols = 100; // number of columns in the grid 221 | const niter = 500; // maximum number of iterations 222 | const x = 50; // row number for a printout 223 | const y = 50; // column number for a printout 224 | var delta: real; // greatest difference in temperature from one iteration to another 225 | var tmp: real; // for temporary results 226 | const tolerance: real = 0.0001; // smallest difference in temperature that would be accepted before stopping 227 | const outputFrequency: int = 20; // the temperature will be printed every outputFrequency iterations 228 | ``` 229 | 230 | ::::::::::::::::::::::::::::::::::::: keypoints 231 | - "A comment is preceded with `//` or surrounded by `/* and `*/`" 232 | - "All variables in Chapel have a type, whether assigned explicitly by the user, or chosen by the Chapel 233 | compiler based on its value." 234 | - "Reassigning a new value to a `const` variable will produce an error during compilation. If you want to assign a new value to a variable, declare that variable with the `var` keyword." 235 | :::::::::::::::::::::::::::::::::::::::::::::::: 236 | -------------------------------------------------------------------------------- /episodes/03-ranges-arrays.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Ranges and arrays" 3 | teaching: 60 4 | exercises: 30 5 | --- 6 | 7 | :::::::::::::::::::::::::::::::::::::: questions 8 | - "What is Chapel and why is it useful?" 9 | :::::::::::::::::::::::::::::::::::::::::::::::: 10 | 11 | ::::::::::::::::::::::::::::::::::::: objectives 12 | - "Learn to define and use ranges and arrays." 13 | :::::::::::::::::::::::::::::::::::::::::::::::: 14 | 15 | ## Ranges and Arrays 16 | 17 | A series of integers (1,2,3,4,5, for example), is called a **_range_**. Ranges are generated with the `..` 18 | operator. Let's examine what a range looks like; we store the following code as `ranges.chpl`. Here we 19 | introduce a very simple loop, cycling through all elements of the range and printing their values (we will 20 | study `for` loops in a separate section): 21 | 22 | ```chpl 23 | var example_range = 0..10; 24 | writeln('Our example range was set to: ', example_range); 25 | for x in example_range do writeln(x); 26 | ``` 27 | 28 | ```bash 29 | chpl ranges.chpl 30 | ./ranges 31 | ``` 32 | 33 | ```output 34 | Our example range was set to: 0..10 35 | 0 36 | 1 37 | ... 38 | 9 39 | 10 40 | ``` 41 | 42 | Among other uses, ranges can be used to declare **_arrays_** of variables. An array is a multidimensional 43 | collection of values of the same type. Arrays can be of any size. Let's define a 1-dimensional array of the 44 | size `example_range` and see what it looks like. Notice how the size of an array is included with its type. 45 | 46 | ```chpl 47 | var example_range = 0..10; 48 | writeln('Our example range was set to: ', example_range); 49 | var example_array: [example_range] real; 50 | writeln('Our example array is now: ', example_array); 51 | ``` 52 | 53 | We can reassign the values in our example array the same way we would reassign a variable. An array can either 54 | be set all to a single value, or to a sequence of values. 
55 | 56 | ```chpl 57 | var example_range = 0..10; 58 | writeln('Our example range was set to: ', example_range); 59 | var example_array: [example_range] real; 60 | writeln('Our example array is now: ', example_array); 61 | example_array = 5; 62 | writeln('When set to 5: ', example_array); 63 | example_array = 1..11; 64 | writeln('When set to a range: ', example_array); 65 | ``` 66 | 67 | ```bash 68 | chpl ranges.chpl 69 | ./ranges 70 | ``` 71 | 72 | ```output 73 | Our example range was set to: 0..10 74 | Our example array is now: 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 75 | When set to 5: 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 76 | When set to a range: 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 77 | ``` 78 | 79 | Notice how ranges are "right inclusive", the last number of a range is included in the range. This is 80 | different from languages like Python where this does not happen. 81 | 82 | ## Indexing elements 83 | 84 | We can retrieve and reset specific values of an array using `[]` notation. Note that we use the same square 85 | bracket notation in two different contexts: (1) to declare an array, with the square brackets containing the 86 | array's full index range `[example_range]`, and (2) to access specific array elements, as we will see 87 | below. Let's try retrieving and setting a specific value in our example so far: 88 | 89 | ```chpl 90 | var example_range = 0..10; 91 | writeln('Our example range was set to: ', example_range); 92 | var example_array: [example_range] real; 93 | writeln('Our example array is now: ', example_array); 94 | example_array = 5; 95 | writeln('When set to 5: ', example_array); 96 | example_array = 1..11; 97 | writeln('When set to a range: ', example_array); 98 | // retrieve the 5th index 99 | writeln(example_array[5]); 100 | // set index 5 to a new value 101 | example_array[5] = 99999; 102 | writeln(example_array); 103 | ``` 104 | 105 | ```bash 106 | chpl ranges.chpl 107 | ./ranges 108 | ``` 109 | 110 | ```output 111 | Our example range was set to: 0..10 112 | Our example array is now: 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 113 | When set to 5: 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 114 | When set to a range: 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 115 | 6.0 116 | 1.0 2.0 3.0 4.0 5.0 99999.0 7.0 8.0 9.0 10.0 11.0 117 | ``` 118 | 119 | One very important thing to note - in this case, index 5 was actually the 6th element. This was caused by how 120 | we set up our array. When we defined our array using a range starting at 0, element 5 corresponds to the 6th 121 | element. Unlike most other programming languages, arrays in Chapel do not start at a fixed value - they can 122 | start at any number depending on how we define them! 
For instance, let's redefine example_range to start at 5: 123 | 124 | ```chpl 125 | var example_range = 5..15; 126 | writeln('Our example range was set to: ', example_range); 127 | var example_array: [example_range] real; 128 | writeln('Our example array is now: ', example_array); 129 | example_array = 5; 130 | writeln('When set to 5: ', example_array); 131 | example_array = 1..11; 132 | writeln('When set to a range: ', example_array); 133 | // retrieve the 5th index 134 | writeln(example_array[5]); 135 | // set index 5 to a new value 136 | example_array[5] = 99999; 137 | writeln(example_array); 138 | ``` 139 | 140 | ```bash 141 | chpl ranges.chpl 142 | ./ranges 143 | ``` 144 | 145 | ```output 146 | Our example range was set to: 5..15 147 | Our example array is now: 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 148 | When set to 5: 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 149 | When set to a range: 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 150 | 1.0 151 | 99999.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 152 | ``` 153 | 154 | ## Back to our simulation 155 | 156 | Let's define a two-dimensional array for use in our simulation and set its initial values: 157 | 158 | ```chpl 159 | // this is our "plate" 160 | var temp: [0..rows+1, 0..cols+1] real; 161 | temp[1..rows,1..cols] = 25; // set the initial temperature on the internal grid 162 | ``` 163 | 164 | This is a matrix (2D array) with (`rows + 2`) rows and (`cols + 2`) columns of real numbers. The ranges 165 | `0..rows+1` and `0..cols+1` used here not only define the size and shape of the array, they also stand for the 166 | indices with which we can access particular elements of the array using the `[ , ]` notation. For example, 167 | `temp[0,0]` is the real variable located at the first row and first column of the array `temp`, while 168 | `temp[3,7]` is the one at the 4th row and 8th column; `temp[2,3..15]` accesses the 4th through 16th columns of the 3rd 169 | row of `temp`, and `temp[0..3,4]` corresponds to the first 4 rows on the 5th column of `temp`. 170 | 171 | We divide our "plate" into two parts: (1) the internal grid `1..rows,1..cols` on which we set the initial 172 | temperature at 25.0, and (2) the surrounding layer of *ghost points* with row indices equal to `0` or `rows+1` 173 | and column indices equal to `0` or `cols+1`. The temperature in the ghost layer is equal to 0.0 by default, as 174 | we do not assign a value there. 175 | 176 | We should now be ready to start coding our simulation. Let's print some information about the initial 177 | configuration, compile the code, and execute it to see if everything is working as expected. 
178 | 179 | ```chpl 180 | const rows = 100; 181 | const cols = 100; 182 | const niter = 500; 183 | const x = 50; // row number of the desired position 184 | const y = 50; // column number of the desired position 185 | const tolerance = 0.0001; // smallest difference in temperature that would be accepted before stopping 186 | const outputFrequency: int = 20; // the temperature will be printed every outputFrequency iterations 187 | 188 | // this is our "plate" 189 | var temp: [0..rows+1, 0..cols+1] real; 190 | temp[1..rows,1..cols] = 25; // set the initial temperature on the internal grid 191 | 192 | writeln('This simulation will consider a matrix of ', rows, ' by ', cols, ' elements.'); 193 | writeln('Temperature at start is: ', temp[x, y]); 194 | ``` 195 | 196 | ```bash 197 | chpl base_solution.chpl 198 | ./base_solution 199 | ``` 200 | 201 | ```output 202 | This simulation will consider a matrix of 100 by 100 elements. 203 | Temperature at start is: 25.0 204 | ``` 205 | 206 | ::::::::::::::::::::::::::::::::::::: keypoints 207 | - "A range is a sequence of integers." 208 | - "An array holds a non-negative number of values of the same type." 209 | - "Chapel arrays can start at any index, not just 0 or 1." 210 | - "You can index arrays with the `[]` brackets." 211 | :::::::::::::::::::::::::::::::::::::::::::::::: 212 | -------------------------------------------------------------------------------- /episodes/04-conditionals.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Conditional statements" 3 | teaching: 60 4 | exercises: 30 5 | --- 6 | 7 | :::::::::::::::::::::::::::::::::::::: questions 8 | - "How do I add conditional logic to my code?" 9 | :::::::::::::::::::::::::::::::::::::::::::::::: 10 | 11 | ::::::::::::::::::::::::::::::::::::: objectives 12 | - "You can use the `==`, `>`, `>=`, etc. operators to make a comparison that returns true or false." 13 | :::::::::::::::::::::::::::::::::::::::::::::::: 14 | 15 | Chapel, as most *high level programming languages*, has different statements to control the flow of the 16 | program or code. The conditional statements are: the **_if statement_**, and the **_while statement_**. These 17 | statements both rely on comparisons between values. Let's try a few comparisons to see how they work 18 | (`conditionals.chpl`): 19 | 20 | ```chpl 21 | writeln(1 == 2); 22 | writeln(1 != 2); 23 | writeln(1 > 2); 24 | writeln(1 >= 2); 25 | writeln(1 < 2); 26 | writeln(1 <= 2); 27 | ``` 28 | 29 | ```bash 30 | chpl conditionals.chpl 31 | ./conditionals 32 | ``` 33 | 34 | ```output 35 | false 36 | true 37 | false 38 | false 39 | true 40 | true 41 | ``` 42 | 43 | You can combine comparisons with the `&&` (AND) and `||` (OR) operators. `&&` only returns `true` if both 44 | conditions are true, while `||` returns `true` if either condition is true. 
45 | 46 | ```chpl 47 | writeln(1 == 2); 48 | writeln(1 != 2); 49 | writeln(1 > 2); 50 | writeln(1 >= 2); 51 | writeln(1 < 2); 52 | writeln(1 <= 2); 53 | writeln(true && true); 54 | writeln(true && false); 55 | writeln(true || false); 56 | ``` 57 | 58 | ```bash 59 | chpl conditionals.chpl 60 | ./conditionals 61 | ``` 62 | 63 | ```output 64 | false 65 | true 66 | false 67 | false 68 | true 69 | true 70 | true 71 | false 72 | true 73 | ``` 74 | 75 | ## Control flow 76 | 77 | The general syntax of a while statement is: 78 | 79 | ```chpl 80 | // single-statement form 81 | while condition do 82 | instruction 83 | 84 | // multi-statement form 85 | while condition 86 | { 87 | instructions 88 | } 89 | ``` 90 | 91 | The code flows as follows: first, the condition is evaluated, and then, if it is satisfied, all the 92 | instructions within the curly brackets (or after `do`) are executed one by one. This will be repeated over and over again 93 | until the condition does not hold anymore. 94 | 95 | The main loop in our simulation can be programmed using a while statement like this 96 | 97 | ```chpl 98 | //this is the main loop of the simulation 99 | var c = 0; 100 | delta = tolerance; 101 | while (c < niter && delta >= tolerance) 102 | { 103 | c += 1; 104 | // actual simulation calculations will go here 105 | } 106 | ``` 107 | 108 | Essentially, what we want is to repeat all the code inside the curly brackets until the number of iterations 109 | is greater than or equal to `niter`, or the difference in temperature between iterations is less than 110 | `tolerance`. (Note that in our case, as `delta` was not initialised when declared -- and thus Chapel assigned it 111 | the default real value 0.0 -- we need to assign it a value greater than or equal to `tolerance` (0.0001), or otherwise the 112 | condition of the while statement will never be satisfied. A good starting point is simply to set `delta` 113 | equal to `tolerance`). 114 | 115 | To count iterations we just need to keep adding 1 to the counter variable `c`. We could do this with `c=c+1`, 116 | or with the compound assignment, `+=`, as in the code above. To program the rest of the logic inside the curly 117 | brackets, on the other hand, we will need more elaborate instructions. 118 | 119 | Let's focus, first, on printing the temperature every `outputFrequency = 20` iterations. To achieve this, we 120 | only need to check whether `c` is a multiple of `outputFrequency`, and in that case, to print the temperature 121 | at the desired position. This is the type of control that an **_if statement_** gives us. The general syntax 122 | is: 123 | 124 | ```chpl 125 | // single-statement form 126 | if condition then 127 | instruction A 128 | else 129 | instruction B 130 | 131 | // multi-statement form 132 | if condition 133 | {instructions A} 134 | else 135 | {instructions B} 136 | ``` 137 | 138 | The set of instructions A is executed once if the condition is satisfied; the set of instructions B is 139 | executed otherwise (the else part of the if statement is optional). 140 | 141 | So, in our case this would do the trick: 142 | 143 | ```chpl 144 | if (c % outputFrequency == 0) 145 | { 146 | writeln('Temperature at iteration ', c, ': ', temp[x, y]); 147 | } 148 | ``` 149 | 150 | Note that when only one instruction will be executed, there is no need to use the curly brackets. `%` is the 151 | modulo operator; it returns the remainder after division (i.e. it returns zero when `c` is a multiple of 152 | `outputFrequency`). 
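As a quick standalone illustration (not part of our simulation code), the single-statement `then`/`else` form and the `%` operator can be combined like this; the values chosen here are just for the demo:

```chpl
const c = 40;
const outputFrequency = 20;
if (c % outputFrequency == 0) then
  writeln(c, ' is a multiple of ', outputFrequency);
else
  writeln(c, ' is not a multiple of ', outputFrequency);
```

Compiled and run on its own, this prints `40 is a multiple of 20`.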
153 | 154 | Let's compile and execute our code to see what we get until now 155 | 156 | ```chpl 157 | const rows = 100; 158 | const cols = 100; 159 | const niter = 500; 160 | const x = 50; // row number of the desired position 161 | const y = 50; // column number of the desired position 162 | const tolerance = 0.0001; // smallest difference in temperature that 163 | // would be accepted before stopping 164 | const outputFrequency: int = 20; // the temperature will be printed every outputFrequency iterations 165 | var delta: real; // greatest difference in temperature from one iteration to another 166 | var tmp: real; // for temporary results 167 | 168 | // this is our "plate" 169 | var temp: [0..rows+1, 0..cols+1] real = 25; 170 | 171 | writeln('This simulation will consider a matrix of ', rows, ' by ', cols, ' elements.'); 172 | writeln('Temperature at start is: ', temp[x, y]); 173 | 174 | //this is the main loop of the simulation 175 | var c = 0; 176 | delta = tolerance; 177 | while (c < niter && delta >= tolerance) 178 | { 179 | c += 1; 180 | if (c % outputFrequency == 0) 181 | { 182 | writeln('Temperature at iteration ', c, ': ', temp[x, y]); 183 | } 184 | } 185 | ``` 186 | 187 | ```bash 188 | chpl base_solution.chpl 189 | ./base_solution 190 | ``` 191 | 192 | ```output 193 | This simulation will consider a matrix of 100 by 100 elements. 194 | Temperature at start is: 25.0 195 | Temperature at iteration 20: 25.0 196 | Temperature at iteration 40: 25.0 197 | Temperature at iteration 60: 25.0 198 | Temperature at iteration 80: 25.0 199 | Temperature at iteration 100: 25.0 200 | Temperature at iteration 120: 25.0 201 | Temperature at iteration 140: 25.0 202 | Temperature at iteration 160: 25.0 203 | Temperature at iteration 180: 25.0 204 | Temperature at iteration 200: 25.0 205 | Temperature at iteration 220: 25.0 206 | Temperature at iteration 240: 25.0 207 | Temperature at iteration 260: 25.0 208 | Temperature at iteration 280: 25.0 209 | Temperature at iteration 300: 25.0 210 | Temperature at iteration 320: 25.0 211 | Temperature at iteration 340: 25.0 212 | Temperature at iteration 360: 25.0 213 | Temperature at iteration 380: 25.0 214 | Temperature at iteration 400: 25.0 215 | Temperature at iteration 420: 25.0 216 | Temperature at iteration 440: 25.0 217 | Temperature at iteration 460: 25.0 218 | Temperature at iteration 480: 25.0 219 | Temperature at iteration 500: 25.0 220 | ``` 221 | 222 | Of course the temperature is always 25.0 at any iteration other than the initial one, as we haven't done any 223 | computation yet. 224 | 225 | ::::::::::::::::::::::::::::::::::::: keypoints 226 | - "Use `if {instructions A} else {instructions B}` syntax to execute one set of instructions 227 | if the condition is satisfied, and the other set of instructions if the condition is not satisfied." 228 | - This syntax can be simplified to `if {instructions}` if we only want to execute the 229 | instructions within the curly brackets if the condition is satisfied. 230 | - "Use `while {instructions}` to repeatedly execute the instructions within the curly brackets 231 | while the condition is satisfied. The instructions will be executed over and over again until the condition 232 | does not hold anymore." 
233 | :::::::::::::::::::::::::::::::::::::::::::::::: 234 | -------------------------------------------------------------------------------- /episodes/05-loops.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Getting started with loops" 3 | teaching: 60 4 | exercises: 30 5 | --- 6 | 7 | :::::::::::::::::::::::::::::::::::::: questions 8 | - "How do I run the same piece of code repeatedly?" 9 | :::::::::::::::::::::::::::::::::::::::::::::::: 10 | 11 | ::::::::::::::::::::::::::::::::::::: objectives 12 | - "Learn to use `for` loops to run over every element of an iterand." 13 | - "Learn the difference between using `for` loops and using a `while` statement to repeatedly execute a code block. 14 | :::::::::::::::::::::::::::::::::::::::::::::::: 15 | 16 | To compute the new temperature, i.e. each element of `temp_new`, we need to add all the surrounding elements in 17 | `temp` and divide the result by 4. And, essentially, we need to repeat this process for all the elements 18 | of `temp_new`, or, in other words, we need to *iterate* over the elements of `temp_new`. When it comes to iterating over 19 | a given number of elements, the **_for-loop_** is what we want to use. The for-loop has the following general 20 | syntax: 21 | 22 | ```chpl 23 | // single-statement version 24 | for index in iterand do 25 | instruction; 26 | 27 | // multi-statement version 28 | for index in iterand 29 | {instructions} 30 | ``` 31 | 32 | The *iterand* is a function or statement that expresses an iteration; it could be the range 1..15, for 33 | example. *index* is a variable that exists only in the context of the for-loop, and that will be taking the 34 | different values yielded by the iterand. The code flows as follows: index takes the first value yielded by the 35 | iterand, and keeps it until all the instructions inside the curly brackets are executed one by one; then, 36 | index takes the second value yielded by the iterand, and keeps it until all the instructions are executed 37 | again. This pattern is repeated until index takes all the different values expressed by the iterand. 38 | 39 | This `for` loop, for example 40 | 41 | ```chpl 42 | // calculate the new temperatures (temp_new) using the past temperatures (temp) 43 | for i in 1..rows 44 | { 45 | // do this for every row 46 | } 47 | ``` 48 | 49 | will allow us to iterate over the rows of `temp_new`. Now, for each row we also need to iterate over all the 50 | columns in order to access every single element of `temp_new`. 
This can be done with nested `for` loops like 51 | this: 52 | 53 | ```chpl 54 | // calculate the new temperatures (temp_new) using the past temperatures (temp) 55 | for i in 1..rows 56 | { 57 | // do this for every row 58 | for j in 1..cols 59 | { 60 | // and this for every column in the row i 61 | } 62 | } 63 | ``` 64 | 65 | Now, inside the inner loop, we can use the indices `i` and `j` to perform the required computations as 66 | follows: 67 | 68 | ```chpl 69 | // calculate the new temperatures (temp_new) using the past temperatures (temp) 70 | for i in 1..rows 71 | { 72 | // do this for every row 73 | for j in 1..cols 74 | { 75 | // and this for every column in the row i 76 | temp_new[i,j] = (temp[i-1,j] + temp[i+1,j] + temp[i,j-1] + temp[i,j+1]) / 4; 77 | } 78 | } 79 | temp=temp_new; 80 | ``` 81 | 82 | Note that at the end of the outer `for` loop, when all the elements in `temp_new` are already calculated, we update 83 | `temp` with the values of `temp_new`; this way everything is set up for the next iteration of the main `while` 84 | statement. 85 | 86 | We're ready to execute our code, but the conditions we have initially set up 87 | will not produce interesting output, because the plate has a temperature 88 | value of `25` everywhere. We can change the boundaries to have temperature `0` 89 | so that the middle will start cooling down. To do this, we should change the 90 | declaration of `temp` to: 91 | 92 | ```chpl 93 | var temp: [0..rows+1, 0..cols+1] real = 0; // the whole plate starts at 0 94 | temp[1..rows,1..cols] = 25; // set the non-boundary coordinates to 25 95 | ``` 96 | 97 | Now let's compile and execute our code again: 98 | 99 | ```bash 100 | chpl base_solution.chpl 101 | ./base_solution 102 | ``` 103 | 104 | ```output 105 | The simulation will consider a matrix of 100 by 100 elements, 106 | it will run up to 500 iterations, or until the largest difference 107 | in temperature between iterations is less than 0.0001. 108 | You are interested in the evolution of the temperature at the 109 | position (50,50) of the matrix... 110 | 111 | and here we go... 112 | Temperature at iteration 0: 25.0 113 | Temperature at iteration 20: 25.0 114 | Temperature at iteration 40: 25.0 115 | Temperature at iteration 60: 25.0 116 | Temperature at iteration 80: 25.0 117 | Temperature at iteration 100: 25.0 118 | Temperature at iteration 120: 25.0 119 | Temperature at iteration 140: 25.0 120 | Temperature at iteration 160: 25.0 121 | Temperature at iteration 180: 25.0 122 | Temperature at iteration 200: 25.0 123 | Temperature at iteration 220: 24.9999 124 | Temperature at iteration 240: 24.9996 125 | Temperature at iteration 260: 24.9991 126 | Temperature at iteration 280: 24.9981 127 | Temperature at iteration 300: 24.9963 128 | Temperature at iteration 320: 24.9935 129 | Temperature at iteration 340: 24.9893 130 | Temperature at iteration 360: 24.9833 131 | Temperature at iteration 380: 24.9752 132 | Temperature at iteration 400: 24.9644 133 | Temperature at iteration 420: 24.9507 134 | Temperature at iteration 440: 24.9337 135 | Temperature at iteration 460: 24.913 136 | Temperature at iteration 480: 24.8883 137 | Temperature at iteration 500: 24.8595 138 | ``` 139 | 140 | As we can see, the temperature in the middle of the plate (position 50,50) is slowly decreasing as the plate 141 | is cooling down. 142 | 143 | ::::::::::::::::::::::::::::::::::::: challenge 144 | 145 | ## Challenge 1: Can you do it? 146 | 147 | What would be the temperature at the top right corner of the plate? 
In our current setup we have a layer of 148 | ghost points around the internal grid. While the temperature on the internal grid was initially set to 25.0, 149 | the temperature at the ghost points was set to 0.0. Note that during our iterations we do not compute the 150 | temperature at the ghost points -- it is permanently set to 0.0. Consequently, any point close to the ghost 151 | layer will be influenced by this zero temperature, so we expect the temperature near the border of the plate 152 | to decrease faster. Modify the code to see the temperature at the top right corner. 153 | 154 | :::::::::::::::::::::::: solution 155 | 156 | To see the evolution of the temperature at the top right corner of the plate, we just need to modify `x` and 157 | `y`. This corner corresponds to the first row (`x=1`) and the last column (`y=cols`) of the plate. 158 | 159 | ```bash 160 | chpl base_solution.chpl 161 | ./base_solution 162 | ``` 163 | 164 | ```output 165 | The simulation will consider a matrix of 100 by 100 elements, 166 | it will run up to 500 iterations, or until the largest difference 167 | in temperature between iterations is less than 0.0001. 168 | You are interested in the evolution of the temperature at the position (1,100) of the matrix... 169 | 170 | and here we go... 171 | Temperature at iteration 0: 25.0 172 | Temperature at iteration 20: 1.48171 173 | Temperature at iteration 40: 0.767179 174 | ... 175 | Temperature at iteration 460: 0.068973 176 | Temperature at iteration 480: 0.0661081 177 | Temperature at iteration 500: 0.0634717 178 | ``` 179 | 180 | ::::::::::::::::::::::::::::::::: 181 | :::::::::::::::::::::::::::::::::::::::::::::::: 182 | 183 | ::::::::::::::::::::::::::::::::::::: challenge 184 | 185 | ## Challenge 2: Can you do it? 186 | 187 | Now let's have some more interesting boundary conditions. Suppose that the plate is heated by a source of 80 188 | degrees located at the bottom right corner, and that the temperature on the rest of the border decreases 189 | linearly as one gets farther from the corner (see the image below). Utilise for loops to set up the described 190 | boundary conditions. Compile and run your code to see how the temperature is changing now. 191 | 192 | :::::::::::::::::::::::: solution 193 | 194 | To get the linear distribution, the 80 degrees must be divided by the number of rows or columns in our 195 | plate. So, the following pair of `for` loops at the start of the time iteration will give us what we want: 196 | 197 | ```chpl 198 | // set the boundary conditions 199 | for i in 1..rows do 200 | temp[i,cols+1] = i*80.0/rows; // right side 201 | for j in 1..cols do 202 | temp[rows+1,j] = j*80.0/cols; // bottom side 203 | ``` 204 | 205 | Note that 80 degrees is written as the real number 80.0. Integer division in Chapel returns an integer, 206 | so, as `rows` and `cols` are integers, we must write 80 as a real number so that the result is not truncated. 207 | 208 | ```bash 209 | chpl base_solution.chpl 210 | ./base_solution 211 | ``` 212 | 213 | ```output 214 | The simulation will consider a matrix of 100 by 100 elements, it will run 215 | up to 500 iterations, or until the largest difference in temperature 216 | between iterations is less than 0.0001. You are interested in the evolution 217 | of the temperature at the position (1,100) of the matrix... 218 | 219 | and here we go... 220 | Temperature at iteration 0: 25.0 221 | Temperature at iteration 20: 2.0859 222 | Temperature at iteration 40: 1.42663 223 | ... 
224 | Temperature at iteration 460: 0.826941 225 | Temperature at iteration 480: 0.824959 226 | Temperature at iteration 500: 0.823152 227 | ``` 228 | 229 | ::::::::::::::::::::::::::::::::: 230 | :::::::::::::::::::::::::::::::::::::::::::::::: 231 | 232 | ::::::::::::::::::::::::::::::::::::: challenge 233 | 234 | ## Challenge 3: Can you do it? 235 | 236 | Let us increase the maximum number of iterations to `niter = 10_000`. The code now does 10_000 iterations: 237 | 238 | ```output 239 | ... 240 | Temperature at iteration 9960: 0.79214 241 | Temperature at iteration 9980: 0.792139 242 | Temperature at iteration 10000: 0.792139 243 | ``` 244 | 245 | So far, `delta` has been always equal to `tolerance`, which means that our main `while` loop will always run 246 | `niter` iterations. So let's update `delta` after each iteration. Use what we have studied so far to write the 247 | required piece of code. 248 | 249 | :::::::::::::::::::::::: solution 250 | 251 | The idea is simple, after each iteration of the while loop, we must compare all elements of `temp_new` and 252 | `temp`, find the greatest difference, and update `delta` with that value. The next nested for loops do 253 | the job: 254 | 255 | ```chpl 256 | // update delta, the greatest difference between temp_new and temp 257 | delta=0; 258 | for i in 1..rows 259 | { 260 | for j in 1..cols 261 | { 262 | tmp = abs(temp_new[i,j]-temp[i,j]); 263 | if tmp > delta then delta = tmp; 264 | } 265 | } 266 | ``` 267 | 268 | Clearly there is no need to keep the difference at every single position in the array, we just need to update 269 | `delta` if we find a greater one. 270 | 271 | ```bash 272 | chpl base_solution.chpl 273 | ./base_solution 274 | ``` 275 | 276 | ```output 277 | The simulation will consider a matrix of 100 by 100 elements, 278 | it will run up to 10000 iterations, or until the largest difference 279 | in temperature between iterations is less than 0.0001. 280 | You are interested in the evolution of the temperature at the 281 | position (1,100) of the matrix... 282 | 283 | and here we go... 284 | Temperature at iteration 0: 25.0 285 | Temperature at iteration 20: 2.0859 286 | Temperature at iteration 40: 1.42663 287 | ... 288 | Temperature at iteration 7460: 0.792283 289 | Temperature at iteration 7480: 0.792281 290 | Temperature at iteration 7500: 0.792279 291 | 292 | Final temperature at the desired position after 7505 iterations is: 0.792279 293 | The difference in temperatures between the last two iterations was: 9.99834e-05 294 | ``` 295 | 296 | ::::::::::::::::::::::::::::::::: 297 | :::::::::::::::::::::::::::::::::::::::::::::::: 298 | 299 | Now, after Exercise 3 we should have a working program to simulate our heat 300 | transfer equation. Let's just print some additional useful information, 301 | 302 | ```chpl 303 | // print final information 304 | writeln('\nFinal temperature at the desired position after ',c,' iterations is: ',temp[x,y]); 305 | writeln('The difference in temperatures between the last two iterations was: ',delta,'\n'); 306 | ``` 307 | 308 | and compile and execute our final code, 309 | 310 | ```bash 311 | chpl base_solution.chpl 312 | ./base_solution 313 | ``` 314 | 315 | ```output 316 | The simulation will consider a matrix of 100 by 100 elements, 317 | it will run up to 500 iterations, or until the largest difference 318 | in temperature between iterations is less than 0.0001. 319 | You are interested in the evolution of the temperature at the 320 | position (1,100) of the matrix... 
321 | 322 | and here we go... 323 | Temperature at iteration 0: 25.0 324 | Temperature at iteration 20: 2.0859 325 | Temperature at iteration 40: 1.42663 326 | Temperature at iteration 60: 1.20229 327 | Temperature at iteration 80: 1.09044 328 | Temperature at iteration 100: 1.02391 329 | Temperature at iteration 120: 0.980011 330 | Temperature at iteration 140: 0.949004 331 | Temperature at iteration 160: 0.926011 332 | Temperature at iteration 180: 0.908328 333 | Temperature at iteration 200: 0.894339 334 | Temperature at iteration 220: 0.88302 335 | Temperature at iteration 240: 0.873688 336 | Temperature at iteration 260: 0.865876 337 | Temperature at iteration 280: 0.85925 338 | Temperature at iteration 300: 0.853567 339 | Temperature at iteration 320: 0.848644 340 | Temperature at iteration 340: 0.844343 341 | Temperature at iteration 360: 0.840559 342 | Temperature at iteration 380: 0.837205 343 | Temperature at iteration 400: 0.834216 344 | Temperature at iteration 420: 0.831537 345 | Temperature at iteration 440: 0.829124 346 | Temperature at iteration 460: 0.826941 347 | Temperature at iteration 480: 0.824959 348 | Temperature at iteration 500: 0.823152 349 | 350 | Final temperature at the desired position after 500 iterations is: 0.823152 351 | The greatest difference in temperatures between the last two iterations was: 0.0258874 352 | ``` 353 | 354 | ::::::::::::::::::::::::::::::::::::: keypoints 355 | - "You can organize loops with `for` and `while` statements. Use a `for` loop to run over every element of the 356 | iterand, e.g. `for i in 1..rows { ...}` will run over all integers from 1 to `rows`. Use a `while` 357 | statement to repeatedly execute a code block until the condition does not hold anymore, e.g. `while (c < 358 | niter && delta >= tolerance) {...}` will repeatedly execute the commands in curly braces until one of the 359 | two conditions turns false." 360 | :::::::::::::::::::::::::::::::::::::::::::::::: 361 | -------------------------------------------------------------------------------- /episodes/06-procedures.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Procedures" 3 | teaching: 15 4 | exercises: 0 5 | --- 6 | 7 | :::::::::::::::::::::::::::::::::::::: questions 8 | - "How do I write functions?" 9 | :::::::::::::::::::::::::::::::::::::::::::::::: 10 | 11 | ::::::::::::::::::::::::::::::::::::: objectives 12 | - "Be able to write our own procedures." 13 | :::::::::::::::::::::::::::::::::::::::::::::::: 14 | 15 | Similar to other programming languages, Chapel lets you define your own functions. These are called 16 | 'procedures' in Chapel and have an easy-to-understand syntax: 17 | 18 | ```chpl 19 | proc addOne(n) { // n is an input parameter 20 | return n + 1; 21 | } 22 | ``` 23 | 24 | To call this procedure, you would use its name: 25 | 26 | ```chpl 27 | writeln(addOne(10)); 28 | ``` 29 | 30 | Procedures can be recursive, as demonstrated below. In this example the procedure takes an integer number as a 31 | parameter and returns an integer number -- more on this below. If the input parameter is 1 or 0, `fibonacci` 32 | will return the same input parameter. If the input parameter is 2 or larger, `fibonacci` will call itself 33 | recursively. 
34 | 35 | ```chpl 36 | proc fibonacci(n: int): int { // input parameter type and procedure return type, respectively 37 | if n <= 1 then return n; 38 | return fibonacci(n-1) + fibonacci(n-2); 39 | } 40 | ``` 41 | ```chpl 42 | writeln(fibonacci(10)); 43 | ``` 44 | 45 | The input parameter type `n: int` is enforced at compilation time. For example, if you try to pass a real-type 46 | number to the procedure with `fibonacci(10.2)`, you will get an error "error: unresolved call". Similarly, the 47 | return variable type is also enforced at compilation time. For example, replacing `return n` with `return 1.0` 48 | in line 2 will result in "error: cannot initialize return value of type 'int(64)'". While specifying these 49 | types might be optional (see the call out below), we highly recommend doing so in your code, as it will add 50 | additional checks for your program. 51 | 52 | ::::::::::::::::::::::::::::::::::::: callout 53 | 54 | If not specified, the procedure return type is inferred from the return variable type. This might not be 55 | possible with a recursive procedure as the return type is the procedure type, and it is not known to the 56 | compiler, so in this case (and in the `fibonacci` example above) we need to specify the procedure return type 57 | explicitly. 58 | 59 | :::::::::::::::::::::::::::::::::::::::::::::::: 60 | 61 | Procedures can take a varying number of parameters. In this example the procedure `maxOf` takes two or more 62 | parameters of the same type. This group of parameters is referred to as a *tuple* and is named `x` inside the 63 | procedure. The number of elements `k` in this tuple is inferred from the number of parameters passed to the 64 | procedure and is used to organize the calculations inside the procedure: 65 | 66 | ```chpl 67 | proc maxOf(x ...?k) { // take a tuple of one type with k elements 68 | var maximum = x[0]; 69 | for i in 1..=tolerance) do 71 | { 72 | ... 73 | } 74 | 75 | watch.stop(); 76 | 77 | //print final information 78 | writeln('\nThe simulation took ',watch.elapsed(),' seconds'); 79 | writeln('Final temperature at the desired position after ',c,' iterations is: ',temp[x,y]); 80 | writeln('The greatest difference in temperatures between the last two iterations was: ',delta,'\n'); 81 | ``` 82 | 83 | ```bash 84 | chpl base_solution.chpl 85 | ./base_solution --rows=650 --cols=650 --x=200 --y=300 --tolerance=0.002 --outputFrequency=1000 86 | ``` 87 | 88 | ```output 89 | The simulation will consider a matrix of 650 by 650 elements, 90 | it will run up to 10000 iterations, or until the largest difference 91 | in temperature between iterations is less than 0.002. 92 | You are interested in the evolution of the temperature at the 93 | position (200,300) of the matrix... 94 | 95 | and here we go... 96 | Temperature at iteration 0: 25.0 97 | Temperature at iteration 1000: 25.0 98 | Temperature at iteration 2000: 25.0 99 | Temperature at iteration 3000: 25.0 100 | Temperature at iteration 4000: 24.9998 101 | Temperature at iteration 5000: 24.9984 102 | Temperature at iteration 6000: 24.9935 103 | Temperature at iteration 7000: 24.9819 104 | 105 | The simulation took 20.1621 seconds 106 | Final temperature at the desired position after 7750 iterations is: 24.9671 107 | The greatest difference in temperatures between the last two iterations was: 0.00199985 108 | ``` 109 | 110 | ::::::::::::::::::::::::::::::::::::: keypoints 111 | - "To measure performance, instrument your Chapel code using a stopwatch from the `Time` module." 
112 | :::::::::::::::::::::::::::::::::::::::::::::::: 113 | -------------------------------------------------------------------------------- /episodes/11-parallel-intro.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Intro to parallel computing" 3 | teaching: 60 4 | exercises: 30 5 | --- 6 | 7 | :::::::::::::::::::::::::::::::::::::: questions 8 | - "How does parallel processing work?" 9 | :::::::::::::::::::::::::::::::::::::::::::::::: 10 | 11 | ::::::::::::::::::::::::::::::::::::: objectives 12 | - "Discuss some common concepts in parallel computing." 13 | :::::::::::::::::::::::::::::::::::::::::::::::: 14 | 15 | The basic concept of parallel computing is simple to understand: we divide our job into tasks that can be 16 | executed at the same time, so that we finish the job in a fraction of the time that it would have taken if the 17 | tasks were executed one by one. Implementing parallel computations, however, is not always easy, nor 18 | possible... 19 | 20 | Consider the following analogy: 21 | 22 | Suppose that we want to paint the four walls in a room. We'll call this the *problem*. We can divide our 23 | problem into 4 different tasks: paint each of the walls. In principle, our 4 tasks are independent from each 24 | other in the sense that we don't need to finish one to start one another. We say that we have 4 **_concurrent 25 | tasks_**; the tasks can be executed within the same time frame. However, this does not mean that the tasks 26 | can be executed simultaneously or in parallel. It all depends on the amount of resources that we have for the 27 | tasks. If there is only one painter, this guy could work for a while in one wall, then start painting another 28 | one, then work for a little bit on the third one, and so on. **_The tasks are being executed concurrently but 29 | not in parallel_**. If we have two painters for the job, then more parallelism can be introduced. Four 30 | painters could execute the tasks **_truly in parallel_**. 31 | 32 | ::::::::::::::::::::::::::::::::::::: callout 33 | 34 | Think of the CPU cores as the painters or workers that will execute your concurrent tasks. 35 | 36 | :::::::::::::::::::::::::::::::::::::::::::::::: 37 | 38 | Now imagine that all workers have to obtain their paint from a central dispenser located at the middle of the 39 | room. If each worker is using a different colour, then they can work **_asynchronously_**, however, if they 40 | use the same colour, and two of them run out of paint at the same time, then they have to **_synchronise_** to 41 | use the dispenser: One must wait while the other is being serviced. 42 | 43 | ::::::::::::::::::::::::::::::::::::: callout 44 | 45 | Think of the shared memory in your computer as the central dispenser for all your workers. 46 | 47 | :::::::::::::::::::::::::::::::::::::::::::::::: 48 | 49 | Finally, imagine that we have 4 paint dispensers, one for each worker. In this scenario, each worker can 50 | complete their task totally on their own. They don't even have to be in the same room, they could be painting 51 | walls of different rooms in the house, in different houses in the city, and different cities in the 52 | country. We need, however, a communication system in place. Suppose that worker A, for some reason, needs a 53 | colour that is only available in the dispenser of worker B, they must then synchronise: worker A must request 54 | the paint of worker B and worker B must respond by sending the required colour. 
55 | 56 | ::::::::::::::::::::::::::::::::::::: callout 57 | 58 | Think of the memory on each node of a cluster as a separate dispenser for your workers. 59 | 60 | :::::::::::::::::::::::::::::::::::::::::::::::: 61 | 62 | A **_fine-grained_** parallel code needs lots of communication or synchronisation between tasks, in contrast 63 | with a **_coarse-grained_** one. An **_embarrassingly parallel_** problem is one where all tasks can be 64 | executed completely independently of each other (no communication required). 65 | 66 | ## Parallel programming in Chapel 67 | 68 | Chapel provides high-level abstractions for parallel programming no matter the grain size of your tasks, 69 | whether they run in shared memory on one node or use memory distributed across multiple compute nodes, 70 | or whether they are executed 71 | concurrently or truly in parallel. As a programmer you can focus on the algorithm: how to divide the problem 72 | into tasks that make sense in the context of the problem, and be sure that the high-level implementation will 73 | run on any hardware configuration. Then you can consider the details of the specific system you are going to 74 | use (whether it is shared or distributed, the number of cores, etc.) and tune your code/algorithm to obtain 75 | better performance. 76 | 77 | ::::::::::::::::::::::::::::::::::::: callout 78 | 79 | To this effect, **_concurrency_** (the creation and execution of multiple tasks) and **_locality_** (in 80 | which set of resources these tasks are executed) are orthogonal concepts in Chapel. 81 | 82 | :::::::::::::::::::::::::::::::::::::::::::::::: 83 | 84 | In summary, we can have a set of several tasks; these tasks could be running: 85 | 86 | 1. concurrently by the same processor in a single compute node, 87 | 2. in parallel by several processors in a single compute node, 88 | 3. in parallel by several processors distributed in different compute nodes, or 89 | 4. serially (one by one) by several processors distributed in different compute nodes. 90 | 91 | Similarly, each of these tasks could be using variables 92 | 93 | 1. located in the local memory on the compute node where it is running, or 94 | 2. stored on other compute nodes. 95 | 96 | And again, Chapel can take care of all the details required to run our algorithm in most of these scenarios, but 97 | we can always add more specific detail to gain performance when targeting a particular scenario. 98 | 99 | ::::::::::::::::::::::::::::::::::::: keypoints 100 | - "Concurrency and locality are orthogonal concepts in Chapel: where the tasks are running may not be 101 | indicative of when they run, and you can control both in Chapel." 102 | - "Problems with a lot of communication between tasks, or so-called **_fine-grained_** parallel problems, are 103 | typically more difficult to parallelize. As we will see later in these lessons, Chapel simplifies writing 104 | **_fine-grained_** parallel codes by hiding a lot of communication complexity under the hood." 105 | :::::::::::::::::::::::::::::::::::::::::::::::: 106 | -------------------------------------------------------------------------------- /episodes/12-fire-forget-tasks.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Fire-and-forget tasks" 3 | teaching: 60 4 | exercises: 30 5 | --- 6 | 7 | :::::::::::::::::::::::::::::::::::::: questions 8 | - "How do we execute work in parallel?" 
9 | :::::::::::::::::::::::::::::::::::::::::::::::: 10 | 11 | ::::::::::::::::::::::::::::::::::::: objectives 12 | - "Launching multiple threads to execute tasks in parallel." 13 | - "Learn how to use `begin`, `cobegin`, and `coforall` to spawn new tasks." 14 | :::::::::::::::::::::::::::::::::::::::::::::::: 15 | 16 | ::::::::::::::::::::::::::::::::::::: callout 17 | 18 | In the very first chapter we showed how to run single-node Chapel codes. As a refresher, let's go over 19 | this again. If you are running Chapel on your own computer, then you are all set, and you can simply compile 20 | and run Chapel codes. If you are on a cluster, you will need to run Chapel codes inside interactive jobs. So 21 | far we are covering only single-locale Chapel, so -- from the login node -- you can submit an interactive 22 | job to the scheduler with a command like this one: 23 | 24 | ```sh 25 | salloc --time=2:0:0 --ntasks=1 --cpus-per-task=3 --mem-per-cpu=1000 26 | ``` 27 | 28 | The details may vary depending on your cluster, e.g. a different scheduler, the requirement to specify an account or 29 | reservation, etc., but the general idea remains the same: on a cluster you need to ask for resources before you 30 | can run calculations. In this case we are asking for 2 hours maximum runtime, a single MPI task (sufficient for 31 | our parallelism in this chapter), 3 CPU cores inside that task, and 1000M maximum memory per core. The core 32 | count means that we can run 3 threads in parallel, each on its own CPU core. Once your interactive job starts, 33 | you can compile and run the Chapel codes below. Inside your Chapel code, when new threads start, they will be 34 | able to utilize our 3 allocated CPU cores. 35 | 36 | :::::::::::::::::::::::::::::::::::::::::::::::: 37 | 38 | A Chapel program always starts as a single main thread. You can then start concurrent tasks with the `begin` 39 | statement. A task spawned by the `begin` statement will run in a different thread while the main thread 40 | continues its normal execution. Consider the following example: 41 | 42 | ```chpl 43 | var x = 0; 44 | 45 | writeln("This is the main thread starting first task"); 46 | begin 47 | { 48 | var c = 0; 49 | while c < 10 50 | { 51 | c += 1; 52 | writeln('thread 1: ', x+c); 53 | } 54 | } 55 | 56 | writeln("This is the main thread starting second task"); 57 | begin 58 | { 59 | var c = 0; 60 | while c < 10 61 | { 62 | c += 1; 63 | writeln('thread 2: ', x+c); 64 | } 65 | } 66 | 67 | writeln('this is main thread, I am done...'); 68 | ``` 69 | 70 | ```bash 71 | chpl begin_example.chpl 72 | ./begin_example 73 | ``` 74 | 75 | ```output 76 | This is the main thread starting first task 77 | This is the main thread starting second task 78 | this is main thread, I am done... 79 | thread 1: 1 80 | thread 1: 2 81 | thread 1: 3 82 | thread 1: 4 83 | thread 1: 5 84 | thread 1: 6 85 | thread 1: 7 86 | thread 1: 8 87 | thread 1: 9 88 | thread 1: 10 89 | thread 2: 1 90 | thread 2: 2 91 | thread 2: 3 92 | thread 2: 4 93 | thread 2: 5 94 | thread 2: 6 95 | thread 2: 7 96 | thread 2: 8 97 | thread 2: 9 98 | thread 2: 10 99 | ``` 100 | 101 | As you can see, the order of the output is not what we would expect, and in fact it is completely 102 | unpredictable. This is a well-known effect of concurrent tasks accessing the same shared resource at the same 103 | time (in this case the screen); the system decides in which order the tasks get to write to the screen.
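Because the runtime decides the scheduling, the interleaving can change from one run to the next. A quick way to see this for yourself is to run the same binary a few times in a row (a small bash loop; how much the output actually varies depends on your machine and on how many cores your job has):

```bash
# run the same binary several times and compare how the two threads interleave
for i in 1 2 3 4 5; do
  ./begin_example
  echo "--- end of run $i ---"
done
```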
104 | 105 | 106 | 107 | 108 | 109 | 110 | ::::::::::::::::::::::::::::::::::::: challenge 111 | 112 | ## Challenge 1: what if `c` is defined globally? 113 | 114 | What would happen if in the last code we *move* the definition of `c` into the main thread, but try to assign 115 | it from threads 1 and 2? Select one answer from these: 116 | 117 | 1. The code will fail to compile. 118 | 1. The code will compile and run, but `c` will be updated by both threads at the same time (a *race 119 | condition*), so that its final value will vary from one run to another. 120 | 1. The code will compile and run, and the two threads will be taking turns updating `c`, so that its final 121 | value will always be the same. 122 | 123 | :::::::::::::::::::::::: solution 124 | 125 | We'll get an error at compilation ("cannot assign to const variable"), since then `c` would be defined within 126 | the scope of the main thread, and we could modify its value only in the main thread. Any attempt to modify its 127 | value inside threads 1 or 2 will produce a compilation error. 128 | 129 | ::::::::::::::::::::::::::::::::: 130 | :::::::::::::::::::::::::::::::::::::::::::::::: 131 | 132 | 133 | 134 | 135 | 136 | 137 | 138 | ::::::::::::::::::::::::::::::::::::: challenge 139 | 140 | ## Challenge 2: what if we have a second, local definition of `x`? 141 | 142 | What would happen if we try to insert a second definition `var x = 10;` inside the first `begin` statement? 143 | Select one answer from these: 144 | 145 | 1. The code will fail to compile. 146 | 1. The code will compile and run, and the inside the first `begin` statement the value `x = 10` will be used, 147 | whereas inside the second `begin` statement the value `x = 0` will be used. 148 | 1. The new value `x = 10` will overwrite the global value `x = 0` in both threads 1 and 2. 149 | 150 | :::::::::::::::::::::::: solution 151 | 152 | The code will compile and run, and you will see the following output: 153 | 154 | ```output 155 | This is the main thread starting first task 156 | This is the main thread starting second task 157 | this is main thread, I am done... 158 | thread 1: 11 159 | thread 1: 12 160 | thread 1: 13 161 | thread 1: 14 162 | thread 1: 15 163 | thread 1: 16 164 | thread 1: 17 165 | thread 1: 18 166 | thread 1: 19 167 | thread 1: 20 168 | thread 2: 1 169 | thread 2: 2 170 | thread 2: 3 171 | thread 2: 4 172 | thread 2: 5 173 | thread 2: 6 174 | thread 2: 7 175 | thread 2: 8 176 | thread 2: 9 177 | thread 2: 10 178 | ``` 179 | 180 | ::::::::::::::::::::::::::::::::: 181 | :::::::::::::::::::::::::::::::::::::::::::::::: 182 | 183 | ::::::::::::::::::::::::::::::::::::: callout 184 | 185 | All variables have a **_scope_** in which they can be used. The variables declared inside a concurrent task 186 | are accessible only by that task. The variables declared in the main task can be read everywhere, but Chapel 187 | won't allow other concurrent tasks to modify them. 188 | 189 | :::::::::::::::::::::::::::::::::::::::::::::::: 190 | 191 | 192 | 193 | 194 | 195 | 196 | 197 | ::::::::::::::::::::::::::::::::::::::: discussion 198 | 199 | ## Try this ... 200 | 201 | Are the concurrent tasks, spawned by the last code, running truly in parallel? 202 | 203 | The answer is: it depends on the number of cores available to your Chapel code. 
To verify this, let's modify the code 204 | to get both threads 1 and 2 into an infinite loop: 205 | 206 | ```chpl 207 | begin 208 | { 209 | var c=0; 210 | while c > -1 211 | { 212 | c += 1; 213 | // the rest of the code in the thread 214 | } 215 | } 216 | ``` 217 | 218 | Compile and run the code: 219 | 220 | ```sh 221 | chpl begin_example.chpl 222 | ./begin_example 223 | ``` 224 | 225 | If you are running this on your own computer, you can run the `top`, `htop`, or `ps` commands in another terminal 226 | to check Chapel's CPU usage. If you are running inside an interactive job on a cluster, you can open a 227 | different terminal, log in to the cluster, and open a bash shell on the node that is running your job (if your 228 | cluster setup allows this): 229 | 230 | ```sh 231 | squeue -u $USER # check the jobID number 232 | srun --jobid=<jobID> --pty bash # put your jobID here 233 | htop -u $USER -s PERCENT_CPU # display CPU usage and other information 234 | ``` 235 | 236 | In the output of `htop` you will see a table with the list of your processes, and in the "CPU%" column you 237 | will see the percentage consumed by each process. Find the Chapel process; if it shows CPU usage 238 | close to 300%, you are using 3 CPU cores. What do you see? 239 | 240 | Now exit `htop` by pressing *Q*. Also exit your interactive run by pressing *Ctrl-C*. 241 | 242 | ::::::::::::::::::::::::::::::::::::::::::::::::::: 243 | 244 | ::::::::::::::::::::::::::::::::::::: callout 245 | 246 | To maximise performance, start as many tasks as there are cores available. 247 | 248 | :::::::::::::::::::::::::::::::::::::::::::::::: 249 | 250 | A slightly more structured way to start concurrent tasks in Chapel is the `cobegin` statement. Here 251 | you can start a block of concurrent tasks, one for each statement inside the curly brackets. The main 252 | difference between the `begin` and `cobegin` statements is that with `cobegin`, all the spawned tasks are 253 | synchronised at the end of the statement, i.e. the main thread won't continue its execution until all tasks 254 | are done. 255 | 256 | ```chpl 257 | var x=0; 258 | writeln("This is the main thread, my value of x is ",x); 259 | 260 | cobegin 261 | { 262 | { 263 | var x=5; 264 | writeln("this is task 1, my value of x is ",x); 265 | } 266 | writeln("this is task 2, my value of x is ",x); 267 | } 268 | 269 | writeln("this message won't appear until all tasks are done..."); 270 | ``` 271 | 272 | ```bash 273 | chpl cobegin_example.chpl 274 | ./cobegin_example 275 | ``` 276 | 277 | ```output 278 | This is the main thread, my value of x is 0 279 | this is task 2, my value of x is 0 280 | this is task 1, my value of x is 5 281 | this message won't appear until all tasks are done... 282 | ``` 283 | 284 | As you may have concluded from the Discussion exercise above, the variables declared inside a task are 285 | accessible only by that task, while the variables declared in the main task are accessible to all tasks. 286 | 287 | The last, and most useful, way to start concurrent/parallel tasks in Chapel is the `coforall` loop. This is a 288 | combination of the for-loop and the `cobegin` statement. The general syntax is: 289 | 290 | ```chpl 291 | coforall index in iterand 292 | {instructions} 293 | ``` 294 | 295 | This will start a new task for each iteration. Each task will then perform all the instructions inside the 296 | curly brackets. Each task will have a copy of the variable **_index_** with the corresponding value yielded by 297 | the iterand.
This index allows us to _customise_ the set of instructions for each particular task. 298 | 299 | ```chpl 300 | var x=1; 301 | config var numoftasks=2; 302 | 303 | writeln("This is the main task: x = ",x); 304 | 305 | coforall taskid in 1..numoftasks 306 | { 307 | var c=taskid+1; 308 | writeln("this is task ",taskid,": x + ",taskid," = ",x+taskid,". My value of c is: ",c); 309 | } 310 | 311 | writeln("this message won't appear until all tasks are done..."); 312 | ``` 313 | 314 | ```bash 315 | chpl coforall_example.chpl 316 | ./coforall_example --numoftasks=5 317 | ``` 318 | 319 | ```output 320 | This is the main task: x = 1 321 | this is task 5: x + 5 = 6. My value of c is: 6 322 | this is task 2: x + 2 = 3. My value of c is: 3 323 | this is task 4: x + 4 = 5. My value of c is: 5 324 | this is task 3: x + 3 = 4. My value of c is: 4 325 | this is task 1: x + 1 = 2. My value of c is: 2 326 | this message won't appear until all tasks are done... 327 | ``` 328 | 329 | Notice how we are able to customise the instructions inside the coforall to give different results depending 330 | on the task that is executing them. Also notice how, once again, the variables declared outside the coforall 331 | can be read by all tasks, while the variables declared inside are available only to the particular task. 332 | 333 | ::::::::::::::::::::::::::::::::::::: challenge 334 | 335 | ## Challenge 3: Can you do it? 336 | 337 | Would it be possible to print all the messages in the right order? Modify the code in the last example as 338 | required. 339 | 340 | Hint: you can use an array of strings declared in the main task, where all the concurrent tasks could write 341 | their messages in the corresponding position. Then, at the end, have the main task print all elements of 342 | the array in order. 343 | 344 | :::::::::::::::::::::::: solution 345 | 346 | The following code is a possible solution: 347 | 348 | ```chpl 349 | var x = 1; 350 | config var numoftasks = 2; 351 | var messages: [1..numoftasks] string; 352 | 353 | writeln("This is the main task: x = ", x); 354 | 355 | coforall taskid in 1..numoftasks { 356 | var c = taskid + 1; 357 | messages[taskid] = 'this is task ' + taskid:string + 358 | ': my value of c is ' + c:string + ' and x is ' + x:string; 359 | } 360 | 361 | for i in 1..numoftasks do writeln(messages[i]); 362 | writeln("this message won't appear until all tasks are done..."); 363 | ``` 364 | 365 | ```bash 366 | chpl exercise_coforall.chpl 367 | ./exercise_coforall --numoftasks=5 368 | ``` 369 | 370 | ```output 371 | This is the main task: x = 1 372 | this is task 1: my value of c is 2 and x is 1 373 | this is task 2: my value of c is 3 and x is 1 374 | this is task 3: my value of c is 4 and x is 1 375 | this is task 4: my value of c is 5 and x is 1 376 | this is task 5: my value of c is 6 and x is 1 377 | this message won't appear until all tasks are done... 378 | ``` 379 | 380 | Note that we need to convert integers to strings first (`taskid:string` converts the `taskid` integer variable to 381 | a string) before we can add them to other strings to form the message stored inside each `messages` element. 382 | 383 | ::::::::::::::::::::::::::::::::: 384 | :::::::::::::::::::::::::::::::::::::::::::::::: 385 | 386 | ::::::::::::::::::::::::::::::::::::: challenge 387 | 388 | ## Challenge 4: Can you do it?
389 | 390 | Consider the following code: 391 | 392 | ```chpl 393 | use Random; 394 | config const nelem = 100_000_000; 395 | var x: [1..nelem] int; 396 | fillRandom(x); //fill array with random numbers 397 | var mymax = 0; 398 | 399 | // here put your code to find mymax 400 | 401 | writeln("the maximum value in x is: ", mymax); 402 | ``` 403 | 404 | Write a parallel code to find the maximum value in the array x. 405 | 406 | :::::::::::::::::::::::: solution 407 | 408 | ```chpl 409 | config const numtasks = 12; 410 | const n = nelem/numtasks; // number of elements per thread 411 | const r = nelem - n*numtasks; // these elements did not fit into the last thread 412 | 413 | var d: [1..numtasks] int; // local maxima for each thread 414 | 415 | coforall taskid in 1..numtasks { 416 | var i, f: int; 417 | i = (taskid-1)*n + 1; 418 | f = (taskid-1)*n + n; 419 | if taskid == numtasks then f += r; // add r elements to the last thread 420 | for j in i..f do 421 | if x[j] > d[taskid] then d[taskid] = x[j]; 422 | } 423 | for i in 1..numtasks do 424 | if d[i] > mymax then mymax = d[i]; 425 | ``` 426 | 427 | ```bash 428 | chpl --fast exercise_coforall_2.chpl 429 | ./exercise_coforall_2 430 | ``` 431 | 432 | ```output 433 | the maximum value in x is: 9223372034161572255 # large random integer 434 | ``` 435 | 436 | We use the `coforall` loop to spawn tasks that work concurrently in a fraction of the array. The trick here is to 437 | determine, based on the _taskid_, the initial and final indices that the task will use. Each task obtains the 438 | maximum in its fraction of the array, and finally, after the coforall is done, the main task obtains the 439 | maximum of the array from the maximums of all tasks. 440 | 441 | ::::::::::::::::::::::::::::::::: 442 | :::::::::::::::::::::::::::::::::::::::::::::::: 443 | 444 | ::::::::::::::::::::::::::::::::::::::: discussion 445 | 446 | ## Try this ... 447 | 448 | Substitute the code to find _mymax_ in the last exercise with: 449 | 450 | ```chpl 451 | mymax=max reduce x; 452 | ``` 453 | 454 | Time the execution of the original code and this new one. How do they compare? 455 | 456 | ::::::::::::::::::::::::::::::::::::::::::::::::::: 457 | 458 | 459 | ::::::::::::::::::::::::::::::::::::: callout 460 | 461 | It is always a good idea to check whether there is _built-in_ functions or methods in the used language, that 462 | can do what we want to do as efficiently (or better) than our house-made code. In this case, the _reduce_ 463 | statement reduces the given array to a single number using the given operation (in this case max), and it is 464 | parallelized and optimised to have a very good performance. 465 | 466 | :::::::::::::::::::::::::::::::::::::::::::::::: 467 | 468 | 469 | The code in these last Exercises somehow _synchronise_ the tasks to obtain the desired result. In addition, 470 | Chapel has specific mechanisms task synchronisation, that could help us to achieve fine-grained 471 | parallelization. 472 | 473 | ::::::::::::::::::::::::::::::::::::: keypoints 474 | - "Use `begin` or `cobegin` or `coforall` to spawn new tasks." 475 | - "You can run more than one task per core, as the number of cores on a node is limited." 
476 | :::::::::::::::::::::::::::::::::::::::::::::::: 477 | -------------------------------------------------------------------------------- /episodes/13-synchronization.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Synchronising tasks" 3 | teaching: 60 4 | exercises: 30 5 | --- 6 | 7 | :::::::::::::::::::::::::::::::::::::: questions 8 | - "How should I access my data in parallel?" 9 | :::::::::::::::::::::::::::::::::::::::::::::::: 10 | 11 | ::::::::::::::::::::::::::::::::::::: objectives 12 | - "Learn how to synchronize multiple threads using one of three mechanisms: `sync` statements, sync variables, 13 | and atomic variables." 14 | - "Learn that with shared memory access from multiple threads you can run into race conditions and deadlocks, 15 | and learn how to recognize and solve these problems." 16 | :::::::::::::::::::::::::::::::::::::::::::::::: 17 | 18 | In Chapel the keyword `sync` can be either a statement or a type qualifier, providing two different 19 | synchronization mechanisms for threads. Let's start with using `sync` as a statement. 20 | 21 | As we saw in the previous section, the `begin` statement will start a concurrent (or *child*) task that will 22 | run in a different thread while the main (or *parent*) thread continues its normal execution. In this sense 23 | the `begin` statement is non-blocking. If you want to pause the execution of the main thread and wait until 24 | the child thread ends, you can prepend the `begin` statement with the `sync` statement. Consider the following 25 | code; running this code, after the initial output line, you will first see all output from thread 1 and only 26 | then the line "The first task is done..." and the rest of the output: 27 | 28 | ```chpl 29 | var x=0; 30 | writeln("This is the main thread starting a synchronous task"); 31 | 32 | sync 33 | { 34 | begin 35 | { 36 | var c=0; 37 | while c<10 38 | { 39 | c+=1; 40 | writeln('thread 1: ',x+c); 41 | } 42 | } 43 | } 44 | writeln("The first task is done..."); 45 | 46 | writeln("This is the main thread starting an asynchronous task"); 47 | begin 48 | { 49 | var c=0; 50 | while c<10 51 | { 52 | c+=1; 53 | writeln('thread 2: ',x+c); 54 | } 55 | } 56 | 57 | writeln('this is main thread, I am done...'); 58 | ``` 59 | 60 | ```bash 61 | chpl sync_example_1.chpl 62 | ./sync_example_1 63 | ``` 64 | 65 | ```output 66 | This is the main thread starting a synchronous task 67 | thread 1: 1 68 | thread 1: 2 69 | thread 1: 3 70 | thread 1: 4 71 | thread 1: 5 72 | thread 1: 6 73 | thread 1: 7 74 | thread 1: 8 75 | thread 1: 9 76 | thread 1: 10 77 | The first task is done... 78 | This is the main thread starting an asynchronous task 79 | this is main thread, I am done... 80 | thread 2: 1 81 | thread 2: 2 82 | thread 2: 3 83 | thread 2: 4 84 | thread 2: 5 85 | thread 2: 6 86 | thread 2: 7 87 | thread 2: 8 88 | thread 2: 9 89 | thread 2: 10 90 | ``` 91 | 92 | ::::::::::::::::::::::::::::::::::::::: discussion 93 | 94 | ## Discussion 95 | 96 | What would happen if we write instead 97 | 98 | ```chpl 99 | begin 100 | { 101 | sync 102 | { 103 | var c=0; 104 | while c<10 105 | { 106 | c+=1; 107 | writeln('thread 1: ',x+c); 108 | } 109 | } 110 | } 111 | writeln("The first task is done..."); 112 | ``` 113 | 114 | ::::::::::::::::::::::::::::::::::::::::::::::::::: 115 | 116 | ::::::::::::::::::::::::::::::::::::: challenge 117 | 118 | ## Challenge 3: Can you do it? 
119 | 120 | Use `begin` and `sync` statements to reproduce the functionality of `cobegin` in `cobegin_example.chpl`. 121 | 122 | :::::::::::::::::::::::: solution 123 | 124 | ```chpl 125 | var x=0; 126 | writeln("This is the main thread, my value of x is ",x); 127 | 128 | sync 129 | { 130 | begin 131 | { 132 | var x=5; 133 | writeln("this is task 1, my value of x is ",x); 134 | } 135 | begin writeln("this is task 2, my value of x is ",x); 136 | } 137 | 138 | writeln("this message won't appear until all tasks are done..."); 139 | ``` 140 | 141 | ::::::::::::::::::::::::::::::::: 142 | :::::::::::::::::::::::::::::::::::::::::::::::: 143 | 144 | A more elaborated and powerful use of `sync` is as a type qualifier for variables. When a variable is declared 145 | as _sync_, a state that can be **_full_** or **_empty_** is associated to it. 146 | 147 | To assign a new value to a _sync_ variable, its state must be _empty_ (after the assignment operation is 148 | completed, the state will be set as _full_). On the contrary, to read a value from a _sync_ variable, its 149 | state must be _full_ (after the read operation is completed, the state will be set as _empty_ again). 150 | 151 | Starting from Chapel 2.x, you must use functions `writeEF` and `readFF` to perform blocking write and read 152 | with sync variables. Below is an example to demonstrate the use of sync variables. Here we launch a new task 153 | that is busy for a short time executing the loop. While this loop is running, the main task continues printing 154 | the message "this is main task after launching new task... I will wait until it is done". As it takes time to 155 | spawn a new thread, it is very likely that you will see this message before the output from the loop. Next, 156 | the main task will attempt to read `x` and assign it to `a` which it can only do when `x` is full. We write 157 | into `x` after the loop, so you will see the final message "and now it is done" only after the message "New 158 | task finished". In other words, reading `x`, we pause the execution of the main thread. 159 | 160 | ```chpl 161 | var x: sync int, a: int; 162 | writeln("this is main task launching a new task"); 163 | begin { 164 | for i in 1..10 do writeln("this is new task working: ",i); 165 | x.writeEF(2); // assign 2 to x 166 | writeln("New task finished"); 167 | } 168 | 169 | writeln("this is main task after launching new task... I will wait until it is done"); 170 | a = x.readFE(); // don't run this line until the variable x is written in the other task 171 | writeln("and now it is done"); 172 | ``` 173 | 174 | ```bash 175 | chpl sync_example_2.chpl 176 | ./sync_example_2 177 | ``` 178 | 179 | ```output 180 | this is main task launching a new task 181 | this is main task after launching new task... I will wait until it is done 182 | this is new task working: 1 183 | this is new task working: 2 184 | this is new task working: 3 185 | this is new task working: 4 186 | this is new task working: 5 187 | this is new task working: 6 188 | this is new task working: 7 189 | this is new task working: 8 190 | this is new task working: 9 191 | this is new task working: 10 192 | New task finished 193 | and now it is done 194 | ``` 195 | 196 | ::::::::::::::::::::::::::::::::::::::: discussion 197 | 198 | ## Discussion 199 | 200 | What would happen if we try to read `x` inside the new task as well, i.e. 
we have the following `begin` 201 | statement, without changing the rest of the code: 202 | 203 | ```chpl 204 | begin { 205 | for i in 1..10 do writeln("this is new task working: ",i); 206 | x.writeEF(2); 207 | writeln("New task finished"); 208 | x.readFE(); 209 | } 210 | ``` 211 | :::::::::::::::::::::::: solution 212 | 213 | The code will block (run forever), and you would need to press *Ctrl-C* to halt its execution. In this example 214 | we try to read `x` in two places: the main task and the new task. When we read a sync variable with `readFE`, 215 | the state of the sync variable is set to empty when this method completes. In other words, one of the two 216 | `readFE` calls will succeed (which one -- depends on the runtime) and will mark the variable as empty. The 217 | other `readFE` will then attempt to read it but it will block waiting for `x` to become full again (which will 218 | never happen). In the end, the execution of either the main thread or the child thread will block, hanging the 219 | entire code. 220 | 221 | ::::::::::::::::::::::::::::::::: 222 | ::::::::::::::::::::::::::::::::::::::::::::::::::: 223 | 224 | There are a number of methods defined for _sync_ variables. If `x` is a sync variable of a given type, you can 225 | use the following functions: 226 | 227 | ```chpl 228 | // non-blocking methods 229 | x.reset() //will set the state as empty and the value as the default of x's type 230 | x.isFull //will return true is the state of x is full, false if it is empty 231 | 232 | //blocking read and write methods 233 | x.writeEF(value) //will block until the state of x is empty, 234 | //then will assign the value, and set the state to full 235 | x.writeFF(value) //will block until the state of x is full, 236 | //then will assign the value, and leave the state as full 237 | x.readFE() //will block until the state of x is full, 238 | //then will return x's value, and set the state to empty 239 | x.readFF() //will block until the state of x is full, 240 | //then will return x's value, and leave the state as full 241 | 242 | //non-blocking read and write methods 243 | x.writeXF(value) //will assign the value no matter the state of x, and then set the state as full 244 | x.readXX() //will return the value of x regardless its state. The state will remain unchanged 245 | ``` 246 | 247 | Chapel also implements **_atomic_** operations with variables declared as `atomic`, and this provides another 248 | option to synchronise tasks. Atomic operations run completely independently of any other thread or 249 | process. This means that when several tasks try to write an atomic variable, only one will succeed at a given 250 | moment, providing implicit synchronisation between them. There is a number of methods defined for atomic 251 | variables, among them `sub()`, `add()`, `write()`, `read()`, and `waitfor()` are very useful to establish 252 | explicit synchronisation between tasks, as showed in the next code: 253 | 254 | ```chpl 255 | var lock: atomic int; 256 | const numtasks=5; 257 | 258 | lock.write(0); //the main task set lock to zero 259 | 260 | coforall id in 1..numtasks 261 | { 262 | writeln("greetings from task ",id,"... 
I am waiting for all tasks to say hello"); 263 | lock.add(1); //task id says hello and atomically adds 1 to lock 264 | lock.waitFor(numtasks); //then it waits for lock to be equal numtasks (which will happen when all tasks say hello) 265 | writeln("task ",id," is done..."); 266 | } 267 | ``` 268 | 269 | ```bash 270 | chpl atomic_example.chpl 271 | ./atomic_example 272 | ``` 273 | 274 | ```output 275 | greetings from task 4... I am waiting for all tasks to say hello 276 | greetings from task 5... I am waiting for all tasks to say hello 277 | greetings from task 2... I am waiting for all tasks to say hello 278 | greetings from task 3... I am waiting for all tasks to say hello 279 | greetings from task 1... I am waiting for all tasks to say hello 280 | task 1 is done... 281 | task 5 is done... 282 | task 2 is done... 283 | task 3 is done... 284 | task 4 is done... 285 | ``` 286 | 287 | > ## Try this... 288 | > 289 | > Comment out the line `lock.waitfor(numtasks)` in the code above to clearly observe the effect of the task 290 | > synchronisation. 291 | 292 | Finally, with all the material studied so far, we should be ready to parallelize our code for the simulation 293 | of the heat transfer equation. 294 | 295 | ::::::::::::::::::::::::::::::::::::: keypoints 296 | - "You can explicitly synchronise tasks with `sync` statement." 297 | - "You can also use sync and atomic variables to synchronise tasks." 298 | :::::::::::::::::::::::::::::::::::::::::::::::: 299 | -------------------------------------------------------------------------------- /episodes/14-parallel-case-study.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Task parallelism with Chapel" 3 | teaching: 60 4 | exercises: 30 5 | --- 6 | 7 | :::::::::::::::::::::::::::::::::::::: questions 8 | - "How do I write parallel code for a real use case?" 9 | :::::::::::::::::::::::::::::::::::::::::::::::: 10 | 11 | ::::::::::::::::::::::::::::::::::::: objectives 12 | - "First objective." 13 | :::::::::::::::::::::::::::::::::::::::::::::::: 14 | 15 | Here is our plan to task-parallelize the heat transfer equation: 16 | 17 | 1. divide the entire grid of points into blocks and assign blocks to individual tasks, 18 | 1. each task should compute the new temperature of its assigned points, 19 | 1. perform a **_reduction_** over the whole grid, to update the greatest temperature difference between 20 | `temp_new` and `temp`. 21 | 22 | For the reduction of the grid we can simply use the `max reduce` statement, which is already 23 | parallelized. Now, let's divide the grid into `rowtasks` x `coltasks` sub-grids, and assign each sub-grid to a 24 | task using the `coforall` loop (we will have `rowtasks*coltasks` tasks in total). 
25 | 26 | ```chpl 27 | config const rowtasks = 2; 28 | config const coltasks = 2; 29 | 30 | // this is the main loop of the simulation 31 | delta = tolerance; 32 | while (c=tolerance) { 33 | c += 1; 34 | 35 | coforall taskid in 0..coltasks*rowtasks-1 { 36 | for i in rowi..rowf { 37 | for j in coli..colf { 38 | temp_new[i,j] = (temp[i-1,j] + temp[i+1,j] + temp[i,j-1] + temp[i,j+1]) / 4; 39 | } 40 | } 41 | } 42 | 43 | delta = max reduce (temp_new-temp); 44 | temp = temp_new; 45 | 46 | if c%outputFrequency == 0 then writeln('Temperature at iteration ',c,': ',temp[x,y]); 47 | } 48 | ``` 49 | 50 | Note that now the nested `for` loops run from `rowi` to `rowf` and from `coli` to `colf` which are, 51 | respectively, the initial and final row and column of the sub-grid associated to the task `taskid`. To compute 52 | these limits, based on `taskid`, we need to compute the number of rows and columns per task (`nr` and `nc`, 53 | respectively) and account for possible non-zero remainders (`rr` and `rc`) that we should add to the last row 54 | and column: 55 | 56 | ```chpl 57 | config const rowtasks = 2; 58 | config const coltasks = 2; 59 | 60 | const nr = rows/rowtasks; 61 | const rr = rows-nr*rowtasks; 62 | const nc = cols/coltasks; 63 | const rc = cols-nc*coltasks; 64 | 65 | // this is the main loop of the simulation 66 | delta = tolerance; 67 | while (c=tolerance) { 68 | c+=1; 69 | 70 | coforall taskid in 0..coltasks*rowtasks-1 { 71 | var rowi, coli, rowf, colf: int; 72 | var taskr, taskc: int; 73 | 74 | taskr = taskid/coltasks; 75 | taskc = taskid%coltasks; 76 | 77 | if taskr=tolerance) { 182 | c = c+1; 183 | 184 | for i in rowi..rowf { 185 | for j in coli..colf { 186 | temp_new[i,j] = (temp[i-1,j] + temp[i+1,j] + temp[i,j-1] + temp[i,j+1]) / 4; 187 | } 188 | } 189 | 190 | //update delta 191 | //update temp 192 | //print temperature in desired position 193 | } 194 | } 195 | ``` 196 | 197 | The problem with this approach is that now we have to explicitly synchronise the tasks. Before, `delta` and 198 | `temp` were updated only by the main task at each iteration; similarly, only the main task was printing 199 | results. Now, all these operations must be carried inside the coforall loop, which imposes the need of 200 | synchronisation between tasks. 201 | 202 | The synchronisation must happen at two points: 203 | 204 | 1. We need to be sure that all tasks have finished with the computations of their part of the grid `temp`, 205 | before updating `delta` and `temp` safely. 206 | 2. We need to be sure that all tasks use the updated value of `delta` to evaluate the condition of the while 207 | loop for the next iteration. 208 | 209 | To update `delta` we could have each task computing the greatest difference in temperature in its associated 210 | sub-grid, and then, after the synchronisation, have only one task reducing all the sub-grids' maximums. 211 | 212 | ```chpl 213 | var delta: atomic real; 214 | var myd: [0..coltasks*rowtasks-1] real; 215 | ... 216 | //this is the main loop of the simulation 217 | delta.write(tolerance); 218 | coforall taskid in 0..coltasks*rowtasks-1 219 | { 220 | var myd2: real; 221 | ... 222 | 223 | while (c= tolerance) { 224 | c = c+1; 225 | ... 
226 | 227 | for i in rowi..rowf { 228 | for j in coli..colf { 229 | temp_new[i,j] = (temp[i-1,j] + temp[i+1,j] + temp[i,j-1] + temp[i,j+1]) / 4; 230 | myd2 = max(abs(temp_new[i,j]-temp[i,j]),myd2); 231 | } 232 | } 233 | myd[taskid] = myd2; 234 | 235 | // here comes the synchronisation of tasks 236 | 237 | temp[rowi..rowf,coli..colf] = temp_new[rowi..rowf,coli..colf]; 238 | if taskid==0 { 239 | delta.write(max reduce myd); 240 | if c%outputFrequency==0 then writeln('Temperature at iteration ',c,': ',temp[x,y]); 241 | } 242 | 243 | // here comes the synchronisation of tasks again 244 | } 245 | } 246 | ``` 247 | 248 | ::::::::::::::::::::::::::::::::::::: challenge 249 | 250 | ## Challenge 4: Can you do it? 251 | 252 | Use `sync` or `atomic` variables to implement the synchronisation required in the code above. 253 | 254 | :::::::::::::::::::::::: solution 255 | 256 | One possible solution is to use an atomic variable as a _lock_ that opens (using the `waitFor` method) when 257 | all the tasks complete the required instructions: 258 | 259 | ```chpl 260 | var lock: atomic int; 261 | lock.write(0); 262 | ... 263 | //this is the main loop of the simulation 264 | delta.write(tolerance); 265 | coforall taskid in 0..coltasks*rowtasks-1 266 | { 267 | ... 268 | while (c < niter && delta.read() >= tolerance) 269 | { 270 | ... 271 | myd[taskid] = myd2; 272 | 273 | //here comes the synchronisation of tasks 274 | lock.add(1); 275 | lock.waitFor(coltasks*rowtasks); 276 | 277 | temp[rowi..rowf,coli..colf] = temp_new[rowi..rowf,coli..colf]; 278 | ... 279 | 280 | //here comes the synchronisation of tasks again 281 | lock.sub(1); 282 | lock.waitFor(0); 283 | } 284 | } 285 | ``` 286 | 287 | ::::::::::::::::::::::::::::::::: 288 | :::::::::::::::::::::::::::::::::::::::::::::::: 289 | 290 | Using the solution from Exercise 4, we can now compare the performance with the benchmark solution: 291 | 292 | ```bash 293 | chpl --fast parallel2.chpl 294 | ./parallel2 --rows=650 --cols=650 --x=200 --y=300 --niter=10000 --tolerance=0.002 --outputFrequency=1000 295 | ``` 296 | 297 | ```output 298 | The simulation will consider a matrix of 650 by 650 elements, 299 | it will run up to 10000 iterations, or until the largest difference 300 | in temperature between iterations is less than 0.002. 301 | You are interested in the evolution of the temperature at the position (200,300) of the matrix... 302 | 303 | and here we go... 304 | Temperature at iteration 0: 25.0 305 | Temperature at iteration 1000: 25.0 306 | Temperature at iteration 2000: 25.0 307 | Temperature at iteration 3000: 25.0 308 | Temperature at iteration 4000: 24.9998 309 | Temperature at iteration 5000: 24.9984 310 | Temperature at iteration 6000: 24.9935 311 | Temperature at iteration 7000: 24.9819 312 | 313 | The simulation took 4.2733 seconds 314 | Final temperature at the desired position after 7750 iterations is: 24.9671 315 | The greatest difference in temperatures between the last two iterations was: 0.00199985 316 | ``` 317 | 318 | to see that we now have a code that performs 5x faster.
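If your copy of the code does not already print its runtime, a minimal sketch of how the main loop can be timed with Chapel's `Time` module is shown below (this assumes a recent Chapel release where the module provides the `stopwatch` type; the serial benchmark from the earlier timing episode can be instrumented the same way):

```chpl
use Time;
var watch: stopwatch;

watch.start();
// ... the main simulation loop goes here ...
watch.stop();

writeln('The simulation took ', watch.elapsed(), ' seconds');
```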
319 | 320 | We finish this section by providing another, elegant version of the 2D heat transfer solver (without time 321 | stepping) using data parallelism on a single locale: 322 | 323 | ```chpl 324 | use Math; /* for exp() */ 325 | 326 | const n = 100, stride = 20; 327 | var temp: [0..n+1, 0..n+1] real; 328 | var temp_new: [1..n,1..n] real; 329 | var x, y: real; 330 | for (i,j) in {1..n,1..n} { // serial iteration 331 | x = ((i:real)-0.5)/n; 332 | y = ((j:real)-0.5)/n; 333 | temp[i,j] = exp(-((x-0.5)**2 + (y-0.5)**2)/0.01); // narrow Gaussian peak 334 | } 335 | coforall (i,j) in {1..n,1..n} by (stride,stride) { // 5x5 decomposition into 20x20 blocks => 25 tasks 336 | for k in i..i+stride-1 { // serial loop inside each block 337 | for l in j..j+stride-1 { 338 | temp_new[k,l] = (temp[k-1,l] + temp[k+1,l] + temp[k,l-1] + temp[k,l+1]) / 4; 339 | } 340 | } 341 | } 342 | ``` 343 | 344 | We will study data parallelism in more detail in the next section. 345 | 346 | ::::::::::::::::::::::::::::::::::::: keypoints 347 | - "To parallelize the diffusion solver with tasks, you divide the 2D domain into blocks and assign each block 348 | to a task." 349 | - "To get the maximum performance, you need to launch the parallel tasks only once, and run the temporal loop 350 | of the simulation with the same set of tasks, resuming the main task only to print the final results." 351 | - "Parallelizing with tasks is more laborious than parallelizing with data (covered in the next section)." 352 | :::::::::::::::::::::::::::::::::::::::::::::::: 353 | -------------------------------------------------------------------------------- /episodes/21-locales.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Running code on multiple machines" 3 | teaching: 120 4 | exercises: 60 5 | --- 6 | 7 | :::::::::::::::::::::::::::::::::::::: questions 8 | - "What is a locale?" 9 | :::::::::::::::::::::::::::::::::::::::::::::::: 10 | 11 | ::::::::::::::::::::::::::::::::::::: objectives 12 | - "First objective." 13 | :::::::::::::::::::::::::::::::::::::::::::::::: 14 | 15 | So far we have been working with single-locale Chapel codes that may run on one or many cores on a single 16 | compute node, making use of the shared memory space and accelerating computations by launching concurrent 17 | tasks on individual cores in parallel. Chapel codes can also run on multiple nodes on a compute cluster. In 18 | Chapel this is referred to as *multi-locale* execution. 19 | 20 | If you work inside a Chapel Docker container, e.g., chapel/chapel-gasnet, the container environment simulates 21 | a multi-locale cluster, so you would compile and launch multi-locale Chapel codes directly by specifying the 22 | number of locales with `-nl` flag: 23 | 24 | ```bash 25 | chpl --fast mycode.chpl -o mybinary 26 | ./mybinary -nl 4 27 | ``` 28 | 29 | Inside the Docker container on multiple locales your code will not run any faster than on a single locale, 30 | since you are emulating a virtual cluster, and all tasks run on the same physical node. To achieve actual 31 | speedup, you need to run your parallel multi-locale Chapel code on a real physical cluster which we hope you 32 | have access to for this session. 33 | 34 | On a real HPC cluster you would need to submit either an interactive or a batch job asking for several nodes 35 | and then run a multi-locale Chapel code inside that job. In practice, the exact commands depend on how the 36 | multi-locale Chapel was built on the cluster. 
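For reference, on a Slurm cluster a batch submission might look like the minimal sketch below; the module name, the resource requests, and the launch line are assumptions that vary between clusters, and we discuss the compile and launch details next:

```bash
#!/bin/bash
#SBATCH --time=0:30:0
#SBATCH --nodes=4
#SBATCH --cpus-per-task=3
#SBATCH --mem-per-cpu=1000
module load gcc chapel-ucx          # or chapel-ofi, depending on the cluster's interconnect
chpl --fast mycode.chpl -o mybinary
srun ./mybinary_real -nl 4          # run on four locales
```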
37 | 38 | When you compile a Chapel code with the multi-locale Chapel compiler, two binaries will be produced. One is 39 | called `mybinary` and is a launcher binary used to submit the real executable `mybinary_real`. If the Chapel 40 | environment is configured properly with the launcher for the cluster's physical interconnect (which might not 41 | be always possible due to a number of factors), then you would simply compile the code and use the launcher 42 | binary `mybinary` to submit the job to the queue: 43 | 44 | ```bash 45 | chpl --fast mycode.chpl -o mybinary 46 | ./mybinary -nl 2 47 | ``` 48 | 49 | The exact parameters of the job such as the maximum runtime and the requested memory can be specified with 50 | Chapel environment variables. One possible drawback of this launching method is that, depending on your 51 | cluster setup, Chapel might have access to all physical cores on each node participating in the run -- this 52 | will present problems if you are scheduling jobs by-core and not by-node, since part of a node should be 53 | allocated to someone else's job. 54 | 55 | Note that on Compute Canada clusters this launching method works without problem. On these clusters 56 | multi-locale Chapel is provided by `chapel-ofi` (for the OmniPath interconnect on Cedar) and `chapel-ucx` (for 57 | the InfiniBand interconnect on Graham, Béluga, Narval) modules, so -- depending on the cluster -- you will 58 | load Chapel using one of the two lines below: 59 | 60 | ```bash 61 | module load gcc chapel-ofi # for the OmniPath interconnect on Cedar cluster 62 | module load gcc chapel-ucx # for the InfiniBand interconnect on Graham, Béluga, Narval clusters 63 | ``` 64 | 65 | 66 | 67 | We can also launch multi-locale Chapel codes using the real executable `mybinary_real`. For example, for an 68 | interactive job you would type: 69 | 70 | ```bash 71 | salloc --time=0:30:0 --nodes=4 --cpus-per-task=3 --mem-per-cpu=1000 --account=def-guest 72 | chpl --fast mycode.chpl -o mybinary 73 | srun ./mybinary_real -nl 4 # will run on four locales with max 3 cores per locale 74 | ``` 75 | 76 | Production jobs would be launched with `sbatch` command and a Slurm launch script as usual. 77 | 78 | For the rest of this class we assume that you have a working multi-locale Chapel environment, whether provided 79 | by a Docker container or by multi-locale Chapel on a physical HPC cluster. We will run all examples on four 80 | nodes with three cores per node. 81 | 82 | # Intro to multi-locale code 83 | 84 | Let us test our multi-locale Chapel environment by launching the following code: 85 | 86 | ```chpl 87 | writeln(Locales); 88 | ``` 89 | 90 | This code will print the built-in global array `Locales`. Running it on four 91 | locales will produce 92 | 93 | ```output 94 | LOCALE0 LOCALE1 LOCALE2 LOCALE3 95 | ``` 96 | 97 | We want to run some code on each locale (node). For that, we can cycle through locales: 98 | 99 | ```chpl 100 | for loc in Locales do // this is still a serial program 101 | on loc do // run the next line on locale `loc` 102 | writeln("this locale is named ", here.name); 103 | ``` 104 | 105 | This will produce 106 | 107 | ```output 108 | this locale is named cdr544 109 | this locale is named cdr552 110 | this locale is named cdr556 111 | this locale is named cdr692 112 | ``` 113 | 114 | Here the built-in variable class `here` refers to the locale on which the code is running, and `here.name` is 115 | its hostname. 
We started a serial `for` loop cycling through all locales, and on each locale we printed its 116 | name, i.e., the hostname of each node. This program ran in serial starting a task on each locale only after 117 | completing the same task on the previous locale. Note the order in which locales were listed. 118 | 119 | To run this code in parallel, starting four simultaneous tasks, one per locale, we simply need to replace 120 | `for` with `forall`: 121 | 122 | ```chpl 123 | forall loc in Locales do // now this is a parallel loop 124 | on loc do 125 | writeln("this locale is named ", here.name); 126 | ``` 127 | 128 | This starts four tasks in parallel, and the order in which the print statement is executed depends on the 129 | runtime conditions and can change from run to run: 130 | 131 | ```output 132 | this locale is named cdr544 133 | this locale is named cdr692 134 | this locale is named cdr556 135 | this locale is named cdr552 136 | ``` 137 | 138 | We can print few other attributes of each locale. Here it is actually useful to revert to the serial loop 139 | `for` so that the print statements appear in order: 140 | 141 | ```chpl 142 | use MemDiagnostics; 143 | for loc in Locales do 144 | on loc { 145 | writeln("locale #", here.id, "..."); 146 | writeln(" ...is named: ", here.name); 147 | writeln(" ...has ", here.numPUs(), " processor cores"); 148 | writeln(" ...has ", here.physicalMemory(unit=MemUnits.GB, retType=real), " GB of memory"); 149 | writeln(" ...has ", here.maxTaskPar, " maximum parallelism"); 150 | } 151 | ``` 152 | 153 | ```output 154 | locale #0... 155 | ...is named: cdr544 156 | ...has 3 processor cores 157 | ...has 125.804 GB of memory 158 | ...has 3 maximum parallelism 159 | locale #1... 160 | ...is named: cdr552 161 | ...has 3 processor cores 162 | ...has 125.804 GB of memory 163 | ...has 3 maximum parallelism 164 | locale #2... 165 | ...is named: cdr556 166 | ...has 3 processor cores 167 | ...has 125.804 GB of memory 168 | ...has 3 maximum parallelism 169 | locale #3... 170 | ...is named: cdr692 171 | ...has 3 processor cores 172 | ...has 125.804 GB of memory 173 | ...has 3 maximum parallelism 174 | ``` 175 | 176 | Note that while Chapel correctly determines the number of cores available inside our job on each node, and the 177 | maximum parallelism (which is the same as the number of cores available!), it lists the total physical memory 178 | on each node available to all running jobs which is not the same as the total memory per node allocated to our 179 | job. 180 | 181 | ::::::::::::::::::::::::::::::::::::: keypoints 182 | - "Locale in Chapel is a shared-memory node on a cluster." 183 | - "We can cycle in serial or parallel through all locales." 184 | :::::::::::::::::::::::::::::::::::::::::::::::: 185 | -------------------------------------------------------------------------------- /episodes/22-domains.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Domains and data parallelism" 3 | teaching: 120 4 | exercises: 60 5 | --- 6 | 7 | :::::::::::::::::::::::::::::::::::::: questions 8 | - "How do I store and manipulate data across multiple locales?" 9 | :::::::::::::::::::::::::::::::::::::::::::::::: 10 | 11 | ::::::::::::::::::::::::::::::::::::: objectives 12 | - "First objective." 13 | :::::::::::::::::::::::::::::::::::::::::::::::: 14 | 15 | # Domains and single-locale data parallelism 16 | 17 | We start this section by recalling the definition of a range in Chapel. 
A range is a 1D set of integer indices 18 | that can be bounded or infinite: 19 | 20 | ```chpl 21 | var oneToTen: range = 1..10; // 1, 2, 3, ..., 10 22 | var a = 1234, b = 5678; 23 | var aToB: range = a..b; // using variables 24 | var twoToTenByTwo: range(strides=strideKind.positive) = 2..10 by 2; // 2, 4, 6, 8, 10 25 | var oneToInf = 1.. ; // unbounded range 26 | ``` 27 | 28 | On the other hand, domains are multi-dimensional (including 1D) sets of integer indices that are always 29 | bounded. To stress the difference between domain ranges and domains, domain definitions always enclose their 30 | indices in curly brackets. Ranges can be used to define a specific dimension of a domain: 31 | 32 | ```chpl 33 | var domain1to10: domain(1) = {1..10}; // 1D domain from 1 to 10 defined using the range 1..10 34 | var twoDimensions: domain(2) = {-2..2,0..2}; // 2D domain over a product of two ranges 35 | var thirdDim: range = 1..16; // a range 36 | var threeDims: domain(3) = {thirdDim, 1..10, 5..10}; // 3D domain over a product of three ranges 37 | for idx in twoDimensions do // cycle through all points in a 2D domain 38 | write(idx, ", "); 39 | writeln(); 40 | for (x,y) in twoDimensions { // can also cycle using explicit tuples (x,y) 41 | write("(", x, ", ", y, ")", ", "); 42 | } 43 | ``` 44 | 45 | Let us define an n^2 domain called `mesh`. It is defined by the single task in our code and is therefore 46 | defined in memory on the same node (locale 0) where this task is running. For each of n^2 mesh points, let us 47 | print out 48 | 49 | 1. `m.locale.id`, the ID of the locale holding that mesh point (should be 0) 50 | 2. `here.id`, the ID of the locale on which the code is running (should be 0) 51 | 3. `here.maxTaskPar`, the number of cores (max parallelism with 1 task/core) (should be 3) 52 | 53 | **Note**: We already saw some of these variables/functions: numLocales, Locales, here.id, here.name, 54 | here.numPUs(), here.physicalMemory(), here.maxTaskPar. 55 | 56 | ```chpl 57 | config const n = 8; 58 | const mesh: domain(2) = {1..n, 1..n}; // a 2D domain defined in shared memory on a single locale 59 | forall m in mesh { // go in parallel through all n^2 mesh points 60 | writeln((m, m.locale.id, here.id, here.maxTaskPar)); 61 | } 62 | ``` 63 | 64 | ```output 65 | ((7, 1), 0, 0, 3) 66 | ((1, 1), 0, 0, 3) 67 | ((7, 2), 0, 0, 3) 68 | ((1, 2), 0, 0, 3) 69 | ... 70 | ((6, 6), 0, 0, 3) 71 | ((6, 7), 0, 0, 3) 72 | ((6, 8), 0, 0, 3) 73 | ``` 74 | 75 | Now we are going to learn two very important properties of Chapel domains. First, domains can be used to 76 | define arrays of variables of any type on top of them. For example, let us define an n^2 array of real numbers 77 | on top of `mesh`: 78 | 79 | ```chpl 80 | config const n = 8; 81 | const mesh: domain(2) = {1..n, 1..n}; // a 2D domain defined in shared memory on a single locale 82 | var T: [mesh] real; // a 2D array of reals defined in shared memory on a single locale (mapped onto this domain) 83 | forall t in T { // go in parallel through all n^2 elements of T 84 | writeln((t, t.locale.id)); 85 | } 86 | ``` 87 | 88 | ```output 89 | (0.0, 0) 90 | (0.0, 0) 91 | (0.0, 0) 92 | (0.0, 0) 93 | ... 94 | (0.0, 0) 95 | (0.0, 0) 96 | (0.0, 0) 97 | ``` 98 | 99 | By default, all n^2 array elements are set to zero, and all of them are defined on the same locale as the 100 | underlying mesh. 
We can also cycle through all indices of T by accessing its domain: 101 | 102 | ```chpl 103 | forall idx in T.domain { 104 | writeln(idx, ' ', T(idx)); // idx is a tuple (i,j); also print the corresponding array element 105 | } 106 | ``` 107 | 108 | ```output 109 | (7, 1) 0.0 110 | (1, 1) 0.0 111 | (7, 2) 0.0 112 | (1, 2) 0.0 113 | ... 114 | (6, 6) 0.0 115 | (6, 7) 0.0 116 | (6, 8) 0.0 117 | ``` 118 | 119 | Since we use a parallel `forall` loop, the print statements appear in a random runtime order. 120 | 121 | We can also define multiple arrays on the same domain: 122 | 123 | ```chpl 124 | const grid = {1..100}; // 1D domain 125 | const alpha = 5; // some number 126 | var A, B, C: [grid] real; // local real-type arrays on this 1D domain 127 | B = 2; C = 3; 128 | forall (a,b,c) in zip(A,B,C) do // parallel loop 129 | a = b + alpha*c; // simple example of data parallelism on a single locale 130 | writeln(A); 131 | ``` 132 | 133 | The second important property of Chapel domains is that they can span multiple locales (nodes). 134 | 135 | ## Distributed domains 136 | 137 | Domains are fundamental Chapel concept for distributed-memory data parallelism. 138 | 139 | Let us now define an n^2 distributed (over several locales) domain `distributedMesh` mapped to locales in 140 | blocks. On top of this domain we define a 2D block-distributed array A of strings mapped to locales in exactly 141 | the same pattern as the underlying domain. Let us print out 142 | 143 | 1. `a.locale.id`, the ID of the locale holding the element a of A 144 | 2. `here.name`, the name of the locale on which the code is running 145 | 3. `here.maxTaskPar`, the number of cores on the locale on which the code is 146 | running 147 | 148 | Instead of printing these values to the screen, we will store this output inside each element of A as a string 149 | `a.locale.id:string + '-' + here.name + '-' + here.maxTaskPar:string`, adding a separator `' '` at the end of 150 | each element. 151 | 152 | ```chpl 153 | use BlockDist; // use standard block distribution module to partition the domain into blocks 154 | config const n = 8; 155 | const mesh: domain(2) = {1..n, 1..n}; 156 | const distributedMesh: domain(2) dmapped new blockDist(boundingBox=mesh) = mesh; 157 | var A: [distributedMesh] string; // block-distributed array mapped to locales 158 | forall a in A { // go in parallel through all n^2 elements in A 159 | // assign each array element on the locale that stores that index/element 160 | a = a.locale.id:string + '-' + here.name + '-' + here.maxTaskPar:string + ' '; 161 | } 162 | writeln(A); 163 | ``` 164 | 165 | The syntax `boundingBox=mesh` tells the compiler that the outer edge of our decomposition coincides exactly 166 | with the outer edge of our domain. Alternatively, the outer decomposition layer could include an additional 167 | perimeter of *ghost points* if we specify 168 | 169 | ```chpl 170 | const mesh: domain(2) = {1..n, 1..n}; 171 | const largerMesh: domain(2) dmapped new blockDist(boundingBox=mesh) = {0..n+1,0..n+1}; 172 | ``` 173 | 174 | but let us not worry about this for now. 
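If you are curious where such *ghost points* would live, here is a minimal sketch you could run (with the Block distribution, indices outside the bounding box should be mapped to the locale that owns the nearest index inside the box; the exact layout depends on how many locales you have):

```chpl
use BlockDist;
config const n = 8;
const mesh: domain(2) = {1..n, 1..n};
const largerMesh: domain(2) dmapped new blockDist(boundingBox=mesh) = {0..n+1, 0..n+1};
var B: [largerMesh] int;
// query the locale that stores a given element of the distributed array
writeln("element (0,0) is stored on locale ", B[0,0].locale.id);
writeln("element (", n+1, ",", n+1, ") is stored on locale ", B[n+1,n+1].locale.id);
```

With that aside, let us return to the block-distributed array `A` defined above.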
175 | 176 | Running our code on four locales with three cores per locale produces the following output: 177 | 178 | ```output 179 | 0-cdr544-3 0-cdr544-3 0-cdr544-3 0-cdr544-3 1-cdr552-3 1-cdr552-3 1-cdr552-3 1-cdr552-3 180 | 0-cdr544-3 0-cdr544-3 0-cdr544-3 0-cdr544-3 1-cdr552-3 1-cdr552-3 1-cdr552-3 1-cdr552-3 181 | 0-cdr544-3 0-cdr544-3 0-cdr544-3 0-cdr544-3 1-cdr552-3 1-cdr552-3 1-cdr552-3 1-cdr552-3 182 | 0-cdr544-3 0-cdr544-3 0-cdr544-3 0-cdr544-3 1-cdr552-3 1-cdr552-3 1-cdr552-3 1-cdr552-3 183 | 2-cdr556-3 2-cdr556-3 2-cdr556-3 2-cdr556-3 3-cdr692-3 3-cdr692-3 3-cdr692-3 3-cdr692-3 184 | 2-cdr556-3 2-cdr556-3 2-cdr556-3 2-cdr556-3 3-cdr692-3 3-cdr692-3 3-cdr692-3 3-cdr692-3 185 | 2-cdr556-3 2-cdr556-3 2-cdr556-3 2-cdr556-3 3-cdr692-3 3-cdr692-3 3-cdr692-3 3-cdr692-3 186 | 2-cdr556-3 2-cdr556-3 2-cdr556-3 2-cdr556-3 3-cdr692-3 3-cdr692-3 3-cdr692-3 3-cdr692-3 187 | ``` 188 | 189 | As we see, the domain `distributedMesh` (along with the string array `A` on top of it) was decomposed into 2x2 190 | blocks stored on the four nodes, respectively. Equally important, for each element `a` of the array, the line 191 | of code filling in that element ran on the same locale where that element was stored. In other words, this 192 | code ran in parallel (`forall` loop) on four nodes, using up to three cores on each node to fill in the 193 | corresponding array elements. Once the parallel loop is finished, the `writeln` command runs on locale 0 194 | gathering remote elements from other locales and printing them to standard output. 195 | 196 | Now we can print the range of indices for each sub-domain by adding the following to our code: 197 | 198 | ```chpl 199 | for loc in Locales { 200 | on loc { 201 | writeln(A.localSubdomain()); 202 | } 203 | } 204 | ``` 205 | 206 | On 4 locales we should get: 207 | 208 | ```output 209 | {1..4, 1..4} 210 | {1..4, 5..8} 211 | {5..8, 1..4} 212 | {5..8, 5..8} 213 | ``` 214 | 215 | Let us count the number of threads by adding the following to our code: 216 | 217 | ```chpl 218 | var counter = 0; 219 | forall a in A with (+ reduce counter) { // go in parallel through all n^2 elements 220 | counter = 1; 221 | } 222 | writeln("actual number of threads = ", counter); 223 | ``` 224 | 225 | If `n=8` in our code is sufficiently large, there are enough array elements per node (8*8/4 = 16 in our case) 226 | to fully utilise all three available cores on each node, so our output should be 227 | 228 | ```output 229 | actual number of threads = 12 230 | ``` 231 | 232 | Try reducing the array size `n` to see if that changes the output (fewer tasks per locale), e.g., setting 233 | n=3. Also try increasing the array size to n=20 and study the output. Does the output make sense? 234 | 235 | So far we looked at the block distribution `BlockDist`. It will distribute a 2D domain among nodes either 236 | using 1D or 2D decomposition (in our example it was 2D decomposition 2x2), depending on the domain size and 237 | the number of nodes. 238 | 239 | Let us take a look at another standard module for domain partitioning onto locales, called CyclicDist. For 240 | each element of the array we will print out again 241 | 242 | 1. `a.locale.id`, the ID of the locale holding the element a of A 243 | 2. `here.name`, the name of the locale on which the code is running 244 | 3. 
`here.maxTaskPar`, the number of cores on the locale on which the code is running 245 | 246 | ```chpl 247 | use CyclicDist; // elements are sent to locales in a round-robin pattern 248 | config const n = 8; 249 | const mesh: domain(2) = {1..n, 1..n}; // a 2D domain defined in shared memory on a single locale 250 | const m2: domain(2) dmapped new cyclicDist(startIdx=mesh.low) = mesh; // mesh.low is the first index (1,1) 251 | var A2: [m2] string; 252 | forall a in A2 { 253 | a = a.locale.id:string + '-' + here.name + '-' + here.maxTaskPar:string + ' '; 254 | } 255 | writeln(A2); 256 | ``` 257 | 258 | ```output 259 | 0-cdr544-3 1-cdr552-3 0-cdr544-3 1-cdr552-3 0-cdr544-3 1-cdr552-3 0-cdr544-3 1-cdr552-3 260 | 2-cdr556-3 3-cdr692-3 2-cdr556-3 3-cdr692-3 2-cdr556-3 3-cdr692-3 2-cdr556-3 3-cdr692-3 261 | 0-cdr544-3 1-cdr552-3 0-cdr544-3 1-cdr552-3 0-cdr544-3 1-cdr552-3 0-cdr544-3 1-cdr552-3 262 | 2-cdr556-3 3-cdr692-3 2-cdr556-3 3-cdr692-3 2-cdr556-3 3-cdr692-3 2-cdr556-3 3-cdr692-3 263 | 0-cdr544-3 1-cdr552-3 0-cdr544-3 1-cdr552-3 0-cdr544-3 1-cdr552-3 0-cdr544-3 1-cdr552-3 264 | 2-cdr556-3 3-cdr692-3 2-cdr556-3 3-cdr692-3 2-cdr556-3 3-cdr692-3 2-cdr556-3 3-cdr692-3 265 | 0-cdr544-3 1-cdr552-3 0-cdr544-3 1-cdr552-3 0-cdr544-3 1-cdr552-3 0-cdr544-3 1-cdr552-3 266 | 2-cdr556-3 3-cdr692-3 2-cdr556-3 3-cdr692-3 2-cdr556-3 3-cdr692-3 2-cdr556-3 3-cdr692-3 267 | ``` 268 | 269 | As the name `CyclicDist` suggests, the domain was mapped to locales in a cyclic, round-robin pattern. We can 270 | also print the range of indices for each sub-domain by adding the following to our code: 271 | 272 | ```chpl 273 | for loc in Locales { 274 | on loc { 275 | writeln(A2.localSubdomain()); 276 | } 277 | } 278 | ``` 279 | 280 | ```output 281 | {1..7 by 2, 1..7 by 2} 282 | {1..7 by 2, 2..8 by 2} 283 | {2..8 by 2, 1..7 by 2} 284 | {2..8 by 2, 2..8 by 2} 285 | ``` 286 | 287 | In addition to BlockDist and CyclicDist, Chapel has several other predefined distributions: BlockCycDist, 288 | ReplicatedDist, DimensionalDist2D, ReplicatedDim, BlockCycDim — for details please see 289 | https://chapel-lang.org/docs/primers/distributions.html. 290 | 291 | ## Diffusion solver on distributed domains 292 | 293 | Now let us use distributed domains to write a parallel version of our original diffusion solver code: 294 | 295 | ```chpl 296 | use BlockDist; 297 | use Math; 298 | config const n = 8; 299 | const mesh: domain(2) = {1..n, 1..n}; // local 2D n^2 domain 300 | ``` 301 | 302 | We will add a larger (n+2)^2 block-distributed domain `largerMesh` with a layer of *ghost points* on 303 | *perimeter locales*, and define a temperature array `temp` on top of it, by adding the following to our code: 304 | 305 | ```chpl 306 | const largerMesh: domain(2) dmapped new blockDist(boundingBox=mesh) = {0..n+1, 0..n+1}; 307 | var temp: [largerMesh] real; // a block-distributed array of temperatures 308 | forall (i,j) in temp.domain[1..n,1..n] { 309 | var x = ((i:real)-0.5)/(n:real); // x, y are local to each task 310 | var y = ((j:real)-0.5)/(n:real); 311 | temp[i,j] = exp(-((x-0.5)**2 + (y-0.5)**2) / 0.01); // narrow Gaussian peak 312 | } 313 | writeln(temp); 314 | ``` 315 | 316 | Here we initialised an initial Gaussian temperature peak in the middle of the mesh. As we evolve our solution 317 | in time, this peak should diffuse slowly over the rest of the domain. 318 | 319 | > ## Question 320 | > 321 | > Why do we have `forall (i,j) in temp.domain[1..n,1..n]` 322 | > and not `forall (i,j) in mesh`? 
323 | >
324 | > > ## Answer
325 | > > The first one will run on multiple locales in parallel, whereas the
326 | > > second will run in parallel via multiple threads on locale 0 only, since
327 | > > `mesh` is defined on locale 0.
328 | 
329 | The code above will print the initial temperature distribution:
330 | 
331 | ```output
332 | 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
333 | 0.0 2.36954e-17 2.79367e-13 1.44716e-10 3.29371e-09 3.29371e-09 1.44716e-10 2.79367e-13 2.36954e-17 0.0
334 | 0.0 2.79367e-13 3.29371e-09 1.70619e-06 3.88326e-05 3.88326e-05 1.70619e-06 3.29371e-09 2.79367e-13 0.0
335 | 0.0 1.44716e-10 1.70619e-06 0.000883826 0.0201158 0.0201158 0.000883826 1.70619e-06 1.44716e-10 0.0
336 | 0.0 3.29371e-09 3.88326e-05 0.0201158 0.457833 0.457833 0.0201158 3.88326e-05 3.29371e-09 0.0
337 | 0.0 3.29371e-09 3.88326e-05 0.0201158 0.457833 0.457833 0.0201158 3.88326e-05 3.29371e-09 0.0
338 | 0.0 1.44716e-10 1.70619e-06 0.000883826 0.0201158 0.0201158 0.000883826 1.70619e-06 1.44716e-10 0.0
339 | 0.0 2.79367e-13 3.29371e-09 1.70619e-06 3.88326e-05 3.88326e-05 1.70619e-06 3.29371e-09 2.79367e-13 0.0
340 | 0.0 2.36954e-17 2.79367e-13 1.44716e-10 3.29371e-09 3.29371e-09 1.44716e-10 2.79367e-13 2.36954e-17 0.0
341 | 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
342 | ```
343 | 
344 | Let us define an array of strings `nodeID` with the same distribution over locales as `temp`, by adding the
345 | following to our code:
346 | 
347 | ```chpl
348 | var nodeID: [largerMesh] string;
349 | forall m in nodeID do
350 |   m = here.id:string;
351 | writeln(nodeID);
352 | ```
353 | 
354 | The outer perimeter in the partition below corresponds to the *ghost points*:
355 | 
356 | ```output
357 | 0 0 0 0 0 1 1 1 1 1
358 | 0 0 0 0 0 1 1 1 1 1
359 | 0 0 0 0 0 1 1 1 1 1
360 | 0 0 0 0 0 1 1 1 1 1
361 | 0 0 0 0 0 1 1 1 1 1
362 | 2 2 2 2 2 3 3 3 3 3
363 | 2 2 2 2 2 3 3 3 3 3
364 | 2 2 2 2 2 3 3 3 3 3
365 | 2 2 2 2 2 3 3 3 3 3
366 | 2 2 2 2 2 3 3 3 3 3
367 | ```
368 | 
369 | ::::::::::::::::::::::::::::::::::::: challenge
370 | 
371 | ## Challenge 3: Can you do it?
372 | 
373 | In addition to `here.id`, also print the ID of the locale holding that value. Is it the same as or different from `here.id`?
374 | 
375 | :::::::::::::::::::::::: solution
376 | 
377 | Something along these lines: `m = here.id:string + '-' + m.locale.id:string`
378 | 
379 | :::::::::::::::::::::::::::::::::
380 | ::::::::::::::::::::::::::::::::::::::::::::::::
381 | 
382 | Now we implement the parallel solver, by adding the following to our code (*contains a mistake on purpose!*):
383 | 
384 | ```chpl
385 | var temp_new: [largerMesh] real;
386 | for step in 1..5 { // time-stepping
387 |   forall (i,j) in mesh do
388 |     temp_new[i,j] = (temp[i-1,j] + temp[i+1,j] + temp[i,j-1] + temp[i,j+1]) / 4;
389 |   temp[mesh] = temp_new[mesh]; // uses parallel forall underneath
390 | }
391 | ```
392 | 
393 | ::::::::::::::::::::::::::::::::::::: challenge
394 | 
395 | ## Challenge 4: Can you do it?
396 | 
397 | Can you spot the mistake in this code?
398 | 
399 | :::::::::::::::::::::::: solution
400 | 
401 | It should be
402 | 
403 | `forall (i,j) in temp_new.domain[1..n,1..n] do`
404 | 
405 | instead of
406 | 
407 | `forall (i,j) in mesh do`
408 | 
409 | as the latter will likely run in parallel via threads only on locale 0, whereas the former will run on
410 | multiple locales in parallel.
411 | 
412 | :::::::::::::::::::::::::::::::::
413 | ::::::::::::::::::::::::::::::::::::::::::::::::
414 | 
415 | Here is the final version of the entire code:
416 | 
417 | ```chpl
418 | use BlockDist;
419 | use Math;
420 | config const n = 8;
421 | const mesh: domain(2) = {1..n,1..n};
422 | const largerMesh: domain(2) dmapped new blockDist(boundingBox=mesh) = {0..n+1,0..n+1};
423 | var temp, temp_new: [largerMesh] real;
424 | forall (i,j) in temp.domain[1..n,1..n] {
425 |   var x = ((i:real)-0.5)/(n:real);
426 |   var y = ((j:real)-0.5)/(n:real);
427 |   temp[i,j] = exp(-((x-0.5)**2 + (y-0.5)**2) / 0.01);
428 | }
429 | for step in 1..5 {
430 |   forall (i,j) in temp_new.domain[1..n,1..n] {
431 |     temp_new[i,j] = (temp[i-1,j] + temp[i+1,j] + temp[i,j-1] + temp[i,j+1]) / 4.0;
432 |   }
433 |   temp = temp_new;
434 |   writeln((step, " ", temp[n/2,n/2], " ", temp[1,1]));
435 | }
436 | ```
437 | 
438 | This is the entire parallel solver! Note that we implemented an open boundary: `temp` on the *ghost points* is
439 | always 0. Let us add some printout and also compute the total energy on the mesh, by adding the following
440 | inside the time loop:
441 | 
442 | ```chpl
443 | writeln((step, " ", temp[n/2,n/2], " ", temp[2,2]));
444 | var total: real = 0;
445 | forall (i,j) in mesh with (+ reduce total) do
446 |   total += temp[i,j];
447 | writeln("total = ", total);
448 | ```
449 | 
450 | Notice how the total energy decreases in time with the open boundary conditions, as the energy is leaving the
451 | system.
452 | 
453 | 
454 | ::::::::::::::::::::::::::::::::::::: challenge
455 | 
456 | ## Challenge 5: Can you do it?
457 | 
458 | Write a code that prints how the finite-difference stencil `[i,j]`, `[i-1,j]`, `[i+1,j]`, `[i,j-1]`, `[i,j+1]` is
459 | distributed among nodes, and compare that to the ID of the node where `temp[i,j]` is computed.
460 | 
461 | :::::::::::::::::::::::: solution
462 | 
463 | Here is one possible solution examining the locality of the finite-difference stencil:
464 | 
465 | ```chpl
466 | var nodeID: [largerMesh] string = 'empty';
467 | forall (i,j) in nodeID.domain[1..n,1..n] do
468 |   nodeID[i,j] = here.id:string + nodeID[i,j].locale.id:string + nodeID[i-1,j].locale.id:string +
469 |     nodeID[i+1,j].locale.id:string + nodeID[i,j-1].locale.id:string + nodeID[i,j+1].locale.id:string + ' ';
470 | writeln(nodeID);
471 | ```
472 | 
473 | :::::::::::::::::::::::::::::::::
474 | ::::::::::::::::::::::::::::::::::::::::::::::::
475 | 
476 | This produces the following output, clearly showing the *ghost points* and the stencil distribution for each
477 | mesh point:
478 | 
479 | ```output
480 | empty empty empty empty empty empty empty empty empty empty
481 | empty 000000 000000 000000 000001 111101 111111 111111 111111 empty
482 | empty 000000 000000 000000 000001 111101 111111 111111 111111 empty
483 | empty 000000 000000 000000 000001 111101 111111 111111 111111 empty
484 | empty 000200 000200 000200 000201 111301 111311 111311 111311 empty
485 | empty 220222 220222 220222 220223 331323 331333 331333 331333 empty
486 | empty 222222 222222 222222 222223 333323 333333 333333 333333 empty
487 | empty 222222 222222 222222 222223 333323 333333 333333 333333 empty
488 | empty 222222 222222 222222 222223 333323 333333 333333 333333 empty
489 | empty empty empty empty empty empty empty empty empty empty
490 | ```
491 | 
492 | Note that `temp[i,j]` is always computed on the same node where that element is stored, which makes sense.
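
If you would like to run the full solver yourself, a typical compile-and-run sequence is sketched below. The file name `diffusion.chpl` is only an assumption (use whatever you named your file), and the exact launch command depends on how multi-locale Chapel is configured on your cluster:

```bash
# compile with optimisations; assumes the code above was saved as diffusion.chpl
chpl --fast diffusion.chpl -o diffusion

# launch on 4 locales (nodes); --n=8 overrides the config const n from the command line
./diffusion -nl 4 --n=8
```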
493 | 
494 | ## Periodic boundary conditions
495 | 
496 | Now let us modify the previous parallel solver to include periodic BCs. At the beginning of each time step we
497 | need to set elements on the *ghost points* to their respective values on the *opposite ends*, by adding the
498 | following to our code:
499 | 
500 | ```chpl
501 | temp[0,1..n] = temp[n,1..n]; // periodic boundaries on all four sides; these will run via parallel forall
502 | temp[n+1,1..n] = temp[1,1..n];
503 | temp[1..n,0] = temp[1..n,n];
504 | temp[1..n,n+1] = temp[1..n,1];
505 | ```
506 | 
507 | Now the total energy should be conserved, as nothing leaves the domain.
508 | 
509 | ## I/O
510 | 
511 | Let us write the final solution to disk. There are several caveats:
512 | 
513 | - it works only with ASCII
514 | - Chapel can also write binary data, but nothing else can read it (we checked: it is not an
515 |   endianness problem!)
516 | - we would love to write NetCDF and HDF5; this can probably be done by calling C/C++
517 |   functions from Chapel
518 | 
519 | We'll add the following to our code to write ASCII:
520 | 
521 | ```chpl
522 | use IO;
523 | var myFile = open("output.dat", ioMode.cw); // open the file for writing
524 | var myWritingChannel = myFile.writer(); // create a writing channel starting at file offset 0
525 | myWritingChannel.write(temp); // write the array
526 | myWritingChannel.close(); // close the channel
527 | ```
528 | 
529 | Run the code and check the file *output.dat*: it should contain the array `temp` after 5 steps in ASCII.
530 | 
531 | 
532 | 
533 | 
534 | 
535 | 
536 | 
537 | 
538 | ::::::::::::::::::::::::::::::::::::: keypoints
539 | - "Domains are multi-dimensional sets of integer indices."
540 | - "A domain can be defined on a single locale or distributed across many locales."
541 | - "There are many predefined distribution methods: block, cyclic, etc."
542 | - "Arrays are defined on top of domains and inherit their distribution model."
543 | ::::::::::::::::::::::::::::::::::::::::::::::::
544 | 
--------------------------------------------------------------------------------
/hpc-chapel.Rproj:
--------------------------------------------------------------------------------
1 | Version: 1.0
2 | 
3 | RestoreWorkspace: No
4 | SaveWorkspace: No
5 | AlwaysSaveHistory: Default
6 | 
7 | EnableCodeIndexing: Yes
8 | UseSpacesForTab: Yes
9 | NumSpacesForTab: 2
10 | Encoding: UTF-8
11 | 
12 | RnwWeave: Sweave
13 | LaTeX: pdfLaTeX
14 | 
15 | AutoAppendNewline: Yes
16 | StripTrailingWhitespace: Yes
17 | LineEndingConversion: Posix
18 | 
19 | BuildType: Website
20 | 
--------------------------------------------------------------------------------
/index.md:
--------------------------------------------------------------------------------
1 | ---
2 | site: sandpaper::sandpaper_site
3 | ---
4 | 
5 | 
6 | 
7 | 
8 | This workshop is an introduction to parallel programming in Chapel. This material is designed for Day 2 of HPC Carpentry.
9 | 
10 | By the end of this workshop, students will know:
11 | 
12 | - the basic syntax of Chapel codes,
13 | - how to run single-locale Chapel codes,
14 | - how to write task-parallel codes for a shared-memory compute node,
15 | - how to run multi-locale Chapel codes,
16 | - how to write domain-parallel codes for a distributed-memory cluster.
17 | 
18 | **NOTE**: This is the draft HPC Carpentry release. Comments and feedback are welcome.
19 | -------------------------------------------------------------------------------- /instructors/instructor-notes.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: 'Instructor Notes' 3 | --- 4 | 5 | This is a placeholder file. Please add content here. 6 | -------------------------------------------------------------------------------- /learners/reference.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: 'Reference' 3 | --- 4 | 5 | ## Glossary 6 | 7 | This is a placeholder file. Please add content here. 8 | 9 | -------------------------------------------------------------------------------- /learners/setup.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Setup 3 | --- 4 | 5 | We highly recommend running Chapel on an HPC cluster. Alternatively, you can run Chapel on your computer, but 6 | don't expect a multi-node speedup since you have only one node. 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | ## Software Setup 18 | 19 | ::::::::::::::::::::::::::::::::::::::: discussion 20 | 21 | ### Details 22 | 23 | This section describes installing Chapel on your own computer. Before proceeding, please double-check that 24 | your workshop instructors do not already provide Chapel on an HPC cluster. 25 | 26 | 27 | 28 | 29 | 30 | 31 | ::::::::::::::::::::::::::::::::::::::::::::::::::: 32 | 33 | :::::::::::::::: spoiler 34 | 35 | ### Windows 36 | 37 | Go to the website https://docs.docker.com/docker-for-windows/install/ and download the Docker Desktop 38 | installation file. Double-click on the `Docker_Desktop_Installer.exe` to run the installer. During the 39 | installation process, enable Hyper-V Windows Feature on the Configuration page, and wait for the installation 40 | to complete. At this point you might need to restart your computer. 41 | 42 | Eventually you want to run https://hub.docker.com/r/chapel/chapel Docker image. 43 | 44 | :::::::::::::::::::::::: 45 | 46 | :::::::::::::::: spoiler 47 | 48 | ### MacOS 49 | 50 | The quickest way to get started with Chapel on MacOS is to install it via Homebrew. If you don't have Homebrew 51 | installed (skip this step if you do), open Terminal.app and type 52 | 53 | ```bash 54 | /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)" 55 | ``` 56 | 57 | Next, proceed to installing Chapel: 58 | 59 | ```bash 60 | brew update 61 | brew install chapel 62 | ``` 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 | 71 | :::::::::::::::::::::::: 72 | 73 | 74 | :::::::::::::::: spoiler 75 | 76 | ### Linux 77 | 78 | At https://github.com/chapel-lang/chapel/releases scroll to the first "Assets" section (you might need to 79 | click on "Show all assets") and pick the latest precompiled Chapel package for your Linux distribution. 
For
80 | example, with Ubuntu 22.04 you can do:
81 | 
82 | ```bash
83 | wget https://github.com/chapel-lang/chapel/releases/download/2.1.0/chapel-2.1.0-1.ubuntu22.amd64.deb
84 | sudo apt install ./chapel-2.1.0-1.ubuntu22.amd64.deb
85 | ```
86 | 
87 | ::::::::::::::::::::::::
88 | 
--------------------------------------------------------------------------------
/links.md:
--------------------------------------------------------------------------------
1 | 
5 | 
6 | [pandoc]: https://pandoc.org/MANUAL.html
7 | [r-markdown]: https://rmarkdown.rstudio.com/
8 | [rstudio]: https://www.rstudio.com/
9 | [carpentries-workbench]: https://carpentries.github.io/sandpaper-docs/
10 | 
11 | 
--------------------------------------------------------------------------------
/profiles/learner-profiles.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: FIXME
3 | ---
4 | 
5 | This is a placeholder file. Please add content here.
6 | 
--------------------------------------------------------------------------------
/site/README.md:
--------------------------------------------------------------------------------
1 | # {sandpaper}-Generated Content
2 | 
3 | This directory contains rendered lesson materials.
4 | Please do not edit files here.
5 | 
--------------------------------------------------------------------------------