├── .nojekyll
├── CNAME
├── images
    ├── favicon.png
    ├── pdf-cover.pdf
    └── netherlands-escience-center-logo-RGB.png
├── .docsifytopdfrc.yml
├── .gitignore
├── .pre-commit-config.yaml
├── lychee.toml
├── styles.css
├── .github
    ├── dependabot.yml
    ├── PULL_REQUEST_TEMPLATE
    └── workflows
    │   ├── link-checker.yml
    │   ├── link-checker-pr.yml
    │   └── upload-pdf.yml
├── privacy.md
├── technology
    ├── technology_overview.md
    ├── user_experience.md
    ├── datasets.md
    └── gpu.md
├── _sidebar.md
├── language_guides
    ├── languages_overview.md
    ├── fortran.md
    ├── rust.md
    ├── bash.md
    ├── r.md
    ├── javascript.md
    ├── ccpp.md
    └── python.md
├── README.md
├── index.html
├── CITATION.cff
├── best_practices.md
├── CONTRIBUTING.md
└── LICENSE


/.nojekyll:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/CNAME:
--------------------------------------------------------------------------------
1 | guide.esciencecenter.nl


--------------------------------------------------------------------------------
/images/favicon.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NLeSC/guide/HEAD/images/favicon.png


--------------------------------------------------------------------------------
/.docsifytopdfrc.yml:
--------------------------------------------------------------------------------
1 | contents:
2 |   - _sidebar.md
3 | pathToPublic: guide-nlesc.pdf
4 | 


--------------------------------------------------------------------------------
/images/pdf-cover.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NLeSC/guide/HEAD/images/pdf-cover.pdf


--------------------------------------------------------------------------------
/images/netherlands-escience-center-logo-RGB.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NLeSC/guide/HEAD/images/netherlands-escience-center-logo-RGB.png


--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
 1 | # files for JetBrains editors:
 2 | **/*.iml
 3 | .idea
 4 | 
 5 | # VS Code
 6 | .vscode
 7 | 
 8 | # Mac OS
 9 | .DS_Store
10 | 
11 | 
12 | 


--------------------------------------------------------------------------------
/.pre-commit-config.yaml:
--------------------------------------------------------------------------------
1 | repos:
2 |   - repo: https://github.com/rbubley/mirrors-prettier
3 |     rev: v3.7.4
4 |     hooks:
5 |       - id: prettier
6 | 


--------------------------------------------------------------------------------
/lychee.toml:
--------------------------------------------------------------------------------
1 | # Lychee configuration file
2 | # See https://github.com/lycheeverse/lychee/blob/master/lychee.example.toml
3 | exclude_all_private = true
4 | include_mail = false
5 | no_progress = true
6 | verbose = "info"
7 | 


--------------------------------------------------------------------------------
/styles.css:
--------------------------------------------------------------------------------
 1 | /* General theme*/
 2 | body {
 3 |   --theme-color: #009fe3;
 4 | }
 5 | 
 6 | /* Sidebar element order */
 7 | .sidebar {
 8 |   display: flex;
 9 |   flex-direction: column;
10 | }
11 | .sidebar .app-name {
12 |   order: 1;
13 |   margin: 10px 10px 0 10px;
14 | }
15 | .sidebar .search {
16 |   order: 2;
17 | }
18 | .sidebar .sidebar-nav {
19 |   order: 3;
20 | }
21 | 


--------------------------------------------------------------------------------
/.github/dependabot.yml:
--------------------------------------------------------------------------------
 1 | # Please see the documentation for all configuration options:
 2 | # https://docs.github.com/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file
 3 | 
 4 | version: 2
 5 | updates:
 6 |   # Set update schedule for GitHub Actions
 7 |   - package-ecosystem: "github-actions"
 8 |     directory: "/"
 9 |     schedule:
10 |       # Check for updates to GitHub Actions every week
11 |       interval: "weekly"
12 | 


--------------------------------------------------------------------------------
/privacy.md:
--------------------------------------------------------------------------------
1 | # Privacy policy
2 | 
3 | We collect anonymised user data that helps us to monitor the effectiveness of our website.
4 | No personally identifiable information is recorded and no cookies containing such information are set in your browser session.
5 | 
6 | <!-- ## Analytics opt-out -->
7 | <div id="matomo-opt-out"></div>
8 | <script src="https://matomo.research.software/index.php?module=CoreAdminHome&action=optOutJS&divId=matomo-opt-out&language=auto&showIntro=1"></script>
9 | 


--------------------------------------------------------------------------------
/technology/technology_overview.md:
--------------------------------------------------------------------------------
 1 | # Technology Guides
 2 | 
 3 | _Page maintainer: Patrick Bos_ [@egpbos](https://github.com/egpbos)
 4 | 
 5 | These chapters are based on our experiences with using specific software technologies.
 6 | 
 7 | The main audience is RSEs familiar with basic computing and programming concepts.
 8 | 
 9 | The purpose of these chapters is for someone unfamiliar with the specific technology to get a quick overview of the most important concepts, practices and tools, without going into too much detail (we provide links to further reading material for more).
10 | 


--------------------------------------------------------------------------------
/_sidebar.md:
--------------------------------------------------------------------------------
 1 | - [Introduction](/README.md)
 2 | - [Best practices](/best_practices.md)
 3 | - [Language Guides](/language_guides/languages_overview.md)
 4 |   - [Bash](/language_guides/bash.md)
 5 |   - [JavaScript and TypeScript](/language_guides/javascript.md)
 6 |   - [Python](/language_guides/python.md)
 7 |   - [R](/language_guides/r.md)
 8 |   - [C and C++](/language_guides/ccpp.md)
 9 |   - [Fortran](/language_guides/fortran.md)
10 |   - [Rust](/language_guides/rust.md)
11 | - [Technology Guides](/technology/technology_overview.md)
12 |   - [GPU programming](/technology/gpu.md)
13 |   - [UX - User Experience](/technology/user_experience.md)
14 |   - [Datasets](/technology/datasets.md)
15 | - [Contributing to this Guide](/CONTRIBUTING.md)
16 | - [Privacy](/privacy.md)
17 | 


--------------------------------------------------------------------------------
/.github/PULL_REQUEST_TEMPLATE:
--------------------------------------------------------------------------------
 1 | # Changes in this PR
 2 | <!--
 3 | Give a brief description of the PR, mainly: why & what.
 4 | Link relevant issues as well.
 5 | -->
 6 | 
 7 | 
 8 | 
 9 | # Checklist
10 | <!--
11 | Use the checklist below to make sure you followed the [CONTRIBUTING guidelines](https://guide.esciencecenter.nl/#/CONTRIBUTING).
12 | Feel free to remove what is not applicable.
13 | -->
14 | 
15 | ## SIGNIFICANT changes / additions, e.g. new chapters
16 | - [ ] I checked whether the contribution fits in [The Turing Way](https://github.com/the-turing-way/the-turing-way) before considering contributing to this Guide.
17 | - [ ] I discussed my contribution in an issue and took into account feedback.
18 | 
19 | ## ALL contributions
20 | - [ ] I previewed my changes locally using e.g. `python3 -m http.server 4000` and confirmed they work correctly.
21 | - [ ] I checked for broken links, e.g. using the link checker GitHub Action workflow, or locally by using ``docker run --init -it -v `pwd`:/docs lycheeverse/lychee /docs --config=docs/lychee.toml``, at least for the files I changed.
22 | - [ ] My name was added to the `CITATION.cff` file.
23 | 


--------------------------------------------------------------------------------
/.github/workflows/link-checker.yml:
--------------------------------------------------------------------------------
 1 | name: Link Checker
 2 | on:
 3 |   workflow_dispatch:
 4 |   push:
 5 |     branches:
 6 |       - main
 7 |   schedule:
 8 |     - cron: "0 4 * * *"
 9 | jobs:
10 |   linkChecker:
11 |     runs-on: ubuntu-latest
12 |     steps:
13 |       - uses: actions/checkout@v6
14 |       - name: Link Checker
15 |         uses: lycheeverse/lychee-action@v2
16 |         id: lychee
17 |         with:
18 |           # note: args has a long default value; when you override it, make sure you don't accidentally forget to include the default options you want! see https://github.com/lycheeverse/lychee-action/blob/master/action.yml
19 |           args: --verbose --no-progress './**/*.md' './**/*.html' './**/*.rst' --accept '100..=103,200..=299, 429' --exclude nlesc.sharepoint.com --exclude support.posit.co --exclude www.intel.com --exclude reddit.com --exclude jsfiddle.net
20 |         env:
21 |           # This token is included to avoid github.com requests to error out with status 429 (too many requests). It only works for GitHub requests (also other GitHub REST API calls), not for the rest of the web.
22 |           GITHUB_TOKEN: ${{secrets.TOKEN_GITHUB}}
23 | 


--------------------------------------------------------------------------------
/.github/workflows/link-checker-pr.yml:
--------------------------------------------------------------------------------
 1 | name: Link Checker for Pull requests
 2 | on: pull_request
 3 | jobs:
 4 |   changedFiles:
 5 |     runs-on: ubuntu-latest
 6 |     outputs:
 7 |       files: ${{ steps.changed-markdown-files.outputs.all_changed_files }}
 8 |     steps:
 9 |       - uses: actions/checkout@v6
10 |       - name: Get changed markdown files
11 |         id: changed-markdown-files
12 |         uses: tj-actions/changed-files@v47
13 |         with:
14 |           # Avoid using single or double quotes for multiline patterns
15 |           files: |
16 |             **.md
17 |           matrix: true
18 | 
19 |   linkChecker:
20 |     runs-on: ubuntu-latest
21 |     needs: changedFiles
22 |     if: ${{ needs.changedFiles.outputs.files != '' && toJSON(fromJSON(needs.changedFiles.outputs.files)) != '[]' }}
23 |     strategy:
24 |       matrix:
25 |         file: ${{ fromJSON(needs.changedFiles.outputs.files) }}
26 |       fail-fast: false
27 |     steps:
28 |       - uses: actions/checkout@v6
29 |         with:
30 |           fetch-depth: 2
31 |       - name: download Lychee
32 |         run: |
33 |           wget https://github.com/lycheeverse/lychee/releases/download/lychee-v0.18.1/lychee-x86_64-unknown-linux-gnu.tar.gz
34 |           tar xzf lychee-x86_64-unknown-linux-gnu.tar.gz
35 |       - name: Check all this file's additions for broken links
36 |         run: |
37 |           export base_sha=$(git rev-parse ${{ github.sha }}^)
38 |           git diff -U0 ${base_sha} ${{ github.event.pull_request.head.sha }} -- ${{ matrix.file }} | grep -v "+++" | grep "^+" | cut -c 2- | ./lychee --exclude nlesc.sharepoint.com --exclude support.posit.co --exclude www.intel.com --exclude reddit.com --exclude jsfiddle.net -
39 | 
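The last step of this workflow passes only the lines added by the pull request to lychee. As a rough illustration of what the `grep -v "+++" | grep "^+" | cut -c 2-` part of that pipeline does, here is a minimal Python sketch. The function name `added_lines` and the sample diff are hypothetical and are not part of the workflow.

```python
def added_lines(diff_text: str) -> list[str]:
    """Mimic: grep -v "+++" | grep "^+" | cut -c 2-"""
    result = []
    for line in diff_text.splitlines():
        # grep -v "+++": drop the "+++ b/file" header (and any other line containing "+++")
        if "+++" in line:
            continue
        # grep "^+": keep only lines that were added
        if line.startswith("+"):
            # cut -c 2-: strip the leading "+" marker
            result.append(line[1:])
    return result


# Hypothetical output of `git diff -U0` for one Markdown file
sample_diff = """\
diff --git a/README.md b/README.md
--- a/README.md
+++ b/README.md
@@ -10,0 +11,2 @@
+See [the docs](https://example.org/docs).
+More text.
@@ -20 +22 @@
-Old [link](https://example.org/old).
+New [link](https://example.org/new).
"""

for line in added_lines(sample_diff):
    print(line)
```

Only the three added lines are printed, so links on removed or unchanged lines are not checked.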


--------------------------------------------------------------------------------
/language_guides/languages_overview.md:
--------------------------------------------------------------------------------
 1 | # Language Guides
 2 | 
 3 | _Page maintainer: Patrick Bos_ [@egpbos](https://github.com/egpbos)
 4 | 
 5 | This chapter provides practical info on each of the main programming languages of the Netherlands eScience Center.
 6 | 
 7 | This info is intentionally high level, tries to provide "default" options, and mostly links to more info.
 8 | 
 9 | Each chapter should contain:
10 | 
11 | - Intro: philosophy, typical use cases.
12 | - Recommended sources of information
13 | - Installing compilers and runtimes
14 | - Editors and IDEs
15 | - Coding style conventions
16 | - Building and packaging code
17 | - Testing
18 | - Code quality analysis tools and services
19 | - Debugging and Profiling
20 | - Logging
21 | - Writing documentation
22 | - Recommended additional packages and libraries
23 | - Available templates
24 | 
25 | ## Preferred Languages
26 | 
27 | At the Netherlands eScience Center we prefer Java and Python over C++ and Perl, as these languages in general produce more sustainable code. It is not always possible to choose which libraries we use, as almost all projects have existing code as a starting point.
28 | 
29 | (In alphabetical order)
30 | 
31 | - Java
32 | - JavaScript (preferably TypeScript)
33 | - Python
34 | - OpenCL and CUDA
35 | - R
36 | 
37 | ## Selecting tools and libraries
38 | 
39 | On GitHub there is a concept of an "awesome list", that collects awesome libraries and tools on some topic. For instance, here is one for Python: https://github.com/vinta/awesome-python
40 | 
41 | Now, someone has been smart enough to see the pattern, and has created an awesome list of awesome lists: https://awesome.re/
42 | 
43 | Highly recommended to get some inspiration on available tools and libraries!
44 | 
45 | ## Development Services
46 | 
47 | To do development in any language you first need infrastructure (code hosting, CI, etc.). Luckily a lot is available for free now.
48 | 
49 | See this list: https://github.com/ripienaar/free-for-dev
50 | 


--------------------------------------------------------------------------------
/.github/workflows/upload-pdf.yml:
--------------------------------------------------------------------------------
 1 | # Generates a PDF for the full guide and uploads it to Zenodo
 2 | ## This action triggers when there is a new release of the guide
 3 | ## Triggering this action manually uploads a draft to the Zenodo Sandbox instead
 4 | name: Generate PDF and upload to Zenodo
 5 | on:
 6 |   # Trigger manually via the Actions tab
 7 |   workflow_dispatch:
 8 |   # Trigger when you publish a release via GitHub's release page
 9 |   release:
10 |     types:
11 |       - published
12 | 
13 | jobs:
14 |   publish:
15 |     runs-on: ubuntu-latest
16 |     steps:
17 |       - name: Checkout the contents of your repository
18 |         uses: actions/checkout@v6
19 | 
20 |       - name: Change absolute paths to relative
21 |         run: perl -pi -e 's@\]\(\/@\]\(@' _sidebar.md
22 | 
23 |       - name: Pull Docker image
24 |         run: docker pull ghcr.io/kernoeb/docker-docsify-pdf:main
25 | 
26 |       - name: Generate PDF using the Docker image
27 |         run: |
28 |           docker run --rm --privileged \
29 |             -v "${{ github.workspace }}/":/home/node/docs:rw \
30 |             -v "${{ github.workspace }}/":/home/node/pdf:rw \
31 |             -v "${{ github.workspace }}/images/pdf-cover.pdf":/home/node/resources/cover.pdf:rw \
32 |             --user $(id -u):$(id -g) \
33 |             -e "PDF_OUTPUT_NAME=guide-nlesc.pdf" \
34 |             -e "NO_SANDBOX=true" \
35 |             ghcr.io/kernoeb/docker-docsify-pdf:main
36 | 
37 |       - name: Generate .zenodo.json from CITATION.cff
38 |         uses: citation-file-format/cffconvert-github-action@2.0.0
39 |         with:
40 |           args: "--format zenodo --outfile .zenodo.json"
41 | 
42 |       - name: Create a draft snapshot on Zenodo Sandbox
43 |         if: github.event_name == 'workflow_dispatch'
44 |         env:
45 |           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
46 |           ZENODO_SANDBOX_ACCESS_TOKEN: ${{ secrets.ZENODO_SANDBOX_ACCESS_TOKEN }}
47 |         uses: zenodraft/action@0.13.3
48 |         with:
49 |           concept: 277497 # doesn't matter which it is, it is only for testing
50 |           publish: false
51 |           sandbox: true
52 |           filenames: guide-nlesc.pdf
53 |           metadata: .zenodo.json
54 | 
55 |       - name: Create a new draft snapshot in the Zenodo record
56 |         if: github.event_name == 'release'
57 |         env:
58 |           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
59 |           ZENODO_ACCESS_TOKEN: ${{ secrets.ZENODO_ACCESS_TOKEN }}
60 |         uses: zenodraft/action@0.13.3
61 |         with:
62 |           concept: 4020564
63 |           publish: false # let the user press the publish button manually
64 |           sandbox: false
65 |           filenames: guide-nlesc.pdf
66 |           metadata: .zenodo.json
67 | 
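The "Change absolute paths to relative" step rewrites the links in `_sidebar.md` before the PDF is generated. As a hedged illustration of what the Perl substitution `s@\]\(\/@\]\(@` does, here is a minimal Python sketch. The function name `make_links_relative` is hypothetical, and the sample text is an excerpt of the sidebar plus one made-up external link.

```python
import re


def make_links_relative(markdown: str) -> str:
    """Replace the first "](/" on each line with "](", like perl -p with s@\\]\\(\\/@\\]\\(@."""
    lines = markdown.split("\n")
    # count=1 mirrors the Perl substitution, which has no /g flag
    return "\n".join(re.sub(r"\]\(/", "](", line, count=1) for line in lines)


sidebar = (
    "- [Introduction](/README.md)\n"
    "  - [Bash](/language_guides/bash.md)\n"
    "- [External](https://example.org/page)\n"
)

print(make_links_relative(sidebar))
```

Links that start with `/` lose the leading slash, while the external link is left untouched because it does not match `](/`.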


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.4020564.svg)](https://doi.org/10.5281/zenodo.4020564)[![Link Checker](https://github.com/NLeSC/guide/actions/workflows/link-checker.yml/badge.svg)](https://github.com/NLeSC/guide/actions/workflows/link-checker.yml)
 2 | 
 3 | # Guide
 4 | 
 5 | This is a guide to research software development at the Netherlands eScience Center.
 6 | It is a living document, written by and for our research software engineers (RSEs) and our collaborators.
 7 | 
 8 | We write it for two reasons:
 9 | 
10 | 1. To have a trusted source for quickly getting started on selected software development topics.
11 |    We hope this will help RSEs (including our future selves!) to get off to a flying start on new projects in software/technological areas they are not yet familiar with.
12 | 2. To discuss and reach consensus on such topics/areas.
13 |    This in itself is valuable experience!
14 |    Discussing your practices can be confronting and a bit uncomfortable, but often teaches you new tricks and points of view.
15 | 
16 | Openness and collaboration are at the heart of the eScience Center, which is why we develop and share these guidelines in the open.
17 | [Join us!](#contributing)
18 | 
19 | ## Contents
20 | 
21 | To get started, check out the checklist of generic research software engineering advice
22 | in the [Best Practices](/best_practices.md) chapter.
23 | This chapter lists the most important overall attention points while developing research software.
24 | For more details, the sections refer to selected resources in community guides that we collaborate with.
25 | 
26 | If you are looking for more in-depth advice on using a specific programming language, have a look at the [language guides](/language_guides/languages_overview.md).
27 | Here we catalogue our experiences with the languages we use the most in our research software development projects.
28 | We also provide [technology guides](/technology/technology_overview.md) on digital technologies we use often in our projects with research partners.
29 | 
30 | ## Resources
31 | 
32 | All of the text in this guide is backed by our own experiences in developing high-quality research software.
33 | However, we also learn from and share knowledge with other community-driven research software guides.
34 | The two most important of these are [The Turing Way](https://book.the-turing-way.org/index.html) and the
35 | [Research Software Quality Kit](http://everse.software/RSQKit/).
36 | Their scope is slightly different, but we collaborate with them when we can.
37 | 
38 | ## Contributing
39 | 
40 | Please consider contributing to this book!
41 | It is a great way to make long-lasting impact by sharing your time-tested knowledge and expertise.
42 | You'll hone your writing skills while you're at it.
43 | 
44 | See the [Contributing to this Guide](/CONTRIBUTING.md) chapter if you want to know more about how you can help, or ask one of the editors.
45 | Currently the editorial team consists of:
46 | 
47 | - Bouwe Andela [@bouweandela](https://github.com/bouweandela) (research software engineer)
48 | - Carlos Martínez Ortiz [@c-martinez](https://github.com/c-martinez) (community manager)
49 | - Patrick Bos [@egpbos](https://github.com/egpbos) (technology lead)
50 | 


--------------------------------------------------------------------------------
/language_guides/fortran.md:
--------------------------------------------------------------------------------
 1 | # Fortran
 2 | 
 3 | _Page maintainer: Gijs van den Oord_ [@goord](https://github.com/goord)
 4 | 
 5 | **Disclaimer: In general the Netherlands eScience Center does not recommend using Fortran. However, in some cases it is the only viable option, for instance if a project builds upon existing code written in this language. This section will be restricted to Fortran90, which captures the majority of Fortran source code.**
 6 | 
 7 | A second use case may be extremely performance-critical dense
 8 | numerical compute workloads, with no existing alternative. In this case it is recommended to keep the Fortran part of the application minimal, using a high-level language like Python for program control flow, IO, and user interface.
 9 | 
10 | ## Recommended sources of information
11 | 
12 | - [Fortran90 best practices](https://github.com/certik/fortran90.org/blob/master/src/best-practices.rst).
13 | - [Fortran wiki](http://fortranwiki.org/fortran/show/HomePage)
14 | - [Fortran90 handbook](http://micro.ustc.edu.cn/Fortran/Fortran%2090%20Handbook.pdf)
15 | 
16 | ## Compilers
17 | 
18 | - **gfortran**: the official GNU Fortran compiler and part of the gcc compiler suite.
19 | - **ifort**: the Intel Fortran compiler, widely used in academia and industry because of its superior performance, but
20 |   unfortunately this is commercial software so not recommended. The same holds for the Portland compiler **pgfortran**
21 | 
22 | ## Debuggers and diagnostic tools
23 | 
24 | There exist many commercial performance profiling tools by Intel and the Portland Group which we shall not discuss here. The most important freely available alternatives are:
25 | 
26 | - **gdb**: the GNU debugger, part of the gcc compiler suite. Use the **-g** option to compile with debugging symbols.
27 | - **gprof**: the GNU profiler. Use the **-pg** option to compile with profiling enabled.
28 | - **valgrind**: to detect memory leaks.
29 | 
30 | ## Editors and IDEs
31 | 
32 | Most lightweight editors provide Fortran syntax highlighting. Vim and Emacs are the most widely used, but for code
33 | completion and refactoring tools one might consider the [CBFortran](http://cbfortran.sourceforge.net/) distribution of Code::Blocks.
34 | 
35 | ## Coding style conventions
36 | 
37 | If working on an existing code base, adopt the existing conventions. Otherwise we recommend the
38 | standard conventions, described in the [official documentation](https://github.com/certik/fortran90.org/blob/master/src/best-practices.rst#fortran-style-guide) and the [Fortran company style guide](http://www.fortran.com/). We would like to add the following advice:
39 | 
40 | - Use free-form text input style (the default), with a maximal line width well below the 132 characters imposed by the Fortran90 standard.
41 | - When a method does not need to alter any data in any module and returns a single value, use a function for it, otherwise use a subroutine. Minimize the latter to reasonable extent.
42 | - Use the intent attributes in subroutine variable declarations as it makes the code much easier to understand.
43 | - Use a performance-driven approach to the architecture, do not use the object-oriented features of Fortran90 if they slow down execution. Encapsulation by modules is perfectly acceptable.
44 | - Add concise comments to modules and routines, and add comments to less obvious lines of code.
45 | - Provide a test suite with your code, containing both unit and integration tests. Both automake and cmake provide test
46 |   suite functionality; if you create your makefile yourself, add a separate testing target.
47 | 


--------------------------------------------------------------------------------
/index.html:
--------------------------------------------------------------------------------
 1 | <!doctype html>
 2 | <html lang="en">
 3 |   <head>
 4 |     <meta charset="UTF-8" />
 5 |     <title>Netherlands eScience Center Guide</title>
 6 |     <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1" />
 7 |     <meta name="description" content="Description" />
 8 |     <meta
 9 |       name="viewport"
10 |       content="width=device-width, user-scalable=no, initial-scale=1.0, maximum-scale=1.0, minimum-scale=1.0"
11 |     />
12 |     <link
13 |       rel="stylesheet"
14 |       href="https://cdn.jsdelivr.net/npm/docsify/themes/vue.css"
15 |     />
16 |     <link rel="stylesheet" href="/styles.css" />
17 |     <link rel="icon" type="image/png" href="images/favicon.png" />
18 |   </head>
19 | 
20 |   <body>
21 |     <div id="app"></div>
22 |     <script>
23 |       window.$docsify = {
24 |         name: "Netherlands eScience Center Guide",
25 |         repo: "NLeSC/guide",
26 |         loadSidebar: true,
27 |         logo: "/images/netherlands-escience-center-logo-RGB.png",
28 |         alias: {
29 |           "/.*/_sidebar.md": "/_sidebar.md",
30 |         },
31 |         relativePath: true,
32 |         matomo: {
33 |           host: "//matomo.research.software",
34 |           id: 5,
35 |           disableCookies: true,
36 |         },
37 |       };
38 |     </script>
39 |     <script src="https://cdn.jsdelivr.net/npm/docsify/lib/docsify.min.js"></script>
40 |     <script src="https://cdn.jsdelivr.net/npm/docsify/lib/plugins/external-script.min.js"></script>
41 |     <script src="https://cdn.jsdelivr.net/npm/docsify/lib/plugins/search.min.js"></script>
42 |     <script src="https://cdn.jsdelivr.net/npm/prismjs/components/prism-bash.min.js"></script>
43 |     <script src="https://cdn.jsdelivr.net/npm/prismjs/components/prism-rust.min.js"></script>
44 |     <!-- custom version of Matomo plugin, modified from https://github.com/docsifyjs/docsify/blob/develop/src/plugins/matomo.js -->
45 |     <script>
46 |       function appendScript(options) {
47 |         const script = document.createElement("script");
48 |         script.async = true;
49 |         script.src = options.host + "/matomo.js";
50 |         document.body.appendChild(script);
51 |       }
52 | 
53 |       function init(options) {
54 |         window._paq = window._paq || [];
55 |         if (options.disableCookies) {
56 |           window._paq.push(["disableCookies"]); // Call disableCookies before calling trackPageView
57 |         }
58 |         // window._paq.push(["trackPageView"]);
59 |         window._paq.push(["enableLinkTracking"]);
60 |         setTimeout(() => {
61 |           appendScript(options);
62 |           window._paq.push(["setTrackerUrl", options.host + "/matomo.php"]);
63 |           window._paq.push(["setSiteId", String(options.id)]);
64 |         }, 0);
65 |       }
66 | 
67 |       function collect() {
68 |         if (!window._paq) {
69 |           init($docsify.matomo);
70 |         }
71 | 
72 |         window._paq.push(["setCustomUrl", "/" + window.location.hash]);
73 |         window._paq.push(["setDocumentTitle", document.title]);
74 |         window._paq.push(["trackPageView"]);
75 |       }
76 | 
77 |       const install = function (hook) {
78 |         if (!$docsify.matomo) {
79 |           // eslint-disable-next-line no-console
80 |           console.error("[Docsify] matomo is required.");
81 |           return;
82 |         }
83 | 
84 |         hook.doneEach(collect);
85 |       };
86 | 
87 |       // only load Matomo if there's no opt-out cookie:
88 |       if (document.cookie.indexOf("mtm_consent_removed=") == -1) {
89 |         window.$docsify = window.$docsify || {};
90 |         $docsify.plugins = [install, ...($docsify.plugins || [])];
91 |       }
92 |     </script>
93 |   </body>
94 | </html>
95 | 


--------------------------------------------------------------------------------
/technology/user_experience.md:
--------------------------------------------------------------------------------
 1 | # User Experience (UX)
 2 | 
 3 | _Page maintainer: Jesus Garcia_ [@ctwhome](https://github.com/ctwhome)
 4 | 
 5 | User Experience Design (UX) is a broad, holistic science that combines many cognitive and brain sciences disciplines like psychology and sociology, content strategies, and arts and aesthetics by following human-centered approaches.
 6 | 
 7 | > Human-centred design is an approach to interactive systems development that aims to make systems usable and useful by focusing on the users, their needs and requirements, and applying human factors/ergonomics and usability knowledge and techniques. This approach enhances effectiveness and efficiency, improves human well-being, user satisfaction, accessibility, sustainability, and counteracts possible adverse effects on human health, safety, and performance. [Wikipedia](https://en.wikipedia.org/wiki/Human-centered_design)
 8 | 
 9 | ## Table of contents
10 | 
11 | - UX disciplines
12 | - Design thinking process
13 | - Designing software
14 | - Tools and Resources
15 | 
16 | ### UX disciplines
17 | 
18 | The principles and indications taught by [interaction-design.org](https://www.interaction-design.org/literature) can be useful in the process of creating research software.
19 | 
20 | The main UX disciplines are:
21 | 
22 | 1.  **User research**: understanding the people who use a product or system through observations.
23 | 2.  **Information architecture**: identifying and organizing information within a system in a purposeful and meaningful way.
24 | 3.  **Interaction design**: designing a product or system's interactive behaviors with a specific focus on their use.
25 | 4.  **Usability evaluation**: measuring the quality of a user's experience when interacting with a product or system.
26 | 5.  **Accessibility evaluation:** measuring the quality of a product or system to be accessed irrespective of personal abilities and device properties.
27 | 6.  **Visual design**: designing the visual attributes of a product or system in an aesthetically pleasing way.
28 | 
29 | The well-known UX umbrella diagram represents the different disciplines of UX:
30 | 
31 | <img src="https://user-images.githubusercontent.com/4195550/100587866-681a6700-32f1-11eb-87d2-c40616d45c4f.png" width="800" />
32 | 
33 | _Author/Copyright holder: J.G. Gonzalez and The Netherlands eScience Center. Copyright: Apache License 2.0_
34 | 
35 | ### Design Thinking
36 | 
37 | Design thinking is an approach, mindset, or ideology for product development. According to the [IxF (Interaction Design Foundation)](https://interaction-design.org), design thinking achieves all these advantages at the same time:
38 | 
39 | - It is a user-centered process that starts with user data, creates design artifacts that address real and not imaginary user needs, and then tests those artifacts with real users.
40 | - It leverages the collective expertise and establishes a shared language and buy-in amongst your team.
41 | - It encourages innovation by exploring multiple avenues for the same problem.
42 | 
43 | <img src="https://user-images.githubusercontent.com/4195550/99543973-13532400-29b4-11eb-9179-f74db459dfbe.png" width="700"/>
44 | 
45 | _Author/Copyright holder: Teo Yu Siang and Interaction Design Foundation. Copyright licence: CC BY-NC-SA 3.0_
46 | 
47 | You can find more information about Design Thinking on the [IxF page](https://www.interaction-design.org/literature/topics/design-thinking).
48 | 
49 | ### Designing software
50 | 
51 | Heuristics, commonly known as 'rules of thumb,' play a significant role when users interact with software. The Nielsen/Norman group has a list of [10 Usability Heuristics for User Interface Design](https://www.nngroup.com/articles/ten-usability-heuristics/) to consider when developing software.
52 | 
53 | #### Designing Lovable software
54 | 
55 | When delivering software iteratively, a common approach is to define a Minimum Viable Product (MVP) that contains the minimum requirements. What is often forgotten in this approach is delivering software that attracts and engages its users. When developing research software, researchers should present the new and innovative outcomes in a way that feels comfortable and easy to use from the very beginning, eliminating any cognitive burden that the software's interaction may include.
56 | 
57 |  <img src="https://user-images.githubusercontent.com/4195550/99543638-ad669c80-29b3-11eb-92c6-1754fa9c837c.png" width="800" />
58 | 
59 | _Author/Copyright holder: J.G. Gonzalez and The Netherlands eScience Center. Copyright: Apache License 2.0_
60 | 
61 | While an MVP (Minimum Viable Product) focuses on providing users with a way to explore the product and understand its main intent, the MLP (Minimum Lovable Product) approach focuses on essential features instead of the bare minimum expected from a class of software. Going beyond bare functionality, attention is directed towards a great user experience. The outcome must contain all elements in the pyramid: **functional, reliable, usable, and pleasurable.**
62 | 
63 | ### Tools and resources
64 | 
65 | The design tools below support visual design, prototyping, and IxD testing. They are collaborative, real-time, online, and multiplatform.
66 | 
67 | - [Figma](https://www.figma.com/)
68 | - [Miro](https://miro.com/)
69 | - [Whimsical](https://whimsical.com/)
70 | 


--------------------------------------------------------------------------------
/language_guides/rust.md:
--------------------------------------------------------------------------------
  1 | # Rust
  2 | 
  3 | _Page maintainer: [Rodrigo V. Honorato](https://github.com/rvhonorato)_
  4 | 
  5 | Rust is a modern programming language designed to provide high
  6 | performance while enforcing memory safety through its unique ownership system
  7 | and borrow checker. Originally developed at Mozilla, with its first stable release in 2015,
  8 | Rust has rapidly gained popularity for its ability to prevent common
  9 | programming errors at compile time. It is commonly categorized as a systems
 10 | programming language but over the last few years its ecosystem has grown
 11 | considerably and Rust is being adopted as a general programming language.
 12 | 
 13 | Rust is increasingly adopted in **research software** for its unique blend of
 14 | speed, safety, and modern tooling. It powers everything from
 15 | high-throughput DNA sequencing pipelines to climate simulations, where even
 16 | minor memory errors could invalidate results. By eliminating entire classes
 17 | of bugs (e.g., null pointer dereferences, data races, type mismatches), Rust lets
 18 | researchers focus on science, not on debugging.
 19 | 
 20 | It is however a **low-level** language, which gives you direct control over
 21 | hardware and memory (like [C/C++](./ccpp.md)). For comparison, [Python](./python.md)
 22 | is a **high-level** language that prioritizes readability by abstracting these
 23 | details - in Python you don't ever need to think about allocating or freeing
 24 | memory as the interpreter takes care of it, making the code slower but much
 25 | easier to program. In a **low-level** language you need to manage it yourself.
 26 | Because Rust runs "closer to the metal", it achieves performance
 27 | similar to [C/C++](./ccpp.md) while avoiding common memory-safety and
 28 | concurrency bugs.
 29 | 
 30 | Here are some of Rust's key characteristics:
 31 | 
 32 | - **Memory Safety**: Rust's unique ownership system guarantees memory safety at compile
 33 |   time, eliminating crashes from null pointers, dangling references, or use-after-free errors.
 34 | 
 35 | - **Type Safety**: Strict compile-time checks catch mismatched types and invalid
 36 |   operations before the program runs, so there are far fewer surprises at runtime.
 37 | 
 38 | - **Zero-Cost Abstractions**: High-level syntax (e.g., iterators, traits) compiles
 39 |   to machine code as efficiently as hand-written low-level code.
 40 | 
 41 | - **Fearless Concurrency**: Compile-time rules prevent data races, letting you
 42 |   write parallel code with confidence.
 43 | 
 44 | - **Expressive Enums & Pattern Matching**: Enums can hold data, and `match`
 45 |   ensures all cases are handled, so no variant is forgotten.
 46 | 
 47 | - **Traits for Polymorphism**: Define shared behavior across types without
 48 |   runtime overhead.
 49 | 
 50 | - **Rich Ecosystem**: Tools like [Cargo](https://doc.rust-lang.org/cargo/)
 51 |   (package manager), [Clippy](https://doc.rust-lang.org/stable/clippy/usage.html)
 52 |   (linting), [crates.io](https://crates.io) (libraries)
 53 |   and [rustdoc](https://doc.rust-lang.org/stable/rustdoc/) (documentation)
 54 |   streamline development.
 55 | 
 56 | ```rust
 57 | // Ownership in action: the compiler tracks who "owns" data.
 58 | fn main() {
 59 |     // Let's declare a string; here `s` owns it
 60 |     let s = String::from("hello");
 61 | 
 62 |     // Borrow `s` as a read-only reference (no ownership transfer)
 63 |     let len = calculate_length(&s);
 64 | 
 65 |     // `s` still owns the data and we can use it
 66 |     println!("'{}' has length {}", s, len);
 67 | }
 68 | 
 69 | fn calculate_length(s: &str) -> usize {
 70 |     s.len()
 71 | }
 72 | ```
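
The enum, pattern-matching, and trait characteristics listed above can be sketched in a few lines. This is a minimal illustration only: the `Shape` enum, the `Area` trait, and the `total_area` function are hypothetical names invented for this example.

```rust
use std::f64::consts::PI;

// An enum whose variants carry their own data.
enum Shape {
    Circle { radius: f64 },
    Rectangle { width: f64, height: f64 },
}

// A trait describes behaviour that several types can share.
trait Area {
    fn area(&self) -> f64;
}

impl Area for Shape {
    fn area(&self) -> f64 {
        // `match` must cover every variant; leaving one out is a compile error.
        match self {
            Shape::Circle { radius } => PI * radius * radius,
            Shape::Rectangle { width, height } => width * height,
        }
    }
}

// Generic over any type implementing `Area`; the call is resolved at compile time.
fn total_area<T: Area>(items: &[T]) -> f64 {
    items.iter().map(|item| item.area()).sum()
}

fn main() {
    let shapes = [
        Shape::Circle { radius: 1.0 },
        Shape::Rectangle { width: 2.0, height: 3.0 },
    ];
    println!("total area: {:.2}", total_area(&shapes));
}
```

If a new variant such as `Triangle` were added to `Shape`, the compiler would reject the `match` until the new case is handled.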
 73 | 
 74 | ## Getting started
 75 | 
 76 | To get started, first install Rust. This can be done via [`rustup`](https://rustup.rs),
 77 | a command-line tool for managing Rust versions and tools.
 78 | 
 79 | On Linux/macOS:
 80 | 
 81 | ```bash
 82 | curl --proto '=https' --tlsv1.2 https://sh.rustup.rs -sSf | sh
 83 | ```
 84 | 
 85 | On Windows, [see the instructions here](https://forge.rust-lang.org/infra/other-installation-methods.html#other-ways-to-install-rustup).
 86 | 
 87 | Cargo is Rust's build system and package manager and is installed by `rustup`.
 88 | You can use it to create a project:
 89 | 
 90 | ```bash
 91 | cargo new rust_project
 92 | ```
 93 | 
 94 | This will create the project folder structure, add a `Cargo.toml` and a `src/main.rs`
 95 | which contains a placeholder "Hello world", so you can already build this
 96 | `rust_project`:
 97 | 
 98 | ```bash
 99 | cd rust_project
100 | cargo build --release # using --release will build the optimized binary
101 | ./target/release/rust_project # execute the binary
102 | ```
103 | 
104 | ## Learning
105 | 
106 | Rust's unique approach to memory management (ownership, borrowing and lifetimes) and
107 | the strict compiler can feel daunting at first - especially if you are accustomed
108 | to high-level languages like [Python](./python.md) or [JavaScript](./javascript.md).
109 | Learning Rust can be challenging as some new concepts, such as the borrow checker,
110 | may take time to be internalized.
111 | 
112 | > Keep in mind that in the long run all the effort pays off. The code produced
113 | > will be faster while having _fewer bugs_ (thanks to the opinionated compiler),
114 | > and you will learn _transferable skills_ that will make you a better programmer
115 | > in other languages. The general mindset should be **start small and embrace
116 | > the compiler**.
117 | 
118 | To learn it, you only need:
119 | 
120 | - [The Rust Book](https://doc.rust-lang.org/book/): This
121 |   is the official book and it is very well written and easy to follow. It contains
122 |   all the information you need to gain a deep understanding of Rust. It contains
123 |   a fully guided tutorial on how to write a Guessing game as your first project.
124 | - [Rust by Example](https://doc.rust-lang.org/rust-by-example/): This contains
125 |   smaller examples of how to use the language, and it is a good complement to
126 |   the book or when you need to quickly look up how to do something.
127 | - [Rustlings](https://rustlings.cool): Fully interactive exercises
128 |   that will help you get used to the syntax and the concepts of the language -
129 |   it is paired with the book, so you should be doing the exercises as you go
130 |   through the book.
131 | - [Rust Playground](https://play.rust-lang.org/): Lets you experiment with Rust
132 |   online in your browser
133 | 
134 | 🦀
135 | 


--------------------------------------------------------------------------------
/technology/datasets.md:
--------------------------------------------------------------------------------
 1 | # Working with tabular data
 2 | 
 3 | _Page maintainers: Suvayu Ali_ [@suvayu](https://github.com/suvayu) _, Flavio Hafner_ [@f-hafner](https://github.com/f-hafner) _and Reggie Cushing_ [@recap](https://github.com/recap)
 4 | 
 5 | There are several solutions available to you as an RSE, each with its own pros and cons. You should evaluate which one works best for your project and project partners, and pick one. Sometimes you may need to combine two different types of technologies. Here are some examples from our experience.
 6 | 
 7 | You will encounter datasets in various file formats like:
 8 | 
 9 | - CSV/Excel
10 | - Parquet
11 | - HDF5/NetCDF
12 | - JSON/JSON-LD
13 | 
14 | You may also encounter local database files like SQLite. It is important to note the trade-offs between these formats. For instance, random seeks are difficult in a large dataset stored in a non-binary format like CSV, Excel, or JSON. In such cases you should consider formats like Parquet or HDF5/NetCDF. Non-binary files can also be imported into local databases like SQLite or DuckDB. Below we compare some options to work with datasets in these formats.
15 | 
16 | It's also good to know about [Apache Arrow](https://arrow.apache.org), which is not itself a file format, but a specification for a memory layout of (binary) data.
17 | There is an ecosystem of libraries for all major languages to handle data in this format.
18 | It is used as the back-end of [many data handling projects](https://arrow.apache.org/powered_by/), including several mentioned in this chapter.
19 | 
20 | ## Local database
21 | 
22 | When you have a relational dataset, it is recommended that you use a database. Local databases like SQLite and DuckDB are easy to use because they require no setup. But they come with some limitations; for instance, multiple users cannot write to the database simultaneously.
23 | 
24 | SQLite is a transactional database, so if you have a dataset that changes over time (e.g. you are adding new rows), it is the more appropriate choice. However, in research we often work with static databases and are mostly interested in analytical tasks. In such cases, DuckDB is a more appropriate alternative. Comparing the two:
25 | 
26 | - DuckDB can also create views (virtual tables) from other sources like files or other databases, whereas with SQLite you always have to import the data before running any queries.
27 | - DuckDB is multi-threaded. This can be an advantage for large databases, where aggregation queries tend to be faster than in SQLite.
28 |   - However, if you have a really large dataset, say hundreds of millions of rows, and want to perform a deeply nested query, it would require a substantial amount of memory, making it infeasible to run on personal laptops.
29 |   - There are options to customize memory handling, and push what is possible on a single machine.
30 | 
31 |     You need to limit the memory usage to prevent the operating system or shell from preemptively killing the process. You can choose a value of about 50% of your system's RAM.
32 | 
33 |   ```sql
34 |   SET memory_limit = '5GB';
35 |   ```
36 | 
37 |   By default, DuckDB spills over to disk when memory usage grows beyond the above limit. You can verify the temporary directory by running:
38 | 
39 |   ```sql
40 |   SELECT current_setting('temp_directory') AS temp_directory;
41 |   ```
42 | 
43 |   Note that if your query is deeply nested, you need sufficient disk space for DuckDB to use; e.g. for 4 nested levels of `INNER JOIN` combined with a `GROUP BY`, we observed a disk spill-over of 30x the original dataset. However, we found this was not always reliable.
44 | 
45 |   In such borderline cases, it might be possible to address the limitation by splitting the workload into chunks and aggregating later, or by considering one of the alternatives mentioned below.
46 |   - You can also optimize the queries for DuckDB, but that requires a deeper dive into the documentation, and understanding how DuckDB query optimisation works.
47 | 
48 | - Both databases support setting (unique) indexes. Indexes are useful and sometimes necessary:
49 |   - For both DuckDB and SQLite, unique indexes help ensure data integrity.
50 |   - For SQLite, indexes are crucial to improve the performance of queries. However, having more indexes makes writing new records to the database slower. So it's again a trade-off between query and write speed.
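
The idea of splitting a workload into chunks and aggregating later can be sketched independently of any database. The example below is not DuckDB's API; it is a minimal illustration in plain Rust, assuming rows of `(group, value)` pairs and hypothetical helper names (`Partial`, `aggregate_chunk`, `merge`, `mean_by_group`). Each chunk produces a small partial aggregate (sum and count), and the partials are merged at the end to compute a per-group mean.

```rust
use std::collections::HashMap;

// Partial aggregate for one group: enough state to merge and to compute a mean.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct Partial {
    sum: f64,
    count: u64,
}

// Aggregate a single chunk of (group, value) rows on its own.
fn aggregate_chunk(rows: &[(&str, f64)]) -> HashMap<String, Partial> {
    let mut acc: HashMap<String, Partial> = HashMap::new();
    for &(key, value) in rows {
        let entry = acc.entry(key.to_string()).or_default();
        entry.sum += value;
        entry.count += 1;
    }
    acc
}

// Merge one chunk's partial result into the running total.
fn merge(total: &mut HashMap<String, Partial>, part: HashMap<String, Partial>) {
    for (key, p) in part {
        let entry = total.entry(key).or_default();
        entry.sum += p.sum;
        entry.count += p.count;
    }
}

// Process the rows chunk by chunk; `chunk_size` must be greater than zero.
fn mean_by_group(rows: &[(&str, f64)], chunk_size: usize) -> HashMap<String, f64> {
    let mut total = HashMap::new();
    for chunk in rows.chunks(chunk_size) {
        merge(&mut total, aggregate_chunk(chunk));
    }
    total
        .into_iter()
        .map(|(key, p)| (key, p.sum / p.count as f64))
        .collect()
}

fn main() {
    let rows = [("a", 1.0), ("b", 10.0), ("a", 3.0), ("b", 20.0), ("a", 5.0)];
    let means = mean_by_group(&rows, 2);
    println!("{:?}", means);
}
```

This only works directly for aggregates that can be merged from partial state, such as sums, counts, minima, and maxima. A mean has to be carried as a sum and a count until the final step, as done here.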
51 | 
52 | # Useful libraries
53 | 
54 | ## Database APIs
55 | 
56 | - [SQLAlchemy](https://www.sqlalchemy.org/)
57 |   - In Python, interfacing to SQL databases like SQLite, MySQL or PostgreSQL is often done using [SQLAlchemy](https://www.sqlalchemy.org/), which is an Object Relational Mapper (ORM) that allows you to map tables to Python classes. Note that you still need to use a lot of manual SQL outside of Python to manage the database. However, SQLAlchemy allows you to use the data in a Pythonic way once you have the database layout figured out.
58 | 
59 | ## Data processing libraries on a single machine
60 | 
61 | - Pandas
62 |   - The standard tool for working with dataframes, and widely used in analytics or machine learning workflows. Be aware of how Pandas uses memory: certain APIs create copies, while others do not. So if you are chaining multiple operations, it is preferable to use APIs that avoid copies.
63 | - Vaex
64 |   - Vaex is an alternative that focuses on out-of-core processing (larger than memory), and has some lazy evaluation capabilities.
65 | - Polars
66 |   - An alternative to Pandas (started in 2020), which is primarily written in Rust. Compared to Pandas, it is multi-threaded and does lazy evaluation with query optimisation, so it is often much more performant. However, since it is newer, documentation is not as complete. It also allows you to write your own custom extensions in Rust.
67 | - [Apache Datafusion](https://datafusion.apache.org/)
68 |   - A very fast, extensible query engine for building high-quality data-centric systems in [Rust](http://rustlang.org/), using the [Apache Arrow](https://arrow.apache.org/) in-memory format. DataFusion offers SQL and Dataframe APIs, excellent [performance](https://benchmark.clickhouse.com/), built-in support for CSV, Parquet, JSON, and Avro, extensive customization, and a great community.
69 | 
70 | ## Distributed/multi-node data processing libraries
71 | 
72 | - Dask
73 |   - `dask.dataframe` and `dask.array` provide APIs that closely mirror pandas and NumPy respectively, making it easy to switch.
74 |   - When working with multiple nodes, it requires communication across nodes (which is network bound).
75 | - Ray
76 | - Apache Spark
77 | 


--------------------------------------------------------------------------------
/technology/gpu.md:
--------------------------------------------------------------------------------
  1 | # GPU Programming Languages
  2 | 
  3 | _Page maintainer: Alessio Sclocco_ [@isazi](https://github.com/isazi)
  4 | 
  5 | ## Learning Resources
  6 | 
  7 | - Carpentries GPU Programming course
  8 |   - [Lesson material](https://carpentries-incubator.github.io/lesson-gpu-programming/)
  9 | - Introduction to CUDA C
 10 |   - [Slides](http://developer.download.nvidia.com/compute/developertrainingmaterials/presentations/cuda_language/Introduction_to_CUDA_C.pptx)
 11 |   - [Video](http://on-demand.gputechconf.com/gtc/2012/video/S0624-Monday-Introduction-to-CUDA-C.mp4)
 12 | - Introduction to OpenACC
 13 |   - [Slides](http://developer.download.nvidia.com/compute/developertrainingmaterials/presentations/openacc/Introduction_To_OpenACC.pptx)
 14 | - Introduction to HIP Programming
 15 |   - [Video](https://www.youtube.com/watch?v=3ejUwypP0bI)
 16 | - SYCL Introduction and Best Practices
 17 |   - [Video](https://www.youtube.com/watch?v=TbkrODiVDQY)
 18 | - CSCS GPU Programming with Julia
 19 |   - [Course recordings](https://github.com/omlins/julia-gpu-course)
 20 | 
 21 | ## Documentation
 22 | 
 23 | - CUDA
 24 |   - [C programming guide](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html)
 25 |   - [Runtime API](https://docs.nvidia.com/cuda/cuda-runtime-api/)
 26 |   - [Driver API](https://docs.nvidia.com/cuda/cuda-driver-api/index.html)
 27 |   - [Fortran programming guide](https://docs.nvidia.com/hpc-sdk/compilers/cuda-fortran-prog-guide/index.html)
 28 | - HIP
 29 |   - [Kernel language syntax](https://rocm.docs.amd.com/projects/HIP/en/latest/reference/kernel_language.html)
 30 |   - [Runtime API](https://rocm.docs.amd.com/projects/HIP/en/latest/reference/hip_runtime_api_reference.html)
 31 | - SYCL
 32 |   - [Specification](https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html)
 33 |   - [Reference guide](https://www.khronos.org/files/sycl/sycl-2020-reference-guide.pdf)
 34 | - OpenCL
 35 |   - [Guide](https://github.com/KhronosGroup/OpenCL-Guide)
 36 |   - [API](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_API.html)
 37 |   - [OpenCL C specification](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_C.html)
 38 |   - [Reference guide](https://www.khronos.org/files/opencl30-reference-guide.pdf)
 39 | - OpenACC
 40 |   - [Programming guide](https://www.openacc.org/sites/default/files/inline-files/OpenACC_Programming_Guide_0_0.pdf)
 41 |   - [Reference guide](https://www.openacc.org/sites/default/files/inline-files/API%20Guide%202.7.pdf)
 42 | - OpenMP
 43 |   - [Reference guide](https://www.openmp.org/wp-content/uploads/OpenMPRef-5.0-111802-web.pdf)
 44 | 
 45 | ## Overview of Libraries
 46 | 
 47 | - CUDA
 48 |   - [cuBLAS](http://docs.nvidia.com/cuda/cublas/index.html)
 49 |   - [NVBLAS](http://docs.nvidia.com/cuda/nvblas/index.html)
 50 |   - [cuFFT](http://docs.nvidia.com/cuda/cufft/index.html)
 51 |   - [cuGRAPH](https://docs.rapids.ai/api/cugraph/stable/)
 52 |   - [cuRAND](http://docs.nvidia.com/cuda/curand/index.html)
 53 |   - [cuSPARSE](http://docs.nvidia.com/cuda/cusparse/index.html)
 54 | - HIP
 55 |   - [hipBLAS](https://rocm.docs.amd.com/projects/hipBLAS/en/latest/index.html)
 56 |   - [hipFFT](https://rocm.docs.amd.com/projects/hipFFT/en/latest/index.html)
 57 |   - [hipRAND](https://rocm.docs.amd.com/projects/hipRAND/en/latest/index.html)
 58 |   - [hipSPARSE](https://rocm.docs.amd.com/projects/hipSPARSE/en/latest/index.html)
 59 | - SYCL
 60 |   - [OneAPI BLAS](https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-dpcpp/2025-0/blas-routines.html)
 61 |   - [OneAPI FFT](https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-dpcpp/2025-0/fourier-transform-functions.html)
 62 |   - [OneAPI sparse](https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-dpcpp/2025-0/sparse-blas-routines.html)
 63 |   - [OneAPI random number generators](https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-dpcpp/2025-0/random-number-generators.html)
 64 | - OpenCL
 65 |   - [CLBlast](https://github.com/CNugteren/CLBlast)
 66 |   - [clFFT](https://github.com/clMathLibraries/clFFT)
 67 | 
 68 | ## Source-to-source Translation
 69 | 
 70 | - CUDA to HIP
 71 |   - [hipify](https://github.com/ROCm/HIPIFY)
 72 | - CUDA to SYCL
 73 |   - [SYCLomatic](https://github.com/oneapi-src/SYCLomatic)
 74 | - CUDA to OpenCL
 75 |   - [cutocl](https://github.com/benvanwerkhoven/cutocl)
 76 | 
 77 | ## Foreign Function Interfaces
 78 | 
 79 | - C++
 80 |   - CUDA
 81 |     - [cudawrappers](https://github.com/nlesc-recruit/cudawrappers)
 82 |   - OpenCL
 83 |     - [CLHPP](https://github.com/KhronosGroup/OpenCL-CLHPP)
 84 | - Python
 85 |   - CUDA
 86 |     - [PyCuda](https://mathema.tician.de/software/pycuda/)
 87 |     - [CuPy](https://cupy.dev/)
 88 |     - [cuda-python](https://nvidia.github.io/cuda-python/)
 89 |   - HIP
 90 |     - [PyHIP](https://github.com/jatinx/PyHIP)
 91 |   - SYCL
 92 |     - [dpctl](https://github.com/IntelPython/dpctl)
 93 |   - OpenCL
 94 |     - [PyOpenCL](https://mathema.tician.de/software/pyopencl/)
 95 | - Julia
 96 |   - CUDA
 97 |     - [CUDA.jl](https://github.com/JuliaGPU/CUDA.jl)
 98 |   - HIP
 99 |     - [AMDGPU.jl](https://github.com/JuliaGPU/AMDGPU.jl)
100 |   - SYCL
101 |     - [oneAPI.jl](https://github.com/JuliaGPU/oneAPI.jl)
102 | - Java
103 |   - CUDA
104 |     - [JCuda](http://www.jcuda.org/)
105 |   - OpenCL
106 |     - [JOCL](http://www.jocl.org/)
107 | 
108 | ## High-Level Abstractions
109 | 
110 | - C++
111 |   - [Kokkos](https://github.com/kokkos/kokkos)
112 |   - [RAJA](https://github.com/LLNL/RAJA)
113 | - Python
114 |   - [Numba](https://numba.pydata.org/)
115 |   - [pykokkos](https://github.com/kokkos/pykokkos)
116 | 
117 | ## Debugging and Profiling Tools
118 | 
119 | - CUDA
120 |   - [Nsight Systems](https://developer.nvidia.com/nsight-systems)
121 |   - [Nsight Compute](https://developer.nvidia.com/nsight-compute)
122 |   - [CUDA-GDB](http://docs.nvidia.com/cuda/cuda-gdb/index.html)
123 |   - [compute-sanitizer](https://docs.nvidia.com/compute-sanitizer/index.html)
124 | - HIP
125 |   - [omniperf](https://github.com/AMDResearch/omniperf)
126 |   - [rocprof](https://github.com/ROCm/rocprofiler)
127 | - SYCL
128 |   - [oneprof](https://github.com/intel/pti-gpu/tree/master/tools/oneprof)
129 |   - [onetrace](https://github.com/intel/pti-gpu/tree/master/tools/onetrace)
130 | 
131 | ## Performance Optimization
132 | 
133 | - [PRACE best practice guide on modern accelerators](https://zenodo.org/records/5839488)
134 | - [CUDA best practices](https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html)
135 | - [OneAPI SYCL best practices](https://www.intel.com/content/www/us/en/docs/oneapi/programming-guide/2025-0/optimize-your-sycl-applications.html)
136 | 
137 | ## Auto-tuning
138 | 
139 | - Kernel Tuner
140 |   - [GitHub repository](https://github.com/KernelTuner/kernel_tuner)
141 |   - [Documentation](https://kerneltuner.github.io/kernel_tuner/stable/)
142 |   - [Tutorial](https://github.com/KernelTuner/kernel_tuner_tutorial)
143 | 


--------------------------------------------------------------------------------
/CITATION.cff:
--------------------------------------------------------------------------------
  1 | # This CITATION.cff file was generated with cffinit.
  2 | # Visit https://bit.ly/cffinit to generate yours today!
  3 | 
  4 | cff-version: 1.2.0
  5 | title: Netherlands eScience Center - Software Development Guide
  6 | message: "If you use this guide, please cite it."
  7 | type: software
  8 | authors:
  9 |   - affiliation: Netherlands eScience Center
 10 |     family-names: Drost
 11 |     given-names: Niels
 12 |     orcid: "https://orcid.org/0000-0001-9795-7981"
 13 |   - affiliation: Netherlands eScience Center
 14 |     family-names: Spaaks
 15 |     given-names: Jurriaan H.
 16 |     orcid: "https://orcid.org/0000-0002-7064-4069"
 17 |   - affiliation: Netherlands eScience Center
 18 |     family-names: Andela
 19 |     given-names: Bouwe
 20 |   - affiliation: Netherlands eScience Center
 21 |     family-names: Veen
 22 |     given-names: Lourens
 23 |   - affiliation: Netherlands eScience Center
 24 |     family-names: Zwaan
 25 |     name-particle: van der
 26 |     given-names: Janneke M.
 27 |     orcid: "https://orcid.org/0000-0002-8329-7000"
 28 |   - affiliation: Netherlands eScience Center
 29 |     family-names: Verhoeven
 30 |     given-names: Stefan
 31 |     orcid: "https://orcid.org/0000-0002-5821-2060"
 32 |   - affiliation: Netherlands eScience Center
 33 |     family-names: Bos
 34 |     given-names: Patrick
 35 |     orcid: "https://orcid.org/0000-0002-6033-960X"
 36 |   - family-names: Kuzak
 37 |     given-names: Mateusz
 38 |     orcid: "https://orcid.org/0000-0003-0087-6021"
 39 |   - affiliation: Netherlands eScience Center
 40 |     family-names: Werkhoven
 41 |     name-particle: van
 42 |     given-names: Ben
 43 |     orcid: "https://orcid.org/0000-0002-7508-3272"
 44 |   - affiliation: Netherlands eScience Center
 45 |     family-names: Attema
 46 |     given-names: Jisk
 47 |     orcid: "https://orcid.org/0000-0002-0948-1176"
 48 |   - affiliation: Netherlands eScience Center
 49 |     family-names: Hidding
 50 |     given-names: Johannes
 51 |   - family-names: Hees
 52 |     name-particle: van
 53 |     given-names: Vincent
 54 |     orcid: "https://orcid.org/0000-0003-0182-9008"
 55 |   - affiliation: Netherlands eScience Center
 56 |     family-names: Martinez-Ortiz
 57 |     given-names: Carlos
 58 |     orcid: "https://orcid.org/0000-0001-5565-7577"
 59 |   - affiliation: Netherlands eScience Center
 60 |     family-names: Spreeuw
 61 |     given-names: Hanno
 62 |     orcid: "https://orcid.org/0000-0002-5057-0322"
 63 |   - family-names: Borgdorff
 64 |     given-names: Joris
 65 |     orcid: "https://orcid.org/0000-0001-7911-9490"
 66 |   - family-names: Leinweber
 67 |     given-names: Katrin
 68 |   - affiliation: Netherlands eScience Center
 69 |     family-names: Diblen
 70 |     given-names: Faruk
 71 |   - affiliation: Netherlands eScience Center
 72 |     family-names: Oord
 73 |     name-particle: van den
 74 |     given-names: Gijs
 75 |   - affiliation: Netherlands eScience Center
 76 |     family-names: Goncalves
 77 |     given-names: Romulo
 78 |     orcid: "https://orcid.org/0000-0003-2225-1428"
 79 |   - affiliation: Netherlands eScience Center
 80 |     family-names: Kuzniar
 81 |     given-names: Arnold
 82 |     orcid: "https://orcid.org/0000-0003-1711-7961"
 83 |   - affiliation: Netherlands eScience Center
 84 |     family-names: Kuppevelt
 85 |     name-particle: van
 86 |     given-names: Dafne
 87 |   - affiliation: Netherlands eScience Center
 88 |     family-names: Weel
 89 |     given-names: Berend
 90 |   - affiliation: Netherlands eScience Center
 91 |     family-names: Meijer
 92 |     given-names: Christiaan
 93 |   - affiliation: Netherlands eScience Center
 94 |     family-names: Maassen
 95 |     given-names: Jason
 96 |     orcid: "https://orcid.org/0000-0002-8172-4865"
 97 |   - affiliation: Netherlands eScience Center
 98 |     family-names: Rodríguez-Sánchez
 99 |     given-names: Pablo
100 |     orcid: "https://orcid.org/0000-0002-2855-940X"
101 |   - affiliation: Netherlands eScience Center
102 |     family-names: Klaver
103 |     given-names: Tom
104 |   - affiliation: Netherlands eScience Center
105 |     family-names: Hage
106 |     name-particle: van
107 |     given-names: Willem Robert
108 |     orcid: "https://orcid.org/0000-0002-6478-3003"
109 |   - affiliation: Netherlands eScience Center
110 |     family-names: Zapata
111 |     given-names: Felipe
112 |     orcid: "https://orcid.org/0000-0001-8286-677X"
113 |   - affiliation: Netherlands eScience Center
114 |     family-names: Bakker
115 |     given-names: Tom
116 |   - affiliation: Netherlands eScience Center
117 |     family-names: Rijn
118 |     name-particle: van
119 |     given-names: Sander
120 |     orcid: "https://orcid.org/0000-0001-6159-041X"
121 |   - affiliation: Journal of Open Source Software
122 |     family-names: Niemeyer
123 |     given-names: Kyle
124 |   - affiliation: Netherlands eScience Center
125 |     family-names: Wehner
126 |     given-names: Jens
127 |   - affiliation: Netherlands eScience Center
128 |     family-names: Burg
129 |     name-particle: van der
130 |     given-names: Sven
131 |   - affiliation: Netherlands eScience Center
132 |     family-names: Siqueira
133 |     given-names: Abel
134 |   - affiliation: Netherlands eScience Center
135 |     family-names: Vreede
136 |     given-names: Barbara
137 |   - affiliation: Netherlands eScience Center
138 |     family-names: Schnober
139 |     given-names: Carsten
140 |   - affiliation: Netherlands eScience Center
141 |     family-names: Chandramouli
142 |     given-names: Pranav
143 |   - affiliation: Utrecht University
144 |     family-names: Oberman
145 |     given-names: Hanne
146 |   - affiliation: Netherlands eScience Center
147 |     family-names: Lüken
148 |     given-names: Malte
149 |   - affiliation: Netherlands eScience Center
150 |     family-names: Sclocco
151 |     given-names: Alessio
152 |   - affiliation: "Datadog, Inc."
153 |     family-names: Lev
154 |     given-names: Ofek
155 |   - affiliation: Netherlands eScience Center
156 |     family-names: Cahen
157 |     given-names: Ewan
158 |   - affiliation: Netherlands eScience Center
159 |     family-names: Ali
160 |     given-names: Suvayu
161 |   - affiliation: Netherlands eScience Center
162 |     family-names: Hafner
163 |     given-names: Flavio
164 |   - affiliation: Netherlands eScience Center
165 |     family-names: Cushing
166 |     given-names: Reggie
167 |   - affiliation: Netherlands eScience Center
168 |     family-names: Kasalica
169 |     given-names: Vedran
170 |     orcid: "https://orcid.org/0000-0002-0097-1056"
171 |   - affiliation: Utrecht University
172 |     family-names: Vargas Honorato
173 |     given-names: Rodrigo
174 |     orcid: "https://orcid.org/0000-0001-5267-3002"
175 | repository-code: "https://github.com/NLeSC/guide"
176 | abstract: >-
177 |   This is a guide to software development and projects at
178 |   the Netherlands eScience Center. It both serves as a
179 |   source of information for exactly how we work at the
180 |   eScience Center, and as a basis for discussions and
181 |   reaching consensus on this topic.
182 | license: CC-BY-4.0
183 | 


--------------------------------------------------------------------------------
/best_practices.md:
--------------------------------------------------------------------------------
  1 | # Best Practices for Software Development
  2 | 
  3 | In this chapter we give an overview of the best practices for software development at the Netherlands eScience Center, including a rationale.
  4 | 
  5 | ## Checklists
  6 | 
  7 | An easy way to make sure you did not forget anything important is to use a well-curated checklist.
  8 | Great examples can be found via [FAIR Software NL](https://fair-software.nl/recommendations/checklist).
  9 | [The Turing Way](https://book.the-turing-way.org) has specific topical checklists at the end of each of their chapters.
 10 | 
 11 | ## Version control
 12 | 
 13 | Use a version control tool like `git` to track changes in your codebase.
 14 | This allows you to retrace your steps when debugging, keep your repository clean, easily collaborate with others asynchronously and more.
 15 | More info: [The Turing Way chapter on Version Control](https://book.the-turing-way.org/reproducible-research/vcs), [RSQKit chapter on Version Control](http://everse.software/RSQKit/using_version_control).
 16 | 
 17 | **At the Netherlands eScience Center:** we always use version control and we preferably use GitHub as our online repository and collaboration platform (see the [Project Management Protocol on our intranet](https://nlesc.sharepoint.com/sites/home/SitePages/Project-procedures.aspx) (only accessible to Netherlands eScience Center employees)).
 18 | 
 19 | ## Testing
 20 | 
 21 | Tests are important for two reasons: 1. confirming the expected workings of your code while developing for the first time and 2. making sure your features keep working when later on you or others modify the implementation.
 22 | [The Turing Way gives an overview of the many ways to test code](https://book.the-turing-way.org/reproducible-research/testing).
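
As a minimal sketch of both reasons, the Rust example below pairs a small function with unit tests. The `mean` function is a hypothetical example, not taken from any eScience Center project. The tests confirm the expected behaviour now, and `cargo test` will flag a regression if a later change breaks it.

```rust
// Hypothetical function under test: the arithmetic mean of a slice.
fn mean(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None; // an empty input has no mean
    }
    Some(values.iter().sum::<f64>() / values.len() as f64)
}

fn main() {
    println!("{:?}", mean(&[1.0, 2.0, 3.0]));
}

// Compiled only when running `cargo test`.
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn mean_of_known_values() {
        assert_eq!(mean(&[1.0, 2.0, 3.0]), Some(2.0));
    }

    #[test]
    fn mean_of_empty_input_is_none() {
        assert_eq!(mean(&[]), None);
    }
}
```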
 23 | 
 24 | ## Code Reviews
 25 | 
 26 | The most effective tool for improving software quality (and sharing knowledge at the same time) is doing code reviews.
 27 | Have a look at [The Turing Way chapter on Code Reviewing](https://book.the-turing-way.org/reproducible-research/reviewing) to learn more about ways to do this.
 28 | 
 29 | ## Documentation
 30 | 
 31 | Developed programs should be documented at multiple levels, from code comments, through API documentation, to installation and usage documentation.
 32 | Comments at each level should take into account different target audiences, from experienced developers, to end users with no programming skills.
 33 | In the [Turing Way chapter on Code Documentation](https://book.the-turing-way.org/reproducible-research/code-documentation) you will find a great overview of the how and why of documentation.
 34 | 
 35 | ## Code Quality
 36 | 
 37 | Ways to improve code quality are described in the [Code quality](https://book.the-turing-way.org/reproducible-research/code-quality.html) chapter of The Turing Way.
 38 | 
 39 | Explore [online tools for software quality improvement](https://book.the-turing-way.org/reproducible-research/code-quality/code-quality-style.html#online-services-providing-software-quality-checks). Additionally, check our [language guides](/language_guides/languages_overview.md) for language-specific recommendations.
 40 | [RSQKit: Research Software Quality Kit](https://everse.software/RSQKit/) also has many useful guides, including guides on software quality. These guides are the result of an international collaboration primarily focusing on research software quality.
 41 | 
 42 | ### EditorConfig
 43 | 
 44 | The eScience Center provides a [shared config file](https://raw.githubusercontent.com/NLeSC/exemplum/master/.editorconfig) for IDEs and text editors. This file helps standardize coding styles across projects.
 45 | 
 46 | ### Namespaces
 47 | 
 48 | If your programming language supports namespaces, use your organization or project-specific namespace.
 49 | 
 50 | **At the Netherlands eScience Center:** the recommended namespace is **nl.esciencecenter**, or adapt it to a namespace that aligns with your project's context.
 51 | 
 52 | ## Use standards
 53 | 
 54 | Standard file formats and protocols should always be your first choice.
 55 | Using standards improves the interoperability of your software, thereby improving its usefulness.
 56 | Examples include exchange formats like Unicode, NetCDF, and W3C web standards, and protocols like HTTP, TCP, TLS.
 57 | 
 58 | ## Licensing
 59 | 
 60 | Since source code is protected by copyright, it needs a license before other people are allowed to use it.
 61 | For more information, see [The Turing Way chapter on licensing](https://the-turing-way.netlify.app/reproducible-research/licensing) or the [RSQKit Licensing software task](http://everse.software/RSQKit/licensing_software).
 62 | 
 63 | **At the Netherlands eScience Center:** our first choice is the Apache v2 license.
 64 | See the [Project Management Protocol on our intranet](https://nlesc.sharepoint.com/sites/home/SitePages/Project-procedures.aspx) (only accessible to Netherlands eScience Center employees) for more details on licensing and our intellectual property policies.
 65 | 
 66 | ## Software management plans
 67 | 
 68 | The Netherlands eScience Center and [NWO](https://www.nwo.nl/en) have authored the [practical guide to software management plans](https://doi.org/10.5281/zenodo.7248877) ([see also](https://www.esciencecenter.nl/national-guidelines-for-software-management-plans/)).
 69 | For our projects we recommend using [our Software Sustainability Protocol](https://doi.org/10.5281/zenodo.1451750), which is based on these guidelines.
 70 | For more information you can also [read here](https://github.com/the-turing-way/the-turing-way/issues/2419). <!-- we should point to the actual Turing Way chapter once it has been created -->
 71 | 
 72 | ## Releases
 73 | 
 74 | Releases are a way to mark or point to a particular milestone in software development.
 75 | This is useful for users and collaborators, for example when someone reports "I found a bug running version x".
 76 | For publications that refer to software, referring to a specific release enhances reproducibility.
 77 | See [the RSQKit task on Creating code releases](http://everse.software/RSQKit/releasing_software) for the most essential guidelines.
 78 | The Turing Way offers many related tips in their [chapter on Making Research Objects Citable](https://book.the-turing-way.org/communication/citable), like how to make code citable with `CITATION.cff` files.
 79 | 
 80 | ## Packaging
 81 | 
 82 | A related, but separate topic is packaging, which allows users to conveniently install your released software.
 83 | Most [languages](/language_guides/languages_overview) and operating systems have their particular ways of doing this.
 84 | The Turing Way offers advice on [making reproducible environments](https://book.the-turing-way.org/reproducible-research/renv), in which packaging is an essential component.
 85 | 
 86 | ## Know your tools
 87 | 
 88 | In addition to the advice on the best practices above, knowing the
 89 | tools that are available for software development can really help you get
 90 | things done faster.
 91 | 
 92 | ### Learn how to use the command line efficiently
 93 | 
 94 | Read the chapter on using [Bash](/language_guides/bash.md).
 95 | 
 96 | ### Use an editor that helps you develop
 97 | 
 98 | Commonly used editors and their ecosystem of plugins can really help you write
 99 | better code faster.
100 | Note that for each of the editors and environments listed below, it is important
101 | to configure them such that they support the programming languages that you are
102 | developing in.
103 | 
104 | Below is a list of editors that support many programming languages.
105 | 
106 | Integrated Development Environments (IDEs):
107 | 
108 | - [Visual Studio Code](https://code.visualstudio.com/) - modern editor with extensive plugin ecosystem that can make it as powerful as most IDEs
109 | - [JetBrains IDEs](https://www.jetbrains.com/ides/) - specialized IDEs for Python, C++, Java and web, all using the IntelliJ framework
110 | - [Eclipse](https://www.eclipse.org/ide/) - a bit older but still nice
111 | 
112 | Text editors:
113 | 
114 | - [vim](https://www.vim.org/) - classic text editor
115 | - [emacs](https://www.gnu.org/software/emacs/) - classic text editor
116 | 


--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
  1 | # Contributing to this Guide
  2 | 
  3 | - [Who? You!](#who_you)
  4 | - [Audience](#audience)
  5 | - [Scope](#scope)
  6 | - [How?](#how)
  7 | - [Technical details (docsify)](#technical-details)
  8 | - [Zen of the Guide](#zen-of-the-guide)
  9 | 
 10 | # Who? You!
 11 | 
 12 | This guide is primarily written by the Research Software Engineers at the Netherlands eScience Center.
 13 | Contributions by anyone (also outside the Center) are most welcome!
 14 | 
 15 | ## Page maintainers
 16 | 
 17 | While everybody is encouraged to contribute where they can, we appoint maintainers for specific pages to regularly keep things up to date and think along with contributors.
 18 | To see who is responsible for which part of the guide see the maintainer listed at the top of a page.
 19 | If you are interested in becoming a chapter owner for a page that is listed as _unmaintained_, please open a pull request to add your name instead of _unmaintained_.
 20 | 
 21 | ## Editorial board
 22 | 
 23 | The editors make sure content is in line with [the scope](#scope), that it is maintainable and that it is maintained.
 24 | In practice they will:
 25 | 
 26 | - track issues, discussions and pull requests, lead them towards a satisfactory conclusion and, when necessary (in case of disagreement), decide on them,
 27 | - flag content that needs to be updated or removed,
 28 | - ask for input from page maintainers or other contributors,
 29 | - periodically organize sprints to work on content together with everyone interested in contributing; usually in the form of a "Book Dash" together with The Turing Way contributors,
 30 | 
 31 | and do any other regular editing tasks.
 32 | 
 33 | Currently the team consists of:
 34 | 
 35 | - Bouwe Andela [@bouweandela](https://github.com/bouweandela) (research software engineer)
 36 | - Carlos Martínez Ortiz [@c-martinez](https://github.com/c-martinez) (community manager)
 37 | - Patrick Bos [@egpbos](https://github.com/egpbos) (technology lead)
 38 | 
 39 | # Audience
 40 | 
 41 | Our eScience Center _RSEs_ are the prototypical audience members, in particular those starting out in some unfamiliar area of technology.
 42 | Some characteristics include:
 43 | 
 44 | - They are interested in _intermediate to advanced level_ best practices. If there are already ten easily found blog posts about it, it doesn't have to be in the Guide.
 45 | - They are a _programmer or researcher_ that is already familiar with some other programming language or software-related technology.
 46 | - They may be generally interested (in particular topics of eScience practice and research software development in general or how this is done at the eScience Center specifically), but their main aim is towards _practical_ application, not to create a literature study of the current landscape of (research) software.
 47 | 
 48 | # Scope
 49 | 
 50 | To make sure the information in this guide stays relevant and up to date it is intentionally low on technical details.
 51 | The guide contains and links to best practices we use to code and develop research software in our projects.
 52 | 
 53 | The main goal: having information available about research software engineering best practices for our colleagues, collaborators and other interested people.
 54 | It can be information that you can give a colleague starting in some area, for instance, a new language or a new technology.
 55 | 
 56 | 80% of this goal will be met by [the Turing Way](https://book.the-turing-way.org/).
 57 | For everything else: we have the Guide.
 58 | 
 59 | We focus on eScience Center-specific best practices.
 60 | These can be generic and complete or specific and highly curated.
 61 | It depends!
 62 | For instance, eScience specific content (e.g. we prefer `git` over `svn`) should be in the Guide, while content of interest to a general audience (e.g. it is good practice to use a version control system) should go in The Turing Way.
 63 | When in doubt, discuss your doubts in an issue.
 64 | 
 65 | A few things are excluded:
 66 | 
 67 | 1. Project-related practices (planning, communication, stakeholders, management, etc.). These we gather on our intranet pages.
 68 | 2. Project output is gathered on the [Research Software Directory](https://research-software-directory.org/organisations/netherlands-escience-center?tab=software&order=is_featured).
 69 | 3. Generic research software engineering advice that can be added to [The Turing Way](https://github.com/the-turing-way/the-turing-way).
 70 | 
 71 | In practice, this means the Guide (for now) will mostly consist of language guides and technology guides.
 72 | 
 73 | It can also sometimes function as a staging/draft area for eventually moving content to the Turing Way.
 74 | However, we will urge you to contribute to the Turing Way directly.
 75 | 
 76 | ## For significant changes / additions, especially new chapters
 77 | 
 78 | Please check if your contribution fits in [The Turing Way](https://github.com/the-turing-way/the-turing-way) before considering contributing to this guide.
 79 | Feel free to ask the [editors](#editorial-board) if you are unsure or open an [issue](https://github.com/NLeSC/guide/issues) to discuss it.
 80 | If it does not fit, please open an [issue](https://github.com/NLeSC/guide/issues) to discuss your planned contribution before starting to work on it, to avoid disappointment later.
 81 | 
 82 | # How?
 83 | 
 84 | ## Style, form
 85 | 
 86 | A well written piece of advice should contain the following information:
 87 | 
 88 | 1. What, e.g. _version control_
 89 | 2. Why, e.g. _why version control is a good idea_
 90 | 3. Short how / tl;dr: Recommend one solution for readers who don't want to spend time reading about all possible options, e.g. _at NLeSC we use git with GitHub because..._ This is where NLeSC specific info should go if it makes sense to do so.
 91 | 4. Long how: also explain other options for implementing advice, e.g. _here's a list of some more version control programs and/or services which we can recommend_.
 92 | 
 93 | ## Technical
 94 | 
 95 | Please use branches and pull requests to contribute content. If you are not part of the Netherlands eScience Center organization but would still like to contribute, please do so by submitting a pull request from a fork.
 96 | 
 97 | ```shell
 98 | git clone https://github.com/NLeSC/guide.git
 99 | cd guide
100 | git branch newbranch
101 | git checkout newbranch
102 | ```
103 | 
104 | Please install [pre-commit](https://pre-commit.com/) and enable the pre-commit
105 | hooks by running
106 | 
107 | ```shell
108 | pre-commit install
109 | ```
110 | 
111 | to automatically format your changes when committing.
112 | 
113 | Add your new awesome feature, fix bugs, make other changes.
114 | 
115 | To preview changes locally, host the repo with a static file web server:
116 | 
117 | ```shell
118 | python3 -m http.server 4000
119 | ```
120 | 
121 | to view the documentation in a web browser (default address: http://localhost:4000).
122 | 
123 | To check if there are any broken links use [lychee](https://github.com/lycheeverse/lychee) in a Docker container:
124 | 
125 | ```shell
126 | docker run --init -it -v "$(pwd)":/docs lycheeverse/lychee /docs --config=/docs/lychee.toml
127 | ```
128 | 
129 | If everything works as it should, `git add`, `commit` and `push` like normal.
130 | 
131 | If you have made a significant contribution to the guide, please make sure to add yourself to the `CITATION.cff` file so your name can be included in the list of authors of the guide.
132 | 
133 | ## Create a PDF file
134 | 
135 | We host a PDF version of the guide on [Zenodo](https://doi.org/10.5281/zenodo.4020564).
136 | To update it a [new release](https://github.com/NLeSC/guide/releases) needs to be made of the guide. This will trigger a GitHub action to create a new Zenodo version with the PDF file.
137 | 
138 | # Technical details
139 | 
140 | The basics of how the Guide is implemented.
141 | 
142 | The Guide is rendered by [docsify](https://docsify.js.org) and hosted on GitHub Pages.
143 | Deployment is "automatic" from the main branch, because docsify requires no build step into static HTML pages, but rather generates HTML dynamically from the Markdown files in the Guide repository.
144 | The only configuration that was necessary for this automatic deployment is:
145 | 
146 | 1. The [index.html](https://github.com/NLeSC/guide/blob/main/index.html) file in the root directory that loads docsify.
147 | 2. The empty [.nojekyll](https://github.com/NLeSC/guide/blob/main/.nojekyll) file, which tells GitHub that we're not dealing with Jekyll here (the GitHub Pages default).
148 | 3. Telling GitHub in the Settings -> Pages menu to load the Pages content from the root directory.
149 | 4. The [\_sidebar.md](https://github.com/NLeSC/guide/blob/main/_sidebar.md) file for the table of contents.
150 | 
151 | Plugins that we use:
152 | 
153 | - The [docsify full text search plugin](https://docsify.js.org/#/plugins?id=full-text-search)
154 | - The [docsify Google Analytics plugin](https://docsify.js.org/#/plugins?id=google-analytics)
155 | - [Prism](https://docsify.js.org/#/language-highlight) is used for language highlighting.
156 | 
157 | If you want to change anything in this part, please discuss in an issue.
158 | 
159 | # Zen of the Guide
160 | 
161 | 0. Help your colleagues.
162 | 1. Citing is better than copying.
163 | 2. Copying is better than rewriting from scratch.
164 | 3. ... but leaving out is often even better.
165 | 4. Don't state the obvious.
166 | 5. Don't assume that something is obvious.
167 | 6. Snippets are friends.
168 | 7. Remove outdated content.
169 | 8. Better yet, update outdated content.
170 | 9. Your practices are just _your_ practices. Best practices are shared practices. $N>1$.
171 | 10. Our best practices are just _our_ best practices. We don't have to agree with everyone.
172 | 11. Best practices are timeless (at least for a year or so).
173 | 12. Best practices are never set in stone. They are set in the Guide.
174 | 13. Best practices are not always practices.
175 | 14. ~~Best practices are not always best practices.~~
176 | 15. Kill your darlings.
177 | 16. Consider The Turing Way first.
178 | 17. Sharing is better than guiding.
179 | 18. Guiding is better than turning a blind eye.
180 | 19. This Guide shall be under your pillow.
181 | 


--------------------------------------------------------------------------------
/language_guides/bash.md:
--------------------------------------------------------------------------------
  1 | # Bash
  2 | 
  3 | _Page maintainer: Bouwe Andela_ [@bouweandela](https://github.com/bouweandela)
  4 | 
  5 | Bash is both a command line interface,
  6 | also known as a **shell**, and a scripting language.
  7 | On most Linux distributions, the Bash shell is the default way of interacting
  8 | with the system.
  9 | Zsh is an alternative shell that also understands most of the Bash scripting language;
 10 | it is the default shell on recent versions of macOS.
 11 | Both Bash and Zsh are available for most operating systems.
 12 | 
 13 | At the Netherlands eScience Center, Bash is the recommended shell scripting
 14 | language because it is the most commonly used shell language and therefore the
 15 | most convenient for collaboration.
 16 | To facilitate mutual understanding, it is also recommended that you are aware of
 17 | the shell that your collaborators are using and that you write documentation
 18 | with this in mind.
 19 | Using the same shell as your collaborators is a simple way of making sure you
 20 | are always on the same page.
 21 | 
 22 | This chapter gives a short introduction and best practices for both interactive
 23 | use and use in scripts.
 24 | An excellent tutorial introducing Bash can be found
 25 | [here](https://swcarpentry.github.io/shell-novice/).
 26 | If you have not used Bash or another shell before, it is recommended that you
 27 | follow the tutorial before continuing reading.
 28 | Learning to use Bash is highly recommended, because after some initial learning,
 29 | you will be more efficient and have a better understanding of what is going on
 30 | than when clicking buttons from the graphical user interface of your operating
 31 | system or integrated development environment.
 32 | 
 33 | ## Interactive use
 34 | 
 35 | If you are a (research) software engineer, it is highly recommended that you
 36 | learn
 37 | 
 38 | - the [keyboard shortcuts](#Bash-keyboard-shortcuts)
 39 | - how to configure [Bash aliases](#Bash-aliases)
 40 | - the name and function of [commonly used command line tools](#Commonly-used-command-line-tools)
 41 | 
 42 | ### Bash keyboard shortcuts
 43 | 
 44 | An introduction to
 45 | [bash keyboard shortcuts](https://www.tecmint.com/linux-command-line-bash-shortcut-keys/)
 46 | can be found here.
 47 | Note that Bash can also be configured such that it uses the _vi_ keyboard
 48 | shortcuts instead of the default _emacs_ ones, which can be useful if you
 49 | [prefer vi](https://skeptics.stackexchange.com/questions/17492/does-emacs-cause-emacs-pinky).
 50 | 
 51 | ### Bash aliases
 52 | 
 53 | [Bash aliases](https://linuxize.com/post/how-to-create-bash-aliases/)
 54 | allow you to define shorthands for commands you use often.
 55 | Typically these are defined in the `~/.bashrc` or `~/.bash_aliases` file.
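
As a minimal sketch, lines like the following could be added to `~/.bashrc`. The alias names `ll` and `gs` are illustrative choices only, not a convention prescribed by this guide.

```shell
# Hypothetical aliases for illustration; pick names that suit your own workflow.
alias ll='ls -alh'      # long listing, including hidden files, with human-readable sizes
alias gs='git status'   # shorthand for a frequently used git command
```

Running `alias` without arguments lists all aliases defined in the current shell, and `unalias ll` removes one again.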
 56 | 
 57 | ### Commonly used command line tools
 58 | 
 59 | It is recommended that you know at least the names and use of the following
 60 | command line tools.
 61 | The details of how to use a tool exactly can easily be found by searching the
 62 | internet or using `man` to read the manual, but you will be vastly more
 63 | efficient if you already know the name of the command you are looking for.
 64 | 
 65 | **Working with files**
 66 | 
 67 | - `ls` - List files and directories
 68 | - `tree` - Graphical representation of a directory structure
 69 | - `cd` - Change working directory
 70 | - `pwd` - Show current working directory
 71 | - `cp` - Copy a file or directory
 72 | - `mv` - Move a file or directory
 73 | - `rm` - Remove a file or directory
 74 | - `mkdir` - Make a new directory
 75 | - `touch` - Make a new empty file or update its access and modification time to the current time
 76 | - `chmod` - Change the permissions on a file or directory
 77 | - `chown` - Change the owner of a file or directory
 78 | - `find` - Search for files and directories on the file system
 79 | - `locate`, `updatedb` - Search for files and directories quickly using a database
 80 | - `tar` - (Un)pack .tar or .tar.gz files
 81 | - `unzip` - Unpack .zip files
 82 | - `df`, `du` - Show free space on disk, show disk space usage of files/folders
 83 | 
 84 | **Working with text**
 85 | 
 86 | Here we list the most commonly used Bash tools that are built to manipulate
 87 | _lines of text_.
 88 | The nice thing about these tools is that you can combine them by streaming the
 89 | output of one tool to become the input of the next tool.
 90 | Have a look at the
 91 | [tutorial](https://swcarpentry.github.io/shell-novice/04-pipefilter.html)
 92 | for an introduction.
 93 | This can be done by creating
 94 | [pipelines](https://www.gnu.org/savannah-checkouts/gnu/bash/manual/bash.html#Pipelines)
 95 | with the pipe operator `|` and by redirecting text to output streams or files
 96 | using
 97 | [redirection operators](https://www.gnu.org/savannah-checkouts/gnu/bash/manual/bash.html#Redirections)
 98 | like `>` for output and `<` for input to a command from a text file.
 99 | 
100 | - `echo` - Repeat some text
101 | - `diff` - Show the difference between two text files
102 | - `grep` - Search for lines of text matching a simple string or regular expressions
103 | - `sed` - Edit lines of text using regular expressions
104 | - `cut` - Select columns from text
105 | - `cat` - Print the content of a file
106 | - `head` - Print the first n lines
107 | - `tail` - Print the last n lines
108 | - `tee` - Read from standard input and write to standard output and file
109 | - `less` - Read text
110 | - `sort` - Sort lines of text
111 | - `uniq` - Keep unique lines
112 | - `wc` - Count words/lines
113 | - `nano`, `emacs`, `vi` - Interactive text editors found on most Unix systems
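
To illustrate how these tools combine, here is a small sketch of a pipeline with input and output redirection. The file names and sample data are made up for this example.

```shell
# Build a small sample file (file name and contents are invented for this example).
workdir="$(mktemp -d)"
printf 'apple\nbanana\napple\ncherry\nbanana\napple\n' > "$workdir/fruit.txt"

# Count how often each line occurs and keep the two most frequent entries:
# sort groups identical lines, uniq -c counts them, sort -rn orders by count,
# head keeps the first two lines, and > writes the result to a file.
sort < "$workdir/fruit.txt" | uniq -c | sort -rn | head -n 2 > "$workdir/top.txt"

cat "$workdir/top.txt"
```

The resulting file lists `apple` (3 occurrences) first and `banana` (2 occurrences) second.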
114 | 
115 | **Working with programs**
116 | 
117 | - `man` - Read the manual
118 | - `ps` - Print all currently running programs
119 | - `top` - Interactively display all currently running programs
120 | - `kill` - Stop a running program
121 | - `\time` - Collect statistics about resource usage such as runtime, memory use, storage access (the `\` in front is needed to run the `time` program instead of the Bash keyword with the same name)
122 | - `which` - Find which file will be executed when you run a command
123 | - `xargs` - Build and run commands from lines of standard input, optionally in parallel
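
As a sketch of how `xargs` can run commands in parallel: the `-P` option used here is an extension available in GNU and BSD `xargs` rather than part of POSIX, and the input words are arbitrary.

```shell
# Run one `echo` per input word, with at most two invocations at a time.
# -n 1: pass one argument per invocation; -P 2: run up to two invocations in parallel.
# The output order of parallel jobs is not guaranteed, so it is sorted afterwards.
result="$(printf '%s\n' one two three | xargs -n 1 -P 2 echo | sort)"
echo "$result"
```

In real use, `echo` would be replaced by a longer-running program, for example a conversion tool that is applied to each file in a list.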
124 | 
125 | **Working with remote systems**
126 | 
127 | - `ssh` - Connect to a shell on a remote computer
128 | - `rsync` - Synchronize files between computers, typically over SSH
129 | - `lftp` - Copy files between computers using FTP
130 | - `wget`, `curl` - Download a file over HTTP(S) or make a request to a remote API
131 | - `scp`, `sftp`, `ftp` - Simple tools for transferring files over (S)FTP - not recommended
132 | - `who` - Show who is logged on
133 | - `screen` - Run multiple bash sessions and keep them running even when you log out
134 | 
135 | **Installing software**
136 | 
137 | - `apt` - The default package manager on Debian based Linux distributions
138 | - `yum`, `dnf` - The default package manager on RedHat/Fedora based Linux distributions
139 | - `brew` - A package manager for macOS
140 | - `conda` - A package manager that supports many operating systems
141 | - `pip` - The Python package manager
142 | - `docker`, `singularity` - Run an entire Linux operating system including software from a [container](https://www.docker.com/resources/what-container)
143 | 
144 | **Miscellaneous**
145 | 
146 | - `bash`, `zsh` - The command to start Bash/Zsh
147 | - `history` - View all past commands
148 | - `fg`, `bg` - Move a program to the foreground, background, useful with Ctrl+Z
149 | - `su` - Switch user
150 | - `sudo` - Run a command with root permissions
151 | 
152 | For further inspiration, see this
153 | [extensive list of command line tools](https://fossbytes.com/a-z-list-linux-command-line-reference/).
154 | 
155 | ## Scripts
156 | 
157 | It is possible to write bash scripts.
158 | This is done by writing the commands that you would normally use on the command
159 | line in a text file and running that file, e.g. with `bash some-file.sh`.
160 | 
161 | However, doing this is only recommended if there really are no other options.
162 | If you have the option to write a Python script instead, that is the recommended
163 | way to go.
164 | This will bring you all the advantages of a fully-fledged programming language
165 | (such as libraries, frameworks for testing and documentation) and Python is the
166 | recommended programming language at the Netherlands eScience Center.
167 | If you do not mind having an extra dependency and would like to use the features
168 | and commands available in the shell from Python, the
169 | [sh](https://sh.readthedocs.io) library is a nice option.
170 | 
171 | Disclaimer: if you are an experienced Bash developer, there might be situations
172 | where using a Bash script solves your problem faster or in a more portable way
173 | than a Python script.
174 | Do take a moment to think about whether such a solution is easy to
175 | contribute to for collaborators and will be easy to maintain in the future, as
176 | the number of features, supported systems, and code paths grows.
177 | 
178 | When writing a bash script, always use
179 | [`shellcheck`](https://www.shellcheck.net/)
180 | to make sure that your bash script is as likely to do what you think it should
181 | do as possible.
182 | 
183 | In addition to that, always start the script with
184 | 
185 | ```bash
186 | set -euo pipefail
187 | ```
188 | 
189 | This will stop the script if there is
190 | 
191 | - `-e` a command that exits with a non-zero exit code
192 | - `-o pipefail` a command in a pipe that exits with a non-zero exit code
193 | - `-u` an undefined variable in your script
194 | 
195 | An exit code other than zero usually indicates that an error occurred.
196 | If needed, you can temporarily allow this kind of error for a single line by
197 | wrapping it like this:
198 | 
199 | ```bash
200 | set +e
201 | false  # A command that returns a non-zero exit code
202 | set -e
203 | ```
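
An alternative to toggling `set +e` and `set -e` is to handle an expected failure explicitly with `|| true`. The following is a minimal sketch under that assumption; the function name `count_matches` and the sample data are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# grep exits with code 1 when nothing matches, which would abort the script
# under `set -e` and `-o pipefail`. Appending `|| true` tolerates that one
# expected failure. Note that it also hides other grep errors, such as a missing file.
count_matches() {
    local pattern="$1"
    local file="$2"
    { grep -- "$pattern" "$file" || true; } | wc -l
}

# Sample input, invented for this example.
sample="$(mktemp)"
printf 'alpha\nbeta\nalpha beta\n' > "$sample"

matches="$(count_matches alpha "$sample")"
none="$(count_matches gamma "$sample")"
echo "alpha: ${matches}, gamma: ${none}"
```

Here `alpha` occurs on two lines and `gamma` on none, and the script continues past the `grep` call that finds nothing instead of stopping.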
204 | 
205 | ## Further resources
206 | 
207 | - [Bash Tutorial](https://swcarpentry.github.io/shell-novice/)
208 | - [Bash Cheat sheet](https://devhints.io/bash)
209 | - The [Bash Reference Manual](https://www.gnu.org/savannah-checkouts/gnu/bash/manual/bash.html) or use `man bash`
210 | - [Oh My Zsh](https://ohmyz.sh/) offers an extensive set of themes and shortcuts for Zsh
211 | 


--------------------------------------------------------------------------------
/language_guides/r.md:
--------------------------------------------------------------------------------
  1 | # R
  2 | 
  3 | _Page maintainers: [Malte Lüken](https://github.com/maltelueken) and [Pablo Rodríguez-Sánchez](https://github.com/PabRod)_
  4 | 
  5 | ## What is R?
  6 | 
  7 | R is a functional programming language and software environment for statistical computing and graphics: https://www.r-project.org/.
  8 | 
  9 | ### Philosophy and typical use cases
 10 | 
 11 | R is particularly popular in the social, health, and biological sciences where it is used for statistical modeling. R can also be used for signal processing (e.g. FFT), machine learning, image analyses, and natural language processing. The R syntax is similar to that of Matlab and Python in terms of compactness and readability, which makes it a good prototyping language for science.
 12 | 
 13 | One of the strengths of R is the large number of available open source statistical packages, often developed by domain experts. For example, the R package [Seewave](http://rug.mnhn.fr/seewave/) is specialised in sound analyses. Packages are typically released on CRAN, [The Comprehensive R Archive Network](http://cran.r-project.org).
 14 | 
 15 | ### Some crucial differences with Python
 16 | 
 17 | Are you familiar with Python? Then kickstart your R journey by reading this [blog post](https://towardsdatascience.com/the-starter-guide-for-transitioning-your-python-projects-to-r-8de4122b04ad).
 18 | 
 19 | ### Recommended sources of information
 20 | 
 21 | All R functions come with documentation in a standardized format. Some R packages have their own Google group. Further, Stack Overflow and standard search engines can lead you to answers to issues.
 22 | 
 23 | If you prefer books, consider the following resources:
 24 | 
 25 | - [R for Data Science](https://r4ds.had.co.nz/) by Hadley Wickham,
 26 | - [Advanced R](https://adv-r.hadley.nz/) by Hadley Wickham,
 27 | - [Writing better R code](http://www.bioconductor.org/help/course-materials/2013/CSAMA2013/friday/afternoon/R-programming.pdf) by Laurent Gatto.
 28 | 
 29 | ## Getting started
 30 | 
 31 | ### Setting up R
 32 | 
 33 | To install R, follow the detailed instructions on the [CRAN website](http://cran.r-project.org).
 34 | 
 35 | #### IDE
 36 | 
 37 | R programs can be written in any text editor. R code can be run from the command line or interactively within the R environment, which can be started with the `R` command in the shell. To quit the R environment, type `q()`.
 38 | 
 39 | That said, it is highly recommended to use an integrated development environment (IDE). The most popular one is [RStudio / Posit](https://posit.co/products/open-source/rstudio/). It is free and quite powerful. It features an editor with code completion, a command line environment, a file manager, a package manager and history lookup, among others.
 40 | 
 41 | It comes with many menus and key bindings (visible when you hover your mouse over the menu item). For instance, you can run code sections by selecting them and pressing `Ctrl+Enter`.
 42 | 
 43 | Note that you will have to install RStudio in addition to installing R. Updating RStudio does not automatically update R, and vice versa.
 44 | 
 45 | Within RStudio you can work on ad-hoc code or create a project. Compared with Python an R project is a bit like a virtual environment as it preserves the workspace and installed packages for that project. Creating a project is needed to build an R package. A project is created via the menu at the top of the screen.
 46 | 
 47 | ### Installing compilers and runtimes
 48 | 
 49 | Installing a separate compiler is usually not needed, as most functions in R are already compiled from C. Nevertheless, R has compiling functionality as described in the [R manual](https://stat.ethz.ch/R-manual/R-devel/library/compiler/html/compile.html). See also this [overview by Hadley Wickham](http://r-pkgs.had.co.nz/src.html).
 50 | 
 51 | ## Coding style conventions
 52 | 
 53 | We recommend following the [Tidyverse style guide](https://style.tidyverse.org/).
 54 | Its guidelines can be applied and checked automatically using tools such as:
 55 | 
 56 | - [styler](https://github.com/r-lib/styler)
 57 | - [lintr](https://github.com/r-lib/lintr)
 58 | 
 59 | ### The `<-` operator
 60 | 
 61 | Assigning variables with `<-` instead of `=` is recommended, although **most** of the time both are equivalent.
 62 | 
 63 | If you are interested in the controversy around assignment operators, check out this [blog post](https://csgillespie.wordpress.com/2010/11/16/assignment-operators-in-r-vs/).
 64 | 
 65 | ### `%>%` and `|>`
 66 | 
 67 | The symbols `%>%` and `|>` represent the pipe operator.
 68 | The first one is part of the `magrittr` package, and it gained so much popularity that a similar operator, `|>`, was added to base R in version 4.1.0. For details on the differences between the two, see this [blog post](https://www.tidyverse.org/blog/2023/04/base-vs-magrittr-pipe/).
 69 | They are just syntactic sugar for passing a variable to a function as its first argument.
 70 | The example below shows their basic behavior:
 71 | 
 72 | ```r
 73 | var %>% fun(params)   # `fun` stands for any function; `function` itself is a reserved word
 74 | # Is equivalent to
 75 | fun(var, params)
 76 | ```
 77 | 
 78 | These operators are useful for composing functions, and very often appear chained:
 79 | 
 80 | ```r
 81 | grades |> remove_nans() |> mean() |> print()
 82 | ```
 83 | 
 84 | You can think of it as a production chain, where an object (the `grades`) passes through three machines: one that removes the `NaN`s, another that takes the mean, and a last one that prints the result.
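
To make the chain concrete, here is a minimal runnable sketch. Note that `remove_nans()` is a hypothetical helper defined just for this example (it is not part of base R), and the `grades` values are made up.

```r
# Hypothetical helper (not part of base R): drops NaN entries from a numeric vector
remove_nans <- function(values) {
  values[!is.nan(values)]
}

grades <- c(7.5, NaN, 8.0, 6.5)

# Piped version (the native pipe requires R >= 4.1.0)
piped <- grades |> remove_nans() |> mean()

# Equivalent nested version, read from the inside out
nested <- mean(remove_nans(grades))

identical(piped, nested)  # TRUE
```

The piped version reads left to right in the order the steps are executed, which is the main reason to prefer it for longer chains.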
 85 | 
 86 | ## Recommended additional packages and libraries
 87 | 
 88 | One of the strengths of R is its community, which creates and maintains a constellation of packages.
 89 | Very rarely will you use just base R.
 90 | Here we give you a list of commonly used packages, starting with one that solves the first problem you'll find... how to manage that many packages!
 91 | 
 92 | ### Managing environments with `renv`
 93 | 
 94 | [`renv`](https://rstudio.github.io/renv/articles/renv.html) allows you to create and manage a dependency library on a per-project basis. It also keeps track of the specific versions of each package used in the project, which is great for reproducibility... and avoiding future headaches!
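
A typical `renv` workflow could look like the sketch below. It assumes you run it from the root directory of your project; the calls modify that project, so treat it as an outline rather than something to paste blindly.

```r
# install.packages("renv")  # one-time installation, if renv is not yet available

renv::init()      # set up a project-local library and create the lockfile renv.lock
renv::snapshot()  # record the package versions currently used by the project in renv.lock
renv::restore()   # reinstall the versions recorded in renv.lock (e.g. on another machine)
```

Committing `renv.lock` to version control lets collaborators recreate the same package versions with `renv::restore()`.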
 95 | 
 96 | ### Plotting with basic functions, ggplot2 and ggvis
 97 | 
 98 | For a general impression of plotting with R, see: https://www.r-graph-gallery.com/all-graphs
 99 | 
100 | The basic R installation comes with a wide range of functions to plot data to a window on your screen or to a file. If you need to quickly inspect your data or create a custom-made static plot then the basic functions offer the building blocks to do the job. There is a [Statmethods.net tutorial with some examples of plotting options in R](http://www.statmethods.net/graphs/index.html).
101 | 
102 | However, externally contributed plotting packages may offer easier syntax or convenient templates for creating plots. The most popular and powerful contributed graphics package is [ggplot2](https://ggplot2.tidyverse.org/); see also this [ggplot2 tutorial](https://www.statmethods.net/advgraphs/ggplot2.html). Interactive plots can be made with the [ggvis](https://github.com/rstudio/ggvis) package and embedded in web applications.
103 | 
104 | In summary, it is good to familiarize yourself with both the basic plotting functions and the contributed graphics packages. In theory, the basic plot functions can do everything that ggplot2 can do; it is mostly a matter of how much you like either syntax and how much freedom you need to tailor the visualisation to your use case.
105 | 
106 | ### Building interactive web applications with shiny
107 | 
108 | Thanks to [shiny](https://shiny.posit.co/), it is possible to make interactive web applications in R without the need to write JavaScript or HTML.
109 | 
110 | ### Building reports with knitr
111 | 
112 | [knitr](https://yihui.name/knitr/) is an R package designed to build dynamic reports in R. It can generate new PDF or HTML documents on the fly, with the results of computations embedded inside.
113 | 
114 | ### Preparing data for analysis
115 | 
116 | There are packages that ease tidying up messy data, e.g. [tidyr](https://github.com/hadley/tidyr) and [reshape2](https://github.com/hadley/reshape). The idea of tidy and messy data is explained in the [tidy data](http://vita.had.co.nz/papers/tidy-data.html) paper by Hadley Wickham. There is also the Google group [manipulatr](https://groups.google.com/forum/#!forum/manipulatr) for discussing topics related to data manipulation in R.
117 | 
118 | ### Speeding up code
119 | 
120 | Speeding up code always starts with knowing where your bottlenecks are.
121 | The following resource on profiling will help you find them:
122 | 
123 | - Introduction to [profiling in R](https://bookdown.org/rdpeng/rprogdatascience/profiling-r-code.html)
124 | 
125 | Some rules of thumb that can quickly improve your code are the following:
126 | 
127 | - Avoid loops, use `apply` functionals instead
128 | - Try to use vectorized functions
129 | - Check out the [`purrr`](https://purrr.tidyverse.org/) package
130 | - If you are really in a hurry, consider communicating with `C++` code using [`Rcpp`](https://www.rcpp.org/).
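
As a small illustration of the first two rules of thumb, the sketch below computes the same squares in three ways. The input vector and the variable names are made up for the example.

```r
values <- c(1, 2, 3, 4, 5)

# Explicit loop: grows the result one element at a time (slow for large inputs)
squares_loop <- c()
for (v in values) {
  squares_loop <- c(squares_loop, v^2)
}

# Functional: vapply applies the function to each element and checks the result type
squares_vapply <- vapply(values, function(v) v^2, numeric(1))

# Vectorized: arithmetic operators already work element-wise
squares_vectorized <- values^2

# All three approaches give the same result
stopifnot(
  identical(squares_loop, squares_vectorized),
  identical(squares_vapply, squares_vectorized)
)
```

The vectorized form is usually both the shortest and the fastest, so it is worth checking whether a vectorized function exists before reaching for a loop or a functional.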
131 | 
132 | For a deeper introduction to the many optimization methods, check the free ebook:
133 | 
134 | - [Efficient R programming](https://csgillespie.github.io/efficientR/), by Colin Gillespie and Robin Lovelace.
135 | 
136 | ## Package development
137 | 
138 | ### Building R packages
139 | 
140 | There is a great tutorial written by Hadley Wickham describing all the nitty-gritty of building your own package in R. It's called [R packages](http://r-pkgs.had.co.nz).
141 | For a quicker introduction, consider this Carpentries [lesson on R packages](https://carpentries-incubator.github.io/lesson-R-packaging/), which originated and was developed at our Center!
142 | 
143 | ### Package documentation
144 | 
145 | Read the [Documentation](http://r-pkgs.had.co.nz/man.html) chapter of Hadley's [R packages](http://r-pkgs.had.co.nz) book for details about documenting R code.
146 | 
147 | Customarily, R uses `.Rd` files in the `/man` directory for documentation. These files and folders are automatically created by RStudio when you create a new project from your existing R-function files.
148 | 
149 | Function level comments starting with `#'` are used by `roxygen` to automatically generate the `.Rd` files. This means that you **don't have to edit the `.Rd` files directly**.
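
As a minimal sketch, a documented function could look like this. The function `mean_without_na()` is a hypothetical example, not taken from an existing package.

```r
#' Compute the mean of a numeric vector, ignoring missing values
#'
#' @param values A numeric vector.
#' @return A single number: the mean of the non-missing entries.
#' @examples
#' mean_without_na(c(1, NA, 3))
#' @export
mean_without_na <- function(values) {
  mean(values, na.rm = TRUE)
}
```

Running `roxygen2::roxygenise()` (or `devtools::document()`) in the package directory then turns these comments into the corresponding `.Rd` file.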
150 | 
151 | R function documentation offers plenty of space to document the functionality, including code examples, literature references, and links to related functions. Nevertheless, it can sometimes be helpful for the user to also have a more generic description of the package with, for example, use cases. You can do this with a `vignette`.
152 | 
153 | Read more about vignettes in the [Package documentation](http://r-pkgs.had.co.nz/vignettes.html) chapter of Hadley's [R packages](http://r-pkgs.had.co.nz) book.
154 | Read more about `roxygen` syntax on its [GitHub page](https://github.com/yihui/roxygen2). `roxygen` will also populate the `NAMESPACE` file, which is necessary to manage package-level imports.
155 | 
156 | ## Available templates
157 | 
158 | Most of the templating is natively managed by the [`usethis`](https://usethis.r-lib.org/) package.
159 | It contains functions that create the boilerplate for you, reducing the burden on your memory and reducing chances for errors.
160 | In the snippet below you can see how it feels to use it.
161 | 
162 | ```r
163 | usethis::create_package("mypackage") # Creates a package structure ("mypackage" is a placeholder path)
164 | usethis::use_readme_md()      # Adds a readme
165 | usethis::use_apache_license() # Adds an Apache License
166 | usethis::use_testthat()       # Adds the testing infrastructure
167 | usethis::use_citation()       # Adds a citation file
168 | # etc...
169 | 
170 | ```
171 | 
172 | That said, the following templates can also serve as inspiration:
173 | 
174 | - https://rapporter.github.io/rapport/
175 | - https://shiny.posit.co/r/articles/build/templates/
176 | - https://bookdown.org/yihui/rmarkdown/document-templates.html
177 | 
178 | ## Testing, Checking, Debugging and Profiling
179 | 
180 | ### Testing and checking
181 | 
182 | [Testthat](https://github.com/hadley/testthat) is a testing package by Hadley Wickham. The [Testing chapter](http://r-pkgs.had.co.nz/tests.html) of the book [R packages](http://r-pkgs.had.co.nz) describes in detail the testing process in R using `testthat`. Further, [testthat: Get Started with Testing](https://journal.r-project.org/archive/2011-1/RJournal_2011-1_Wickham.pdf) by Wickham may also provide a good starting point.
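
To give an impression of what a test looks like, here is a minimal sketch. The function under test, `remove_nans()`, is a hypothetical helper defined only for this example; in a real package it would live in the `R/` directory and the test in a file under `tests/testthat/`.

```r
library(testthat)

# Hypothetical function under test: drops NaN entries from a numeric vector
remove_nans <- function(values) {
  values[!is.nan(values)]
}

test_that("remove_nans drops NaN entries and keeps the rest", {
  expect_equal(remove_nans(c(1, NaN, 3)), c(1, 3))
  expect_length(remove_nans(c(NaN, NaN)), 0)
})
```

Each `test_that()` block groups related expectations under a readable description, which is what gets reported when a test fails.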
183 | 
184 | See also [checking](http://r-pkgs.had.co.nz/check.html) and [testing](http://r-pkgs.had.co.nz/tests.html) R packages. Note that within RStudio, package checks and package tests can be run via simple toolbar clicks.
185 | 
186 | ### Continuous integration
187 | 
188 | [Continuous integration](https://book.the-turing-way.org/reproducible-research/ci) should be done with an online service. We recommend using GitHub Actions.
189 | 
190 | ### Debugging and Profiling
191 | 
192 | Debugging is possible in RStudio; see [Debugging with RStudio](https://support.posit.co/hc/en-us/articles/205612627-Debugging-with-RStudio). For profiling tips, see the [Profiling chapter of Advanced R](http://adv-r.had.co.nz/Profiling.html).
193 | 
194 | ## Not in this tutorial yet:
195 | 
196 | - Logging
197 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
  1 | Attribution 4.0 International
  2 | 
  3 | =======================================================================
  4 | 
  5 | Creative Commons Corporation ("Creative Commons") is not a law firm and
  6 | does not provide legal services or legal advice. Distribution of
  7 | Creative Commons public licenses does not create a lawyer-client or
  8 | other relationship. Creative Commons makes its licenses and related
  9 | information available on an "as-is" basis. Creative Commons gives no
 10 | warranties regarding its licenses, any material licensed under their
 11 | terms and conditions, or any related information. Creative Commons
 12 | disclaims all liability for damages resulting from their use to the
 13 | fullest extent possible.
 14 | 
 15 | Using Creative Commons Public Licenses
 16 | 
 17 | Creative Commons public licenses provide a standard set of terms and
 18 | conditions that creators and other rights holders may use to share
 19 | original works of authorship and other material subject to copyright
 20 | and certain other rights specified in the public license below. The
 21 | following considerations are for informational purposes only, are not
 22 | exhaustive, and do not form part of our licenses.
 23 | 
 24 |      Considerations for licensors: Our public licenses are
 25 |      intended for use by those authorized to give the public
 26 |      permission to use material in ways otherwise restricted by
 27 |      copyright and certain other rights. Our licenses are
 28 |      irrevocable. Licensors should read and understand the terms
 29 |      and conditions of the license they choose before applying it.
 30 |      Licensors should also secure all rights necessary before
 31 |      applying our licenses so that the public can reuse the
 32 |      material as expected. Licensors should clearly mark any
 33 |      material not subject to the license. This includes other CC-
 34 |      licensed material, or material used under an exception or
 35 |      limitation to copyright. More considerations for licensors:
 36 | 	wiki.creativecommons.org/Considerations_for_licensors
 37 | 
 38 |      Considerations for the public: By using one of our public
 39 |      licenses, a licensor grants the public permission to use the
 40 |      licensed material under specified terms and conditions. If
 41 |      the licensor's permission is not necessary for any reason--for
 42 |      example, because of any applicable exception or limitation to
 43 |      copyright--then that use is not regulated by the license. Our
 44 |      licenses grant only permissions under copyright and certain
 45 |      other rights that a licensor has authority to grant. Use of
 46 |      the licensed material may still be restricted for other
 47 |      reasons, including because others have copyright or other
 48 |      rights in the material. A licensor may make special requests,
 49 |      such as asking that all changes be marked or described.
 50 |      Although not required by our licenses, you are encouraged to
 51 |      respect those requests where reasonable. More_considerations
 52 |      for the public: 
 53 | 	wiki.creativecommons.org/Considerations_for_licensees
 54 | 
 55 | =======================================================================
 56 | 
 57 | Creative Commons Attribution 4.0 International Public License
 58 | 
 59 | By exercising the Licensed Rights (defined below), You accept and agree
 60 | to be bound by the terms and conditions of this Creative Commons
 61 | Attribution 4.0 International Public License ("Public License"). To the
 62 | extent this Public License may be interpreted as a contract, You are
 63 | granted the Licensed Rights in consideration of Your acceptance of
 64 | these terms and conditions, and the Licensor grants You such rights in
 65 | consideration of benefits the Licensor receives from making the
 66 | Licensed Material available under these terms and conditions.
 67 | 
 68 | 
 69 | Section 1 -- Definitions.
 70 | 
 71 |   a. Adapted Material means material subject to Copyright and Similar
 72 |      Rights that is derived from or based upon the Licensed Material
 73 |      and in which the Licensed Material is translated, altered,
 74 |      arranged, transformed, or otherwise modified in a manner requiring
 75 |      permission under the Copyright and Similar Rights held by the
 76 |      Licensor. For purposes of this Public License, where the Licensed
 77 |      Material is a musical work, performance, or sound recording,
 78 |      Adapted Material is always produced where the Licensed Material is
 79 |      synched in timed relation with a moving image.
 80 | 
 81 |   b. Adapter's License means the license You apply to Your Copyright
 82 |      and Similar Rights in Your contributions to Adapted Material in
 83 |      accordance with the terms and conditions of this Public License.
 84 | 
 85 |   c. Copyright and Similar Rights means copyright and/or similar rights
 86 |      closely related to copyright including, without limitation,
 87 |      performance, broadcast, sound recording, and Sui Generis Database
 88 |      Rights, without regard to how the rights are labeled or
 89 |      categorized. For purposes of this Public License, the rights
 90 |      specified in Section 2(b)(1)-(2) are not Copyright and Similar
 91 |      Rights.
 92 | 
 93 |   d. Effective Technological Measures means those measures that, in the
 94 |      absence of proper authority, may not be circumvented under laws
 95 |      fulfilling obligations under Article 11 of the WIPO Copyright
 96 |      Treaty adopted on December 20, 1996, and/or similar international
 97 |      agreements.
 98 | 
 99 |   e. Exceptions and Limitations means fair use, fair dealing, and/or
100 |      any other exception or limitation to Copyright and Similar Rights
101 |      that applies to Your use of the Licensed Material.
102 | 
103 |   f. Licensed Material means the artistic or literary work, database,
104 |      or other material to which the Licensor applied this Public
105 |      License.
106 | 
107 |   g. Licensed Rights means the rights granted to You subject to the
108 |      terms and conditions of this Public License, which are limited to
109 |      all Copyright and Similar Rights that apply to Your use of the
110 |      Licensed Material and that the Licensor has authority to license.
111 | 
112 |   h. Licensor means the individual(s) or entity(ies) granting rights
113 |      under this Public License.
114 | 
115 |   i. Share means to provide material to the public by any means or
116 |      process that requires permission under the Licensed Rights, such
117 |      as reproduction, public display, public performance, distribution,
118 |      dissemination, communication, or importation, and to make material
119 |      available to the public including in ways that members of the
120 |      public may access the material from a place and at a time
121 |      individually chosen by them.
122 | 
123 |   j. Sui Generis Database Rights means rights other than copyright
124 |      resulting from Directive 96/9/EC of the European Parliament and of
125 |      the Council of 11 March 1996 on the legal protection of databases,
126 |      as amended and/or succeeded, as well as other essentially
127 |      equivalent rights anywhere in the world.
128 | 
129 |   k. You means the individual or entity exercising the Licensed Rights
130 |      under this Public License. Your has a corresponding meaning.
131 | 
132 | 
133 | Section 2 -- Scope.
134 | 
135 |   a. License grant.
136 | 
137 |        1. Subject to the terms and conditions of this Public License,
138 |           the Licensor hereby grants You a worldwide, royalty-free,
139 |           non-sublicensable, non-exclusive, irrevocable license to
140 |           exercise the Licensed Rights in the Licensed Material to:
141 | 
142 |             a. reproduce and Share the Licensed Material, in whole or
143 |                in part; and
144 | 
145 |             b. produce, reproduce, and Share Adapted Material.
146 | 
147 |        2. Exceptions and Limitations. For the avoidance of doubt, where
148 |           Exceptions and Limitations apply to Your use, this Public
149 |           License does not apply, and You do not need to comply with
150 |           its terms and conditions.
151 | 
152 |        3. Term. The term of this Public License is specified in Section
153 |           6(a).
154 | 
155 |        4. Media and formats; technical modifications allowed. The
156 |           Licensor authorizes You to exercise the Licensed Rights in
157 |           all media and formats whether now known or hereafter created,
158 |           and to make technical modifications necessary to do so. The
159 |           Licensor waives and/or agrees not to assert any right or
160 |           authority to forbid You from making technical modifications
161 |           necessary to exercise the Licensed Rights, including
162 |           technical modifications necessary to circumvent Effective
163 |           Technological Measures. For purposes of this Public License,
164 |           simply making modifications authorized by this Section 2(a)
165 |           (4) never produces Adapted Material.
166 | 
167 |        5. Downstream recipients.
168 | 
169 |             a. Offer from the Licensor -- Licensed Material. Every
170 |                recipient of the Licensed Material automatically
171 |                receives an offer from the Licensor to exercise the
172 |                Licensed Rights under the terms and conditions of this
173 |                Public License.
174 | 
175 |             b. No downstream restrictions. You may not offer or impose
176 |                any additional or different terms or conditions on, or
177 |                apply any Effective Technological Measures to, the
178 |                Licensed Material if doing so restricts exercise of the
179 |                Licensed Rights by any recipient of the Licensed
180 |                Material.
181 | 
182 |        6. No endorsement. Nothing in this Public License constitutes or
183 |           may be construed as permission to assert or imply that You
184 |           are, or that Your use of the Licensed Material is, connected
185 |           with, or sponsored, endorsed, or granted official status by,
186 |           the Licensor or others designated to receive attribution as
187 |           provided in Section 3(a)(1)(A)(i).
188 | 
189 |   b. Other rights.
190 | 
191 |        1. Moral rights, such as the right of integrity, are not
192 |           licensed under this Public License, nor are publicity,
193 |           privacy, and/or other similar personality rights; however, to
194 |           the extent possible, the Licensor waives and/or agrees not to
195 |           assert any such rights held by the Licensor to the limited
196 |           extent necessary to allow You to exercise the Licensed
197 |           Rights, but not otherwise.
198 | 
199 |        2. Patent and trademark rights are not licensed under this
200 |           Public License.
201 | 
202 |        3. To the extent possible, the Licensor waives any right to
203 |           collect royalties from You for the exercise of the Licensed
204 |           Rights, whether directly or through a collecting society
205 |           under any voluntary or waivable statutory or compulsory
206 |           licensing scheme. In all other cases the Licensor expressly
207 |           reserves any right to collect such royalties.
208 | 
209 | 
210 | Section 3 -- License Conditions.
211 | 
212 | Your exercise of the Licensed Rights is expressly made subject to the
213 | following conditions.
214 | 
215 |   a. Attribution.
216 | 
217 |        1. If You Share the Licensed Material (including in modified
218 |           form), You must:
219 | 
220 |             a. retain the following if it is supplied by the Licensor
221 |                with the Licensed Material:
222 | 
223 |                  i. identification of the creator(s) of the Licensed
224 |                     Material and any others designated to receive
225 |                     attribution, in any reasonable manner requested by
226 |                     the Licensor (including by pseudonym if
227 |                     designated);
228 | 
229 |                 ii. a copyright notice;
230 | 
231 |                iii. a notice that refers to this Public License;
232 | 
233 |                 iv. a notice that refers to the disclaimer of
234 |                     warranties;
235 | 
236 |                  v. a URI or hyperlink to the Licensed Material to the
237 |                     extent reasonably practicable;
238 | 
239 |             b. indicate if You modified the Licensed Material and
240 |                retain an indication of any previous modifications; and
241 | 
242 |             c. indicate the Licensed Material is licensed under this
243 |                Public License, and include the text of, or the URI or
244 |                hyperlink to, this Public License.
245 | 
246 |        2. You may satisfy the conditions in Section 3(a)(1) in any
247 |           reasonable manner based on the medium, means, and context in
248 |           which You Share the Licensed Material. For example, it may be
249 |           reasonable to satisfy the conditions by providing a URI or
250 |           hyperlink to a resource that includes the required
251 |           information.
252 | 
253 |        3. If requested by the Licensor, You must remove any of the
254 |           information required by Section 3(a)(1)(A) to the extent
255 |           reasonably practicable.
256 | 
257 |        4. If You Share Adapted Material You produce, the Adapter's
258 |           License You apply must not prevent recipients of the Adapted
259 |           Material from complying with this Public License.
260 | 
261 | 
262 | Section 4 -- Sui Generis Database Rights.
263 | 
264 | Where the Licensed Rights include Sui Generis Database Rights that
265 | apply to Your use of the Licensed Material:
266 | 
267 |   a. for the avoidance of doubt, Section 2(a)(1) grants You the right
268 |      to extract, reuse, reproduce, and Share all or a substantial
269 |      portion of the contents of the database;
270 | 
271 |   b. if You include all or a substantial portion of the database
272 |      contents in a database in which You have Sui Generis Database
273 |      Rights, then the database in which You have Sui Generis Database
274 |      Rights (but not its individual contents) is Adapted Material; and
275 | 
276 |   c. You must comply with the conditions in Section 3(a) if You Share
277 |      all or a substantial portion of the contents of the database.
278 | 
279 | For the avoidance of doubt, this Section 4 supplements and does not
280 | replace Your obligations under this Public License where the Licensed
281 | Rights include other Copyright and Similar Rights.
282 | 
283 | 
284 | Section 5 -- Disclaimer of Warranties and Limitation of Liability.
285 | 
286 |   a. UNLESS OTHERWISE SEPARATELY UNDERTAKEN BY THE LICENSOR, TO THE
287 |      EXTENT POSSIBLE, THE LICENSOR OFFERS THE LICENSED MATERIAL AS-IS
288 |      AND AS-AVAILABLE, AND MAKES NO REPRESENTATIONS OR WARRANTIES OF
289 |      ANY KIND CONCERNING THE LICENSED MATERIAL, WHETHER EXPRESS,
290 |      IMPLIED, STATUTORY, OR OTHER. THIS INCLUDES, WITHOUT LIMITATION,
291 |      WARRANTIES OF TITLE, MERCHANTABILITY, FITNESS FOR A PARTICULAR
292 |      PURPOSE, NON-INFRINGEMENT, ABSENCE OF LATENT OR OTHER DEFECTS,
293 |      ACCURACY, OR THE PRESENCE OR ABSENCE OF ERRORS, WHETHER OR NOT
294 |      KNOWN OR DISCOVERABLE. WHERE DISCLAIMERS OF WARRANTIES ARE NOT
295 |      ALLOWED IN FULL OR IN PART, THIS DISCLAIMER MAY NOT APPLY TO YOU.
296 | 
297 |   b. TO THE EXTENT POSSIBLE, IN NO EVENT WILL THE LICENSOR BE LIABLE
298 |      TO YOU ON ANY LEGAL THEORY (INCLUDING, WITHOUT LIMITATION,
299 |      NEGLIGENCE) OR OTHERWISE FOR ANY DIRECT, SPECIAL, INDIRECT,
300 |      INCIDENTAL, CONSEQUENTIAL, PUNITIVE, EXEMPLARY, OR OTHER LOSSES,
301 |      COSTS, EXPENSES, OR DAMAGES ARISING OUT OF THIS PUBLIC LICENSE OR
302 |      USE OF THE LICENSED MATERIAL, EVEN IF THE LICENSOR HAS BEEN
303 |      ADVISED OF THE POSSIBILITY OF SUCH LOSSES, COSTS, EXPENSES, OR
304 |      DAMAGES. WHERE A LIMITATION OF LIABILITY IS NOT ALLOWED IN FULL OR
305 |      IN PART, THIS LIMITATION MAY NOT APPLY TO YOU.
306 | 
307 |   c. The disclaimer of warranties and limitation of liability provided
308 |      above shall be interpreted in a manner that, to the extent
309 |      possible, most closely approximates an absolute disclaimer and
310 |      waiver of all liability.
311 | 
312 | 
313 | Section 6 -- Term and Termination.
314 | 
315 |   a. This Public License applies for the term of the Copyright and
316 |      Similar Rights licensed here. However, if You fail to comply with
317 |      this Public License, then Your rights under this Public License
318 |      terminate automatically.
319 | 
320 |   b. Where Your right to use the Licensed Material has terminated under
321 |      Section 6(a), it reinstates:
322 | 
323 |        1. automatically as of the date the violation is cured, provided
324 |           it is cured within 30 days of Your discovery of the
325 |           violation; or
326 | 
327 |        2. upon express reinstatement by the Licensor.
328 | 
329 |      For the avoidance of doubt, this Section 6(b) does not affect any
330 |      right the Licensor may have to seek remedies for Your violations
331 |      of this Public License.
332 | 
333 |   c. For the avoidance of doubt, the Licensor may also offer the
334 |      Licensed Material under separate terms or conditions or stop
335 |      distributing the Licensed Material at any time; however, doing so
336 |      will not terminate this Public License.
337 | 
338 |   d. Sections 1, 5, 6, 7, and 8 survive termination of this Public
339 |      License.
340 | 
341 | 
342 | Section 7 -- Other Terms and Conditions.
343 | 
344 |   a. The Licensor shall not be bound by any additional or different
345 |      terms or conditions communicated by You unless expressly agreed.
346 | 
347 |   b. Any arrangements, understandings, or agreements regarding the
348 |      Licensed Material not stated herein are separate from and
349 |      independent of the terms and conditions of this Public License.
350 | 
351 | 
352 | Section 8 -- Interpretation.
353 | 
354 |   a. For the avoidance of doubt, this Public License does not, and
355 |      shall not be interpreted to, reduce, limit, restrict, or impose
356 |      conditions on any use of the Licensed Material that could lawfully
357 |      be made without permission under this Public License.
358 | 
359 |   b. To the extent possible, if any provision of this Public License is
360 |      deemed unenforceable, it shall be automatically reformed to the
361 |      minimum extent necessary to make it enforceable. If the provision
362 |      cannot be reformed, it shall be severed from this Public License
363 |      without affecting the enforceability of the remaining terms and
364 |      conditions.
365 | 
366 |   c. No term or condition of this Public License will be waived and no
367 |      failure to comply consented to unless expressly agreed to by the
368 |      Licensor.
369 | 
370 |   d. Nothing in this Public License constitutes or may be interpreted
371 |      as a limitation upon, or waiver of, any privileges and immunities
372 |      that apply to the Licensor or You, including from the legal
373 |      processes of any jurisdiction or authority.
374 | 
375 | 
376 | =======================================================================
377 | 
378 | Creative Commons is not a party to its public
379 | licenses. Notwithstanding, Creative Commons may elect to apply one of
380 | its public licenses to material it publishes and in those instances
381 | will be considered the “Licensor.” The text of the Creative Commons
382 | public licenses is dedicated to the public domain under the CC0 Public
383 | Domain Dedication. Except for the limited purpose of indicating that
384 | material is shared under a Creative Commons public license or as
385 | otherwise permitted by the Creative Commons policies published at
386 | creativecommons.org/policies, Creative Commons does not authorize the
387 | use of the trademark "Creative Commons" or any other trademark or logo
388 | of Creative Commons without its prior written consent including,
389 | without limitation, in connection with any unauthorized modifications
390 | to any of its public licenses or any other arrangements,
391 | understandings, or agreements concerning use of licensed material. For
392 | the avoidance of doubt, this paragraph does not form part of the
393 | public licenses.
394 | 
395 | Creative Commons may be contacted at creativecommons.org.
396 | 
397 | 


--------------------------------------------------------------------------------
/language_guides/javascript.md:
--------------------------------------------------------------------------------
  1 | # JavaScript
  2 | 
  3 | _Page maintainer: Ewan Cahen_ [@ewan-escience](https://github.com/ewan-escience)
  4 | 
  5 | [JavaScript](https://en.wikipedia.org/wiki/JavaScript) (JS) is a programming language that is one of the three (together with [HTML](https://en.wikipedia.org/wiki/HTML) and [CSS](https://en.wikipedia.org/wiki/CSS)) core technologies of the web. It is essential if you want to write interactive webpages or web applications, because JavaScript is, apart from [WebAssembly](https://webassembly.org/), the only programming language that runs in modern browsers. Furthermore, JS can also run [outside of the browser](/language_guides/javascript?id=javascript-outside-of-the-browser), e.g. for running short scripts or full-blown servers.
  6 | 
  7 | ## Getting started
  8 | 
  9 | A good introductory tutorial on JavaScript is [this one from W3Schools](https://www.w3schools.com/js/).
 10 | 
 11 | Another source of information for JavaScript (and web development in general) is the [MDN Web Docs](https://developer.mozilla.org/en-US/docs/Learn).
 12 | 
 13 | ## Frameworks
 14 | 
 15 | Many people will jump straight to using a framework when building a web application. We, however, recommend that you learn the fundamentals first and get an impression of what problems frameworks are trying to solve for you. Read, for example, this article on [how the web works](https://developer.mozilla.org/en-US/docs/Learn/Getting_started_with_the_web/How_the_Web_works) and have a look at this [introduction to the DOM](https://developer.mozilla.org/en-US/docs/Web/API/Document_Object_Model/Introduction).
 16 | 
 17 | A good video summary on the history of frameworks and the problems they try to solve can be found [here](https://www.youtube.com/watch?v=EPir6uxr1o8).
 18 | 
 19 | Before you pick a framework, you should first consider what you are trying to build.
 20 | 
 21 | - If you're building a (more traditional) website with mostly static content, like an info page for an event or a blog, whose content doesn't adapt to the visitor, consider using a [static site generator](https://jamstack.org/generators/) like [Jekyll](https://jekyllrb.com/) or [Hugo](https://gohugo.io/), or [Docusaurus](https://docusaurus.io/) for writing documentation. An advantage of this is that static sites can be hosted on [GitHub for free](https://pages.github.com/), which uses Jekyll by default (but you can use other static site generators as well).
 22 | - If you're building a website that is not very interactive, but that many people have to edit, and a static site generator is too technical for them, consider using [WordPress](https://wordpress.org/). Many hosting providers support WordPress out of the box.
 23 | - When you need light interactivity, the options above can be combined with libraries like [jQuery](https://jquery.com/), [Alpine.js](https://alpinejs.dev/) or [htmx](https://htmx.org/), or you can write the JavaScript yourself.
 24 | - When you want to build a website that has high interactivity with its users, something you would call an "application" rather than a "website", consider using [htmx](https://htmx.org/) or one of the JavaScript frameworks below.
 25 | 
 26 | Currently, the most popular frameworks are (ordered by popularity according to the [Stack Overflow 2024 Developer Survey](https://survey.stackoverflow.co/2024/technology#1-web-frameworks-and-technologies)):
 27 | 
 28 | - [React](https://react.dev/)
 29 | - [Angular](https://angular.dev/)
 30 | - [Vue.js](https://vuejs.org/)
 31 | - [Svelte](https://svelte.dev/)
 32 | - [SolidJS](https://www.solidjs.com/)
 33 | 
 34 | ### React
 35 | 
 36 | [React](https://react.dev/) is a framework which can be used to create interactive user interfaces by combining components. It is developed by Meta (formerly Facebook). It is by far the most popular framework, resulting in a huge choice of libraries and a lot of available documentation. Contrary to most other frameworks, React apps are typically written in [JSX](https://react.dev/learn/writing-markup-with-jsx) instead of plain HTML, CSS and JS.
 37 | 
 38 | Where other frameworks like Angular and Vue.js include rendering, routing and state management functionality, React only does rendering, so other libraries must be used for routing and state management.
 39 | [Redux](https://redux.js.org/) can be used to let state changes flow through React components. [React Router](https://reactrouter.com/) can be used to navigate the application using URLs. Or you can use a so-called "[meta-framework](https://prismic.io/blog/javascript-meta-frameworks-ecosystem)" like [Next.js](https://nextjs.org/).
 40 | 
 41 | To create a React application, the official documentation recommends [starting with a meta-framework](https://react.dev/learn/start-a-new-react-project). Alternatively, you can use the tool [Create React App](https://create-react-app.dev/), optionally [with TypeScript](https://create-react-app.dev/docs/getting-started#creating-a-typescript-app).
 42 | 
 43 | ### Angular
 44 | 
 45 | [Angular](https://angular.dev/) is an application framework by Google written in [TypeScript](https://www.typescriptlang.org/). It is a full-blown framework, with many features included. It is therefore used more in enterprises and is probably overkill for your average scientific project. Read more about what Angular is [in the documentation](https://angular.dev/overview).
 46 | 
 47 | To create an Angular application, see the [installation docs](https://angular.dev/installation).
 48 | 
 49 | Angular also has a meta-framework called [Analog](https://analogjs.org/).
 50 | 
 51 | ### Vue.js
 52 | 
 53 | [Vue.js](https://vuejs.org/) is an open-source JavaScript framework for building user interfaces. Read about the use cases for Vue and reasons to use it [in their introduction](https://vuejs.org/guide/introduction.html).
 54 | 
 55 | To create a Vue application, read the [quick start](https://vuejs.org/guide/quick-start). It also has info on using [TypeScript with Vue](https://vuejs.org/guide/typescript/overview).
 56 | 
 57 | A meta-framework for Vue is [Nuxt](https://nuxt.com/).
 58 | 
 59 | ### Svelte
 60 | 
 61 | Svelte is a UI framework that differs from most other frameworks in that it uses a compiler before shipping JavaScript to the client. Svelte applications are written in HTML, CSS and JS. Read more about Svelte in their [overview](https://svelte.dev/docs/svelte/overview).
 62 | 
 63 | In their [documentation](https://svelte.dev/docs/svelte/getting-started), they recommend using their meta-framework [SvelteKit](https://svelte.dev/docs/kit/introduction) to create a Svelte application. It also [supports TypeScript](https://svelte.dev/docs/svelte/typescript).
 64 | 
 65 | ### Solid.js
 66 | 
 67 | A UI framework that focuses on performance and being developer friendly. Like React, it uses [JSX](https://docs.solidjs.com/concepts/understanding-jsx). Read more about Solid [here](https://docs.solidjs.com/).
 68 | 
 69 | To create a Solid application, check out the [quick start](https://docs.solidjs.com/quick-start). They also [support TypeScript](https://docs.solidjs.com/configuration/typescript).
 70 | 
 71 | Solid has a meta-framework called [SolidStart](https://start.solidjs.com/).
 72 | 
 73 | ## JavaScript outside of the browser
 74 | 
 75 | Most JavaScript is run in web browsers, but if you want to run it outside of a browser (e.g. as a server or to run a script locally), you'll need a JavaScript **runtime**. These are the main runtimes available:
 76 | 
 77 | - [Node.js](https://nodejs.org) is the most used runtime, mainly because it was the only available runtime for a long time. As a result, a lot of documentation is available (official and unofficial, e.g. forums) and many tools support Node.js. It comes with a [package manager (npm)](https://www.npmjs.com/) that allows you to install packages from a huge library. Its installation instructions can be found [here](https://nodejs.org/en/learn/getting-started/how-to-install-nodejs).
 78 | - [Deno](https://deno.com/) can be seen as a successor to Node.js and tries to improve on it in a few ways, most notably:
 79 |   - [built-in support](https://docs.deno.com/runtime/fundamentals/typescript/) for TypeScript
 80 |   - a better [security model](https://docs.deno.com/runtime/fundamentals/security/)
 81 |   - built-in tooling, like a [linter and formatter](https://docs.deno.com/runtime/fundamentals/linting_and_formatting/)
 82 |   - [compiling](https://docs.deno.com/runtime/reference/cli/compiler/) to standalone executables
 83 | 
 84 | Its installation instructions can be found [here](https://docs.deno.com/runtime/getting_started/installation/).
 85 | 
 86 | - [Bun](https://bun.sh/) is the youngest runtime of the three. Its focus is on speed, reduced complexity and enhanced developer productivity (read more [here](https://bun.sh/docs)). Just like Deno, it comes with [built-in TypeScript support](https://bun.sh/docs/runtime/typescript) and can [compile to standalone executables](https://bun.sh/docs/bundler/executables). It also aims to be fully [compatible with Node.js](https://bun.sh/docs/runtime/nodejs-apis). Its installation instructions can be found [here](https://bun.sh/docs/installation).
 87 | 
 88 | A more comprehensive comparison can be found [in this guide](https://zerotomastery.io/blog/deno-vs-node-vs-bun-comparison-guide/).
 89 | 
 90 | ### Which runtime to choose?
 91 | 
 92 | To answer this question, you should consider what is important for you and your project.
 93 | 
 94 | Choose Node.js if:
 95 | 
 96 | - you need a stable, mature and well-established runtime with a large community around it;
 97 | - you need to use dependencies that should most likely "just work";
 98 | - you cannot convince the people you work with to install something else;
 99 | - you don't need any particular feature of any of its competitors.
100 | 
101 | Choose Deno if:
102 | 
103 | - you want a relatively mature runtime with a lot of features built in;
104 | - you want out-of-the-box TypeScript support;
105 | - you like its security model;
106 | - you want a complete package with a linter and formatter included;
107 | - you don't mind spending some time if something does not work directly.
108 | 
109 | Choose Bun if:
110 | 
111 | - you are willing to take a risk using a relatively new runtime;
112 | - you want out-of-the-box TypeScript support;
113 | - you want to use one of Bun's particular features;
114 | - you need maximum performance (though you should benchmark for your use case first and consider using a different programming language).
115 | 
116 | ## Editors and IDEs
117 | 
118 | These are some good JavaScript editors:
119 | 
120 | - [WebStorm](https://www.jetbrains.com/webstorm/) by JetBrains. It is free (as in monetary cost) for [non-commercial use](https://www.jetbrains.com/legal/docs/toolbox/license_non-commercial/); otherwise you have to buy a licence. Most of its features are also available in other IDEs of JetBrains, like [IntelliJ IDEA ultimate](https://www.jetbrains.com/idea/), [PyCharm professional](https://www.jetbrains.com/pycharm/) and [Rider](https://www.jetbrains.com/rider/). You can compare the products of JetBrains [here](https://www.jetbrains.com/products/compare/?product=webstorm&product=idea). Note that the free version of WebStorm will [collect data](https://blog.jetbrains.com/blog/2024/10/24/webstorm-and-rider-are-now-free-for-non-commercial-use/#anonymous-data-collection) anonymously, _without_ the option to disable it. WebStorm comes with a lot of [functionality included](https://www.jetbrains.com/webstorm/features/), but also gives access to a [Marketplace of plugins](https://plugins.jetbrains.com/).
121 | - [Visual Studio Code](https://code.visualstudio.com), an open source and free (as in monetary cost) editor by Microsoft. By default, it collects [telemetry data](https://code.visualstudio.com/docs/getstarted/telemetry), but that can be [disabled](https://code.visualstudio.com/docs/getstarted/telemetry#_disable-telemetry-reporting). VSCode has a [limited feature set](https://code.visualstudio.com/docs/editor/whyvscode) out of the box, which can be enhanced with [extensions](https://marketplace.visualstudio.com/vscode).
122 | 
123 | ## Debugging
124 | 
125 | In web development, debugging is typically done in the browser. Read [this article from W3Schools](https://www.w3schools.com/js/js_debugging.asp) for more info.
126 | 
127 | There is documentation for each browser on their [dev tools](https://en.wikipedia.org/wiki/Web_development_tools):
128 | 
129 | - [Firefox](https://firefox-source-docs.mozilla.org/devtools-user/)
130 | - [Chrome](https://developer.chrome.com/docs/devtools)
131 | - [Edge](https://learn.microsoft.com/en-us/microsoft-edge/devtools-guide-chromium/overview)
132 | - [Safari](https://developer.apple.com/safari/tools/)
133 | 
134 | There are also debugging guides for the various JS runtimes:
135 | 
136 | - [Node.js](https://nodejs.org/en/learn/getting-started/debugging)
137 | - [Deno](https://docs.deno.com/runtime/fundamentals/debugging/)
138 | - [Bun](https://bun.sh/docs/runtime/debugger)
139 | 
140 | When using a (meta-)framework, also have a look at its documentation.
141 | 
142 | Sometimes, the JavaScript code in the browser is not an exact copy of the code you see in your development environment, for example because the original source code is minified/uglified or transpiled before it's loaded in the browser.
143 | All major browsers can now deal with this through so-called [source maps](https://web.dev/articles/source-maps), which tell the browser which symbol/line in a JavaScript file corresponds to which line in the human-readable source code.
144 | Look for the 'create sourcemaps' option when using minification/uglification/transpiling tools.
145 | 
146 | ## Hosting data files
147 | 
148 | Web pages (HTML files) that use JavaScript generally cannot be opened directly from a file system URL, because browsers impose security restrictions on such pages.
149 | Instead, you should use a [web server](https://developer.mozilla.org/en-US/docs/Learn/Common_questions/Web_mechanics/What_is_a_web_server) (which may still serve files that are local).
150 | A simple web server can be started from the directory whose files you want to host with:
151 | 
152 | ```bash
153 | python3 -m http.server 8000
154 | ```
155 | 
156 | <!-- the &#104; notation below is to avoid problems with the link checker (broken-link-checker) -->
157 | 
158 | Then open the web browser to &#104;ttp://localhost:8000.
159 | 
160 | ## Documentation :id=js-docs
161 | 
162 | [JSDoc](https://jsdoc.app/) (similar to [JavaDoc](https://www.baeldung.com/javadoc)) parses your JavaScript files and automatically generates HTML documentation, based on the JSDoc comments you put in the code.
163 | 
164 | ## Testing
165 | 
166 | The various runtimes have testing functionality included, so you don't have to install extra dependencies:
167 | 
168 | - [Node.js](https://nodejs.org/en/learn/test-runner/introduction)
169 | - [Deno](https://docs.deno.com/runtime/fundamentals/testing/)
170 | - [Bun](https://bun.sh/guides/test/run-tests)
171 | 
172 | If these don't suffice, a nice overview of popular testing frameworks can be found [here](https://raygun.com/blog/javascript-unit-testing-frameworks/).
173 | 
174 | ### Testing with browsers
175 | 
176 | To test your application by driving a real web browser, use [Selenium](https://www.selenium.dev/).
177 | 
178 | ## Coding style
179 | 
180 | ### Formatters
181 | 
182 | A formatter is a tool to make your source code look consistent and easy to look at. In web development, the most used formatter is [Prettier](https://prettier.io/), which can [integrate with many editors](https://prettier.io/docs/en/editors). You could [set up a GitHub action](https://akhilaariyachandra.com/blog/prettier-in-github-actions) that rejects pull requests that are not formatted properly.
183 | 
184 | When using Deno, you can also use its [built-in formatter](https://docs.deno.com/runtime/fundamentals/linting_and_formatting/#formatting).
185 | 
186 | An alternative to Prettier is [Biome](https://biomejs.dev/), which also includes a linter.
187 | 
188 | In any case, remember to use tabs for indentation for the [purpose of accessibility](https://old.reddit.com/r/javascript/comments/c8drjo/nobody_talks_about_the_real_reason_to_use_tabs/).
189 | 
190 | ### Linters
191 | 
192 | A linter is a tool to check your code quality, in order to prevent bugs. The most used linter is [ESLint](https://eslint.org/). It has [many integrations](https://eslint.org/docs/latest/use/integrations).
193 | 
194 | When using Deno, you can also use its [built-in linter](https://docs.deno.com/runtime/fundamentals/linting_and_formatting/#linting).
195 | 
196 | An alternative to ESLint is [Biome](https://biomejs.dev/), which also includes a formatter.
197 | 
198 | Also have a look at the [Airbnb JavaScript Style Guide](https://github.com/airbnb/javascript) or the W3Schools page on [JavaScript best practices](https://www.w3schools.com/js/js_best_practices.asp).
199 | 
200 | ### Code quality analysis tools and services
201 | 
202 | For more in-depth analyses, you can use a code quality and analysis tool.
203 | 
204 | - [SonarCloud](https://sonarcloud.io) is an open platform to manage code quality which can also show code coverage and count test results over time. It easily [integrates with GitHub](https://github.com/marketplace/sonarcloud).
205 | - [Codacy](https://www.codacy.com) can analyze [many different languages](https://docs.codacy.com/getting-started/supported-languages-and-tools/) using open source tools. It also offers [GitHub integration](https://docs.codacy.com/repositories-configure/integrations/github-integration/).
206 | - [Code climate](https://codeclimate.com/quality) can analyze JavaScript (and Ruby, PHP). Can analyze Java (best supported), C, C++, Python, JavaScript and TypeScript.
207 | 
208 | ## Showing code examples
209 | 
210 | You can use [jsfiddle](https://jsfiddle.net/), which shows you a live preview of your web page while you fiddle with the underlying HTML, JavaScript and CSS code.
211 | 
212 | ## TypeScript
213 | 
214 | https://www.typescriptlang.org/
215 | 
216 | TypeScript is a typed superset of JavaScript which compiles to plain JavaScript. TypeScript adds static typing to JavaScript, which makes it easier to scale a project up, both in number of contributors and in lines of code.
217 | 
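As a minimal sketch of what static typing buys you, the example below annotates a small function. The `Measurement` interface, the `mean` function and the sample data are hypothetical names made up for this illustration; they do not come from any library.

```typescript
// Hypothetical domain type, used only for illustration.
interface Measurement {
  sensor: string;
  value: number;
}

// The annotations document what the function expects and what it returns.
function mean(measurements: Measurement[]): number {
  if (measurements.length === 0) {
    throw new Error("cannot take the mean of an empty list");
  }
  const total = measurements.reduce((sum, m) => sum + m.value, 0);
  return total / measurements.length;
}

const readings: Measurement[] = [
  { sensor: "a", value: 1.5 },
  { sensor: "b", value: 2.5 },
];

console.log(mean(readings)); // prints 2

// The compiler rejects the call below before the code ever runs,
// because a string is not a Measurement[]:
// mean("1.5, 2.5");
```

In plain JavaScript, a mistake like the commented-out call would only surface at run time, possibly far away from where it was made.
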
218 | At the Netherlands eScience Center we prefer TypeScript to JavaScript as it will lead to more sustainable software.
219 | 
220 | This section highlights the differences with JavaScript. For topics without significant differences, like IDEs, code style etc., see the respective JavaScript section.
221 | 
222 | ### Getting Started
223 | 
224 | To learn about TypeScript, the following resources are available:
225 | 
226 | - Official [TypeScript documentation](https://www.typescriptlang.org/docs/) and [tutorial](https://www.typescriptlang.org/docs/handbook/intro.html)
227 | - [Single video tutorial](https://www.youtube.com/watch?v=d56mG7DezGs) and [playlist tutorial](https://www.youtube.com/playlist?list=PL4cUxeGkcC9gUgr39Q_yD6v-bSyMwKPUI)
228 | - Tutorials on debugging TypeScript in [Chrome](https://blog.logrocket.com/how-to-debug-typescript-chrome/) and [Firefox](https://hacks.mozilla.org/2019/09/debugging-typescript-in-firefox-devtools/). If you are using a framework, consult the documentation of that framework for additional ways of debugging
229 | - [The Definitive TypeScript 5.0 Guide](https://www.sitepen.com/blog/update-the-definitive-typescript-guide)
230 | - The [W3Schools TypeScript tutorial](https://www.w3schools.com/typescript/index.php)
231 | 
232 | ### Quickstart
233 | 
234 | To install the TypeScript compiler, check out the [official documentation](https://www.typescriptlang.org/download/). Note that Deno and Bun support TypeScript [out of the box](/language_guides/javascript?id=javascript-outside-of-the-browser).
235 | 
236 | ### Dealing with Types
237 | 
238 | In TypeScript, variables are typed and these types are checked.
239 | This implies that when using libraries, the types of these libraries need to be installed.
240 | More and more libraries ship with type declarations in them so they can be used directly. These libraries will have a `types` (or `typings`) key in their `package.json`.
241 | When a library does not ship with type declarations, the library's `@types/<library-name>` package must be installed using npm:
242 | 
243 | ```shell
244 | npm install --save-dev @types/<library-name>
245 | ```
246 | 
247 | For example, say we want to use the `react` package, which we installed using `npm`:
248 | 
249 | ```shell
250 | npm install react --save
251 | ```
252 | 
253 | To be able to use its functionality in TypeScript we need to install the typings.
254 | 
255 | Install it with:
256 | 
257 | ```shell
258 | npm install --save-dev @types/react
259 | ```
260 | 
261 | The `--save-dev` flag saves this installation to the package.json file as a development dependency.
262 | Do not use `--save` for types because a production build will have been transpiled to JavaScript and has no use for TypeScript types.
263 | 
264 | ### Debugging
265 | 
266 | In web development, debugging is typically done in the browser.
267 | TypeScript cannot be run directly in the web browser, so it must be transpiled to JavaScript. To map a breakpoint in the browser to a line in the original TypeScript file, [source maps](https://www.html5rocks.com/en/tutorials/developertools/sourcemaps/) are required. Most frameworks have a project build system which generates source maps. For more info, see the [JavaScript section on debugging](/language_guides/javascript?id=debugging).
268 | 
269 | ### Documentation
270 | 
271 | Just like [JSDoc](/language_guides/javascript?id=js-docs) for JavaScript, [TypeDoc](https://typedoc.org/) can automatically generate HTML documentation for your code.
272 | 
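As an illustration, a documentation comment that a tool like TypeDoc can pick up might look like the sketch below. The function `celsiusToKelvin` is a hypothetical example written for this guide; the `@param`, `@returns` and `@throws` tags are standard doc-comment tags, and the parameter and return types are taken from the TypeScript signature rather than repeated in the comment.

```typescript
/**
 * Converts a temperature from degrees Celsius to Kelvin.
 *
 * @param celsius - Temperature in degrees Celsius.
 * @returns The same temperature in Kelvin.
 * @throws RangeError if the input is below absolute zero.
 */
export function celsiusToKelvin(celsius: number): number {
  // -273.15 degrees Celsius is absolute zero; anything lower is not physical.
  if (celsius < -273.15) {
    throw new RangeError("temperature below absolute zero");
  }
  return celsius + 273.15;
}
```

The function is exported on purpose: documentation generators usually describe the public (exported) API of a module.
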


--------------------------------------------------------------------------------
/language_guides/ccpp.md:
--------------------------------------------------------------------------------
  1 | # C and C++
  2 | 
  3 | _Page maintainer: Johan Hidding_ [@jhidding](https://github.com/jhidding)
  4 | 
  5 | C++ is one of the hardest languages to learn. Entering a project where C++ coding is needed should not be taken lightly. This guide focusses on tools and documentation for use of C++ in an open-source environment.
  6 | 
  7 | ### Standards
  8 | 
  9 | The first standardised version of C++ is from 1998. Since C++11, a new standard has been ratified every three years (C++14, C++17, C++20 and C++23). With these updates (especially the 2011 one) the preferred style of C++ changed drastically. As a result, a program written in 1998 looks very different from one written in 2018, but it still compiles. There are many videos on YouTube describing some of these changes and how they can be used to make your code look better (i.e. more maintainable). This goes with a warning: Don't try to be too smart; other people still have to understand your code.
 10 | 
 11 | ## Practical use
 12 | 
 13 | ### Compilers
 14 | 
 15 | There are two mainstream open-source C++ compilers.
 16 | 
 17 | - [GCC](https://gcc.gnu.org/)
 18 | - [LLVM - CLANG](http://llvm.org/)
 19 | 
 20 | Overall, these compilers are more or less similar in terms of features, language support, compile times and (perhaps most importantly) performance of the generated binaries.
 21 | The generated binary performance does differ for specific algorithms.
 22 | See for instance [this Phoronix benchmark for a comparison of GCC 9 and Clang 7/8](https://www.phoronix.com/scan.php?page=article&item=gcc9-stage3-skylake).
 23 | 
 24 | macOS (Xcode) has a custom branch of `clang`, which misses some features like OpenMP support, and its own libcxx, which misses some standard library things like the very useful `std::filesystem` module.
 25 | It is nevertheless recommended to use it as much as possible to maintain binary compatibility with the rest of macOS.
 26 | 
 27 | If you need every last erg of performance, some cluster environments have the Intel compiler installed.
 28 | 
 29 | These compilers come with a lot of options. Some basic literacy in GCC and CLANG:
 30 | 
 31 | - `-O` changes optimisation levels
 32 | - `-std=c++xx` sets the C++ standard used
 33 | - `-I*path*` add path to search for include files
 34 | - `-o*file*` output file
 35 | - `-c` only compile, do not link
 36 | - `-Wall` be more verbose with warnings
 37 | 
 38 | And linker flags:
 39 | 
 40 | - `-l*library*` links to a library
 41 | - `-L*path*` add path to search for libraries
 42 | - `-shared` make a shared library
 43 | - `-Wl,-z,defs` ensures all symbols are accounted for when linking to a shared object
 44 | 
 45 | ### Interpreter
 46 | 
 47 | There **is** a C++ interpreter called [Cling](https://rawgit.com/vgvassilev/cling/master/www/index.html).
 48 | This also comes with a [Jupyter notebook kernel](http://jupyter.org/try).
 49 | 
 50 | ### Build systems
 51 | 
 52 | There are several build systems that handle C/C++.
 53 | Currently, [the CMake system is most popular](https://www.jetbrains.com/research/devecosystem-2018/cpp/).
 54 | It is not actually a build system itself; it generates build files based on (in theory) platform-independent and compiler-independent configuration files.
 55 | It can generate Makefiles, but also [Ninja](https://ninja-build.org/) files, which gives much faster build times, NMake files for Windows and more.
 56 | Some popular IDEs have built-in support for CMake, or are even completely built around it ([CLion](http://www.jetbrains.com/clion/)).
 57 | The major drawback of CMake is the confusing documentation, but this is generally made up for in terms of community support.
 58 | When Googling for ways to write your CMake files, make sure you look for "modern CMake", which is a style that has been gaining traction in the last few years and makes everything better (e.g. dependency management, but also just the CMake files themselves).
 59 | 
 60 | Traditionally, the Autotools suite (Autoconf and Automake) was _the_ way to build things on Unix; you'll probably know the three-command salute:
 61 | 
 62 |     > ./configure --prefix=~/.local
 63 |         ...
 64 |     > make -j4
 65 |         ...
 66 |     > make install
 67 | 
 68 | With either one of these two (CMake or Autotools), any moderately experienced user should be able to compile your code (if it compiles).
 69 | 
 70 | There are many other systems.
 71 | Microsoft Visual Studio has its own project model / build system and a library like Qt also forces its own build system on you.
 72 | We do not recommend these if you don't also supply an option for building with CMake or Autotools.
 73 | Another modern alternative that has been gaining attention mainly in the GNU/Gnome/Linux world is [Meson](http://mesonbuild.com/), which is also based on [Ninja](https://ninja-build.org/).
 74 | 
 75 | ### Package management
 76 | 
 77 | There is no standard package manager like `pip`, `npm` or `gem` for C++.
 78 | This means that you will have to choose depending on your particular circumstances what tool to use for installing libraries and, possibly, packaging the tools you yourself built.
 79 | Some important factors include:
 80 | 
 81 | - Whether or not you have root/admin access to your system
 82 | - What kind of environment/ecosystem you are working in. For instance:
 83 |   - There are many tools targeted specifically at HPC/cluster environments.
 84 |   - Specific communities (e.g. NLP research or bioinformatics) may have gravitated towards specific tools, so you'll probably want to use those for maximum impact.
 85 | - Whether software is packaged at all; many C/C++ tools only come in source form, hopefully with [build setup configuration](#build-systems).
 86 | 
 87 | #### Yes root access
 88 | 
 89 | If you have root/admin access to your system, the first go-to for libraries may be your OS package manager.
 90 | If the target package is not in there, try to see if there is an equivalent library that is, and see what kind of software uses it.
 91 | 
 92 | #### No root access
 93 | 
 94 | A good, cross-platform option nowadays is to use [`miniconda`](https://conda.io/miniconda.html), which works on Linux, macOS and Windows.
 95 | The `conda-forge` channel especially has a lot of C++ libraries.
 96 | Specify that you want to use this channel with command line option `-c conda-forge`.
 97 | The `bioconda` channel in turn builds upon the `conda-forge` libraries, hosting a lot of bioinformatics tools.
 98 | 
 99 | #### Managing non-packaged software
100 | 
101 | If you do have to install a program that depends on a specific version of a library, which in turn depends on a specific version of another library, you enter what is called _dependency hell_.
102 | Some agility in compiling and installing libraries is essential.
103 | 
104 | You can install libraries in `/usr/local` or in `${HOME}/.local` if you aren't root, but there you have no package management.
105 | 
106 | Many HPC administrations provide [environment modules](https://modules.readthedocs.io/en/latest/) (`module avail`), which allow you to easily populate your `$PATH` and other environment variables to find the respective package. You can also write your own module files to solve your _dependency hell_.
107 | 
108 | A lot of libraries come with a package description for `pkg-config`.
109 | These descriptions are installed in `/usr/lib/pkgconfig`.
110 | You can point `pkg-config` to your additional libraries by setting the `PKG_CONFIG_PATH` environment variable.
111 | This also helps for instance when trying to automatically locate dependencies from CMake, which has `pkg-config` support as a fallback for when libraries don't support CMake's `find_package`.
112 | 
113 | If you want to keep things organized on systems where you use multiple versions of the same software for different projects, a simple solution is to use something like `xstow`.
114 | [XStow](http://xstow.sourceforge.net/) is a poor man's package manager.
115 | You install each library in its own directory (`~/.local/pkg/<package>` for instance), then running `xstow` will create symlinks to the files in the `~/.local` directory (one above the XStow package directory).
116 | Using XStow in this way allows you to keep a single additional search path when compiling your next library.
117 | 
118 | #### Packaging software
119 | 
120 | In case you find the manual compilation too cumbersome, or want to conveniently distribute software (your own or perhaps one of your project's dependencies that the author did not package themselves), you'll have to build your own package.
121 | The above solutions are good defaults for this, but there are some additional options that are widely used.
122 | 
123 | - For distribution to root/admin users: system package managers (Linux: `apt`, `yum`, `pacman`, macOS: Homebrew, Macports)
124 | - For distribution to any users: [Conda](https://conda.io/miniconda.html) and [Conan](https://conan.io/) are cross-platform (Linux, macOS, Windows)
125 | - For distribution to HPC/cluster users: see options below
126 | 
127 | When choosing which system to build your package for, it is important to consider your target audience.
128 | If any of these tools are already widely used in your audience, pick that one.
129 | If not, it is really up to your personal preferences, as all tools have their pros and cons.
130 | Some general guidelines could be:
131 | 
132 | - prefer multi-platform over single platform
133 | - prefer widely used over obscure (even if it's technically magnificent, if nobody uses it, it's useless for distributing your software)
134 | - prefer multi-language over single language (especially for C++, because it is so often used to build libraries that power higher level languages)
135 | 
136 | But, as the state of the package management ecosystem shows, in practice, there will be many exceptions to these guidelines.
137 | 
138 | #### HPC/cluster environments
139 | 
140 | If the system uses `module`, one way to get packages that are missing is to use [EasyBuild](https://easybuild.readthedocs.io/en/latest/), which makes installing modules in your home directory quite easy.
141 | Many recipes (called Easyblocks) for building packages or whole toolchains are [available online](https://easybuild.readthedocs.io/en/latest/version-specific/Supported_software.html).
142 | These are written in Python.
143 | 
144 | A similar package that is used a lot in the bioinformatics community is [guix](https://hpc.guix.info/).
145 | With guix, you can create virtual environments, much like those in Python `virtualenv` or Conda.
146 | You can also create relocatable binaries to use your binaries on systems that do not have guix installed.
147 | This makes it easy to test your packages on your laptop before deploying to a cluster system.
148 | 
149 | A package that is gaining traction at the moment for HPC environments is [spack](https://spack.readthedocs.io/en/latest/).
150 | Spack allows you to pick from many compilers. When installing packages, it compiles every package from scratch. This allows you to tailor compilation flags and such to take fullest advantage of your cluster's hardware, which can be essential in HPC situations.
151 | 
152 | #### Near future: Modules
153 | 
154 | Note that C++20 introduced Modules, which can be used as an alternative to including (precompiled) header files.
155 | As compiler and build system support matures, this should allow for easier packaging and will probably cause the package management landscape to change considerably.
156 | For this reason, it may be wise at this time to keep your options open and keep an eye on developments within the different package management solutions.
157 | 
158 | ### Editors
159 | 
160 | This is largely a matter of taste, but not always.
161 | 
162 | In theory, given that there are many good command line tools available for working with C(++) code, any code editor will do to write C(++).
163 | Some people also prefer to avoid relying on IDEs too much; by helping your memory they can also help you to write less maintainable code.
164 | People of this persuasion would usually recommend any of the following editors:
165 | 
166 | - Vim, recommended plugins:
167 |   - [NERDTree](https://github.com/scrooloose/nerdtree) file explorer.
168 |   - [editorconfig](https://github.com/editorconfig/editorconfig-vim)
169 |   - [stl.vim](https://www.vim.org/scripts/script.php?script_id=4293) adds STL to syntax highlighting
170 |   - [Syntastic](https://github.com/scrooloose/syntastic)
171 |   - Integrated debugging using [Clewn](http://clewn.sourceforge.net/)
172 | - Emacs:
173 |   - Has GDB mode for debugging.
174 | - More modern editors: Atom / Sublime Text / VS Code
175 |   - Rich plugin ecosystem
176 |   - Easier on the eyes... I mean modern OS/GUI integration
177 | 
178 | In practice, sometimes you run into large/complex existing projects and navigating these can be really hard, especially when you just start working on the project.
179 | In these cases, an IDE can really help.
180 | Intelligent code suggestions, easy jumping between code segments in different files, integrated debugging, testing, VCS, etc. can make the learning curve a lot less steep.
181 | Good/popular IDEs are:
182 | 
183 | - CLion
184 | - Visual Studio (Windows only, but many people swear by it)
185 | - Eclipse
186 | 
187 | ### Code and program quality analysis
188 | 
189 | C++ (and C) compilers come with built-in linters and tools to check that your program runs correctly; make sure you use them. To find issues, it is probably a good idea to use both GCC and Clang (and maybe the valgrind memcheck tool too), because they tend to detect different problems.
190 | 
191 | #### Automatic Formatting with clang-format
192 | 
193 | While most IDEs and some editors offer automatic formatting of files, [clang-format](http://clang.llvm.org/docs/ClangFormat.html) is a standalone tool that offers sensible defaults and a huge range of customisation options. Integrating it into the CI workflow guarantees that checked-in code adheres to formatting guidelines.
194 | 
195 | #### Static code analysis with GCC
196 | 
197 | To use the GCC linter, use the following set of compiler flags when compiling C++ code:
198 | 
199 | ```
200 | -O2 -Wall -Wextra -Wcast-align -Wcast-qual -Wctor-dtor-privacy -Wdisabled-optimization -Wformat=2
201 | -Winit-self -Wlogical-op -Wmissing-declarations -Wmissing-include-dirs -Wnoexcept -Wold-style-cast
202 | -Woverloaded-virtual -Wredundant-decls -Wshadow -Wsign-conversion -Wsign-promo -Wstrict-null-sentinel
203 | -Wstrict-overflow=5 -Wswitch-default -Wundef -Wno-unused
204 | ```
205 | 
206 | and these flags when compiling C code:
207 | 
208 | ```
209 | -O2 -Wall -Wextra -Wformat-nonliteral -Wcast-align -Wpointer-arith -Wbad-function-cast
210 | -Wmissing-prototypes -Wstrict-prototypes -Wmissing-declarations -Winline -Wundef
211 | -Wnested-externs -Wcast-qual -Wshadow -Wwrite-strings -Wno-unused-parameter
212 | -Wfloat-equal
213 | ```
214 | 
215 | Use at least optimization level 2 (`-O2`) to have GCC perform code analysis up to a level where you get all warnings. Use the `-Werror` flag to turn warnings into errors, i.e. your code won't compile if you have warnings. See this [post](https://stackoverflow.com/questions/5088460/flags-to-enable-thorough-and-verbose-g-warnings) for an explanation of why this is a reasonable selection of warning flags.
216 | 
217 | #### Static code analysis with Clang (LLVM)
218 | 
219 | Clang has the very convenient flag
220 | 
221 | ```
222 | -Weverything
223 | ```
224 | 
225 | A good strategy is probably to start out using this flag and then disable any warnings that you do not find useful.
226 | 
227 | #### Static code analysis with cppcheck
228 | 
229 | An additional good tool that detects many issues is cppcheck. Most editors/IDEs have plugins to use it automatically.
230 | 
231 | #### Dynamic program analysis using `-fsanitize`
232 | 
233 | Both GCC and Clang allow you to compile your code with the `-fsanitize=` flag, which will instrument your program to detect various errors quickly. The most useful option is probably
234 | 
235 | ```
236 | -fsanitize=address -O2 -fno-omit-frame-pointer -g
237 | ```
238 | 
239 | which is a fast memory error detector. There are also other options available like `-fsanitize=thread` and `-fsanitize=undefined`. See the GCC man page or the [Clang online manual](https://clang.llvm.org/docs/index.html) for more information.
240 | 
241 | #### Dynamic program analysis using the valgrind suite of tools
242 | 
243 | The [valgrind suite of tools](http://valgrind.org/info/tools.html) has tools similar to what is provided by the `-fsanitize` compiler flag as well as various profiling tools. Using the valgrind tool memcheck to detect memory errors is typically slower than using the compiler-provided option, so this might be something you will want to do less often. You will probably want to compile your code with debug symbols enabled (`-g`) in order to get useful output with memcheck. When using the profilers, keep in mind that a [statistical profiler](https://en.wikipedia.org/wiki/Profiling_%28computer_programming%29#Statistical_profilers) may give you more realistic results.
244 | 
245 | ### Automated code refactoring
246 | 
247 | Sometimes you have to make small changes across large parts of your code base, for example when you move from one standard to another or change a function definition. Although this can be accomplished with a `sed` command using regular expressions, this approach is dangerous if you use macros, if your code is not formatted consistently, and so on. [Clang-tidy](https://clang.llvm.org/extra/clang-tidy/) can do these things and many more by using the abstract syntax tree of the compiler instead of the source code files to refactor your code, which makes it both much more robust and more powerful.
248 | 
249 | ### Debugging
250 | 
251 | Most of your time programming C(++) will probably be spent on debugging.
252 | At some point, surrounding every line of your code with `printf("here %d", i++);` will no longer avail you and you will need a more powerful tool.
253 | With a debugger, you can inspect the program while it is running.
254 | You can pause it, either at random points when you feel like it or, more usually, at so-called breakpoints that you specified in advance, for instance at a certain line in your code, or when a certain function is called.
255 | When paused, you can inspect the current values of variables, manually step forward in the code line by line (or by function, or to the next breakpoint) and even change values and continue running.
256 | Learning to use these powerful tools is a very good time investment.
257 | There are some really good CppCon videos about debugging on YouTube.
258 | 
259 | - GDB - the GNU Debugger, many graphical front-ends are based on GDB.
260 | - LLDB - the LLVM debugger. This is the go-to GDB alternative for the LLVM toolchain, especially on macOS where GDB is hard to set up.
261 | - DDD - primitive GUI frontend for GDB.
262 | - The IDEs mentioned above either have custom built-in debuggers or provide an interface to GDB or LLDB.
263 | 
264 | ## Libraries
265 | 
266 | Historically, many C and C++ projects have seemed rather hesitant about using external dependencies (perhaps due to the poor dependency management situation mentioned above).
267 | However, many good (scientific) computing libraries are available today that you should consider using if applicable.
268 | Here follows a list of libraries that we recommend and/or have experience with.
269 | These can typically be installed from a wide range of [package managers](#package-management).
270 | 
271 | ### Usual suspects
272 | 
273 | These scientific libraries are well known, widely used and have a lot of good online documentation.
274 | 
275 | - [GNU Scientific library (GSL)](https://www.gnu.org/software/gsl/doc/html/index.html)
276 | - [FFTW](http://www.fftw.org): Fastest Fourier Transform in the West
277 | - [OpenMPI](https://www.open-mpi.org). Use with caution, since it will strongly define the structure of your code, which may or may not be desirable.
278 | 
279 | ### Boost
280 | 
281 | This is what the Google style guide has to say about Boost:
282 | 
283 | > - **Definition:** The Boost library collection is a popular collection of peer-reviewed, free, open-source C++ libraries.
284 | > - **Pros:** Boost code is generally very high-quality, is widely portable, and fills many important gaps in the C++ standard library, such as type traits and better binders.
285 | > - **Cons:** Some Boost libraries encourage coding practices which can hamper readability, such as metaprogramming and other advanced template techniques, and an excessively "functional" style of programming.
286 | 
287 | As a general rule, don't use Boost when there is equivalent STL functionality.
288 | 
289 | ### xtensor
290 | 
291 | [xtensor](http://github.com/xtensor-stack/xtensor) is a modern (C++14) N-dimensional tensor (array, matrix, etc) library for numerical work in the style of Python's NumPy.
292 | It aims for maximum performance (and in most cases it succeeds) and has an active development community.
293 | This library features, among other things:
294 | 
295 | - Lazy-evaluation: only calculate when necessary.
296 | - Extensible template expressions: automatically optimize many subsequent operations into one "kernel".
297 | - NumPy style syntax, including broadcasting.
298 | - C++ STL style interfaces for easy integration with STL functionality.
299 | - [Very low-effort integration with today's main data science languages Python](https://blog.esciencecenter.nl/irregular-data-in-pandas-using-c-88ce311cb9ef?gi=23ebfce3ae77), R and Julia.
300 |   This all makes xtensor a very interesting choice compared to similar older libraries like Eigen and Armadillo.
301 | 
302 | ### General purpose, I/O
303 | 
304 | - Configuration file reading and writing:
305 |   - [yaml-cpp](https://github.com/jbeder/yaml-cpp): A YAML parser and emitter in C++
306 |   - [JSON for Modern C++](https://nlohmann.github.io/json/)
307 | - Command line argument parsing:
308 |   - [argagg](https://github.com/vietjtnguyen/argagg)
309 |   - [Clara](https://github.com/catchorg/Clara)
310 | - [fmt](https://github.com/fmtlib/fmt): pythonic string formatting
311 | - [hdf5](https://github.com/HDFGroup/hdf5): The popular HDF5 binary format C++ interface.
312 | 
313 | ### Parallel processing
314 | 
315 | - [oneAPI Threading Building Blocks](https://oneapi-src.github.io/oneTBB/) (oneTBB): template library for task parallelism
316 | - [ZeroMQ](http://zeromq.org): lower level flexible communication library with a unified interface for message passing between threads and processes, but also between separate machines via TCP.
317 | 
318 | ## Style
319 | 
320 | ### Style guides
321 | 
322 | Good style is not just about layout and linting on trailing whitespace. It can mean the difference between blazing fast code and broken code.
323 | 
324 | - [C++ Core Guidelines](http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines)
325 | - [Guidelines Support Library](https://github.com/Microsoft/GSL)
326 | - [Google Style Guide](https://google.github.io/styleguide/cppguide.html)
327 | - [Google Style Guide - github](https://github.com/google/styleguide) Contains the CppLint linter.
328 | 
329 | ### Project layout
330 | 
331 | A C++ project will usually have directories `/src` for source code, `/doc` for Doxygen output, `/test` for testing code. Some people like to put header files in `/include`. In C++ though, many header files will contain functioning code (templates and inline functions). This makes the separation between code and interface a bit murky.
332 | In this case, it can make more sense to put headers and implementation in the same tree, but different communities will have different opinions on this.
333 | A third option that is sometimes used is to make separate "template implementation" header files.
334 | 
335 | ## Sustainability
336 | 
337 | ### Testing
338 | 
339 | Use [Google Test](https://github.com/google/googletest).
340 | It is light-weight, good and is used a lot.
341 | [Catch2](https://github.com/catchorg/Catch2) is also pretty good, well maintained and has native support in the CLion
342 | IDE.
343 | 
344 | ### Documentation
345 | 
346 | Use [Doxygen](http://www.doxygen.nl/). It is the de-facto standard way of inlining documentation into comment sections of your code. The output is very ugly. Mini-tutorial: run `doxygen -g` (preferably inside a `doc` folder) in a new project to set things up; from then on, run `doxygen` to (re-)generate the documentation.
347 | 
348 | A newer but less mature option is [cldoc](http://jessevdk.github.io/cldoc/).
349 | 
350 | ## Resources
351 | 
352 | ### Online
353 | 
354 | - [CppCon videos](https://www.youtube.com/user/CppCon): Many really good talks recorded at the various CppCon meetings.
355 | - [CppReference.com](http://en.cppreference.com/w/)
356 | - [C++ Annotations](http://www.icce.rug.nl/documents/cplusplus/)
357 | - [CPlusPlus.com](http://www.cplusplus.com/)
358 | - [Modern C++, according to Microsoft](https://msdn.microsoft.com/en-us/library/hh279654.aspx)
359 | 
360 | ### Books
361 | 
362 | - Bjarne Stroustrup - The C++ Programming Language
363 | - Scott Meyers - Effective Modern C++
364 | 


--------------------------------------------------------------------------------
/language_guides/python.md:
--------------------------------------------------------------------------------
  1 | # Python
  2 | 
  3 | _Page maintainer: Bouwe Andela_ [@bouweandela](https://github.com/bouweandela)
  4 | 
  5 | Python is the "dynamic language of choice" of the Netherlands eScience Center.
  6 | We use it for data analysis and data science projects, and for many other types of projects: workflow management, visualization, natural language processing, web-based tools and much more.
  7 | It is a good default choice for many kinds of projects due to its generic nature, its large and broad ecosystem of third-party modules and its compact syntax which allows for rapid prototyping.
  8 | It is not the language of maximum performance, although in many cases performance critical components can be easily replaced by modules written in faster, compiled languages like C(++) or Cython.
  9 | 
 10 | The philosophy of Python is summarized in the [Zen of Python](https://www.python.org/dev/peps/pep-0020/).
 11 | In Python, this text can be retrieved with the `import this` command.
 12 | 
 13 | ## Project setup
 14 | 
 15 | When starting a new Python project, consider using our [Python template](https://github.com/NLeSC/python-template). This template provides a basic project structure, so you can spend less time setting up and configuring your new Python packages, and comply with the software guide right from the start.
 16 | 
 17 | ## Use Python 3, avoid 2
 18 | 
 19 | Python 2 and Python 3 have co-existed for a long time, but [starting from 2020, development of Python 2 is officially abandoned](https://www.python.org/doc/sunset-python-2/), meaning Python 2 will no longer be improved, even in case of security issues.
 20 | If you are creating a new package, use Python 3.
 21 | It is possible to write Python that is both Python 2 and Python 3 compatible (e.g. using [Six](https://pypi.org/project/six/)), but only do this when you are 100% sure that your package won't be used otherwise.
 22 | If you need Python 2 because of old, incompatible Python 2 libraries, strongly consider upgrading those libraries to Python 3 or replacing them altogether.
 23 | Building and/or using Python 2 is probably discouraged even more than, say, using Fortran 77, since at least Fortran 77 compilers are still being maintained.
 24 | 
 25 | - [Six](https://pypi.org/project/six/): Python 2 and 3 Compatibility Library
 26 | - [2to3](https://docs.python.org/2/library/2to3.html): Automated Python 2 to 3 code translation
 27 | - [python-modernize](https://github.com/mitsuhiko/python-modernize): wrapper around 2to3
 28 | 
 29 | ## Learning Python
 30 | 
 31 | - A popular way to learn Python is by doing it the hard way at http://learnpythonthehardway.org/
 32 | - Using [`pylint`](https://www.pylint.org) and [`yapf`](https://github.com/google/yapf) while learning Python is an easy way to get familiar with best practices and commonly used coding styles
 33 | 
 34 | ## Dependencies and package management
 35 | 
 36 | To install Python packages use `pip` or `conda` (or both, see also [what is the difference between pip and conda?](http://stackoverflow.com/questions/20994716/what-is-the-difference-between-pip-and-conda)).
 37 | 
 38 | If you are planning on distributing your code at a later stage, be aware that your choice of package management may affect your packaging process. See [Building and packaging](#building-and-packaging-code) for more info.
 39 | 
 40 | ### Use virtual environments
 41 | 
 42 | We strongly recommend creating isolated "virtual environments" for each Python project.
 43 | These can be created with `venv` or with `conda`.
 44 | Advantages over installing packages system-wide or in a single user folder:
 45 | 
 46 | - Installs Python modules when you are not root.
 47 | - Contains all Python dependencies so the environment keeps working after an upgrade.
 48 | - Keeps environments clean for each project, so you don't get more than you need (and can easily reproduce that minimal working situation).
 49 | - Lets you select the Python version per environment, so you can test code compatibility between Python versions
 50 | 
 51 | ### Pip + a virtual environment
 52 | 
 53 | If you don't want to use `conda`, create isolated Python environments with the standard library [`venv`](https://docs.python.org/3/library/venv.html) module.
 54 | If you are still using Python 2, [`virtualenv`](https://virtualenv.pypa.io/en/latest/) and [`virtualenvwrapper`](https://virtualenvwrapper.readthedocs.org) can be used instead.
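
As a minimal sketch of what `venv` does, the same module can also be driven from Python itself. The helper name `create_environment` and the `.venv` directory name are assumptions made for this illustration; on the command line you would normally run `python -m venv .venv` instead.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(env_dir: Path) -> Path:
    """Create a virtual environment and return the path to its config file."""
    # with_pip=False keeps this sketch fast and offline;
    # use with_pip=True to bootstrap pip into the new environment.
    venv.create(env_dir, with_pip=False)
    return env_dir / "pyvenv.cfg"


# Create the environment in a throwaway directory (hypothetical location).
with tempfile.TemporaryDirectory() as tmp:
    config_file = create_environment(Path(tmp) / ".venv")
    # pyvenv.cfg marks the directory as a virtual environment.
    environment_created = config_file.is_file()

print(environment_created)
```

Everything installed with that environment's `pip` ends up inside the environment directory, which is why deleting the directory is enough to start over.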
 55 | 
 56 | With `venv` and `virtualenv`, `pip` is used to install all dependencies. An increasing number of packages are using [`wheel`](http://pythonwheels.com), so `pip` downloads and installs them as binaries. This means they have no build dependencies and are much faster to install.
 57 | 
 58 | If the installation of a package fails because of its non-Python extensions or system library dependencies and you are not root, you could switch to `conda` (see below).
 59 | 
 60 | ### Conda
 61 | 
 62 | [Conda](http://conda.pydata.org/docs/) can be used instead of venv and pip, since it is both an environment manager and a package manager. It easily installs binary dependencies, like Python itself or system libraries.
 63 | Installation of packages that are not using `wheel`, but have a lot of non-Python code, is much faster with Conda than with `pip` because Conda does not compile the package, it only downloads compiled packages.
 64 | The disadvantage of Conda is that the package needs to have a Conda build recipe.
 65 | Many Conda build recipes already exist, but they are less common than the `setuptools` configuration that generally all Python packages have.
 66 | 
 67 | There are two main "official" distributions of Conda: [Anaconda](https://docs.anaconda.com/anaconda/install/) and [Miniconda](https://docs.conda.io/projects/miniconda/en/latest/) (and variants of the latter like miniforge, explained below).
 68 | Anaconda is large and contains a lot of common packages, like numpy and matplotlib, whereas Miniconda is very lightweight and only contains Python. If you need more, the `conda` command acts as a package manager for Python packages.
 69 | If installation with the `conda` command is too slow for your purposes, it is recommended that you use [`mamba`](https://github.com/mamba-org/mamba) instead.
 70 | 
 71 | For environments where you do not have admin rights (e.g. DAS-6) either Anaconda or Miniconda is highly recommended since the installation is very straightforward.
 72 | The installation of packages through Conda is very robust.
 73 | 
 74 | A possible downside of Anaconda is the fact that this is offered by a commercial supplier, but we don't foresee any vendor lock-in issues, because all packages are open source and can still be obtained elsewhere.
 75 | Do note that since 2020, [Anaconda has started to ask money from large institutes](https://www.anaconda.com/blog/anaconda-commercial-edition-faq) for downloading packages from their [main channel (called the `defaults` channel)](https://docs.conda.io/projects/conda/en/latest/user-guide/concepts/channels.html#what-is-a-conda-channel) through `conda`.
 76 | This does not apply to universities and most research institutes, but could apply to some government institutes that also perform research and definitely applies to large for-profit companies.
 77 | Be aware of this when choosing the distribution channel for your package.
 78 | An alternative, community-driven Conda distribution that avoids this problem altogether because it only installs packages from `conda-forge` by default is [miniforge](https://github.com/conda-forge/miniforge).
 79 | Miniforge includes both the faster `mamba` as well as the traditional `conda`.
 80 | 
 81 | ## Building and packaging code
 82 | 
 83 | ### Making an installable package
 84 | 
 85 | To create an installable Python package you will have to create a `pyproject.toml` file.
 86 | This will contain three kinds of information: metadata about your project, information on how to build and install your package, and configuration settings for any tools your project may use. Our [Python template](https://github.com/NLeSC/python-template) already does this for you.
 87 | 
 88 | #### Project metadata
 89 | 
 90 | Your project metadata will be under the `[project]` header, and includes such information as the name, version number, description and dependencies.
 91 | The [Python Packaging User Guide](https://packaging.python.org/en/latest/specifications/pyproject-toml/#declaring-project-metadata-the-project-table) has more information on what else can or should be added here.
 92 | For your dependencies, you should keep version constraints to a minimum; use, in order of descending preference: no constraints, lower bounds, lower + upper bounds, exact versions.
 93 | Use of `requirements.txt` is discouraged, unless necessary for something specific, see the [discussion here](https://github.com/NLeSC/guide/issues/156).
 94 | 
 95 | It is best to keep track of direct dependencies for your project from the start and list these in your `pyproject.toml`.
 96 | If instead you are writing a new `pyproject.toml` for an existing project, a recommended way to find all direct dependencies is by running your code in a clean environment (probably by running your test suite) and installing one by one the dependencies that are missing, as reported by the ensuing errors.
 97 | It is possible to find the full list of currently installed packages with `pip freeze` or `conda list`, but note that this is not ideal for listing dependencies in `pyproject.toml`, because it also lists all dependencies of the dependencies that you use.
 98 | 
 99 | #### Build system
100 | 
101 | Besides specifying your project's own metadata, you also have to specify a build-system under the `[build-system]` header.
102 | We currently recommend using [`hatchling`](https://pypi.org/project/hatchling/) or [`setuptools`](https://setuptools.pypa.io/en/latest/build_meta.html).
103 | Note that Python's build system landscape is still in flux, so be sure to look up current practices in the [packaging guide's section on build backends](https://packaging.python.org/en/latest/tutorials/packaging-projects/#choosing-a-build-backend) and [authoritative blogs like this one](https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html).
104 | One important thing to note is that use of `setup.py` and `setup.cfg` has been officially deprecated and we should migrate away from that.
105 | 
106 | #### Tool configuration
107 | 
108 | Finally, `pyproject.toml` can be used to specify the configuration for any other tools like `pytest`, `ruff` and `mypy` your project may use.
109 | Each of these gets their own section in your `pyproject.toml` instead of using their own file, saving you from having dozens of such files in your project.
110 | 
111 | #### Installation
112 | 
113 | When the `pyproject.toml` is written, your package can be installed with
114 | 
115 | ```
116 | pip install -e .
117 | ```
118 | 
119 | The `-e` flag will install your package in editable mode, i.e. it will create a symlink to your package in the installation location instead of copying the package. This is convenient when developing, because any changes you make to the source code will immediately be available for use in the installed version.
120 | 
121 | Set up continuous integration to test your installation setup.
122 | You can use `pyroma` as a linter for your installation configuration.
123 | 
124 | ### Packaging and distributing your package
125 | 
126 | For packaging your code, you can either use `pip` or `conda`. Neither of them is [better than the other](https://jakevdp.github.io/blog/2016/08/25/conda-myths-and-misconceptions/) -- they are different; use the one which is more suitable for your project. `pip` may be more suitable for distributing pure python packages, and it provides some support for binary dependencies using [`wheels`](http://pythonwheels.com). `conda` may be more suitable when you have external dependencies which cannot be packaged in a wheel.
127 | 
128 | #### Build via the [Python Package Index (PyPI)](https://pypi.org) so that the package can be installed with pip
129 | 
130 | - [General instructions](https://packaging.python.org/en/latest/tutorials/packaging-projects/)
131 | - We recommend to configure GitHub Actions to upload the package to PyPI automatically for each release.
132 |   - For new repositories, it is recommended to use [trusted publishing](https://docs.pypi.org/trusted-publishers/) because it is more secure than using secret tokens from GitHub.
133 |     - For a workflow using secret tokens instead, see this [example workflow in DIANNA](https://github.com/dianna-ai/dianna/blob/main/.github/workflows/release.yml).
134 |   - You can follow [these instructions](https://packaging.python.org/en/latest/guides/publishing-package-distribution-releases-using-github-actions-ci-cd-workflows/) to set up GitHub Actions workflows with trusted publishing.
135 |     - The [`verbose`](https://github.com/marketplace/actions/pypi-publish#for-debugging) option for pypi workflows is useful to see why a workflow failed.
136 |     - To avoid unnecessary workflow runs, you can follow the example in the [sirup package](https://github.com/ivory-tower-private-power/sirup/blob/main/.github/workflows/release.yml): manually trigger pushes to pypi and investigate potential bugs during this process with a manual upload.
137 | - Manual uploads with twine
138 |   - Because PyPI and Test PyPI have required Two-Factor Authentication since January 2024, you need to mimic GitHub's trusted publishing to publish manually with `twine`.
139 |   - You can follow the section on "The manual way" as described [here](https://docs.pypi.org/trusted-publishers/using-a-publisher/).
140 | - Additional guidelines:
141 |   - Packages should be uploaded to PyPI using [your own account](https://pypi.org/account/register)
142 |   - For packages developed in a team or organization, it is recommended that you create a team or organizational account on PyPI and add that as a collaborator with the owner role. This will allow your team or organization to maintain the package even if individual contributors at some point move on to do other things. At the Netherlands eScience Center, we are a fairly small organization, so we use a single backup account (`nlesc`).
143 |   - When distributing code through PyPI, non-python files (such as `requirements.txt`) will not be packaged automatically, you need to [add them to](https://stackoverflow.com/questions/1612733/including-non-python-files-with-setup-py) a `MANIFEST.in` file.
144 |   - To test whether your distribution will work correctly before uploading to PyPI, you can run `python -m build` in the root of your repository. Then try installing your package with `pip install dist/<your_package>.tar.gz`.
145 |   - `python -m build` will also build [Python wheels](http://pythonwheels.com/), the current standard for [distributing](https://packaging.python.org/distributing/#wheels) Python packages. This will work out of the box for pure Python code, without C extensions. If C extensions are used, each OS needs to have its own wheel. The [manylinux](https://github.com/pypa/manylinux) Docker images can be used for building wheels compatible with multiple Linux distributions. Wheel building can be automated using GitHub Actions or another CI solution, where you can build on all three major platforms using a build matrix.
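
Why C extensions need one wheel per platform becomes visible in the wheel file name, which encodes the Python, ABI and platform tags. The following is a minimal sketch of reading those tags; the helper names and the example file names are hypothetical, and real tooling should rely on a dedicated library rather than string splitting.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel file name into its name, version and compatibility tags."""
    stem = filename.removesuffix(".whl")
    # The last three dash-separated fields are the compatibility tags.
    head, python_tag, abi_tag, platform_tag = stem.rsplit("-", 3)
    # An optional build tag may follow the version; it is ignored here.
    distribution, version = head.split("-")[:2]
    return WheelTags(distribution, version, python_tag, abi_tag, platform_tag)


def is_pure_python(tags: WheelTags) -> bool:
    """A pure Python wheel has no ABI requirement and runs on any platform."""
    return tags.abi_tag == "none" and tags.platform_tag == "any"


# Hypothetical file names, as they could appear in the dist/ folder.
pure = parse_wheel_filename("my_package-0.1.0-py3-none-any.whl")
compiled = parse_wheel_filename("my_package-0.1.0-cp312-cp312-manylinux_2_17_x86_64.whl")

print(is_pure_python(pure), is_pure_python(compiled))
```

A single `py3-none-any` wheel serves every platform, whereas a package with C extensions needs a separate file for each combination of Python version and platform.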
146 | 
147 | #### [Build using conda](https://conda-forge.org/docs/maintainer/adding_pkgs.html)
148 | 
149 | - **Make use of [conda-forge](https://conda-forge.org/) whenever possible**, since it provides many automated build services that save you tons of work, compared to using your own conda repository. It also has a very active community for when you need help.
150 | - Use BioConda or custom channels (hosted on GitHub) as alternatives if need be.
151 | 
152 | ## Editors and IDEs
153 | 
154 | Every major text editor supports Python, either natively or through plugins.
155 | At the Netherlands eScience Center, some popular editors or IDEs are:
156 | 
157 | - [vscode](https://code.visualstudio.com/) holds the middle ground between a lightweight text editor and a full-fledged language-dedicated IDE.
158 | - [vim](https://realpython.com/blog/python/vim-and-python-a-match-made-in-heaven/) or `emacs` (don't forget to install plugins to get the most out of these two), two versatile classic powertools that can also be used through remote SSH connection when needed.
159 | - JetBrains [PyCharm](https://www.jetbrains.com/pycharm/) is the Python-specific IDE of choice. [PyCharm Community Edition](https://www.jetbrains.com/pycharm) is free and open source; the source code is available in the [python folder of the IntelliJ repository](https://github.com/JetBrains/intellij-community/tree/master/python).
160 | 
161 | ## Coding style conventions
162 | 
163 | The style guide for Python code is [PEP8](http://www.python.org/dev/peps/pep-0008/) and for docstrings it is [PEP257](https://www.python.org/dev/peps/pep-0257/). We highly recommend following these conventions, as they are widely agreed upon to improve readability. To make following them significantly easier, we recommend using a linter.
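
As a small illustration of what these conventions look like in practice, the sketch below follows the PEP8 naming rules and the PEP257 docstring layout. All names in it are hypothetical.

```python
MAX_READINGS = 1000  # module-level constant: UPPER_CASE_WITH_UNDERSCORES


def mean_temperature(readings: list[float]) -> float:
    """Return the arithmetic mean of a list of temperature readings.

    Raise ValueError if the list is empty.
    """
    if not readings:
        raise ValueError("readings must not be empty")
    return sum(readings) / len(readings)


class WeatherStation:
    """A hypothetical weather station that stores temperature readings."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.readings: list[float] = []

    def add_reading(self, value: float) -> None:
        """Store one temperature reading."""
        if len(self.readings) >= MAX_READINGS:
            raise ValueError("too many readings")
        self.readings.append(value)


station = WeatherStation("De Bilt")
station.add_reading(11.0)
station.add_reading(13.0)
print(mean_temperature(station.readings))
```

Functions and variables use `lower_case_with_underscores`, classes use `CapWords`, and every public function, method and class starts with a one-line summary docstring.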
164 | 
165 | Many linters exist for Python.
166 | The most popular one is currently [Ruff](https://github.com/astral-sh/ruff).
167 | Although it is new (see the website for the complete function parity comparison with alternatives), it works well and has an active community.
168 | An alternative is [`prospector`](https://github.com/landscapeio/prospector), a tool for running a suite of linters, including, among others [pycodestyle](https://github.com/PyCQA/pycodestyle), [pydocstyle](https://github.com/PyCQA/pydocstyle), [pyflakes](https://pypi.python.org/pypi/pyflakes), [pylint](https://www.pylint.org/), [mccabe](https://github.com/PyCQA/mccabe) and [pyroma](https://github.com/regebro/pyroma).
169 | Some of these tools have seen decreasing community support recently, but it is still a good alternative, having been a defining community default for years.
170 | 
171 | Most of the above tools can be integrated in text editors and IDEs for convenience.
172 | 
173 | Autoformatting tools like [`yapf`](https://github.com/google/yapf) and [`black`](https://black.readthedocs.io/en/stable/index.html) can automatically format code for optimal readability. `yapf` is configurable to suit your (team's) preferences, whereas `black` enforces the style chosen by the `black` authors. The [`isort`](http://timothycrosley.github.io/isort/) package automatically formats and groups all imports in a standard, readable way.
174 | 
175 | Ruff can do autoformatting as well and can function as a drop-in replacement for `black` and `isort`.
176 | 
177 | ## Type hints
178 | 
179 | Since [PEP 484](https://peps.python.org/pep-0484/), which was first implemented in Python 3.5 (released in 2015), Python has gained the ability to add type information to variables.
180 | These are not types, as in typed languages; they are _hints_.
181 | Naively, one could say they are a new type of documentation.
182 | However, in practice they are far more than this, because they do have their own special syntax rules and are thus parsable.
183 | In fact, some tools, like Pydantic, FastAPI and Typer (all described below), have started to use them at runtime as well, which makes them more than hints for those tools.
184 | See [this guide](https://realpython.com/python-type-checking/) to learn more about type hints.
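
As a minimal sketch of what "hints, not types" means in practice, the example below annotates a small function and then calls it with an argument of the "wrong" type. The function name `greet` is made up for this illustration. The interpreter runs the call without complaint, whereas a static type checker would be expected to flag it. The annotations remain available at runtime through `typing.get_type_hints`, which is the kind of information that runtime tools build upon.

```python
from typing import get_type_hints


def greet(name: str) -> str:
    """Return a greeting for the given name."""
    return f"Hello, {name}!"


# The hint says `str`, but the interpreter does not enforce it:
# this call succeeds at runtime, although a type checker should report it.
print(greet(42))

# The annotations are stored on the function and can be inspected at runtime.
print(get_type_hints(greet))
```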
185 | 
186 | Some tools to know about that make use of type hints:
187 | 
188 | - [Type checkers](https://www.infoworld.com/article/2260170/4-python-type-checkers-to-keep-your-code-clean.html) are static code
189 |   analysis tools that check your code based on the type hints you provide. It is highly recommended that you use a type checker.
190 |   Choose [mypy](https://mypy-lang.org/) if you are unsure which one to choose.
191 | - Tools to build documentation from source code have extensions that can show type hints in the generated documentation to make your code easier to understand. Popular examples are [sphinx autodoc](https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html#confval-autodoc_typehints), [sphinx autoapi](https://sphinx-autoapi.readthedocs.io/en/latest/how_to.html#how-to-include-type-annotations-as-types-in-rendered-docstrings), and [mkdocstrings](https://mkdocstrings.github.io/).
192 | - [Pydantic](https://docs.pydantic.dev/latest/) is a widely used data validation library that allows you to automatically validate instances of dataclasses at runtime. This means that for this tool the type hints are no longer just hints or a form of documentation, but have actual effects. Essentially, a fully Pydantic-enriched application (in "strict mode") is like having Mypy at runtime (there is also a "lax" mode, the default, that coerces some common types instead of raising errors). For the data it validates, this brings Python close to the guarantees of a statically typed language, although the checks happen at runtime rather than before the program runs.
193 | - Most editors nowadays make use of type hints for autocompletion.
194 |   If the editor knows the type of your variable, for instance, it can autocomplete attributes or methods of that class.
195 | 
196 | We recommend using type hints, where possible and _practical_.
197 | Type hints are still being actively developed; not everything one would like to be able to express in a compact way can yet be achieved.
198 | This is why, for instance, [NumPy](https://numpy.org/) arrays and machine learning library (e.g. [PyTorch](https://pytorch.org/), [TensorFlow](https://www.tensorflow.org/)) "tensor" types still (in 2024) have awkward type hinting.
199 | A crucial piece of information that one would typically want to encode for array-type input arguments is the shape, but this is not yet possible.
200 | Other important libraries, like [Matplotlib](https://matplotlib.org/), have very complex functions that take in many possible types of arguments, leading to overly complex variable types.
201 | Such huge types clutter your code tremendously, so they are not typically encouraged.
202 | 
203 | ## Testing
204 | 
205 | Use [pytest](https://docs.pytest.org/) as the basis for your testing setup.
206 | This is preferred over the `unittest` module from the standard library, because it has a much more concise syntax and supports many useful features.
207 | 
208 | pytest [has many plugins](https://docs.pytest.org/en/stable/plugins.html).
209 | For linting, we have found `pytest-pycodestyle`, `pytest-pydocstyle`, `pytest-mypy` and `pytest-flake8` to be useful.
210 | Other plugins we have had good experiences with are `pytest-cov`, `pytest-html`, `pytest-xdist` and `pytest-nbmake`.
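
To illustrate the concise pytest syntax, here is a minimal sketch of a test module. The function `normalize` and the file name `test_normalize.py` are hypothetical and exist only for this example. In a real project the function under test would be imported from your package instead of being defined next to the test.

```python
# Contents of a hypothetical file test_normalize.py


def normalize(values):
    """Scale the values so that they sum to one."""
    total = sum(values)
    return [value / total for value in values]


def test_normalize_sums_to_one():
    # pytest collects functions whose names start with `test_`
    # and reports plain `assert` statements with detailed failure messages.
    result = normalize([1, 1, 2])
    assert result == [0.25, 0.25, 0.5]
    assert sum(result) == 1.0
```

Running `pytest` in the directory that contains this file should discover and run the test; no test class or special assertion methods are needed.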
211 | 
212 | Creating mocks can also be done within the pytest framework by using the `mocker` fixture provided by the `pytest-mock` plugin or by using `MagicMock` and `patch` from `unittest.mock`.
213 | For a general explanation about mocking, see the [standard library docs on mocking](https://docs.python.org/3/library/unittest.mock.html).
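
The following sketch shows both standard library tools in pytest-style tests. The functions `fetch_temperature` and `timestamp_label`, as well as the `client` object and its `get` method, are invented for this illustration and do not refer to a real API.

```python
import time
from unittest.mock import MagicMock, patch


def fetch_temperature(client):
    """Return the temperature in degrees Celsius reported by a (hypothetical) client."""
    reading = client.get("temperature")
    return reading["kelvin"] - 273.15


def timestamp_label():
    """Build a label from the current time."""
    return f"run-{int(time.time())}"


def test_fetch_temperature():
    # MagicMock stands in for an object that would normally talk to a remote service.
    client = MagicMock()
    client.get.return_value = {"kelvin": 300.15}
    assert abs(fetch_temperature(client) - 27.0) < 1e-9
    client.get.assert_called_once_with("temperature")


def test_timestamp_label():
    # patch temporarily replaces time.time, making the result deterministic.
    with patch("time.time", return_value=1700000000.9):
        assert timestamp_label() == "run-1700000000"
```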
214 | 
215 | To run your test suite, it can be convenient to use `tox`.
216 | Testing with `tox` allows for keeping the testing environment separate from your development environment.
217 | The development environment will typically accumulate (old) packages during development that interfere with testing; this problem is avoided by testing with `tox`.
218 | 
219 | ### Code coverage
220 | 
221 | When you have tests, it is also good to see which source code is exercised by the test suite.
222 | [Code coverage](https://book.the-turing-way.org/reproducible-research/testing/testing-guidance#aim-to-have-a-good-code-coverage) can be measured with the [coverage](https://coverage.readthedocs.io) Python package.
223 | The coverage package can also generate HTML reports that show which lines were covered.
224 | Most test runners have the coverage package integrated.
225 | 
226 | The code coverage reports can be published online using a code quality service or a code coverage service.
227 | Preferably, use one of the code quality services listed [below](#code-quality-analysis-tools-and-services) that also handles code coverage.
228 | If this is not possible or does not fit your project, use a generic code coverage service such as [Codecov](https://about.codecov.io/) or [Coveralls](https://coveralls.io/).
229 | 
230 | ## Code quality analysis tools and services
231 | 
232 | Code quality services are explained in [The Turing Way](https://book.the-turing-way.org/reproducible-research/code-quality/code-quality-style.html#online-services-providing-software-quality-checks).
233 | There are multiple code quality services available for Python, all of which have their pros and cons.
234 | See [The Turing Way](https://book.the-turing-way.org/reproducible-research/code-quality/code-quality-resources.html) for links to lists of possible services.
235 | We currently set up [SonarCloud](https://sonarcloud.io/) by default in our [Python template](https://github.com/NLeSC/python-template).
236 | To reproduce the SonarCloud pipeline locally, you can use [SonarLint](https://www.sonarlint.org/) in your IDE.
237 | If you use another editor, perhaps it is more convenient to pick another service like Codacy or Codecov.
238 | 
239 | ## Debugging and profiling
240 | 
241 | ### Debugging
242 | 
243 | - Python has its own debugger called [pdb](https://docs.python.org/3/library/pdb.html). It is a part of the Python distribution.
244 | - [pudb](https://github.com/inducer/pudb) is a console-based Python debugger which can easily be installed using pip.
245 | - If you are looking for IDEs with debugging capabilities, see the [Editors and IDEs section](#editors-and-ides).
246 | - If you are using Windows, [Python Tools for Visual Studio](https://github.com/Microsoft/PTVS) adds Python support for Visual Studio.
247 | - If you would like to integrate [pdb](https://docs.python.org/3/library/pdb.html) with `vim`, you can use [Pyclewn](https://sourceforge.net/projects/pyclewn).
248 | 
249 | - A list of other available software can be found on the [Python wiki page on debugging tools](https://wiki.python.org/moin/PythonDebuggingTools).
250 | 
251 | - If you are looking for some tutorials to get started:
252 |   - https://pymotw.com/2/pdb
253 |   - https://github.com/spiside/pdb-tutorial
254 |   - https://www.jetbrains.com/help/pycharm/2016.3/debugging.html
255 |   - https://waterprogramming.wordpress.com/2015/09/10/debugging-in-python-using-pycharm/
256 |   - http://www.pydev.org/manual_101_run.html
257 | 
258 | ### Profiling
259 | 
260 | There are a number of available profiling tools that are suitable for different situations.
261 | 
262 | - [cProfile](https://docs.python.org/2/library/profile.html) measures the number of function calls and how much CPU time they take. The output can be further analyzed using the `pstats` module.
263 | - For more fine-grained, line-by-line CPU time profiling, two modules can be used:
264 |   - [line_profiler](https://github.com/rkern/line_profiler) provides a function decorator that measures the time spent on each line inside the function.
265 |   - [pprofile](https://github.com/vpelletier/pprofile) is less intrusive; it simply times entire Python scripts line-by-line. It can give output in callgrind format, which allows you to study the statistics and call tree in `kcachegrind` (often used for analyzing C(++) profiles from `valgrind`).
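
As a minimal sketch of the `cProfile` and `pstats` combination mentioned above, the example below profiles a deliberately slow, made-up function called `slow_sum` and prints the five entries with the highest cumulative time. Only standard library calls are used.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Sum the squares below n with an explicit loop (slow on purpose)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


# Collect profiling data only for the code between enable() and disable().
profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# Format the collected statistics, sorted by cumulative time, into a string.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

For whole scripts, running `python -m cProfile -s cumulative your_script.py` gives a similar report without changing the code; `your_script.py` is a placeholder here.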
266 | 
267 | More realistic profiling information can usually be obtained by using statistical or sampling profilers. The profilers listed below all create nice flame graphs.
268 | 
269 | - [vprof](https://github.com/nvdv/vprof)
270 | - [Pyflame](https://github.com/uber/pyflame)
271 | - [nylas-perftools](https://github.com/nylas/nylas-perftools)
272 | 
273 | ## Logging
274 | 
275 | - The [logging](https://docs.python.org/3/library/logging.html) module is the most commonly used tool to track events in Python code.
276 | - Tutorials:
277 |   - [Official Python Logging Tutorial](https://docs.python.org/3/howto/logging.html#logging-basic-tutorial)
278 |   - http://docs.python-guide.org/en/latest/writing/logging
279 |   - [Python logging best practices](https://www.datadoghq.com/blog/python-logging-best-practices/)
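
A common pattern, sketched below under the assumption of a hypothetical package named `my_package`, is to create one logger per module and to leave the configuration of handlers and levels to the application entry point. The function names `configure_logging` and `analyse` are made up for this example.

```python
import logging

# One logger per module; the dotted name is a hypothetical package path.
logger = logging.getLogger("my_package.analysis")


def configure_logging(level=logging.INFO):
    """Configure the root logger once, typically at application start-up."""
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(name)s %(levelname)s: %(message)s",
    )


def analyse(values):
    """Return the mean of the values, logging what happens along the way."""
    # Passing arguments separately defers string formatting until it is needed.
    logger.debug("Received %d values", len(values))
    if not values:
        logger.warning("No values given, returning 0.0")
        return 0.0
    return sum(values) / len(values)


if __name__ == "__main__":
    configure_logging()
    print(analyse([1.0, 2.0, 3.0]))
    analyse([])
```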
280 | 
281 | ## Documentation
282 | 
283 | It is recommended that you [write documentation](https://book.the-turing-way.org/reproducible-research/code-documentation) for your projects and publish it on an interactive webpage.
284 | A popular and recommended solution for hosting documentation is [Read the Docs](https://readthedocs.org).
285 | It can automatically build documentation for projects hosted on [GitHub, GitLab, and Bitbucket](https://docs.readthedocs.io/en/stable/reference/git-integration.html).
286 | 
287 | ### Building documentation
288 | 
289 | There are several tools for building webpages with documentation.
290 | At the eScience Center, we mostly use [Sphinx](http://www.sphinx-doc.org/en/master/usage/quickstart.html) (more established) and [MkDocs](https://www.mkdocs.org/getting-started/) (newer).
291 | 
292 | User guides and other text documents are typically written in [Markdown](https://www.markdownguide.org/getting-started/) or [reStructuredText](https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html). Sphinx supports both formats, while MkDocs only supports Markdown. Markdown has the advantage that it's easier to read
293 | for humans so it may be easier to work with and contribute to. reStructuredText is easier to read for computers so may be more suitable for complex projects.
294 | 
295 | Python uses [Docstrings](https://pandas.pydata.org/docs/development/contributing_docstring.html#about-docstrings-and-standards) for code documentation. You can read a detailed description of docstring usage in [PEP 257](https://www.python.org/dev/peps/pep-0257/). Both Sphinx and MkDocs can generate documentation webpages from docstrings.
296 | There are two popular Sphinx extensions for generating documentation: [autoapi](https://sphinx-autoapi.readthedocs.io) (newer and more lightweight) and [autodoc](https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html) (more established).
297 | For MkDocs the [mkdocstrings](https://mkdocstrings.github.io/) package is available.
298 | We recommend using the [NumPy documentation style](https://numpydoc.readthedocs.io/en/latest/format.html#docstring-standard), as that is widely used in the scientific Python ecosystem.
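
For illustration, a docstring in the NumPy style could look like the sketch below. The function `rescale` is invented for this example. The `Examples` section uses the interactive prompt notation, so it can be checked automatically with a doctest runner.

```python
def rescale(values, factor=1.0):
    """Multiply every value by a constant factor.

    Parameters
    ----------
    values : list of float
        The numbers to rescale.
    factor : float, optional
        The multiplication factor, by default 1.0.

    Returns
    -------
    list of float
        The rescaled numbers, in the same order as ``values``.

    Examples
    --------
    >>> rescale([1.0, 2.0], factor=2.0)
    [2.0, 4.0]
    """
    return [value * factor for value in values]
```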
299 | 
300 | You can also integrate entire Jupyter notebooks into your documentation with [nbsphinx](https://nbsphinx.readthedocs.io) or
301 | [mkdocs-jupyter](https://github.com/danielfrg/mkdocs-jupyter).
302 | This way, your demo notebooks, for instance, can double as documentation.
303 | Of course, the notebooks will not be interactive in the compiled webpage, but they will include all code and output cells and you can easily link to an interactive version from the compiled documentation.
304 | 
305 | It is recommended that you [routinely test any code examples in your documentation](https://docs.pytest.org/en/stable/how-to/doctest.html).
306 | 
307 | ## Recommended additional packages and libraries
308 | 
309 | ### General scientific
310 | 
311 | - [NumPy](http://www.numpy.org/)
312 | - [SciPy](https://www.scipy.org/)
313 | - [Pandas](http://pandas.pydata.org/) data analysis toolkit
314 | - [scikit-learn](http://scikit-learn.org/): machine learning in Python
315 | - [Cython](http://cython.org/) speed up Python code by using C types and calling C functions
316 | - [dask](http://dask.pydata.org) larger than memory arrays and parallel execution
317 | 
318 | ### IPython and Jupyter notebooks (aka IPython notebooks)
319 | 
320 | [IPython](https://ipython.org/) is an interactive Python interpreter -- very much the same as the standard Python interactive interpreter, but with some [extra features](http://ipython.readthedocs.io/en/stable/interactive/index.html) (tab completion, shell commands, in-line help, etc.).
321 | 
322 | [Jupyter](http://jupyter.org/) notebooks (formerly known as IPython notebooks) are browser-based interactive Python environments. They incorporate the same features as the IPython console, plus some extras like in-line plotting. [Look at some examples](https://nbviewer.jupyter.org/github/ipython/ipython/blob/4.0.x/examples/IPython%20Kernel/Index.ipynb) to find out more. Within a notebook you can alternate code with Markdown comments (and even LaTeX), which is great for reproducible research.
323 | [Notebook extensions](https://github.com/ipython-contrib/jupyter_contrib_nbextensions) add extra functionality to notebooks.
324 | [JupyterLab](https://github.com/jupyterlab/jupyterlab) is a web-based environment with a lot of improvements and integrated tools.
325 | 
326 | Jupyter notebooks contain data that makes it hard to nicely keep track of code changes using version control. If you are using git,
327 | you can [add filters that automatically remove output cells and unneeded metadata from your notebooks](http://timstaley.co.uk/posts/making-git-and-jupyter-notebooks-play-nice/).
328 | If you do choose to keep output cells in the notebooks (which can be useful to showcase your code's capabilities statically from GitHub) use [ReviewNB](https://www.reviewnb.com/) to automatically create nice visual diffs in your GitHub pull request threads.
329 | It is good practice to restart the kernel and run the notebook from start to finish in one go before saving and committing, so you are sure that everything works as expected.
330 | 
331 | ### Visualization
332 | 
333 | - [Matplotlib](http://matplotlib.org) has been the standard in scientific visualization. It supports quick-and-dirty plotting through the `pyplot` submodule. Its object-oriented interface can be somewhat arcane, but is highly customizable and runs natively on many platforms, making it compatible with all major OSes and environments. It supports most sources of data, including native Python objects, NumPy and Pandas.
334 |   - [Seaborn](http://stanford.edu/~mwaskom/software/seaborn/index.html) is a Python visualisation library based on Matplotlib and aimed towards statistical analysis. It supports NumPy, Pandas, SciPy and statsmodels.
335 | - Web-based:
336 |   - [Bokeh](https://github.com/bokeh/bokeh) provides interactive web plotting for Python.
337 |   - [Plotly](https://plot.ly/) is another platform for interactive plotting through a web browser, including in Jupyter notebooks.
338 |   - [altair](https://github.com/ellisonbg/altair) is a _grammar of graphics_ style declarative statistical visualization library. It does not render visualizations itself, but rather outputs Vega-Lite JSON data. This can lead to a simplified workflow.
339 |   - [ggplot](https://github.com/yhat/ggpy) is a plotting library ported from R.
340 | 
341 | ### Parallelisation
342 | 
343 | CPython (the official and mainstream Python implementation) is not built for parallel processing due to the [global interpreter lock](https://wiki.python.org/moin/GlobalInterpreterLock). Note that the GIL only applies to actual Python code, so compiled modules like `numpy` do not suffer from it.
344 | 
345 | Having said that, there are many ways to run Python code in parallel:
346 | 
347 | - The [multiprocessing](https://docs.python.org/3/library/multiprocessing.html) module is the standard way to do parallel executions on one or multiple machines; it circumvents the GIL by creating multiple Python processes.
348 | - A much simpler alternative in Python 3 is the [`concurrent.futures`](https://docs.python.org/3/library/concurrent.futures.html) module.
349 | - [IPython / Jupyter notebooks have built-in parallel and distributed computing capabilities](https://ipython.org/ipython-doc/3/parallel/)
350 | - Many modules have parallel capabilities or can be compiled to have them.
351 | - At the eScience Center, we have developed the [Noodles package](https://research-software-directory.org/software/noodles) for creating computational workflows and automatically parallelizing them by dispatching independent subtasks to parallel and/or distributed systems.
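
As a small sketch of the `concurrent.futures` interface, the example below maps a made-up function `word_count` over a list of inputs using a pool of workers. To keep the sketch self-contained it uses `ThreadPoolExecutor`, which is still subject to the GIL and therefore mainly helps with I/O-bound work. For CPU-bound work, `ProcessPoolExecutor` offers the same interface, but the code that starts the pool should then be placed under an `if __name__ == "__main__":` guard.

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(text):
    """Count the whitespace-separated words in a string."""
    return len(text.split())


def count_all(texts, max_workers=4):
    """Apply word_count to every text using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map returns the results in the same order as the inputs.
        return list(executor.map(word_count, texts))


print(count_all(["a b c", "hello world", ""]))  # prints [3, 2, 0]
```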
352 | 
353 | ### Web Frameworks
354 | 
355 | There are convenient Python web frameworks available:
356 | 
357 | - [flask](http://flask.pocoo.org/)
358 | - [CherryPy](https://cherrypy.dev/)
359 | - [Django](https://www.djangoproject.com/)
360 | - [bottle](http://bottlepy.org/) (similar to flask, but a bit more light-weight for a JSON-REST service)
361 | - [FastAPI](https://fastapi.tiangolo.com): again, similar to flask in functionality, but uses modern Python features like async and type hints with runtime behavioral effects.
362 | 
363 | We have recommended `flask` in the past, but FastAPI has become more popular recently.
364 | 
365 | ### NLP/text mining
366 | 
367 | - [nltk](http://www.nltk.org/) Natural Language Toolkit
368 | - [Pattern](https://github.com/clips/pattern): web/text mining module
369 | - [gensim](https://radimrehurek.com/gensim/): Topic modeling
370 | 
371 | ### Creating programs with command line arguments
372 | 
373 | - For run-time configuration via command-line options, the built-in [`argparse`](https://docs.python.org/library/argparse.html) module usually suffices.
374 | - A more complete solution is [`ConfigArgParse`](https://github.com/bw2/ConfigArgParse). This (almost) drop-in replacement for `argparse` allows you to not only specify configuration options via command-line options, but also via (ini or yaml) configuration files and via environment variables.
375 | - Other popular libraries are [`click`](https://click.palletsprojects.com) and [`fire`](https://google.github.io/python-fire/).
376 | - [Typer](https://typer.tiangolo.com): make a command-line application by using type hints with runtime effects. Very low on boilerplate for simple cases, but also allows for more complex cases. Uses `click` internally.
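
For the simplest of these options, a minimal `argparse` sketch could look as follows. The program description, the argument names and the file name `data.csv` are all hypothetical. Passing an explicit list to `parse_args` makes it easy to try out or test a parser without a real command line.

```python
import argparse


def build_parser():
    """Create the command-line parser for a hypothetical simulation runner."""
    parser = argparse.ArgumentParser(description="Hypothetical simulation runner")
    parser.add_argument("input", help="path to the input file")
    parser.add_argument("-n", "--steps", type=int, default=10, help="number of steps")
    parser.add_argument("-v", "--verbose", action="store_true", help="print progress")
    return parser


# In a real program you would call parse_args() without arguments to read sys.argv.
args = build_parser().parse_args(["data.csv", "--steps", "5"])
print(args.input, args.steps, args.verbose)  # prints: data.csv 5 False
```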
377 | 


--------------------------------------------------------------------------------