├── code
│   └── .gitkeep
├── data
│   └── .gitkeep
├── fig
│   ├── .gitkeep
│   └── SingularityInDocker.png
├── _episodes
│   ├── .gitkeep
│   ├── lunch.md
│   ├── am-break.md
│   ├── pm-break.md
│   ├── advanced-topics.md
│   ├── singularity-shell.md
│   ├── singularity-cache.md
│   ├── singularity-docker.md
│   ├── reproducibility.md
│   ├── singularity-files.md
│   ├── advanced-containers.md
│   ├── introduction.md
│   ├── singularity-gettingstarted.md
│   ├── singularity-blast.md
│   └── singularity-mpi.md
├── _extras
│   ├── .gitkeep
│   ├── discuss.md
│   ├── about.md
│   ├── figures.md
│   └── guide.md
├── files
│   ├── .gitkeep
│   ├── blast_example.tar.gz
│   └── osu_latency.slurm.template
├── _episodes_rmd
│   ├── .gitkeep
│   └── data
│       └── .gitkeep
├── AUTHORS
├── bin
│   ├── boilerplate
│   │   ├── CITATION
│   │   ├── AUTHORS
│   │   ├── setup.md
│   │   ├── _extras
│   │   │   ├── discuss.md
│   │   │   ├── guide.md
│   │   │   ├── about.md
│   │   │   └── figures.md
│   │   ├── reference.md
│   │   ├── _episodes
│   │   │   └── 01-introduction.md
│   │   ├── index.md
│   │   ├── README.md
│   │   ├── .travis.yml
│   │   ├── _config.yml
│   │   └── CONTRIBUTING.md
│   ├── install_r_deps.sh
│   ├── run-make-docker-serve.sh
│   ├── knit_lessons.sh
│   ├── markdown_ast.rb
│   ├── test_lesson_check.py
│   ├── lesson_initialize.py
│   ├── generate_md_episodes.R
│   ├── dependencies.R
│   ├── chunk-options.R
│   ├── repo_check.py
│   ├── util.py
│   └── workshop_check.py
├── reference.md
├── .github
│   ├── FUNDING.yml
│   ├── PULL_REQUEST_TEMPLATE.md
│   ├── ISSUE_TEMPLATE.md
│   └── workflows
│       └── template.yml
├── .gitignore
├── CITATION
├── Gemfile
├── aio.md
├── CODE_OF_CONDUCT.md
├── .editorconfig
├── index.md
├── README.md
├── .travis.yml
├── LICENSE.md
├── _config.yml
├── Makefile
├── CONTRIBUTING.md
└── setup.md
/code/.gitkeep:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/data/.gitkeep:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/fig/.gitkeep:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/_episodes/.gitkeep:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/_extras/.gitkeep:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/files/.gitkeep:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/_episodes_rmd/.gitkeep:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/_episodes_rmd/data/.gitkeep:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/AUTHORS:
--------------------------------------------------------------------------------
1 | Jeremy Cohen
2 | Andrew Turner
--------------------------------------------------------------------------------
/bin/boilerplate/CITATION:
--------------------------------------------------------------------------------
1 | FIXME: describe how to cite this lesson.
--------------------------------------------------------------------------------
/bin/boilerplate/AUTHORS:
--------------------------------------------------------------------------------
1 | FIXME: list authors' names and email addresses.
--------------------------------------------------------------------------------
/_episodes/lunch.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Lunch"
3 | break: 60
4 | ---
5 |
6 | Lunch break
7 |
--------------------------------------------------------------------------------
/_episodes/am-break.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Break"
3 | break: 30
4 | ---
5 |
6 | Morning break
7 |
--------------------------------------------------------------------------------
/_episodes/pm-break.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Break"
3 | break: 30
4 | ---
5 |
6 | Afternoon break
7 |
--------------------------------------------------------------------------------
/_extras/discuss.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: Discussion
3 | ---
4 | FIXME
5 |
6 | {% include links.md %}
7 |
--------------------------------------------------------------------------------
/bin/boilerplate/setup.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: Setup
3 | ---
4 | FIXME
5 |
6 |
7 | {% include links.md %}
8 |
--------------------------------------------------------------------------------
/_extras/about.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: About
3 | ---
4 | {% include carpentries.html %}
5 | {% include links.md %}
6 |
--------------------------------------------------------------------------------
/bin/boilerplate/_extras/discuss.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: Discussion
3 | ---
4 | FIXME
5 |
6 | {% include links.md %}
7 |
--------------------------------------------------------------------------------
/bin/boilerplate/_extras/guide.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Instructor Notes"
3 | ---
4 | FIXME
5 |
6 | {% include links.md %}
7 |
--------------------------------------------------------------------------------
/bin/boilerplate/_extras/about.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: About
3 | ---
4 | {% include carpentries.html %}
5 | {% include links.md %}
6 |
--------------------------------------------------------------------------------
/reference.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: reference
3 | ---
4 |
5 | ## Glossary
6 |
7 | FIXME
8 |
9 | {% include links.md %}
10 |
--------------------------------------------------------------------------------
/files/blast_example.tar.gz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EPCCed/2023-09-21_Singularity_Nottingham/gh-pages/files/blast_example.tar.gz
--------------------------------------------------------------------------------
/.github/FUNDING.yml:
--------------------------------------------------------------------------------
1 | github: [carpentries, swcarpentry, datacarpentry, librarycarpentry]
2 | custom: ["https://carpentries.wedid.it"]
3 |
--------------------------------------------------------------------------------
/fig/SingularityInDocker.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EPCCed/2023-09-21_Singularity_Nottingham/gh-pages/fig/SingularityInDocker.png
--------------------------------------------------------------------------------
/bin/boilerplate/reference.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: reference
3 | ---
4 |
5 | ## Glossary
6 |
7 | FIXME
8 |
9 | {% include links.md %}
10 |
--------------------------------------------------------------------------------
/bin/install_r_deps.sh:
--------------------------------------------------------------------------------
1 | Rscript -e "source(file.path('bin', 'dependencies.R')); install_required_packages(); install_dependencies(identify_dependencies())"
2 |
--------------------------------------------------------------------------------
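The one-liner above is meant to be run from the repository root so that the relative path `bin/dependencies.R` resolves. A minimal usage sketch, assuming R (and hence `Rscript`) is installed:

~~~
# From the repository root: install the helper packages, then scan
# _episodes_rmd/ and bin/ for R dependencies and install those too.
sh bin/install_r_deps.sh
~~~
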
/bin/run-make-docker-serve.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | set -o errexit
4 | set -o pipefail
5 | set -o nounset
6 |
7 |
8 | bundle install
9 | bundle update
10 | exec bundle exec jekyll serve --host 0.0.0.0
11 |
--------------------------------------------------------------------------------
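This script expects to run in an environment that already has Ruby and Bundler available; the repository's Makefile (contents not included in this dump) conventionally wraps it in a `docker run`. A sketch of one possible invocation, with the image name purely illustrative:

~~~
# Serve the lesson from a disposable Ruby container; any Ruby image
# with build tools should work (ruby:3.1 is an illustrative choice).
docker run --rm -it \
  -v "$(pwd)":/lesson -w /lesson \
  -p 4000:4000 \
  ruby:3.1 \
  bash bin/run-make-docker-serve.sh
# The site is then available at http://localhost:4000
~~~
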
/.gitignore:
--------------------------------------------------------------------------------
1 | *.pyc
2 | *~
3 | .DS_Store
4 | .ipynb_checkpoints
5 | .sass-cache
6 | .jekyll-cache/
7 | __pycache__
8 | _site
9 | .Rproj.user
10 | .Rhistory
11 | .RData
12 | .bundle/
13 | .vendor/
14 | vendor/
15 | .docker-vendor/
16 | Gemfile.lock
17 | .*history
18 |
--------------------------------------------------------------------------------
/bin/knit_lessons.sh:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env bash
2 |
3 | # Only try running R to translate a file if exactly two arguments were
4 | # passed in: the Makefile supplies the input .Rmd and output .md names.
5 |
6 | if [ $# -eq 2 ] ; then
7 | Rscript -e "source('bin/generate_md_episodes.R')" "$@"
8 | fi
9 |
--------------------------------------------------------------------------------
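The `$# -eq 2` guard matches the two positional arguments that `bin/generate_md_episodes.R` expects: the source `.Rmd` and the destination `.md`. A usage sketch with hypothetical file names:

~~~
# Knit one episode from R Markdown to Jekyll-ready Markdown
# (both paths are hypothetical examples).
bash bin/knit_lessons.sh _episodes_rmd/01-example.Rmd _episodes/01-example.md
~~~
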
/CITATION:
--------------------------------------------------------------------------------
1 | Please cite as:
2 |
3 | J. Cohen and A. Turner. "Reproducible computational environments
4 | using containers: Introduction to Singularity". Version 2020.08a,
5 | August 2020. Carpentries Incubator.
6 | https://github.com/carpentries-incubator/singularity-introduction
--------------------------------------------------------------------------------
/Gemfile:
--------------------------------------------------------------------------------
1 | # frozen_string_literal: true
2 |
3 | source 'https://rubygems.org'
4 |
5 | git_source(:github) {|repo_name| "https://github.com/#{repo_name}" }
6 |
7 | # Synchronize with https://pages.github.com/versions
8 | ruby '>=2.5.8'
9 |
10 | gem 'github-pages', group: :jekyll_plugins
11 |
--------------------------------------------------------------------------------
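With the `github-pages` gem pinned here, the lesson can be previewed locally with Bundler, mirroring what `bin/run-make-docker-serve.sh` does inside Docker:

~~~
# Install the pinned gems, then serve the site on http://localhost:4000
bundle install
bundle exec jekyll serve
~~~
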
/bin/markdown_ast.rb:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env ruby
2 |
3 | # Use Kramdown parser to produce AST for Markdown document.
4 |
5 | require "kramdown"
6 | require "json"
7 |
8 | markdown = STDIN.read()
9 | doc = Kramdown::Document.new(markdown)
10 | tree = doc.to_hash_a_s_t
11 | puts JSON.pretty_generate(tree)
12 |
--------------------------------------------------------------------------------
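Because the script reads Markdown on standard input and writes JSON to standard output, it composes naturally with shell redirection. A sketch, with the episode path chosen as an example:

~~~
# Print the Kramdown AST of one episode as pretty-printed JSON.
ruby bin/markdown_ast.rb < _episodes/singularity-shell.md
~~~
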
/bin/boilerplate/_episodes/01-introduction.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Introduction"
3 | teaching: 0
4 | exercises: 0
5 | questions:
6 | - "Key question (FIXME)"
7 | objectives:
8 | - "First learning objective. (FIXME)"
9 | keypoints:
10 | - "First key point. Brief Answer to questions. (FIXME)"
11 | ---
12 | FIXME
13 |
14 | {% include links.md %}
15 |
16 |
--------------------------------------------------------------------------------
/aio.md:
--------------------------------------------------------------------------------
1 | ---
2 | permalink: /aio/index.html
3 | ---
4 |
5 | {% comment %}
6 | As a maintainer, you don't need to edit this file.
7 | If you notice that something doesn't work, please
8 | open an issue: https://github.com/carpentries/styles/issues/new
9 | {% endcomment %}
10 |
11 | {% include base_path.html %}
12 |
13 | {% include aio-script.md %}
14 |
--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: page
3 | title: "Contributor Code of Conduct"
4 | ---
5 | As contributors and maintainers of this project,
6 | we pledge to follow the [Carpentry Code of Conduct][coc].
7 |
8 | Instances of abusive, harassing, or otherwise unacceptable behavior
9 | may be reported by following our [reporting guidelines][coc-reporting].
10 |
11 | {% include links.md %}
12 |
--------------------------------------------------------------------------------
/bin/boilerplate/index.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: lesson
3 | root: . # Is the only page that doesn't follow the pattern /:path/index.html
4 | permalink: index.html # Is the only page that doesn't follow the pattern /:path/index.html
5 | ---
6 | FIXME: home page introduction
7 |
8 |
9 |
10 | {% comment %} This is a comment in Liquid {% endcomment %}
11 |
12 | > ## Prerequisites
13 | >
14 | > FIXME
15 | {: .prereq}
16 |
17 | {% include links.md %}
18 |
--------------------------------------------------------------------------------
/.editorconfig:
--------------------------------------------------------------------------------
1 | root = true
2 |
3 | [*]
4 | charset = utf-8
5 | insert_final_newline = true
6 | trim_trailing_whitespace = true
7 |
8 | [*.md]
9 | indent_size = 2
10 | indent_style = space
11 | max_line_length = 100 # Please keep this in sync with bin/lesson_check.py!
12 |
13 | [*.r]
14 | max_line_length = 80
15 |
16 | [*.py]
17 | indent_size = 4
18 | indent_style = space
19 | max_line_length = 79
20 |
21 | [*.sh]
22 | end_of_line = lf
23 |
24 | [Makefile]
25 | indent_style = tab
26 |
--------------------------------------------------------------------------------
/bin/test_lesson_check.py:
--------------------------------------------------------------------------------
1 | import unittest
2 |
3 | import lesson_check
4 | import util
5 |
6 |
7 | class TestFileList(unittest.TestCase):
8 | def setUp(self):
9 | self.reporter = util.Reporter() # TODO: refactor reporter class.
10 |
11 | def test_file_list_has_expected_entries(self):
12 | # For first pass, simply assume that all required files are present
13 |
14 | lesson_check.check_fileset('', self.reporter, lesson_check.REQUIRED_FILES)
15 | self.assertEqual(len(self.reporter.messages), 0)
16 |
17 |
18 | if __name__ == "__main__":
19 | unittest.main()
20 |
--------------------------------------------------------------------------------
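The test imports `lesson_check` and `util` as plain modules, so it is meant to be run from inside `bin/` (note that `bin/lesson_check.py` is referenced here and in `.editorconfig` but is not included in this dump). Two ways to drive the checks, the second taken from `.travis.yml`:

~~~
# Run the unit test directly, from bin/ so the imports resolve
# (assumes bin/lesson_check.py is present):
cd bin && python test_lesson_check.py

# Or run the full lesson check from the repository root, as CI does:
make lesson-check-all
~~~
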
/.github/PULL_REQUEST_TEMPLATE.md:
--------------------------------------------------------------------------------
1 |
2 | Instructions
3 |
4 | Thanks for contributing! :heart:
5 |
6 | If this contribution is for instructor training, please email the link to this contribution to
7 | checkout@carpentries.org so we can record your progress. You've completed your contribution
8 | step for instructor checkout by submitting this contribution!
9 |
10 | Keep in mind that **lesson maintainers are volunteers** and it may take them some time to
11 | respond to your contribution. Although not all contributions can be incorporated into the lesson
12 | materials, we appreciate your time and effort to improve the curriculum. If you have any questions
13 | about the lesson maintenance process or would like to volunteer your time as a contribution
14 | reviewer, please contact The Carpentries Team at team@carpentries.org.
15 |
16 | You may delete these instructions from your comment.
17 |
18 | \- The Carpentries
19 |
20 |
--------------------------------------------------------------------------------
/.github/ISSUE_TEMPLATE.md:
--------------------------------------------------------------------------------
1 |
2 | Instructions
3 |
4 | Thanks for contributing! :heart:
5 |
6 | If this contribution is for instructor training, please email the link to this contribution to
7 | checkout@carpentries.org so we can record your progress. You've completed your contribution
8 | step for instructor checkout by submitting this contribution!
9 |
10 | If this issue is about a specific episode within a lesson, please provide its link or filename.
11 |
12 | Keep in mind that **lesson maintainers are volunteers** and it may take them some time to
13 | respond to your contribution. Although not all contributions can be incorporated into the lesson
14 | materials, we appreciate your time and effort to improve the curriculum. If you have any questions
15 | about the lesson maintenance process or would like to volunteer your time as a contribution
16 | reviewer, please contact The Carpentries Team at team@carpentries.org.
17 |
18 | You may delete these instructions from your comment.
19 |
20 | \- The Carpentries
21 |
22 |
--------------------------------------------------------------------------------
/bin/lesson_initialize.py:
--------------------------------------------------------------------------------
1 | """Initialize a newly-created repository."""
2 |
3 |
4 | import sys
5 | import os
6 | import shutil
7 |
8 | BOILERPLATE = (
9 | '.travis.yml',
10 | 'AUTHORS',
11 | 'CITATION',
12 | 'CONTRIBUTING.md',
13 | 'README.md',
14 | '_config.yml',
15 | os.path.join('_episodes', '01-introduction.md'),
16 | os.path.join('_extras', 'about.md'),
17 | os.path.join('_extras', 'discuss.md'),
18 | os.path.join('_extras', 'figures.md'),
19 | os.path.join('_extras', 'guide.md'),
20 | 'index.md',
21 | 'reference.md',
22 | 'setup.md',
23 | )
24 |
25 |
26 | def main():
27 | """Check for collisions, then create."""
28 |
29 | # Check.
30 | errors = False
31 | for path in BOILERPLATE:
32 | if os.path.exists(path):
33 | print('Warning: {0} already exists.'.format(path), file=sys.stderr)
34 | errors = True
35 | if errors:
36 | print('**Exiting without creating files.**', file=sys.stderr)
37 | sys.exit(1)
38 |
39 | # Create.
40 | for path in BOILERPLATE:
41 | shutil.copyfile(
42 | os.path.join('bin', 'boilerplate', path),
43 | path
44 | )
45 |
46 |
47 | if __name__ == '__main__':
48 | main()
49 |
--------------------------------------------------------------------------------
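The script copies each `BOILERPLATE` entry from `bin/boilerplate/` into the repository root and refuses to overwrite existing files. Usage sketch:

~~~
# From the root of a freshly created lesson repository:
python bin/lesson_initialize.py
# Exits with status 1 (creating nothing) if any target file already exists.
~~~
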
/bin/generate_md_episodes.R:
--------------------------------------------------------------------------------
1 | generate_md_episodes <- function() {
2 |
3 |   ## get the Rmd file to process and the path for its corresponding
4 |   ## output from the command line
5 | args <- commandArgs(trailingOnly = TRUE)
6 | if (!identical(length(args), 2L)) {
7 | stop("input and output file must be passed to the script")
8 | }
9 |
10 | src_rmd <- args[1]
11 | dest_md <- args[2]
12 |
13 | ## knit the Rmd into markdown
14 | knitr::knit(src_rmd, output = dest_md)
15 |
16 | # Read the generated md files and add comments advising not to edit them
17 | add_no_edit_comment <- function(y) {
18 | con <- file(y)
19 | mdfile <- readLines(con)
20 | if (mdfile[1] != "---")
21 | stop("Input file does not have a valid header")
22 | mdfile <- append(
23 | mdfile,
24 | "# Please do not edit this file directly; it is auto generated.",
25 | after = 1
26 | )
27 | mdfile <- append(
28 | mdfile,
29 | paste("# Instead, please edit", basename(y), "in _episodes_rmd/"),
30 | after = 2
31 | )
32 | writeLines(mdfile, con)
33 | close(con)
34 | return(paste("Warning added to YAML header of", y))
35 | }
36 |
37 | vapply(dest_md, add_no_edit_comment, character(1))
38 | }
39 |
40 | generate_md_episodes()
41 |
--------------------------------------------------------------------------------
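The script reads its two file names from `commandArgs(trailingOnly = TRUE)`, so it can also be driven directly, exactly as `bin/knit_lessons.sh` does:

~~~
# Direct invocation (file names are examples); besides knitting, this
# inserts the "do not edit" comment into the output's YAML header.
Rscript -e "source('bin/generate_md_episodes.R')" \
  _episodes_rmd/01-example.Rmd _episodes/01-example.md
~~~
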
/files/osu_latency.slurm.template:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | # Slurm job options (name, compute nodes, job time)
4 | #SBATCH --job-name=
5 | #SBATCH --time=0:10:0
6 | #SBATCH --nodes=2
7 | #SBATCH --tasks-per-node=128
8 | #SBATCH --cpus-per-task=1
9 |
10 | # Replace [budget code] below with your budget code (e.g. t01)
11 | #SBATCH --partition=standard
12 | #SBATCH --qos=standard
13 | #SBATCH --account=
14 |
15 | # Set up the job environment (this module needs to be loaded before any other modules)
16 | module load epcc-job-env
17 |
18 | # Set the number of threads to 1
19 | # This prevents any threaded system libraries from automatically
20 | # using threading.
21 | export OMP_NUM_THREADS=1
22 |
23 | # Set the LD_LIBRARY_PATH environment variable within the Singularity container
24 | # to ensure that it uses the correct MPI libraries
25 | export SINGULARITYENV_LD_LIBRARY_PATH=/opt/cray/pe/mpich/8.0.16/ofi/gnu/9.1/lib-abi-mpich:/usr/lib/x86_64-linux-gnu/libibverbs:/opt/cray/pe/pmi/6.0.7/lib:/opt/cray/libfabric/1.11.0.0.233/lib64:/usr/lib64/host:/.singularity.d/libs
26 |
27 | # Set the options for the Singularity executable
28 | # This makes sure the locations with Cray Slingshot interconnect libraries are available
29 | singopts="-B /opt/cray,/usr/lib64:/usr/lib64/host,/usr/lib64/tcl,/var/spool/slurmd/mpi_cray_shasta"
30 |
31 | # Launch the parallel job
32 | srun --hint=nomultithread --distribution=block:block singularity run $singopts osu_benchmarks.sif collective/osu_allreduce
33 |
--------------------------------------------------------------------------------
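The template is intended to be copied, completed, and submitted with `sbatch`. A possible workflow, with file names illustrative and the assumption that the `osu_benchmarks.sif` image sits in the submission directory:

~~~
# Copy the template, fill in the blanks, then submit (names illustrative):
cp files/osu_latency.slurm.template osu_latency.slurm
# Edit osu_latency.slurm: set --job-name and --account (e.g. t01),
# and check that osu_benchmarks.sif is in the submission directory.
sbatch osu_latency.slurm
~~~
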
/bin/dependencies.R:
--------------------------------------------------------------------------------
1 | install_required_packages <- function(lib = NULL, repos = getOption("repos")) {
2 |
3 | if (is.null(lib)) {
4 | lib <- .libPaths()
5 | }
6 |
7 | message("lib paths: ", paste(lib, collapse = ", "))
8 | missing_pkgs <- setdiff(
9 | c("rprojroot", "desc", "remotes", "renv"),
10 | rownames(installed.packages(lib.loc = lib))
11 | )
12 |
13 | install.packages(missing_pkgs, lib = lib, repos = repos)
14 |
15 | }
16 |
17 | find_root <- function() {
18 |
19 | cfg <- rprojroot::has_file_pattern("^_config.y*ml$")
20 | root <- rprojroot::find_root(cfg)
21 |
22 | root
23 | }
24 |
25 | identify_dependencies <- function() {
26 |
27 | root <- find_root()
28 |
29 | required_pkgs <- unique(c(
30 | ## Packages for episodes
31 | renv::dependencies(file.path(root, "_episodes_rmd"), progress = FALSE, error = "ignore")$Package,
32 | ## Packages for tools
33 | renv::dependencies(file.path(root, "bin"), progress = FALSE, error = "ignore")$Package
34 | ))
35 |
36 | required_pkgs
37 | }
38 |
39 | create_description <- function(required_pkgs) {
40 | d <- desc::description$new("!new")
41 | lapply(required_pkgs, function(x) d$set_dep(x))
42 | d$write("DESCRIPTION")
43 | }
44 |
45 | install_dependencies <- function(required_pkgs, ...) {
46 |
47 | create_description(required_pkgs)
48 | on.exit(file.remove("DESCRIPTION"))
49 | remotes::install_deps(dependencies = TRUE, ...)
50 |
51 | if (require("knitr") && packageVersion("knitr") < '1.9.19') {
52 | stop("knitr must be version 1.9.20 or higher")
53 | }
54 |
55 | }
56 |
--------------------------------------------------------------------------------
/bin/boilerplate/README.md:
--------------------------------------------------------------------------------
1 | # FIXME Lesson title
2 |
3 | [Create a Slack Account with us](https://swc-slack-invite.herokuapp.com/)
4 |
5 | This repository generates the corresponding lesson website from [The Carpentries](https://carpentries.org/) repertoire of lessons.
6 |
7 | ## Contributing
8 |
9 | We welcome all contributions to improve the lesson! Maintainers will do their best to help you if you have any
10 | questions, concerns, or experience any difficulties along the way.
11 |
12 | We'd like to ask you to familiarize yourself with our [Contribution Guide](CONTRIBUTING.md) and have a look at
13 | the [more detailed guidelines][lesson-example] on proper formatting, ways to render the lesson locally, and even
14 | how to write new episodes.
15 |
16 | Please see the current list of [issues][FIXME] for ideas for contributing to this
17 | repository. For making your contribution, we use the GitHub flow, which is
18 | nicely explained in the chapter [Contributing to a Project](http://git-scm.com/book/en/v2/GitHub-Contributing-to-a-Project) in Pro Git
19 | by Scott Chacon.
20 | Look for the tag `good first issue`. This indicates that the maintainers will welcome a pull request fixing this issue.
21 |
22 |
23 | ## Maintainer(s)
24 |
25 | Current maintainers of this lesson are
26 |
27 | * FIXME
28 | * FIXME
29 | * FIXME
30 |
31 |
32 | ## Authors
33 |
34 | A list of contributors to the lesson can be found in [AUTHORS](AUTHORS)
35 |
36 | ## Citation
37 |
38 | To cite this lesson, please consult with [CITATION](CITATION)
39 |
40 | [lesson-example]: https://carpentries.github.io/lesson-example
41 |
--------------------------------------------------------------------------------
/_episodes/advanced-topics.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Additional topics and next steps"
3 | teaching: 0
4 | exercises: 0
5 | questions:
6 | - "How do I understand more on how containers work?"
7 | - "What different container technologies are there and what are differences/implications?"
8 | - "How can I orchestrate different containers?"
9 | objectives:
10 | - "Understand container technologies better."
11 | - "Provide useful links to continue your journey with containers."
12 | keypoints:
13 | - "TBC"
14 | ---
15 |
16 | ## Additional topics
17 |
18 | - How do containers work
19 | + [Containers vs Virtual Machines](https://learn.microsoft.com/en-us/virtualization/windowscontainers/about/containers-vs-vm)
20 | + [Layers](https://docs.docker.com/storage/storagedriver/)
21 | - Container technologies
22 | + [Docker](https://docs.docker.com/)
23 | + [Podman](https://podman.io/)
24 | + [Container Engine State of the Art (FOSDEM'21)](https://www.youtube.com/watch?v=Ir11tGO7lpI)
25 | - Container good practice
26 | + [How should containers be used and present themselves?](https://qnib.org/data/2023-01-29/HPC_OCI_Conformance_v10.pdf)
27 | + [Best practice for bioinformatic containers](https://f1000research.com/articles/7-742/v2)
28 | - Container orchestration - typically using Docker containers rather than Singularity
29 | + [Kubernetes](https://kubernetes.io/)
30 | + [Docker Compose](https://docs.docker.com/compose/)
31 | + [Docker Swarm](https://docs.docker.com/engine/swarm/)
32 |
33 | ## Useful links
34 |
35 | - [Docker Introductory course (Carpentries Incubator)](https://carpentries-incubator.github.io/docker-introduction/)
36 | - [Singularity CE documentation](https://sylabs.io/docs/)
37 | - [Apptainer documentation](https://apptainer.org/docs/)
38 |
39 |
40 |
--------------------------------------------------------------------------------
/index.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: lesson
3 | root: . # Is the only page that doesn't follow the pattern /:path/index.html
4 | permalink: index.html # Is the only page that doesn't follow the pattern /:path/index.html
5 | ---
6 |
7 | This lesson provides an introduction to using the SingularityCE/Apptainer container platform. Singularity is particularly suited to running containers on infrastructure where users don't have administrative privileges, for example shared infrastructure such as High Performance Computing (HPC) clusters.
8 |
9 | This lesson will introduce Singularity from scratch, showing you how to run a simple container before building up to creating your own containers and running parallel scientific workloads on HPC infrastructure.
10 |
11 | > ## Prerequisites
12 | > You should have basic familiarity with using a command shell, and the lesson text will at times request that you "open a shell window", with an assumption that you know what this means.
13 | > Under Linux or macOS it is assumed that you will access a bash shell (usually the default), using your Terminal application.
14 | > Under Windows, PowerShell or WSL should allow you to follow the Unix instructions.
15 | > The lessons will sometimes request that you use a text editor to create or edit files in particular directories. It is assumed that you either have an editor that you know how to use that runs within the working directory of your shell window (e.g. nano), or that if you use a graphical editor, that you can use it to read and write files into the working directory of your shell.
16 | {: .prereq}
17 |
18 | ## Course details
19 |
20 | - Dates:
21 | + 0930-1600, Thu 21 Sep 2023
22 | - Location:
23 | + E06 in the Monica Partridge Building, University of Nottingham, University Park
24 | - Instructor: Andy Turner, EPCC, University of Edinburgh, UK
25 |
26 |
27 | {% include links.md %}
28 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Reproducible computational environments using containers: Introduction to Singularity
2 |
3 | [Create a Slack Account with us](https://swc-slack-invite.herokuapp.com/)
4 |
5 | This lesson provides an introduction to the [Singularity container platform](https://github.com/hpcng/singularity).
6 |
7 | It covers the basics of using Singularity and creating containers:
8 |
9 | - What is Singularity?
10 | - Installing/running Singularity on the command line
11 | - Running containers
12 | - Creating Singularity images
13 | - Running an MPI parallel application from a Singularity container
14 |
15 | ## Contributing
16 |
17 | We welcome all contributions to improve the lesson! Maintainers will do their best to help you if you have any
18 | questions, concerns, or experience any difficulties along the way.
19 |
20 | We'd like to ask you to familiarize yourself with our [Contribution Guide](CONTRIBUTING.md) and have a look at
21 | the [more detailed guidelines][lesson-example] on proper formatting, ways to render the lesson locally, and even
22 | how to write new episodes.
23 |
24 | Please see the current list of [issues][issues] for ideas for contributing to this
25 | repository. For making your contribution, we use the GitHub flow, which is
26 | nicely explained in the chapter [Contributing to a Project](http://git-scm.com/book/en/v2/GitHub-Contributing-to-a-Project) in Pro Git
27 | by Scott Chacon.
28 | Look for the tag `good first issue`. This indicates that the maintainers will welcome a pull request fixing this issue.
29 |
30 |
31 | ## Maintainer(s)
32 |
33 | Current maintainers of this lesson are
34 |
35 | * [Jeremy Cohen](https://github.com/jcohen02)
36 | * [Andy Turner](https://github.com/aturner-epcc)
37 |
38 | ## Authors
39 |
40 | A list of contributors to the lesson can be found in [AUTHORS](AUTHORS)
41 |
42 | ## Citation
43 |
44 | To cite this lesson, please consult with [CITATION](CITATION)
45 |
46 | [cdh]: https://cdh.carpentries.org
47 | [community-lessons]: https://carpentries.org/community-lessons
48 | [lesson-example]: https://carpentries.github.io/lesson-example
49 | [issues]: https://github.com/carpentries-incubator/singularity-introduction/issues
50 |
--------------------------------------------------------------------------------
/.travis.yml:
--------------------------------------------------------------------------------
1 | # Travis CI is only used to check the lesson and is not involved in its deployment
2 | dist: bionic
3 | language: ruby
4 | rvm:
5 | - 2.7.1
6 |
7 | branches:
8 | only:
9 | - gh-pages
10 | - /.*/
11 |
12 | cache:
13 | apt: true
14 | bundler: true
15 | directories:
16 | - /home/travis/.rvm/
17 | - $R_LIBS_USER
18 | - $HOME/.cache/pip
19 |
20 | env:
21 | global:
22 | - NOKOGIRI_USE_SYSTEM_LIBRARIES=true # speeds up installation of html-proofer
23 | - R_LIBS_USER=~/R/Library
24 | - R_LIBS_SITE=/usr/local/lib/R/site-library:/usr/lib/R/site-library
25 | - R_VERSION=4.0.2
26 |
27 | before_install:
28 | ## Install R + pandoc + dependencies
29 | - sudo add-apt-repository -y "ppa:marutter/rrutter4.0"
30 | - sudo add-apt-repository -y "ppa:c2d4u.team/c2d4u4.0+"
31 | - sudo add-apt-repository -y "ppa:ubuntugis/ppa"
32 | - sudo add-apt-repository -y "ppa:cran/travis"
33 | - travis_apt_get_update
34 | - sudo apt-get install -y --no-install-recommends build-essential gcc g++ libblas-dev liblapack-dev libncurses5-dev libreadline-dev libjpeg-dev libpcre3-dev libpng-dev zlib1g-dev libbz2-dev liblzma-dev libicu-dev cdbs qpdf texinfo libssh2-1-dev gfortran jq python3.5 python3-pip r-base
35 | - export PATH=${TRAVIS_HOME}/R-bin/bin:$PATH
36 | - export LD_LIBRARY_PATH=${TRAVIS_HOME}/R-bin/lib:$LD_LIBRARY_PATH
37 | - sudo mkdir -p /usr/local/lib/R/site-library $R_LIBS_USER
38 | - sudo chmod 2777 /usr/local/lib/R /usr/local/lib/R/site-library $R_LIBS_USER
39 | - echo 'options(repos = c(CRAN = "https://packagemanager.rstudio.com/all/__linux__/bionic/latest"))' > ~/.Rprofile.site
40 | - export R_PROFILE=~/.Rprofile.site
41 | - curl -fLo /tmp/texlive.tar.gz https://github.com/jimhester/ubuntu-bin/releases/download/latest/texlive.tar.gz
42 | - tar xzf /tmp/texlive.tar.gz -C ~
43 | - export PATH=${TRAVIS_HOME}/texlive/bin/x86_64-linux:$PATH
44 | - tlmgr update --self
45 | - curl -fLo /tmp/pandoc-2.2-1-amd64.deb https://github.com/jgm/pandoc/releases/download/2.2/pandoc-2.2-1-amd64.deb
46 | - sudo dpkg -i /tmp/pandoc-2.2-1-amd64.deb
47 | - sudo apt-get install -f
48 | - rm /tmp/pandoc-2.2-1-amd64.deb
49 | - Rscript -e "install.packages(setdiff(c('renv', 'rprojroot'), installed.packages()), loc = Sys.getenv('R_LIBS_USER')); update.packages(lib.loc = Sys.getenv('R_LIBS_USER'), ask = FALSE, checkBuilt = TRUE)"
50 | - Rscript -e 'sessionInfo()'
51 | ## Install python and dependencies
52 | - python3 -m pip install --upgrade pip setuptools wheel
53 | - python3 -m pip install pyyaml
54 |
55 | script:
56 | - make lesson-check-all
57 | - make --always-make site
58 |
--------------------------------------------------------------------------------
/bin/boilerplate/.travis.yml:
--------------------------------------------------------------------------------
1 | # Travis CI is only used to check the lesson and is not involved in its deployment
2 | dist: bionic
3 | language: ruby
4 | rvm:
5 | - 2.7.1
6 |
7 | branches:
8 | only:
9 | - gh-pages
10 | - /.*/
11 |
12 | cache:
13 | apt: true
14 | bundler: true
15 | directories:
16 | - /home/travis/.rvm/
17 | - $R_LIBS_USER
18 | - $HOME/.cache/pip
19 |
20 | env:
21 | global:
22 | - NOKOGIRI_USE_SYSTEM_LIBRARIES=true # speeds up installation of html-proofer
23 | - R_LIBS_USER=~/R/Library
24 | - R_LIBS_SITE=/usr/local/lib/R/site-library:/usr/lib/R/site-library
25 | - R_VERSION=4.0.2
26 |
27 | before_install:
28 | ## Install R + pandoc + dependencies
29 | - sudo add-apt-repository -y "ppa:marutter/rrutter4.0"
30 | - sudo add-apt-repository -y "ppa:c2d4u.team/c2d4u4.0+"
31 | - sudo add-apt-repository -y "ppa:ubuntugis/ppa"
32 | - sudo add-apt-repository -y "ppa:cran/travis"
33 | - travis_apt_get_update
34 | - sudo apt-get install -y --no-install-recommends build-essential gcc g++ libblas-dev liblapack-dev libncurses5-dev libreadline-dev libjpeg-dev libpcre3-dev libpng-dev zlib1g-dev libbz2-dev liblzma-dev libicu-dev cdbs qpdf texinfo libssh2-1-dev gfortran jq python3.5 python3-pip r-base
35 | - export PATH=${TRAVIS_HOME}/R-bin/bin:$PATH
36 | - export LD_LIBRARY_PATH=${TRAVIS_HOME}/R-bin/lib:$LD_LIBRARY_PATH
37 | - sudo mkdir -p /usr/local/lib/R/site-library $R_LIBS_USER
38 | - sudo chmod 2777 /usr/local/lib/R /usr/local/lib/R/site-library $R_LIBS_USER
39 | - echo 'options(repos = c(CRAN = "https://packagemanager.rstudio.com/all/__linux__/bionic/latest"))' > ~/.Rprofile.site
40 | - export R_PROFILE=~/.Rprofile.site
41 | - curl -fLo /tmp/texlive.tar.gz https://github.com/jimhester/ubuntu-bin/releases/download/latest/texlive.tar.gz
42 | - tar xzf /tmp/texlive.tar.gz -C ~
43 | - export PATH=${TRAVIS_HOME}/texlive/bin/x86_64-linux:$PATH
44 | - tlmgr update --self
45 | - curl -fLo /tmp/pandoc-2.2-1-amd64.deb https://github.com/jgm/pandoc/releases/download/2.2/pandoc-2.2-1-amd64.deb
46 | - sudo dpkg -i /tmp/pandoc-2.2-1-amd64.deb
47 | - sudo apt-get install -f
48 | - rm /tmp/pandoc-2.2-1-amd64.deb
49 | - Rscript -e "install.packages(setdiff(c('renv', 'rprojroot'), installed.packages()), loc = Sys.getenv('R_LIBS_USER')); update.packages(lib.loc = Sys.getenv('R_LIBS_USER'), ask = FALSE, checkBuilt = TRUE)"
50 | - Rscript -e 'sessionInfo()'
51 | ## Install python and dependencies
52 | - python3 -m pip install --upgrade pip setuptools wheel
53 | - python3 -m pip install pyyaml
54 |
55 | script:
56 | - make lesson-check-all
57 | - make --always-make site
58 |
--------------------------------------------------------------------------------
/bin/chunk-options.R:
--------------------------------------------------------------------------------
1 | # These settings control the behavior of all chunks in the novice R materials.
2 | # For example, to generate the lessons with all the output hidden, simply change
3 | # `results` from "markup" to "hide".
4 | # For more information on available chunk options, see
5 | # http://yihui.name/knitr/options#chunk_options
6 |
7 | library("knitr")
8 |
9 | fix_fig_path <- function(pth) file.path("..", pth)
10 |
11 |
12 | ## We set the path for the figures globally below, so if we want to
13 | ## customize it for individual episodes, we can append a prefix to the
14 | ## global path. For instance, if we call knitr_fig_path("01-") in the
15 | ## first episode of the lesson, it will generate the figures in
16 | ## `fig/rmd-01-`
17 | knitr_fig_path <- function(prefix) {
18 | new_path <- paste0(opts_chunk$get("fig.path"),
19 | prefix)
20 | opts_chunk$set(fig.path = new_path)
21 | }
22 |
23 | ## We use the rmd- prefix for the figures generated by the lessons so
24 | ## they can be easily identified and deleted by `make clean-rmd`. The
25 | ## working directory when the lessons are generated is the root so the
26 | ## figures need to be saved in fig/, but when the site is generated,
27 | ## the episodes will be one level down. We fix the path using the
28 | ## `fig.process` option.
29 |
30 | opts_chunk$set(tidy = FALSE, results = "markup", comment = NA,
31 | fig.align = "center", fig.path = "fig/rmd-",
32 | fig.process = fix_fig_path,
33 | fig.width = 8.5, fig.height = 8.5,
34 | fig.retina = 2)
35 |
36 | # The hooks below add html tags to the code chunks and their output so that they
37 | # are properly formatted when the site is built.
38 |
39 | hook_in <- function(x, options) {
40 | lg <- tolower(options$engine)
41 | style <- paste0(".language-", lg)
42 |
43 | stringr::str_c("\n\n~~~\n",
44 | paste0(x, collapse="\n"),
45 | "\n~~~\n{: ", style, "}\n\n")
46 | }
47 |
48 | hook_out <- function(x, options) {
49 | x <- gsub("\n$", "", x)
50 | stringr::str_c("\n\n~~~\n",
51 | paste0(x, collapse="\n"),
52 | "\n~~~\n{: .output}\n\n")
53 | }
54 |
55 | hook_error <- function(x, options) {
56 | x <- gsub("\n$", "", x)
57 | stringr::str_c("\n\n~~~\n",
58 | paste0(x, collapse="\n"),
59 | "\n~~~\n{: .error}\n\n")
60 | }
61 |
62 | hook_warning <- function(x, options) {
63 | x <- gsub("\n$", "", x)
64 | stringr::str_c("\n\n~~~\n",
65 | paste0(x, collapse = "\n"),
66 | "\n~~~\n{: .warning}\n\n")
67 | }
68 |
69 | knit_hooks$set(source = hook_in, output = hook_out, warning = hook_warning,
70 | error = hook_error, message = hook_out)
71 |
--------------------------------------------------------------------------------
/_extras/figures.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: Figures
3 | ---
4 |
5 | {% include base_path.html %}
6 | {% include manual_episode_order.html %}
7 |
8 |
67 |
68 | {% comment %} Create anchor for each one of the episodes. {% endcomment %}
69 |
70 | {% for lesson_episode in lesson_episodes %}
71 | {% if site.episode_order %}
72 | {% assign episode = site.episodes | where: "slug", lesson_episode | first %}
73 | {% else %}
74 | {% assign episode = lesson_episode %}
75 | {% endif %}
76 |
77 | {% endfor %}
78 |
79 | {% include links.md %}
80 |
--------------------------------------------------------------------------------
/bin/boilerplate/_extras/figures.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: Figures
3 | ---
4 |
5 | {% include base_path.html %}
6 | {% include manual_episode_order.html %}
7 |
8 |
67 |
68 | {% comment %} Create anchor for each one of the episodes. {% endcomment %}
69 |
70 | {% for lesson_episode in lesson_episodes %}
71 | {% if site.episode_order %}
72 | {% assign episode = site.episodes | where: "slug", lesson_episode | first %}
73 | {% else %}
74 | {% assign episode = lesson_episode %}
75 | {% endif %}
76 |
77 | {% endfor %}
78 |
79 | {% include links.md %}
80 |
--------------------------------------------------------------------------------
/LICENSE.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: page
3 | title: "Licenses"
4 | root: .
5 | ---
6 | ## Instructional Material
7 |
8 | All Software Carpentry, Data Carpentry, and Library Carpentry instructional material is
9 | made available under the [Creative Commons Attribution
10 | license][cc-by-human]. The following is a human-readable summary of
11 | (and not a substitute for) the [full legal text of the CC BY 4.0
12 | license][cc-by-legal].
13 |
14 | You are free:
15 |
16 | * to **Share**---copy and redistribute the material in any medium or format
17 | * to **Adapt**---remix, transform, and build upon the material
18 |
19 | for any purpose, even commercially.
20 |
21 | The licensor cannot revoke these freedoms as long as you follow the
22 | license terms.
23 |
24 | Under the following terms:
25 |
26 | * **Attribution**---You must give appropriate credit (mentioning that
27 | your work is derived from work that is Copyright © Software
28 | Carpentry and, where practical, linking to
29 | http://software-carpentry.org/), provide a [link to the
30 | license][cc-by-human], and indicate if changes were made. You may do
31 | so in any reasonable manner, but not in any way that suggests the
32 | licensor endorses you or your use.
33 |
34 | **No additional restrictions**---You may not apply legal terms or
35 | technological measures that legally restrict others from doing
36 | anything the license permits. With the understanding that:
37 |
38 | Notices:
39 |
40 | * You do not have to comply with the license for elements of the
41 | material in the public domain or where your use is permitted by an
42 | applicable exception or limitation.
43 | * No warranties are given. The license may not give you all of the
44 | permissions necessary for your intended use. For example, other
45 | rights such as publicity, privacy, or moral rights may limit how you
46 | use the material.
47 |
48 | ## Software
49 |
50 | Except where otherwise noted, the example programs and other software
51 | provided by Software Carpentry and Data Carpentry are made available under the
52 | [OSI][osi]-approved
53 | [MIT license][mit-license].
54 |
55 | Permission is hereby granted, free of charge, to any person obtaining
56 | a copy of this software and associated documentation files (the
57 | "Software"), to deal in the Software without restriction, including
58 | without limitation the rights to use, copy, modify, merge, publish,
59 | distribute, sublicense, and/or sell copies of the Software, and to
60 | permit persons to whom the Software is furnished to do so, subject to
61 | the following conditions:
62 |
63 | The above copyright notice and this permission notice shall be
64 | included in all copies or substantial portions of the Software.
65 |
66 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
67 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
68 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
69 | NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
70 | LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
71 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
72 | WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
73 |
74 | ## Trademark
75 |
76 | "Software Carpentry" and "Data Carpentry" and their respective logos
77 | are registered trademarks of [Community Initiatives][CI].
78 |
79 | [cc-by-human]: https://creativecommons.org/licenses/by/4.0/
80 | [cc-by-legal]: https://creativecommons.org/licenses/by/4.0/legalcode
81 | [mit-license]: https://opensource.org/licenses/mit-license.html
82 | [ci]: http://communityin.org/
83 | [osi]: https://opensource.org
84 |
--------------------------------------------------------------------------------
/bin/boilerplate/_config.yml:
--------------------------------------------------------------------------------
1 | #------------------------------------------------------------
2 | # Values for this lesson.
3 | #------------------------------------------------------------
4 |
5 | # Which carpentry is this ("swc", "dc", "lc", or "cp")?
6 | # swc: Software Carpentry
7 | # dc: Data Carpentry
8 | # lc: Library Carpentry
9 | # cp: Carpentries (to use for instructor training for instance)
10 | carpentry: "swc"
11 |
12 | # Overall title for pages.
13 | title: "Lesson Title"
14 |
15 | # Life cycle stage of the lesson
16 | # See this page for more details: https://cdh.carpentries.org/the-lesson-life-cycle.html
17 | # Possible values: "pre-alpha", "alpha", "beta", "stable"
18 | life_cycle: "pre-alpha"
19 |
20 | #------------------------------------------------------------
21 | # Generic settings (should not need to change).
22 | #------------------------------------------------------------
23 |
24 | # What kind of thing is this ("workshop" or "lesson")?
25 | kind: "lesson"
26 |
27 | # Magic to make URLs resolve both locally and on GitHub.
28 | # See https://help.github.com/articles/repository-metadata-on-github-pages/.
29 | # Please don't change it: / is correct.
30 | repository: /
31 |
32 | # Email address, no mailto:
33 | email: "team@carpentries.org"
34 |
35 | # Sites.
36 | amy_site: "https://amy.carpentries.org/"
37 | carpentries_github: "https://github.com/carpentries"
38 | carpentries_pages: "https://carpentries.github.io"
39 | carpentries_site: "https://carpentries.org/"
40 | dc_site: "https://datacarpentry.org"
41 | example_repo: "https://github.com/carpentries/lesson-example"
42 | example_site: "https://carpentries.github.io/lesson-example"
43 | lc_site: "https://librarycarpentry.org/"
44 | swc_github: "https://github.com/swcarpentry"
45 | swc_pages: "https://swcarpentry.github.io"
46 | swc_site: "https://software-carpentry.org"
47 | template_repo: "https://github.com/carpentries/styles"
48 | training_site: "https://carpentries.github.io/instructor-training"
49 | workshop_repo: "https://github.com/carpentries/workshop-template"
50 | workshop_site: "https://carpentries.github.io/workshop-template"
51 | cc_by_human: "https://creativecommons.org/licenses/by/4.0/"
52 |
53 | # Surveys.
54 | pre_survey: "https://carpentries.typeform.com/to/wi32rS?slug="
55 | post_survey: "https://carpentries.typeform.com/to/UgVdRQ?slug="
56 | instructor_pre_survey: "https://www.surveymonkey.com/r/instructor_training_pre_survey?workshop_id="
57 | instructor_post_survey: "https://www.surveymonkey.com/r/instructor_training_post_survey?workshop_id="
58 |
59 |
60 | # Start time in minutes (0 to be clock-independent, 540 to show a start at 09:00 am).
61 | start_time: 0
62 |
63 | # Specify that things in the episodes collection should be output.
64 | collections:
65 | episodes:
66 | output: true
67 | permalink: /:path/index.html
68 | extras:
69 | output: true
70 | permalink: /:path/index.html
71 |
72 | # Set the default layout for things in the episodes collection.
73 | defaults:
74 | - values:
75 | root: .
76 | layout: page
77 | - scope:
78 | path: ""
79 | type: episodes
80 | values:
81 | root: ..
82 | layout: episode
83 | - scope:
84 | path: ""
85 | type: extras
86 | values:
87 | root: ..
88 | layout: page
89 |
90 | # Files and directories that are not to be copied.
91 | exclude:
92 | - Makefile
93 | - bin/
94 | - .Rproj.user/
95 | - .vendor/
96 | - vendor/
97 | - .docker-vendor/
98 |
99 | # Turn on built-in syntax highlighting.
100 | highlighter: rouge
101 |
--------------------------------------------------------------------------------
/_config.yml:
--------------------------------------------------------------------------------
1 | #------------------------------------------------------------
2 | # Values for this lesson.
3 | #------------------------------------------------------------
4 |
5 | # Which carpentry is this ("swc", "dc", "lc", or "cp")?
6 | # swc: Software Carpentry
7 | # dc: Data Carpentry
8 | # lc: Library Carpentry
9 | # cp: Carpentries (to use for instructor training for instance)
10 | carpentry: "incubator" # "swc" in future
11 |
12 | # Overall title for pages.
13 | title: "Reproducible computational environments using containers: Introduction to Singularity"
14 |
15 | # Life cycle stage of the lesson
16 | # See this page for more details: https://cdh.carpentries.org/the-lesson-life-cycle.html
17 | # Possible values: "pre-alpha", "alpha", "beta", "stable"
18 | life_cycle: "stable"
19 |
20 | #------------------------------------------------------------
21 | # Generic settings (should not need to change).
22 | #------------------------------------------------------------
23 |
24 | # What kind of thing is this ("workshop" or "lesson")?
25 | kind: "lesson"
26 |
27 | # Magic to make URLs resolve both locally and on GitHub.
28 | # See https://help.github.com/articles/repository-metadata-on-github-pages/.
29 | # Please don't change it: / is correct.
30 | repository: /
31 |
32 | # Email address, no mailto:
33 | email: "a.turner@epcc.ed.ac.uk"
34 |
35 | # Sites.
36 | amy_site: "https://amy.carpentries.org/"
37 | carpentries_github: "https://github.com/carpentries"
38 | carpentries_pages: "https://carpentries.github.io"
39 | carpentries_site: "https://carpentries.org/"
40 | dc_site: "https://datacarpentry.org"
41 | example_repo: "https://github.com/carpentries/lesson-example"
42 | example_site: "https://carpentries.github.io/lesson-example"
43 | lc_site: "https://librarycarpentry.org/"
44 | swc_github: "https://github.com/swcarpentry"
45 | swc_pages: "https://swcarpentry.github.io"
46 | swc_site: "https://software-carpentry.org"
47 | template_repo: "https://github.com/carpentries/styles"
48 | training_site: "https://carpentries.github.io/instructor-training"
49 | workshop_repo: "https://github.com/carpentries/workshop-template"
50 | workshop_site: "https://carpentries.github.io/workshop-template"
51 | cc_by_human: "https://creativecommons.org/licenses/by/4.0/"
52 |
53 | # Surveys.
54 | pre_survey: "https://carpentries.typeform.com/to/wi32rS?slug="
55 | post_survey: "https://carpentries.typeform.com/to/UgVdRQ?slug="
56 | instructor_pre_survey: "https://www.surveymonkey.com/r/instructor_training_pre_survey?workshop_id="
57 | instructor_post_survey: "https://www.surveymonkey.com/r/instructor_training_post_survey?workshop_id="
58 |
59 |
60 | # Start time in minutes (0 to be clock-independent, 540 to show a start at 09:00 am).
61 | start_time: 570
62 |
63 | # Specify that things in the episodes collection should be output.
64 | collections:
65 | episodes:
66 | output: true
67 | permalink: /:path/index.html
68 | extras:
69 | output: true
70 | permalink: /:path/index.html
71 |
72 | # Set the default layout for things in the episodes collection.
73 | defaults:
74 | - values:
75 | root: .
76 | layout: page
77 | - scope:
78 | path: ""
79 | type: episodes
80 | values:
81 | root: ..
82 | layout: episode
83 | - scope:
84 | path: ""
85 | type: extras
86 | values:
87 | root: ..
88 | layout: page
89 |
90 | # Files and directories that are not to be copied.
91 | exclude:
92 | - Makefile
93 | - bin/
94 | - .Rproj.user/
95 | - .vendor/
96 | - vendor/
97 | - .docker-vendor/
98 |
99 | # Turn on built-in syntax highlighting.
100 | highlighter: rouge
101 |
102 | # Remote theme
103 | remote_theme: carpentries/carpentries-theme@main
104 |
105 | episode_order:
106 | - introduction
107 | - singularity-gettingstarted
108 | - singularity-shell
109 | - singularity-docker
110 | - am-break
111 | - singularity-cache
112 | - singularity-files
113 | - creating-container-images
114 | - lunch
115 | - advanced-containers
116 | - singularity-mpi
117 | - pm-break
118 | - reproducibility
119 | - advanced-topics
120 | - singularity-blast
121 |
122 |
123 |
--------------------------------------------------------------------------------
/_episodes/singularity-shell.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Using Singularity containers to run commands"
3 | teaching: 10
4 | exercises: 5
5 | questions:
6 | - "How do I run different commands within a container?"
7 | - "How do I access an interactive shell within a container?"
8 | objectives:
9 | - "Learn how to run different commands when starting a container."
10 | - "Learn how to open an interactive shell within a container environment."
11 | keypoints:
12 | - "The `singularity exec` is an alternative to `singularity run` that allows you to start a container running a specific command."
13 | - "The `singularity shell` command can be used to start a container and run an interactive shell within it."
14 | ---
15 |
16 | ## Running specific commands within a container
17 |
18 | We saw earlier that we can use the `singularity inspect` command to see the run script that a container is configured to run by default. What if we want to run a different command within a container?
19 |
20 | If we know the path of an executable that we want to run within a container, we can use the `singularity exec` command. For example, using the `lolcow.sif` container that we've already pulled from Singularity Hub, we can run the following within the `test` directory where the `lolcow.sif` file is located:
21 |
22 | ~~~
23 | remote$ singularity exec lolcow.sif /bin/echo "Hello, world"
24 | ~~~
25 | {: .language-bash}
26 |
27 | ~~~
28 | Hello, world
29 | ~~~
30 | {: .output}
31 |
32 | Here we see that a container has been started from the `lolcow.sif` image and the `/bin/echo` command has been run within the container, passing the input `Hello, world`. The command has echoed the provided input to the console and the container has terminated.
33 |
34 | Note that the use of `singularity exec` has overridden any run script set within the image metadata and the command that we specified as an argument to `singularity exec` has been run instead.
35 |
36 | > ## Basic exercise: Running a different command within the "lolcow" container
37 | >
38 | > Can you run a container based on the `lolcow.sif` image that *prints the current date and time*?
39 | >
40 | > > ## Solution
41 | > >
42 | > > ~~~
43 | > > remote$ singularity exec lolcow.sif /bin/date
44 | > > ~~~
45 | > > {: .language-bash}
46 | > >
47 | > > ~~~
48 | > > Fri Jun 26 15:17:44 BST 2020
49 | > > ~~~
50 | > > {: .output}
51 | > {: .solution}
52 | {: .challenge}
53 |
54 |
55 | ### Difference between `singularity run` and `singularity exec`
56 |
57 | Above, we used the `singularity exec` command. In earlier episodes of this
58 | course we used `singularity run`. To clarify, the difference between these
59 | two commands is:
60 |
61 | - `singularity run`: This will run the default command set for containers
62 | based on the specified image. This default command is set within
63 | the image metadata when the image is built (we'll see more about this
64 | in later episodes). You do not specify a command to run when using
65 | `singularity run`, you simply specify the image file name. As we saw
66 | earlier, you can use the `singularity inspect` command to see what command
67 | is run by default when starting a new container based on an image.
68 |
69 | - `singularity exec`: This will start a container based on the specified
70 | image and run the command provided on the command line following
71 |   `singularity exec <image name>`. This will override any default
72 | command specified within the image metadata that would otherwise be
73 | run if you used `singularity run`.
74 |
75 | ## Opening an interactive shell within a container
76 |
77 | If you want to open an interactive shell within a container, Singularity provides the `singularity shell` command. Again, using the `lolcow.sif` image, and within our `test` directory, we can run a shell within a container started from this image:
78 |
79 | ~~~
80 | remote$ singularity shell lolcow.sif
81 | ~~~
82 | {: .language-bash}
83 |
84 | ~~~
85 | Singularity> whoami
86 | <your username>
87 | Singularity> ls
88 | lolcow.sif
89 | Singularity>
90 | ~~~
91 | {: .output}
92 |
93 | As shown above, we have opened a shell in a new container started from the `lolcow.sif` image. Note that the shell prompt has changed to show we are now within the Singularity container.
94 |
95 | Use the `exit` command to exit from the container shell.
96 |
97 |
98 |
--------------------------------------------------------------------------------
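To make the `singularity run` / `singularity exec` distinction in this episode concrete, here is a short combined sketch, assuming `lolcow.sif` is in the current directory (output omitted):

~~~
# Show the default run script stored in the image metadata:
singularity inspect --runscript lolcow.sif

# Start a container and execute that default run script:
singularity run lolcow.sif

# Start a container but override the default with an arbitrary command:
singularity exec lolcow.sif /bin/date
~~~
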
/_extras/guide.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: page
3 | title: "Instructor Notes"
4 | ---
5 |
6 | ## Resources for Instructors
7 |
8 | ## Workshop Structure
9 |
10 | _[Instructors, please add notes here reporting on your experiences of teaching this module, either standalone, or as part of a wider workshop.]_
11 |
12 | - **Containers course covering Docker and Singularity:** This Singularity module is regularly taught alongside the [Introduction to Docker](https://github.com/carpentries-incubator/docker-introduction) module as part of a 2-day course "_Reproducible computational environments using containers_" run through the [ARCHER2 training programme](https://www.archer2.ac.uk/training/) in the UK. [See an example](https://www.archer2.ac.uk/training/courses/221207-containers/) of this course run in December 2022.
13 |
14 | This course has been run both online and in person. Experience suggests that this Singularity module requires between 5 and 6 hours to run effectively. The main aspect that takes a significant amount of time is the material at the end of the module looking at building a Singularity image containing an MPI code and then running this in parallel on an HPC platform. The variation in timing depends on how much experience the learners already have with running parallel jobs on HPC platforms and how much they wish to go into the details of running parallel jobs. For some groups of learners, the MPI use case is not relevant and they may request to cover this material without the section on running parallel MPI jobs. In this case, the material can comfortably be completed in 4 hours.
15 |
16 |
17 |
18 | ## Technical tips and tricks
19 |
20 | - **HPC access:** Many learners will be keen to learn Singularity so that they can make use of it on a remote High Performance Computing (HPC) cluster. It is therefore strongly recommended that workshop organizers provide course attendees with access to an HPC platform that has Singularity pre-installed for undertaking this module. Where necessary, it is also recommended that guest accounts are set up and learners are asked to test access to the platform before the workshop.
21 |
22 | - **Use of the Singularity Docker container:** Singularity is a Linux tool. The optimal approach to building Singularity images, a key aspect of the material in this module, requires that the learner have a platform with Singularity installed, on which they have admin/root access. Since it's likely that many learners undertaking this module will not be using Linux as the main operating system on the computer on which they are undertaking the training, we need an alternative option. To address this, we use Docker to run the [Singularity Docker container](https://quay.io/repository/singularity/singularity). This ensures that learners have access to a local Singularity deployment (running within a Docker container) on which they have root access and can build Singularity images. The layers of indirection that this requires can prove confusing and we are looking at alternatives, but from experience of teaching the module so far, this has proved to be the most reasonable solution at present. A sketch of this approach is shown below.
23 |
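   | As a minimal sketch of this approach (assuming Docker is installed on the host and that the container image's entrypoint invokes `singularity`; the tag, mount point and file names here are illustrative and will need adapting):
   |
   | ~~~
   | host$ docker run --privileged --rm -v $(pwd):/work -w /work \
   |     quay.io/singularity/singularity:v4.1.0 build my_image.sif my_recipe.def
   | ~~~
   | {: .language-bash}
   |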
24 | #### Pre-workshop Planning / Installation / Setup
25 |
26 | As highlighted above, this module is designed to support learners who wish to use Singularity on an HPC cluster. Elements of the module are therefore designed to be run on a High Performance Computing cluster (i.e. a cluster that runs SLURM, SGE, or other job scheduling software). It is possible for learners to undertake large parts of the module on their own computer. However, since a key use case for Singularity containers is their use on HPC infrastructure, it is recommended that an HPC platform be used in the teaching of this module. In such a case, if Singularity is not already on the cluster, admin rights would be required to install Singularity cluster-wide. Practically, it is likely that this will require a support ticket to be raised with cluster administrators and may require some time for them to investigate the software if they are unfamiliar with Singularity.
27 |
28 | ## Common problems
29 |
30 | Some installs of Singularity require the use of `--bind` for compatibility between the container and the host system. If the host system does not have a directory that is a default or required directory in the container, that directory will need to be bound to another location in order for things to work correctly (e.g. `--bind /data:/mnt`).
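   |
   | For example (the image name and directories here are illustrative), to make the host's `/data` directory visible inside the container at `/mnt`:
   |
   | ~~~
   | remote$ singularity exec --bind /data:/mnt my_image.sif ls /mnt
   | ~~~
   | {: .language-bash}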
31 |
32 |
33 |
34 |
--------------------------------------------------------------------------------
/_episodes/singularity-cache.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: "The Singularity cache"
3 | teaching: 10
4 | exercises: 0
5 | questions:
6 | - "Why does Singularity use a local cache?"
7 | - "Where does Singularity store images?"
8 | objectives:
9 | - "Learn about Singularity's image cache."
10 | - "Learn how to manage Singularity images stored locally."
11 | keypoints:
12 | - "Singularity caches downloaded images so that an unchanged image isn't downloaded again when it is requested using the `singularity pull` command."
13 | - "You can free up space in the cache by removing all locally cached images or by specifying individual images to remove."
14 | ---
15 |
16 | ## Singularity's image cache
17 |
18 | Singularity uses a local cache to save downloaded container image files in addition to storing them as the file you specify. As we saw in the previous episode, images are simply `.sif` files stored on your local disk.
19 |
20 | If you delete a local `.sif` container image that you have pulled from a remote container image repository and then pull it again, and the container image is unchanged from the version you previously pulled, you will be given a copy of the container image file from your local cache rather than the container image being downloaded again from the remote source. This avoids unnecessary network transfers and is particularly useful for large container images which may take some time to transfer over the network. To demonstrate this, remove the `lolcow.sif` file stored in your `test` directory and then issue the `pull` command again:
21 |
22 | ~~~
23 | remote$ rm lolcow.sif
24 | remote$ singularity pull lolcow.sif library://lolcow
25 | ~~~
26 | {: .language-bash}
27 |
28 | ~~~
29 | INFO: Using cached image
30 | ~~~
31 | {: .output}
32 |
33 | As we can see in the above output, the container image has been returned from the cache and we do not see the output that we saw previously showing the container image being downloaded from the Cloud Library.
34 |
35 | How do we know what is stored in the local cache? We can find out using the `singularity cache` command:
36 |
37 | ~~~
38 | remote$ singularity cache list
39 | ~~~
40 | {: .language-bash}
41 |
42 | ~~~
43 | There are 2 container file(s) using 129.35 MiB and 7 oci blob file(s) using 41.49 MiB of space
44 | Total space used: 170.84 MiB
45 | ~~~
46 | {: .output}
47 |
48 | This tells us how many container image files are stored in the cache and how much disk space the cache is using but it doesn't tell us _what_ is actually being stored. To find out more information we can add the `-v` verbose flag to the `list` command:
49 |
50 | ~~~
51 | remote$ singularity cache list -v
52 | ~~~
53 | {: .language-bash}
54 |
55 | ~~~
59 | NAME DATE CREATED SIZE TYPE
60 | 50b2668d8d3f74c49a7280 2023-09-12 11:41:31 0.96 KiB blob
61 | 76f124aca9afaf3f75812d 2023-09-12 11:41:30 2.51 MiB blob
62 | 7becefa709e2358336177a 2023-09-12 11:41:31 6.25 KiB blob
63 | 87bc5aa6fc4253b93dee0a 2023-09-12 11:41:30 0.23 KiB blob
64 | dae1d9fd74c12f7e66b92c 2023-09-12 11:41:29 10.43 MiB blob
65 | e1acddbe380c63f0de4b77 2023-09-12 11:41:27 25.89 MiB blob
66 | ecc7ff4d26223f4545c4fd 2023-09-12 11:41:28 2.64 MiB blob
67 | sha256.cef378b9a9274c2 2023-09-12 11:39:18 90.43 MiB library
68 | 28bed4c51c3b531159d8af 2023-09-12 11:41:36 38.92 MiB oci-tmp
69 |
70 | There are 2 container file(s) using 129.35 MiB and 7 oci blob file(s) using 41.49 MiB of space
71 | Total space used: 170.84 MiB
72 | ~~~
73 | {: .output}
74 |
75 | This provides us with some more useful information about the actual container images stored in the cache. In the `TYPE` column we can see that our container image type is `library` because it's a `SIF` container image that has been pulled from the Cloud Library.
76 |
77 | > ## Cleaning the Singularity image cache
78 | > We can remove container images from the cache using the `singularity cache clean` command. Running the command without any options will display a warning and ask you to confirm that you want to remove everything from your cache.
79 | >
80 | > You can also remove specific container images or all container images of a particular type. Look at the output of `singularity cache clean --help` for more information.
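   | >
   | > For example (a sketch; the options available vary between Singularity versions, so do check `--help`), the following would remove only the cached container images pulled from the Cloud Library:
   | >
   | > ~~~
   | > remote$ singularity cache clean --type=library
   | > ~~~
   | > {: .language-bash}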
81 | {: .callout}
82 |
83 | > ## Cache location
84 | > By default, Singularity uses `$HOME/.singularity/cache` as the location for the cache. You can change the location of the cache by setting the `SINGULARITY_CACHEDIR` environment variable to the cache location you want to use.
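   | >
   | > For example (the path here is illustrative; choose a location appropriate to your system, such as a scratch or work file system with more space than your home directory):
   | >
   | > ~~~
   | > remote$ export SINGULARITY_CACHEDIR=/scratch/$USER/singularity_cache
   | > ~~~
   | > {: .language-bash}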
85 | {: .callout}
86 |
--------------------------------------------------------------------------------
/.github/workflows/template.yml:
--------------------------------------------------------------------------------
1 | name: Template
2 | on:
3 | push:
4 | branches: gh-pages
5 | pull_request:
6 | jobs:
7 | check-template:
8 | name: Test lesson template
9 | if: github.repository == 'carpentries/styles'
10 | runs-on: ${{ matrix.os }}
11 | strategy:
12 | fail-fast: false
13 | matrix:
14 | lesson: [swcarpentry/shell-novice, datacarpentry/r-intro-geospatial, librarycarpentry/lc-git]
15 | os: [ubuntu-latest, macos-latest, windows-latest]
16 | defaults:
17 | run:
18 | shell: bash # forces 'Git for Windows' on Windows
19 | env:
20 | RSPM: 'https://packagemanager.rstudio.com/cran/__linux__/bionic/latest'
21 | steps:
22 | - name: Set up Ruby
23 | uses: actions/setup-ruby@main
24 | with:
25 | ruby-version: '2.7.1'
26 |
27 | - name: Set up Python
28 | uses: actions/setup-python@v2
29 | with:
30 | python-version: '3.x'
31 |
32 | - name: Install GitHub Pages, Bundler, and kramdown gems
33 | run: |
34 | gem install github-pages bundler kramdown
35 |
36 | - name: Install Python modules
37 | run: |
38 | if [[ $RUNNER_OS == macOS || $RUNNER_OS == Linux ]]; then
39 | python3 -m pip install --upgrade pip setuptools wheel pyyaml==5.3.1 requests
40 | elif [[ $RUNNER_OS == Windows ]]; then
41 | python -m pip install --upgrade pip setuptools wheel pyyaml==5.3.1 requests
42 | fi
43 |
44 | - name: Checkout the ${{ matrix.lesson }} lesson
45 | uses: actions/checkout@master
46 | with:
47 | repository: ${{ matrix.lesson }}
48 | path: lesson
49 | fetch-depth: 0
50 |
51 | - name: Determine the proper reference to use
52 | id: styles-ref
53 | run: |
54 | if [[ -n "${{ github.event.pull_request.number }}" ]]; then
55 | echo "::set-output name=ref::refs/pull/${{ github.event.pull_request.number }}/head"
56 | else
57 | echo "::set-output name=ref::gh-pages"
58 | fi
59 |
60 | - name: Sync lesson with carpentries/styles
61 | working-directory: lesson
62 | run: |
63 | git config --global user.email "team@carpentries.org"
64 | git config --global user.name "The Carpentries Bot"
65 | git remote add styles https://github.com/carpentries/styles.git
66 | git config --local remote.styles.tagOpt --no-tags
67 | git fetch styles ${{ steps.styles-ref.outputs.ref }}:styles-ref
68 | git merge -s recursive -Xtheirs --no-commit styles-ref
69 | git commit -m "Sync lesson with carpentries/styles"
70 |
71 | - name: Look for R-markdown files
72 | id: check-rmd
73 | working-directory: lesson
74 | run: |
75 | echo "count=$(shopt -s nullglob; files=($(find . -iname '*.Rmd')); echo ${#files[@]})" >> $GITHUB_OUTPUT
76 |
77 | - name: Set up R
78 | if: steps.check-rmd.outputs.count != 0
79 | uses: r-lib/actions/setup-r@v2
80 | with:
81 | use-public-rspm: true
82 | install-r: false
83 | r-version: 'release'
84 |
85 | - name: Install needed packages
86 | if: steps.check-rmd.outputs.count != 0
87 | run: |
88 | install.packages(c('remotes', 'rprojroot', 'renv', 'desc', 'rmarkdown', 'knitr'))
89 | shell: Rscript {0}
90 |
91 | - name: Query dependencies
92 | if: steps.check-rmd.outputs.count != 0
93 | working-directory: lesson
94 | run: |
95 | source('bin/dependencies.R')
96 | deps <- identify_dependencies()
97 | create_description(deps)
98 | saveRDS(remotes::dev_package_deps(dependencies = TRUE), ".github/depends.Rds", version = 2)
99 | writeLines(sprintf("R-%i.%i", getRversion()$major, getRversion()$minor), ".github/R-version")
100 | shell: Rscript {0}
101 |
102 | - name: Cache R packages
103 | if: runner.os != 'Windows' && steps.check-rmd.outputs.count != 0
104 | uses: actions/cache@v1
105 | with:
106 | path: ${{ env.R_LIBS_USER }}
107 | key: ${{ runner.os }}-${{ hashFiles('.github/R-version') }}-1-${{ hashFiles('.github/depends.Rds') }}
108 | restore-keys: ${{ runner.os }}-${{ hashFiles('.github/R-version') }}-1-
109 |
110 | - name: Install system dependencies for R packages
111 | if: runner.os == 'Linux' && steps.check-rmd.outputs.count != 0
112 | working-directory: lesson
113 | run: |
114 | while read -r cmd
115 | do
116 | eval sudo $cmd
117 | done < <(Rscript -e 'cat(remotes::system_requirements("ubuntu", "18.04"), sep = "\n")')
118 |
119 | - run: make site
120 | working-directory: lesson
121 |
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
1 | ## ========================================
2 | ## Commands for both workshop and lesson websites.
3 |
4 | # Settings
5 | MAKEFILES=Makefile $(wildcard *.mk)
6 | JEKYLL=bundle config --local set path .vendor/bundle && bundle install && bundle update && bundle exec jekyll
7 | PARSER=bin/markdown_ast.rb
8 | DST=_site
9 |
10 | # Check Python 3 is installed and determine if it's called via python3 or python
11 | # (https://stackoverflow.com/a/4933395)
12 | PYTHON3_EXE := $(shell which python3 2>/dev/null)
13 | ifneq (, $(PYTHON3_EXE))
14 | ifeq (,$(findstring Microsoft/WindowsApps/python3,$(subst \,/,$(PYTHON3_EXE))))
15 | PYTHON := python3
16 | endif
17 | endif
18 |
19 | ifeq (,$(PYTHON))
20 | PYTHON_EXE := $(shell which python 2>/dev/null)
21 | ifneq (, $(PYTHON_EXE))
22 | PYTHON_VERSION_FULL := $(wordlist 2,4,$(subst ., ,$(shell python --version 2>&1)))
23 | PYTHON_VERSION_MAJOR := $(word 1,${PYTHON_VERSION_FULL})
24 | ifneq (3, ${PYTHON_VERSION_MAJOR})
25 | $(error "Your system does not appear to have Python 3 installed.")
26 | endif
27 | PYTHON := python
28 | else
29 | $(error "Your system does not appear to have any Python installed.")
30 | endif
31 | endif
32 |
33 |
34 | # Controls
35 | .PHONY : commands clean files
36 |
37 | # Default target
38 | .DEFAULT_GOAL := commands
39 |
40 | ## I. Commands for both workshop and lesson websites
41 | ## =================================================
42 |
43 | ## * serve : render website and run a local server
44 | serve : lesson-md
45 | ${JEKYLL} serve
46 |
47 | ## * site : build website but do not run a server
48 | site : lesson-md
49 | ${JEKYLL} build
50 |
51 | ## * docker-serve : use Docker to serve the site
52 | docker-serve :
53 | docker pull carpentries/lesson-docker:latest
54 | docker run --rm -it \
55 | -v $${PWD}:/home/rstudio \
56 | -p 4000:4000 \
57 | -p 8787:8787 \
58 | -e USERID=$$(id -u) \
59 | -e GROUPID=$$(id -g) \
60 | carpentries/lesson-docker:latest
61 |
62 | ## * repo-check : check repository settings
63 | repo-check :
64 | @${PYTHON} bin/repo_check.py -s .
65 |
66 | ## * clean : clean up junk files
67 | clean :
68 | @rm -rf ${DST}
69 | @rm -rf .sass-cache
70 | @rm -rf bin/__pycache__
71 | @find . -name .DS_Store -exec rm {} \;
72 | @find . -name '*~' -exec rm {} \;
73 | @find . -name '*.pyc' -exec rm {} \;
74 |
75 | ## * clean-rmd : clean intermediate R files (that need to be committed to the repo)
76 | clean-rmd :
77 | @rm -rf ${RMD_DST}
78 | @rm -rf fig/rmd-*
79 |
80 |
81 | ##
82 | ## II. Commands specific to workshop websites
83 | ## =================================================
84 |
85 | .PHONY : workshop-check
86 |
87 | ## * workshop-check : check workshop homepage
88 | workshop-check :
89 | @${PYTHON} bin/workshop_check.py .
90 |
91 |
92 | ##
93 | ## III. Commands specific to lesson websites
94 | ## =================================================
95 |
96 | .PHONY : lesson-check lesson-md lesson-files lesson-fixme install-rmd-deps
97 |
98 | # RMarkdown files
99 | RMD_SRC = $(wildcard _episodes_rmd/??-*.Rmd)
100 | RMD_DST = $(patsubst _episodes_rmd/%.Rmd,_episodes/%.md,$(RMD_SRC))
101 |
102 | # Lesson source files in the order they appear in the navigation menu.
103 | MARKDOWN_SRC = \
104 | index.md \
105 | CODE_OF_CONDUCT.md \
106 | setup.md \
107 | $(sort $(wildcard _episodes/*.md)) \
108 | reference.md \
109 | $(sort $(wildcard _extras/*.md)) \
110 | LICENSE.md
111 |
112 | # Generated lesson files in the order they appear in the navigation menu.
113 | HTML_DST = \
114 | ${DST}/index.html \
115 | ${DST}/conduct/index.html \
116 | ${DST}/setup/index.html \
117 | $(patsubst _episodes/%.md,${DST}/%/index.html,$(sort $(wildcard _episodes/*.md))) \
118 | ${DST}/reference/index.html \
119 | $(patsubst _extras/%.md,${DST}/%/index.html,$(sort $(wildcard _extras/*.md))) \
120 | ${DST}/license/index.html
121 |
122 | ## * install-rmd-deps : Install R packages dependencies to build the RMarkdown lesson
123 | install-rmd-deps:
124 | @${SHELL} bin/install_r_deps.sh
125 |
126 | ## * lesson-md : convert Rmarkdown files to markdown
127 | lesson-md : ${RMD_DST}
128 |
129 | _episodes/%.md: _episodes_rmd/%.Rmd install-rmd-deps
130 | @mkdir -p _episodes
131 | @bin/knit_lessons.sh $< $@
132 |
133 | ## * lesson-check : validate lesson Markdown
134 | lesson-check : lesson-fixme
135 | @${PYTHON} bin/lesson_check.py -s . -p ${PARSER} -r _includes/links.md
136 |
137 | ## * lesson-check-all : validate lesson Markdown, checking line lengths and trailing whitespace
138 | lesson-check-all :
139 | @${PYTHON} bin/lesson_check.py -s . -p ${PARSER} -r _includes/links.md -l -w --permissive
140 |
141 | ## * unittest : run unit tests on checking tools
142 | unittest :
143 | @${PYTHON} bin/test_lesson_check.py
144 |
145 | ## * lesson-files : show expected names of generated files for debugging
146 | lesson-files :
147 | @echo 'RMD_SRC:' ${RMD_SRC}
148 | @echo 'RMD_DST:' ${RMD_DST}
149 | @echo 'MARKDOWN_SRC:' ${MARKDOWN_SRC}
150 | @echo 'HTML_DST:' ${HTML_DST}
151 |
152 | ## * lesson-fixme : show FIXME markers embedded in source files
153 | lesson-fixme :
154 | @grep --fixed-strings --word-regexp --line-number --no-messages FIXME ${MARKDOWN_SRC} || true
155 |
156 | ##
157 | ## IV. Auxiliary (plumbing) commands
158 | ## =================================================
159 |
160 | ## * commands : show all commands.
161 | commands :
162 | @sed -n -e '/^##/s|^##[[:space:]]*||p' $(MAKEFILE_LIST)
163 |
--------------------------------------------------------------------------------
/bin/repo_check.py:
--------------------------------------------------------------------------------
1 | """
2 | Check repository settings.
3 | """
4 |
5 |
6 | import sys
7 | import os
8 | from subprocess import Popen, PIPE
9 | import re
10 | from argparse import ArgumentParser
11 |
12 | from util import Reporter, require
13 |
14 | # Import this way to produce a more useful error message.
15 | try:
16 | import requests
17 | except ImportError:
18 | print('Unable to import requests module: please install requests', file=sys.stderr)
19 | sys.exit(1)
20 |
21 |
22 | # Pattern to match Git command-line output for remotes => (user name, project name).
23 | P_GIT_REMOTE = re.compile(r'upstream\s+(?:https://|git@)github\.com[:/]([^/]+)/([^.]+)(\.git)?\s+\(fetch\)')
24 |
25 | # Repository URL format string.
26 | F_REPO_URL = 'https://github.com/{0}/{1}/'
27 |
28 | # Pattern to match repository URLs => (user name, project name)
29 | P_REPO_URL = re.compile(r'https?://github\.com/([^.]+)/([^/]+)/?')
30 |
31 | # API URL format string.
32 | F_API_URL = 'https://api.github.com/repos/{0}/{1}/labels'
33 |
34 | # Expected labels and colors.
35 | EXPECTED = {
36 | 'help wanted': 'dcecc7',
37 | 'status:in progress': '9bcc65',
38 | 'status:changes requested': '679f38',
39 | 'status:wait': 'fff2df',
40 | 'status:refer to cac': 'ffdfb2',
41 | 'status:need more info': 'ee6c00',
42 | 'status:blocked': 'e55100',
43 | 'status:out of scope': 'eeeeee',
44 | 'status:duplicate': 'bdbdbd',
45 | 'type:typo text': 'f8bad0',
46 | 'type:bug': 'eb3f79',
47 | 'type:formatting': 'ac1357',
48 | 'type:template and tools': '7985cb',
49 | 'type:instructor guide': '00887a',
50 | 'type:discussion': 'b2e5fc',
51 | 'type:enhancement': '7fdeea',
52 | 'type:clarification': '00acc0',
53 | 'type:teaching example': 'ced8dc',
54 | 'good first issue': 'ffeb3a',
55 | 'high priority': 'd22e2e'
56 | }
57 |
58 |
59 | def main():
60 | """
61 | Main driver.
62 | """
63 |
64 | args = parse_args()
65 | reporter = Reporter()
66 | repo_url = get_repo_url(args.repo_url)
67 | check_labels(reporter, repo_url)
68 | reporter.report()
69 |
70 |
71 | def parse_args():
72 | """
73 | Parse command-line arguments.
74 | """
75 |
76 | parser = ArgumentParser(description="""Check repository settings.""")
77 | parser.add_argument('-r', '--repo',
78 | default=None,
79 | dest='repo_url',
80 | help='repository URL')
81 | parser.add_argument('-s', '--source',
82 | default=os.curdir,
83 | dest='source_dir',
84 | help='source directory')
85 |
86 | args, extras = parser.parse_known_args()
87 | require(not extras,
88 | 'Unexpected trailing command-line arguments "{0}"'.format(extras))
89 |
90 | return args
91 |
92 |
93 | def get_repo_url(repo_url):
94 | """
95 | Figure out which repository to query.
96 | """
97 |
98 | # Explicitly specified.
99 | if repo_url is not None:
100 | return repo_url
101 |
102 | # Guess.
103 | cmd = 'git remote -v'
104 | p = Popen(cmd, shell=True, stdin=PIPE, stdout=PIPE,
105 | close_fds=True, universal_newlines=True, encoding='utf-8')
106 | stdout_data, stderr_data = p.communicate()
107 | stdout_data = stdout_data.split('\n')
108 | matches = [P_GIT_REMOTE.match(line) for line in stdout_data]
109 | matches = [m for m in matches if m is not None]
110 | require(len(matches) == 1,
111 | 'Unexpected output from git remote command: "{0}"'.format(matches))
112 |
113 | username = matches[0].group(1)
114 | require(
115 | username, 'empty username in git remote output {0}'.format(matches[0]))
116 |
117 | project_name = matches[0].group(2)
118 | require(
119 |         project_name, 'empty project name in git remote output {0}'.format(matches[0]))
120 |
121 | url = F_REPO_URL.format(username, project_name)
122 | return url
123 |
124 |
125 | def check_labels(reporter, repo_url):
126 | """
127 | Check labels in repository.
128 | """
129 |
130 | actual = get_labels(repo_url)
131 | extra = set(actual.keys()) - set(EXPECTED.keys())
132 |
133 | reporter.check(not extra,
134 | None,
135 | 'Extra label(s) in repository {0}: {1}',
136 | repo_url, ', '.join(sorted(extra)))
137 |
138 | missing = set(EXPECTED.keys()) - set(actual.keys())
139 | reporter.check(not missing,
140 | None,
141 | 'Missing label(s) in repository {0}: {1}',
142 | repo_url, ', '.join(sorted(missing)))
143 |
144 | overlap = set(EXPECTED.keys()).intersection(set(actual.keys()))
145 | for name in sorted(overlap):
146 | reporter.check(EXPECTED[name].lower() == actual[name].lower(),
147 | None,
148 | 'Color mis-match for label {0} in {1}: expected {2}, found {3}',
149 | name, repo_url, EXPECTED[name], actual[name])
150 |
151 |
152 | def get_labels(repo_url):
153 | """
154 | Get actual labels from repository.
155 | """
156 |
157 | m = P_REPO_URL.match(repo_url)
158 | require(
159 | m, 'repository URL {0} does not match expected pattern'.format(repo_url))
160 |
161 | username = m.group(1)
162 | require(username, 'empty username in repository URL {0}'.format(repo_url))
163 |
164 | project_name = m.group(2)
165 | require(
166 |         project_name, 'empty project name in repository URL {0}'.format(repo_url))
167 |
168 | url = F_API_URL.format(username, project_name)
169 | r = requests.get(url)
170 | require(r.status_code == 200,
171 | 'Request for {0} failed with {1}'.format(url, r.status_code))
172 |
173 | result = {}
174 | for entry in r.json():
175 | result[entry['name']] = entry['color']
176 | return result
177 |
178 |
179 | if __name__ == '__main__':
180 | main()
181 |
--------------------------------------------------------------------------------
/bin/util.py:
--------------------------------------------------------------------------------
1 | import sys
2 | import os
3 | import json
4 | from subprocess import Popen, PIPE
5 |
6 | # Import this way to produce a more useful error message.
7 | try:
8 | import yaml
9 | except ImportError:
10 | print('Unable to import YAML module: please install PyYAML', file=sys.stderr)
11 | sys.exit(1)
12 |
13 |
14 | # Things an image file's name can end with.
15 | IMAGE_FILE_SUFFIX = {
16 | '.gif',
17 | '.jpg',
18 | '.png',
19 | '.svg'
20 | }
21 |
22 | # Files that shouldn't be present.
23 | UNWANTED_FILES = [
24 | '.nojekyll'
25 | ]
26 |
27 | # Marker to show that an expected value hasn't been provided.
28 | # (Can't use 'None' because that might be a legitimate value.)
29 | REPORTER_NOT_SET = []
30 |
31 |
32 | class Reporter:
33 | """Collect and report errors."""
34 |
35 | def __init__(self):
36 | """Constructor."""
37 | self.messages = []
38 |
39 | def check_field(self, filename, name, values, key, expected=REPORTER_NOT_SET):
40 | """Check that a dictionary has an expected value."""
41 |
42 | if key not in values:
43 | self.add(filename, '{0} does not contain {1}', name, key)
44 | elif expected is REPORTER_NOT_SET:
45 | pass
46 | elif type(expected) in (tuple, set, list):
47 | if values[key] not in expected:
48 | self.add(
49 | filename, '{0} {1} value {2} is not in {3}', name, key, values[key], expected)
50 | elif values[key] != expected:
51 | self.add(filename, '{0} {1} is {2} not {3}',
52 | name, key, values[key], expected)
53 |
54 | def check(self, condition, location, fmt, *args):
55 | """Append error if condition not met."""
56 |
57 | if not condition:
58 | self.add(location, fmt, *args)
59 |
60 | def add(self, location, fmt, *args):
61 | """Append error unilaterally."""
62 |
63 | self.messages.append((location, fmt.format(*args)))
64 |
65 | @staticmethod
66 | def pretty(item):
67 | location, message = item
68 |         if location is None:
69 | return message
70 | elif isinstance(location, str):
71 | return location + ': ' + message
72 | elif isinstance(location, tuple):
73 | return '{0}:{1}: '.format(*location) + message
74 |
75 | print('Unknown item "{0}"'.format(item), file=sys.stderr)
76 | return NotImplemented
77 |
78 | @staticmethod
79 | def key(item):
80 | location, message = item
81 |         if location is None:
82 | return ('', -1, message)
83 | elif isinstance(location, str):
84 | return (location, -1, message)
85 | elif isinstance(location, tuple):
86 | return (location[0], location[1], message)
87 |
88 | print('Unknown item "{0}"'.format(item), file=sys.stderr)
89 | return NotImplemented
90 |
91 | def report(self, stream=sys.stdout):
92 | """Report all messages in order."""
93 |
94 | if not self.messages:
95 | return
96 |
97 | for m in sorted(self.messages, key=self.key):
98 | print(self.pretty(m), file=stream)
99 |
100 |
101 | def read_markdown(parser, path):
102 | """
103 | Get YAML and AST for Markdown file, returning
104 | {'metadata':yaml, 'metadata_len':N, 'text':text, 'lines':[(i, line, len)], 'doc':doc}.
105 | """
106 |
107 | # Split and extract YAML (if present).
108 | with open(path, 'r', encoding='utf-8') as reader:
109 | body = reader.read()
110 | metadata_raw, metadata_yaml, body = split_metadata(path, body)
111 |
112 | # Split into lines.
113 | metadata_len = 0 if metadata_raw is None else metadata_raw.count('\n')
114 | lines = [(metadata_len+i+1, line, len(line))
115 | for (i, line) in enumerate(body.split('\n'))]
116 |
117 | # Parse Markdown.
118 | cmd = 'ruby {0}'.format(parser)
119 | p = Popen(cmd, shell=True, stdin=PIPE, stdout=PIPE,
120 | close_fds=True, universal_newlines=True, encoding='utf-8')
121 | stdout_data, stderr_data = p.communicate(body)
122 | doc = json.loads(stdout_data)
123 |
124 | return {
125 | 'metadata': metadata_yaml,
126 | 'metadata_len': metadata_len,
127 | 'text': body,
128 | 'lines': lines,
129 | 'doc': doc
130 | }
131 |
132 |
133 | def split_metadata(path, text):
134 | """
135 | Get raw (text) metadata, metadata as YAML, and rest of body.
136 | If no metadata, return (None, None, body).
137 | """
138 |
139 | metadata_raw = None
140 | metadata_yaml = None
141 |
142 | pieces = text.split('---', 2)
143 | if len(pieces) == 3:
144 | metadata_raw = pieces[1]
145 | text = pieces[2]
146 | try:
147 | metadata_yaml = yaml.load(metadata_raw, Loader=yaml.SafeLoader)
148 | except yaml.YAMLError as e:
149 | print('Unable to parse YAML header in {0}:\n{1}'.format(
150 | path, e), file=sys.stderr)
151 | sys.exit(1)
152 |
153 | return metadata_raw, metadata_yaml, text
154 |
155 |
156 | def load_yaml(filename):
157 | """
158 | Wrapper around YAML loading so that 'import yaml' is only needed
159 | in one file.
160 | """
161 |
162 | try:
163 | with open(filename, 'r', encoding='utf-8') as reader:
164 | return yaml.load(reader, Loader=yaml.SafeLoader)
165 | except (yaml.YAMLError, IOError) as e:
166 | print('Unable to load YAML file {0}:\n{1}'.format(
167 | filename, e), file=sys.stderr)
168 | sys.exit(1)
169 |
170 |
171 | def check_unwanted_files(dir_path, reporter):
172 | """
173 | Check that unwanted files are not present.
174 | """
175 |
176 | for filename in UNWANTED_FILES:
177 | path = os.path.join(dir_path, filename)
178 | reporter.check(not os.path.exists(path),
179 | path,
180 | "Unwanted file found")
181 |
182 |
183 | def require(condition, message):
184 | """Fail if condition not met."""
185 |
186 | if not condition:
187 | print(message, file=sys.stderr)
188 | sys.exit(1)
189 |
--------------------------------------------------------------------------------
/_episodes/singularity-docker.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Using Docker images with Singularity"
3 | teaching: 5
4 | exercises: 10
5 | questions:
6 | - "How do I use Docker images with Singularity?"
7 | objectives:
8 | - "Learn how to run Singularity containers based on Docker images."
9 | keypoints:
10 | - "Singularity can start a container from a Docker image which can be pulled directly from Docker Hub."
11 | ---
12 |
13 | ## Using Docker images with Singularity
14 |
15 | Singularity can also start containers directly from Docker container images, opening up access to a huge number of existing container images available on [Docker Hub](https://hub.docker.com/) and other registries.
16 |
17 | While Singularity doesn't actually run a container using the Docker container image (it first converts it to a format suitable for use by Singularity), the approach used provides a seamless experience for the end user. When you direct Singularity to run a container based on a Docker container image, Singularity pulls the slices or _layers_ that make up the Docker container image and converts them into a single-file Singularity SIF container image.
18 |
19 | For example, moving on from the simple _Hello World_ examples that we've looked at so far, let's pull one of the [official Docker Python container images](https://hub.docker.com/_/python). We'll use the image with the tag `3.9.6-slim-buster` which has Python 3.9.6 installed on Debian's [Buster](https://www.debian.org/releases/buster/) (v10) Linux distribution:
20 |
21 | ~~~
22 | remote$ singularity pull python-3.9.6.sif docker://python:3.9.6-slim-buster
23 | ~~~
24 | {: .language-bash}
25 |
26 | ~~~
27 | INFO: Converting OCI blobs to SIF format
28 | INFO: Starting build...
29 | Getting image source signatures
30 | Copying blob 33847f680f63 done
31 | Copying blob b693dfa28d38 done
32 | Copying blob ef8f1a8cefd1 done
33 | Copying blob 248d7d56b4a7 done
34 | Copying blob 478d2dfa1a8d done
35 | Copying config c7d70af7c3 done
36 | Writing manifest to image destination
37 | Storing signatures
38 | 2021/07/27 17:23:38 info unpack layer: sha256:33847f680f63fb1b343a9fc782e267b5abdbdb50d65d4b9bd2a136291d67cf75
39 | 2021/07/27 17:23:40 info unpack layer: sha256:b693dfa28d38fd92288f84a9e7ffeba93eba5caff2c1b7d9fe3385b6dd972b5d
40 | 2021/07/27 17:23:40 info unpack layer: sha256:ef8f1a8cefd144b4ee4871a7d0d9e34f67c8c266f516c221e6d20bca001ce2a5
41 | 2021/07/27 17:23:40 info unpack layer: sha256:248d7d56b4a792ca7bdfe866fde773a9cf2028f973216160323684ceabb36451
42 | 2021/07/27 17:23:40 info unpack layer: sha256:478d2dfa1a8d7fc4d9957aca29ae4f4187bc2e5365400a842aaefce8b01c2658
43 | INFO: Creating SIF file...
44 | ~~~
45 | {: .output}
46 |
47 | Note how we see Singularity saying that it's "_Converting OCI blobs to SIF format_". We then see the layers of the Docker container image being downloaded and unpacked and written into a single SIF file. Once the process is complete, we should see the `python-3.9.6.sif` container image file in the current directory.
48 |
49 | We can now run a container from this container image as we would with any other Singularity container image.
50 |
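   | As a quick check that the conversion worked, we can ask for the Python version non-interactively (a small sketch using `singularity exec`, which we covered earlier; the exact version string printed depends on the image):
   |
   | ~~~
   | remote$ singularity exec python-3.9.6.sif python --version
   | ~~~
   | {: .language-bash}
   |
   | ~~~
   | Python 3.9.6
   | ~~~
   | {: .output}
   |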
51 | > ## Running the Python 3.9.6 image that we just pulled from Docker Hub
52 | >
53 | > Try running the Python 3.9.6 container image. What happens?
54 | >
55 | > Try running some simple Python statements...
56 | >
57 | > > ## Running the Python 3.9.6 image
58 | > >
59 | > > ~~~
60 | > > remote$ singularity run python-3.9.6.sif
61 | > > ~~~
62 | > > {: .language-bash}
63 | > >
64 | > > This should put you straight into a Python interactive shell within the running container:
65 | > >
66 | > > ~~~
67 | > > Python 3.9.6 (default, Jul 22 2021, 15:24:21)
68 | > > [GCC 8.3.0] on linux
69 | > > Type "help", "copyright", "credits" or "license" for more information.
70 | > > >>>
71 | > > ~~~
   | > > {: .output}
   | > >
72 | > > Now try running some simple Python statements:
73 | > > ~~~
74 | > > >>> import math
75 | > > >>> math.pi
76 | > > 3.141592653589793
77 | > > >>>
78 | > > ~~~
79 | > > {: .language-python}
80 | > {: .solution}
81 | {: .challenge}
82 |
83 | In addition to running a container and having it run the default run script, you could also start a container running a shell in case you want to undertake any configuration prior to running Python. This is covered in the following exercise:
84 |
85 | > ## Open a shell within a Python container
86 | >
87 | > Try to run a shell within a Singularity container based on the `python-3.9.6.sif` container image. That is, run a container that opens a shell rather than the default Python interactive console as we saw above.
88 | > See if you can find more than one way to achieve this.
89 | >
90 | > Within the shell, try starting the Python interactive console and running some Python commands.
91 | >
92 | > > ## Solution
93 | > >
94 | > > Recall from the earlier material that we can use the `singularity shell` command to open a shell within a container. To open a regular shell within a container based on the `python-3.9.6.sif` container image, we can therefore simply run:
95 | > > ~~~
96 | > > remote$ singularity shell python-3.9.6.sif
97 | > > ~~~
98 | > > {: .language-bash}
99 | > >
100 | > > ~~~
101 | > > Singularity> echo $SHELL
102 | > > /bin/bash
103 | > > Singularity> cat /etc/issue
104 | > > Debian GNU/Linux 10 \n \l
105 | > >
106 | > > Singularity> python
107 | > > Python 3.9.6 (default, Jul 22 2021, 15:24:21)
108 | > > [GCC 8.3.0] on linux
109 | > > Type "help", "copyright", "credits" or "license" for more information.
110 | > > >>> print('Hello World!')
111 | > > Hello World!
112 | > > >>> exit()
113 | > >
114 | > > Singularity> exit
115 | > > $
116 | > > ~~~
117 | > > {: .output}
118 | > >
119 | > > It is also possible to use the `singularity exec` command to run an executable within a container. We could, therefore, use the `exec` command to run `/bin/bash`:
120 | > >
121 | > > ~~~
122 | > > remote$ singularity exec python-3.9.6.sif /bin/bash
123 | > > ~~~
124 | > > {: .language-bash}
125 | > >
126 | > > ~~~
127 | > > Singularity> echo $SHELL
128 | > > /bin/bash
129 | > > ~~~
130 | > > {: .output}
131 | > >
132 | > > You can run the Python console from your container shell simply by running the `python` command.
133 | > {: .solution}
134 | {: .challenge}
135 |
136 |
137 | ## References
138 |
139 | \[1\] [Gregory M. Kurzer, Containers for Science, Reproducibility and Mobility: Singularity P2. Intel HPC Developer Conference, 2017](https://www.youtube.com/watch?v=DA87Ba2dpNM)
140 |
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | # Contributing
2 |
3 | [The Carpentries][c-site] ([Software Carpentry][swc-site], [Data Carpentry][dc-site], and [Library Carpentry][lc-site]) are open source projects,
4 | and we welcome contributions of all kinds:
5 | new lessons,
6 | fixes to existing material,
7 | bug reports,
8 | and reviews of proposed changes are all welcome.
9 |
10 | ## Contributor Agreement
11 |
12 | By contributing,
13 | you agree that we may redistribute your work under [our license](LICENSE.md).
14 | In exchange,
15 | we will address your issues and/or assess your change proposal as promptly as we can,
16 | and help you become a member of our community.
17 | Everyone involved in [The Carpentries][c-site]
18 | agrees to abide by our [code of conduct](CODE_OF_CONDUCT.md).
19 |
20 | ## How to Contribute
21 |
22 | The easiest way to get started is to file an issue
23 | to tell us about a spelling mistake,
24 | some awkward wording,
25 | or a factual error.
26 | This is a good way to introduce yourself
27 | and to meet some of our community members.
28 |
29 | 1. If you do not have a [GitHub][github] account,
30 | you can [send us comments by email][email].
31 | However,
32 | we will be able to respond more quickly if you use one of the other methods described below.
33 |
34 | 2. If you have a [GitHub][github] account,
35 | or are willing to [create one][github-join],
36 | but do not know how to use Git,
37 | you can report problems or suggest improvements by [creating an issue][issues].
38 | This allows us to assign the item to someone
39 | and to respond to it in a threaded discussion.
40 |
41 | 3. If you are comfortable with Git,
42 | and would like to add or change material,
43 | you can submit a pull request (PR).
44 | Instructions for doing this are [included below](#using-github).
45 |
46 | ## Where to Contribute
47 |
48 | 1. If you wish to change this lesson,
49 | please work in ,
50 | which can be viewed at .
51 |
52 | 2. If you wish to change the example lesson,
53 | please work in ,
54 | which documents the format of our lessons
55 | and can be viewed at .
56 |
57 | 3. If you wish to change the template used for workshop websites,
58 | please work in .
59 | The home page of that repository explains how to set up workshop websites,
60 | while the extra pages in
61 | provide more background on our design choices.
62 |
63 | 4. If you wish to change CSS style files, tools,
64 | or HTML boilerplate for lessons or workshops stored in `_includes` or `_layouts`,
65 | please work in .
66 |
67 | ## What to Contribute
68 |
69 | There are many ways to contribute,
70 | from writing new exercises and improving existing ones
71 | to updating or filling in the documentation
72 | and submitting [bug reports][issues]
73 | about things that don't work, aren't clear, or are missing.
74 | If you are looking for ideas, please see the 'Issues' tab for
75 | a list of issues associated with this repository,
76 | or you may also look at the issues for [Data Carpentry][dc-issues],
77 | [Software Carpentry][swc-issues], and [Library Carpentry][lc-issues] projects.
78 |
79 | Comments on issues and reviews of pull requests are just as welcome:
80 | we are smarter together than we are on our own.
81 | Reviews from novices and newcomers are particularly valuable:
82 | it's easy for people who have been using these lessons for a while
83 | to forget how impenetrable some of this material can be,
84 | so fresh eyes are always welcome.
85 |
86 | ## What *Not* to Contribute
87 |
88 | Our lessons already contain more material than we can cover in a typical workshop,
89 | so we are usually *not* looking for more concepts or tools to add to them.
90 | As a rule,
91 | if you want to introduce a new idea,
92 | you must (a) estimate how long it will take to teach
93 | and (b) explain what you would take out to make room for it.
94 | The first encourages contributors to be honest about requirements;
95 | the second, to think hard about priorities.
96 |
97 | We are also not looking for exercises or other material that only run on one platform.
98 | Our workshops typically contain a mixture of Windows, macOS, and Linux users;
99 | in order to be usable,
100 | our lessons must run equally well on all three.
101 |
102 | ## Using GitHub
103 |
104 | If you choose to contribute via GitHub, you may want to look at
105 | [How to Contribute to an Open Source Project on GitHub][how-contribute].
106 | To manage changes, we follow [GitHub flow][github-flow].
107 | Each lesson has two maintainers who review issues and pull requests or encourage others to do so.
108 | The maintainers are community volunteers and have final say over what gets merged into the lesson.
109 | To use the web interface for contributing to a lesson:
110 |
111 | 1. Fork the originating repository to your GitHub profile.
112 | 2. Within your version of the forked repository, move to the `gh-pages` branch and
113 | create a new branch for each significant change being made.
114 | 3. Navigate to the file(s) you wish to change within the new branches and make revisions as required.
115 | 4. Commit all changed files within the appropriate branches.
116 | 5. Create individual pull requests from each of your changed branches
117 | to the `gh-pages` branch within the originating repository.
118 | 6. If you receive feedback, make changes using your issue-specific branches of the forked
119 | repository and the pull requests will update automatically.
120 | 7. Repeat as needed until all feedback has been addressed.
121 |
122 | When starting work, please make sure your clone of the originating `gh-pages` branch is up-to-date
123 | before creating your own revision-specific branch(es) from there.
124 | Additionally, please only work from your newly-created branch(es) and *not*
125 | your clone of the originating `gh-pages` branch.
126 | Lastly, published copies of all the lessons are available in the `gh-pages` branch of the originating
127 | repository for reference while revising.
128 |
129 | ## Other Resources
130 |
131 | General discussion of [Software Carpentry][swc-site] and [Data Carpentry][dc-site]
132 | happens on the [discussion mailing list][discuss-list],
133 | which everyone is welcome to join.
134 | You can also [reach us by email][email].
135 |
136 | [email]: mailto:admin@software-carpentry.org
137 | [dc-issues]: https://github.com/issues?q=user%3Adatacarpentry
138 | [dc-lessons]: http://datacarpentry.org/lessons/
139 | [dc-site]: http://datacarpentry.org/
140 | [discuss-list]: http://lists.software-carpentry.org/listinfo/discuss
141 | [github]: https://github.com
142 | [github-flow]: https://guides.github.com/introduction/flow/
143 | [github-join]: https://github.com/join
144 | [how-contribute]: https://egghead.io/series/how-to-contribute-to-an-open-source-project-on-github
145 | [issues]: https://guides.github.com/features/issues/
146 | [swc-issues]: https://github.com/issues?q=user%3Aswcarpentry
147 | [swc-lessons]: https://software-carpentry.org/lessons/
148 | [swc-site]: https://software-carpentry.org/
149 | [c-site]: https://carpentries.org/
150 | [lc-site]: https://librarycarpentry.org/
151 | [lc-issues]: https://github.com/issues?q=user%3Alibrarycarpentry
152 |
--------------------------------------------------------------------------------
/bin/boilerplate/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | # Contributing
2 |
3 | [The Carpentries][c-site] ([Software Carpentry][swc-site], [Data Carpentry][dc-site], and [Library Carpentry][lc-site]) are open source projects,
4 | and we welcome contributions of all kinds:
5 | new lessons,
6 | fixes to existing material,
7 | bug reports,
8 | and reviews of proposed changes are all welcome.
9 |
10 | ## Contributor Agreement
11 |
12 | By contributing,
13 | you agree that we may redistribute your work under [our license](LICENSE.md).
14 | In exchange,
15 | we will address your issues and/or assess your change proposal as promptly as we can,
16 | and help you become a member of our community.
17 | Everyone involved in [The Carpentries][c-site]
18 | agrees to abide by our [code of conduct](CODE_OF_CONDUCT.md).
19 |
20 | ## How to Contribute
21 |
22 | The easiest way to get started is to file an issue
23 | to tell us about a spelling mistake,
24 | some awkward wording,
25 | or a factual error.
26 | This is a good way to introduce yourself
27 | and to meet some of our community members.
28 |
29 | 1. If you do not have a [GitHub][github] account,
30 | you can [send us comments by email][email].
31 | However,
32 | we will be able to respond more quickly if you use one of the other methods described below.
33 |
34 | 2. If you have a [GitHub][github] account,
35 | or are willing to [create one][github-join],
36 | but do not know how to use Git,
37 | you can report problems or suggest improvements by [creating an issue][issues].
38 | This allows us to assign the item to someone
39 | and to respond to it in a threaded discussion.
40 |
41 | 3. If you are comfortable with Git,
42 | and would like to add or change material,
43 | you can submit a pull request (PR).
44 | Instructions for doing this are [included below](#using-github).
45 |
46 | ## Where to Contribute
47 |
48 | 1. If you wish to change this lesson,
49 | please work in ,
50 | which can be viewed at .
51 |
52 | 2. If you wish to change the example lesson,
53 | please work in ,
54 | which documents the format of our lessons
55 | and can be viewed at .
56 |
57 | 3. If you wish to change the template used for workshop websites,
58 | please work in .
59 | The home page of that repository explains how to set up workshop websites,
60 | while the extra pages in
61 | provide more background on our design choices.
62 |
63 | 4. If you wish to change CSS style files, tools,
64 | or HTML boilerplate for lessons or workshops stored in `_includes` or `_layouts`,
65 | please work in .
66 |
67 | ## What to Contribute
68 |
69 | There are many ways to contribute,
70 | from writing new exercises and improving existing ones
71 | to updating or filling in the documentation
72 | and submitting [bug reports][issues]
73 | about things that don't work, aren't clear, or are missing.
74 | If you are looking for ideas, please see the 'Issues' tab for
75 | a list of issues associated with this repository,
76 | or you may also look at the issues for [Data Carpentry][dc-issues],
77 | [Software Carpentry][swc-issues], and [Library Carpentry][lc-issues] projects.
78 |
79 | Comments on issues and reviews of pull requests are just as welcome:
80 | we are smarter together than we are on our own.
81 | Reviews from novices and newcomers are particularly valuable:
82 | it's easy for people who have been using these lessons for a while
83 | to forget how impenetrable some of this material can be,
84 | so fresh eyes are always welcome.
85 |
86 | ## What *Not* to Contribute
87 |
88 | Our lessons already contain more material than we can cover in a typical workshop,
89 | so we are usually *not* looking for more concepts or tools to add to them.
90 | As a rule,
91 | if you want to introduce a new idea,
92 | you must (a) estimate how long it will take to teach
93 | and (b) explain what you would take out to make room for it.
94 | The first encourages contributors to be honest about requirements;
95 | the second, to think hard about priorities.
96 |
97 | We are also not looking for exercises or other material that only run on one platform.
98 | Our workshops typically contain a mixture of Windows, macOS, and Linux users;
99 | in order to be usable,
100 | our lessons must run equally well on all three.
101 |
102 | ## Using GitHub
103 |
104 | If you choose to contribute via GitHub, you may want to look at
105 | [How to Contribute to an Open Source Project on GitHub][how-contribute].
106 | To manage changes, we follow [GitHub flow][github-flow].
107 | Each lesson has two maintainers who review issues and pull requests or encourage others to do so.
108 | The maintainers are community volunteers and have final say over what gets merged into the lesson.
109 | To use the web interface for contributing to a lesson:
110 |
111 | 1. Fork the originating repository to your GitHub profile.
112 | 2. Within your version of the forked repository, move to the `gh-pages` branch and
113 | create a new branch for each significant change being made.
114 | 3. Navigate to the file(s) you wish to change within the new branches and make revisions as required.
115 | 4. Commit all changed files within the appropriate branches.
116 | 5. Create individual pull requests from each of your changed branches
117 | to the `gh-pages` branch within the originating repository.
118 | 6. If you receive feedback, make changes using your issue-specific branches of the forked
119 | repository and the pull requests will update automatically.
120 | 7. Repeat as needed until all feedback has been addressed.
121 |
122 | When starting work, please make sure your clone of the originating `gh-pages` branch is up-to-date
123 | before creating your own revision-specific branch(es) from there.
124 | Additionally, please only work from your newly-created branch(es) and *not*
125 | your clone of the originating `gh-pages` branch.
126 | Lastly, published copies of all the lessons are available in the `gh-pages` branch of the originating
127 | repository for reference while revising.
128 |
129 | ## Other Resources
130 |
131 | General discussion of [Software Carpentry][swc-site] and [Data Carpentry][dc-site]
132 | happens on the [discussion mailing list][discuss-list],
133 | which everyone is welcome to join.
134 | You can also [reach us by email][email].
135 |
136 | [email]: mailto:admin@software-carpentry.org
137 | [dc-issues]: https://github.com/issues?q=user%3Adatacarpentry
138 | [dc-lessons]: http://datacarpentry.org/lessons/
139 | [dc-site]: http://datacarpentry.org/
140 | [discuss-list]: http://lists.software-carpentry.org/listinfo/discuss
141 | [github]: https://github.com
142 | [github-flow]: https://guides.github.com/introduction/flow/
143 | [github-join]: https://github.com/join
144 | [how-contribute]: https://egghead.io/series/how-to-contribute-to-an-open-source-project-on-github
145 | [issues]: https://guides.github.com/features/issues/
146 | [swc-issues]: https://github.com/issues?q=user%3Aswcarpentry
147 | [swc-lessons]: https://software-carpentry.org/lessons/
148 | [swc-site]: https://software-carpentry.org/
149 | [c-site]: https://carpentries.org/
150 | [lc-site]: https://librarycarpentry.org/
151 | [lc-issues]: https://github.com/issues?q=user%3Alibrarycarpentry
152 |
--------------------------------------------------------------------------------
/_episodes/reproducibility.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Containers in Research Workflows: Reproducibility and Granularity"
3 | teaching: 20
4 | exercises: 5
5 | questions:
6 | - "How can I use container images to make my research more reproducible?"
7 | - "How do I incorporate containers into my research workflow?"
8 | objectives:
9 | - "Understand how container images can help make research more reproducible."
10 | - "Understand what practical steps I can take to improve the reproducibility of my research using containers."
11 | keypoints:
12 | - "Container images allow us to encapsulate the computation (and data) we have used in our research."
13 | - "Using online containerimage repositories allows us to easily share computational work we have done."
14 | - "Using container images along with a DOI service such as Zenodo allows us to capture our work and enables reproducibility."
15 | ---
16 |
17 | Although this workshop is titled "Reproducible computational environments using containers",
18 | so far we have mostly covered the mechanics of using Singularity with only passing reference to
19 | the reproducibility aspects. In this section, we discuss these aspects in more detail.
20 |
21 | > ## Work in progress...
22 | > Note that reproducibility aspects of software and containers are an active area of research,
23 | > discussion and development, so they are subject to change. We will present some ideas and
24 | > approaches here but best practices will likely evolve in the near future.
25 | {: .callout}
26 |
27 | ## Reproducibility
28 |
29 | By *reproducibility* here we mean the ability of someone else (or your future self) to reproduce
30 | what you did computationally at a particular time (be this in research, analysis or something else)
31 | as closely as possible, even if they do not have access to exactly the same hardware resources
32 | that you had when you did the original work.
33 |
34 | Some examples of why containers are an attractive technology to help with reproducibility include:
35 |
36 | - The same computational work can be run across multiple different technologies seamlessly (e.g. Windows, macOS, Linux).
37 | - You can save the exact process that you used for your computational work (rather than relying on potentially incomplete notes).
38 | - You can save the exact versions of software and their dependencies in the container image.
39 | - You can access legacy versions of software and underlying dependencies which may not be generally available any more.
40 | - Depending on their size, you can also potentially store a copy of key data within the container image.
41 | - You can archive and share the container image as well as associating a persistent identifier with a container image
42 | to allow other researchers to reproduce and build on your work.
43 |
44 | ## Sharing images
45 |
46 | We have made use of a few different online repositories during this course, such as [Sylabs Cloud Library](https://cloud.sylabs.io/library) and [Docker Hub](https://hub.docker.com), which provide platforms for sharing container images publicly. Once you have uploaded a container image, you can point people to its public location and they can download and build upon it.
47 |
48 | This is fine for working collaboratively with container images on a day-to-day basis but these repositories are not a good option for the long-term archiving of container images in support of research and publications because:
49 |
50 | - free accounts have a limit on how long a container image will be hosted if it is not updated
51 | - they do not support adding persistent identifiers to container images
52 | - it is easy to overwrite container images with newer versions by mistake.
53 |
54 | ## Archiving and persistently identifying container images using Zenodo
55 |
56 | When you publish your work or make it publicly available in some way it is good practice to make container images that you used for computational work available in an immutable, persistent way and to have an identifier that allows people to cite and give you credit for the work you have done. [Zenodo](https://zenodo.org/) is one service that provides this functionality.
57 |
58 | Zenodo supports the upload of *zip* archives and we can capture our Singularity container images as zip archives. For example, to convert the container image we created earlier in this lesson, `alpine-sum.sif`, to a zip archive (on the command line):
59 |
60 | ~~~
61 | zip alpine-sum.zip alpine-sum.sif
62 | ~~~
63 | {: .language-bash}
64 |
65 | Note: these zip archives of container images can become quite large and Zenodo supports uploads of up to 50GB. If your container image is too large, you may need to look at other options to archive it or work to reduce the size of the container image.
66 |
67 | Once you have your archive, you can [deposit it on Zenodo](https://zenodo.org/deposit/) and this will:
68 |
69 | - Create a long-term archive snapshot of your Singularity container image which people (including your future self) can download to reuse or to reproduce your work.
70 | - Create a persistent DOI (*Digital Object Identifier*) that you can cite in any publications or outputs to enable reproducibility and recognition of your work.
71 |
72 | In addition to the archive file itself, the deposit process will ask you to provide some basic metadata to classify the container image and the associated work.
73 |
74 | Note that Zenodo is not the only option for archiving and generating persistent DOIs for container images. There are other services out there -- for example, some organizations may provide their own, equivalent, service.
75 |
76 | ## Reproducibility good practice
77 |
78 | - Make use of container images to capture the computational environment required for your work.
79 | - Decide on the appropriate granularity for the container images you will use for your computational work -- this will be different for each project/area. Take note of accepted practice from contemporary work in the same area. What are the right building blocks for individual container images in your work?
80 | - Document what you have done and why -- this can be put in comments in the Singularity recipe file, and the use of the container image can be described in associated documentation and/or publications. Make sure that references are made in both directions so that the container image and the documentation are appropriately linked.
81 | - When you publish work (in whatever way), use an archiving and DOI service such as Zenodo to make sure your container image is captured as it was used for the work and that it obtains a persistent DOI to allow it to be cited and referenced properly.
82 |
83 | ## Container Granularity
84 |
85 | As mentioned above, one of the decisions you may need to make when containerising your research workflows
86 | is what level of *granularity* you wish to employ. The two extremes of this decision could be characterised
87 | as:
88 |
89 | - Create a single container image with all the tools you require for your research or analysis workflow
90 | - Create many container images each running a single command (or step) of the workflow and use them together
91 |
92 | Of course, many real applications will sit somewhere between these two extremes.
93 |
94 | > ## Positives and negatives
95 | > What are the advantages and disadvantages of the two approaches to container granularity for research
96 | > workflows described above? Think about this
97 | > and write a few bullet points for advantages and disadvantages for each approach in the course Etherpad.
98 | > > ## Solution
99 | > > This is not an exhaustive list but some of the advantages and disadvantages could be:
100 | > > ### Single large container image
101 | > > - Advantages:
102 | > > + Simpler to document
103 | > > + Full set of requirements packaged in one place
104 | > >     + Potentially easier to maintain (though could be the opposite if working with a large, distributed group)
105 | > > - Disadvantages:
106 | > > + Could get very large in size, making it more difficult to distribute
107 | > >     + May end up with dependency conflicts within the container image arising from different software requirements
108 | > > + Potentially more complex to test
109 | > >     + Less re-usable for different, but related, work
110 | > >
111 | > > ### Multiple smaller container images
112 | > > - Advantages:
113 | > > + Individual components can be re-used for different, but related, work
114 | > > + Individual parts are smaller in size making them easier to distribute
115 | > > + Avoid dependency issues between different pieces of software
116 | > > + Easier to test
117 | > > - Disadvantages:
118 | > > + More difficult to document
119 | > >     + Potentially more difficult to maintain (though could be easier if working with a large, distributed group)
120 | > > + May end up with dependency issues between component container images if they get out of sync
121 | > {: .solution}
122 | {: .challenge}
123 |
124 | > ## Next steps with containers
125 | >
126 | > Now that we're at the end of the lesson material, take a moment to reflect on
127 | > what you've learned, how it applies to you, and what to do next.
128 | >
129 | > 1. In your own notes, write down or diagram your understanding of Singularity containers and container images:
130 | > concepts, commands, and how they work.
131 | > 2. In your own notes, write down how you think you might
132 | > use containers in your daily work. If there's something you want to try doing with
133 | > containers right away, what is a next step after this workshop to make that happen?
134 | >
135 | {: .challenge}
136 |
137 |
--------------------------------------------------------------------------------
/_episodes/singularity-files.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Files in Singularity containers"
3 | teaching: 10
4 | exercises: 10
5 | questions:
6 | - "How do I make data available in a Singularity container?"
7 | - "What data is made available by default in a Singularity container?"
8 | objectives:
9 | - "Understand that some data from the host system is usually made available by default within a container"
10 | - "Learn more about how Singularity handles users and binds directories from the host filesystem."
11 | keypoints:
12 | - "Your current directory and home directory are usually available by default in a container."
13 | - "You have the same username and permissions in a container as on the host system."
14 | - "You can specify additional host system directories to be available in the container."
15 | ---
16 |
17 | The key concept to remember when running a Singularity container is that you only have the same permissions to access files as the
18 | user on the host system that you started the container as. (If you are familiar with Docker, you may note that this is different
19 | behaviour than you would see with that tool.)
20 |
21 | In this episode we will look at working with files in the context of Singularity containers and how this links with Singularity's
22 | approach to users and permissions within containers.
23 |
24 | ## Users within a Singularity container
25 |
26 | The first thing to note is that when you ran `whoami` within the container shell you started at the end of the previous episode,
27 | you should have seen the same username that you have on the host system when you ran the container.
28 |
29 | For example, if my username were `artta118`, I would expect to see the following:
30 |
31 | ~~~
32 | remote$ singularity shell lolcow.sif
33 | Singularity> whoami
34 | ~~~
35 | {: .language-bash}
36 |
37 | ~~~
38 | artta118
39 | ~~~
40 | {: .output}
41 |
42 | But wait! I downloaded the standard, public version of the `lolcow` container image from the Cloud Library. I haven't customised
43 | it in any way. How is it configured with my own user details?!
44 |
45 | If you have any familiarity with Linux system administration, you may be aware that in Linux, users and their Unix groups are
46 | configured in the `/etc/passwd` and `/etc/group` files respectively. In order for the running container to know of my
47 | user, the relevant user information needs to be available within these files within the container.
48 |
49 | Assuming this feature is enabled within the installation of Singularity on your system, when the container is started, Singularity
50 | appends the relevant user and group lines from the host system to the `/etc/passwd` and `/etc/group` files within the
51 | container[\[1\]](https://www.intel.com/content/dam/www/public/us/en/documents/presentation/hpc-containers-singularity-advanced.pdf).
52 |
53 | This means that the host system can effectively ensure that, from within the container, you cannot access, modify or delete any data
54 | that you should not be able to on the host system, and that you cannot run anything that you would not have permission to run there,
55 | since you are restricted to the same user permissions within the container as on the host system.
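56 |
57 | You can verify this for yourself by comparing your user and group IDs on the host and within a container shell. A quick check, using the `id` command (which reports your user ID, group ID and group memberships):
58 |
59 | ~~~
60 | remote$ id
61 | remote$ singularity shell lolcow.sif
62 | Singularity> id
63 | ~~~
64 | {: .language-bash}
65 |
66 | Both invocations of `id` should produce the same output.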
56 |
57 | ## Files and directories within a Singularity container
58 |
59 | Singularity also *binds* some *directories* from the host system where you are running the `singularity` command into the container
60 | that you are starting. Note that this bind process does not copy files into the running container; it makes an existing directory
61 | on the host system visible and accessible within the container environment. If you write files to this directory within the running
62 | container, when the container shuts down, those changes will persist in the relevant location on the host system.
63 |
64 | There is a default configuration of which files and directories are bound into the container but ultimate control of how things are
65 | set up on the system where you are running Singularity is determined by the system administrator. As a result, this section provides
66 | an overview but you may find that things are a little different on the system that you're running on.
67 |
68 | One directory that is likely to be accessible within a container that you start is your *home directory*. You may also find that
69 | the directory from which you issued the `singularity` command (the *current working directory*) is also bound.
70 |
71 | The binding of file content and directories from a host system into a Singularity container is illustrated in the example below
72 | showing a subset of the directories on the host Linux system and in a running Singularity container:
73 |
74 | ~~~
75 | Host system: Singularity container:
76 | ------------- ----------------------
77 | / /
78 | ├── bin ├── bin
79 | ├── etc ├── etc
80 | │ ├── ... │ ├── ...
81 | │ ├── group ─> user's group added to group file in container ─>│ ├── group
82 | │ └── passwd ──> user info added to passwd file in container ──>│ └── passwd
83 | ├── home ├── usr
84 | │ └── artta118 ───> user home directory made available ──> ─┐ ├── sbin
85 | ├── usr in container via bind mount │ ├── home
86 | ├── sbin └────────>└── artta118
87 | └── ... └── ...
88 |
89 | ~~~
90 | {: .output}
91 |
92 | > ## Questions and exercises: Files in Singularity containers
93 | >
94 | > **Q1:** What do you notice about the ownership of files in a container started from the `lolcow.sif` image? (e.g. take a look at the ownership
95 | > of files in the root directory (`/`) and your home directory (`~/`)).
96 | >
97 | > **Exercise 1:** In this container, try creating a file in the root directory `/` (e.g. using `touch /myfile.dat`). What do you notice? Try
98 | > removing the `/singularity` file. What happens in these two cases?
99 | >
100 | > **Exercise 2:** In your home directory within the container shell, try and create a simple text file (e.g. `echo "Some text" > ~/test-file.txt`).
101 | > Is it possible to do this? If so, why? If not, why not?! If you can successfully create a file, what happens to it when you exit the shell and
102 | > the container shuts down?
103 | >
104 | > > ## Answers
105 | > >
106 | > > **A1:** Use the `ls -l /` command to see a detailed file listing including file ownership and permission details. You should see that most of
107 | > > the files in the `/` directory are owned by `root`, as you would probably expect on any Linux system. If you look at the files in your home
108 | > > directory, they should be owned by you.
109 | > >
110 | > > **A Ex1:** We've already seen from the previous answer that the files in `/` are owned by `root` so we would not expect to be able to create
111 | > > files there if we're not the root user. However, if you tried to remove `/singularity` you would have seen an error similar to the following:
112 | > > `cannot remove '/singularity': Read-only file system`. This tells us something else about the filesystem. It's not just that we do not have
113 | > > permission to delete the file, the filesystem itself is read-only so even the `root` user would not be able to edit/delete this file. We will
114 | > > look at this in more detail shortly.
115 | > >
116 | > > **A Ex2:** Within your home directory, you _should_ be able to successfully create a file. Since you're seeing your home directory on the host
117 | > > system which has been bound into the container, when you exit and the container shuts down, the file that you created within the container
118 | > > should still be present when you look at your home directory on the host system.
119 | > {: .solution}
120 | {: .challenge}
121 |
122 | ## Binding additional host system directories to the container
123 |
124 | You will sometimes need to bind additional host system directories into a container you are using over and above those bound by default. For example:
125 |
126 | - There may be a shared dataset in a location that you need access to in the container
127 | - You may require executables and software libraries from the host system in the container
128 |
129 | The `-B` option to the `singularity` command is used to specify additional binds. For example, to bind the `/opt/cray` directory (where the HPE Cray programming environment is stored) into a container you could use:
130 |
131 | ```
132 | remote$ singularity shell -B /opt/cray lolcow.sif
133 | Singularity> ls -la /opt/cray
134 | ```
135 | {: .language-bash}
136 |
137 | Note that, by default, a bind is mounted at the same path in the container as on the host system. You can also specify where a host directory is
138 | mounted in the container by separating the host path from the container path by a colon (`:`) in the option:
139 |
140 | ```
141 | remote$ singularity shell -B /opt/cray:/cpe lolcow.sif
142 | Singularity> ls -la /cpe
143 | ```
144 | {: .language-bash}
145 |
146 | You can specify multiple binds to `-B` by separating them by commas (`,`).
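147 |
148 | For example, to bind both the `/opt/cray` directory used above and the `/work/y07/shared` directory (which we will also use in the next example) with a single option:
149 |
150 | ```
151 | remote$ singularity shell -B /opt/cray,/work/y07/shared lolcow.sif
152 | ```
153 | {: .language-bash}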
147 |
148 | Another option is to specify the paths you want to bind in the `SINGULARITY_BIND` environment variable. This can be more convenient when you have a lot of paths you want to bind into the running container (we will see this later in the course when we look at using MPI with containers). For example, to bind the locations that contain both the HPE Cray programming environment and the CSE centrally installed software into a running container, we would use:
149 |
150 | ```
151 | remote$ export SINGULARITY_BIND="/opt/cray,/work/y07/shared"
152 | remote$ singularity shell lolcow.sif
153 | Singularity> ls -la /work/y07/shared
154 | ```
155 | {: .language-bash}
155 |
156 | Finally, you can also copy data into a container image at build time if there is some static data required in the image. We cover this later in the section on building container images.
157 |
158 | ## References
159 |
160 | \[1\] Gregory M. Kurzer, Containers for Science, Reproducibility and Mobility: Singularity P2. Intel HPC Developer Conference, 2017. Available
161 | at: https://www.intel.com/content/dam/www/public/us/en/documents/presentation/hpc-containers-singularity-advanced.pdf
162 |
--------------------------------------------------------------------------------
/_episodes/advanced-containers.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Creating More Complex Container Images"
3 | teaching: 15
4 | exercises: 20
5 | questions:
6 | - "How can I make more complex container images?"
7 | objectives:
8 | - "Explain how you can include files within Docker container images when you build them."
9 | - "Explain how you can access files on the Docker host from your Docker containers."
10 | keypoints:
11 | - Docker allows containers to read and write files from the Docker host.
12 | - You can include files from your Docker host into your Docker container images by using the `COPY` instruction in your `Dockerfile`.
13 | ---
14 |
15 | ## Building container images with your files included
16 |
17 | In order to create and use your own container images, you may need more information than
18 | was covered in our previous example. For instance, you may want to make files from outside the
19 | container image available within it, by copying the files
20 | into the container image at build time.
21 |
22 | ### Create a Python script
23 |
24 | Before we go ahead and build our next container image, we're going
25 | to create a simple Python script on our host system and create a
26 | Dockerfile to have this script copied into our container image when
27 | it is created.
28 |
29 | In your shell, create a new directory to hold the *build context* for our new container image and
30 | move into the directory:
31 |
32 | ~~~
33 | $ mkdir alpine-sum
34 | $ cd alpine-sum
35 | ~~~
36 | {: .language-bash}
37 |
38 | Use your text editor to create a Python script called `sum.py` with the
39 | following contents:
40 |
41 | ~~~
42 | #!/usr/bin/env python3
43 |
44 | import sys
45 | try:
46 | total = sum(int(arg) for arg in sys.argv[1:])
47 | print('sum =', total)
48 | except ValueError:
49 | print('Please supply integer arguments')
50 | ~~~
51 | {: .language-python}
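52 |
53 | Before adding the script to a container image, it is worth quickly checking that it works on your host system (assuming you have `python3` available locally; 10 + 11 + 12 = 33):
54 |
55 | ~~~
56 | $ python3 sum.py 10 11 12
57 | ~~~
58 | {: .language-bash}
59 |
60 | ~~~
61 | sum = 33
62 | ~~~
63 | {: .output}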
52 |
53 | Let's assume that we've finished with our `sum.py`
54 | script and want to add it to the container image itself.
55 |
56 | ### Create the Dockerfile
57 |
58 | Now that we have our Python script, we are going to create our Dockerfile. This is going
59 | to be similar to the Dockerfile we used in the previous section with the addition of
60 | one extra line. Here is the full Dockerfile:
61 |
62 | ~~~
63 | FROM alpine
64 | RUN apk add --update python3 py3-pip python3-dev
65 | RUN pip install cython
66 | COPY sum.py /home
67 | CMD ["python3", "--version"]
68 | ~~~
69 |
70 | The additional line we have added is:
71 |
72 | ~~~
73 | COPY sum.py /home
74 | ~~~
75 |
76 | This line will cause Docker to copy the file from your computer into the container's
77 | filesystem. Let's build the container image like before, but give it a different name
78 | and then push it to Docker Hub (remember to replace `alice` with your Docker Hub username):
79 |
80 | ~~~
81 | $ docker image build --platform linux/amd64 -t alice/alpine-sum .
82 |
83 | ...output from docker build...
84 |
85 | $ docker push alice/alpine-sum
86 | ~~~
87 | {: .language-bash}
88 |
89 | > ## The Importance of Command Order in a Dockerfile
90 | >
91 | > When you run `docker build` it executes the build in the order specified
92 | > in the `Dockerfile`.
93 | > This order is important for rebuilding and you typically will want to put your `RUN`
94 | > commands before your `COPY` commands.
95 | >
96 | > Docker builds the layers of commands in order.
97 | > This becomes important when you need to rebuild container images.
98 | > If you change layers later in the `Dockerfile` and rebuild the container image, Docker doesn't need to
99 | > rebuild the earlier layers but will instead use a stored (called "cached") version of
100 | > those layers.
101 | >
102 | > For example, suppose you wanted to copy `multiply.py` into the container
103 | > image instead of `sum.py`.
104 | > If the `COPY` line came before the `RUN` line, Docker would need to rebuild the whole image.
105 | > If the `COPY` line came second then Docker would use the cached `RUN` layer from the previous
106 | > build and then only rebuild the `COPY` layer.
107 | >
108 | {: .callout}
109 |
110 | > ## Exercise: Did it work?
111 | >
112 | > Can you remember how to run a container interactively on the remote HPC system? Try that with this one.
113 | > Once inside, try running the Python script you added to the container image.
114 | >
115 | > > ## Solution
116 | > >
117 | > > You can start the container interactively on the remote HPC system like so (remember to use your
118 | > > Docker Hub username):
119 | > > ~~~
120 | > > remote$ singularity pull alpine-sum.sif docker://alice/alpine-sum
121 | > > remote$ singularity shell alpine-sum.sif
122 | > > ~~~
123 | > > {: .language-bash}
124 | > >
125 | > > You should be able to run the python command inside the container like this:
126 | > > ~~~
127 | > > Singularity> python3 /home/sum.py
128 | > > ~~~
129 | > > {: .language-bash}
130 | > >
131 | > {: .solution}
132 | {: .challenge}
133 |
134 | The `COPY` keyword can be used to place your own scripts or data into a container image
135 | that you want to publish or use as a record. Note that it's not necessarily a good idea
136 | to put your scripts inside the container image if you're constantly changing or editing them.
137 | In that case, it is better to keep the scripts outside the container and bind mount
138 | the host directory containing them into the running container, as we saw earlier in the workshop. You also want to
139 | think carefully about size -- if you run `docker image ls` you'll see the size of each container
140 | image in the rightmost column. The bigger your container image becomes, the harder
141 | it will be to download and distribute.
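142 |
143 | For example, to review the sizes of your local container images before pushing them (the exact output will vary from system to system):
144 |
145 | ~~~
146 | $ docker image ls
147 | ~~~
148 | {: .language-bash}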
142 |
143 | > ## Security Warning
144 | >
145 | > Login credentials including passwords, tokens, secure access tokens or other secrets
146 | > must never be stored in a container. If secrets are stored, they are at high risk of
147 | > being found and exploited when the container image is made public.
148 | {: .callout}
149 |
150 | > ## Copying alternatives
151 | >
152 | > Another trick for getting your own files into a container image is to use the `RUN`
153 | > keyword to download the files from the internet. For example, if your code
154 | > is in a GitHub repository, you could include this statement in your Dockerfile
155 | > to download the latest version every time you build the container image:
156 | >
157 | > ~~~
158 | > RUN git clone https://github.com/alice/mycode
159 | > ~~~
160 | >
161 | > Similarly, the `wget` command can be used to download any file publicly available
162 | > on the internet:
163 | >
164 | > ~~~
165 | > RUN wget ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.10.0/ncbi-blast-2.10.0+-x64-linux.tar.gz
166 | > ~~~
167 | >
168 | > Note that the above `RUN` examples depend on commands (`git` and `wget` respectively) that
169 | > must be available within your container: Linux distributions such as Alpine may require you to
170 | > install such commands before using them within `RUN` statements.
171 | {: .callout}
172 |
173 | ## More fancy `Dockerfile` options (optional, for presentation or as exercises)
174 |
175 | We can expand on the example above to make our container image even more "automatic".
176 | Here are some ideas:
177 |
178 | ### Make the `sum.py` script run automatically
179 |
180 | ~~~
181 | FROM alpine
182 | RUN apk add --update python3 py3-pip python3-dev
183 | COPY sum.py /home
184 |
185 | # Run the sum.py script as the default command
186 | CMD ["python3", "/home/sum.py"]
187 | ~~~
188 |
189 | Build and push this:
190 |
191 | ~~~
192 | $ docker image build --platform linux/amd64 -t alice/alpine-sum:v1 .
193 | $ docker push alice/alpine-sum:v1
194 | ~~~
195 | {: .language-bash}
196 |
197 | You'll notice that you can run the container without arguments just fine,
198 | resulting in `sum = 0`, but this is boring. Supplying arguments, however,
199 | doesn't work:
200 |
201 | ~~~
202 | remote$ singularity pull alpine-sum_v1.sif docker://alice/alpine-sum:v1
203 | remote$ singularity run alpine-sum_v1.sif 10 11 12
204 | ~~~
205 | {: .language-bash}
206 |
207 | results in:
208 |
209 | ~~~
210 | FATAL: "10": executable file not found in $PATH
211 | ~~~
212 | {: .output}
213 |
214 | This is because the arguments `10 11 12` replace the `CMD` rather than being passed to the script, so the container
215 | tries to interpret `10` as a command to run instead of as an argument to the `sum.py` script.
216 |
217 | To achieve the goal of having a command that *always* runs when a
218 | container is run from the container image *and* can be passed the arguments given on the
219 | command line, use the keyword `ENTRYPOINT` in the `Dockerfile`.
220 |
221 | ~~~
222 | FROM alpine
223 |
224 | COPY sum.py /home
225 | RUN apk add --update python3 py3-pip python3-dev
226 |
227 | # Run the sum.py script as the default command and
228 | # allow people to enter arguments for it
229 | ENTRYPOINT ["python3", "/home/sum.py"]
230 |
231 | # Give default arguments, in case none are supplied on
232 | # the command-line
233 | CMD ["10", "11"]
234 | ~~~
235 |
236 | Build and push this:
237 |
238 | ~~~
239 | $ docker image build --platform linux/amd64 -t alice/alpine-sum:v2 .
240 | $ docker push alice/alpine-sum:v2
241 | ~~~
242 | {: .language-bash}
243 |
244 | ~~~
245 | remote$ singularity pull alpine-sum_v2.sif docker://alice/alpine-sum:v2
246 | remote$ singularity run alpine-sum_v2.sif 10 11 12
247 | ~~~
248 | {: .language-bash}
249 | ~~~
250 | sum = 33
251 | ~~~
252 | {: .output}
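253 |
254 | If you now run the container *without* any arguments, the default arguments supplied by `CMD` should be used instead (10 + 11 = 21):
255 |
256 | ~~~
257 | remote$ singularity run alpine-sum_v2.sif
258 | ~~~
259 | {: .language-bash}
260 | ~~~
261 | sum = 21
262 | ~~~
263 | {: .output}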
253 |
254 | ### Add the `sum.py` script to the `PATH` so you can run it directly:
255 |
256 | ~~~
257 | FROM alpine
258 |
259 | RUN apk add --update python3 py3-pip python3-dev
260 |
261 | COPY sum.py /home
262 | # set script permissions
263 | RUN chmod +x /home/sum.py
264 | # add /home folder to the PATH
265 | ENV PATH /home:$PATH
266 | ~~~
267 |
268 | Build and push this:
269 |
270 | ~~~
271 | $ docker image build --platform linux/amd64 -t alice/alpine-sum:v3 .
272 | $ docker push alice/alpine-sum:v3
273 | ~~~
274 | {: .language-bash}
275 |
276 | Pull and test it on the remote system:
277 | ~~~
278 | remote$ singularity pull alpine-sum_v3.sif docker://alice/alpine-sum:v3
279 | remote$ singularity exec alpine-sum_v3.sif sum.py 10 11 12
280 | ~~~
281 | {: .language-bash}
282 | ~~~
283 | sum = 33
284 | ~~~
285 | {: .output}
286 |
287 | > ## Best practices for writing Dockerfiles
288 | > Take a look at Nüst et al.'s "[_Ten simple rules for writing Dockerfiles for reproducible data science_](https://doi.org/10.1371/journal.pcbi.1008316)" \[1\]
289 | > for some great examples of best practices to use when writing Dockerfiles.
290 | > The [GitHub repository](https://github.com/nuest/ten-simple-rules-dockerfiles) associated with the paper also has a set of [example `Dockerfile`s](https://github.com/nuest/ten-simple-rules-dockerfiles/tree/master/examples)
291 | > demonstrating how the rules highlighted by the paper can be applied.
292 | >
293 | > [1] Nüst D, Sochat V, Marwick B, Eglen SJ, Head T, et al. (2020) Ten simple rules for writing Dockerfiles for reproducible data science. PLOS Computational Biology 16(11): e1008316. https://doi.org/10.1371/journal.pcbi.1008316
294 | {: .callout}
295 |
296 | {% include links.md %}
297 |
--------------------------------------------------------------------------------
/_episodes/introduction.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Introducing Containers"
3 | teaching: 20
4 | exercises: 0
5 | questions:
6 | - "What are containers, and why might they be useful to me?"
7 | objectives:
8 | - "Show how software depending on other software leads to configuration management problems."
9 | - "Identify the problems that software installation can pose for research."
10 | - "Explain the advantages of containerization."
11 | - "Explain how using containers can solve software configuration problems"
12 | keypoints:
13 | - "Almost all software depends on other software components to function, but these components have independent evolutionary paths."
14 | - "Small environments that contain only the software that is needed for a given task are easier to replicate and maintain."
15 | - "Critical systems that cannot be upgraded, due to cost, difficulty, etc. need to be reproduced on newer systems in a maintainable and self-documented way."
16 | - "Virtualization allows multiple environments to run on a single computer."
17 | - "Containerization improves upon the virtualization of whole computers by allowing efficient management of the host computer's memory and storage resources."
18 | - "Containers are built from 'recipes' that define the required set of software components and the instructions necessary to build/install them within a container image."
19 | - "Singularity and Docker are examples of software platforms that can create containers and the resources they use."
20 | ---
21 |
22 | > ## Learning about Containers
23 | >
24 | > The Australian Research Data Commons has produced a short introductory video
25 | > about containers that covers many of the points below. Watch it before
26 | > or after you go through this section to reinforce your understanding!
27 | >
28 | > [How can software containers help your research?](https://www.youtube.com/watch?v=HelrQnm3v4g)
29 | >
30 | > Australian Research Data Commons, 2021. *How can software containers help your research?*. [video] Available at: https://www.youtube.com/watch?v=HelrQnm3v4g DOI: http://doi.org/10.5281/zenodo.5091260
31 | {: .callout}
32 |
33 |
34 | ## Scientific Software Challenges
35 |
36 | > ## What's Your Experience?
37 | >
38 | > Take a minute to think about challenges that you have experienced in using
39 | > scientific software (or software in general!) for your research. Then,
40 | > share with your neighbors and try to come up with a list of common gripes or
41 | > challenges.
42 | {: .challenge}
43 |
44 | You may have come up with some of the following:
45 |
46 | - you want to use software that doesn't exist for the operating system (Mac, Windows, Linux) you'd prefer.
47 | - you struggle with installing a software tool because you have to install a number of other dependencies first. Those dependencies, in turn, require *other* things, and so on (i.e. combinatoric explosion).
48 | - the software you're setting up involves many dependencies and only a subset of all possible versions of those dependencies actually works as desired.
49 | - you're not actually sure what version of the software you're using because the install process was so circuitous.
50 | - you and a colleague are using the same software but get different results because you have installed different versions and/or are using different operating systems.
51 | - you installed everything correctly on your computer but now need to install it on a colleague's computer/campus computing cluster/etc.
52 | - you've written a package for other people to use but a lot of your users frequently have trouble with installation.
53 | - you need to reproduce a research project from a former colleague and the software used was on a system you no longer have access to.
54 |
55 | A lot of these characteristics boil down to one fact: the main program you want
56 | to use likely depends on many, many, different other programs (including the
57 | operating system!), creating a very complex, and often fragile system. One change
58 | or missing piece may stop the whole thing from working or break something that was
59 | already running. It's no surprise that this situation is sometimes
60 | informally termed "dependency hell".
61 |
62 | > ## Software and Science
63 | >
64 | > Again, take a minute to think about how the software challenges we've discussed
65 | > could impact (or have impacted!) the quality of your work.
66 | > Share your thoughts with your neighbors. What can go wrong if our software
67 | > doesn't work?
68 | {: .challenge}
69 |
70 | Unsurprisingly, software installation and configuration challenges can have
71 | negative consequences for research:
72 | - you can't use a specific tool at all, because it's not available or installable.
73 | - you can't reproduce your results because you're not sure what tools you're actually using.
74 | - you can't access extra/newer resources because you're not able to replicate your software set up.
75 | - others cannot validate and/or build upon your work because they cannot recreate your system's unique configuration.
76 |
77 | Thankfully there are ways to get underneath (a lot of) this mess: containers
78 | to the rescue! Containers provide a way to package up software dependencies
79 | and access to resources such as files and communications networks in a uniform manner.
80 |
81 | ## What is a Container?
82 |
83 | To understand containers, let's first talk briefly about your computer.
84 |
85 | Your computer has some standard pieces that allow it to work -- often what's
86 | called the hardware. One of these pieces is the CPU or processor; another is
87 | the amount of memory or RAM that your computer can use to store information
88 | temporarily while running programs; another is the hard drive, which can store
89 | information over the long-term. All these pieces work together to do the
90 | "computing" of a computer, but we don't see them because they're hidden from view (usually).
91 |
92 | Instead, what we see is our desktop, program windows, different folders, and
93 | files. These all live in what's called the filesystem. Everything on your computer -- programs,
94 | pictures, documents, the operating system itself -- lives somewhere in the filesystem.
95 |
96 | NOW, imagine you want to install some new software but don't want to take the chance
97 | of making a mess of your existing system by installing a bunch of additional stuff
98 | (libraries/dependencies/etc.).
99 | You don't want to buy a whole new computer because it's too expensive.
100 | What if, instead, you could have another independent filesystem and running operating system that you could access from your main computer, and that is actually stored within this existing computer?
101 |
102 | Or, imagine you have two tools you want to use in your groundbreaking research on cat memes: `PurrLOLing`, a tool that does AMAZINGLY well at predicting the best text for a meme based on the cat species and `WhiskerSpot`, the only tool available for identifying cat species from images. You want to send cat pictures to `WhiskerSpot`, and then send the species output to `PurrLOLing`. But there's a problem: `PurrLOLing` only works on Ubuntu and `WhiskerSpot` is only supported for OpenSUSE so you can't have them on the same system! Again, we really want another filesystem (or two) on our computer that we could use to chain together `WhiskerSpot` and `PurrLOLing` in a "pipeline"...
103 |
104 | Container systems, like Singularity/Apptainer and Docker, are special programs on your computer that make it possible!
105 | The term "container" can be usefully considered with reference to shipping
106 | containers. Before shipping containers were developed, packing and unpacking
107 | cargo ships was time consuming and error prone, with high potential for
108 | different clients' goods to become mixed up. Just like shipping containers keep things
109 | together that should stay together, software containers standardize the description and
110 | creation of a complete software system: you can drop a container into any computer with
111 | the container software installed (the 'container host'), and it should "just work".
112 |
113 | > ## Virtualization
114 | >
115 | > Containers are an example of what's called **virtualization** -- having a
116 | > second "virtual" computer running and accessible from a main or **host**
117 | > computer. Another example of virtualization are **virtual machines** or
118 | > VMs. A virtual machine typically contains a whole copy of an operating system in
119 | > addition to its own filesystem and has to get booted up in the same way
120 | > a computer would.
121 | > A container is considered a lightweight version of a virtual machine;
122 | > underneath, the container is (usually) using the Linux kernel and simply has some
123 | > flavour of Linux + the filesystem inside.
124 | {: .callout}
125 |
126 | One final term: while the **container** is an alternative filesystem layer that you
127 | can access and run from your computer, the **container image** is the 'recipe' or template
128 | for a container. The container image has all the required information to start
129 | up a running copy of the container. A running container tends to be transient
130 | and can be started and shut down. The container image is more long-lived, as a definition for the container.
131 | You could think of the container image like a cookie cutter -- it
132 | can be used to create multiple copies of the same shape (or container)
133 | and is relatively unchanging, where cookies come and go. If you want a
134 | different type of container (cookie) you need a different container image (cookie cutter).
135 |
136 |
137 | ## Putting the Pieces Together
138 |
139 | Think back to some of the challenges we described at the beginning. The many layers
140 | of scientific software installations make it hard to install and re-install
141 | scientific software -- which ultimately, hinders reliability and reproducibility.
142 |
143 | But now, think about what a container is -- a self-contained, complete, separate
144 | computer filesystem. What advantages are there if you put your scientific software
145 | tools into containers?
146 |
147 | This solves several of our problems:
148 |
149 | - documentation -- there is a clear record of what software and software dependencies were used, from bottom to top.
151 | - portability -- the container can be used on any computer that has the container software installed -- it doesn't matter whether the computer is Mac, Windows or Linux-based.
151 | - reproducibility -- you can use the exact same software and environment on your computer and on other resources (like a large-scale computing cluster).
152 | - configurability -- containers can be sized to take advantage of more resources (memory, CPU, etc.) on large systems (clusters) or less, depending on the circumstances.
153 |
154 | The rest of this workshop will show you how to download and run containers from pre-existing
155 | container images on your own computer, and how to create and share your own container images.
156 |
157 | ## Use cases for containers
158 |
159 | Now that we have discussed a little bit about containers -- what they do and the
160 | issues they attempt to address -- you may be able to think of a few potential use
161 | cases in your area of work. Some examples of common use cases for containers in
162 | a research context include:
163 |
164 | - Using containers solely on your own computer to use a specific software tool
165 | or to test out a tool (possibly to avoid a difficult and complex installation
166 | process, to save your time or to avoid dependency hell).
167 | - Creating a recipe that generates a container image with software that you
168 |   specify installed, then sharing a container image generated using this recipe with
169 | your collaborators for use on their computers or a remote computing resource
170 | (e.g. cloud-based or HPC system).
171 | - Archiving the container images so you can repeat analysis/modelling using the
172 | same software and configuration in the future -- capturing your workflow.
173 |
174 | {% include links.md %}
175 |
176 | {% comment %}
177 |
179 | {% endcomment %}
180 |
--------------------------------------------------------------------------------
/_episodes/singularity-gettingstarted.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Singularity: Getting started"
3 | teaching: 15
4 | exercises: 10
5 | questions:
6 | - "What is Singularity and why might I want to use it?"
7 | objectives:
8 | - "Understand what Singularity is and when you might want to use it."
9 | - "Undertake your first run of a simple Singularity container."
10 | keypoints:
11 | - "Singularity is another container platform and it is often used in cluster/HPC/research environments."
12 | - "Singularity has a different security model to other container platforms, one of the key reasons that it is well suited to HPC and cluster environments."
13 | - "Singularity has its own container image format (SIF)."
14 | - "The `singularity` command can be used to pull images from Sylabs Cloud Library and run a container from an image file."
15 | ---
16 |
17 | The episodes in this lesson will introduce you to the [Singularity](https://sylabs.io/singularity/) container platform and demonstrate how to set up and use Singularity.
18 |
19 | ## What is Singularity?
20 |
21 | [Singularity](https://sylabs.io/singularity/) (or
22 | [Apptainer](https://apptainer.org/), we'll get to this in a minute...) is a
23 | container platform that supports packaging and deploying software and tools in
24 | a portable and reproducible manner.
25 |
26 | You may be familiar with Docker, another container platform that is now used widely.
27 | If you are, you will see that in some ways, Singularity is similar to Docker. However, in
28 | other ways, particularly in terms of the system's architecture, it is
29 | fundamentally different. These differences mean that Singularity is
30 | particularly well-suited to running on shared platforms such as distributed,
31 | High Performance Computing (HPC) infrastructure, as well as on a Linux laptop
32 | or desktop.
33 |
34 | Singularity runs containers from container images which, as we discussed, are essentially
35 | virtual computer disks that contain all of the necessary software, libraries
36 | and configuration to run one or more applications or undertake a particular
37 | task, e.g. to support a specific research project. This saves you the time and
38 | effort of installing and configuring software on your own system or setting up
39 | a new computer from scratch, as you can simply run a Singularity container from
40 | an image and have a virtual environment that is equivalent to the one used by
41 | the person who created the image. Singularity/Apptainer is increasingly widely
42 | used in the research community for supporting research projects due to its
43 | support for shared computing platforms.
44 |
45 | System administrators will not, generally, install Docker on shared computing
46 | platforms such as lab desktops, research clusters or HPC platforms because the
47 | design of Docker presents potential security issues for shared platforms with
48 | multiple users. Singularity/Apptainer, on the other hand, can be run by
49 | end-users entirely within "user space", that is, no special administrative
50 | privileges need to be assigned to a user in order for them to run and interact
51 | with containers on a platform where Singularity has been installed.
52 |
53 | ### A little history...
54 |
55 | Singularity is open source software and was initially developed within the research
56 | community. A couple of years ago, the project was "forked", something that is
57 | not uncommon within the open source software community, with the software
58 | effectively splitting into two projects going in different directions. The fork
59 | is being developed by a commercial entity, [Sylabs.io](https://sylabs.io/), who
60 | provide both the free, open source [SingularityCE (Community
61 | Edition)](https://sylabs.io/singularity) and Pro/Enterprise editions of the
62 | software. The original open source Singularity project has recently been
63 | [renamed to
64 | Apptainer](https://apptainer.org/news/community-announcement-20211130/) and has
65 | moved into the Linux Foundation. The initial release of Apptainer was made
66 | about a year ago, at the time of writing. While earlier versions of this course
67 | focused on versions of Singularity released before the project fork, we now
68 | base the course material on recent Apptainer releases. Despite this, the basic
69 | features of Apptainer/Singularity remain the same and so this material is
70 | equally applicable whether you're working with a recent Apptainer release or a
71 | slightly older Singularity version. Nonetheless, it is useful to be aware of
72 | this history and that you may see both Singularity and Apptainer being used
73 | within the research community over the coming months and years.
74 |
75 | Another point to note is that some systems that have a recent Apptainer release
76 | installed may also provide a `singularity` command that is simply a link to the
77 | `apptainer` executable on the system. This helps to ensure that existing
78 | scripts being used on the system that were developed before the migration to
79 | Apptainer will still function correctly.
80 |
81 | For now, the remainder of this material refers to Singularity but where you
82 | have a release of Apptainer installed on your local system, you can simply
83 | replace references to `singularity` with `apptainer`, if you wish.
84 |
85 | ## Checking Singularity works
86 |
87 | [Login to ARCHER2](https://docs.archer2.ac.uk/user-guide/connecting/) using the
88 | login address `training.dyn.archer2.ac.uk`:
89 |
90 | ~~~
91 | ssh -i /path/to/ssh-key user@training.dyn.archer2.ac.uk
92 | ~~~
93 | {: .language-bash}
93 |
94 | Now check that the `singularity` command is available in your terminal:
95 |
96 | ~~~
97 | remote$ singularity --version
98 | ~~~
99 | {: .language-bash}
100 |
101 | ~~~
102 | singularity version 3.7.3-1
103 | ~~~
104 | {: .output}
105 |
106 |
107 | > ## Loading a module
108 | > HPC systems often use *modules* to provide access to software on the system so you may need to use the command:
109 | > ~~~
110 | > remote$ module load singularity
111 | > ~~~
112 | > {: .language-bash}
113 | > before you can use the `singularity` command on remote systems. However, this depends on how the system is configured.
114 | > You do not need to load a module on ARCHER2. If in doubt, consult the documentation for the system you are using
115 | > or contact the support team.
116 | {: .callout}
117 |
118 | ## Images and containers: reminder
119 |
120 | A quick reminder on terminology: we refer to both *container images* and *containers*. What is the difference between these two terms?
121 |
122 | *Container images* (sometimes just *images*) are bundles of files including an operating system, software and potentially data and other application-related files. They may sometimes be referred to as a *disk image* or *image* and they may be stored in different ways, perhaps as a single file, or as a group of files. Either way, we refer to this file, or collection of files, as an image.
123 |
124 | A *container* is a virtual environment that is based on a container image. That is, the files, applications, tools, etc that are available within a running container are determined by the image that the container is started from. It may be possible to start multiple container instances from an image. You could, perhaps, consider an image to be a form of template from which running container instances can be started.
125 |
126 | ## Getting a container image and running a Singularity container
127 |
128 | Singularity uses the [Singularity Image Format (SIF)](https://github.com/sylabs/sif) and container images are provided as single `SIF` files (usually with a `.sif` or `.img` filename extension). Singularity container images can be pulled from the [Sylabs Cloud Library](https://cloud.sylabs.io/), a registry for Singularity container images. Singularity is also capable of running containers based on container images pulled from [Docker Hub](https://hub.docker.com/) and other Docker image repositories (e.g. [Quay.io](https://quay.io)). We will look at accessing container images from Docker Hub later in the course.
129 |
130 | > ## Sylabs Remote Builder
131 | > Note that in addition to providing a repository that you can pull container images from, [Sylabs Cloud Library](https://cloud.sylabs.io/) can also build Singularity images for you from a *recipe* - a configuration file defining the steps to build an image. We will look at recipes and building images later in the workshop.
132 | {: .callout}
133 |
134 | ### Pulling a container image from Sylabs Cloud Library
135 |
136 | Let's begin by creating a `test` directory, changing into it and _pulling_ an existing container image from Sylabs Cloud Library:
137 |
138 | ~~~
139 | remote$ mkdir test
140 | remote$ cd test
141 | remote$ singularity pull lolcow.sif library://lolcow
142 | ~~~
143 | {: .language-bash}
144 |
145 | ~~~
146 | INFO: Downloading library image
147 | 90.4 MiB / 90.4 MiB [===============================================================================================================] 100.00% 90.4 MiB/s 1s
148 | ~~~
149 | {: .output}
150 |
151 | What just happened? We pulled a container image from a remote repository using the `singularity pull` command and directed it to store the container image in a file using the name `lolcow.sif` in the current directory. If you run the `ls` command, you should see that the `lolcow.sif` file is now present in the current directory.
152 |
153 | ~~~
154 | remote$ ls -lh
155 | ~~~
156 | {: .language-bash}
157 |
158 | ~~~
159 | total 60M
160 | -rwxr-xr-x. 1 auser group 91M Jun 13 2023 lolcow.sif
161 | ~~~
162 | {: .output}
163 |
164 | ### Running a Singularity container
165 |
166 | We can now run a container based on the `lolcow.sif` container image:
167 |
168 | ~~~
169 | remote$ singularity run lolcow.sif
170 | ~~~
171 | {: .language-bash}
172 |
173 | ~~~
174 | ______________________________
175 | < Tue Jun 20 08:44:51 UTC 2023 >
176 | ------------------------------
177 | \ ^__^
178 | \ (oo)\_______
179 | (__)\ )\/\
180 | ||----w |
181 | || ||
182 | ~~~
183 | {: .output}
184 |
185 | The above command ran a *lolcow* container based on the container image we downloaded from the online repository and the resulting output was shown.
186 |
187 | What just happened? When we use the `singularity run` command, Singularity does three things:
188 |
189 | | 1. Starts a Running Container | 2. Performs Default Action | 3. Shuts Down the Container |
190 | | --------------------|-----------------|----------------|
191 | | Starts a running container, based on the container image. Think of this as the "alive" or "inflated" version of the container -- it's actually doing something. | If the container has a default action set, it will perform that default action. This could be as simple as printing a message (as above) or running a whole analysis pipeline! | Once the default action is complete, the container stops running (or exits). |
192 |
193 | ### Default action
194 |
195 | How did the container determine what to do when we ran it? What did running the container actually do to result in the displayed output?
196 |
197 | When you run a container from a Singularity container image using the `singularity run` command, the container runs the default run script that is embedded within the container image. This is a shell script that can be used to run commands, tools or applications stored within the container image on container startup. We can inspect the container image's run script using the `singularity inspect` command:
198 |
199 | ~~~
200 | remote$ singularity inspect -r lolcow.sif
201 | ~~~
202 | {: .language-bash}
203 |
204 | ~~~
205 | #!/bin/sh
206 |
207 | date | cowsay | lolcat
208 |
209 | ~~~
210 | {: .output}
211 |
212 | This shows us the script within the `lolcow.sif` image configured to run by default when we use the `singularity run` command.
213 |
214 | This seems very simple but consider what we have just done: we have downloaded a container image built with a different
215 | OS than is available on ARCHER2, and which contains software not available on ARCHER2 (`cowsay` and
216 | `lolcat`), and we have been able to run it on the ARCHER2 system without needing to install anything ourselves
217 | and without the container image having to know anything specific about how ARCHER2 is configured.
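218 |
219 | > ## Running a specific command
220 | > As a brief preview of material covered later in the course: as well as running a container image's default action with `singularity run`, you can run a specific command from within a container using `singularity exec`. For example (assuming, as the run script suggests, that `cowsay` is available within the image):
221 | > ~~~
222 | > remote$ singularity exec lolcow.sif cowsay "Hello ARCHER2"
223 | > ~~~
224 | > {: .language-bash}
{: .callout}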
218 |
--------------------------------------------------------------------------------
/setup.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: Setup
3 | ---
4 |
5 | There are a number of tasks you need to complete *before* the workshop to ensure that you get the
6 | best learning experience:
7 |
8 | 1. Create a Docker Hub account (if you do not already have one)
9 | 2. Make sure you have a working installation of Docker Desktop or Docker on your laptop
10 | 3. Create an account on ARCHER2 - even if you have an existing ARCHER2 account, you should create a new
11 | one for this workshop in the course project
12 |
13 | Details of these setup steps are provided below.
14 |
15 | ## Create an account on Docker Hub
16 |
17 | Please seek help at the start of the lesson if you have not been able to establish a website account on:
18 | - The [Docker Hub](http://hub.docker.com). We will use the Docker Hub to download pre-built container images, and for you to upload and download container images that you create, as explained in the relevant lesson episodes.
19 |
20 | ## Install Docker Desktop or Docker
21 |
22 | In most cases, you will need to have administrator rights on the computer in order to install the Docker software. If you are using a computer managed by your organisation and do not have administrator rights, you *may* be able to get your organisation's IT staff to install Docker for you. Alternatively your IT support staff *may* be able to give you remote access to a server that can run Docker commands.
23 |
24 | Please try to install the appropriate Docker software from the list below depending on the operating system that your computer is running. Do let the workshop organisers know as early as possible if you are unable to install Docker using these instructions, as there may be other options available.
25 |
26 | #### Microsoft Windows
27 |
28 | **You must have admin rights to run Docker!** Some parts of the lesson will work without running as admin but if you are unable to `Run as administrator` on your machine some elements of this workshop might not work as described.
29 |
30 | Ideally, you will be able to install the Docker Desktop software, following the [Docker website's documentation](https://docs.docker.com/docker-for-windows/install/). Note that the instructions for installing Docker Desktop on Windows 10 Home Edition are different from other versions of Windows 10.
31 |
32 | Note that the above installation instructions highlight a minimum version or "build" that is required to be able to install Docker on your Windows 10 system. See [Which version of Windows operating system am I running?](https://support.microsoft.com/en-us/windows/which-version-of-windows-operating-system-am-i-running-628bec99-476a-2c13-5296-9dd081cdd808) for details of how to find out which version/build of Windows 10 you have.
33 |
34 | If you are unable to follow the above instructions to install Docker Desktop on your Windows system, the final release of the deprecated Docker Toolbox version of Docker for Windows can be downloaded from the [releases page of the Docker Toolbox GitHub repository](https://github.com/docker/toolbox/releases). (Download the `.exe` file for the Windows installer). _Please note that this final release of Docker Toolbox includes an old version of Docker and you are strongly advised not to attempt to use this for any production use. It will, however, enable you to follow along with the lesson material._
35 |
36 | > ## Warning: Git Bash
37 | > If you are using Git Bash as your terminal on Windows then you should be aware that you may run
38 | > into issues running some of the commands in this lesson as Git Bash will automatically re-write
39 | > any paths you specify at the command line into Windows versions of the paths and this will confuse
40 | > the Docker container you are trying to use. For example, if you enter the command:
41 | > ```
42 | > docker run alpine cat /etc/os-release
43 | > ```
44 | > Git Bash will change the `/etc/os-release` path to `C:\etc\os-release\` before passing the command
45 | > to the Docker container and the container will report an error. If you want to use Git Bash then you
46 | > can request that this path translation does not take place by adding an extra `/` to the start of the
47 | > path. i.e. the command would become:
48 | > ```
49 | > docker run alpine cat //etc/os-release
50 | > ```
51 | > This should suppress the path translation functionality in Git Bash.
52 | {: .callout}
53 |
54 | #### Apple macOS
55 |
56 | Ideally, you will be able to install the Docker Desktop software, following the
57 | [Docker website's documentation](https://docs.docker.com/docker-for-mac/install/).
58 | The current version of the Docker Desktop software requires macOS version 10.14 (Mojave) or later.
59 |
60 | If you already use Homebrew or MacPorts to manage your software, and would prefer to use those
61 | tools rather than Docker's installer, you can do so. For Homebrew, you can run the command
62 | `brew install --cask docker`. Note that you still need to run the Docker graphical user interface
63 | once to complete the initial setup, after which time the command line functionality of Docker will
64 | become available. The Homebrew install of Docker also requires a minimum macOS version of 10.14.
65 | The MacPorts Docker port should support older, as well as the most recent, operating system
66 | versions (see the [port details](https://ports.macports.org/port/docker/details/)), but note that
67 | we have not recently tested the Docker installation process via MacPorts.
68 |
69 | #### Linux
70 |
71 | There are too many varieties of Linux to give precise instructions here, but hopefully you can locate documentation for getting Docker installed on your Linux distribution. It may already be installed. If it is not already installed on your system, the [Install Docker Engine](https://docs.docker.com/engine/install/) page provides an overview of supported Linux distributions and pointers to relevant installation information. Alternatively, see:
72 |
73 | - [Docker Engine on CentOS](https://docs.docker.com/install/linux/docker-ce/centos/)
74 | - [Docker Engine on Debian](https://docs.docker.com/install/linux/docker-ce/debian/)
75 | - [Docker Engine on Fedora](https://docs.docker.com/install/linux/docker-ce/fedora/)
76 | - [Docker Engine on Ubuntu](https://docs.docker.com/install/linux/docker-ce/ubuntu/)
77 |
78 | Alternatively, Docker now provide a version of Docker Desktop for some Linux distributions which gives a user experience similar to Docker Desktop on Windows or macOS. You can find instructions on installing this at:
79 |
80 | - [Docker Desktop for Linux](https://docs.docker.com/desktop/install/linux-install/)
81 |
82 | ### Verify Installation
83 |
84 | To quickly check that the Docker client and server are working, run the following command in a new terminal or ssh session:
85 | ~~~
86 | $ docker version
87 | ~~~
88 | {: .language-bash}
89 | ~~~
90 | Client:
91 | Version: 20.10.2
92 | API version: 1.41
93 | Go version: go1.13.8
94 | Git commit: 20.10.2-0ubuntu2
95 | Built: Tue Mar 2 05:52:27 2021
96 | OS/Arch: linux/arm64
97 | Context: default
98 | Experimental: true
99 |
100 | Server:
101 | Engine:
102 | Version: 20.10.2
103 | API version: 1.41 (minimum version 1.12)
104 | Go version: go1.13.8
105 | Git commit: 20.10.2-0ubuntu2
106 | Built: Tue Mar 2 05:45:16 2021
107 | OS/Arch: linux/arm64
108 | Experimental: false
109 | containerd:
110 | Version: 1.4.4-0ubuntu1
111 | GitCommit:
112 | runc:
113 | Version: 1.0.0~rc95-0ubuntu1~21.04.1
114 | GitCommit:
115 | docker-init:
116 | Version: 0.19.0
117 | GitCommit:
118 | ~~~
119 | {: .output}
120 |
121 | The above output shows a successful installation; the details will vary based on your system. The important part is that the "Client" and the "Server" sections both work and return information. It is beyond the scope of this document to debug installation problems, but common errors include the user not belonging to the `docker` group and forgetting to start a new terminal or ssh session.
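122 | 
123 | As a further check that containers can actually run end-to-end (this also pulls a small image from Docker Hub, so it confirms network access too), you can try the standard `hello-world` image:
124 | 
125 | ~~~
126 | $ docker run hello-world
127 | ~~~
128 | {: .language-bash}
129 | 
130 | If everything is working, you should see a short welcome message from Docker.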
122 |
123 | ## Getting an account on ARCHER2
124 |
125 | ### Sign up for a SAFE account
126 |
127 | To sign up, you must first register for an account on [SAFE](https://safe.epcc.ed.ac.uk/) (our service administration web application). If you are already registered on the EPCC SAFE you do not need to re-register; please proceed to the next step.
130 |
131 | 1. Go to the [SAFE New User Signup Form](https://safe.epcc.ed.ac.uk/signup.jsp)
132 | 2. Fill in your personal details. You can come back later and change them if you wish. _**Note:** you should register using your institutional or company email address - email domains such as gmail.com, outlook.com, etc. are not allowed to be used for access to ARCHER2_
133 | 3. Click “Submit”
134 | 4. You are now registered. A single-use login link will be emailed to the email address you provided. You can use this link to log in and set your password.
135 |
136 | ### Sign up for an account on ARCHER2 through SAFE
137 |
138 | In addition to your password, you will need an SSH key pair to access ARCHER2. There is useful guidance on how
139 | to generate SSH key pairs in [the ARCHER2 documentation](https://docs.archer2.ac.uk/user-guide/connecting/#ssh-key-pairs).
140 | It is useful to generate your SSH key pair before you request an account on ARCHER2, as you can
141 | then add the public key as part of the account request.
142 |
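143 | As a minimal sketch (the key file name is just an illustrative choice; follow the ARCHER2 documentation for its current recommendations), you could generate a key pair on your own machine with:
144 | 
145 | ~~~
146 | ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa_archer2
147 | ~~~
148 | {: .language-bash}
149 | 
150 | The public half, `~/.ssh/id_rsa_archer2.pub`, is the key you paste into SAFE below.
151 | 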
143 | 1. [Login to SAFE](https://safe.epcc.ed.ac.uk)
144 | 2. Go to the Menu "Login accounts" and select "Request login account"
145 | 3. Choose the `ta118` project in the “Choose Project for Machine Account” box and click "Next"
146 | 4. Select the *ARCHER2* machine in the list of available machines
147 | 5. Click *Next*
148 | 6. Enter a username for the account and an SSH public
149 | key
150 | 1. You can always add an SSH key (or additional SSH keys) after your account has been created.
151 | 7. Click *Request*
152 |
153 | Now you have to wait for the course organiser to accept your request to register. When this has happened, your account will be created on ARCHER2.
154 | Once this has been done, you should be sent an email. _If you have not received an email but believe that your account should have been activated, check your account status in SAFE, which will also show when the account has been activated._ You can then pick up your one-shot initial password for ARCHER2 from your SAFE account.
155 |
156 | ### Log into ARCHER2
157 |
158 | You should now be able to log into ARCHER2 by following the [login instructions in the ARCHER2 documentation](https://docs.archer2.ac.uk/user-guide/connecting/#ssh-clients).
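159 | 
160 | For most people this is a plain `ssh` from a local terminal. A minimal sketch, assuming the standard ARCHER2 login address and the key pair generated earlier (replace `username` with your ARCHER2 username):
161 | 
162 | ~~~
163 | ssh -i ~/.ssh/id_rsa_archer2 username@login.archer2.ac.uk
164 | ~~~
165 | {: .language-bash}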
159 |
160 | ### A quick tutorial on copy/pasting file contents from episodes of the lesson
161 | Let's say you want to copy text off the lesson website and paste it into a file named `myfile` in the current working directory of a shell window. This can be achieved in many ways, depending on your computer's operating system, but here are some routes that have worked for me:
162 | - macOS and Linux: you are likely to have the `nano` editor installed, which provides a very straightforward way to create such a file. Just run `nano myfile`, paste the text into the shell window, and press control+x to exit; you will be prompted whether you want to save changes to the file, and you can type y to say "yes".
163 | - Microsoft Windows running `cmd.exe` shells:
164 | - `del myfile` to remove `myfile` if it already existed;
165 | - `copy con myfile` so that what is typed in your shell window is copied into `myfile`;
166 | - paste the text you want within `myfile` into the shell window;
167 | - type control+z and then press enter to finish copying content into `myfile` and return to your shell;
168 | - you can run the command `type myfile` to check the content of that file, as a double-check.
169 | - Microsoft Windows running PowerShell:
170 | - The `cmd.exe` method probably works, but another option is to paste your file contents into a so-called "here-string" between `@'` and `'@`, as in the example that follows (the ">" is the prompt indicator):
171 |
172 | > @'
173 | Some hypothetical
174 | file content that is
175 |
176 | split over many
177 |
178 | lines.
179 | '@ | Set-Content myfile -encoding ascii
180 |
181 | {% include links.md %}
182 |
--------------------------------------------------------------------------------
/bin/workshop_check.py:
--------------------------------------------------------------------------------
1 | '''Check that a workshop's index.html metadata is valid. See the
2 | docstrings on the checking functions for a summary of the checks.
3 | '''
4 |
5 |
6 | import sys
7 | import os
8 | import re
9 | from datetime import date
10 | from util import Reporter, split_metadata, load_yaml, check_unwanted_files
11 |
12 | # Metadata field patterns.
13 | EMAIL_PATTERN = r'[^@]+@[^@]+\.[^@]+'
14 | HUMANTIME_PATTERN = r'((0?[1-9]|1[0-2]):[0-5]\d(am|pm)(-|to)(0?[1-9]|1[0-2]):[0-5]\d(am|pm))|((0?\d|1\d|2[0-3]):[0-5]\d(-|to)(0?\d|1\d|2[0-3]):[0-5]\d)'
15 | EVENTBRITE_PATTERN = r'\d{9,10}'
16 | URL_PATTERN = r'https?://.+'
17 |
18 | # Defaults.
19 | CARPENTRIES = ("dc", "swc", "lc", "cp")
20 | DEFAULT_CONTACT_EMAIL = 'admin@software-carpentry.org'
21 |
22 | USAGE = 'Usage: "workshop_check.py path/to/root/directory"'
23 |
24 | # Country and language codes. Note that codes mean different things: 'ar'
25 | # is 'Arabic' as a language but 'Argentina' as a country.
26 |
27 | ISO_COUNTRY = [
28 | 'ad', 'ae', 'af', 'ag', 'ai', 'al', 'am', 'an', 'ao', 'aq', 'ar', 'as',
29 | 'at', 'au', 'aw', 'ax', 'az', 'ba', 'bb', 'bd', 'be', 'bf', 'bg', 'bh',
30 | 'bi', 'bj', 'bm', 'bn', 'bo', 'br', 'bs', 'bt', 'bv', 'bw', 'by', 'bz',
31 | 'ca', 'cc', 'cd', 'cf', 'cg', 'ch', 'ci', 'ck', 'cl', 'cm', 'cn', 'co',
32 | 'cr', 'cu', 'cv', 'cx', 'cy', 'cz', 'de', 'dj', 'dk', 'dm', 'do', 'dz',
33 | 'ec', 'ee', 'eg', 'eh', 'er', 'es', 'et', 'eu', 'fi', 'fj', 'fk', 'fm',
34 | 'fo', 'fr', 'ga', 'gb', 'gd', 'ge', 'gf', 'gg', 'gh', 'gi', 'gl', 'gm',
35 | 'gn', 'gp', 'gq', 'gr', 'gs', 'gt', 'gu', 'gw', 'gy', 'hk', 'hm', 'hn',
36 | 'hr', 'ht', 'hu', 'id', 'ie', 'il', 'im', 'in', 'io', 'iq', 'ir', 'is',
37 | 'it', 'je', 'jm', 'jo', 'jp', 'ke', 'kg', 'kh', 'ki', 'km', 'kn', 'kp',
38 | 'kr', 'kw', 'ky', 'kz', 'la', 'lb', 'lc', 'li', 'lk', 'lr', 'ls', 'lt',
39 | 'lu', 'lv', 'ly', 'ma', 'mc', 'md', 'me', 'mg', 'mh', 'mk', 'ml', 'mm',
40 | 'mn', 'mo', 'mp', 'mq', 'mr', 'ms', 'mt', 'mu', 'mv', 'mw', 'mx', 'my',
41 | 'mz', 'na', 'nc', 'ne', 'nf', 'ng', 'ni', 'nl', 'no', 'np', 'nr', 'nu',
42 | 'nz', 'om', 'pa', 'pe', 'pf', 'pg', 'ph', 'pk', 'pl', 'pm', 'pn', 'pr',
43 | 'ps', 'pt', 'pw', 'py', 'qa', 're', 'ro', 'rs', 'ru', 'rw', 'sa', 'sb',
44 | 'sc', 'sd', 'se', 'sg', 'sh', 'si', 'sj', 'sk', 'sl', 'sm', 'sn', 'so',
45 | 'sr', 'st', 'sv', 'sy', 'sz', 'tc', 'td', 'tf', 'tg', 'th', 'tj', 'tk',
46 | 'tl', 'tm', 'tn', 'to', 'tr', 'tt', 'tv', 'tw', 'tz', 'ua', 'ug', 'um',
47 | 'us', 'uy', 'uz', 'va', 'vc', 've', 'vg', 'vi', 'vn', 'vu', 'wf', 'ws',
48 | 'ye', 'yt', 'za', 'zm', 'zw'
49 | ]
50 |
51 | ISO_LANGUAGE = [
52 | 'aa', 'ab', 'ae', 'af', 'ak', 'am', 'an', 'ar', 'as', 'av', 'ay', 'az',
53 | 'ba', 'be', 'bg', 'bh', 'bi', 'bm', 'bn', 'bo', 'br', 'bs', 'ca', 'ce',
54 | 'ch', 'co', 'cr', 'cs', 'cu', 'cv', 'cy', 'da', 'de', 'dv', 'dz', 'ee',
55 | 'el', 'en', 'eo', 'es', 'et', 'eu', 'fa', 'ff', 'fi', 'fj', 'fo', 'fr',
56 | 'fy', 'ga', 'gd', 'gl', 'gn', 'gu', 'gv', 'ha', 'he', 'hi', 'ho', 'hr',
57 | 'ht', 'hu', 'hy', 'hz', 'ia', 'id', 'ie', 'ig', 'ii', 'ik', 'io', 'is',
58 | 'it', 'iu', 'ja', 'jv', 'ka', 'kg', 'ki', 'kj', 'kk', 'kl', 'km', 'kn',
59 | 'ko', 'kr', 'ks', 'ku', 'kv', 'kw', 'ky', 'la', 'lb', 'lg', 'li', 'ln',
60 | 'lo', 'lt', 'lu', 'lv', 'mg', 'mh', 'mi', 'mk', 'ml', 'mn', 'mr', 'ms',
61 | 'mt', 'my', 'na', 'nb', 'nd', 'ne', 'ng', 'nl', 'nn', 'no', 'nr', 'nv',
62 | 'ny', 'oc', 'oj', 'om', 'or', 'os', 'pa', 'pi', 'pl', 'ps', 'pt', 'qu',
63 | 'rm', 'rn', 'ro', 'ru', 'rw', 'sa', 'sc', 'sd', 'se', 'sg', 'si', 'sk',
64 | 'sl', 'sm', 'sn', 'so', 'sq', 'sr', 'ss', 'st', 'su', 'sv', 'sw', 'ta',
65 | 'te', 'tg', 'th', 'ti', 'tk', 'tl', 'tn', 'to', 'tr', 'ts', 'tt', 'tw',
66 | 'ty', 'ug', 'uk', 'ur', 'uz', 've', 'vi', 'vo', 'wa', 'wo', 'xh', 'yi',
67 | 'yo', 'za', 'zh', 'zu'
68 | ]
69 |
70 |
71 | def look_for_fixme(func):
72 | """Decorator to fail test if text argument starts with "FIXME"."""
73 |
74 | def inner(arg):
75 | if (arg is not None) and \
76 | isinstance(arg, str) and \
77 | arg.lstrip().startswith('FIXME'):
78 | return False
79 | return func(arg)
80 | return inner
81 |
82 |
83 | @look_for_fixme
84 | def check_layout(layout):
85 | '''"layout" in YAML header must be "workshop".'''
86 |
87 | return layout == 'workshop'
88 |
89 |
90 | @look_for_fixme
91 | def check_carpentry(layout):
92 | '''"carpentry" in YAML header must be "dc", "swc", "lc", or "cp".'''
93 |
94 | return layout in CARPENTRIES
95 |
96 |
97 | @look_for_fixme
98 | def check_country(country):
99 | '''"country" must be a lowercase ISO-3166 two-letter code.'''
100 |
101 | return country in ISO_COUNTRY
102 |
103 |
104 | @look_for_fixme
105 | def check_language(language):
106 | '''"language" must be a lowercase ISO-639 two-letter code.'''
107 |
108 | return language in ISO_LANGUAGE
109 |
110 |
111 | @look_for_fixme
112 | def check_humandate(date):
113 | """
114 | 'humandate' must be a human-readable date with a 3-letter month
115 | and 4-digit year. Examples include 'Feb 18-20, 2025' and 'Feb 18
116 | and 20, 2025'. It may be in languages other than English, but the
117 | month name should be kept short to aid formatting of the main
118 | Carpentries web site.
119 | """
120 |
121 | if ',' not in date:
122 | return False
123 |
124 | month_dates, year = date.split(',')
125 |
126 |     # The first three characters of month_dates must be the month name, with no spaces
127 | month = month_dates[:3]
128 | if any(char == ' ' for char in month):
129 | return False
130 |
131 |     # But the fourth character must be a space ("February" is illegal)
132 | if month_dates[3] != ' ':
133 | return False
134 |
135 | # year contains *only* numbers
136 | try:
137 | int(year)
138 |     except ValueError:
139 | return False
140 |
141 | return True
142 |
143 |
144 | @look_for_fixme
145 | def check_humantime(time):
146 | """
147 | 'humantime' is a human-readable start and end time for the
148 | workshop, such as '09:00 - 16:00'.
149 | """
150 |
151 | return bool(re.match(HUMANTIME_PATTERN, time.replace(' ', '')))
152 |
153 |
154 | def check_date(this_date):
155 | """
156 | 'startdate' and 'enddate' are machine-readable start and end dates
157 | for the workshop, and must be in YYYY-MM-DD format, e.g.,
158 | '2015-07-01'.
159 | """
160 |
161 | # YAML automatically loads valid dates as datetime.date.
162 | return isinstance(this_date, date)
163 |
164 |
165 | @look_for_fixme
166 | def check_latitude_longitude(latlng):
167 | """
168 | 'latlng' must be a valid latitude and longitude represented as two
169 | floating-point numbers separated by a comma.
170 | """
171 |
172 | try:
173 | lat, lng = latlng.split(',')
174 | lat = float(lat)
175 | lng = float(lng)
176 | return (-90.0 <= lat <= 90.0) and (-180.0 <= lng <= 180.0)
177 | except ValueError:
178 | return False
179 |
180 |
181 | def check_instructors(instructors):
182 | """
183 | 'instructor' must be a non-empty comma-separated list of quoted
184 |     names, e.g. ['First name', 'Second name', ...]. Do not use 'TBD'
185 | or other placeholders.
186 | """
187 |
188 | # YAML automatically loads list-like strings as lists.
189 | return isinstance(instructors, list) and len(instructors) > 0
190 |
191 |
192 | def check_helpers(helpers):
193 | """
194 | 'helper' must be a comma-separated list of quoted names,
195 |     e.g. ['First name', 'Second name', ...]. The list may be empty.
196 | Do not use 'TBD' or other placeholders.
197 | """
198 |
199 | # YAML automatically loads list-like strings as lists.
200 | return isinstance(helpers, list) and len(helpers) >= 0
201 |
202 |
203 | @look_for_fixme
204 | def check_emails(emails):
205 | """
206 | 'emails' must be a comma-separated list of valid email addresses.
207 | The list may be empty. A valid email address consists of characters,
208 |     an '@', and more characters. It should not contain the default contact email address.
209 | """
210 |
211 | # YAML automatically loads list-like strings as lists.
212 | if (isinstance(emails, list) and len(emails) >= 0):
213 | for email in emails:
214 | if ((not bool(re.match(EMAIL_PATTERN, email))) or (email == DEFAULT_CONTACT_EMAIL)):
215 | return False
216 | else:
217 | return False
218 |
219 | return True
220 |
221 |
222 | def check_eventbrite(eventbrite):
223 | """
224 | 'eventbrite' (the Eventbrite registration key) must be 9 or more
225 | digits. It may appear as an integer or as a string.
226 | """
227 |
228 | if isinstance(eventbrite, int):
229 | return True
230 | else:
231 | return bool(re.match(EVENTBRITE_PATTERN, eventbrite))
232 |
233 |
234 | @look_for_fixme
235 | def check_collaborative_notes(collaborative_notes):
236 | """
237 | 'collaborative_notes' must be a valid URL.
238 | """
239 |
240 | return bool(re.match(URL_PATTERN, collaborative_notes))
241 |
242 |
243 | @look_for_fixme
244 | def check_pass(value):
245 | """
246 | This test always passes (it is used for 'checking' things like the
247 | workshop address, for which no sensible validation is feasible).
248 | """
249 |
250 | return True
251 |
252 |
253 | HANDLERS = {
254 | 'layout': (True, check_layout, 'layout isn\'t "workshop"'),
255 |
256 | 'carpentry': (True, check_carpentry, 'carpentry isn\'t in ' +
257 | ', '.join(CARPENTRIES)),
258 |
259 | 'country': (True, check_country,
260 | 'country invalid: must use lowercase two-letter ISO code ' +
261 | 'from ' + ', '.join(ISO_COUNTRY)),
262 |
263 | 'language': (False, check_language,
264 | 'language invalid: must use lowercase two-letter ISO code' +
265 | ' from ' + ', '.join(ISO_LANGUAGE)),
266 |
267 | 'humandate': (True, check_humandate,
268 | 'humandate invalid. Please use three-letter months like ' +
269 |                   '"Jan" and four-digit years like "2025"'),
270 |
271 | 'humantime': (True, check_humantime,
272 | 'humantime doesn\'t include numbers'),
273 |
274 | 'startdate': (True, check_date,
275 | 'startdate invalid. Must be of format year-month-day, ' +
276 | 'i.e., 2014-01-31'),
277 |
278 | 'enddate': (False, check_date,
279 | 'enddate invalid. Must be of format year-month-day, i.e.,' +
280 | ' 2014-01-31'),
281 |
282 | 'latlng': (True, check_latitude_longitude,
283 | 'latlng invalid. Check that it is two floating point ' +
284 | 'numbers, separated by a comma'),
285 |
286 | 'instructor': (True, check_instructors,
287 | 'instructor list isn\'t a valid list of format ' +
288 | '["First instructor", "Second instructor",..]'),
289 |
290 | 'helper': (True, check_helpers,
291 | 'helper list isn\'t a valid list of format ' +
292 | '["First helper", "Second helper",..]'),
293 |
294 | 'email': (True, check_emails,
295 | 'contact email list isn\'t a valid list of format ' +
296 | '["me@example.org", "you@example.org",..] or contains incorrectly formatted email addresses or ' +
297 | '"{0}".'.format(DEFAULT_CONTACT_EMAIL)),
298 |
299 | 'eventbrite': (False, check_eventbrite, 'Eventbrite key appears invalid'),
300 |
301 | 'collaborative_notes': (False, check_collaborative_notes, 'Collaborative Notes URL appears invalid'),
302 |
303 | 'venue': (False, check_pass, 'venue name not specified'),
304 |
305 | 'address': (False, check_pass, 'address not specified')
306 | }
307 |
308 | # REQUIRED is all required categories.
309 | REQUIRED = {k for k in HANDLERS if HANDLERS[k][0]}
310 |
311 | # OPTIONAL is all optional categories.
312 | OPTIONAL = {k for k in HANDLERS if not HANDLERS[k][0]}
313 |
314 |
315 | def check_blank_lines(reporter, raw):
316 | """
317 | Blank lines are not allowed in category headers.
318 | """
319 |
320 | lines = [(i, x) for (i, x) in enumerate(
321 | raw.strip().split('\n')) if not x.strip()]
322 | reporter.check(not lines,
323 | None,
324 | 'Blank line(s) in header: {0}',
325 | ', '.join(["{0}: {1}".format(i, x.rstrip()) for (i, x) in lines]))
326 |
327 |
328 | def check_categories(reporter, left, right, msg):
329 | """
330 | Report differences (if any) between two sets of categories.
331 | """
332 |
333 | diff = left - right
334 | reporter.check(len(diff) == 0,
335 | None,
336 | '{0}: offending entries {1}',
337 | msg, sorted(list(diff)))
338 |
339 |
340 | def check_file(reporter, path, data):
341 | """
342 | Get header from file, call all other functions, and check file for
343 | validity.
344 | """
345 |
346 | # Get metadata as text and as YAML.
347 | raw, header, body = split_metadata(path, data)
348 |
349 | # Do we have any blank lines in the header?
350 | check_blank_lines(reporter, raw)
351 |
352 | # Look through all header entries. If the category is in the input
353 | # file and is either required or we have actual data (as opposed to
354 | # a commented-out entry), we check it. If it *isn't* in the header
355 | # but is required, report an error.
356 | for category in HANDLERS:
357 | required, handler, message = HANDLERS[category]
358 | if category in header:
359 | if required or header[category]:
360 | reporter.check(handler(header[category]),
361 | None,
362 | '{0}\n actual value "{1}"',
363 | message, header[category])
364 | elif required:
365 | reporter.add(None,
366 | 'Missing mandatory key "{0}"',
367 | category)
368 |
369 | # Check whether we have missing or too many categories
370 | seen_categories = set(header.keys())
371 | check_categories(reporter, REQUIRED, seen_categories,
372 | 'Missing categories')
373 | check_categories(reporter, seen_categories, REQUIRED.union(OPTIONAL),
374 | 'Superfluous categories')
375 |
376 |
377 | def check_config(reporter, filename):
378 | """
379 | Check YAML configuration file.
380 | """
381 |
382 | config = load_yaml(filename)
383 |
384 | kind = config.get('kind', None)
385 | reporter.check(kind == 'workshop',
386 | filename,
387 | 'Missing or unknown kind of event: {0}',
388 | kind)
389 |
390 | carpentry = config.get('carpentry', None)
391 | reporter.check(carpentry in ('swc', 'dc', 'lc', 'cp'),
392 | filename,
393 | 'Missing or unknown carpentry: {0}',
394 | carpentry)
395 |
396 |
397 | def main():
398 | '''Run as the main program.'''
399 |
400 | if len(sys.argv) != 2:
401 | print(USAGE, file=sys.stderr)
402 | sys.exit(1)
403 |
404 | root_dir = sys.argv[1]
405 | index_file = os.path.join(root_dir, 'index.html')
406 | config_file = os.path.join(root_dir, '_config.yml')
407 |
408 | reporter = Reporter()
409 | check_config(reporter, config_file)
410 | check_unwanted_files(root_dir, reporter)
411 | with open(index_file, encoding='utf-8') as reader:
412 | data = reader.read()
413 | check_file(reporter, index_file, data)
414 | reporter.report()
415 |
416 |
417 | if __name__ == '__main__':
418 | main()
419 |
--------------------------------------------------------------------------------
/_episodes/singularity-blast.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: "(Optional) Using Singularity to run BLAST+"
3 | teaching: 30
4 | exercises: 30
5 | questions:
6 | - "How can I use Singularity to run bioinformatics workflows with BLAST+?"
7 | objectives:
8 | - "Show example of using Singularity with a common bioinformatics tool."
9 | keypoints:
10 | - "We can use containers to run software without having to install it"
11 | - "The commands we use are very similar to those we would use natively"
12 | - "Singularity handles a lot of complexity around data and internet access for us"
13 | ---
14 |
15 | We have now learned enough to be able to use Singularity to deploy software without
16 | needing to install the software itself on the host system.
17 |
18 | In this section we will demonstrate the use of a Singularity container image that
19 | provides the BLAST+ software.
20 |
21 | > ## Source material
22 | > This example is based on the example from the official [NCBI BLAST+ Docker
23 | > container documentation](https://github.com/ncbi/blast_plus_docs#step-2-import-sequences-and-create-a-blast-database).
24 | > Note: the `efetch` parts of the step-by-step guide do not currently work with the
25 | > Singularity version of the image so we provide a dataset with the data already
26 | > downloaded.
27 | >
28 | > (This is because the NCBI BLAST+ Docker container image has the `efetch` tool
29 | > installed in the `/root` directory and this special location gets overwritten
30 | > during the conversion to a Singularity container image.)
31 | {: .callout}
32 |
33 | ## Download the required data
34 |
35 | Download the [blast_example.tar.gz]({{ page.root }}/files/blast_example.tar.gz).
36 |
37 | Unpack the archive which contains the downloaded data required for the BLAST+ example:
38 |
39 | ~~~
40 | remote$ wget https://epcced.github.io/2023-09-21_Singularity_Nottingham/files/blast_example.tar.gz
41 | remote$ tar -xvf blast_example.tar.gz
42 | ~~~
43 | {: .language-bash}
44 | ~~~
45 | x blast/
46 | x blast/blastdb/
47 | x blast/queries/
48 | x blast/fasta/
49 | x blast/results/
50 | x blast/blastdb_custom/
51 | x blast/fasta/nurse-shark-proteins.fsa
52 | x blast/queries/P01349.fsa
53 | ~~~
54 | {: .output}
55 |
56 | Finally, move into the newly created directory:
57 |
58 | ~~~
59 | remote$ cd blast
60 | remote$ ls
61 | ~~~
62 | {: .language-bash}
63 | ~~~
64 | blastdb blastdb_custom fasta queries results
65 | ~~~
66 | {: .output}
67 |
68 | ## Create the Singularity container image
69 |
70 | NCBI provide official Docker containers with the BLAST+ software hosted on Docker Hub. We can create
71 | a Singularity container image from the Docker container image with:
72 |
73 | ~~~
74 | remote$ singularity pull ncbi-blast.sif docker://ncbi/blast
75 | ~~~
76 | {: .language-bash}
77 | ~~~
78 | INFO: Converting OCI blobs to SIF format
79 | INFO: Starting build...
80 | Getting image source signatures
81 | Copying blob f3b81f6693c5 done
82 | Copying blob 9e3ea8720c6d done
83 | Copying blob f1910abb61ed done
84 | Copying blob 5ac33d4de47b done
85 | Copying blob 8402427c8382 done
86 | Copying blob 06add1a477bc done
87 | Copying blob d9781f222125 done
88 | Copying blob 4aae31cc8a8b done
89 | Copying blob 6a61413c1ffa done
90 | Copying blob c657bf8fc6ca done
91 | Copying blob 1776e565f5f8 done
92 | Copying blob d90474a0d8c8 done
93 | Copying blob 0bc89cb1b9d7 done
94 | Copying blob b8a272fccf13 done
95 | Copying blob 891eb09f891f done
96 | Copying blob 4c64befa8a35 done
97 | Copying blob 7ab0b7afbc21 done
98 | Copying blob b007c620c60b done
99 | Copying blob f877ffc04713 done
100 | Copying blob 6ee97c348001 done
101 | Copying blob 03f0ee97190b done
102 | Copying config 28914b3519 done
103 | Writing manifest to image destination
104 | Storing signatures
105 | 2023/06/16 08:26:53 info unpack layer: sha256:9e3ea8720c6de96cc9ad544dddc695a3ab73f5581c5d954e0504cc4f80fb5e5c
106 | 2023/06/16 08:26:53 info unpack layer: sha256:06add1a477bcffec8bac0529923aa8ae25d51f0660f0c8ef658e66aa89ac82c2
107 | 2023/06/16 08:26:53 info unpack layer: sha256:f3b81f6693c592ab94c8ebff2109dc60464d7220578331c39972407ef7b9e5ec
108 | 2023/06/16 08:26:53 info unpack layer: sha256:5ac33d4de47beb37ae35e9cad976d27afa514ab8cbc66e0e60c828a98e7531f4
109 | 2023/06/16 08:27:03 info unpack layer: sha256:8402427c8382ab723ac504155561fb6d3e5ea1e7b4f3deac8449cec9e44ae65a
110 | 2023/06/16 08:27:03 info unpack layer: sha256:f1910abb61edef8947e9b5556ec756fd989fa13f329ac503417728bf3b0bae5e
111 | 2023/06/16 08:27:03 info unpack layer: sha256:d9781f222125b5ad192d0df0b59570f75b797b2ab1dc0d82064c1b6cead04840
112 | 2023/06/16 08:27:03 info unpack layer: sha256:4aae31cc8a8b726dce085e4e2dc4671a9be28162b8d4e1b1c00b8754f14e6fe6
113 | 2023/06/16 08:27:03 info unpack layer: sha256:6a61413c1ffa309d92931265a5b0ecc9448568f13ccf3920e16aaacc8fdfc671
114 | 2023/06/16 08:27:03 info unpack layer: sha256:c657bf8fc6cae341e3835cb101dc4c6839ba4aad69578ff8538b3c1eba7abb21
115 | 2023/06/16 08:27:04 info unpack layer: sha256:1776e565f5f85562b8601edfd29c35f3fba76eb53177c8e89105f709387e3627
116 | 2023/06/16 08:27:04 info unpack layer: sha256:d90474a0d8c8e6165d909cc0ebbf97dbe70fd759a93eff11a5a3f91fa09a470e
117 | 2023/06/16 08:27:04 warn rootless{root/edirect/aux/lib/perl5/Mozilla/CA/cacert.pem} ignoring (usually) harmless EPERM on setxattr "user.rootlesscontainers"
118 | 2023/06/16 08:27:04 warn rootless{root/edirect/aux/lib/perl5/Mozilla/CA.pm} ignoring (usually) harmless EPERM on setxattr "user.rootlesscontainers"
119 | 2023/06/16 08:27:04 warn rootless{root/edirect/aux/lib/perl5/Mozilla/mk-ca-bundle.pl} ignoring (usually) harmless EPERM on setxattr "user.rootlesscontainers"
120 | 2023/06/16 08:27:04 info unpack layer: sha256:0bc89cb1b9d7ca198a7a1b95258006560feffaff858509be8eb7388b315b9cf5
121 | 2023/06/16 08:27:04 info unpack layer: sha256:b8a272fccf13b721fa68826f17f0c2bb395de377e0d22c98d38748eb5957a4c6
122 | 2023/06/16 08:27:04 info unpack layer: sha256:891eb09f891ff2c26f24a5466112e134f6fb30bd3d0e78c14c0d676b0e68d60a
123 | 2023/06/16 08:27:04 info unpack layer: sha256:4c64befa8a35c9f8518324524dfc27966753462a4c07b2234811865387058bf4
124 | 2023/06/16 08:27:04 info unpack layer: sha256:7ab0b7afbc21b75697a7b8ed907ee9b81e5b17a04895dc6ff7d25ea2ba1eeba4
125 | 2023/06/16 08:27:04 info unpack layer: sha256:b007c620c60b91ce6a9e76584ecc4bc062c822822c204d8c2b1c8668193d44d1
126 | 2023/06/16 08:27:04 info unpack layer: sha256:f877ffc04713a03dffd995f540ee13b65f426b350cdc8c5f1e20c290de129571
127 | 2023/06/16 08:27:04 info unpack layer: sha256:6ee97c348001fca7c98e56f02b787ce5e91d8cc7af7c7f96810a9ecf4a833504
128 | 2023/06/16 08:27:04 info unpack layer: sha256:03f0ee97190baebded2f82136bad72239254175c567b19def105b755247b0193
129 | INFO: Creating SIF file...
130 | ~~~
131 | {: .output}
132 |
133 | Now that we have a container image with the software in it, we can use it.
134 |
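135 | As a quick check that the software is really there, you can ask one of the BLAST+ tools to report its version (assuming the `-version` flag of standard BLAST+ builds behaves the same inside the container):
136 | 
137 | ~~~
138 | remote$ singularity exec ncbi-blast.sif blastp -version
139 | ~~~
140 | {: .language-bash}
141 | 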
135 | ## Build and verify the BLAST database
136 |
137 | Our example dataset already contains the query and database sequences. We first
138 | use these data to create a custom BLAST database by using the container to run
139 | the `makeblastdb` command with the correct options.
140 |
141 | ~~~
142 | remote$ singularity exec ncbi-blast.sif \
143 | makeblastdb -in fasta/nurse-shark-proteins.fsa -dbtype prot \
144 | -parse_seqids -out nurse-shark-proteins -title "Nurse shark proteins" \
145 | -taxid 7801 -blastdb_version 5
146 | ~~~
147 | {: .language-bash}
148 | ~~~
149 |
150 | Building a new DB, current time: 06/16/2023 14:35:07
151 | New DB name: /home/auser/test/blast/blast/nurse-shark-proteins
152 | New DB title: Nurse shark proteins
153 | Sequence type: Protein
154 | Keep MBits: T
155 | Maximum file size: 3000000000B
156 | Adding sequences from FASTA; added 7 sequences in 0.0199499 seconds.
157 |
158 | ~~~
159 | {: .output}
160 |
161 | To verify the newly created BLAST database above, you can run the
162 | `blastdbcmd -entry all -db nurse-shark-proteins -outfmt "%a %l %T"` command to display
163 | the accession, sequence length, and taxonomy ID of the sequences in the database.
164 |
165 | ~~~
166 | remote$ singularity exec ncbi-blast.sif \
167 | blastdbcmd -entry all -db nurse-shark-proteins -outfmt "%a %l %T"
168 | ~~~
169 | {: .language-bash}
170 |
171 | ~~~
172 | Q90523.1 106 7801
173 | P80049.1 132 7801
174 | P83981.1 53 7801
175 | P83977.1 95 7801
176 | P83984.1 190 7801
177 | P83985.1 195 7801
178 | P27950.1 151 7801
179 | ~~~
180 | {: .output}
181 |
182 | Now we have our database we can run queries against it.
183 |
184 | ## Run a query against the BLAST database
185 |
186 | Let's execute a query on our database using the `blastp` command:
187 |
188 | ~~~
189 | remote$ singularity exec ncbi-blast.sif \
190 | blastp -query queries/P01349.fsa -db nurse-shark-proteins \
191 | -out results/blastp.out
192 | ~~~
193 | {: .language-bash}
194 |
195 | At this point, you should see the results of the query in the output file `results/blastp.out`.
196 | To view the content of this output file, use the command `less results/blastp.out`.
197 |
198 | ~~~
199 | remote$ less results/blastp.out
200 | ~~~
201 | {: .language-bash}
202 |
203 | ~~~
204 | ...output trimmed...
205 |
206 | Query= sp|P01349.2|RELX_CARTA RecName: Full=Relaxin; Contains: RecName:
207 | Full=Relaxin B chain; Contains: RecName: Full=Relaxin A chain
208 |
209 | Length=44
210 | Score E
211 | Sequences producing significant alignments: (Bits) Value
212 |
213 | P80049.1 RecName: Full=Fatty acid-binding protein, liver; AltName... 14.2 0.96
214 |
215 |
216 | >P80049.1 RecName: Full=Fatty acid-binding protein, liver; AltName: Full=Liver-type
217 | fatty acid-binding protein; Short=L-FABP
218 | Length=132
219 |
220 | ...output trimmed...
221 | ~~~
222 | {: .output}
223 |
224 | With your query, BLAST identified the protein sequence P80049.1 as a match with a score
225 | of 14.2 and an E-value of 0.96.
226 |
227 | ## Accessing online BLAST databases
228 |
229 | As well as building your own local database to query, you can also access databases that are
230 | available online. For example, to see which databases are available online on the Google Cloud
231 | Platform (GCP):
232 |
233 | ~~~
234 | remote$ singularity exec ncbi-blast.sif update_blastdb.pl --showall pretty --source gcp
235 | ~~~
236 | {: .language-bash}
237 |
238 | ~~~
239 | Connected to GCP
240 | BLASTDB DESCRIPTION SIZE (GB) LAST_UPDATED
241 | nr All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects 369.4824 2023-06-10
242 | swissprot Non-redundant UniProtKB/SwissProt sequences 0.3576 2023-06-10
243 | refseq_protein NCBI Protein Reference Sequences 146.5088 2023-06-12
244 | landmark Landmark database for SmartBLAST 0.3817 2023-04-25
245 | pdbaa PDB protein database 0.1967 2023-06-10
246 | nt Nucleotide collection (nt) 319.5044 2023-06-11
247 | pdbnt PDB nucleotide database 0.0145 2023-06-09
248 | patnt Nucleotide sequences derived from the Patent division of GenBank 15.7342 2023-06-09
249 | refseq_rna NCBI Transcript Reference Sequences 47.8721 2023-06-12
250 |
251 | ...output trimmed...
252 | ~~~
253 | {: .output}
254 |
255 | Similarly, for databases hosted at NCBI:
256 |
257 | ~~~
258 | remote$ singularity exec ncbi-blast.sif update_blastdb.pl --showall pretty --source ncbi
259 | ~~~
260 | {: .language-bash}
261 |
262 | ~~~
263 | Connected to NCBI
264 | BLASTDB DESCRIPTION SIZE (GB) LAST_UPDATED
265 | env_nr Proteins from WGS metagenomic projects (env_nr). 3.9459 2023-06-11
266 | SSU_eukaryote_rRNA Small subunit ribosomal nucleic acid for Eukaryotes 0.0063 2022-12-05
267 | LSU_prokaryote_rRNA Large subunit ribosomal nucleic acid for Prokaryotes 0.0041 2022-12-05
268 | 16S_ribosomal_RNA 16S ribosomal RNA (Bacteria and Archaea type strains) 0.0178 2023-06-16
269 | env_nt environmental samples 48.8599 2023-06-08
270 | LSU_eukaryote_rRNA Large subunit ribosomal nucleic acid for Eukaryotes 0.0053 2022-12-05
271 | ITS_RefSeq_Fungi Internal transcribed spacer region (ITS) from Fungi type and reference material 0.0067 2022-10-28
272 | Betacoronavirus Betacoronavirus 55.3705 2023-06-16
273 |
274 | ...output trimmed...
275 | ~~~
276 | {: .output}
277 |
278 | ## Notes
279 |
280 | You have now completed a simple example of using a complex piece of bioinformatics software
281 | through Singularity containers. You may have noticed that some things just worked without
282 | you needing to set them up even though you were running using containers:
283 |
284 | 1. We did not need to explicitly bind any files/directories into the container. This worked
285 | because Singularity automatically binds the current directory into the running container, so
286 | any data in the current directory (or its subdirectories) will generally be available in
287 | running Singularity containers. (If you have used Docker containers, you will notice that
288 | this is different from the default behaviour there.) See the sketch after this list for how to bind other directories explicitly.
289 | 2. Access to the internet is automatically available within the running container in the same
290 | way as it is on the host system, without us needing to specify any additional options.
291 | 3. Files and data we create within the container have the right ownership and permissions for
292 | us to access outside the container.
293 |
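294 | If your data lived outside the current directory, you would need to bind it into the container explicitly with Singularity's `--bind` option. A minimal sketch, using a hypothetical host directory `/scratch/mydata`:
295 | 
296 | ~~~
297 | remote$ singularity exec --bind /scratch/mydata:/data ncbi-blast.sif ls /data
298 | ~~~
299 | {: .language-bash}
300 | 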
294 | In addition, we were able to use the tools in the container image provided by NCBI without having
295 | to do any work to install the software irrespective of the computing platform that we are using.
296 | (In fact, the example this is based on runs the pipeline using Docker on a cloud computing platform
297 | rather than on your local system.)
298 |
299 |
300 |
--------------------------------------------------------------------------------
/_episodes/singularity-mpi.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Running MPI parallel jobs using Singularity containers"
3 | teaching: 30
4 | exercises: 40
5 | questions:
6 | - "How do I set up and run an MPI job from a Singularity container?"
7 | objectives:
8 | - "Learn how MPI applications within Singularity containers can be run on HPC platforms"
9 | - "Understand the challenges and related performance implications when running MPI jobs via Singularity"
10 | keypoints:
11 | - "Singularity images containing MPI applications can be built on one platform and then run on another (e.g. an HPC cluster) if the two platforms have compatible MPI implementations."
12 | - "When running an MPI application within a Singularity container, use the MPI executable on the host system to launch a Singularity container for each process."
13 | - "Think about parallel application performance requirements and how where you build/run your image may affect that."
14 | ---
15 |
16 | > ## What is MPI?
17 | > MPI - [Message Passing Interface](https://en.wikipedia.org/wiki/Message_Passing_Interface) - is a widely
18 | > used standard for parallel programming. It is used for exchanging messages/data between processes in a
19 | > parallel application. If you've been involved in developing or working with computational science software, you have probably come across MPI already.
20 | {: .callout}
21 |
22 | Usually, when working on HPC systems, you compile your application against the MPI libraries provided on the system
23 | or you use applications that have been compiled by the HPC system support team. This approach, known as
24 | *source code portability*, is the traditional way of making applications portable to different HPC platforms.
25 |
26 | However, compiling complex HPC applications that have lots of dependencies (including MPI) is not always straightforward
27 | and can be a significant challenge as most HPC systems differ in various ways in terms of OS and base software
28 | available. There are a number of different approaches that can be taken to make it easier to deploy applications
29 | on HPC systems; for example, the [Spack](https://spack.readthedocs.io) software automates the dependency resolution and compilation of
30 | applications. Containers provide another potential way to resolve these problems but care needs to be taken
31 | when interfacing with MPI on the host system which adds more complexity to running containers in parallel on
32 | HPC systems.
33 |
34 | ### MPI codes with Singularity containers
35 |
36 | Obviously, we will not have admin/root access on the HPC platform we are using so cannot (usually) build our container
37 | images on the HPC system itself. However, we do need to ensure our container uses the MPI library on
38 | the HPC system so we can get the performance benefit of the HPC interconnect. How do we overcome this
39 | apparent contradiction?
40 |
41 | The answer is that we install a version of the MPI library in our container image that is binary compatible with
42 | the MPI library on the host system and install our software in the container image using the local version of
43 | the MPI library. At runtime, we then ensure that the MPI library from the host is used within the running container
44 | rather than the locally-installed version of MPI.
45 |
46 | There are two widely used open source MPI library distributions on HPC systems:
47 |
48 | * [MPICH](https://www.mpich.org/) - in addition to the open source version, MPICH is [binary compatible](https://www.mpich.org/abi/) with many
49 | proprietary vendor libraries, including Intel MPI and HPE Cray MPT as well as the open source MVAPICH.
50 | * [OpenMPI](https://www.open-mpi.org/)
51 |
52 | This typically means that if you want to distribute HPC software that uses MPI within a container image you will
53 | need to maintain versions that are compatible with both MPICH and OpenMPI. There are efforts underway to provide
54 | tools that offer a binary interface between different MPI implementations, e.g. HPE Cray's MPIxlate software,
55 | but these are not yet generally available.
56 |
57 | ### Building a Singularity container image with MPI software
58 |
59 | This example makes the assumption that you'll be building a container image on a local platform and then deploying
60 | it to an HPC system with a different but compatible MPI implementation, using a combination of the *Hybrid* and *Bind*
61 | models from the Singularity documentation. We will build our application using MPI in the container image but will
62 | bind the MPI library from the host into the container at runtime. See
63 | [Singularity and MPI applications](https://docs.sylabs.io/guides/3.7/user-guide/mpi.html)
64 | in the Singularity documentation for more technical details.
65 |
66 | The example we will build will:
67 | * Use MPICH as the container image's MPI library
68 | * Use the Ohio State University MPI Micro-benchmarks as the example application
69 | * Use ARCHER2 as the runtime platform - this uses Cray MPT as the host MPI library and the HPE Cray Slingshot interconnect
70 |
71 | The Dockerfile to install MPICH and the OSU micro-benchmark we will use
72 | to build the container image is shown below. Create a new directory called
73 | `osu-benchmarks` to hold the build context for this new image. Create the
74 | `Dockerfile` in this directory.
75 |
76 | ~~~
77 | FROM ubuntu:20.04
78 |
79 | ENV DEBIAN_FRONTEND=noninteractive
80 |
81 | # Install the necessary packages (from repo)
82 | RUN apt-get update && apt-get install -y --no-install-recommends \
83 | apt-utils \
84 | build-essential \
85 | curl \
86 | libcurl4-openssl-dev \
87 | libzmq3-dev \
88 | pkg-config \
89 | software-properties-common
90 | RUN apt-get clean
91 | RUN apt-get install -y dkms
92 | RUN apt-get install -y autoconf automake build-essential numactl libnuma-dev autoconf automake gcc g++ git libtool
93 |
94 | # Download and build an ABI compatible MPICH
95 | RUN curl -sSLO http://www.mpich.org/static/downloads/3.4.2/mpich-3.4.2.tar.gz \
96 | && tar -xzf mpich-3.4.2.tar.gz -C /root \
97 | && cd /root/mpich-3.4.2 \
98 | && ./configure --prefix=/usr --with-device=ch4:ofi --disable-fortran \
99 | && make -j8 install \
100 | && cd / \
101 | && rm -rf /root/mpich-3.4.2 \
102 | && rm /mpich-3.4.2.tar.gz
103 |
104 | # OSU benchmarks
105 | RUN curl -sSLO http://mvapich.cse.ohio-state.edu/download/mvapich/osu-micro-benchmarks-5.4.1.tar.gz \
106 | && tar -xzf osu-micro-benchmarks-5.4.1.tar.gz -C /root \
107 | && cd /root/osu-micro-benchmarks-5.4.1 \
108 | && ./configure --prefix=/usr/local CC=/usr/bin/mpicc CXX=/usr/bin/mpicxx \
109 | && cd mpi \
110 | && make -j8 install \
111 | && cd / \
112 | && rm -rf /root/osu-micro-benchmarks-5.4.1 \
113 | && rm /osu-micro-benchmarks-5.4.1.tar.gz
114 |
115 | # Add the OSU benchmark executables to the PATH
116 | ENV PATH=/usr/local/libexec/osu-micro-benchmarks/mpi/startup:$PATH
117 | ENV PATH=/usr/local/libexec/osu-micro-benchmarks/mpi/pt2pt:$PATH
118 | ENV PATH=/usr/local/libexec/osu-micro-benchmarks/mpi/collective:$PATH
119 | ENV OSU_DIR=/usr/local/libexec/osu-micro-benchmarks/mpi
120 |
121 | # path to mlx IB libraries in Ubuntu
122 | ENV LD_LIBRARY_PATH=/usr/lib/libibverbs:$LD_LIBRARY_PATH
123 | ~~~
124 | {: .output}
125 |
126 | A quick overview of what the above Dockerfile is doing:
127 |
128 | - The image is being built based on the `ubuntu:20.04` Docker image.
129 | - In the `RUN` sections:
130 | - Ubuntu's `apt-get` package manager is used to update the package directory and then install the compilers and other libraries required for the MPICH and OSU benchmark build.
131 | - The MPICH software is downloaded, extracted, configured, built and installed. Note the use of the `--with-device` option to configure MPICH to use the correct driver to support improved communication performance on a high performance cluster. After the install is complete we delete the files that are no longer needed.
132 | - The OSU Micro-Benchmarks software is downloaded, extracted, configured, built and installed. After the install is complete we delete the files that are no longer needed.
133 | - In the `ENV` sections: Set environment variables that will be available within all containers run from the generated image.
134 |
135 | > ## Build and test the OSU Micro-Benchmarks image
136 | >
137 | > Using the above `Dockerfile`, build a container image and push it to Docker Hub.
138 | >
139 | > Pull the image on ARCHER2 using Singularity to convert it to a Singularity image and then test it by running the `osu_hello` benchmark that is found in the `startup` benchmark folder with either `singularity exec` or `singularity shell`.
140 | >
141 | > Note: the build process can take a while. If you want to test running while the build is happening, you can log into ARCHER2
142 | > and use a pre-built version of the container image to test. You can find this container image at:
143 | >
144 | > ~~~
145 | > ${EPCC_SINGULARITY_DIR}/osu_benchmarks.sif
146 | > ~~~
147 | >
148 | > > ## Solution
149 | > >
150 | > > You should be able to build an image and push it to Docker Hub as follows:
151 | > >
152 | > > ~~~
153 | > > $ docker image build --platform linux/amd64 -t alice/osu-benchmarks .
154 | > > $ docker push alice/osu-benchmarks
155 | > > ~~~
156 | > > {: .language-bash}
157 | > >
158 | > > You can then log into ARCHER2 and pull the container image from Docker Hub with:
159 | > >
160 | > > ~~~
161 | > > remote$ singularity pull osu-benchmarks.sif docker://alice/osu-benchmarks
162 | > > ~~~
163 | > > {: .language-bash}
164 | > >
165 | > > Let's begin with a single-process run of `startup/osu_hello` to ensure that we can run the container as expected. We'll use the MPI installation _within_ the container for this test. Note that when we run a parallel job on an HPC cluster platform, we use the MPI installation on the cluster to coordinate the run so things are a little different... we will see this shortly.
166 | > >
167 | > > Start a shell in the Singularity container based on your image and then run a single process job via `mpirun`:
168 | > >
169 | > > ~~~
170 | > > $ singularity shell --contain osu-benchmarks.sif
171 | > > Singularity> mpirun -np 1 osu_hello
172 | > > ~~~
173 | > > {: .language-bash}
174 | > >
175 | > > You should see output similar to the following:
176 | > >
177 | > > ~~~
178 | > > # OSU MPI Hello World Test v5.7.1
179 | > > This is a test with 1 processes
180 | > > ~~~
181 | > > {: .output}
182 | > {: .solution}
183 | {: .challenge}
184 |
185 | ### Running Singularity containers with MPI on an HPC system
186 |
187 | Assuming the above tests worked, we can now try undertaking a parallel run of one of the
188 | OSU benchmarking tools within our container image on the remote HPC platform.
189 |
190 | This is where things get interesting and we will begin by looking at how Singularity
191 | containers are run within an MPI environment.
192 |
193 | If you're familiar with running MPI codes, you'll know that you use `mpirun` (as we did
194 | in the previous example), `mpiexec`, `srun` or a similar MPI executable to start your
195 | application. This executable may be run directly on the local system or cluster platform
196 | that you're using, or you may need to run it through a job script submitted to a job
197 | scheduler. Your MPI-based application code, which will be linked against the MPI libraries,
198 | will make MPI API calls into these MPI libraries which in turn talk to the MPI daemon
199 | process running on the host system. This daemon process handles the communication between
200 | MPI processes, including talking to the daemons on other nodes to exchange information
201 | between processes running on different machines, as necessary.
202 |
203 | When running code within a Singularity container, we don't use the MPI executables stored
204 | within the container, i.e. we DO NOT run:
205 |
206 | `singularity exec <image> mpirun -np <num_processes> /path/to/my/executable`
207 |
208 | Instead we use the MPI installation on the host system to run Singularity and start an
209 | instance of our executable from within a container for each MPI process. Without Singularity
210 | support in an MPI implementation, this results in starting a separate Singularity container
211 | instance within each process. This can introduce noticeable overhead if a large number of processes
212 | are being run on a host. Where Singularity support is built into an MPI implementation,
213 | this potential issue is addressed and the overhead of running code from within a
214 | container as part of an MPI job is reduced.
215 |
216 | Ultimately, this means that our running MPI code is linking to the MPI libraries from the MPI
217 | install within our container and these are, in turn, communicating with the MPI daemon on the
218 | host system which is part of the host system's MPI installation. In the case of MPICH, these
219 | two installations of MPI may be different but as long as there is
220 | [ABI compatibility](https://wiki.mpich.org/mpich/index.php/ABI_Compatibility_Initiative) between
221 | the version of MPI installed in your container image and the version on the host system, your
222 | job should run successfully.
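223 | 
224 | As a quick sanity check of the container side of this pairing, MPICH provides the `mpichversion` utility, which reports the MPI library version built into the image (a minimal sketch, using the image pulled earlier):
225 | 
226 | ~~~
227 | remote$ singularity exec osu-benchmarks.sif mpichversion
228 | ~~~
229 | {: .language-bash}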
223 |
224 | We can now try running a 2-process MPI job of the point-to-point benchmark `osu_latency` on
225 | ARCHER2.
226 |
227 | > ## Undertake a parallel run of the `osu_latency` benchmark (general example)
228 | >
229 | > Create a job submission script called `submit.slurm` on the /work file system on ARCHER2 to run
230 | > containers based on the container image across two nodes on ARCHER2. The example below uses the
231 | > osu-benchmarks container image that is already available on ARCHER2 but you can easily modify it
232 | > to use your version of the container image if you wish - the results should be the same in both
233 | > cases.
234 | >
235 | > A template based on the example in the
236 | > [ARCHER2 documentation](https://docs.archer2.ac.uk/user-guide/containers/#running-parallel-mpi-jobs-using-singularity-containers) is:
237 | >
238 | > ~~~
239 | > #!/bin/bash
240 | >
241 | > #SBATCH --job-name=singularity_parallel
242 | > #SBATCH --time=0:10:0
243 | > #SBATCH --nodes=2
244 | > #SBATCH --ntasks-per-node=1
245 | > #SBATCH --cpus-per-task=1
246 | >
247 | > #SBATCH --partition=standard
248 | > #SBATCH --qos=short
249 | > #SBATCH --account=[budget code]
250 | >
251 | > # Load the module to make the Cray MPICH ABI available
252 | > module load cray-mpich-abi
253 | >
254 | > export OMP_NUM_THREADS=1
255 | > export SRUN_CPUS_PER_TASK=$SLURM_CPUS_PER_TASK
256 | >
257 | > # Set the LD_LIBRARY_PATH environment variable within the Singularity container
258 | > # to ensure that it uses the correct MPI libraries.
259 | > export SINGULARITYENV_LD_LIBRARY_PATH="/opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib-abi-mpich:/opt/cray/pe/mpich/8.1.23/gtl/lib:/opt/cray/libfabric/1.12.1.2.2.0.0/lib64:/opt/cray/pe/gcc-libs:/opt/cray/pe/gcc-libs:/opt/cray/pe/lib64:/opt/cray/pe/lib64:/opt/cray/xpmem/default/lib64:/usr/lib64/libibverbs:/usr/lib64:/usr/lib64"
260 | >
261 | > # This makes sure HPE Cray Slingshot interconnect libraries are available
262 | > # from inside the container.
263 | > export SINGULARITY_BIND="/opt/cray,/var/spool,/opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib-abi-mpich:/opt/cray/pe/mpich/8.1.23/gtl/lib,/etc/host.conf,/etc/libibverbs.d/mlx5.driver,/etc/libnl/classid,/etc/resolv.conf,/opt/cray/libfabric/1.12.1.2.2.0.0/lib64/libfabric.so.1,/opt/cray/pe/gcc-libs/libatomic.so.1,/opt/cray/pe/gcc-libs/libgcc_s.so.1,/opt/cray/pe/gcc-libs/libgfortran.so.5,/opt/cray/pe/gcc-libs/libquadmath.so.0,/opt/cray/pe/lib64/libpals.so.0,/opt/cray/pe/lib64/libpmi2.so.0,/opt/cray/pe/lib64/libpmi.so.0,/opt/cray/xpmem/default/lib64/libxpmem.so.0,/run/munge/munge.socket.2,/usr/lib64/libibverbs/libmlx5-rdmav34.so,/usr/lib64/libibverbs.so.1,/usr/lib64/libkeyutils.so.1,/usr/lib64/liblnetconfig.so.4,/usr/lib64/liblustreapi.so,/usr/lib64/libmunge.so.2,/usr/lib64/libnl-3.so.200,/usr/lib64/libnl-genl-3.so.200,/usr/lib64/libnl-route-3.so.200,/usr/lib64/librdmacm.so.1,/usr/lib64/libyaml-0.so.2"
264 | >
265 | > # Launch the parallel job.
266 | > srun --hint=nomultithread --distribution=block:block \
267 | > singularity exec ${EPCC_SINGULARITY_DIR}/osu_benchmarks.sif \
268 | > osu_latency
269 | > ~~~
270 | > {: .language-bash}
271 | >
272 | > Finally, submit the job to the batch system with
273 | >
274 | > ~~~
275 | > remote$ sbatch submit.slurm
276 | > ~~~
277 | > {: .language-bash}
278 | >
279 | > > ## Solution
280 | > >
281 | > > As you can see in the job script shown above, we have called `srun` on the host system
282 | > > and passed it the `singularity` executable, whose parameters are the container image
283 | > > file and the name of the benchmark executable we want to run.
284 | > >
285 | > > The following shows an example of the output you should expect to see. You should have latency
286 | > > values reported for message sizes up to 4MB.
287 | > >
288 | > > ~~~
289 | > > # OSU MPI Latency Test v5.6.2
290 | > > # Size Latency (us)
291 | > > 0 0.38
292 | > > 1 0.34
293 | > > ...
294 | > > ~~~
295 | > > {: .output}
296 | > {: .solution}
297 | {: .challenge}
298 |
299 | This has demonstrated that we can successfully run a parallel MPI executable from within a Singularity container.
300 |
301 | > ## Investigate performance of native benchmark compared to containerised version
302 | >
303 | > To get an idea of any difference in performance between the code within your Singularity image and the same
304 | > code built natively on the target HPC platform, try running the `osu_allreduce` benchmarks natively on ARCHER2
305 | > on all cores on at least 16 nodes (if you want to use more than 32 nodes, you will need to use the `standard` QoS
306 | > rather than the `short` QoS). Then try running the same benchmark that you ran via the Singularity container. Do
307 | > you see any performance differences?
310 | >
311 | > Do you see the same when you run on small node counts - particularly a single node?
312 | >
313 | > Note: a native version of the OSU micro-benchmark suite is available on ARCHER2 via `module load osu-benchmarks`.
314 | >
315 | > > ## Discussion
316 | > >
317 | > > Here are some selected results measured on ARCHER2:
318 | > >
319 | > > 1 node:
320 | > > - 4 B
321 | > > + Native: 6.13 us
322 | > > + Container: 5.30 us (16% faster)
323 | > > - 128 KiB
324 | > > + Native: 173.00 us
325 | > > + Container: 230.38 us (25% slower)
326 | > > - 1 MiB
327 | > > + Native: 1291.18 us
328 | > > + Container: 2101.02 us (39% slower)
329 | > >
330 | > > 16 nodes:
331 | > > - 4 B
332 | > > + Native: 17.66 us
333 | > > + Container: 18.15 us (3% slower)
334 | > > - 128 KiB
335 | > > + Native: 237.29 us
336 | > > + Container: 303.92 us (22% slower)
337 | > > - 1 MiB
338 | > > + Native: 1501.25 us
339 | > > + Container: 2359.11 us (36% slower)
340 | > >
341 | > > 32 nodes:
342 | > > - 4 B
343 | > > + Native: 30.72 us
344 | > > + Container: 24.41 us (20% faster)
345 | > > - 128 KiB
346 | > > + Native: 265.36 us
347 | > > + Container: 363.58 us (26% slower)
348 | > > - 1 MiB
349 | > > + Native: 1520.58 us
350 | > > + Container: 2429.24 us (36% slower)
351 | > >
352 | > > For the medium and large messages, using a container produces substantially worse MPI performance for this
353 | > > benchmark on ARCHER2. When the messages are very small, containers match the native performance and can
354 | > > actually be faster.
355 | > >
356 | > > Is this true for other MPI benchmarks that use all the cores on a node or is it specific to Allreduce?
357 | > >
358 | > {: .solution}
359 | {: .challenge}
360 |
361 | ### Summary
362 |
363 | Singularity can be combined with MPI to create portable containers that run software in parallel across multiple
364 | compute nodes. However, there are some limitations, specifically:
365 |
366 | - You must use an MPI library in the container that is binary compatible with the MPI library on the host system -
367 | typically, your container will be based on either MPICH or OpenMPI.
368 | - The host setup to enable MPI typically requires binding a large number of low-level libraries into the running
369 | container. You will usually require help from the HPC system support team to get the correct bind options for
370 | the platform you are using.
371 | - Performance of containers+MPI can be substantially lower than the performance of native applications using MPI
372 | on the system. The effect is dependent on the MPI routines used in your application, message sizes and the number of MPI
373 | processes used.
374 |
375 |
376 |
--------------------------------------------------------------------------------