├── .gitignore
├── 00_docker_installation.md
├── 01_motivation.md
├── 02_technology.md
├── 03_container_lifecycle.md
├── 04_hands_on_sessions.md
├── 05_homework_science_project.md
├── 05_homework_science_project
│   ├── .gitignore
│   ├── create_data.F90
│   ├── example.dat
│   └── plot_data.gp
├── 06_singularity.md
├── 07_summary.md
├── 08_addendum.md
├── README.md
├── container-concept.png
└── possible_solutions
    ├── 04_hands_on_sessions.md
    └── 05_homework_science_project
        ├── Dockerfile
        ├── README.md
        └── run_everything.sh
/.gitignore:
--------------------------------------------------------------------------------
1 | container.jpg
2 | container.png
3 | .DS_Store
4 |
--------------------------------------------------------------------------------
/00_docker_installation.md:
--------------------------------------------------------------------------------
1 | # Docker installation
2 |
3 | This course will have a mixture of lecture-style, hands-on and discussion parts.
4 | Please install Docker on your Linux, MacOS or Windows machine in advance, if you want to follow the hands-on parts yourself.
5 |
6 | While this is not mandatory (as we will also do a demo of the exercises) we would highly advise you to prepare Docker on your machine, especially if you already intend to try out containers for one of your scientific projects.
7 | By installing Docker, you will have two things at the end of the course: (1) a fully working container build environment (that is also able to build Singularity container images), as well as (2) hands-on experience in specifying, building, and running Docker container images in a way that is also applicable to the Singularity world.
8 |
9 | ## Instructions
10 |
11 | For Linux install Docker CE,
12 |
13 | * https://docs.docker.com/engine/install/#server
14 |
15 | (To receive updates to your Docker installation, please set up and install Docker CE via the Docker package repository for your Linux distribution, as described, e.g., for Ubuntu here: https://docs.docker.com/engine/install/ubuntu/#install-using-the-repository)
16 |
17 | For MacOS install Docker desktop,
18 |
19 | * https://docs.docker.com/docker-for-mac/install/#install-and-run-docker-desktop-on-mac
20 |
21 | For Windows also install Docker desktop,
22 |
23 | * https://docs.docker.com/docker-for-windows/install/#install-docker-desktop-on-windows
24 |
25 | (Consider **not** choosing the WSL2 backend during the installation, but going with the default Hyper-V backend initially. While WSL2 is more performant, it is a bit more complex to set up. Luckily, it can be added at any later point.)
26 |
27 | Please note that there won't be time to discuss Docker installation problems at the beginning of the course.
28 | Please also note that you will need administrator privileges on your machine to successfully install Docker.
29 | Contact your computer's administrator if you don't have the appropriate rights on your computer.
30 |
31 | Also, during the hands-on part of the course it is helpful to have a text editor suitable for coding available.
32 | This could be a terminal application such as "vim" or a GUI tool such as "VS Code" (https://code.visualstudio.com/Download).
33 |
34 | ## Questions
35 |
36 | > Why is Docker used for this course, if on the big machines I have only seen Singularity to be available?
37 |
38 | Containers are a very helpful technology that simplifies the deployment aspect associated with software.
39 | The Singularity image build and Singularity container run functionality is only available on Linux systems, though.
40 | Docker container images, on the other hand, work across Linux, MacOS and Windows machines.
41 | Furthermore, they can be converted into Singularity container images very easily.
42 | As there is often (at least in practice!) no way of building Singularity container images directly on the machines where they will be used, in our daily work we almost exclusively use workflows in which a container image is specified in Docker and then converted to a Singularity image.
43 | How to do this (with Docker or Singularity), will also be part of the course.
44 |
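As a minimal preview of what this conversion looks like in practice (the details follow in the Singularity section), pulling a Docker image and turning it into a single Singularity image file can be as simple as:

```
$ docker pull ubuntu:22.04
$ singularity pull docker://ubuntu:22.04   # writes ubuntu_22.04.sif
```
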
--------------------------------------------------------------------------------
/01_motivation.md:
--------------------------------------------------------------------------------
1 | # Motivation behind using containers
2 |
3 | Containers may help with **reproducibility** of your scientific work,
4 |
5 | > **Scholarly research has evolved significantly over the past decade, but the same cannot be said for the methods by which research processes are captured and disseminated.** The primary method for dissemination - the scholarly publication - is largely unchanged since the advent of the scientific journal in the 1660s. This is no longer sufficient to verify, reproduce, and extend scientific results. Despite the increasing recognition of the need to share all aspects of the research process, scholarly publications today are often disconnected from the underlying analysis and, crucially, the computational environment that produced the findings. **For research to be reproducible, researchers must publish and distribute the entire contained analysis, not just its results.** The analysis should be mobile. **Mobility of Compute is defined as the ability to define, create, and maintain a workflow locally while remaining confident that the workflow can be executed elsewhere.** In essence, mobility of compute means being able to contain the entire software stack, from data files up through the library stack, and reliably move it from system to system. Any research that is limited to where it can be deployed is instantly limited in the extent that it can be reproduced. (from [The Turing Way](https://the-turing-way.netlify.app/reproducible-research/renv.html#science))
6 |
7 | And containers may help increase your personal (and your research group's) **scientific productivity**,
8 |
9 | * especially by simplifying workflow aspects around "the lifecycle of a scientific idea" (after Perez 2017, [the architecture of Jupyter](https://www.youtube.com/watch?v=dENc0gwzySc))
10 | * individual exploratory work
11 | * collaborative developments
12 | * production runs (HPC, workstation, cloud, ...)
13 | * publication (reproducibly!)
14 | * specifically, by reducing your individual "time-to-scientific-insight"
15 | * daily perspective, e.g. [Jupyter start-up times](https://nbviewer.jupyter.org/github/ExaESM-WP4/Jupyter-HPC-performance/blob/fa725c1f3656f81c78254946f97a9c1764908e53/analysis.ipynb) on distributed storage machines
16 | * project perspective, e.g. becoming independent of particular machines and their software
17 | * ...
18 |
19 | ## Demo
20 |
21 | Suppose you want to do a data analysis on a machine where you do not already have a suitable software environment installed.
22 | With containerized environments, it's extremely simple to (1) get started on your local machine, (2) scale a task out to another (e.g. bigger) machine, or (3) share an environment with a colleague who wants to build upon an analysis you have already started.
23 |
24 | We cover sharing of containerized software environments later in more detail.
25 | For a start, here we demonstrate the "installation" (or deployment) of a fully identical IPython software environment on a Linux, MacOS and Windows machine.
26 | IPython is provided with, e.g., the [Jupyter Docker stacks](https://jupyter-docker-stacks.readthedocs.io/en/latest/index.html), which are provided via [Dockerhub](https://hub.docker.com/r/jupyter/base-notebook), a very popular container sharing platform.
27 | Only two commands are necessary to pull and start up the software environment.
28 |
29 | Your Linux desktop with Docker,
30 |
31 | ```
32 | $ uname -a
33 | Linux morpheus 5.8.0-50-generic #56~20.04.1-Ubuntu SMP Mon Apr 12 21:46:35 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
34 | $ docker pull jupyter/base-notebook
35 | $ docker run -it jupyter/base-notebook ipython
36 | Python 3.9.2 | packaged by conda-forge | (default, Feb 21 2021, 05:02:46)
37 | Type 'copyright', 'credits' or 'license' for more information
38 | IPython 7.24.1 -- An enhanced Interactive Python. Type '?' for help.
39 |
40 | In [1]:
41 | ```
42 |
43 | Your MacOS desktop with Docker,
44 |
45 | ```
46 | $ uname -a
47 | Darwin od-nb008mc 18.7.0 Darwin Kernel Version 18.7.0: Mon Mar 8 22:11:48 PST 2021; root:xnu-4903.278.65~1/RELEASE_X86_64 x86_64
48 | $ docker pull jupyter/base-notebook
49 | $ docker run -it jupyter/base-notebook ipython
50 | Python 3.9.2 | packaged by conda-forge | (default, Feb 21 2021, 05:02:46)
51 | Type 'copyright', 'credits' or 'license' for more information
52 | IPython 7.24.1 -- An enhanced Interactive Python. Type '?' for help.
53 |
54 | In [1]:
55 | ```
56 |
57 | Your Windows desktop with Docker,
58 |
59 | ```
60 | Windows PowerShell
61 | Copyright (C) Microsoft Corporation. All rights reserved.
62 | PS V:\> docker pull jupyter/base-notebook
63 | PS V:\> docker run -it jupyter/base-notebook ipython
64 | Python 3.9.2 | packaged by conda-forge | (default, Feb 21 2021, 05:02:46)
65 | Type 'copyright', 'credits' or 'license' for more information
66 | IPython 7.24.1 -- An enhanced Interactive Python. Type '?' for help.
67 |
68 | In [1]:
69 | ```
70 |
71 | # Goal of this workshop
72 |
73 | Accelerate your "scientific productivity" (and increase your scientific work's reproducibility) by covering the basics of container technology, as well as a few considerations around scientific workflows from a container perspective.
74 |
--------------------------------------------------------------------------------
/02_technology.md:
--------------------------------------------------------------------------------
1 | # What is a container?
2 |
3 | ## High-level conceptual perspective
4 |
5 | * isolated runtime environment for "applications" including their dependencies but not, unlike VMs, the kernel
6 | * simplifies "application" development and deployment, and strongly improves "application" portability
7 |
8 | ## Brief historical perspective
9 |
10 | Containers are not a new technology.
11 | They originated as a Unix system developer tool as early as the late 1970s and early 1980s.
12 |
13 | * chroot (1979/82) <— birth
14 | * FreeBSD jail (2000) <— usability
15 | * LXC (2008) <— popularity
16 | * **Docker (2013) <— community**
17 | * **Singularity (2015) <— science and containerized high-performance computing! also: shared machines**
18 | * Docker rootless (2020) <- Docker for shared machines; but: e.g. GPUs possible?
19 | * Apptainer (2021) joins the Linux Foundation <- align containerized high-performance computing and cloud technology developments
20 |
21 | (see e.g. [here](https://en.wikipedia.org/wiki/OS-level_virtualization) and [there](https://www.section.io/engineering-education/history-of-container-technology/) and [many more](https://www.google.com/search?q=history+of+container+technology))
22 |
23 | ## Technical perspective
24 |
25 | From a user perspective, the core concept behind "what is a container?" is the combination of a "shared Linux kernel" and a "contained software environment".
26 |
27 | * user processes
28 | * Linux kernel w/ file system
29 | * hardware layer
30 |
31 | ![Container concept](container-concept.png)
32 |
33 | **References**
34 |
35 | * https://www.linuxfordevices.com/tutorials/linux/linux-kernel
36 | * https://www.linux.com/training-tutorials/linux-filesystem-explained/
37 | * https://www.hpcwire.com/2017/11/01/sc17-singularity-preps-version-3-0-nears-1m-containers-served-daily/
38 |
39 | ## Hands-on part (5 minutes)
40 |
41 | To illustrate this, let's pull the `alpine:latest` and `ubuntu:22.04` Linux base images.
42 |
43 | On a Linux (or MacOS) machine you can directly have a look at your host system's kernel version,
44 |
45 | ```
46 | $ uname -a
47 | Linux morpheus 5.8.0-50-generic #56~20.04.1-Ubuntu SMP Mon Apr 12 21:46:35 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
48 | ```
49 |
50 | the host system's file system structure,
51 |
52 | ```
53 | $ ls -l /
54 | lrwxrwxrwx 1 root root 7 Okt 30 2020 bin -> usr/bin
55 | drwxr-xr-x 4 root root 4096 Jun 4 06:30 boot
56 | drwxrwxr-x 2 root root 4096 Okt 30 2020 cdrom
57 | drwxr-xr-x 20 root root 4600 Mai 12 09:57 dev
58 | drwxr-xr-x 133 root root 12288 Jun 4 06:30 etc
59 | drwxr-xr-x 3 root root 4096 Okt 30 2020 home
60 | lrwxrwxrwx 1 root root 7 Okt 30 2020 lib -> usr/lib
61 | lrwxrwxrwx 1 root root 9 Okt 30 2020 lib32 -> usr/lib32
62 | lrwxrwxrwx 1 root root 9 Okt 30 2020 lib64 -> usr/lib64
63 | lrwxrwxrwx 1 root root 10 Okt 30 2020 libx32 -> usr/libx32
64 | drwx------ 2 root root 16384 Okt 30 2020 lost+found
65 | drwxr-xr-x 2 root root 4096 Jul 31 2020 media
66 | drwxr-xr-x 2 root root 4096 Jul 31 2020 mnt
67 | drwxr-xr-x 3 root root 4096 Nov 3 2020 opt
68 | dr-xr-xr-x 327 root root 0 Apr 20 22:31 proc
69 | drwx------ 5 root root 4096 Mai 16 14:46 root
70 | drwxr-xr-x 42 root root 1320 Jun 6 17:27 run
71 | lrwxrwxrwx 1 root root 8 Okt 30 2020 sbin -> usr/sbin
72 | dr-xr-xr-x 13 root root 0 Apr 20 22:31 sys
73 | drwxrwxrwt 18 root root 4096 Jun 6 17:27 tmp
74 | drwxr-xr-x 14 root root 4096 Jul 31 2020 usr
75 | drwxr-xr-x 14 root root 4096 Jul 31 2020 var
76 | ```
77 | ```
78 |
79 | and, e.g., the host system's user directory contents,
78 |
79 | ```
80 | $ ls /home
81 | khoeflich
82 | ```
83 |
84 | Now, download the Alpine base image from Dockerhub,
85 |
86 | ```
87 | $ docker pull alpine:latest
88 | ```
89 |
90 | start an interactive Bash session in the container,
91 |
92 | ```
93 | $ docker run -it --rm alpine:latest /bin/sh
94 | ```
95 |
96 | and familiarize yourself with the visible software environment,
97 |
98 | ```
99 | root@131fa759eb1b:/# ls -l /
100 | root@131fa759eb1b:/# ls /home
101 | root@131fa759eb1b:/# cat /etc/os-release
102 | root@131fa759eb1b:/# uname -a
103 | ```
104 |
105 | Do this also for the `ubuntu:22.04` image and your host system! What is different, what is the same?
106 |
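For reference, the analogous commands for the `ubuntu:22.04` image mirror the Alpine steps above (output omitted):

```
$ docker pull ubuntu:22.04
$ docker run -it --rm ubuntu:22.04 /bin/bash
# ls -l /
# ls /home
# cat /etc/os-release
# uname -a
```
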
107 | Please note that Docker Desktop on MacOS and Windows ships with a Linux virtual machine that runs in the background and provides the Docker functionality.
108 | What do you expect to find for the respective `uname` commands?
109 |
110 | Especially on Windows, you can't natively run a Unix-like command such as `uname` in your host system's PowerShell.
111 | Can you still demonstrate the core concept of "shared Linux kernel" and "contained software environment" on a Windows machine? If so, how?
112 |
--------------------------------------------------------------------------------
/03_container_lifecycle.md:
--------------------------------------------------------------------------------
1 | # The container lifecycle
2 |
3 | The typical container lifecycle (from an industry containerized software delivery perspective) consists of four parts which may be grouped into a part dealing with the *container image* and a part dealing with the *container* itself:
4 |
5 | - Image:
6 | - [Specify](https://docs.docker.com/build/building/best-practices/)
7 | - [Build](https://docs.docker.com/reference/cli/docker/buildx/build/)
8 | - [Deploy](https://docs.docker.com/reference/cli/docker/image/pull/)
9 | - Container:
10 | - [Run](https://docs.docker.com/reference/cli/docker/container/run/)
11 |
12 | In a scientific-computing setting, we add two parts:
13 |
14 | - Archive
15 | - Reproduce
16 |
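In Docker terms, one pass through this lifecycle could look roughly like the following sketch (image and file names are placeholders; the corresponding Singularity commands are covered later):

```
# specify + build: write a Dockerfile, then build an image from it
$ docker build -t my-image -f Dockerfile .
# deploy / archive: push to and pull from a registry, or save/load a tar archive
$ docker save my-image --output my-image.tar
$ docker load --input my-image.tar
# run: start a container from the image
$ docker run -it --rm my-image
```
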
--------------------------------------------------------------------------------
/04_hands_on_sessions.md:
--------------------------------------------------------------------------------
1 | # Basic Hands-on Example
2 |
3 | Throughout this part, we'll link the [docker reference docs](https://docs.docker.com/reference/) of which the [docs for the command line interface (CLI)](https://docs.docker.com/engine/reference/commandline/cli/) and the [docs for Dockerfiles](https://docs.docker.com/engine/reference/builder/) are the most important.
4 |
5 | ## Background
6 |
7 | Suppose we have a workflow step that requires pictures to be downloaded from the internet and converted into a different format.
8 | We might want to script the task, and it might be important to have particular versions of the involved software installed, to ensure that we do not need to adapt our script if picture libraries get extended or if software tool developers decide to change, e.g., their command-line interfaces.
9 | We may also want to be able to execute this workflow on different machines, all of which, fortuitously, have Docker or Singularity / Apptainer installed.
10 |
11 | ## A: Modification of an existing container
12 |
13 | ### Task
14 |
15 | Use an existing base image ([`ubuntu:22.04`](https://hub.docker.com/_/ubuntu)), start a container with an interactive shell, and install a software necessary to download an image from the internet (`curl`) and a software necessary to convert the image to a different format (`imagemagick`).
16 |
17 | ### Hints
18 |
19 | Running containers with an interactive shell (here using `bash`) can be done with: [`docker run -it ubuntu:22.04 bash`](https://docs.docker.com/engine/reference/commandline/run/)
20 |
21 | Installation can be done with `apt update` followed by `apt install -y ...`.
22 |
23 | Let's use [this graphics file](https://upload.wikimedia.org/wikipedia/commons/d/df/Container_01_KMJ.jpg):
24 |
25 | To download the image, run
26 | ```shell
27 | curl https://upload.wikimedia.org/wikipedia/commons/d/df/Container_01_KMJ.jpg -o container.jpg
28 | ```
29 |
30 | To convert the graphics file to, e.g., PNG, run:
31 | ```shell
32 | convert container.jpg container.png
33 | ```
34 | (The `convert` command is part of the `imagemagick` suite.)
35 |
36 | To check if the file format really changed, compare the output of `identify container.jpg` with that of `identify container.png`.
37 | (The `identify` command is part of the `imagemagick` suite.)
38 |
39 | ### Discussion
40 |
41 | - Are the graphic files still there when you exit the container and start it again with `docker run`?
42 | - How do you properly restart a stopped container?
43 | - How to get the graphics file out of the container? — We'll skip this for now and come back to this question later.
44 |
45 | ## B: Specification and Building
46 |
47 | ### Task
48 |
49 | Write a [`Dockerfile`](https://docs.docker.com/engine/reference/builder/) which specifies the full environment (e.g. the base image and all packages we need), and [build the container image](https://docs.docker.com/engine/reference/commandline/build/).
50 |
51 | Then, to test whether the container is able to do what it's meant for, start a shell in the container (see above) and download and convert the graphics file again.
52 |
53 | ### Hints
54 |
55 | In the `Dockerfile`, you can [specify the base image using](https://docs.docker.com/engine/reference/builder/#from)
56 | ```Dockerfile
57 | FROM
58 | ```
59 | and you can specify [commands that are running at _build time_](https://docs.docker.com/engine/reference/builder/#run) using
60 | ```Dockerfile
61 | RUN
62 | ```
63 |
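Put together, a minimal `Dockerfile` skeleton for this task might look like the sketch below; the package list is left for you to fill in:

```Dockerfile
FROM ubuntu:22.04
RUN apt update && apt install -y <the packages you need>
```
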
64 | ### Discussion
65 |
66 | - Note that this will store the image in a local _registry_. We'll talk about this later.
67 |
68 | ## C: Binding host-system storage
69 |
70 | ### Task
71 |
72 | Make the file-system of your computer available to the container.
73 |
74 | This will show how to get data in and out of the container. We've already seen that from within the container, we can access the network / the internet. For many _industry_ applications, using the network to interact with the environment is all that is needed. In a scientific data-analysis context, however, file-system access is often essential.
75 |
76 | ### Hints
77 |
78 | Adding the following flags [`--volume $PWD:/work --workdir /work`](https://docs.docker.com/engine/reference/commandline/run/#mount-volume--v---read-only) to the [`docker run`](https://docs.docker.com/engine/reference/commandline/run/) call will create a directory `/work/` within the container showing the contents of the current directory (`$PWD`) and makes sure the working directory within the container is `/work`.
79 |
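For example, with the image built in part B (the image name `my-convert-image` is just a placeholder), the full call could look like:

```shell
docker run -it --rm --volume $PWD:/work --workdir /work my-convert-image bash
```
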
80 | ### Discussion
81 |
82 | There are no strong conventions about how to name volumes / directories bound into containers. It's wise, however, to avoid any name used in the [(Linux) Filesystem Hierarchy Standard](https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard).
83 |
84 | ## D.1: Deploy (via registry)
85 |
86 | ### Task
87 |
88 | Use the image built in part C above. Tag the image. Push the image to a registry. Pull the image from a registry. Run the image.
89 |
90 | _(Note that for pushing, you'd need an account at a registry. So the pushing part will be done by one of the presenters. But you should be able to follow along pulling and running the image.)_
91 |
92 | ### Hints
93 |
94 | Tagging images can be done with [`docker tag ...`](https://docs.docker.com/engine/reference/commandline/tag/).
95 |
96 | Pushing to a registry is as easy as running [`docker push`](https://docs.docker.com/engine/reference/commandline/push/).
97 |
98 | You can [`docker pull`](https://docs.docker.com/engine/reference/commandline/pull/) an image and then run it, just as if you had tagged it locally, with [`docker run ...`](https://docs.docker.com/engine/reference/commandline/run/).
99 |
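A possible sequence, with `<account>/<repository>:<tag>` as a placeholder for a registry location you have push access to:

```shell
docker tag my-convert-image <account>/<repository>:<tag>
docker push <account>/<repository>:<tag>
docker pull <account>/<repository>:<tag>
docker run -it --rm <account>/<repository>:<tag>
```
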
100 | ## D.2: Deploy (via files)
101 |
102 | ### Task
103 |
104 | Export the container image to a file and import it again (possibly on a separate machine).
105 |
106 | ### Hints
107 |
108 | You can use [`docker save --output ...`](https://docs.docker.com/engine/reference/commandline/save/) to save a container image to a tar archive. You can use [`docker load --input ...`](https://docs.docker.com/engine/reference/commandline/load/) to load the image again.
109 |
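Concretely, with placeholder image and file names, this could look like:

```shell
docker save my-convert-image --output my-convert-image.tar
docker load --input my-convert-image.tar
```
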
110 | ### Discussion
111 |
112 | - While container registries are best used for keeping and providing container images during active use, file-based deployments can be used for archiving or for sharing containers with machines that do not have access to a registry.
113 |
114 | - There is also a way of directly importing file-based docker images into Singularity, which is relevant for multi-user systems.
115 |
--------------------------------------------------------------------------------
/05_homework_science_project.md:
--------------------------------------------------------------------------------
1 | # Homework: Minimal science project
2 |
3 | _All pre-existing materials you might need are in: [`05_homework_science_project/`](05_homework_science_project/)_
4 |
5 | Consider a scientific project that consists of two steps: A _"simulation"_ which produces data, and a _"data analysis"_ which tries to make sense of the data.
6 |
7 | ## Simulation — Running a compiled software which produces data
8 |
9 | This could, e.g., be a physical or biological simulation. Here, we'll use a [small program written in Fortran](05_homework_science_project/create_data.F90) which produces a data set with a sine-shaped signal and some added noise.
10 |
11 | The data look like this:
12 | ```
13 | 15.0000000 0.644502997
14 | 30.0000000 1.54576778
15 | 45.0000000 2.07850647
16 | ... ...
17 | 330.000000 -1.68714190
18 | 345.000000 -0.620180488
19 | 360.000000 9.53099951E-02
20 | ```
21 |
22 | ## Data analysis - Visualize the simulation's data
23 |
24 | This could be a script or a set of scripts which produce figures for a publication, or reduced data like the mean and standard deviation of the input data. Here, we'll use [`gnuplot`](https://en.wikipedia.org/wiki/Gnuplot) to run [a script](05_homework_science_project/plot_data.gp) which plots the sine-shaped data produced in the simulation step above.
25 |
26 | The resulting plot looks like this:
27 | ```
28 |
29 | 4 +----------------------------------------------------------------------+
30 | | + + + + + + + |
31 | 3 |-+ A A "data.dat" A +-|
32 | | A |
33 | | A A A |
34 | 2 |-+ A +-|
35 | | A |
36 | 1 |-+ A +-|
37 | | A A |
38 | | |
39 | 0 |-+ A A +-|
40 | | A |
41 | -1 |-+ A +-|
42 | | A |
43 | | A |
44 | -2 |-+ A +-|
45 | | A |
46 | -3 |-+ A A A +-|
47 | | A A |
48 | | + + + + + + + |
49 | -4 +----------------------------------------------------------------------+
50 | 0 50 100 150 200 250 300 350 400
51 |
52 | ```
53 |
54 | ## Hands-on details
55 |
56 | - Use the latest stable Ubuntu LTS container image.
57 |
58 | Hint: Compare the [Docker image tags](https://hub.docker.com/_/ubuntu) with the [Ubuntu releases](https://ubuntu.com/about/release-cycle) and choose the newest Ubuntu Long Term Support (LTS) release.
59 |
60 | - First set up the container interactively:
61 |
62 | - Make sure you have the Fortran compiler gfortran (package name is `gfortran`) and Gnuplot (package name `gnuplot-nox`) installed. Installation can be done by running `apt update` and `apt install -y ...` in the container.
63 |
64 | - Compile the software using: `gfortran create_data.F90 -o create_data`. Then run the software with: `./create_data`. To redirect the output into a data file called `data.dat`, use: `./create_data > data.dat`.
65 |
66 | - Plot the data with: `gnuplot -c plot_data.gp "data.dat"`.
67 |
68 | - Now write a `Dockerfile` to set up the container up to the point where the data can be created.
69 |
70 | Hint: You need to copy the files `create_data.F90` and `plot_data.gp` into the container during the build process (see the sketch at the end of this section).
71 |
72 | - Whenever the container is run, the data shall be written to the host file system and the plot shall be generated.
73 |
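The copying mentioned in the hint above can be done with a `COPY` instruction; a minimal sketch, assuming `/app` as the target directory inside the image, is:

```Dockerfile
COPY create_data.F90 plot_data.gp /app/
```

For writing the results to the host file system, the bind-mount flags from hands-on part C (`--volume $PWD:/work --workdir /work`) can be reused when running the container.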
--------------------------------------------------------------------------------
/05_homework_science_project/.gitignore:
--------------------------------------------------------------------------------
1 | create_data.x
2 |
--------------------------------------------------------------------------------
/05_homework_science_project/create_data.F90:
--------------------------------------------------------------------------------
1 | program create_data
2 | implicit none
3 | integer :: istep
4 | real :: alpha, signal, r, noise
5 |
6 | do istep = 1, 24
7 | ! sinusoidal signal between 0 and 360 deg
8 | alpha = istep * 15.0
9 | signal = 3.0 * sind(alpha)
10 |
11 | ! random noise (symmetric random number in +/- 0.2)
12 | call random_number(r)
13 | noise = 0.2 * (2.0 * r - 1.0)
14 |
15 | ! write all to stdout
16 | write(*,*) alpha, signal + noise
17 | end do
18 | end program create_data
--------------------------------------------------------------------------------
/05_homework_science_project/example.dat:
--------------------------------------------------------------------------------
1 | 15.0000000 0.606624901
2 | 30.0000000 1.31915927
3 | 45.0000000 2.23794079
4 | 60.0000000 2.57067752
5 | 75.0000000 3.06147194
6 | 90.0000000 3.17521191
7 | 105.000000 2.71621990
8 | 120.000000 2.55521202
9 | 135.000000 2.00537181
10 | 150.000000 1.66122675
11 | 165.000000 0.873169243
12 | 180.000000 -0.128877386
13 | 195.000000 -0.788596749
14 | 210.000000 -1.60218716
15 | 225.000000 -2.17572522
16 | 240.000000 -2.78242493
17 | 255.000000 -3.08727360
18 | 270.000000 -3.11648655
19 | 285.000000 -2.71767974
20 | 300.000000 -2.78709698
21 | 315.000000 -2.20977879
22 | 330.000000 -1.32908857
23 | 345.000000 -0.589825869
24 | 360.000000 -3.76013517E-02
25 |
--------------------------------------------------------------------------------
/05_homework_science_project/plot_data.gp:
--------------------------------------------------------------------------------
1 | # use ascii output
2 | set terminal dumb
3 |
4 | # and plot
5 | filename=ARG1
6 | plot filename
--------------------------------------------------------------------------------
/06_singularity.md:
--------------------------------------------------------------------------------
1 | # Singularity
2 |
3 | Singularity is a container platform that provides,
4 |
5 | * mobility of compute via a single-file SIF container image format
6 |   * very natural to migrate, execute, share and archive!
7 | * a permission model that is suitable for shared machines (such as HPC, group's workstation, ...)
8 |   * users execute containers as themselves and, by default, neither need nor get root privileges on the host
9 | * optimization for "integration" with the host system, rather than "isolation" from the host system
10 |   * it's very easy to use the host's network and file system (plus available GPUs! which is also possible with Docker, though...)
11 |
12 | ## Docker/Singularity CLI
13 |
14 | During the hands-on part you have already used Docker commands to pull, specify, build, run and archive a container (image).
15 | Basically, each of these has an equivalent command in Singularity on the machines you might be using.
16 |
17 | Pull container images from a remote registry,
18 |
19 | ```
20 | $ docker pull ubuntu:22.04
21 | $ singularity pull docker://ubuntu:22.04
22 | ```
23 |
24 | Build a container image after specification (on a machine you own),
25 |
26 | ```
27 | $ docker build -t local/my-container-image -f Dockerfile .
28 | $ singularity build --fakeroot my-container-image.sif my-container-image.txt
29 | ```
30 |
31 | (For [details on Singularity definition files](https://sylabs.io/guides/3.7/user-guide/definition_files.html) and [differences to Dockerfiles](https://sylabs.io/guides/3.7/user-guide/singularity_and_docker.html#singularity-definition-file-vs-dockerfile) see the official Singularity docs.)
32 |
33 | Run a containerized software,
34 |
35 | ```
36 | $ docker run -it --rm local/my-container-image
37 | $ singularity run my-container-image.sif
38 | ```
39 |
40 | Archive a container image,
41 |
42 | ```
43 | $ docker save local/my-container-image --output my-container-image.tar
44 | ```
45 |
46 | (With Singularity's container image format the archiving aspect is solved naturally.)
47 |
48 | ## Singularity build: Stumbling blocks
49 |
50 | _Note that there is a lot of development on Singularity (and their docs!) going on. Expect that the following statements will be outdated very soon and that working with Singularity on MacOS and Windows machines will be just as easy as with Docker Desktop._
51 |
52 | * installing Singularity requires a Linux machine and involves compiling the Singularity code base
53 | * debugging, building and executing Singularity containers is only natively possible on Linux architecture
54 | * building Singularity containers requires root privileges and is therefore not possible on shared machines
55 |
56 | ```
57 | $ singularity build my-container-image.sif my-container-image.txt
58 | FATAL: You must be the root user, however you can use --remote or --fakeroot to build from a Singularity recipe file
59 | $ singularity build --fakeroot my-container-image.sif my-container-image.txt
60 | FATAL: could not use fakeroot: no mapping entry found in /etc/subuid for username
61 | $ singularity build --remote my-container-image.sif my-container-image.txt
62 | FATAL: Unable to submit build job: no authentication token, log in with `singularity remote login`
63 | ```
64 |
65 | You could get yourself access to the [Sylabs.io remote builder](https://cloud.sylabs.io/).
66 | (While the existence of such a service is quite nice, debugging and building remotely might quickly become rather tedious. Also, the storage for your container images is currently limited to a quota of about 11 GB. A more severe issue might be how to add further local files to the container image during a remote build? We have not tested this, though.)
67 |
68 | As Singularity comes with several ways to convert Docker images into the Singularity container image format, one can, however, work around the "Singularity build" problem entirely by relying on Docker alone (and on the Docker community's knowledge).
69 |
70 | ## Docker build workflow for Singularity containers
71 |
72 | The "advisable" scientific container lifecycle: `docker build` to `singularity run`?
73 | (Whether this is really the "advisable" way depends on your specific use case and is open for discussion, though. If you need both Docker and Singularity container types for your project, it is currently the simplest way. Converting from Singularity to Docker images is possible, but not very convenient.)
74 |
75 | ### Specify
76 |
77 | Let's specify a container image that contains a Bash (!) script that prints a "hello world" message.
78 | Bash is already installed in the Ubuntu base image, so let's use Alpine here, which forces us to install the software ourselves.
79 |
80 | ```
81 | $ cat Dockerfile
82 | FROM alpine:latest
83 | RUN apk add bash
84 | RUN echo '#!/bin/bash' > /hello.sh \
85 | && echo 'echo "Hello from a container built on MacOS!"' >> /hello.sh \
86 | && chmod +x /hello.sh
87 | ```
88 |
89 | ### Build
90 |
91 | ```
92 | $ docker build -t local/hello-from-macos .
93 | ```
94 |
95 | Have a look at your local registry,
96 |
97 | ```
98 | $ docker images
99 | REPOSITORY TAG IMAGE ID CREATED SIZE
100 | local/hello-from-macos latest 2ec03757f43b 5 seconds ago 9.74MB
101 | ```
102 |
103 | Make a test run,
104 |
105 | ```
106 | $ docker run --rm local/hello-from-macos /hello.sh
107 | Hello from a container built on MacOS!
108 | ```
109 |
110 | ### Deploy... and run!
111 |
112 | You don't actually need to have Singularity installed on your system to build Singularity containers.
113 | With Docker installed, you can:
114 |
115 | * pull a dockerized Singularity (thanks Docker and Singularity community!)
116 | * do the Singularity build directly from your local Docker registry
117 | * transfer the Singularity image file to the target system
118 |
119 | ```
120 | # local machine
121 | $ docker pull kathoef/docker2singularity:latest
122 | $ docker run --rm -v /var/run/docker.sock:/var/run/docker.sock -v ${PWD}:/output \
123 | kathoef/docker2singularity singularity build hello-from-macos.sif docker-daemon://local/hello-from-macos:latest
124 | $ ls -lh
125 | -rwxr-xr-x 1 khoeflich GEOMAR 5.3M Jun 9 14:08 hello-from-macos.sif
126 | $ scp hello-from-macos.sif ...
127 | ```
128 |
129 | ```
130 | # target machine
131 | $ module load singularity/...
132 | $ singularity run hello-from-macos.sif /hello.sh
133 | Hello from a container built on MacOS!
134 | ```
135 |
136 | ### Alternative conversion workflow
137 |
138 | Note, that you could also
139 |
140 | * export your container image as a Docker tar archive
141 | * transfer the tar archive to a machine with Singularity
142 | * do a Singularity build from the tar archive on the target machine
143 |
144 | ```
145 | # local machine
146 | $ docker save local/hello-from-macos --output hello-from-macos.tar
147 | $ scp hello-from-macos.tar ...
148 | ```
149 |
150 | ```
151 | # target machine
152 | $ singularity build hello-from-macos.sif docker-archive://hello-from-macos.tar
153 | INFO: Starting build...
154 | Getting image source signatures
155 | Copying blob b2d5eeeaba3a done
156 | Copying blob c33f491601cb done
157 | Copying blob 4b7cacb5f751 done
158 | Copying config b7469f518f done
159 | Writing manifest to image destination
160 | Storing signatures
161 | 2021/06/09 22:18:32 info unpack layer: sha256:d12dd637fd61a233bdb43ff256513c0704ceb2d4d1d8e40d75c8b4a0128dc976
162 | 2021/06/09 22:18:32 info unpack layer: sha256:d1b09556f2eedc3a9044cd5788a9efdf602382391249dc576e5e30db3cac5d7e
163 | 2021/06/09 22:18:32 info unpack layer: sha256:b4d2b1ee81c66d77659d89d01ca64bcba9345e1d82bfd4f52b74db8f7e6c4a93
164 | INFO: Creating SIF file...
165 | INFO: Build complete: hello-from-macos.sif
166 | ```
167 |
168 | Doing the Singularity build on your local machine has the advantage of transferring a much smaller file, though.
169 | (Because the Docker layer structure is collapsed during conversion and the SIF file itself is compressed.)
170 |
171 | ```
172 | $ ls -lh
173 | -rwxr-xr-x 1 khoeflich GEOMAR 5.3M Jun 9 14:08 hello-from-macos.sif
174 | -rw------- 1 khoeflich GEOMAR 9.6M Jun 9 14:17 hello-from-macos.tar
175 | ```
176 |
177 | ## Docker/Singularity compatibility
178 |
179 | By default, Docker containers run "isolated" and "writable", while Singularity containers run "integrated" and "read-only".
180 | If you want a Docker image to be compatible with Singularity runtime assumptions, consider the following aspects for your `Dockerfile`:
181 |
182 | * do not install any libraries (other than what is installed via e.g. `apt install ...`) and/or scripts
183 |   * in typical Linux file system locations like e.g. `/opt` (i.e. rather use `/my-software` or `/my-script.sh`)
184 |   * in the container environment's home folders, i.e. `$HOME` or `/root`
185 |   * (see the [(Linux) Filesystem Hierarchy Standard](https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard) for a list of paths that should be avoided)
186 | * make use of Dockerfile instructions, e.g. `ENV`, to specify your software locations (do not use e.g. `.bashrc`)
187 | * do not rely on having runtime write permissions to file system locations other than `$HOME` or `/tmp` (plus locations you manually bind mount)
188 |   * to enable yourself to use (i.e. "mount") also software from the host system
189 | * do not use Alpine as a base image for your projects, because of incompatibilities between Alpine's `libc` and typical HPC host `libc`s
190 | * use CentOS/Ubuntu/Debian base images that are neither too new nor too old (see [here](https://github.com/ExaESM-WP4/Batch-scheduler-Singularity-bindings) for an HPC use case where this problem came up for ourselves)
191 |
192 | (This is a filtered list from [here](https://github.com/singularityhub/docker2singularity#tips-for-making-docker-images-compatible-with-singularity) with aspects added from a few "lessons learned" during our own use of containerized Jupyter and Dask jobqueue software environments on HPC systems.)
193 |
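As an illustration of the `ENV` and file-system-location recommendations above, a compatibility-friendly Dockerfile fragment might look like the following sketch (paths and names are placeholders):

```
FROM ubuntu:22.04
# place your own software outside of standard system and home directories
COPY my-software/ /my-software/
# make the location known via ENV instead of shell startup files like .bashrc
ENV PATH="/my-software/bin:${PATH}"
```
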
--------------------------------------------------------------------------------
/07_summary.md:
--------------------------------------------------------------------------------
1 | # Summary
2 |
3 | ## Recap the container lifecycle
4 |
5 | - Specify: `Dockerfile`
6 | - Build: `docker build`
7 | - Deploy: `docker push/save` and `docker pull/load`
8 | - Run: `docker run`
9 | - Archive: `docker push/save`
10 | - Reproduce: `docker build`, `docker pull/load`
11 |
12 | ## Further key points
13 |
14 | - What is a container?
15 | - What's a container image?
16 | - The (scientific) container lifecycle
17 | - How to debug containers?
18 | - Singularity: Containers for shared machines
19 | - Isolation from / integration into host system
20 | - Filesystem and networking
21 | - Deploying containers
22 | - via a registry
23 | - via tar archives
24 | - converting Docker images to Singularity files
25 |
26 | ## What's next?
27 |
28 | - Containers
29 | - and GPUs
30 | - and MPI
31 | - and host-system libraries (like batch schedulers)
32 | - Multi-container setups and container orchestration
33 | - docker compose and singularity compose
34 | - manual solutions
35 | - kubernetes
36 | - what level of granularity is best for which purpose?
37 | - Continuous Integration (CI)
38 | - Containers as test environments for scientific software
39 | - CI for building container images
40 | - Container based workflows
41 | - Maintain group-wide containers as single source of truth for computing environments?
42 | - Share (changing) containers with collaborators?
43 | - Container best practices (for scientific computing)
44 | - No data in containers
45 | - Include own software in container or not?
46 | - Where to archive containers for long / short term purposes?
47 | - Advanced building
48 | - Docker-image layers (number and size) and cleanup
49 | - Multi-stage builds
50 | - Controlling build context
51 | - Target different architectures
52 | - From Docker to Singularity
53 | - Many pre-existing Docker images were built with Docker, but not with Singularity, in mind.
54 | - There are a few considerations that make it easier to use Docker images with Singularity.
--------------------------------------------------------------------------------
/08_addendum.md:
--------------------------------------------------------------------------------
1 | # Miscellaneous materials and notes
2 |
3 | - Reference container for Jupyter & Python based envs: https://github.com/martinclaus/py-da-stack
4 | - ...
5 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Containers for Scientific Computing Workshop 2021
2 |
3 | Structure of this course:
4 |
5 | - Installation instructions: [00_docker_installation.md](00_docker_installation.md)
6 | - Motivation: [01_motivation.md](01_motivation.md)
7 | - Overview of container technology: [02_technology.md](02_technology.md)
8 | - The container lifecycle: [03_container_lifecycle.md](03_container_lifecycle.md)
9 | - Hands-on session: [04_hands_on_sessions.md](04_hands_on_sessions.md)
10 | - Homework project: [05_homework_science_project.md](05_homework_science_project.md)
11 | - Singularity: [06_singularity.md](06_singularity.md)
12 | - Wrap-up and summary: [07_summary.md](07_summary.md)
13 |
--------------------------------------------------------------------------------
/container-concept.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExaESM-WP4/Containers-for-Scientific-Computing/7e69399b618dd307fc978ac2035be7bac9d8570c/container-concept.png
--------------------------------------------------------------------------------
/possible_solutions/04_hands_on_sessions.md:
--------------------------------------------------------------------------------
1 | # Basic Hands-on Example
2 |
3 | ## A: Modification of an existing container
4 |
5 | ### Possible Solution
6 |
7 | ```shell
8 | $ docker run -it ubuntu:22.04 bash
9 | # apt update
10 | # apt install curl imagemagick
11 | # curl https://upload.wikimedia.org/wikipedia/commons/d/df/Container_01_KMJ.jpg -o container.jpg
12 | # convert container.jpg container.png
13 | # identify container.jpg
14 | # identify container.png
15 | # exit
16 | ```
17 | Repeating the `docker run` creates a new container from the same image. This new container is not affected by the changes you made to the first container. You can see the existing containers on your system by running
18 | ```shell
19 | $ docker container ls -a
20 | ```
21 | Docker automatically assigns names to the containers. You can use the name of the first container to start it again interactively and verify that the graphics files are still present.
22 | ```shell
23 | $ docker start -i NAME_OF_THE_CONTAINER
24 | # ls -la
25 | ```
26 | Note that you can make Docker remove a container automatically once it exits by providing the `--rm` flag to `docker run`.
27 |
28 | ## B: Specification and Building
29 |
30 | ### Possible solution
31 |
32 | Create a file called `Dockerfile` with the following contents:
33 | ```Dockerfile
34 | FROM ubuntu:22.04
35 |
36 | RUN apt update && apt install -y curl imagemagick
37 | ```
38 |
39 | Then, from within the (otherwise empty) directory that holds the `Dockerfile`, run
40 | ```shell
41 | $ docker build . -t my-convert-image
42 | ```
43 |
44 | For starting the container, for starting a shell in the container, and for downloading the file etc, run
45 | ```shell
46 | $ docker run -it --rm my-convert-image bash
47 | # curl https://upload.wikimedia.org/wikipedia/commons/d/df/Container_01_KMJ.jpg -o container.jpg
48 | # convert container.jpg container.png
49 | # identify container.jpg
50 | # identify container.png
51 | # exit
52 | ```
53 |
54 | ## C: Binding host-system storage
55 |
56 | ### Possible solution
57 |
58 | With the image built above, start a container with a directory mounted:
59 | ```shell
60 | $ docker run -it --rm --volume $PWD:/work --workdir /work my-convert-image bash
61 | # curl https://upload.wikimedia.org/wikipedia/commons/d/df/Container_01_KMJ.jpg -o container.jpg
62 | # convert container.jpg container.png
63 | # identify container.jpg
64 | # identify container.png
65 | # exit
66 | $ ls container.*  # will display the two files
67 | ```
68 |
69 | ## D.1: Deploy via registry
70 |
71 | With the image built above, first tag the image and then push it
72 | ```shell
73 | $ docker tag my-convert-image willirath/2021-06-container-intro-course:convert-image
74 | $ docker push willirath/2021-06-container-intro-course:convert-image
75 | ```
76 |
77 | Pulling and running the image amounts to:
78 | ```shell
79 | $ docker pull willirath/2021-06-container-intro-course:convert-image
80 | $ docker run -it --rm --volume $PWD:/work --workdir /work willirath/2021-06-container-intro-course:convert-image
81 | # curl https://upload.wikimedia.org/wikipedia/commons/d/df/Container_01_KMJ.jpg -o container.jpg
82 | # convert container.jpg container.png
83 | # identify container.jpg
84 | # identify container.png
85 | # exit
86 | ```
87 |
88 | ## D.2: Deploy via file
89 |
90 | With the image built above, save the container to a tar archive:
91 | ```shell
92 | $ docker save my-convert-image --output my-convert-image.tar
93 | ```
94 |
95 | Re-loading and running the image amounts to:
96 | ```shell
97 | $ docker load --input my-convert-image.tar
98 | $ docker run -it --rm --volume $PWD:/work --workdir /work my-convert-image
99 | # curl https://upload.wikimedia.org/wikipedia/commons/d/df/Container_01_KMJ.jpg -o container.jpg
100 | # convert container.jpg container.png
101 | # identify container.jpg
102 | # identify container.png
103 | # exit
104 | ```
105 |
--------------------------------------------------------------------------------
/possible_solutions/05_homework_science_project/Dockerfile:
--------------------------------------------------------------------------------
1 | FROM ubuntu:24.04
2 |
3 | RUN apt update && apt install -y gfortran gnuplot-nox
4 |
5 | # make source code available in container
6 | RUN mkdir -p /app
7 | COPY create_data.F90 plot_data.gp run_everything.sh /app/
8 |
9 | # compile data creator
10 | RUN gfortran -o /app/create_data /app/create_data.F90
11 |
12 | # command which is executed if the container is just called
13 | CMD [ "bash", "/app/run_everything.sh" ]
--------------------------------------------------------------------------------
/possible_solutions/05_homework_science_project/README.md:
--------------------------------------------------------------------------------
1 | # Minimal science project - One possible solution
2 |
3 | We drive the whole workflow with one script, [`run_everything.sh`](run_everything.sh).
4 |
5 | The [`Dockerfile`](Dockerfile) installs all packages, compiles the "simulation" code [`create_data.F90`](create_data.F90) _at build time_, and then runs the whole workflow (create data and plot data) at _runtime_ as a `CMD`.
6 |
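A build-and-run sequence that writes the data to the current host directory (and prints the plot to the terminal) could look like this; the image name `my-science-project` is just a placeholder:

```shell
$ docker build -t my-science-project .
$ docker run --rm --volume $PWD:/work --workdir /work my-science-project
```
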
--------------------------------------------------------------------------------
/possible_solutions/05_homework_science_project/run_everything.sh:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env bash
2 |
3 | # parameters
4 | app_dir="/app/"
5 | data_path="data/"
6 | data_file="${data_path}/data.dat"
7 |
8 | # create data
9 | mkdir -p "${data_path}"
10 | "${app_dir}/create_data" > "${data_file}"
11 |
12 | # plot data
13 | gnuplot -c "${app_dir}/plot_data.gp" "${data_file}"
--------------------------------------------------------------------------------