├── .gitignore ├── 00_docker_installation.md ├── 01_motivation.md ├── 02_technology.md ├── 03_container_lifecycle.md ├── 04_hands_on_sessions.md ├── 05_homework_science_project.md ├── 05_homework_science_project ├── .gitignore ├── create_data.F90 ├── example.dat └── plot_data.gp ├── 06_singularity.md ├── 07_summary.md ├── 08_addendum.md ├── README.md ├── container-concept.png └── possible_solutions ├── 04_hands_on_sessions.md └── 05_homework_science_project ├── Dockerfile ├── README.md └── run_everything.sh /.gitignore: -------------------------------------------------------------------------------- 1 | container.jpg 2 | container.png 3 | .DS_Store 4 | -------------------------------------------------------------------------------- /00_docker_installation.md: -------------------------------------------------------------------------------- 1 | # Docker installation 2 | 3 | This course will have a mixture of lecture-style, hands-on and discussion parts. 4 | Please install Docker on your Linux, MacOS or Windows machine in advance, if you want to follow the hands-on parts yourself. 5 | 6 | While this is not mandatory (as we will also do a demo of the exercises) we would highly advise you to prepare Docker on your machine, especially if you already intend to try out containers for one of your scientific projects. 7 | By installing Docker, you will have two things at the end of the course: (1) a fully working container build environment (that is also able to build Singularity container images), as well as (2) hands-on experience in specifying, building, and running Docker container images in a way that is also applicable to the Singularity world. 8 | 9 | ## Instructions 10 | 11 | For Linux install Docker CE, 12 | 13 | * https://docs.docker.com/engine/install/#server 14 | 15 | (To receive updates to your Docker installation, please setup and install Docker CE via the Docker package repository for your Linux distribution as described for e.g. Ubuntu here: https://docs.docker.com/engine/install/ubuntu/#install-using-the-repository) 16 | 17 | For MacOS install Docker desktop, 18 | 19 | * https://docs.docker.com/docker-for-mac/install/#install-and-run-docker-desktop-on-mac 20 | 21 | For Windows also install Docker desktop, 22 | 23 | * https://docs.docker.com/docker-for-windows/install/#install-docker-desktop-on-windows 24 | 25 | (Consider **not** to choose the WSL2-backend during the installation, but to go for the default Hyper-V backend initially. While WSL2 is more performant, it is a bit more complex to setup. Luckily, it can be added at any later point.) 26 | 27 | Please note, there won't be time to discuss problems with installing Docker during the beginning of the course. 28 | Please also note, you will need administrator privileges on your machine to successfully install Docker. 29 | Contact your computer's administrator if you don't have the appropriate rights on your computer. 30 | 31 | Also, during the hands-on part of the course it is helpful to have a text editor suitable for coding available. 32 | This could be a terminal application such as "vim" or a GUI tool such as "VS Code" (https://code.visualstudio.com/Download). 33 | 34 | ## Questions 35 | 36 | > Why is Docker used for this course, if on the big machines I have only seen Singularity to be available? 37 | 38 | Containers are a very helpful technology that simplifies the deployment aspect associated with software. 
39 | The Singularity image build and Singularity container run functionality is only available on Linux systems, though. 40 | Docker container images, on the other hand, work across Linux, MacOS and Windows machines. 41 | Furthermore, they can be converted into Singularity container images very easily. 42 | As there is often no way of building Singularity container images directly on the machines where they will be used (at least in practice!) we during our daily work almost exclusively use workflows in which a container image is specified in Docker and then converted to a Singularity image. 43 | How to do this (with Docker or Singularity), will also be part of the course. 44 | -------------------------------------------------------------------------------- /01_motivation.md: -------------------------------------------------------------------------------- 1 | # Motivation behind using containers 2 | 3 | Containers may help with **reproducibility** of your scientific work, 4 | 5 | > **Scholarly research has evolved significantly over the past decade, but the same cannot be said for the methods by which research processes are captured and disseminated.** The primary method for dissemination - the scholarly publication - is largely unchanged since the advent of the scientific journal in the 1660s. This is no longer sufficient to verify, reproduce, and extend scientific results. Despite the increasing recognition of the need to share all aspects of the research process, scholarly publications today are often disconnected from the underlying analysis and, crucially, the computational environment that produced the findings. **For research to be reproducible, researchers must publish and distribute the entire contained analysis, not just its results.** The analysis should be mobile. **Mobility of Compute is defined as the ability to define, create, and maintain a workflow locally while remaining confident that the workflow can be executed elsewhere.** In essence, mobility of compute means being able to contain the entire software stack, from data files up through the library stack, and reliably move it from system to system. Any research that is limited to where it can be deployed is instantly limited in the extent that it can be reproduced. (from [The Turing Way](https://the-turing-way.netlify.app/reproducible-research/renv.html#science)) 6 | 7 | And containers may help increasing your personal (and your research group's) **scientific productivity**, 8 | 9 | * especially by simplifying workflow aspects around "the lifecycle of a scientific idea" (after Perez 2017, [the architecture of Jupyter](https://www.youtube.com/watch?v=dENc0gwzySc)) 10 | * individual exploratory work 11 | * collaborative developments 12 | * production runs (HPC, workstation, cloud, ...) 13 | * publication (reproducibly!) 14 | * specifically, by reducing your individual "time-to-scientific-insight" 15 | * daily perspective, e.g. [Jupyter start-up times](https://nbviewer.jupyter.org/github/ExaESM-WP4/Jupyter-HPC-performance/blob/fa725c1f3656f81c78254946f97a9c1764908e53/analysis.ipynb) on distributed storage machines 16 | * project perspective, e.g. becoming independent of particular machines and their software 17 | * ... 18 | 19 | ## Demo 20 | 21 | Suppose you want to do a data analysis on a machine, where you do not already have a suitable software environment installed. 22 | With containerized environments, it's extremely simple to (1) get started on your local machine, as well as to (2) scale a task out to another, e.g. 
bigger machine, or to (3) share an environment with your colleague who wants to build upon an analysis you have already started. 23 | 24 | We cover sharing of containerized software environments later in more detail. 25 | For a start, here we demonstrate the "installation" (or deployment) of a fully identical IPython software environment on a Linux, MacOS and Windows machine. 26 | IPython is provided with, e.g., the [Jupyter Docker stacks](https://jupyter-docker-stacks.readthedocs.io/en/latest/index.html), which are provided via [Dockerhub](https://hub.docker.com/r/jupyter/base-notebook), a very popular container sharing platform. 27 | Only two commands are necessary to pull and start-up the software environment. 28 | 29 | Your Linux desktop with Docker, 30 | 31 | ``` 32 | $ uname -a 33 | Linux morpheus 5.8.0-50-generic #56~20.04.1-Ubuntu SMP Mon Apr 12 21:46:35 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux 34 | $ docker pull jupyter/base-notebook 35 | $ docker run -it jupyter/base-notebook ipython 36 | Python 3.9.2 | packaged by conda-forge | (default, Feb 21 2021, 05:02:46) 37 | Type 'copyright', 'credits' or 'license' for more information 38 | IPython 7.24.1 -- An enhanced Interactive Python. Type '?' for help. 39 | 40 | In [1]: 41 | ``` 42 | 43 | Your MacOS desktop with Docker, 44 | 45 | ``` 46 | $ uname -a 47 | Darwin od-nb008mc 18.7.0 Darwin Kernel Version 18.7.0: Mon Mar 8 22:11:48 PST 2021; root:xnu-4903.278.65~1/RELEASE_X86_64 x86_64 48 | $ docker pull jupyter/base-notebook 49 | $ docker run -it jupyter/base-notebook ipython 50 | Python 3.9.2 | packaged by conda-forge | (default, Feb 21 2021, 05:02:46) 51 | Type 'copyright', 'credits' or 'license' for more information 52 | IPython 7.24.1 -- An enhanced Interactive Python. Type '?' for help. 53 | 54 | In [1]: 55 | ``` 56 | 57 | Your Windows desktop with Docker, 58 | 59 | ``` 60 | Windows PowerShell 61 | Copyright (C) Microsoft Corporation. All rights reserved. 62 | PS V:\> docker pull jupyter/base-notebook 63 | PS V:\> docker run -it jupyter/base-notebook ipython 64 | Python 3.9.2 | packaged by conda-forge | (default, Feb 21 2021, 05:02:46) 65 | Type 'copyright', 'credits' or 'license' for more information 66 | IPython 7.24.1 -- An enhanced Interactive Python. Type '?' for help. 67 | 68 | In [1]: 69 | ``` 70 | 71 | # Goal of this workshop 72 | 73 | Accelerate your "scientific productivity" (and increase your scientific work's reproducibility) by covering the basics of container technology, as well as a few considerations around scientific workflows from a container perspective. 74 | -------------------------------------------------------------------------------- /02_technology.md: -------------------------------------------------------------------------------- 1 | # What is a container? 2 | 3 | ## High-level conceptual perspective 4 | 5 | * isolated runtime environment for "applications" including their dependencies but not, unlike VMs, the kernel 6 | * simplifies "application" development, and deployment, and strongly improves "application" portability 7 | 8 | ## Brief historical perspective 9 | 10 | Containers are not a new technology. 11 | They were born as a unix system developer tool already during the 1980s. 12 | 13 | * chroot (1979/82) <— birth 14 | * FreeBSD jail (2000) <— usability 15 | * LXC (2008) <— popularity 16 | * **Docker (2013) <— community** 17 | * **Singularity (2015) <— science and containerized high-performance computing! also: shared machines** 18 | * Docker rootless (2020) <- Docker for shared machines; but: e.g. 
GPUs possible? 19 | * Apptainer (2021) joins the Linux Foundation <- align containerized high-performance computing and cloud technology developments 20 | 21 | (see e.g. [here](https://en.wikipedia.org/wiki/OS-level_virtualization) and [there](https://www.section.io/engineering-education/history-of-container-technology/) and [many more](https://www.google.com/search?q=history+of+container+technology)) 22 | 23 | ## Technical perspective 24 | 25 | The core concept about "what is a container?" from a user perspective is the "shared Linux kernel" and "contained software environment" aspect. 26 | 27 | * user processes 28 | * Linux kernel w/ file system 29 | * hardware layer 30 | 31 | 32 | 33 | **References** 34 | 35 | * https://www.linuxfordevices.com/tutorials/linux/linux-kernel 36 | * https://www.linux.com/training-tutorials/linux-filesystem-explained/ 37 | * https://www.hpcwire.com/2017/11/01/sc17-singularity-preps-version-3-0-nears-1m-containers-served-daily/ 38 | 39 | ## Hands-on part (5 minutes) 40 | 41 | To illustrate this, let's pull the `alpine:latest` and `ubuntu:22.04` Linux base images. 42 | 43 | On a Linux (or MacOS machine) you can directly have a look at your host system's kernel version, 44 | 45 | ``` 46 | $ uname -a 47 | Linux morpheus 5.8.0-50-generic #56~20.04.1-Ubuntu SMP Mon Apr 12 21:46:35 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux 48 | ``` 49 | 50 | the host system's file system structure, 51 | 52 | ``` 53 | $ ls -l / 54 | lrwxrwxrwx 1 root root 7 Okt 30 2020 bin -> usr/bin 55 | drwxr-xr-x 4 root root 4096 Jun 4 06:30 boot 56 | drwxrwxr-x 2 root root 4096 Okt 30 2020 cdrom 57 | drwxr-xr-x 20 root root 4600 Mai 12 09:57 dev 58 | drwxr-xr-x 133 root root 12288 Jun 4 06:30 etc 59 | drwxr-xr-x 3 root root 4096 Okt 30 2020 home 60 | lrwxrwxrwx 1 root root 7 Okt 30 2020 lib -> usr/lib 61 | lrwxrwxrwx 1 root root 9 Okt 30 2020 lib32 -> usr/lib32 62 | lrwxrwxrwx 1 root root 9 Okt 30 2020 lib64 -> usr/lib64 63 | lrwxrwxrwx 1 root root 10 Okt 30 2020 libx32 -> usr/libx32 64 | drwx------ 2 root root 16384 Okt 30 2020 lost+found 65 | drwxr-xr-x 2 root root 4096 Jul 31 2020 media 66 | drwxr-xr-x 2 root root 4096 Jul 31 2020 mnt 67 | drwxr-xr-x 3 root root 4096 Nov 3 2020 opt 68 | dr-xr-xr-x 327 root root 0 Apr 20 22:31 proc 69 | drwx------ 5 root root 4096 Mai 16 14:46 root 70 | drwxr-xr-x 42 root root 1320 Jun 6 17:27 run 71 | lrwxrwxrwx 1 root root 8 Okt 30 2020 sbin -> usr/sbin 72 | dr-xr-xr-x 13 root root 0 Apr 20 22:31 sys 73 | drwxrwxrwt 18 root root 4096 Jun 6 17:27 tmp 74 | drwxr-xr-x 14 root root 4096 Jul 31 2020 usr 75 | drwxr-xr-x 14 root root 4096 Jul 31 2020 var 76 | ``` 77 | and e.g. host system's user directory contents, 78 | 79 | ``` 80 | $ ls /home 81 | khoeflich 82 | ``` 83 | 84 | Now, download the Alpine base image from Dockerhub, 85 | 86 | ``` 87 | $ docker pull alpine:latest 88 | ``` 89 | 90 | start an interactive Bash session in the container, 91 | 92 | ``` 93 | $ docker run -it --rm alpine:latest /bin/sh 94 | ``` 95 | 96 | and familiarize yourself with the visible software environment, 97 | 98 | ``` 99 | root@131fa759eb1b:/# ls -l / 100 | root@131fa759eb1b:/# ls /home 101 | root@131fa759eb1b:/# cat /etc/os-release 102 | root@131fa759eb1b:/# uname -a 103 | ``` 104 | 105 | Do this also for the `ubuntu:22.04` image and your host system! What is different, what is the same? 106 | 107 | Please note, Docker desktop on MacOS and Windows is shipped with a Linux virtual machine, that runs in the background and provides you with Docker functionality. 
108 | What do you expect to find for the respective `uname` commands?
109 | 
110 | Especially on Windows, you can't natively run a Unix-like command such as `uname` in your host system's PowerShell.
111 | Can you still demonstrate the core concept of "shared Linux kernel" and "contained software environment" on a Windows machine? If so, how?
112 | 
--------------------------------------------------------------------------------
/03_container_lifecycle.md:
--------------------------------------------------------------------------------
 1 | # The container lifecycle
 2 | 
 3 | The typical container lifecycle (from an industry containerized software delivery perspective) consists of four parts which may be grouped into a part dealing with the *container image* and a part dealing with the *container* itself:
 4 | 
 5 | - Image:
 6 |   - [Specify](https://docs.docker.com/build/building/best-practices/)
 7 |   - [Build](https://docs.docker.com/reference/cli/docker/buildx/build/)
 8 |   - [Deploy](https://docs.docker.com/reference/cli/docker/image/pull/)
 9 | - Container:
10 |   - [Run](https://docs.docker.com/reference/cli/docker/container/run/)
11 | 
12 | In a scientific-computing setting, we add two parts:
13 | 
14 | - Archive
15 | - Reproduce
16 | 
--------------------------------------------------------------------------------
/04_hands_on_sessions.md:
--------------------------------------------------------------------------------
 1 | # Basic Hands-on Example
 2 | 
 3 | Throughout this part, we'll link the [Docker reference docs](https://docs.docker.com/reference/), of which the [docs for the command-line interface (CLI)](https://docs.docker.com/engine/reference/commandline/cli/) and the [docs for Dockerfiles](https://docs.docker.com/engine/reference/builder/) are the most important.
 4 | 
 5 | ## Background
 6 | 
 7 | Suppose we have a workflow step that requires pictures to be downloaded from the internet and converted into a different format.
 8 | We might want to script the task, and it might be important to have particular versions of the involved software installed, so that we do not need to adapt our script if picture libraries get extended or if software tool developers decide to change, e.g., their command-line interfaces.
 9 | We also may want to be able to execute this workflow on different machines, all of which, fortuitously, have Docker or Singularity / Apptainer installed.
10 | 
11 | ## A: Modification of an existing container
12 | 
13 | ### Task
14 | 
15 | Use an existing base image ([`ubuntu:22.04`](https://hub.docker.com/_/ubuntu)), start a container with an interactive shell, and install the software necessary to download an image from the internet (`curl`) and to convert the image to a different format (`imagemagick`).
16 | 
17 | ### Hints
18 | 
19 | Running containers with an interactive shell (here using `bash`) can be done with: [`docker run --rm -it <image> bash`](https://docs.docker.com/engine/reference/commandline/run/)
20 | 
21 | Installation can be done with `apt update` followed by `apt install -y ...`.
22 | 
23 | Let's use this graphics file: <https://upload.wikimedia.org/wikipedia/commons/d/df/Container_01_KMJ.jpg>
24 | 
25 | To download the image, run
26 | ```shell
27 | curl https://upload.wikimedia.org/wikipedia/commons/d/df/Container_01_KMJ.jpg -o container.jpg
28 | ```
29 | 
30 | To convert the graphics file to, e.g., PNG, run:
31 | ```shell
32 | convert container.jpg container.png
33 | ```
34 | (The `convert` command is part of the `imagemagick` suite.)
35 | 
36 | To check if the file format really changed, compare the output of `identify container.jpg` with that of `identify container.png`.
37 | (The `identify` command is part of the `imagemagick` suite.)
38 | 
39 | ### Discussion
40 | 
41 | - Are the graphic files still there when you exit the container and start it again with `docker run`?
42 | - How do you properly restart a stopped container?
43 | - How to get the graphics file out of the container? — We'll skip this for now and come back to this question later.
44 | 
45 | ## B: Specification and Building
46 | 
47 | ### Task
48 | 
49 | Write a [`Dockerfile`](https://docs.docker.com/engine/reference/builder/) which specifies the full environment (e.g. the base image and all packages we need), and [build the container image](https://docs.docker.com/engine/reference/commandline/build/).
50 | 
51 | Then, to test if the container is able to do what it's meant for, start a shell in the container (see above) and download and convert the graphics file again.
52 | 
53 | ### Hints
54 | 
55 | In the `Dockerfile`, you can [specify the base image using](https://docs.docker.com/engine/reference/builder/#from)
56 | ```Dockerfile
57 | FROM <base image>
58 | ```
59 | and you can specify [commands that are run at _build time_](https://docs.docker.com/engine/reference/builder/#run) using
60 | ```Dockerfile
61 | RUN <command>
62 | ```
63 | 
64 | ### Discussion
65 | 
66 | - Note that this will store the image in a local _registry_. We'll talk about this later.
67 | 
68 | ## C: Binding host-system storage
69 | 
70 | ### Task
71 | 
72 | Make the file system of your computer available to the container.
73 | 
74 | This will show how to get data in and out of the container. We've already seen that from within the container, we can access the network / the internet. For many _industry_ applications, using the network to interact with the environment is all that is needed. In a scientific data-analysis context, however, file-system access is often essential.
75 | 
76 | ### Hints
77 | 
78 | Adding the flags [`--volume $PWD:/work --workdir /work`](https://docs.docker.com/engine/reference/commandline/run/#mount-volume--v---read-only) to the [`docker run`](https://docs.docker.com/engine/reference/commandline/run/) call will create a directory `/work/` within the container that shows the contents of the current directory (`$PWD`) and will make sure the working directory within the container is `/work`.
79 | 
80 | ### Discussion
81 | 
82 | There is no strong convention about how to name volumes / directories bound into containers. It's wise, however, to avoid any name used in the [(Linux) Filesystem Hierarchy Standard](https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard).
83 | 
84 | ## D.1: Deploy (via registry)
85 | 
86 | ### Task
87 | 
88 | Use the image built in part B above. Tag the image. Push the image to a registry. Pull the image from a registry. Run the image.
89 | 
90 | _(Note that for pushing, you'd need an account at a registry. So the pushing part will be done by one of the presenters. But you should be able to follow along pulling and running the image.)_
91 | 
92 | ### Hints
93 | 
94 | Tagging images can be done with [`docker tag ...`](https://docs.docker.com/engine/reference/commandline/tag/).
95 | 
96 | Pushing to a registry is as easy as running [`docker push`](https://docs.docker.com/engine/reference/commandline/push/).
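For example, tagging and pushing the image from part B might look like the following sketch (the local name `my-image` and the Docker Hub namespace `your-dockerhub-user` are placeholders; use whatever tag you chose when building and your own registry account):
```shell
docker tag my-image your-dockerhub-user/my-image:2021-06
docker push your-dockerhub-user/my-image:2021-06
```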
97 | 98 | You can [`docker pull `](https://docs.docker.com/engine/reference/commandline/pull/) an image and then run it as if you tagged it locally with [`docker run ... `](https://docs.docker.com/engine/reference/commandline/run/) 99 | 100 | ## D.2: Deploy (via files) 101 | 102 | ### Task 103 | 104 | Export the container image to a file and import it again (possibly on a separate machine). 105 | 106 | ### Hints 107 | 108 | You can use [`docker save --output `](https://docs.docker.com/engine/reference/commandline/save/) to save a container image to a tar archive. You can use [`docker load --input `](https://docs.docker.com/engine/reference/commandline/load/) to load the image again. 109 | 110 | ### Discussion 111 | 112 | - While container registries are best used for keeping and providing container images during active use, file-based deployments can be used for archiving or for sharing containers with machines that do not have access to a registry. 113 | 114 | - There is also a way of directly importing file-based docker images into Singularity, which is relevant for multi-user systems. 115 | -------------------------------------------------------------------------------- /05_homework_science_project.md: -------------------------------------------------------------------------------- 1 | # Homework: Minimal science project 2 | 3 | _All pre-existing materials you might need are in: [`05_homework_science_project/`](05_homework_science_project/)_ 4 | 5 | Consider a scientific project that consists of two steps: A _"simulation"_ which produces data, and a _"data analysis"_ which tries to make sense of the data. 6 | 7 | ## Simulation — Running a compiled software which produces data 8 | 9 | This could, e.g., be a physical or biological simulation. Here, we'll use a [small program written in Fortran](05_homework_science_project/create_data.F90) which produces a data sets with a sinus-shaped signal and some added noise. 10 | 11 | The data look like this: 12 | ``` 13 | 15.0000000 0.644502997 14 | 30.0000000 1.54576778 15 | 45.0000000 2.07850647 16 | ... ... 17 | 330.000000 -1.68714190 18 | 345.000000 -0.620180488 19 | 360.000000 9.53099951E-02 20 | ``` 21 | 22 | ## Data analysis - Visualize the simulation's data 23 | 24 | This could be a script or a set of scripts which produce figures for a publication or reduced data like mean and standard deviations of the input data. Here, we'll use [`gnuplot`](https://en.wikipedia.org/wiki/Gnuplot) to run [a script](05_homework_science_project/plot_data.gp) which plots the sinus-shaped data produced in step A. 25 | 26 | The resulting plot looks like this: 27 | ``` 28 | 29 | 4 +----------------------------------------------------------------------+ 30 | | + + + + + + + | 31 | 3 |-+ A A "data.dat" A +-| 32 | | A | 33 | | A A A | 34 | 2 |-+ A +-| 35 | | A | 36 | 1 |-+ A +-| 37 | | A A | 38 | | | 39 | 0 |-+ A A +-| 40 | | A | 41 | -1 |-+ A +-| 42 | | A | 43 | | A | 44 | -2 |-+ A +-| 45 | | A | 46 | -3 |-+ A A A +-| 47 | | A A | 48 | | + + + + + + + | 49 | -4 +----------------------------------------------------------------------+ 50 | 0 50 100 150 200 250 300 350 400 51 | 52 | ``` 53 | 54 | ## Hands-on details 55 | 56 | - Use the latest stable Ubuntu LTS container image. 57 | 58 | Hint: Compare the [Docker image tags](https://hub.docker.com/_/ubuntu) with the [Ubuntu releases](https://ubuntu.com/about/release-cycle) and choose the newest Ubuntu Long Term Support (LTS) release. 
59 | 60 | - First set up the container interactively: 61 | 62 | - Make sure you have the Fortran compiler gfortran (package name is `gfortran`) and Gnuplot (package name `gnuplot-nox`) installed. Installation can be done by running `apt update` and `apt install -y ...` in the container. 63 | 64 | - Compile the software using: `gfortran create_data.F90 -o create_data` Then run the software with: `./create data`. To redirect the output into a data file called `data.dat`, use: `./create_data > data.dat`. 65 | 66 | - Plot the data with: `gnuplot -c plot_data.gp "data.dat"`. 67 | 68 | - Now write a `Dockerfile` to set up the container up to the point where the data can be created. 69 | 70 | Hint: You need to copy the files `create_data.F90` and `plot_data.gp` into the container during the build process. 71 | 72 | - Whenever the container is run, the data shall be written to the host file system and the plot shall be generated. 73 | -------------------------------------------------------------------------------- /05_homework_science_project/.gitignore: -------------------------------------------------------------------------------- 1 | create_data.x 2 | -------------------------------------------------------------------------------- /05_homework_science_project/create_data.F90: -------------------------------------------------------------------------------- 1 | program create_data 2 | implicit none 3 | integer :: istep 4 | real :: alpha, signal, r, noise 5 | 6 | do istep = 0, 24 7 | ! sinusoidal signal between 0 and 360 deg 8 | alpha = istep * 15.0 9 | signal = 3.0 * sind(alpha) 10 | 11 | ! random noise (symmetric random number in +/- 0.2) 12 | call random_number(r) 13 | noise = 0.2 * (2.0 * r - 1.0) 14 | 15 | ! write all to stdout 16 | write(*,*) alpha, signal + noise 17 | end do 18 | end program create_data -------------------------------------------------------------------------------- /05_homework_science_project/example.dat: -------------------------------------------------------------------------------- 1 | 15.0000000 0.606624901 2 | 30.0000000 1.31915927 3 | 45.0000000 2.23794079 4 | 60.0000000 2.57067752 5 | 75.0000000 3.06147194 6 | 90.0000000 3.17521191 7 | 105.000000 2.71621990 8 | 120.000000 2.55521202 9 | 135.000000 2.00537181 10 | 150.000000 1.66122675 11 | 165.000000 0.873169243 12 | 180.000000 -0.128877386 13 | 195.000000 -0.788596749 14 | 210.000000 -1.60218716 15 | 225.000000 -2.17572522 16 | 240.000000 -2.78242493 17 | 255.000000 -3.08727360 18 | 270.000000 -3.11648655 19 | 285.000000 -2.71767974 20 | 300.000000 -2.78709698 21 | 315.000000 -2.20977879 22 | 330.000000 -1.32908857 23 | 345.000000 -0.589825869 24 | 360.000000 -3.76013517E-02 25 | -------------------------------------------------------------------------------- /05_homework_science_project/plot_data.gp: -------------------------------------------------------------------------------- 1 | # use ascii output 2 | set terminal dumb 3 | 4 | # and plot 5 | filename=ARG1 6 | plot filename -------------------------------------------------------------------------------- /06_singularity.md: -------------------------------------------------------------------------------- 1 | # Singularity 2 | 3 | Singularity is a container platform that provides, 4 | 5 | * mobility of compute via a single-file SIF container image format 6 | * very natural to migrate, execute, share and archive! 7 | * a permission model that is suitable for shared machines (such as HPC, group's workstation, ...) 
8 | * users execute containers as themselves and don't need and/or can't get root privileges on the host per default 9 | * optimization for "integration" with the host system, rather than "isolation" from the host system 10 | * it's very easy to use the host's network and file system (plus available GPUs! which is also possible with Docker, though...) 11 | 12 | ## Docker/Singularity CLI 13 | 14 | During the hands-on part you have already used Docker commands to pull, specify, build, run and archive a container (image). 15 | Basically, each of these have an equivalent command in Singularity on the machines you might be using. 16 | 17 | Pull container images from a remote registry, 18 | 19 | ``` 20 | $ docker pull ubuntu:22.04 21 | $ singularity pull docker://ubuntu:22.04 22 | ``` 23 | 24 | Build a container image after specification (on a machine you own), 25 | 26 | ``` 27 | $ docker build -t local/my-container-image -f Dockerfile . 28 | $ singularity build --fakeroot my-container-image.sif my-container-image.txt 29 | ``` 30 | 31 | (For [details on Singularity definition files](https://sylabs.io/guides/3.7/user-guide/definition_files.html) and [differences to Dockerfiles](https://sylabs.io/guides/3.7/user-guide/singularity_and_docker.html#singularity-definition-file-vs-dockerfile) see the official Singularity docs.) 32 | 33 | Run a containerized software, 34 | 35 | ``` 36 | $ docker run -it --rm local/my-container-image 37 | $ singularity run my-container-image.sif 38 | ``` 39 | 40 | Archive a container image, 41 | 42 | ``` 43 | $ docker save local/my-container-image --output my-container-image.tar 44 | ``` 45 | 46 | (With Singularity's container image format the archiving aspect is solved naturally.) 47 | 48 | ## Singularity build: Stumbling blocks 49 | 50 | _Note that there is a lot of development on Singularity (and their docs!) going on. Expect that the following statements will be outdated very soon and that working with Singularity on MacOS and Windows machines will be just as easy as with Docker Desktop._ 51 | 52 | * installing Singularity requires a Linux machine and involves compiling the Singularity code base 53 | * debugging, building and executing Singularity containers is only natively possible on Linux architecture 54 | * building Singularity containers requires root privileges and is therefore not possible on shared machines 55 | 56 | ``` 57 | $ singularity build my-container-image.sif my-container-image.txt 58 | FATAL: You must be the root user, however you can use --remote or --fakeroot to build from a Singularity recipe file 59 | $ singularity build --fakeroot my-container-image.sif my-container-image.txt 60 | FATAL: could not use fakeroot: no mapping entry found in /etc/subuid for username 61 | $ singularity build --remote my-container-image.sif my-container-image.txt 62 | FATAL: Unable to submit build job: no authentication token, log in with `singularity remote login` 63 | ``` 64 | 65 | You could get yourself access to the [Syslabs.io remote builder](https://cloud.sylabs.io/). 66 | (While the existence of such a service is quite nice, debugging/building remotely might become rather tedious quickly. Also the storage for your container images is currently limited to only about 11GB quota. More severe might be the aspect of how to add further local files to the container image during the build? Not tested, though.) 
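For reference, a definition file like the `my-container-image.txt` used in the build commands above could look like the following minimal sketch (base image and packages are only examples):

```
Bootstrap: docker
From: ubuntu:22.04

%post
    apt-get update && apt-get install -y curl imagemagick

%runscript
    echo "This is what 'singularity run' executes."
```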
67 | 68 | As Singularity comes with a lot of ways to convert Docker images to the Singularity container image format one can fully go around the "Singularity build" problem, however, by utilizing Docker only (and the Docker community's knowledge.) 69 | 70 | ## Docker build workflow for Singularity containers 71 | 72 | The "advisable" scientific container lifecycle: `docker build` to `singularity run`? 73 | (If this is really the "advisable" way depends on your specific use case and is open for discussion, though. If you need Docker and Singularity container types for your project, it's currently the most simple way. Converting from Singularity to Docker images is possible, but not very convenient.) 74 | 75 | ### Specify 76 | 77 | Let's specify a container image that contains a Bash (!) script that prints a "hello world" message. 78 | In the Ubuntu base image Bash is already installed, let's use Alpine here so that we are forced to install a software. 79 | 80 | ``` 81 | $ cat Dockerfile 82 | FROM alpine:latest 83 | RUN apk add bash 84 | RUN echo '#!/bin/bash' > /hello.sh \ 85 | && echo 'echo "Hello from a container built on MacOS!"' >> /hello.sh \ 86 | && chmod +x /hello.sh 87 | ``` 88 | 89 | ### Build 90 | 91 | ``` 92 | $ docker build -t local/hello-from-macos . 93 | ``` 94 | 95 | Have a look at your local registry, 96 | 97 | ``` 98 | $ docker images 99 | REPOSITORY TAG IMAGE ID CREATED SIZE 100 | local/hello-from-macos latest 2ec03757f43b 5 seconds ago 9.74MB 101 | ``` 102 | 103 | Make a test run, 104 | 105 | ``` 106 | $ docker run --rm local/hello-from-macos /hello.sh 107 | Hello from a container built on MacOS! 108 | ``` 109 | 110 | ### Deploy... and run! 111 | 112 | You don't actually need to have Singularity installed on your system to build Singularity containers. 113 | Having Docker installed you can: 114 | 115 | * pull a dockerized Singularity (thanks Docker and Singularity community!) 116 | * do the Singularity build directly from your local Docker registry 117 | * transfer the Singularity image file to the target system 118 | 119 | ``` 120 | # local machine 121 | $ docker pull kathoef/docker2singularity:latest 122 | $ docker run --rm -v /var/run/docker.sock:/var/run/docker.sock -v ${PWD}:/output \ 123 | kathoef/docker2singularity singularity build hello-from-macos.sif docker-daemon://local/hello-from-macos:latest 124 | $ ls -lh 125 | -rwxr-xr-x 1 khoeflich GEOMAR 5.3M Jun 9 14:08 hello-from-macos.sif 126 | $ scp hello-from-macos.sif ... 127 | ``` 128 | 129 | ``` 130 | # target machine 131 | $ module load singularity/... 132 | $ singularity run hello-from-macos.sif /hello.sh 133 | Hello from a container built on MacOS! 134 | ``` 135 | 136 | ### Alternative conversion workflow 137 | 138 | Note, that you could also 139 | 140 | * export your container image as a Docker tar archive 141 | * transfer the tar archive to a machine with Singularity 142 | * do a Singularity build from the tar archive on the target machine 143 | 144 | ``` 145 | # local machine 146 | $ docker save local/hello-from-macos --output hello-from-macos.tar 147 | $ scp hello-from-macos.tar ... 148 | ``` 149 | 150 | ``` 151 | # target machine 152 | $ singularity build hello-from-macos.sif docker-archive://hello-from-macos.tar 153 | INFO: Starting build... 
154 | Getting image source signatures 155 | Copying blob b2d5eeeaba3a done 156 | Copying blob c33f491601cb done 157 | Copying blob 4b7cacb5f751 done 158 | Copying config b7469f518f done 159 | Writing manifest to image destination 160 | Storing signatures 161 | 2021/06/09 22:18:32 info unpack layer: sha256:d12dd637fd61a233bdb43ff256513c0704ceb2d4d1d8e40d75c8b4a0128dc976 162 | 2021/06/09 22:18:32 info unpack layer: sha256:d1b09556f2eedc3a9044cd5788a9efdf602382391249dc576e5e30db3cac5d7e 163 | 2021/06/09 22:18:32 info unpack layer: sha256:b4d2b1ee81c66d77659d89d01ca64bcba9345e1d82bfd4f52b74db8f7e6c4a93 164 | INFO: Creating SIF file... 165 | INFO: Build complete: hello-from-macos.sif 166 | ``` 167 | 168 | Doing the Singularity build on your local machine has the advantage of transfering a much smaller file, though. 169 | (Because the Docker layer structure is collapsed during conversion and SIF files itself are compressed.) 170 | 171 | ``` 172 | $ ls -lh 173 | -rwxr-xr-x 1 khoeflich GEOMAR 5.3M Jun 9 14:08 hello-from-macos.sif 174 | -rw------- 1 khoeflich GEOMAR 9.6M Jun 9 14:17 hello-from-macos.tar 175 | ``` 176 | 177 | ## Docker/Singularity compatibility 178 | 179 | Per default, Docker container images run "isolated" and "writable", while Singularity container images run "integrated" and "read-only". 180 | If you want a Docker image to be compatible with Singularity runtime assumptions, consider the following aspects for your `Dockerfile`: 181 | 182 | * do not install any libraries (other than what is installed via e.g. `apt install...`) and/or scripts 183 | * in a typical Linux file system locations like e.g. `/opt` (i.e. rather use `/my-software` or `/my-script.sh`) 184 | * in the container environment's home folders, i.e. `$HOME` or `/root` 185 | * (see the [(Linux) Filesystem Hierarchy Standard](https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard) for a list of paths that should be avoided) 186 | * make use of Dockerfile instructions, i.e. `ENV` to specify your software locations (do not use e.g. `.bashrc`) 187 | * do not rely on having runtime write permissions to a file system location other than `$HOME` or `/tmp` (plus locations you manually bind mount) 188 | * to enable yourself to use (i.e. "mount") also software from the host system 189 | * do not use Alpine as base image for your projects because of incompatibilities between Alpine `libc` and typical HPC host `libc`s 190 | * use CentOS/Ubuntu/Debian base images that are neither too new, nor too old (see [here](https://github.com/ExaESM-WP4/Batch-scheduler-Singularity-bindings) for an HPC use case where this problem came up for ourselves) 191 | 192 | (This is a filtered list from [here](https://github.com/singularityhub/docker2singularity#tips-for-making-docker-images-compatible-with-singularity) with aspects added from a few "lessons learned" during our own use of containerized Jupyter and Dask jobqueue software environments on HPC systems.) 193 | -------------------------------------------------------------------------------- /07_summary.md: -------------------------------------------------------------------------------- 1 | # Summary 2 | 3 | ## Recap the container lifecycle 4 | 5 | - Specify: `Dockerfile` 6 | - Build: `docker build` 7 | - Deploy: `docker push/save` and `docker pull/load` 8 | - Run: `docker run` 9 | - Archive: `docker push/save` 10 | - Reproduce: `docker build`, `docker pull/load` 11 | 12 | ## Further key points 13 | 14 | - What is a container? 15 | - What's a container image? 
16 | - The (scientific) container lifecycle 17 | - How to debug containers? 18 | - Singularity: Containers for shared machines 19 | - Isolation from / integration into host system 20 | - Filesystem and networking 21 | - Deploying containers 22 | - via a registry 23 | - via tar archives 24 | - converting docker images to singularity files 25 | 26 | ## What's next? 27 | 28 | - Containers 29 | - and GPUs 30 | - and MPI 31 | - and host-system libraries (like batch schedulers) 32 | - Multi-container setups and container orchestration 33 | - docker compose and singularity compose 34 | - manual solutions 35 | - kubernetes 36 | - what level of granularity is best for which purpose? 37 | - Continuous Integration (CI) 38 | - Containers as test environments for scientific software 39 | - CI for building container images 40 | - Container based workflows 41 | - Maintain group-wide containers as single source of truth for computing environments? 42 | - Share (changing) containers with collaborators? 43 | - Container best practices (for scientific computing) 44 | - No data in containers 45 | - Include own sofware in container or not? 46 | - Where to archive containers for long / short term purposes? 47 | - Advanced building 48 | - Docker-image layers (number and size) and cleanup 49 | - Multi-stage builds 50 | - Controlling build context 51 | - Target different architectures 52 | - From Docker to Singularity 53 | - Many pre-existing Docker images were built with Docker but not with Singularity in mind. 54 | - There's a few considerations to make it easier to use Docker images with Singularity. -------------------------------------------------------------------------------- /08_addendum.md: -------------------------------------------------------------------------------- 1 | # Miscellaneous materials and notes 2 | 3 | - Reference container for Jupyter & Python based envs: https://github.com/martinclaus/py-da-stack 4 | - ... 
5 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Containers for Scientific Computing Workshop 2021
 2 | 
 3 | Structure of this course:
 4 | 
 5 | - Installation instructions: [00_docker_installation.md](00_docker_installation.md)
 6 | - Motivation: [01_motivation.md](01_motivation.md)
 7 | - Overview of container technology: [02_technology.md](02_technology.md)
 8 | - The container lifecycle: [03_container_lifecycle.md](03_container_lifecycle.md)
 9 | - Hands-on session: [04_hands_on_sessions.md](04_hands_on_sessions.md)
10 | - Homework project: [05_homework_science_project.md](05_homework_science_project.md)
11 | - Singularity: [06_singularity.md](06_singularity.md)
12 | - Wrap-up and summary: [07_summary.md](07_summary.md)
13 | - Addendum: [08_addendum.md](08_addendum.md)
14 | 
--------------------------------------------------------------------------------
/container-concept.png:
--------------------------------------------------------------------------------
 https://raw.githubusercontent.com/ExaESM-WP4/Containers-for-Scientific-Computing/7e69399b618dd307fc978ac2035be7bac9d8570c/container-concept.png
--------------------------------------------------------------------------------
/possible_solutions/04_hands_on_sessions.md:
--------------------------------------------------------------------------------
 1 | # Basic Hands-on Example
 2 | 
 3 | ## A: Modification of an existing container
 4 | 
 5 | ### Possible Solution
 6 | 
 7 | ```shell
 8 | $ docker run -it ubuntu:22.04 bash
 9 | # apt update
10 | # apt install curl imagemagick
11 | # curl https://upload.wikimedia.org/wikipedia/commons/d/df/Container_01_KMJ.jpg -o container.jpg
12 | # convert container.jpg container.png
13 | # identify container.jpg
14 | # identify container.png
15 | # exit
16 | ```
17 | Repeating the `docker run` creates a new container from the same image. This new container is not affected by the changes you made to the first container. You can see the existing containers on your system by running
18 | ```shell
19 | $ docker container ls -a
20 | ```
21 | Docker automatically assigns names to the containers. You can use the name of the first container to start it again interactively and verify that the graphic files are still present.
22 | ```shell
23 | $ docker start -i NAME_OF_THE_CONTAINER
24 | # ls -la
25 | ```
26 | Note that you can have Docker automatically remove a container as soon as it exits by providing the `--rm` flag to `docker run`.
27 | 
28 | ## B: Specification and Building
29 | 
30 | ### Possible solution
31 | 
32 | Create a file called `Dockerfile` with the following contents:
33 | ```Dockerfile
34 | FROM ubuntu:22.04
35 | 
36 | RUN apt update && apt install -y curl imagemagick
37 | ```
38 | 
39 | Then, from within the (otherwise empty) directory that holds the `Dockerfile`, run
40 | ```shell
41 | $ docker build . -t my-convert-image
42 | ```
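You can verify that the freshly built image ended up in your local registry (image ID, creation time, and size will differ on your machine):
```shell
$ docker images my-convert-image
```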
43 | 
44 | To start a shell in the container and to download and convert the graphics file again, run
45 | ```shell
46 | $ docker run -it --rm my-convert-image bash
47 | # curl https://upload.wikimedia.org/wikipedia/commons/d/df/Container_01_KMJ.jpg -o container.jpg
48 | # convert container.jpg container.png
49 | # identify container.jpg
50 | # identify container.png
51 | # exit
52 | ```
53 | 
54 | ## C: Binding host-system storage
55 | 
56 | ### Possible solution
57 | 
58 | With the image built above, start a container with a directory mounted:
59 | ```shell
60 | $ docker run -it --rm --volume $PWD:/work --workdir /work my-convert-image bash
61 | # curl https://upload.wikimedia.org/wikipedia/commons/d/df/Container_01_KMJ.jpg -o container.jpg
62 | # convert container.jpg container.png
63 | # identify container.jpg
64 | # identify container.png
65 | # exit
66 | $ ls container.*  # will display the two files
67 | ```
68 | 
69 | ## D.1: Deploy via registry
70 | 
71 | ### Possible solution
72 | 
73 | With the image built above, first tag the image and then push it:
74 | ```shell
75 | $ docker tag my-convert-image willirath/2021-06-container-intro-course:convert-image
76 | $ docker push willirath/2021-06-container-intro-course:convert-image
77 | ```
78 | 
79 | Pulling and running the image amounts to:
80 | ```shell
81 | $ docker pull willirath/2021-06-container-intro-course:convert-image
82 | $ docker run -it --rm --volume $PWD:/work --workdir /work willirath/2021-06-container-intro-course:convert-image
83 | # curl https://upload.wikimedia.org/wikipedia/commons/d/df/Container_01_KMJ.jpg -o container.jpg
84 | # convert container.jpg container.png
85 | # identify container.jpg
86 | # identify container.png
87 | # exit
88 | ```
89 | 
90 | ## D.2: Deploy via file
91 | 
92 | ### Possible solution
93 | 
94 | With the image built above, save the container image to a tar archive:
95 | ```shell
96 | $ docker save my-convert-image --output my-convert-image.tar
97 | ```
98 | 
99 | Re-loading and running the image amounts to:
100 | ```shell
101 | $ docker load --input my-convert-image.tar
102 | $ docker run -it --rm --volume $PWD:/work --workdir /work my-convert-image
103 | # curl https://upload.wikimedia.org/wikipedia/commons/d/df/Container_01_KMJ.jpg -o container.jpg
104 | # convert container.jpg container.png
105 | # identify container.jpg
106 | # identify container.png
107 | # exit
108 | ```
109 | 
--------------------------------------------------------------------------------
/possible_solutions/05_homework_science_project/Dockerfile:
--------------------------------------------------------------------------------
 1 | FROM ubuntu:24.04
 2 | 
 3 | RUN apt update && apt install -y gfortran gnuplot-nox
 4 | 
 5 | # make source code available in the container
 6 | RUN mkdir -p /app
 7 | COPY create_data.F90 plot_data.gp run_everything.sh /app/
 8 | 
 9 | # compile the data creator
10 | RUN gfortran -o /app/create_data /app/create_data.F90
11 | 
12 | # command which is executed if the container is run without an explicit command
13 | CMD [ "bash", "/app/run_everything.sh" ]
--------------------------------------------------------------------------------
/possible_solutions/05_homework_science_project/README.md:
--------------------------------------------------------------------------------
 1 | # Minimal science project - One possible solution
 2 | 
 3 | We drive the whole workflow with one script [`run_everything.sh`](run_everything.sh).
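A possible way to build and run this solution (the image name `science-project` is just an example; the bind mount makes sure the `data/` directory ends up in your current directory on the host, while the ASCII plot is printed to the terminal):
```shell
docker build -t science-project .
docker run --rm --volume $PWD:/work --workdir /work science-project
```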
4 | 5 | The [`Dockerfile`](Dockerfile) installs all packages, compiles the "simulation" code [`create_data.F90`](create_data.F90) _at build time_, and then runs the whole workflow (create data and plot data) at _runtime_ as a `CMD`. 6 | -------------------------------------------------------------------------------- /possible_solutions/05_homework_science_project/run_everything.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | # parameters 4 | app_dir="/app/" 5 | data_path="data/" 6 | data_file="${data_path}/data.dat" 7 | 8 | # create data 9 | mkdir -p "${data_path}" 10 | "${app_dir}/create_data" > "${data_file}" 11 | 12 | # plot data 13 | gnuplot -c "${app_dir}/plot_data.gp" "${data_file}" --------------------------------------------------------------------------------