├── .gitignore
├── challenge-exercises.rst
├── README.md
├── LICENSE
└── docker-intro.rst
/.gitignore:
--------------------------------------------------------------------------------
1 | *~
2 |
--------------------------------------------------------------------------------
/challenge-exercises.rst:
--------------------------------------------------------------------------------
1 | Some "challenge exercises"
2 | ==========================
3 |
4 | Here are some ideas for projects that are a little less scripted than
5 | `the introduction to Docker on Linux <./docker-intro.rst>`__.
6 |
7 | 0. There's a set of `challenge exercises at the bottom of the introduction <./docker-intro.rst#challenge-exercises>`__.
8 |
9 | 1. Install Docker on your local computer (Mac, Windows, or Linux) and
10 | run through the introduction yourself.
11 |
12 | 2. Make a Docker container for a command-line package that you use.
13 |
14 | 3. Create a Docker Hub account and push the Docker container you built
15 | to the hub. Try running it on someone else's computer (or on
16 | another/new AWS machine).
17 |
18 | More advanced ideas
19 | -------------------
20 |
21 | * Try out `bioboxes `__.
22 |
23 | * Take a look at `mybinder.org `__, which takes a
24 | git repo and turns it into a running data analysis environment.
25 |
26 | * Try out `docker-machine
27 | `__.
28 |
29 | * Try out `Carina `__.
30 |
31 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Docker Hands-on materials
2 |
3 | ## Nov 7-8, 2015
4 |
5 | These are the materials for [a Docker hands-on
6 | workshop](http://dib-training.readthedocs.org/en/pub/2015-11-09-docker.html)
7 | that we ran at UC Davis in November 2015.
8 |
9 | ----
10 |
11 | 1. [An introduction to running Docker on Linux](./docker-intro.rst)
12 |
13 | Here is an introduction to running Docker on an Ubuntu machine. For
14 | sanity's sake, we use an Amazon Web Services machine, but it should work
15 | on any reasonably recent Ubuntu machine (14.04 or later).
16 |
17 | 2. [Challenge exercises](./challenge-exercises.rst)
18 |
19 | For people who want to try some things out on their own, here are some
20 | suggestions for getting started.
21 |
22 | 3. [Installing bioboxes](http://bioboxes.org/docs/how-to-install/) and using bioboxes to [assemble a genome and evaluate the assembly](http://bioboxes.org/docs/assemble-a-genome/).
23 |
24 | ----
25 |
26 | Other links:
27 |
28 | * Lisa Cohen's [blog post recording our live-coding session for Dockerfiles](https://monsterbashseq.wordpress.com/2015/11/10/uc-davis-docker-workshop-live-coding/)
29 |
30 | * Titus Brown's [repository containing Dockerfiles for khmer and Salmon](https://github.com/ctb/2015-docker-building), together with notes on docker-machine and data volumes (see [the dammit directory](https://github.com/ctb/2015-docker-building/tree/master/dammit)).
31 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | CC0 1.0 Universal
2 |
3 | Statement of Purpose
4 |
5 | The laws of most jurisdictions throughout the world automatically confer
6 | exclusive Copyright and Related Rights (defined below) upon the creator and
7 | subsequent owner(s) (each and all, an "owner") of an original work of
8 | authorship and/or a database (each, a "Work").
9 |
10 | Certain owners wish to permanently relinquish those rights to a Work for the
11 | purpose of contributing to a commons of creative, cultural and scientific
12 | works ("Commons") that the public can reliably and without fear of later
13 | claims of infringement build upon, modify, incorporate in other works, reuse
14 | and redistribute as freely as possible in any form whatsoever and for any
15 | purposes, including without limitation commercial purposes. These owners may
16 | contribute to the Commons to promote the ideal of a free culture and the
17 | further production of creative, cultural and scientific works, or to gain
18 | reputation or greater distribution for their Work in part through the use and
19 | efforts of others.
20 |
21 | For these and/or other purposes and motivations, and without any expectation
22 | of additional consideration or compensation, the person associating CC0 with a
23 | Work (the "Affirmer"), to the extent that he or she is an owner of Copyright
24 | and Related Rights in the Work, voluntarily elects to apply CC0 to the Work
25 | and publicly distribute the Work under its terms, with knowledge of his or her
26 | Copyright and Related Rights in the Work and the meaning and intended legal
27 | effect of CC0 on those rights.
28 |
29 | 1. Copyright and Related Rights. A Work made available under CC0 may be
30 | protected by copyright and related or neighboring rights ("Copyright and
31 | Related Rights"). Copyright and Related Rights include, but are not limited
32 | to, the following:
33 |
34 | i. the right to reproduce, adapt, distribute, perform, display, communicate,
35 | and translate a Work;
36 |
37 | ii. moral rights retained by the original author(s) and/or performer(s);
38 |
39 | iii. publicity and privacy rights pertaining to a person's image or likeness
40 | depicted in a Work;
41 |
42 | iv. rights protecting against unfair competition in regards to a Work,
43 | subject to the limitations in paragraph 4(a), below;
44 |
45 | v. rights protecting the extraction, dissemination, use and reuse of data in
46 | a Work;
47 |
48 | vi. database rights (such as those arising under Directive 96/9/EC of the
49 | European Parliament and of the Council of 11 March 1996 on the legal
50 | protection of databases, and under any national implementation thereof,
51 | including any amended or successor version of such directive); and
52 |
53 | vii. other similar, equivalent or corresponding rights throughout the world
54 | based on applicable law or treaty, and any national implementations thereof.
55 |
56 | 2. Waiver. To the greatest extent permitted by, but not in contravention of,
57 | applicable law, Affirmer hereby overtly, fully, permanently, irrevocably and
58 | unconditionally waives, abandons, and surrenders all of Affirmer's Copyright
59 | and Related Rights and associated claims and causes of action, whether now
60 | known or unknown (including existing as well as future claims and causes of
61 | action), in the Work (i) in all territories worldwide, (ii) for the maximum
62 | duration provided by applicable law or treaty (including future time
63 | extensions), (iii) in any current or future medium and for any number of
64 | copies, and (iv) for any purpose whatsoever, including without limitation
65 | commercial, advertising or promotional purposes (the "Waiver"). Affirmer makes
66 | the Waiver for the benefit of each member of the public at large and to the
67 | detriment of Affirmer's heirs and successors, fully intending that such Waiver
68 | shall not be subject to revocation, rescission, cancellation, termination, or
69 | any other legal or equitable action to disrupt the quiet enjoyment of the Work
70 | by the public as contemplated by Affirmer's express Statement of Purpose.
71 |
72 | 3. Public License Fallback. Should any part of the Waiver for any reason be
73 | judged legally invalid or ineffective under applicable law, then the Waiver
74 | shall be preserved to the maximum extent permitted taking into account
75 | Affirmer's express Statement of Purpose. In addition, to the extent the Waiver
76 | is so judged Affirmer hereby grants to each affected person a royalty-free,
77 | non transferable, non sublicensable, non exclusive, irrevocable and
78 | unconditional license to exercise Affirmer's Copyright and Related Rights in
79 | the Work (i) in all territories worldwide, (ii) for the maximum duration
80 | provided by applicable law or treaty (including future time extensions), (iii)
81 | in any current or future medium and for any number of copies, and (iv) for any
82 | purpose whatsoever, including without limitation commercial, advertising or
83 | promotional purposes (the "License"). The License shall be deemed effective as
84 | of the date CC0 was applied by Affirmer to the Work. Should any part of the
85 | License for any reason be judged legally invalid or ineffective under
86 | applicable law, such partial invalidity or ineffectiveness shall not
87 | invalidate the remainder of the License, and in such case Affirmer hereby
88 | affirms that he or she will not (i) exercise any of his or her remaining
89 | Copyright and Related Rights in the Work or (ii) assert any associated claims
90 | and causes of action with respect to the Work, in either case contrary to
91 | Affirmer's express Statement of Purpose.
92 |
93 | 4. Limitations and Disclaimers.
94 |
95 | a. No trademark or patent rights held by Affirmer are waived, abandoned,
96 | surrendered, licensed or otherwise affected by this document.
97 |
98 | b. Affirmer offers the Work as-is and makes no representations or warranties
99 | of any kind concerning the Work, express, implied, statutory or otherwise,
100 | including without limitation warranties of title, merchantability, fitness
101 | for a particular purpose, non infringement, or the absence of latent or
102 | other defects, accuracy, or the present or absence of errors, whether or not
103 | discoverable, all to the greatest extent permissible under applicable law.
104 |
105 | c. Affirmer disclaims responsibility for clearing rights of other persons
106 | that may apply to the Work or any use thereof, including without limitation
107 | any person's Copyright and Related Rights in the Work. Further, Affirmer
108 | disclaims responsibility for obtaining any necessary consents, permissions
109 | or other rights required for any use of the Work.
110 |
111 | d. Affirmer understands and acknowledges that Creative Commons is not a
112 | party to this document and has no duty or obligation with respect to this
113 | CC0 or use of the Work.
114 |
115 | For more information, please see
116 |
117 |
118 |
--------------------------------------------------------------------------------
/docker-intro.rst:
--------------------------------------------------------------------------------
1 | =================================
2 | A hands-on introduction to Docker
3 | =================================
4 |
5 | :author: \C. Titus Brown, titus@idyll.org
6 | :license: CC0
7 | :date: Nov 7, 2015
8 |
9 | Introduction and goals
10 | ======================
11 |
12 | Docker is a mechanism for building and running isolated "containers"
13 | of software. Docker containers act much like virtual machines but are
14 | smaller and more flexible than VMs. The Docker culture and ecosystem
15 | also enhance Docker's potential for aiding in reproducible computing.
16 |
17 | Below, we'll show you how to install Docker on an EC2 instance, use an
18 | existing Docker container, and build your own Docker container.
19 |
20 | .. contents::
21 |
22 | Getting started with Docker
23 | ===========================
24 |
25 | Install Docker
26 | --------------
27 |
28 | Start up an EC2 instance running blank Ubuntu 14.04
29 | (see http://angus.readthedocs.org/en/2015/amazon).
30 |
31 | Then, install Docker::
32 |
33 | wget -qO- https://get.docker.com/ | sudo sh
34 |
35 | (This will take about 5 minutes.)
36 |
37 | Now, configure the default user ('ubuntu') to use Docker::
38 |
39 | sudo usermod -aG docker ubuntu
40 |
41 | and log out and log back in.
42 |
43 | Run Docker
44 | ----------
45 |
46 | The following command will start up a blank Ubuntu 14.04 docker container::
47 |
48 | docker run -it ubuntu:14.04
49 |
50 | (If you get the message ``Post
51 | http:///var/run/docker.sock/v1.20/containers/create: dial unix
52 | /var/run/docker.sock: permission denied.`` then you need to log out
53 | and log back in.)
54 |
55 | This command will spit out a fair bit of output - what it's doing (the
56 | first time you run it) is going out to `the docker hub
57 | `__ and downloading the Ubuntu 14.04 image to
58 | your EC2 instance.
59 |
60 | You should end up at a prompt that looks like this:
61 | ``root@77e00211fef4:/# ``. Unlike your previous prompt (which on EC2
62 | defaults to ending in a ``$ ``), *this* prompt has placed you inside
63 | your running Docker container. This container is running *inside*
64 | your other Ubuntu machine, and its file system and process space are
65 | completely isolated from the "parent" machine. Note in particular that
66 | you are 'root' inside the Docker container, while you're still user 'ubuntu'
67 | on the AWS machine.
68 |
69 | This is a blank Ubuntu machine. You can play around in here a bit, if you
70 | want, to verify this.
71 |
72 | Now, exit by typing::
73 |
74 | exit
75 |
76 | This will place you back at your EC2 prompt.
77 |
78 | At this point your docker container is shut down. Importantly,
79 | everything you did to the file system in the container is basically
80 | gone - changes made in a container don't affect the image from which
81 | it derives. You can verify this by re-running
82 | ``docker run -it ubuntu:14.04``, adding a file, and then exiting; if
83 | you run the same image again, the new container's file system will be
84 | missing the added file.
85 |
86 |
87 | Cleaning up
88 | -----------
89 |
90 | By default, docker saves a record of containers that have been run, which you can see with ``docker ps -a``. If you later wish to delete an image, docker will complain if any dependent containers still exist. Here's a remedy::
91 |
92 | docker stop $(docker ps -a -q) # needed if you have running containers
93 | docker rm $(docker ps -a -q)
94 |
95 | Alternatively, one can add the ``--rm`` flag when running interactively to avoid having to remove the containers later.
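
One wrinkle with the remedy above: when no containers exist, ``$(docker ps -a -q)`` expands to nothing and ``docker rm`` complains that it needs at least one argument. A minimal sketch of a guard follows; the function name ``remove_all_containers`` is made up for illustration and is not part of Docker.

```shell
# Hypothetical helper: stop and remove every container, but only if any exist.
remove_all_containers() {
    ids=$(docker ps -a -q)
    if [ -z "$ids" ]; then
        echo "no containers to remove"
        return 0
    fi
    # $ids is left unquoted on purpose so each ID becomes its own argument.
    docker stop $ids
    docker rm $ids
}
```

Paste the function at the EC2 prompt and run ``remove_all_containers`` when you want a clean slate.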
96 |
97 | Persisting changes
98 | ------------------
99 |
100 | See ``docker
101 | commit`` and the `Docker image docs
102 | `__ for more info
103 | on building images, or go on to the next section.
104 |
105 | Building images
106 | ===============
107 |
108 | The image we ran above is named ``ubuntu:14.04``, which combines a
109 | repository name for the OS (``ubuntu``) with a tag for the particular
110 | version (``14.04``). It doesn't contain anything particularly useful,
111 | though. What if you wanted to build your own Docker image
112 | with some more software installed? We'll do that next.
113 |
114 | Build a Docker image for MEGAHIT, interactively
115 | -----------------------------------------------
116 |
117 | Let's build a Docker image for the MEGAHIT short-read assembler.
118 | (This is not the right way to do it in general, and we'll do it the
119 | Right Way with a Dockerfile, below.) This is all based on the
120 | `Assembling E. coli tutorial
121 | `__.
122 |
123 | Start up a new container::
124 |
125 | docker run -it ubuntu:14.04
126 |
127 | This completes quite quickly, because you've already downloaded everything.
128 |
129 | Now, **in this new container**, run the commands necessary to build
130 | and run MEGAHIT:
131 |
132 | First, update the base software and install g++, make, git, zlib, and Python::
133 |
134 | apt-get update && apt-get install -y g++ make git zlib1g-dev python
135 |
136 | Then check out and build megahit::
137 |
138 | git clone https://github.com/voutcn/megahit.git /home/megahit
139 | cd /home/megahit && make
140 |
141 | So, now we have megahit built! On our docker container! But we face
142 | two problems:
143 |
144 | * that took a while, and we'd probably rather not do it again; but the docker
145 | container is going to go away as soon as we exit! Wouldn't it be nice
146 | to be able to package this for others?
147 |
148 | * the docker container is disconnected from the underlying machine, so we
149 | have no way of accessing any data! How can we connect it to some data?
150 |
151 | Let's take these two problems on separately - we'll start with the
152 | first problem, by saving the docker container to an image that we can
153 | re-run.
154 |
155 | ----
156 |
157 | To save the docker container to an image, we need to reference the
158 | docker container somehow. This is done by taking note of the
159 | container ID; it's the string between the '@' and the ':' in the
160 | command prompt, so, for a command prompt like ``root@fa1bf23148a5:``,
161 | it would be ``fa1bf23148a5``. Copy this information somewhere (into
162 | an e-mail or something). Then, exit the container::
163 |
164 | exit
165 |
166 | Now you'll be back at the ``ubuntu`` prompt. To commit a copy of
167 | the container above to a docker image, type::
168 |
169 | docker commit -m "built megahit" fa1bf23148a5 megahit_ctr
170 |
171 | but replacing ``fa1bf23148a5`` with your docker container ID.
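
If you'd rather not copy the ID by eye, the same "between the '@' and the ':'" rule can be applied with shell parameter expansion. This is only a sketch; the prompt string below is the example from above, pasted in by hand.

```shell
# Prompt text copied from inside the container (example value from above).
prompt='root@fa1bf23148a5:/# '

# Strip everything up to and including the '@' ...
container_id=${prompt#*@}
# ... then everything from the first ':' onward.
container_id=${container_id%%:*}

echo "$container_id"
```

The resulting ``$container_id`` can then be used in place of the literal ID in the ``docker commit`` command.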
172 |
173 | This creates a new image named 'megahit_ctr' that contains all of your changes
174 | above. If you run::
175 |
176 | docker images
177 |
178 | you should see something like::
179 |
180 | | REPOSITORY          TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
181 | | megahit_ctr         latest              749fd74397ed        29 seconds ago      427.5 MB
182 | | ubuntu              14.04               91e54dfb1179        3 days ago          188.4 MB
183 |
184 | Now, to run the megahit image, you can type::
185 |
186 | docker run -it megahit_ctr
187 |
188 | and (inside the docker container, which will have a new container ID) you can
189 | run::
190 |
191 | /home/megahit/megahit
192 |
193 | to verify that you still have megahit installed and running. And
194 | voila! You've created your own container! (If you want to make this
195 | available to everyone, go check out `the Docker hub
196 | `__.)
197 |
198 | Connecting a Docker container to some external data
199 | ---------------------------------------------------
200 |
201 | Now that we can run and rerun the megahit-installed container to our heart's
202 | content, we still have to figure out how to connect it to some data. How??
203 |
204 | Well, first, let's download some data to our EC2 instance.
205 |
206 | Make sure you're at the ``ubuntu@`` prompt, by typing ``exit`` if necessary.
207 |
208 | Now execute::
209 |
210 | cd
211 | mkdir data
212 | cd data
213 | wget http://public.ged.msu.edu.s3.amazonaws.com/ecoli_ref-5m-trim.se.fq.gz
214 | wget http://public.ged.msu.edu.s3.amazonaws.com/ecoli_ref-5m-trim.pe.fq.gz
215 |
216 | This downloads those two data files into the ``data`` directory in your home
217 | directory -- these are E. coli short-read data from Chitsaz et al., 2011.
218 |
219 | Now, run your ``megahit_ctr`` image, and connect /home/ubuntu/data/ to /mydata
220 | in the container::
221 |
222 | docker run -v /home/ubuntu/data:/mydata \
223 | -it megahit_ctr
224 |
225 | This will "mount" your data from /home/ubuntu/data on the Docker container,
226 | and connect it to the '/mydata' directory in your container. Type::
227 |
228 | ls /mydata
229 |
230 | to verify that you see these files.
231 |
232 | Now, let's assemble! ::
233 |
234 | /home/megahit/megahit --12 /mydata/*.pe.fq.gz \
235 | -r /mydata/*.se.fq.gz \
236 | -o /mydata/ecoli -t 4
237 |
238 | Now, exit your docker container with ``exit`` and look at your data directory::
239 |
240 | ls /home/ubuntu/data
241 |
242 | You should see the /home/ubuntu/data/ecoli directory with the assembly in it::
243 |
244 | ls /home/ubuntu/data/ecoli
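
For a quick sanity check on the assembly, you can count the FASTA records in the output. The sketch below uses a tiny made-up FASTA file so it runs anywhere; the function name ``count_contigs`` is hypothetical.

```shell
# Count FASTA records: each one starts with a '>' header line.
count_contigs() {
    grep -c '^>' "$1"
}

# Tiny made-up FASTA file so the function can be tried without an assembly.
demo=$(mktemp)
printf '>contig_1\nACGT\n>contig_2\nGGCC\n' > "$demo"
n_contigs=$(count_contigs "$demo")
echo "$n_contigs contigs"
rm -f "$demo"
```

On the real output you would point it at the contigs file in /home/ubuntu/data/ecoli. The name ``final.contigs.fa`` is an assumption about MEGAHIT's output layout, so check the ``ls`` listing above first.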
245 |
246 | Running it all in one
247 | ---------------------
248 |
249 | You might think, "hey, wouldn't it be nice to be able to run all of
250 | this in one command, rather than starting a docker container and
251 | then running it from the command line in there?" Yep. Run this::
252 |
253 | docker run -v /home/ubuntu/data:/mydata \
254 | -it megahit_ctr \
255 | sh -c '/home/megahit/megahit --12 /mydata/*.pe.fq.gz \
256 | -r /mydata/*.se.fq.gz \
257 | -o /mydata/ecoli -t 4'
258 |
259 | Basically, everything after the image name gets passed into the Docker
260 | container to be executed. You have to use the 'sh -c' stuff because otherwise
261 | ``/mydata/*.se.fq.gz`` gets interpreted on your EC2 machine and not in your
262 | Docker container.
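
The quoting rule can be tried without Docker. In this sketch two scratch directories stand in for the EC2 machine and the container, and the file name ``reads.se.fq.gz`` is made up; the point is only that a wildcard is expanded by whichever shell first sees it unquoted.

```shell
host_dir=$(mktemp -d)   # stands in for a directory on the EC2 machine
ctr_dir=$(mktemp -d)    # stands in for /mydata inside the container
touch "$ctr_dir/reads.se.fq.gz"
cd "$host_dir"

# Unquoted: this shell tries to expand the pattern here, finds no match,
# and passes the pattern along as a literal string.
outer=$(echo *.se.fq.gz)

# Single-quoted and handed to a second shell running in the other directory:
# the expansion happens there, where the file exists.
inner=$(cd "$ctr_dir" && sh -c 'echo *.se.fq.gz')

echo "outer shell saw: $outer"
echo "inner shell saw: $inner"
```

The single quotes around the ``sh -c`` argument play the same role in the ``docker run`` command: they keep the wildcard intact until the shell inside the container receives it.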
263 |
264 | But... this is kind of long and annoying. Wouldn't it be nice to have this
265 | in a shell script? Yes, yes, it would. Let's put it in a shell script
266 | in the 'data' directory, and then run *that*.
267 |
268 | First, put the command in a shell script::
269 |
270 | cd /home/ubuntu/data
271 | cat <<'EOF' > do-assemble.sh
272 | #! /bin/bash
273 | rm -fr /mydata/ecoli
274 | /home/megahit/megahit --12 /mydata/*.pe.fq.gz \
275 | -r /mydata/*.se.fq.gz \
276 | -o /mydata/ecoli -t 4
277 | EOF
278 | chmod +x do-assemble.sh
279 |
280 | and then run the shell script inside of Docker::
281 |
282 | docker run -v /home/ubuntu/data:/mydata \
283 | -it megahit_ctr /mydata/do-assemble.sh
284 |
285 | and voila!
286 |
287 | One thing to note here is that we've placed the ``do-assemble.sh`` script on
288 | the EC2 machine, rather than in the Docker container. You can do it either
289 | way, but in this case it was more convenient to do it this way because
290 | we'd already created the image and I didn't want to have to create a
291 | new one. To do it the other way, the only change needed is to put the script
292 | in ``/home`` on the docker image (because that's the local disk), instead of
293 | ``/mydata`` (which is the mounted volume).
294 |
295 | Building an image with a Dockerfile
296 | -----------------------------------
297 |
298 | The image above was constructed by running a bunch of commands. Wouldn't
299 | it be nice if we could give Docker a bunch of commands and tell *it* to
300 | build an image *for us*?
301 |
302 | You can do that with a Dockerfile, which is the Right Way to build an image.
303 |
304 | Let's encode the commands above in a Dockerfile::
305 |
306 | mkdir /home/ubuntu/make_megahit
307 | cd /home/ubuntu/make_megahit
308 | cat <<'EOF' > Dockerfile
309 | FROM ubuntu:14.04
310 | RUN apt-get update
311 | RUN apt-get install -y g++ make git zlib1g-dev python
312 | RUN git clone https://github.com/voutcn/megahit.git /home/megahit
313 | RUN cd /home/megahit && make
314 | CMD /mydata/do-assemble.sh
315 | EOF
316 |
317 | Let's look at this Dockerfile before running it::
318 |
319 | cat Dockerfile
320 |
321 | The ``FROM`` instruction tells Docker what base image to start from; the ``RUN``
322 | instructions tell Docker what to execute (and then save the results from);
323 | and ``CMD`` specifies the default command - the command that is
324 | run if no other command is given.
325 |
326 | Let's build a Docker image from this and see what happens! ::
327 |
328 | docker build -t megahit_ctr2 .
329 |
330 | (This will take a few minutes.)
331 |
332 | Once it's built, you can now run it like so::
333 |
334 | docker run -v /home/ubuntu/data:/mydata -it megahit_ctr2
335 |
336 | ...and voila!
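
Because ``CMD`` is only a default, anything you put after the image name replaces it. The sketch below is a dry run: the helper ``megahit_cmd`` is hypothetical and only prints the ``docker run`` command it would use, so you can compare the two forms before running either.

```shell
# Hypothetical dry-run helper: prints the docker command instead of running it.
megahit_cmd() {
    echo docker run -v /home/ubuntu/data:/mydata -it megahit_ctr2 "$@"
}

default_run=$(megahit_cmd)              # no arguments: the image's CMD runs
override_run=$(megahit_cmd ls /mydata)  # arguments replace the CMD

echo "$default_run"
echo "$override_run"
```

Running the second printed command should list the mounted data directory instead of starting the assembly.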
337 |
338 | If you wanted to make this broadly available, the next steps
339 | would be to log into the Docker hub and push it; I did so with
340 | these commands: ``docker login``, ``docker build -t titus/megahit .``,
341 | and ``docker push titus/megahit``.
342 |
343 | You can run *my* version of all of this with::
344 |
345 | docker run -v /home/ubuntu/data:/data -it titus/megahit
346 |
347 | and -- here's the super neat thing -- you don't need to repeat any of
348 | the above, other than installing Docker itself and downloading the data!
349 |
350 | Summary points
351 | ==============
352 |
353 | * Docker provides a nice way to bundle multiple packages of software
354 | together, for both yourself and for others to run.
355 |
356 | * Docker gives you a good way to isolate what you're running from the
357 | data you're running it on.
358 |
359 | * The Dockerfile enhances reproducibility by giving explicit instructions
360 | for what to install, rather than simply bundling it all in a binary.
361 |
362 | Challenge exercises
363 | ===================
364 |
365 | * Create a new image ``megahit2`` where the do-assemble.sh script
366 | created above is saved in /home on the image itself, rather than
367 | in /mydata.
368 |
369 | * Create a container that has both MEGAHIT and Quast installed; see
370 | `this page `__
371 | for Quast install instructions.
372 |
373 | * Modify the Docker run script to also run Quast on the MEGAHIT
374 | assembly.
375 |
376 | * Install docker on your local computer, and run the 'titus/megahit' image
377 | there.
378 |
379 | More reading
380 | ============
381 |
382 | `Docker has a lot of docs `__.
383 |
384 | Docker was used `to make a GigaScience paper completely reproducible `__. (I've `written about this idea `__ too.)
385 |
386 | `Binary containers can be bad for science `__.
387 |
388 | Dealing with data is `still complicated `__, but
389 | `the landscape is changing fast `__.
390 |
391 | `The impact of Docker containers on the performance of genomic pipelines `__, Di Tommaso et al., 2015 (PeerJ preprint).
392 |
--------------------------------------------------------------------------------