├── README.md
├── LICENSE
├── .gitignore
└── 01-MPI-monte-carlo-pi.ipynb

/README.md:
--------------------------------------------------------------------------------
1 | # example-jupyter-notebooks
2 | General example notebooks for Jupyter at NERSC
3 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | BSD 3-Clause License
2 |
3 | Copyright (c) 2018, The Regents of the University of California, through
4 | Lawrence Berkeley National Laboratory (subject to receipt of any required
5 | approvals from the U.S. Dept. of Energy). All rights reserved.
6 |
7 | Redistribution and use in source and binary forms, with or without
8 | modification, are permitted provided that the following conditions are met:
9 |
10 | * Redistributions of source code must retain the above copyright notice, this
11 | list of conditions and the following disclaimer.
12 |
13 | * Redistributions in binary form must reproduce the above copyright notice,
14 | this list of conditions and the following disclaimer in the documentation
15 | and/or other materials provided with the distribution.
16 |
17 | * Neither the name of the copyright holder nor the names of its
18 | contributors may be used to endorse or promote products derived from
19 | this software without specific prior written permission.
20 |
21 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
22 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
23 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
24 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
25 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
26 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
27 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
28 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
29 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
30 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
31 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 |
6 | # C extensions
7 | *.so
8 |
9 | # Distribution / packaging
10 | .Python
11 | build/
12 | develop-eggs/
13 | dist/
14 | downloads/
15 | eggs/
16 | .eggs/
17 | lib/
18 | lib64/
19 | parts/
20 | sdist/
21 | var/
22 | wheels/
23 | *.egg-info/
24 | .installed.cfg
25 | *.egg
26 | MANIFEST
27 |
28 | # PyInstaller
29 | # Usually these files are written by a python script from a template
30 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
31 | *.manifest
32 | *.spec
33 |
34 | # Installer logs
35 | pip-log.txt
36 | pip-delete-this-directory.txt
37 |
38 | # Unit test / coverage reports
39 | htmlcov/
40 | .tox/
41 | .coverage
42 | .coverage.*
43 | .cache
44 | nosetests.xml
45 | coverage.xml
46 | *.cover
47 | .hypothesis/
48 | .pytest_cache/
49 |
50 | # Translations
51 | *.mo
52 | *.pot
53 |
54 | # Django stuff:
55 | *.log
56 | local_settings.py
57 | db.sqlite3
58 |
59 | # Flask stuff:
60 | instance/
61 | .webassets-cache
62 |
63 | # Scrapy stuff:
64 | .scrapy
65 |
66 | # Sphinx documentation
67 | docs/_build/
68 |
69 | # PyBuilder
70 | target/
71 |
72 | # Jupyter Notebook
73 | .ipynb_checkpoints
74 |
75 | # pyenv
76 | .python-version
77 |
78 | # celery beat schedule file
79 | celerybeat-schedule
80 |
81 | # SageMath parsed files
82 | *.sage.py
83 |
84 | # Environments
85 | .env
86 | .venv
87 | env/
88 | venv/
89 | ENV/
90 | env.bak/
91 | venv.bak/
92 |
93 | # Spyder project settings
94 | .spyderproject
95 | .spyproject
96 |
97 | # Rope project settings
98 | .ropeproject
99 |
100 | # mkdocs documentation
101 | /site
102 |
103 | # mypy
104 | .mypy_cache/
105 |
--------------------------------------------------------------------------------
/01-MPI-monte-carlo-pi.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Using MPI from Jupyter at NERSC\n",
8 | "\n",
9 | "* This notebook shows how you can use MPI in a job allocation on Cori.\n",
10 | "* In this example the notebook itself is not running on a node in the job allocation.\n",
11 | "* It is running \"outside\" the compute nodes.\n",
12 | "\n",
13 | "## Overview\n",
14 | "\n",
15 | "* First we'll start up an ipyparallel cluster in a job allocation on Cori compute nodes.\n",
16 | "* Once the cluster is running, we will make a connection to its controller process.\n",
17 | "* We'll send code to the worker processes (engines) to execute in parallel.\n",
18 | "* The example is simple: using a Monte Carlo technique to estimate the value of pi."
19 | ]
20 | },
21 | {
22 | "cell_type": "markdown",
23 | "metadata": {},
24 | "source": [
25 | "## Enable interaction with the batch queue\n",
26 | "\n",
27 | "* We need to issue Slurm commands like `sbatch`, `squeue`, and `scancel` from our notebook\n",
28 | "* Enable this by loading the `slurm_magic` extension, which provides Slurm magic commands for Jupyter"
29 | ]
30 | },
31 | {
32 | "cell_type": "code",
33 | "execution_count": null,
34 | "metadata": {},
35 | "outputs": [],
36 | "source": [
37 | "%load_ext slurm_magic"
38 | ]
39 | },
40 | {
41 | "cell_type": "markdown",
42 | "metadata": {},
43 | "source": [
44 | "## Start an ipyparallel cluster on some compute nodes\n",
45 | "\n",
46 | "* To do this we'll submit a job using the `%%sbatch` cell magic\n",
47 | "* We'll use the submitted job's ID to make contact with the cluster from the notebook once it's up\n",
48 | "* We start an `ipcontroller` process that coordinates the worker processes\n",
49 | "* Finally we launch `ipengine` worker processes --- just Python processes waiting for input"
50 | ]
51 | },
52 | {
53 | "cell_type": "code",
54 | "execution_count": null,
55 | "metadata": {},
56 | "outputs": [],
57 | "source": [
58 | "%%sbatch\n",
59 | "#!/bin/bash\n",
60 | "#SBATCH --constraint=haswell\n",
61 | "#SBATCH --nodes=2\n",
62 | "#SBATCH --partition=debug\n",
63 | "#SBATCH --time=30\n",
64 | "\n",
65 | "module load python/3.6-anaconda-5.2\n",
66 | "\n",
67 | "# Get IP address of head node\n",
68 | "head_ip=$(ip addr show ipogif0 | grep '10\\.' | awk '{print $2}' | awk -F'/' '{print $1}')\n",
69 | "\n",
70 | "# Unique cluster ID for this job\n",
71 | "cluster_id=cori_${SLURM_JOB_ID}\n",
72 | "\n",
73 | "# Cluster controller\n",
74 | "ipcontroller --ip=\"$head_ip\" --cluster-id=$cluster_id &\n",
75 | "sleep 10\n",
76 | "\n",
77 | "# Compute engines\n",
78 | "srun -u -n 32 ipengine --cluster-id=$cluster_id"
79 | ]
80 | },
81 | {
82 | "cell_type": "markdown",
83 | "metadata": {},
84 | "source": [
85 | "## Get the cluster ID from the job ID"
86 | ]
87 | },
88 | {
89 | "cell_type": "code",
90 | "execution_count": null,
91 | "metadata": {},
92 | "outputs": [],
93 | "source": [
94 | "job_id = _.split()[-1]  # _ holds the %%sbatch output: 'Submitted batch job <jobid>'\n",
95 | "job_id"
96 | ]
97 | },
98 | {
99 | "cell_type": "code",
100 | "execution_count": null,
101 | "metadata": {},
102 | "outputs": [],
103 | "source": [
104 | "cluster_id = \"cori_\" + job_id\n",
105 | "cluster_id"
106 | ]
107 | },
108 | {
109 | "cell_type": "markdown",
110 | "metadata": {},
111 | "source": [
112 | "## Is the job running yet?"
113 | ]
114 | },
115 | {
116 | "cell_type": "code",
117 | "execution_count": null,
118 | "metadata": {},
119 | "outputs": [],
120 | "source": [
121 | "%squeue -u rthomas"
122 | ]
123 | },
124 | {
125 | "cell_type": "markdown",
126 | "metadata": {},
127 | "source": [
128 | "## Establish a connection from the notebook to the compute nodes"
129 | ]
130 | },
131 | {
132 | "cell_type": "code",
133 | "execution_count": null,
134 | "metadata": {},
135 | "outputs": [],
136 | "source": [
137 | "import ipyparallel as ipp\n",
138 | "c = ipp.Client(timeout=60, cluster_id=cluster_id)\n",
139 | "\", \".join(str(i) for i in c.ids)"
140 | ]
141 | },
142 | {
143 | "cell_type": "markdown",
144 | "metadata": {},
145 | "source": [
146 | "## Initialize MPI\n",
147 | "\n",
148 | "* With `mpi4py` the import actually calls `MPI_Init()` under the hood\n",
149 | "* The `%%px` cell magic means \"execute this cell on all the workers\""
150 | ]
151 | },
152 | {
153 | "cell_type": "code",
154 | "execution_count": null,
155 | "metadata": {},
156 | "outputs": [],
157 | "source": [
158 | "%%px\n",
159 | "from mpi4py import MPI"
160 | ]
161 | },
162 | {
163 | "cell_type": "markdown",
164 | "metadata": {},
165 | "source": [
166 | "## The Customary MPI \"Hello World\""
167 | ]
168 | },
169 | {
170 | "cell_type": "code",
171 | "execution_count": null,
172 | "metadata": {},
173 | "outputs": [],
174 | "source": [
175 | "%%px\n",
176 | "print(\"Hello world from rank\", MPI.COMM_WORLD.rank, \"of\", MPI.COMM_WORLD.size)"
177 | ]
178 | },
179 | {
180 | "cell_type": "markdown",
181 | "metadata": {},
182 | "source": [
183 | "## Let's do something more interesting: estimate pi\n",
184 | "\n",
185 | "* Let's import a couple more packages we need\n",
186 | "* Remember, all the workers need to do this"
187 | ]
188 | },
189 | {
190 | "cell_type": "code",
191 | "execution_count": null,
192 | "metadata": {},
193 | "outputs": [],
194 | "source": [
195 | "%%px\n",
196 | "import random\n",
197 | "import math"
198 | ]
199 | },
200 | {
201 | "cell_type": "markdown",
202 | "metadata": {},
203 | "source": [
204 | "## Set up a \"dart board\" function\n",
205 | "\n",
206 | "* Consider a unit square with a quarter-circle centered at the lower left-hand corner\n",
207 | "* Distribute \"darts\" uniformly over the unit square\n",
208 | "* Count how many \"darts\" land inside the quarter circle\n",
209 | "* Since the quarter circle covers pi/4 of the unit square, the fraction of \"darts\" that land inside approaches pi/4\n",
210 | "* The answer gets better the more \"darts\" are thrown"
211 | ]
212 | },
213 | {
214 | "cell_type": "code",
215 | "execution_count": null,
216 | "metadata": {},
217 | "outputs": [],
218 | "source": [
219 | "%%px\n",
220 | "def work(trials):\n",
221 | "    count = 0.0\n",
222 | "    for i in range(trials):\n",
223 | "        x = random.random()\n",
224 | "        y = random.random()\n",
225 | "        if x ** 2 + y ** 2 < 1.0:\n",
226 | "            count += 1.0\n",
227 | "    return count"
228 | ]
229 | },
230 | {
231 | "cell_type": "markdown",
232 | "metadata": {},
233 | "source": [
234 | "## Now the MPI part\n",
235 | "\n",
236 | "* Each MPI rank initializes its own random number generator and throws the same number of \"darts\"\n",
237 | "* The number of \"darts\" each rank lands inside the quarter circle is `MPI_Gather`ed\n",
238 | "* MPI rank 0 tallies them up, divides by the total number of \"darts\" thrown, and multiplies by 4 to estimate pi\n",
239 | "* Finally we output the answer and see how well we did"
240 | ]
241 | },
242 | {
243 | "cell_type": "code",
244 | "execution_count": null,
245 | "metadata": {},
246 | "outputs": [],
247 | "source": [
248 | "%%px\n",
249 | "def compute_pi(trials):\n",
250 | "\n",
251 | "    # Useful MPI information\n",
252 | "\n",
253 | "    mpi_comm = MPI.COMM_WORLD\n",
254 | "    mpi_size = mpi_comm.size\n",
255 | "    mpi_rank = mpi_comm.rank\n",
256 | "    mpi_root = mpi_rank == 0\n",
257 | "\n",
258 | "    # Initialize random number generator\n",
259 | "\n",
260 | "    random.seed(192837465 + mpi_rank)\n",
261 | "\n",
262 | "    # Distribute work, gather results, and report answer\n",
263 | "\n",
264 | "    count = work(trials)\n",
265 | "    count = mpi_comm.gather(count, root=0)\n",
266 | "    if mpi_root:\n",
267 | "        total_count = sum(count)\n",
268 | "        total_trials = mpi_size * trials\n",
269 | "        estimated_pi = 4.0 * total_count / total_trials\n",
270 | "        print(\"Total Count: \", total_count)\n",
271 | "        print(\"Total Trials: \", total_trials)\n",
272 | "        print(\"Estimate of pi:\", estimated_pi, math.pi, abs(estimated_pi - math.pi))"
273 | ]
274 | },
275 | {
276 | "cell_type": "markdown",
277 | "metadata": {},
278 | "source": [
279 | "## Time to compute pi!"
280 | ]
281 | },
282 | {
283 | "cell_type": "code",
284 | "execution_count": null,
285 | "metadata": {},
286 | "outputs": [],
287 | "source": [
288 | "%%px\n",
289 | "compute_pi(10000000)"
290 | ]
291 | },
292 | {
293 | "cell_type": "markdown",
294 | "metadata": {},
295 | "source": [
296 | "## Example of running on just one rank (still on a compute node, not in the notebook)"
297 | ]
298 | },
299 | {
300 | "cell_type": "code",
301 | "execution_count": null,
302 | "metadata": {},
303 | "outputs": [],
304 | "source": [
305 | "%%px --targets 0\n",
306 | "work(10000000)"
307 | ]
308 | },
309 | {
310 | "cell_type": "markdown",
311 | "metadata": {},
312 | "source": [
313 | "## Cancel the job --- our notebook stays up"
314 | ]
315 | },
316 | {
317 | "cell_type": "code",
318 | "execution_count": null,
319 | "metadata": {},
320 | "outputs": [],
321 | "source": [
322 | "%scancel $job_id"
323 | ]
324 | }
325 | ],
326 | "metadata": {
327 | "kernelspec": {
328 | "display_name": "Python 3",
329 | "language": "python",
330 | "name": "python3"
331 | },
332 | "language_info": {
333 | "codemirror_mode": {
334 | "name": "ipython",
335 | "version": 3
336 | },
337 | "file_extension": ".py",
338 | "mimetype": "text/x-python",
339 | "name": "python",
340 | "nbconvert_exporter": "python",
341 | "pygments_lexer": "ipython3",
342 | "version": "3.7.0"
343 | }
344 | },
345 | "nbformat": 4,
346 | "nbformat_minor": 2
347 | }
348 |
--------------------------------------------------------------------------------