├── .gitignore
├── binder
│   ├── start
│   └── environment.yml
├── environment.yml
├── README.md
├── 05-bag.ipynb
├── 06-schedulers.ipynb
├── 07-ML.ipynb
├── 03-array.ipynb
└── 04-delayed.ipynb


/.gitignore:
--------------------------------------------------------------------------------
1 | .ipynb_checkpoints
2 | .DS_Store
3 | dask-worker-space
4 | 


--------------------------------------------------------------------------------
/binder/start:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | 
3 | # Replace DASK_DASHBOARD_URL with the proxy location
4 | sed -i -e "s|DASK_DASHBOARD_URL|${JUPYTERHUB_BASE_URL}user/${JUPYTERHUB_USER}/proxy/8787|g" binder/jupyterlab-workspace.json
5 | 
6 | exec "$@"
7 | 


--------------------------------------------------------------------------------
/environment.yml:
--------------------------------------------------------------------------------
 1 | name: talkpython-dask
 2 | channels:
 3 |   - conda-forge
 4 | dependencies:
 5 |   - python=3.8
 6 |   - nodejs
 7 |   - dask>=2021.3.0
 8 |   - dask-ml>=1.7.0
 9 |   - distributed>=2021.3.0
10 |   - jupyterlab>=3.0
11 |   - notebook
12 |   - pandas>=1.0.1
13 |   - numpy>=1.19.2
14 |   - scipy>=1.4.1
15 |   - scikit-learn>=0.22.1
16 |   - scikit-image>=0.15.0
17 |   - ipywidgets>=7.5
18 |   - bokeh>=2.3.0
19 |   - pip>=20.3.0
20 |   - pip:
21 |     - dask-labextension>=3.0.0
22 |     - coiled
23 |   - python-graphviz
24 |   - h5py
25 |   - mimesis
26 | 


--------------------------------------------------------------------------------
/binder/environment.yml:
--------------------------------------------------------------------------------
 1 | name: talkpython-dask
 2 | channels:
 3 |   - conda-forge
 4 | dependencies:
 5 |   - python=3.8
 6 |   - nodejs
 7 |   - dask>=2021.3.0
 8 |   - dask-ml>=1.7.0
 9 |   - distributed>=2021.3.0
10 |   - jupyterlab>=3.0
11 |   - notebook
12 |   - pandas>=1.0.1
13 |   - numpy>=1.19.2
14 |   - scipy>=1.4.1
15 |   - scikit-learn>=0.22.1
16 |   - scikit-image>=0.15.0
17 |   - ipywidgets>=7.5
18 |   - bokeh>=2.3.0
19 |   - pip>=20.3.0
20 |   - pip:
21 |     - dask-labextension>=3.0.0
22 |     - coiled
23 |   - python-graphviz
24 |   - h5py
25 |   - mimesis
26 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Fundamentals of Dask
 2 | 
 3 | This repository contains the material for the **Talk Python Training course** on Fundamentals of Dask.
 4 | 
 5 | The Python data science stack, consisting of tools like pandas, NumPy, scikit-learn, and many more, is extremely powerful, but it rarely leverages the parallel computing potential of modern hardware. Dask can help bridge this gap. This course will teach you how to parallelize everything from array computations to general Python code with Dask and even perform distributed machine learning to train models at scale.
 6 | 
 7 | Take the course at: [training.talkpython.fm](https://training.talkpython.fm/courses/fundamentals-of-dask-getting-up-to-speed)
 8 | 
 9 | In this course, you will learn to:
10 | 
11 | * Scale array computations using a parallel alternative to NumPy
12 | * Parallelize general Python code including for-loops
13 | * Work with unstructured data in parallel
14 | * Train machine learning models faster using distributed computing
15 | * And lots more!
16 | 
17 | ## Prerequisites
18 | 
19 | - Basic Python
20 | 
21 | Not required, but nice to have:
22 | - pandas
23 | - NumPy
24 | - scikit-learn
25 | - Machine Learning
26 | - JupyterLab
27 | - conda (for local setup)
28 | - Terminal (for local setup)
29 | 
30 | ## Setup
31 | 
32 | You can get up and running in two ways:
33 | 
34 | ### Launch Binder
35 | 
36 | [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/coiled/talkpython-fundamentals-of-dask/master?urlpath=lab/tree/03-array.ipynb)
37 | 
38 | The Binder project allows you to open the Jupyter notebooks in this repository in an online executable environment. Click the "launch binder" badge above to get started. It might take a few minutes to start.
39 | 
40 | *Note: Binder notebooks time out if inactive for more than 10 minutes.*
41 | 
42 | ### Local setup (recommended)
43 | 
44 | * [Fork this repository](https://docs.github.com/en/free-pro-team@latest/github/getting-started-with-github/fork-a-repo)
45 | 
46 | * Clone your forked repository:
47 | 
48 | ```git clone http://github.com/<username>/talkpython-fundamentals-of-dask```
49 | 
50 | * Change into the repository's root directory:
51 | 
52 | ```cd talkpython-fundamentals-of-dask```
53 | 
54 | * Create a new conda environment:
55 | 
56 | ```conda env create -f environment.yml```
57 | 
58 | * Activate the conda environment:
59 | 
60 | ```conda activate talkpython-dask```
61 | 
62 | * Start JupyterLab:
63 | 
64 | ```jupyter lab```
65 | 
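Once the `talkpython-dask` environment is active, a quick check like the one below can confirm that the main libraries from `environment.yml` are importable before you open the notebooks. This is only a minimal sketch: the helper name `missing_modules` and the `COURSE_MODULES` list are illustrative and not part of the course material, and the list covers the core libraries rather than every dependency.

```python
from importlib.util import find_spec

# Import names for the core packages listed in environment.yml.
# Some import names differ from the conda/pip package names:
# scikit-learn -> sklearn, scikit-image -> skimage,
# dask-ml -> dask_ml, python-graphviz -> graphviz.
COURSE_MODULES = [
    "dask", "distributed", "dask_ml", "pandas", "numpy", "scipy",
    "sklearn", "skimage", "bokeh", "h5py", "graphviz", "mimesis",
]


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(COURSE_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All course modules are importable.")
```

If any modules are reported as missing, recreating the environment with `conda env create -f environment.yml` is a reasonable first step.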


--------------------------------------------------------------------------------
/05-bag.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "id": "concrete-certification",
  6 |    "metadata": {},
  7 |    "source": [
  8 |     "# Dask Bag\n",
  9 |     "\n",
 10 |     "## Notebook Objectives\n",
 11 |     "* **Read and Manipulate data with Dask Bag**, a high-level interface for parallelizing computations on generic Python objects.\n",
 12 |     "* **Convert Dask Bag to Dask DataFrame**.\n",
 13 |     "* **Limitations of Dask Bag**.\n",
 14 |     "* **References** for further reading."
 15 |    ]
 16 |   },
 17 |   {
 18 |    "cell_type": "markdown",
 19 |    "id": "experimental-thirty",
 20 |    "metadata": {},
 21 |    "source": [
 22 |     "## Read data with Dask Bag\n",
 23 |     "\n",
 24 |     "We can create a Dask Bag from Python collections (lists, dicts, sets), from files (JSON, XML), from S3, and more.\n",
 25 |     "\n",
 26 |     "Before that, let's start a Cluster:"
 27 |    ]
 28 |   },
 29 |   {
 30 |    "cell_type": "code",
 31 |    "execution_count": 2,
 32 |    "id": "knowing-graduation",
 33 |    "metadata": {},
 34 |    "outputs": [
 35 |     {
 36 |      "data": {
 37 |       "text/html": [
 38 |        "<table style=\"border: 2px solid white;\">\n",
 39 |        "<tr>\n",
 40 |        "<td style=\"vertical-align: top; border: 0px solid white\">\n",
 41 |        "<h3 style=\"text-align: left;\">Client</h3>\n",
 42 |        "<ul style=\"text-align: left; list-style: none; margin: 0; padding: 0;\">\n",
 43 |        "  <li><b>Scheduler: </b>tcp://127.0.0.1:50672</li>\n",
 44 |        "  <li><b>Dashboard: </b><a href='http://127.0.0.1:8787/status' target='_blank'>http://127.0.0.1:8787/status</a></li>\n",
 45 |        "</ul>\n",
 46 |        "</td>\n",
 47 |        "<td style=\"vertical-align: top; border: 0px solid white\">\n",
 48 |        "<h3 style=\"text-align: left;\">Cluster</h3>\n",
 49 |        "<ul style=\"text-align: left; list-style:none; margin: 0; padding: 0;\">\n",
 50 |        "  <li><b>Workers: </b>4</li>\n",
 51 |        "  <li><b>Cores: </b>12</li>\n",
 52 |        "  <li><b>Memory: </b>16.00 GiB</li>\n",
 53 |        "</ul>\n",
 54 |        "</td>\n",
 55 |        "</tr>\n",
 56 |        "</table>"
 57 |       ],
 58 |       "text/plain": [
 59 |        "<Client: 'tcp://127.0.0.1:50672' processes=4 threads=12, memory=16.00 GiB>"
 60 |       ]
 61 |      },
 62 |      "execution_count": 2,
 63 |      "metadata": {},
 64 |      "output_type": "execute_result"
 65 |     }
 66 |    ],
 67 |    "source": [
 68 |     "from dask.distributed import Client\n",
 69 |     "\n",
 70 |     "client = Client(n_workers=4)\n",
 71 |     "client"
 72 |    ]
 73 |   },
 74 |   {
 75 |    "cell_type": "markdown",
 76 |    "id": "decent-overall",
 77 |    "metadata": {},
 78 |    "source": [
 79 |     "Open the dashboards!"
 80 |    ]
 81 |   },
 82 |   {
 83 |    "cell_type": "markdown",
 84 |    "id": "liable-montgomery",
 85 |    "metadata": {},
 86 |    "source": [
 87 |     "### Reading from Python sequence\n",
 88 |     "\n",
 89 |     "Here we create a Dask Bag from a Python list. You can create Bags similarly from sets and dictionaries.\n",
 90 |     "\n",
 91 |     "Data is partitioned into blocks. In the following example, there are two partitions with 5 elements each."
 92 |    ]
 93 |   },
 94 |   {
 95 |    "cell_type": "code",
 96 |    "execution_count": 3,
 97 |    "id": "christian-nancy",
 98 |    "metadata": {},
 99 |    "outputs": [
100 |     {
101 |      "data": {
102 |       "text/plain": [
103 |        "dask.bag<from_sequence, npartitions=2>"
104 |       ]
105 |      },
106 |      "execution_count": 3,
107 |      "metadata": {},
108 |      "output_type": "execute_result"
109 |     }
110 |    ],
111 |    "source": [
112 |     "import dask.bag as db\n",
113 |     "\n",
114 |     "b = db.from_sequence(['Alaska', 'Minnesota', 'Georgia', 'Maine', 'West Virginia', 'California', 'South Dakota', 'Indiana', 'New York', 'Nebraska'], npartitions=2)\n",
115 |     "b"
116 |    ]
117 |   },
118 |   {
119 |    "cell_type": "markdown",
120 |    "id": "handmade-wilson",
121 |    "metadata": {},
122 |    "source": [
123 |     "Bag objects are also evaluated lazily by default, so we need to call `compute` to get the result."
124 |    ]
125 |   },
126 |   {
127 |    "cell_type": "code",
128 |    "execution_count": 4,
129 |    "id": "classified-split",
130 |    "metadata": {
131 |     "tags": []
132 |    },
133 |    "outputs": [
134 |     {
135 |      "data": {
136 |       "text/plain": [
137 |        "['Alaska',\n",
138 |        " 'Minnesota',\n",
139 |        " 'Georgia',\n",
140 |        " 'Maine',\n",
141 |        " 'West Virginia',\n",
142 |        " 'California',\n",
143 |        " 'South Dakota',\n",
144 |        " 'Indiana',\n",
145 |        " 'New York',\n",
146 |        " 'Nebraska']"
147 |       ]
148 |      },
149 |      "execution_count": 4,
150 |      "metadata": {},
151 |      "output_type": "execute_result"
152 |     }
153 |    ],
154 |    "source": [
155 |     "b.compute()"
156 |    ]
157 |   },
158 |   {
159 |    "cell_type": "markdown",
160 |    "id": "organized-tsunami",
161 |    "metadata": {},
162 |    "source": [
163 |     "`take()` can be used to show elements of the data."
164 |    ]
165 |   },
166 |   {
167 |    "cell_type": "code",
168 |    "execution_count": 5,
169 |    "id": "resistant-competition",
170 |    "metadata": {},
171 |    "outputs": [
172 |     {
173 |      "data": {
174 |       "text/plain": [
175 |        "('Alaska', 'Minnesota', 'Georgia')"
176 |       ]
177 |      },
178 |      "execution_count": 5,
179 |      "metadata": {},
180 |      "output_type": "execute_result"
181 |     }
182 |    ],
183 |    "source": [
184 |     "b.take(3)"
185 |    ]
186 |   },
187 |   {
188 |    "cell_type": "markdown",
189 |    "id": "consolidated-cutting",
190 |    "metadata": {},
191 |    "source": [
192 |     "### Reading from JSON file\n",
193 |     "\n",
194 |     "Here we create a Dask Bag from JSON files."
195 |    ]
196 |   },
197 |   {
198 |    "cell_type": "code",
199 |    "execution_count": 6,
200 |    "id": "animated-buyer",
201 |    "metadata": {},
202 |    "outputs": [
203 |     {
204 |      "data": {
205 |       "text/plain": [
206 |        "['/Users/pavithra-coiled/Developer/talkpython-dask-course/2-dask-fundamentals/data/0.json',\n",
207 |        " '/Users/pavithra-coiled/Developer/talkpython-dask-course/2-dask-fundamentals/data/1.json',\n",
208 |        " '/Users/pavithra-coiled/Developer/talkpython-dask-course/2-dask-fundamentals/data/2.json',\n",
209 |        " '/Users/pavithra-coiled/Developer/talkpython-dask-course/2-dask-fundamentals/data/3.json',\n",
210 |        " '/Users/pavithra-coiled/Developer/talkpython-dask-course/2-dask-fundamentals/data/4.json',\n",
211 |        " '/Users/pavithra-coiled/Developer/talkpython-dask-course/2-dask-fundamentals/data/5.json',\n",
212 |        " '/Users/pavithra-coiled/Developer/talkpython-dask-course/2-dask-fundamentals/data/6.json',\n",
213 |        " '/Users/pavithra-coiled/Developer/talkpython-dask-course/2-dask-fundamentals/data/7.json',\n",
214 |        " '/Users/pavithra-coiled/Developer/talkpython-dask-course/2-dask-fundamentals/data/8.json',\n",
215 |        " '/Users/pavithra-coiled/Developer/talkpython-dask-course/2-dask-fundamentals/data/9.json']"
216 |       ]
217 |      },
218 |      "execution_count": 6,
219 |      "metadata": {},
220 |      "output_type": "execute_result"
221 |     }
222 |    ],
223 |    "source": [
224 |     "# Create random data and store as JSON files \n",
225 |     "\n",
226 |     "import dask\n",
227 |     "import json\n",
228 |     "import os\n",
229 |     "\n",
230 |     "b = dask.datasets.make_people()\n",
231 |     "b.map(json.dumps).to_textfiles('data/*.json')"
232 |    ]
233 |   },
234 |   {
235 |    "cell_type": "markdown",
236 |    "id": "4533ad2c-47fa-45e5-b1dc-91d29147e861",
237 |    "metadata": {},
238 |    "source": [
239 |     "Then, read the data using `read_text`."
240 |    ]
241 |   },
242 |   {
243 |    "cell_type": "code",
244 |    "execution_count": 7,
245 |    "id": "postal-newcastle",
246 |    "metadata": {},
247 |    "outputs": [
248 |     {
249 |      "data": {
250 |       "text/plain": [
251 |        "dask.bag<bag-from-delayed, npartitions=10>"
252 |       ]
253 |      },
254 |      "execution_count": 7,
255 |      "metadata": {},
256 |      "output_type": "execute_result"
257 |     }
258 |    ],
259 |    "source": [
260 |     "b = db.read_text('data/*.json')\n",
261 |     "b"
262 |    ]
263 |   },
264 |   {
265 |    "cell_type": "code",
266 |    "execution_count": 7,
267 |    "id": "first-lover",
268 |    "metadata": {},
269 |    "outputs": [
270 |     {
271 |      "data": {
272 |       "text/plain": [
273 |        "('{\"age\": 30, \"name\": [\"Darrel\", \"Soto\"], \"occupation\": \"Audiologist\", \"telephone\": \"527.475.4983\", \"address\": {\"address\": \"460 Rivas Drung\", \"city\": \"Winston-Salem\"}, \"credit-card\": {\"number\": \"2446 9077 9141 7987\", \"expiration-date\": \"09/22\"}}\\n',\n",
274 |        " '{\"age\": 38, \"name\": [\"Sindy\", \"Campbell\"], \"occupation\": \"Foreman\", \"telephone\": \"946.885.3965\", \"address\": {\"address\": \"1185 Bass Spur\", \"city\": \"Millville\"}, \"credit-card\": {\"number\": \"4956 2525 9272 9241\", \"expiration-date\": \"08/20\"}}\\n')"
275 |       ]
276 |      },
277 |      "execution_count": 7,
278 |      "metadata": {},
279 |      "output_type": "execute_result"
280 |     }
281 |    ],
282 |    "source": [
283 |     "b.take(2)"
284 |    ]
285 |   },
286 |   {
287 |    "cell_type": "markdown",
288 |    "id": "marine-telling",
289 |    "metadata": {},
290 |    "source": [
291 |     "Note that there are 10 partitions, one for each of the 10 files in our data.\n",
292 |     "\n",
293 |     "The data comes out as lines of text. We can parse each line into a Python dictionary using `json.loads`."
294 |    ]
295 |   },
296 |   {
297 |    "cell_type": "code",
298 |    "execution_count": 8,
299 |    "id": "catholic-retention",
300 |    "metadata": {},
301 |    "outputs": [
302 |     {
303 |      "data": {
304 |       "text/plain": [
305 |        "({'age': 60,\n",
306 |        "  'name': ['Jeffery', 'Garcia'],\n",
307 |        "  'occupation': 'Training Consultant',\n",
308 |        "  'telephone': '1-702-673-7969',\n",
309 |        "  'address': {'address': '744 Langton Parade', 'city': 'Sugar Hill'},\n",
310 |        "  'credit-card': {'number': '3745 852410 45994', 'expiration-date': '06/23'}},\n",
311 |        " {'age': 54,\n",
312 |        "  'name': ['Parker', 'Reed'],\n",
313 |        "  'occupation': 'Window Dresser',\n",
314 |        "  'telephone': '223-543-9697',\n",
315 |        "  'address': {'address': '1065 Mill Field', 'city': 'South Portland'},\n",
316 |        "  'credit-card': {'number': '3789 947854 38464', 'expiration-date': '09/23'}})"
317 |       ]
318 |      },
319 |      "execution_count": 8,
320 |      "metadata": {},
321 |      "output_type": "execute_result"
322 |     }
323 |    ],
324 |    "source": [
325 |     "b = b.map(json.loads)\n",
326 |     "b.take(2)"
327 |    ]
328 |   },
329 |   {
330 |    "cell_type": "markdown",
331 |    "id": "civilian-booth",
332 |    "metadata": {},
333 |    "source": [
334 |     "## Manipulate data with Dask Bag\n",
335 |     "\n",
336 |     "Bag objects have the standard functional API found in projects like the Python standard library, toolz, or pyspark, including map, filter, groupby, etc.\n",
337 |     "\n",
338 |     "Operations on Bag objects create new bags. "
339 |    ]
340 |   },
341 |   {
342 |    "cell_type": "markdown",
343 |    "id": "wicked-cholesterol",
344 |    "metadata": {},
345 |    "source": [
346 |     "### Filter operation\n",
347 |     "\n",
348 |     "Filter the records to keep only those with an age over 25."
349 |    ]
350 |   },
351 |   {
352 |    "cell_type": "code",
353 |    "execution_count": 9,
354 |    "id": "important-render",
355 |    "metadata": {},
356 |    "outputs": [
357 |     {
358 |      "data": {
359 |       "text/plain": [
360 |        "({'age': 60,\n",
361 |        "  'name': ['Jeffery', 'Garcia'],\n",
362 |        "  'occupation': 'Training Consultant',\n",
363 |        "  'telephone': '1-702-673-7969',\n",
364 |        "  'address': {'address': '744 Langton Parade', 'city': 'Sugar Hill'},\n",
365 |        "  'credit-card': {'number': '3745 852410 45994', 'expiration-date': '06/23'}},\n",
366 |        " {'age': 54,\n",
367 |        "  'name': ['Parker', 'Reed'],\n",
368 |        "  'occupation': 'Window Dresser',\n",
369 |        "  'telephone': '223-543-9697',\n",
370 |        "  'address': {'address': '1065 Mill Field', 'city': 'South Portland'},\n",
371 |        "  'credit-card': {'number': '3789 947854 38464', 'expiration-date': '09/23'}},\n",
372 |        " {'age': 44,\n",
373 |        "  'name': ['Nicolas', 'Duncan'],\n",
374 |        "  'occupation': 'Forest Ranger',\n",
375 |        "  'telephone': '064.491.6735',\n",
376 |        "  'address': {'address': '529 Cameron Alley', 'city': 'Garner'},\n",
377 |        "  'credit-card': {'number': '4165 7976 6426 7113',\n",
378 |        "   'expiration-date': '11/22'}},\n",
379 |        " {'age': 42,\n",
380 |        "  'name': ['Patrick', 'Rasmussen'],\n",
381 |        "  'occupation': 'Technical Clerk',\n",
382 |        "  'telephone': '530-726-3639',\n",
383 |        "  'address': {'address': '988 Western Shore Line', 'city': 'Yorba Linda'},\n",
384 |        "  'credit-card': {'number': '4075 4659 6389 2457',\n",
385 |        "   'expiration-date': '08/22'}},\n",
386 |        " {'age': 28,\n",
387 |        "  'name': ['Caleb', 'Allison'],\n",
388 |        "  'occupation': 'Hospital Manager',\n",
389 |        "  'telephone': '1-197-089-3998',\n",
390 |        "  'address': {'address': '908 White Place', 'city': 'Salinas'},\n",
391 |        "  'credit-card': {'number': '3766 677448 55505', 'expiration-date': '05/22'}})"
392 |       ]
393 |      },
394 |      "execution_count": 9,
395 |      "metadata": {},
396 |      "output_type": "execute_result"
397 |     }
398 |    ],
399 |    "source": [
400 |     "b.filter(lambda record: record['age'] > 25).take(5)"
401 |    ]
402 |   },
403 |   {
404 |    "cell_type": "markdown",
405 |    "id": "heard-solid",
406 |    "metadata": {},
407 |    "source": [
408 |     "### Map operation\n",
409 |     "\n",
410 |     "Get only the first name."
411 |    ]
412 |   },
413 |   {
414 |    "cell_type": "code",
415 |    "execution_count": 10,
416 |    "id": "touched-emperor",
417 |    "metadata": {},
418 |    "outputs": [
419 |     {
420 |      "data": {
421 |       "text/plain": [
422 |        "('Jeffery',\n",
423 |        " 'Parker',\n",
424 |        " 'Nicolas',\n",
425 |        " 'Rickie',\n",
426 |        " 'Patrick',\n",
427 |        " 'Caleb',\n",
428 |        " 'Cruz',\n",
429 |        " 'Jeanene',\n",
430 |        " 'Wade',\n",
431 |        " 'Jarrett')"
432 |       ]
433 |      },
434 |      "execution_count": 10,
435 |      "metadata": {},
436 |      "output_type": "execute_result"
437 |     }
438 |    ],
439 |    "source": [
440 |     "x = b.map(lambda record: record['name'][0]).take(10)\n",
441 |     "x"
442 |    ]
443 |   },
444 |   {
445 |    "cell_type": "markdown",
446 |    "id": "seventh-security",
447 |    "metadata": {
448 |     "tags": []
449 |    },
450 |    "source": [
451 |     "### Groupby Operation\n",
452 |     "\n",
453 |     "Group data by some function or key."
454 |    ]
455 |   },
456 |   {
457 |    "cell_type": "code",
458 |    "execution_count": 11,
459 |    "id": "mobile-collect",
460 |    "metadata": {
461 |     "tags": []
462 |    },
463 |    "outputs": [
464 |     {
465 |      "data": {
466 |       "text/plain": [
467 |        "[(6, ['Parker', 'Rickie']),\n",
468 |        " (4, ['Cruz', 'Wade']),\n",
469 |        " (7, ['Jeffery', 'Nicolas', 'Patrick', 'Jeanene', 'Jarrett']),\n",
470 |        " (5, ['Caleb'])]"
471 |       ]
472 |      },
473 |      "execution_count": 11,
474 |      "metadata": {},
475 |      "output_type": "execute_result"
476 |     }
477 |    ],
478 |    "source": [
479 |     "b = db.from_sequence(x, npartitions=2)\n",
480 |     "b.groupby(len).compute()"
481 |    ]
482 |   },
483 |   {
484 |    "cell_type": "markdown",
485 |    "id": "suitable-announcement",
486 |    "metadata": {},
487 |    "source": [
488 |     "**Note:**\n",
489 |     "\n",
490 |     "Often we want to group data by some function or key. We can do this either with the `.groupby` method, which is straightforward but forces a full shuffle of the data (expensive) or with the harder-to-use but faster `.foldby` method, which does a streaming combined groupby and reduction.\n",
491 |     "\n",
492 |     "* `groupby`: Shuffles data so that all items with the same key are in the same key-value pair\n",
493 |     "* `foldby`: Walks through the data accumulating a result per key\n",
494 |     "\n",
495 |     "_~ Source: [tutorial.dask.org](https://tutorial.dask.org/02_bag.html#Groupby-and-Foldby)_"
496 |    ]
497 |   },
498 |   {
499 |    "cell_type": "markdown",
500 |    "id": "northern-interaction",
501 |    "metadata": {},
502 |    "source": [
503 |     "## Checkpoint\n",
504 |     "\n",
505 |     "**Question:** Find all cities from the JSON data we created earlier."
506 |    ]
507 |   },
508 |   {
509 |    "cell_type": "code",
510 |    "execution_count": null,
511 |    "id": "subject-shoulder",
512 |    "metadata": {},
513 |    "outputs": [],
514 |    "source": [
515 |     "# Your answer here"
516 |    ]
517 |   },
518 |   {
519 |    "cell_type": "code",
520 |    "execution_count": null,
521 |    "id": "partial-closer",
522 |    "metadata": {
523 |     "jupyter": {
524 |      "source_hidden": true
525 |     },
526 |     "tags": []
527 |    },
528 |    "outputs": [],
529 |    "source": [
530 |     "b = db.read_text('data/*.json').map(json.loads)\n",
531 |     "x = b.map(lambda record: record['address']['city']).take(10)\n",
532 |     "x"
533 |    ]
534 |   },
535 |   {
536 |    "cell_type": "markdown",
537 |    "id": "hollow-permit",
538 |    "metadata": {},
539 |    "source": [
540 |     "## Convert Dask Bag to Dask DataFrame\n",
541 |     "\n",
542 |     "Dask Bag can be used for simple analysis, but for more complex computations, Dask DataFrame or Dask Array might be a better choice. They are faster for the same reason pandas and NumPy are faster than pure Python. They also have more functionality suited for data analysis.\n",
543 |     "\n",
544 |     "The `to_dataframe` method can be used to transform a Dask Bag into a Dask DataFrame."
545 |    ]
546 |   },
547 |   {
548 |    "cell_type": "code",
549 |    "execution_count": 19,
550 |    "id": "subjective-biotechnology",
551 |    "metadata": {
552 |     "tags": []
553 |    },
554 |    "outputs": [
555 |     {
556 |      "data": {
557 |       "text/html": [
558 |        "<div>\n",
559 |        "<style scoped>\n",
560 |        "    .dataframe tbody tr th:only-of-type {\n",
561 |        "        vertical-align: middle;\n",
562 |        "    }\n",
563 |        "\n",
564 |        "    .dataframe tbody tr th {\n",
565 |        "        vertical-align: top;\n",
566 |        "    }\n",
567 |        "\n",
568 |        "    .dataframe thead th {\n",
569 |        "        text-align: right;\n",
570 |        "    }\n",
571 |        "</style>\n",
572 |        "<table border=\"1\" class=\"dataframe\">\n",
573 |        "  <thead>\n",
574 |        "    <tr style=\"text-align: right;\">\n",
575 |        "      <th></th>\n",
576 |        "      <th>age</th>\n",
577 |        "      <th>name</th>\n",
578 |        "      <th>occupation</th>\n",
579 |        "      <th>telephone</th>\n",
580 |        "      <th>address</th>\n",
581 |        "      <th>credit-card</th>\n",
582 |        "    </tr>\n",
583 |        "  </thead>\n",
584 |        "  <tbody>\n",
585 |        "    <tr>\n",
586 |        "      <th>0</th>\n",
587 |        "      <td>60</td>\n",
588 |        "      <td>[Jeffery, Garcia]</td>\n",
589 |        "      <td>Training Consultant</td>\n",
590 |        "      <td>1-702-673-7969</td>\n",
591 |        "      <td>{'address': '744 Langton Parade', 'city': 'Sug...</td>\n",
592 |        "      <td>{'number': '3745 852410 45994', 'expiration-da...</td>\n",
593 |        "    </tr>\n",
594 |        "    <tr>\n",
595 |        "      <th>1</th>\n",
596 |        "      <td>54</td>\n",
597 |        "      <td>[Parker, Reed]</td>\n",
598 |        "      <td>Window Dresser</td>\n",
599 |        "      <td>223-543-9697</td>\n",
600 |        "      <td>{'address': '1065 Mill Field', 'city': 'South ...</td>\n",
601 |        "      <td>{'number': '3789 947854 38464', 'expiration-da...</td>\n",
602 |        "    </tr>\n",
603 |        "    <tr>\n",
604 |        "      <th>2</th>\n",
605 |        "      <td>44</td>\n",
606 |        "      <td>[Nicolas, Duncan]</td>\n",
607 |        "      <td>Forest Ranger</td>\n",
608 |        "      <td>064.491.6735</td>\n",
609 |        "      <td>{'address': '529 Cameron Alley', 'city': 'Garn...</td>\n",
610 |        "      <td>{'number': '4165 7976 6426 7113', 'expiration-...</td>\n",
611 |        "    </tr>\n",
612 |        "    <tr>\n",
613 |        "      <th>3</th>\n",
614 |        "      <td>23</td>\n",
615 |        "      <td>[Rickie, Dickerson]</td>\n",
616 |        "      <td>Chiropodist</td>\n",
617 |        "      <td>(393) 425-7342</td>\n",
618 |        "      <td>{'address': '733 Miramar Run', 'city': 'Shakop...</td>\n",
619 |        "      <td>{'number': '4904 7032 6941 1961', 'expiration-...</td>\n",
620 |        "    </tr>\n",
621 |        "    <tr>\n",
622 |        "      <th>4</th>\n",
623 |        "      <td>42</td>\n",
624 |        "      <td>[Patrick, Rasmussen]</td>\n",
625 |        "      <td>Technical Clerk</td>\n",
626 |        "      <td>530-726-3639</td>\n",
627 |        "      <td>{'address': '988 Western Shore Line', 'city': ...</td>\n",
628 |        "      <td>{'number': '4075 4659 6389 2457', 'expiration-...</td>\n",
629 |        "    </tr>\n",
630 |        "  </tbody>\n",
631 |        "</table>\n",
632 |        "</div>"
633 |       ],
634 |       "text/plain": [
635 |        "   age                  name           occupation       telephone  \\\n",
636 |        "0   60     [Jeffery, Garcia]  Training Consultant  1-702-673-7969   \n",
637 |        "1   54        [Parker, Reed]       Window Dresser    223-543-9697   \n",
638 |        "2   44     [Nicolas, Duncan]        Forest Ranger    064.491.6735   \n",
639 |        "3   23   [Rickie, Dickerson]          Chiropodist  (393) 425-7342   \n",
640 |        "4   42  [Patrick, Rasmussen]      Technical Clerk    530-726-3639   \n",
641 |        "\n",
642 |        "                                             address  \\\n",
643 |        "0  {'address': '744 Langton Parade', 'city': 'Sug...   \n",
644 |        "1  {'address': '1065 Mill Field', 'city': 'South ...   \n",
645 |        "2  {'address': '529 Cameron Alley', 'city': 'Garn...   \n",
646 |        "3  {'address': '733 Miramar Run', 'city': 'Shakop...   \n",
647 |        "4  {'address': '988 Western Shore Line', 'city': ...   \n",
648 |        "\n",
649 |        "                                         credit-card  \n",
650 |        "0  {'number': '3745 852410 45994', 'expiration-da...  \n",
651 |        "1  {'number': '3789 947854 38464', 'expiration-da...  \n",
652 |        "2  {'number': '4165 7976 6426 7113', 'expiration-...  \n",
653 |        "3  {'number': '4904 7032 6941 1961', 'expiration-...  \n",
654 |        "4  {'number': '4075 4659 6389 2457', 'expiration-...  "
655 |       ]
656 |      },
657 |      "execution_count": 19,
658 |      "metadata": {},
659 |      "output_type": "execute_result"
660 |     }
661 |    ],
662 |    "source": [
663 |     "b = db.read_text('data/*.json').map(json.loads)\n",
664 |     "df = b.to_dataframe()\n",
665 |     "df.head()"
666 |    ]
667 |   },
668 |   {
669 |    "cell_type": "markdown",
670 |    "id": "nominated-nothing",
671 |    "metadata": {},
672 |    "source": [
673 |     "Remember to close the Cluster. :)"
674 |    ]
675 |   },
676 |   {
677 |    "cell_type": "code",
678 |    "execution_count": 20,
679 |    "id": "eight-fruit",
680 |    "metadata": {},
681 |    "outputs": [],
682 |    "source": [
683 |     "client.close()"
684 |    ]
685 |   },
686 |   {
687 |    "cell_type": "markdown",
688 |    "id": "spanish-acceptance",
689 |    "metadata": {},
690 |    "source": [
691 |     "## Limitations\n",
692 |     "\n",
693 |     "* Does not perform well on computations that include a great deal of inter-worker communication.\n",
694 |     "* Bag operations are slower than array/DataFrame computations (Python is slower than NumPy/pandas).\n",
695 |     "* Bag.groupby is slow. You should try to use Bag.foldby if possible.\n",
696 |     "* Bags are immutable, so you cannot change individual elements."
697 |    ]
698 |   },
699 |   {
700 |    "cell_type": "markdown",
701 |    "id": "loose-sampling",
702 |    "metadata": {},
703 |    "source": [
704 |     "## References\n",
705 |     "* [Dask Bag documentation](https://docs.dask.org/en/latest/bag.html)\n",
706 |     "* [Dask Bag API](https://docs.dask.org/en/latest/bag-api.html)\n",
707 |     "* [Dask Bag examples](https://docs.dask.org/en/latest/bag-api.html)\n",
708 |     "* [Dask Tutorial - Bag](https://tutorial.dask.org/02_bag.html)"
709 |    ]
710 |   },
711 |   {
712 |    "cell_type": "code",
713 |    "execution_count": null,
714 |    "id": "cardiovascular-subdivision",
715 |    "metadata": {},
716 |    "outputs": [],
717 |    "source": []
718 |   }
719 |  ],
720 |  "metadata": {
721 |   "kernelspec": {
722 |    "display_name": "Python 3",
723 |    "language": "python",
724 |    "name": "python3"
725 |   },
726 |   "language_info": {
727 |    "codemirror_mode": {
728 |     "name": "ipython",
729 |     "version": 3
730 |    },
731 |    "file_extension": ".py",
732 |    "mimetype": "text/x-python",
733 |    "name": "python",
734 |    "nbconvert_exporter": "python",
735 |    "pygments_lexer": "ipython3",
736 |    "version": "3.8.10"
737 |   }
738 |  },
739 |  "nbformat": 4,
740 |  "nbformat_minor": 5
741 | }
742 | 


--------------------------------------------------------------------------------
/06-schedulers.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "id": "yellow-slovak",
  6 |    "metadata": {},
  7 |    "source": [
  8 |     "# Dask Schedulers\n",
  9 |     "\n",
 10 |     "## Notebook Objectives\n",
 11 |     "* **Performance comparison** of different Dask schedulers.\n",
 12 |     "* **References** for further reading."
 13 |    ]
 14 |   },
 15 |   {
 16 |    "cell_type": "markdown",
 17 |    "id": "aboriginal-release",
 18 |    "metadata": {},
 19 |    "source": [
 20 |     "## Performance comparison of different Dask schedulers"
 21 |    ]
 22 |   },
 23 |   {
 24 |    "cell_type": "markdown",
 25 |    "id": "metric-small",
 26 |    "metadata": {},
 27 |    "source": [
 28 |     "To compare the different schedulers, let's go back to the DataFrame example where we read the NYC Taxi Trips dataset, compute the mean tip amount for each passenger count, and take the maximum of those means."
 29 |    ]
 30 |   },
 31 |   {
 32 |    "cell_type": "code",
 33 |    "execution_count": 5,
 34 |    "id": "modular-button",
 35 |    "metadata": {},
 36 |    "outputs": [
 37 |     {
 38 |      "data": {
 39 |       "text/html": [
 40 |        "\n",
 41 |        "            <div>\n",
 42 |        "                <div style=\"\n",
 43 |        "                    width: 24px;\n",
 44 |        "                    height: 24px;\n",
 45 |        "                    background-color: #e1e1e1;\n",
 46 |        "                    border: 3px solid #9D9D9D;\n",
 47 |        "                    border-radius: 5px;\n",
 48 |        "                    position: absolute;\"> </div>\n",
 49 |        "                <div style=\"margin-left: 48px;\">\n",
 50 |        "                    <h3 style=\"margin-bottom: 0px;\">Client</h3>\n",
 51 |        "                    <p style=\"color: #9D9D9D; margin-bottom: 0px;\">Client-ad7e6a04-d48f-11eb-b4de-acde48001122</p>\n",
 52 |        "                    <table style=\"width: 100%; text-align: left;\">\n",
 53 |        "                    \n",
 54 |        "                <tr>\n",
 55 |        "                    <td style=\"text-align: left;\"><strong>Connection method:</strong> Cluster object</td>\n",
 56 |        "                    <td style=\"text-align: left;\"><strong>Cluster type:</strong> LocalCluster</td>\n",
 57 |        "                </tr>\n",
 58 |        "                \n",
 59 |        "                <tr>\n",
 60 |        "                    <td style=\"text-align: left;\">\n",
 61 |        "                        <strong>Dashboard: </strong>\n",
 62 |        "                        <a href=\"http://127.0.0.1:8787/status\">http://127.0.0.1:8787/status</a>\n",
 63 |        "                    </td>\n",
 64 |        "                    <td style=\"text-align: left;\"></td>\n",
 65 |        "                </tr>\n",
 66 |        "                \n",
 67 |        "                    </table>\n",
 68 |        "                    \n",
 69 |        "                <details>\n",
 70 |        "                <summary style=\"margin-bottom: 20px;\"><h3 style=\"display: inline;\">Cluster Info</h3></summary>\n",
 71 |        "                \n",
 72 |        "            <div class=\"jp-RenderedHTMLCommon jp-RenderedHTML jp-mod-trusted jp-OutputArea-output\">\n",
 73 |        "                <div style=\"\n",
 74 |        "                    width: 24px;\n",
 75 |        "                    height: 24px;\n",
 76 |        "                    background-color: #e1e1e1;\n",
 77 |        "                    border: 3px solid #9D9D9D;\n",
 78 |        "                    border-radius: 5px;\n",
 79 |        "                    position: absolute;\"> </div>\n",
 80 |        "                <div style=\"margin-left: 48px;\">\n",
 81 |        "                    <h3 style=\"margin-bottom: 0px; margin-top: 0px;\">LocalCluster</h3>\n",
 82 |        "                    <p style=\"color: #9D9D9D; margin-bottom: 0px;\">537e1587</p>\n",
 83 |        "                    <table style=\"width: 100%; text-align: left;\">\n",
 84 |        "                    \n",
 85 |        "            <tr>\n",
 86 |        "                <td style=\"text-align: left;\"><strong>Status:</strong> running</td>\n",
 87 |        "                <td style=\"text-align: left;\"><strong>Using processes:</strong> True</td>\n",
 88 |        "            </tr>\n",
 89 |        "        \n",
 90 |        "            <tr>\n",
 91 |        "                <td style=\"text-align: left;\">\n",
 92 |        "                    <strong>Dashboard:</strong> <a href=\"http://127.0.0.1:8787/status\">http://127.0.0.1:8787/status</a>\n",
 93 |        "                </td>\n",
 94 |        "                <td style=\"text-align: left;\"><strong>Workers:</strong> 4</td>\n",
 95 |        "            </tr>\n",
 96 |        "            <tr>\n",
 97 |        "                <td style=\"text-align: left;\">\n",
 98 |        "                    <strong>Total threads:</strong>\n",
 99 |        "                    12\n",
100 |        "                </td>\n",
101 |        "                <td style=\"text-align: left;\">\n",
102 |        "                    <strong>Total memory:</strong>\n",
103 |        "                    16.00 GiB\n",
104 |        "                </td>\n",
105 |        "            </tr>\n",
106 |        "        \n",
107 |        "                    </table>\n",
108 |        "                    <details>\n",
109 |        "                    <summary style=\"margin-bottom: 20px;\"><h3 style=\"display: inline;\">Scheduler Info</h3></summary>\n",
110 |        "                    \n",
111 |        "        <div style=\"\">\n",
112 |        "            \n",
113 |        "            <div>\n",
114 |        "                <div style=\"\n",
115 |        "                    width: 24px;\n",
116 |        "                    height: 24px;\n",
117 |        "                    background-color: #FFF7E5;\n",
118 |        "                    border: 3px solid #FF6132;\n",
119 |        "                    border-radius: 5px;\n",
120 |        "                    position: absolute;\"> </div>\n",
121 |        "                <div style=\"margin-left: 48px;\">\n",
122 |        "                    <h3 style=\"margin-bottom: 0px;\">Scheduler</h3>\n",
123 |        "                    <p style=\"color: #9D9D9D; margin-bottom: 0px;\">Scheduler-dd1a7531-946d-479c-97cf-bf50918e22b0</p>\n",
124 |        "                    <table style=\"width: 100%; text-align: left;\">\n",
125 |        "                        <tr>\n",
126 |        "                            <td style=\"text-align: left;\"><strong>Comm:</strong> tcp://127.0.0.1:49654</td>\n",
127 |        "                            <td style=\"text-align: left;\"><strong>Workers:</strong> 4</td>\n",
128 |        "                        </tr>\n",
129 |        "                        <tr>\n",
130 |        "                            <td style=\"text-align: left;\">\n",
131 |        "                                <strong>Dashboard:</strong> <a href=\"http://127.0.0.1:8787/status\">http://127.0.0.1:8787/status</a>\n",
132 |        "                            </td>\n",
133 |        "                            <td style=\"text-align: left;\">\n",
134 |        "                                <strong>Total threads:</strong>\n",
135 |        "                                12\n",
136 |        "                            </td>\n",
137 |        "                        </tr>\n",
138 |        "                        <tr>\n",
139 |        "                            <td style=\"text-align: left;\">\n",
140 |        "                                <strong>Started:</strong>\n",
141 |        "                                Just now\n",
142 |        "                            </td>\n",
143 |        "                            <td style=\"text-align: left;\">\n",
144 |        "                                <strong>Total memory:</strong>\n",
145 |        "                                16.00 GiB\n",
146 |        "                            </td>\n",
147 |        "                        </tr>\n",
148 |        "                    </table>\n",
149 |        "                </div>\n",
150 |        "            </div>\n",
151 |        "        \n",
152 |        "            <details style=\"margin-left: 48px;\">\n",
153 |        "            <summary style=\"margin-bottom: 20px;\"><h3 style=\"display: inline;\">Workers</h3></summary>\n",
154 |        "            \n",
155 |        "            <div style=\"margin-bottom: 20px;\">\n",
156 |        "                <div style=\"width: 24px;\n",
157 |        "                            height: 24px;\n",
158 |        "                            background-color: #DBF5FF;\n",
159 |        "                            border: 3px solid #4CC9FF;\n",
160 |        "                            border-radius: 5px;\n",
161 |        "                            position: absolute;\"> </div>\n",
162 |        "                <div style=\"margin-left: 48px;\">\n",
163 |        "                <details>\n",
164 |        "                    <summary>\n",
165 |        "                        <h4 style=\"margin-bottom: 0px; display: inline;\">Worker: 0</h4>\n",
166 |        "                    </summary>\n",
167 |        "                    <table style=\"width: 100%; text-align: left;\">\n",
168 |        "                        <tr>\n",
169 |        "                            <td style=\"text-align: left;\"><strong>Comm: </strong> tcp://127.0.0.1:49666</td>\n",
170 |        "                            <td style=\"text-align: left;\"><strong>Total threads: </strong> 3</td>\n",
171 |        "                        </tr>\n",
172 |        "                        <tr>\n",
173 |        "                            <td style=\"text-align: left;\">\n",
174 |        "                                <strong>Dashboard: </strong>\n",
175 |        "                                <a href=\"http://127.0.0.1:49668/status\">http://127.0.0.1:49668/status</a>\n",
176 |        "                            </td>\n",
177 |        "                            <td style=\"text-align: left;\">\n",
178 |        "                                <strong>Memory: </strong>\n",
179 |        "                                4.00 GiB\n",
180 |        "                            </td>\n",
181 |        "                        </tr>\n",
182 |        "                        <tr>\n",
183 |        "                            <td style=\"text-align: left;\"><strong>Nanny: </strong> tcp://127.0.0.1:49658</td>\n",
184 |        "                            <td style=\"text-align: left;\"></td>\n",
185 |        "                        </tr>\n",
186 |        "                        <tr>\n",
187 |        "                            <td colspan=\"2\" style=\"text-align: left;\">\n",
188 |        "                                <strong>Local directory: </strong>\n",
189 |        "                                /Users/pavithra-coiled/Developer/talkpython-dask-course/2-dask-fundamentals/dask-worker-space/worker-r84rhk5j\n",
190 |        "                            </td>\n",
191 |        "                        </tr>\n",
192 |        "                        \n",
193 |        "                        \n",
194 |        "                    </table>\n",
195 |        "                </details>\n",
196 |        "                </div>\n",
197 |        "            </div>\n",
198 |        "            \n",
199 |        "            <div style=\"margin-bottom: 20px;\">\n",
200 |        "                <div style=\"width: 24px;\n",
201 |        "                            height: 24px;\n",
202 |        "                            background-color: #DBF5FF;\n",
203 |        "                            border: 3px solid #4CC9FF;\n",
204 |        "                            border-radius: 5px;\n",
205 |        "                            position: absolute;\"> </div>\n",
206 |        "                <div style=\"margin-left: 48px;\">\n",
207 |        "                <details>\n",
208 |        "                    <summary>\n",
209 |        "                        <h4 style=\"margin-bottom: 0px; display: inline;\">Worker: 1</h4>\n",
210 |        "                    </summary>\n",
211 |        "                    <table style=\"width: 100%; text-align: left;\">\n",
212 |        "                        <tr>\n",
213 |        "                            <td style=\"text-align: left;\"><strong>Comm: </strong> tcp://127.0.0.1:49667</td>\n",
214 |        "                            <td style=\"text-align: left;\"><strong>Total threads: </strong> 3</td>\n",
215 |        "                        </tr>\n",
216 |        "                        <tr>\n",
217 |        "                            <td style=\"text-align: left;\">\n",
218 |        "                                <strong>Dashboard: </strong>\n",
219 |        "                                <a href=\"http://127.0.0.1:49670/status\">http://127.0.0.1:49670/status</a>\n",
220 |        "                            </td>\n",
221 |        "                            <td style=\"text-align: left;\">\n",
222 |        "                                <strong>Memory: </strong>\n",
223 |        "                                4.00 GiB\n",
224 |        "                            </td>\n",
225 |        "                        </tr>\n",
226 |        "                        <tr>\n",
227 |        "                            <td style=\"text-align: left;\"><strong>Nanny: </strong> tcp://127.0.0.1:49659</td>\n",
228 |        "                            <td style=\"text-align: left;\"></td>\n",
229 |        "                        </tr>\n",
230 |        "                        <tr>\n",
231 |        "                            <td colspan=\"2\" style=\"text-align: left;\">\n",
232 |        "                                <strong>Local directory: </strong>\n",
233 |        "                                /Users/pavithra-coiled/Developer/talkpython-dask-course/2-dask-fundamentals/dask-worker-space/worker-c0savz3z\n",
234 |        "                            </td>\n",
235 |        "                        </tr>\n",
236 |        "                        \n",
237 |        "                        \n",
238 |        "                    </table>\n",
239 |        "                </details>\n",
240 |        "                </div>\n",
241 |        "            </div>\n",
242 |        "            \n",
243 |        "            <div style=\"margin-bottom: 20px;\">\n",
244 |        "                <div style=\"width: 24px;\n",
245 |        "                            height: 24px;\n",
246 |        "                            background-color: #DBF5FF;\n",
247 |        "                            border: 3px solid #4CC9FF;\n",
248 |        "                            border-radius: 5px;\n",
249 |        "                            position: absolute;\"> </div>\n",
250 |        "                <div style=\"margin-left: 48px;\">\n",
251 |        "                <details>\n",
252 |        "                    <summary>\n",
253 |        "                        <h4 style=\"margin-bottom: 0px; display: inline;\">Worker: 2</h4>\n",
254 |        "                    </summary>\n",
255 |        "                    <table style=\"width: 100%; text-align: left;\">\n",
256 |        "                        <tr>\n",
257 |        "                            <td style=\"text-align: left;\"><strong>Comm: </strong> tcp://127.0.0.1:49662</td>\n",
258 |        "                            <td style=\"text-align: left;\"><strong>Total threads: </strong> 3</td>\n",
259 |        "                        </tr>\n",
260 |        "                        <tr>\n",
261 |        "                            <td style=\"text-align: left;\">\n",
262 |        "                                <strong>Dashboard: </strong>\n",
263 |        "                                <a href=\"http://127.0.0.1:49664/status\">http://127.0.0.1:49664/status</a>\n",
264 |        "                            </td>\n",
265 |        "                            <td style=\"text-align: left;\">\n",
266 |        "                                <strong>Memory: </strong>\n",
267 |        "                                4.00 GiB\n",
268 |        "                            </td>\n",
269 |        "                        </tr>\n",
270 |        "                        <tr>\n",
271 |        "                            <td style=\"text-align: left;\"><strong>Nanny: </strong> tcp://127.0.0.1:49657</td>\n",
272 |        "                            <td style=\"text-align: left;\"></td>\n",
273 |        "                        </tr>\n",
274 |        "                        <tr>\n",
275 |        "                            <td colspan=\"2\" style=\"text-align: left;\">\n",
276 |        "                                <strong>Local directory: </strong>\n",
277 |        "                                /Users/pavithra-coiled/Developer/talkpython-dask-course/2-dask-fundamentals/dask-worker-space/worker-zhy6e5hn\n",
278 |        "                            </td>\n",
279 |        "                        </tr>\n",
280 |        "                        \n",
281 |        "                        \n",
282 |        "                    </table>\n",
283 |        "                </details>\n",
284 |        "                </div>\n",
285 |        "            </div>\n",
286 |        "            \n",
287 |        "            <div style=\"margin-bottom: 20px;\">\n",
288 |        "                <div style=\"width: 24px;\n",
289 |        "                            height: 24px;\n",
290 |        "                            background-color: #DBF5FF;\n",
291 |        "                            border: 3px solid #4CC9FF;\n",
292 |        "                            border-radius: 5px;\n",
293 |        "                            position: absolute;\"> </div>\n",
294 |        "                <div style=\"margin-left: 48px;\">\n",
295 |        "                <details>\n",
296 |        "                    <summary>\n",
297 |        "                        <h4 style=\"margin-bottom: 0px; display: inline;\">Worker: 3</h4>\n",
298 |        "                    </summary>\n",
299 |        "                    <table style=\"width: 100%; text-align: left;\">\n",
300 |        "                        <tr>\n",
301 |        "                            <td style=\"text-align: left;\"><strong>Comm: </strong> tcp://127.0.0.1:49660</td>\n",
302 |        "                            <td style=\"text-align: left;\"><strong>Total threads: </strong> 3</td>\n",
303 |        "                        </tr>\n",
304 |        "                        <tr>\n",
305 |        "                            <td style=\"text-align: left;\">\n",
306 |        "                                <strong>Dashboard: </strong>\n",
307 |        "                                <a href=\"http://127.0.0.1:49661/status\">http://127.0.0.1:49661/status</a>\n",
308 |        "                            </td>\n",
309 |        "                            <td style=\"text-align: left;\">\n",
310 |        "                                <strong>Memory: </strong>\n",
311 |        "                                4.00 GiB\n",
312 |        "                            </td>\n",
313 |        "                        </tr>\n",
314 |        "                        <tr>\n",
315 |        "                            <td style=\"text-align: left;\"><strong>Nanny: </strong> tcp://127.0.0.1:49656</td>\n",
316 |        "                            <td style=\"text-align: left;\"></td>\n",
317 |        "                        </tr>\n",
318 |        "                        <tr>\n",
319 |        "                            <td colspan=\"2\" style=\"text-align: left;\">\n",
320 |        "                                <strong>Local directory: </strong>\n",
321 |        "                                /Users/pavithra-coiled/Developer/talkpython-dask-course/2-dask-fundamentals/dask-worker-space/worker-kl2ig26m\n",
322 |        "                            </td>\n",
323 |        "                        </tr>\n",
324 |        "                        \n",
325 |        "                        \n",
326 |        "                    </table>\n",
327 |        "                </details>\n",
328 |        "                </div>\n",
329 |        "            </div>\n",
330 |        "            \n",
331 |        "            </details>\n",
332 |        "        </div>\n",
333 |        "        \n",
334 |        "                    </details>\n",
335 |        "                </div>\n",
336 |        "            </div>\n",
337 |        "        \n",
338 |        "                </details>\n",
339 |        "                \n",
340 |        "                </div>\n",
341 |        "            </div>\n",
342 |        "        "
343 |       ],
344 |       "text/plain": [
345 |        "<Client: 'tcp://127.0.0.1:49654' processes=4 threads=12, memory=16.00 GiB>"
346 |       ]
347 |      },
348 |      "execution_count": 5,
349 |      "metadata": {},
350 |      "output_type": "execute_result"
351 |     }
352 |    ],
353 |    "source": [
354 |     "from dask.distributed import Client\n",
355 |     "\n",
356 |     "client = Client(n_workers=4)\n",
357 |     "client"
358 |    ]
359 |   },
360 |   {
361 |    "cell_type": "code",
362 |    "execution_count": 6,
363 |    "id": "processed-apparel",
364 |    "metadata": {},
365 |    "outputs": [
366 |     {
367 |      "data": {
368 |       "text/plain": [
369 |        "dd.Scalar<series-..., dtype=float64>"
370 |       ]
371 |      },
372 |      "execution_count": 6,
373 |      "metadata": {},
374 |      "output_type": "execute_result"
375 |     }
376 |    ],
377 |    "source": [
378 |     "import dask.dataframe as dd\n",
379 |     "\n",
380 |     "df = dd.read_csv(\"data/yellow_tripdata_2019-*.csv\",\n",
381 |     "                 dtype={'RatecodeID': 'float64',\n",
382 |     "                        'VendorID': 'float64',\n",
383 |     "                        'passenger_count': 'float64',\n",
384 |     "                        'payment_type': 'float64'\n",
385 |     "                       })\n",
386 |     "\n",
387 |     "max_tip_amount = df.groupby(\"passenger_count\").tip_amount.mean().max()\n",
388 |     "max_tip_amount"
389 |    ]
390 |   },
391 |   {
392 |    "cell_type": "code",
393 |    "execution_count": 7,
394 |    "id": "62c9bfdd-0fa9-4a1c-b2d9-9cca37d00947",
395 |    "metadata": {},
396 |    "outputs": [
397 |     {
398 |      "name": "stdout",
399 |      "output_type": "stream",
400 |      "text": [
401 |       "CPU times: user 1min 14s, sys: 4.03 s, total: 1min 18s\n",
402 |       "Wall time: 2min 12s\n"
403 |      ]
404 |     },
405 |     {
406 |      "data": {
407 |       "text/plain": [
408 |        "7.377822222222222"
409 |       ]
410 |      },
411 |      "execution_count": 7,
412 |      "metadata": {},
413 |      "output_type": "execute_result"
414 |     }
415 |    ],
416 |    "source": [
417 |     "%%time\n",
418 |     "\n",
419 |     "max_tip_amount.compute()"
420 |    ]
421 |   },
422 |   {
423 |    "cell_type": "markdown",
424 |    "id": "82a447a2-0232-488e-bc03-a0a5d571d2e4",
425 |    "metadata": {},
426 |    "source": [
427 |     "Let's try this computation using different schedulers and look at the results. We select the scheduler _inline_ by passing the `scheduler` keyword to `compute`, which overrides the distributed client for that call."
428 |    ]
429 |   },
430 |   {
431 |    "cell_type": "code",
432 |    "execution_count": 4,
433 |    "id": "continental-puppy",
434 |    "metadata": {},
435 |    "outputs": [
436 |     {
437 |      "name": "stderr",
438 |      "output_type": "stream",
439 |      "text": [
440 |       "/Users/pavithra-coiled/.conda/envs/talkpython-dask/lib/python3.8/site-packages/dask/core.py:151: DtypeWarning: Columns (6) have mixed types.Specify dtype option on import or set low_memory=False.\n",
441 |       "  result = _execute_task(task, cache)\n",
442 |       "/Users/pavithra-coiled/.conda/envs/talkpython-dask/lib/python3.8/site-packages/pandas/core/indexes/base.py:3080: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison\n",
443 |       "  return self._engine.get_loc(casted_key)\n",
444 |       "/Users/pavithra-coiled/.conda/envs/talkpython-dask/lib/python3.8/site-packages/dask/core.py:151: DtypeWarning: Columns (6) have mixed types.Specify dtype option on import or set low_memory=False.\n",
445 |       "  result = _execute_task(task, cache)\n"
446 |      ]
447 |     },
448 |     {
449 |      "name": "stdout",
450 |      "output_type": "stream",
451 |      "text": [
452 |       "Scheduler: threading , Compute time: 105.8745608329773 , Result: 7.377822222222222\n",
453 |       "Scheduler: processes , Compute time: 384.7675268650055 , Result: 7.377822222222222\n"
454 |      ]
455 |     },
456 |     {
457 |      "name": "stderr",
458 |      "output_type": "stream",
459 |      "text": [
460 |       "/Users/pavithra-coiled/.conda/envs/talkpython-dask/lib/python3.8/site-packages/dask/core.py:151: DtypeWarning: Columns (6) have mixed types.Specify dtype option on import or set low_memory=False.\n",
461 |       "  result = _execute_task(task, cache)\n"
462 |      ]
463 |     },
464 |     {
465 |      "name": "stdout",
466 |      "output_type": "stream",
467 |      "text": [
468 |       "Scheduler: sync , Compute time: 176.0339961051941 , Result: 7.377822222222222\n"
469 |      ]
470 |     }
471 |    ],
472 |    "source": [
473 |     "import time\n",
474 |     "\n",
475 |     "for sch in ['threading', 'processes', 'sync']:\n",
476 |     "    t0 = time.time()\n",
477 |     "    amount = max_tip_amount.compute(scheduler=sch)\n",
478 |     "    t1 = time.time()\n",
479 |     "    print(\"Scheduler:\", sch, \", Compute time:\", t1 - t0, \", Result:\", amount)"
480 |    ]
481 |   },
482 |   {
483 |    "cell_type": "markdown",
484 |    "id": "7f329ce0-78a6-4840-a1ec-25f3915d8343",
485 |    "metadata": {},
486 |    "source": [
487 |     "The results are identical, but the compute time varies widely: in the recorded run, about 106 s with threads, 176 s with the synchronous scheduler, and 385 s with processes. Each scheduler works differently and is best suited to specific situations."
488 |    ]
489 |   },
490 |   {
491 |    "cell_type": "markdown",
492 |    "id": "97f6323a-54f7-423e-b0b1-432a8a4f6318",
493 |    "metadata": {},
494 |    "source": [
495 |     "For most cases, we recommend using the distributed scheduler:\n",
496 |     "\n",
497 |     "```python\n",
498 |     "from dask.distributed import Client\n",
499 |     "client = Client()\n",
500 |     "```"
501 |    ]
502 |   },
503 |   {
504 |    "cell_type": "markdown",
505 |    "id": "1c30de77-6bf1-4138-8441-8921996965fe",
506 |    "metadata": {},
507 |    "source": [
508 |     "Note that only the distributed scheduler supports all the dashboards, modern scheduling improvements, and other features.\n",
509 |     "\n",
510 |     "The distributed scheduler:\n",
511 |     "\n",
512 |     "  * works well on a single machine, not only on a cluster\n",
513 |     "  * is recommended for workloads that hold the GIL (such as `dask.bag` and custom code wrapped with `dask.delayed`), even on a single machine\n",
514 |     "  * is more intelligent and provides better diagnostics than the processes scheduler\n",
515 |     "  * is required for scaling out work across a cluster\n",
516 |     " "
517 |    ]
518 |   },
519 |   {
520 |    "cell_type": "markdown",
521 |    "id": "fa58aedf-2919-4678-bf2d-5d8445e7a261",
522 |    "metadata": {},
523 |    "source": [
524 |     "Finally, let's close the cluster!"
525 |    ]
526 |   },
527 |   {
528 |    "cell_type": "code",
529 |    "execution_count": 8,
530 |    "id": "394e7bd1-2baf-42a3-bfe5-616c9d632a5f",
531 |    "metadata": {},
532 |    "outputs": [],
533 |    "source": [
534 |     "client.close()"
535 |    ]
536 |   },
537 |   {
538 |    "cell_type": "markdown",
539 |    "id": "worst-controversy",
540 |    "metadata": {},
541 |    "source": [
542 |     "## References\n",
543 |     "\n",
544 |     "* [Scheduling Documentation](https://docs.dask.org/en/latest/scheduling.html)\n",
545 |     "* [Dask Tutorial - Distributed](https://tutorial.dask.org/05_distributed.html)"
546 |    ]
547 |   },
548 |   {
549 |    "cell_type": "code",
550 |    "execution_count": null,
551 |    "id": "single-landing",
552 |    "metadata": {},
553 |    "outputs": [],
554 |    "source": []
555 |   }
556 |  ],
557 |  "metadata": {
558 |   "kernelspec": {
559 |    "display_name": "Python 3",
560 |    "language": "python",
561 |    "name": "python3"
562 |   },
563 |   "language_info": {
564 |    "codemirror_mode": {
565 |     "name": "ipython",
566 |     "version": 3
567 |    },
568 |    "file_extension": ".py",
569 |    "mimetype": "text/x-python",
570 |    "name": "python",
571 |    "nbconvert_exporter": "python",
572 |    "pygments_lexer": "ipython3",
573 |    "version": "3.8.10"
574 |   }
575 |  },
576 |  "nbformat": 4,
577 |  "nbformat_minor": 5
578 | }
579 | 
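
As a rough companion to the scheduler comparison in `06-schedulers.ipynb`, here is a minimal sketch that times one lazy computation under several single-machine schedulers. It swaps the NYC Taxi DataFrame for a toy `dask.delayed` graph so that it runs without any data files. `slow_square` and `time_schedulers` are hypothetical helper names, not part of Dask, and the timings will not match those recorded in the notebook.

```python
import time

from dask import delayed


def slow_square(x):
    """Toy task (an assumption for illustration): sleeping releases the GIL."""
    time.sleep(0.05)
    return x * x


def time_schedulers(graph, schedulers):
    """Compute the same lazy graph once per scheduler.

    Returns a dict mapping scheduler name to (elapsed seconds, result).
    """
    timings = {}
    for name in schedulers:
        t0 = time.perf_counter()
        result = graph.compute(scheduler=name)  # scheduler chosen inline
        timings[name] = (time.perf_counter() - t0, result)
    return timings


# Eight independent tasks reduced to a single value.
total = delayed(sum)([delayed(slow_square)(i) for i in range(8)])

timings = time_schedulers(total, ["threads", "synchronous"])
for name, (seconds, result) in timings.items():
    print(f"Scheduler: {name}, Compute time: {seconds:.2f} s, Result: {result}")
```

Both schedulers should return the same result; only the elapsed time should differ. Adding `"processes"` to the list should also work, although on platforms that spawn worker processes a standalone script may need an `if __name__ == "__main__":` guard.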


--------------------------------------------------------------------------------
/07-ML.ipynb:
--------------------------------------------------------------------------------
   1 | {
   2 |  "cells": [
   3 |   {
   4 |    "cell_type": "markdown",
   5 |    "id": "bf3d47cc-d76e-4fad-97c4-117f64c7d857",
   6 |    "metadata": {},
   7 |    "source": [
   8 |     "# Dask-ML\n",
   9 |     "\n",
  10 |     "## Notebook Objectives\n",
  11 |     "* **Demonstrate scikit-learn**, a library for machine learning in Python.\n",
  12 |     "* Use **Joblib and Dask to leverage parallelism** in case of compute-bound challenges.\n",
  13 |     "* Use **Dask-ML for distributed machine learning** in case of memory-bound challenges.\n",
  14 |     "* A brief look at **machine learning in the cloud** for additional computing resources. (Optional)\n",
  15 |     "* **References** for further reading."
  16 |    ]
  17 |   },
  18 |   {
  19 |    "cell_type": "markdown",
  20 |    "id": "24da6bf0-474e-4323-b2ca-6d507dbd1bd9",
  21 |    "metadata": {},
  22 |    "source": [
  23 |     "## scikit-learn for machine learning\n",
  24 |     "\n",
  25 |     "scikit-learn is a powerful library for machine learning in Python. It provides tools for pre-processing, model training, evaluation, and more.\n",
  26 |     "\n",
  27 |     "If your model and data fit on your computer, we recommend using scikit-learn as usual with no parallelism.\n",
  28 |     "\n",
  29 |     "Let's take a look at how you can train machine learning models in scikit-learn."
  30 |    ]
  31 |   },
  32 |   {
  33 |    "cell_type": "markdown",
  34 |    "id": "862ae9be-26be-4163-8724-e1888fc4fd9c",
  35 |    "metadata": {},
  36 |    "source": [
  37 |     "#### Creating Datasets\n",
  38 |     "\n",
  39 |     "We start by generating some synthetic data using scikit-learn's [`make_classification`](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_classification.html) function. `make_classification` generates a random classification problem; here we create one with 100k samples and 10 features."
  40 |    ]
  41 |   },
  42 |   {
  43 |    "cell_type": "code",
  44 |    "execution_count": 6,
  45 |    "id": "eba26578-71f6-4a90-b599-0c010c9be050",
  46 |    "metadata": {},
  47 |    "outputs": [],
  48 |    "source": [
  49 |     "from sklearn.datasets import make_classification\n",
  50 |     "\n",
  51 |     "X, y = make_classification(n_samples=100_000, n_features=10, random_state=0)"
  52 |    ]
  53 |   },
  54 |   {
  55 |    "cell_type": "markdown",
  56 |    "id": "44748d91-b9b8-4a6c-8a63-49765ec98c22",
  57 |    "metadata": {},
  58 |    "source": [
  59 |     "Let's examine the X and y variables. Note that X represents the set of input variables and y the output/target variables."
  60 |    ]
  61 |   },
  62 |   {
  63 |    "cell_type": "code",
  64 |    "execution_count": 7,
  65 |    "id": "d8e0df36-cb38-4528-b70e-bf0b3eb69965",
  66 |    "metadata": {},
  67 |    "outputs": [
  68 |     {
  69 |      "data": {
  70 |       "text/plain": [
  71 |        "array([[-0.7462974 ,  0.19602952,  0.11141229,  0.59340009,  1.32627975,\n",
  72 |        "        -1.10504115, -0.63411817,  1.19223806, -0.32277383, -0.03057938],\n",
  73 |        "       [-0.74584283, -0.24857446,  0.50831426, -0.6628635 ,  1.24896798,\n",
  74 |        "         0.95601408, -2.28687281,  1.12441665, -1.53928374,  0.78151558],\n",
  75 |        "       [-0.62459237, -0.02605275, -0.18403411, -0.94905415,  1.07726998,\n",
  76 |        "         1.18669218,  0.30910096,  0.8074069 , -0.79054371,  0.059631  ],\n",
  77 |        "       [-0.99690131, -0.09017488,  0.67867704,  0.28108283,  1.71104871,\n",
  78 |        "         1.01523959,  0.78247076,  1.26565066, -1.39478782,  1.37608239],\n",
  79 |        "       [ 0.40153919,  0.29434464, -1.76744682,  1.20321684, -0.64477815,\n",
  80 |        "        -0.36214576,  0.61815685,  0.93696374,  1.26810107,  0.2989785 ]])"
  81 |       ]
  82 |      },
  83 |      "execution_count": 7,
  84 |      "metadata": {},
  85 |      "output_type": "execute_result"
  86 |     }
  87 |    ],
  88 |    "source": [
  89 |     "X[:5]"
  90 |    ]
  91 |   },
  92 |   {
  93 |    "cell_type": "code",
  94 |    "execution_count": 8,
  95 |    "id": "2da5faf7-b29b-47e2-a5b4-98182b2cdada",
  96 |    "metadata": {},
  97 |    "outputs": [
  98 |     {
  99 |      "data": {
 100 |       "text/plain": [
 101 |        "array([0, 0, 0, 0, 1])"
 102 |       ]
 103 |      },
 104 |      "execution_count": 8,
 105 |      "metadata": {},
 106 |      "output_type": "execute_result"
 107 |     }
 108 |    ],
 109 |    "source": [
 110 |     "y[:5]"
 111 |    ]
 112 |   },
 113 |   {
 114 |    "cell_type": "markdown",
 115 |    "id": "54418017-3d4b-45fc-a052-e81cb2f71adc",
 116 |    "metadata": {},
 117 |    "source": [
 118 |     "#### k-nearest neighbors Classification\n",
 119 |     "\n",
 120 |     "Next, we will train a [k-NN classifier](https://scikit-learn.org/stable/modules/neighbors.html#classification), which predicts the class of a query point from the classes of its 'k' nearest neighbors in the training data.\n",
 121 |     "\n",
 122 |     "Scikit-learn makes it very easy to train this model. All we need to do is call the `fit` method; the `score` method then computes the accuracy (the fraction of the samples the model classifies correctly)."
 123 |    ]
 124 |   },
 125 |   {
 126 |    "cell_type": "code",
 127 |    "execution_count": null,
 128 |    "id": "e63474a1-d763-4522-b299-64cad8c5b2ee",
 129 |    "metadata": {},
 130 |    "outputs": [],
 131 |    "source": [
 132 |     "from sklearn.neighbors import KNeighborsClassifier"
 133 |    ]
 134 |   },
 135 |   {
 136 |    "cell_type": "code",
 137 |    "execution_count": null,
 138 |    "id": "46d770e9-76a5-4c7e-ac56-76a1057ab488",
 139 |    "metadata": {},
 140 |    "outputs": [],
 141 |    "source": [
 142 |     "%%time\n",
 143 |     "\n",
 144 |     "neigh = KNeighborsClassifier(n_neighbors=3)\n",
 145 |     "clf = neigh.fit(X, y)"
 146 |    ]
 147 |   },
 148 |   {
 149 |    "cell_type": "code",
 150 |    "execution_count": null,
 151 |    "id": "1021e289-7b34-4fb1-8100-8b34fba9c250",
 152 |    "metadata": {},
 153 |    "outputs": [],
 154 |    "source": [
 155 |     "clf.score(X, y)"
 156 |    ]
 157 |   },
 158 |   {
 159 |    "cell_type": "markdown",
 160 |    "id": "d5ce56d2-3597-4c19-a5dc-858c8b3bf574",
 161 |    "metadata": {},
 162 |    "source": [
 163 |     "#### Hyperparameter Tuning\n",
 164 |     "\n",
 165 |     "Hyperparameters are settings that are fixed before training and that affect how well your model performs. For example, in the k-NN example above, the value of k is chosen ahead of time. We might want to check how the model performs with different values of k and select the best one. This process of selecting the best hyperparameters is called hyperparameter tuning.\n",
 166 |     "\n",
 167 |     "There are many ways to tune hyperparameters; we will look at `GridSearchCV` in this notebook."
 168 |    ]
 169 |   },
 170 |   {
 171 |    "cell_type": "code",
 172 |    "execution_count": 10,
 173 |    "id": "80b75c4a-34c8-4238-a494-9bdf918077cf",
 174 |    "metadata": {},
 175 |    "outputs": [],
 176 |    "source": [
 177 |     "from sklearn.model_selection import GridSearchCV"
 178 |    ]
 179 |   },
 180 |   {
 181 |    "cell_type": "markdown",
 182 |    "id": "2c49df4f-9194-4289-82dc-bdf53ac0c811",
 183 |    "metadata": {},
 184 |    "source": [
 185 |     "We can specify the parameters to be explored as shown below, and then run `fit` on all the sets of parameters."
 186 |    ]
 187 |   },
 188 |   {
 189 |    "cell_type": "code",
 190 |    "execution_count": 11,
 191 |    "id": "10630bc8-f757-493c-bbe6-911db7b27cba",
 192 |    "metadata": {},
 193 |    "outputs": [],
 194 |    "source": [
 195 |     "param_grid = {\n",
 196 |     "    'n_neighbors': [3, 5, 8],\n",
 197 |     "    'weights': ['uniform', 'distance'],\n",
 198 |     "}"
 199 |    ]
 200 |   },
 201 |   {
 202 |    "cell_type": "markdown",
 203 |    "id": "f29856c5-5183-4333-be96-a27d81774bb4",
 204 |    "metadata": {},
 205 |    "source": [
 206 |     "`verbose` gives us a detailed output for each fit and `cv` is used to define the number of folds during cross-validation."
 207 |    ]
 208 |   },
 209 |   {
 210 |    "cell_type": "code",
 211 |    "execution_count": 12,
 212 |    "id": "fd0ecee7-c80f-4d8a-a94b-b11688ff0e91",
 213 |    "metadata": {},
 214 |    "outputs": [
 215 |     {
 216 |      "name": "stdout",
 217 |      "output_type": "stream",
 218 |      "text": [
 219 |       "Fitting 2 folds for each of 6 candidates, totalling 12 fits\n",
 220 |       "[CV] END .....................n_neighbors=3, weights=uniform; total time=   5.6s\n",
 221 |       "[CV] END .....................n_neighbors=3, weights=uniform; total time=   5.7s\n",
 222 |       "[CV] END ....................n_neighbors=3, weights=distance; total time=   4.3s\n",
 223 |       "[CV] END ....................n_neighbors=3, weights=distance; total time=   4.3s\n",
 224 |       "[CV] END .....................n_neighbors=5, weights=uniform; total time=   7.1s\n",
 225 |       "[CV] END .....................n_neighbors=5, weights=uniform; total time=   9.9s\n",
 226 |       "[CV] END ....................n_neighbors=5, weights=distance; total time=   7.7s\n",
 227 |       "[CV] END ....................n_neighbors=5, weights=distance; total time=   5.8s\n",
 228 |       "[CV] END .....................n_neighbors=8, weights=uniform; total time=   7.3s\n",
 229 |       "[CV] END .....................n_neighbors=8, weights=uniform; total time=   6.5s\n",
 230 |       "[CV] END ....................n_neighbors=8, weights=distance; total time=   5.3s\n",
 231 |       "[CV] END ....................n_neighbors=8, weights=distance; total time=   5.3s\n",
 232 |       "CPU times: user 1min 14s, sys: 283 ms, total: 1min 14s\n",
 233 |       "Wall time: 1min 14s\n"
 234 |      ]
 235 |     },
 236 |     {
 237 |      "data": {
 238 |       "text/plain": [
 239 |        "GridSearchCV(cv=2, estimator=KNeighborsClassifier(n_neighbors=3),\n",
 240 |        "             param_grid={'n_neighbors': [3, 5, 8],\n",
 241 |        "                         'weights': ['uniform', 'distance']},\n",
 242 |        "             verbose=2)"
 243 |       ]
 244 |      },
 245 |      "execution_count": 12,
 246 |      "metadata": {},
 247 |      "output_type": "execute_result"
 248 |     }
 249 |    ],
 250 |    "source": [
 251 |     "%%time\n",
 252 |     "\n",
 253 |     "grid_search = GridSearchCV(clf, param_grid, verbose=2, cv=2)\n",
 254 |     "grid_search.fit(X, y)"
 255 |    ]
 256 |   },
 257 |   {
 258 |    "cell_type": "markdown",
 259 |    "id": "78d5ad72-e391-4b6d-a4d6-4f36fff04623",
 260 |    "metadata": {},
 261 |    "source": [
 262 |     "Note the time taken!\n",
 263 |     "\n",
 264 |     "Now, we can check what the best parameters were and the best score they produced."
 265 |    ]
 266 |   },
 267 |   {
 268 |    "cell_type": "code",
 269 |    "execution_count": 13,
 270 |    "id": "ca3974c6-59c8-43e8-9a48-ce7773c7172c",
 271 |    "metadata": {},
 272 |    "outputs": [
 273 |     {
 274 |      "data": {
 275 |       "text/plain": [
 276 |        "{'n_neighbors': 8, 'weights': 'distance'}"
 277 |       ]
 278 |      },
 279 |      "execution_count": 13,
 280 |      "metadata": {},
 281 |      "output_type": "execute_result"
 282 |     }
 283 |    ],
 284 |    "source": [
 285 |     "grid_search.best_params_"
 286 |    ]
 287 |   },
 288 |   {
 289 |    "cell_type": "code",
 290 |    "execution_count": 14,
 291 |    "id": "f58bfcec-6518-4d9c-b0bd-5cfd81716b2d",
 292 |    "metadata": {},
 293 |    "outputs": [
 294 |     {
 295 |      "data": {
 296 |       "text/plain": [
 297 |        "0.8952100000000001"
 298 |       ]
 299 |      },
 300 |      "execution_count": 14,
 301 |      "metadata": {},
 302 |      "output_type": "execute_result"
 303 |     }
 304 |    ],
 305 |    "source": [
 306 |     "grid_search.best_score_"
 307 |    ]
 308 |   },
 309 |   {
 310 |    "cell_type": "markdown",
 311 |    "id": "8862e077-c508-42e9-894f-d910c4d33d71",
 312 |    "metadata": {},
 313 |    "source": [
 314 |     "## joblib and Dask for compute-bound problems\n",
 315 |     "\n",
 316 |     "If your data fits in memory but your model is complex, a general solution is to leverage parallel computing."
 317 |    ]
 318 |   },
 319 |   {
 320 |    "cell_type": "markdown",
 321 |    "id": "eb83f047-fdcf-4793-8e26-c082dddedc9e",
 322 |    "metadata": {},
 323 |    "source": [
 324 |     "### Single-machine parallelism: scikit-learn + joblib\n",
 325 |     "\n",
 326 |     "scikit-learn offers **single-machine parallelism** using a tool called Joblib. We can parallelize some algorithms by passing the number of cores to use in the `n_jobs` parameter.\n",
 327 |     "\n",
 328 |     "Let's look at GridSearchCV again, but this time we will use all available CPU cores. To do this, we set `n_jobs=-1`. Note that you can also specify the exact number of cores to use; for example, `n_jobs=4` will use 4 cores."
 329 |    ]
 330 |   },
 331 |   {
 332 |    "cell_type": "code",
 333 |    "execution_count": 15,
 334 |    "id": "a8772a9e-613d-4676-82c1-af6f7494e1de",
 335 |    "metadata": {
 336 |     "tags": []
 337 |    },
 338 |    "outputs": [
 339 |     {
 340 |      "name": "stdout",
 341 |      "output_type": "stream",
 342 |      "text": [
 343 |       "CPU times: user 176 ms, sys: 118 ms, total: 294 ms\n",
 344 |       "Wall time: 32.4 s\n"
 345 |      ]
 346 |     },
 347 |     {
 348 |      "data": {
 349 |       "text/plain": [
 350 |        "GridSearchCV(cv=2, estimator=KNeighborsClassifier(n_neighbors=3), n_jobs=-1,\n",
 351 |        "             param_grid={'n_neighbors': [3, 5, 8],\n",
 352 |        "                         'weights': ['uniform', 'distance']})"
 353 |       ]
 354 |      },
 355 |      "execution_count": 15,
 356 |      "metadata": {},
 357 |      "output_type": "execute_result"
 358 |     }
 359 |    ],
 360 |    "source": [
 361 |     "%%time\n",
 362 |     "\n",
 363 |     "grid_search = GridSearchCV(clf, param_grid, cv=2, n_jobs=-1)\n",
 364 |     "grid_search.fit(X, y)"
 365 |    ]
 366 |   },
 367 |   {
 368 |    "cell_type": "markdown",
 369 |    "id": "118dd976-5ef9-4679-8154-d5e648a6606f",
 370 |    "metadata": {},
 371 |    "source": [
 372 |     "Notice how the wall time dropped to less than half, from 1min 14s to about 32 s!"
 373 |    ]
 374 |   },
 375 |   {
 376 |    "cell_type": "markdown",
 377 |    "id": "79c483fc-a862-4280-a6a8-e941f60d8c3f",
 378 |    "metadata": {},
 379 |    "source": [
 380 |     "### Multi-machine parallelism: scikit-learn + joblib + Dask\n",
 381 |     "\n",
 382 |     "Dask offers a *parallel backend* for joblib that can scale this computation to a cluster.\n",
 383 |     "\n",
 384 |     "First, let's spin up a cluster and open the dashboard plots!"
 385 |    ]
 386 |   },
 387 |   {
 388 |    "cell_type": "code",
 389 |    "execution_count": 24,
 390 |    "id": "5fbe6de6-726f-47e0-9d94-44b127cd182a",
 391 |    "metadata": {},
 392 |    "outputs": [
 393 |     {
 394 |      "name": "stderr",
 395 |      "output_type": "stream",
 396 |      "text": [
 397 |       "/opt/anaconda3/envs/talkpython-dask/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.\n",
 398 |       "Perhaps you already have a cluster running?\n",
 399 |       "Hosting the HTTP server on port 55138 instead\n",
 400 |       "  warnings.warn(\n"
 401 |      ]
 402 |     },
 403 |     {
 404 |      "data": {
 405 |       "text/html": [
 406 |        "<table style=\"border: 2px solid white;\">\n",
 407 |        "<tr>\n",
 408 |        "<td style=\"vertical-align: top; border: 0px solid white\">\n",
 409 |        "<h3 style=\"text-align: left;\">Client</h3>\n",
 410 |        "<ul style=\"text-align: left; list-style: none; margin: 0; padding: 0;\">\n",
 411 |        "  <li><b>Scheduler: </b>tcp://127.0.0.1:55139</li>\n",
 412 |        "  <li><b>Dashboard: </b><a href='http://127.0.0.1:55138/status' target='_blank'>http://127.0.0.1:55138/status</a></li>\n",
 413 |        "</ul>\n",
 414 |        "</td>\n",
 415 |        "<td style=\"vertical-align: top; border: 0px solid white\">\n",
 416 |        "<h3 style=\"text-align: left;\">Cluster</h3>\n",
 417 |        "<ul style=\"text-align: left; list-style:none; margin: 0; padding: 0;\">\n",
 418 |        "  <li><b>Workers: </b>4</li>\n",
 419 |        "  <li><b>Cores: </b>12</li>\n",
 420 |        "  <li><b>Memory: </b>16.00 GiB</li>\n",
 421 |        "</ul>\n",
 422 |        "</td>\n",
 423 |        "</tr>\n",
 424 |        "</table>"
 425 |       ],
 426 |       "text/plain": [
 427 |        "<Client: 'tcp://127.0.0.1:55139' processes=4 threads=12, memory=16.00 GiB>"
 428 |       ]
 429 |      },
 430 |      "execution_count": 24,
 431 |      "metadata": {},
 432 |      "output_type": "execute_result"
 433 |     }
 434 |    ],
 435 |    "source": [
 436 |     "import joblib\n",
 437 |     "from dask.distributed import Client\n",
 438 |     "\n",
 439 |     "client = Client(n_workers=4)\n",
 440 |     "client"
 441 |    ]
 442 |   },
 443 |   {
 444 |    "cell_type": "markdown",
 445 |    "id": "bba07264-4902-46ca-80e2-4a9cc091101f",
 446 |    "metadata": {},
 447 |    "source": [
 448 |     "Continuing with the previous GridSearchCV example, we can use Dask as shown below:"
 449 |    ]
 450 |   },
 451 |   {
 452 |    "cell_type": "code",
 453 |    "execution_count": 25,
 454 |    "id": "8bd919f3-bb0b-40db-9f54-fcb97ec47016",
 455 |    "metadata": {},
 456 |    "outputs": [
 457 |     {
 458 |      "name": "stdout",
 459 |      "output_type": "stream",
 460 |      "text": [
 461 |       "CPU times: user 5.3 s, sys: 2.57 s, total: 7.87 s\n",
 462 |       "Wall time: 36 s\n"
 463 |      ]
 464 |     }
 465 |    ],
 466 |    "source": [
 467 |     "%%time\n",
 468 |     "\n",
 469 |     "with joblib.parallel_backend(\"dask\", scatter=[X, y]):\n",
 470 |     "    grid_search.fit(X, y)"
 471 |    ]
 472 |   },
 473 |   {
 474 |    "cell_type": "markdown",
 475 |    "id": "c11895e1-8c77-4680-b7e9-4f1a00d2d34b",
 476 |    "metadata": {},
 477 |    "source": [
 478 |     "## Checkpoint\n",
 479 |     "\n",
 480 |     "**Question:** Fit a `LogisticRegressionCV` model on the given data. Implement it with and without parallelism and note the time. Reference: [scikit-learn documentation](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegressionCV.html?highlight=logistic%20regression#sklearn.linear_model.LogisticRegressionCV)"
 481 |    ]
 482 |   },
 483 |   {
 484 |    "cell_type": "code",
 485 |    "execution_count": 25,
 486 |    "id": "b29d1e44-50a4-484d-b1e9-99d0721564ca",
 487 |    "metadata": {
 488 |     "tags": []
 489 |    },
 490 |    "outputs": [],
 491 |    "source": [
 492 |     "from sklearn.linear_model import LogisticRegressionCV"
 493 |    ]
 494 |   },
 495 |   {
 496 |    "cell_type": "code",
 497 |    "execution_count": null,
 498 |    "id": "a21da37a-9fd0-4c94-a60b-54b3902ef798",
 499 |    "metadata": {},
 500 |    "outputs": [],
 501 |    "source": [
 502 |     "# Your answer here"
 503 |    ]
 504 |   },
 505 |   {
 506 |    "cell_type": "code",
 507 |    "execution_count": 38,
 508 |    "id": "d45b2213-801e-49b8-910b-01e963b70c97",
 509 |    "metadata": {
 510 |     "collapsed": true,
 511 |     "jupyter": {
 512 |      "outputs_hidden": true,
 513 |      "source_hidden": true
 514 |     },
 515 |     "tags": []
 516 |    },
 517 |    "outputs": [
 518 |     {
 519 |      "name": "stdout",
 520 |      "output_type": "stream",
 521 |      "text": [
 522 |       "CPU times: user 11.3 s, sys: 206 ms, total: 11.5 s\n",
 523 |       "Wall time: 1.01 s\n"
 524 |      ]
 525 |     }
 526 |    ],
 527 |    "source": [
 528 |     "%%time\n",
 529 |     "\n",
 530 |     "# Without parallelism\n",
 531 |     "clf = LogisticRegressionCV(cv=4, random_state=0).fit(X, y)"
 532 |    ]
 533 |   },
 534 |   {
 535 |    "cell_type": "code",
 536 |    "execution_count": 40,
 537 |    "id": "3b0e9d0f-9552-47ff-9e0a-0b7709eff961",
 538 |    "metadata": {
 539 |     "collapsed": true,
 540 |     "jupyter": {
 541 |      "outputs_hidden": true,
 542 |      "source_hidden": true
 543 |     },
 544 |     "tags": []
 545 |    },
 546 |    "outputs": [
 547 |     {
 548 |      "name": "stdout",
 549 |      "output_type": "stream",
 550 |      "text": [
 551 |       "CPU times: user 265 ms, sys: 46.1 ms, total: 311 ms\n",
 552 |       "Wall time: 657 ms\n"
 553 |      ]
 554 |     }
 555 |    ],
 556 |    "source": [
 557 |     "%%time\n",
 558 |     "\n",
 559 |     "# With parallelism (We can have all 4 folds execute in parallel!)\n",
 560 |     "clf = LogisticRegressionCV(cv=4, random_state=0, n_jobs=4).fit(X, y)"
 561 |    ]
 562 |   },
 563 |   {
 564 |    "cell_type": "markdown",
 565 |    "id": "1d8e6216-ccc0-449a-ae1c-22723e6559ea",
 566 |    "metadata": {},
 567 |    "source": [
 568 |     "## Dask-ML for memory-bound problems\n",
 569 |     "\n",
 570 |     "Memory-bound problems arise when your dataset is too large to fit in memory. This is where Dask can help. In the previous course, we saw how Dask DataFrame can be used to perform pandas-like operations on larger-than-memory data. Similarly, we can use Dask-ML to perform scikit-learn-like operations on our large datasets."
 571 |    ]
 572 |   },
 573 |   {
 574 |    "cell_type": "markdown",
 575 |    "id": "73899ada-2d9e-4d56-9be0-79239d55233d",
 576 |    "metadata": {},
 577 |    "source": [
 578 |     "We can use Dask-ML on the previous GridSearchCV example, but this time, with more parameters."
 579 |    ]
 580 |   },
 581 |   {
 582 |    "cell_type": "code",
 583 |    "execution_count": 26,
 584 |    "id": "f54a1552-7fca-4fdb-b50f-2ab7d729f53c",
 585 |    "metadata": {},
 586 |    "outputs": [],
 587 |    "source": [
 588 |     "import dask_ml.model_selection as dcv"
 589 |    ]
 590 |   },
 591 |   {
 592 |    "cell_type": "code",
 593 |    "execution_count": 27,
 594 |    "id": "64c30e9e-22d4-4eb6-bac1-afb08abd2620",
 595 |    "metadata": {},
 596 |    "outputs": [],
 597 |    "source": [
 598 |     "param_grid = {\n",
 599 |     "    'n_neighbors': [3, 5, 8],\n",
 600 |     "    'weights': ['uniform', 'distance'],\n",
 601 |     "    'algorithm': ['auto', 'ball_tree'],\n",
 602 |     "}"
 603 |    ]
 604 |   },
 605 |   {
 606 |    "cell_type": "code",
 607 |    "execution_count": 28,
 608 |    "id": "549ccf50-9946-4c80-bb6d-15225a0c42e5",
 609 |    "metadata": {},
 610 |    "outputs": [
 611 |     {
 612 |      "name": "stdout",
 613 |      "output_type": "stream",
 614 |      "text": [
 615 |       "CPU times: user 34.9 s, sys: 5.52 s, total: 40.5 s\n",
 616 |       "Wall time: 4min 14s\n"
 617 |      ]
 618 |     },
 619 |     {
 620 |      "data": {
 621 |       "text/plain": [
 622 |        "GridSearchCV(cv=2, estimator=KNeighborsClassifier(n_neighbors=3),\n",
 623 |        "             param_grid={'algorithm': ['auto', 'ball_tree'],\n",
 624 |        "                         'n_neighbors': [3, 5, 8],\n",
 625 |        "                         'weights': ['uniform', 'distance']})"
 626 |       ]
 627 |      },
 628 |      "execution_count": 28,
 629 |      "metadata": {},
 630 |      "output_type": "execute_result"
 631 |     }
 632 |    ],
 633 |    "source": [
 634 |     "%%time\n",
 635 |     "\n",
 636 |     "grid_search = dcv.GridSearchCV(neigh, param_grid, cv=2)\n",
 637 |     "grid_search.fit(X, y)"
 638 |    ]
 639 |   },
 640 |   {
 641 |    "cell_type": "markdown",
 642 |    "id": "d20f0266-25c3-41cf-803c-61c95f10f92e",
 643 |    "metadata": {},
 644 |    "source": [
 645 |     "Let's look at another algorithm: Logistic Regression using Dask-ML. As Dask-ML implements the scikit-learn API, the code is similar."
 646 |    ]
 647 |   },
 648 |   {
 649 |    "cell_type": "code",
 650 |    "execution_count": 29,
 651 |    "id": "f518e6a3-8f49-40e9-ae68-0d98473d9e8b",
 652 |    "metadata": {},
 653 |    "outputs": [],
 654 |    "source": [
 655 |     "from dask_ml.linear_model import LogisticRegression"
 656 |    ]
 657 |   },
 658 |   {
 659 |    "cell_type": "code",
 660 |    "execution_count": 30,
 661 |    "id": "f8ee9f60-9067-477e-95d1-3f3dcc97d543",
 662 |    "metadata": {},
 663 |    "outputs": [
 664 |     {
 665 |      "name": "stdout",
 666 |      "output_type": "stream",
 667 |      "text": [
 668 |       "CPU times: user 609 ms, sys: 136 ms, total: 744 ms\n",
 669 |       "Wall time: 1.82 s\n"
 670 |      ]
 671 |     },
 672 |     {
 673 |      "data": {
 674 |       "text/plain": [
 675 |        "0.88207"
 676 |       ]
 677 |      },
 678 |      "execution_count": 30,
 679 |      "metadata": {},
 680 |      "output_type": "execute_result"
 681 |     }
 682 |    ],
 683 |    "source": [
 684 |     "%%time\n",
 685 |     "\n",
 686 |     "clf = LogisticRegression().fit(X, y)\n",
 687 |     "clf.score(X, y)"
 688 |    ]
 689 |   },
 690 |   {
 691 |    "cell_type": "code",
 692 |    "execution_count": 31,
 693 |    "id": "135108f9-7d7b-49b5-850b-45e96bfd119b",
 694 |    "metadata": {},
 695 |    "outputs": [
 696 |     {
 697 |      "data": {
 698 |       "text/plain": [
 699 |        "array([False, False, False, False,  True])"
 700 |       ]
 701 |      },
 702 |      "execution_count": 31,
 703 |      "metadata": {},
 704 |      "output_type": "execute_result"
 705 |     }
 706 |    ],
 707 |    "source": [
 708 |     "clf.predict(X)[:5]"
 709 |    ]
 710 |   },
 711 |   {
 712 |    "cell_type": "markdown",
 713 |    "id": "84341c54-d462-4aff-b7e7-10bb92039d0f",
 714 |    "metadata": {},
 715 |    "source": [
 716 |     "That's it!"
 717 |    ]
 718 |   },
 719 |   {
 720 |    "cell_type": "markdown",
 721 |    "id": "4f24a1fe-7e2e-41ea-ace6-6b9b446d6bee",
 722 |    "metadata": {},
 723 |    "source": [
 724 |     "## Checkpoint\n",
 725 |     "\n",
 726 |     "**Question:** Use Dask-ML to implement a [Naive Bayes classifier](https://ml.dask.org/naive-bayes.html) on the given dataset."
 727 |    ]
 728 |   },
 729 |   {
 730 |    "cell_type": "code",
 731 |    "execution_count": null,
 732 |    "id": "0c8b8dfe-8d84-4426-9271-70982dfa707e",
 733 |    "metadata": {},
 734 |    "outputs": [],
 735 |    "source": [
 736 |     "# Your answer here"
 737 |    ]
 738 |   },
 739 |   {
 740 |    "cell_type": "code",
 741 |    "execution_count": 45,
 742 |    "id": "569b2337-e35b-42d8-8d3b-5a34e94c7c00",
 743 |    "metadata": {
 744 |     "collapsed": true,
 745 |     "jupyter": {
 746 |      "outputs_hidden": true
 747 |     },
 748 |     "tags": []
 749 |    },
 750 |    "outputs": [
 751 |     {
 752 |      "data": {
 753 |       "text/plain": [
 754 |        "array([0, 0, 0, 0, 1])"
 755 |       ]
 756 |      },
 757 |      "execution_count": 45,
 758 |      "metadata": {},
 759 |      "output_type": "execute_result"
 760 |     }
 761 |    ],
 762 |    "source": [
 763 |     "from dask_ml.naive_bayes import GaussianNB\n",
 764 |     "\n",
 765 |     "clf = GaussianNB().fit(X, y)\n",
 766 |     "clf.predict(X)[:5].compute()"
 767 |    ]
 768 |   },
 769 |   {
 770 |    "cell_type": "markdown",
 771 |    "id": "4b548ff7-77d2-4289-b07e-79828f9b3e0a",
 772 |    "metadata": {},
 773 |    "source": [
 774 |     "Finally, let's close the cluster."
 775 |    ]
 776 |   },
 777 |   {
 778 |    "cell_type": "code",
 779 |    "execution_count": null,
 780 |    "id": "fddf8955-eb9a-43d1-8f7a-f3a72fff84bf",
 781 |    "metadata": {},
 782 |    "outputs": [],
 783 |    "source": [
 784 |     "client.close()"
 785 |    ]
 786 |   },
 787 |   {
 788 |    "cell_type": "markdown",
 789 |    "id": "90d8a6cb-c38b-427b-8e6f-92c91a53630e",
 790 |    "metadata": {},
 791 |    "source": [
 792 |     "## Machine Learning in the Cloud (Optional)"
 793 |    ]
 794 |   },
 795 |   {
 796 |    "cell_type": "markdown",
 797 |    "id": "8bf3a52b-e1ab-4282-a490-4224a854ab80",
 798 |    "metadata": {},
 799 |    "source": [
 800 |     "As we saw in the first course, Dask can also scale this computation to the cloud! There are many ways to do this, but here, we will be using Coiled. Coiled provides cluster-as-a-service functionality to provision hosted Dask clusters. It manages software environments, networking, etc. so that we can connect to the cloud quickly.\n",
 801 |     "\n",
 802 |     "To get started, sign up on [cloud.coiled.io](https://cloud.coiled.io) and get your Coiled login token. Then, in a terminal (or command prompt), run `coiled login` and paste your token when prompted."
 803 |    ]
 804 |   },
 805 |   {
 806 |    "cell_type": "markdown",
 807 |    "id": "1845e4f1-595a-4f07-86e9-10c4b662ec69",
 808 |    "metadata": {},
 809 |    "source": [
 810 |     "That's it! We can work from this same notebook now. We can import coiled and create a cluster as shown below:"
 811 |    ]
 812 |   },
 813 |   {
 814 |    "cell_type": "code",
 815 |    "execution_count": 1,
 816 |    "id": "6826d61e-5b62-4149-87e1-4d4a7a17edd6",
 817 |    "metadata": {},
 818 |    "outputs": [
 819 |     {
 820 |      "data": {
 821 |       "application/vnd.jupyter.widget-view+json": {
 822 |        "model_id": "",
 823 |        "version_major": 2,
 824 |        "version_minor": 0
 825 |       },
 826 |       "text/plain": [
 827 |        "Output()"
 828 |       ]
 829 |      },
 830 |      "metadata": {},
 831 |      "output_type": "display_data"
 832 |     },
 833 |     {
 834 |      "name": "stdout",
 835 |      "output_type": "stream",
 836 |      "text": [
 837 |       "Found software environment build\n"
 838 |      ]
 839 |     },
 840 |     {
 841 |      "data": {
 842 |       "text/html": [
 843 |        "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"></pre>\n"
 844 |       ],
 845 |       "text/plain": []
 846 |      },
 847 |      "metadata": {},
 848 |      "output_type": "display_data"
 849 |     }
 850 |    ],
 851 |    "source": [
 852 |     "import coiled\n",
 853 |     "\n",
 854 |     "cluster = coiled.Cluster(n_workers=10)"
 855 |    ]
 856 |   },
 857 |   {
 858 |    "cell_type": "code",
 859 |    "execution_count": 2,
 860 |    "id": "f154acd2-3ad9-44ee-b492-bed74e1fc445",
 861 |    "metadata": {},
 862 |    "outputs": [
 863 |     {
 864 |      "name": "stdout",
 865 |      "output_type": "stream",
 866 |      "text": [
 867 |       "Dashboard: http://ec2-54-158-32-172.compute-1.amazonaws.com:8787\n"
 868 |      ]
 869 |     },
 870 |     {
 871 |      "name": "stderr",
 872 |      "output_type": "stream",
 873 |      "text": [
 874 |       "/Users/pavithra-coiled/.conda/envs/talkpython-dask/lib/python3.8/site-packages/distributed/client.py:1186: VersionMismatchWarning: Mismatched versions found\n",
 875 |       "\n",
 876 |       "+---------+--------+-----------+---------+\n",
 877 |       "| Package | client | scheduler | workers |\n",
 878 |       "+---------+--------+-----------+---------+\n",
 879 |       "| blosc   | None   | 1.10.2    | 1.10.2  |\n",
 880 |       "| lz4     | None   | 3.1.3     | 3.1.3   |\n",
 881 |       "| numpy   | 1.20.3 | 1.21.0    | 1.21.0  |\n",
 882 |       "+---------+--------+-----------+---------+\n",
 883 |       "  warnings.warn(version_module.VersionMismatchWarning(msg[0][\"warning\"]))\n"
 884 |      ]
 885 |     }
 886 |    ],
 887 |    "source": [
 888 |     "from dask.distributed import Client\n",
 889 |     "\n",
 890 |     "client = Client(cluster)\n",
 891 |     "\n",
 892 |     "print('Dashboard:', client.dashboard_link)"
 893 |    ]
 894 |   },
 895 |   {
 896 |    "cell_type": "markdown",
 897 |    "id": "054569ea-4e3b-4dc3-8609-c0b7bf08a050",
 898 |    "metadata": {},
 899 |    "source": [
 900 |     "Note that the dashboard link points to AWS."
 901 |    ]
 902 |   },
 903 |   {
 904 |    "cell_type": "markdown",
 905 |    "id": "f4c7bb77-fef1-42cd-aa53-1ba2c499cab9",
 906 |    "metadata": {},
 907 |    "source": [
 908 |     "Now, let's implement KMeans on some generated data using sklearn and Dask-ML."
 909 |    ]
 910 |   },
 911 |   {
 912 |    "cell_type": "code",
 913 |    "execution_count": 3,
 914 |    "id": "ed46562c-5df0-4a9a-b79a-77e9c250e5e6",
 915 |    "metadata": {},
 916 |    "outputs": [],
 917 |    "source": [
 918 |     "from sklearn.datasets import make_blobs\n",
 919 |     "\n",
 920 |     "X, y = make_blobs(n_samples=100, n_features=5, random_state=0)"
 921 |    ]
 922 |   },
 923 |   {
 924 |    "cell_type": "code",
 925 |    "execution_count": 4,
 926 |    "id": "60f63e8f-7bd3-4ed9-adf9-044af675451d",
 927 |    "metadata": {},
 928 |    "outputs": [
 929 |     {
 930 |      "data": {
 931 |       "text/plain": [
 932 |        "(100, 5)"
 933 |       ]
 934 |      },
 935 |      "execution_count": 4,
 936 |      "metadata": {},
 937 |      "output_type": "execute_result"
 938 |     }
 939 |    ],
 940 |    "source": [
 941 |     "X.shape"
 942 |    ]
 943 |   },
 944 |   {
 945 |    "cell_type": "code",
 946 |    "execution_count": 5,
 947 |    "id": "df13bced-8aa3-4ade-bc2b-b376f466c9b6",
 948 |    "metadata": {},
 949 |    "outputs": [],
 950 |    "source": [
 951 |     "from dask_ml.cluster import KMeans"
 952 |    ]
 953 |   },
 954 |   {
 955 |    "cell_type": "code",
 956 |    "execution_count": 6,
 957 |    "id": "f6394c05-a31a-48f7-b2b1-a3bcc6a707d8",
 958 |    "metadata": {},
 959 |    "outputs": [
 960 |     {
 961 |      "name": "stdout",
 962 |      "output_type": "stream",
 963 |      "text": [
 964 |       "CPU times: user 883 ms, sys: 39.8 ms, total: 923 ms\n",
 965 |       "Wall time: 26 s\n"
 966 |      ]
 967 |     }
 968 |    ],
 969 |    "source": [
 970 |     "%%time\n",
 971 |     "\n",
 972 |     "clf = KMeans().fit(X)"
 973 |    ]
 974 |   },
 975 |   {
 976 |    "cell_type": "code",
 977 |    "execution_count": 7,
 978 |    "id": "81a4aaad-3d18-4fd2-8474-ac2e1c7eea53",
 979 |    "metadata": {},
 980 |    "outputs": [
 981 |     {
 982 |      "data": {
 983 |       "text/html": [
 984 |        "<table>\n",
 985 |        "<tr>\n",
 986 |        "<td>\n",
 987 |        "<table>\n",
 988 |        "  <thead>\n",
 989 |        "    <tr><td> </td><th> Array </th><th> Chunk </th></tr>\n",
 990 |        "  </thead>\n",
 991 |        "  <tbody>\n",
 992 |        "    <tr><th> Bytes </th><td> 400 B </td> <td> 32 B </td></tr>\n",
 993 |        "    <tr><th> Shape </th><td> (100,) </td> <td> (8,) </td></tr>\n",
 994 |        "    <tr><th> Count </th><td> 78 Tasks </td><td> 13 Chunks </td></tr>\n",
 995 |        "    <tr><th> Type </th><td> int32 </td><td> numpy.ndarray </td></tr>\n",
 996 |        "  </tbody>\n",
 997 |        "</table>\n",
 998 |        "</td>\n",
 999 |        "<td>\n",
1000 |        "<svg width=\"170\" height=\"75\" style=\"stroke:rgb(0,0,0);stroke-width:1\" >\n",
1001 |        "\n",
1002 |        "  <!-- Horizontal lines -->\n",
1003 |        "  <line x1=\"0\" y1=\"0\" x2=\"120\" y2=\"0\" style=\"stroke-width:2\" />\n",
1004 |        "  <line x1=\"0\" y1=\"25\" x2=\"120\" y2=\"25\" style=\"stroke-width:2\" />\n",
1005 |        "\n",
1006 |        "  <!-- Vertical lines -->\n",
1007 |        "  <line x1=\"0\" y1=\"0\" x2=\"0\" y2=\"25\" style=\"stroke-width:2\" />\n",
1008 |        "  <line x1=\"9\" y1=\"0\" x2=\"9\" y2=\"25\" />\n",
1009 |        "  <line x1=\"19\" y1=\"0\" x2=\"19\" y2=\"25\" />\n",
1010 |        "  <line x1=\"28\" y1=\"0\" x2=\"28\" y2=\"25\" />\n",
1011 |        "  <line x1=\"38\" y1=\"0\" x2=\"38\" y2=\"25\" />\n",
1012 |        "  <line x1=\"48\" y1=\"0\" x2=\"48\" y2=\"25\" />\n",
1013 |        "  <line x1=\"57\" y1=\"0\" x2=\"57\" y2=\"25\" />\n",
1014 |        "  <line x1=\"67\" y1=\"0\" x2=\"67\" y2=\"25\" />\n",
1015 |        "  <line x1=\"76\" y1=\"0\" x2=\"76\" y2=\"25\" />\n",
1016 |        "  <line x1=\"86\" y1=\"0\" x2=\"86\" y2=\"25\" />\n",
1017 |        "  <line x1=\"96\" y1=\"0\" x2=\"96\" y2=\"25\" />\n",
1018 |        "  <line x1=\"105\" y1=\"0\" x2=\"105\" y2=\"25\" />\n",
1019 |        "  <line x1=\"115\" y1=\"0\" x2=\"115\" y2=\"25\" />\n",
1020 |        "  <line x1=\"120\" y1=\"0\" x2=\"120\" y2=\"25\" style=\"stroke-width:2\" />\n",
1021 |        "\n",
1022 |        "  <!-- Colored Rectangle -->\n",
1023 |        "  <polygon points=\"0.0,0.0 120.0,0.0 120.0,25.412616514582485 0.0,25.412616514582485\" style=\"fill:#ECB172A0;stroke-width:0\"/>\n",
1024 |        "\n",
1025 |        "  <!-- Text -->\n",
1026 |        "  <text x=\"60.000000\" y=\"45.412617\" font-size=\"1.0rem\" font-weight=\"100\" text-anchor=\"middle\" >100</text>\n",
1027 |        "  <text x=\"140.000000\" y=\"12.706308\" font-size=\"1.0rem\" font-weight=\"100\" text-anchor=\"middle\" transform=\"rotate(0,140.000000,12.706308)\">1</text>\n",
1028 |        "</svg>\n",
1029 |        "</td>\n",
1030 |        "</tr>\n",
1031 |        "</table>"
1032 |       ],
1033 |       "text/plain": [
1034 |        "dask.array<astype, shape=(100,), dtype=int32, chunksize=(8,), chunktype=numpy.ndarray>"
1035 |       ]
1036 |      },
1037 |      "execution_count": 7,
1038 |      "metadata": {},
1039 |      "output_type": "execute_result"
1040 |     }
1041 |    ],
1042 |    "source": [
1043 |     "clf.labels_"
1044 |    ]
1045 |   },
1046 |   {
1047 |    "cell_type": "code",
1048 |    "execution_count": 8,
1049 |    "id": "d4f2c8ba-2ebc-4b19-9ef1-c05375bc1c36",
1050 |    "metadata": {},
1051 |    "outputs": [
1052 |     {
1053 |      "data": {
1054 |       "text/plain": [
1055 |        "array([2, 4, 0, 6, 0, 7, 7, 5, 1, 2], dtype=int32)"
1056 |       ]
1057 |      },
1058 |      "execution_count": 8,
1059 |      "metadata": {},
1060 |      "output_type": "execute_result"
1061 |     }
1062 |    ],
1063 |    "source": [
1064 |     "clf.labels_[:10].compute()"
1065 |    ]
1066 |   },
1067 |   {
1068 |    "cell_type": "code",
1069 |    "execution_count": 9,
1070 |    "id": "0b789ba3-a867-4797-b053-bd341b2a07b9",
1071 |    "metadata": {},
1072 |    "outputs": [],
1073 |    "source": [
1074 |     "client.close()"
1075 |    ]
1076 |   },
1077 |   {
1078 |    "cell_type": "markdown",
1079 |    "id": "6df48786-524b-4bd2-913f-72683176aa56",
1080 |    "metadata": {},
1081 |    "source": [
1082 |     "## References\n",
1083 |     "\n",
1084 |     "* [Dask-ML documentation](https://ml.dask.org/)\n",
1085 |     "* [Dask Examples - Machine Learning](https://examples.dask.org/machine-learning.html)\n",
1086 |     "* [Dask Tutorial - Machine Learning](https://tutorial.dask.org/08_machine_learning.html)"
1087 |    ]
1088 |   },
1089 |   {
1090 |    "cell_type": "code",
1091 |    "execution_count": null,
1092 |    "id": "ab8e02a4-6b79-448d-a090-3c17768b0b74",
1093 |    "metadata": {},
1094 |    "outputs": [],
1095 |    "source": []
1096 |   }
1097 |  ],
1098 |  "metadata": {
1099 |   "kernelspec": {
1100 |    "display_name": "Python 3",
1101 |    "language": "python",
1102 |    "name": "python3"
1103 |   },
1104 |   "language_info": {
1105 |    "codemirror_mode": {
1106 |     "name": "ipython",
1107 |     "version": 3
1108 |    },
1109 |    "file_extension": ".py",
1110 |    "mimetype": "text/x-python",
1111 |    "name": "python",
1112 |    "nbconvert_exporter": "python",
1113 |    "pygments_lexer": "ipython3",
1114 |    "version": "3.8.10"
1115 |   }
1116 |  },
1117 |  "nbformat": 4,
1118 |  "nbformat_minor": 5
1119 | }
1120 | 


--------------------------------------------------------------------------------
/03-array.ipynb:
--------------------------------------------------------------------------------
   1 | {
   2 |  "cells": [
   3 |   {
   4 |    "cell_type": "markdown",
   5 |    "id": "continent-fifty",
   6 |    "metadata": {},
   7 |    "source": [
   8 |     "# Dask Array\n",
   9 |     "\n",
  10 |     "## Notebook Objectives\n",
  11 |     "* **Demonstrate NumPy**, a library for working with multidimensional arrays.\n",
  12 |     "* Use **blocked algorithms** to work on a large dataset in small chunks.\n",
  13 |     "* **Introduce Dask Array**, an interface for parallel NumPy.\n",
  14 |     "* **Limitations of Dask Array**.\n",
  15 |     "* **References** for further reading."
  16 |    ]
  17 |   },
  18 |   {
  19 |    "cell_type": "markdown",
  20 |    "id": "metallic-spray",
  21 |    "metadata": {},
  22 |    "source": [
  23 |     "## Demonstrate NumPy for array operations\n"
  24 |    ]
  25 |   },
  26 |   {
  27 |    "cell_type": "markdown",
  28 |    "id": "limiting-navigator",
  29 |    "metadata": {},
  30 |    "source": [
  31 |     "NumPy's `ones()` function creates an array filled with ones. We use it to create a 10x10 matrix of ones:"
  32 |    ]
  33 |   },
  34 |   {
  35 |    "cell_type": "code",
  36 |    "execution_count": 1,
  37 |    "id": "handed-round",
  38 |    "metadata": {},
  39 |    "outputs": [
  40 |     {
  41 |      "data": {
  42 |       "text/plain": [
  43 |        "array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],\n",
  44 |        "       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],\n",
  45 |        "       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],\n",
  46 |        "       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],\n",
  47 |        "       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],\n",
  48 |        "       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],\n",
  49 |        "       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],\n",
  50 |        "       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],\n",
  51 |        "       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],\n",
  52 |        "       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])"
  53 |       ]
  54 |      },
  55 |      "execution_count": 1,
  56 |      "metadata": {},
  57 |      "output_type": "execute_result"
  58 |     }
  59 |    ],
  60 |    "source": [
  61 |     "import numpy as np\n",
  62 |     "\n",
  63 |     "x = np.ones((10, 10), dtype=int)\n",
  64 |     "x"
  65 |    ]
  66 |   },
  67 |   {
  68 |    "cell_type": "markdown",
  69 |    "id": "literary-encounter",
  70 |    "metadata": {},
  71 |    "source": [
  72 |     "The `sum()` method computes the sum of all elements in the array."
  73 |    ]
  74 |   },
  75 |   {
  76 |    "cell_type": "code",
  77 |    "execution_count": 2,
  78 |    "id": "checked-offer",
  79 |    "metadata": {},
  80 |    "outputs": [
  81 |     {
  82 |      "name": "stdout",
  83 |      "output_type": "stream",
  84 |      "text": [
  85 |       "CPU times: user 30 µs, sys: 6 µs, total: 36 µs\n",
  86 |       "Wall time: 38.9 µs\n"
  87 |      ]
  88 |     },
  89 |     {
  90 |      "data": {
  91 |       "text/plain": [
  92 |        "100"
  93 |       ]
  94 |      },
  95 |      "execution_count": 2,
  96 |      "metadata": {},
  97 |      "output_type": "execute_result"
  98 |     }
  99 |    ],
 100 |    "source": [
 101 |     "%%time\n",
 102 |     "x.sum()"
 103 |    ]
 104 |   },
 105 |   {
 106 |    "cell_type": "markdown",
 107 |    "id": "twenty-operations",
 108 |    "metadata": {},
 109 |    "source": [
 110 |     "The `random` module can be used to create arrays of random data. Let's create a larger matrix of dimension 1000x1000:"
 111 |    ]
 112 |   },
 113 |   {
 114 |    "cell_type": "code",
 115 |    "execution_count": 28,
 116 |    "id": "subtle-outdoors",
 117 |    "metadata": {},
 118 |    "outputs": [
 119 |     {
 120 |      "data": {
 121 |       "text/plain": [
 122 |        "array([[0.45568271, 0.1431577 , 0.37642501, ..., 0.15796487, 0.45990731,\n",
 123 |        "        0.27984186],\n",
 124 |        "       [0.81437462, 0.62504866, 0.51884817, ..., 0.66213603, 0.76877329,\n",
 125 |        "        0.27941481],\n",
 126 |        "       [0.67832935, 0.47963707, 0.86939886, ..., 0.8534172 , 0.84040989,\n",
 127 |        "        0.56960117],\n",
 128 |        "       ...,\n",
 129 |        "       [0.01973544, 0.35057438, 0.05273421, ..., 0.60723429, 0.2137585 ,\n",
 130 |        "        0.99921152],\n",
 131 |        "       [0.89358418, 0.53268602, 0.69352128, ..., 0.06789243, 0.84053498,\n",
 132 |        "        0.38334184],\n",
 133 |        "       [0.05749119, 0.42748649, 0.72071472, ..., 0.44029739, 0.43499474,\n",
 134 |        "        0.46421326]])"
 135 |       ]
 136 |      },
 137 |      "execution_count": 28,
 138 |      "metadata": {},
 139 |      "output_type": "execute_result"
 140 |     }
 141 |    ],
 142 |    "source": [
 143 |     "x = np.random.random(size=(1000, 1000))\n",
 144 |     "x"
 145 |    ]
 146 |   },
 147 |   {
 148 |    "cell_type": "code",
 149 |    "execution_count": 3,
 150 |    "id": "expressed-immigration",
 151 |    "metadata": {},
 152 |    "outputs": [
 153 |     {
 154 |      "name": "stdout",
 155 |      "output_type": "stream",
 156 |      "text": [
 157 |       "CPU times: user 33 µs, sys: 7 µs, total: 40 µs\n",
 158 |       "Wall time: 42.9 µs\n"
 159 |      ]
 160 |     },
 161 |     {
 162 |      "data": {
 163 |       "text/plain": [
 164 |        "100"
 165 |       ]
 166 |      },
 167 |      "execution_count": 3,
 168 |      "metadata": {},
 169 |      "output_type": "execute_result"
 170 |     }
 171 |    ],
 172 |    "source": [
 173 |     "%%time\n",
 174 |     "x.sum()"
 175 |    ]
 176 |   },
 177 |   {
 178 |    "cell_type": "markdown",
 179 |    "id": "integrated-cooking",
 180 |    "metadata": {},
 181 |    "source": [
 182 |     "NumPy has many helpful operations, including matrix transpose, matrix addition, and mean as shown below:"
 183 |    ]
 184 |   },
 185 |   {
 186 |    "cell_type": "code",
 187 |    "execution_count": 4,
 188 |    "id": "compound-functionality",
 189 |    "metadata": {},
 190 |    "outputs": [
 191 |     {
 192 |      "name": "stdout",
 193 |      "output_type": "stream",
 194 |      "text": [
 195 |       "CPU times: user 36 µs, sys: 0 ns, total: 36 µs\n",
 196 |       "Wall time: 39.1 µs\n"
 197 |      ]
 198 |     },
 199 |     {
 200 |      "data": {
 201 |       "text/plain": [
 202 |        "array([[2, 2, 2, 2, 2, 2, 2, 2, 2, 2],\n",
 203 |        "       [2, 2, 2, 2, 2, 2, 2, 2, 2, 2],\n",
 204 |        "       [2, 2, 2, 2, 2, 2, 2, 2, 2, 2],\n",
 205 |        "       [2, 2, 2, 2, 2, 2, 2, 2, 2, 2],\n",
 206 |        "       [2, 2, 2, 2, 2, 2, 2, 2, 2, 2],\n",
 207 |        "       [2, 2, 2, 2, 2, 2, 2, 2, 2, 2],\n",
 208 |        "       [2, 2, 2, 2, 2, 2, 2, 2, 2, 2],\n",
 209 |        "       [2, 2, 2, 2, 2, 2, 2, 2, 2, 2],\n",
 210 |        "       [2, 2, 2, 2, 2, 2, 2, 2, 2, 2],\n",
 211 |        "       [2, 2, 2, 2, 2, 2, 2, 2, 2, 2]])"
 212 |       ]
 213 |      },
 214 |      "execution_count": 4,
 215 |      "metadata": {},
 216 |      "output_type": "execute_result"
 217 |     }
 218 |    ],
 219 |    "source": [
 220 |     "%%time\n",
 221 |     "y = x + x.T\n",
 222 |     "y"
 223 |    ]
 224 |   },
 225 |   {
 226 |    "cell_type": "code",
 227 |    "execution_count": 5,
 228 |    "id": "novel-billion",
 229 |    "metadata": {},
 230 |    "outputs": [
 231 |     {
 232 |      "name": "stdout",
 233 |      "output_type": "stream",
 234 |      "text": [
 235 |       "CPU times: user 76 µs, sys: 8 µs, total: 84 µs\n",
 236 |       "Wall time: 86.1 µs\n"
 237 |      ]
 238 |     },
 239 |     {
 240 |      "data": {
 241 |       "text/plain": [
 242 |        "2.0"
 243 |       ]
 244 |      },
 245 |      "execution_count": 5,
 246 |      "metadata": {},
 247 |      "output_type": "execute_result"
 248 |     }
 249 |    ],
 250 |    "source": [
 251 |     "%%time\n",
 252 |     "np.mean(y)"
 253 |    ]
 254 |   },
 255 |   {
 256 |    "cell_type": "markdown",
 257 |    "id": "acoustic-feedback",
 258 |    "metadata": {},
 259 |    "source": [
 260 |     "Let's now create an even larger matrix of 20,000x20,000 normally distributed random values and compute its mean along the first axis."
 261 |    ]
 262 |   },
 263 |   {
 264 |    "cell_type": "code",
 265 |    "execution_count": 6,
 266 |    "id": "integrated-metallic",
 267 |    "metadata": {},
 268 |    "outputs": [
 269 |     {
 270 |      "name": "stdout",
 271 |      "output_type": "stream",
 272 |      "text": [
 273 |       "CPU times: user 9.15 s, sys: 771 ms, total: 9.92 s\n",
 274 |       "Wall time: 9.95 s\n"
 275 |      ]
 276 |     },
 277 |     {
 278 |      "data": {
 279 |       "text/plain": [
 280 |        "array([ 9.99981611,  9.99998234, 10.00035018, ..., 10.0003316 ,\n",
 281 |        "       10.00153614,  9.99937434])"
 282 |       ]
 283 |      },
 284 |      "execution_count": 6,
 285 |      "metadata": {},
 286 |      "output_type": "execute_result"
 287 |     }
 288 |    ],
 289 |    "source": [
 290 |     "%%time \n",
 291 |     "x = np.random.normal(10, 0.1, size=(20000, 20000)) \n",
 292 |     "y = x.mean(axis=0) \n",
 293 |     "y"
 294 |    ]
 295 |   },
 296 |   {
 297 |    "cell_type": "markdown",
 298 |    "id": "split-original",
 299 |    "metadata": {},
 300 |    "source": [
 301 |     "Note that this computation takes some time. We will run this same example using Dask in a few minutes!"
 302 |    ]
 303 |   },
 304 |   {
 305 |    "cell_type": "markdown",
 306 |    "id": "heavy-samoa",
 307 |    "metadata": {},
 308 |    "source": [
 309 |     "Now, let's try to create an even larger matrix with a billion values along each axis!"
 310 |    ]
 311 |   },
 312 |   {
 313 |    "cell_type": "code",
 314 |    "execution_count": 33,
 315 |    "id": "generic-dimension",
 316 |    "metadata": {},
 317 |    "outputs": [
 318 |     {
 319 |      "ename": "MemoryError",
 320 |      "evalue": "Unable to allocate 6.94 EiB for an array with shape (1000000000, 1000000000) and data type int64",
 321 |      "output_type": "error",
 322 |      "traceback": [
 323 |       "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
 324 |       "\u001b[0;31mMemoryError\u001b[0m                               Traceback (most recent call last)",
 325 |       "\u001b[0;32m<ipython-input-33-cc9168c39fd3>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mx\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mones\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m1_000_000_000\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1_000_000_000\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mdtype\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mint\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
 326 |       "\u001b[0;32m~/.conda/envs/talkpython-dask/lib/python3.8/site-packages/numpy/core/numeric.py\u001b[0m in \u001b[0;36mones\u001b[0;34m(shape, dtype, order, like)\u001b[0m\n\u001b[1;32m    201\u001b[0m         \u001b[0;32mreturn\u001b[0m \u001b[0m_ones_with_like\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mshape\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mdtype\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mdtype\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0morder\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0morder\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlike\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mlike\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    202\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 203\u001b[0;31m     \u001b[0ma\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mempty\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mshape\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mdtype\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0morder\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    204\u001b[0m     \u001b[0mmultiarray\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcopyto\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcasting\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m'unsafe'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    205\u001b[0m     \u001b[0;32mreturn\u001b[0m \u001b[0ma\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
 327 |       "\u001b[0;31mMemoryError\u001b[0m: Unable to allocate 6.94 EiB for an array with shape (1000000000, 1000000000) and data type int64"
 328 |      ]
 329 |     }
 330 |    ],
 331 |    "source": [
 332 |     "x = np.ones((1_000_000_000, 1_000_000_000), dtype=int)"
 333 |    ]
 334 |   },
 335 |   {
 336 |    "cell_type": "markdown",
 337 |    "id": "streaming-midnight",
 338 |    "metadata": {},
 339 |    "source": [
 340 |     "This raises a `MemoryError`: NumPy has to hold the entire array in memory, and an array of this shape would need 6.94 EiB. We can work around this limitation using blocked algorithms, as shown in the following section."
 341 |    ]
 342 |   },
 343 |   {
 344 |    "cell_type": "markdown",
 345 |    "id": "recreational-upgrade",
 346 |    "metadata": {},
 347 |    "source": [
 348 |     "## Blocked Algorithms"
 349 |    ]
 350 |   },
 351 |   {
 352 |    "cell_type": "markdown",
 353 |    "id": "angry-projection",
 354 |    "metadata": {},
 355 |    "source": [
 356 |     "*\"A blocked algorithm executes on a large dataset by breaking it up into many small blocks.\"* ~ [tutorial.dask.org](https://tutorial.dask.org/03_array.html#Blocked-Algorithms)\n",
 357 |     "\n",
 358 |     "For example, consider taking the sum of an array of a billion numbers. Instead of loading the whole array at once, we can break it up into 1,000 chunks, each of size 1,000,000, take the sum of each chunk, and then take the sum of the intermediate sums.\n",
 359 |     "\n",
 360 |     "Let's do this:"
 361 |    ]
 362 |   },
 363 |   {
 364 |    "cell_type": "code",
 365 |    "execution_count": 34,
 366 |    "id": "funky-awareness",
 367 |    "metadata": {},
 368 |    "outputs": [],
 369 |    "source": [
 370 |     "# Load data with h5py\n",
 371 |     "import h5py\n",
 372 |     "import os\n",
 373 |     "f = h5py.File(os.path.join('data', 'random.hdf5'), mode='r')\n",
 374 |     "dset = f['/x']"
 375 |    ]
 376 |   },
 377 |   {
 378 |    "cell_type": "code",
 379 |    "execution_count": 35,
 380 |    "id": "ahead-discussion",
 381 |    "metadata": {},
 382 |    "outputs": [
 383 |     {
 384 |      "name": "stdout",
 385 |      "output_type": "stream",
 386 |      "text": [
 387 |       "5005765.3125\n"
 388 |      ]
 389 |     }
 390 |    ],
 391 |    "source": [
 392 |     "# Compute sum of large array, one million numbers at a time\n",
 393 |     "sums = []\n",
 394 |     "for i in range(0, 1_000_000_000, 1_000_000):\n",
 395 |     "    chunk = dset[i: i + 1_000_000]\n",
 396 |     "    sums.append(chunk.sum())\n",
 397 |     "\n",
 398 |     "total = sum(sums)\n",
 399 |     "print(total)"
 400 |    ]
 401 |   },
 402 |   {
 403 |    "cell_type": "markdown",
 404 |    "id": "bearing-document",
 405 |    "metadata": {},
 406 |    "source": [
 407 |     "Note that both the loading and the summing run sequentially in the notebook kernel."
 408 |    ]
 409 |   },
 410 |   {
 411 |    "cell_type": "markdown",
 412 |    "id": "interpreted-triumph",
 413 |    "metadata": {},
 414 |    "source": [
 415 |     "## Checkpoint\n",
 416 |     "\n",
 417 |     "Question: Create a random matrix of size 1000x1000 and compute its standard deviation."
 418 |    ]
 419 |   },
 420 |   {
 421 |    "cell_type": "code",
 422 |    "execution_count": null,
 423 |    "id": "honey-awareness",
 424 |    "metadata": {},
 425 |    "outputs": [],
 426 |    "source": [
 427 |     "# Your answer goes here"
 428 |    ]
 429 |   },
 430 |   {
 431 |    "cell_type": "code",
 432 |    "execution_count": null,
 433 |    "id": "lucky-playing",
 434 |    "metadata": {
 435 |     "jupyter": {
 436 |      "source_hidden": true
 437 |     },
 438 |     "tags": []
 439 |    },
 440 |    "outputs": [],
 441 |    "source": [
 442 |     "# Answer\n",
 443 |     "\n",
 444 |     "x = np.random.random(size=(1000, 1000))\n",
 445 |     "y = x.std(axis=0)\n",
 446 |     "y"
 447 |    ]
 448 |   },
 449 |   {
 450 |    "cell_type": "markdown",
 451 |    "id": "obvious-bubble",
 452 |    "metadata": {},
 453 |    "source": [
 454 |     "## Dask Array for parallel NumPy"
 455 |    ]
 456 |   },
 457 |   {
 458 |    "cell_type": "markdown",
 459 |    "id": "illegal-causing",
 460 |    "metadata": {},
 461 |    "source": [
 462 |     "Dask Array is a high-level interface that can be used to scale NumPy code to large datasets by using chunking techniques, as seen in the previous section."
 463 |    ]
 464 |   },
 465 |   {
 466 |    "cell_type": "markdown",
 467 |    "id": "specific-abraham",
 468 |    "metadata": {},
 469 |    "source": [
 470 |     "Let's create a new cluster:"
 471 |    ]
 472 |   },
 473 |   {
 474 |    "cell_type": "code",
 475 |    "execution_count": 7,
 476 |    "id": "valuable-seventh",
 477 |    "metadata": {},
 478 |    "outputs": [
 479 |     {
 480 |      "data": {
 481 |       "text/html": [
 482 |        "\n",
 483 |        "            <div>\n",
 484 |        "                <div style=\"\n",
 485 |        "                    width: 24px;\n",
 486 |        "                    height: 24px;\n",
 487 |        "                    background-color: #e1e1e1;\n",
 488 |        "                    border: 3px solid #9D9D9D;\n",
 489 |        "                    border-radius: 5px;\n",
 490 |        "                    position: absolute;\"> </div>\n",
 491 |        "                <div style=\"margin-left: 48px;\">\n",
 492 |        "                    <h3 style=\"margin-bottom: 0px;\">Client</h3>\n",
 493 |        "                    <p style=\"color: #9D9D9D; margin-bottom: 0px;\">Client-751baf30-d815-11eb-b4c1-3e22fb7564f2</p>\n",
 494 |        "                    <table style=\"width: 100%; text-align: left;\">\n",
 495 |        "                    \n",
 496 |        "                <tr>\n",
 497 |        "                    <td style=\"text-align: left;\"><strong>Connection method:</strong> Cluster object</td>\n",
 498 |        "                    <td style=\"text-align: left;\"><strong>Cluster type:</strong> LocalCluster</td>\n",
 499 |        "                </tr>\n",
 500 |        "                \n",
 501 |        "                <tr>\n",
 502 |        "                    <td style=\"text-align: left;\">\n",
 503 |        "                        <strong>Dashboard: </strong>\n",
 504 |        "                        <a href=\"http://127.0.0.1:8787/status\">http://127.0.0.1:8787/status</a>\n",
 505 |        "                    </td>\n",
 506 |        "                    <td style=\"text-align: left;\"></td>\n",
 507 |        "                </tr>\n",
 508 |        "                \n",
 509 |        "                    </table>\n",
 510 |        "                    \n",
 511 |        "                <details>\n",
 512 |        "                <summary style=\"margin-bottom: 20px;\"><h3 style=\"display: inline;\">Cluster Info</h3></summary>\n",
 513 |        "                \n",
 514 |        "            <div class=\"jp-RenderedHTMLCommon jp-RenderedHTML jp-mod-trusted jp-OutputArea-output\">\n",
 515 |        "                <div style=\"\n",
 516 |        "                    width: 24px;\n",
 517 |        "                    height: 24px;\n",
 518 |        "                    background-color: #e1e1e1;\n",
 519 |        "                    border: 3px solid #9D9D9D;\n",
 520 |        "                    border-radius: 5px;\n",
 521 |        "                    position: absolute;\"> </div>\n",
 522 |        "                <div style=\"margin-left: 48px;\">\n",
 523 |        "                    <h3 style=\"margin-bottom: 0px; margin-top: 0px;\">LocalCluster</h3>\n",
 524 |        "                    <p style=\"color: #9D9D9D; margin-bottom: 0px;\">2614b8c1</p>\n",
 525 |        "                    <table style=\"width: 100%; text-align: left;\">\n",
 526 |        "                    \n",
 527 |        "            <tr>\n",
 528 |        "                <td style=\"text-align: left;\"><strong>Status:</strong> running</td>\n",
 529 |        "                <td style=\"text-align: left;\"><strong>Using processes:</strong> True</td>\n",
 530 |        "            </tr>\n",
 531 |        "        \n",
 532 |        "            <tr>\n",
 533 |        "                <td style=\"text-align: left;\">\n",
 534 |        "                    <strong>Dashboard:</strong> <a href=\"http://127.0.0.1:8787/status\">http://127.0.0.1:8787/status</a>\n",
 535 |        "                </td>\n",
 536 |        "                <td style=\"text-align: left;\"><strong>Workers:</strong> 4</td>\n",
 537 |        "            </tr>\n",
 538 |        "            <tr>\n",
 539 |        "                <td style=\"text-align: left;\">\n",
 540 |        "                    <strong>Total threads:</strong>\n",
 541 |        "                    12\n",
 542 |        "                </td>\n",
 543 |        "                <td style=\"text-align: left;\">\n",
 544 |        "                    <strong>Total memory:</strong>\n",
 545 |        "                    16.00 GiB\n",
 546 |        "                </td>\n",
 547 |        "            </tr>\n",
 548 |        "        \n",
 549 |        "                    </table>\n",
 550 |        "                    <details>\n",
 551 |        "                    <summary style=\"margin-bottom: 20px;\"><h3 style=\"display: inline;\">Scheduler Info</h3></summary>\n",
 552 |        "                    \n",
 553 |        "        <div style=\"\">\n",
 554 |        "            \n",
 555 |        "            <div>\n",
 556 |        "                <div style=\"\n",
 557 |        "                    width: 24px;\n",
 558 |        "                    height: 24px;\n",
 559 |        "                    background-color: #FFF7E5;\n",
 560 |        "                    border: 3px solid #FF6132;\n",
 561 |        "                    border-radius: 5px;\n",
 562 |        "                    position: absolute;\"> </div>\n",
 563 |        "                <div style=\"margin-left: 48px;\">\n",
 564 |        "                    <h3 style=\"margin-bottom: 0px;\">Scheduler</h3>\n",
 565 |        "                    <p style=\"color: #9D9D9D; margin-bottom: 0px;\">Scheduler-44f1121d-1f22-45d0-bc92-11e25d854ecf</p>\n",
 566 |        "                    <table style=\"width: 100%; text-align: left;\">\n",
 567 |        "                        <tr>\n",
 568 |        "                            <td style=\"text-align: left;\"><strong>Comm:</strong> tcp://127.0.0.1:63861</td>\n",
 569 |        "                            <td style=\"text-align: left;\"><strong>Workers:</strong> 4</td>\n",
 570 |        "                        </tr>\n",
 571 |        "                        <tr>\n",
 572 |        "                            <td style=\"text-align: left;\">\n",
 573 |        "                                <strong>Dashboard:</strong> <a href=\"http://127.0.0.1:8787/status\">http://127.0.0.1:8787/status</a>\n",
 574 |        "                            </td>\n",
 575 |        "                            <td style=\"text-align: left;\">\n",
 576 |        "                                <strong>Total threads:</strong>\n",
 577 |        "                                12\n",
 578 |        "                            </td>\n",
 579 |        "                        </tr>\n",
 580 |        "                        <tr>\n",
 581 |        "                            <td style=\"text-align: left;\">\n",
 582 |        "                                <strong>Started:</strong>\n",
 583 |        "                                Just now\n",
 584 |        "                            </td>\n",
 585 |        "                            <td style=\"text-align: left;\">\n",
 586 |        "                                <strong>Total memory:</strong>\n",
 587 |        "                                16.00 GiB\n",
 588 |        "                            </td>\n",
 589 |        "                        </tr>\n",
 590 |        "                    </table>\n",
 591 |        "                </div>\n",
 592 |        "            </div>\n",
 593 |        "        \n",
 594 |        "            <details style=\"margin-left: 48px;\">\n",
 595 |        "            <summary style=\"margin-bottom: 20px;\"><h3 style=\"display: inline;\">Workers</h3></summary>\n",
 596 |        "            \n",
 597 |        "            <div style=\"margin-bottom: 20px;\">\n",
 598 |        "                <div style=\"width: 24px;\n",
 599 |        "                            height: 24px;\n",
 600 |        "                            background-color: #DBF5FF;\n",
 601 |        "                            border: 3px solid #4CC9FF;\n",
 602 |        "                            border-radius: 5px;\n",
 603 |        "                            position: absolute;\"> </div>\n",
 604 |        "                <div style=\"margin-left: 48px;\">\n",
 605 |        "                <details>\n",
 606 |        "                    <summary>\n",
 607 |        "                        <h4 style=\"margin-bottom: 0px; display: inline;\">Worker: 0</h4>\n",
 608 |        "                    </summary>\n",
 609 |        "                    <table style=\"width: 100%; text-align: left;\">\n",
 610 |        "                        <tr>\n",
 611 |        "                            <td style=\"text-align: left;\"><strong>Comm: </strong> tcp://127.0.0.1:63868</td>\n",
 612 |        "                            <td style=\"text-align: left;\"><strong>Total threads: </strong> 3</td>\n",
 613 |        "                        </tr>\n",
 614 |        "                        <tr>\n",
 615 |        "                            <td style=\"text-align: left;\">\n",
 616 |        "                                <strong>Dashboard: </strong>\n",
 617 |        "                                <a href=\"http://127.0.0.1:63872/status\">http://127.0.0.1:63872/status</a>\n",
 618 |        "                            </td>\n",
 619 |        "                            <td style=\"text-align: left;\">\n",
 620 |        "                                <strong>Memory: </strong>\n",
 621 |        "                                4.00 GiB\n",
 622 |        "                            </td>\n",
 623 |        "                        </tr>\n",
 624 |        "                        <tr>\n",
 625 |        "                            <td style=\"text-align: left;\"><strong>Nanny: </strong> tcp://127.0.0.1:63863</td>\n",
 626 |        "                            <td style=\"text-align: left;\"></td>\n",
 627 |        "                        </tr>\n",
 628 |        "                        <tr>\n",
 629 |        "                            <td colspan=\"2\" style=\"text-align: left;\">\n",
 630 |        "                                <strong>Local directory: </strong>\n",
 631 |        "                                /Users/pavithra-coiled/Developer/talkpython-dask-course/2-dask-fundamentals/dask-worker-space/worker-ec3y923x\n",
 632 |        "                            </td>\n",
 633 |        "                        </tr>\n",
 634 |        "                        \n",
 635 |        "                        \n",
 636 |        "                    </table>\n",
 637 |        "                </details>\n",
 638 |        "                </div>\n",
 639 |        "            </div>\n",
 640 |        "            \n",
 641 |        "            <div style=\"margin-bottom: 20px;\">\n",
 642 |        "                <div style=\"width: 24px;\n",
 643 |        "                            height: 24px;\n",
 644 |        "                            background-color: #DBF5FF;\n",
 645 |        "                            border: 3px solid #4CC9FF;\n",
 646 |        "                            border-radius: 5px;\n",
 647 |        "                            position: absolute;\"> </div>\n",
 648 |        "                <div style=\"margin-left: 48px;\">\n",
 649 |        "                <details>\n",
 650 |        "                    <summary>\n",
 651 |        "                        <h4 style=\"margin-bottom: 0px; display: inline;\">Worker: 1</h4>\n",
 652 |        "                    </summary>\n",
 653 |        "                    <table style=\"width: 100%; text-align: left;\">\n",
 654 |        "                        <tr>\n",
 655 |        "                            <td style=\"text-align: left;\"><strong>Comm: </strong> tcp://127.0.0.1:63870</td>\n",
 656 |        "                            <td style=\"text-align: left;\"><strong>Total threads: </strong> 3</td>\n",
 657 |        "                        </tr>\n",
 658 |        "                        <tr>\n",
 659 |        "                            <td style=\"text-align: left;\">\n",
 660 |        "                                <strong>Dashboard: </strong>\n",
 661 |        "                                <a href=\"http://127.0.0.1:63873/status\">http://127.0.0.1:63873/status</a>\n",
 662 |        "                            </td>\n",
 663 |        "                            <td style=\"text-align: left;\">\n",
 664 |        "                                <strong>Memory: </strong>\n",
 665 |        "                                4.00 GiB\n",
 666 |        "                            </td>\n",
 667 |        "                        </tr>\n",
 668 |        "                        <tr>\n",
 669 |        "                            <td style=\"text-align: left;\"><strong>Nanny: </strong> tcp://127.0.0.1:63864</td>\n",
 670 |        "                            <td style=\"text-align: left;\"></td>\n",
 671 |        "                        </tr>\n",
 672 |        "                        <tr>\n",
 673 |        "                            <td colspan=\"2\" style=\"text-align: left;\">\n",
 674 |        "                                <strong>Local directory: </strong>\n",
 675 |        "                                /Users/pavithra-coiled/Developer/talkpython-dask-course/2-dask-fundamentals/dask-worker-space/worker-asmqcmx8\n",
 676 |        "                            </td>\n",
 677 |        "                        </tr>\n",
 678 |        "                        \n",
 679 |        "                        \n",
 680 |        "                    </table>\n",
 681 |        "                </details>\n",
 682 |        "                </div>\n",
 683 |        "            </div>\n",
 684 |        "            \n",
 685 |        "            <div style=\"margin-bottom: 20px;\">\n",
 686 |        "                <div style=\"width: 24px;\n",
 687 |        "                            height: 24px;\n",
 688 |        "                            background-color: #DBF5FF;\n",
 689 |        "                            border: 3px solid #4CC9FF;\n",
 690 |        "                            border-radius: 5px;\n",
 691 |        "                            position: absolute;\"> </div>\n",
 692 |        "                <div style=\"margin-left: 48px;\">\n",
 693 |        "                <details>\n",
 694 |        "                    <summary>\n",
 695 |        "                        <h4 style=\"margin-bottom: 0px; display: inline;\">Worker: 2</h4>\n",
 696 |        "                    </summary>\n",
 697 |        "                    <table style=\"width: 100%; text-align: left;\">\n",
 698 |        "                        <tr>\n",
 699 |        "                            <td style=\"text-align: left;\"><strong>Comm: </strong> tcp://127.0.0.1:63867</td>\n",
 700 |        "                            <td style=\"text-align: left;\"><strong>Total threads: </strong> 3</td>\n",
 701 |        "                        </tr>\n",
 702 |        "                        <tr>\n",
 703 |        "                            <td style=\"text-align: left;\">\n",
 704 |        "                                <strong>Dashboard: </strong>\n",
 705 |        "                                <a href=\"http://127.0.0.1:63871/status\">http://127.0.0.1:63871/status</a>\n",
 706 |        "                            </td>\n",
 707 |        "                            <td style=\"text-align: left;\">\n",
 708 |        "                                <strong>Memory: </strong>\n",
 709 |        "                                4.00 GiB\n",
 710 |        "                            </td>\n",
 711 |        "                        </tr>\n",
 712 |        "                        <tr>\n",
 713 |        "                            <td style=\"text-align: left;\"><strong>Nanny: </strong> tcp://127.0.0.1:63866</td>\n",
 714 |        "                            <td style=\"text-align: left;\"></td>\n",
 715 |        "                        </tr>\n",
 716 |        "                        <tr>\n",
 717 |        "                            <td colspan=\"2\" style=\"text-align: left;\">\n",
 718 |        "                                <strong>Local directory: </strong>\n",
 719 |        "                                /Users/pavithra-coiled/Developer/talkpython-dask-course/2-dask-fundamentals/dask-worker-space/worker-j5mi2q4h\n",
 720 |        "                            </td>\n",
 721 |        "                        </tr>\n",
 722 |        "                        \n",
 723 |        "                        \n",
 724 |        "                    </table>\n",
 725 |        "                </details>\n",
 726 |        "                </div>\n",
 727 |        "            </div>\n",
 728 |        "            \n",
 729 |        "            <div style=\"margin-bottom: 20px;\">\n",
 730 |        "                <div style=\"width: 24px;\n",
 731 |        "                            height: 24px;\n",
 732 |        "                            background-color: #DBF5FF;\n",
 733 |        "                            border: 3px solid #4CC9FF;\n",
 734 |        "                            border-radius: 5px;\n",
 735 |        "                            position: absolute;\"> </div>\n",
 736 |        "                <div style=\"margin-left: 48px;\">\n",
 737 |        "                <details>\n",
 738 |        "                    <summary>\n",
 739 |        "                        <h4 style=\"margin-bottom: 0px; display: inline;\">Worker: 3</h4>\n",
 740 |        "                    </summary>\n",
 741 |        "                    <table style=\"width: 100%; text-align: left;\">\n",
 742 |        "                        <tr>\n",
 743 |        "                            <td style=\"text-align: left;\"><strong>Comm: </strong> tcp://127.0.0.1:63869</td>\n",
 744 |        "                            <td style=\"text-align: left;\"><strong>Total threads: </strong> 3</td>\n",
 745 |        "                        </tr>\n",
 746 |        "                        <tr>\n",
 747 |        "                            <td style=\"text-align: left;\">\n",
 748 |        "                                <strong>Dashboard: </strong>\n",
 749 |        "                                <a href=\"http://127.0.0.1:63874/status\">http://127.0.0.1:63874/status</a>\n",
 750 |        "                            </td>\n",
 751 |        "                            <td style=\"text-align: left;\">\n",
 752 |        "                                <strong>Memory: </strong>\n",
 753 |        "                                4.00 GiB\n",
 754 |        "                            </td>\n",
 755 |        "                        </tr>\n",
 756 |        "                        <tr>\n",
 757 |        "                            <td style=\"text-align: left;\"><strong>Nanny: </strong> tcp://127.0.0.1:63865</td>\n",
 758 |        "                            <td style=\"text-align: left;\"></td>\n",
 759 |        "                        </tr>\n",
 760 |        "                        <tr>\n",
 761 |        "                            <td colspan=\"2\" style=\"text-align: left;\">\n",
 762 |        "                                <strong>Local directory: </strong>\n",
 763 |        "                                /Users/pavithra-coiled/Developer/talkpython-dask-course/2-dask-fundamentals/dask-worker-space/worker-z8ngf4fe\n",
 764 |        "                            </td>\n",
 765 |        "                        </tr>\n",
 766 |        "                        \n",
 767 |        "                        \n",
 768 |        "                    </table>\n",
 769 |        "                </details>\n",
 770 |        "                </div>\n",
 771 |        "            </div>\n",
 772 |        "            \n",
 773 |        "            </details>\n",
 774 |        "        </div>\n",
 775 |        "        \n",
 776 |        "                    </details>\n",
 777 |        "                </div>\n",
 778 |        "            </div>\n",
 779 |        "        \n",
 780 |        "                </details>\n",
 781 |        "                \n",
 782 |        "                </div>\n",
 783 |        "            </div>\n",
 784 |        "        "
 785 |       ],
 786 |       "text/plain": [
 787 |        "<Client: 'tcp://127.0.0.1:63861' processes=4 threads=12, memory=16.00 GiB>"
 788 |       ]
 789 |      },
 790 |      "execution_count": 7,
 791 |      "metadata": {},
 792 |      "output_type": "execute_result"
 793 |     }
 794 |    ],
 795 |    "source": [
 796 |     "from dask.distributed import Client\n",
 797 |     "\n",
 798 |     "client = Client(n_workers=4)\n",
 799 |     "client"
 800 |    ]
 801 |   },
 802 |   {
 803 |    "cell_type": "markdown",
 804 |    "id": "requested-consistency",
 805 |    "metadata": {},
 806 |    "source": [
 807 |     "Don't forget to open the dashboards!"
 808 |    ]
 809 |   },
 810 |   {
 811 |    "cell_type": "markdown",
 812 |    "id": "related-defensive",
 813 |    "metadata": {},
 814 |    "source": [
 815 |     "The following Dask Array code creates a 10,000 x 10,000 array of ones split into 100 x 100 chunks, which gives 10,000 chunks in total. The visualization below shows the resulting chunk structure."
 816 |    ]
 817 |   },
 818 |   {
 819 |    "cell_type": "code",
 820 |    "execution_count": 8,
 821 |    "id": "usual-certificate",
 822 |    "metadata": {},
 823 |    "outputs": [
 824 |     {
 825 |      "data": {
 826 |       "text/html": [
 827 |        "<table>\n",
 828 |        "<tr>\n",
 829 |        "<td>\n",
 830 |        "<table>\n",
 831 |        "  <thead>\n",
 832 |        "    <tr><td> </td><th> Array </th><th> Chunk </th></tr>\n",
 833 |        "  </thead>\n",
 834 |        "  <tbody>\n",
 835 |        "    <tr><th> Bytes </th><td> 762.94 MiB </td> <td> 78.12 kiB </td></tr>\n",
 836 |        "    <tr><th> Shape </th><td> (10000, 10000) </td> <td> (100, 100) </td></tr>\n",
 837 |        "    <tr><th> Count </th><td> 10000 Tasks </td><td> 10000 Chunks </td></tr>\n",
 838 |        "    <tr><th> Type </th><td> float64 </td><td> numpy.ndarray </td></tr>\n",
 839 |        "  </tbody>\n",
 840 |        "</table>\n",
 841 |        "</td>\n",
 842 |        "<td>\n",
 843 |        "<svg width=\"170\" height=\"170\" style=\"stroke:rgb(0,0,0);stroke-width:1\" >\n",
 844 |        "\n",
 845 |        "  <!-- Horizontal lines -->\n",
 846 |        "  <line x1=\"0\" y1=\"0\" x2=\"120\" y2=\"0\" style=\"stroke-width:2\" />\n",
 847 |        "  <line x1=\"0\" y1=\"6\" x2=\"120\" y2=\"6\" />\n",
 848 |        "  <line x1=\"0\" y1=\"12\" x2=\"120\" y2=\"12\" />\n",
 849 |        "  <line x1=\"0\" y1=\"18\" x2=\"120\" y2=\"18\" />\n",
 850 |        "  <line x1=\"0\" y1=\"25\" x2=\"120\" y2=\"25\" />\n",
 851 |        "  <line x1=\"0\" y1=\"31\" x2=\"120\" y2=\"31\" />\n",
 852 |        "  <line x1=\"0\" y1=\"37\" x2=\"120\" y2=\"37\" />\n",
 853 |        "  <line x1=\"0\" y1=\"43\" x2=\"120\" y2=\"43\" />\n",
 854 |        "  <line x1=\"0\" y1=\"50\" x2=\"120\" y2=\"50\" />\n",
 855 |        "  <line x1=\"0\" y1=\"56\" x2=\"120\" y2=\"56\" />\n",
 856 |        "  <line x1=\"0\" y1=\"62\" x2=\"120\" y2=\"62\" />\n",
 857 |        "  <line x1=\"0\" y1=\"68\" x2=\"120\" y2=\"68\" />\n",
 858 |        "  <line x1=\"0\" y1=\"75\" x2=\"120\" y2=\"75\" />\n",
 859 |        "  <line x1=\"0\" y1=\"81\" x2=\"120\" y2=\"81\" />\n",
 860 |        "  <line x1=\"0\" y1=\"87\" x2=\"120\" y2=\"87\" />\n",
 861 |        "  <line x1=\"0\" y1=\"93\" x2=\"120\" y2=\"93\" />\n",
 862 |        "  <line x1=\"0\" y1=\"100\" x2=\"120\" y2=\"100\" />\n",
 863 |        "  <line x1=\"0\" y1=\"106\" x2=\"120\" y2=\"106\" />\n",
 864 |        "  <line x1=\"0\" y1=\"112\" x2=\"120\" y2=\"112\" />\n",
 865 |        "  <line x1=\"0\" y1=\"120\" x2=\"120\" y2=\"120\" style=\"stroke-width:2\" />\n",
 866 |        "\n",
 867 |        "  <!-- Vertical lines -->\n",
 868 |        "  <line x1=\"0\" y1=\"0\" x2=\"0\" y2=\"120\" style=\"stroke-width:2\" />\n",
 869 |        "  <line x1=\"6\" y1=\"0\" x2=\"6\" y2=\"120\" />\n",
 870 |        "  <line x1=\"12\" y1=\"0\" x2=\"12\" y2=\"120\" />\n",
 871 |        "  <line x1=\"18\" y1=\"0\" x2=\"18\" y2=\"120\" />\n",
 872 |        "  <line x1=\"25\" y1=\"0\" x2=\"25\" y2=\"120\" />\n",
 873 |        "  <line x1=\"31\" y1=\"0\" x2=\"31\" y2=\"120\" />\n",
 874 |        "  <line x1=\"37\" y1=\"0\" x2=\"37\" y2=\"120\" />\n",
 875 |        "  <line x1=\"43\" y1=\"0\" x2=\"43\" y2=\"120\" />\n",
 876 |        "  <line x1=\"50\" y1=\"0\" x2=\"50\" y2=\"120\" />\n",
 877 |        "  <line x1=\"56\" y1=\"0\" x2=\"56\" y2=\"120\" />\n",
 878 |        "  <line x1=\"62\" y1=\"0\" x2=\"62\" y2=\"120\" />\n",
 879 |        "  <line x1=\"68\" y1=\"0\" x2=\"68\" y2=\"120\" />\n",
 880 |        "  <line x1=\"75\" y1=\"0\" x2=\"75\" y2=\"120\" />\n",
 881 |        "  <line x1=\"81\" y1=\"0\" x2=\"81\" y2=\"120\" />\n",
 882 |        "  <line x1=\"87\" y1=\"0\" x2=\"87\" y2=\"120\" />\n",
 883 |        "  <line x1=\"93\" y1=\"0\" x2=\"93\" y2=\"120\" />\n",
 884 |        "  <line x1=\"100\" y1=\"0\" x2=\"100\" y2=\"120\" />\n",
 885 |        "  <line x1=\"106\" y1=\"0\" x2=\"106\" y2=\"120\" />\n",
 886 |        "  <line x1=\"112\" y1=\"0\" x2=\"112\" y2=\"120\" />\n",
 887 |        "  <line x1=\"120\" y1=\"0\" x2=\"120\" y2=\"120\" style=\"stroke-width:2\" />\n",
 888 |        "\n",
 889 |        "  <!-- Colored Rectangle -->\n",
 890 |        "  <polygon points=\"0.0,0.0 120.0,0.0 120.0,120.0 0.0,120.0\" style=\"fill:#8B4903A0;stroke-width:0\"/>\n",
 891 |        "\n",
 892 |        "  <!-- Text -->\n",
 893 |        "  <text x=\"60.000000\" y=\"140.000000\" font-size=\"1.0rem\" font-weight=\"100\" text-anchor=\"middle\" >10000</text>\n",
 894 |        "  <text x=\"140.000000\" y=\"60.000000\" font-size=\"1.0rem\" font-weight=\"100\" text-anchor=\"middle\" transform=\"rotate(-90,140.000000,60.000000)\">10000</text>\n",
 895 |        "</svg>\n",
 896 |        "</td>\n",
 897 |        "</tr>\n",
 898 |        "</table>"
 899 |       ],
 900 |       "text/plain": [
 901 |        "dask.array<ones, shape=(10000, 10000), dtype=float64, chunksize=(100, 100), chunktype=numpy.ndarray>"
 902 |       ]
 903 |      },
 904 |      "execution_count": 8,
 905 |      "metadata": {},
 906 |      "output_type": "execute_result"
 907 |     }
 908 |    ],
 909 |    "source": [
 910 |     "import dask.array as da\n",
 911 |     "\n",
 912 |     "x = da.ones((10_000, 10_000), chunks=(100, 100))\n",
 913 |     "x"
 914 |    ]
 915 |   },
 916 |   {
 917 |    "cell_type": "markdown",
 918 |    "id": "photographic-george",
 919 |    "metadata": {},
 920 |    "source": [
 921 |     "Let's compute the sum of this array. Dask Array also evaluates lazily, so we need to call `compute()` to get the result."
 922 |    ]
 923 |   },
 924 |   {
 925 |    "cell_type": "code",
 926 |    "execution_count": 9,
 927 |    "id": "attended-original",
 928 |    "metadata": {},
 929 |    "outputs": [
 930 |     {
 931 |      "name": "stdout",
 932 |      "output_type": "stream",
 933 |      "text": [
 934 |       "CPU times: user 55.8 ms, sys: 3.4 ms, total: 59.2 ms\n",
 935 |       "Wall time: 58.4 ms\n"
 936 |      ]
 937 |     }
 938 |    ],
 939 |    "source": [
 940 |     "%%time\n",
 941 |     "result = x.sum()"
 942 |    ]
 943 |   },
 944 |   {
 945 |    "cell_type": "code",
 946 |    "execution_count": 10,
 947 |    "id": "appropriate-sunday",
 948 |    "metadata": {},
 949 |    "outputs": [
 950 |     {
 951 |      "data": {
 952 |       "text/html": [
 953 |        "<table>\n",
 954 |        "<tr>\n",
 955 |        "<td>\n",
 956 |        "<table>\n",
 957 |        "  <thead>\n",
 958 |        "    <tr><td> </td><th> Array </th><th> Chunk </th></tr>\n",
 959 |        "  </thead>\n",
 960 |        "  <tbody>\n",
 961 |        "    <tr><th> Bytes </th><td> 8 B </td> <td> 8.0 B </td></tr>\n",
 962 |        "    <tr><th> Shape </th><td> () </td> <td> () </td></tr>\n",
 963 |        "    <tr><th> Count </th><td> 23364 Tasks </td><td> 1 Chunks </td></tr>\n",
 964 |        "    <tr><th> Type </th><td> float64 </td><td> numpy.ndarray </td></tr>\n",
 965 |        "  </tbody>\n",
 966 |        "</table>\n",
 967 |        "</td>\n",
 968 |        "<td>\n",
 969 |        "\n",
 970 |        "</td>\n",
 971 |        "</tr>\n",
 972 |        "</table>"
 973 |       ],
 974 |       "text/plain": [
 975 |        "dask.array<sum-aggregate, shape=(), dtype=float64, chunksize=(), chunktype=numpy.ndarray>"
 976 |       ]
 977 |      },
 978 |      "execution_count": 10,
 979 |      "metadata": {},
 980 |      "output_type": "execute_result"
 981 |     }
 982 |    ],
 983 |    "source": [
 984 |     "result"
 985 |    ]
 986 |   },
 987 |   {
 988 |    "cell_type": "code",
 989 |    "execution_count": 11,
 990 |    "id": "vanilla-variable",
 991 |    "metadata": {},
 992 |    "outputs": [
 993 |     {
 994 |      "name": "stdout",
 995 |      "output_type": "stream",
 996 |      "text": [
 997 |       "CPU times: user 14 s, sys: 587 ms, total: 14.6 s\n",
 998 |       "Wall time: 14.8 s\n"
 999 |      ]
1000 |     },
1001 |     {
1002 |      "data": {
1003 |       "text/plain": [
1004 |        "100000000.0"
1005 |       ]
1006 |      },
1007 |      "execution_count": 11,
1008 |      "metadata": {},
1009 |      "output_type": "execute_result"
1010 |     }
1011 |    ],
1012 |    "source": [
1013 |     "%%time\n",
1014 |     "result.compute()"
1015 |    ]
1016 |   },
1017 |   {
1018 |    "cell_type": "markdown",
1019 |    "id": "visible-candle",
1020 |    "metadata": {},
1021 |    "source": [
1022 |     "Now, let's repeat the earlier NumPy computation with Dask Array and compare the compute time!"
1023 |    ]
1024 |   },
1025 |   {
1026 |    "cell_type": "code",
1027 |    "execution_count": 12,
1028 |    "id": "certain-twins",
1029 |    "metadata": {},
1030 |    "outputs": [
1031 |     {
1032 |      "name": "stdout",
1033 |      "output_type": "stream",
1034 |      "text": [
1035 |       "CPU times: user 2.09 s, sys: 77.9 ms, total: 2.17 s\n",
1036 |       "Wall time: 2.66 s\n"
1037 |      ]
1038 |     },
1039 |     {
1040 |      "data": {
1041 |       "text/plain": [
1042 |        "array([ 9.99984054,  9.99810732, 10.0011439 , 10.00006243, 10.00067303,\n",
1043 |        "       10.00061939,  9.99967573, 10.00082095, 10.00005874, 10.00077991,\n",
1044 |        "       10.00026624,  9.9997531 ,  9.9998511 , 10.00016311, 10.00117175,\n",
1045 |        "       10.00104788,  9.99950712, 10.00058845, 10.0008487 ,  9.99974336,\n",
1046 |        "        9.99990533, 10.00005898,  9.9997313 ,  9.99963978,  9.99963639,\n",
1047 |        "       10.0004849 , 10.00105947, 10.0013202 , 10.0003486 ,  9.99945365,\n",
1048 |        "        9.9993116 , 10.00008687, 10.00020631, 10.00188321,  9.99979062,\n",
1049 |        "        9.99948907,  9.99998619,  9.99977491,  9.99940876,  9.99954004,\n",
1050 |        "        9.99940587, 10.00030509,  9.99879323, 10.00005036,  9.99899518,\n",
1051 |        "        9.99996586, 10.00165206, 10.00018339,  9.99941188,  9.9996999 ,\n",
1052 |        "       10.00090938,  9.9999311 ,  9.99904588, 10.00009871, 10.00080389,\n",
1053 |        "        9.9996124 ,  9.99973756, 10.0006106 ,  9.99932609,  9.99976075,\n",
1054 |        "       10.00024083,  9.99707116,  9.99981883,  9.99889187,  9.99995958,\n",
1055 |        "        9.99922913, 10.00073525,  9.9999535 , 10.00043148,  9.99911152,\n",
1056 |        "       10.00125334, 10.00005256,  9.99983756, 10.00075923,  9.99979802,\n",
1057 |        "        9.99972084, 10.00023872, 10.00063686,  9.99954617,  9.99973034,\n",
1058 |        "       10.00006705, 10.00021521,  9.99963444, 10.00034974,  9.99978419,\n",
1059 |        "       10.00112573, 10.00122812, 10.000071  ,  9.99896755,  9.99921739,\n",
1060 |        "       10.00022158,  9.99956024, 10.00018029,  9.99947774, 10.00028779,\n",
1061 |        "       10.00035899,  9.99952996,  9.99992577, 10.0007089 ,  9.99882874,\n",
1062 |        "       10.00140435,  9.99983057, 10.00084932, 10.00110383,  9.99861585,\n",
1063 |        "       10.00057231,  9.99946645, 10.00006694,  9.99990636, 10.00075132,\n",
1064 |        "        9.99948178, 10.00017333, 10.0002826 ,  9.99890721, 10.00102143,\n",
1065 |        "       10.00052894,  9.99955435,  9.99998   , 10.00069429, 10.00137973,\n",
1066 |        "        9.99904271,  9.99989111,  9.99978139, 10.00059649,  9.99930181,\n",
1067 |        "       10.00061705,  9.9993816 ,  9.99939831, 10.00018856, 10.00056412,\n",
1068 |        "        9.99937629, 10.0004428 , 10.00040308, 10.00143014, 10.00078526,\n",
1069 |        "       10.00026439,  9.99907207, 10.00159326, 10.00129855, 10.00078995,\n",
1070 |        "        9.99974592,  9.99964061, 10.00055318,  9.99948493,  9.99856299,\n",
1071 |        "        9.99908679, 10.00014546,  9.99976545,  9.99945505,  9.99907417,\n",
1072 |        "        9.99914494,  9.99994726,  9.99908853,  9.99928258, 10.00063795,\n",
1073 |        "       10.00022527,  9.99882283,  9.99902949,  9.99992081,  9.99939082,\n",
1074 |        "        9.99922374, 10.00056829, 10.00163323, 10.00045605, 10.00005765,\n",
1075 |        "       10.00038722,  9.99962068,  9.99960248,  9.99976618, 10.00107423,\n",
1076 |        "        9.99968588,  9.99877324, 10.00105333,  9.99946677, 10.00037982,\n",
1077 |        "        9.99980408,  9.99860037, 10.00151025,  9.99957228,  9.99947328,\n",
1078 |        "        9.99925242, 10.00099767,  9.99971488,  9.9996304 ,  9.99957526,\n",
1079 |        "        9.99962409, 10.00078201, 10.0005748 , 10.00027635, 10.00094458,\n",
1080 |        "        9.99915061, 10.00042229, 10.00008712,  9.99955296,  9.99935311,\n",
1081 |        "       10.00076805, 10.00027261, 10.0005381 , 10.00007226, 10.0008371 ])"
1082 |       ]
1083 |      },
1084 |      "execution_count": 12,
1085 |      "metadata": {},
1086 |      "output_type": "execute_result"
1087 |     }
1088 |    ],
1089 |    "source": [
1090 |     "%%time\n",
1091 |     "x = da.random.normal(10, 0.1, size=(20000, 20000), chunks=(1000, 1000))\n",
1092 |     "y = x.mean(axis=0)[::100] \n",
1093 |     "y.compute()"
1094 |    ]
1095 |   },
1096 |   {
1097 |    "cell_type": "code",
1098 |    "execution_count": 13,
1099 |    "id": "understood-mechanics",
1100 |    "metadata": {},
1101 |    "outputs": [],
1102 |    "source": [
1103 |     "client.close()"
1104 |    ]
1105 |   },
1106 |   {
1107 |    "cell_type": "markdown",
1108 |    "id": "blind-growing",
1109 |    "metadata": {},
1110 |    "source": [
1111 |     "## Checkpoint"
1112 |    ]
1113 |   },
1114 |   {
1115 |    "cell_type": "markdown",
1116 |    "id": "interim-silicon",
1117 |    "metadata": {},
1118 |    "source": [
1119 |     "**Question**: Using Dask Array, create a random matrix of size 1 million x 1 million and compute the standard deviation."
1120 |    ]
1121 |   },
1122 |   {
1123 |    "cell_type": "code",
1124 |    "execution_count": null,
1125 |    "id": "floppy-gender",
1126 |    "metadata": {},
1127 |    "outputs": [],
1128 |    "source": [
1129 |     "#your answer here"
1130 |    ]
1131 |   },
1132 |   {
1133 |    "cell_type": "code",
1134 |    "execution_count": null,
1135 |    "id": "thick-marshall",
1136 |    "metadata": {
1137 |     "jupyter": {
1138 |      "source_hidden": true
1139 |     },
1140 |     "tags": []
1141 |    },
1142 |    "outputs": [],
1143 |    "source": [
1144 |     "# Answer\n",
1145 |     "\n",
1146 |     "x = da.random.random((1_000_000, 1_000_000), chunks=(10_000, 10_000))\n",
1147 |     "y = x.std(axis=0)\n",
1148 |     "y"
1149 |    ]
1150 |   },
1151 |   {
1152 |    "cell_type": "markdown",
1153 |    "id": "suspected-sociology",
1154 |    "metadata": {},
1155 |    "source": [
1156 |     "## Limitations of Dask Array\n",
1157 |     "\n",
1158 |     "* Dask Array does not implement the entire NumPy interface. For example, much of `np.linalg` and functions such as `np.sometrue` are not implemented.\n",
1159 |     "* Dask Array does not support some operations where the resulting shape depends on the values of the array.\n",
1160 |     "* Dask Array does not attempt operations like `sort`, which are difficult to do in parallel."
1161 |    ]
1162 |   },
1163 |   {
1164 |    "cell_type": "markdown",
1165 |    "id": "retained-saturn",
1166 |    "metadata": {},
1167 |    "source": [
1168 |     "## References\n",
1169 |     "\n",
1170 |     "* [Dask Array documentation](https://docs.dask.org/en/latest/array.html)\n",
1171 |     "* [Dask Array API](https://docs.dask.org/en/latest/array-api.html)\n",
1172 |     "* [Dask Array examples](https://examples.dask.org/array.html)\n",
1173 |     "* [Dask Tutorial - Array](https://tutorial.dask.org/03_array.html)"
1174 |    ]
1175 |   },
1176 |   {
1177 |    "cell_type": "code",
1178 |    "execution_count": null,
1179 |    "id": "polyphonic-parks",
1180 |    "metadata": {},
1181 |    "outputs": [],
1182 |    "source": []
1183 |   }
1184 |  ],
1185 |  "metadata": {
1186 |   "kernelspec": {
1187 |    "display_name": "Python 3",
1188 |    "language": "python",
1189 |    "name": "python3"
1190 |   },
1191 |   "language_info": {
1192 |    "codemirror_mode": {
1193 |     "name": "ipython",
1194 |     "version": 3
1195 |    },
1196 |    "file_extension": ".py",
1197 |    "mimetype": "text/x-python",
1198 |    "name": "python",
1199 |    "nbconvert_exporter": "python",
1200 |    "pygments_lexer": "ipython3",
1201 |    "version": "3.8.10"
1202 |   }
1203 |  },
1204 |  "nbformat": 4,
1205 |  "nbformat_minor": 5
1206 | }
1207 | 
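
As a companion to the `03-array.ipynb` notebook above, here is a rough, standard-library-only sketch of the idea behind a chunked reduction such as `x.sum()`: split the array into blocks, sum each block independently, then combine the partial sums. It is not how Dask itself is implemented (Dask builds a task graph and reduces it on a scheduler), and the names `chunk_bounds` and `chunked_sum` are hypothetical helpers introduced only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(n, chunk):
    """Yield (start, stop) index pairs that tile range(n) in steps of `chunk`."""
    for start in range(0, n, chunk):
        yield start, min(start + chunk, n)


def chunked_sum(matrix, chunk):
    """Sum a list-of-lists matrix block by block, then combine the partial sums.

    Returns (total, number_of_blocks). Illustrative only, not Dask's algorithm.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])

    def block_sum(bounds):
        (r0, r1), (c0, c1) = bounds
        return sum(sum(row[c0:c1]) for row in matrix[r0:r1])

    blocks = [
        (rows, cols)
        for rows in chunk_bounds(n_rows, chunk)
        for cols in chunk_bounds(n_cols, chunk)
    ]
    # Each block is an independent task, so the partial sums can run concurrently.
    with ThreadPoolExecutor(max_workers=4) as pool:
        partials = list(pool.map(block_sum, blocks))
    return sum(partials), len(blocks)


# Small stand-in for da.ones((10_000, 10_000), chunks=(100, 100)).
ones = [[1.0] * 100 for _ in range(100)]
total, n_blocks = chunked_sum(ones, chunk=10)
print(total, n_blocks)
```

The block count grows with the square of the ratio between array side and chunk side, which is why the 100 x 100 chunking in the notebook produces 10,000 chunks and a noticeably slow `compute()`; larger chunks mean fewer tasks and less scheduling overhead.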


--------------------------------------------------------------------------------
/04-delayed.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# Dask Delayed\n",
  8 |     "\n",
  9 |     "## Notebook Objectives\n",
  10 |     "* **Recap Delayed API** from the first course.\n",
 11 |     "* **Parallelize Python code with Delayed API**.\n",
 12 |     "* **Best Practices** for using Dask Delayed.\n",
 13 |     "* **References** for further reading."
 14 |    ]
 15 |   },
 16 |   {
 17 |    "cell_type": "markdown",
 18 |    "metadata": {},
 19 |    "source": [
 20 |     "## Recap basics of Delayed API"
 21 |    ]
 22 |   },
 23 |   {
 24 |    "cell_type": "markdown",
 25 |    "metadata": {},
 26 |    "source": [
  27 |     "We introduced the Dask Delayed API in the first course:\n",
 28 |     "* It can be used to parallelize regular Python code.\n",
  29 |     "* It is evaluated lazily, meaning no computation is performed until necessary, or until we call `compute()`.\n",
 30 |     "* The task graph generated can be visualized using `visualize`."
 31 |    ]
 32 |   },
 33 |   {
 34 |    "cell_type": "code",
 35 |    "execution_count": 1,
 36 |    "metadata": {},
 37 |    "outputs": [],
 38 |    "source": [
 39 |     "from time import sleep\n",
 40 |     "\n",
 41 |     "def inc(x):\n",
 42 |     "    sleep(1)\n",
 43 |     "    return x + 1\n",
 44 |     "\n",
 45 |     "def add(x, y):\n",
 46 |     "    sleep(1)\n",
 47 |     "    return x + y"
 48 |    ]
 49 |   },
 50 |   {
 51 |    "cell_type": "code",
 52 |    "execution_count": 2,
 53 |    "metadata": {},
 54 |    "outputs": [
 55 |     {
 56 |      "data": {
 57 |       "text/plain": [
 58 |        "Delayed('add-6aae8eb9-1fa2-4af7-af1e-0b71f22ff8a6')"
 59 |       ]
 60 |      },
 61 |      "execution_count": 2,
 62 |      "metadata": {},
 63 |      "output_type": "execute_result"
 64 |     }
 65 |    ],
 66 |    "source": [
 67 |     "from dask import delayed\n",
 68 |     "\n",
 69 |     "x = delayed(inc)(10)\n",
 70 |     "y = delayed(inc)(10)\n",
 71 |     "\n",
 72 |     "z = delayed(add)(x, y)\n",
 73 |     "z"
 74 |    ]
 75 |   },
 76 |   {
 77 |    "cell_type": "code",
 78 |    "execution_count": 3,
 79 |    "metadata": {},
 80 |    "outputs": [
 81 |     {
 82 |      "data": {
 83 |       "text/plain": [
 84 |        "22"
 85 |       ]
 86 |      },
 87 |      "execution_count": 3,
 88 |      "metadata": {},
 89 |      "output_type": "execute_result"
 90 |     }
 91 |    ],
 92 |    "source": [
 93 |     "z.compute()"
 94 |    ]
 95 |   },
 96 |   {
 97 |    "cell_type": "code",
 98 |    "execution_count": 4,
 99 |    "metadata": {},
100 |    "outputs": [
101 |     {
102 |      "data": {
103 |       "image/png": "iVBORw0KGgoAAAANSUhEUgAAALMAAAFyCAYAAAC+1+tWAAAABmJLR0QA/wD/AP+gvaeTAAAgAElEQVR4nO3deXxM9/4/8NfMZB9JiIQIbVJboqKC2gUpSqtoS6WqdtX14na59Gup20fbS1FVO9fSRqKC0ha1L7FEKmIniQQl1pB9kExm3r8/XPk1SGQ5M5+Zz3k/Hw9/dDI555V3Xzk5c+bMORoiIjBm/9ZoRSdgTClcZiYNLjOThoPoAEpLS0vDwYMHRcewef379xcdQXEa2V4ARkdHIzw8XHQMmyfZ/3ZA5heARMT/HvNv9erVov/XWIy0ZWbqw2Vm0uAyM2lwmZk0uMxMGlxmJg0uM5MGl5lJg8vMpMFlZtLgMjNpcJmZNLjMTBpcZiYNLjOTBpeZSYPLzKTBZWbS4DIzaXCZmTS4zEwaXGYmDS4zkwaXmUmDy8ykwWVm0uAyM2lwmZk0uMxMGlxmJg0uM5MGl5lJg8vMpMFlZtLgMjNpcJmZNLjMTBpcZiYNLjOTBpeZSYPLzKTBZWbS4DIzaXCZmTS4zFaQlpYmOoIqOIgOYCnR0dGiIwAADAYDFi5ciE8++UR0FABAbGys6AgWI22Zw8PDRUcoxtbyyEi63Yz+/fuDiGzmX8eOHQEACxcuFJ7l7/9kJF2Zbcm1a9ewf/9+AMBPP/0kOI38uMwW9PPPP0OrvT/i2NhY/PXXX4ITyY3LbEE//fQTTCYTAMDBwQFr1qwRnEhuXGYLSU1NxfHjx4v2TwsLC3lXw8K4zBYSFRUFB4f/f7CIiHDy5EmcOXNGYCq5cZktJCIiAkajsdhjjo6ONnP8W0ZcZgs4evQozp0798jjRqMRy5cvF5BIHbjMFrBq1So4OTk99muXLl1CfHy8lROpA5dZYUSEyMhIFBQUPPbrjo6OWLVqlZVTqQOXWWH79+/H1atXS/y60WhERERE0SE7phwus8JK28V4ID09Hfv27bNSIvXgMiuosLAQq1evhslkgrOzc9E/JyenYv8NgHc1LEDas+ZEuH37Nv71r38Ve+zEiROIiorC1KlTiz1erVo1a0ZTBQ3JegqVjYiOjkZ4eLi0Z6rZkDW8m8GkwWVm0uAyM2lwmZk0uMxMGlxmJg0uM5MGl5lJg8vMpMFlZtLgMjNpcJmZNLjMTBpcZiYNLjOTBpeZSYPLzKTBZWbS4DIzaXCZmTS4zEwaXGYmDS4zkwaXmUmDy8ykwWVm0uAyM2lwmZk0uMxMGlxmJg0uM5MGX2zcQrKyspCbm4srV64AAM6fPw+9Xg93d3e4ubkJTicnvth4JZnNZpw4cQJ79uxBfHw8kpKSkJycjJycnBK/p3bt2ggMDERQUBDatWuHsLAw+Pn5WTG1lNZwmSvAbDZj165diIiIwKZNm3D79m14e3ujVatWaNSoERo2bAh/f394eHhAr9dDr9cjKysLeXl5yMnJKSr86dOnER8fj4KCAgQFBaF///4YNGgQ6tevL/pHtEdrQKzMsrKy6Ouvv6Y6deoQAGrdujXNnDmTjh07RiaTqULLNBgMtHXrVvr444/Jz8+PAFD79u1p/fr1ZDabFf4JpBbNZS6DnJwcmjhxInl6epKnpyeNGzeOEhMTFV9PYWEhbdmyhV577TXSarUUHBxM0dHRiq9HUlzmJ1m9ejXVrl2bqlevTl9//TVlZWVZZb2nTp2it956i7RaLXXp0sUivzyS4TKXJD09nXr27EkajYZGjBhB6enpQnLExsZSs2bNyMnJib799lve9SgZl/lxYmJiqE6dOuTv70/79u0THYcKCwtp2rRp5OjoSD179qRbt26JjmSLuMwPW7FiBTk4OFCfPn0oIyNDdJxiDh48SE8//TTVrVuXkpOTRcexNVzmv5s+fTppNBqaMGGCzf45T09Pp1atWlGNGjUoPj5edBxbwmV+4JtvviGtVkuzZ88WHeWJcnNzqXv37uTh4UEJCQmi49
gKLjMR0ZIlS0ij0dC8efNERymz/Px86tq1K9WsWZNSUlJEx7EFXOYtW7aQTqejyZMni45Sbjk5OfT8889T/fr1KTMzU3Qc0aJV/Xb21atXERISgm7duiEyMlJ0nAq5ceMGmjdvjrZt22Lt2rWi44ik3hvBm81mDBw4EF5eXli0aJHoOBVWs2ZNREZGYsOGDZg3b57oOGKJ/tsgyvz588nR0ZGOHTsmOooiJk+eTHq9ni5duiQ6iijq3M24efMmgoKCMGrUKEydOlV0HEUUFBTgueeeQ3BwsFp3N9S5mzFhwgRUqVIFkyZNEh1FMU5OTpgzZw7WrVuHnTt3io4jhOq2zJcuXUL9+vWxaNEiDBs2THQcxfXo0QP37t3Dnj17REexNvVtmadPnw5fX18MHDhQdBSLmDx5Mvbu3Yv9+/eLjmJ1qtoy5+TkoFatWpg2bRo++ugj0XEsJjQ0FDVq1MC6detER7EmdW2Zo6OjYTab8fbbb4uOYlEjR47Exo0bcfv2bdFRrEpVZY6IiECfPn1QtWpV0VEsql+/fnBycsKaNWtER7Eq1ZQ5PT0d+/btw4ABA0RHsTi9Xo+ePXvil19+ER3FqlRT5j179kCn0yEsLMzq687Ly8Pvv/+OcePGVeo55dGtWzccOHAA+fn5iizPHqimzLt370aLFi3g4eFh9XVv2bIFo0ePxs8//1yp55RH165dcefOHcTFxSmyPHugmjLHxcWhQ4cOQtbdr18/tGrVCg4OJV9AqizPKQ9/f3/UqVMHsbGxiizPHqiizESE5ORkPPvss8IyaLVaaLWlj7sszymPRo0aISkpSbHl2TpVXGsuLS0NeXl5CAwMVGyZycnJOHToEE6cOIH27dvjtddeK/b1jIwMrF27FhcvXsTzzz8PIoJGoyn3cyojMDAQCQkJii3P5gk7x8mKYmJiCABdvXpVkeXNmjWLOnfuTGazmS5cuEABAQE0f/78oq8nJiZSy5Yt6eDBg2Q0GmnRokXk7OxMDRs2LNdzlMhZq1YtxZZn46JVsZuRnZ0NAPD09FRkefPmzUPjxo2h0WgQEBCAkJAQbNy4sejrQ4YMQefOndG2bVs4ODjgnXfeQe3atYstoyzPqSxPT89SL+AoG1WUOS8vDzqdDq6uroosb8+ePfjqq68AAGfOnMHly5dx7tw5AMCuXbsQFxdX7BCgRqNBy5Yti3YhyvIcJbi7u+POnTswmUyKLdOWqaLMd+/ehaurq2JFqV27Nv7880+MHj0aZ8+eRb169WA2mwEAx48fBwAEBwcX+56/r7ssz1GCXq8HEeHOnTuKLtdWqeIFoIuLC+7evavY8iZNmoS9e/di69atcHV1LXZCz4M/63FxcXjqqaeKfd+DspblOUp4UGKl/iLZOlVsmd3d3WEymRQp9IULF/DVV1/h7bffLirJg60yADRp0gTA/V2JkpTlOUrIzc2Fm5ubYseubZ0qyvzgXb8HLwQrIy8vDwCwatUq5OTkYN++fYiJiUFmZiby8vIQFhaGoKAgREREICYmBsD9T4Hv3bsXaWlpOHHiBF5++eUnPqewsLDSWXNycoS84ymKKsrs7+8P4P5WtbKaNGmC4cOHY//+/WjRogXOnDmDOXPmIC8vD3369AER4Y8//kCjRo3QqVMn1KtXD5999hmef/55hISE4ODBgwDwxOcoUebU1FQEBARUejn2QhUn55vNZri7u2PevHkYOnSoIsvMzc2Fu7t70X/n5+fD2dm52HPS09Ph5uYGvV6PvLw8VKlS5ZHllOU5FdWjRw/4+vpixYoVii3Thqnj5HytVosGDRrg7Nmzii3z70UG8EiRAcDHxwd6vR4ASixpWZ5TUYmJiYq+62nrVFFmAGjVqhUOHDggOobVXL58GX/99Rdat24tOorVqKbMYWFhiIuLQ25urugoVrFz5064uLigbdu2oqNYjarKbDKZio4eyG7nzp1o166dao4xAyoqs6+vL9q0aaPYye+27M6dO/
j111/x6quvio5iVaopMwAMGjQIv/zyi/S7Ghs2bMDdu3fRv39/0VGsSlVlDg8Ph8lkwqpVq0RHsailS5fipZdeQs2aNUVHsSpVldnLywuDBw/GtGnTFHlTwhbFxcVh165dGDNmjOgoVqeKN03+7vz58wgMDMSPP/6It956S3QcxfXq1Qvp6ek4dOiQ6CjWps4bwQ8ZMgQHDhzAyZMnpXq1HxMTg86dO2Pjxo14+eWXRcexNnWW+fr16wgKCsLYsWMxZcoU0XEUUVhYiBYtWqBWrVrYsmWL6DgiqOPt7If5+vpiypQpmDZtmqJvcYs0Y8YMnDt3DvPnzxcdRRhVbpmB+1uy0NBQGAwGxMXF2fXuxqFDh9CxY0d88803+PTTT0XHEUWduxkP/PXXX2jWrBn69u2LJUuWiI5TIRkZGWjevDkaN26MjRs3Kv7RKzuizt2MB/z9/bFixQosW7YM06dPFx2n3O7evVt0DvWPP/6o5iLfJ+D6BjZn3rx5pNFoaOnSpaKjlFlhYSG9/vrrVL16dTpz5ozoOLYgWh0fDnuCDz74AGlpaXj33XcBAMOHDxecqHT5+fkYOHAgtmzZgh07dqBRo0aiI9kG0b9OtmTq1Kmk0Wjoiy++EB2lRLm5udStWzeqWrUq7d27V3QcW8L3zn7Y3LlzSavVUt++fSkrK0t0nGJOnz5NjRs3plq1atHRo0dFx7E16rg8V3l8+OGH2LRpE2JiYtC6dWscPXpUdCQQERYuXIgWLVrAx8cH8fHxCAkJER3L9oj+dbJVV65coc6dO5ODgwONHTuWsrOzheQ4fvw4dejQgXQ6HU2ePJkKCwuF5LADvJtRGrPZTEuXLiVvb2+qVasWzZ49m1JTUyknJ8ei601OTqaUlBQaNWoUOTg4UJs2bXi34sm4zGVx+/ZtGjNmDLm6upKDgwN9+eWXdPnyZYus6+DBgxQaGkparZbq1q1Ly5YtI5PJZJF1SYbLXFbXrl2jgIAAcnV1JS8vL9JqtdSlSxdavHgxpaSkVHi5RqORDh48SJMnT6b69esTAKpbty4BoBEjRpDZbFbwp5BatKrfzi6rmzdvIjQ0FMnJyWjSpAni4+OxefNm/PTTT9i2bRsMBgP8/f3Rrl07BAYGIigoCAEBAahSpQqqVKkCvV6PnJwcZGdnIycnB8nJyUhKSsLp06dx4MAB5Obmok6dOnjjjTcwaNAgeHp6ol69egDuH/NesmSJoreHkJS6z80oixs3biA0NBQXL16E0WhEr1698NtvvxV9vaCgoOjTHQkJCUhMTMT58+dL/SSLu7t7Uenbt2+PsLCwYhdryc/Ph6urK4gIWq0Wo0aNwvz58/nt6tKt4XcAS5Geno5OnToVFdnJyanounUPODk5ITQ0FKGhoUWPGY1GXLt2Dbm5ucjLy4PBYICnpyc8PDzg7u4OX1/fUtfr7OyMqlWrIjMzE2azGYsXL4bRaMSSJUu40KXgMpcgPT0dHTt2xPnz52E0GgHcv3ZyWW7V4OjoiKeffrpS6/fz80NmZiaA+9fKW758ObRaLRYtWsSFLgHviD3GgyKnpKQUFRm4v8WtU6eOVTI888wzxf7bbDZj6dKlGDt2LHjP8PG4zA/JzMxEly5dkJqa+sh+r9lstlqZn376aTg6Oj6y/rlz5+Kf//ynVTLYGy7z32RmZqJz585ITEwstkX+O2uVuXbt2o89gmE2mzFnzhyMHTvWKjnsCZf5f7KyshAWFoazZ8+WWOSy7jMroU6dOiXmeFBo3kIXx2XG/Y8ehYaG4syZMyUWCLh/SM1anxWsU6dOsXulPMxsNmP27Nn47LPPrJLHHnCZAURERCAxMfGJz/Pz87NCmvuetDuj0Wig0WiwcuVKnDx50kqpbBuXGcCYMWOQlpaGjz/+GM7OziXenenhIwyWVFKZtVotNBoN/Pz88N133+H8+fNFd69SOy7z/9SsWRNTp05FWloa3n777UduOebo6PjIGyaW5ObmVu
xWEzqdDhqNBt7e3liwYAEuXryIMWPG2PUlEpTGZX6It7c3Tp48iRdffBFffvklvLy8oNPpYDKZrPbi74FatWoVvUHSpEkTrFy5EgUFBcjIyFDNvf3KReRpTrZo06ZNBIDi4+OJiMhgMNDs2bPJz8+Pli1bZtUs3bt3p9DQUNq2bVvRY+PHj6caNWqQwWCwahY7wGfNPSw0NBQeHh7YtGlTsccLCgpgMBhQrVo1q2W5cePGI9dYvnnzJp555hlMnToV//jHP6yWxQ7wWXN/FxMTg06dOmHfvn3o0KGD6DglGj16NNavX4/U1FQ4OTmJjmMruMx/16NHD9y9exd79+4VHaVUly9fRv369bFgwQKbv8aHFXGZHzh27BiaN2+OzZs3o0ePHqLjPNHw4cMRExODpKQk6HQ60XFsAZf5gX79+iE1NRUJCQl2cYplamoqAgMDERUVpbob8ZSAywzcvy1v48aNER0djb59+4qOU2bh4eFISkrC0aNH7eIX0MK4zMD920L8+eefOH36tF191u748eNo1qwZfv/9d/Ts2VN0HNG4zJcuXUL9+vWxZMkSDBkyRHSccuvZsycyMjIQGxsrOopo6r4+MwBMmzYNtWrVwoABA0RHqZBJkybh0KFDNn8ExhpUvWW+ceMGnnnmGcyYMQMffPCB6DgV1rlzZzg7O2Pr1q2io4ik7i3zzJkz4eHhgWHDhomOUimff/45tm3bhsOHD4uOIpRqt8wZGRkICAjAxIkT8a9//Ut0nEpr1aoVnnrqKaxbt050FFHUu2WeM2cOtFpt0dXy7d348eOxfv16nDp1SnQUYVRZZoPBgLlz52LMmDHw9PQUHUcRr732Gho3boxvv/1WdBRhVFnmhQsX4u7du1KddabRaPDZZ58hKioKKSkpouMIoboy5+fnY9asWXjvvffg7e0tOo6iBg4ciICAAHz33XeiowihujIvX74ct27dkvJj+jqdDp988gmWLVuGq1evio5jdaoqs8lkwsyZMzF06FCrfwTKWoYPH47q1atj1qxZoqNYnarKvGrVKly4cAGffPKJ6CgW4+zsjLFjx2LBggW4deuW6DhWpZoyExGmTZuGAQMGoEGDBqLjWNQHH3wAV1dXzJ07V3QUq1JNmTds2IDTp09j3LhxoqNYnF6vx4cffog5c+YgNzdXdByrUU2Zp0+fjj59+iA4OFh0FKsYPXo0jEYjFi1aJDqK1aiizNu3b0dsbKwUb1uXlZeXF959913MmDEDd+/eFR3HKlRxbsYLL7wAnU6H7du3i45iVdevX0fdunUxc+ZMvP/++6LjWJr852bExcVh9+7d+L//+z/RUazO19cXQ4YMwdSpU0u9uqkspN8y9+rVC+np6Th06JDoKEJcuHABDRs2xLJlyzBo0CDRcSxJ7o9NnThxAiEhIfj111/Rq1cv0XGEGTx4MA4fPmx3n3EsJ7nLPGDAAJw5cwbHjh1T9aeXz549i+DgYKxZswavv/666DiWIm+ZU1NTERQUhIiICLz55pui4wj3+uuv4+LFizhy5Iisv9jyvgD8z3/+A39/f/Tr1090FJswadIkHDt2TOojOlJumdPS0lCvXj3MmzcPI0eOFB3HZnTv3h35+fnYs2eP6CiWIOeWecaMGahRowYGDx4sOopN+fzzz7F3717s379fdBSLkG7LfPv2bQQEBOCrr77CmDFjRMexOaGhofD09MTGjRtFR1GafFvmWbNmwdnZGSNGjBAdxSaNHz8emzZtQkJCgugoipNqy5yTkwN/f398+umnmDBhgug4NqtFixaoX78+Vq9eLTqKkh49NLdhwwa7vOYacP9Gj/n5+XBxcbH44afs7GyLLNca8y8sLERhYSFcXFwsuh5Lesz81zxyy6KCggLk5ORg8eLF1kllZ+Lj4y06G55/6Uqbf4n333rnnXcsFsieeXp6WqVoPP/HK23+0r0AZOrFZWbS4DIzaXCZmTS4zEwaXGYmDS4zkwaXmUmDy8
ykwWVm0uAyM2lwmZk0uMxMGlxmJg0uM5MGl5lJg8vMpMFlZtLgMjNpcJmZNLjMTBpcZiYNLjOTBpeZSYPLzKTBZWbS4DIzaXCZmTS4zEwaXGYmDS4zkwaXmUmDy8ykwWVm0uAyM2lwmZk0uMxMGlxmJg0uM5MGl5lJg8vMpMFlZtLgMjNpcJmZNOyizLGxsaIjqJq9zN+hpC94enpaM0eJzGYz7t69C71eLzoKAMBoNFplPTz/xytt/o+UuWXLlli8eLFFA5XHH3/8gfXr12PcuHHw8fERHcfieP6VQDYuMDCQANCXX34pOooq2dH8ozVERKJ/oUpy6tQpNGnSBABQv359nDt3TnAidbGz+a+x6ReAUVFRcHR0BACkpKTg2LFjghOpi73N32bLTESIiIgo2uF3cnLCqlWrBKdSD3ucv83uZsTGxqJdu3bFHvP19cWVK1eg1drs76A07HD+trubsWrVKjg5ORV77Pr16zhw4ICgROpij/O3yTKbTCZERkaioKCg2OOOjo42/6dOBvY6f5ss886dO5GRkfHI40ajEVFRUVZ740Kt7HX+NlnmqKioR/7EPZCdnY0dO3ZYOZG62Ov8ba7M9+7dw9q1ax/5E/eAo6MjIiMjrZxKPex5/jZX5k2bNuHOnTslft1oNGLdunUwGAxWTKUe9jx/mytzVFTUEw/93Lt3D5s2bbJSInWx5/nbVJlzcnKwefNmmEymJz7Xll9V2yt7n3+Jp4CKoNVqsX///mKPbd++HZ9//jni4+OLPf7gbVamHHufv02VuUqVKmjRokWxx1JTUwHgkceZ8ux9/ja1m8FYZXCZmTS4zEwaXGYmDS4zkwaXmUmDy8ykwWVm0uAyM2lwmZk0uMxMGlxmJg0uM5MGl5lJg8vMpMFlZtLgMjNpcJmZNLjMTBpcZiYNLjOTBpeZSYPLzKTBZWbS4DIzaXCZmTS4zEwaXGYmDS4zkwaXmUmDy8ykYTPXZ87NzUVsbCxOnz6NpKQkXLx4EVlZWbh58yZcXV3RvHlzVKlSBV5eXmjYsCECAwMREhKCkJAQ6HQ60fHtngzzF3q74XPnziEyMhJbt25FfHw8CgsLUbNmTQQFBaFu3brw8vKCXq+Hm5sbsrKyYDAYkJ6ejqSkJCQnJ8NgMKBq1aro2LEjXn31VfTt2xceHh6ifhy7I9n814CszGg00sqVK6lt27YEgPz8/Oi9996j1atX040bN8q8HLPZTKdOnaIffviB+vTpQ87OzuTq6koDBw6khIQEC/4E9k3i+Udbrcwmk4n++9//Ur169cjBwYHefPNN2rJlCxUWFiqy/IyMDFq0aBE1b96cNBoNvfTSS3T48GFFli0DFczfOmU+fPgwtWzZkhwdHWnUqFGUmppq0fVt3ryZ2rVrR1qtlt577z3KyMiw6PpsnUrmb9kyFxYW0oQJE0ir1VKnTp3o1KlTllxdMWazmSIiIsjX15d8fX1p+/btVlu3rVDZ/C1X5qtXr1LHjh3J1dWVFi1aRGaz2VKrKlVWVhaFh4eTVqulSZMmkclkEpLD2lQ4f8uUOSkpiQICAigwMJCOHz9uiVWU28KFC8nZ2ZnCw8MpPz9fdByLUun8lS9zQkIC+fj4UJs2bej27dtKL75Sdu3aRR4eHtS1a1cyGAyi41iEiuevbJmTkpLIx8eHunXrRnl5eUouWjEJCQnk7e1NPXv2JKPRKDqOolQ+f+XKfPXqVQoICKA2bdrY7CAfOHToEOn1eho6dKjoKIrh+StU5sLCQurUqRM1bNiQbt26pcQiLW7z5s2k0+lo3rx5oqNUGs+fiJQq84QJE8jFxcXu3nmbMmUKOTs705EjR0RHqRSePxEpUebDhw+TTqejhQsXKhHIqkwmE73wwgsUHBxst/vPPP8ilSuzyWSi1q1bU4cOHYQdx6yslJQUcnFxoRkzZoiOUm48/2IqV+alS5eSo6OjVd9ZsoQvvviC3N3dy3WijS3g+RdT8TIXFhZSvX
r16J133qlMAJtgMBioRo0aNH78eNFRyozn/4iKlzkyMpIcHBwoJSWlMgFsxtSpU8nDw8NuTkri+T+i4mVu3749hYeHV/TbbU5OTg65u7vTDz/8IDpKmfD8HxFdoc8ApqSk4ODBgxg2bJiynxUQyN3dHX379kVERIToKE/E83+8CpU5MjISvr6+6Nq1a4VXbIsGDx6Mw4cPIzExUXSUUvH8H69CZd66dSt69+5d6Q8ynj9/HsOHD0daWlqllqOUjh07olq1ati2bZvoKKXi+T9eucucl5eH+Ph4hIWFVWiFf5eQkIDly5fj5MmTlV6WEnQ6HTp16oTdu3eLjlIinn8pyruXvW3bNgJA169fr+iOejHp6emKLEcps2fPJi8vL9ExSsTzL1H5XwCeOXMGNWrUQM2aNSv22/MQb29vRZajlCZNmiAjIwM3btwQHeWxeP4lK3eZk5KSEBgYWO4VPY7ZbMbu3btx+PDhoscuX76M2bNnw2w249SpU/j6668REREBs9lc7Hvz8vKwcuVKTJo0CdHR0cjOzlYk04OfzVZfBPL8S1HebXmPHj0UOQ/19OnT1K9fPwJACxYsICKi3377jXx8fAgAzZo1i4YNG0avvPIKAaBvvvmm6HvPnj1LL7/8Mh0/fpyMRiMNGDCAqlevrtinjt3c3GjZsmWKLEtpPP8Slf9Nk7Zt29LYsWPL+22PdeLEiWLDJCIaP348AaAdO3YUPda8eXNq0aIFEd1/GzckJIQWL15c9PUjR46Qk5MT/f7774rkqlWrFn3//feKLEtpPP8SRZf7WnN5eXlwd3cv/5+Ax3B2dn7kMVdXVwBAUFBQ0WPPPvsstm7dCgDYvHkzjh07hp49exZ9vXnz5sjNzYWTk5Miudzd3ZGbm6vIspTG8y9ZufeZ8/PzFQtdVjqdDvS/S+IdP34cer0ePj4+xZ6jZCYXFxfcu3dPseUpiedfsnKXWa/Xw2AwlHtFSjGbzTAYDBY9Fpybm6vY1k9pPP+SlbvM7u7uyMvLK/eKlA6J4psAAAT4SURBVNKkSRMAQFRUVLHHb9++jfXr1yuyDlsuM8+/ZOXeZ/b29sbNmzfLvaLHyc/PBwDcunWr6LGcnBwAQEFBQdFjt27dQn5+PogIvXv3RrNmzfDjjz/CxcUFb7zxBk6cOIE9e/YgOjq60pmMRiOysrJQvXr1Si/LEnj+pSjvS8Zx48ZR06ZNy/ttjzh06FDRoaHg4GDauHEj7dmzh+rWrUsAaOTIkXTt2jVatWoVeXh4EACaMmUKGY1GSktLo27dupFGoyGNRkOdO3emtLS0Smciun/YCQAdPXpUkeUpjedfovIfmlu+fDm5ubnZxDXbMjMzFb9qz4YNG0ij0djstSd4/iUq/9vZTZs2xZ07d3Dq1Kny/xlQWNWqVeHl5aXoMv/88080aNAAer1e0eUqhedfsgqVuXr16jZ9Zlll7Nq1Cy+88ILoGCXi+Zes3GXWarXo2LEjduzYUaEV2rKsrCzEx8ejc+fOoqOUiOdfsgqdnN+nTx9s27YNt2/frtBKbdXatWvh4OCA7t27i45SKp7/41WozP369YOTk5Mih2JsSUREBHr37o2qVauKjlIqnn8JKvqqc9CgQdS0aVO7vZLOw06fPk0ajYY2btwoOkqZ8PwfUfFLDRw7dsyu/uc/ycCBA6lRo0Y2ccirLHj+j6jc5bleeeUVatOmjd1vHZKSkkin01FkZKToKOXC8y+mcmU+cuQI6XQ6WrFiRWUWI1z37t3pueeeU+yeeNbC8y+m8pe0/eijj6hGjRp2c1mrh61Zs4Y0Gg3t27dPdJQK4fkXqXyZMzMzyc/Pj/r161fZRVndlStXyMfHh0aMGCE6SoXx/Isoc+X8PXv22N0tFUwmE3Xp0oUaNGhA2dnZouNUCs+fiJS8Qc+///1vcnJysps7ob7//vvk6upqM/fJqyyev4JlNpvNNHToUHJzc6ODBw8qtViLmDJlCul0Olq7dq
3oKIrh+St8H8CCggJ66aWXqFq1ajb7gmrq1KkEgObMmSM6iuJUPn/l79B6584d6t27N7m5uSn20XMlGI1Gev/990mn0xX7mLxsVDx/y9w722g00siRI4tu/i36+G1aWlrRTdHXr18vNIs1qHT+linzAwsXLiQXFxfq1KkTJScnW3JVJYqOjiYfHx8KCgqiEydOCMkgisrmb9kyE90/h6Bp06bk7OxMEydOtNrHkRITE+nFF18kjUZDI0aMoNzcXKus19aoaP6WLzPR/T9733//PXl4eJC3tzd98cUXFnvH6tSpUzRo0CBycHCg5557jvbv32+R9dgTlczfOmV+ID09nSZOnEienp5UpUoVGjx4MG3fvr3S+3SZmZm0ZMkSCg0NJY1GQ8HBwRQVFSV8X9HWSD7/aA3R/667ZEXZ2dlYuXIlIiIiEBcXB09PT3Tq1AlhYWFo0qQJGjZsiKeeeuqx31tQUIDU1FQkJibi8OHD2LVrF44cOQJHR0f07t0bQ4YMQY8ePaDRaKz8U9kPSee/RkiZ/+7cuXPYunUrdu3ahX379hVdkMTNzQ3u7u6oUqUK9Ho98vLykJmZidzcXBQWFkKj0SAwMBBhYWEICwvDiy++CE9PT5E/il2SaP7iy/yw9PR0JCYmIjU1Fbm5ucjLy4PBYIC7uzuqVq2KatWqoWHDhggMDCy6YiVTjh3P3/bKzFgFranQB1oZs0VcZiYNLjOTxv8D94J8b9F1GAcAAAAASUVORK5CYII=\n",
104 |       "text/plain": [
105 |        "<IPython.core.display.Image object>"
106 |       ]
107 |      },
108 |      "execution_count": 4,
109 |      "metadata": {},
110 |      "output_type": "execute_result"
111 |     }
112 |    ],
113 |    "source": [
114 |     "z.visualize()"
115 |    ]
116 |   },
117 |   {
118 |    "cell_type": "markdown",
119 |    "metadata": {},
120 |    "source": [
121 |     "## Parallelize Python code with Delayed\n",
122 |     "\n",
123 |     "We will look at more examples of using Delayed in this section."
124 |    ]
125 |   },
126 |   {
127 |    "cell_type": "markdown",
128 |    "metadata": {},
129 |    "source": [
130 |     "### Parallel for-loop\n",
131 |     "\n",
132 |     "Loops whose iterations are independent of one another are among the most common parts of a program that can be parallelized. Consider the following sequential for-loop."
133 |    ]
134 |   },
135 |   {
136 |    "cell_type": "code",
137 |    "execution_count": 5,
138 |    "metadata": {},
139 |    "outputs": [],
140 |    "source": [
141 |     "data = [1, 2, 3, 4, 5, 6, 7, 8]"
142 |    ]
143 |   },
144 |   {
145 |    "cell_type": "code",
146 |    "execution_count": 6,
147 |    "metadata": {},
148 |    "outputs": [
149 |     {
150 |      "name": "stdout",
151 |      "output_type": "stream",
152 |      "text": [
153 |       "CPU times: user 948 µs, sys: 1.19 ms, total: 2.14 ms\n",
154 |       "Wall time: 8.03 s\n"
155 |      ]
156 |     },
157 |     {
158 |      "data": {
159 |       "text/plain": [
160 |        "44"
161 |       ]
162 |      },
163 |      "execution_count": 6,
164 |      "metadata": {},
165 |      "output_type": "execute_result"
166 |     }
167 |    ],
168 |    "source": [
169 |     "%%time\n",
170 |     "\n",
171 |     "results = []\n",
172 |     "for x in data:\n",
173 |     "    y = inc(x)\n",
174 |     "    results.append(y)\n",
175 |     "\n",
176 |     "total = sum(results)\n",
177 |     "total"
178 |    ]
179 |   },
180 |   {
181 |    "cell_type": "markdown",
182 |    "metadata": {},
183 |    "source": [
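184 |     "We can parallelize this loop by wrapping the calls to `inc` and `sum` with `delayed`. Doing so only builds a task graph; nothing is computed until we call `compute()`."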
185 |    ]
186 |   },
187 |   {
188 |    "cell_type": "code",
189 |    "execution_count": 7,
190 |    "metadata": {},
191 |    "outputs": [
192 |     {
193 |      "name": "stdout",
194 |      "output_type": "stream",
195 |      "text": [
196 |       "CPU times: user 2.09 ms, sys: 944 µs, total: 3.03 ms\n",
197 |       "Wall time: 2.43 ms\n"
198 |      ]
199 |     },
200 |     {
201 |      "data": {
202 |       "text/plain": [
203 |        "Delayed('sum-fec9e99c-c860-4e82-8ac7-1707b89370c2')"
204 |       ]
205 |      },
206 |      "execution_count": 7,
207 |      "metadata": {},
208 |      "output_type": "execute_result"
209 |     }
210 |    ],
211 |    "source": [
212 |     "%%time\n",
213 |     "\n",
214 |     "results = []\n",
215 |     "for x in data:\n",
216 |     "    y = delayed(inc)(x)  # lazy: records the call, does not run inc yet\n",
217 |     "    results.append(y)\n",
218 |     "\n",
219 |     "total = delayed(sum)(results)  # lazy sum over the delayed results\n",
220 |     "total"
221 |    ]
222 |   },
223 |   {
224 |    "cell_type": "code",
225 |    "execution_count": 8,
226 |    "metadata": {},
227 |    "outputs": [
228 |     {
229 |      "name": "stdout",
230 |      "output_type": "stream",
231 |      "text": [
232 |       "CPU times: user 2.94 ms, sys: 1.87 ms, total: 4.82 ms\n",
233 |       "Wall time: 1.01 s\n"
234 |      ]
235 |     },
236 |     {
237 |      "data": {
238 |       "text/plain": [
239 |        "44"
240 |       ]
241 |      },
242 |      "execution_count": 8,
243 |      "metadata": {},
244 |      "output_type": "execute_result"
245 |     }
246 |    ],
247 |    "source": [
248 |     "%%time\n",
249 |     "total.compute()"
250 |    ]
251 |   },
252 |   {
253 |    "cell_type": "code",
254 |    "execution_count": 9,
255 |    "metadata": {},
256 |    "outputs": [
257 |     {
258 |      "data": {
259 |       "image/png": "iVBORw0KGgoAAAANSUhEUgAAAvMAAAF5CAYAAAASxqX0AAAABmJLR0QA/wD/AP+gvaeTAAAgAElEQVR4nOzdeVhU9f4H8PcMMyyyiooiKIYb5JJLFoq7uHOtrpJlYZvizYzUFtqpW96Lt230aoWlCd4yccfUCjMV1yKX65KikoCKu6CICDPz+f3Rj7mg4DozZ87wfj0PT0/AzHnPQeDNOd9FIyICIiIiIiJSm4VapRMQEREREdHtYZknIiIiIlIplnkiIiIiIpXSKR2AiMhZHD16FJs3b1Y6hsN7+OGHlY5AROQ0NJwAS0RkHWlpaRg5cqTSMRwef+0QEVkNJ8ASEVmbiPCtmrcFCxYo/aUhInI6LPNERERERCrFMk9EREREpFIs80REREREKsUyT0RERESkUizzREREREQqxTJPRERERKRSLPNERERERCrFMk9EREREpFIs80REREREKsUyT0RERESkUizzREREREQqxTJPRERERKRSLPNERERERCrFMk9EREREpFIs80REREREKsUyT0RERESkUizzREREREQqxTJPRERERKRSLPNERERERCrFMk9EREREpFIs80REREREKsUyT0RERESkUizzREREREQqxTJPRERERKRSLPNERERERCrFMk9EREREpFIs80REREREKsUyT0RERESkUizzREREREQqxTJPRERERKRSLPNERERERCrFMk9EREREpFIs80REREREKsUyT0RERESkUizzREREREQqxTJPRERERKRSLPNERERERCrFMk9EREREpFIs80REREREKsUyT0RERESkUizzREREREQqxTJPRERERKRSLPNERERERCrFMk9EREREpFIs80RETsxsNisdgYiIbEindAAiImeTlpamdAQAgIjg+++/x+DBg5WOAgDYsmWL0hGIiJwOyzwRkZWNHDlS6QhVzJ07V+kIRERkIxxmQ0RkJQ8//DBExGHenn32WQDA2rVrFc9S+Y2IiKyHZZ6IyAmVl5fjm2++AQDLf4mIyPmwzBMROaEff/wRRUVFAIAFCxbgypUrCiciIiJbYJknInJC33zzDfR6PQCguLgYP/zwg8KJiIjIFljmiYicTElJCZYuXYry8nIAgIuLC77++muFUxERkS2wzBMROZkVK1agtLTU8v9GoxHLly9HcXGxgqmIiMgWWOaJiJzM119/DRcXlyrvKy8vR3p6ukKJiIjIVljmiYicSGFhIb7//nsYjcYq79doNJg3b55CqYiIyFZY5omInMiiRYtgNpuveb/JZEJGRgbOnDmjQCoiIrIVlnkiIidyo6vvixcvtlMSIiKyB5Z5IiInUVBQgI0bN8JkMlX7cRFBamqqnVMREZEtscwTETmJBQsWQKut+ce62WzGli1bkJuba8dURERkSyzzREROYt68eddMfL2aiGDhwoV2SkRERLamUzoAERHduePHj+P8+fMICgqyvM9oNKK0tBReXl5VPnfLli32jkdERDaiERFROgQREVlfWloaRo4cCf6YJyJyWgs5zIaIiIiISKVY5omIiIiIVIplnoiIiIhIpVjmiYiIiIhUimWeiIiIiEilWOaJiIiIiFSKZZ6IiIiISKVY5omIiIiIVIplnoiIiIhIpVjmiYiIiIhUimWeiIiIiEilWOaJiIiIiFSKZZ6IiIiISKVY5omIiIiIVIplnoiIiIhIpVjmiYiIiIhUimWeiIiIiEilWOaJiIiIiFSKZZ6IiIiISKVY5omIiIiIVIplnoiIiIhIpVjmiYiIiIhUimWeiIiIiEilWOaJiIiIiFSKZZ6IiIiISKVY5omIiIiIVIplnoiIiIhIpVjmiYiIiIhUimWeiIiIiEilWOaJiIiIiFSKZZ6IiIiISKV0SgcgIiLrEBEcO3YMJ0+exKVLl7Bz504AwJo1a+Dl5QVPT08EBwejbt26CiclIiJr0YiIKB2CiIhujYhg7969WL
t2LTZv3ozs7GwcOHAAJSUlN3xsQEAAwsLC0KZNG/Ts2RN9+vRBw4YN7ZCaiIisbCHLPBGRSpjNZvz888+YN28eVq9ejVOnTqFu3bro3r077r77brRq1QqtWrVCYGAgPD09LVfjCwsLUVxcjOLiYuTm5uLAgQM4cOAAdu3ahV9++QUmkwlt2rTB8OHDERsbi+bNmyv9UomI6OawzBMRObrTp0/j3//+N+bOnYv8/Hzcd999iImJQd++fdGhQwdotbc//am4uBiZmZn48ccfsWDBApw4cQLdunXDs88+i5EjR0Kn42hMIiIHxjJPROSojh07hg8++ABffPEFvLy8MHbsWDz++OMICwuzyfFMJhMyMjKQkpKCRYsWISQkBAkJCXjiiSfg6upqk2MSEdEdYZknInI05eXl+PTTT/Hmm2/C09MTkyZNwvPPP486derYLcORI0fwySefYNasWQgODsaMGTMwcOBAux2fiIhuCss8EZEj2bBhA8aNG4e8vDy8+eabePHFFxW9Kp6bm4uJEydi2bJleOSRRzBt2jQEBAQoloeIiKpYyHXmiYgcgMlkwrvvvou+ffuiZcuW2LdvH1577TXFh7eEhIRg6dKl+O6777B161Z06NABP//8s6KZiIjof1jmiYgUdurUKfTv3x9JSUmYPn060tPTERISonSsKoYOHYqdO3eiW7du6N+/P/7+97+DN3aJiJTHZQqIiBSUk5ODgQMHQkSwefNmdOzYUelINfL19cWiRYswY8YMTJ48GYcPH8aXX34JvV6vdDQiolqLY+aJiBSyd+9eDBw4EI0aNcKqVatUNRb9p59+wkMPPYT7778fS5Ysgbe3t9KRiIhqI46ZJyJSwt69e9GzZ0+EhYXh559/VlWRB4B+/fphzZo12LlzJ4YPH46ysjKlIxER1Uq8Mk9EZGdHjx5FZGQkGjdujJ9++smuS05a2+7du9GzZ08MGjQIX3/99R1tYEVERLeMV+aJiOypqKgI/fv3h5+fH1avXq3qIg8A7dq1w+LFi7F06VK88sorSschIqp1WOaJiOxozJgxKCwsxOrVq+Hn56d0HKvo27cv5syZg48//hhLlixROg4RUa3C1WyIiOxk5syZWLJkCb7//ns0btxY6ThWNWrUKKxfvx7PPPMMOnbsiLvuukvpSEREtQLHzBMR2cGBAwfQoUMHJCQk4J133lE6jk2UlpYiIiICXl5eyMzMhEajUToSEZGzW8gyT0RkB1FRUTh16hS2b98Onc55b4ru3r0bnTp1QnJyMp5++mml4xAROTtOgCUisrX58+fj559/RnJyslMXeeDPCbHjx4/HK6+8gjNnzigdh4jI6fHKPBGRDZWVlaFFixYYOHAgvvjiC6Xj2EVRURHCwsIwatQofPTRR0rHISJyZrwyT0RkS6mpqThx4gTefPNNpaPYja+vL1555RV8/vnnOH36tNJxiIicGss8EZGNmEwmfPDBB3jiiScQEhKidBy7GjduHLy8vDBt2jSloxAROTWWeSIiG1mxYgUOHTqEhIQEpaPYXZ06dTBx4kTMnDkTpaWlSschInJaLPNERDaSmpqKfv36oUWLFkpHUcQzzzyD4uJipKenKx2FiMhpscwTEdnAuXPnsGrVKsTGxiodRTEBAQGIiorCvHnzlI5CROS0WOaJiGxg0aJF0Ol0eOihh5SOoqjRo0fjhx9+wLlz55SOQkTklFjmiYhsICMjA3369IGXl5fSURQ1ePBgmM1mrFu3TukoREROiWWeiMjKRATr1q1Dnz59lI6iOD8/P3Ts2BE///yz0lGIiJwSyzwRkZXt3r0bZ86cYZn/f3379sXatWuVjkFE5JRY5omIrGzHjh3w8PDAPffco3QUhxAREYH9+/fj8uXLSkchInI6LPNERFa2f/9+tGzZElotf8QCQOvWrWE2m3Hw4EGloxAROR3+piEisrIDBw6gdevWSsdwGC1atIBOp8P+/fuVjkJE5HR0SgcgIn
I2OTk5GDJkiNWeT0Swfv167Ny5Ey4uLggLC0P//v2xYsUKHD58GF5eXhgzZgwuXryI1NRUlJeXIzAwECNHjgQAXL58GcuXL8ewYcNw6tQprFq1Co0bN8Zf/vIXuLi44OTJk0hPT4dWq0VMTAx8fHyslh0AXF1d0aRJE+Tk5Fj1eYmIiGWeiMjqioqK4OfnZ7Xne/PNN3HXXXdh4sSJyMrKwnPPPYf+/fvjL3/5C9q2bYuioiKMGTMG3t7eGD16NIKDg9GmTRuMHDkS69evx9ixY3Hw4EF89NFHOHDgAPz8/PDyyy9j8ODBGDRoENatWweTyYQFCxZg+fLlNtmx1c/PD0VFRVZ/XiKi2o5lnojIyi5evAhvb2+rPJeIYNasWVi4cCEA4N5778WwYcMsHw8PD8fWrVst/+/t7Y0WLVpY/r9Xr1549tlnMXnyZDRt2hSTJ08GAGi1WiQlJWHUqFH4z3/+AwBo3rw5PvzwQ5jNZquP9/f29sbFixet+pxERMQx80REVmfNMq/RaNC6dWuMHDkSy5cvBwC89NJLt/Qcvr6+AIB27dpZ3lcxpr/yijthYWG4cuUKjh8/fqexr+Hj48MyT0RkAyzzRERWptFoICJWe74ZM2bAx8cHDz74IKKiolBYWHjHz+nu7n7N+/R6PQDg0qVLd/z8V7Pm+SAiov9hmScisjIvLy8UFxdb7fk6dOiA7du3Y/z48Vi3bh06deqEc+fOWe357cGadyuIiOh/WOaJiKzMmuPDr1y5gnnz5sHb2xszZ87EypUrUVBQgCVLlgAAdDodSktLrXIsW2KZJyKyDZZ5IiIr8/Pzw/nz563yXCKCzz//3DJMZcCAAahfvz7q169v+f8zZ87gq6++wqVLl/DVV1/h7NmzyMnJsWSo+MPiypUrluetuHNQ+Qp/xfCayp9nLYWFhZax+0REZD0s80REVhYaGorDhw9b7fn++OMPjBo1CosWLcLHH3+MZ599Fg8++CAAICYmBhEREXj66afRpUsX+Pn5oXPnzujQoQMWL16MLVu24KuvvgIAfPzxx/jjjz+wbt06fPbZZwCAd999F/v27cOWLVvwxRdfAACmTJli1d1ay8rKkJ+fj+bNm1vtOYmI6E8a4awkIiKrev3117Fy5Urs2rXLKs9nNBphNptx4sQJNG3atNrPOX36NBo0aAAAKC0trXaCq1L27t2Ltm3bYteuXWjfvr3ScYiInMlCXpknIqrB8ePHsWPHDpw8efKWHte6dWscPHgQJpPJKjl0Oh1cXV1rLPIALEUeqH6lGiUdOHAAWq0WLVu2vOnHlJWVIS8vD5s2bbLaeSQickbcNIqIqAZ//PEHunfvDuDPQt2gQQMEBwcjJCQEQUFBaNKkCQIDAy3/DQoKgoeHBzp16oTLly9j165d6NSpk8KvQnlbt25FeHg4PDw8APw5Tv/48eM4evQoCgoKkJ+fj+PHjyM/Px9HjhzBiRMnLGP5GzVqhIKCAiXjExE5NA6zISKqwZUrV+Dt7Y3y8vIq79doNJY12cvLy6usoe7t7Y1GjRohNzcXXbp0wbJlyyyTVWujzMxMPPjgg3Bzc4Ner8eJEydQVlZm+biLiwt0Oh3MZvM151mr1eKvf/2rZfdbIiK6BofZEBHVxM3NDZ07d77m/SKCsrIylJWVXbMZ0sWLF3Hw4EH4+vrC1dW1Vhd54M9dZwsLC1FSUoK8vLwqRR4ATCYTrly5ck2RB/4s8z169LBXVCIiVWKZJyK6jj59+sDV1fWmPlen08HLywvJycl4//33sW3bNqutN69WK1euhIuLC3bt2oXnn38eGo0GOt3NjfA0Go0s80REN8AyT0RUDbPZjF27duHcuXPXXE2+mlb754/SAQMGYP/+/YiLi0NMTAxMJpNlc6faat68eRg8eDBCQkIwffp0ZGZmIjQ0FC4uLjd8rKurK/bt24fjx4/bISkRkTpxzDwREf68CpyVlYXMzExs2LABGzdutGx0dO
HChWuG01TQ6/WoW7cuPv/8czz00ENVPjZ8+HAUFRVhzZo19ngJDufkyZMIDg7G/PnzMWLECMv7y8vL8fHHH+Ott96CiMBoNF7zWI1GA39/fxQVFcFoNKJFixbo0aMHevbsiZ49eyI0NNSeL4WIyFFxzDwR1U5GoxG//fYbpk6dir/85S+oX78+unbtig8//BAA8OqrryIrKwvnzp1D69atr3m8i4sLNBoNnnzySRw6dAgPPvggzp8/jyNHjmD37t3YvHkz2rVrh7Vr12LKlCk4c+aMvV+ion799VeMHz8ebm5uqFOnDn799VccOHAAx48fR2lpKRISErB3715ERERY7mxUptfrMXHiRBQVFSEzMxNjxoxBfn4+xo8fj+bNmyMwMBAPP/wwpk2bht9++63GP7aIiJwdr8wTUa1w6dIl7NixA5s2bcKaNWuwceNGlJaWIjAwEN27d0dkZCS6d++OTp06QaPRVHnsiBEjsGzZMst65xqNBq6urvD390dZWRlKSkpw+fLlao+r0WgQGRmJzMxMm79GR3L27FkEBQXhypUr1X5co9HA09MTXl5eMJvNOHPmDESkSilfvXo1Bg0aVOVxRqMRu3btsnwNK+6g+Pj44L777kNUVBQiIyNx//33W1YcIiJyYgtZ5onIKV28eBHbtm2zlL5ff/0VZWVlCA0NtRT3yMhItGnT5obPNXPmTEyYMOGWjq/VauHq6oq//e1vmDlzJg4ePIiQkJDbfTmq8+GHH+Kdd97BI488gtmzZ9/y4zUaDYqLi1GnTp3rfp7JZML+/fstf6StXbsWZ8+ehZeXFyIiIixf6x49esDNze12Xw4RkaNimSci53Dy5En88ssvllK3Y8cOmM1mhIaGWq7W9u7d+7q7qNYkPz8fTZs2hUajuanhHDqdDp6enli1ahW6dOmCli1bom/fvpgzZ87tvDTVKSwsROvWrTF69Gh88MEHmDZtGiZNmgQANz0cJjQ0FIcPH76t4+fk5Fj+iFu/fj3y8vJQp04ddOzY0fJHXK9eveDj43Nbz09E5EBY5olInQoKCrBu3TrLW3Z2NnQ6HTp27GiZJNm9e3f4+/tb5XhvvvkmpkyZcsPP0+v1aNiwIX766Se0atUKAJCWloZHHnkEP//8M3r16mWVPI5swoQJ+Pbbb3HgwAHUq1cPAPCf//wHTz31FETEMlypJn5+fpg3bx6io6Otkic7O9sysXn9+vXIzc2Fq6sr7rvvPvTu3Ru9e/dGt27dLDvUEhGpCMs8EanDyZMnq5T3/fv3Q6fToUuXLujduzd69eqFyMhIeHl52eT4IoLQ0FDk5ubWeHVZp9OhTZs2+OGHH9CwYcMqHxs6dCjy8vKwfft2px7LvX37dtx3332YPXs2nnjiiSofW7t2LYYNG4YrV65Uu4IN8OcfQy+99BL+8Y9/2CxjXl4eNmzYYPm3dPjwYbi5ueH+++9Hnz590Lt3b0RERMDd3d1mGYiIrIRlnogc0+nTp7F161bLsJnt27dDq9WiQ4cOlnHQ/fv3h5+fn82zHDhwAF999RVmzJiB0tLSaq8su7i4oEePHli+fHm1wzcOHjyIe+65By+++CLee+89m2dWwuXLl3H//ffDz88P69evv2YiMQDs3r0b/fv3x7lz56rd9VWj0aBp06Z49tln8cwzz9hlB92CggJs3LjRMjRn37590Ol0uOeeexAVFYWoqCh0796d5Z6IHBHLPBE5hsoTVivGvGs0GkXKO/DnWuhLly7Fp59+ivXr16NVq1Z44okn8Pe///2aFVq0Wi1GjRqFOXPmXPeq+yeffIIXX3wR0dHRSE1NtdtrsYd169YhPj4eR44cwfbt29GiRYsaP/fIkSOIiopCXl5elUKv0+kQGRmJtm3bYt68eSgvL8ejjz6K8ePHo3PnzvZ4GQCqlvuMjAz88ccf15R7TqglIgfBMk9EyqipvLdu3Rrdu3e3lKa6devaNVdBQQFSU1Mxc+ZMHDt2DH
379kV8fDyio6Oh0Wjw9NNP4z//+U+VEhofHw+DwVDtlej9+/fju+++w7Jly7B161aYTCb4+vpiz549CA4OtudLs6nU1FTLsBpvb28MGTIEw4YNw6BBg6qdt3Du3DkMHToUWVlZVYbcVCxHefHiRcyfPx+ffvopdu3ahc6dOyMuLg6PP/74DVe4sbbjx49b7hD9+OOPOHLkSJUJtSz3RKQglnkiso/i4mJs3bq1SnkHgLCwMEsh6tevn9UmrN6qjRs3Yvr06Vi2bBnq1q2Lp556Cn/729/QrFmzKp+XlZWFLl26QKPRQKPRYMaMGXj22WctHy8rK8OGDRvw3XffYenSpcjLy4Ner4fRaISIYNiwYTh06BA0Gg02bNig2Ou1pjVr1mDo0KEYN24cvv76a5w/fx5arRZmsxkajQZdunTBQw89hOjo6CpLgZaUlGDkyJFYvXo1TCYTmjRpgiNHjlyzidRvv/2GadOm4dtvv4WnpydGjx6NiRMn4q677rL3SwVQtdz/8MMPyM3NRZ06ddCtWzfLXaSePXvC1dVVkXxEVKsshBAR2UBRUZGsWLFCXnzxRenUqZO4uLiIVquVe+65R1544QVZtmyZnDt3TtGMFy5ckOTkZGnXrp0AkM6dO0tycrKUlJRc93EdOnQQV1dXWbp0qYiInD59WtLS0uTxxx8XT09PASB6vV4AWN60Wq00aNBAzpw5I0ePHpWQkBC5//77pbi42B4v1WZ27dolvr6+MmrUKDGZTLJ8+fIqrxuAaDQay/lo3LixxMXFSXp6upSWlorRaJSxY8cKAJk6dep1j3XixAlJSkqSkJAQ0Wq1EhUVJWlpaWI0Gu30aquXnZ0ts2bNklGjRklgYKAAEC8vLxk8eLBMnTpVtm3bpnhGInJaabwyT0RWUXlnzjVr1mDDhg2WTZoqhsz07dvXslShkvbv34/PPvsMc+bMgdFoRExMDCZPnowOHTrc1OMXL16M4uJiHD16FEuXLrVMztVoNDWu0qLRaLB69WoMHDgQwJ8TYrt3747w8HAsX74cvr6+Vnt99vLLL78gOjoa7du3x6pVqyxXoseMGYPU1NRqJ7gCgKurK8rKyuDu7o4BAwZg2LBhyMnJwaRJk25qwqvZbMbatWsxbdo0rFy5EqGhoRg7dqzdJszeSE5ODjZu3IhNmzZh9erVyM/Pt2xiVfG9UN1Ow0REt4HDbIjo9phMJuzYsQM//fQTfvrpJ2zatAklJSVo0aIF+vXrh379+qFPnz4OUa6AP4e/LF++HLNmzcJPP/2EFi1a4JlnnsHYsWNva6jLq6++iqlTp97URlI6nQ7jxo3DjBkzqrx/7969GDRoEPz8/PD9998jKCjolnMoJSMjA8OHD0fv3r3x7bffVhnHXlxcjDZt2uDYsWM3XFNeo9FAq9Vi48aNiIiIuOUcBw8exOzZs/HFF1/g0qVLGDZsGF544QVERkbe8nPZyr59+7B27Vr89NNPWLduHQoLCxEYGGj5PunXrx+aNGmidEwiUicOsyGim3f48GFJTk6WmJgY8ff3FwASEBAgMTExkpycLDk5OUpHvMbx48clKSlJgoODLUMz0tPTxWw239HzlpeXS+fOna8ZTnP1m06nk+bNm9c4dOePP/6Q1q1bS7NmzeSXX365o0z2YDab5eOPPxadTidPP/20lJeXV/t5mzdvFq1We91zA0BcXFxkypQpd5zr8uXLkpKSIvfcc0+VIVOXLl264+e2JqPRKFlZWZKUlCRRUVHi7u4uACQ0NFTi4uIkLS1Nzpw5o3RMIlKPNJZ5IqrRiRMnJC0tTeLi4iQkJEQAiKenp0RFRUlSUpJkZWXdcSm2lczMTImJiRGdTicNGzaUhIQEOXLkiFWPcejQIalTp45oNJrrltVt27Zd93lOnz4tAwYMEFdXVzEYDA57Ts+ePSvDhg0TnU4n//znP2+Y88033xQXF5caz41er5du3bpZfTx5VlaWxMXFibu7u/j6+k
p8fLwcPnzYqsewlvLy8irlXq/Xi1arlbvvvttS7ouKipSOSUSOi2WeiP7n4sWLkpGRIQkJCdK5c2fRaDSi0+mkc+fOkpCQIBkZGXLlyhWlY9aoqKhIkpOTpW3btlWuzl6+fNlmx5w7d+51i/z7779/U89jMplkypQpotPpZNCgQXLo0CGbZb4dS5YskSZNmkhwcLBs2LDhph5TXl4unTp1qvbuhVarFU9PT8nNzbVZZkedMHs9av8eJCK7Y5knqs1KSkokMzNT9VcFf//9d4mPjxcvLy9xd3eX2NhY2blzp82Pe+LECRk0aJDodDrR6XTXXHXu0qXLLRfHTZs2Sdu2bcXd3V0SExNvuLKOrR08eFCGDBkiGo1GYmNj5fTp07f0+N9//13c3Nyu+8eOrcu1yWSSjIwMiY6OFo1GI0FBQZKYmHjLr0UJar47RkR2wTJPVJs403jdK1euSFpamkRFRQkAadWqlSQlJcnZs2ftcvyMjAwJDAyUkJAQycjIkGbNmlUp9O7u7pKdnX1bz11eXi4Gg0F8fHykQYMGkpiYKJ999pndxn9/99138uOPP0p8fLy4ublJq1at5Mcff7zt55s+fXqV8fM6nU7GjRsnycnJ4uHhIb1795b8/HwrvoKaZWdnS0JCgvj7+4ubm5vExMRIRkaGXY5tDdXNW2nQoIFDz1shIptimSdydvv375fp06fLAw88IL6+vgJAAgMDJTY2VubOnSt5eXlKR7wlx44dqzKhNTo6WjIyMux2dbKsrEwSExNFq9VKTEyMnD9/XkREtmzZYimsGo1GZs2adcfHOnHihLz88svi7u4uGo1GXnnlFdm9e/cdP29NysrKZMWKFdK9e3fLH3lz586tcZLrzTKbzdKvXz/R6/Wi0+mkVatWljsOe/bskXbt2omvr6/Mnz/fGi/jplRMmO3QoYNDT5i9HqPRKL/++qskJSVJ//79xcPDQwBI8+bNJS4uThYuXKj4Xg5EZHMs80TO5vz587Jo0SKJi4uTZs2aCQDx9fIYLm0AACAASURBVPWVBx54QKZPny579+5VOuItM5vNkpGRIQ8++KC4uLhIYGCgvP3223L06FG75ti/f7907NhRvL29JTk5+ZqPT5kyRQDIkCFDrHbMhQsXilarFT8/P8vXs2PHjvKPf/xDtm7desdFu7CwUJYvXy7PPfecNGjQQDQajXTq1EkASFBQkNWumB87dkx8fHxEr9fLrl27qnyspKRE4uPjBYDExsbafSOtjRs3yqhRo8TV1VX8/f3lpZdectgJs9dTWloqa9eulTfeeEMiIiLExcVFXFxcJCIiQt5++23ZtGnTHf97ISKHw02jiNTOZDJh586dls2a1q9fD5PJhI4dO1o2qFHr1vIXLlxAamoqZs6cif3796NXr14YP348HnroIej1ertmSU1NxXPPPYewsDDMnz8fLVq0uOZzzGYzRo0aBYPBgEaNGt3xMdPT0/HXv/4VJpMJERER2Lx5MzIzMzFv3jysXLkSBQUF8PHxQWRkJO6++260atUKrVu3RsOGDeHl5WV5u3jxIgoLC1FcXIzc3FwcOHAABw4cwM6dO7F9+3aICNq3b4/hw4cjNjYWbm5uCAwMhEajQbNmzbBp0yYEBgbe8etZunQpjh07hgkTJtT48TFjxiAgIADz58+/6U28rOXkyZOYPXs2Pv/8cxw7dgxDhgzBhAkTMGDAAFVu8FRcXIytW7dixYoVSE9Px5EjR+Dp6YmuXbsiOjoaw4YNw1133aV0TCK6M1xnnkiNCgoKJCUlpcq42UaNGklMTIykpKTYbdy4rVSMa65bt65dJ7RWp7CwUB599FHRaDQSHx9vt5VEfvzxR8uEZI1GIw8//PA1n/P777/LzJkz5fHHH5d7771XvL29b7iuOwBp3Lix9O3bV55//nlZtGjRNfMkTCaTZUlJvV4voaGhUlBQYJfXnZeXJz179hR3d3fFlum8esJsixYt7Dofw1Yqj7
evGHJXeb5MYWGh0hGJ6NbxyjyRGpSUlGDz5s2Wq+/bt2+Hu7s7IiMjnWZ7eLPZjLVr12LatGlYuXIlQkNDMXbsWIwZMwb16tVTJNO2bdswatQoXLp0CXPnzsWgQYPsctyNGzeif//+KC8vh8lkgqurK8aPH49PPvnkho8tKCjA6dOnUVxcbHnz8fGBn58fvLy8EBQUBG9v7xs+T/369XH27FkAgF6vR4sWLZCZmWmXr4XZbMa///1vvPzyy+jTpw9SUlKscqfjdmRnZ2POnDlITk5GaWkpYmJi8NJLL6F9+/aK5LEWo9GIXbt2YcWKFfjuu++wY8cOaDQadOjQAVFRUYiOjka3bt2g1WqVjkpE18cr80SO6vDhw2IwGCQqKsqytF9oaKjEx8dLRkaGTddOt6fz58+LwWCQZs2aOcxa4EajUZKSkkSv18vAgQPtdlVa5M+dUz08PKpstuTm5iZJSUl2yyAicvfdd1+z1GabNm3sOqFy27Zt0rx5cwkICJBVq1bZ7bjVuXDhgiQnJ0ubNm0sE2ZTUlKkrKxM0VzWcvr0acsSmMHBwQJA6tevb1klR20T5YlqEU6AJXIUp06dsvwyDQoKumbJOXst3Wcvv/32m8TFxUmdOnXEx8dH4uLiZN++fUrHktzcXOnRo4ciwzy2b98u3t7e1+ya6uLiInPnzrVbDhGRgQMHVrtj67333isXLlywW46ioiJ57LHH7D7M6Xoq7y4cGBgoCQkJdp+MbWsVQ3Kio6OvWcI2PT3daS4mEDkBlnkipVRs456YmHjNTo+JiYmSlZUlJpNJ6ZhWdfXa8GFhYWIwGOy+eklNFi9eLP7+/hIeHm73Mfo7d+4UHx+fa4p8xdsPP/xg1zzPPPPMNRthVRT6+++/Xy5evGjXPCkpKeLl5SX33nvvba/fb21Hjx6VxMREqV+/vri6uqpuzfqbVVJScs2utB4eHty4isgxsMwT2VPlCWg+Pj7XTEBTw26rt+P48eOSlJQkQUFB4uLiYve14W/k6qUR7b3W+P79+8Xf37/a8lzxdvVyjrb2xhtv1Lhzq06nkz59+tj96mxOTo5ERETUuDSoUkpLSyUtLU26du0qAKRTp06qW7P+VlTelTYwMFAASMOGDS13EY8fP650RKLahGWeyJaKi4tl+fLlMm7cOGnatKkAkLp168qIESNk1qxZcuTIEaUj2lRWVpbExsaKXq+Xhg0bSkJCguTm5iodq4rdu3dL27Ztxc/PTxYsWGD342dnZ0v9+vWvW+QByKlTp+yaa8aMGaLX62vMo9PpJCoqSkpLS+2aq7y83LJp14gRIxxuU6SsrCyJi4sTd3d38fPzk/j4ePnjjz+UjmUzRqNRtm3bJu+99550795ddDqdaLVaue++++Sdd96RX375xenuMBI5GJZ5ImvLzs4Wg8EgAwYMEDc3N8svtsTERNmyZYuiEzvtoWJnzfbt21fZWdPRxtiazWYxGAzi5uYmffr0UWTM86FDhyQgIKDGoTWVx8zbuxAtWrTohktcajQaeeCBBxTZiGjNmjXSuHFjCQkJkczMTLsf/0ZOnDghSUlJ0rRpU8vE7vT0dIe5G2UrRUVFsnTpUhk3bpw0adJEAEhAQIA88cQTsmDBAsuOyURkNSzzRHeqvLxcMjMzJSEhwbICiKenp0RHR0tycrIcO3ZM6Yh2cfDgQUlISBB/f39xc3OTmJgY2bx5s9KxqnXy5EkZOnSo6HQ6SUxMVOwPrPXr10tERIRlLHpNpblBgwZ2z7Zp06brFnm9Xi86nU6eeuopOXHihN3zifw5aTw6Olrxr+P1GI1GSU9Pl6ioKNFoNNKyZUtJSkpyuDsKtlJ5VS5XV1dxcXGpMi/I2f+4IbIDlnmi23HixAnLpk2Vx75XLBvpCCtu2EPF5joxMTHi4uIiQUFBkpiYKKdPn1Y6Wo0yMjIkMDBQQkJCZOPGjUrHEZE/h2aMGjVKtFpttaW+Xbt2ds90+PDhGu8SeH
h4SHx8vEOssGQ2myU5OVk8PDykd+/eDpGpJr///rvEx8eLp6eneHt7S1xcnOzevVvpWHZTXFws6enpEhcXJ40bN7aMtY+NjXXqOUNENsYyT3QzjEbjNSvP1KlTR6KiosRgMDjcOHBbKywslOTkZAkLCxMAEhkZKWlpaYoMt7hZZWVllrHWMTExDnm7f9q0aZYr3pVL/eDBg+2epaSkpMpwGq1WK97e3uLh4eGQyzDu2bNH2rVrJ76+vjJ//nyl41xXUVGRJCcnS3h4uGq+f2xhz549kpSUJJGRkaLVakWn00lkZKRlhRwiuiks80Q1qdhEJTY2VurWrXvNOsv2nvjnCHbs2CFxcXFVrizu2bNH6Vg3tH//funYsaN4e3tLamqq0nFqFBERIcOHD5djx47JK6+8Ip6engJAxowZo0ieiuO3aNFCZs+eLadOnRJfX1/54IMPFMlzI1evSuQoS57W5Oo7W40bN3b4O1u2cqOft44254bIgbDME1UwmUySlZVV45WivXv3Kh1REWVlZVXWhm/VqpUkJSU55JXt6lSsT96lSxc5ePCg0nFqtHbtWgFQZZ5BUVGRTJ06VWbMmKFIpieffFKWL19eZfLt5MmTJSgoyKGHki1ZskT8/f0lLCxMduzYoXScm3Lo0KFr5pxs2rRJ6ViKqO5OaMW69rXxTijRDbDMU+128eJFjuGsQUFBgSQlJUmTJk1UuRpHYWGhPProow61c+j1DB48WHr16qV0jBvKz88XV1dX+eqrr5SOcl15eXnSs2dPcXNzk6SkJNUsj1ixGtQ999xTZTWokpISpaMphnOUiK6LZZ5qn8qrK+j1eq6ucJXNmzfLI488Inq9Xho0aCCvvfaa5OXlKR3rlmzZskVCQ0OlYcOGsnr1aqXj3NB///tf0Wg0snLlSqWj3JTRo0dLWFiYwxdkk8kkBoNB9Hq9DBgwQAoKCpSOdEvWrVsnMTExotPppGHDhvL222/X+g2ZbrR6mCPO5yCyMZZ5cn6XLl2SZcuWydixYyU4OJjrHlejtLRUUlNTpUuXLpargSkpKaqbF2A0GiUpKUn0er088MADqhl7/Pjjj0vbtm1V84fkvn37RKvVSnp6utJRbsq2bdukefPmEhAQIKtWrVI6zi07evSovPHGG9KgQQNxdXWVUaNGydatW5WO5RCys7Plk08+kf79+1v29ejSpYskJibKr7/+qprvKaI7wDJPzunYsWOSnJws0dHR4uHhwR0Ja1CxsU1wcLC4uLhIdHS0ZGRkKB3rtuTm5kqPHj3E3d1dDAaDan6J5+fni16vl5SUFKWj3JKhQ4dKjx49lI5x04qKiuSxxx5TzbCr6ly5ckXS0tKka9euDr0hm1KKi4tl2bJlEhcXZ7lw07hxY4mLi5PvvvuuVg9VIqfGMk/O4+plzipPmOKt16oqbznfoEEDSUhIUPWkssWLF4u/v7+Eh4fLzp07lY5zSyZOnCjBwcGqK5fr1q0TAKqbpFkxIbpz586SnZ2tdJzblpWVJbGxsaLX66Vhw4aSkJDg0GvsK6Hy74SrJ9HWls38qFZgmSf1qjx2slWrVpadMismr168eFHpiA6l4qpeZGSkAJCOHTtKcnKyXLp0Selot+3qpQjV9lrOnTsnXl5e8tFHHykd5bZ07dpVHnroIaVj3LKcnByJiIgQDw8PMRgMSse5I8ePH5fExETLEJyYmBiH2QzNkZw6dcoyidbLy0u0Wi3nSpGzYJkndTl79qxlLWJfX18BIHfffbckJCRIZmYmh89U4+TJk5ahNFqtVtVDaSrbvXu3tG3bVurXry/Lly9XOs5tee+998TX11cKCwuVjnJbFi1aJBqNRpXLtpaXl1s2ERsxYoScO3dO6Uh3pLS0VNLS0iQiIoJDcG6gpKREMjIyJD4+3rKKWbNmzSxr2qvtLhnVeizz5Piys7PlX//6l/To0UNcXFzEzc1NBgwYIDNmzFD10BBb++233yQuLk48PDzEz8
9P4uPj5ciRI0rHumNms1kMBoO4ublJnz59VDuEqrS0VBo1aiSvv/660lFum8lkkvDwcMU2tbKGNWvWSOPGjSUkJEQyMzOVjmMVFUNwKlbBSUhIUO33ia2ZTCbZsmWLvPbaa9K2bVsBIH5+fvLII4/IN998o9o/tKlWYZknx2M2m+XXX3+V119/Xdq0aSMApF69ejJ69GhZuHChXLhwQemIDuvqDZ46dOig+qE0lZ08eVKGDBkiOp1OEhMTxWg0Kh3ptn322Wfi5uam+qUGP//8c3Fzc1P1GORTp05JdHS0U/y7qqxiCE79+vUtQ3DUNsfB3iqWLu7Xr5/o9XpxdXWVAQMGyGeffabqf+Pk1FjmyTEYjUbJzMyU+Ph4adKkiQCQpk2bWm57lpWVKR3RoVUMpanY4KliKI0zjQPNyMiQwMBACQkJUf2YYKPRKC1btpRx48YpHeWOlZaWSmBgoLz66qtKR7kjZrNZkpOTxcPDQ7p27So5OTlKR7Ka0tLSazaiSklJ4c/VGzh37pxlWGfFZlV33323ZZw9kYNgmSfllJSUWHZfDQgIuGb8uzMVUVupPJTG19fXaYbSVFZWVmYZ2xwTE+MU+wKkpaWJVquVffv2KR3FKqZMmSI+Pj5OMSRhz5490q5dO/H19ZX58+crHcfqKg/BadSokSQkJPCK8024fPmyZZx9YGBglV1oOV+LFMYyT/Z15syZGlcU2L9/v9LxVMFoNEp6erplKE3r1q3FYDA4zVCayvbv3y8dO3YUb29vSU1NVTqO1URERMjw4cOVjmE1hYWF4uvrKx988IHSUazi6lWSiouLlY5kdceOHZPExESpV6+eZQjOli1blI6lCiaTSbKysiQxMVFat25dZSW19PR01W22R6rHMk+2l5OTIx999JH07NlTXFxcxN3dXaKjo+XLL7+UU6dOKR1PNU6dOiVJSUnStGlTpx1KU1nFeuBdunSRgwcPKh3HatauXSsAZPPmzUpHsarJkydLUFCQU60EsmTJEvH395ewsDDZsWOH0nFsomIITvv27TkE5zbt2rVL3n33XenYsaMAEF9fX3n00UdlwYIFnONF9sAyT7aRk5MjBoPBslmHn5+fxMTESEpKihQVFSkdT1W2b99+zVCaP/74Q+lYNlNYWCiPPvqoqnfqvJ5BgwZJr169lI5hdfn5+eLq6ipz5sxROopV5eXlSc+ePcXNzU2SkpKcejhFZmamxMTEiE6nk8DAQElMTOQFl1uUm5tr2X1cr9dbLl4lJyfLyZMnlY5Hzollnqxn586d8tZbb1lWoAkICJC4uDj54YcfeJXnFplMpipDaVq1aiUGg8Epb/dXtmXLFgkNDZWGDRvK6tWrlY5jdf/9739Fo9HIypUrlY5iE6NHj5awsDCnK7wmk0kMBoPo9XoZMGCAFBQUKB3JpnJyciQhIUHq1asnbm5uHIJzm06fPi1ffvmlDBkyRFxdXUWv18vAgQNZ7MnaWObpzuzZs0cSExMlLCxMAEhwcLBlBZry8nKl46nO+fPnxWAwWIbSREVFSXp6utMOpalgNBolKSlJ9Hq9PPDAA3LmzBmlI9nE448/LuHh4U779dy3b59otVpJT09XOopNbNu2TZo3by4BAQGyatUqpePY3OXLlyUlJUXatWtXZQgOf7bfukuXLkl6errExsZa5otFRkaKwWDgHgB0p1jm6daYTCbJzMyUhIQEadGihQCQkJAQy4x+Zy0ptrZjxw6Ji4uTOnXqiI+Pj8THxzvV0njXk5ubKz169BB3d3cxGAxO+28oPz9f9Hq9pKSkKB3FpoYOHSo9evRQOobNFBUVyWOPPea0w8BqUjEEx8XFxTIE5/Tp00rHUqWKldyu3sk8MTFRDhw4oHQ8Uh+WebqxymvAV2x9XXlJLmctX7ZWXl4uCxculJ49e1p+mH/22WdOP5SmssWLF4u/v7+Eh4fLzp07lY5jUxMnTpTg4GCnL3/r1q0TAE6/OVHFBO3OnT
tLdna20nHs5tChQzJp0iTx9fUVDw8PGTNmjOzatUvpWKpVWlpqWfKy8hLNiYmJsnfvXqXjkTqwzFP1ysvL5YcffpAxY8ZIvXr1BIB07NhR3nvvPadZG1spRUVFYjAYJCQkpFYNpans6qX/nHFZzcrOnTsnXl5e8tFHHykdxS66du0qDz74oNIxbC4nJ0e6du0qHh4eYjAYlI5jVxcvXpTk5GRp27atAJDIyEhJS0tzmt1zlVBWViY//vijjBs3Tho2bCgApE2bNvLuu+/K77//rnQ8clws8/Q/RqNR1qxZI+PGjZP69esLALn33nvlX//6lxw+fFjpeKp38OBBef7558Xb21u8vb0lPj5eDh06pHQsu9u9e7e0bdtW6tevL8uXL1c6jl2899574uvr6xSbKt2MRYsWiUajqRVXFsvLyy2bmo0YMULOnTundCS7MpvN8v3338vAgQNFo9FIy5YtZcaMGbXqDqMtGI1GWb9+vUyYMMGySVX79u3l/fffr1V3guimsMzXdhVj4Cvvasexe9ZVeaxp48aNJTExUc6ePat0LLszm81iMBjEzc1N+vTpU2smfV2+fFkaNWokr7/+utJR7MZkMkl4eLiMGTNG6Sh2s2bNGmncuLGEhIRIZmam0nEUkZ2dLfHx8VXm/jjbjtRK4O9pugGW+dqo8g+GijHwFT8YuAurdVy5ckXS0tIkIiKCq0CIyMmTJ2XIkCGi0+kkMTGxVt2K//TTT8XNzU2OHz+udBS7Sk5OFjc3Nzl27JjSUezm1KlTEh0dXSv/nVdWWFgoBoNBmjRpYtngbuPGjUrHcgqVf383atSoyu9vXrGvtVjma5M9e/ZIQkKCBAUFVfkBwDHw1lPxSyw4ONjyS8zZJwLeSEZGhgQGBkqzZs1q3S90o9EoLVu2lHHjxikdxe5KS0slMDBQXn31VaWj2JXZbJbk5GSpU6eOdO3atdasSlUdXtSwrcqLU1SMsa/4ve5Mu2bTDbHMO7tt27bJpEmTpEmTJpwlb0PV3V7Ozc1VOpaiysrKLGOJY2Ji5Pz580pHsru0tDTRarW19g/mKVOmiI+PT62ZK1DZnj17pF27duLr6yvz589XOo7isrKyJDY2tsrusrVxuKGtlJeXy/fffy9PP/20+Pv7i0ajka5du8onn3xSq+6O1VIs885o37598tZbb1nWgW/durW89dZbsnv3bqWjOZ3MzEyJjo4WjUYjzZs3rxW7tN6M33//XTp27Cje3t6SmpqqdBzFREREyPDhw5WOoZjCwkLx9fWVDz74QOkoirh61Sb+bBA5fPiwJCQkiJ+fn3h5eUlcXBxXarGysrIyWblypTz55JPi5+cnWq1W+vTpI1988UWtm6BdS7DMO4v8/HwxGAwSGRkpACQoKMiyDjxZV2lpqaSkpHBJthqkpKSIp6endOnSpVbf6l27dq0AkM2bNysdRVGTJ0+WoKAgp19f/3qWLFki/v7+EhYWJjt27FA6jkO4cOGCGAwGadasWa1dotceSktLLRtUeXp6iouLi0RFRUlKSopcvHhR6XhkHSzzanbu3DlJSUmRqKgo0Wq1UrduXYmNjZX09HQWSxs4ceKEJCYmSv369cXV1VViY2O5WUolhYWF8uijj9a6nTFrMmjQIOnVq5fSMRSXn58vrq6uMmfOHKWjKCovL0969uwpbm5ukpSUJCaTSelIDsFkMkl6erpERUUJAOnQoYMkJyfL5cuXlY7mdEpKSiQtLU2io6NFr9eLh4eHREdHS1paWq3/ea1yLPNqU/mb0dXVld+MdrBjxw6Ji4sTd3d3CQgIkISEhFqzrOLN2rJli4SGhkrDhg1l9erVSsdR3H//+1/RaDSycuVKpaM4hNGjR0tYWFitL7Amk0kMBoPo9Xrp37+/FBQUKB3Jofz2228SGxsrer1eGjZsKImJiXL69GmlYzmls2fPWi4GajQay8XAjIyMWv99qkIs82pQ+TaZl5dXldtkFy5cUDqeU7
r6alGrVq3EYDBISUmJ0tEcitFolKSkJNHr9fLAAw/ImTNnlI7kEB5//HEJDw/nkIH/t2/fPtFqtZKenq50FIewbds2ad68uQQEBMiqVauUjuNwCgoKJDExUerVqydubm4SGxvLOV82lJeXV2WYbnBwMIfpqgvLvKMym82yYcMGGTt2rNStW1e0Wq306tVLkpOTuQKADVVsUR4WFiYajYbjOK8jNzdXevToIe7u7mIwGHiO/l9+fr7o9XpJSUlROopDGTp0qPTo0UPpGA6jqKhIHnvsMQ5Lu47Lly9LSkqK3H333Zb5Sfx5bFu7d++W119/Xe666y4BIOHh4TJlypRavzqbg2OZdzQHDx6Ut99+2/KN1L59e/nwww8lPz9f6WhO7dixY5KYmCj+/v7i7u4usbGxXL7zOhYtWiT+/v4SHh4uO3fuVDqOQ5k4caIEBweznF1l3bp1AqDW77twtZSUFPHy8pLOnTtzN88amM1mycjIsKwc1rJlSzEYDHLp0iWlozkts9ksmzZtkgkTJkj9+vVFq9VK7969Zc6cOVJUVKR0PKqKZd4RnD9/vsrYtcDAQImPj5esrCylozm9ymsfN2rUSBITEzlU5DquXmqPv0yrOnv2rHh5eclHH32kdBSH1LVrV3nwwQeVjuFwcnJypGvXruLh4SEGg0HpOA7twIEDEh8fLx4eHuLr6yvx8fGSl5endCynZjQaJSMjQ2JjY6VOnTri7u4uMTExkp6eLmVlZUrHI5Z55fCbQzkV4+G7desmAKRTp06SkpLC834Du3fvlrZt20r9+vVl+fLlSsdxSH//+9/F19e3Vm6SdDMWLVokGo2Gd72qUV5ebtlkbcSIEVwP/AZOnjwpSUlJEhQUJHq9XmJiYmTLli1Kx3J6V198rFevnsTFxXF8vbJY5u1tz549kpCQIAEBAaLVaiUyMlKSk5N528oOioqKxGAwSNOmTUWr1Up0dLRkZGQoHcvhmc1mMRgM4ubmJn369OFKPjW4fPmyNGrUSF5//XWlozgsk8kk4eHhMmbMGKWjOKw1a9ZI48aNpWnTpixIN+HKlSuSlpYm9913nwCQzp07S0pKipSXlysdzenl5uZKUlKSZYPKih3mjxw5onS02iZNIyKCq5jN5qvfRVfRarU3/bnHjh3DokWL8NVXX2HXrl1o3bo1HnnkEYwePRqhoaHXfD7P/43dyvk/fPgwpk+fjtmzZ0Or1eKpp57CpEmT0KxZs2o/n+f/f06ePIknn3wSP//8M9599128/PLL0Gq1t3T+b5Vaz39ycjImT56Mw4cPo1GjRjY9lprP/xdffIEXXngBhw8fRmBgoE2PZSu2Pv+nTp3C008/jYyMDCQmJiIhIQEuLi42O6ba1HT+N27ciOnTp2PJkiVo2rQpxo0bh7i4ONStW/emn1utP3/sqbrz/9tvvyE1NRXffPMNzp07h65du2L06NEYNWoUvLy8bvq5ef5vrJrzv/CaK/MLFiwQAHy7wduNlJaWyrfffiv9+/cXrVYr9evXlwkTJsi2bduu+zief+ucfxGRDRs2yLBhw0Sr1UpoaKgYDIYbLuXJ82+98387eP55/tXwxvPv2Of/wIEDMn78ePH09BRfX195+eWXb+qOIs//nZ//y5cvV9mYysfHR+Li4uSXX37h+bfd+U/ToQYLFy6s6UO12pYtW/Dxxx/X+PF9+/bhyy+/xLx583D+/HkMHjwYixcvxpAhQ+Dq6nrTx+H5r96Nzr/ZbMbSpUvx4YcfYuvWrejWrRsWLVqEBx544JaupvH8V+9G599aeP6rx/OvLJ5/Zd3s+W/VqhVmzpyJ999/H8nJyZg+fTqmTZuGRx99FC+++CLatWt33cfz/FfvZs6/u7s7YmJiEBMTg1OnTuHrr7/Gl19+iVmzZuGee+7BmDFj8Nhjj133bgnPf/Wud/5rLPMjRoywWSA1q+4W0JUrV7Bwi/xR3QAAFuZJREFU4UJ89tln2Lx5M0JDQz
Fx4kQ8+eSTCAoKuq3j8PxXr6ZbcFeuXMGCBQvwz3/+E9nZ2RgyZAgyMjIQFRV1W8fh+a+evW6B8vxXj+dfWTz/yrrV81+3bl28+uqrmDx5Mr799lv861//Qvv27REZGYmEhARER0dDo9Fc8zie/+rd6vkPCAjApEmTMGnSJGzZsgVffvklXnvtNbz88suIiYnBc889h/vvv/+ax/H8V+965992A/9qgaNHj+LNN99E06ZN8fTTTyMoKAhr1qzBwYMH8cYbb9x2kaebd+bMGUydOhV33XUX4uLi0KVLF+zZswcrVqy47SJPRETOw9XVFaNHj8aePXuQmZmJunXrYtiwYejYsSNSU1NRXl6udESn17VrV8yePRvHjx/H9OnTsWfPHkRERKBLly6YO3cuSktLlY6oaizzt2nEiBG46667MHv2bIwbNw5HjhxBWloa+vXrZ9PJUfSnnJwcvPDCCwgJCcE///lPxMTEICcnB6mpqQgPD1c6HhEROaDu3btjxYoV2L59O9q3b49nnnkGLVu2xNSpU3Hp0iWl4zk9b29vjB07Ftu3b8emTZvQsmVLjBs3Dk2aNMHXX3+tdDzVqnGYDV1fQUEBUlJSMGLEiFsaC0/W0bJlSzRr1gxTp07FU089BU9PT6UjERGRSlRclX/nnXfwySef4L333lM6Uq3TrVs3dOvWDSdOnMCsWbOQlpamdCTV4iXk27Rp0yaMGjWKRV4h3377LbKzszFhwgQWeSIiui2hoaH497//jdzcXERHRysdp1Zq1KgR3n77bbz99ttKR1EtlnlSpZiYGK67TEREVlGvXj389a9/VToG0W1hmSciIiIiUimWeSIiIiIilWKZJyIiIiJSKZZ5IiIiIiKVYpknIiIiIlIplnkiIiIiIpVimSciIiIiUimWeSIiIiIilWKZJyIiIiJSKZZ5IiIiIiKVYpknIiIiIlIplnkiIiIiIpVimSciIiIiUimWeSIiIiIilWKZJyIiIiJSKZZ5IiIiIiKVYpknIiIiIlIplnkiIiIiIpVimSciIiIiUimWeSIiIiIilWKZJyIiIiJSKZZ5IiIiIiKVYpknIiIiIlIplnkiIiIiIpVimSciIiIiUimWeSIiIiIilWKZJyIiIiJSKZZ5IiIiIiKVYpknIiIiIlIplnkiIiIiIpVimSciIiIiUimWeSIiIiIilWKZJyIiIiJSKZZ5IiIiIiKVYpknIiIiIlIplnkiIiIiIpVimSciIiIiUimWeSIiIiIilWKZJyIiIiJSKV1NHzCbzfbMcV1r165F3759lY4BABARuxyH5796PP/K4vlXFs+/snj+lcXzryyef2Vd7/zXWOZd/q+9O42Nqu7iOH7a0rEwpbRoA0rUWCI0CmLdICTI4o64ocQlEZeoUTQYY1wTE2Oi0RhF4xoT48LqVCTuQkpFa6KAiEVRCta1KpEaCp2WLnTO88LQ2KctMtM7/c2l38/LW+Z/T7+8OWVuh5yctAyDA0N/Lfpr0V+L/lr016K/Fv2T122Znzx5spWXlytm6VF5ebnFYjF76qmnbNSoUepx0o7+WvTXor8W/bXor0V/Lfr3gWe40aNHu5n5Aw88oB5lQKK/Fv216K9Ffy36a9FfK0T9Y1nu/fQQVAq++uorO/nkk83M7KijjrJffvlFPNHAQn8t+mvRX4v+WvTXor9WyPqXZ/Sn2SxZssQikYiZmf3666+2bt068UQDC/216K9Ffy36a9Ffi/5aYeufsct8IpGwRYsWWVtbm5mZRSIRW7p0qXiqgYP+WvTXor8W/bXor0V/rTD2z9jHbD755BObNm1al2uHHXaYbd++nd907gf016K/Fv216K9Ffy36a4Wwf+Y+ZrN06dLOtzj2qa+vtzVr1mgGGmDor0V/Lfpr0V+L/lr01wpj/4xc5tvb223p0qWdb3Hsk5uba0uWLBFNNXDQX4v+WvTXor8W/bXorxXW/hm5zK9cudJ2797d7Xp7e7vFYjFrbW
0VTDVw0F+L/lr016K/Fv216K8V1v4ZucwvWbLEcnNze/xaU1OTffTRR/080cBCfy36a9Ffi/5a9Neiv1ZY+2fcMt/c3GwrVqyw9vb2Hr+ek5Njixcv7uepBg76a9Ffi/5a9Neivxb9tcLcP+OW+bfffnu/b2Ps3bvX3n777R7fBkHf0V+L/lr016K/Fv216K8V5v4Zt8wvXrzYsrKy9vtn2tra7J133umniQYW+mvRX4v+WvTXor8W/bXC3D+jlvmdO3faqlWrLJFI/OefzfQP8A8j+mvRX4v+WvTXor8W/bXC3n+QeoB/Gzx4sG3ZsqXLtffff9/mz59vtbW1Xa5n6Af3hxr9teivRX8t+mvRX4v+WmHvn1HLfF5enpWUlHS5NmLECDOzbtcRPPpr0V+L/lr016K/Fv21wt4/ox6zAQAAAHDgWOYBAACAkGKZBwAAAEKKZR4AAAAIKZZ5AAAAIKRY5gEAAICQYpkHAAAAQoplHgAAAAgplnkAAAAgpFjmAQAAgJBimQcAAABCimUeAAAACCmWeQAAACCkWOYBAACAkGKZBwAAAEKKZR4AAAAIKZZ5AAAAIKRY5gEAAICQYpkHAAAAQoplHgAAAAgplnkAAAAgpFjmAQAAgJBimQcAAABCimUeAAAACCmWeQAAACCkWOYBAACAkGKZBwAAAEKKZR4AAAAIKZZ5AAAAIKRY5gEAAICQYpkHAAAAQmqQeoB9/vjjD6uqqrLNmzfb1q1b7aeffrKGhgbbuXOnRSIRKykpsWg0asOHD7exY8famDFj7MQTT7TJkyfbkCFD1OOHHv216K9Ffy36a9Ffi/5aB0P/LHd31c3XrVtnixcvtpUrV1pNTY3l5uba6NGjrbS01EpKSqyoqMii0ahFo1FraGiwpqYm27Fjh9XU1NjWrVutrq7OIpGITZw40S6++GK76qqrbOTIkapvJ3Tor0V/Lfpr0V+L/lr01zrI+peb97Pm5mZ/5plnvLS01M3MS0tL/b777vNVq1Z5U1NTUmf9/vvvvnDhQr/22mu9sLDQc3JyfObMmb569eo0TR9+9Neivxb9teivRX8t+msdxP1j/bbMt7a2+uOPP+4jRozwwYMH+y233OJr164N7Pw9e/Z4LBbzGTNmuJn5pEmTvKKiIrDzw47+WvTXor8W/bXor0V/rQHQv3+W+YqKCi8tLfUhQ4b4Pffc49u3b0/r/T7//HOfOXOmm5lffvnl/vvvv6f1fpmO/lr016K/Fv216K9Ff60B0j+9y3xLS4vPmzfPzcwvuugi//nnn9N5u27ee+89Lykp8cLCQn/zzTf79d6ZgP5a9Neivxb9teivRX+tAdY/fct8bW2tl5WV+bBhwzwWi6XrNv+pubnZb775Zjczv+2227ytrU02S3+ivxb9teivRX8t+mvRX2sA9k/PMr9x40YfOXKkl5WV+Q8//JCOWyTtjTfe8Pz8fD/vvPM8Ho+rx0kr+mvRX4v+WvTXor8W/bUGaP/gl/mqqiofNmyYn3nmmd7Y2Bj08X2ybt06Ly4u9kmTJnlDQ4N6nLSgvxb9teivRX8t+mvRX2sA9w92mf/666992LBhPnv2bG9tbQ3y6MBs2bLFR40a5VOnTvU9e/aoxwkU/bXor0V/Lfpr0V+L/loDvH9wy/xPP/3kI0eO9DPOOMNbWlqCOjYtNm3a5EVFRX7ppZd6IpFQjxMI+mvRX4v+WvTXor8W/bXoH9Ay39ra6qeeeqpPmDDBd+/eHcSRaffpp596bm6uP/roo+pR+oz+WvTXor8W/bXor0V/Lfq7e1DL/O233+5Dhw71mpqaII7rN0888YQPGjTIP/vsM/UofUJ/Lfpr0V+L/lr016K/Fv3dPYhlvqqqyrOysnzRokVBDNSvEomEz5o1y4899tiMf2umN/TXor8W/bXor0V/Lfpr0b9T35b59vZ2nzBhgp999tl9HUTmt99+8/z8fH/ooYfUoySN/lr016K/Fv216K9Ffy
36d9G3Zf7ZZ5/1vLw837ZtW18HkXrsscd88ODBXldXpx4lKfTXor8W/bXor0V/Lfpr0b+L1Jf5trY2P/roo33+/Pl9GSAjtLS0+KhRo/yOO+5Qj3LA6K9Ffy36a9Ffi/5a9NeifzepL/Mvv/yyRyIR//XXX/syQMZ46qmnfMiQIf7XX3+pRzkg9Neivxb9teivRX8t+mvRv5vUl/lTTjnF586dm+rLM05TU5MXFRX5448/rh7lgNBfi/5a9Neivxb9teivRf9uUlvmv/vuOzczr6ysTPXGGemmm27y448/Xj3Gf6K/Fv216K9Ffy36a9Ffi/49imVbChYvXmxHHXWUTZ06NZWXZ6y5c+fa5s2b7ZtvvlGPsl/016K/Fv216K9Ffy36a9G/Zykt8ytXrrQLLrjAsrNTenmnH3/80a6//nqrq6vr0zlBmTx5shUXF9uqVavUo+wX/bXor0V/Lfpr0V+L/lr071nSNXbt2mUbN2606dOnp3TDf/vqq6/slVdeyZifBLOysmzatGn28ccfq0fpFf216K9Ffy36a9Ffi/5a9N+PZB/M+fDDD93MfMeOHak+29NFUOcE5bnnnvNhw4Z5IpFQj9Ij+mvRX4v+WvTXor8W/bXo36vkn5n/7rvv7PDDD7fDDjsstZ8e/k9Q5wRl3LhxtmvXLvvzzz/Vo/SI/lr016K/Fv216K9Ffy369y7pZb6mpsbGjh2b9I16kkgk7OOPP7b169d3Xvvtt9/s6aeftkQiYd9++609/PDDtnDhQkskEl1eG4/HbdGiRfbAAw9YLBazXbt2BTLTvu+tpqYmkPOCRn8t+mvRX4v+WvTXor8W/fcj2X/LP+ecc/z6669P9mXdbN682S+77DI3M3/hhRfc3f2dd97x4uJiNzNfsGCBX3fddT5r1iw3M3/kkUc6X/v999/7zJkzvbq62tvb2/3KK6/0Qw891Gtra/s8l7t7NBr1l19+OZCzgkZ/Lfpr0V+L/lr016K/Fv17lfznzE+aNCmw//Z306ZNXWK6u997771uZl5RUdF57aSTTvKTTz7Z3d337t3rJ554or/00kudX9+wYYNHIhF/9913A5nriCOO8AULFgRyVtDor0V/Lfpr0V+L/lr016J/r2KDkv2X/Hg8bkOHDk32ZT065JBDul0bPHiwmZmVlpZ2XjvuuONs5cqVZmb2wQcf2Ndff23nn39+59dPOukka2xstEgkEshcQ4cOtcbGxkDOChr9teivRX8t+mvRX4v+WvTvXdLPzLe3t1tubm7SN+qLnJwcc3czM6uurrZoNGrFxcVd/kxQIc3++Utua2sL7Lwg0V+L/lr016K/Fv216K9F/94lvcxHo1GLx+NJ3ygoiUTCmpqa0vpZqI2NjZafn5+28/uC/lr016K/Fv216K9Ffy369y7pZX7o0KHSmOPHjzczsyVLlnS5/vfff9uKFSsCucfu3butoKAgkLOCRn8t+mvRX4v+WvTXor8W/XuX9DPzxcXFtn379qRv1JPW1lYzM6uvr++8tnv3bjOzLm8z1NfXW2trq7m7XXjhhVZWVmavvfaa5eXl2Zw5c2zTpk22Zs0ai8VigczU0NCQcZ8/ug/9teivRX8t+mvRX4v+WvTfj2R/Zfb+++/38ePHJ/uybr744ovOjwYaN26cv/fee75mzRovKSlxM/MbbrjB//zzT1+6dKkXFBS4mfmDDz7o7e3tXldX52eddZZnZWV5VlaWT5s2zevq6vo8k7v7t99+62bm1dXVgZwXNPpr0V+L/lr016K/Fv216N+r5D+a8rXXXvO8vDzfu3dvsi8N3M6dO/3vv/8O9Mzly5d7dna2Nzc3B3puUOivRX8t+mvRX4v+WvTXon+vYkk/M19WVmYtLS1WXV2d/NsAASssLLThw4cHeubatWuttLS08yOKMg39teivRX8t+mvRX4v+WvTvXdLL/Lhx42zEiBFWWVmZ9M3CoLKy0mbMmKEeo1f016K/Fv216K
9Ffy36a9G/d0kv81lZWTZ16lSrqKhI6YaZrL6+3jZu3GjTp09Xj9Ir+mvRX4v+WvTXor8W/bXo37ukl3kzs4svvthWr14d2G8VZ4pYLGZ5eXl29tlnq0fZL/pr0V+L/lr016K/Fv216N+zlJb5Sy65xPLz823ZsmUp3TRTLVy40GbPnp2x/2HCPvTXor8W/bXor0V/Lfpr0b8Xqf7W7Y033uilpaXe0dGR6hEZZcOGDW5mXlFRoR7lgNBfi/5a9Neivxb9teivRf9ukv9oyn22bNni2dnZXl5enuoRGWX27NleVlbmiURCPcoBob8W/bXor0V/Lfpr0V+L/t2kvsy7u8+ZM8cnTJgQ+p+OqqurPTs729966y31KEmhvxb9teivRX8t+mvRX4v+XfRtmd+8ebPn5ub6888/35djpBKJhE+ZMsUnTpwYmp9K96G/Fv216K9Ffy36a9Ffi/5d9G2Zd3e/++67vaioyLdv397XoyReffVVz8nJ8Q0bNqhHSQn9teivRX8t+mvRX4v+WvTv1PdlPh6P+zHHHOPnnntu6N7uqK2t9cLCQp8/f756lJTRX4v+WvTXor8W/bXor0X/Tn1f5t3d161b55FIxB999NEgjusXbW1tPmnSJB8/frw3Nzerx+kT+mvRX4v+WvTXor8W/bXo7+5BLfPu7gsWLPCcnJxQ/BJFIpHwq666ygsKCnzr1q3qcQJBfy36a9Ffi/5a9Neivxb9A1zm3d3nzZvneXl5vmbNmiCPDdwdd9zhkUgkNJ+peqDor0V/Lfpr0V+L/lr01xrg/YNd5js6OnzOnDmen5/vH330UZBHB6Kjo8Nvv/12z87O9mXLlqnHCRz9teivRX8t+mvRX4v+WgO8f7DLvPs/zwJdffXVHolEfOHChUEfn7I9e/b4FVdc4YcccojHYjH1OGlDfy36a9Ffi/5a9Neiv9YA7h/8Mu/+zzNBd911l5uZ33rrrd7S0pKO2xywrVu3ellZmRcWFvrq1auls/QH+mvRX4v+WvTXor8W/bUGaP/0LPOdp8diXlBQ4GVlZf7ll1+m81Y96ujo8BdffNELCgr8lFNO8dra2n6fQYn+WvTXor8W/bXor0V/rQHWP73LvLv7tm3b/PTTT/ecnByfN2+e79ixI923dHf3tWvX+mmnnea5ubl+1113yX86U6G/Fv216K9Ffy36a9FfawD1T/8y7/7P2x6vv/66jxgxwqPRqN95553+xx9/pOVeVVVVfu6557qZ+ZQpU/ybb75Jy33ChP5a9Neivxb9teivRX+tAdK/f5b5feLxuD/55JN+xBFH+KBBg3zWrFn+xhtveDwe79O5P//8sz/22GM+bty4zoiZ+NvMavTXor8W/bXor0V/LfprHeT9Y1nu7tbPWltbbfny5bZo0SJbtWqVZWdn28SJE2369Ok2fvx4Gzt2rI0ZM8by8vK6vba+vt5qampsy5Yttn79equsrLRt27bZ8OHD7fLLL7drrrnGJk6c2N/fUqjQX4v+WvTXor8W/bXor3WQ9i+XLPP/9tdff1lFRYVVVlZaVVWV1dbWWkdHh5mZ5ebmWn5+vkWjUYvH49bQ0ND5umg0aieccIJNnz7dZsyYYVOmTLFIJKL6NkKL/lr016K/Fv216K9Ff62DqL9+mf9/bW1ttm3bNqutrbXGxkaLx+PW1NRk+fn5VlRUZEVFRTZmzBg78sgjLSsrSz3uQYf+WvTXor8W/bXor0V/rRD3z7xlHgAAAMABKc9WTwAAAAAgNSzzAAAAQEixzAMAAAAh9T/qWQXKiyoG1QAAAABJRU5ErkJggg==\n",
260 |       "text/plain": [
261 |        "<IPython.core.display.Image object>"
262 |       ]
263 |      },
264 |      "execution_count": 9,
265 |      "metadata": {},
266 |      "output_type": "execute_result"
267 |     }
268 |    ],
269 |    "source": [
270 |     "total.visualize()"
271 |    ]
272 |   },
273 |   {
274 |    "cell_type": "markdown",
275 |    "metadata": {},
276 |    "source": [
277 |     "Notice that the computation is significantly faster!"
278 |    ]
279 |   },
280 |   {
281 |    "cell_type": "markdown",
282 |    "metadata": {},
283 |    "source": [
284 |     "### Parallel Pandas `groupby()`"
285 |    ]
286 |   },
287 |   {
288 |    "cell_type": "markdown",
289 |    "metadata": {},
290 |    "source": [
291 |     "For another example, let's go back to the NYC taxi cabs dataset. Again, we assume all CSV files are in the `data/` subdirectory."
292 |    ]
293 |   },
294 |   {
295 |    "cell_type": "code",
296 |    "execution_count": null,
297 |    "metadata": {},
298 |    "outputs": [],
299 |    "source": [
300 |     "# Uncomment to download data again\n",
301 |     "# !wget https://s3.amazonaws.com/nyc-tlc/trip+data/yellow_tripdata_2019-{01..12}.csv"
302 |    ]
303 |   },
304 |   {
305 |    "cell_type": "markdown",
306 |    "metadata": {},
307 |    "source": [
308 |     "Similar to the computation in the DataFrame notebook, we read the data for one month and find the mean tip amount as a function of passenger count."
309 |    ]
310 |   },
311 |   {
312 |    "cell_type": "code",
313 |    "execution_count": 13,
314 |    "metadata": {},
315 |    "outputs": [
316 |     {
317 |      "data": {
318 |       "text/html": [
319 |        "<div>\n",
320 |        "<style scoped>\n",
321 |        "    .dataframe tbody tr th:only-of-type {\n",
322 |        "        vertical-align: middle;\n",
323 |        "    }\n",
324 |        "\n",
325 |        "    .dataframe tbody tr th {\n",
326 |        "        vertical-align: top;\n",
327 |        "    }\n",
328 |        "\n",
329 |        "    .dataframe thead th {\n",
330 |        "        text-align: right;\n",
331 |        "    }\n",
332 |        "</style>\n",
333 |        "<table border=\"1\" class=\"dataframe\">\n",
334 |        "  <thead>\n",
335 |        "    <tr style=\"text-align: right;\">\n",
336 |        "      <th></th>\n",
337 |        "      <th>VendorID</th>\n",
338 |        "      <th>tpep_pickup_datetime</th>\n",
339 |        "      <th>tpep_dropoff_datetime</th>\n",
340 |        "      <th>passenger_count</th>\n",
341 |        "      <th>trip_distance</th>\n",
342 |        "      <th>RatecodeID</th>\n",
343 |        "      <th>store_and_fwd_flag</th>\n",
344 |        "      <th>PULocationID</th>\n",
345 |        "      <th>DOLocationID</th>\n",
346 |        "      <th>payment_type</th>\n",
347 |        "      <th>fare_amount</th>\n",
348 |        "      <th>extra</th>\n",
349 |        "      <th>mta_tax</th>\n",
350 |        "      <th>tip_amount</th>\n",
351 |        "      <th>tolls_amount</th>\n",
352 |        "      <th>improvement_surcharge</th>\n",
353 |        "      <th>total_amount</th>\n",
354 |        "      <th>congestion_surcharge</th>\n",
355 |        "    </tr>\n",
356 |        "  </thead>\n",
357 |        "  <tbody>\n",
358 |        "    <tr>\n",
359 |        "      <th>0</th>\n",
360 |        "      <td>1</td>\n",
361 |        "      <td>2019-01-01 00:46:40</td>\n",
362 |        "      <td>2019-01-01 00:53:20</td>\n",
363 |        "      <td>1</td>\n",
364 |        "      <td>1.5</td>\n",
365 |        "      <td>1</td>\n",
366 |        "      <td>N</td>\n",
367 |        "      <td>151</td>\n",
368 |        "      <td>239</td>\n",
369 |        "      <td>1</td>\n",
370 |        "      <td>7.0</td>\n",
371 |        "      <td>0.5</td>\n",
372 |        "      <td>0.5</td>\n",
373 |        "      <td>1.65</td>\n",
374 |        "      <td>0.0</td>\n",
375 |        "      <td>0.3</td>\n",
376 |        "      <td>9.95</td>\n",
377 |        "      <td>NaN</td>\n",
378 |        "    </tr>\n",
379 |        "    <tr>\n",
380 |        "      <th>1</th>\n",
381 |        "      <td>1</td>\n",
382 |        "      <td>2019-01-01 00:59:47</td>\n",
383 |        "      <td>2019-01-01 01:18:59</td>\n",
384 |        "      <td>1</td>\n",
385 |        "      <td>2.6</td>\n",
386 |        "      <td>1</td>\n",
387 |        "      <td>N</td>\n",
388 |        "      <td>239</td>\n",
389 |        "      <td>246</td>\n",
390 |        "      <td>1</td>\n",
391 |        "      <td>14.0</td>\n",
392 |        "      <td>0.5</td>\n",
393 |        "      <td>0.5</td>\n",
394 |        "      <td>1.00</td>\n",
395 |        "      <td>0.0</td>\n",
396 |        "      <td>0.3</td>\n",
397 |        "      <td>16.30</td>\n",
398 |        "      <td>NaN</td>\n",
399 |        "    </tr>\n",
400 |        "    <tr>\n",
401 |        "      <th>2</th>\n",
402 |        "      <td>2</td>\n",
403 |        "      <td>2018-12-21 13:48:30</td>\n",
404 |        "      <td>2018-12-21 13:52:40</td>\n",
405 |        "      <td>3</td>\n",
406 |        "      <td>0.0</td>\n",
407 |        "      <td>1</td>\n",
408 |        "      <td>N</td>\n",
409 |        "      <td>236</td>\n",
410 |        "      <td>236</td>\n",
411 |        "      <td>1</td>\n",
412 |        "      <td>4.5</td>\n",
413 |        "      <td>0.5</td>\n",
414 |        "      <td>0.5</td>\n",
415 |        "      <td>0.00</td>\n",
416 |        "      <td>0.0</td>\n",
417 |        "      <td>0.3</td>\n",
418 |        "      <td>5.80</td>\n",
419 |        "      <td>NaN</td>\n",
420 |        "    </tr>\n",
421 |        "    <tr>\n",
422 |        "      <th>3</th>\n",
423 |        "      <td>2</td>\n",
424 |        "      <td>2018-11-28 15:52:25</td>\n",
425 |        "      <td>2018-11-28 15:55:45</td>\n",
426 |        "      <td>5</td>\n",
427 |        "      <td>0.0</td>\n",
428 |        "      <td>1</td>\n",
429 |        "      <td>N</td>\n",
430 |        "      <td>193</td>\n",
431 |        "      <td>193</td>\n",
432 |        "      <td>2</td>\n",
433 |        "      <td>3.5</td>\n",
434 |        "      <td>0.5</td>\n",
435 |        "      <td>0.5</td>\n",
436 |        "      <td>0.00</td>\n",
437 |        "      <td>0.0</td>\n",
438 |        "      <td>0.3</td>\n",
439 |        "      <td>7.55</td>\n",
440 |        "      <td>NaN</td>\n",
441 |        "    </tr>\n",
442 |        "    <tr>\n",
443 |        "      <th>4</th>\n",
444 |        "      <td>2</td>\n",
445 |        "      <td>2018-11-28 15:56:57</td>\n",
446 |        "      <td>2018-11-28 15:58:33</td>\n",
447 |        "      <td>5</td>\n",
448 |        "      <td>0.0</td>\n",
449 |        "      <td>2</td>\n",
450 |        "      <td>N</td>\n",
451 |        "      <td>193</td>\n",
452 |        "      <td>193</td>\n",
453 |        "      <td>2</td>\n",
454 |        "      <td>52.0</td>\n",
455 |        "      <td>0.0</td>\n",
456 |        "      <td>0.5</td>\n",
457 |        "      <td>0.00</td>\n",
458 |        "      <td>0.0</td>\n",
459 |        "      <td>0.3</td>\n",
460 |        "      <td>55.55</td>\n",
461 |        "      <td>NaN</td>\n",
462 |        "    </tr>\n",
463 |        "  </tbody>\n",
464 |        "</table>\n",
465 |        "</div>"
466 |       ],
467 |       "text/plain": [
468 |        "   VendorID tpep_pickup_datetime tpep_dropoff_datetime  passenger_count  \\\n",
469 |        "0         1  2019-01-01 00:46:40   2019-01-01 00:53:20                1   \n",
470 |        "1         1  2019-01-01 00:59:47   2019-01-01 01:18:59                1   \n",
471 |        "2         2  2018-12-21 13:48:30   2018-12-21 13:52:40                3   \n",
472 |        "3         2  2018-11-28 15:52:25   2018-11-28 15:55:45                5   \n",
473 |        "4         2  2018-11-28 15:56:57   2018-11-28 15:58:33                5   \n",
474 |        "\n",
475 |        "   trip_distance  RatecodeID store_and_fwd_flag  PULocationID  DOLocationID  \\\n",
476 |        "0            1.5           1                  N           151           239   \n",
477 |        "1            2.6           1                  N           239           246   \n",
478 |        "2            0.0           1                  N           236           236   \n",
479 |        "3            0.0           1                  N           193           193   \n",
480 |        "4            0.0           2                  N           193           193   \n",
481 |        "\n",
482 |        "   payment_type  fare_amount  extra  mta_tax  tip_amount  tolls_amount  \\\n",
483 |        "0             1          7.0    0.5      0.5        1.65           0.0   \n",
484 |        "1             1         14.0    0.5      0.5        1.00           0.0   \n",
485 |        "2             1          4.5    0.5      0.5        0.00           0.0   \n",
486 |        "3             2          3.5    0.5      0.5        0.00           0.0   \n",
487 |        "4             2         52.0    0.0      0.5        0.00           0.0   \n",
488 |        "\n",
489 |        "   improvement_surcharge  total_amount  congestion_surcharge  \n",
490 |        "0                    0.3          9.95                   NaN  \n",
491 |        "1                    0.3         16.30                   NaN  \n",
492 |        "2                    0.3          5.80                   NaN  \n",
493 |        "3                    0.3          7.55                   NaN  \n",
494 |        "4                    0.3         55.55                   NaN  "
495 |       ]
496 |      },
497 |      "execution_count": 13,
498 |      "metadata": {},
499 |      "output_type": "execute_result"
500 |     }
501 |    ],
502 |    "source": [
503 |     "import pandas as pd\n",
504 |     "\n",
505 |     "df = pd.read_csv(\"data/yellow_tripdata_2019-01.csv\")\n",
506 |     "df.head()"
507 |    ]
508 |   },
509 |   {
510 |    "cell_type": "code",
511 |    "execution_count": 14,
512 |    "metadata": {},
513 |    "outputs": [
514 |     {
515 |      "data": {
516 |       "text/plain": [
517 |        "passenger_count\n",
518 |        "0    1.786901\n",
519 |        "1    1.828308\n",
520 |        "2    1.833877\n",
521 |        "3    1.795579\n",
522 |        "4    1.702710\n",
523 |        "5    1.869868\n",
524 |        "6    1.856830\n",
525 |        "7    6.542632\n",
526 |        "8    6.480690\n",
527 |        "9    3.116667\n",
528 |        "Name: tip_amount, dtype: float64"
529 |       ]
530 |      },
531 |      "execution_count": 14,
532 |      "metadata": {},
533 |      "output_type": "execute_result"
534 |     }
535 |    ],
536 |    "source": [
537 |     "df.groupby(\"passenger_count\").tip_amount.mean()"
538 |    ]
539 |   },
540 |   {
541 |    "cell_type": "markdown",
542 |    "metadata": {},
543 |    "source": [
544 |     "Now, to compute this value across the entire dataset, we can use the following sequential code:"
545 |    ]
546 |   },
547 |   {
548 |    "cell_type": "code",
549 |    "execution_count": 15,
550 |    "metadata": {},
551 |    "outputs": [],
552 |    "source": [
553 |     "import os\n",
554 |     "from glob import glob\n",
555 |     "\n",
556 |     "filenames = sorted(glob(os.path.join('data', '*.csv')))"
557 |    ]
558 |   },
559 |   {
560 |    "cell_type": "code",
561 |    "execution_count": 16,
562 |    "metadata": {},
563 |    "outputs": [
564 |     {
565 |      "name": "stderr",
566 |      "output_type": "stream",
567 |      "text": [
568 |       "<decorator-gen-54>:2: DtypeWarning: Columns (6) have mixed types.Specify dtype option on import or set low_memory=False.\n"
569 |      ]
570 |     },
571 |     {
572 |      "name": "stdout",
573 |      "output_type": "stream",
574 |      "text": [
575 |       "CPU times: user 2min 20s, sys: 17.4 s, total: 2min 37s\n",
576 |       "Wall time: 2min 38s\n"
577 |      ]
578 |     },
579 |     {
580 |      "data": {
581 |       "text/plain": [
582 |        "passenger_count\n",
583 |        "0    2.122789\n",
584 |        "1    2.206790\n",
585 |        "2    2.214306\n",
586 |        "3    2.137775\n",
587 |        "4    2.023804\n",
588 |        "5    2.235441\n",
589 |        "6    2.221105\n",
590 |        "7    6.675962\n",
591 |        "8    7.111625\n",
592 |        "9    7.377822\n",
593 |        "Name: tip_amount, dtype: float64"
594 |       ]
595 |      },
596 |      "execution_count": 16,
597 |      "metadata": {},
598 |      "output_type": "execute_result"
599 |     }
600 |    ],
601 |    "source": [
602 |     "%%time\n",
603 |     "\n",
604 |     "sums = []\n",
605 |     "counts = []\n",
606 |     "\n",
607 |     "for fn in filenames:\n",
608 |     "    # Read file\n",
609 |     "    df = pd.read_csv(fn)\n",
610 |     "\n",
611 |     "    # Groupby passenger_count\n",
612 |     "    by_passenger_count = df.groupby('passenger_count')\n",
613 |     "\n",
614 |     "    # Sum of (all) tip_amount as function of passenger_count\n",
615 |     "    amount = by_passenger_count.tip_amount.sum()\n",
616 |     "\n",
617 |     "    # Number of non-null tip_amount values per passenger_count\n",
618 |     "    total = by_passenger_count.tip_amount.count()\n",
619 |     "\n",
620 |     "    # Save the intermediates\n",
621 |     "    sums.append(amount)\n",
622 |     "    counts.append(total)\n",
623 |     "\n",
624 |     "# Combine intermediates to get total mean\n",
625 |     "sum_tip_amount = sum(sums)\n",
626 |     "n_passengers = sum(counts)\n",
627 |     "mean = sum_tip_amount / n_passengers\n",
628 |     "mean"
629 |    ]
630 |   },
631 |   {
632 |    "cell_type": "markdown",
633 |    "metadata": {},
634 |    "source": [
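635 |     "Now parallelize the same computation using `delayed`: each `pd.read_csv` call becomes a lazy task, and a single `compute` call evaluates all the intermediate sums and counts together."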
636 |    ]
637 |   },
638 |   {
639 |    "cell_type": "code",
640 |    "execution_count": 17,
641 |    "metadata": {},
642 |    "outputs": [],
643 |    "source": [
644 |     "from dask import compute"
645 |    ]
646 |   },
647 |   {
648 |    "cell_type": "code",
649 |    "execution_count": 18,
650 |    "metadata": {},
651 |    "outputs": [
652 |     {
653 |      "name": "stderr",
654 |      "output_type": "stream",
655 |      "text": [
656 |       "/Users/pavithra-coiled/.conda/envs/talkpython-dask/lib/python3.8/site-packages/dask/local.py:237: DtypeWarning: Columns (6) have mixed types.Specify dtype option on import or set low_memory=False.\n",
657 |       "  return [execute_task(*a) for a in it]\n"
658 |      ]
659 |     },
660 |     {
661 |      "name": "stdout",
662 |      "output_type": "stream",
663 |      "text": [
664 |       "CPU times: user 3min 42s, sys: 1min 36s, total: 5min 18s\n",
665 |       "Wall time: 2min\n"
666 |      ]
667 |     },
668 |     {
669 |      "data": {
670 |       "text/plain": [
671 |        "passenger_count\n",
672 |        "0    2.122789\n",
673 |        "1    2.206790\n",
674 |        "2    2.214306\n",
675 |        "3    2.137775\n",
676 |        "4    2.023804\n",
677 |        "5    2.235441\n",
678 |        "6    2.221105\n",
679 |        "7    6.675962\n",
680 |        "8    7.111625\n",
681 |        "9    7.377822\n",
682 |        "Name: tip_amount, dtype: float64"
683 |       ]
684 |      },
685 |      "execution_count": 18,
686 |      "metadata": {},
687 |      "output_type": "execute_result"
688 |     }
689 |    ],
690 |    "source": [
691 |     "%%time\n",
692 |     "\n",
693 |     "sums = []\n",
694 |     "counts = []\n",
695 |     "\n",
696 |     "for fn in filenames:\n",
697 |     "    \n",
698 |     "    df = delayed(pd.read_csv)(fn)  # Delayed!\n",
699 |     "\n",
700 |     "    by_passenger_count = df.groupby('passenger_count')\n",
701 |     " \n",
702 |     "    amount = by_passenger_count.tip_amount.sum()\n",
703 |     "\n",
704 |     "    total = by_passenger_count.tip_amount.count()\n",
705 |     "\n",
706 |     "    sums.append(amount)\n",
707 |     "    counts.append(total)\n",
708 |     "\n",
709 |     "    \n",
710 |     "sums, counts = compute(sums, counts)  # Compute the intermediates!\n",
711 |     "    \n",
712 |     "sum_tip_amount = sum(sums)\n",
713 |     "n_passengers = sum(counts)\n",
714 |     "mean = sum_tip_amount / n_passengers\n",
715 |     "mean"
716 |    ]
717 |   },
718 |   {
719 |    "cell_type": "markdown",
720 |    "metadata": {},
721 |    "source": [
722 |     "## Checkpoint\n",
723 |     "\n",
724 |     "**Question:** Using the Delayed API to parallelize, create a NumPy array `x` of any size and compute the sum of all array entries."
725 |    ]
726 |   },
727 |   {
728 |    "cell_type": "code",
729 |    "execution_count": null,
730 |    "metadata": {},
731 |    "outputs": [],
732 |    "source": [
733 |     "# Your answer here"
734 |    ]
735 |   },
736 |   {
737 |    "cell_type": "code",
738 |    "execution_count": null,
739 |    "metadata": {
740 |     "jupyter": {
741 |      "source_hidden": true
742 |     },
743 |     "tags": []
744 |    },
745 |    "outputs": [],
746 |    "source": [
747 |     "import numpy as np\n",
748 |     "\n",
749 |     "x = delayed(np.ones)((1000,1000), dtype=int)\n",
750 |     "y = x.sum()\n",
751 |     "y.compute()"
752 |    ]
753 |   },
754 |   {
755 |    "cell_type": "markdown",
756 |    "metadata": {},
757 |    "source": [
758 |     "## Best Practices"
759 |    ]
760 |   },
761 |   {
762 |    "cell_type": "markdown",
763 |    "metadata": {},
764 |    "source": [
765 |     "1. Call `delayed` on Python functions, not on their results"
766 |    ]
767 |   },
768 |   {
769 |    "cell_type": "code",
770 |    "execution_count": null,
771 |    "metadata": {},
772 |    "outputs": [],
773 |    "source": [
774 |     "# [DON'T] Call delayed on the result, because the function executes immediately\n",
775 |     "\n",
776 |     "dask.delayed(f(x, y))"
777 |    ]
778 |   },
779 |   {
780 |    "cell_type": "code",
781 |    "execution_count": null,
782 |    "metadata": {},
783 |    "outputs": [],
784 |    "source": [
785 |     "# [DO] Call delayed on function\n",
786 |     "\n",
787 |     "dask.delayed(f)(x, y)"
788 |    ]
789 |   },
790 |   {
791 |    "cell_type": "markdown",
792 |    "metadata": {},
793 |    "source": [
794 |     "2. Call `compute` once on many results, instead of repeatedly"
795 |    ]
796 |   },
797 |   {
798 |    "cell_type": "code",
799 |    "execution_count": null,
800 |    "metadata": {},
801 |    "outputs": [],
802 |    "source": [
803 |     "# [DON'T] Call compute repeatedly\n",
804 |     "\n",
805 |     "results = []\n",
806 |     "for x in L:\n",
807 |     "    y = dask.delayed(f)(x)\n",
808 |     "    results.append(y.compute())\n",
809 |     "\n",
810 |     "results"
811 |    ]
812 |   },
813 |   {
814 |    "cell_type": "code",
815 |    "execution_count": null,
816 |    "metadata": {},
817 |    "outputs": [],
818 |    "source": [
819 |     "# [DO] Collect many calls for one compute\n",
820 |     "\n",
821 |     "results = []\n",
822 |     "for x in L:\n",
823 |     "    y = dask.delayed(f)(x)\n",
824 |     "    results.append(y)\n",
825 |     "\n",
826 |     "results = dask.compute(*results)"
827 |    ]
828 |   },
829 |   {
830 |    "cell_type": "markdown",
831 |    "metadata": {},
832 |    "source": [
833 |     "3. Do not change (mutate) inputs"
834 |    ]
835 |   },
836 |   {
837 |    "cell_type": "code",
838 |    "execution_count": null,
839 |    "metadata": {},
840 |    "outputs": [],
841 |    "source": [
842 |     "# [DON'T] Mutate inputs in functions\n",
843 |     "\n",
844 |     "@dask.delayed\n",
845 |     "def f(x):\n",
846 |     "    x += 1\n",
847 |     "    return x"
848 |    ]
849 |   },
850 |   {
851 |    "cell_type": "code",
852 |    "execution_count": null,
853 |    "metadata": {},
854 |    "outputs": [],
855 |    "source": [
856 |     "# [DO] Return new values or copies\n",
857 |     "\n",
858 |     "@dask.delayed\n",
859 |     "def f(x):\n",
860 |     "    x = x + 1\n",
861 |     "    return x"
862 |    ]
863 |   },
864 |   {
865 |    "cell_type": "markdown",
866 |    "metadata": {},
867 |    "source": [
868 |     "For more best practices, refer to the [Dask documentation](https://docs.dask.org/en/latest/delayed-best-practices.html)."
869 |    ]
870 |   },
871 |   {
872 |    "cell_type": "markdown",
873 |    "metadata": {},
874 |    "source": [
875 |     "## References"
876 |    ]
877 |   },
878 |   {
879 |    "cell_type": "markdown",
880 |    "metadata": {},
881 |    "source": [
882 |     "* [Dask Delayed documentation](https://docs.dask.org/en/latest/delayed.html)\n",
883 |     "* [Dask Delayed best practices](https://docs.dask.org/en/latest/delayed-best-practices.html)\n",
884 |     "* [Dask Tutorial - Delayed](https://tutorial.dask.org/01_dask.delayed.html)"
885 |    ]
886 |   },
887 |   {
888 |    "cell_type": "code",
889 |    "execution_count": null,
890 |    "metadata": {},
891 |    "outputs": [],
892 |    "source": []
893 |   }
894 |  ],
895 |  "metadata": {
896 |   "kernelspec": {
897 |    "display_name": "Python 3",
898 |    "language": "python",
899 |    "name": "python3"
900 |   },
901 |   "language_info": {
902 |    "codemirror_mode": {
903 |     "name": "ipython",
904 |     "version": 3
905 |    },
906 |    "file_extension": ".py",
907 |    "mimetype": "text/x-python",
908 |    "name": "python",
909 |    "nbconvert_exporter": "python",
910 |    "pygments_lexer": "ipython3",
911 |    "version": "3.8.10"
912 |   }
913 |  },
914 |  "nbformat": 4,
915 |  "nbformat_minor": 4
916 | }
917 | 


--------------------------------------------------------------------------------