├── .gitignore
├── 1. Beginning with Python.ipynb
├── 2. First steps with Pandas.ipynb
├── 3.1 Visualization with Matplotlib.ipynb
├── 3.2 Visualization with Seaborn.ipynb
├── 4. More Python basics.ipynb
├── 5. More Pandas.ipynb
├── CHANGELOG.md
├── LICENSE
├── README.md
├── data
│   ├── Iris
│   │   ├── Iris.csv
│   │   └── Iris_data.csv
│   ├── Penguins
│   │   ├── penguins.csv
│   │   └── penguins_clean.csv
│   ├── Pokemon
│   │   └── pokemon.csv
│   └── food_training
│       ├── languages.csv
│       ├── training_2014.csv
│       ├── training_2015.csv
│       └── training_2016.csv
├── media
│   ├── colab
│   │   ├── image1.png
│   │   ├── image10.png
│   │   ├── image2.png
│   │   ├── image3.png
│   │   ├── image4.png
│   │   ├── image5.png
│   │   ├── image6.png
│   │   ├── image7.png
│   │   ├── image8.png
│   │   └── image9.png
│   ├── humble-data-logo-transparent.png
│   ├── humble-data-logo-white-transparent.png
│   └── matplotlib_logo_light.svg
├── requirements.txt
└── solutions
    ├── 01_01.py
    ├── 01_02.py
    ├── 01_03.py
    ├── 01_04.py
    ├── 01_05.py
    ├── 01_06.py
    ├── 01_07.py
    ├── 01_08.py
    ├── 01_09.py
    ├── 01_10.py
    ├── 01_11.py
    ├── 01_12.py
    ├── 01_13.py
    ├── 01_14.py
    ├── 01_15.py
    ├── 01_16.py
    ├── 01_17.py
    ├── 01_18.py
    ├── 01_19.py
    ├── 01_20.py
    ├── 01_21.py
    ├── 01_22.py
    ├── 01_23.py
    ├── 01_24.py
    ├── 01_25.py
    ├── 01_26.py
    ├── 01_27.py
    ├── 01_28.py
    ├── 01_29.py
    ├── 01_30.py
    ├── 01_31.py
    ├── 01_32.py
    ├── 01_33.py
    ├── 01_34.py
    ├── 01_35.py
    ├── 01_36.py
    ├── 01_37.py
    ├── 01_38.py
    ├── 02_01.py
    ├── 02_02.py
    ├── 02_03.py
    ├── 02_04.py
    ├── 02_05.py
    ├── 02_06.py
    ├── 02_07.py
    ├── 02_08.py
    ├── 02_09.py
    ├── 02_10.py
    ├── 02_11.py
    ├── 02_12.py
    ├── 02_13.py
    ├── 02_14.py
    ├── 02_15.py
    ├── 02_16.py
    ├── 02_17.py
    ├── 02_18.py
    ├── 02_19.py
    ├── 02_20.py
    ├── 02_21.py
    ├── 02_22.py
    ├── 02_23.py
    ├── 02_24.py
    ├── 02_25.py
    ├── 02_26.py
    ├── 02_27.py
    ├── 02_28.py
    ├── 02_29.py
    ├── 02_30.py
    ├── 02_31.py
    ├── 02_32.py
    ├── 04_01.py
    ├── 04_02.py
    ├── 04_03.py
    ├── 04_04.py
    ├── 04_05.py
    ├── 04_06.py
    ├── 04_07.py
    ├── 04_08.py
    ├── 04_09.py
    ├── 05_01.py
    ├── 05_02.py
    ├── 05_03.py
    ├── 05_04.py
    ├── 05_05.py
    ├── 05_06.py
    ├── 05_07.py
    ├── 05_08.py
    ├── 05_09.py
    ├── 05_10.py
    ├── 05_11.py
    ├── 05_12.py
    ├── 05_13.py
    ├── 05_14.py
    ├── 05_15.py
    ├── 05_16.py
    ├── 05_17.py
    ├── 05_18.py
    ├── 05_19.py
    ├── 05_20.py
    ├── 05_21.py
    ├── 05_22.py
    ├── 05_23.py
    ├── 05_24.py
    ├── 05_25.py
    ├── 05_26.py
    ├── 05_27.py
    ├── 05_28.py
    ├── 05_29.py
    ├── 05_30.py
    ├── 05_31.py
    ├── 05_32.py
    ├── 05_33.py
    ├── 05_34.py
    ├── 05_35.py
    ├── 05_36.py
    ├── 05_37.py
    ├── 05_38.py
    ├── 05_39.py
    ├── 05_40.py
    ├── 05_41.py
    ├── 05_42.py
    ├── 05_43.py
    ├── 05_44.py
    ├── 05_45.py
    ├── 05_46.py
    └── 05_47.py
/.gitignore:
--------------------------------------------------------------------------------
1 | ### https://www.gitignore.io/ ###
2 |
3 | # Caches
4 | **/.ipynb_checkpoints/*
5 | **/__pycache__/*
6 | *.pyc
7 |
8 | # Data
9 | data/Penguins/my_penguins.csv
10 |
11 | # CoCalc
12 | *.sage-chat
13 | *.sage-jupyter2
14 |
15 | # Other
16 | .DS_Store
17 | .vscode
18 |
19 | env_workshop/
20 |
--------------------------------------------------------------------------------
/2. First steps with Pandas.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 |     "\n",
8 |     "\n",
9 |     "\n",
10 |     "\n",
11 |     "Data Analysis with Pandas\n",
12 |     "\n",
13 |     ""
14 | ]
15 | },
16 | {
17 | "cell_type": "markdown",
18 | "metadata": {},
19 | "source": [
20 |     "> ***Note***: This notebook contains solution cells with ***a*** solution. Remember there is rarely only one solution to a problem! \n",
21 |     "> \n",
22 |     "> You will recognise these cells as they start with **# !cat**. \n",
23 |     "> \n",
24 |     "> If you would like to see a solution, remove the **#** (you can toggle comments with **Ctrl** and **?**) and run the cell. To execute the solution code, run the cell again."
25 | ]
26 | },
27 | {
28 | "cell_type": "markdown",
29 | "metadata": {},
30 | "source": [
31 | "\n",
32 | "Data analysis packages\n",
33 |     "\n",
34 | ""
35 | ]
36 | },
37 | {
38 | "cell_type": "markdown",
39 | "metadata": {},
40 | "source": [
41 |     "Data scientists use a wide variety of Python libraries that make working with data significantly easier. The core ones include:\n",
42 | "\n",
43 | "| Package | Description |\n",
44 | "| -- | -- |\n",
45 | "| `NumPy` | Numerical calculations - does all the heavy lifting by passing out to C subroutines. This means you get _both_ the productivity of Python, _and_ the computational power of C. Best of both worlds! |\n",
46 | "| `SciPy` | Scientific computing, statistic tests, and much more! |\n",
47 | "| `pandas` | Your data manipulation swiss army knife. You'll likely see pandas used in any PyData demo! pandas is built on top of NumPy, so it's **fast**. |\n",
48 |     "| `matplotlib` | An old but powerful data visualisation package, inspired by MATLAB. |\n",
49 | "| `Seaborn` | A newer and easy-to-use but limited data visualisation package, built on top of matplotlib. |\n",
50 |     "| `scikit-learn` | Your one-stop machine learning shop! Classification, regression, clustering, dimensionality reduction and more. |\n",
51 |     "| `nltk` and `spacy` | `nltk` is the Natural Language Toolkit; `spacy` is a newer, very easy-to-use package for natural language processing. |\n",
52 | "| `statsmodels` | Statistical tests, time series forecasting and more. The \"model formula\" interface will be familiar to R users. |\n",
53 | "| `requests` and `Beautiful Soup` | `requests` + `Beautiful Soup` = great combination for building web scrapers. |\n",
54 |     "| `Jupyter` | Jupyter itself is a package too. See the latest version at https://pypi.org/project/jupyter/, and install a specific version with e.g. `conda install jupyter==1.0.0` |\n",
55 | "\n",
56 |     "Countless other packages are available, too.\n",
57 |     "\n",
58 |     "For today, we'll focus on the library at the heart of most day-to-day data work: `pandas`. Pandas is built on top of the speed and power of NumPy."
59 | ]
60 | },
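The table above can be illustrated with a tiny standalone sketch of how pandas sits on top of NumPy (the array here is made up, not part of the workshop data):

```python
import numpy as np
import pandas as pd

# A pandas Series wraps a NumPy array, so the heavy numeric work stays in C.
arr = np.array([1.0, 2.0, 3.0])
s = pd.Series(arr, name="example")

print(type(s.to_numpy()))  # the underlying buffer is still a NumPy ndarray
print(s.mean())            # 2.0
```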
61 | {
62 | "cell_type": "markdown",
63 | "metadata": {},
64 | "source": [
65 | "---\n",
66 | "\n",
67 | "\n",
68 | "Imports\n",
69 |     "\n",
70 | ""
71 | ]
72 | },
73 | {
74 | "cell_type": "code",
75 | "execution_count": null,
76 | "metadata": {
77 | "jupyter": {
78 | "outputs_hidden": false
79 | }
80 | },
81 | "outputs": [],
82 | "source": [
83 | "import pandas as pd"
84 | ]
85 | },
86 | {
87 | "cell_type": "markdown",
88 | "metadata": {},
89 | "source": [
90 | ">Import numpy using the convention seen at the end of the first notebook."
91 | ]
92 | },
93 | {
94 | "cell_type": "code",
95 | "execution_count": null,
96 | "metadata": {
97 | "jupyter": {
98 | "outputs_hidden": false
99 | }
100 | },
101 | "outputs": [],
102 | "source": []
103 | },
104 | {
105 | "cell_type": "code",
106 | "execution_count": null,
107 | "metadata": {
108 | "jupyter": {
109 | "outputs_hidden": false
110 | }
111 | },
112 | "outputs": [],
113 | "source": "# !cat solutions/02_01.py"
114 | },
115 | {
116 | "cell_type": "markdown",
117 | "metadata": {},
118 | "source": [
119 | "---\n",
120 | "\n",
121 | "\n",
122 | "Loading the data\n",
123 |     "\n",
124 | ""
125 | ]
126 | },
127 | {
128 | "cell_type": "markdown",
129 | "metadata": {},
130 | "source": [
131 |     "To see a method's documentation, you can use the `help` function. In Jupyter, you can also just put a question mark before the method name."
132 | ]
133 | },
134 | {
135 | "cell_type": "code",
136 | "execution_count": null,
137 | "metadata": {
138 | "jupyter": {
139 | "outputs_hidden": false
140 | },
141 | "scrolled": true
142 | },
143 | "outputs": [],
144 | "source": [
145 | "?pd.read_csv"
146 | ]
147 | },
148 | {
149 | "cell_type": "markdown",
150 | "metadata": {},
151 | "source": [
152 |     "To load the dataset used in this notebook, we will provide the path to the file: ../data/Penguins/penguins.csv"
153 | ]
154 | },
155 | {
156 | "cell_type": "markdown",
157 | "metadata": {},
158 | "source": [
159 |     ">Read the CSV file into a pandas DataFrame and assign it to df"
160 | ]
161 | },
162 | {
163 | "cell_type": "code",
164 | "execution_count": null,
165 | "metadata": {
166 | "jupyter": {
167 | "outputs_hidden": false
168 | }
169 | },
170 | "outputs": [],
171 | "source": []
172 | },
173 | {
174 | "cell_type": "code",
175 | "execution_count": null,
176 | "metadata": {
177 | "jupyter": {
178 | "outputs_hidden": false
179 | }
180 | },
181 | "outputs": [],
182 | "source": "# !cat solutions/02_02.py"
183 | },
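If you don't have the penguins CSV at hand, the same `pd.read_csv` call can be sketched with an in-memory file. The rows below are made up; only the column names mirror the real dataset:

```python
import io
import pandas as pd

# Made-up stand-in for ../data/Penguins/penguins.csv (fake rows, same idea).
csv_text = """species,bill_length_mm,flipper_length_mm,body_mass_g,sex
Adelie,39.1,181,3750,male
Gentoo,47.5,217,4875,female
"""

# read_csv accepts a path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))
print(df.shape)  # (2, 5)
```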
184 | {
185 | "cell_type": "markdown",
186 | "metadata": {},
187 | "source": [
188 | "**To have a look at the first 5 rows of df, we can use the *head* method.**"
189 | ]
190 | },
191 | {
192 | "cell_type": "code",
193 | "execution_count": null,
194 | "metadata": {
195 | "jupyter": {
196 | "outputs_hidden": false
197 | }
198 | },
199 | "outputs": [],
200 | "source": [
201 | "df.head()"
202 | ]
203 | },
204 | {
205 | "cell_type": "markdown",
206 | "metadata": {},
207 | "source": [
208 | ">Have a look at the last 3 rows of df using the tail method"
209 | ]
210 | },
211 | {
212 | "cell_type": "code",
213 | "execution_count": null,
214 | "metadata": {
215 | "jupyter": {
216 | "outputs_hidden": false
217 | }
218 | },
219 | "outputs": [],
220 | "source": []
221 | },
222 | {
223 | "cell_type": "code",
224 | "execution_count": null,
225 | "metadata": {
226 | "jupyter": {
227 | "outputs_hidden": false
228 | }
229 | },
230 | "outputs": [],
231 | "source": "# !cat solutions/02_03.py"
232 | },
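A minimal sketch of `head` and `tail`, using a throwaway DataFrame rather than the penguins data:

```python
import pandas as pd

df = pd.DataFrame({"x": range(10)})

print(df.head())   # first 5 rows by default
print(df.tail(3))  # last 3 rows
```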
233 | {
234 | "cell_type": "markdown",
235 | "metadata": {},
236 | "source": [
237 | "---\n",
238 | "\n",
239 | "\n",
240 | "General information about the dataset\n",
241 |     "\n",
242 | ""
243 | ]
244 | },
245 | {
246 | "cell_type": "markdown",
247 | "metadata": {},
248 | "source": [
249 |     "**To get the size of the dataset, we can use the *shape* attribute.** \n",
250 |     "The first number is the number of rows, the second the number of columns."
251 | ]
252 | },
253 | {
254 | "cell_type": "markdown",
255 | "metadata": {},
256 | "source": [
257 |     ">Show the shape of df (no brackets at the end: shape is an attribute, not a method)"
258 | ]
259 | },
260 | {
261 | "cell_type": "code",
262 | "execution_count": null,
263 | "metadata": {
264 | "jupyter": {
265 | "outputs_hidden": false
266 | }
267 | },
268 | "outputs": [],
269 | "source": []
270 | },
271 | {
272 | "cell_type": "code",
273 | "execution_count": null,
274 | "metadata": {
275 | "jupyter": {
276 | "outputs_hidden": false
277 | }
278 | },
279 | "outputs": [],
280 | "source": "# !cat solutions/02_04.py"
281 | },
282 | {
283 | "cell_type": "markdown",
284 | "metadata": {},
285 | "source": [
286 |     ">Get the names of the columns and info about them (number of non-null values and dtype) using the info method."
287 | ]
288 | },
289 | {
290 | "cell_type": "code",
291 | "execution_count": null,
292 | "metadata": {
293 | "jupyter": {
294 | "outputs_hidden": false
295 | }
296 | },
297 | "outputs": [],
298 | "source": []
299 | },
300 | {
301 | "cell_type": "code",
302 | "execution_count": null,
303 | "metadata": {
304 | "jupyter": {
305 | "outputs_hidden": false
306 | }
307 | },
308 | "outputs": [],
309 | "source": "# !cat solutions/02_05.py"
310 | },
311 | {
312 | "cell_type": "markdown",
313 | "metadata": {},
314 | "source": [
315 | ">Get the columns of the dataframe using the columns attribute."
316 | ]
317 | },
318 | {
319 | "cell_type": "code",
320 | "execution_count": null,
321 | "metadata": {
322 | "jupyter": {
323 | "outputs_hidden": false
324 | }
325 | },
326 | "outputs": [],
327 | "source": []
328 | },
329 | {
330 | "cell_type": "code",
331 | "execution_count": null,
332 | "metadata": {
333 | "jupyter": {
334 | "outputs_hidden": false
335 | }
336 | },
337 | "outputs": [],
338 | "source": "# !cat solutions/02_06.py"
339 | },
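The three inspection tools above (`shape`, `info`, `columns`) can be tried on a tiny made-up DataFrame:

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, 2.0, None], "b": ["x", "y", "z"]})

print(df.shape)          # (3, 2) -> (rows, columns); an attribute, so no ()
print(list(df.columns))  # ['a', 'b']
df.info()                # per-column non-null counts and dtypes
```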
340 | {
341 | "cell_type": "markdown",
342 | "metadata": {},
343 | "source": [
344 | "---\n",
345 | "\n",
346 | "\n",
347 | "Display settings\n",
348 |     "\n",
349 | ""
350 | ]
351 | },
352 | {
353 | "cell_type": "markdown",
354 | "metadata": {},
355 | "source": [
356 |     "We can check the display options of the notebook."
357 | ]
358 | },
359 | {
360 | "cell_type": "code",
361 | "execution_count": null,
362 | "metadata": {
363 | "jupyter": {
364 | "outputs_hidden": false
365 | }
366 | },
367 | "outputs": [],
368 | "source": [
369 | "pd.options.display.max_rows"
370 | ]
371 | },
372 | {
373 | "cell_type": "markdown",
374 | "metadata": {},
375 | "source": [
376 |     ">Force pandas to display up to 25 rows by changing the value of the option above."
377 | ]
378 | },
379 | {
380 | "cell_type": "code",
381 | "execution_count": null,
382 | "metadata": {
383 | "jupyter": {
384 | "outputs_hidden": false
385 | }
386 | },
387 | "outputs": [],
388 | "source": []
389 | },
390 | {
391 | "cell_type": "code",
392 | "execution_count": null,
393 | "metadata": {
394 | "jupyter": {
395 | "outputs_hidden": false
396 | }
397 | },
398 | "outputs": [],
399 | "source": "# !cat solutions/02_07.py"
400 | },
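As a sketch, the option can be read and written in two equivalent ways (the value 25 matches the exercise above):

```python
import pandas as pd

print(pd.options.display.max_rows)  # current truncation threshold

# Two equivalent ways to change the option:
pd.options.display.max_rows = 25
pd.set_option("display.max_rows", 25)

print(pd.get_option("display.max_rows"))  # 25
```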
401 | {
402 | "cell_type": "markdown",
403 | "metadata": {},
404 | "source": [
405 | "---\n",
406 | "\n",
407 | "\n",
408 | "Subsetting data\n",
409 |     "\n",
410 | ""
411 | ]
412 | },
413 | {
414 | "cell_type": "markdown",
415 | "metadata": {},
416 | "source": [
417 |     "We can subset a dataframe by label, by integer position, or a combination of both. \n",
418 |     "There are different ways to do it, using .loc, .iloc and also []. \n",
419 |     "See the [documentation](https://pandas.pydata.org/pandas-docs/stable/indexing.html)."
420 | ]
421 | },
422 | {
423 | "cell_type": "markdown",
424 | "metadata": {},
425 | "source": [
426 | ">Display the 'bill_length_mm' column"
427 | ]
428 | },
429 | {
430 | "cell_type": "code",
431 | "execution_count": null,
432 | "metadata": {
433 | "jupyter": {
434 | "outputs_hidden": false
435 | }
436 | },
437 | "outputs": [],
438 | "source": []
439 | },
440 | {
441 | "cell_type": "code",
442 | "execution_count": null,
443 | "metadata": {
444 | "jupyter": {
445 | "outputs_hidden": false
446 | }
447 | },
448 | "outputs": [],
449 | "source": "# !cat solutions/02_08.py"
450 | },
451 | {
452 | "cell_type": "markdown",
453 | "metadata": {},
454 | "source": [
455 |     "*Note:* We could also use `df.bill_length_mm`, but it's not the greatest idea: attribute access can clash with DataFrame methods, and it does not work for column names containing spaces."
456 | ]
457 | },
458 | {
459 | "cell_type": "markdown",
460 | "metadata": {},
461 | "source": [
462 | ">Have a look at the 12th observation:"
463 | ]
464 | },
465 | {
466 | "cell_type": "code",
467 | "execution_count": null,
468 | "metadata": {
469 | "jupyter": {
470 | "outputs_hidden": false
471 | }
472 | },
473 | "outputs": [],
474 | "source": [
475 | "# using .iloc (uses positions, \"i\" stands for integer)\n"
476 | ]
477 | },
478 | {
479 | "cell_type": "code",
480 | "execution_count": null,
481 | "metadata": {
482 | "jupyter": {
483 | "outputs_hidden": false
484 | }
485 | },
486 | "outputs": [],
487 | "source": "# !cat solutions/02_09.py"
488 | },
489 | {
490 | "cell_type": "code",
491 | "execution_count": null,
492 | "metadata": {
493 | "jupyter": {
494 | "outputs_hidden": false
495 | }
496 | },
497 | "outputs": [],
498 | "source": [
499 | "# using .loc (uses indexes and labels)\n"
500 | ]
501 | },
502 | {
503 | "cell_type": "code",
504 | "execution_count": null,
505 | "metadata": {
506 | "jupyter": {
507 | "outputs_hidden": false
508 | }
509 | },
510 | "outputs": [],
511 | "source": "# !cat solutions/02_10.py"
512 | },
513 | {
514 | "cell_type": "markdown",
515 | "metadata": {},
516 | "source": [
517 | ">Display the **bill_length_mm** of the last three observations."
518 | ]
519 | },
520 | {
521 | "cell_type": "code",
522 | "execution_count": null,
523 | "metadata": {
524 | "jupyter": {
525 | "outputs_hidden": false
526 | }
527 | },
528 | "outputs": [],
529 | "source": [
530 | "# using .iloc\n"
531 | ]
532 | },
533 | {
534 | "cell_type": "code",
535 | "execution_count": null,
536 | "metadata": {
537 | "jupyter": {
538 | "outputs_hidden": false
539 | }
540 | },
541 | "outputs": [],
542 | "source": "# !cat solutions/02_11.py"
543 | },
544 | {
545 | "cell_type": "code",
546 | "execution_count": null,
547 | "metadata": {
548 | "jupyter": {
549 | "outputs_hidden": false
550 | }
551 | },
552 | "outputs": [],
553 | "source": [
554 | "# using .loc\n"
555 | ]
556 | },
557 | {
558 | "cell_type": "code",
559 | "execution_count": null,
560 | "metadata": {
561 | "jupyter": {
562 | "outputs_hidden": false
563 | },
564 | "scrolled": true
565 | },
566 | "outputs": [],
567 | "source": "# !cat solutions/02_12.py"
568 | },
569 | {
570 | "cell_type": "markdown",
571 | "metadata": {},
572 | "source": [
573 |     "And finally look at the **flipper_length_mm** and **body_mass_g** of the 146th, the 8th and the 1st observations:"
574 | ]
575 | },
576 | {
577 | "cell_type": "code",
578 | "execution_count": null,
579 | "metadata": {
580 | "jupyter": {
581 | "outputs_hidden": false
582 | }
583 | },
584 | "outputs": [],
585 | "source": [
586 | "# using .iloc\n"
587 | ]
588 | },
589 | {
590 | "cell_type": "code",
591 | "execution_count": null,
592 | "metadata": {
593 | "jupyter": {
594 | "outputs_hidden": false
595 | }
596 | },
597 | "outputs": [],
598 | "source": "# !cat solutions/02_13.py"
599 | },
600 | {
601 | "cell_type": "code",
602 | "execution_count": null,
603 | "metadata": {
604 | "jupyter": {
605 | "outputs_hidden": false
606 | }
607 | },
608 | "outputs": [],
609 | "source": [
610 | "# using .loc\n"
611 | ]
612 | },
613 | {
614 | "cell_type": "code",
615 | "execution_count": null,
616 | "metadata": {
617 | "jupyter": {
618 | "outputs_hidden": false
619 | }
620 | },
621 | "outputs": [],
622 | "source": "# !cat solutions/02_14.py"
623 | },
624 | {
625 | "cell_type": "markdown",
626 | "metadata": {},
627 | "source": [
628 | "**!!WARNING!!** Unlike Python and ``.iloc``, the end value in a range specified by ``.loc`` **includes** the last index specified. "
629 | ]
630 | },
631 | {
632 | "cell_type": "code",
633 | "execution_count": null,
634 | "metadata": {
635 | "jupyter": {
636 | "outputs_hidden": false
637 | },
638 | "scrolled": true
639 | },
640 | "outputs": [],
641 | "source": [
642 | "df.iloc[5:10]"
643 | ]
644 | },
645 | {
646 | "cell_type": "code",
647 | "execution_count": null,
648 | "metadata": {
649 | "jupyter": {
650 | "outputs_hidden": false
651 | }
652 | },
653 | "outputs": [],
654 | "source": [
655 | "df.loc[5:10]"
656 | ]
657 | },
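The inclusive-end behaviour of `.loc` is easy to verify on a throwaway DataFrame with a default integer index:

```python
import pandas as pd

df = pd.DataFrame({"x": range(20)})  # default integer index 0..19

# .iloc slices by position and EXCLUDES the end, like plain Python slicing.
print(len(df.iloc[5:10]))  # 5 rows: positions 5..9

# .loc slices by label and INCLUDES the end label.
print(len(df.loc[5:10]))   # 6 rows: labels 5..10
```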
658 | {
659 | "cell_type": "markdown",
660 | "metadata": {},
661 | "source": [
662 | "---\n",
663 | "\n",
664 | "\n",
665 | "Filtering data on conditions\n",
666 |     "\n",
667 | ""
668 | ]
669 | },
670 | {
671 | "cell_type": "markdown",
672 | "metadata": {},
673 | "source": [
674 | "**We can also use condition(s) to filter.** \n",
675 | "We want to display the rows of df where **body_mass_g** is greater than 4000. We will start by creating a mask with this condition."
676 | ]
677 | },
678 | {
679 | "cell_type": "code",
680 | "execution_count": null,
681 | "metadata": {
682 | "jupyter": {
683 | "outputs_hidden": false
684 | },
685 | "scrolled": true
686 | },
687 | "outputs": [],
688 | "source": [
689 | "mask_PW = df['body_mass_g'] > 4000\n",
690 | "mask_PW"
691 | ]
692 | },
693 | {
694 | "cell_type": "markdown",
695 | "metadata": {},
696 | "source": [
697 |     "Note that this returns booleans. If we pass this mask to our dataframe, it will display only the rows where the mask is True."
698 | ]
699 | },
700 | {
701 | "cell_type": "code",
702 | "execution_count": null,
703 | "metadata": {
704 | "jupyter": {
705 | "outputs_hidden": false
706 | }
707 | },
708 | "outputs": [],
709 | "source": [
710 | "df[mask_PW]"
711 | ]
712 | },
713 | {
714 | "cell_type": "markdown",
715 | "metadata": {},
716 | "source": [
717 | ">Display the rows of df where **body_mass_g** is greater than 4000 and **flipper_length_mm** is less than 185."
718 | ]
719 | },
720 | {
721 | "cell_type": "code",
722 | "execution_count": null,
723 | "metadata": {
724 | "jupyter": {
725 | "outputs_hidden": false
726 | }
727 | },
728 | "outputs": [],
729 | "source": []
730 | },
731 | {
732 | "cell_type": "code",
733 | "execution_count": null,
734 | "metadata": {
735 | "jupyter": {
736 | "outputs_hidden": false
737 | }
738 | },
739 | "outputs": [],
740 | "source": "# !cat solutions/02_15.py"
741 | },
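Combining two conditions works the same way; a self-contained sketch with made-up numbers (not the workshop solution):

```python
import pandas as pd

df = pd.DataFrame({
    "body_mass_g": [3500, 4200, 4800, 3900],
    "flipper_length_mm": [180, 183, 190, 184],
})

# Combine boolean masks with & (and) or | (or); the parentheses are required.
mask = (df["body_mass_g"] > 4000) & (df["flipper_length_mm"] < 185)
print(df[mask])  # only the row with mass 4200 and flipper 183 passes both tests
```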
742 | {
743 | "cell_type": "markdown",
744 | "metadata": {},
745 | "source": [
746 | "---\n",
747 | "\n",
748 | "\n",
749 | "Values\n",
750 |     "\n",
751 | ""
752 | ]
753 | },
754 | {
755 | "cell_type": "markdown",
756 | "metadata": {},
757 | "source": [
758 | "We can get the number of unique values from a certain column by using the `nunique` method.\n",
759 | "\n",
760 | "For example, we can get the number of unique values from the species column:"
761 | ]
762 | },
763 | {
764 | "cell_type": "code",
765 | "execution_count": null,
766 | "metadata": {
767 | "jupyter": {
768 | "outputs_hidden": false
769 | }
770 | },
771 | "outputs": [],
772 | "source": [
773 | "df['species'].nunique()"
774 | ]
775 | },
776 | {
777 | "cell_type": "markdown",
778 | "metadata": {},
779 | "source": [
780 | "We can also get the list of unique values from a certain column by using the `unique` method.\n",
781 | ">Return the list of unique values from the species column"
782 | ]
783 | },
784 | {
785 | "cell_type": "code",
786 | "execution_count": null,
787 | "metadata": {
788 | "jupyter": {
789 | "outputs_hidden": false
790 | }
791 | },
792 | "outputs": [],
793 | "source": []
794 | },
795 | {
796 | "cell_type": "code",
797 | "execution_count": null,
798 | "metadata": {
799 | "jupyter": {
800 | "outputs_hidden": false
801 | }
802 | },
803 | "outputs": [],
804 | "source": "# !cat solutions/02_16.py"
805 | },
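`nunique` and `unique` side by side, on a small made-up Series of species names:

```python
import pandas as pd

s = pd.Series(["Adelie", "Gentoo", "Adelie", "Chinstrap"])

print(s.nunique())       # 3
print(list(s.unique()))  # ['Adelie', 'Gentoo', 'Chinstrap'], in order of first appearance
```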
806 | {
807 | "cell_type": "markdown",
808 | "metadata": {},
809 | "source": [
810 | "---\n",
811 | "\n",
812 | "\n",
813 | "Null Values and NaN\n",
814 |     "\n",
815 | ""
816 | ]
817 | },
818 | {
819 | "cell_type": "markdown",
820 | "metadata": {},
821 | "source": [
822 |     "When you work with data, you will quickly learn that data is never \"clean\": some values are missing. Missing values are usually referred to as null values. In computation, it is best practice to represent them with a \"special number\" that is \"**N**ot **a** **N**umber\", also called NaN.\n",
823 | "\n",
824 | "We can use the `isnull` method to know if a value is null or not. It returns boolean values."
825 | ]
826 | },
827 | {
828 | "cell_type": "code",
829 | "execution_count": null,
830 | "metadata": {
831 | "jupyter": {
832 | "outputs_hidden": false
833 | }
834 | },
835 | "outputs": [],
836 | "source": [
837 | "df['flipper_length_mm'].isnull()"
838 | ]
839 | },
840 | {
841 | "cell_type": "markdown",
842 | "metadata": {},
843 | "source": [
844 |     "**We can apply different methods one after the other.** \n",
845 |     "For example, we could apply the method `sum` after the method `isnull` to get the number of null observations in the **flipper_length_mm** column.\n",
846 | ">Get the total number of null values for **flipper_length_mm**."
847 | ]
848 | },
849 | {
850 | "cell_type": "code",
851 | "execution_count": null,
852 | "metadata": {},
853 | "outputs": [],
854 | "source": []
855 | },
856 | {
857 | "cell_type": "code",
858 | "execution_count": null,
859 | "metadata": {},
860 | "outputs": [],
861 | "source": "# !cat solutions/02_17.py"
862 | },
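The `isnull().sum()` chain can be sketched on a tiny made-up Series (True counts as 1 when summed):

```python
import numpy as np
import pandas as pd

s = pd.Series([181.0, np.nan, 195.0, np.nan], name="flipper_length_mm")

print(s.isnull())        # boolean Series: True where the value is missing
print(s.isnull().sum())  # 2 -> each True counts as 1
```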
863 | {
864 | "cell_type": "markdown",
865 | "metadata": {},
866 | "source": [
867 | "To get the count of the different values of a column, we can use the `value_counts` method.\n",
868 | "\n",
869 | "For example, for the species column:"
870 | ]
871 | },
872 | {
873 | "cell_type": "code",
874 | "execution_count": null,
875 | "metadata": {
876 | "jupyter": {
877 | "outputs_hidden": false
878 | }
879 | },
880 | "outputs": [],
881 | "source": [
882 | "df['species'].value_counts()"
883 | ]
884 | },
885 | {
886 | "cell_type": "markdown",
887 | "metadata": {},
888 | "source": [
889 |     "If we want to include the count of NaN values, we have to pass the value `False` to the parameter **dropna** (set to `True` by default).\n",
890 |     "> Return the counts for each sex, including the NaN values."
891 | ]
892 | },
893 | {
894 | "cell_type": "code",
895 | "execution_count": null,
896 | "metadata": {
897 | "jupyter": {
898 | "outputs_hidden": false
899 | }
900 | },
901 | "outputs": [],
902 | "source": []
903 | },
904 | {
905 | "cell_type": "code",
906 | "execution_count": null,
907 | "metadata": {
908 | "jupyter": {
909 | "outputs_hidden": false
910 | }
911 | },
912 | "outputs": [],
913 | "source": "# !cat solutions/02_18.py"
914 | },
915 | {
916 | "cell_type": "markdown",
917 | "metadata": {},
918 | "source": [
919 | "To get the proportion instead of the count of these values, we have to pass the value `True` to the parameter **normalize**.\n",
920 | ">Return the proportion for each species."
921 | ]
922 | },
923 | {
924 | "cell_type": "code",
925 | "execution_count": null,
926 | "metadata": {
927 | "jupyter": {
928 | "outputs_hidden": false
929 | }
930 | },
931 | "outputs": [],
932 | "source": []
933 | },
934 | {
935 | "cell_type": "code",
936 | "execution_count": null,
937 | "metadata": {
938 | "jupyter": {
939 | "outputs_hidden": false
940 | }
941 | },
942 | "outputs": [],
943 | "source": "# !cat solutions/02_19.py"
944 | },
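The `dropna` and `normalize` parameters of `value_counts` in one standalone sketch (the sexes here are made up, not taken from the dataset):

```python
import pandas as pd

s = pd.Series(["male", "female", None, "female"])

print(s.value_counts())                # NaN excluded by default
print(s.value_counts(dropna=False))    # NaN counted too
print(s.value_counts(normalize=True))  # proportions instead of counts
```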
945 | {
946 | "cell_type": "markdown",
947 | "metadata": {},
948 | "source": [
949 |     ">Using the index attribute, get the indexes of the observations with a missing **flipper_length_mm**"
950 | ]
951 | },
952 | {
953 | "cell_type": "code",
954 | "execution_count": null,
955 | "metadata": {
956 | "jupyter": {
957 | "outputs_hidden": false
958 | }
959 | },
960 | "outputs": [],
961 | "source": []
962 | },
963 | {
964 | "cell_type": "code",
965 | "execution_count": null,
966 | "metadata": {
967 | "jupyter": {
968 | "outputs_hidden": false
969 | }
970 | },
971 | "outputs": [],
972 | "source": "# !cat solutions/02_20.py"
973 | },
974 | {
975 | "cell_type": "markdown",
976 | "metadata": {},
977 | "source": [
978 |     "Use the **[dropna](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dropna.html)** method to remove the rows which contain only NaN values.\n",
979 | ">Get the help for the dropna method."
980 | ]
981 | },
982 | {
983 | "cell_type": "code",
984 | "execution_count": null,
985 | "metadata": {
986 | "jupyter": {
987 | "outputs_hidden": false
988 | }
989 | },
990 | "outputs": [],
991 | "source": []
992 | },
993 | {
994 | "cell_type": "code",
995 | "execution_count": null,
996 | "metadata": {
997 | "jupyter": {
998 | "outputs_hidden": false
999 | },
1000 | "scrolled": true
1001 | },
1002 | "outputs": [],
1003 | "source": "# !cat solutions/02_21.py"
1004 | },
1005 | {
1006 | "cell_type": "markdown",
1007 | "metadata": {},
1008 | "source": [
1009 |     ">Use the dropna method to remove the rows of `df` where all of the values are NaN, and assign the result to `df_2`."
1010 | ]
1011 | },
1012 | {
1013 | "cell_type": "code",
1014 | "execution_count": null,
1015 | "metadata": {
1016 | "jupyter": {
1017 | "outputs_hidden": false
1018 | }
1019 | },
1020 | "outputs": [],
1021 | "source": []
1022 | },
1023 | {
1024 | "cell_type": "code",
1025 | "execution_count": null,
1026 | "metadata": {
1027 | "jupyter": {
1028 | "outputs_hidden": false
1029 | }
1030 | },
1031 | "outputs": [],
1032 | "source": "# !cat solutions/02_22.py"
1033 | },
1034 | {
1035 | "cell_type": "markdown",
1036 | "metadata": {},
1037 | "source": [
1038 |     "We can use an f-string to format a string. Write an `f` before the opening quotation mark, and put whatever you want to format between curly brackets."
1039 | ]
1040 | },
1041 | {
1042 | "cell_type": "code",
1043 | "execution_count": null,
1044 | "metadata": {
1045 | "jupyter": {
1046 | "outputs_hidden": false
1047 | }
1048 | },
1049 | "outputs": [],
1050 | "source": [
1051 | "print(f'shape of df: {df.shape}')"
1052 | ]
1053 | },
1054 | {
1055 | "cell_type": "markdown",
1056 | "metadata": {},
1057 | "source": [
1058 |     "> Print the number of rows of `df_2` using an f-string. Did we lose any rows between `df` and `df_2`? If not, why not?"
1059 | ]
1060 | },
1061 | {
1062 | "cell_type": "code",
1063 | "execution_count": null,
1064 | "metadata": {
1065 | "jupyter": {
1066 | "outputs_hidden": false
1067 | }
1068 | },
1069 | "outputs": [],
1070 | "source": []
1071 | },
1072 | {
1073 | "cell_type": "code",
1074 | "execution_count": null,
1075 | "metadata": {
1076 | "jupyter": {
1077 | "outputs_hidden": false
1078 | }
1079 | },
1080 | "outputs": [],
1081 | "source": "# !cat solutions/02_23.py"
1082 | },
1083 | {
1084 | "cell_type": "markdown",
1085 | "metadata": {},
1086 | "source": [
1087 |     ">Use the dropna method to remove the rows of `df_2` which contain any NaN values, and assign the result to `df_3`"
1088 | ]
1089 | },
1090 | {
1091 | "cell_type": "code",
1092 | "execution_count": null,
1093 | "metadata": {
1094 | "jupyter": {
1095 | "outputs_hidden": false
1096 | }
1097 | },
1098 | "outputs": [],
1099 | "source": "# !cat solutions/02_24.py"
1100 | },
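The difference between the two `dropna` exercises above comes down to the `how` parameter; a sketch with made-up values:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, np.nan, np.nan],
    "b": [2.0, np.nan, 3.0],
})

print(len(df.dropna(how="all")))  # 2: only the all-NaN row is dropped
print(len(df.dropna(how="any")))  # 1: any row containing a NaN is dropped
```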
1101 | {
1102 | "cell_type": "markdown",
1103 | "metadata": {},
1104 | "source": [
1105 |     ">Print the number of rows of `df_3` using an f-string."
1106 | ]
1107 | },
1108 | {
1109 | "cell_type": "code",
1110 | "execution_count": null,
1111 | "metadata": {
1112 | "jupyter": {
1113 | "outputs_hidden": false
1114 | }
1115 | },
1116 | "outputs": [],
1117 | "source": []
1118 | },
1119 | {
1120 | "cell_type": "code",
1121 | "execution_count": null,
1122 | "metadata": {
1123 | "jupyter": {
1124 | "outputs_hidden": false
1125 | }
1126 | },
1127 | "outputs": [],
1128 | "source": "# !cat solutions/02_25.py"
1129 | },
1130 | {
1131 | "cell_type": "markdown",
1132 | "metadata": {},
1133 | "source": [
1134 | "---\n",
1135 | "\n",
1136 | "\n",
1137 | "Duplicates\n",
1138 |     "\n",
1139 | ""
1140 | ]
1141 | },
1142 | {
1143 | "cell_type": "markdown",
1144 | "metadata": {},
1145 | "source": [
1146 |     ">Remove the duplicate rows from `df_3`, and assign the new dataframe to `df_4`"
1147 | ]
1148 | },
1149 | {
1150 | "cell_type": "code",
1151 | "execution_count": null,
1152 | "metadata": {
1153 | "jupyter": {
1154 | "outputs_hidden": false
1155 | }
1156 | },
1157 | "outputs": [],
1158 | "source": []
1159 | },
1160 | {
1161 | "cell_type": "code",
1162 | "execution_count": null,
1163 | "metadata": {
1164 | "jupyter": {
1165 | "outputs_hidden": false
1166 | },
1167 | "scrolled": true
1168 | },
1169 | "outputs": [],
1170 | "source": "# !cat solutions/02_26.py"
1171 | },
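Dropping duplicates can be sketched on a tiny made-up DataFrame (the method involved, assuming the usual pandas approach, is `drop_duplicates`):

```python
import pandas as pd

df = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Gentoo"],
    "body_mass_g": [3750, 3750, 4875],
})

# Rows are duplicates only when ALL column values match.
deduped = df.drop_duplicates()
print(len(deduped))  # 2: the repeated (Adelie, 3750) row is removed
```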
1172 | {
1173 | "cell_type": "code",
1174 | "execution_count": null,
1175 | "metadata": {
1176 | "jupyter": {
1177 | "outputs_hidden": false
1178 | }
1179 | },
1180 | "outputs": [],
1181 | "source": [
1182 | "# checking the shape of df_4\n",
1183 | "df_4.shape"
1184 | ]
1185 | },
1186 | {
1187 | "cell_type": "markdown",
1188 | "metadata": {},
1189 | "source": [
1190 | "You should see that 4 rows have been dropped. "
1191 | ]
1192 | },
1193 | {
1194 | "cell_type": "markdown",
1195 | "metadata": {},
1196 | "source": [
1197 | "---\n",
1198 | "\n",
1199 | "\n",
1200 | "Some stats\n",
1201 |     "\n",
1202 | ""
1203 | ]
1204 | },
1205 | {
1206 | "cell_type": "markdown",
1207 | "metadata": {},
1208 | "source": [
1209 | ">Use the describe method to see how the data is distributed (numerical features only!)"
1210 | ]
1211 | },
1212 | {
1213 | "cell_type": "code",
1214 | "execution_count": null,
1215 | "metadata": {
1216 | "jupyter": {
1217 | "outputs_hidden": false
1218 | }
1219 | },
1220 | "outputs": [],
1221 | "source": []
1222 | },
1223 | {
1224 | "cell_type": "code",
1225 | "execution_count": null,
1226 | "metadata": {
1227 | "jupyter": {
1228 | "outputs_hidden": false
1229 | }
1230 | },
1231 | "outputs": [],
1232 | "source": "# !cat solutions/02_27.py"
1233 | },
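`describe` and the individual statistics can be sketched on a small made-up DataFrame; note how the non-numeric column is skipped:

```python
import pandas as pd

df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0], "label": list("abcd")})

stats = df.describe()  # numerical columns only by default
print(stats)

print(df["x"].mean())  # 2.5
print(df["x"].max())   # 4.0
```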
1234 | {
1235 | "cell_type": "markdown",
1236 | "metadata": {},
1237 | "source": [
1238 |     "We can also change the **species** column to the category dtype to save memory. Note: You may receive a **SettingWithCopyWarning** - you can safely ignore this warning for this notebook."
1239 | ]
1240 | },
1241 | {
1242 | "cell_type": "code",
1243 | "execution_count": null,
1244 | "metadata": {
1245 | "jupyter": {
1246 | "outputs_hidden": false
1247 | }
1248 | },
1249 | "outputs": [],
1250 | "source": [
1251 | "df_4['species'] = df_4['species'].astype('category')"
1252 | ]
1253 | },
1254 | {
1255 | "cell_type": "markdown",
1256 | "metadata": {},
1257 | "source": [
1258 | ">Using the dtypes attribute, check the types of the columns of `df_4`"
1259 | ]
1260 | },
1261 | {
1262 | "cell_type": "code",
1263 | "execution_count": null,
1264 | "metadata": {
1265 | "jupyter": {
1266 | "outputs_hidden": false
1267 | }
1268 | },
1269 | "outputs": [],
1270 | "source": []
1271 | },
1272 | {
1273 | "cell_type": "code",
1274 | "execution_count": null,
1275 | "metadata": {
1276 | "jupyter": {
1277 | "outputs_hidden": false
1278 | }
1279 | },
1280 | "outputs": [],
1281 | "source": "# !cat solutions/02_28.py"
1282 | },
1283 | {
1284 | "cell_type": "markdown",
1285 | "metadata": {},
1286 | "source": [
1287 | "We can also use the methods count(), mean(), sum(), median(), std(), min() and max() individually if we are only interested in one of these statistics."
1288 | ]
1289 | },
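{
"cell_type": "markdown",
"metadata": {},
"source": [
"For example (just an illustration; `numeric_only=True` assumes a reasonably recent pandas):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# mean of each numerical column\n",
"df_4.mean(numeric_only=True)"
]
},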
1290 | {
1291 | "cell_type": "markdown",
1292 | "metadata": {},
1293 | "source": [
1294 | ">Get the minimum for each numerical column of `df_4`"
1295 | ]
1296 | },
1297 | {
1298 | "cell_type": "code",
1299 | "execution_count": null,
1300 | "metadata": {
1301 | "jupyter": {
1302 | "outputs_hidden": false
1303 | }
1304 | },
1305 | "outputs": [],
1306 | "source": []
1307 | },
1308 | {
1309 | "cell_type": "code",
1310 | "execution_count": null,
1311 | "metadata": {
1312 | "jupyter": {
1313 | "outputs_hidden": false
1314 | }
1315 | },
1316 | "outputs": [],
1317 | "source": "# !cat solutions/02_29.py"
1318 | },
1319 | {
1320 | "cell_type": "markdown",
1321 | "metadata": {},
1322 | "source": [
1323 | ">Calculate the maximum of the **flipper_length_mm**"
1324 | ]
1325 | },
1326 | {
1327 | "cell_type": "code",
1328 | "execution_count": null,
1329 | "metadata": {
1330 | "jupyter": {
1331 | "outputs_hidden": false
1332 | }
1333 | },
1334 | "outputs": [],
1335 | "source": []
1336 | },
1337 | {
1338 | "cell_type": "code",
1339 | "execution_count": null,
1340 | "metadata": {
1341 | "jupyter": {
1342 | "outputs_hidden": false
1343 | }
1344 | },
1345 | "outputs": [],
1346 | "source": "# !cat solutions/02_30.py"
1347 | },
1348 | {
1349 | "cell_type": "markdown",
1350 | "metadata": {},
1351 | "source": [
1352 | "We can also get information for each species using the `groupby` method.\n",
1353 | "\n",
1354 | "\n",
1355 | "> Get the median for each **species**."
1356 | ]
1357 | },
1358 | {
1359 | "cell_type": "code",
1360 | "execution_count": null,
1361 | "metadata": {
1362 | "jupyter": {
1363 | "outputs_hidden": false
1364 | }
1365 | },
1366 | "outputs": [],
1367 | "source": "# !cat solutions/02_31.py"
1368 | },
1369 | {
1370 | "cell_type": "markdown",
1371 | "metadata": {},
1372 | "source": [
1373 | "---\n",
1374 | "\n",
1375 | "\n",
1376 | "## Saving the dataframe as a csv file"
1379 | ]
1380 | },
1381 | {
1382 | "cell_type": "markdown",
1383 | "metadata": {},
1384 | "source": ">Save df_4 using this path: `'data/Penguins/my_penguins.csv'`"
1385 | },
1386 | {
1387 | "cell_type": "code",
1388 | "execution_count": null,
1389 | "metadata": {
1390 | "jupyter": {
1391 | "outputs_hidden": false
1392 | }
1393 | },
1394 | "outputs": [],
1395 | "source": []
1396 | },
1397 | {
1398 | "cell_type": "code",
1399 | "execution_count": null,
1400 | "metadata": {
1401 | "jupyter": {
1402 | "outputs_hidden": false
1403 | }
1404 | },
1405 | "outputs": [],
1406 | "source": "# !cat solutions/02_32.py\n"
1407 | }
1408 | ],
1409 | "metadata": {
1410 | "kernelspec": {
1411 | "display_name": "Python 3 (system-wide)",
1412 | "language": "python",
1413 | "name": "python3"
1414 | },
1415 | "language_info": {
1416 | "codemirror_mode": {
1417 | "name": "ipython",
1418 | "version": 3
1419 | },
1420 | "file_extension": ".py",
1421 | "mimetype": "text/x-python",
1422 | "name": "python",
1423 | "nbconvert_exporter": "python",
1424 | "pygments_lexer": "ipython3",
1425 | "version": "3.8.5"
1426 | }
1427 | },
1428 | "nbformat": 4,
1429 | "nbformat_minor": 4
1430 | }
1431 |
--------------------------------------------------------------------------------
/3.1 Visualization with Matplotlib.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
11 | "# Data visualization with matplotlib"
14 | ]
15 | },
21 | {
22 | "cell_type": "markdown",
23 | "metadata": {},
24 | "source": [
26 | "## Import pyplot from matplotlib (and pandas)"
29 | ]
30 | },
31 | {
32 | "cell_type": "markdown",
33 | "metadata": {},
34 | "source": [
35 | "According to the [official documentation](https://matplotlib.org/gallery/index.html):\n",
36 | "\n",
37 | "`matplotlib.pyplot` is a collection of command style functions that make Matplotlib work like MATLAB. Each pyplot function makes some change to a figure: e.g., creates a figure, creates a plotting area in a figure, plots some lines in a plotting area, decorates the plot with labels, etc.\n",
38 | "\n",
39 | "`pyplot` is mainly intended for interactive plots and simple cases of programmatic plot generation."
40 | ]
41 | },
49 | {
50 | "cell_type": "code",
51 | "execution_count": null,
52 | "metadata": {
53 | "collapsed": false,
54 | "jupyter": {
55 | "outputs_hidden": false
56 | }
57 | },
58 | "outputs": [],
59 | "source": [
60 | "%matplotlib inline\n",
61 | "# this is for ipython interpreter to show the plot in Jupyter\n",
62 | "\n",
63 | "import pandas as pd\n",
64 | "import matplotlib.pyplot as plt"
65 | ]
66 | },
67 | {
68 | "cell_type": "markdown",
69 | "metadata": {},
70 | "source": [
71 | "### Import the data again: read the csv file into a pandas DataFrame and assign it to df."
72 | ]
73 | },
74 | {
75 | "cell_type": "code",
76 | "execution_count": null,
77 | "metadata": {
78 | "collapsed": false,
79 | "jupyter": {
80 | "outputs_hidden": false
81 | }
82 | },
83 | "outputs": [],
84 | "source": "df = pd.read_csv('data/Penguins/penguins_clean.csv')"
85 | },
86 | {
87 | "cell_type": "markdown",
88 | "metadata": {},
89 | "source": [
90 | "### Refresh our memory about what the data looks like"
91 | ]
92 | },
93 | {
94 | "cell_type": "code",
95 | "execution_count": null,
96 | "metadata": {
97 | "collapsed": false,
98 | "jupyter": {
99 | "outputs_hidden": false
100 | }
101 | },
102 | "outputs": [],
103 | "source": [
104 | "df.head()"
105 | ]
106 | },
107 | {
108 | "cell_type": "markdown",
109 | "metadata": {},
110 | "source": [
111 | "### Using DataFrame.plot() in pandas\n",
112 | "\n",
113 | "The pandas DataFrame object has a `plot()` method which provides basic plots of different kinds, including 'line', 'bar', 'hist', 'box', etc. You can also set parameters to control the layout and labels of the plot.\n",
114 | "\n",
115 | "`plot()` uses `matplotlib.pyplot` in the background, which makes plotting data in a DataFrame much easier.\n",
116 | "\n",
117 | "You will find this page very helpful:\n",
118 | "https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.plot.html"
119 | ]
120 | },
121 | {
122 | "cell_type": "markdown",
123 | "metadata": {},
124 | "source": [
125 | "#### Example: Box plot in general"
126 | ]
127 | },
128 | {
129 | "cell_type": "code",
130 | "execution_count": null,
131 | "metadata": {
132 | "collapsed": false,
133 | "jupyter": {
134 | "outputs_hidden": false
135 | }
136 | },
137 | "outputs": [],
138 | "source": [
139 | "df.plot(kind='box')"
140 | ]
141 | },
142 | {
143 | "cell_type": "markdown",
144 | "metadata": {},
145 | "source": [
146 | "The scales of our data don't align particularly well. So for the sake of plotting, we'll ignore the body mass of the penguins."
147 | ]
148 | },
149 | {
150 | "cell_type": "code",
151 | "execution_count": null,
152 | "metadata": {},
153 | "outputs": [],
154 | "source": [
155 | "df.drop([\"body_mass_g\"], axis=1).plot(kind='box')"
156 | ]
157 | },
158 | {
159 | "cell_type": "markdown",
160 | "metadata": {},
161 | "source": [
162 | "#### Better presentation: figure size, add title and legend"
163 | ]
164 | },
165 | {
166 | "cell_type": "code",
167 | "execution_count": null,
168 | "metadata": {
169 | "collapsed": false,
170 | "jupyter": {
171 | "outputs_hidden": false
172 | }
173 | },
174 | "outputs": [],
175 | "source": [
176 | "df.drop([\"body_mass_g\"], axis=1).plot(kind='box', figsize=(10,8), title='Box plot of different measurements of species of penguin', legend=True)"
177 | ]
178 | },
179 | {
180 | "cell_type": "markdown",
181 | "metadata": {},
182 | "source": [
183 | "#### Making subplots"
184 | ]
185 | },
186 | {
187 | "cell_type": "code",
188 | "execution_count": null,
189 | "metadata": {
190 | "collapsed": false,
191 | "jupyter": {
192 | "outputs_hidden": false
193 | }
194 | },
195 | "outputs": [],
196 | "source": [
197 | "df.plot(kind='box',\n",
198 | " subplots=True, layout=(2,2),\n",
199 | " figsize=(10,8), title='Box plot of different measurements of species of penguin', legend=True)"
200 | ]
201 | },
202 | {
203 | "cell_type": "markdown",
204 | "metadata": {},
205 | "source": [
206 | "---\n",
207 | "\n",
208 | "\n",
209 | "## Exercise: Compare bill length of different species of penguin"
212 | ]
213 | },
214 | {
215 | "cell_type": "markdown",
216 | "metadata": {},
217 | "source": [
218 | "Let's use a box plot to compare the bill length of the different species of penguin. We need the DataFrame in a slightly different shape so that we can compare the species: we would like to pivot the data so that each column holds the bill length of one species of penguin."
219 | ]
220 | },
221 | {
222 | "cell_type": "markdown",
223 | "metadata": {},
224 | "source": [
225 | "#### Prepare the data set"
226 | ]
227 | },
228 | {
229 | "cell_type": "code",
230 | "execution_count": null,
231 | "metadata": {
232 | "collapsed": false,
233 | "jupyter": {
234 | "outputs_hidden": false
235 | }
236 | },
237 | "outputs": [],
238 | "source": [
239 | "df_pivot = df.pivot(columns='species', values='bill_length_mm')\n",
240 | "# tell the pivot() method to make the 'species' as columns, and using the 'bill_length_mm' as the value"
241 | ]
242 | },
243 | {
244 | "cell_type": "code",
245 | "execution_count": null,
246 | "metadata": {
247 | "collapsed": false,
248 | "jupyter": {
249 | "outputs_hidden": false
250 | }
251 | },
252 | "outputs": [],
253 | "source": [
254 | "df_pivot.sample(10)"
255 | ]
256 | },
257 | {
258 | "cell_type": "markdown",
259 | "metadata": {},
260 | "source": [
261 | "#### Box plot of df_pivot\n",
262 | "\n",
263 | "Now we can use `plot()` on `df_pivot`. To make a box plot, remember to set the parameter `kind` to 'box'. Also make the presentation nice by setting a good `figsize` and with a good `title`. Don't forget the `legend`."
264 | ]
265 | },
266 | {
267 | "cell_type": "code",
268 | "execution_count": null,
269 | "metadata": {
270 | "collapsed": false,
271 | "jupyter": {
272 | "outputs_hidden": false
273 | }
274 | },
275 | "outputs": [],
276 | "source": []
277 | },
278 | {
279 | "cell_type": "markdown",
280 | "metadata": {},
281 | "source": [
282 | "#### Additional exercise\n",
283 | "\n",
284 | "Challenge yourself by making your own `df_pivot`, pivoting on a different measure (e.g. Body Mass). Also try using a histogram (hist) instead of a box plot. You can also try making a plot with 3 subplots, each a histogram for one species of penguin."
285 | ]
286 | },
287 | {
288 | "cell_type": "code",
289 | "execution_count": null,
290 | "metadata": {
291 | "collapsed": false,
292 | "jupyter": {
293 | "outputs_hidden": false
294 | }
295 | },
296 | "outputs": [],
297 | "source": []
298 | },
299 | {
300 | "cell_type": "markdown",
301 | "metadata": {},
302 | "source": [
303 | "So far we have not used `matplotlib.pyplot` directly. Although it is very convenient to use `df.plot()`, sometimes we would like more control over what we are plotting and to make more complex graphs. In the following sections, we will use `matplotlib.pyplot` (which is imported as `plt` now) directly."
304 | ]
305 | },
306 | {
307 | "cell_type": "markdown",
308 | "metadata": {},
309 | "source": [
310 | "### Divide the data into the 3 species"
311 | ]
312 | },
313 | {
314 | "cell_type": "code",
315 | "execution_count": null,
316 | "metadata": {
317 | "collapsed": false,
318 | "jupyter": {
319 | "outputs_hidden": false
320 | }
321 | },
322 | "outputs": [],
323 | "source": [
324 | "df['species'].unique()"
325 | ]
326 | },
327 | {
328 | "cell_type": "code",
329 | "execution_count": null,
330 | "metadata": {
331 | "collapsed": false,
332 | "jupyter": {
333 | "outputs_hidden": false
334 | }
335 | },
336 | "outputs": [],
337 | "source": [
338 | "df_adelie = df[df['species'] == 'Adelie']"
339 | ]
340 | },
341 | {
342 | "cell_type": "code",
343 | "execution_count": null,
344 | "metadata": {
345 | "collapsed": false,
346 | "jupyter": {
347 | "outputs_hidden": false
348 | }
349 | },
350 | "outputs": [],
351 | "source": [
352 | "df_chinstrap = df[df['species'] == 'Chinstrap']"
353 | ]
354 | },
355 | {
356 | "cell_type": "code",
357 | "execution_count": null,
358 | "metadata": {
359 | "collapsed": false,
360 | "jupyter": {
361 | "outputs_hidden": false
362 | }
363 | },
364 | "outputs": [],
365 | "source": [
366 | "df_gentoo = df[df['species'] == 'Gentoo']"
367 | ]
368 | },
369 | {
370 | "cell_type": "markdown",
371 | "metadata": {},
372 | "source": [
373 | "### Scatter plot example: plot on Bill Length and Width"
374 | ]
375 | },
376 | {
377 | "cell_type": "code",
378 | "execution_count": null,
379 | "metadata": {
380 | "collapsed": false,
381 | "jupyter": {
382 | "outputs_hidden": false
383 | }
384 | },
385 | "outputs": [],
386 | "source": [
387 | "plt.scatter(df_adelie['bill_length_mm'], df_adelie['bill_depth_mm'], c='r')\n",
388 | "plt.scatter(df_chinstrap['bill_length_mm'], df_chinstrap['bill_depth_mm'], c='g')\n",
389 | "plt.scatter(df_gentoo['bill_length_mm'], df_gentoo['bill_depth_mm'], c='b')"
390 | ]
391 | },
392 | {
393 | "cell_type": "markdown",
394 | "metadata": {},
395 | "source": [
396 | "#### Better presentation: figure size, add labels and legend"
397 | ]
398 | },
399 | {
400 | "cell_type": "code",
401 | "execution_count": null,
402 | "metadata": {
403 | "collapsed": false,
404 | "jupyter": {
405 | "outputs_hidden": false
406 | }
407 | },
408 | "outputs": [],
409 | "source": [
410 | "plt.figure(figsize=(10,8)) # set the size of the plot\n",
411 | "\n",
412 | "plt.scatter(df_adelie['bill_length_mm'], df_adelie['bill_depth_mm'], c='r')\n",
413 | "plt.scatter(df_chinstrap['bill_length_mm'], df_chinstrap['bill_depth_mm'], c='g')\n",
414 | "plt.scatter(df_gentoo['bill_length_mm'], df_gentoo['bill_depth_mm'], c='b')\n",
415 | "\n",
416 | "ax = plt.gca() # gca() gets the current axes so the rest of the code can reference the plot we made\n",
417 | "\n",
418 | "ax.set_xlabel('Bill Length (mm)')\n",
419 | "ax.set_ylabel('Bill Width (mm)')\n",
420 | "ax.set_title('Bill Length and Width for Different Species of Penguin')\n",
421 | "\n",
422 | "ax.legend(('adelie', 'chinstrap', 'gentoo'))"
423 | ]
424 | },
425 | {
426 | "cell_type": "markdown",
427 | "metadata": {},
428 | "source": [
429 | "### Scatter plot exercise: plot on Flipper Length and Body Mass\n",
430 | "\n",
431 | "Now it's your turn to make your own plot. Make sure you have also set the labels and legend."
432 | ]
433 | },
434 | {
435 | "cell_type": "code",
436 | "execution_count": null,
437 | "metadata": {
438 | "collapsed": false,
439 | "jupyter": {
440 | "outputs_hidden": false
441 | }
442 | },
443 | "outputs": [],
444 | "source": []
445 | },
446 | {
447 | "cell_type": "markdown",
448 | "metadata": {},
449 | "source": [
450 | "### Histogram example: plot on Bill Length"
451 | ]
452 | },
453 | {
454 | "cell_type": "code",
455 | "execution_count": null,
456 | "metadata": {
457 | "collapsed": false,
458 | "jupyter": {
459 | "outputs_hidden": false
460 | }
461 | },
462 | "outputs": [],
463 | "source": [
464 | "plt.figure(figsize=(10,8))\n",
465 | "\n",
466 | "plt.hist(df_adelie['bill_length_mm'], color='r', alpha=.5) # alpha sets the transparency of the plot\n",
467 | "plt.hist(df_chinstrap['bill_length_mm'], color='g', alpha=.5)\n",
468 | "plt.hist(df_gentoo['bill_length_mm'], color='b', alpha=.5)\n",
469 | "\n",
470 | "ax = plt.gca()\n",
471 | "\n",
472 | "ax.set_xlabel('Bill Length (mm)')\n",
473 | "ax.set_title('Histogram of Bill Length for Different Species of Penguin')\n",
474 | "\n",
475 | "ax.legend(('adelie', 'chinstrap', 'gentoo'))"
476 | ]
477 | },
478 | {
479 | "cell_type": "markdown",
480 | "metadata": {},
481 | "source": [
482 | "### Histogram exercise: plot on Body Mass\n",
483 | "\n",
484 | "Now it's your turn to make your own plot. Make sure you set alpha to a suitable value and have the right labels and legend."
485 | ]
486 | },
487 | {
488 | "cell_type": "code",
489 | "execution_count": null,
490 | "metadata": {
491 | "collapsed": false,
492 | "jupyter": {
493 | "outputs_hidden": false
494 | }
495 | },
496 | "outputs": [],
497 | "source": []
498 | },
499 | {
500 | "cell_type": "markdown",
501 | "metadata": {},
502 | "source": [
503 | "### Making subplots example\n",
504 | "\n",
505 | "Making subplots with plain `plt` is a bit more complicated. It is considered more advanced and requires some understanding of the building blocks of a plot. Don't feel bad if you find it challenging; you can always follow the example and try it yourself to understand more of what is going on.\n",
506 | "\n",
507 | "The example below plots the histograms of Bill Length and Bill Width side by side."
508 | ]
509 | },
510 | {
511 | "cell_type": "code",
512 | "execution_count": null,
513 | "metadata": {
514 | "collapsed": false,
515 | "jupyter": {
516 | "outputs_hidden": false
517 | }
518 | },
519 | "outputs": [],
520 | "source": [
521 | "# First, we have to decide how many subplots we want and how they are oriented,\n",
522 | "# say we want them side by side (i.e. 1 row, 2 columns)\n",
523 | "\n",
524 | "fig, (ax0, ax1) = plt.subplots(nrows=1, ncols=2, figsize=(15,8))\n",
525 | "\n",
526 | "# this will create a figure object (which is the whole plot area)\n",
527 | "# and 2 axes (which are the 2 subplots labeled ax0 and ax1)\n",
528 | "\n",
529 | "# Now we can put plots in them accordingly\n",
530 | "\n",
531 | "### for ax0 ###\n",
532 | "\n",
533 | "ax0.hist(df_adelie['bill_length_mm'], color='r', alpha=.5) \n",
534 | "ax0.hist(df_chinstrap['bill_length_mm'], color='g', alpha=.5)\n",
535 | "ax0.hist(df_gentoo['bill_length_mm'], color='b', alpha=.5)\n",
536 | "\n",
537 | "ax0.set_xlabel('Bill Length (mm)')\n",
538 | "ax0.set_title('Histogram of Bill Length for Different Species of Penguin')\n",
539 | "\n",
540 | "ax0.legend(('adelie', 'chinstrap', 'gentoo'))\n",
541 | "\n",
542 | "### for ax1 ###\n",
543 | "\n",
544 | "ax1.hist(df_adelie['bill_depth_mm'], color='r', alpha=.5) \n",
545 | "ax1.hist(df_chinstrap['bill_depth_mm'], color='g', alpha=.5)\n",
546 | "ax1.hist(df_gentoo['bill_depth_mm'], color='b', alpha=.5)\n",
547 | "\n",
548 | "ax1.set_xlabel('Bill Width (mm)')\n",
549 | "ax1.set_title('Histogram of Bill Width for Different Species of Penguin')\n",
550 | "\n",
551 | "ax1.legend(('adelie', 'chinstrap', 'gentoo'))\n",
552 | "\n",
553 | "plt.show() # after building what we want for both axes, call plt.show() to display the figure"
554 | ]
555 | },
556 | {
557 | "cell_type": "markdown",
558 | "metadata": {},
559 | "source": [
560 | "---\n",
561 | "\n",
562 | "\n",
563 | "## Making subplots exercise"
566 | ]
567 | },
568 | {
569 | "cell_type": "markdown",
570 | "metadata": {},
571 | "source": [
572 | "Make 2 subplots, one on top of the other: scatter plots of Flipper Length and Body Mass (coloured by species of penguin). After you have done it, also try other orientations and plot types. See if you can make 4 subplots together. Always make sure the presentation is good."
573 | ]
574 | },
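{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you get stuck, here is a commented-out starting point (one possible approach, not the only one):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# fig, (ax0, ax1) = plt.subplots(nrows=2, ncols=1, figsize=(8, 12))\n",
"# ax0.scatter(df_adelie['flipper_length_mm'], df_adelie['body_mass_g'], c='r')\n",
"# ... repeat for the other species on ax0, build ax1 the same way,\n",
"# then set the labels, titles and legends and call plt.show()"
]
},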
575 | {
576 | "cell_type": "markdown",
577 | "metadata": {},
578 | "source": [
579 | "---\n",
580 | "\n",
581 | "\n",
582 | "## More matplotlib!"
585 | ]
586 | },
587 | {
588 | "cell_type": "markdown",
589 | "metadata": {},
590 | "source": [
591 | "Check out more examples of histograms with multiple data sets: https://matplotlib.org/gallery/statistics/histogram_multihist.html#sphx-glr-gallery-statistics-histogram-multihist-py\n",
592 | "\n",
593 | "Example: create histograms from a scatter plot and add them to the sides of the plot:\n",
594 | "https://matplotlib.org/gallery/lines_bars_and_markers/scatter_hist.html#sphx-glr-gallery-lines-bars-and-markers-scatter-hist-py\n",
595 | "\n",
596 | "There is a lot more to learn about matplotlib. It is a very powerful library, and you can always learn more by looking at the examples at: https://matplotlib.org/gallery/index.html\n",
597 | "\n",
598 | "Also, if you are stuck, always check the documentation: https://matplotlib.org/api/_as_gen/matplotlib.pyplot.html#module-matplotlib.pyplot\n",
599 | "\n",
600 | "\n"
601 | ]
602 | }
603 | ],
604 | "metadata": {
605 | "kernelspec": {
606 | "display_name": "Python 3 (ipykernel)",
607 | "language": "python",
608 | "name": "python3"
609 | },
610 | "language_info": {
611 | "codemirror_mode": {
612 | "name": "ipython",
613 | "version": 3
614 | },
615 | "file_extension": ".py",
616 | "mimetype": "text/x-python",
617 | "name": "python",
618 | "nbconvert_exporter": "python",
619 | "pygments_lexer": "ipython3",
620 | "version": "3.11.1"
621 | }
622 | },
623 | "nbformat": 4,
624 | "nbformat_minor": 4
625 | }
626 |
--------------------------------------------------------------------------------
/3.2 Visualization with Seaborn.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
11 | "# Data visualization with Seaborn"
14 | ]
15 | },
16 | {
17 | "cell_type": "markdown",
18 | "metadata": {},
19 | "source": [
20 | "## About seaborn\n",
21 | "Seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics, and is very powerful for visualizing categorical data."
24 | ]
25 | },
26 | {
27 | "cell_type": "markdown",
28 | "metadata": {},
29 | "source": [
30 | "We will be using the [Pokemon.csv](https://gist.github.com/armgilles/194bcff35001e7eb53a2a8b441e8b2c6). Let's have a look at the data:"
31 | ]
32 | },
33 | {
34 | "cell_type": "code",
35 | "execution_count": null,
36 | "metadata": {
37 | "jupyter": {
38 | "outputs_hidden": false
39 | }
40 | },
41 | "outputs": [],
42 | "source": [
43 | "import pandas as pd\n",
44 | "\n",
45 | "pokemon_df = pd.read_csv('data/Pokemon/pokemon.csv', index_col=0)\n",
46 | "pokemon_df.head(10)"
47 | ]
48 | },
49 | {
50 | "cell_type": "markdown",
51 | "metadata": {},
52 | "source": [
53 | "---\n",
54 | "\n",
55 | "\n",
56 | "## Categorical scatterplots"
59 | ]
60 | },
61 | {
62 | "cell_type": "markdown",
63 | "metadata": {},
64 | "source": [
65 | "For example, we want to compare the Attack of different types of Pokemon, to see if any type is generally more powerful than the others:"
66 | ]
67 | },
68 | {
69 | "cell_type": "code",
70 | "execution_count": null,
71 | "metadata": {
72 | "jupyter": {
73 | "outputs_hidden": false
74 | }
75 | },
76 | "outputs": [],
77 | "source": [
78 | "import seaborn as sns\n",
79 | "import matplotlib.pyplot as plt\n",
80 | "\n",
81 | "sns.catplot(x=\"Type 1\", y=\"Attack\", data=pokemon_df);"
82 | ]
83 | },
84 | {
85 | "cell_type": "markdown",
86 | "metadata": {},
87 | "source": [
88 | "When importing, we usually abbreviate 'seaborn' as 'sns'. (It's a [West Wing / Rob Lowe](https://en.wikipedia.org/wiki/Sam_Seaborn) reference!) Note that we also have to import matplotlib.pyplot, because Seaborn is a library that sits on top of matplotlib. We got a plot, but it is cramped and hard to read; let's add some configuration to make it nicer."
89 | ]
90 | },
91 | {
92 | "cell_type": "markdown",
93 | "metadata": {},
94 | "source": [
95 | "**Try: adding `aspect=2.5` as the last arguments in the following `sns.catplot`**"
96 | ]
97 | },
98 | {
99 | "cell_type": "code",
100 | "execution_count": null,
101 | "metadata": {
102 | "jupyter": {
103 | "outputs_hidden": false
104 | }
105 | },
106 | "outputs": [],
107 | "source": [
108 | "sns.catplot(x=\"Type 1\", y=\"Attack\", data=pokemon_df);"
109 | ]
110 | },
111 | {
112 | "cell_type": "markdown",
113 | "metadata": {},
114 | "source": [
115 | "So you can see that by adding 'aspect' we make the plot wider. The width of the plot is equal to 'aspect * height', so increasing 'aspect' increases the width of the plot. It is one of the configuration options we can add to the plot. For the whole list and their details, we can refer to the [official documentation](https://seaborn.pydata.org/generated/seaborn.catplot.html#seaborn.catplot), but here we will give an introduction to a few common ones."
116 | ]
117 | },
118 | {
119 | "cell_type": "markdown",
120 | "metadata": {},
121 | "source": [
122 | "For example, here we see that there is a random x-axis offset for all the points, so we can see them without the dots overlapping each other. This is done by the 'jitter' setting, which defaults to True. Let's turn it off and see how it looks:"
123 | ]
124 | },
125 | {
126 | "cell_type": "markdown",
127 | "metadata": {},
128 | "source": [
129 | "**Try: adding `jitter=False` as the last arguments in the following `sns.catplot`**"
130 | ]
131 | },
132 | {
133 | "cell_type": "code",
134 | "execution_count": null,
135 | "metadata": {
136 | "jupyter": {
137 | "outputs_hidden": false
138 | }
139 | },
140 | "outputs": [],
141 | "source": [
142 | "sns.catplot(x=\"Type 1\", y=\"Attack\", data=pokemon_df, aspect=2.5);"
143 | ]
144 | },
145 | {
146 | "cell_type": "markdown",
147 | "metadata": {},
148 | "source": [
149 | "So we now have a plot where the points are aligned according to their categories, without the x-axis offsets. Which one to use depends on whether the distribution of the values (e.g. Attack) is important. In our case, we want to know how Attack is distributed within each Type, so maybe it's good to have 'jitter' on, or even better, to spread the points out even more and show the distribution:"
150 | ]
151 | },
152 | {
153 | "cell_type": "markdown",
154 | "metadata": {},
155 | "source": [
156 | "**Try: adding `kind=\"swarm\"` as the last arguments in the following `sns.catplot`**"
157 | ]
158 | },
159 | {
160 | "cell_type": "code",
161 | "execution_count": null,
162 | "metadata": {
163 | "jupyter": {
164 | "outputs_hidden": false
165 | }
166 | },
167 | "outputs": [],
168 | "source": [
169 | "sns.catplot(x=\"Type 1\", y=\"Attack\", data=pokemon_df, aspect=2.5);"
170 | ]
171 | },
172 | {
173 | "cell_type": "markdown",
174 | "metadata": {},
175 | "source": [
176 | "Here we do it by setting 'kind' to 'swarm' so the points are not overlapping. The disadvantage is that this plot needs more space horizontally. Imagine we don't want to make the plot super wide due to the limitations of the page. We can turn it 90 degrees by swapping the x and the y; we also adjust the aspect and the height:"
177 | ]
178 | },
179 | {
180 | "cell_type": "markdown",
181 | "metadata": {},
182 | "source": [
183 | "**Try: swap `x` and `y`, and add `height=12, aspect=0.6, kind=\"swarm\"` in the arguments of the following `sns.catplot`**"
184 | ]
185 | },
186 | {
187 | "cell_type": "code",
188 | "execution_count": null,
189 | "metadata": {
190 | "jupyter": {
191 | "outputs_hidden": false
192 | }
193 | },
194 | "outputs": [],
195 | "source": [
196 | "sns.catplot(x=\"Type 1\", y=\"Attack\", data=pokemon_df);"
197 | ]
198 | },
199 | {
200 | "cell_type": "markdown",
201 | "metadata": {},
202 | "source": [
203 | "There are a few things we can observe so far:\n",
204 | "\n",
205 | "1. Some Types, like Psychic, have a very large range of Attack with a long tail at the end (i.e. some Psychic Types have very high Attack power while most do not).\n",
206 | "\n",
207 | "2. On the other hand, the Poison Types are mostly in the range of 40-110 Attack.\n",
208 | "\n",
209 | "3. In general, Dragon Types have more Attack power than Fairy Types, but there are two Fairy Types that have more Attack power."
210 | ]
211 | },
212 | {
213 | "cell_type": "markdown",
214 | "metadata": {},
215 | "source": [
216 | "However, we would like to look deeper. I have a theory that Legendary Pokemon are more powerful, so let's colour-code by 'Legendary' to see whether being Legendary or not has something to do with a Pokemon's Attack:"
217 | ]
218 | },
219 | {
220 | "cell_type": "markdown",
221 | "metadata": {},
222 | "source": [
223 | "**Try: adding `hue=\"Legendary\"` as the last arguments in the following `sns.catplot`**"
224 | ]
225 | },
226 | {
227 | "cell_type": "code",
228 | "execution_count": null,
229 | "metadata": {
230 | "jupyter": {
231 | "outputs_hidden": false
232 | }
233 | },
234 | "outputs": [],
235 | "source": [
236 | "sns.catplot(x=\"Attack\", y=\"Type 1\", data=pokemon_df, height=12, aspect=0.6, kind=\"swarm\")"
238 | ]
239 | },
240 | {
241 | "cell_type": "markdown",
242 | "metadata": {},
243 | "source": [
244 | "Ah ha! We see that many of the Psychic Types with higher Attack than the others are actually Legendary Pokemon. The same happens with the Ground and Flying Types."
245 | ]
246 | },
247 | {
248 | "cell_type": "markdown",
249 | "metadata": {},
250 | "source": [
251 | "### Exercise\n",
252 | "Now it's your turn to do some analysis. Pick a property of the Pokemon: HP, Defense, Sp. Atk, Sp. Def or Speed, and do a similar analysis to the one above to see if you can find any interesting facts about Pokemon."
253 | ]
254 | },
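{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you need a starting point, here is a commented-out sketch (one possible approach; replace \"Defense\" with the property you picked):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# sns.catplot(x=\"Defense\", y=\"Type 1\", data=pokemon_df, height=12, aspect=0.6, kind=\"swarm\", hue=\"Legendary\")"
]
},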
255 | {
256 | "cell_type": "markdown",
257 | "metadata": {},
258 | "source": [
259 | "---\n",
260 | "\n",
261 | "\n",
262 | "## Building structured multi-plot grids"
265 | ]
266 | },
267 | {
268 | "cell_type": "markdown",
269 | "metadata": {},
270 | "source": [
271 | "Sometimes we would like to have multiple plots in one figure for comparison. One way to do this in seaborn is to use FacetGrid. The FacetGrid class is useful when you want to visualize the distribution of a variable, or the relationship between multiple variables, separately within subsets of your dataset. In the following, we will use FacetGrid to see if there is a difference in our analysis above across different Generations."
272 | ]
273 | },
274 | {
275 | "cell_type": "markdown",
276 | "metadata": {},
277 | "source": [
278 | "To make a FacetGrid, we can do the following:"
279 | ]
280 | },
281 | {
282 | "cell_type": "code",
283 | "execution_count": null,
284 | "metadata": {
285 | "jupyter": {
286 | "outputs_hidden": false
287 | }
288 | },
289 | "outputs": [],
290 | "source": [
291 | "g = sns.FacetGrid(pokemon_df, col=\"Generation\")"
292 | ]
293 | },
294 | {
295 | "cell_type": "markdown",
296 | "metadata": {},
297 | "source": [
298 | "Look, we have 6 plot areas, matching the number of different Generations that we have\n",
299 | "(we can check what the different Generations are like this):"
300 | ]
301 | },
302 | {
303 | "cell_type": "code",
304 | "execution_count": null,
305 | "metadata": {
306 | "jupyter": {
307 | "outputs_hidden": false
308 | }
309 | },
310 | "outputs": [],
311 | "source": [
312 | "pokemon_df[\"Generation\"].unique()"
313 | ]
314 | },
315 | {
316 | "cell_type": "markdown",
317 | "metadata": {},
318 | "source": [
319 |     "However, we would like to have the plots aligned vertically rather than horizontally."
320 | ]
321 | },
322 | {
323 | "cell_type": "markdown",
324 | "metadata": {},
325 | "source": [
326 | "**Try: replace `col` with `row` in the following `sns.FacetGrid`**"
327 | ]
328 | },
329 | {
330 | "cell_type": "code",
331 | "execution_count": null,
332 | "metadata": {
333 | "jupyter": {
334 | "outputs_hidden": false
335 | }
336 | },
337 | "outputs": [],
338 | "source": [
339 | "g = sns.FacetGrid(pokemon_df, col=\"Generation\")"
340 | ]
341 | },
342 | {
343 | "cell_type": "markdown",
344 | "metadata": {},
345 | "source": [
346 |     "OK, now that we have the layout, how do we put the plots in? For some plots, this can be done with the [FacetGrid.map()](https://seaborn.pydata.org/generated/seaborn.FacetGrid.map.html#seaborn.FacetGrid.map) method; for example, using sns.countplot to count how many Pokemon are in each type:"
347 | ]
348 | },
349 | {
350 | "cell_type": "code",
351 | "execution_count": null,
352 | "metadata": {
353 | "jupyter": {
354 | "outputs_hidden": false
355 | }
356 | },
357 | "outputs": [],
358 | "source": [
359 | "g = sns.FacetGrid(pokemon_df, row=\"Generation\", aspect=3.5)\n",
360 | "g.map(sns.countplot, \"Type 1\");"
361 | ]
362 | },
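   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "If one row per facet makes the figure too tall, the `col_wrap` parameter keeps column facets but wraps them onto multiple lines (a small illustration, reusing `pokemon_df` from above):"
    ]
   },
   {
    "cell_type": "code",
    "execution_count": null,
    "metadata": {},
    "outputs": [],
    "source": [
     "g = sns.FacetGrid(pokemon_df, col=\"Generation\", col_wrap=3)"
    ]
   },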
363 | {
364 | "cell_type": "markdown",
365 | "metadata": {},
366 | "source": [
367 |     "But with the sns.catplot function that we used before, this is even simpler. As catplot already creates a FacetGrid, we can directly add the `row` or `col` setting to it."
368 | ]
369 | },
370 | {
371 | "cell_type": "markdown",
372 | "metadata": {},
373 | "source": [
374 |     "**Try: adding `row=\"Generation\"` as the last argument in the following `sns.catplot`**"
375 | ]
376 | },
377 | {
378 | "cell_type": "code",
379 | "execution_count": null,
380 | "metadata": {
381 | "jupyter": {
382 | "outputs_hidden": false
383 | }
384 | },
385 | "outputs": [],
386 | "source": [
387 |     "sns.catplot(x=\"Type 1\", y=\"Attack\", data=pokemon_df, hue=\"Legendary\", aspect=3.5)"
389 | ]
390 | },
391 | {
392 | "cell_type": "markdown",
393 | "metadata": {},
394 |    "source": "Now you see that in each generation, the Legendary Pokemon are outliers with super attack power compared with the others in their own generation. For details on using FacetGrid, see the official documentation: https://seaborn.pydata.org/tutorial/axis_grids.html\n"
395 | }
396 | ],
397 | "metadata": {
398 | "kernelspec": {
399 | "display_name": "Python 3 (system-wide)",
400 | "language": "python",
401 | "name": "python3"
402 | },
403 | "language_info": {
404 | "codemirror_mode": {
405 | "name": "ipython",
406 | "version": 3
407 | },
408 | "file_extension": ".py",
409 | "mimetype": "text/x-python",
410 | "name": "python",
411 | "nbconvert_exporter": "python",
412 | "pygments_lexer": "ipython3",
413 | "version": "3.8.5"
414 | }
415 | },
416 | "nbformat": 4,
417 | "nbformat_minor": 4
418 | }
419 |
--------------------------------------------------------------------------------
/4. More Python basics.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 |     "# More Python"
14 | ]
15 | },
16 | {
17 | "cell_type": "markdown",
18 | "metadata": {},
19 | "source": [
20 |     "***Note***: This notebook contains solution cells with (a) solution. Remember, there is not only one solution to a problem! \n",
21 |     "You will recognise these cells as they start with **# !cat**. \n",
22 |     "If you would like to see the solution, you will have to remove the **#** (which can be done with **Ctrl** + **/**) and run the cell. If you then want to run the solution code, copy it into a new cell and run that."
23 | ]
24 | },
25 | {
26 | "cell_type": "markdown",
27 | "metadata": {},
28 | "source": [
29 |     "---\n",
30 |     "\n",
31 |     "## Dictionaries"
35 | ]
36 | },
37 | {
38 | "cell_type": "markdown",
39 | "metadata": {},
40 | "source": [
41 |     "**A dictionary is formed of key-value pairs, separated by commas and enclosed in curly brackets ( {} ). \n",
42 |     "The key and the value are separated by a colon ( : ), i.e. key:value.**"
43 | ]
44 | },
45 | {
46 | "cell_type": "code",
47 | "execution_count": null,
48 | "metadata": {},
49 | "outputs": [],
50 | "source": [
51 | "dict_greeting = {'Namibia':'Hallo', 'France':'Bonjour', 'Spain':'Ola', 'UK':'Hello', 'Italy':'Ciao'}"
52 | ]
53 | },
54 | {
55 | "cell_type": "code",
56 | "execution_count": null,
57 | "metadata": {},
58 | "outputs": [],
59 | "source": [
60 |     "dict_greeting # dictionaries preserve insertion order (since Python 3.7)"
61 | ]
62 | },
63 | {
64 | "cell_type": "markdown",
65 | "metadata": {},
66 | "source": [
67 | "**We can access values using the keys between square brackets.** \n",
68 | ">Get the greeting from Italy (use square brackets)."
69 | ]
70 | },
71 | {
72 | "cell_type": "code",
73 | "execution_count": null,
74 | "metadata": {},
75 | "outputs": [],
76 | "source": []
77 | },
78 | {
79 | "cell_type": "code",
80 | "execution_count": null,
81 | "metadata": {},
82 | "outputs": [],
83 | "source": "# !cat solutions/04_01.py"
84 | },
85 | {
86 | "cell_type": "markdown",
87 | "metadata": {},
88 | "source": [
89 |     "**Keys are immutable (i.e. they can't be changed) but values can be updated.** \n",
90 | ">Replace the UK greeting with 'Good Morning'. \n",
91 | ">Print the dictionary."
92 | ]
93 | },
94 | {
95 | "cell_type": "code",
96 | "execution_count": null,
97 | "metadata": {},
98 | "outputs": [],
99 | "source": []
100 | },
101 | {
102 | "cell_type": "code",
103 | "execution_count": null,
104 | "metadata": {},
105 | "outputs": [],
106 | "source": "# !cat solutions/04_02.py"
107 | },
108 | {
109 | "cell_type": "markdown",
110 | "metadata": {},
111 | "source": [
112 | "**We can also add new key-value pairs.** \n",
113 | ">Add the greeting of 'Hawaii' as 'Aloha'"
114 | ]
115 | },
116 | {
117 | "cell_type": "code",
118 | "execution_count": null,
119 | "metadata": {},
120 | "outputs": [],
121 | "source": []
122 | },
123 | {
124 | "cell_type": "code",
125 | "execution_count": null,
126 | "metadata": {},
127 | "outputs": [],
128 | "source": "# !cat solutions/04_03.py"
129 | },
130 | {
131 | "cell_type": "markdown",
132 | "metadata": {},
133 | "source": [
134 |     "---\n",
135 |     "\n",
136 |     "## Sets"
140 | ]
141 | },
142 | {
143 | "cell_type": "markdown",
144 | "metadata": {},
145 | "source": [
146 |     "**A set is a collection of unique, unordered and unindexed elements.**"
147 | ]
148 | },
149 | {
150 | "cell_type": "code",
151 | "execution_count": null,
152 | "metadata": {},
153 | "outputs": [],
154 | "source": [
155 | "important_set = set(['me','myself', 'me', 'I'])\n",
156 | "important_set"
157 | ]
158 | },
159 | {
160 | "cell_type": "markdown",
161 | "metadata": {},
162 | "source": [
163 | "We can add an item to a set using the ***add*** method..."
164 | ]
165 | },
166 | {
167 | "cell_type": "code",
168 | "execution_count": null,
169 | "metadata": {},
170 | "outputs": [],
171 | "source": [
172 | "important_set.add('you')\n",
173 | "important_set"
174 | ]
175 | },
176 | {
177 | "cell_type": "markdown",
178 | "metadata": {},
179 | "source": [
180 | "...or multiple items using the ***update*** method."
181 | ]
182 | },
183 | {
184 | "cell_type": "code",
185 | "execution_count": null,
186 | "metadata": {},
187 | "outputs": [],
188 | "source": [
189 | "other_set = {'me', 'all of you', 'other people'}\n",
190 | "important_set.update(other_set)\n",
191 | "important_set"
192 | ]
193 | },
194 | {
195 | "cell_type": "markdown",
196 | "metadata": {},
197 | "source": [
198 |     "You can find the methods for sets [here](https://docs.python.org/3/library/stdtypes.html#set-types-set-frozenset). \n",
199 | "For example, you can get the intersection of two sets."
200 | ]
201 | },
202 | {
203 | "cell_type": "code",
204 | "execution_count": null,
205 | "metadata": {},
206 | "outputs": [],
207 | "source": [
208 | "set_1 = {3, 6, 9, 12, 15, 18, 21, 24, 27, 30}\n",
209 | "set_2 = {5, 10, 15, 20, 25, 30}\n",
210 | "set_3 = set_1.intersection(set_2)\n",
211 | "set_3"
212 | ]
213 | },
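   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "Similarly, ***union*** combines two sets and ***difference*** gives the elements of one set that are not in the other (a small illustration, reusing `set_1` and `set_2` from above):"
    ]
   },
   {
    "cell_type": "code",
    "execution_count": null,
    "metadata": {},
    "outputs": [],
    "source": [
     "print(set_1.union(set_2))\n",
     "print(set_1.difference(set_2))"
    ]
   },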
214 | {
215 | "cell_type": "markdown",
216 | "metadata": {},
217 | "source": [
218 |     "---\n",
219 |     "\n",
220 |     "## If - Elif - Else"
224 | ]
225 | },
226 | {
227 | "cell_type": "markdown",
228 | "metadata": {},
229 | "source": [
230 | "\n",
231 |     "Let's write a short program with an ***if*** statement to help us decide if a name is long or not (6 is completely arbitrary ;-)). \n",
232 |     "We have to respect code blocks / indentation."
233 | ]
234 | },
235 | {
236 | "cell_type": "code",
237 | "execution_count": null,
238 | "metadata": {},
239 | "outputs": [],
240 | "source": [
241 | "name = input('What is your name? ')\n",
242 | "\n",
243 | "if len(name) > 6:\n",
244 | " print('You have a long name.')\n",
245 | "else:\n",
246 | " print('You have a short name.')"
247 | ]
248 | },
249 | {
250 | "cell_type": "markdown",
251 | "metadata": {},
252 | "source": [
253 | ">Write an ***If - Elif - Else*** statement printing 'Python' if x is positive, else 'sunshine' if y is equal to 2, else 'data' if z is a multiple of 3, else 'Why?'. \n",
254 | "You can test it with different values of x, y and z."
255 | ]
256 | },
257 | {
258 | "cell_type": "code",
259 | "execution_count": null,
260 | "metadata": {},
261 | "outputs": [],
262 | "source": [
263 | "x=\n",
264 | "y=\n",
265 | "z=\n",
266 | "\n",
267 | "if\n",
268 | "\n"
269 | ]
270 | },
271 | {
272 | "cell_type": "code",
273 | "execution_count": null,
274 | "metadata": {},
275 | "outputs": [],
276 | "source": "# !cat solutions/04_04.py"
277 | },
278 | {
279 | "cell_type": "markdown",
280 | "metadata": {},
281 | "source": [
282 |     "---\n",
283 |     "\n",
284 |     "## Using \"and\" to check for None"
288 | ]
289 | },
290 | {
291 | "cell_type": "markdown",
292 | "metadata": {},
293 | "source": [
294 |     "Sometimes we have missing data (None or np.nan), which will cause an error in a check condition (e.g. `age > 18` when age is missing or NaN). However, there's a trick: in an `and` operation, the second argument is not evaluated if the first one is False (the result will be False anyway), so by first checking that the value is valid we can avoid the error. For example:"
295 | ]
296 | },
297 | {
298 | "cell_type": "code",
299 | "execution_count": null,
300 | "metadata": {},
301 | "outputs": [],
302 | "source": [
303 | "age = None\n",
304 | "age > 18 # you will get an error"
305 | ]
306 | },
307 | {
308 | "cell_type": "code",
309 | "execution_count": null,
310 | "metadata": {},
311 | "outputs": [],
312 | "source": [
313 | "age = None\n",
314 | "(age is not None) and (age > 18) # no error"
315 | ]
316 | },
317 | {
318 | "cell_type": "markdown",
319 | "metadata": {},
320 | "source": [
321 |     "The advantage of doing it this way is that we get simpler code and a less deeply indented `if` structure."
322 | ]
323 | },
324 | {
325 | "cell_type": "code",
326 | "execution_count": null,
327 | "metadata": {},
328 | "outputs": [],
329 | "source": [
330 | "if age is not None:\n",
331 | " if age > 18:\n",
332 | " print(\"have beer\")\n",
333 | "\n",
334 | "# which is not as good as\n",
335 | "\n",
336 | "if (age is not None) and (age > 18):\n",
337 | " print(\"have beer\")"
338 | ]
339 | },
340 | {
341 | "cell_type": "markdown",
342 | "metadata": {},
343 | "source": [
344 |     "---\n",
345 |     "\n",
346 |     "## Functions"
350 | ]
351 | },
352 | {
353 | "cell_type": "markdown",
354 | "metadata": {},
355 | "source": [
356 | "**We can define our own functions with the keyword \"def\" followed by the name of the function and by parentheses with the parameter(s) inside.** \n",
357 | "\n",
358 |     "Using the list **list_greeting**, we can define an **is_greeting** function which will decide whether a string is a greeting or not."
359 | ]
360 | },
361 | {
362 | "cell_type": "code",
363 | "execution_count": null,
364 | "metadata": {},
365 | "outputs": [],
366 | "source": [
367 | "list_greeting = ['Hallo', 'Bonjour', 'Ola', 'Hello', 'Ciao', 'Ave']\n",
368 | "\n",
369 | "def is_greeting(s):\n",
370 | " \"\"\"Returns True if s is in list_greeting, else False.\"\"\"\n",
371 | " if s in list_greeting:\n",
372 | " return True\n",
373 | " else:\n",
374 | " return False "
375 | ]
376 | },
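   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "Since `s in list_greeting` already evaluates to a boolean, the same function can be written more compactly (same behaviour as above):"
    ]
   },
   {
    "cell_type": "code",
    "execution_count": null,
    "metadata": {},
    "outputs": [],
    "source": [
     "def is_greeting(s):\n",
     "    \"\"\"Returns True if s is in list_greeting, else False.\"\"\"\n",
     "    return s in list_greeting"
    ]
   },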
377 | {
378 | "cell_type": "markdown",
379 | "metadata": {},
380 | "source": [
381 | "We can now check if **Ola** and **Yo** are greetings."
382 | ]
383 | },
384 | {
385 | "cell_type": "code",
386 | "execution_count": null,
387 | "metadata": {},
388 | "outputs": [],
389 | "source": [
390 | "is_greeting('Ola')"
391 | ]
392 | },
393 | {
394 | "cell_type": "code",
395 | "execution_count": null,
396 | "metadata": {},
397 | "outputs": [],
398 | "source": [
399 | "is_greeting('Yo')"
400 | ]
401 | },
402 | {
403 | "cell_type": "markdown",
404 | "metadata": {},
405 | "source": [
406 | ">Get the documentation for the is_greeting function."
407 | ]
408 | },
409 | {
410 | "cell_type": "code",
411 | "execution_count": null,
412 | "metadata": {},
413 | "outputs": [],
414 | "source": []
415 | },
416 | {
417 | "cell_type": "code",
418 | "execution_count": null,
419 | "metadata": {},
420 | "outputs": [],
421 | "source": "# !cat solutions/04_05.py"
422 | },
423 | {
424 | "cell_type": "markdown",
425 | "metadata": {},
426 | "source": [
427 | ">Write a function that returns the input multiplied by 3 and increased by 10.\n",
428 | "\n",
429 | "Note: we call these inputs arguments."
430 | ]
431 | },
432 | {
433 | "cell_type": "code",
434 | "execution_count": null,
435 | "metadata": {},
436 | "outputs": [],
437 | "source": []
438 | },
439 | {
440 | "cell_type": "code",
441 | "execution_count": null,
442 | "metadata": {},
443 | "outputs": [],
444 | "source": "# !cat solutions/04_06.py\n"
445 | }
446 | ],
447 | "metadata": {
448 | "kernelspec": {
449 | "display_name": "Python 3 (system-wide)",
450 | "language": "python",
451 | "name": "python3"
452 | },
453 | "language_info": {
454 | "codemirror_mode": {
455 | "name": "ipython",
456 | "version": 3
457 | },
458 | "file_extension": ".py",
459 | "mimetype": "text/x-python",
460 | "name": "python",
461 | "nbconvert_exporter": "python",
462 | "pygments_lexer": "ipython3",
463 | "version": "3.8.5"
464 | }
465 | },
466 | "nbformat": 4,
467 | "nbformat_minor": 4
468 | }
469 |
--------------------------------------------------------------------------------
/CHANGELOG.md:
--------------------------------------------------------------------------------
1 | # Changelog
2 |
3 | All notable changes to this project will be documented in this file.
4 |
5 | The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
6 | and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).
7 |
8 |
9 |
10 | ## [2.0.0] - 2025-02-01
11 |
12 | - Simplify the setup process and instructions
13 | - Restructure and put all the notebooks in the root directory for easy access in Colab
14 | - Use `!cat` instead of the load magic to load solutions, as load magics are not implemented in some environments
15 |
16 | ## [1.1.0] - 2022-07-10
17 |
18 | - Various upgrades ([#40](https://github.com/HumbleData/beginners-data-workshop/pull/40))
19 | - **Behind-the-scenes**
20 | - Upgrade Python to 3.9.13
21 | - Upgrade all dependencies to latest with NumPy/pandas/Matplotlib/scikit-learn matching CoCalc versions.
22 | - Switch from `pip-tools` to Poetry (configuration in `pyproject.toml`)
23 | - Update Development Setup guidance in `README.md`
24 | - Integrate & configure flake8, pylint, black & other linters.
25 | - Added pre-commit configuration.
26 | - Updated VS Code `settings.json`
27 | - Added `linestripper.py` to keep EOF newlines out of solutions code (better attendee UX) yet black-compliant.
28 | - Added `CHANGELOG.md` and started versioning releases.
29 | - **Workshop materials**
30 | - Reviewed all workshop materials for deprecations.
31 | - Replaced all single quotes in solutions with double quotes for black compliance.
32 | - Added EOF newlines to datasets.
33 | - Other minor changes: trailing whitespace, trailing commas, isort compliant solution code.
34 |
35 |
36 | ## [1.0.0] - 2021-07-23
37 | Final version used for Humble Data workshops in 2021.
38 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License.
2 | To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/
3 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Humble Data Workshop
2 |
3 | [](https://humbledata.org)
4 |
5 | ## ℹ️ If you would like to know more about this workshop, please [email us](mailto:contact@humbledata.org).
6 |
7 | ---
8 | ## Table of Contents
9 | * [Google Colab setup](#google-colab-setup)
10 | * [Local environment setup](#local-environment-setup)
11 | + [UV Installation](#uv-installation)
12 | + [Installing Miniconda](#installing-miniconda)
13 | - [Windows](#windows)
14 | - [Unix (Linux/macOS)](#unix-linuxmacos)
15 | + [Creating and Activating the Environment](#creating-and-activating-the-environment)
17 | * [License](#license)
18 | ---
19 |
20 | ## Google Colab setup
21 |
22 | 1. Go to [https://githubtocolab.com/HumbleData/beginners-data-workshop](https://githubtocolab.com/HumbleData/beginners-data-workshop)
23 | 2. Choose the notebook that you want to open
24 | 
25 | 3. Click on the file icon on the left
26 | 4. If you haven’t logged in to your Google account, you will be asked to do so
27 | 
28 | 5. At the beginning of the notebook, add a cell by clicking the button at the top
29 | 
30 | 6. After that, copy and paste the following code into the new cell:
31 | ```
32 | !git clone https://github.com/HumbleData/beginners-data-workshop.git
33 | !cp -r beginners-data-workshop/media/ .
34 | !cp -r beginners-data-workshop/data/ .
35 | !cp -r beginners-data-workshop/solutions/ .
36 | !rm -r beginners-data-workshop/
37 | ```
38 | > NOTE: You will need to add this code cell to every notebook you start.
39 |
40 | 
41 | 7. Run the cell by clicking the play button on the left of the cell or press shift \+ enter on your keyboard
42 | 
43 | 8. You may get this warning when running the first code block. Click “Run anyway” when asked (because you trust us not to give you malicious code).
44 | 
45 | 9. When the code is finished (it may take a moment), you should see that three folders are added to your files. Consider the preparation work done and you may now start using the notebook.
46 |
47 |
48 | 10. Note that when you disconnect from the notebook (or leave it inactive for a long time), the files we just downloaded with the code, as well as your work, are not saved.
49 |
50 | Consider downloading or saving your work in drive before you leave this notebook. You can do so by clicking on the “File” button at the bottom.
51 |
52 |
53 |
54 | ---
55 |
56 | ## Local environment setup
57 |
58 | This document contains instructions on how to run the workshop using either `uv` or `conda` (Miniconda).
59 |
60 | ### UV Installation
61 | To run this workshop locally using `uv`, first you will need to [install uv](https://docs.astral.sh/uv/getting-started/installation/) on your computer.
62 |
63 | Once it is done, follow the instructions below:
64 |
65 | 1. Create a Python 3.10+ virtual environment
66 | * `uv venv humble-data-workshop --python 3.10`
67 | 2. Activate the virtual environment.
68 | * `source humble-data-workshop/bin/activate`
69 | 3. Install the dependencies
70 | * `uv pip install -r requirements.txt`
71 |
72 | ### Installing Miniconda
73 |
74 | #### Windows
75 | 1. Download the Miniconda installer for Windows from the [official website](https://docs.conda.io/en/latest/miniconda.html)
76 | 2. Double-click the downloaded `.exe` file
77 | 3. Follow the installation prompts:
78 | - Click "Next"
79 | - Accept the license terms
80 | - Select "Just Me" for installation scope
81 | - Choose an installation directory (default is recommended)
82 | - In "Advanced Options", check "Add Miniconda3 to my PATH environment variable"
83 | - Click "Install"
84 |
85 | #### Unix (Linux/macOS)
86 | 1. Download the Miniconda installer for your system from the [official website](https://docs.conda.io/en/latest/miniconda.html)
87 | 2. Open Terminal
88 | 3. Navigate to the directory containing the downloaded file
89 | 4. Make the installer executable:
90 | ```bash
91 | chmod +x Miniconda3-latest-*-x86_64.sh
92 | ```
93 | 5. Run the installer:
94 | ```bash
95 | ./Miniconda3-latest-*-x86_64.sh
96 | ```
97 | 6. Follow the prompts:
98 | - Press Enter to review the license agreement
99 | - Type "yes" to accept the license terms
100 | - Confirm the installation location (default is recommended)
101 | - Type "yes" to initialize Miniconda3
102 |
103 | ### Creating and Activating the Environment
104 |
105 | 1. Open a new terminal (Windows: Anaconda Prompt, Unix: Terminal)
106 | 2. Create a new environment named 'humble-data':
107 | ```bash
108 | conda create -n humble-data python=3.8
109 | ```
110 | 3. Activate the environment:
111 | - Windows:
112 | ```bash
113 | conda activate humble-data
114 | ```
115 | - Unix:
116 | ```bash
117 | conda activate humble-data
118 | ```
119 | 4. Install required packages:
120 | ```bash
121 | pip install -r requirements.txt
122 | ```
123 |
124 | 5. Start Jupyter Notebook:
125 | ```bash
126 | jupyter notebook
127 | ```
128 | This will open Jupyter Notebook in your default web browser. You can now navigate to and open any of the workshop notebooks.
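To double-check that the core workshop packages resolved correctly, you can run a quick check in the environment (a minimal sketch; the package list is an assumption based on the notebooks):

```python
# Report which of the core workshop packages are importable
# in the current environment.
import importlib.util

for pkg in ["pandas", "matplotlib", "seaborn"]:
    found = importlib.util.find_spec(pkg) is not None
    print(f"{pkg}: {'OK' if found else 'MISSING'}")
```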
129 |
130 | ## Contributing
131 |
132 | 1. Fork this repository
133 | 2. Clone your fork locally
134 | 3. Create a branch for your changes:
135 | ```git checkout -b improve-notebook-x```
136 |
137 | 4. Make your changes:
138 |
139 | - Keep explanations simple and beginner-friendly
140 | - Test notebooks in both Google Colab and local environments
141 | - Follow existing code style and formatting
142 |
143 |
144 | 5. Commit with a clear message:
145 | ```git commit -m "Fix typo in data visualization notebook"```
146 |
147 | 6. Push and create a pull request
148 |
149 | ---
150 |
151 | ## License
152 |
153 | This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
154 |
--------------------------------------------------------------------------------
/data/Iris/Iris.csv:
--------------------------------------------------------------------------------
1 | Id,SepalLengthCm,SepalWidthCm,PetalLengthCm,PetalWidthCm,Species
2 | 1,5.1,3.5,1.4,0.2,Iris-setosa
3 | 2,4.9,3.0,1.4,0.2,Iris-setosa
4 | 3,4.7,3.2,1.3,0.2,Iris-setosa
5 | 4,4.6,3.1,1.5,0.2,Iris-setosa
6 | 5,5.0,3.6,1.4,0.2,Iris-setosa
7 | 6,5.4,3.9,1.7,0.4,Iris-setosa
8 | 7,4.6,3.4,1.4,0.3,Iris-setosa
9 | 8,5.0,3.4,1.5,0.2,Iris-setosa
10 | 9,4.4,2.9,1.4,0.2,Iris-setosa
11 | 10,4.9,3.1,1.5,0.1,Iris-setosa
12 | 11,5.4,3.7,1.5,0.2,Iris-setosa
13 | 12,4.8,3.4,1.6,0.2,Iris-setosa
14 | 13,4.8,3.0,1.4,0.1,Iris-setosa
15 | 14,4.3,3.0,1.1,0.1,Iris-setosa
16 | 15,5.8,4.0,1.2,0.2,Iris-setosa
17 | 16,5.7,4.4,1.5,0.4,Iris-setosa
18 | 17,5.4,3.9,1.3,0.4,Iris-setosa
19 | 18,5.1,3.5,1.4,0.3,Iris-setosa
20 | 19,5.7,3.8,1.7,0.3,Iris-setosa
21 | 20,5.1,3.8,1.5,0.3,Iris-setosa
22 | 21,5.4,3.4,1.7,0.2,Iris-setosa
23 | 22,5.1,3.7,1.5,0.4,Iris-setosa
24 | 23,4.6,3.6,1.0,0.2,Iris-setosa
25 | 24,5.1,3.3,1.7,0.5,Iris-setosa
26 | 25,4.8,3.4,1.9,0.2,Iris-setosa
27 | 26,5.0,3.0,1.6,0.2,Iris-setosa
28 | 27,5.0,3.4,1.6,0.4,Iris-setosa
29 | 28,5.2,3.5,1.5,0.2,Iris-setosa
30 | 29,5.2,3.4,1.4,0.2,Iris-setosa
31 | 30,4.7,3.2,1.6,0.2,Iris-setosa
32 | 31,4.8,3.1,1.6,0.2,Iris-setosa
33 | 32,5.4,3.4,1.5,0.4,Iris-setosa
34 | 33,5.2,4.1,1.5,0.1,Iris-setosa
35 | 34,5.5,4.2,1.4,0.2,Iris-setosa
36 | 35,4.9,3.1,1.5,0.1,Iris-setosa
37 | 36,5.0,3.2,1.2,0.2,Iris-setosa
38 | 37,5.5,3.5,1.3,0.2,Iris-setosa
39 | 38,4.9,3.1,1.5,0.1,Iris-setosa
40 | 39,4.4,3.0,1.3,0.2,Iris-setosa
41 | 40,5.1,3.4,1.5,0.2,Iris-setosa
42 | 41,5.0,3.5,1.3,0.3,Iris-setosa
43 | 42,4.5,2.3,1.3,0.3,Iris-setosa
44 | 43,4.4,3.2,1.3,0.2,Iris-setosa
45 | 44,5.0,3.5,1.6,0.6,Iris-setosa
46 | 45,5.1,3.8,1.9,0.4,Iris-setosa
47 | 46,4.8,3.0,1.4,0.3,Iris-setosa
48 | 47,5.1,3.8,1.6,0.2,Iris-setosa
49 | 48,4.6,3.2,1.4,0.2,Iris-setosa
50 | 49,5.3,3.7,1.5,0.2,Iris-setosa
51 | 50,5.0,3.3,1.4,0.2,Iris-setosa
52 | 51,7.0,3.2,4.7,1.4,Iris-versicolor
53 | 52,6.4,3.2,4.5,1.5,Iris-versicolor
54 | 53,6.9,3.1,4.9,1.5,Iris-versicolor
55 | 54,5.5,2.3,4.0,1.3,Iris-versicolor
56 | 55,6.5,2.8,4.6,1.5,Iris-versicolor
57 | 56,5.7,2.8,4.5,1.3,Iris-versicolor
58 | 57,6.3,3.3,4.7,1.6,Iris-versicolor
59 | 58,4.9,2.4,3.3,1.0,Iris-versicolor
60 | 59,6.6,2.9,4.6,1.3,Iris-versicolor
61 | 60,5.2,2.7,3.9,1.4,Iris-versicolor
62 | 61,5.0,2.0,3.5,1.0,Iris-versicolor
63 | 62,5.9,3.0,4.2,1.5,Iris-versicolor
64 | 63,6.0,2.2,4.0,1.0,Iris-versicolor
65 | 64,6.1,2.9,4.7,1.4,Iris-versicolor
66 | 65,5.6,2.9,3.6,1.3,Iris-versicolor
67 | 66,6.7,3.1,4.4,1.4,Iris-versicolor
68 | 67,5.6,3.0,4.5,1.5,Iris-versicolor
69 | 68,5.8,2.7,4.1,1.0,Iris-versicolor
70 | 69,6.2,2.2,4.5,1.5,Iris-versicolor
71 | 70,5.6,2.5,3.9,1.1,Iris-versicolor
72 | 71,5.9,3.2,4.8,1.8,Iris-versicolor
73 | 72,6.1,2.8,4.0,1.3,Iris-versicolor
74 | 73,6.3,2.5,4.9,1.5,Iris-versicolor
75 | 74,6.1,2.8,4.7,1.2,Iris-versicolor
76 | 75,6.4,2.9,4.3,1.3,Iris-versicolor
77 | 76,6.6,3.0,4.4,1.4,Iris-versicolor
78 | 77,6.8,2.8,4.8,1.4,Iris-versicolor
79 | 78,6.7,3.0,5.0,1.7,Iris-versicolor
80 | 79,6.0,2.9,4.5,1.5,Iris-versicolor
81 | 80,5.7,2.6,3.5,1.0,Iris-versicolor
82 | 81,5.5,2.4,3.8,1.1,Iris-versicolor
83 | 82,5.5,2.4,3.7,1.0,Iris-versicolor
84 | 83,5.8,2.7,3.9,1.2,Iris-versicolor
85 | 84,6.0,2.7,5.1,1.6,Iris-versicolor
86 | 85,5.4,3.0,4.5,1.5,Iris-versicolor
87 | 86,6.0,3.4,4.5,1.6,Iris-versicolor
88 | 87,6.7,3.1,4.7,1.5,Iris-versicolor
89 | 88,6.3,2.3,4.4,1.3,Iris-versicolor
90 | 89,5.6,3.0,4.1,1.3,Iris-versicolor
91 | 90,5.5,2.5,4.0,1.3,Iris-versicolor
92 | 91,5.5,2.6,4.4,1.2,Iris-versicolor
93 | 92,6.1,3.0,4.6,1.4,Iris-versicolor
94 | 93,5.8,2.6,4.0,1.2,Iris-versicolor
95 | 94,5.0,2.3,3.3,1.0,Iris-versicolor
96 | 95,5.6,2.7,4.2,1.3,Iris-versicolor
97 | 96,5.7,3.0,4.2,1.2,Iris-versicolor
98 | 97,5.7,2.9,4.2,1.3,Iris-versicolor
99 | 98,6.2,2.9,4.3,1.3,Iris-versicolor
100 | 99,5.1,2.5,3.0,1.1,Iris-versicolor
101 | 100,5.7,2.8,4.1,1.3,Iris-versicolor
102 | 101,6.3,3.3,6.0,2.5,Iris-virginica
103 | 102,5.8,2.7,5.1,1.9,Iris-virginica
104 | 103,7.1,3.0,5.9,2.1,Iris-virginica
105 | 104,6.3,2.9,5.6,1.8,Iris-virginica
106 | 105,6.5,3.0,5.8,2.2,Iris-virginica
107 | 106,7.6,3.0,6.6,2.1,Iris-virginica
108 | 107,4.9,2.5,4.5,1.7,Iris-virginica
109 | 108,7.3,2.9,6.3,1.8,Iris-virginica
110 | 109,6.7,2.5,5.8,1.8,Iris-virginica
111 | 110,7.2,3.6,6.1,2.5,Iris-virginica
112 | 111,6.5,3.2,5.1,2.0,Iris-virginica
113 | 112,6.4,2.7,5.3,1.9,Iris-virginica
114 | 113,6.8,3.0,5.5,2.1,Iris-virginica
115 | 114,5.7,2.5,5.0,2.0,Iris-virginica
116 | 115,5.8,2.8,5.1,2.4,Iris-virginica
117 | 116,6.4,3.2,5.3,2.3,Iris-virginica
118 | 117,6.5,3.0,5.5,1.8,Iris-virginica
119 | 118,7.7,3.8,6.7,2.2,Iris-virginica
120 | 119,7.7,2.6,6.9,2.3,Iris-virginica
121 | 120,6.0,2.2,5.0,1.5,Iris-virginica
122 | 121,6.9,3.2,5.7,2.3,Iris-virginica
123 | 122,5.6,2.8,4.9,2.0,Iris-virginica
124 | 123,7.7,2.8,6.7,2.0,Iris-virginica
125 | 124,6.3,2.7,4.9,1.8,Iris-virginica
126 | 125,6.7,3.3,5.7,2.1,Iris-virginica
127 | 126,7.2,3.2,6.0,1.8,Iris-virginica
128 | 127,6.2,2.8,4.8,1.8,Iris-virginica
129 | 128,6.1,3.0,4.9,1.8,Iris-virginica
130 | 129,6.4,2.8,5.6,2.1,Iris-virginica
131 | 130,7.2,3.0,5.8,1.6,Iris-virginica
132 | 131,7.4,2.8,6.1,1.9,Iris-virginica
133 | 132,7.9,3.8,6.4,2.0,Iris-virginica
134 | 133,6.4,2.8,5.6,2.2,Iris-virginica
135 | 134,6.3,2.8,5.1,1.5,Iris-virginica
136 | 135,6.1,2.6,5.6,1.4,Iris-virginica
137 | 136,7.7,3.0,6.1,2.3,Iris-virginica
138 | 137,6.3,3.4,5.6,2.4,Iris-virginica
139 | 138,6.4,3.1,5.5,1.8,Iris-virginica
140 | 139,6.0,3.0,4.8,1.8,Iris-virginica
141 | 140,6.9,3.1,5.4,2.1,Iris-virginica
142 | 141,6.7,3.1,5.6,2.4,Iris-virginica
143 | 142,6.9,3.1,5.1,2.3,Iris-virginica
144 | 143,5.8,2.7,5.1,1.9,Iris-virginica
145 | 144,6.8,3.2,5.9,2.3,Iris-virginica
146 | 145,6.7,3.3,5.7,2.5,Iris-virginica
147 | 146,6.7,3.0,5.2,2.3,Iris-virginica
148 | 147,6.3,2.5,5.0,1.9,Iris-virginica
149 | 148,6.5,3.0,5.2,2.0,Iris-virginica
150 | 149,6.2,3.4,5.4,2.3,Iris-virginica
151 | 150,5.9,3.0,5.1,1.8,Iris-virginica
152 |
--------------------------------------------------------------------------------
/data/Iris/Iris_data.csv:
--------------------------------------------------------------------------------
1 | Id,SepalLengthCm,SepalWidthCm,PetalLengthCm,PetalWidthCm,Species
2 | 1,5.1,3.5,1.4,0.2,Iris-setosa
3 | 2,4.9,3.0,1.4,0.2,Iris-setosa
4 | 3,4.7,3.2,1.3,0.2,Iris-setosa
5 | 4,4.6,3.1,1.5,0.2,Iris-setosa
6 | 5,5.0,3.6,1.4,0.2,Iris-setosa
7 | 6,5.4,3.9,1.7,0.4,Iris-setosa
8 | 7,4.6,3.4,1.4,0.3,Iris-setosa
9 | 8,5.0,3.4,1.5,0.2,Iris-setosa
10 | 9,4.4,2.9,1.4,0.2,Iris-setosa
11 | 10,4.9,3.1,1.5,0.1,Iris-setosa
12 | 11,5.4,3.7,1.5,0.2,Iris-setosa
13 | 12,4.8,3.4,1.6,0.2,Iris-setosa
14 | 12,4.8,3.4,1.6,0.2,Iris-setosa
15 | 13,4.8,3.0,1.4,0.1,Iris-setosa
16 | 14,4.3,3.0,1.1,0.1,Iris-setosa
17 | 15,5.8,4.0,1.2,0.2,Iris-setosa
18 | 16,5.7,4.4,1.5,0.4,Iris-setosa
19 | 17,5.4,3.9,1.3,0.4,Iris-setosa
20 | 18,5.1,3.5,1.4,0.3,Iris-setosa
21 | 19,5.7,3.8,1.7,0.3,Iris-setosa
22 | 20,5.1,3.8,1.5,0.3,Iris-setosa
23 | 21,5.4,3.4,1.7,0.2,Iris-setosa
24 | 22,5.1,3.7,1.5,0.4,Iris-setosa
25 | 23,4.6,3.6,1.0,0.2,Iris-setosa
26 | 24,5.1,3.3,1.7,0.5,Iris-setosa
27 | 25,4.8,3.4,1.9,0.2,Iris-setosa
28 | 26,5.0,3.0,1.6,0.2,Iris-setosa
29 | 27,5.0,3.4,1.6,0.4,Iris-setosa
30 | 28,5.2,3.5,1.5,0.2,Iris-setosa
31 | 29,5.2,3.4,1.4,0.2,Iris-setosa
32 | 30,4.7,3.2,1.6,0.2,Iris-setosa
33 | 31,4.8,3.1,1.6,0.2,Iris-setosa
34 | 32,5.4,3.4,1.5,0.4,Iris-setosa
35 | 33,5.2,4.1,1.5,0.1,Iris-setosa
36 | 34,5.5,4.2,1.4,0.2,Iris-setosa
37 | 35,4.9,3.1,1.5,0.1,Iris-setosa
38 | 36,5.0,3.2,1.2,0.2,Iris-setosa
39 | 37,5.5,3.5,1.3,0.2,Iris-setosa
40 | 38,4.9,3.1,1.5,0.1,Iris-setosa
41 | 39,4.4,3.0,1.3,0.2,Iris-setosa
42 | 40,5.1,3.4,1.5,0.2,Iris-setosa
43 | 41,5.0,3.5,1.3,0.3,Iris-setosa
44 | 42,4.5,2.3,1.3,0.3,Iris-setosa
45 | 43,4.4,3.2,1.3,0.2,Iris-setosa
46 | 44,5.0,3.5,1.6,0.6,Iris-setosa
47 | 45,5.1,3.8,1.9,0.4,Iris-setosa
48 | 46,4.8,3.0,1.4,0.3,Iris-setosa
49 | 47,5.1,3.8,1.6,0.2,Iris-setosa
50 | 48,4.6,3.2,1.4,0.2,Iris-setosa
51 | 49,5.3,3.7,1.5,0.2,Iris-setosa
52 | 50,5.0,3.3,1.4,0.2,Iris-setosa
53 | 51,7.0,3.2,4.7,1.4,Iris-versicolor
54 | 52,6.4,3.2,4.5,1.5,Iris-versicolor
55 | 53,6.9,3.1,4.9,1.5,Iris-versicolor
56 | 54,5.5,2.3,4.0,1.3,Iris-versicolor
57 | 55,6.5,2.8,4.6,1.5,Iris-versicolor
58 | 56,5.7,2.8,4.5,1.3,Iris-versicolor
59 | 57,6.3,3.3,4.7,1.6,Iris-versicolor
60 | 58,4.9,2.4,3.3,1.0,Iris-versicolor
61 | 59,6.6,2.9,4.6,1.3,Iris-versicolor
62 | 60,5.2,2.7,3.9,1.4,Iris-versicolor
63 | 61,5.0,2.0,3.5,1.0,Iris-versicolor
64 | 62,5.9,3.0,4.2,1.5,Iris-versicolor
65 | 63,6.0,2.2,4.0,1.0,Iris-versicolor
66 | 64,6.1,2.9,4.7,1.4,Iris-versicolor
67 | 65,5.6,2.9,3.6,1.3,Iris-versicolor
68 | 66,6.7,3.1,4.4,1.4,Iris-versicolor
69 | 67,5.6,3.0,4.5,1.5,Iris-versicolor
70 | 68,5.8,2.7,4.1,1.0,Iris-versicolor
71 | 69,6.2,2.2,4.5,1.5,Iris-versicolor
72 | 70,5.6,2.5,3.9,1.1,Iris-versicolor
73 | 71,5.9,3.2,4.8,1.8,Iris-versicolor
74 | 72,6.3,,4.7,1.6,Iris-versicolor
75 | 73,6.1,2.8,4.0,1.3,Iris-versicolor
76 | 74,6.3,2.5,4.9,1.5,Iris-versicolor
77 | 75,6.1,2.8,4.7,1.2,Iris-versicolor
78 | 76,6.4,2.9,4.3,1.3,Iris-versicolor
79 | 77,6.6,3.0,4.4,1.4,Iris-versicolor
80 | 78,6.8,2.8,4.8,1.4,Iris-versicolor
81 | 79,6.7,3.0,5.0,1.7,Iris-versicolor
82 | 80,6.0,2.9,4.5,1.5,Iris-versicolor
83 | 81,5.7,2.6,3.5,1.0,Iris-versicolor
84 | 82,5.5,2.4,3.8,1.1,Iris-versicolor
85 | 83,5.5,2.4,3.7,1.0,Iris-versicolor
86 | 84,5.8,2.7,3.9,1.2,Iris-versicolor
87 | 85,6.0,2.7,5.1,1.6,Iris-versicolor
88 | 86,5.4,3.0,4.5,1.5,Iris-versicolor
89 | 87,6.0,3.4,4.5,1.6,Iris-versicolor
90 | 88,6.7,3.1,4.7,1.5,Iris-versicolor
91 | 89,6.3,2.3,4.4,1.3,Iris-versicolor
92 | 90,5.6,3.0,4.1,1.3,Iris-versicolor
93 | 91,5.5,2.5,4.0,1.3,Iris-versicolor
94 | 92,5.5,2.6,4.4,1.2,Iris-versicolor
95 | 93,6.1,3.0,4.6,1.4,Iris-versicolor
96 | 94,5.8,2.6,4.0,1.2,Iris-versicolor
97 | 95,5.0,2.3,3.3,1.0,Iris-versicolor
98 | 96,5.6,2.7,4.2,1.3,Iris-versicolor
99 | 97,5.7,3.0,4.2,1.2,Iris-versicolor
100 | 98,5.7,2.9,4.2,1.3,Iris-versicolor
101 | 99,6.2,2.9,4.3,1.3,Iris-versicolor
102 | 100,5.1,2.5,3.0,1.1,Iris-versicolor
103 | 101,5.7,2.8,4.1,1.3,Iris-versicolor
104 | 102,6.3,3.3,6.0,2.5,Iris-virginica
105 | 103,5.8,2.7,5.1,1.9,Iris-virginica
106 | 104,7.1,3.0,5.9,2.1,Iris-virginica
107 | 105,6.3,2.9,5.6,1.8,Iris-virginica
108 | 106,6.5,3.0,5.8,2.2,Iris-virginica
109 | 107,7.6,3.0,6.6,2.1,Iris-virginica
110 | 108,4.9,2.5,4.5,1.7,Iris-virginica
111 | 109,7.3,2.9,6.3,1.8,Iris-virginica
112 | 110,6.7,2.5,5.8,1.8,Iris-virginica
113 | 111,7.2,3.6,6.1,2.5,Iris-virginica
114 | 112,5.8,2.6,,,Iris-versicolor
115 | 113,6.5,3.2,5.1,2.0,Iris-virginica
116 | 114,6.4,2.7,5.3,1.9,Iris-virginica
117 | 115,6.8,3.0,5.5,2.1,Iris-virginica
118 | 116,5.7,2.5,5.0,2.0,Iris-virginica
119 | 117,5.8,2.8,5.1,2.4,Iris-virginica
120 | 118,6.4,3.2,5.3,2.3,Iris-virginica
121 | 119,6.5,3.0,5.5,1.8,Iris-virginica
122 | 120,7.7,3.8,6.7,2.2,Iris-virginica
123 | 121,7.7,2.6,6.9,2.3,Iris-virginica
124 | 122,6.0,2.2,5.0,1.5,Iris-virginica
125 | 123,6.9,3.2,5.7,2.3,Iris-virginica
126 | 124,5.6,2.8,4.9,2.0,Iris-virginica
127 | 125,7.7,2.8,6.7,2.0,Iris-virginica
128 | 126,6.3,2.7,4.9,1.8,Iris-virginica
129 | 127,6.7,3.3,5.7,2.1,Iris-virginica
130 | 128,7.2,3.2,6.0,1.8,Iris-virginica
131 | 129,6.2,2.8,4.8,1.8,Iris-virginica
132 | 130,6.1,3.0,4.9,1.8,Iris-virginica
133 | 131,6.4,2.8,5.6,2.1,Iris-virginica
134 | 132,7.2,3.0,5.8,1.6,Iris-virginica
135 | 133,7.4,2.8,6.1,1.9,Iris-virginica
136 | 134,7.9,3.8,6.4,2.0,Iris-virginica
137 | 135,6.4,2.8,5.6,2.2,Iris-virginica
138 | 136,6.3,2.8,5.1,1.5,Iris-virginica
139 | 137,6.1,2.6,5.6,1.4,Iris-virginica
140 | ,,,,,
141 | 139,7.7,3.0,6.1,2.3,Iris-virginica
142 | 140,6.3,3.4,5.6,2.4,Iris-virginica
143 | 141,6.4,3.1,5.5,1.8,Iris-virginica
144 | 142,6.0,3.0,4.8,1.8,Iris-virginica
145 | 143,6.9,3.1,5.4,2.1,Iris-virginica
146 | 144,6.7,3.1,5.6,2.4,Iris-virginica
147 | 145,6.9,3.1,5.1,2.3,Iris-virginica
148 | 146,5.8,2.7,5.1,1.9,Iris-virginica
149 | 147,6.8,3.2,5.9,2.3,Iris-virginica
150 | 148,6.7,3.3,5.7,2.5,Iris-virginica
151 | 149,6.7,3.0,5.2,2.3,Iris-virginica
152 | 150,6.3,2.5,5.0,1.9,Iris-virginica
153 | 151,6.5,3.0,5.2,2.0,Iris-virginica
154 | 152,6.2,3.4,5.4,2.3,Iris-virginica
155 | 153,5.9,3.0,5.1,1.8,Iris-virginica
156 |
--------------------------------------------------------------------------------
/data/Penguins/penguins.csv:
--------------------------------------------------------------------------------
1 | species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex
2 | Adelie,Torgersen,39.1,18.7,181.0,3750.0,Male
3 | Adelie,Torgersen,39.5,17.4,186.0,3800.0,Female
4 | Adelie,Torgersen,40.3,18.0,195.0,3250.0,Female
5 | Adelie,Torgersen,,,,,
6 | ,,,,,,
7 | Adelie,Torgersen,36.7,19.3,193.0,3450.0,Female
8 | Adelie,Torgersen,39.3,20.6,190.0,3650.0,Male
9 | Adelie,Torgersen,38.9,17.8,181.0,3625.0,Female
10 | Adelie,Torgersen,39.2,19.6,195.0,4675.0,Male
11 | Adelie,Torgersen,34.1,18.1,193.0,3475.0,
12 | Adelie,Torgersen,42.0,20.2,190.0,4250.0,
13 | Adelie,Torgersen,37.8,17.1,186.0,3300.0,
14 | Adelie,Torgersen,37.8,17.3,180.0,3700.0,
15 | Adelie,Torgersen,41.1,17.6,182.0,3200.0,Female
16 | Adelie,Torgersen,38.6,21.2,191.0,3800.0,Male
17 | Adelie,Torgersen,34.6,21.1,198.0,4400.0,Male
18 | Adelie,Torgersen,36.6,17.8,185.0,3700.0,Female
19 | Adelie,Torgersen,38.7,19.0,195.0,3450.0,Female
20 | Adelie,Torgersen,42.5,20.7,197.0,4500.0,Male
21 | Adelie,Torgersen,34.4,18.4,184.0,3325.0,Female
22 | Adelie,Torgersen,46.0,21.5,194.0,4200.0,Male
23 | Adelie,Biscoe,37.8,18.3,174.0,3400.0,Female
24 | Adelie,Biscoe,37.7,18.7,180.0,3600.0,Male
25 | Adelie,Biscoe,35.9,19.2,189.0,3800.0,Female
26 | Adelie,Biscoe,38.2,18.1,185.0,3950.0,Male
27 | Adelie,Biscoe,38.8,17.2,180.0,3800.0,Male
28 | Adelie,Biscoe,35.3,18.9,187.0,3800.0,Female
29 | Adelie,Biscoe,40.6,18.6,183.0,3550.0,Male
30 | Adelie,Biscoe,40.5,17.9,187.0,3200.0,Female
31 | Adelie,Biscoe,40.5,17.9,187.0,3200.0,Female
32 | Adelie,Biscoe,37.9,18.6,172.0,3150.0,Female
33 | Adelie,Biscoe,40.5,18.9,180.0,3950.0,Male
34 | Adelie,Dream,39.5,16.7,178.0,3250.0,Female
35 | Adelie,Dream,37.2,18.1,178.0,3900.0,Male
36 | Adelie,Dream,39.5,17.8,188.0,3300.0,Female
37 | Adelie,Dream,40.9,18.9,184.0,3900.0,Male
38 | Adelie,Dream,36.4,17.0,195.0,3325.0,Female
39 | Adelie,Dream,39.2,21.1,196.0,4150.0,Male
40 | Adelie,Dream,38.8,20.0,190.0,3950.0,Male
41 | Adelie,Dream,42.2,18.5,180.0,3550.0,Female
42 | Adelie,Dream,37.6,19.3,181.0,3300.0,Female
43 | Adelie,Dream,39.8,19.1,184.0,4650.0,Male
44 | Adelie,Dream,36.5,18.0,182.0,3150.0,Female
45 | Adelie,Dream,40.8,18.4,195.0,3900.0,Male
46 | Adelie,Dream,36.0,18.5,186.0,3100.0,Female
47 | Adelie,Dream,44.1,19.7,196.0,4400.0,Male
48 | Adelie,Dream,37.0,16.9,185.0,3000.0,Female
49 | Adelie,Dream,39.6,18.8,190.0,4600.0,Male
50 | Adelie,Dream,41.1,19.0,182.0,3425.0,Male
51 | Adelie,Dream,37.5,18.9,179.0,2975.0,
52 | Adelie,Dream,36.0,17.9,190.0,3450.0,Female
53 | Adelie,Dream,42.3,21.2,191.0,4150.0,Male
54 | Adelie,Biscoe,39.6,17.7,186.0,3500.0,Female
55 | Adelie,Biscoe,40.1,18.9,188.0,4300.0,Male
56 | Adelie,Biscoe,35.0,17.9,190.0,3450.0,Female
57 | Adelie,Biscoe,42.0,19.5,200.0,4050.0,Male
58 | Adelie,Biscoe,34.5,18.1,187.0,2900.0,Female
59 | Adelie,Biscoe,41.4,18.6,191.0,3700.0,Male
60 | Adelie,Biscoe,39.0,17.5,186.0,3550.0,Female
61 | ,,,,,,
62 | Adelie,Biscoe,40.6,18.8,193.0,3800.0,Male
63 | Adelie,Biscoe,36.5,16.6,181.0,2850.0,Female
64 | Adelie,Biscoe,37.6,19.1,194.0,3750.0,Male
65 | Adelie,Biscoe,35.7,16.9,185.0,3150.0,Female
66 | Adelie,Biscoe,41.3,21.1,195.0,4400.0,Male
67 | Adelie,Biscoe,37.6,17.0,185.0,3600.0,Female
68 | Adelie,Biscoe,41.1,18.2,192.0,4050.0,Male
69 | Adelie,Biscoe,36.4,17.1,184.0,2850.0,Female
70 | Adelie,Biscoe,41.6,18.0,192.0,3950.0,Male
71 | Adelie,Biscoe,35.5,16.2,195.0,3350.0,Female
72 | Adelie,Biscoe,41.1,19.1,188.0,4100.0,Male
73 | Adelie,Torgersen,35.9,16.6,190.0,3050.0,Female
74 | Adelie,Torgersen,41.8,19.4,198.0,4450.0,Male
75 | Adelie,Torgersen,33.5,19.0,190.0,3600.0,Female
76 | Adelie,Torgersen,39.7,18.4,190.0,3900.0,Male
77 | Adelie,Torgersen,39.6,17.2,196.0,3550.0,Female
78 | Adelie,Torgersen,45.8,18.9,197.0,4150.0,Male
79 | Adelie,Torgersen,35.5,17.5,190.0,3700.0,Female
80 | Adelie,Torgersen,42.8,18.5,195.0,4250.0,Male
81 | Adelie,Torgersen,40.9,16.8,191.0,3700.0,Female
82 | Adelie,Torgersen,37.2,19.4,184.0,3900.0,Male
83 | Adelie,Torgersen,36.2,16.1,187.0,3550.0,Female
84 | Adelie,Torgersen,42.1,19.1,195.0,4000.0,Male
85 | Adelie,Torgersen,34.6,17.2,189.0,3200.0,Female
86 | Adelie,Torgersen,42.9,17.6,196.0,4700.0,Male
87 | Adelie,Torgersen,36.7,18.8,187.0,3800.0,Female
88 | Adelie,Torgersen,35.1,19.4,193.0,4200.0,Male
89 | Adelie,Dream,37.3,17.8,191.0,3350.0,Female
90 | Adelie,Dream,41.3,20.3,194.0,3550.0,Male
91 | Adelie,Dream,36.3,19.5,190.0,3800.0,Male
92 | Adelie,Dream,36.9,18.6,189.0,3500.0,Female
93 | Adelie,Dream,38.3,19.2,189.0,3950.0,Male
94 | Adelie,Dream,38.9,18.8,190.0,3600.0,Female
95 | Adelie,Dream,35.7,18.0,202.0,3550.0,Female
96 | Adelie,Dream,41.1,18.1,205.0,4300.0,Male
97 | Adelie,Dream,34.0,17.1,185.0,3400.0,Female
98 | Adelie,Dream,39.6,18.1,186.0,4450.0,Male
99 | Adelie,Dream,36.2,17.3,187.0,3300.0,Female
100 | Adelie,Dream,40.8,18.9,208.0,4300.0,Male
101 | Adelie,Dream,38.1,18.6,190.0,3700.0,Female
102 | Adelie,Dream,40.3,18.5,196.0,4350.0,Male
103 | ,,,,,,
104 | Adelie,Dream,33.1,16.1,178.0,2900.0,Female
105 | Adelie,Dream,43.2,18.5,192.0,4100.0,Male
106 | Adelie,Biscoe,35.0,17.9,192.0,3725.0,Female
107 | Adelie,Biscoe,41.0,20.0,203.0,4725.0,Male
108 | Adelie,Biscoe,37.7,16.0,183.0,3075.0,Female
109 | Adelie,Biscoe,37.8,20.0,190.0,4250.0,Male
110 | Adelie,Biscoe,37.9,18.6,193.0,2925.0,Female
111 | Adelie,Biscoe,39.7,18.9,184.0,3550.0,Male
112 | Adelie,Biscoe,38.6,17.2,199.0,3750.0,Female
113 | Adelie,Biscoe,38.2,20.0,190.0,3900.0,Male
114 | Adelie,Biscoe,38.1,17.0,181.0,3175.0,Female
115 | Adelie,Biscoe,43.2,19.0,197.0,4775.0,Male
116 | Adelie,Biscoe,38.1,16.5,198.0,3825.0,Female
117 | Adelie,Biscoe,45.6,20.3,191.0,4600.0,Male
118 | Adelie,Biscoe,39.7,17.7,193.0,3200.0,Female
119 | Adelie,Biscoe,42.2,19.5,197.0,4275.0,Male
120 | Adelie,Biscoe,39.6,20.7,191.0,3900.0,Female
121 | Adelie,Biscoe,42.7,18.3,196.0,4075.0,Male
122 | Adelie,Torgersen,38.6,17.0,188.0,2900.0,Female
123 | Adelie,Torgersen,37.3,20.5,199.0,3775.0,Male
124 | Adelie,Torgersen,35.7,17.0,189.0,3350.0,Female
125 | Adelie,Torgersen,41.1,18.6,189.0,3325.0,Male
126 | Adelie,Torgersen,36.2,17.2,187.0,3150.0,Female
127 | Adelie,Torgersen,37.7,19.8,198.0,3500.0,Male
128 | Adelie,Torgersen,40.2,17.0,176.0,3450.0,Female
129 | Adelie,Torgersen,41.4,18.5,202.0,3875.0,Male
130 | Adelie,Torgersen,35.2,15.9,186.0,3050.0,Female
131 | Adelie,Torgersen,40.6,19.0,199.0,4000.0,Male
132 | Adelie,Torgersen,38.8,17.6,191.0,3275.0,Female
133 | ,,,,,,
134 | Adelie,Torgersen,41.5,18.3,195.0,4300.0,Male
135 | Adelie,Torgersen,39.0,17.1,191.0,3050.0,Female
136 | Adelie,Torgersen,44.1,18.0,210.0,4000.0,Male
137 | Adelie,Torgersen,38.5,17.9,190.0,3325.0,Female
138 | Adelie,Torgersen,43.1,19.2,197.0,3500.0,Male
139 | Adelie,Dream,36.8,18.5,193.0,3500.0,Female
140 | Adelie,Dream,37.5,18.5,199.0,4475.0,Male
141 | Adelie,Dream,38.1,17.6,187.0,3425.0,Female
142 | Adelie,Dream,41.1,17.5,190.0,3900.0,Male
143 | Adelie,Dream,35.6,17.5,191.0,3175.0,Female
144 | Adelie,Dream,40.2,20.1,200.0,3975.0,Male
145 | Adelie,Dream,37.0,16.5,185.0,3400.0,Female
146 | Adelie,Dream,39.7,17.9,193.0,4250.0,Male
147 | Adelie,Dream,40.2,17.1,193.0,3400.0,Female
148 | Adelie,Dream,40.6,17.2,187.0,3475.0,Male
149 | Adelie,Dream,32.1,15.5,188.0,3050.0,Female
150 | Adelie,Dream,40.7,17.0,190.0,3725.0,Male
151 | Adelie,Dream,37.3,16.8,192.0,3000.0,Female
152 | Adelie,Dream,39.0,18.7,185.0,3650.0,Male
153 | Adelie,Dream,39.2,18.6,190.0,4250.0,Male
154 | Adelie,Dream,36.6,18.4,184.0,3475.0,Female
155 | Adelie,Dream,36.0,17.8,195.0,3450.0,Female
156 | Adelie,Dream,37.8,18.1,193.0,3750.0,Male
157 | Adelie,Dream,36.0,17.1,187.0,3700.0,Female
158 | Adelie,Dream,41.5,18.5,201.0,4000.0,Male
159 | Chinstrap,Dream,46.5,17.9,192.0,3500.0,Female
160 | Chinstrap,Dream,50.0,19.5,196.0,3900.0,Male
161 | Chinstrap,Dream,51.3,19.2,193.0,3650.0,Male
162 | Chinstrap,Dream,45.4,18.7,188.0,3525.0,Female
163 | Chinstrap,Dream,52.7,19.8,197.0,3725.0,Male
164 | Chinstrap,Dream,45.2,17.8,198.0,3950.0,Female
165 | Chinstrap,Dream,46.1,18.2,178.0,3250.0,Female
166 | Chinstrap,Dream,51.3,18.2,197.0,3750.0,Male
167 | Chinstrap,Dream,46.0,18.9,195.0,4150.0,Female
168 | Chinstrap,Dream,51.3,19.9,198.0,3700.0,Male
169 | Chinstrap,Dream,46.6,17.8,193.0,3800.0,Female
170 | Chinstrap,Dream,46.6,17.8,193.0,3800.0,Female
171 | Chinstrap,Dream,51.7,20.3,194.0,3775.0,Male
172 | Chinstrap,Dream,47.0,17.3,185.0,3700.0,Female
173 | Chinstrap,Dream,52.0,18.1,201.0,4050.0,Male
174 | Chinstrap,Dream,45.9,17.1,190.0,3575.0,Female
175 | Chinstrap,Dream,50.5,19.6,201.0,4050.0,Male
176 | Chinstrap,Dream,50.3,20.0,197.0,3300.0,Male
177 | Chinstrap,Dream,58.0,17.8,181.0,3700.0,Female
178 | Chinstrap,Dream,46.4,18.6,190.0,3450.0,Female
179 | Chinstrap,Dream,49.2,18.2,195.0,4400.0,Male
180 | Chinstrap,Dream,42.4,17.3,181.0,3600.0,Female
181 | Chinstrap,Dream,48.5,17.5,191.0,3400.0,Male
182 | Chinstrap,Dream,43.2,16.6,187.0,2900.0,Female
183 | Chinstrap,Dream,50.6,19.4,193.0,3800.0,Male
184 | Chinstrap,Dream,46.7,17.9,195.0,3300.0,Female
185 | Chinstrap,Dream,52.0,19.0,197.0,4150.0,Male
186 | Chinstrap,Dream,50.5,18.4,200.0,3400.0,Female
187 | ,,,,,,
188 | Chinstrap,Dream,49.5,19.0,200.0,3800.0,Male
189 | Chinstrap,Dream,46.4,17.8,191.0,3700.0,Female
190 | Chinstrap,Dream,52.8,20.0,205.0,4550.0,Male
191 | Chinstrap,Dream,40.9,16.6,187.0,3200.0,Female
192 | Chinstrap,Dream,54.2,20.8,201.0,4300.0,Male
193 | Chinstrap,Dream,42.5,16.7,187.0,3350.0,Female
194 | Chinstrap,Dream,51.0,18.8,203.0,4100.0,Male
195 | Chinstrap,Dream,49.7,18.6,195.0,3600.0,Male
196 | Chinstrap,Dream,47.5,16.8,199.0,3900.0,Female
197 | Chinstrap,Dream,47.6,18.3,195.0,3850.0,Female
198 | Chinstrap,Dream,52.0,20.7,210.0,4800.0,Male
199 | Chinstrap,Dream,46.9,16.6,192.0,2700.0,Female
200 | Chinstrap,Dream,53.5,19.9,205.0,4500.0,Male
201 | Chinstrap,Dream,49.0,19.5,210.0,3950.0,Male
202 | Chinstrap,Dream,46.2,17.5,187.0,3650.0,Female
203 | Chinstrap,Dream,50.9,19.1,196.0,3550.0,Male
204 | Chinstrap,Dream,45.5,17.0,196.0,3500.0,Female
205 | Chinstrap,Dream,50.9,17.9,196.0,3675.0,Female
206 | Chinstrap,Dream,50.8,18.5,201.0,4450.0,Male
207 | Chinstrap,Dream,50.1,17.9,190.0,3400.0,Female
208 | Chinstrap,Dream,49.0,19.6,212.0,4300.0,Male
209 | Chinstrap,Dream,51.5,18.7,187.0,3250.0,Male
210 | Chinstrap,Dream,49.8,17.3,198.0,3675.0,Female
211 | Chinstrap,Dream,48.1,16.4,199.0,3325.0,Female
212 | Chinstrap,Dream,51.4,19.0,201.0,3950.0,Male
213 | Chinstrap,Dream,45.7,17.3,193.0,3600.0,Female
214 | Chinstrap,Dream,50.7,19.7,203.0,4050.0,Male
215 | Chinstrap,Dream,42.5,17.3,187.0,3350.0,Female
216 | Chinstrap,Dream,52.2,18.8,197.0,3450.0,Male
217 | Chinstrap,Dream,45.2,16.6,191.0,3250.0,Female
218 | Chinstrap,Dream,49.3,19.9,203.0,4050.0,Male
219 | Chinstrap,Dream,50.2,18.8,202.0,3800.0,Male
220 | Chinstrap,Dream,45.6,19.4,194.0,3525.0,Female
221 | Chinstrap,Dream,51.9,19.5,206.0,3950.0,Male
222 | Chinstrap,Dream,46.8,16.5,189.0,3650.0,Female
223 | Chinstrap,Dream,45.7,17.0,195.0,3650.0,Female
224 | Chinstrap,Dream,55.8,19.8,207.0,4000.0,Male
225 | Chinstrap,Dream,43.5,18.1,202.0,3400.0,Female
226 | Chinstrap,Dream,49.6,18.2,193.0,3775.0,Male
227 | Chinstrap,Dream,50.8,19.0,210.0,4100.0,Male
228 | Chinstrap,Dream,50.2,18.7,198.0,3775.0,Female
229 | Gentoo,Biscoe,46.1,13.2,211.0,4500.0,Female
230 | Gentoo,Biscoe,50.0,16.3,230.0,5700.0,Male
231 | Gentoo,Biscoe,48.7,14.1,210.0,4450.0,Female
232 | Gentoo,Biscoe,50.0,15.2,218.0,5700.0,Male
233 | Gentoo,Biscoe,47.6,14.5,215.0,5400.0,Male
234 | Gentoo,Biscoe,46.5,13.5,210.0,4550.0,Female
235 | Gentoo,Biscoe,45.4,14.6,211.0,4800.0,Female
236 | Gentoo,Biscoe,46.7,15.3,219.0,5200.0,Male
237 | Gentoo,Biscoe,43.3,13.4,209.0,4400.0,Female
238 | Gentoo,Biscoe,46.8,15.4,215.0,5150.0,Male
239 | Gentoo,Biscoe,40.9,13.7,214.0,4650.0,Female
240 | Gentoo,Biscoe,49.0,16.1,216.0,5550.0,Male
241 | ,,,,,,
242 | Gentoo,Biscoe,45.5,13.7,214.0,4650.0,Female
243 | Gentoo,Biscoe,48.4,14.6,213.0,5850.0,Male
244 | Gentoo,Biscoe,45.8,14.6,210.0,4200.0,Female
245 | Gentoo,Biscoe,49.3,15.7,217.0,5850.0,Male
246 | Gentoo,Biscoe,42.0,13.5,210.0,4150.0,Female
247 | Gentoo,Biscoe,49.2,15.2,221.0,6300.0,Male
248 | Gentoo,Biscoe,46.2,14.5,209.0,4800.0,Female
249 | Gentoo,Biscoe,48.7,15.1,222.0,5350.0,Male
250 | Gentoo,Biscoe,50.2,14.3,218.0,5700.0,Male
251 | Gentoo,Biscoe,45.1,14.5,215.0,5000.0,Female
252 | Gentoo,Biscoe,46.5,14.5,213.0,4400.0,Female
253 | Gentoo,Biscoe,46.3,15.8,215.0,5050.0,Male
254 | Gentoo,Biscoe,42.9,13.1,215.0,5000.0,Female
255 | Gentoo,Biscoe,46.1,15.1,215.0,5100.0,Male
256 | Gentoo,Biscoe,44.5,14.3,216.0,4100.0,
257 | Gentoo,Biscoe,47.8,15.0,215.0,5650.0,Male
258 | Gentoo,Biscoe,48.2,14.3,210.0,4600.0,Female
259 | Gentoo,Biscoe,50.0,15.3,220.0,5550.0,Male
260 | Gentoo,Biscoe,47.3,15.3,222.0,5250.0,Male
261 | Gentoo,Biscoe,42.8,14.2,209.0,4700.0,Female
262 | Gentoo,Biscoe,45.1,14.5,207.0,5050.0,Female
263 | Gentoo,Biscoe,59.6,17.0,230.0,6050.0,Male
264 | Gentoo,Biscoe,49.1,14.8,220.0,5150.0,Female
265 | Gentoo,Biscoe,48.4,16.3,220.0,5400.0,Male
266 | Gentoo,Biscoe,48.4,16.3,220.0,5400.0,Male
267 | Gentoo,Biscoe,42.6,13.7,213.0,4950.0,Female
268 | Gentoo,Biscoe,44.4,17.3,219.0,5250.0,Male
269 | Gentoo,Biscoe,44.0,13.6,208.0,4350.0,Female
270 | Gentoo,Biscoe,48.7,15.7,208.0,5350.0,Male
271 | Gentoo,Biscoe,42.7,13.7,208.0,3950.0,Female
272 | Gentoo,Biscoe,49.6,16.0,225.0,5700.0,Male
273 | Gentoo,Biscoe,45.3,13.7,210.0,4300.0,Female
274 | Gentoo,Biscoe,49.6,15.0,216.0,4750.0,Male
275 | Gentoo,Biscoe,50.5,15.9,222.0,5550.0,Male
276 | Gentoo,Biscoe,43.6,13.9,217.0,4900.0,Female
277 | Gentoo,Biscoe,45.5,13.9,210.0,4200.0,Female
278 | Gentoo,Biscoe,50.5,15.9,225.0,5400.0,Male
279 | Gentoo,Biscoe,44.9,13.3,213.0,5100.0,Female
280 | Gentoo,Biscoe,45.2,15.8,215.0,5300.0,Male
281 | Gentoo,Biscoe,46.6,14.2,210.0,4850.0,Female
282 | Gentoo,Biscoe,48.5,14.1,220.0,5300.0,Male
283 | Gentoo,Biscoe,45.1,14.4,210.0,4400.0,Female
284 | Gentoo,Biscoe,50.1,15.0,225.0,5000.0,Male
285 | Gentoo,Biscoe,46.5,14.4,217.0,4900.0,Female
286 | Gentoo,Biscoe,45.0,15.4,220.0,5050.0,Male
287 | Gentoo,Biscoe,43.8,13.9,208.0,4300.0,Female
288 | Gentoo,Biscoe,45.5,15.0,220.0,5000.0,Male
289 | Gentoo,Biscoe,43.2,14.5,208.0,4450.0,Female
290 | Gentoo,Biscoe,50.4,15.3,224.0,5550.0,Male
291 | Gentoo,Biscoe,45.3,13.8,208.0,4200.0,Female
292 | Gentoo,Biscoe,46.2,14.9,221.0,5300.0,Male
293 | Gentoo,Biscoe,45.7,13.9,214.0,4400.0,Female
294 | Gentoo,Biscoe,54.3,15.7,231.0,5650.0,Male
295 | Gentoo,Biscoe,45.8,14.2,219.0,4700.0,Female
296 | Gentoo,Biscoe,49.8,16.8,230.0,5700.0,Male
297 | Gentoo,Biscoe,46.2,14.4,214.0,4650.0,
298 | Gentoo,Biscoe,49.5,16.2,229.0,5800.0,Male
299 | Gentoo,Biscoe,43.5,14.2,220.0,4700.0,Female
300 | Gentoo,Biscoe,50.7,15.0,223.0,5550.0,Male
301 | Gentoo,Biscoe,47.7,15.0,216.0,4750.0,Female
302 | Gentoo,Biscoe,46.4,15.6,221.0,5000.0,Male
303 | Gentoo,Biscoe,48.2,15.6,221.0,5100.0,Male
304 | Gentoo,Biscoe,46.5,14.8,217.0,5200.0,Female
305 | Gentoo,Biscoe,46.4,15.0,216.0,4700.0,Female
306 | Gentoo,Biscoe,48.6,16.0,230.0,5800.0,Male
307 | Gentoo,Biscoe,47.5,14.2,209.0,4600.0,Female
308 | Gentoo,Biscoe,51.1,16.3,220.0,6000.0,Male
309 | Gentoo,Biscoe,45.2,13.8,215.0,4750.0,Female
310 | Gentoo,Biscoe,45.2,16.4,223.0,5950.0,Male
311 | Gentoo,Biscoe,49.1,14.5,212.0,4625.0,Female
312 | Gentoo,Biscoe,52.5,15.6,221.0,5450.0,Male
313 | Gentoo,Biscoe,47.4,14.6,212.0,4725.0,Female
314 | Gentoo,Biscoe,50.0,15.9,224.0,5350.0,Male
315 | Gentoo,Biscoe,44.9,13.8,212.0,4750.0,Female
316 | Gentoo,Biscoe,50.8,17.3,228.0,5600.0,Male
317 | Gentoo,Biscoe,43.4,14.4,218.0,4600.0,Female
318 | Gentoo,Biscoe,51.3,14.2,218.0,5300.0,Male
319 | Gentoo,Biscoe,47.5,14.0,212.0,4875.0,Female
320 | Gentoo,Biscoe,52.1,17.0,230.0,5550.0,Male
321 | Gentoo,Biscoe,47.5,15.0,218.0,4950.0,Female
322 | Gentoo,Biscoe,52.2,17.1,228.0,5400.0,Male
323 | Gentoo,Biscoe,45.5,14.5,212.0,4750.0,Female
324 | Gentoo,Biscoe,49.5,16.1,224.0,5650.0,Male
325 | Gentoo,Biscoe,44.5,14.7,214.0,4850.0,Female
326 | Gentoo,Biscoe,50.8,15.7,226.0,5200.0,Male
327 | Gentoo,Biscoe,49.4,15.8,216.0,4925.0,Male
328 | Gentoo,Biscoe,46.9,14.6,222.0,4875.0,Female
329 | Gentoo,Biscoe,48.4,14.4,203.0,4625.0,Female
330 | Gentoo,Biscoe,51.1,16.5,225.0,5250.0,Male
331 | Gentoo,Biscoe,48.5,15.0,219.0,4850.0,Female
332 | Gentoo,Biscoe,55.9,17.0,228.0,5600.0,Male
333 | Gentoo,Biscoe,47.2,15.5,215.0,4975.0,Female
334 | Gentoo,Biscoe,49.1,15.0,228.0,5500.0,Male
335 | Gentoo,Biscoe,47.3,13.8,216.0,4725.0,
336 | Gentoo,Biscoe,46.8,16.1,215.0,5500.0,Male
337 | Gentoo,Biscoe,41.7,14.7,210.0,4700.0,Female
338 | Gentoo,Biscoe,53.4,15.8,219.0,5500.0,Male
339 | Gentoo,Biscoe,43.3,14.0,208.0,4575.0,Female
340 | Gentoo,Biscoe,48.1,15.1,209.0,5500.0,Male
341 | ,,,,,,
342 | Gentoo,Biscoe,50.5,15.2,216.0,5000.0,Female
343 | Gentoo,Biscoe,49.8,15.9,229.0,5950.0,Male
344 | Gentoo,Biscoe,43.5,15.2,213.0,4650.0,Female
345 | Gentoo,Biscoe,51.5,16.3,230.0,5500.0,Male
346 | Gentoo,Biscoe,46.2,14.1,217.0,4375.0,Female
347 | Gentoo,Biscoe,55.1,16.0,230.0,5850.0,Male
348 | Gentoo,Biscoe,44.5,15.7,217.0,4875.0,
349 | Gentoo,Biscoe,48.8,16.2,222.0,6000.0,Male
350 | Gentoo,Biscoe,47.2,13.7,214.0,4925.0,Female
351 | Gentoo,Biscoe,,,,,
352 | Gentoo,Biscoe,46.8,14.3,215.0,4850.0,Female
353 | Gentoo,Biscoe,50.4,15.7,222.0,5750.0,Male
354 | Gentoo,Biscoe,45.2,14.8,212.0,5200.0,Female
355 | Gentoo,Biscoe,45.2,14.8,212.0,5200.0,Female
356 | Gentoo,Biscoe,49.9,16.1,213.0,5400.0,Male
357 |
--------------------------------------------------------------------------------
/data/Penguins/penguins_clean.csv:
--------------------------------------------------------------------------------
1 | species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex
2 | Adelie,Torgersen,39.1,18.7,181.0,3750.0,Male
3 | Adelie,Torgersen,39.5,17.4,186.0,3800.0,Female
4 | Adelie,Torgersen,40.3,18.0,195.0,3250.0,Female
5 | Adelie,Torgersen,,,,,
6 | Adelie,Torgersen,36.7,19.3,193.0,3450.0,Female
7 | Adelie,Torgersen,39.3,20.6,190.0,3650.0,Male
8 | Adelie,Torgersen,38.9,17.8,181.0,3625.0,Female
9 | Adelie,Torgersen,39.2,19.6,195.0,4675.0,Male
10 | Adelie,Torgersen,34.1,18.1,193.0,3475.0,
11 | Adelie,Torgersen,42.0,20.2,190.0,4250.0,
12 | Adelie,Torgersen,37.8,17.1,186.0,3300.0,
13 | Adelie,Torgersen,37.8,17.3,180.0,3700.0,
14 | Adelie,Torgersen,41.1,17.6,182.0,3200.0,Female
15 | Adelie,Torgersen,38.6,21.2,191.0,3800.0,Male
16 | Adelie,Torgersen,34.6,21.1,198.0,4400.0,Male
17 | Adelie,Torgersen,36.6,17.8,185.0,3700.0,Female
18 | Adelie,Torgersen,38.7,19.0,195.0,3450.0,Female
19 | Adelie,Torgersen,42.5,20.7,197.0,4500.0,Male
20 | Adelie,Torgersen,34.4,18.4,184.0,3325.0,Female
21 | Adelie,Torgersen,46.0,21.5,194.0,4200.0,Male
22 | Adelie,Biscoe,37.8,18.3,174.0,3400.0,Female
23 | Adelie,Biscoe,37.7,18.7,180.0,3600.0,Male
24 | Adelie,Biscoe,35.9,19.2,189.0,3800.0,Female
25 | Adelie,Biscoe,38.2,18.1,185.0,3950.0,Male
26 | Adelie,Biscoe,38.8,17.2,180.0,3800.0,Male
27 | Adelie,Biscoe,35.3,18.9,187.0,3800.0,Female
28 | Adelie,Biscoe,40.6,18.6,183.0,3550.0,Male
29 | Adelie,Biscoe,40.5,17.9,187.0,3200.0,Female
30 | Adelie,Biscoe,37.9,18.6,172.0,3150.0,Female
31 | Adelie,Biscoe,40.5,18.9,180.0,3950.0,Male
32 | Adelie,Dream,39.5,16.7,178.0,3250.0,Female
33 | Adelie,Dream,37.2,18.1,178.0,3900.0,Male
34 | Adelie,Dream,39.5,17.8,188.0,3300.0,Female
35 | Adelie,Dream,40.9,18.9,184.0,3900.0,Male
36 | Adelie,Dream,36.4,17.0,195.0,3325.0,Female
37 | Adelie,Dream,39.2,21.1,196.0,4150.0,Male
38 | Adelie,Dream,38.8,20.0,190.0,3950.0,Male
39 | Adelie,Dream,42.2,18.5,180.0,3550.0,Female
40 | Adelie,Dream,37.6,19.3,181.0,3300.0,Female
41 | Adelie,Dream,39.8,19.1,184.0,4650.0,Male
42 | Adelie,Dream,36.5,18.0,182.0,3150.0,Female
43 | Adelie,Dream,40.8,18.4,195.0,3900.0,Male
44 | Adelie,Dream,36.0,18.5,186.0,3100.0,Female
45 | Adelie,Dream,44.1,19.7,196.0,4400.0,Male
46 | Adelie,Dream,37.0,16.9,185.0,3000.0,Female
47 | Adelie,Dream,39.6,18.8,190.0,4600.0,Male
48 | Adelie,Dream,41.1,19.0,182.0,3425.0,Male
49 | Adelie,Dream,37.5,18.9,179.0,2975.0,
50 | Adelie,Dream,36.0,17.9,190.0,3450.0,Female
51 | Adelie,Dream,42.3,21.2,191.0,4150.0,Male
52 | Adelie,Biscoe,39.6,17.7,186.0,3500.0,Female
53 | Adelie,Biscoe,40.1,18.9,188.0,4300.0,Male
54 | Adelie,Biscoe,35.0,17.9,190.0,3450.0,Female
55 | Adelie,Biscoe,42.0,19.5,200.0,4050.0,Male
56 | Adelie,Biscoe,34.5,18.1,187.0,2900.0,Female
57 | Adelie,Biscoe,41.4,18.6,191.0,3700.0,Male
58 | Adelie,Biscoe,39.0,17.5,186.0,3550.0,Female
59 | Adelie,Biscoe,40.6,18.8,193.0,3800.0,Male
60 | Adelie,Biscoe,36.5,16.6,181.0,2850.0,Female
61 | Adelie,Biscoe,37.6,19.1,194.0,3750.0,Male
62 | Adelie,Biscoe,35.7,16.9,185.0,3150.0,Female
63 | Adelie,Biscoe,41.3,21.1,195.0,4400.0,Male
64 | Adelie,Biscoe,37.6,17.0,185.0,3600.0,Female
65 | Adelie,Biscoe,41.1,18.2,192.0,4050.0,Male
66 | Adelie,Biscoe,36.4,17.1,184.0,2850.0,Female
67 | Adelie,Biscoe,41.6,18.0,192.0,3950.0,Male
68 | Adelie,Biscoe,35.5,16.2,195.0,3350.0,Female
69 | Adelie,Biscoe,41.1,19.1,188.0,4100.0,Male
70 | Adelie,Torgersen,35.9,16.6,190.0,3050.0,Female
71 | Adelie,Torgersen,41.8,19.4,198.0,4450.0,Male
72 | Adelie,Torgersen,33.5,19.0,190.0,3600.0,Female
73 | Adelie,Torgersen,39.7,18.4,190.0,3900.0,Male
74 | Adelie,Torgersen,39.6,17.2,196.0,3550.0,Female
75 | Adelie,Torgersen,45.8,18.9,197.0,4150.0,Male
76 | Adelie,Torgersen,35.5,17.5,190.0,3700.0,Female
77 | Adelie,Torgersen,42.8,18.5,195.0,4250.0,Male
78 | Adelie,Torgersen,40.9,16.8,191.0,3700.0,Female
79 | Adelie,Torgersen,37.2,19.4,184.0,3900.0,Male
80 | Adelie,Torgersen,36.2,16.1,187.0,3550.0,Female
81 | Adelie,Torgersen,42.1,19.1,195.0,4000.0,Male
82 | Adelie,Torgersen,34.6,17.2,189.0,3200.0,Female
83 | Adelie,Torgersen,42.9,17.6,196.0,4700.0,Male
84 | Adelie,Torgersen,36.7,18.8,187.0,3800.0,Female
85 | Adelie,Torgersen,35.1,19.4,193.0,4200.0,Male
86 | Adelie,Dream,37.3,17.8,191.0,3350.0,Female
87 | Adelie,Dream,41.3,20.3,194.0,3550.0,Male
88 | Adelie,Dream,36.3,19.5,190.0,3800.0,Male
89 | Adelie,Dream,36.9,18.6,189.0,3500.0,Female
90 | Adelie,Dream,38.3,19.2,189.0,3950.0,Male
91 | Adelie,Dream,38.9,18.8,190.0,3600.0,Female
92 | Adelie,Dream,35.7,18.0,202.0,3550.0,Female
93 | Adelie,Dream,41.1,18.1,205.0,4300.0,Male
94 | Adelie,Dream,34.0,17.1,185.0,3400.0,Female
95 | Adelie,Dream,39.6,18.1,186.0,4450.0,Male
96 | Adelie,Dream,36.2,17.3,187.0,3300.0,Female
97 | Adelie,Dream,40.8,18.9,208.0,4300.0,Male
98 | Adelie,Dream,38.1,18.6,190.0,3700.0,Female
99 | Adelie,Dream,40.3,18.5,196.0,4350.0,Male
100 | Adelie,Dream,33.1,16.1,178.0,2900.0,Female
101 | Adelie,Dream,43.2,18.5,192.0,4100.0,Male
102 | Adelie,Biscoe,35.0,17.9,192.0,3725.0,Female
103 | Adelie,Biscoe,41.0,20.0,203.0,4725.0,Male
104 | Adelie,Biscoe,37.7,16.0,183.0,3075.0,Female
105 | Adelie,Biscoe,37.8,20.0,190.0,4250.0,Male
106 | Adelie,Biscoe,37.9,18.6,193.0,2925.0,Female
107 | Adelie,Biscoe,39.7,18.9,184.0,3550.0,Male
108 | Adelie,Biscoe,38.6,17.2,199.0,3750.0,Female
109 | Adelie,Biscoe,38.2,20.0,190.0,3900.0,Male
110 | Adelie,Biscoe,38.1,17.0,181.0,3175.0,Female
111 | Adelie,Biscoe,43.2,19.0,197.0,4775.0,Male
112 | Adelie,Biscoe,38.1,16.5,198.0,3825.0,Female
113 | Adelie,Biscoe,45.6,20.3,191.0,4600.0,Male
114 | Adelie,Biscoe,39.7,17.7,193.0,3200.0,Female
115 | Adelie,Biscoe,42.2,19.5,197.0,4275.0,Male
116 | Adelie,Biscoe,39.6,20.7,191.0,3900.0,Female
117 | Adelie,Biscoe,42.7,18.3,196.0,4075.0,Male
118 | Adelie,Torgersen,38.6,17.0,188.0,2900.0,Female
119 | Adelie,Torgersen,37.3,20.5,199.0,3775.0,Male
120 | Adelie,Torgersen,35.7,17.0,189.0,3350.0,Female
121 | Adelie,Torgersen,41.1,18.6,189.0,3325.0,Male
122 | Adelie,Torgersen,36.2,17.2,187.0,3150.0,Female
123 | Adelie,Torgersen,37.7,19.8,198.0,3500.0,Male
124 | Adelie,Torgersen,40.2,17.0,176.0,3450.0,Female
125 | Adelie,Torgersen,41.4,18.5,202.0,3875.0,Male
126 | Adelie,Torgersen,35.2,15.9,186.0,3050.0,Female
127 | Adelie,Torgersen,40.6,19.0,199.0,4000.0,Male
128 | Adelie,Torgersen,38.8,17.6,191.0,3275.0,Female
129 | Adelie,Torgersen,41.5,18.3,195.0,4300.0,Male
130 | Adelie,Torgersen,39.0,17.1,191.0,3050.0,Female
131 | Adelie,Torgersen,44.1,18.0,210.0,4000.0,Male
132 | Adelie,Torgersen,38.5,17.9,190.0,3325.0,Female
133 | Adelie,Torgersen,43.1,19.2,197.0,3500.0,Male
134 | Adelie,Dream,36.8,18.5,193.0,3500.0,Female
135 | Adelie,Dream,37.5,18.5,199.0,4475.0,Male
136 | Adelie,Dream,38.1,17.6,187.0,3425.0,Female
137 | Adelie,Dream,41.1,17.5,190.0,3900.0,Male
138 | Adelie,Dream,35.6,17.5,191.0,3175.0,Female
139 | Adelie,Dream,40.2,20.1,200.0,3975.0,Male
140 | Adelie,Dream,37.0,16.5,185.0,3400.0,Female
141 | Adelie,Dream,39.7,17.9,193.0,4250.0,Male
142 | Adelie,Dream,40.2,17.1,193.0,3400.0,Female
143 | Adelie,Dream,40.6,17.2,187.0,3475.0,Male
144 | Adelie,Dream,32.1,15.5,188.0,3050.0,Female
145 | Adelie,Dream,40.7,17.0,190.0,3725.0,Male
146 | Adelie,Dream,37.3,16.8,192.0,3000.0,Female
147 | Adelie,Dream,39.0,18.7,185.0,3650.0,Male
148 | Adelie,Dream,39.2,18.6,190.0,4250.0,Male
149 | Adelie,Dream,36.6,18.4,184.0,3475.0,Female
150 | Adelie,Dream,36.0,17.8,195.0,3450.0,Female
151 | Adelie,Dream,37.8,18.1,193.0,3750.0,Male
152 | Adelie,Dream,36.0,17.1,187.0,3700.0,Female
153 | Adelie,Dream,41.5,18.5,201.0,4000.0,Male
154 | Chinstrap,Dream,46.5,17.9,192.0,3500.0,Female
155 | Chinstrap,Dream,50.0,19.5,196.0,3900.0,Male
156 | Chinstrap,Dream,51.3,19.2,193.0,3650.0,Male
157 | Chinstrap,Dream,45.4,18.7,188.0,3525.0,Female
158 | Chinstrap,Dream,52.7,19.8,197.0,3725.0,Male
159 | Chinstrap,Dream,45.2,17.8,198.0,3950.0,Female
160 | Chinstrap,Dream,46.1,18.2,178.0,3250.0,Female
161 | Chinstrap,Dream,51.3,18.2,197.0,3750.0,Male
162 | Chinstrap,Dream,46.0,18.9,195.0,4150.0,Female
163 | Chinstrap,Dream,51.3,19.9,198.0,3700.0,Male
164 | Chinstrap,Dream,46.6,17.8,193.0,3800.0,Female
165 | Chinstrap,Dream,51.7,20.3,194.0,3775.0,Male
166 | Chinstrap,Dream,47.0,17.3,185.0,3700.0,Female
167 | Chinstrap,Dream,52.0,18.1,201.0,4050.0,Male
168 | Chinstrap,Dream,45.9,17.1,190.0,3575.0,Female
169 | Chinstrap,Dream,50.5,19.6,201.0,4050.0,Male
170 | Chinstrap,Dream,50.3,20.0,197.0,3300.0,Male
171 | Chinstrap,Dream,58.0,17.8,181.0,3700.0,Female
172 | Chinstrap,Dream,46.4,18.6,190.0,3450.0,Female
173 | Chinstrap,Dream,49.2,18.2,195.0,4400.0,Male
174 | Chinstrap,Dream,42.4,17.3,181.0,3600.0,Female
175 | Chinstrap,Dream,48.5,17.5,191.0,3400.0,Male
176 | Chinstrap,Dream,43.2,16.6,187.0,2900.0,Female
177 | Chinstrap,Dream,50.6,19.4,193.0,3800.0,Male
178 | Chinstrap,Dream,46.7,17.9,195.0,3300.0,Female
179 | Chinstrap,Dream,52.0,19.0,197.0,4150.0,Male
180 | Chinstrap,Dream,50.5,18.4,200.0,3400.0,Female
181 | Chinstrap,Dream,49.5,19.0,200.0,3800.0,Male
182 | Chinstrap,Dream,46.4,17.8,191.0,3700.0,Female
183 | Chinstrap,Dream,52.8,20.0,205.0,4550.0,Male
184 | Chinstrap,Dream,40.9,16.6,187.0,3200.0,Female
185 | Chinstrap,Dream,54.2,20.8,201.0,4300.0,Male
186 | Chinstrap,Dream,42.5,16.7,187.0,3350.0,Female
187 | Chinstrap,Dream,51.0,18.8,203.0,4100.0,Male
188 | Chinstrap,Dream,49.7,18.6,195.0,3600.0,Male
189 | Chinstrap,Dream,47.5,16.8,199.0,3900.0,Female
190 | Chinstrap,Dream,47.6,18.3,195.0,3850.0,Female
191 | Chinstrap,Dream,52.0,20.7,210.0,4800.0,Male
192 | Chinstrap,Dream,46.9,16.6,192.0,2700.0,Female
193 | Chinstrap,Dream,53.5,19.9,205.0,4500.0,Male
194 | Chinstrap,Dream,49.0,19.5,210.0,3950.0,Male
195 | Chinstrap,Dream,46.2,17.5,187.0,3650.0,Female
196 | Chinstrap,Dream,50.9,19.1,196.0,3550.0,Male
197 | Chinstrap,Dream,45.5,17.0,196.0,3500.0,Female
198 | Chinstrap,Dream,50.9,17.9,196.0,3675.0,Female
199 | Chinstrap,Dream,50.8,18.5,201.0,4450.0,Male
200 | Chinstrap,Dream,50.1,17.9,190.0,3400.0,Female
201 | Chinstrap,Dream,49.0,19.6,212.0,4300.0,Male
202 | Chinstrap,Dream,51.5,18.7,187.0,3250.0,Male
203 | Chinstrap,Dream,49.8,17.3,198.0,3675.0,Female
204 | Chinstrap,Dream,48.1,16.4,199.0,3325.0,Female
205 | Chinstrap,Dream,51.4,19.0,201.0,3950.0,Male
206 | Chinstrap,Dream,45.7,17.3,193.0,3600.0,Female
207 | Chinstrap,Dream,50.7,19.7,203.0,4050.0,Male
208 | Chinstrap,Dream,42.5,17.3,187.0,3350.0,Female
209 | Chinstrap,Dream,52.2,18.8,197.0,3450.0,Male
210 | Chinstrap,Dream,45.2,16.6,191.0,3250.0,Female
211 | Chinstrap,Dream,49.3,19.9,203.0,4050.0,Male
212 | Chinstrap,Dream,50.2,18.8,202.0,3800.0,Male
213 | Chinstrap,Dream,45.6,19.4,194.0,3525.0,Female
214 | Chinstrap,Dream,51.9,19.5,206.0,3950.0,Male
215 | Chinstrap,Dream,46.8,16.5,189.0,3650.0,Female
216 | Chinstrap,Dream,45.7,17.0,195.0,3650.0,Female
217 | Chinstrap,Dream,55.8,19.8,207.0,4000.0,Male
218 | Chinstrap,Dream,43.5,18.1,202.0,3400.0,Female
219 | Chinstrap,Dream,49.6,18.2,193.0,3775.0,Male
220 | Chinstrap,Dream,50.8,19.0,210.0,4100.0,Male
221 | Chinstrap,Dream,50.2,18.7,198.0,3775.0,Female
222 | Gentoo,Biscoe,46.1,13.2,211.0,4500.0,Female
223 | Gentoo,Biscoe,50.0,16.3,230.0,5700.0,Male
224 | Gentoo,Biscoe,48.7,14.1,210.0,4450.0,Female
225 | Gentoo,Biscoe,50.0,15.2,218.0,5700.0,Male
226 | Gentoo,Biscoe,47.6,14.5,215.0,5400.0,Male
227 | Gentoo,Biscoe,46.5,13.5,210.0,4550.0,Female
228 | Gentoo,Biscoe,45.4,14.6,211.0,4800.0,Female
229 | Gentoo,Biscoe,46.7,15.3,219.0,5200.0,Male
230 | Gentoo,Biscoe,43.3,13.4,209.0,4400.0,Female
231 | Gentoo,Biscoe,46.8,15.4,215.0,5150.0,Male
232 | Gentoo,Biscoe,40.9,13.7,214.0,4650.0,Female
233 | Gentoo,Biscoe,49.0,16.1,216.0,5550.0,Male
234 | Gentoo,Biscoe,45.5,13.7,214.0,4650.0,Female
235 | Gentoo,Biscoe,48.4,14.6,213.0,5850.0,Male
236 | Gentoo,Biscoe,45.8,14.6,210.0,4200.0,Female
237 | Gentoo,Biscoe,49.3,15.7,217.0,5850.0,Male
238 | Gentoo,Biscoe,42.0,13.5,210.0,4150.0,Female
239 | Gentoo,Biscoe,49.2,15.2,221.0,6300.0,Male
240 | Gentoo,Biscoe,46.2,14.5,209.0,4800.0,Female
241 | Gentoo,Biscoe,48.7,15.1,222.0,5350.0,Male
242 | Gentoo,Biscoe,50.2,14.3,218.0,5700.0,Male
243 | Gentoo,Biscoe,45.1,14.5,215.0,5000.0,Female
244 | Gentoo,Biscoe,46.5,14.5,213.0,4400.0,Female
245 | Gentoo,Biscoe,46.3,15.8,215.0,5050.0,Male
246 | Gentoo,Biscoe,42.9,13.1,215.0,5000.0,Female
247 | Gentoo,Biscoe,46.1,15.1,215.0,5100.0,Male
248 | Gentoo,Biscoe,44.5,14.3,216.0,4100.0,
249 | Gentoo,Biscoe,47.8,15.0,215.0,5650.0,Male
250 | Gentoo,Biscoe,48.2,14.3,210.0,4600.0,Female
251 | Gentoo,Biscoe,50.0,15.3,220.0,5550.0,Male
252 | Gentoo,Biscoe,47.3,15.3,222.0,5250.0,Male
253 | Gentoo,Biscoe,42.8,14.2,209.0,4700.0,Female
254 | Gentoo,Biscoe,45.1,14.5,207.0,5050.0,Female
255 | Gentoo,Biscoe,59.6,17.0,230.0,6050.0,Male
256 | Gentoo,Biscoe,49.1,14.8,220.0,5150.0,Female
257 | Gentoo,Biscoe,48.4,16.3,220.0,5400.0,Male
258 | Gentoo,Biscoe,42.6,13.7,213.0,4950.0,Female
259 | Gentoo,Biscoe,44.4,17.3,219.0,5250.0,Male
260 | Gentoo,Biscoe,44.0,13.6,208.0,4350.0,Female
261 | Gentoo,Biscoe,48.7,15.7,208.0,5350.0,Male
262 | Gentoo,Biscoe,42.7,13.7,208.0,3950.0,Female
263 | Gentoo,Biscoe,49.6,16.0,225.0,5700.0,Male
264 | Gentoo,Biscoe,45.3,13.7,210.0,4300.0,Female
265 | Gentoo,Biscoe,49.6,15.0,216.0,4750.0,Male
266 | Gentoo,Biscoe,50.5,15.9,222.0,5550.0,Male
267 | Gentoo,Biscoe,43.6,13.9,217.0,4900.0,Female
268 | Gentoo,Biscoe,45.5,13.9,210.0,4200.0,Female
269 | Gentoo,Biscoe,50.5,15.9,225.0,5400.0,Male
270 | Gentoo,Biscoe,44.9,13.3,213.0,5100.0,Female
271 | Gentoo,Biscoe,45.2,15.8,215.0,5300.0,Male
272 | Gentoo,Biscoe,46.6,14.2,210.0,4850.0,Female
273 | Gentoo,Biscoe,48.5,14.1,220.0,5300.0,Male
274 | Gentoo,Biscoe,45.1,14.4,210.0,4400.0,Female
275 | Gentoo,Biscoe,50.1,15.0,225.0,5000.0,Male
276 | Gentoo,Biscoe,46.5,14.4,217.0,4900.0,Female
277 | Gentoo,Biscoe,45.0,15.4,220.0,5050.0,Male
278 | Gentoo,Biscoe,43.8,13.9,208.0,4300.0,Female
279 | Gentoo,Biscoe,45.5,15.0,220.0,5000.0,Male
280 | Gentoo,Biscoe,43.2,14.5,208.0,4450.0,Female
281 | Gentoo,Biscoe,50.4,15.3,224.0,5550.0,Male
282 | Gentoo,Biscoe,45.3,13.8,208.0,4200.0,Female
283 | Gentoo,Biscoe,46.2,14.9,221.0,5300.0,Male
284 | Gentoo,Biscoe,45.7,13.9,214.0,4400.0,Female
285 | Gentoo,Biscoe,54.3,15.7,231.0,5650.0,Male
286 | Gentoo,Biscoe,45.8,14.2,219.0,4700.0,Female
287 | Gentoo,Biscoe,49.8,16.8,230.0,5700.0,Male
288 | Gentoo,Biscoe,46.2,14.4,214.0,4650.0,
289 | Gentoo,Biscoe,49.5,16.2,229.0,5800.0,Male
290 | Gentoo,Biscoe,43.5,14.2,220.0,4700.0,Female
291 | Gentoo,Biscoe,50.7,15.0,223.0,5550.0,Male
292 | Gentoo,Biscoe,47.7,15.0,216.0,4750.0,Female
293 | Gentoo,Biscoe,46.4,15.6,221.0,5000.0,Male
294 | Gentoo,Biscoe,48.2,15.6,221.0,5100.0,Male
295 | Gentoo,Biscoe,46.5,14.8,217.0,5200.0,Female
296 | Gentoo,Biscoe,46.4,15.0,216.0,4700.0,Female
297 | Gentoo,Biscoe,48.6,16.0,230.0,5800.0,Male
298 | Gentoo,Biscoe,47.5,14.2,209.0,4600.0,Female
299 | Gentoo,Biscoe,51.1,16.3,220.0,6000.0,Male
300 | Gentoo,Biscoe,45.2,13.8,215.0,4750.0,Female
301 | Gentoo,Biscoe,45.2,16.4,223.0,5950.0,Male
302 | Gentoo,Biscoe,49.1,14.5,212.0,4625.0,Female
303 | Gentoo,Biscoe,52.5,15.6,221.0,5450.0,Male
304 | Gentoo,Biscoe,47.4,14.6,212.0,4725.0,Female
305 | Gentoo,Biscoe,50.0,15.9,224.0,5350.0,Male
306 | Gentoo,Biscoe,44.9,13.8,212.0,4750.0,Female
307 | Gentoo,Biscoe,50.8,17.3,228.0,5600.0,Male
308 | Gentoo,Biscoe,43.4,14.4,218.0,4600.0,Female
309 | Gentoo,Biscoe,51.3,14.2,218.0,5300.0,Male
310 | Gentoo,Biscoe,47.5,14.0,212.0,4875.0,Female
311 | Gentoo,Biscoe,52.1,17.0,230.0,5550.0,Male
312 | Gentoo,Biscoe,47.5,15.0,218.0,4950.0,Female
313 | Gentoo,Biscoe,52.2,17.1,228.0,5400.0,Male
314 | Gentoo,Biscoe,45.5,14.5,212.0,4750.0,Female
315 | Gentoo,Biscoe,49.5,16.1,224.0,5650.0,Male
316 | Gentoo,Biscoe,44.5,14.7,214.0,4850.0,Female
317 | Gentoo,Biscoe,50.8,15.7,226.0,5200.0,Male
318 | Gentoo,Biscoe,49.4,15.8,216.0,4925.0,Male
319 | Gentoo,Biscoe,46.9,14.6,222.0,4875.0,Female
320 | Gentoo,Biscoe,48.4,14.4,203.0,4625.0,Female
321 | Gentoo,Biscoe,51.1,16.5,225.0,5250.0,Male
322 | Gentoo,Biscoe,48.5,15.0,219.0,4850.0,Female
323 | Gentoo,Biscoe,55.9,17.0,228.0,5600.0,Male
324 | Gentoo,Biscoe,47.2,15.5,215.0,4975.0,Female
325 | Gentoo,Biscoe,49.1,15.0,228.0,5500.0,Male
326 | Gentoo,Biscoe,47.3,13.8,216.0,4725.0,
327 | Gentoo,Biscoe,46.8,16.1,215.0,5500.0,Male
328 | Gentoo,Biscoe,41.7,14.7,210.0,4700.0,Female
329 | Gentoo,Biscoe,53.4,15.8,219.0,5500.0,Male
330 | Gentoo,Biscoe,43.3,14.0,208.0,4575.0,Female
331 | Gentoo,Biscoe,48.1,15.1,209.0,5500.0,Male
332 | Gentoo,Biscoe,50.5,15.2,216.0,5000.0,Female
333 | Gentoo,Biscoe,49.8,15.9,229.0,5950.0,Male
334 | Gentoo,Biscoe,43.5,15.2,213.0,4650.0,Female
335 | Gentoo,Biscoe,51.5,16.3,230.0,5500.0,Male
336 | Gentoo,Biscoe,46.2,14.1,217.0,4375.0,Female
337 | Gentoo,Biscoe,55.1,16.0,230.0,5850.0,Male
338 | Gentoo,Biscoe,44.5,15.7,217.0,4875.0,
339 | Gentoo,Biscoe,48.8,16.2,222.0,6000.0,Male
340 | Gentoo,Biscoe,47.2,13.7,214.0,4925.0,Female
341 | Gentoo,Biscoe,,,,,
342 | Gentoo,Biscoe,46.8,14.3,215.0,4850.0,Female
343 | Gentoo,Biscoe,50.4,15.7,222.0,5750.0,Male
344 | Gentoo,Biscoe,45.2,14.8,212.0,5200.0,Female
345 | Gentoo,Biscoe,49.9,16.1,213.0,5400.0,Male
346 |
--------------------------------------------------------------------------------
/data/food_training/languages.csv:
--------------------------------------------------------------------------------
1 | Country,Official and national Languages
2 | Albania,Albanian
3 | Andorra,Catalan
4 | Austria,German/Slovene/Croatian/Hungarian
5 | Belarus,Belarusian/Russian
6 | Belgium,Dutch/French/German
7 | Bosnia & Herzegovina,Bosnian/Croatian/Serbian
8 | Bulgaria,Bulgarian
9 | Croatia,Croatian
10 | Cyprus,Greek/Turkish/English
11 | Czech Republic,Czech
12 | Denmark,Danish
13 | Estonia,Estonian
14 | Faroe Islands,Faroese/Danish
15 | Finland,Finnish/Swedish
16 | France,French
17 | Germany,German
18 | Gibraltar,English
19 | Greece,Greek
20 | Greenland,Greenlandic Inuktitut/Danish
21 | Hungary,Hungarian
22 | Iceland,Icelandic
23 | Ireland,Irish/English
24 | Italy,Italian
25 | Latvia,Latvian
26 | Liechtenstein,German
27 | Lithuania,Lithuanian
28 | Luxembourg,Luxembourgish/French/German
29 | Macedonia (Rep. of),Macedonia/Albanian
30 | Malta,Maltese
31 | Moldova,Moldovan
32 | Monaco,French
33 | Montenegro,Serbo-Croatian
34 | Netherlands,Dutch/Frisian
35 | Norway,Norwegian
36 | Poland,Polish
37 | Portugal,Portuguese
38 | Romania,Romanian
39 | Russian Federation,Russian
40 | San Marino,Italian
41 | Serbia,Serbian/Albanian
42 | Slovakia,Slovak
43 | Slovenia,Slovenian
44 | Spain,Spanish/Catalan/Galician/Basque
45 | Sweden,Swedish
46 | Switzerland,German/French/Italian/Romansch
47 | Turkey,Turkish
48 | Ukraine,Ukrainian
49 | United Kingdom,English
50 | Vatican City State,Latin/Italian
51 |
--------------------------------------------------------------------------------
/data/food_training/training_2014.csv:
--------------------------------------------------------------------------------
1 | ,,,,,,
2 | CourseName,Location,DateFrom,DateTo,Attendees,,
3 | Risk Assessment (Pest),lisbon;Portugal,2015-01-12,2015-01-16,1,,
4 | Organic Farming,Bristol,2015-01-19,2015-01-22,2,,
5 | Prevention Control and Eradication of Transmissible Spongiform Encephalopathies,Ljubljana;Slovenia,2015-01-20,2015-01-23,2,,
6 | Contingency Planning,Padua,2015-01-26,2015-01-30,2,,
7 | HACCP,Rome,2015-02-02,2015-02-06,2,,
8 | Food Hygiene and Flexibility,Parma,2015-02-08,2015-02-13,1,,
9 | Risk Assessment (Animal Health),Lisbon,2015-02-08,2015-02-12,1,,
10 | Plant Health Risks,Munich,2015-02-09,2015-02-12,1,,
11 | Plant Protection Products,Lisbon,2015-02-16,2015-02-19,2,,
12 | Food Hygiene at Primary Production (Plants),Murcia,2015-02-16,2015-02-20,2,,
13 | Risk Assessment (GMO),Lisbon,2015-02-23,2015-02-27,1,,
14 | Audit,Amsterdam,2015-02-23,2015-02-27,1,,
15 | Semen Embryos and Ova,Venice,2015-02-23,2015-02-27,1,,
16 | Food Additives,Athens,2015-02-23,2015-02-27,2,,
17 | Organic Farming,Warsaw,2015-03-02,2015-03-05,2,,
18 | Risk Assessment (Microbiological),Berlin,2015-03-02,2015-03-06,1,,
19 | HACCP,Lyon,2015-03-02,2015-03-06,4,,
20 | Animal identification registration and traceability,Lyon,2015-03-02,2015-03-06,1,,
21 | Animal Health and Disease Prevention for Bees and Zoo Animals,Antwerp,2015-03-08,2015-03-13,1,,
22 | Food Hygiene and Flexibility,Graz,2015-03-08,2015-03-13,1,,
23 | Audit ,Barcelona,2015-03-09,2015-03-13,2,,
24 | Risk Assessment (Environment),Rome,2015-03-16,2015-03-20,1,,
25 | Food Hygiene at Primary Production (Land Animals), Budapest,2015-03-16,2015-03-20,1,,
26 | RASFF,Madrid,2015-03-23,2015-03-26,1,,
27 | Contingency Planning,Padua,2015-03-23,2015-03-27,1,,
28 | Microbiological Criteria,Lisbon,2015-03-23,2015-03-27,1,,
29 | Food Composition and Information,Athens,2015-03-23,2015-03-27,1,,
30 | Control on Contaminants in Feed and Food,Berlin,2015-03-24,2015-03-27,3,,
31 |
--------------------------------------------------------------------------------
/data/food_training/training_2015.csv:
--------------------------------------------------------------------------------
1 | ,,,,,,
2 | CourseName,Location,DateFrom,DateTo,Attendees,,
3 | Food Hygiene and Flexibility,Vilnius,2015-04-12,2015-04-17,2,,
4 | Plant Health Risks,Milan;Italy,2015-04-13,2015-04-16,1,,
5 | New Investigative Techniques,Madrid,2015-04-13,2015-04-16,2,,
6 | Foodborne Outbreaks Investigations,Lisbon,2015-04-13,2015-04-17,2,,
7 | Food Hygiene at Primary Production (Plants),Budapest,2015-04-13,2015-04-17,2,,
8 | Food Composition and Information,Trim; Ireland ,2015-04-13,2015-04-17,2,,
9 | Veterinary Medical Products,Venice,2015-04-14,2015-04-17,3,,
10 | Audit,Grange;Ireland,2015-04-20,2015-04-24,1,,
11 | Food Additives,Riga,2015-04-20,2015-04-24,2,,
12 | Movement of Cats and Dogs,Zagreb;Croatia,2015-04-21,2015-04-24,1,,
13 | Import Controls on Food and Feed of Non-Animal Origin,Genoa,2015-04-21,2015-04-24,3,,
14 | Animal by Products (Intermediate Level),Antwerp,2015-04-21,2015-04-24,2,,
15 | Microbiological Criteria,Riga,2015-05-04,2015-05-07,2,,
16 | Risk Assessment (Microbiological),Tallinn,2015-05-04,2015-05-08,1,,
17 | Animal identification registration and traceability,Warsaw,2015-05-04,2015-05-08,2,,
18 | Contingency Planning,Padua;Italy,2015-05-04,2015-05-08,3,,
19 | Food Hygiene and Flexibility,Coimbra;Portugal,2015-05-10,2015-05-15,2,,
20 | Food Composition and Information,Madrid,2015-05-11,2015-05-14,1,,
21 | Microbiological Criteria,Valencia,2015-05-11,2015-05-15,3,,
22 | New Investigative Techniques,Bratislava;Slovakia,2015-05-18,2015-05-21,2,,
23 | Border Inspection Post,Felixstowe;United Kingdom,2015-05-18,2015-05-21,4,,
24 | Semen Embryos and Ova,Venice,2015-05-18,2015-05-22,1,,
25 | RASFF,Valencia,2015-05-18,2015-05-22,1,,
26 | Traces (USE AT IMPORT OF CERTAIN FEED AND FOOD OF NON-ANIMAL ORIGIN),Riga,2015-05-19,2015-05-22,2,,
27 | Animal by Products (Upgraded),Maribor,2015-05-19,2015-05-22,2,,
28 | Animal Welfare (During Transport),Unknown;France,2015-05-26,2015-05-29,2,,
29 | Plant Health Risks,Lisbon,2015-06-01,2015-06-04,1,,
30 | Veterinary Medical Products,Krakow,2015-06-02,2015-06-05,2,,
31 | Animal Health and Disease Prevention for Bees and Zoo Animals,Maribor,2015-06-02,2015-06-05,2,,
32 | HACCP,Rome,2015-06-08,2015-06-11,2,,
33 | New Investigative Techniques,Prague,2015-06-08,2015-06-11,1,,
34 | Feed Law,Bremen,2015-06-08,2015-06-12,2,,
35 | Food Additives,Rotterdam;Netherlands,2015-06-09,2015-06-12,3,,
36 | Food Hygiene and Flexibility,Vilnius,2015-06-14,2015-06-19,1,,
37 | Risk Assessment (Pest),Tallinn,2015-06-15,2015-06-16,1,,
38 | Animal identification registration and traceability,Munich;Germany,2015-06-15,2015-06-19,2,,
39 | RASFF,Tallinn,2015-06-16,2015-06-19,1,,
40 | Animal Welfare (Pig Production),Unknown;Denmark,2015-06-16,2015-06-19,1,,
41 | Border Inspection Post,Vienna;Austria,2015-06-16,2015-06-19,3,,
42 | Food Additives,Trim;Ireland,2015-06-22,2015-06-22,2,,
43 | Traces (USE AT IMPORT OF LIVE PLANTS),Marseille;France,2015-06-23,2015-06-26,1,,
44 | Food Hygiene at Primary Production (Aquatic Animals),Venice/Udine; Italy ,2015-06-28,2015-10-02,1,,
45 | Semen Embryos and Ova,Lisbon,2015-06-29,2015-07-03,1,,
46 | Plant Health Risks,Lisbon,2015-07-06,2015-07-09,1,,
47 | Contingency Planning,Riga,2015-07-06,2015-07-10,1,,
48 | Import Controls on Food and Feed of Non-Animal Origin,Athens,2015-07-06,2015-07-10,2,,
49 | Pesticide Application Equipment,Barcelona,2015-07-07,2015-07-10,1,,
50 | Animal by Products (Advanced Level),Dusseldorf;Germany,2015-07-07,2015-07-10,1,,
51 | Veterinary Medical Products,Madrid,2015-07-07,2015-07-10,2,,
52 | Contingency Planning,Maribor;Slovenia,2015-09-01,2015-09-04,2,,
53 | Animal Welfare,Cardiff,2015-09-07,2015-09-11,2,,
54 | Plant Health Controls,Brussels/Antwerp,2015-09-15,2015-09-17,1,,
55 | Prevention Control and Eradication of Transmissible Spongiform Encephalopathies,Utrecht;Netherlands,2015-09-15,2015-09-18,2,,
56 | RASFF,Tallinn,2015-09-15,2015-09-18,1,,
57 | Control on Contaminants in Feed and Food,Rome;IT,2015-09-15,2015-09-18,3,,
58 | New Investigative Techniques,Prague,2015-09-21,2015-09-24,1,,
59 | Food Additives,Trim;Ireland,2015-09-21,2015-09-25,1,,
60 | Movement of Cats and Dogs,Malaga;Spain,2015-09-22,2015-09-25,2,,
61 | Animal by Products (Upgraded),Maribor;Slovenia,2015-09-22,2015-09-25,2,,
62 | Animal Welfare (In Hen Laying),Unknown;UK,2015-09-22,2015-09-25,2,,
63 | Import Controls on Food and Feed of Non-Animal Origin,Valencia;Spain,2015-09-22,2015-09-25,3,,
64 | Detection of counterfeit/illegal pesticides,Grange;Ireland,2015-09-23,2015-09-25,1,,
65 | Plant Health Controls,Warsaw;Poland,2015-09-28,2015-10-02,1,,
66 | Foodborne Outbreaks Investigations,Berlin,2015-09-28,2015-10-02,2,,
67 | HACCP,Budapest,2015-09-28,2015-10-02,1,,
68 | Veterinary Medical Products,Trim;Ireland,2015-09-29,2015-10-02,1,,
69 | Traces (USE AT IMPORT OF LIVE ANIMALS AND PRODUCTS OF ANIMAL ORIGIN),Budapest,2015-09-29,2015-10-02,2,,
70 | Risk Assessment (GMO),Tallinn;Estonia,2015-10-05,2015-10-09,1,,
71 | Control on Contaminants in Feed and Food,Riga,2015-10-05,2015-10-09,1,,
72 | Semen Embryos and Ova,Gothenburg,2015-10-05,2015-10-09,2,,
73 | Food Composition and Information,Trim;Ireland,2015-10-05,2015-10-09,1,,
74 | Animal Welfare (Killing For Disease Control),Unknown;Italy,2015-10-06,2015-10-09,1,,
75 | Contingency Planning,Grange;Ireland,2015-10-07,2015-10-09,2,,
76 | Food Hygiene at Primary Production (Land Animals),Trim;Ireland,2015-10-12,2015-10-16,3,,
77 | Animal Welfare (At Slaughter),Grange;Ireland,2015-10-13,2015-10-15,2,,
78 | Border Inspection Post,Felixstowe,2015-10-13,2015-10-16,4,,
79 | Food Composition and Information,D?sseldorf,2015-10-13,2015-10-16,2,,
80 | Food Hygiene and Flexibility,Graz,2015-10-18,2015-10-23,2,,
81 | Microbiological Criteria,Barcelona;Spain,2015-10-19,2015-10-22,2,,
82 | New Investigative Techniques,Madrid,2015-10-19,2015-10-22,3,,
83 | Animal identification registration and traceability,Munich,2015-10-19,2015-10-23,1,,
84 | Animal Welfare (At Slaughter Cattle pigs sheep and goats),Unknown;Italy,2015-10-20,2015-10-23,1,,
85 | Control on Contaminants in Feed and Food,Rome;IT,2015-10-20,2015-10-23,3,,
86 | Food Hygiene at Primary Production (Plants),Budapest,2015-10-26,2015-10-30,1,,
87 | HACCP,Sofia Bulgaria,2015-10-26,2015-10-30,2,,
88 | Movement of Cats and Dogs,London,2015-10-27,2015-10-30,3,,
89 | Import Controls on Food and Feed of Non-Animal Origin,Riga,2015-10-27,2015-10-30,3,,
90 | EU Feed Hygiene Rules and HACCP auditing,Budapest;Hungary,2015-11-03,2015-11-06,1,,
91 | Food Hygiene and Flexibility,Barcelona,2015-11-08,2015-11-13,1,,
92 | Plant Health Controls,Lisbon;Portugal,2015-11-16,2015-11-19,1,,
93 | Food Hygiene at Primary Production (Plants),Alicante/Murcia;Spain,2015-11-16,2015-11-20,1,,
94 | Border Inspection Post,Vienna,2015-11-17,2015-11-20,3,,
95 | Plant Health Controls,Unknown;Italy,2015-11-17,2015-11-20,1,,
96 | Microbiological Criteria,Rome,2015-11-23,2015-11-26,2,,
97 | Import Controls on Food and Feed of Non-Animal Origin,Frankfurt;Switzerland,2015-11-24,2015-11-27,3,,
98 | Control on Contaminants in Feed and Food,Brussels,2015-11-24,2015-11-27,3,,
99 | Foodborne Outbreaks Investigations,Rome,2015-11-30,2015-12-04,3,,
100 | Post-slaughter traceability of Meat (FVO),Grange;Ireland,2015-12-08,2015-12-10,3,,
101 | Plant Health Controls,London,2015-12-08,2015-12-10,2,,
102 | Movement of Cats and Dogs,Zagreb,2015-12-15,2015-12-18,1,,
103 | CONTROL OF ZOONOSES AND PREVENTION AND MONITORING OF ANTI-MICROBIAL RESISTANCE IN THE FOOD CHAIN (Zoon), Venice; Italy,2015-12-15,2015-12-18,1,,
104 | Food Hygiene at Primary Production (Aquatic Animals),Tarragona;Spain,2016-02-08,2016-02-12,3,,
105 | CONTROL OF ZOONOSES AND PREVENTION AND MONITORING OF ANTI-MICROBIAL RESISTANCE IN THE FOOD CHAIN (AMR),Krakow;Poland,2016-02-09,2016-02-12,1,,
106 | Food Composition and Information,Athens ; Greece,2016-03-07,2016-03-11,1,,
107 | Animal Welfare (Broiler Production),Unknown;Italy,2016-03-08,2016-03-11,1,,
108 | Food Composition and Information,Madrid;Spain,2016-01-18,2016-01-22,1,,
109 | Movement of Cats and Dogs,Zagreb,2016-01-19,2016-01-22,1,,
110 | Foodborne Outbreaks Investigations,Berlin,2016-02-01,2016-02-05,2,,
111 | Feed Law,Bremen;Germany,2016-02-01,2016-02-05,2,,
112 | Food Composition and Information,Athens;Greece,2016-02-01,2016-02-05,1,,
113 | Animal Welfare (Cattle pigs sheep and goats: Advanced level course),Unknown;Spain,2016-02-02,2016-02-05,2,,
114 | Food Hygiene at Primary Production (Plants),Valencia;Spain,2016-02-15,2016-02-19,2,,
115 | Animal Health and Disease Prevention for Bees and Zoo Animals,London ,2016-02-16,2016-02-18,2,,
116 | Animal by Products (Intermediate Level),Antwerp; Belgium,2016-02-16,2016-02-19,2,,
117 | Pesticide Application Equipment,Torino;Italy,2016-02-22,2016-02-25,1,,
118 | Semen Embryos and Ova,Lisbon ,2016-02-22,2016-02-26,2,,
119 | Food Hygiene at Primary Production (Land Animals), Budapest,2016-02-29,2016-03-04,1,,
120 | Plant Health Controls,Venice/Treviso,2016-03-07,2016-03-11,2,,
121 | Import Controls on Food and Feed of Non-Animal Origin,Genoa;Italy,2016-03-08,2016-03-11,4,,
122 | Feed Law,Nantes;France,2016-03-14,2016-03-18,3,,
123 | Animal by Products (Upgraded),Maribor;Slovenia,2016-03-15,2016-03-18,2,,
124 | EU Feed Hygiene Rules and HACCP auditing ,Barcelona,2016-03-31,2016-06-03,2,,
125 |
--------------------------------------------------------------------------------
/data/food_training/training_2016.csv:
--------------------------------------------------------------------------------
1 | ,,,,,,
2 | CourseName,Location,DateFrom,DateTo,Attendees,,
3 | Food Composition and Information,Valencia;Spain,2016-04-04,2016-04-08,2,,
4 | HACCP,Dublin;Ireland,2016-04-04,2016-04-08,3,,
5 | Food Hygiene and Flexibility,Vilnius;Lithuania,2016-04-04,2016-04-08,2,,
6 | Foodborne Outbreaks Investigations,Berlin,2016-04-04,2016-04-08,2,,
7 | Traces (USE AT IMPORT OF LIVE ANIMALS AND PRODUCTS OF ANIMAL ORIGIN),Alicante;Spain,2016-04-05,2016-04-08,1,,
8 | Animal identification registration and traceability,Lyon,2016-04-11,2016-04-15,1,,
9 | Audit B1 - Standard Level,Grange ; Ireland,2016-04-11,2016-04-15,2,,
10 | Food Additives Type 1,Athens;Greece,2016-04-25,2016-04-29,1,,
11 | New Investigative Techniques - A,Rome,2016-04-25,2016-04-28,2,,
12 | Plant Health Controls,Venice/Treviso,2016-04-25,2016-04-29,2,,
13 | Foodborne Outbreaks Investigations,Tallinn,2016-04-25,2016-04-29,3,,
14 | Import Controls on Food and Feed of Non-Animal Origin,Valencia,2016-04-26,2016-04-29,3,,
15 | Animal Health and Disease Prevention for Bees and Zoo Animals,Prague,2016-04-26,2016-04-29,1,,
16 | Foodborne Outbreaks Investigations,Lisbon,2016-04-29,2016-03-04,2,,
17 | Contingency Planning,Cardiff;UK,2016-05-09,2016-05-13,1,,
18 | New Investigative Techniques - Standard B1,Prague,2016-05-09,2016-05-12,2,,
19 | Border Inspection Post,Felixstowe,2016-05-10,2016-05-13,5,,
20 | Auditing Plastic Recycling Processes,Treviso,2016-05-10,2016-05-13,1,,
21 | EU Feed Hygiene Rules and HACCP auditing ,Amsterdam,2016-05-10,2016-05-13,2,,
22 | Traces (USE AT IMPORT OF CERTAIN FEED AND FOOD OF NON-ANIMAL ORIGIN),Tallinn,2016-05-10,2016-05-13,1,,
23 | Control on Contaminants in Feed and Food,Brussels;Belgium,2016-05-10,2016-05-13,1,,
24 | HACCP,Ljubljana,2016-05-16,2016-05-20,1,,
25 | Food Hygiene and Flexibility (Decision Makers),Barcelona,2016-05-16,2016-05-20,1,,
26 | Microbiological Criteria,Barcelona,2016-05-16,2016-05-19,2,,
27 | Plant Health Controls,Brussels/Antwerp,2016-05-17,2016-05-19,2,,
28 | Animal Welfare (During Transport),Poland,2016-05-17,2016-05-20,1,,
29 | Audit A,Budapest,2016-05-23,2016-05-27,2,,
30 | HACCP,Budapest,2016-05-23,2016-05-27,2,,
31 | Foodborne Outbreaks Investigations,Rome,2016-05-23,2016-05-27,4,,
32 | Food Composition and Information,Prague,2016-05-30,2016-06-03,1,,
33 | Prevention Control and Eradication of Transmissible Spongiform Encephalopathies,Ljubljana,2016-05-31,2016-06-03,2,,
34 | Movement of Cats and Dogs,Malaga,2016-05-31,2016-06-03,2,,
35 | Plant Health Controls,Naples;Italy,2016-06-06,2016-06-09,2,,
36 | Control on Contaminants in Feed and Food,Prague;CZ,2016-06-07,2016-06-10,4,,
37 | Animal by Products (Upgraded),Antwerp;Belgium,2016-06-07,2016-06-10,1,,
38 | HACCP,Ljubljana,2016-06-13,2016-06-17,2,,
39 | Feed Law,Riga,2016-06-13,2016-06-17,4,,
40 | Food Composition and Information,Prague,2016-06-13,2016-06-17,3,,
41 | Animal identification registration and traceability,lisbon,2016-06-13,2016-06-17,1,,
42 | Food Additives Type 1,Trim;Ireland,2016-06-13,2016-06-17,2,,
43 | Food Hygiene and Flexibility,Turin,2016-06-13,2016-06-17,1,,
44 | Microbiological Criteria,Riga;Latvia,2016-06-13,2016-06-16,2,,
45 | Border Inspection Post,Felixstowe,2016-06-14,2016-06-17,6,,
46 | Traces (USE AT IMPORT OF LIVE PLANTS),Riga,2016-06-14,2016-06-17,2,,
47 | Animal Welfare (Poultry at Slaughter: Advanced level course),Unknown;Spain,2016-06-14,2016-06-17,2,,
48 | Audit B1 - Standard Level,Trim,2016-06-20,2016-06-24,1,,
49 | Plant Health Controls,Vienna,2016-06-20,2016-06-24,2,,
50 | Foodborne Outbreaks Investigations,Lisbon,2016-06-20,2016-06-24,2,,
51 | CONTROL OF ZOONOSES AND PREVENTION AND MONITORING OF ANTI-MICROBIAL RESISTANCE IN THE FOOD CHAIN (Zoon),Uppsala;Sweden,2016-06-21,2016-06-24,1,,
52 | Import Controls on Food and Feed of Non-Animal Origin,Frankfurt,2016-06-21,2016-06-24,3,,
53 | New Investigative Techniques - Standard B1,Madrid,2016-06-27,2016-06-30,2,,
54 | Plant Protection Products,Lisbon,2016-06-27,2016-06-30,2,,
55 | Animal Health and Disease Prevention for Bees and Zoo Animals,Maribor,2016-06-28,2016-07-01,1,,
56 | Animal by Products (Upgraded),Dusseldorf,2016-07-05,2016-07-08,1,,
57 | CONTROL OF ZOONOSES AND PREVENTION AND MONITORING OF ANTI-MICROBIAL RESISTANCE IN THE FOOD CHAIN (AMR),Trim;Ireland,2016-07-05,2016-07-08,1,,
58 | Food Additives Type 1,Valencia;Spain,2016-07-11,2016-07-15,1,,
59 | Plant Protection Products,Berlin,2016-08-29,2016-09-01,1,,
60 | Food Composition and Information,Trim;Ireland ,2016-09-05,2016-09-09,1,,
61 | Microbiological Criteria,Riga,2016-09-05,2016-09-08,2,,
62 | Movement of Cats and Dogs,Milan,2016-09-06,2016-09-09,1,,
63 | Food Hygiene and Flexibility,Coimbra,2016-09-12,2016-09-16,2,,
64 | Plant Health Controls,Naples,2016-09-12,2016-09-16,3,,
65 | Foodborne Outbreaks Investigations,Tallinn,2016-09-12,2016-09-16,3,,
66 | Animal by Products (Upgraded),Dusseldorf,2016-09-13,2016-09-16,1,,
67 | Animal Health and Disease Prevention for Bees and Zoo Animals,Maribor,2016-09-13,2016-09-16,4,,
68 | Control on Contaminants in Feed and Food,Sofia;BG,2016-09-13,2016-09-16,5,,
69 | Import Controls on Food and Feed of Non-Animal Origin,Rotterdam/Delft,2016-09-13,2016-09-16,13,,
70 | Animal identification registration and traceability,Warsaw,2016-09-19,2016-09-23,1,,
71 | Food Additives Type 1,Trim;Ireland,2016-09-19,2016-09-23,3,,
72 | HACCP,Lyon,2016-09-19,2016-09-23,2,,
73 | Traces (USE AT INTRA-EU TRADE OF LIVE ANIMALS),Madrid,2016-09-20,2016-09-23,2,,
74 | New Investigative Techniques - Advanced B2,Madrid,2016-09-25,2016-09-28,2,,
75 | Audit B1 - Standard Level,Bratislava,2016-09-26,2016-09-30,4,,
76 | Animal Welfare (In Pig Production),Denmark,2016-09-27,2016-09-30,3,,
77 | EU Feed Hygiene Rules and HACCP auditing ,Budapest,2016-09-27,2016-09-30,4,,
78 | Plant Health Controls,Warsaw,2016-10-03,2016-10-07,4,,
79 | Audit B2 - Advanced Level,Bratislava,2016-10-03,2016-10-07,2,,
80 | Food Hygiene and Flexibility,Vilnius,2016-10-03,2016-10-07,3,,
81 | Microbiological Criteria,Barcelona,2016-10-03,2016-10-06,2,,
82 | Border Inspection Post,Felixstowe,2016-10-04,2016-10-07,4,,
83 | Control on Contaminants in Feed and Food,Rome;IT,2016-10-04,2016-10-07,3,,
84 | Contingency Planning,Thessaloniki;GR,2016-10-10,2016-10-14,2,,
85 | Foodborne Outbreaks Investigations,Rome,2016-10-10,2016-10-14,2,,
86 | Animal by Products (Intermediate Level),Antwerp;Belgium,2016-10-11,2016-10-14,2,,
87 | Audit B1 - Standard Level,Valencia,2016-10-17,2016-10-21,1,,
88 | HACCP (German),Budapest,2016-10-24,2016-10-28,2,,
89 | New Investigative Techniques - Advanced B2,Madrid,2016-10-24,2016-10-27,1,,
90 | Animal Welfare (In Hen Laying),Unknown;UK,2016-10-25,2016-10-28,2,,
91 | Import Controls on Food and Feed of Non-Animal Origin,Riga,2016-10-25,2016-10-28,3,,
92 | Auditing Plastic Recycling Processes,Leipzig;Germany,2016-10-25,2016-10-28,2,,
93 | Contingency Planning,Cardiff;UK,2016-11-07,2016-11-11,2,,
94 | Audit A,Amsterdam;Netherlands,2016-11-07,2016-11-11,2,,
95 | Animal identification registration and traceability,Ljubljana,2016-11-07,2016-11-11,1,,
96 | Food Additives Type 1,Athens;Greece,2016-11-07,2016-11-11,1,,
97 | Food Hygiene and Flexibility,Turin;Italy,2016-11-07,2016-11-11,2,,
98 | Foodborne Outbreaks Investigations,Berlin;Germany,2016-11-07,2016-11-11,1,,
99 | Border Inspection Post,Vienna,2016-11-08,2016-11-11,3,,
100 | Animal Welfare (In Pig Production),Unknown;Italy,2016-11-15,2016-11-18,1,,
101 | Traces (USE AT INTRA-EU TRADE OF LIVE ANIMALS),Torino,2016-11-15,2016-11-18,2,,
102 | Audit B2 - Advanced Level,Berlin,2016-11-21,2016-11-25,2,,
103 | Prevention Control and Eradication of Transmissible Spongiform Encephalopathies,Ljubljana;SI,2016-11-22,2016-11-25,2,,
104 | HACCP,Valencia,2016-11-28,2016-12-02,2,,
105 | Workshop on Fishery Products (FVO),Grange;Ireland,2016-12-01,2015-12-03,3,,
106 | Contingency Planning,Venice;IT,2016-12-05,2016-12-09,2,,
107 | Animal identification registration and traceability,Lisbon;Portugal,2016-12-05,2016-12-09,1,,
108 | Audit A,Seville;Spain,2016-12-12,2016-12-16,1,,
109 | CONTROL OF ZOONOSES AND PREVENTION AND MONITORING OF ANTI-MICROBIAL RESISTANCE IN THE FOOD CHAIN (AMR),Athens;Greece,2016-12-13,2016-12-16,1,,
110 | Food Additives Type 2,Trim;Ireland,2017-01-23,2017-01-27,3,,
111 | Food Hygiene and Flexibility,Barcelona,2017-02-06,2017-02-10,1,,
112 | Auditing Plastic Recycling Processes,Treviso;Italy,2017-02-07,2017-02-10,2,,
113 | CONTROL OF ZOONOSES AND PREVENTION AND MONITORING OF ANTI-MICROBIAL RESISTANCE IN THE FOOD CHAIN (Zoon),Venice;Italy,2017-02-13,2017-02-16,1,,
114 | Contingency Planning,Venice;IT,2017-03-06,2017-03-10,3,,
115 | Audit B1 - Standard Level,Valencia,2017-03-06,2017-03-10,2,,
116 | Audit A,Bratislava,2017-03-20,2017-03-24,1,,
117 | Food Hygiene and Flexibility,Zagreb/Helsinki,2017-03-20,2017-03-24,2,,
118 | Animal Health and Disease Prevention for Bees and Zoo Animals,Antwerp,2017-03-28,2017-03-31,2,,
119 |
--------------------------------------------------------------------------------
/media/colab/image1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HumbleData/beginners-data-workshop/94e1eb90e7694903badff6e535451ead42247c13/media/colab/image1.png
--------------------------------------------------------------------------------
/media/colab/image10.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HumbleData/beginners-data-workshop/94e1eb90e7694903badff6e535451ead42247c13/media/colab/image10.png
--------------------------------------------------------------------------------
/media/colab/image2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HumbleData/beginners-data-workshop/94e1eb90e7694903badff6e535451ead42247c13/media/colab/image2.png
--------------------------------------------------------------------------------
/media/colab/image3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HumbleData/beginners-data-workshop/94e1eb90e7694903badff6e535451ead42247c13/media/colab/image3.png
--------------------------------------------------------------------------------
/media/colab/image4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HumbleData/beginners-data-workshop/94e1eb90e7694903badff6e535451ead42247c13/media/colab/image4.png
--------------------------------------------------------------------------------
/media/colab/image5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HumbleData/beginners-data-workshop/94e1eb90e7694903badff6e535451ead42247c13/media/colab/image5.png
--------------------------------------------------------------------------------
/media/colab/image6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HumbleData/beginners-data-workshop/94e1eb90e7694903badff6e535451ead42247c13/media/colab/image6.png
--------------------------------------------------------------------------------
/media/colab/image7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HumbleData/beginners-data-workshop/94e1eb90e7694903badff6e535451ead42247c13/media/colab/image7.png
--------------------------------------------------------------------------------
/media/colab/image8.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HumbleData/beginners-data-workshop/94e1eb90e7694903badff6e535451ead42247c13/media/colab/image8.png
--------------------------------------------------------------------------------
/media/colab/image9.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HumbleData/beginners-data-workshop/94e1eb90e7694903badff6e535451ead42247c13/media/colab/image9.png
--------------------------------------------------------------------------------
/media/humble-data-logo-transparent.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HumbleData/beginners-data-workshop/94e1eb90e7694903badff6e535451ead42247c13/media/humble-data-logo-transparent.png
--------------------------------------------------------------------------------
/media/humble-data-logo-white-transparent.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HumbleData/beginners-data-workshop/94e1eb90e7694903badff6e535451ead42247c13/media/humble-data-logo-white-transparent.png
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | jupyterlab==4.0.8
2 | notebook==7.0.6
3 | pandas==2.1.3
4 | numpy==1.26.2
5 | matplotlib==3.8.2
6 | seaborn==0.13.0
7 |
--------------------------------------------------------------------------------
/solutions/01_01.py:
--------------------------------------------------------------------------------
1 | name = "Anne"
2 | print(name)
--------------------------------------------------------------------------------
/solutions/01_02.py:
--------------------------------------------------------------------------------
1 | name = 8
2 | print(name)
--------------------------------------------------------------------------------
/solutions/01_03.py:
--------------------------------------------------------------------------------
1 | "I'm enjoying this workshop!"
--------------------------------------------------------------------------------
/solutions/01_04.py:
--------------------------------------------------------------------------------
1 | # Note: this line is intentionally broken. The apostrophe in "I'm" ends the
2 | # string early, so Python raises a SyntaxError (see 01_05 for the fix).
3 | 'I'm enjoying this workshop!'
--------------------------------------------------------------------------------
/solutions/01_05.py:
--------------------------------------------------------------------------------
1 | 'I\'m enjoying this workshop!'
--------------------------------------------------------------------------------
/solutions/01_06.py:
--------------------------------------------------------------------------------
1 | s[-1]
--------------------------------------------------------------------------------
/solutions/01_07.py:
--------------------------------------------------------------------------------
1 | s[-3:]
--------------------------------------------------------------------------------
/solutions/01_08.py:
--------------------------------------------------------------------------------
1 | "I" not in s
--------------------------------------------------------------------------------
/solutions/01_09.py:
--------------------------------------------------------------------------------
1 | 3 + 4
--------------------------------------------------------------------------------
/solutions/01_10.py:
--------------------------------------------------------------------------------
1 | 10.0 - 6
--------------------------------------------------------------------------------
/solutions/01_11.py:
--------------------------------------------------------------------------------
1 | 15 * 12
--------------------------------------------------------------------------------
/solutions/01_12.py:
--------------------------------------------------------------------------------
1 | 2**6
--------------------------------------------------------------------------------
/solutions/01_13.py:
--------------------------------------------------------------------------------
1 | 3.1**2
--------------------------------------------------------------------------------
/solutions/01_14.py:
--------------------------------------------------------------------------------
1 | 5.0**2
--------------------------------------------------------------------------------
/solutions/01_15.py:
--------------------------------------------------------------------------------
1 | 6 / 2
--------------------------------------------------------------------------------
/solutions/01_16.py:
--------------------------------------------------------------------------------
1 | 6 // 2
--------------------------------------------------------------------------------
/solutions/01_17.py:
--------------------------------------------------------------------------------
1 | 19 / 5
--------------------------------------------------------------------------------
/solutions/01_18.py:
--------------------------------------------------------------------------------
1 | 19 // 5
--------------------------------------------------------------------------------
/solutions/01_19.py:
--------------------------------------------------------------------------------
1 | 19 % 5
--------------------------------------------------------------------------------
/solutions/01_20.py:
--------------------------------------------------------------------------------
1 | False != 2
--------------------------------------------------------------------------------
/solutions/01_21.py:
--------------------------------------------------------------------------------
1 | len("Sandrine") > 8
--------------------------------------------------------------------------------
/solutions/01_22.py:
--------------------------------------------------------------------------------
1 | (len("Sandrine") > 5) and (len("Cheuk") < 7)
--------------------------------------------------------------------------------
/solutions/01_23.py:
--------------------------------------------------------------------------------
1 | list_greeting[0]
--------------------------------------------------------------------------------
/solutions/01_24.py:
--------------------------------------------------------------------------------
1 | list_greeting[3:]
--------------------------------------------------------------------------------
/solutions/01_25.py:
--------------------------------------------------------------------------------
1 | list_greeting[:4]
--------------------------------------------------------------------------------
/solutions/01_26.py:
--------------------------------------------------------------------------------
1 | list_greeting[::2]
--------------------------------------------------------------------------------
/solutions/01_27.py:
--------------------------------------------------------------------------------
1 | list_greeting[2] = "Ola"
2 | print(list_greeting)
--------------------------------------------------------------------------------
/solutions/01_28.py:
--------------------------------------------------------------------------------
1 | 10 in list_greeting
--------------------------------------------------------------------------------
/solutions/01_29.py:
--------------------------------------------------------------------------------
1 | "Ole" not in list_greeting
--------------------------------------------------------------------------------
/solutions/01_30.py:
--------------------------------------------------------------------------------
1 | print("Here we are!")
--------------------------------------------------------------------------------
/solutions/01_31.py:
--------------------------------------------------------------------------------
1 | len(snakes)
--------------------------------------------------------------------------------
/solutions/01_32.py:
--------------------------------------------------------------------------------
1 | len(list_greeting)
--------------------------------------------------------------------------------
/solutions/01_33.py:
--------------------------------------------------------------------------------
1 | max(1, 2, 3, 4, 5)
--------------------------------------------------------------------------------
/solutions/01_34.py:
--------------------------------------------------------------------------------
1 | round(123.45)
--------------------------------------------------------------------------------
/solutions/01_35.py:
--------------------------------------------------------------------------------
1 | round(123.45, 1)
--------------------------------------------------------------------------------
/solutions/01_36.py:
--------------------------------------------------------------------------------
1 | list_greeting.append("Aloha")
--------------------------------------------------------------------------------
/solutions/01_37.py:
--------------------------------------------------------------------------------
1 | from math import sqrt
2 |
3 | sqrt(24336)
--------------------------------------------------------------------------------
/solutions/01_38.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 |
3 | np.sin(np.pi / 4)
--------------------------------------------------------------------------------
/solutions/02_01.py:
--------------------------------------------------------------------------------
1 | import numpy as np
--------------------------------------------------------------------------------
/solutions/02_02.py:
--------------------------------------------------------------------------------
1 | df = pd.read_csv("../data/Penguins/penguins.csv")
--------------------------------------------------------------------------------
/solutions/02_03.py:
--------------------------------------------------------------------------------
1 | df.tail(3)
--------------------------------------------------------------------------------
/solutions/02_04.py:
--------------------------------------------------------------------------------
1 | df.shape
--------------------------------------------------------------------------------
/solutions/02_05.py:
--------------------------------------------------------------------------------
1 | df.info()
--------------------------------------------------------------------------------
/solutions/02_06.py:
--------------------------------------------------------------------------------
1 | df.columns
--------------------------------------------------------------------------------
/solutions/02_07.py:
--------------------------------------------------------------------------------
1 | pd.options.display.max_rows = 25
--------------------------------------------------------------------------------
/solutions/02_08.py:
--------------------------------------------------------------------------------
1 | df["bill_length_mm"]
--------------------------------------------------------------------------------
/solutions/02_09.py:
--------------------------------------------------------------------------------
1 | df.iloc[11]
--------------------------------------------------------------------------------
/solutions/02_10.py:
--------------------------------------------------------------------------------
1 | df.loc[11]
--------------------------------------------------------------------------------
/solutions/02_11.py:
--------------------------------------------------------------------------------
1 | df.iloc[-3:, 2]
--------------------------------------------------------------------------------
/solutions/02_12.py:
--------------------------------------------------------------------------------
1 | df.loc[352:, "bill_length_mm"]
--------------------------------------------------------------------------------
/solutions/02_13.py:
--------------------------------------------------------------------------------
1 | df.iloc[[145, 7, 0], [4, -2]]
--------------------------------------------------------------------------------
/solutions/02_14.py:
--------------------------------------------------------------------------------
1 | df.loc[[145, 7, 0], ["flipper_length_mm", "body_mass_g"]]
--------------------------------------------------------------------------------
/solutions/02_15.py:
--------------------------------------------------------------------------------
1 | mask_mass_flipper = (df["body_mass_g"] > 4000) & (df["flipper_length_mm"] < 185)
2 | df[mask_mass_flipper]
--------------------------------------------------------------------------------
/solutions/02_16.py:
--------------------------------------------------------------------------------
1 | df["species"].unique()
--------------------------------------------------------------------------------
/solutions/02_17.py:
--------------------------------------------------------------------------------
1 | df["flipper_length_mm"].isnull().sum()
--------------------------------------------------------------------------------
/solutions/02_18.py:
--------------------------------------------------------------------------------
1 | df["sex"].value_counts(dropna=False)
--------------------------------------------------------------------------------
/solutions/02_19.py:
--------------------------------------------------------------------------------
1 | df["species"].value_counts(normalize=True)
--------------------------------------------------------------------------------
/solutions/02_20.py:
--------------------------------------------------------------------------------
1 | df[df["flipper_length_mm"].isnull()].index
--------------------------------------------------------------------------------
/solutions/02_21.py:
--------------------------------------------------------------------------------
1 | ?pd.DataFrame.dropna
--------------------------------------------------------------------------------
/solutions/02_22.py:
--------------------------------------------------------------------------------
1 | df_2 = df.dropna(how="all")
--------------------------------------------------------------------------------
/solutions/02_23.py:
--------------------------------------------------------------------------------
1 | print(f"number of rows of df_2: {df_2.shape[0]}")
--------------------------------------------------------------------------------
/solutions/02_24.py:
--------------------------------------------------------------------------------
1 | df_3 = df_2.dropna(how="any")
--------------------------------------------------------------------------------
/solutions/02_25.py:
--------------------------------------------------------------------------------
1 | print(f"number of rows of df_3: {df_3.shape[0]}")
--------------------------------------------------------------------------------
/solutions/02_26.py:
--------------------------------------------------------------------------------
1 | df_4 = df_3.drop_duplicates()
--------------------------------------------------------------------------------
/solutions/02_27.py:
--------------------------------------------------------------------------------
1 | df_4.describe()
--------------------------------------------------------------------------------
/solutions/02_28.py:
--------------------------------------------------------------------------------
1 | df_4.dtypes
--------------------------------------------------------------------------------
/solutions/02_29.py:
--------------------------------------------------------------------------------
1 | df_4.min(numeric_only=True)
--------------------------------------------------------------------------------
/solutions/02_30.py:
--------------------------------------------------------------------------------
1 | df_4["flipper_length_mm"].max()
--------------------------------------------------------------------------------
/solutions/02_31.py:
--------------------------------------------------------------------------------
1 | df_4.groupby("species").median(numeric_only=True)
--------------------------------------------------------------------------------
/solutions/02_32.py:
--------------------------------------------------------------------------------
1 | df_4.to_csv("../data/Penguins/my_penguins.csv")
--------------------------------------------------------------------------------
/solutions/04_01.py:
--------------------------------------------------------------------------------
1 | dict_greeting["Italy"]
--------------------------------------------------------------------------------
/solutions/04_02.py:
--------------------------------------------------------------------------------
1 | dict_greeting["UK"] = "Good Morning"
2 | print(dict_greeting)
--------------------------------------------------------------------------------
/solutions/04_03.py:
--------------------------------------------------------------------------------
1 | dict_greeting["Hawaii"] = "Aloha"
2 | print(dict_greeting)
--------------------------------------------------------------------------------
/solutions/04_04.py:
--------------------------------------------------------------------------------
1 | x = -1
2 | y = 2
3 | z = 12
4 |
5 | if x > 0:
6 | print("Python")
7 | elif y == 2:
8 | print("sunshine")
9 | elif z % 3 == 0:
10 | print("data")
11 | else:
12 | print("Why?")
--------------------------------------------------------------------------------
/solutions/04_05.py:
--------------------------------------------------------------------------------
1 | ?is_greeting
--------------------------------------------------------------------------------
/solutions/04_06.py:
--------------------------------------------------------------------------------
1 | # Remember to call your function
2 | # For example:
3 | #
4 | # print(f(x))
5 | #
6 |
7 |
8 | def f(x):
9 | """Returns the argument multiplied by 3 and increased by 10."""
10 | return (x * 3) + 10
--------------------------------------------------------------------------------
/solutions/04_07.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HumbleData/beginners-data-workshop/94e1eb90e7694903badff6e535451ead42247c13/solutions/04_07.py
--------------------------------------------------------------------------------
/solutions/04_08.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HumbleData/beginners-data-workshop/94e1eb90e7694903badff6e535451ead42247c13/solutions/04_08.py
--------------------------------------------------------------------------------
/solutions/04_09.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HumbleData/beginners-data-workshop/94e1eb90e7694903badff6e535451ead42247c13/solutions/04_09.py
--------------------------------------------------------------------------------
/solutions/05_01.py:
--------------------------------------------------------------------------------
1 | import datetime
2 |
3 | import matplotlib.pyplot as plt
4 | import pandas as pd
5 |
6 | %matplotlib inline
--------------------------------------------------------------------------------
/solutions/05_02.py:
--------------------------------------------------------------------------------
1 | df_2014 = pd.read_csv("../data/food_training/training_2014.csv")
--------------------------------------------------------------------------------
/solutions/05_03.py:
--------------------------------------------------------------------------------
1 | df_2014.head()
--------------------------------------------------------------------------------
/solutions/05_04.py:
--------------------------------------------------------------------------------
1 | df_2014 = pd.read_csv("../data/food_training/training_2014.csv", header=1)
--------------------------------------------------------------------------------
/solutions/05_05.py:
--------------------------------------------------------------------------------
1 | df_2014.head()
--------------------------------------------------------------------------------
/solutions/05_06.py:
--------------------------------------------------------------------------------
1 | df_2015 = pd.read_csv("../data/food_training/training_2015.csv", header=1)
2 | df_2016 = pd.read_csv("../data/food_training/training_2016.csv", header=1)
--------------------------------------------------------------------------------
/solutions/05_07.py:
--------------------------------------------------------------------------------
1 | frames = [df_2014, df_2015, df_2016]
2 | df = pd.concat(frames)
--------------------------------------------------------------------------------
/solutions/05_08.py:
--------------------------------------------------------------------------------
1 | df.shape
--------------------------------------------------------------------------------
/solutions/05_09.py:
--------------------------------------------------------------------------------
1 | df.index
--------------------------------------------------------------------------------
/solutions/05_10.py:
--------------------------------------------------------------------------------
1 | df = df.reset_index()
2 | df.index
3 |
4 | # We could also have done the following when concatenating:
5 | # df = pd.concat(frames, ignore_index=True)
--------------------------------------------------------------------------------
/solutions/05_11.py:
--------------------------------------------------------------------------------
1 | df.info()
--------------------------------------------------------------------------------
/solutions/05_12.py:
--------------------------------------------------------------------------------
1 | ?pd.DataFrame.drop
--------------------------------------------------------------------------------
/solutions/05_13.py:
--------------------------------------------------------------------------------
1 | cols_to_remove = ["Unnamed: 5", "Unnamed: 6"]
2 | df = df.drop(cols_to_remove, axis=1)
--------------------------------------------------------------------------------
/solutions/05_14.py:
--------------------------------------------------------------------------------
1 | df["Location"].unique()
--------------------------------------------------------------------------------
/solutions/05_15.py:
--------------------------------------------------------------------------------
1 | df["Location"].str.split(pat=";")
--------------------------------------------------------------------------------
/solutions/05_16.py:
--------------------------------------------------------------------------------
1 | df["Location"].str.split(pat=";", expand=True)
--------------------------------------------------------------------------------
/solutions/05_17.py:
--------------------------------------------------------------------------------
1 | df[["city", "country"]] = df["Location"].str.split(pat=";", expand=True)
--------------------------------------------------------------------------------
/solutions/05_18.py:
--------------------------------------------------------------------------------
1 | df = df.drop("Location", axis=1)
--------------------------------------------------------------------------------
/solutions/05_19.py:
--------------------------------------------------------------------------------
1 | df["country"].nunique()
--------------------------------------------------------------------------------
/solutions/05_20.py:
--------------------------------------------------------------------------------
1 | df["country"].value_counts()
--------------------------------------------------------------------------------
/solutions/05_21.py:
--------------------------------------------------------------------------------
1 | df["country"] = df["country"].str.strip()
2 | df["city"] = df["city"].str.strip()
--------------------------------------------------------------------------------
/solutions/05_22.py:
--------------------------------------------------------------------------------
1 | df["country"].nunique()
--------------------------------------------------------------------------------
/solutions/05_23.py:
--------------------------------------------------------------------------------
1 | df[df["country"] == "Portugal"]
--------------------------------------------------------------------------------
/solutions/05_24.py:
--------------------------------------------------------------------------------
1 | df["city"] = df["city"].str.lower()
--------------------------------------------------------------------------------
/solutions/05_25.py:
--------------------------------------------------------------------------------
1 | df["city"][df["city"].str.contains("/")]
--------------------------------------------------------------------------------
/solutions/05_26.py:
--------------------------------------------------------------------------------
1 | df["city"] = df["city"].str.replace(r"/\w*", "", regex=True)
--------------------------------------------------------------------------------
/solutions/05_27.py:
--------------------------------------------------------------------------------
1 | dict_codes = {
2 | "BG": "Bulgaria",
3 | "CZ": "Czech Republic",
4 | "IT": "Italy",
5 | "GR": "Greece",
6 | "SI": "Slovenia",
7 | "UK": "United Kingdom",
8 | }
9 |
10 | country_in_codes = df["country"].isin(dict_codes.keys())
11 | df.loc[country_in_codes, "country"] = df.loc[country_in_codes, "country"].map(dict_codes)
--------------------------------------------------------------------------------
/solutions/05_28.py:
--------------------------------------------------------------------------------
1 | df.loc[df["city"] == "unknown", "country"]
--------------------------------------------------------------------------------
/solutions/05_29.py:
--------------------------------------------------------------------------------
1 | dict_capitals = {
2 | "Denmark": "copenhague",
3 | "France": "paris",
4 | "Italy": "rome",
5 | "Spain": "madrid",
6 | "United Kingdom": "london",
7 | }
8 |
9 | unknown_city = df["city"] == "unknown"
10 | df.loc[unknown_city, "city"] = df.loc[unknown_city, "country"].map(dict_capitals)
--------------------------------------------------------------------------------
/solutions/05_30.py:
--------------------------------------------------------------------------------
1 | set(df["city"]) - dict_cities.keys()
--------------------------------------------------------------------------------
/solutions/05_31.py:
--------------------------------------------------------------------------------
1 | dict_cities.update(
2 | {
3 | "bristol": "United Kingdom",
4 | "gothenburg": "Sweden",
5 | "graz": "Austria",
6 | "lyon": "France",
7 | "murcia": "Spain",
8 | "parma": "Italy",
9 | },
10 | )
--------------------------------------------------------------------------------
/solutions/05_32.py:
--------------------------------------------------------------------------------
1 | null_country = df["country"].isnull()
2 | df.loc[null_country, "country"] = df.loc[null_country, "city"].map(dict_cities)
--------------------------------------------------------------------------------
/solutions/05_33.py:
--------------------------------------------------------------------------------
1 | df["country"].value_counts(dropna=False)
--------------------------------------------------------------------------------
/solutions/05_34.py:
--------------------------------------------------------------------------------
1 | def f(x):
2 | if x == 1:
3 | return "single"
4 | else:
5 | return "multiple"
--------------------------------------------------------------------------------
/solutions/05_35.py:
--------------------------------------------------------------------------------
1 | df["Attendees"].apply(f)
--------------------------------------------------------------------------------
/solutions/05_36.py:
--------------------------------------------------------------------------------
1 | languages = pd.read_csv("../data/food_training/languages.csv")
--------------------------------------------------------------------------------
/solutions/05_37.py:
--------------------------------------------------------------------------------
1 | df = df.merge(languages, how="left", left_on="country", right_on="Country")
--------------------------------------------------------------------------------
/solutions/05_38.py:
--------------------------------------------------------------------------------
1 | df = df.drop("Country", axis=1)
2 |
3 | # N.B. You can only run this cell once! If you try to run it again, it will throw an error!
4 | # Why? Because once you drop the Country column, it is removed, so you can't
5 | # drop it a second time: the column isn't there to drop!
--------------------------------------------------------------------------------
/solutions/05_39.py:
--------------------------------------------------------------------------------
1 | df["DateFrom"].dtype
--------------------------------------------------------------------------------
/solutions/05_40.py:
--------------------------------------------------------------------------------
1 | df["DateFrom"] = pd.to_datetime(df["DateFrom"], format="%Y-%m-%d")
2 | df["DateTo"] = pd.to_datetime(df["DateTo"], format="%Y-%m-%d")
--------------------------------------------------------------------------------
/solutions/05_41.py:
--------------------------------------------------------------------------------
1 | df[df["DateFrom"] > "2017-02-01"]
--------------------------------------------------------------------------------
/solutions/05_42.py:
--------------------------------------------------------------------------------
1 | df["duration"] = df["DateTo"] - df["DateFrom"] + datetime.timedelta(days=1)
--------------------------------------------------------------------------------
/solutions/05_43.py:
--------------------------------------------------------------------------------
1 | df["month"] = df["DateFrom"].dt.month
2 | df["month"].hist()
--------------------------------------------------------------------------------
/solutions/05_44.py:
--------------------------------------------------------------------------------
1 | df.sort_values("city")
--------------------------------------------------------------------------------
/solutions/05_45.py:
--------------------------------------------------------------------------------
1 | df.sort_values(["duration", "Attendees"], ascending=[True, False])
--------------------------------------------------------------------------------
/solutions/05_46.py:
--------------------------------------------------------------------------------
1 | df_gr = df.groupby("city")
--------------------------------------------------------------------------------
/solutions/05_47.py:
--------------------------------------------------------------------------------
1 | df_gr["Attendees"].mean()
--------------------------------------------------------------------------------