├── .gitignore ├── 1_Introduction_to_Python.ipynb ├── 1_Introduction_to_Python_answered.ipynb ├── 2_Python_for_Data_Science.ipynb ├── 2_Python_for_Data_Science_answered.ipynb ├── README.md ├── data └── penguins_raw.csv ├── figures ├── culmen_depth.png ├── elimination.png ├── lter_penguins.png └── upper_triangular.png └── setup_check.sh /.gitignore: -------------------------------------------------------------------------------- 1 | .ipynb_checkpoints/ 2 | *.swp 3 | venv 4 | .idea 5 | -------------------------------------------------------------------------------- /1_Introduction_to_Python.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Introduction to Python" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "### About the Tutor\n", 15 | "\n", 16 | "My name is [Kasia](https://kasia.codes/) and I'm a DPhil student in Genomic Medicine and Statistics* at the [Wellcome Centre for Human Genetics](https://www.well.ox.ac.uk/), University of Oxford. I'm also the President of [NGSchool Society](https://ngschool.eu/) where, with an amazing group of [people](https://ngschool.eu/people/), we try to build an international community for computational biologists through training and platforms to network and exchange knowledge. Be sure to check out more on our social media ([twitter](https://twitter.com/NGSchoolEU) and [fb](https://www.facebook.com/NGSchool.eu/)) and [join us on discord](https://discord.gg/MhNeqwR).\n", 17 | "\n", 18 | "Hope you'll enjoy this workshop. If you have any questions, ask me straight away by raising your hand or posting on Slack. If you are reading this outside of the classroom, go ahead and shoot me [an email](mailto:kasia@well.ox.ac.uk) in case you have some questions. \n", 19 | "\n", 20 | "Feel free to leave your comments and/or find some bugs. 
Create an issue, or better yet - fix it and create a pull request. Who knows, if I get the chance I might buy you a beer as a thank you ;)\n", 21 | "\n", 22 | "\\*btw. if you're looking for a PhD, look [this one up](https://www.medsci.ox.ac.uk/study/graduateschool/courses/dtc-structured-research-degrees/genomic-medicine-and-statistics) as it is quite interesting and comes with generous funding!" 23 | ] 24 | }, 25 | { 26 | "cell_type": "markdown", 27 | "metadata": {}, 28 | "source": [ 29 | "## Why Python?" 30 | ] 31 | }, 32 | { 33 | "cell_type": "markdown", 34 | "metadata": {}, 35 | "source": [ 36 | "![](https://imgs.xkcd.com/comics/python.png)\n", 37 | "\n", 38 | "\n", 39 | "The title of the [graphic](https://imgs.xkcd.com/comics/python.png): _I wrote 20 short programs in Python yesterday. It was wonderful. Perl, I'm leaving you._" 40 | ] 41 | }, 42 | { 43 | "cell_type": "markdown", 44 | "metadata": {}, 45 | "source": [ 46 | "**Python advantages, or in other words \"Why Python?\"**\n", 47 | "\n", 48 | "1. Easy to read, easy to learn.\n", 49 | "\n", 50 | " The code you write in Python resembles instructions you would give to someone (here: a computer) in English. Behind the curtain the code is automatically converted to a set of instructions your computer understands.\n", 51 | "\n", 52 | "2. Vast repository of libraries providing necessary extensions.\n", 53 | "\n", 54 | " As in the comic above - if you need something, there's probably a package in Python with the tools you need. Just import it with `import some_package` and you're good to go.\n", 55 | "\n", 56 | "3. Portable across platforms.\n", 57 | "\n", 58 | " Provided that the code is run with the same version of Python - it should run exactly the same on Linux, Windows or Mac machines. \n", 59 | "\n", 60 | "4. It's easy to leverage other languages' speed (such as C/C++ or others) by combining them with Python code.\n", 61 | "\n", 62 | " There are still some moments where Python is not as quick as we would need it to be. 
But it's quite easy to call C code from Python, for example. According to the Python manual: _It is quite easy to add new built-in modules to Python, if you know how to program in C. Such extension modules can do two things that can’t be done directly in Python: they can implement new built-in object types, and they can call C library functions and system calls._ (you can read more [here](https://docs.python.org/3.7/extending/extending.html)). But even if you do not know how to write in C you can still benefit from its speed by using [Cython](https://cython.org/) - you have to slightly adjust your code (for one, define variable types) and run the Cython extension, which will translate as much as it can to C. Et voilà!\n", 63 | "\n", 64 | "5. Python is object-oriented.\n", 65 | "\n", 66 | " Everything in Python is an object, which helps in solving complex problems. At the beginning it might be a bit hard to wrap your head around this, but don't worry - it gets easier!" 67 | ] 68 | }, 69 | { 70 | "cell_type": "markdown", 71 | "metadata": {}, 72 | "source": [ 73 | "## Introduction\n", 74 | "\n", 75 | "Welcome to this **Introduction to Python** and the basic tools necessary for data analysis. The idea of this workshop is to show you how things are done in Python, how to load the tools and learn to use them, but mostly to give you a flavor of Python's possibilities and show you where to find help. 
\n", 76 | "\n", 77 | "We will be working with Python3, in Jupyter and we will be using the following packages:\n", 78 | "\n", 79 | "* [NumPy](https://www.numpy.org/) - *the fundamental package for scientific computing with Python*\n", 80 | "* [Pandas](https://pandas.pydata.org/) - *library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language*\n", 81 | "* [SciPy](https://www.scipy.org/) - *Python-based ecosystem of open-source software for mathematics, science, and engineering*\n", 82 | "* [Matplotlib](https://matplotlib.org/) - *2D plotting library which produces publication quality figures*\n", 83 | "\n", 84 | "### Outline\n", 85 | "1. Basics\n", 86 | " 1. Variables and types\n", 87 | " 2. Lists and more\n", 88 | " 3. Controlling the flow \n", 89 | " 4. Functions\n", 90 | " 5. Modules\n", 91 | "2. Data analysis\n", 92 | " 1. Reading in and descriptive analysis with NumPy and Pandas\n", 93 | " 2. Data exploration with SciPy\n", 94 | " 3. Plotting with Matplotlib" 95 | ] 96 | }, 97 | { 98 | "cell_type": "markdown", 99 | "metadata": {}, 100 | "source": [ 101 | "__Short note on code styling__\n", 102 | "\n", 103 | "The very best advice I got about how to write code and which style to use was that it doesn't matter which style you use as long as you are consistent! So I try to be. \n", 104 | "\n", 105 | "If you want to read more on recommended styling - you can read the [PEP 8 Style Guide](https://www.python.org/dev/peps/pep-0008/). \n", 106 | "\n", 107 | "I usually name my variables in `lower_snake` case, but as long as you're consistent and create names that mean something you can do whatever you want! 
If you want advice on how to name things, have a look at [these slides by Jenny Bryan](https://speakerdeck.com/jennybc/how-to-name-files).\n", 108 | "\n", 109 | "\"In\n", 110 | "\n", 111 | "Artwork by [@allison_horst](https://twitter.com/allison_horst)" 112 | ] 113 | }, 114 | { 115 | "cell_type": "markdown", 116 | "metadata": {}, 117 | "source": [ 118 | "***\n", 119 | "## Basics\n", 120 | "\n", 121 | "Historically, the first thing you *ought* to do in any given programming language is to print the *Hello world!* statement. In order to do this in Python3 you just call a *function* - `print()` - and provide it with a *string argument*, as follows:" 122 | ] 123 | }, 124 | { 125 | "cell_type": "code", 126 | "execution_count": 1, 127 | "metadata": { 128 | "scrolled": true 129 | }, 130 | "outputs": [ 131 | { 132 | "name": "stdout", 133 | "output_type": "stream", 134 | "text": [ 135 | "Hello world!\n" 136 | ] 137 | } 138 | ], 139 | "source": [ 140 | "print(\"Hello world!\")" 141 | ] 142 | }, 143 | { 144 | "cell_type": "code", 145 | "execution_count": null, 146 | "metadata": {}, 147 | "outputs": [], 148 | "source": [ 149 | "# Now it's your turn. Write a line of code that will print a greeting to you. \n", 150 | "# In my case, that would be print(\"Hello Kasia!\") which will print \"Hello Kasia!\".\n", 151 | "# Type it here and press run.\n" 152 | ] 153 | }, 154 | { 155 | "cell_type": "markdown", 156 | "metadata": {}, 157 | "source": [ 158 | "**Comments**" 159 | ] 160 | }, 161 | { 162 | "cell_type": "markdown", 163 | "metadata": {}, 164 | "source": [ 165 | "Every time we write code we strive to make it self-explanatory. However, whenever you want to describe what your code does you should use comments. In my book, it's better to leave too many comments than to comment too vaguely."
166 | ] 167 | }, 168 | { 169 | "cell_type": "markdown", 170 | "metadata": {}, 171 | "source": [ 172 | "```python\n", 173 | "# Text preceded by the hash sign (#) is not executed and serves as a comment.\n", 174 | "\n", 175 | "# There is also a special kind of comment - the so-called docstring.\n", 176 | "# You'll find docstrings inside functions; they are used to turn \n", 177 | "# the comment into proper documentation. \n", 178 | "\n", 179 | "def add_two_numbers(a, b):\n", 180 | "    \"\"\"Adds two integers.\n", 181 | "    Args:\n", 182 | "        two integers\n", 183 | "    Returns:\n", 184 | "        sum of the arguments\n", 185 | "    \"\"\"\n", 186 | "    return int(a) + int(b)\n", 187 | "```" 188 | ] 189 | }, 190 | { 191 | "cell_type": "markdown", 192 | "metadata": {}, 193 | "source": [ 194 | "***\n", 195 | "### Variables and Types" 196 | ] 197 | }, 198 | { 199 | "cell_type": "markdown", 200 | "metadata": {}, 201 | "source": [ 202 | "#### Strings" 203 | ] 204 | }, 205 | { 206 | "cell_type": "markdown", 207 | "metadata": {}, 208 | "source": [ 209 | "Strings are defined by single ' ' or double \" \" quotation marks. \n", 210 | "We assign a value to a variable by writing its desired name (no white spaces!) followed by an equals sign and the value." 211 | ] 212 | }, 213 | { 214 | "cell_type": "code", 215 | "execution_count": null, 216 | "metadata": {}, 217 | "outputs": [], 218 | "source": [ 219 | "my_string = 'This is my string'\n", 220 | "print(my_string)" 221 | ] 222 | }, 223 | { 224 | "cell_type": "code", 225 | "execution_count": null, 226 | "metadata": {}, 227 | "outputs": [], 228 | "source": [ 229 | "len(my_string) # string length" 230 | ] 231 | }, 232 | { 233 | "cell_type": "markdown", 234 | "metadata": {}, 235 | "source": [ 236 | "We can access each character of a string. Note that Python counts from 0, which means that if you want to access T from `my_string` you need to ask for position 0."
237 | ] 238 | }, 239 | { 240 | "cell_type": "code", 241 | "execution_count": null, 242 | "metadata": {}, 243 | "outputs": [], 244 | "source": [ 245 | "my_string[0]" 246 | ] 247 | }, 248 | { 249 | "cell_type": "markdown", 250 | "metadata": {}, 251 | "source": [ 252 | "We can also get the characters from the end by counting backwards with negative positions. I.e., to access the last character we need to ask for -1, for the second to last -2, and so on." 253 | ] 254 | }, 255 | { 256 | "cell_type": "markdown", 257 | "metadata": {}, 258 | "source": [ 259 | "We can access slices of a string..." 260 | ] 261 | }, 262 | { 263 | "cell_type": "code", 264 | "execution_count": null, 265 | "metadata": {}, 266 | "outputs": [], 267 | "source": [ 268 | "my_string[0:4]" 269 | ] 270 | }, 271 | { 272 | "cell_type": "code", 273 | "execution_count": null, 274 | "metadata": {}, 275 | "outputs": [], 276 | "source": [ 277 | "my_string[-0]" 278 | ] 279 | }, 280 | { 281 | "cell_type": "markdown", 282 | "metadata": {}, 283 | "source": [ 284 | "... and create new strings by using `+` and `*` operations." 285 | ] 286 | }, 287 | { 288 | "cell_type": "code", 289 | "execution_count": null, 290 | "metadata": {}, 291 | "outputs": [], 292 | "source": [ 293 | "my_new_string = my_string + ' that I changed!'\n", 294 | "my_newer_string = my_string * 3\n", 295 | "print(my_string)" 296 | ] 297 | }, 298 | { 299 | "cell_type": "code", 300 | "execution_count": null, 301 | "metadata": {}, 302 | "outputs": [], 303 | "source": [ 304 | "print(my_new_string)" 305 | ] 306 | }, 307 | { 308 | "cell_type": "code", 309 | "execution_count": null, 310 | "metadata": {}, 311 | "outputs": [], 312 | "source": [ 313 | "print(my_newer_string)" 314 | ] 315 | }, 316 | { 317 | "cell_type": "markdown", 318 | "metadata": {}, 319 | "source": [ 320 | "There are a few methods specific to strings that will come in handy in your everyday programming. Let me quickly introduce them. 
\n", 321 | "\n", 322 | "* `find(pattern)` - finds the start position of the pattern; returns -1 if there is no match;\n", 323 | "* `replace(pattern, replacement)` - replaces all occurrences of a pattern with the replacement;\n", 324 | "* `upper()`, `lower()` - convert to uppercase or lowercase;\n", 325 | "* `strip(character)` - strips a string of a character (removes occurrences of the character at the beginning and end of the string); \n", 326 | "* `rstrip()` - strips a string of whitespace characters on the right side." 327 | ] 328 | }, 329 | { 330 | "cell_type": "code", 331 | "execution_count": null, 332 | "metadata": {}, 333 | "outputs": [], 334 | "source": [ 335 | "hello_kasia = 'Hello Kasia!'\n", 336 | "hello_adam = hello_kasia.replace('Kasia', 'Adam')\n", 337 | "print(hello_kasia)" 338 | ] 339 | }, 340 | { 341 | "cell_type": "code", 342 | "execution_count": null, 343 | "metadata": {}, 344 | "outputs": [], 345 | "source": [ 346 | "print(hello_adam)" 347 | ] 348 | }, 349 | { 350 | "cell_type": "code", 351 | "execution_count": null, 352 | "metadata": {}, 353 | "outputs": [], 354 | "source": [ 355 | "deep_quote = 'Be yourself; everyone else is already taken.'\n", 356 | "deep_quote.split('e')" 357 | ] 358 | }, 359 | { 360 | "cell_type": "markdown", 361 | "metadata": {}, 362 | "source": [ 363 | "We can chain more than one method call on an object. The methods are executed from left to right." 364 | ] 365 | }, 366 | { 367 | "cell_type": "code", 368 | "execution_count": null, 369 | "metadata": {}, 370 | "outputs": [], 371 | "source": [ 372 | "another_deep_quote = '\"Two things are infinite: the universe and human stupidity; and I\\'m not sure about the universe.\"\\n'\n", 373 | "another_deep_quote.rstrip().lower().split(' ')" 374 | ] 375 | }, 376 | { 377 | "cell_type": "markdown", 378 | "metadata": {}, 379 | "source": [ 380 | "The same thing can be achieved if we create intermediate variables."
381 | ] 382 | }, 383 | { 384 | "cell_type": "code", 385 | "execution_count": null, 386 | "metadata": {}, 387 | "outputs": [], 388 | "source": [ 389 | "another_deep_quote_stripped = another_deep_quote.rstrip()\n", 390 | "another_deep_quote_stripped_and_lower = another_deep_quote_stripped.lower()\n", 391 | "another_deep_quote_stripped_lower_and_split = another_deep_quote_stripped_and_lower.split(' ')\n", 392 | "another_deep_quote_stripped_lower_and_split" 393 | ] 394 | }, 395 | { 396 | "cell_type": "markdown", 397 | "metadata": {}, 398 | "source": [ 399 | "There is one last important function worth mentioning here. We can `format()` the strings by using other strings (not only strings, but we will talk about this later) as arguments." 400 | ] 401 | }, 402 | { 403 | "cell_type": "code", 404 | "execution_count": null, 405 | "metadata": {}, 406 | "outputs": [], 407 | "source": [ 408 | "# before we start, type in your name here\n", 409 | "your_name = " 410 | ] 411 | }, 412 | { 413 | "cell_type": "code", 414 | "execution_count": null, 415 | "metadata": {}, 416 | "outputs": [], 417 | "source": [ 418 | "'Hello %s! Hope you are enjoying this %s class. :)' % (your_name, 'Python')" 419 | ] 420 | }, 421 | { 422 | "cell_type": "code", 423 | "execution_count": null, 424 | "metadata": {}, 425 | "outputs": [], 426 | "source": [ 427 | "# This is equivalent to the following:\n", 428 | "'Hello {yname}! Hope you are enjoying this {lang} class. :)'.format(yname = your_name, lang = 'Python')" 429 | ] 430 | }, 431 | { 432 | "cell_type": "code", 433 | "execution_count": null, 434 | "metadata": {}, 435 | "outputs": [], 436 | "source": [ 437 | "lang = 'Python'\n", 438 | "f'Hello {your_name}! Hope you are enjoying this {lang} class. :)'" 439 | ] 440 | }, 441 | { 442 | "cell_type": "code", 443 | "execution_count": 1, 444 | "metadata": {}, 445 | "outputs": [ 446 | { 447 | "data": { 448 | "text/plain": [ 449 | "'Hello {your_name}! Hope you are enjoying this {lang} class. 
:)'" 450 | ] 451 | }, 452 | "execution_count": 1, 453 | "metadata": {}, 454 | "output_type": "execute_result" 455 | } 456 | ], 457 | "source": [ 458 | "'Hello {your_name}! Hope you are enjoying this {lang} class. :)'" 459 | ] 460 | }, 461 | { 462 | "cell_type": "code", 463 | "execution_count": 2, 464 | "metadata": {}, 465 | "outputs": [ 466 | { 467 | "ename": "NameError", 468 | "evalue": "name 'your_name' is not defined", 469 | "output_type": "error", 470 | "traceback": [ 471 | "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", 472 | "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", 473 | "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;34mf'Hello {your_name}! Hope you are enjoying this {lang} class. :)'\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", 474 | "\u001b[0;31mNameError\u001b[0m: name 'your_name' is not defined" 475 | ] 476 | } 477 | ], 478 | "source": [ 479 | "f'Hello {your_name}! Hope you are enjoying this {lang} class. 
:)'" 480 | ] 481 | }, 482 | { 483 | "cell_type": "markdown", 484 | "metadata": {}, 485 | "source": [ 486 | "#### Numbers" 487 | ] 488 | }, 489 | { 490 | "cell_type": "code", 491 | "execution_count": null, 492 | "metadata": {}, 493 | "outputs": [], 494 | "source": [ 495 | "my_integer = 1\n", 496 | "my_float = 2.34567" 497 | ] 498 | }, 499 | { 500 | "cell_type": "code", 501 | "execution_count": null, 502 | "metadata": {}, 503 | "outputs": [], 504 | "source": [ 505 | "print(my_integer)" 506 | ] 507 | }, 508 | { 509 | "cell_type": "code", 510 | "execution_count": null, 511 | "metadata": {}, 512 | "outputs": [], 513 | "source": [ 514 | "print(my_float)" 515 | ] 516 | }, 517 | { 518 | "cell_type": "code", 519 | "execution_count": null, 520 | "metadata": {}, 521 | "outputs": [], 522 | "source": [ 523 | "my_float + my_integer" 524 | ] 525 | }, 526 | { 527 | "cell_type": "markdown", 528 | "metadata": {}, 529 | "source": [ 530 | "**Using Python as a calculator**\n", 531 | "\n", 532 | "Now that we know a little about numbers and how to assign variables we can learn how to use Python as a calculator. \n", 533 | "\n", 534 | "Basic operations:\n", 535 | "\n", 536 | "|Operator|Operation|\n", 537 | "|:--:|---|\n", 538 | "|+|addition|\n", 539 | "|-|subtraction|\n", 540 | "|\\*|multiplication|\n", 541 | "|/|division|\n", 542 | "|//|floor division|\n", 543 | "|%|modulo|\n", 544 | "|\\*\\*|power|" 545 | ] 546 | }, 547 | { 548 | "cell_type": "markdown", 549 | "metadata": {}, 550 | "source": [ 551 | "***" 552 | ] 553 | }, 554 | { 555 | "cell_type": "markdown", 556 | "metadata": {}, 557 | "source": [ 558 | "**Exercise**\n", 559 | "Multiply the remainder of dividing 456 by 87 by 21 and raise the product to the power of 3. Get the floor division of this number by 11. 
Save the result to a variable *result*.\n" 560 | ] 561 | }, 562 | { 563 | "cell_type": "code", 564 | "execution_count": null, 565 | "metadata": {}, 566 | "outputs": [], 567 | "source": [ 568 | "# write your formula here\n" 569 | ] 570 | }, 571 | { 572 | "cell_type": "code", 573 | "execution_count": null, 574 | "metadata": {}, 575 | "outputs": [], 576 | "source": [ 577 | "# Did it work?\n", 578 | "result == 7796920" 579 | ] 580 | }, 581 | { 582 | "cell_type": "markdown", 583 | "metadata": {}, 584 | "source": [ 585 | "**Side Note: On numbers and precision**" 586 | ] 587 | }, 588 | { 589 | "cell_type": "markdown", 590 | "metadata": {}, 591 | "source": [ 592 | "Numbers in computers are represented in base 2, not in base 10 as we are used to in everyday life. What that means is that sometimes the computer will store an approximation of a number, rather than the exact number. \n", 593 | "\n", 594 | "For example 10 will be represented as 1 x 8, 0 x 4, 1 x 2 and 0 x 1 rather than 1 x 10 and 0 x 1. That generates difficulties when we want to store a decimal fraction. \n", 595 | "\n", 596 | "In order to store 1/10 in the base 2 system we would have to save it as an infinite sum.\n", 597 | "\n", 598 | "1/10 = 1/16 + 1/32 + 1/256 + 1/512 + ... \n", 599 | "\n", 600 | "Therefore, 1/10 in our computers is only an approximation. 
You can read more about this [here](https://docs.python.org/3/tutorial/floatingpoint.html).\n", 601 | "\n", 602 | "That's why:" 603 | ] 604 | }, 605 | { 606 | "cell_type": "code", 607 | "execution_count": null, 608 | "metadata": {}, 609 | "outputs": [], 610 | "source": [ 611 | "# This is not true\n", 612 | "0.1 + 0.1 + 0.1 == 0.3" 613 | ] 614 | }, 615 | { 616 | "cell_type": "code", 617 | "execution_count": null, 618 | "metadata": {}, 619 | "outputs": [], 620 | "source": [ 621 | "# Nor is this\n", 622 | "round(0.1, 1) + round(0.1, 1) + round(0.1, 1) == round(0.3, 1)" 623 | ] 624 | }, 625 | { 626 | "cell_type": "code", 627 | "execution_count": null, 628 | "metadata": { 629 | "scrolled": true 630 | }, 631 | "outputs": [], 632 | "source": [ 633 | "# But this is\n", 634 | "round(0.1 + 0.1 + 0.1, 10) == round(0.3, 10)" 635 | ] 636 | }, 637 | { 638 | "cell_type": "markdown", 639 | "metadata": {}, 640 | "source": [ 641 | "#### Mixing numbers and strings\n", 642 | "We can assign more than one variable at a time. We do that by separating the variable names, and likewise the values, with commas. Note that the number of variable names and values has to be equal." 643 | ] 644 | }, 645 | { 646 | "cell_type": "code", 647 | "execution_count": null, 648 | "metadata": {}, 649 | "outputs": [], 650 | "source": [ 651 | "a, b, c = 1, 2.01, 'three'\n", 652 | "print(b)" 653 | ] 654 | }, 655 | { 656 | "cell_type": "code", 657 | "execution_count": null, 658 | "metadata": {}, 659 | "outputs": [], 660 | "source": [ 661 | "type(b)" 662 | ] 663 | }, 664 | { 665 | "cell_type": "markdown", 666 | "metadata": {}, 667 | "source": [ 668 | "Also, we can convert a number to a string, for example to check how many digits a really big number has."
669 | ] 670 | }, 671 | { 672 | "cell_type": "code", 673 | "execution_count": null, 674 | "metadata": {}, 675 | "outputs": [], 676 | "source": [ 677 | "# Let's check the length of a big number\n", 678 | "big_number = 23 ** 1992\n", 679 | "len(str(big_number))" 680 | ] 681 | }, 682 | { 683 | "cell_type": "markdown", 684 | "metadata": {}, 685 | "source": [ 686 | "Similarly, of course, we can transform a string into a number. This might come in handy when extracting a number from text." 687 | ] 688 | }, 689 | { 690 | "cell_type": "code", 691 | "execution_count": null, 692 | "metadata": {}, 693 | "outputs": [], 694 | "source": [ 695 | "a = str(int('1234') + 56)\n", 696 | "b = int('1234' + str(56))" 697 | ] 698 | }, 699 | { 700 | "cell_type": "markdown", 701 | "metadata": {}, 702 | "source": [ 703 | "Is `a` equal to `b`? What's the difference? " 704 | ] 705 | }, 706 | { 707 | "cell_type": "code", 708 | "execution_count": null, 709 | "metadata": {}, 710 | "outputs": [], 711 | "source": [ 712 | "# To check your answer, compare and then print both values\n", 713 | "print(a == b)" 714 | ] 715 | }, 716 | { 717 | "cell_type": "code", 718 | "execution_count": null, 719 | "metadata": {}, 720 | "outputs": [], 721 | "source": [ 722 | "print(a)" 723 | ] 724 | }, 725 | { 726 | "cell_type": "code", 727 | "execution_count": null, 728 | "metadata": {}, 729 | "outputs": [], 730 | "source": [ 731 | "print(b)" 732 | ] 733 | }, 734 | { 735 | "cell_type": "markdown", 736 | "metadata": {}, 737 | "source": [ 738 | "***\n", 739 | "### Lists and more" 740 | ] 741 | }, 742 | { 743 | "cell_type": "markdown", 744 | "metadata": {}, 745 | "source": [ 746 | "#### Lists" 747 | ] 748 | }, 749 | { 750 | "cell_type": "code", 751 | "execution_count": null, 752 | "metadata": {}, 753 | "outputs": [], 754 | "source": [ 755 | "my_list = [1, 2, ['three', 4], 5.5] \n", 756 | "print(my_list)" 757 | ] 758 | }, 759 | { 760 | "cell_type": "markdown", 761 | "metadata": {}, 762 | "source": [ 763 | "As with strings, we can access the elements of 
the list from the left and the right side." 764 | ] 765 | }, 766 | { 767 | "cell_type": "code", 768 | "execution_count": null, 769 | "metadata": {}, 770 | "outputs": [], 771 | "source": [ 772 | "print(my_list[1:]) " 773 | ] 774 | }, 775 | { 776 | "cell_type": "code", 777 | "execution_count": null, 778 | "metadata": {}, 779 | "outputs": [], 780 | "source": [ 781 | "print(my_list[1:3]) " 782 | ] 783 | }, 784 | { 785 | "cell_type": "code", 786 | "execution_count": null, 787 | "metadata": {}, 788 | "outputs": [], 789 | "source": [ 790 | "print(my_list[0])" 791 | ] 792 | }, 793 | { 794 | "cell_type": "code", 795 | "execution_count": null, 796 | "metadata": {}, 797 | "outputs": [], 798 | "source": [ 799 | "# Note, this will get us the last two elements (the 3rd and 4th, i.e. indices 2 and 3)\n", 800 | "print(my_list[-2:])" 801 | ] 802 | }, 803 | { 804 | "cell_type": "code", 805 | "execution_count": null, 806 | "metadata": {}, 807 | "outputs": [], 808 | "source": [ 809 | "print(my_list * 2)" 810 | ] 811 | }, 812 | { 813 | "cell_type": "markdown", 814 | "metadata": {}, 815 | "source": [ 816 | "We can modify elements of the list:" 817 | ] 818 | }, 819 | { 820 | "cell_type": "code", 821 | "execution_count": null, 822 | "metadata": {}, 823 | "outputs": [], 824 | "source": [ 825 | "print(my_list)" 826 | ] 827 | }, 828 | { 829 | "cell_type": "code", 830 | "execution_count": null, 831 | "metadata": {}, 832 | "outputs": [], 833 | "source": [ 834 | "my_list[1] = 'seven'\n", 835 | "print(my_list)" 836 | ] 837 | }, 838 | { 839 | "cell_type": "code", 840 | "execution_count": null, 841 | "metadata": {}, 842 | "outputs": [], 843 | "source": [ 844 | "my_list2 = my_list\n", 845 | "my_list2[0] = '1'\n", 846 | "my_list2[1] = \"second_element\"" 847 | ] 848 | }, 849 | { 850 | "cell_type": "code", 851 | "execution_count": null, 852 | "metadata": {}, 853 | "outputs": [], 854 | "source": [ 855 | "print(my_list)" 856 | ] 857 | }, 858 | { 859 | "cell_type": "code", 860 | "execution_count": null, 861 | "metadata": {}, 862 | "outputs": 
[], 863 | "source": [ 864 | "my_list3 = my_list.copy()\n", 865 | "my_list3[0] = \"seventy_seven\"" 866 | ] 867 | }, 868 | { 869 | "cell_type": "code", 870 | "execution_count": null, 871 | "metadata": {}, 872 | "outputs": [], 873 | "source": [ 874 | "print(my_list3)" 875 | ] 876 | }, 877 | { 878 | "cell_type": "code", 879 | "execution_count": null, 880 | "metadata": {}, 881 | "outputs": [], 882 | "source": [ 883 | "print(my_list)" 884 | ] 885 | }, 886 | { 887 | "cell_type": "markdown", 888 | "metadata": {}, 889 | "source": [ 890 | "We can also add elements to a list by creating a new one." 891 | ] 892 | }, 893 | { 894 | "cell_type": "code", 895 | "execution_count": null, 896 | "metadata": { 897 | "scrolled": true 898 | }, 899 | "outputs": [], 900 | "source": [ 901 | "my_new_list = my_list + ['new element'] \n", 902 | "print(my_new_list) # the new list has the new element\n", 903 | "print(my_list) # however the original list was left unchanged" 904 | ] 905 | }, 906 | { 907 | "cell_type": "markdown", 908 | "metadata": {}, 909 | "source": [ 910 | "If we want to modify the list in place, we can add a new element with the `append()` method." 911 | ] 912 | }, 913 | { 914 | "cell_type": "code", 915 | "execution_count": null, 916 | "metadata": {}, 917 | "outputs": [], 918 | "source": [ 919 | "my_list.append('new element')\n", 920 | "print(my_list)" 921 | ] 922 | }, 923 | { 924 | "cell_type": "markdown", 925 | "metadata": {}, 926 | "source": [ 927 | "Note that `append()` always adds its argument as a single new element. Therefore, if we want a list as an element of our list we can do that with `append()`. On the other hand, if we want to add multiple elements to a list, we should use `extend()` instead."
928 | ] 929 | }, 930 | { 931 | "cell_type": "code", 932 | "execution_count": null, 933 | "metadata": {}, 934 | "outputs": [], 935 | "source": [ 936 | "my_list.append([1,2,3,4])\n", 937 | "print(my_list)" 938 | ] 939 | }, 940 | { 941 | "cell_type": "code", 942 | "execution_count": null, 943 | "metadata": {}, 944 | "outputs": [], 945 | "source": [ 946 | "my_list.extend([1,2,3,4])\n", 947 | "print(my_list)" 948 | ] 949 | }, 950 | { 951 | "cell_type": "markdown", 952 | "metadata": {}, 953 | "source": [ 954 | "***" 955 | ] 956 | }, 957 | { 958 | "cell_type": "markdown", 959 | "metadata": {}, 960 | "source": [ 961 | "**Exercise:** Given a list `names = ['Kasia', 'Jasio', 'Zuzia', 'Amelka']` answer the following questions:\n", 962 | "\n", 963 | "* What will `names[-2]` output?\n", 964 | "* What will `sorted(names)[1]` output?\n", 965 | "* What will the following command - `len(names[1:3])` - be equal to?" 966 | ] 967 | }, 968 | { 969 | "cell_type": "markdown", 970 | "metadata": {}, 971 | "source": [ 972 | "**Don't run the code below until you try answering the questions!**" 973 | ] 974 | }, 975 | { 976 | "cell_type": "code", 977 | "execution_count": null, 978 | "metadata": {}, 979 | "outputs": [], 980 | "source": [ 981 | "names = ['Kasia', 'Jasio', 'Zuzia', 'Amelka']" 982 | ] 983 | }, 984 | { 985 | "cell_type": "code", 986 | "execution_count": null, 987 | "metadata": {}, 988 | "outputs": [], 989 | "source": [ 990 | "names[-2]" 991 | ] 992 | }, 993 | { 994 | "cell_type": "code", 995 | "execution_count": null, 996 | "metadata": {}, 997 | "outputs": [], 998 | "source": [ 999 | "sorted(names)[1]" 1000 | ] 1001 | }, 1002 | { 1003 | "cell_type": "code", 1004 | "execution_count": null, 1005 | "metadata": {}, 1006 | "outputs": [], 1007 | "source": [ 1008 | "len(names[1:3])" 1009 | ] 1010 | }, 1011 | { 1012 | "cell_type": "markdown", 1013 | "metadata": {}, 1014 | "source": [ 1015 | "#### Tuples" 1016 | ] 1017 | }, 1018 | { 1019 | "cell_type": "markdown", 1020 | "metadata": {}, 1021 | 
"source": [ 1022 | "Tuples are similar to lists - they are a collection of items separated by commas and enclosed in parentheses - (). Their elements can be accessed by their position; however, unlike lists, their size and contents are fixed. They cannot be updated." 1023 | ] 1024 | }, 1025 | { 1026 | "cell_type": "code", 1027 | "execution_count": null, 1028 | "metadata": {}, 1029 | "outputs": [], 1030 | "source": [ 1031 | "my_tuple = (1, 2, 3, 4)\n", 1032 | "print(my_tuple)" 1033 | ] 1034 | }, 1035 | { 1036 | "cell_type": "code", 1037 | "execution_count": null, 1038 | "metadata": {}, 1039 | "outputs": [], 1040 | "source": [ 1041 | "print(my_tuple[2])" 1042 | ] 1043 | }, 1044 | { 1045 | "cell_type": "code", 1046 | "execution_count": null, 1047 | "metadata": {}, 1048 | "outputs": [], 1049 | "source": [ 1050 | "print(my_tuple[1:3])" 1051 | ] 1052 | }, 1053 | { 1054 | "cell_type": "code", 1055 | "execution_count": null, 1056 | "metadata": {}, 1057 | "outputs": [], 1058 | "source": [ 1059 | "print(my_tuple * 2)" 1060 | ] 1061 | }, 1062 | { 1063 | "cell_type": "code", 1064 | "execution_count": null, 1065 | "metadata": {}, 1066 | "outputs": [], 1067 | "source": [ 1068 | "my_tuple.append(1) # raises AttributeError: tuples have no append()" 1069 | ] 1070 | }, 1071 | { 1072 | "cell_type": "markdown", 1073 | "metadata": {}, 1074 | "source": [ 1075 | "#### Sets" 1076 | ] 1077 | }, 1078 | { 1079 | "cell_type": "markdown", 1080 | "metadata": {}, 1081 | "source": [ 1082 | "Sets are a special collection in Python - they are unordered, unindexed and all their entries are unique. We create sets by using curly brackets {} or by using the `set()` function."
1083 | ] 1084 | }, 1085 | { 1086 | "cell_type": "code", 1087 | "execution_count": null, 1088 | "metadata": {}, 1089 | "outputs": [], 1090 | "source": [ 1091 | "my_set = {'a', 'b', 3, 'b'}\n", 1092 | "my_set2 = set(['a', 'b', 3, 'b'])\n", 1093 | "print(my_set)" 1094 | ] 1095 | }, 1096 | { 1097 | "cell_type": "code", 1098 | "execution_count": null, 1099 | "metadata": {}, 1100 | "outputs": [], 1101 | "source": [ 1102 | "print(my_set2)" 1103 | ] 1104 | }, 1105 | { 1106 | "cell_type": "markdown", 1107 | "metadata": {}, 1108 | "source": [ 1109 | "Sets cannot be accessed like lists - they are unordered and unindexed. Run the following two cells and see what happens." 1110 | ] 1111 | }, 1112 | { 1113 | "cell_type": "code", 1114 | "execution_count": null, 1115 | "metadata": {}, 1116 | "outputs": [], 1117 | "source": [ 1118 | "my_set[2]" 1119 | ] 1120 | }, 1121 | { 1122 | "cell_type": "code", 1123 | "execution_count": null, 1124 | "metadata": {}, 1125 | "outputs": [], 1126 | "source": [ 1127 | "print(my_set * 2)" 1128 | ] 1129 | }, 1130 | { 1131 | "cell_type": "markdown", 1132 | "metadata": {}, 1133 | "source": [ 1134 | "However, similarly to lists, we can add items to the set." 1135 | ] 1136 | }, 1137 | { 1138 | "cell_type": "code", 1139 | "execution_count": null, 1140 | "metadata": {}, 1141 | "outputs": [], 1142 | "source": [ 1143 | "my_set.add(7) # one element at a time\n", 1144 | "print(my_set)" 1145 | ] 1146 | }, 1147 | { 1148 | "cell_type": "code", 1149 | "execution_count": null, 1150 | "metadata": {}, 1151 | "outputs": [], 1152 | "source": [ 1153 | "my_set.update([1, 2, 'f', 'g']) # multiple items at once\n", 1154 | "print(my_set)" 1155 | ] 1156 | }, 1157 | { 1158 | "cell_type": "markdown", 1159 | "metadata": {}, 1160 | "source": [ 1161 | "#### Dictionaries\n", 1162 | "For people who have some experience with programming, one might say that Python's dictionaries are a kind of hash table. They work like the associative arrays or hashes found in Perl. 
For others, with no programming background, I like to say that the name - dictionary - describes this type quite well. Dictionaries work the way a real-life dictionary works - you look up a value (for example, the translation of a word) by its key (that word in the language you know). \n", 1163 | "\n", 1164 | "We create dictionaries with curly brackets {}; however, unlike with sets, we provide a key and a value for each item:" 1165 | ] 1166 | }, 1167 | { 1168 | "cell_type": "code", 1169 | "execution_count": null, 1170 | "metadata": {}, 1171 | "outputs": [], 1172 | "source": [ 1173 | "my_dictionary = {\n", 1174 | "    'me': 'ja', \n", 1175 | "    'you': 'ty', \n", 1176 | "    'he': 'on', \n", 1177 | "    'she': 'ona', \n", 1178 | "    'hi': 'czesc'\n", 1179 | "}\n", 1180 | "print(my_dictionary)" 1181 | ] 1182 | }, 1183 | { 1184 | "cell_type": "markdown", 1185 | "metadata": {}, 1186 | "source": [ 1187 | "We can also create a dictionary using the `dict()` function, for example by passing it a sequence of (key, value) tuples." 1188 | ] 1189 | }, 1190 | { 1191 | "cell_type": "code", 1192 | "execution_count": null, 1193 | "metadata": {}, 1194 | "outputs": [], 1195 | "source": [ 1196 | "d = [('one', 1), ('two', 2), ('three', 3)]\n", 1197 | "d_dict = dict(d)\n", 1198 | "print(d_dict)" 1199 | ] 1200 | }, 1201 | { 1202 | "cell_type": "markdown", 1203 | "metadata": {}, 1204 | "source": [ 1205 | "Let's check how to say \"you\" in Polish." 1206 | ] 1207 | }, 1208 | { 1209 | "cell_type": "code", 1210 | "execution_count": null, 1211 | "metadata": {}, 1212 | "outputs": [], 1213 | "source": [ 1214 | "my_dictionary['you']" 1215 | ] 1216 | }, 1217 | { 1218 | "cell_type": "markdown", 1219 | "metadata": {}, 1220 | "source": [ 1221 | "We can add new entries to the dictionary."
1222 | ] 1223 | }, 1224 | { 1225 | "cell_type": "code", 1226 | "execution_count": null, 1227 | "metadata": {}, 1228 | "outputs": [], 1229 | "source": [ 1230 | "my_dictionary['ice cream'] = 'lody'\n", 1231 | "my_dictionary['pizza'] = 'pizza'\n", 1232 | "print(my_dictionary)" 1233 | ] 1234 | }, 1235 | { 1236 | "cell_type": "markdown", 1237 | "metadata": {}, 1238 | "source": [ 1239 | "In the same way we can modify entries in the dictionary." 1240 | ] 1241 | }, 1242 | { 1243 | "cell_type": "code", 1244 | "execution_count": null, 1245 | "metadata": {}, 1246 | "outputs": [], 1247 | "source": [ 1248 | "my_dictionary['pizza'] = 'wloski placek' # roughly translates to Italian pie\n", 1249 | "print(my_dictionary)" 1250 | ] 1251 | }, 1252 | { 1253 | "cell_type": "markdown", 1254 | "metadata": {}, 1255 | "source": [ 1256 | "We can access all the keys by calling the `keys()` method on our dictionary." 1257 | ] 1258 | }, 1259 | { 1260 | "cell_type": "code", 1261 | "execution_count": null, 1262 | "metadata": {}, 1263 | "outputs": [], 1264 | "source": [ 1265 | "print(my_dictionary.keys())" 1266 | ] 1267 | }, 1268 | { 1269 | "cell_type": "markdown", 1270 | "metadata": {}, 1271 | "source": [ 1272 | "We can also access all the values by calling the `values()` method on our dictionary." 1273 | ] 1274 | }, 1275 | { 1276 | "cell_type": "code", 1277 | "execution_count": null, 1278 | "metadata": { 1279 | "scrolled": true 1280 | }, 1281 | "outputs": [], 1282 | "source": [ 1283 | "print(my_dictionary.values())" 1284 | ] 1285 | }, 1286 | { 1287 | "cell_type": "markdown", 1288 | "metadata": {}, 1289 | "source": [ 1290 | "***" 1291 | ] 1292 | }, 1293 | { 1294 | "cell_type": "markdown", 1295 | "metadata": {}, 1296 | "source": [ 1297 | "**Exercise**: You need to save a few phone numbers of your friends (Kasia +48300200100, Maja +48564763562, Jacek +48600500400). Which of the above collections will you use to quickly access those numbers?"
1298 | ] 1299 | }, 1300 | { 1301 | "cell_type": "code", 1302 | "execution_count": null, 1303 | "metadata": {}, 1304 | "outputs": [], 1305 | "source": [ 1306 | "# Create your phonebook here\n", 1307 | "phone_book = " 1308 | ] 1309 | }, 1310 | { 1311 | "cell_type": "markdown", 1312 | "metadata": {}, 1313 | "source": [ 1314 | "**Exercise:** You have a long list with many duplicate entries. How will you remove the duplicates and create a new list with only unique values?" 1315 | ] 1316 | }, 1317 | { 1318 | "cell_type": "code", 1319 | "execution_count": null, 1320 | "metadata": {}, 1321 | "outputs": [], 1322 | "source": [ 1323 | "long_list_of_duplicates = [1, 7, 1, 2, 3, 5, 16, 7, 8, 17, 13, 23, 21, 34, 55, 23, 89, 1, 2, 3, 34, 4, 5, 6, 34, 7, 8, 4, 9, 10, 21, 11, 12, 13]\n", 1324 | "print(len(long_list_of_duplicates)) # 34" 1325 | ] 1326 | }, 1327 | { 1328 | "cell_type": "code", 1329 | "execution_count": null, 1330 | "metadata": {}, 1331 | "outputs": [], 1332 | "source": [ 1333 | "# Create your new list\n", 1334 | "new_list = \n", 1335 | "\n", 1336 | "# test\n", 1337 | "len(new_list) == 20" 1338 | ] 1339 | }, 1340 | { 1341 | "cell_type": "markdown", 1342 | "metadata": {}, 1343 | "source": [ 1344 | "*** \n", 1345 | "### Controlling the flow" 1346 | ] 1347 | }, 1348 | { 1349 | "cell_type": "markdown", 1350 | "metadata": {}, 1351 | "source": [ 1352 | "#### Logic in Python" 1353 | ] 1354 | }, 1355 | { 1356 | "cell_type": "markdown", 1357 | "metadata": {}, 1358 | "source": [ 1359 | "Logical values and operators in Python:\n", 1360 | "* `True` and `False` are equal to `1` and `0`, respectively;\n", 1361 | "* all objects have a boolean value: except for `False`, `0`, `None` and empty collections (`\"\"`, `[]`, `{}` and `()`), every object evaluates as `True`;\n", 1362 | "* `and` - both values have to be true for it to be true;\n", 1363 | "* `or` - at least one true value makes the statement true;\n", 1364 | "* `not` - negation of the value;\n", 1365 | "* `is` - checks whether two names refer to the same object;\n", 1366 | "* 
comparisons: \n", 1367 | "\n", 1368 | "| symbol | meaning | \n", 1369 | "| --- | --- | \n", 1370 | "| == | is equal to | \n", 1371 | "| != | is not equal to | \n", 1372 | "| > | greater than | \n", 1373 | "| >= | greater than or equal to | \n", 1374 | "| < | less than | \n", 1375 | "| <= | less than or equal to | \n" 1376 | ] 1377 | }, 1378 | { 1379 | "cell_type": "code", 1380 | "execution_count": null, 1381 | "metadata": {}, 1382 | "outputs": [], 1383 | "source": [ 1384 | "# Definitions\n", 1385 | "a = True\n", 1386 | "b = False\n", 1387 | "c = d = 1\n", 1388 | "true_statement = d > 0" 1389 | ] 1390 | }, 1391 | { 1392 | "cell_type": "markdown", 1393 | "metadata": {}, 1394 | "source": [ 1395 | "`1` is equal to `True`, but they are two different objects." 1396 | ] 1397 | }, 1398 | { 1399 | "cell_type": "code", 1400 | "execution_count": null, 1401 | "metadata": {}, 1402 | "outputs": [], 1403 | "source": [ 1404 | "print(a == c)" 1405 | ] 1406 | }, 1407 | { 1408 | "cell_type": "code", 1409 | "execution_count": null, 1410 | "metadata": {}, 1411 | "outputs": [], 1412 | "source": [ 1413 | "print(a is c)" 1414 | ] 1415 | }, 1416 | { 1417 | "cell_type": "code", 1418 | "execution_count": null, 1419 | "metadata": {}, 1420 | "outputs": [], 1421 | "source": [ 1422 | "print(c is d)" 1423 | ] 1424 | }, 1425 | { 1426 | "cell_type": "code", 1427 | "execution_count": null, 1428 | "metadata": {}, 1429 | "outputs": [], 1430 | "source": [ 1431 | "print(a != b)" 1432 | ] 1433 | }, 1434 | { 1435 | "cell_type": "code", 1436 | "execution_count": null, 1437 | "metadata": {}, 1438 | "outputs": [], 1439 | "source": [ 1440 | "print((a or b) and (a and not b))" 1441 | ] 1442 | }, 1443 | { 1444 | "cell_type": "markdown", 1445 | "metadata": {}, 1446 | "source": [ 1447 | "Objects do evaluate as either `True` or `False`:" 1448 | ] 1449 | }, 1450 | { 1451 | "cell_type": "code", 1452 | "execution_count": null, 1453 | "metadata": {}, 1454 | "outputs": [], 1455 | "source": [ 1456 | "not_empty_list = [1, 2, 3]\n", 1457 | "if
len(not_empty_list) > 0:\n", 1458 | "    print(\"{} is true.\".format(not_empty_list))" 1459 | ] 1460 | }, 1461 | { 1462 | "cell_type": "code", 1463 | "execution_count": null, 1464 | "metadata": {}, 1465 | "outputs": [], 1466 | "source": [ 1467 | "empty_list = []\n", 1468 | "if not empty_list:\n", 1469 | "    print(\"{} isn't.\".format(empty_list))" 1470 | ] 1471 | }, 1472 | { 1473 | "cell_type": "markdown", 1474 | "metadata": {}, 1475 | "source": [ 1476 | "However, they act a bit differently when used within logical operations: `and` and `or` return one of their operands rather than a strict `True` or `False`, while `not` always returns a boolean." 1477 | ] 1478 | }, 1479 | { 1480 | "cell_type": "code", 1481 | "execution_count": null, 1482 | "metadata": {}, 1483 | "outputs": [], 1484 | "source": [ 1485 | "print(not(False or False) or 1)" 1486 | ] 1487 | }, 1488 | { 1489 | "cell_type": "code", 1490 | "execution_count": null, 1491 | "metadata": {}, 1492 | "outputs": [], 1493 | "source": [ 1494 | "print(not(247 % 5 or 234 % 2) or 1 == 9)" 1495 | ] 1496 | }, 1497 | { 1498 | "cell_type": "code", 1499 | "execution_count": null, 1500 | "metadata": {}, 1501 | "outputs": [], 1502 | "source": [ 1503 | "print(not(247 % 5 or 234 % 2) or [1, 2, 3])" 1504 | ] 1505 | }, 1506 | { 1507 | "cell_type": "markdown", 1508 | "metadata": {}, 1509 | "source": [ 1510 | "***" 1511 | ] 1512 | }, 1513 | { 1514 | "cell_type": "markdown", 1515 | "metadata": {}, 1516 | "source": [ 1517 | "**Exercise:** Answer the following questions:\n", 1518 | "\n", 1519 | "* What is the result of `10 == '10'`?\n", 1520 | "* What is the result of `not(True or False)`?\n", 1521 | "* What is the result of `not 'bag'`?\n", 1522 | "* Tricky question: What is the result of `'bag' > 'apple'`? Why?"
1523 | ] 1524 | }, 1525 | { 1526 | "cell_type": "markdown", 1527 | "metadata": {}, 1528 | "source": [ 1529 | "**Don't run the following before you try answering the questions!**" 1530 | ] 1531 | }, 1532 | { 1533 | "cell_type": "code", 1534 | "execution_count": null, 1535 | "metadata": {}, 1536 | "outputs": [], 1537 | "source": [ 1538 | "print(10 == \"10\")" 1539 | ] 1540 | }, 1541 | { 1542 | "cell_type": "code", 1543 | "execution_count": null, 1544 | "metadata": {}, 1545 | "outputs": [], 1546 | "source": [ 1547 | "print(not(True or False))" 1548 | ] 1549 | }, 1550 | { 1551 | "cell_type": "code", 1552 | "execution_count": null, 1553 | "metadata": {}, 1554 | "outputs": [], 1555 | "source": [ 1556 | "print(not 'bag')" 1557 | ] 1558 | }, 1559 | { 1560 | "cell_type": "code", 1561 | "execution_count": null, 1562 | "metadata": {}, 1563 | "outputs": [], 1564 | "source": [ 1565 | "print(\"bag\" > \"apple\")" 1566 | ] 1567 | }, 1568 | { 1569 | "cell_type": "markdown", 1570 | "metadata": {}, 1571 | "source": [ 1572 | "#### Conditions: `if`, `else` and `elif`" 1573 | ] 1574 | }, 1575 | { 1576 | "cell_type": "markdown", 1577 | "metadata": {}, 1578 | "source": [ 1579 | "The basic flow-control statement is `if`: if a certain condition is met, do something.
For example, we can check a number that you input:" 1580 | ] 1581 | }, 1582 | { 1583 | "cell_type": "code", 1584 | "execution_count": null, 1585 | "metadata": {}, 1586 | "outputs": [], 1587 | "source": [ 1588 | "x = int(input('Please, provide a number greater than 0: '))\n", 1589 | "if x > 0:\n", 1590 | "    print('Thanks!')\n", 1591 | "else:\n", 1592 | "    print('Hmm, {} is not greater than 0!'.format(x))" 1593 | ] 1594 | }, 1595 | { 1596 | "cell_type": "code", 1597 | "execution_count": null, 1598 | "metadata": { 1599 | "scrolled": true 1600 | }, 1601 | "outputs": [], 1602 | "source": [ 1603 | "x = int(input('Please, provide a number greater than 0: '))\n", 1604 | "if x > 0:\n", 1605 | "    print('Thanks!')\n", 1606 | "elif x == 0:\n", 1607 | "    print('Hmm, {} is 0, therefore not greater than 0!'.format(x))\n", 1608 | "else:\n", 1609 | "    print('Hmm, {} is not greater than 0!'.format(x))" 1610 | ] 1611 | }, 1612 | { 1613 | "cell_type": "markdown", 1614 | "metadata": {}, 1615 | "source": [ 1616 | "#### Loops: `for` and `while`" 1617 | ] 1618 | }, 1619 | { 1620 | "cell_type": "markdown", 1621 | "metadata": {}, 1622 | "source": [ 1623 | "These statements allow us to repeat a procedure multiple times, for example once for every element of a sequence. Let's say that for every odd number lower than 10 we want to compute that number raised to the power of 3. We can do it like this:" 1624 | ] 1625 | }, 1626 | { 1627 | "cell_type": "code", 1628 | "execution_count": null, 1629 | "metadata": {}, 1630 | "outputs": [], 1631 | "source": [ 1632 | "print(1**3)\n", 1633 | "print(3**3)\n", 1634 | "print(5**3)" 1635 | ] 1636 | }, 1637 | { 1638 | "cell_type": "markdown", 1639 | "metadata": {}, 1640 | "source": [ 1641 | "and so on.
Or, we can write it like this:" 1642 | ] 1643 | }, 1644 | { 1645 | "cell_type": "code", 1646 | "execution_count": null, 1647 | "metadata": {}, 1648 | "outputs": [], 1649 | "source": [ 1650 | "# range(1, 10, 2) starts at 1, stops before 10 and takes every other number\n", 1651 | "for i in range(1, 10, 2):\n", 1652 | "    print(i**3)" 1653 | ] 1654 | }, 1655 | { 1656 | "cell_type": "markdown", 1657 | "metadata": {}, 1658 | "source": [ 1659 | "By modifying the above example we can get the sum of the cubes of all odd numbers bigger than 0 and smaller than 100." 1660 | ] 1661 | }, 1662 | { 1663 | "cell_type": "code", 1664 | "execution_count": null, 1665 | "metadata": {}, 1666 | "outputs": [], 1667 | "source": [ 1668 | "odd_sum = 0\n", 1669 | "for i in range(1, 100, 2):\n", 1670 | "    odd_sum += i**3 # += is shorthand for odd_sum = odd_sum + i**3\n", 1671 | "    \n", 1672 | "print(odd_sum)" 1673 | ] 1674 | }, 1675 | { 1676 | "cell_type": "markdown", 1677 | "metadata": {}, 1678 | "source": [ 1679 | "Another statement that allows us to execute the same command multiple times is `while`. You can think of it as: while something is true, do this. Coming back to our example from earlier:" 1680 | ] 1681 | }, 1682 | { 1683 | "cell_type": "code", 1684 | "execution_count": null, 1685 | "metadata": {}, 1686 | "outputs": [], 1687 | "source": [ 1688 | "is_greater = False\n", 1689 | "while not is_greater:\n", 1690 | "    x = int(input('Please, provide a number greater than 0: '))\n", 1691 | "    if x > 0:\n", 1692 | "        is_greater = True\n", 1693 | "    else:\n", 1694 | "        print('We are still looking for a number greater than zero ;)')\n", 1695 | "print('Thanks!')" 1696 | ] 1697 | }, 1698 | { 1699 | "cell_type": "markdown", 1700 | "metadata": {}, 1701 | "source": [ 1702 | "#### More control flow tools: `continue` and `break`" 1703 | ] 1704 | }, 1705 | { 1706 | "cell_type": "markdown", 1707 | "metadata": {}, 1708 | "source": [ 1709 | "The `break` statement breaks out of the loop, i.e.
it stops any further iterations of the loop." 1710 | ] 1711 | }, 1712 | { 1713 | "cell_type": "code", 1714 | "execution_count": null, 1715 | "metadata": {}, 1716 | "outputs": [], 1717 | "source": [ 1718 | "while True:\n", 1719 | "    x = int(input('Please, provide a number greater than 0: '))\n", 1720 | "    if x > 0:\n", 1721 | "        break\n", 1722 | "    else:\n", 1723 | "        print('We are still looking for a number greater than zero ;)')\n", 1724 | "print('Thanks!')" 1725 | ] 1726 | }, 1727 | { 1728 | "cell_type": "markdown", 1729 | "metadata": {}, 1730 | "source": [ 1731 | "Another example of a `break` statement. Note that `break` only affects the innermost loop that contains it. The `else` clause below belongs to the inner `for` loop and runs only when that loop finishes without hitting `break`." 1732 | ] 1733 | }, 1734 | { 1735 | "cell_type": "code", 1736 | "execution_count": null, 1737 | "metadata": {}, 1738 | "outputs": [], 1739 | "source": [ 1740 | "for n in range(2, 10):\n", 1741 | "    for x in range(2, n):\n", 1742 | "        if n % x == 0:\n", 1743 | "            print(n, 'equals', x, '*', n // x)\n", 1744 | "            break # breaks out of the inner for loop\n", 1745 | "    else:\n", 1746 | "        # loop fell through without finding a factor\n", 1747 | "        print(n, 'is a prime number')" 1748 | ] 1749 | }, 1750 | { 1751 | "cell_type": "markdown", 1752 | "metadata": {}, 1753 | "source": [ 1754 | "`continue` skips the remainder of the current iteration and moves on to the next one."
1755 | ] 1756 | }, 1757 | { 1758 | "cell_type": "code", 1759 | "execution_count": null, 1760 | "metadata": {}, 1761 | "outputs": [], 1762 | "source": [ 1763 | "for num in range(2, 10):\n", 1764 | "    if num % 2 == 0:\n", 1765 | "        print(\"Found an even number\", num)\n", 1766 | "        continue # next line will not get executed if this is reached\n", 1767 | "    print(\"Found an odd number\", num)" 1768 | ] 1769 | }, 1770 | { 1771 | "cell_type": "code", 1772 | "execution_count": null, 1773 | "metadata": {}, 1774 | "outputs": [], 1775 | "source": [ 1776 | "for num in range(2, 10):\n", 1777 | "    if num % 2 == 0:\n", 1778 | "        print(\"Found an even number\", num)\n", 1779 | "    else:\n", 1780 | "        print(\"Found an odd number\", num)" 1781 | ] 1782 | }, 1783 | { 1784 | "cell_type": "markdown", 1785 | "metadata": {}, 1786 | "source": [ 1787 | "***" 1788 | ] 1789 | }, 1790 | { 1791 | "cell_type": "markdown", 1792 | "metadata": {}, 1793 | "source": [ 1794 | "**Exercise:** Write a loop that will print the following pattern: \n", 1795 | "1 \n", 1796 | "22 \n", 1797 | "333 \n", 1798 | "4444 \n", 1799 | "55555 \n", 1800 | "666666 \n", 1801 | "7777777 \n", 1802 | "88888888 \n", 1803 | "999999999 " 1804 | ] 1805 | }, 1806 | { 1807 | "cell_type": "code", 1808 | "execution_count": null, 1809 | "metadata": {}, 1810 | "outputs": [], 1811 | "source": [ 1812 | "# write your code here\n" 1813 | ] 1814 | }, 1815 | { 1816 | "cell_type": "markdown", 1817 | "metadata": {}, 1818 | "source": [ 1819 | "***\n", 1820 | "### Functions" 1821 | ] 1822 | }, 1823 | { 1824 | "cell_type": "markdown", 1825 | "metadata": {}, 1826 | "source": [ 1827 | "There are many [built-in functions in Python](https://docs.python.org/3/library/functions.html), and a universe of others available in various packages. That said, sometimes we need an operation that is really specific, or simply not worth loading a whole package for. Any operation that can be put into a function most likely should be put into a function.
Functions in Python are defined with the `def` statement as follows:" 1828 | ] 1829 | }, 1830 | { 1831 | "cell_type": "code", 1832 | "execution_count": null, 1833 | "metadata": {}, 1834 | "outputs": [], 1835 | "source": [ 1836 | "def function_name(function_arguments):\n", 1837 | "    # the body of the function, i.e. what it should do with the arguments\n", 1838 | "    print()" 1839 | ] 1840 | }, 1841 | { 1842 | "cell_type": "markdown", 1843 | "metadata": {}, 1844 | "source": [ 1845 | "For example, a simple function performing addition." 1846 | ] 1847 | }, 1848 | { 1849 | "cell_type": "code", 1850 | "execution_count": null, 1851 | "metadata": {}, 1852 | "outputs": [], 1853 | "source": [ 1854 | "def add(a, b):\n", 1855 | "    # adds two numbers and prints the result\n", 1856 | "    print(\"The sum of a = {} and b = {} is equal to {}\".format(a, b, a + b))" 1857 | ] 1858 | }, 1859 | { 1860 | "cell_type": "markdown", 1861 | "metadata": {}, 1862 | "source": [ 1863 | "Notice that we can provide arguments by position or by name." 1864 | ] 1865 | }, 1866 | { 1867 | "cell_type": "code", 1868 | "execution_count": null, 1869 | "metadata": {}, 1870 | "outputs": [], 1871 | "source": [ 1872 | "add(1, 4)" 1873 | ] 1874 | }, 1875 | { 1876 | "cell_type": "code", 1877 | "execution_count": null, 1878 | "metadata": {}, 1879 | "outputs": [], 1880 | "source": [ 1881 | "add(b=4, a=1)" 1882 | ] 1883 | }, 1884 | { 1885 | "cell_type": "markdown", 1886 | "metadata": {}, 1887 | "source": [ 1888 | "Notice that `a` and `b` inside the function are local: inside the function they shadow any variables of the same name that you may have defined globally. However, executing the function will not change those variables in your global environment. That said, it's better to give variables more informative names than `a` and `b`, and to avoid using the same name for different things."
1889 | ] 1890 | }, 1891 | { 1892 | "cell_type": "code", 1893 | "execution_count": null, 1894 | "metadata": {}, 1895 | "outputs": [], 1896 | "source": [ 1897 | "x = 5\n", 1898 | "b = 7\n", 1899 | "add(b, x)" 1900 | ] 1901 | }, 1902 | { 1903 | "cell_type": "code", 1904 | "execution_count": null, 1905 | "metadata": {}, 1906 | "outputs": [], 1907 | "source": [ 1908 | "print(x)" 1909 | ] 1910 | }, 1911 | { 1912 | "cell_type": "code", 1913 | "execution_count": null, 1914 | "metadata": {}, 1915 | "outputs": [], 1916 | "source": [ 1917 | "print(b)" 1918 | ] 1919 | }, 1920 | { 1921 | "cell_type": "markdown", 1922 | "metadata": {}, 1923 | "source": [ 1924 | "Now, let's look at another example." 1925 | ] 1926 | }, 1927 | { 1928 | "cell_type": "code", 1929 | "execution_count": null, 1930 | "metadata": {}, 1931 | "outputs": [], 1932 | "source": [ 1933 | "# Let's write a function that calculates an arithmetic mean \n", 1934 | "def arithmetic_mean(input_numbers):\n", 1935 | "    # input_numbers is a list\n", 1936 | "    return sum(input_numbers) / len(input_numbers)" 1937 | ] 1938 | }, 1939 | { 1940 | "cell_type": "markdown", 1941 | "metadata": {}, 1942 | "source": [ 1943 | "Now, we can use our function in a loop. " 1944 | ] 1945 | }, 1946 | { 1947 | "cell_type": "code", 1948 | "execution_count": null, 1949 | "metadata": {}, 1950 | "outputs": [], 1951 | "source": [ 1952 | "even_numbers = range(2, 100, 2)\n", 1953 | "s = 0\n", 1954 | "for number in even_numbers:\n", 1955 | "    s += arithmetic_mean([number, s])\n", 1956 | "print(s)" 1957 | ] 1958 | }, 1959 | { 1960 | "cell_type": "markdown", 1961 | "metadata": {}, 1962 | "source": [ 1963 | "Obviously, this would still work without defining a function. However, the function adds to the readability of the code, as well as to its potential for further development."
1964 | ] 1965 | }, 1966 | { 1967 | "cell_type": "code", 1968 | "execution_count": null, 1969 | "metadata": {}, 1970 | "outputs": [], 1971 | "source": [ 1972 | "even_numbers = range(2, 100, 2)\n", 1973 | "s = 0\n", 1974 | "for number in even_numbers:\n", 1975 | "    s += sum([number, s]) / len([number, s])\n", 1976 | "print(s)" 1977 | ] 1978 | }, 1979 | { 1980 | "cell_type": "markdown", 1981 | "metadata": {}, 1982 | "source": [ 1983 | "#### Side note: testing" 1984 | ] 1985 | }, 1986 | { 1987 | "cell_type": "markdown", 1988 | "metadata": {}, 1989 | "source": [ 1990 | "When writing a function, it's always good to think about what input you expect and what potential errors can arise while executing the code. This helps prevent your code from crashing. It also allows you to print more human-readable error messages." 1991 | ] 1992 | }, 1993 | { 1994 | "cell_type": "code", 1995 | "execution_count": null, 1996 | "metadata": {}, 1997 | "outputs": [], 1998 | "source": [ 1999 | "def arithmetic_mean_long(input_numbers):\n", 2000 | "    \"\"\"Calculates arithmetic mean\n", 2001 | "    Args:\n", 2002 | "        List of numbers\n", 2003 | "    Returns:\n", 2004 | "        Arithmetic mean\n", 2005 | "    \"\"\"\n", 2006 | "    try:\n", 2007 | "        result = sum(input_numbers) / len(input_numbers)\n", 2008 | "    except TypeError:\n", 2009 | "        not_numbers = [number for number in input_numbers if not isinstance(number, (int, float))]\n", 2010 | "        print('ERROR: Input must contain only numbers!
Those are not numbers: {}'.format(not_numbers))\n", 2011 | "    except ZeroDivisionError:\n", 2012 | "        print('ERROR: Input cannot be empty!')\n", 2013 | "    else:\n", 2014 | "        return result" 2015 | ] 2016 | }, 2017 | { 2018 | "cell_type": "code", 2019 | "execution_count": null, 2020 | "metadata": {}, 2021 | "outputs": [], 2022 | "source": [ 2023 | "test1 = [1, 2, 3, 4, 5]\n", 2024 | "arithmetic_mean(test1)" 2025 | ] 2026 | }, 2027 | { 2028 | "cell_type": "code", 2029 | "execution_count": null, 2030 | "metadata": {}, 2031 | "outputs": [], 2032 | "source": [ 2033 | "arithmetic_mean_long(test1)" 2034 | ] 2035 | }, 2036 | { 2037 | "cell_type": "code", 2038 | "execution_count": null, 2039 | "metadata": { 2040 | "scrolled": true 2041 | }, 2042 | "outputs": [], 2043 | "source": [ 2044 | "test2 = [1, 2, 3, '4', '5']\n", 2045 | "arithmetic_mean(test2)" 2046 | ] 2047 | }, 2048 | { 2049 | "cell_type": "code", 2050 | "execution_count": null, 2051 | "metadata": {}, 2052 | "outputs": [], 2053 | "source": [ 2054 | "arithmetic_mean_long(test2)" 2055 | ] 2056 | }, 2057 | { 2058 | "cell_type": "code", 2059 | "execution_count": null, 2060 | "metadata": {}, 2061 | "outputs": [], 2062 | "source": [ 2063 | "test3 = []\n", 2064 | "arithmetic_mean(test3)" 2065 | ] 2066 | }, 2067 | { 2068 | "cell_type": "code", 2069 | "execution_count": null, 2070 | "metadata": {}, 2071 | "outputs": [], 2072 | "source": [ 2073 | "arithmetic_mean_long(test3)" 2074 | ] 2075 | }, 2076 | { 2077 | "cell_type": "markdown", 2078 | "metadata": {}, 2079 | "source": [ 2080 | "***" 2081 | ] 2082 | }, 2083 | { 2084 | "cell_type": "markdown", 2085 | "metadata": {}, 2086 | "source": [ 2087 | "**Exercise:** Write a function called `fizz_buzz` that takes a number and:\n", 2088 | "* If the number is divisible by 3, it should print “Fizz”.\n", 2089 | "* If it is divisible by 5, it should print “Buzz”.\n", 2090 | "* If it is divisible by both 3 and 5, it should print “FizzBuzz”.\n", 2091 | "* Otherwise, it should print the same
number." 2092 | ] 2093 | }, 2094 | { 2095 | "cell_type": "code", 2096 | "execution_count": null, 2097 | "metadata": {}, 2098 | "outputs": [], 2099 | "source": [ 2100 | "# write your function here\n", 2101 | "def fizz_buzz():" 2102 | ] 2103 | }, 2104 | { 2105 | "cell_type": "code", 2106 | "execution_count": null, 2107 | "metadata": {}, 2108 | "outputs": [], 2109 | "source": [ 2110 | "# and test it\n", 2111 | "fizz_buzz(9) # Fizz" 2112 | ] 2113 | }, 2114 | { 2115 | "cell_type": "code", 2116 | "execution_count": null, 2117 | "metadata": {}, 2118 | "outputs": [], 2119 | "source": [ 2120 | "fizz_buzz(25) # Buzz" 2121 | ] 2122 | }, 2123 | { 2124 | "cell_type": "code", 2125 | "execution_count": null, 2126 | "metadata": {}, 2127 | "outputs": [], 2128 | "source": [ 2129 | "fizz_buzz(15) # FizzBuzz" 2130 | ] 2131 | }, 2132 | { 2133 | "cell_type": "code", 2134 | "execution_count": null, 2135 | "metadata": {}, 2136 | "outputs": [], 2137 | "source": [ 2138 | "fizz_buzz(13) # 13" 2139 | ] 2140 | }, 2141 | { 2142 | "cell_type": "markdown", 2143 | "metadata": {}, 2144 | "source": [ 2145 | "**Exercise:** Write a function for checking the speed of drivers. This function should have one parameter: speed. \n", 2146 | "* If speed is less than 70, it should print “Ok”.\n", 2147 | "* Otherwise, for every 5 km/h (or part thereof) above the speed limit (70), it should give the driver one demerit point and print the total number of demerit points.
For example, if the speed is 80, it should print: “Points: 2”.\n", 2148 | "* If the driver gets more than 12 points, the function should print: “License suspended”." 2149 | ] 2150 | }, 2151 | { 2152 | "cell_type": "code", 2153 | "execution_count": null, 2154 | "metadata": {}, 2155 | "outputs": [], 2156 | "source": [ 2157 | "# write your function here\n", 2158 | "def check_the_speed():\n", 2159 | " " 2160 | ] 2161 | }, 2162 | { 2163 | "cell_type": "code", 2164 | "execution_count": null, 2165 | "metadata": {}, 2166 | "outputs": [], 2167 | "source": [ 2168 | "# let's test it:\n", 2169 | "check_the_speed(64) # Ok" 2170 | ] 2171 | }, 2172 | { 2173 | "cell_type": "code", 2174 | "execution_count": null, 2175 | "metadata": {}, 2176 | "outputs": [], 2177 | "source": [ 2178 | "check_the_speed(71) # Points: 1" 2179 | ] 2180 | }, 2181 | { 2182 | "cell_type": "code", 2183 | "execution_count": null, 2184 | "metadata": {}, 2185 | "outputs": [], 2186 | "source": [ 2187 | "check_the_speed(83) # Points: 3" 2188 | ] 2189 | }, 2190 | { 2191 | "cell_type": "code", 2192 | "execution_count": null, 2193 | "metadata": {}, 2194 | "outputs": [], 2195 | "source": [ 2196 | "check_the_speed(131) # License suspended" 2197 | ] 2198 | }, 2199 | { 2200 | "cell_type": "markdown", 2201 | "metadata": {}, 2202 | "source": [ 2203 | "# Resources\n", 2204 | "\n", 2205 | "## Books\n", 2206 | "* [Python for Data Analysis](https://www.oreilly.com/library/view/python-for-data/9781491957653/)\n", 2207 | "* [Python in a Nutshell](http://shop.oreilly.com/product/0636920012610.do)\n", 2208 | "* [Think Python: How to Think Like a Computer Scientist](http://greenteapress.com/thinkpython/html/index.html) - whole book available online\n", 2209 | "* [A Byte of Python](https://python.swaroopch.com/) - whole book available online\n", 2210 | "\n", 2211 | "## Interactive courses\n", 2212 | "* [Learn Python](https://www.learnpython.org/)\n", 2213 | "* [A Python crash
course](https://www.grahamwheeler.com/posts/python-crash-course.html) - This one is aimed at people already programming in Java\n", 2214 | "* [Python for beginners](http://opentechschool.github.io/python-beginners/en/index.html)\n", 2215 | "* [Tech Dev Guide by Google](https://techdevguide.withgoogle.com/) - Google resources for learning and advancing your programming skills\n", 2216 | "* [Courses in Python on edX](https://www.edx.org/learn/python)\n", 2217 | "* [Codecademy - Python3](https://www.codecademy.com/learn/learn-python-3) - unfortunately behind a paywall\n", 2218 | "* [10 Minutes to Pandas](https://pandas.pydata.org/pandas-docs/stable/getting_started/10min.html)\n", 2219 | "* [Exploratory Data Analysis with Pandas and NumPy](https://www.grahamwheeler.com/posts/exploratory-data-analysis-with-numpy-and-pandas.html)\n", 2220 | "\n", 2221 | "## Online IDEs\n", 2222 | "* [Repl](https://repl.it/languages/python3)\n", 2223 | "\n", 2224 | "## Other\n", 2225 | "* [List of resources on Hackr.io](https://hackr.io/tutorials/learn-python)\n", 2226 | "* [Python exercises for beginners](https://programmingwithmosh.com/python/python-exercises-and-questions-for-beginners/)" 2227 | ] 2228 | } 2229 | ], 2230 | "metadata": { 2231 | "kernelspec": { 2232 | "display_name": "Python 3", 2233 | "language": "python", 2234 | "name": "python3" 2235 | }, 2236 | "language_info": { 2237 | "codemirror_mode": { 2238 | "name": "ipython", 2239 | "version": 3 2240 | }, 2241 | "file_extension": ".py", 2242 | "mimetype": "text/x-python", 2243 | "name": "python", 2244 | "nbconvert_exporter": "python", 2245 | "pygments_lexer": "ipython3", 2246 | "version": "3.8.3" 2247 | } 2248 | }, 2249 | "nbformat": 4, 2250 | "nbformat_minor": 4 2251 | } 2252 | -------------------------------------------------------------------------------- /2_Python_for_Data_Science.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type":
"markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Python for Data Science" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "### About this notebook\n", 15 | "\n", 16 | "This part of the tutorial focuses on how to leverage vast repository of\n", 17 | "Python modules to efficiently wrangle, analyze and visualize small and big data. We will mostly focus on NumPy, Pandas and Matplotlib, but also touch on ML libraries and Seaborn.\n", 18 | "\n", 19 | "This notebook will be organized a bit differently than the previous one. Here, I will first introduce the modules showing some essentials and then, we will move on to tackling problems and learning by doing. " 20 | ] 21 | }, 22 | { 23 | "cell_type": "markdown", 24 | "metadata": {}, 25 | "source": [ 26 | "***\n", 27 | "### Modules" 28 | ] 29 | }, 30 | { 31 | "cell_type": "markdown", 32 | "metadata": {}, 33 | "source": [ 34 | "Modules are what, in my opinion, make Python so good. They provide the needed\n", 35 | "extensions to the already neat language. Within modules you can find solutions\n", 36 | " to many of your problems and tasks.\n", 37 | "\n", 38 | "Let me introduce few crucial modules, some of them come with Pyton installation, some you have to install on your own.\n", 39 | "\n", 40 | "Standard Library (i.e. modules that come with Python):\n", 41 | "* [os](https://docs.python.org/3/library/os.html) - provides multiple OS interfaces \n", 42 | "* [re](https://docs.python.org/3/library/re.html?highlight=re#module-re) - module for regular expressions\n", 43 | "* [math](https://docs.python.org/3/library/math.html) - math operations extension (includes rounding up, calculating factorials and much more)\n", 44 | "* [statistcs](https://docs.python.org/3/library/statistics.html) - some statistical functions - among others facilitates mode, medians etc. 
calculations\n", 45 | "* [random](https://docs.python.org/3/library/random.html?highlight=random#module-random) - *generates pseudo-random numbers*\n", 46 | "* [collections](https://docs.python.org/3/library/collections.html) - extensions of the basic collections\n", 47 | "* [itertools](https://docs.python.org/3/library/itertools.html) - great module for various iteration and efficient looping\n", 48 | "* [multiprocessing](https://docs.python.org/3/library/multiprocessing.html) - module allowing parallel computing\n", 49 | "* [logging](https://docs.python.org/3/library/logging.html) - module for\n", 50 | " generating nice and comprehensive messages\n", 51 | "\n", 52 | "You can find many more described [here](https://docs.python.org/3/library/).\n", 53 | "\n", 54 | "Other crucial modules, especially for Data Science:\n", 55 | "\n", 56 | "* [NumPy](https://www.numpy.org/) - *the fundamental package for scientific computing with Python*\n", 57 | "* [Pandas](https://pandas.pydata.org/) - *library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language*\n", 58 | "* [SciPy](https://www.scipy.org/) - *Python-based ecosystem of open-source software for mathematics, science, and engineering*\n", 59 | "* [Matplotlib](https://matplotlib.org/) - *2D plotting library which produces publication quality figures*\n", 60 | "* [Tensorflow](https://www.tensorflow.org/) - *An end-to-end open source machine learning platform*\n", 61 | "* [Keras](https://keras.io/) - *The Python Deep Learning library*\n", 62 | "* [scikit-learn](https://scikit-learn.org/stable/) - *Machine Learning in Python*\n", 63 | "* [Scrapy](https://scrapy.org/) - *An open source and collaborative framework for extracting the data you need from websites. 
In a fast, simple, yet extensible way.*\n", 64 | "* [Beautiful Soup](https://www.crummy.com/software/BeautifulSoup/) - module for XML and HTML parsing\n", 65 | "\n" 66 | ] 67 | }, 68 | { 69 | "cell_type": "markdown", 70 | "metadata": {}, 71 | "source": [ 72 | "We can import whole modules by using the `import module_name` statement. We usually place imports at the very beginning of the script in order to specify what modules the following code relies on. The functions can then be accessed by preceding the function name with the module name." 73 | ] 74 | }, 75 | { 76 | "cell_type": "code", 77 | "execution_count": null, 78 | "metadata": {}, 79 | "outputs": [], 80 | "source": [ 81 | "import random\n", 82 | "print(random.randint(1, 19))" 83 | ] 84 | }, 85 | { 86 | "cell_type": "markdown", 87 | "metadata": {}, 88 | "source": [ 89 | "If we only need some functions from a module, we can load only those by using the `from module_name import function_name` statement as follows. The functions can then be accessed without referring to the module name." 90 | ] 91 | }, 92 | { 93 | "cell_type": "code", 94 | "execution_count": null, 95 | "metadata": {}, 96 | "outputs": [], 97 | "source": [ 98 | "from math import pi, ceil\n", 99 | "print(pi)\n", 100 | "print(ceil(pi))" 101 | ] 102 | }, 103 | { 104 | "cell_type": "markdown", 105 | "metadata": {}, 106 | "source": [ 107 | "In the same way we can also load all functions from within the module by using the `from module_name import *` statement. In this case, we also don't have to precede the function names with the module name." 108 | ] 109 | }, 110 | { 111 | "cell_type": "code", 112 | "execution_count": null, 113 | "metadata": {}, 114 | "outputs": [], 115 | "source": [ 116 | "from math import *\n", 117 | "print(factorial(3))" 118 | ] 119 | }, 120 | { 121 | "cell_type": "markdown", 122 | "metadata": {}, 123 | "source": [ 124 | "And lastly, we can create an abbreviation for a given module and precede the function names with this abbreviation only, by using `import module_name as alias`." 
125 | ] 126 | }, 127 | { 128 | "cell_type": "code", 129 | "execution_count": null, 130 | "metadata": {}, 131 | "outputs": [], 132 | "source": [ 133 | "import statistics as st\n", 134 | "print(st.median([1, 3, 5, 7]))" 135 | ] 136 | }, 137 | { 138 | "cell_type": "markdown", 139 | "metadata": {}, 140 | "source": [ 141 | "***\n", 142 | "## NumPy\n", 143 | "\n", 144 | "Without further ado, we will start with `numpy` - a module designed for scientific computing." 145 | ] 146 | }, 147 | { 148 | "cell_type": "code", 149 | "execution_count": null, 150 | "metadata": {}, 151 | "outputs": [], 152 | "source": [ 153 | "# Let's load the packages\n", 154 | "import numpy as np" 155 | ] 156 | }, 157 | { 158 | "cell_type": "markdown", 159 | "metadata": {}, 160 | "source": [ 161 | "Arrays are the primary type in _numpy_. We can create an array by converting a list of elements into a numpy array by calling `np.array()`." 162 | ] 163 | }, 164 | { 165 | "cell_type": "code", 166 | "execution_count": null, 167 | "metadata": { 168 | "scrolled": true 169 | }, 170 | "outputs": [], 171 | "source": [ 172 | "a_list = [1, 2, 3, 4, 5, 6, 7, 8, 9]\n", 173 | "A = np.array(a_list)\n", 174 | "a_list" 175 | ] 176 | }, 177 | { 178 | "cell_type": "code", 179 | "execution_count": null, 180 | "metadata": { 181 | "collapsed": false, 182 | "jupyter": { 183 | "outputs_hidden": false 184 | }, 185 | "pycharm": { 186 | "name": "#%%\n" 187 | } 188 | }, 189 | "outputs": [], 190 | "source": [ 191 | "A" 192 | ] 193 | }, 194 | { 195 | "cell_type": "code", 196 | "execution_count": null, 197 | "metadata": {}, 198 | "outputs": [], 199 | "source": [ 200 | "A.dtype" 201 | ] 202 | }, 203 | { 204 | "cell_type": "markdown", 205 | "metadata": {}, 206 | "source": [ 207 | "We can create arrays from lists with different types of elements. `Numpy` will pick a single common type according to a certain hierarchy, e.g. if we have strings and numbers, all elements will be converted to strings." 
208 | ] 209 | }, 210 | { 211 | "cell_type": "code", 212 | "execution_count": null, 213 | "metadata": {}, 214 | "outputs": [], 215 | "source": [ 216 | "b_list = [1, 2, 3, 4, 5, 6, '7', 8, 9]\n", 217 | "B = np.array(b_list)\n", 218 | "B" 219 | ] 220 | }, 221 | { 222 | "cell_type": "markdown", 223 | "metadata": {}, 224 | "source": [ 225 | "We can change the type of an array. " 226 | ] 227 | }, 228 | { 229 | "cell_type": "code", 230 | "execution_count": null, 231 | "metadata": {}, 232 | "outputs": [], 233 | "source": [ 234 | "B = B.astype('float64')\n", 235 | "B" 236 | ] 237 | }, 238 | { 239 | "cell_type": "markdown", 240 | "metadata": {}, 241 | "source": [ 242 | "Arrays have shapes. We are most familiar with 1-, 2- and 3-dimensional arrays. However, they can easily be N-dimensional. This comes in handy especially in Machine Learning. We will not explore tensors in detail, as this is a beginner's guide to `numpy`, but the principles shown below extend to more complex objects." 243 | ] 244 | }, 245 | { 246 | "cell_type": "code", 247 | "execution_count": null, 248 | "metadata": {}, 249 | "outputs": [], 250 | "source": [ 251 | "print(A.shape)" 252 | ] 253 | }, 254 | { 255 | "cell_type": "markdown", 256 | "metadata": {}, 257 | "source": [ 258 | "Our array `A` is one-dimensional: its shape is `(9,)`, i.e. it has a single axis of length 9 and is neither a row nor a column. It looks suspiciously similar to a 2-dimensional matrix with the shape 1 x 9." 259 | ] 260 | }, 261 | { 262 | "cell_type": "code", 263 | "execution_count": null, 264 | "metadata": {}, 265 | "outputs": [], 266 | "source": [ 267 | "A_reshaped = A.reshape(1, 9)\n", 268 | "print(A_reshaped.shape)" 269 | ] 270 | }, 271 | { 272 | "cell_type": "code", 273 | "execution_count": null, 274 | "metadata": {}, 275 | "outputs": [], 276 | "source": [ 277 | "A_reshaped" 278 | ] 279 | }, 280 | { 281 | "cell_type": "markdown", 282 | "metadata": {}, 283 | "source": [ 284 | "So, if it looks like a dog, walks like a dog and barks like a dog, then it should be a dog, yes? 
No, at least not in this case. Shapes are really important for arrays. " 285 | ] 286 | }, 287 | { 288 | "cell_type": "code", 289 | "execution_count": null, 290 | "metadata": { 291 | "collapsed": false, 292 | "jupyter": { 293 | "outputs_hidden": false 294 | }, 295 | "pycharm": { 296 | "name": "#%%\n" 297 | } 298 | }, 299 | "outputs": [], 300 | "source": [ 301 | "A_reshaped == A" 302 | ] 303 | }, 304 | { 305 | "cell_type": "code", 306 | "execution_count": null, 307 | "metadata": {}, 308 | "outputs": [], 309 | "source": [ 310 | "np.array_equal(A, A_reshaped)" 311 | ] 312 | }, 313 | { 314 | "cell_type": "markdown", 315 | "metadata": {}, 316 | "source": [ 317 | "We can create matrices from lists, and it is quite useful to be able to reshape them." 318 | ] 319 | }, 320 | { 321 | "cell_type": "code", 322 | "execution_count": null, 323 | "metadata": {}, 324 | "outputs": [], 325 | "source": [ 326 | "A = A.reshape(3, 3)\n", 327 | "A" 328 | ] 329 | }, 330 | { 331 | "cell_type": "markdown", 332 | "metadata": {}, 333 | "source": [ 334 | "We can also have 3D arrays; we can simply reshape our existing matrices." 335 | ] 336 | }, 337 | { 338 | "cell_type": "code", 339 | "execution_count": null, 340 | "metadata": {}, 341 | "outputs": [], 342 | "source": [ 343 | "A = A.reshape(3, 3, 1)\n", 344 | "A" 345 | ] 346 | }, 347 | { 348 | "cell_type": "markdown", 349 | "metadata": {}, 350 | "source": [ 351 | "There are special kinds of arrays that come in handy when we need to initialize an array that we will later fill with our data. " 352 | ] 353 | }, 354 | { 355 | "cell_type": "markdown", 356 | "metadata": {}, 357 | "source": [ 358 | "An empty array (its values are uninitialized)" 359 | ] 360 | }, 361 | { 362 | "cell_type": "code", 363 | "execution_count": null, 364 | "metadata": {}, 365 | "outputs": [], 366 | "source": [ 367 | "np.empty([2, 2])" 368 | ] 369 | }, 370 | { 371 | "cell_type": "markdown", 372 | "metadata": {}, 373 | "source": [ 374 | "Array filled with zeros..." 
375 | ] 376 | }, 377 | { 378 | "cell_type": "code", 379 | "execution_count": null, 380 | "metadata": {}, 381 | "outputs": [], 382 | "source": [ 383 | "np.zeros([2, 3])" 384 | ] 385 | }, 386 | { 387 | "cell_type": "markdown", 388 | "metadata": {}, 389 | "source": [ 390 | "and similarly, an array with ones..." 391 | ] 392 | }, 393 | { 394 | "cell_type": "code", 395 | "execution_count": null, 396 | "metadata": {}, 397 | "outputs": [], 398 | "source": [ 399 | "np.ones([3, 5])" 400 | ] 401 | }, 402 | { 403 | "cell_type": "markdown", 404 | "metadata": {}, 405 | "source": [ 406 | "and other values." 407 | ] 408 | }, 409 | { 410 | "cell_type": "code", 411 | "execution_count": null, 412 | "metadata": {}, 413 | "outputs": [], 414 | "source": [ 415 | "np.full([2, 4], 10)" 416 | ] 417 | }, 418 | { 419 | "cell_type": "markdown", 420 | "metadata": {}, 421 | "source": [ 422 | "An array with a sequence of numbers, and so on." 423 | ] 424 | }, 425 | { 426 | "cell_type": "code", 427 | "execution_count": null, 428 | "metadata": {}, 429 | "outputs": [], 430 | "source": [ 431 | "C = np.arange(0, 20, 2)\n", 432 | "C = C.reshape((2,5))\n", 433 | "C" 434 | ] 435 | }, 436 | { 437 | "cell_type": "code", 438 | "execution_count": null, 439 | "metadata": {}, 440 | "outputs": [], 441 | "source": [ 442 | "print(C.shape)" 443 | ] 444 | }, 445 | { 446 | "cell_type": "markdown", 447 | "metadata": {}, 448 | "source": [ 449 | "Numpy allows us to easily perform matrix operations. For example, we might be\n", 450 | " interested in the sum of all of the array elements." 
451 | ] 452 | }, 453 | { 454 | "cell_type": "code", 455 | "execution_count": null, 456 | "metadata": {}, 457 | "outputs": [], 458 | "source": [ 459 | "M = np.arange(1, 10).reshape(3, 3)\n", 460 | "print(M)" 461 | ] 462 | }, 463 | { 464 | "cell_type": "code", 465 | "execution_count": null, 466 | "metadata": {}, 467 | "outputs": [], 468 | "source": [ 469 | "np.sum(M)" 470 | ] 471 | }, 472 | { 473 | "cell_type": "markdown", 474 | "metadata": {}, 475 | "source": [ 476 | "Or, we might want to get the column sums." 477 | ] 478 | }, 479 | { 480 | "cell_type": "code", 481 | "execution_count": null, 482 | "metadata": {}, 483 | "outputs": [], 484 | "source": [ 485 | "np.sum(M, axis = 0)" 486 | ] 487 | }, 488 | { 489 | "cell_type": "markdown", 490 | "metadata": {}, 491 | "source": [ 492 | "We can multiply an array by a scalar:" 493 | ] 494 | }, 495 | { 496 | "cell_type": "code", 497 | "execution_count": null, 498 | "metadata": {}, 499 | "outputs": [], 500 | "source": [ 501 | "M * 2" 502 | ] 503 | }, 504 | { 505 | "cell_type": "markdown", 506 | "metadata": {}, 507 | "source": [ 508 | "add a vector to each row:" 509 | ] 510 | }, 511 | { 512 | "cell_type": "code", 513 | "execution_count": null, 514 | "metadata": {}, 515 | "outputs": [], 516 | "source": [ 517 | "print(M)" 518 | ] 519 | }, 520 | { 521 | "cell_type": "code", 522 | "execution_count": null, 523 | "metadata": {}, 524 | "outputs": [], 525 | "source": [ 526 | "v = np.array([[1, 2, 3]])\n", 527 | "print(M + v)" 528 | ] 529 | }, 530 | { 531 | "cell_type": "markdown", 532 | "metadata": {}, 533 | "source": [ 534 | "or to each column:" 535 | ] 536 | }, 537 | { 538 | "cell_type": "code", 539 | "execution_count": null, 540 | "metadata": {}, 541 | "outputs": [], 542 | "source": [ 543 | "print(M)" 544 | ] 545 | }, 546 | { 547 | "cell_type": "code", 548 | "execution_count": null, 549 | "metadata": {}, 550 | "outputs": [], 551 | "source": [ 552 | "vec = np.array([[1, 2, 3]]).transpose()\n", 553 | "print(M + vec)" 554 | ] 555 | }, 556 | { 557 | 
"cell_type": "markdown", 558 | "metadata": {}, 559 | "source": [ 560 | "### Solving linear equations\n", 561 | "\n", 562 | "*Note:* Don't worry if you are not familiar with linear algebra. I am using this as an example because it highlights the advantages of numpy and why it is so widely used in ML. I will walk you through all steps, and hopefully it will be quite straight forward.\n", 563 | "\n", 564 | "Suppose we want to solve the following system of equations.\n", 565 | "\n", 566 | "$$ \\begin{cases} x_1 + x_2 = 1 \\\\ 4x_1 + 6x_2 +x_3 = 2 \\\\ -2x_1 + 2x_2 = 3 \\end{cases} $$" 567 | ] 568 | }, 569 | { 570 | "cell_type": "markdown", 571 | "metadata": {}, 572 | "source": [ 573 | "We first start by transforming the equations into matrix form. \n", 574 | "\n", 575 | "$$ A x = b $$\n", 576 | "\n", 577 | "$$ A = \\begin{bmatrix} 1 & 1 & 0 \\\\ 4 & 6 & 1 \\\\ -2 & 2 & 0 \\\\ \\end{bmatrix}, x = \\begin{bmatrix} x_1 & x_2 & x_3 \\\\ \\end{bmatrix}^T, b = \\begin{bmatrix} 1 & 2 & 3 \\\\ \\end{bmatrix}^T $$" 578 | ] 579 | }, 580 | { 581 | "cell_type": "markdown", 582 | "metadata": {}, 583 | "source": [ 584 | "We will use Gaussian elimination, and solve it on paper with the help of numpy. \n", 585 | "\n", 586 | "1. We start with finding the Upper triangular matrix - $U$.\n", 587 | "\n", 588 | "![](./figures/upper_triangular.png)" 589 | ] 590 | }, 591 | { 592 | "cell_type": "markdown", 593 | "metadata": {}, 594 | "source": [ 595 | "2. Now that we have $U$, and we know that $U = EA$ we need to find the elimination matrix - $E$.\n", 596 | "\n", 597 | "To do that we need to write the transformation from above in matrix form.\n", 598 | "\n", 599 | "![](./figures/elimination.png)\n", 600 | "\n", 601 | "And knowing that $E = E_2E_1$, we can get the elimination matrix with the help of numpy." 
602 | ] 603 | }, 604 | { 605 | "cell_type": "code", 606 | "execution_count": null, 607 | "metadata": {}, 608 | "outputs": [], 609 | "source": [ 610 | "E1 = np.array([[1, 0, 0], [-4, 1, 0], [2, 0, 1]])\n", 611 | "E2 = np.array([[1, 0, 0], [0, 1, 0], [0, -2, 1]])\n", 612 | "E = np.matmul(E2, E1)\n", 613 | "E" 614 | ] 615 | }, 616 | { 617 | "cell_type": "markdown", 618 | "metadata": {}, 619 | "source": [ 620 | "Now that we have the elimination matrix, we can also check that everything is ok by confirming that $EA$ matches the $U$ we computed on paper." 621 | ] 622 | }, 623 | { 624 | "cell_type": "code", 625 | "execution_count": null, 626 | "metadata": {}, 627 | "outputs": [], 628 | "source": [ 629 | "A = np.array([[1, 1, 0], [4, 6, 1], [-2, 2, 0]])\n", 630 | "U = np.matmul(E, A) # matrix multiplication\n", 631 | "U" 632 | ] 633 | }, 634 | { 635 | "cell_type": "markdown", 636 | "metadata": {}, 637 | "source": [ 638 | "We want to solve $Ax = b$. In order to do so, we can multiply both sides of the equation by our elimination matrix: $EAx = Eb$. Knowing that $U = EA$, we get the following:\n", 639 | "\n", 640 | "$$Ux = Eb$$\n", 641 | "\n", 642 | "We then need $Eb$." 643 | ] 644 | }, 645 | { 646 | "cell_type": "code", 647 | "execution_count": null, 648 | "metadata": {}, 649 | "outputs": [], 650 | "source": [ 651 | "b = np.array([1, 2, 3]).reshape(3, 1)\n", 652 | "Eb = np.matmul(E, b)\n", 653 | "Eb" 654 | ] 655 | }, 656 | { 657 | "cell_type": "markdown", 658 | "metadata": {}, 659 | "source": [ 660 | "And with that, we only need to solve this one last trivial set of equations. 
\n", 661 | "\n", 662 | "$$ Ux = \\begin{bmatrix} 1 & 1 & 0 \\\\ 0 & 2 & 1 \\\\ 0 & 0 & -2 \\end{bmatrix} \\begin{bmatrix} x_1 \\\\ x_2 \\\\ x_3 \\end{bmatrix} = \\begin{bmatrix} 1 \\\\ -2 \\\\ 9 \\end{bmatrix}$$" 663 | ] 664 | }, 665 | { 666 | "cell_type": "markdown", 667 | "metadata": {}, 668 | "source": [ 669 | "$$ \\begin{cases} x_1 + x_2 = 1 \\\\ 2x_2 + x_3 = -2 \\\\ -2x_3 = 9 \\end{cases} $$" 670 | ] 671 | }, 672 | { 673 | "cell_type": "code", 674 | "execution_count": null, 675 | "metadata": {}, 676 | "outputs": [], 677 | "source": [ 678 | "x3 = 9 / -2\n", 679 | "x2 = (-2 - x3)/2\n", 680 | "x1 = 1 - x2\n", 681 | "\n", 682 | "x1, x2, x3" 683 | ] 684 | }, 685 | { 686 | "cell_type": "markdown", 687 | "metadata": {}, 688 | "source": [ 689 | "Or, we can let numpy do all of the work for us :)" 690 | ] 691 | }, 692 | { 693 | "cell_type": "code", 694 | "execution_count": null, 695 | "metadata": {}, 696 | "outputs": [], 697 | "source": [ 698 | "np.linalg.solve(A, b)" 699 | ] 700 | }, 701 | { 702 | "cell_type": "markdown", 703 | "metadata": {}, 704 | "source": [ 705 | "The linear algebra functions within numpy cover many common linear algebra computations, including singular value decomposition and eigendecomposition. Using these functions one can, for example, hand-code PCA. The full list and documentation can be accessed [here](https://numpy.org/doc/stable/reference/routines.linalg.html)." 
706 | ] 707 | }, 708 | { 709 | "cell_type": "markdown", 710 | "metadata": {}, 711 | "source": [ 712 | "***" 713 | ] 714 | }, 715 | { 716 | "cell_type": "markdown", 717 | "metadata": {}, 718 | "source": [ 719 | "**Exercise** \n", 720 | "Given:\n", 721 | "$$ A = \\begin{bmatrix} 1 & 1 & 0 \\\\ 4 & 6 & 1 \\\\ -2 & 2 & 0 \\\\ \\end{bmatrix}, B = \\begin{bmatrix} -1 & -2 & -3 \\\\ -4 & -5 & -6 \\\\ \\end{bmatrix}, C = \\begin{bmatrix} 5 & 10 & 15 \\\\ 20 & 25 & 30 \\\\ 35 & 40 & 45 \\\\ \\end{bmatrix} $$ \n", 722 | "\n", 723 | "* create those arrays; " 724 | ] 725 | }, 726 | { 727 | "cell_type": "code", 728 | "execution_count": null, 729 | "metadata": {}, 730 | "outputs": [], 731 | "source": [ 732 | "# your code here :) \n", 733 | "# A =\n", 734 | "# B =\n", 735 | "# C =" 736 | ] 737 | }, 738 | { 739 | "cell_type": "markdown", 740 | "metadata": {}, 741 | "source": [ 742 | "* add A and C;" 743 | ] 744 | }, 745 | { 746 | "cell_type": "code", 747 | "execution_count": null, 748 | "metadata": {}, 749 | "outputs": [], 750 | "source": [ 751 | "# your code here :) \n" 752 | ] 753 | }, 754 | { 755 | "cell_type": "markdown", 756 | "metadata": {}, 757 | "source": [ 758 | "* multiply B by A; " 759 | ] 760 | }, 761 | { 762 | "cell_type": "code", 763 | "execution_count": null, 764 | "metadata": {}, 765 | "outputs": [], 766 | "source": [ 767 | "# your code here :) \n" 768 | ] 769 | }, 770 | { 771 | "cell_type": "markdown", 772 | "metadata": {}, 773 | "source": [ 774 | "* multiply the 3rd column of A by the 2nd row of B" 775 | ] 776 | }, 777 | { 778 | "cell_type": "code", 779 | "execution_count": null, 780 | "metadata": {}, 781 | "outputs": [], 782 | "source": [ 783 | "# your code here :) \n" 784 | ] 785 | }, 786 | { 787 | "cell_type": "markdown", 788 | "metadata": {}, 789 | "source": [ 790 | "**Exercise** \n", 791 | "This one is a bit more challenging.\n", 792 | "\n", 793 | "The linear regression model is given by the formula:\n", 794 | "$$y = X\\beta + \\epsilon$$\n", 795 | "\n", 796 | "A 
closed-form solution has the following form. \n", 797 | "\n", 798 | "$$ \\hat\\beta = (X^TX)^{-1}X^Ty$$\n", 799 | "\n", 800 | "Given $X$ and $y$, calculate $\\hat\\beta$." 801 | ] 802 | }, 803 | { 804 | "cell_type": "markdown", 805 | "metadata": {}, 806 | "source": [ 807 | "Let's create random data from a linear model given by the following formula:\n", 808 | "$$ y = 0.5 + 2x_1 - x_2 $$" 809 | ] 810 | }, 811 | { 812 | "cell_type": "code", 813 | "execution_count": null, 814 | "metadata": {}, 815 | "outputs": [], 816 | "source": [ 817 | "n = 5\n", 818 | "# generate the variables\n", 819 | "ones = np.ones(n)\n", 820 | "x_1 = np.random.randn(n)\n", 821 | "x_2 = np.random.randn(n)\n", 822 | "X = np.column_stack((ones, x_1, x_2))\n", 823 | "# generate the y from a linear model\n", 824 | "y = 0.5 * ones + 2 * x_1 - x_2\n", 825 | "# and add noise\n", 826 | "y = y + np.random.normal(0, 0.1, n)" 827 | ] 828 | }, 829 | { 830 | "cell_type": "code", 831 | "execution_count": null, 832 | "metadata": {}, 833 | "outputs": [], 834 | "source": [ 835 | "# your code here :)\n", 836 | "beta_hat = " 837 | ] 838 | }, 839 | { 840 | "cell_type": "code", 841 | "execution_count": null, 842 | "metadata": {}, 843 | "outputs": [], 844 | "source": [ 845 | "np.matmul(X, beta_hat)" 846 | ] 847 | }, 848 | { 849 | "cell_type": "markdown", 850 | "metadata": {}, 851 | "source": [ 852 | "***\n", 853 | "\n", 854 | "## Pandas\n", 855 | "\n", 856 | "Working with data more often than not means working with tables of all sorts. The way to go about it in Python is with `pandas`." 857 | ] 858 | }, 859 | { 860 | "cell_type": "markdown", 861 | "metadata": {}, 862 | "source": [ 863 | "`Pandas` is a module that allows for data frame manipulation. It resembles the `R data.frame` & `dplyr` / `data.table` handling." 864 | ] 865 | }, 866 | { 867 | "cell_type": "markdown", 868 | "metadata": {}, 869 | "source": [ 870 | "We will get to explore only the tip of the iceberg of the capabilities of the `pandas` module. 
More about `numpy`, `pandas` and data science with Python can be found [here - in the Python Data Science Handbook](https://jakevdp.github.io/PythonDataScienceHandbook/index.html)." 871 | ] 872 | }, 873 | { 874 | "cell_type": "markdown", 875 | "metadata": {}, 876 | "source": [ 877 | "To explore the capabilities of the library we will use the penguins data set. It carries information about 3 species of penguins. Data was collected and made available by [Dr. Kristen Gorman](https://www.uaf.edu/cfos/people/faculty/detail/kristen-gorman.php) and the [Palmer Station, Antarctica LTER](https://pal.lternet.edu/), a member of the [Long Term Ecological Research Network](https://lternet.edu/).\n", 878 | "\n", 879 | "![Penguin](./figures/lter_penguins.png)\n", 880 | "\n", 881 | "Artwork by [@allison_horst](https://twitter.com/allison_horst)\n", 882 | "\n", 883 | "Among other things, it contains information about the penguins' culmens and their body mass." 884 | ] 885 | }, 886 | { 887 | "cell_type": "markdown", 888 | "metadata": {}, 889 | "source": [ 890 | "![Penguin](./figures/culmen_depth.png)\n", 891 | "\n", 892 | "Artwork by [@allison_horst](https://twitter.com/allison_horst)" 893 | ] 894 | }, 895 | { 896 | "cell_type": "code", 897 | "execution_count": null, 898 | "metadata": { 899 | "collapsed": false, 900 | "jupyter": { 901 | "outputs_hidden": false 902 | }, 903 | "pycharm": { 904 | "name": "#%%\n" 905 | } 906 | }, 907 | "outputs": [], 908 | "source": [ 909 | "import pandas as pd\n", 910 | "# penguins_raw.csv as downloaded from:\n", 911 | "# https://raw.githubusercontent.com/allisonhorst/palmerpenguins/9e1a7937e257ade8e4dc1764028e55cd7efe2cf5/data-raw/penguins_raw.csv\n", 912 | "penguins_df = pd.read_csv(\"./data/penguins_raw.csv\")\n", 913 | "penguins_df.head()" 914 | ] 915 | }, 916 | { 917 | "cell_type": "markdown", 918 | "metadata": {}, 919 | "source": [ 920 | "If we want to know more about the dataset we can call the `describe()` function that\n", 921 | "will tell us more about what we have in the data frame. 
It's much easier to\n", 922 | "go through a summary than through the whole data." 923 | ] 924 | }, 925 | { 926 | "cell_type": "markdown", 927 | "metadata": {}, 928 | "source": [ 929 | "`describe()` is a very useful function that allows us to investigate the content of the table. " 930 | ] 931 | }, 932 | { 933 | "cell_type": "code", 934 | "execution_count": null, 935 | "metadata": {}, 936 | "outputs": [], 937 | "source": [ 938 | "penguins_df.describe()" 939 | ] 940 | }, 941 | { 942 | "cell_type": "markdown", 943 | "metadata": {}, 944 | "source": [ 945 | "By default, though, `describe()` summarizes only the numerical columns. If we want to see the content of the categorical or character columns, we need to specify the option `include = 'all'`, or select a specific subset of columns (e.g. with `exclude`).\n", 946 | "\n", 947 | "> For object data (e.g. strings or timestamps), the result’s index will include count, unique, top, and freq. The top is the most common value. The freq is the most common value’s frequency. Timestamps also include the first and last items." 
948 | ] 949 | }, 950 | { 951 | "cell_type": "code", 952 | "execution_count": null, 953 | "metadata": { 954 | "collapsed": false, 955 | "jupyter": { 956 | "outputs_hidden": false 957 | }, 958 | "pycharm": { 959 | "name": "#%%\n" 960 | } 961 | }, 962 | "outputs": [], 963 | "source": [ 964 | "penguins_df.describe(exclude = [np.number])" 965 | ] 966 | }, 967 | { 968 | "cell_type": "markdown", 969 | "metadata": {}, 970 | "source": [ 971 | "We can access the row names (index) of a pandas data frame" 972 | ] 973 | }, 974 | { 975 | "cell_type": "code", 976 | "execution_count": null, 977 | "metadata": {}, 978 | "outputs": [], 979 | "source": [ 980 | "penguins_df.index" 981 | ] 982 | }, 983 | { 984 | "cell_type": "markdown", 985 | "metadata": {}, 986 | "source": [ 987 | "and its columns:" 988 | ] 989 | }, 990 | { 991 | "cell_type": "code", 992 | "execution_count": null, 993 | "metadata": {}, 994 | "outputs": [], 995 | "source": [ 996 | "penguins_df.columns" 997 | ] 998 | }, 999 | { 1000 | "cell_type": "code", 1001 | "execution_count": null, 1002 | "metadata": {}, 1003 | "outputs": [], 1004 | "source": [ 1005 | "penguins_df[['Culmen Length (mm)', 'Culmen Depth (mm)', 'Flipper Length (mm)']].values" 1006 | ] 1007 | }, 1008 | { 1009 | "cell_type": "code", 1010 | "execution_count": null, 1011 | "metadata": {}, 1012 | "outputs": [], 1013 | "source": [ 1014 | "penguins_df.sort_values(by = 'Body Mass (g)')" 1015 | ] 1016 | }, 1017 | { 1018 | "cell_type": "markdown", 1019 | "metadata": {}, 1020 | "source": [ 1021 | "Let's imagine we want to calculate the mean of all numeric parameters within each species. We can do that by calling the `groupby` function." 
1022 | ] 1023 | }, 1024 | { 1025 | "cell_type": "code", 1026 | "execution_count": null, 1027 | "metadata": { 1028 | "collapsed": false, 1029 | "jupyter": { 1030 | "outputs_hidden": false 1031 | }, 1032 | "pycharm": { 1033 | "name": "#%%\n" 1034 | } 1035 | }, 1036 | "outputs": [], 1037 | "source": [ 1038 | "penguins_df.groupby(['Species']).mean(numeric_only = True)" 1039 | ] 1040 | }, 1041 | { 1042 | "cell_type": "markdown", 1043 | "metadata": {}, 1044 | "source": [ 1045 | "**Exercise** \n", 1046 | "Let's answer a few questions using `pandas`. \n", 1047 | "\n", 1048 | "* are all species equally represented in the data frame? \n", 1049 | "\n", 1050 | "*Hint:* we can count values in a column with the `value_counts()` method." 1051 | ] 1052 | }, 1053 | { 1054 | "cell_type": "code", 1055 | "execution_count": null, 1056 | "metadata": {}, 1057 | "outputs": [], 1058 | "source": [ 1059 | "# your code here :)" 1060 | ] 1061 | }, 1062 | { 1063 | "cell_type": "markdown", 1064 | "metadata": {}, 1065 | "source": [ 1066 | "* are all penguin species present on all islands (if we take the lack of samples in the data as a sign of a lack of specimens)? \n", 1067 | "\n", 1068 | "*Hint*: the `size()` method will tell us about the size of each subgroup." 
1069 | ] 1070 | }, 1071 | { 1072 | "cell_type": "code", 1073 | "execution_count": null, 1074 | "metadata": {}, 1075 | "outputs": [], 1076 | "source": [ 1077 | "# your code here :)" 1078 | ] 1079 | }, 1080 | { 1081 | "cell_type": "markdown", 1082 | "metadata": {}, 1083 | "source": [ 1084 | "* calculate the means of all numeric variables for Adelie Penguins with respect to the island they live on;" 1085 | ] 1086 | }, 1087 | { 1088 | "cell_type": "code", 1089 | "execution_count": null, 1090 | "metadata": {}, 1091 | "outputs": [], 1092 | "source": [ 1093 | "# your code here :)" 1094 | ] 1095 | }, 1096 | { 1097 | "cell_type": "markdown", 1098 | "metadata": {}, 1099 | "source": [ 1100 | "***\n", 1101 | "## Matplotlib" 1102 | ] 1103 | }, 1104 | { 1105 | "cell_type": "markdown", 1106 | "metadata": {}, 1107 | "source": [ 1108 | "`Matplotlib` is a Python module for creating plots. With a series of commands we create a figure, add labels and modify it. Below, we create a simple plot with a line." 1109 | ] 1110 | }, 1111 | { 1112 | "cell_type": "code", 1113 | "execution_count": null, 1114 | "metadata": {}, 1115 | "outputs": [], 1116 | "source": [ 1117 | "import matplotlib.pyplot as plt\n", 1118 | "plt.rcParams[\"figure.facecolor\"] = \"w\"" 1119 | ] 1120 | }, 1121 | { 1122 | "cell_type": "code", 1123 | "execution_count": null, 1124 | "metadata": {}, 1125 | "outputs": [], 1126 | "source": [ 1127 | "plt.plot(range(1, 10))\n", 1128 | "plt.ylabel('Y axis label')\n", 1129 | "plt.xlabel('X axis label')\n", 1130 | "plt.show()" 1131 | ] 1132 | }, 1133 | { 1134 | "cell_type": "markdown", 1135 | "metadata": {}, 1136 | "source": [ 1137 | "In general, the `plot()` function takes 3 arguments for every series: `x`, `y` and a format string (`fmt`) that combines a line or marker style with a color. \n", 1138 | "\n", 1139 | "**Line and marker styles**:\n", 1140 | "\n", 1141 | "character | description\n", 1142 | ":---: | ---\n", 1143 | "'-' | solid line style\n", 1144 | "'--' | dashed line style\n", 1145 | "'-.' | dash-dot line style\n", 1146 | "':' | dotted line style\n", 1147 | "'.' 
| point marker\n", 1148 | "',' | pixel marker\n", 1149 | "'o' | circle marker\n", 1150 | "'v' | triangle_down marker\n", 1151 | "'*' | star marker\n", 1152 | "'h' | hexagon1 marker\n", 1153 | "'x' | x marker\n", 1154 | "'D' | diamond marker\n", 1155 | "'\\|' | vline marker\n", 1156 | "'_' | hline marker\n", 1157 | "\n", 1158 | "**Colors**:\n", 1159 | "\n", 1160 | "character | color\n", 1161 | ":---: | ---\n", 1162 | "'b' | blue\n", 1163 | "'g' | green\n", 1164 | "'r' | red\n", 1165 | "'c' | cyan\n", 1166 | "'m' | magenta\n", 1167 | "'y' | yellow\n", 1168 | "'k' | black\n", 1169 | "'w' | white\n", 1170 | "\n", 1171 | "More examples of matplotlib plots [here](https://matplotlib.org/tutorials/introductory/pyplot.html)." 1172 | ] 1173 | }, 1174 | { 1175 | "cell_type": "markdown", 1176 | "metadata": {}, 1177 | "source": [ 1178 | "We can add multiple series to a plot by adding the series line by line." 1179 | ] 1180 | }, 1181 | { 1182 | "cell_type": "code", 1183 | "execution_count": null, 1184 | "metadata": {}, 1185 | "outputs": [], 1186 | "source": [ 1187 | "x = np.arange(0, 10, 1)\n", 1188 | "y = np.arange(0, 20, 2)\n", 1189 | "\n", 1190 | "plt.plot(x, y, 'r-.')\n", 1191 | "plt.plot(x, x*y, 'g:')" 1192 | ] 1193 | }, 1194 | { 1195 | "cell_type": "markdown", 1196 | "metadata": {}, 1197 | "source": [ 1198 | "**Exercise:** \n", 1199 | "Given a vector t (`t = np.arange(0, 12, 0.5)`), create a plot\n", 1200 | " with 3 series: $y_1 = 10t$, $y_2 = t^2$ and $y_3 = -10$ times the remainder of $t$ divided by 3. Use 3 different styles: green triangles, red dashed line and blue diamonds." 1201 | ] 1202 | }, 1203 | { 1204 | "cell_type": "code", 1205 | "execution_count": null, 1206 | "metadata": {}, 1207 | "outputs": [], 1208 | "source": [ 1209 | "# your code here :) \n", 1210 | "t = np.arange(0, 12, 0.5)\n" 1211 | ] 1212 | }, 1213 | { 1214 | "cell_type": "markdown", 1215 | "metadata": {}, 1216 | "source": [ 1217 | "Now, let's do something a bit more complex. 
" 1218 | ] 1219 | }, 1220 | { 1221 | "cell_type": "code", 1222 | "execution_count": null, 1223 | "metadata": {}, 1224 | "outputs": [], 1225 | "source": [ 1226 | "groups = penguins_df.groupby('Species')\n", 1227 | "for name, group in groups:\n", 1228 | " plt.scatter(x = group['Body Mass (g)'], y = group['Flipper Length (mm)'], label = name)\n", 1229 | "plt.legend()" 1230 | ] 1231 | }, 1232 | { 1233 | "cell_type": "markdown", 1234 | "metadata": {}, 1235 | "source": [ 1236 | "Or we can create a factor column which we can directly use to color the point by." 1237 | ] 1238 | }, 1239 | { 1240 | "cell_type": "code", 1241 | "execution_count": null, 1242 | "metadata": {}, 1243 | "outputs": [], 1244 | "source": [ 1245 | "penguins_df['Species_factor'] = pd.factorize(penguins_df['Species'])[0]" 1246 | ] 1247 | }, 1248 | { 1249 | "cell_type": "code", 1250 | "execution_count": null, 1251 | "metadata": {}, 1252 | "outputs": [], 1253 | "source": [ 1254 | "plt.scatter('Culmen Length (mm)', 'Culmen Depth (mm)', c = 'Species_factor', data = penguins_df)" 1255 | ] 1256 | }, 1257 | { 1258 | "cell_type": "markdown", 1259 | "metadata": {}, 1260 | "source": [ 1261 | "Additionally pandas data frames have some visualisation methods built-in, they are described in more detail [here](https://pandas.pydata.org/pandas-docs/stable/user_guide/visualization.html)." 
1262 | ] 1263 | }, 1264 | { 1265 | "cell_type": "code", 1266 | "execution_count": null, 1267 | "metadata": { 1268 | "collapsed": false, 1269 | "jupyter": { 1270 | "outputs_hidden": false 1271 | }, 1272 | "pycharm": { 1273 | "name": "#%%\n" 1274 | } 1275 | }, 1276 | "outputs": [], 1277 | "source": [ 1278 | "penguins_df['body_mass'] = penguins_df.groupby('Species')['Body Mass (g)'].transform(lambda x: (x - x.min()) / (x.max() - x.min()))\n", 1279 | "penguins_df.boxplot(column = ['body_mass'],\n", 1280 | " by = 'Species', figsize = (11, 5), notch = True)" 1281 | ] 1282 | }, 1283 | { 1284 | "cell_type": "markdown", 1285 | "metadata": {}, 1286 | "source": [ 1287 | "There are other libraries extending the plotting capabilities of Python. One worth mentioning is the `seaborn` module, which simplifies many common tasks and makes them easily accessible. \n", 1288 | "It makes it easy to create visually appealing plots and is worth exploring when working with Python. More about [seaborn here](https://seaborn.pydata.org/)." 1289 | ] 1290 | }, 1291 | { 1292 | "cell_type": "code", 1293 | "execution_count": null, 1294 | "metadata": { 1295 | "collapsed": false, 1296 | "jupyter": { 1297 | "outputs_hidden": false 1298 | }, 1299 | "pycharm": { 1300 | "name": "#%%\n" 1301 | } 1302 | }, 1303 | "outputs": [], 1304 | "source": [ 1305 | "import seaborn as sns" 1306 | ] 1307 | }, 1308 | { 1309 | "cell_type": "code", 1310 | "execution_count": null, 1311 | "metadata": { 1312 | "collapsed": false, 1313 | "jupyter": { 1314 | "outputs_hidden": false 1315 | }, 1316 | "pycharm": { 1317 | "name": "#%%\n" 1318 | } 1319 | }, 1320 | "outputs": [], 1321 | "source": [ 1322 | "sns.set_style(\"white\")\n", 1323 | "sns.pairplot(penguins_df, hue = 'Species')" 1324 | ] 1325 | }, 1326 | { 1327 | "cell_type": "markdown", 1328 | "metadata": {}, 1329 | "source": [ 1330 | "In my opinion, though, it still does not come close to what `ggplot2` offers in `R`."
1331 | ] 1332 | }, 1333 | { 1334 | "cell_type": "code", 1335 | "execution_count": null, 1336 | "metadata": {}, 1337 | "outputs": [], 1338 | "source": [ 1339 | "from plotnine import *\n", 1340 | "\n", 1341 | "(\n", 1342 | " ggplot(penguins_df, aes('Body Mass (g)', 'Flipper Length (mm)', color = 'Species'))\n", 1343 | " + geom_point()\n", 1344 | " + geom_smooth(se = False)\n", 1345 | " + theme_light()\n", 1346 | ")" 1347 | ] 1348 | }, 1349 | { 1350 | "cell_type": "markdown", 1351 | "metadata": {}, 1352 | "source": [ 1353 | "And a nice example of Simpson's paradox: the trend fitted to all penguins together can differ from the trends fitted within each species." 1354 | ] 1355 | }, 1356 | { 1357 | "cell_type": "code", 1358 | "execution_count": null, 1359 | "metadata": {}, 1360 | "outputs": [], 1361 | "source": [ 1362 | "(\n", 1363 | " ggplot(penguins_df, aes('Body Mass (g)', 'Culmen Depth (mm)'))\n", 1364 | " + geom_point(aes(color = 'Species'))\n", 1365 | " + geom_smooth(se = False, method = 'lm')\n", 1366 | " + geom_smooth(aes(color = 'Species'), se = False, method = 'lm')\n", 1367 | " + theme_xkcd()\n", 1368 | ")" 1369 | ] 1370 | }, 1371 | { 1372 | "cell_type": "markdown", 1373 | "metadata": {}, 1374 | "source": [ 1375 | "**Exercise** \n", 1376 | "\n", 1377 | "* Plot the number of occurrences of each species." 1378 | ] 1379 | }, 1380 | { 1381 | "cell_type": "code", 1382 | "execution_count": null, 1383 | "metadata": {}, 1384 | "outputs": [], 1385 | "source": [ 1386 | "# your code here :)" 1387 | ] 1388 | }, 1389 | { 1390 | "cell_type": "markdown", 1391 | "metadata": {}, 1392 | "source": [ 1393 | "* Plot the culmen dimensions with respect to the species."
1394 | ] 1395 | }, 1396 | { 1397 | "cell_type": "code", 1398 | "execution_count": null, 1399 | "metadata": {}, 1400 | "outputs": [], 1401 | "source": [ 1402 | "# your code here :)" 1403 | ] 1404 | }, 1405 | { 1406 | "cell_type": "markdown", 1407 | "metadata": {}, 1408 | "source": [ 1409 | "***\n", 1410 | "## SciPy" 1411 | ] 1412 | }, 1413 | { 1414 | "cell_type": "markdown", 1415 | "metadata": {}, 1416 | "source": [ 1417 | "`SciPy` is a Python-based ecosystem of open-source software for mathematics, science, and engineering. For example, it provides useful statistical functions. One of them is the Mann-Whitney U test, a non-parametric test of whether two samples come from the same distribution." 1418 | ] 1419 | }, 1420 | { 1421 | "cell_type": "code", 1422 | "execution_count": null, 1423 | "metadata": {}, 1424 | "outputs": [], 1425 | "source": [ 1426 | "from scipy.stats import mannwhitneyu\n", 1427 | "A = np.random.normal(0, 3, 20)\n", 1428 | "B = np.random.normal(-2, 5, 30)" 1429 | ] 1430 | }, 1431 | { 1432 | "cell_type": "markdown", 1433 | "metadata": {}, 1434 | "source": [ 1435 | "Let's visualise what we have in those two sets." 1436 | ] 1437 | }, 1438 | { 1439 | "cell_type": "code", 1440 | "execution_count": null, 1441 | "metadata": {}, 1442 | "outputs": [], 1443 | "source": [ 1444 | "plt.boxplot((A, B))" 1445 | ] 1446 | }, 1447 | { 1448 | "cell_type": "code", 1449 | "execution_count": null, 1450 | "metadata": { 1451 | "pycharm": { 1452 | "name": "#%%\n" 1453 | } 1454 | }, 1455 | "outputs": [], 1456 | "source": [ 1457 | "mannwhitneyu(A, B)" 1458 | ] 1459 | }, 1460 | { 1461 | "cell_type": "markdown", 1462 | "metadata": {}, 1463 | "source": [ 1464 | "We saw that one of the species is heavier than the others. Let's remind ourselves of the distribution of the weights. 
" 1465 | ] 1466 | }, 1467 | { 1468 | "cell_type": "code", 1469 | "execution_count": null, 1470 | "metadata": {}, 1471 | "outputs": [], 1472 | "source": [ 1473 | "fig, ax = plt.subplots(figsize = (12, 8))\n", 1474 | "ax = sns.stripplot(x = 'Species', y = 'Body Mass (g)', data = penguins_df, color=\".25\")\n", 1475 | "ax = sns.boxplot(x = 'Species', y = 'Body Mass (g)', data = penguins_df, palette = \"PRGn\", fliersize = 0)" 1476 | ] 1477 | }, 1478 | { 1479 | "cell_type": "markdown", 1480 | "metadata": {}, 1481 | "source": [ 1482 | "With `scipy.stats` we can check whether the differences are significant." 1483 | ] 1484 | }, 1485 | { 1486 | "cell_type": "code", 1487 | "execution_count": null, 1488 | "metadata": {}, 1489 | "outputs": [], 1490 | "source": [ 1491 | "from itertools import combinations\n", 1492 | "\n", 1493 | "species = penguins_df['Species'].unique()\n", 1494 | "for (specie1, specie2) in combinations(species, 2):\n", 1495 | " a = penguins_df[penguins_df['Species'] == specie1]['Body Mass (g)'].dropna()\n", 1496 | " b = penguins_df[penguins_df['Species'] == specie2]['Body Mass (g)'].dropna()\n", 1497 | " stat, pval = mannwhitneyu(a, b)\n", 1498 | " print((specie1, specie2, pval))" 1499 | ] 1500 | }, 1501 | { 1502 | "cell_type": "markdown", 1503 | "metadata": {}, 1504 | "source": [ 1505 | "***\n", 1506 | "## Scikit learn\n", 1507 | "\n", 1508 | "A module for machine learning in Python. " 1509 | ] 1510 | }, 1511 | { 1512 | "cell_type": "markdown", 1513 | "metadata": {}, 1514 | "source": [ 1515 | "For this example, let's recreate the data from the linear model example."
1516 | ] 1517 | }, 1518 | { 1519 | "cell_type": "code", 1520 | "execution_count": null, 1521 | "metadata": {}, 1522 | "outputs": [], 1523 | "source": [ 1524 | "n = 5\n", 1525 | "# generate the variables\n", 1526 | "ones = np.ones(n)\n", 1527 | "x_1 = np.random.randn(n)\n", 1528 | "x_2 = np.random.randn(n)\n", 1529 | "X = np.column_stack((ones, x_1, x_2))\n", 1530 | "# generate the y from a linear model\n", 1531 | "y = 0.5 * ones + 2 * x_1 - x_2\n", 1532 | "# and add noise\n", 1533 | "y = y + np.random.normal(0, 0.1, n)" 1534 | ] 1535 | }, 1536 | { 1537 | "cell_type": "code", 1538 | "execution_count": null, 1539 | "metadata": {}, 1540 | "outputs": [], 1541 | "source": [ 1542 | "from sklearn import linear_model\n", 1543 | "reg = linear_model.LinearRegression(fit_intercept = False)\n", 1544 | "\n", 1545 | "reg.fit(X, y)\n", 1546 | "reg.coef_" 1547 | ] 1548 | }, 1549 | { 1550 | "cell_type": "code", 1551 | "execution_count": null, 1552 | "metadata": {}, 1553 | "outputs": [], 1554 | "source": [ 1555 | "beta_hat = np.matmul(np.matmul(np.linalg.inv(np.matmul(X.T, X)), X.T), y)\n", 1556 | "beta_hat" 1557 | ] 1558 | }, 1559 | { 1560 | "cell_type": "code", 1561 | "execution_count": null, 1562 | "metadata": {}, 1563 | "outputs": [], 1564 | "source": [ 1565 | "penguins_df.columns" 1566 | ] 1567 | }, 1568 | { 1569 | "cell_type": "code", 1570 | "execution_count": null, 1571 | "metadata": {}, 1572 | "outputs": [], 1573 | "source": [ 1574 | "from sklearn.decomposition import *\n", 1575 | "\n", 1576 | "df_ = penguins_df[['Species', 'Culmen Length (mm)', 'Culmen Depth (mm)', 'Flipper Length (mm)', 'Body Mass (g)']].dropna()\n", 1577 | "pca = PCA(n_components=2)\n", 1578 | "pca.fit(df_.drop('Species', axis = 1))\n", 1579 | "projected = pca.fit_transform(df_.drop('Species', axis = 1))" 1580 | ] 1581 | }, 1582 | { 1583 | "cell_type": "code", 1584 | "execution_count": null, 1585 | "metadata": {}, 1586 | "outputs": [], 1587 | "source": [ 1588 | "pca_plot = 
pd.concat((pd.DataFrame(projected), df_['Species'].reset_index(drop = True)), axis = 1)\n", 1589 | "pca_plot.columns = ['PC1', 'PC2', 'Species'] " 1590 | ] 1591 | }, 1592 | { 1593 | "cell_type": "code", 1594 | "execution_count": null, 1595 | "metadata": {}, 1596 | "outputs": [], 1597 | "source": [ 1598 | "pca_groups = pca_plot.groupby('Species')\n", 1599 | "\n", 1600 | "for specie, group in pca_groups:\n", 1601 | " plt.scatter(x = group['PC1'], y = group['PC2'], label = specie)\n", 1602 | "plt.legend()" 1603 | ] 1604 | }, 1605 | { 1606 | "cell_type": "code", 1607 | "execution_count": null, 1608 | "metadata": {}, 1609 | "outputs": [], 1610 | "source": [ 1611 | "(\n", 1612 | "ggplot(pca_plot, aes('PC1', 'PC2', color = 'Species'))\n", 1613 | "+ geom_point() \n", 1614 | "+ theme_xkcd()\n", 1615 | ")" 1616 | ] 1617 | }, 1618 | { 1619 | "cell_type": "markdown", 1620 | "metadata": {}, 1621 | "source": [ 1622 | "***\n", 1623 | "**Exercises** \n", 1624 | "\n", 1625 | "* Build a model that will predict the size of a flipper based on a penguin's weight. Plot the predictions against the true values.\n", 1626 | "\n", 1627 | "*Hint*: A good place to start would be a [linear model](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html)." 1628 | ] 1629 | }, 1630 | { 1631 | "cell_type": "code", 1632 | "execution_count": null, 1633 | "metadata": {}, 1634 | "outputs": [], 1635 | "source": [ 1636 | "# your code here :)" 1637 | ] 1638 | }, 1639 | { 1640 | "cell_type": "markdown", 1641 | "metadata": {}, 1642 | "source": [ 1643 | "* See if we can cluster the data into 3 clusters and whether they match the species.\n", 1644 | "\n", 1645 | "*Hint*: You might want to use [kmeans clustering](https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html) for starters. 
" 1646 | ] 1647 | }, 1648 | { 1649 | "cell_type": "code", 1650 | "execution_count": null, 1651 | "metadata": {}, 1652 | "outputs": [], 1653 | "source": [ 1654 | "# your code here :)" 1655 | ] 1656 | } 1657 | ], 1658 | "metadata": { 1659 | "kernelspec": { 1660 | "display_name": "Python 3", 1661 | "language": "python", 1662 | "name": "python3" 1663 | }, 1664 | "language_info": { 1665 | "codemirror_mode": { 1666 | "name": "ipython", 1667 | "version": 3 1668 | }, 1669 | "file_extension": ".py", 1670 | "mimetype": "text/x-python", 1671 | "name": "python", 1672 | "nbconvert_exporter": "python", 1673 | "pygments_lexer": "ipython3", 1674 | "version": "3.8.3" 1675 | } 1676 | }, 1677 | "nbformat": 4, 1678 | "nbformat_minor": 4 1679 | } 1680 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Introduction to Python 2 | 3 | This repository contains the materials for the Introduction to Python course given at the RNA Club meeting on May the 6th and later at #NGSchool2019: Machine Learning for Biomedicine. It has been used a few times since then and has been continuously improved. 4 | 5 | The whole tutorial was designed to be run either interactively in a classroom or on its own. So if you missed the online session, try running it on your own. Soon, there will also be a video of a live session, which will hopefully help you out. 6 | 7 | ## Aim 8 | 9 | The aim of the workshop is to walk you through the basics and show what the advantages of Python are and what can be done with it. It also lists places to look for materials to learn more. 10 | 11 | ## About the tutor 12 | 13 | My name is [Kasia](https://kasia.codes/) and I'm a DPhil student in Genomic Medicine and Statistics at the Wellcome Centre for Human Genetics, University of Oxford. 
I'm also the President of [NGSchool Society](https://ngschool.eu/) where, with an amazing group of [people](https://ngschool.eu/people/), we try to build an international community for computational biologists through training and platforms to network and exchange knowledge. Be sure to check out more on our social media ([twitter](https://twitter.com/NGSchoolEU) and [fb](https://www.facebook.com/NGSchool.eu/)) and [join us on discord](https://discord.gg/MhNeqwR). 14 | 15 | ## Running the tutorial materials 16 | ### Running notebook on Google Colab 17 | 18 | You can run the whole notebook on [Google Colab](https://colab.research.google.com/). You can follow [this link](https://colab.research.google.com/github/NGSchoolEU/ngs19_python_intro/blob/master/Intro_to_Python.ipynb) directly or open a notebook from this repository. The beauty of this solution is that everything works and all you need to work on the exercises is access to the internet. The downside is that you need a Google account. If you are not comfortable with this, or you want to run the notebook offline, see the instructions in the next section. 19 | 20 | ### Running notebook on your machine 21 | 22 | You can run the notebook on your machine. All you need is to make sure the following requirements are satisfied. 23 | 24 | **Requirements** 25 | 26 | * Python3 (>= 3.5) 27 | * numpy 28 | * pandas 29 | * scipy 30 | * matplotlib 31 | * seaborn 32 | * Jupyter lab 33 | 34 | I prepared a script that tests whether the requirements are satisfied and gives advice on how to proceed in case some are missing. However, it only works in `bash`. 35 | 36 | ```bash 37 | bash setup_check.sh 38 | ``` 39 | 40 | In case you need more advice, don't hesitate to contact me. Hope you enjoy! 
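If you can't use `bash`, something along these lines might do as a rough stand-in for `setup_check.sh`. It is only a sketch and not part of this repository: the function name `missing_packages` is made up for illustration, the import names (for example `jupyterlab` for Jupyter lab) are assumptions, and it only checks that each package can be found, not which version is installed.

```python
import importlib.util
import sys

# Import names assumed for the packages in the requirements list (hypothetical mapping).
REQUIRED = ["numpy", "pandas", "scipy", "matplotlib", "seaborn", "jupyterlab"]


def missing_packages(names):
    """Return the names from `names` that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The requirements list asks for Python >= 3.5.
if sys.version_info < (3, 5):
    print("Python 3.5 or newer is needed, found", sys.version.split()[0])

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages were found.")
```

The second notebook also imports `sklearn` and `plotnine`, so you may want to append those two names to `REQUIRED` before running the check.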
41 | 42 | -------------------------------------------------------------------------------- /data/penguins_raw.csv: -------------------------------------------------------------------------------- 1 | studyName,Sample Number,Species,Region,Island,Stage,Individual ID,Clutch Completion,Date Egg,Culmen Length (mm),Culmen Depth (mm),Flipper Length (mm),Body Mass (g),Sex,Delta 15 N (o/oo),Delta 13 C (o/oo),Comments 2 | PAL0708,1,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N1A1,Yes,2007-11-11,39.1,18.7,181,3750,MALE,NA,NA,Not enough blood for isotopes. 3 | PAL0708,2,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N1A2,Yes,2007-11-11,39.5,17.4,186,3800,FEMALE,8.94956,-24.69454,NA 4 | PAL0708,3,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N2A1,Yes,2007-11-16,40.3,18,195,3250,FEMALE,8.36821,-25.33302,NA 5 | PAL0708,4,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N2A2,Yes,2007-11-16,NA,NA,NA,NA,NA,NA,NA,Adult not sampled. 6 | PAL0708,5,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N3A1,Yes,2007-11-16,36.7,19.3,193,3450,FEMALE,8.76651,-25.32426,NA 7 | PAL0708,6,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N3A2,Yes,2007-11-16,39.3,20.6,190,3650,MALE,8.66496,-25.29805,NA 8 | PAL0708,7,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N4A1,No,2007-11-15,38.9,17.8,181,3625,FEMALE,9.18718,-25.21799,Nest never observed with full clutch. 9 | PAL0708,8,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N4A2,No,2007-11-15,39.2,19.6,195,4675,MALE,9.4606,-24.89958,Nest never observed with full clutch. 10 | PAL0708,9,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N5A1,Yes,2007-11-09,34.1,18.1,193,3475,NA,NA,NA,No blood sample obtained. 
11 | PAL0708,10,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N5A2,Yes,2007-11-09,42,20.2,190,4250,NA,9.13362,-25.09368,No blood sample obtained for sexing. 12 | PAL0708,11,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N6A1,Yes,2007-11-09,37.8,17.1,186,3300,NA,8.63243,-25.21315,No blood sample obtained for sexing. 13 | PAL0708,12,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N6A2,Yes,2007-11-09,37.8,17.3,180,3700,NA,NA,NA,No blood sample obtained. 14 | PAL0708,13,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N7A1,Yes,2007-11-15,41.1,17.6,182,3200,FEMALE,NA,NA,Not enough blood for isotopes. 15 | PAL0708,14,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N7A2,Yes,2007-11-15,38.6,21.2,191,3800,MALE,NA,NA,Not enough blood for isotopes. 16 | PAL0708,15,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N8A1,Yes,2007-11-16,34.6,21.1,198,4400,MALE,8.55583,-25.22588,NA 17 | PAL0708,16,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N8A2,Yes,2007-11-16,36.6,17.8,185,3700,FEMALE,NA,NA,Not enough blood for isotopes. 
18 | PAL0708,17,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N9A1,Yes,2007-11-12,38.7,19,195,3450,FEMALE,9.18528,-25.06691,NA 19 | PAL0708,18,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N9A2,Yes,2007-11-12,42.5,20.7,197,4500,MALE,8.67538,-25.13993,NA 20 | PAL0708,19,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N10A1,Yes,2007-11-16,34.4,18.4,184,3325,FEMALE,8.47827,-25.23319,NA 21 | PAL0708,20,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N10A2,Yes,2007-11-16,46,21.5,194,4200,MALE,9.11616,-24.77227,NA 22 | PAL0708,21,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N11A1,Yes,2007-11-12,37.8,18.3,174,3400,FEMALE,8.73762,-25.09383,NA 23 | PAL0708,22,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N11A2,Yes,2007-11-12,37.7,18.7,180,3600,MALE,8.66271,-25.0639,NA 24 | PAL0708,23,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N12A1,Yes,2007-11-12,35.9,19.2,189,3800,FEMALE,9.22286,-25.03474,NA 25 | PAL0708,24,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N12A2,Yes,2007-11-12,38.2,18.1,185,3950,MALE,8.43423,-25.22664,NA 26 | PAL0708,25,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N13A1,Yes,2007-11-10,38.8,17.2,180,3800,MALE,9.63954,-25.29856,NA 27 | PAL0708,26,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N13A2,Yes,2007-11-10,35.3,18.9,187,3800,FEMALE,9.21292,-24.3613,NA 28 | PAL0708,27,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N17A1,Yes,2007-11-12,40.6,18.6,183,3550,MALE,8.93997,-25.36288,NA 29 | PAL0708,28,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N17A2,Yes,2007-11-12,40.5,17.9,187,3200,FEMALE,8.08138,-25.49448,NA 30 | PAL0708,29,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg 
Stage",N18A1,No,2007-11-10,37.9,18.6,172,3150,FEMALE,8.38404,-25.19837,Nest never observed with full clutch. 31 | PAL0708,30,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N18A2,No,2007-11-10,40.5,18.9,180,3950,MALE,8.90027,-25.11609,Nest never observed with full clutch. 32 | PAL0708,31,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N21A1,Yes,2007-11-09,39.5,16.7,178,3250,FEMALE,9.69756,-25.11223,NA 33 | PAL0708,32,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N21A2,Yes,2007-11-09,37.2,18.1,178,3900,MALE,9.72764,-25.0102,NA 34 | PAL0708,33,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N22A1,Yes,2007-11-09,39.5,17.8,188,3300,FEMALE,9.66523,-25.0602,NA 35 | PAL0708,34,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N22A2,Yes,2007-11-09,40.9,18.9,184,3900,MALE,8.79665,-25.14591,NA 36 | PAL0708,35,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N23A1,Yes,2007-11-16,36.4,17,195,3325,FEMALE,9.17847,-25.23061,NA 37 | PAL0708,36,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N23A2,Yes,2007-11-16,39.2,21.1,196,4150,MALE,9.15308,-25.03469,NA 38 | PAL0708,37,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N24A1,Yes,2007-11-16,38.8,20,190,3950,MALE,9.18985,-25.12255,NA 39 | PAL0708,38,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N24A2,Yes,2007-11-16,42.2,18.5,180,3550,FEMALE,8.04787,-25.49523,NA 40 | PAL0708,39,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N25A1,No,2007-11-13,37.6,19.3,181,3300,FEMALE,9.41131,-25.04169,Nest never observed with full clutch. 41 | PAL0708,40,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N25A2,No,2007-11-13,39.8,19.1,184,4650,MALE,NA,NA,Nest never observed with full clutch. Not enough blood for isotopes. 
42 | PAL0708,41,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N26A1,Yes,2007-11-16,36.5,18,182,3150,FEMALE,9.68933,-24.4228,NA 43 | PAL0708,42,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N26A2,Yes,2007-11-16,40.8,18.4,195,3900,MALE,NA,NA,Not enough blood for isotopes. 44 | PAL0708,43,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N27A1,Yes,2007-11-19,36,18.5,186,3100,FEMALE,9.50772,-25.03492,NA 45 | PAL0708,44,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N27A2,Yes,2007-11-19,44.1,19.7,196,4400,MALE,9.2372,-24.52698,NA 46 | PAL0708,45,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N28A1,Yes,2007-11-16,37,16.9,185,3000,FEMALE,9.36392,-25.01745,NA 47 | PAL0708,46,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N28A2,Yes,2007-11-16,39.6,18.8,190,4600,MALE,9.49106,-24.10255,NA 48 | PAL0708,47,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N29A1,Yes,2007-11-13,41.1,19,182,3425,MALE,NA,NA,Not enough blood for isotopes. 49 | PAL0708,48,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N29A2,Yes,2007-11-13,37.5,18.9,179,2975,NA,NA,NA,Sexing primers did not amplify. Not enough blood for isotopes. 
50 | PAL0708,49,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N30A1,Yes,2007-11-13,36,17.9,190,3450,FEMALE,9.51784,-25.07683,NA 51 | PAL0708,50,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N30A2,Yes,2007-11-13,42.3,21.2,191,4150,MALE,8.87988,-25.18543,NA 52 | PAL0809,51,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N21A1,Yes,2008-11-06,39.6,17.7,186,3500,FEMALE,8.46616,-26.12989,NA 53 | PAL0809,52,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N21A2,Yes,2008-11-06,40.1,18.9,188,4300,MALE,8.51362,-26.55602,NA 54 | PAL0809,53,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N22A1,Yes,2008-11-09,35,17.9,190,3450,FEMALE,8.19539,-26.17213,NA 55 | PAL0809,54,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N22A2,Yes,2008-11-09,42,19.5,200,4050,MALE,8.48095,-26.3146,NA 56 | PAL0809,55,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N23A1,Yes,2008-11-09,34.5,18.1,187,2900,FEMALE,8.41837,-26.54718,NA 57 | PAL0809,56,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N23A2,Yes,2008-11-09,41.4,18.6,191,3700,MALE,8.35396,-26.27853,NA 58 | PAL0809,57,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N24A1,Yes,2008-11-15,39,17.5,186,3550,FEMALE,8.57199,-26.07188,NA 59 | PAL0809,58,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N24A2,Yes,2008-11-15,40.6,18.8,193,3800,MALE,8.56674,-25.98843,NA 60 | PAL0809,59,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N25A1,Yes,2008-11-15,36.5,16.6,181,2850,FEMALE,9.07878,-25.88156,NA 61 | PAL0809,60,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N25A2,Yes,2008-11-15,37.6,19.1,194,3750,MALE,9.108,-25.89677,NA 62 | PAL0809,61,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N27A1,Yes,2008-11-13,35.7,16.9,185,3150,FEMALE,8.96472,-26.40943,NA 63 | 
PAL0809,62,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N27A2,Yes,2008-11-13,41.3,21.1,195,4400,MALE,8.74802,-26.37809,NA 64 | PAL0809,63,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N28A1,Yes,2008-11-13,37.6,17,185,3600,FEMALE,8.58063,-26.21569,NA 65 | PAL0809,64,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N28A2,Yes,2008-11-13,41.1,18.2,192,4050,MALE,8.62264,-26.60023,NA 66 | PAL0809,65,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N29A1,Yes,2008-11-13,36.4,17.1,184,2850,FEMALE,8.62623,-26.1165,NA 67 | PAL0809,66,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N29A2,Yes,2008-11-13,41.6,18,192,3950,MALE,8.85562,-26.09294,NA 68 | PAL0809,67,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N30A1,Yes,2008-11-06,35.5,16.2,195,3350,FEMALE,8.56192,-25.95541,NA 69 | PAL0809,68,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N30A2,Yes,2008-11-06,41.1,19.1,188,4100,MALE,8.71078,-25.81012,NA 70 | PAL0809,69,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N32A1,No,2008-11-11,35.9,16.6,190,3050,FEMALE,8.47781,-26.07821,Nest never observed with full clutch. 71 | PAL0809,70,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N32A2,No,2008-11-11,41.8,19.4,198,4450,MALE,8.86853,-26.06209,Nest never observed with full clutch. 
72 | PAL0809,71,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N34A1,Yes,2008-11-14,33.5,19,190,3600,FEMALE,7.88863,-26.63085,NA 73 | PAL0809,72,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N34A2,Yes,2008-11-14,39.7,18.4,190,3900,MALE,9.29808,-25.23453,NA 74 | PAL0809,73,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N35A1,Yes,2008-11-11,39.6,17.2,196,3550,FEMALE,8.33524,-26.55351,NA 75 | PAL0809,74,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N35A2,Yes,2008-11-11,45.8,18.9,197,4150,MALE,8.18658,-26.45978,NA 76 | PAL0809,75,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N36A1,Yes,2008-11-08,35.5,17.5,190,3700,FEMALE,8.70642,-26.15003,NA 77 | PAL0809,76,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N36A2,Yes,2008-11-08,42.8,18.5,195,4250,MALE,8.2993,-26.38986,NA 78 | PAL0809,77,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N37A1,Yes,2008-11-06,40.9,16.8,191,3700,FEMALE,8.47257,-26.02002,NA 79 | PAL0809,78,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N37A2,Yes,2008-11-06,37.2,19.4,184,3900,MALE,8.3554,-26.44787,NA 80 | PAL0809,79,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N38A1,Yes,2008-11-09,36.2,16.1,187,3550,FEMALE,7.82381,-26.51382,NA 81 | PAL0809,80,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N38A2,Yes,2008-11-09,42.1,19.1,195,4000,MALE,9.05736,-25.81513,NA 82 | PAL0809,81,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N39A1,Yes,2008-11-02,34.6,17.2,189,3200,FEMALE,7.69778,-26.5387,NA 83 | PAL0809,82,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N39A2,Yes,2008-11-02,42.9,17.6,196,4700,MALE,8.63259,-26.23027,NA 84 | PAL0809,83,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg 
Stage",N40A1,Yes,2008-11-07,36.7,18.8,187,3800,FEMALE,7.88494,-26.24837,NA 85 | PAL0809,84,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N40A2,Yes,2008-11-07,35.1,19.4,193,4200,MALE,8.90002,-26.46254,NA 86 | PAL0809,85,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N41A1,Yes,2008-11-17,37.3,17.8,191,3350,FEMALE,8.32718,-26.38396,NA 87 | PAL0809,86,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N41A2,Yes,2008-11-17,41.3,20.3,194,3550,MALE,9.14863,-26.09635,NA 88 | PAL0809,87,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N42A1,Yes,2008-11-08,36.3,19.5,190,3800,MALE,8.57087,-26.22227,NA 89 | PAL0809,88,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N42A2,Yes,2008-11-08,36.9,18.6,189,3500,FEMALE,8.59147,-26.08165,NA 90 | PAL0809,89,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N44A1,Yes,2008-11-08,38.3,19.2,189,3950,MALE,9.07826,-26.12417,NA 91 | PAL0809,90,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N44A2,Yes,2008-11-08,38.9,18.8,190,3600,FEMALE,8.36936,-26.11199,NA 92 | PAL0809,91,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N45A1,Yes,2008-11-14,35.7,18,202,3550,FEMALE,8.46531,-26.05621,NA 93 | PAL0809,92,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N45A2,Yes,2008-11-14,41.1,18.1,205,4300,MALE,8.77018,-25.83352,NA 94 | PAL0809,93,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N46A1,Yes,2008-11-05,34,17.1,185,3400,FEMALE,8.01485,-26.695430000000002,NA 95 | PAL0809,94,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N46A2,Yes,2008-11-05,39.6,18.1,186,4450,MALE,8.49915,-26.42406,NA 96 | PAL0809,95,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N48A1,Yes,2008-11-17,36.2,17.3,187,3300,FEMALE,8.90723,-26.30037,NA 97 | PAL0809,96,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg 
Stage",N48A2,Yes,2008-11-17,40.8,18.9,208,4300,MALE,8.48204,-26.57941,NA 98 | PAL0809,97,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N49A1,Yes,2008-11-08,38.1,18.6,190,3700,FEMALE,8.10277,-26.50086,NA 99 | PAL0809,98,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N49A2,Yes,2008-11-08,40.3,18.5,196,4350,MALE,8.3945900000000009,-26.01152,NA 100 | PAL0809,99,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N50A1,Yes,2008-11-10,33.1,16.1,178,2900,FEMALE,9.04218,-26.15775,NA 101 | PAL0809,100,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N50A2,Yes,2008-11-10,43.2,18.5,192,4100,MALE,8.97025,-26.03679,NA 102 | PAL0910,101,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N47A1,Yes,2009-11-09,35,17.9,192,3725,FEMALE,8.84451,-26.28055,NA 103 | PAL0910,102,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N47A2,Yes,2009-11-09,41,20,203,4725,MALE,9.01079,-26.38085,NA 104 | PAL0910,103,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N49A1,Yes,2009-11-15,37.7,16,183,3075,FEMALE,9.2151,-26.2253,NA 105 | PAL0910,104,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N49A2,Yes,2009-11-15,37.8,20,190,4250,MALE,9.51929,-25.69199,NA 106 | PAL0910,105,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N51A1,Yes,2009-11-15,37.9,18.6,193,2925,FEMALE,9.02642,-25.86482,NA 107 | PAL0910,106,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N51A2,Yes,2009-11-15,39.7,18.9,184,3550,MALE,8.85699,-25.80208,NA 108 | PAL0910,107,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N53A1,Yes,2009-11-15,38.6,17.2,199,3750,FEMALE,8.77322,-26.48973,NA 109 | PAL0910,108,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N53A2,Yes,2009-11-15,38.2,20,190,3900,MALE,9.59245,-25.70711,NA 110 | PAL0910,109,Adelie Penguin (Pygoscelis 
adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N55A1,Yes,2009-11-20,38.1,17,181,3175,FEMALE,9.79532,-25.27385,NA 111 | PAL0910,110,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N55A2,Yes,2009-11-20,43.2,19,197,4775,MALE,9.31735,-25.45171,NA 112 | PAL0910,111,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N58A1,Yes,2009-11-12,38.1,16.5,198,3825,FEMALE,8.43951,-26.57563,NA 113 | PAL0910,112,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N58A2,Yes,2009-11-12,45.6,20.3,191,4600,MALE,8.65466,-26.32909,NA 114 | PAL0910,113,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N60A1,Yes,2009-11-15,39.7,17.7,193,3200,FEMALE,9.02657,-26.06203,NA 115 | PAL0910,114,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N60A2,Yes,2009-11-15,42.2,19.5,197,4275,MALE,8.80186,-26.41218,NA 116 | PAL0910,115,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N61A1,Yes,2009-11-17,39.6,20.7,191,3900,FEMALE,8.80967,-26.78958,NA 117 | PAL0910,116,Adelie Penguin (Pygoscelis adeliae),Anvers,Biscoe,"Adult, 1 Egg Stage",N61A2,Yes,2009-11-17,42.7,18.3,196,4075,MALE,8.91434,-26.42018,NA 118 | PAL0910,117,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N63A1,Yes,2009-11-18,38.6,17,188,2900,FEMALE,9.18021,-25.77264,NA 119 | PAL0910,118,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N63A2,Yes,2009-11-18,37.3,20.5,199,3775,MALE,9.49645,-26.36678,NA 120 | PAL0910,119,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N64A1,Yes,2009-11-22,35.7,17,189,3350,FEMALE,8.96436,-23.90309,NA 121 | PAL0910,120,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N64A2,Yes,2009-11-22,41.1,18.6,189,3325,MALE,9.32277,-26.09989,NA 122 | PAL0910,121,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N66A1,No,2009-11-17,36.2,17.2,187,3150,FEMALE,9.04296,-26.19444,Nest never 
observed with full clutch. 123 | PAL0910,122,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N66A2,No,2009-11-17,37.7,19.8,198,3500,MALE,9.11066,-26.42563,Nest never observed with full clutch. 124 | PAL0910,123,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N67A1,Yes,2009-11-16,40.2,17,176,3450,FEMALE,9.30722,-25.61039,NA 125 | PAL0910,124,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N67A2,Yes,2009-11-16,41.4,18.5,202,3875,MALE,9.59462,-25.42621,NA 126 | PAL0910,125,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N69A1,Yes,2009-11-18,35.2,15.9,186,3050,FEMALE,8.81668,-25.95399,NA 127 | PAL0910,126,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N69A2,Yes,2009-11-18,40.6,19,199,4000,MALE,9.22537,-25.60826,NA 128 | PAL0910,127,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N71A1,Yes,2009-11-21,38.8,17.6,191,3275,FEMALE,8.88098,-25.89741,NA 129 | PAL0910,128,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N71A2,Yes,2009-11-21,41.5,18.3,195,4300,MALE,8.52566,-26.0245,NA 130 | PAL0910,129,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N72A1,Yes,2009-11-18,39,17.1,191,3050,FEMALE,9.19031,-25.73722,NA 131 | PAL0910,130,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N72A2,Yes,2009-11-18,44.1,18,210,4000,MALE,9.10702,-26.01363,NA 132 | PAL0910,131,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N73A1,No,2009-11-23,38.5,17.9,190,3325,FEMALE,8.9846,-25.57956,Nest never observed with full clutch. 133 | PAL0910,132,Adelie Penguin (Pygoscelis adeliae),Anvers,Torgersen,"Adult, 1 Egg Stage",N73A2,No,2009-11-23,43.1,19.2,197,3500,MALE,8.86495,-26.1396,Nest never observed with full clutch. 
134 | PAL0910,133,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N76A1,Yes,2009-11-10,36.8,18.5,193,3500,FEMALE,8.98705,-25.57647,NA 135 | PAL0910,134,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N76A2,Yes,2009-11-10,37.5,18.5,199,4475,MALE,8.56708,-26.49288,NA 136 | PAL0910,135,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N77A1,Yes,2009-11-13,38.1,17.6,187,3425,FEMALE,8.717,-25.77951,NA 137 | PAL0910,136,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N77A2,Yes,2009-11-13,41.1,17.5,190,3900,MALE,8.94365,-26.06943,NA 138 | PAL0910,137,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N78A1,Yes,2009-11-16,35.6,17.5,191,3175,FEMALE,8.75984,-25.97696,NA 139 | PAL0910,138,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N78A2,Yes,2009-11-16,40.2,20.1,200,3975,MALE,8.95998,-26.32601,NA 140 | PAL0910,139,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N79A1,No,2009-11-16,37,16.5,185,3400,FEMALE,8.61651,-26.07021,Nest never observed with full clutch. 141 | PAL0910,140,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N79A2,No,2009-11-16,39.7,17.9,193,4250,MALE,9.25769,-25.88798,Nest never observed with full clutch. 
142 | PAL0910,141,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N80A1,Yes,2009-11-14,40.2,17.1,193,3400,FEMALE,9.2881,-25.54976,NA 143 | PAL0910,142,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N80A2,Yes,2009-11-14,40.6,17.2,187,3475,MALE,9.23408,-26.01549,NA 144 | PAL0910,143,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N81A1,Yes,2009-11-16,32.1,15.5,188,3050,FEMALE,8.79787,-26.61075,NA 145 | PAL0910,144,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N81A2,Yes,2009-11-16,40.7,17,190,3725,MALE,9.05674,-25.79529,NA 146 | PAL0910,145,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N82A1,Yes,2009-11-16,37.3,16.8,192,3000,FEMALE,9.06829,-25.85203,NA 147 | PAL0910,146,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N82A2,Yes,2009-11-16,39,18.7,185,3650,MALE,9.22033,-26.03442,NA 148 | PAL0910,147,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N83A1,Yes,2009-11-13,39.2,18.6,190,4250,MALE,9.11006,-25.79549,NA 149 | PAL0910,148,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N83A2,Yes,2009-11-13,36.6,18.4,184,3475,FEMALE,8.68744,-25.8306,NA 150 | PAL0910,149,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N84A1,Yes,2009-11-17,36,17.8,195,3450,FEMALE,8.94332,-25.79189,NA 151 | PAL0910,150,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N84A2,Yes,2009-11-17,37.8,18.1,193,3750,MALE,8.97533,-26.03495,NA 152 | PAL0910,151,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N85A1,Yes,2009-11-17,36,17.1,187,3700,FEMALE,8.93465,-26.07081,NA 153 | PAL0910,152,Adelie Penguin (Pygoscelis adeliae),Anvers,Dream,"Adult, 1 Egg Stage",N85A2,Yes,2009-11-17,41.5,18.5,201,4000,MALE,8.8964,-26.06967,NA 154 | PAL0708,1,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N31A1,Yes,2007-11-27,46.1,13.2,211,4500,FEMALE,7.993,-25.5139,NA 
155 | PAL0708,2,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N31A2,Yes,2007-11-27,50,16.3,230,5700,MALE,8.14756,-25.39369,NA 156 | PAL0708,3,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N32A1,Yes,2007-11-27,48.7,14.1,210,4450,FEMALE,8.14705,-25.46172,NA 157 | PAL0708,4,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N32A2,Yes,2007-11-27,50,15.2,218,5700,MALE,8.2554,-25.40075,NA 158 | PAL0708,5,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N33A1,Yes,2007-11-18,47.6,14.5,215,5400,MALE,8.2345,-25.54456,NA 159 | PAL0708,6,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N33A2,Yes,2007-11-18,46.5,13.5,210,4550,FEMALE,7.9953,-25.32829,NA 160 | PAL0708,7,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N34A1,Yes,2007-11-27,45.4,14.6,211,4800,FEMALE,8.24515,-25.46782,NA 161 | PAL0708,8,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N34A2,Yes,2007-11-27,46.7,15.3,219,5200,MALE,8.22673,-25.4276,NA 162 | PAL0708,9,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N35A1,Yes,2007-11-27,43.3,13.4,209,4400,FEMALE,8.13643,-25.32176,NA 163 | PAL0708,10,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N35A2,Yes,2007-11-27,46.8,15.4,215,5150,MALE,8.1631,-25.38017,NA 164 | PAL0708,11,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N36A1,No,2007-11-27,40.9,13.7,214,4650,FEMALE,8.19579,-25.3933,Nest never observed with full clutch. 165 | PAL0708,12,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N36A2,No,2007-11-27,49,16.1,216,5550,MALE,8.10417,-25.50562,Nest never observed with full clutch. 
166 | PAL0708,13,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N37A1,Yes,2007-11-29,45.5,13.7,214,4650,FEMALE,7.77672,-25.4168,NA 167 | PAL0708,14,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N37A2,Yes,2007-11-29,48.4,14.6,213,5850,MALE,7.8208,-25.48025,NA 168 | PAL0708,15,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N38A1,Yes,2007-12-03,45.8,14.6,210,4200,FEMALE,7.79958,-25.62618,NA 169 | PAL0708,16,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N38A2,Yes,2007-12-03,49.3,15.7,217,5850,MALE,8.07137,-25.52473,NA 170 | PAL0708,17,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N39A1,Yes,2007-11-27,42,13.5,210,4150,FEMALE,7.63884,-25.52627,NA 171 | PAL0708,18,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N39A2,Yes,2007-11-27,49.2,15.2,221,6300,MALE,8.27376,-25.00169,NA 172 | PAL0708,19,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N40A1,Yes,2007-11-27,46.2,14.5,209,4800,FEMALE,7.84057,-25.37899,NA 173 | PAL0708,20,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N40A2,Yes,2007-11-27,48.7,15.1,222,5350,MALE,7.96491,-25.39587,NA 174 | PAL0708,21,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N41A1,Yes,2007-11-27,50.2,14.3,218,5700,MALE,7.8962,-25.37746,NA 175 | PAL0708,22,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N41A2,Yes,2007-11-27,45.1,14.5,215,5000,FEMALE,7.6322,-25.46569,NA 176 | PAL0708,23,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N42A1,Yes,2007-11-27,46.5,14.5,213,4400,FEMALE,7.90436,-25.3947,NA 177 | PAL0708,24,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N42A2,Yes,2007-11-27,46.3,15.8,215,5050,MALE,7.90971,-25.38157,NA 178 | PAL0708,25,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N44A1,Yes,2007-11-29,42.9,13.1,215,5000,FEMALE,7.68528,-25.39181,NA 179 | 
PAL0708,26,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N44A2,Yes,2007-11-29,46.1,15.1,215,5100,MALE,7.83733,-25.42826,NA 180 | PAL0708,27,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N46A1,Yes,2007-11-29,44.5,14.3,216,4100,NA,7.96621,-25.69327,Sexing primers did not amplify. 181 | PAL0708,28,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N46A2,Yes,2007-11-29,47.8,15,215,5650,MALE,7.92358,-25.48383,NA 182 | PAL0708,29,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N47A1,Yes,2007-11-29,48.2,14.3,210,4600,FEMALE,7.6887,-25.50811,NA 183 | PAL0708,30,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N47A2,Yes,2007-11-29,50,15.3,220,5550,MALE,8.30515,-25.19017,NA 184 | PAL0708,31,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N50A1,Yes,2007-11-29,47.3,15.3,222,5250,MALE,NA,NA,Not enough blood for isotopes. 185 | PAL0708,32,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N50A2,Yes,2007-11-29,42.8,14.2,209,4700,FEMALE,7.63452,-25.46327,NA 186 | PAL0708,33,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N56A1,Yes,2007-12-03,45.1,14.5,207,5050,FEMALE,7.97408,-25.53768,NA 187 | PAL0708,34,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N56A2,Yes,2007-12-03,59.6,17,230,6050,MALE,7.76843,-25.6821,NA 188 | PAL0809,35,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N2A1,Yes,2008-11-13,49.1,14.8,220,5150,FEMALE,7.89744,-26.63405,NA 189 | PAL0809,36,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N2A2,Yes,2008-11-13,48.4,16.3,220,5400,MALE,8.03659,-26.86127,NA 190 | PAL0809,37,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N4A1,Yes,2008-11-02,42.6,13.7,213,4950,FEMALE,7.96935,-26.70968,NA 191 | PAL0809,38,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg 
Stage",N4A2,Yes,2008-11-02,44.4,17.3,219,5250,MALE,8.13746,-26.79093,NA 192 | PAL0809,39,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N5A1,Yes,2008-11-09,44,13.6,208,4350,FEMALE,8.01979,-26.68311,NA 193 | PAL0809,40,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N5A2,Yes,2008-11-09,48.7,15.7,208,5350,MALE,8.14776,-26.84506,NA 194 | PAL0809,41,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N6A1,No,2008-11-04,42.7,13.7,208,3950,FEMALE,8.14567,-26.59467,Nest never observed with full clutch. 195 | PAL0809,42,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N6A2,No,2008-11-04,49.6,16,225,5700,MALE,8.38324,-26.84272,Nest never observed with full clutch. 196 | PAL0809,43,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N7A1,Yes,2008-11-04,45.3,13.7,210,4300,FEMALE,8.37615,-26.72791,NA 197 | PAL0809,44,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N7A2,Yes,2008-11-04,49.6,15,216,4750,MALE,8.26548,-26.7699,NA 198 | PAL0809,45,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N8A1,Yes,2008-11-03,50.5,15.9,222,5550,MALE,8.46894,-26.60436,NA 199 | PAL0809,46,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N8A2,Yes,2008-11-03,43.6,13.9,217,4900,FEMALE,8.27141,-26.7765,NA 200 | PAL0809,47,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N11A1,No,2008-11-09,45.5,13.9,210,4200,FEMALE,8.47829,-26.61788,Nest never observed with full clutch. 201 | PAL0809,48,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N11A2,No,2008-11-09,50.5,15.9,225,5400,MALE,8.65803,-26.57585,Nest never observed with full clutch. 
202 | PAL0809,49,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N12A1,Yes,2008-11-02,44.9,13.3,213,5100,FEMALE,8.45167,-26.89644,NA 203 | PAL0809,50,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N12A2,Yes,2008-11-02,45.2,15.8,215,5300,MALE,8.55868,-26.67799,NA 204 | PAL0809,51,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N13A1,Yes,2008-11-04,46.6,14.2,210,4850,FEMALE,8.38289,-26.86352,NA 205 | PAL0809,52,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N13A2,Yes,2008-11-04,48.5,14.1,220,5300,MALE,8.39867,-26.79358,NA 206 | PAL0809,53,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N14A1,Yes,2008-11-04,45.1,14.4,210,4400,FEMALE,8.51951,-27.01854,NA 207 | PAL0809,54,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N14A2,Yes,2008-11-04,50.1,15,225,5000,MALE,8.50153,-26.61414,NA 208 | PAL0809,55,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N15A1,Yes,2008-11-04,46.5,14.4,217,4900,FEMALE,8.48789,-26.83006,NA 209 | PAL0809,56,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N15A2,Yes,2008-11-04,45,15.4,220,5050,MALE,8.63488,-26.75621,NA 210 | PAL0809,57,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N16A1,Yes,2008-11-03,43.8,13.9,208,4300,FEMALE,8.58319,-26.84415,NA 211 | PAL0809,58,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N16A2,Yes,2008-11-03,45.5,15,220,5000,MALE,8.63604,-26.7489,NA 212 | PAL0809,59,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N17A1,Yes,2008-11-06,43.2,14.5,208,4450,FEMALE,8.48367,-26.86485,NA 213 | PAL0809,60,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N17A2,Yes,2008-11-06,50.4,15.3,224,5550,MALE,8.74647,-26.79846,NA 214 | PAL0809,61,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N18A1,Yes,2008-11-03,45.3,13.8,208,4200,FEMALE,8.65015,-26.79053,NA 215 | 
PAL0809,62,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N18A2,Yes,2008-11-03,46.2,14.9,221,5300,MALE,8.60092,-26.84374,NA 216 | PAL0809,63,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N19A1,Yes,2008-11-13,45.7,13.9,214,4400,FEMALE,8.6287,-26.60484,NA 217 | PAL0809,64,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N19A2,Yes,2008-11-13,54.3,15.7,231,5650,MALE,8.49662,-26.84166,NA 218 | PAL0809,65,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N20A1,Yes,2008-11-04,45.8,14.2,219,4700,FEMALE,8.60447,-26.61601,NA 219 | PAL0809,66,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N20A2,Yes,2008-11-04,49.8,16.8,230,5700,MALE,8.47067,-26.69166,NA 220 | PAL0809,67,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N51A1,Yes,2008-11-09,46.2,14.4,214,4650,NA,8.24253,-26.8154,Sexing primers did not amplify. 221 | PAL0809,68,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N51A2,Yes,2008-11-09,49.5,16.2,229,5800,MALE,8.49854,-26.74809,NA 222 | PAL0809,69,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N53A1,Yes,2008-11-13,43.5,14.2,220,4700,FEMALE,8.64931,-26.68867,NA 223 | PAL0809,70,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N53A2,Yes,2008-11-13,50.7,15,223,5550,MALE,8.63551,-26.74249,NA 224 | PAL0809,71,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N54A1,Yes,2008-11-03,47.7,15,216,4750,FEMALE,8.53018,-26.72751,NA 225 | PAL0809,72,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N54A2,Yes,2008-11-03,46.4,15.6,221,5000,MALE,8.35078,-26.70783,NA 226 | PAL0809,73,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N55A1,Yes,2008-11-09,48.2,15.6,221,5100,MALE,8.24651,-26.66958,NA 227 | PAL0809,74,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N55A2,Yes,2008-11-09,46.5,14.8,217,5200,FEMALE,8.58487,-26.5929,NA 
228 | PAL0809,75,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N56A1,Yes,2008-11-06,46.4,15,216,4700,FEMALE,8.47938,-26.9547,NA 229 | PAL0809,76,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N56A2,Yes,2008-11-06,48.6,16,230,5800,MALE,8.5964,-26.71199,NA 230 | PAL0809,77,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N58A1,Yes,2008-11-06,47.5,14.2,209,4600,FEMALE,8.39299,-26.78733,NA 231 | PAL0809,78,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N58A2,Yes,2008-11-06,51.1,16.3,220,6000,MALE,8.40327,-26.76821,NA 232 | PAL0809,79,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N60A1,Yes,2008-11-09,45.2,13.8,215,4750,FEMALE,8.24694,-26.65359,NA 233 | PAL0809,80,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N60A2,Yes,2008-11-09,45.2,16.4,223,5950,MALE,8.19749,-26.65931,NA 234 | PAL0910,81,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N1A1,Yes,2009-11-18,49.1,14.5,212,4625,FEMALE,8.35802,-26.2766,NA 235 | PAL0910,82,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N1A2,Yes,2009-11-18,52.5,15.6,221,5450,MALE,8.28601,-26.27573,NA 236 | PAL0910,83,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N6A1,Yes,2009-11-15,47.4,14.6,212,4725,FEMALE,8.19101,-26.24369,NA 237 | PAL0910,84,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N6A2,Yes,2009-11-15,50,15.9,224,5350,MALE,8.20042,-26.39677,NA 238 | PAL0910,85,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N8A1,Yes,2009-11-22,44.9,13.8,212,4750,FEMALE,8.11238,-26.20372,NA 239 | PAL0910,86,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N8A2,Yes,2009-11-22,50.8,17.3,228,5600,MALE,8.27428,-26.30019,NA 240 | PAL0910,87,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N13A1,Yes,2009-11-20,43.4,14.4,218,4600,FEMALE,8.2346800000000009,-26.18599,NA 241 | 
PAL0910,88,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N13A2,Yes,2009-11-20,51.3,14.2,218,5300,MALE,8.15426,-26.3433,NA 242 | PAL0910,89,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N14A1,Yes,2009-11-25,47.5,14,212,4875,FEMALE,8.12691,-26.23613,NA 243 | PAL0910,90,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N14A2,Yes,2009-11-25,52.1,17,230,5550,MALE,8.27595,-26.11657,NA 244 | PAL0910,91,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N15A1,Yes,2009-11-25,47.5,15,218,4950,FEMALE,8.29671,-26.08547,NA 245 | PAL0910,92,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N15A2,Yes,2009-11-25,52.2,17.1,228,5400,MALE,8.36701,-25.89834,NA 246 | PAL0910,93,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N18A1,Yes,2009-12-01,45.5,14.5,212,4750,FEMALE,8.15566,-26.22848,NA 247 | PAL0910,94,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N18A2,Yes,2009-12-01,49.5,16.1,224,5650,MALE,8.83352,-25.69195,NA 248 | PAL0910,95,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N19A1,Yes,2009-11-27,44.5,14.7,214,4850,FEMALE,8.20106,-26.16524,NA 249 | PAL0910,96,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N19A2,Yes,2009-11-27,50.8,15.7,226,5200,MALE,8.27102,-26.11244,NA 250 | PAL0910,97,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N20A1,Yes,2009-11-18,49.4,15.8,216,4925,MALE,8.03624,-26.06594,NA 251 | PAL0910,98,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N20A2,Yes,2009-11-18,46.9,14.6,222,4875,FEMALE,7.8881,-26.04726,NA 252 | PAL0910,99,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N21A1,Yes,2009-11-18,48.4,14.4,203,4625,FEMALE,8.16582,-26.13971,NA 253 | PAL0910,100,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N21A2,Yes,2009-11-18,51.1,16.5,225,5250,MALE,8.2066,-26.36863,NA 254 | PAL0910,101,Gentoo 
penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N22A1,Yes,2009-11-22,48.5,15,219,4850,FEMALE,8.10231,-26.18763,NA 255 | PAL0910,102,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N22A2,Yes,2009-11-22,55.9,17,228,5600,MALE,8.3118,-26.35425,NA 256 | PAL0910,103,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N23A1,Yes,2009-11-18,47.2,15.5,215,4975,FEMALE,8.30817,-26.21651,NA 257 | PAL0910,104,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N23A2,Yes,2009-11-18,49.1,15,228,5500,MALE,8.65914,-25.79203,NA 258 | PAL0910,105,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N24A1,Yes,2009-12-01,47.3,13.8,216,4725,NA,8.25818,-26.23886,Sexing primers did not amplify. 259 | PAL0910,106,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N24A2,Yes,2009-12-01,46.8,16.1,215,5500,MALE,8.32359,-26.05756,NA 260 | PAL0910,107,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N28A1,Yes,2009-11-10,41.7,14.7,210,4700,FEMALE,8.12311,-26.44815,NA 261 | PAL0910,108,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N28A2,Yes,2009-11-10,53.4,15.8,219,5500,MALE,8.41017,-26.33867,NA 262 | PAL0910,109,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N29A1,Yes,2009-11-09,43.3,14,208,4575,FEMALE,8.4207,-26.38092,NA 263 | PAL0910,110,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N29A2,Yes,2009-11-09,48.1,15.1,209,5500,MALE,8.45738,-26.22664,NA 264 | PAL0910,111,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N32A1,Yes,2009-11-20,50.5,15.2,216,5000,FEMALE,8.24691,-26.18466,NA 265 | PAL0910,112,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N32A2,Yes,2009-11-20,49.8,15.9,229,5950,MALE,8.29226,-26.21019,NA 266 | PAL0910,113,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N34A1,Yes,2009-11-27,43.5,15.2,213,4650,FEMALE,8.21634,-26.11046,NA 267 | 
PAL0910,114,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N34A2,Yes,2009-11-27,51.5,16.3,230,5500,MALE,8.78557,-25.76147,NA 268 | PAL0910,115,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N35A1,Yes,2009-11-25,46.2,14.1,217,4375,FEMALE,8.30231,-25.96013,NA 269 | PAL0910,116,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N35A2,Yes,2009-11-25,55.1,16,230,5850,MALE,8.08354,-26.18161,NA 270 | PAL0910,117,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N36A1,Yes,2009-12-01,44.5,15.7,217,4875,.,8.04111,-26.18444,Sexing primers did not amplify. 271 | PAL0910,118,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N36A2,Yes,2009-12-01,48.8,16.2,222,6000,MALE,8.33825,-25.88547,NA 272 | PAL0910,119,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N38A1,No,2009-12-01,47.2,13.7,214,4925,FEMALE,7.99184,-26.20538,Nest never observed with full clutch. 273 | PAL0910,120,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N38A2,No,2009-12-01,NA,NA,NA,NA,NA,NA,NA,Adult not sampled. Nest never observed with full clutch. 274 | PAL0910,121,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N39A1,Yes,2009-11-22,46.8,14.3,215,4850,FEMALE,8.41151,-26.13832,NA 275 | PAL0910,122,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N39A2,Yes,2009-11-22,50.4,15.7,222,5750,MALE,8.30166,-26.04117,NA 276 | PAL0910,123,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N43A1,Yes,2009-11-22,45.2,14.8,212,5200,FEMALE,8.24246,-26.11969,NA 277 | PAL0910,124,Gentoo penguin (Pygoscelis papua),Anvers,Biscoe,"Adult, 1 Egg Stage",N43A2,Yes,2009-11-22,49.9,16.1,213,5400,MALE,8.3639,-26.15531,NA 278 | PAL0708,1,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N61A1,No,2007-11-19,46.5,17.9,192,3500,FEMALE,9.03935,-24.30229,Nest never observed with full clutch. 
279 | PAL0708,2,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N61A2,No,2007-11-19,50,19.5,196,3900,MALE,8.92069,-24.23592,Nest never observed with full clutch. 280 | PAL0708,3,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N62A1,Yes,2007-11-26,51.3,19.2,193,3650,MALE,9.29078,-24.7557,NA 281 | PAL0708,4,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N62A2,Yes,2007-11-26,45.4,18.7,188,3525,FEMALE,8.64701,-24.62717,NA 282 | PAL0708,5,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N64A1,Yes,2007-11-21,52.7,19.8,197,3725,MALE,9.00642,-24.61867,NA 283 | PAL0708,6,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N64A2,Yes,2007-11-21,45.2,17.8,198,3950,FEMALE,8.88942,-24.49433,NA 284 | PAL0708,7,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N66A1,Yes,2007-11-28,46.1,18.2,178,3250,FEMALE,8.85664,-24.55644,NA 285 | PAL0708,8,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N66A2,Yes,2007-11-28,51.3,18.2,197,3750,MALE,8.63701,-24.84059,NA 286 | PAL0708,9,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N67A1,Yes,2007-11-21,46,18.9,195,4150,FEMALE,8.47173,-24.29229,NA 287 | PAL0708,10,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N67A2,Yes,2007-11-21,51.3,19.9,198,3700,MALE,8.79581,-24.36088,NA 288 | PAL0708,11,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N68A1,Yes,2007-11-28,46.6,17.8,193,3800,FEMALE,8.95063,-24.59897,NA 289 | PAL0708,12,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N68A2,Yes,2007-11-28,51.7,20.3,194,3775,MALE,8.68747,-24.38751,NA 290 | PAL0708,13,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N69A1,Yes,2007-11-26,47,17.3,185,3700,FEMALE,8.72037,-24.80526,NA 291 | PAL0708,14,Chinstrap penguin (Pygoscelis 
antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N69A2,Yes,2007-11-26,52,18.1,201,4050,MALE,9.0233,-24.38933,NA 292 | PAL0708,15,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N70A1,Yes,2007-11-22,45.9,17.1,190,3575,FEMALE,9.12277,-24.90024,NA 293 | PAL0708,16,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N70A2,Yes,2007-11-22,50.5,19.6,201,4050,MALE,9.8059,-24.7294,NA 294 | PAL0708,17,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N71A1,No,2007-11-30,50.3,20,197,3300,MALE,10.02019,-24.54704,Nest never observed with full clutch. 295 | PAL0708,18,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N71A2,No,2007-11-30,58,17.8,181,3700,FEMALE,9.14382,-24.57994,Nest never observed with full clutch. 296 | PAL0708,19,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N72A1,Yes,2007-11-30,46.4,18.6,190,3450,FEMALE,9.32105,-24.64162,NA 297 | PAL0708,20,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N72A2,Yes,2007-11-30,49.2,18.2,195,4400,MALE,9.27158,-24.64335,NA 298 | PAL0708,21,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N73A1,Yes,2007-12-03,42.4,17.3,181,3600,FEMALE,9.35138,-24.6879,NA 299 | PAL0708,22,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N73A2,Yes,2007-12-03,48.5,17.5,191,3400,MALE,9.42666,-24.26375,NA 300 | PAL0708,23,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N85A1,No,2007-11-28,43.2,16.6,187,2900,FEMALE,9.35416,-25.01185,Nest never observed with full clutch. 301 | PAL0708,24,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N85A2,No,2007-11-28,50.6,19.4,193,3800,MALE,9.28153,-24.97134,Nest never observed with full clutch. 
302 | PAL0708,25,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N89A1,No,2007-11-28,46.7,17.9,195,3300,FEMALE,9.74144,-24.59467,Nest never observed with full clutch. 303 | PAL0708,26,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N89A2,No,2007-11-28,52,19,197,4150,MALE,9.36799,-24.47142,Nest never observed with full clutch. 304 | PAL0809,27,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N61A1,No,2008-11-25,50.5,18.4,200,3400,FEMALE,8.9399,-23.89017,Nest never observed with full clutch. 305 | PAL0809,28,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N61A2,No,2008-11-25,49.5,19,200,3800,MALE,9.63074,-24.34684,Nest never observed with full clutch. 306 | PAL0809,29,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N62A1,Yes,2008-11-14,46.4,17.8,191,3700,FEMALE,9.37369,-24.52896,NA 307 | PAL0809,30,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N62A2,Yes,2008-11-14,52.8,20,205,4550,MALE,9.25177,-24.69638,NA 308 | PAL0809,31,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N63A1,Yes,2008-11-24,40.9,16.6,187,3200,FEMALE,9.08458,-24.54903,NA 309 | PAL0809,32,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N63A2,Yes,2008-11-24,54.2,20.8,201,4300,MALE,9.49283,-24.59996,NA 310 | PAL0809,33,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N65A1,Yes,2008-11-24,42.5,16.7,187,3350,FEMALE,9.36668,-24.45195,NA 311 | PAL0809,34,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N65A2,Yes,2008-11-24,51,18.8,203,4100,MALE,9.23196,-24.17282,NA 312 | PAL0809,35,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N67A1,Yes,2008-11-25,49.7,18.6,195,3600,MALE,9.75486,-24.31198,NA 313 | PAL0809,36,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg 
Stage",N67A2,Yes,2008-11-25,47.5,16.8,199,3900,FEMALE,9.07825,-25.1455,NA 314 | PAL0809,37,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N69A1,Yes,2008-11-14,47.6,18.3,195,3850,FEMALE,8.83502,-24.65859,NA 315 | PAL0809,38,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N69A2,Yes,2008-11-14,52,20.7,210,4800,MALE,9.43146,-24.6844,NA 316 | PAL0809,39,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N72A1,No,2008-11-24,46.9,16.6,192,2700,FEMALE,9.80589,-24.73735,Nest never observed with full clutch. 317 | PAL0809,40,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N72A2,No,2008-11-24,53.5,19.9,205,4500,MALE,10.02544,-24.90816,Nest never observed with full clutch. 318 | PAL0809,41,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N74A1,Yes,2008-11-24,49,19.5,210,3950,MALE,9.53262,-24.66867,NA 319 | PAL0809,42,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N74A2,Yes,2008-11-24,46.2,17.5,187,3650,FEMALE,9.61734,-24.66188,NA 320 | PAL0809,43,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N75A1,Yes,2008-11-14,50.9,19.1,196,3550,MALE,10.02372,-24.86594,NA 321 | PAL0809,44,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N75A2,Yes,2008-11-14,45.5,17,196,3500,FEMALE,9.36493,-24.66259,NA 322 | PAL0910,45,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N86A1,Yes,2009-11-17,50.9,17.9,196,3675,FEMALE,9.43684,-24.16566,NA 323 | PAL0910,46,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N86A2,Yes,2009-11-17,50.8,18.5,201,4450,MALE,9.45827,-24.35575,NA 324 | PAL0910,47,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N87A1,Yes,2009-11-27,50.1,17.9,190,3400,FEMALE,9.46819,-24.45721,NA 325 | PAL0910,48,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg 
Stage",N87A2,Yes,2009-11-27,49,19.6,212,4300,MALE,9.34089,-24.45189,NA 326 | PAL0910,49,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N88A1,Yes,2009-11-23,51.5,18.7,187,3250,MALE,9.6895,-24.43062,NA 327 | PAL0910,50,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N88A2,Yes,2009-11-23,49.8,17.3,198,3675,FEMALE,9.32169,-24.41562,NA 328 | PAL0910,51,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N90A1,Yes,2009-11-21,48.1,16.4,199,3325,FEMALE,9.46929,-24.48403,NA 329 | PAL0910,52,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N90A2,Yes,2009-11-21,51.4,19,201,3950,MALE,9.43782,-24.36202,NA 330 | PAL0910,53,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N92A1,Yes,2009-11-23,45.7,17.3,193,3600,FEMALE,9.415,-24.805,NA 331 | PAL0910,54,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N92A2,Yes,2009-11-23,50.7,19.7,203,4050,MALE,9.93727,-24.59066,NA 332 | PAL0910,55,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N93A1,Yes,2009-11-27,42.5,17.3,187,3350,FEMALE,9.56534,-24.60882,NA 333 | PAL0910,56,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N93A2,Yes,2009-11-27,52.2,18.8,197,3450,MALE,9.77528,-24.56481,NA 334 | PAL0910,57,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N94A1,Yes,2009-11-21,45.2,16.6,191,3250,FEMALE,9.62357,-24.78984,NA 335 | PAL0910,58,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N94A2,Yes,2009-11-21,49.3,19.9,203,4050,MALE,9.88809,-24.59513,NA 336 | PAL0910,59,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N95A1,Yes,2009-11-21,50.2,18.8,202,3800,MALE,9.74492,-24.404,NA 337 | PAL0910,60,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N95A2,Yes,2009-11-21,45.6,19.4,194,3525,FEMALE,9.46985,-24.65786,NA 338 | 
PAL0910,61,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N96A1,Yes,2009-11-27,51.9,19.5,206,3950,MALE,NA,-23.78767,No delta15N data received from lab. 339 | PAL0910,62,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N96A2,Yes,2009-11-27,46.8,16.5,189,3650,FEMALE,9.65061,-24.48153,NA 340 | PAL0910,63,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N98A1,Yes,2009-11-19,45.7,17,195,3650,FEMALE,9.2671500000000009,-24.31912,NA 341 | PAL0910,64,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N98A2,Yes,2009-11-19,55.8,19.8,207,4000,MALE,9.7046500000000009,-24.53494,NA 342 | PAL0910,65,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N99A1,No,2009-11-21,43.5,18.1,202,3400,FEMALE,9.37608,-24.40753,Nest never observed with full clutch. 343 | PAL0910,66,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N99A2,No,2009-11-21,49.6,18.2,193,3775,MALE,9.4618,-24.70615,Nest never observed with full clutch. 
344 | PAL0910,67,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N100A1,Yes,2009-11-21,50.8,19,210,4100,MALE,9.98044,-24.68741,NA 345 | PAL0910,68,Chinstrap penguin (Pygoscelis antarctica),Anvers,Dream,"Adult, 1 Egg Stage",N100A2,Yes,2009-11-21,50.2,18.7,198,3775,FEMALE,9.39305,-24.25255,NA 346 | -------------------------------------------------------------------------------- /figures/culmen_depth.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NGSchoolEU/ngs19_python_intro/6a70c84188b74a02cdd52b13ff5f5c2955af86f5/figures/culmen_depth.png -------------------------------------------------------------------------------- /figures/elimination.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NGSchoolEU/ngs19_python_intro/6a70c84188b74a02cdd52b13ff5f5c2955af86f5/figures/elimination.png -------------------------------------------------------------------------------- /figures/lter_penguins.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NGSchoolEU/ngs19_python_intro/6a70c84188b74a02cdd52b13ff5f5c2955af86f5/figures/lter_penguins.png -------------------------------------------------------------------------------- /figures/upper_triangular.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NGSchoolEU/ngs19_python_intro/6a70c84188b74a02cdd52b13ff5f5c2955af86f5/figures/upper_triangular.png -------------------------------------------------------------------------------- /setup_check.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # ============================================================================== 3 | # | Description: Checks if requirements for the Intro to Python class | 4 | # | are satisfied. 
| 5 | # | Author: Katarzyna Kedzierska | 6 | # | Affiliation: Wellcome Centre for Human Genetics, University of Oxford | 7 | # | Date: April 30, 2019 | 8 | # | Usage: bash setup_check.sh | 9 | # ============================================================================== 10 | 11 | PROGRAM=$0 12 | 13 | # Defining helper functions 14 | error() 15 | { 16 | echo -e "\e[48;5;88m$(date +'%Y-%m-%d %T') ERROR: $1\e[0m $2" 1>&2 17 | exit 1 18 | } 19 | 20 | warning() 21 | { 22 | echo -e "\e[7m$(date +'%Y-%m-%d %T') WARNING\e[0m $@" 23 | } 24 | 25 | usage() 26 | { 27 | echo -e " 28 | bash $PROGRAM 29 | 30 | Checks if the requirements for the Introduction to Python course are satisfied. 31 | If not, reports what is missing." 32 | } 33 | 34 | modules="numpy pandas matplotlib scipy seaborn" 35 | 36 | warning " 37 | This is a very simple script that looks for \e[1mPython3\e[0m, \e[1mJupyter lab\e[0m and 38 | Python modules: \e[1m${modules}\e[0m. 39 | It also tries to give you the best advice on how to proceed in case any of them 40 | are missing. 41 | 42 | If you have more than one Python3 version installed, adapt the following advice 43 | accordingly, i.e. make sure the PATH is set up properly or conda environment 44 | is activated before checking, and then installing new Python version :)! 45 | 46 | If you do not feel like you understand the advice properly, contact the author, 47 | or try running the workshop in Google Colab (more info in the README file). 48 | " 49 | 50 | # Check if Python3 installed 51 | python3_present=$(which python3) 52 | 53 | if [ "${python3_present}" == "" ]; then 54 | error "Python3 is missing" " 55 | There is no Python3 installation present in your PATH! 56 | If you are using conda, remember to activate a proper environment." 
57 | fi 58 | 59 | # Check Python3 minor version 60 | python_version=$(python3 --version | cut -f2 -d' ') 61 | python_version_minor=$(echo ${python_version} | cut -f2 -d'.') 62 | 63 | if [ "${python_version_minor}" -lt "5" ]; then 64 | warning " 65 | Your Python3 version (${python_version}) is a bit old. This course was not tested 66 | on versions earlier than 3.5.3 and some of its features may not work properly." 67 | fi 68 | 69 | # Check for necessary modules 70 | for module in ${modules}; do 71 | fail=true; 72 | python3 -c "import $module" 2> /dev/null && fail=false; 73 | if ${fail}; then 74 | modules_to_install="${modules_to_install} $module" 75 | fi 76 | done 77 | 78 | modules_to_install=$(echo ${modules_to_install} | tr -s ' ' | sed 's/^ //g') 79 | 80 | pip3_present=$(which pip3) 81 | conda_present=$(which conda) 82 | 83 | if [ "${modules_to_install}" != "" ]; then 84 | if [ "${conda_present}" != "" ]; then 85 | error "Missing module(s)" " 86 | \e[31mThe following module(s) are missing:\e[0m 87 | \e[1m${modules_to_install}\e[0m 88 | In order to install them, activate the desired environment, for example 89 | \e[1mconda activate desired_environment\e[0m 90 | and install packages: 91 | \e[1mconda install -c anaconda ${modules_to_install}\e[0m" 92 | elif [ "${pip3_present}" != "" ]; then 93 | error "Missing module(s)" " 94 | \e[31mThe following module(s) are missing:\e[0m 95 | \e[1m${modules_to_install}\e[0m 96 | In order to install them try typing: 97 | \e[1mpip3 install ${modules_to_install}\e[0m" 98 | else 99 | error "Missing module(s)" " 100 | \e[31mThe following module(s) are missing:\e[0m 101 | \e[1m${modules_to_install}\e[0m" 102 | fi 103 | fi 104 | 105 | # Lastly, check for Jupyter lab 106 | jupyter lab -h > /dev/null 2>&1 && jupyter_notebook="present" 107 | 108 | if [ "${jupyter_notebook}" == "" ]; then 109 | if [ "${pip3_present}" != "" ] || [ "${conda_present}" != "" ]; then 110 | error "Jupyter lab is missing" " 111 | You can install jupyter lab with 
conda (remember to activate the environment first) 112 | \e[1mconda install -c anaconda jupyterlab\e[0m 113 | If you are not using conda, try typing 114 | \e[1mpip3 install jupyterlab\e[0m" 115 | else 116 | error "Jupyter lab is missing" " 117 | Neither conda nor pip3 detected. Follow the installation guide here: 118 | https://jupyterlab.readthedocs.io/en/stable/getting_started/installation.html" 119 | fi 120 | fi 121 | 122 | if [ "$(which cowsay)" != "" ]; then 123 | cowsay "Success! Seems like all modules are installed! 124 | You are ready to start the workshop. Hope you'll enjoy it! :)" 125 | else 126 | echo -e " 127 | =================================== SUCCESS ==================================== 128 | | Seems like all modules are installed! You are ready to start the workshop. | 129 | | Hope you'll enjoy it! :) | 130 | | | 131 | | BTW. If you had cowsay installed, this message would be way nicer. | 132 | | You can install it by typing: \e[1msudo apt install cowsay\e[0m, just saying ;) | 133 | ================================================================================ 134 | " 135 | fi 136 | --------------------------------------------------------------------------------