├── .ipynb_checkpoints ├── 1 - Pandas - Series-checkpoint.ipynb ├── 3 - Pandas - DataFrames-checkpoint.ipynb ├── 4 - Pandas DataFrames exercises-checkpoint.ipynb └── 5 - Pandas - Reading CSV and Basic Plotting-checkpoint.ipynb ├── 1 - Pandas - Series.ipynb ├── 2 - Pandas Series exercises.ipynb ├── 3 - Pandas - DataFrames.ipynb ├── 4 - Pandas DataFrames exercises.ipynb ├── 5 - Pandas - Reading CSV and Basic Plotting.ipynb ├── README.md └── data ├── .ipynb_checkpoints └── btc-market-price-checkpoint.csv ├── btc-market-price.csv └── eth-price.csv /.ipynb_checkpoints/1 - Pandas - Series-checkpoint.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "![rmotr](https://user-images.githubusercontent.com/7065401/52071918-bda15380-2562-11e9-828c-7f95297e4a82.png)\n", 8 | "
\n", 9 | "\n", 10 | "\n", 12 | "\n", 13 | "# Pandas - Series\n" 14 | ] 15 | }, 16 | { 17 | "cell_type": "markdown", 18 | "metadata": {}, 19 | "source": [ 20 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n", 21 | "\n", 22 | "## Hands on! " 23 | ] 24 | }, 25 | { 26 | "cell_type": "code", 27 | "execution_count": null, 28 | "metadata": {}, 29 | "outputs": [], 30 | "source": [ 31 | "import pandas as pd\n", 32 | "import numpy as np" 33 | ] 34 | }, 35 | { 36 | "cell_type": "markdown", 37 | "metadata": {}, 38 | "source": [ 39 | "## Pandas Series\n", 40 | "\n", 41 | "We'll start by analyzing \"[The Group of Seven](https://en.wikipedia.org/wiki/Group_of_Seven)\", a political forum formed by Canada, France, Germany, Italy, Japan, the United Kingdom and the United States. We'll begin with population, and for that we'll use a `pandas.Series` object." 42 | ] 43 | }, 44 | { 45 | "cell_type": "code", 46 | "execution_count": null, 47 | "metadata": {}, 48 | "outputs": [], 49 | "source": [ 50 | "# In millions\n", 51 | "g7_pop = pd.Series([35.467, 63.951, 80.940, 60.665, 127.061, 64.511, 318.523])" 52 | ] 53 | }, 54 | { 55 | "cell_type": "code", 56 | "execution_count": null, 57 | "metadata": { 58 | "scrolled": true 59 | }, 60 | "outputs": [], 61 | "source": [ 62 | "g7_pop" 63 | ] 64 | }, 65 | { 66 | "cell_type": "markdown", 67 | "metadata": {}, 68 | "source": [ 69 | "Someone might not know we're representing population in millions of inhabitants. 
Series can have a `name`, to better document the purpose of the Series:" 70 | ] 71 | }, 72 | { 73 | "cell_type": "code", 74 | "execution_count": null, 75 | "metadata": {}, 76 | "outputs": [], 77 | "source": [ 78 | "g7_pop.name = 'G7 Population in millions'" 79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": null, 84 | "metadata": {}, 85 | "outputs": [], 86 | "source": [ 87 | "g7_pop" 88 | ] 89 | }, 90 | { 91 | "cell_type": "markdown", 92 | "metadata": {}, 93 | "source": [ 94 | "Series are pretty similar to numpy arrays:" 95 | ] 96 | }, 97 | { 98 | "cell_type": "code", 99 | "execution_count": null, 100 | "metadata": {}, 101 | "outputs": [], 102 | "source": [ 103 | "g7_pop.dtype" 104 | ] 105 | }, 106 | { 107 | "cell_type": "code", 108 | "execution_count": null, 109 | "metadata": {}, 110 | "outputs": [], 111 | "source": [ 112 | "g7_pop.values" 113 | ] 114 | }, 115 | { 116 | "cell_type": "markdown", 117 | "metadata": {}, 118 | "source": [ 119 | "They're actually backed by numpy arrays:" 120 | ] 121 | }, 122 | { 123 | "cell_type": "code", 124 | "execution_count": null, 125 | "metadata": {}, 126 | "outputs": [], 127 | "source": [ 128 | "type(g7_pop.values)" 129 | ] 130 | }, 131 | { 132 | "cell_type": "markdown", 133 | "metadata": {}, 134 | "source": [ 135 | "And they _look_ like simple Python lists or Numpy Arrays. 
But they're actually more similar to Python `dict`s.\n", 136 | "\n", 137 | "A Series has an `index`, that's similar to the automatic index assigned to Python's lists:" 138 | ] 139 | }, 140 | { 141 | "cell_type": "code", 142 | "execution_count": null, 143 | "metadata": {}, 144 | "outputs": [], 145 | "source": [ 146 | "g7_pop[0]" 147 | ] 148 | }, 149 | { 150 | "cell_type": "code", 151 | "execution_count": null, 152 | "metadata": {}, 153 | "outputs": [], 154 | "source": [ 155 | "g7_pop[1]" 156 | ] 157 | }, 158 | { 159 | "cell_type": "code", 160 | "execution_count": null, 161 | "metadata": {}, 162 | "outputs": [], 163 | "source": [ 164 | "g7_pop.index" 165 | ] 166 | }, 167 | { 168 | "cell_type": "markdown", 169 | "metadata": {}, 170 | "source": [ 171 | "But, in contrast to lists, we can explicitly define the index:" 172 | ] 173 | }, 174 | { 175 | "cell_type": "code", 176 | "execution_count": null, 177 | "metadata": {}, 178 | "outputs": [], 179 | "source": [ 180 | "g7_pop.index = [\n", 181 | " 'Canada',\n", 182 | " 'France',\n", 183 | " 'Germany',\n", 184 | " 'Italy',\n", 185 | " 'Japan',\n", 186 | " 'United Kingdom',\n", 187 | " 'United States',\n", 188 | "]" 189 | ] 190 | }, 191 | { 192 | "cell_type": "code", 193 | "execution_count": null, 194 | "metadata": {}, 195 | "outputs": [], 196 | "source": [ 197 | "g7_pop" 198 | ] 199 | }, 200 | { 201 | "cell_type": "markdown", 202 | "metadata": {}, 203 | "source": [ 204 | "Compare it with the [following table](https://docs.google.com/spreadsheets/d/1IlorV2-Oh9Da1JAZ7weVw86PQrQydSMp-ydVMH135iI/edit?usp=sharing): \n", 205 | "\n", 206 | "\n", 207 | "\n", 208 | "We can say that Series look like \"ordered dictionaries\". 
We can actually create Series out of dictionaries:" 209 | ] 210 | }, 211 | { 212 | "cell_type": "code", 213 | "execution_count": null, 214 | "metadata": { 215 | "scrolled": true 216 | }, 217 | "outputs": [], 218 | "source": [ 219 | "pd.Series({\n", 220 | " 'Canada': 35.467,\n", 221 | " 'France': 63.951,\n", 222 | " 'Germany': 80.94,\n", 223 | " 'Italy': 60.665,\n", 224 | " 'Japan': 127.061,\n", 225 | " 'United Kingdom': 64.511,\n", 226 | " 'United States': 318.523\n", 227 | "}, name='G7 Population in millions')" 228 | ] 229 | }, 230 | { 231 | "cell_type": "code", 232 | "execution_count": null, 233 | "metadata": {}, 234 | "outputs": [], 235 | "source": [ 236 | "pd.Series(\n", 237 | " [35.467, 63.951, 80.94, 60.665, 127.061, 64.511, 318.523],\n", 238 | " index=['Canada', 'France', 'Germany', 'Italy', 'Japan', 'United Kingdom',\n", 239 | " 'United States'],\n", 240 | " name='G7 Population in millions')" 241 | ] 242 | }, 243 | { 244 | "cell_type": "markdown", 245 | "metadata": {}, 246 | "source": [ 247 | "You can also create Series out of other series, specifying indexes:" 248 | ] 249 | }, 250 | { 251 | "cell_type": "code", 252 | "execution_count": null, 253 | "metadata": { 254 | "scrolled": false 255 | }, 256 | "outputs": [], 257 | "source": [ 258 | "pd.Series(g7_pop, index=['France', 'Germany', 'Italy', 'Spain'])" 259 | ] 260 | }, 261 | { 262 | "cell_type": "markdown", 263 | "metadata": {}, 264 | "source": [ 265 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 266 | "\n", 267 | "## Indexing\n", 268 | "\n", 269 | "Indexing works similarly to lists and dictionaries, you use the **index** of the element you're looking for:" 270 | ] 271 | }, 272 | { 273 | "cell_type": "code", 274 | "execution_count": null, 275 | "metadata": {}, 276 | "outputs": [], 277 | "source": [ 278 | "g7_pop['Canada']" 279 | ] 280 | }, 281 | { 282 | "cell_type": "code", 283 | "execution_count": null, 284 | "metadata": {}, 
285 | "outputs": [], 286 | "source": [ 287 | "g7_pop['Japan']" 288 | ] 289 | }, 290 | { 291 | "cell_type": "markdown", 292 | "metadata": {}, 293 | "source": [ 294 | "Numeric positions can also be used, with the `iloc` attribute:" 295 | ] 296 | }, 297 | { 298 | "cell_type": "code", 299 | "execution_count": null, 300 | "metadata": {}, 301 | "outputs": [], 302 | "source": [ 303 | "g7_pop.iloc[0]" 304 | ] 305 | }, 306 | { 307 | "cell_type": "code", 308 | "execution_count": null, 309 | "metadata": { 310 | "scrolled": true 311 | }, 312 | "outputs": [], 313 | "source": [ 314 | "g7_pop.iloc[-1]" 315 | ] 316 | }, 317 | { 318 | "cell_type": "markdown", 319 | "metadata": {}, 320 | "source": [ 321 | "Selecting multiple elements at once:" 322 | ] 323 | }, 324 | { 325 | "cell_type": "code", 326 | "execution_count": null, 327 | "metadata": { 328 | "scrolled": true 329 | }, 330 | "outputs": [], 331 | "source": [ 332 | "g7_pop[['Italy', 'France']]" 333 | ] 334 | }, 335 | { 336 | "cell_type": "markdown", 337 | "metadata": {}, 338 | "source": [ 339 | "_(The result is another Series)_" 340 | ] 341 | }, 342 | { 343 | "cell_type": "code", 344 | "execution_count": null, 345 | "metadata": { 346 | "scrolled": true 347 | }, 348 | "outputs": [], 349 | "source": [ 350 | "g7_pop.iloc[[0, 1]]" 351 | ] 352 | }, 353 | { 354 | "cell_type": "markdown", 355 | "metadata": {}, 356 | "source": [ 357 | "Slicing also works, but **important**, in Pandas, the upper limit is also included:" 358 | ] 359 | }, 360 | { 361 | "cell_type": "code", 362 | "execution_count": null, 363 | "metadata": { 364 | "scrolled": false 365 | }, 366 | "outputs": [], 367 | "source": [ 368 | "g7_pop['Canada': 'Italy']" 369 | ] 370 | }, 371 | { 372 | "cell_type": "markdown", 373 | "metadata": {}, 374 | "source": [ 375 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 376 | "\n", 377 | "## Conditional selection (boolean arrays)\n", 378 | "\n", 379 | "The same 
boolean array techniques we saw applied to numpy arrays can be used for Pandas `Series`:" 380 | ] 381 | }, 382 | { 383 | "cell_type": "code", 384 | "execution_count": null, 385 | "metadata": {}, 386 | "outputs": [], 387 | "source": [ 388 | "g7_pop" 389 | ] 390 | }, 391 | { 392 | "cell_type": "code", 393 | "execution_count": null, 394 | "metadata": {}, 395 | "outputs": [], 396 | "source": [ 397 | "g7_pop > 70" 398 | ] 399 | }, 400 | { 401 | "cell_type": "code", 402 | "execution_count": null, 403 | "metadata": {}, 404 | "outputs": [], 405 | "source": [ 406 | "g7_pop[g7_pop > 70]" 407 | ] 408 | }, 409 | { 410 | "cell_type": "code", 411 | "execution_count": null, 412 | "metadata": {}, 413 | "outputs": [], 414 | "source": [ 415 | "g7_pop.mean()" 416 | ] 417 | }, 418 | { 419 | "cell_type": "code", 420 | "execution_count": null, 421 | "metadata": {}, 422 | "outputs": [], 423 | "source": [ 424 | "g7_pop[g7_pop > g7_pop.mean()]" 425 | ] 426 | }, 427 | { 428 | "cell_type": "code", 429 | "execution_count": null, 430 | "metadata": {}, 431 | "outputs": [], 432 | "source": [ 433 | "g7_pop.std()" 434 | ] 435 | }, 436 | { 437 | "cell_type": "code", 438 | "execution_count": null, 439 | "metadata": { 440 | "scrolled": true 441 | }, 442 | "outputs": [], 443 | "source": [ 444 | "g7_pop[(g7_pop > g7_pop.mean() - g7_pop.std() / 2) | (g7_pop > g7_pop.mean() + g7_pop.std() / 2)]" 445 | ] 446 | }, 447 | { 448 | "cell_type": "markdown", 449 | "metadata": {}, 450 | "source": [ 451 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 452 | "\n", 453 | "## Operations and methods\n", 454 | "Series also support vectorized operations and aggregation functions as Numpy:" 455 | ] 456 | }, 457 | { 458 | "cell_type": "code", 459 | "execution_count": null, 460 | "metadata": {}, 461 | "outputs": [], 462 | "source": [ 463 | "g7_pop * 1_000_000" 464 | ] 465 | }, 466 | { 467 | "cell_type": "code", 468 | "execution_count": null, 469 | 
"metadata": {}, 470 | "outputs": [], 471 | "source": [ 472 | "g7_pop.mean()" 473 | ] 474 | }, 475 | { 476 | "cell_type": "code", 477 | "execution_count": null, 478 | "metadata": { 479 | "scrolled": true 480 | }, 481 | "outputs": [], 482 | "source": [ 483 | "np.log(g7_pop)" 484 | ] 485 | }, 486 | { 487 | "cell_type": "code", 488 | "execution_count": null, 489 | "metadata": { 490 | "scrolled": false 491 | }, 492 | "outputs": [], 493 | "source": [ 494 | "g7_pop['France': 'Italy'].mean()" 495 | ] 496 | }, 497 | { 498 | "cell_type": "markdown", 499 | "metadata": {}, 500 | "source": [ 501 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 502 | "\n", 503 | "## Boolean arrays\n", 504 | "(Work in the same way as numpy)" 505 | ] 506 | }, 507 | { 508 | "cell_type": "code", 509 | "execution_count": null, 510 | "metadata": {}, 511 | "outputs": [], 512 | "source": [ 513 | "g7_pop" 514 | ] 515 | }, 516 | { 517 | "cell_type": "code", 518 | "execution_count": null, 519 | "metadata": {}, 520 | "outputs": [], 521 | "source": [ 522 | "g7_pop > 80" 523 | ] 524 | }, 525 | { 526 | "cell_type": "code", 527 | "execution_count": null, 528 | "metadata": { 529 | "scrolled": true 530 | }, 531 | "outputs": [], 532 | "source": [ 533 | "g7_pop[g7_pop > 80]" 534 | ] 535 | }, 536 | { 537 | "cell_type": "code", 538 | "execution_count": null, 539 | "metadata": { 540 | "scrolled": true 541 | }, 542 | "outputs": [], 543 | "source": [ 544 | "g7_pop[(g7_pop > 80) | (g7_pop < 40)]" 545 | ] 546 | }, 547 | { 548 | "cell_type": "code", 549 | "execution_count": null, 550 | "metadata": { 551 | "scrolled": true 552 | }, 553 | "outputs": [], 554 | "source": [ 555 | "g7_pop[(g7_pop > 80) & (g7_pop < 200)]" 556 | ] 557 | }, 558 | { 559 | "cell_type": "markdown", 560 | "metadata": {}, 561 | "source": [ 562 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 563 | "\n", 
564 | "## Modifying series\n" 565 | ] 566 | }, 567 | { 568 | "cell_type": "code", 569 | "execution_count": null, 570 | "metadata": {}, 571 | "outputs": [], 572 | "source": [ 573 | "g7_pop['Canada'] = 40.5" 574 | ] 575 | }, 576 | { 577 | "cell_type": "code", 578 | "execution_count": null, 579 | "metadata": {}, 580 | "outputs": [], 581 | "source": [ 582 | "g7_pop" 583 | ] 584 | }, 585 | { 586 | "cell_type": "code", 587 | "execution_count": null, 588 | "metadata": {}, 589 | "outputs": [], 590 | "source": [ 591 | "g7_pop.iloc[-1] = 500" 592 | ] 593 | }, 594 | { 595 | "cell_type": "code", 596 | "execution_count": null, 597 | "metadata": { 598 | "scrolled": false 599 | }, 600 | "outputs": [], 601 | "source": [ 602 | "g7_pop" 603 | ] 604 | }, 605 | { 606 | "cell_type": "code", 607 | "execution_count": null, 608 | "metadata": {}, 609 | "outputs": [], 610 | "source": [ 611 | "g7_pop[g7_pop < 70] = 99.99" 612 | ] 613 | }, 614 | { 615 | "cell_type": "code", 616 | "execution_count": null, 617 | "metadata": { 618 | "scrolled": true 619 | }, 620 | "outputs": [], 621 | "source": [ 622 | "g7_pop" 623 | ] 624 | }, 625 | { 626 | "cell_type": "markdown", 627 | "metadata": {}, 628 | "source": [ 629 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n" 630 | ] 631 | } 632 | ], 633 | "metadata": { 634 | "kernelspec": { 635 | "display_name": "Python 3", 636 | "language": "python", 637 | "name": "python3" 638 | }, 639 | "language_info": { 640 | "codemirror_mode": { 641 | "name": "ipython", 642 | "version": 3 643 | }, 644 | "file_extension": ".py", 645 | "mimetype": "text/x-python", 646 | "name": "python", 647 | "nbconvert_exporter": "python", 648 | "pygments_lexer": "ipython3", 649 | "version": "3.7.4" 650 | } 651 | }, 652 | "nbformat": 4, 653 | "nbformat_minor": 2 654 | } 655 | -------------------------------------------------------------------------------- /.ipynb_checkpoints/3 - Pandas - DataFrames-checkpoint.ipynb: 
-------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "![rmotr](https://user-images.githubusercontent.com/7065401/52071918-bda15380-2562-11e9-828c-7f95297e4a82.png)\n", 8 | "
\n", 9 | "\n", 10 | "\n", 12 | "\n", 13 | "# Pandas - `DataFrame`s\n", 14 | "\n", 15 | "Probably the most important data structure of pandas is the `DataFrame`. It's a tabular structure tightly integrated with `Series`.\n" 16 | ] 17 | }, 18 | { 19 | "cell_type": "markdown", 20 | "metadata": {}, 21 | "source": [ 22 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n", 23 | "\n", 24 | "## Hands on! " 25 | ] 26 | }, 27 | { 28 | "cell_type": "code", 29 | "execution_count": null, 30 | "metadata": {}, 31 | "outputs": [], 32 | "source": [ 33 | "import numpy as np\n", 34 | "import pandas as pd" 35 | ] 36 | }, 37 | { 38 | "cell_type": "markdown", 39 | "metadata": {}, 40 | "source": [ 41 | "We'll continue our analysis of the G7 countries, now looking at DataFrames. As said, a DataFrame looks a lot like a table (like the one you can see [here](https://docs.google.com/spreadsheets/d/1IlorV2-Oh9Da1JAZ7weVw86PQrQydSMp-ydVMH135iI/edit?usp=sharing)):\n", 42 | "\n", 43 | "\n", 44 | "\n", 45 | "Creating `DataFrame`s manually can be tedious. Most of the time you'll be pulling the data from a database, a CSV file or the web. 
But still, you can create a DataFrame by specifying the columns and values:" 46 | ] 47 | }, 48 | { 49 | "cell_type": "code", 50 | "execution_count": null, 51 | "metadata": {}, 52 | "outputs": [], 53 | "source": [ 54 | "df = pd.DataFrame({\n", 55 | " 'Population': [35.467, 63.951, 80.94 , 60.665, 127.061, 64.511, 318.523],\n", 56 | " 'GDP': [\n", 57 | " 1785387,\n", 58 | " 2833687,\n", 59 | " 3874437,\n", 60 | " 2167744,\n", 61 | " 4602367,\n", 62 | " 2950039,\n", 63 | " 17348075\n", 64 | " ],\n", 65 | " 'Surface Area': [\n", 66 | " 9984670,\n", 67 | " 640679,\n", 68 | " 357114,\n", 69 | " 301336,\n", 70 | " 377930,\n", 71 | " 242495,\n", 72 | " 9525067\n", 73 | " ],\n", 74 | " 'HDI': [\n", 75 | " 0.913,\n", 76 | " 0.888,\n", 77 | " 0.916,\n", 78 | " 0.873,\n", 79 | " 0.891,\n", 80 | " 0.907,\n", 81 | " 0.915\n", 82 | " ],\n", 83 | " 'Continent': [\n", 84 | " 'America',\n", 85 | " 'Europe',\n", 86 | " 'Europe',\n", 87 | " 'Europe',\n", 88 | " 'Asia',\n", 89 | " 'Europe',\n", 90 | " 'America'\n", 91 | " ]\n", 92 | "}, columns=['Population', 'GDP', 'Surface Area', 'HDI', 'Continent'])" 93 | ] 94 | }, 95 | { 96 | "cell_type": "markdown", 97 | "metadata": {}, 98 | "source": [ 99 | "_(The `columns` attribute is optional. I'm using it to keep the same order as in the picture above)_" 100 | ] 101 | }, 102 | { 103 | "cell_type": "code", 104 | "execution_count": null, 105 | "metadata": { 106 | "scrolled": true 107 | }, 108 | "outputs": [], 109 | "source": [ 110 | "df" 111 | ] 112 | }, 113 | { 114 | "cell_type": "markdown", 115 | "metadata": {}, 116 | "source": [ 117 | "`DataFrame`s also have indexes. As you can see in the \"table\" above, pandas has assigned a numeric, autoincremental index automatically to each \"row\" in our DataFrame. 
In our case, we know that each row represents a country, so we'll just reassign the index:" 118 | ] 119 | }, 120 | { 121 | "cell_type": "code", 122 | "execution_count": null, 123 | "metadata": {}, 124 | "outputs": [], 125 | "source": [ 126 | "df.index = [\n", 127 | " 'Canada',\n", 128 | " 'France',\n", 129 | " 'Germany',\n", 130 | " 'Italy',\n", 131 | " 'Japan',\n", 132 | " 'United Kingdom',\n", 133 | " 'United States',\n", 134 | "]" 135 | ] 136 | }, 137 | { 138 | "cell_type": "code", 139 | "execution_count": null, 140 | "metadata": {}, 141 | "outputs": [], 142 | "source": [ 143 | "df" 144 | ] 145 | }, 146 | { 147 | "cell_type": "code", 148 | "execution_count": null, 149 | "metadata": {}, 150 | "outputs": [], 151 | "source": [ 152 | "df.columns" 153 | ] 154 | }, 155 | { 156 | "cell_type": "code", 157 | "execution_count": null, 158 | "metadata": {}, 159 | "outputs": [], 160 | "source": [ 161 | "df.index" 162 | ] 163 | }, 164 | { 165 | "cell_type": "code", 166 | "execution_count": null, 167 | "metadata": {}, 168 | "outputs": [], 169 | "source": [ 170 | "df.info()" 171 | ] 172 | }, 173 | { 174 | "cell_type": "code", 175 | "execution_count": null, 176 | "metadata": {}, 177 | "outputs": [], 178 | "source": [ 179 | "df.size" 180 | ] 181 | }, 182 | { 183 | "cell_type": "code", 184 | "execution_count": null, 185 | "metadata": {}, 186 | "outputs": [], 187 | "source": [ 188 | "df.shape" 189 | ] 190 | }, 191 | { 192 | "cell_type": "code", 193 | "execution_count": null, 194 | "metadata": { 195 | "scrolled": true 196 | }, 197 | "outputs": [], 198 | "source": [ 199 | "df.describe()" 200 | ] 201 | }, 202 | { 203 | "cell_type": "code", 204 | "execution_count": null, 205 | "metadata": {}, 206 | "outputs": [], 207 | "source": [ 208 | "df.dtypes" 209 | ] 210 | }, 211 | { 212 | "cell_type": "code", 213 | "execution_count": null, 214 | "metadata": { 215 | "scrolled": false 216 | }, 217 | "outputs": [], 218 | "source": [ 219 | "df.dtypes.value_counts()" 220 | ] 221 | }, 222 | { 223 | 
"cell_type": "markdown", 224 | "metadata": {}, 225 | "source": [ 226 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 227 | "\n", 228 | "## Indexing, Selection and Slicing\n", 229 | "\n", 230 | "Individual columns in the DataFrame can be selected with regular indexing. Each column is represented as a `Series`:" 231 | ] 232 | }, 233 | { 234 | "cell_type": "code", 235 | "execution_count": null, 236 | "metadata": { 237 | "scrolled": true 238 | }, 239 | "outputs": [], 240 | "source": [ 241 | "df['Population']" 242 | ] 243 | }, 244 | { 245 | "cell_type": "markdown", 246 | "metadata": {}, 247 | "source": [ 248 | "Note that the `index` of the returned Series is the same as the DataFrame one. And its `name` is the name of the column. If you're working on a notebook and want to see a more DataFrame-like format you can use the `to_frame` method:" 249 | ] 250 | }, 251 | { 252 | "cell_type": "code", 253 | "execution_count": null, 254 | "metadata": { 255 | "scrolled": false 256 | }, 257 | "outputs": [], 258 | "source": [ 259 | "df['Population'].to_frame()" 260 | ] 261 | }, 262 | { 263 | "cell_type": "markdown", 264 | "metadata": {}, 265 | "source": [ 266 | "Multiple columns can also be selected similarly to `numpy` and `Series`:" 267 | ] 268 | }, 269 | { 270 | "cell_type": "code", 271 | "execution_count": null, 272 | "metadata": { 273 | "scrolled": true 274 | }, 275 | "outputs": [], 276 | "source": [ 277 | "df[['Population', 'GDP']]" 278 | ] 279 | }, 280 | { 281 | "cell_type": "markdown", 282 | "metadata": {}, 283 | "source": [ 284 | "In this case, the result is another `DataFrame`. 
Slicing works differently, it acts at \"row level\", and can be counter intuitive:" 285 | ] 286 | }, 287 | { 288 | "cell_type": "code", 289 | "execution_count": null, 290 | "metadata": { 291 | "scrolled": false 292 | }, 293 | "outputs": [], 294 | "source": [ 295 | "df[1:3]" 296 | ] 297 | }, 298 | { 299 | "cell_type": "markdown", 300 | "metadata": {}, 301 | "source": [ 302 | "Row level selection works better with `loc` and `iloc` **which are recommended** over regular \"direct slicing\" (`df[:]`).\n", 303 | "\n", 304 | "`loc` selects rows matching the given index:" 305 | ] 306 | }, 307 | { 308 | "cell_type": "code", 309 | "execution_count": null, 310 | "metadata": {}, 311 | "outputs": [], 312 | "source": [ 313 | "df.loc['Italy']" 314 | ] 315 | }, 316 | { 317 | "cell_type": "code", 318 | "execution_count": null, 319 | "metadata": { 320 | "scrolled": true 321 | }, 322 | "outputs": [], 323 | "source": [ 324 | "df.loc['France': 'Italy']" 325 | ] 326 | }, 327 | { 328 | "cell_type": "markdown", 329 | "metadata": {}, 330 | "source": [ 331 | "As a second \"argument\", you can pass the column(s) you'd like to select:" 332 | ] 333 | }, 334 | { 335 | "cell_type": "code", 336 | "execution_count": null, 337 | "metadata": { 338 | "scrolled": false 339 | }, 340 | "outputs": [], 341 | "source": [ 342 | "df.loc['France': 'Italy', 'Population']" 343 | ] 344 | }, 345 | { 346 | "cell_type": "code", 347 | "execution_count": null, 348 | "metadata": { 349 | "scrolled": true 350 | }, 351 | "outputs": [], 352 | "source": [ 353 | "df.loc['France': 'Italy', ['Population', 'GDP']]" 354 | ] 355 | }, 356 | { 357 | "cell_type": "markdown", 358 | "metadata": {}, 359 | "source": [ 360 | "`iloc` works with the (numeric) \"position\" of the index:" 361 | ] 362 | }, 363 | { 364 | "cell_type": "code", 365 | "execution_count": null, 366 | "metadata": {}, 367 | "outputs": [], 368 | "source": [ 369 | "df" 370 | ] 371 | }, 372 | { 373 | "cell_type": "code", 374 | "execution_count": null, 375 | "metadata": 
{}, 376 | "outputs": [], 377 | "source": [ 378 | "df.iloc[0]" 379 | ] 380 | }, 381 | { 382 | "cell_type": "code", 383 | "execution_count": null, 384 | "metadata": {}, 385 | "outputs": [], 386 | "source": [ 387 | "df.iloc[-1]" 388 | ] 389 | }, 390 | { 391 | "cell_type": "code", 392 | "execution_count": null, 393 | "metadata": { 394 | "scrolled": false 395 | }, 396 | "outputs": [], 397 | "source": [ 398 | "df.iloc[[0, 1, -1]]" 399 | ] 400 | }, 401 | { 402 | "cell_type": "code", 403 | "execution_count": null, 404 | "metadata": {}, 405 | "outputs": [], 406 | "source": [ 407 | "df.iloc[1:3]" 408 | ] 409 | }, 410 | { 411 | "cell_type": "code", 412 | "execution_count": null, 413 | "metadata": {}, 414 | "outputs": [], 415 | "source": [ 416 | "df.iloc[1:3, 3]" 417 | ] 418 | }, 419 | { 420 | "cell_type": "code", 421 | "execution_count": null, 422 | "metadata": {}, 423 | "outputs": [], 424 | "source": [ 425 | "df.iloc[1:3, [0, 3]]" 426 | ] 427 | }, 428 | { 429 | "cell_type": "code", 430 | "execution_count": null, 431 | "metadata": { 432 | "scrolled": true 433 | }, 434 | "outputs": [], 435 | "source": [ 436 | "df.iloc[1:3, 1:3]" 437 | ] 438 | }, 439 | { 440 | "cell_type": "markdown", 441 | "metadata": {}, 442 | "source": [ 443 | "> **RECOMMENDED: Always use `loc` and `iloc` to reduce ambiguity, especially with `DataFrame`s that have numeric indexes.**" 444 | ] 445 | }, 446 | { 447 | "cell_type": "markdown", 448 | "metadata": {}, 449 | "source": [ 450 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 451 | "\n", 452 | "## Conditional selection (boolean arrays)\n", 453 | "\n", 454 | "We saw conditional selection applied to `Series` and it'll work in the same way for `DataFrame`s. 
After all, a `DataFrame` is a collection of `Series`:" 455 | ] 456 | }, 457 | { 458 | "cell_type": "code", 459 | "execution_count": null, 460 | "metadata": {}, 461 | "outputs": [], 462 | "source": [ 463 | "df" 464 | ] 465 | }, 466 | { 467 | "cell_type": "code", 468 | "execution_count": null, 469 | "metadata": {}, 470 | "outputs": [], 471 | "source": [ 472 | "df['Population'] > 70" 473 | ] 474 | }, 475 | { 476 | "cell_type": "code", 477 | "execution_count": null, 478 | "metadata": {}, 479 | "outputs": [], 480 | "source": [ 481 | "df.loc[df['Population'] > 70]" 482 | ] 483 | }, 484 | { 485 | "cell_type": "markdown", 486 | "metadata": {}, 487 | "source": [ 488 | "The boolean matching is done at Index level, so you can filter by any row, as long as it contains the right indexes. Column selection still works as expected:" 489 | ] 490 | }, 491 | { 492 | "cell_type": "code", 493 | "execution_count": null, 494 | "metadata": {}, 495 | "outputs": [], 496 | "source": [ 497 | "df.loc[df['Population'] > 70, 'Population']" 498 | ] 499 | }, 500 | { 501 | "cell_type": "code", 502 | "execution_count": null, 503 | "metadata": {}, 504 | "outputs": [], 505 | "source": [ 506 | "df.loc[df['Population'] > 70, ['Population', 'GDP']]" 507 | ] 508 | }, 509 | { 510 | "cell_type": "markdown", 511 | "metadata": {}, 512 | "source": [ 513 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 514 | "\n", 515 | "## Dropping stuff\n", 516 | "\n", 517 | "Opposed to the concept of selection, we have \"dropping\". 
Instead of pointing out which values you'd like to _select_, you can point out which ones you'd like to `drop`:" 518 | ] 519 | }, 520 | { 521 | "cell_type": "code", 522 | "execution_count": null, 523 | "metadata": {}, 524 | "outputs": [], 525 | "source": [ 526 | "df.drop('Canada')" 527 | ] 528 | }, 529 | { 530 | "cell_type": "code", 531 | "execution_count": null, 532 | "metadata": {}, 533 | "outputs": [], 534 | "source": [ 535 | "df.drop(['Canada', 'Japan'])" 536 | ] 537 | }, 538 | { 539 | "cell_type": "code", 540 | "execution_count": null, 541 | "metadata": {}, 542 | "outputs": [], 543 | "source": [ 544 | "df.drop(columns=['Population', 'HDI'])" 545 | ] 546 | }, 547 | { 548 | "cell_type": "code", 549 | "execution_count": null, 550 | "metadata": { 551 | "scrolled": false 552 | }, 553 | "outputs": [], 554 | "source": [ 555 | "df.drop(['Italy', 'Canada'], axis=0)" 556 | ] 557 | }, 558 | { 559 | "cell_type": "code", 560 | "execution_count": null, 561 | "metadata": { 562 | "scrolled": false 563 | }, 564 | "outputs": [], 565 | "source": [ 566 | "df.drop(['Population', 'HDI'], axis=1)" 567 | ] 568 | }, 569 | { 570 | "cell_type": "code", 571 | "execution_count": null, 572 | "metadata": { 573 | "scrolled": true 574 | }, 575 | "outputs": [], 576 | "source": [ 577 | "df.drop(['Population', 'HDI'], axis=1)" 578 | ] 579 | }, 580 | { 581 | "cell_type": "code", 582 | "execution_count": null, 583 | "metadata": {}, 584 | "outputs": [], 585 | "source": [ 586 | "df.drop(['Population', 'HDI'], axis='columns')" 587 | ] 588 | }, 589 | { 590 | "cell_type": "code", 591 | "execution_count": null, 592 | "metadata": {}, 593 | "outputs": [], 594 | "source": [ 595 | "df.drop(['Canada', 'Germany'], axis='rows')" 596 | ] 597 | }, 598 | { 599 | "cell_type": "markdown", 600 | "metadata": {}, 601 | "source": [ 602 | "All these `drop` calls return a new `DataFrame` and leave the original untouched. If you'd like to modify it \"in place\", you can use the `inplace` parameter (there's an example below)." 
603 | ] 604 | }, 605 | { 606 | "cell_type": "markdown", 607 | "metadata": {}, 608 | "source": [ 609 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 610 | "\n", 611 | "## Operations" 612 | ] 613 | }, 614 | { 615 | "cell_type": "code", 616 | "execution_count": null, 617 | "metadata": { 618 | "scrolled": false 619 | }, 620 | "outputs": [], 621 | "source": [ 622 | "df[['Population', 'GDP']]" 623 | ] 624 | }, 625 | { 626 | "cell_type": "code", 627 | "execution_count": null, 628 | "metadata": {}, 629 | "outputs": [], 630 | "source": [ 631 | "df[['Population', 'GDP']] / 100" 632 | ] 633 | }, 634 | { 635 | "cell_type": "markdown", 636 | "metadata": {}, 637 | "source": [ 638 | "**Operations with Series** align the Series index with the DataFrame's columns and broadcast down the rows (which can be counterintuitive)." 639 | ] 640 | }, 641 | { 642 | "cell_type": "code", 643 | "execution_count": null, 644 | "metadata": {}, 645 | "outputs": [], 646 | "source": [ 647 | "crisis = pd.Series([-1_000_000, -0.3], index=['GDP', 'HDI'])" 648 | ] 649 | }, 650 | { 651 | "cell_type": "code", 652 | "execution_count": null, 653 | "metadata": {}, 654 | "outputs": [], 655 | "source": [ 656 | "df[['GDP', 'HDI']] + crisis" 657 | ] 658 | }, 659 | { 660 | "cell_type": "markdown", 661 | "metadata": {}, 662 | "source": [ 663 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 664 | "\n", 665 | "## Modifying DataFrames\n", 666 | "\n", 667 | "It's simple and intuitive: you can add columns, or replace the values of existing columns, without issues:" 668 | ] 669 | }, 670 | { 671 | "cell_type": "markdown", 672 | "metadata": {}, 673 | "source": [ 674 | "### Adding a new column" 675 | ] 676 | }, 677 | { 678 | "cell_type": "code", 679 | "execution_count": null, 680 | "metadata": {}, 681 | "outputs": [], 682 | "source": [ 683 | "langs = pd.Series(\n", 684 | " ['French', 'German', 'Italian'],\n", 685 | 
" index=['France', 'Germany', 'Italy'],\n", 686 | " name='Language'\n", 687 | ")" 688 | ] 689 | }, 690 | { 691 | "cell_type": "code", 692 | "execution_count": null, 693 | "metadata": {}, 694 | "outputs": [], 695 | "source": [ 696 | "df['Language'] = langs" 697 | ] 698 | }, 699 | { 700 | "cell_type": "code", 701 | "execution_count": null, 702 | "metadata": { 703 | "scrolled": false 704 | }, 705 | "outputs": [], 706 | "source": [ 707 | "df" 708 | ] 709 | }, 710 | { 711 | "cell_type": "markdown", 712 | "metadata": {}, 713 | "source": [ 714 | "---\n", 715 | "### Replacing values per column" 716 | ] 717 | }, 718 | { 719 | "cell_type": "code", 720 | "execution_count": null, 721 | "metadata": {}, 722 | "outputs": [], 723 | "source": [ 724 | "df['Language'] = 'English'" 725 | ] 726 | }, 727 | { 728 | "cell_type": "code", 729 | "execution_count": null, 730 | "metadata": {}, 731 | "outputs": [], 732 | "source": [ 733 | "df" 734 | ] 735 | }, 736 | { 737 | "cell_type": "markdown", 738 | "metadata": {}, 739 | "source": [ 740 | "---\n", 741 | "### Renaming Columns\n" 742 | ] 743 | }, 744 | { 745 | "cell_type": "code", 746 | "execution_count": null, 747 | "metadata": { 748 | "scrolled": false 749 | }, 750 | "outputs": [], 751 | "source": [ 752 | "df.rename(\n", 753 | " columns={\n", 754 | " 'HDI': 'Human Development Index',\n", 755 | " 'Anual Popcorn Consumption': 'APC'\n", 756 | " }, index={\n", 757 | " 'United States': 'USA',\n", 758 | " 'United Kingdom': 'UK',\n", 759 | " 'Argentina': 'AR'\n", 760 | " })" 761 | ] 762 | }, 763 | { 764 | "cell_type": "code", 765 | "execution_count": null, 766 | "metadata": { 767 | "scrolled": true 768 | }, 769 | "outputs": [], 770 | "source": [ 771 | "df.rename(index=str.upper)" 772 | ] 773 | }, 774 | { 775 | "cell_type": "code", 776 | "execution_count": null, 777 | "metadata": { 778 | "scrolled": true 779 | }, 780 | "outputs": [], 781 | "source": [ 782 | "df.rename(index=lambda x: x.lower())" 783 | ] 784 | }, 785 | { 786 | "cell_type": 
"markdown", 787 | "metadata": {}, 788 | "source": [ 789 | "---\n", 790 | "### Dropping columns" 791 | ] 792 | }, 793 | { 794 | "cell_type": "code", 795 | "execution_count": null, 796 | "metadata": {}, 797 | "outputs": [], 798 | "source": [ 799 | "df.drop(columns='Language', inplace=True)" 800 | ] 801 | }, 802 | { 803 | "cell_type": "markdown", 804 | "metadata": {}, 805 | "source": [ 806 | "---\n", 807 | "### Adding values" 808 | ] 809 | }, 810 | { 811 | "cell_type": "code", 812 | "execution_count": null, 813 | "metadata": { 814 | "scrolled": false 815 | }, 816 | "outputs": [], 817 | "source": [ 818 | "df.append(pd.Series({\n", 819 | " 'Population': 3,\n", 820 | " 'GDP': 5\n", 821 | "}, name='China'))" 822 | ] 823 | }, 824 | { 825 | "cell_type": "markdown", 826 | "metadata": {}, 827 | "source": [ 828 | "Append returns a new `DataFrame`:" 829 | ] 830 | }, 831 | { 832 | "cell_type": "code", 833 | "execution_count": null, 834 | "metadata": { 835 | "scrolled": false 836 | }, 837 | "outputs": [], 838 | "source": [ 839 | "df" 840 | ] 841 | }, 842 | { 843 | "cell_type": "markdown", 844 | "metadata": {}, 845 | "source": [ 846 | "You can directly set the new index and values to the `DataFrame`:" 847 | ] 848 | }, 849 | { 850 | "cell_type": "code", 851 | "execution_count": null, 852 | "metadata": {}, 853 | "outputs": [], 854 | "source": [ 855 | "df.loc['China'] = pd.Series({'Population': 1_400_000_000, 'Continent': 'Asia'})" 856 | ] 857 | }, 858 | { 859 | "cell_type": "code", 860 | "execution_count": null, 861 | "metadata": { 862 | "scrolled": true 863 | }, 864 | "outputs": [], 865 | "source": [ 866 | "df" 867 | ] 868 | }, 869 | { 870 | "cell_type": "markdown", 871 | "metadata": {}, 872 | "source": [ 873 | "We can use `drop` to just remove a row by index:" 874 | ] 875 | }, 876 | { 877 | "cell_type": "code", 878 | "execution_count": null, 879 | "metadata": {}, 880 | "outputs": [], 881 | "source": [ 882 | "df.drop('China', inplace=True)" 883 | ] 884 | }, 885 | { 886 | 
"cell_type": "code", 887 | "execution_count": null, 888 | "metadata": { 889 | "scrolled": false 890 | }, 891 | "outputs": [], 892 | "source": [ 893 | "df" 894 | ] 895 | }, 896 | { 897 | "cell_type": "markdown", 898 | "metadata": {}, 899 | "source": [ 900 | "---\n", 901 | "### More radical index changes" 902 | ] 903 | }, 904 | { 905 | "cell_type": "code", 906 | "execution_count": null, 907 | "metadata": {}, 908 | "outputs": [], 909 | "source": [ 910 | "df.reset_index()" 911 | ] 912 | }, 913 | { 914 | "cell_type": "code", 915 | "execution_count": null, 916 | "metadata": { 917 | "scrolled": true 918 | }, 919 | "outputs": [], 920 | "source": [ 921 | "df.set_index('Population')" 922 | ] 923 | }, 924 | { 925 | "cell_type": "markdown", 926 | "metadata": {}, 927 | "source": [ 928 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 929 | "\n", 930 | "## Creating columns from other columns\n", 931 | "\n", 932 | "Altering a DataFrame often involves combining different columns into another. For example, in our Countries analysis, we could try to calculate the \"GDP per capita\", which is just, `GDP / Population`." 
933 | ] 934 | }, 935 | { 936 | "cell_type": "code", 937 | "execution_count": null, 938 | "metadata": { 939 | "scrolled": true 940 | }, 941 | "outputs": [], 942 | "source": [ 943 | "df[['Population', 'GDP']]" 944 | ] 945 | }, 946 | { 947 | "cell_type": "markdown", 948 | "metadata": {}, 949 | "source": [ 950 | "The regular pandas way of expressing that is to divide one series by the other:" 951 | ] 952 | }, 953 | { 954 | "cell_type": "code", 955 | "execution_count": null, 956 | "metadata": {}, 957 | "outputs": [], 958 | "source": [ 959 | "df['GDP'] / df['Population']" 960 | ] 961 | }, 962 | { 963 | "cell_type": "markdown", 964 | "metadata": {}, 965 | "source": [ 966 | "The result of that operation is just another series that you can add to the original `DataFrame`:" 967 | ] 968 | }, 969 | { 970 | "cell_type": "code", 971 | "execution_count": null, 972 | "metadata": {}, 973 | "outputs": [], 974 | "source": [ 975 | "df['GDP Per Capita'] = df['GDP'] / df['Population']" 976 | ] 977 | }, 978 | { 979 | "cell_type": "code", 980 | "execution_count": null, 981 | "metadata": {}, 982 | "outputs": [], 983 | "source": [ 984 | "df" 985 | ] 986 | }, 987 | { 988 | "cell_type": "markdown", 989 | "metadata": {}, 990 | "source": [ 991 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 992 | "\n", 993 | "## Statistical info\n", 994 | "\n", 995 | "You've already seen the `describe` method, which gives you a good \"summary\" of the `DataFrame`. 
Let's explore other methods in more detail:" 996 | ] 997 | }, 998 | { 999 | "cell_type": "code", 1000 | "execution_count": null, 1001 | "metadata": {}, 1002 | "outputs": [], 1003 | "source": [ 1004 | "df.head()" 1005 | ] 1006 | }, 1007 | { 1008 | "cell_type": "code", 1009 | "execution_count": null, 1010 | "metadata": {}, 1011 | "outputs": [], 1012 | "source": [ 1013 | "df.describe()" 1014 | ] 1015 | }, 1016 | { 1017 | "cell_type": "code", 1018 | "execution_count": null, 1019 | "metadata": {}, 1020 | "outputs": [], 1021 | "source": [ 1022 | "population = df['Population']" 1023 | ] 1024 | }, 1025 | { 1026 | "cell_type": "code", 1027 | "execution_count": null, 1028 | "metadata": {}, 1029 | "outputs": [], 1030 | "source": [ 1031 | "population.min(), population.max()" 1032 | ] 1033 | }, 1034 | { 1035 | "cell_type": "code", 1036 | "execution_count": null, 1037 | "metadata": {}, 1038 | "outputs": [], 1039 | "source": [ 1040 | "population.sum()" 1041 | ] 1042 | }, 1043 | { 1044 | "cell_type": "code", 1045 | "execution_count": null, 1046 | "metadata": {}, 1047 | "outputs": [], 1048 | "source": [ 1049 | "population.sum() / len(population)" 1050 | ] 1051 | }, 1052 | { 1053 | "cell_type": "code", 1054 | "execution_count": null, 1055 | "metadata": { 1056 | "scrolled": true 1057 | }, 1058 | "outputs": [], 1059 | "source": [ 1060 | "population.mean()" 1061 | ] 1062 | }, 1063 | { 1064 | "cell_type": "code", 1065 | "execution_count": null, 1066 | "metadata": {}, 1067 | "outputs": [], 1068 | "source": [ 1069 | "population.std()" 1070 | ] 1071 | }, 1072 | { 1073 | "cell_type": "code", 1074 | "execution_count": null, 1075 | "metadata": {}, 1076 | "outputs": [], 1077 | "source": [ 1078 | "population.median()" 1079 | ] 1080 | }, 1081 | { 1082 | "cell_type": "code", 1083 | "execution_count": null, 1084 | "metadata": { 1085 | "scrolled": true 1086 | }, 1087 | "outputs": [], 1088 | "source": [ 1089 | "population.describe()" 1090 | ] 1091 | }, 1092 | { 1093 | "cell_type": "code", 1094 | 
"execution_count": null, 1095 | "metadata": { 1096 | "scrolled": true 1097 | }, 1098 | "outputs": [], 1099 | "source": [ 1100 | "population.quantile(.25)" 1101 | ] 1102 | }, 1103 | { 1104 | "cell_type": "code", 1105 | "execution_count": null, 1106 | "metadata": { 1107 | "scrolled": false 1108 | }, 1109 | "outputs": [], 1110 | "source": [ 1111 | "population.quantile([.2, .4, .6, .8, 1])" 1112 | ] 1113 | }, 1114 | { 1115 | "cell_type": "markdown", 1116 | "metadata": {}, 1117 | "source": [ 1118 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n" 1119 | ] 1120 | } 1121 | ], 1122 | "metadata": { 1123 | "kernelspec": { 1124 | "display_name": "Python 3", 1125 | "language": "python", 1126 | "name": "python3" 1127 | }, 1128 | "language_info": { 1129 | "codemirror_mode": { 1130 | "name": "ipython", 1131 | "version": 3 1132 | }, 1133 | "file_extension": ".py", 1134 | "mimetype": "text/x-python", 1135 | "name": "python", 1136 | "nbconvert_exporter": "python", 1137 | "pygments_lexer": "ipython3", 1138 | "version": "3.7.4" 1139 | } 1140 | }, 1141 | "nbformat": 4, 1142 | "nbformat_minor": 2 1143 | } 1144 | -------------------------------------------------------------------------------- /.ipynb_checkpoints/4 - Pandas DataFrames exercises-checkpoint.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "![rmotr](https://user-images.githubusercontent.com/7065401/52071918-bda15380-2562-11e9-828c-7f95297e4a82.png)\n", 8 | "
\n", 9 | "\n", 10 | "# Pandas DataFrame exercises\n" 11 | ] 12 | }, 13 | { 14 | "cell_type": "code", 15 | "execution_count": null, 16 | "metadata": {}, 17 | "outputs": [], 18 | "source": [ 19 | "# Import the numpy package under the name np\n", 20 | "import numpy as np\n", 21 | "\n", 22 | "# Import the pandas package under the name pd\n", 23 | "import pandas as pd\n", 24 | "\n", 25 | "# Import the matplotlib package under the name plt\n", 26 | "import matplotlib.pyplot as plt\n", 27 | "%matplotlib inline\n", 28 | "\n", 29 | "# Print the pandas version\n", 30 | "print(pd.__version__)" 31 | ] 32 | }, 33 | { 34 | "cell_type": "markdown", 35 | "metadata": {}, 36 | "source": [ 37 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n", 38 | "\n", 39 | "## DataFrame creation" 40 | ] 41 | }, 42 | { 43 | "cell_type": "markdown", 44 | "metadata": {}, 45 | "source": [ 46 | "### Create an empty pandas DataFrame\n" 47 | ] 48 | }, 49 | { 50 | "cell_type": "code", 51 | "execution_count": null, 52 | "metadata": {}, 53 | "outputs": [], 54 | "source": [ 55 | "# your code goes here\n" 56 | ] 57 | }, 58 | { 59 | "cell_type": "code", 60 | "execution_count": null, 61 | "metadata": { 62 | "cell_type": "solution" 63 | }, 64 | "outputs": [], 65 | "source": [ 66 | "# An empty DataFrame has no rows and no columns\n", 67 | "empty_df = pd.DataFrame()\n", 68 | "empty_df" 69 | ] 70 | }, 71 | { 72 | "cell_type": "markdown", 73 | "metadata": {}, 74 | "source": [ 75 | "" 76 | ] 77 | }, 78 | { 79 | "cell_type": "markdown", 80 | "metadata": {}, 81 | "source": [ 82 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 83 | "\n", 84 | "### Create a `marvel_df` pandas DataFrame with the given marvel data\n" 85 | ] 86 | }, 87 | { 88 | "cell_type": "code", 89 | "execution_count": null, 90 | "metadata": {}, 91 | "outputs": [], 92 | "source": [ 93 | "marvel_data = [\n", 94 
| " ['Spider-Man', 'male', 1962],\n", 95 | " ['Captain America', 'male', 1941],\n", 96 | " ['Wolverine', 'male', 1974],\n", 97 | " ['Iron Man', 'male', 1963],\n", 98 | " ['Thor', 'male', 1963],\n", 99 | " ['Thing', 'male', 1961],\n", 100 | " ['Mister Fantastic', 'male', 1961],\n", 101 | " ['Hulk', 'male', 1962],\n", 102 | " ['Beast', 'male', 1963],\n", 103 | " ['Invisible Woman', 'female', 1961],\n", 104 | " ['Storm', 'female', 1975],\n", 105 | " ['Namor', 'male', 1939],\n", 106 | " ['Hawkeye', 'male', 1964],\n", 107 | " ['Daredevil', 'male', 1964],\n", 108 | " ['Doctor Strange', 'male', 1963],\n", 109 | " ['Hank Pym', 'male', 1962],\n", 110 | " ['Scarlet Witch', 'female', 1964],\n", 111 | " ['Wasp', 'female', 1963],\n", 112 | " ['Black Widow', 'female', 1964],\n", 113 | " ['Vision', 'male', 1968]\n", 114 | "]" 115 | ] 116 | }, 117 | { 118 | "cell_type": "code", 119 | "execution_count": null, 120 | "metadata": {}, 121 | "outputs": [], 122 | "source": [ 123 | "# your code goes here\n" 124 | ] 125 | }, 126 | { 127 | "cell_type": "code", 128 | "execution_count": null, 129 | "metadata": { 130 | "cell_type": "solution" 131 | }, 132 | "outputs": [], 133 | "source": [ 134 | "marvel_df = pd.DataFrame(data=marvel_data)\n", 135 | "\n", 136 | "marvel_df" 137 | ] 138 | }, 139 | { 140 | "cell_type": "markdown", 141 | "metadata": {}, 142 | "source": [ 143 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 144 | "\n", 145 | "### Add column names to the `marvel_df`\n", 146 | " " 147 | ] 148 | }, 149 | { 150 | "cell_type": "code", 151 | "execution_count": null, 152 | "metadata": {}, 153 | "outputs": [], 154 | "source": [ 155 | "# your code goes here\n" 156 | ] 157 | }, 158 | { 159 | "cell_type": "code", 160 | "execution_count": null, 161 | "metadata": { 162 | "cell_type": "solution" 163 | }, 164 | "outputs": [], 165 | "source": [ 166 | "col_names = ['name', 'sex', 'first_appearance']\n", 167 | "\n", 168 | 
"marvel_df.columns = col_names\n", 169 | "marvel_df" 170 | ] 171 | }, 172 | { 173 | "cell_type": "markdown", 174 | "metadata": {}, 175 | "source": [ 176 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 177 | "\n", 178 | "### Add index names to the `marvel_df` (use the character name as index)\n" 179 | ] 180 | }, 181 | { 182 | "cell_type": "code", 183 | "execution_count": null, 184 | "metadata": {}, 185 | "outputs": [], 186 | "source": [ 187 | "# your code goes here\n" 188 | ] 189 | }, 190 | { 191 | "cell_type": "code", 192 | "execution_count": null, 193 | "metadata": { 194 | "cell_type": "solution" 195 | }, 196 | "outputs": [], 197 | "source": [ 198 | "marvel_df.index = marvel_df['name']\n", 199 | "marvel_df" 200 | ] 201 | }, 202 | { 203 | "cell_type": "markdown", 204 | "metadata": {}, 205 | "source": [ 206 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 207 | "\n", 208 | "### Drop the name column as it's now the index" 209 | ] 210 | }, 211 | { 212 | "cell_type": "code", 213 | "execution_count": null, 214 | "metadata": {}, 215 | "outputs": [], 216 | "source": [ 217 | "# your code goes here\n" 218 | ] 219 | }, 220 | { 221 | "cell_type": "code", 222 | "execution_count": null, 223 | "metadata": { 224 | "cell_type": "solution" 225 | }, 226 | "outputs": [], 227 | "source": [ 228 | "#marvel_df = marvel_df.drop(columns=['name'])\n", 229 | "marvel_df = marvel_df.drop(['name'], axis=1)\n", 230 | "marvel_df" 231 | ] 232 | }, 233 | { 234 | "cell_type": "markdown", 235 | "metadata": {}, 236 | "source": [ 237 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 238 | "\n", 239 | "### Drop 'Namor' and 'Hank Pym' rows\n" 240 | ] 241 | }, 242 | { 243 | "cell_type": "code", 244 | "execution_count": null, 245 | "metadata": {}, 246 | "outputs": [], 247 | 
"source": [ 248 | "# your code goes here\n" 249 | ] 250 | }, 251 | { 252 | "cell_type": "code", 253 | "execution_count": null, 254 | "metadata": { 255 | "cell_type": "solution", 256 | "scrolled": false 257 | }, 258 | "outputs": [], 259 | "source": [ 260 | "marvel_df = marvel_df.drop(['Namor', 'Hank Pym'], axis=0)\n", 261 | "marvel_df" 262 | ] 263 | }, 264 | { 265 | "cell_type": "markdown", 266 | "metadata": {}, 267 | "source": [ 268 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n", 269 | "\n", 270 | "## DataFrame selection, slicing and indexation" 271 | ] 272 | }, 273 | { 274 | "cell_type": "markdown", 275 | "metadata": {}, 276 | "source": [ 277 | "### Show the first 5 elements on `marvel_df`\n", 278 | " " 279 | ] 280 | }, 281 | { 282 | "cell_type": "code", 283 | "execution_count": null, 284 | "metadata": {}, 285 | "outputs": [], 286 | "source": [ 287 | "# your code goes here\n" 288 | ] 289 | }, 290 | { 291 | "cell_type": "code", 292 | "execution_count": null, 293 | "metadata": { 294 | "cell_type": "solution" 295 | }, 296 | "outputs": [], 297 | "source": [ 298 | "#marvel_df.loc[['Spider-Man', 'Captain America', 'Wolverine', 'Iron Man', 'Thor'], :] # bad!\n", 299 | "#marvel_df.loc['Spider-Man': 'Thor', :]\n", 300 | "#marvel_df.iloc[0:5, :]\n", 301 | "#marvel_df.iloc[0:5,]\n", 302 | "marvel_df.iloc[:5,]\n", 303 | "#marvel_df.head()" 304 | ] 305 | }, 306 | { 307 | "cell_type": "markdown", 308 | "metadata": {}, 309 | "source": [ 310 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 311 | "\n", 312 | "### Show the last 5 elements on `marvel_df`\n" 313 | ] 314 | }, 315 | { 316 | "cell_type": "code", 317 | "execution_count": null, 318 | "metadata": {}, 319 | "outputs": [], 320 | "source": [ 321 | "# your code goes here\n" 322 | ] 323 | }, 324 | { 325 | "cell_type": "code", 326 | "execution_count": null, 327 | "metadata": { 
328 | "cell_type": "solution" 329 | }, 330 | "outputs": [], 331 | "source": [ 332 | "#marvel_df.loc[['Hank Pym', 'Scarlet Witch', 'Wasp', 'Black Widow', 'Vision'], :] # bad!\n", 333 | "#marvel_df.loc['Hank Pym':'Vision', :]\n", 334 | "marvel_df.iloc[-5:,]\n", 335 | "#marvel_df.tail()" 336 | ] 337 | }, 338 | { 339 | "cell_type": "markdown", 340 | "metadata": {}, 341 | "source": [ 342 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 343 | "\n", 344 | "### Show just the sex of the first 5 elements on `marvel_df`" 345 | ] 346 | }, 347 | { 348 | "cell_type": "code", 349 | "execution_count": null, 350 | "metadata": {}, 351 | "outputs": [], 352 | "source": [ 353 | "# your code goes here\n" 354 | ] 355 | }, 356 | { 357 | "cell_type": "code", 358 | "execution_count": null, 359 | "metadata": { 360 | "cell_type": "solution" 361 | }, 362 | "outputs": [], 363 | "source": [ 364 | "#marvel_df.iloc[:5,]['sex'].to_frame()\n", 365 | "marvel_df.iloc[:5,].sex.to_frame()\n", 366 | "#marvel_df.head().sex.to_frame()" 367 | ] 368 | }, 369 | { 370 | "cell_type": "markdown", 371 | "metadata": {}, 372 | "source": [ 373 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 374 | "\n", 375 | "### Show the first_appearance of all middle elements on `marvel_df` " 376 | ] 377 | }, 378 | { 379 | "cell_type": "code", 380 | "execution_count": null, 381 | "metadata": {}, 382 | "outputs": [], 383 | "source": [ 384 | "# your code goes here\n" 385 | ] 386 | }, 387 | { 388 | "cell_type": "code", 389 | "execution_count": null, 390 | "metadata": { 391 | "cell_type": "solution", 392 | "scrolled": false 393 | }, 394 | "outputs": [], 395 | "source": [ 396 | "marvel_df.iloc[1:-1,].first_appearance.to_frame()" 397 | ] 398 | }, 399 | { 400 | "cell_type": "markdown", 401 | "metadata": {}, 402 | "source": [ 403 | 
"![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 404 | "\n", 405 | "### Show the first and last elements on `marvel_df`\n" 406 | ] 407 | }, 408 | { 409 | "cell_type": "code", 410 | "execution_count": null, 411 | "metadata": {}, 412 | "outputs": [], 413 | "source": [ 414 | "# your code goes here\n" 415 | ] 416 | }, 417 | { 418 | "cell_type": "code", 419 | "execution_count": null, 420 | "metadata": { 421 | "cell_type": "solution" 422 | }, 423 | "outputs": [], 424 | "source": [ 425 | "#marvel_df.iloc[[0, -1],][['sex', 'first_appearance']]\n", 426 | "marvel_df.iloc[[0, -1],]" 427 | ] 428 | }, 429 | { 430 | "cell_type": "markdown", 431 | "metadata": {}, 432 | "source": [ 433 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n", 434 | "\n", 435 | "## DataFrame manipulation and operations" 436 | ] 437 | }, 438 | { 439 | "cell_type": "markdown", 440 | "metadata": {}, 441 | "source": [ 442 | "### Modify the `first_appearance` of 'Vision' to year 1964" 443 | ] 444 | }, 445 | { 446 | "cell_type": "code", 447 | "execution_count": null, 448 | "metadata": {}, 449 | "outputs": [], 450 | "source": [ 451 | "# your code goes here\n" 452 | ] 453 | }, 454 | { 455 | "cell_type": "code", 456 | "execution_count": null, 457 | "metadata": { 458 | "cell_type": "solution" 459 | }, 460 | "outputs": [], 461 | "source": [ 462 | "marvel_df.loc['Vision', 'first_appearance'] = 1964\n", 463 | "\n", 464 | "marvel_df" 465 | ] 466 | }, 467 | { 468 | "cell_type": "markdown", 469 | "metadata": {}, 470 | "source": [ 471 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 472 | "\n", 473 | "### Add a new column to `marvel_df` called 'years_since' with the years since `first_appearance`\n" 474 | ] 475 | }, 476 | { 477 | "cell_type": "code", 478 | "execution_count": null, 479 | "metadata": 
{}, 480 | "outputs": [], 481 | "source": [ 482 | "# your code goes here\n" 483 | ] 484 | }, 485 | { 486 | "cell_type": "code", 487 | "execution_count": null, 488 | "metadata": { 489 | "cell_type": "solution" 490 | }, 491 | "outputs": [], 492 | "source": [ 493 | "marvel_df['years_since'] = 2018 - marvel_df['first_appearance']\n", 494 | "\n", 495 | "marvel_df" 496 | ] 497 | }, 498 | { 499 | "cell_type": "markdown", 500 | "metadata": {}, 501 | "source": [ 502 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n", 503 | "\n", 504 | "## DataFrame boolean arrays (also called masks)" 505 | ] 506 | }, 507 | { 508 | "cell_type": "markdown", 509 | "metadata": {}, 510 | "source": [ 511 | "### Given the `marvel_df` pandas DataFrame, make a mask showing the female characters\n" 512 | ] 513 | }, 514 | { 515 | "cell_type": "code", 516 | "execution_count": null, 517 | "metadata": {}, 518 | "outputs": [], 519 | "source": [ 520 | "# your code goes here\n" 521 | ] 522 | }, 523 | { 524 | "cell_type": "code", 525 | "execution_count": null, 526 | "metadata": { 527 | "cell_type": "solution" 528 | }, 529 | "outputs": [], 530 | "source": [ 531 | "mask = marvel_df['sex'] == 'female'\n", 532 | "\n", 533 | "mask" 534 | ] 535 | }, 536 | { 537 | "cell_type": "markdown", 538 | "metadata": {}, 539 | "source": [ 540 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 541 | "\n", 542 | "### Given the `marvel_df` pandas DataFrame, get the male characters\n" 543 | ] 544 | }, 545 | { 546 | "cell_type": "code", 547 | "execution_count": null, 548 | "metadata": {}, 549 | "outputs": [], 550 | "source": [ 551 | "# your code goes here\n" 552 | ] 553 | }, 554 | { 555 | "cell_type": "code", 556 | "execution_count": null, 557 | "metadata": { 558 | "cell_type": "solution" 559 | }, 560 | "outputs": [], 561 | "source": [ 562 | "mask = marvel_df['sex'] == 'male'\n", 563 | 
"\n", 564 | "marvel_df[mask]" 565 | ] 566 | }, 567 | { 568 | "cell_type": "markdown", 569 | "metadata": {}, 570 | "source": [ 571 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 572 | "\n", 573 | "### Given the `marvel_df` pandas DataFrame, get the characters with `first_appearance` after 1970\n" 574 | ] 575 | }, 576 | { 577 | "cell_type": "code", 578 | "execution_count": null, 579 | "metadata": {}, 580 | "outputs": [], 581 | "source": [ 582 | "# your code goes here\n" 583 | ] 584 | }, 585 | { 586 | "cell_type": "code", 587 | "execution_count": null, 588 | "metadata": { 589 | "cell_type": "solution" 590 | }, 591 | "outputs": [], 592 | "source": [ 593 | "mask = marvel_df['first_appearance'] > 1970\n", 594 | "\n", 595 | "marvel_df[mask]" 596 | ] 597 | }, 598 | { 599 | "cell_type": "markdown", 600 | "metadata": {}, 601 | "source": [ 602 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 603 | "\n", 604 | "### Given the `marvel_df` pandas DataFrame, get the female characters with `first_appearance` after 1970" 605 | ] 606 | }, 607 | { 608 | "cell_type": "code", 609 | "execution_count": null, 610 | "metadata": {}, 611 | "outputs": [], 612 | "source": [ 613 | "# your code goes here\n" 614 | ] 615 | }, 616 | { 617 | "cell_type": "code", 618 | "execution_count": null, 619 | "metadata": { 620 | "cell_type": "solution", 621 | "scrolled": true 622 | }, 623 | "outputs": [], 624 | "source": [ 625 | "mask = (marvel_df['sex'] == 'female') & (marvel_df['first_appearance'] > 1970)\n", 626 | "\n", 627 | "marvel_df[mask]" 628 | ] 629 | }, 630 | { 631 | "cell_type": "markdown", 632 | "metadata": {}, 633 | "source": [ 634 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n", 635 | "\n", 636 | "## DataFrame summary statistics" 637 | ] 638 | }, 639 | { 640 | 
"cell_type": "markdown", 641 | "metadata": {}, 642 | "source": [ 643 | "### Show basic statistics of `marvel_df`" 644 | ] 645 | }, 646 | { 647 | "cell_type": "code", 648 | "execution_count": null, 649 | "metadata": {}, 650 | "outputs": [], 651 | "source": [ 652 | "# your code goes here\n" 653 | ] 654 | }, 655 | { 656 | "cell_type": "code", 657 | "execution_count": null, 658 | "metadata": { 659 | "cell_type": "solution" 660 | }, 661 | "outputs": [], 662 | "source": [ 663 | "marvel_df.describe()" 664 | ] 665 | }, 666 | { 667 | "cell_type": "markdown", 668 | "metadata": {}, 669 | "source": [ 670 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 671 | "\n", 672 | "### Given the `marvel_df` pandas DataFrame, show the mean value of `first_appearance`" 673 | ] 674 | }, 675 | { 676 | "cell_type": "code", 677 | "execution_count": null, 678 | "metadata": {}, 679 | "outputs": [], 680 | "source": [ 681 | "# your code goes here\n" 682 | ] 683 | }, 684 | { 685 | "cell_type": "code", 686 | "execution_count": null, 687 | "metadata": { 688 | "cell_type": "solution" 689 | }, 690 | "outputs": [], 691 | "source": [ 692 | "\n", 693 | "#np.mean(marvel_df.first_appearance)\n", 694 | "marvel_df.first_appearance.mean()" 695 | ] 696 | }, 697 | { 698 | "cell_type": "markdown", 699 | "metadata": {}, 700 | "source": [ 701 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 702 | "\n", 703 | "### Given the `marvel_df` pandas DataFrame, show the min value of `first_appearance`\n" 704 | ] 705 | }, 706 | { 707 | "cell_type": "code", 708 | "execution_count": null, 709 | "metadata": {}, 710 | "outputs": [], 711 | "source": [ 712 | "# your code goes here\n" 713 | ] 714 | }, 715 | { 716 | "cell_type": "code", 717 | "execution_count": null, 718 | "metadata": { 719 | "cell_type": "solution" 720 | }, 721 | "outputs": [], 722 | "source": [ 723 | 
"#np.min(marvel_df.first_appearance)\n", 724 | "marvel_df.first_appearance.min()" 725 | ] 726 | }, 727 | { 728 | "cell_type": "markdown", 729 | "metadata": {}, 730 | "source": [ 731 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 732 | "\n", 733 | "### Given the `marvel_df` pandas DataFrame, get the characters with the min value of `first_appearance`" 734 | ] 735 | }, 736 | { 737 | "cell_type": "code", 738 | "execution_count": null, 739 | "metadata": {}, 740 | "outputs": [], 741 | "source": [ 742 | "# your code goes here\n" 743 | ] 744 | }, 745 | { 746 | "cell_type": "code", 747 | "execution_count": null, 748 | "metadata": { 749 | "cell_type": "solution" 750 | }, 751 | "outputs": [], 752 | "source": [ 753 | "mask = marvel_df['first_appearance'] == marvel_df.first_appearance.min()\n", 754 | "marvel_df[mask]" 755 | ] 756 | }, 757 | { 758 | "cell_type": "markdown", 759 | "metadata": {}, 760 | "source": [ 761 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n", 762 | "\n", 763 | "## DataFrame basic plottings" 764 | ] 765 | }, 766 | { 767 | "cell_type": "markdown", 768 | "metadata": {}, 769 | "source": [ 770 | "### Reset index names of `marvel_df`\n" 771 | ] 772 | }, 773 | { 774 | "cell_type": "code", 775 | "execution_count": null, 776 | "metadata": {}, 777 | "outputs": [], 778 | "source": [ 779 | "# your code goes here\n" 780 | ] 781 | }, 782 | { 783 | "cell_type": "code", 784 | "execution_count": null, 785 | "metadata": { 786 | "cell_type": "solution" 787 | }, 788 | "outputs": [], 789 | "source": [ 790 | "marvel_df = marvel_df.reset_index()\n", 791 | "\n", 792 | "marvel_df" 793 | ] 794 | }, 795 | { 796 | "cell_type": "markdown", 797 | "metadata": {}, 798 | "source": [ 799 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 800 | "\n", 801 | "### 
Plot the values of `first_appearance`\n" 802 | ] 803 | }, 804 | { 805 | "cell_type": "code", 806 | "execution_count": null, 807 | "metadata": {}, 808 | "outputs": [], 809 | "source": [ 810 | "# your code goes here\n" 811 | ] 812 | }, 813 | { 814 | "cell_type": "code", 815 | "execution_count": null, 816 | "metadata": { 817 | "cell_type": "solution" 818 | }, 819 | "outputs": [], 820 | "source": [ 821 | "#plt.plot(marvel_df.index, marvel_df.first_appearance)\n", 822 | "marvel_df.first_appearance.plot()" 823 | ] 824 | }, 825 | { 826 | "cell_type": "markdown", 827 | "metadata": {}, 828 | "source": [ 829 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 830 | "\n", 831 | "### Plot a histogram (plot.hist) with values of `first_appearance`\n" 832 | ] 833 | }, 834 | { 835 | "cell_type": "code", 836 | "execution_count": null, 837 | "metadata": {}, 838 | "outputs": [], 839 | "source": [ 840 | "# your code goes here\n" 841 | ] 842 | }, 843 | { 844 | "cell_type": "code", 845 | "execution_count": null, 846 | "metadata": { 847 | "cell_type": "solution" 848 | }, 849 | "outputs": [], 850 | "source": [ 851 | "\n", 852 | "plt.hist(marvel_df.first_appearance)" 853 | ] 854 | }, 855 | { 856 | "cell_type": "markdown", 857 | "metadata": {}, 858 | "source": [ 859 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n" 860 | ] 861 | } 862 | ], 863 | "metadata": { 864 | "kernelspec": { 865 | "display_name": "Python 3", 866 | "language": "python", 867 | "name": "python3" 868 | }, 869 | "language_info": { 870 | "codemirror_mode": { 871 | "name": "ipython", 872 | "version": 3 873 | }, 874 | "file_extension": ".py", 875 | "mimetype": "text/x-python", 876 | "name": "python", 877 | "nbconvert_exporter": "python", 878 | "pygments_lexer": "ipython3", 879 | "version": "3.7.4" 880 | } 881 | }, 882 | "nbformat": 4, 883 | "nbformat_minor": 2 884 | } 885 | 
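The indexing and boolean-mask exercises in the notebook above can also be tried outside Jupyter. The following is a minimal sketch, not the notebook's own solution: it assumes a five-row subset of `marvel_data` chosen only for illustration, and the variable names `is_female`, `after_1970`, `female_after_1970` and `earliest` are hypothetical.

```python
import pandas as pd

# Illustrative subset of the notebook's marvel_data (assumption: five rows only)
marvel_data = [
    ['Spider-Man', 'male', 1962],
    ['Captain America', 'male', 1941],
    ['Invisible Woman', 'female', 1961],
    ['Storm', 'female', 1975],
    ['Wolverine', 'male', 1974],
]

# Name the columns at construction time, then move 'name' into the index
marvel_df = pd.DataFrame(marvel_data, columns=['name', 'sex', 'first_appearance'])
marvel_df = marvel_df.set_index('name')

# Each comparison yields a boolean Series aligned with the DataFrame index
is_female = marvel_df['sex'] == 'female'
after_1970 = marvel_df['first_appearance'] > 1970

# Masks combine element-wise with & (and), | (or) and ~ (not)
female_after_1970 = marvel_df[is_female & after_1970]

# Rows whose first_appearance equals the column minimum
earliest = marvel_df[marvel_df['first_appearance'] == marvel_df['first_appearance'].min()]

print(female_after_1970)
print(earliest)
```

Here `set_index('name')` does in one step what the exercises do in two (assigning `marvel_df.index` and then dropping the `name` column). Storing each mask in a named variable also avoids the parentheses that inline comparisons need around `&`.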
-------------------------------------------------------------------------------- /.ipynb_checkpoints/5 - Pandas - Reading CSV and Basic Plotting-checkpoint.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "![rmotr](https://user-images.githubusercontent.com/7065401/52071918-bda15380-2562-11e9-828c-7f95297e4a82.png)\n", 8 | "
\n", 9 | "\n", 10 | "\n", 12 | "\n", 13 | "# Reading external data & Plotting\n", 14 | "\n", 15 | "[Source](https://blockchain.info/charts/market-price)" 16 | ] 17 | }, 18 | { 19 | "cell_type": "markdown", 20 | "metadata": {}, 21 | "source": [ 22 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n", 23 | "\n", 24 | "## Hands on! " 25 | ] 26 | }, 27 | { 28 | "cell_type": "code", 29 | "execution_count": null, 30 | "metadata": {}, 31 | "outputs": [], 32 | "source": [ 33 | "import numpy as np\n", 34 | "import pandas as pd\n", 35 | "import matplotlib.pyplot as plt\n", 36 | "\n", 37 | "%matplotlib inline" 38 | ] 39 | }, 40 | { 41 | "cell_type": "markdown", 42 | "metadata": {}, 43 | "source": [ 44 | "Pandas can easily read data stored in different file formats like CSV, JSON, XML or even Excel. Parsing always involves specifying the correct structure, encoding and other details. The `read_csv` method reads CSV files and accepts many parameters." 45 | ] 46 | }, 47 | { 48 | "cell_type": "code", 49 | "execution_count": null, 50 | "metadata": { 51 | "scrolled": true 52 | }, 53 | "outputs": [], 54 | "source": [ 55 | "pd.read_csv" 56 | ] 57 | }, 58 | { 59 | "cell_type": "code", 60 | "execution_count": null, 61 | "metadata": {}, 62 | "outputs": [], 63 | "source": [ 64 | "df = pd.read_csv('data/btc-market-price.csv')" 65 | ] 66 | }, 67 | { 68 | "cell_type": "code", 69 | "execution_count": null, 70 | "metadata": {}, 71 | "outputs": [], 72 | "source": [ 73 | "df.head()" 74 | ] 75 | }, 76 | { 77 | "cell_type": "markdown", 78 | "metadata": {}, 79 | "source": [ 80 | "The CSV file we're reading has only two columns: `timestamp` and `price`. It doesn't have a header, it contains whitespaces and has values separated by commas. pandas automatically assigned the first row of data as headers, which is incorrect. 
We can override this behavior with the `header` parameter:" 81 | ] 82 | }, 83 | { 84 | "cell_type": "code", 85 | "execution_count": null, 86 | "metadata": {}, 87 | "outputs": [], 88 | "source": [ 89 | "df = pd.read_csv('data/btc-market-price.csv', header=None)" 90 | ] 91 | }, 92 | { 93 | "cell_type": "code", 94 | "execution_count": null, 95 | "metadata": {}, 96 | "outputs": [], 97 | "source": [ 98 | "df.head()" 99 | ] 100 | }, 101 | { 102 | "cell_type": "markdown", 103 | "metadata": {}, 104 | "source": [ 105 | "We can then set the names of each column explicitly by setting the `df.columns` attribute:" 106 | ] 107 | }, 108 | { 109 | "cell_type": "code", 110 | "execution_count": null, 111 | "metadata": {}, 112 | "outputs": [], 113 | "source": [ 114 | "df.columns = ['Timestamp', 'Price']" 115 | ] 116 | }, 117 | { 118 | "cell_type": "code", 119 | "execution_count": null, 120 | "metadata": {}, 121 | "outputs": [], 122 | "source": [ 123 | "df.head()" 124 | ] 125 | }, 126 | { 127 | "cell_type": "markdown", 128 | "metadata": {}, 129 | "source": [ 130 | "The type of the `Price` column was correctly interpreted as `float`, but the `Timestamp` was interpreted as a regular string (`object` in pandas notation):" 131 | ] 132 | }, 133 | { 134 | "cell_type": "code", 135 | "execution_count": null, 136 | "metadata": {}, 137 | "outputs": [], 138 | "source": [ 139 | "df.dtypes" 140 | ] 141 | }, 142 | { 143 | "cell_type": "markdown", 144 | "metadata": {}, 145 | "source": [ 146 | "We can perform a vectorized operation to parse all the `Timestamp` values as `Datetime` objects:" 147 | ] 148 | }, 149 | { 150 | "cell_type": "code", 151 | "execution_count": null, 152 | "metadata": {}, 153 | "outputs": [], 154 | "source": [ 155 | "pd.to_datetime(df['Timestamp']).head()" 156 | ] 157 | }, 158 | { 159 | "cell_type": "code", 160 | "execution_count": null, 161 | "metadata": {}, 162 | "outputs": [], 163 | "source": [ 164 | "df['Timestamp'] = pd.to_datetime(df['Timestamp'])" 165 | ] 166 | }, 167 | { 
168 | "cell_type": "code", 169 | "execution_count": null, 170 | "metadata": { 171 | "scrolled": true 172 | }, 173 | "outputs": [], 174 | "source": [ 175 | "df.head()" 176 | ] 177 | }, 178 | { 179 | "cell_type": "code", 180 | "execution_count": null, 181 | "metadata": {}, 182 | "outputs": [], 183 | "source": [ 184 | "df.dtypes" 185 | ] 186 | }, 187 | { 188 | "cell_type": "markdown", 189 | "metadata": {}, 190 | "source": [ 191 | "The timestamp looks a lot like the natural index of this `DataFrame`: `date > price`. We can replace the auto-incremented index generated by pandas and use the `Timestamp` column as the index instead:" 192 | ] 193 | }, 194 | { 195 | "cell_type": "code", 196 | "execution_count": null, 197 | "metadata": {}, 198 | "outputs": [], 199 | "source": [ 200 | "df.set_index('Timestamp', inplace=True)" 201 | ] 202 | }, 203 | { 204 | "cell_type": "code", 205 | "execution_count": null, 206 | "metadata": {}, 207 | "outputs": [], 208 | "source": [ 209 | "df.head()" 210 | ] 211 | }, 212 | { 213 | "cell_type": "markdown", 214 | "metadata": {}, 215 | "source": [ 216 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 217 | "\n", 218 | "## Putting everything together\n", 219 | "\n", 220 | "We've finally arrived at the desired version of the `DataFrame` parsed from our CSV file. 
The steps were:" 221 | ] 222 | }, 223 | { 224 | "cell_type": "code", 225 | "execution_count": null, 226 | "metadata": {}, 227 | "outputs": [], 228 | "source": [ 229 | "df = pd.read_csv('data/btc-market-price.csv', header=None)\n", 230 | "df.columns = ['Timestamp', 'Price']\n", 231 | "df['Timestamp'] = pd.to_datetime(df['Timestamp'])\n", 232 | "df.set_index('Timestamp', inplace=True)" 233 | ] 234 | }, 235 | { 236 | "cell_type": "code", 237 | "execution_count": null, 238 | "metadata": { 239 | "scrolled": false 240 | }, 241 | "outputs": [], 242 | "source": [ 243 | "df.head()" 244 | ] 245 | }, 246 | { 247 | "cell_type": "markdown", 248 | "metadata": {}, 249 | "source": [ 250 | "**There should be a better way**. And there is 😎. There usually is, especially for repetitive tasks like these in pandas.\n", 251 | "\n", 252 | "The `read_csv` function is extremely powerful and you can specify many more parameters at import time. We can achieve the same results with a single call:" 253 | ] 254 | }, 255 | { 256 | "cell_type": "code", 257 | "execution_count": null, 258 | "metadata": {}, 259 | "outputs": [], 260 | "source": [ 261 | "df = pd.read_csv(\n", 262 | " 'data/btc-market-price.csv',\n", 263 | " header=None,\n", 264 | " names=['Timestamp', 'Price'],\n", 265 | " index_col=0,\n", 266 | " parse_dates=True\n", 267 | ")" 268 | ] 269 | }, 270 | { 271 | "cell_type": "code", 272 | "execution_count": null, 273 | "metadata": { 274 | "scrolled": true 275 | }, 276 | "outputs": [], 277 | "source": [ 278 | "df.head()" 279 | ] 280 | }, 281 | { 282 | "cell_type": "markdown", 283 | "metadata": {}, 284 | "source": [ 285 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 286 | "\n", 287 | "## Plotting basics\n", 288 | "\n", 289 | "`pandas` integrates with Matplotlib and creating a plot is as simple as:" 290 | ] 291 | }, 292 | { 293 | "cell_type": "code", 294 | "execution_count": null, 295 | "metadata": { 
296 | "scrolled": true 297 | }, 298 | "outputs": [], 299 | "source": [ 300 | "df.plot()" 301 | ] 302 | }, 303 | { 304 | "cell_type": "markdown", 305 | "metadata": {}, 306 | "source": [ 307 | "Behind the scenes, it's using `matplotlib.pyplot`'s interface. We can create a similar plot with the `plt.plot()` function:" 308 | ] 309 | }, 310 | { 311 | "cell_type": "code", 312 | "execution_count": null, 313 | "metadata": { 314 | "scrolled": true 315 | }, 316 | "outputs": [], 317 | "source": [ 318 | "plt.plot(df.index, df['Price'])" 319 | ] 320 | }, 321 | { 322 | "cell_type": "markdown", 323 | "metadata": {}, 324 | "source": [ 325 | "`plt.plot()` accepts many parameters, but the first two are the most important: the values for the `X` and `Y` axes. Another example:" 326 | ] 327 | }, 328 | { 329 | "cell_type": "code", 330 | "execution_count": null, 331 | "metadata": {}, 332 | "outputs": [], 333 | "source": [ 334 | "x = np.arange(-10, 11)" 335 | ] 336 | }, 337 | { 338 | "cell_type": "code", 339 | "execution_count": null, 340 | "metadata": { 341 | "scrolled": false 342 | }, 343 | "outputs": [], 344 | "source": [ 345 | "plt.plot(x, x ** 2)" 346 | ] 347 | }, 348 | { 349 | "cell_type": "markdown", 350 | "metadata": {}, 351 | "source": [ 352 | "We're using `matplotlib`'s global API, which is horrible but it's the most popular one. We'll learn later how to use the _OOP_ API which will make our work much easier." 353 | ] 354 | }, 355 | { 356 | "cell_type": "code", 357 | "execution_count": null, 358 | "metadata": { 359 | "scrolled": true 360 | }, 361 | "outputs": [], 362 | "source": [ 363 | "plt.plot(x, x ** 2)\n", 364 | "plt.plot(x, -1 * (x ** 2))" 365 | ] 366 | }, 367 | { 368 | "cell_type": "markdown", 369 | "metadata": {}, 370 | "source": [ 371 | "Each `plt` function alters the global state. If you want to configure your plot you can use the `plt.figure` function. 
Others like `plt.title` keep altering the global plot:" 372 | ] 373 | }, 374 | { 375 | "cell_type": "code", 376 | "execution_count": null, 377 | "metadata": { 378 | "scrolled": false 379 | }, 380 | "outputs": [], 381 | "source": [ 382 | "plt.figure(figsize=(12, 6))\n", 383 | "plt.plot(x, x ** 2)\n", 384 | "plt.plot(x, -1 * (x ** 2))\n", 385 | "\n", 386 | "plt.title('My Nice Plot')" 387 | ] 388 | }, 389 | { 390 | "cell_type": "markdown", 391 | "metadata": {}, 392 | "source": [ 393 | "Some of the arguments in `plt.figure` and `plt.plot` are also available in pandas' `plot` interface:" 394 | ] 395 | }, 396 | { 397 | "cell_type": "code", 398 | "execution_count": null, 399 | "metadata": { 400 | "scrolled": true 401 | }, 402 | "outputs": [], 403 | "source": [ 404 | "df.plot(figsize=(16, 9), title='Bitcoin Price 2017-2018')" 405 | ] 406 | }, 407 | { 408 | "cell_type": "markdown", 409 | "metadata": {}, 410 | "source": [ 411 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 412 | "\n", 413 | "## A more challenging parsing\n", 414 | "\n", 415 | "To demonstrate plotting two columns together, we'll try to add Ether prices to our `df` DataFrame. The ETH prices data can be found in the `data/eth-price.csv` file. The problem is that it seems like that CSV file was created by someone who really hated programmers. Take a look at it and see how ugly it looks. We'll still use `pandas` to parse it." 
416 | ] 417 | }, 418 | { 419 | "cell_type": "code", 420 | "execution_count": null, 421 | "metadata": { 422 | "scrolled": true 423 | }, 424 | "outputs": [], 425 | "source": [ 426 | "eth = pd.read_csv('data/eth-price.csv')\n", 427 | "\n", 428 | "eth.head()" 429 | ] 430 | }, 431 | { 432 | "cell_type": "markdown", 433 | "metadata": {}, 434 | "source": [ 435 | "As you can see, it has a `Value` column (which represents the price), a `Date(UTC)` column that has a string representing dates, and a `UnixTimeStamp` column representing the datetime in Unix timestamp format. The header is read automatically. Let's try to parse dates with the CSV reader:" 436 | ] 437 | }, 438 | { 439 | "cell_type": "code", 440 | "execution_count": null, 441 | "metadata": {}, 442 | "outputs": [], 443 | "source": [ 444 | "eth = pd.read_csv('data/eth-price.csv', parse_dates=True)\n", 445 | "\n", 446 | "print(eth.dtypes)\n", 447 | "eth.head()" 448 | ] 449 | }, 450 | { 451 | "cell_type": "markdown", 452 | "metadata": {}, 453 | "source": [ 454 | "Seems like the `parse_dates` parameter didn't work. We'll need to add a little bit more customization. Let's divide this problem and focus on the problem of \"date parsing\" first. The simplest option would be to use the `UnixTimeStamp` column. The `pandas` module has a `to_datetime` function that converts Unix timestamps to Datetime objects automatically:" 455 | ] 456 | }, 457 | { 458 | "cell_type": "code", 459 | "execution_count": null, 460 | "metadata": {}, 461 | "outputs": [], 462 | "source": [ 463 | "pd.to_datetime(eth['UnixTimeStamp']).head()" 464 | ] 465 | }, 466 | { 467 | "cell_type": "markdown", 468 | "metadata": {}, 469 | "source": [ 470 | "The problem is the precision of unix timestamps. 
To match both columns we'll need to use the same index, and our `df` containing Bitcoin prices is \"per day\":" 471 | ] 472 | }, 473 | { 474 | "cell_type": "code", 475 | "execution_count": null, 476 | "metadata": { 477 | "scrolled": true 478 | }, 479 | "outputs": [], 480 | "source": [ 481 | "df.head()" 482 | ] 483 | }, 484 | { 485 | "cell_type": "markdown", 486 | "metadata": {}, 487 | "source": [ 488 | "We could either remove the precision of `UnixTimeStamp` or attempt to parse the `Date(UTC)` column. Let's try string parsing of `Date(UTC)` for fun:" 489 | ] 490 | }, 491 | { 492 | "cell_type": "code", 493 | "execution_count": null, 494 | "metadata": { 495 | "scrolled": false 496 | }, 497 | "outputs": [], 498 | "source": [ 499 | "pd.to_datetime(eth['Date(UTC)']).head()" 500 | ] 501 | }, 502 | { 503 | "cell_type": "markdown", 504 | "metadata": {}, 505 | "source": [ 506 | "That seems to work fine! Why wasn't `read_csv` parsing the `Date(UTC)` column, then? Simple: `parse_dates=True` instructs pandas to parse the index of the `DataFrame`. If you want to parse any other column, you must explicitly pass the column position or name:" 507 | ] 508 | }, 509 | { 510 | "cell_type": "code", 511 | "execution_count": null, 512 | "metadata": { 513 | "scrolled": false 514 | }, 515 | "outputs": [], 516 | "source": [ 517 | "pd.read_csv('data/eth-price.csv', parse_dates=[0]).head()" 518 | ] 519 | }, 520 | { 521 | "cell_type": "markdown", 522 | "metadata": {}, 523 | "source": [ 524 | "Putting everything together again:" 525 | ] 526 | }, 527 | { 528 | "cell_type": "code", 529 | "execution_count": null, 530 | "metadata": { 531 | "scrolled": false 532 | }, 533 | "outputs": [], 534 | "source": [ 535 | "eth = pd.read_csv('data/eth-price.csv', parse_dates=True, index_col=0)\n", 536 | "eth.info()\n", 537 | "\n", 538 | "eth.head()" 539 | ] 540 | }, 541 | { 542 | "cell_type": "markdown", 543 | "metadata": {}, 544 | "source": [ 545 | "We can now combine both `DataFrame`s into one. 
Both have the same index, so aligning both prices will be easy. Let's first create an empty `DataFrame` with the index from the Bitcoin prices:" 546 | ] 547 | }, 548 | { 549 | "cell_type": "code", 550 | "execution_count": null, 551 | "metadata": {}, 552 | "outputs": [], 553 | "source": [ 554 | "prices = pd.DataFrame(index=df.index)" 555 | ] 556 | }, 557 | { 558 | "cell_type": "code", 559 | "execution_count": null, 560 | "metadata": {}, 561 | "outputs": [], 562 | "source": [ 563 | "prices.head()" 564 | ] 565 | }, 566 | { 567 | "cell_type": "markdown", 568 | "metadata": {}, 569 | "source": [ 570 | "And we can now just set columns from the other `DataFrame`s:" 571 | ] 572 | }, 573 | { 574 | "cell_type": "code", 575 | "execution_count": null, 576 | "metadata": {}, 577 | "outputs": [], 578 | "source": [ 579 | "prices['Bitcoin'] = df['Price']" 580 | ] 581 | }, 582 | { 583 | "cell_type": "code", 584 | "execution_count": null, 585 | "metadata": {}, 586 | "outputs": [], 587 | "source": [ 588 | "prices['Ether'] = eth['Value']" 589 | ] 590 | }, 591 | { 592 | "cell_type": "code", 593 | "execution_count": null, 594 | "metadata": {}, 595 | "outputs": [], 596 | "source": [ 597 | "prices.head()" 598 | ] 599 | }, 600 | { 601 | "cell_type": "markdown", 602 | "metadata": {}, 603 | "source": [ 604 | "We can now try plotting both values:" 605 | ] 606 | }, 607 | { 608 | "cell_type": "code", 609 | "execution_count": null, 610 | "metadata": { 611 | "scrolled": true 612 | }, 613 | "outputs": [], 614 | "source": [ 615 | "prices.plot(figsize=(12, 6))" 616 | ] 617 | }, 618 | { 619 | "cell_type": "markdown", 620 | "metadata": {}, 621 | "source": [ 622 | "🤔 Seems like there's a tiny gap between Dec 2017 and Jan 2018. 
Let's zoom in there:" 623 | ] 624 | }, 625 | { 626 | "cell_type": "code", 627 | "execution_count": null, 628 | "metadata": { 629 | "scrolled": false 630 | }, 631 | "outputs": [], 632 | "source": [ 633 | "prices.loc['2017-12-01':'2018-01-01'].plot(figsize=(12, 6))" 634 | ] 635 | }, 636 | { 637 | "cell_type": "markdown", 638 | "metadata": {}, 639 | "source": [ 640 | "Oh no, missing data 😱. We'll learn how to deal with that later 😉.\n", 641 | "\n", 642 | "Btw, did you note that fancy indexing `'2017-12-01':'2018-01-01'` 😏. That's pandas power 💪. We'll learn how to deal with TimeSeries later too." 643 | ] 644 | }, 645 | { 646 | "cell_type": "markdown", 647 | "metadata": {}, 648 | "source": [ 649 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n" 650 | ] 651 | } 652 | ], 653 | "metadata": { 654 | "kernelspec": { 655 | "display_name": "Python 3", 656 | "language": "python", 657 | "name": "python3" 658 | }, 659 | "language_info": { 660 | "codemirror_mode": { 661 | "name": "ipython", 662 | "version": 3 663 | }, 664 | "file_extension": ".py", 665 | "mimetype": "text/x-python", 666 | "name": "python", 667 | "nbconvert_exporter": "python", 668 | "pygments_lexer": "ipython3", 669 | "version": "3.7.4" 670 | } 671 | }, 672 | "nbformat": 4, 673 | "nbformat_minor": 2 674 | } 675 | -------------------------------------------------------------------------------- /1 - Pandas - Series.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "![rmotr](https://user-images.githubusercontent.com/7065401/52071918-bda15380-2562-11e9-828c-7f95297e4a82.png)\n", 8 | "
\n", 9 | "\n", 10 | "\n", 12 | "\n", 13 | "# Pandas - Series\n" 14 | ] 15 | }, 16 | { 17 | "cell_type": "markdown", 18 | "metadata": {}, 19 | "source": [ 20 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n", 21 | "\n", 22 | "## Hands on! " 23 | ] 24 | }, 25 | { 26 | "cell_type": "code", 27 | "execution_count": 1, 28 | "metadata": {}, 29 | "outputs": [], 30 | "source": [ 31 | "import pandas as pd\n", 32 | "import numpy as np" 33 | ] 34 | }, 35 | { 36 | "cell_type": "markdown", 37 | "metadata": {}, 38 | "source": [ 39 | "## Pandas Series\n", 40 | "\n", 41 | "We'll start by analyzing \"[The Group of Seven](https://en.wikipedia.org/wiki/Group_of_Seven)\", a political forum formed by Canada, France, Germany, Italy, Japan, the United Kingdom and the United States. We'll begin with population, and for that, we'll use a `pandas.Series` object." 42 | ] 43 | }, 44 | { 45 | "cell_type": "code", 46 | "execution_count": 2, 47 | "metadata": {}, 48 | "outputs": [], 49 | "source": [ 50 | "# In millions\n", 51 | "g7_pop = pd.Series([35.467, 63.951, 80.940, 60.665, 127.061, 64.511, 318.523])" 52 | ] 53 | }, 54 | { 55 | "cell_type": "code", 56 | "execution_count": 3, 57 | "metadata": { 58 | "scrolled": true 59 | }, 60 | "outputs": [ 61 | { 62 | "data": { 63 | "text/plain": [ 64 | "0 35.467\n", 65 | "1 63.951\n", 66 | "2 80.940\n", 67 | "3 60.665\n", 68 | "4 127.061\n", 69 | "5 64.511\n", 70 | "6 318.523\n", 71 | "dtype: float64" 72 | ] 73 | }, 74 | "execution_count": 3, 75 | "metadata": {}, 76 | "output_type": "execute_result" 77 | } 78 | ], 79 | "source": [ 80 | "g7_pop" 81 | ] 82 | }, 83 | { 84 | "cell_type": "markdown", 85 | "metadata": {}, 86 | "source": [ 87 | "Someone might not know we're representing population in millions of inhabitants. 
Series can have a `name`, to better document the purpose of the Series:" 88 | ] 89 | }, 90 | { 91 | "cell_type": "code", 92 | "execution_count": 4, 93 | "metadata": {}, 94 | "outputs": [], 95 | "source": [ 96 | "g7_pop.name = 'G7 Population in millions'" 97 | ] 98 | }, 99 | { 100 | "cell_type": "code", 101 | "execution_count": 5, 102 | "metadata": {}, 103 | "outputs": [ 104 | { 105 | "data": { 106 | "text/plain": [ 107 | "0 35.467\n", 108 | "1 63.951\n", 109 | "2 80.940\n", 110 | "3 60.665\n", 111 | "4 127.061\n", 112 | "5 64.511\n", 113 | "6 318.523\n", 114 | "Name: G7 Population in millions, dtype: float64" 115 | ] 116 | }, 117 | "execution_count": 5, 118 | "metadata": {}, 119 | "output_type": "execute_result" 120 | } 121 | ], 122 | "source": [ 123 | "g7_pop" 124 | ] 125 | }, 126 | { 127 | "cell_type": "markdown", 128 | "metadata": {}, 129 | "source": [ 130 | "Series are pretty similar to numpy arrays:" 131 | ] 132 | }, 133 | { 134 | "cell_type": "code", 135 | "execution_count": 6, 136 | "metadata": {}, 137 | "outputs": [ 138 | { 139 | "data": { 140 | "text/plain": [ 141 | "dtype('float64')" 142 | ] 143 | }, 144 | "execution_count": 6, 145 | "metadata": {}, 146 | "output_type": "execute_result" 147 | } 148 | ], 149 | "source": [ 150 | "g7_pop.dtype" 151 | ] 152 | }, 153 | { 154 | "cell_type": "code", 155 | "execution_count": 7, 156 | "metadata": {}, 157 | "outputs": [ 158 | { 159 | "data": { 160 | "text/plain": [ 161 | "array([ 35.467, 63.951, 80.94 , 60.665, 127.061, 64.511, 318.523])" 162 | ] 163 | }, 164 | "execution_count": 7, 165 | "metadata": {}, 166 | "output_type": "execute_result" 167 | } 168 | ], 169 | "source": [ 170 | "g7_pop.values" 171 | ] 172 | }, 173 | { 174 | "cell_type": "markdown", 175 | "metadata": {}, 176 | "source": [ 177 | "They're actually backed by numpy arrays:" 178 | ] 179 | }, 180 | { 181 | "cell_type": "code", 182 | "execution_count": 8, 183 | "metadata": {}, 184 | "outputs": [ 185 | { 186 | "data": { 187 | "text/plain": [ 188 | 
"numpy.ndarray" 189 | ] 190 | }, 191 | "execution_count": 8, 192 | "metadata": {}, 193 | "output_type": "execute_result" 194 | } 195 | ], 196 | "source": [ 197 | "type(g7_pop.values)" 198 | ] 199 | }, 200 | { 201 | "cell_type": "markdown", 202 | "metadata": {}, 203 | "source": [ 204 | "And they _look_ like simple Python lists or Numpy Arrays. But they're actually more similar to Python `dict`s.\n", 205 | "\n", 206 | "A Series has an `index`, that's similar to the automatic index assigned to Python's lists:" 207 | ] 208 | }, 209 | { 210 | "cell_type": "code", 211 | "execution_count": 9, 212 | "metadata": {}, 213 | "outputs": [ 214 | { 215 | "data": { 216 | "text/plain": [ 217 | "0 35.467\n", 218 | "1 63.951\n", 219 | "2 80.940\n", 220 | "3 60.665\n", 221 | "4 127.061\n", 222 | "5 64.511\n", 223 | "6 318.523\n", 224 | "Name: G7 Population in millions, dtype: float64" 225 | ] 226 | }, 227 | "execution_count": 9, 228 | "metadata": {}, 229 | "output_type": "execute_result" 230 | } 231 | ], 232 | "source": [ 233 | "g7_pop" 234 | ] 235 | }, 236 | { 237 | "cell_type": "code", 238 | "execution_count": 10, 239 | "metadata": {}, 240 | "outputs": [ 241 | { 242 | "data": { 243 | "text/plain": [ 244 | "35.467" 245 | ] 246 | }, 247 | "execution_count": 10, 248 | "metadata": {}, 249 | "output_type": "execute_result" 250 | } 251 | ], 252 | "source": [ 253 | "g7_pop[0]" 254 | ] 255 | }, 256 | { 257 | "cell_type": "code", 258 | "execution_count": 11, 259 | "metadata": {}, 260 | "outputs": [ 261 | { 262 | "data": { 263 | "text/plain": [ 264 | "63.951" 265 | ] 266 | }, 267 | "execution_count": 11, 268 | "metadata": {}, 269 | "output_type": "execute_result" 270 | } 271 | ], 272 | "source": [ 273 | "g7_pop[1]" 274 | ] 275 | }, 276 | { 277 | "cell_type": "code", 278 | "execution_count": 12, 279 | "metadata": {}, 280 | "outputs": [ 281 | { 282 | "data": { 283 | "text/plain": [ 284 | "RangeIndex(start=0, stop=7, step=1)" 285 | ] 286 | }, 287 | "execution_count": 12, 288 | "metadata": {}, 
289 | "output_type": "execute_result" 290 | } 291 | ], 292 | "source": [ 293 | "g7_pop.index" 294 | ] 295 | }, 296 | { 297 | "cell_type": "code", 298 | "execution_count": 13, 299 | "metadata": {}, 300 | "outputs": [], 301 | "source": [ 302 | "l = ['a', 'b', 'c']" 303 | ] 304 | }, 305 | { 306 | "cell_type": "markdown", 307 | "metadata": {}, 308 | "source": [ 309 | "But, in contrast to lists, we can explicitly define the index:" 310 | ] 311 | }, 312 | { 313 | "cell_type": "code", 314 | "execution_count": 14, 315 | "metadata": {}, 316 | "outputs": [], 317 | "source": [ 318 | "g7_pop.index = [\n", 319 | " 'Canada',\n", 320 | " 'France',\n", 321 | " 'Germany',\n", 322 | " 'Italy',\n", 323 | " 'Japan',\n", 324 | " 'United Kingdom',\n", 325 | " 'United States',\n", 326 | "]" 327 | ] 328 | }, 329 | { 330 | "cell_type": "code", 331 | "execution_count": 15, 332 | "metadata": {}, 333 | "outputs": [ 334 | { 335 | "data": { 336 | "text/plain": [ 337 | "Canada 35.467\n", 338 | "France 63.951\n", 339 | "Germany 80.940\n", 340 | "Italy 60.665\n", 341 | "Japan 127.061\n", 342 | "United Kingdom 64.511\n", 343 | "United States 318.523\n", 344 | "Name: G7 Population in millions, dtype: float64" 345 | ] 346 | }, 347 | "execution_count": 15, 348 | "metadata": {}, 349 | "output_type": "execute_result" 350 | } 351 | ], 352 | "source": [ 353 | "g7_pop" 354 | ] 355 | }, 356 | { 357 | "cell_type": "markdown", 358 | "metadata": {}, 359 | "source": [ 360 | "Compare it with the [following table](https://docs.google.com/spreadsheets/d/1IlorV2-Oh9Da1JAZ7weVw86PQrQydSMp-ydVMH135iI/edit?usp=sharing): \n", 361 | "\n", 362 | "\n", 363 | "\n", 364 | "We can say that Series look like \"ordered dictionaries\". 
We can actually create Series out of dictionaries:" 365 | ] 366 | }, 367 | { 368 | "cell_type": "code", 369 | "execution_count": 16, 370 | "metadata": { 371 | "scrolled": true 372 | }, 373 | "outputs": [ 374 | { 375 | "data": { 376 | "text/plain": [ 377 | "Canada 35.467\n", 378 | "France 63.951\n", 379 | "Germany 80.940\n", 380 | "Italy 60.665\n", 381 | "Japan 127.061\n", 382 | "United Kingdom 64.511\n", 383 | "United States 318.523\n", 384 | "Name: G7 Population in millions, dtype: float64" 385 | ] 386 | }, 387 | "execution_count": 16, 388 | "metadata": {}, 389 | "output_type": "execute_result" 390 | } 391 | ], 392 | "source": [ 393 | "pd.Series({\n", 394 | " 'Canada': 35.467,\n", 395 | " 'France': 63.951,\n", 396 | " 'Germany': 80.94,\n", 397 | " 'Italy': 60.665,\n", 398 | " 'Japan': 127.061,\n", 399 | " 'United Kingdom': 64.511,\n", 400 | " 'United States': 318.523\n", 401 | "}, name='G7 Population in millions')" 402 | ] 403 | }, 404 | { 405 | "cell_type": "code", 406 | "execution_count": 17, 407 | "metadata": {}, 408 | "outputs": [ 409 | { 410 | "data": { 411 | "text/plain": [ 412 | "Canada 35.467\n", 413 | "France 63.951\n", 414 | "Germany 80.940\n", 415 | "Italy 60.665\n", 416 | "Japan 127.061\n", 417 | "United Kingdom 64.511\n", 418 | "United States 318.523\n", 419 | "Name: G7 Population in millions, dtype: float64" 420 | ] 421 | }, 422 | "execution_count": 17, 423 | "metadata": {}, 424 | "output_type": "execute_result" 425 | } 426 | ], 427 | "source": [ 428 | "pd.Series(\n", 429 | " [35.467, 63.951, 80.94, 60.665, 127.061, 64.511, 318.523],\n", 430 | " index=['Canada', 'France', 'Germany', 'Italy', 'Japan', 'United Kingdom',\n", 431 | " 'United States'],\n", 432 | " name='G7 Population in millions')" 433 | ] 434 | }, 435 | { 436 | "cell_type": "markdown", 437 | "metadata": {}, 438 | "source": [ 439 | "You can also create Series out of other series, specifying indexes:" 440 | ] 441 | }, 442 | { 443 | "cell_type": "code", 444 | "execution_count": 18, 445 | 
"metadata": {}, 446 | "outputs": [ 447 | { 448 | "data": { 449 | "text/plain": [ 450 | "France 63.951\n", 451 | "Germany 80.940\n", 452 | "Italy 60.665\n", 453 | "Spain NaN\n", 454 | "Name: G7 Population in millions, dtype: float64" 455 | ] 456 | }, 457 | "execution_count": 18, 458 | "metadata": {}, 459 | "output_type": "execute_result" 460 | } 461 | ], 462 | "source": [ 463 | "pd.Series(g7_pop, index=['France', 'Germany', 'Italy', 'Spain'])" 464 | ] 465 | }, 466 | { 467 | "cell_type": "markdown", 468 | "metadata": {}, 469 | "source": [ 470 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 471 | "\n", 472 | "## Indexing\n", 473 | "\n", 474 | "Indexing works similarly to lists and dictionaries, you use the **index** of the element you're looking for:" 475 | ] 476 | }, 477 | { 478 | "cell_type": "code", 479 | "execution_count": 19, 480 | "metadata": {}, 481 | "outputs": [ 482 | { 483 | "data": { 484 | "text/plain": [ 485 | "Canada 35.467\n", 486 | "France 63.951\n", 487 | "Germany 80.940\n", 488 | "Italy 60.665\n", 489 | "Japan 127.061\n", 490 | "United Kingdom 64.511\n", 491 | "United States 318.523\n", 492 | "Name: G7 Population in millions, dtype: float64" 493 | ] 494 | }, 495 | "execution_count": 19, 496 | "metadata": {}, 497 | "output_type": "execute_result" 498 | } 499 | ], 500 | "source": [ 501 | "g7_pop" 502 | ] 503 | }, 504 | { 505 | "cell_type": "code", 506 | "execution_count": 20, 507 | "metadata": {}, 508 | "outputs": [ 509 | { 510 | "data": { 511 | "text/plain": [ 512 | "35.467" 513 | ] 514 | }, 515 | "execution_count": 20, 516 | "metadata": {}, 517 | "output_type": "execute_result" 518 | } 519 | ], 520 | "source": [ 521 | "g7_pop['Canada']" 522 | ] 523 | }, 524 | { 525 | "cell_type": "code", 526 | "execution_count": 21, 527 | "metadata": {}, 528 | "outputs": [ 529 | { 530 | "data": { 531 | "text/plain": [ 532 | "127.061" 533 | ] 534 | }, 535 | "execution_count": 21, 536 | 
"metadata": {}, 537 | "output_type": "execute_result" 538 | } 539 | ], 540 | "source": [ 541 | "g7_pop['Japan']" 542 | ] 543 | }, 544 | { 545 | "cell_type": "markdown", 546 | "metadata": {}, 547 | "source": [ 548 | "Numeric positions can also be used, with the `iloc` attribute:" 549 | ] 550 | }, 551 | { 552 | "cell_type": "code", 553 | "execution_count": 22, 554 | "metadata": {}, 555 | "outputs": [ 556 | { 557 | "data": { 558 | "text/plain": [ 559 | "35.467" 560 | ] 561 | }, 562 | "execution_count": 22, 563 | "metadata": {}, 564 | "output_type": "execute_result" 565 | } 566 | ], 567 | "source": [ 568 | "g7_pop.iloc[0]" 569 | ] 570 | }, 571 | { 572 | "cell_type": "code", 573 | "execution_count": 23, 574 | "metadata": { 575 | "scrolled": true 576 | }, 577 | "outputs": [ 578 | { 579 | "data": { 580 | "text/plain": [ 581 | "318.523" 582 | ] 583 | }, 584 | "execution_count": 23, 585 | "metadata": {}, 586 | "output_type": "execute_result" 587 | } 588 | ], 589 | "source": [ 590 | "g7_pop.iloc[-1]" 591 | ] 592 | }, 593 | { 594 | "cell_type": "markdown", 595 | "metadata": {}, 596 | "source": [ 597 | "Selecting multiple elements at once:" 598 | ] 599 | }, 600 | { 601 | "cell_type": "code", 602 | "execution_count": 24, 603 | "metadata": { 604 | "scrolled": true 605 | }, 606 | "outputs": [ 607 | { 608 | "data": { 609 | "text/plain": [ 610 | "Italy 60.665\n", 611 | "France 63.951\n", 612 | "Name: G7 Population in millions, dtype: float64" 613 | ] 614 | }, 615 | "execution_count": 24, 616 | "metadata": {}, 617 | "output_type": "execute_result" 618 | } 619 | ], 620 | "source": [ 621 | "g7_pop[['Italy', 'France']]" 622 | ] 623 | }, 624 | { 625 | "cell_type": "markdown", 626 | "metadata": {}, 627 | "source": [ 628 | "_(The result is another Series)_" 629 | ] 630 | }, 631 | { 632 | "cell_type": "code", 633 | "execution_count": 25, 634 | "metadata": { 635 | "scrolled": true 636 | }, 637 | "outputs": [ 638 | { 639 | "data": { 640 | "text/plain": [ 641 | "Canada 35.467\n", 642 | 
"France 63.951\n", 643 | "Name: G7 Population in millions, dtype: float64" 644 | ] 645 | }, 646 | "execution_count": 25, 647 | "metadata": {}, 648 | "output_type": "execute_result" 649 | } 650 | ], 651 | "source": [ 652 | "g7_pop.iloc[[0, 1]]" 653 | ] 654 | }, 655 | { 656 | "cell_type": "markdown", 657 | "metadata": {}, 658 | "source": [ 659 | "Slicing also works, but **important**, in Pandas, the upper limit is also included:" 660 | ] 661 | }, 662 | { 663 | "cell_type": "code", 664 | "execution_count": 28, 665 | "metadata": {}, 666 | "outputs": [ 667 | { 668 | "data": { 669 | "text/plain": [ 670 | "Canada 35.467\n", 671 | "France 63.951\n", 672 | "Germany 80.940\n", 673 | "Italy 60.665\n", 674 | "Name: G7 Population in millions, dtype: float64" 675 | ] 676 | }, 677 | "execution_count": 28, 678 | "metadata": {}, 679 | "output_type": "execute_result" 680 | } 681 | ], 682 | "source": [ 683 | "g7_pop['Canada': 'Italy']" 684 | ] 685 | }, 686 | { 687 | "cell_type": "markdown", 688 | "metadata": {}, 689 | "source": [ 690 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 691 | "\n", 692 | "## Conditional selection (boolean arrays)\n", 693 | "\n", 694 | "The same boolean array techniques we saw applied to numpy arrays can be used for Pandas `Series`:" 695 | ] 696 | }, 697 | { 698 | "cell_type": "code", 699 | "execution_count": 31, 700 | "metadata": {}, 701 | "outputs": [ 702 | { 703 | "data": { 704 | "text/plain": [ 705 | "Canada 35.467\n", 706 | "France 63.951\n", 707 | "Germany 80.940\n", 708 | "Italy 60.665\n", 709 | "Japan 127.061\n", 710 | "United Kingdom 64.511\n", 711 | "United States 318.523\n", 712 | "Name: G7 Population in millions, dtype: float64" 713 | ] 714 | }, 715 | "execution_count": 31, 716 | "metadata": {}, 717 | "output_type": "execute_result" 718 | } 719 | ], 720 | "source": [ 721 | "g7_pop" 722 | ] 723 | }, 724 | { 725 | "cell_type": "code", 726 | "execution_count": 32, 727 | 
"metadata": {}, 728 | "outputs": [ 729 | { 730 | "data": { 731 | "text/plain": [ 732 | "Canada False\n", 733 | "France False\n", 734 | "Germany True\n", 735 | "Italy False\n", 736 | "Japan True\n", 737 | "United Kingdom False\n", 738 | "United States True\n", 739 | "Name: G7 Population in millions, dtype: bool" 740 | ] 741 | }, 742 | "execution_count": 32, 743 | "metadata": {}, 744 | "output_type": "execute_result" 745 | } 746 | ], 747 | "source": [ 748 | "g7_pop > 70" 749 | ] 750 | }, 751 | { 752 | "cell_type": "code", 753 | "execution_count": 33, 754 | "metadata": {}, 755 | "outputs": [ 756 | { 757 | "data": { 758 | "text/plain": [ 759 | "Germany 80.940\n", 760 | "Japan 127.061\n", 761 | "United States 318.523\n", 762 | "Name: G7 Population in millions, dtype: float64" 763 | ] 764 | }, 765 | "execution_count": 33, 766 | "metadata": {}, 767 | "output_type": "execute_result" 768 | } 769 | ], 770 | "source": [ 771 | "g7_pop[g7_pop > 70]" 772 | ] 773 | }, 774 | { 775 | "cell_type": "code", 776 | "execution_count": 34, 777 | "metadata": {}, 778 | "outputs": [ 779 | { 780 | "data": { 781 | "text/plain": [ 782 | "107.30257142857144" 783 | ] 784 | }, 785 | "execution_count": 34, 786 | "metadata": {}, 787 | "output_type": "execute_result" 788 | } 789 | ], 790 | "source": [ 791 | "g7_pop.mean()" 792 | ] 793 | }, 794 | { 795 | "cell_type": "code", 796 | "execution_count": 35, 797 | "metadata": {}, 798 | "outputs": [ 799 | { 800 | "data": { 801 | "text/plain": [ 802 | "Japan 127.061\n", 803 | "United States 318.523\n", 804 | "Name: G7 Population in millions, dtype: float64" 805 | ] 806 | }, 807 | "execution_count": 35, 808 | "metadata": {}, 809 | "output_type": "execute_result" 810 | } 811 | ], 812 | "source": [ 813 | "g7_pop[g7_pop > g7_pop.mean()]" 814 | ] 815 | }, 816 | { 817 | "cell_type": "code", 818 | "execution_count": null, 819 | "metadata": {}, 820 | "outputs": [], 821 | "source": [ 822 | "g7_pop.std()" 823 | ] 824 | }, 825 | { 826 | "cell_type": "code", 827 | 
"execution_count": null, 828 | "metadata": {}, 829 | "outputs": [], 830 | "source": [ 831 | "~ not\n", 832 | "| or\n", 833 | "& and" 834 | ] 835 | }, 836 | { 837 | "cell_type": "code", 838 | "execution_count": null, 839 | "metadata": { 840 | "scrolled": true 841 | }, 842 | "outputs": [], 843 | "source": [ 844 | "g7_pop[(g7_pop > g7_pop.mean() - g7_pop.std() / 2) | (g7_pop > g7_pop.mean() + g7_pop.std() / 2)]" 845 | ] 846 | }, 847 | { 848 | "cell_type": "markdown", 849 | "metadata": {}, 850 | "source": [ 851 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 852 | "\n", 853 | "## Operations and methods\n", 854 | "Series also support vectorized operations and aggregation functions as Numpy:" 855 | ] 856 | }, 857 | { 858 | "cell_type": "code", 859 | "execution_count": 29, 860 | "metadata": {}, 861 | "outputs": [ 862 | { 863 | "data": { 864 | "text/plain": [ 865 | "Canada 35.467\n", 866 | "France 63.951\n", 867 | "Germany 80.940\n", 868 | "Italy 60.665\n", 869 | "Japan 127.061\n", 870 | "United Kingdom 64.511\n", 871 | "United States 318.523\n", 872 | "Name: G7 Population in millions, dtype: float64" 873 | ] 874 | }, 875 | "execution_count": 29, 876 | "metadata": {}, 877 | "output_type": "execute_result" 878 | } 879 | ], 880 | "source": [ 881 | "g7_pop" 882 | ] 883 | }, 884 | { 885 | "cell_type": "code", 886 | "execution_count": 30, 887 | "metadata": {}, 888 | "outputs": [ 889 | { 890 | "data": { 891 | "text/plain": [ 892 | "Canada 35467000.0\n", 893 | "France 63951000.0\n", 894 | "Germany 80940000.0\n", 895 | "Italy 60665000.0\n", 896 | "Japan 127061000.0\n", 897 | "United Kingdom 64511000.0\n", 898 | "United States 318523000.0\n", 899 | "Name: G7 Population in millions, dtype: float64" 900 | ] 901 | }, 902 | "execution_count": 30, 903 | "metadata": {}, 904 | "output_type": "execute_result" 905 | } 906 | ], 907 | "source": [ 908 | "g7_pop * 1_000_000" 909 | ] 910 | }, 911 | { 912 | "cell_type": 
"code", 913 | "execution_count": 36, 914 | "metadata": {}, 915 | "outputs": [ 916 | { 917 | "data": { 918 | "text/plain": [ 919 | "107.30257142857144" 920 | ] 921 | }, 922 | "execution_count": 36, 923 | "metadata": {}, 924 | "output_type": "execute_result" 925 | } 926 | ], 927 | "source": [ 928 | "g7_pop.mean()" 929 | ] 930 | }, 931 | { 932 | "cell_type": "code", 933 | "execution_count": 37, 934 | "metadata": { 935 | "scrolled": true 936 | }, 937 | "outputs": [ 938 | { 939 | "data": { 940 | "text/plain": [ 941 | "Canada 3.568603\n", 942 | "France 4.158117\n", 943 | "Germany 4.393708\n", 944 | "Italy 4.105367\n", 945 | "Japan 4.844667\n", 946 | "United Kingdom 4.166836\n", 947 | "United States 5.763695\n", 948 | "Name: G7 Population in millions, dtype: float64" 949 | ] 950 | }, 951 | "execution_count": 37, 952 | "metadata": {}, 953 | "output_type": "execute_result" 954 | } 955 | ], 956 | "source": [ 957 | "np.log(g7_pop)" 958 | ] 959 | }, 960 | { 961 | "cell_type": "code", 962 | "execution_count": 38, 963 | "metadata": {}, 964 | "outputs": [ 965 | { 966 | "data": { 967 | "text/plain": [ 968 | "68.51866666666666" 969 | ] 970 | }, 971 | "execution_count": 38, 972 | "metadata": {}, 973 | "output_type": "execute_result" 974 | } 975 | ], 976 | "source": [ 977 | "g7_pop['France': 'Italy'].mean()" 978 | ] 979 | }, 980 | { 981 | "cell_type": "markdown", 982 | "metadata": {}, 983 | "source": [ 984 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 985 | "\n", 986 | "## Boolean arrays\n", 987 | "(Work in the same way as numpy)" 988 | ] 989 | }, 990 | { 991 | "cell_type": "code", 992 | "execution_count": 39, 993 | "metadata": {}, 994 | "outputs": [ 995 | { 996 | "data": { 997 | "text/plain": [ 998 | "Canada 35.467\n", 999 | "France 63.951\n", 1000 | "Germany 80.940\n", 1001 | "Italy 60.665\n", 1002 | "Japan 127.061\n", 1003 | "United Kingdom 64.511\n", 1004 | "United States 318.523\n", 1005 | "Name: G7 
Population in millions, dtype: float64" 1006 | ] 1007 | }, 1008 | "execution_count": 39, 1009 | "metadata": {}, 1010 | "output_type": "execute_result" 1011 | } 1012 | ], 1013 | "source": [ 1014 | "g7_pop" 1015 | ] 1016 | }, 1017 | { 1018 | "cell_type": "code", 1019 | "execution_count": 40, 1020 | "metadata": {}, 1021 | "outputs": [ 1022 | { 1023 | "data": { 1024 | "text/plain": [ 1025 | "Canada False\n", 1026 | "France False\n", 1027 | "Germany True\n", 1028 | "Italy False\n", 1029 | "Japan True\n", 1030 | "United Kingdom False\n", 1031 | "United States True\n", 1032 | "Name: G7 Population in millions, dtype: bool" 1033 | ] 1034 | }, 1035 | "execution_count": 40, 1036 | "metadata": {}, 1037 | "output_type": "execute_result" 1038 | } 1039 | ], 1040 | "source": [ 1041 | "g7_pop > 80" 1042 | ] 1043 | }, 1044 | { 1045 | "cell_type": "code", 1046 | "execution_count": 41, 1047 | "metadata": { 1048 | "scrolled": true 1049 | }, 1050 | "outputs": [ 1051 | { 1052 | "data": { 1053 | "text/plain": [ 1054 | "Germany 80.940\n", 1055 | "Japan 127.061\n", 1056 | "United States 318.523\n", 1057 | "Name: G7 Population in millions, dtype: float64" 1058 | ] 1059 | }, 1060 | "execution_count": 41, 1061 | "metadata": {}, 1062 | "output_type": "execute_result" 1063 | } 1064 | ], 1065 | "source": [ 1066 | "g7_pop[g7_pop > 80]" 1067 | ] 1068 | }, 1069 | { 1070 | "cell_type": "code", 1071 | "execution_count": 42, 1072 | "metadata": { 1073 | "scrolled": true 1074 | }, 1075 | "outputs": [ 1076 | { 1077 | "data": { 1078 | "text/plain": [ 1079 | "Canada 35.467\n", 1080 | "Germany 80.940\n", 1081 | "Japan 127.061\n", 1082 | "United States 318.523\n", 1083 | "Name: G7 Population in millions, dtype: float64" 1084 | ] 1085 | }, 1086 | "execution_count": 42, 1087 | "metadata": {}, 1088 | "output_type": "execute_result" 1089 | } 1090 | ], 1091 | "source": [ 1092 | "g7_pop[(g7_pop > 80) | (g7_pop < 40)]" 1093 | ] 1094 | }, 1095 | { 1096 | "cell_type": "code", 1097 | "execution_count": 43, 1098 | 
"metadata": { 1099 | "scrolled": true 1100 | }, 1101 | "outputs": [ 1102 | { 1103 | "data": { 1104 | "text/plain": [ 1105 | "Germany 80.940\n", 1106 | "Japan 127.061\n", 1107 | "Name: G7 Population in millions, dtype: float64" 1108 | ] 1109 | }, 1110 | "execution_count": 43, 1111 | "metadata": {}, 1112 | "output_type": "execute_result" 1113 | } 1114 | ], 1115 | "source": [ 1116 | "g7_pop[(g7_pop > 80) & (g7_pop < 200)]" 1117 | ] 1118 | }, 1119 | { 1120 | "cell_type": "markdown", 1121 | "metadata": {}, 1122 | "source": [ 1123 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 1124 | "\n", 1125 | "## Modifying series\n" 1126 | ] 1127 | }, 1128 | { 1129 | "cell_type": "code", 1130 | "execution_count": 44, 1131 | "metadata": {}, 1132 | "outputs": [], 1133 | "source": [ 1134 | "g7_pop['Canada'] = 40.5" 1135 | ] 1136 | }, 1137 | { 1138 | "cell_type": "code", 1139 | "execution_count": 45, 1140 | "metadata": {}, 1141 | "outputs": [ 1142 | { 1143 | "data": { 1144 | "text/plain": [ 1145 | "Canada 40.500\n", 1146 | "France 63.951\n", 1147 | "Germany 80.940\n", 1148 | "Italy 60.665\n", 1149 | "Japan 127.061\n", 1150 | "United Kingdom 64.511\n", 1151 | "United States 318.523\n", 1152 | "Name: G7 Population in millions, dtype: float64" 1153 | ] 1154 | }, 1155 | "execution_count": 45, 1156 | "metadata": {}, 1157 | "output_type": "execute_result" 1158 | } 1159 | ], 1160 | "source": [ 1161 | "g7_pop" 1162 | ] 1163 | }, 1164 | { 1165 | "cell_type": "code", 1166 | "execution_count": 46, 1167 | "metadata": {}, 1168 | "outputs": [], 1169 | "source": [ 1170 | "g7_pop.iloc[-1] = 500" 1171 | ] 1172 | }, 1173 | { 1174 | "cell_type": "code", 1175 | "execution_count": 47, 1176 | "metadata": {}, 1177 | "outputs": [ 1178 | { 1179 | "data": { 1180 | "text/plain": [ 1181 | "Canada 40.500\n", 1182 | "France 63.951\n", 1183 | "Germany 80.940\n", 1184 | "Italy 60.665\n", 1185 | "Japan 127.061\n", 1186 | "United Kingdom 
64.511\n", 1187 | "United States 500.000\n", 1188 | "Name: G7 Population in millions, dtype: float64" 1189 | ] 1190 | }, 1191 | "execution_count": 47, 1192 | "metadata": {}, 1193 | "output_type": "execute_result" 1194 | } 1195 | ], 1196 | "source": [ 1197 | "g7_pop" 1198 | ] 1199 | }, 1200 | { 1201 | "cell_type": "code", 1202 | "execution_count": 48, 1203 | "metadata": {}, 1204 | "outputs": [ 1205 | { 1206 | "data": { 1207 | "text/plain": [ 1208 | "Canada 40.500\n", 1209 | "France 63.951\n", 1210 | "Italy 60.665\n", 1211 | "United Kingdom 64.511\n", 1212 | "Name: G7 Population in millions, dtype: float64" 1213 | ] 1214 | }, 1215 | "execution_count": 48, 1216 | "metadata": {}, 1217 | "output_type": "execute_result" 1218 | } 1219 | ], 1220 | "source": [ 1221 | "g7_pop[g7_pop < 70]" 1222 | ] 1223 | }, 1224 | { 1225 | "cell_type": "code", 1226 | "execution_count": 49, 1227 | "metadata": {}, 1228 | "outputs": [], 1229 | "source": [ 1230 | "g7_pop[g7_pop < 70] = 99.99" 1231 | ] 1232 | }, 1233 | { 1234 | "cell_type": "code", 1235 | "execution_count": 50, 1236 | "metadata": { 1237 | "scrolled": true 1238 | }, 1239 | "outputs": [ 1240 | { 1241 | "data": { 1242 | "text/plain": [ 1243 | "Canada 99.990\n", 1244 | "France 99.990\n", 1245 | "Germany 80.940\n", 1246 | "Italy 99.990\n", 1247 | "Japan 127.061\n", 1248 | "United Kingdom 99.990\n", 1249 | "United States 500.000\n", 1250 | "Name: G7 Population in millions, dtype: float64" 1251 | ] 1252 | }, 1253 | "execution_count": 50, 1254 | "metadata": {}, 1255 | "output_type": "execute_result" 1256 | } 1257 | ], 1258 | "source": [ 1259 | "g7_pop" 1260 | ] 1261 | }, 1262 | { 1263 | "cell_type": "markdown", 1264 | "metadata": {}, 1265 | "source": [ 1266 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n" 1267 | ] 1268 | } 1269 | ], 1270 | "metadata": { 1271 | "kernelspec": { 1272 | "display_name": "Python 3", 1273 | "language": "python", 1274 | "name": 
"python3" 1275 | }, 1276 | "language_info": { 1277 | "codemirror_mode": { 1278 | "name": "ipython", 1279 | "version": 3 1280 | }, 1281 | "file_extension": ".py", 1282 | "mimetype": "text/x-python", 1283 | "name": "python", 1284 | "nbconvert_exporter": "python", 1285 | "pygments_lexer": "ipython3", 1286 | "version": "3.8.1" 1287 | } 1288 | }, 1289 | "nbformat": 4, 1290 | "nbformat_minor": 4 1291 | } 1292 | -------------------------------------------------------------------------------- /2 - Pandas Series exercises.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "![rmotr](https://user-images.githubusercontent.com/7065401/52071918-bda15380-2562-11e9-828c-7f95297e4a82.png)\n", 8 | "
\n", 9 | "\n", 10 | "# Pandas Series exercises\n" 11 | ] 12 | }, 13 | { 14 | "cell_type": "code", 15 | "execution_count": 1, 16 | "metadata": {}, 17 | "outputs": [ 18 | { 19 | "name": "stdout", 20 | "output_type": "stream", 21 | "text": [ 22 | "0.25.3\n" 23 | ] 24 | } 25 | ], 26 | "source": [ 27 | "# Import the numpy package under the name np\n", 28 | "import numpy as np\n", 29 | "\n", 30 | "# Import the pandas package under the name pd\n", 31 | "import pandas as pd\n", 32 | "\n", 33 | "# Print the pandas version and the configuration\n", 34 | "print(pd.__version__)" 35 | ] 36 | }, 37 | { 38 | "cell_type": "markdown", 39 | "metadata": {}, 40 | "source": [ 41 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n", 42 | "\n", 43 | "## Series creation" 44 | ] 45 | }, 46 | { 47 | "cell_type": "markdown", 48 | "metadata": {}, 49 | "source": [ 50 | "### Create an empty pandas Series" 51 | ] 52 | }, 53 | { 54 | "cell_type": "code", 55 | "execution_count": null, 56 | "metadata": {}, 57 | "outputs": [], 58 | "source": [ 59 | "# your code goes here\n" 60 | ] 61 | }, 62 | { 63 | "cell_type": "code", 64 | "execution_count": null, 65 | "metadata": { 66 | "cell_type": "solution" 67 | }, 68 | "outputs": [], 69 | "source": [ 70 | "pd.Series()" 71 | ] 72 | }, 73 | { 74 | "cell_type": "markdown", 75 | "metadata": {}, 76 | "source": [ 77 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 78 | "\n", 79 | "### Given the X python list convert it to an Y pandas Series" 80 | ] 81 | }, 82 | { 83 | "cell_type": "code", 84 | "execution_count": null, 85 | "metadata": {}, 86 | "outputs": [], 87 | "source": [ 88 | "# your code goes here\n" 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": null, 94 | "metadata": { 95 | "cell_type": "solution" 96 | }, 97 | "outputs": [], 98 | "source": [ 99 | "X = ['A','B','C']\n", 100 | "print(X, 
type(X))\n", 101 | "\n", 102 | "Y = pd.Series(X)\n", 103 | "print(Y, type(Y)) # different type" 104 | ] 105 | }, 106 | { 107 | "cell_type": "markdown", 108 | "metadata": {}, 109 | "source": [ 110 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 111 | "\n", 112 | "### Given the X pandas Series, name it 'My letters'" 113 | ] 114 | }, 115 | { 116 | "cell_type": "code", 117 | "execution_count": null, 118 | "metadata": {}, 119 | "outputs": [], 120 | "source": [ 121 | "# your code goes here\n" 122 | ] 123 | }, 124 | { 125 | "cell_type": "code", 126 | "execution_count": null, 127 | "metadata": { 128 | "cell_type": "solution" 129 | }, 130 | "outputs": [], 131 | "source": [ 132 | "X = pd.Series(['A','B','C'])\n", 133 | "\n", 134 | "X.name = 'My letters'\n", 135 | "X" 136 | ] 137 | }, 138 | { 139 | "cell_type": "markdown", 140 | "metadata": {}, 141 | "source": [ 142 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 143 | "\n", 144 | "### Given the X pandas Series, show its values\n" 145 | ] 146 | }, 147 | { 148 | "cell_type": "code", 149 | "execution_count": null, 150 | "metadata": {}, 151 | "outputs": [], 152 | "source": [ 153 | "# your code goes here\n" 154 | ] 155 | }, 156 | { 157 | "cell_type": "code", 158 | "execution_count": null, 159 | "metadata": { 160 | "cell_type": "solution" 161 | }, 162 | "outputs": [], 163 | "source": [ 164 | "X = pd.Series(['A','B','C'])\n", 165 | "\n", 166 | "X.values" 167 | ] 168 | }, 169 | { 170 | "cell_type": "markdown", 171 | "metadata": {}, 172 | "source": [ 173 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n", 174 | "\n", 175 | "## Series indexation" 176 | ] 177 | }, 178 | { 179 | "cell_type": "markdown", 180 | "metadata": {}, 181 | "source": [ 182 | "### Assign index names to the given X pandas Series\n" 183 | ] 
184 | }, 185 | { 186 | "cell_type": "code", 187 | "execution_count": null, 188 | "metadata": {}, 189 | "outputs": [], 190 | "source": [ 191 | "# your code goes here\n" 192 | ] 193 | }, 194 | { 195 | "cell_type": "code", 196 | "execution_count": null, 197 | "metadata": { 198 | "cell_type": "solution" 199 | }, 200 | "outputs": [], 201 | "source": [ 202 | "X = pd.Series(['A','B','C'])\n", 203 | "index_names = ['first', 'second', 'third']\n", 204 | "\n", 205 | "X.index = index_names\n", 206 | "X" 207 | ] 208 | }, 209 | { 210 | "cell_type": "markdown", 211 | "metadata": {}, 212 | "source": [ 213 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 214 | "\n", 215 | "### Given the X pandas Series, show its first element\n" 216 | ] 217 | }, 218 | { 219 | "cell_type": "code", 220 | "execution_count": null, 221 | "metadata": {}, 222 | "outputs": [], 223 | "source": [ 224 | "# your code goes here\n" 225 | ] 226 | }, 227 | { 228 | "cell_type": "code", 229 | "execution_count": null, 230 | "metadata": { 231 | "cell_type": "solution" 232 | }, 233 | "outputs": [], 234 | "source": [ 235 | "X = pd.Series(['A','B','C'], index=['first', 'second', 'third'])\n", 236 | "\n", 237 | "#X[0] # by position\n", 238 | "#X.iloc[0] # by position\n", 239 | "X['first'] # by index" 240 | ] 241 | }, 242 | { 243 | "cell_type": "markdown", 244 | "metadata": {}, 245 | "source": [ 246 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 247 | "\n", 248 | "### Given the X pandas Series, show its last element\n" 249 | ] 250 | }, 251 | { 252 | "cell_type": "code", 253 | "execution_count": null, 254 | "metadata": {}, 255 | "outputs": [], 256 | "source": [ 257 | "# your code goes here\n" 258 | ] 259 | }, 260 | { 261 | "cell_type": "code", 262 | "execution_count": null, 263 | "metadata": { 264 | "cell_type": "solution" 265 | }, 266 | "outputs": [], 267 | "source": [ 
268 | "X = pd.Series(['A','B','C'], index=['first', 'second', 'third'])\n", 269 | "\n", 270 | "#X[-1] # by position\n", 271 | "#X.iloc[-1] # by position\n", 272 | "X['third'] # by index" 273 | ] 274 | }, 275 | { 276 | "cell_type": "markdown", 277 | "metadata": {}, 278 | "source": [ 279 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 280 | "\n", 281 | "### Given the X pandas Series, show all middle elements\n" 282 | ] 283 | }, 284 | { 285 | "cell_type": "code", 286 | "execution_count": null, 287 | "metadata": {}, 288 | "outputs": [], 289 | "source": [ 290 | "# your code goes here\n" 291 | ] 292 | }, 293 | { 294 | "cell_type": "code", 295 | "execution_count": null, 296 | "metadata": { 297 | "cell_type": "solution" 298 | }, 299 | "outputs": [], 300 | "source": [ 301 | "X = pd.Series(['A','B','C','D','E'],\n", 302 | " index=['first','second','third','fourth','fifth'])\n", 303 | "\n", 304 | "#X[['second', 'third', 'fourth']]\n", 305 | "#X.iloc[1:-1] # by position\n", 306 | "X[1:-1] # by position" 307 | ] 308 | }, 309 | { 310 | "cell_type": "markdown", 311 | "metadata": {}, 312 | "source": [ 313 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 314 | "\n", 315 | "### Given the X pandas Series, show the elements in reverse order\n" 316 | ] 317 | }, 318 | { 319 | "cell_type": "code", 320 | "execution_count": null, 321 | "metadata": {}, 322 | "outputs": [], 323 | "source": [ 324 | "# your code goes here\n" 325 | ] 326 | }, 327 | { 328 | "cell_type": "code", 329 | "execution_count": null, 330 | "metadata": { 331 | "cell_type": "solution" 332 | }, 333 | "outputs": [], 334 | "source": [ 335 | "X = pd.Series(['A','B','C','D','E'],\n", 336 | " index=['first','second','third','fourth','fifth'])\n", 337 | "\n", 338 | "#X.iloc[::-1]\n", 339 | "X[::-1]" 340 | ] 341 | }, 342 | { 343 | "cell_type": "markdown", 344 | "metadata": {}, 345 
| "source": [ 346 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 347 | "\n", 348 | "### Given the X pandas Series, show the first and last elements\n" 349 | ] 350 | }, 351 | { 352 | "cell_type": "code", 353 | "execution_count": null, 354 | "metadata": {}, 355 | "outputs": [], 356 | "source": [ 357 | "# your code goes here\n" 358 | ] 359 | }, 360 | { 361 | "cell_type": "code", 362 | "execution_count": null, 363 | "metadata": { 364 | "cell_type": "solution" 365 | }, 366 | "outputs": [], 367 | "source": [ 368 | "X = pd.Series(['A','B','C','D','E'],\n", 369 | " index=['first','second','third','forth','fifth'])\n", 370 | "\n", 371 | "#X[['first', 'fifth']]\n", 372 | "#X.iloc[[0, -1]]\n", 373 | "X[[0, -1]]" 374 | ] 375 | }, 376 | { 377 | "cell_type": "markdown", 378 | "metadata": {}, 379 | "source": [ 380 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n", 381 | "\n", 382 | "## Series manipulation" 383 | ] 384 | }, 385 | { 386 | "cell_type": "markdown", 387 | "metadata": {}, 388 | "source": [ 389 | "### Convert the given integer pandas Series to float\n" 390 | ] 391 | }, 392 | { 393 | "cell_type": "code", 394 | "execution_count": null, 395 | "metadata": {}, 396 | "outputs": [], 397 | "source": [ 398 | "# your code goes here\n" 399 | ] 400 | }, 401 | { 402 | "cell_type": "code", 403 | "execution_count": null, 404 | "metadata": { 405 | "cell_type": "solution" 406 | }, 407 | "outputs": [], 408 | "source": [ 409 | "X = pd.Series([1,2,3,4,5],\n", 410 | " index=['first','second','third','forth','fifth'])\n", 411 | "\n", 412 | "pd.Series(X, dtype=np.float)" 413 | ] 414 | }, 415 | { 416 | "cell_type": "markdown", 417 | "metadata": {}, 418 | "source": [ 419 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 420 | "\n", 421 | "### Reverse the given pandas 
Series (first element becomes last)" 422 | ] 423 | }, 424 | { 425 | "cell_type": "code", 426 | "execution_count": null, 427 | "metadata": {}, 428 | "outputs": [], 429 | "source": [ 430 | "# your code goes here\n" 431 | ] 432 | }, 433 | { 434 | "cell_type": "code", 435 | "execution_count": null, 436 | "metadata": { 437 | "cell_type": "solution" 438 | }, 439 | "outputs": [], 440 | "source": [ 441 | "X = pd.Series([1,2,3,4,5],\n", 442 | " index=['first','second','third','fourth','fifth'])\n", 443 | "\n", 444 | "X[::-1]" 445 | ] 446 | }, 447 | { 448 | "cell_type": "markdown", 449 | "metadata": {}, 450 | "source": [ 451 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 452 | "\n", 453 | "### Order (sort) the given pandas Series\n" 454 | ] 455 | }, 456 | { 457 | "cell_type": "code", 458 | "execution_count": null, 459 | "metadata": {}, 460 | "outputs": [], 461 | "source": [ 462 | "# your code goes here\n" 463 | ] 464 | }, 465 | { 466 | "cell_type": "code", 467 | "execution_count": null, 468 | "metadata": { 469 | "cell_type": "solution" 470 | }, 471 | "outputs": [], 472 | "source": [ 473 | "X = pd.Series([4,2,5,1,3],\n", 474 | " index=['fourth','second','fifth','first','third'])\n", 475 | "\n", 476 | "X = X.sort_values()\n", 477 | "X" 478 | ] 479 | }, 480 | { 481 | "cell_type": "markdown", 482 | "metadata": {}, 483 | "source": [ 484 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 485 | "\n", 486 | "### Given the X pandas Series, set the fifth element equal to 10\n" 487 | ] 488 | }, 489 | { 490 | "cell_type": "code", 491 | "execution_count": null, 492 | "metadata": {}, 493 | "outputs": [], 494 | "source": [ 495 | "# your code goes here\n" 496 | ] 497 | }, 498 | { 499 | "cell_type": "code", 500 | "execution_count": null, 501 | "metadata": { 502 | "cell_type": "solution" 503 | }, 504 | "outputs": [], 505 | "source": [ 506 | "X = 
pd.Series([1,2,3,4,5],\n", 507 | " index=['A','B','C','D','E'])\n", 508 | "\n", 509 | "X.iloc[4] = 10 # fifth element, by position\n", 510 | "X" 511 | ] 512 | }, 513 | { 514 | "cell_type": "markdown", 515 | "metadata": {}, 516 | "source": [ 517 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 518 | "\n", 519 | "### Given the X pandas Series, change all the middle elements to 0\n" 520 | ] 521 | }, 522 | { 523 | "cell_type": "code", 524 | "execution_count": null, 525 | "metadata": {}, 526 | "outputs": [], 527 | "source": [ 528 | "# your code goes here\n" 529 | ] 530 | }, 531 | { 532 | "cell_type": "code", 533 | "execution_count": null, 534 | "metadata": { 535 | "cell_type": "solution", 536 | "scrolled": false 537 | }, 538 | "outputs": [], 539 | "source": [ 540 | "X = pd.Series([1,2,3,4,5],\n", 541 | " index=['A','B','C','D','E'])\n", 542 | "\n", 543 | "X[1:-1] = 0\n", 544 | "X" 545 | ] 546 | }, 547 | { 548 | "cell_type": "markdown", 549 | "metadata": {}, 550 | "source": [ 551 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 552 | "\n", 553 | "### Given the X pandas Series, add 5 to every element\n" 554 | ] 555 | }, 556 | { 557 | "cell_type": "code", 558 | "execution_count": null, 559 | "metadata": {}, 560 | "outputs": [], 561 | "source": [ 562 | "# your code goes here\n" 563 | ] 564 | }, 565 | { 566 | "cell_type": "code", 567 | "execution_count": null, 568 | "metadata": { 569 | "cell_type": "solution" 570 | }, 571 | "outputs": [], 572 | "source": [ 573 | "X = pd.Series([1,2,3,4,5])\n", 574 | "\n", 575 | "X + 5" 576 | ] 577 | }, 578 | { 579 | "cell_type": "markdown", 580 | "metadata": {}, 581 | "source": [ 582 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n", 583 | "\n", 584 | "## Series boolean arrays (also called masks)" 585 | ] 586 | }, 587 | { 588 | "cell_type": "markdown", 
589 | "metadata": {}, 590 | "source": [ 591 | "### Given the X pandas Series, make a mask showing negative elements\n" 592 | ] 593 | }, 594 | { 595 | "cell_type": "code", 596 | "execution_count": null, 597 | "metadata": {}, 598 | "outputs": [], 599 | "source": [ 600 | "# your code goes here\n" 601 | ] 602 | }, 603 | { 604 | "cell_type": "code", 605 | "execution_count": null, 606 | "metadata": { 607 | "cell_type": "solution" 608 | }, 609 | "outputs": [], 610 | "source": [ 611 | "X = pd.Series([-1,2,0,-4,5,6,0,0,-9,10])\n", 612 | "\n", 613 | "mask = X < 0\n", 614 | "mask" 615 | ] 616 | }, 617 | { 618 | "cell_type": "markdown", 619 | "metadata": {}, 620 | "source": [ 621 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 622 | "\n", 623 | "### Given the X pandas Series, get the negative elements\n" 624 | ] 625 | }, 626 | { 627 | "cell_type": "code", 628 | "execution_count": null, 629 | "metadata": {}, 630 | "outputs": [], 631 | "source": [ 632 | "# your code goes here\n" 633 | ] 634 | }, 635 | { 636 | "cell_type": "code", 637 | "execution_count": null, 638 | "metadata": { 639 | "cell_type": "solution" 640 | }, 641 | "outputs": [], 642 | "source": [ 643 | "X = pd.Series([-1,2,0,-4,5,6,0,0,-9,10])\n", 644 | "\n", 645 | "mask = X < 0\n", 646 | "X[mask]" 647 | ] 648 | }, 649 | { 650 | "cell_type": "markdown", 651 | "metadata": {}, 652 | "source": [ 653 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 654 | "\n", 655 | "### Given the X pandas Series, get numbers higher than 5\n" 656 | ] 657 | }, 658 | { 659 | "cell_type": "code", 660 | "execution_count": null, 661 | "metadata": {}, 662 | "outputs": [], 663 | "source": [ 664 | "# your code goes here\n" 665 | ] 666 | }, 667 | { 668 | "cell_type": "code", 669 | "execution_count": null, 670 | "metadata": { 671 | "cell_type": "solution" 672 | }, 673 | "outputs": [], 674 | 
"source": [ 675 | "X = pd.Series([-1,2,0,-4,5,6,0,0,-9,10])\n", 676 | "\n", 677 | "mask = X > 5\n", 678 | "X[mask]" 679 | ] 680 | }, 681 | { 682 | "cell_type": "markdown", 683 | "metadata": {}, 684 | "source": [ 685 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 686 | "\n", 687 | "### Given the X pandas Series, get numbers higher than the elements mean" 688 | ] 689 | }, 690 | { 691 | "cell_type": "code", 692 | "execution_count": null, 693 | "metadata": {}, 694 | "outputs": [], 695 | "source": [ 696 | "# your code goes here\n" 697 | ] 698 | }, 699 | { 700 | "cell_type": "code", 701 | "execution_count": null, 702 | "metadata": { 703 | "cell_type": "solution" 704 | }, 705 | "outputs": [], 706 | "source": [ 707 | "X = pd.Series([-1,2,0,-4,5,6,0,0,-9,10])\n", 708 | "\n", 709 | "mask = X > X.mean()\n", 710 | "X[mask]" 711 | ] 712 | }, 713 | { 714 | "cell_type": "markdown", 715 | "metadata": {}, 716 | "source": [ 717 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 718 | "\n", 719 | "### Given the X pandas Series, get numbers equal to 2 or 10\n" 720 | ] 721 | }, 722 | { 723 | "cell_type": "code", 724 | "execution_count": null, 725 | "metadata": {}, 726 | "outputs": [], 727 | "source": [ 728 | "# your code goes here\n" 729 | ] 730 | }, 731 | { 732 | "cell_type": "code", 733 | "execution_count": null, 734 | "metadata": { 735 | "cell_type": "solution", 736 | "scrolled": true 737 | }, 738 | "outputs": [], 739 | "source": [ 740 | "X = pd.Series([-1,2,0,-4,5,6,0,0,-9,10])\n", 741 | "\n", 742 | "mask = (X == 2) | (X == 10)\n", 743 | "X[mask]" 744 | ] 745 | }, 746 | { 747 | "cell_type": "markdown", 748 | "metadata": {}, 749 | "source": [ 750 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n", 751 | "\n", 752 | "## Logic functions" 753 | ] 754 | }, 755 | { 
756 | "cell_type": "markdown", 757 | "metadata": {}, 758 | "source": [ 759 | "### Given the X pandas Series, return True if none of its elements is zero" 760 | ] 761 | }, 762 | { 763 | "cell_type": "code", 764 | "execution_count": null, 765 | "metadata": {}, 766 | "outputs": [], 767 | "source": [ 768 | "# your code goes here\n" 769 | ] 770 | }, 771 | { 772 | "cell_type": "code", 773 | "execution_count": null, 774 | "metadata": { 775 | "cell_type": "solution" 776 | }, 777 | "outputs": [], 778 | "source": [ 779 | "X = pd.Series([-1,2,0,-4,5,6,0,0,-9,10])\n", 780 | "\n", 781 | "X.all()" 782 | ] 783 | }, 784 | { 785 | "cell_type": "markdown", 786 | "metadata": {}, 787 | "source": [ 788 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 789 | "\n", 790 | "### Given the X pandas Series, return True if any of its elements is zero\n" 791 | ] 792 | }, 793 | { 794 | "cell_type": "code", 795 | "execution_count": null, 796 | "metadata": {}, 797 | "outputs": [], 798 | "source": [ 799 | "# your code goes here\n" 800 | ] 801 | }, 802 | { 803 | "cell_type": "code", 804 | "execution_count": null, 805 | "metadata": { 806 | "cell_type": "solution" 807 | }, 808 | "outputs": [], 809 | "source": [ 810 | "X = pd.Series([-1,2,0,-4,5,6,0,0,-9,10])\n", 811 | "\n", 812 | "(X == 0).any()" 813 | ] 814 | }, 815 | { 816 | "cell_type": "markdown", 817 | "metadata": {}, 818 | "source": [ 819 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n", 820 | "\n", 821 | "## Summary statistics" 822 | ] 823 | }, 824 | { 825 | "cell_type": "markdown", 826 | "metadata": {}, 827 | "source": [ 828 | "### Given the X pandas Series, show the sum of its elements\n" 829 | ] 830 | }, 831 | { 832 | "cell_type": "code", 833 | "execution_count": null, 834 | "metadata": {}, 835 | "outputs": [], 836 | "source": [ 837 | "# your code goes here\n" 838 | ] 839 | }, 840 | { 841 | 
"cell_type": "code", 842 | "execution_count": null, 843 | "metadata": { 844 | "cell_type": "solution" 845 | }, 846 | "outputs": [], 847 | "source": [ 848 | "X = pd.Series([3,5,6,7,2,3,4,9,4])\n", 849 | "\n", 850 | "#np.sum(X)\n", 851 | "X.sum()" 852 | ] 853 | }, 854 | { 855 | "cell_type": "markdown", 856 | "metadata": {}, 857 | "source": [ 858 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 859 | "\n", 860 | "### Given the X pandas Series, show the mean value of its elements" 861 | ] 862 | }, 863 | { 864 | "cell_type": "code", 865 | "execution_count": null, 866 | "metadata": {}, 867 | "outputs": [], 868 | "source": [ 869 | "# your code goes here\n" 870 | ] 871 | }, 872 | { 873 | "cell_type": "code", 874 | "execution_count": null, 875 | "metadata": { 876 | "cell_type": "solution" 877 | }, 878 | "outputs": [], 879 | "source": [ 880 | "X = pd.Series([1,2,0,4,5,6,0,0,9,10])\n", 881 | "\n", 882 | "#np.mean(X)\n", 883 | "X.mean()" 884 | ] 885 | }, 886 | { 887 | "cell_type": "markdown", 888 | "metadata": {}, 889 | "source": [ 890 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 891 | "\n", 892 | "### Given the X pandas Series, show the max value of its elements" 893 | ] 894 | }, 895 | { 896 | "cell_type": "code", 897 | "execution_count": null, 898 | "metadata": {}, 899 | "outputs": [], 900 | "source": [ 901 | "# your code goes here\n" 902 | ] 903 | }, 904 | { 905 | "cell_type": "code", 906 | "execution_count": null, 907 | "metadata": { 908 | "cell_type": "solution" 909 | }, 910 | "outputs": [], 911 | "source": [ 912 | "X = pd.Series([1,2,0,4,5,6,0,0,9,10])\n", 913 | "\n", 914 | "#np.max(X)\n", 915 | "X.max()" 916 | ] 917 | }, 918 | { 919 | "cell_type": "markdown", 920 | "metadata": {}, 921 | "source": [ 922 | 
"![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)" 923 | ] 924 | } 925 | ], 926 | "metadata": { 927 | "kernelspec": { 928 | "display_name": "Python 3", 929 | "language": "python", 930 | "name": "python3" 931 | }, 932 | "language_info": { 933 | "codemirror_mode": { 934 | "name": "ipython", 935 | "version": 3 936 | }, 937 | "file_extension": ".py", 938 | "mimetype": "text/x-python", 939 | "name": "python", 940 | "nbconvert_exporter": "python", 941 | "pygments_lexer": "ipython3", 942 | "version": "3.7.4" 943 | } 944 | }, 945 | "nbformat": 4, 946 | "nbformat_minor": 2 947 | } 948 | -------------------------------------------------------------------------------- /4 - Pandas DataFrames exercises.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "![rmotr](https://user-images.githubusercontent.com/7065401/52071918-bda15380-2562-11e9-828c-7f95297e4a82.png)\n", 8 | "
\n", 9 | "\n", 10 | "# Pandas DataFrame exercises\n" 11 | ] 12 | }, 13 | { 14 | "cell_type": "code", 15 | "execution_count": null, 16 | "metadata": {}, 17 | "outputs": [], 18 | "source": [ 19 | "# Import the numpy package under the name np\n", 20 | "import numpy as np\n", 21 | "\n", 22 | "# Import the pandas package under the name pd\n", 23 | "import pandas as pd\n", 24 | "\n", 25 | "# Import the matplotlib package under the name plt\n", 26 | "import matplotlib.pyplot as plt\n", 27 | "%matplotlib inline\n", 28 | "\n", 29 | "# Print the pandas version\n", 30 | "print(pd.__version__)" 31 | ] 32 | }, 33 | { 34 | "cell_type": "markdown", 35 | "metadata": {}, 36 | "source": [ 37 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n", 38 | "\n", 39 | "## DataFrame creation" 40 | ] 41 | }, 42 | { 43 | "cell_type": "markdown", 44 | "metadata": {}, 45 | "source": [ 46 | "### Create an empty pandas DataFrame\n" 47 | ] 48 | }, 49 | { 50 | "cell_type": "code", 51 | "execution_count": null, 52 | "metadata": {}, 53 | "outputs": [], 54 | "source": [ 55 | "# your code goes here\n" 56 | ] 57 | }, 58 | { 59 | "cell_type": "code", 60 | "execution_count": null, 61 | "metadata": { 62 | "cell_type": "solution" 63 | }, 64 | "outputs": [], 65 | "source": [ 66 | "# An empty DataFrame needs no arguments; data=[None] would add a row\n", 67 | "empty_df = pd.DataFrame()\n", 68 | "empty_df" 69 | ] 70 | }, 71 | { 72 | "cell_type": "markdown", 73 | "metadata": {}, 74 | "source": [ 75 | "" 76 | ] 77 | }, 78 | { 79 | "cell_type": "markdown", 80 | "metadata": {}, 81 | "source": [ 82 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 83 | "\n", 84 | "### Create a `marvel_df` pandas DataFrame with the given marvel data\n" 85 | ] 86 | }, 87 | { 88 | "cell_type": "code", 89 | "execution_count": null, 90 | "metadata": {}, 91 | "outputs": [], 92 | "source": [ 93 | "marvel_data = [\n", 94 
| " ['Spider-Man', 'male', 1962],\n", 95 | " ['Captain America', 'male', 1941],\n", 96 | " ['Wolverine', 'male', 1974],\n", 97 | " ['Iron Man', 'male', 1963],\n", 98 | " ['Thor', 'male', 1963],\n", 99 | " ['Thing', 'male', 1961],\n", 100 | " ['Mister Fantastic', 'male', 1961],\n", 101 | " ['Hulk', 'male', 1962],\n", 102 | " ['Beast', 'male', 1963],\n", 103 | " ['Invisible Woman', 'female', 1961],\n", 104 | " ['Storm', 'female', 1975],\n", 105 | " ['Namor', 'male', 1939],\n", 106 | " ['Hawkeye', 'male', 1964],\n", 107 | " ['Daredevil', 'male', 1964],\n", 108 | " ['Doctor Strange', 'male', 1963],\n", 109 | " ['Hank Pym', 'male', 1962],\n", 110 | " ['Scarlet Witch', 'female', 1964],\n", 111 | " ['Wasp', 'female', 1963],\n", 112 | " ['Black Widow', 'female', 1964],\n", 113 | " ['Vision', 'male', 1968]\n", 114 | "]" 115 | ] 116 | }, 117 | { 118 | "cell_type": "code", 119 | "execution_count": null, 120 | "metadata": {}, 121 | "outputs": [], 122 | "source": [ 123 | "# your code goes here\n" 124 | ] 125 | }, 126 | { 127 | "cell_type": "code", 128 | "execution_count": null, 129 | "metadata": { 130 | "cell_type": "solution" 131 | }, 132 | "outputs": [], 133 | "source": [ 134 | "marvel_df = pd.DataFrame(data=marvel_data)\n", 135 | "\n", 136 | "marvel_df" 137 | ] 138 | }, 139 | { 140 | "cell_type": "markdown", 141 | "metadata": {}, 142 | "source": [ 143 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 144 | "\n", 145 | "### Add column names to the `marvel_df`\n", 146 | " " 147 | ] 148 | }, 149 | { 150 | "cell_type": "code", 151 | "execution_count": null, 152 | "metadata": {}, 153 | "outputs": [], 154 | "source": [ 155 | "# your code goes here\n" 156 | ] 157 | }, 158 | { 159 | "cell_type": "code", 160 | "execution_count": null, 161 | "metadata": { 162 | "cell_type": "solution" 163 | }, 164 | "outputs": [], 165 | "source": [ 166 | "col_names = ['name', 'sex', 'first_appearance']\n", 167 | "\n", 168 | 
"marvel_df.columns = col_names\n", 169 | "marvel_df" 170 | ] 171 | }, 172 | { 173 | "cell_type": "markdown", 174 | "metadata": {}, 175 | "source": [ 176 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 177 | "\n", 178 | "### Add index names to the `marvel_df` (use the character name as index)\n" 179 | ] 180 | }, 181 | { 182 | "cell_type": "code", 183 | "execution_count": null, 184 | "metadata": {}, 185 | "outputs": [], 186 | "source": [ 187 | "# your code goes here\n" 188 | ] 189 | }, 190 | { 191 | "cell_type": "code", 192 | "execution_count": null, 193 | "metadata": { 194 | "cell_type": "solution" 195 | }, 196 | "outputs": [], 197 | "source": [ 198 | "marvel_df.index = marvel_df['name']\n", 199 | "marvel_df" 200 | ] 201 | }, 202 | { 203 | "cell_type": "markdown", 204 | "metadata": {}, 205 | "source": [ 206 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 207 | "\n", 208 | "### Drop the name column as it's now the index" 209 | ] 210 | }, 211 | { 212 | "cell_type": "code", 213 | "execution_count": null, 214 | "metadata": {}, 215 | "outputs": [], 216 | "source": [ 217 | "# your code goes here\n" 218 | ] 219 | }, 220 | { 221 | "cell_type": "code", 222 | "execution_count": null, 223 | "metadata": { 224 | "cell_type": "solution" 225 | }, 226 | "outputs": [], 227 | "source": [ 228 | "#marvel_df = marvel_df.drop(columns=['name'])\n", 229 | "marvel_df = marvel_df.drop(['name'], axis=1)\n", 230 | "marvel_df" 231 | ] 232 | }, 233 | { 234 | "cell_type": "markdown", 235 | "metadata": {}, 236 | "source": [ 237 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 238 | "\n", 239 | "### Drop 'Namor' and 'Hank Pym' rows\n" 240 | ] 241 | }, 242 | { 243 | "cell_type": "code", 244 | "execution_count": null, 245 | "metadata": {}, 246 | "outputs": [], 247 | 
"source": [ 248 | "# your code goes here\n" 249 | ] 250 | }, 251 | { 252 | "cell_type": "code", 253 | "execution_count": null, 254 | "metadata": { 255 | "cell_type": "solution" 256 | }, 257 | "outputs": [], 258 | "source": [ 259 | "marvel_df = marvel_df.drop(['Namor', 'Hank Pym'], axis=0)\n", 260 | "marvel_df" 261 | ] 262 | }, 263 | { 264 | "cell_type": "markdown", 265 | "metadata": {}, 266 | "source": [ 267 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n", 268 | "\n", 269 | "## DataFrame selection, slicing and indexation" 270 | ] 271 | }, 272 | { 273 | "cell_type": "markdown", 274 | "metadata": {}, 275 | "source": [ 276 | "### Show the first 5 elements on `marvel_df`\n", 277 | " " 278 | ] 279 | }, 280 | { 281 | "cell_type": "code", 282 | "execution_count": null, 283 | "metadata": {}, 284 | "outputs": [], 285 | "source": [ 286 | "# your code goes here\n" 287 | ] 288 | }, 289 | { 290 | "cell_type": "code", 291 | "execution_count": null, 292 | "metadata": { 293 | "cell_type": "solution" 294 | }, 295 | "outputs": [], 296 | "source": [ 297 | "#marvel_df.loc[['Spider-Man', 'Captain America', 'Wolverine', 'Iron Man', 'Thor'], :] # bad!\n", 298 | "#marvel_df.loc['Spider-Man': 'Thor', :]\n", 299 | "#marvel_df.iloc[0:5, :]\n", 300 | "#marvel_df.iloc[0:5,]\n", 301 | "marvel_df.iloc[:5,]\n", 302 | "#marvel_df.head()" 303 | ] 304 | }, 305 | { 306 | "cell_type": "markdown", 307 | "metadata": {}, 308 | "source": [ 309 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 310 | "\n", 311 | "### Show the last 5 elements on `marvel_df`\n" 312 | ] 313 | }, 314 | { 315 | "cell_type": "code", 316 | "execution_count": null, 317 | "metadata": {}, 318 | "outputs": [], 319 | "source": [ 320 | "# your code goes here\n" 321 | ] 322 | }, 323 | { 324 | "cell_type": "code", 325 | "execution_count": null, 326 | "metadata": { 327 | "cell_type": 
"solution" 328 | }, 329 | "outputs": [], 330 | "source": [ 331 | "#marvel_df.loc[['Hank Pym', 'Scarlet Witch', 'Wasp', 'Black Widow', 'Vision'], :] # bad!\n", 332 | "#marvel_df.loc['Hank Pym':'Vision', :]\n", 333 | "marvel_df.iloc[-5:,]\n", 334 | "#marvel_df.tail()" 335 | ] 336 | }, 337 | { 338 | "cell_type": "markdown", 339 | "metadata": {}, 340 | "source": [ 341 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 342 | "\n", 343 | "### Show just the sex of the first 5 elements on `marvel_df`" 344 | ] 345 | }, 346 | { 347 | "cell_type": "code", 348 | "execution_count": null, 349 | "metadata": {}, 350 | "outputs": [], 351 | "source": [ 352 | "# your code goes here\n" 353 | ] 354 | }, 355 | { 356 | "cell_type": "code", 357 | "execution_count": null, 358 | "metadata": { 359 | "cell_type": "solution" 360 | }, 361 | "outputs": [], 362 | "source": [ 363 | "#marvel_df.iloc[:5,]['sex'].to_frame()\n", 364 | "marvel_df.iloc[:5,].sex.to_frame()\n", 365 | "#marvel_df.head().sex.to_frame()" 366 | ] 367 | }, 368 | { 369 | "cell_type": "markdown", 370 | "metadata": {}, 371 | "source": [ 372 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 373 | "\n", 374 | "### Show the first_appearance of all middle elements on `marvel_df` " 375 | ] 376 | }, 377 | { 378 | "cell_type": "code", 379 | "execution_count": null, 380 | "metadata": {}, 381 | "outputs": [], 382 | "source": [ 383 | "# your code goes here\n" 384 | ] 385 | }, 386 | { 387 | "cell_type": "code", 388 | "execution_count": null, 389 | "metadata": { 390 | "cell_type": "solution" 391 | }, 392 | "outputs": [], 393 | "source": [ 394 | "marvel_df.iloc[1:-1,].first_appearance.to_frame()" 395 | ] 396 | }, 397 | { 398 | "cell_type": "markdown", 399 | "metadata": {}, 400 | "source": [ 401 | 
"![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 402 | "\n", 403 | "### Show the first and last elements on `marvel_df`\n" 404 | ] 405 | }, 406 | { 407 | "cell_type": "code", 408 | "execution_count": null, 409 | "metadata": {}, 410 | "outputs": [], 411 | "source": [ 412 | "# your code goes here\n" 413 | ] 414 | }, 415 | { 416 | "cell_type": "code", 417 | "execution_count": null, 418 | "metadata": { 419 | "cell_type": "solution" 420 | }, 421 | "outputs": [], 422 | "source": [ 423 | "#marvel_df.iloc[[0, -1],][['sex', 'first_appearance']]\n", 424 | "marvel_df.iloc[[0, -1],]" 425 | ] 426 | }, 427 | { 428 | "cell_type": "markdown", 429 | "metadata": {}, 430 | "source": [ 431 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n", 432 | "\n", 433 | "## DataFrame manipulation and operations" 434 | ] 435 | }, 436 | { 437 | "cell_type": "markdown", 438 | "metadata": {}, 439 | "source": [ 440 | "### Modify the `first_appearance` of 'Vision' to year 1964" 441 | ] 442 | }, 443 | { 444 | "cell_type": "code", 445 | "execution_count": null, 446 | "metadata": {}, 447 | "outputs": [], 448 | "source": [ 449 | "# your code goes here\n" 450 | ] 451 | }, 452 | { 453 | "cell_type": "code", 454 | "execution_count": null, 455 | "metadata": { 456 | "cell_type": "solution" 457 | }, 458 | "outputs": [], 459 | "source": [ 460 | "marvel_df.loc['Vision', 'first_appearance'] = 1964\n", 461 | "\n", 462 | "marvel_df" 463 | ] 464 | }, 465 | { 466 | "cell_type": "markdown", 467 | "metadata": {}, 468 | "source": [ 469 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 470 | "\n", 471 | "### Add a new column to `marvel_df` called 'years_since' with the years since `first_appearance`\n" 472 | ] 473 | }, 474 | { 475 | "cell_type": "code", 476 | "execution_count": null, 477 | "metadata": 
{}, 478 | "outputs": [], 479 | "source": [ 480 | "# your code goes here\n" 481 | ] 482 | }, 483 | { 484 | "cell_type": "code", 485 | "execution_count": null, 486 | "metadata": { 487 | "cell_type": "solution" 488 | }, 489 | "outputs": [], 490 | "source": [ 491 | "marvel_df['years_since'] = 2018 - marvel_df['first_appearance']\n", 492 | "\n", 493 | "marvel_df" 494 | ] 495 | }, 496 | { 497 | "cell_type": "markdown", 498 | "metadata": {}, 499 | "source": [ 500 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n", 501 | "\n", 502 | "## DataFrame boolean arrays (also called masks)" 503 | ] 504 | }, 505 | { 506 | "cell_type": "markdown", 507 | "metadata": {}, 508 | "source": [ 509 | "### Given the `marvel_df` pandas DataFrame, make a mask showing the female characters\n" 510 | ] 511 | }, 512 | { 513 | "cell_type": "code", 514 | "execution_count": null, 515 | "metadata": {}, 516 | "outputs": [], 517 | "source": [ 518 | "# your code goes here\n" 519 | ] 520 | }, 521 | { 522 | "cell_type": "code", 523 | "execution_count": null, 524 | "metadata": { 525 | "cell_type": "solution" 526 | }, 527 | "outputs": [], 528 | "source": [ 529 | "mask = marvel_df['sex'] == 'female'\n", 530 | "\n", 531 | "mask" 532 | ] 533 | }, 534 | { 535 | "cell_type": "markdown", 536 | "metadata": {}, 537 | "source": [ 538 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 539 | "\n", 540 | "### Given the `marvel_df` pandas DataFrame, get the male characters\n" 541 | ] 542 | }, 543 | { 544 | "cell_type": "code", 545 | "execution_count": null, 546 | "metadata": {}, 547 | "outputs": [], 548 | "source": [ 549 | "# your code goes here\n" 550 | ] 551 | }, 552 | { 553 | "cell_type": "code", 554 | "execution_count": null, 555 | "metadata": { 556 | "cell_type": "solution" 557 | }, 558 | "outputs": [], 559 | "source": [ 560 | "mask = marvel_df['sex'] == 'male'\n", 561 | 
"\n", 562 | "marvel_df[mask]" 563 | ] 564 | }, 565 | { 566 | "cell_type": "markdown", 567 | "metadata": {}, 568 | "source": [ 569 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 570 | "\n", 571 | "### Given the `marvel_df` pandas DataFrame, get the characters with `first_appearance` after 1970\n" 572 | ] 573 | }, 574 | { 575 | "cell_type": "code", 576 | "execution_count": null, 577 | "metadata": {}, 578 | "outputs": [], 579 | "source": [ 580 | "# your code goes here\n" 581 | ] 582 | }, 583 | { 584 | "cell_type": "code", 585 | "execution_count": null, 586 | "metadata": { 587 | "cell_type": "solution" 588 | }, 589 | "outputs": [], 590 | "source": [ 591 | "mask = marvel_df['first_appearance'] > 1970\n", 592 | "\n", 593 | "marvel_df[mask]" 594 | ] 595 | }, 596 | { 597 | "cell_type": "markdown", 598 | "metadata": {}, 599 | "source": [ 600 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 601 | "\n", 602 | "### Given the `marvel_df` pandas DataFrame, get the female characters with `first_appearance` after 1970" 603 | ] 604 | }, 605 | { 606 | "cell_type": "code", 607 | "execution_count": null, 608 | "metadata": {}, 609 | "outputs": [], 610 | "source": [ 611 | "# your code goes here\n" 612 | ] 613 | }, 614 | { 615 | "cell_type": "code", 616 | "execution_count": null, 617 | "metadata": { 618 | "cell_type": "solution", 619 | "scrolled": true 620 | }, 621 | "outputs": [], 622 | "source": [ 623 | "mask = (marvel_df['sex'] == 'female') & (marvel_df['first_appearance'] > 1970)\n", 624 | "\n", 625 | "marvel_df[mask]" 626 | ] 627 | }, 628 | { 629 | "cell_type": "markdown", 630 | "metadata": {}, 631 | "source": [ 632 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n", 633 | "\n", 634 | "## DataFrame summary statistics" 635 | ] 636 | }, 637 | { 638 | 
"cell_type": "markdown", 639 | "metadata": {}, 640 | "source": [ 641 | "### Show basic statistics of `marvel_df`" 642 | ] 643 | }, 644 | { 645 | "cell_type": "code", 646 | "execution_count": null, 647 | "metadata": {}, 648 | "outputs": [], 649 | "source": [ 650 | "# your code goes here\n" 651 | ] 652 | }, 653 | { 654 | "cell_type": "code", 655 | "execution_count": null, 656 | "metadata": { 657 | "cell_type": "solution" 658 | }, 659 | "outputs": [], 660 | "source": [ 661 | "marvel_df.describe()" 662 | ] 663 | }, 664 | { 665 | "cell_type": "markdown", 666 | "metadata": {}, 667 | "source": [ 668 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 669 | "\n", 670 | "### Given the `marvel_df` pandas DataFrame, show the mean value of `first_appearance`" 671 | ] 672 | }, 673 | { 674 | "cell_type": "code", 675 | "execution_count": null, 676 | "metadata": {}, 677 | "outputs": [], 678 | "source": [ 679 | "# your code goes here\n" 680 | ] 681 | }, 682 | { 683 | "cell_type": "code", 684 | "execution_count": null, 685 | "metadata": { 686 | "cell_type": "solution" 687 | }, 688 | "outputs": [], 689 | "source": [ 690 | "\n", 691 | "#np.mean(marvel_df.first_appearance)\n", 692 | "marvel_df.first_appearance.mean()" 693 | ] 694 | }, 695 | { 696 | "cell_type": "markdown", 697 | "metadata": {}, 698 | "source": [ 699 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 700 | "\n", 701 | "### Given the `marvel_df` pandas DataFrame, show the min value of `first_appearance`\n" 702 | ] 703 | }, 704 | { 705 | "cell_type": "code", 706 | "execution_count": null, 707 | "metadata": {}, 708 | "outputs": [], 709 | "source": [ 710 | "# your code goes here\n" 711 | ] 712 | }, 713 | { 714 | "cell_type": "code", 715 | "execution_count": null, 716 | "metadata": { 717 | "cell_type": "solution" 718 | }, 719 | "outputs": [], 720 | "source": [ 721 | 
"#np.min(marvel_df.first_appearance)\n", 722 | "marvel_df.first_appearance.min()" 723 | ] 724 | }, 725 | { 726 | "cell_type": "markdown", 727 | "metadata": {}, 728 | "source": [ 729 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 730 | "\n", 731 | "### Given the `marvel_df` pandas DataFrame, get the characters with the min value of `first_appearance`" 732 | ] 733 | }, 734 | { 735 | "cell_type": "code", 736 | "execution_count": null, 737 | "metadata": {}, 738 | "outputs": [], 739 | "source": [ 740 | "# your code goes here\n" 741 | ] 742 | }, 743 | { 744 | "cell_type": "code", 745 | "execution_count": null, 746 | "metadata": { 747 | "cell_type": "solution" 748 | }, 749 | "outputs": [], 750 | "source": [ 751 | "mask = marvel_df['first_appearance'] == marvel_df.first_appearance.min()\n", 752 | "marvel_df[mask]" 753 | ] 754 | }, 755 | { 756 | "cell_type": "markdown", 757 | "metadata": {}, 758 | "source": [ 759 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n", 760 | "\n", 761 | "## DataFrame basic plottings" 762 | ] 763 | }, 764 | { 765 | "cell_type": "markdown", 766 | "metadata": {}, 767 | "source": [ 768 | "### Reset index names of `marvel_df`\n" 769 | ] 770 | }, 771 | { 772 | "cell_type": "code", 773 | "execution_count": null, 774 | "metadata": {}, 775 | "outputs": [], 776 | "source": [ 777 | "# your code goes here\n" 778 | ] 779 | }, 780 | { 781 | "cell_type": "code", 782 | "execution_count": null, 783 | "metadata": { 784 | "cell_type": "solution" 785 | }, 786 | "outputs": [], 787 | "source": [ 788 | "marvel_df = marvel_df.reset_index()\n", 789 | "\n", 790 | "marvel_df" 791 | ] 792 | }, 793 | { 794 | "cell_type": "markdown", 795 | "metadata": {}, 796 | "source": [ 797 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 798 | "\n", 799 | "### 
Plot the values of `first_appearance`\n" 800 | ] 801 | }, 802 | { 803 | "cell_type": "code", 804 | "execution_count": null, 805 | "metadata": {}, 806 | "outputs": [], 807 | "source": [ 808 | "# your code goes here\n" 809 | ] 810 | }, 811 | { 812 | "cell_type": "code", 813 | "execution_count": null, 814 | "metadata": { 815 | "cell_type": "solution" 816 | }, 817 | "outputs": [], 818 | "source": [ 819 | "#plt.plot(marvel_df.index, marvel_df.first_appearance)\n", 820 | "marvel_df.first_appearance.plot()" 821 | ] 822 | }, 823 | { 824 | "cell_type": "markdown", 825 | "metadata": {}, 826 | "source": [ 827 | "![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)\n", 828 | "\n", 829 | "### Plot a histogram (plot.hist) with values of `first_appearance`\n" 830 | ] 831 | }, 832 | { 833 | "cell_type": "code", 834 | "execution_count": null, 835 | "metadata": {}, 836 | "outputs": [], 837 | "source": [ 838 | "# your code goes here\n" 839 | ] 840 | }, 841 | { 842 | "cell_type": "code", 843 | "execution_count": null, 844 | "metadata": { 845 | "cell_type": "solution" 846 | }, 847 | "outputs": [], 848 | "source": [ 849 | "\n", 850 | "plt.hist(marvel_df.first_appearance)" 851 | ] 852 | }, 853 | { 854 | "cell_type": "markdown", 855 | "metadata": {}, 856 | "source": [ 857 | "![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)\n" 858 | ] 859 | } 860 | ], 861 | "metadata": { 862 | "kernelspec": { 863 | "display_name": "Python 3", 864 | "language": "python", 865 | "name": "python3" 866 | }, 867 | "language_info": { 868 | "codemirror_mode": { 869 | "name": "ipython", 870 | "version": 3 871 | }, 872 | "file_extension": ".py", 873 | "mimetype": "text/x-python", 874 | "name": "python", 875 | "nbconvert_exporter": "python", 876 | "pygments_lexer": "ipython3", 877 | "version": "3.8.1" 878 | } 879 | }, 880 | "nbformat": 4, 881 | "nbformat_minor": 4 882 | } 883 | 
-------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | #### We're in the process of adapting these notebooks into interactive projects in [DataWars](https://www.datawars.io/?utm_source=fccrepo&utm_medium=intro-to-pandas). Sign up now, it's [completely free](https://www.datawars.io/?utm_source=fccrepo&utm_medium=intro-to-pandas). 3 | 4 | Stay tuned! Have any questions? [Join our Discord](https://discord.gg/DSTe8tY38T) 5 | 6 | --- 7 | 8 | Created by Santiago Basulto. Connect with me on [X](https://x.com/santiagobasulto) or [LinkedIn](https://www.linkedin.com/in/santiagobasulto/) 9 | -------------------------------------------------------------------------------- /data/.ipynb_checkpoints/btc-market-price-checkpoint.csv: -------------------------------------------------------------------------------- 1 | 2017-04-02 00:00:00,1099.169125 2 | 2017-04-03 00:00:00,1141.813 3 | 2017-04-04 00:00:00,1141.6003625 4 | 2017-04-05 00:00:00,1133.0793142857142 5 | 2017-04-06 00:00:00,1196.3079375 6 | 2017-04-07 00:00:00,1190.45425 7 | 2017-04-08 00:00:00,1181.1498375 8 | 2017-04-09 00:00:00,1208.8005 9 | 2017-04-10 00:00:00,1207.744875 10 | 2017-04-11 00:00:00,1226.6170375 11 | 2017-04-12 00:00:00,1218.92205 12 | 2017-04-13 00:00:00,1180.0237125 13 | 2017-04-14 00:00:00,1185.2600571428572 14 | 2017-04-15 00:00:00,1184.8806714285713 15 | 2017-04-16 00:00:00,1186.9274125 16 | 2017-04-17 00:00:00,1205.634875 17 | 2017-04-18 00:00:00,1216.1867428571427 18 | 2017-04-19 00:00:00,1217.9300875 19 | 2017-04-20 00:00:00,1241.6863250000001 20 | 2017-04-21 00:00:00,1258.3614125 21 | 2017-04-22 00:00:00,1261.311225 22 | 2017-04-23 00:00:00,1257.9881125 23 | 2017-04-24 00:00:00,1262.902775 24 | 2017-04-25 00:00:00,1279.4146875000001 25 | 2017-04-26 00:00:00,1309.109875 26 | 2017-04-27 00:00:00,1345.3539125 27 | 2017-04-28 00:00:00,1331.2944285714286 28 | 2017-04-29 
00:00:00,1334.9790375 29 | 2017-04-30 00:00:00,1353.0045 30 | 2017-05-01 00:00:00,1417.1728125 31 | 2017-05-02 00:00:00,1452.0762875 32 | 2017-05-03 00:00:00,1507.5768571428573 33 | 2017-05-04 00:00:00,1508.292125 34 | 2017-05-05 00:00:00,1533.3350714285714 35 | 2017-05-06 00:00:00,1560.4102 36 | 2017-05-07 00:00:00,1535.8684285714285 37 | 2017-05-08 00:00:00,1640.619225 38 | 2017-05-09 00:00:00,1721.2849714285715 39 | 2017-05-10 00:00:00,1762.88625 40 | 2017-05-11 00:00:00,1820.9905625 41 | 2017-05-12 00:00:00,1720.4785 42 | 2017-05-13 00:00:00,1771.9200125 43 | 2017-05-14 00:00:00,1776.3165 44 | 2017-05-15 00:00:00,1723.1269375 45 | 2017-05-16 00:00:00,1739.031975 46 | 2017-05-17 00:00:00,1807.4850625 47 | 2017-05-18 00:00:00,1899.0828875 48 | 2017-05-19 00:00:00,1961.5204875 49 | 2017-05-20 00:00:00,2052.9097875 50 | 2017-05-21 00:00:00,2046.5344625 51 | 2017-05-22 00:00:00,2090.6623125 52 | 2017-05-23 00:00:00,2287.7102875 53 | 2017-05-24 00:00:00,2379.1938333333333 54 | 2017-05-25 00:00:00,2387.2062857142855 55 | 2017-05-26 00:00:00,2211.976857142857 56 | 2017-05-27 00:00:00,2014.0529625 57 | 2017-05-28 00:00:00,2192.9808 58 | 2017-05-29 00:00:00,2275.9307 59 | 2017-05-30 00:00:00,2239.2053428571426 60 | 2017-05-31 00:00:00,2285.9339142857143 61 | 2017-06-01 00:00:00,2399.2426714285716 62 | 2017-06-02 00:00:00,2446.142414285714 63 | 2017-06-03 00:00:00,2525.7651584699997 64 | 2017-06-04 00:00:00,2516.173142857143 65 | 2017-06-05 00:00:00,2698.3138125 66 | 2017-06-06 00:00:00,2883.3136966371426 67 | 2017-06-07 00:00:00,2664.9208625 68 | 2017-06-08 00:00:00,2792.9991875 69 | 2017-06-09 00:00:00,2827.4913 70 | 2017-06-10 00:00:00,2845.3728571428574 71 | 2017-06-11 00:00:00,2961.8296124999997 72 | 2017-06-12 00:00:00,2657.6750625 73 | 2017-06-13 00:00:00,2748.185085714286 74 | 2017-06-14 00:00:00,2447.0415625 75 | 2017-06-15 00:00:00,2442.48025 76 | 2017-06-16 00:00:00,2464.9598142857144 77 | 2017-06-17 00:00:00,2665.927 78 | 2017-06-18 00:00:00,2507.389252144286 
79 | 2017-06-19 00:00:00,2617.2102625 80 | 2017-06-20 00:00:00,2754.97825 81 | 2017-06-21 00:00:00,2671.04325 82 | 2017-06-22 00:00:00,2727.2880125 83 | 2017-06-23 00:00:00,2710.4122857142856 84 | 2017-06-24 00:00:00,2589.1648875 85 | 2017-06-25 00:00:00,2512.3662857142854 86 | 2017-06-26 00:00:00,2436.4510571428573 87 | 2017-06-27 00:00:00,2517.9031142857143 88 | 2017-06-28 00:00:00,2585.349185714286 89 | 2017-06-29 00:00:00,2544.414475 90 | 2017-06-30 00:00:00,2477.641375 91 | 2017-07-01 00:00:00,2434.0778625 92 | 2017-07-02 00:00:00,2501.191342857143 93 | 2017-07-03 00:00:00,2561.225428571429 94 | 2017-07-04 00:00:00,2599.7298375 95 | 2017-07-05 00:00:00,2619.1875030042856 96 | 2017-07-06 00:00:00,2609.96775 97 | 2017-07-07 00:00:00,2491.201214285714 98 | 2017-07-08 00:00:00,2562.1306624999997 99 | 2017-07-09 00:00:00,2536.2389375 100 | 2017-07-10 00:00:00,2366.1701428571428 101 | 2017-07-11 00:00:00,2369.8621285714285 102 | 2017-07-12 00:00:00,2385.7485714285717 103 | 2017-07-13 00:00:00,2354.7834166666667 104 | 2017-07-14 00:00:00,2190.947833333333 105 | 2017-07-15 00:00:00,2058.9955999999997 106 | 2017-07-16 00:00:00,1931.2143 107 | 2017-07-17 00:00:00,2176.6234875 108 | 2017-07-18 00:00:00,2320.12225 109 | 2017-07-19 00:00:00,2264.7657 110 | 2017-07-20 00:00:00,2898.1884166666664 111 | 2017-07-21 00:00:00,2682.1953625 112 | 2017-07-22 00:00:00,2807.609857142857 113 | 2017-07-23 00:00:00,2725.549716666667 114 | 2017-07-24 00:00:00,2751.821028571429 115 | 2017-07-25 00:00:00,2560.9979166666667 116 | 2017-07-26 00:00:00,2495.028585714286 117 | 2017-07-27 00:00:00,2647.625 118 | 2017-07-28 00:00:00,2781.636583333333 119 | 2017-07-29 00:00:00,2722.512785714286 120 | 2017-07-30 00:00:00,2745.955416666666 121 | 2017-07-31 00:00:00,2866.431666666667 122 | 2017-08-01 00:00:00,2710.4130666666665 123 | 2017-08-02 00:00:00,2693.6339833333336 124 | 2017-08-03 00:00:00,2794.117716666666 125 | 2017-08-04 00:00:00,2873.8510833333335 126 | 2017-08-05 
00:00:00,3218.1150166666666 127 | 2017-08-06 00:00:00,3252.5625333333332 128 | 2017-08-07 00:00:00,3407.2268333333336 129 | 2017-08-08 00:00:00,3457.374333333333 130 | 2017-08-09 00:00:00,3357.326316666667 131 | 2017-08-10 00:00:00,3424.4042000000004 132 | 2017-08-11 00:00:00,3632.5066666666667 133 | 2017-08-12 00:00:00,3852.8029142857145 134 | 2017-08-13 00:00:00,4125.54802 135 | 2017-08-14 00:00:00,4282.992 136 | 2017-08-15 00:00:00,4217.028328571429 137 | 2017-08-16 00:00:00,4360.876871428572 138 | 2017-08-17 00:00:00,4328.725716666667 139 | 2017-08-18 00:00:00,4130.440066666667 140 | 2017-08-19 00:00:00,4222.662214285714 141 | 2017-08-20 00:00:00,4157.958033333333 142 | 2017-08-21 00:00:00,4043.722 143 | 2017-08-22 00:00:00,4082.180983333333 144 | 2017-08-23 00:00:00,4174.95 145 | 2017-08-24 00:00:00,4340.316716666667 146 | 2017-08-25 00:00:00,4363.05445 147 | 2017-08-26 00:00:00,4360.5133166666665 148 | 2017-08-27 00:00:00,4354.308333333333 149 | 2017-08-28 00:00:00,4391.673516666667 150 | 2017-08-29 00:00:00,4607.98545 151 | 2017-08-30 00:00:00,4594.98785 152 | 2017-08-31 00:00:00,4748.255 153 | 2017-09-01 00:00:00,4911.740016666667 154 | 2017-09-02 00:00:00,4580.387479999999 155 | 2017-09-03 00:00:00,4648.159983333334 156 | 2017-09-04 00:00:00,4344.0983166666665 157 | 2017-09-05 00:00:00,4488.72014 158 | 2017-09-06 00:00:00,4641.822016666666 159 | 2017-09-07 00:00:00,4654.6585000000005 160 | 2017-09-08 00:00:00,4310.750183333334 161 | 2017-09-09 00:00:00,4375.55952 162 | 2017-09-10 00:00:00,4329.955 163 | 2017-09-11 00:00:00,4248.090016666666 164 | 2017-09-12 00:00:00,4219.036616666667 165 | 2017-09-13 00:00:00,3961.2712666666666 166 | 2017-09-14 00:00:00,3319.6299999999997 167 | 2017-09-15 00:00:00,3774.2652833333336 168 | 2017-09-16 00:00:00,3763.62604 169 | 2017-09-17 00:00:00,3746.060783333333 170 | 2017-09-18 00:00:00,4093.316666666667 171 | 2017-09-19 00:00:00,3943.4133333333334 172 | 2017-09-20 00:00:00,3977.5616666666665 173 | 2017-09-21 
00:00:00,3658.8981833333332 174 | 2017-09-22 00:00:00,3637.5025499999997 175 | 2017-09-23 00:00:00,3776.3869 176 | 2017-09-24 00:00:00,3703.0406500000004 177 | 2017-09-25 00:00:00,3942.5550000000003 178 | 2017-09-26 00:00:00,3910.3073833333333 179 | 2017-09-27 00:00:00,4202.554983333333 180 | 2017-09-28 00:00:00,4201.98905 181 | 2017-09-29 00:00:00,4193.574666666666 182 | 2017-09-30 00:00:00,4335.368316666667 183 | 2017-10-01 00:00:00,4360.722966666667 184 | 2017-10-02 00:00:00,4386.88375 185 | 2017-10-03 00:00:00,4293.3066 186 | 2017-10-04 00:00:00,4225.175 187 | 2017-10-05 00:00:00,4338.852 188 | 2017-10-06 00:00:00,4345.6033333333335 189 | 2017-10-07 00:00:00,4376.191666666667 190 | 2017-10-08 00:00:00,4602.280883333334 191 | 2017-10-09 00:00:00,4777.967816666666 192 | 2017-10-10 00:00:00,4782.28 193 | 2017-10-11 00:00:00,4819.485766666667 194 | 2017-10-12 00:00:00,5325.130683333333 195 | 2017-10-13 00:00:00,5563.806566666666 196 | 2017-10-14 00:00:00,5739.438733333333 197 | 2017-10-15 00:00:00,5647.311666666667 198 | 2017-10-16 00:00:00,5711.205866666667 199 | 2017-10-17 00:00:00,5603.71294 200 | 2017-10-18 00:00:00,5546.176100000001 201 | 2017-10-19 00:00:00,5727.6335 202 | 2017-10-20 00:00:00,5979.45984 203 | 2017-10-21 00:00:00,6020.371683333334 204 | 2017-10-22 00:00:00,5983.184550000001 205 | 2017-10-23 00:00:00,5876.079866666667 206 | 2017-10-24 00:00:00,5505.827766666666 207 | 2017-10-25 00:00:00,5669.622533333334 208 | 2017-10-26 00:00:00,5893.138416666666 209 | 2017-10-27 00:00:00,5772.504983333333 210 | 2017-10-28 00:00:00,5776.6969500000005 211 | 2017-10-29 00:00:00,6155.43402 212 | 2017-10-30 00:00:00,6105.87422 213 | 2017-10-31 00:00:00,6388.645166666666 214 | 2017-11-01 00:00:00,6665.306683333333 215 | 2017-11-02 00:00:00,7068.020100000001 216 | 2017-11-03 00:00:00,7197.72006 217 | 2017-11-04 00:00:00,7437.543316666666 218 | 2017-11-05 00:00:00,7377.012366666667 219 | 2017-11-06 00:00:00,6989.071666666667 220 | 2017-11-07 
00:00:00,7092.127233333333 221 | 2017-11-08 00:00:00,7415.878250000001 222 | 2017-11-09 00:00:00,7158.03706 223 | 2017-11-10 00:00:00,6719.39785 224 | 2017-11-11 00:00:00,6362.851033333333 225 | 2017-11-12 00:00:00,5716.301583333334 226 | 2017-11-13 00:00:00,6550.227533333334 227 | 2017-11-14 00:00:00,6635.412633333333 228 | 2017-11-15 00:00:00,7301.42992 229 | 2017-11-16 00:00:00,7815.0307 230 | 2017-11-17 00:00:00,7786.884366666666 231 | 2017-11-18 00:00:00,7817.1403833333325 232 | 2017-11-19 00:00:00,8007.654066666667 233 | 2017-11-20 00:00:00,8255.596816666666 234 | 2017-11-21 00:00:00,8059.8 235 | 2017-11-22 00:00:00,8268.035 236 | 2017-11-23 00:00:00,8148.95 237 | 2017-11-24 00:00:00,8250.978333333334 238 | 2017-11-25 00:00:00,8707.407266666667 239 | 2017-11-26 00:00:00,9284.1438 240 | 2017-11-27 00:00:00,9718.29505 241 | 2017-11-28 00:00:00,9952.50882 242 | 2017-11-29 00:00:00,9879.328333333333 243 | 2017-11-30 00:00:00,10147.372 244 | 2017-12-01 00:00:00,10883.912 245 | 2017-12-02 00:00:00,11071.368333333332 246 | 2017-12-03 00:00:00,11332.622 247 | 2017-12-04 00:00:00,11584.83 248 | 2017-12-05 00:00:00,11878.433333333334 249 | 2017-12-06 00:00:00,13540.980000000001 250 | 2017-12-07 00:00:00,16501.971666666668 251 | 2017-12-08 00:00:00,16007.436666666666 252 | 2017-12-09 00:00:00,15142.834152123332 253 | 2017-12-10 00:00:00,14869.805 254 | 2017-12-11 00:00:00,16762.116666666665 255 | 2017-12-12 00:00:00,17276.393333333333 256 | 2017-12-13 00:00:00,16808.366666666665 257 | 2017-12-14 00:00:00,16678.892 258 | 2017-12-15 00:00:00,17771.899999999998 259 | 2017-12-16 00:00:00,19498.683333333334 260 | 2017-12-17 00:00:00,19289.785 261 | 2017-12-18 00:00:00,18961.856666666667 262 | 2017-12-19 00:00:00,17737.111666666668 263 | 2017-12-20 00:00:00,16026.271666666667 264 | 2017-12-21 00:00:00,16047.51 265 | 2017-12-22 00:00:00,15190.945 266 | 2017-12-23 00:00:00,15360.261666666667 267 | 2017-12-24 00:00:00,13949.175000000001 268 | 2017-12-25 
00:00:00,14119.028333333334 269 | 2017-12-26 00:00:00,15999.048333333332 270 | 2017-12-27 00:00:00,15589.321666666665 271 | 2017-12-28 00:00:00,14380.581666666667 272 | 2017-12-29 00:00:00,14640.14 273 | 2017-12-30 00:00:00,13215.573999999999 274 | 2017-12-31 00:00:00,14165.574999999999 275 | 2018-01-01 00:00:00,13812.186666666666 276 | 2018-01-02 00:00:00,15005.856666666667 277 | 2018-01-03 00:00:00,15053.261666666665 278 | 2018-01-04 00:00:00,15199.355000000001 279 | 2018-01-05 00:00:00,17174.12 280 | 2018-01-06 00:00:00,17319.198 281 | 2018-01-07 00:00:00,16651.471666666668 282 | 2018-01-08 00:00:00,15265.906666666668 283 | 2018-01-09 00:00:00,14714.253333333334 284 | 2018-01-10 00:00:00,15126.398333333333 285 | 2018-01-11 00:00:00,13296.794 286 | 2018-01-12 00:00:00,13912.882000000001 287 | 2018-01-13 00:00:00,14499.773333333333 288 | 2018-01-14 00:00:00,13852.92 289 | 2018-01-15 00:00:00,14012.196 290 | 2018-01-16 00:00:00,11180.998333333331 291 | 2018-01-17 00:00:00,11116.946666666669 292 | 2018-01-18 00:00:00,11345.423333333332 293 | 2018-01-19 00:00:00,11422.44 294 | 2018-01-20 00:00:00,12950.793333333333 295 | 2018-01-21 00:00:00,11505.228 296 | 2018-01-22 00:00:00,10544.593333333332 297 | 2018-01-23 00:00:00,11223.064 298 | 2018-01-24 00:00:00,11282.258333333333 299 | 2018-01-25 00:00:00,11214.44 300 | 2018-01-26 00:00:00,10969.815 301 | 2018-01-27 00:00:00,11524.776666666667 302 | 2018-01-28 00:00:00,11765.71 303 | 2018-01-29 00:00:00,11212.654999999999 304 | 2018-01-30 00:00:00,10184.061666666666 305 | 2018-01-31 00:00:00,10125.013333333334 306 | 2018-02-01 00:00:00,9083.258333333333 307 | 2018-02-02 00:00:00,8901.901666666667 308 | 2018-02-03 00:00:00,9076.678333333333 309 | 2018-02-04 00:00:00,8400.648333333333 310 | 2018-02-05 00:00:00,6838.816666666667 311 | 2018-02-06 00:00:00,7685.633333333334 312 | 2018-02-07 00:00:00,8099.958333333333 313 | 2018-02-08 00:00:00,8240.536666666667 314 | 2018-02-09 00:00:00,8535.516666666668 315 | 2018-02-10 
00:00:00,8319.876566184 316 | 2018-02-11 00:00:00,8343.455 317 | 2018-02-12 00:00:00,8811.343333333332 318 | 2018-02-13 00:00:00,8597.7675 319 | 2018-02-14 00:00:00,9334.633333333333 320 | 2018-02-15 00:00:00,9977.154 321 | 2018-02-16 00:00:00,10127.161666666667 322 | 2018-02-17 00:00:00,10841.991666666667 323 | 2018-02-18 00:00:00,10503.298333333334 324 | 2018-02-19 00:00:00,11110.964999999998 325 | 2018-02-20 00:00:00,11390.391666666668 326 | 2018-02-21 00:00:00,10532.791666666666 327 | 2018-02-22 00:00:00,9931.071666666667 328 | 2018-02-23 00:00:00,10162.116666666667 329 | 2018-02-24 00:00:00,9697.956 330 | 2018-02-25 00:00:00,9696.593333333332 331 | 2018-02-26 00:00:00,10348.603333333334 332 | 2018-02-27 00:00:00,10763.883333333333 333 | 2018-02-28 00:00:00,10370.164999999999 334 | 2018-03-01 00:00:00,11009.381666666668 335 | 2018-03-02 00:00:00,11055.815 336 | 2018-03-03 00:00:00,11326.948333333334 337 | 2018-03-04 00:00:00,11430.181666666665 338 | 2018-03-05 00:00:00,11595.54 339 | 2018-03-06 00:00:00,10763.198333333334 340 | 2018-03-07 00:00:00,10118.058 341 | 2018-03-08 00:00:00,9429.111666666666 342 | 2018-03-09 00:00:00,9089.278333333334 343 | 2018-03-10 00:00:00,8746.002 344 | 2018-03-11 00:00:00,9761.396666666666 345 | 2018-03-12 00:00:00,9182.843333333332 346 | 2018-03-13 00:00:00,9154.699999999999 347 | 2018-03-14 00:00:00,8151.531666666667 348 | 2018-03-15 00:00:00,8358.121666666666 349 | 2018-03-16 00:00:00,8530.402 350 | 2018-03-17 00:00:00,7993.674643641666 351 | 2018-03-18 00:00:00,8171.415 352 | 2018-03-19 00:00:00,8412.033333333333 353 | 2018-03-20 00:00:00,8986.948333333334 354 | 2018-03-21 00:00:00,8947.753333333334 355 | 2018-03-22 00:00:00,8690.408333333333 356 | 2018-03-23 00:00:00,8686.826666666666 357 | 2018-03-24 00:00:00,8662.378333333334 358 | 2018-03-25 00:00:00,8617.296666666667 359 | 2018-03-26 00:00:00,8197.548333333334 360 | 2018-03-27 00:00:00,7876.195 361 | 2018-03-28 00:00:00,7960.38 362 | 2018-03-29 00:00:00,7172.28 363 | 
2018-03-30 00:00:00,6882.531666666667 364 | 2018-03-31 00:00:00,6935.48 365 | 2018-04-01 00:00:00,6794.105 366 | -------------------------------------------------------------------------------- /data/btc-market-price.csv: -------------------------------------------------------------------------------- 1 | 2017-04-02 00:00:00,1099.169125 2 | 2017-04-03 00:00:00,1141.813 3 | 2017-04-04 00:00:00,1141.6003625 4 | 2017-04-05 00:00:00,1133.0793142857142 5 | 2017-04-06 00:00:00,1196.3079375 6 | 2017-04-07 00:00:00,1190.45425 7 | 2017-04-08 00:00:00,1181.1498375 8 | 2017-04-09 00:00:00,1208.8005 9 | 2017-04-10 00:00:00,1207.744875 10 | 2017-04-11 00:00:00,1226.6170375 11 | 2017-04-12 00:00:00,1218.92205 12 | 2017-04-13 00:00:00,1180.0237125 13 | 2017-04-14 00:00:00,1185.2600571428572 14 | 2017-04-15 00:00:00,1184.8806714285713 15 | 2017-04-16 00:00:00,1186.9274125 16 | 2017-04-17 00:00:00,1205.634875 17 | 2017-04-18 00:00:00,1216.1867428571427 18 | 2017-04-19 00:00:00,1217.9300875 19 | 2017-04-20 00:00:00,1241.6863250000001 20 | 2017-04-21 00:00:00,1258.3614125 21 | 2017-04-22 00:00:00,1261.311225 22 | 2017-04-23 00:00:00,1257.9881125 23 | 2017-04-24 00:00:00,1262.902775 24 | 2017-04-25 00:00:00,1279.4146875000001 25 | 2017-04-26 00:00:00,1309.109875 26 | 2017-04-27 00:00:00,1345.3539125 27 | 2017-04-28 00:00:00,1331.2944285714286 28 | 2017-04-29 00:00:00,1334.9790375 29 | 2017-04-30 00:00:00,1353.0045 30 | 2017-05-01 00:00:00,1417.1728125 31 | 2017-05-02 00:00:00,1452.0762875 32 | 2017-05-03 00:00:00,1507.5768571428573 33 | 2017-05-04 00:00:00,1508.292125 34 | 2017-05-05 00:00:00,1533.3350714285714 35 | 2017-05-06 00:00:00,1560.4102 36 | 2017-05-07 00:00:00,1535.8684285714285 37 | 2017-05-08 00:00:00,1640.619225 38 | 2017-05-09 00:00:00,1721.2849714285715 39 | 2017-05-10 00:00:00,1762.88625 40 | 2017-05-11 00:00:00,1820.9905625 41 | 2017-05-12 00:00:00,1720.4785 42 | 2017-05-13 00:00:00,1771.9200125 43 | 2017-05-14 00:00:00,1776.3165 44 | 2017-05-15 00:00:00,1723.1269375 
45 | 2017-05-16 00:00:00,1739.031975 46 | 2017-05-17 00:00:00,1807.4850625 47 | 2017-05-18 00:00:00,1899.0828875 48 | 2017-05-19 00:00:00,1961.5204875 49 | 2017-05-20 00:00:00,2052.9097875 50 | 2017-05-21 00:00:00,2046.5344625 51 | 2017-05-22 00:00:00,2090.6623125 52 | 2017-05-23 00:00:00,2287.7102875 53 | 2017-05-24 00:00:00,2379.1938333333333 54 | 2017-05-25 00:00:00,2387.2062857142855 55 | 2017-05-26 00:00:00,2211.976857142857 56 | 2017-05-27 00:00:00,2014.0529625 57 | 2017-05-28 00:00:00,2192.9808 58 | 2017-05-29 00:00:00,2275.9307 59 | 2017-05-30 00:00:00,2239.2053428571426 60 | 2017-05-31 00:00:00,2285.9339142857143 61 | 2017-06-01 00:00:00,2399.2426714285716 62 | 2017-06-02 00:00:00,2446.142414285714 63 | 2017-06-03 00:00:00,2525.7651584699997 64 | 2017-06-04 00:00:00,2516.173142857143 65 | 2017-06-05 00:00:00,2698.3138125 66 | 2017-06-06 00:00:00,2883.3136966371426 67 | 2017-06-07 00:00:00,2664.9208625 68 | 2017-06-08 00:00:00,2792.9991875 69 | 2017-06-09 00:00:00,2827.4913 70 | 2017-06-10 00:00:00,2845.3728571428574 71 | 2017-06-11 00:00:00,2961.8296124999997 72 | 2017-06-12 00:00:00,2657.6750625 73 | 2017-06-13 00:00:00,2748.185085714286 74 | 2017-06-14 00:00:00,2447.0415625 75 | 2017-06-15 00:00:00,2442.48025 76 | 2017-06-16 00:00:00,2464.9598142857144 77 | 2017-06-17 00:00:00,2665.927 78 | 2017-06-18 00:00:00,2507.389252144286 79 | 2017-06-19 00:00:00,2617.2102625 80 | 2017-06-20 00:00:00,2754.97825 81 | 2017-06-21 00:00:00,2671.04325 82 | 2017-06-22 00:00:00,2727.2880125 83 | 2017-06-23 00:00:00,2710.4122857142856 84 | 2017-06-24 00:00:00,2589.1648875 85 | 2017-06-25 00:00:00,2512.3662857142854 86 | 2017-06-26 00:00:00,2436.4510571428573 87 | 2017-06-27 00:00:00,2517.9031142857143 88 | 2017-06-28 00:00:00,2585.349185714286 89 | 2017-06-29 00:00:00,2544.414475 90 | 2017-06-30 00:00:00,2477.641375 91 | 2017-07-01 00:00:00,2434.0778625 92 | 2017-07-02 00:00:00,2501.191342857143 93 | 2017-07-03 00:00:00,2561.225428571429 94 | 2017-07-04 
00:00:00,2599.7298375 95 | 2017-07-05 00:00:00,2619.1875030042856 96 | 2017-07-06 00:00:00,2609.96775 97 | 2017-07-07 00:00:00,2491.201214285714 98 | 2017-07-08 00:00:00,2562.1306624999997 99 | 2017-07-09 00:00:00,2536.2389375 100 | 2017-07-10 00:00:00,2366.1701428571428 101 | 2017-07-11 00:00:00,2369.8621285714285 102 | 2017-07-12 00:00:00,2385.7485714285717 103 | 2017-07-13 00:00:00,2354.7834166666667 104 | 2017-07-14 00:00:00,2190.947833333333 105 | 2017-07-15 00:00:00,2058.9955999999997 106 | 2017-07-16 00:00:00,1931.2143 107 | 2017-07-17 00:00:00,2176.6234875 108 | 2017-07-18 00:00:00,2320.12225 109 | 2017-07-19 00:00:00,2264.7657 110 | 2017-07-20 00:00:00,2898.1884166666664 111 | 2017-07-21 00:00:00,2682.1953625 112 | 2017-07-22 00:00:00,2807.609857142857 113 | 2017-07-23 00:00:00,2725.549716666667 114 | 2017-07-24 00:00:00,2751.821028571429 115 | 2017-07-25 00:00:00,2560.9979166666667 116 | 2017-07-26 00:00:00,2495.028585714286 117 | 2017-07-27 00:00:00,2647.625 118 | 2017-07-28 00:00:00,2781.636583333333 119 | 2017-07-29 00:00:00,2722.512785714286 120 | 2017-07-30 00:00:00,2745.955416666666 121 | 2017-07-31 00:00:00,2866.431666666667 122 | 2017-08-01 00:00:00,2710.4130666666665 123 | 2017-08-02 00:00:00,2693.6339833333336 124 | 2017-08-03 00:00:00,2794.117716666666 125 | 2017-08-04 00:00:00,2873.8510833333335 126 | 2017-08-05 00:00:00,3218.1150166666666 127 | 2017-08-06 00:00:00,3252.5625333333332 128 | 2017-08-07 00:00:00,3407.2268333333336 129 | 2017-08-08 00:00:00,3457.374333333333 130 | 2017-08-09 00:00:00,3357.326316666667 131 | 2017-08-10 00:00:00,3424.4042000000004 132 | 2017-08-11 00:00:00,3632.5066666666667 133 | 2017-08-12 00:00:00,3852.8029142857145 134 | 2017-08-13 00:00:00,4125.54802 135 | 2017-08-14 00:00:00,4282.992 136 | 2017-08-15 00:00:00,4217.028328571429 137 | 2017-08-16 00:00:00,4360.876871428572 138 | 2017-08-17 00:00:00,4328.725716666667 139 | 2017-08-18 00:00:00,4130.440066666667 140 | 2017-08-19 00:00:00,4222.662214285714 141 | 
2017-08-20 00:00:00,4157.958033333333 142 | 2017-08-21 00:00:00,4043.722 143 | 2017-08-22 00:00:00,4082.180983333333 144 | 2017-08-23 00:00:00,4174.95 145 | 2017-08-24 00:00:00,4340.316716666667 146 | 2017-08-25 00:00:00,4363.05445 147 | 2017-08-26 00:00:00,4360.5133166666665 148 | 2017-08-27 00:00:00,4354.308333333333 149 | 2017-08-28 00:00:00,4391.673516666667 150 | 2017-08-29 00:00:00,4607.98545 151 | 2017-08-30 00:00:00,4594.98785 152 | 2017-08-31 00:00:00,4748.255 153 | 2017-09-01 00:00:00,4911.740016666667 154 | 2017-09-02 00:00:00,4580.387479999999 155 | 2017-09-03 00:00:00,4648.159983333334 156 | 2017-09-04 00:00:00,4344.0983166666665 157 | 2017-09-05 00:00:00,4488.72014 158 | 2017-09-06 00:00:00,4641.822016666666 159 | 2017-09-07 00:00:00,4654.6585000000005 160 | 2017-09-08 00:00:00,4310.750183333334 161 | 2017-09-09 00:00:00,4375.55952 162 | 2017-09-10 00:00:00,4329.955 163 | 2017-09-11 00:00:00,4248.090016666666 164 | 2017-09-12 00:00:00,4219.036616666667 165 | 2017-09-13 00:00:00,3961.2712666666666 166 | 2017-09-14 00:00:00,3319.6299999999997 167 | 2017-09-15 00:00:00,3774.2652833333336 168 | 2017-09-16 00:00:00,3763.62604 169 | 2017-09-17 00:00:00,3746.060783333333 170 | 2017-09-18 00:00:00,4093.316666666667 171 | 2017-09-19 00:00:00,3943.4133333333334 172 | 2017-09-20 00:00:00,3977.5616666666665 173 | 2017-09-21 00:00:00,3658.8981833333332 174 | 2017-09-22 00:00:00,3637.5025499999997 175 | 2017-09-23 00:00:00,3776.3869 176 | 2017-09-24 00:00:00,3703.0406500000004 177 | 2017-09-25 00:00:00,3942.5550000000003 178 | 2017-09-26 00:00:00,3910.3073833333333 179 | 2017-09-27 00:00:00,4202.554983333333 180 | 2017-09-28 00:00:00,4201.98905 181 | 2017-09-29 00:00:00,4193.574666666666 182 | 2017-09-30 00:00:00,4335.368316666667 183 | 2017-10-01 00:00:00,4360.722966666667 184 | 2017-10-02 00:00:00,4386.88375 185 | 2017-10-03 00:00:00,4293.3066 186 | 2017-10-04 00:00:00,4225.175 187 | 2017-10-05 00:00:00,4338.852 188 | 2017-10-06 00:00:00,4345.6033333333335 189 | 
2017-10-07 00:00:00,4376.191666666667 190 | 2017-10-08 00:00:00,4602.280883333334 191 | 2017-10-09 00:00:00,4777.967816666666 192 | 2017-10-10 00:00:00,4782.28 193 | 2017-10-11 00:00:00,4819.485766666667 194 | 2017-10-12 00:00:00,5325.130683333333 195 | 2017-10-13 00:00:00,5563.806566666666 196 | 2017-10-14 00:00:00,5739.438733333333 197 | 2017-10-15 00:00:00,5647.311666666667 198 | 2017-10-16 00:00:00,5711.205866666667 199 | 2017-10-17 00:00:00,5603.71294 200 | 2017-10-18 00:00:00,5546.176100000001 201 | 2017-10-19 00:00:00,5727.6335 202 | 2017-10-20 00:00:00,5979.45984 203 | 2017-10-21 00:00:00,6020.371683333334 204 | 2017-10-22 00:00:00,5983.184550000001 205 | 2017-10-23 00:00:00,5876.079866666667 206 | 2017-10-24 00:00:00,5505.827766666666 207 | 2017-10-25 00:00:00,5669.622533333334 208 | 2017-10-26 00:00:00,5893.138416666666 209 | 2017-10-27 00:00:00,5772.504983333333 210 | 2017-10-28 00:00:00,5776.6969500000005 211 | 2017-10-29 00:00:00,6155.43402 212 | 2017-10-30 00:00:00,6105.87422 213 | 2017-10-31 00:00:00,6388.645166666666 214 | 2017-11-01 00:00:00,6665.306683333333 215 | 2017-11-02 00:00:00,7068.020100000001 216 | 2017-11-03 00:00:00,7197.72006 217 | 2017-11-04 00:00:00,7437.543316666666 218 | 2017-11-05 00:00:00,7377.012366666667 219 | 2017-11-06 00:00:00,6989.071666666667 220 | 2017-11-07 00:00:00,7092.127233333333 221 | 2017-11-08 00:00:00,7415.878250000001 222 | 2017-11-09 00:00:00,7158.03706 223 | 2017-11-10 00:00:00,6719.39785 224 | 2017-11-11 00:00:00,6362.851033333333 225 | 2017-11-12 00:00:00,5716.301583333334 226 | 2017-11-13 00:00:00,6550.227533333334 227 | 2017-11-14 00:00:00,6635.412633333333 228 | 2017-11-15 00:00:00,7301.42992 229 | 2017-11-16 00:00:00,7815.0307 230 | 2017-11-17 00:00:00,7786.884366666666 231 | 2017-11-18 00:00:00,7817.1403833333325 232 | 2017-11-19 00:00:00,8007.654066666667 233 | 2017-11-20 00:00:00,8255.596816666666 234 | 2017-11-21 00:00:00,8059.8 235 | 2017-11-22 00:00:00,8268.035 236 | 2017-11-23 00:00:00,8148.95 237 
| 2017-11-24 00:00:00,8250.978333333334 238 | 2017-11-25 00:00:00,8707.407266666667 239 | 2017-11-26 00:00:00,9284.1438 240 | 2017-11-27 00:00:00,9718.29505 241 | 2017-11-28 00:00:00,9952.50882 242 | 2017-11-29 00:00:00,9879.328333333333 243 | 2017-11-30 00:00:00,10147.372 244 | 2017-12-01 00:00:00,10883.912 245 | 2017-12-02 00:00:00,11071.368333333332 246 | 2017-12-03 00:00:00,11332.622 247 | 2017-12-04 00:00:00,11584.83 248 | 2017-12-05 00:00:00,11878.433333333334 249 | 2017-12-06 00:00:00,13540.980000000001 250 | 2017-12-07 00:00:00,16501.971666666668 251 | 2017-12-08 00:00:00,16007.436666666666 252 | 2017-12-09 00:00:00,15142.834152123332 253 | 2017-12-10 00:00:00,14869.805 254 | 2017-12-11 00:00:00,16762.116666666665 255 | 2017-12-12 00:00:00,17276.393333333333 256 | 2017-12-13 00:00:00,16808.366666666665 257 | 2017-12-14 00:00:00,16678.892 258 | 2017-12-15 00:00:00,17771.899999999998 259 | 2017-12-16 00:00:00,19498.683333333334 260 | 2017-12-17 00:00:00,19289.785 261 | 2017-12-18 00:00:00,18961.856666666667 262 | 2017-12-19 00:00:00,17737.111666666668 263 | 2017-12-20 00:00:00,16026.271666666667 264 | 2017-12-21 00:00:00,16047.51 265 | 2017-12-22 00:00:00,15190.945 266 | 2017-12-23 00:00:00,15360.261666666667 267 | 2017-12-24 00:00:00,13949.175000000001 268 | 2017-12-25 00:00:00,14119.028333333334 269 | 2017-12-26 00:00:00,15999.048333333332 270 | 2017-12-27 00:00:00,15589.321666666665 271 | 2017-12-28 00:00:00,14380.581666666667 272 | 2017-12-29 00:00:00,14640.14 273 | 2017-12-30 00:00:00,13215.573999999999 274 | 2017-12-31 00:00:00,14165.574999999999 275 | 2018-01-01 00:00:00,13812.186666666666 276 | 2018-01-02 00:00:00,15005.856666666667 277 | 2018-01-03 00:00:00,15053.261666666665 278 | 2018-01-04 00:00:00,15199.355000000001 279 | 2018-01-05 00:00:00,17174.12 280 | 2018-01-06 00:00:00,17319.198 281 | 2018-01-07 00:00:00,16651.471666666668 282 | 2018-01-08 00:00:00,15265.906666666668 283 | 2018-01-09 00:00:00,14714.253333333334 284 | 2018-01-10 
00:00:00,15126.398333333333 285 | 2018-01-11 00:00:00,13296.794 286 | 2018-01-12 00:00:00,13912.882000000001 287 | 2018-01-13 00:00:00,14499.773333333333 288 | 2018-01-14 00:00:00,13852.92 289 | 2018-01-15 00:00:00,14012.196 290 | 2018-01-16 00:00:00,11180.998333333331 291 | 2018-01-17 00:00:00,11116.946666666669 292 | 2018-01-18 00:00:00,11345.423333333332 293 | 2018-01-19 00:00:00,11422.44 294 | 2018-01-20 00:00:00,12950.793333333333 295 | 2018-01-21 00:00:00,11505.228 296 | 2018-01-22 00:00:00,10544.593333333332 297 | 2018-01-23 00:00:00,11223.064 298 | 2018-01-24 00:00:00,11282.258333333333 299 | 2018-01-25 00:00:00,11214.44 300 | 2018-01-26 00:00:00,10969.815 301 | 2018-01-27 00:00:00,11524.776666666667 302 | 2018-01-28 00:00:00,11765.71 303 | 2018-01-29 00:00:00,11212.654999999999 304 | 2018-01-30 00:00:00,10184.061666666666 305 | 2018-01-31 00:00:00,10125.013333333334 306 | 2018-02-01 00:00:00,9083.258333333333 307 | 2018-02-02 00:00:00,8901.901666666667 308 | 2018-02-03 00:00:00,9076.678333333333 309 | 2018-02-04 00:00:00,8400.648333333333 310 | 2018-02-05 00:00:00,6838.816666666667 311 | 2018-02-06 00:00:00,7685.633333333334 312 | 2018-02-07 00:00:00,8099.958333333333 313 | 2018-02-08 00:00:00,8240.536666666667 314 | 2018-02-09 00:00:00,8535.516666666668 315 | 2018-02-10 00:00:00,8319.876566184 316 | 2018-02-11 00:00:00,8343.455 317 | 2018-02-12 00:00:00,8811.343333333332 318 | 2018-02-13 00:00:00,8597.7675 319 | 2018-02-14 00:00:00,9334.633333333333 320 | 2018-02-15 00:00:00,9977.154 321 | 2018-02-16 00:00:00,10127.161666666667 322 | 2018-02-17 00:00:00,10841.991666666667 323 | 2018-02-18 00:00:00,10503.298333333334 324 | 2018-02-19 00:00:00,11110.964999999998 325 | 2018-02-20 00:00:00,11390.391666666668 326 | 2018-02-21 00:00:00,10532.791666666666 327 | 2018-02-22 00:00:00,9931.071666666667 328 | 2018-02-23 00:00:00,10162.116666666667 329 | 2018-02-24 00:00:00,9697.956 330 | 2018-02-25 00:00:00,9696.593333333332 331 | 2018-02-26 
00:00:00,10348.603333333334 332 | 2018-02-27 00:00:00,10763.883333333333 333 | 2018-02-28 00:00:00,10370.164999999999 334 | 2018-03-01 00:00:00,11009.381666666668 335 | 2018-03-02 00:00:00,11055.815 336 | 2018-03-03 00:00:00,11326.948333333334 337 | 2018-03-04 00:00:00,11430.181666666665 338 | 2018-03-05 00:00:00,11595.54 339 | 2018-03-06 00:00:00,10763.198333333334 340 | 2018-03-07 00:00:00,10118.058 341 | 2018-03-08 00:00:00,9429.111666666666 342 | 2018-03-09 00:00:00,9089.278333333334 343 | 2018-03-10 00:00:00,8746.002 344 | 2018-03-11 00:00:00,9761.396666666666 345 | 2018-03-12 00:00:00,9182.843333333332 346 | 2018-03-13 00:00:00,9154.699999999999 347 | 2018-03-14 00:00:00,8151.531666666667 348 | 2018-03-15 00:00:00,8358.121666666666 349 | 2018-03-16 00:00:00,8530.402 350 | 2018-03-17 00:00:00,7993.674643641666 351 | 2018-03-18 00:00:00,8171.415 352 | 2018-03-19 00:00:00,8412.033333333333 353 | 2018-03-20 00:00:00,8986.948333333334 354 | 2018-03-21 00:00:00,8947.753333333334 355 | 2018-03-22 00:00:00,8690.408333333333 356 | 2018-03-23 00:00:00,8686.826666666666 357 | 2018-03-24 00:00:00,8662.378333333334 358 | 2018-03-25 00:00:00,8617.296666666667 359 | 2018-03-26 00:00:00,8197.548333333334 360 | 2018-03-27 00:00:00,7876.195 361 | 2018-03-28 00:00:00,7960.38 362 | 2018-03-29 00:00:00,7172.28 363 | 2018-03-30 00:00:00,6882.531666666667 364 | 2018-03-31 00:00:00,6935.48 365 | 2018-04-01 00:00:00,6794.105 366 | -------------------------------------------------------------------------------- /data/eth-price.csv: -------------------------------------------------------------------------------- 1 | "Date(UTC)","UnixTimeStamp","Value" 2 | "4/2/2017","1491091200","48.55" 3 | "4/3/2017","1491177600","44.13" 4 | "4/4/2017","1491264000","44.43" 5 | "4/5/2017","1491350400","44.90" 6 | "4/6/2017","1491436800","43.23" 7 | "4/7/2017","1491523200","42.31" 8 | "4/8/2017","1491609600","44.37" 9 | "4/9/2017","1491696000","43.72" 10 | "4/10/2017","1491782400","43.74" 11 | 
"4/11/2017","1491868800","43.74" 12 | "4/12/2017","1491955200","46.38" 13 | "4/13/2017","1492041600","49.97" 14 | "4/14/2017","1492128000","47.32" 15 | "4/15/2017","1492214400","48.89" 16 | "4/16/2017","1492300800","48.22" 17 | "4/17/2017","1492387200","47.94" 18 | "4/18/2017","1492473600","49.88" 19 | "4/19/2017","1492560000","47.88" 20 | "4/20/2017","1492646400","49.36" 21 | "4/21/2017","1492732800","48.27" 22 | "4/22/2017","1492819200","48.41" 23 | "4/23/2017","1492905600","48.75" 24 | "4/24/2017","1492992000","49.94" 25 | "4/25/2017","1493078400","50.09" 26 | "4/26/2017","1493164800","53.28" 27 | "4/27/2017","1493251200","63.14" 28 | "4/28/2017","1493337600","72.42" 29 | "4/29/2017","1493424000","69.83" 30 | "4/30/2017","1493510400","79.83" 31 | "5/1/2017","1493596800","77.53" 32 | "5/2/2017","1493683200","77.25" 33 | "5/3/2017","1493769600","80.37" 34 | "5/4/2017","1493856000","94.55" 35 | "5/5/2017","1493942400","90.79" 36 | "5/6/2017","1494028800","94.82" 37 | "5/7/2017","1494115200","90.46" 38 | "5/8/2017","1494201600","88.39" 39 | "5/9/2017","1494288000","86.27" 40 | "5/10/2017","1494374400","87.83" 41 | "5/11/2017","1494460800","88.20" 42 | "5/12/2017","1494547200","85.15" 43 | "5/13/2017","1494633600","87.96" 44 | "5/14/2017","1494720000","88.72" 45 | "5/15/2017","1494806400","90.32" 46 | "5/16/2017","1494892800","87.80" 47 | "5/17/2017","1494979200","86.98" 48 | "5/18/2017","1495065600","95.88" 49 | "5/19/2017","1495152000","124.38" 50 | "5/20/2017","1495238400","123.06" 51 | "5/21/2017","1495324800","148.00" 52 | "5/22/2017","1495411200","160.39" 53 | "5/23/2017","1495497600","169.50" 54 | "5/24/2017","1495584000","193.03" 55 | "5/25/2017","1495670400","177.33" 56 | "5/26/2017","1495756800","162.83" 57 | "5/27/2017","1495843200","156.63" 58 | "5/28/2017","1495929600","172.86" 59 | "5/29/2017","1496016000","194.17" 60 | "5/30/2017","1496102400","228.58" 61 | "5/31/2017","1496188800","228.64" 62 | "6/1/2017","1496275200","220.70" 63 | 
"6/2/2017","1496361600","222.04" 64 | "6/3/2017","1496448000","224.30" 65 | "6/4/2017","1496534400","244.96" 66 | "6/5/2017","1496620800","247.75" 67 | "6/6/2017","1496707200","264.26" 68 | "6/7/2017","1496793600","255.77" 69 | "6/8/2017","1496880000","259.41" 70 | "6/9/2017","1496966400","279.11" 71 | "6/10/2017","1497052800","335.95" 72 | "6/11/2017","1497139200","339.68" 73 | "6/12/2017","1497225600","394.66" 74 | "6/13/2017","1497312000","388.09" 75 | "6/14/2017","1497398400","343.84" 76 | "6/15/2017","1497484800","344.68" 77 | "6/16/2017","1497571200","353.61" 78 | "6/17/2017","1497657600","368.10" 79 | "6/18/2017","1497744000","351.53" 80 | "6/19/2017","1497830400","358.20" 81 | "6/20/2017","1497916800","350.53" 82 | "6/21/2017","1498003200","325.30" 83 | "6/22/2017","1498089600","320.97" 84 | "6/23/2017","1498176000","326.85" 85 | "6/24/2017","1498262400","304.54" 86 | "6/25/2017","1498348800","279.36" 87 | "6/26/2017","1498435200","253.68" 88 | "6/27/2017","1498521600","286.14" 89 | "6/28/2017","1498608000","315.86" 90 | "6/29/2017","1498694400","292.90" 91 | "6/30/2017","1498780800","280.68" 92 | "7/1/2017","1498867200","261.00" 93 | "7/2/2017","1498953600","283.99" 94 | "7/3/2017","1499040000","276.41" 95 | "7/4/2017","1499126400","269.05" 96 | "7/5/2017","1499212800","266.00" 97 | "7/6/2017","1499299200","265.88" 98 | "7/7/2017","1499385600","240.94" 99 | "7/8/2017","1499472000","245.67" 100 | "7/9/2017","1499558400","237.72" 101 | "7/10/2017","1499644800","205.76" 102 | "7/11/2017","1499731200","190.55" 103 | "7/12/2017","1499817600","224.15" 104 | "7/13/2017","1499904000","205.41" 105 | "7/14/2017","1499990400","197.14" 106 | "7/15/2017","1500076800","169.10" 107 | "7/16/2017","1500163200","155.42" 108 | "7/17/2017","1500249600","189.97" 109 | "7/18/2017","1500336000","227.09" 110 | "7/19/2017","1500422400","194.41" 111 | "7/20/2017","1500508800","226.33" 112 | "7/21/2017","1500595200","216.33" 113 | "7/22/2017","1500681600","230.47" 114 | 
"7/23/2017","1500768000","228.32" 115 | "7/24/2017","1500854400","225.48" 116 | "7/25/2017","1500940800","203.59" 117 | "7/26/2017","1501027200","202.88" 118 | "7/27/2017","1501113600","202.93" 119 | "7/28/2017","1501200000","191.21" 120 | "7/29/2017","1501286400","206.14" 121 | "7/30/2017","1501372800","196.78" 122 | "7/31/2017","1501459200","201.33" 123 | "8/1/2017","1501545600","225.90" 124 | "8/2/2017","1501632000","218.12" 125 | "8/3/2017","1501718400","224.39" 126 | "8/4/2017","1501804800","220.60" 127 | "8/5/2017","1501891200","253.09" 128 | "8/6/2017","1501977600","264.56" 129 | "8/7/2017","1502064000","269.94" 130 | "8/8/2017","1502150400","296.51" 131 | "8/9/2017","1502236800","295.28" 132 | "8/10/2017","1502323200","298.28" 133 | "8/11/2017","1502409600","309.32" 134 | "8/12/2017","1502496000","308.02" 135 | "8/13/2017","1502582400","296.62" 136 | "8/14/2017","1502668800","299.16" 137 | "8/15/2017","1502755200","286.52" 138 | "8/16/2017","1502841600","301.38" 139 | "8/17/2017","1502928000","300.30" 140 | "8/18/2017","1503014400","292.62" 141 | "8/19/2017","1503100800","293.02" 142 | "8/20/2017","1503187200","298.20" 143 | "8/21/2017","1503273600","321.85" 144 | "8/22/2017","1503360000","313.37" 145 | "8/23/2017","1503446400","317.40" 146 | "8/24/2017","1503532800","325.28" 147 | "8/25/2017","1503619200","330.06" 148 | "8/26/2017","1503705600","332.86" 149 | "8/27/2017","1503792000","347.88" 150 | "8/28/2017","1503878400","347.66" 151 | "8/29/2017","1503964800","372.35" 152 | "8/30/2017","1504051200","383.86" 153 | "8/31/2017","1504137600","388.33" 154 | "9/1/2017","1504224000","391.42" 155 | "9/2/2017","1504310400","351.03" 156 | "9/3/2017","1504396800","352.45" 157 | "9/4/2017","1504483200","303.70" 158 | "9/5/2017","1504569600","317.94" 159 | "9/6/2017","1504656000","338.92" 160 | "9/7/2017","1504742400","335.37" 161 | "9/8/2017","1504828800","306.72" 162 | "9/9/2017","1504915200","303.79" 163 | "9/10/2017","1505001600","299.21" 164 | 
"9/11/2017","1505088000","297.95" 165 | "9/12/2017","1505174400","294.10" 166 | "9/13/2017","1505260800","275.84" 167 | "9/14/2017","1505347200","223.14" 168 | "9/15/2017","1505433600","259.57" 169 | "9/16/2017","1505520000","254.49" 170 | "9/17/2017","1505606400","258.40" 171 | "9/18/2017","1505692800","297.53" 172 | "9/19/2017","1505779200","283.00" 173 | "9/20/2017","1505865600","283.56" 174 | "9/21/2017","1505952000","257.77" 175 | "9/22/2017","1506038400","262.94" 176 | "9/23/2017","1506124800","286.14" 177 | "9/24/2017","1506211200","282.60" 178 | "9/25/2017","1506297600","294.89" 179 | "9/26/2017","1506384000","288.64" 180 | "9/27/2017","1506470400","309.97" 181 | "9/28/2017","1506556800","302.77" 182 | "9/29/2017","1506643200","292.58" 183 | "9/30/2017","1506729600","302.77" 184 | "10/1/2017","1506816000","303.95" 185 | "10/2/2017","1506902400","296.81" 186 | "10/3/2017","1506988800","291.81" 187 | "10/4/2017","1507075200","291.68" 188 | "10/5/2017","1507161600","294.99" 189 | "10/6/2017","1507248000","308.33" 190 | "10/7/2017","1507334400","311.26" 191 | "10/8/2017","1507420800","309.49" 192 | "10/9/2017","1507507200","296.95" 193 | "10/10/2017","1507593600","298.46" 194 | "10/11/2017","1507680000","302.86" 195 | "10/12/2017","1507766400","302.89" 196 | "10/13/2017","1507852800","336.83" 197 | "10/14/2017","1507939200","338.81" 198 | "10/15/2017","1508025600","336.58" 199 | "10/16/2017","1508112000","334.23" 200 | "10/17/2017","1508198400","316.14" 201 | "10/18/2017","1508284800","313.54" 202 | "10/19/2017","1508371200","307.41" 203 | "10/20/2017","1508457600","303.08" 204 | "10/21/2017","1508544000","299.55" 205 | "10/22/2017","1508630400","294.03" 206 | "10/23/2017","1508716800","285.27" 207 | "10/24/2017","1508803200","296.50" 208 | "10/25/2017","1508889600","296.35" 209 | "10/26/2017","1508976000","295.54" 210 | "10/27/2017","1509062400","296.36" 211 | "10/28/2017","1509148800","293.35" 212 | "10/29/2017","1509235200","304.04" 213 | 
"10/30/2017","1509321600","306.80" 214 | "10/31/2017","1509408000","303.64" 215 | "11/1/2017","1509494400","289.42" 216 | "11/2/2017","1509580800","284.92" 217 | "11/3/2017","1509667200","304.51" 218 | "11/4/2017","1509753600","300.04" 219 | "11/5/2017","1509840000","296.23" 220 | "11/6/2017","1509926400","296.82" 221 | "11/7/2017","1510012800","291.84" 222 | "11/8/2017","1510099200","307.35" 223 | "11/9/2017","1510185600","319.66" 224 | "11/10/2017","1510272000","296.86" 225 | "11/11/2017","1510358400","314.23" 226 | "11/12/2017","1510444800","306.02" 227 | "11/13/2017","1510531200","314.60" 228 | "11/14/2017","1510617600","334.72" 229 | "11/15/2017","1510704000","331.20" 230 | "11/16/2017","1510790400","330.32" 231 | "11/17/2017","1510876800","331.72" 232 | "11/18/2017","1510963200","346.65" 233 | "11/19/2017","1511049600","354.60" 234 | "11/20/2017","1511136000","367.71" 235 | "11/21/2017","1511222400","360.52" 236 | "11/22/2017","1511308800","380.84" 237 | "11/23/2017","1511395200","406.57" 238 | "11/24/2017","1511481600","470.43" 239 | "11/25/2017","1511568000","464.61" 240 | "11/26/2017","1511654400","470.54" 241 | "11/27/2017","1511740800","475.24" 242 | "11/28/2017","1511827200","466.27" 243 | "11/29/2017","1511913600","427.42" 244 | "11/30/2017","1512000000","434.85" 245 | "12/1/2017","1512086400","461.58" 246 | "12/2/2017","1512172800","457.96" 247 | "12/3/2017","1512259200","462.81" 248 | "12/4/2017","1512345600","466.93" 249 | "12/5/2017","1512432000","453.96" 250 | "12/6/2017","1512518400","422.48" 251 | "12/7/2017","1512604800","421.15" 252 | "12/11/2017","1512950400","513.29" 253 | "12/12/2017","1513036800","656.52" 254 | "12/13/2017","1513123200","699.09" 255 | "12/14/2017","1513209600","693.58" 256 | "12/15/2017","1513296000","684.27" 257 | "12/16/2017","1513382400","692.83" 258 | "12/17/2017","1513468800","717.71" 259 | "12/18/2017","1513555200","785.99" 260 | "12/19/2017","1513641600","812.50" 261 | "12/20/2017","1513728000","799.17" 262 | 
"12/21/2017","1513814400","789.39" 263 | "12/22/2017","1513900800","657.83" 264 | "12/23/2017","1513987200","700.44" 265 | "12/24/2017","1514073600","675.91" 266 | "12/25/2017","1514160000","723.14" 267 | "12/26/2017","1514246400","753.40" 268 | "12/27/2017","1514332800","739.94" 269 | "12/28/2017","1514419200","716.69" 270 | "12/29/2017","1514505600","739.60" 271 | "12/30/2017","1514592000","692.99" 272 | "12/31/2017","1514678400","741.13" 273 | "1/1/2018","1514764800","756.20" 274 | "1/2/2018","1514851200","861.97" 275 | "1/3/2018","1514937600","941.10" 276 | "1/4/2018","1515024000","944.83" 277 | "1/5/2018","1515110400","967.13" 278 | "1/6/2018","1515196800","1006.41" 279 | "1/7/2018","1515283200","1117.75" 280 | "1/8/2018","1515369600","1136.11" 281 | "1/9/2018","1515456000","1289.24" 282 | "1/10/2018","1515542400","1248.99" 283 | "1/11/2018","1515628800","1139.32" 284 | "1/12/2018","1515715200","1261.03" 285 | "1/13/2018","1515801600","1385.02" 286 | "1/14/2018","1515888000","1359.48" 287 | "1/15/2018","1515974400","1278.69" 288 | "1/16/2018","1516060800","1050.26" 289 | "1/17/2018","1516147200","1024.69" 290 | "1/18/2018","1516233600","1012.97" 291 | "1/19/2018","1516320000","1037.36" 292 | "1/20/2018","1516406400","1150.50" 293 | "1/21/2018","1516492800","1049.09" 294 | "1/22/2018","1516579200","999.64" 295 | "1/23/2018","1516665600","984.47" 296 | "1/24/2018","1516752000","1061.78" 297 | "1/25/2018","1516838400","1046.37" 298 | "1/26/2018","1516924800","1048.58" 299 | "1/27/2018","1517011200","1109.08" 300 | "1/28/2018","1517097600","1231.58" 301 | "1/29/2018","1517184000","1169.96" 302 | "1/30/2018","1517270400","1063.75" 303 | "1/31/2018","1517356800","1111.31" 304 | "2/1/2018","1517443200","1026.19" 305 | "2/2/2018","1517529600","917.47" 306 | "2/3/2018","1517616000","970.87" 307 | "2/4/2018","1517702400","827.59" 308 | "2/5/2018","1517788800","695.08" 309 | "2/6/2018","1517875200","785.01" 310 | "2/7/2018","1517961600","751.81" 311 | 
"2/8/2018","1518048000","813.55" 312 | "2/9/2018","1518134400","877.88" 313 | "2/10/2018","1518220800","850.75" 314 | "2/11/2018","1518307200","811.24" 315 | "2/12/2018","1518393600","865.27" 316 | "2/13/2018","1518480000","840.98" 317 | "2/14/2018","1518566400","920.11" 318 | "2/15/2018","1518652800","927.95" 319 | "2/16/2018","1518739200","938.02" 320 | "2/17/2018","1518825600","974.77" 321 | "2/18/2018","1518912000","913.90" 322 | "2/19/2018","1518998400","939.79" 323 | "2/20/2018","1519084800","885.52" 324 | "2/21/2018","1519171200","840.10" 325 | "2/22/2018","1519257600","804.63" 326 | "2/23/2018","1519344000","854.70" 327 | "2/24/2018","1519430400","833.49" 328 | "2/25/2018","1519516800","840.28" 329 | "2/26/2018","1519603200","867.62" 330 | "2/27/2018","1519689600","871.58" 331 | "2/28/2018","1519776000","851.50" 332 | "3/1/2018","1519862400","869.87" 333 | "3/2/2018","1519948800","855.60" 334 | "3/3/2018","1520035200","855.65" 335 | "3/4/2018","1520121600","864.83" 336 | "3/5/2018","1520208000","849.42" 337 | "3/6/2018","1520294400","815.69" 338 | "3/7/2018","1520380800","751.13" 339 | "3/8/2018","1520467200","698.83" 340 | "3/9/2018","1520553600","726.92" 341 | "3/10/2018","1520640000","682.30" 342 | "3/11/2018","1520726400","720.36" 343 | "3/12/2018","1520812800","697.02" 344 | "3/13/2018","1520899200","689.96" 345 | "3/14/2018","1520985600","613.15" 346 | "3/15/2018","1521072000","610.56" 347 | "3/16/2018","1521158400","600.53" 348 | "3/17/2018","1521244800","549.79" 349 | "3/18/2018","1521331200","537.38" 350 | "3/19/2018","1521417600","555.55" 351 | "3/20/2018","1521504000","557.57" 352 | "3/21/2018","1521590400","559.91" 353 | "3/22/2018","1521676800","539.89" 354 | "3/23/2018","1521763200","543.83" 355 | "3/24/2018","1521849600","520.16" 356 | "3/25/2018","1521936000","523.01" 357 | "3/26/2018","1522022400","486.25" 358 | "3/27/2018","1522108800","448.78" 359 | "3/28/2018","1522195200","445.93" 360 | "3/29/2018","1522281600","383.90" 361 | 
"3/30/2018","1522368000","393.82" 362 | "3/31/2018","1522454400","394.07" 363 | "4/1/2018","1522540800","378.85" 364 | --------------------------------------------------------------------------------