├── .ipynb_checkpoints
│   ├── 1 - Pandas - Series-checkpoint.ipynb
│   ├── 3 - Pandas - DataFrames-checkpoint.ipynb
│   ├── 4 - Pandas DataFrames exercises-checkpoint.ipynb
│   └── 5 - Pandas - Reading CSV and Basic Plotting-checkpoint.ipynb
├── 1 - Pandas - Series.ipynb
├── 2 - Pandas Series exercises.ipynb
├── 3 - Pandas - DataFrames.ipynb
├── 4 - Pandas DataFrames exercises.ipynb
├── 5 - Pandas - Reading CSV and Basic Plotting.ipynb
├── README.md
└── data
    ├── .ipynb_checkpoints
    │   └── btc-market-price-checkpoint.csv
    ├── btc-market-price.csv
    └── eth-price.csv
/.ipynb_checkpoints/1 - Pandas - Series-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "\n",
8 | "\n",
9 | "\n",
10 | "\n",
12 | "\n",
13 | "# Pandas - Series\n"
14 | ]
15 | },
16 | {
17 | "cell_type": "markdown",
18 | "metadata": {},
19 | "source": [
20 | "\n",
21 | "\n",
22 | "## Hands on! "
23 | ]
24 | },
25 | {
26 | "cell_type": "code",
27 | "execution_count": null,
28 | "metadata": {},
29 | "outputs": [],
30 | "source": [
31 | "import pandas as pd\n",
32 | "import numpy as np"
33 | ]
34 | },
35 | {
36 | "cell_type": "markdown",
37 | "metadata": {},
38 | "source": [
39 | "## Pandas Series\n",
40 | "\n",
41 | "We'll start by analyzing \"[The Group of Seven](https://en.wikipedia.org/wiki/Group_of_Seven)\", a political forum formed by Canada, France, Germany, Italy, Japan, the United Kingdom and the United States. We'll begin with population and, for that, we'll use a `pandas.Series` object."
42 | ]
43 | },
44 | {
45 | "cell_type": "code",
46 | "execution_count": null,
47 | "metadata": {},
48 | "outputs": [],
49 | "source": [
50 | "# In millions\n",
51 | "g7_pop = pd.Series([35.467, 63.951, 80.940, 60.665, 127.061, 64.511, 318.523])"
52 | ]
53 | },
54 | {
55 | "cell_type": "code",
56 | "execution_count": null,
57 | "metadata": {
58 | "scrolled": true
59 | },
60 | "outputs": [],
61 | "source": [
62 | "g7_pop"
63 | ]
64 | },
65 | {
66 | "cell_type": "markdown",
67 | "metadata": {},
68 | "source": [
69 | "Someone might not know we're representing population in millions of inhabitants. Series can have a `name`, to better document the purpose of the Series:"
70 | ]
71 | },
72 | {
73 | "cell_type": "code",
74 | "execution_count": null,
75 | "metadata": {},
76 | "outputs": [],
77 | "source": [
78 | "g7_pop.name = 'G7 Population in millions'"
79 | ]
80 | },
81 | {
82 | "cell_type": "code",
83 | "execution_count": null,
84 | "metadata": {},
85 | "outputs": [],
86 | "source": [
87 | "g7_pop"
88 | ]
89 | },
90 | {
91 | "cell_type": "markdown",
92 | "metadata": {},
93 | "source": [
94 | "Series are pretty similar to numpy arrays:"
95 | ]
96 | },
97 | {
98 | "cell_type": "code",
99 | "execution_count": null,
100 | "metadata": {},
101 | "outputs": [],
102 | "source": [
103 | "g7_pop.dtype"
104 | ]
105 | },
106 | {
107 | "cell_type": "code",
108 | "execution_count": null,
109 | "metadata": {},
110 | "outputs": [],
111 | "source": [
112 | "g7_pop.values"
113 | ]
114 | },
115 | {
116 | "cell_type": "markdown",
117 | "metadata": {},
118 | "source": [
119 | "They're actually backed by numpy arrays:"
120 | ]
121 | },
122 | {
123 | "cell_type": "code",
124 | "execution_count": null,
125 | "metadata": {},
126 | "outputs": [],
127 | "source": [
128 | "type(g7_pop.values)"
129 | ]
130 | },
131 | {
132 | "cell_type": "markdown",
133 | "metadata": {},
134 | "source": [
135 | "And they _look_ like simple Python lists or Numpy Arrays. But they're actually more similar to Python `dict`s.\n",
136 | "\n",
137 | "A Series has an `index`, that's similar to the automatic index assigned to Python's lists:"
138 | ]
139 | },
140 | {
141 | "cell_type": "code",
142 | "execution_count": null,
143 | "metadata": {},
144 | "outputs": [],
145 | "source": [
146 | "g7_pop[0]"
147 | ]
148 | },
149 | {
150 | "cell_type": "code",
151 | "execution_count": null,
152 | "metadata": {},
153 | "outputs": [],
154 | "source": [
155 | "g7_pop[1]"
156 | ]
157 | },
158 | {
159 | "cell_type": "code",
160 | "execution_count": null,
161 | "metadata": {},
162 | "outputs": [],
163 | "source": [
164 | "g7_pop.index"
165 | ]
166 | },
167 | {
168 | "cell_type": "markdown",
169 | "metadata": {},
170 | "source": [
171 | "But, in contrast to lists, we can explicitly define the index:"
172 | ]
173 | },
174 | {
175 | "cell_type": "code",
176 | "execution_count": null,
177 | "metadata": {},
178 | "outputs": [],
179 | "source": [
180 | "g7_pop.index = [\n",
181 | " 'Canada',\n",
182 | " 'France',\n",
183 | " 'Germany',\n",
184 | " 'Italy',\n",
185 | " 'Japan',\n",
186 | " 'United Kingdom',\n",
187 | " 'United States',\n",
188 | "]"
189 | ]
190 | },
191 | {
192 | "cell_type": "code",
193 | "execution_count": null,
194 | "metadata": {},
195 | "outputs": [],
196 | "source": [
197 | "g7_pop"
198 | ]
199 | },
200 | {
201 | "cell_type": "markdown",
202 | "metadata": {},
203 | "source": [
204 | "Compare it with the [following table](https://docs.google.com/spreadsheets/d/1IlorV2-Oh9Da1JAZ7weVw86PQrQydSMp-ydVMH135iI/edit?usp=sharing): \n",
205 | "\n",
206 | "\n",
207 | "\n",
208 | "We can say that Series look like \"ordered dictionaries\". We can actually create Series out of dictionaries:"
209 | ]
210 | },
211 | {
212 | "cell_type": "code",
213 | "execution_count": null,
214 | "metadata": {
215 | "scrolled": true
216 | },
217 | "outputs": [],
218 | "source": [
219 | "pd.Series({\n",
220 | " 'Canada': 35.467,\n",
221 | " 'France': 63.951,\n",
222 | " 'Germany': 80.94,\n",
223 | " 'Italy': 60.665,\n",
224 | " 'Japan': 127.061,\n",
225 | " 'United Kingdom': 64.511,\n",
226 | " 'United States': 318.523\n",
227 | "}, name='G7 Population in millions')"
228 | ]
229 | },
230 | {
231 | "cell_type": "code",
232 | "execution_count": null,
233 | "metadata": {},
234 | "outputs": [],
235 | "source": [
236 | "pd.Series(\n",
237 | " [35.467, 63.951, 80.94, 60.665, 127.061, 64.511, 318.523],\n",
238 | " index=['Canada', 'France', 'Germany', 'Italy', 'Japan', 'United Kingdom',\n",
239 | " 'United States'],\n",
240 | " name='G7 Population in millions')"
241 | ]
242 | },
243 | {
244 | "cell_type": "markdown",
245 | "metadata": {},
246 | "source": [
247 | "You can also create Series out of other series, specifying indexes:"
248 | ]
249 | },
250 | {
251 | "cell_type": "code",
252 | "execution_count": null,
253 | "metadata": {
254 | "scrolled": false
255 | },
256 | "outputs": [],
257 | "source": [
258 | "pd.Series(g7_pop, index=['France', 'Germany', 'Italy', 'Spain'])"
259 | ]
260 | },
261 | {
262 | "cell_type": "markdown",
263 | "metadata": {},
264 | "source": [
265 | "\n",
266 | "\n",
267 | "## Indexing\n",
268 | "\n",
269 | "Indexing works similarly to lists and dictionaries: you use the **index** of the element you're looking for:"
270 | ]
271 | },
272 | {
273 | "cell_type": "code",
274 | "execution_count": null,
275 | "metadata": {},
276 | "outputs": [],
277 | "source": [
278 | "g7_pop['Canada']"
279 | ]
280 | },
281 | {
282 | "cell_type": "code",
283 | "execution_count": null,
284 | "metadata": {},
285 | "outputs": [],
286 | "source": [
287 | "g7_pop['Japan']"
288 | ]
289 | },
290 | {
291 | "cell_type": "markdown",
292 | "metadata": {},
293 | "source": [
294 | "Numeric positions can also be used, with the `iloc` attribute:"
295 | ]
296 | },
297 | {
298 | "cell_type": "code",
299 | "execution_count": null,
300 | "metadata": {},
301 | "outputs": [],
302 | "source": [
303 | "g7_pop.iloc[0]"
304 | ]
305 | },
306 | {
307 | "cell_type": "code",
308 | "execution_count": null,
309 | "metadata": {
310 | "scrolled": true
311 | },
312 | "outputs": [],
313 | "source": [
314 | "g7_pop.iloc[-1]"
315 | ]
316 | },
317 | {
318 | "cell_type": "markdown",
319 | "metadata": {},
320 | "source": [
321 | "Selecting multiple elements at once:"
322 | ]
323 | },
324 | {
325 | "cell_type": "code",
326 | "execution_count": null,
327 | "metadata": {
328 | "scrolled": true
329 | },
330 | "outputs": [],
331 | "source": [
332 | "g7_pop[['Italy', 'France']]"
333 | ]
334 | },
335 | {
336 | "cell_type": "markdown",
337 | "metadata": {},
338 | "source": [
339 | "_(The result is another Series)_"
340 | ]
341 | },
342 | {
343 | "cell_type": "code",
344 | "execution_count": null,
345 | "metadata": {
346 | "scrolled": true
347 | },
348 | "outputs": [],
349 | "source": [
350 | "g7_pop.iloc[[0, 1]]"
351 | ]
352 | },
353 | {
354 | "cell_type": "markdown",
355 | "metadata": {},
356 | "source": [
357 | "Slicing also works but, **importantly**, label-based slices in pandas include the upper limit:"
358 | ]
359 | },
360 | {
361 | "cell_type": "code",
362 | "execution_count": null,
363 | "metadata": {
364 | "scrolled": false
365 | },
366 | "outputs": [],
367 | "source": [
368 | "g7_pop['Canada': 'Italy']"
369 | ]
370 | },
371 | {
372 | "cell_type": "markdown",
373 | "metadata": {},
374 | "source": [
375 | "\n",
376 | "\n",
377 | "## Conditional selection (boolean arrays)\n",
378 | "\n",
379 | "The same boolean array techniques we saw applied to numpy arrays can be used for Pandas `Series`:"
380 | ]
381 | },
382 | {
383 | "cell_type": "code",
384 | "execution_count": null,
385 | "metadata": {},
386 | "outputs": [],
387 | "source": [
388 | "g7_pop"
389 | ]
390 | },
391 | {
392 | "cell_type": "code",
393 | "execution_count": null,
394 | "metadata": {},
395 | "outputs": [],
396 | "source": [
397 | "g7_pop > 70"
398 | ]
399 | },
400 | {
401 | "cell_type": "code",
402 | "execution_count": null,
403 | "metadata": {},
404 | "outputs": [],
405 | "source": [
406 | "g7_pop[g7_pop > 70]"
407 | ]
408 | },
409 | {
410 | "cell_type": "code",
411 | "execution_count": null,
412 | "metadata": {},
413 | "outputs": [],
414 | "source": [
415 | "g7_pop.mean()"
416 | ]
417 | },
418 | {
419 | "cell_type": "code",
420 | "execution_count": null,
421 | "metadata": {},
422 | "outputs": [],
423 | "source": [
424 | "g7_pop[g7_pop > g7_pop.mean()]"
425 | ]
426 | },
427 | {
428 | "cell_type": "code",
429 | "execution_count": null,
430 | "metadata": {},
431 | "outputs": [],
432 | "source": [
433 | "g7_pop.std()"
434 | ]
435 | },
436 | {
437 | "cell_type": "code",
438 | "execution_count": null,
439 | "metadata": {
440 | "scrolled": true
441 | },
442 | "outputs": [],
443 | "source": [
444 | "g7_pop[(g7_pop > g7_pop.mean() - g7_pop.std() / 2) | (g7_pop > g7_pop.mean() + g7_pop.std() / 2)]"
445 | ]
446 | },
447 | {
448 | "cell_type": "markdown",
449 | "metadata": {},
450 | "source": [
451 | "\n",
452 | "\n",
453 | "## Operations and methods\n",
454 | "Series also support vectorized operations and aggregation functions, just like NumPy arrays:"
455 | ]
456 | },
457 | {
458 | "cell_type": "code",
459 | "execution_count": null,
460 | "metadata": {},
461 | "outputs": [],
462 | "source": [
463 | "g7_pop * 1_000_000"
464 | ]
465 | },
466 | {
467 | "cell_type": "code",
468 | "execution_count": null,
469 | "metadata": {},
470 | "outputs": [],
471 | "source": [
472 | "g7_pop.mean()"
473 | ]
474 | },
475 | {
476 | "cell_type": "code",
477 | "execution_count": null,
478 | "metadata": {
479 | "scrolled": true
480 | },
481 | "outputs": [],
482 | "source": [
483 | "np.log(g7_pop)"
484 | ]
485 | },
486 | {
487 | "cell_type": "code",
488 | "execution_count": null,
489 | "metadata": {
490 | "scrolled": false
491 | },
492 | "outputs": [],
493 | "source": [
494 | "g7_pop['France': 'Italy'].mean()"
495 | ]
496 | },
497 | {
498 | "cell_type": "markdown",
499 | "metadata": {},
500 | "source": [
501 | "\n",
502 | "\n",
503 | "## Boolean arrays\n",
504 | "(Work in the same way as numpy)"
505 | ]
506 | },
507 | {
508 | "cell_type": "code",
509 | "execution_count": null,
510 | "metadata": {},
511 | "outputs": [],
512 | "source": [
513 | "g7_pop"
514 | ]
515 | },
516 | {
517 | "cell_type": "code",
518 | "execution_count": null,
519 | "metadata": {},
520 | "outputs": [],
521 | "source": [
522 | "g7_pop > 80"
523 | ]
524 | },
525 | {
526 | "cell_type": "code",
527 | "execution_count": null,
528 | "metadata": {
529 | "scrolled": true
530 | },
531 | "outputs": [],
532 | "source": [
533 | "g7_pop[g7_pop > 80]"
534 | ]
535 | },
536 | {
537 | "cell_type": "code",
538 | "execution_count": null,
539 | "metadata": {
540 | "scrolled": true
541 | },
542 | "outputs": [],
543 | "source": [
544 | "g7_pop[(g7_pop > 80) | (g7_pop < 40)]"
545 | ]
546 | },
547 | {
548 | "cell_type": "code",
549 | "execution_count": null,
550 | "metadata": {
551 | "scrolled": true
552 | },
553 | "outputs": [],
554 | "source": [
555 | "g7_pop[(g7_pop > 80) & (g7_pop < 200)]"
556 | ]
557 | },
558 | {
559 | "cell_type": "markdown",
560 | "metadata": {},
561 | "source": [
562 | "\n",
563 | "\n",
564 | "## Modifying series\n"
565 | ]
566 | },
567 | {
568 | "cell_type": "code",
569 | "execution_count": null,
570 | "metadata": {},
571 | "outputs": [],
572 | "source": [
573 | "g7_pop['Canada'] = 40.5"
574 | ]
575 | },
576 | {
577 | "cell_type": "code",
578 | "execution_count": null,
579 | "metadata": {},
580 | "outputs": [],
581 | "source": [
582 | "g7_pop"
583 | ]
584 | },
585 | {
586 | "cell_type": "code",
587 | "execution_count": null,
588 | "metadata": {},
589 | "outputs": [],
590 | "source": [
591 | "g7_pop.iloc[-1] = 500"
592 | ]
593 | },
594 | {
595 | "cell_type": "code",
596 | "execution_count": null,
597 | "metadata": {
598 | "scrolled": false
599 | },
600 | "outputs": [],
601 | "source": [
602 | "g7_pop"
603 | ]
604 | },
605 | {
606 | "cell_type": "code",
607 | "execution_count": null,
608 | "metadata": {},
609 | "outputs": [],
610 | "source": [
611 | "g7_pop[g7_pop < 70] = 99.99"
612 | ]
613 | },
614 | {
615 | "cell_type": "code",
616 | "execution_count": null,
617 | "metadata": {
618 | "scrolled": true
619 | },
620 | "outputs": [],
621 | "source": [
622 | "g7_pop"
623 | ]
624 | },
625 | {
626 | "cell_type": "markdown",
627 | "metadata": {},
628 | "source": [
629 | "\n"
630 | ]
631 | }
632 | ],
633 | "metadata": {
634 | "kernelspec": {
635 | "display_name": "Python 3",
636 | "language": "python",
637 | "name": "python3"
638 | },
639 | "language_info": {
640 | "codemirror_mode": {
641 | "name": "ipython",
642 | "version": 3
643 | },
644 | "file_extension": ".py",
645 | "mimetype": "text/x-python",
646 | "name": "python",
647 | "nbconvert_exporter": "python",
648 | "pygments_lexer": "ipython3",
649 | "version": "3.7.4"
650 | }
651 | },
652 | "nbformat": 4,
653 | "nbformat_minor": 2
654 | }
655 |
--------------------------------------------------------------------------------
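The Series notebook above makes two points that are easy to miss when read quickly: slicing by label includes the upper bound, and building a Series from another Series with an explicit index aligns on labels. The following is a minimal sketch of both, reusing the notebook's G7 figures; the variable names `by_label`, `by_position` and `subset` are illustrative and do not appear in the notebook.

```python
import pandas as pd

# Same G7 population figures (in millions) used in the notebook above.
g7_pop = pd.Series(
    [35.467, 63.951, 80.940, 60.665, 127.061, 64.511, 318.523],
    index=['Canada', 'France', 'Germany', 'Italy', 'Japan',
           'United Kingdom', 'United States'],
    name='G7 Population in millions',
)

# Label-based slicing includes the upper bound, so 'Italy' is returned.
by_label = g7_pop['Canada':'Italy']

# Position-based slicing with iloc excludes the upper bound, like a list.
by_position = g7_pop.iloc[0:3]

# Building a Series from another Series with an explicit index aligns on
# labels; a label missing from the source ('Spain') becomes NaN.
subset = pd.Series(g7_pop, index=['France', 'Germany', 'Italy', 'Spain'])

print(list(by_label.index))
print(list(by_position.index))
print(subset)
```

The difference between the two slices is why `loc`/label slicing and `iloc`/position slicing should not be treated as interchangeable, even when the rows they select look similar.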
/.ipynb_checkpoints/3 - Pandas - DataFrames-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "\n",
8 | "\n",
9 | "\n",
10 | "\n",
12 | "\n",
13 | "# Pandas - `DataFrame`s\n",
14 | "\n",
15 | "Probably the most important data structure of pandas is the `DataFrame`. It's a tabular structure tightly integrated with `Series`.\n"
16 | ]
17 | },
18 | {
19 | "cell_type": "markdown",
20 | "metadata": {},
21 | "source": [
22 | "\n",
23 | "\n",
24 | "## Hands on! "
25 | ]
26 | },
27 | {
28 | "cell_type": "code",
29 | "execution_count": null,
30 | "metadata": {},
31 | "outputs": [],
32 | "source": [
33 | "import numpy as np\n",
34 | "import pandas as pd"
35 | ]
36 | },
37 | {
38 | "cell_type": "markdown",
39 | "metadata": {},
40 | "source": [
41 | "We'll continue our analysis of the G7 countries, now looking at DataFrames. As said, a DataFrame looks a lot like a table (like the one you can see [here](https://docs.google.com/spreadsheets/d/1IlorV2-Oh9Da1JAZ7weVw86PQrQydSMp-ydVMH135iI/edit?usp=sharing)):\n",
42 | "\n",
43 | "\n",
44 | "\n",
45 | "Creating `DataFrame`s manually can be tedious. 99% of the time you'll be pulling the data from a database, a CSV file or the web. But still, you can create a DataFrame by specifying the columns and values:"
46 | ]
47 | },
48 | {
49 | "cell_type": "code",
50 | "execution_count": null,
51 | "metadata": {},
52 | "outputs": [],
53 | "source": [
54 | "df = pd.DataFrame({\n",
55 | " 'Population': [35.467, 63.951, 80.94 , 60.665, 127.061, 64.511, 318.523],\n",
56 | " 'GDP': [\n",
57 | " 1785387,\n",
58 | " 2833687,\n",
59 | " 3874437,\n",
60 | " 2167744,\n",
61 | " 4602367,\n",
62 | " 2950039,\n",
63 | " 17348075\n",
64 | " ],\n",
65 | " 'Surface Area': [\n",
66 | " 9984670,\n",
67 | " 640679,\n",
68 | " 357114,\n",
69 | " 301336,\n",
70 | " 377930,\n",
71 | " 242495,\n",
72 | " 9525067\n",
73 | " ],\n",
74 | " 'HDI': [\n",
75 | " 0.913,\n",
76 | " 0.888,\n",
77 | " 0.916,\n",
78 | " 0.873,\n",
79 | " 0.891,\n",
80 | " 0.907,\n",
81 | " 0.915\n",
82 | " ],\n",
83 | " 'Continent': [\n",
84 | " 'America',\n",
85 | " 'Europe',\n",
86 | " 'Europe',\n",
87 | " 'Europe',\n",
88 | " 'Asia',\n",
89 | " 'Europe',\n",
90 | " 'America'\n",
91 | " ]\n",
92 | "}, columns=['Population', 'GDP', 'Surface Area', 'HDI', 'Continent'])"
93 | ]
94 | },
95 | {
96 | "cell_type": "markdown",
97 | "metadata": {},
98 | "source": [
99 | "_(The `columns` parameter is optional. I'm using it to keep the same order as in the picture above)_"
100 | ]
101 | },
102 | {
103 | "cell_type": "code",
104 | "execution_count": null,
105 | "metadata": {
106 | "scrolled": true
107 | },
108 | "outputs": [],
109 | "source": [
110 | "df"
111 | ]
112 | },
113 | {
114 | "cell_type": "markdown",
115 | "metadata": {},
116 | "source": [
117 | "`DataFrame`s also have indexes. As you can see in the \"table\" above, pandas has assigned a numeric, autoincremental index automatically to each \"row\" in our DataFrame. In our case, we know that each row represents a country, so we'll just reassign the index:"
118 | ]
119 | },
120 | {
121 | "cell_type": "code",
122 | "execution_count": null,
123 | "metadata": {},
124 | "outputs": [],
125 | "source": [
126 | "df.index = [\n",
127 | " 'Canada',\n",
128 | " 'France',\n",
129 | " 'Germany',\n",
130 | " 'Italy',\n",
131 | " 'Japan',\n",
132 | " 'United Kingdom',\n",
133 | " 'United States',\n",
134 | "]"
135 | ]
136 | },
137 | {
138 | "cell_type": "code",
139 | "execution_count": null,
140 | "metadata": {},
141 | "outputs": [],
142 | "source": [
143 | "df"
144 | ]
145 | },
146 | {
147 | "cell_type": "code",
148 | "execution_count": null,
149 | "metadata": {},
150 | "outputs": [],
151 | "source": [
152 | "df.columns"
153 | ]
154 | },
155 | {
156 | "cell_type": "code",
157 | "execution_count": null,
158 | "metadata": {},
159 | "outputs": [],
160 | "source": [
161 | "df.index"
162 | ]
163 | },
164 | {
165 | "cell_type": "code",
166 | "execution_count": null,
167 | "metadata": {},
168 | "outputs": [],
169 | "source": [
170 | "df.info()"
171 | ]
172 | },
173 | {
174 | "cell_type": "code",
175 | "execution_count": null,
176 | "metadata": {},
177 | "outputs": [],
178 | "source": [
179 | "df.size"
180 | ]
181 | },
182 | {
183 | "cell_type": "code",
184 | "execution_count": null,
185 | "metadata": {},
186 | "outputs": [],
187 | "source": [
188 | "df.shape"
189 | ]
190 | },
191 | {
192 | "cell_type": "code",
193 | "execution_count": null,
194 | "metadata": {
195 | "scrolled": true
196 | },
197 | "outputs": [],
198 | "source": [
199 | "df.describe()"
200 | ]
201 | },
202 | {
203 | "cell_type": "code",
204 | "execution_count": null,
205 | "metadata": {},
206 | "outputs": [],
207 | "source": [
208 | "df.dtypes"
209 | ]
210 | },
211 | {
212 | "cell_type": "code",
213 | "execution_count": null,
214 | "metadata": {
215 | "scrolled": false
216 | },
217 | "outputs": [],
218 | "source": [
219 | "df.dtypes.value_counts()"
220 | ]
221 | },
222 | {
223 | "cell_type": "markdown",
224 | "metadata": {},
225 | "source": [
226 | "\n",
227 | "\n",
228 | "## Indexing, Selection and Slicing\n",
229 | "\n",
230 | "Individual columns in the DataFrame can be selected with regular indexing. Each column is represented as a `Series`:"
231 | ]
232 | },
233 | {
234 | "cell_type": "code",
235 | "execution_count": null,
236 | "metadata": {
237 | "scrolled": true
238 | },
239 | "outputs": [],
240 | "source": [
241 | "df['Population']"
242 | ]
243 | },
244 | {
245 | "cell_type": "markdown",
246 | "metadata": {},
247 | "source": [
248 | "Note that the `index` of the returned Series is the same as the DataFrame one. And its `name` is the name of the column. If you're working on a notebook and want to see a more DataFrame-like format you can use the `to_frame` method:"
249 | ]
250 | },
251 | {
252 | "cell_type": "code",
253 | "execution_count": null,
254 | "metadata": {
255 | "scrolled": false
256 | },
257 | "outputs": [],
258 | "source": [
259 | "df['Population'].to_frame()"
260 | ]
261 | },
262 | {
263 | "cell_type": "markdown",
264 | "metadata": {},
265 | "source": [
266 | "Multiple columns can also be selected similarly to `numpy` and `Series`:"
267 | ]
268 | },
269 | {
270 | "cell_type": "code",
271 | "execution_count": null,
272 | "metadata": {
273 | "scrolled": true
274 | },
275 | "outputs": [],
276 | "source": [
277 | "df[['Population', 'GDP']]"
278 | ]
279 | },
280 | {
281 | "cell_type": "markdown",
282 | "metadata": {},
283 | "source": [
284 | "In this case, the result is another `DataFrame`. Slicing works differently: it acts at \"row level\", which can be counterintuitive:"
285 | ]
286 | },
287 | {
288 | "cell_type": "code",
289 | "execution_count": null,
290 | "metadata": {
291 | "scrolled": false
292 | },
293 | "outputs": [],
294 | "source": [
295 | "df[1:3]"
296 | ]
297 | },
298 | {
299 | "cell_type": "markdown",
300 | "metadata": {},
301 | "source": [
302 | "Row level selection works better with `loc` and `iloc` **which are recommended** over regular \"direct slicing\" (`df[:]`).\n",
303 | "\n",
304 | "`loc` selects rows matching the given index:"
305 | ]
306 | },
307 | {
308 | "cell_type": "code",
309 | "execution_count": null,
310 | "metadata": {},
311 | "outputs": [],
312 | "source": [
313 | "df.loc['Italy']"
314 | ]
315 | },
316 | {
317 | "cell_type": "code",
318 | "execution_count": null,
319 | "metadata": {
320 | "scrolled": true
321 | },
322 | "outputs": [],
323 | "source": [
324 | "df.loc['France': 'Italy']"
325 | ]
326 | },
327 | {
328 | "cell_type": "markdown",
329 | "metadata": {},
330 | "source": [
331 | "As a second \"argument\", you can pass the column(s) you'd like to select:"
332 | ]
333 | },
334 | {
335 | "cell_type": "code",
336 | "execution_count": null,
337 | "metadata": {
338 | "scrolled": false
339 | },
340 | "outputs": [],
341 | "source": [
342 | "df.loc['France': 'Italy', 'Population']"
343 | ]
344 | },
345 | {
346 | "cell_type": "code",
347 | "execution_count": null,
348 | "metadata": {
349 | "scrolled": true
350 | },
351 | "outputs": [],
352 | "source": [
353 | "df.loc['France': 'Italy', ['Population', 'GDP']]"
354 | ]
355 | },
356 | {
357 | "cell_type": "markdown",
358 | "metadata": {},
359 | "source": [
360 | "`iloc` works with the (numeric) \"position\" of the index:"
361 | ]
362 | },
363 | {
364 | "cell_type": "code",
365 | "execution_count": null,
366 | "metadata": {},
367 | "outputs": [],
368 | "source": [
369 | "df"
370 | ]
371 | },
372 | {
373 | "cell_type": "code",
374 | "execution_count": null,
375 | "metadata": {},
376 | "outputs": [],
377 | "source": [
378 | "df.iloc[0]"
379 | ]
380 | },
381 | {
382 | "cell_type": "code",
383 | "execution_count": null,
384 | "metadata": {},
385 | "outputs": [],
386 | "source": [
387 | "df.iloc[-1]"
388 | ]
389 | },
390 | {
391 | "cell_type": "code",
392 | "execution_count": null,
393 | "metadata": {
394 | "scrolled": false
395 | },
396 | "outputs": [],
397 | "source": [
398 | "df.iloc[[0, 1, -1]]"
399 | ]
400 | },
401 | {
402 | "cell_type": "code",
403 | "execution_count": null,
404 | "metadata": {},
405 | "outputs": [],
406 | "source": [
407 | "df.iloc[1:3]"
408 | ]
409 | },
410 | {
411 | "cell_type": "code",
412 | "execution_count": null,
413 | "metadata": {},
414 | "outputs": [],
415 | "source": [
416 | "df.iloc[1:3, 3]"
417 | ]
418 | },
419 | {
420 | "cell_type": "code",
421 | "execution_count": null,
422 | "metadata": {},
423 | "outputs": [],
424 | "source": [
425 | "df.iloc[1:3, [0, 3]]"
426 | ]
427 | },
428 | {
429 | "cell_type": "code",
430 | "execution_count": null,
431 | "metadata": {
432 | "scrolled": true
433 | },
434 | "outputs": [],
435 | "source": [
436 | "df.iloc[1:3, 1:3]"
437 | ]
438 | },
439 | {
440 | "cell_type": "markdown",
441 | "metadata": {},
442 | "source": [
443 | "> **RECOMMENDED: Always use `loc` and `iloc` to reduce ambiguity, especially with `DataFrame`s with numeric indexes.**"
444 | ]
445 | },
446 | {
447 | "cell_type": "markdown",
448 | "metadata": {},
449 | "source": [
450 | "\n",
451 | "\n",
452 | "## Conditional selection (boolean arrays)\n",
453 | "\n",
454 | "We saw conditional selection applied to `Series` and it'll work in the same way for `DataFrame`s. After all, a `DataFrame` is a collection of `Series`:"
455 | ]
456 | },
457 | {
458 | "cell_type": "code",
459 | "execution_count": null,
460 | "metadata": {},
461 | "outputs": [],
462 | "source": [
463 | "df"
464 | ]
465 | },
466 | {
467 | "cell_type": "code",
468 | "execution_count": null,
469 | "metadata": {},
470 | "outputs": [],
471 | "source": [
472 | "df['Population'] > 70"
473 | ]
474 | },
475 | {
476 | "cell_type": "code",
477 | "execution_count": null,
478 | "metadata": {},
479 | "outputs": [],
480 | "source": [
481 | "df.loc[df['Population'] > 70]"
482 | ]
483 | },
484 | {
485 | "cell_type": "markdown",
486 | "metadata": {},
487 | "source": [
488 | "The boolean matching is done at index level, so you can filter with any boolean `Series`, as long as its index matches the `DataFrame`'s. Column selection still works as expected:"
489 | ]
490 | },
491 | {
492 | "cell_type": "code",
493 | "execution_count": null,
494 | "metadata": {},
495 | "outputs": [],
496 | "source": [
497 | "df.loc[df['Population'] > 70, 'Population']"
498 | ]
499 | },
500 | {
501 | "cell_type": "code",
502 | "execution_count": null,
503 | "metadata": {},
504 | "outputs": [],
505 | "source": [
506 | "df.loc[df['Population'] > 70, ['Population', 'GDP']]"
507 | ]
508 | },
509 | {
510 | "cell_type": "markdown",
511 | "metadata": {},
512 | "source": [
513 | "\n",
514 | "\n",
515 | "## Dropping stuff\n",
516 | "\n",
517 | "The opposite of selection is \"dropping\". Instead of pointing out which values you'd like to _select_, you can point out which ones you'd like to `drop`:"
518 | ]
519 | },
520 | {
521 | "cell_type": "code",
522 | "execution_count": null,
523 | "metadata": {},
524 | "outputs": [],
525 | "source": [
526 | "df.drop('Canada')"
527 | ]
528 | },
529 | {
530 | "cell_type": "code",
531 | "execution_count": null,
532 | "metadata": {},
533 | "outputs": [],
534 | "source": [
535 | "df.drop(['Canada', 'Japan'])"
536 | ]
537 | },
538 | {
539 | "cell_type": "code",
540 | "execution_count": null,
541 | "metadata": {},
542 | "outputs": [],
543 | "source": [
544 | "df.drop(columns=['Population', 'HDI'])"
545 | ]
546 | },
547 | {
548 | "cell_type": "code",
549 | "execution_count": null,
550 | "metadata": {
551 | "scrolled": false
552 | },
553 | "outputs": [],
554 | "source": [
555 | "df.drop(['Italy', 'Canada'], axis=0)"
556 | ]
557 | },
558 | {
559 | "cell_type": "code",
560 | "execution_count": null,
561 | "metadata": {
562 | "scrolled": false
563 | },
564 | "outputs": [],
565 | "source": [
566 | "df.drop(['Population', 'HDI'], axis=1)"
567 | ]
568 | },
569 | {
570 | "cell_type": "code",
571 | "execution_count": null,
572 | "metadata": {
573 | "scrolled": true
574 | },
575 | "outputs": [],
576 | "source": [
577 | "df.drop(['Population', 'HDI'], axis=1)"
578 | ]
579 | },
580 | {
581 | "cell_type": "code",
582 | "execution_count": null,
583 | "metadata": {},
584 | "outputs": [],
585 | "source": [
586 | "df.drop(['Population', 'HDI'], axis='columns')"
587 | ]
588 | },
589 | {
590 | "cell_type": "code",
591 | "execution_count": null,
592 | "metadata": {},
593 | "outputs": [],
594 | "source": [
595 | "df.drop(['Canada', 'Germany'], axis='rows')"
596 | ]
597 | },
598 | {
599 | "cell_type": "markdown",
600 | "metadata": {},
601 | "source": [
602 | "All these `drop` calls return a new `DataFrame`. If you'd like to modify it \"in place\", you can use the `inplace` parameter (there's an example below)."
603 | ]
604 | },
605 | {
606 | "cell_type": "markdown",
607 | "metadata": {},
608 | "source": [
609 | "\n",
610 | "\n",
611 | "## Operations"
612 | ]
613 | },
614 | {
615 | "cell_type": "code",
616 | "execution_count": null,
617 | "metadata": {
618 | "scrolled": false
619 | },
620 | "outputs": [],
621 | "source": [
622 | "df[['Population', 'GDP']]"
623 | ]
624 | },
625 | {
626 | "cell_type": "code",
627 | "execution_count": null,
628 | "metadata": {},
629 | "outputs": [],
630 | "source": [
631 | "df[['Population', 'GDP']] / 100"
632 | ]
633 | },
634 | {
635 | "cell_type": "markdown",
636 | "metadata": {},
637 | "source": [
638 | "**Operations with Series** work at a column level: the Series index is matched against the column names and the values are broadcast down the rows (which can be counterintuitive)."
639 | ]
640 | },
641 | {
642 | "cell_type": "code",
643 | "execution_count": null,
644 | "metadata": {},
645 | "outputs": [],
646 | "source": [
647 | "crisis = pd.Series([-1_000_000, -0.3], index=['GDP', 'HDI'])"
648 | ]
649 | },
650 | {
651 | "cell_type": "code",
652 | "execution_count": null,
653 | "metadata": {},
654 | "outputs": [],
655 | "source": [
656 | "df[['GDP', 'HDI']] + crisis"
657 | ]
658 | },
659 | {
660 | "cell_type": "markdown",
661 | "metadata": {},
662 | "source": [
663 | "\n",
664 | "\n",
665 | "## Modifying DataFrames\n",
666 | "\n",
667 | "Modifying a `DataFrame` is simple and intuitive. You can add columns or replace column values without issues:"
668 | ]
669 | },
670 | {
671 | "cell_type": "markdown",
672 | "metadata": {},
673 | "source": [
674 | "### Adding a new column"
675 | ]
676 | },
677 | {
678 | "cell_type": "code",
679 | "execution_count": null,
680 | "metadata": {},
681 | "outputs": [],
682 | "source": [
683 | "langs = pd.Series(\n",
684 | " ['French', 'German', 'Italian'],\n",
685 | " index=['France', 'Germany', 'Italy'],\n",
686 | " name='Language'\n",
687 | ")"
688 | ]
689 | },
690 | {
691 | "cell_type": "code",
692 | "execution_count": null,
693 | "metadata": {},
694 | "outputs": [],
695 | "source": [
696 | "df['Language'] = langs"
697 | ]
698 | },
699 | {
700 | "cell_type": "code",
701 | "execution_count": null,
702 | "metadata": {
703 | "scrolled": false
704 | },
705 | "outputs": [],
706 | "source": [
707 | "df"
708 | ]
709 | },
710 | {
711 | "cell_type": "markdown",
712 | "metadata": {},
713 | "source": [
714 | "---\n",
715 | "### Replacing values per column"
716 | ]
717 | },
718 | {
719 | "cell_type": "code",
720 | "execution_count": null,
721 | "metadata": {},
722 | "outputs": [],
723 | "source": [
724 | "df['Language'] = 'English'"
725 | ]
726 | },
727 | {
728 | "cell_type": "code",
729 | "execution_count": null,
730 | "metadata": {},
731 | "outputs": [],
732 | "source": [
733 | "df"
734 | ]
735 | },
736 | {
737 | "cell_type": "markdown",
738 | "metadata": {},
739 | "source": [
740 | "---\n",
741 | "### Renaming Columns\n"
742 | ]
743 | },
744 | {
745 | "cell_type": "code",
746 | "execution_count": null,
747 | "metadata": {
748 | "scrolled": false
749 | },
750 | "outputs": [],
751 | "source": [
752 | "df.rename(\n",
753 | " columns={\n",
754 | " 'HDI': 'Human Development Index',\n",
755 | " 'Annual Popcorn Consumption': 'APC'\n",
756 | " }, index={\n",
757 | " 'United States': 'USA',\n",
758 | " 'United Kingdom': 'UK',\n",
759 | " 'Argentina': 'AR'\n",
760 | " })"
761 | ]
762 | },
763 | {
764 | "cell_type": "code",
765 | "execution_count": null,
766 | "metadata": {
767 | "scrolled": true
768 | },
769 | "outputs": [],
770 | "source": [
771 | "df.rename(index=str.upper)"
772 | ]
773 | },
774 | {
775 | "cell_type": "code",
776 | "execution_count": null,
777 | "metadata": {
778 | "scrolled": true
779 | },
780 | "outputs": [],
781 | "source": [
782 | "df.rename(index=lambda x: x.lower())"
783 | ]
784 | },
785 | {
786 | "cell_type": "markdown",
787 | "metadata": {},
788 | "source": [
789 | "---\n",
790 | "### Dropping columns"
791 | ]
792 | },
793 | {
794 | "cell_type": "code",
795 | "execution_count": null,
796 | "metadata": {},
797 | "outputs": [],
798 | "source": [
799 | "df.drop(columns='Language', inplace=True)"
800 | ]
801 | },
802 | {
803 | "cell_type": "markdown",
804 | "metadata": {},
805 | "source": [
806 | "---\n",
807 | "### Adding values"
808 | ]
809 | },
810 | {
811 | "cell_type": "code",
812 | "execution_count": null,
813 | "metadata": {
814 | "scrolled": false
815 | },
816 | "outputs": [],
817 | "source": [
818 | "df.append(pd.Series({\n",
819 | " 'Population': 3,\n",
820 | " 'GDP': 5\n",
821 | "}, name='China'))"
822 | ]
823 | },
824 | {
825 | "cell_type": "markdown",
826 | "metadata": {},
827 | "source": [
828 | "`append` returns a new `DataFrame`; the original `df` is left unchanged:"
829 | ]
830 | },
831 | {
832 | "cell_type": "code",
833 | "execution_count": null,
834 | "metadata": {
835 | "scrolled": false
836 | },
837 | "outputs": [],
838 | "source": [
839 | "df"
840 | ]
841 | },
842 | {
843 | "cell_type": "markdown",
844 | "metadata": {},
845 | "source": [
846 | "You can also assign a new row directly with `loc`, which modifies the `DataFrame` in place:"
847 | ]
848 | },
849 | {
850 | "cell_type": "code",
851 | "execution_count": null,
852 | "metadata": {},
853 | "outputs": [],
854 | "source": [
855 | "df.loc['China'] = pd.Series({'Population': 1_400_000_000, 'Continent': 'Asia'})"
856 | ]
857 | },
858 | {
859 | "cell_type": "code",
860 | "execution_count": null,
861 | "metadata": {
862 | "scrolled": true
863 | },
864 | "outputs": [],
865 | "source": [
866 | "df"
867 | ]
868 | },
869 | {
870 | "cell_type": "markdown",
871 | "metadata": {},
872 | "source": [
873 | "We can use `drop` to remove a row by its index label:"
874 | ]
875 | },
876 | {
877 | "cell_type": "code",
878 | "execution_count": null,
879 | "metadata": {},
880 | "outputs": [],
881 | "source": [
882 | "df.drop('China', inplace=True)"
883 | ]
884 | },
885 | {
886 | "cell_type": "code",
887 | "execution_count": null,
888 | "metadata": {
889 | "scrolled": false
890 | },
891 | "outputs": [],
892 | "source": [
893 | "df"
894 | ]
895 | },
896 | {
897 | "cell_type": "markdown",
898 | "metadata": {},
899 | "source": [
900 | "---\n",
901 | "### More radical index changes"
902 | ]
903 | },
904 | {
905 | "cell_type": "code",
906 | "execution_count": null,
907 | "metadata": {},
908 | "outputs": [],
909 | "source": [
910 | "df.reset_index()"
911 | ]
912 | },
913 | {
914 | "cell_type": "code",
915 | "execution_count": null,
916 | "metadata": {
917 | "scrolled": true
918 | },
919 | "outputs": [],
920 | "source": [
921 | "df.set_index('Population')"
922 | ]
923 | },
924 | {
925 | "cell_type": "markdown",
926 | "metadata": {},
927 | "source": [
928 | "\n",
929 | "\n",
930 | "## Creating columns from other columns\n",
931 | "\n",
932 | "Altering a DataFrame often involves combining existing columns into a new one. For example, in our countries analysis, we could calculate the \"GDP per capita\", which is just `GDP / Population`."
933 | ]
934 | },
935 | {
936 | "cell_type": "code",
937 | "execution_count": null,
938 | "metadata": {
939 | "scrolled": true
940 | },
941 | "outputs": [],
942 | "source": [
943 | "df[['Population', 'GDP']]"
944 | ]
945 | },
946 | {
947 | "cell_type": "markdown",
948 | "metadata": {},
949 | "source": [
950 | "The regular pandas way of expressing that is to divide one series by the other:"
951 | ]
952 | },
953 | {
954 | "cell_type": "code",
955 | "execution_count": null,
956 | "metadata": {},
957 | "outputs": [],
958 | "source": [
959 | "df['GDP'] / df['Population']"
960 | ]
961 | },
962 | {
963 | "cell_type": "markdown",
964 | "metadata": {},
965 | "source": [
966 | "The result of that operation is just another series that you can add to the original `DataFrame`:"
967 | ]
968 | },
969 | {
970 | "cell_type": "code",
971 | "execution_count": null,
972 | "metadata": {},
973 | "outputs": [],
974 | "source": [
975 | "df['GDP Per Capita'] = df['GDP'] / df['Population']"
976 | ]
977 | },
978 | {
979 | "cell_type": "code",
980 | "execution_count": null,
981 | "metadata": {},
982 | "outputs": [],
983 | "source": [
984 | "df"
985 | ]
986 | },
987 | {
988 | "cell_type": "markdown",
989 | "metadata": {},
990 | "source": [
991 | "\n",
992 | "\n",
993 | "## Statistical info\n",
994 | "\n",
995 | "You've already seen the `describe` method, which gives you a good \"summary\" of the `DataFrame`. Let's explore other methods in more detail:"
996 | ]
997 | },
998 | {
999 | "cell_type": "code",
1000 | "execution_count": null,
1001 | "metadata": {},
1002 | "outputs": [],
1003 | "source": [
1004 | "df.head()"
1005 | ]
1006 | },
1007 | {
1008 | "cell_type": "code",
1009 | "execution_count": null,
1010 | "metadata": {},
1011 | "outputs": [],
1012 | "source": [
1013 | "df.describe()"
1014 | ]
1015 | },
1016 | {
1017 | "cell_type": "code",
1018 | "execution_count": null,
1019 | "metadata": {},
1020 | "outputs": [],
1021 | "source": [
1022 | "population = df['Population']"
1023 | ]
1024 | },
1025 | {
1026 | "cell_type": "code",
1027 | "execution_count": null,
1028 | "metadata": {},
1029 | "outputs": [],
1030 | "source": [
1031 | "population.min(), population.max()"
1032 | ]
1033 | },
1034 | {
1035 | "cell_type": "code",
1036 | "execution_count": null,
1037 | "metadata": {},
1038 | "outputs": [],
1039 | "source": [
1040 | "population.sum()"
1041 | ]
1042 | },
1043 | {
1044 | "cell_type": "code",
1045 | "execution_count": null,
1046 | "metadata": {},
1047 | "outputs": [],
1048 | "source": [
1049 | "population.sum() / len(population)"
1050 | ]
1051 | },
1052 | {
1053 | "cell_type": "code",
1054 | "execution_count": null,
1055 | "metadata": {
1056 | "scrolled": true
1057 | },
1058 | "outputs": [],
1059 | "source": [
1060 | "population.mean()"
1061 | ]
1062 | },
1063 | {
1064 | "cell_type": "code",
1065 | "execution_count": null,
1066 | "metadata": {},
1067 | "outputs": [],
1068 | "source": [
1069 | "population.std()"
1070 | ]
1071 | },
1072 | {
1073 | "cell_type": "code",
1074 | "execution_count": null,
1075 | "metadata": {},
1076 | "outputs": [],
1077 | "source": [
1078 | "population.median()"
1079 | ]
1080 | },
1081 | {
1082 | "cell_type": "code",
1083 | "execution_count": null,
1084 | "metadata": {
1085 | "scrolled": true
1086 | },
1087 | "outputs": [],
1088 | "source": [
1089 | "population.describe()"
1090 | ]
1091 | },
1092 | {
1093 | "cell_type": "code",
1094 | "execution_count": null,
1095 | "metadata": {
1096 | "scrolled": true
1097 | },
1098 | "outputs": [],
1099 | "source": [
1100 | "population.quantile(.25)"
1101 | ]
1102 | },
1103 | {
1104 | "cell_type": "code",
1105 | "execution_count": null,
1106 | "metadata": {
1107 | "scrolled": false
1108 | },
1109 | "outputs": [],
1110 | "source": [
1111 | "population.quantile([.2, .4, .6, .8, 1])"
1112 | ]
1113 | },
1114 | {
1115 | "cell_type": "markdown",
1116 | "metadata": {},
1117 | "source": [
1118 | "\n"
1119 | ]
1120 | }
1121 | ],
1122 | "metadata": {
1123 | "kernelspec": {
1124 | "display_name": "Python 3",
1125 | "language": "python",
1126 | "name": "python3"
1127 | },
1128 | "language_info": {
1129 | "codemirror_mode": {
1130 | "name": "ipython",
1131 | "version": 3
1132 | },
1133 | "file_extension": ".py",
1134 | "mimetype": "text/x-python",
1135 | "name": "python",
1136 | "nbconvert_exporter": "python",
1137 | "pygments_lexer": "ipython3",
1138 | "version": "3.7.4"
1139 | }
1140 | },
1141 | "nbformat": 4,
1142 | "nbformat_minor": 2
1143 | }
1144 |
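A note on the "Adding values" cell in the notebook above: `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so `df.append(...)` only runs on older versions. The following is a minimal sketch of the same step with `pd.concat`. It assumes a small stand-in `df` whose values are illustrative and are not the notebook's data.

```python
import pandas as pd

# Hypothetical stand-in for the notebook's countries DataFrame (illustrative values).
df = pd.DataFrame(
    {'Population': [35.5, 64.0], 'GDP': [1800, 2800]},
    index=['Canada', 'France'],
)

# Turn the Series into a one-row DataFrame; its name becomes the row label.
new_row = pd.Series({'Population': 3, 'GDP': 5}, name='China').to_frame().T

# Like append, concat returns a new DataFrame and leaves df untouched.
extended = pd.concat([df, new_row])

print(extended)
print(df.shape)
```

As with `append`, the result has to be assigned back (`df = pd.concat(...)`) if the new row should persist.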
--------------------------------------------------------------------------------
/.ipynb_checkpoints/4 - Pandas DataFrames exercises-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "\n",
8 | "\n",
9 | "\n",
10 | "# Pandas DataFrame exercises\n"
11 | ]
12 | },
13 | {
14 | "cell_type": "code",
15 | "execution_count": null,
16 | "metadata": {},
17 | "outputs": [],
18 | "source": [
19 | "# Import the numpy package under the name np\n",
20 | "import numpy as np\n",
21 | "\n",
22 | "# Import the pandas package under the name pd\n",
23 | "import pandas as pd\n",
24 | "\n",
25 | "# Import the matplotlib package under the name plt\n",
26 | "import matplotlib.pyplot as plt\n",
27 | "%matplotlib inline\n",
28 | "\n",
29 | "# Print the pandas version\n",
30 | "print(pd.__version__)"
31 | ]
32 | },
33 | {
34 | "cell_type": "markdown",
35 | "metadata": {},
36 | "source": [
37 | "\n",
38 | "\n",
39 | "## DataFrame creation"
40 | ]
41 | },
42 | {
43 | "cell_type": "markdown",
44 | "metadata": {},
45 | "source": [
46 | "### Create an empty pandas DataFrame\n"
47 | ]
48 | },
49 | {
50 | "cell_type": "code",
51 | "execution_count": null,
52 | "metadata": {},
53 | "outputs": [],
54 | "source": [
55 | "# your code goes here\n"
56 | ]
57 | },
58 | {
59 | "cell_type": "code",
60 | "execution_count": null,
61 | "metadata": {
62 | "cell_type": "solution"
63 | },
64 | "outputs": [],
65 | "source": [
66 | "# An empty DataFrame has no rows and no columns;\n",
67 | "# passing data=[None] would create a 1x1 frame instead\n",
68 | "pd.DataFrame()"
69 | ]
70 | },
71 | {
72 | "cell_type": "markdown",
73 | "metadata": {},
74 | "source": [
75 | ""
76 | ]
77 | },
78 | {
79 | "cell_type": "markdown",
80 | "metadata": {},
81 | "source": [
82 | "\n",
83 | "\n",
84 | "### Create a `marvel_df` pandas DataFrame with the given marvel data\n"
85 | ]
86 | },
87 | {
88 | "cell_type": "code",
89 | "execution_count": null,
90 | "metadata": {},
91 | "outputs": [],
92 | "source": [
93 | "marvel_data = [\n",
94 | " ['Spider-Man', 'male', 1962],\n",
95 | " ['Captain America', 'male', 1941],\n",
96 | " ['Wolverine', 'male', 1974],\n",
97 | " ['Iron Man', 'male', 1963],\n",
98 | " ['Thor', 'male', 1963],\n",
99 | " ['Thing', 'male', 1961],\n",
100 | " ['Mister Fantastic', 'male', 1961],\n",
101 | " ['Hulk', 'male', 1962],\n",
102 | " ['Beast', 'male', 1963],\n",
103 | " ['Invisible Woman', 'female', 1961],\n",
104 | " ['Storm', 'female', 1975],\n",
105 | " ['Namor', 'male', 1939],\n",
106 | " ['Hawkeye', 'male', 1964],\n",
107 | " ['Daredevil', 'male', 1964],\n",
108 | " ['Doctor Strange', 'male', 1963],\n",
109 | " ['Hank Pym', 'male', 1962],\n",
110 | " ['Scarlet Witch', 'female', 1964],\n",
111 | " ['Wasp', 'female', 1963],\n",
112 | " ['Black Widow', 'female', 1964],\n",
113 | " ['Vision', 'male', 1968]\n",
114 | "]"
115 | ]
116 | },
117 | {
118 | "cell_type": "code",
119 | "execution_count": null,
120 | "metadata": {},
121 | "outputs": [],
122 | "source": [
123 | "# your code goes here\n"
124 | ]
125 | },
126 | {
127 | "cell_type": "code",
128 | "execution_count": null,
129 | "metadata": {
130 | "cell_type": "solution"
131 | },
132 | "outputs": [],
133 | "source": [
134 | "marvel_df = pd.DataFrame(data=marvel_data)\n",
135 | "\n",
136 | "marvel_df"
137 | ]
138 | },
139 | {
140 | "cell_type": "markdown",
141 | "metadata": {},
142 | "source": [
143 | "\n",
144 | "\n",
145 | "### Add column names to the `marvel_df`\n",
146 | " "
147 | ]
148 | },
149 | {
150 | "cell_type": "code",
151 | "execution_count": null,
152 | "metadata": {},
153 | "outputs": [],
154 | "source": [
155 | "# your code goes here\n"
156 | ]
157 | },
158 | {
159 | "cell_type": "code",
160 | "execution_count": null,
161 | "metadata": {
162 | "cell_type": "solution"
163 | },
164 | "outputs": [],
165 | "source": [
166 | "col_names = ['name', 'sex', 'first_appearance']\n",
167 | "\n",
168 | "marvel_df.columns = col_names\n",
169 | "marvel_df"
170 | ]
171 | },
172 | {
173 | "cell_type": "markdown",
174 | "metadata": {},
175 | "source": [
176 | "\n",
177 | "\n",
178 | "### Add index names to the `marvel_df` (use the character name as index)\n"
179 | ]
180 | },
181 | {
182 | "cell_type": "code",
183 | "execution_count": null,
184 | "metadata": {},
185 | "outputs": [],
186 | "source": [
187 | "# your code goes here\n"
188 | ]
189 | },
190 | {
191 | "cell_type": "code",
192 | "execution_count": null,
193 | "metadata": {
194 | "cell_type": "solution"
195 | },
196 | "outputs": [],
197 | "source": [
198 | "marvel_df.index = marvel_df['name']\n",
199 | "marvel_df"
200 | ]
201 | },
202 | {
203 | "cell_type": "markdown",
204 | "metadata": {},
205 | "source": [
206 | "\n",
207 | "\n",
208 | "### Drop the name column as it's now the index"
209 | ]
210 | },
211 | {
212 | "cell_type": "code",
213 | "execution_count": null,
214 | "metadata": {},
215 | "outputs": [],
216 | "source": [
217 | "# your code goes here\n"
218 | ]
219 | },
220 | {
221 | "cell_type": "code",
222 | "execution_count": null,
223 | "metadata": {
224 | "cell_type": "solution"
225 | },
226 | "outputs": [],
227 | "source": [
228 | "#marvel_df = marvel_df.drop(columns=['name'])\n",
229 | "marvel_df = marvel_df.drop(['name'], axis=1)\n",
230 | "marvel_df"
231 | ]
232 | },
233 | {
234 | "cell_type": "markdown",
235 | "metadata": {},
236 | "source": [
237 | "\n",
238 | "\n",
239 | "### Drop 'Namor' and 'Hank Pym' rows\n"
240 | ]
241 | },
242 | {
243 | "cell_type": "code",
244 | "execution_count": null,
245 | "metadata": {},
246 | "outputs": [],
247 | "source": [
248 | "# your code goes here\n"
249 | ]
250 | },
251 | {
252 | "cell_type": "code",
253 | "execution_count": null,
254 | "metadata": {
255 | "cell_type": "solution",
256 | "scrolled": false
257 | },
258 | "outputs": [],
259 | "source": [
260 | "marvel_df = marvel_df.drop(['Namor', 'Hank Pym'], axis=0)\n",
261 | "marvel_df"
262 | ]
263 | },
264 | {
265 | "cell_type": "markdown",
266 | "metadata": {},
267 | "source": [
268 | "\n",
269 | "\n",
270 | "## DataFrame selection, slicing and indexation"
271 | ]
272 | },
273 | {
274 | "cell_type": "markdown",
275 | "metadata": {},
276 | "source": [
277 | "### Show the first 5 elements on `marvel_df`\n",
278 | " "
279 | ]
280 | },
281 | {
282 | "cell_type": "code",
283 | "execution_count": null,
284 | "metadata": {},
285 | "outputs": [],
286 | "source": [
287 | "# your code goes here\n"
288 | ]
289 | },
290 | {
291 | "cell_type": "code",
292 | "execution_count": null,
293 | "metadata": {
294 | "cell_type": "solution"
295 | },
296 | "outputs": [],
297 | "source": [
298 | "#marvel_df.loc[['Spider-Man', 'Captain America', 'Wolverine', 'Iron Man', 'Thor'], :] # bad!\n",
299 | "#marvel_df.loc['Spider-Man': 'Thor', :]\n",
300 | "#marvel_df.iloc[0:5, :]\n",
301 | "#marvel_df.iloc[0:5,]\n",
302 | "marvel_df.iloc[:5,]\n",
303 | "#marvel_df.head()"
304 | ]
305 | },
306 | {
307 | "cell_type": "markdown",
308 | "metadata": {},
309 | "source": [
310 | "\n",
311 | "\n",
312 | "### Show the last 5 elements on `marvel_df`\n"
313 | ]
314 | },
315 | {
316 | "cell_type": "code",
317 | "execution_count": null,
318 | "metadata": {},
319 | "outputs": [],
320 | "source": [
321 | "# your code goes here\n"
322 | ]
323 | },
324 | {
325 | "cell_type": "code",
326 | "execution_count": null,
327 | "metadata": {
328 | "cell_type": "solution"
329 | },
330 | "outputs": [],
331 | "source": [
332 | "#marvel_df.loc[['Doctor Strange', 'Scarlet Witch', 'Wasp', 'Black Widow', 'Vision'], :] # bad!\n",
333 | "#marvel_df.loc['Doctor Strange':'Vision', :]\n",
334 | "marvel_df.iloc[-5:,]\n",
335 | "#marvel_df.tail()"
336 | ]
337 | },
338 | {
339 | "cell_type": "markdown",
340 | "metadata": {},
341 | "source": [
342 | "\n",
343 | "\n",
344 | "### Show just the sex of the first 5 elements on `marvel_df`"
345 | ]
346 | },
347 | {
348 | "cell_type": "code",
349 | "execution_count": null,
350 | "metadata": {},
351 | "outputs": [],
352 | "source": [
353 | "# your code goes here\n"
354 | ]
355 | },
356 | {
357 | "cell_type": "code",
358 | "execution_count": null,
359 | "metadata": {
360 | "cell_type": "solution"
361 | },
362 | "outputs": [],
363 | "source": [
364 | "#marvel_df.iloc[:5,]['sex'].to_frame()\n",
365 | "marvel_df.iloc[:5,].sex.to_frame()\n",
366 | "#marvel_df.head().sex.to_frame()"
367 | ]
368 | },
369 | {
370 | "cell_type": "markdown",
371 | "metadata": {},
372 | "source": [
373 | "\n",
374 | "\n",
375 | "### Show the first_appearance of all middle elements on `marvel_df` "
376 | ]
377 | },
378 | {
379 | "cell_type": "code",
380 | "execution_count": null,
381 | "metadata": {},
382 | "outputs": [],
383 | "source": [
384 | "# your code goes here\n"
385 | ]
386 | },
387 | {
388 | "cell_type": "code",
389 | "execution_count": null,
390 | "metadata": {
391 | "cell_type": "solution",
392 | "scrolled": false
393 | },
394 | "outputs": [],
395 | "source": [
396 | "marvel_df.iloc[1:-1,].first_appearance.to_frame()"
397 | ]
398 | },
399 | {
400 | "cell_type": "markdown",
401 | "metadata": {},
402 | "source": [
403 | "\n",
404 | "\n",
405 | "### Show the first and last elements on `marvel_df`\n"
406 | ]
407 | },
408 | {
409 | "cell_type": "code",
410 | "execution_count": null,
411 | "metadata": {},
412 | "outputs": [],
413 | "source": [
414 | "# your code goes here\n"
415 | ]
416 | },
417 | {
418 | "cell_type": "code",
419 | "execution_count": null,
420 | "metadata": {
421 | "cell_type": "solution"
422 | },
423 | "outputs": [],
424 | "source": [
425 | "#marvel_df.iloc[[0, -1],][['sex', 'first_appearance']]\n",
426 | "marvel_df.iloc[[0, -1],]"
427 | ]
428 | },
429 | {
430 | "cell_type": "markdown",
431 | "metadata": {},
432 | "source": [
433 | "\n",
434 | "\n",
435 | "## DataFrame manipulation and operations"
436 | ]
437 | },
438 | {
439 | "cell_type": "markdown",
440 | "metadata": {},
441 | "source": [
442 | "### Modify the `first_appearance` of 'Vision' to year 1964"
443 | ]
444 | },
445 | {
446 | "cell_type": "code",
447 | "execution_count": null,
448 | "metadata": {},
449 | "outputs": [],
450 | "source": [
451 | "# your code goes here\n"
452 | ]
453 | },
454 | {
455 | "cell_type": "code",
456 | "execution_count": null,
457 | "metadata": {
458 | "cell_type": "solution"
459 | },
460 | "outputs": [],
461 | "source": [
462 | "marvel_df.loc['Vision', 'first_appearance'] = 1964\n",
463 | "\n",
464 | "marvel_df"
465 | ]
466 | },
467 | {
468 | "cell_type": "markdown",
469 | "metadata": {},
470 | "source": [
471 | "\n",
472 | "\n",
473 | "### Add a new column to `marvel_df` called 'years_since' with the years since `first_appearance`\n"
474 | ]
475 | },
476 | {
477 | "cell_type": "code",
478 | "execution_count": null,
479 | "metadata": {},
480 | "outputs": [],
481 | "source": [
482 | "# your code goes here\n"
483 | ]
484 | },
485 | {
486 | "cell_type": "code",
487 | "execution_count": null,
488 | "metadata": {
489 | "cell_type": "solution"
490 | },
491 | "outputs": [],
492 | "source": [
493 | "marvel_df['years_since'] = 2018 - marvel_df['first_appearance']\n",
494 | "\n",
495 | "marvel_df"
496 | ]
497 | },
498 | {
499 | "cell_type": "markdown",
500 | "metadata": {},
501 | "source": [
502 | "\n",
503 | "\n",
504 | "## DataFrame boolean arrays (also called masks)"
505 | ]
506 | },
507 | {
508 | "cell_type": "markdown",
509 | "metadata": {},
510 | "source": [
511 | "### Given the `marvel_df` pandas DataFrame, make a mask showing the female characters\n"
512 | ]
513 | },
514 | {
515 | "cell_type": "code",
516 | "execution_count": null,
517 | "metadata": {},
518 | "outputs": [],
519 | "source": [
520 | "# your code goes here\n"
521 | ]
522 | },
523 | {
524 | "cell_type": "code",
525 | "execution_count": null,
526 | "metadata": {
527 | "cell_type": "solution"
528 | },
529 | "outputs": [],
530 | "source": [
531 | "mask = marvel_df['sex'] == 'female'\n",
532 | "\n",
533 | "mask"
534 | ]
535 | },
536 | {
537 | "cell_type": "markdown",
538 | "metadata": {},
539 | "source": [
540 | "\n",
541 | "\n",
542 | "### Given the `marvel_df` pandas DataFrame, get the male characters\n"
543 | ]
544 | },
545 | {
546 | "cell_type": "code",
547 | "execution_count": null,
548 | "metadata": {},
549 | "outputs": [],
550 | "source": [
551 | "# your code goes here\n"
552 | ]
553 | },
554 | {
555 | "cell_type": "code",
556 | "execution_count": null,
557 | "metadata": {
558 | "cell_type": "solution"
559 | },
560 | "outputs": [],
561 | "source": [
562 | "mask = marvel_df['sex'] == 'male'\n",
563 | "\n",
564 | "marvel_df[mask]"
565 | ]
566 | },
567 | {
568 | "cell_type": "markdown",
569 | "metadata": {},
570 | "source": [
571 | "\n",
572 | "\n",
573 | "### Given the `marvel_df` pandas DataFrame, get the characters with `first_appearance` after 1970\n"
574 | ]
575 | },
576 | {
577 | "cell_type": "code",
578 | "execution_count": null,
579 | "metadata": {},
580 | "outputs": [],
581 | "source": [
582 | "# your code goes here\n"
583 | ]
584 | },
585 | {
586 | "cell_type": "code",
587 | "execution_count": null,
588 | "metadata": {
589 | "cell_type": "solution"
590 | },
591 | "outputs": [],
592 | "source": [
593 | "mask = marvel_df['first_appearance'] > 1970\n",
594 | "\n",
595 | "marvel_df[mask]"
596 | ]
597 | },
598 | {
599 | "cell_type": "markdown",
600 | "metadata": {},
601 | "source": [
602 | "\n",
603 | "\n",
604 | "### Given the `marvel_df` pandas DataFrame, get the female characters with `first_appearance` after 1970"
605 | ]
606 | },
607 | {
608 | "cell_type": "code",
609 | "execution_count": null,
610 | "metadata": {},
611 | "outputs": [],
612 | "source": [
613 | "# your code goes here\n"
614 | ]
615 | },
616 | {
617 | "cell_type": "code",
618 | "execution_count": null,
619 | "metadata": {
620 | "cell_type": "solution",
621 | "scrolled": true
622 | },
623 | "outputs": [],
624 | "source": [
625 | "mask = (marvel_df['sex'] == 'female') & (marvel_df['first_appearance'] > 1970)\n",
626 | "\n",
627 | "marvel_df[mask]"
628 | ]
629 | },
630 | {
631 | "cell_type": "markdown",
632 | "metadata": {},
633 | "source": [
634 | "\n",
635 | "\n",
636 | "## DataFrame summary statistics"
637 | ]
638 | },
639 | {
640 | "cell_type": "markdown",
641 | "metadata": {},
642 | "source": [
643 | "### Show basic statistics of `marvel_df`"
644 | ]
645 | },
646 | {
647 | "cell_type": "code",
648 | "execution_count": null,
649 | "metadata": {},
650 | "outputs": [],
651 | "source": [
652 | "# your code goes here\n"
653 | ]
654 | },
655 | {
656 | "cell_type": "code",
657 | "execution_count": null,
658 | "metadata": {
659 | "cell_type": "solution"
660 | },
661 | "outputs": [],
662 | "source": [
663 | "marvel_df.describe()"
664 | ]
665 | },
666 | {
667 | "cell_type": "markdown",
668 | "metadata": {},
669 | "source": [
670 | "\n",
671 | "\n",
672 | "### Given the `marvel_df` pandas DataFrame, show the mean value of `first_appearance`"
673 | ]
674 | },
675 | {
676 | "cell_type": "code",
677 | "execution_count": null,
678 | "metadata": {},
679 | "outputs": [],
680 | "source": [
681 | "# your code goes here\n"
682 | ]
683 | },
684 | {
685 | "cell_type": "code",
686 | "execution_count": null,
687 | "metadata": {
688 | "cell_type": "solution"
689 | },
690 | "outputs": [],
691 | "source": [
692 | "\n",
693 | "#np.mean(marvel_df.first_appearance)\n",
694 | "marvel_df.first_appearance.mean()"
695 | ]
696 | },
697 | {
698 | "cell_type": "markdown",
699 | "metadata": {},
700 | "source": [
701 | "\n",
702 | "\n",
703 | "### Given the `marvel_df` pandas DataFrame, show the min value of `first_appearance`\n"
704 | ]
705 | },
706 | {
707 | "cell_type": "code",
708 | "execution_count": null,
709 | "metadata": {},
710 | "outputs": [],
711 | "source": [
712 | "# your code goes here\n"
713 | ]
714 | },
715 | {
716 | "cell_type": "code",
717 | "execution_count": null,
718 | "metadata": {
719 | "cell_type": "solution"
720 | },
721 | "outputs": [],
722 | "source": [
723 | "#np.min(marvel_df.first_appearance)\n",
724 | "marvel_df.first_appearance.min()"
725 | ]
726 | },
727 | {
728 | "cell_type": "markdown",
729 | "metadata": {},
730 | "source": [
731 | "\n",
732 | "\n",
733 | "### Given the `marvel_df` pandas DataFrame, get the characters with the min value of `first_appearance`"
734 | ]
735 | },
736 | {
737 | "cell_type": "code",
738 | "execution_count": null,
739 | "metadata": {},
740 | "outputs": [],
741 | "source": [
742 | "# your code goes here\n"
743 | ]
744 | },
745 | {
746 | "cell_type": "code",
747 | "execution_count": null,
748 | "metadata": {
749 | "cell_type": "solution"
750 | },
751 | "outputs": [],
752 | "source": [
753 | "mask = marvel_df['first_appearance'] == marvel_df.first_appearance.min()\n",
754 | "marvel_df[mask]"
755 | ]
756 | },
757 | {
758 | "cell_type": "markdown",
759 | "metadata": {},
760 | "source": [
761 | "\n",
762 | "\n",
763 | "## DataFrame basic plotting"
764 | ]
765 | },
766 | {
767 | "cell_type": "markdown",
768 | "metadata": {},
769 | "source": [
770 | "### Reset index names of `marvel_df`\n"
771 | ]
772 | },
773 | {
774 | "cell_type": "code",
775 | "execution_count": null,
776 | "metadata": {},
777 | "outputs": [],
778 | "source": [
779 | "# your code goes here\n"
780 | ]
781 | },
782 | {
783 | "cell_type": "code",
784 | "execution_count": null,
785 | "metadata": {
786 | "cell_type": "solution"
787 | },
788 | "outputs": [],
789 | "source": [
790 | "marvel_df = marvel_df.reset_index()\n",
791 | "\n",
792 | "marvel_df"
793 | ]
794 | },
795 | {
796 | "cell_type": "markdown",
797 | "metadata": {},
798 | "source": [
799 | "\n",
800 | "\n",
801 | "### Plot the values of `first_appearance`\n"
802 | ]
803 | },
804 | {
805 | "cell_type": "code",
806 | "execution_count": null,
807 | "metadata": {},
808 | "outputs": [],
809 | "source": [
810 | "# your code goes here\n"
811 | ]
812 | },
813 | {
814 | "cell_type": "code",
815 | "execution_count": null,
816 | "metadata": {
817 | "cell_type": "solution"
818 | },
819 | "outputs": [],
820 | "source": [
821 | "#plt.plot(marvel_df.index, marvel_df.first_appearance)\n",
822 | "marvel_df.first_appearance.plot()"
823 | ]
824 | },
825 | {
826 | "cell_type": "markdown",
827 | "metadata": {},
828 | "source": [
829 | "\n",
830 | "\n",
831 | "### Plot a histogram (plot.hist) with values of `first_appearance`\n"
832 | ]
833 | },
834 | {
835 | "cell_type": "code",
836 | "execution_count": null,
837 | "metadata": {},
838 | "outputs": [],
839 | "source": [
840 | "# your code goes here\n"
841 | ]
842 | },
843 | {
844 | "cell_type": "code",
845 | "execution_count": null,
846 | "metadata": {
847 | "cell_type": "solution"
848 | },
849 | "outputs": [],
850 | "source": [
851 | "\n",
852 | "plt.hist(marvel_df.first_appearance)"
853 | ]
854 | },
855 | {
856 | "cell_type": "markdown",
857 | "metadata": {},
858 | "source": [
859 | "\n"
860 | ]
861 | }
862 | ],
863 | "metadata": {
864 | "kernelspec": {
865 | "display_name": "Python 3",
866 | "language": "python",
867 | "name": "python3"
868 | },
869 | "language_info": {
870 | "codemirror_mode": {
871 | "name": "ipython",
872 | "version": 3
873 | },
874 | "file_extension": ".py",
875 | "mimetype": "text/x-python",
876 | "name": "python",
877 | "nbconvert_exporter": "python",
878 | "pygments_lexer": "ipython3",
879 | "version": "3.7.4"
880 | }
881 | },
882 | "nbformat": 4,
883 | "nbformat_minor": 2
884 | }
885 |
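A note on the mask exercises in the notebook above: when conditions are combined with `&`, each comparison needs its own parentheses, because `&` binds more tightly than `==` and `>`. The following is a minimal sketch on a three-row subset of `marvel_data`. The variable names `is_female` and `after_1970` are illustrative, and the `|` and `~` operators are shown as an extension of the exercise.

```python
import pandas as pd

# Three rows taken from the notebook's marvel_data, indexed by character name.
marvel_df = pd.DataFrame(
    [['Wolverine', 'male', 1974],
     ['Storm', 'female', 1975],
     ['Wasp', 'female', 1963]],
    columns=['name', 'sex', 'first_appearance'],
).set_index('name')

# Naming each mask avoids the operator-precedence pitfall altogether.
is_female = marvel_df['sex'] == 'female'
after_1970 = marvel_df['first_appearance'] > 1970

both = marvel_df[is_female & after_1970]    # AND: both conditions hold
either = marvel_df[is_female | after_1970]  # OR: at least one holds
not_female = marvel_df[~is_female]          # NOT: invert the mask

print(list(both.index))
print(list(not_female.index))
```

Storing each condition in a named variable is a design choice for readability; the inline form `(a == x) & (b > y)` used in the notebook is equivalent.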
--------------------------------------------------------------------------------
/.ipynb_checkpoints/5 - Pandas - Reading CSV and Basic Plotting-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "\n",
8 | "\n",
9 | "\n",
10 | "\n",
12 | "\n",
13 | "# Reading external data & Plotting\n",
14 | "\n",
15 | "[Source](https://blockchain.info/charts/market-price)"
16 | ]
17 | },
18 | {
19 | "cell_type": "markdown",
20 | "metadata": {},
21 | "source": [
22 | "\n",
23 | "\n",
24 | "## Hands on! "
25 | ]
26 | },
27 | {
28 | "cell_type": "code",
29 | "execution_count": null,
30 | "metadata": {},
31 | "outputs": [],
32 | "source": [
33 | "import numpy as np\n",
34 | "import pandas as pd\n",
35 | "import matplotlib.pyplot as plt\n",
36 | "\n",
37 | "%matplotlib inline"
38 | ]
39 | },
40 | {
41 | "cell_type": "markdown",
42 | "metadata": {},
43 | "source": [
44 | "Pandas can easily read data stored in different file formats like CSV, JSON, XML or even Excel. Parsing always involves specifying the correct structure, encoding and other details. The `read_csv` function reads CSV files and accepts many parameters."
45 | ]
46 | },
47 | {
48 | "cell_type": "code",
49 | "execution_count": null,
50 | "metadata": {
51 | "scrolled": true
52 | },
53 | "outputs": [],
54 | "source": [
55 | "pd.read_csv"
56 | ]
57 | },
58 | {
59 | "cell_type": "code",
60 | "execution_count": null,
61 | "metadata": {},
62 | "outputs": [],
63 | "source": [
64 | "df = pd.read_csv('data/btc-market-price.csv')"
65 | ]
66 | },
67 | {
68 | "cell_type": "code",
69 | "execution_count": null,
70 | "metadata": {},
71 | "outputs": [],
72 | "source": [
73 | "df.head()"
74 | ]
75 | },
76 | {
77 | "cell_type": "markdown",
78 | "metadata": {},
79 | "source": [
80 | "The CSV file we're reading has only two columns: `timestamp` and `price`. It doesn't have a header row, it contains whitespace, and its values are separated by commas. pandas automatically used the first row of data as the header, which is incorrect. We can override this behavior with the `header` parameter:"
81 | ]
82 | },
83 | {
84 | "cell_type": "code",
85 | "execution_count": null,
86 | "metadata": {},
87 | "outputs": [],
88 | "source": [
89 | "df = pd.read_csv('data/btc-market-price.csv', header=None)"
90 | ]
91 | },
92 | {
93 | "cell_type": "code",
94 | "execution_count": null,
95 | "metadata": {},
96 | "outputs": [],
97 | "source": [
98 | "df.head()"
99 | ]
100 | },
101 | {
102 | "cell_type": "markdown",
103 | "metadata": {},
104 | "source": [
105 | "We can then set the names of each column explicitly by setting the `df.columns` attribute:"
106 | ]
107 | },
108 | {
109 | "cell_type": "code",
110 | "execution_count": null,
111 | "metadata": {},
112 | "outputs": [],
113 | "source": [
114 | "df.columns = ['Timestamp', 'Price']"
115 | ]
116 | },
117 | {
118 | "cell_type": "code",
119 | "execution_count": null,
120 | "metadata": {},
121 | "outputs": [],
122 | "source": [
123 | "df.head()"
124 | ]
125 | },
126 | {
127 | "cell_type": "markdown",
128 | "metadata": {},
129 | "source": [
130 | "The type of the `Price` column was correctly interpreted as `float`, but the `Timestamp` was interpreted as a regular string (`object` in pandas notation):"
131 | ]
132 | },
133 | {
134 | "cell_type": "code",
135 | "execution_count": null,
136 | "metadata": {},
137 | "outputs": [],
138 | "source": [
139 | "df.dtypes"
140 | ]
141 | },
142 | {
143 | "cell_type": "markdown",
144 | "metadata": {},
145 | "source": [
146 | "We can perform a vectorized operation to parse all the Timestamp values as `Datetime` objects:"
147 | ]
148 | },
149 | {
150 | "cell_type": "code",
151 | "execution_count": null,
152 | "metadata": {},
153 | "outputs": [],
154 | "source": [
155 | "pd.to_datetime(df['Timestamp']).head()"
156 | ]
157 | },
158 | {
159 | "cell_type": "code",
160 | "execution_count": null,
161 | "metadata": {},
162 | "outputs": [],
163 | "source": [
164 | "df['Timestamp'] = pd.to_datetime(df['Timestamp'])"
165 | ]
166 | },
167 | {
168 | "cell_type": "code",
169 | "execution_count": null,
170 | "metadata": {
171 | "scrolled": true
172 | },
173 | "outputs": [],
174 | "source": [
175 | "df.head()"
176 | ]
177 | },
178 | {
179 | "cell_type": "code",
180 | "execution_count": null,
181 | "metadata": {},
182 | "outputs": [],
183 | "source": [
184 | "df.dtypes"
185 | ]
186 | },
187 | {
188 | "cell_type": "markdown",
189 | "metadata": {},
190 | "source": [
191 | "The timestamp looks a lot like the index of this `DataFrame`: `date > price`. We can replace the auto-incremented index generated by pandas and use the `Timestamp` column as the index:"
192 | ]
193 | },
194 | {
195 | "cell_type": "code",
196 | "execution_count": null,
197 | "metadata": {},
198 | "outputs": [],
199 | "source": [
200 | "df.set_index('Timestamp', inplace=True)"
201 | ]
202 | },
203 | {
204 | "cell_type": "code",
205 | "execution_count": null,
206 | "metadata": {},
207 | "outputs": [],
208 | "source": [
209 | "df.head()"
210 | ]
211 | },
212 | {
213 | "cell_type": "markdown",
214 | "metadata": {},
215 | "source": [
216 | "\n",
217 | "\n",
218 | "## Putting everything together\n",
219 | "\n",
220 | "And now, we've finally arrived at the final, desired version of the `DataFrame` parsed from our CSV file. The steps were:"
221 | ]
222 | },
223 | {
224 | "cell_type": "code",
225 | "execution_count": null,
226 | "metadata": {},
227 | "outputs": [],
228 | "source": [
229 | "df = pd.read_csv('data/btc-market-price.csv', header=None)\n",
230 | "df.columns = ['Timestamp', 'Price']\n",
231 | "df['Timestamp'] = pd.to_datetime(df['Timestamp'])\n",
232 | "df.set_index('Timestamp', inplace=True)"
233 | ]
234 | },
235 | {
236 | "cell_type": "code",
237 | "execution_count": null,
238 | "metadata": {
239 | "scrolled": false
240 | },
241 | "outputs": [],
242 | "source": [
243 | "df.head()"
244 | ]
245 | },
246 | {
247 | "cell_type": "markdown",
248 | "metadata": {},
249 | "source": [
250 | "**There should be a better way**. And there is 😎. There usually is, especially for repetitive tasks like these in pandas.\n",
251 | "\n",
252 | "The `read_csv` function is extremely powerful and lets you specify many more parameters at import time. We can achieve the same result with a single call:"
253 | ]
254 | },
255 | {
256 | "cell_type": "code",
257 | "execution_count": null,
258 | "metadata": {},
259 | "outputs": [],
260 | "source": [
261 | "df = pd.read_csv(\n",
262 | " 'data/btc-market-price.csv',\n",
263 | " header=None,\n",
264 | " names=['Timestamp', 'Price'],\n",
265 | " index_col=0,\n",
266 | " parse_dates=True\n",
267 | ")"
268 | ]
269 | },
270 | {
271 | "cell_type": "code",
272 | "execution_count": null,
273 | "metadata": {
274 | "scrolled": true
275 | },
276 | "outputs": [],
277 | "source": [
278 | "df.head()"
279 | ]
280 | },
281 | {
282 | "cell_type": "markdown",
283 | "metadata": {},
284 | "source": [
285 | "\n",
286 | "\n",
287 | "## Plotting basics\n",
288 | "\n",
289 | "`pandas` integrates with Matplotlib and creating a plot is as simple as:"
290 | ]
291 | },
292 | {
293 | "cell_type": "code",
294 | "execution_count": null,
295 | "metadata": {
296 | "scrolled": true
297 | },
298 | "outputs": [],
299 | "source": [
300 | "df.plot()"
301 | ]
302 | },
303 | {
304 | "cell_type": "markdown",
305 | "metadata": {},
306 | "source": [
307 | "Behind the scenes, it's using `matplotlib.pyplot`'s interface. We can create a similar plot with the `plt.plot()` function:"
308 | ]
309 | },
310 | {
311 | "cell_type": "code",
312 | "execution_count": null,
313 | "metadata": {
314 | "scrolled": true
315 | },
316 | "outputs": [],
317 | "source": [
318 | "plt.plot(df.index, df['Price'])"
319 | ]
320 | },
321 | {
322 | "cell_type": "markdown",
323 | "metadata": {},
324 | "source": [
325 | "`plt.plot()` accepts many parameters, but the first two are the most important: the values for the `X` and `Y` axes. Another example:"
326 | ]
327 | },
328 | {
329 | "cell_type": "code",
330 | "execution_count": null,
331 | "metadata": {},
332 | "outputs": [],
333 | "source": [
334 | "x = np.arange(-10, 11)"
335 | ]
336 | },
337 | {
338 | "cell_type": "code",
339 | "execution_count": null,
340 | "metadata": {
341 | "scrolled": false
342 | },
343 | "outputs": [],
344 | "source": [
345 | "plt.plot(x, x ** 2)"
346 | ]
347 | },
348 | {
349 | "cell_type": "markdown",
350 | "metadata": {},
351 | "source": [
352 | "We're using `matplotlib`'s global API, which is horrible, but it's the most popular one. Later we'll learn how to use the _OOP_ API, which will make our work much easier."
353 | ]
354 | },
355 | {
356 | "cell_type": "code",
357 | "execution_count": null,
358 | "metadata": {
359 | "scrolled": true
360 | },
361 | "outputs": [],
362 | "source": [
363 | "plt.plot(x, x ** 2)\n",
364 | "plt.plot(x, -1 * (x ** 2))"
365 | ]
366 | },
367 | {
368 | "cell_type": "markdown",
369 | "metadata": {},
370 | "source": [
371 | "Each `plt` function alters the global state. If you want to configure your plot (for example, its size), you can use the `plt.figure` function. Others, like `plt.title`, keep altering the global plot:"
372 | ]
373 | },
374 | {
375 | "cell_type": "code",
376 | "execution_count": null,
377 | "metadata": {
378 | "scrolled": false
379 | },
380 | "outputs": [],
381 | "source": [
382 | "plt.figure(figsize=(12, 6))\n",
383 | "plt.plot(x, x ** 2)\n",
384 | "plt.plot(x, -1 * (x ** 2))\n",
385 | "\n",
386 | "plt.title('My Nice Plot')"
387 | ]
388 | },
389 | {
390 | "cell_type": "markdown",
391 | "metadata": {},
392 | "source": [
393 | "Some of the arguments of `plt.figure` and `plt.plot` are also available in pandas' `plot` interface:"
394 | ]
395 | },
396 | {
397 | "cell_type": "code",
398 | "execution_count": null,
399 | "metadata": {
400 | "scrolled": true
401 | },
402 | "outputs": [],
403 | "source": [
404 | "df.plot(figsize=(16, 9), title='Bitcoin Price 2017-2018')"
405 | ]
406 | },
407 | {
408 | "cell_type": "markdown",
409 | "metadata": {},
410 | "source": [
411 | "\n",
412 | "\n",
413 | "## A more challenging parsing\n",
414 | "\n",
415 | "To demonstrate plotting two columns together, we'll try to add Ether prices to our `df` DataFrame. The ETH price data can be found in the `data/eth-price.csv` file. The problem is that this CSV file seems to have been created by someone who really hated programmers. Take a look at it and see how ugly it looks. We'll still use `pandas` to parse it."
416 | ]
417 | },
418 | {
419 | "cell_type": "code",
420 | "execution_count": null,
421 | "metadata": {
422 | "scrolled": true
423 | },
424 | "outputs": [],
425 | "source": [
426 | "eth = pd.read_csv('data/eth-price.csv')\n",
427 | "\n",
428 | "eth.head()"
429 | ]
430 | },
431 | {
432 | "cell_type": "markdown",
433 | "metadata": {},
434 | "source": [
435 | "As you can see, it has a `Value` column (which represents the price), a `Date(UTC)` column holding a string representation of the date, and a `UnixTimeStamp` column representing the datetime in Unix timestamp format. The header is read automatically, so let's try to parse dates with the CSV reader:"
436 | ]
437 | },
438 | {
439 | "cell_type": "code",
440 | "execution_count": null,
441 | "metadata": {},
442 | "outputs": [],
443 | "source": [
444 | "eth = pd.read_csv('data/eth-price.csv', parse_dates=True)\n",
445 | "\n",
446 | "print(eth.dtypes)\n",
447 | "eth.head()"
448 | ]
449 | },
450 | {
451 | "cell_type": "markdown",
452 | "metadata": {},
453 | "source": [
454 | "Seems like the `parse_dates` parameter didn't work. We'll need a little more customization. Let's divide this problem and focus on \"date parsing\" first. The simplest option would be to use the `UnixTimeStamp` column. The `pandas` module has a `to_datetime` function that can convert Unix timestamps to datetime objects:"
455 | ]
456 | },
457 | {
458 | "cell_type": "code",
459 | "execution_count": null,
460 | "metadata": {},
461 | "outputs": [],
462 | "source": [
463 | "pd.to_datetime(eth['UnixTimeStamp']).head()"
464 | ]
465 | },
466 | {
467 | "cell_type": "markdown",
468 | "metadata": {},
469 | "source": [
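470 | "The problem is the precision of Unix timestamps: by default, `to_datetime` interprets integers as nanoseconds since the epoch, so values stored in seconds need `unit='s'` to be read correctly. To match both columns we'll need to use the same index, and our `df` containing Bitcoin prices is \"per day\":"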
471 | ]
472 | },
473 | {
474 | "cell_type": "code",
475 | "execution_count": null,
476 | "metadata": {
477 | "scrolled": true
478 | },
479 | "outputs": [],
480 | "source": [
481 | "df.head()"
482 | ]
483 | },
484 | {
485 | "cell_type": "markdown",
486 | "metadata": {},
487 | "source": [
488 | "We could either reduce the precision of `UnixTimeStamp` or attempt to parse `Date(UTC)`. Let's do string parsing of `Date(UTC)` for fun:"
489 | ]
490 | },
491 | {
492 | "cell_type": "code",
493 | "execution_count": null,
494 | "metadata": {
495 | "scrolled": false
496 | },
497 | "outputs": [],
498 | "source": [
499 | "pd.to_datetime(eth['Date(UTC)']).head()"
500 | ]
501 | },
502 | {
503 | "cell_type": "markdown",
504 | "metadata": {},
505 | "source": [
506 | "That seems to work fine! So why didn't `read_csv` parse the `Date(UTC)` column? Simple: `parse_dates=True` instructs pandas to parse the index of the `DataFrame`. If you want to parse any other column, you must explicitly pass a list with the column position or name:"
507 | ]
508 | },
509 | {
510 | "cell_type": "code",
511 | "execution_count": null,
512 | "metadata": {
513 | "scrolled": false
514 | },
515 | "outputs": [],
516 | "source": [
517 | "pd.read_csv('data/eth-price.csv', parse_dates=[0]).head()"
518 | ]
519 | },
520 | {
521 | "cell_type": "markdown",
522 | "metadata": {},
523 | "source": [
524 | "Putting everything together again:"
525 | ]
526 | },
527 | {
528 | "cell_type": "code",
529 | "execution_count": null,
530 | "metadata": {
531 | "scrolled": false
532 | },
533 | "outputs": [],
534 | "source": [
535 | "eth = pd.read_csv('data/eth-price.csv', parse_dates=True, index_col=0)\n",
536 | "print(eth.info())\n",
537 | "\n",
538 | "eth.head()"
539 | ]
540 | },
541 | {
542 | "cell_type": "markdown",
543 | "metadata": {},
544 | "source": [
545 | "We can now combine both `DataFrame`s into one. Both have the same index, so aligning both prices will be easy. Let's first create an empty `DataFrame` with the index from the Bitcoin prices:"
546 | ]
547 | },
548 | {
549 | "cell_type": "code",
550 | "execution_count": null,
551 | "metadata": {},
552 | "outputs": [],
553 | "source": [
554 | "prices = pd.DataFrame(index=df.index)"
555 | ]
556 | },
557 | {
558 | "cell_type": "code",
559 | "execution_count": null,
560 | "metadata": {},
561 | "outputs": [],
562 | "source": [
563 | "prices.head()"
564 | ]
565 | },
566 | {
567 | "cell_type": "markdown",
568 | "metadata": {},
569 | "source": [
570 | "And we can now just set columns from the other `DataFrame`s:"
571 | ]
572 | },
573 | {
574 | "cell_type": "code",
575 | "execution_count": null,
576 | "metadata": {},
577 | "outputs": [],
578 | "source": [
579 | "prices['Bitcoin'] = df['Price']"
580 | ]
581 | },
582 | {
583 | "cell_type": "code",
584 | "execution_count": null,
585 | "metadata": {},
586 | "outputs": [],
587 | "source": [
588 | "prices['Ether'] = eth['Value']"
589 | ]
590 | },
591 | {
592 | "cell_type": "code",
593 | "execution_count": null,
594 | "metadata": {},
595 | "outputs": [],
596 | "source": [
597 | "prices.head()"
598 | ]
599 | },
600 | {
601 | "cell_type": "markdown",
602 | "metadata": {},
603 | "source": [
604 | "We can now try plotting both values:"
605 | ]
606 | },
607 | {
608 | "cell_type": "code",
609 | "execution_count": null,
610 | "metadata": {
611 | "scrolled": true
612 | },
613 | "outputs": [],
614 | "source": [
615 | "prices.plot(figsize=(12, 6))"
616 | ]
617 | },
618 | {
619 | "cell_type": "markdown",
620 | "metadata": {},
621 | "source": [
622 | "🤔 It seems like there's a tiny gap between Dec 2017 and Jan 2018. Let's zoom in there:"
623 | ]
624 | },
625 | {
626 | "cell_type": "code",
627 | "execution_count": null,
628 | "metadata": {
629 | "scrolled": false
630 | },
631 | "outputs": [],
632 | "source": [
633 | "prices.loc['2017-12-01':'2018-01-01'].plot(figsize=(12, 6))"
634 | ]
635 | },
636 | {
637 | "cell_type": "markdown",
638 | "metadata": {},
639 | "source": [
640 | "Oh no, missing data 😱. We'll learn how to deal with that later 😉.\n",
641 | "\n",
642 | "Btw, did you notice that fancy indexing, `'2017-12-01':'2018-01-01'`? 😏 That's pandas power 💪. We'll learn how to deal with time series later too."
643 | ]
644 | },
645 | {
646 | "cell_type": "markdown",
647 | "metadata": {},
648 | "source": [
649 | "\n"
650 | ]
651 | }
652 | ],
653 | "metadata": {
654 | "kernelspec": {
655 | "display_name": "Python 3",
656 | "language": "python",
657 | "name": "python3"
658 | },
659 | "language_info": {
660 | "codemirror_mode": {
661 | "name": "ipython",
662 | "version": 3
663 | },
664 | "file_extension": ".py",
665 | "mimetype": "text/x-python",
666 | "name": "python",
667 | "nbconvert_exporter": "python",
668 | "pygments_lexer": "ipython3",
669 | "version": "3.7.4"
670 | }
671 | },
672 | "nbformat": 4,
673 | "nbformat_minor": 2
674 | }
675 |
--------------------------------------------------------------------------------
/1 - Pandas - Series.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "\n",
8 | "\n",
9 | "\n",
10 | "\n",
12 | "\n",
13 | "# Pandas - Series\n"
14 | ]
15 | },
16 | {
17 | "cell_type": "markdown",
18 | "metadata": {},
19 | "source": [
20 | "\n",
21 | "\n",
22 | "## Hands on! "
23 | ]
24 | },
25 | {
26 | "cell_type": "code",
27 | "execution_count": 1,
28 | "metadata": {},
29 | "outputs": [],
30 | "source": [
31 | "import pandas as pd\n",
32 | "import numpy as np"
33 | ]
34 | },
35 | {
36 | "cell_type": "markdown",
37 | "metadata": {},
38 | "source": [
39 | "## Pandas Series\n",
40 | "\n",
41 | "We'll start by analyzing \"[The Group of Seven](https://en.wikipedia.org/wiki/Group_of_Seven)\", a political forum formed by Canada, France, Germany, Italy, Japan, the United Kingdom and the United States. We'll begin with population, and for that we'll use a `pandas.Series` object."
42 | ]
43 | },
44 | {
45 | "cell_type": "code",
46 | "execution_count": 2,
47 | "metadata": {},
48 | "outputs": [],
49 | "source": [
50 | "# In millions\n",
51 | "g7_pop = pd.Series([35.467, 63.951, 80.940, 60.665, 127.061, 64.511, 318.523])"
52 | ]
53 | },
54 | {
55 | "cell_type": "code",
56 | "execution_count": 3,
57 | "metadata": {
58 | "scrolled": true
59 | },
60 | "outputs": [
61 | {
62 | "data": {
63 | "text/plain": [
64 | "0 35.467\n",
65 | "1 63.951\n",
66 | "2 80.940\n",
67 | "3 60.665\n",
68 | "4 127.061\n",
69 | "5 64.511\n",
70 | "6 318.523\n",
71 | "dtype: float64"
72 | ]
73 | },
74 | "execution_count": 3,
75 | "metadata": {},
76 | "output_type": "execute_result"
77 | }
78 | ],
79 | "source": [
80 | "g7_pop"
81 | ]
82 | },
83 | {
84 | "cell_type": "markdown",
85 | "metadata": {},
86 | "source": [
87 | "Someone might not know we're representing population in millions of inhabitants. Series can have a `name`, to better document the purpose of the Series:"
88 | ]
89 | },
90 | {
91 | "cell_type": "code",
92 | "execution_count": 4,
93 | "metadata": {},
94 | "outputs": [],
95 | "source": [
96 | "g7_pop.name = 'G7 Population in millions'"
97 | ]
98 | },
99 | {
100 | "cell_type": "code",
101 | "execution_count": 5,
102 | "metadata": {},
103 | "outputs": [
104 | {
105 | "data": {
106 | "text/plain": [
107 | "0 35.467\n",
108 | "1 63.951\n",
109 | "2 80.940\n",
110 | "3 60.665\n",
111 | "4 127.061\n",
112 | "5 64.511\n",
113 | "6 318.523\n",
114 | "Name: G7 Population in millions, dtype: float64"
115 | ]
116 | },
117 | "execution_count": 5,
118 | "metadata": {},
119 | "output_type": "execute_result"
120 | }
121 | ],
122 | "source": [
123 | "g7_pop"
124 | ]
125 | },
126 | {
127 | "cell_type": "markdown",
128 | "metadata": {},
129 | "source": [
130 | "Series are pretty similar to numpy arrays:"
131 | ]
132 | },
133 | {
134 | "cell_type": "code",
135 | "execution_count": 6,
136 | "metadata": {},
137 | "outputs": [
138 | {
139 | "data": {
140 | "text/plain": [
141 | "dtype('float64')"
142 | ]
143 | },
144 | "execution_count": 6,
145 | "metadata": {},
146 | "output_type": "execute_result"
147 | }
148 | ],
149 | "source": [
150 | "g7_pop.dtype"
151 | ]
152 | },
153 | {
154 | "cell_type": "code",
155 | "execution_count": 7,
156 | "metadata": {},
157 | "outputs": [
158 | {
159 | "data": {
160 | "text/plain": [
161 | "array([ 35.467, 63.951, 80.94 , 60.665, 127.061, 64.511, 318.523])"
162 | ]
163 | },
164 | "execution_count": 7,
165 | "metadata": {},
166 | "output_type": "execute_result"
167 | }
168 | ],
169 | "source": [
170 | "g7_pop.values"
171 | ]
172 | },
173 | {
174 | "cell_type": "markdown",
175 | "metadata": {},
176 | "source": [
177 | "They're actually backed by numpy arrays:"
178 | ]
179 | },
180 | {
181 | "cell_type": "code",
182 | "execution_count": 8,
183 | "metadata": {},
184 | "outputs": [
185 | {
186 | "data": {
187 | "text/plain": [
188 | "numpy.ndarray"
189 | ]
190 | },
191 | "execution_count": 8,
192 | "metadata": {},
193 | "output_type": "execute_result"
194 | }
195 | ],
196 | "source": [
197 | "type(g7_pop.values)"
198 | ]
199 | },
200 | {
201 | "cell_type": "markdown",
202 | "metadata": {},
203 | "source": [
204 | "And they _look_ like simple Python lists or NumPy arrays. But they're actually more similar to Python `dict`s.\n",
205 | "\n",
206 | "A Series has an `index`, which is similar to the automatic index assigned to Python's lists:"
207 | ]
208 | },
209 | {
210 | "cell_type": "code",
211 | "execution_count": 9,
212 | "metadata": {},
213 | "outputs": [
214 | {
215 | "data": {
216 | "text/plain": [
217 | "0 35.467\n",
218 | "1 63.951\n",
219 | "2 80.940\n",
220 | "3 60.665\n",
221 | "4 127.061\n",
222 | "5 64.511\n",
223 | "6 318.523\n",
224 | "Name: G7 Population in millions, dtype: float64"
225 | ]
226 | },
227 | "execution_count": 9,
228 | "metadata": {},
229 | "output_type": "execute_result"
230 | }
231 | ],
232 | "source": [
233 | "g7_pop"
234 | ]
235 | },
236 | {
237 | "cell_type": "code",
238 | "execution_count": 10,
239 | "metadata": {},
240 | "outputs": [
241 | {
242 | "data": {
243 | "text/plain": [
244 | "35.467"
245 | ]
246 | },
247 | "execution_count": 10,
248 | "metadata": {},
249 | "output_type": "execute_result"
250 | }
251 | ],
252 | "source": [
253 | "g7_pop[0]"
254 | ]
255 | },
256 | {
257 | "cell_type": "code",
258 | "execution_count": 11,
259 | "metadata": {},
260 | "outputs": [
261 | {
262 | "data": {
263 | "text/plain": [
264 | "63.951"
265 | ]
266 | },
267 | "execution_count": 11,
268 | "metadata": {},
269 | "output_type": "execute_result"
270 | }
271 | ],
272 | "source": [
273 | "g7_pop[1]"
274 | ]
275 | },
276 | {
277 | "cell_type": "code",
278 | "execution_count": 12,
279 | "metadata": {},
280 | "outputs": [
281 | {
282 | "data": {
283 | "text/plain": [
284 | "RangeIndex(start=0, stop=7, step=1)"
285 | ]
286 | },
287 | "execution_count": 12,
288 | "metadata": {},
289 | "output_type": "execute_result"
290 | }
291 | ],
292 | "source": [
293 | "g7_pop.index"
294 | ]
295 | },
296 | {
297 | "cell_type": "code",
298 | "execution_count": 13,
299 | "metadata": {},
300 | "outputs": [],
301 | "source": [
302 | "l = ['a', 'b', 'c']"
303 | ]
304 | },
305 | {
306 | "cell_type": "markdown",
307 | "metadata": {},
308 | "source": [
309 | "But, in contrast to lists, we can explicitly define the index:"
310 | ]
311 | },
312 | {
313 | "cell_type": "code",
314 | "execution_count": 14,
315 | "metadata": {},
316 | "outputs": [],
317 | "source": [
318 | "g7_pop.index = [\n",
319 | " 'Canada',\n",
320 | " 'France',\n",
321 | " 'Germany',\n",
322 | " 'Italy',\n",
323 | " 'Japan',\n",
324 | " 'United Kingdom',\n",
325 | " 'United States',\n",
326 | "]"
327 | ]
328 | },
329 | {
330 | "cell_type": "code",
331 | "execution_count": 15,
332 | "metadata": {},
333 | "outputs": [
334 | {
335 | "data": {
336 | "text/plain": [
337 | "Canada 35.467\n",
338 | "France 63.951\n",
339 | "Germany 80.940\n",
340 | "Italy 60.665\n",
341 | "Japan 127.061\n",
342 | "United Kingdom 64.511\n",
343 | "United States 318.523\n",
344 | "Name: G7 Population in millions, dtype: float64"
345 | ]
346 | },
347 | "execution_count": 15,
348 | "metadata": {},
349 | "output_type": "execute_result"
350 | }
351 | ],
352 | "source": [
353 | "g7_pop"
354 | ]
355 | },
356 | {
357 | "cell_type": "markdown",
358 | "metadata": {},
359 | "source": [
360 | "Compare it with the [following table](https://docs.google.com/spreadsheets/d/1IlorV2-Oh9Da1JAZ7weVw86PQrQydSMp-ydVMH135iI/edit?usp=sharing): \n",
361 | "\n",
362 | "\n",
363 | "\n",
364 | "We can say that Series look like \"ordered dictionaries\". We can actually create Series out of dictionaries:"
365 | ]
366 | },
367 | {
368 | "cell_type": "code",
369 | "execution_count": 16,
370 | "metadata": {
371 | "scrolled": true
372 | },
373 | "outputs": [
374 | {
375 | "data": {
376 | "text/plain": [
377 | "Canada 35.467\n",
378 | "France 63.951\n",
379 | "Germany 80.940\n",
380 | "Italy 60.665\n",
381 | "Japan 127.061\n",
382 | "United Kingdom 64.511\n",
383 | "United States 318.523\n",
384 | "Name: G7 Population in millions, dtype: float64"
385 | ]
386 | },
387 | "execution_count": 16,
388 | "metadata": {},
389 | "output_type": "execute_result"
390 | }
391 | ],
392 | "source": [
393 | "pd.Series({\n",
394 | " 'Canada': 35.467,\n",
395 | " 'France': 63.951,\n",
396 | " 'Germany': 80.94,\n",
397 | " 'Italy': 60.665,\n",
398 | " 'Japan': 127.061,\n",
399 | " 'United Kingdom': 64.511,\n",
400 | " 'United States': 318.523\n",
401 | "}, name='G7 Population in millions')"
402 | ]
403 | },
404 | {
405 | "cell_type": "code",
406 | "execution_count": 17,
407 | "metadata": {},
408 | "outputs": [
409 | {
410 | "data": {
411 | "text/plain": [
412 | "Canada 35.467\n",
413 | "France 63.951\n",
414 | "Germany 80.940\n",
415 | "Italy 60.665\n",
416 | "Japan 127.061\n",
417 | "United Kingdom 64.511\n",
418 | "United States 318.523\n",
419 | "Name: G7 Population in millions, dtype: float64"
420 | ]
421 | },
422 | "execution_count": 17,
423 | "metadata": {},
424 | "output_type": "execute_result"
425 | }
426 | ],
427 | "source": [
428 | "pd.Series(\n",
429 | " [35.467, 63.951, 80.94, 60.665, 127.061, 64.511, 318.523],\n",
430 | " index=['Canada', 'France', 'Germany', 'Italy', 'Japan', 'United Kingdom',\n",
431 | " 'United States'],\n",
432 | " name='G7 Population in millions')"
433 | ]
434 | },
435 | {
436 | "cell_type": "markdown",
437 | "metadata": {},
438 | "source": [
439 | "You can also create a Series out of another Series, specifying the index labels you want to keep:"
440 | ]
441 | },
442 | {
443 | "cell_type": "code",
444 | "execution_count": 18,
445 | "metadata": {},
446 | "outputs": [
447 | {
448 | "data": {
449 | "text/plain": [
450 | "France 63.951\n",
451 | "Germany 80.940\n",
452 | "Italy 60.665\n",
453 | "Spain NaN\n",
454 | "Name: G7 Population in millions, dtype: float64"
455 | ]
456 | },
457 | "execution_count": 18,
458 | "metadata": {},
459 | "output_type": "execute_result"
460 | }
461 | ],
462 | "source": [
463 | "pd.Series(g7_pop, index=['France', 'Germany', 'Italy', 'Spain'])"
464 | ]
465 | },
466 | {
467 | "cell_type": "markdown",
468 | "metadata": {},
469 | "source": [
470 | "\n",
471 | "\n",
472 | "## Indexing\n",
473 | "\n",
474 | "Indexing works similarly to lists and dictionaries: you use the **index** of the element you're looking for:"
475 | ]
476 | },
477 | {
478 | "cell_type": "code",
479 | "execution_count": 19,
480 | "metadata": {},
481 | "outputs": [
482 | {
483 | "data": {
484 | "text/plain": [
485 | "Canada 35.467\n",
486 | "France 63.951\n",
487 | "Germany 80.940\n",
488 | "Italy 60.665\n",
489 | "Japan 127.061\n",
490 | "United Kingdom 64.511\n",
491 | "United States 318.523\n",
492 | "Name: G7 Population in millions, dtype: float64"
493 | ]
494 | },
495 | "execution_count": 19,
496 | "metadata": {},
497 | "output_type": "execute_result"
498 | }
499 | ],
500 | "source": [
501 | "g7_pop"
502 | ]
503 | },
504 | {
505 | "cell_type": "code",
506 | "execution_count": 20,
507 | "metadata": {},
508 | "outputs": [
509 | {
510 | "data": {
511 | "text/plain": [
512 | "35.467"
513 | ]
514 | },
515 | "execution_count": 20,
516 | "metadata": {},
517 | "output_type": "execute_result"
518 | }
519 | ],
520 | "source": [
521 | "g7_pop['Canada']"
522 | ]
523 | },
524 | {
525 | "cell_type": "code",
526 | "execution_count": 21,
527 | "metadata": {},
528 | "outputs": [
529 | {
530 | "data": {
531 | "text/plain": [
532 | "127.061"
533 | ]
534 | },
535 | "execution_count": 21,
536 | "metadata": {},
537 | "output_type": "execute_result"
538 | }
539 | ],
540 | "source": [
541 | "g7_pop['Japan']"
542 | ]
543 | },
544 | {
545 | "cell_type": "markdown",
546 | "metadata": {},
547 | "source": [
548 | "Numeric positions can also be used, with the `iloc` attribute:"
549 | ]
550 | },
551 | {
552 | "cell_type": "code",
553 | "execution_count": 22,
554 | "metadata": {},
555 | "outputs": [
556 | {
557 | "data": {
558 | "text/plain": [
559 | "35.467"
560 | ]
561 | },
562 | "execution_count": 22,
563 | "metadata": {},
564 | "output_type": "execute_result"
565 | }
566 | ],
567 | "source": [
568 | "g7_pop.iloc[0]"
569 | ]
570 | },
571 | {
572 | "cell_type": "code",
573 | "execution_count": 23,
574 | "metadata": {
575 | "scrolled": true
576 | },
577 | "outputs": [
578 | {
579 | "data": {
580 | "text/plain": [
581 | "318.523"
582 | ]
583 | },
584 | "execution_count": 23,
585 | "metadata": {},
586 | "output_type": "execute_result"
587 | }
588 | ],
589 | "source": [
590 | "g7_pop.iloc[-1]"
591 | ]
592 | },
593 | {
594 | "cell_type": "markdown",
595 | "metadata": {},
596 | "source": [
597 | "Selecting multiple elements at once:"
598 | ]
599 | },
600 | {
601 | "cell_type": "code",
602 | "execution_count": 24,
603 | "metadata": {
604 | "scrolled": true
605 | },
606 | "outputs": [
607 | {
608 | "data": {
609 | "text/plain": [
610 | "Italy 60.665\n",
611 | "France 63.951\n",
612 | "Name: G7 Population in millions, dtype: float64"
613 | ]
614 | },
615 | "execution_count": 24,
616 | "metadata": {},
617 | "output_type": "execute_result"
618 | }
619 | ],
620 | "source": [
621 | "g7_pop[['Italy', 'France']]"
622 | ]
623 | },
624 | {
625 | "cell_type": "markdown",
626 | "metadata": {},
627 | "source": [
628 | "_(The result is another Series)_"
629 | ]
630 | },
631 | {
632 | "cell_type": "code",
633 | "execution_count": 25,
634 | "metadata": {
635 | "scrolled": true
636 | },
637 | "outputs": [
638 | {
639 | "data": {
640 | "text/plain": [
641 | "Canada 35.467\n",
642 | "France 63.951\n",
643 | "Name: G7 Population in millions, dtype: float64"
644 | ]
645 | },
646 | "execution_count": 25,
647 | "metadata": {},
648 | "output_type": "execute_result"
649 | }
650 | ],
651 | "source": [
652 | "g7_pop.iloc[[0, 1]]"
653 | ]
654 | },
655 | {
656 | "cell_type": "markdown",
657 | "metadata": {},
658 | "source": [
659 | "Slicing also works, but **important**: when slicing by label in pandas, the upper limit is also included:"
660 | ]
661 | },
662 | {
663 | "cell_type": "code",
664 | "execution_count": 28,
665 | "metadata": {},
666 | "outputs": [
667 | {
668 | "data": {
669 | "text/plain": [
670 | "Canada 35.467\n",
671 | "France 63.951\n",
672 | "Germany 80.940\n",
673 | "Italy 60.665\n",
674 | "Name: G7 Population in millions, dtype: float64"
675 | ]
676 | },
677 | "execution_count": 28,
678 | "metadata": {},
679 | "output_type": "execute_result"
680 | }
681 | ],
682 | "source": [
683 | "g7_pop['Canada': 'Italy']"
684 | ]
685 | },
686 | {
687 | "cell_type": "markdown",
688 | "metadata": {},
689 | "source": [
690 | "\n",
691 | "\n",
692 | "## Conditional selection (boolean arrays)\n",
693 | "\n",
694 | "The same boolean array techniques we saw applied to numpy arrays can be used for Pandas `Series`:"
695 | ]
696 | },
697 | {
698 | "cell_type": "code",
699 | "execution_count": 31,
700 | "metadata": {},
701 | "outputs": [
702 | {
703 | "data": {
704 | "text/plain": [
705 | "Canada 35.467\n",
706 | "France 63.951\n",
707 | "Germany 80.940\n",
708 | "Italy 60.665\n",
709 | "Japan 127.061\n",
710 | "United Kingdom 64.511\n",
711 | "United States 318.523\n",
712 | "Name: G7 Population in millions, dtype: float64"
713 | ]
714 | },
715 | "execution_count": 31,
716 | "metadata": {},
717 | "output_type": "execute_result"
718 | }
719 | ],
720 | "source": [
721 | "g7_pop"
722 | ]
723 | },
724 | {
725 | "cell_type": "code",
726 | "execution_count": 32,
727 | "metadata": {},
728 | "outputs": [
729 | {
730 | "data": {
731 | "text/plain": [
732 | "Canada False\n",
733 | "France False\n",
734 | "Germany True\n",
735 | "Italy False\n",
736 | "Japan True\n",
737 | "United Kingdom False\n",
738 | "United States True\n",
739 | "Name: G7 Population in millions, dtype: bool"
740 | ]
741 | },
742 | "execution_count": 32,
743 | "metadata": {},
744 | "output_type": "execute_result"
745 | }
746 | ],
747 | "source": [
748 | "g7_pop > 70"
749 | ]
750 | },
751 | {
752 | "cell_type": "code",
753 | "execution_count": 33,
754 | "metadata": {},
755 | "outputs": [
756 | {
757 | "data": {
758 | "text/plain": [
759 | "Germany 80.940\n",
760 | "Japan 127.061\n",
761 | "United States 318.523\n",
762 | "Name: G7 Population in millions, dtype: float64"
763 | ]
764 | },
765 | "execution_count": 33,
766 | "metadata": {},
767 | "output_type": "execute_result"
768 | }
769 | ],
770 | "source": [
771 | "g7_pop[g7_pop > 70]"
772 | ]
773 | },
774 | {
775 | "cell_type": "code",
776 | "execution_count": 34,
777 | "metadata": {},
778 | "outputs": [
779 | {
780 | "data": {
781 | "text/plain": [
782 | "107.30257142857144"
783 | ]
784 | },
785 | "execution_count": 34,
786 | "metadata": {},
787 | "output_type": "execute_result"
788 | }
789 | ],
790 | "source": [
791 | "g7_pop.mean()"
792 | ]
793 | },
794 | {
795 | "cell_type": "code",
796 | "execution_count": 35,
797 | "metadata": {},
798 | "outputs": [
799 | {
800 | "data": {
801 | "text/plain": [
802 | "Japan 127.061\n",
803 | "United States 318.523\n",
804 | "Name: G7 Population in millions, dtype: float64"
805 | ]
806 | },
807 | "execution_count": 35,
808 | "metadata": {},
809 | "output_type": "execute_result"
810 | }
811 | ],
812 | "source": [
813 | "g7_pop[g7_pop > g7_pop.mean()]"
814 | ]
815 | },
816 | {
817 | "cell_type": "code",
818 | "execution_count": null,
819 | "metadata": {},
820 | "outputs": [],
821 | "source": [
822 | "g7_pop.std()"
823 | ]
824 | },
825 | {
826 | "cell_type": "code",
827 | "execution_count": null,
828 | "metadata": {},
829 | "outputs": [],
830 | "source": [
831 | "# ~ not\n",
832 | "# | or\n",
833 | "# & and"
834 | ]
835 | },
836 | {
837 | "cell_type": "code",
838 | "execution_count": null,
839 | "metadata": {
840 | "scrolled": true
841 | },
842 | "outputs": [],
843 | "source": [
844 | "g7_pop[(g7_pop > g7_pop.mean() - g7_pop.std() / 2) | (g7_pop > g7_pop.mean() + g7_pop.std() / 2)]"
845 | ]
846 | },
847 | {
848 | "cell_type": "markdown",
849 | "metadata": {},
850 | "source": [
851 | "\n",
852 | "\n",
853 | "## Operations and methods\n",
854 | "Series also support vectorized operations and aggregation functions, just like NumPy arrays:"
855 | ]
856 | },
857 | {
858 | "cell_type": "code",
859 | "execution_count": 29,
860 | "metadata": {},
861 | "outputs": [
862 | {
863 | "data": {
864 | "text/plain": [
865 | "Canada 35.467\n",
866 | "France 63.951\n",
867 | "Germany 80.940\n",
868 | "Italy 60.665\n",
869 | "Japan 127.061\n",
870 | "United Kingdom 64.511\n",
871 | "United States 318.523\n",
872 | "Name: G7 Population in millions, dtype: float64"
873 | ]
874 | },
875 | "execution_count": 29,
876 | "metadata": {},
877 | "output_type": "execute_result"
878 | }
879 | ],
880 | "source": [
881 | "g7_pop"
882 | ]
883 | },
884 | {
885 | "cell_type": "code",
886 | "execution_count": 30,
887 | "metadata": {},
888 | "outputs": [
889 | {
890 | "data": {
891 | "text/plain": [
892 | "Canada 35467000.0\n",
893 | "France 63951000.0\n",
894 | "Germany 80940000.0\n",
895 | "Italy 60665000.0\n",
896 | "Japan 127061000.0\n",
897 | "United Kingdom 64511000.0\n",
898 | "United States 318523000.0\n",
899 | "Name: G7 Population in millions, dtype: float64"
900 | ]
901 | },
902 | "execution_count": 30,
903 | "metadata": {},
904 | "output_type": "execute_result"
905 | }
906 | ],
907 | "source": [
908 | "g7_pop * 1_000_000"
909 | ]
910 | },
911 | {
912 | "cell_type": "code",
913 | "execution_count": 36,
914 | "metadata": {},
915 | "outputs": [
916 | {
917 | "data": {
918 | "text/plain": [
919 | "107.30257142857144"
920 | ]
921 | },
922 | "execution_count": 36,
923 | "metadata": {},
924 | "output_type": "execute_result"
925 | }
926 | ],
927 | "source": [
928 | "g7_pop.mean()"
929 | ]
930 | },
931 | {
932 | "cell_type": "code",
933 | "execution_count": 37,
934 | "metadata": {
935 | "scrolled": true
936 | },
937 | "outputs": [
938 | {
939 | "data": {
940 | "text/plain": [
941 | "Canada 3.568603\n",
942 | "France 4.158117\n",
943 | "Germany 4.393708\n",
944 | "Italy 4.105367\n",
945 | "Japan 4.844667\n",
946 | "United Kingdom 4.166836\n",
947 | "United States 5.763695\n",
948 | "Name: G7 Population in millions, dtype: float64"
949 | ]
950 | },
951 | "execution_count": 37,
952 | "metadata": {},
953 | "output_type": "execute_result"
954 | }
955 | ],
956 | "source": [
957 | "np.log(g7_pop)"
958 | ]
959 | },
960 | {
961 | "cell_type": "code",
962 | "execution_count": 38,
963 | "metadata": {},
964 | "outputs": [
965 | {
966 | "data": {
967 | "text/plain": [
968 | "68.51866666666666"
969 | ]
970 | },
971 | "execution_count": 38,
972 | "metadata": {},
973 | "output_type": "execute_result"
974 | }
975 | ],
976 | "source": [
977 | "g7_pop['France': 'Italy'].mean()"
978 | ]
979 | },
980 | {
981 | "cell_type": "markdown",
982 | "metadata": {},
983 | "source": [
984 | "\n",
985 | "\n",
986 | "## Boolean arrays\n",
987 | "(They work the same way as in NumPy)"
988 | ]
989 | },
990 | {
991 | "cell_type": "code",
992 | "execution_count": 39,
993 | "metadata": {},
994 | "outputs": [
995 | {
996 | "data": {
997 | "text/plain": [
998 | "Canada 35.467\n",
999 | "France 63.951\n",
1000 | "Germany 80.940\n",
1001 | "Italy 60.665\n",
1002 | "Japan 127.061\n",
1003 | "United Kingdom 64.511\n",
1004 | "United States 318.523\n",
1005 | "Name: G7 Population in millions, dtype: float64"
1006 | ]
1007 | },
1008 | "execution_count": 39,
1009 | "metadata": {},
1010 | "output_type": "execute_result"
1011 | }
1012 | ],
1013 | "source": [
1014 | "g7_pop"
1015 | ]
1016 | },
1017 | {
1018 | "cell_type": "code",
1019 | "execution_count": 40,
1020 | "metadata": {},
1021 | "outputs": [
1022 | {
1023 | "data": {
1024 | "text/plain": [
1025 | "Canada False\n",
1026 | "France False\n",
1027 | "Germany True\n",
1028 | "Italy False\n",
1029 | "Japan True\n",
1030 | "United Kingdom False\n",
1031 | "United States True\n",
1032 | "Name: G7 Population in millions, dtype: bool"
1033 | ]
1034 | },
1035 | "execution_count": 40,
1036 | "metadata": {},
1037 | "output_type": "execute_result"
1038 | }
1039 | ],
1040 | "source": [
1041 | "g7_pop > 80"
1042 | ]
1043 | },
1044 | {
1045 | "cell_type": "code",
1046 | "execution_count": 41,
1047 | "metadata": {
1048 | "scrolled": true
1049 | },
1050 | "outputs": [
1051 | {
1052 | "data": {
1053 | "text/plain": [
1054 | "Germany 80.940\n",
1055 | "Japan 127.061\n",
1056 | "United States 318.523\n",
1057 | "Name: G7 Population in millions, dtype: float64"
1058 | ]
1059 | },
1060 | "execution_count": 41,
1061 | "metadata": {},
1062 | "output_type": "execute_result"
1063 | }
1064 | ],
1065 | "source": [
1066 | "g7_pop[g7_pop > 80]"
1067 | ]
1068 | },
1069 | {
1070 | "cell_type": "code",
1071 | "execution_count": 42,
1072 | "metadata": {
1073 | "scrolled": true
1074 | },
1075 | "outputs": [
1076 | {
1077 | "data": {
1078 | "text/plain": [
1079 | "Canada 35.467\n",
1080 | "Germany 80.940\n",
1081 | "Japan 127.061\n",
1082 | "United States 318.523\n",
1083 | "Name: G7 Population in millions, dtype: float64"
1084 | ]
1085 | },
1086 | "execution_count": 42,
1087 | "metadata": {},
1088 | "output_type": "execute_result"
1089 | }
1090 | ],
1091 | "source": [
1092 | "g7_pop[(g7_pop > 80) | (g7_pop < 40)]"
1093 | ]
1094 | },
1095 | {
1096 | "cell_type": "code",
1097 | "execution_count": 43,
1098 | "metadata": {
1099 | "scrolled": true
1100 | },
1101 | "outputs": [
1102 | {
1103 | "data": {
1104 | "text/plain": [
1105 | "Germany 80.940\n",
1106 | "Japan 127.061\n",
1107 | "Name: G7 Population in millions, dtype: float64"
1108 | ]
1109 | },
1110 | "execution_count": 43,
1111 | "metadata": {},
1112 | "output_type": "execute_result"
1113 | }
1114 | ],
1115 | "source": [
1116 | "g7_pop[(g7_pop > 80) & (g7_pop < 200)]"
1117 | ]
1118 | },
1119 | {
1120 | "cell_type": "markdown",
1121 | "metadata": {},
1122 | "source": [
1123 | "\n",
1124 | "\n",
1125 | "## Modifying series\n"
1126 | ]
1127 | },
1128 | {
1129 | "cell_type": "code",
1130 | "execution_count": 44,
1131 | "metadata": {},
1132 | "outputs": [],
1133 | "source": [
1134 | "g7_pop['Canada'] = 40.5"
1135 | ]
1136 | },
1137 | {
1138 | "cell_type": "code",
1139 | "execution_count": 45,
1140 | "metadata": {},
1141 | "outputs": [
1142 | {
1143 | "data": {
1144 | "text/plain": [
1145 | "Canada 40.500\n",
1146 | "France 63.951\n",
1147 | "Germany 80.940\n",
1148 | "Italy 60.665\n",
1149 | "Japan 127.061\n",
1150 | "United Kingdom 64.511\n",
1151 | "United States 318.523\n",
1152 | "Name: G7 Population in millions, dtype: float64"
1153 | ]
1154 | },
1155 | "execution_count": 45,
1156 | "metadata": {},
1157 | "output_type": "execute_result"
1158 | }
1159 | ],
1160 | "source": [
1161 | "g7_pop"
1162 | ]
1163 | },
1164 | {
1165 | "cell_type": "code",
1166 | "execution_count": 46,
1167 | "metadata": {},
1168 | "outputs": [],
1169 | "source": [
1170 | "g7_pop.iloc[-1] = 500"
1171 | ]
1172 | },
1173 | {
1174 | "cell_type": "code",
1175 | "execution_count": 47,
1176 | "metadata": {},
1177 | "outputs": [
1178 | {
1179 | "data": {
1180 | "text/plain": [
1181 | "Canada 40.500\n",
1182 | "France 63.951\n",
1183 | "Germany 80.940\n",
1184 | "Italy 60.665\n",
1185 | "Japan 127.061\n",
1186 | "United Kingdom 64.511\n",
1187 | "United States 500.000\n",
1188 | "Name: G7 Population in millions, dtype: float64"
1189 | ]
1190 | },
1191 | "execution_count": 47,
1192 | "metadata": {},
1193 | "output_type": "execute_result"
1194 | }
1195 | ],
1196 | "source": [
1197 | "g7_pop"
1198 | ]
1199 | },
1200 | {
1201 | "cell_type": "code",
1202 | "execution_count": 48,
1203 | "metadata": {},
1204 | "outputs": [
1205 | {
1206 | "data": {
1207 | "text/plain": [
1208 | "Canada 40.500\n",
1209 | "France 63.951\n",
1210 | "Italy 60.665\n",
1211 | "United Kingdom 64.511\n",
1212 | "Name: G7 Population in millions, dtype: float64"
1213 | ]
1214 | },
1215 | "execution_count": 48,
1216 | "metadata": {},
1217 | "output_type": "execute_result"
1218 | }
1219 | ],
1220 | "source": [
1221 | "g7_pop[g7_pop < 70]"
1222 | ]
1223 | },
1224 | {
1225 | "cell_type": "code",
1226 | "execution_count": 49,
1227 | "metadata": {},
1228 | "outputs": [],
1229 | "source": [
1230 | "g7_pop[g7_pop < 70] = 99.99"
1231 | ]
1232 | },
1233 | {
1234 | "cell_type": "code",
1235 | "execution_count": 50,
1236 | "metadata": {
1237 | "scrolled": true
1238 | },
1239 | "outputs": [
1240 | {
1241 | "data": {
1242 | "text/plain": [
1243 | "Canada 99.990\n",
1244 | "France 99.990\n",
1245 | "Germany 80.940\n",
1246 | "Italy 99.990\n",
1247 | "Japan 127.061\n",
1248 | "United Kingdom 99.990\n",
1249 | "United States 500.000\n",
1250 | "Name: G7 Population in millions, dtype: float64"
1251 | ]
1252 | },
1253 | "execution_count": 50,
1254 | "metadata": {},
1255 | "output_type": "execute_result"
1256 | }
1257 | ],
1258 | "source": [
1259 | "g7_pop"
1260 | ]
1261 | },
1262 | {
1263 | "cell_type": "markdown",
1264 | "metadata": {},
1265 | "source": [
1266 | "\n"
1267 | ]
1268 | }
1269 | ],
1270 | "metadata": {
1271 | "kernelspec": {
1272 | "display_name": "Python 3",
1273 | "language": "python",
1274 | "name": "python3"
1275 | },
1276 | "language_info": {
1277 | "codemirror_mode": {
1278 | "name": "ipython",
1279 | "version": 3
1280 | },
1281 | "file_extension": ".py",
1282 | "mimetype": "text/x-python",
1283 | "name": "python",
1284 | "nbconvert_exporter": "python",
1285 | "pygments_lexer": "ipython3",
1286 | "version": "3.8.1"
1287 | }
1288 | },
1289 | "nbformat": 4,
1290 | "nbformat_minor": 4
1291 | }
1292 |
--------------------------------------------------------------------------------
/2 - Pandas Series exercises.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "\n",
8 | "\n",
9 | "\n",
10 | "# Pandas Series exercises\n"
11 | ]
12 | },
13 | {
14 | "cell_type": "code",
15 | "execution_count": 1,
16 | "metadata": {},
17 | "outputs": [
18 | {
19 | "name": "stdout",
20 | "output_type": "stream",
21 | "text": [
22 | "0.25.3\n"
23 | ]
24 | }
25 | ],
26 | "source": [
27 | "# Import the numpy package under the name np\n",
28 | "import numpy as np\n",
29 | "\n",
30 | "# Import the pandas package under the name pd\n",
31 | "import pandas as pd\n",
32 | "\n",
33 | "# Print the pandas version\n",
34 | "print(pd.__version__)"
35 | ]
36 | },
37 | {
38 | "cell_type": "markdown",
39 | "metadata": {},
40 | "source": [
41 | "\n",
42 | "\n",
43 | "## Series creation"
44 | ]
45 | },
46 | {
47 | "cell_type": "markdown",
48 | "metadata": {},
49 | "source": [
50 | "### Create an empty pandas Series"
51 | ]
52 | },
53 | {
54 | "cell_type": "code",
55 | "execution_count": null,
56 | "metadata": {},
57 | "outputs": [],
58 | "source": [
59 | "# your code goes here\n"
60 | ]
61 | },
62 | {
63 | "cell_type": "code",
64 | "execution_count": null,
65 | "metadata": {
66 | "cell_type": "solution"
67 | },
68 | "outputs": [],
69 | "source": [
70 | "pd.Series(dtype='float64')"
71 | ]
72 | },
73 | {
74 | "cell_type": "markdown",
75 | "metadata": {},
76 | "source": [
77 | "\n",
78 | "\n",
79 | "### Given the X Python list, convert it to a pandas Series Y"
80 | ]
81 | },
82 | {
83 | "cell_type": "code",
84 | "execution_count": null,
85 | "metadata": {},
86 | "outputs": [],
87 | "source": [
88 | "# your code goes here\n"
89 | ]
90 | },
91 | {
92 | "cell_type": "code",
93 | "execution_count": null,
94 | "metadata": {
95 | "cell_type": "solution"
96 | },
97 | "outputs": [],
98 | "source": [
99 | "X = ['A','B','C']\n",
100 | "print(X, type(X))\n",
101 | "\n",
102 | "Y = pd.Series(X)\n",
103 | "print(Y, type(Y)) # different type"
104 | ]
105 | },
106 | {
107 | "cell_type": "markdown",
108 | "metadata": {},
109 | "source": [
110 | "\n",
111 | "\n",
112 | "### Given the X pandas Series, name it 'My letters'"
113 | ]
114 | },
115 | {
116 | "cell_type": "code",
117 | "execution_count": null,
118 | "metadata": {},
119 | "outputs": [],
120 | "source": [
121 | "# your code goes here\n"
122 | ]
123 | },
124 | {
125 | "cell_type": "code",
126 | "execution_count": null,
127 | "metadata": {
128 | "cell_type": "solution"
129 | },
130 | "outputs": [],
131 | "source": [
132 | "X = pd.Series(['A','B','C'])\n",
133 | "\n",
134 | "X.name = 'My letters'\n",
135 | "X"
136 | ]
137 | },
138 | {
139 | "cell_type": "markdown",
140 | "metadata": {},
141 | "source": [
142 | "\n",
143 | "\n",
144 | "### Given the X pandas Series, show its values\n"
145 | ]
146 | },
147 | {
148 | "cell_type": "code",
149 | "execution_count": null,
150 | "metadata": {},
151 | "outputs": [],
152 | "source": [
153 | "# your code goes here\n"
154 | ]
155 | },
156 | {
157 | "cell_type": "code",
158 | "execution_count": null,
159 | "metadata": {
160 | "cell_type": "solution"
161 | },
162 | "outputs": [],
163 | "source": [
164 | "X = pd.Series(['A','B','C'])\n",
165 | "\n",
166 | "X.values"
167 | ]
168 | },
169 | {
170 | "cell_type": "markdown",
171 | "metadata": {},
172 | "source": [
173 | "\n",
174 | "\n",
175 | "## Series indexation"
176 | ]
177 | },
178 | {
179 | "cell_type": "markdown",
180 | "metadata": {},
181 | "source": [
182 | "### Assign index names to the given X pandas Series\n"
183 | ]
184 | },
185 | {
186 | "cell_type": "code",
187 | "execution_count": null,
188 | "metadata": {},
189 | "outputs": [],
190 | "source": [
191 | "# your code goes here\n"
192 | ]
193 | },
194 | {
195 | "cell_type": "code",
196 | "execution_count": null,
197 | "metadata": {
198 | "cell_type": "solution"
199 | },
200 | "outputs": [],
201 | "source": [
202 | "X = pd.Series(['A','B','C'])\n",
203 | "index_names = ['first', 'second', 'third']\n",
204 | "\n",
205 | "X.index = index_names\n",
206 | "X"
207 | ]
208 | },
209 | {
210 | "cell_type": "markdown",
211 | "metadata": {},
212 | "source": [
213 | "\n",
214 | "\n",
215 | "### Given the X pandas Series, show its first element\n"
216 | ]
217 | },
218 | {
219 | "cell_type": "code",
220 | "execution_count": null,
221 | "metadata": {},
222 | "outputs": [],
223 | "source": [
224 | "# your code goes here\n"
225 | ]
226 | },
227 | {
228 | "cell_type": "code",
229 | "execution_count": null,
230 | "metadata": {
231 | "cell_type": "solution"
232 | },
233 | "outputs": [],
234 | "source": [
235 | "X = pd.Series(['A','B','C'], index=['first', 'second', 'third'])\n",
236 | "\n",
237 | "#X[0] # by position\n",
238 | "#X.iloc[0] # by position\n",
239 | "X['first'] # by index"
240 | ]
241 | },
242 | {
243 | "cell_type": "markdown",
244 | "metadata": {},
245 | "source": [
246 | "\n",
247 | "\n",
248 | "### Given the X pandas Series, show its last element\n"
249 | ]
250 | },
251 | {
252 | "cell_type": "code",
253 | "execution_count": null,
254 | "metadata": {},
255 | "outputs": [],
256 | "source": [
257 | "# your code goes here\n"
258 | ]
259 | },
260 | {
261 | "cell_type": "code",
262 | "execution_count": null,
263 | "metadata": {
264 | "cell_type": "solution"
265 | },
266 | "outputs": [],
267 | "source": [
268 | "X = pd.Series(['A','B','C'], index=['first', 'second', 'third'])\n",
269 | "\n",
270 | "#X[-1] # by position\n",
271 | "#X.iloc[-1] # by position\n",
272 | "X['third'] # by index"
273 | ]
274 | },
275 | {
276 | "cell_type": "markdown",
277 | "metadata": {},
278 | "source": [
279 | "\n",
280 | "\n",
281 | "### Given the X pandas Series, show all middle elements\n"
282 | ]
283 | },
284 | {
285 | "cell_type": "code",
286 | "execution_count": null,
287 | "metadata": {},
288 | "outputs": [],
289 | "source": [
290 | "# your code goes here\n"
291 | ]
292 | },
293 | {
294 | "cell_type": "code",
295 | "execution_count": null,
296 | "metadata": {
297 | "cell_type": "solution"
298 | },
299 | "outputs": [],
300 | "source": [
301 | "X = pd.Series(['A','B','C','D','E'],\n",
302 | " index=['first','second','third','fourth','fifth'])\n",
303 | "\n",
304 | "#X[['second', 'third', 'fourth']]\n",
305 | "#X.iloc[1:-1] # by position\n",
306 | "X[1:-1] # by position"
307 | ]
308 | },
309 | {
310 | "cell_type": "markdown",
311 | "metadata": {},
312 | "source": [
313 | "\n",
314 | "\n",
315 | "### Given the X pandas Series, show the elements in reverse order\n"
316 | ]
317 | },
318 | {
319 | "cell_type": "code",
320 | "execution_count": null,
321 | "metadata": {},
322 | "outputs": [],
323 | "source": [
324 | "# your code goes here\n"
325 | ]
326 | },
327 | {
328 | "cell_type": "code",
329 | "execution_count": null,
330 | "metadata": {
331 | "cell_type": "solution"
332 | },
333 | "outputs": [],
334 | "source": [
335 | "X = pd.Series(['A','B','C','D','E'],\n",
336 | " index=['first','second','third','fourth','fifth'])\n",
337 | "\n",
338 | "#X.iloc[::-1]\n",
339 | "X[::-1]"
340 | ]
341 | },
342 | {
343 | "cell_type": "markdown",
344 | "metadata": {},
345 | "source": [
346 | "\n",
347 | "\n",
348 | "### Given the X pandas Series, show the first and last elements\n"
349 | ]
350 | },
351 | {
352 | "cell_type": "code",
353 | "execution_count": null,
354 | "metadata": {},
355 | "outputs": [],
356 | "source": [
357 | "# your code goes here\n"
358 | ]
359 | },
360 | {
361 | "cell_type": "code",
362 | "execution_count": null,
363 | "metadata": {
364 | "cell_type": "solution"
365 | },
366 | "outputs": [],
367 | "source": [
368 | "X = pd.Series(['A','B','C','D','E'],\n",
369 | " index=['first','second','third','fourth','fifth'])\n",
370 | "\n",
371 | "#X[['first', 'fifth']] # by label\n",
372 | "#X[[0, -1]] # integer keys on a labeled index; prefer .iloc\n",
373 | "X.iloc[[0, -1]] # by position"
374 | ]
375 | },
376 | {
377 | "cell_type": "markdown",
378 | "metadata": {},
379 | "source": [
380 | "\n",
381 | "\n",
382 | "## Series manipulation"
383 | ]
384 | },
385 | {
386 | "cell_type": "markdown",
387 | "metadata": {},
388 | "source": [
389 | "### Convert the given integer pandas Series to float\n"
390 | ]
391 | },
392 | {
393 | "cell_type": "code",
394 | "execution_count": null,
395 | "metadata": {},
396 | "outputs": [],
397 | "source": [
398 | "# your code goes here\n"
399 | ]
400 | },
401 | {
402 | "cell_type": "code",
403 | "execution_count": null,
404 | "metadata": {
405 | "cell_type": "solution"
406 | },
407 | "outputs": [],
408 | "source": [
409 | "X = pd.Series([1,2,3,4,5],\n",
410 | " index=['first','second','third','fourth','fifth'])\n",
411 | "\n",
412 | "X.astype(float) # np.float was removed in NumPy 1.24; the builtin float works"
413 | ]
414 | },
415 | {
416 | "cell_type": "markdown",
417 | "metadata": {},
418 | "source": [
419 | "\n",
420 | "\n",
421 | "### Reverse the given pandas Series (first element becomes last)"
422 | ]
423 | },
424 | {
425 | "cell_type": "code",
426 | "execution_count": null,
427 | "metadata": {},
428 | "outputs": [],
429 | "source": [
430 | "# your code goes here\n"
431 | ]
432 | },
433 | {
434 | "cell_type": "code",
435 | "execution_count": null,
436 | "metadata": {
437 | "cell_type": "solution"
438 | },
439 | "outputs": [],
440 | "source": [
441 | "X = pd.Series([1,2,3,4,5],\n",
442 | " index=['first','second','third','fourth','fifth'])\n",
443 | "\n",
444 | "X[::-1]"
445 | ]
446 | },
447 | {
448 | "cell_type": "markdown",
449 | "metadata": {},
450 | "source": [
451 | "\n",
452 | "\n",
453 | "### Order (sort) the given pandas Series\n"
454 | ]
455 | },
456 | {
457 | "cell_type": "code",
458 | "execution_count": null,
459 | "metadata": {},
460 | "outputs": [],
461 | "source": [
462 | "# your code goes here\n"
463 | ]
464 | },
465 | {
466 | "cell_type": "code",
467 | "execution_count": null,
468 | "metadata": {
469 | "cell_type": "solution"
470 | },
471 | "outputs": [],
472 | "source": [
473 | "X = pd.Series([4,2,5,1,3],\n",
474 | " index=['fourth','second','fifth','first','third'])\n",
475 | "\n",
476 | "X = X.sort_values()\n",
477 | "X"
478 | ]
479 | },
480 | {
481 | "cell_type": "markdown",
482 | "metadata": {},
483 | "source": [
484 | "\n",
485 | "\n",
486 | "### Given the X pandas Series, set the fifth element equal to 10\n"
487 | ]
488 | },
489 | {
490 | "cell_type": "code",
491 | "execution_count": null,
492 | "metadata": {},
493 | "outputs": [],
494 | "source": [
495 | "# your code goes here\n"
496 | ]
497 | },
498 | {
499 | "cell_type": "code",
500 | "execution_count": null,
501 | "metadata": {
502 | "cell_type": "solution"
503 | },
504 | "outputs": [],
505 | "source": [
506 | "X = pd.Series([1,2,3,4,5],\n",
507 | " index=['A','B','C','D','E'])\n",
508 | "\n",
509 | "X.iloc[4] = 10 # fifth element, by position\n",
510 | "X"
511 | ]
512 | },
513 | {
514 | "cell_type": "markdown",
515 | "metadata": {},
516 | "source": [
517 | "\n",
518 | "\n",
519 | "### Given the X pandas Series, change all the middle elements to 0\n"
520 | ]
521 | },
522 | {
523 | "cell_type": "code",
524 | "execution_count": null,
525 | "metadata": {},
526 | "outputs": [],
527 | "source": [
528 | "# your code goes here\n"
529 | ]
530 | },
531 | {
532 | "cell_type": "code",
533 | "execution_count": null,
534 | "metadata": {
535 | "cell_type": "solution",
536 | "scrolled": false
537 | },
538 | "outputs": [],
539 | "source": [
540 | "X = pd.Series([1,2,3,4,5],\n",
541 | " index=['A','B','C','D','E'])\n",
542 | "\n",
543 | "X[1:-1] = 0\n",
544 | "X"
545 | ]
546 | },
547 | {
548 | "cell_type": "markdown",
549 | "metadata": {},
550 | "source": [
551 | "\n",
552 | "\n",
553 | "### Given the X pandas Series, add 5 to every element\n"
554 | ]
555 | },
556 | {
557 | "cell_type": "code",
558 | "execution_count": null,
559 | "metadata": {},
560 | "outputs": [],
561 | "source": [
562 | "# your code goes here\n"
563 | ]
564 | },
565 | {
566 | "cell_type": "code",
567 | "execution_count": null,
568 | "metadata": {
569 | "cell_type": "solution"
570 | },
571 | "outputs": [],
572 | "source": [
573 | "X = pd.Series([1,2,3,4,5])\n",
574 | "\n",
575 | "X + 5"
576 | ]
577 | },
578 | {
579 | "cell_type": "markdown",
580 | "metadata": {},
581 | "source": [
582 | "\n",
583 | "\n",
584 | "## Series boolean arrays (also called masks)"
585 | ]
586 | },
587 | {
588 | "cell_type": "markdown",
589 | "metadata": {},
590 | "source": [
591 | "### Given the X pandas Series, make a mask showing negative elements\n"
592 | ]
593 | },
594 | {
595 | "cell_type": "code",
596 | "execution_count": null,
597 | "metadata": {},
598 | "outputs": [],
599 | "source": [
600 | "# your code goes here\n"
601 | ]
602 | },
603 | {
604 | "cell_type": "code",
605 | "execution_count": null,
606 | "metadata": {
607 | "cell_type": "solution"
608 | },
609 | "outputs": [],
610 | "source": [
611 | "X = pd.Series([-1,2,0,-4,5,6,0,0,-9,10])\n",
612 | "\n",
613 | "mask = X < 0\n",
614 | "mask"
615 | ]
616 | },
617 | {
618 | "cell_type": "markdown",
619 | "metadata": {},
620 | "source": [
621 | "\n",
622 | "\n",
623 | "### Given the X pandas Series, get the negative elements\n"
624 | ]
625 | },
626 | {
627 | "cell_type": "code",
628 | "execution_count": null,
629 | "metadata": {},
630 | "outputs": [],
631 | "source": [
632 | "# your code goes here\n"
633 | ]
634 | },
635 | {
636 | "cell_type": "code",
637 | "execution_count": null,
638 | "metadata": {
639 | "cell_type": "solution"
640 | },
641 | "outputs": [],
642 | "source": [
643 | "X = pd.Series([-1,2,0,-4,5,6,0,0,-9,10])\n",
644 | "\n",
645 | "mask = X < 0\n",
646 | "X[mask]"
647 | ]
648 | },
649 | {
650 | "cell_type": "markdown",
651 | "metadata": {},
652 | "source": [
653 | "\n",
654 | "\n",
655 | "### Given the X pandas Series, get numbers higher than 5\n"
656 | ]
657 | },
658 | {
659 | "cell_type": "code",
660 | "execution_count": null,
661 | "metadata": {},
662 | "outputs": [],
663 | "source": [
664 | "# your code goes here\n"
665 | ]
666 | },
667 | {
668 | "cell_type": "code",
669 | "execution_count": null,
670 | "metadata": {
671 | "cell_type": "solution"
672 | },
673 | "outputs": [],
674 | "source": [
675 | "X = pd.Series([-1,2,0,-4,5,6,0,0,-9,10])\n",
676 | "\n",
677 | "mask = X > 5\n",
678 | "X[mask]"
679 | ]
680 | },
681 | {
682 | "cell_type": "markdown",
683 | "metadata": {},
684 | "source": [
685 | "\n",
686 | "\n",
687 | "### Given the X pandas Series, get numbers greater than the mean of its elements"
688 | ]
689 | },
690 | {
691 | "cell_type": "code",
692 | "execution_count": null,
693 | "metadata": {},
694 | "outputs": [],
695 | "source": [
696 | "# your code goes here\n"
697 | ]
698 | },
699 | {
700 | "cell_type": "code",
701 | "execution_count": null,
702 | "metadata": {
703 | "cell_type": "solution"
704 | },
705 | "outputs": [],
706 | "source": [
707 | "X = pd.Series([-1,2,0,-4,5,6,0,0,-9,10])\n",
708 | "\n",
709 | "mask = X > X.mean()\n",
710 | "X[mask]"
711 | ]
712 | },
713 | {
714 | "cell_type": "markdown",
715 | "metadata": {},
716 | "source": [
717 | "\n",
718 | "\n",
719 | "### Given the X pandas Series, get numbers equal to 2 or 10\n"
720 | ]
721 | },
722 | {
723 | "cell_type": "code",
724 | "execution_count": null,
725 | "metadata": {},
726 | "outputs": [],
727 | "source": [
728 | "# your code goes here\n"
729 | ]
730 | },
731 | {
732 | "cell_type": "code",
733 | "execution_count": null,
734 | "metadata": {
735 | "cell_type": "solution",
736 | "scrolled": true
737 | },
738 | "outputs": [],
739 | "source": [
740 | "X = pd.Series([-1,2,0,-4,5,6,0,0,-9,10])\n",
741 | "\n",
742 | "mask = (X == 2) | (X == 10)\n",
743 | "X[mask]"
744 | ]
745 | },
746 | {
747 | "cell_type": "markdown",
748 | "metadata": {},
749 | "source": [
750 | "\n",
751 | "\n",
752 | "## Logic functions"
753 | ]
754 | },
755 | {
756 | "cell_type": "markdown",
757 | "metadata": {},
758 | "source": [
759 | "### Given the X pandas Series, return True if none of its elements is zero"
760 | ]
761 | },
762 | {
763 | "cell_type": "code",
764 | "execution_count": null,
765 | "metadata": {},
766 | "outputs": [],
767 | "source": [
768 | "# your code goes here\n"
769 | ]
770 | },
771 | {
772 | "cell_type": "code",
773 | "execution_count": null,
774 | "metadata": {
775 | "cell_type": "solution"
776 | },
777 | "outputs": [],
778 | "source": [
779 | "X = pd.Series([-1,2,0,-4,5,6,0,0,-9,10])\n",
780 | "\n",
781 | "X.all()"
782 | ]
783 | },
784 | {
785 | "cell_type": "markdown",
786 | "metadata": {},
787 | "source": [
788 | "\n",
789 | "\n",
790 | "### Given the X pandas Series, return True if any of its elements is zero\n"
791 | ]
792 | },
793 | {
794 | "cell_type": "code",
795 | "execution_count": null,
796 | "metadata": {},
797 | "outputs": [],
798 | "source": [
799 | "# your code goes here\n"
800 | ]
801 | },
802 | {
803 | "cell_type": "code",
804 | "execution_count": null,
805 | "metadata": {
806 | "cell_type": "solution"
807 | },
808 | "outputs": [],
809 | "source": [
810 | "X = pd.Series([-1,2,0,-4,5,6,0,0,-9,10])\n",
811 | "\n",
812 | "(X == 0).any()"
813 | ]
814 | },
815 | {
816 | "cell_type": "markdown",
817 | "metadata": {},
818 | "source": [
819 | "\n",
820 | "\n",
821 | "## Summary statistics"
822 | ]
823 | },
824 | {
825 | "cell_type": "markdown",
826 | "metadata": {},
827 | "source": [
828 | "### Given the X pandas Series, show the sum of its elements\n"
829 | ]
830 | },
831 | {
832 | "cell_type": "code",
833 | "execution_count": null,
834 | "metadata": {},
835 | "outputs": [],
836 | "source": [
837 | "# your code goes here\n"
838 | ]
839 | },
840 | {
841 | "cell_type": "code",
842 | "execution_count": null,
843 | "metadata": {
844 | "cell_type": "solution"
845 | },
846 | "outputs": [],
847 | "source": [
848 | "X = pd.Series([3,5,6,7,2,3,4,9,4])\n",
849 | "\n",
850 | "#np.sum(X)\n",
851 | "X.sum()"
852 | ]
853 | },
854 | {
855 | "cell_type": "markdown",
856 | "metadata": {},
857 | "source": [
858 | "\n",
859 | "\n",
860 | "### Given the X pandas Series, show the mean value of its elements"
861 | ]
862 | },
863 | {
864 | "cell_type": "code",
865 | "execution_count": null,
866 | "metadata": {},
867 | "outputs": [],
868 | "source": [
869 | "# your code goes here\n"
870 | ]
871 | },
872 | {
873 | "cell_type": "code",
874 | "execution_count": null,
875 | "metadata": {
876 | "cell_type": "solution"
877 | },
878 | "outputs": [],
879 | "source": [
880 | "X = pd.Series([1,2,0,4,5,6,0,0,9,10])\n",
881 | "\n",
882 | "#np.mean(X)\n",
883 | "X.mean()"
884 | ]
885 | },
886 | {
887 | "cell_type": "markdown",
888 | "metadata": {},
889 | "source": [
890 | "\n",
891 | "\n",
892 | "### Given the X pandas Series, show the max value of its elements"
893 | ]
894 | },
895 | {
896 | "cell_type": "code",
897 | "execution_count": null,
898 | "metadata": {},
899 | "outputs": [],
900 | "source": [
901 | "# your code goes here\n"
902 | ]
903 | },
904 | {
905 | "cell_type": "code",
906 | "execution_count": null,
907 | "metadata": {
908 | "cell_type": "solution"
909 | },
910 | "outputs": [],
911 | "source": [
912 | "X = pd.Series([1,2,0,4,5,6,0,0,9,10])\n",
913 | "\n",
914 | "#np.max(X)\n",
915 | "X.max()"
916 | ]
917 | },
918 | {
919 | "cell_type": "markdown",
920 | "metadata": {},
921 | "source": [
922 | ""
923 | ]
924 | }
925 | ],
926 | "metadata": {
927 | "kernelspec": {
928 | "display_name": "Python 3",
929 | "language": "python",
930 | "name": "python3"
931 | },
932 | "language_info": {
933 | "codemirror_mode": {
934 | "name": "ipython",
935 | "version": 3
936 | },
937 | "file_extension": ".py",
938 | "mimetype": "text/x-python",
939 | "name": "python",
940 | "nbconvert_exporter": "python",
941 | "pygments_lexer": "ipython3",
942 | "version": "3.7.4"
943 | }
944 | },
945 | "nbformat": 4,
946 | "nbformat_minor": 2
947 | }
948 |
--------------------------------------------------------------------------------
/4 - Pandas DataFrames exercises.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "\n",
8 | "\n",
9 | "\n",
10 | "# Pandas DataFrame exercises\n"
11 | ]
12 | },
13 | {
14 | "cell_type": "code",
15 | "execution_count": null,
16 | "metadata": {},
17 | "outputs": [],
18 | "source": [
19 | "# Import the numpy package under the name np\n",
20 | "import numpy as np\n",
21 | "\n",
22 | "# Import the pandas package under the name pd\n",
23 | "import pandas as pd\n",
24 | "\n",
25 | "# Import the matplotlib package under the name plt\n",
26 | "import matplotlib.pyplot as plt\n",
27 | "%matplotlib inline\n",
28 | "\n",
29 | "# Print the pandas version\n",
30 | "print(pd.__version__)"
31 | ]
32 | },
33 | {
34 | "cell_type": "markdown",
35 | "metadata": {},
36 | "source": [
37 | "\n",
38 | "\n",
39 | "## DataFrame creation"
40 | ]
41 | },
42 | {
43 | "cell_type": "markdown",
44 | "metadata": {},
45 | "source": [
46 | "### Create an empty pandas DataFrame\n"
47 | ]
48 | },
49 | {
50 | "cell_type": "code",
51 | "execution_count": null,
52 | "metadata": {},
53 | "outputs": [],
54 | "source": [
55 | "# your code goes here\n"
56 | ]
57 | },
58 | {
59 | "cell_type": "code",
60 | "execution_count": null,
61 | "metadata": {
62 | "cell_type": "solution"
63 | },
64 | "outputs": [],
65 | "source": [
66 | "# An empty DataFrame has no rows and no columns\n",
67 | "empty_df = pd.DataFrame()\n",
68 | "empty_df"
69 | ]
70 | },
71 | {
72 | "cell_type": "markdown",
73 | "metadata": {},
74 | "source": [
75 | ""
76 | ]
77 | },
78 | {
79 | "cell_type": "markdown",
80 | "metadata": {},
81 | "source": [
82 | "\n",
83 | "\n",
84 | "### Create a `marvel_df` pandas DataFrame with the given marvel data\n"
85 | ]
86 | },
87 | {
88 | "cell_type": "code",
89 | "execution_count": null,
90 | "metadata": {},
91 | "outputs": [],
92 | "source": [
93 | "marvel_data = [\n",
94 | " ['Spider-Man', 'male', 1962],\n",
95 | " ['Captain America', 'male', 1941],\n",
96 | " ['Wolverine', 'male', 1974],\n",
97 | " ['Iron Man', 'male', 1963],\n",
98 | " ['Thor', 'male', 1963],\n",
99 | " ['Thing', 'male', 1961],\n",
100 | " ['Mister Fantastic', 'male', 1961],\n",
101 | " ['Hulk', 'male', 1962],\n",
102 | " ['Beast', 'male', 1963],\n",
103 | " ['Invisible Woman', 'female', 1961],\n",
104 | " ['Storm', 'female', 1975],\n",
105 | " ['Namor', 'male', 1939],\n",
106 | " ['Hawkeye', 'male', 1964],\n",
107 | " ['Daredevil', 'male', 1964],\n",
108 | " ['Doctor Strange', 'male', 1963],\n",
109 | " ['Hank Pym', 'male', 1962],\n",
110 | " ['Scarlet Witch', 'female', 1964],\n",
111 | " ['Wasp', 'female', 1963],\n",
112 | " ['Black Widow', 'female', 1964],\n",
113 | " ['Vision', 'male', 1968]\n",
114 | "]"
115 | ]
116 | },
117 | {
118 | "cell_type": "code",
119 | "execution_count": null,
120 | "metadata": {},
121 | "outputs": [],
122 | "source": [
123 | "# your code goes here\n"
124 | ]
125 | },
126 | {
127 | "cell_type": "code",
128 | "execution_count": null,
129 | "metadata": {
130 | "cell_type": "solution"
131 | },
132 | "outputs": [],
133 | "source": [
134 | "marvel_df = pd.DataFrame(data=marvel_data)\n",
135 | "\n",
136 | "marvel_df"
137 | ]
138 | },
139 | {
140 | "cell_type": "markdown",
141 | "metadata": {},
142 | "source": [
143 | "\n",
144 | "\n",
145 | "### Add column names to the `marvel_df`\n",
146 | " "
147 | ]
148 | },
149 | {
150 | "cell_type": "code",
151 | "execution_count": null,
152 | "metadata": {},
153 | "outputs": [],
154 | "source": [
155 | "# your code goes here\n"
156 | ]
157 | },
158 | {
159 | "cell_type": "code",
160 | "execution_count": null,
161 | "metadata": {
162 | "cell_type": "solution"
163 | },
164 | "outputs": [],
165 | "source": [
166 | "col_names = ['name', 'sex', 'first_appearance']\n",
167 | "\n",
168 | "marvel_df.columns = col_names\n",
169 | "marvel_df"
170 | ]
171 | },
172 | {
173 | "cell_type": "markdown",
174 | "metadata": {},
175 | "source": [
176 | "\n",
177 | "\n",
178 | "### Add index names to the `marvel_df` (use the character name as index)\n"
179 | ]
180 | },
181 | {
182 | "cell_type": "code",
183 | "execution_count": null,
184 | "metadata": {},
185 | "outputs": [],
186 | "source": [
187 | "# your code goes here\n"
188 | ]
189 | },
190 | {
191 | "cell_type": "code",
192 | "execution_count": null,
193 | "metadata": {
194 | "cell_type": "solution"
195 | },
196 | "outputs": [],
197 | "source": [
198 | "marvel_df.index = marvel_df['name']\n",
199 | "marvel_df"
200 | ]
201 | },
202 | {
203 | "cell_type": "markdown",
204 | "metadata": {},
205 | "source": [
206 | "\n",
207 | "\n",
208 | "### Drop the name column as it's now the index"
209 | ]
210 | },
211 | {
212 | "cell_type": "code",
213 | "execution_count": null,
214 | "metadata": {},
215 | "outputs": [],
216 | "source": [
217 | "# your code goes here\n"
218 | ]
219 | },
220 | {
221 | "cell_type": "code",
222 | "execution_count": null,
223 | "metadata": {
224 | "cell_type": "solution"
225 | },
226 | "outputs": [],
227 | "source": [
228 | "#marvel_df = marvel_df.drop(columns=['name'])\n",
229 | "marvel_df = marvel_df.drop(['name'], axis=1)\n",
230 | "marvel_df"
231 | ]
232 | },
233 | {
234 | "cell_type": "markdown",
235 | "metadata": {},
236 | "source": [
237 | "\n",
238 | "\n",
239 | "### Drop 'Namor' and 'Hank Pym' rows\n"
240 | ]
241 | },
242 | {
243 | "cell_type": "code",
244 | "execution_count": null,
245 | "metadata": {},
246 | "outputs": [],
247 | "source": [
248 | "# your code goes here\n"
249 | ]
250 | },
251 | {
252 | "cell_type": "code",
253 | "execution_count": null,
254 | "metadata": {
255 | "cell_type": "solution"
256 | },
257 | "outputs": [],
258 | "source": [
259 | "marvel_df = marvel_df.drop(['Namor', 'Hank Pym'], axis=0)\n",
260 | "marvel_df"
261 | ]
262 | },
263 | {
264 | "cell_type": "markdown",
265 | "metadata": {},
266 | "source": [
267 | "\n",
268 | "\n",
269 |     "## DataFrame selection, slicing and indexing"
270 | ]
271 | },
272 | {
273 | "cell_type": "markdown",
274 | "metadata": {},
275 | "source": [
276 |     "### Show the first 5 rows of `marvel_df`\n",
277 | " "
278 | ]
279 | },
280 | {
281 | "cell_type": "code",
282 | "execution_count": null,
283 | "metadata": {},
284 | "outputs": [],
285 | "source": [
286 | "# your code goes here\n"
287 | ]
288 | },
289 | {
290 | "cell_type": "code",
291 | "execution_count": null,
292 | "metadata": {
293 | "cell_type": "solution"
294 | },
295 | "outputs": [],
296 | "source": [
297 | "#marvel_df.loc[['Spider-Man', 'Captain America', 'Wolverine', 'Iron Man', 'Thor'], :] # bad!\n",
298 | "#marvel_df.loc['Spider-Man': 'Thor', :]\n",
299 | "#marvel_df.iloc[0:5, :]\n",
300 | "#marvel_df.iloc[0:5,]\n",
301 | "marvel_df.iloc[:5,]\n",
302 | "#marvel_df.head()"
303 | ]
304 | },
305 | {
306 | "cell_type": "markdown",
307 | "metadata": {},
308 | "source": [
309 | "\n",
310 | "\n",
311 |     "### Show the last 5 rows of `marvel_df`\n"
312 | ]
313 | },
314 | {
315 | "cell_type": "code",
316 | "execution_count": null,
317 | "metadata": {},
318 | "outputs": [],
319 | "source": [
320 | "# your code goes here\n"
321 | ]
322 | },
323 | {
324 | "cell_type": "code",
325 | "execution_count": null,
326 | "metadata": {
327 | "cell_type": "solution"
328 | },
329 | "outputs": [],
330 | "source": [
331 | "#marvel_df.loc[['Hank Pym', 'Scarlet Witch', 'Wasp', 'Black Widow', 'Vision'], :] # bad!\n",
332 | "#marvel_df.loc['Hank Pym':'Vision', :]\n",
333 | "marvel_df.iloc[-5:,]\n",
334 | "#marvel_df.tail()"
335 | ]
336 | },
337 | {
338 | "cell_type": "markdown",
339 | "metadata": {},
340 | "source": [
341 | "\n",
342 | "\n",
343 |     "### Show just the `sex` of the first 5 rows of `marvel_df`"
344 | ]
345 | },
346 | {
347 | "cell_type": "code",
348 | "execution_count": null,
349 | "metadata": {},
350 | "outputs": [],
351 | "source": [
352 | "# your code goes here\n"
353 | ]
354 | },
355 | {
356 | "cell_type": "code",
357 | "execution_count": null,
358 | "metadata": {
359 | "cell_type": "solution"
360 | },
361 | "outputs": [],
362 | "source": [
363 | "#marvel_df.iloc[:5,]['sex'].to_frame()\n",
364 | "marvel_df.iloc[:5,].sex.to_frame()\n",
365 | "#marvel_df.head().sex.to_frame()"
366 | ]
367 | },
368 | {
369 | "cell_type": "markdown",
370 | "metadata": {},
371 | "source": [
372 | "\n",
373 | "\n",
374 |     "### Show the `first_appearance` of all middle rows of `marvel_df` (all except the first and last)"
375 | ]
376 | },
377 | {
378 | "cell_type": "code",
379 | "execution_count": null,
380 | "metadata": {},
381 | "outputs": [],
382 | "source": [
383 | "# your code goes here\n"
384 | ]
385 | },
386 | {
387 | "cell_type": "code",
388 | "execution_count": null,
389 | "metadata": {
390 | "cell_type": "solution"
391 | },
392 | "outputs": [],
393 | "source": [
394 | "marvel_df.iloc[1:-1,].first_appearance.to_frame()"
395 | ]
396 | },
397 | {
398 | "cell_type": "markdown",
399 | "metadata": {},
400 | "source": [
401 | "\n",
402 | "\n",
403 |     "### Show the first and last rows of `marvel_df`\n"
404 | ]
405 | },
406 | {
407 | "cell_type": "code",
408 | "execution_count": null,
409 | "metadata": {},
410 | "outputs": [],
411 | "source": [
412 | "# your code goes here\n"
413 | ]
414 | },
415 | {
416 | "cell_type": "code",
417 | "execution_count": null,
418 | "metadata": {
419 | "cell_type": "solution"
420 | },
421 | "outputs": [],
422 | "source": [
423 | "#marvel_df.iloc[[0, -1],][['sex', 'first_appearance']]\n",
424 | "marvel_df.iloc[[0, -1],]"
425 | ]
426 | },
427 | {
428 | "cell_type": "markdown",
429 | "metadata": {},
430 | "source": [
431 | "\n",
432 | "\n",
433 | "## DataFrame manipulation and operations"
434 | ]
435 | },
436 | {
437 | "cell_type": "markdown",
438 | "metadata": {},
439 | "source": [
440 | "### Modify the `first_appearance` of 'Vision' to year 1964"
441 | ]
442 | },
443 | {
444 | "cell_type": "code",
445 | "execution_count": null,
446 | "metadata": {},
447 | "outputs": [],
448 | "source": [
449 | "# your code goes here\n"
450 | ]
451 | },
452 | {
453 | "cell_type": "code",
454 | "execution_count": null,
455 | "metadata": {
456 | "cell_type": "solution"
457 | },
458 | "outputs": [],
459 | "source": [
460 | "marvel_df.loc['Vision', 'first_appearance'] = 1964\n",
461 | "\n",
462 | "marvel_df"
463 | ]
464 | },
465 | {
466 | "cell_type": "markdown",
467 | "metadata": {},
468 | "source": [
469 | "\n",
470 | "\n",
471 | "### Add a new column to `marvel_df` called 'years_since' with the years since `first_appearance`\n"
472 | ]
473 | },
474 | {
475 | "cell_type": "code",
476 | "execution_count": null,
477 | "metadata": {},
478 | "outputs": [],
479 | "source": [
480 | "# your code goes here\n"
481 | ]
482 | },
483 | {
484 | "cell_type": "code",
485 | "execution_count": null,
486 | "metadata": {
487 | "cell_type": "solution"
488 | },
489 | "outputs": [],
490 | "source": [
491 | "marvel_df['years_since'] = 2018 - marvel_df['first_appearance']\n",
492 | "\n",
493 | "marvel_df"
494 | ]
495 | },
496 | {
497 | "cell_type": "markdown",
498 | "metadata": {},
499 | "source": [
500 | "\n",
501 | "\n",
502 | "## DataFrame boolean arrays (also called masks)"
503 | ]
504 | },
505 | {
506 | "cell_type": "markdown",
507 | "metadata": {},
508 | "source": [
509 | "### Given the `marvel_df` pandas DataFrame, make a mask showing the female characters\n"
510 | ]
511 | },
512 | {
513 | "cell_type": "code",
514 | "execution_count": null,
515 | "metadata": {},
516 | "outputs": [],
517 | "source": [
518 | "# your code goes here\n"
519 | ]
520 | },
521 | {
522 | "cell_type": "code",
523 | "execution_count": null,
524 | "metadata": {
525 | "cell_type": "solution"
526 | },
527 | "outputs": [],
528 | "source": [
529 | "mask = marvel_df['sex'] == 'female'\n",
530 | "\n",
531 | "mask"
532 | ]
533 | },
534 | {
535 | "cell_type": "markdown",
536 | "metadata": {},
537 | "source": [
538 | "\n",
539 | "\n",
540 | "### Given the `marvel_df` pandas DataFrame, get the male characters\n"
541 | ]
542 | },
543 | {
544 | "cell_type": "code",
545 | "execution_count": null,
546 | "metadata": {},
547 | "outputs": [],
548 | "source": [
549 | "# your code goes here\n"
550 | ]
551 | },
552 | {
553 | "cell_type": "code",
554 | "execution_count": null,
555 | "metadata": {
556 | "cell_type": "solution"
557 | },
558 | "outputs": [],
559 | "source": [
560 | "mask = marvel_df['sex'] == 'male'\n",
561 | "\n",
562 | "marvel_df[mask]"
563 | ]
564 | },
565 | {
566 | "cell_type": "markdown",
567 | "metadata": {},
568 | "source": [
569 | "\n",
570 | "\n",
571 | "### Given the `marvel_df` pandas DataFrame, get the characters with `first_appearance` after 1970\n"
572 | ]
573 | },
574 | {
575 | "cell_type": "code",
576 | "execution_count": null,
577 | "metadata": {},
578 | "outputs": [],
579 | "source": [
580 | "# your code goes here\n"
581 | ]
582 | },
583 | {
584 | "cell_type": "code",
585 | "execution_count": null,
586 | "metadata": {
587 | "cell_type": "solution"
588 | },
589 | "outputs": [],
590 | "source": [
591 | "mask = marvel_df['first_appearance'] > 1970\n",
592 | "\n",
593 | "marvel_df[mask]"
594 | ]
595 | },
596 | {
597 | "cell_type": "markdown",
598 | "metadata": {},
599 | "source": [
600 | "\n",
601 | "\n",
602 | "### Given the `marvel_df` pandas DataFrame, get the female characters with `first_appearance` after 1970"
603 | ]
604 | },
605 | {
606 | "cell_type": "code",
607 | "execution_count": null,
608 | "metadata": {},
609 | "outputs": [],
610 | "source": [
611 | "# your code goes here\n"
612 | ]
613 | },
614 | {
615 | "cell_type": "code",
616 | "execution_count": null,
617 | "metadata": {
618 | "cell_type": "solution",
619 | "scrolled": true
620 | },
621 | "outputs": [],
622 | "source": [
623 | "mask = (marvel_df['sex'] == 'female') & (marvel_df['first_appearance'] > 1970)\n",
624 | "\n",
625 | "marvel_df[mask]"
626 | ]
627 | },
628 | {
629 | "cell_type": "markdown",
630 | "metadata": {},
631 | "source": [
632 | "\n",
633 | "\n",
634 | "## DataFrame summary statistics"
635 | ]
636 | },
637 | {
638 | "cell_type": "markdown",
639 | "metadata": {},
640 | "source": [
641 | "### Show basic statistics of `marvel_df`"
642 | ]
643 | },
644 | {
645 | "cell_type": "code",
646 | "execution_count": null,
647 | "metadata": {},
648 | "outputs": [],
649 | "source": [
650 | "# your code goes here\n"
651 | ]
652 | },
653 | {
654 | "cell_type": "code",
655 | "execution_count": null,
656 | "metadata": {
657 | "cell_type": "solution"
658 | },
659 | "outputs": [],
660 | "source": [
661 | "marvel_df.describe()"
662 | ]
663 | },
664 | {
665 | "cell_type": "markdown",
666 | "metadata": {},
667 | "source": [
668 | "\n",
669 | "\n",
670 | "### Given the `marvel_df` pandas DataFrame, show the mean value of `first_appearance`"
671 | ]
672 | },
673 | {
674 | "cell_type": "code",
675 | "execution_count": null,
676 | "metadata": {},
677 | "outputs": [],
678 | "source": [
679 | "# your code goes here\n"
680 | ]
681 | },
682 | {
683 | "cell_type": "code",
684 | "execution_count": null,
685 | "metadata": {
686 | "cell_type": "solution"
687 | },
688 | "outputs": [],
689 | "source": [
690 | "\n",
691 | "#np.mean(marvel_df.first_appearance)\n",
692 | "marvel_df.first_appearance.mean()"
693 | ]
694 | },
695 | {
696 | "cell_type": "markdown",
697 | "metadata": {},
698 | "source": [
699 | "\n",
700 | "\n",
701 | "### Given the `marvel_df` pandas DataFrame, show the min value of `first_appearance`\n"
702 | ]
703 | },
704 | {
705 | "cell_type": "code",
706 | "execution_count": null,
707 | "metadata": {},
708 | "outputs": [],
709 | "source": [
710 | "# your code goes here\n"
711 | ]
712 | },
713 | {
714 | "cell_type": "code",
715 | "execution_count": null,
716 | "metadata": {
717 | "cell_type": "solution"
718 | },
719 | "outputs": [],
720 | "source": [
721 | "#np.min(marvel_df.first_appearance)\n",
722 | "marvel_df.first_appearance.min()"
723 | ]
724 | },
725 | {
726 | "cell_type": "markdown",
727 | "metadata": {},
728 | "source": [
729 | "\n",
730 | "\n",
731 | "### Given the `marvel_df` pandas DataFrame, get the characters with the min value of `first_appearance`"
732 | ]
733 | },
734 | {
735 | "cell_type": "code",
736 | "execution_count": null,
737 | "metadata": {},
738 | "outputs": [],
739 | "source": [
740 | "# your code goes here\n"
741 | ]
742 | },
743 | {
744 | "cell_type": "code",
745 | "execution_count": null,
746 | "metadata": {
747 | "cell_type": "solution"
748 | },
749 | "outputs": [],
750 | "source": [
751 | "mask = marvel_df['first_appearance'] == marvel_df.first_appearance.min()\n",
752 | "marvel_df[mask]"
753 | ]
754 | },
755 | {
756 | "cell_type": "markdown",
757 | "metadata": {},
758 | "source": [
759 | "\n",
760 | "\n",
761 |     "## DataFrame basic plotting"
762 | ]
763 | },
764 | {
765 | "cell_type": "markdown",
766 | "metadata": {},
767 | "source": [
768 |     "### Reset the index of `marvel_df`\n"
769 | ]
770 | },
771 | {
772 | "cell_type": "code",
773 | "execution_count": null,
774 | "metadata": {},
775 | "outputs": [],
776 | "source": [
777 | "# your code goes here\n"
778 | ]
779 | },
780 | {
781 | "cell_type": "code",
782 | "execution_count": null,
783 | "metadata": {
784 | "cell_type": "solution"
785 | },
786 | "outputs": [],
787 | "source": [
788 | "marvel_df = marvel_df.reset_index()\n",
789 | "\n",
790 | "marvel_df"
791 | ]
792 | },
793 | {
794 | "cell_type": "markdown",
795 | "metadata": {},
796 | "source": [
797 | "\n",
798 | "\n",
799 | "### Plot the values of `first_appearance`\n"
800 | ]
801 | },
802 | {
803 | "cell_type": "code",
804 | "execution_count": null,
805 | "metadata": {},
806 | "outputs": [],
807 | "source": [
808 | "# your code goes here\n"
809 | ]
810 | },
811 | {
812 | "cell_type": "code",
813 | "execution_count": null,
814 | "metadata": {
815 | "cell_type": "solution"
816 | },
817 | "outputs": [],
818 | "source": [
819 | "#plt.plot(marvel_df.index, marvel_df.first_appearance)\n",
820 | "marvel_df.first_appearance.plot()"
821 | ]
822 | },
823 | {
824 | "cell_type": "markdown",
825 | "metadata": {},
826 | "source": [
827 | "\n",
828 | "\n",
829 |     "### Plot a histogram (`plot.hist`) of the `first_appearance` values\n"
830 | ]
831 | },
832 | {
833 | "cell_type": "code",
834 | "execution_count": null,
835 | "metadata": {},
836 | "outputs": [],
837 | "source": [
838 | "# your code goes here\n"
839 | ]
840 | },
841 | {
842 | "cell_type": "code",
843 | "execution_count": null,
844 | "metadata": {
845 | "cell_type": "solution"
846 | },
847 | "outputs": [],
848 | "source": [
849 |     "#plt.hist(marvel_df.first_appearance)\n",
850 |     "marvel_df.first_appearance.plot.hist()"
851 | ]
852 | },
853 | {
854 | "cell_type": "markdown",
855 | "metadata": {},
856 | "source": [
857 | "\n"
858 | ]
859 | }
860 | ],
861 | "metadata": {
862 | "kernelspec": {
863 | "display_name": "Python 3",
864 | "language": "python",
865 | "name": "python3"
866 | },
867 | "language_info": {
868 | "codemirror_mode": {
869 | "name": "ipython",
870 | "version": 3
871 | },
872 | "file_extension": ".py",
873 | "mimetype": "text/x-python",
874 | "name": "python",
875 | "nbconvert_exporter": "python",
876 | "pygments_lexer": "ipython3",
877 | "version": "3.8.1"
878 | }
879 | },
880 | "nbformat": 4,
881 | "nbformat_minor": 4
882 | }
883 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 |
2 | #### We're in the process of adapting these notebooks into interactive projects in [DataWars](https://www.datawars.io/?utm_source=fccrepo&utm_medium=intro-to-pandas). Sign up now, it's [completely free](https://www.datawars.io/?utm_source=fccrepo&utm_medium=intro-to-pandas).
3 |
4 | Stay tuned! Have any questions? [Join our Discord](https://discord.gg/DSTe8tY38T)
5 |
6 | ---
7 |
8 | Created by Santiago Basulto. Connect with me on [X](https://x.com/santiagobasulto) or [LinkedIn](https://www.linkedin.com/in/santiagobasulto/)
9 |
--------------------------------------------------------------------------------
/data/.ipynb_checkpoints/btc-market-price-checkpoint.csv:
--------------------------------------------------------------------------------
1 | 2017-04-02 00:00:00,1099.169125
2 | 2017-04-03 00:00:00,1141.813
3 | 2017-04-04 00:00:00,1141.6003625
4 | 2017-04-05 00:00:00,1133.0793142857142
5 | 2017-04-06 00:00:00,1196.3079375
6 | 2017-04-07 00:00:00,1190.45425
7 | 2017-04-08 00:00:00,1181.1498375
8 | 2017-04-09 00:00:00,1208.8005
9 | 2017-04-10 00:00:00,1207.744875
10 | 2017-04-11 00:00:00,1226.6170375
11 | 2017-04-12 00:00:00,1218.92205
12 | 2017-04-13 00:00:00,1180.0237125
13 | 2017-04-14 00:00:00,1185.2600571428572
14 | 2017-04-15 00:00:00,1184.8806714285713
15 | 2017-04-16 00:00:00,1186.9274125
16 | 2017-04-17 00:00:00,1205.634875
17 | 2017-04-18 00:00:00,1216.1867428571427
18 | 2017-04-19 00:00:00,1217.9300875
19 | 2017-04-20 00:00:00,1241.6863250000001
20 | 2017-04-21 00:00:00,1258.3614125
21 | 2017-04-22 00:00:00,1261.311225
22 | 2017-04-23 00:00:00,1257.9881125
23 | 2017-04-24 00:00:00,1262.902775
24 | 2017-04-25 00:00:00,1279.4146875000001
25 | 2017-04-26 00:00:00,1309.109875
26 | 2017-04-27 00:00:00,1345.3539125
27 | 2017-04-28 00:00:00,1331.2944285714286
28 | 2017-04-29 00:00:00,1334.9790375
29 | 2017-04-30 00:00:00,1353.0045
30 | 2017-05-01 00:00:00,1417.1728125
31 | 2017-05-02 00:00:00,1452.0762875
32 | 2017-05-03 00:00:00,1507.5768571428573
33 | 2017-05-04 00:00:00,1508.292125
34 | 2017-05-05 00:00:00,1533.3350714285714
35 | 2017-05-06 00:00:00,1560.4102
36 | 2017-05-07 00:00:00,1535.8684285714285
37 | 2017-05-08 00:00:00,1640.619225
38 | 2017-05-09 00:00:00,1721.2849714285715
39 | 2017-05-10 00:00:00,1762.88625
40 | 2017-05-11 00:00:00,1820.9905625
41 | 2017-05-12 00:00:00,1720.4785
42 | 2017-05-13 00:00:00,1771.9200125
43 | 2017-05-14 00:00:00,1776.3165
44 | 2017-05-15 00:00:00,1723.1269375
45 | 2017-05-16 00:00:00,1739.031975
46 | 2017-05-17 00:00:00,1807.4850625
47 | 2017-05-18 00:00:00,1899.0828875
48 | 2017-05-19 00:00:00,1961.5204875
49 | 2017-05-20 00:00:00,2052.9097875
50 | 2017-05-21 00:00:00,2046.5344625
51 | 2017-05-22 00:00:00,2090.6623125
52 | 2017-05-23 00:00:00,2287.7102875
53 | 2017-05-24 00:00:00,2379.1938333333333
54 | 2017-05-25 00:00:00,2387.2062857142855
55 | 2017-05-26 00:00:00,2211.976857142857
56 | 2017-05-27 00:00:00,2014.0529625
57 | 2017-05-28 00:00:00,2192.9808
58 | 2017-05-29 00:00:00,2275.9307
59 | 2017-05-30 00:00:00,2239.2053428571426
60 | 2017-05-31 00:00:00,2285.9339142857143
61 | 2017-06-01 00:00:00,2399.2426714285716
62 | 2017-06-02 00:00:00,2446.142414285714
63 | 2017-06-03 00:00:00,2525.7651584699997
64 | 2017-06-04 00:00:00,2516.173142857143
65 | 2017-06-05 00:00:00,2698.3138125
66 | 2017-06-06 00:00:00,2883.3136966371426
67 | 2017-06-07 00:00:00,2664.9208625
68 | 2017-06-08 00:00:00,2792.9991875
69 | 2017-06-09 00:00:00,2827.4913
70 | 2017-06-10 00:00:00,2845.3728571428574
71 | 2017-06-11 00:00:00,2961.8296124999997
72 | 2017-06-12 00:00:00,2657.6750625
73 | 2017-06-13 00:00:00,2748.185085714286
74 | 2017-06-14 00:00:00,2447.0415625
75 | 2017-06-15 00:00:00,2442.48025
76 | 2017-06-16 00:00:00,2464.9598142857144
77 | 2017-06-17 00:00:00,2665.927
78 | 2017-06-18 00:00:00,2507.389252144286
79 | 2017-06-19 00:00:00,2617.2102625
80 | 2017-06-20 00:00:00,2754.97825
81 | 2017-06-21 00:00:00,2671.04325
82 | 2017-06-22 00:00:00,2727.2880125
83 | 2017-06-23 00:00:00,2710.4122857142856
84 | 2017-06-24 00:00:00,2589.1648875
85 | 2017-06-25 00:00:00,2512.3662857142854
86 | 2017-06-26 00:00:00,2436.4510571428573
87 | 2017-06-27 00:00:00,2517.9031142857143
88 | 2017-06-28 00:00:00,2585.349185714286
89 | 2017-06-29 00:00:00,2544.414475
90 | 2017-06-30 00:00:00,2477.641375
91 | 2017-07-01 00:00:00,2434.0778625
92 | 2017-07-02 00:00:00,2501.191342857143
93 | 2017-07-03 00:00:00,2561.225428571429
94 | 2017-07-04 00:00:00,2599.7298375
95 | 2017-07-05 00:00:00,2619.1875030042856
96 | 2017-07-06 00:00:00,2609.96775
97 | 2017-07-07 00:00:00,2491.201214285714
98 | 2017-07-08 00:00:00,2562.1306624999997
99 | 2017-07-09 00:00:00,2536.2389375
100 | 2017-07-10 00:00:00,2366.1701428571428
101 | 2017-07-11 00:00:00,2369.8621285714285
102 | 2017-07-12 00:00:00,2385.7485714285717
103 | 2017-07-13 00:00:00,2354.7834166666667
104 | 2017-07-14 00:00:00,2190.947833333333
105 | 2017-07-15 00:00:00,2058.9955999999997
106 | 2017-07-16 00:00:00,1931.2143
107 | 2017-07-17 00:00:00,2176.6234875
108 | 2017-07-18 00:00:00,2320.12225
109 | 2017-07-19 00:00:00,2264.7657
110 | 2017-07-20 00:00:00,2898.1884166666664
111 | 2017-07-21 00:00:00,2682.1953625
112 | 2017-07-22 00:00:00,2807.609857142857
113 | 2017-07-23 00:00:00,2725.549716666667
114 | 2017-07-24 00:00:00,2751.821028571429
115 | 2017-07-25 00:00:00,2560.9979166666667
116 | 2017-07-26 00:00:00,2495.028585714286
117 | 2017-07-27 00:00:00,2647.625
118 | 2017-07-28 00:00:00,2781.636583333333
119 | 2017-07-29 00:00:00,2722.512785714286
120 | 2017-07-30 00:00:00,2745.955416666666
121 | 2017-07-31 00:00:00,2866.431666666667
122 | 2017-08-01 00:00:00,2710.4130666666665
123 | 2017-08-02 00:00:00,2693.6339833333336
124 | 2017-08-03 00:00:00,2794.117716666666
125 | 2017-08-04 00:00:00,2873.8510833333335
126 | 2017-08-05 00:00:00,3218.1150166666666
127 | 2017-08-06 00:00:00,3252.5625333333332
128 | 2017-08-07 00:00:00,3407.2268333333336
129 | 2017-08-08 00:00:00,3457.374333333333
130 | 2017-08-09 00:00:00,3357.326316666667
131 | 2017-08-10 00:00:00,3424.4042000000004
132 | 2017-08-11 00:00:00,3632.5066666666667
133 | 2017-08-12 00:00:00,3852.8029142857145
134 | 2017-08-13 00:00:00,4125.54802
135 | 2017-08-14 00:00:00,4282.992
136 | 2017-08-15 00:00:00,4217.028328571429
137 | 2017-08-16 00:00:00,4360.876871428572
138 | 2017-08-17 00:00:00,4328.725716666667
139 | 2017-08-18 00:00:00,4130.440066666667
140 | 2017-08-19 00:00:00,4222.662214285714
141 | 2017-08-20 00:00:00,4157.958033333333
142 | 2017-08-21 00:00:00,4043.722
143 | 2017-08-22 00:00:00,4082.180983333333
144 | 2017-08-23 00:00:00,4174.95
145 | 2017-08-24 00:00:00,4340.316716666667
146 | 2017-08-25 00:00:00,4363.05445
147 | 2017-08-26 00:00:00,4360.5133166666665
148 | 2017-08-27 00:00:00,4354.308333333333
149 | 2017-08-28 00:00:00,4391.673516666667
150 | 2017-08-29 00:00:00,4607.98545
151 | 2017-08-30 00:00:00,4594.98785
152 | 2017-08-31 00:00:00,4748.255
153 | 2017-09-01 00:00:00,4911.740016666667
154 | 2017-09-02 00:00:00,4580.387479999999
155 | 2017-09-03 00:00:00,4648.159983333334
156 | 2017-09-04 00:00:00,4344.0983166666665
157 | 2017-09-05 00:00:00,4488.72014
158 | 2017-09-06 00:00:00,4641.822016666666
159 | 2017-09-07 00:00:00,4654.6585000000005
160 | 2017-09-08 00:00:00,4310.750183333334
161 | 2017-09-09 00:00:00,4375.55952
162 | 2017-09-10 00:00:00,4329.955
163 | 2017-09-11 00:00:00,4248.090016666666
164 | 2017-09-12 00:00:00,4219.036616666667
165 | 2017-09-13 00:00:00,3961.2712666666666
166 | 2017-09-14 00:00:00,3319.6299999999997
167 | 2017-09-15 00:00:00,3774.2652833333336
168 | 2017-09-16 00:00:00,3763.62604
169 | 2017-09-17 00:00:00,3746.060783333333
170 | 2017-09-18 00:00:00,4093.316666666667
171 | 2017-09-19 00:00:00,3943.4133333333334
172 | 2017-09-20 00:00:00,3977.5616666666665
173 | 2017-09-21 00:00:00,3658.8981833333332
174 | 2017-09-22 00:00:00,3637.5025499999997
175 | 2017-09-23 00:00:00,3776.3869
176 | 2017-09-24 00:00:00,3703.0406500000004
177 | 2017-09-25 00:00:00,3942.5550000000003
178 | 2017-09-26 00:00:00,3910.3073833333333
179 | 2017-09-27 00:00:00,4202.554983333333
180 | 2017-09-28 00:00:00,4201.98905
181 | 2017-09-29 00:00:00,4193.574666666666
182 | 2017-09-30 00:00:00,4335.368316666667
183 | 2017-10-01 00:00:00,4360.722966666667
184 | 2017-10-02 00:00:00,4386.88375
185 | 2017-10-03 00:00:00,4293.3066
186 | 2017-10-04 00:00:00,4225.175
187 | 2017-10-05 00:00:00,4338.852
188 | 2017-10-06 00:00:00,4345.6033333333335
189 | 2017-10-07 00:00:00,4376.191666666667
190 | 2017-10-08 00:00:00,4602.280883333334
191 | 2017-10-09 00:00:00,4777.967816666666
192 | 2017-10-10 00:00:00,4782.28
193 | 2017-10-11 00:00:00,4819.485766666667
194 | 2017-10-12 00:00:00,5325.130683333333
195 | 2017-10-13 00:00:00,5563.806566666666
196 | 2017-10-14 00:00:00,5739.438733333333
197 | 2017-10-15 00:00:00,5647.311666666667
198 | 2017-10-16 00:00:00,5711.205866666667
199 | 2017-10-17 00:00:00,5603.71294
200 | 2017-10-18 00:00:00,5546.176100000001
201 | 2017-10-19 00:00:00,5727.6335
202 | 2017-10-20 00:00:00,5979.45984
203 | 2017-10-21 00:00:00,6020.371683333334
204 | 2017-10-22 00:00:00,5983.184550000001
205 | 2017-10-23 00:00:00,5876.079866666667
206 | 2017-10-24 00:00:00,5505.827766666666
207 | 2017-10-25 00:00:00,5669.622533333334
208 | 2017-10-26 00:00:00,5893.138416666666
209 | 2017-10-27 00:00:00,5772.504983333333
210 | 2017-10-28 00:00:00,5776.6969500000005
211 | 2017-10-29 00:00:00,6155.43402
212 | 2017-10-30 00:00:00,6105.87422
213 | 2017-10-31 00:00:00,6388.645166666666
214 | 2017-11-01 00:00:00,6665.306683333333
215 | 2017-11-02 00:00:00,7068.020100000001
216 | 2017-11-03 00:00:00,7197.72006
217 | 2017-11-04 00:00:00,7437.543316666666
218 | 2017-11-05 00:00:00,7377.012366666667
219 | 2017-11-06 00:00:00,6989.071666666667
220 | 2017-11-07 00:00:00,7092.127233333333
221 | 2017-11-08 00:00:00,7415.878250000001
222 | 2017-11-09 00:00:00,7158.03706
223 | 2017-11-10 00:00:00,6719.39785
224 | 2017-11-11 00:00:00,6362.851033333333
225 | 2017-11-12 00:00:00,5716.301583333334
226 | 2017-11-13 00:00:00,6550.227533333334
227 | 2017-11-14 00:00:00,6635.412633333333
228 | 2017-11-15 00:00:00,7301.42992
229 | 2017-11-16 00:00:00,7815.0307
230 | 2017-11-17 00:00:00,7786.884366666666
231 | 2017-11-18 00:00:00,7817.1403833333325
232 | 2017-11-19 00:00:00,8007.654066666667
233 | 2017-11-20 00:00:00,8255.596816666666
234 | 2017-11-21 00:00:00,8059.8
235 | 2017-11-22 00:00:00,8268.035
236 | 2017-11-23 00:00:00,8148.95
237 | 2017-11-24 00:00:00,8250.978333333334
238 | 2017-11-25 00:00:00,8707.407266666667
239 | 2017-11-26 00:00:00,9284.1438
240 | 2017-11-27 00:00:00,9718.29505
241 | 2017-11-28 00:00:00,9952.50882
242 | 2017-11-29 00:00:00,9879.328333333333
243 | 2017-11-30 00:00:00,10147.372
244 | 2017-12-01 00:00:00,10883.912
245 | 2017-12-02 00:00:00,11071.368333333332
246 | 2017-12-03 00:00:00,11332.622
247 | 2017-12-04 00:00:00,11584.83
248 | 2017-12-05 00:00:00,11878.433333333334
249 | 2017-12-06 00:00:00,13540.980000000001
250 | 2017-12-07 00:00:00,16501.971666666668
251 | 2017-12-08 00:00:00,16007.436666666666
252 | 2017-12-09 00:00:00,15142.834152123332
253 | 2017-12-10 00:00:00,14869.805
254 | 2017-12-11 00:00:00,16762.116666666665
255 | 2017-12-12 00:00:00,17276.393333333333
256 | 2017-12-13 00:00:00,16808.366666666665
257 | 2017-12-14 00:00:00,16678.892
258 | 2017-12-15 00:00:00,17771.899999999998
259 | 2017-12-16 00:00:00,19498.683333333334
260 | 2017-12-17 00:00:00,19289.785
261 | 2017-12-18 00:00:00,18961.856666666667
262 | 2017-12-19 00:00:00,17737.111666666668
263 | 2017-12-20 00:00:00,16026.271666666667
264 | 2017-12-21 00:00:00,16047.51
265 | 2017-12-22 00:00:00,15190.945
266 | 2017-12-23 00:00:00,15360.261666666667
267 | 2017-12-24 00:00:00,13949.175000000001
268 | 2017-12-25 00:00:00,14119.028333333334
269 | 2017-12-26 00:00:00,15999.048333333332
270 | 2017-12-27 00:00:00,15589.321666666665
271 | 2017-12-28 00:00:00,14380.581666666667
272 | 2017-12-29 00:00:00,14640.14
273 | 2017-12-30 00:00:00,13215.573999999999
274 | 2017-12-31 00:00:00,14165.574999999999
275 | 2018-01-01 00:00:00,13812.186666666666
276 | 2018-01-02 00:00:00,15005.856666666667
277 | 2018-01-03 00:00:00,15053.261666666665
278 | 2018-01-04 00:00:00,15199.355000000001
279 | 2018-01-05 00:00:00,17174.12
280 | 2018-01-06 00:00:00,17319.198
281 | 2018-01-07 00:00:00,16651.471666666668
282 | 2018-01-08 00:00:00,15265.906666666668
283 | 2018-01-09 00:00:00,14714.253333333334
284 | 2018-01-10 00:00:00,15126.398333333333
285 | 2018-01-11 00:00:00,13296.794
286 | 2018-01-12 00:00:00,13912.882000000001
287 | 2018-01-13 00:00:00,14499.773333333333
288 | 2018-01-14 00:00:00,13852.92
289 | 2018-01-15 00:00:00,14012.196
290 | 2018-01-16 00:00:00,11180.998333333331
291 | 2018-01-17 00:00:00,11116.946666666669
292 | 2018-01-18 00:00:00,11345.423333333332
293 | 2018-01-19 00:00:00,11422.44
294 | 2018-01-20 00:00:00,12950.793333333333
295 | 2018-01-21 00:00:00,11505.228
296 | 2018-01-22 00:00:00,10544.593333333332
297 | 2018-01-23 00:00:00,11223.064
298 | 2018-01-24 00:00:00,11282.258333333333
299 | 2018-01-25 00:00:00,11214.44
300 | 2018-01-26 00:00:00,10969.815
301 | 2018-01-27 00:00:00,11524.776666666667
302 | 2018-01-28 00:00:00,11765.71
303 | 2018-01-29 00:00:00,11212.654999999999
304 | 2018-01-30 00:00:00,10184.061666666666
305 | 2018-01-31 00:00:00,10125.013333333334
306 | 2018-02-01 00:00:00,9083.258333333333
307 | 2018-02-02 00:00:00,8901.901666666667
308 | 2018-02-03 00:00:00,9076.678333333333
309 | 2018-02-04 00:00:00,8400.648333333333
310 | 2018-02-05 00:00:00,6838.816666666667
311 | 2018-02-06 00:00:00,7685.633333333334
312 | 2018-02-07 00:00:00,8099.958333333333
313 | 2018-02-08 00:00:00,8240.536666666667
314 | 2018-02-09 00:00:00,8535.516666666668
315 | 2018-02-10 00:00:00,8319.876566184
316 | 2018-02-11 00:00:00,8343.455
317 | 2018-02-12 00:00:00,8811.343333333332
318 | 2018-02-13 00:00:00,8597.7675
319 | 2018-02-14 00:00:00,9334.633333333333
320 | 2018-02-15 00:00:00,9977.154
321 | 2018-02-16 00:00:00,10127.161666666667
322 | 2018-02-17 00:00:00,10841.991666666667
323 | 2018-02-18 00:00:00,10503.298333333334
324 | 2018-02-19 00:00:00,11110.964999999998
325 | 2018-02-20 00:00:00,11390.391666666668
326 | 2018-02-21 00:00:00,10532.791666666666
327 | 2018-02-22 00:00:00,9931.071666666667
328 | 2018-02-23 00:00:00,10162.116666666667
329 | 2018-02-24 00:00:00,9697.956
330 | 2018-02-25 00:00:00,9696.593333333332
331 | 2018-02-26 00:00:00,10348.603333333334
332 | 2018-02-27 00:00:00,10763.883333333333
333 | 2018-02-28 00:00:00,10370.164999999999
334 | 2018-03-01 00:00:00,11009.381666666668
335 | 2018-03-02 00:00:00,11055.815
336 | 2018-03-03 00:00:00,11326.948333333334
337 | 2018-03-04 00:00:00,11430.181666666665
338 | 2018-03-05 00:00:00,11595.54
339 | 2018-03-06 00:00:00,10763.198333333334
340 | 2018-03-07 00:00:00,10118.058
341 | 2018-03-08 00:00:00,9429.111666666666
342 | 2018-03-09 00:00:00,9089.278333333334
343 | 2018-03-10 00:00:00,8746.002
344 | 2018-03-11 00:00:00,9761.396666666666
345 | 2018-03-12 00:00:00,9182.843333333332
346 | 2018-03-13 00:00:00,9154.699999999999
347 | 2018-03-14 00:00:00,8151.531666666667
348 | 2018-03-15 00:00:00,8358.121666666666
349 | 2018-03-16 00:00:00,8530.402
350 | 2018-03-17 00:00:00,7993.674643641666
351 | 2018-03-18 00:00:00,8171.415
352 | 2018-03-19 00:00:00,8412.033333333333
353 | 2018-03-20 00:00:00,8986.948333333334
354 | 2018-03-21 00:00:00,8947.753333333334
355 | 2018-03-22 00:00:00,8690.408333333333
356 | 2018-03-23 00:00:00,8686.826666666666
357 | 2018-03-24 00:00:00,8662.378333333334
358 | 2018-03-25 00:00:00,8617.296666666667
359 | 2018-03-26 00:00:00,8197.548333333334
360 | 2018-03-27 00:00:00,7876.195
361 | 2018-03-28 00:00:00,7960.38
362 | 2018-03-29 00:00:00,7172.28
363 | 2018-03-30 00:00:00,6882.531666666667
364 | 2018-03-31 00:00:00,6935.48
365 | 2018-04-01 00:00:00,6794.105
366 |
--------------------------------------------------------------------------------
/data/btc-market-price.csv:
--------------------------------------------------------------------------------
1 | 2017-04-02 00:00:00,1099.169125
2 | 2017-04-03 00:00:00,1141.813
3 | 2017-04-04 00:00:00,1141.6003625
4 | 2017-04-05 00:00:00,1133.0793142857142
5 | 2017-04-06 00:00:00,1196.3079375
6 | 2017-04-07 00:00:00,1190.45425
7 | 2017-04-08 00:00:00,1181.1498375
8 | 2017-04-09 00:00:00,1208.8005
9 | 2017-04-10 00:00:00,1207.744875
10 | 2017-04-11 00:00:00,1226.6170375
11 | 2017-04-12 00:00:00,1218.92205
12 | 2017-04-13 00:00:00,1180.0237125
13 | 2017-04-14 00:00:00,1185.2600571428572
14 | 2017-04-15 00:00:00,1184.8806714285713
15 | 2017-04-16 00:00:00,1186.9274125
16 | 2017-04-17 00:00:00,1205.634875
17 | 2017-04-18 00:00:00,1216.1867428571427
18 | 2017-04-19 00:00:00,1217.9300875
19 | 2017-04-20 00:00:00,1241.6863250000001
20 | 2017-04-21 00:00:00,1258.3614125
21 | 2017-04-22 00:00:00,1261.311225
22 | 2017-04-23 00:00:00,1257.9881125
23 | 2017-04-24 00:00:00,1262.902775
24 | 2017-04-25 00:00:00,1279.4146875000001
25 | 2017-04-26 00:00:00,1309.109875
26 | 2017-04-27 00:00:00,1345.3539125
27 | 2017-04-28 00:00:00,1331.2944285714286
28 | 2017-04-29 00:00:00,1334.9790375
29 | 2017-04-30 00:00:00,1353.0045
30 | 2017-05-01 00:00:00,1417.1728125
31 | 2017-05-02 00:00:00,1452.0762875
32 | 2017-05-03 00:00:00,1507.5768571428573
33 | 2017-05-04 00:00:00,1508.292125
34 | 2017-05-05 00:00:00,1533.3350714285714
35 | 2017-05-06 00:00:00,1560.4102
36 | 2017-05-07 00:00:00,1535.8684285714285
37 | 2017-05-08 00:00:00,1640.619225
38 | 2017-05-09 00:00:00,1721.2849714285715
39 | 2017-05-10 00:00:00,1762.88625
40 | 2017-05-11 00:00:00,1820.9905625
41 | 2017-05-12 00:00:00,1720.4785
42 | 2017-05-13 00:00:00,1771.9200125
43 | 2017-05-14 00:00:00,1776.3165
44 | 2017-05-15 00:00:00,1723.1269375
45 | 2017-05-16 00:00:00,1739.031975
46 | 2017-05-17 00:00:00,1807.4850625
47 | 2017-05-18 00:00:00,1899.0828875
48 | 2017-05-19 00:00:00,1961.5204875
49 | 2017-05-20 00:00:00,2052.9097875
50 | 2017-05-21 00:00:00,2046.5344625
51 | 2017-05-22 00:00:00,2090.6623125
52 | 2017-05-23 00:00:00,2287.7102875
53 | 2017-05-24 00:00:00,2379.1938333333333
54 | 2017-05-25 00:00:00,2387.2062857142855
55 | 2017-05-26 00:00:00,2211.976857142857
56 | 2017-05-27 00:00:00,2014.0529625
57 | 2017-05-28 00:00:00,2192.9808
58 | 2017-05-29 00:00:00,2275.9307
59 | 2017-05-30 00:00:00,2239.2053428571426
60 | 2017-05-31 00:00:00,2285.9339142857143
61 | 2017-06-01 00:00:00,2399.2426714285716
62 | 2017-06-02 00:00:00,2446.142414285714
63 | 2017-06-03 00:00:00,2525.7651584699997
64 | 2017-06-04 00:00:00,2516.173142857143
65 | 2017-06-05 00:00:00,2698.3138125
66 | 2017-06-06 00:00:00,2883.3136966371426
67 | 2017-06-07 00:00:00,2664.9208625
68 | 2017-06-08 00:00:00,2792.9991875
69 | 2017-06-09 00:00:00,2827.4913
70 | 2017-06-10 00:00:00,2845.3728571428574
71 | 2017-06-11 00:00:00,2961.8296124999997
72 | 2017-06-12 00:00:00,2657.6750625
73 | 2017-06-13 00:00:00,2748.185085714286
74 | 2017-06-14 00:00:00,2447.0415625
75 | 2017-06-15 00:00:00,2442.48025
76 | 2017-06-16 00:00:00,2464.9598142857144
77 | 2017-06-17 00:00:00,2665.927
78 | 2017-06-18 00:00:00,2507.389252144286
79 | 2017-06-19 00:00:00,2617.2102625
80 | 2017-06-20 00:00:00,2754.97825
81 | 2017-06-21 00:00:00,2671.04325
82 | 2017-06-22 00:00:00,2727.2880125
83 | 2017-06-23 00:00:00,2710.4122857142856
84 | 2017-06-24 00:00:00,2589.1648875
85 | 2017-06-25 00:00:00,2512.3662857142854
86 | 2017-06-26 00:00:00,2436.4510571428573
87 | 2017-06-27 00:00:00,2517.9031142857143
88 | 2017-06-28 00:00:00,2585.349185714286
89 | 2017-06-29 00:00:00,2544.414475
90 | 2017-06-30 00:00:00,2477.641375
91 | 2017-07-01 00:00:00,2434.0778625
92 | 2017-07-02 00:00:00,2501.191342857143
93 | 2017-07-03 00:00:00,2561.225428571429
94 | 2017-07-04 00:00:00,2599.7298375
95 | 2017-07-05 00:00:00,2619.1875030042856
96 | 2017-07-06 00:00:00,2609.96775
97 | 2017-07-07 00:00:00,2491.201214285714
98 | 2017-07-08 00:00:00,2562.1306624999997
99 | 2017-07-09 00:00:00,2536.2389375
100 | 2017-07-10 00:00:00,2366.1701428571428
101 | 2017-07-11 00:00:00,2369.8621285714285
102 | 2017-07-12 00:00:00,2385.7485714285717
103 | 2017-07-13 00:00:00,2354.7834166666667
104 | 2017-07-14 00:00:00,2190.947833333333
105 | 2017-07-15 00:00:00,2058.9955999999997
106 | 2017-07-16 00:00:00,1931.2143
107 | 2017-07-17 00:00:00,2176.6234875
108 | 2017-07-18 00:00:00,2320.12225
109 | 2017-07-19 00:00:00,2264.7657
110 | 2017-07-20 00:00:00,2898.1884166666664
111 | 2017-07-21 00:00:00,2682.1953625
112 | 2017-07-22 00:00:00,2807.609857142857
113 | 2017-07-23 00:00:00,2725.549716666667
114 | 2017-07-24 00:00:00,2751.821028571429
115 | 2017-07-25 00:00:00,2560.9979166666667
116 | 2017-07-26 00:00:00,2495.028585714286
117 | 2017-07-27 00:00:00,2647.625
118 | 2017-07-28 00:00:00,2781.636583333333
119 | 2017-07-29 00:00:00,2722.512785714286
120 | 2017-07-30 00:00:00,2745.955416666666
121 | 2017-07-31 00:00:00,2866.431666666667
122 | 2017-08-01 00:00:00,2710.4130666666665
123 | 2017-08-02 00:00:00,2693.6339833333336
124 | 2017-08-03 00:00:00,2794.117716666666
125 | 2017-08-04 00:00:00,2873.8510833333335
126 | 2017-08-05 00:00:00,3218.1150166666666
127 | 2017-08-06 00:00:00,3252.5625333333332
128 | 2017-08-07 00:00:00,3407.2268333333336
129 | 2017-08-08 00:00:00,3457.374333333333
130 | 2017-08-09 00:00:00,3357.326316666667
131 | 2017-08-10 00:00:00,3424.4042000000004
132 | 2017-08-11 00:00:00,3632.5066666666667
133 | 2017-08-12 00:00:00,3852.8029142857145
134 | 2017-08-13 00:00:00,4125.54802
135 | 2017-08-14 00:00:00,4282.992
136 | 2017-08-15 00:00:00,4217.028328571429
137 | 2017-08-16 00:00:00,4360.876871428572
138 | 2017-08-17 00:00:00,4328.725716666667
139 | 2017-08-18 00:00:00,4130.440066666667
140 | 2017-08-19 00:00:00,4222.662214285714
141 | 2017-08-20 00:00:00,4157.958033333333
142 | 2017-08-21 00:00:00,4043.722
143 | 2017-08-22 00:00:00,4082.180983333333
144 | 2017-08-23 00:00:00,4174.95
145 | 2017-08-24 00:00:00,4340.316716666667
146 | 2017-08-25 00:00:00,4363.05445
147 | 2017-08-26 00:00:00,4360.5133166666665
148 | 2017-08-27 00:00:00,4354.308333333333
149 | 2017-08-28 00:00:00,4391.673516666667
150 | 2017-08-29 00:00:00,4607.98545
151 | 2017-08-30 00:00:00,4594.98785
152 | 2017-08-31 00:00:00,4748.255
153 | 2017-09-01 00:00:00,4911.740016666667
154 | 2017-09-02 00:00:00,4580.387479999999
155 | 2017-09-03 00:00:00,4648.159983333334
156 | 2017-09-04 00:00:00,4344.0983166666665
157 | 2017-09-05 00:00:00,4488.72014
158 | 2017-09-06 00:00:00,4641.822016666666
159 | 2017-09-07 00:00:00,4654.6585000000005
160 | 2017-09-08 00:00:00,4310.750183333334
161 | 2017-09-09 00:00:00,4375.55952
162 | 2017-09-10 00:00:00,4329.955
163 | 2017-09-11 00:00:00,4248.090016666666
164 | 2017-09-12 00:00:00,4219.036616666667
165 | 2017-09-13 00:00:00,3961.2712666666666
166 | 2017-09-14 00:00:00,3319.6299999999997
167 | 2017-09-15 00:00:00,3774.2652833333336
168 | 2017-09-16 00:00:00,3763.62604
169 | 2017-09-17 00:00:00,3746.060783333333
170 | 2017-09-18 00:00:00,4093.316666666667
171 | 2017-09-19 00:00:00,3943.4133333333334
172 | 2017-09-20 00:00:00,3977.5616666666665
173 | 2017-09-21 00:00:00,3658.8981833333332
174 | 2017-09-22 00:00:00,3637.5025499999997
175 | 2017-09-23 00:00:00,3776.3869
176 | 2017-09-24 00:00:00,3703.0406500000004
177 | 2017-09-25 00:00:00,3942.5550000000003
178 | 2017-09-26 00:00:00,3910.3073833333333
179 | 2017-09-27 00:00:00,4202.554983333333
180 | 2017-09-28 00:00:00,4201.98905
181 | 2017-09-29 00:00:00,4193.574666666666
182 | 2017-09-30 00:00:00,4335.368316666667
183 | 2017-10-01 00:00:00,4360.722966666667
184 | 2017-10-02 00:00:00,4386.88375
185 | 2017-10-03 00:00:00,4293.3066
186 | 2017-10-04 00:00:00,4225.175
187 | 2017-10-05 00:00:00,4338.852
188 | 2017-10-06 00:00:00,4345.6033333333335
189 | 2017-10-07 00:00:00,4376.191666666667
190 | 2017-10-08 00:00:00,4602.280883333334
191 | 2017-10-09 00:00:00,4777.967816666666
192 | 2017-10-10 00:00:00,4782.28
193 | 2017-10-11 00:00:00,4819.485766666667
194 | 2017-10-12 00:00:00,5325.130683333333
195 | 2017-10-13 00:00:00,5563.806566666666
196 | 2017-10-14 00:00:00,5739.438733333333
197 | 2017-10-15 00:00:00,5647.311666666667
198 | 2017-10-16 00:00:00,5711.205866666667
199 | 2017-10-17 00:00:00,5603.71294
200 | 2017-10-18 00:00:00,5546.176100000001
201 | 2017-10-19 00:00:00,5727.6335
202 | 2017-10-20 00:00:00,5979.45984
203 | 2017-10-21 00:00:00,6020.371683333334
204 | 2017-10-22 00:00:00,5983.184550000001
205 | 2017-10-23 00:00:00,5876.079866666667
206 | 2017-10-24 00:00:00,5505.827766666666
207 | 2017-10-25 00:00:00,5669.622533333334
208 | 2017-10-26 00:00:00,5893.138416666666
209 | 2017-10-27 00:00:00,5772.504983333333
210 | 2017-10-28 00:00:00,5776.6969500000005
211 | 2017-10-29 00:00:00,6155.43402
212 | 2017-10-30 00:00:00,6105.87422
213 | 2017-10-31 00:00:00,6388.645166666666
214 | 2017-11-01 00:00:00,6665.306683333333
215 | 2017-11-02 00:00:00,7068.020100000001
216 | 2017-11-03 00:00:00,7197.72006
217 | 2017-11-04 00:00:00,7437.543316666666
218 | 2017-11-05 00:00:00,7377.012366666667
219 | 2017-11-06 00:00:00,6989.071666666667
220 | 2017-11-07 00:00:00,7092.127233333333
221 | 2017-11-08 00:00:00,7415.878250000001
222 | 2017-11-09 00:00:00,7158.03706
223 | 2017-11-10 00:00:00,6719.39785
224 | 2017-11-11 00:00:00,6362.851033333333
225 | 2017-11-12 00:00:00,5716.301583333334
226 | 2017-11-13 00:00:00,6550.227533333334
227 | 2017-11-14 00:00:00,6635.412633333333
228 | 2017-11-15 00:00:00,7301.42992
229 | 2017-11-16 00:00:00,7815.0307
230 | 2017-11-17 00:00:00,7786.884366666666
231 | 2017-11-18 00:00:00,7817.1403833333325
232 | 2017-11-19 00:00:00,8007.654066666667
233 | 2017-11-20 00:00:00,8255.596816666666
234 | 2017-11-21 00:00:00,8059.8
235 | 2017-11-22 00:00:00,8268.035
236 | 2017-11-23 00:00:00,8148.95
237 | 2017-11-24 00:00:00,8250.978333333334
238 | 2017-11-25 00:00:00,8707.407266666667
239 | 2017-11-26 00:00:00,9284.1438
240 | 2017-11-27 00:00:00,9718.29505
241 | 2017-11-28 00:00:00,9952.50882
242 | 2017-11-29 00:00:00,9879.328333333333
243 | 2017-11-30 00:00:00,10147.372
244 | 2017-12-01 00:00:00,10883.912
245 | 2017-12-02 00:00:00,11071.368333333332
246 | 2017-12-03 00:00:00,11332.622
247 | 2017-12-04 00:00:00,11584.83
248 | 2017-12-05 00:00:00,11878.433333333334
249 | 2017-12-06 00:00:00,13540.980000000001
250 | 2017-12-07 00:00:00,16501.971666666668
251 | 2017-12-08 00:00:00,16007.436666666666
252 | 2017-12-09 00:00:00,15142.834152123332
253 | 2017-12-10 00:00:00,14869.805
254 | 2017-12-11 00:00:00,16762.116666666665
255 | 2017-12-12 00:00:00,17276.393333333333
256 | 2017-12-13 00:00:00,16808.366666666665
257 | 2017-12-14 00:00:00,16678.892
258 | 2017-12-15 00:00:00,17771.899999999998
259 | 2017-12-16 00:00:00,19498.683333333334
260 | 2017-12-17 00:00:00,19289.785
261 | 2017-12-18 00:00:00,18961.856666666667
262 | 2017-12-19 00:00:00,17737.111666666668
263 | 2017-12-20 00:00:00,16026.271666666667
264 | 2017-12-21 00:00:00,16047.51
265 | 2017-12-22 00:00:00,15190.945
266 | 2017-12-23 00:00:00,15360.261666666667
267 | 2017-12-24 00:00:00,13949.175000000001
268 | 2017-12-25 00:00:00,14119.028333333334
269 | 2017-12-26 00:00:00,15999.048333333332
270 | 2017-12-27 00:00:00,15589.321666666665
271 | 2017-12-28 00:00:00,14380.581666666667
272 | 2017-12-29 00:00:00,14640.14
273 | 2017-12-30 00:00:00,13215.573999999999
274 | 2017-12-31 00:00:00,14165.574999999999
275 | 2018-01-01 00:00:00,13812.186666666666
276 | 2018-01-02 00:00:00,15005.856666666667
277 | 2018-01-03 00:00:00,15053.261666666665
278 | 2018-01-04 00:00:00,15199.355000000001
279 | 2018-01-05 00:00:00,17174.12
280 | 2018-01-06 00:00:00,17319.198
281 | 2018-01-07 00:00:00,16651.471666666668
282 | 2018-01-08 00:00:00,15265.906666666668
283 | 2018-01-09 00:00:00,14714.253333333334
284 | 2018-01-10 00:00:00,15126.398333333333
285 | 2018-01-11 00:00:00,13296.794
286 | 2018-01-12 00:00:00,13912.882000000001
287 | 2018-01-13 00:00:00,14499.773333333333
288 | 2018-01-14 00:00:00,13852.92
289 | 2018-01-15 00:00:00,14012.196
290 | 2018-01-16 00:00:00,11180.998333333331
291 | 2018-01-17 00:00:00,11116.946666666669
292 | 2018-01-18 00:00:00,11345.423333333332
293 | 2018-01-19 00:00:00,11422.44
294 | 2018-01-20 00:00:00,12950.793333333333
295 | 2018-01-21 00:00:00,11505.228
296 | 2018-01-22 00:00:00,10544.593333333332
297 | 2018-01-23 00:00:00,11223.064
298 | 2018-01-24 00:00:00,11282.258333333333
299 | 2018-01-25 00:00:00,11214.44
300 | 2018-01-26 00:00:00,10969.815
301 | 2018-01-27 00:00:00,11524.776666666667
302 | 2018-01-28 00:00:00,11765.71
303 | 2018-01-29 00:00:00,11212.654999999999
304 | 2018-01-30 00:00:00,10184.061666666666
305 | 2018-01-31 00:00:00,10125.013333333334
306 | 2018-02-01 00:00:00,9083.258333333333
307 | 2018-02-02 00:00:00,8901.901666666667
308 | 2018-02-03 00:00:00,9076.678333333333
309 | 2018-02-04 00:00:00,8400.648333333333
310 | 2018-02-05 00:00:00,6838.816666666667
311 | 2018-02-06 00:00:00,7685.633333333334
312 | 2018-02-07 00:00:00,8099.958333333333
313 | 2018-02-08 00:00:00,8240.536666666667
314 | 2018-02-09 00:00:00,8535.516666666668
315 | 2018-02-10 00:00:00,8319.876566184
316 | 2018-02-11 00:00:00,8343.455
317 | 2018-02-12 00:00:00,8811.343333333332
318 | 2018-02-13 00:00:00,8597.7675
319 | 2018-02-14 00:00:00,9334.633333333333
320 | 2018-02-15 00:00:00,9977.154
321 | 2018-02-16 00:00:00,10127.161666666667
322 | 2018-02-17 00:00:00,10841.991666666667
323 | 2018-02-18 00:00:00,10503.298333333334
324 | 2018-02-19 00:00:00,11110.964999999998
325 | 2018-02-20 00:00:00,11390.391666666668
326 | 2018-02-21 00:00:00,10532.791666666666
327 | 2018-02-22 00:00:00,9931.071666666667
328 | 2018-02-23 00:00:00,10162.116666666667
329 | 2018-02-24 00:00:00,9697.956
330 | 2018-02-25 00:00:00,9696.593333333332
331 | 2018-02-26 00:00:00,10348.603333333334
332 | 2018-02-27 00:00:00,10763.883333333333
333 | 2018-02-28 00:00:00,10370.164999999999
334 | 2018-03-01 00:00:00,11009.381666666668
335 | 2018-03-02 00:00:00,11055.815
336 | 2018-03-03 00:00:00,11326.948333333334
337 | 2018-03-04 00:00:00,11430.181666666665
338 | 2018-03-05 00:00:00,11595.54
339 | 2018-03-06 00:00:00,10763.198333333334
340 | 2018-03-07 00:00:00,10118.058
341 | 2018-03-08 00:00:00,9429.111666666666
342 | 2018-03-09 00:00:00,9089.278333333334
343 | 2018-03-10 00:00:00,8746.002
344 | 2018-03-11 00:00:00,9761.396666666666
345 | 2018-03-12 00:00:00,9182.843333333332
346 | 2018-03-13 00:00:00,9154.699999999999
347 | 2018-03-14 00:00:00,8151.531666666667
348 | 2018-03-15 00:00:00,8358.121666666666
349 | 2018-03-16 00:00:00,8530.402
350 | 2018-03-17 00:00:00,7993.674643641666
351 | 2018-03-18 00:00:00,8171.415
352 | 2018-03-19 00:00:00,8412.033333333333
353 | 2018-03-20 00:00:00,8986.948333333334
354 | 2018-03-21 00:00:00,8947.753333333334
355 | 2018-03-22 00:00:00,8690.408333333333
356 | 2018-03-23 00:00:00,8686.826666666666
357 | 2018-03-24 00:00:00,8662.378333333334
358 | 2018-03-25 00:00:00,8617.296666666667
359 | 2018-03-26 00:00:00,8197.548333333334
360 | 2018-03-27 00:00:00,7876.195
361 | 2018-03-28 00:00:00,7960.38
362 | 2018-03-29 00:00:00,7172.28
363 | 2018-03-30 00:00:00,6882.531666666667
364 | 2018-03-31 00:00:00,6935.48
365 | 2018-04-01 00:00:00,6794.105
366 |
--------------------------------------------------------------------------------
/data/eth-price.csv:
--------------------------------------------------------------------------------
1 | "Date(UTC)","UnixTimeStamp","Value"
2 | "4/2/2017","1491091200","48.55"
3 | "4/3/2017","1491177600","44.13"
4 | "4/4/2017","1491264000","44.43"
5 | "4/5/2017","1491350400","44.90"
6 | "4/6/2017","1491436800","43.23"
7 | "4/7/2017","1491523200","42.31"
8 | "4/8/2017","1491609600","44.37"
9 | "4/9/2017","1491696000","43.72"
10 | "4/10/2017","1491782400","43.74"
11 | "4/11/2017","1491868800","43.74"
12 | "4/12/2017","1491955200","46.38"
13 | "4/13/2017","1492041600","49.97"
14 | "4/14/2017","1492128000","47.32"
15 | "4/15/2017","1492214400","48.89"
16 | "4/16/2017","1492300800","48.22"
17 | "4/17/2017","1492387200","47.94"
18 | "4/18/2017","1492473600","49.88"
19 | "4/19/2017","1492560000","47.88"
20 | "4/20/2017","1492646400","49.36"
21 | "4/21/2017","1492732800","48.27"
22 | "4/22/2017","1492819200","48.41"
23 | "4/23/2017","1492905600","48.75"
24 | "4/24/2017","1492992000","49.94"
25 | "4/25/2017","1493078400","50.09"
26 | "4/26/2017","1493164800","53.28"
27 | "4/27/2017","1493251200","63.14"
28 | "4/28/2017","1493337600","72.42"
29 | "4/29/2017","1493424000","69.83"
30 | "4/30/2017","1493510400","79.83"
31 | "5/1/2017","1493596800","77.53"
32 | "5/2/2017","1493683200","77.25"
33 | "5/3/2017","1493769600","80.37"
34 | "5/4/2017","1493856000","94.55"
35 | "5/5/2017","1493942400","90.79"
36 | "5/6/2017","1494028800","94.82"
37 | "5/7/2017","1494115200","90.46"
38 | "5/8/2017","1494201600","88.39"
39 | "5/9/2017","1494288000","86.27"
40 | "5/10/2017","1494374400","87.83"
41 | "5/11/2017","1494460800","88.20"
42 | "5/12/2017","1494547200","85.15"
43 | "5/13/2017","1494633600","87.96"
44 | "5/14/2017","1494720000","88.72"
45 | "5/15/2017","1494806400","90.32"
46 | "5/16/2017","1494892800","87.80"
47 | "5/17/2017","1494979200","86.98"
48 | "5/18/2017","1495065600","95.88"
49 | "5/19/2017","1495152000","124.38"
50 | "5/20/2017","1495238400","123.06"
51 | "5/21/2017","1495324800","148.00"
52 | "5/22/2017","1495411200","160.39"
53 | "5/23/2017","1495497600","169.50"
54 | "5/24/2017","1495584000","193.03"
55 | "5/25/2017","1495670400","177.33"
56 | "5/26/2017","1495756800","162.83"
57 | "5/27/2017","1495843200","156.63"
58 | "5/28/2017","1495929600","172.86"
59 | "5/29/2017","1496016000","194.17"
60 | "5/30/2017","1496102400","228.58"
61 | "5/31/2017","1496188800","228.64"
62 | "6/1/2017","1496275200","220.70"
63 | "6/2/2017","1496361600","222.04"
64 | "6/3/2017","1496448000","224.30"
65 | "6/4/2017","1496534400","244.96"
66 | "6/5/2017","1496620800","247.75"
67 | "6/6/2017","1496707200","264.26"
68 | "6/7/2017","1496793600","255.77"
69 | "6/8/2017","1496880000","259.41"
70 | "6/9/2017","1496966400","279.11"
71 | "6/10/2017","1497052800","335.95"
72 | "6/11/2017","1497139200","339.68"
73 | "6/12/2017","1497225600","394.66"
74 | "6/13/2017","1497312000","388.09"
75 | "6/14/2017","1497398400","343.84"
76 | "6/15/2017","1497484800","344.68"
77 | "6/16/2017","1497571200","353.61"
78 | "6/17/2017","1497657600","368.10"
79 | "6/18/2017","1497744000","351.53"
80 | "6/19/2017","1497830400","358.20"
81 | "6/20/2017","1497916800","350.53"
82 | "6/21/2017","1498003200","325.30"
83 | "6/22/2017","1498089600","320.97"
84 | "6/23/2017","1498176000","326.85"
85 | "6/24/2017","1498262400","304.54"
86 | "6/25/2017","1498348800","279.36"
87 | "6/26/2017","1498435200","253.68"
88 | "6/27/2017","1498521600","286.14"
89 | "6/28/2017","1498608000","315.86"
90 | "6/29/2017","1498694400","292.90"
91 | "6/30/2017","1498780800","280.68"
92 | "7/1/2017","1498867200","261.00"
93 | "7/2/2017","1498953600","283.99"
94 | "7/3/2017","1499040000","276.41"
95 | "7/4/2017","1499126400","269.05"
96 | "7/5/2017","1499212800","266.00"
97 | "7/6/2017","1499299200","265.88"
98 | "7/7/2017","1499385600","240.94"
99 | "7/8/2017","1499472000","245.67"
100 | "7/9/2017","1499558400","237.72"
101 | "7/10/2017","1499644800","205.76"
102 | "7/11/2017","1499731200","190.55"
103 | "7/12/2017","1499817600","224.15"
104 | "7/13/2017","1499904000","205.41"
105 | "7/14/2017","1499990400","197.14"
106 | "7/15/2017","1500076800","169.10"
107 | "7/16/2017","1500163200","155.42"
108 | "7/17/2017","1500249600","189.97"
109 | "7/18/2017","1500336000","227.09"
110 | "7/19/2017","1500422400","194.41"
111 | "7/20/2017","1500508800","226.33"
112 | "7/21/2017","1500595200","216.33"
113 | "7/22/2017","1500681600","230.47"
114 | "7/23/2017","1500768000","228.32"
115 | "7/24/2017","1500854400","225.48"
116 | "7/25/2017","1500940800","203.59"
117 | "7/26/2017","1501027200","202.88"
118 | "7/27/2017","1501113600","202.93"
119 | "7/28/2017","1501200000","191.21"
120 | "7/29/2017","1501286400","206.14"
121 | "7/30/2017","1501372800","196.78"
122 | "7/31/2017","1501459200","201.33"
123 | "8/1/2017","1501545600","225.90"
124 | "8/2/2017","1501632000","218.12"
125 | "8/3/2017","1501718400","224.39"
126 | "8/4/2017","1501804800","220.60"
127 | "8/5/2017","1501891200","253.09"
128 | "8/6/2017","1501977600","264.56"
129 | "8/7/2017","1502064000","269.94"
130 | "8/8/2017","1502150400","296.51"
131 | "8/9/2017","1502236800","295.28"
132 | "8/10/2017","1502323200","298.28"
133 | "8/11/2017","1502409600","309.32"
134 | "8/12/2017","1502496000","308.02"
135 | "8/13/2017","1502582400","296.62"
136 | "8/14/2017","1502668800","299.16"
137 | "8/15/2017","1502755200","286.52"
138 | "8/16/2017","1502841600","301.38"
139 | "8/17/2017","1502928000","300.30"
140 | "8/18/2017","1503014400","292.62"
141 | "8/19/2017","1503100800","293.02"
142 | "8/20/2017","1503187200","298.20"
143 | "8/21/2017","1503273600","321.85"
144 | "8/22/2017","1503360000","313.37"
145 | "8/23/2017","1503446400","317.40"
146 | "8/24/2017","1503532800","325.28"
147 | "8/25/2017","1503619200","330.06"
148 | "8/26/2017","1503705600","332.86"
149 | "8/27/2017","1503792000","347.88"
150 | "8/28/2017","1503878400","347.66"
151 | "8/29/2017","1503964800","372.35"
152 | "8/30/2017","1504051200","383.86"
153 | "8/31/2017","1504137600","388.33"
154 | "9/1/2017","1504224000","391.42"
155 | "9/2/2017","1504310400","351.03"
156 | "9/3/2017","1504396800","352.45"
157 | "9/4/2017","1504483200","303.70"
158 | "9/5/2017","1504569600","317.94"
159 | "9/6/2017","1504656000","338.92"
160 | "9/7/2017","1504742400","335.37"
161 | "9/8/2017","1504828800","306.72"
162 | "9/9/2017","1504915200","303.79"
163 | "9/10/2017","1505001600","299.21"
164 | "9/11/2017","1505088000","297.95"
165 | "9/12/2017","1505174400","294.10"
166 | "9/13/2017","1505260800","275.84"
167 | "9/14/2017","1505347200","223.14"
168 | "9/15/2017","1505433600","259.57"
169 | "9/16/2017","1505520000","254.49"
170 | "9/17/2017","1505606400","258.40"
171 | "9/18/2017","1505692800","297.53"
172 | "9/19/2017","1505779200","283.00"
173 | "9/20/2017","1505865600","283.56"
174 | "9/21/2017","1505952000","257.77"
175 | "9/22/2017","1506038400","262.94"
176 | "9/23/2017","1506124800","286.14"
177 | "9/24/2017","1506211200","282.60"
178 | "9/25/2017","1506297600","294.89"
179 | "9/26/2017","1506384000","288.64"
180 | "9/27/2017","1506470400","309.97"
181 | "9/28/2017","1506556800","302.77"
182 | "9/29/2017","1506643200","292.58"
183 | "9/30/2017","1506729600","302.77"
184 | "10/1/2017","1506816000","303.95"
185 | "10/2/2017","1506902400","296.81"
186 | "10/3/2017","1506988800","291.81"
187 | "10/4/2017","1507075200","291.68"
188 | "10/5/2017","1507161600","294.99"
189 | "10/6/2017","1507248000","308.33"
190 | "10/7/2017","1507334400","311.26"
191 | "10/8/2017","1507420800","309.49"
192 | "10/9/2017","1507507200","296.95"
193 | "10/10/2017","1507593600","298.46"
194 | "10/11/2017","1507680000","302.86"
195 | "10/12/2017","1507766400","302.89"
196 | "10/13/2017","1507852800","336.83"
197 | "10/14/2017","1507939200","338.81"
198 | "10/15/2017","1508025600","336.58"
199 | "10/16/2017","1508112000","334.23"
200 | "10/17/2017","1508198400","316.14"
201 | "10/18/2017","1508284800","313.54"
202 | "10/19/2017","1508371200","307.41"
203 | "10/20/2017","1508457600","303.08"
204 | "10/21/2017","1508544000","299.55"
205 | "10/22/2017","1508630400","294.03"
206 | "10/23/2017","1508716800","285.27"
207 | "10/24/2017","1508803200","296.50"
208 | "10/25/2017","1508889600","296.35"
209 | "10/26/2017","1508976000","295.54"
210 | "10/27/2017","1509062400","296.36"
211 | "10/28/2017","1509148800","293.35"
212 | "10/29/2017","1509235200","304.04"
213 | "10/30/2017","1509321600","306.80"
214 | "10/31/2017","1509408000","303.64"
215 | "11/1/2017","1509494400","289.42"
216 | "11/2/2017","1509580800","284.92"
217 | "11/3/2017","1509667200","304.51"
218 | "11/4/2017","1509753600","300.04"
219 | "11/5/2017","1509840000","296.23"
220 | "11/6/2017","1509926400","296.82"
221 | "11/7/2017","1510012800","291.84"
222 | "11/8/2017","1510099200","307.35"
223 | "11/9/2017","1510185600","319.66"
224 | "11/10/2017","1510272000","296.86"
225 | "11/11/2017","1510358400","314.23"
226 | "11/12/2017","1510444800","306.02"
227 | "11/13/2017","1510531200","314.60"
228 | "11/14/2017","1510617600","334.72"
229 | "11/15/2017","1510704000","331.20"
230 | "11/16/2017","1510790400","330.32"
231 | "11/17/2017","1510876800","331.72"
232 | "11/18/2017","1510963200","346.65"
233 | "11/19/2017","1511049600","354.60"
234 | "11/20/2017","1511136000","367.71"
235 | "11/21/2017","1511222400","360.52"
236 | "11/22/2017","1511308800","380.84"
237 | "11/23/2017","1511395200","406.57"
238 | "11/24/2017","1511481600","470.43"
239 | "11/25/2017","1511568000","464.61"
240 | "11/26/2017","1511654400","470.54"
241 | "11/27/2017","1511740800","475.24"
242 | "11/28/2017","1511827200","466.27"
243 | "11/29/2017","1511913600","427.42"
244 | "11/30/2017","1512000000","434.85"
245 | "12/1/2017","1512086400","461.58"
246 | "12/2/2017","1512172800","457.96"
247 | "12/3/2017","1512259200","462.81"
248 | "12/4/2017","1512345600","466.93"
249 | "12/5/2017","1512432000","453.96"
250 | "12/6/2017","1512518400","422.48"
251 | "12/7/2017","1512604800","421.15"
252 | "12/11/2017","1512950400","513.29"
253 | "12/12/2017","1513036800","656.52"
254 | "12/13/2017","1513123200","699.09"
255 | "12/14/2017","1513209600","693.58"
256 | "12/15/2017","1513296000","684.27"
257 | "12/16/2017","1513382400","692.83"
258 | "12/17/2017","1513468800","717.71"
259 | "12/18/2017","1513555200","785.99"
260 | "12/19/2017","1513641600","812.50"
261 | "12/20/2017","1513728000","799.17"
262 | "12/21/2017","1513814400","789.39"
263 | "12/22/2017","1513900800","657.83"
264 | "12/23/2017","1513987200","700.44"
265 | "12/24/2017","1514073600","675.91"
266 | "12/25/2017","1514160000","723.14"
267 | "12/26/2017","1514246400","753.40"
268 | "12/27/2017","1514332800","739.94"
269 | "12/28/2017","1514419200","716.69"
270 | "12/29/2017","1514505600","739.60"
271 | "12/30/2017","1514592000","692.99"
272 | "12/31/2017","1514678400","741.13"
273 | "1/1/2018","1514764800","756.20"
274 | "1/2/2018","1514851200","861.97"
275 | "1/3/2018","1514937600","941.10"
276 | "1/4/2018","1515024000","944.83"
277 | "1/5/2018","1515110400","967.13"
278 | "1/6/2018","1515196800","1006.41"
279 | "1/7/2018","1515283200","1117.75"
280 | "1/8/2018","1515369600","1136.11"
281 | "1/9/2018","1515456000","1289.24"
282 | "1/10/2018","1515542400","1248.99"
283 | "1/11/2018","1515628800","1139.32"
284 | "1/12/2018","1515715200","1261.03"
285 | "1/13/2018","1515801600","1385.02"
286 | "1/14/2018","1515888000","1359.48"
287 | "1/15/2018","1515974400","1278.69"
288 | "1/16/2018","1516060800","1050.26"
289 | "1/17/2018","1516147200","1024.69"
290 | "1/18/2018","1516233600","1012.97"
291 | "1/19/2018","1516320000","1037.36"
292 | "1/20/2018","1516406400","1150.50"
293 | "1/21/2018","1516492800","1049.09"
294 | "1/22/2018","1516579200","999.64"
295 | "1/23/2018","1516665600","984.47"
296 | "1/24/2018","1516752000","1061.78"
297 | "1/25/2018","1516838400","1046.37"
298 | "1/26/2018","1516924800","1048.58"
299 | "1/27/2018","1517011200","1109.08"
300 | "1/28/2018","1517097600","1231.58"
301 | "1/29/2018","1517184000","1169.96"
302 | "1/30/2018","1517270400","1063.75"
303 | "1/31/2018","1517356800","1111.31"
304 | "2/1/2018","1517443200","1026.19"
305 | "2/2/2018","1517529600","917.47"
306 | "2/3/2018","1517616000","970.87"
307 | "2/4/2018","1517702400","827.59"
308 | "2/5/2018","1517788800","695.08"
309 | "2/6/2018","1517875200","785.01"
310 | "2/7/2018","1517961600","751.81"
311 | "2/8/2018","1518048000","813.55"
312 | "2/9/2018","1518134400","877.88"
313 | "2/10/2018","1518220800","850.75"
314 | "2/11/2018","1518307200","811.24"
315 | "2/12/2018","1518393600","865.27"
316 | "2/13/2018","1518480000","840.98"
317 | "2/14/2018","1518566400","920.11"
318 | "2/15/2018","1518652800","927.95"
319 | "2/16/2018","1518739200","938.02"
320 | "2/17/2018","1518825600","974.77"
321 | "2/18/2018","1518912000","913.90"
322 | "2/19/2018","1518998400","939.79"
323 | "2/20/2018","1519084800","885.52"
324 | "2/21/2018","1519171200","840.10"
325 | "2/22/2018","1519257600","804.63"
326 | "2/23/2018","1519344000","854.70"
327 | "2/24/2018","1519430400","833.49"
328 | "2/25/2018","1519516800","840.28"
329 | "2/26/2018","1519603200","867.62"
330 | "2/27/2018","1519689600","871.58"
331 | "2/28/2018","1519776000","851.50"
332 | "3/1/2018","1519862400","869.87"
333 | "3/2/2018","1519948800","855.60"
334 | "3/3/2018","1520035200","855.65"
335 | "3/4/2018","1520121600","864.83"
336 | "3/5/2018","1520208000","849.42"
337 | "3/6/2018","1520294400","815.69"
338 | "3/7/2018","1520380800","751.13"
339 | "3/8/2018","1520467200","698.83"
340 | "3/9/2018","1520553600","726.92"
341 | "3/10/2018","1520640000","682.30"
342 | "3/11/2018","1520726400","720.36"
343 | "3/12/2018","1520812800","697.02"
344 | "3/13/2018","1520899200","689.96"
345 | "3/14/2018","1520985600","613.15"
346 | "3/15/2018","1521072000","610.56"
347 | "3/16/2018","1521158400","600.53"
348 | "3/17/2018","1521244800","549.79"
349 | "3/18/2018","1521331200","537.38"
350 | "3/19/2018","1521417600","555.55"
351 | "3/20/2018","1521504000","557.57"
352 | "3/21/2018","1521590400","559.91"
353 | "3/22/2018","1521676800","539.89"
354 | "3/23/2018","1521763200","543.83"
355 | "3/24/2018","1521849600","520.16"
356 | "3/25/2018","1521936000","523.01"
357 | "3/26/2018","1522022400","486.25"
358 | "3/27/2018","1522108800","448.78"
359 | "3/28/2018","1522195200","445.93"
360 | "3/29/2018","1522281600","383.90"
361 | "3/30/2018","1522368000","393.82"
362 | "3/31/2018","1522454400","394.07"
363 | "4/1/2018","1522540800","378.85"
364 |
--------------------------------------------------------------------------------