├── 1-Numpy.ipynb
├── 2-Pandas.ipynb
├── 3-Data Manipulation and Analysis.ipynb
├── 4- Null Values with Pandas.ipynb
├── 5-Data Visualization.ipynb
├── Readme.md
├── banner1.png
└── sales.csv
/1-Numpy.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "a2f9400f",
6 | "metadata": {},
7 | "source": [
8 | "# NumPy in Python for Data Science"
9 | ]
10 | },
11 | {
12 | "cell_type": "markdown",
13 | "id": "3ce682f0",
14 | "metadata": {},
15 | "source": [
16 | "### NumPy (Numerical Python)\n",
17 | "\n",
18 | "NumPy (Numerical Python) is an open-source Python library that’s used in almost every field of science and engineering. It’s the de facto standard for working with numerical data in Python, and it’s at the core of the scientific Python and PyData ecosystems. NumPy users include everyone from beginning coders to experienced researchers doing state-of-the-art scientific and industrial research and development. The NumPy API is used extensively in pandas, SciPy, Matplotlib, scikit-learn, scikit-image and most other data science and scientific Python packages.\n",
19 | "\n",
20 | "The NumPy library contains multidimensional array and matrix data structures (you’ll find more information about this in later sections). It provides `ndarray`, a homogeneous n-dimensional array object, with methods to operate on it efficiently. NumPy can be used to perform a wide variety of mathematical operations on arrays. It adds powerful data structures to Python that enable efficient calculations with arrays and matrices, and it supplies an enormous library of high-level mathematical functions that operate on these arrays and matrices.\n",
21 | "\n",
22 | "### What’s the difference between a Python list and a NumPy array?\n",
23 | "\n",
24 | "NumPy gives you an enormous range of fast and efficient ways of creating arrays and manipulating the numerical data inside them. While a Python list can contain different data types within a single list, all of the elements in a NumPy array share one data type, that is, the array is homogeneous. The mathematical operations that are meant to be performed on arrays would be extremely inefficient if the arrays weren’t homogeneous.\n",
25 | "\n",
26 | "**Why use NumPy?**\n",
27 | "\n",
28 | "NumPy arrays are faster and more compact than Python lists. An array uses much less memory to store the same data and is convenient to use, and NumPy provides a mechanism for specifying data types, which allows the code to be optimized even further."
29 | ]
30 | },
31 | {
32 | "cell_type": "markdown",
33 | "id": "56bccc4c",
34 | "metadata": {},
35 | "source": [
36 | "## Section 3.1: Arrays & Their Types"
37 | ]
38 | },
39 | {
40 | "cell_type": "markdown",
41 | "id": "363ba426",
42 | "metadata": {},
43 | "source": [
44 | ">*An array is a central data structure of the NumPy library. An array is a grid of values and it contains information about the raw data, how to locate an element, and how to interpret an element.*\n",
45 | "1. A vector is an array with a single dimension (there’s no difference between row and column vectors)\n",
46 | "2. A matrix refers to an array with two dimensions\n",
47 | "3. For 3-D or higher-dimensional arrays, the term tensor is also commonly used"
48 | ]
49 | },
50 | {
51 | "cell_type": "markdown",
52 | "id": "318f5813",
53 | "metadata": {},
54 | "source": [
55 | "### 3.1.1: 1-D Arrays"
56 | ]
57 | },
58 | {
59 | "cell_type": "code",
60 | "execution_count": null,
61 | "id": "dfdac0f7",
62 | "metadata": {},
63 | "outputs": [
64 | {
65 | "data": {
66 | "text/plain": [
67 | "array([5, 5, 5])"
68 | ]
69 | },
70 | "metadata": {},
71 | "output_type": "display_data"
72 | }
73 | ],
74 | "source": [
75 | "import numpy as np\n",
76 | "\n",
77 | "a = np.array([5, 5, 5])\n",
78 | "a"
79 | ]
80 | },
81 | {
82 | "cell_type": "code",
83 | "execution_count": null,
84 | "id": "55c3ef44",
85 | "metadata": {},
86 | "outputs": [
87 | {
88 | "data": {
89 | "text/plain": [
90 | "array([ 2, 4, 6, 8, 10])"
91 | ]
92 | },
93 | "metadata": {},
94 | "output_type": "display_data"
95 | }
96 | ],
97 | "source": [
98 | "a = np.array([2,4,6,8,10])\n",
99 | "a"
100 | ]
101 | },
102 | {
103 | "cell_type": "code",
104 | "execution_count": null,
105 | "id": "0de66b66",
106 | "metadata": {},
107 | "outputs": [
108 | {
109 | "data": {
110 | "text/plain": [
111 | "array([1, 2, 3, 4, 5, 6])"
112 | ]
113 | },
114 | "metadata": {},
115 | "output_type": "display_data"
116 | }
117 | ],
118 | "source": [
119 | "a = np.array([1,2,3,4,5,6])\n",
120 | "a"
121 | ]
122 | },
123 | {
124 | "cell_type": "code",
125 | "execution_count": null,
126 | "id": "522e653c",
127 | "metadata": {},
128 | "outputs": [
129 | {
130 | "data": {
131 | "text/plain": [
132 | "array([0., 0., 0.])"
133 | ]
134 | },
135 | "metadata": {},
136 | "output_type": "display_data"
137 | }
138 | ],
139 | "source": [
140 | "b = np.zeros(3)\n",
141 | "b"
142 | ]
143 | },
144 | {
145 | "cell_type": "code",
146 | "execution_count": null,
147 | "id": "fe42cfc7",
148 | "metadata": {},
149 | "outputs": [
150 | {
151 | "data": {
152 | "text/plain": [
153 | "array([1., 1., 1., 1.])"
154 | ]
155 | },
156 | "metadata": {},
157 | "output_type": "display_data"
158 | }
159 | ],
160 | "source": [
161 | "c = np.ones(4)\n",
162 | "c"
163 | ]
164 | },
165 | {
166 | "cell_type": "markdown",
167 | "id": "e8a07511",
168 | "metadata": {},
169 | "source": [
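170 | ">Create an uninitialized array with 3 elements. `np.empty` does not set the values, so the contents are arbitrary (they may happen to display as zeros)"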
171 | ]
172 | },
173 | {
174 | "cell_type": "code",
175 | "execution_count": null,
176 | "id": "3334df1e",
177 | "metadata": {},
178 | "outputs": [
179 | {
180 | "data": {
181 | "text/plain": [
182 | "array([0., 0., 0.])"
183 | ]
184 | },
185 | "metadata": {},
186 | "output_type": "display_data"
187 | }
188 | ],
189 | "source": [
190 | "d = np.empty(3)\n",
191 | "d"
192 | ]
193 | },
194 | {
195 | "cell_type": "markdown",
196 | "id": "41ff9884",
197 | "metadata": {},
198 | "source": [
199 | ">Creating an array from a range of elements"
200 | ]
201 | },
202 | {
203 | "cell_type": "code",
204 | "execution_count": null,
205 | "id": "50199b39",
206 | "metadata": {},
207 | "outputs": [
208 | {
209 | "data": {
210 | "text/plain": [
211 | "array([0, 1, 2, 3, 4, 5, 6, 7, 8])"
212 | ]
213 | },
214 | "metadata": {},
215 | "output_type": "display_data"
216 | }
217 | ],
218 | "source": [
219 | "e = np.arange(9)\n",
220 | "e"
221 | ]
222 | },
223 | {
224 | "cell_type": "markdown",
225 | "id": "dcff9552",
226 | "metadata": {},
227 | "source": [
228 | "> Creating an array over a specific range of elements (the stop value is excluded)"
229 | ]
230 | },
231 | {
232 | "cell_type": "code",
233 | "execution_count": null,
234 | "id": "7b3a8999",
235 | "metadata": {},
236 | "outputs": [
237 | {
238 | "data": {
239 | "text/plain": [
240 | "array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])"
241 | ]
242 | },
243 | "metadata": {},
244 | "output_type": "display_data"
245 | }
246 | ],
247 | "source": [
248 | "f = np.arange(1, 15)\n",
249 | "f"
250 | ]
251 | },
252 | {
253 | "cell_type": "markdown",
254 | "id": "0604f3f7",
255 | "metadata": {},
256 | "source": [
257 | "> Creating an array over a specific range of elements with a specified step (5)\n"
258 | ]
259 | },
260 | {
261 | "cell_type": "code",
262 | "execution_count": null,
263 | "id": "9895eb8e",
264 | "metadata": {},
265 | "outputs": [
266 | {
267 | "data": {
268 | "text/plain": [
269 | "array([50, 55, 60, 65, 70, 75, 80, 85, 90, 95])"
270 | ]
271 | },
272 | "metadata": {},
273 | "output_type": "display_data"
274 | }
275 | ],
276 | "source": [
277 | "g = np.arange(50, 100, 5)\n",
278 | "g"
279 | ]
280 | },
281 | {
282 | "cell_type": "markdown",
283 | "id": "f3968e12",
284 | "metadata": {},
285 | "source": [
286 | "> Linearly spaced arrays (`np.linspace` includes the end value by default)"
287 | ]
288 | },
289 | {
290 | "cell_type": "code",
291 | "execution_count": null,
292 | "id": "6077433a",
293 | "metadata": {},
294 | "outputs": [
295 | {
296 | "data": {
297 | "text/plain": [
298 | "array([100., 105., 110., 115., 120.])"
299 | ]
300 | },
301 | "metadata": {},
302 | "output_type": "display_data"
303 | }
304 | ],
305 | "source": [
306 | "h = np.linspace(100, 120, num=5)\n",
307 | "h"
308 | ]
309 | },
310 | {
311 | "cell_type": "markdown",
312 | "id": "a05fed06",
313 | "metadata": {},
314 | "source": [
315 | "> Specifying the data type of an array"
316 | ]
317 | },
318 | {
319 | "cell_type": "code",
320 | "execution_count": null,
321 | "id": "ae477c82",
322 | "metadata": {},
323 | "outputs": [
324 | {
325 | "data": {
326 | "text/plain": [
327 | "array([1, 1, 1, 1, 1], dtype=int8)"
328 | ]
329 | },
330 | "metadata": {},
331 | "output_type": "display_data"
332 | }
333 | ],
334 | "source": [
335 | "i = np.ones(5, dtype=np.int8)\n",
336 | "i"
337 | ]
338 | },
339 | {
340 | "cell_type": "code",
341 | "execution_count": null,
342 | "id": "3cf3b978",
343 | "metadata": {},
344 | "outputs": [
345 | {
346 | "data": {
347 | "text/plain": [
348 | "array([1., 1., 1., 1.])"
349 | ]
350 | },
351 | "metadata": {},
352 | "output_type": "display_data"
353 | }
354 | ],
355 | "source": [
356 | "j = np.ones(4, dtype=np.float64)\n",
357 | "j"
358 | ]
359 | },
360 | {
361 | "cell_type": "markdown",
362 | "id": "a5629a81",
363 | "metadata": {},
364 | "source": [
365 | "### 3.1.2: 2-D Arrays"
366 | ]
367 | },
368 | {
369 | "cell_type": "code",
370 | "execution_count": null,
371 | "id": "bc03ef8b",
372 | "metadata": {},
373 | "outputs": [
374 | {
375 | "data": {
376 | "text/plain": [
377 | "array([[1, 2, 3],\n",
378 | " [4, 5, 6],\n",
379 | " [7, 8, 9]])"
380 | ]
381 | },
382 | "metadata": {},
383 | "output_type": "display_data"
384 | }
385 | ],
386 | "source": [
387 | "b = np.array([[1,2,3], [4,5,6], [7,8,9]])\n",
388 | "b"
389 | ]
390 | },
391 | {
392 | "cell_type": "code",
393 | "execution_count": null,
394 | "id": "a1a63fcc",
395 | "metadata": {},
396 | "outputs": [
397 | {
398 | "data": {
399 | "text/plain": [
400 | "array([[0., 0., 0., 0.],\n",
401 | " [0., 0., 0., 0.],\n",
402 | " [0., 0., 0., 0.],\n",
403 | " [0., 0., 0., 0.],\n",
404 | " [0., 0., 0., 0.]])"
405 | ]
406 | },
407 | "metadata": {},
408 | "output_type": "display_data"
409 | }
410 | ],
411 | "source": [
412 | "a = np.zeros((5,4))\n",
413 | "a"
414 | ]
415 | },
416 | {
417 | "cell_type": "code",
418 | "execution_count": null,
419 | "id": "409d4aba",
420 | "metadata": {},
421 | "outputs": [
422 | {
423 | "data": {
424 | "text/plain": [
425 | "array([[1., 1., 1.],\n",
426 | " [1., 1., 1.]])"
427 | ]
428 | },
429 | "metadata": {},
430 | "output_type": "display_data"
431 | }
432 | ],
433 | "source": [
434 | "b = np.ones((2,3))\n",
435 | "b"
436 | ]
437 | },
438 | {
439 | "cell_type": "code",
440 | "execution_count": null,
441 | "id": "728c18fb",
442 | "metadata": {},
443 | "outputs": [
444 | {
445 | "data": {
446 | "text/plain": [
447 | "array([[0.00000000e+000, 0.00000000e+000, 0.00000000e+000],\n",
448 | " [0.00000000e+000, 0.00000000e+000, 1.29049947e-320],\n",
449 | " [8.34441742e-308, 9.79107192e-307, 3.33509775e-317]])"
450 | ]
451 | },
452 | "metadata": {},
453 | "output_type": "display_data"
454 | }
455 | ],
456 | "source": [
457 | "c = np.empty((3,3))\n",
458 | "c"
459 | ]
460 | },
461 | {
462 | "cell_type": "markdown",
463 | "id": "53510c93",
464 | "metadata": {},
465 | "source": [
466 | "### 3.1.3: 3-D Arrays"
467 | ]
468 | },
469 | {
470 | "cell_type": "code",
471 | "execution_count": null,
472 | "id": "4f448214",
473 | "metadata": {},
474 | "outputs": [
475 | {
476 | "data": {
477 | "text/plain": [
478 | "array([[[1, 2, 3, 4],\n",
479 | " [2, 3, 4, 5],\n",
480 | " [9, 8, 7, 6]],\n",
481 | "\n",
482 | " [[7, 8, 9, 4],\n",
483 | " [1, 4, 7, 8],\n",
484 | " [1, 5, 9, 7]],\n",
485 | "\n",
486 | " [[4, 5, 6, 8],\n",
487 | " [3, 2, 1, 4],\n",
488 | " [2, 5, 4, 8]]])"
489 | ]
490 | },
491 | "metadata": {},
492 | "output_type": "display_data"
493 | }
494 | ],
495 | "source": [
496 | "x = np.array([[[1,2,3,4], [2,3,4,5], [9,8,7,6]], [[7,8,9,4], [1,4,7,8], [1,5,9,7]], [[4,5,6,8], [3,2,1,4], [2,5,4,8]]])\n",
497 | "x"
498 | ]
499 | },
500 | {
501 | "cell_type": "markdown",
502 | "id": "a2717e77",
503 | "metadata": {},
504 | "source": [
505 | ">Creating and reshaping a 3-D array"
506 | ]
507 | },
508 | {
509 | "cell_type": "code",
510 | "execution_count": null,
511 | "id": "1ae8e7e1",
512 | "metadata": {},
513 | "outputs": [
514 | {
515 | "data": {
516 | "text/plain": [
517 | "array([[[ 0, 1, 2, 3, 4],\n",
518 | " [ 5, 6, 7, 8, 9],\n",
519 | " [10, 11, 12, 13, 14],\n",
520 | " [15, 16, 17, 18, 19]],\n",
521 | "\n",
522 | " [[20, 21, 22, 23, 24],\n",
523 | " [25, 26, 27, 28, 29],\n",
524 | " [30, 31, 32, 33, 34],\n",
525 | " [35, 36, 37, 38, 39]],\n",
526 | "\n",
527 | " [[40, 41, 42, 43, 44],\n",
528 | " [45, 46, 47, 48, 49],\n",
529 | " [50, 51, 52, 53, 54],\n",
530 | " [55, 56, 57, 58, 59]]])"
531 | ]
532 | },
533 | "metadata": {},
534 | "output_type": "display_data"
535 | }
536 | ],
537 | "source": [
538 | "a = np.arange(60).reshape(3,4,5)\n",
539 | "a"
540 | ]
541 | },
542 | {
543 | "cell_type": "markdown",
544 | "id": "3b1e2bb8",
545 | "metadata": {},
546 | "source": [
547 | "## Section 3.2: Array Functions"
548 | ]
549 | },
550 | {
551 | "cell_type": "code",
552 | "execution_count": null,
553 | "id": "c280e3d9",
554 | "metadata": {},
555 | "outputs": [
556 | {
557 | "data": {
558 | "text/plain": [
559 | "array([ 10. , 12. , 15. , 2. , 4. , 6. , 100. , 320. , 0.5,\n",
560 | " 10.3])"
561 | ]
562 | },
563 | "metadata": {},
564 | "output_type": "display_data"
565 | }
566 | ],
567 | "source": [
568 | "a = np.array([10,12,15,2,4,6,100,320,0.5,10.3])\n",
569 | "a"
570 | ]
571 | },
572 | {
573 | "cell_type": "markdown",
574 | "id": "e831694e",
575 | "metadata": {},
576 | "source": [
577 | "### 3.2.1 Sorting a 1-D Array"
578 | ]
579 | },
580 | {
581 | "cell_type": "code",
582 | "execution_count": null,
583 | "id": "f9c7588a",
584 | "metadata": {},
585 | "outputs": [
586 | {
587 | "data": {
588 | "text/plain": [
589 | "array([ 0.5, 2. , 4. , 6. , 10. , 10.3, 12. , 15. , 100. ,\n",
590 | " 320. ])"
591 | ]
592 | },
593 | "metadata": {},
594 | "output_type": "display_data"
595 | }
596 | ],
597 | "source": [
598 | "a.sort()  # ndarray.sort() sorts the array in place and returns None\n",
599 | "a"
600 | ]
601 | },
602 | {
603 | "cell_type": "markdown",
604 | "id": "fa08791c",
605 | "metadata": {},
606 | "source": [
607 | "### 3.2.2 Type of Array"
608 | ]
609 | },
610 | {
611 | "cell_type": "code",
612 | "execution_count": null,
613 | "id": "15df162f",
614 | "metadata": {},
615 | "outputs": [
616 | {
617 | "data": {
618 | "text/plain": [
619 | "numpy.ndarray"
620 | ]
621 | },
622 | "metadata": {},
623 | "output_type": "display_data"
624 | }
625 | ],
626 | "source": [
627 | "type(a)"
628 | ]
629 | },
630 | {
631 | "cell_type": "markdown",
632 | "id": "c6978ff7",
633 | "metadata": {},
634 | "source": [
635 | "### 3.2.3 Length of an array"
636 | ]
637 | },
638 | {
639 | "cell_type": "code",
640 | "execution_count": null,
641 | "id": "0aae0d18",
642 | "metadata": {},
643 | "outputs": [
644 | {
645 | "data": {
646 | "text/plain": [
647 | "10"
648 | ]
649 | },
650 | "metadata": {},
651 | "output_type": "display_data"
652 | }
653 | ],
654 | "source": [
655 | "len(a)"
656 | ]
657 | },
658 | {
659 | "cell_type": "code",
660 | "execution_count": null,
661 | "id": "20c06e4c",
662 | "metadata": {},
663 | "outputs": [
664 | {
665 | "data": {
666 | "text/plain": [
667 | "array([10.2, 3.4, 5.3, 35.2, 45.2])"
668 | ]
669 | },
670 | "metadata": {},
671 | "output_type": "display_data"
672 | }
673 | ],
674 | "source": [
675 | "b = np.array([10.2, 3.4, 5.3, 35.2, 45.2])\n",
676 | "b"
677 | ]
678 | },
679 | {
680 | "cell_type": "markdown",
681 | "id": "1725df86",
682 | "metadata": {},
683 | "source": [
684 | "### 3.2.4 Concatenating Arrays"
685 | ]
686 | },
687 | {
688 | "cell_type": "markdown",
689 | "id": "a5c3e79a",
690 | "metadata": {},
691 | "source": [
692 | "> Concatenating 1-D arrays"
693 | ]
694 | },
695 | {
696 | "cell_type": "code",
697 | "execution_count": null,
698 | "id": "5baa38e9",
699 | "metadata": {},
700 | "outputs": [
701 | {
702 | "data": {
703 | "text/plain": [
704 | "array([ 0.5, 2. , 4. , 6. , 10. , 10.3, 12. , 15. , 100. ,\n",
705 | " 320. , 10.2, 3.4, 5.3, 35.2, 45.2])"
706 | ]
707 | },
708 | "metadata": {},
709 | "output_type": "display_data"
710 | }
711 | ],
712 | "source": [
713 | "c = np.concatenate((a,b))\n",
714 | "c"
715 | ]
716 | },
717 | {
718 | "cell_type": "code",
719 | "execution_count": null,
720 | "id": "5ef0d308",
721 | "metadata": {},
722 | "outputs": [
723 | {
724 | "data": {
725 | "text/plain": [
726 | "array([ 0.5, 2. , 3.4, 4. , 5.3, 6. , 10. , 10.2, 10.3,\n",
727 | " 12. , 15. , 35.2, 45.2, 100. , 320. ])"
728 | ]
729 | },
730 | "metadata": {},
731 | "output_type": "display_data"
732 | }
733 | ],
734 | "source": [
735 | "c.sort()\n",
736 | "c"
737 | ]
738 | },
739 | {
740 | "cell_type": "markdown",
741 | "id": "a2e1907c",
742 | "metadata": {},
743 | "source": [
744 | "> Concatenating 2-D arrays"
745 | ]
746 | },
747 | {
748 | "cell_type": "code",
749 | "execution_count": null,
750 | "id": "aea64970",
751 | "metadata": {},
752 | "outputs": [
753 | {
754 | "data": {
755 | "text/plain": [
756 | "array([[1, 2, 3, 4, 5],\n",
757 | " [5, 4, 3, 2, 1]])"
758 | ]
759 | },
760 | "metadata": {},
761 | "output_type": "display_data"
762 | }
763 | ],
764 | "source": [
765 | "a = np.array([[1,2,3,4,5], [5,4,3,2,1]])\n",
766 | "a"
767 | ]
768 | },
769 | {
770 | "cell_type": "code",
771 | "execution_count": null,
772 | "id": "23a1b5e5",
773 | "metadata": {},
774 | "outputs": [
775 | {
776 | "data": {
777 | "text/plain": [
778 | "array([[6, 7, 5, 6, 6],\n",
779 | " [8, 9, 5, 9, 5]])"
780 | ]
781 | },
782 | "metadata": {},
783 | "output_type": "display_data"
784 | }
785 | ],
786 | "source": [
787 | "b = np.array([[6,7,5,6,6], [8,9,5,9,5]])\n",
788 | "b"
789 | ]
790 | },
791 | {
792 | "cell_type": "code",
793 | "execution_count": null,
794 | "id": "43d316e6",
795 | "metadata": {},
796 | "outputs": [
797 | {
798 | "data": {
799 | "text/plain": [
800 | "array([[1, 2, 3, 4, 5],\n",
801 | " [5, 4, 3, 2, 1],\n",
802 | " [6, 7, 5, 6, 6],\n",
803 | " [8, 9, 5, 9, 5]])"
804 | ]
805 | },
806 | "metadata": {},
807 | "output_type": "display_data"
808 | }
809 | ],
810 | "source": [
811 | "c = np.concatenate((a,b))  # default axis=0: rows of b are appended below a\n",
812 | "c"
813 | ]
814 | },
815 | {
816 | "cell_type": "code",
817 | "execution_count": null,
818 | "id": "4dda2adc",
819 | "metadata": {},
820 | "outputs": [
821 | {
822 | "data": {
823 | "text/plain": [
824 | "array([[1, 2, 3, 4, 5, 6, 7, 5, 6, 6],\n",
825 | " [5, 4, 3, 2, 1, 8, 9, 5, 9, 5]])"
826 | ]
827 | },
828 | "metadata": {},
829 | "output_type": "display_data"
830 | }
831 | ],
832 | "source": [
833 | "c = np.concatenate((a,b), axis=1)  # axis=1: columns of b are appended to the right of a\n",
834 | "c"
835 | ]
836 | },
837 | {
838 | "cell_type": "code",
839 | "execution_count": null,
840 | "id": "1121832d",
841 | "metadata": {},
842 | "outputs": [
843 | {
844 | "data": {
845 | "text/plain": [
846 | "array([[1, 2],\n",
847 | " [3, 4],\n",
848 | " [5, 6]])"
849 | ]
850 | },
851 | "metadata": {},
852 | "output_type": "display_data"
853 | }
854 | ],
855 | "source": [
856 | "x = np.array([[1, 2], [3, 4]])\n",
857 | "y = np.array([[5, 6]])\n",
858 | "np.concatenate((x, y), axis=0)"
859 | ]
860 | },
861 | {
862 | "cell_type": "markdown",
863 | "id": "a568f1d9",
864 | "metadata": {},
865 | "source": [
866 | "### 3.2.5 Number of dimensions of an array"
867 | ]
868 | },
869 | {
870 | "cell_type": "code",
871 | "execution_count": null,
872 | "id": "0cc43877",
873 | "metadata": {},
874 | "outputs": [
875 | {
876 | "data": {
877 | "text/plain": [
878 | "1"
879 | ]
880 | },
881 | "metadata": {},
882 | "output_type": "display_data"
883 | }
884 | ],
885 | "source": [
886 | "a = np.array([10,12,15,2,4,6,100,320,0.5,10.3])\n",
887 | "a.ndim"
888 | ]
889 | },
890 | {
891 | "cell_type": "code",
892 | "execution_count": null,
893 | "id": "415d6248",
894 | "metadata": {},
895 | "outputs": [
896 | {
897 | "data": {
898 | "text/plain": [
899 | "2"
900 | ]
901 | },
902 | "metadata": {},
903 | "output_type": "display_data"
904 | }
905 | ],
906 | "source": [
907 | "b = np.array([[6,7,5,6,6], [8,9,5,9,5]])\n",
908 | "b.ndim"
909 | ]
910 | },
911 | {
912 | "cell_type": "code",
913 | "execution_count": null,
914 | "id": "e6841909",
915 | "metadata": {},
916 | "outputs": [
917 | {
918 | "data": {
919 | "text/plain": [
920 | "3"
921 | ]
922 | },
923 | "metadata": {},
924 | "output_type": "display_data"
925 | }
926 | ],
927 | "source": [
928 | "a = np.array([[[1,2,3,4], [4,5,6,7]], [[3,2,1,4],[6,5,4,1]], [[7,8,9,0],[9,8,7,6]]])\n",
929 | "a.ndim"
930 | ]
931 | },
932 | {
933 | "cell_type": "markdown",
934 | "id": "23784d4e",
935 | "metadata": {},
936 | "source": [
937 | "### 3.2.6 Number of elements in an array"
938 | ]
939 | },
940 | {
941 | "cell_type": "code",
942 | "execution_count": null,
943 | "id": "7ed0ded3",
944 | "metadata": {},
945 | "outputs": [
946 | {
947 | "data": {
948 | "text/plain": [
949 | "10"
950 | ]
951 | },
952 | "metadata": {},
953 | "output_type": "display_data"
954 | }
955 | ],
956 | "source": [
957 | "a = np.array([10,12,15,2,4,6,100,320,0.5,10.3])\n",
958 | "a.size"
959 | ]
960 | },
961 | {
962 | "cell_type": "code",
963 | "execution_count": null,
964 | "id": "ffbe8224",
965 | "metadata": {},
966 | "outputs": [
967 | {
968 | "data": {
969 | "text/plain": [
970 | "12"
971 | ]
972 | },
973 | "metadata": {},
974 | "output_type": "display_data"
975 | }
976 | ],
977 | "source": [
978 | "b = np.array([[6,7,5,6,6,6], [8,9,5,9,5,7]])\n",
979 | "b.size"
980 | ]
981 | },
982 | {
983 | "cell_type": "code",
984 | "execution_count": null,
985 | "id": "4a9a4845",
986 | "metadata": {},
987 | "outputs": [
988 | {
989 | "data": {
990 | "text/plain": [
991 | "24"
992 | ]
993 | },
994 | "metadata": {},
995 | "output_type": "display_data"
996 | }
997 | ],
998 | "source": [
999 | "a = np.array([[[1,2,3,4], [4,5,6,7]], [[3,2,1,4],[6,5,4,1]], [[7,8,9,0],[9,8,7,6]]])\n",
1000 | "a.size"
1001 | ]
1002 | },
1003 | {
1004 | "cell_type": "markdown",
1005 | "id": "63591035",
1006 | "metadata": {},
1007 | "source": [
1008 | "### 3.2.7 Shape of an array"
1009 | ]
1010 | },
1011 | {
1012 | "cell_type": "code",
1013 | "execution_count": null,
1014 | "id": "8906a559",
1015 | "metadata": {},
1016 | "outputs": [
1017 | {
1018 | "data": {
1019 | "text/plain": [
1020 | "array([ 10. , 12. , 15. , 2. , 4. , 6. , 100. , 320. , 0.5,\n",
1021 | " 10.3])"
1022 | ]
1023 | },
1024 | "metadata": {},
1025 | "output_type": "display_data"
1026 | }
1027 | ],
1028 | "source": [
1029 | "a = np.array([10,12,15,2,4,6,100,320,0.5,10.3])\n",
1030 | "a"
1031 | ]
1032 | },
1033 | {
1034 | "cell_type": "code",
1035 | "execution_count": null,
1036 | "id": "d55cf6f1",
1037 | "metadata": {},
1038 | "outputs": [
1039 | {
1040 | "data": {
1041 | "text/plain": [
1042 | "(10,)"
1043 | ]
1044 | },
1045 | "metadata": {},
1046 | "output_type": "display_data"
1047 | }
1048 | ],
1049 | "source": [
1050 | "a = np.array([10,12,15,2,4,6,100,320,0.5,10.3])\n",
1051 | "a.shape\n",
1052 | "# Output: (10,) means a single axis with 10 elements"
1053 | ]
1054 | },
1055 | {
1056 | "cell_type": "code",
1057 | "execution_count": null,
1058 | "id": "56796e4b",
1059 | "metadata": {},
1060 | "outputs": [
1061 | {
1062 | "data": {
1063 | "text/plain": [
1064 | "array([[6, 7, 5, 6, 6, 6],\n",
1065 | " [8, 9, 5, 9, 5, 7],\n",
1066 | " [8, 9, 5, 9, 5, 7]])"
1067 | ]
1068 | },
1069 | "metadata": {},
1070 | "output_type": "display_data"
1071 | }
1072 | ],
1073 | "source": [
1074 | "b = np.array([[6,7,5,6,6,6], [8,9,5,9,5,7], [8,9,5,9,5,7]])\n",
1075 | "b\n",
1076 | "b"
1077 | ]
1078 | },
1079 | {
1080 | "cell_type": "code",
1081 | "execution_count": null,
1082 | "id": "ae697c33",
1083 | "metadata": {},
1084 | "outputs": [
1085 | {
1086 | "data": {
1087 | "text/plain": [
1088 | "(3, 6)"
1089 | ]
1090 | },
1091 | "metadata": {},
1092 | "output_type": "display_data"
1093 | }
1094 | ],
1095 | "source": [
1096 | "b = np.array([[6,7,5,6,6,6], [8,9,5,9,5,7], [8,9,5,9,5,7]])\n",
1097 | "b\n",
1098 | "b.shape\n",
1099 | "# Output: 3 Rows, 6 Columns"
1100 | ]
1101 | },
1102 | {
1103 | "cell_type": "code",
1104 | "execution_count": null,
1105 | "id": "7a0d8ac4",
1106 | "metadata": {},
1107 | "outputs": [
1108 | {
1109 | "data": {
1110 | "text/plain": [
1111 | "array([[[1, 2, 3, 4],\n",
1112 | " [4, 5, 6, 7]],\n",
1113 | "\n",
1114 | " [[3, 2, 1, 4],\n",
1115 | " [6, 5, 4, 1]],\n",
1116 | "\n",
1117 | " [[7, 8, 9, 0],\n",
1118 | " [9, 8, 7, 6]],\n",
1119 | "\n",
1120 | " [[3, 2, 1, 4],\n",
1121 | " [6, 5, 4, 1]]])"
1122 | ]
1123 | },
1124 | "metadata": {},
1125 | "output_type": "display_data"
1126 | }
1127 | ],
1128 | "source": [
1129 | "a = np.array([[[1,2,3,4], [4,5,6,7]], [[3,2,1,4],[6,5,4,1]], [[7,8,9,0],[9,8,7,6]], [[3,2,1,4],[6,5,4,1]]])\n",
1130 | "a"
1131 | ]
1132 | },
1133 | {
1134 | "cell_type": "code",
1135 | "execution_count": null,
1136 | "id": "cae2a0b3",
1137 | "metadata": {},
1138 | "outputs": [
1139 | {
1140 | "data": {
1141 | "text/plain": [
1142 | "(4, 2, 4)"
1143 | ]
1144 | },
1145 | "metadata": {},
1146 | "output_type": "display_data"
1147 | }
1148 | ],
1149 | "source": [
1150 | "a = np.array([[[1,2,3,4], [4,5,6,7]], [[3,2,1,4],[6,5,4,1]], [[7,8,9,0],[9,8,7,6]], [[3,2,1,4],[6,5,4,1]]])\n",
1151 | "a.shape\n",
1152 | "# Output: 4 Layers, 2 Rows, 4 Columns"
1153 | ]
1154 | },
1155 | {
1156 | "cell_type": "markdown",
1157 | "id": "ff54e4f2",
1158 | "metadata": {},
1159 | "source": [
1160 | "### 3.2.8 Reshaping an array"
1161 | ]
1162 | },
1163 | {
1164 | "cell_type": "code",
1165 | "execution_count": null,
1166 | "id": "72ec3bef",
1167 | "metadata": {},
1168 | "outputs": [
1169 | {
1170 | "data": {
1171 | "text/plain": [
1172 | "array([0, 1, 2, 3, 4, 5, 6, 7, 8])"
1173 | ]
1174 | },
1175 | "metadata": {},
1176 | "output_type": "display_data"
1177 | }
1178 | ],
1179 | "source": [
1180 | "a = np.arange(9) # 3*3 = 9\n",
1181 | "a"
1182 | ]
1183 | },
1184 | {
1185 | "cell_type": "code",
1186 | "execution_count": null,
1187 | "id": "1f25827c",
1188 | "metadata": {},
1189 | "outputs": [
1190 | {
1191 | "data": {
1192 | "text/plain": [
1193 | "array([[0, 1, 2],\n",
1194 | " [3, 4, 5],\n",
1195 | " [6, 7, 8]])"
1196 | ]
1197 | },
1198 | "metadata": {},
1199 | "output_type": "display_data"
1200 | }
1201 | ],
1202 | "source": [
1203 | "a.reshape(3,3)\n",
1204 | "# The new shape must contain exactly 9 elements, e.g. 3*3 = 9"
1205 | ]
1206 | },
1207 | {
1208 | "cell_type": "code",
1209 | "execution_count": null,
1210 | "id": "8aa7bf57",
1211 | "metadata": {},
1212 | "outputs": [
1213 | {
1214 | "data": {
1215 | "text/plain": [
1216 | "array([[0, 1, 2, 3, 4, 5, 6, 7, 8]])"
1217 | ]
1218 | },
1219 | "metadata": {},
1220 | "output_type": "display_data"
1221 | }
1222 | ],
1223 | "source": [
1224 | "np.reshape(a, (1,9), order=\"C\")"
1225 | ]
1226 | },
1227 | {
1228 | "cell_type": "markdown",
1229 | "id": "fb959bd6",
1230 | "metadata": {},
1231 | "source": [
1232 | "### 3.2.9 Conversion of an Array"
1233 | ]
1234 | },
1235 | {
1236 | "cell_type": "markdown",
1237 | "id": "bf3f2229",
1238 | "metadata": {},
1239 | "source": [
1240 | "> Conversion of 1-D array to 2-D array"
1241 | ]
1242 | },
1243 | {
1244 | "cell_type": "code",
1245 | "execution_count": null,
1246 | "id": "45a12f7e",
1247 | "metadata": {},
1248 | "outputs": [
1249 | {
1250 | "data": {
1251 | "text/plain": [
1252 | "array([1, 2, 3, 4, 5, 6, 7, 8, 9])"
1253 | ]
1254 | },
1255 | "metadata": {},
1256 | "output_type": "display_data"
1257 | }
1258 | ],
1259 | "source": [
1260 | "a = np.array([1,2,3,4,5,6,7,8,9])\n",
1261 | "a"
1262 | ]
1263 | },
1264 | {
1265 | "cell_type": "markdown",
1266 | "id": "87147d08",
1267 | "metadata": {},
1268 | "source": [
1269 | "> Row-wise 1-D to 2-D conversion"
1270 | ]
1271 | },
1272 | {
1273 | "cell_type": "code",
1274 | "execution_count": null,
1275 | "id": "7320fe05",
1276 | "metadata": {},
1277 | "outputs": [
1278 | {
1279 | "data": {
1280 | "text/plain": [
1281 | "array([[1, 2, 3, 4, 5, 6, 7, 8, 9]])"
1282 | ]
1283 | },
1284 | "metadata": {},
1285 | "output_type": "display_data"
1286 | }
1287 | ],
1288 | "source": [
1289 | "b = a[np.newaxis, :]\n",
1290 | "b"
1291 | ]
1292 | },
1293 | {
1294 | "cell_type": "code",
1295 | "execution_count": null,
1296 | "id": "143fc70b",
1297 | "metadata": {},
1298 | "outputs": [
1299 | {
1300 | "data": {
1301 | "text/plain": [
1302 | "(1, 9)"
1303 | ]
1304 | },
1305 | "metadata": {},
1306 | "output_type": "display_data"
1307 | }
1308 | ],
1309 | "source": [
1310 | "b.shape"
1311 | ]
1312 | },
1313 | {
1314 | "cell_type": "markdown",
1315 | "id": "09bc4ffc",
1316 | "metadata": {},
1317 | "source": [
1318 | "> Column-wise 1-D to 2-D conversion"
1319 | ]
1320 | },
1321 | {
1322 | "cell_type": "code",
1323 | "execution_count": null,
1324 | "id": "55f963b3",
1325 | "metadata": {},
1326 | "outputs": [
1327 | {
1328 | "data": {
1329 | "text/plain": [
1330 | "array([[1],\n",
1331 | " [2],\n",
1332 | " [3],\n",
1333 | " [4],\n",
1334 | " [5],\n",
1335 | " [6],\n",
1336 | " [7],\n",
1337 | " [8],\n",
1338 | " [9]])"
1339 | ]
1340 | },
1341 | "metadata": {},
1342 | "output_type": "display_data"
1343 | }
1344 | ],
1345 | "source": [
1346 | "c = a[: , np.newaxis]\n",
1347 | "c"
1348 | ]
1349 | },
1350 | {
1351 | "cell_type": "code",
1352 | "execution_count": null,
1353 | "id": "7e571800",
1354 | "metadata": {},
1355 | "outputs": [
1356 | {
1357 | "data": {
1358 | "text/plain": [
1359 | "(9, 1)"
1360 | ]
1361 | },
1362 | "metadata": {},
1363 | "output_type": "display_data"
1364 | }
1365 | ],
1366 | "source": [
1367 | "c.shape"
1368 | ]
1369 | },
1370 | {
1371 | "cell_type": "markdown",
1372 | "id": "c9251a24",
1373 | "metadata": {},
1374 | "source": [
1375 | "> Conversion of 1-D array to 2-D and then 2-D to 3-D\n",
1376 | "> \n",
1377 | "> _Each use of `np.newaxis` increases the number of dimensions of your array by one._"
1378 | ]
1379 | },
1380 | {
1381 | "cell_type": "code",
1382 | "execution_count": null,
1383 | "id": "a170abeb",
1384 | "metadata": {},
1385 | "outputs": [
1386 | {
1387 | "data": {
1388 | "text/plain": [
1389 | "(7,)"
1390 | ]
1391 | },
1392 | "metadata": {},
1393 | "output_type": "display_data"
1394 | }
1395 | ],
1396 | "source": [
1397 | "a = np.arange(7)\n",
1398 | "a.shape\n"
1399 | ]
1400 | },
1401 | {
1402 | "cell_type": "code",
1403 | "execution_count": null,
1404 | "id": "8e332d42",
1405 | "metadata": {},
1406 | "outputs": [
1407 | {
1408 | "data": {
1409 | "text/plain": [
1410 | "(1, 7)"
1411 | ]
1412 | },
1413 | "metadata": {},
1414 | "output_type": "display_data"
1415 | }
1416 | ],
1417 | "source": [
1418 | "b = a[np.newaxis, :]\n",
1419 | "b.shape"
1420 | ]
1421 | },
1422 | {
1423 | "cell_type": "code",
1424 | "execution_count": null,
1425 | "id": "21cf04d3",
1426 | "metadata": {},
1427 | "outputs": [
1428 | {
1429 | "data": {
1430 | "text/plain": [
1431 | "(1, 1, 7)"
1432 | ]
1433 | },
1434 | "metadata": {},
1435 | "output_type": "display_data"
1436 | }
1437 | ],
1438 | "source": [
1439 | "c = b[np.newaxis, :]\n",
1440 | "c.shape"
1441 | ]
1442 | },
1443 | {
1444 | "cell_type": "markdown",
1445 | "id": "a9a45ca3",
1446 | "metadata": {},
1447 | "source": [
1448 | ">Converting a 1-D array to a 2-D array at a specific axis\n",
1449 | ">\n",
1450 | ">_You can also expand an array by inserting a new axis at a specified position with np.expand_dims._"
1451 | ]
1452 | },
1453 | {
1454 | "cell_type": "code",
1455 | "execution_count": null,
1456 | "id": "3b7513fb",
1457 | "metadata": {},
1458 | "outputs": [
1459 | {
1460 | "data": {
1461 | "text/plain": [
1462 | "(6,)"
1463 | ]
1464 | },
1465 | "metadata": {},
1466 | "output_type": "display_data"
1467 | }
1468 | ],
1469 | "source": [
1470 | "a = np.arange(6)\n",
1471 | "a.shape"
1472 | ]
1473 | },
1474 | {
1475 | "cell_type": "markdown",
1476 | "id": "4715e929",
1477 | "metadata": {},
1478 | "source": [
1479 | "> You can use np.expand_dims to add an axis at index position 1"
1480 | ]
1481 | },
1482 | {
1483 | "cell_type": "code",
1484 | "execution_count": null,
1485 | "id": "0adc8dbc",
1486 | "metadata": {},
1487 | "outputs": [
1488 | {
1489 | "data": {
1490 | "text/plain": [
1491 | "array([[0],\n",
1492 | " [1],\n",
1493 | " [2],\n",
1494 | " [3],\n",
1495 | " [4],\n",
1496 | " [5]])"
1497 | ]
1498 | },
1499 | "metadata": {},
1500 | "output_type": "display_data"
1501 | }
1502 | ],
1503 | "source": [
1504 | "b = np.expand_dims(a, axis=1)\n",
1505 | "b"
1506 | ]
1507 | },
1508 | {
1509 | "cell_type": "code",
1510 | "execution_count": null,
1511 | "id": "d8cac73c",
1512 | "metadata": {},
1513 | "outputs": [
1514 | {
1515 | "data": {
1516 | "text/plain": [
1517 | "(6, 1)"
1518 | ]
1519 | },
1520 | "metadata": {},
1521 | "output_type": "display_data"
1522 | }
1523 | ],
1524 | "source": [
1525 | "b.shape"
1526 | ]
1527 | },
1528 | {
1529 | "cell_type": "markdown",
1530 | "id": "5c7d0846",
1531 | "metadata": {},
1532 | "source": [
1533 | ">You can add an axis at index position 0\n"
1534 | ]
1535 | },
1536 | {
1537 | "cell_type": "code",
1538 | "execution_count": null,
1539 | "id": "ed723cda",
1540 | "metadata": {},
1541 | "outputs": [
1542 | {
1543 | "data": {
1544 | "text/plain": [
1545 | "array([[0, 1, 2, 3, 4, 5]])"
1546 | ]
1547 | },
1548 | "metadata": {},
1549 | "output_type": "display_data"
1550 | }
1551 | ],
1552 | "source": [
1553 | "b = np.expand_dims(a, axis=0)\n",
1554 | "b"
1555 | ]
1556 | },
1557 | {
1558 | "cell_type": "code",
1559 | "execution_count": null,
1560 | "id": "b8e6c53e",
1561 | "metadata": {},
1562 | "outputs": [
1563 | {
1564 | "data": {
1565 | "text/plain": [
1566 | "(1, 6)"
1567 | ]
1568 | },
1569 | "metadata": {},
1570 | "output_type": "display_data"
1571 | }
1572 | ],
1573 | "source": [
1574 | "b.shape"
1575 | ]
1576 | },
1577 | {
1578 | "cell_type": "markdown",
1579 | "id": "d2728c58",
1580 | "metadata": {},
1581 | "source": [
1582 | "### 3.2.10 Basic Arithmetic Operations on an Array"
1583 | ]
1584 | },
1585 | {
1586 | "cell_type": "markdown",
1587 | "id": "295d0494",
1588 | "metadata": {},
1589 | "source": [
1590 | "> Adding or multiplying by a scalar applies the operation to every element of the array"
1591 | ]
1592 | },
1593 | {
1594 | "cell_type": "code",
1595 | "execution_count": null,
1596 | "id": "7a05ae57",
1597 | "metadata": {},
1598 | "outputs": [
1599 | {
1600 | "data": {
1601 | "text/plain": [
1602 | "array([1, 2, 3, 4, 5, 6, 7, 8, 9])"
1603 | ]
1604 | },
1605 | "metadata": {},
1606 | "output_type": "display_data"
1607 | }
1608 | ],
1609 | "source": [
1610 | "a = np.arange(1, 10)\na"
1611 | ]
1612 | },
1613 | {
1614 | "cell_type": "code",
1615 | "execution_count": null,
1616 | "id": "27ef4420",
1617 | "metadata": {},
1618 | "outputs": [
1619 | {
1620 | "data": {
1621 | "text/plain": [
1622 | "array([ 6, 12, 18, 24, 30, 36, 42, 48, 54])"
1623 | ]
1624 | },
1625 | "metadata": {},
1626 | "output_type": "display_data"
1627 | }
1628 | ],
1629 | "source": [
1630 | "a*6"
1631 | ]
1632 | },
1633 | {
1634 | "cell_type": "code",
1635 | "execution_count": null,
1636 | "id": "f80e6e8d",
1637 | "metadata": {},
1638 | "outputs": [
1639 | {
1640 | "data": {
1641 | "text/plain": [
1642 | "array([ 7, 8, 9, 10, 11, 12, 13, 14, 15])"
1643 | ]
1644 | },
1645 | "metadata": {},
1646 | "output_type": "display_data"
1647 | }
1648 | ],
1649 | "source": [
1650 | "a+6"
1651 | ]
1652 | },
1653 | {
1654 | "cell_type": "code",
1655 | "execution_count": null,
1656 | "id": "668651ae",
1657 | "metadata": {},
1658 | "outputs": [
1659 | {
1660 | "data": {
1661 | "text/plain": [
1662 | "45"
1663 | ]
1664 | },
1665 | "metadata": {},
1666 | "output_type": "display_data"
1667 | }
1668 | ],
1669 | "source": [
1670 | "# Sum of elements\n",
1671 | "a.sum()"
1672 | ]
1673 | },
1674 | {
1675 | "cell_type": "code",
1676 | "execution_count": null,
1677 | "id": "11aa6587",
1678 | "metadata": {},
1679 | "outputs": [
1680 | {
1681 | "data": {
1682 | "text/plain": [
1683 | "5.0"
1684 | ]
1685 | },
1686 | "metadata": {},
1687 | "output_type": "display_data"
1688 | }
1689 | ],
1690 | "source": [
1691 | "# Mean of elements\n",
1692 | "a.mean()"
1693 | ]
1694 | },
1695 | {
1696 | "cell_type": "markdown",
1697 | "id": "0fe9079e",
1698 | "metadata": {},
1699 | "source": [
1700 | "### 3.2.11 Indexing and Slicing"
1701 | ]
1702 | },
1703 | {
1704 | "cell_type": "code",
1705 | "execution_count": null,
1706 | "id": "9d094233",
1707 | "metadata": {},
1708 | "outputs": [
1709 | {
1710 | "data": {
1711 | "text/plain": [
1712 | "array([10, 11, 12, 13, 14, 15])"
1713 | ]
1714 | },
1715 | "metadata": {},
1716 | "output_type": "display_data"
1717 | }
1718 | ],
1719 | "source": [
1720 | "a = np.array([10, 11, 12, 13, 14, 15])\n",
1721 | "a"
1722 | ]
1723 | },
1724 | {
1725 | "cell_type": "code",
1726 | "execution_count": null,
1727 | "id": "2f7d5cd4",
1728 | "metadata": {},
1729 | "outputs": [
1730 | {
1731 | "data": {
1732 | "text/plain": [
1733 | "12"
1734 | ]
1735 | },
1736 | "metadata": {},
1737 | "output_type": "display_data"
1738 | }
1739 | ],
1740 | "source": [
1741 | "a[2]"
1742 | ]
1743 | },
1744 | {
1745 | "cell_type": "code",
1746 | "execution_count": null,
1747 | "id": "ae5e8043",
1748 | "metadata": {},
1749 | "outputs": [
1750 | {
1751 | "data": {
1752 | "text/plain": [
1753 | "array([10, 11, 12])"
1754 | ]
1755 | },
1756 | "metadata": {},
1757 | "output_type": "display_data"
1758 | }
1759 | ],
1760 | "source": [
1761 | "a[0:3]"
1762 | ]
1763 | },
1764 | {
1765 | "cell_type": "code",
1766 | "execution_count": null,
1767 | "id": "19b0b8b6",
1768 | "metadata": {},
1769 | "outputs": [
1770 | {
1771 | "data": {
1772 | "text/plain": [
1773 | "array([10, 11, 12, 13, 14, 15])"
1774 | ]
1775 | },
1776 | "metadata": {},
1777 | "output_type": "display_data"
1778 | }
1779 | ],
1780 | "source": [
1781 | "a[0:]"
1782 | ]
1783 | },
1784 | {
1785 | "cell_type": "code",
1786 | "execution_count": null,
1787 | "id": "e64cb9aa",
1788 | "metadata": {},
1789 | "outputs": [
1790 | {
1791 | "data": {
1792 | "text/plain": [
1793 | "array([10, 11, 12, 13, 14])"
1794 | ]
1795 | },
1796 | "metadata": {},
1797 | "output_type": "display_data"
1798 | }
1799 | ],
1800 | "source": [
1801 | "a[:5]"
1802 | ]
1803 | },
1804 | {
1805 | "cell_type": "code",
1806 | "execution_count": null,
1807 | "id": "311f0408",
1808 | "metadata": {},
1809 | "outputs": [
1810 | {
1811 | "data": {
1812 | "text/plain": [
1813 | "array([11, 12, 13, 14, 15])"
1814 | ]
1815 | },
1816 | "metadata": {},
1817 | "output_type": "display_data"
1818 | }
1819 | ],
1820 | "source": [
1821 | "a[-5:]"
1822 | ]
1823 | },
1824 | {
1825 | "cell_type": "code",
1826 | "execution_count": null,
1827 | "id": "be73c3dc",
1828 | "metadata": {},
1829 | "outputs": [
1830 | {
1831 | "data": {
1832 | "text/plain": [
1833 | "array([[ 1, 2, 3, 4],\n",
1834 | " [ 5, 6, 7, 8],\n",
1835 | " [ 9, 10, 11, 12]])"
1836 | ]
1837 | },
1838 | "metadata": {},
1839 | "output_type": "display_data"
1840 | }
1841 | ],
1842 | "source": [
1843 | "a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])\n",
1844 | "a"
1845 | ]
1846 | },
1847 | {
1848 | "cell_type": "code",
1849 | "execution_count": null,
1850 | "id": "dbf6c60d",
1851 | "metadata": {},
1852 | "outputs": [
1853 | {
1854 | "data": {
1855 | "text/plain": [
1856 | "array([1, 2, 3, 4])"
1857 | ]
1858 | },
1859 | "metadata": {},
1860 | "output_type": "display_data"
1861 | }
1862 | ],
1863 | "source": [
1864 | "a[a<5]"
1865 | ]
1866 | },
1867 | {
1868 | "cell_type": "code",
1869 | "execution_count": null,
1870 | "id": "0040894e",
1871 | "metadata": {},
1872 | "outputs": [
1873 | {
1874 | "data": {
1875 | "text/plain": [
1876 | "array([ 6, 7, 8, 9, 10, 11, 12])"
1877 | ]
1878 | },
1879 | "metadata": {},
1880 | "output_type": "display_data"
1881 | }
1882 | ],
1883 | "source": [
1884 | "b = a > 5\n",
1885 | "a[b]"
1886 | ]
1887 | },
1888 | {
1889 | "cell_type": "code",
1890 | "execution_count": null,
1891 | "id": "6d404e9b",
1892 | "metadata": {},
1893 | "outputs": [
1894 | {
1895 | "data": {
1896 | "text/plain": [
1897 | "array([3, 4, 5, 6, 7, 8, 9])"
1898 | ]
1899 | },
1900 | "metadata": {},
1901 | "output_type": "display_data"
1902 | }
1903 | ],
1904 | "source": [
1905 | "c = a[(a>2) & (a<10)]\n",
1906 | "c"
1907 | ]
1908 | },
1909 | {
1910 | "cell_type": "code",
1911 | "execution_count": null,
1912 | "id": "868f55a7",
1913 | "metadata": {},
1914 | "outputs": [
1915 | {
1916 | "data": {
1917 | "text/plain": [
1918 | "array([[ True, False, False, False],\n",
1919 | " [ True, True, True, True],\n",
1920 | " [ True, True, True, True]])"
1921 | ]
1922 | },
1923 | "metadata": {},
1924 | "output_type": "display_data"
1925 | }
1926 | ],
1927 | "source": [
1928 | "c = (a>4) | (a==1)\n",
1929 | "c"
1930 | ]
1931 | },
1932 | {
1933 | "cell_type": "code",
1934 | "execution_count": null,
1935 | "id": "386af875",
1936 | "metadata": {},
1937 | "outputs": [
1938 | {
1939 | "data": {
1940 | "text/plain": [
1941 | "(array([0, 0, 0, 0], dtype=int64), array([0, 1, 2, 3], dtype=int64))"
1942 | ]
1943 | },
1944 | "metadata": {},
1945 | "output_type": "display_data"
1946 | }
1947 | ],
1948 | "source": [
1949 | "b = np.nonzero(a < 5)  # returns (row indices, column indices) of the elements < 5\n",
1950 | "b"
1951 | ]
1952 | },
1953 | {
1954 | "cell_type": "markdown",
1955 | "id": "ce3807cd",
1956 | "metadata": {},
1957 | "source": [
1958 | "> Slicing"
1959 | ]
1960 | },
1961 | {
1962 | "cell_type": "code",
1963 | "execution_count": null,
1964 | "id": "6858d9e5",
1965 | "metadata": {},
1966 | "outputs": [
1967 | {
1968 | "data": {
1969 | "text/plain": [
1970 | "array([[ 1, 2, 3, 4],\n",
1971 | " [ 5, 6, 7, 8],\n",
1972 | " [ 9, 10, 11, 12]])"
1973 | ]
1974 | },
1975 | "metadata": {},
1976 | "output_type": "display_data"
1977 | }
1978 | ],
1979 | "source": [
1980 | "a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])\n",
1981 | "a"
1982 | ]
1983 | },
1984 | {
1985 | "cell_type": "markdown",
1986 | "id": "2404a44e",
1987 | "metadata": {},
1988 | "source": [
1989 | "> Slicing a row of an array\n",
1990 | "> \n",
1991 | "> In `a[1, :]`, `1` is the row index and `:` selects every column."
1992 | ]
1993 | },
1994 | {
1995 | "cell_type": "code",
1996 | "execution_count": null,
1997 | "id": "be84d9bc",
1998 | "metadata": {},
1999 | "outputs": [
2000 | {
2001 | "data": {
2002 | "text/plain": [
2003 | "array([5, 6, 7, 8])"
2004 | ]
2005 | },
2006 | "metadata": {},
2007 | "output_type": "display_data"
2008 | }
2009 | ],
2010 | "source": [
2011 | "b = a[1, :] # \"1\" is the index of the row of an array\n",
2012 | "b"
2013 | ]
2014 | },
2015 | {
2016 | "cell_type": "markdown",
2017 | "id": "f87af930",
2018 | "metadata": {},
2019 | "source": [
2020 | "> Slicing a column of an array\n",
2021 | "> \n",
2022 | "> In `a[:, 1]`, `:` selects every row and `1` is the column index."
2023 | ]
2024 | },
2025 | {
2026 | "cell_type": "code",
2027 | "execution_count": null,
2028 | "id": "49f03cd2",
2029 | "metadata": {},
2030 | "outputs": [
2031 | {
2032 | "data": {
2033 | "text/plain": [
2034 | "array([ 2, 6, 10])"
2035 | ]
2036 | },
2037 | "metadata": {},
2038 | "output_type": "display_data"
2039 | }
2040 | ],
2041 | "source": [
2042 | "# to slice a column of an array\n",
2043 | "b = a[:, 1] # \"1\" is the index of the column of an array\n",
2044 | "b"
2045 | ]
2046 | },
2047 | {
2048 | "cell_type": "markdown",
2049 | "id": "84c47e86",
2050 | "metadata": {},
2051 | "source": [
2052 | "> Slicing and modifying an element: a basic slice is a view of the original data, so changing `b` also changes `a`"
2053 | ]
2054 | },
2055 | {
2056 | "cell_type": "code",
2057 | "execution_count": null,
2058 | "id": "0e9da100",
2059 | "metadata": {},
2060 | "outputs": [
2061 | {
2062 | "data": {
2063 | "text/plain": [
2064 | "array([200, 6, 10])"
2065 | ]
2066 | },
2067 | "metadata": {},
2068 | "output_type": "display_data"
2069 | }
2070 | ],
2071 | "source": [
2072 | "b[0] = 200\n",
2073 | "b"
2074 | ]
2075 | },
2076 | {
2077 | "cell_type": "code",
2078 | "execution_count": null,
2079 | "id": "d9b2b8c5",
2080 | "metadata": {},
2081 | "outputs": [
2082 | {
2083 | "data": {
2084 | "text/plain": [
2085 | "array([[ 1, 200, 3, 4],\n",
2086 | " [ 5, 6, 7, 8],\n",
2087 | " [ 9, 10, 11, 12]])"
2088 | ]
2089 | },
2090 | "metadata": {},
2091 | "output_type": "display_data"
2092 | }
2093 | ],
2094 | "source": [
2095 | "a"
2096 | ]
2097 | },
2098 | {
2099 | "cell_type": "markdown",
2100 | "id": "7db8191c",
2101 | "metadata": {},
2102 | "source": [
2103 | "> Making a copy of an array: `copy()` returns independent data, so later changes to `c` do not affect `a`\n"
2104 | ]
2105 | },
2106 | {
2107 | "cell_type": "code",
2108 | "execution_count": null,
2109 | "id": "7aa5f16e",
2110 | "metadata": {},
2111 | "outputs": [
2112 | {
2113 | "data": {
2114 | "text/plain": [
2115 | "array([[ 1, 200, 3, 4],\n",
2116 | " [ 5, 6, 7, 8],\n",
2117 | " [ 9, 10, 11, 12]])"
2118 | ]
2119 | },
2120 | "metadata": {},
2121 | "output_type": "display_data"
2122 | }
2123 | ],
2124 | "source": [
2125 | "c = a.copy()\n",
2126 | "c"
2127 | ]
2128 | },
2129 | {
2130 | "cell_type": "markdown",
2131 | "id": "585a68a9",
2132 | "metadata": {},
2133 | "source": [
2134 | "### 3.2.12 Array Stacking & Splitting"
2135 | ]
2136 | },
2137 | {
2138 | "cell_type": "markdown",
2139 | "id": "49d509a5",
2140 | "metadata": {},
2141 | "source": [
2142 | "> You can stack two arrays vertically with `np.vstack`"
2143 | ]
2144 | },
2145 | {
2146 | "cell_type": "code",
2147 | "execution_count": null,
2148 | "id": "3d72bf0c",
2149 | "metadata": {},
2150 | "outputs": [
2151 | {
2152 | "data": {
2153 | "text/plain": [
2154 | "array([[1, 1],\n",
2155 | " [2, 2],\n",
2156 | " [3, 3],\n",
2157 | " [4, 4]])"
2158 | ]
2159 | },
2160 | "metadata": {},
2161 | "output_type": "display_data"
2162 | }
2163 | ],
2164 | "source": [
2165 | "a1 = np.array([[1, 1],\n",
2166 | " [2, 2]])\n",
2167 | "a2 = np.array([[3, 3],\n",
2168 | " [4, 4]])\n",
2169 | "\n",
2170 | "np.vstack((a1, a2)) "
2171 | ]
2172 | },
2173 | {
2174 | "cell_type": "markdown",
2175 | "id": "bcc102f7",
2176 | "metadata": {},
2177 | "source": [
2178 | "> Or stack them horizontally with `np.hstack`\n"
2179 | ]
2180 | },
2181 | {
2182 | "cell_type": "code",
2183 | "execution_count": null,
2184 | "id": "87b4c071",
2185 | "metadata": {},
2186 | "outputs": [
2187 | {
2188 | "data": {
2189 | "text/plain": [
2190 | "array([[1, 1, 3, 3],\n",
2191 | " [2, 2, 4, 4]])"
2192 | ]
2193 | },
2194 | "metadata": {},
2195 | "output_type": "display_data"
2196 | }
2197 | ],
2198 | "source": [
2199 | "np.hstack((a1, a2))"
2200 | ]
2201 | },
2202 | {
2203 | "cell_type": "markdown",
2204 | "id": "d399c1f6",
2205 | "metadata": {},
2206 | "source": [
2207 | ">Array Splitting"
2208 | ]
2209 | },
2210 | {
2211 | "cell_type": "code",
2212 | "execution_count": null,
2213 | "id": "e91e09d8",
2214 | "metadata": {},
2215 | "outputs": [
2216 | {
2217 | "data": {
2218 | "text/plain": [
2219 | "array([[ 1, 2, 3, 4, 5, 6, 7, 8],\n",
2220 | " [ 9, 10, 11, 12, 13, 14, 15, 16],\n",
2221 | " [17, 18, 19, 20, 21, 22, 23, 24]])"
2222 | ]
2223 | },
2224 | "metadata": {},
2225 | "output_type": "display_data"
2226 | }
2227 | ],
2228 | "source": [
2229 | "x = np.arange(1, 25).reshape(3,8)\n",
2230 | "x"
2231 | ]
2232 | },
2233 | {
2234 | "cell_type": "markdown",
2235 | "id": "5a576ac8",
2236 | "metadata": {},
2237 | "source": [
2238 | "> To split this array into equally shaped arrays, pass the number of sections to `np.hsplit`.\n",
2239 | ">\n",
2240 | "> **Important: the number of sections must divide the number of columns evenly. The array above has 8 columns, so it can be split into 1, 2, 4 or 8 equal sections.**"
2241 | ]
2242 | },
2243 | {
2244 | "cell_type": "code",
2245 | "execution_count": null,
2246 | "id": "1d1fa4ee",
2247 | "metadata": {},
2248 | "outputs": [
2249 | {
2250 | "data": {
2251 | "text/plain": [
2252 | "[array([[ 1, 2, 3, 4],\n",
2253 | " [ 9, 10, 11, 12],\n",
2254 | " [17, 18, 19, 20]]),\n",
2255 | " array([[ 5, 6, 7, 8],\n",
2256 | " [13, 14, 15, 16],\n",
2257 | " [21, 22, 23, 24]])]"
2258 | ]
2259 | },
2260 | "metadata": {},
2261 | "output_type": "display_data"
2262 | }
2263 | ],
2264 | "source": [
2265 | "np.hsplit(x, 2)  # split the 8 columns into 2 equal blocks of 4"
2266 | ]
2267 | },
2268 | {
2269 | "cell_type": "code",
2270 | "execution_count": null,
2271 | "id": "61d976f7",
2272 | "metadata": {},
2273 | "outputs": [
2274 | {
2275 | "data": {
2276 | "text/plain": [
2277 | "[array([[ 1, 2, 3, 4, 5],\n",
2278 | " [ 9, 10, 11, 12, 13],\n",
2279 | " [17, 18, 19, 20, 21]]),\n",
2280 | " array([[ 6],\n",
2281 | " [14],\n",
2282 | " [22]]),\n",
2283 | " array([[ 7, 8],\n",
2284 | " [15, 16],\n",
2285 | " [23, 24]])]"
2286 | ]
2287 | },
2288 | "metadata": {},
2289 | "output_type": "display_data"
2290 | }
2291 | ],
2292 | "source": [
2293 | "np.hsplit(x, (5, 6))  # split before column indices 5 and 6"
2294 | ]
2295 | },
2296 | {
2297 | "cell_type": "markdown",
2298 | "id": "e21ce303",
2299 | "metadata": {},
2300 | "source": [
2301 | "## Section 3.3: Basic Array Operations"
2302 | ]
2303 | },
2304 | {
2305 | "cell_type": "markdown",
2306 | "id": "fb57017a",
2307 | "metadata": {},
2308 | "source": [
2309 | "> Addition"
2310 | ]
2311 | },
2312 | {
2313 | "cell_type": "code",
2314 | "execution_count": null,
2315 | "id": "2e90f028",
2316 | "metadata": {},
2317 | "outputs": [
2318 | {
2319 | "data": {
2320 | "text/plain": [
2321 | "array([2, 3])"
2322 | ]
2323 | },
2324 | "metadata": {},
2325 | "output_type": "display_data"
2326 | }
2327 | ],
2328 | "source": [
2329 | "import numpy as np\n",
2330 | "a = np.array([2,3])\n",
2331 | "a"
2332 | ]
2333 | },
2334 | {
2335 | "cell_type": "code",
2336 | "execution_count": null,
2337 | "id": "03eddb34",
2338 | "metadata": {},
2339 | "outputs": [
2340 | {
2341 | "data": {
2342 | "text/plain": [
2343 | "array([1, 1])"
2344 | ]
2345 | },
2346 | "metadata": {},
2347 | "output_type": "display_data"
2348 | }
2349 | ],
2350 | "source": [
2351 | "b = np.ones(2, dtype=int)\n",
2352 | "b"
2353 | ]
2354 | },
2355 | {
2356 | "cell_type": "code",
2357 | "execution_count": null,
2358 | "id": "1df7fc1e",
2359 | "metadata": {},
2360 | "outputs": [
2361 | {
2362 | "data": {
2363 | "text/plain": [
2364 | "array([3, 4])"
2365 | ]
2366 | },
2367 | "metadata": {},
2368 | "output_type": "display_data"
2369 | }
2370 | ],
2371 | "source": [
2372 | "c = a+b\n",
2373 | "c"
2374 | ]
2375 | },
2376 | {
2377 | "cell_type": "markdown",
2378 | "id": "4a7c5d1f",
2379 | "metadata": {},
2380 | "source": [
2381 | "### 3.3.1 Basic Operations of 2D Array (Addition, Subtraction, Multiplication & Division) "
2382 | ]
2383 | },
2384 | {
2385 | "cell_type": "markdown",
2386 | "id": "712e347a",
2387 | "metadata": {},
2388 | "source": [
2389 | "> If two arrays have the same shape, arithmetic operators such as `+` and `*` work element by element."
2390 | ]
2391 | },
2392 | {
2393 | "cell_type": "code",
2394 | "execution_count": null,
2395 | "id": "bb58c344",
2396 | "metadata": {},
2397 | "outputs": [
2398 | {
2399 | "data": {
2400 | "text/plain": [
2401 | "array([[2, 3],\n",
2402 | " [4, 5]])"
2403 | ]
2404 | },
2405 | "metadata": {},
2406 | "output_type": "display_data"
2407 | }
2408 | ],
2409 | "source": [
2410 | "a = np.array([[1, 2], [3, 4]])\n",
2411 | "b = np.array([[1, 1], [1, 1]])\n",
2412 | "a + b"
2413 | ]
2414 | },
2415 | {
2416 | "cell_type": "markdown",
2417 | "id": "2bb60d6a",
2418 | "metadata": {},
2419 | "source": [
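2420 | "> You can also apply these arithmetic operations to arrays of different shapes when the shapes are compatible, for example when one array has only one row or one column. NumPy then repeats the smaller array across the larger one (broadcasting)."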
2421 | ]
2422 | },
2423 | {
2424 | "cell_type": "code",
2425 | "execution_count": null,
2426 | "id": "dcbf28fa",
2427 | "metadata": {},
2428 | "outputs": [
2429 | {
2430 | "data": {
2431 | "text/plain": [
2432 | "array([[2, 3],\n",
2433 | " [4, 5],\n",
2434 | " [6, 7]])"
2435 | ]
2436 | },
2437 | "metadata": {},
2438 | "output_type": "display_data"
2439 | }
2440 | ],
2441 | "source": [
2442 | "x = np.array([[1, 2], [3, 4], [5, 6]])\n",
2443 | "y = np.array([[1, 1]])\n",
2444 | "x+y"
2445 | ]
2446 | },
2447 | {
2448 | "cell_type": "markdown",
2449 | "id": "c1dc1f0a",
2450 | "metadata": {},
2451 | "source": [
2452 | ">Subtraction"
2453 | ]
2454 | },
2455 | {
2456 | "cell_type": "code",
2457 | "execution_count": null,
2458 | "id": "fa8acc9d",
2459 | "metadata": {},
2460 | "outputs": [
2461 | {
2462 | "data": {
2463 | "text/plain": [
2464 | "array([[0, 1],\n       [2, 3]])"
2465 | ]
2466 | },
2467 | "metadata": {},
2468 | "output_type": "display_data"
2469 | }
2470 | ],
2471 | "source": [
2472 | "d = a-b\n",
2473 | "d"
2474 | ]
2475 | },
2476 | {
2477 | "cell_type": "markdown",
2478 | "id": "9521fac0",
2479 | "metadata": {},
2480 | "source": [
2481 | "> Multiplication"
2482 | ]
2483 | },
2484 | {
2485 | "cell_type": "code",
2486 | "execution_count": null,
2487 | "id": "845af2d9",
2488 | "metadata": {},
2489 | "outputs": [
2490 | {
2491 | "data": {
2492 | "text/plain": [
2493 | "array([[ 0, 4],\n",
2494 | " [ 6, 12]])"
2495 | ]
2496 | },
2497 | "metadata": {},
2498 | "output_type": "display_data"
2499 | }
2500 | ],
2501 | "source": [
2502 | "e = c*d\n",
2503 | "e"
2504 | ]
2505 | },
2506 | {
2507 | "cell_type": "markdown",
2508 | "id": "b4a55529",
2509 | "metadata": {},
2510 | "source": [
2511 | "> Division\n"
2512 | ]
2513 | },
2514 | {
2515 | "cell_type": "code",
2516 | "execution_count": null,
2517 | "id": "edbc056d",
2518 | "metadata": {},
2519 | "outputs": [
2520 | {
2521 | "data": {
2522 | "text/plain": [
2523 | "array([[3., 2.],\n       [1., 1.]])"
2524 | ]
2525 | },
2526 | "metadata": {},
2527 | "output_type": "display_data"
2528 | }
2529 | ],
2530 | "source": [
2531 | "f = c / a  # divide by a rather than d: d contains a zero, which would give inf\n",
2532 | "f"
2533 | ]
2534 | },
2535 | {
2536 | "cell_type": "markdown",
2537 | "id": "3664fbf3",
2538 | "metadata": {},
2539 | "source": [
2540 | "### 3.3.2 Sum of Elements in Array"
2541 | ]
2542 | },
2543 | {
2544 | "cell_type": "markdown",
2545 | "id": "bc528ada",
2546 | "metadata": {},
2547 | "source": [
2548 | ">Sum of elements in 1-D array"
2549 | ]
2550 | },
2551 | {
2552 | "cell_type": "code",
2553 | "execution_count": null,
2554 | "id": "74e12ec4",
2555 | "metadata": {},
2556 | "outputs": [
2557 | {
2558 | "data": {
2559 | "text/plain": [
2560 | "array([0, 1, 2, 3])"
2561 | ]
2562 | },
2563 | "metadata": {},
2564 | "output_type": "display_data"
2565 | }
2566 | ],
2567 | "source": [
2568 | "a = np.arange(4)\n",
2569 | "a"
2570 | ]
2571 | },
2572 | {
2573 | "cell_type": "code",
2574 | "execution_count": null,
2575 | "id": "f68b9513",
2576 | "metadata": {},
2577 | "outputs": [
2578 | {
2579 | "data": {
2580 | "text/plain": [
2581 | "6"
2582 | ]
2583 | },
2584 | "metadata": {},
2585 | "output_type": "display_data"
2586 | }
2587 | ],
2588 | "source": [
2589 | "a.sum()"
2590 | ]
2591 | },
2592 | {
2593 | "cell_type": "markdown",
2594 | "id": "62a79a9d",
2595 | "metadata": {},
2596 | "source": [
2597 | ">Sum of elements in 2-D array"
2598 | ]
2599 | },
2600 | {
2601 | "cell_type": "code",
2602 | "execution_count": null,
2603 | "id": "4a676333",
2604 | "metadata": {},
2605 | "outputs": [
2606 | {
2607 | "data": {
2608 | "text/plain": [
2609 | "array([[1, 2],\n",
2610 | " [3, 4]])"
2611 | ]
2612 | },
2613 | "metadata": {},
2614 | "output_type": "display_data"
2615 | }
2616 | ],
2617 | "source": [
2618 | "a = np.array([[1,2], [3,4]])\n",
2619 | "a"
2620 | ]
2621 | },
2622 | {
2623 | "cell_type": "markdown",
2624 | "id": "c6c2d506",
2625 | "metadata": {},
2626 | "source": [
2627 | "> Sum of a 2-D array along axis 0 (adds down each column, giving one sum per column)"
2628 | ]
2629 | },
2630 | {
2631 | "cell_type": "code",
2632 | "execution_count": null,
2633 | "id": "c5fd85dc",
2634 | "metadata": {},
2635 | "outputs": [
2636 | {
2637 | "data": {
2638 | "text/plain": [
2639 | "array([4, 6])"
2640 | ]
2641 | },
2642 | "metadata": {},
2643 | "output_type": "display_data"
2644 | }
2645 | ],
2646 | "source": [
2647 | "a.sum(axis = 0)"
2648 | ]
2649 | },
2650 | {
2651 | "cell_type": "markdown",
2652 | "id": "65b4c783",
2653 | "metadata": {},
2654 | "source": [
2655 | "> Sum of a 2-D array along axis 1 (adds across each row, giving one sum per row)"
2656 | ]
2657 | },
2658 | {
2659 | "cell_type": "code",
2660 | "execution_count": null,
2661 | "id": "09d5cb8b",
2662 | "metadata": {},
2663 | "outputs": [
2664 | {
2665 | "data": {
2666 | "text/plain": [
2667 | "array([3, 7])"
2668 | ]
2669 | },
2670 | "metadata": {},
2671 | "output_type": "display_data"
2672 | }
2673 | ],
2674 | "source": [
2675 | "a.sum(axis = 1)"
2676 | ]
2677 | },
2678 | {
2679 | "cell_type": "markdown",
2680 | "id": "63274fc2",
2681 | "metadata": {},
2682 | "source": [
2683 | "> Multiplying a vector by a scalar"
2684 | ]
2685 | },
2686 | {
2687 | "cell_type": "code",
2688 | "execution_count": null,
2689 | "id": "b5eaf0ae",
2690 | "metadata": {},
2691 | "outputs": [
2692 | {
2693 | "data": {
2694 | "text/plain": [
2695 | "array([3., 5.])"
2696 | ]
2697 | },
2698 | "metadata": {},
2699 | "output_type": "display_data"
2700 | }
2701 | ],
2702 | "source": [
2703 | "a = np.array([1.5, 2.5])\n",
2704 | "a * 2"
2705 | ]
2706 | },
2707 | {
2708 | "cell_type": "markdown",
2709 | "id": "83ff4562",
2710 | "metadata": {},
2711 | "source": [
2712 | "## Section 3.4: Basic Statistical Operations in Arrays\n",
2713 | "> Finding the minimum, maximum, sum, mean, product and standard deviation"
2714 | ]
2715 | },
2716 | {
2717 | "cell_type": "markdown",
2718 | "id": "6b8080a8",
2719 | "metadata": {},
2720 | "source": [
2721 | "> 1-D Array"
2722 | ]
2723 | },
2724 | {
2725 | "cell_type": "code",
2726 | "execution_count": null,
2727 | "id": "cca724f4",
2728 | "metadata": {},
2729 | "outputs": [
2730 | {
2731 | "data": {
2732 | "text/plain": [
2733 | "array([1, 2, 3, 4, 5, 6, 7, 8, 9])"
2734 | ]
2735 | },
2736 | "metadata": {},
2737 | "output_type": "display_data"
2738 | }
2739 | ],
2740 | "source": [
2741 | "a = np.arange(1,10)\n",
2742 | "a"
2743 | ]
2744 | },
2745 | {
2746 | "cell_type": "code",
2747 | "execution_count": null,
2748 | "id": "7845f6de",
2749 | "metadata": {},
2750 | "outputs": [
2751 | {
2752 | "data": {
2753 | "text/plain": [
2754 | "1"
2755 | ]
2756 | },
2757 | "metadata": {},
2758 | "output_type": "display_data"
2759 | }
2760 | ],
2761 | "source": [
2762 | "a.min()"
2763 | ]
2764 | },
2765 | {
2766 | "cell_type": "code",
2767 | "execution_count": null,
2768 | "id": "50c61157",
2769 | "metadata": {},
2770 | "outputs": [
2771 | {
2772 | "data": {
2773 | "text/plain": [
2774 | "9"
2775 | ]
2776 | },
2777 | "metadata": {},
2778 | "output_type": "display_data"
2779 | }
2780 | ],
2781 | "source": [
2782 | "a.max()"
2783 | ]
2784 | },
2785 | {
2786 | "cell_type": "code",
2787 | "execution_count": null,
2788 | "id": "5f9a04af",
2789 | "metadata": {},
2790 | "outputs": [
2791 | {
2792 | "data": {
2793 | "text/plain": [
2794 | "45"
2795 | ]
2796 | },
2797 | "metadata": {},
2798 | "output_type": "display_data"
2799 | }
2800 | ],
2801 | "source": [
2802 | "a.sum()"
2803 | ]
2804 | },
2805 | {
2806 | "cell_type": "code",
2807 | "execution_count": null,
2808 | "id": "4d6f4f78",
2809 | "metadata": {},
2810 | "outputs": [
2811 | {
2812 | "data": {
2813 | "text/plain": [
2814 | "362880"
2815 | ]
2816 | },
2817 | "metadata": {},
2818 | "output_type": "display_data"
2819 | }
2820 | ],
2821 | "source": [
2822 | "a.prod()"
2823 | ]
2824 | },
2825 | {
2826 | "cell_type": "code",
2827 | "execution_count": null,
2828 | "id": "32225504",
2829 | "metadata": {},
2830 | "outputs": [
2831 | {
2832 | "data": {
2833 | "text/plain": [
2834 | "2.581988897471611"
2835 | ]
2836 | },
2837 | "metadata": {},
2838 | "output_type": "display_data"
2839 | }
2840 | ],
2841 | "source": [
2842 | "a.std()"
2843 | ]
2844 | },
2845 | {
2846 | "cell_type": "markdown",
2847 | "id": "fd8a9b75",
2848 | "metadata": {},
2849 | "source": [
2850 | "> 2-D Array"
2851 | ]
2852 | },
2853 | {
2854 | "cell_type": "code",
2855 | "execution_count": null,
2856 | "id": "912b5749",
2857 | "metadata": {},
2858 | "outputs": [
2859 | {
2860 | "data": {
2861 | "text/plain": [
2862 | "4.8595784"
2863 | ]
2864 | },
2865 | "metadata": {},
2866 | "output_type": "display_data"
2867 | }
2868 | ],
2869 | "source": [
2870 | "a = np.array([[0.45053314, 0.17296777, 0.34376245, 0.5510652],\n",
2871 | " [0.54627315, 0.05093587, 0.40067661, 0.55645993],\n",
2872 | " [0.12697628, 0.82485143, 0.26590556, 0.56917101]])\n",
2873 | "a.sum()"
2874 | ]
2875 | },
2876 | {
2877 | "cell_type": "code",
2878 | "execution_count": null,
2879 | "id": "a582957e",
2880 | "metadata": {},
2881 | "outputs": [
2882 | {
2883 | "data": {
2884 | "text/plain": [
2885 | "array([[0.17296777, 0.34376245, 0.45053314, 0.5510652 ],\n",
2886 | " [0.05093587, 0.40067661, 0.54627315, 0.55645993],\n",
2887 | " [0.12697628, 0.26590556, 0.56917101, 0.82485143]])"
2888 | ]
2889 | },
2890 | "metadata": {},
2891 | "output_type": "display_data"
2892 | }
2893 | ],
2894 | "source": [
2895 | "a.sort()  # sorts each row in place (along the last axis)\n",
2896 | "a"
2897 | ]
2898 | },
2899 | {
2900 | "cell_type": "code",
2901 | "execution_count": null,
2902 | "id": "84a8d259",
2903 | "metadata": {},
2904 | "outputs": [
2905 | {
2906 | "data": {
2907 | "text/plain": [
2908 | "0.05093587"
2909 | ]
2910 | },
2911 | "metadata": {},
2912 | "output_type": "display_data"
2913 | }
2914 | ],
2915 | "source": [
2916 | "a.min()"
2917 | ]
2918 | },
2919 | {
2920 | "cell_type": "code",
2921 | "execution_count": null,
2922 | "id": "a1e4fb01",
2923 | "metadata": {},
2924 | "outputs": [
2925 | {
2926 | "data": {
2927 | "text/plain": [
2928 | "0.82485143"
2929 | ]
2930 | },
2931 | "metadata": {},
2932 | "output_type": "display_data"
2933 | }
2934 | ],
2935 | "source": [
2936 | "a.max()"
2937 | ]
2938 | },
2939 | {
2940 | "cell_type": "markdown",
2941 | "id": "6befe519",
2942 | "metadata": {},
2943 | "source": [
2944 | "> Minimum and Maximum in 2-D Array at specific axis"
2945 | ]
2946 | },
2947 | {
2948 | "cell_type": "markdown",
2949 | "id": "8bd6a746",
2950 | "metadata": {},
2951 | "source": [
2952 | "> With `axis=0`, NumPy scans down each column and returns the minimum value of each column.\n",
2953 | "> \n",
2954 | "> **Important: the result is a 1-D array with one value per column.**"
2955 | ]
2956 | },
2957 | {
2958 | "cell_type": "code",
2959 | "execution_count": null,
2960 | "id": "69adf977",
2961 | "metadata": {},
2962 | "outputs": [
2963 | {
2964 | "data": {
2965 | "text/plain": [
2966 | "array([0.05093587, 0.26590556, 0.45053314, 0.5510652 ])"
2967 | ]
2968 | },
2969 | "metadata": {},
2970 | "output_type": "display_data"
2971 | }
2972 | ],
2973 | "source": [
2974 | "a.min(axis=0)"
2975 | ]
2976 | },
2977 | {
2978 | "cell_type": "markdown",
2979 | "id": "1d23fcd9",
2980 | "metadata": {},
2981 | "source": [
2982 | "> With `axis=1`, NumPy scans across each row and returns the minimum value of each row.\n",
2983 | ">\n",
2984 | "> **Important: the result is a 1-D array with one value per row, not a column.**"
2985 | ]
2986 | },
2987 | {
2988 | "cell_type": "code",
2989 | "execution_count": null,
2990 | "id": "a037f9d6",
2991 | "metadata": {},
2992 | "outputs": [
2993 | {
2994 | "data": {
2995 | "text/plain": [
2996 | "array([0.17296777, 0.05093587, 0.12697628])"
2997 | ]
2998 | },
2999 | "metadata": {},
3000 | "output_type": "display_data"
3001 | }
3002 | ],
3003 | "source": [
3004 | "a.min(axis=1)"
3005 | ]
3006 | },
3007 | {
3008 | "cell_type": "markdown",
3009 | "id": "e80e02ab",
3010 | "metadata": {},
3011 | "source": [
3012 | "> With `axis=0`, NumPy scans down each column and returns the maximum value of each column.\n",
3013 | "> \n",
3014 | "> **Important: the result is a 1-D array with one value per column.**"
3015 | ]
3016 | },
3017 | {
3018 | "cell_type": "code",
3019 | "execution_count": null,
3020 | "id": "1e81174f",
3021 | "metadata": {},
3022 | "outputs": [
3023 | {
3024 | "data": {
3025 | "text/plain": [
3026 | "array([0.17296777, 0.40067661, 0.56917101, 0.82485143])"
3027 | ]
3028 | },
3029 | "metadata": {},
3030 | "output_type": "display_data"
3031 | }
3032 | ],
3033 | "source": [
3034 | "a.max(axis=0)"
3035 | ]
3036 | },
3037 | {
3038 | "cell_type": "markdown",
3039 | "id": "c9425af6",
3040 | "metadata": {},
3041 | "source": [
3042 | "> With `axis=1`, NumPy scans across each row and returns the maximum value of each row.\n",
3043 | "> \n",
3044 | "> **Important: the result is a 1-D array with one value per row, not a column.**"
3045 | ]
3046 | },
3047 | {
3048 | "cell_type": "code",
3049 | "execution_count": null,
3050 | "id": "4a0c8c85",
3051 | "metadata": {},
3052 | "outputs": [
3053 | {
3054 | "data": {
3055 | "text/plain": [
3056 | "array([0.5510652 , 0.55645993, 0.82485143])"
3057 | ]
3058 | },
3059 | "metadata": {},
3060 | "output_type": "display_data"
3061 | }
3062 | ],
3063 | "source": [
3064 | "a.max(axis=1)"
3065 | ]
3066 | },
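The axis notes above can be checked on a small integer array where the answers are easy to read off. This is a minimal sketch: the array `m` is a hypothetical example rather than the notebook's random `a`, and `keepdims=True` is shown only as one way to keep the result as a column if that shape is wanted.

```python
import numpy as np

# Hypothetical 3x4 array used only for illustration.
m = np.array([[4, 9, 2, 7],
              [8, 1, 6, 3],
              [5, 0, 11, 10]])

col_min = m.min(axis=0)                    # collapses the rows: one minimum per column
row_min = m.min(axis=1)                    # collapses the columns: one minimum per row
row_min_2d = m.min(axis=1, keepdims=True)  # same values, reduced axis kept with length 1

print(col_min)           # [4 0 2 3]
print(row_min)           # [2 1 0]
print(row_min_2d.shape)  # (3, 1)
```

Without `keepdims`, both reductions return plain 1-D arrays; the axis argument only decides which dimension is collapsed.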
3067 | {
3068 | "cell_type": "code",
3069 | "execution_count": null,
3070 | "id": "d3ba7dd9",
3071 | "metadata": {},
3072 | "outputs": [
3073 | {
3074 | "data": {
3075 | "text/plain": [
3076 | "1.451721612088471e-06"
3077 | ]
3078 | },
3079 | "metadata": {},
3080 | "output_type": "display_data"
3081 | }
3082 | ],
3083 | "source": [
3084 | "a.prod()"
3085 | ]
3086 | },
3087 | {
3088 | "cell_type": "code",
3089 | "execution_count": null,
3090 | "id": "b16b16d3",
3091 | "metadata": {},
3092 | "outputs": [
3093 | {
3094 | "data": {
3095 | "text/plain": [
3096 | "0.21392120766089617"
3097 | ]
3098 | },
3099 | "metadata": {},
3100 | "output_type": "display_data"
3101 | }
3102 | ],
3103 | "source": [
3104 | "a.std()"
3105 | ]
3106 | },
3107 | {
3108 | "cell_type": "markdown",
3109 | "id": "4acd2eaf",
3110 | "metadata": {},
3111 | "source": [
3112 | "## Section 3.5: Indexing 2-D Array"
3113 | ]
3114 | },
3115 | {
3116 | "cell_type": "code",
3117 | "execution_count": null,
3118 | "id": "12fd7121",
3119 | "metadata": {},
3120 | "outputs": [
3121 | {
3122 | "data": {
3123 | "text/plain": [
3124 | "array([[1, 2],\n",
3125 | " [3, 4],\n",
3126 | " [5, 6]])"
3127 | ]
3128 | },
3129 | "metadata": {},
3130 | "output_type": "display_data"
3131 | }
3132 | ],
3133 | "source": [
3134 | "x = np.array([[1, 2], [3, 4], [5, 6]])\n",
3135 | "x"
3136 | ]
3137 | },
3138 | {
3139 | "cell_type": "markdown",
3140 | "id": "d8720cae",
3141 | "metadata": {},
3142 | "source": [
3143 | "> In a 2-D array, the index `[0, 1]` means row 0, column 1: it selects the single element where that row and column intersect."
3144 | ]
3145 | },
3146 | {
3147 | "cell_type": "code",
3148 | "execution_count": null,
3149 | "id": "976ccbac",
3150 | "metadata": {},
3151 | "outputs": [
3152 | {
3153 | "data": {
3154 | "text/plain": [
3155 | "2"
3156 | ]
3157 | },
3158 | "metadata": {},
3159 | "output_type": "display_data"
3160 | }
3161 | ],
3162 | "source": [
3163 | "x[0 , 1]"
3164 | ]
3165 | },
3166 | {
3167 | "cell_type": "markdown",
3168 | "id": "358bbd0a",
3169 | "metadata": {},
3170 | "source": [
3171 | "> In a 2-D array, the index `[1:3, 1]` (equivalently `[[1, 2], 1]`) means rows 1 and 2, column 1: it selects the elements where those rows intersect that column."
3172 | ]
3173 | },
3174 | {
3175 | "cell_type": "code",
3176 | "execution_count": null,
3177 | "id": "23c611e2",
3178 | "metadata": {},
3179 | "outputs": [
3180 | {
3181 | "data": {
3182 | "text/plain": [
3183 | "array([4, 6])"
3184 | ]
3185 | },
3186 | "metadata": {},
3187 | "output_type": "display_data"
3188 | }
3189 | ],
3190 | "source": [
3191 | "x[1:3, 1]"
3192 | ]
3193 | },
3194 | {
3195 | "cell_type": "markdown",
3196 | "id": "6509f172",
3197 | "metadata": {},
3198 | "source": [
3199 | "> In a 2-D array, the index `[1:3]` gives only a row slice and no column index, so it selects rows 1 and 2 with all of their columns."
3200 | ]
3201 | },
3202 | {
3203 | "cell_type": "code",
3204 | "execution_count": null,
3205 | "id": "df808d95",
3206 | "metadata": {},
3207 | "outputs": [
3208 | {
3209 | "data": {
3210 | "text/plain": [
3211 | "array([[3, 4],\n",
3212 | " [5, 6]])"
3213 | ]
3214 | },
3215 | "metadata": {},
3216 | "output_type": "display_data"
3217 | }
3218 | ],
3219 | "source": [
3220 | "x[1:3]"
3221 | ]
3222 | },
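The three indexing forms described above can be compared side by side. This is a small sketch on the same `x`; the variable names are illustrative, and `np.shares_memory` is used only to show that a slice is a view of `x` while a list of indices produces a copy.

```python
import numpy as np

x = np.array([[1, 2], [3, 4], [5, 6]])

single = x[0, 1]          # one element: row 0, column 1
col_slice = x[1:3, 1]     # rows 1 and 2 of column 1, selected with a slice
col_fancy = x[[1, 2], 1]  # the same elements, selected with a list of row indices
rows = x[1:3]             # rows 1 and 2, all columns

print(single)       # 2
print(col_slice)    # [4 6]
print(col_fancy)    # [4 6]
print(rows.shape)   # (2, 2)

print(np.shares_memory(x, col_slice))  # True: basic slicing returns a view
print(np.shares_memory(x, col_fancy))  # False: indexing with a list returns a copy
```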
3223 | {
3224 | "cell_type": "markdown",
3225 | "id": "507df4ee",
3226 | "metadata": {},
3227 | "source": [
3228 | "> Maximum and Minimum in 2-D Array"
3229 | ]
3230 | },
3231 | {
3232 | "cell_type": "code",
3233 | "execution_count": null,
3234 | "id": "8822fae0",
3235 | "metadata": {},
3236 | "outputs": [
3237 | {
3238 | "data": {
3239 | "text/plain": [
3240 | "6"
3241 | ]
3242 | },
3243 | "metadata": {},
3244 | "output_type": "display_data"
3245 | }
3246 | ],
3247 | "source": [
3248 | "x.max()"
3249 | ]
3250 | },
3251 | {
3252 | "cell_type": "code",
3253 | "execution_count": null,
3254 | "id": "1ddaa1ff",
3255 | "metadata": {},
3256 | "outputs": [
3257 | {
3258 | "data": {
3259 | "text/plain": [
3260 | "1"
3261 | ]
3262 | },
3263 | "metadata": {},
3264 | "output_type": "display_data"
3265 | }
3266 | ],
3267 | "source": [
3268 | "x.min()"
3269 | ]
3270 | },
3271 | {
3272 | "cell_type": "markdown",
3273 | "id": "19dd4b88",
3274 | "metadata": {},
3275 | "source": [
3276 | "> With `axis=0`, the maximum is taken down the rows of each column, giving one value per column.\n",
3277 | ">\n",
3278 | "> **Important: `axis=0` scans vertically (down each column); the result is a 1-D array with one maximum per column.**\n"
3279 | ]
3280 | },
3281 | {
3282 | "cell_type": "code",
3283 | "execution_count": null,
3284 | "id": "a959bee2",
3285 | "metadata": {},
3286 | "outputs": [
3287 | {
3288 | "data": {
3289 | "text/plain": [
3290 | "array([5, 6])"
3291 | ]
3292 | },
3293 | "metadata": {},
3294 | "output_type": "display_data"
3295 | }
3296 | ],
3297 | "source": [
3298 | "x.max(axis=0)"
3299 | ]
3300 | },
3301 | {
3302 | "cell_type": "markdown",
3303 | "id": "3b613838",
3304 | "metadata": {},
3305 | "source": [
3306 | "> With `axis=1`, the maximum is taken across the columns of each row, giving one value per row.\n",
3307 | ">\n",
3308 | "> **Important: `axis=1` scans horizontally (along each row); the result is a 1-D array with one maximum per row.**"
3309 | ]
3310 | },
3311 | {
3312 | "cell_type": "code",
3313 | "execution_count": null,
3314 | "id": "b5286d6f",
3315 | "metadata": {},
3316 | "outputs": [
3317 | {
3318 | "data": {
3319 | "text/plain": [
3320 | "array([2, 4, 6])"
3321 | ]
3322 | },
3323 | "metadata": {},
3324 | "output_type": "display_data"
3325 | }
3326 | ],
3327 | "source": [
3328 | "x.max(axis=1)"
3329 | ]
3330 | },
3331 | {
3332 | "cell_type": "markdown",
3333 | "id": "c700c78a",
3334 | "metadata": {},
3335 | "source": [
3336 | "> aggregate matrices the same way you aggregated vectors"
3337 | ]
3338 | },
3339 | {
3340 | "cell_type": "code",
3341 | "execution_count": null,
3342 | "id": "f672a8b5",
3343 | "metadata": {},
3344 | "outputs": [
3345 | {
3346 | "data": {
3347 | "text/plain": [
3348 | "21"
3349 | ]
3350 | },
3351 | "metadata": {},
3352 | "output_type": "display_data"
3353 | }
3354 | ],
3355 | "source": [
3356 | "x.sum()"
3357 | ]
3358 | },
3359 | {
3360 | "cell_type": "markdown",
3361 | "id": "4474171d",
3362 | "metadata": {},
3363 | "source": [
3364 | "## Section 3.6: Random, Reverse, Reshape & Transpose of an Array"
3365 | ]
3366 | },
3367 | {
3368 | "cell_type": "markdown",
3369 | "id": "83fbef5d",
3370 | "metadata": {},
3371 | "source": [
3372 | "> the simplest way to generate random numbers"
3373 | ]
3374 | },
3375 | {
3376 | "cell_type": "code",
3377 | "execution_count": null,
3378 | "id": "e968f798",
3379 | "metadata": {},
3380 | "outputs": [
3381 | {
3382 | "data": {
3383 | "text/plain": [
3384 | "array([0.63696169, 0.26978671, 0.04097352, 0.01652764, 0.81327024])"
3385 | ]
3386 | },
3387 | "metadata": {},
3388 | "output_type": "display_data"
3389 | }
3390 | ],
3391 | "source": [
3392 | "r = np.random.default_rng(0)\n",
3393 | "r.random(5)"
3394 | ]
3395 | },
3396 | {
3397 | "cell_type": "markdown",
3398 | "id": "3257cbab",
3399 | "metadata": {},
3400 | "source": [
3401 | "> ones(), zeros(), and random() to create a 2D array"
3402 | ]
3403 | },
3404 | {
3405 | "cell_type": "code",
3406 | "execution_count": null,
3407 | "id": "a061ced1",
3408 | "metadata": {},
3409 | "outputs": [
3410 | {
3411 | "data": {
3412 | "text/plain": [
3413 | "array([[0., 0.],\n",
3414 | " [0., 0.],\n",
3415 | " [0., 0.]])"
3416 | ]
3417 | },
3418 | "metadata": {},
3419 | "output_type": "display_data"
3420 | }
3421 | ],
3422 | "source": [
3423 | "a = np.zeros((3,2))\n",
3424 | "a"
3425 | ]
3426 | },
3427 | {
3428 | "cell_type": "code",
3429 | "execution_count": null,
3430 | "id": "de137a40",
3431 | "metadata": {},
3432 | "outputs": [
3433 | {
3434 | "data": {
3435 | "text/plain": [
3436 | "array([[1., 1., 1., 1.],\n",
3437 | " [1., 1., 1., 1.],\n",
3438 | " [1., 1., 1., 1.]])"
3439 | ]
3440 | },
3441 | "metadata": {},
3442 | "output_type": "display_data"
3443 | }
3444 | ],
3445 | "source": [
3446 | "b = np.ones((3,4))\n",
3447 | "b"
3448 | ]
3449 | },
3450 | {
3451 | "cell_type": "code",
3452 | "execution_count": null,
3453 | "id": "40a7fee3",
3454 | "metadata": {},
3455 | "outputs": [
3456 | {
3457 | "data": {
3458 | "text/plain": [
3459 | "array([[0.91275558, 0.60663578, 0.72949656, 0.54362499, 0.93507242],\n",
3460 | " [0.81585355, 0.0027385 , 0.85740428, 0.03358558, 0.72965545],\n",
3461 | " [0.17565562, 0.86317892, 0.54146122, 0.29971189, 0.42268722],\n",
3462 | " [0.02831967, 0.12428328, 0.67062441, 0.64718951, 0.61538511]])"
3463 | ]
3464 | },
3465 | "metadata": {},
3466 | "output_type": "display_data"
3467 | }
3468 | ],
3469 | "source": [
3470 | "r.random((4,5))"
3471 | ]
3472 | },
3473 | {
3474 | "cell_type": "markdown",
3475 | "id": "63c991ee",
3476 | "metadata": {},
3477 | "source": [
3478 | "> unique items in a 1D array (each value listed once, in sorted order)"
3479 | ]
3480 | },
3481 | {
3482 | "cell_type": "code",
3483 | "execution_count": null,
3484 | "id": "291fce71",
3485 | "metadata": {},
3486 | "outputs": [
3487 | {
3488 | "data": {
3489 | "text/plain": [
3490 | "array([11, 12, 13, 14, 15, 16, 17, 18, 19, 20])"
3491 | ]
3492 | },
3493 | "metadata": {},
3494 | "output_type": "display_data"
3495 | }
3496 | ],
3497 | "source": [
3498 | "a = np.array([11, 11, 12, 13, 14, 15, 16, 17, 12, 13, 11, 14, 18, 19, 20])\n",
3499 | "b = np.unique(a)\n",
3500 | "b"
3501 | ]
3502 | },
3503 | {
3504 | "cell_type": "markdown",
3505 | "id": "8bb8e957",
3506 | "metadata": {},
3507 | "source": [
3508 | "> Indices of the first occurrence of each unique value in the original array"
3509 | ]
3510 | },
3511 | {
3512 | "cell_type": "code",
3513 | "execution_count": null,
3514 | "id": "55812132",
3515 | "metadata": {},
3516 | "outputs": [
3517 | {
3518 | "data": {
3519 | "text/plain": [
3520 | "(array([11, 12, 13, 14, 15, 16, 17, 18, 19, 20]),\n",
3521 | " array([ 0, 2, 3, 4, 5, 6, 7, 12, 13, 14], dtype=int64))"
3522 | ]
3523 | },
3524 | "metadata": {},
3525 | "output_type": "display_data"
3526 | }
3527 | ],
3528 | "source": [
3529 | "b = np.unique(a, return_index = True)\n",
3530 | "b"
3531 | ]
3532 | },
3533 | {
3534 | "cell_type": "markdown",
3535 | "id": "3ee98a43",
3536 | "metadata": {},
3537 | "source": [
3538 | "> Frequency count of unique values in a NumPy array"
3539 | ]
3540 | },
3541 | {
3542 | "cell_type": "code",
3543 | "execution_count": null,
3544 | "id": "a4966c83",
3545 | "metadata": {},
3546 | "outputs": [
3547 | {
3548 | "data": {
3549 | "text/plain": [
3550 | "(array([11, 12, 13, 14, 15, 16, 17, 18, 19, 20]),\n",
3551 | " array([3, 2, 2, 2, 1, 1, 1, 1, 1, 1], dtype=int64))"
3552 | ]
3553 | },
3554 | "metadata": {},
3555 | "output_type": "display_data"
3556 | }
3557 | ],
3558 | "source": [
3559 | "b = np.unique(a, return_counts = True)\n",
3560 | "b"
3561 | ]
3562 | },
3563 | {
3564 | "cell_type": "markdown",
3565 | "id": "212321f3",
3566 | "metadata": {},
3567 | "source": [
3568 | "> unique items in 2D array"
3569 | ]
3570 | },
3571 | {
3572 | "cell_type": "code",
3573 | "execution_count": null,
3574 | "id": "7ce7d707",
3575 | "metadata": {},
3576 | "outputs": [
3577 | {
3578 | "data": {
3579 | "text/plain": [
3580 | "array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])"
3581 | ]
3582 | },
3583 | "metadata": {},
3584 | "output_type": "display_data"
3585 | }
3586 | ],
3587 | "source": [
3588 | "a_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [1, 2, 3, 4]])\n",
3589 | "b_2d = np.unique(a_2d)\n",
3590 | "b_2d"
3591 | ]
3592 | },
3593 | {
3594 | "cell_type": "code",
3595 | "execution_count": null,
3596 | "id": "94ac209e",
3597 | "metadata": {},
3598 | "outputs": [
3599 | {
3600 | "data": {
3601 | "text/plain": [
3602 | "(array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]),\n",
3603 | " array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], dtype=int64),\n",
3604 | " array([2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1], dtype=int64))"
3605 | ]
3606 | },
3607 | "metadata": {},
3608 | "output_type": "display_data"
3609 | }
3610 | ],
3611 | "source": [
3612 | "b_2d = np.unique(a_2d, return_index = True, return_counts = True)\n",
3613 | "b_2d"
3614 | ]
3615 | },
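By default `np.unique` flattens a 2-D array before looking for unique values, which is why the results above are 1-D. If unique rows are wanted instead, the `axis` argument can be passed. A minimal sketch on the same `a_2d`, with illustrative variable names:

```python
import numpy as np

a_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [1, 2, 3, 4]])

# axis=0 treats each row as one item instead of flattening the array.
unique_rows, first_index, counts = np.unique(
    a_2d, axis=0, return_index=True, return_counts=True
)

print(unique_rows)   # the three distinct rows, in sorted order
print(first_index)   # [0 1 2]: row index where each distinct row first appears
print(counts)        # [2 1 1]: the row [1, 2, 3, 4] appears twice
```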
3616 | {
3617 | "cell_type": "markdown",
3618 | "id": "657cd665",
3619 | "metadata": {},
3620 | "source": [
3621 | ">Transposing and reshaping a matrix"
3622 | ]
3623 | },
3624 | {
3625 | "cell_type": "code",
3626 | "execution_count": null,
3627 | "id": "93d0303b",
3628 | "metadata": {},
3629 | "outputs": [
3630 | {
3631 | "data": {
3632 | "text/plain": [
3633 | "[[1, 2, 3], [4, 5, 6]]"
3634 | ]
3635 | },
3636 | "metadata": {},
3637 | "output_type": "display_data"
3638 | }
3639 | ],
3640 | "source": [
3641 | "a = ([[1, 2, 3],\n",
3642 | " [4, 5, 6]])\n",
3643 | "a"
3644 | ]
3645 | },
3646 | {
3647 | "cell_type": "code",
3648 | "execution_count": null,
3649 | "id": "b41d7309",
3650 | "metadata": {},
3651 | "outputs": [
3652 | {
3653 | "data": {
3654 | "text/plain": [
3655 | "(2, 3)"
3656 | ]
3657 | },
3658 | "metadata": {},
3659 | "output_type": "display_data"
3660 | }
3661 | ],
3662 | "source": [
3663 | "np.shape(a)"
3664 | ]
3665 | },
3666 | {
3667 | "cell_type": "code",
3668 | "execution_count": null,
3669 | "id": "5cb790e8",
3670 | "metadata": {},
3671 | "outputs": [
3672 | {
3673 | "data": {
3674 | "text/plain": [
3675 | "array([[0, 1, 2],\n",
3676 | " [3, 4, 5]])"
3677 | ]
3678 | },
3679 | "metadata": {},
3680 | "output_type": "display_data"
3681 | }
3682 | ],
3683 | "source": [
3684 | "a = np.arange(6).reshape((2, 3))\n",
3685 | "a"
3686 | ]
3687 | },
3688 | {
3689 | "cell_type": "code",
3690 | "execution_count": null,
3691 | "id": "54cb9b2e",
3692 | "metadata": {},
3693 | "outputs": [
3694 | {
3695 | "data": {
3696 | "text/plain": [
3697 | "array([[0, 3],\n",
3698 | " [1, 4],\n",
3699 | " [2, 5]])"
3700 | ]
3701 | },
3702 | "metadata": {},
3703 | "output_type": "display_data"
3704 | }
3705 | ],
3706 | "source": [
3707 | "a.transpose()"
3708 | ]
3709 | },
3710 | {
3711 | "cell_type": "markdown",
3712 | "id": "eee5576d",
3713 | "metadata": {},
3714 | "source": [
3715 | "> reverse a 1D array"
3716 | ]
3717 | },
3718 | {
3719 | "cell_type": "code",
3720 | "execution_count": null,
3721 | "id": "84892e2b",
3722 | "metadata": {},
3723 | "outputs": [
3724 | {
3725 | "data": {
3726 | "text/plain": [
3727 | "array([1, 2, 3, 4, 5, 6, 7, 8])"
3728 | ]
3729 | },
3730 | "metadata": {},
3731 | "output_type": "display_data"
3732 | }
3733 | ],
3734 | "source": [
3735 | "a = np.array([1, 2, 3, 4, 5, 6, 7, 8])\n",
3736 | "a"
3737 | ]
3738 | },
3739 | {
3740 | "cell_type": "code",
3741 | "execution_count": null,
3742 | "id": "4e0f7eef",
3743 | "metadata": {},
3744 | "outputs": [
3745 | {
3746 | "data": {
3747 | "text/plain": [
3748 | "array([8, 7, 6, 5, 4, 3, 2, 1])"
3749 | ]
3750 | },
3751 | "metadata": {},
3752 | "output_type": "display_data"
3753 | }
3754 | ],
3755 | "source": [
3756 | "b= np.flip(a)\n",
3757 | "b"
3758 | ]
3759 | },
3760 | {
3761 | "cell_type": "markdown",
3762 | "id": "7664b607",
3763 | "metadata": {},
3764 | "source": [
3765 | "> reverse a 2D array (with no `axis`, both the row order and the column order are reversed)"
3766 | ]
3767 | },
3768 | {
3769 | "cell_type": "code",
3770 | "execution_count": null,
3771 | "id": "ebbea515",
3772 | "metadata": {},
3773 | "outputs": [
3774 | {
3775 | "data": {
3776 | "text/plain": [
3777 | "array([[ 1, 2, 3, 4],\n",
3778 | " [ 5, 6, 7, 8],\n",
3779 | " [ 9, 10, 11, 12]])"
3780 | ]
3781 | },
3782 | "metadata": {},
3783 | "output_type": "display_data"
3784 | }
3785 | ],
3786 | "source": [
3787 | "a_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])\n",
3788 | "a_2d"
3789 | ]
3790 | },
3791 | {
3792 | "cell_type": "code",
3793 | "execution_count": null,
3794 | "id": "e4cca575",
3795 | "metadata": {},
3796 | "outputs": [
3797 | {
3798 | "data": {
3799 | "text/plain": [
3800 | "array([[12, 11, 10, 9],\n",
3801 | " [ 8, 7, 6, 5],\n",
3802 | " [ 4, 3, 2, 1]])"
3803 | ]
3804 | },
3805 | "metadata": {},
3806 | "output_type": "display_data"
3807 | }
3808 | ],
3809 | "source": [
3810 | "b_2d = np.flip(a_2d)\n",
3811 | "b_2d"
3812 | ]
3813 | },
3814 | {
3815 | "cell_type": "markdown",
3816 | "id": "0b788cff",
3817 | "metadata": {},
3818 | "source": [
3819 | "> reverse only the order of the rows (`axis = 0`): the last row becomes the first, and each row's contents are unchanged"
3820 | ]
3821 | },
3822 | {
3823 | "cell_type": "code",
3824 | "execution_count": null,
3825 | "id": "0ebc3d00",
3826 | "metadata": {},
3827 | "outputs": [
3828 | {
3829 | "data": {
3830 | "text/plain": [
3831 | "array([[ 9, 10, 11, 12],\n",
3832 | " [ 5, 6, 7, 8],\n",
3833 | " [ 1, 2, 3, 4]])"
3834 | ]
3835 | },
3836 | "metadata": {},
3837 | "output_type": "display_data"
3838 | }
3839 | ],
3840 | "source": [
3841 | "b_2d = np.flip(a_2d, axis = 0)\n",
3842 | "b_2d"
3843 | ]
3844 | },
3845 | {
3846 | "cell_type": "markdown",
3847 | "id": "7be0813b",
3848 | "metadata": {},
3849 | "source": [
3850 | "> reverse only the order of the columns (`axis = 1`): the last column becomes the first, and the rows stay in place"
3851 | ]
3852 | },
3853 | {
3854 | "cell_type": "code",
3855 | "execution_count": null,
3856 | "id": "452fe903",
3857 | "metadata": {},
3858 | "outputs": [
3859 | {
3860 | "data": {
3861 | "text/plain": [
3862 | "array([[ 4, 3, 2, 1],\n",
3863 | " [ 8, 7, 6, 5],\n",
3864 | " [12, 11, 10, 9]])"
3865 | ]
3866 | },
3867 | "metadata": {},
3868 | "output_type": "display_data"
3869 | }
3870 | ],
3871 | "source": [
3872 | "b_2d = np.flip(a_2d, axis = 1)\n",
3873 | "b_2d"
3874 | ]
3875 | },
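One way to keep the two axes straight is to compare `np.flip` with the equivalent reverse slices. This is a sketch for illustration only; the slicing forms are standard NumPy indexing, not something the notebook itself uses.

```python
import numpy as np

a_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

rows_reversed = np.flip(a_2d, axis=0)   # same as a_2d[::-1]: row order reversed
cols_reversed = np.flip(a_2d, axis=1)   # same as a_2d[:, ::-1]: column order reversed
both_reversed = np.flip(a_2d)           # same as a_2d[::-1, ::-1]

print(np.array_equal(rows_reversed, a_2d[::-1]))        # True
print(np.array_equal(cols_reversed, a_2d[:, ::-1]))     # True
print(np.array_equal(both_reversed, a_2d[::-1, ::-1]))  # True
```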
3876 | {
3877 | "cell_type": "markdown",
3878 | "id": "dba11c66",
3879 | "metadata": {},
3880 | "source": [
3881 | "> reverse the contents of only one column or row"
3882 | ]
3883 | },
3884 | {
3885 | "cell_type": "code",
3886 | "execution_count": null,
3887 | "id": "999f7de9",
3888 | "metadata": {},
3889 | "outputs": [
3890 | {
3891 | "data": {
3892 | "text/plain": [
3893 | "array([[ 1, 2, 3, 4],\n",
3894 | " [ 5, 6, 7, 8],\n",
3895 | " [ 9, 10, 11, 12]])"
3896 | ]
3897 | },
3898 | "metadata": {},
3899 | "output_type": "display_data"
3900 | }
3901 | ],
3902 | "source": [
3903 | "a_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])\n",
3904 | "a_2d"
3905 | ]
3906 | },
3907 | {
3908 | "cell_type": "markdown",
3909 | "id": "4c41ec50",
3910 | "metadata": {},
3911 | "source": [
3912 | "> reverse the contents of the row at index position 1"
3913 | ]
3914 | },
3915 | {
3916 | "cell_type": "code",
3917 | "execution_count": null,
3918 | "id": "4c0139df",
3919 | "metadata": {},
3920 | "outputs": [
3921 | {
3922 | "data": {
3923 | "text/plain": [
3924 | "array([[ 1, 2, 3, 4],\n",
3925 | " [ 8, 7, 6, 5],\n",
3926 | " [ 9, 10, 11, 12]])"
3927 | ]
3928 | },
3929 | "metadata": {},
3930 | "output_type": "display_data"
3931 | }
3932 | ],
3933 | "source": [
3934 | "a_2d[1] = np.flip(a_2d[1])\n",
3935 | "a_2d"
3936 | ]
3937 | },
3938 | {
3939 | "cell_type": "markdown",
3940 | "id": "258f256f",
3941 | "metadata": {},
3942 | "source": [
3943 | "> reverse the contents of the column at index position 0"
3944 | ]
3945 | },
3946 | {
3947 | "cell_type": "code",
3948 | "execution_count": null,
3949 | "id": "872c44f1",
3950 | "metadata": {},
3951 | "outputs": [
3952 | {
3953 | "data": {
3954 | "text/plain": [
3955 | "array([[ 9, 2, 3, 4],\n",
3956 | " [ 5, 6, 7, 8],\n",
3957 | " [ 1, 10, 11, 12]])"
3958 | ]
3959 | },
3960 | "metadata": {},
3961 | "output_type": "display_data"
3962 | }
3963 | ],
3964 | "source": [
3965 | "a_2d[:,0] = np.flip(a_2d[:,0])\n",
3966 | "a_2d"
3967 | ]
3968 | },
3969 | {
3970 | "cell_type": "markdown",
3971 | "id": "4610dd70",
3972 | "metadata": {},
3973 | "source": [
3974 | "## Section 3.7: Reshaping and Flattening Multidimensional Arrays"
3975 | ]
3976 | },
3977 | {
3978 | "cell_type": "code",
3979 | "execution_count": null,
3980 | "id": "a8e69391",
3981 | "metadata": {},
3982 | "outputs": [
3983 | {
3984 | "data": {
3985 | "text/plain": [
3986 | "array([[ 1, 2, 3, 4],\n",
3987 | " [ 5, 6, 7, 8],\n",
3988 | " [ 9, 10, 11, 12]])"
3989 | ]
3990 | },
3991 | "metadata": {},
3992 | "output_type": "display_data"
3993 | }
3994 | ],
3995 | "source": [
3996 | "x = np.array([[1 , 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])\n",
3997 | "x"
3998 | ]
3999 | },
4000 | {
4001 | "cell_type": "markdown",
4002 | "id": "faa16b7b",
4003 | "metadata": {},
4004 | "source": [
4005 | "> `flatten` always returns a copy, so changes to the new array do not change the parent array"
4006 | ]
4007 | },
4008 | {
4009 | "cell_type": "code",
4010 | "execution_count": null,
4011 | "id": "a1dd74e2",
4012 | "metadata": {},
4013 | "outputs": [
4014 | {
4015 | "data": {
4016 | "text/plain": [
4017 | "array([ 1, 22, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])"
4018 | ]
4019 | },
4020 | "metadata": {},
4021 | "output_type": "display_data"
4022 | }
4023 | ],
4024 | "source": [
4025 | "x1 = x.flatten()\n",
4026 | "x1[1] = 22\n",
4027 | "x1"
4028 | ]
4029 | },
4030 | {
4031 | "cell_type": "code",
4032 | "execution_count": null,
4033 | "id": "8fafc4d1",
4034 | "metadata": {},
4035 | "outputs": [
4036 | {
4037 | "data": {
4038 | "text/plain": [
4039 | "array([[ 1, 2, 3, 4],\n",
4040 | " [ 5, 6, 7, 8],\n",
4041 | " [ 9, 10, 11, 12]])"
4042 | ]
4043 | },
4044 | "metadata": {},
4045 | "output_type": "display_data"
4046 | }
4047 | ],
4048 | "source": [
4049 | "x"
4050 | ]
4051 | },
4052 | {
4053 | "cell_type": "markdown",
4054 | "id": "76dcb73f",
4055 | "metadata": {},
4056 | "source": [
4057 | "> `ravel` returns a view of the parent array whenever the memory layout allows it, so changes made to the new array will then also affect the parent array."
4058 | ]
4059 | },
4060 | {
4061 | "cell_type": "code",
4062 | "execution_count": null,
4063 | "id": "9557b503",
4064 | "metadata": {},
4065 | "outputs": [
4066 | {
4067 | "data": {
4068 | "text/plain": [
4069 | "array([ 1, 25, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])"
4070 | ]
4071 | },
4072 | "metadata": {},
4073 | "output_type": "display_data"
4074 | }
4075 | ],
4076 | "source": [
4077 | "x1 = x.ravel()\n",
4078 | "x1[1] = 25\n",
4079 | "x1"
4080 | ]
4081 | },
4082 | {
4083 | "cell_type": "code",
4084 | "execution_count": null,
4085 | "id": "861a2ad0",
4086 | "metadata": {},
4087 | "outputs": [
4088 | {
4089 | "data": {
4090 | "text/plain": [
4091 | "array([[ 1, 25, 3, 4],\n",
4092 | " [ 5, 6, 7, 8],\n",
4093 | " [ 9, 10, 11, 12]])"
4094 | ]
4095 | },
4096 | "metadata": {},
4097 | "output_type": "display_data"
4098 | }
4099 | ],
4100 | "source": [
4101 | "x"
4102 | ]
4103 | },
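Whether a flattened array shares memory with its parent can be checked directly with `np.shares_memory`. The sketch below assumes the same `x` as above; the transposed case is an added illustration showing that `ravel` falls back to a copy when the requested order does not match the memory layout.

```python
import numpy as np

x = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

flat_copy = x.flatten()   # always a new array
flat_view = x.ravel()     # a view here, because x is stored contiguously in row order
t_ravel = x.T.ravel()     # transposed layout cannot be viewed as 1-D in row order

print(np.shares_memory(x, flat_copy))  # False
print(np.shares_memory(x, flat_view))  # True
print(np.shares_memory(x, t_ravel))    # False: ravel had to copy

flat_view[1] = 25
print(x[0, 1])   # 25: the write went through to the parent
```

If the result must never alias the parent, `flatten` (or an explicit `.copy()`) is the safer choice.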
4104 | {
4105 | "cell_type": "markdown",
4106 | "metadata": {},
4107 | "source": [
4108 | "-----------------\n",
4109 | "-----------------"
4110 | ]
4111 | }
4112 | ],
4113 | "metadata": {
4114 | "language_info": {
4115 | "name": "python"
4116 | },
4117 | "orig_nbformat": 4
4118 | },
4119 | "nbformat": 4,
4120 | "nbformat_minor": 2
4121 | }
4122 |
--------------------------------------------------------------------------------
/4- Null Values with Pandas.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 2,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "import pandas as pd\n",
10 | "import numpy as np\n",
11 | "import seaborn as sns"
12 | ]
13 | },
14 | {
15 | "cell_type": "code",
16 | "execution_count": 69,
17 | "metadata": {},
18 | "outputs": [],
19 | "source": [
20 | "df = sns.load_dataset('titanic')"
21 | ]
22 | },
23 | {
24 | "cell_type": "code",
25 | "execution_count": 70,
26 | "metadata": {},
27 | "outputs": [
28 | {
29 | "data": {
161 | "text/plain": [
162 | " survived pclass sex age sibsp parch fare embarked class \\\n",
163 | "0 0 3 male 22.0 1 0 7.2500 S Third \n",
164 | "1 1 1 female 38.0 1 0 71.2833 C First \n",
165 | "2 1 3 female 26.0 0 0 7.9250 S Third \n",
166 | "3 1 1 female 35.0 1 0 53.1000 S First \n",
167 | "4 0 3 male 35.0 0 0 8.0500 S Third \n",
168 | "\n",
169 | " who adult_male deck embark_town alive alone \n",
170 | "0 man True NaN Southampton no False \n",
171 | "1 woman False C Cherbourg yes False \n",
172 | "2 woman False NaN Southampton yes True \n",
173 | "3 woman False C Southampton yes False \n",
174 | "4 man True NaN Southampton no True "
175 | ]
176 | },
177 | "execution_count": 70,
178 | "metadata": {},
179 | "output_type": "execute_result"
180 | }
181 | ],
182 | "source": [
183 | "df.head()"
184 | ]
185 | },
186 | {
187 | "cell_type": "code",
188 | "execution_count": 71,
189 | "metadata": {},
190 | "outputs": [
191 | {
192 | "data": {
193 | "text/plain": [
194 | "survived 0\n",
195 | "pclass 0\n",
196 | "sex 0\n",
197 | "age 177\n",
198 | "sibsp 0\n",
199 | "parch 0\n",
200 | "fare 0\n",
201 | "embarked 2\n",
202 | "class 0\n",
203 | "who 0\n",
204 | "adult_male 0\n",
205 | "deck 688\n",
206 | "embark_town 2\n",
207 | "alive 0\n",
208 | "alone 0\n",
209 | "dtype: int64"
210 | ]
211 | },
212 | "execution_count": 71,
213 | "metadata": {},
214 | "output_type": "execute_result"
215 | }
216 | ],
217 | "source": [
218 | "df.isnull().sum()"
219 | ]
220 | },
221 | {
222 | "cell_type": "code",
223 | "execution_count": 72,
224 | "metadata": {},
225 | "outputs": [
226 | {
227 | "name": "stdout",
228 | "output_type": "stream",
229 | "text": [
230 | "<class 'pandas.core.frame.DataFrame'>\n",
231 | "RangeIndex: 891 entries, 0 to 890\n",
232 | "Data columns (total 15 columns):\n",
233 | " # Column Non-Null Count Dtype \n",
234 | "--- ------ -------------- ----- \n",
235 | " 0 survived 891 non-null int64 \n",
236 | " 1 pclass 891 non-null int64 \n",
237 | " 2 sex 891 non-null object \n",
238 | " 3 age 714 non-null float64 \n",
239 | " 4 sibsp 891 non-null int64 \n",
240 | " 5 parch 891 non-null int64 \n",
241 | " 6 fare 891 non-null float64 \n",
242 | " 7 embarked 889 non-null object \n",
243 | " 8 class 891 non-null category\n",
244 | " 9 who 891 non-null object \n",
245 | " 10 adult_male 891 non-null bool \n",
246 | " 11 deck 203 non-null category\n",
247 | " 12 embark_town 889 non-null object \n",
248 | " 13 alive 891 non-null object \n",
249 | " 14 alone 891 non-null bool \n",
250 | "dtypes: bool(2), category(2), float64(2), int64(4), object(5)\n",
251 | "memory usage: 80.7+ KB\n"
252 | ]
253 | }
254 | ],
255 | "source": [
256 | "df.info()"
257 | ]
258 | },
259 | {
260 | "cell_type": "markdown",
261 | "metadata": {},
262 | "source": [
263 | "## Dealing with missing values using pandas\n",
264 | "1. Mean/Mode Method\n",
265 | "2. Forward Fill Method\n",
266 | "3. Backward Fill Method"
267 | ]
268 | },
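The three methods listed above can be previewed on a tiny Series before they are applied to the Titanic data. This is a minimal sketch with made-up values; the Series `s` and the variable names are hypothetical, and only the standard pandas methods `fillna`, `ffill`, and `bfill` are used.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two gaps in an otherwise numeric column.
s = pd.Series([10.0, np.nan, 30.0, np.nan, 50.0])

mean_filled = s.fillna(s.mean())  # mean of the non-missing values (10, 30, 50) is 30
forward_filled = s.ffill()        # each gap takes the last valid value before it
backward_filled = s.bfill()       # each gap takes the next valid value after it

print(mean_filled.tolist())       # [10.0, 30.0, 30.0, 30.0, 50.0]
print(forward_filled.tolist())    # [10.0, 10.0, 30.0, 30.0, 50.0]
print(backward_filled.tolist())   # [10.0, 30.0, 30.0, 50.0, 50.0]
```

For a categorical or text column, the mode plays the role of the mean; forward and backward fill depend on row order, so they make the most sense when the rows are ordered meaningfully.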
269 | {
270 | "cell_type": "markdown",
271 | "metadata": {},
272 | "source": [
273 | "### 1. Mean/Mode Method"
274 | ]
275 | },
276 | {
277 | "cell_type": "code",
278 | "execution_count": 73,
279 | "metadata": {},
280 | "outputs": [
281 | {
282 | "data": {
283 | "text/plain": [
284 | "'C'"
285 | ]
286 | },
287 | "execution_count": 73,
288 | "metadata": {},
289 | "output_type": "execute_result"
290 | }
291 | ],
292 | "source": [
293 | "from scipy.stats import mode\n",
294 | "deck_mode = df['deck'].mode()[0]  # Series.mode() skips NaN; recent SciPy versions no longer accept non-numeric data in scipy.stats.mode\n",
295 | "deck_mode"
296 | ]
297 | },
298 | {
299 | "cell_type": "code",
300 | "execution_count": 74,
301 | "metadata": {},
302 | "outputs": [],
303 | "source": [
304 | "df['deck'] = df['deck'].fillna(deck_mode)"
305 | ]
306 | },
307 | {
308 | "cell_type": "code",
309 | "execution_count": 75,
310 | "metadata": {},
311 | "outputs": [
312 | {
313 | "name": "stdout",
314 | "output_type": "stream",
315 | "text": [
316 | "<class 'pandas.core.frame.DataFrame'>\n",
317 | "RangeIndex: 891 entries, 0 to 890\n",
318 | "Data columns (total 15 columns):\n",
319 | " # Column Non-Null Count Dtype \n",
320 | "--- ------ -------------- ----- \n",
321 | " 0 survived 891 non-null int64 \n",
322 | " 1 pclass 891 non-null int64 \n",
323 | " 2 sex 891 non-null object \n",
324 | " 3 age 714 non-null float64 \n",
325 | " 4 sibsp 891 non-null int64 \n",
326 | " 5 parch 891 non-null int64 \n",
327 | " 6 fare 891 non-null float64 \n",
328 | " 7 embarked 889 non-null object \n",
329 | " 8 class 891 non-null category\n",
330 | " 9 who 891 non-null object \n",
331 | " 10 adult_male 891 non-null bool \n",
332 | " 11 deck 891 non-null category\n",
333 | " 12 embark_town 889 non-null object \n",
334 | " 13 alive 891 non-null object \n",
335 | " 14 alone 891 non-null bool \n",
336 | "dtypes: bool(2), category(2), float64(2), int64(4), object(5)\n",
337 | "memory usage: 80.7+ KB\n"
338 | ]
339 | }
340 | ],
341 | "source": [
342 | "df.info()"
343 | ]
344 | },
345 | {
346 | "cell_type": "code",
347 | "execution_count": 77,
348 | "metadata": {},
349 | "outputs": [
350 | {
351 | "data": {
483 | "text/plain": [
484 | " survived pclass sex age sibsp parch fare embarked class \\\n",
485 | "886 0 2 male 27.0 0 0 13.00 S Second \n",
486 | "887 1 1 female 19.0 0 0 30.00 S First \n",
487 | "888 0 3 female NaN 1 2 23.45 S Third \n",
488 | "889 1 1 male 26.0 0 0 30.00 C First \n",
489 | "890 0 3 male 32.0 0 0 7.75 Q Third \n",
490 | "\n",
491 | " who adult_male deck embark_town alive alone \n",
492 | "886 man True C Southampton no True \n",
493 | "887 woman False B Southampton yes True \n",
494 | "888 woman False C Southampton no False \n",
495 | "889 man True C Cherbourg yes True \n",
496 | "890 man True C Queenstown no True "
497 | ]
498 | },
499 | "execution_count": 77,
500 | "metadata": {},
501 | "output_type": "execute_result"
502 | }
503 | ],
504 | "source": [
505 | "df.tail()"
506 | ]
507 | },
508 | {
509 | "cell_type": "code",
510 | "execution_count": 81,
511 | "metadata": {},
512 | "outputs": [
513 | {
514 | "data": {
515 | "text/plain": [
516 | "'Southampton'"
517 | ]
518 | },
519 | "execution_count": 81,
520 | "metadata": {},
521 | "output_type": "execute_result"
522 | }
523 | ],
524 | "source": [
525 | "embark_town_mode = df['embark_town'].mode()[0]  # most frequent town; pandas ignores NaN here\n",
526 | "embark_town_mode"
527 | ]
528 | },
529 | {
530 | "cell_type": "code",
531 | "execution_count": 82,
532 | "metadata": {},
533 | "outputs": [],
534 | "source": [
535 | "df['embark_town'] = df['embark_town'].fillna(embark_town_mode)"
536 | ]
537 | },
538 | {
539 | "cell_type": "code",
540 | "execution_count": 83,
541 | "metadata": {},
542 | "outputs": [
543 | {
544 | "name": "stdout",
545 | "output_type": "stream",
546 | "text": [
547 | "<class 'pandas.core.frame.DataFrame'>\n",
548 | "RangeIndex: 891 entries, 0 to 890\n",
549 | "Data columns (total 15 columns):\n",
550 | " # Column Non-Null Count Dtype \n",
551 | "--- ------ -------------- ----- \n",
552 | " 0 survived 891 non-null int64 \n",
553 | " 1 pclass 891 non-null int64 \n",
554 | " 2 sex 891 non-null object \n",
555 | " 3 age 891 non-null float64 \n",
556 | " 4 sibsp 891 non-null int64 \n",
557 | " 5 parch 891 non-null int64 \n",
558 | " 6 fare 891 non-null float64 \n",
559 | " 7 embarked 889 non-null object \n",
560 | " 8 class 891 non-null category\n",
561 | " 9 who 891 non-null object \n",
562 | " 10 adult_male 891 non-null bool \n",
563 | " 11 deck 891 non-null category\n",
564 | " 12 embark_town 891 non-null object \n",
565 | " 13 alive 891 non-null object \n",
566 | " 14 alone 891 non-null bool \n",
567 | "dtypes: bool(2), category(2), float64(2), int64(4), object(5)\n",
568 | "memory usage: 80.7+ KB\n"
569 | ]
570 | }
571 | ],
572 | "source": [
573 | "df.info()"
574 | ]
575 | },
576 | {
577 | "cell_type": "code",
578 | "execution_count": 78,
579 | "metadata": {},
580 | "outputs": [
581 | {
582 | "data": {
583 | "text/plain": [
584 | "29.69911764705882"
585 | ]
586 | },
587 | "execution_count": 78,
588 | "metadata": {},
589 | "output_type": "execute_result"
590 | }
591 | ],
592 | "source": [
593 | "age_mean = np.mean(df['age'])\n",
594 | "age_mean"
595 | ]
596 | },
597 | {
598 | "cell_type": "code",
599 | "execution_count": 79,
600 | "metadata": {},
601 | "outputs": [],
602 | "source": [
603 | "df['age'] = df['age'].fillna(age_mean)"
604 | ]
605 | },
606 | {
607 | "cell_type": "code",
608 | "execution_count": 87,
609 | "metadata": {},
610 | "outputs": [
611 | {
612 | "data": {
613 | "text/plain": [
614 | "survived 0\n",
615 | "pclass 0\n",
616 | "sex 0\n",
617 | "age 0\n",
618 | "sibsp 0\n",
619 | "parch 0\n",
620 | "fare 0\n",
621 | "embarked 2\n",
622 | "class 0\n",
623 | "who 0\n",
624 | "adult_male 0\n",
625 | "deck 0\n",
626 | "embark_town 0\n",
627 | "alive 0\n",
628 | "alone 0\n",
629 | "dtype: int64"
630 | ]
631 | },
632 | "execution_count": 87,
633 | "metadata": {},
634 | "output_type": "execute_result"
635 | }
636 | ],
637 | "source": [
638 | "df.isnull().sum()"
639 | ]
640 | },
641 | {
642 | "cell_type": "code",
643 | "execution_count": 88,
644 | "metadata": {},
645 | "outputs": [
646 | {
647 | "name": "stdout",
648 | "output_type": "stream",
649 | "text": [
650 | "<class 'pandas.core.frame.DataFrame'>\n",
651 | "RangeIndex: 891 entries, 0 to 890\n",
652 | "Data columns (total 15 columns):\n",
653 | " # Column Non-Null Count Dtype \n",
654 | "--- ------ -------------- ----- \n",
655 | " 0 survived 891 non-null int64 \n",
656 | " 1 pclass 891 non-null int64 \n",
657 | " 2 sex 891 non-null object \n",
658 | " 3 age 891 non-null float64 \n",
659 | " 4 sibsp 891 non-null int64 \n",
660 | " 5 parch 891 non-null int64 \n",
661 | " 6 fare 891 non-null float64 \n",
662 | " 7 embarked 889 non-null object \n",
663 | " 8 class 891 non-null category\n",
664 | " 9 who 891 non-null object \n",
665 | " 10 adult_male 891 non-null bool \n",
666 | " 11 deck 891 non-null category\n",
667 | " 12 embark_town 891 non-null object \n",
668 | " 13 alive 891 non-null object \n",
669 | " 14 alone 891 non-null bool \n",
670 | "dtypes: bool(2), category(2), float64(2), int64(4), object(5)\n",
671 | "memory usage: 80.7+ KB\n"
672 | ]
673 | }
674 | ],
675 | "source": [
676 | "df.info()"
677 | ]
678 | },
679 | {
680 | "cell_type": "markdown",
681 | "metadata": {},
682 | "source": [
683 | "### 2. Forward Fill Method\n",
684 | "- The forward fill method replaces each NaN with the previous non-NaN value in the same column."
685 | ]
686 | },
687 | {
688 | "cell_type": "code",
689 | "execution_count": 99,
690 | "metadata": {},
691 | "outputs": [],
692 | "source": [
693 | "df1 = sns.load_dataset('titanic')"
694 | ]
695 | },
696 | {
697 | "cell_type": "code",
698 | "execution_count": 100,
699 | "metadata": {},
700 | "outputs": [
701 | {
702 | "data": {
703 | "text/plain": [
704 | "survived 0\n",
705 | "pclass 0\n",
706 | "sex 0\n",
707 | "age 177\n",
708 | "sibsp 0\n",
709 | "parch 0\n",
710 | "fare 0\n",
711 | "embarked 2\n",
712 | "class 0\n",
713 | "who 0\n",
714 | "adult_male 0\n",
715 | "deck 688\n",
716 | "embark_town 2\n",
717 | "alive 0\n",
718 | "alone 0\n",
719 | "dtype: int64"
720 | ]
721 | },
722 | "execution_count": 100,
723 | "metadata": {},
724 | "output_type": "execute_result"
725 | }
726 | ],
727 | "source": [
728 | "df1.isnull().sum()"
729 | ]
730 | },
731 | {
732 | "cell_type": "code",
733 | "execution_count": 101,
734 | "metadata": {},
735 | "outputs": [],
736 | "source": [
737 | "df1 = df1.ffill().head()  # forward fill, then keep only the first five rows"
738 | ]
739 | },
740 | {
741 | "cell_type": "code",
742 | "execution_count": 102,
743 | "metadata": {},
744 | "outputs": [
745 | {
746 | "data": {
747 | "text/plain": [
748 | "survived 0\n",
749 | "pclass 0\n",
750 | "sex 0\n",
751 | "age 0\n",
752 | "sibsp 0\n",
753 | "parch 0\n",
754 | "fare 0\n",
755 | "embarked 0\n",
756 | "class 0\n",
757 | "who 0\n",
758 | "adult_male 0\n",
759 | "deck 1\n",
760 | "embark_town 0\n",
761 | "alive 0\n",
762 | "alone 0\n",
763 | "dtype: int64"
764 | ]
765 | },
766 | "execution_count": 102,
767 | "metadata": {},
768 | "output_type": "execute_result"
769 | }
770 | ],
771 | "source": [
772 | "df1.isnull().sum()\n",
773 | "\n",
774 | "# One NaN remains in 'deck': forward fill copies the previous non-NaN value,\n",
775 | "# and the first row has no earlier value to copy from, so it stays empty."
776 | ]
777 | },
778 | {
779 | "cell_type": "code",
780 | "execution_count": 107,
781 | "metadata": {},
782 | "outputs": [
783 | {
784 | "data": {
916 | "text/plain": [
917 | " survived pclass sex age sibsp parch fare embarked class \\\n",
918 | "0 0 3 male 22.0 1 0 7.2500 S Third \n",
919 | "1 1 1 female 38.0 1 0 71.2833 C First \n",
920 | "2 1 3 female 26.0 0 0 7.9250 S Third \n",
921 | "3 1 1 female 35.0 1 0 53.1000 S First \n",
922 | "4 0 3 male 35.0 0 0 8.0500 S Third \n",
923 | "\n",
924 | " who adult_male deck embark_town alive alone \n",
925 | "0 man True NaN Southampton no False \n",
926 | "1 woman False C Cherbourg yes False \n",
927 | "2 woman False C Southampton yes True \n",
928 | "3 woman False C Southampton yes False \n",
929 | "4 man True C Southampton no True "
930 | ]
931 | },
932 | "execution_count": 107,
933 | "metadata": {},
934 | "output_type": "execute_result"
935 | }
936 | ],
937 | "source": [
938 | "df1.head()"
939 | ]
940 | },
941 | {
942 | "cell_type": "markdown",
943 | "metadata": {},
944 | "source": [
945 | "### 3. Backward Fill Method\n",
946 | "- The backward fill method replaces each NaN with the next non-NaN value in the same column."
947 | ]
948 | },
949 | {
950 | "cell_type": "code",
951 | "execution_count": 111,
952 | "metadata": {},
953 | "outputs": [],
954 | "source": [
955 | "df2 = sns.load_dataset('titanic')"
956 | ]
957 | },
958 | {
959 | "cell_type": "code",
960 | "execution_count": 112,
961 | "metadata": {},
962 | "outputs": [
963 | {
964 | "data": {
965 | "text/plain": [
966 | "survived 0\n",
967 | "pclass 0\n",
968 | "sex 0\n",
969 | "age 177\n",
970 | "sibsp 0\n",
971 | "parch 0\n",
972 | "fare 0\n",
973 | "embarked 2\n",
974 | "class 0\n",
975 | "who 0\n",
976 | "adult_male 0\n",
977 | "deck 688\n",
978 | "embark_town 2\n",
979 | "alive 0\n",
980 | "alone 0\n",
981 | "dtype: int64"
982 | ]
983 | },
984 | "execution_count": 112,
985 | "metadata": {},
986 | "output_type": "execute_result"
987 | }
988 | ],
989 | "source": [
990 | "df2.isnull().sum()"
991 | ]
992 | },
993 | {
994 | "cell_type": "code",
995 | "execution_count": 113,
996 | "metadata": {},
997 | "outputs": [],
998 | "source": [
999 | "df2 = df2.bfill().head()  # backward fill, then keep only the first five rows"
1000 | ]
1001 | },
1002 | {
1003 | "cell_type": "code",
1004 | "execution_count": 114,
1005 | "metadata": {},
1006 | "outputs": [
1007 | {
1008 | "data": {
1009 | "text/plain": [
1010 | "survived 0\n",
1011 | "pclass 0\n",
1012 | "sex 0\n",
1013 | "age 0\n",
1014 | "sibsp 0\n",
1015 | "parch 0\n",
1016 | "fare 0\n",
1017 | "embarked 0\n",
1018 | "class 0\n",
1019 | "who 0\n",
1020 | "adult_male 0\n",
1021 | "deck 0\n",
1022 | "embark_town 0\n",
1023 | "alive 0\n",
1024 | "alone 0\n",
1025 | "dtype: int64"
1026 | ]
1027 | },
1028 | "execution_count": 114,
1029 | "metadata": {},
1030 | "output_type": "execute_result"
1031 | }
1032 | ],
1033 | "source": [
1034 | "df2.isnull().sum()"
1035 | ]
1036 | },
1037 | {
1038 | "cell_type": "code",
1039 | "execution_count": 115,
1040 | "metadata": {},
1041 | "outputs": [
1042 | {
1043 | "data": {
1175 | "text/plain": [
1176 | " survived pclass sex age sibsp parch fare embarked class \\\n",
1177 | "0 0 3 male 22.0 1 0 7.2500 S Third \n",
1178 | "1 1 1 female 38.0 1 0 71.2833 C First \n",
1179 | "2 1 3 female 26.0 0 0 7.9250 S Third \n",
1180 | "3 1 1 female 35.0 1 0 53.1000 S First \n",
1181 | "4 0 3 male 35.0 0 0 8.0500 S Third \n",
1182 | "\n",
1183 | " who adult_male deck embark_town alive alone \n",
1184 | "0 man True C Southampton no False \n",
1185 | "1 woman False C Cherbourg yes False \n",
1186 | "2 woman False C Southampton yes True \n",
1187 | "3 woman False C Southampton yes False \n",
1188 | "4 man True E Southampton no True "
1189 | ]
1190 | },
1191 | "execution_count": 115,
1192 | "metadata": {},
1193 | "output_type": "execute_result"
1194 | }
1195 | ],
1196 | "source": [
1197 | "df2.tail()"
1198 | ]
1199 | }
1200 | ],
1201 | "metadata": {
1202 | "interpreter": {
1203 | "hash": "acd897d2d4d03b6ecdaa2334811ed2d11d458dee0f89e3986a2ce3f86f932de0"
1204 | },
1205 | "kernelspec": {
1206 | "display_name": "Python 3.10.0 64-bit",
1207 | "language": "python",
1208 | "name": "python3"
1209 | },
1210 | "language_info": {
1211 | "codemirror_mode": {
1212 | "name": "ipython",
1213 | "version": 3
1214 | },
1215 | "file_extension": ".py",
1216 | "mimetype": "text/x-python",
1217 | "name": "python",
1218 | "nbconvert_exporter": "python",
1219 | "pygments_lexer": "ipython3",
1220 | "version": "3.10.0"
1221 | },
1222 | "orig_nbformat": 4
1223 | },
1224 | "nbformat": 4,
1225 | "nbformat_minor": 2
1226 | }
1227 |
--------------------------------------------------------------------------------
/Readme.md:
--------------------------------------------------------------------------------
1 | 
2 |
3 | ## 🔧 Technologies & Tools
4 | 
5 | 
6 | 
7 | 
8 | 
9 | 
10 | 
11 | 
12 | 
13 | 
14 | 
15 | 
16 | 
17 | 
18 | 
19 |
20 | # Data Science using Python
21 | This repository starts with NumPy, a core library for data science, and explores the functions and uses of 1-dimensional, 2-dimensional, and 3-dimensional NumPy arrays. It then covers pandas, one of the most important data science libraries, and shows how its functions are used in exploratory data analysis, data cleaning, and data wrangling.
22 |
23 | In the Data Visualization section, we cover the main graphical views of our data, from basic graphs (line plots, bar plots, box plots, and histograms) to more advanced ones built with Plotly (violin, strip, scatter mapbox, scatter polar, bar polar, and treemap charts). We use Seaborn, Matplotlib, and Plotly, and explore how different components affect these visualizations.
24 |
25 | In the Data Preprocessing section, we learn how to handle raw data. Raw data first needs Exploratory Data Analysis (EDA): an initial investigation that uses summary statistics and graphical representations to discover patterns, find empty values, spot anomalies, test hypotheses, and check assumptions. After EDA, we discuss data wrangling, which involves processing data in various formats, analyzing it, and combining it with other data sets to produce valuable insights. It further includes data aggregation, data visualization, and training statistical models for prediction. Data wrangling is one of the most important steps of the data science process: the quality of a data analysis is only as good as the quality of the data itself, so maintaining data quality is essential.
26 |
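As a rough illustration of the first EDA pass described above, the sketch below counts empty values, prints summary statistics, and checks a correlation. The small `df` and its column names are made up for this example and are not taken from the notebooks.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for a raw dataset (not from the repository).
df = pd.DataFrame({
    "age": [22.0, 38.0, np.nan, 35.0, np.nan],
    "fare": [7.25, 71.28, 7.92, 53.10, 8.05],
    "town": ["Southampton", "Cherbourg", "Southampton", None, "Southampton"],
})

# Count the empty values in each column.
missing_per_column = df.isnull().sum()
print(missing_per_column)

# Summary statistics for the numeric columns (NaN values are skipped).
summary = df.describe()
print(summary)

# Pairwise correlation between the numeric columns.
correlation = df[["age", "fare"]].corr()
print(correlation)
```

On a real dataset the same three calls are usually enough to decide which columns need cleaning before any modelling.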
27 | 1. [**NumPy in Python for Data Science**](https://github.com/ammarmohsin/Data-Science-using-Python/blob/main/1-Numpy.ipynb)
28 | 1. Arrays & Its Types
29 | 1. 1-D Arrays
30 | 2. 2-D Arrays
31 | 3. 3-D Arrays
32 | 2. Array Functions
33 | 1. Sorting of Array
34 | 2. Type of Array
35 | 3. Length of Array
36 | 4. Array Concatenation
37 | 5. Dimension of an Array
38 | 6. Elements of Array
39 | 7. Shape of an Array
40 | 8. Reshaping an Array
41 | 9. Conversion of Array
42 | 10. Basic Arithmetic Operations on an Array
43 | 11. Indexing and Slicing
44 | 12. Array Stacking & Splitting
45 | 3. Basic Array Operations
46 | 1. Basic Operations of 2D Array (Addition, Subtraction, Multiplication & Division)
47 | 2. Sum of Elements in Array
48 | 4. Basic Statistical Operations in Arrays
49 | 5. Indexing 2-D Array
50 | 6. Random, Reverse, Reshape & Transpose of an Array
51 | 7. Reshaping and Flattening Multidimensional Arrays
52 |
53 |
54 | 2. [**Pandas in Python for Data Science**](https://github.com/ammarmohsin/Data-Science-using-Python/blob/main/2-Pandas.ipynb)
55 | 1. Basic Functions of Pandas in Python
56 | 2. Pandas Library Functions on FAO Data Set
57 | 1. Set Index
58 | 2. Data Types
59 | 3. Head & Tail View of Data Frame
60 | 4. Convert Data Frame to NumPy Array
61 | 5. Summary of Data Frame
62 | 6. Sorting Data Frame
63 | 7. Selecting Data using Label-Based Indexing (loc function)
64 | 8. Selecting Data using Position-Based Indexing (iloc function)
65 | 9. Group by in Pandas
66 | 10. [Dealing with Null Values in PANDAS](https://github.com/ammarmohsin/Data-Science-using-Python/blob/049b145c46e4f226825481f654211d36e66f1d42/4-%20Null%20Values%20with%20Pandas.ipynb)
67 |
68 | 3. [**Data Manipulation and Analysis using PANDAS**](https://github.com/ammarmohsin/Data-Science-using-Python/blob/adc86304e846ffd5375e1bfec43f199b2d984798/3-Data%20Manipulation%20and%20Analysis.ipynb)
69 |
70 | 4. [**Data Visualization**](https://github.com/ammarmohsin/Data-Science-using-Python/blob/7620286a7104d95b07422c3290ba904594a343bd/5-Data%20Visualization.ipynb)
71 | 1. Line Plots
72 | 2. Bar Plots
73 | 3. Box Plots
74 | 4. Box Plots Customization
75 | 5. Exercise
76 | 6. Plotly
77 |
78 | 5. **Exploratory Data Analysis (EDA)**
79 | 1. Cleaning and Filtering Data
80 | 2. Relationship (Correlation)
81 |
82 | 6. **Data Wrangling in Python**
83 | 1. Dealing with missing values
84 | 1. Continuous Variables
85 | 2. Categorical Variables
86 | 2. Data Formatting
87 | 1. Data Standardization
88 | 3. Data Normalization
89 | 1. Simple Feature Scaling
90 | 2. Min-Max Method
91 | 3. Z-score (standard score)
92 | 4. Log transformation
93 | 4. Binning
94 | 5. Convert Categories into Dummies
95 |
96 | 7. **Case Study in FAO Dataset**
97 | 1. Data Handling
98 | 2. Data Filtration
99 | 3. Data Visualization
100 |
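The "Dealing with missing values" topic in the outline can be sketched in a few lines. This is a minimal example on made-up data, assuming mean imputation for a continuous column and mode imputation for a categorical one; forward and backward fill are shown alongside for comparison. The column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (not from the repository).
df = pd.DataFrame({
    "age": [np.nan, 38.0, np.nan, 35.0],
    "town": ["Southampton", None, "Southampton", "Cherbourg"],
})

# Continuous variable: replace NaN with the column mean.
mean_filled = df["age"].fillna(df["age"].mean())

# Categorical variable: replace missing entries with the most frequent value.
mode_filled = df["town"].fillna(df["town"].mode()[0])

# Forward fill copies the previous non-NaN value downwards,
# so a NaN in the first row has nothing to copy from and stays NaN.
forward_filled = df["age"].ffill()

# Backward fill copies the next non-NaN value upwards,
# so the leading NaN is filled here.
backward_filled = df["age"].bfill()

print(mean_filled.tolist())
print(mode_filled.tolist())
print(forward_filled.tolist())
print(backward_filled.tolist())
```

Which method fits depends on the data: forward and backward fill assume that neighbouring rows are related, which holds for ordered data such as time series but not necessarily for an unordered passenger list.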
101 |
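The four normalization methods named in the outline can be written directly with pandas and NumPy. The sketch below uses a made-up `values` series; using `np.log1p` for the log transformation is our assumption here, chosen because it stays defined at zero.

```python
import numpy as np
import pandas as pd

# Hypothetical values to normalize (not from the repository).
values = pd.Series([10.0, 20.0, 30.0, 40.0], name="fare")

# Simple feature scaling: divide every value by the maximum.
simple_scaled = values / values.max()

# Min-max method: (x - min) / (max - min), which maps the data onto [0, 1].
min_max_scaled = (values - values.min()) / (values.max() - values.min())

# Z-score: (x - mean) / std. pandas uses the sample standard deviation (ddof=1).
z_scores = (values - values.mean()) / values.std()

# Log transformation: log1p computes log(1 + x).
log_transformed = np.log1p(values)

print(simple_scaled.tolist())
print(min_max_scaled.tolist())
print(z_scores.tolist())
print(log_transformed.tolist())
```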
102 |
118 |
119 | Follow me on:
120 |
121 | Website: http://www.ammarmohsin.com/
122 |
123 | GitHub: https://github.com/ammarmohsin
124 |
125 | LinkedIn: https://www.linkedin.com/in/ammar777/
126 |
127 | Twitter: https://twitter.com/ammarmohsin7
128 |
129 | Email: ammarmohsin104@gmail.com
130 |
--------------------------------------------------------------------------------
/banner1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ammarmohsin/Data-Science-using-Python/e04d9eaf4aa49f3b6bdad84be2c9454d4380c1fe/banner1.png
--------------------------------------------------------------------------------