├── 1-Numpy.ipynb ├── 2-Pandas.ipynb ├── 3-Data Manipulation and Analysis.ipynb ├── 4- Null Values with Pandas.ipynb ├── 5-Data Visualization.ipynb ├── Readme.md ├── banner1.png └── sales.csv /1-Numpy.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "a2f9400f", 6 | "metadata": {}, 7 | "source": [ 8 | "# Numpy in Python for Data Science" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "3ce682f0", 14 | "metadata": {}, 15 | "source": [ 16 | "### NumPy (Numerical Python)\n", 17 | "\n", 18 | "NumPy (Numerical Python) is an open source Python library that’s used in almost every field of science and engineering. It’s the universal standard for working with numerical data in Python, and it’s at the core of the scientific Python and PyData ecosystems. NumPy users include everyone from beginning coders to experienced researchers doing state-of-the-art scientific and industrial research and development. The NumPy API is used extensively in Pandas, SciPy, Matplotlib, scikit-learn, scikit-image and most other data science and scientific Python packages.\n", 19 | "\n", 20 | "The NumPy library contains multidimensional array and matrix data structures (you’ll find more information about this in later sections). It provides ndarray, a homogeneous n-dimensional array object, with methods to efficiently operate on it. NumPy can be used to perform a wide variety of mathematical operations on arrays. It adds powerful data structures to Python that guarantee efficient calculations with arrays and matrices and it supplies an enormous library of high-level mathematical functions that operate on these arrays and matrices.\n", 21 | "\n", 22 | "### Difference between a Python list and a NumPy array?\n", 23 | "\n", 24 | "NumPy gives you an enormous range of fast and efficient ways of creating arrays and manipulating numerical data inside them. 
While a Python list can contain different data types within a single list, all of the elements in a NumPy array should be homogeneous. The mathematical operations that are meant to be performed on arrays would be extremely inefficient if the arrays weren’t homogeneous.\n", 25 | "\n", 26 | "**Why use NumPy?**\n", 27 | "\n", 28 | "NumPy arrays are faster and more compact than Python lists. An array consumes less memory and is convenient to use. NumPy uses much less memory to store data and it provides a mechanism of specifying the data types. This allows the code to be optimized even further." 29 | ] 30 | }, 31 | { 32 | "cell_type": "markdown", 33 | "id": "56bccc4c", 34 | "metadata": {}, 35 | "source": [ 36 | "## Section 3.1: Arrays & Its Types" 37 | ] 38 | }, 39 | { 40 | "cell_type": "markdown", 41 | "id": "363ba426", 42 | "metadata": {}, 43 | "source": [ 44 | ">*An array is a central data structure of the NumPy library. An array is a grid of values and it contains information about the raw data, how to locate an element, and how to interpret an element.*\n", 45 | "1. A vector is an array with a single dimension (there’s no difference between row and column vectors)\n", 46 | "2. A matrix refers to an array with two dimensions\n", 47 | "3. 
For 3-D or higher dimensional arrays, the term tensor is also commonly used" 48 | ] 49 | }, 50 | { 51 | "cell_type": "markdown", 52 | "id": "318f5813", 53 | "metadata": {}, 54 | "source": [ 55 | "### 3.1.1: 1-D Arrays" 56 | ] 57 | }, 58 | { 59 | "cell_type": "code", 60 | "execution_count": null, 61 | "id": "dfdac0f7", 62 | "metadata": {}, 63 | "outputs": [ 64 | { 65 | "data": { 66 | "text/plain": [ 67 | "array([5, 5, 5])" 68 | ] 69 | }, 70 | "metadata": {}, 71 | "output_type": "display_data" 72 | } 73 | ], 74 | "source": [ 75 | "import numpy as np\n", 76 | "\n", 77 | "a = np.array([5, 5, 5,])\n", 78 | "a" 79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": null, 84 | "id": "55c3ef44", 85 | "metadata": {}, 86 | "outputs": [ 87 | { 88 | "data": { 89 | "text/plain": [ 90 | "array([ 2, 4, 6, 8, 10])" 91 | ] 92 | }, 93 | "metadata": {}, 94 | "output_type": "display_data" 95 | } 96 | ], 97 | "source": [ 98 | "a = np.array([2,4,6,8,10])\n", 99 | "a" 100 | ] 101 | }, 102 | { 103 | "cell_type": "code", 104 | "execution_count": null, 105 | "id": "0de66b66", 106 | "metadata": {}, 107 | "outputs": [ 108 | { 109 | "data": { 110 | "text/plain": [ 111 | "array([1, 2, 3, 4, 5, 6])" 112 | ] 113 | }, 114 | "metadata": {}, 115 | "output_type": "display_data" 116 | } 117 | ], 118 | "source": [ 119 | "a = np.array([1,2,3,4,5,6])\n", 120 | "a" 121 | ] 122 | }, 123 | { 124 | "cell_type": "code", 125 | "execution_count": null, 126 | "id": "522e653c", 127 | "metadata": {}, 128 | "outputs": [ 129 | { 130 | "data": { 131 | "text/plain": [ 132 | "array([0., 0., 0.])" 133 | ] 134 | }, 135 | "metadata": {}, 136 | "output_type": "display_data" 137 | } 138 | ], 139 | "source": [ 140 | "b = np.zeros(3)\n", 141 | "b" 142 | ] 143 | }, 144 | { 145 | "cell_type": "code", 146 | "execution_count": null, 147 | "id": "fe42cfc7", 148 | "metadata": {}, 149 | "outputs": [ 150 | { 151 | "data": { 152 | "text/plain": [ 153 | "array([1., 1., 1., 1.])" 154 | ] 155 | }, 156 | "metadata": {}, 
157 | "output_type": "display_data" 158 | } 159 | ], 160 | "source": [ 161 | "c = np.ones(4)\n", 162 | "c" 163 | ] 164 | }, 165 | { 166 | "cell_type": "markdown", 167 | "id": "e8a07511", 168 | "metadata": {}, 169 | "source": [ 170 | ">Create an uninitialized array with 3 elements (np.empty does not set the values, so the contents are arbitrary)" 171 | ] 172 | }, 173 | { 174 | "cell_type": "code", 175 | "execution_count": null, 176 | "id": "3334df1e", 177 | "metadata": {}, 178 | "outputs": [ 179 | { 180 | "data": { 181 | "text/plain": [ 182 | "array([0., 0., 0.])" 183 | ] 184 | }, 185 | "metadata": {}, 186 | "output_type": "display_data" 187 | } 188 | ], 189 | "source": [ 190 | "d = np.empty(3)\n", 191 | "d" 192 | ] 193 | }, 194 | { 195 | "cell_type": "markdown", 196 | "id": "41ff9884", 197 | "metadata": {}, 198 | "source": [ 199 | ">Creating an array from a range of elements" 200 | ] 201 | }, 202 | { 203 | "cell_type": "code", 204 | "execution_count": null, 205 | "id": "50199b39", 206 | "metadata": {}, 207 | "outputs": [ 208 | { 209 | "data": { 210 | "text/plain": [ 211 | "array([0, 1, 2, 3, 4, 5, 6, 7, 8])" 212 | ] 213 | }, 214 | "metadata": {}, 215 | "output_type": "display_data" 216 | } 217 | ], 218 | "source": [ 219 | "e = np.arange(9)\n", 220 | "e" 221 | ] 222 | }, 223 | { 224 | "cell_type": "markdown", 225 | "id": "dcff9552", 226 | "metadata": {}, 227 | "source": [ 228 | "> Creating an array over a specific range (the stop value is excluded)" 229 | ] 230 | }, 231 | { 232 | "cell_type": "code", 233 | "execution_count": null, 234 | "id": "7b3a8999", 235 | "metadata": {}, 236 | "outputs": [ 237 | { 238 | "data": { 239 | "text/plain": [ 240 | "array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])" 241 | ] 242 | }, 243 | "metadata": {}, 244 | "output_type": "display_data" 245 | } 246 | ], 247 | "source": [ 248 | "f = np.arange(1, 15)\n", 249 | "f" 250 | ] 251 | }, 252 | { 253 | "cell_type": "markdown", 254 | "id": "0604f3f7", 255 | "metadata": {}, 256 | "source": [ 257 | "> Creating an array over a specific range with a specified step (5)\n" 258 | 
] 259 | }, 260 | { 261 | "cell_type": "code", 262 | "execution_count": null, 263 | "id": "9895eb8e", 264 | "metadata": {}, 265 | "outputs": [ 266 | { 267 | "data": { 268 | "text/plain": [ 269 | "array([50, 55, 60, 65, 70, 75, 80, 85, 90, 95])" 270 | ] 271 | }, 272 | "metadata": {}, 273 | "output_type": "display_data" 274 | } 275 | ], 276 | "source": [ 277 | "g = np.arange(50, 100, 5)\n", 278 | "g" 279 | ] 280 | }, 281 | { 282 | "cell_type": "markdown", 283 | "id": "f3968e12", 284 | "metadata": {}, 285 | "source": [ 286 | "> Linearly spaced arrays" 287 | ] 288 | }, 289 | { 290 | "cell_type": "code", 291 | "execution_count": null, 292 | "id": "6077433a", 293 | "metadata": {}, 294 | "outputs": [ 295 | { 296 | "data": { 297 | "text/plain": [ 298 | "array([100., 105., 110., 115., 120.])" 299 | ] 300 | }, 301 | "metadata": {}, 302 | "output_type": "display_data" 303 | } 304 | ], 305 | "source": [ 306 | "h = np.linspace(100 , 120, num= 5)\n", 307 | "h" 308 | ] 309 | }, 310 | { 311 | "cell_type": "markdown", 312 | "id": "a05fed06", 313 | "metadata": {}, 314 | "source": [ 315 | "> specific data types in array" 316 | ] 317 | }, 318 | { 319 | "cell_type": "code", 320 | "execution_count": null, 321 | "id": "ae477c82", 322 | "metadata": {}, 323 | "outputs": [ 324 | { 325 | "data": { 326 | "text/plain": [ 327 | "array([1, 1, 1, 1, 1], dtype=int8)" 328 | ] 329 | }, 330 | "metadata": {}, 331 | "output_type": "display_data" 332 | } 333 | ], 334 | "source": [ 335 | "i = np.ones(5, dtype = np.int8)\n", 336 | "i" 337 | ] 338 | }, 339 | { 340 | "cell_type": "code", 341 | "execution_count": null, 342 | "id": "3cf3b978", 343 | "metadata": {}, 344 | "outputs": [ 345 | { 346 | "data": { 347 | "text/plain": [ 348 | "array([1., 1., 1., 1.])" 349 | ] 350 | }, 351 | "metadata": {}, 352 | "output_type": "display_data" 353 | } 354 | ], 355 | "source": [ 356 | "j = np.ones(4, dtype = np.float64)\n", 357 | "j" 358 | ] 359 | }, 360 | { 361 | "cell_type": "markdown", 362 | "id": "a5629a81", 363 | 
"metadata": {}, 364 | "source": [ 365 | "### 3.1.2: 2-D Arrays" 366 | ] 367 | }, 368 | { 369 | "cell_type": "code", 370 | "execution_count": null, 371 | "id": "bc03ef8b", 372 | "metadata": {}, 373 | "outputs": [ 374 | { 375 | "data": { 376 | "text/plain": [ 377 | "array([[1, 2, 3],\n", 378 | " [4, 5, 6],\n", 379 | " [7, 8, 9]])" 380 | ] 381 | }, 382 | "metadata": {}, 383 | "output_type": "display_data" 384 | } 385 | ], 386 | "source": [ 387 | "b = np.array([[1,2,3], [4,5,6], [7,8,9]])\n", 388 | "b" 389 | ] 390 | }, 391 | { 392 | "cell_type": "code", 393 | "execution_count": null, 394 | "id": "a1a63fcc", 395 | "metadata": {}, 396 | "outputs": [ 397 | { 398 | "data": { 399 | "text/plain": [ 400 | "array([[0., 0., 0., 0.],\n", 401 | " [0., 0., 0., 0.],\n", 402 | " [0., 0., 0., 0.],\n", 403 | " [0., 0., 0., 0.],\n", 404 | " [0., 0., 0., 0.]])" 405 | ] 406 | }, 407 | "metadata": {}, 408 | "output_type": "display_data" 409 | } 410 | ], 411 | "source": [ 412 | "a = np.zeros((5,4))\n", 413 | "a" 414 | ] 415 | }, 416 | { 417 | "cell_type": "code", 418 | "execution_count": null, 419 | "id": "409d4aba", 420 | "metadata": {}, 421 | "outputs": [ 422 | { 423 | "data": { 424 | "text/plain": [ 425 | "array([[1., 1., 1.],\n", 426 | " [1., 1., 1.]])" 427 | ] 428 | }, 429 | "metadata": {}, 430 | "output_type": "display_data" 431 | } 432 | ], 433 | "source": [ 434 | "b = np.ones((2,3))\n", 435 | "b" 436 | ] 437 | }, 438 | { 439 | "cell_type": "code", 440 | "execution_count": null, 441 | "id": "728c18fb", 442 | "metadata": {}, 443 | "outputs": [ 444 | { 445 | "data": { 446 | "text/plain": [ 447 | "array([[0.00000000e+000, 0.00000000e+000, 0.00000000e+000],\n", 448 | " [0.00000000e+000, 0.00000000e+000, 1.29049947e-320],\n", 449 | " [8.34441742e-308, 9.79107192e-307, 3.33509775e-317]])" 450 | ] 451 | }, 452 | "metadata": {}, 453 | "output_type": "display_data" 454 | } 455 | ], 456 | "source": [ 457 | "c = np.empty((3,3))\n", 458 | "c" 459 | ] 460 | }, 461 | { 462 | "cell_type": 
"markdown", 463 | "id": "53510c93", 464 | "metadata": {}, 465 | "source": [ 466 | "### 3.1.3: 3-D Arrays" 467 | ] 468 | }, 469 | { 470 | "cell_type": "code", 471 | "execution_count": null, 472 | "id": "4f448214", 473 | "metadata": {}, 474 | "outputs": [ 475 | { 476 | "data": { 477 | "text/plain": [ 478 | "array([[[1, 2, 3, 4],\n", 479 | " [2, 3, 4, 5],\n", 480 | " [9, 8, 7, 6]],\n", 481 | "\n", 482 | " [[7, 8, 9, 4],\n", 483 | " [1, 4, 7, 8],\n", 484 | " [1, 5, 9, 7]],\n", 485 | "\n", 486 | " [[4, 5, 6, 8],\n", 487 | " [3, 2, 1, 4],\n", 488 | " [2, 5, 4, 8]]])" 489 | ] 490 | }, 491 | "metadata": {}, 492 | "output_type": "display_data" 493 | } 494 | ], 495 | "source": [ 496 | "x = np.array([[[1,2,3,4], [2,3,4,5], [9,8,7,6]], [[7,8,9,4], [1,4,7,8], [1,5,9,7]], [[4,5,6,8], [3,2,1,4], [2,5,4,8]]])\n", 497 | "x" 498 | ] 499 | }, 500 | { 501 | "cell_type": "markdown", 502 | "id": "a2717e77", 503 | "metadata": {}, 504 | "source": [ 505 | ">making and reshaping a 3D array" 506 | ] 507 | }, 508 | { 509 | "cell_type": "code", 510 | "execution_count": null, 511 | "id": "1ae8e7e1", 512 | "metadata": {}, 513 | "outputs": [ 514 | { 515 | "data": { 516 | "text/plain": [ 517 | "array([[[ 0, 1, 2, 3, 4],\n", 518 | " [ 5, 6, 7, 8, 9],\n", 519 | " [10, 11, 12, 13, 14],\n", 520 | " [15, 16, 17, 18, 19]],\n", 521 | "\n", 522 | " [[20, 21, 22, 23, 24],\n", 523 | " [25, 26, 27, 28, 29],\n", 524 | " [30, 31, 32, 33, 34],\n", 525 | " [35, 36, 37, 38, 39]],\n", 526 | "\n", 527 | " [[40, 41, 42, 43, 44],\n", 528 | " [45, 46, 47, 48, 49],\n", 529 | " [50, 51, 52, 53, 54],\n", 530 | " [55, 56, 57, 58, 59]]])" 531 | ] 532 | }, 533 | "metadata": {}, 534 | "output_type": "display_data" 535 | } 536 | ], 537 | "source": [ 538 | "a = np.arange(60).reshape(3,4,5)\n", 539 | "a" 540 | ] 541 | }, 542 | { 543 | "cell_type": "markdown", 544 | "id": "3b1e2bb8", 545 | "metadata": {}, 546 | "source": [ 547 | "## Section 3.2: Array Functions" 548 | ] 549 | }, 550 | { 551 | "cell_type": "code", 552 | 
"execution_count": null, 553 | "id": "c280e3d9", 554 | "metadata": {}, 555 | "outputs": [ 556 | { 557 | "data": { 558 | "text/plain": [ 559 | "array([ 10. , 12. , 15. , 2. , 4. , 6. , 100. , 320. , 0.5,\n", 560 | " 10.3])" 561 | ] 562 | }, 563 | "metadata": {}, 564 | "output_type": "display_data" 565 | } 566 | ], 567 | "source": [ 568 | "a = np.array([10,12,15,2,4,6,100,320,0.5,10.3])\n", 569 | "a" 570 | ] 571 | }, 572 | { 573 | "cell_type": "markdown", 574 | "id": "e831694e", 575 | "metadata": {}, 576 | "source": [ 577 | "### 3.2.1 Sorting a 1-D Array" 578 | ] 579 | }, 580 | { 581 | "cell_type": "code", 582 | "execution_count": null, 583 | "id": "f9c7588a", 584 | "metadata": {}, 585 | "outputs": [ 586 | { 587 | "data": { 588 | "text/plain": [ 589 | "array([ 0.5, 2. , 4. , 6. , 10. , 10.3, 12. , 15. , 100. ,\n", 590 | " 320. ])" 591 | ] 592 | }, 593 | "metadata": {}, 594 | "output_type": "display_data" 595 | } 596 | ], 597 | "source": [ 598 | "a.sort()  # sorts the array in place\n", 599 | "a" 600 | ] 601 | }, 602 | { 603 | "cell_type": "markdown", 604 | "id": "fa08791c", 605 | "metadata": {}, 606 | "source": [ 607 | "### 3.2.2 Type of an Array" 608 | ] 609 | }, 610 | { 611 | "cell_type": "code", 612 | "execution_count": null, 613 | "id": "15df162f", 614 | "metadata": {}, 615 | "outputs": [ 616 | { 617 | "data": { 618 | "text/plain": [ 619 | "numpy.ndarray" 620 | ] 621 | }, 622 | "metadata": {}, 623 | "output_type": "display_data" 624 | } 625 | ], 626 | "source": [ 627 | "type(a)" 628 | ] 629 | }, 630 | { 631 | "cell_type": "markdown", 632 | "id": "c6978ff7", 633 | "metadata": {}, 634 | "source": [ 635 | "### 3.2.3 Length of an array" 636 | ] 637 | }, 638 | { 639 | "cell_type": "code", 640 | "execution_count": null, 641 | "id": "0aae0d18", 642 | "metadata": {}, 643 | "outputs": [ 644 | { 645 | "data": { 646 | "text/plain": [ 647 | "10" 648 | ] 649 | }, 650 | "metadata": {}, 651 | "output_type": "display_data" 652 | } 653 | ], 654 | "source": [ 655 | "len(a)" 656 | ] 657 | }, 658 | { 659 | 
"cell_type": "code", 660 | "execution_count": null, 661 | "id": "20c06e4c", 662 | "metadata": {}, 663 | "outputs": [ 664 | { 665 | "data": { 666 | "text/plain": [ 667 | "array([10.2, 3.4, 5.3, 35.2, 45.2])" 668 | ] 669 | }, 670 | "metadata": {}, 671 | "output_type": "display_data" 672 | } 673 | ], 674 | "source": [ 675 | "b = np.array([10.2, 3.4, 5.3, 35.2, 45.2])\n", 676 | "b" 677 | ] 678 | }, 679 | { 680 | "cell_type": "markdown", 681 | "id": "1725df86", 682 | "metadata": {}, 683 | "source": [ 684 | "### 3.2.4 Concatenation of Array" 685 | ] 686 | }, 687 | { 688 | "cell_type": "markdown", 689 | "id": "a5c3e79a", 690 | "metadata": {}, 691 | "source": [ 692 | "> Concatenation of 1-D Array" 693 | ] 694 | }, 695 | { 696 | "cell_type": "code", 697 | "execution_count": null, 698 | "id": "5baa38e9", 699 | "metadata": {}, 700 | "outputs": [ 701 | { 702 | "data": { 703 | "text/plain": [ 704 | "array([ 0.5, 2. , 4. , 6. , 10. , 10.3, 12. , 15. , 100. ,\n", 705 | " 320. , 10.2, 3.4, 5.3, 35.2, 45.2])" 706 | ] 707 | }, 708 | "metadata": {}, 709 | "output_type": "display_data" 710 | } 711 | ], 712 | "source": [ 713 | "c = np.concatenate((a,b))\n", 714 | "c" 715 | ] 716 | }, 717 | { 718 | "cell_type": "code", 719 | "execution_count": null, 720 | "id": "5ef0d308", 721 | "metadata": {}, 722 | "outputs": [ 723 | { 724 | "data": { 725 | "text/plain": [ 726 | "array([ 0.5, 2. , 3.4, 4. , 5.3, 6. , 10. , 10.2, 10.3,\n", 727 | " 12. , 15. , 35.2, 45.2, 100. , 320. 
])" 728 | ] 729 | }, 730 | "metadata": {}, 731 | "output_type": "display_data" 732 | } 733 | ], 734 | "source": [ 735 | "c.sort()\n", 736 | "c" 737 | ] 738 | }, 739 | { 740 | "cell_type": "markdown", 741 | "id": "a2e1907c", 742 | "metadata": {}, 743 | "source": [ 744 | "> Concatenation of 2-D Array" 745 | ] 746 | }, 747 | { 748 | "cell_type": "code", 749 | "execution_count": null, 750 | "id": "aea64970", 751 | "metadata": {}, 752 | "outputs": [ 753 | { 754 | "data": { 755 | "text/plain": [ 756 | "array([[1, 2, 3, 4, 5],\n", 757 | " [5, 4, 3, 2, 1]])" 758 | ] 759 | }, 760 | "metadata": {}, 761 | "output_type": "display_data" 762 | } 763 | ], 764 | "source": [ 765 | "a = np.array([[1,2,3,4,5], [5,4,3,2,1]])\n", 766 | "a" 767 | ] 768 | }, 769 | { 770 | "cell_type": "code", 771 | "execution_count": null, 772 | "id": "23a1b5e5", 773 | "metadata": {}, 774 | "outputs": [ 775 | { 776 | "data": { 777 | "text/plain": [ 778 | "array([[6, 7, 5, 6, 6],\n", 779 | " [8, 9, 5, 9, 5]])" 780 | ] 781 | }, 782 | "metadata": {}, 783 | "output_type": "display_data" 784 | } 785 | ], 786 | "source": [ 787 | "b = np.array([[6,7,5,6,6], [8,9,5,9,5]])\n", 788 | "b" 789 | ] 790 | }, 791 | { 792 | "cell_type": "code", 793 | "execution_count": null, 794 | "id": "43d316e6", 795 | "metadata": {}, 796 | "outputs": [ 797 | { 798 | "data": { 799 | "text/plain": [ 800 | "array([[1, 2, 3, 4, 5],\n", 801 | " [5, 4, 3, 2, 1],\n", 802 | " [6, 7, 5, 6, 6],\n", 803 | " [8, 9, 5, 9, 5]])" 804 | ] 805 | }, 806 | "metadata": {}, 807 | "output_type": "display_data" 808 | } 809 | ], 810 | "source": [ 811 | "c = np.concatenate((a,b))\n", 812 | "c" 813 | ] 814 | }, 815 | { 816 | "cell_type": "code", 817 | "execution_count": null, 818 | "id": "4dda2adc", 819 | "metadata": {}, 820 | "outputs": [ 821 | { 822 | "data": { 823 | "text/plain": [ 824 | "array([[1, 2, 3, 4, 5, 6, 7, 5, 6, 6],\n", 825 | " [5, 4, 3, 2, 1, 8, 9, 5, 9, 5]])" 826 | ] 827 | }, 828 | "metadata": {}, 829 | "output_type": "display_data" 830 | } 
831 | ], 832 | "source": [ 833 | "c = np.concatenate((a,b), axis=1)\n", 834 | "c" 835 | ] 836 | }, 837 | { 838 | "cell_type": "code", 839 | "execution_count": null, 840 | "id": "1121832d", 841 | "metadata": {}, 842 | "outputs": [ 843 | { 844 | "data": { 845 | "text/plain": [ 846 | "array([[1, 2],\n", 847 | " [3, 4],\n", 848 | " [5, 6]])" 849 | ] 850 | }, 851 | "metadata": {}, 852 | "output_type": "display_data" 853 | } 854 | ], 855 | "source": [ 856 | "x = np.array([[1, 2], [3, 4]])\n", 857 | "y = np.array([[5, 6]])\n", 858 | "np.concatenate((x, y), axis=0)" 859 | ] 860 | }, 861 | { 862 | "cell_type": "markdown", 863 | "id": "a568f1d9", 864 | "metadata": {}, 865 | "source": [ 866 | "### 3.2.5 Dimension of an array" 867 | ] 868 | }, 869 | { 870 | "cell_type": "code", 871 | "execution_count": null, 872 | "id": "0cc43877", 873 | "metadata": {}, 874 | "outputs": [ 875 | { 876 | "data": { 877 | "text/plain": [ 878 | "1" 879 | ] 880 | }, 881 | "metadata": {}, 882 | "output_type": "display_data" 883 | } 884 | ], 885 | "source": [ 886 | "a = np.array([10,12,15,2,4,6,100,320,0.5,10.3])\n", 887 | "a.ndim" 888 | ] 889 | }, 890 | { 891 | "cell_type": "code", 892 | "execution_count": null, 893 | "id": "415d6248", 894 | "metadata": {}, 895 | "outputs": [ 896 | { 897 | "data": { 898 | "text/plain": [ 899 | "2" 900 | ] 901 | }, 902 | "metadata": {}, 903 | "output_type": "display_data" 904 | } 905 | ], 906 | "source": [ 907 | "b = np.array([[6,7,5,6,6], [8,9,5,9,5]])\n", 908 | "b.ndim" 909 | ] 910 | }, 911 | { 912 | "cell_type": "code", 913 | "execution_count": null, 914 | "id": "e6841909", 915 | "metadata": {}, 916 | "outputs": [ 917 | { 918 | "data": { 919 | "text/plain": [ 920 | "3" 921 | ] 922 | }, 923 | "metadata": {}, 924 | "output_type": "display_data" 925 | } 926 | ], 927 | "source": [ 928 | "a = np.array([[[1,2,3,4], [4,5,6,7]], [[3,2,1,4],[6,5,4,1]], [[7,8,9,0],[9,8,7,6]]])\n", 929 | "a.ndim" 930 | ] 931 | }, 932 | { 933 | "cell_type": "markdown", 934 | "id": "23784d4e", 
935 | "metadata": {}, 936 | "source": [ 937 | "### 3.2.6 Number of elements in an array" 938 | ] 939 | }, 940 | { 941 | "cell_type": "code", 942 | "execution_count": null, 943 | "id": "7ed0ded3", 944 | "metadata": {}, 945 | "outputs": [ 946 | { 947 | "data": { 948 | "text/plain": [ 949 | "10" 950 | ] 951 | }, 952 | "metadata": {}, 953 | "output_type": "display_data" 954 | } 955 | ], 956 | "source": [ 957 | "a = np.array([10,12,15,2,4,6,100,320,0.5,10.3])\n", 958 | "a.size" 959 | ] 960 | }, 961 | { 962 | "cell_type": "code", 963 | "execution_count": null, 964 | "id": "ffbe8224", 965 | "metadata": {}, 966 | "outputs": [ 967 | { 968 | "data": { 969 | "text/plain": [ 970 | "12" 971 | ] 972 | }, 973 | "metadata": {}, 974 | "output_type": "display_data" 975 | } 976 | ], 977 | "source": [ 978 | "b = np.array([[6,7,5,6,6,6], [8,9,5,9,5,7]])\n", 979 | "b.size" 980 | ] 981 | }, 982 | { 983 | "cell_type": "code", 984 | "execution_count": null, 985 | "id": "4a9a4845", 986 | "metadata": {}, 987 | "outputs": [ 988 | { 989 | "data": { 990 | "text/plain": [ 991 | "24" 992 | ] 993 | }, 994 | "metadata": {}, 995 | "output_type": "display_data" 996 | } 997 | ], 998 | "source": [ 999 | "a = np.array([[[1,2,3,4], [4,5,6,7]], [[3,2,1,4],[6,5,4,1]], [[7,8,9,0],[9,8,7,6]]])\n", 1000 | "a.size" 1001 | ] 1002 | }, 1003 | { 1004 | "cell_type": "markdown", 1005 | "id": "63591035", 1006 | "metadata": {}, 1007 | "source": [ 1008 | "### 3.2.7 Shape of an array" 1009 | ] 1010 | }, 1011 | { 1012 | "cell_type": "code", 1013 | "execution_count": null, 1014 | "id": "8906a559", 1015 | "metadata": {}, 1016 | "outputs": [ 1017 | { 1018 | "data": { 1019 | "text/plain": [ 1020 | "array([ 10. , 12. , 15. , 2. , 4. , 6. , 100. , 320. 
, 0.5,\n", 1021 | " 10.3])" 1022 | ] 1023 | }, 1024 | "metadata": {}, 1025 | "output_type": "display_data" 1026 | } 1027 | ], 1028 | "source": [ 1029 | "a = np.array([10,12,15,2,4,6,100,320,0.5,10.3])\n", 1030 | "a" 1031 | ] 1032 | }, 1033 | { 1034 | "cell_type": "code", 1035 | "execution_count": null, 1036 | "id": "d55cf6f1", 1037 | "metadata": {}, 1038 | "outputs": [ 1039 | { 1040 | "data": { 1041 | "text/plain": [ 1042 | "(10,)" 1043 | ] 1044 | }, 1045 | "metadata": {}, 1046 | "output_type": "display_data" 1047 | } 1048 | ], 1049 | "source": [ 1050 | "a = np.array([10,12,15,2,4,6,100,320,0.5,10.3])\n", 1051 | "a.shape\n", 1052 | "# Output: (10,) means 10 elements along a single axis (a 1-D array has no rows or columns)" 1053 | ] 1054 | }, 1055 | { 1056 | "cell_type": "code", 1057 | "execution_count": null, 1058 | "id": "56796e4b", 1059 | "metadata": {}, 1060 | "outputs": [ 1061 | { 1062 | "data": { 1063 | "text/plain": [ 1064 | "array([[6, 7, 5, 6, 6, 6],\n", 1065 | " [8, 9, 5, 9, 5, 7],\n", 1066 | " [8, 9, 5, 9, 5, 7]])" 1067 | ] 1068 | }, 1069 | "metadata": {}, 1070 | "output_type": "display_data" 1071 | } 1072 | ], 1073 | "source": [ 1074 | "b = np.array([[6,7,5,6,6,6], [8,9,5,9,5,7], [8,9,5,9,5,7]])\n", 1075 | "b\n", 1076 | "b" 1077 | ] 1078 | }, 1079 | { 1080 | "cell_type": "code", 1081 | "execution_count": null, 1082 | "id": "ae697c33", 1083 | "metadata": {}, 1084 | "outputs": [ 1085 | { 1086 | "data": { 1087 | "text/plain": [ 1088 | "(3, 6)" 1089 | ] 1090 | }, 1091 | "metadata": {}, 1092 | "output_type": "display_data" 1093 | } 1094 | ], 1095 | "source": [ 1096 | "b = np.array([[6,7,5,6,6,6], [8,9,5,9,5,7], [8,9,5,9,5,7]])\n", 1097 | "b\n", 1098 | "b.shape\n", 1099 | "# Output: 3 Rows, 6 Columns" 1100 | ] 1101 | }, 1102 | { 1103 | "cell_type": "code", 1104 | "execution_count": null, 1105 | "id": "7a0d8ac4", 1106 | "metadata": {}, 1107 | "outputs": [ 1108 | { 1109 | "data": { 1110 | "text/plain": [ 1111 | "array([[[1, 2, 3, 4],\n", 1112 | " [4, 5, 6, 7]],\n", 1113 | "\n", 1114 | " [[3, 2, 1, 4],\n", 1115 | " [6, 5, 
4, 1]],\n", 1116 | "\n", 1117 | " [[7, 8, 9, 0],\n", 1118 | " [9, 8, 7, 6]],\n", 1119 | "\n", 1120 | " [[3, 2, 1, 4],\n", 1121 | " [6, 5, 4, 1]]])" 1122 | ] 1123 | }, 1124 | "metadata": {}, 1125 | "output_type": "display_data" 1126 | } 1127 | ], 1128 | "source": [ 1129 | "a = np.array([[[1,2,3,4], [4,5,6,7]], [[3,2,1,4],[6,5,4,1]], [[7,8,9,0],[9,8,7,6]], [[3,2,1,4],[6,5,4,1]]])\n", 1130 | "a" 1131 | ] 1132 | }, 1133 | { 1134 | "cell_type": "code", 1135 | "execution_count": null, 1136 | "id": "cae2a0b3", 1137 | "metadata": {}, 1138 | "outputs": [ 1139 | { 1140 | "data": { 1141 | "text/plain": [ 1142 | "(4, 2, 4)" 1143 | ] 1144 | }, 1145 | "metadata": {}, 1146 | "output_type": "display_data" 1147 | } 1148 | ], 1149 | "source": [ 1150 | "a = np.array([[[1,2,3,4], [4,5,6,7]], [[3,2,1,4],[6,5,4,1]], [[7,8,9,0],[9,8,7,6]], [[3,2,1,4],[6,5,4,1]]])\n", 1151 | "a.shape\n", 1152 | "# Output: 4 Layers, 2 Rows, 4 Columns" 1153 | ] 1154 | }, 1155 | { 1156 | "cell_type": "markdown", 1157 | "id": "ff54e4f2", 1158 | "metadata": {}, 1159 | "source": [ 1160 | "### 3.2.8 Reshaping an array" 1161 | ] 1162 | }, 1163 | { 1164 | "cell_type": "code", 1165 | "execution_count": null, 1166 | "id": "72ec3bef", 1167 | "metadata": {}, 1168 | "outputs": [ 1169 | { 1170 | "data": { 1171 | "text/plain": [ 1172 | "array([0, 1, 2, 3, 4, 5, 6, 7, 8])" 1173 | ] 1174 | }, 1175 | "metadata": {}, 1176 | "output_type": "display_data" 1177 | } 1178 | ], 1179 | "source": [ 1180 | "a = np.arange(9) # 3*3 = 9\n", 1181 | "a" 1182 | ] 1183 | }, 1184 | { 1185 | "cell_type": "code", 1186 | "execution_count": null, 1187 | "id": "1f25827c", 1188 | "metadata": {}, 1189 | "outputs": [ 1190 | { 1191 | "data": { 1192 | "text/plain": [ 1193 | "array([[0, 1, 2],\n", 1194 | " [3, 4, 5],\n", 1195 | " [6, 7, 8]])" 1196 | ] 1197 | }, 1198 | "metadata": {}, 1199 | "output_type": "display_data" 1200 | } 1201 | ], 1202 | "source": [ 1203 | "a.reshape(3,3) \n", 1204 | "# Can only reshape into a shape whose dimensions multiply to 9, e.g. 3*3 = 9"
1205 | ] 1206 | }, 1207 | { 1208 | "cell_type": "code", 1209 | "execution_count": null, 1210 | "id": "8aa7bf57", 1211 | "metadata": {}, 1212 | "outputs": [ 1213 | { 1214 | "data": { 1215 | "text/plain": [ 1216 | "array([[0, 1, 2, 3, 4, 5, 6, 7, 8]])" 1217 | ] 1218 | }, 1219 | "metadata": {}, 1220 | "output_type": "display_data" 1221 | } 1222 | ], 1223 | "source": [ 1224 | "np.reshape(a, (1, 9), order=\"C\")" 1225 | ] 1226 | }, 1227 | { 1228 | "cell_type": "markdown", 1229 | "id": "fb959bd6", 1230 | "metadata": {}, 1231 | "source": [ 1232 | "### 3.2.9 Conversion of an Array" 1233 | ] 1234 | }, 1235 | { 1236 | "cell_type": "markdown", 1237 | "id": "bf3f2229", 1238 | "metadata": {}, 1239 | "source": [ 1240 | "> Conversion of 1-D array to 2-D array" 1241 | ] 1242 | }, 1243 | { 1244 | "cell_type": "code", 1245 | "execution_count": null, 1246 | "id": "45a12f7e", 1247 | "metadata": {}, 1248 | "outputs": [ 1249 | { 1250 | "data": { 1251 | "text/plain": [ 1252 | "array([1, 2, 3, 4, 5, 6, 7, 8, 9])" 1253 | ] 1254 | }, 1255 | "metadata": {}, 1256 | "output_type": "display_data" 1257 | } 1258 | ], 1259 | "source": [ 1260 | "a = np.array([1,2,3,4,5,6,7,8,9])\n", 1261 | "a" 1262 | ] 1263 | }, 1264 | { 1265 | "cell_type": "markdown", 1266 | "id": "87147d08", 1267 | "metadata": {}, 1268 | "source": [ 1269 | "> Row wise 1D to 2D conversion " 1270 | ] 1271 | }, 1272 | { 1273 | "cell_type": "code", 1274 | "execution_count": null, 1275 | "id": "7320fe05", 1276 | "metadata": {}, 1277 | "outputs": [ 1278 | { 1279 | "data": { 1280 | "text/plain": [ 1281 | "array([[1, 2, 3, 4, 5, 6, 7, 8, 9]])" 1282 | ] 1283 | }, 1284 | "metadata": {}, 1285 | "output_type": "display_data" 1286 | } 1287 | ], 1288 | "source": [ 1289 | "b = a[np.newaxis, :]\n", 1290 | "b" 1291 | ] 1292 | }, 1293 | { 1294 | "cell_type": "code", 1295 | "execution_count": null, 1296 | "id": "143fc70b", 1297 | "metadata": {}, 1298 | "outputs": [ 1299 | { 1300 | "data": { 1301 | "text/plain": [ 1302 | "(1, 9)" 
1303 | ] 1304 | }, 1305 | "metadata": {}, 1306 | "output_type": "display_data" 1307 | } 1308 | ], 1309 | "source": [ 1310 | "b.shape" 1311 | ] 1312 | }, 1313 | { 1314 | "cell_type": "markdown", 1315 | "id": "09bc4ffc", 1316 | "metadata": {}, 1317 | "source": [ 1318 | "> Column wise 1D to 2D conversion" 1319 | ] 1320 | }, 1321 | { 1322 | "cell_type": "code", 1323 | "execution_count": null, 1324 | "id": "55f963b3", 1325 | "metadata": {}, 1326 | "outputs": [ 1327 | { 1328 | "data": { 1329 | "text/plain": [ 1330 | "array([[1],\n", 1331 | " [2],\n", 1332 | " [3],\n", 1333 | " [4],\n", 1334 | " [5],\n", 1335 | " [6],\n", 1336 | " [7],\n", 1337 | " [8],\n", 1338 | " [9]])" 1339 | ] 1340 | }, 1341 | "metadata": {}, 1342 | "output_type": "display_data" 1343 | } 1344 | ], 1345 | "source": [ 1346 | "c = a[: , np.newaxis]\n", 1347 | "c" 1348 | ] 1349 | }, 1350 | { 1351 | "cell_type": "code", 1352 | "execution_count": null, 1353 | "id": "7e571800", 1354 | "metadata": {}, 1355 | "outputs": [ 1356 | { 1357 | "data": { 1358 | "text/plain": [ 1359 | "(9, 1)" 1360 | ] 1361 | }, 1362 | "metadata": {}, 1363 | "output_type": "display_data" 1364 | } 1365 | ], 1366 | "source": [ 1367 | "c.shape" 1368 | ] 1369 | }, 1370 | { 1371 | "cell_type": "markdown", 1372 | "id": "c9251a24", 1373 | "metadata": {}, 1374 | "source": [ 1375 | "> Conversion of 1-D array to 2-D and then 2-D to 3-D\n", 1376 | "> \n", 1377 | "> _Using np.newaxis will increase the dimensions of your array by one dimension when used once._" 1378 | ] 1379 | }, 1380 | { 1381 | "cell_type": "code", 1382 | "execution_count": null, 1383 | "id": "a170abeb", 1384 | "metadata": {}, 1385 | "outputs": [ 1386 | { 1387 | "data": { 1388 | "text/plain": [ 1389 | "(7,)" 1390 | ] 1391 | }, 1392 | "metadata": {}, 1393 | "output_type": "display_data" 1394 | } 1395 | ], 1396 | "source": [ 1397 | "a = np.arange(7)\n", 1398 | "a.shape\n" 1399 | ] 1400 | }, 1401 | { 1402 | "cell_type": "code", 1403 | "execution_count": null, 1404 | "id": 
"8e332d42", 1405 | "metadata": {}, 1406 | "outputs": [ 1407 | { 1408 | "data": { 1409 | "text/plain": [ 1410 | "(1, 7)" 1411 | ] 1412 | }, 1413 | "metadata": {}, 1414 | "output_type": "display_data" 1415 | } 1416 | ], 1417 | "source": [ 1418 | "b = a[np.newaxis, :]\n", 1419 | "b.shape" 1420 | ] 1421 | }, 1422 | { 1423 | "cell_type": "code", 1424 | "execution_count": null, 1425 | "id": "21cf04d3", 1426 | "metadata": {}, 1427 | "outputs": [ 1428 | { 1429 | "data": { 1430 | "text/plain": [ 1431 | "(1, 1, 7)" 1432 | ] 1433 | }, 1434 | "metadata": {}, 1435 | "output_type": "display_data" 1436 | } 1437 | ], 1438 | "source": [ 1439 | "c = b[np.newaxis, :]\n", 1440 | "c.shape" 1441 | ] 1442 | }, 1443 | { 1444 | "cell_type": "markdown", 1445 | "id": "a9a45ca3", 1446 | "metadata": {}, 1447 | "source": [ 1448 | ">Converting 1-D array to 2-D array at specific axis\n", 1449 | ">\n", 1450 | ">_You can also expand an array by inserting a new axis at a specified position with np.expand_dims._" 1451 | ] 1452 | }, 1453 | { 1454 | "cell_type": "code", 1455 | "execution_count": null, 1456 | "id": "3b7513fb", 1457 | "metadata": {}, 1458 | "outputs": [ 1459 | { 1460 | "data": { 1461 | "text/plain": [ 1462 | "(6,)" 1463 | ] 1464 | }, 1465 | "metadata": {}, 1466 | "output_type": "display_data" 1467 | } 1468 | ], 1469 | "source": [ 1470 | "a = np.arange(6)\n", 1471 | "a.shape" 1472 | ] 1473 | }, 1474 | { 1475 | "cell_type": "markdown", 1476 | "id": "4715e929", 1477 | "metadata": {}, 1478 | "source": [ 1479 | "> You can use np.expand_dims to add an axis at index position 1" 1480 | ] 1481 | }, 1482 | { 1483 | "cell_type": "code", 1484 | "execution_count": null, 1485 | "id": "0adc8dbc", 1486 | "metadata": {}, 1487 | "outputs": [ 1488 | { 1489 | "data": { 1490 | "text/plain": [ 1491 | "array([[0],\n", 1492 | " [1],\n", 1493 | " [2],\n", 1494 | " [3],\n", 1495 | " [4],\n", 1496 | " [5]])" 1497 | ] 1498 | }, 1499 | "metadata": {}, 1500 | "output_type": "display_data" 1501 | } 1502 | ], 1503 | 
"source": [ 1504 | "b = np.expand_dims(a, axis=1)\n", 1505 | "b" 1506 | ] 1507 | }, 1508 | { 1509 | "cell_type": "code", 1510 | "execution_count": null, 1511 | "id": "d8cac73c", 1512 | "metadata": {}, 1513 | "outputs": [ 1514 | { 1515 | "data": { 1516 | "text/plain": [ 1517 | "(6, 1)" 1518 | ] 1519 | }, 1520 | "metadata": {}, 1521 | "output_type": "display_data" 1522 | } 1523 | ], 1524 | "source": [ 1525 | "b.shape" 1526 | ] 1527 | }, 1528 | { 1529 | "cell_type": "markdown", 1530 | "id": "5c7d0846", 1531 | "metadata": {}, 1532 | "source": [ 1533 | ">You can add an axis at index position 0\n" 1534 | ] 1535 | }, 1536 | { 1537 | "cell_type": "code", 1538 | "execution_count": null, 1539 | "id": "ed723cda", 1540 | "metadata": {}, 1541 | "outputs": [ 1542 | { 1543 | "data": { 1544 | "text/plain": [ 1545 | "array([[0, 1, 2, 3, 4, 5]])" 1546 | ] 1547 | }, 1548 | "metadata": {}, 1549 | "output_type": "display_data" 1550 | } 1551 | ], 1552 | "source": [ 1553 | "b = np.expand_dims(a, axis=0)\n", 1554 | "b" 1555 | ] 1556 | }, 1557 | { 1558 | "cell_type": "code", 1559 | "execution_count": null, 1560 | "id": "b8e6c53e", 1561 | "metadata": {}, 1562 | "outputs": [ 1563 | { 1564 | "data": { 1565 | "text/plain": [ 1566 | "(1, 6)" 1567 | ] 1568 | }, 1569 | "metadata": {}, 1570 | "output_type": "display_data" 1571 | } 1572 | ], 1573 | "source": [ 1574 | "b.shape" 1575 | ] 1576 | }, 1577 | { 1578 | "cell_type": "markdown", 1579 | "id": "d2728c58", 1580 | "metadata": {}, 1581 | "source": [ 1582 | "### 3.2.10 Basic Arithmetic Operations on an Array" 1583 | ] 1584 | }, 1585 | { 1586 | "cell_type": "markdown", 1587 | "id": "295d0494", 1588 | "metadata": {}, 1589 | "source": [ 1590 | "> Addition & Multiplication to elements of array" 1591 | ] 1592 | }, 1593 | { 1594 | "cell_type": "code", 1595 | "execution_count": null, 1596 | "id": "7a05ae57", 1597 | "metadata": {}, 1598 | "outputs": [ 1599 | { 1600 | "data": { 1601 | "text/plain": [ 1602 | "array([1, 2, 3, 4, 5, 6, 7, 8, 9])" 1603 | ] 
1604 | }, 1605 | "metadata": {}, 1606 | "output_type": "display_data" 1607 | } 1608 | ], 1609 | "source": [ 1610 | "a" 1611 | ] 1612 | }, 1613 | { 1614 | "cell_type": "code", 1615 | "execution_count": null, 1616 | "id": "27ef4420", 1617 | "metadata": {}, 1618 | "outputs": [ 1619 | { 1620 | "data": { 1621 | "text/plain": [ 1622 | "array([ 6, 12, 18, 24, 30, 36, 42, 48, 54])" 1623 | ] 1624 | }, 1625 | "metadata": {}, 1626 | "output_type": "display_data" 1627 | } 1628 | ], 1629 | "source": [ 1630 | "a*6" 1631 | ] 1632 | }, 1633 | { 1634 | "cell_type": "code", 1635 | "execution_count": null, 1636 | "id": "f80e6e8d", 1637 | "metadata": {}, 1638 | "outputs": [ 1639 | { 1640 | "data": { 1641 | "text/plain": [ 1642 | "array([ 7, 8, 9, 10, 11, 12, 13, 14, 15])" 1643 | ] 1644 | }, 1645 | "metadata": {}, 1646 | "output_type": "display_data" 1647 | } 1648 | ], 1649 | "source": [ 1650 | "a+6" 1651 | ] 1652 | }, 1653 | { 1654 | "cell_type": "code", 1655 | "execution_count": null, 1656 | "id": "668651ae", 1657 | "metadata": {}, 1658 | "outputs": [ 1659 | { 1660 | "data": { 1661 | "text/plain": [ 1662 | "45" 1663 | ] 1664 | }, 1665 | "metadata": {}, 1666 | "output_type": "display_data" 1667 | } 1668 | ], 1669 | "source": [ 1670 | "# Sum of elements\n", 1671 | "a.sum()" 1672 | ] 1673 | }, 1674 | { 1675 | "cell_type": "code", 1676 | "execution_count": null, 1677 | "id": "11aa6587", 1678 | "metadata": {}, 1679 | "outputs": [ 1680 | { 1681 | "data": { 1682 | "text/plain": [ 1683 | "5.0" 1684 | ] 1685 | }, 1686 | "metadata": {}, 1687 | "output_type": "display_data" 1688 | } 1689 | ], 1690 | "source": [ 1691 | "# mean of elements\n", 1692 | "a.mean()" 1693 | ] 1694 | }, 1695 | { 1696 | "cell_type": "markdown", 1697 | "id": "0fe9079e", 1698 | "metadata": {}, 1699 | "source": [ 1700 | "### 3.2.11 Indexing and Slicing" 1701 | ] 1702 | }, 1703 | { 1704 | "cell_type": "code", 1705 | "execution_count": null, 1706 | "id": "9d094233", 1707 | "metadata": {}, 1708 | "outputs": [ 1709 | { 1710 | 
"data": { 1711 | "text/plain": [ 1712 | "array([10, 11, 12, 13, 14, 15])" 1713 | ] 1714 | }, 1715 | "metadata": {}, 1716 | "output_type": "display_data" 1717 | } 1718 | ], 1719 | "source": [ 1720 | "a = np.array([10, 11, 12, 13, 14, 15])\n", 1721 | "a" 1722 | ] 1723 | }, 1724 | { 1725 | "cell_type": "code", 1726 | "execution_count": null, 1727 | "id": "2f7d5cd4", 1728 | "metadata": {}, 1729 | "outputs": [ 1730 | { 1731 | "data": { 1732 | "text/plain": [ 1733 | "12" 1734 | ] 1735 | }, 1736 | "metadata": {}, 1737 | "output_type": "display_data" 1738 | } 1739 | ], 1740 | "source": [ 1741 | "a[2]" 1742 | ] 1743 | }, 1744 | { 1745 | "cell_type": "code", 1746 | "execution_count": null, 1747 | "id": "ae5e8043", 1748 | "metadata": {}, 1749 | "outputs": [ 1750 | { 1751 | "data": { 1752 | "text/plain": [ 1753 | "array([10, 11, 12])" 1754 | ] 1755 | }, 1756 | "metadata": {}, 1757 | "output_type": "display_data" 1758 | } 1759 | ], 1760 | "source": [ 1761 | "a[0:3]" 1762 | ] 1763 | }, 1764 | { 1765 | "cell_type": "code", 1766 | "execution_count": null, 1767 | "id": "19b0b8b6", 1768 | "metadata": {}, 1769 | "outputs": [ 1770 | { 1771 | "data": { 1772 | "text/plain": [ 1773 | "array([10, 11, 12, 13, 14, 15])" 1774 | ] 1775 | }, 1776 | "metadata": {}, 1777 | "output_type": "display_data" 1778 | } 1779 | ], 1780 | "source": [ 1781 | "a[0:]" 1782 | ] 1783 | }, 1784 | { 1785 | "cell_type": "code", 1786 | "execution_count": null, 1787 | "id": "e64cb9aa", 1788 | "metadata": {}, 1789 | "outputs": [ 1790 | { 1791 | "data": { 1792 | "text/plain": [ 1793 | "array([10, 11, 12, 13, 14])" 1794 | ] 1795 | }, 1796 | "metadata": {}, 1797 | "output_type": "display_data" 1798 | } 1799 | ], 1800 | "source": [ 1801 | "a[:5]" 1802 | ] 1803 | }, 1804 | { 1805 | "cell_type": "code", 1806 | "execution_count": null, 1807 | "id": "311f0408", 1808 | "metadata": {}, 1809 | "outputs": [ 1810 | { 1811 | "data": { 1812 | "text/plain": [ 1813 | "array([11, 12, 13, 14, 15])" 1814 | ] 1815 | }, 1816 | "metadata": 
{}, 1817 | "output_type": "display_data" 1818 | } 1819 | ], 1820 | "source": [ 1821 | "a[-5:]" 1822 | ] 1823 | }, 1824 | { 1825 | "cell_type": "code", 1826 | "execution_count": null, 1827 | "id": "be73c3dc", 1828 | "metadata": {}, 1829 | "outputs": [ 1830 | { 1831 | "data": { 1832 | "text/plain": [ 1833 | "array([[ 1, 2, 3, 4],\n", 1834 | " [ 5, 6, 7, 8],\n", 1835 | " [ 9, 10, 11, 12]])" 1836 | ] 1837 | }, 1838 | "metadata": {}, 1839 | "output_type": "display_data" 1840 | } 1841 | ], 1842 | "source": [ 1843 | "a = np.array([[1 , 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])\n", 1844 | "a" 1845 | ] 1846 | }, 1847 | { 1848 | "cell_type": "code", 1849 | "execution_count": null, 1850 | "id": "dbf6c60d", 1851 | "metadata": {}, 1852 | "outputs": [ 1853 | { 1854 | "data": { 1855 | "text/plain": [ 1856 | "array([1, 2, 3, 4])" 1857 | ] 1858 | }, 1859 | "metadata": {}, 1860 | "output_type": "display_data" 1861 | } 1862 | ], 1863 | "source": [ 1864 | "a[a<5]" 1865 | ] 1866 | }, 1867 | { 1868 | "cell_type": "code", 1869 | "execution_count": null, 1870 | "id": "0040894e", 1871 | "metadata": {}, 1872 | "outputs": [ 1873 | { 1874 | "data": { 1875 | "text/plain": [ 1876 | "array([ 6, 7, 8, 9, 10, 11, 12])" 1877 | ] 1878 | }, 1879 | "metadata": {}, 1880 | "output_type": "display_data" 1881 | } 1882 | ], 1883 | "source": [ 1884 | "b = a > 5\n", 1885 | "a[b]" 1886 | ] 1887 | }, 1888 | { 1889 | "cell_type": "code", 1890 | "execution_count": null, 1891 | "id": "6d404e9b", 1892 | "metadata": {}, 1893 | "outputs": [ 1894 | { 1895 | "data": { 1896 | "text/plain": [ 1897 | "array([3, 4, 5, 6, 7, 8, 9])" 1898 | ] 1899 | }, 1900 | "metadata": {}, 1901 | "output_type": "display_data" 1902 | } 1903 | ], 1904 | "source": [ 1905 | "c = a[(a>2) & (a<10)]\n", 1906 | "c" 1907 | ] 1908 | }, 1909 | { 1910 | "cell_type": "code", 1911 | "execution_count": null, 1912 | "id": "868f55a7", 1913 | "metadata": {}, 1914 | "outputs": [ 1915 | { 1916 | "data": { 1917 | "text/plain": [ 1918 | "array([[ True, False, 
False, False],\n", 1919 | " [ True, True, True, True],\n", 1920 | " [ True, True, True, True]])" 1921 | ] 1922 | }, 1923 | "metadata": {}, 1924 | "output_type": "display_data" 1925 | } 1926 | ], 1927 | "source": [ 1928 | "c = (a>4) | (a==1)\n", 1929 | "c" 1930 | ] 1931 | }, 1932 | { 1933 | "cell_type": "code", 1934 | "execution_count": null, 1935 | "id": "386af875", 1936 | "metadata": {}, 1937 | "outputs": [ 1938 | { 1939 | "data": { 1940 | "text/plain": [ 1941 | "(array([0, 0, 0, 0], dtype=int64), array([0, 1, 2, 3], dtype=int64))" 1942 | ] 1943 | }, 1944 | "metadata": {}, 1945 | "output_type": "display_data" 1946 | } 1947 | ], 1948 | "source": [ 1949 | "b = np.nonzero(a <5)\n", 1950 | "b" 1951 | ] 1952 | }, 1953 | { 1954 | "cell_type": "markdown", 1955 | "id": "ce3807cd", 1956 | "metadata": {}, 1957 | "source": [ 1958 | "> Slicing" 1959 | ] 1960 | }, 1961 | { 1962 | "cell_type": "code", 1963 | "execution_count": null, 1964 | "id": "6858d9e5", 1965 | "metadata": {}, 1966 | "outputs": [ 1967 | { 1968 | "data": { 1969 | "text/plain": [ 1970 | "array([[ 1, 2, 3, 4],\n", 1971 | " [ 5, 6, 7, 8],\n", 1972 | " [ 9, 10, 11, 12]])" 1973 | ] 1974 | }, 1975 | "metadata": {}, 1976 | "output_type": "display_data" 1977 | } 1978 | ], 1979 | "source": [ 1980 | "a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])\n", 1981 | "a" 1982 | ] 1983 | }, 1984 | { 1985 | "cell_type": "markdown", 1986 | "id": "2404a44e", 1987 | "metadata": {}, 1988 | "source": [ 1989 | "> Slicing a row of an array\n", 1990 | "> \n", 1991 | "> \"1\" is the index of the row of an array" 1992 | ] 1993 | }, 1994 | { 1995 | "cell_type": "code", 1996 | "execution_count": null, 1997 | "id": "be84d9bc", 1998 | "metadata": {}, 1999 | "outputs": [ 2000 | { 2001 | "data": { 2002 | "text/plain": [ 2003 | "array([5, 6, 7, 8])" 2004 | ] 2005 | }, 2006 | "metadata": {}, 2007 | "output_type": "display_data" 2008 | } 2009 | ], 2010 | "source": [ 2011 | "b = a[1, :] # \"1\" is the index of the row of an array\n", 
2012 | "b" 2013 | ] 2014 | }, 2015 | { 2016 | "cell_type": "markdown", 2017 | "id": "f87af930", 2018 | "metadata": {}, 2019 | "source": [ 2020 | "> Slicing a Column of an array\n", 2021 | "> \n", 2022 | "> \"1\" is the index of the Column of an array" 2023 | ] 2024 | }, 2025 | { 2026 | "cell_type": "code", 2027 | "execution_count": null, 2028 | "id": "49f03cd2", 2029 | "metadata": {}, 2030 | "outputs": [ 2031 | { 2032 | "data": { 2033 | "text/plain": [ 2034 | "array([ 2, 6, 10])" 2035 | ] 2036 | }, 2037 | "metadata": {}, 2038 | "output_type": "display_data" 2039 | } 2040 | ], 2041 | "source": [ 2042 | "# to slice a column of an array\n", 2043 | "b = a[:, 1] # \"1\" is the index of the column of an array\n", 2044 | "b" 2045 | ] 2046 | }, 2047 | { 2048 | "cell_type": "markdown", 2049 | "id": "84c47e86", 2050 | "metadata": {}, 2051 | "source": [ 2052 | "> slicing and modifying element" 2053 | ] 2054 | }, 2055 | { 2056 | "cell_type": "code", 2057 | "execution_count": null, 2058 | "id": "0e9da100", 2059 | "metadata": {}, 2060 | "outputs": [ 2061 | { 2062 | "data": { 2063 | "text/plain": [ 2064 | "array([200, 6, 10])" 2065 | ] 2066 | }, 2067 | "metadata": {}, 2068 | "output_type": "display_data" 2069 | } 2070 | ], 2071 | "source": [ 2072 | "b[0] = 200\n", 2073 | "b" 2074 | ] 2075 | }, 2076 | { 2077 | "cell_type": "code", 2078 | "execution_count": null, 2079 | "id": "d9b2b8c5", 2080 | "metadata": {}, 2081 | "outputs": [ 2082 | { 2083 | "data": { 2084 | "text/plain": [ 2085 | "array([[ 1, 200, 3, 4],\n", 2086 | " [ 5, 6, 7, 8],\n", 2087 | " [ 9, 10, 11, 12]])" 2088 | ] 2089 | }, 2090 | "metadata": {}, 2091 | "output_type": "display_data" 2092 | } 2093 | ], 2094 | "source": [ 2095 | "a" 2096 | ] 2097 | }, 2098 | { 2099 | "cell_type": "markdown", 2100 | "id": "7db8191c", 2101 | "metadata": {}, 2102 | "source": [ 2103 | ">Making a copy of an array\n" 2104 | ] 2105 | }, 2106 | { 2107 | "cell_type": "code", 2108 | "execution_count": null, 2109 | "id": "7aa5f16e", 2110 | 
"metadata": {}, 2111 | "outputs": [ 2112 | { 2113 | "data": { 2114 | "text/plain": [ 2115 | "array([[ 1, 200, 3, 4],\n", 2116 | " [ 5, 6, 7, 8],\n", 2117 | " [ 9, 10, 11, 12]])" 2118 | ] 2119 | }, 2120 | "metadata": {}, 2121 | "output_type": "display_data" 2122 | } 2123 | ], 2124 | "source": [ 2125 | "c = a.copy()\n", 2126 | "c" 2127 | ] 2128 | }, 2129 | { 2130 | "cell_type": "markdown", 2131 | "id": "585a68a9", 2132 | "metadata": {}, 2133 | "source": [ 2134 | "### 3.2.12: Array Stacking & Splitting" 2135 | ] 2136 | }, 2137 | { 2138 | "cell_type": "markdown", 2139 | "id": "49d509a5", 2140 | "metadata": {}, 2141 | "source": [ 2142 | ">You can stack them vertically with vstack" 2143 | ] 2144 | }, 2145 | { 2146 | "cell_type": "code", 2147 | "execution_count": null, 2148 | "id": "3d72bf0c", 2149 | "metadata": {}, 2150 | "outputs": [ 2151 | { 2152 | "data": { 2153 | "text/plain": [ 2154 | "array([[1, 1],\n", 2155 | " [2, 2],\n", 2156 | " [3, 3],\n", 2157 | " [4, 4]])" 2158 | ] 2159 | }, 2160 | "metadata": {}, 2161 | "output_type": "display_data" 2162 | } 2163 | ], 2164 | "source": [ 2165 | "a1 = np.array([[1, 1],\n", 2166 | " [2, 2]])\n", 2167 | "a2 = np.array([[3, 3],\n", 2168 | " [4, 4]])\n", 2169 | "\n", 2170 | "np.vstack((a1, a2)) " 2171 | ] 2172 | }, 2173 | { 2174 | "cell_type": "markdown", 2175 | "id": "bcc102f7", 2176 | "metadata": {}, 2177 | "source": [ 2178 | ">Or stack them horizontally with hstack\n" 2179 | ] 2180 | }, 2181 | { 2182 | "cell_type": "code", 2183 | "execution_count": null, 2184 | "id": "87b4c071", 2185 | "metadata": {}, 2186 | "outputs": [ 2187 | { 2188 | "data": { 2189 | "text/plain": [ 2190 | "array([[1, 1, 3, 3],\n", 2191 | " [2, 2, 4, 4]])" 2192 | ] 2193 | }, 2194 | "metadata": {}, 2195 | "output_type": "display_data" 2196 | } 2197 | ], 2198 | "source": [ 2199 | "np.hstack((a1, a2))" 2200 | ] 2201 | }, 2202 | { 2203 | "cell_type": "markdown", 2204 | "id": "d399c1f6", 2205 | "metadata": {}, 2206 | "source": [ 2207 | ">Array Splitting" 2208 | 
] 2209 | }, 2210 | { 2211 | "cell_type": "code", 2212 | "execution_count": null, 2213 | "id": "e91e09d8", 2214 | "metadata": {}, 2215 | "outputs": [ 2216 | { 2217 | "data": { 2218 | "text/plain": [ 2219 | "array([[ 1, 2, 3, 4, 5, 6, 7, 8],\n", 2220 | " [ 9, 10, 11, 12, 13, 14, 15, 16],\n", 2221 | " [17, 18, 19, 20, 21, 22, 23, 24]])" 2222 | ] 2223 | }, 2224 | "metadata": {}, 2225 | "output_type": "display_data" 2226 | } 2227 | ], 2228 | "source": [ 2229 | "x = np.arange(1, 25).reshape(3,8)\n", 2230 | "x" 2231 | ] 2232 | }, 2233 | { 2234 | "cell_type": "markdown", 2235 | "id": "5a576ac8", 2236 | "metadata": {}, 2237 | "source": [ 2238 | "> To split this array into equally shaped arrays, pass the number of sections to np.hsplit.\n", 2239 | ">\n", 2240 | "> **Important: the number of sections must divide the number of columns evenly. The array above has 8 columns, so it can be split into 2, 4, or 8 equal parts.**" 2241 | ] 2242 | }, 2243 | { 2244 | "cell_type": "code", 2245 | "execution_count": null, 2246 | "id": "1d1fa4ee", 2247 | "metadata": {}, 2248 | "outputs": [ 2249 | { 2250 | "data": { 2251 | "text/plain": [ 2252 | "[array([[ 1, 2, 3, 4],\n", 2253 | " [ 9, 10, 11, 12],\n", 2254 | " [17, 18, 19, 20]]),\n", 2255 | " array([[ 5, 6, 7, 8],\n", 2256 | " [13, 14, 15, 16],\n", 2257 | " [21, 22, 23, 24]])]" 2258 | ] 2259 | }, 2260 | "metadata": {}, 2261 | "output_type": "display_data" 2262 | } 2263 | ], 2264 | "source": [ 2265 | "np.hsplit(x,2)" 2266 | ] 2267 | }, 2268 | { 2269 | "cell_type": "code", 2270 | "execution_count": null, 2271 | "id": "61d976f7", 2272 | "metadata": {}, 2273 | "outputs": [ 2274 | { 2275 | "data": { 2276 | "text/plain": [ 2277 | "[array([[ 1, 2, 3, 4, 5],\n", 2278 | " [ 9, 10, 11, 12, 13],\n", 2279 | " [17, 18, 19, 20, 21]]),\n", 2280 | " array([[ 6],\n", 2281 | " [14],\n", 2282 | " [22]]),\n", 2283 | " array([[ 7, 8],\n", 2284 | " [15, 16],\n", 2285 | " [23, 24]])]" 2286 | ] 2287 | }, 2288 | "metadata": {}, 2289 | "output_type": "display_data" 2290 | } 2291 | ], 2292 | "source": [ 2293 | "np.hsplit(x, 
(5,6))" 2294 | ] 2295 | }, 2296 | { 2297 | "cell_type": "markdown", 2298 | "id": "e21ce303", 2299 | "metadata": {}, 2300 | "source": [ 2301 | "## Section 3.3: Basic Array Operations" 2302 | ] 2303 | }, 2304 | { 2305 | "cell_type": "markdown", 2306 | "id": "fb57017a", 2307 | "metadata": {}, 2308 | "source": [ 2309 | "> Addition" 2310 | ] 2311 | }, 2312 | { 2313 | "cell_type": "code", 2314 | "execution_count": null, 2315 | "id": "2e90f028", 2316 | "metadata": {}, 2317 | "outputs": [ 2318 | { 2319 | "data": { 2320 | "text/plain": [ 2321 | "array([2, 3])" 2322 | ] 2323 | }, 2324 | "metadata": {}, 2325 | "output_type": "display_data" 2326 | } 2327 | ], 2328 | "source": [ 2329 | "import numpy as np\n", 2330 | "a = np.array([2,3])\n", 2331 | "a" 2332 | ] 2333 | }, 2334 | { 2335 | "cell_type": "code", 2336 | "execution_count": null, 2337 | "id": "03eddb34", 2338 | "metadata": {}, 2339 | "outputs": [ 2340 | { 2341 | "data": { 2342 | "text/plain": [ 2343 | "array([1, 1])" 2344 | ] 2345 | }, 2346 | "metadata": {}, 2347 | "output_type": "display_data" 2348 | } 2349 | ], 2350 | "source": [ 2351 | "b = np.ones(2, dtype=int)\n", 2352 | "b" 2353 | ] 2354 | }, 2355 | { 2356 | "cell_type": "code", 2357 | "execution_count": null, 2358 | "id": "1df7fc1e", 2359 | "metadata": {}, 2360 | "outputs": [ 2361 | { 2362 | "data": { 2363 | "text/plain": [ 2364 | "array([3, 4])" 2365 | ] 2366 | }, 2367 | "metadata": {}, 2368 | "output_type": "display_data" 2369 | } 2370 | ], 2371 | "source": [ 2372 | "c = a+b\n", 2373 | "c" 2374 | ] 2375 | }, 2376 | { 2377 | "cell_type": "markdown", 2378 | "id": "4a7c5d1f", 2379 | "metadata": {}, 2380 | "source": [ 2381 | "### 3.3.1 Basic Operations of 2D Array (Addition, Subtraction, Multiplication & Division) " 2382 | ] 2383 | }, 2384 | { 2385 | "cell_type": "markdown", 2386 | "id": "712e347a", 2387 | "metadata": {}, 2388 | "source": [ 2389 | "> You can add and multiply them using arithmetic operators if you have two matrices that are the same size." 
2390 | ] 2391 | }, 2392 | { 2393 | "cell_type": "code", 2394 | "execution_count": null, 2395 | "id": "bb58c344", 2396 | "metadata": {}, 2397 | "outputs": [ 2398 | { 2399 | "data": { 2400 | "text/plain": [ 2401 | "array([[2, 3],\n", 2402 | " [4, 5]])" 2403 | ] 2404 | }, 2405 | "metadata": {}, 2406 | "output_type": "display_data" 2407 | } 2408 | ], 2409 | "source": [ 2410 | "a = np.array([[1, 2], [3, 4]])\n", 2411 | "b = np.array([[1, 1], [1, 1]])\n", 2412 | "a + b" 2413 | ] 2414 | }, 2415 | { 2416 | "cell_type": "markdown", 2417 | "id": "2bb60d6a", 2418 | "metadata": {}, 2419 | "source": [ 2420 | "> You can do these arithmetic operations on matrices of different sizes, but only if one matrix has only one column or one row." 2421 | ] 2422 | }, 2423 | { 2424 | "cell_type": "code", 2425 | "execution_count": null, 2426 | "id": "dcbf28fa", 2427 | "metadata": {}, 2428 | "outputs": [ 2429 | { 2430 | "data": { 2431 | "text/plain": [ 2432 | "array([[2, 3],\n", 2433 | " [4, 5],\n", 2434 | " [6, 7]])" 2435 | ] 2436 | }, 2437 | "metadata": {}, 2438 | "output_type": "display_data" 2439 | } 2440 | ], 2441 | "source": [ 2442 | "x = np.array([[1, 2], [3, 4], [5, 6]])\n", 2443 | "y = np.array([[1, 1]])\n", 2444 | "x+y" 2445 | ] 2446 | }, 2447 | { 2448 | "cell_type": "markdown", 2449 | "id": "c1dc1f0a", 2450 | "metadata": {}, 2451 | "source": [ 2452 | ">Subtraction" 2453 | ] 2454 | }, 2455 | { 2456 | "cell_type": "code", 2457 | "execution_count": null, 2458 | "id": "fa8acc9d", 2459 | "metadata": {}, 2460 | "outputs": [ 2461 | { 2462 | "data": { 2463 | "text/plain": [ 2464 | "array([1, 2])" 2465 | ] 2466 | }, 2467 | "metadata": {}, 2468 | "output_type": "display_data" 2469 | } 2470 | ], 2471 | "source": [ 2472 | "d = a-b\n", 2473 | "d" 2474 | ] 2475 | }, 2476 | { 2477 | "cell_type": "markdown", 2478 | "id": "9521fac0", 2479 | "metadata": {}, 2480 | "source": [ 2481 | "> Multiplication" 2482 | ] 2483 | }, 2484 | { 2485 | "cell_type": "code", 2486 | "execution_count": null, 2487 | "id": 
"845af2d9", 2488 | "metadata": {}, 2489 | "outputs": [ 2490 | { 2491 | "data": { 2492 | "text/plain": [ 2493 | "array([[ 0, 4],\n", 2494 | " [ 6, 12]])" 2495 | ] 2496 | }, 2497 | "metadata": {}, 2498 | "output_type": "display_data" 2499 | } 2500 | ], 2501 | "source": [ 2502 | "e = c*d\n", 2503 | "e" 2504 | ] 2505 | }, 2506 | { 2507 | "cell_type": "markdown", 2508 | "id": "b4a55529", 2509 | "metadata": {}, 2510 | "source": [ 2511 | "> Division\n" 2512 | ] 2513 | }, 2514 | { 2515 | "cell_type": "code", 2516 | "execution_count": null, 2517 | "id": "edbc056d", 2518 | "metadata": {}, 2519 | "outputs": [ 2520 | { 2521 | "data": { 2522 | "text/plain": [ 2523 | "array([3., 2.])" 2524 | ] 2525 | }, 2526 | "metadata": {}, 2527 | "output_type": "display_data" 2528 | } 2529 | ], 2530 | "source": [ 2531 | "f = c/d\n", 2532 | "f" 2533 | ] 2534 | }, 2535 | { 2536 | "cell_type": "markdown", 2537 | "id": "3664fbf3", 2538 | "metadata": {}, 2539 | "source": [ 2540 | "### 3.3.2 Sum of Elements in Array" 2541 | ] 2542 | }, 2543 | { 2544 | "cell_type": "markdown", 2545 | "id": "bc528ada", 2546 | "metadata": {}, 2547 | "source": [ 2548 | ">Sum of elements in 1-D array" 2549 | ] 2550 | }, 2551 | { 2552 | "cell_type": "code", 2553 | "execution_count": null, 2554 | "id": "74e12ec4", 2555 | "metadata": {}, 2556 | "outputs": [ 2557 | { 2558 | "data": { 2559 | "text/plain": [ 2560 | "array([0, 1, 2, 3])" 2561 | ] 2562 | }, 2563 | "metadata": {}, 2564 | "output_type": "display_data" 2565 | } 2566 | ], 2567 | "source": [ 2568 | "a = np.arange(4)\n", 2569 | "a" 2570 | ] 2571 | }, 2572 | { 2573 | "cell_type": "code", 2574 | "execution_count": null, 2575 | "id": "f68b9513", 2576 | "metadata": {}, 2577 | "outputs": [ 2578 | { 2579 | "data": { 2580 | "text/plain": [ 2581 | "6" 2582 | ] 2583 | }, 2584 | "metadata": {}, 2585 | "output_type": "display_data" 2586 | } 2587 | ], 2588 | "source": [ 2589 | "a.sum()" 2590 | ] 2591 | }, 2592 | { 2593 | "cell_type": "markdown", 2594 | "id": "62a79a9d", 2595 | 
"metadata": {}, 2596 | "source": [ 2597 | ">Sum of elements in 2-D array" 2598 | ] 2599 | }, 2600 | { 2601 | "cell_type": "code", 2602 | "execution_count": null, 2603 | "id": "4a676333", 2604 | "metadata": {}, 2605 | "outputs": [ 2606 | { 2607 | "data": { 2608 | "text/plain": [ 2609 | "array([[1, 2],\n", 2610 | " [3, 4]])" 2611 | ] 2612 | }, 2613 | "metadata": {}, 2614 | "output_type": "display_data" 2615 | } 2616 | ], 2617 | "source": [ 2618 | "a = np.array([[1,2], [3,4]])\n", 2619 | "a" 2620 | ] 2621 | }, 2622 | { 2623 | "cell_type": "markdown", 2624 | "id": "c6c2d506", 2625 | "metadata": {}, 2626 | "source": [ 2627 | ">sum of 2D array on 0 axis" 2628 | ] 2629 | }, 2630 | { 2631 | "cell_type": "code", 2632 | "execution_count": null, 2633 | "id": "c5fd85dc", 2634 | "metadata": {}, 2635 | "outputs": [ 2636 | { 2637 | "data": { 2638 | "text/plain": [ 2639 | "array([4, 6])" 2640 | ] 2641 | }, 2642 | "metadata": {}, 2643 | "output_type": "display_data" 2644 | } 2645 | ], 2646 | "source": [ 2647 | "a.sum(axis = 0)" 2648 | ] 2649 | }, 2650 | { 2651 | "cell_type": "markdown", 2652 | "id": "65b4c783", 2653 | "metadata": {}, 2654 | "source": [ 2655 | ">sum of 2D array on 1 axis" 2656 | ] 2657 | }, 2658 | { 2659 | "cell_type": "code", 2660 | "execution_count": null, 2661 | "id": "09d5cb8b", 2662 | "metadata": {}, 2663 | "outputs": [ 2664 | { 2665 | "data": { 2666 | "text/plain": [ 2667 | "array([3, 7])" 2668 | ] 2669 | }, 2670 | "metadata": {}, 2671 | "output_type": "display_data" 2672 | } 2673 | ], 2674 | "source": [ 2675 | "a.sum(axis = 1)" 2676 | ] 2677 | }, 2678 | { 2679 | "cell_type": "markdown", 2680 | "id": "63274fc2", 2681 | "metadata": {}, 2682 | "source": [ 2683 | ">multiplication of an scalar and vector in array" 2684 | ] 2685 | }, 2686 | { 2687 | "cell_type": "code", 2688 | "execution_count": null, 2689 | "id": "b5eaf0ae", 2690 | "metadata": {}, 2691 | "outputs": [ 2692 | { 2693 | "data": { 2694 | "text/plain": [ 2695 | "array([3., 5.])" 2696 | ] 2697 | }, 2698 
| "metadata": {}, 2699 | "output_type": "display_data" 2700 | } 2701 | ], 2702 | "source": [ 2703 | "a = np.array([1.5, 2.5])\n", 2704 | "a * 2" 2705 | ] 2706 | }, 2707 | { 2708 | "cell_type": "markdown", 2709 | "id": "83ff4562", 2710 | "metadata": {}, 2711 | "source": [ 2712 | "## Section 3.4: Basic Statistical Operations in Arrays\n", 2713 | "> To find out maximum and minimum, sum, mean, product, standard deviation" 2714 | ] 2715 | }, 2716 | { 2717 | "cell_type": "markdown", 2718 | "id": "6b8080a8", 2719 | "metadata": {}, 2720 | "source": [ 2721 | "> 1-D Array" 2722 | ] 2723 | }, 2724 | { 2725 | "cell_type": "code", 2726 | "execution_count": null, 2727 | "id": "cca724f4", 2728 | "metadata": {}, 2729 | "outputs": [ 2730 | { 2731 | "data": { 2732 | "text/plain": [ 2733 | "array([1, 2, 3, 4, 5, 6, 7, 8, 9])" 2734 | ] 2735 | }, 2736 | "metadata": {}, 2737 | "output_type": "display_data" 2738 | } 2739 | ], 2740 | "source": [ 2741 | "a = np.arange(1,10)\n", 2742 | "a" 2743 | ] 2744 | }, 2745 | { 2746 | "cell_type": "code", 2747 | "execution_count": null, 2748 | "id": "7845f6de", 2749 | "metadata": {}, 2750 | "outputs": [ 2751 | { 2752 | "data": { 2753 | "text/plain": [ 2754 | "1" 2755 | ] 2756 | }, 2757 | "metadata": {}, 2758 | "output_type": "display_data" 2759 | } 2760 | ], 2761 | "source": [ 2762 | "a.min()" 2763 | ] 2764 | }, 2765 | { 2766 | "cell_type": "code", 2767 | "execution_count": null, 2768 | "id": "50c61157", 2769 | "metadata": {}, 2770 | "outputs": [ 2771 | { 2772 | "data": { 2773 | "text/plain": [ 2774 | "9" 2775 | ] 2776 | }, 2777 | "metadata": {}, 2778 | "output_type": "display_data" 2779 | } 2780 | ], 2781 | "source": [ 2782 | "a.max()" 2783 | ] 2784 | }, 2785 | { 2786 | "cell_type": "code", 2787 | "execution_count": null, 2788 | "id": "5f9a04af", 2789 | "metadata": {}, 2790 | "outputs": [ 2791 | { 2792 | "data": { 2793 | "text/plain": [ 2794 | "45" 2795 | ] 2796 | }, 2797 | "metadata": {}, 2798 | "output_type": "display_data" 2799 | } 2800 | ], 2801 
| "source": [ 2802 | "a.sum()" 2803 | ] 2804 | }, 2805 | { 2806 | "cell_type": "code", 2807 | "execution_count": null, 2808 | "id": "4d6f4f78", 2809 | "metadata": {}, 2810 | "outputs": [ 2811 | { 2812 | "data": { 2813 | "text/plain": [ 2814 | "362880" 2815 | ] 2816 | }, 2817 | "metadata": {}, 2818 | "output_type": "display_data" 2819 | } 2820 | ], 2821 | "source": [ 2822 | "a.prod()" 2823 | ] 2824 | }, 2825 | { 2826 | "cell_type": "code", 2827 | "execution_count": null, 2828 | "id": "32225504", 2829 | "metadata": {}, 2830 | "outputs": [ 2831 | { 2832 | "data": { 2833 | "text/plain": [ 2834 | "2.581988897471611" 2835 | ] 2836 | }, 2837 | "metadata": {}, 2838 | "output_type": "display_data" 2839 | } 2840 | ], 2841 | "source": [ 2842 | "a.std()" 2843 | ] 2844 | }, 2845 | { 2846 | "cell_type": "markdown", 2847 | "id": "fd8a9b75", 2848 | "metadata": {}, 2849 | "source": [ 2850 | "> 2-D Array" 2851 | ] 2852 | }, 2853 | { 2854 | "cell_type": "code", 2855 | "execution_count": null, 2856 | "id": "912b5749", 2857 | "metadata": {}, 2858 | "outputs": [ 2859 | { 2860 | "data": { 2861 | "text/plain": [ 2862 | "4.8595784" 2863 | ] 2864 | }, 2865 | "metadata": {}, 2866 | "output_type": "display_data" 2867 | } 2868 | ], 2869 | "source": [ 2870 | "a = np.array([[0.45053314, 0.17296777, 0.34376245, 0.5510652],\n", 2871 | " [0.54627315, 0.05093587, 0.40067661, 0.55645993],\n", 2872 | " [0.12697628, 0.82485143, 0.26590556, 0.56917101]])\n", 2873 | "a.sum()" 2874 | ] 2875 | }, 2876 | { 2877 | "cell_type": "code", 2878 | "execution_count": null, 2879 | "id": "a582957e", 2880 | "metadata": {}, 2881 | "outputs": [ 2882 | { 2883 | "data": { 2884 | "text/plain": [ 2885 | "array([[0.17296777, 0.34376245, 0.45053314, 0.5510652 ],\n", 2886 | " [0.05093587, 0.40067661, 0.54627315, 0.55645993],\n", 2887 | " [0.12697628, 0.26590556, 0.56917101, 0.82485143]])" 2888 | ] 2889 | }, 2890 | "metadata": {}, 2891 | "output_type": "display_data" 2892 | } 2893 | ], 2894 | "source": [ 2895 | "a.sort()\n", 
2896 | "a" 2897 | ] 2898 | }, 2899 | { 2900 | "cell_type": "code", 2901 | "execution_count": null, 2902 | "id": "84a8d259", 2903 | "metadata": {}, 2904 | "outputs": [ 2905 | { 2906 | "data": { 2907 | "text/plain": [ 2908 | "0.05093587" 2909 | ] 2910 | }, 2911 | "metadata": {}, 2912 | "output_type": "display_data" 2913 | } 2914 | ], 2915 | "source": [ 2916 | "a.min()" 2917 | ] 2918 | }, 2919 | { 2920 | "cell_type": "code", 2921 | "execution_count": null, 2922 | "id": "a1e4fb01", 2923 | "metadata": {}, 2924 | "outputs": [ 2925 | { 2926 | "data": { 2927 | "text/plain": [ 2928 | "0.82485143" 2929 | ] 2930 | }, 2931 | "metadata": {}, 2932 | "output_type": "display_data" 2933 | } 2934 | ], 2935 | "source": [ 2936 | "a.max()" 2937 | ] 2938 | }, 2939 | { 2940 | "cell_type": "markdown", 2941 | "id": "6befe519", 2942 | "metadata": {}, 2943 | "source": [ 2944 | "> Minimum and maximum of a 2-D array along a specific axis" 2945 | ] 2946 | }, 2947 | { 2948 | "cell_type": "markdown", 2949 | "id": "8bd6a746", 2950 | "metadata": {}, 2951 | "source": [ 2952 | "> With axis=0, min takes the minimum value from each column.\n", 2953 | "> \n", 2954 | "> **Important: axis=0 scans down each column (vertically) and returns one minimum per column, as a 1-D array.**" 2955 | ] 2956 | }, 2957 | { 2958 | "cell_type": "code", 2959 | "execution_count": null, 2960 | "id": "69adf977", 2961 | "metadata": {}, 2962 | "outputs": [ 2963 | { 2964 | "data": { 2965 | "text/plain": [ 2966 | "array([0.05093587, 0.26590556, 0.45053314, 0.5510652 ])" 2967 | ] 2968 | }, 2969 | "metadata": {}, 2970 | "output_type": "display_data" 2971 | } 2972 | ], 2973 | "source": [ 2974 | "a.min(axis=0)" 2975 | ] 2976 | }, 2977 | { 2978 | "cell_type": "markdown", 2979 | "id": "1d23fcd9", 2980 | "metadata": {}, 2981 | "source": [ 2982 | ">when we use axis=1 to specify the axis. 
It takes the minimum value from each row.\n", 2983 | ">\n", 2984 | ">**Important: axis=1 scans across each row (horizontally) and returns one minimum per row, as a 1-D array.**" 2985 | ] 2986 | }, 2987 | { 2988 | "cell_type": "code", 2989 | "execution_count": null, 2990 | "id": "a037f9d6", 2991 | "metadata": {}, 2992 | "outputs": [ 2993 | { 2994 | "data": { 2995 | "text/plain": [ 2996 | "array([0.17296777, 0.05093587, 0.12697628])" 2997 | ] 2998 | }, 2999 | "metadata": {}, 3000 | "output_type": "display_data" 3001 | } 3002 | ], 3003 | "source": [ 3004 | "a.min(axis=1)" 3005 | ] 3006 | }, 3007 | { 3008 | "cell_type": "markdown", 3009 | "id": "e80e02ab", 3010 | "metadata": {}, 3011 | "source": [ 3012 | "> With axis=0, max takes the maximum value from each column.\n", 3013 | "> \n", 3014 | "> **Important: axis=0 scans down each column (vertically) and returns one maximum per column, as a 1-D array.**" 3015 | ] 3016 | }, 3017 | { 3018 | "cell_type": "code", 3019 | "execution_count": null, 3020 | "id": "1e81174f", 3021 | "metadata": {}, 3022 | "outputs": [ 3023 | { 3024 | "data": { 3025 | "text/plain": [ 3026 | "array([0.17296777, 0.40067661, 0.56917101, 0.82485143])" 3027 | ] 3028 | }, 3029 | "metadata": {}, 3030 | "output_type": "display_data" 3031 | } 3032 | ], 3033 | "source": [ 3034 | "a.max(axis=0)" 3035 | ] 3036 | }, 3037 | { 3038 | "cell_type": "markdown", 3039 | "id": "c9425af6", 3040 | "metadata": {}, 3041 | "source": [ 3042 | "> when we use axis=1 to specify the axis. 
It takes the maximum value from each row.\n", 3043 | "> \n", 3044 | "> **Important: axis=1 scans across each row (horizontally) and returns one maximum per row, as a 1-D array.**" 3045 | ] 3046 | }, 3047 | { 3048 | "cell_type": "code", 3049 | "execution_count": null, 3050 | "id": "4a0c8c85", 3051 | "metadata": {}, 3052 | "outputs": [ 3053 | { 3054 | "data": { 3055 | "text/plain": [ 3056 | "array([0.5510652 , 0.55645993, 0.82485143])" 3057 | ] 3058 | }, 3059 | "metadata": {}, 3060 | "output_type": "display_data" 3061 | } 3062 | ], 3063 | "source": [ 3064 | "a.max(axis=1)" 3065 | ] 3066 | }, 3067 | { 3068 | "cell_type": "code", 3069 | "execution_count": null, 3070 | "id": "d3ba7dd9", 3071 | "metadata": {}, 3072 | "outputs": [ 3073 | { 3074 | "data": { 3075 | "text/plain": [ 3076 | "1.451721612088471e-06" 3077 | ] 3078 | }, 3079 | "metadata": {}, 3080 | "output_type": "display_data" 3081 | } 3082 | ], 3083 | "source": [ 3084 | "a.prod()" 3085 | ] 3086 | }, 3087 | { 3088 | "cell_type": "code", 3089 | "execution_count": null, 3090 | "id": "b16b16d3", 3091 | "metadata": {}, 3092 | "outputs": [ 3093 | { 3094 | "data": { 3095 | "text/plain": [ 3096 | "0.21392120766089617" 3097 | ] 3098 | }, 3099 | "metadata": {}, 3100 | "output_type": "display_data" 3101 | } 3102 | ], 3103 | "source": [ 3104 | "a.std()" 3105 | ] 3106 | }, 3107 | { 3108 | "cell_type": "markdown", 3109 | "id": "4acd2eaf", 3110 | "metadata": {}, 3111 | "source": [ 3112 | "## Section 3.5: Indexing 2-D Array" 3113 | ] 3114 | }, 3115 | { 3116 | "cell_type": "code", 3117 | "execution_count": null, 3118 | "id": "12fd7121", 3119 | "metadata": {}, 3120 | "outputs": [ 3121 | { 3122 | "data": { 3123 | "text/plain": [ 3124 | "array([[1, 2],\n", 3125 | " [3, 4],\n", 3126 | " [5, 6]])" 3127 | ] 3128 | }, 3129 | "metadata": {}, 3130 | "output_type": "display_data" 3131 | } 3132 | ], 3133 | "source": [ 3134 | "x = np.array([[1, 2], [3, 4], [5, 6]])\n", 3135 | "x" 3136 | ] 3137 | }, 3138 | { 3139 | "cell_type": "markdown", 3140 | "id": 
"d8720cae", 3141 | "metadata": {}, 3142 | "source": [ 3143 | "> In two dimensional array index (0, 1) here (Row = 0, Column = 1), Index will be the intersecting point of both Row and Column" 3144 | ] 3145 | }, 3146 | { 3147 | "cell_type": "code", 3148 | "execution_count": null, 3149 | "id": "976ccbac", 3150 | "metadata": {}, 3151 | "outputs": [ 3152 | { 3153 | "data": { 3154 | "text/plain": [ 3155 | "2" 3156 | ] 3157 | }, 3158 | "metadata": {}, 3159 | "output_type": "display_data" 3160 | } 3161 | ], 3162 | "source": [ 3163 | "x[0 , 1]" 3164 | ] 3165 | }, 3166 | { 3167 | "cell_type": "markdown", 3168 | "id": "358bbd0a", 3169 | "metadata": {}, 3170 | "source": [ 3171 | "> In 2-D array index (1:3, 1) or ((1,2), 1) here (Row, Column), Index will be the intersecting point of both Row and Column" 3172 | ] 3173 | }, 3174 | { 3175 | "cell_type": "code", 3176 | "execution_count": null, 3177 | "id": "23c611e2", 3178 | "metadata": {}, 3179 | "outputs": [ 3180 | { 3181 | "data": { 3182 | "text/plain": [ 3183 | "array([4, 6])" 3184 | ] 3185 | }, 3186 | "metadata": {}, 3187 | "output_type": "display_data" 3188 | } 3189 | ], 3190 | "source": [ 3191 | "x[1:3, 1]" 3192 | ] 3193 | }, 3194 | { 3195 | "cell_type": "markdown", 3196 | "id": "6509f172", 3197 | "metadata": {}, 3198 | "source": [ 3199 | "> In 2-D array index (1:3) or ((1,2), empty) here (Row = 1,2, Column = empty), Index will be on Row 1 & 2" 3200 | ] 3201 | }, 3202 | { 3203 | "cell_type": "code", 3204 | "execution_count": null, 3205 | "id": "df808d95", 3206 | "metadata": {}, 3207 | "outputs": [ 3208 | { 3209 | "data": { 3210 | "text/plain": [ 3211 | "array([[3, 4],\n", 3212 | " [5, 6]])" 3213 | ] 3214 | }, 3215 | "metadata": {}, 3216 | "output_type": "display_data" 3217 | } 3218 | ], 3219 | "source": [ 3220 | "x[1:3]" 3221 | ] 3222 | }, 3223 | { 3224 | "cell_type": "markdown", 3225 | "id": "507df4ee", 3226 | "metadata": {}, 3227 | "source": [ 3228 | "> Maximum and Minimum in 2-D Array" 3229 | ] 3230 | }, 3231 | { 3232 | 
"cell_type": "code", 3233 | "execution_count": null, 3234 | "id": "8822fae0", 3235 | "metadata": {}, 3236 | "outputs": [ 3237 | { 3238 | "data": { 3239 | "text/plain": [ 3240 | "6" 3241 | ] 3242 | }, 3243 | "metadata": {}, 3244 | "output_type": "display_data" 3245 | } 3246 | ], 3247 | "source": [ 3248 | "x.max()" 3249 | ] 3250 | }, 3251 | { 3252 | "cell_type": "code", 3253 | "execution_count": null, 3254 | "id": "1ddaa1ff", 3255 | "metadata": {}, 3256 | "outputs": [ 3257 | { 3258 | "data": { 3259 | "text/plain": [ 3260 | "1" 3261 | ] 3262 | }, 3263 | "metadata": {}, 3264 | "output_type": "display_data" 3265 | } 3266 | ], 3267 | "source": [ 3268 | "x.min()" 3269 | ] 3270 | }, 3271 | { 3272 | "cell_type": "markdown", 3273 | "id": "19dd4b88", 3274 | "metadata": {}, 3275 | "source": [ 3276 | "> when we use axis=0 to specify the axis. it takes maximum value from each column\n", 3277 | "> \n", 3278 | "> **Imp: axis = 0 gives output as row but scan in vertical way to find maximum value**\n" 3279 | ] 3280 | }, 3281 | { 3282 | "cell_type": "code", 3283 | "execution_count": null, 3284 | "id": "a959bee2", 3285 | "metadata": {}, 3286 | "outputs": [ 3287 | { 3288 | "data": { 3289 | "text/plain": [ 3290 | "array([5, 6])" 3291 | ] 3292 | }, 3293 | "metadata": {}, 3294 | "output_type": "display_data" 3295 | } 3296 | ], 3297 | "source": [ 3298 | "x.max(axis=0)" 3299 | ] 3300 | }, 3301 | { 3302 | "cell_type": "markdown", 3303 | "id": "3b613838", 3304 | "metadata": {}, 3305 | "source": [ 3306 | "> when we use axis=1 to specify the axis. 
it takes the maximum value from each row\n", 3307 | "> \n", 3308 | "> **Imp: axis=1 scans each row horizontally and returns a 1-D array with one maximum per row**" 3309 | ] 3310 | }, 3311 | { 3312 | "cell_type": "code", 3313 | "execution_count": null, 3314 | "id": "b5286d6f", 3315 | "metadata": {}, 3316 | "outputs": [ 3317 | { 3318 | "data": { 3319 | "text/plain": [ 3320 | "array([2, 4, 6])" 3321 | ] 3322 | }, 3323 | "metadata": {}, 3324 | "output_type": "display_data" 3325 | } 3326 | ], 3327 | "source": [ 3328 | "x.max(axis=1)" 3329 | ] 3330 | }, 3331 | { 3332 | "cell_type": "markdown", 3333 | "id": "c700c78a", 3334 | "metadata": {}, 3335 | "source": [ 3336 | "> aggregate matrices the same way you aggregated vectors" 3337 | ] 3338 | }, 3339 | { 3340 | "cell_type": "code", 3341 | "execution_count": null, 3342 | "id": "f672a8b5", 3343 | "metadata": {}, 3344 | "outputs": [ 3345 | { 3346 | "data": { 3347 | "text/plain": [ 3348 | "21" 3349 | ] 3350 | }, 3351 | "metadata": {}, 3352 | "output_type": "display_data" 3353 | } 3354 | ], 3355 | "source": [ 3356 | "x.sum()" 3357 | ] 3358 | }, 3359 | { 3360 | "cell_type": "markdown", 3361 | "id": "4474171d", 3362 | "metadata": {}, 3363 | "source": [ 3364 | "## Section 3.6: Random, Reverse, Reshape & Transpose of an Array" 3365 | ] 3366 | }, 3367 | { 3368 | "cell_type": "markdown", 3369 | "id": "83fbef5d", 3370 | "metadata": {}, 3371 | "source": [ 3372 | "> a simple way to generate random numbers: create a seeded Generator with default_rng() and call random()" 3373 | ] 3374 | }, 3375 | { 3376 | "cell_type": "code", 3377 | "execution_count": null, 3378 | "id": "e968f798", 3379 | "metadata": {}, 3380 | "outputs": [ 3381 | { 3382 | "data": { 3383 | "text/plain": [ 3384 | "array([0.63696169, 0.26978671, 0.04097352, 0.01652764, 0.81327024])" 3385 | ] 3386 | }, 3387 | "metadata": {}, 3388 | "output_type": "display_data" 3389 | } 3390 | ], 3391 | "source": [ 3392 | "r = np.random.default_rng(0)\n", 3393 | "r.random(5)" 3394 | ] 3395 | }, 3396 | { 3397 | "cell_type": "markdown", 3398 | "id": 
"3257cbab", 3399 | "metadata": {}, 3400 | "source": [ 3401 | "> use ones(), zeros(), and random() to create a 2-D array by passing the shape as a tuple" 3402 | ] 3403 | }, 3404 | { 3405 | "cell_type": "code", 3406 | "execution_count": null, 3407 | "id": "a061ced1", 3408 | "metadata": {}, 3409 | "outputs": [ 3410 | { 3411 | "data": { 3412 | "text/plain": [ 3413 | "array([[0., 0.],\n", 3414 | " [0., 0.],\n", 3415 | " [0., 0.]])" 3416 | ] 3417 | }, 3418 | "metadata": {}, 3419 | "output_type": "display_data" 3420 | } 3421 | ], 3422 | "source": [ 3423 | "a = np.zeros((3,2))\n", 3424 | "a" 3425 | ] 3426 | }, 3427 | { 3428 | "cell_type": "code", 3429 | "execution_count": null, 3430 | "id": "de137a40", 3431 | "metadata": {}, 3432 | "outputs": [ 3433 | { 3434 | "data": { 3435 | "text/plain": [ 3436 | "array([[1., 1., 1., 1.],\n", 3437 | " [1., 1., 1., 1.],\n", 3438 | " [1., 1., 1., 1.]])" 3439 | ] 3440 | }, 3441 | "metadata": {}, 3442 | "output_type": "display_data" 3443 | } 3444 | ], 3445 | "source": [ 3446 | "b = np.ones((3,4))\n", 3447 | "b" 3448 | ] 3449 | }, 3450 | { 3451 | "cell_type": "code", 3452 | "execution_count": null, 3453 | "id": "40a7fee3", 3454 | "metadata": {}, 3455 | "outputs": [ 3456 | { 3457 | "data": { 3458 | "text/plain": [ 3459 | "array([[0.91275558, 0.60663578, 0.72949656, 0.54362499, 0.93507242],\n", 3460 | " [0.81585355, 0.0027385 , 0.85740428, 0.03358558, 0.72965545],\n", 3461 | " [0.17565562, 0.86317892, 0.54146122, 0.29971189, 0.42268722],\n", 3462 | " [0.02831967, 0.12428328, 0.67062441, 0.64718951, 0.61538511]])" 3463 | ] 3464 | }, 3465 | "metadata": {}, 3466 | "output_type": "display_data" 3467 | } 3468 | ], 3469 | "source": [ 3470 | "r.random((4,5))" 3471 | ] 3472 | }, 3473 | { 3474 | "cell_type": "markdown", 3475 | "id": "63c991ee", 3476 | "metadata": {}, 3477 | "source": [ 3478 | "> unique items in a 1-D array (each value listed once, in sorted order)" 3479 | ] 3480 | }, 3481 | { 3482 | "cell_type": "code", 3483 | "execution_count": null, 3484 | "id": "291fce71", 3485 | "metadata": {}, 3486 | 
"outputs": [ 3487 | { 3488 | "data": { 3489 | "text/plain": [ 3490 | "array([11, 12, 13, 14, 15, 16, 17, 18, 19, 20])" 3491 | ] 3492 | }, 3493 | "metadata": {}, 3494 | "output_type": "display_data" 3495 | } 3496 | ], 3497 | "source": [ 3498 | "a = np.array([11, 11, 12, 13, 14, 15, 16, 17, 12, 13, 11, 14, 18, 19, 20])\n", 3499 | "b = np.unique(a)\n", 3500 | "b" 3501 | ] 3502 | }, 3503 | { 3504 | "cell_type": "markdown", 3505 | "id": "8bb8e957", 3506 | "metadata": {}, 3507 | "source": [ 3508 | "> Indices of unique values in the original array" 3509 | ] 3510 | }, 3511 | { 3512 | "cell_type": "code", 3513 | "execution_count": null, 3514 | "id": "55812132", 3515 | "metadata": {}, 3516 | "outputs": [ 3517 | { 3518 | "data": { 3519 | "text/plain": [ 3520 | "(array([11, 12, 13, 14, 15, 16, 17, 18, 19, 20]),\n", 3521 | " array([ 0, 2, 3, 4, 5, 6, 7, 12, 13, 14], dtype=int64))" 3522 | ] 3523 | }, 3524 | "metadata": {}, 3525 | "output_type": "display_data" 3526 | } 3527 | ], 3528 | "source": [ 3529 | "b = np.unique(a, return_index = True)\n", 3530 | "b" 3531 | ] 3532 | }, 3533 | { 3534 | "cell_type": "markdown", 3535 | "id": "3ee98a43", 3536 | "metadata": {}, 3537 | "source": [ 3538 | "> Frequency count of unique values in a NumPy array" 3539 | ] 3540 | }, 3541 | { 3542 | "cell_type": "code", 3543 | "execution_count": null, 3544 | "id": "a4966c83", 3545 | "metadata": {}, 3546 | "outputs": [ 3547 | { 3548 | "data": { 3549 | "text/plain": [ 3550 | "(array([11, 12, 13, 14, 15, 16, 17, 18, 19, 20]),\n", 3551 | " array([3, 2, 2, 2, 1, 1, 1, 1, 1, 1], dtype=int64))" 3552 | ] 3553 | }, 3554 | "metadata": {}, 3555 | "output_type": "display_data" 3556 | } 3557 | ], 3558 | "source": [ 3559 | "b = np.unique(a, return_counts = True)\n", 3560 | "b" 3561 | ] 3562 | }, 3563 | { 3564 | "cell_type": "markdown", 3565 | "id": "212321f3", 3566 | "metadata": {}, 3567 | "source": [ 3568 | "> unique items in 2D array" 3569 | ] 3570 | }, 3571 | { 3572 | "cell_type": "code", 3573 | "execution_count": 
null, 3574 | "id": "7ce7d707", 3575 | "metadata": {}, 3576 | "outputs": [ 3577 | { 3578 | "data": { 3579 | "text/plain": [ 3580 | "array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])" 3581 | ] 3582 | }, 3583 | "metadata": {}, 3584 | "output_type": "display_data" 3585 | } 3586 | ], 3587 | "source": [ 3588 | "a_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [1, 2, 3, 4]])\n", 3589 | "b_2d = np.unique(a_2d)\n", 3590 | "b_2d" 3591 | ] 3592 | }, 3593 | { 3594 | "cell_type": "code", 3595 | "execution_count": null, 3596 | "id": "94ac209e", 3597 | "metadata": {}, 3598 | "outputs": [ 3599 | { 3600 | "data": { 3601 | "text/plain": [ 3602 | "(array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]),\n", 3603 | " array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], dtype=int64),\n", 3604 | " array([2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1], dtype=int64))" 3605 | ] 3606 | }, 3607 | "metadata": {}, 3608 | "output_type": "display_data" 3609 | } 3610 | ], 3611 | "source": [ 3612 | "b_2d = np.unique(a_2d, return_index = True, return_counts = True)\n", 3613 | "b_2d" 3614 | ] 3615 | }, 3616 | { 3617 | "cell_type": "markdown", 3618 | "id": "657cd665", 3619 | "metadata": {}, 3620 | "source": [ 3621 | "> Transposing and reshaping a matrix" 3622 | ] 3623 | }, 3624 | { 3625 | "cell_type": "code", 3626 | "execution_count": null, 3627 | "id": "93d0303b", 3628 | "metadata": {}, 3629 | "outputs": [ 3630 | { 3631 | "data": { 3632 | "text/plain": [ 3633 | "[[1, 2, 3], [4, 5, 6]]" 3634 | ] 3635 | }, 3636 | "metadata": {}, 3637 | "output_type": "display_data" 3638 | } 3639 | ], 3640 | "source": [ 3641 | "a = ([[1, 2, 3],\n", 3642 | " [4, 5, 6]])\n", 3643 | "a" 3644 | ] 3645 | }, 3646 | { 3647 | "cell_type": "code", 3648 | "execution_count": null, 3649 | "id": "b41d7309", 3650 | "metadata": {}, 3651 | "outputs": [ 3652 | { 3653 | "data": { 3654 | "text/plain": [ 3655 | "(2, 3)" 3656 | ] 3657 | }, 3658 | "metadata": {}, 3659 | "output_type": "display_data" 3660 | } 3661 | ], 3662 | "source": [ 3663 | 
"np.shape(a)" 3664 | ] 3665 | }, 3666 | { 3667 | "cell_type": "code", 3668 | "execution_count": null, 3669 | "id": "5cb790e8", 3670 | "metadata": {}, 3671 | "outputs": [ 3672 | { 3673 | "data": { 3674 | "text/plain": [ 3675 | "array([[0, 1, 2],\n", 3676 | " [3, 4, 5]])" 3677 | ] 3678 | }, 3679 | "metadata": {}, 3680 | "output_type": "display_data" 3681 | } 3682 | ], 3683 | "source": [ 3684 | "a = np.arange(6).reshape((2, 3))\n", 3685 | "a" 3686 | ] 3687 | }, 3688 | { 3689 | "cell_type": "code", 3690 | "execution_count": null, 3691 | "id": "54cb9b2e", 3692 | "metadata": {}, 3693 | "outputs": [ 3694 | { 3695 | "data": { 3696 | "text/plain": [ 3697 | "array([[0, 3],\n", 3698 | " [1, 4],\n", 3699 | " [2, 5]])" 3700 | ] 3701 | }, 3702 | "metadata": {}, 3703 | "output_type": "display_data" 3704 | } 3705 | ], 3706 | "source": [ 3707 | "a.transpose()" 3708 | ] 3709 | }, 3710 | { 3711 | "cell_type": "markdown", 3712 | "id": "eee5576d", 3713 | "metadata": {}, 3714 | "source": [ 3715 | "> reverse a 1-D array" 3716 | ] 3717 | }, 3718 | { 3719 | "cell_type": "code", 3720 | "execution_count": null, 3721 | "id": "84892e2b", 3722 | "metadata": {}, 3723 | "outputs": [ 3724 | { 3725 | "data": { 3726 | "text/plain": [ 3727 | "array([1, 2, 3, 4, 5, 6, 7, 8])" 3728 | ] 3729 | }, 3730 | "metadata": {}, 3731 | "output_type": "display_data" 3732 | } 3733 | ], 3734 | "source": [ 3735 | "a = np.array([1, 2, 3, 4, 5, 6, 7, 8])\n", 3736 | "a" 3737 | ] 3738 | }, 3739 | { 3740 | "cell_type": "code", 3741 | "execution_count": null, 3742 | "id": "4e0f7eef", 3743 | "metadata": {}, 3744 | "outputs": [ 3745 | { 3746 | "data": { 3747 | "text/plain": [ 3748 | "array([8, 7, 6, 5, 4, 3, 2, 1])" 3749 | ] 3750 | }, 3751 | "metadata": {}, 3752 | "output_type": "display_data" 3753 | } 3754 | ], 3755 | "source": [ 3756 | "b = np.flip(a)\n", 3757 | "b" 3758 | ] 3759 | }, 3760 | { 3761 | "cell_type": "markdown", 3762 | "id": "7664b607", 3763 | "metadata": {}, 3764 | "source": [ 3765 | "> reverse a 2-D array" 
3766 | ] 3767 | }, 3768 | { 3769 | "cell_type": "code", 3770 | "execution_count": null, 3771 | "id": "ebbea515", 3772 | "metadata": {}, 3773 | "outputs": [ 3774 | { 3775 | "data": { 3776 | "text/plain": [ 3777 | "array([[ 1, 2, 3, 4],\n", 3778 | " [ 5, 6, 7, 8],\n", 3779 | " [ 9, 10, 11, 12]])" 3780 | ] 3781 | }, 3782 | "metadata": {}, 3783 | "output_type": "display_data" 3784 | } 3785 | ], 3786 | "source": [ 3787 | "a_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])\n", 3788 | "a_2d" 3789 | ] 3790 | }, 3791 | { 3792 | "cell_type": "code", 3793 | "execution_count": null, 3794 | "id": "e4cca575", 3795 | "metadata": {}, 3796 | "outputs": [ 3797 | { 3798 | "data": { 3799 | "text/plain": [ 3800 | "array([[12, 11, 10, 9],\n", 3801 | " [ 8, 7, 6, 5],\n", 3802 | " [ 4, 3, 2, 1]])" 3803 | ] 3804 | }, 3805 | "metadata": {}, 3806 | "output_type": "display_data" 3807 | } 3808 | ], 3809 | "source": [ 3810 | "b_2d = np.flip(a_2d)\n", 3811 | "b_2d" 3812 | ] 3813 | }, 3814 | { 3815 | "cell_type": "markdown", 3816 | "id": "0b788cff", 3817 | "metadata": {}, 3818 | "source": [ 3819 | "> reverse only the rows (flip along axis 0, so the order of the rows is reversed)" 3820 | ] 3821 | }, 3822 | { 3823 | "cell_type": "code", 3824 | "execution_count": null, 3825 | "id": "0ebc3d00", 3826 | "metadata": {}, 3827 | "outputs": [ 3828 | { 3829 | "data": { 3830 | "text/plain": [ 3831 | "array([[ 9, 10, 11, 12],\n", 3832 | " [ 5, 6, 7, 8],\n", 3833 | " [ 1, 2, 3, 4]])" 3834 | ] 3835 | }, 3836 | "metadata": {}, 3837 | "output_type": "display_data" 3838 | } 3839 | ], 3840 | "source": [ 3841 | "b_2d = np.flip(a_2d, axis = 0)\n", 3842 | "b_2d" 3843 | ] 3844 | }, 3845 | { 3846 | "cell_type": "markdown", 3847 | "id": "7be0813b", 3848 | "metadata": {}, 3849 | "source": [ 3850 | "> reverse only the columns (flip along axis 1, so the order of the columns is reversed)" 3851 | ] 3852 | }, 3853 | { 3854 | "cell_type": "code", 3855 | "execution_count": null, 3856 | "id": "452fe903", 3857 | "metadata": {}, 3858 | "outputs": [ 3859 | { 3860 | "data": { 3861 | "text/plain": [ 3862 | "array([[ 4, 3, 2, 1],\n", 3863 | 
" [ 8, 7, 6, 5],\n", 3864 | " [12, 11, 10, 9]])" 3865 | ] 3866 | }, 3867 | "metadata": {}, 3868 | "output_type": "display_data" 3869 | } 3870 | ], 3871 | "source": [ 3872 | "b_2d = np.flip(a_2d, axis = 1)\n", 3873 | "b_2d" 3874 | ] 3875 | }, 3876 | { 3877 | "cell_type": "markdown", 3878 | "id": "dba11c66", 3879 | "metadata": {}, 3880 | "source": [ 3881 | "> reverse the contents of only one column or row" 3882 | ] 3883 | }, 3884 | { 3885 | "cell_type": "code", 3886 | "execution_count": null, 3887 | "id": "999f7de9", 3888 | "metadata": {}, 3889 | "outputs": [ 3890 | { 3891 | "data": { 3892 | "text/plain": [ 3893 | "array([[ 1, 2, 3, 4],\n", 3894 | " [ 5, 6, 7, 8],\n", 3895 | " [ 9, 10, 11, 12]])" 3896 | ] 3897 | }, 3898 | "metadata": {}, 3899 | "output_type": "display_data" 3900 | } 3901 | ], 3902 | "source": [ 3903 | "a_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])\n", 3904 | "a_2d" 3905 | ] 3906 | }, 3907 | { 3908 | "cell_type": "markdown", 3909 | "id": "4c41ec50", 3910 | "metadata": {}, 3911 | "source": [ 3912 | "> reverse the contents of the row at index position 1" 3913 | ] 3914 | }, 3915 | { 3916 | "cell_type": "code", 3917 | "execution_count": null, 3918 | "id": "4c0139df", 3919 | "metadata": {}, 3920 | "outputs": [ 3921 | { 3922 | "data": { 3923 | "text/plain": [ 3924 | "array([[ 1, 2, 3, 4],\n", 3925 | " [ 8, 7, 6, 5],\n", 3926 | " [ 9, 10, 11, 12]])" 3927 | ] 3928 | }, 3929 | "metadata": {}, 3930 | "output_type": "display_data" 3931 | } 3932 | ], 3933 | "source": [ 3934 | "a_2d[1] = np.flip(a_2d[1])\n", 3935 | "a_2d" 3936 | ] 3937 | }, 3938 | { 3939 | "cell_type": "markdown", 3940 | "id": "258f256f", 3941 | "metadata": {}, 3942 | "source": [ 3943 | "> reverse the contents of the column at index position 0" 3944 | ] 3945 | }, 3946 | { 3947 | "cell_type": "code", 3948 | "execution_count": null, 3949 | "id": "872c44f1", 3950 | "metadata": {}, 3951 | "outputs": [ 3952 | { 3953 | "data": { 3954 | "text/plain": [ 3955 | "array([[ 9, 2, 3, 4],\n", 
3956 | " [ 5, 6, 7, 8],\n", 3957 | " [ 1, 10, 11, 12]])" 3958 | ] 3959 | }, 3960 | "metadata": {}, 3961 | "output_type": "display_data" 3962 | } 3963 | ], 3964 | "source": [ 3965 | "a_2d[:,0] = np.flip(a_2d[:,0])\n", 3966 | "a_2d" 3967 | ] 3968 | }, 3969 | { 3970 | "cell_type": "markdown", 3971 | "id": "4610dd70", 3972 | "metadata": {}, 3973 | "source": [ 3974 | "## Section 3.7: Reshaping and Flattening Multidimensional Arrays" 3975 | ] 3976 | }, 3977 | { 3978 | "cell_type": "code", 3979 | "execution_count": null, 3980 | "id": "a8e69391", 3981 | "metadata": {}, 3982 | "outputs": [ 3983 | { 3984 | "data": { 3985 | "text/plain": [ 3986 | "array([[ 1, 2, 3, 4],\n", 3987 | " [ 5, 6, 7, 8],\n", 3988 | " [ 9, 10, 11, 12]])" 3989 | ] 3990 | }, 3991 | "metadata": {}, 3992 | "output_type": "display_data" 3993 | } 3994 | ], 3995 | "source": [ 3996 | "x = np.array([[1 , 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])\n", 3997 | "x" 3998 | ] 3999 | }, 4000 | { 4001 | "cell_type": "markdown", 4002 | "id": "faa16b7b", 4003 | "metadata": {}, 4004 | "source": [ 4005 | ">When you use flatten, changes to your new array won’t change the parent array" 4006 | ] 4007 | }, 4008 | { 4009 | "cell_type": "code", 4010 | "execution_count": null, 4011 | "id": "a1dd74e2", 4012 | "metadata": {}, 4013 | "outputs": [ 4014 | { 4015 | "data": { 4016 | "text/plain": [ 4017 | "array([ 1, 22, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])" 4018 | ] 4019 | }, 4020 | "metadata": {}, 4021 | "output_type": "display_data" 4022 | } 4023 | ], 4024 | "source": [ 4025 | "x1 = x.flatten()\n", 4026 | "x1[1] = 22\n", 4027 | "x1" 4028 | ] 4029 | }, 4030 | { 4031 | "cell_type": "code", 4032 | "execution_count": null, 4033 | "id": "8fafc4d1", 4034 | "metadata": {}, 4035 | "outputs": [ 4036 | { 4037 | "data": { 4038 | "text/plain": [ 4039 | "array([[ 1, 2, 3, 4],\n", 4040 | " [ 5, 6, 7, 8],\n", 4041 | " [ 9, 10, 11, 12]])" 4042 | ] 4043 | }, 4044 | "metadata": {}, 4045 | "output_type": "display_data" 4046 | } 4047 | ], 4048 | "source": 
[ 4049 | "x" 4050 | ] 4051 | }, 4052 | { 4053 | "cell_type": "markdown", 4054 | "id": "76dcb73f", 4055 | "metadata": {}, 4056 | "source": [ 4057 | "> when you use ravel, the changes you make to the new array will affect the parent array." 4058 | ] 4059 | }, 4060 | { 4061 | "cell_type": "code", 4062 | "execution_count": null, 4063 | "id": "9557b503", 4064 | "metadata": {}, 4065 | "outputs": [ 4066 | { 4067 | "data": { 4068 | "text/plain": [ 4069 | "array([ 1, 25, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])" 4070 | ] 4071 | }, 4072 | "metadata": {}, 4073 | "output_type": "display_data" 4074 | } 4075 | ], 4076 | "source": [ 4077 | "x1 = x.ravel()\n", 4078 | "x1[1] = 25\n", 4079 | "x1" 4080 | ] 4081 | }, 4082 | { 4083 | "cell_type": "code", 4084 | "execution_count": null, 4085 | "id": "861a2ad0", 4086 | "metadata": {}, 4087 | "outputs": [ 4088 | { 4089 | "data": { 4090 | "text/plain": [ 4091 | "array([[ 1, 25, 3, 4],\n", 4092 | " [ 5, 6, 7, 8],\n", 4093 | " [ 9, 10, 11, 12]])" 4094 | ] 4095 | }, 4096 | "metadata": {}, 4097 | "output_type": "display_data" 4098 | } 4099 | ], 4100 | "source": [ 4101 | "x" 4102 | ] 4103 | }, 4104 | { 4105 | "cell_type": "markdown", 4106 | "metadata": {}, 4107 | "source": [ 4108 | "-----------------\n", 4109 | "-----------------" 4110 | ] 4111 | } 4112 | ], 4113 | "metadata": { 4114 | "language_info": { 4115 | "name": "python" 4116 | }, 4117 | "orig_nbformat": 4 4118 | }, 4119 | "nbformat": 4, 4120 | "nbformat_minor": 2 4121 | } 4122 | -------------------------------------------------------------------------------- /4- Null Values with Pandas.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 2, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "import pandas as pd\n", 10 | "import numpy as np\n", 11 | "import seaborn as sns" 12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": 69, 17 | "metadata": {}, 18 | 
"outputs": [], 19 | "source": [ 20 | "df = sns.load_dataset('titanic')" 21 | ] 22 | }, 23 | { 24 | "cell_type": "code", 25 | "execution_count": 70, 26 | "metadata": {}, 27 | "outputs": [ 28 | { 29 | "data": { 30 | "text/html": [ 31 | "
" 160 | ], 161 | "text/plain": [ 162 | " survived pclass sex age sibsp parch fare embarked class \\\n", 163 | "0 0 3 male 22.0 1 0 7.2500 S Third \n", 164 | "1 1 1 female 38.0 1 0 71.2833 C First \n", 165 | "2 1 3 female 26.0 0 0 7.9250 S Third \n", 166 | "3 1 1 female 35.0 1 0 53.1000 S First \n", 167 | "4 0 3 male 35.0 0 0 8.0500 S Third \n", 168 | "\n", 169 | " who adult_male deck embark_town alive alone \n", 170 | "0 man True NaN Southampton no False \n", 171 | "1 woman False C Cherbourg yes False \n", 172 | "2 woman False NaN Southampton yes True \n", 173 | "3 woman False C Southampton yes False \n", 174 | "4 man True NaN Southampton no True " 175 | ] 176 | }, 177 | "execution_count": 70, 178 | "metadata": {}, 179 | "output_type": "execute_result" 180 | } 181 | ], 182 | "source": [ 183 | "df.head()" 184 | ] 185 | }, 186 | { 187 | "cell_type": "code", 188 | "execution_count": 71, 189 | "metadata": {}, 190 | "outputs": [ 191 | { 192 | "data": { 193 | "text/plain": [ 194 | "survived 0\n", 195 | "pclass 0\n", 196 | "sex 0\n", 197 | "age 177\n", 198 | "sibsp 0\n", 199 | "parch 0\n", 200 | "fare 0\n", 201 | "embarked 2\n", 202 | "class 0\n", 203 | "who 0\n", 204 | "adult_male 0\n", 205 | "deck 688\n", 206 | "embark_town 2\n", 207 | "alive 0\n", 208 | "alone 0\n", 209 | "dtype: int64" 210 | ] 211 | }, 212 | "execution_count": 71, 213 | "metadata": {}, 214 | "output_type": "execute_result" 215 | } 216 | ], 217 | "source": [ 218 | "df.isnull().sum()" 219 | ] 220 | }, 221 | { 222 | "cell_type": "code", 223 | "execution_count": 72, 224 | "metadata": {}, 225 | "outputs": [ 226 | { 227 | "name": "stdout", 228 | "output_type": "stream", 229 | "text": [ 230 | "\n", 231 | "RangeIndex: 891 entries, 0 to 890\n", 232 | "Data columns (total 15 columns):\n", 233 | " # Column Non-Null Count Dtype \n", 234 | "--- ------ -------------- ----- \n", 235 | " 0 survived 891 non-null int64 \n", 236 | " 1 pclass 891 non-null int64 \n", 237 | " 2 sex 891 non-null object \n", 238 | " 3 age 
714 non-null float64 \n", 239 | " 4 sibsp 891 non-null int64 \n", 240 | " 5 parch 891 non-null int64 \n", 241 | " 6 fare 891 non-null float64 \n", 242 | " 7 embarked 889 non-null object \n", 243 | " 8 class 891 non-null category\n", 244 | " 9 who 891 non-null object \n", 245 | " 10 adult_male 891 non-null bool \n", 246 | " 11 deck 203 non-null category\n", 247 | " 12 embark_town 889 non-null object \n", 248 | " 13 alive 891 non-null object \n", 249 | " 14 alone 891 non-null bool \n", 250 | "dtypes: bool(2), category(2), float64(2), int64(4), object(5)\n", 251 | "memory usage: 80.7+ KB\n" 252 | ] 253 | } 254 | ], 255 | "source": [ 256 | "df.info()" 257 | ] 258 | }, 259 | { 260 | "cell_type": "markdown", 261 | "metadata": {}, 262 | "source": [ 263 | "## Dealing with the missing values using PANDAS\n", 264 | "1. Mean, Mode Method\n", 265 | "2. Forward Fill Method\n", 266 | "3. Backward Fill Method " 267 | ] 268 | }, 269 | { 270 | "cell_type": "markdown", 271 | "metadata": {}, 272 | "source": [ 273 | "### 1. 
Mean/Mode Method" 274 | ] 275 | }, 276 | { 277 | "cell_type": "code", 278 | "execution_count": 73, 279 | "metadata": {}, 280 | "outputs": [ 281 | { 282 | "data": { 283 | "text/plain": [ 284 | "'C'" 285 | ] 286 | }, 287 | "execution_count": 73, 288 | "metadata": {}, 289 | "output_type": "execute_result" 290 | } 291 | ], 292 | "source": [ 293 | "# Series.mode() skips NaN and works on categorical and string columns\n", 294 | "deck_mode = df['deck'].mode()[0]\n", 295 | "deck_mode" 296 | ] 297 | }, 298 | { 299 | "cell_type": "code", 300 | "execution_count": 74, 301 | "metadata": {}, 302 | "outputs": [], 303 | "source": [ 304 | "df['deck'] = df['deck'].fillna(deck_mode)" 305 | ] 306 | }, 307 | { 308 | "cell_type": "code", 309 | "execution_count": 75, 310 | "metadata": {}, 311 | "outputs": [ 312 | { 313 | "name": "stdout", 314 | "output_type": "stream", 315 | "text": [ 316 | "\n", 317 | "RangeIndex: 891 entries, 0 to 890\n", 318 | "Data columns (total 15 columns):\n", 319 | " # Column Non-Null Count Dtype \n", 320 | "--- ------ -------------- ----- \n", 321 | " 0 survived 891 non-null int64 \n", 322 | " 1 pclass 891 non-null int64 \n", 323 | " 2 sex 891 non-null object \n", 324 | " 3 age 714 non-null float64 \n", 325 | " 4 sibsp 891 non-null int64 \n", 326 | " 5 parch 891 non-null int64 \n", 327 | " 6 fare 891 non-null float64 \n", 328 | " 7 embarked 889 non-null object \n", 329 | " 8 class 891 non-null category\n", 330 | " 9 who 891 non-null object \n", 331 | " 10 adult_male 891 non-null bool \n", 332 | " 11 deck 891 non-null category\n", 333 | " 12 embark_town 889 non-null object \n", 334 | " 13 alive 891 non-null object \n", 335 | " 14 alone 891 non-null bool \n", 336 | "dtypes: bool(2), category(2), float64(2), int64(4), object(5)\n", 337 | "memory usage: 80.7+ KB\n" 338 | ] 339 | } 340 | ], 341 | "source": [ 342 | "df.info()" 343 | ] 344 | }, 345 | { 346 | "cell_type": "code", 347 | "execution_count": 77, 348 | "metadata": {}, 349 | "outputs": [ 350 | { 351 | "data": {
483 | "text/plain": [ 484 | " survived pclass sex age sibsp parch fare embarked class \\\n", 485 | "886 0 2 male 27.0 0 0 13.00 S Second \n", 486 | "887 1 1 female 19.0 0 0 30.00 S First \n", 487 | "888 0 3 female NaN 1 2 23.45 S Third \n", 488 | "889 1 1 male 26.0 0 0 30.00 C First \n", 489 | "890 0 3 male 32.0 0 0 7.75 Q Third \n", 490 | "\n", 491 | " who adult_male deck embark_town alive alone \n", 492 | "886 man True C Southampton no True \n", 493 | "887 woman False B Southampton yes True \n", 494 | "888 woman False C Southampton no False \n", 495 | "889 man True C Cherbourg yes True \n", 496 | "890 man True C Queenstown no True " 497 | ] 498 | }, 499 | "execution_count": 77, 500 | "metadata": {}, 501 | "output_type": "execute_result" 502 | } 503 | ], 504 | "source": [ 505 | "df.tail()" 506 | ] 507 | }, 508 | { 509 | "cell_type": "code", 510 | "execution_count": 81, 511 | "metadata": {}, 512 | "outputs": [ 513 | { 514 | "data": { 515 | "text/plain": [ 516 | "'Southampton'" 517 | ] 518 | }, 519 | "execution_count": 81, 520 | "metadata": {}, 521 | "output_type": "execute_result" 522 | } 523 | ], 524 | "source": [ 525 | "embark_town_mode = df['embark_town'].mode()[0]\n", 526 | "embark_town_mode" 527 | ] 528 | }, 529 | { 530 | "cell_type": "code", 531 | "execution_count": 82, 532 | "metadata": {}, 533 | "outputs": [], 534 | "source": [ 535 | "df['embark_town'] = df['embark_town'].fillna(embark_town_mode)" 536 | ] 537 | }, 538 | { 539 | "cell_type": "code", 540 | "execution_count": 83, 541 | "metadata": {}, 542 | "outputs": [ 543 | { 544 | "name": "stdout", 545 | "output_type": "stream", 546 | "text": [ 547 | "\n", 548 | "RangeIndex: 891 entries, 0 to 890\n", 549 | "Data columns (total 15 columns):\n", 550 | " # Column Non-Null Count Dtype \n", 551 | "--- ------ -------------- ----- \n", 552 | " 0 survived 891 non-null int64 \n", 553 | " 1 pclass 891 non-null int64 \n", 554 | " 2 sex 891 non-null object \n", 555 | " 3 age 891 non-null float64 \n", 556 
| " 4 sibsp 891 non-null int64 \n", 557 | " 5 parch 891 non-null int64 \n", 558 | " 6 fare 891 non-null float64 \n", 559 | " 7 embarked 889 non-null object \n", 560 | " 8 class 891 non-null category\n", 561 | " 9 who 891 non-null object \n", 562 | " 10 adult_male 891 non-null bool \n", 563 | " 11 deck 891 non-null category\n", 564 | " 12 embark_town 891 non-null object \n", 565 | " 13 alive 891 non-null object \n", 566 | " 14 alone 891 non-null bool \n", 567 | "dtypes: bool(2), category(2), float64(2), int64(4), object(5)\n", 568 | "memory usage: 80.7+ KB\n" 569 | ] 570 | } 571 | ], 572 | "source": [ 573 | "df.info()" 574 | ] 575 | }, 576 | { 577 | "cell_type": "code", 578 | "execution_count": 78, 579 | "metadata": {}, 580 | "outputs": [ 581 | { 582 | "data": { 583 | "text/plain": [ 584 | "29.69911764705882" 585 | ] 586 | }, 587 | "execution_count": 78, 588 | "metadata": {}, 589 | "output_type": "execute_result" 590 | } 591 | ], 592 | "source": [ 593 | "age_mean = np.mean(df['age'])\n", 594 | "age_mean" 595 | ] 596 | }, 597 | { 598 | "cell_type": "code", 599 | "execution_count": 79, 600 | "metadata": {}, 601 | "outputs": [], 602 | "source": [ 603 | "df['age'] = df['age'].fillna(age_mean)" 604 | ] 605 | }, 606 | { 607 | "cell_type": "code", 608 | "execution_count": 87, 609 | "metadata": {}, 610 | "outputs": [ 611 | { 612 | "data": { 613 | "text/plain": [ 614 | "survived 0\n", 615 | "pclass 0\n", 616 | "sex 0\n", 617 | "age 0\n", 618 | "sibsp 0\n", 619 | "parch 0\n", 620 | "fare 0\n", 621 | "embarked 2\n", 622 | "class 0\n", 623 | "who 0\n", 624 | "adult_male 0\n", 625 | "deck 0\n", 626 | "embark_town 0\n", 627 | "alive 0\n", 628 | "alone 0\n", 629 | "dtype: int64" 630 | ] 631 | }, 632 | "execution_count": 87, 633 | "metadata": {}, 634 | "output_type": "execute_result" 635 | } 636 | ], 637 | "source": [ 638 | "df.isnull().sum()" 639 | ] 640 | }, 641 | { 642 | "cell_type": "code", 643 | "execution_count": 88, 644 | "metadata": {}, 645 | "outputs": [ 646 | { 647 | 
"name": "stdout", 648 | "output_type": "stream", 649 | "text": [ 650 | "\n", 651 | "RangeIndex: 891 entries, 0 to 890\n", 652 | "Data columns (total 15 columns):\n", 653 | " # Column Non-Null Count Dtype \n", 654 | "--- ------ -------------- ----- \n", 655 | " 0 survived 891 non-null int64 \n", 656 | " 1 pclass 891 non-null int64 \n", 657 | " 2 sex 891 non-null object \n", 658 | " 3 age 891 non-null float64 \n", 659 | " 4 sibsp 891 non-null int64 \n", 660 | " 5 parch 891 non-null int64 \n", 661 | " 6 fare 891 non-null float64 \n", 662 | " 7 embarked 889 non-null object \n", 663 | " 8 class 891 non-null category\n", 664 | " 9 who 891 non-null object \n", 665 | " 10 adult_male 891 non-null bool \n", 666 | " 11 deck 891 non-null category\n", 667 | " 12 embark_town 891 non-null object \n", 668 | " 13 alive 891 non-null object \n", 669 | " 14 alone 891 non-null bool \n", 670 | "dtypes: bool(2), category(2), float64(2), int64(4), object(5)\n", 671 | "memory usage: 80.7+ KB\n" 672 | ] 673 | } 674 | ], 675 | "source": [ 676 | "df.info()" 677 | ] 678 | }, 679 | { 680 | "cell_type": "markdown", 681 | "metadata": {}, 682 | "source": [ 683 | "### 2. 
Forward Fill Method\n", 684 | "- The forward fill method replaces each NaN with the previous non-NaN value" 685 | ] 686 | }, 687 | { 688 | "cell_type": "code", 689 | "execution_count": 99, 690 | "metadata": {}, 691 | "outputs": [], 692 | "source": [ 693 | "df1 = sns.load_dataset('titanic')" 694 | ] 695 | }, 696 | { 697 | "cell_type": "code", 698 | "execution_count": 100, 699 | "metadata": {}, 700 | "outputs": [ 701 | { 702 | "data": { 703 | "text/plain": [ 704 | "survived 0\n", 705 | "pclass 0\n", 706 | "sex 0\n", 707 | "age 177\n", 708 | "sibsp 0\n", 709 | "parch 0\n", 710 | "fare 0\n", 711 | "embarked 2\n", 712 | "class 0\n", 713 | "who 0\n", 714 | "adult_male 0\n", 715 | "deck 688\n", 716 | "embark_town 2\n", 717 | "alive 0\n", 718 | "alone 0\n", 719 | "dtype: int64" 720 | ] 721 | }, 722 | "execution_count": 100, 723 | "metadata": {}, 724 | "output_type": "execute_result" 725 | } 726 | ], 727 | "source": [ 728 | "df1.isnull().sum()" 729 | ] 730 | }, 731 | { 732 | "cell_type": "code", 733 | "execution_count": 101, 734 | "metadata": {}, 735 | "outputs": [], 736 | "source": [ 737 | "df1 = df1.ffill().head()" 738 | ] 739 | }, 740 | { 741 | "cell_type": "code", 742 | "execution_count": 102, 743 | "metadata": {}, 744 | "outputs": [ 745 | { 746 | "data": { 747 | "text/plain": [ 748 | "survived 0\n", 749 | "pclass 0\n", 750 | "sex 0\n", 751 | "age 0\n", 752 | "sibsp 0\n", 753 | "parch 0\n", 754 | "fare 0\n", 755 | "embarked 0\n", 756 | "class 0\n", 757 | "who 0\n", 758 | "adult_male 0\n", 759 | "deck 1\n", 760 | "embark_town 0\n", 761 | "alive 0\n", 762 | "alone 0\n", 763 | "dtype: int64" 764 | ] 765 | }, 766 | "execution_count": 102, 767 | "metadata": {}, 768 | "output_type": "execute_result" 769 | } 770 | ], 771 | "source": [ 772 | "df1.isnull().sum()\n", 773 | "\n", 774 | "# deck still shows one NaN: forward fill copies the previous non-NaN value,\n", 775 | "# but this NaN sits in the first row, so there is 
no value before it to copy and it stays empty" 776 | ] 777 | }, 778 | { 779 | "cell_type": "code", 780 | "execution_count": 107, 781 | "metadata": {}, 782 | "outputs": [ 783 | { 784 | "data": { 785 | "text/html": [ 786 | "
HTML table of df1.head() omitted; the same five rows appear in the text/plain output below.
" 915 | ], 916 | "text/plain": [ 917 | " survived pclass sex age sibsp parch fare embarked class \\\n", 918 | "0 0 3 male 22.0 1 0 7.2500 S Third \n", 919 | "1 1 1 female 38.0 1 0 71.2833 C First \n", 920 | "2 1 3 female 26.0 0 0 7.9250 S Third \n", 921 | "3 1 1 female 35.0 1 0 53.1000 S First \n", 922 | "4 0 3 male 35.0 0 0 8.0500 S Third \n", 923 | "\n", 924 | " who adult_male deck embark_town alive alone \n", 925 | "0 man True NaN Southampton no False \n", 926 | "1 woman False C Cherbourg yes False \n", 927 | "2 woman False C Southampton yes True \n", 928 | "3 woman False C Southampton yes False \n", 929 | "4 man True C Southampton no True " 930 | ] 931 | }, 932 | "execution_count": 107, 933 | "metadata": {}, 934 | "output_type": "execute_result" 935 | } 936 | ], 937 | "source": [ 938 | "df1.head()" 939 | ] 940 | }, 941 | { 942 | "cell_type": "markdown", 943 | "metadata": {}, 944 | "source": [ 945 | "### 3. Backward Fill Method\n", 946 | "- The backward fill method replaces each NaN with the next non-NaN value" 947 | ] 948 | }, 949 | { 950 | "cell_type": "code", 951 | "execution_count": 111, 952 | "metadata": {}, 953 | "outputs": [], 954 | "source": [ 955 | "df2 = sns.load_dataset('titanic')" 956 | ] 957 | }, 958 | { 959 | "cell_type": "code", 960 | "execution_count": 112, 961 | "metadata": {}, 962 | "outputs": [ 963 | { 964 | "data": { 965 | "text/plain": [ 966 | "survived 0\n", 967 | "pclass 0\n", 968 | "sex 0\n", 969 | "age 177\n", 970 | "sibsp 0\n", 971 | "parch 0\n", 972 | "fare 0\n", 973 | "embarked 2\n", 974 | "class 0\n", 975 | "who 0\n", 976 | "adult_male 0\n", 977 | "deck 688\n", 978 | "embark_town 2\n", 979 | "alive 0\n", 980 | "alone 0\n", 981 | "dtype: int64" 982 | ] 983 | }, 984 | "execution_count": 112, 985 | "metadata": {}, 986 | "output_type": "execute_result" 987 | } 988 | ], 989 | "source": [ 990 | "df2.isnull().sum()" 991 | ] 992 | }, 993 | { 994 | "cell_type": "code", 995 | "execution_count": 113, 996 | "metadata": {}, 997 | "outputs": [], 
998 | "source": [ 999 | "df2 = df2.bfill().head()" 1000 | ] 1001 | }, 1002 | { 1003 | "cell_type": "code", 1004 | "execution_count": 114, 1005 | "metadata": {}, 1006 | "outputs": [ 1007 | { 1008 | "data": { 1009 | "text/plain": [ 1010 | "survived 0\n", 1011 | "pclass 0\n", 1012 | "sex 0\n", 1013 | "age 0\n", 1014 | "sibsp 0\n", 1015 | "parch 0\n", 1016 | "fare 0\n", 1017 | "embarked 0\n", 1018 | "class 0\n", 1019 | "who 0\n", 1020 | "adult_male 0\n", 1021 | "deck 0\n", 1022 | "embark_town 0\n", 1023 | "alive 0\n", 1024 | "alone 0\n", 1025 | "dtype: int64" 1026 | ] 1027 | }, 1028 | "execution_count": 114, 1029 | "metadata": {}, 1030 | "output_type": "execute_result" 1031 | } 1032 | ], 1033 | "source": [ 1034 | "df2.isnull().sum()" 1035 | ] 1036 | }, 1037 | { 1038 | "cell_type": "code", 1039 | "execution_count": 115, 1040 | "metadata": {}, 1041 | "outputs": [ 1042 | { 1043 | "data": { 1044 | "text/html": [ 1045 | "
HTML table of the five df2 rows omitted; the same rows appear in the text/plain output below.
" 1174 | ], 1175 | "text/plain": [ 1176 | " survived pclass sex age sibsp parch fare embarked class \\\n", 1177 | "0 0 3 male 22.0 1 0 7.2500 S Third \n", 1178 | "1 1 1 female 38.0 1 0 71.2833 C First \n", 1179 | "2 1 3 female 26.0 0 0 7.9250 S Third \n", 1180 | "3 1 1 female 35.0 1 0 53.1000 S First \n", 1181 | "4 0 3 male 35.0 0 0 8.0500 S Third \n", 1182 | "\n", 1183 | " who adult_male deck embark_town alive alone \n", 1184 | "0 man True C Southampton no False \n", 1185 | "1 woman False C Cherbourg yes False \n", 1186 | "2 woman False C Southampton yes True \n", 1187 | "3 woman False C Southampton yes False \n", 1188 | "4 man True E Southampton no True " 1189 | ] 1190 | }, 1191 | "execution_count": 115, 1192 | "metadata": {}, 1193 | "output_type": "execute_result" 1194 | } 1195 | ], 1196 | "source": [ 1197 | "df2.tail()" 1198 | ] 1199 | } 1200 | ], 1201 | "metadata": { 1202 | "interpreter": { 1203 | "hash": "acd897d2d4d03b6ecdaa2334811ed2d11d458dee0f89e3986a2ce3f86f932de0" 1204 | }, 1205 | "kernelspec": { 1206 | "display_name": "Python 3.10.0 64-bit", 1207 | "language": "python", 1208 | "name": "python3" 1209 | }, 1210 | "language_info": { 1211 | "codemirror_mode": { 1212 | "name": "ipython", 1213 | "version": 3 1214 | }, 1215 | "file_extension": ".py", 1216 | "mimetype": "text/x-python", 1217 | "name": "python", 1218 | "nbconvert_exporter": "python", 1219 | "pygments_lexer": "ipython3", 1220 | "version": "3.10.0" 1221 | }, 1222 | "orig_nbformat": 4 1223 | }, 1224 | "nbformat": 4, 1225 | "nbformat_minor": 2 1226 | } 1227 | -------------------------------------------------------------------------------- /Readme.md: -------------------------------------------------------------------------------- 1 | ![](banner1.png) 2 | 3 | ## 🔧 Technologies & Tools 4 | ![](https://img.shields.io/badge/GitHub-100000?style=for-the-badge&logo=github&logoColor=white) 5 | ![](https://img.shields.io/badge/Visual_Studio-5C2D91?style=for-the-badge&logo=visual%20studio&logoColor=white) 6 
| ![](https://img.shields.io/badge/Python-FFD43B?style=for-the-badge&logo=python&logoColor=blue) 7 | ![](https://img.shields.io/badge/Numpy-777BB4?style=for-the-badge&logo=numpy&logoColor=white) 8 | ![](https://img.shields.io/badge/Pandas-2C2D72?style=for-the-badge&logo=pandas&logoColor=white) 9 | ![](https://img.shields.io/badge/Plotly-239120?style=for-the-badge&logo=plotly&logoColor=white) 10 | ![](https://img.shields.io/badge/R-276DC3?style=for-the-badge&logo=r&logoColor=white) 11 | ![](https://img.shields.io/badge/scikit_learn-F7931E?style=for-the-badge&logo=scikit-learn&logoColor=white) 12 | ![](https://img.shields.io/badge/SciPy-654FF0?style=for-the-badge&logo=SciPy&logoColor=white) 13 | ![](https://img.shields.io/badge/Streamlit-FF4B4B?style=for-the-badge&logo=Streamlit&logoColor=white) 14 | ![](https://img.shields.io/badge/Windows-0078D6?style=for-the-badge&logo=windows&logoColor=white) 15 | ![](https://img.shields.io/badge/Linux-FCC624?style=for-the-badge&logo=linux&logoColor=black) 16 | ![](https://img.shields.io/badge/Kaggle-20BEFF?style=for-the-badge&logo=Kaggle&logoColor=white) 17 | ![](https://img.shields.io/badge/windows%20terminal-4D4D4D?style=for-the-badge&logo=windows%20terminal&logoColor=white) 18 | ![](https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fgithub.com%2F{ammarmohsin}1212%2Fhit-counter) 19 | 20 | # Data Science using Python 21 | This repository starts with NumPy, an integral library for Data Science, and explores the functions and uses of 1-dimensional, 2-dimensional and 3-dimensional NumPy arrays. It then covers Pandas, one of the most important Data Science libraries, and the functions it offers for exploratory data analysis, data cleaning and data wrangling. 
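The data cleaning mentioned above includes filling missing values, which the null-values notebook does with forward and backward fill. The rule behind both can be sketched in plain Python. The helper names `forward_fill` and `backward_fill` and the sample `ages` list are hypothetical, made up for this illustration, and this is not how pandas implements `ffill` or `bfill`; it only mirrors their visible behaviour, including why a gap in the first position survives a forward fill.

```python
import math


def forward_fill(values):
    """Replace each NaN with the most recent non-NaN value seen so far."""
    filled = []
    last_seen = math.nan  # nothing seen yet, so a leading NaN cannot be filled
    for value in values:
        if not math.isnan(value):
            last_seen = value
        filled.append(last_seen)
    return filled


def backward_fill(values):
    """Replace each NaN with the next non-NaN value.

    Implemented by forward-filling the reversed list and reversing back.
    """
    return forward_fill(values[::-1])[::-1]


# Hypothetical sample column with gaps at the start, middle and end
ages = [math.nan, 22.0, math.nan, 35.0, math.nan]

ffilled = forward_fill(ages)    # the first entry stays NaN
bfilled = backward_fill(ages)   # the last entry stays NaN

print(ffilled)
print(bfilled)
```

Forward fill leaves a leading gap untouched and backward fill leaves a trailing gap untouched, so a column with gaps at both ends needs both passes, or a fallback such as the column mean.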
22 | 23 | In the Data Visualization section, we cover important graphical views of our data, from basic graphs (Line Plot, Bar Plot, Box Plot, Histogram) to more advanced and attractive ones (Violin, Strip, Scatter Mapbox, Scatter Polar, Bar Polar and Treemap) built with Plotly. We use Seaborn, Matplotlib and Plotly, and explore how different components affect these visualizations. 24 | 25 | In the Data Preprocessing section, we learn how to handle raw data. Once raw data is collected, it needs Exploratory Data Analysis (EDA), i.e., an initial investigation to discover patterns, find empty values, spot anomalies, test hypotheses and check assumptions with the help of summary statistics and graphical representations. After EDA, we explore Data Wrangling, which involves processing data in various formats, combining it with other data sets and turning it into valuable insights. It further includes data aggregation, data visualization, and training statistical models for prediction. Data wrangling is one of the most important steps of the data science process: the quality of data analysis is only as good as the quality of the data itself, so it is very important to maintain data quality. 26 | 27 | 1. [**NumPy in Python for Data Science**](https://github.com/ammarmohsin/Data-Science-using-Python/blob/main/1-Numpy.ipynb) 28 | 1. Arrays & Their Types 29 | 1. 1-D Arrays 30 | 2. 2-D Arrays 31 | 3. 3-D Arrays 32 | 2. Array Functions 33 | 1. Sorting of Array 34 | 2. Type of Array 35 | 3. Length of Array 36 | 4. Array Concatenation 37 | 5. Dimension of an Array 38 | 6. Elements of Array 39 | 7. Shape of an Array 40 | 8. Reshaping an Array 41 | 9. Conversion of Array 42 | 10. Basic Arithmetic Operations on an Array 43 | 11. Indexing and Slicing 44 | 12. Array Stacking & Splitting 45 | 3. Basic Array Operations 46 | 1. 
Basic Operations of 2D Array (Addition, Subtraction, Multiplication & Division) 47 | 2. Sum of Elements in Array 48 | 4. Basic Statistical Operations in Arrays 49 | 5. Indexing 2-D Array 50 | 6. Random, Reverse, Reshape & Transpose of an Array 51 | 7. Reshaping and Flattening Multidimensional Arrays 52 | 53 | 54 | 2. [**Pandas in Python for Data Science**](https://github.com/ammarmohsin/Data-Science-using-Python/blob/main/2-Pandas.ipynb) 55 | 1. Basic Functions of Pandas in Python 56 | 2. Pandas Library Functions on FAO Data Set 57 | 1. Set Index 58 | 2. Data Types 59 | 3. Head & Tail View of Data Frame 60 | 4. Convert Data Frame to NumPy Array 61 | 5. Summary of Data Frame 62 | 6. Sorting Data Frame 63 | 7. Selecting Data by Label (loc function) 64 | 8. Selecting Data by Position (iloc function) 65 | 9. Group by in Pandas 66 | 10. [Dealing with Null Values in PANDAS](https://github.com/ammarmohsin/Data-Science-using-Python/blob/049b145c46e4f226825481f654211d36e66f1d42/4-%20Null%20Values%20with%20Pandas.ipynb) 67 | 68 | 3. [**Data Manipulation and Analysis using PANDAS**](https://github.com/ammarmohsin/Data-Science-using-Python/blob/adc86304e846ffd5375e1bfec43f199b2d984798/3-Data%20Manipulation%20and%20Analysis.ipynb) 69 | 70 | 4. [**Data Visualization**](https://github.com/ammarmohsin/Data-Science-using-Python/blob/7620286a7104d95b07422c3290ba904594a343bd/5-Data%20Visualization.ipynb) 71 | 1. Line Plots 72 | 2. Bar Plots 73 | 3. Box Plots 74 | 4. Box Plots Customization 75 | 5. Exercise 76 | 6. Plotly 77 | 78 | 5. **Exploratory Data Analysis (EDA)** 79 | 1. Cleaning and Filtering Data 80 | 2. Relationship (Correlation) 81 | 82 | 6. **Data Wrangling in Python** 83 | 1. Dealing with missing values 84 | 1. Continuous Variables 85 | 2. Categorical Variables 86 | 2. Data Formatting 87 | 1. Data Standardization 88 | 3. Data Normalization 89 | 1. Simple Feature Scaling 90 | 2. Min-Max Method 91 | 3. Z-score (standard score) 92 | 4. 
Log transformation 93 | 4. Binning 94 | 5. Convert Categories into Dummies 95 | 96 | 7. **Case Study in FAO Dataset** 97 | 1. Data Handling 98 | 2. Data Filtration 99 | 3. Data Visualization 100 | 101 | 102 | 103 | ## 📈 Stats 104 | 105 |


117 | 118 | 119 | Please follow me on: 120 | 121 | ![](https://img.shields.io/badge/website-000000?style=for-the-badge&logo=About.me&logoColor=white) http://www.ammarmohsin.com/ 122 | 123 | ![](https://img.shields.io/badge/GitHub-100000?style=for-the-badge&logo=github&logoColor=white) https://github.com/ammarmohsin 124 | 125 | ![](https://img.shields.io/badge/LinkedIn-0077B5?style=for-the-badge&logo=linkedin&logoColor=white) https://www.linkedin.com/in/ammar777/ 126 | 127 | ![](https://img.shields.io/badge/Twitter-1DA1F2?style=for-the-badge&logo=twitter&logoColor=white) https://twitter.com/ammarmohsin7 128 | 129 | ![](https://img.shields.io/badge/Gmail-D14836?style=for-the-badge&logo=gmail&logoColor=white) ammarmohsin104@gmail.com 130 | -------------------------------------------------------------------------------- /banner1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ammarmohsin/Data-Science-using-Python/e04d9eaf4aa49f3b6bdad84be2c9454d4380c1fe/banner1.png --------------------------------------------------------------------------------