├── .github └── workflows │ └── jekyll-gh-pages.yml ├── A Tutorial on Data Representation - Integers, Floating-point numbers, and characters.pdf ├── Advanced Numerical Linear Algebra ├── readme.md ├── week_1_introduction_to_matrices.ipynb ├── week_2_singular_value_decomposition.ipynb ├── week_3_topic_modeling_with_NMF_and_SVD.ipynb ├── week_4_background_removal_with_robust_PCA.ipynb ├── week_5_compressed_sensing_CT_scans_robust_regression.ipynb ├── week_6_health_outcomes_with_linear_regression.ipynb ├── week_7_how_to_implement_linear_regression.ipynb └── week_8_page_rank_with_eigen_decomposition.ipynb ├── Basic Numerical Linear Algebra ├── 1-Scalars,_Vectors,_Matrices_and_Tensors.ipynb ├── 10-The_Trace_Operator.ipynb ├── 11-The_Determinant.ipynb ├── 2-Multiplying_Matrices_and_Vectors.ipynb ├── 3-Identity_and_Inverse_Matrices.ipynb ├── 4-Linear_Dependence_and_Span.ipynb ├── 5-Norms.ipynb ├── 6-Special_Kind_of_Matrices_and_Vectors.ipynb ├── 7-Eigendecomposition.ipynb ├── 8-Singular_Value_Decomposition.ipynb └── 9-The_Moore_Penrose_Pseudoinverse.ipynb ├── GIFs └── RealisticPlayfulCygnet.gif ├── LICENSE └── Readme.md /.github/workflows/jekyll-gh-pages.yml: -------------------------------------------------------------------------------- 1 | # Sample workflow for building and deploying a Jekyll site to GitHub Pages 2 | name: Deploy Jekyll with GitHub Pages dependencies preinstalled 3 | 4 | on: 5 | # Runs on pushes targeting the default branch 6 | push: 7 | branches: ["main"] 8 | 9 | # Allows you to run this workflow manually from the Actions tab 10 | workflow_dispatch: 11 | 12 | # Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages 13 | permissions: 14 | contents: read 15 | pages: write 16 | id-token: write 17 | 18 | # Allow one concurrent deployment 19 | concurrency: 20 | group: "pages" 21 | cancel-in-progress: true 22 | 23 | jobs: 24 | # Build job 25 | build: 26 | runs-on: ubuntu-latest 27 | steps: 28 | - name: Checkout 29 | uses: 
actions/checkout@v3 30 | - name: Setup Pages 31 | uses: actions/configure-pages@v3 32 | - name: Build with Jekyll 33 | uses: actions/jekyll-build-pages@v1 34 | with: 35 | source: ./ 36 | destination: ./_site 37 | - name: Upload artifact 38 | uses: actions/upload-pages-artifact@v1 39 | 40 | # Deployment job 41 | deploy: 42 | environment: 43 | name: github-pages 44 | url: ${{ steps.deployment.outputs.page_url }} 45 | runs-on: ubuntu-latest 46 | needs: build 47 | steps: 48 | - name: Deploy to GitHub Pages 49 | id: deployment 50 | uses: actions/deploy-pages@v1 51 | -------------------------------------------------------------------------------- /A Tutorial on Data Representation - Integers, Floating-point numbers, and characters.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Numerical-Linear-Algebra/7c70ad485922a4dc4925c57b221d0b21fecbe035/A Tutorial on Data Representation - Integers, Floating-point numbers, and characters.pdf -------------------------------------------------------------------------------- /Advanced Numerical Linear Algebra/readme.md: -------------------------------------------------------------------------------- 1 | # Computational Linear Algebra for Coders 2 | 3 | This course is focused on the question: How do we do matrix computations with acceptable speed and accuracy? 4 | 5 | The course is taught in Python with Jupyter notebooks, using libraries like scikit-learn and NumPy, as well as Numba (a library that compiles Python to machine code for faster performance) and PyTorch (an alternative to NumPy for the GPU). 6 | 7 | 8 | ## Table of Contents 9 | 10 | ### 1. [Why are we here?](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/week_1_introduction_to_matrices.ipynb) 11 | 12 | We start with a high-level overview of some foundational concepts in numerical linear algebra. 13 |
  • Matrix and tensor products 14 |
  • Matrix Decompositions 15 |
  • Accuracy 16 |
  • Memory Use 17 |
  • Speed 18 |
  • Parallelization & vectorization 19 | 20 | ### 2. [Topic Modeling with NMF and SVD](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/week_2_singular_value_decomposition.ipynb) 21 | 22 | We will use the newsgroups datasets to try to identify the topics of different posts. We use a term-document matrix that represents the frequency of the vocabulary. 23 |
  • Term Frequency - Inverse Document Frequency (TF-IDF) 24 |
  • Singular Value Decomposition (SVD) 25 |
  • Non-Negative Matrix Factorization (NMF) 26 |
  • Stochastic Gradient Descent (SGD) 27 |
  • Intro to PyTorch 28 |
  • Truncated SVD 29 | 30 | ### 3. [Background Removal with Robust PCA](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/week_4_background_removal_with_robust_PCA.ipynb) 31 | 32 | Another application of SVD is to identify the people and remove the background of a surveillance video. We will cover robust PCA, which uses randomized SVD, which in turn uses the LU factorization. 33 | 34 |
  • Load and View Video Data 35 |
  • SVD 36 |
  • Principal Component Analysis 37 |
  • L1 Norm induces Sparsity 38 |
  • Robust PCA 39 |
  • LU Factorization 40 |
  • Stability of LU 41 |
  • LU factorization with Pivoting 42 |
  • History of Gaussian Elimination 43 |
  • Block Matrix Multiplication 44 | 45 | 46 | ### 4. [Compressed Sensing with Robust Regression](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/week_5_compressed_sensing_CT_scans_robust_regression.ipynb) 47 | 48 | Compressed sensing is critical to allowing CT scans with lower radiation-- the image can be reconstructed with less data. Here we will learn the technique and apply it to CT images. 49 | 50 |
  • Broadcasting 51 |
  • Sparse Matrices 52 |
  • CT Scans and Compressed Sensing 53 |
  • L1 and L2 regression 54 | 55 | 56 | ### 5. [Predicting Health Outcomes with Linear Regressions](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/week_6_health_outcomes_with_linear_regression.ipynb) 57 | 58 |
  • Linear Regression in sklearn 59 |
  • Polynomial Features 60 |
  • Speeding up with Numba 61 |
  • Regularization and Noise 62 | 63 | 64 | ### 6. [How to Implement Linear Regression?](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/week_7_how_to_implement_linear_regression.ipynb) 65 | 66 |
  • How did scikit-learn do it? 67 |
  • Naive Solution 68 |
  • Normal equation and Cholesky factorization 69 |
  • QR Factorization 70 |
  • SVD 71 |
  • Timing Comparison 72 |
  • Conditioning and Stability 73 |
  • Full vs Reduced Factorization 74 |
  • Matrix Inversion is Unstable 75 | 76 | 77 | 78 | ### 7. [Page Rank with Eigen Decomposition](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/week_8_page_rank_with_eigen_decomposition.ipynb) 79 | 80 | We have applied SVD to topic modeling, background removal, and linear regression. SVD is intimately connected to the eigen decomposition, so we will now learn how to calculate eigenvalues for a large matrix. We will use DBpedia data, a large dataset of Wikipedia links, because here the principal eigenvector gives the relative importance of different Wikipedia pages (this is the basic idea of Google's PageRank algorithm). We will look at 3 different methods for calculating eigenvectors, of increasing complexity (and increasing usefulness!). 81 | 82 |
  • SVD 83 |
  • DBpedia Dataset 84 |
  • Power Method 85 |
  • QR algorithm 86 |
  • Two-Phase Approach to find eigenvalues 87 |
  • Arnoldi Iteration 88 | 89 | 90 | 91 | 92 | 93 | 94 | 95 | 96 | -------------------------------------------------------------------------------- /Advanced Numerical Linear Algebra/week_6_health_outcomes_with_linear_regression.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "## Health Outcomes with Linear Regression" 8 | ] 9 | }, 10 | { 11 | "cell_type": "code", 12 | "execution_count": 1, 13 | "metadata": {}, 14 | "outputs": [], 15 | "source": [ 16 | "# import the basic libraries\n", 17 | "from sklearn import datasets, linear_model, metrics\n", 18 | "from sklearn.model_selection import train_test_split\n", 19 | "from sklearn.preprocessing import PolynomialFeatures\n", 20 | "import math, scipy, numpy as np \n", 21 | "from scipy import linalg" 22 | ] 23 | }, 24 | { 25 | "cell_type": "markdown", 26 | "metadata": {}, 27 | "source": [ 28 | "#### Diabetes Dataset\n", 29 | " We will use a dataset from patients with diabetes. The data consists of 442 samples and 10 variables (so the data matrix is tall and skinny). The dependent variable is a quantitative measure of disease progression one year after baseline.\n", 30 | "\n", 31 | "\n", 32 | "\n", 33 | "\n", 34 | " This is a classic dataset, famously used by Efron, Hastie, Johnstone, and Tibshirani in their Least Angle Regression paper."
35 | ] 36 | }, 37 | { 38 | "cell_type": "code", 39 | "execution_count": 2, 40 | "metadata": {}, 41 | "outputs": [], 42 | "source": [ 43 | "data = datasets.load_diabetes()" 44 | ] 45 | }, 46 | { 47 | "cell_type": "code", 48 | "execution_count": 3, 49 | "metadata": {}, 50 | "outputs": [], 51 | "source": [ 52 | "feature_names = ['age','sex','bmi','bp','s1','s2','s3','s4','s5','s6']" 53 | ] 54 | }, 55 | { 56 | "cell_type": "code", 57 | "execution_count": 4, 58 | "metadata": {}, 59 | "outputs": [], 60 | "source": [ 61 | "trn,test, y_trn, y_test = train_test_split(data.data, data.target, test_size=0.2)" 62 | ] 63 | }, 64 | { 65 | "cell_type": "code", 66 | "execution_count": 5, 67 | "metadata": {}, 68 | "outputs": [ 69 | { 70 | "data": { 71 | "text/plain": [ 72 | "((353, 10), (89, 10))" 73 | ] 74 | }, 75 | "execution_count": 5, 76 | "metadata": {}, 77 | "output_type": "execute_result" 78 | } 79 | ], 80 | "source": [ 81 | "trn.shape, test.shape" 82 | ] 83 | }, 84 | { 85 | "cell_type": "markdown", 86 | "metadata": {}, 87 | "source": [ 88 | "### Linear Regression in scikit-learn\n", 89 | "\n", 90 | "Consider a system $X\\beta = y$, where $X$ has more rows than columns. This occurs when you have more data samples than variables. We want to find the $\\beta$ that minimizes\n", 91 | "\n", 92 | "$$ || X\\beta - y ||_2 $$" 93 | ] 94 | }, 95 | { 96 | "cell_type": "markdown", 97 | "metadata": {}, 98 | "source": [ 99 | " We start with the simpler scikit-learn implementation:" 100 | ] 101 | }, 102 | { 103 | "cell_type": "code", 104 | "execution_count": 7, 105 | "metadata": {}, 106 | "outputs": [ 107 | { 108 | "name": "stdout", 109 | "output_type": "stream", 110 | "text": [ 111 | "The slowest run took 37.80 times longer than the fastest. This could mean that an intermediate result is being cached.\n", 112 | "12.5 ms ± 20.5 ms per loop (mean ± std. dev. 
of 7 runs, 1 loop each)\n" 113 | ] 114 | } 115 | ], 116 | "source": [ 117 | "regr = linear_model.LinearRegression()\n", 118 | "%timeit regr.fit(trn, y_trn)" 119 | ] 120 | }, 121 | { 122 | "cell_type": "code", 123 | "execution_count": 8, 124 | "metadata": {}, 125 | "outputs": [], 126 | "source": [ 127 | "pred = regr.predict(test)" 128 | ] 129 | }, 130 | { 131 | "cell_type": "markdown", 132 | "metadata": {}, 133 | "source": [ 134 | " We need some metrics for how good our predictions are. We will look at the root mean squared error (L2) and the mean absolute error (L1)." 135 | ] 136 | }, 137 | { 138 | "cell_type": "code", 139 | "execution_count": 9, 140 | "metadata": {}, 141 | "outputs": [], 142 | "source": [ 143 | "def regr_metric(act, pred):\n", 144 | " return (math.sqrt(metrics.mean_squared_error(act, pred)), metrics.mean_absolute_error(act,pred))" 145 | ] 146 | }, 147 | { 148 | "cell_type": "code", 149 | "execution_count": 10, 150 | "metadata": {}, 151 | "outputs": [ 152 | { 153 | "data": { 154 | "text/plain": [ 155 | "(49.876877547419426, 39.34822094080999)" 156 | ] 157 | }, 158 | "execution_count": 10, 159 | "metadata": {}, 160 | "output_type": "execute_result" 161 | } 162 | ], 163 | "source": [ 164 | "regr_metric(y_test, regr.predict(test))" 165 | ] 166 | }, 167 | { 168 | "cell_type": "markdown", 169 | "metadata": {}, 170 | "source": [ 171 | "## Polynomial Features\n", 172 | "\n", 173 | "Linear regression finds the best coefficients $\\beta_i$ for:\n", 174 | "$$ x_0 \\beta_0 + x_1 \\beta_1 + x_2\\beta_2 =y $$\n", 175 | "\n", 176 | "Adding polynomial features is still a linear regression problem, just with more terms:\n", 177 | "\n", 178 | "$$x_0 \\beta_0 + x_1 \\beta_1 + x_2 \\beta_2 + x_0^2 \\beta_3 + x_{0} x_1 \\beta_4 + \\dots 
= y $$\n", 179 | "\n", 180 | " We need to use our original data X to calculate the additional polynomial features:" 181 | ] 182 | }, 183 | { 184 | "cell_type": "code", 185 | "execution_count": 11, 186 | "metadata": {}, 187 | "outputs": [ 188 | { 189 | "data": { 190 | "text/plain": [ 191 | "(353, 10)" 192 | ] 193 | }, 194 | "execution_count": 11, 195 | "metadata": {}, 196 | "output_type": "execute_result" 197 | } 198 | ], 199 | "source": [ 200 | "trn.shape" 201 | ] 202 | }, 203 | { 204 | "cell_type": "markdown", 205 | "metadata": {}, 206 | "source": [ 207 | " The performance of the model was poor. To improve it, we can add features. Currently our model is linear in each variable, but we can add polynomial features to change this." 208 | ] 209 | }, 210 | { 211 | "cell_type": "code", 212 | "execution_count": 12, 213 | "metadata": {}, 214 | "outputs": [], 215 | "source": [ 216 | "poly = PolynomialFeatures(include_bias= False)" 217 | ] 218 | }, 219 | { 220 | "cell_type": "code", 221 | "execution_count": 15, 222 | "metadata": {}, 223 | "outputs": [], 224 | "source": [ 225 | "trn_feat = poly.fit_transform(trn)" 226 | ] 227 | }, 228 | { 229 | "cell_type": "code", 230 | "execution_count": 17, 231 | "metadata": {}, 232 | "outputs": [ 233 | { 234 | "data": { 235 | "text/plain": [ 236 | "'age, sex, bmi, bp, s1, s2, s3, s4, s5, s6, age^2, age sex, age bmi, age bp, age s1, age s2, age s3, age s4, age s5, age s6, sex^2, sex bmi, sex bp, sex s1, sex s2, sex s3, sex s4, sex s5, sex s6, bmi^2, bmi bp, bmi s1, bmi s2, bmi s3, bmi s4, bmi s5, bmi s6, bp^2, bp s1, bp s2, bp s3, bp s4, bp s5, bp s6, s1^2, s1 s2, s1 s3, s1 s4, s1 s5, s1 s6, s2^2, s2 s3, s2 s4, s2 s5, s2 s6, s3^2, s3 s4, s3 s5, s3 s6, s4^2, s4 s5, s4 s6, s5^2, s5 s6, s6^2'" 237 | ] 238 | }, 239 | "execution_count": 17, 240 | "metadata": {}, 241 | "output_type": "execute_result" 242 | } 243 | ], 244 | "source": [ 245 | "', '.join(poly.get_feature_names_out(feature_names))" 246 | ] 247 | }, 248 | { 249 | 
"cell_type": "code", 250 | "execution_count": 18, 251 | "metadata": {}, 252 | "outputs": [ 253 | { 254 | "data": { 255 | "text/plain": [ 256 | "(353, 65)" 257 | ] 258 | }, 259 | "execution_count": 18, 260 | "metadata": {}, 261 | "output_type": "execute_result" 262 | } 263 | ], 264 | "source": [ 265 | "trn_feat.shape" 266 | ] 267 | }, 268 | { 269 | "cell_type": "code", 270 | "execution_count": 19, 271 | "metadata": {}, 272 | "outputs": [ 273 | { 274 | "data": { 275 | "text/plain": [ 276 | "LinearRegression()" 277 | ] 278 | }, 279 | "execution_count": 19, 280 | "metadata": {}, 281 | "output_type": "execute_result" 282 | } 283 | ], 284 | "source": [ 285 | "## now do the fitting\n", 286 | "\n", 287 | "regr.fit(trn_feat, y_trn)" 288 | ] 289 | }, 290 | { 291 | "cell_type": "code", 292 | "execution_count": 20, 293 | "metadata": {}, 294 | "outputs": [ 295 | { 296 | "data": { 297 | "text/plain": [ 298 | "(56.08366396156073, 42.37826260086632)" 299 | ] 300 | }, 301 | "execution_count": 20, 302 | "metadata": {}, 303 | "output_type": "execute_result" 304 | } 305 | ], 306 | "source": [ 307 | "regr_metric(y_test, regr.predict(poly.fit_transform(test)))" 308 | ] 309 | }, 310 | { 311 | "cell_type": "markdown", 312 | "metadata": {}, 313 | "source": [ 314 | " Since time is squared in features and linear in points, the below code will be slow to run" 315 | ] 316 | }, 317 | { 318 | "cell_type": "code", 319 | "execution_count": 21, 320 | "metadata": {}, 321 | "outputs": [ 322 | { 323 | "name": "stdout", 324 | "output_type": "stream", 325 | "text": [ 326 | "5.03 ms ± 1.02 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)\n" 327 | ] 328 | } 329 | ], 330 | "source": [ 331 | "%timeit poly.fit_transform(trn)" 332 | ] 333 | }, 334 | { 335 | "cell_type": "markdown", 336 | "metadata": {}, 337 | "source": [ 338 | "## Speeding up feature generation\n", 339 | "\n", 340 | " We would like to speed this up. 
We will use Numba, a Python library that compiles Python code to machine code.\n", 341 | "\n", 342 | "\n", 343 | " Numba is a just-in-time compiler.\n", 344 | "\n", 345 | " " 346 | ] 347 | }, 348 | { 349 | "cell_type": "markdown", 350 | "metadata": {}, 351 | "source": [ 352 | " Experiments with vectorization and native code\n", 353 | "\n", 354 | " Let's get acquainted with Numba " 355 | ] 356 | }, 357 | { 358 | "cell_type": "code", 359 | "execution_count": 22, 360 | "metadata": {}, 361 | "outputs": [], 362 | "source": [ 363 | "%matplotlib inline" 364 | ] 365 | }, 366 | { 367 | "cell_type": "code", 368 | "execution_count": 25, 369 | "metadata": {}, 370 | "outputs": [], 371 | "source": [ 372 | "import math, numpy as np, matplotlib.pyplot as plt\n", 373 | "from pandas_summary import DataFrameSummary\n", 374 | "from scipy import ndimage" 375 | ] 376 | }, 377 | { 378 | "cell_type": "code", 379 | "execution_count": 28, 380 | "metadata": {}, 381 | "outputs": [], 382 | "source": [ 383 | "from numba import jit, vectorize, guvectorize, cuda, float32, void, float64" 384 | ] 385 | }, 386 | { 387 | "cell_type": "markdown", 388 | "metadata": {}, 389 | "source": [ 390 | " We will show the impact of:\n", 391 | " 1. Avoiding memory allocations and copies\n", 392 | " 2. Better locality\n", 393 | " 3. Vectorization\n", 394 | "\n", 395 | " If we use NumPy on whole arrays at a time, it creates lots of temporaries and can't make good use of the cache. If we use Numba to loop through an array one item at a time, then we don't have to allocate large temporary arrays, and we can reuse cached data since we're doing multiple calculations on each array item."
396 | ] 397 | }, 398 | { 399 | "cell_type": "code", 400 | "execution_count": 32, 401 | "metadata": {}, 402 | "outputs": [], 403 | "source": [ 404 | "# untyped and unvectorized\n", 405 | "def proc_python(xx,yy):\n", 406 | " zz = np.zeros(nobs, dtype='float32')\n", 407 | " for j in range(nobs):\n", 408 | " x,y = xx[j], yy[j]\n", 409 | " x = x*2 - (y *55)\n", 410 | "\n", 411 | " y = x + y*2\n", 412 | "\n", 413 | " z = x+y + 99\n", 414 | " z = z* (z - .88)\n", 415 | " zz[j] = z\n", 416 | " return zz" 417 | ] 418 | }, 419 | { 420 | "cell_type": "code", 421 | "execution_count": 33, 422 | "metadata": {}, 423 | "outputs": [], 424 | "source": [ 425 | "nobs = 10000\n", 426 | "x = np.random.randn(nobs).astype('float32')\n", 427 | "y = np.random.randn(nobs).astype('float32')" 428 | ] 429 | }, 430 | { 431 | "cell_type": "code", 432 | "execution_count": 34, 433 | "metadata": {}, 434 | "outputs": [ 435 | { 436 | "name": "stdout", 437 | "output_type": "stream", 438 | "text": [ 439 | "1.35 s ± 574 ms per loop (mean ± std. dev. 
of 7 runs, 1 loop each)\n" 440 | ] 441 | } 442 | ], 443 | "source": [ 444 | "%timeit proc_python(x,y)\n", 445 | "\n", 446 | "# this is the untyped and unvectorized version" 447 | ] 448 | }, 449 | { 450 | "cell_type": "markdown", 451 | "metadata": {}, 452 | "source": [ 453 | "### NumPy \n", 454 | " We will vectorize it now" 455 | ] 456 | }, 457 | { 458 | "cell_type": "code", 459 | "execution_count": 35, 460 | "metadata": {}, 461 | "outputs": [], 462 | "source": [ 463 | "# Typed and Vectorized\n", 464 | "def proc_numpy(x,y):\n", 465 | " z = np.zeros(nobs, dtype='float32')\n", 466 | " x = x*2 - ( y * 55 )\n", 467 | " y = x + y*2 \n", 468 | " z = x + y + 99 \n", 469 | " z = z * ( z - .88 ) \n", 470 | " return z" 471 | ] 472 | }, 473 | { 474 | "cell_type": "code", 475 | "execution_count": 36, 476 | "metadata": {}, 477 | "outputs": [ 478 | { 479 | "data": { 480 | "text/plain": [ 481 | "True" 482 | ] 483 | }, 484 | "execution_count": 36, 485 | "metadata": {}, 486 | "output_type": "execute_result" 487 | } 488 | ], 489 | "source": [ 490 | "np.allclose( proc_numpy(x,y), proc_python(x,y), atol=1e-4 )" 491 | ] 492 | }, 493 | { 494 | "cell_type": "code", 495 | "execution_count": 37, 496 | "metadata": {}, 497 | "outputs": [ 498 | { 499 | "name": "stdout", 500 | "output_type": "stream", 501 | "text": [ 502 | "641 µs ± 125 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n" 503 | ] 504 | } 505 | ], 506 | "source": [ 507 | "%timeit proc_numpy(x,y) # Typed and vectorized" 508 | ] 509 | }, 510 | { 511 | "cell_type": "markdown", 512 | "metadata": {}, 513 | "source": [ 514 | "### Numba\n", 515 | "\n", 516 | " Numba offers several different decorators. We will try two different ones:\n", 517 | " 1. @jit : very general\n", 518 | " 2. 
@vectorize : don't need to write a for loop" 519 | ] 520 | }, 521 | { 522 | "cell_type": "code", 523 | "execution_count": 38, 524 | "metadata": {}, 525 | "outputs": [], 526 | "source": [ 527 | "@jit()\n", 528 | "def proc_numba(xx,yy,zz):\n", 529 | " for j in range(nobs): \n", 530 | " x, y = xx[j], yy[j] \n", 531 | " x = x*2 - ( y * 55 )\n", 532 | " y = x + y*2 \n", 533 | " z = x + y + 99 \n", 534 | " z = z * ( z - .88 ) \n", 535 | " zz[j] = z \n", 536 | " return zz" 537 | ] 538 | }, 539 | { 540 | "cell_type": "code", 541 | "execution_count": 39, 542 | "metadata": {}, 543 | "outputs": [ 544 | { 545 | "data": { 546 | "text/plain": [ 547 | "True" 548 | ] 549 | }, 550 | "execution_count": 39, 551 | "metadata": {}, 552 | "output_type": "execute_result" 553 | } 554 | ], 555 | "source": [ 556 | "z = np.zeros(nobs).astype('float32')\n", 557 | "np.allclose( proc_numpy(x,y), proc_numba(x,y,z), atol=1e-4 )\n" 558 | ] 559 | }, 560 | { 561 | "cell_type": "code", 562 | "execution_count": 40, 563 | "metadata": {}, 564 | "outputs": [ 565 | { 566 | "name": "stdout", 567 | "output_type": "stream", 568 | "text": [ 569 | "95.8 µs ± 21.6 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)\n" 570 | ] 571 | } 572 | ], 573 | "source": [ 574 | "%timeit proc_numba(x,y,z)" 575 | ] 576 | }, 577 | { 578 | "cell_type": "markdown", 579 | "metadata": {}, 580 | "source": [ 581 | " Now we'll use the vectorize decorator. 
\n", 582 | " Numba's compiler optimizes this in a smarter way\n", 583 | "\n", 584 | "\n", 585 | " " 586 | ] 587 | }, 588 | { 589 | "cell_type": "code", 590 | "execution_count": 41, 591 | "metadata": {}, 592 | "outputs": [], 593 | "source": [ 594 | "@vectorize\n", 595 | "def vec_numba(x,y):\n", 596 | " x = x*2 - ( y * 55 )\n", 597 | " y = x + y*2 \n", 598 | " z = x + y + 99 \n", 599 | " return z * ( z - .88 ) " 600 | ] 601 | }, 602 | { 603 | "cell_type": "code", 604 | "execution_count": 42, 605 | "metadata": {}, 606 | "outputs": [ 607 | { 608 | "data": { 609 | "text/plain": [ 610 | "True" 611 | ] 612 | }, 613 | "execution_count": 42, 614 | "metadata": {}, 615 | "output_type": "execute_result" 616 | } 617 | ], 618 | "source": [ 619 | "np.allclose(vec_numba(x,y), proc_numba(x,y,z), atol=1e-4 )" 620 | ] 621 | }, 622 | { 623 | "cell_type": "code", 624 | "execution_count": 43, 625 | "metadata": {}, 626 | "outputs": [ 627 | { 628 | "name": "stdout", 629 | "output_type": "stream", 630 | "text": [ 631 | "104 µs ± 32.7 µs per loop (mean ± std. dev. 
of 7 runs, 10000 loops each)\n" 632 | ] 633 | } 634 | ], 635 | "source": [ 636 | "%timeit vec_numba(x,y)" 637 | ] 638 | }, 639 | { 640 | "cell_type": "markdown", 641 | "metadata": {}, 642 | "source": [ 643 | " NUMBA is amazingly fast" 644 | ] 645 | }, 646 | { 647 | "cell_type": "markdown", 648 | "metadata": {}, 649 | "source": [ 650 | "### Numba Polynomial features" 651 | ] 652 | }, 653 | { 654 | "cell_type": "code", 655 | "execution_count": 44, 656 | "metadata": {}, 657 | "outputs": [], 658 | "source": [ 659 | "@jit(nopython=True)\n", 660 | "def vec_poly(x, res):\n", 661 | " m,n=x.shape\n", 662 | " feat_idx=0\n", 663 | " for i in range(n):\n", 664 | " v1=x[:,i]\n", 665 | " for k in range(m): res[k,feat_idx] = v1[k]\n", 666 | " feat_idx+=1\n", 667 | " for j in range(i,n):\n", 668 | " for k in range(m): res[k,feat_idx] = v1[k]*x[k,j]\n", 669 | " feat_idx+=1" 670 | ] 671 | }, 672 | { 673 | "cell_type": "markdown", 674 | "metadata": {}, 675 | "source": [ 676 | "#### Row Major vs Column Major Storage\n", 677 | "\n", 678 | " \"The row-major layout of a matrix puts the first row in contiguous memory, then the second row right after it, then the third, and so on. Column-major layout puts the first column in contiguous memory, then the second, etc.... 
While knowing which layout a particular data set is using is critical for good performance, there's no single answer to the question which layout 'is better' in general.\n", 679 | "\n", 680 | " \"It turns out that matching the way your algorithm works with the data layout can make or break the performance of an application.\n", 681 | "\n", 682 | " \"The short takeaway is: always traverse the data in the order it was laid out.\"\n", 683 | "\n", 684 | " Column-major layout: Fortran, Matlab, R, and Julia\n", 685 | "\n", 686 | " Row-major layout: C, C++, Python, Pascal, Mathematica\n" 687 | ] 688 | }, 689 | { 690 | "cell_type": "code", 691 | "execution_count": 45, 692 | "metadata": {}, 693 | "outputs": [], 694 | "source": [ 695 | "trn = np.asfortranarray(trn)\n", 696 | "test = np.asfortranarray(test)" 697 | ] 698 | }, 699 | { 700 | "cell_type": "code", 701 | "execution_count": 46, 702 | "metadata": {}, 703 | "outputs": [], 704 | "source": [ 705 | "m,n = trn.shape\n", 706 | "n_feat = n*(n+1)//2 + n\n", 707 | "trn_feat = np.zeros((m,n_feat), order='F')\n", 708 | "\n", 709 | "test_feat = np.zeros((len(y_test), n_feat), order='F')" 710 | ] 711 | }, 712 | { 713 | "cell_type": "code", 714 | "execution_count": 47, 715 | "metadata": {}, 716 | "outputs": [], 717 | "source": [ 718 | "vec_poly(trn,trn_feat)\n", 719 | "vec_poly(test,test_feat)" 720 | ] 721 | }, 722 | { 723 | "cell_type": "code", 724 | "execution_count": 48, 725 | "metadata": {}, 726 | "outputs": [ 727 | { 728 | "data": { 729 | "text/plain": [ 730 | "LinearRegression()" 731 | ] 732 | }, 733 | "execution_count": 48, 734 | "metadata": {}, 735 | "output_type": "execute_result" 736 | } 737 | ], 738 | "source": [ 739 | "regr.fit(trn_feat, y_trn)" 740 | ] 741 | }, 742 | { 743 | "cell_type": "code", 744 | "execution_count": 49, 745 | "metadata": {}, 746 | "outputs": [ 747 | { 748 | "data": { 749 | "text/plain": [ 750 | "(56.08366396156081, 42.37826260086712)" 751 | ] 752 | }, 753 | "execution_count": 49, 754 | "metadata": 
{}, 755 | "output_type": "execute_result" 756 | } 757 | ], 758 | "source": [ 759 | "regr_metric(y_test, regr.predict(test_feat))" 760 | ] 761 | }, 762 | { 763 | "cell_type": "code", 764 | "execution_count": 50, 765 | "metadata": {}, 766 | "outputs": [ 767 | { 768 | "name": "stdout", 769 | "output_type": "stream", 770 | "text": [ 771 | "405 µs ± 125 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n" 772 | ] 773 | } 774 | ], 775 | "source": [ 776 | "%timeit vec_poly(trn, trn_feat)" 777 | ] 778 | }, 779 | { 780 | "cell_type": "markdown", 781 | "metadata": {}, 782 | "source": [ 783 | " For comparison, this is the timing of scikit-learn's PolynomialFeatures implementation:" 784 | ] 785 | }, 786 | { 787 | "cell_type": "code", 788 | "execution_count": 51, 789 | "metadata": {}, 790 | "outputs": [ 791 | { 792 | "name": "stdout", 793 | "output_type": "stream", 794 | "text": [ 795 | "The slowest run took 5.93 times longer than the fastest. This could mean that an intermediate result is being cached.\n", 796 | "2.12 ms ± 1.5 ms per loop (mean ± std. dev. 
of 7 runs, 1 loop each)\n" 797 | ] 798 | } 799 | ], 800 | "source": [ 801 | "%timeit poly.fit_transform(trn)" 802 | ] 803 | }, 804 | { 805 | "cell_type": "markdown", 806 | "metadata": {}, 807 | "source": [ 808 | "### Regularization and Noise\n", 809 | "\n", 810 | " Regularization is a way to reduce over-fitting and create models that better generalize to new data. Lasso regression uses an L1 penalty, which pushes toward sparse coefficients." 811 | ] 812 | }, 813 | { 814 | "cell_type": "code", 815 | "execution_count": 52, 816 | "metadata": {}, 817 | "outputs": [], 818 | "source": [ 819 | "reg_regr = linear_model.LassoCV(n_alphas= 10)" 820 | ] 821 | }, 822 | { 823 | "cell_type": "code", 824 | "execution_count": 53, 825 | "metadata": {}, 826 | "outputs": [ 827 | { 828 | "name": "stderr", 829 | "output_type": "stream", 830 | "text": [ 831 | "c:\\Users\\Monit Sharma\\AppData\\Local\\Programs\\Python\\Python37\\lib\\site-packages\\sklearn\\linear_model\\_coordinate_descent.py:644: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 93470.86108051939, tolerance: 178.35448262411347\n", 832 | " positive,\n", 833 | "c:\\Users\\Monit Sharma\\AppData\\Local\\Programs\\Python\\Python37\\lib\\site-packages\\sklearn\\linear_model\\_coordinate_descent.py:644: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 43641.458391502674, tolerance: 177.24584452296824\n", 834 | " positive,\n", 835 | "c:\\Users\\Monit Sharma\\AppData\\Local\\Programs\\Python\\Python37\\lib\\site-packages\\sklearn\\linear_model\\_coordinate_descent.py:644: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. 
Duality gap: 12906.598891411093, tolerance: 181.50651166077742\n", 836 | " positive,\n" 837 | ] 838 | }, 839 | { 840 | "data": { 841 | "text/plain": [ 842 | "LassoCV(n_alphas=10)" 843 | ] 844 | }, 845 | "execution_count": 53, 846 | "metadata": {}, 847 | "output_type": "execute_result" 848 | } 849 | ], 850 | "source": [ 851 | "reg_regr.fit(trn_feat, y_trn)" 852 | ] 853 | }, 854 | { 855 | "cell_type": "code", 856 | "execution_count": 54, 857 | "metadata": {}, 858 | "outputs": [ 859 | { 860 | "data": { 861 | "text/plain": [ 862 | "0.011006106192913239" 863 | ] 864 | }, 865 | "execution_count": 54, 866 | "metadata": {}, 867 | "output_type": "execute_result" 868 | } 869 | ], 870 | "source": [ 871 | "reg_regr.alpha_" 872 | ] 873 | }, 874 | { 875 | "cell_type": "code", 876 | "execution_count": 57, 877 | "metadata": {}, 878 | "outputs": [ 879 | { 880 | "data": { 881 | "text/plain": [ 882 | "(49.58213957493595, 39.277438762354656)" 883 | ] 884 | }, 885 | "execution_count": 57, 886 | "metadata": {}, 887 | "output_type": "execute_result" 888 | } 889 | ], 890 | "source": [ 891 | "regr_metric(y_test, reg_regr.predict(test_feat))" 892 | ] 893 | }, 894 | { 895 | "cell_type": "markdown", 896 | "metadata": {}, 897 | "source": [ 898 | "### Noise \n", 899 | "\n", 900 | " Adding some noise" 901 | ] 902 | }, 903 | { 904 | "cell_type": "code", 905 | "execution_count": 58, 906 | "metadata": {}, 907 | "outputs": [], 908 | "source": [ 909 | "idxs = np.random.randint(0, len(trn), 10)" 910 | ] 911 | }, 912 | { 913 | "cell_type": "code", 914 | "execution_count": 60, 915 | "metadata": {}, 916 | "outputs": [], 917 | "source": [ 918 | "y_trn2 = np.copy(y_trn)\n", 919 | "y_trn2[idxs] *= 10 # label noise" 920 | ] 921 | }, 922 | { 923 | "cell_type": "code", 924 | "execution_count": 61, 925 | "metadata": {}, 926 | "outputs": [ 927 | { 928 | "data": { 929 | "text/plain": [ 930 | "(49.87687754741944, 39.34822094080999)" 931 | ] 932 | }, 933 | "execution_count": 61, 934 | "metadata": {}, 935 | 
"output_type": "execute_result" 936 | } 937 | ], 938 | "source": [ 939 | "regr = linear_model.LinearRegression()\n", 940 | "regr.fit(trn, y_trn)\n", 941 | "regr_metric(y_test, regr.predict(test))" 942 | ] 943 | }, 944 | { 945 | "cell_type": "code", 946 | "execution_count": 63, 947 | "metadata": {}, 948 | "outputs": [ 949 | { 950 | "data": { 951 | "text/plain": [ 952 | "(69.14132406238272, 57.59514125171805)" 953 | ] 954 | }, 955 | "execution_count": 63, 956 | "metadata": {}, 957 | "output_type": "execute_result" 958 | } 959 | ], 960 | "source": [ 961 | "regr.fit(trn, y_trn2)\n", 962 | "regr_metric(y_test, regr.predict(test))" 963 | ] 964 | }, 965 | { 966 | "cell_type": "markdown", 967 | "metadata": {}, 968 | "source": [ 969 | " Huber Loss is a loss function that is less sensitive to outliers than squared error loss. It is quadratic for small error values, and linear for large values" 970 | ] 971 | }, 972 | { 973 | "cell_type": "code", 974 | "execution_count": 64, 975 | "metadata": {}, 976 | "outputs": [ 977 | { 978 | "name": "stderr", 979 | "output_type": "stream", 980 | "text": [ 981 | "c:\\Users\\Monit Sharma\\AppData\\Local\\Programs\\Python\\Python37\\lib\\site-packages\\sklearn\\linear_model\\_huber.py:332: ConvergenceWarning: lbfgs failed to converge (status=1):\n", 982 | "STOP: TOTAL NO. 
of ITERATIONS REACHED LIMIT.\n", 983 | "\n", 984 | "Increase the number of iterations (max_iter) or scale the data as shown in:\n", 985 | " https://scikit-learn.org/stable/modules/preprocessing.html\n", 986 | " self.n_iter_ = _check_optimize_result(\"lbfgs\", opt_res, self.max_iter)\n" 987 | ] 988 | }, 989 | { 990 | "data": { 991 | "text/plain": [ 992 | "(51.24180446481409, 40.38448625427875)" 993 | ] 994 | }, 995 | "execution_count": 64, 996 | "metadata": {}, 997 | "output_type": "execute_result" 998 | } 999 | ], 1000 | "source": [ 1001 | "hregr = linear_model.HuberRegressor()\n", 1002 | "hregr.fit(trn, y_trn2)\n", 1003 | "regr_metric(y_test, hregr.predict(test))" 1004 | ] 1005 | }, 1006 | { 1007 | "cell_type": "code", 1008 | "execution_count": null, 1009 | "metadata": {}, 1010 | "outputs": [], 1011 | "source": [] 1012 | } 1013 | ], 1014 | "metadata": { 1015 | "kernelspec": { 1016 | "display_name": "Python 3.7.0 64-bit", 1017 | "language": "python", 1018 | "name": "python3" 1019 | }, 1020 | "language_info": { 1021 | "codemirror_mode": { 1022 | "name": "ipython", 1023 | "version": 3 1024 | }, 1025 | "file_extension": ".py", 1026 | "mimetype": "text/x-python", 1027 | "name": "python", 1028 | "nbconvert_exporter": "python", 1029 | "pygments_lexer": "ipython3", 1030 | "version": "3.7.0" 1031 | }, 1032 | "orig_nbformat": 4, 1033 | "vscode": { 1034 | "interpreter": { 1035 | "hash": "c0f2ef85a013b3e049ebe5ad4c39b42bd34e85c4b0686b7bef6ce5ff62aa91e1" 1036 | } 1037 | } 1038 | }, 1039 | "nbformat": 4, 1040 | "nbformat_minor": 2 1041 | } 1042 | -------------------------------------------------------------------------------- /Advanced Numerical Linear Algebra/week_7_how_to_implement_linear_regression.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "## How to implement Linear Regression\n", 8 | "\n", 9 | " We will look into how we could write 
our own implementation" 10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": 2, 15 | "metadata": {}, 16 | "outputs": [], 17 | "source": [ 18 | "from sklearn import datasets, linear_model, metrics\n", 19 | "from sklearn.model_selection import train_test_split\n", 20 | "from sklearn.preprocessing import PolynomialFeatures\n", 21 | "import math, scipy , numpy as np\n", 22 | "from scipy import linalg" 23 | ] 24 | }, 25 | { 26 | "cell_type": "code", 27 | "execution_count": 3, 28 | "metadata": {}, 29 | "outputs": [], 30 | "source": [ 31 | "np.set_printoptions(precision=6)" 32 | ] 33 | }, 34 | { 35 | "cell_type": "code", 36 | "execution_count": 4, 37 | "metadata": {}, 38 | "outputs": [], 39 | "source": [ 40 | "data = datasets.load_diabetes()" 41 | ] 42 | }, 43 | { 44 | "cell_type": "code", 45 | "execution_count": 5, 46 | "metadata": {}, 47 | "outputs": [], 48 | "source": [ 49 | "feature_names=['age', 'sex', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']" 50 | ] 51 | }, 52 | { 53 | "cell_type": "code", 54 | "execution_count": 6, 55 | "metadata": {}, 56 | "outputs": [], 57 | "source": [ 58 | "trn,test,y_trn,y_test = train_test_split(data.data, data.target, test_size=0.2)" 59 | ] 60 | }, 61 | { 62 | "cell_type": "code", 63 | "execution_count": 7, 64 | "metadata": {}, 65 | "outputs": [ 66 | { 67 | "data": { 68 | "text/plain": [ 69 | "((353, 10), (89, 10))" 70 | ] 71 | }, 72 | "execution_count": 7, 73 | "metadata": {}, 74 | "output_type": "execute_result" 75 | } 76 | ], 77 | "source": [ 78 | "trn.shape, test.shape" 79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": 8, 84 | "metadata": {}, 85 | "outputs": [], 86 | "source": [ 87 | "def regr_metrics(act, pred):\n", 88 | " return (math.sqrt(metrics.mean_squared_error(act, pred)), \n", 89 | " metrics.mean_absolute_error(act, pred))" 90 | ] 91 | }, 92 | { 93 | "cell_type": "markdown", 94 | "metadata": {}, 95 | "source": [ 96 | "## How did sklearn do it?\n", 97 | "\n", 98 | " By checking the 
source code, you can see that in the dense case it calls scipy.linalg.lstsq, which in turn calls one of the following LAPACK routines:\n", 99 | "\n", 100 | " 1. gelsd : uses SVD and a divide and conquer method\n", 101 | " 2. gelsy : uses QR factorization\n", 102 | " 3. gelss : uses SVD\n", 103 | "\n", 104 | "\n", 105 | " Scipy Sparse Least Squares\n", 106 | " It uses an iterative method (LSQR) based on Golub and Kahan bidiagonalization\n", 107 | " Preconditioning is another way to reduce the number of iterations. If it is possible to efficiently solve a related system M*x = b, where M approximates A in some helpful way, LSQR may converge more rapidly on the system" 108 | ] 109 | }, 110 | { 111 | "cell_type": "markdown", 112 | "metadata": {}, 113 | "source": [ 114 | "#### linalg.lstsq\n", 115 | " The sklearn implementation handled adding a constant term for us; here we have to add it ourselves" 116 | ] 117 | }, 118 | { 119 | "cell_type": "code", 120 | "execution_count": 9, 121 | "metadata": {}, 122 | "outputs": [], 123 | "source": [ 124 | "trn_int = np.c_[trn, np.ones(trn.shape[0])]\n", 125 | "test_int = np.c_[test, np.ones(test.shape[0])]" 126 | ] 127 | }, 128 | { 129 | "cell_type": "markdown", 130 | "metadata": {}, 131 | "source": [ 132 | " Since linalg.lstsq lets us specify which LAPACK routine we want to use, let's try them all and do some timing comparisons:" 133 | ] 134 | }, 135 | { 136 | "cell_type": "code", 137 | "execution_count": 10, 138 | "metadata": {}, 139 | "outputs": [ 140 | { 141 | "name": "stdout", 142 | "output_type": "stream", 143 | "text": [ 144 | "The slowest run took 5.14 times longer than the fastest. This could mean that an intermediate result is being cached.\n", 145 | "607 µs ± 510 µs per loop (mean ± std. dev.
of 7 runs, 1 loop each)\n" 146 | ] 147 | } 148 | ], 149 | "source": [ 150 | "%timeit coef, _,_,_ = linalg.lstsq(trn_int, y_trn, lapack_driver=\"gelsd\")" 151 | ] 152 | }, 153 | { 154 | "cell_type": "code", 155 | "execution_count": 11, 156 | "metadata": {}, 157 | "outputs": [ 158 | { 159 | "name": "stdout", 160 | "output_type": "stream", 161 | "text": [ 162 | "1.24 ms ± 424 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n" 163 | ] 164 | } 165 | ], 166 | "source": [ 167 | "%timeit coef, _,_,_ = linalg.lstsq(trn_int, y_trn, lapack_driver=\"gelsy\")" 168 | ] 169 | }, 170 | { 171 | "cell_type": "code", 172 | "execution_count": 12, 173 | "metadata": {}, 174 | "outputs": [ 175 | { 176 | "name": "stdout", 177 | "output_type": "stream", 178 | "text": [ 179 | "The slowest run took 4.58 times longer than the fastest. This could mean that an intermediate result is being cached.\n", 180 | "587 µs ± 366 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n" 181 | ] 182 | } 183 | ], 184 | "source": [ 185 | "%timeit coef, _,_,_ = linalg.lstsq(trn_int, y_trn, lapack_driver=\"gelss\")" 186 | ] 187 | }, 188 | { 189 | "cell_type": "markdown", 190 | "metadata": {}, 191 | "source": [ 192 | "## Naive Solution\n", 193 | " Recall that we want to find x that minimizes:\n", 194 | "$$ ||Ax-b||_2 $$\n", 195 | " Another way to think about this is that we are looking for the point in the subspace spanned by the columns of A that is closest to b. This point, Ax, is the projection of b onto that subspace.
Since the residual b-Ax must be perpendicular to the subspace spanned by the columns of A, we see that\n", 196 | "\n", 197 | "$$ A^T (b - Ax) = 0 $$\n", 198 | "\n", 199 | " This leads us to the normal equations, whose solution is\n", 200 | "$$ x = (A^T A)^{-1} A^Tb $$" 201 | ] 202 | }, 203 | { 204 | "cell_type": "code", 205 | "execution_count": 13, 206 | "metadata": {}, 207 | "outputs": [], 208 | "source": [ 209 | "def ls_naive(A, b):\n", 210 | " return np.linalg.inv(A.T @ A) @ A.T @ b" 211 | ] 212 | }, 213 | { 214 | "cell_type": "code", 215 | "execution_count": 14, 216 | "metadata": {}, 217 | "outputs": [ 218 | { 219 | "name": "stdout", 220 | "output_type": "stream", 221 | "text": [ 222 | "The slowest run took 13.44 times longer than the fastest. This could mean that an intermediate result is being cached.\n", 223 | "128 ms ± 106 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n" 224 | ] 225 | } 226 | ], 227 | "source": [ 228 | "%timeit coeffs_naive = ls_naive(trn_int, y_trn)" 229 | ] 230 | }, 231 | { 232 | "cell_type": "code", 233 | "execution_count": 15, 234 | "metadata": {}, 235 | "outputs": [ 236 | { 237 | "data": { 238 | "text/plain": [ 239 | "(56.32960590877427, 47.82386741198034)" 240 | ] 241 | }, 242 | "execution_count": 15, 243 | "metadata": {}, 244 | "output_type": "execute_result" 245 | } 246 | ], 247 | "source": [ 248 | "coeffs_naive = ls_naive(trn_int, y_trn)\n", 249 | "regr_metrics(y_test, test_int @ coeffs_naive)" 250 | ] 251 | }, 252 | { 253 | "cell_type": "markdown", 254 | "metadata": {}, 255 | "source": [ 256 | "### Normal Equations (Cholesky)\n", 257 | "\n", 258 | " Normal Equations :\n", 259 | "\n", 260 | "$$ A^T Ax = A^T b $$\n", 261 | "\n", 262 | "If A has full rank, $A^TA$ is a square, Hermitian positive definite matrix.
The standard way of solving such a system is Cholesky factorization, which finds an upper-triangular R such that $A^T A = R^TR$" 263 | ] 264 | }, 265 | { 266 | "cell_type": "code", 267 | "execution_count": 16, 268 | "metadata": {}, 269 | "outputs": [], 270 | "source": [ 271 | "A = trn_int" 272 | ] 273 | }, 274 | { 275 | "cell_type": "code", 276 | "execution_count": 17, 277 | "metadata": {}, 278 | "outputs": [], 279 | "source": [ 280 | "b = y_trn" 281 | ] 282 | }, 283 | { 284 | "cell_type": "code", 285 | "execution_count": 18, 286 | "metadata": {}, 287 | "outputs": [], 288 | "source": [ 289 | "AtA = A.T @ A\n", 290 | "Atb = A.T @ b" 291 | ] 292 | }, 293 | { 294 | "cell_type": "code", 295 | "execution_count": 19, 296 | "metadata": {}, 297 | "outputs": [], 298 | "source": [ 299 | "R = scipy.linalg.cholesky(AtA)" 300 | ] 301 | }, 302 | { 303 | "cell_type": "code", 304 | "execution_count": 20, 305 | "metadata": {}, 306 | "outputs": [ 307 | { 308 | "data": { 309 | "text/plain": [ 310 | "array([[ 0.912 , 0.1804, 0.1842, 0.3173, 0.2687, 0.2217, -0.0509,\n", 311 | " 0.1901, 0.2657, 0.3002, -0.1708],\n", 312 | " [ 0. , 0.8748, 0.0565, 0.1561, -0.0118, 0.1097, -0.3566,\n", 313 | " 0.2886, 0.0754, 0.1325, -0.1085],\n", 314 | " [ 0. , 0. , 0.8607, 0.307 , 0.169 , 0.1812, -0.2942,\n", 315 | " 0.2979, 0.3477, 0.3038, -0.2325],\n", 316 | " [ 0. , 0. , 0. , 0.7696, 0.1211, 0.0471, 0.0286,\n", 317 | " 0.0233, 0.1789, 0.163 , 0.0297],\n", 318 | " [ 0. , 0. , 0. , 0. , 0.8123, 0.7239, 0.1359,\n", 319 | " 0.3783, 0.33 , 0.1165, -1.2538],\n", 320 | " [ 0. , 0. , 0. , 0. , 0. , 0.3846, -0.4024,\n", 321 | " 0.2558, -0.3441, -0.0579, -0.7216],\n", 322 | " [ 0. , 0. , 0. , 0. , 0. , 0. , 0.6497,\n", 323 | " -0.4917, -0.5276, -0.1386, 0.3961],\n", 324 | " [ 0. , 0. , 0. , 0. , 0. , 0. , 0. ,\n", 325 | " 0.2804, -0.018 , 0.0037, -0.3866],\n", 326 | " [ 0. , 0. , 0. , 0. , 0. , 0. , 0. ,\n", 327 | " 0. , 0.2952, 0.0783, -0.3811],\n", 328 | " [ 0. , 0. , 0. , 0. , 0. , 0. , 0. ,\n", 329 | " 0. , 0.
, 0.7238, -0.0866],\n", 330 | " [ 0. , 0. , 0. , 0. , 0. , 0. , 0. ,\n", 331 | " 0. , 0. , 0. , 18.7177]])" 332 | ] 333 | }, 334 | "execution_count": 20, 335 | "metadata": {}, 336 | "output_type": "execute_result" 337 | } 338 | ], 339 | "source": [ 340 | "np.set_printoptions(suppress=True, precision=4)\n", 341 | "R" 342 | ] 343 | }, 344 | { 345 | "cell_type": "code", 346 | "execution_count": 21, 347 | "metadata": {}, 348 | "outputs": [ 349 | { 350 | "data": { 351 | "text/plain": [ 352 | "5.684557367178372e-14" 353 | ] 354 | }, 355 | "execution_count": 21, 356 | "metadata": {}, 357 | "output_type": "execute_result" 358 | } 359 | ], 360 | "source": [ 361 | "np.linalg.norm(AtA - R.T @ R)" 362 | ] 363 | }, 364 | { 365 | "cell_type": "markdown", 366 | "metadata": {}, 367 | "source": [ 368 | "$$ A^T Ax = A^T b $$\n", 369 | "$$ R^T Rx = A^T b $$\n", 370 | "$$ R^T w = A^T b $$\n", 371 | "$$ Rx = w $$" 372 | ] 373 | }, 374 | { 375 | "cell_type": "code", 376 | "execution_count": 22, 377 | "metadata": {}, 378 | "outputs": [], 379 | "source": [ 380 | "w = scipy.linalg.solve_triangular(R, Atb, lower=False, trans='T')" 381 | ] 382 | }, 383 | { 384 | "cell_type": "code", 385 | "execution_count": 23, 386 | "metadata": {}, 387 | "outputs": [ 388 | { 389 | "data": { 390 | "text/plain": [ 391 | "7.277126723879252e-12" 392 | ] 393 | }, 394 | "execution_count": 23, 395 | "metadata": {}, 396 | "output_type": "execute_result" 397 | } 398 | ], 399 | "source": [ 400 | "np.linalg.norm(R.T @ w - Atb)" 401 | ] 402 | }, 403 | { 404 | "cell_type": "code", 405 | "execution_count": 24, 406 | "metadata": {}, 407 | "outputs": [], 408 | "source": [ 409 | "coeffs_chol = scipy.linalg.solve_triangular(R, w, lower=False)\n" 410 | ] 411 | }, 412 | { 413 | "cell_type": "code", 414 | "execution_count": 25, 415 | "metadata": {}, 416 | "outputs": [ 417 | { 418 | "data": { 419 | "text/plain": [ 420 | "1.3048672769382318e-13" 421 | ] 422 | }, 423 | "execution_count": 25, 424 | "metadata": {}, 425 | 
"output_type": "execute_result" 426 | } 427 | ], 428 | "source": [ 429 | "np.linalg.norm(R @ coeffs_chol - w)" 430 | ] 431 | }, 432 | { 433 | "cell_type": "code", 434 | "execution_count": 26, 435 | "metadata": {}, 436 | "outputs": [], 437 | "source": [ 438 | "def ls_chol(A, b):\n", 439 | " R = scipy.linalg.cholesky(A.T @ A)\n", 440 | " w = scipy.linalg.solve_triangular(R, A.T @ b, trans='T')\n", 441 | " return scipy.linalg.solve_triangular(R, w)" 442 | ] 443 | }, 444 | { 445 | "cell_type": "code", 446 | "execution_count": 27, 447 | "metadata": {}, 448 | "outputs": [ 449 | { 450 | "name": "stdout", 451 | "output_type": "stream", 452 | "text": [ 453 | "The slowest run took 8.18 times longer than the fastest. This could mean that an intermediate result is being cached.\n", 454 | "120 ms ± 65.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n" 455 | ] 456 | } 457 | ], 458 | "source": [ 459 | "%timeit coeffs_chol = ls_chol(trn_int, y_trn)" 460 | ] 461 | }, 462 | { 463 | "cell_type": "code", 464 | "execution_count": 28, 465 | "metadata": {}, 466 | "outputs": [ 467 | { 468 | "data": { 469 | "text/plain": [ 470 | "(56.3296059087742, 47.82386741198027)" 471 | ] 472 | }, 473 | "execution_count": 28, 474 | "metadata": {}, 475 | "output_type": "execute_result" 476 | } 477 | ], 478 | "source": [ 479 | "coeffs_chol = ls_chol(trn_int, y_trn)\n", 480 | "regr_metrics(y_test, test_int @ coeffs_chol)" 481 | ] 482 | }, 483 | { 484 | "cell_type": "markdown", 485 | "metadata": {}, 486 | "source": [ 487 | "### QR Factorization\n", 488 | "$$ Ax = b $$\n", 489 | "\n", 490 | "$$ A = QR$$\n", 491 | "$$ QRx = b $$\n", 492 | "$$ Rx = Q^T b$$" 493 | ] 494 | }, 495 | { 496 | "cell_type": "code", 497 | "execution_count": 29, 498 | "metadata": {}, 499 | "outputs": [], 500 | "source": [ 501 | "def ls_qr(A,b):\n", 502 | " Q, R = scipy.linalg.qr(A, mode='economic')\n", 503 | " return scipy.linalg.solve_triangular(R, Q.T @ b)" 504 | ] 505 | }, 506 | { 507 | "cell_type": "code", 508 | 
"execution_count": 30, 509 | "metadata": {}, 510 | "outputs": [ 511 | { 512 | "name": "stdout", 513 | "output_type": "stream", 514 | "text": [ 515 | "409 µs ± 103 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)\n" 516 | ] 517 | } 518 | ], 519 | "source": [ 520 | "%timeit coeffs_qr = ls_qr(trn_int, y_trn)" 521 | ] 522 | }, 523 | { 524 | "cell_type": "code", 525 | "execution_count": 31, 526 | "metadata": {}, 527 | "outputs": [ 528 | { 529 | "data": { 530 | "text/plain": [ 531 | "(56.32960590877418, 47.82386741198025)" 532 | ] 533 | }, 534 | "execution_count": 31, 535 | "metadata": {}, 536 | "output_type": "execute_result" 537 | } 538 | ], 539 | "source": [ 540 | "coeffs_qr = ls_qr(trn_int, y_trn)\n", 541 | "regr_metrics(y_test, test_int @ coeffs_qr)" 542 | ] 543 | }, 544 | { 545 | "cell_type": "markdown", 546 | "metadata": {}, 547 | "source": [ 548 | "### SVD \n", 549 | "\n", 550 | "$$ Ax = b $$\n", 551 | "$$ A = U \\Sigma V$$\n", 552 | "$$ \\Sigma V x = U^T b $$\n", 553 | "$$ \\Sigma w = U^T b $$\n", 554 | "$$ x = V^T w $$\n", 555 | "\n", 556 | " SVD gives a pseudo-inverse" 557 | ] 558 | }, 559 | { 560 | "cell_type": "code", 561 | "execution_count": 32, 562 | "metadata": {}, 563 | "outputs": [], 564 | "source": [ 565 | "def ls_svd(A,b):\n", 566 | " m, n = A.shape\n", 567 | " U, sigma, Vh = scipy.linalg.svd(A, full_matrices=False, lapack_driver='gesdd')\n", 568 | " w = (U.T @ b)/ sigma\n", 569 | " return Vh.T @ w" 570 | ] 571 | }, 572 | { 573 | "cell_type": "code", 574 | "execution_count": 33, 575 | "metadata": {}, 576 | "outputs": [ 577 | { 578 | "name": "stdout", 579 | "output_type": "stream", 580 | "text": [ 581 | "550 µs ± 149 µs per loop (mean ± std. dev.
of 7 runs, 1000 loops each)\n" 582 | ] 583 | } 584 | ], 585 | "source": [ 586 | "%timeit coeffs_svd = ls_svd(trn_int, y_trn)" 587 | ] 588 | }, 589 | { 590 | "cell_type": "code", 591 | "execution_count": 34, 592 | "metadata": {}, 593 | "outputs": [ 594 | { 595 | "name": "stdout", 596 | "output_type": "stream", 597 | "text": [ 598 | "2.44 ms ± 775 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n" 599 | ] 600 | } 601 | ], 602 | "source": [ 603 | "%timeit coeffs_svd = ls_svd(trn_int, y_trn)" 604 | ] 605 | }, 606 | { 607 | "cell_type": "code", 608 | "execution_count": 35, 609 | "metadata": {}, 610 | "outputs": [ 611 | { 612 | "data": { 613 | "text/plain": [ 614 | "(56.32960590877416, 47.82386741198024)" 615 | ] 616 | }, 617 | "execution_count": 35, 618 | "metadata": {}, 619 | "output_type": "execute_result" 620 | } 621 | ], 622 | "source": [ 623 | "coeffs_svd = ls_svd(trn_int, y_trn)\n", 624 | "regr_metrics(y_test, test_int @ coeffs_svd)" 625 | ] 626 | }, 627 | { 628 | "cell_type": "markdown", 629 | "metadata": {}, 630 | "source": [ 631 | "### Random Sketching Technique for Least Squares Regression\n", 632 | "\n", 633 | " 1. Sample an r x m random matrix S, where m is the number of rows of A\n", 634 | " 2. Compute SA and Sb\n", 635 | " 3.
Find Exact Solution x to regression SA x = Sb" 636 | ] 637 | }, 638 | { 639 | "cell_type": "markdown", 640 | "metadata": {}, 641 | "source": [ 642 | "### Timing Comparison" 643 | ] 644 | }, 645 | { 646 | "cell_type": "code", 647 | "execution_count": 36, 648 | "metadata": {}, 649 | "outputs": [], 650 | "source": [ 651 | "import timeit\n", 652 | "import pandas as pd" 653 | ] 654 | }, 655 | { 656 | "cell_type": "code", 657 | "execution_count": 37, 658 | "metadata": {}, 659 | "outputs": [], 660 | "source": [ 661 | "def scipylstq(A, b):\n", 662 | " return scipy.linalg.lstsq(A,b)[0]" 663 | ] 664 | }, 665 | { 666 | "cell_type": "code", 667 | "execution_count": 38, 668 | "metadata": {}, 669 | "outputs": [], 670 | "source": [ 671 | "row_names = ['Normal Eqns- Naive',\n", 672 | " 'Normal Eqns- Cholesky', \n", 673 | " 'QR Factorization', \n", 674 | " 'SVD', \n", 675 | " 'Scipy lstsq']\n", 676 | "\n", 677 | "name2func = {'Normal Eqns- Naive': 'ls_naive', \n", 678 | " 'Normal Eqns- Cholesky': 'ls_chol', \n", 679 | " 'QR Factorization': 'ls_qr',\n", 680 | " 'SVD': 'ls_svd',\n", 681 | " 'Scipy lstsq': 'scipylstq'}" 682 | ] 683 | }, 684 | { 685 | "cell_type": "code", 686 | "execution_count": 39, 687 | "metadata": {}, 688 | "outputs": [], 689 | "source": [ 690 | "m_array = np.array([100, 1000, 10000])\n", 691 | "n_array = np.array([20, 100, 1000])" 692 | ] 693 | }, 694 | { 695 | "cell_type": "code", 696 | "execution_count": 40, 697 | "metadata": {}, 698 | "outputs": [], 699 | "source": [ 700 | "index = pd.MultiIndex.from_product([m_array, n_array], names=['# rows', '# cols'])\n", 701 | "pd.options.display.float_format = '{:,.6f}'.format\n", 702 | "df = pd.DataFrame(index=row_names, columns=index)\n", 703 | "df_error = pd.DataFrame(index=row_names, columns=index)" 704 | ] 705 | }, 706 | { 707 | "cell_type": "code", 708 | "execution_count": null, 709 | "metadata": {}, 710 | "outputs": [], 711 | "source": [ 712 | "# %%prun\n", 713 | "for m in m_array:\n", 714 | " for n in n_array:\n", 
715 | " if m >= n: \n", 716 | " x = np.random.uniform(-10,10,n)\n", 717 | " A = np.random.uniform(-40,40,[m,n]) # removed np.asfortranarray\n", 718 | " b = np.matmul(A, x) + np.random.normal(0,2,m)\n", 719 | " for name in row_names:\n", 720 | " fcn = name2func[name]\n", 721 | " t = timeit.timeit(fcn + '(A,b)', number=5, globals=globals())\n", 722 | " df.loc[name, (m,n)] = t\n", 723 | " coeffs = globals()[fcn](A, b)\n", 724 | " reg_met = regr_metrics(b, A @ coeffs)\n", 725 | " df_error.loc[name, (m,n)] = reg_met[0]" 726 | ] 727 | }, 728 | { 729 | "cell_type": "code", 730 | "execution_count": null, 731 | "metadata": {}, 732 | "outputs": [], 733 | "source": [ 734 | "df" 735 | ] 736 | }, 737 | { 738 | "cell_type": "code", 739 | "execution_count": null, 740 | "metadata": {}, 741 | "outputs": [], 742 | "source": [ 743 | "df_error" 744 | ] 745 | }, 746 | { 747 | "cell_type": "code", 748 | "execution_count": null, 749 | "metadata": {}, 750 | "outputs": [], 751 | "source": [ 752 | "store = pd.HDFStore('least_squares_results.h5')" 753 | ] 754 | }, 755 | { 756 | "cell_type": "code", 757 | "execution_count": null, 758 | "metadata": {}, 759 | "outputs": [], 760 | "source": [ 761 | "store['df'] = df" 762 | ] 763 | }, 764 | { 765 | "cell_type": "markdown", 766 | "metadata": {}, 767 | "source": [ 768 | "### Conditioning & Stability\n", 769 | "\n", 770 | " The condition number is a measure of how much small changes to the input cause the output to change. The relative condition number is defined by \n", 771 | "\n", 772 | "$$ \\kappa = \\sup_{\\delta x} \\frac{||\\delta f ||}{|| f(x)||} / \\frac{||\\delta x||}{||x||} $$\n", 773 | "\n", 774 | "\n", 775 | " Conditioning : perturbation behavior of a mathematical problem\n", 776 | " Stability : perturbation behavior of an algorithm used to solve that problem on a computer\n", 777 | "\n", 778 | "\n", 779 | "### Conditioning Example\n", 780 | " The problem of computing eigenvalues of a non-symmetric matrix is often ill-conditioned" 781 | ] 782
| }, 783 | { 784 | "cell_type": "code", 785 | "execution_count": 48, 786 | "metadata": {}, 787 | "outputs": [], 788 | "source": [ 789 | "A = [[1, 1000], [0, 1]]\n", 790 | "B = [[1, 1000], [0.001, 1]]" 791 | ] 792 | }, 793 | { 794 | "cell_type": "code", 795 | "execution_count": 49, 796 | "metadata": {}, 797 | "outputs": [], 798 | "source": [ 799 | "wA, vrA = scipy.linalg.eig(A)\n", 800 | "wB, vrB = scipy.linalg.eig(B)" 801 | ] 802 | }, 803 | { 804 | "cell_type": "code", 805 | "execution_count": 50, 806 | "metadata": {}, 807 | "outputs": [ 808 | { 809 | "data": { 810 | "text/plain": [ 811 | "(array([1.+0.j, 1.+0.j]), array([2.+0.j, 0.+0.j]))" 812 | ] 813 | }, 814 | "execution_count": 50, 815 | "metadata": {}, 816 | "output_type": "execute_result" 817 | } 818 | ], 819 | "source": [ 820 | "wA, wB" 821 | ] 822 | }, 823 | { 824 | "cell_type": "markdown", 825 | "metadata": {}, 826 | "source": [ 827 | "### Condition Number of a Matrix\n", 828 | "\n", 829 | "The product $||A|| || A^{-1}|| $ comes up so often that it has its own name: the condition number of A. Note that normally we talk about the conditioning of problems, not matrices.\n", 830 | "\n", 831 | " The condition number of A relates to:\n", 832 | " 1. computing b given A and x in Ax = b\n", 833 | " 2.
computing x given A and b in Ax = b" 834 | ] 835 | }, 836 | { 837 | "cell_type": "markdown", 838 | "metadata": {}, 839 | "source": [ 840 | "### Matrix Inversion is Unstable" 841 | ] 842 | }, 843 | { 844 | "cell_type": "code", 845 | "execution_count": 51, 846 | "metadata": {}, 847 | "outputs": [], 848 | "source": [ 849 | "from scipy.linalg import hilbert" 850 | ] 851 | }, 852 | { 853 | "cell_type": "code", 854 | "execution_count": 52, 855 | "metadata": {}, 856 | "outputs": [], 857 | "source": [ 858 | "n = 14\n", 859 | "A = hilbert(n)\n", 860 | "x = np.random.uniform(-10,10,n)\n", 861 | "b = A @ x" 862 | ] 863 | }, 864 | { 865 | "cell_type": "code", 866 | "execution_count": 53, 867 | "metadata": {}, 868 | "outputs": [], 869 | "source": [ 870 | "A_inv = np.linalg.inv(A)" 871 | ] 872 | }, 873 | { 874 | "cell_type": "code", 875 | "execution_count": 54, 876 | "metadata": {}, 877 | "outputs": [ 878 | { 879 | "data": { 880 | "text/plain": [ 881 | "4.7177114010624965" 882 | ] 883 | }, 884 | "execution_count": 54, 885 | "metadata": {}, 886 | "output_type": "execute_result" 887 | } 888 | ], 889 | "source": [ 890 | "np.linalg.norm(np.eye(n) - A @ A_inv)" 891 | ] 892 | }, 893 | { 894 | "cell_type": "code", 895 | "execution_count": 55, 896 | "metadata": {}, 897 | "outputs": [ 898 | { 899 | "data": { 900 | "text/plain": [ 901 | "9.274324459181185e+17" 902 | ] 903 | }, 904 | "execution_count": 55, 905 | "metadata": {}, 906 | "output_type": "execute_result" 907 | } 908 | ], 909 | "source": [ 910 | "np.linalg.cond(A)" 911 | ] 912 | }, 913 | { 914 | "cell_type": "code", 915 | "execution_count": 56, 916 | "metadata": {}, 917 | "outputs": [ 918 | { 919 | "data": { 920 | "text/plain": [ 921 | "array([[ 1. , 0. , 0. , 0.0003, -0.0008, -0.0033, 0.0068,\n", 922 | " -0.0587, -0.0236, 0.3125, -0.9121, -0.1484, 0.0065, -0.0569],\n", 923 | " [-0. , 1. , -0.0001, 0.0014, -0.0126, 0.0785, -0.1674,\n", 924 | " 0.2113, -0.2763, 0.625 , -0.9451, -0.1367, 0.1234, -0.0101],\n", 925 | " [ 0. , -0. 
, 1. , -0.0002, 0.002 , 0.002 , 0.0156,\n", 926 | " -0.0156, 0.3125, 0.125 , -0.2812, -0.0312, 0. , -0.0127],\n", 927 | " [ 0. , 0. , 0. , 0.9999, 0.0047, -0.0407, 0.0894,\n", 928 | " -0.6306, 0.9308, -1.4375, 1.2176, -1.0351, 0.4388, -0.0656],\n", 929 | " [ 0. , -0. , 0. , -0. , 1.0011, 0.0031, 0.0148,\n", 930 | " -0.1498, 0.4747, 0. , 0.2246, -0.0547, 0.0454, -0.0039],\n", 931 | " [ 0. , -0. , 0. , -0.0002, 0.0062, 0.964 , 0.1136,\n", 932 | " -0.3853, 0.6504, -0.4688, 0.5615, -0.2886, 0.2377, -0.0267],\n", 933 | " [ 0. , -0. , 0. , -0.0005, 0.007 , -0.0264, 1.0804,\n", 934 | " -0.2277, 0.6103, -0.2188, 0.1478, -0.1211, 0.0712, -0.0013],\n", 935 | " [-0. , -0. , 0. , 0.0002, 0.0009, -0.0055, 0.0384,\n", 936 | " 0.9296, 0.2239, -0.1875, 0.3086, -0.1875, 0.1137, -0.0081],\n", 937 | " [ 0. , -0. , 0. , -0.0007, 0.0071, -0.0468, 0.1594,\n", 938 | " -0.6134, 2.0899, -1.375 , 1.2291, -0.8576, 0.336 , -0.0507],\n", 939 | " [-0. , -0. , 0. , -0.0001, 0.0026, -0.0024, 0.032 ,\n", 940 | " -0.0885, 0.1898, 1. , -0.1539, -0.0812, 0.0166, 0.0062],\n", 941 | " [ 0. , -0. , -0. , -0.0001, 0.0012, -0.0099, 0.0268,\n", 942 | " -0.1045, 0.1998, -0.375 , 1.0435, -0.1308, 0.078 , -0.0024],\n", 943 | " [ 0. , -0. , 0. , -0.0004, 0.0057, -0.0331, 0.114 ,\n", 944 | " -0.3636, 0.9706, -0.9375, 0.6756, 0.4871, 0.2194, -0.024 ],\n", 945 | " [ 0. , -0. , 0. , -0.0003, 0.0026, -0.0082, -0.0202,\n", 946 | " 0.1145, 0.0151, 0.5938, -0.6964, 0.3672, 0.849 , 0.0227],\n", 947 | " [-0. , 0. , -0. 
, 0.0003, -0.0026, 0.0154, -0.0878,\n", 948 | " 0.1345, -0.3189, 0.4688, -0.9179, 0.375 , -0.0765, 1.0153]])" 949 | ] 950 | }, 951 | "execution_count": 56, 952 | "metadata": {}, 953 | "output_type": "execute_result" 954 | } 955 | ], 956 | "source": [ 957 | "A @ A_inv" 958 | ] 959 | }, 960 | { 961 | "cell_type": "code", 962 | "execution_count": 57, 963 | "metadata": {}, 964 | "outputs": [], 965 | "source": [ 966 | "row_names = ['Normal Eqns- Naive',\n", 967 | " 'QR Factorization', \n", 968 | " 'SVD', \n", 969 | " 'Scipy lstsq']\n", 970 | "\n", 971 | "name2func = {'Normal Eqns- Naive': 'ls_naive', \n", 972 | " 'QR Factorization': 'ls_qr',\n", 973 | " 'SVD': 'ls_svd',\n", 974 | " 'Scipy lstsq': 'scipylstq'}" 975 | ] 976 | }, 977 | { 978 | "cell_type": "code", 979 | "execution_count": 58, 980 | "metadata": {}, 981 | "outputs": [], 982 | "source": [ 983 | "pd.options.display.float_format = '{:,.9f}'.format\n", 984 | "df = pd.DataFrame(index=row_names, columns=['Time', 'Error'])" 985 | ] 986 | }, 987 | { 988 | "cell_type": "markdown", 989 | "metadata": {}, 990 | "source": [ 991 | " Even if A is incredibly sparse, its inverse is generally dense. For large matrices, the inverse could be so dense that it does not fit in memory\n", 992 | "\n", 993 | "\n", 994 | "## Runtime\n", 995 | "1. Matrix Inversion : $2n^3$\n", 996 | "2. Matrix Multiplication : $n^3$\n", 997 | "3. Cholesky : $\\frac{1}{3} n^3$\n", 998 | "4. QR, Gram Schmidt : $ 2mn^2$, $m \\ge n$\n", 999 | "5. QR, Householder : $2mn^2 - \\frac{2}{3} n^3$\n", 1000 | "6.
Solving a triangular system : $n^2$" 1001 | ] 1002 | } 1003 | ], 1004 | "metadata": { 1005 | "kernelspec": { 1006 | "display_name": "Python 3.7.0 64-bit", 1007 | "language": "python", 1008 | "name": "python3" 1009 | }, 1010 | "language_info": { 1011 | "codemirror_mode": { 1012 | "name": "ipython", 1013 | "version": 3 1014 | }, 1015 | "file_extension": ".py", 1016 | "mimetype": "text/x-python", 1017 | "name": "python", 1018 | "nbconvert_exporter": "python", 1019 | "pygments_lexer": "ipython3", 1020 | "version": "3.7.0" 1021 | }, 1022 | "orig_nbformat": 4, 1023 | "vscode": { 1024 | "interpreter": { 1025 | "hash": "c0f2ef85a013b3e049ebe5ad4c39b42bd34e85c4b0686b7bef6ce5ff62aa91e1" 1026 | } 1027 | } 1028 | }, 1029 | "nbformat": 4, 1030 | "nbformat_minor": 2 1031 | } 1032 | -------------------------------------------------------------------------------- /Basic Numerical Linear Algebra/1-Scalars,_Vectors,_Matrices_and_Tensors.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "authorship_tag": "ABX9TyMFsIYh5fmPEcJfwyqudNyo", 8 | "include_colab_link": true 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 | }, 14 | "language_info": { 15 | "name": "python" 16 | } 17 | }, 18 | "cells": [ 19 | { 20 | "cell_type": "markdown", 21 | "metadata": { 22 | "id": "view-in-github", 23 | "colab_type": "text" 24 | }, 25 | "source": [ 26 | "\"Open" 27 | ] 28 | }, 29 | { 30 | "cell_type": "code", 31 | "execution_count": 1, 32 | "metadata": { 33 | "id": "ldP4Mp1AJ-q3" 34 | }, 35 | "outputs": [], 36 | "source": [ 37 | "# import numpy\n", 38 | "\n", 39 | "import numpy as np" 40 | ] 41 | }, 42 | { 43 | "cell_type": "markdown", 44 | "source": [ 45 | "Author: [Monit Sharma](https://github.com/MonitSharma),\n", 46 | " LinkedIn: [Monit Sharma](https://www.linkedin.com/in/monitsharma/),\n", 47 | " Twitter: 
[@MonitSharma1729](https://twitter.com/MonitSharma1729)" 48 | ], 49 | "metadata": { 50 | "id": "9pfwsve3QvTV" 51 | } 52 | }, 53 | { 54 | "cell_type": "markdown", 55 | "source": [ 56 | "# Introduction\n", 57 | "\n", 58 | "This first chapter is quite light and concerns the basic elements used in linear algebra and their definitions. It also introduces important functions in Python/Numpy that we will use throughout this series. It will explain how to create and use vectors and matrices through examples." 59 | ], 60 | "metadata": { 61 | "id": "ee0f-Du-KJUD" 62 | } 63 | }, 64 | { 65 | "cell_type": "markdown", 66 | "source": [ 67 | "# Scalars, Vectors, Matrices and Tensors\n", 68 | "\n", 69 | "Let's start with some basic definitions\n", 70 | "\n", 71 | "![](https://github.com/akhilvasvani/Linear-Algebra-Basics/blob/master/Chapters/2.01%20Scalars%2C%20Vectors%2C%20Matrices%20and%20Tensors/images/scalar-vector-matrix-tensor.png)\n", 72 | "\n", 73 | "\n", 74 | "\n", 75 | "\n", 76 | "*Difference between a scalar, a vector, a matrix and a tensor*\n", 77 | "\n", 78 | "
  • A scalar is a single number, or equivalently a matrix with a single entry.\n", 79 | "\n", 80 | "
  • A vector is a 1-D array of numbers. Another way to think of vectors is identifying points in space with each element giving the coordinate along a different axis.\n", 81 | "\n", 82 | "$$ {x} =\\begin{bmatrix}\n", 83 | " x_1 \\\\\\\\\n", 84 | " x_2 \\\\\\\\\n", 85 | " \\vdots \\\\\\\\\n", 86 | " x_n\n", 87 | "\\end{bmatrix}$$\n", 88 | "\n", 89 | "\n", 90 | "
  • A matrix is a 2-D array where each element is identified by two indices (ROW then COLUMN).\n", 91 | "\n", 92 | "$$ {A}=\n", 93 | "\\begin{bmatrix}\n", 94 | " A_{1,1} & A_{1,2} & \\cdots & A_{1,n} \\\\\\\\\n", 95 | " A_{2,1} & A_{2,2} & \\cdots & A_{2,n} \\\\\\\\\n", 96 | " \\vdots & \\vdots & \\ddots & \\vdots \\\\\\\\\n", 97 | " A_{m,1} & A_{m,2} & \\cdots & A_{m,n}\n", 98 | "\\end{bmatrix} $$\n", 99 | "\n", 100 | "
  • A tensor is an $n$-dimensional array with $n>2$\n", 102 | "\n", 103 | "\n", 104 | "-------\n", 105 | "\n", 106 | "1. Scalars are written in lowercase italics. For instance: $n$\n", 107 | "\n", 108 | "2. Vectors are written in lowercase, bold italics. For instance: $\\boldsymbol{x}$\n", 109 | "\n", 110 | "3. Matrices are written in uppercase, bold italics. For instance: $\\boldsymbol{X}$ \n" 111 | ], 112 | "metadata": { 113 | "id": "R_nYYdKLKS_b" 114 | } 115 | }, 116 | { 117 | "cell_type": "markdown", 118 | "source": [ 119 | "## Example 1:\n", 120 | "\n", 121 | "### Create a vector with Python and Numpy\n", 122 | "\n", 123 | "*Coding tip*: Unlike the `matrix()` function, which always creates $2$-dimensional matrices, the `array()` function can create $n$-dimensional arrays. The main advantage of `matrix()` is its convenience methods (conjugate transpose, inverse...), but the NumPy documentation no longer recommends that class. We will use the `array()` function in this series.\n", 124 | "\n", 125 | "\n", 126 | "\n", 127 | "------\n", 128 | "\n", 129 | "We will start by creating a vector. 
This is just a $1$-dimensional array:" 130 | ], 131 | "metadata": { 132 | "id": "jV2PI9z7LUOY" 133 | } 134 | }, 135 | { 136 | "cell_type": "code", 137 | "source": [ 138 | "x = np.array([1,2,3,4])\n", 139 | "x" 140 | ], 141 | "metadata": { 142 | "colab": { 143 | "base_uri": "https://localhost:8080/" 144 | }, 145 | "id": "dSnRFdwsKI1X", 146 | "outputId": "95497608-ed11-4e53-b96d-204c97d138de" 147 | }, 148 | "execution_count": 6, 149 | "outputs": [ 150 | { 151 | "output_type": "execute_result", 152 | "data": { 153 | "text/plain": [ 154 | "array([1, 2, 3, 4])" 155 | ] 156 | }, 157 | "metadata": {}, 158 | "execution_count": 6 159 | } 160 | ] 161 | }, 162 | { 163 | "cell_type": "markdown", 164 | "source": [ 165 | "## Example 2:\n", 166 | "\n", 167 | "### Create a $3 \\times 2$ matrix with nested brackets.\n", 168 | "\n", 169 | "The `array()` function can also create $2$ dimensional arrays with nested brackets:" 170 | ], 171 | "metadata": { 172 | "id": "NQocF8OzMQy4" 173 | } 174 | }, 175 | { 176 | "cell_type": "code", 177 | "source": [ 178 | "A = np.array([[1,2],[3,4],[5,6]])\n", 179 | "A" 180 | ], 181 | "metadata": { 182 | "colab": { 183 | "base_uri": "https://localhost:8080/" 184 | }, 185 | "id": "bkr1vJtwMPQG", 186 | "outputId": "6c9fc8c4-8aa9-4b3e-f5ac-12fd5928b210" 187 | }, 188 | "execution_count": 3, 189 | "outputs": [ 190 | { 191 | "output_type": "execute_result", 192 | "data": { 193 | "text/plain": [ 194 | "array([[1, 2],\n", 195 | " [3, 4],\n", 196 | " [5, 6]])" 197 | ] 198 | }, 199 | "metadata": {}, 200 | "execution_count": 3 201 | } 202 | ] 203 | }, 204 | { 205 | "cell_type": "markdown", 206 | "source": [ 207 | "### Shape\n", 208 | "\n", 209 | "The shape of an array (that is to say its dimensions) tells you the number of values for each dimension. For a \n", 210 | "$2$-dimensional array it will give you the number of rows and the number of columns. Let's find the shape of our preceding \n", 211 | "$2$-dimensional array `A`. 
Since `A` is a Numpy array (it was created with the `array()` function) you can access its shape with:\n" 212 | ], 213 | "metadata": { 214 | "id": "AR3ZBBYeMnnN" 215 | } 216 | }, 217 | { 218 | "cell_type": "code", 219 | "source": [ 220 | "A.shape" 221 | ], 222 | "metadata": { 223 | "colab": { 224 | "base_uri": "https://localhost:8080/" 225 | }, 226 | "id": "5oXR4SydMmVR", 227 | "outputId": "695b51b5-f4ef-411c-894b-d57b382bf96f" 228 | }, 229 | "execution_count": 4, 230 | "outputs": [ 231 | { 232 | "output_type": "execute_result", 233 | "data": { 234 | "text/plain": [ 235 | "(3, 2)" 236 | ] 237 | }, 238 | "metadata": {}, 239 | "execution_count": 4 240 | } 241 | ] 242 | }, 243 | { 244 | "cell_type": "markdown", 245 | "source": [ 246 | "We can see that $A$ has $3$ rows and $2$ columns.\n", 247 | "\n", 248 | "-----\n", 249 | "\n", 250 | "Let's see the shape of our first vector:" 251 | ], 252 | "metadata": { 253 | "id": "0OglsJDzM2g3" 254 | } 255 | }, 256 | { 257 | "cell_type": "code", 258 | "source": [ 259 | "x.shape" 260 | ], 261 | "metadata": { 262 | "colab": { 263 | "base_uri": "https://localhost:8080/" 264 | }, 265 | "id": "mwiEng71M1OD", 266 | "outputId": "abe6ccea-9d7f-4c2b-fb8a-5df05559ccdf" 267 | }, 268 | "execution_count": 8, 269 | "outputs": [ 270 | { 271 | "output_type": "execute_result", 272 | "data": { 273 | "text/plain": [ 274 | "(4,)" 275 | ] 276 | }, 277 | "metadata": {}, 278 | "execution_count": 8 279 | } 280 | ] 281 | }, 282 | { 283 | "cell_type": "markdown", 284 | "source": [ 285 | "As expected you can see that $x$ has only one dimension. 
The number corresponds to the length of the array:" 286 | ], 287 | "metadata": { 288 | "id": "F1zhwj0oNHTn" 289 | } 290 | }, 291 | { 292 | "cell_type": "code", 293 | "source": [ 294 | "len(x)" 295 | ], 296 | "metadata": { 297 | "colab": { 298 | "base_uri": "https://localhost:8080/" 299 | }, 300 | "id": "XjXRos4mM-7D", 301 | "outputId": "f8d725a6-02d5-4571-8b54-8f46d19cd9b4" 302 | }, 303 | "execution_count": 9, 304 | "outputs": [ 305 | { 306 | "output_type": "execute_result", 307 | "data": { 308 | "text/plain": [ 309 | "4" 310 | ] 311 | }, 312 | "metadata": {}, 313 | "execution_count": 9 314 | } 315 | ] 316 | }, 317 | { 318 | "cell_type": "markdown", 319 | "source": [ 320 | "## Transposition\n", 321 | "\n", 322 | "With transposition you can convert a row vector to a column vector and vice versa.\n", 323 | "\n", 324 | "-----\n", 325 | "\n", 326 | "The transpose $A^T$\n", 327 | " of the matrix $A$\n", 328 | " mirrors $A$ across its main diagonal, so that rows become columns: $(A^T)_{i,j} = A_{j,i}$. The same rule applies whether or not the matrix is square (same number of columns and rows); the example below is not square.\n", 329 | "\n", 330 | "\n", 331 | " -----\n", 332 | "\n", 333 | " $$ {A}=\n", 334 | "\\begin{bmatrix}\n", 335 | " A_{1,1} & A_{1,2} \\\\\\\\\n", 336 | " A_{2,1} & A_{2,2} \\\\\\\\\n", 337 | " A_{3,1} & A_{3,2}\n", 338 | "\\end{bmatrix} $$\n", 339 | "\n", 340 | "\n", 341 | "\n", 342 | "----\n", 343 | "\n", 344 | "$${A}^{\\text{T}}=\n", 345 | "\\begin{bmatrix}\n", 346 | " A_{1,1} & A_{2,1} & A_{3,1} \\\\\\\\\n", 347 | " A_{1,2} & A_{2,2} & A_{3,2}\n", 348 | "\\end{bmatrix}$$" 349 | ], 350 | "metadata": { 351 | "id": "eG7xFqO4NW0i" 352 | } 353 | }, 354 | { 355 | "cell_type": "markdown", 356 | "source": [ 357 | "## Example 3:\n", 358 | "\n", 359 | "### Create a matrix A and transpose it" 360 | ], 361 | "metadata": { 362 | "id": "LXq64I_FN5ze" 363 | } 364 | }, 365 | { 366 | "cell_type": "code", 367 | "source": [ 368 | "A = np.array([[1, 2], [3, 4], [5, 6]])\n", 369 | "A" 370 | ], 371 | "metadata": { 372 | "colab": { 373 | "base_uri": "https://localhost:8080/" 374 | }, 
375 | "id": "91LrePujNRCv", 376 | "outputId": "819cd15e-0741-4af1-daa6-6a79d5e7f41d" 377 | }, 378 | "execution_count": 10, 379 | "outputs": [ 380 | { 381 | "output_type": "execute_result", 382 | "data": { 383 | "text/plain": [ 384 | "array([[1, 2],\n", 385 | " [3, 4],\n", 386 | " [5, 6]])" 387 | ] 388 | }, 389 | "metadata": {}, 390 | "execution_count": 10 391 | } 392 | ] 393 | }, 394 | { 395 | "cell_type": "code", 396 | "source": [ 397 | "A_t = A.T\n", 398 | "A_t" 399 | ], 400 | "metadata": { 401 | "colab": { 402 | "base_uri": "https://localhost:8080/" 403 | }, 404 | "id": "v-rILxjAOBXT", 405 | "outputId": "f1e597b5-952e-49ff-91d5-3b70c91bffc0" 406 | }, 407 | "execution_count": 11, 408 | "outputs": [ 409 | { 410 | "output_type": "execute_result", 411 | "data": { 412 | "text/plain": [ 413 | "array([[1, 3, 5],\n", 414 | " [2, 4, 6]])" 415 | ] 416 | }, 417 | "metadata": {}, 418 | "execution_count": 11 419 | } 420 | ] 421 | }, 422 | { 423 | "cell_type": "markdown", 424 | "source": [ 425 | "Checking the dimensions" 426 | ], 427 | "metadata": { 428 | "id": "5a4Tw6yNOD50" 429 | } 430 | }, 431 | { 432 | "cell_type": "code", 433 | "source": [ 434 | "A.shape" 435 | ], 436 | "metadata": { 437 | "colab": { 438 | "base_uri": "https://localhost:8080/" 439 | }, 440 | "id": "To83HyyLODKx", 441 | "outputId": "c78e42b8-6d1c-49fb-d64a-3b8ead957f8c" 442 | }, 443 | "execution_count": 12, 444 | "outputs": [ 445 | { 446 | "output_type": "execute_result", 447 | "data": { 448 | "text/plain": [ 449 | "(3, 2)" 450 | ] 451 | }, 452 | "metadata": {}, 453 | "execution_count": 12 454 | } 455 | ] 456 | }, 457 | { 458 | "cell_type": "code", 459 | "source": [ 460 | "A_t.shape" 461 | ], 462 | "metadata": { 463 | "colab": { 464 | "base_uri": "https://localhost:8080/" 465 | }, 466 | "id": "wxKVCRuHOGgT", 467 | "outputId": "2800321c-8be6-44f9-fe93-1d91850fb047" 468 | }, 469 | "execution_count": 13, 470 | "outputs": [ 471 | { 472 | "output_type": "execute_result", 473 | "data": { 474 | "text/plain": [ 
475 | "(2, 3)" 476 | ] 477 | }, 478 | "metadata": {}, 479 | "execution_count": 13 480 | } 481 | ] 482 | }, 483 | { 484 | "cell_type": "markdown", 485 | "source": [ 486 | "We can see that transposition turns the number of columns into the number of rows, and vice versa." 487 | ], 488 | "metadata": { 489 | "id": "UTvfx2nVOOTZ" 490 | } 491 | }, 492 | { 493 | "cell_type": "markdown", 494 | "source": [ 495 | "# Addition\n", 496 | "\n", 497 | "Matrices can be added if they have the same shape:\n", 498 | "\n", 499 | "$$ A + B = C $$\n", 500 | "\n", 501 | "Each cell of $A$ is added to the corresponding cell of $B$:\n", 502 | "\n", 503 | "$$ {A}_{i,j} + {B}_{i,j} = {C}_{i,j}$$\n", 504 | "\n", 505 | "$i$ is the row index and $j$ the column index." 506 | ], 507 | "metadata": { 508 | "id": "Yk8SZlkcOQ3x" 509 | } 510 | }, 511 | { 512 | "cell_type": "markdown", 513 | "source": [ 514 | "$$ \\begin{bmatrix}\n", 515 | " A_{1,1} & A_{1,2} \\\\\\\\\n", 516 | " A_{2,1} & A_{2,2} \\\\\\\\\n", 517 | " A_{3,1} & A_{3,2}\n", 518 | "\\end{bmatrix}+\n", 519 | "\\begin{bmatrix}\n", 520 | " B_{1,1} & B_{1,2} \\\\\\\\\n", 521 | " B_{2,1} & B_{2,2} \\\\\\\\\n", 522 | " B_{3,1} & B_{3,2}\n", 523 | "\\end{bmatrix}=\n", 524 | "\\begin{bmatrix}\n", 525 | " A_{1,1} + B_{1,1} & A_{1,2} + B_{1,2} \\\\\\\\\n", 526 | " A_{2,1} + B_{2,1} & A_{2,2} + B_{2,2} \\\\\\\\\n", 527 | " A_{3,1} + B_{3,1} & A_{3,2} + B_{3,2}\n", 528 | "\\end{bmatrix} $$\n", 529 | "\n", 530 | "\n", 531 | "The shapes of $A$, $B$ and $C$ are identical. 
Let's check that in an example" 532 | ], 533 | "metadata": { 534 | "id": "N6JRmK-VOp-o" 535 | } 536 | }, 537 | { 538 | "cell_type": "markdown", 539 | "source": [ 540 | "## Example 4\n", 541 | "\n", 542 | "### Create two matrices A and B and add them\n", 543 | "\n", 544 | "With Numpy you can add matrices just as you would add vectors or scalars.\n" 545 | ], 546 | "metadata": { 547 | "id": "ikBG8IxVO0Fq" 548 | } 549 | }, 550 | { 551 | "cell_type": "code", 552 | "source": [ 553 | "A = np.array([[1, 2], [3, 4], [5, 6]])\n", 554 | "A" 555 | ], 556 | "metadata": { 557 | "colab": { 558 | "base_uri": "https://localhost:8080/" 559 | }, 560 | "id": "EnBVLOniOIwt", 561 | "outputId": "4e1f5e91-8086-4481-ca95-c023e5f3e3c9" 562 | }, 563 | "execution_count": 14, 564 | "outputs": [ 565 | { 566 | "output_type": "execute_result", 567 | "data": { 568 | "text/plain": [ 569 | "array([[1, 2],\n", 570 | " [3, 4],\n", 571 | " [5, 6]])" 572 | ] 573 | }, 574 | "metadata": {}, 575 | "execution_count": 14 576 | } 577 | ] 578 | }, 579 | { 580 | "cell_type": "code", 581 | "source": [ 582 | "B = np.array([[2, 5], [7, 4], [4, 3]])\n", 583 | "B" 584 | ], 585 | "metadata": { 586 | "colab": { 587 | "base_uri": "https://localhost:8080/" 588 | }, 589 | "id": "kkwADgR8O8s8", 590 | "outputId": "ff748f7c-2f0c-4df9-97f6-20affaca1cb7" 591 | }, 592 | "execution_count": 15, 593 | "outputs": [ 594 | { 595 | "output_type": "execute_result", 596 | "data": { 597 | "text/plain": [ 598 | "array([[2, 5],\n", 599 | " [7, 4],\n", 600 | " [4, 3]])" 601 | ] 602 | }, 603 | "metadata": {}, 604 | "execution_count": 15 605 | } 606 | ] 607 | }, 608 | { 609 | "cell_type": "code", 610 | "source": [ 611 | "# Add matrices A and B\n", 612 | "C = A + B\n", 613 | "C" 614 | ], 615 | "metadata": { 616 | "colab": { 617 | "base_uri": "https://localhost:8080/" 618 | }, 619 | "id": "xN1aUJWwO-DV", 620 | "outputId": "b9147777-c03a-4d20-9f34-a95a57e86e47" 621 | }, 622 | "execution_count": 16, 623 | "outputs": [ 624 | { 625 | 
"output_type": "execute_result", 626 | "data": { 627 | "text/plain": [ 628 | "array([[ 3, 7],\n", 629 | " [10, 8],\n", 630 | " [ 9, 9]])" 631 | ] 632 | }, 633 | "metadata": {}, 634 | "execution_count": 16 635 | } 636 | ] 637 | }, 638 | { 639 | "cell_type": "markdown", 640 | "source": [ 641 | "It is also possible to add a scalar to a matrix. This means adding this scalar to each cell of the matrix.\n", 642 | "\n", 643 | "\n", 644 | "$$ \\alpha+ \\begin{bmatrix}\n", 645 | " A_{1,1} & A_{1,2} \\\\\\\\\n", 646 | " A_{2,1} & A_{2,2} \\\\\\\\\n", 647 | " A_{3,1} & A_{3,2}\n", 648 | "\\end{bmatrix}=\n", 649 | "\\begin{bmatrix}\n", 650 | " \\alpha + A_{1,1} & \\alpha + A_{1,2} \\\\\\\\\n", 651 | " \\alpha + A_{2,1} & \\alpha + A_{2,2} \\\\\\\\\n", 652 | " \\alpha + A_{3,1} & \\alpha + A_{3,2}\n", 653 | "\\end{bmatrix} $$" 654 | ], 655 | "metadata": { 656 | "id": "AESR1i-QPCC0" 657 | } 658 | }, 659 | { 660 | "cell_type": "markdown", 661 | "source": [ 662 | "## Example 5\n", 663 | "\n", 664 | "### Add a scalar to a matrix" 665 | ], 666 | "metadata": { 667 | "id": "JkuKbVylPGsj" 668 | } 669 | }, 670 | { 671 | "cell_type": "code", 672 | "source": [ 673 | "A" 674 | ], 675 | "metadata": { 676 | "colab": { 677 | "base_uri": "https://localhost:8080/" 678 | }, 679 | "id": "jDG-gHr_O_5x", 680 | "outputId": "4e232bc6-adc4-4395-cdfe-2e1c3dafbff2" 681 | }, 682 | "execution_count": 17, 683 | "outputs": [ 684 | { 685 | "output_type": "execute_result", 686 | "data": { 687 | "text/plain": [ 688 | "array([[1, 2],\n", 689 | " [3, 4],\n", 690 | " [5, 6]])" 691 | ] 692 | }, 693 | "metadata": {}, 694 | "execution_count": 17 695 | } 696 | ] 697 | }, 698 | { 699 | "cell_type": "code", 700 | "source": [ 701 | "# Exemple: Add 4 to the matrix A\n", 702 | "C = A+4\n", 703 | "C" 704 | ], 705 | "metadata": { 706 | "colab": { 707 | "base_uri": "https://localhost:8080/" 708 | }, 709 | "id": "nnu816G7PLJV", 710 | "outputId": "4edf93ac-4fc4-4f77-9934-eb5b9006c0ab" 711 | }, 712 | "execution_count": 18, 713 
| "outputs": [ 714 | { 715 | "output_type": "execute_result", 716 | "data": { 717 | "text/plain": [ 718 | "array([[ 5, 6],\n", 719 | " [ 7, 8],\n", 720 | " [ 9, 10]])" 721 | ] 722 | }, 723 | "metadata": {}, 724 | "execution_count": 18 725 | } 726 | ] 727 | }, 728 | { 729 | "cell_type": "markdown", 730 | "source": [ 731 | "# Broadcasting\n", 732 | "\n", 733 | "Numpy can handle operations on arrays of different shapes. The smaller array is stretched, conceptually, to match the shape of the bigger one. The advantage is that this is done in C under the hood (like any vectorized operation in Numpy), without actually copying the data. In fact, we already used broadcasting in Example 5: the scalar was treated as an array of the same shape as $A$.\n", 734 | "\n", 735 | "Here is another generic example:\n", 736 | "\n", 737 | "$$ \\begin{bmatrix}\n", 738 | " A_{1,1} & A_{1,2} \\\\\\\\\n", 739 | " A_{2,1} & A_{2,2} \\\\\\\\\n", 740 | " A_{3,1} & A_{3,2}\n", 741 | "\\end{bmatrix}+\n", 742 | "\\begin{bmatrix}\n", 743 | " B_{1,1} \\\\\\\\\n", 744 | " B_{2,1} \\\\\\\\\n", 745 | " B_{3,1}\n", 746 | "\\end{bmatrix} $$\n", 747 | "\n", 748 | "is equivalent to\n", 749 | "\n", 750 | "$$ \\begin{bmatrix}\n", 751 | " A_{1,1} & A_{1,2} \\\\\\\\\n", 752 | " A_{2,1} & A_{2,2} \\\\\\\\\n", 753 | " A_{3,1} & A_{3,2}\n", 754 | "\\end{bmatrix}+\n", 755 | "\\begin{bmatrix}\n", 756 | " B_{1,1} & B_{1,1} \\\\\\\\\n", 757 | " B_{2,1} & B_{2,1} \\\\\\\\\n", 758 | " B_{3,1} & B_{3,1}\n", 759 | "\\end{bmatrix}=\n", 760 | "\\begin{bmatrix}\n", 761 | " A_{1,1} + B_{1,1} & A_{1,2} + B_{1,1} \\\\\\\\\n", 762 | " A_{2,1} + B_{2,1} & A_{2,2} + B_{2,1} \\\\\\\\\n", 763 | " A_{3,1} + B_{3,1} & A_{3,2} + B_{3,1}\n", 764 | "\\end{bmatrix} $$\n", 765 | "\n", 766 | "where the ($3\\times 1$\n", 767 | ") matrix is converted to the right shape ($3\\times 2$\n", 768 | ") by repeating its single column. Numpy does this automatically when the shapes are compatible, that is, when each pair of dimensions, compared from the last one backwards, is either equal or contains a 1."
769 | ], 770 | "metadata": { 771 | "id": "FkqUbhmsPN39" 772 | } 773 | }, 774 | { 775 | "cell_type": "markdown", 776 | "source": [ 777 | "## Example 6\n", 778 | "\n", 779 | "### Add two matrices of different shapes" 780 | ], 781 | "metadata": { 782 | "id": "ewtpjAF4Pxqi" 783 | } 784 | }, 785 | { 786 | "cell_type": "code", 787 | "source": [ 788 | "A = np.array([[1, 2], [3, 4], [5, 6]])\n", 789 | "A" 790 | ], 791 | "metadata": { 792 | "colab": { 793 | "base_uri": "https://localhost:8080/" 794 | }, 795 | "id": "G7YejOcnPMvE", 796 | "outputId": "2578a3b3-0cb3-46b2-b5ff-6c14d97ef527" 797 | }, 798 | "execution_count": 19, 799 | "outputs": [ 800 | { 801 | "output_type": "execute_result", 802 | "data": { 803 | "text/plain": [ 804 | "array([[1, 2],\n", 805 | " [3, 4],\n", 806 | " [5, 6]])" 807 | ] 808 | }, 809 | "metadata": {}, 810 | "execution_count": 19 811 | } 812 | ] 813 | }, 814 | { 815 | "cell_type": "code", 816 | "source": [ 817 | "B = np.array([[2], [4], [6]])\n", 818 | "B" 819 | ], 820 | "metadata": { 821 | "colab": { 822 | "base_uri": "https://localhost:8080/" 823 | }, 824 | "id": "-uMWNNUdP5m4", 825 | "outputId": "976f5374-6d89-4613-c888-fdf6895c1972" 826 | }, 827 | "execution_count": 20, 828 | "outputs": [ 829 | { 830 | "output_type": "execute_result", 831 | "data": { 832 | "text/plain": [ 833 | "array([[2],\n", 834 | " [4],\n", 835 | " [6]])" 836 | ] 837 | }, 838 | "metadata": {}, 839 | "execution_count": 20 840 | } 841 | ] 842 | }, 843 | { 844 | "cell_type": "code", 845 | "source": [ 846 | "# Broadcasting\n", 847 | "C=A+B\n", 848 | "C" 849 | ], 850 | "metadata": { 851 | "colab": { 852 | "base_uri": "https://localhost:8080/" 853 | }, 854 | "id": "g0yJonCvP7Sy", 855 | "outputId": "94e71fad-48ca-4358-eb7d-368bbf1a6208" 856 | }, 857 | "execution_count": 21, 858 | "outputs": [ 859 | { 860 | "output_type": "execute_result", 861 | "data": { 862 | "text/plain": [ 863 | "array([[ 3, 4],\n", 864 | " [ 7, 8],\n", 865 | " [11, 12]])" 866 | ] 867 | }, 868 | "metadata": {}, 
869 | "execution_count": 21 870 | } 871 | ] 872 | }, 873 | { 874 | "cell_type": "markdown", 875 | "source": [ 876 | "*Coding tip*: Sometimes row or column vectors do not have the right shape for broadcasting. We can use a trick (the `numpy.newaxis` object) to fix this." 877 | ], 878 | "metadata": { 879 | "id": "2ewYtHs7P_YV" 880 | } 881 | }, 882 | { 883 | "cell_type": "code", 884 | "source": [ 885 | "x = np.arange(4)\n", 886 | "x.shape" 887 | ], 888 | "metadata": { 889 | "colab": { 890 | "base_uri": "https://localhost:8080/" 891 | }, 892 | "id": "Y8rEDXELP9oY", 893 | "outputId": "94feba17-ad56-441b-f0ae-386ea81df7ec" 894 | }, 895 | "execution_count": 22, 896 | "outputs": [ 897 | { 898 | "output_type": "execute_result", 899 | "data": { 900 | "text/plain": [ 901 | "(4,)" 902 | ] 903 | }, 904 | "metadata": {}, 905 | "execution_count": 22 906 | } 907 | ] 908 | }, 909 | { 910 | "cell_type": "code", 911 | "source": [ 912 | "# Add a new axis, turning shape (4,) into (4, 1)\n", 913 | "x[:, np.newaxis]" 914 | ], 915 | "metadata": { 916 | "colab": { 917 | "base_uri": "https://localhost:8080/" 918 | }, 919 | "id": "riZhBFZYQGio", 920 | "outputId": "52b087de-6261-4df8-cae7-658bb54c9279" 921 | }, 922 | "execution_count": 23, 923 | "outputs": [ 924 | { 925 | "output_type": "execute_result", 926 | "data": { 927 | "text/plain": [ 928 | "array([[0],\n", 929 | " [1],\n", 930 | " [2],\n", 931 | " [3]])" 932 | ] 933 | }, 934 | "metadata": {}, 935 | "execution_count": 23 936 | } 937 | ] 938 | }, 939 | { 940 | "cell_type": "code", 941 | "source": [ 942 | "A = np.random.randn(4,3)\n", 943 | "A" 944 | ], 945 | "metadata": { 946 | "colab": { 947 | "base_uri": "https://localhost:8080/" 948 | }, 949 | "id": "6X-7Yy8WQIiY", 950 | "outputId": "9339c242-841d-4629-d117-b9be66cc286a" 951 | }, 952 | "execution_count": 24, 953 | "outputs": [ 954 | { 955 | "output_type": "execute_result", 956 | "data": { 957 | "text/plain": [ 958 | "array([[ 0.37898843, 0.42689999, -1.34790859],\n", 959 | " [ 1.59115004, 
0.59600385, 0.25510038],\n", 960 | " [ 1.08659174, -1.6311077 , -0.78809825],\n", 961 | " [ 1.34425773, 0.07104051, 0.06759489]])" 962 | ] 963 | }, 964 | "metadata": {}, 965 | "execution_count": 24 966 | } 967 | ] 968 | }, 969 | { 970 | "cell_type": "code", 971 | "source": [ 972 | "# A has shape (4, 3) and x has shape (4,), so this raises a ValueError\n", 973 | "try:\n", 974 | " A - x\n", 975 | "except ValueError:\n", 976 | " print(\"Operation cannot be completed. Dimension mismatch\")" 977 | ], 978 | "metadata": { 979 | "colab": { 980 | "base_uri": "https://localhost:8080/" 981 | }, 982 | "id": "QivourdbQKbm", 983 | "outputId": "eea6a3fb-566b-4c44-e259-b625fc0f2387" 984 | }, 985 | "execution_count": 25, 986 | "outputs": [ 987 | { 988 | "output_type": "stream", 989 | "name": "stdout", 990 | "text": [ 991 | "Operation cannot be completed. Dimension mismatch\n" 992 | ] 993 | } 994 | ] 995 | }, 996 | { 997 | "cell_type": "code", 998 | "source": [ 999 | "# But this works: subtract the column vector x from each column of A\n", 1000 | "A - x[:, np.newaxis]" 1001 | ], 1002 | "metadata": { 1003 | "colab": { 1004 | "base_uri": "https://localhost:8080/" 1005 | }, 1006 | "id": "75nfP8rfQNHo", 1007 | "outputId": "901af2bb-05b5-4d07-b476-654c07cad8d3" 1008 | }, 1009 | "execution_count": 26, 1010 | "outputs": [ 1011 | { 1012 | "output_type": "execute_result", 1013 | "data": { 1014 | "text/plain": [ 1015 | "array([[ 0.37898843, 0.42689999, -1.34790859],\n", 1016 | " [ 0.59115004, -0.40399615, -0.74489962],\n", 1017 | " [-0.91340826, -3.6311077 , -2.78809825],\n", 1018 | " [-1.65574227, -2.92895949, -2.93240511]])" 1019 | ] 1020 | }, 1021 | "metadata": {}, 1022 | "execution_count": 26 1023 | } 1024 | ] 1025 | }, 1026 | { 1027 | "cell_type": "code", 1028 | "source": [], 1029 | "metadata": { 1030 | "id": "jJE1HX3oQSmj" 1031 | }, 1032 | "execution_count": null, 1033 | "outputs": [] 1034 | } 1035 | ] 1036 | } -------------------------------------------------------------------------------- /Basic Numerical Linear
Algebra/10-The_Trace_Operator.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "authorship_tag": "ABX9TyPiT7fqYp/srP3TTDgquKfD", 8 | "include_colab_link": true 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 | }, 14 | "language_info": { 15 | "name": "python" 16 | } 17 | }, 18 | "cells": [ 19 | { 20 | "cell_type": "markdown", 21 | "metadata": { 22 | "id": "view-in-github", 23 | "colab_type": "text" 24 | }, 25 | "source": [ 26 | "\"Open" 27 | ] 28 | }, 29 | { 30 | "cell_type": "code", 31 | "execution_count": 1, 32 | "metadata": { 33 | "id": "H8WERUSFt2o3" 34 | }, 35 | "outputs": [], 36 | "source": [ 37 | "import numpy as np\n", 38 | "import matplotlib.pyplot as plt\n", 39 | "import seaborn as sns" 40 | ] 41 | }, 42 | { 43 | "cell_type": "code", 44 | "source": [ 45 | "# Plot style\n", 46 | "sns.set()\n", 47 | "%pylab inline\n", 48 | "pylab.rcParams['figure.figsize'] = (4, 4)" 49 | ], 50 | "metadata": { 51 | "colab": { 52 | "base_uri": "https://localhost:8080/" 53 | }, 54 | "id": "PHz_XlZIt7ay", 55 | "outputId": "a45d64d9-3352-4759-fd7c-3ec45f70ca88" 56 | }, 57 | "execution_count": 2, 58 | "outputs": [ 59 | { 60 | "output_type": "stream", 61 | "name": "stdout", 62 | "text": [ 63 | "Populating the interactive namespace from numpy and matplotlib\n" 64 | ] 65 | } 66 | ] 67 | }, 68 | { 69 | "cell_type": "markdown", 70 | "source": [ 71 | "# Introduction\n", 72 | "This chapter is very light! I can assure you that you will read it in 1 minute! It makes a nice break after the last two chapters, which were quite big! We will see what the trace of a matrix is. It will be needed for the last chapter, on Principal Component Analysis (PCA)."
73 | ], 74 | "metadata": { 75 | "id": "j7GeJ3mWt_bO" 76 | } 77 | }, 78 | { 79 | "cell_type": "markdown", 80 | "source": [ 81 | "# The Trace operator\n", 82 | "\n", 83 | "*The Trace of matrix*\n", 84 | "\n", 85 | "The trace is the sum of all values in the diagonal of a square matrix.\n", 86 | "\n", 87 | "$$ {A}=\n", 88 | "\\begin{bmatrix}\n", 89 | " 2 & 9 & 8 \\\\\\\\\n", 90 | " 4 & 7 & 1 \\\\\\\\\n", 91 | " 8 & 2 & 5\n", 92 | "\\end{bmatrix}$$\n", 93 | "\n", 94 | "$$\\mathrm{Tr}({A}) = 2 + 7 + 5 = 14$$\n", 95 | "\n", 96 | "\n", 97 | "\n", 98 | "`numpy` provides the function `trace()` to calculate it:" 99 | ], 100 | "metadata": { 101 | "id": "bsqWfLfFuBmO" 102 | } 103 | }, 104 | { 105 | "cell_type": "code", 106 | "source": [ 107 | "A = np.array([[2, 9, 8], [4, 7, 1], [8, 2, 5]])\n", 108 | "A" 109 | ], 110 | "metadata": { 111 | "colab": { 112 | "base_uri": "https://localhost:8080/" 113 | }, 114 | "id": "6I9YmUB2t9Y9", 115 | "outputId": "2732b67c-b24d-4ddd-d71f-8cb00e40af1f" 116 | }, 117 | "execution_count": 3, 118 | "outputs": [ 119 | { 120 | "output_type": "execute_result", 121 | "data": { 122 | "text/plain": [ 123 | "array([[2, 9, 8],\n", 124 | " [4, 7, 1],\n", 125 | " [8, 2, 5]])" 126 | ] 127 | }, 128 | "metadata": {}, 129 | "execution_count": 3 130 | } 131 | ] 132 | }, 133 | { 134 | "cell_type": "code", 135 | "source": [ 136 | "A_tr = np.trace(A)\n", 137 | "A_tr" 138 | ], 139 | "metadata": { 140 | "colab": { 141 | "base_uri": "https://localhost:8080/" 142 | }, 143 | "id": "aaXix1-quYYS", 144 | "outputId": "e14f6e8e-d4ea-447b-a6e7-32f76ce8dd6a" 145 | }, 146 | "execution_count": 4, 147 | "outputs": [ 148 | { 149 | "output_type": "execute_result", 150 | "data": { 151 | "text/plain": [ 152 | "14" 153 | ] 154 | }, 155 | "metadata": {}, 156 | "execution_count": 4 157 | } 158 | ] 159 | }, 160 | { 161 | "cell_type": "markdown", 162 | "source": [ 163 | "Trace can be used to specify the Frobenius norm of a matrix. 
The Frobenius norm is the equivalent of the $L^2$\n", 164 | " norm for matrices. It is defined by:\n", 165 | "\n", 166 | " $$ ||{{A}}||_F=\\sqrt{\\sum_{i,j}A^2_{i,j}}$$\n", 167 | "\n", 168 | " In words: square every element, sum the squares, then take the square root of the result. This norm can also be calculated with the trace, because the $i$-th diagonal entry of ${AA}^T$ is the sum of the squared entries of row $i$ of ${A}$:\n", 169 | "\n", 170 | " $$ ||{{A}}||_F=\\sqrt{Tr({{AA}^T})}$$\n", 171 | "\n", 172 | " We can check this with `np.linalg.norm()`, which returns the Frobenius norm of a 2-D array by default:" 173 | ], 174 | "metadata": { 175 | "id": "o0gwUduuudsN" 176 | } 177 | }, 178 | { 179 | "cell_type": "code", 180 | "source": [ 181 | "np.linalg.norm(A)" 182 | ], 183 | "metadata": { 184 | "colab": { 185 | "base_uri": "https://localhost:8080/" 186 | }, 187 | "id": "APu01wxsua62", 188 | "outputId": "dafd1ff2-3849-4a5e-cbba-73ee7e1d0b19" 189 | }, 190 | "execution_count": 5, 191 | "outputs": [ 192 | { 193 | "output_type": "execute_result", 194 | "data": { 195 | "text/plain": [ 196 | "17.549928774784245" 197 | ] 198 | }, 199 | "metadata": {}, 200 | "execution_count": 5 201 | } 202 | ] 203 | }, 204 | { 205 | "cell_type": "markdown", 206 | "source": [ 207 | "The Frobenius norm of $A$\n", 208 | " is 17.549928774784245. The trace formula gives the same result:"
209 | ], 210 | "metadata": { 211 | "id": "RsueaPsbvLS_" 212 | } 213 | }, 214 | { 215 | "cell_type": "code", 216 | "source": [ 217 | "np.sqrt(np.trace(A.dot(A.T)))" 218 | ], 219 | "metadata": { 220 | "colab": { 221 | "base_uri": "https://localhost:8080/" 222 | }, 223 | "id": "ey6GwHCSvElw", 224 | "outputId": "1f14b813-d58e-490a-bb87-f8b02d85603a" 225 | }, 226 | "execution_count": 6, 227 | "outputs": [ 228 | { 229 | "output_type": "execute_result", 230 | "data": { 231 | "text/plain": [ 232 | "17.549928774784245" 233 | ] 234 | }, 235 | "metadata": {}, 236 | "execution_count": 6 237 | } 238 | ] 239 | }, 240 | { 241 | "cell_type": "code", 242 | "source": [ 243 | "np.isclose(np.sqrt(np.trace(A.dot(A.T))), np.linalg.norm(A))" 244 | ], 245 | "metadata": { 246 | "colab": { 247 | "base_uri": "https://localhost:8080/" 248 | }, 249 | "id": "bF_J72rHvUhM", 250 | "outputId": "3201edea-a5fa-40e5-e1fd-efcb9e9f540d" 251 | }, 252 | "execution_count": 7, 253 | "outputs": [ 254 | { 255 | "output_type": "execute_result", 256 | "data": { 257 | "text/plain": [ 258 | "True" 259 | ] 260 | }, 261 | "metadata": {}, 262 | "execution_count": 7 263 | } 264 | ] 265 | }, 266 | { 267 | "cell_type": "markdown", 268 | "source": [ 269 | "Since transposing a matrix doesn't change its diagonal, the trace of a matrix is equal to the trace of its transpose:\n", 270 | "\n", 271 | "$$ Tr({A})=Tr({A}^T)$$" 272 | ], 273 | "metadata": { 274 | "id": "c4rKF_GtvbFr" 275 | } 276 | }, 277 | { 278 | "cell_type": "markdown", 279 | "source": [ 280 | "## Trace of a product\n", 281 | "The trace is unchanged by cyclic permutations of the factors in a product, as long as the reordered products are defined:\n", 282 | "$$ Tr({ABC}) = Tr({CAB}) = Tr({BCA})$$\n", 283 | "\n", 284 | "\n", 285 | "#### Example 1\n", 286 | "\n", 287 | "Let's see an example of this property:\n", 288 | "\n", 289 | "$$ {A}=\n", 290 | "\\begin{bmatrix}\n", 291 | " 4 & 12 \\\\\\\\\n", 292 | " 7 & 6\n", 293 | "\\end{bmatrix}$$\n", 294 | "\n", 295 | "$${B}=\n", 296 | "\\begin{bmatrix}\n", 297 | " 1 & -3 \\\\\\\\\n", 298 | " 4 & 3\n", 299 | "\\end{bmatrix}$$\n", 300 | 
"\n", 301 | "$${C}=\n", 302 | "\\begin{bmatrix}\n", 303 | " 6 & 6 \\\\\\\\\n", 304 | " 2 & 5\n", 305 | "\\end{bmatrix}$$" 306 | ], 307 | "metadata": { 308 | "id": "90IY_Qc1vkw2" 309 | } 310 | }, 311 | { 312 | "cell_type": "code", 313 | "source": [ 314 | "A = np.array([[4, 12], [7, 6]])\n", 315 | "B = np.array([[1, -3], [4, 3]])\n", 316 | "C = np.array([[6, 6], [2, 5]])\n", 317 | "\n", 318 | "np.trace(A.dot(B).dot(C))" 319 | ], 320 | "metadata": { 321 | "colab": { 322 | "base_uri": "https://localhost:8080/" 323 | }, 324 | "id": "gYaN-aJhvXvb", 325 | "outputId": "d938c453-4236-46b8-e5e4-40c9baece6da" 326 | }, 327 | "execution_count": 8, 328 | "outputs": [ 329 | { 330 | "output_type": "execute_result", 331 | "data": { 332 | "text/plain": [ 333 | "531" 334 | ] 335 | }, 336 | "metadata": {}, 337 | "execution_count": 8 338 | } 339 | ] 340 | }, 341 | { 342 | "cell_type": "code", 343 | "source": [ 344 | "np.trace(C.dot(A).dot(B))" 345 | ], 346 | "metadata": { 347 | "colab": { 348 | "base_uri": "https://localhost:8080/" 349 | }, 350 | "id": "z6gI5f9swAPy", 351 | "outputId": "fc549f37-e30e-47ad-c1bb-91fa22b2b0de" 352 | }, 353 | "execution_count": 9, 354 | "outputs": [ 355 | { 356 | "output_type": "execute_result", 357 | "data": { 358 | "text/plain": [ 359 | "531" 360 | ] 361 | }, 362 | "metadata": {}, 363 | "execution_count": 9 364 | } 365 | ] 366 | }, 367 | { 368 | "cell_type": "code", 369 | "source": [ 370 | "np.trace(B.dot(C).dot(A))" 371 | ], 372 | "metadata": { 373 | "colab": { 374 | "base_uri": "https://localhost:8080/" 375 | }, 376 | "id": "AeOb-E7hwDEO", 377 | "outputId": "fa70b5af-f7eb-47b0-b771-b7b6152d03ec" 378 | }, 379 | "execution_count": 10, 380 | "outputs": [ 381 | { 382 | "output_type": "execute_result", 383 | "data": { 384 | "text/plain": [ 385 | "531" 386 | ] 387 | }, 388 | "metadata": {}, 389 | "execution_count": 10 390 | } 391 | ] 392 | }, 393 | { 394 | "cell_type": "markdown", 395 | "source": [ 396 | "$$ {ABC}=\n", 397 | "\\begin{bmatrix}\n", 398 | " 360 
& 432 \\\\\\\\\n", 399 | " 180 & 171\n", 400 | "\\end{bmatrix}$$\n", 401 | "\n", 402 | "$$ {CAB}=\n", 403 | "\\begin{bmatrix}\n", 404 | " 498 & 126 \\\\\\\\\n", 405 | " 259 & 33\n", 406 | "\\end{bmatrix}$$\n", 407 | "\n", 408 | "$${BCA}=\n", 409 | "\\begin{bmatrix}\n", 410 | " -63 & -54 \\\\\\\\\n", 411 | " 393 & 594\n", 412 | "\\end{bmatrix}$$\n", 413 | "\n", 414 | "$$Tr({ABC}) = Tr({CAB}) = Tr({BCA}) = 531$$" 415 | ], 416 | "metadata": { 417 | "id": "wpxe8s4AwHQP" 418 | } 419 | } 420 | ] 421 | } -------------------------------------------------------------------------------- /Basic Numerical Linear Algebra/2-Multiplying_Matrices_and_Vectors.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "authorship_tag": "ABX9TyMSmY5IIoQCJBI9GQH5HztX", 8 | "include_colab_link": true 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 | }, 14 | "language_info": { 15 | "name": "python" 16 | } 17 | }, 18 | "cells": [ 19 | { 20 | "cell_type": "markdown", 21 | "metadata": { 22 | "id": "view-in-github", 23 | "colab_type": "text" 24 | }, 25 | "source": [ 26 | "\"Open" 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "source": [ 32 | "Author: [Monit Sharma](https://github.com/MonitSharma),\n", 33 | " LinkedIn: [Monit Sharma](https://www.linkedin.com/in/monitsharma/),\n", 34 | " Twitter: [@MonitSharma1729](https://twitter.com/MonitSharma1729)" 35 | ], 36 | "metadata": { 37 | "id": "Qnlg0jmmRIdP" 38 | } 39 | }, 40 | { 41 | "cell_type": "code", 42 | "execution_count": 1, 43 | "metadata": { 44 | "id": "-DqOJ5BnRHCT" 45 | }, 46 | "outputs": [], 47 | "source": [ 48 | "# import numpy\n", 49 | "import numpy as np\n", 50 | "\n", 51 | "# print very small floats in fixed-point notation instead of scientific notation\n", 52 | "np.set_printoptions(suppress=True)" 53 | ] 54 | }, 55 | { 56 | "cell_type": "markdown", 57 | "source": [ 58 | "# Introduction\n", 
59 | "\n", 60 | "We will see some very important concepts in this chapter. The dot product appears in many of the equations behind data science algorithms, so it's worth the effort to understand it. Then we will see some properties of this operation. Finally, we will get some intuition on the link between matrices and systems of linear equations." 61 | ], 62 | "metadata": { 63 | "id": "XjXmGj2dRnpF" 64 | } 65 | }, 66 | { 67 | "cell_type": "markdown", 68 | "source": [ 69 | "# Multiplying Matrices and Vectors\n", 70 | "\n", 71 | "The standard way to multiply matrices is not to multiply each element of one with the corresponding element of the other (this is the element-wise product) but to calculate the sum of the products between rows and columns. \n", 72 | "\n", 73 | "-----\n", 74 | "\n", 75 | "The number of columns of the first matrix must be equal to the number of rows of the second matrix. Thus, if the shape of the first matrix is ($m\\times n$), the second matrix needs to be of shape ($n \\times x$\n", 76 | "). 
The resulting matrix will have the shape ($m \\times x$\n", 77 | ").\n", 78 | "\n", 79 | "\n" 80 | ], 81 | "metadata": { 82 | "id": "wcLRSyGXRxje" 83 | } 84 | }, 85 | { 86 | "cell_type": "markdown", 87 | "source": [ 88 | "## Example 1\n", 89 | "\n", 90 | "To start, we will see the multiplication of a matrix and a vector\n", 91 | "\n", 92 | "$$ {A} \\times {b} = {C} $$\n", 93 | "\n", 94 | "with ${A}=\n", 95 | "\\begin{bmatrix}\n", 96 | " 1 & 2\\\\\\\\\n", 97 | " 3 & 4\\\\\\\\\n", 98 | " 5 & 6\n", 99 | "\\end{bmatrix}$ and $ {b}=\\begin{bmatrix}\n", 100 | " 2\\\\\\\\\n", 101 | " 4\n", 102 | "\\end{bmatrix}$.\n", 103 | "\n", 104 | "The formula is the following:\n", 105 | "\n", 106 | "$$\\begin{align*}\n", 107 | "&\\begin{bmatrix}\n", 108 | " A_{1,1} & A_{1,2} \\\\\\\\\n", 109 | " A_{2,1} & A_{2,2} \\\\\\\\\n", 110 | " A_{3,1} & A_{3,2}\n", 111 | "\\end{bmatrix}\\times\n", 112 | "\\begin{bmatrix}\n", 113 | " B_{1,1} \\\\\\\\\n", 114 | " B_{2,1}\n", 115 | "\\end{bmatrix}=\\\\\\\\\n", 116 | "&\\begin{bmatrix}\n", 117 | " A_{1,1}B_{1,1} + A_{1,2}B_{2,1} \\\\\\\\\n", 118 | " A_{2,1}B_{1,1} + A_{2,2}B_{2,1} \\\\\\\\\n", 119 | " A_{3,1}B_{1,1} + A_{3,2}B_{2,1}\n", 120 | "\\end{bmatrix}\n", 121 | "\\end{align*}$$\n", 122 | "\n", 123 | "So we will have:\n", 124 | "\n", 125 | "$$ \\begin{align*}\n", 126 | "&\\begin{bmatrix}\n", 127 | " 1 & 2 \\\\\\\\\n", 128 | " 3 & 4 \\\\\\\\\n", 129 | " 5 & 6\n", 130 | "\\end{bmatrix}\\times\n", 131 | "\\begin{bmatrix}\n", 132 | " 2 \\\\\\\\\n", 133 | " 4\n", 134 | "\\end{bmatrix}=\\\\\\\\\n", 135 | "&\\begin{bmatrix}\n", 136 | " 1 \\times 2 + 2 \\times 4 \\\\\\\\\n", 137 | " 3 \\times 2 + 4 \\times 4 \\\\\\\\\n", 138 | " 5 \\times 2 + 6 \\times 4\n", 139 | "\\end{bmatrix}=\n", 140 | "\\begin{bmatrix}\n", 141 | " 10 \\\\\\\\\n", 142 | " 22 \\\\\\\\\n", 143 | " 34\n", 144 | "\\end{bmatrix}\n", 145 | "\\end{align*} $$\n", 146 | "\n", 147 | "It is a good habit to check the dimensions of the matrices to see what is going on. 
We can see in this example that the shape of $A$\n", 148 | " is ($ 3 \\times 2$\n", 149 | ") and the shape of $b$\n", 150 | " is ($2 \\times 1$\n", 151 | "). So the dimensions of $C$\n", 152 | " are ($3 \\times 1$\n", 153 | ").\n" 154 | ], 155 | "metadata": { 156 | "id": "PWDNpjlsSHnQ" 157 | } 158 | }, 159 | { 160 | "cell_type": "markdown", 161 | "source": [ 162 | "## With Numpy\n", 163 | "\n", 164 | "The Numpy function `dot()` can be used to compute the matrix product (or dot product). Let's try to reproduce the last example:\n" 165 | ], 166 | "metadata": { 167 | "id": "0OxVLDJmS4wW" 168 | } 169 | }, 170 | { 171 | "cell_type": "code", 172 | "source": [ 173 | "A = np.array([[1, 2], [3, 4], [5, 6]])\n", 174 | "A" 175 | ], 176 | "metadata": { 177 | "colab": { 178 | "base_uri": "https://localhost:8080/" 179 | }, 180 | "id": "ciZgF1HrRcGb", 181 | "outputId": "b0c5e813-0341-49be-c018-403e15ffdcd0" 182 | }, 183 | "execution_count": 2, 184 | "outputs": [ 185 | { 186 | "output_type": "execute_result", 187 | "data": { 188 | "text/plain": [ 189 | "array([[1, 2],\n", 190 | " [3, 4],\n", 191 | " [5, 6]])" 192 | ] 193 | }, 194 | "metadata": {}, 195 | "execution_count": 2 196 | } 197 | ] 198 | }, 199 | { 200 | "cell_type": "code", 201 | "source": [ 202 | "B = np.array([[2], [4]])\n", 203 | "B" 204 | ], 205 | "metadata": { 206 | "colab": { 207 | "base_uri": "https://localhost:8080/" 208 | }, 209 | "id": "xQQ0PA55TAp4", 210 | "outputId": "f288ea0d-9177-4bb4-941e-a74bfd08e99d" 211 | }, 212 | "execution_count": 3, 213 | "outputs": [ 214 | { 215 | "output_type": "execute_result", 216 | "data": { 217 | "text/plain": [ 218 | "array([[2],\n", 219 | " [4]])" 220 | ] 221 | }, 222 | "metadata": {}, 223 | "execution_count": 3 224 | } 225 | ] 226 | }, 227 | { 228 | "cell_type": "code", 229 | "source": [ 230 | "C = np.dot(A, B)\n", 231 | "C" 232 | ], 233 | "metadata": { 234 | "colab": { 235 | "base_uri": "https://localhost:8080/" 236 | }, 237 | "id": "Od5Tp-nMTCNQ", 238 | "outputId": 
"5b71d021-990e-4e4c-d700-e6319f240c7f" 239 | }, 240 | "execution_count": 4, 241 | "outputs": [ 242 | { 243 | "output_type": "execute_result", 244 | "data": { 245 | "text/plain": [ 246 | "array([[10],\n", 247 | " [22],\n", 248 | " [34]])" 249 | ] 250 | }, 251 | "metadata": {}, 252 | "execution_count": 4 253 | } 254 | ] 255 | }, 256 | { 257 | "cell_type": "markdown", 258 | "source": [ 259 | "It is equivalent to use the method `dot()` of Numpy arrays:" 260 | ], 261 | "metadata": { 262 | "id": "uDOi3zfqTGyA" 263 | } 264 | }, 265 | { 266 | "cell_type": "code", 267 | "source": [ 268 | "C = A.dot(B)\n", 269 | "C" 270 | ], 271 | "metadata": { 272 | "colab": { 273 | "base_uri": "https://localhost:8080/" 274 | }, 275 | "id": "kVFOFjUVTD1H", 276 | "outputId": "1d8e84d5-bb29-45f1-dde0-5e9029f042e0" 277 | }, 278 | "execution_count": 5, 279 | "outputs": [ 280 | { 281 | "output_type": "execute_result", 282 | "data": { 283 | "text/plain": [ 284 | "array([[10],\n", 285 | " [22],\n", 286 | " [34]])" 287 | ] 288 | }, 289 | "metadata": {}, 290 | "execution_count": 5 291 | } 292 | ] 293 | }, 294 | { 295 | "cell_type": "markdown", 296 | "source": [ 297 | "## Example 2\n", 298 | "\n", 299 | "Multiplication of two matrices\n", 300 | "\n", 301 | "$$ {A} \\times {B} = {C} $$\n", 302 | "\n", 303 | "with \n", 304 | "\n", 305 | "$$ {A}=\\begin{bmatrix}\n", 306 | " 1 & 2 & 3 \\\\\\\\\n", 307 | " 4 & 5 & 6 \\\\\\\\\n", 308 | " 7 & 8 & 9 \\\\\\\\\n", 309 | " 10 & 11 & 12\n", 310 | "\\end{bmatrix} $$\n", 311 | "\n", 312 | "and \n", 313 | "\n", 314 | "$$ {B}=\\begin{bmatrix}\n", 315 | " 2 & 7 \\\\\\\\\n", 316 | " 1 & 2 \\\\\\\\\n", 317 | " 3 & 6\n", 318 | "\\end{bmatrix}$$\n", 319 | "\n", 320 | "So we have:\n", 321 | "\n", 322 | "$$ \\begin{align*}\n", 323 | "&\\begin{bmatrix}\n", 324 | " 1 & 2 & 3 \\\\\\\\\n", 325 | " 4 & 5 & 6 \\\\\\\\\n", 326 | " 7 & 8 & 9 \\\\\\\\\n", 327 | " 10 & 11 & 12\n", 328 | "\\end{bmatrix}\\times\n", 329 | "\\begin{bmatrix}\n", 330 | " 2 & 7 \\\\\\\\\n", 331 | " 1 & 2 
\\\\\\\\\n", 332 | " 3 & 6\n", 333 | "\\end{bmatrix}=\\\\\\\\\n", 334 | "&\\begin{bmatrix}\n", 335 | " 2 \\times 1 + 1 \\times 2 + 3 \\times 3 & 7 \\times 1 + 2 \\times 2 + 6 \\times 3 \\\\\\\\\n", 336 | " 2 \\times 4 + 1 \\times 5 + 3 \\times 6 & 7 \\times 4 + 2 \\times 5 + 6 \\times 6 \\\\\\\\\n", 337 | " 2 \\times 7 + 1 \\times 8 + 3 \\times 9 & 7 \\times 7 + 2 \\times 8 + 6 \\times 9 \\\\\\\\\n", 338 | " 2 \\times 10 + 1 \\times 11 + 3 \\times 12 & 7 \\times 10 + 2 \\times 11 + 6 \\times 12 \\\\\\\\\n", 339 | "\\end{bmatrix}\\\\\\\\\n", 340 | "&=\n", 341 | "\\begin{bmatrix}\n", 342 | " 13 & 29 \\\\\\\\\n", 343 | " 31 & 74 \\\\\\\\\n", 344 | " 49 & 119 \\\\\\\\\n", 345 | " 67 & 164\n", 346 | "\\end{bmatrix}\n", 347 | "\\end{align*} $$\n", 348 | "\n", 349 | "Let's verify with `numpy`" 350 | ], 351 | "metadata": { 352 | "id": "qpkyCeoeTMHM" 353 | } 354 | }, 355 | { 356 | "cell_type": "code", 357 | "source": [ 358 | "A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])\n", 359 | "A" 360 | ], 361 | "metadata": { 362 | "colab": { 363 | "base_uri": "https://localhost:8080/" 364 | }, 365 | "id": "uzLViCOnTK49", 366 | "outputId": "7d76e225-c920-4ce4-b8ac-f61305b85649" 367 | }, 368 | "execution_count": 6, 369 | "outputs": [ 370 | { 371 | "output_type": "execute_result", 372 | "data": { 373 | "text/plain": [ 374 | "array([[ 1, 2, 3],\n", 375 | " [ 4, 5, 6],\n", 376 | " [ 7, 8, 9],\n", 377 | " [10, 11, 12]])" 378 | ] 379 | }, 380 | "metadata": {}, 381 | "execution_count": 6 382 | } 383 | ] 384 | }, 385 | { 386 | "cell_type": "code", 387 | "source": [ 388 | "B = np.array([[2, 7], [1, 2], [3, 6]])\n", 389 | "B" 390 | ], 391 | "metadata": { 392 | "colab": { 393 | "base_uri": "https://localhost:8080/" 394 | }, 395 | "id": "wUm8qK15TpK2", 396 | "outputId": "929711aa-5b51-4c26-a939-82ef46059887" 397 | }, 398 | "execution_count": 7, 399 | "outputs": [ 400 | { 401 | "output_type": "execute_result", 402 | "data": { 403 | "text/plain": [ 404 | "array([[2, 7],\n", 405 | " 
[1, 2],\n", 406 | " [3, 6]])" 407 | ] 408 | }, 409 | "metadata": {}, 410 | "execution_count": 7 411 | } 412 | ] 413 | }, 414 | { 415 | "cell_type": "code", 416 | "source": [ 417 | "C = A.dot(B)\n", 418 | "C" 419 | ], 420 | "metadata": { 421 | "colab": { 422 | "base_uri": "https://localhost:8080/" 423 | }, 424 | "id": "ukkDsHS8TrD_", 425 | "outputId": "2b9ea87c-ebd6-434d-a3eb-b443bdf5e6f4" 426 | }, 427 | "execution_count": 8, 428 | "outputs": [ 429 | { 430 | "output_type": "execute_result", 431 | "data": { 432 | "text/plain": [ 433 | "array([[ 13, 29],\n", 434 | " [ 31, 74],\n", 435 | " [ 49, 119],\n", 436 | " [ 67, 164]])" 437 | ] 438 | }, 439 | "metadata": {}, 440 | "execution_count": 8 441 | } 442 | ] 443 | }, 444 | { 445 | "cell_type": "markdown", 446 | "source": [ 447 | "# Formalization of the dot product\n", 448 | "\n", 449 | "$$ C_{i,j} = A_{i,k}B_{k,j} = \\sum_{k}A_{i,k}B_{k,j} $$\n", 450 | "\n", 451 | "## Properties of the dot product\n", 452 | "\n", 453 | "We will now see some interesting properties of matrix multiplication. They will become useful as we move forward in the chapters. 
Using simple examples for each property will provide a way to check them while we get used to the Numpy functions.\n", 454 | "\n", 455 | "\n", 456 | "## Matrix multiplication is distributive\n", 457 | "\n", 458 | "$$ {A}({B}+{C}) = {AB}+{AC} $$\n", 459 | "\n" 460 | ], 461 | "metadata": { 462 | "id": "hvQZWz3NTuW0" 463 | } 464 | }, 465 | { 466 | "cell_type": "markdown", 467 | "source": [ 468 | "## Example 3\n", 469 | "\n", 470 | "$$ {A}=\\begin{bmatrix}\n", 471 | " 2 & 3 \\\\\\\\\n", 472 | " 1 & 4 \\\\\\\\\n", 473 | " 7 & 6\n", 474 | "\\end{bmatrix}, \n", 475 | "{B}=\\begin{bmatrix}\n", 476 | " 5 \\\\\\\\\n", 477 | " 2\n", 478 | "\\end{bmatrix}, \n", 479 | "{C}=\\begin{bmatrix}\n", 480 | " 4 \\\\\\\\\n", 481 | " 3\n", 482 | "\\end{bmatrix}$$\n", 483 | "\n", 484 | "\n", 485 | "----\n", 486 | "\n", 487 | "$$ \\begin{align*}\n", 488 | "{A}({B}+{C})&=\\begin{bmatrix}\n", 489 | " 2 & 3 \\\\\\\\\n", 490 | " 1 & 4 \\\\\\\\\n", 491 | " 7 & 6\n", 492 | "\\end{bmatrix}\\times\n", 493 | "\\left(\\begin{bmatrix}\n", 494 | " 5 \\\\\\\\\n", 495 | " 2\n", 496 | "\\end{bmatrix}+\n", 497 | "\\begin{bmatrix}\n", 498 | " 4 \\\\\\\\\n", 499 | " 3\n", 500 | "\\end{bmatrix}\\right)=\n", 501 | "\\begin{bmatrix}\n", 502 | " 2 & 3 \\\\\\\\\n", 503 | " 1 & 4 \\\\\\\\\n", 504 | " 7 & 6\n", 505 | "\\end{bmatrix}\\times\n", 506 | "\\begin{bmatrix}\n", 507 | " 9 \\\\\\\\\n", 508 | " 5\n", 509 | "\\end{bmatrix}\\\\\\\\\n", 510 | "&=\n", 511 | "\\begin{bmatrix}\n", 512 | " 2 \\times 9 + 3 \\times 5 \\\\\\\\\n", 513 | " 1 \\times 9 + 4 \\times 5 \\\\\\\\\n", 514 | " 7 \\times 9 + 6 \\times 5\n", 515 | "\\end{bmatrix}=\n", 516 | "\\begin{bmatrix}\n", 517 | " 33 \\\\\\\\\n", 518 | " 29 \\\\\\\\\n", 519 | " 93\n", 520 | "\\end{bmatrix}\n", 521 | "\\end{align*}$$\n", 522 | "\n", 523 | "\n", 524 | "is equivalent to \n", 525 | "\n", 526 | "$$ \\begin{align*}\n", 527 | "{A}{B}+{A}{C} &= \\begin{bmatrix}\n", 528 | " 2 & 3 \\\\\\\\\n", 529 | " 1 & 4 \\\\\\\\\n", 530 | " 7 & 6\n", 531 | 
"\\end{bmatrix}\\times\n", 532 | "\\begin{bmatrix}\n", 533 | " 5 \\\\\\\\\n", 534 | " 2\n", 535 | "\\end{bmatrix}+\n", 536 | "\\begin{bmatrix}\n", 537 | " 2 & 3 \\\\\\\\\n", 538 | " 1 & 4 \\\\\\\\\n", 539 | " 7 & 6\n", 540 | "\\end{bmatrix}\\times\n", 541 | "\\begin{bmatrix}\n", 542 | " 4 \\\\\\\\\n", 543 | " 3\n", 544 | "\\end{bmatrix}\\\\\\\\\n", 545 | "&=\n", 546 | "\\begin{bmatrix}\n", 547 | " 2 \\times 5 + 3 \\times 2 \\\\\\\\\n", 548 | " 1 \\times 5 + 4 \\times 2 \\\\\\\\\n", 549 | " 7 \\times 5 + 6 \\times 2\n", 550 | "\\end{bmatrix}+\n", 551 | "\\begin{bmatrix}\n", 552 | " 2 \\times 4 + 3 \\times 3 \\\\\\\\\n", 553 | " 1 \\times 4 + 4 \\times 3 \\\\\\\\\n", 554 | " 7 \\times 4 + 6 \\times 3\n", 555 | "\\end{bmatrix}\\\\\\\\\n", 556 | "&=\n", 557 | "\\begin{bmatrix}\n", 558 | " 16 \\\\\\\\\n", 559 | " 13 \\\\\\\\\n", 560 | " 47\n", 561 | "\\end{bmatrix}+\n", 562 | "\\begin{bmatrix}\n", 563 | " 17 \\\\\\\\\n", 564 | " 16 \\\\\\\\\n", 565 | " 46\n", 566 | "\\end{bmatrix}=\n", 567 | "\\begin{bmatrix}\n", 568 | " 33 \\\\\\\\\n", 569 | " 29 \\\\\\\\\n", 570 | " 93\n", 571 | "\\end{bmatrix}\n", 572 | "\\end{align*}$$" 573 | ], 574 | "metadata": { 575 | "id": "G3e7wB46UFUt" 576 | } 577 | }, 578 | { 579 | "cell_type": "code", 580 | "source": [ 581 | "A = np.array([[2, 3], [1, 4], [7, 6]])\n", 582 | "A" 583 | ], 584 | "metadata": { 585 | "colab": { 586 | "base_uri": "https://localhost:8080/" 587 | }, 588 | "id": "H18MnekiTuH3", 589 | "outputId": "ad65e7f3-4bf3-4bea-fd92-45c22228deee" 590 | }, 591 | "execution_count": 9, 592 | "outputs": [ 593 | { 594 | "output_type": "execute_result", 595 | "data": { 596 | "text/plain": [ 597 | "array([[2, 3],\n", 598 | " [1, 4],\n", 599 | " [7, 6]])" 600 | ] 601 | }, 602 | "metadata": {}, 603 | "execution_count": 9 604 | } 605 | ] 606 | }, 607 | { 608 | "cell_type": "code", 609 | "source": [ 610 | "B = np.array([[5], [2]])\n", 611 | "B" 612 | ], 613 | "metadata": { 614 | "colab": { 615 | "base_uri": "https://localhost:8080/" 616 | 
}, 617 | "id": "Jw9S50OXTspU", 618 | "outputId": "18ea0e55-1881-4693-e2cd-48defaa55c58" 619 | }, 620 | "execution_count": 10, 621 | "outputs": [ 622 | { 623 | "output_type": "execute_result", 624 | "data": { 625 | "text/plain": [ 626 | "array([[5],\n", 627 | " [2]])" 628 | ] 629 | }, 630 | "metadata": {}, 631 | "execution_count": 10 632 | } 633 | ] 634 | }, 635 | { 636 | "cell_type": "code", 637 | "source": [ 638 | "C = np.array([[4], [3]])\n", 639 | "C" 640 | ], 641 | "metadata": { 642 | "colab": { 643 | "base_uri": "https://localhost:8080/" 644 | }, 645 | "id": "2N6U0FKIUikB", 646 | "outputId": "45e3fe84-ba9b-4feb-b3eb-a2a48e294f4d" 647 | }, 648 | "execution_count": 11, 649 | "outputs": [ 650 | { 651 | "output_type": "execute_result", 652 | "data": { 653 | "text/plain": [ 654 | "array([[4],\n", 655 | " [3]])" 656 | ] 657 | }, 658 | "metadata": {}, 659 | "execution_count": 11 660 | } 661 | ] 662 | }, 663 | { 664 | "cell_type": "markdown", 665 | "source": [ 666 | "$A(B+C)$" 667 | ], 668 | "metadata": { 669 | "id": "rJPwNB1lUlpv" 670 | } 671 | }, 672 | { 673 | "cell_type": "code", 674 | "source": [ 675 | "D = A.dot(B+C)\n", 676 | "D" 677 | ], 678 | "metadata": { 679 | "colab": { 680 | "base_uri": "https://localhost:8080/" 681 | }, 682 | "id": "4migHKzXUk4c", 683 | "outputId": "3f9a0cad-b718-43a7-8416-c5bafb8d1b0d" 684 | }, 685 | "execution_count": 12, 686 | "outputs": [ 687 | { 688 | "output_type": "execute_result", 689 | "data": { 690 | "text/plain": [ 691 | "array([[33],\n", 692 | " [29],\n", 693 | " [93]])" 694 | ] 695 | }, 696 | "metadata": {}, 697 | "execution_count": 12 698 | } 699 | ] 700 | }, 701 | { 702 | "cell_type": "markdown", 703 | "source": [ 704 | "is equivalent to $AB + AC$" 705 | ], 706 | "metadata": { 707 | "id": "6NeSLvoLUru6" 708 | } 709 | }, 710 | { 711 | "cell_type": "code", 712 | "source": [ 713 | "D = A.dot(B) + A.dot(C)\n", 714 | "D" 715 | ], 716 | "metadata": { 717 | "colab": { 718 | "base_uri": "https://localhost:8080/" 719 | }, 720 | 
"id": "6iqzmkI-UqZW", 721 | "outputId": "c86cb5fe-05c2-49ed-b286-680cb017d0c4" 722 | }, 723 | "execution_count": 13, 724 | "outputs": [ 725 | { 726 | "output_type": "execute_result", 727 | "data": { 728 | "text/plain": [ 729 | "array([[33],\n", 730 | " [29],\n", 731 | " [93]])" 732 | ] 733 | }, 734 | "metadata": {}, 735 | "execution_count": 13 736 | } 737 | ] 738 | }, 739 | { 740 | "cell_type": "markdown", 741 | "source": [ 742 | "## Matrices mutliplication is associative\n", 743 | "\n", 744 | "$$ {A}({BC}) = ({AB}){C} $$" 745 | ], 746 | "metadata": { 747 | "id": "RBYkT3ChUwdC" 748 | } 749 | }, 750 | { 751 | "cell_type": "code", 752 | "source": [ 753 | "A = np.array([[2, 3], [1, 4], [7, 6]])\n", 754 | "A" 755 | ], 756 | "metadata": { 757 | "colab": { 758 | "base_uri": "https://localhost:8080/" 759 | }, 760 | "id": "PEM4yXV3UvO_", 761 | "outputId": "34b7b68a-5c2e-46b4-c2e3-494531948b75" 762 | }, 763 | "execution_count": 14, 764 | "outputs": [ 765 | { 766 | "output_type": "execute_result", 767 | "data": { 768 | "text/plain": [ 769 | "array([[2, 3],\n", 770 | " [1, 4],\n", 771 | " [7, 6]])" 772 | ] 773 | }, 774 | "metadata": {}, 775 | "execution_count": 14 776 | } 777 | ] 778 | }, 779 | { 780 | "cell_type": "code", 781 | "source": [ 782 | "B = np.array([[5, 3], [2, 2]])\n", 783 | "B" 784 | ], 785 | "metadata": { 786 | "colab": { 787 | "base_uri": "https://localhost:8080/" 788 | }, 789 | "id": "PYJRkbPAU2Sx", 790 | "outputId": "9e1f88c3-2bed-4e3f-9875-855fcb60498a" 791 | }, 792 | "execution_count": 15, 793 | "outputs": [ 794 | { 795 | "output_type": "execute_result", 796 | "data": { 797 | "text/plain": [ 798 | "array([[5, 3],\n", 799 | " [2, 2]])" 800 | ] 801 | }, 802 | "metadata": {}, 803 | "execution_count": 15 804 | } 805 | ] 806 | }, 807 | { 808 | "cell_type": "markdown", 809 | "source": [ 810 | "$A(BC)$" 811 | ], 812 | "metadata": { 813 | "id": "f6j_no3EU8M7" 814 | } 815 | }, 816 | { 817 | "cell_type": "code", 818 | "source": [ 819 | "D = A.dot(B.dot(C))\n", 820 | 
"D" 821 | ], 822 | "metadata": { 823 | "colab": { 824 | "base_uri": "https://localhost:8080/" 825 | }, 826 | "id": "cXkcxhlgU7KE", 827 | "outputId": "74feefe8-7eab-4272-b183-61f745c9f55d" 828 | }, 829 | "execution_count": 16, 830 | "outputs": [ 831 | { 832 | "output_type": "execute_result", 833 | "data": { 834 | "text/plain": [ 835 | "array([[100],\n", 836 | " [ 85],\n", 837 | " [287]])" 838 | ] 839 | }, 840 | "metadata": {}, 841 | "execution_count": 16 842 | } 843 | ] 844 | }, 845 | { 846 | "cell_type": "markdown", 847 | "source": [ 848 | "is equivalent to $(AB)C$\n", 849 | ":" 850 | ], 851 | "metadata": { 852 | "id": "7DF5jOVvVANU" 853 | } 854 | }, 855 | { 856 | "cell_type": "code", 857 | "source": [ 858 | "D = (A.dot(B)).dot(C)\n", 859 | "D" 860 | ], 861 | "metadata": { 862 | "colab": { 863 | "base_uri": "https://localhost:8080/" 864 | }, 865 | "id": "hbRIU6JXU-ny", 866 | "outputId": "a6008a4b-bb8f-4e0c-c9c5-2e4cd81fd6b9" 867 | }, 868 | "execution_count": 17, 869 | "outputs": [ 870 | { 871 | "output_type": "execute_result", 872 | "data": { 873 | "text/plain": [ 874 | "array([[100],\n", 875 | " [ 85],\n", 876 | " [287]])" 877 | ] 878 | }, 879 | "metadata": {}, 880 | "execution_count": 17 881 | } 882 | ] 883 | }, 884 | { 885 | "cell_type": "markdown", 886 | "source": [ 887 | "## Matrix multiplication is not commutative\n", 888 | "\n", 889 | "$$ {AB} \\neq {BA} $$" 890 | ], 891 | "metadata": { 892 | "id": "fwQ3LibsVJeJ" 893 | } 894 | }, 895 | { 896 | "cell_type": "code", 897 | "source": [ 898 | "A = np.array([[2, 3], [6, 5]])\n", 899 | "A" 900 | ], 901 | "metadata": { 902 | "colab": { 903 | "base_uri": "https://localhost:8080/" 904 | }, 905 | "id": "IZ39M-9KVGW-", 906 | "outputId": "848804a8-1e39-49af-fa33-eeb94c2f6fcf" 907 | }, 908 | "execution_count": 18, 909 | "outputs": [ 910 | { 911 | "output_type": "execute_result", 912 | "data": { 913 | "text/plain": [ 914 | "array([[2, 3],\n", 915 | " [6, 5]])" 916 | ] 917 | }, 918 | "metadata": {}, 919 | "execution_count": 
18 920 | } 921 | ] 922 | }, 923 | { 924 | "cell_type": "code", 925 | "source": [ 926 | "B = np.array([[5, 3], [2, 2]])\n", 927 | "B" 928 | ], 929 | "metadata": { 930 | "colab": { 931 | "base_uri": "https://localhost:8080/" 932 | }, 933 | "id": "ax8pm--XVPt1", 934 | "outputId": "8a5ab3a1-e8ed-463f-d538-3491e4b79e80" 935 | }, 936 | "execution_count": 19, 937 | "outputs": [ 938 | { 939 | "output_type": "execute_result", 940 | "data": { 941 | "text/plain": [ 942 | "array([[5, 3],\n", 943 | " [2, 2]])" 944 | ] 945 | }, 946 | "metadata": {}, 947 | "execution_count": 19 948 | } 949 | ] 950 | }, 951 | { 952 | "cell_type": "markdown", 953 | "source": [ 954 | "$AB$ " 955 | ], 956 | "metadata": { 957 | "id": "1C0LyVoJjlci" 958 | } 959 | }, 960 | { 961 | "cell_type": "code", 962 | "source": [ 963 | "AB = np.dot(A, B)\n", 964 | "AB" 965 | ], 966 | "metadata": { 967 | "colab": { 968 | "base_uri": "https://localhost:8080/" 969 | }, 970 | "id": "4E7FBSprVRGw", 971 | "outputId": "59f11284-36e4-4bb5-ae4c-c7ee0fd9443f" 972 | }, 973 | "execution_count": 20, 974 | "outputs": [ 975 | { 976 | "output_type": "execute_result", 977 | "data": { 978 | "text/plain": [ 979 | "array([[16, 12],\n", 980 | " [40, 28]])" 981 | ] 982 | }, 983 | "metadata": {}, 984 | "execution_count": 20 985 | } 986 | ] 987 | }, 988 | { 989 | "cell_type": "markdown", 990 | "source": [ 991 | "is different from $BA$" 992 | ], 993 | "metadata": { 994 | "id": "jUlmWvnqjn69" 995 | } 996 | }, 997 | { 998 | "cell_type": "code", 999 | "source": [ 1000 | "BA = np.dot(B, A)\n", 1001 | "BA" 1002 | ], 1003 | "metadata": { 1004 | "colab": { 1005 | "base_uri": "https://localhost:8080/" 1006 | }, 1007 | "id": "mKDg90wCVSPW", 1008 | "outputId": "434a3880-cb46-4aa9-e0ac-643145c3388a" 1009 | }, 1010 | "execution_count": 21, 1011 | "outputs": [ 1012 | { 1013 | "output_type": "execute_result", 1014 | "data": { 1015 | "text/plain": [ 1016 | "array([[28, 30],\n", 1017 | " [16, 16]])" 1018 | ] 1019 | }, 1020 | "metadata": {}, 1021 | 
"execution_count": 21 1022 | } 1023 | ] 1024 | }, 1025 | { 1026 | "cell_type": "markdown", 1027 | "source": [ 1028 | "## However vector multiplication is commutative\n", 1029 | "\n", 1030 | "$$ {x^{ \\text{T}}y} = {y^{\\text{T}}x}$$" 1031 | ], 1032 | "metadata": { 1033 | "id": "N7sA-sjxjvr3" 1034 | } 1035 | }, 1036 | { 1037 | "cell_type": "code", 1038 | "source": [ 1039 | "x = np.array([[2], [6]])\n", 1040 | "x" 1041 | ], 1042 | "metadata": { 1043 | "colab": { 1044 | "base_uri": "https://localhost:8080/" 1045 | }, 1046 | "id": "dXylDCrOjtaJ", 1047 | "outputId": "11fe3c41-b8d4-4b42-fa27-98ba9c4f7a53" 1048 | }, 1049 | "execution_count": 22, 1050 | "outputs": [ 1051 | { 1052 | "output_type": "execute_result", 1053 | "data": { 1054 | "text/plain": [ 1055 | "array([[2],\n", 1056 | " [6]])" 1057 | ] 1058 | }, 1059 | "metadata": {}, 1060 | "execution_count": 22 1061 | } 1062 | ] 1063 | }, 1064 | { 1065 | "cell_type": "code", 1066 | "source": [ 1067 | "y = np.array([[5], [2]])\n", 1068 | "y\n" 1069 | ], 1070 | "metadata": { 1071 | "colab": { 1072 | "base_uri": "https://localhost:8080/" 1073 | }, 1074 | "id": "D00pl_i0j2YK", 1075 | "outputId": "3c3ea0aa-ae2e-4d48-c006-a07bac66616a" 1076 | }, 1077 | "execution_count": 23, 1078 | "outputs": [ 1079 | { 1080 | "output_type": "execute_result", 1081 | "data": { 1082 | "text/plain": [ 1083 | "array([[5],\n", 1084 | " [2]])" 1085 | ] 1086 | }, 1087 | "metadata": {}, 1088 | "execution_count": 23 1089 | } 1090 | ] 1091 | }, 1092 | { 1093 | "cell_type": "markdown", 1094 | "source": [ 1095 | "${x^\\text{T}y}$" 1096 | ], 1097 | "metadata": { 1098 | "id": "O8GhLpcxj5xv" 1099 | } 1100 | }, 1101 | { 1102 | "cell_type": "code", 1103 | "source": [ 1104 | "x_ty = x.T.dot(y)\n", 1105 | "x_ty" 1106 | ], 1107 | "metadata": { 1108 | "colab": { 1109 | "base_uri": "https://localhost:8080/" 1110 | }, 1111 | "id": "XM35EuW1j3_8", 1112 | "outputId": "8c740c2f-9048-4737-e333-5f9ab06a004d" 1113 | }, 1114 | "execution_count": 24, 1115 | "outputs": [ 1116 
| { 1117 | "output_type": "execute_result", 1118 | "data": { 1119 | "text/plain": [ 1120 | "array([[22]])" 1121 | ] 1122 | }, 1123 | "metadata": {}, 1124 | "execution_count": 24 1125 | } 1126 | ] 1127 | }, 1128 | { 1129 | "cell_type": "markdown", 1130 | "source": [ 1131 | "is equivalent to ${y^\\text{T}x}$\n", 1132 | ": " 1133 | ], 1134 | "metadata": { 1135 | "id": "zFe42CkKj_DT" 1136 | } 1137 | }, 1138 | { 1139 | "cell_type": "code", 1140 | "source": [ 1141 | "y_tx = y.T.dot(x)\n", 1142 | "y_tx" 1143 | ], 1144 | "metadata": { 1145 | "colab": { 1146 | "base_uri": "https://localhost:8080/" 1147 | }, 1148 | "id": "XBc9Z_6dj9jE", 1149 | "outputId": "46162d90-a037-43a7-b7b8-a8c3d61fa930" 1150 | }, 1151 | "execution_count": 25, 1152 | "outputs": [ 1153 | { 1154 | "output_type": "execute_result", 1155 | "data": { 1156 | "text/plain": [ 1157 | "array([[22]])" 1158 | ] 1159 | }, 1160 | "metadata": {}, 1161 | "execution_count": 25 1162 | } 1163 | ] 1164 | }, 1165 | { 1166 | "cell_type": "markdown", 1167 | "source": [ 1168 | "## Simplification of the matrix product\n", 1169 | "\n", 1170 | "$$ ({AB})^{\\text{T}} = {B}^\\text{T}\n", 1171 | "{A}^\\text{T} $$" 1172 | ], 1173 | "metadata": { 1174 | "id": "_r2EoCVSkFPl" 1175 | } 1176 | }, 1177 | { 1178 | "cell_type": "code", 1179 | "source": [ 1180 | "A = np.array([[2, 3], [1, 4], [7, 6]])\n", 1181 | "A" 1182 | ], 1183 | "metadata": { 1184 | "colab": { 1185 | "base_uri": "https://localhost:8080/" 1186 | }, 1187 | "id": "C_dcnvXrkNjs", 1188 | "outputId": "5ccf9550-9554-4127-f2c4-d7322a916203" 1189 | }, 1190 | "execution_count": 26, 1191 | "outputs": [ 1192 | { 1193 | "output_type": "execute_result", 1194 | "data": { 1195 | "text/plain": [ 1196 | "array([[2, 3],\n", 1197 | " [1, 4],\n", 1198 | " [7, 6]])" 1199 | ] 1200 | }, 1201 | "metadata": {}, 1202 | "execution_count": 26 1203 | } 1204 | ] 1205 | }, 1206 | { 1207 | "cell_type": "code", 1208 | "source": [ 1209 | "B = np.array([[5, 3], [2, 2]])\n", 1210 | "B" 1211 | ], 1212 | 
"metadata": { 1213 | "colab": { 1214 | "base_uri": "https://localhost:8080/" 1215 | }, 1216 | "id": "O20MT5lPkPGb", 1217 | "outputId": "b854b30c-c346-4f3c-bb2f-bf494716124c" 1218 | }, 1219 | "execution_count": 27, 1220 | "outputs": [ 1221 | { 1222 | "output_type": "execute_result", 1223 | "data": { 1224 | "text/plain": [ 1225 | "array([[5, 3],\n", 1226 | " [2, 2]])" 1227 | ] 1228 | }, 1229 | "metadata": {}, 1230 | "execution_count": 27 1231 | } 1232 | ] 1233 | }, 1234 | { 1235 | "cell_type": "markdown", 1236 | "source": [ 1237 | "$({AB})^{\\text{T}}$" 1238 | ], 1239 | "metadata": { 1240 | "id": "1RjMq44vkSB9" 1241 | } 1242 | }, 1243 | { 1244 | "cell_type": "code", 1245 | "source": [ 1246 | "AB_t = A.dot(B).T\n", 1247 | "AB_t" 1248 | ], 1249 | "metadata": { 1250 | "colab": { 1251 | "base_uri": "https://localhost:8080/" 1252 | }, 1253 | "id": "4r4OXkeQkQg8", 1254 | "outputId": "37f19028-fbb2-40b2-95c7-82a3e68300f8" 1255 | }, 1256 | "execution_count": 28, 1257 | "outputs": [ 1258 | { 1259 | "output_type": "execute_result", 1260 | "data": { 1261 | "text/plain": [ 1262 | "array([[16, 13, 47],\n", 1263 | " [12, 11, 33]])" 1264 | ] 1265 | }, 1266 | "metadata": {}, 1267 | "execution_count": 28 1268 | } 1269 | ] 1270 | }, 1271 | { 1272 | "cell_type": "markdown", 1273 | "source": [ 1274 | "is equivalent to ${B}^\\text{T}{A}^\\text{T}$\n", 1275 | ": " 1276 | ], 1277 | "metadata": { 1278 | "id": "ufbWCJv4kXbX" 1279 | } 1280 | }, 1281 | { 1282 | "cell_type": "code", 1283 | "source": [ 1284 | "B_tA = B.T.dot(A.T)\n", 1285 | "B_tA" 1286 | ], 1287 | "metadata": { 1288 | "colab": { 1289 | "base_uri": "https://localhost:8080/" 1290 | }, 1291 | "id": "AvbZumkBkV26", 1292 | "outputId": "eafa1e99-9b22-47da-9ef3-7bc471d4db8e" 1293 | }, 1294 | "execution_count": 29, 1295 | "outputs": [ 1296 | { 1297 | "output_type": "execute_result", 1298 | "data": { 1299 | "text/plain": [ 1300 | "array([[16, 13, 47],\n", 1301 | " [12, 11, 33]])" 1302 | ] 1303 | }, 1304 | "metadata": {}, 1305 | 
"execution_count": 29 1306 | } 1307 | ] 1308 | }, 1309 | { 1310 | "cell_type": "code", 1311 | "source": [], 1312 | "metadata": { 1313 | "id": "LB8uy02-kdsi" 1314 | }, 1315 | "execution_count": null, 1316 | "outputs": [] 1317 | } 1318 | ] 1319 | } -------------------------------------------------------------------------------- /Basic Numerical Linear Algebra/3-Identity_and_Inverse_Matrices.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "colab_type": "text", 7 | "id": "view-in-github" 8 | }, 9 | "source": [ 10 | "\"Open" 11 | ] 12 | }, 13 | { 14 | "attachments": {}, 15 | "cell_type": "markdown", 16 | "metadata": {}, 17 | "source": [ 18 | "Author: [Monit Sharma](https://github.com/MonitSharma),\n", 19 | " LinkedIn: [Monit Sharma](https://www.linkedin.com/in/monitsharma/),\n", 20 | " Twitter: [@MonitSharma1729](https://twitter.com/MonitSharma1729)" 21 | ] 22 | }, 23 | { 24 | "cell_type": "code", 25 | "execution_count": 1, 26 | "metadata": { 27 | "id": "Y0rgY7EhbU_Q" 28 | }, 29 | "outputs": [], 30 | "source": [ 31 | "# importing packages\n", 32 | "import numpy as np\n", 33 | "import matplotlib.pyplot as plt\n", 34 | "import seaborn as sns" 35 | ] 36 | }, 37 | { 38 | "cell_type": "code", 39 | "execution_count": 2, 40 | "metadata": { 41 | "colab": { 42 | "base_uri": "https://localhost:8080/" 43 | }, 44 | "id": "gCxHfYUube-V", 45 | "outputId": "a821d1a5-501f-4490-8bcb-fe18f7f9dc4a" 46 | }, 47 | "outputs": [ 48 | { 49 | "name": "stdout", 50 | "output_type": "stream", 51 | "text": [ 52 | "Populating the interactive namespace from numpy and matplotlib\n" 53 | ] 54 | } 55 | ], 56 | "source": [ 57 | "# plotting parameters\n", 58 | "sns.set()\n", 59 | "%pylab inline\n", 60 | "pylab.rcParams['figure.figsize'] = (4,4)" 61 | ] 62 | }, 63 | { 64 | "cell_type": "code", 65 | "execution_count": 3, 66 | "metadata": { 67 | "id": "ZQIqYsMjbnjn" 68 | }, 69 | "outputs": [], 
70 | "source": [ 71 | "# print floats in fixed-point notation (suppress scientific notation)\n", 72 | "np.set_printoptions(suppress=True)" 73 | ] 74 | }, 75 | { 76 | "cell_type": "markdown", 77 | "metadata": { 78 | "id": "s4_zb6gxbuwK" 79 | }, 80 | "source": [ 81 | "# Introduction\n", 82 | "\n", 83 | "This chapter is light but contains some important definitions. The identity matrix and the inverse of a matrix are concepts that will be very useful in the next chapters. We will see at the end of this chapter that we can solve systems of linear equations by using the inverse matrix. " 84 | ] 85 | }, 86 | { 87 | "cell_type": "markdown", 88 | "metadata": { 89 | "id": "QCono1y1bzR-" 90 | }, 91 | "source": [ 92 | "# Identity and Inverse Matrices\n", 93 | "\n", 94 | "The identity matrix $I_n$\n", 95 | " is a special matrix of shape ($ n \\times n$\n", 96 | ") that is filled with $0$ \n", 97 | " except on the diagonal, which is filled with $1$.\n", 98 | "\n", 99 | "\n", 100 | "----\n", 101 | "\n", 102 | "An identity matrix can be created with the `numpy` function `eye()`\n" 103 | ] 104 | }, 105 | { 106 | "cell_type": "code", 107 | "execution_count": 4, 108 | "metadata": { 109 | "colab": { 110 | "base_uri": "https://localhost:8080/" 111 | }, 112 | "id": "1Gxx5e_ebt29", 113 | "outputId": "a2c3480a-8e17-4590-9515-395a460d4e4b" 114 | }, 115 | "outputs": [ 116 | { 117 | "data": { 118 | "text/plain": [ 119 | "array([[1., 0., 0.],\n", 120 | " [0., 1., 0.],\n", 121 | " [0., 0., 1.]])" 122 | ] 123 | }, 124 | "execution_count": 4, 125 | "metadata": {}, 126 | "output_type": "execute_result" 127 | } 128 | ], 129 | "source": [ 130 | "np.eye(3)" 131 | ] 132 | }, 133 | { 134 | "cell_type": "markdown", 135 | "metadata": { 136 | "id": "9lC2HvfecwpY" 137 | }, 138 | "source": [ 139 | "When the identity matrix is applied to a vector, the result is the same vector:\n", 140 | "\n", 141 | "$$ {I}_n {x} = {x}$$\n", 142 | "\n", 143 | "\n", 144 | "\n", 145 | "\n", 146 | "\n" 147 | ] 148 | }, 149 | { 150 | "cell_type": "markdown", 151 | 
"metadata": { 152 | "id": "nQeJJfCpc8LH" 153 | }, 154 | "source": [ 155 | "## Example 1\n", 156 | "\n", 157 | "$$ \\begin{bmatrix}\n", 158 | " 1 & 0 & 0 \\\\\\\\\n", 159 | " 0 & 1 & 0 \\\\\\\\\n", 160 | " 0 & 0 & 1\n", 161 | "\\end{bmatrix}\n", 162 | "\\times\n", 163 | "\\begin{bmatrix}\n", 164 | " x_{1} \\\\\\\\\n", 165 | " x_{2} \\\\\\\\\n", 166 | " x_{3}\n", 167 | "\\end{bmatrix}=\n", 168 | "\\begin{bmatrix}\n", 169 | " 1 \\times x_1 + 0 \\times x_2 + 0\\times x_3 \\\\\\\\\n", 170 | " 0 \\times x_1 + 1 \\times x_2 + 0\\times x_3 \\\\\\\\\n", 171 | " 0 \\times x_1 + 0 \\times x_2 + 1\\times x_3\n", 172 | "\\end{bmatrix}=\n", 173 | "\\begin{bmatrix}\n", 174 | " x_{1} \\\\\\\\\n", 175 | " x_{2} \\\\\\\\\n", 176 | " x_{3}\n", 177 | "\\end{bmatrix}$$" 178 | ] 179 | }, 180 | { 181 | "cell_type": "code", 182 | "execution_count": 5, 183 | "metadata": { 184 | "colab": { 185 | "base_uri": "https://localhost:8080/" 186 | }, 187 | "id": "6LENmi5tcGAH", 188 | "outputId": "64001c20-9ebf-40ce-bcf3-046c4ff78cc3" 189 | }, 190 | "outputs": [ 191 | { 192 | "data": { 193 | "text/plain": [ 194 | "array([[2],\n", 195 | " [6],\n", 196 | " [3]])" 197 | ] 198 | }, 199 | "execution_count": 5, 200 | "metadata": {}, 201 | "output_type": "execute_result" 202 | } 203 | ], 204 | "source": [ 205 | "x = np.array([[2],[6],[3]])\n", 206 | "\n", 207 | "x" 208 | ] 209 | }, 210 | { 211 | "cell_type": "code", 212 | "execution_count": 6, 213 | "metadata": { 214 | "colab": { 215 | "base_uri": "https://localhost:8080/" 216 | }, 217 | "id": "BZK8jXjxdHfL", 218 | "outputId": "9a600af1-0572-4768-d56a-62b1c40564c2" 219 | }, 220 | "outputs": [ 221 | { 222 | "data": { 223 | "text/plain": [ 224 | "array([[2.],\n", 225 | " [6.],\n", 226 | " [3.]])" 227 | ] 228 | }, 229 | "execution_count": 6, 230 | "metadata": {}, 231 | "output_type": "execute_result" 232 | } 233 | ], 234 | "source": [ 235 | "xid = np.eye(x.shape[0]).dot(x)\n", 236 | "xid" 237 | ] 238 | }, 239 | { 240 | "cell_type": "markdown", 241 | 
"metadata": { 242 | "id": "WnqfaSy9dQXq" 243 | }, 244 | "source": [ 245 | "## Intuition\n", 246 | "\n", 247 | "You can think of a matrix as a way to transform objects in a $n$\n", 248 | "-dimensional space. It applies a linear transformation of the space. We can say that we apply a matrix to an element: this means that we do the dot product between this matrix and the element. We will see this notion thoroughly in the next chapters but the identity matrix is a good first example. It is a particular example because the space doesn't change when we apply the identity matrix to it.\n", 249 | "\n", 250 | "-----\n", 251 | "\n", 252 | "The space doesn't change when we *apply* the identity matrix to it\n", 253 | "We saw that $x$\n", 254 | " was not altered after being multiplied by $I$\n", 255 | "." 256 | ] 257 | }, 258 | { 259 | "cell_type": "markdown", 260 | "metadata": { 261 | "id": "ipUEHDOtdrKN" 262 | }, 263 | "source": [ 264 | "# Inverse Matrices\n", 265 | "\n", 266 | "The matrix inverse of ${A}$\n", 267 | " is denoted ${A}^{-1}$\n", 268 | ". It is the matrix that results in the identity matrix when it is multiplied by \n", 269 | ":\n", 270 | "$$ {A}^{-1}{A}={I}_n$$\n", 271 | "\n", 272 | "\n", 273 | "----\n", 274 | "\n", 275 | "This means that if we apply a linear transformation to the space with $A$ , it is possible to go back with $A^{-1}$. It provides a way to cancel the transformation.\n", 276 | "\n" 277 | ] 278 | }, 279 | { 280 | "cell_type": "markdown", 281 | "metadata": { 282 | "id": "bW48vbGreQ7w" 283 | }, 284 | "source": [ 285 | "## Example 2\n", 286 | "\n", 287 | "$$ {A}=\\begin{bmatrix}\n", 288 | " 3 & 0 & 2 \\\\\\\\\n", 289 | " 2 & 0 & -2 \\\\\\\\\n", 290 | " 0 & 1 & 1\n", 291 | "\\end{bmatrix}$$\n", 292 | "\n", 293 | "For this example, we will use `numpy` function `linalg.inv()` to calculate the inverse of $A$." 
294 | ] 295 | }, 296 | { 297 | "cell_type": "code", 298 | "execution_count": 7, 299 | "metadata": { 300 | "colab": { 301 | "base_uri": "https://localhost:8080/" 302 | }, 303 | "id": "BZXDQRkjdLwp", 304 | "outputId": "2aec6513-78cf-41e9-df58-95d7cb2c1ce4" 305 | }, 306 | "outputs": [ 307 | { 308 | "data": { 309 | "text/plain": [ 310 | "array([[ 3, 0, 2],\n", 311 | " [ 2, 0, -2],\n", 312 | " [ 0, 1, 1]])" 313 | ] 314 | }, 315 | "execution_count": 7, 316 | "metadata": {}, 317 | "output_type": "execute_result" 318 | } 319 | ], 320 | "source": [ 321 | "A = np.array([[3, 0, 2], [2, 0, -2], [0, 1, 1]])\n", 322 | "A" 323 | ] 324 | }, 325 | { 326 | "cell_type": "markdown", 327 | "metadata": { 328 | "id": "yX2xc2tQeh6Q" 329 | }, 330 | "source": [ 331 | "Now, we calculate its inverse" 332 | ] 333 | }, 334 | { 335 | "cell_type": "code", 336 | "execution_count": 8, 337 | "metadata": { 338 | "colab": { 339 | "base_uri": "https://localhost:8080/" 340 | }, 341 | "id": "7rb9blcAeglE", 342 | "outputId": "62bd3d94-1d9d-40e5-fe33-8ec12832b20b" 343 | }, 344 | "outputs": [ 345 | { 346 | "data": { 347 | "text/plain": [ 348 | "array([[ 0.2, 0.2, 0. ],\n", 349 | " [-0.2, 0.3, 1. ],\n", 350 | " [ 0.2, -0.3, -0. 
]])" 351 | ] 352 | }, 353 | "execution_count": 8, 354 | "metadata": {}, 355 | "output_type": "execute_result" 356 | } 357 | ], 358 | "source": [ 359 | "A_inv = np.linalg.inv(A)\n", 360 | "A_inv" 361 | ] 362 | }, 363 | { 364 | "cell_type": "markdown", 365 | "metadata": { 366 | "id": "VRi1oE_VepHt" 367 | }, 368 | "source": [ 369 | "We can check with Python that $A^{-1}$ is indeed the inverse of $A$:" 370 | ] 371 | }, 372 | { 373 | "cell_type": "code", 374 | "execution_count": 9, 375 | "metadata": { 376 | "colab": { 377 | "base_uri": "https://localhost:8080/" 378 | }, 379 | "id": "T_beTLFhenu7", 380 | "outputId": "b47ab128-5747-430c-f84f-41c1fb0ddcd0" 381 | }, 382 | "outputs": [ 383 | { 384 | "data": { 385 | "text/plain": [ 386 | "array([[ 1., 0., -0.],\n", 387 | " [ 0., 1., 0.],\n", 388 | " [ 0., 0., 1.]])" 389 | ] 390 | }, 391 | "execution_count": 9, 392 | "metadata": {}, 393 | "output_type": "execute_result" 394 | } 395 | ], 396 | "source": [ 397 | "A_bis = A_inv.dot(A)\n", 398 | "A_bis" 399 | ] 400 | }, 401 | { 402 | "cell_type": "markdown", 403 | "metadata": { 404 | "id": "iJ-zzWc3e0n2" 405 | }, 406 | "source": [ 407 | "We will see that the inverse of a matrix can be very useful, for instance to solve a system of linear equations. Note, however, that non-square matrices (matrices with more columns than rows or more rows than columns) do not have an inverse." 
408 | ] 409 | }, 410 | { 411 | "cell_type": "markdown", 412 | "metadata": { 413 | "id": "RcQwpdoTe3LJ" 414 | }, 415 | "source": [ 416 | "# Solving a system of linear equations\n", 417 | "\n", 418 | "The inverse matrix can be used to solve the equation ${Ax}={b}$\n", 419 | " by left-multiplying both sides by it:\n", 420 | "\n", 421 | " $$ {A}^{-1}{Ax}={A}^{-1}{b}$$\n", 422 | "\n", 423 | " ------\n", 424 | "\n", 425 | "Since we know by definition that ${A}^{-1}{A}={I}_n$, we have\n", 426 | "\n", 427 | "$$ {I}_n{x}={A}^{-1}{b}$$\n", 428 | "\n", 429 | "------\n", 430 | "\n", 431 | "We saw that a vector is not changed when multiplied by the identity matrix. So we can write:\n", 432 | "\n", 433 | "$${x}={A}^{-1}{b}$$\n", 434 | "\n", 435 | "This is great! We can solve a system of linear equations just by computing the inverse of $A$\n", 436 | " and applying this matrix to the right-hand-side vector $b$!" 437 | ] 438 | }, 439 | { 440 | "cell_type": "markdown", 441 | "metadata": { 442 | "id": "UihRRN04gSRW" 443 | }, 444 | "source": [ 445 | "## Example 3\n", 446 | "\n", 447 | "We will take a simple solvable example:\n", 448 | "\n", 449 | "$$ \\begin{cases}\n", 450 | "y = 2x \\\\\\\\\n", 451 | "y = -x +3\n", 452 | "\\end{cases}$$\n", 453 | "\n", 454 | "\n", 455 | "-----\n", 456 | "\n", 457 | "We will use the same notation as before:\n", 458 | "\n", 459 | "$$ \\begin{cases}\n", 460 | "A_{1,1}x_1 + A_{1,2}x_2 = b_1 \\\\\\\\\n", 461 | "A_{2,1}x_1 + A_{2,2}x_2= b_2\n", 462 | "\\end{cases}$$\n", 463 | "\n", 464 | "\n", 465 | "----\n", 466 | "\n", 467 | "Here $x_1$ corresponds to $x$ and $x_2$ corresponds to $y$. 
So we have:\n", 468 | "\n", 469 | "$$ \\begin{cases}\n", 470 | "2x_1 - x_2 = 0 \\\\\\\\\n", 471 | "x_1 + x_2= 3\n", 472 | "\\end{cases}$$\n", 473 | "\n", 474 | "Our matrix $A$ of coefficients is:\n", 475 | "\n", 476 | "$$ {A}=\n", 477 | "\\begin{bmatrix}\n", 478 | " 2 & -1 \\\\\\\\\n", 479 | " 1 & 1\n", 480 | "\\end{bmatrix}$$\n", 481 | "\n", 482 | "And the vector $b$ containing the right-hand sides of the individual equations is:\n", 483 | "\n", 484 | "$$ {b}=\n", 485 | "\\begin{bmatrix}\n", 486 | " 0 \\\\\\\\\n", 487 | " 3\n", 488 | "\\end{bmatrix}$$\n", 489 | "\n", 490 | "-----\n", 491 | "\n", 492 | "In matrix form, our system becomes:\n", 493 | "\n", 494 | "$$ \\begin{bmatrix}\n", 495 | " 2 & -1 \\\\\\\\\n", 496 | " 1 & 1\n", 497 | "\\end{bmatrix}\n", 498 | "\\begin{bmatrix}\n", 499 | " x_1 \\\\\\\\\n", 500 | " x_2\n", 501 | "\\end{bmatrix}=\n", 502 | "\\begin{bmatrix}\n", 503 | " 0 \\\\\\\\\n", 504 | " 3\n", 505 | "\\end{bmatrix} $$\n", 506 | "\n", 507 | "\n", 508 | "\n", 509 | "Let's find the inverse of $A$:" 510 | ] 511 | }, 512 | { 513 | "cell_type": "code", 514 | "execution_count": 10, 515 | "metadata": { 516 | "colab": { 517 | "base_uri": "https://localhost:8080/" 518 | }, 519 | "id": "ANvD__7mexRd", 520 | "outputId": "d7a2d725-1427-464a-9f51-ed81f3d68307" 521 | }, 522 | "outputs": [ 523 | { 524 | "data": { 525 | "text/plain": [ 526 | "array([[ 2, -1],\n", 527 | " [ 1, 1]])" 528 | ] 529 | }, 530 | "execution_count": 10, 531 | "metadata": {}, 532 | "output_type": "execute_result" 533 | } 534 | ], 535 | "source": [ 536 | "A = np.array([[2, -1], [1, 1]])\n", 537 | "A" 538 | ] 539 | }, 540 | { 541 | "cell_type": "code", 542 | "execution_count": 11, 543 | "metadata": { 544 | "colab": { 545 | "base_uri": "https://localhost:8080/" 546 | }, 547 | "id": "7DNbKGbBg_9G", 548 | "outputId": "df2b71a0-2dbe-41db-ac51-e10ec3f08c45" 549 | }, 550 | "outputs": [ 551 | { 552 | "data": { 553 | "text/plain": [ 554 | "array([[ 0.33333333, 0.33333333],\n", 555 | " [-0.33333333, 
0.66666667]])" 556 | ] 557 | }, 558 | "execution_count": 11, 559 | "metadata": {}, 560 | "output_type": "execute_result" 561 | } 562 | ], 563 | "source": [ 564 | "A_inv = np.linalg.inv(A)\n", 565 | "A_inv" 566 | ] 567 | }, 568 | { 569 | "cell_type": "markdown", 570 | "metadata": { 571 | "id": "SAL8g2W-hNIW" 572 | }, 573 | "source": [ 574 | "We also have the vector $b$:" 575 | ] 576 | }, 577 | { 578 | "cell_type": "code", 579 | "execution_count": 12, 580 | "metadata": { 581 | "id": "9SxwPA98hFOO" 582 | }, 583 | "outputs": [], 584 | "source": [ 585 | "b = np.array([[0], [3]])" 586 | ] 587 | }, 588 | { 589 | "cell_type": "markdown", 590 | "metadata": { 591 | "id": "OU-phRb0hUvZ" 592 | }, 593 | "source": [ 594 | "Applying $A^{-1}$ to $b$ gives:" 595 | ] 596 | }, 597 | { 598 | "cell_type": "code", 599 | "execution_count": 13, 600 | "metadata": { 601 | "colab": { 602 | "base_uri": "https://localhost:8080/" 603 | }, 604 | "id": "aMPYSQRHhQh7", 605 | "outputId": "1c1a09b5-8340-4c1e-fe62-9d9da16a1ebd" 606 | }, 607 | "outputs": [ 608 | { 609 | "data": { 610 | "text/plain": [ 611 | "array([[1.],\n", 612 | " [2.]])" 613 | ] 614 | }, 615 | "execution_count": 13, 616 | "metadata": {}, 617 | "output_type": "execute_result" 618 | } 619 | ], 620 | "source": [ 621 | "x = A_inv.dot(b)\n", 622 | "x" 623 | ] 624 | }, 625 | { 626 | "cell_type": "markdown", 627 | "metadata": { 628 | "id": "qMsz1Bj5hZFB" 629 | }, 630 | "source": [ 631 | "This is our solution!\n", 632 | "\n", 633 | "$$ {x}=\n", 634 | "\\begin{bmatrix}\n", 635 | " 1 \\\\\\\\\n", 636 | " 2\n", 637 | "\\end{bmatrix}$$\n", 638 | "\n", 639 | " \n", 640 | " \n", 641 | "This means that the point with coordinates (1, 2) is the solution and lies at the intersection of the lines representing the equations. 
Let's plot them to check this solution:\n" 642 | ] 643 | }, 644 | { 645 | "cell_type": "code", 646 | "execution_count": 14, 647 | "metadata": { 648 | "colab": { 649 | "base_uri": "https://localhost:8080/", 650 | "height": 272 651 | }, 652 | "id": "TW9EXjW8hX_5", 653 | "outputId": "7eb185d9-08d9-4f00-d90a-4885e7a6fab7" 654 | }, 655 | "outputs": [ 656 | { 657 | "data": { 658 | "image/png": "", 659 | "text/plain": [ 660 | "
    " 661 | ] 662 | }, 663 | "metadata": { 664 | "needs_background": "light" 665 | }, 666 | "output_type": "display_data" 667 | } 668 | ], 669 | "source": [ 670 | "x = np.arange(-10, 10)\n", 671 | "y = 2*x\n", 672 | "y1 = -x + 3\n", 673 | "\n", 674 | "plt.figure()\n", 675 | "plt.plot(x, y)\n", 676 | "plt.plot(x, y1)\n", 677 | "plt.xlim(0, 3)\n", 678 | "plt.ylim(0, 3)\n", 679 | "# draw axes\n", 680 | "plt.axvline(x=0, color='grey')\n", 681 | "plt.axhline(y=0, color='grey')\n", 682 | "plt.show()\n", 683 | "plt.close()" 684 | ] 685 | }, 686 | { 687 | "cell_type": "markdown", 688 | "metadata": { 689 | "id": "5yIfJ8C2hmhI" 690 | }, 691 | "source": [ 692 | "We can see that the solution (corresponding to the line crossing) is when $x=1$\n", 693 | " and $y=2$\n", 694 | ". It confirms what we found with the matrix inversion!" 695 | ] 696 | }, 697 | { 698 | "cell_type": "markdown", 699 | "metadata": { 700 | "id": "QST6reSAhr1Q" 701 | }, 702 | "source": [ 703 | "## Draw an equation\n", 704 | "\n", 705 | "To draw the equation with Matplotlib, we first need to create a vector with all the $x$\n", 706 | " values. Actually, since this is a line, only two points would have been sufficient. But with more complex functions, the length of the vector $x$ \n", 707 | " corresponds to the sampling rate. So here we used the Numpy function `arrange()` to create a vector from $-10$\n", 708 | " to $10$\n", 709 | " (not included)." 
710 | ] 711 | }, 712 | { 713 | "cell_type": "code", 714 | "execution_count": 15, 715 | "metadata": { 716 | "colab": { 717 | "base_uri": "https://localhost:8080/" 718 | }, 719 | "id": "u5TDKW7vhkUe", 720 | "outputId": "4a91cedf-147e-4a7f-81f7-b9f5271bcb23" 721 | }, 722 | "outputs": [ 723 | { 724 | "data": { 725 | "text/plain": [ 726 | "array([-10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2,\n", 727 | " 3, 4, 5, 6, 7, 8, 9])" 728 | ] 729 | }, 730 | "execution_count": 15, 731 | "metadata": {}, 732 | "output_type": "execute_result" 733 | } 734 | ], 735 | "source": [ 736 | "np.arange(-10, 10)" 737 | ] 738 | }, 739 | { 740 | "cell_type": "markdown", 741 | "metadata": { 742 | "id": "CkZYrSP0iCoe" 743 | }, 744 | "source": [ 745 | "The first argument is the starting point and the second is the ending point (excluded). You can add a third argument to specify the step:" 746 | ] 747 | }, 748 | { 749 | "cell_type": "code", 750 | "execution_count": 16, 751 | "metadata": { 752 | "colab": { 753 | "base_uri": "https://localhost:8080/" 754 | }, 755 | "id": "0fjXY_Bsh_ck", 756 | "outputId": "d454aa42-f406-4d08-9dbf-d65445f1921a" 757 | }, 758 | "outputs": [ 759 | { 760 | "data": { 761 | "text/plain": [ 762 | "array([-10, -8, -6, -4, -2, 0, 2, 4, 6, 8])" 763 | ] 764 | }, 765 | "execution_count": 16, 766 | "metadata": {}, 767 | "output_type": "execute_result" 768 | } 769 | ], 770 | "source": [ 771 | "np.arange(-10, 10, 2)" 772 | ] 773 | }, 774 | { 775 | "cell_type": "markdown", 776 | "metadata": { 777 | "id": "4mZMn2ntiMqk" 778 | }, 779 | "source": [ 780 | "Then we create a second vector $y$\n", 781 | " that depends on the $x$\n", 782 | " vector. NumPy will take each value of $x$\n", 783 | " and apply the equation formula to it." 
784 | ] 785 | }, 786 | { 787 | "cell_type": "code", 788 | "execution_count": 17, 789 | "metadata": { 790 | "colab": { 791 | "base_uri": "https://localhost:8080/" 792 | }, 793 | "id": "wTSc3-o7iJ0o", 794 | "outputId": "2d56b3d1-cbfa-4c29-cf62-4b0472672f22" 795 | }, 796 | "outputs": [ 797 | { 798 | "data": { 799 | "text/plain": [ 800 | "array([-19, -17, -15, -13, -11, -9, -7, -5, -3, -1, 1, 3, 5,\n", 801 | " 7, 9, 11, 13, 15, 17, 19])" 802 | ] 803 | }, 804 | "execution_count": 17, 805 | "metadata": {}, 806 | "output_type": "execute_result" 807 | } 808 | ], 809 | "source": [ 810 | "x = np.arange(-10, 10)\n", 811 | "y = 2*x + 1\n", 812 | "y" 813 | ] 814 | }, 815 | { 816 | "cell_type": "markdown", 817 | "metadata": { 818 | "id": "y1WTUnv4iWPk" 819 | }, 820 | "source": [ 821 | "## Singular Matrices\n", 822 | "\n", 823 | "Some matrices are not invertible. They are called **singular**." 824 | ] 825 | }, 826 | { 827 | "cell_type": "markdown", 828 | "metadata": { 829 | "id": "eGs-lPy6icqS" 830 | }, 831 | "source": [ 832 | "## Conclusion\n", 833 | "\n", 834 | "Singular matrices lead to different cases depending on the linear system, because ${A}^{-1}$ \n", 835 | " exists only if the equation ${Ax}={b}$\n", 836 | " has one and only one solution." 837 | ] 838 | } 839 | ], 840 | "metadata": { 841 | "colab": { 842 | "authorship_tag": "ABX9TyNMZ0cBltSj4PIT+susezD7", 843 | "include_colab_link": true, 844 | "provenance": [] 845 | }, 846 | "kernelspec": { 847 | "display_name": "Python 3", 848 | "name": "python3" 849 | }, 850 | "language_info": { 851 | "name": "python" 852 | } 853 | }, 854 | "nbformat": 4, 855 | "nbformat_minor": 0 856 | } 857 | -------------------------------------------------------------------------------- /GIFs/RealisticPlayfulCygnet.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Numerical-Linear-Algebra/7c70ad485922a4dc4925c57b221d0b21fecbe035/GIFs/RealisticPlayfulCygnet.gif 
-------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2023 Monit Sharma 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /Readme.md: -------------------------------------------------------------------------------- 1 | [←←Back to Homepage](https://monitsharma.github.io/) 2 | 3 | 4 | # Learn Linear Algebra via Programming 5 | 6 | 7 | This repository is aimed at providing an introduction to the basics of linear algebra and advanced computational numerical linear algebra with a focus on applications in quantum computing. 8 | 9 | 10 | Quantum computing is a rapidly growing field that has the potential to revolutionize the way we process and analyze information. 
Linear algebra forms an essential part of the mathematical framework used in quantum computing. In this repository, we explore the role of linear algebra in quantum computing, including its importance in representing quantum states, quantum gates, and quantum algorithms. 11 | 12 | 13 | Check out the full blog series [here](https://medium.com/@_monitsharma/list/computational-linear-algebra-c25866dd2935) 14 | 15 | 16 | 17 | ## Table 18 | 19 | - [Introductory Linear Algebra](#introduction-to-linear-algebra) 20 | - [Advanced Linear Algebra](#use-cases-of-linear-algebra) 21 | - [Linear Algebra for Quantum Computing](#applications-in-quantum-computing) 22 | 23 | 24 | 25 | ## Content 26 | 27 | 28 | ### Introduction to Linear Algebra 29 | This section of the repository covers the basics of linear algebra. It includes topics such as vectors, matrices, linear transformations, and eigenvalues/eigenvectors. The content is presented in a way that is accessible to beginners, with examples and exercises to solidify understanding. 
30 | 31 | 32 | | Serial Number | Title | Description | Links | Medium | 33 | | ------------- | ----------------------------------------- | --------------------------------------------------- | ----------------------------------------------------------------------------------------- |------------------------------------| 34 | | 1 | Scalars, vectors, matrices and tensors | Introduction to basic concepts in linear algebra | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/Basic%20Numerical%20Linear%20Algebra/1-Scalars%2C_Vectors%2C_Matrices_and_Tensors.ipynb) | [![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/@_monitsharma/computational-linear-algebra-scalars-vectors-matrices-and-tensors-50e392df9ccc) | 35 | | 2 | Multiplying matrices and vectors | Understanding how matrix multiplication works | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/Basic%20Numerical%20Linear%20Algebra/2-Multiplying_Matrices_and_Vectors.ipynb) | [![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/@_monitsharma/computational-linear-algebra-multiplication-and-identity-a2391370d713) | 36 | | 3 | Identity and inverse matrices | Explanation of identity and inverse matrices | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/Basic%20Numerical%20Linear%20Algebra/3-Identity_and_Inverse_Matrices.ipynb) |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/@_monitsharma/computational-linear-algebra-multiplication-and-identity-a2391370d713) | 37 | | 4 | Linear dependence and span | Understanding linear dependence 
and span of vectors | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/Basic%20Numerical%20Linear%20Algebra/4-Linear_Dependence_and_Span.ipynb) | [![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/@_monitsharma/computational-linear-algebra-linear-dependence-and-span-62fc0393d247) | 38 | | 5 | Norms | Definition and examples of vector norms | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/Basic%20Numerical%20Linear%20Algebra/5-Norms.ipynb) |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/@_monitsharma/computational-linear-algebra-norms-and-special-kind-of-matrices-c5dcff7ae3d3) | 39 | | 6 | Special kind of matrices | Introduction to special types of matrices | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/Basic%20Numerical%20Linear%20Algebra/6-Special_Kind_of_Matrices_and_Vectors.ipynb) |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/@_monitsharma/computational-linear-algebra-norms-and-special-kind-of-matrices-c5dcff7ae3d3) | 40 | | 7 | Eigendecomposition | Understanding eigenvectors and eigenvalues | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/Basic%20Numerical%20Linear%20Algebra/7-Eigendecomposition.ipynb) |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/@_monitsharma/computational-linear-algebra-eigendecompositon-9942ccc746c0) | 41 | | 8 | Singular value decomposition | Introduction to 
singular value decomposition | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/Basic%20Numerical%20Linear%20Algebra/8-Singular_Value_Decomposition.ipynb) | [![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/@_monitsharma/list/computational-linear-algebra-c25866dd2935) | 42 | | 9 | The Moore-Penrose pseudoinverse | Definition and applications of the pseudoinverse | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/Basic%20Numerical%20Linear%20Algebra/9-The_Moore_Penrose_Pseudoinverse.ipynb) |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/@_monitsharma/computational-linear-algebra-moore-penrose-pseudo-inverse-4e18ddfa7d9c) | 43 | | 10 | The trace operator | Definition and properties of the trace operator | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/Basic%20Numerical%20Linear%20Algebra/10-The_Trace_Operator.ipynb) |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/@_monitsharma/computational-linear-algebra-trace-and-determinant-d4c01523cb6c) | 44 | | 11 | The determinant | Explanation of the determinant of a matrix | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/Basic%20Numerical%20Linear%20Algebra/11-The_Determinant.ipynb) |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/@_monitsharma/computational-linear-algebra-trace-and-determinant-d4c01523cb6c) | 45 | 46 | 47 | --- 48 | 49 | 50 | 51 | 52 | ### 
Use cases of Linear Algebra 53 | This section of the repository will cover advanced topics in computational numerical linear algebra. It will include topics such as singular value decomposition (SVD), QR decomposition, and LU decomposition. The content will be presented with a focus on their applications in quantum computing, and will include exercises and projects to solidify understanding. 54 | 55 | 56 | | Serial Number | Title | Description | Links | Medium | 57 | | ------------- | ------------------------------------ | --------------------------------------------------------------------- | ----------------------------------------------------------------------------------------- |------------------------------------| 58 | | 1 | Introduction to matrices | Overview of matrices and their properties, operations, and applications | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/Advanced%20Numerical%20Linear%20Algebra/week_1_introduction_to_matrices.ipynb) |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/@_monitsharma/computational-linear-algebra-scalars-vectors-matrices-and-tensors-50e392df9ccc) | 59 | | 2 | Singular Value Decomposition | Introduction to singular value decomposition and its applications | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/Advanced%20Numerical%20Linear%20Algebra/week_2_singular_value_decomposition.ipynb) |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/@_monitsharma/computational-linear-algebra-singular-value-decomposition-3150b5b19987) | 60 | | 3 | Topic Modelling with NMF | Explanation of Non-negative Matrix Factorization for topic modeling | [![Open In 
Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/Advanced%20Numerical%20Linear%20Algebra/week_3_topic_modeling_with_NMF_and_SVD.ipynb) |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/@_monitsharma/computational-linear-algebra-topic-modelling-with-nmf-and-svd-d07c83e4f006) | 61 | | 4 | Background Removal | Techniques for removing the background from images using linear algebra | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/Advanced%20Numerical%20Linear%20Algebra/week_4_background_removal_with_robust_PCA.ipynb) |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/@_monitsharma/computational-linear-algebra-background-removal-with-principle-component-analysis-c0bed850b6c) | 62 | | 5 | Compressed Sensing CT scans | Using compressed sensing for faster and more efficient CT scans | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/Advanced%20Numerical%20Linear%20Algebra/week_5_compressed_sensing_CT_scans_robust_regression.ipynb) |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/@_monitsharma/computational-linear-algebra-compressed-sensing-of-ct-scans-with-robust-regression-d64e745c06a2) | 63 | | 6 | Health outcomes with linear algebra | Applications of linear algebra in healthcare and medical research | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/Advanced%20Numerical%20Linear%20Algebra/week_6_health_outcomes_with_linear_regression.ipynb) 
|[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/@_monitsharma/computational-linear-algebra-linear-regression-in-health-sector-b23bcee7dcd6) | 64 | | 7 | Linear regression | Introduction to linear regression and its applications | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/Advanced%20Numerical%20Linear%20Algebra/week_7_how_to_implement_linear_regression.ipynb) |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/@_monitsharma/computational-linear-algebra-how-to-implement-linear-regression-c4ef19741a4d) | 65 | | 8 | Page rank with eigen decomposition | Explanation of PageRank algorithm using eigendecomposition | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/MonitSharma/Numerical-Linear-Algebra/blob/main/Advanced%20Numerical%20Linear%20Algebra/week_8_page_rank_with_eigen_decomposition.ipynb) |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/@_monitsharma/computational-linear-algebra-page-rank-with-eigen-decompositions-d9ec02c2490a) | 66 | 67 | 68 | 69 | --- 70 | 71 | 72 | ### Applications in Quantum Computing 73 | 74 | | Serial Number | Title | Description | Links | Medium | 75 | | ------------- | ------------------------------------ | --------------------------------------------------------------------- | ----------------------------------------------------------------------------------------- |------------------------------------| 76 | | 1 | Introduction to Complex Arithmetic | This is a tutorial designed to introduce you to complex arithmetic. 
| [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/MonitSharma/Learn-Quantum-Computing-with-Qiskit/blob/main/Complex%20Arithmetic.ipynb) |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/@_monitsharma/learn-quantum-computing-with-qiskit-complex-arithmetic-453d5f15638b) | 77 | | 2 | Introduction to Linear Algebra | This is a tutorial designed to introduce you to linear algebra. | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/MonitSharma/Quantum-Computing-with-Qiskit-and-IBMQ/blob/main/Linear%20Algebra%20Background.ipynb) |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/@_monitsharma/learn-quantum-computing-with-qiskit-linear-algebra-501587c3297d) | 78 | | 3 | Single Qubit Gates | This is a tutorial designed to introduce you to linear algebra in single-qubit gates. | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/MonitSharma/Learn-Quantum-Computing-with-Qiskit/blob/main/Single_Qubit_Gates.ipynb) |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/@_monitsharma/learn-quantum-computing-with-qiskit-single-qubit-gates-17a5a419ed7c) | 79 | | 4 | Multiple Qubits | This is a tutorial designed to introduce you to linear algebra in multiple-qubit operations.
| [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/MonitSharma/Learn-Quantum-Computing-with-Qiskit/blob/main/Multiple_Qubits_and_Entanglement.ipynb) |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/@_monitsharma/learn-quantum-computing-with-qiskit-multiple-qubits-and-entanglement-39b8ffa9b95d) | 80 | 81 | 82 | 83 | 84 | 85 | This section of the repository covers applications of linear algebra and numerical linear algebra in quantum computing, including quantum gates, quantum circuits, and quantum algorithms. The content is aimed at beginners, with examples and exercises to solidify understanding. These topics are covered in more detail in a separate repository. 86 | 87 | 88 | 89 | 90 | 91 | ## Getting Started 92 | These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. 93 | 94 | 95 | 96 | 97 | ## Prerequisites 98 | This project requires Python 3.x and Jupyter Notebook. You can install Python from the official Python website, and install Jupyter Notebook with the following command: 99 | 100 | 101 | ```bash 102 | pip install jupyter 103 | ``` 104 | 105 | 106 | 107 | 108 | ## Installation 109 | To get started, clone the repository: 110 | 111 | ```bash 112 | git clone https://github.com/MonitSharma/Numerical-Linear-Algebra.git 113 | ``` 114 | 115 | 116 | 117 | 118 | ## Usage 119 | The content is organized into directories by topic. The examples and exercises are provided in Jupyter notebooks that can be run on your local machine.
To run the Jupyter notebooks, navigate to the directory containing them and run the following command in a terminal: 120 | 121 | ```bash 122 | jupyter notebook 123 | ``` 124 | 125 | This starts the Jupyter Notebook server and opens a web page in your browser. Click on the notebook you want to open and start exploring the content. 126 | 127 | 128 | 129 | 130 | ## Contributing 131 | Contributions are welcome! Please feel free to open an issue if you find a bug or have a suggestion for improvement. Pull requests are also welcome. 132 | 133 | 134 | 135 | 136 | ## License 137 | This repository is licensed under the MIT License. 138 | 139 | 140 | 141 | 142 | ## Acknowledgements 143 | We would like to thank the following resources for their contribution to this repository: 144 | 145 | 1. Linear Algebra - Khan Academy 146 | 2. Computational Linear Algebra for Coders - fast.ai 147 | 3. Quantum Computing for the Very Curious - IBM 148 | --------------------------------------------------------------------------------