├── .gitignore ├── Dockerfile ├── LICENSE ├── README.md ├── img ├── ML-house.png └── Oboe.jpg ├── installation ├── README.md └── let_jovyan_write.sh └── notebooks ├── 1-intro-to-linear-algebra.ipynb ├── 2-linear-algebra-ii.ipynb ├── 3-calculus-i.ipynb ├── 4-calculus-ii.ipynb ├── 5-probability.ipynb ├── 6-statistics.ipynb ├── 7-algos-and-data-structures.ipynb ├── 8-optimization.ipynb ├── SGD-from-scratch.ipynb ├── artificial-neurons.ipynb ├── batch-regression-gradient.ipynb ├── gradient-descent-from-scratch.ipynb ├── learning-rate-scheduling.ipynb ├── regression-in-pytorch.ipynb └── single-point-regression-gradient.ipynb /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | pip-wheel-metadata/ 24 | share/python-wheels/ 25 | *.egg-info/ 26 | .installed.cfg 27 | *.egg 28 | MANIFEST 29 | 30 | # PyInstaller 31 | # Usually these files are written by a python script from a template 32 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 
33 | *.manifest 34 | *.spec 35 | 36 | # Installer logs 37 | pip-log.txt 38 | pip-delete-this-directory.txt 39 | 40 | # Unit test / coverage reports 41 | htmlcov/ 42 | .tox/ 43 | .nox/ 44 | .coverage 45 | .coverage.* 46 | .cache 47 | nosetests.xml 48 | coverage.xml 49 | *.cover 50 | *.py,cover 51 | .hypothesis/ 52 | .pytest_cache/ 53 | 54 | # Translations 55 | *.mo 56 | *.pot 57 | 58 | # Django stuff: 59 | *.log 60 | local_settings.py 61 | db.sqlite3 62 | db.sqlite3-journal 63 | 64 | # Flask stuff: 65 | instance/ 66 | .webassets-cache 67 | 68 | # Scrapy stuff: 69 | .scrapy 70 | 71 | # Sphinx documentation 72 | docs/_build/ 73 | 74 | # PyBuilder 75 | target/ 76 | 77 | # Jupyter Notebook 78 | .ipynb_checkpoints 79 | 80 | # IPython 81 | profile_default/ 82 | ipython_config.py 83 | 84 | # pyenv 85 | .python-version 86 | 87 | # pipenv 88 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 89 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 90 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 91 | # install all needed dependencies. 92 | #Pipfile.lock 93 | 94 | # PEP 582; used by e.g. 
github.com/David-OConnor/pyflow 95 | __pypackages__/ 96 | 97 | # Celery stuff 98 | celerybeat-schedule 99 | celerybeat.pid 100 | 101 | # SageMath parsed files 102 | *.sage.py 103 | 104 | # Environments 105 | .env 106 | .venv 107 | env/ 108 | venv/ 109 | ENV/ 110 | env.bak/ 111 | venv.bak/ 112 | 113 | # Spyder project settings 114 | .spyderproject 115 | .spyproject 116 | 117 | # Rope project settings 118 | .ropeproject 119 | 120 | # mkdocs documentation 121 | /site 122 | 123 | # mypy 124 | .mypy_cache/ 125 | .dmypy.json 126 | dmypy.json 127 | 128 | # Pyre type checker 129 | .pyre/ 130 | -------------------------------------------------------------------------------- /Dockerfile: -------------------------------------------------------------------------------- 1 | # Details of the base image are here: hub.docker.com/r/jupyter/scipy-notebook 2 | # Tag [29f53f8b9927] is latest image as of Apr 23, 2020 3 | 4 | FROM jupyter/scipy-notebook:29f53f8b9927 5 | 6 | MAINTAINER Jon Krohn 7 | 8 | USER $NB_USER 9 | 10 | # Install TensorFlow: 11 | RUN pip install tensorflow==2.2.0rc3 12 | 13 | # Install PyTorch: 14 | RUN pip install torch==1.4.0 15 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2020 Jon Krohn 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 
14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Machine Learning Foundations 2 | 3 | This repo is home to the code that accompanies Jon Krohn's *Machine Learning Foundations* curriculum, which provides a comprehensive overview of all of the subjects — across mathematics, statistics, and computer science — that underlie contemporary machine learning approaches, including deep learning and other artificial intelligence techniques. 4 | 5 | There are eight subjects in the curriculum, organized into four subject areas. 
See the "Machine Learning House" section below for detail on why these are the essential foundational subject areas: 6 | 7 | * **Linear Algebra** 8 | * 1: [Intro to Linear Algebra](https://github.com/jonkrohn/ML-foundations/blob/master/notebooks/1-intro-to-linear-algebra.ipynb) 9 | * 2: [Linear Algebra II: Matrix Operations](https://github.com/jonkrohn/ML-foundations/blob/master/notebooks/2-linear-algebra-ii.ipynb) 10 | * **Calculus** 11 | * 3: [Calculus I: Limits & Derivatives](https://github.com/jonkrohn/ML-foundations/blob/master/notebooks/3-calculus-i.ipynb) 12 | * 4: [Calculus II: Partial Derivatives & Integrals](https://github.com/jonkrohn/ML-foundations/blob/master/notebooks/4-calculus-ii.ipynb) 13 | * **Probability and Statistics** 14 | * 5: [Probability & Information Theory](https://github.com/jonkrohn/ML-foundations/blob/master/notebooks/5-probability.ipynb) 15 | * 6: [Intro to Statistics](https://github.com/jonkrohn/ML-foundations/blob/master/notebooks/6-statistics.ipynb) 16 | * **Computer Science** 17 | * 7: [Algorithms & Data Structures](https://github.com/jonkrohn/ML-foundations/blob/master/notebooks/7-algos-and-data-structures.ipynb) 18 | * 8: [Optimization](https://github.com/jonkrohn/ML-foundations/blob/master/notebooks/8-optimization.ipynb) 19 | 20 | Later subjects build upon content from earlier subjects, so the recommended approach is to progress through the eight subjects in the order provided. That said, you're welcome to pick and choose individual subjects based on your interest or existing familiarity with the material. In particular, each of the four subject areas is fairly independent, so each could be approached separately. 
21 | 22 | ### Where and When 23 | 24 | The eight *ML Foundations* subjects were initially offered by [Jon Krohn](https://www.jonkrohn.com/) as live online trainings in the [O'Reilly learning platform](https://learning.oreilly.com/home/) from May-Sep 2020 (and were offered a second time from Jul-Dec 2021; see [here](https://www.jonkrohn.com/talks) for individual lecture dates). 25 | 26 | To suit your preferred mode of learning, the content is now available via several channels: 27 | 28 | * **YouTube** 29 | * Linear Algebra [complete playlist here](https://www.youtube.com/playlist?list=PLRDl2inPrWQW1QSWhBU0ki-jq_uElkh2a) and [detailed blog post here](https://www.jonkrohn.com/posts/2021/5/9/linear-algebra-for-machine-learning-complete-math-course-on-youtube) 30 | * Calculus [complete playlist here](https://www.youtube.com/playlist?list=PLRDl2inPrWQVu2OvnTvtkRpJ-wz-URMJx) 31 | * [Probability playlist](https://www.youtube.com/playlist?list=PLRDl2inPrWQWwJ1mh4tCUxlLfZ76C1zge) is in active development (sign up for my email newsletter at [jonkrohn.com](https://www.jonkrohn.com/) to be notified of new video releases) 32 | * In time, all of the subjects of my ML Foundations curriculum will be freely available on YouTube. 
33 | * **O'Reilly** (many employers and educational institutions provide free access to this platform; if you don't have access, you can get a 30-day free trial [via my special SDSPOD23 code](https://learning.oreilly.com/get-learning/?code=SDSPOD23)) 34 | * [Linear Algebra videos](https://learning.oreilly.com/videos/linear-algebra-for/9780137398119/) published in Dec 2020 ([free hour-long lesson](https://www.youtube.com/watch?v=uG_wjmuigGg)) 35 | * [Calculus videos](https://learning.oreilly.com/videos/calculus-for-machine/9780137398171/) published in Jan 2021 ([free hour-long lesson](https://youtu.be/ZDAX17OGMAM)) 36 | * [Probability and Stats videos](https://learning.oreilly.com/videos/probability-and-statistics/9780137566273/) published in May 2021 ([free hour-long lesson](https://youtu.be/uJcGj-k50iE)) 37 | * [Computer Science videos](https://learning.oreilly.com/videos/data-structures-algorithms/9780137644889/) published in Jun 2021 ([free hour-long lesson](https://youtu.be/yfKkMdndY-E)) 38 | * (For convenience, this publisher compiled all 28 hours of the above four video series into a single playlist [here](https://learning.oreilly.com/videos/-/9780137903245/).) 39 | * **Udemy**: All the Linear Algebra and Calculus content has been [live in a *Mathematical Foundations of ML* course](https://www.udemy.com/course/machine-learning-data-science-foundations-masterclass/) since Sep 2021 (free overview video [here](https://youtu.be/qhLo19EIA4g)). While this course stands alone as a complete introduction to the math subjects, Subjects 5-8 will eventually be added as free bonus material. 40 | * **Open Data Science Conference**: The entire series was taught live online from Dec 2020 to Jun 2021. On-demand recordings of all these trainings are now available in the [Ai+ Platform](https://aiplus.odsc.com/pages/mlbootcamp). 
41 | * **Book**: A book deal with Pearson is in place; eventually I'll have bandwidth to work on the manuscript and pre-release chapter drafts will be available via oreilly.com. 42 | 43 | *(Note that while YouTube contains 100% of the taught content, the paid options — e.g., Udemy, O'Reilly, and ODSC — contain comprehensive solution walk-throughs for exercises that are not available on YouTube. Some of the paid options also include exclusive, platform-specific features such as interactive testing, "cheat sheets" and the awarding of a certificate for successful course completion.)* 44 | 45 | ### Push Notifications 46 | 47 | To stay informed of future live training sessions, new video releases, and book chapter releases, consider signing up for Jon Krohn's [email newsletter via his homepage](https://www.jonkrohn.com/). 48 | 49 | ### Notebooks 50 | 51 | All code is provided within Jupyter notebooks [in this directory](https://github.com/jonkrohn/DLTFpT/blob/master/notebooks/). These notebooks are intended for use within the (free) [Colab cloud environment](https://colab.research.google.com) and that is the only environment currently actively supported. 52 | 53 | That said, if you are familiar with running Jupyter notebooks locally, you're welcome to do so (note that the library versions in this repo's [Dockerfile](https://github.com/jonkrohn/ML-foundations/blob/master/Dockerfile) are not necessarily current, but may provide a reasonable starting point for running Jupyter within a Docker container). 54 | 55 | 56 | ### The Machine Learning House 57 | 58 |

59 | 60 |

61 | 62 | To be an outstanding data scientist or ML engineer, it doesn't suffice to only know how to use ML algorithms via the abstract interfaces that the most popular libraries (e.g., scikit-learn, Keras) provide. To train innovative models or deploy them to run performantly in production, an in-depth appreciation of machine learning theory (pictured as the central, purple floor of the "Machine Learning House") may be helpful or essential. And, to cultivate such in-depth appreciation of ML, one must possess a working understanding of the foundational subjects. 63 | 64 | When the foundations of the "Machine Learning House" are firm, it also makes it much easier to make the jump from general ML principles (purple floor) to specialized ML domains (the top floor, shown in gray) such as deep learning, natural language processing, machine vision, and reinforcement learning. This is because, the more specialized the application, the more likely its details for implementation are available only in academic papers or graduate-level textbooks, either of which typically assume an understanding of the foundational subjects. 65 | 66 | The content in this series may be particularly relevant for you if: 67 | 68 | * **You use high-level software libraries** to train or deploy machine learning algorithms, and would now like to understand the fundamentals underlying the abstractions, enabling you to expand your capabilities 69 | * You’re a **data scientist** who would like to reinforce your understanding of the subjects at the core of your professional discipline 70 | * You’re a **software developer** who would like to develop a firm foundation for the deployment of machine learning algorithms into production systems 71 | * You’re a **data analyst** or **A.I. enthusiast** who would like to become a data scientist or data/ML engineer, and so you’re keen to deeply understand the field you’re entering from the ground up (very wise of you!) 
72 | * You're simply keen to understand the essentials of linear algebra, calculus, probability, stats, algorithms and/or data structures 73 | 74 | The foundational subjects have largely been unchanged in recent decades and are likely to remain so for the coming decades, yet they're critical across all machine learning and data science approaches. Thus, the foundations provide a solid, career-long bedrock. 75 | 76 | 77 | ### Pedagogical Approach 78 | 79 | The purpose of this series is to provide you with a practical, functional understanding of the content covered. Context will be given for each topic, highlighting its relevance to machine learning. 80 | 81 | As with other materials created by Jon Krohn (such as the book *[Deep Learning Illustrated](https://www.deeplearningillustrated.com/)* and his 18-hour video series *[Deep Learning with TensorFlow, Keras, and PyTorch](https://github.com/jonkrohn/DLTFpT/)*), the content in the series is brought to life through the combination of: 82 | 83 | * Vivid full-color illustrations 84 | * Paper-and-pencil comprehension exercises with fully worked solutions 85 | * Hundreds of straightforward examples of Python code within hands-on Jupyter notebooks (with a particular focus on the PyTorch and TensorFlow libraries) 86 | * Practical ML applications 87 | * Resources for digging even deeper into topics that pique your curiosity 88 | 89 | 90 | ### Prerequisites 91 | 92 | **Programming**: All code demos will be in Python, so experience with it or another object-oriented programming language would be helpful for following along with the code examples. A good (and free!) resource for getting started with Python is Al Sweigart's [Automate the Boring Stuff](https://automatetheboringstuff.com/). 93 | 94 | **Mathematics**: Familiarity with secondary school-level mathematics will make the class easier to follow along with. 
If you are comfortable dealing with quantitative information (such as understanding charts and rearranging simple equations), then you should be well prepared to follow along with all of the mathematics. If you discover you have some math gaps as you work through this *ML Foundations* curriculum, I recommend the free, comprehensive [Khan Academy](https://www.khanacademy.org) to fill those gaps in. 95 | 96 | 97 | ### Oboe 98 | 99 | Finally, here's an illustration of Oboe, the *Machine Learning Foundations* mascot, created by the wonderful artist [Aglaé Bassens](https://www.aglaebassens.com): 100 | 101 |

102 | 103 |

104 | -------------------------------------------------------------------------------- /img/ML-house.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jonkrohn/ML-foundations/a459590173d655db09d701d63dbfac11c3be97b2/img/ML-house.png -------------------------------------------------------------------------------- /img/Oboe.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jonkrohn/ML-foundations/a459590173d655db09d701d63dbfac11c3be97b2/img/Oboe.jpg -------------------------------------------------------------------------------- /installation/README.md: -------------------------------------------------------------------------------- 1 | All code is provided within Jupyter notebooks [in this directory](https://github.com/jonkrohn/DLTFpT/blob/master/notebooks/). These notebooks are intended for use within the (free) [Colab cloud environment](https://colab.research.google.com), which requires no installation. 2 | 3 | Nothing else in this directory matters at this time. Feel free to ignore. 
4 | 5 | Step 9: `sudo docker run -v $(pwd):/home/jovyan/work -it --rm -p 8896:8888 mlf-stack` 6 | -------------------------------------------------------------------------------- /installation/let_jovyan_write.sh: -------------------------------------------------------------------------------- 1 | # script that provides jovyan, the default Docker container username, permission to write to the directory 2 | sudo chgrp -R 100 ML-foundations/ 3 | sudo chmod -R g+w ML-foundations/ #to recursively change permissions so jovyan can write to the directory 4 | -------------------------------------------------------------------------------- /notebooks/8-optimization.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "name": "8-optimization.ipynb", 7 | "provenance": [], 8 | "include_colab_link": true 9 | }, 10 | "kernelspec": { 11 | "display_name": "Python 3", 12 | "language": "python", 13 | "name": "python3" 14 | }, 15 | "language_info": { 16 | "codemirror_mode": { 17 | "name": "ipython", 18 | "version": 3 19 | }, 20 | "file_extension": ".py", 21 | "mimetype": "text/x-python", 22 | "name": "python", 23 | "nbconvert_exporter": "python", 24 | "pygments_lexer": "ipython3", 25 | "version": "3.7.6" 26 | } 27 | }, 28 | "cells": [ 29 | { 30 | "cell_type": "markdown", 31 | "metadata": { 32 | "id": "view-in-github", 33 | "colab_type": "text" 34 | }, 35 | "source": [ 36 | "\"Open" 37 | ] 38 | }, 39 | { 40 | "cell_type": "markdown", 41 | "metadata": { 42 | "id": "aTOLgsbN69-P" 43 | }, 44 | "source": [ 45 | "# Optimization" 46 | ] 47 | }, 48 | { 49 | "cell_type": "markdown", 50 | "metadata": { 51 | "id": "Sqs7yWgf_NQA" 52 | }, 53 | "source": [ 54 | "This class, *Optimization*, is the eighth of eight classes in the *Machine Learning Foundations* series. 
It builds upon the material from each of the other classes in the series -- on linear algebra, calculus, probability, statistics, and algorithms -- in order to provide a detailed introduction to training machine learning models. \n", 55 | "\n", 56 | "Through the measured exposition of theory paired with interactive examples, you’ll develop a working understanding of all of the essential theory behind the ubiquitous gradient descent approach to optimization as well as how to apply it yourself — both at a granular, matrix operations level and a quick, abstract level — with TensorFlow and PyTorch. You’ll also learn about the latest optimizers, such as Adam and Nadam, that are widely-used for training deep neural networks. \n", 57 | "\n", 58 | "Over the course of studying this topic, you'll:\n", 59 | "\n", 60 | "* Discover how the statistical and machine learning approaches to optimization differ, and why you would select one or the other for a given problem you’re solving.\n", 61 | "* Understand exactly how the extremely versatile (stochastic) gradient descent optimization algorithm works, including how to apply it\n", 62 | "* Get acquainted with the “fancy” optimizers that are available for advanced machine learning approaches (e.g., deep learning) and when you should consider using them.\n", 63 | "\n", 64 | "Note that this Jupyter notebook is not intended to stand alone. 
It is the companion code to a lecture or to videos from Jon Krohn's [Machine Learning Foundations](https://github.com/jonkrohn/ML-foundations) series, which offer detail on the following:\n", 65 | "\n", 66 | "*Segment 1: The Machine Learning Approach to Optimization*\n", 67 | "\n", 68 | "* The Statistical Approach to Regression: Ordinary Least Squares\n", 69 | "* When Statistical Approaches to Optimization Break Down\n", 70 | "* The Machine Learning Solution \n", 71 | "\n", 72 | "*Segment 2: Gradient Descent*\n", 73 | "\n", 74 | "* Objective Functions\n", 75 | "* Cost / Loss / Error Functions\n", 76 | "* Minimizing Cost with Gradient Descent\n", 77 | "* Learning Rate\n", 78 | "* Critical Points, incl. Saddle Points\n", 79 | "* Gradient Descent from Scratch with PyTorch\n", 80 | "* The Global Minimum and Local Minima\n", 81 | "* Mini-Batches and Stochastic Gradient Descent (SGD)\n", 82 | "* Learning Rate Scheduling\n", 83 | "* Maximizing Reward with Gradient Ascent\n", 84 | "\n", 85 | "*Segment 3: Fancy Deep Learning Optimizers*\n", 86 | "\n", 87 | "* A Layer of Artificial Neurons in PyTorch\n", 88 | "* Jacobian Matrices\n", 89 | "* Hessian Matrices and Second-Order Optimization\n", 90 | "* Momentum\n", 91 | "* Nesterov Momentum\n", 92 | "* AdaGrad\n", 93 | "* AdaDelta \n", 94 | "* RMSProp\n", 95 | "* Adam \n", 96 | "* Nadam\n", 97 | "* Training a Deep Neural Net\n", 98 | "* Resources for Further Study" 99 | ] 100 | }, 101 | { 102 | "cell_type": "code", 103 | "metadata": { 104 | "id": "zBdzukZv5b1w" 105 | }, 106 | "source": [ 107 | "import numpy as np\n", 108 | "import matplotlib.pyplot as plt" 109 | ], 110 | "execution_count": 18, 111 | "outputs": [] 112 | }, 113 | { 114 | "cell_type": "markdown", 115 | "metadata": { 116 | "id": "VSe5VI38a1Ns" 117 | }, 118 | "source": [ 119 | "## Segment 1: Optimization Approaches" 120 | ] 121 | }, 122 | { 123 | "cell_type": "markdown", 124 | "metadata": { 125 | "id": "ay3FqULN5b11" 126 | }, 127 | "source": [ 128 | "Refer to the 
slides and the *Ordinary Least Squares* section of the [*Intro to Stats* notebook](https://github.com/jonkrohn/ML-foundations/blob/master/notebooks/6-statistics.ipynb)." 129 | ] 130 | }, 131 | { 132 | "cell_type": "markdown", 133 | "metadata": { 134 | "id": "D2y1HvVD5b12" 135 | }, 136 | "source": [ 137 | "## Segment 2: Gradient Descent" 138 | ] 139 | }, 140 | { 141 | "cell_type": "markdown", 142 | "metadata": { 143 | "id": "nD25jSYF5b12" 144 | }, 145 | "source": [ 146 | "### Cost Functions" 147 | ] 148 | }, 149 | { 150 | "cell_type": "markdown", 151 | "metadata": { 152 | "id": "grAFt27_5b13" 153 | }, 154 | "source": [ 155 | "Fundamentally, for some instance $i$, we'd like to quantify the difference between the \"correct\" target model output $y_i$ and the model's predicted output $\\hat{y}_i$. A first idea might be to take the simple difference: $$\\Delta y_i = \\hat{y}_i - y_i$$" 156 | ] 157 | }, 158 | { 159 | "cell_type": "markdown", 160 | "metadata": { 161 | "id": "vf0Qh1mk5b13" 162 | }, 163 | "source": [ 164 | "For computational efficiency (we'll explore this later in the segment), in ML we seldom consider the cost associated with a single instance $i$. Instead, we typically consider several instances simultaneously, in which case calculating the simple difference is largely ineffective because positive and negative $\\Delta y_i$ values cancel out. E.g., consider a situation where: \n", 165 | "\n", 166 | "* $\\Delta y_1 = \\hat{y}_1 - y_1$ = 7 - 2 = 5\n", 167 | "* $\\Delta y_2 = \\hat{y}_2 - y_2$ = 3 - 8 = -5\n", 168 | "\n", 169 | "On an individual-instance basis, there are differences between the predicted and target outputs, indicating the model could be improved. Despite this, the total cost ($\\Sigma \\Delta y_i = \\Delta y_1 + \\Delta y_2 = 5-5$) is zero and therefore the mean cost ($\\frac{\\Sigma{\\Delta y_i}}{n} = \\frac{0}{2}$) is also zero, erroneously implying a perfect model fit." 
170 | ] 171 | }, 172 | { 173 | "cell_type": "markdown", 174 | "metadata": { 175 | "id": "3PIhblP-5b14" 176 | }, 177 | "source": [ 178 | "#### Mean Absolute Error" 179 | ] 180 | }, 181 | { 182 | "cell_type": "markdown", 183 | "metadata": { 184 | "id": "ZmE1Zu255b14" 185 | }, 186 | "source": [ 187 | "A straightforward resolution to the simple-difference shortcoming is to take the absolute value of the difference. I.e., using the same contrived values as above: \n", 188 | "\n", 189 | "* $|\\Delta y_1| = |\\hat{y}_1 - y_1|$ = |7 - 2| = 5\n", 190 | "* $|\\Delta y_2| = |\\hat{y}_2 - y_2|$ = |3 - 8| = 5\n", 191 | "\n", 192 | "In this case:\n", 193 | "\n", 194 | "* The total error is ten: $\\Sigma |\\Delta y_i| = |\\Delta y_1| + |\\Delta y_2| = 5+5$ \n", 195 | "* The **mean absolute error** (MAE) is five: $\\frac{\\Sigma{|\\Delta y_i|}}{n} = \\frac{10}{2}$" 196 | ] 197 | }, 198 | { 199 | "cell_type": "markdown", 200 | "metadata": { 201 | "id": "PRngCTzq5b15" 202 | }, 203 | "source": [ 204 | "#### Mean Squared Error" 205 | ] 206 | }, 207 | { 208 | "cell_type": "markdown", 209 | "metadata": { 210 | "id": "ICZZqWyP5b15" 211 | }, 212 | "source": [ 213 | "While MAE satisfactorily quantifies cost across multiple instances, for reasons we'll enumerate in a moment, it is much more common in ML to use **mean *squared* error** (MSE). \n", 214 | "\n", 215 | "Let's use the contrived values again to show how MSE is calculated. 
As covered in [*Calculus II*](https://github.com/jonkrohn/ML-foundations/blob/master/notebooks/4-calculus-ii.ipynb), the individual-instance variant of MSE is *squared error* or *quadratic cost*, which -- as with absolute error -- dictates that the cost for each instance $i$ must be $\\geq 0$: \n", 216 | "\n", 217 | "* $(\\hat{y}_1 - y_1)^2 = (7 - 2)^2 = 5^2 = 25$\n", 218 | "* $(\\hat{y}_2 - y_2)^2 = (3 - 8)^2 = (-5)^2 = 25$\n", 219 | "\n", 220 | "As suggested by its name, MSE is the average of these squared errors: $$C = \\frac{1}{n} \\sum_{i=1}^n (\\hat{y}_i - y_i)^2 = \\frac{1}{2} (25 + 25) = \\frac{50}{2} = 25 $$\n", 221 | "\n", 222 | "Since cost for each instance $i$ must be $\\geq 0$, MSE (as with MAE) must therefore also be $\\geq 0$. In addition: \n", 223 | "\n", 224 | "1. The partial derivatives of the MSE $C$ can be efficiently computed w.r.t. model parameters, providing a *gradient* of cost, $\\nabla C$ (the primary focus of [*Calculus II*](https://github.com/jonkrohn/ML-foundations/blob/master/notebooks/4-calculus-ii.ipynb) is deriving $\\nabla C$). Adjusting model parameters, we can *descend* $\\nabla C$ and thereby minimize $C$ (the mechanics of which are the primary focus of *Optimization*).\n", 225 | "2. Compared to MAE, MSE is relatively tolerant of small $\\Delta y_i$ and intolerant of large $\\Delta y_i$, a characteristic that tends to lead to better-fitting models." 
226 | ] 227 | }, 228 | { 229 | "cell_type": "code", 230 | "metadata": { 231 | "id": "WnLkt4505b15" 232 | }, 233 | "source": [ 234 | "delta_y = np.linspace(-5, 5, 1000)" 235 | ], 236 | "execution_count": 19, 237 | "outputs": [] 238 | }, 239 | { 240 | "cell_type": "code", 241 | "metadata": { 242 | "id": "eS8MQfFj5b18" 243 | }, 244 | "source": [ 245 | "abs_error = np.abs(delta_y)" 246 | ], 247 | "execution_count": 20, 248 | "outputs": [] 249 | }, 250 | { 251 | "cell_type": "code", 252 | "metadata": { 253 | "id": "2VEm2rY75b1-" 254 | }, 255 | "source": [ 256 | "sq_error = delta_y**2" 257 | ], 258 | "execution_count": 21, 259 | "outputs": [] 260 | }, 261 | { 262 | "cell_type": "code", 263 | "metadata": { 264 | "id": "iFDqR0Xr5b2B", 265 | "colab": { 266 | "base_uri": "https://localhost:8080/", 267 | "height": 279 268 | }, 269 | "outputId": "ca63427c-ca25-4fff-9445-a77d6e172e58" 270 | }, 271 | "source": [ 272 | "fig, ax = plt.subplots()\n", 273 | "\n", 274 | "plt.plot(delta_y, abs_error)\n", 275 | "plt.plot(delta_y, sq_error)\n", 276 | "ax.axhline(c='lightgray')\n", 277 | "\n", 278 | "plt.xlabel('Delta y')\n", 279 | "plt.ylabel('Error')\n", 280 | "\n", 281 | "ax.set_xlim(-4.2, 4.2)\n", 282 | "ax.set_ylim(-1, 17)\n", 283 | "_ = ax.legend(['Absolute', 'Squared'])" 284 | ], 285 | "execution_count": 22, 286 | "outputs": [ 287 | { 288 | "output_type": "display_data", 289 | "data": { 290 | "image/png": "\n", 291 | "text/plain": [ 292 | "
" 293 | ] 294 | }, 295 | "metadata": { 296 | "tags": [], 297 | "needs_background": "light" 298 | } 299 | } 300 | ] 301 | }, 302 | { 303 | "cell_type": "markdown", 304 | "metadata": { 305 | "id": "Z3fpl9k75b2E" 306 | }, 307 | "source": [ 308 | "There are other cost functions out there (e.g., [cross-entropy cost](https://github.com/the-deep-learners/deep-learning-illustrated/blob/master/notebooks/cross_entropy_cost.ipynb) is the typical choice for a deep learning classifier), but since MSE is the most common, including being the default option for regression models, it will be our focus for the remainder of *ML Foundations*." 309 | ] 310 | }, 311 | { 312 | "cell_type": "markdown", 313 | "metadata": { 314 | "id": "K5EDHWnw5b2E" 315 | }, 316 | "source": [ 317 | "**Return to slides here.**" 318 | ] 319 | }, 320 | { 321 | "cell_type": "markdown", 322 | "metadata": { 323 | "id": "ocedC8el5b2F" 324 | }, 325 | "source": [ 326 | "### Minimizing Cost with Gradient Descent" 327 | ] 328 | }, 329 | { 330 | "cell_type": "markdown", 331 | "metadata": { 332 | "id": "CN3TlkVM5b2F" 333 | }, 334 | "source": [ 335 | "Refer to the slides and the [*Gradient Descent from Scratch* notebook](https://github.com/jonkrohn/ML-foundations/blob/master/notebooks/gradient-descent-from-scratch.ipynb)." 336 | ] 337 | }, 338 | { 339 | "cell_type": "markdown", 340 | "metadata": { 341 | "id": "43iljMvZ5b2G" 342 | }, 343 | "source": [ 344 | "### Critical Points" 345 | ] 346 | }, 347 | { 348 | "cell_type": "markdown", 349 | "metadata": { 350 | "id": "-RO6uoRh5b2G" 351 | }, 352 | "source": [ 353 | "#### Minimum" 354 | ] 355 | }, 356 | { 357 | "cell_type": "code", 358 | "metadata": { 359 | "id": "52gWUjiW5b2H" 360 | }, 361 | "source": [ 362 | "x = np.linspace(-10, 10, 1000)" 363 | ], 364 | "execution_count": 23, 365 | "outputs": [] 366 | }, 367 | { 368 | "cell_type": "markdown", 369 | "metadata": { 370 | "id": "nqbOSBOc5b2J" 371 | }, 372 | "source": [ 373 | "If $y = x^2 + 3x + 4$..." 
374 | ] 375 | }, 376 | { 377 | "cell_type": "code", 378 | "metadata": { 379 | "id": "suJYqaq_5b2J" 380 | }, 381 | "source": [ 382 | "y_min = x**2 + 3*x + 4" 383 | ], 384 | "execution_count": 24, 385 | "outputs": [] 386 | }, 387 | { 388 | "cell_type": "markdown", 389 | "metadata": { 390 | "id": "Skkp7dSA5b2M" 391 | }, 392 | "source": [ 393 | "...then $y' = 2x + 3$. \n", 394 | "\n", 395 | "Critical point is located where $y' = 0$, so where $2x + 3 = 0$.\n", 396 | "\n", 397 | "Rearranging to solve for $x$:\n", 398 | "$$2x = -3$$\n", 399 | "$$x = \\frac{-3}{2} = -1.5$$\n", 400 | "\n", 401 | "At which point, $y = x^2 + 3x + 4 = (-1.5)^2 + 3(-1.5) + 4 = 1.75$\n", 402 | "\n", 403 | "Thus, the critical point is located at (-1.5, 1.75)." 404 | ] 405 | }, 406 | { 407 | "cell_type": "code", 408 | "metadata": { 409 | "id": "cXq47qEq5b2M", 410 | "colab": { 411 | "base_uri": "https://localhost:8080/", 412 | "height": 265 413 | }, 414 | "outputId": "b5d76c63-d12c-4237-92b9-53509f1d9711" 415 | }, 416 | "source": [ 417 | "fig, ax = plt.subplots()\n", 418 | "\n", 419 | "plt.scatter(-1.5, 1.75, c='orange')\n", 420 | "plt.axhline(y=1.75, c='orange', linestyle='--')\n", 421 | "\n", 422 | "plt.axvline(x=0, c='lightgray')\n", 423 | "plt.axhline(y=0, c='lightgray')\n", 424 | "\n", 425 | "ax.set_xlim(-5, 2)\n", 426 | "ax.set_ylim(-1, 15)\n", 427 | "\n", 428 | "_ = ax.plot(x, y_min)" 429 | ], 430 | "execution_count": 25, 431 | "outputs": [ 432 | { 433 | "output_type": "display_data", 434 | "data": { 435 | "image/png": 
"[base64-encoded PNG omitted: plot of y_min with the critical point (-1.5, 1.75) marked]\n", 436 | "text/plain": [ 437 | "
" 438 | ] 439 | }, 440 | "metadata": { 441 | "tags": [], 442 | "needs_background": "light" 443 | } 444 | } 445 | ] 446 | }, 447 | { 448 | "cell_type": "markdown", 449 | "metadata": { 450 | "id": "EWFvP3Tr5b2P" 451 | }, 452 | "source": [ 453 | "#### Maximum" 454 | ] 455 | }, 456 | { 457 | "cell_type": "markdown", 458 | "metadata": { 459 | "id": "Bn6YmWLR5b2P" 460 | }, 461 | "source": [ 462 | "If $y = -x^2 + 3x + 4$..." 463 | ] 464 | }, 465 | { 466 | "cell_type": "code", 467 | "metadata": { 468 | "id": "XA2KkY9x5b2Q" 469 | }, 470 | "source": [ 471 | "y_max = -x**2 + 3*x + 4" 472 | ], 473 | "execution_count": 26, 474 | "outputs": [] 475 | }, 476 | { 477 | "cell_type": "markdown", 478 | "metadata": { 479 | "id": "b8AUiYpt5b2S" 480 | }, 481 | "source": [ 482 | "...then $y' = -2x + 3$. \n", 483 | "\n", 484 | "Critical point is located where $y' = 0$, so where $-2x + 3 = 0$.\n", 485 | "\n", 486 | "Rearranging to solve for $x$:\n", 487 | "$$-2x = -3$$\n", 488 | "$$x = \\frac{-3}{-2} = 1.5$$\n", 489 | "\n", 490 | "At which point, $y = -x^2 + 3x + 4 = -(1.5)^2 + 3(1.5) + 4 = 6.25$\n", 491 | "\n", 492 | "Thus, the critical point is located at (1.5, 6.25)." 493 | ] 494 | }, 495 | { 496 | "cell_type": "code", 497 | "metadata": { 498 | "id": "M0S-2dX_5b2S", 499 | "colab": { 500 | "base_uri": "https://localhost:8080/", 501 | "height": 269 502 | }, 503 | "outputId": "5c77fa62-fa1d-4bfc-ac39-74eabc9781fa" 504 | }, 505 | "source": [ 506 | "fig, ax = plt.subplots()\n", 507 | "\n", 508 | "plt.scatter(1.5, 6.25, c='orange')\n", 509 | "plt.axhline(y=6.25, c='orange', linestyle='--')\n", 510 | "\n", 511 | "plt.axvline(x=0, c='lightgray')\n", 512 | "plt.axhline(y=0, c='lightgray')\n", 513 | "\n", 514 | "ax.set_xlim(-2, 5)\n", 515 | "ax.set_ylim(-5, 10)\n", 516 | "\n", 517 | "_ = ax.plot(x, y_max)" 518 | ], 519 | "execution_count": 27, 520 | "outputs": [ 521 | { 522 | "output_type": "display_data", 523 | "data": { 524 | "image/png": "\n", 525 | "text/plain": [ 526 | "
" 527 | ] 528 | }, 529 | "metadata": { 530 | "tags": [], 531 | "needs_background": "light" 532 | } 533 | } 534 | ] 535 | }, 536 | { 537 | "cell_type": "markdown", 538 | "metadata": { 539 | "id": "isgDTQNi5b2V" 540 | }, 541 | "source": [ 542 | "#### Saddle point" 543 | ] 544 | }, 545 | { 546 | "cell_type": "markdown", 547 | "metadata": { 548 | "id": "EID56T7T5b2V" 549 | }, 550 | "source": [ 551 | "If $y = x^3 + 6$..." 552 | ] 553 | }, 554 | { 555 | "cell_type": "code", 556 | "metadata": { 557 | "id": "aoZghxgI5b2W" 558 | }, 559 | "source": [ 560 | "y_sp = x**3 + 6" 561 | ], 562 | "execution_count": 28, 563 | "outputs": [] 564 | }, 565 | { 566 | "cell_type": "markdown", 567 | "metadata": { 568 | "id": "NzU1HTKQ5b2Y" 569 | }, 570 | "source": [ 571 | "...then $y' = 3x^2$. \n", 572 | "\n", 573 | "Critical point is located where $y' = 0$, so where $3x^2 = 0$.\n", 574 | "\n", 575 | "Rearranging to solve for $x$:\n", 576 | "$$x^2 = \\frac{0}{3} = 0$$\n", 577 | "$$x = \\sqrt{0} = 0$$\n", 578 | "\n", 579 | "At which point, $y = x^3 + 6 = (0)^3 + 6 = 6$\n", 580 | "\n", 581 | "Thus, the critical point is located at (0, 6)." 582 | ] 583 | }, 584 | { 585 | "cell_type": "code", 586 | "metadata": { 587 | "id": "7isUxig-5b2Y", 588 | "colab": { 589 | "base_uri": "https://localhost:8080/", 590 | "height": 269 591 | }, 592 | "outputId": "7680e12d-4d8a-4f8b-a371-d285c6b2b704" 593 | }, 594 | "source": [ 595 | "fig, ax = plt.subplots()\n", 596 | "\n", 597 | "plt.scatter(0, 6, c='orange')\n", 598 | "plt.axhline(y=6, c='orange', linestyle='--')\n", 599 | "\n", 600 | "plt.axvline(x=0, c='lightgray')\n", 601 | "plt.axhline(y=0, c='lightgray')\n", 602 | "\n", 603 | "ax.set_xlim(-2, 2)\n", 604 | "ax.set_ylim(-1, 10)\n", 605 | "\n", 606 | "_ = ax.plot(x, y_sp)" 607 | ], 608 | "execution_count": 29, 609 | "outputs": [ 610 | { 611 | "output_type": "display_data", 612 | "data": { 613 | "image/png": "\n", 614 | "text/plain": [ 615 | "
" 616 | ] 617 | }, 618 | "metadata": { 619 | "tags": [], 620 | "needs_background": "light" 621 | } 622 | } 623 | ] 624 | }, 625 | { 626 | "cell_type": "markdown", 627 | "metadata": { 628 | "id": "86XDUeDP5b2c" 629 | }, 630 | "source": [ 631 | "**Return to slides here.**" 632 | ] 633 | }, 634 | { 635 | "cell_type": "markdown", 636 | "metadata": { 637 | "id": "GKUXnREM5b2c" 638 | }, 639 | "source": [ 640 | "### Stochastic Gradient Descent" 641 | ] 642 | }, 643 | { 644 | "cell_type": "markdown", 645 | "metadata": { 646 | "id": "7clTIe1z5b2c" 647 | }, 648 | "source": [ 649 | "Refer to the slides and the [*SGD from Scratch* notebook](https://github.com/jonkrohn/ML-foundations/blob/master/notebooks/SGD-from-scratch.ipynb)." 650 | ] 651 | }, 652 | { 653 | "cell_type": "markdown", 654 | "metadata": { 655 | "id": "gqdFAfzB5b2d" 656 | }, 657 | "source": [ 658 | "### Learning Rate Scheduling" 659 | ] 660 | }, 661 | { 662 | "cell_type": "markdown", 663 | "metadata": { 664 | "id": "KHoYnNJ55b2d" 665 | }, 666 | "source": [ 667 | "Refer to the slides and the [*Learning Rate Scheduling* notebook](https://github.com/jonkrohn/ML-foundations/blob/master/notebooks/learning-rate-scheduling.ipynb)." 668 | ] 669 | }, 670 | { 671 | "cell_type": "markdown", 672 | "metadata": { 673 | "id": "ob0c0k3d5b2e" 674 | }, 675 | "source": [ 676 | "## Segment 3: Fancy Deep Learning Optimizers" 677 | ] 678 | }, 679 | { 680 | "cell_type": "markdown", 681 | "metadata": { 682 | "id": "LymJg7aQ5b2f" 683 | }, 684 | "source": [ 685 | "### Jacobian and Hessian Matrices" 686 | ] 687 | }, 688 | { 689 | "cell_type": "markdown", 690 | "metadata": { 691 | "id": "ihbu7w5H5b2f" 692 | }, 693 | "source": [ 694 | "A *gradient* holds the partial derivatives of a function with vector input and scalar output. That is, $\\boldsymbol{f}: \\mathbb{R}^n \\rightarrow \\mathbb{R}$." 
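As a rough illustration of the gradient definition above, the following sketch approximates the gradient of a small scalar-valued function with central differences. The function `f`, the helper `numerical_gradient`, and the step size `h` are hypothetical choices made for this example and do not come from the notebook; an automatic-differentiation library such as PyTorch would normally be used to obtain these partial derivatives instead.

```python
def numerical_gradient(f, x, h=1e-6):
    """Approximate the gradient of a scalar-valued function f at point x.

    x is a list of floats; one partial derivative is returned per input
    element, each estimated with a central difference of step size h.
    """
    grad = []
    for i in range(len(x)):
        x_plus = list(x)   # copy so the original point is not modified
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


def f(x):
    # Hypothetical example function f: R^3 -> R
    # f(x) = x0^2 + 3*x1 + x0*x2
    return x[0] ** 2 + 3 * x[1] + x[0] * x[2]


point = [1.0, 2.0, 3.0]
grad = numerical_gradient(f, point)
print(grad)
```

The printed values should be close to the analytic gradient $[2x_0 + x_2, 3, x_0]$ evaluated at the chosen point, that is, approximately `[5, 3, 1]`. The result has one entry per input element, which is why a gradient is a vector rather than a matrix.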
695 | ] 696 | }, 697 | { 698 | "cell_type": "markdown", 699 | "metadata": { 700 | "id": "nvjLc4T_5b2g" 701 | }, 702 | "source": [ 703 | "A **Jacobian matrix** holds the partial derivatives of a function with vector input *and vector output*. That is, $\\boldsymbol{f}: \\mathbb{R}^m \\rightarrow \\mathbb{R}^n$. \n", 704 | "\n", 705 | "E.g., in the [*Artificial Neurons* notebook](https://github.com/jonkrohn/ML-foundations/blob/master/notebooks/artificial-neurons.ipynb), $m = 784$ and $n = 128$. In this case: \n", 706 | "* The Jacobian matrix $\\boldsymbol{J}$ has 128 rows and 784 columns.\n", 707 | "* Each element of $\\boldsymbol{J}$ contains the partial derivative of a particular output element with respect to a particular input element: $J_{i,j} = \\frac{\\partial}{\\partial x_j} f(\\boldsymbol{x})_i$" 708 | ] 709 | }, 710 | { 711 | "cell_type": "markdown", 712 | "metadata": { 713 | "id": "Q2GWdtGk5b2g" 714 | }, 715 | "source": [ 716 | "Note that a *generalized Jacobian* holds the partial derivatives of a function with an input tensor of any dimensionality and an output tensor of any dimensionality." 717 | ] 718 | }, 719 | { 720 | "cell_type": "markdown", 721 | "metadata": { 722 | "id": "02BNYUTF5b2g" 723 | }, 724 | "source": [ 725 | "**Return to slides here.**" 726 | ] 727 | }, 728 | { 729 | "cell_type": "markdown", 730 | "metadata": { 731 | "id": "4PPuIRHT5b2h" 732 | }, 733 | "source": [ 734 | "A **Hessian matrix** is the Jacobian matrix of the gradient. 
Its elements hold all of the possible second derivatives of some function $f(\\boldsymbol{x})$: \n", 735 | "$$ \\boldsymbol{H}(f)(\\boldsymbol{x})_{i,j} = \\frac{\\partial^2}{\\partial x_i \\partial x_j} f(\\boldsymbol{x}) $$" 736 | ] 737 | }, 738 | { 739 | "cell_type": "markdown", 740 | "metadata": { 741 | "id": "lQM3Ons65b2h" 742 | }, 743 | "source": [ 744 | "Otherwise, refer to the slides and the following notebooks: \n", 745 | "\n", 746 | "* [*Artificial Neurons*](https://github.com/jonkrohn/ML-foundations/blob/master/notebooks/artificial-neurons.ipynb)\n", 747 | "* [*Deep Net in TensorFlow* (1.x)](https://github.com/the-deep-learners/TensorFlow-LiveLessons/blob/master/notebooks/deep_net_in_tensorflow.ipynb)\n", 748 | "* [*Deep Convolutional Net in TensorFlow* (1.x)](https://github.com/the-deep-learners/TensorFlow-LiveLessons/blob/master/notebooks/lenet_in_tensorflow.ipynb)\n", 749 | "* [*Deep Convolutional Net in TensorFlow* (2.x, with Keras)](https://github.com/jonkrohn/DLTFpT/blob/master/notebooks/lenet_in_tensorflow.ipynb)\n", 750 | "* [*Deep Net in PyTorch*](https://github.com/jonkrohn/DLTFpT/blob/master/notebooks/deep_net_in_pytorch.ipynb)" 751 | ] 752 | } 753 | ] 754 | } -------------------------------------------------------------------------------- /notebooks/artificial-neurons.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "kernelspec": { 6 | "display_name": "Python 3", 7 | "language": "python", 8 | "name": "python3" 9 | }, 10 | "language_info": { 11 | "codemirror_mode": { 12 | "name": "ipython", 13 | "version": 3 14 | }, 15 | "file_extension": ".py", 16 | "mimetype": "text/x-python", 17 | "name": "python", 18 | "nbconvert_exporter": "python", 19 | "pygments_lexer": "ipython3", 20 | "version": "3.7.6" 21 | }, 22 | "colab": { 23 | "name": "artificial-neurons.ipynb", 24 | "provenance": [], 25 | "include_colab_link": true 26 | } 27 | }, 28 | 
"cells": [ 29 | { 30 | "cell_type": "markdown", 31 | "metadata": { 32 | "id": "view-in-github", 33 | "colab_type": "text" 34 | }, 35 | "source": [ 36 | "\"Open" 37 | ] 38 | }, 39 | { 40 | "cell_type": "markdown", 41 | "metadata": { 42 | "id": "wvAIp_JlhU8Y", 43 | "colab_type": "text" 44 | }, 45 | "source": [ 46 | "# Artificial Neuron Layer" 47 | ] 48 | }, 49 | { 50 | "cell_type": "markdown", 51 | "metadata": { 52 | "id": "XqExxAQlhU8Z", 53 | "colab_type": "text" 54 | }, 55 | "source": [ 56 | "In this notebook, we use PyTorch tensors to create a layer of artificial neurons that could be used within a deep learning model architecture." 57 | ] 58 | }, 59 | { 60 | "cell_type": "code", 61 | "metadata": { 62 | "id": "t3ZJ9KhVhU8a", 63 | "colab_type": "code", 64 | "colab": {} 65 | }, 66 | "source": [ 67 | "import torch\n", 68 | "import matplotlib.pyplot as plt" 69 | ], 70 | "execution_count": null, 71 | "outputs": [] 72 | }, 73 | { 74 | "cell_type": "code", 75 | "metadata": { 76 | "id": "OX15SB16hU8d", 77 | "colab_type": "code", 78 | "colab": {} 79 | }, 80 | "source": [ 81 | "_ = torch.manual_seed(42)" 82 | ], 83 | "execution_count": null, 84 | "outputs": [] 85 | }, 86 | { 87 | "cell_type": "markdown", 88 | "metadata": { 89 | "id": "oU8z-VtUhU8g", 90 | "colab_type": "text" 91 | }, 92 | "source": [ 93 | "Set number of neurons: " 94 | ] 95 | }, 96 | { 97 | "cell_type": "code", 98 | "metadata": { 99 | "id": "fUyNsPPuhU8g", 100 | "colab_type": "code", 101 | "colab": {} 102 | }, 103 | "source": [ 104 | "n_input = 784 # Flattened 28x28-pixel image\n", 105 | "n_dense = 128" 106 | ], 107 | "execution_count": null, 108 | "outputs": [] 109 | }, 110 | { 111 | "cell_type": "markdown", 112 | "metadata": { 113 | "id": "lEfHqjdthU8i", 114 | "colab_type": "text" 115 | }, 116 | "source": [ 117 | "Simulate an \"input image\" with a vector tensor `x`: " 118 | ] 119 | }, 120 | { 121 | "cell_type": "code", 122 | "metadata": { 123 | "id": "L_fDjA4nhU8j", 124 | "colab_type": "code", 125 | 
"colab": {} 126 | }, 127 | "source": [ 128 | "x = torch.rand(n_input) # Samples float values from [0,1) uniform distribution (interval doesn't include 1)" 129 | ], 130 | "execution_count": null, 131 | "outputs": [] 132 | }, 133 | { 134 | "cell_type": "code", 135 | "metadata": { 136 | "id": "fMquDxtzhU8m", 137 | "colab_type": "code", 138 | "colab": {}, 139 | "outputId": "46d3a7be-8108-4f9c-e63b-84dce3831d87" 140 | }, 141 | "source": [ 142 | "x.shape" 143 | ], 144 | "execution_count": null, 145 | "outputs": [ 146 | { 147 | "output_type": "execute_result", 148 | "data": { 149 | "text/plain": [ 150 | "torch.Size([784])" 151 | ] 152 | }, 153 | "metadata": { 154 | "tags": [] 155 | }, 156 | "execution_count": 5 157 | } 158 | ] 159 | }, 160 | { 161 | "cell_type": "code", 162 | "metadata": { 163 | "id": "ECeWngtshU8p", 164 | "colab_type": "code", 165 | "colab": {}, 166 | "outputId": "a2ac3bbc-c545-4ce8-ce32-81bfae659327" 167 | }, 168 | "source": [ 169 | "x[0:6]" 170 | ], 171 | "execution_count": null, 172 | "outputs": [ 173 | { 174 | "output_type": "execute_result", 175 | "data": { 176 | "text/plain": [ 177 | "tensor([0.8823, 0.9150, 0.3829, 0.9593, 0.3904, 0.6009])" 178 | ] 179 | }, 180 | "metadata": { 181 | "tags": [] 182 | }, 183 | "execution_count": 6 184 | } 185 | ] 186 | }, 187 | { 188 | "cell_type": "code", 189 | "metadata": { 190 | "id": "Po1ffSeohU8s", 191 | "colab_type": "code", 192 | "colab": {}, 193 | "outputId": "d563e272-50b8-4176-d228-806fa836ee6d" 194 | }, 195 | "source": [ 196 | "_ = plt.hist(x)" 197 | ], 198 | "execution_count": null, 199 | "outputs": [ 200 | { 201 | "output_type": "display_data", 202 | "data": { 203 | "image/png": 
"[base64-encoded PNG omitted: histogram of the values in x]\n", 204 | "text/plain": [ 205 | "
" 206 | ] 207 | }, 208 | "metadata": { 209 | "tags": [], 210 | "needs_background": "light" 211 | } 212 | } 213 | ] 214 | }, 215 | { 216 | "cell_type": "markdown", 217 | "metadata": { 218 | "id": "jAtgSZkkhU8u", 219 | "colab_type": "text" 220 | }, 221 | "source": [ 222 | "Create tensors to store neuron parameters (i.e., weight matrix `W`, bias vector `b`) and initialize them with starting values: " 223 | ] 224 | }, 225 | { 226 | "cell_type": "code", 227 | "metadata": { 228 | "id": "Xz8ho_F8hU8u", 229 | "colab_type": "code", 230 | "colab": {} 231 | }, 232 | "source": [ 233 | "b = torch.zeros(n_dense)" 234 | ], 235 | "execution_count": null, 236 | "outputs": [] 237 | }, 238 | { 239 | "cell_type": "code", 240 | "metadata": { 241 | "id": "fLNQg0SehU8x", 242 | "colab_type": "code", 243 | "colab": {}, 244 | "outputId": "d15a2afd-461d-43cb-d5d8-48ed605de798" 245 | }, 246 | "source": [ 247 | "b.shape" 248 | ], 249 | "execution_count": null, 250 | "outputs": [ 251 | { 252 | "output_type": "execute_result", 253 | "data": { 254 | "text/plain": [ 255 | "torch.Size([128])" 256 | ] 257 | }, 258 | "metadata": { 259 | "tags": [] 260 | }, 261 | "execution_count": 9 262 | } 263 | ] 264 | }, 265 | { 266 | "cell_type": "code", 267 | "metadata": { 268 | "id": "n2IamD5phU8z", 269 | "colab_type": "code", 270 | "colab": {}, 271 | "outputId": "98137b49-5eff-4cb3-9f14-3c5f39bc7d62" 272 | }, 273 | "source": [ 274 | "b[0:6]" 275 | ], 276 | "execution_count": null, 277 | "outputs": [ 278 | { 279 | "output_type": "execute_result", 280 | "data": { 281 | "text/plain": [ 282 | "tensor([0., 0., 0., 0., 0., 0.])" 283 | ] 284 | }, 285 | "metadata": { 286 | "tags": [] 287 | }, 288 | "execution_count": 10 289 | } 290 | ] 291 | }, 292 | { 293 | "cell_type": "code", 294 | "metadata": { 295 | "id": "2pbT93JshU81", 296 | "colab_type": "code", 297 | "colab": {} 298 | }, 299 | "source": [ 300 | "W = torch.empty([n_input, n_dense])\n", 301 | "W = torch.nn.init.xavier_normal_(W)" 302 | ], 303 | 
"execution_count": null, 304 | "outputs": [] 305 | }, 306 | { 307 | "cell_type": "code", 308 | "metadata": { 309 | "id": "WJResuyMhU84", 310 | "colab_type": "code", 311 | "colab": {}, 312 | "outputId": "96f028ec-0761-4eee-d187-5e31d4d0a496" 313 | }, 314 | "source": [ 315 | "W.shape" 316 | ], 317 | "execution_count": null, 318 | "outputs": [ 319 | { 320 | "output_type": "execute_result", 321 | "data": { 322 | "text/plain": [ 323 | "torch.Size([784, 128])" 324 | ] 325 | }, 326 | "metadata": { 327 | "tags": [] 328 | }, 329 | "execution_count": 12 330 | } 331 | ] 332 | }, 333 | { 334 | "cell_type": "code", 335 | "metadata": { 336 | "id": "48_BVbr2hU87", 337 | "colab_type": "code", 338 | "colab": {}, 339 | "outputId": "95f3fb25-a6fd-487f-bd7f-593dea237bd7" 340 | }, 341 | "source": [ 342 | "W[0:4, 0:4]" 343 | ], 344 | "execution_count": null, 345 | "outputs": [ 346 | { 347 | "output_type": "execute_result", 348 | "data": { 349 | "text/plain": [ 350 | "tensor([[ 0.0008, 0.0038, 0.0349, 0.0630],\n", 351 | " [ 0.0872, -0.0505, 0.0414, -0.0391],\n", 352 | " [-0.0162, -0.0056, 0.0555, -0.0571],\n", 353 | " [ 0.0050, -0.0144, 0.0405, -0.0499]])" 354 | ] 355 | }, 356 | "metadata": { 357 | "tags": [] 358 | }, 359 | "execution_count": 13 360 | } 361 | ] 362 | }, 363 | { 364 | "cell_type": "markdown", 365 | "metadata": { 366 | "id": "tfQC0hb7hU89", 367 | "colab_type": "text" 368 | }, 369 | "source": [ 370 | "Pass the \"input image\" `x` through a *dense* neuron layer with a *sigmoid activation function* to output the vector tensor `a`, which contains one element per neuron: " 371 | ] 372 | }, 373 | { 374 | "cell_type": "code", 375 | "metadata": { 376 | "id": "QIsGHaKfhU89", 377 | "colab_type": "code", 378 | "colab": {} 379 | }, 380 | "source": [ 381 | "z = torch.add(torch.matmul(x, W), b)" 382 | ], 383 | "execution_count": null, 384 | "outputs": [] 385 | }, 386 | { 387 | "cell_type": "code", 388 | "metadata": { 389 | "id": "ZklBDX1OhU8_", 390 | "colab_type": "code", 391 | "colab": 
{} 392 | }, 393 | "source": [ 394 | "a = torch.sigmoid(z)" 395 | ], 396 | "execution_count": null, 397 | "outputs": [] 398 | }, 399 | { 400 | "cell_type": "code", 401 | "metadata": { 402 | "id": "QCI3pZwthU9B", 403 | "colab_type": "code", 404 | "colab": {}, 405 | "outputId": "e5e2506c-14b4-4959-df4f-e6a97408558b" 406 | }, 407 | "source": [ 408 | "a.shape" 409 | ], 410 | "execution_count": null, 411 | "outputs": [ 412 | { 413 | "output_type": "execute_result", 414 | "data": { 415 | "text/plain": [ 416 | "torch.Size([128])" 417 | ] 418 | }, 419 | "metadata": { 420 | "tags": [] 421 | }, 422 | "execution_count": 16 423 | } 424 | ] 425 | }, 426 | { 427 | "cell_type": "code", 428 | "metadata": { 429 | "id": "zUIWCKjxhU9E", 430 | "colab_type": "code", 431 | "colab": {}, 432 | "outputId": "d7343ff1-9caf-4194-8632-c81b524949d4" 433 | }, 434 | "source": [ 435 | "_ = plt.hist(a)" 436 | ], 437 | "execution_count": null, 438 | "outputs": [ 439 | { 440 | "output_type": "display_data", 441 | "data": { 442 | "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAXAAAAD4CAYAAAD1jb0+AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+j8jraAAANdklEQVR4nO3df4zkdX3H8eernvwhoB69BSkF11IsoJHTrtdWmgZjrPyIAVqaQhsllPRsIw0k2nAhqZKYJmdatUmt2FMImFhME6DSHlrJaUusFl3oAUcPC8UrBS/cUpqCpmm9490/5ntls+wyM7szs/PB5yOZ7Pf7nc/O95Xv7r7uO5+Z71yqCklSe35svQNIklbHApekRlngktQoC1ySGmWBS1KjNkxyZ5s2barZ2dlJ7lKSmnfPPfc8VVUzS7dPtMBnZ2eZn5+f5C4lqXlJ/m257U6hSFKjLHBJapQFLkmNssAlqVEWuCQ1ygKXpEZZ4JLUKAtckhplgUtSoyZ6JabaMLtt57rte9/289Zt31JrPAOXpEZZ4JLUKAtckhplgUtSo/oWeJITk3wtyd4kDya5stt+bZInkuzubueOP64k6bBB3oVyEPhAVd2b5GjgniR3dvd9oqr+eHzxJEkr6VvgVbUf2N8tP5tkL3DCuINJkl7cUHPgSWaBNwN3d5uuSHJ/khuSbFzhe7YmmU8yv7CwsKawkqTnDVzgSY4CbgGuqqpngOuAk4HN9M7QP7bc91XVjqqaq6q5mZkX/JdukqRVGqjAk7ycXnl/vqpuBaiqJ6vqUFU9B3wG2DK+mJKkpQZ5F0qA64G9VfXxRduPXzTsQmDP6ONJklYyyLtQzgTeAzyQZHe37RrgkiSbgQL2Ae8bS0JJ0rIGeRfK14Esc9cdo48jSRqUV2JKUqMscElqlAUuSY2ywCWpURa4JDXKApekRlngktQoC1ySGmWBS1KjLHBJatQgn4UiTczstp3rHWHi9m0/b70jqFGegUtSoyxwSWqUBS5JjbLAJalRFrgkNcoCl6RGWeCS1CgLXJIaZYFLUqMscElqlAUuSY2ywCWpURa4JDXKApekRlngktQoC1ySGmWBS1KjLHBJapQFLkmN6lvgSU5M8rUke5M8mOTKbvsxSe5M8nD3deP440qSDhvkDPwg8IGqOg34eeD9SU4HtgG7quoUYFe3LkmakL4FXlX7q+rebvlZYC9wAnA+cFM37CbggnGFlCS90FBz4ElmgTcDdwPHVdV+6JU8cOyow0mSVjZwgSc5CrgFuKqqnhni+7YmmU8yv7CwsJqMkqRlDFTgSV5Or7w/X1W3dpufTHJ8d//xwIHlvreqdlTVXFXNzczMjCKzJInB3oUS4Hpgb1V9fNFdtwOXdsuXAl8cfTxJ0ko2DDDmTOA9wANJdnfbrgG2A3+Z5HLgMeDXxhNRkrScvgVeVV8HssLd7xhtHEnSoLwSU5IaZYFLUqMscElqlAUuSY2ywCWpURa4JDXKApekRlngktQoC1ySGmWBS1KjLHBJapQFLkmNssAlqVEWuCQ1ygKXpEZZ4JLUKAtckhplgUtSoyxwSWqUBS5JjbLAJalRFrgkNcoCl6RGWeCS1CgLXJIaZYFLUqMscElqlAUuSY2ywCWpURa4JDXKApekRvUt8CQ3JDmQZM+ibdcmeSLJ7u527nhjSpKWGuQM/Ebg7GW2f6KqNne3O0YbS5LUT98Cr6q7gKcnkEWSNIS1zIFfkeT+bopl40qDkmxNMp9kfmFhYQ27kyQtttoCvw44GdgM7Ac+ttLAqtpRVXNVNTczM7PK3UmSllpVgVfVk1V1qKqeAz4DbBltLElSP6sq8CTHL1q9ENiz0lhJ0nhs6Dcgyc3AWcCmJI8DHwbOSrIZKGAf8L4xZpQkLaNvgVfVJctsvn4MWSRJQ/BKTElqlAUuSY2ywCWpURa4JDXKApekRlngktQoC1ySGmWBS1KjLHBJapQFLkm
N6nspvaTxmt22c70jTNy+7eetd4SXBM/AJalRFrgkNcoCl6RGWeCS1CgLXJIaZYFLUqMscElqlAUuSY2ywCWpURa4JDXKApekRlngktQoC1ySGmWBS1KjLHBJapQFLkmNssAlqVEWuCQ1ygKXpEZZ4JLUqL4FnuSGJAeS7Fm07ZgkdyZ5uPu6cbwxJUlLDXIGfiNw9pJt24BdVXUKsKtblyRNUN8Cr6q7gKeXbD4fuKlbvgm4YMS5JEl9rHYO/Liq2g/QfT12pYFJtiaZTzK/sLCwyt1JkpYa+4uYVbWjquaqam5mZmbcu5OkHxmrLfAnkxwP0H09MLpIkqRBrLbAbwcu7ZYvBb44mjiSpEEN8jbCm4FvAj+T5PEklwPbgXcmeRh4Z7cuSZqgDf0GVNUlK9z1jhFnkSQNwSsxJalRFrgkNcoCl6RGWeCS1CgLXJIaZYFLUqMscElqlAUuSY2ywCWpURa4JDWq76X0gtltO9c7giS9gGfgktQoC1ySGmWBS1KjLHBJapQFLkmNssAlqVEWuCQ1yveBS/qRsZ7XdOzbft7IH9MzcElqlAUuSY2ywCWpURa4JDXKApekRlngktQoC1ySGmWBS1KjLHBJapQFLkmNssAlqVEWuCQ1ak0fZpVkH/AscAg4WFVzowglSepvFJ9G+PaqemoEjyNJGoJTKJLUqLWegRfwlSQF/HlV7Vg6IMlWYCvASSedtMbdSXopWM/P5X4pWesZ+JlV9RbgHOD9SX5p6YCq2lFVc1U1NzMzs8bdSZIOW1OBV9X3uq8HgNuALaMIJUnqb9UFnuTIJEcfXgZ+GdgzqmCSpBe3ljnw44Dbkhx+nL+oqi+PJJUkqa9VF3hVPQqcMcIskqQh+DZCSWqUBS5JjbLAJalRFrgkNcoCl6RGWeCS1CgLXJIaZYFLUqMscElqlAUuSY2ywCWpURa4JDXKApekRlngktQoC1ySGmWBS1KjLHBJapQFLkmNWsv/iTlRs9t2rncESZoqnoFLUqMscElqlAUuSY2ywCWpURa4JDXKApekRlngktQoC1ySGmWBS1KjLHBJapQFLkmNssAlqVFrKvAkZyf5TpJHkmwbVShJUn+rLvAkLwP+DDgHOB24JMnpowomSXpxazkD3wI8UlWPVtX/Al8Azh9NLElSP2v5PPATgH9ftP448HNLByXZCmztVr+f5DsDPv4m4Kk15Bsnsw1vWnOB2VbLbEPIR/9/cTXZXrvcxrUUeJbZVi/YULUD2DH0gyfzVTW3mmDjZrbhTWsuMNtqmW11RpltLVMojwMnLlr/SeB7a4sjSRrUWgr828ApSV6X5AjgYuD20cSSJPWz6imUqjqY5Argb4GXATdU1YMjS7aKaZcJMtvwpjUXmG21zLY6I8uWqhdMW0uSGuCVmJLUKAtckhq1rgXe71L8JKcm+WaS/0nywSnL9ptJ7u9u30hyxhRlO7/LtTvJfJJfnJZsi8a9NcmhJBdNS7YkZyX5r+647U7yoWnJtijf7iQPJvn7acmW5PcXHbM93c/1mCnJ9qokf53kvu64XTaJXANm25jktu5v9VtJ3jj0TqpqXW70Xvj8V+CngCOA+4DTl4w5Fngr8IfAB6cs29uAjd3yOcDdU5TtKJ5/feNNwEPTkm3RuK8CdwAXTUs24Czgbyb1ezZktlcD/wyc1K0fOy3Zlox/N/DVackGXAN8tFueAZ4GjpiSbH8EfLhbPhXYNex+1vMMvO+l+FV1oKq+DfxwCrN9o6r+s1v9R3rvg5+WbN+v7rcCOJJlLrBar2yd3wNuAQ5MKNcw2dbDINl+A7i1qh6D3t/GFGVb7BLg5okkGyxbAUcnCb0Tm6eBg1OS7XRgF0BVPQTMJjlumJ2sZ4Evdyn+CeuUZalhs10OfGmsiZ43ULYkFyZ5CNgJ/Na0ZEtyAnAh8OkJZTps0J/pL3RPt7+U5A2TiTZQttcDG5P8XZJ7krx3irIBkOQVwNn0/nGehEGyfRI4jd5
Fhg8AV1bVc1OS7T7gVwCSbKF3ufxQJ4LrWeADXYq/TgbOluTt9Ar86rEmWrTLZbYt9xEGt1XVqcAFwEfGnqpnkGx/AlxdVYcmkGexQbLdC7y2qs4A/hT4q7Gn6hkk2wbgZ4HzgHcBf5Dk9eMOxnB/p+8G/qGqnh5jnsUGyfYuYDfwE8Bm4JNJXjnuYAyWbTu9f5R303tW+k8M+exgLZ+FslbTfCn+QNmSvAn4LHBOVf3HNGU7rKruSnJykk1VNe4P9xkk2xzwhd4zWjYB5yY5WFXjLsu+2arqmUXLdyT51BQdt8eBp6rqB8APktwFnAH8yxRkO+xiJjd9AoNluwzY3k0pPpLku/Tmm7+13tm637fLALopnu92t8FN4sWGFSb5NwCPAq/j+Un+N6ww9lom+yJm32zAScAjwNum7bgBP83zL2K+BXji8Pp6Z1sy/kYm9yLmIMftNYuO2xbgsWk5bvSmAXZ1Y18B7AHeOA3ZunGvoje/fOQkfp5DHLfrgGu75eO6v4VNU5Lt1XQvqAK/DXxu2P2s2xl4rXApfpLf6e7/dJLXAPPAK4HnklxF75XcZ1Z84AllAz4E/Djwqe5s8mBN4NPPBsz2q8B7k/wQ+G/g16v7LZmCbOtiwGwXAb+b5CC943bxtBy3qtqb5MvA/cBzwGeras80ZOuGXgh8pXrPECZiwGwfAW5M8gC9aY2ra/zPqAbNdhrwuSSH6L3D6PJh9+Ol9JLUKK/ElKRGWeCS1CgLXJIaZYFLUqMscElqlAUuSY2ywCWpUf8HwyiT/ngg2PsAAAAASUVORK5CYII=\n", 443 | "text/plain": [ 444 | "
" 445 | ] 446 | }, 447 | "metadata": { 448 | "tags": [], 449 | "needs_background": "light" 450 | } 451 | } 452 | ] 453 | }, 454 | { 455 | "cell_type": "code", 456 | "metadata": { 457 | "id": "E2rFwCZ0hU9H", 458 | "colab_type": "code", 459 | "colab": {} 460 | }, 461 | "source": [ 462 | "" 463 | ], 464 | "execution_count": null, 465 | "outputs": [] 466 | } 467 | ] 468 | } -------------------------------------------------------------------------------- /notebooks/single-point-regression-gradient.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "name": "single-point-regression-gradient.ipynb", 7 | "provenance": [], 8 | "include_colab_link": true 9 | }, 10 | "kernelspec": { 11 | "display_name": "Python 3", 12 | "language": "python", 13 | "name": "python3" 14 | }, 15 | "language_info": { 16 | "codemirror_mode": { 17 | "name": "ipython", 18 | "version": 3 19 | }, 20 | "file_extension": ".py", 21 | "mimetype": "text/x-python", 22 | "name": "python", 23 | "nbconvert_exporter": "python", 24 | "pygments_lexer": "ipython3", 25 | "version": "3.7.6" 26 | } 27 | }, 28 | "cells": [ 29 | { 30 | "cell_type": "markdown", 31 | "metadata": { 32 | "id": "view-in-github", 33 | "colab_type": "text" 34 | }, 35 | "source": [ 36 | "\"Open" 37 | ] 38 | }, 39 | { 40 | "cell_type": "markdown", 41 | "metadata": { 42 | "id": "dlQvcmWWNd4Y" 43 | }, 44 | "source": [ 45 | "# Gradient of a Single-Point Regression" 46 | ] 47 | }, 48 | { 49 | "cell_type": "markdown", 50 | "metadata": { 51 | "id": "5JtUZ9KYNd4Z" 52 | }, 53 | "source": [ 54 | "In this notebook, we calculate the gradient of quadratic cost with respect to a straight-line regression model's parameters. We keep the partial derivatives as simple as possible by limiting the model to handling a single data point. 
" 55 | ] 56 | }, 57 | { 58 | "cell_type": "code", 59 | "metadata": { 60 | "id": "EdScrjQCNd4Z" 61 | }, 62 | "source": [ 63 | "import torch" 64 | ], 65 | "execution_count": 1, 66 | "outputs": [] 67 | }, 68 | { 69 | "cell_type": "markdown", 70 | "metadata": { 71 | "id": "NnFhWE9kNd4d" 72 | }, 73 | "source": [ 74 | "Let's use the same data as we did in the [*Regression in PyTorch* notebook](https://github.com/jonkrohn/ML-foundations/blob/master/notebooks/regression-in-pytorch.ipynb) as well as for demonstrating the Moore-Penrose Pseudoinverse in the [*Linear Algebra II* notebook](https://github.com/jonkrohn/ML-foundations/blob/master/notebooks/2-linear-algebra-ii.ipynb):" 75 | ] 76 | }, 77 | { 78 | "cell_type": "code", 79 | "metadata": { 80 | "id": "Oem10L3iNd4d" 81 | }, 82 | "source": [ 83 | "xs = torch.tensor([0, 1, 2, 3, 4, 5, 6, 7.])" 84 | ], 85 | "execution_count": 2, 86 | "outputs": [] 87 | }, 88 | { 89 | "cell_type": "code", 90 | "metadata": { 91 | "id": "jPZsXuwWNd4f" 92 | }, 93 | "source": [ 94 | "ys = torch.tensor([1.86, 1.31, .62, .33, .09, -.67, -1.23, -1.37])" 95 | ], 96 | "execution_count": 3, 97 | "outputs": [] 98 | }, 99 | { 100 | "cell_type": "markdown", 101 | "metadata": { 102 | "id": "yM6hk9NeNd4i" 103 | }, 104 | "source": [ 105 | "The slope of a line is given by $y = mx + b$:" 106 | ] 107 | }, 108 | { 109 | "cell_type": "code", 110 | "metadata": { 111 | "id": "NtyilFoYNd4i" 112 | }, 113 | "source": [ 114 | "def regression(my_x, my_m, my_b):\n", 115 | " return my_m*my_x + my_b" 116 | ], 117 | "execution_count": 4, 118 | "outputs": [] 119 | }, 120 | { 121 | "cell_type": "markdown", 122 | "metadata": { 123 | "id": "ROV3p3BHNd4l" 124 | }, 125 | "source": [ 126 | "Let's initialize $m$ and $b$ with the same \"random\" near-zero values as we did in the *Regression in PyTorch* notebook: " 127 | ] 128 | }, 129 | { 130 | "cell_type": "code", 131 | "metadata": { 132 | "id": "qmiBbvH1Nd4l" 133 | }, 134 | "source": [ 135 | "m = 
torch.tensor([0.9]).requires_grad_()" 136 | ], 137 | "execution_count": 5, 138 | "outputs": [] 139 | }, 140 | { 141 | "cell_type": "code", 142 | "metadata": { 143 | "id": "TRxe0rU9Nd4n" 144 | }, 145 | "source": [ 146 | "b = torch.tensor([0.1]).requires_grad_()" 147 | ], 148 | "execution_count": 6, 149 | "outputs": [] 150 | }, 151 | { 152 | "cell_type": "markdown", 153 | "metadata": { 154 | "id": "7Iu4uKsqNd4r" 155 | }, 156 | "source": [ 157 | "To keep the partial derivatives as simple as possible, let's move forward with a single instance $i$ from the eight possible data points: " 158 | ] 159 | }, 160 | { 161 | "cell_type": "code", 162 | "metadata": { 163 | "id": "_ttss5lTNd4s" 164 | }, 165 | "source": [ 166 | "i = 7\n", 167 | "x = xs[i]\n", 168 | "y = ys[i]" 169 | ], 170 | "execution_count": 7, 171 | "outputs": [] 172 | }, 173 | { 174 | "cell_type": "code", 175 | "metadata": { 176 | "colab": { 177 | "base_uri": "https://localhost:8080/" 178 | }, 179 | "id": "TO7ozmjeNd4u", 180 | "outputId": "5e072736-5c54-4456-952a-471e5239fda5" 181 | }, 182 | "source": [ 183 | "x" 184 | ], 185 | "execution_count": 8, 186 | "outputs": [ 187 | { 188 | "output_type": "execute_result", 189 | "data": { 190 | "text/plain": [ 191 | "tensor(7.)" 192 | ] 193 | }, 194 | "metadata": { 195 | "tags": [] 196 | }, 197 | "execution_count": 8 198 | } 199 | ] 200 | }, 201 | { 202 | "cell_type": "code", 203 | "metadata": { 204 | "colab": { 205 | "base_uri": "https://localhost:8080/" 206 | }, 207 | "id": "b63IzdS1Nd4x", 208 | "outputId": "892dc353-4ba2-4886-fe6e-18aa59bbd51a" 209 | }, 210 | "source": [ 211 | "y" 212 | ], 213 | "execution_count": 9, 214 | "outputs": [ 215 | { 216 | "output_type": "execute_result", 217 | "data": { 218 | "text/plain": [ 219 | "tensor(-1.3700)" 220 | ] 221 | }, 222 | "metadata": { 223 | "tags": [] 224 | }, 225 | "execution_count": 9 226 | } 227 | ] 228 | }, 229 | { 230 | "cell_type": "markdown", 231 | "metadata": { 232 | "id": "TVkbo0oPNd4z" 233 | }, 234 | "source": [ 
235 | "**Step 1**: Forward pass" 236 | ] 237 | }, 238 | { 239 | "cell_type": "markdown", 240 | "metadata": { 241 | "id": "M_bUxX__Nd4z" 242 | }, 243 | "source": [ 244 | "We can flow the scalar tensor $x$ through our regression model to produce $\\hat{y}$, an estimate of $y$. Prior to any model training, this is an arbitrary estimate:" 245 | ] 246 | }, 247 | { 248 | "cell_type": "code", 249 | "metadata": { 250 | "colab": { 251 | "base_uri": "https://localhost:8080/" 252 | }, 253 | "id": "FBB2iwPiNd40", 254 | "outputId": "7e9530ce-f15b-4998-d776-29f09140aa6f" 255 | }, 256 | "source": [ 257 | "yhat = regression(x, m, b)\n", 258 | "yhat" 259 | ], 260 | "execution_count": 10, 261 | "outputs": [ 262 | { 263 | "output_type": "execute_result", 264 | "data": { 265 | "text/plain": [ 266 | "tensor([6.4000], grad_fn=<AddBackward0>)" 267 | ] 268 | }, 269 | "metadata": { 270 | "tags": [] 271 | }, 272 | "execution_count": 10 273 | } 274 | ] 275 | }, 276 | { 277 | "cell_type": "markdown", 278 | "metadata": { 279 | "id": "6Hy2sDlNNd42" 280 | }, 281 | "source": [ 282 | "**Step 2**: Compare $\\hat{y}$ with true $y$ to calculate cost $C$" 283 | ] 284 | }, 285 | { 286 | "cell_type": "markdown", 287 | "metadata": { 288 | "id": "JtHOeylJNd43" 289 | }, 290 | "source": [ 291 | "In the *Regression in PyTorch* notebook, we used mean-squared error, which averages quadratic cost over multiple data points. With a single data point, here we can use quadratic cost alone. 
It is defined by: $$ C = (\\hat{y} - y)^2 $$" 292 | ] 293 | }, 294 | { 295 | "cell_type": "code", 296 | "metadata": { 297 | "id": "t-THZMH0Nd43" 298 | }, 299 | "source": [ 300 | "def squared_error(my_yhat, my_y):\n", 301 | " return (my_yhat - my_y)**2" 302 | ], 303 | "execution_count": 11, 304 | "outputs": [] 305 | }, 306 | { 307 | "cell_type": "code", 308 | "metadata": { 309 | "colab": { 310 | "base_uri": "https://localhost:8080/" 311 | }, 312 | "id": "1LSKXx5XNd45", 313 | "outputId": "08d223f2-c5e1-4f7c-d5f4-750ab79dc002" 314 | }, 315 | "source": [ 316 | "C = squared_error(yhat, y)\n", 317 | "C" 318 | ], 319 | "execution_count": 12, 320 | "outputs": [ 321 | { 322 | "output_type": "execute_result", 323 | "data": { 324 | "text/plain": [ 325 | "tensor([60.3729], grad_fn=<PowBackward0>)" 326 | ] 327 | }, 328 | "metadata": { 329 | "tags": [] 330 | }, 331 | "execution_count": 12 332 | } 333 | ] 334 | }, 335 | { 336 | "cell_type": "markdown", 337 | "metadata": { 338 | "id": "wu4nlO3-Nd47" 339 | }, 340 | "source": [ 341 | "**Step 3**: Use autodiff to calculate gradient of $C$ w.r.t. 
parameters" 342 | ] 343 | }, 344 | { 345 | "cell_type": "code", 346 | "metadata": { 347 | "id": "Mk9Lx-gTNd48" 348 | }, 349 | "source": [ 350 | "C.backward()" 351 | ], 352 | "execution_count": 13, 353 | "outputs": [] 354 | }, 355 | { 356 | "cell_type": "markdown", 357 | "metadata": { 358 | "id": "eXQJxYduNd4-" 359 | }, 360 | "source": [ 361 | "The partial derivative of $C$ with respect to $m$ ($\\frac{\\partial C}{\\partial m}$) is: " 362 | ] 363 | }, 364 | { 365 | "cell_type": "code", 366 | "metadata": { 367 | "colab": { 368 | "base_uri": "https://localhost:8080/" 369 | }, 370 | "id": "1yQ7w1bfNd4-", 371 | "outputId": "d0c4d659-729b-4e97-b622-7f02b7445f46" 372 | }, 373 | "source": [ 374 | "m.grad" 375 | ], 376 | "execution_count": 14, 377 | "outputs": [ 378 | { 379 | "output_type": "execute_result", 380 | "data": { 381 | "text/plain": [ 382 | "tensor([108.7800])" 383 | ] 384 | }, 385 | "metadata": { 386 | "tags": [] 387 | }, 388 | "execution_count": 14 389 | } 390 | ] 391 | }, 392 | { 393 | "cell_type": "markdown", 394 | "metadata": { 395 | "id": "EGK1NwomNd5B" 396 | }, 397 | "source": [ 398 | "And the partial derivative of $C$ with respect to $b$ ($\\frac{\\partial C}{\\partial b}$) is: " 399 | ] 400 | }, 401 | { 402 | "cell_type": "code", 403 | "metadata": { 404 | "colab": { 405 | "base_uri": "https://localhost:8080/" 406 | }, 407 | "id": "vIeBu-tINd5B", 408 | "outputId": "ac2461e7-90ec-430e-93cd-51cf67e792f7" 409 | }, 410 | "source": [ 411 | "b.grad" 412 | ], 413 | "execution_count": 15, 414 | "outputs": [ 415 | { 416 | "output_type": "execute_result", 417 | "data": { 418 | "text/plain": [ 419 | "tensor([15.5400])" 420 | ] 421 | }, 422 | "metadata": { 423 | "tags": [] 424 | }, 425 | "execution_count": 15 426 | } 427 | ] 428 | }, 429 | { 430 | "cell_type": "markdown", 431 | "metadata": { 432 | "id": "NK1tyH4WNd5E" 433 | }, 434 | "source": [ 435 | "**Return to *Calculus II* slides here to derive $\\frac{\\partial C}{\\partial m}$ and $\\frac{\\partial 
C}{\\partial b}$.**" 436 | ] 437 | }, 438 | { 439 | "cell_type": "markdown", 440 | "metadata": { 441 | "id": "wTgfR-wINd5F" 442 | }, 443 | "source": [ 444 | "$$ \\frac{\\partial C}{\\partial m} = 2x(\\hat{y} - y) $$" 445 | ] 446 | }, 447 | { 448 | "cell_type": "code", 449 | "metadata": { 450 | "colab": { 451 | "base_uri": "https://localhost:8080/" 452 | }, 453 | "id": "AKjWDa4QNd5F", 454 | "outputId": "5d29bb67-6538-4ff9-a868-85927f114c21" 455 | }, 456 | "source": [ 457 | "2*x*(yhat.item()-y)" 458 | ], 459 | "execution_count": 16, 460 | "outputs": [ 461 | { 462 | "output_type": "execute_result", 463 | "data": { 464 | "text/plain": [ 465 | "tensor(108.7800)" 466 | ] 467 | }, 468 | "metadata": { 469 | "tags": [] 470 | }, 471 | "execution_count": 16 472 | } 473 | ] 474 | }, 475 | { 476 | "cell_type": "markdown", 477 | "metadata": { 478 | "id": "yCV3zMIjNd5H" 479 | }, 480 | "source": [ 481 | "$$ \\frac{\\partial C}{\\partial b} = 2(\\hat{y}-y) $$" 482 | ] 483 | }, 484 | { 485 | "cell_type": "code", 486 | "metadata": { 487 | "colab": { 488 | "base_uri": "https://localhost:8080/" 489 | }, 490 | "id": "gNrOSpcONd5H", 491 | "outputId": "73b37a87-7535-4049-9587-aa8b1831c69f" 492 | }, 493 | "source": [ 494 | "2*(yhat.item()-y)" 495 | ], 496 | "execution_count": 17, 497 | "outputs": [ 498 | { 499 | "output_type": "execute_result", 500 | "data": { 501 | "text/plain": [ 502 | "tensor(15.5400)" 503 | ] 504 | }, 505 | "metadata": { 506 | "tags": [] 507 | }, 508 | "execution_count": 17 509 | } 510 | ] 511 | }, 512 | { 513 | "cell_type": "markdown", 514 | "metadata": { 515 | "id": "zXor7Ev7Nd5J" 516 | }, 517 | "source": [ 518 | "### The Gradient of Cost, $\\nabla C$" 519 | ] 520 | }, 521 | { 522 | "cell_type": "markdown", 523 | "metadata": { 524 | "id": "XeII8EHQNd5K" 525 | }, 526 | "source": [ 527 | "The gradient of cost, which is symbolized $\\nabla C$ (pronounced \"nabla C\"), is a vector of all the partial derivatives of $C$ with respect to each of the individual model 
parameters: " 528 | ] 529 | }, 530 | { 531 | "cell_type": "markdown", 532 | "metadata": { 533 | "id": "kVnD7j3HNd5K" 534 | }, 535 | "source": [ 536 | "$\\nabla C = \\nabla_p C = \\left[ \\frac{\\partial{C}}{\\partial{p_1}}, \\frac{\\partial{C}}{\\partial{p_2}}, \\cdots, \\frac{\\partial{C}}{\\partial{p_n}} \\right]^T $" 537 | ] 538 | }, 539 | { 540 | "cell_type": "markdown", 541 | "metadata": { 542 | "id": "ILK7BRLJNd5K" 543 | }, 544 | "source": [ 545 | "In this case, there are only two parameters, $b$ and $m$: " 546 | ] 547 | }, 548 | { 549 | "cell_type": "markdown", 550 | "metadata": { 551 | "id": "5Yq3BQ3YNd5L" 552 | }, 553 | "source": [ 554 | "$\\nabla C = \\left[ \\frac{\\partial{C}}{\\partial{b}}, \\frac{\\partial{C}}{\\partial{m}} \\right]^T $" 555 | ] 556 | }, 557 | { 558 | "cell_type": "code", 559 | "metadata": { 560 | "colab": { 561 | "base_uri": "https://localhost:8080/" 562 | }, 563 | "id": "ZsZhOKFZNd5L", 564 | "outputId": "d3dc7def-3e43-40c2-9eb2-d7b86db95994" 565 | }, 566 | "source": [ 567 | "gradient = torch.tensor([[b.grad.item(), m.grad.item()]]).T\n", 568 | "gradient" 569 | ], 570 | "execution_count": 18, 571 | "outputs": [ 572 | { 573 | "output_type": "execute_result", 574 | "data": { 575 | "text/plain": [ 576 | "tensor([[ 15.5400],\n", 577 | " [108.7800]])" 578 | ] 579 | }, 580 | "metadata": { 581 | "tags": [] 582 | }, 583 | "execution_count": 18 584 | } 585 | ] 586 | } 587 | ] 588 | } --------------------------------------------------------------------------------