├── 01_LR_Introduction.pdf ├── 02_Odd_LogOdd_OddRatio.pdf ├── 03_Logit_Model.pdf ├── 04_Likelihood_Probability.pdf ├── 05_Maxium_Likelihood_Estimation.pdf ├── 06_LR_Assumptions.pdf ├── 07_LR_Assumptions.ipynb ├── 08_AIC_BIC.pdf ├── 09_Logistic_Regression.ipynb ├── 10_Multiclass_Classification.pdf ├── 11_Multi_Class_Classification.ipynb ├── 12_Regularization.pdf ├── 13_LR_Regularization.ipynb ├── 14_WOE_IV.pdf ├── 15_LR_WOE_IV.ipynb ├── 16_LR_Revision.pdf ├── 17_LR_1_Interview_Questions.pdf ├── 18_LR_2_Interview_Questions.pdf └── README.md /01_LR_Introduction.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sandipanpaul21/Logistic-regression-in-python/83f46ca3cbef084a01c368b8347f86a0c528d8be/01_LR_Introduction.pdf -------------------------------------------------------------------------------- /02_Odd_LogOdd_OddRatio.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sandipanpaul21/Logistic-regression-in-python/83f46ca3cbef084a01c368b8347f86a0c528d8be/02_Odd_LogOdd_OddRatio.pdf -------------------------------------------------------------------------------- /03_Logit_Model.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sandipanpaul21/Logistic-regression-in-python/83f46ca3cbef084a01c368b8347f86a0c528d8be/03_Logit_Model.pdf -------------------------------------------------------------------------------- /04_Likelihood_Probability.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sandipanpaul21/Logistic-regression-in-python/83f46ca3cbef084a01c368b8347f86a0c528d8be/04_Likelihood_Probability.pdf -------------------------------------------------------------------------------- /05_Maxium_Likelihood_Estimation.pdf: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/sandipanpaul21/Logistic-regression-in-python/83f46ca3cbef084a01c368b8347f86a0c528d8be/05_Maxium_Likelihood_Estimation.pdf -------------------------------------------------------------------------------- /06_LR_Assumptions.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sandipanpaul21/Logistic-regression-in-python/83f46ca3cbef084a01c368b8347f86a0c528d8be/06_LR_Assumptions.pdf -------------------------------------------------------------------------------- /08_AIC_BIC.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sandipanpaul21/Logistic-regression-in-python/83f46ca3cbef084a01c368b8347f86a0c528d8be/08_AIC_BIC.pdf -------------------------------------------------------------------------------- /10_Multiclass_Classification.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sandipanpaul21/Logistic-regression-in-python/83f46ca3cbef084a01c368b8347f86a0c528d8be/10_Multiclass_Classification.pdf -------------------------------------------------------------------------------- /11_Multi_Class_Classification.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "name": "11 Multi Class Classification.ipynb", 7 | "provenance": [], 8 | "authorship_tag": "ABX9TyPWNOFvn1SCWX1AdS9Jn8me", 9 | "include_colab_link": true 10 | }, 11 | "kernelspec": { 12 | "name": "python3", 13 | "display_name": "Python 3" 14 | }, 15 | "language_info": { 16 | "name": "python" 17 | } 18 | }, 19 | "cells": [ 20 | { 21 | "cell_type": "markdown", 22 | "metadata": { 23 | "id": "view-in-github", 24 | "colab_type": "text" 25 | }, 26 | "source": [ 27 | "\"Open" 28 | ] 29 | }, 30 | { 31 | "cell_type": "markdown", 32 | "source": [ 33 | "**Binary 
Classifiers for Multi-Class Classification**\n", 34 | "\n", 35 | "- Classification is a predictive modeling problem that involves assigning a class label to an example.\n", 36 | "- **Binary classification** refers to tasks where each example is assigned exactly one of two classes. \n", 37 | "- **Binary Classification:** Classification tasks with two classes.\n", 38 | "- **Multi-class classification** refers to tasks where each example is assigned exactly one of more than two classes.\n", 39 | "- **Multi-class Classification:** Classification tasks with more than two classes.\n", 40 | "\n", 41 | "*Some algorithms are designed for binary classification problems. Examples include:*\n", 42 | "\n", 43 | "1. Logistic Regression\n", 44 | "2. Support Vector Machines\n", 45 | "\n", 46 | "\n", 47 | "In their basic form, these algorithms cannot be applied directly to multi-class problems. Instead, heuristic methods can be used to split a multi-class classification problem into multiple binary classification datasets and train one binary classification model on each.\n", 48 | "\n", 49 | "Two examples of these heuristic methods are:\n", 50 | "\n", 51 | "1. One-vs-Rest (OvR)\n", 52 | "2. One-vs-One (OvO)" 53 | ], 54 | "metadata": { 55 | "id": "VyVM3wn3FDmZ" 56 | } 57 | }, 58 | { 59 | "cell_type": "markdown", 60 | "source": [ 61 | "**One-Vs-Rest (OvR) for Multi-Class Classification**\n", 62 | "\n", 63 | "- **One-vs-rest (OvR)**, also referred to as **One-vs-All (OvA)**, is a heuristic method for using binary classification algorithms for multi-class classification.\n", 64 | "\n", 65 | "- It involves splitting the multi-class dataset into multiple binary classification problems. \n", 66 | "- A binary classifier is then trained on each binary classification problem and predictions are made using the model that is the most confident.\n", 67 | "\n", 68 | "- For example, \n", 69 | " - Consider a multi-class classification problem with examples for each of the classes ‘red’, ‘blue’, and ‘green’. 
\n", 70 | " - This could be divided into three binary classification datasets as follows:\n", 71 | " 1. Binary Classification Problem 1: red vs [blue, green]\n", 72 | " 2. Binary Classification Problem 2: blue vs [red, green]\n", 73 | " 3. Binary Classification Problem 3: green vs [red, blue]\n", 74 | "\n", 75 | "**Possible Issue**\n", 76 | "- A possible downside of this approach is that it requires one model to be created for each class. \n", 77 | "- For example, three classes require three models. \n", 78 | " - This could be an issue for large datasets (e.g. millions of rows), since every model is trained on all rows. \n", 79 | " - It could also be an issue for very large numbers of classes (e.g. hundreds of classes).\n", 80 | "\n", 81 | "**Approach of OvA or OvR**\n", 82 | "- The obvious approach is one-versus-the-rest (also called one-vs-all), in which we train C binary classifiers, fc(x), where the data from class c is treated as positive and the data from all the other classes is treated as negative.\n", 83 | "- This approach requires that each model predicts a class membership probability or a probability-like score. The argmax of these scores (class index with the largest score) is then used to predict a class.\n", 84 | "- This approach is commonly used for algorithms that naturally predict a numerical class membership probability or score, such as Logistic Regression. \n", 85 | "In scikit-learn, LogisticRegression used the OvR strategy by default for multi-class problems in older releases (before 0.22); newer releases default to “auto”, which selects a multinomial (softmax) fit for most solvers, so OvR has to be requested explicitly.\n", 86 | "\n", 87 | "**Python Example**\n", 88 | "- We can demonstrate this with an example on a 3-class classification problem using the LogisticRegression algorithm. \n", 89 | "- The strategy for handling multi-class classification can be set via the “multi_class” argument, which can be set to “ovr” for the one-vs-rest strategy. Note that scikit-learn 1.5 and later deprecate this argument in favour of the OneVsRestClassifier wrapper."
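The OvR procedure described above (one binary model per class, then an argmax over the per-class scores) can be sketched by hand. This is a minimal illustration, not the notebook author's code: the names `binary_models`, `scores` and `yhat_manual` are hypothetical, `max_iter=1000` is an assumption made only to avoid convergence warnings, and accuracy is measured on the training data purely as a sanity check.

```python
# Manual one-vs-rest sketch: one binary LogisticRegression per class,
# then argmax over the positive-class probabilities.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# same synthetic 3-class dataset as in the notebook
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5,
                           n_redundant=5, n_classes=3, random_state=1)

classes = np.unique(y)
binary_models = []
for c in classes:
    # class c is the positive class, every other class is negative
    y_binary = (y == c).astype(int)
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X, y_binary)
    binary_models.append(clf)

# one column of "probability of being class c" per binary model
scores = np.column_stack([m.predict_proba(X)[:, 1] for m in binary_models])

# predicted class = class whose binary model is most confident
yhat_manual = classes[np.argmax(scores, axis=1)]
train_accuracy = np.mean(yhat_manual == y)
print(scores.shape, round(float(train_accuracy), 3))
```

Under these assumptions the result should agree with what `OneVsRestClassifier(LogisticRegression())` produces, since both take the argmax over the same per-class scores.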
 90 | ], 91 | "metadata": { 92 | "id": "LNYaXKSJFjUq" 93 | } 94 | }, 95 | { 96 | "cell_type": "code", 97 | "source": [ 98 | "# logistic regression for multi-class classification using built-in one-vs-rest\n", 99 | "from sklearn.datasets import make_classification\n", 100 | "from sklearn.linear_model import LogisticRegression\n", 101 | "\n", 102 | "# define dataset\n", 103 | "X, y = make_classification(n_samples=1000, n_features=10, n_informative=5, n_redundant=5, n_classes=3, random_state=1)\n", 104 | "\n", 105 | "# define model (note: multi_class is deprecated as of scikit-learn 1.5)\n", 106 | "model = LogisticRegression(multi_class='ovr')\n", 107 | "\n", 108 | "# fit model\n", 109 | "model.fit(X, y)\n", 110 | "\n", 111 | "# make predictions\n", 112 | "yhat = model.predict(X)" 113 | ], 114 | "metadata": { 115 | "id": "LoAiNMAzFjEX" 116 | }, 117 | "execution_count": 1, 118 | "outputs": [] 119 | }, 120 | { 121 | "cell_type": "markdown", 122 | "source": [ 123 | "- The scikit-learn library also provides a separate OneVsRestClassifier class that allows the one-vs-rest strategy to be used with any classifier.\n", 124 | "\n", 125 | "- This class can wrap a binary classifier like Logistic Regression or Perceptron for multi-class classification, or even other classifiers that natively support multi-class classification."
 126 | ], 127 | "metadata": { 128 | "id": "VoM4ch_1FqKC" 129 | } 130 | }, 131 | { 132 | "cell_type": "code", 133 | "execution_count": 2, 134 | "metadata": { 135 | "id": "AcL956kJBPes" 136 | }, 137 | "outputs": [], 138 | "source": [ 139 | "# logistic regression for multi-class classification using one-vs-rest\n", 140 | "from sklearn.datasets import make_classification\n", 141 | "from sklearn.linear_model import LogisticRegression\n", 142 | "from sklearn.multiclass import OneVsRestClassifier\n", 143 | "\n", 144 | "# define dataset\n", 145 | "X, y = make_classification(n_samples=1000, n_features=10, n_informative=5, n_redundant=5, n_classes=3, random_state=1)\n", 146 | "\n", 147 | "# define model\n", 148 | "model = LogisticRegression()\n", 149 | "\n", 150 | "# define the ovr strategy\n", 151 | "ovr = OneVsRestClassifier(model)\n", 152 | "\n", 153 | "# fit model\n", 154 | "ovr.fit(X, y)\n", 155 | "\n", 156 | "# make predictions\n", 157 | "yhat = ovr.predict(X)" 158 | ] 159 | }, 160 | { 161 | "cell_type": "markdown", 162 | "source": [ 163 | "**One-Vs-One for Multi-Class Classification**\n", 164 | "\n", 165 | "- One-vs-One (OvO for short) is another heuristic method for using binary classification algorithms for multi-class classification.\n", 166 | "\n", 167 | "- Like one-vs-rest, one-vs-one splits a multi-class classification dataset into binary classification problems. \n", 168 | "\n", 169 | "- Unlike one-vs-rest, which creates one binary dataset per class, the one-vs-one approach creates one binary dataset for each pair of classes.\n", 170 | "- For example\n", 171 | " - Consider a multi-class classification problem with four classes: ‘red’, ‘blue’, ‘green’, and ‘yellow’. \n", 172 | " - This could be divided into six binary classification datasets as follows:\n", 173 | " \n", 174 | " Binary Classification Problem 1: red vs. blue\n", 175 | "\n", 176 | " Binary Classification Problem 2: red vs. 
green\n", 177 | "\n", 178 | " Binary Classification Problem 3: red vs. yellow\n", 179 | "\n", 180 | " Binary Classification Problem 4: blue vs. green\n", 181 | "\n", 182 | " Binary Classification Problem 5: blue vs. yellow\n", 183 | "\n", 184 | " Binary Classification Problem 6: green vs. yellow\n", 185 | "\n", 186 | "- The formula for calculating the number of binary datasets, and in turn, models, is as follows:\n", 187 | "\n", 188 | " (NumClasses * (NumClasses – 1)) / 2\n", 189 | "\n", 190 | "- We can see that for four classes, this gives us the expected value of six binary classification problems:\n", 191 | "\n", 192 | " (NumClasses * (NumClasses – 1)) / 2\n", 193 | " \n", 194 | " (4 * (4 – 1)) / 2\n", 195 | " \n", 196 | " (4 * 3) / 2\n", 197 | " \n", 198 | " 12 / 2\n", 199 | " \n", 200 | " 6\n", 201 | "\n", 202 | "Each binary classification model predicts one class label, and the class with the most predictions or votes is predicted by the one-vs-one strategy.\n", 203 | "\n", 204 | "- An alternative is to introduce K(K − 1)/2 binary discriminant functions, one for every possible pair of classes. \n", 205 | "- This is known as a **one-versus-one classifier**. \n", 206 | "- Each point is then classified according to a *majority vote amongst* the discriminant functions.\n", 207 | "- Similarly, if the binary classification models predict a numerical class membership score, such as a probability, then *the argmax of the sum of the scores (class with the largest sum score) is predicted as the class label.*\n", 208 | "\n", 209 | "The support vector machine implementation in scikit-learn is provided by the SVC class, which always trains its multi-class models with the one-vs-one method. The “decision_function_shape” argument only controls the shape of the decision function output: ‘ovo‘ returns one column per pair of classes, while the default ‘ovr‘ converts the pairwise scores into one column per class."
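The pair-count formula and the role of “decision_function_shape” can both be checked with a short sketch. The following is an illustration under stated assumptions, not part of the original notebook: it uses four classes (the notebook's own cells use three) so that the pairwise and per-class shapes differ, and the names `pairs`, `svc_ovo` and `svc_ovr` are hypothetical.

```python
# Sketch: enumerate the one-vs-one pairs and compare SVC decision function shapes.
from itertools import combinations

import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# the six binary problems for four classes
labels = ['red', 'blue', 'green', 'yellow']
pairs = list(combinations(labels, 2))
n_pairs = len(labels) * (len(labels) - 1) // 2
print(pairs)
print(len(pairs), n_pairs)

# four-class synthetic dataset (assumption: n_classes=4 instead of the notebook's 3)
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5,
                           n_redundant=5, n_classes=4, random_state=1)

svc_ovo = SVC(decision_function_shape='ovo').fit(X, y)
svc_ovr = SVC(decision_function_shape='ovr').fit(X, y)

# 'ovo' exposes one score per pair of classes, 'ovr' one score per class
shape_ovo = svc_ovo.decision_function(X).shape
shape_ovr = svc_ovr.decision_function(X).shape
print(shape_ovo, shape_ovr)

# the predicted labels come from the same pairwise models in both cases
same_predictions = np.array_equal(svc_ovo.predict(X), svc_ovr.predict(X))
print(same_predictions)
```

With three classes both settings return three columns (3 pairs, 3 classes), which is why the difference is easy to miss on the notebook's dataset.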
210 | ], 211 | "metadata": { 212 | "id": "KrhI1hBuFqaP" 213 | } 214 | }, 215 | { 216 | "cell_type": "code", 217 | "source": [ 218 | "# SVM for multi-class classification using built-in one-vs-one\n", 219 | "from sklearn.datasets import make_classification\n", 220 | "from sklearn.svm import SVC\n", 221 | "\n", 222 | "# define dataset\n", 223 | "X, y = make_classification(n_samples=1000, n_features=10, n_informative=5, n_redundant=5, n_classes=3, random_state=1)\n", 224 | "\n", 225 | "# define model\n", 226 | "model = SVC(decision_function_shape='ovo')\n", 227 | "\n", 228 | "# fit model\n", 229 | "model.fit(X, y)\n", 230 | "\n", 231 | "# make predictions\n", 232 | "yhat = model.predict(X)" 233 | ], 234 | "metadata": { 235 | "id": "Avulnt8zL071" 236 | }, 237 | "execution_count": 3, 238 | "outputs": [] 239 | }, 240 | { 241 | "cell_type": "markdown", 242 | "source": [ 243 | "- The scikit-learn library also provides a separate OneVsOneClassifier class that allows the one-vs-one strategy to be used with any classifier.\n", 244 | "\n", 245 | "- This class can be used with a binary classifier like SVM, Logistic Regression or Perceptron for multi-class classification, or even other classifiers that natively support multi-class classification.\n", 246 | "\n", 247 | "- It is very easy to use and requires that a classifier that is to be used for binary classification be provided to the OneVsOneClassifier as an argument." 
 248 | ], 249 | "metadata": { 250 | "id": "SraJXdoKL0h3" 251 | } 252 | }, 253 | { 254 | "cell_type": "code", 255 | "source": [ 256 | "# SVM for multi-class classification using one-vs-one\n", 257 | "from sklearn.datasets import make_classification\n", 258 | "from sklearn.svm import SVC\n", 259 | "from sklearn.multiclass import OneVsOneClassifier\n", 260 | "\n", 261 | "# define dataset\n", 262 | "X, y = make_classification(n_samples=1000, n_features=10, n_informative=5, n_redundant=5, n_classes=3, random_state=1)\n", 263 | "\n", 264 | "# define model\n", 265 | "model = SVC()\n", 266 | "\n", 267 | "# define ovo strategy\n", 268 | "ovo = OneVsOneClassifier(model)\n", 269 | "\n", 270 | "# fit model\n", 271 | "ovo.fit(X, y)\n", 272 | "\n", 273 | "# make predictions\n", 274 | "yhat = ovo.predict(X)" 275 | ], 276 | "metadata": { 277 | "id": "DRGfJE3nL15l" 278 | }, 279 | "execution_count": 4, 280 | "outputs": [] 281 | } 282 | ] 283 | } -------------------------------------------------------------------------------- /12_Regularization.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sandipanpaul21/Logistic-regression-in-python/83f46ca3cbef084a01c368b8347f86a0c528d8be/12_Regularization.pdf -------------------------------------------------------------------------------- /13_LR_Regularization.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "name": "13_LR_Regularization.ipynb", 7 | "provenance": [], 8 | "collapsed_sections": [], 9 | "authorship_tag": "ABX9TyOBOJ2j7rR2RWq4fqoSqdFZ", 10 | "include_colab_link": true 11 | }, 12 | "kernelspec": { 13 | "name": "python3", 14 | "display_name": "Python 3" 15 | }, 16 | "language_info": { 17 | "name": "python" 18 | } 19 | }, 20 | "cells": [ 21 | { 22 | "cell_type": "markdown", 23 | "metadata": { 24 | "id": "view-in-github", 25 | "colab_type": 
"text" 26 | }, 27 | "source": [ 28 | "\"Open" 29 | ] 30 | }, 31 | { 32 | "cell_type": "markdown", 33 | "source": [ 34 | "**Regularization**\n", 35 | "\n", 36 | "Regularization techniques work by penalizing (regularizing) the magnitude of the feature coefficients while also minimizing the error between predicted and actual observations.\n", 37 | "\n", 38 | "**Ridge (L2) and Lasso (L1) regression** \n", 39 | "- They are powerful techniques generally used for creating parsimonious models in the presence of a ‘large’ number of features. \n", 40 | "- Here ‘large’ can typically mean either of two things:\n", 41 | " 1. *Large enough to enhance the tendency of a model to overfit* (as few as 10 variables might cause overfitting)\n", 42 | " 2. *Large enough to cause computational challenges.* With modern systems, this situation might arise with millions or billions of features\n", 43 | "\n", 44 | "Though Ridge and Lasso might appear to work towards a common goal, their inherent properties and practical use cases differ substantially. \n", 45 | "- Both work by penalizing the magnitude of the feature coefficients while also minimizing the error between predicted and actual observations. \n", 46 | "- These are called **‘regularization’ techniques**. The key difference is in how they assign the penalty to the coefficients:\n", 47 | "\n", 48 | "**Ridge Regression:**\n", 49 | "\n", 50 | "- Performs L2 regularization\n", 51 | "- i.e. adds a penalty proportional to the sum of the squared coefficients\n", 52 | " \n", 53 | " Minimization objective = LS Obj + α * (sum of square of coefficients)\n", 54 | "\n", 55 | "**Lasso Regression:**\n", 56 | "- Performs L1 regularization\n", 57 | "- i.e. adds a penalty proportional to the sum of the absolute values of the coefficients\n", 58 | "\n", 59 | " Minimization objective = LS Obj + α * (sum of absolute value of coefficients)\n", 60 | "\n", 61 | "**Note** that here ‘LS Obj’ refers to ‘least squares objective’, i.e. 
the linear regression objective without regularization.\n", 62 | " \n", 63 | "**Why Penalize the Magnitude of Coefficients?**\n", 64 | "\n", 65 | "Let’s try to understand the impact of model complexity on the magnitude of coefficients. \n", 66 | "\n", 67 | "As an example, I have simulated a sine curve (between 60° and 300°) and added some random noise:" 68 | ], 69 | "metadata": { 70 | "id": "hvDEPAfrK1r-" 71 | } 72 | }, 73 | { 74 | "cell_type": "code", 75 | "source": [ 76 | "# Importing libraries\n", 77 | "\n", 78 | "import numpy as np\n", 79 | "import pandas as pd\n", 80 | "import random\n", 81 | "import matplotlib.pyplot as plt\n", 82 | "%matplotlib inline\n", 83 | "from matplotlib.pylab import rcParams\n", 84 | "rcParams['figure.figsize'] = 12, 10\n", 85 | "\n", 86 | "#Define input array with angles from 60deg to 300deg converted to radians\n", 87 | "x = np.array([i*np.pi/180 for i in range(60,300,4)])\n", 88 | "np.random.seed(10) #Setting seed for reproducibility\n", 89 | "y = np.sin(x) + np.random.normal(0,0.15,len(x))\n", 90 | "data = pd.DataFrame(np.column_stack([x,y]),columns=['x','y'])\n", 91 | "plt.plot(data['x'],data['y'],'.')" 92 | ], 93 | "metadata": { 94 | "colab": { 95 | "base_uri": "https://localhost:8080/", 96 | "height": 610 97 | }, 98 | "id": "1mWldIWiMHQZ", 99 | "outputId": "171e8046-5d72-4279-c370-01909e7956b1" 100 | }, 101 | "execution_count": 1, 102 | "outputs": [ 103 | { 104 | "output_type": "execute_result", 105 | "data": { 106 | "text/plain": [ 107 | "[]" 108 | ] 109 | }, 110 | "metadata": {}, 111 | "execution_count": 1 112 | }, 113 | { 114 | "output_type": "display_data", 115 | "data": { 116 | "image/png": "\n", 117 | "text/plain": [ 118 | "
" 119 | ] 120 | }, 121 | "metadata": { 122 | "needs_background": "light" 123 | } 124 | } 125 | ] 126 | }, 127 | { 128 | "cell_type": "markdown", 129 | "source": [ 130 | "This resembles a sine curve, but not exactly, because of the noise. \n", 131 | "\n", 132 | "Let’s try to estimate the sine function using polynomial regression with powers of x from 1 to 15. \n", 133 | "\n", 134 | "Let’s add a column for each power up to 15 to our dataframe. " 135 | ], 136 | "metadata": { 137 | "id": "q4N-tDcCK1xQ" 138 | } 139 | }, 140 | { 141 | "cell_type": "code", 142 | "source": [ 143 | "for i in range(2,16): #power of 1 is already there\n", 144 | "    colname = 'x_%d'%i #new var will be x_power\n", 145 | "    data[colname] = data['x']**i\n", 146 | "print(data.head())" 147 | ], 148 | "metadata": { 149 | "colab": { 150 | "base_uri": "https://localhost:8080/" 151 | }, 152 | "id": "dI3Do9x-MdEU", 153 | "outputId": "46e921a0-21cb-4320-d7c9-dcd82f84b2bc" 154 | }, 155 | "execution_count": 2, 156 | "outputs": [ 157 | { 158 | "output_type": "stream", 159 | "name": "stdout", 160 | "text": [ 161 | " x y x_2 ... x_13 x_14 x_15\n", 162 | "0 1.047198 1.065763 1.096623 ... 1.821260 1.907219 1.997235\n", 163 | "1 1.117011 1.006086 1.247713 ... 4.214494 4.707635 5.258479\n", 164 | "2 1.186824 0.695374 1.408551 ... 9.268760 11.000386 13.055521\n", 165 | "3 1.256637 0.949799 1.579137 ... 19.486248 24.487142 30.771450\n", 166 | "4 1.326450 1.063496 1.759470 ... 39.353420 52.200353 69.241170\n", 167 | "\n", 168 | "[5 rows x 16 columns]\n" 169 | ] 170 | } 171 | ] 172 | }, 173 | { 174 | "cell_type": "markdown", 175 | "source": [ 176 | "Now that we have all 15 powers, let’s make 15 different linear regression models, with each model containing variables with powers of x from 1 to the particular model number. 
For example, the feature set of model 8 will be {x, x_2, x_3, … ,x_8}.\n", 177 | "\n", 178 | "First, we’ll define a generic function which takes the required maximum power of x as an input and returns a list containing [ model RSS, intercept, coef_x, coef_x2, … up to the entered power ]. \n", 179 | "\n", 180 | "Here RSS refers to the ‘Residual Sum of Squares’, which is the sum of squared errors between the predicted and actual values in the training data set.\n", 181 | "\n", 182 | "Note that this function plots the model fit only for the powers listed in models_to_plot, but returns the RSS and coefficients for every model. \n", 183 | "\n", 184 | "Now, we can make all 15 models and compare the results. For ease of analysis, we’ll store all the results in a Pandas dataframe and plot 6 models to get an idea of the trend. " 185 | ], 186 | "metadata": { 187 | "id": "2S5LF7KeMeHG" 188 | } 189 | }, 190 | { 191 | "cell_type": "code", 192 | "source": [ 193 | "#Import Linear Regression model from scikit-learn.\n", 194 | "from sklearn.linear_model import LinearRegression\n", 195 | "import warnings\n", 196 | "warnings.filterwarnings(\"ignore\")\n", 197 | "def linear_regression(data, power, models_to_plot):\n", 198 | "    #initialize predictors:\n", 199 | "    predictors=['x']\n", 200 | "    if power>=2:\n", 201 | "        predictors.extend(['x_%d'%i for i in range(2,power+1)])\n", 202 | "    \n", 203 | "    #Fit the model\n", 204 | "    linreg = LinearRegression()  #normalize=True was removed in scikit-learn 1.2\n", 205 | "    linreg.fit(data[predictors],data['y'])\n", 206 | "    y_pred = linreg.predict(data[predictors])\n", 207 | "    \n", 208 | "    #Check if a plot is to be made for the entered power\n", 209 | "    if power in models_to_plot:\n", 210 | "        plt.subplot(models_to_plot[power])\n", 211 | "        plt.tight_layout()\n", 212 | "        plt.plot(data['x'],y_pred)\n", 213 | "        plt.plot(data['x'],data['y'],'.')\n", 214 | "        plt.title('Plot for power: %d'%power)\n", 215 | "    \n", 216 | "    #Return the result in pre-defined format\n", 217 | "    rss = 
sum((y_pred-data['y'])**2)\n", 218 | "    ret = [rss]\n", 219 | "    ret.extend([linreg.intercept_])\n", 220 | "    ret.extend(linreg.coef_)\n", 221 | "    return ret\n", 222 | "\n", 223 | "#Initialize a dataframe to store the results:\n", 224 | "col = ['rss','intercept'] + ['coef_x_%d'%i for i in range(1,16)]\n", 225 | "ind = ['model_pow_%d'%i for i in range(1,16)]\n", 226 | "coef_matrix_simple = pd.DataFrame(index=ind, columns=col)\n", 227 | "\n", 228 | "#Define the powers for which a plot is required:\n", 229 | "models_to_plot = {1:231,3:232,6:233,9:234,12:235,15:236}\n", 230 | "\n", 231 | "#Iterate through all powers and assimilate results\n", 232 | "for i in range(1,16):\n", 233 | "    coef_matrix_simple.iloc[i-1,0:i+2] = linear_regression(data, power=i, models_to_plot=models_to_plot)" 234 | ], 235 | "metadata": { 236 | "colab": { 237 | "base_uri": "https://localhost:8080/", 238 | "height": 297 239 | }, 240 | "id": "mVVO9UeSMej-", 241 | "outputId": "d1af115a-0f8c-4966-8581-ac529bc4bb50" 242 | }, 243 | "execution_count": 3, 244 | "outputs": [ 245 | { 246 | "output_type": "display_data", 247 | "data": { 248 | "image/png": "\n", 249 | "text/plain": [ 250 | "
" 251 | ] 252 | }, 253 | "metadata": { 254 | "needs_background": "light" 255 | } 256 | } 257 | ] 258 | }, 259 | { 260 | "cell_type": "markdown", 261 | "source": [ 262 | "We would expect the models with increasing complexity to better fit the data and result in lower RSS values. This can be verified by looking at the plots generated for 6 models.\n", 263 | "\n", 264 | "This clearly aligns with our initial understanding. As the model complexity increases, the models tend to fit even smaller deviations in the training data set. Though this **leads to overfitting**, let’s keep this issue aside for some time and come to our main objective, i.e. the impact on the magnitude of coefficients.\n", 265 | "\n" 266 | ], 267 | "metadata": { 268 | "id": "vzt4JtBRMeuA" 269 | } 270 | }, 271 | { 272 | "cell_type": "code", 273 | "source": [ 274 | "#Set the display format to be scientific for ease of analysis\n", 275 | "pd.options.display.float_format = '{:,.2g}'.format\n", 276 | "coef_matrix_simple" 277 | ], 278 | "metadata": { 279 | "colab": { 280 | "base_uri": "https://localhost:8080/", 281 | "height": 578 282 | }, 283 | "id": "m5lNtk8BNxtD", 284 | "outputId": "d1043c10-39e2-4d05-8c6d-58de217efbc5" 285 | }, 286 | "execution_count": 4, 287 | "outputs": [ 288 | { 289 | "output_type": "execute_result", 290 | "data": { 291 | "text/html": [ 292 | "\n", 293 | "
\n", 294 | "
                   rss intercept  coef_x_1  coef_x_2  coef_x_3  coef_x_4  coef_x_5  coef_x_6  coef_x_7  coef_x_8  coef_x_9 coef_x_10 coef_x_11 coef_x_12 coef_x_13 coef_x_14 coef_x_15
model_pow_1        3.3         2     -0.62       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN
model_pow_2        3.3       1.9     -0.58    -0.006       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN
model_pow_3        1.1      -1.1         3      -1.3      0.14       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN
model_pow_4        1.1     -0.27       1.7     -0.53    -0.036     0.014       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN
model_pow_5          1         3      -5.1       4.7      -1.9      0.33    -0.021       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN
model_pow_6       0.99      -2.8       9.5      -9.7       5.2      -1.6      0.23    -0.014       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN
model_pow_7       0.93        19       -56        69       -45        17      -3.5       0.4    -0.019       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN
model_pow_8       0.92        43  -1.4e+02   1.8e+02  -1.3e+02        58       -15       2.4     -0.21    0.0077       NaN       NaN       NaN       NaN       NaN       NaN       NaN
model_pow_9       0.87   1.7e+02  -6.1e+02   9.6e+02  -8.5e+02   4.6e+02  -1.6e+02        37      -5.2      0.42    -0.015       NaN       NaN       NaN       NaN       NaN       NaN
model_pow_10      0.87   1.4e+02  -4.9e+02   7.3e+02    -6e+02   2.9e+02       -87        15     -0.81     -0.14     0.026   -0.0013       NaN       NaN       NaN       NaN       NaN
model_pow_11      0.87       -75   5.1e+02  -1.3e+03   1.9e+03  -1.6e+03   9.1e+02  -3.5e+02        91       -16       1.8     -0.12    0.0034       NaN       NaN       NaN       NaN
model_pow_12      0.87  -3.4e+02   1.9e+03  -4.4e+03     6e+03  -5.2e+03   3.1e+03  -1.3e+03   3.8e+02       -80        12      -1.1     0.062   -0.0016       NaN       NaN       NaN
model_pow_13      0.86   3.2e+03  -1.8e+04   4.5e+04  -6.7e+04   6.6e+04  -4.6e+04   2.3e+04  -8.5e+03   2.3e+03  -4.5e+02        62      -5.7      0.31   -0.0078       NaN       NaN
model_pow_14      0.79   2.4e+04  -1.4e+05   3.8e+05  -6.1e+05   6.6e+05    -5e+05   2.8e+05  -1.2e+05   3.7e+04  -8.5e+03   1.5e+03  -1.8e+02        15     -0.73     0.017       NaN
model_pow_15       0.7  -3.6e+04   2.4e+05  -7.5e+05   1.4e+06  -1.7e+06   1.5e+06    -1e+06     5e+05  -1.9e+05   5.4e+04  -1.2e+04   1.9e+03  -2.2e+02        17     -0.81     0.018
\n", 635 | "
\n", 711 | " " 712 | ], 713 | "text/plain": [ 714 | " rss intercept coef_x_1 ... coef_x_13 coef_x_14 coef_x_15\n", 715 | "model_pow_1 3.3 2 -0.62 ... NaN NaN NaN\n", 716 | "model_pow_2 3.3 1.9 -0.58 ... NaN NaN NaN\n", 717 | "model_pow_3 1.1 -1.1 3 ... NaN NaN NaN\n", 718 | "model_pow_4 1.1 -0.27 1.7 ... NaN NaN NaN\n", 719 | "model_pow_5 1 3 -5.1 ... NaN NaN NaN\n", 720 | "model_pow_6 0.99 -2.8 9.5 ... NaN NaN NaN\n", 721 | "model_pow_7 0.93 19 -56 ... NaN NaN NaN\n", 722 | "model_pow_8 0.92 43 -1.4e+02 ... NaN NaN NaN\n", 723 | "model_pow_9 0.87 1.7e+02 -6.1e+02 ... NaN NaN NaN\n", 724 | "model_pow_10 0.87 1.4e+02 -4.9e+02 ... NaN NaN NaN\n", 725 | "model_pow_11 0.87 -75 5.1e+02 ... NaN NaN NaN\n", 726 | "model_pow_12 0.87 -3.4e+02 1.9e+03 ... NaN NaN NaN\n", 727 | "model_pow_13 0.86 3.2e+03 -1.8e+04 ... -0.0078 NaN NaN\n", 728 | "model_pow_14 0.79 2.4e+04 -1.4e+05 ... -0.73 0.017 NaN\n", 729 | "model_pow_15 0.7 -3.6e+04 2.4e+05 ... 17 -0.81 0.018\n", 730 | "\n", 731 | "[15 rows x 17 columns]" 732 | ] 733 | }, 734 | "metadata": {}, 735 | "execution_count": 4 736 | } 737 | ] 738 | }, 739 | { 740 | "cell_type": "markdown", 741 | "source": [ 742 | "The coefficients grow by orders of magnitude as model complexity increases: compare the coefficients for power 2 with those for power 15.\n", 743 | "\n", 744 | "**What does a large coefficient signify? (Power 2 or 3 or 4)**\n", 745 | "\n", 746 | "- It means that we’re putting a lot of emphasis on that feature, i.e. the model treats that feature as a strong predictor of the outcome. \n", 747 | "- When a coefficient becomes too large, the algorithm starts modelling intricate relations to estimate the output and ends up overfitting to the particular training data." 748 | ], 749 | "metadata": { 750 | "id": "b0fyONkdMfGq" 751 | } 752 | }, 753 | { 754 | "cell_type": "markdown", 755 | "source": [ 756 | "**Ridge Regression**\n", 757 | "\n", 758 | "- Ridge regression performs ‘L2 regularization’, i.e.
it adds the sum of squared coefficients, weighted by α, to the optimization objective. \n", 759 | "\n", 760 | "- Thus, ridge regression optimizes the following:\n", 761 | " Objective = RSS + α * (sum of squares of coefficients)\n", 762 | "\n", 763 | " Here, α (alpha) is the parameter that balances the emphasis given to minimizing RSS against minimizing the sum of squares of coefficients. \n", 764 | " \n", 765 | " α can take various values:\n", 766 | " \n", 767 | " α = 0: The objective becomes the same as in simple linear regression, so we’ll get the same coefficients as simple linear regression.\n", 768 | "\n", 769 | " α = ∞: The coefficients will be zero. Why? Because of the infinite weightage on the squares of the coefficients, any non-zero coefficient will make the objective infinite.\n", 770 | "\n", 771 | " 0 < α < ∞: The magnitude of α will decide the weightage given to the two parts of the objective.\n", 772 | "The coefficients will be somewhere between 0 and the ones for simple linear regression.\n", 773 | "\n", 774 | "This gives some sense of how α impacts the magnitude of the coefficients. One thing is certain: any non-zero α gives coefficients that are smaller overall in magnitude than those of simple linear regression. \n", 775 | "\n", 776 | "Note the ‘Ridge’ class used here. \n", 777 | " - It takes ‘alpha’ as a parameter on initialization. \n", 778 | " - Also, keep in mind that normalizing the inputs is generally a good idea, and it matters for ridge regression in particular because the penalty depends on the scale of each coefficient.\n", 779 | "\n", 780 | "Now, let’s analyze the result of Ridge regression for 10 different values of α ranging from 1e-15 to 20. These values have been chosen so that we can easily analyze the trend as α changes. 
These would, however, differ from case to case.\n", 781 | "\n", 782 | "**Note** \n", 783 | " - Each of these 10 models contains all 15 variables; only the value of alpha differs.\n", 784 | " - This is different from the simple linear regression case, where each model had a subset of the features." 785 | ], 786 | "metadata": { 787 | "id": "17KNhEWQK13O" 788 | } 789 | }, 790 | { 791 | "cell_type": "code", 792 | "execution_count": 5, 793 | "metadata": { 794 | "id": "dUb6YeuhKxF1", 795 | "colab": { 796 | "base_uri": "https://localhost:8080/", 797 | "height": 297 798 | }, 799 | "outputId": "85633f1e-78f7-4333-9baa-aac7bc097656" 800 | }, 801 | "outputs": [ 802 | { 803 | "output_type": "display_data", 804 | "data": { 805 | "image/png": "[base64-encoded PNG omitted: six-panel figure, one panel per alpha in models_to_plot, each titled with 'Plot for alpha: %.3g' and showing the fitted curve over the data points]\n", 806 | "text/plain": [ 807 | "
" 808 | ] 809 | }, 810 | "metadata": { 811 | "needs_background": "light" 812 | } 813 | } 814 | ], 815 | "source": [ 816 | "from sklearn.linear_model import Ridge  # pd, plt and data are assumed to be defined in earlier cells\n", 817 | "def ridge_regression(data, predictors, alpha, models_to_plot={}):\n", 818 | "    #Fit the model\n", 819 | "    ridgereg = Ridge(alpha=alpha,normalize=True)  # normalize= was deprecated in scikit-learn 1.0 and removed in 1.2; this call needs an older release\n", 820 | "    ridgereg.fit(data[predictors],data['y'])\n", 821 | "    y_pred = ridgereg.predict(data[predictors])\n", 822 | "    \n", 823 | "    #Check if a plot is to be made for the entered alpha\n", 824 | "    if alpha in models_to_plot:\n", 825 | "        plt.subplot(models_to_plot[alpha])\n", 826 | "        plt.tight_layout()\n", 827 | "        plt.plot(data['x'],y_pred)\n", 828 | "        plt.plot(data['x'],data['y'],'.')\n", 829 | "        plt.title('Plot for alpha: %.3g'%alpha)\n", 830 | "    \n", 831 | "    #Return the result in pre-defined format: [rss, intercept, coef_x_1, ..., coef_x_15]\n", 832 | "    rss = sum((y_pred-data['y'])**2)\n", 833 | "    ret = [rss]\n", 834 | "    ret.extend([ridgereg.intercept_])\n", 835 | "    ret.extend(ridgereg.coef_)\n", 836 | "    return ret\n", 837 | "\n", 838 | "#Initialize predictors to be set of 15 powers of x\n", 839 | "predictors=['x']\n", 840 | "predictors.extend(['x_%d'%i for i in range(2,16)])\n", 841 | "\n", 842 | "#Set the different values of alpha to be tested\n", 843 | "alpha_ridge = [1e-15, 1e-10, 1e-8, 1e-4, 1e-3,1e-2, 1, 5, 10, 20]\n", 844 | "\n", 845 | "#Initialize the dataframe for storing coefficients.\n", 846 | "col = ['rss','intercept'] + ['coef_x_%d'%i for i in range(1,16)]\n", 847 | "ind = ['alpha_%.2g'%alpha_ridge[i] for i in range(0,10)]\n", 848 | "coef_matrix_ridge = pd.DataFrame(index=ind, columns=col)\n", 849 | "\n", 850 | "#Map each alpha to be plotted to its subplot position in a 2x3 grid\n", 851 | "models_to_plot = {1e-15:231, 1e-10:232, 1e-4:233, 1e-3:234, 1e-2:235, 5:236}\n", 851 | "for i in range(10):\n", 852 | "    coef_matrix_ridge.iloc[i,] = ridge_regression(data, predictors, alpha_ridge[i], models_to_plot)" 853 | ] 854 | }, 855 | { 856 | "cell_type": "markdown", 857 | "source": [ 858 | "- Here we can clearly observe that as the value of alpha increases, the 
model complexity reduces. \n", 859 | "- Though higher values of alpha reduce overfitting, significantly high values can cause underfitting as well (e.g. alpha = 5). \n", 860 | "- Thus alpha should be chosen wisely. A widely accepted technique is cross-validation, i.e. the value of alpha is iterated over a range of values and the one giving the highest cross-validation score is chosen.\n", 861 | "\n" 862 | ], 863 | "metadata": { 864 | "id": "HZVaWbqdK18h" 865 | } 866 | }, 867 | { 868 | "cell_type": "code", 869 | "source": [ 870 | "#Set the display format to be scientific for ease of analysis\n", 871 | "pd.options.display.float_format = '{:,.2g}'.format\n", 872 | "coef_matrix_ridge" 873 | ], 874 | "metadata": { 875 | "id": "e_XKJQx8K1Zo", 876 | "colab": { 877 | "base_uri": "https://localhost:8080/", 878 | "height": 423 879 | }, 880 | "outputId": "53978e33-b3b8-44a0-e4b1-24bf6132da79" 881 | }, 882 | "execution_count": 6, 883 | "outputs": [ 884 | { 885 | "output_type": "execute_result", 886 | "data": { 887 | "text/html": [ 888 | "\n", 889 | "
\n", 890 | "
\n", 891 | "
\n", 892 | "\n", 905 | "\n", 906 | " \n", 907 | " \n", 908 | " \n", 909 | " \n", 910 | " \n", 911 | " \n", 912 | " \n", 913 | " \n", 914 | " \n", 915 | " \n", 916 | " \n", 917 | " \n", 918 | " \n", 919 | " \n", 920 | " \n", 921 | " \n", 922 | " \n", 923 | " \n", 924 | " \n", 925 | " \n", 926 | " \n", 927 | " \n", 928 | " \n", 929 | " \n", 930 | " \n", 931 | " \n", 932 | " \n", 933 | " \n", 934 | " \n", 935 | " \n", 936 | " \n", 937 | " \n", 938 | " \n", 939 | " \n", 940 | " \n", 941 | " \n", 942 | " \n", 943 | " \n", 944 | " \n", 945 | " \n", 946 | " \n", 947 | " \n", 948 | " \n", 949 | " \n", 950 | " \n", 951 | " \n", 952 | " \n", 953 | " \n", 954 | " \n", 955 | " \n", 956 | " \n", 957 | " \n", 958 | " \n", 959 | " \n", 960 | " \n", 961 | " \n", 962 | " \n", 963 | " \n", 964 | " \n", 965 | " \n", 966 | " \n", 967 | " \n", 968 | " \n", 969 | " \n", 970 | " \n", 971 | " \n", 972 | " \n", 973 | " \n", 974 | " \n", 975 | " \n", 976 | " \n", 977 | " \n", 978 | " \n", 979 | " \n", 980 | " \n", 981 | " \n", 982 | " \n", 983 | " \n", 984 | " \n", 985 | " \n", 986 | " \n", 987 | " \n", 988 | " \n", 989 | " \n", 990 | " \n", 991 | " \n", 992 | " \n", 993 | " \n", 994 | " \n", 995 | " \n", 996 | " \n", 997 | " \n", 998 | " \n", 999 | " \n", 1000 | " \n", 1001 | " \n", 1002 | " \n", 1003 | " \n", 1004 | " \n", 1005 | " \n", 1006 | " \n", 1007 | " \n", 1008 | " \n", 1009 | " \n", 1010 | " \n", 1011 | " \n", 1012 | " \n", 1013 | " \n", 1014 | " \n", 1015 | " \n", 1016 | " \n", 1017 | " \n", 1018 | " \n", 1019 | " \n", 1020 | " \n", 1021 | " \n", 1022 | " \n", 1023 | " \n", 1024 | " \n", 1025 | " \n", 1026 | " \n", 1027 | " \n", 1028 | " \n", 1029 | " \n", 1030 | " \n", 1031 | " \n", 1032 | " \n", 1033 | " \n", 1034 | " \n", 1035 | " \n", 1036 | " \n", 1037 | " \n", 1038 | " \n", 1039 | " \n", 1040 | " \n", 1041 | " \n", 1042 | " \n", 1043 | " \n", 1044 | " \n", 1045 | " \n", 1046 | " \n", 1047 | " \n", 1048 | " \n", 1049 | " \n", 1050 | " \n", 1051 | " \n", 1052 | " \n", 1053 | 
" \n", 1054 | " \n", 1055 | " \n", 1056 | " \n", 1057 | " \n", 1058 | " \n", 1059 | " \n", 1060 | " \n", 1061 | " \n", 1062 | " \n", 1063 | " \n", 1064 | " \n", 1065 | " \n", 1066 | " \n", 1067 | " \n", 1068 | " \n", 1069 | " \n", 1070 | " \n", 1071 | " \n", 1072 | " \n", 1073 | " \n", 1074 | " \n", 1075 | " \n", 1076 | " \n", 1077 | " \n", 1078 | " \n", 1079 | " \n", 1080 | " \n", 1081 | " \n", 1082 | " \n", 1083 | " \n", 1084 | " \n", 1085 | " \n", 1086 | " \n", 1087 | " \n", 1088 | " \n", 1089 | " \n", 1090 | " \n", 1091 | " \n", 1092 | " \n", 1093 | " \n", 1094 | " \n", 1095 | " \n", 1096 | " \n", 1097 | " \n", 1098 | " \n", 1099 | " \n", 1100 | " \n", 1101 | " \n", 1102 | " \n", 1103 | " \n", 1104 | " \n", 1105 | " \n", 1106 | " \n", 1107 | " \n", 1108 | " \n", 1109 | " \n", 1110 | " \n", 1111 | " \n", 1112 | " \n", 1113 | " \n", 1114 | " \n", 1115 | " \n", 1116 | " \n", 1117 | " \n", 1118 | " \n", 1119 | " \n", 1120 | " \n", 1121 | " \n", 1122 | " \n", 1123 | " \n", 1124 | " \n", 1125 | " \n", 1126 | " \n", 1127 | " \n", 1128 | " \n", 1129 | " \n", 1130 | "
rssinterceptcoef_x_1coef_x_2coef_x_3coef_x_4coef_x_5coef_x_6coef_x_7coef_x_8coef_x_9coef_x_10coef_x_11coef_x_12coef_x_13coef_x_14coef_x_15
alpha_1e-150.8795-3e+023.8e+02-2.3e+02632.1-50.60.17-0.031-0.00510.000850.00024-6.1e-054.5e-06-9.3e-08
alpha_1e-100.9211-2931-152.90.17-0.091-0.0110.0020.000642.4e-05-2e-05-4.2e-062.2e-072.3e-07-2.3e-08
alpha_1e-080.951.3-1.51.7-0.680.0390.0160.00016-0.00036-5.4e-05-2.9e-071.1e-061.9e-072e-083.9e-098.2e-10-4.6e-10
alpha_0.00010.960.560.55-0.13-0.026-0.0028-0.000114.1e-051.5e-053.7e-067.4e-071.3e-071.9e-081.9e-09-1.3e-10-1.5e-10-6.2e-11
alpha_0.00110.820.31-0.087-0.02-0.0028-0.000221.8e-051.2e-053.4e-067.3e-071.3e-071.9e-081.7e-09-1.5e-10-1.4e-10-5.2e-11
alpha_0.011.41.3-0.088-0.052-0.01-0.0014-0.000137.2e-074.1e-061.3e-063e-075.6e-089e-091.1e-094.3e-11-3.1e-11-1.5e-11
alpha_15.60.97-0.14-0.019-0.003-0.00047-7e-05-9.9e-06-1.3e-06-1.4e-07-9.3e-091.3e-097.8e-102.4e-106.2e-111.4e-113.2e-12
alpha_5140.55-0.059-0.0085-0.0014-0.00024-4.1e-05-6.9e-06-1.1e-06-1.9e-07-3.1e-08-5.1e-09-8.2e-10-1.3e-10-2e-11-3e-12-4.2e-13
alpha_10180.4-0.037-0.0055-0.00095-0.00017-3e-05-5.2e-06-9.2e-07-1.6e-07-2.9e-08-5.1e-09-9.1e-10-1.6e-10-2.9e-11-5.1e-12-9.1e-13
alpha_20230.28-0.022-0.0034-0.0006-0.00011-2e-05-3.6e-06-6.6e-07-1.2e-07-2.2e-08-4e-09-7.5e-10-1.4e-10-2.5e-11-4.7e-12-8.7e-13
\n", 1131 | "
\n", 1132 | " \n", 1142 | " \n", 1143 | " \n", 1180 | "\n", 1181 | " \n", 1205 | "
\n", 1206 | "
\n", 1207 | " " 1208 | ], 1209 | "text/plain": [ 1210 | " rss intercept coef_x_1 ... coef_x_13 coef_x_14 coef_x_15\n", 1211 | "alpha_1e-15 0.87 95 -3e+02 ... -6.1e-05 4.5e-06 -9.3e-08\n", 1212 | "alpha_1e-10 0.92 11 -29 ... 2.2e-07 2.3e-07 -2.3e-08\n", 1213 | "alpha_1e-08 0.95 1.3 -1.5 ... 3.9e-09 8.2e-10 -4.6e-10\n", 1214 | "alpha_0.0001 0.96 0.56 0.55 ... -1.3e-10 -1.5e-10 -6.2e-11\n", 1215 | "alpha_0.001 1 0.82 0.31 ... -1.5e-10 -1.4e-10 -5.2e-11\n", 1216 | "alpha_0.01 1.4 1.3 -0.088 ... 4.3e-11 -3.1e-11 -1.5e-11\n", 1217 | "alpha_1 5.6 0.97 -0.14 ... 6.2e-11 1.4e-11 3.2e-12\n", 1218 | "alpha_5 14 0.55 -0.059 ... -2e-11 -3e-12 -4.2e-13\n", 1219 | "alpha_10 18 0.4 -0.037 ... -2.9e-11 -5.1e-12 -9.1e-13\n", 1220 | "alpha_20 23 0.28 -0.022 ... -2.5e-11 -4.7e-12 -8.7e-13\n", 1221 | "\n", 1222 | "[10 rows x 17 columns]" 1223 | ] 1224 | }, 1225 | "metadata": {}, 1226 | "execution_count": 6 1227 | } 1228 | ] 1229 | }, 1230 | { 1231 | "cell_type": "markdown", 1232 | "source": [ 1233 | "This straight away gives us the following inferences:\n", 1234 | "\n", 1235 | "- The RSS increases with increase in alpha, as the model complexity reduces.\n", 1236 | "- An alpha as small as 1e-15 gives us a significant reduction in the magnitude of coefficients. How? Compare the coefficients in the first row of this table to the last row of the simple linear regression table.\n", 1237 | "- High alpha values can lead to significant underfitting. Note the rapid increase in RSS for values of alpha greater than 1.\n", 1238 | "- Though the coefficients are very small, they are NOT zero.\n", 1239 | "\n", 1240 | "The first 3 are very intuitive. But #4 is also a crucial observation. 
Let’s reconfirm the same by determining the number of zeros in each row of the coefficients data set:\n", 1241 | "\n" 1242 | ], 1243 | "metadata": { 1244 | "id": "3oz0KcDCPhqd" 1245 | } 1246 | }, 1247 | { 1248 | "cell_type": "code", 1249 | "source": [ 1250 | "coef_matrix_ridge.apply(lambda x: sum(x.values==0),axis=1)\n", 1251 | "\n", 1252 | "# Inference\n", 1253 | "# This confirms that all the 15 coefficients are greater than zero in magnitude \n", 1254 | "# (can be +ve or -ve). " 1255 | ], 1256 | "metadata": { 1257 | "colab": { 1258 | "base_uri": "https://localhost:8080/" 1259 | }, 1260 | "id": "mIGf1hmRPaVE", 1261 | "outputId": "b94a2ec1-9e88-4ffa-e22c-8d181d92f9c1" 1262 | }, 1263 | "execution_count": 7, 1264 | "outputs": [ 1265 | { 1266 | "output_type": "execute_result", 1267 | "data": { 1268 | "text/plain": [ 1269 | "alpha_1e-15 0\n", 1270 | "alpha_1e-10 0\n", 1271 | "alpha_1e-08 0\n", 1272 | "alpha_0.0001 0\n", 1273 | "alpha_0.001 0\n", 1274 | "alpha_0.01 0\n", 1275 | "alpha_1 0\n", 1276 | "alpha_5 0\n", 1277 | "alpha_10 0\n", 1278 | "alpha_20 0\n", 1279 | "dtype: int64" 1280 | ] 1281 | }, 1282 | "metadata": {}, 1283 | "execution_count": 7 1284 | } 1285 | ] 1286 | }, 1287 | { 1288 | "cell_type": "markdown", 1289 | "source": [ 1290 | "**Lasso Regression**\n", 1291 | "\n", 1292 | "- LASSO stands for Least Absolute Shrinkage and Selection Operator. \n", 1293 | "- There are 2 key words here – ‘absolute‘ and ‘selection‘.\n", 1294 | "- Lasso regression performs L1 regularization, i.e. it adds a factor of sum of absolute value of coefficients in the optimization objective. \n", 1295 | "- Thus, lasso regression optimizes the following:\n", 1296 | "\n", 1297 | " Objective = RSS + α * (sum of absolute value of coefficients)\n", 1298 | "\n", 1299 | " Here, α (alpha) works similar to that of ridge and provides a trade-off between balancing RSS and magnitude of coefficients. 
\n", 1300 | " Like that of ridge, α can take various values.\n", 1301 | " Let’s reiterate them here briefly:\n", 1302 | "\n", 1303 | " α = 0: Same coefficients as simple linear regression\n", 1304 | "\n", 1305 | " α = ∞: All coefficients zero (same logic as before)\n", 1306 | " \n", 1307 | " 0 < α < ∞: coefficients between 0 and that of simple linear regression\n", 1308 | "\n", 1309 | "Notice the additional parameter defined in the Lasso function – ‘max_iter‘. \n", 1310 | "\n", 1311 | "This is the maximum number of iterations for which we want the model to run if it doesn’t converge earlier. \n", 1312 | "\n", 1313 | "This exists for Ridge as well but setting this to a higher than default value was required in this case. " 1314 | ], 1315 | "metadata": { 1316 | "id": "Uqx_IIyaPh6X" 1317 | } 1318 | }, 1319 | { 1320 | "cell_type": "code", 1321 | "source": [ 1322 | "from sklearn.linear_model import Lasso\n", 1323 | "def lasso_regression(data, predictors, alpha, models_to_plot={}):\n", 1324 | " #Fit the model\n", 1325 | " lassoreg = Lasso(alpha=alpha,normalize=True, max_iter=100000)\n", 1326 | " lassoreg.fit(data[predictors],data['y'])\n", 1327 | " y_pred = lassoreg.predict(data[predictors])\n", 1328 | " \n", 1329 | " #Check if a plot is to be made for the entered alpha\n", 1330 | " if alpha in models_to_plot:\n", 1331 | " plt.subplot(models_to_plot[alpha])\n", 1332 | " plt.tight_layout()\n", 1333 | " plt.plot(data['x'],y_pred)\n", 1334 | " plt.plot(data['x'],data['y'],'.')\n", 1335 | " plt.title('Plot for alpha: %.3g'%alpha)\n", 1336 | " \n", 1337 | " #Return the result in pre-defined format\n", 1338 | " rss = sum((y_pred-data['y'])**2)\n", 1339 | " ret = [rss]\n", 1340 | " ret.extend([lassoreg.intercept_])\n", 1341 | " ret.extend(lassoreg.coef_)\n", 1342 | " return ret\n", 1343 | "\n", 1344 | "#Initialize predictors to all 15 powers of x\n", 1345 | "predictors=['x']\n", 1346 | "predictors.extend(['x_%d'%i for i in range(2,16)])\n", 1347 | "\n", 1348 | "#Define the alpha values 
to test\n", 1349 | "alpha_lasso = [1e-15, 1e-10, 1e-8, 1e-5,1e-4, 1e-3,1e-2, 1, 5, 10]\n", 1350 | "\n", 1351 | "#Initialize the dataframe to store coefficients\n", 1352 | "col = ['rss','intercept'] + ['coef_x_%d'%i for i in range(1,16)]\n", 1353 | "ind = ['alpha_%.2g'%alpha_lasso[i] for i in range(0,10)]\n", 1354 | "coef_matrix_lasso = pd.DataFrame(index=ind, columns=col)\n", 1355 | "\n", 1356 | "#Define the models to plot\n", 1357 | "models_to_plot = {1e-10:231, 1e-5:232,1e-4:233, 1e-3:234, 1e-2:235, 1:236}\n", 1358 | "\n", 1359 | "#Iterate over the 10 alpha values:\n", 1360 | "for i in range(10):\n", 1361 | " coef_matrix_lasso.iloc[i,] = lasso_regression(data, predictors, alpha_lasso[i], models_to_plot)" 1362 | ], 1363 | "metadata": { 1364 | "colab": { 1365 | "base_uri": "https://localhost:8080/", 1366 | "height": 297 1367 | }, 1368 | "id": "kj2u_6LIPiGN", 1369 | "outputId": "cf9386f3-2b51-4148-f07a-a6dec8cf0df0" 1370 | }, 1371 | "execution_count": 8, 1372 | "outputs": [ 1373 | { 1374 | "output_type": "display_data", 1375 | "data": { 1376 | "image/png": "\n", 1377 | "text/plain": [ 1378 | "
" 1379 | ] 1380 | }, 1381 | "metadata": { 1382 | "needs_background": "light" 1383 | } 1384 | } 1385 | ] 1386 | }, 1387 | { 1388 | "cell_type": "markdown", 1389 | "source": [ 1390 | "This again tells us that the model complexity decreases with increase in the values of alpha. \n", 1391 | "\n", 1392 | "But notice the straight line at alpha=1. Appears a bit strange to me. Let’s explore this further by looking at the coefficients:" 1393 | ], 1394 | "metadata": { 1395 | "id": "8mtV-ckZPh9T" 1396 | } 1397 | }, 1398 | { 1399 | "cell_type": "code", 1400 | "source": [ 1401 | "#Set the display format to be scientific for ease of analysis\n", 1402 | "pd.options.display.float_format = '{:,.2g}'.format\n", 1403 | "coef_matrix_lasso" 1404 | ], 1405 | "metadata": { 1406 | "colab": { 1407 | "base_uri": "https://localhost:8080/", 1408 | "height": 423 1409 | }, 1410 | "id": "nmXOcN9PQrtf", 1411 | "outputId": "90f91d7d-aba2-49c7-e0c3-898d476374db" 1412 | }, 1413 | "execution_count": 9, 1414 | "outputs": [ 1415 | { 1416 | "output_type": "execute_result", 1417 | "data": { 1418 | "text/html": [ 1419 | "\n", 1420 | "
\n", 1421 | "
\n", 1422 | "
\n", 1423 | "\n", 1436 | "\n", 1437 | " \n", 1438 | " \n", 1439 | " \n", 1440 | " \n", 1441 | " \n", 1442 | " \n", 1443 | " \n", 1444 | " \n", 1445 | " \n", 1446 | " \n", 1447 | " \n", 1448 | " \n", 1449 | " \n", 1450 | " \n", 1451 | " \n", 1452 | " \n", 1453 | " \n", 1454 | " \n", 1455 | " \n", 1456 | " \n", 1457 | " \n", 1458 | " \n", 1459 | " \n", 1460 | " \n", 1461 | " \n", 1462 | " \n", 1463 | " \n", 1464 | " \n", 1465 | " \n", 1466 | " \n", 1467 | " \n", 1468 | " \n", 1469 | " \n", 1470 | " \n", 1471 | " \n", 1472 | " \n", 1473 | " \n", 1474 | " \n", 1475 | " \n", 1476 | " \n", 1477 | " \n", 1478 | " \n", 1479 | " \n", 1480 | " \n", 1481 | " \n", 1482 | " \n", 1483 | " \n", 1484 | " \n", 1485 | " \n", 1486 | " \n", 1487 | " \n", 1488 | " \n", 1489 | " \n", 1490 | " \n", 1491 | " \n", 1492 | " \n", 1493 | " \n", 1494 | " \n", 1495 | " \n", 1496 | " \n", 1497 | " \n", 1498 | " \n", 1499 | " \n", 1500 | " \n", 1501 | " \n", 1502 | " \n", 1503 | " \n", 1504 | " \n", 1505 | " \n", 1506 | " \n", 1507 | " \n", 1508 | " \n", 1509 | " \n", 1510 | " \n", 1511 | " \n", 1512 | " \n", 1513 | " \n", 1514 | " \n", 1515 | " \n", 1516 | " \n", 1517 | " \n", 1518 | " \n", 1519 | " \n", 1520 | " \n", 1521 | " \n", 1522 | " \n", 1523 | " \n", 1524 | " \n", 1525 | " \n", 1526 | " \n", 1527 | " \n", 1528 | " \n", 1529 | " \n", 1530 | " \n", 1531 | " \n", 1532 | " \n", 1533 | " \n", 1534 | " \n", 1535 | " \n", 1536 | " \n", 1537 | " \n", 1538 | " \n", 1539 | " \n", 1540 | " \n", 1541 | " \n", 1542 | " \n", 1543 | " \n", 1544 | " \n", 1545 | " \n", 1546 | " \n", 1547 | " \n", 1548 | " \n", 1549 | " \n", 1550 | " \n", 1551 | " \n", 1552 | " \n", 1553 | " \n", 1554 | " \n", 1555 | " \n", 1556 | " \n", 1557 | " \n", 1558 | " \n", 1559 | " \n", 1560 | " \n", 1561 | " \n", 1562 | " \n", 1563 | " \n", 1564 | " \n", 1565 | " \n", 1566 | " \n", 1567 | " \n", 1568 | " \n", 1569 | " \n", 1570 | " \n", 1571 | " \n", 1572 | " \n", 1573 | " \n", 1574 | " \n", 1575 | " \n", 1576 | " \n", 1577 | " 
\n", 1578 | " \n", 1579 | " \n", 1580 | " \n", 1581 | " \n", 1582 | " \n", 1583 | " \n", 1584 | " \n", 1585 | " \n", 1586 | " \n", 1587 | " \n", 1588 | " \n", 1589 | " \n", 1590 | " \n", 1591 | " \n", 1592 | " \n", 1593 | " \n", 1594 | " \n", 1595 | " \n", 1596 | " \n", 1597 | " \n", 1598 | " \n", 1599 | " \n", 1600 | " \n", 1601 | " \n", 1602 | " \n", 1603 | " \n", 1604 | " \n", 1605 | " \n", 1606 | " \n", 1607 | " \n", 1608 | " \n", 1609 | " \n", 1610 | " \n", 1611 | " \n", 1612 | " \n", 1613 | " \n", 1614 | " \n", 1615 | " \n", 1616 | " \n", 1617 | " \n", 1618 | " \n", 1619 | " \n", 1620 | " \n", 1621 | " \n", 1622 | " \n", 1623 | " \n", 1624 | " \n", 1625 | " \n", 1626 | " \n", 1627 | " \n", 1628 | " \n", 1629 | " \n", 1630 | " \n", 1631 | " \n", 1632 | " \n", 1633 | " \n", 1634 | " \n", 1635 | " \n", 1636 | " \n", 1637 | " \n", 1638 | " \n", 1639 | " \n", 1640 | " \n", 1641 | " \n", 1642 | " \n", 1643 | " \n", 1644 | " \n", 1645 | " \n", 1646 | " \n", 1647 | " \n", 1648 | " \n", 1649 | " \n", 1650 | " \n", 1651 | " \n", 1652 | " \n", 1653 | " \n", 1654 | " \n", 1655 | " \n", 1656 | " \n", 1657 | " \n", 1658 | " \n", 1659 | " \n", 1660 | " \n", 1661 | "
rssinterceptcoef_x_1coef_x_2coef_x_3coef_x_4coef_x_5coef_x_6coef_x_7coef_x_8coef_x_9coef_x_10coef_x_11coef_x_12coef_x_13coef_x_14coef_x_15
alpha_1e-150.960.221.1-0.370.000890.0016-0.00012-6.4e-05-6.3e-061.4e-067.8e-072.1e-074e-085.4e-091.8e-10-2e-10-9.2e-11
alpha_1e-100.960.221.1-0.370.000880.0016-0.00012-6.4e-05-6.3e-061.4e-067.8e-072.1e-074e-085.4e-091.8e-10-2e-10-9.2e-11
alpha_1e-080.960.221.1-0.370.000770.0016-0.00011-6.4e-05-6.3e-061.4e-067.8e-072.1e-074e-085.3e-092e-10-1.9e-10-9.3e-11
alpha_1e-050.960.50.6-0.13-0.038-00007.7e-061e-067.7e-08000-0-7e-11
alpha_0.000110.90.17-0-0.048-0-0009.5e-065.1e-07000-0-0-4.4e-11
alpha_0.0011.71.3-0-0.13-0-0-0000001.5e-087.5e-10000
alpha_0.013.61.8-0.55-0.00056-0-0-0-0-0-0-0000000
alpha_1370.038-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0
alpha_5370.038-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0
alpha_10370.038-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0
\n", 1662 | "
\n", 1663 | " \n", 1673 | " \n", 1674 | " \n", 1711 | "\n", 1712 | " \n", 1736 | "
\n", 1737 | "
\n", 1738 | " " 1739 | ], 1740 | "text/plain": [ 1741 | " rss intercept coef_x_1 ... coef_x_13 coef_x_14 coef_x_15\n", 1742 | "alpha_1e-15 0.96 0.22 1.1 ... 1.8e-10 -2e-10 -9.2e-11\n", 1743 | "alpha_1e-10 0.96 0.22 1.1 ... 1.8e-10 -2e-10 -9.2e-11\n", 1744 | "alpha_1e-08 0.96 0.22 1.1 ... 2e-10 -1.9e-10 -9.3e-11\n", 1745 | "alpha_1e-05 0.96 0.5 0.6 ... 0 -0 -7e-11\n", 1746 | "alpha_0.0001 1 0.9 0.17 ... -0 -0 -4.4e-11\n", 1747 | "alpha_0.001 1.7 1.3 -0 ... 0 0 0\n", 1748 | "alpha_0.01 3.6 1.8 -0.55 ... 0 0 0\n", 1749 | "alpha_1 37 0.038 -0 ... -0 -0 -0\n", 1750 | "alpha_5 37 0.038 -0 ... -0 -0 -0\n", 1751 | "alpha_10 37 0.038 -0 ... -0 -0 -0\n", 1752 | "\n", 1753 | "[10 rows x 17 columns]" 1754 | ] 1755 | }, 1756 | "metadata": {}, 1757 | "execution_count": 9 1758 | } 1759 | ] 1760 | }, 1761 | { 1762 | "cell_type": "markdown", 1763 | "source": [ 1764 | "Apart from the expected inference of higher RSS for higher alphas, we can see the following:\n", 1765 | "\n", 1766 | "1. For the same values of alpha, the coefficients of lasso regression are much smaller as compared to that of ridge regression (compare row 1 of the 2 tables).\n", 1767 | "2. For the same alpha, lasso has higher RSS (poorer fit) as compared to ridge regression\n", 1768 | "3. Many of the coefficients are zero even for very small values of alpha\n", 1769 | "\n", 1770 | "Inferences #1,2 might not generalize always but will hold for many cases. The real difference from ridge is coming out in the last inference. 
" 1771 | ], 1772 | "metadata": { 1773 | "id": "hmVoR602Q_9e" 1774 | } 1775 | }, 1776 | { 1777 | "cell_type": "code", 1778 | "source": [ 1779 | "coef_matrix_lasso.apply(lambda x: sum(x.values==0),axis=1)" 1780 | ], 1781 | "metadata": { 1782 | "colab": { 1783 | "base_uri": "https://localhost:8080/" 1784 | }, 1785 | "id": "43M_fKAvQ3iK", 1786 | "outputId": "242be1eb-672b-450a-f011-d90b30d76b9c" 1787 | }, 1788 | "execution_count": 10, 1789 | "outputs": [ 1790 | { 1791 | "output_type": "execute_result", 1792 | "data": { 1793 | "text/plain": [ 1794 | "alpha_1e-15 0\n", 1795 | "alpha_1e-10 0\n", 1796 | "alpha_1e-08 0\n", 1797 | "alpha_1e-05 8\n", 1798 | "alpha_0.0001 10\n", 1799 | "alpha_0.001 12\n", 1800 | "alpha_0.01 13\n", 1801 | "alpha_1 15\n", 1802 | "alpha_5 15\n", 1803 | "alpha_10 15\n", 1804 | "dtype: int64" 1805 | ] 1806 | }, 1807 | "metadata": {}, 1808 | "execution_count": 10 1809 | } 1810 | ] 1811 | }, 1812 | { 1813 | "cell_type": "markdown", 1814 | "source": [ 1815 | "We can observe that even for a small value of alpha, a significant number of coefficients are zero. \n", 1816 | "\n", 1817 | "This also explains the horizontal line fit for alpha=1 in the lasso plots, its just a baseline model! \n", 1818 | "\n", 1819 | "This phenomenon of most of the coefficients being zero is called ‘sparsity‘. \n", 1820 | "\n", 1821 | "Although lasso performs feature selection, this level of sparsity is achieved in special cases only which we’ll discuss towards the end.\n", 1822 | "\n", 1823 | "This has some really interesting implications on the use cases of lasso regression as compared to that of ridge regression." 
1824 | ], 1825 | "metadata": { 1826 | "id": "WgiW_7hNRNBV" 1827 | } 1828 | } 1829 | ] 1830 | } -------------------------------------------------------------------------------- /14_WOE_IV.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sandipanpaul21/Logistic-regression-in-python/83f46ca3cbef084a01c368b8347f86a0c528d8be/14_WOE_IV.pdf -------------------------------------------------------------------------------- /16_LR_Revision.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sandipanpaul21/Logistic-regression-in-python/83f46ca3cbef084a01c368b8347f86a0c528d8be/16_LR_Revision.pdf -------------------------------------------------------------------------------- /17_LR_1_Interview_Questions.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sandipanpaul21/Logistic-regression-in-python/83f46ca3cbef084a01c368b8347f86a0c528d8be/17_LR_1_Interview_Questions.pdf -------------------------------------------------------------------------------- /18_LR_2_Interview_Questions.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sandipanpaul21/Logistic-regression-in-python/83f46ca3cbef084a01c368b8347f86a0c528d8be/18_LR_2_Interview_Questions.pdf -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ## Logistic-regression-in-python 2 | 3 | ### 01_LR_Introduction (Theory) 4 | - Predict categories based on MLE 5 | ### 02_Odd_LogOdd_OddRatio (Theory) 6 | - Probability : Something Happening / Everything that could Happen 7 | - Odds : Something Happening / Something Not Happening 8 | - Log(Odds) : Makes the Odds output symmetric 9 | ### 03_Logit_Model (Theory) 10 | - In-depth Logistic Output 
explained 11 | ### 04_Likelihood_Probability (Theory) 12 | ### 05_MLE (Theory) 13 | ### 06_LR_Assumptions (Theory) 14 | - Assumption 1 - Appropriate outcome type (Must be categorical) 15 | - Assumption 2 - Linearity of independent variables and log odds 16 | - Assumption 3 - No strongly influential outliers 17 | - Assumption 4 - Absence of multicollinearity 18 | - Assumption 5 - Independence of observations 19 | - Assumption 6 - Sufficiently large sample size 20 | ### 07_LR_Assumptions (Python Code) 21 | - Python Code for Logistic Regression Assumptions 22 | ### 08_AIC_BIC (Theory) 23 | - Akaike Information Criterion 24 | - Bayesian Information Criterion 25 | - Choose the lowest score 26 | ### 09_Logistic_Regression (Python Code) 27 | - Python Code for Logistic Regression 28 | ### 10_Multiclass_Classification (Theory) 29 | - One vs All (OvA) also known as One vs Rest (OvR) 30 | - One vs One (OvO) 31 | ### 11_Multi_Class_Classification (Python Code) 32 | - Python Code for Multi Class Classification 33 | ### 12_Regularization (Theory) 34 | - L1 Lasso 35 | - SSR + lambda * |slope| 36 | - Useless variables become 0 37 | - L2 Ridge 38 | - SSR + lambda * (slope)^2 39 | - Useless variables shrink toward 0 but never reach exactly 0 40 | - Elastic Net : Combination of L1 & L2 41 | ### 13_LR_Regularization (Python Code) 42 | - Python Code of Regularization (L1 Lasso, L2 Ridge & Elastic Net) 43 | ### 14_WOE_IV (Theory) 44 | - Weight of Evidence : Predictive power of Independent Variables 45 | - Information Value : Technique to select important Variables 46 | ### 15_LR_WOE_IV (Python Code) 47 | - Python Code for WOE and IV 48 | ### 16_LR_Revision (Theory) 49 | - Logistic Regression Revision 50 | ### 17_LR_1_Interview_Questions (Theory) 51 | - Logistic Regression Interview question bank 52 | ### 18_LR_2_Interview_Questions (Theory) 53 | - In-depth Logistic Output explained 54 | --------------------------------------------------------------------------------