├── 2 K-Nearest Neigbours Classification (for Slides).ipynb ├── 2 K-Nearest Neigbours Classification.ipynb ├── 2 K-Nearest Neigbours Regression (for Slides).ipynb ├── 2. K-Nearest Neigbours Classification.ipynb ├── 2_K_Nearest_Neigbours_Classification_(for_Slides).ipynb ├── 3 K-Nearest Neigbours Regression (for Slides).ipynb ├── 4 Linear Regression for Slides.ipynb ├── 4_linear_regression_for_slides.ipynb ├── Code Walk Through ├── K-Nearest Neighbours.ipynb ├── knn-data.csv └── temp.txt ├── Codes └── K-Nearest Neighbours │ └── KNN Classification.ipynb ├── Copy_of_2_K_Nearest_Neigbours_Classification_(for_Slides).ipynb ├── Decision Tree ├── 16 Exercise: Calculating Information Gain on a Dataset.ipynb ├── 17 Hyperparameters - Solution.ipynb ├── 17 Hyperparameters.ipynb ├── 18 Exercise: Decision Trees in sklearn - Solution.ipynb ├── 18 Exercise: Decision Trees in sklearn.ipynb ├── 19 Exercise: Titanic Survival Model with Decision Trees - Solution.ipynb ├── 19 Exercise: Titanic Survival Model with Decision Trees.ipynb ├── data.csv ├── ml-bugs.csv └── titanic_data.csv ├── Deep Learning with Tensorflow ├── Part_1_Introduction_to_Neural_Networks_with_TensorFlow_(Exercise).ipynb ├── Part_1_Introduction_to_Neural_Networks_with_TensorFlow_(Solution).ipynb ├── Part_2_Neural_networks_with_TensorFlow_and_Keras_(Exercise).ipynb ├── Part_2_Neural_networks_with_TensorFlow_and_Keras_(Solution).ipynb ├── Part_3_Training_Neural_Networks_(Exercise).ipynb ├── Part_3_Training_Neural_Networks_(Solution).ipynb ├── Part_4_Fashion_MNIST_(Exercise).ipynb ├── Part_4_Fashion_MNIST_(Solution).ipynb ├── Part_5_Inference_and_Validation_(Exercise).ipynb ├── Part_5_Inference_and_Validation_(Solution).ipynb ├── Part_6_Saving_and_Loading_Models.ipynb ├── Part_7_Loading_Image_Data_(Exercise).ipynb ├── Part_7_Loading_Image_Data_(Solution).ipynb ├── Part_8_Transfer_Learning_(Exercise).ipynb ├── Part_8_Transfer_Learning_(Solution).ipynb └── assets │ ├── activation.png │ ├── backprop_diagram.png │ ├── cat_cropped.png │ ├── dog_cat.png │ ├── fashion-mnist-sprite.png │ ├── function_approx.png │ ├── gradient_descent.png │ ├── image_distribution.png │ ├── mlp_mnist.png │ ├── mnist.png │ ├── multilayer_diagram_weights.png │ ├── overfitting.png │ ├── simple_neuron.png │ ├── tensor_examples.svg │ └── train_examples.png ├── Ensemble Models ├── Exercise: More Spam Classifying - Solution.ipynb └── Exercise: More Spam Classifying.ipynb ├── Guided Projects ├── Predicting Credit Card Approvals │ ├── Predicting Credit Card Approvals - Solution.ipynb │ ├── Predicting Credit Card Approvals.ipynb │ ├── cc_approvals.data │ └── temp.txt ├── Who's Tweeting - Trump or Trudeau │ ├── Who's Tweeting - Trump or Trudeau - Solution.ipynb │ └── Who's Tweeting - Trump or Trudeau.ipynb └── temp.txt ├── LICENSE ├── Lab: K-Nearest Neighbours Classification.ipynb ├── Lab: K-Nearest Neighbours Regression.ipynb ├── Lecture Slides: 1 Machine Learning: An Introduction.md ├── Lectures ├── 1 Machine Learning: An Introduction │ ├── Machine Learning - An Introduction.pdf │ └── Note: Machine Learning: An Introduction.md ├── 10 Decision Tree │ ├── Decision Tree.pdf │ └── Note: Decision Tree.md ├── 12 Principle Component Analysis (PCA) │ └── dd.md ├── 13 Deep Neural Networks │ ├── Deep Network and Regularisation.pdf │ ├── Hands-On Lab: Deep Neural Networks.ipynb │ └── dd.md ├── 13 K-Means Clustering │ └── Principle Component Analysis (PCA).pdf ├── 14 Neural Networks │ ├── Keras Lab: Single-Layer Perceptron.ipynb │ ├── Single-Layer Perceptron.pdf │ ├── Solution: 
Try-It-Yourself SLP Lab.ipynb │ └── Try-It-Yourself SLP Lab.ipynb ├── 2 K-Nearest Neighbours Classification │ ├── Guided Lab: K‑Nearest Neighbours (KNN) Classification — Guided_Lab.ipynb │ ├── Guided Lab: K‑Nearest Neighbours (KNN) Classification.ipynb │ ├── K-Nearest Neighbours Classification.pdf │ ├── Lab: K-Nearest Neigbours Classification from Scratch.ipynb │ └── Note: K-Nearest Neighbours Classification.md ├── 3 K-Nearest Neighbours Classification from Scratch.ipynb ├── 3 K-Nearest Neighbours Regression │ ├── Guided Lab: K-Nearest Neighbours (KNN) Regression.ipynb │ ├── K-Nearest Neighbours Regression.pdf │ ├── Lab: K-Nearest Neigbours Regression from Scratch.ipynb │ └── Note: K-Nearest Neighbours Regression.md ├── 4 Linear Regression │ ├── Guided Lab: Linear Regression with California Housing.ipynb │ ├── Lab: Linear Regression from Scratch.ipynb │ ├── Lab: Linear Regression.ipynb │ ├── Linear Regression.pdf │ └── Note: Linear Regression.md ├── 5 Gradient Descent │ ├── Gradient Descent.pdf │ ├── Lab: Gradient Descent from Scratch.ipynb │ └── Note: Gradient Descent.md ├── 6 Logistic Regression │ ├── Lab: Predicting Credit Risk.ipynb │ ├── Logistic Regression.pdf │ └── Note: Logistic Regression.md ├── 7 Regularisation │ ├── Note: Regularisation.md │ └── Regularisation.pdf ├── 8 Feature Selection │ └── dd.md ├── 9 Naive Bayes │ ├── Naive Bayes.pdf │ ├── Note: Naive Bayes.md │ ├── Who's Tweeting - Trump or Trudeau - Solution.ipynb │ └── Who's Tweeting - Trump or Trudeau.ipynb ├── AI as a Growth Enabler.pdf ├── CNN Lab with Keras.ipynb └── Case Study 1: AI as a Growth Enabler │ └── AI as a Growth Enabler.pdf ├── Linear Regression ├── 14 Mini-Batch Gradient Descent.ipynb ├── 15 Exercise: Mini- Batch Gradient Descent - Solution.ipynb ├── 15 Exercise: Mini- Batch Gradient Descent.ipynb ├── 17 Linear Regression in scikit-learn - Solution.ipynb ├── 17 Linear Regression in scikit-learn.ipynb ├── 19 Exercise: Multiple Linear Regression - Solution.ipynb ├── 19 Exercise: Multiple Linear Regression.ipynb ├── 22 Linear Regression Warnings.ipynb ├── 24 Exercise: Polynomial Regression - Solution.ipynb ├── 24 Exercise: Polynomial Regression.ipynb ├── 26 Exercise: Regularization - Solution.ipynb ├── 26 Exercise: Regularization.ipynb ├── 27 Exercise: Feature Scaling - Solution.ipynb ├── 27 Exercise: Feature Scaling.ipynb ├── 8 Exercise-Quiz: Absolute and Square Trick - Solution.ipynb ├── 8 Exercise-Quiz: Absolute and Square Trick.ipynb ├── data.csv ├── gd-data.csv ├── poly-data.csv └── reg-data.csv ├── Model Evaluation Metrics ├── 15 Exercise: Sklearn Classification - Solution.ipynb ├── 15 Exercise: Sklearn Classification.ipynb ├── 18 Exercise: Sklearn Regression - Solution.ipynb ├── 18 Exercise: Sklearn Regression.ipynb ├── 2 Quiz: Testing Your Models - Solution.ipynb ├── 2 Quiz: Testing Your Models.ipynb ├── data.csv └── temp.txt ├── Naive Bayes ├── Exercise: Building a Spam Classifier - Solution.ipynb └── Exercise: Building a Spam Classifier.ipynb ├── README.md ├── Sentiment Analysis of US Airline Tweets Using Deep Neural Networks.ipynb ├── Sentiment Analysis on Movie Reviews using Multiple Models.ipynb ├── Training and Tuning ├── 11 Exercise: Grid Search Lab - Solution.ipynb ├── 11 Exercise: Grid Search Lab.ipynb ├── 13 Exercise: Putting It All Together - Solution.ipynb ├── 13 Exercise: Putting It All Together.ipynb └── 7 Exercise: Detecting Overfitting and Underfitting with Learning Curves.ipynb ├── Walk-Through └── temp.txt ├── Who's Tweeting - Trump or Trudeau └── Who's Tweeting - Trump or 
Trudeau.ipynb ├── aligning-contents-with-coursework-requirements.md └── mid-term-project-ideas.md /Code Walk Through/knn-data.csv: -------------------------------------------------------------------------------- 1 | X1,X2,X3,X4,Y 2 | 5.1,3.5,1.4,0.2,0.0 3 | 4.9,3.0,1.4,0.2,0.0 4 | 4.7,3.2,1.3,0.2,0.0 5 | 4.6,3.1,1.5,0.2,0.0 6 | 5.0,3.6,1.4,0.2,0.0 7 | 5.4,3.9,1.7,0.4,0.0 8 | 4.6,3.4,1.4,0.3,0.0 9 | 5.0,3.4,1.5,0.2,0.0 10 | 4.4,2.9,1.4,0.2,0.0 11 | 4.9,3.1,1.5,0.1,0.0 12 | 5.4,3.7,1.5,0.2,0.0 13 | 4.8,3.4,1.6,0.2,0.0 14 | 4.8,3.0,1.4,0.1,0.0 15 | 4.3,3.0,1.1,0.1,0.0 16 | 5.8,4.0,1.2,0.2,0.0 17 | 5.7,4.4,1.5,0.4,0.0 18 | 5.4,3.9,1.3,0.4,0.0 19 | 5.1,3.5,1.4,0.3,0.0 20 | 5.7,3.8,1.7,0.3,0.0 21 | 5.1,3.8,1.5,0.3,0.0 22 | 5.4,3.4,1.7,0.2,0.0 23 | 5.1,3.7,1.5,0.4,0.0 24 | 4.6,3.6,1.0,0.2,0.0 25 | 5.1,3.3,1.7,0.5,0.0 26 | 4.8,3.4,1.9,0.2,0.0 27 | 5.0,3.0,1.6,0.2,0.0 28 | 5.0,3.4,1.6,0.4,0.0 29 | 5.2,3.5,1.5,0.2,0.0 30 | 5.2,3.4,1.4,0.2,0.0 31 | 4.7,3.2,1.6,0.2,0.0 32 | 4.8,3.1,1.6,0.2,0.0 33 | 5.4,3.4,1.5,0.4,0.0 34 | 5.2,4.1,1.5,0.1,0.0 35 | 5.5,4.2,1.4,0.2,0.0 36 | 4.9,3.1,1.5,0.2,0.0 37 | 5.0,3.2,1.2,0.2,0.0 38 | 5.5,3.5,1.3,0.2,0.0 39 | 4.9,3.6,1.4,0.1,0.0 40 | 4.4,3.0,1.3,0.2,0.0 41 | 5.1,3.4,1.5,0.2,0.0 42 | 5.0,3.5,1.3,0.3,0.0 43 | 4.5,2.3,1.3,0.3,0.0 44 | 4.4,3.2,1.3,0.2,0.0 45 | 5.0,3.5,1.6,0.6,0.0 46 | 5.1,3.8,1.9,0.4,0.0 47 | 4.8,3.0,1.4,0.3,0.0 48 | 5.1,3.8,1.6,0.2,0.0 49 | 4.6,3.2,1.4,0.2,0.0 50 | 5.3,3.7,1.5,0.2,0.0 51 | 5.0,3.3,1.4,0.2,0.0 52 | 7.0,3.2,4.7,1.4,1.0 53 | 6.4,3.2,4.5,1.5,1.0 54 | 6.9,3.1,4.9,1.5,1.0 55 | 5.5,2.3,4.0,1.3,1.0 56 | 6.5,2.8,4.6,1.5,1.0 57 | 5.7,2.8,4.5,1.3,1.0 58 | 6.3,3.3,4.7,1.6,1.0 59 | 4.9,2.4,3.3,1.0,1.0 60 | 6.6,2.9,4.6,1.3,1.0 61 | 5.2,2.7,3.9,1.4,1.0 62 | 5.0,2.0,3.5,1.0,1.0 63 | 5.9,3.0,4.2,1.5,1.0 64 | 6.0,2.2,4.0,1.0,1.0 65 | 6.1,2.9,4.7,1.4,1.0 66 | 5.6,2.9,3.6,1.3,1.0 67 | 6.7,3.1,4.4,1.4,1.0 68 | 5.6,3.0,4.5,1.5,1.0 69 | 5.8,2.7,4.1,1.0,1.0 70 | 6.2,2.2,4.5,1.5,1.0 71 | 5.6,2.5,3.9,1.1,1.0 72 | 5.9,3.2,4.8,1.8,1.0 73 | 6.1,2.8,4.0,1.3,1.0 74 | 6.3,2.5,4.9,1.5,1.0 75 | 6.1,2.8,4.7,1.2,1.0 76 | 6.4,2.9,4.3,1.3,1.0 77 | 6.6,3.0,4.4,1.4,1.0 78 | 6.8,2.8,4.8,1.4,1.0 79 | 6.7,3.0,5.0,1.7,1.0 80 | 6.0,2.9,4.5,1.5,1.0 81 | 5.7,2.6,3.5,1.0,1.0 82 | 5.5,2.4,3.8,1.1,1.0 83 | 5.5,2.4,3.7,1.0,1.0 84 | 5.8,2.7,3.9,1.2,1.0 85 | 6.0,2.7,5.1,1.6,1.0 86 | 5.4,3.0,4.5,1.5,1.0 87 | 6.0,3.4,4.5,1.6,1.0 88 | 6.7,3.1,4.7,1.5,1.0 89 | 6.3,2.3,4.4,1.3,1.0 90 | 5.6,3.0,4.1,1.3,1.0 91 | 5.5,2.5,4.0,1.3,1.0 92 | 5.5,2.6,4.4,1.2,1.0 93 | 6.1,3.0,4.6,1.4,1.0 94 | 5.8,2.6,4.0,1.2,1.0 95 | 5.0,2.3,3.3,1.0,1.0 96 | 5.6,2.7,4.2,1.3,1.0 97 | 5.7,3.0,4.2,1.2,1.0 98 | 5.7,2.9,4.2,1.3,1.0 99 | 6.2,2.9,4.3,1.3,1.0 100 | 5.1,2.5,3.0,1.1,1.0 101 | 5.7,2.8,4.1,1.3,1.0 102 | 6.3,3.3,6.0,2.5,2.0 103 | 5.8,2.7,5.1,1.9,2.0 104 | 7.1,3.0,5.9,2.1,2.0 105 | 6.3,2.9,5.6,1.8,2.0 106 | 6.5,3.0,5.8,2.2,2.0 107 | 7.6,3.0,6.6,2.1,2.0 108 | 4.9,2.5,4.5,1.7,2.0 109 | 7.3,2.9,6.3,1.8,2.0 110 | 6.7,2.5,5.8,1.8,2.0 111 | 7.2,3.6,6.1,2.5,2.0 112 | 6.5,3.2,5.1,2.0,2.0 113 | 6.4,2.7,5.3,1.9,2.0 114 | 6.8,3.0,5.5,2.1,2.0 115 | 5.7,2.5,5.0,2.0,2.0 116 | 5.8,2.8,5.1,2.4,2.0 117 | 6.4,3.2,5.3,2.3,2.0 118 | 6.5,3.0,5.5,1.8,2.0 119 | 7.7,3.8,6.7,2.2,2.0 120 | 7.7,2.6,6.9,2.3,2.0 121 | 6.0,2.2,5.0,1.5,2.0 122 | 6.9,3.2,5.7,2.3,2.0 123 | 5.6,2.8,4.9,2.0,2.0 124 | 7.7,2.8,6.7,2.0,2.0 125 | 6.3,2.7,4.9,1.8,2.0 126 | 6.7,3.3,5.7,2.1,2.0 127 | 7.2,3.2,6.0,1.8,2.0 128 | 6.2,2.8,4.8,1.8,2.0 129 | 6.1,3.0,4.9,1.8,2.0 130 | 6.4,2.8,5.6,2.1,2.0 131 | 7.2,3.0,5.8,1.6,2.0 132 | 7.4,2.8,6.1,1.9,2.0 133 | 
7.9,3.8,6.4,2.0,2.0 134 | 6.4,2.8,5.6,2.2,2.0 135 | 6.3,2.8,5.1,1.5,2.0 136 | 6.1,2.6,5.6,1.4,2.0 137 | 7.7,3.0,6.1,2.3,2.0 138 | 6.3,3.4,5.6,2.4,2.0 139 | 6.4,3.1,5.5,1.8,2.0 140 | 6.0,3.0,4.8,1.8,2.0 141 | 6.9,3.1,5.4,2.1,2.0 142 | 6.7,3.1,5.6,2.4,2.0 143 | 6.9,3.1,5.1,2.3,2.0 144 | 5.8,2.7,5.1,1.9,2.0 145 | 6.8,3.2,5.9,2.3,2.0 146 | 6.7,3.3,5.7,2.5,2.0 147 | 6.7,3.0,5.2,2.3,2.0 148 | 6.3,2.5,5.0,1.9,2.0 149 | 6.5,3.0,5.2,2.0,2.0 150 | 6.2,3.4,5.4,2.3,2.0 151 | 5.9,3.0,5.1,1.8,2.0 152 | -------------------------------------------------------------------------------- /Code Walk Through/temp.txt: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /Decision Tree/19 Exercise: Titanic Survival Model with Decision Trees - Solution.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "authorship_tag": "ABX9TyNF3zA8ADL+nWXiY9v1Dpvp", 8 | "include_colab_link": true 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 | }, 14 | "language_info": { 15 | "name": "python" 16 | } 17 | }, 18 | "cells": [ 19 | { 20 | "cell_type": "markdown", 21 | "metadata": { 22 | "id": "view-in-github", 23 | "colab_type": "text" 24 | }, 25 | "source": [ 26 | "\"Open" 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "source": [ 32 | "# Lab: Titanic Survival Exploration with Decision Trees" 33 | ], 34 | "metadata": { 35 | "id": "aQOQuH86z-D3" 36 | } 37 | }, 38 | { 39 | "cell_type": "markdown", 40 | "source": [ 41 | "## Getting Started\n", 42 | "\n", 43 | "In this lab, you will see how decision trees work by implementing a decision tree in sklearn.\n", 44 | "\n", 45 | "We'll start by loading the dataset and displaying some of its rows." 
46 | ], 47 | "metadata": { 48 | "id": "f6S13xo_0CD0" 49 | } 50 | }, 51 | { 52 | "cell_type": "code", 53 | "source": [ 54 | "# Import libraries necessary for this project\n", 55 | "import numpy as np\n", 56 | "import pandas as pd\n", 57 | "from IPython.display import display # Allows the use of display() for DataFrames\n", 58 | "\n", 59 | "# Pretty display for notebooks\n", 60 | "%matplotlib inline\n", 61 | "\n", 62 | "# Set a random seed\n", 63 | "import random\n", 64 | "random.seed(42)\n", 65 | "\n", 66 | "# Load the dataset\n", 67 | "# URL for our dataset, titanic_data.csv\n", 68 | "URL = \"https://drive.google.com/file/d/1nPumfMimt3yHF6O1GfUikH18YaGE9Q7D/view?usp=sharing\"\n", 69 | "FILE_PATH = \"https://drive.google.com/uc?export=download&id=\" + URL.split(\"/\")[-2]\n", 70 | "full_data = pd.read_csv(FILE_PATH)\n", 71 | "\n", 72 | "# Print the first few entries of the RMS Titanic data\n", 73 | "display(full_data.head())" 74 | ], 75 | "metadata": { 76 | "id": "S_RtljL7xc66" 77 | }, 78 | "execution_count": null, 79 | "outputs": [] 80 | }, 81 | { 82 | "cell_type": "markdown", 83 | "source": [ 84 | "Recall that these are the various features present for each passenger on the ship:\n", 85 | "- **Survived**: Outcome of survival (0 = No; 1 = Yes)\n", 86 | "- **Pclass**: Socio-economic class (1 = Upper class; 2 = Middle class; 3 = Lower class)\n", 87 | "- **Name**: Name of passenger\n", 88 | "- **Sex**: Sex of the passenger\n", 89 | "- **Age**: Age of the passenger (Some entries contain `NaN`)\n", 90 | "- **SibSp**: Number of siblings and spouses of the passenger aboard\n", 91 | "- **Parch**: Number of parents and children of the passenger aboard\n", 92 | "- **Ticket**: Ticket number of the passenger\n", 93 | "- **Fare**: Fare paid by the passenger\n", 94 | "- **Cabin** Cabin number of the passenger (Some entries contain `NaN`)\n", 95 | "- **Embarked**: Port of embarkation of the passenger (C = Cherbourg; Q = Queenstown; S = Southampton)\n", 96 | "\n", 97 | "Since we're interested in the outcome of survival for each passenger or crew member, we can remove the **Survived** feature from this dataset and store it as its own separate variable `outcomes`. We will use these outcomes as our prediction targets. \n", 98 | "Run the code cell below to remove **Survived** as a feature of the dataset and store it in `outcomes`." 99 | ], 100 | "metadata": { 101 | "id": "np664ffG0kUL" 102 | } 103 | }, 104 | { 105 | "cell_type": "code", 106 | "source": [ 107 | "# Store the 'Survived' feature in a new variable and remove it from the dataset\n", 108 | "outcomes = full_data['Survived']\n", 109 | "features_raw = full_data.drop('Survived', axis = 1)\n", 110 | "\n", 111 | "# Show the new dataset with 'Survived' removed\n", 112 | "display(features_raw.head())" 113 | ], 114 | "metadata": { 115 | "id": "F12pjyxl0kyc" 116 | }, 117 | "execution_count": null, 118 | "outputs": [] 119 | }, 120 | { 121 | "cell_type": "markdown", 122 | "source": [ 123 | "The very same sample of the RMS Titanic data now shows the **Survived** feature removed from the DataFrame. Note that `data` (the passenger data) and `outcomes` (the outcomes of survival) are now *paired*. That means for any passenger `data.loc[i]`, they have the survival outcome `outcomes[i]`.\n", 124 | "\n", 125 | "## Preprocessing the data\n", 126 | "\n", 127 | "Now, let's do some data preprocessing. 
First, we'll remove the names of the passengers, and then one-hot encode the features.\n", 128 | "\n", 129 | "One-Hot encoding is useful for changing over categorical data into numerical data, with each different option within a category changed into either a 0 or 1 in a separate *new* category as to whether it is that option or not (e.g. Queenstown port or not Queenstown port). Check out [this article](https://hackernoon.com/what-is-one-hot-encoding-why-and-when-do-you-have-to-use-it-e3c6186d008f) before continuing.\n", 130 | "\n", 131 | "**Question:** Why would it be a terrible idea to one-hot encode the data without removing the names?" 132 | ], 133 | "metadata": { 134 | "id": "50ch7GOf0oGw" 135 | } 136 | }, 137 | { 138 | "cell_type": "code", 139 | "source": [ 140 | "# Removing the names\n", 141 | "features_no_names = features_raw.drop(['Name'], axis=1)\n", 142 | "\n", 143 | "# One-hot encoding\n", 144 | "features = pd.get_dummies(features_no_names)" 145 | ], 146 | "metadata": { 147 | "id": "0rbqsbao0qyX" 148 | }, 149 | "execution_count": null, 150 | "outputs": [] 151 | }, 152 | { 153 | "cell_type": "markdown", 154 | "source": [ 155 | "And now we'll fill in any blanks with zeroes." 156 | ], 157 | "metadata": { 158 | "id": "XV4g2jid0tp-" 159 | } 160 | }, 161 | { 162 | "cell_type": "code", 163 | "source": [ 164 | "features = features.fillna(0.0)\n", 165 | "display(features.head())" 166 | ], 167 | "metadata": { 168 | "id": "byV2Luk30v7c" 169 | }, 170 | "execution_count": null, 171 | "outputs": [] 172 | }, 173 | { 174 | "cell_type": "markdown", 175 | "source": [ 176 | "## (TODO) Training the model\n", 177 | "\n", 178 | "Now we're ready to train a model in sklearn. First, let's split the data into training and testing sets. Then we'll train the model on the training set." 179 | ], 180 | "metadata": { 181 | "id": "SKecVQKp0ywI" 182 | } 183 | }, 184 | { 185 | "cell_type": "code", 186 | "source": [ 187 | "from sklearn.model_selection import train_test_split\n", 188 | "X_train, X_test, y_train, y_test = train_test_split(features, outcomes, test_size=0.2, random_state=42)" 189 | ], 190 | "metadata": { 191 | "id": "spApIsg_005S" 192 | }, 193 | "execution_count": null, 194 | "outputs": [] 195 | }, 196 | { 197 | "cell_type": "code", 198 | "source": [ 199 | "# Import the classifier from sklearn\n", 200 | "from sklearn.tree import DecisionTreeClassifier\n", 201 | "\n", 202 | "# TODO: Define the classifier, and fit it to the data\n", 203 | "model = DecisionTreeClassifier()\n", 204 | "model.fit(X_train, y_train)" 205 | ], 206 | "metadata": { 207 | "id": "6sS0_jFl03fM" 208 | }, 209 | "execution_count": null, 210 | "outputs": [] 211 | }, 212 | { 213 | "cell_type": "markdown", 214 | "source": [ 215 | "## Testing the model\n", 216 | "Now, let's see how our model does, let's calculate the accuracy over both the training and the testing set." 
217 | ], 218 | "metadata": { 219 | "id": "QU9ONQEV06cl" 220 | } 221 | }, 222 | { 223 | "cell_type": "code", 224 | "source": [ 225 | "# Making predictions\n", 226 | "y_train_pred = model.predict(X_train)\n", 227 | "y_test_pred = model.predict(X_test)\n", 228 | "\n", 229 | "# Calculate the accuracy\n", 230 | "from sklearn.metrics import accuracy_score\n", 231 | "train_accuracy = accuracy_score(y_train, y_train_pred)\n", 232 | "test_accuracy = accuracy_score(y_test, y_test_pred)\n", 233 | "print('The training accuracy is', train_accuracy)\n", 234 | "print('The test accuracy is', test_accuracy)" 235 | ], 236 | "metadata": { 237 | "id": "QuPP2wC008cm" 238 | }, 239 | "execution_count": null, 240 | "outputs": [] 241 | }, 242 | { 243 | "cell_type": "markdown", 244 | "source": [ 245 | "# Exercise: Improving the model\n", 246 | "\n", 247 | "Ok, high training accuracy and a lower testing accuracy. We may be overfitting a bit.\n", 248 | "\n", 249 | "So now it's your turn to shine! Train a new model, and try to specify some parameters in order to improve the testing accuracy, such as:\n", 250 | "- `max_depth`\n", 251 | "- `min_samples_leaf`\n", 252 | "- `min_samples_split`\n", 253 | "\n", 254 | "You can use your intuition, trial and error, or even better, feel free to use Grid Search!\n", 255 | "\n", 256 | "**Challenge:** Try to get to 85% accuracy on the testing set. If you'd like a hint, take a look at the solutions notebook next." 257 | ], 258 | "metadata": { 259 | "id": "pizTZid80_wk" 260 | } 261 | }, 262 | { 263 | "cell_type": "code", 264 | "source": [ 265 | "# Training the model\n", 266 | "model = DecisionTreeClassifier(max_depth=6, min_samples_leaf=6, min_samples_split=10)\n", 267 | "model.fit(X_train, y_train)\n", 268 | "\n", 269 | "# Making predictions\n", 270 | "y_train_pred = model.predict(X_train)\n", 271 | "y_test_pred = model.predict(X_test)\n", 272 | "\n", 273 | "# Calculating accuracies\n", 274 | "train_accuracy = accuracy_score(y_train, y_train_pred)\n", 275 | "test_accuracy = accuracy_score(y_test, y_test_pred)\n", 276 | "\n", 277 | "print('The training accuracy is', train_accuracy)\n", 278 | "print('The test accuracy is', test_accuracy)" 279 | ], 280 | "metadata": { 281 | "id": "xxKadlkF1E3O" 282 | }, 283 | "execution_count": null, 284 | "outputs": [] 285 | } 286 | ] 287 | } -------------------------------------------------------------------------------- /Decision Tree/19 Exercise: Titanic Survival Model with Decision Trees.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "authorship_tag": "ABX9TyOWUwwUyYRtBY/Vzz2idWE9", 8 | "include_colab_link": true 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 | }, 14 | "language_info": { 15 | "name": "python" 16 | } 17 | }, 18 | "cells": [ 19 | { 20 | "cell_type": "markdown", 21 | "metadata": { 22 | "id": "view-in-github", 23 | "colab_type": "text" 24 | }, 25 | "source": [ 26 | "\"Open" 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "source": [ 32 | "# Lab: Titanic Survival Exploration with Decision Trees" 33 | ], 34 | "metadata": { 35 | "id": "aQOQuH86z-D3" 36 | } 37 | }, 38 | { 39 | "cell_type": "markdown", 40 | "source": [ 41 | "## Getting Started\n", 42 | "\n", 43 | "In this lab, you will see how decision trees work by implementing a decision tree in sklearn.\n", 44 | "\n", 45 | "We'll start by loading the dataset and displaying some of its rows." 
46 | ], 47 | "metadata": { 48 | "id": "f6S13xo_0CD0" 49 | } 50 | }, 51 | { 52 | "cell_type": "code", 53 | "source": [ 54 | "# Import libraries necessary for this project\n", 55 | "import numpy as np\n", 56 | "import pandas as pd\n", 57 | "from IPython.display import display # Allows the use of display() for DataFrames\n", 58 | "\n", 59 | "# Pretty display for notebooks\n", 60 | "%matplotlib inline\n", 61 | "\n", 62 | "# Set a random seed\n", 63 | "import random\n", 64 | "random.seed(42)\n", 65 | "\n", 66 | "# Load the dataset\n", 67 | "# URL for our dataset, titanic_data.csv\n", 68 | "URL = \"https://drive.google.com/file/d/1nPumfMimt3yHF6O1GfUikH18YaGE9Q7D/view?usp=sharing\"\n", 69 | "FILE_PATH = \"https://drive.google.com/uc?export=download&id=\" + URL.split(\"/\")[-2]\n", 70 | "full_data = pd.read_csv(FILE_PATH)\n", 71 | "\n", 72 | "# Print the first few entries of the RMS Titanic data\n", 73 | "display(full_data.head())" 74 | ], 75 | "metadata": { 76 | "id": "S_RtljL7xc66" 77 | }, 78 | "execution_count": null, 79 | "outputs": [] 80 | }, 81 | { 82 | "cell_type": "markdown", 83 | "source": [ 84 | "Recall that these are the various features present for each passenger on the ship:\n", 85 | "- **Survived**: Outcome of survival (0 = No; 1 = Yes)\n", 86 | "- **Pclass**: Socio-economic class (1 = Upper class; 2 = Middle class; 3 = Lower class)\n", 87 | "- **Name**: Name of passenger\n", 88 | "- **Sex**: Sex of the passenger\n", 89 | "- **Age**: Age of the passenger (Some entries contain `NaN`)\n", 90 | "- **SibSp**: Number of siblings and spouses of the passenger aboard\n", 91 | "- **Parch**: Number of parents and children of the passenger aboard\n", 92 | "- **Ticket**: Ticket number of the passenger\n", 93 | "- **Fare**: Fare paid by the passenger\n", 94 | "- **Cabin** Cabin number of the passenger (Some entries contain `NaN`)\n", 95 | "- **Embarked**: Port of embarkation of the passenger (C = Cherbourg; Q = Queenstown; S = Southampton)\n", 96 | "\n", 97 | "Since we're interested in the outcome of survival for each passenger or crew member, we can remove the **Survived** feature from this dataset and store it as its own separate variable `outcomes`. We will use these outcomes as our prediction targets. \n", 98 | "Run the code cell below to remove **Survived** as a feature of the dataset and store it in `outcomes`." 99 | ], 100 | "metadata": { 101 | "id": "np664ffG0kUL" 102 | } 103 | }, 104 | { 105 | "cell_type": "code", 106 | "source": [ 107 | "# Store the 'Survived' feature in a new variable and remove it from the dataset\n", 108 | "outcomes = full_data['Survived']\n", 109 | "features_raw = full_data.drop('Survived', axis = 1)\n", 110 | "\n", 111 | "# Show the new dataset with 'Survived' removed\n", 112 | "display(features_raw.head())" 113 | ], 114 | "metadata": { 115 | "id": "F12pjyxl0kyc" 116 | }, 117 | "execution_count": null, 118 | "outputs": [] 119 | }, 120 | { 121 | "cell_type": "markdown", 122 | "source": [ 123 | "The very same sample of the RMS Titanic data now shows the **Survived** feature removed from the DataFrame. Note that `data` (the passenger data) and `outcomes` (the outcomes of survival) are now *paired*. That means for any passenger `data.loc[i]`, they have the survival outcome `outcomes[i]`.\n", 124 | "\n", 125 | "## Preprocessing the data\n", 126 | "\n", 127 | "Now, let's do some data preprocessing. 
First, we'll remove the names of the passengers, and then one-hot encode the features.\n", 128 | "\n", 129 | "One-Hot encoding is useful for changing over categorical data into numerical data, with each different option within a category changed into either a 0 or 1 in a separate *new* category as to whether it is that option or not (e.g. Queenstown port or not Queenstown port). Check out [this article](https://hackernoon.com/what-is-one-hot-encoding-why-and-when-do-you-have-to-use-it-e3c6186d008f) before continuing.\n", 130 | "\n", 131 | "**Question:** Why would it be a terrible idea to one-hot encode the data without removing the names?" 132 | ], 133 | "metadata": { 134 | "id": "50ch7GOf0oGw" 135 | } 136 | }, 137 | { 138 | "cell_type": "code", 139 | "source": [ 140 | "# Removing the names\n", 141 | "features_no_names = features_raw.drop(['Name'], axis=1)\n", 142 | "\n", 143 | "# One-hot encoding\n", 144 | "features = pd.get_dummies(features_no_names)" 145 | ], 146 | "metadata": { 147 | "id": "0rbqsbao0qyX" 148 | }, 149 | "execution_count": null, 150 | "outputs": [] 151 | }, 152 | { 153 | "cell_type": "markdown", 154 | "source": [ 155 | "And now we'll fill in any blanks with zeroes." 156 | ], 157 | "metadata": { 158 | "id": "XV4g2jid0tp-" 159 | } 160 | }, 161 | { 162 | "cell_type": "code", 163 | "source": [ 164 | "features = features.fillna(0.0)\n", 165 | "display(features.head())" 166 | ], 167 | "metadata": { 168 | "id": "byV2Luk30v7c" 169 | }, 170 | "execution_count": null, 171 | "outputs": [] 172 | }, 173 | { 174 | "cell_type": "markdown", 175 | "source": [ 176 | "## (TODO) Training the model\n", 177 | "\n", 178 | "Now we're ready to train a model in sklearn. First, let's split the data into training and testing sets. Then we'll train the model on the training set." 179 | ], 180 | "metadata": { 181 | "id": "SKecVQKp0ywI" 182 | } 183 | }, 184 | { 185 | "cell_type": "code", 186 | "source": [ 187 | "from sklearn.model_selection import train_test_split\n", 188 | "X_train, X_test, y_train, y_test = train_test_split(features, outcomes, test_size=0.2, random_state=42)" 189 | ], 190 | "metadata": { 191 | "id": "spApIsg_005S" 192 | }, 193 | "execution_count": null, 194 | "outputs": [] 195 | }, 196 | { 197 | "cell_type": "code", 198 | "source": [ 199 | "# Import the classifier from sklearn\n", 200 | "from sklearn.tree import DecisionTreeClassifier\n", 201 | "\n", 202 | "# TODO: Define the classifier, and fit it to the data\n", 203 | "model = None" 204 | ], 205 | "metadata": { 206 | "id": "6sS0_jFl03fM" 207 | }, 208 | "execution_count": null, 209 | "outputs": [] 210 | }, 211 | { 212 | "cell_type": "markdown", 213 | "source": [ 214 | "## Testing the model\n", 215 | "Now, let's see how our model does, let's calculate the accuracy over both the training and the testing set." 
216 | ], 217 | "metadata": { 218 | "id": "QU9ONQEV06cl" 219 | } 220 | }, 221 | { 222 | "cell_type": "code", 223 | "source": [ 224 | "# Making predictions\n", 225 | "y_train_pred = model.predict(X_train)\n", 226 | "y_test_pred = model.predict(X_test)\n", 227 | "\n", 228 | "# Calculate the accuracy\n", 229 | "from sklearn.metrics import accuracy_score\n", 230 | "train_accuracy = accuracy_score(y_train, y_train_pred)\n", 231 | "test_accuracy = accuracy_score(y_test, y_test_pred)\n", 232 | "print('The training accuracy is', train_accuracy)\n", 233 | "print('The test accuracy is', test_accuracy)" 234 | ], 235 | "metadata": { 236 | "id": "QuPP2wC008cm" 237 | }, 238 | "execution_count": null, 239 | "outputs": [] 240 | }, 241 | { 242 | "cell_type": "markdown", 243 | "source": [ 244 | "# Exercise: Improving the model\n", 245 | "\n", 246 | "Ok, high training accuracy and a lower testing accuracy. We may be overfitting a bit.\n", 247 | "\n", 248 | "So now it's your turn to shine! Train a new model, and try to specify some parameters in order to improve the testing accuracy, such as:\n", 249 | "- `max_depth`\n", 250 | "- `min_samples_leaf`\n", 251 | "- `min_samples_split`\n", 252 | "\n", 253 | "You can use your intuition, trial and error, or even better, feel free to use Grid Search!\n", 254 | "\n", 255 | "**Challenge:** Try to get to 85% accuracy on the testing set. If you'd like a hint, take a look at the solutions notebook next." 256 | ], 257 | "metadata": { 258 | "id": "pizTZid80_wk" 259 | } 260 | }, 261 | { 262 | "cell_type": "code", 263 | "source": [ 264 | "# TODO: Train the model\n", 265 | "\n", 266 | "# TODO: Make predictions\n", 267 | "\n", 268 | "# TODO: Calculate the accuracy" 269 | ], 270 | "metadata": { 271 | "id": "xxKadlkF1E3O" 272 | }, 273 | "execution_count": null, 274 | "outputs": [] 275 | } 276 | ] 277 | } -------------------------------------------------------------------------------- /Decision Tree/data.csv: -------------------------------------------------------------------------------- 1 | 0.24539,0.81725,0 2 | 0.21774,0.76462,0 3 | 0.20161,0.69737,0 4 | 0.20161,0.58041,0 5 | 0.2477,0.49561,0 6 | 0.32834,0.44883,0 7 | 0.39516,0.48099,0 8 | 0.39286,0.57164,0 9 | 0.33525,0.62135,0 10 | 0.33986,0.71199,0 11 | 0.34447,0.81433,0 12 | 0.28226,0.82602,0 13 | 0.26613,0.75,0 14 | 0.26613,0.63596,0 15 | 0.32604,0.54825,0 16 | 0.28917,0.65643,0 17 | 0.80069,0.71491,0 18 | 0.80069,0.64181,0 19 | 0.80069,0.50146,0 20 | 0.79839,0.36988,0 21 | 0.73157,0.25,0 22 | 0.63249,0.18275,0 23 | 0.60023,0.27047,0 24 | 0.66014,0.34649,0 25 | 0.70161,0.42251,0 26 | 0.70853,0.53947,0 27 | 0.71544,0.63304,0 28 | 0.74309,0.72076,0 29 | 0.75,0.63596,0 30 | 0.75,0.46345,0 31 | 0.72235,0.35526,0 32 | 0.66935,0.28509,0 33 | 0.20622,0.94298,1 34 | 0.26613,0.8962,1 35 | 0.38134,0.8962,1 36 | 0.42051,0.94591,1 37 | 0.49885,0.86404,1 38 | 0.31452,0.93421,1 39 | 0.53111,0.72076,1 40 | 0.45276,0.74415,1 41 | 0.53571,0.6038,1 42 | 0.60484,0.71491,1 43 | 0.60945,0.58333,1 44 | 0.51267,0.47807,1 45 | 0.50806,0.59211,1 46 | 0.46198,0.30556,1 47 | 0.5288,0.41082,1 48 | 0.38594,0.35819,1 49 | 0.31682,0.31433,1 50 | 0.29608,0.20906,1 51 | 0.36982,0.27632,1 52 | 0.42972,0.18275,1 53 | 0.51498,0.10965,1 54 | 0.53111,0.20906,1 55 | 0.59793,0.095029,1 56 | 0.73848,0.086257,1 57 | 0.83065,0.18275,1 58 | 0.8629,0.10965,1 59 | 0.88364,0.27924,1 60 | 0.93433,0.30848,1 61 | 0.93433,0.19444,1 62 | 0.92512,0.43421,1 63 | 0.87903,0.43421,1 64 | 0.87903,0.58626,1 65 | 0.9182,0.71491,1 66 | 0.85138,0.8348,1 67 | 0.85599,0.94006,1 68 | 
0.70853,0.94298,1 69 | 0.70853,0.87281,1 70 | 0.59793,0.93129,1 71 | 0.61175,0.83187,1 72 | 0.78226,0.82895,1 73 | 0.78917,0.8962,1 74 | 0.90668,0.89912,1 75 | 0.14862,0.92251,1 76 | 0.15092,0.85819,1 77 | 0.097926,0.85819,1 78 | 0.079493,0.91374,1 79 | 0.079493,0.77632,1 80 | 0.10945,0.79678,1 81 | 0.12327,0.67982,1 82 | 0.077189,0.6886,1 83 | 0.081797,0.58626,1 84 | 0.14862,0.58041,1 85 | 0.14862,0.5307,1 86 | 0.14171,0.41959,1 87 | 0.08871,0.49269,1 88 | 0.095622,0.36696,1 89 | 0.24539,0.3962,1 90 | 0.1947,0.29678,1 91 | 0.16935,0.22368,1 92 | 0.15553,0.13596,1 93 | 0.23848,0.12427,1 94 | 0.33065,0.12427,1 95 | 0.095622,0.2617,1 96 | 0.091014,0.20322,1 97 | -------------------------------------------------------------------------------- /Decision Tree/ml-bugs.csv: -------------------------------------------------------------------------------- 1 | Species,Color,Length (mm) 2 | Mobug,Brown,11.6 3 | Mobug,Blue,16.3 4 | Lobug,Blue,15.1 5 | Lobug,Green,23.7 6 | Lobug,Blue,18.4 7 | Lobug,Brown,17.1 8 | Mobug,Brown,15.7 9 | Lobug,Green,18.6 10 | Lobug,Blue,22.9 11 | Lobug,Blue,21.0 12 | Lobug,Blue,20.5 13 | Mobug,Green,21.2 14 | Mobug,Brown,13.8 15 | Lobug,Blue,14.5 16 | Lobug,Green,24.8 17 | Mobug,Brown,18.2 18 | Lobug,Green,17.9 19 | Lobug,Green,22.7 20 | Mobug,Green,19.9 21 | Mobug,Blue,14.6 22 | Mobug,Blue,19.2 23 | Lobug,Brown,14.1 24 | Lobug,Green,18.8 25 | Mobug,Blue,13.1 -------------------------------------------------------------------------------- /Deep Learning with Tensorflow/assets/activation.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Deep Learning with Tensorflow/assets/activation.png -------------------------------------------------------------------------------- /Deep Learning with Tensorflow/assets/backprop_diagram.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Deep Learning with Tensorflow/assets/backprop_diagram.png -------------------------------------------------------------------------------- /Deep Learning with Tensorflow/assets/cat_cropped.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Deep Learning with Tensorflow/assets/cat_cropped.png -------------------------------------------------------------------------------- /Deep Learning with Tensorflow/assets/dog_cat.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Deep Learning with Tensorflow/assets/dog_cat.png -------------------------------------------------------------------------------- /Deep Learning with Tensorflow/assets/fashion-mnist-sprite.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Deep Learning with Tensorflow/assets/fashion-mnist-sprite.png -------------------------------------------------------------------------------- /Deep Learning with Tensorflow/assets/function_approx.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Deep Learning with Tensorflow/assets/function_approx.png -------------------------------------------------------------------------------- /Deep Learning with Tensorflow/assets/gradient_descent.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Deep Learning with Tensorflow/assets/gradient_descent.png -------------------------------------------------------------------------------- /Deep Learning with Tensorflow/assets/image_distribution.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Deep Learning with Tensorflow/assets/image_distribution.png -------------------------------------------------------------------------------- /Deep Learning with Tensorflow/assets/mlp_mnist.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Deep Learning with Tensorflow/assets/mlp_mnist.png -------------------------------------------------------------------------------- /Deep Learning with Tensorflow/assets/mnist.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Deep Learning with Tensorflow/assets/mnist.png -------------------------------------------------------------------------------- /Deep Learning with Tensorflow/assets/multilayer_diagram_weights.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Deep Learning with Tensorflow/assets/multilayer_diagram_weights.png -------------------------------------------------------------------------------- /Deep Learning with Tensorflow/assets/overfitting.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Deep Learning with Tensorflow/assets/overfitting.png -------------------------------------------------------------------------------- /Deep Learning with Tensorflow/assets/simple_neuron.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Deep Learning with Tensorflow/assets/simple_neuron.png -------------------------------------------------------------------------------- /Deep Learning with Tensorflow/assets/train_examples.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Deep Learning with Tensorflow/assets/train_examples.png -------------------------------------------------------------------------------- /Ensemble Models/Exercise: More Spam Classifying - Solution.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "include_colab_link": true 8 | }, 9 | "kernelspec": { 10 | 
"name": "python3", 11 | "display_name": "Python 3" 12 | }, 13 | "language_info": { 14 | "name": "python" 15 | } 16 | }, 17 | "cells": [ 18 | { 19 | "cell_type": "markdown", 20 | "metadata": { 21 | "id": "view-in-github", 22 | "colab_type": "text" 23 | }, 24 | "source": [ 25 | "\"Open" 26 | ] 27 | }, 28 | { 29 | "cell_type": "markdown", 30 | "source": [ 31 | "# Our Mission\n", 32 | "\n", 33 | "ou recently used Naive Bayes to classify spam in this [dataset](https://archive.ics.uci.edu/ml/datasets/SMS+Spam+Collection). In this notebook, we will expand on the previous analysis by using a few of the new techniques you've learned throughout this lesson.\n", 34 | "\n", 35 | "\n", 36 | "> Let's quickly re-create what we did in the previous Naive Bayes Spam Classifier notebook. We're providing the essential code from that previous workspace here, so please run this cell below." 37 | ], 38 | "metadata": { 39 | "id": "IYnLjMZ5D0wA" 40 | } 41 | }, 42 | { 43 | "cell_type": "code", 44 | "source": [ 45 | "# Import our libraries\n", 46 | "import pandas as pd\n", 47 | "from sklearn.model_selection import train_test_split\n", 48 | "from sklearn.feature_extraction.text import CountVectorizer\n", 49 | "from sklearn.naive_bayes import MultinomialNB\n", 50 | "from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score\n", 51 | "\n", 52 | "# Read in our dataset\n", 53 | "URL = \"https://drive.google.com/file/d/15gMyvFMdIZ-Iu6LwUSQ88tcJuk83lJiA/view?usp=sharing\"\n", 54 | "FILE_PATH = \"https://drive.google.com/uc?export=download&id=\" + URL.split(\"/\")[-2]\n", 55 | "df = pd.read_table(FILE_PATH,\n", 56 | " sep='\\t',\n", 57 | " header=None,\n", 58 | " names=['label', 'sms_message'])\n", 59 | "\n", 60 | "# Fix our response value\n", 61 | "df['label'] = df.label.map({'ham':0, 'spam':1})\n", 62 | "\n", 63 | "# Split our dataset into training and testing data\n", 64 | "X_train, X_test, y_train, y_test = train_test_split(df['sms_message'],\n", 65 | " df['label'],\n", 66 | " random_state=1)\n", 67 | "\n", 68 | "# Instantiate the CountVectorizer method\n", 69 | "count_vector = CountVectorizer()\n", 70 | "\n", 71 | "# Fit the training data and then return the matrix\n", 72 | "training_data = count_vector.fit_transform(X_train)\n", 73 | "\n", 74 | "# Transform testing data and return the matrix. 
Note we are not fitting the testing data into the CountVectorizer()\n", 75 | "testing_data = count_vector.transform(X_test)\n", 76 | "\n", 77 | "# Instantiate our model\n", 78 | "naive_bayes = MultinomialNB()\n", 79 | "\n", 80 | "# Fit our model to the training data\n", 81 | "naive_bayes.fit(training_data, y_train)\n", 82 | "\n", 83 | "# Predict on the test data\n", 84 | "predictions = naive_bayes.predict(testing_data)\n", 85 | "\n", 86 | "# Score our model\n", 87 | "print('Accuracy score: ', format(accuracy_score(y_test, predictions)))\n", 88 | "print('Precision score: ', format(precision_score(y_test, predictions)))\n", 89 | "print('Recall score: ', format(recall_score(y_test, predictions)))\n", 90 | "print('F1 score: ', format(f1_score(y_test, predictions)))" 91 | ], 92 | "metadata": { 93 | "id": "iZ_x1QlfwsfM" 94 | }, 95 | "execution_count": null, 96 | "outputs": [] 97 | }, 98 | { 99 | "cell_type": "markdown", 100 | "source": [ 101 | "### Turns Out...\n", 102 | "\n", 103 | "We can see from the scores above that our Naive Bayes model actually does a pretty good job of classifying spam and \"ham.\" However, let's take a look at a few additional models to see if we can't improve anyway.\n", 104 | "\n", 105 | "Specifically in this notebook, we will take a look at the following techniques:\n", 106 | "\n", 107 | "* [BaggingClassifier](http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.BaggingClassifier.html#sklearn.ensemble.BaggingClassifier)\n", 108 | "* [RandomForestClassifier](http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html#sklearn.ensemble.RandomForestClassifier)\n", 109 | "* [AdaBoostClassifier](http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.AdaBoostClassifier.html#sklearn.ensemble.AdaBoostClassifier)\n", 110 | "\n", 111 | "Another really useful guide for ensemble methods can be found [in the documentation here](http://scikit-learn.org/stable/modules/ensemble.html).\n", 112 | "\n", 113 | "These ensemble methods use a combination of techniques you have seen throughout this lesson:\n", 114 | "\n", 115 | "* **Bootstrap the data** passed through a learner (bagging).\n", 116 | "* **Subset the features** used for a learner (combined with bagging signifies the two random components of random forests).\n", 117 | "* **Ensemble learners** together in a way that allows those that perform best in certain areas to create the largest impact (boosting).\n", 118 | "\n", 119 | "\n", 120 | "In this notebook, let's get some practice with these methods, which will also help you get comfortable with the process used for performing supervised machine learning in Python in general.\n", 121 | "\n", 122 | "Since you cleaned and vectorized the text in the previous notebook, this notebook can be focused on the fun part - the machine learning part.\n", 123 | "\n", 124 | "### This Process Looks Familiar...\n", 125 | "\n", 126 | "In general, there is a five step process that can be used each time you want to use a supervised learning method (which you actually used above):\n", 127 | "\n", 128 | "1. **Import** the model.\n", 129 | "2. **Instantiate** the model with the hyperparameters of interest.\n", 130 | "3. **Fit** the model to the training data.\n", 131 | "4. **Predict** on the test data.\n", 132 | "5. 
**Score** the model by comparing the predictions to the actual values.\n", 133 | "\n", 134 | "Follow the steps through this notebook to perform these steps using each of the ensemble methods: **BaggingClassifier**, **RandomForestClassifier**, and **AdaBoostClassifier**.\n", 135 | "\n", 136 | "> **Step 1**: First use the documentation to `import` all three of the models." 137 | ], 138 | "metadata": { 139 | "id": "rj5638Dgw1Pz" 140 | } 141 | }, 142 | { 143 | "cell_type": "code", 144 | "source": [ 145 | "# Import the Bagging, RandomForest, and AdaBoost Classifier\n", 146 | "from sklearn.ensemble import BaggingClassifier, RandomForestClassifier, AdaBoostClassifier" 147 | ], 148 | "metadata": { 149 | "id": "UfbqCuwyw8P7" 150 | }, 151 | "execution_count": null, 152 | "outputs": [] 153 | }, 154 | { 155 | "cell_type": "markdown", 156 | "source": [ 157 | "> **Step 2:** Now that you have imported each of the classifiers, `instantiate` each with the hyperparameters specified in each comment. In the upcoming lessons, you will see how we can automate the process to finding the best hyperparameters. For now, let's get comfortable with the process and our new algorithms." 158 | ], 159 | "metadata": { 160 | "id": "qEfIP9g5xA2b" 161 | } 162 | }, 163 | { 164 | "cell_type": "code", 165 | "source": [ 166 | "# Instantiate a BaggingClassifier with:\n", 167 | "# 200 weak learners (n_estimators) and everything else as default values\n", 168 | "bag_mod = BaggingClassifier(n_estimators=200)\n", 169 | "\n", 170 | "\n", 171 | "# Instantiate a RandomForestClassifier with:\n", 172 | "# 200 weak learners (n_estimators) and everything else as default values\n", 173 | "rf_mod = RandomForestClassifier(n_estimators=200)\n", 174 | "\n", 175 | "# Instantiate an a AdaBoostClassifier with:\n", 176 | "# With 300 weak learners (n_estimators) and a learning_rate of 0.2\n", 177 | "ada_mod = AdaBoostClassifier(n_estimators=300, learning_rate=0.2)" 178 | ], 179 | "metadata": { 180 | "id": "kJZlnORvxD56" 181 | }, 182 | "execution_count": null, 183 | "outputs": [] 184 | }, 185 | { 186 | "cell_type": "markdown", 187 | "source": [ 188 | "> **Step 3:** Now that you have instantiated each of your models, `fit` them using the **training_data** and **y_train**. This may take a bit of time, you are fitting 700 weak learners after all!" 189 | ], 190 | "metadata": { 191 | "id": "n7TXzMhdxGSg" 192 | } 193 | }, 194 | { 195 | "cell_type": "code", 196 | "source": [ 197 | "# Fit your BaggingClassifier to the training data\n", 198 | "bag_mod.fit(training_data, y_train)\n", 199 | "\n", 200 | "# Fit your RandomForestClassifier to the training data\n", 201 | "rf_mod.fit(training_data, y_train)\n", 202 | "\n", 203 | "# Fit your AdaBoostClassifier to the training data\n", 204 | "ada_mod.fit(training_data, y_train)" 205 | ], 206 | "metadata": { 207 | "id": "CGF-u3HNxJxc" 208 | }, 209 | "execution_count": null, 210 | "outputs": [] 211 | }, 212 | { 213 | "cell_type": "markdown", 214 | "source": [ 215 | "> **Step 4:** Now that you have fit each of your models, you will use each to `predict` on the **testing_data**." 
216 | ], 217 | "metadata": { 218 | "id": "0fpduCWXxMWn" 219 | } 220 | }, 221 | { 222 | "cell_type": "code", 223 | "source": [ 224 | "# Predict using BaggingClassifier on the test data\n", 225 | "bag_preds = bag_mod.predict(testing_data)\n", 226 | "\n", 227 | "# Predict using RandomForestClassifier on the test data\n", 228 | "rf_preds = rf_mod.predict(testing_data)\n", 229 | "\n", 230 | "# Predict using AdaBoostClassifier on the test data\n", 231 | "ada_preds = ada_mod.predict(testing_data)" 232 | ], 233 | "metadata": { 234 | "id": "80uzQtB1xPJ7" 235 | }, 236 | "execution_count": null, 237 | "outputs": [] 238 | }, 239 | { 240 | "cell_type": "markdown", 241 | "source": [ 242 | "> **Step 5:** Now that you have made your predictions, compare your predictions to the actual values using the function below for each of your models - this will give you the `score` for how well each of your models is performing. It might also be useful to show the Naive Bayes model again here, so we can compare them all side by side." 243 | ], 244 | "metadata": { 245 | "id": "9W0KepxHxRa4" 246 | } 247 | }, 248 | { 249 | "cell_type": "code", 250 | "source": [ 251 | "def print_metrics(y_true, preds, model_name=None):\n", 252 | " '''\n", 253 | " INPUT:\n", 254 | " y_true - the y values that are actually true in the dataset (NumPy array or pandas series)\n", 255 | " preds - the predictions for those values from some model (NumPy array or pandas series)\n", 256 | " model_name - (str - optional) a name associated with the model if you would like to add it to the print statements\n", 257 | "\n", 258 | " OUTPUT:\n", 259 | " None - prints the accuracy, precision, recall, and F1 score\n", 260 | " '''\n", 261 | " if model_name == None:\n", 262 | " print('Accuracy score: ', format(accuracy_score(y_true, preds)))\n", 263 | " print('Precision score: ', format(precision_score(y_true, preds)))\n", 264 | " print('Recall score: ', format(recall_score(y_true, preds)))\n", 265 | " print('F1 score: ', format(f1_score(y_true, preds)))\n", 266 | " print('\\n\\n')\n", 267 | "\n", 268 | " else:\n", 269 | " print('Accuracy score for ' + model_name + ' :' , format(accuracy_score(y_true, preds)))\n", 270 | " print('Precision score ' + model_name + ' :', format(precision_score(y_true, preds)))\n", 271 | " print('Recall score ' + model_name + ' :', format(recall_score(y_true, preds)))\n", 272 | " print('F1 score ' + model_name + ' :', format(f1_score(y_true, preds)))\n", 273 | " print('\\n\\n')" 274 | ], 275 | "metadata": { 276 | "id": "btQZc-WhxUcQ" 277 | }, 278 | "execution_count": null, 279 | "outputs": [] 280 | }, 281 | { 282 | "cell_type": "code", 283 | "source": [ 284 | "# Print Bagging scores\n", 285 | "print_metrics(y_test, bag_preds, 'bagging')\n", 286 | "\n", 287 | "# Print Random Forest scores\n", 288 | "print_metrics(y_test, rf_preds, 'random forest')\n", 289 | "\n", 290 | "# Print AdaBoost scores\n", 291 | "print_metrics(y_test, ada_preds, 'adaboost')\n", 292 | "\n", 293 | "# Naive Bayes Classifier scores\n", 294 | "print_metrics(y_test, predictions, 'naive bayes')" 295 | ], 296 | "metadata": { 297 | "id": "ePJhq_OkxYkJ" 298 | }, 299 | "execution_count": null, 300 | "outputs": [] 301 | }, 302 | { 303 | "cell_type": "markdown", 304 | "source": [ 305 | "### Recap\n", 306 | "\n", 307 | "Now you have seen the whole process for a few ensemble models!\n", 308 | "\n", 309 | "1. **Import** the model.\n", 310 | "2. **Instantiate** the model with the hyperparameters of interest.\n", 311 | "3. 
**Fit** the model to the training data.\n", 312 | "4. **Predict** on the test data.\n", 313 | "5. **Score** the model by comparing the predictions to the actual values.\n", 314 | "\n", 315 | "And that's it. This is a very common process for performing machine learning.\n" 316 | ], 317 | "metadata": { 318 | "id": "JXOj7OiWxXff" 319 | } 320 | } 321 | ] 322 | } -------------------------------------------------------------------------------- /Ensemble Models/Exercise: More Spam Classifying.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "include_colab_link": true 8 | }, 9 | "kernelspec": { 10 | "name": "python3", 11 | "display_name": "Python 3" 12 | }, 13 | "language_info": { 14 | "name": "python" 15 | } 16 | }, 17 | "cells": [ 18 | { 19 | "cell_type": "markdown", 20 | "metadata": { 21 | "id": "view-in-github", 22 | "colab_type": "text" 23 | }, 24 | "source": [ 25 | "\"Open" 26 | ] 27 | }, 28 | { 29 | "cell_type": "markdown", 30 | "source": [ 31 | "# Our Mission\n", 32 | "\n", 33 | "ou recently used Naive Bayes to classify spam in this [dataset](https://archive.ics.uci.edu/ml/datasets/SMS+Spam+Collection). In this notebook, we will expand on the previous analysis by using a few of the new techniques you've learned throughout this lesson.\n", 34 | "\n", 35 | "\n", 36 | "> Let's quickly re-create what we did in the previous Naive Bayes Spam Classifier notebook. We're providing the essential code from that previous workspace here, so please run this cell below." 37 | ], 38 | "metadata": { 39 | "id": "IYnLjMZ5D0wA" 40 | } 41 | }, 42 | { 43 | "cell_type": "code", 44 | "source": [ 45 | "# Import our libraries\n", 46 | "import pandas as pd\n", 47 | "from sklearn.model_selection import train_test_split\n", 48 | "from sklearn.feature_extraction.text import CountVectorizer\n", 49 | "from sklearn.naive_bayes import MultinomialNB\n", 50 | "from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score\n", 51 | "\n", 52 | "\n", 53 | "# Read in our dataset\n", 54 | "URL = \"https://drive.google.com/file/d/15gMyvFMdIZ-Iu6LwUSQ88tcJuk83lJiA/view?usp=sharing\"\n", 55 | "FILE_PATH = \"https://drive.google.com/uc?export=download&id=\" + URL.split(\"/\")[-2]\n", 56 | "df = pd.read_table(FILE_PATH,\n", 57 | " sep='\\t',\n", 58 | " header=None,\n", 59 | " names=['label', 'sms_message'])\n", 60 | "\n", 61 | "# Fix our response value\n", 62 | "df['label'] = df.label.map({'ham':0, 'spam':1})\n", 63 | "\n", 64 | "# Split our dataset into training and testing data\n", 65 | "X_train, X_test, y_train, y_test = train_test_split(df['sms_message'],\n", 66 | " df['label'],\n", 67 | " random_state=1)\n", 68 | "\n", 69 | "# Instantiate the CountVectorizer method\n", 70 | "count_vector = CountVectorizer()\n", 71 | "\n", 72 | "# Fit the training data and then return the matrix\n", 73 | "training_data = count_vector.fit_transform(X_train)\n", 74 | "\n", 75 | "# Transform testing data and return the matrix. 
Note we are not fitting the testing data into the CountVectorizer()\n", 76 | "testing_data = count_vector.transform(X_test)\n", 77 | "\n", 78 | "# Instantiate our model\n", 79 | "naive_bayes = MultinomialNB()\n", 80 | "\n", 81 | "# Fit our model to the training data\n", 82 | "naive_bayes.fit(training_data, y_train)\n", 83 | "\n", 84 | "# Predict on the test data\n", 85 | "predictions = naive_bayes.predict(testing_data)\n", 86 | "\n", 87 | "# Score our model\n", 88 | "print('Accuracy score: ', format(accuracy_score(y_test, predictions)))\n", 89 | "print('Precision score: ', format(precision_score(y_test, predictions)))\n", 90 | "print('Recall score: ', format(recall_score(y_test, predictions)))\n", 91 | "print('F1 score: ', format(f1_score(y_test, predictions)))" 92 | ], 93 | "metadata": { 94 | "id": "iZ_x1QlfwsfM", 95 | "outputId": "0889e2ef-bca5-4c87-e19a-7a08cf09d72a", 96 | "colab": { 97 | "base_uri": "https://localhost:8080/" 98 | } 99 | }, 100 | "execution_count": 1, 101 | "outputs": [ 102 | { 103 | "output_type": "stream", 104 | "name": "stdout", 105 | "text": [ 106 | "Accuracy score: 0.9885139985642498\n", 107 | "Precision score: 0.9720670391061452\n", 108 | "Recall score: 0.9405405405405406\n", 109 | "F1 score: 0.9560439560439562\n" 110 | ] 111 | } 112 | ] 113 | }, 114 | { 115 | "cell_type": "markdown", 116 | "source": [ 117 | "### Turns Out...\n", 118 | "\n", 119 | "We can see from the scores above that our Naive Bayes model actually does a pretty good job of classifying spam and \"ham.\" However, let's take a look at a few additional models to see if we can't improve anyway.\n", 120 | "\n", 121 | "Specifically in this notebook, we will take a look at the following techniques:\n", 122 | "\n", 123 | "* [BaggingClassifier](http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.BaggingClassifier.html#sklearn.ensemble.BaggingClassifier)\n", 124 | "* [RandomForestClassifier](http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html#sklearn.ensemble.RandomForestClassifier)\n", 125 | "* [AdaBoostClassifier](http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.AdaBoostClassifier.html#sklearn.ensemble.AdaBoostClassifier)\n", 126 | "\n", 127 | "Another really useful guide for ensemble methods can be found [in the documentation here](http://scikit-learn.org/stable/modules/ensemble.html).\n", 128 | "\n", 129 | "These ensemble methods use a combination of techniques you have seen throughout this lesson:\n", 130 | "\n", 131 | "* **Bootstrap the data** passed through a learner (bagging).\n", 132 | "* **Subset the features** used for a learner (combined with bagging signifies the two random components of random forests).\n", 133 | "* **Ensemble learners** together in a way that allows those that perform best in certain areas to create the largest impact (boosting).\n", 134 | "\n", 135 | "\n", 136 | "In this notebook, let's get some practice with these methods, which will also help you get comfortable with the process used for performing supervised machine learning in Python in general.\n", 137 | "\n", 138 | "Since you cleaned and vectorized the text in the previous notebook, this notebook can be focused on the fun part - the machine learning part.\n", 139 | "\n", 140 | "### This Process Looks Familiar...\n", 141 | "\n", 142 | "In general, there is a five step process that can be used each time you want to use a supervised learning method (which you actually used above):\n", 143 | "\n", 144 | "1. **Import** the model.\n", 145 | "2. 
**Instantiate** the model with the hyperparameters of interest.\n", 146 | "3. **Fit** the model to the training data.\n", 147 | "4. **Predict** on the test data.\n", 148 | "5. **Score** the model by comparing the predictions to the actual values.\n", 149 | "\n", 150 | "Follow the steps through this notebook to perform these steps using each of the ensemble methods: **BaggingClassifier**, **RandomForestClassifier**, and **AdaBoostClassifier**.\n", 151 | "\n", 152 | "> **Step 1**: First use the documentation to `import` all three of the models." 153 | ], 154 | "metadata": { 155 | "id": "rj5638Dgw1Pz" 156 | } 157 | }, 158 | { 159 | "cell_type": "code", 160 | "source": [ 161 | "# Import the Bagging, RandomForest, and AdaBoost Classifier\n" 162 | ], 163 | "metadata": { 164 | "id": "UfbqCuwyw8P7" 165 | }, 166 | "execution_count": null, 167 | "outputs": [] 168 | }, 169 | { 170 | "cell_type": "markdown", 171 | "source": [ 172 | "> **Step 2:** Now that you have imported each of the classifiers, `instantiate` each with the hyperparameters specified in each comment. In the upcoming lessons, you will see how we can automate the process to finding the best hyperparameters. For now, let's get comfortable with the process and our new algorithms." 173 | ], 174 | "metadata": { 175 | "id": "qEfIP9g5xA2b" 176 | } 177 | }, 178 | { 179 | "cell_type": "code", 180 | "source": [ 181 | "# Instantiate a BaggingClassifier with:\n", 182 | "# 200 weak learners (n_estimators) and everything else as default values\n", 183 | "\n", 184 | "\n", 185 | "\n", 186 | "# Instantiate a RandomForestClassifier with:\n", 187 | "# 200 weak learners (n_estimators) and everything else as default values\n", 188 | "\n", 189 | "\n", 190 | "# Instantiate an a AdaBoostClassifier with:\n", 191 | "# With 300 weak learners (n_estimators) and a learning_rate of 0.2\n", 192 | "\n" 193 | ], 194 | "metadata": { 195 | "id": "kJZlnORvxD56" 196 | }, 197 | "execution_count": null, 198 | "outputs": [] 199 | }, 200 | { 201 | "cell_type": "markdown", 202 | "source": [ 203 | "> **Step 3:** Now that you have instantiated each of your models, `fit` them using the **training_data** and **y_train**. This may take a bit of time, you are fitting 700 weak learners after all!" 204 | ], 205 | "metadata": { 206 | "id": "n7TXzMhdxGSg" 207 | } 208 | }, 209 | { 210 | "cell_type": "code", 211 | "source": [ 212 | "# Fit your BaggingClassifier to the training data\n", 213 | "\n", 214 | "\n", 215 | "# Fit your RandomForestClassifier to the training data\n", 216 | "\n", 217 | "\n", 218 | "# Fit your AdaBoostClassifier to the training data\n", 219 | "\n" 220 | ], 221 | "metadata": { 222 | "id": "CGF-u3HNxJxc" 223 | }, 224 | "execution_count": null, 225 | "outputs": [] 226 | }, 227 | { 228 | "cell_type": "markdown", 229 | "source": [ 230 | "> **Step 4:** Now that you have fit each of your models, you will use each to `predict` on the **testing_data**." 
231 | ], 232 | "metadata": { 233 | "id": "0fpduCWXxMWn" 234 | } 235 | }, 236 | { 237 | "cell_type": "code", 238 | "source": [ 239 | "# Predict using BaggingClassifier on the test data\n", 240 | "\n", 241 | "\n", 242 | "# Predict using RandomForestClassifier on the test data\n", 243 | "\n", 244 | "\n", 245 | "# Predict using AdaBoostClassifier on the test data\n", 246 | "\n" 247 | ], 248 | "metadata": { 249 | "id": "80uzQtB1xPJ7" 250 | }, 251 | "execution_count": null, 252 | "outputs": [] 253 | }, 254 | { 255 | "cell_type": "markdown", 256 | "source": [ 257 | "> **Step 5:** Now that you have made your predictions, compare your predictions to the actual values using the function below for each of your models - this will give you the `score` for how well each of your models is performing. It might also be useful to show the Naive Bayes model again here, so we can compare them all side by side." 258 | ], 259 | "metadata": { 260 | "id": "9W0KepxHxRa4" 261 | } 262 | }, 263 | { 264 | "cell_type": "code", 265 | "source": [ 266 | "def print_metrics(y_true, preds, model_name=None):\n", 267 | " '''\n", 268 | " INPUT:\n", 269 | " y_true - the y values that are actually true in the dataset (NumPy array or pandas series)\n", 270 | " preds - the predictions for those values from some model (NumPy array or pandas series)\n", 271 | " model_name - (str - optional) a name associated with the model if you would like to add it to the print statements\n", 272 | "\n", 273 | " OUTPUT:\n", 274 | " None - prints the accuracy, precision, recall, and F1 score\n", 275 | " '''\n", 276 | " if model_name == None:\n", 277 | " print('Accuracy score: ', format(accuracy_score(y_true, preds)))\n", 278 | " print('Precision score: ', format(precision_score(y_true, preds)))\n", 279 | " print('Recall score: ', format(recall_score(y_true, preds)))\n", 280 | " print('F1 score: ', format(f1_score(y_true, preds)))\n", 281 | " print('\\n\\n')\n", 282 | "\n", 283 | " else:\n", 284 | " print('Accuracy score for ' + model_name + ' :' , format(accuracy_score(y_true, preds)))\n", 285 | " print('Precision score ' + model_name + ' :', format(precision_score(y_true, preds)))\n", 286 | " print('Recall score ' + model_name + ' :', format(recall_score(y_true, preds)))\n", 287 | " print('F1 score ' + model_name + ' :', format(f1_score(y_true, preds)))\n", 288 | " print('\\n\\n')" 289 | ], 290 | "metadata": { 291 | "id": "btQZc-WhxUcQ" 292 | }, 293 | "execution_count": null, 294 | "outputs": [] 295 | }, 296 | { 297 | "cell_type": "code", 298 | "source": [ 299 | "# Print Bagging scores\n", 300 | "\n", 301 | "\n", 302 | "# Print Random Forest scores\n", 303 | "\n", 304 | "\n", 305 | "# Print AdaBoost scores\n", 306 | "\n", 307 | "\n", 308 | "# Naive Bayes Classifier scores\n", 309 | "\n" 310 | ], 311 | "metadata": { 312 | "id": "ePJhq_OkxYkJ" 313 | }, 314 | "execution_count": null, 315 | "outputs": [] 316 | }, 317 | { 318 | "cell_type": "markdown", 319 | "source": [ 320 | "### Recap\n", 321 | "\n", 322 | "Now you have seen the whole process for a few ensemble models!\n", 323 | "\n", 324 | "1. **Import** the model.\n", 325 | "2. **Instantiate** the model with the hyperparameters of interest.\n", 326 | "3. **Fit** the model to the training data.\n", 327 | "4. **Predict** on the test data.\n", 328 | "5. **Score** the model by comparing the predictions to the actual values.\n", 329 | "\n", 330 | "And that's it. 
This is a very common process for performing machine learning.\n" 331 | ], 332 | "metadata": { 333 | "id": "JXOj7OiWxXff" 334 | } 335 | } 336 | ] 337 | } -------------------------------------------------------------------------------- /Guided Projects/Predicting Credit Card Approvals/temp.txt: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /Guided Projects/temp.txt: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2023 sreent 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /Lecture Slides: 1 Machine Learning: An Introduction.md: -------------------------------------------------------------------------------- 1 | 2 | --- 3 | 4 | ### Lecture 1: Machine Learning: An Introduction 5 | **Date**: 1 February 2024 6 | **Instructor**: Tarapong Sreenuch, PhD 7 | 8 | --- 9 | 10 | ### Slide 1: Opening Quote 11 | > 克明峻德,格物致知 12 | > *"Exalt the bright virtue and explore the principles of things to attain knowledge."* 13 | > — Confucius 14 | 15 | **Intention**: Inspire a thoughtful mindset as we embark on understanding Machine Learning (ML). 16 | **Visuals**: Simple, elegant typography with a subtle background related to learning. 17 | 18 | --- 19 | 20 | ### Slide 2: Machine Learning in Daily Life 21 | **Everyday Interactions with Machine Learning** 22 | Ever wondered how streaming services seem to know what you want to watch next? Or how your email filters out spam effortlessly? These are examples of ML at work in your daily life. 23 | 24 | *We often experience ML without even realizing it.* 25 | 26 | **Intention**: Relate ML to common experiences. 27 | **Visuals**: Icons representing streaming services, email filtering, and online shopping. 28 | 29 | --- 30 | 31 | ### Slide 3: Defining Machine Learning 32 | **What is Machine Learning?** 33 | Machine Learning enables computers to learn from data, identifying patterns and making decisions with minimal human intervention. It's about turning experience into expertise. 
34 | 35 | *Think of it as teaching a friend to recognize patterns rather than giving them step-by-step instructions.* 36 | 37 | **Intention**: Offer a relatable and concise definition of ML. 38 | **Visuals**: Illustration of a person identifying patterns with assistance. 39 | 40 | --- 41 | 42 | ### Slide 4: The Power of ML in Action 43 | **The Impact of Machine Learning** 44 | ML allows us to: 45 | - **Automate Complex Tasks**: Voice recognition, medical image analysis. 46 | - **Uncover Insights**: Finding trends in large datasets. 47 | - **Personalize Experiences**: Tailored recommendations, customer service. 48 | - **Drive Innovation**: Autonomous vehicles, smart cities. 49 | 50 | *ML is transforming industries and enhancing our daily lives.* 51 | 52 | **Intention**: Highlight the significance of ML with impactful examples. 53 | **Visuals**: Icons representing various ML applications. 54 | 55 | --- 56 | 57 | ### Slide 5: Traditional Programming vs. ML 58 | **From Explicit Instructions to Learning from Data** 59 | 60 | **Traditional Programming**: 61 | - Developers define explicit rules. 62 | - Example: Coding specific logic for sorting. 63 | 64 | **Machine Learning**: 65 | - Models learn patterns from data. 66 | - Example: Recognizing spoken language by learning from audio samples. 67 | 68 | *Machine Learning shifts the paradigm from telling computers **how** to do tasks to letting them **learn** how to perform them.* 69 | 70 | **Intention**: Distinguish ML from traditional programming approaches. 71 | **Visuals**: Side-by-side comparison chart. 72 | 73 | --- 74 | 75 | ### Slide 6: Types of Machine Learning 76 | **Different Flavors of Learning** 77 | 1. **Supervised Learning**: 78 | - Learns from labeled examples. 79 | - Example: Email classification (spam or not). 80 | 81 | 2. **Unsupervised Learning**: 82 | - Finds patterns in data without labels. 83 | - Example: Customer segmentation in marketing. 84 | 85 | 3. **Reinforcement Learning**: 86 | - Learns by interacting with an environment and receiving feedback. 87 | - Example: Training AI to play games. 88 | 89 | *Each type serves different purposes, tailored to the problem at hand.* 90 | 91 | **Intention**: Introduce the three main categories of ML with practical examples. 92 | **Visuals**: Icons representing each learning type. 93 | 94 | --- 95 | 96 | ### Slide 7: Supervised Learning Overview 97 | **Supervised Learning** 98 | Involves training a model on labeled data, where the model learns to map inputs to desired outputs. 99 | 100 | - **Classification**: Predict discrete labels. 101 | - Example: Diagnosing diseases from symptoms. 102 | - **Regression**: Predict continuous values. 103 | - Example: Estimating house prices. 104 | 105 | *Supervised learning is like learning with an answer key.* 106 | 107 | **Intention**: Provide a high-level overview of supervised learning. 108 | **Visuals**: Simple diagrams illustrating classification and regression. 109 | 110 | --- 111 | 112 | ### Slide 8: Unsupervised Learning Overview 113 | **Unsupervised Learning** 114 | Finds hidden patterns in data without predefined labels. 115 | 116 | - **Clustering**: Group similar data points. 117 | - Example: Grouping articles by topic. 118 | - **Dimensionality Reduction**: Simplify data for better understanding. 119 | - Example: Visualizing high-dimensional data in 2D. 120 | 121 | *Unsupervised learning helps us uncover the unknown.* 122 | 123 | **Intention**: Introduce unsupervised learning with intuitive examples. 
124 | **Visuals**: Illustrations of clusters and dimensionality reduction. 125 | 126 | --- 127 | 128 | ### Slide 9: Reinforcement Learning Overview 129 | **Reinforcement Learning** 130 | An agent learns to make decisions by taking actions and receiving feedback from its environment. 131 | 132 | - **How it works**: 133 | - Take an action. 134 | - Get a reward or penalty. 135 | - Learn to maximize rewards over time. 136 | 137 | - **Applications**: 138 | - Game AI. 139 | - Robotics. 140 | 141 | *Reinforcement learning is about learning through trial and error.* 142 | 143 | **Intention**: Simplify the concept of reinforcement learning. 144 | **Visuals**: Diagram showing an agent interacting with an environment. 145 | 146 | --- 147 | 148 | ### Slide 10: The Machine Learning Workflow 149 | **How Do We Build ML Models?** 150 | 1. **Define the Problem**: What are we trying to solve? 151 | 2. **Collect Data**: Gather relevant data. 152 | 3. **Preprocess Data**: Clean and prepare the data. 153 | 4. **Choose a Model**: Select the right algorithm. 154 | 5. **Train the Model**: Learn from the training data. 155 | 6. **Evaluate the Model**: Test and validate performance. 156 | 7. **Deploy**: Use the model in real-world applications. 157 | 158 | *ML is a journey from data to insight.* 159 | 160 | **Intention**: Provide a concise overview of the ML process. 161 | **Visuals**: Flowchart depicting the ML process. 162 | 163 | --- 164 | 165 | ### Slide 11: Data – The Heart of Machine Learning 166 | **The Importance of Quality Data** 167 | Good models rely on good data. Data must be: 168 | - **Accurate**: Reflect the real world. 169 | - **Relevant**: Include features related to the problem. 170 | - **Clean**: Free of errors and missing values. 171 | 172 | *Quality data is the foundation of effective machine learning.* 173 | 174 | **Intention**: Emphasize the importance of data in ML. 175 | **Visuals**: Icons representing clean and organized data. 176 | 177 | --- 178 | 179 | ### Slide 12: Preprocessing the Data 180 | **Preparing Data for Learning** 181 | - **Cleaning**: Handle missing values, remove duplicates. 182 | - **Normalization**: Scale features to a common range. 183 | - **Feature Engineering**: Extract relevant features. 184 | 185 | *Garbage in, garbage out – clean data is crucial for good results.* 186 | 187 | **Intention**: Highlight key preprocessing steps. 188 | **Visuals**: Icons illustrating data cleaning and normalization. 189 | 190 | --- 191 | 192 | ### Slide 13: Feature Engineering 193 | **Crafting Features for Better Learning** 194 | Features are the inputs to your model. Effective feature engineering can significantly improve model performance. 195 | 196 | - **Feature Selection**: Choose relevant features. 197 | - **Feature Transformation**: Create new features from existing data. 198 | 199 | *Well-crafted features are key to unlocking a model's potential.* 200 | 201 | **Intention**: Discuss the importance of feature engineering. 202 | **Visuals**: Diagram showing transformation of raw data into features. 203 | 204 | --- 205 | 206 | ### Slide 14: Model Training and Evaluation 207 | **Teaching and Testing the Model** 208 | - **Training**: Teach the model using labeled data. 209 | - **Validation**: Fine-tune the model. 210 | - **Testing**: Evaluate how well the model performs on unseen data. 211 | 212 | *The goal is to build a model that generalizes well to new data.* 213 | 214 | **Intention**: Explain the training and evaluation process. 
215 | **Visuals**: Illustration of data splitting (training, validation, testing). 216 | 217 | --- 218 | 219 | ### Slide 15: Overfitting and Underfitting 220 | **Finding the Right Balance** 221 | - **Overfitting**: Model learns noise in training data. 222 | - **Underfitting**: Model is too simple to capture patterns. 223 | - **Goal**: Find a model that generalizes well to new data. 224 | 225 | *Aim for the sweet spot – not too complex, not too simple.* 226 | 227 | **Intention**: Warn about common pitfalls in modeling. 228 | **Visuals**: Graphs illustrating overfitting and underfitting. 229 | 230 | --- 231 | 232 | ### Slide 16: Bias-Variance Trade-Off 233 | **Understanding Model Errors** 234 | - **Bias**: Error due to overly simplistic assumptions. 235 | - **Variance**: Error due to sensitivity to fluctuations in training data. 236 | 237 | *Balancing bias and variance is key to building robust models.* 238 | 239 | **Intention**: Simplify the concept of the bias-variance trade-off. 240 | **Visuals**: Diagram showing the trade-off curve. 241 | 242 | --- 243 | 244 | ### Slide 17: Deployment and Maintenance 245 | **Putting the Model to Work** 246 | Deploying a model means using it in real-world applications. 247 | 248 | - **Monitoring**: Track performance. 249 | - **Maintenance**: Update the model with new data. 250 | 251 | *Deployment is not the end – it's the beginning of continuous improvement.* 252 | 253 | **Intention**: Highlight the ongoing nature of ML deployment. 254 | **Visuals**: Flowchart showing deployment and feedback loop. 255 | 256 | --- 257 | 258 | ### Slide 18: Ethical Considerations 259 | **Building Responsible AI** 260 | - **Fairness**: Avoid biases in decision-making. 261 | - **Transparency**: Explain how models make decisions. 262 | - **Privacy**: Protect user data. 263 | 264 | *Ethics in AI is about ensuring positive impact and trust.* 265 | 266 | **Intention**: Emphasize the ethical use of ML. 267 | **Visuals**: Icons representing fairness, transparency, and privacy. 268 | 269 | --- 270 | 271 | ### Slide 19 272 | 273 | : Real-World Applications 274 | **Machine Learning in Action** 275 | - **Healthcare**: Disease prediction, personalized treatment. 276 | - **Finance**: Fraud detection, investment strategies. 277 | - **Retail**: Personalized recommendations, demand forecasting. 278 | 279 | *ML is transforming industries and solving complex problems.* 280 | 281 | **Intention**: Showcase practical impacts of ML. 282 | **Visuals**: Icons representing different industries. 283 | 284 | --- 285 | 286 | ### Slide 20: Summary 287 | **Summary** 288 | Machine Learning (ML) represents a shift in how we approach problem-solving, allowing computers to learn from data and make informed decisions. Rather than explicitly programming every step, ML leverages patterns in data to create models capable of addressing complex tasks across various domains. 289 | 290 | An effective ML model balances complexity to generalize well to unseen data, avoiding the pitfalls of overfitting and underfitting. This balance is achieved by understanding the bias-variance trade-off and choosing the right algorithms and features. 291 | 292 | High-quality data and meticulous preprocessing are the foundations of successful ML. By cleaning data, engineering meaningful features, and scaling appropriately, we ensure our models are trained on robust, relevant information. 
293 | 294 | Selecting the appropriate type of learning—supervised, unsupervised, or reinforcement—guides us in choosing the right techniques for different problems. Each learning type serves specific purposes, whether predicting outcomes, finding hidden patterns, or optimizing actions. 295 | 296 | Ethical considerations in ML are paramount. Models must be built with fairness, transparency, and accountability to ensure they make positive, unbiased contributions to society. This responsibility extends to monitoring models post-deployment to adapt to new data and maintain integrity. 297 | 298 | *Grasping these core principles sets the stage for applying machine learning effectively, responsibly, and innovatively in real-world scenarios.* 299 | 300 | **Intention**: Provide a cohesive recap of key concepts covered in the lecture. 301 | **Visuals**: Subtle icons or graphics complementing each paragraph. 302 | 303 | --- 304 | -------------------------------------------------------------------------------- /Lectures/1 Machine Learning: An Introduction/Machine Learning - An Introduction.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Lectures/1 Machine Learning: An Introduction/Machine Learning - An Introduction.pdf -------------------------------------------------------------------------------- /Lectures/1 Machine Learning: An Introduction/Note: Machine Learning: An Introduction.md: -------------------------------------------------------------------------------- 1 | ## **Machine Learning: An Introduction** 2 | 3 | ### **Tesla and the Power of Bundling AI** 4 | 5 | Over a decade ago, Tesla wasn’t the only company making electric cars. In fact, others were producing hybrid and electric cars that were arguably better. So why did Tesla stand out? The answer lies in **Tesla’s Autopilot**. It wasn’t just about making electric cars; Tesla was bundling advanced AI technology with their vehicles. This bundling of AI with an electric vehicle created an irresistible package for consumers, even if Tesla's cars weren’t necessarily the best from a traditional car manufacturing perspective. 6 | 7 | Tesla’s strategy shows how **Artificial Intelligence (AI)**, particularly **Machine Learning (ML)**, has become an expected part of modern products. **Machine Learning** is the driving force behind systems like Autopilot, which allows the car to make decisions on the road by learning from vast amounts of data. Today, consumers aren’t just looking for cars—they’re looking for AI-powered experiences bundled with them. 8 | 9 | ### **What is Machine Learning?** 10 | 11 | At its core, **Machine Learning** is a technology that lowers the **cost of prediction**. Prediction is the process of filling in missing information. When you swipe your credit card, the system uses machine learning to predict whether that transaction is fraudulent. In healthcare, ML predicts whether a medical image shows a benign or malignant tumor. 12 | 13 | But machine learning isn’t just about high-tech innovations. It’s everywhere—from predicting what show you’ll watch next on Netflix to translating languages on your phone. Machine learning is helping to turn tasks that used to be non-prediction tasks into prediction tasks. 14 | 15 | ### **Why Machine Learning is Thriving Now** 16 | 17 | So, why is machine learning booming today? 
The answer lies in the availability of **data** and the decreasing **cost of computing**. Decades ago, computers were primarily used for arithmetic tasks. Now, with technologies like **GPUs**, **cloud computing**, and vast data libraries like ImageNet, machine learning has become more accessible than ever before. The commoditization of these resources has driven down costs, making machine learning a practical solution for a wide range of applications. 18 | 19 | ### **Prediction: The New Foundation of AI** 20 | 21 | Machine learning revolves around the idea of **prediction**. Take the example of self-driving cars. To navigate safely, the car must predict what a good human driver would do—whether to brake, accelerate, or steer. These predictions are made by analyzing data from previous driving experiences. 22 | 23 | Or think about language translation. It used to be a task reserved for experts who knew the rules and exceptions of a language. Now, machine learning can analyze patterns in thousands of translations and make predictions about the best translation for new sentences. This shift—turning tasks like translation and driving into prediction problems—has fundamentally changed how we use technology. 24 | 25 | ### **The Shift from Algorithms to Models** 26 | 27 | Traditionally, computers followed **algorithms**—a series of steps to solve a problem. But machine learning represents a shift towards **models**. Instead of programming every rule about a cat’s appearance, for example, we feed the computer thousands of images of cats, and the machine builds a model based on patterns it identifies. These models are approximations of reality—they’re flexible, adaptive, and powerful. 28 | 29 | For example, consider **object recognition**. Instead of creating a specific set of rules to define a "cat," machine learning builds a model by learning from data. This model is then able to predict whether a new image contains a cat by recognizing patterns it learned from previous images. This ability to generalize from data is what makes machine learning so effective in areas like facial recognition or speech recognition. 30 | 31 | ### **Handling Complexity: Infinite Scenarios** 32 | 33 | One of the reasons machine learning is so powerful is its ability to handle **complexities**. In traditional programming, you would need to create an infinite number of rules to handle every possible scenario. But with machine learning, we let the machine **learn from data**, identifying patterns and handling new situations based on what it has learned . 34 | 35 | This is similar to how humans learn. We don’t explicitly think about all the rules that differentiate cats from dogs, but through experience, we implicitly know the difference. Machines work in much the same way—they learn from experience without needing every rule spelled out. 36 | 37 | ### **Supervised and Unsupervised Learning** 38 | 39 | There are two primary types of machine learning: 40 | - **Supervised Learning**: The machine is given labeled data (features and target labels) and learns from it. For example, a system might be trained to classify emails as "spam" or "not spam" based on labeled examples. It uses this training data to predict labels for new emails. 41 | 42 | - **Unsupervised Learning**: The machine is given data without labels and must find patterns on its own. This is useful for tasks like **clustering**, where we want to group similar items together without knowing the specific categories beforehand. 
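To see the contrast concretely, here is a minimal sketch (not part of the original note; the toy blob data and the particular estimators are assumptions chosen purely for illustration). The same points are used once with their labels for supervised learning, and once without them for unsupervised clustering.

```python
# Illustrative sketch: supervised vs unsupervised learning on toy data.
# The dataset and the model choices here are assumptions for demonstration only.
from sklearn.datasets import make_blobs
from sklearn.neighbors import KNeighborsClassifier
from sklearn.cluster import KMeans

# Toy data: 200 points with 2 features, drawn from 3 groups
X, y = make_blobs(n_samples=200, centers=3, random_state=42)

# Supervised learning: the labels y are provided, and the model learns to predict them
clf = KNeighborsClassifier(n_neighbors=5)
clf.fit(X, y)
print("Supervised prediction for a new point:", clf.predict([[0.0, 0.0]]))

# Unsupervised learning: no labels are given; the model groups similar points on its own
km = KMeans(n_clusters=3, n_init=10, random_state=42)
km.fit(X)
print("Cluster assigned to the same point:", km.predict([[0.0, 0.0]]))
```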
43 | 44 | Both methods allow machines to learn and make predictions, but they differ in the amount of information provided upfront. 45 | 46 | ### **Real-World Applications of Machine Learning** 47 | 48 | Machine learning is not just a theoretical concept; it’s deeply embedded in real-world applications: 49 | - **Healthcare**: Diagnosing diseases through pattern recognition in medical images. 50 | - **Finance**: Predicting fraudulent transactions by analyzing transaction patterns. 51 | - **Retail**: Recommending products to customers based on their shopping history. 52 | 53 | These applications highlight the versatility of machine learning in solving complex problems across industries. 54 | 55 | ### **The Machine Learning Workflow** 56 | 57 | Building a machine learning model involves several key steps: 58 | 1. **Collect Data**: The first step is gathering data relevant to the problem. 59 | 2. **Extract Features**: We then select the key aspects of the data that will be used to make predictions. 60 | 3. **Train the Model**: The model is trained on a portion of the data, learning patterns that link features to outcomes. 61 | 4. **Evaluate**: The model is tested on new data to see how well it can predict outcomes it hasn’t seen before. 62 | 63 | This workflow is iterative—the model is refined as more data is collected and as its predictions are evaluated. 64 | 65 | ### **Summary** 66 | 67 | Machine learning is transforming the way we approach problem-solving. Instead of writing complex rules for every situation, we let machines learn from data and make intelligent decisions. This shift—from algorithms to models, from explicit instructions to learning from experience—is revolutionizing industries, from self-driving cars to healthcare and beyond. 68 | 69 | As machine learning continues to evolve, it will undoubtedly play an even bigger role in shaping the future, helping us solve problems that were once too complex for traditional computing. 70 | -------------------------------------------------------------------------------- /Lectures/10 Decision Tree/Decision Tree.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Lectures/10 Decision Tree/Decision Tree.pdf -------------------------------------------------------------------------------- /Lectures/10 Decision Tree/Note: Decision Tree.md: -------------------------------------------------------------------------------- 1 | # Decision Tree 2 | 3 | ### 1 Introduction 4 | 5 | Imagine you’re trying to **diagnose a problem** or **make a decision** by asking a series of questions. Each answer **narrows down the possibilities**, much like a flowchart or a game of **20 Questions**. A **Decision Tree** is simply this idea carried out in a formal, data-driven manner. 6 | 7 | In a decision tree: 8 | 9 | 1. It begins with an initial query at the **root** and branches based on the answer. 10 | 2. Each subsequent question (an **internal node**) further splits the data. 11 | 3. Eventually, we reach a **leaf node**, where the tree outputs a class. 12 | 13 | These hierarchical questions mimic how we might reason through a decision ourselves, but they do so using data-driven criteria: the tree systematically asks the *most informative* questions first and splits accordingly. 14 | 15 | **Café‐Pastry Example** 16 | 17 | Let’s bring this to life with a lighthearted scenario. 
Suppose we want to predict which pastry a customer will buy (*Muffin*, *Cake*, or *Cookie*) using two features: 18 | 19 | - **Seating** – Indoor vs Outdoor 20 | - **Drink** – Coffee vs Tea 21 | 22 | Picture a café by the park, where customers who sit outside might be more likely to grab a cookie, while those inside might favor cake or muffins. Our job is to figure out how best to split on these features. Two possible strategies might be: 23 | 24 | | Strategy | Description | 25 | |-----------------------|--------------------------------------------------------------------------------------------------------------------------------------------------| 26 | | **Drink → Seating** | Root split on **Drink**. Each branch then asks **Seating** (Indoor / Outdoor) to decide the pastry. Both branches still need a second question because drink alone doesn’t yield pure groups. | 27 | | **Seating → Drink** | Root split on **Seating**. Outdoor customers jump straight to *Cookie* (pure). Indoor customers need one more split on **Drink** (Coffee → Muffin, Tea → Cake). This ordering yields purer groups earlier. | 28 | 29 | Both approaches ultimately produce the same leaves (*Muffin*, *Cake*, *Cookie*). However, the *Seating-first* tree requires **one fewer question** on the Outdoor branch, illustrating how decision trees aim to **reduce uncertainty as quickly as possible**. 30 | 31 | 32 | 33 | ### 2 Entropy and Information Gain 34 | 35 | #### 2.1 Entropy $H$ 36 | 37 | To quantify “uncertainty,” we use a measure called **entropy**. Formally: 38 | 39 | $$ 40 | H(S) = -\sum_{c \in \text{classes}} p(c)\,\log_2 p(c), 41 | $$ 42 | 43 | where $p(c)$ is the proportion of class $c$ in set $S$. High entropy indicates a mixture of classes (uncertainty), whereas low entropy signifies mostly one class (pure node). 44 | 45 | For instance: 46 | 47 | - *50% Cookies / 50% Muffins → higher entropy.* 48 | - *100% Muffins → entropy 0.* 49 | 50 | #### 2.2 Information Gain (IG) 51 | 52 | A decision tree reduces this entropy by splitting on features that best separate the classes. The measure of that separation is **information gain**: 53 | 54 | $$ 55 | \text{IG}(S, A) = H(S) - \sum_{v \in \text{values}(A)} \frac{|S_v|}{|S|} H(S_v). 56 | $$ 57 | 58 | This calculation starts with the entropy of the parent node ($H(S)$) and subtracts the weighted entropies of the child nodes. The split that **maximizes** this IG is chosen at each step. 59 | 60 | **Revisiting the café example**: Splitting first on **Seating** sends Outdoor customers (100% Cookie) into a perfectly pure node, giving high IG. Splitting on **Drink** first, on the other hand, leaves both children mixed, and the resulting IG is lower—hence the preference for **Seating** as the top-level question. 61 | 62 | 63 | 64 | ### 3 Handling Numeric vs Categorical Features 65 | 66 | Decision trees handle different feature types in slightly different ways. 
Here’s a concise overview: 67 | 68 | | Feature type | How the tree splits | Practical note | 69 | |------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| 70 | | **Numeric / continuous** | The algorithm sorts the unique values of the feature, then tests threshold candidates at each midpoint, e.g. $(v_i + v_{i+1})/2$. Whichever threshold yields the **highest information gain** is chosen for the binary split $x_j \le \theta$. | Works out-of-the-box for any real-valued column. | 71 | | **Categorical** | A simple approach is **label encoding**: map each unique category to an integer (e.g. Red → 0, Green → 1, Blue → 2). The tree then treats the column like a numeric feature and searches for the best threshold.

*Alternative:* explicitly branch on each category (“IF color = Red → …”). | In scikit-learn, `LabelEncoder` (or `OrdinalEncoder`) is the quickest way to feed categories into a single decision tree. One-hot encoding is *not* required, because trees handle integer codes naturally. | 72 | 73 | > **Key point:** After label encoding, **the same threshold-search routine used for continuous features handles categorical ones as well**. 74 | 75 | In a practical setting, imagine a dataset for predicting whether a person will subscribe to a streaming service. You might have a numeric feature like *age* (split using a threshold such as 30.5) and a categorical feature like *favorite genre*, which can be turned into integers (e.g., Action → 0, Comedy → 1, Drama → 2, etc.) and then treated similarly by the algorithm. 76 | 77 | 78 | 79 | ### 4 Training a Decision Tree: Algorithm & Stopping Rules 80 | 81 | Building a decision tree can be broken into a sequence of steps, from selecting the root question to deciding when to stop: 82 | 83 | 1. **Root node**: Start with the entire training set. 84 | 2. **Evaluate each feature / threshold**: Compute IG. 85 | 3. **Split** on the feature with **max IG**, creating child nodes. 86 | 4. **Partition** data into these child nodes. 87 | 5. **Recurse** on each child: 88 | - Repeat steps 1–4 for its subset of the data. 89 | 6. **Stop** when: 90 | - (a) a node is pure (entropy 0), **or** 91 | - (b) no features remain, **or** 92 | - (c) a **pre-pruning rule** triggers (e.g., `max_depth`, `min_samples_split`, `min_samples_leaf`). 93 | 94 | **Overfitting vs Underfitting** 95 | 96 | While following these steps, you must watch for: 97 | 98 | | Scenario | Symptoms | Cause | Remedy | 99 | |------------------|----------------------------------|--------------------------------|--------------------------------------------------------------------------------------------------| 100 | | **Overfitting** | Train accuracy ≈ 100%, validation low | Tree too deep or many tiny leaves | Increase `min_samples_split` or `min_samples_leaf`, reduce `max_depth`, or prune the tree. | 101 | | **Underfitting** | Train & validation both low | Tree too shallow | Allow deeper splits, reduce `min_samples_split`. | 102 | 103 | In other words, you want the decision tree to balance capturing enough detail (without memorizing noise) and staying general enough to work on new data. **Cross-validation** helps you tune these hyperparameters by testing how well the model generalizes; final performance is then confirmed on a **held-out test set**. 104 | 105 | 106 | 107 | ### 5 Decision Boundaries & Interpretability 108 | 109 | Once you’ve trained a decision tree, its decision boundaries and interpretability often become big selling points for using it: 110 | 111 | - **Axis-aligned splits**: Each rule $x_j \le t$ or $x_j > t$ creates a vertical or horizontal boundary (in 2D) or a hyperplane (in higher dimensions). 112 | - **Rectangular regions**: The feature space gets chopped up into rectangles (or hyper-rectangles), where each region corresponds to a leaf node predicting a specific class. 113 | - **Transparent rules**: You can read off the *if–then* path from root to leaf to see exactly why a given decision was made. 114 | - *For example, “IF Seating = Outdoor → Cookie; ELSE IF Drink = Coffee → Muffin; ELSE → Cake.”* 115 | 116 | Thanks to these straightforward splits, **stakeholders**—whether they’re business partners, doctors, or domain experts—can trace the reasoning behind a classification. 
This transparency can be critical in high-stakes fields where understanding the rationale behind a prediction is just as important as the prediction’s accuracy. 117 | 118 | **Feature (Variable) Importance** 119 | 120 | Another interpretability advantage is that decision trees naturally provide **feature importance** scores. By summing the **information gain** a feature contributes across all the splits in which it appears, you get a measure of how pivotal that feature is overall. Features near the top of the tree often have the largest impact on predictions. 121 | 122 | 123 | 124 | ### 6 No Need for Feature Scaling 125 | 126 | Unlike some other algorithms, decision trees compare features **individually** with simple threshold checks. They do **not** rely on distances or dot products. As a result, **normalization and standardization** of features are unnecessary for tree-based methods. 127 | 128 | If you’re used to scaling features for neural networks, k-Nearest Neighbors, or SVMs, this is one less step to worry about when setting up your data for a decision tree. 129 | 130 | 131 | 132 | ### 7 Summary 133 | 134 | A decision tree essentially asks a hierarchy of data-driven questions. Each node splits on the feature that most reduces entropy, measured by **information gain**. Along the way, the algorithm checks for stopping criteria or applies pruning to avoid overfitting. Here are the key takeaways: 135 | 136 | - **Entropy** measures node impurity; **information gain** captures how much a split reduces that impurity. 137 | - Numeric features are split on an optimal **threshold**; categorical features can be **label-encoded** so the same threshold strategy applies. 138 | - Hyper-parameters (`max_depth`, `min_samples_split`, `min_samples_leaf`, etc.) help control a tree’s complexity. 139 | - **Cross-validation** is crucial for tuning those parameters and avoiding over- or underfitting. 140 | - Decision trees yield **transparent rules**, axis-aligned decision boundaries, and clear feature-importance scores. 141 | - They require **no feature scaling** and serve as the foundation for ensemble methods like Random Forests and Gradient Boosted Trees. 142 | 143 | Armed with this understanding, we can now deploy decision trees with a good sense of when they shine (interpretability, minimal preprocessing) and where we’ll need caution (their tendency to overfit if not pruned). 
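To tie these takeaways together in code, the sketch below is purely illustrative: the tiny café-style table, the column names, and the hyper-parameter values are assumptions, not part of the lecture material. It computes the root-node entropy by hand, label-encodes the categorical features, grows a small tree with the entropy criterion (so splits maximise information gain), and prints the learned rules and feature importances.

```python
# Minimal sketch of the ideas above: entropy, label-encoded categorical features,
# entropy-based splits, readable rules, and feature importances.
# The cafe data below is a made-up toy example, not a real dataset.
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier, export_text

df = pd.DataFrame({
    "Seating": ["Outdoor", "Outdoor", "Indoor", "Indoor", "Indoor", "Outdoor"],
    "Drink":   ["Coffee",  "Tea",     "Coffee", "Tea",    "Coffee", "Tea"],
    "Pastry":  ["Cookie",  "Cookie",  "Muffin", "Cake",   "Muffin", "Cookie"],
})

# Entropy of the root node: H(S) = -sum_c p(c) log2 p(c)
p = df["Pastry"].value_counts(normalize=True)
print("Root entropy:", round(-(p * np.log2(p)).sum(), 3))

# Label-encode each column so the tree can threshold on integer codes
encoders = {col: LabelEncoder().fit(df[col]) for col in df.columns}
X = pd.DataFrame({col: encoders[col].transform(df[col]) for col in ["Seating", "Drink"]})
y = encoders["Pastry"].transform(df["Pastry"])

# criterion="entropy" makes each split maximise information gain;
# max_depth acts as a simple pre-pruning rule against overfitting
tree = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
tree.fit(X, y)

print(export_text(tree, feature_names=["Seating", "Drink"]))
print("Feature importances:", dict(zip(["Seating", "Drink"], tree.feature_importances_)))
```

Note that no scaling step appears anywhere in the sketch, consistent with Section 6: the tree only compares one encoded column at a time against a threshold.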
144 | -------------------------------------------------------------------------------- /Lectures/12 Principle Component Analysis (PCA)/dd.md: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /Lectures/13 Deep Neural Networks/Deep Network and Regularisation.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Lectures/13 Deep Neural Networks/Deep Network and Regularisation.pdf -------------------------------------------------------------------------------- /Lectures/13 Deep Neural Networks/dd.md: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /Lectures/13 K-Means Clustering/Principle Component Analysis (PCA).pdf: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /Lectures/14 Neural Networks/Single-Layer Perceptron.pdf: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /Lectures/2 K-Nearest Neighbours Classification/K-Nearest Neighbours Classification.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Lectures/2 K-Nearest Neighbours Classification/K-Nearest Neighbours Classification.pdf -------------------------------------------------------------------------------- /Lectures/2 K-Nearest Neighbours Classification/Note: K-Nearest Neighbours Classification.md: -------------------------------------------------------------------------------- 1 | ## **K-Nearest Neighbours Classification** 2 | 3 | **The Bookstore Analogy** 4 | 5 | Imagine you're back in that bookstore, trying to recommend books to a new customer. You've seen many customers with different tastes, some preferring mysteries, others gravitating toward science fiction. Now, a new customer walks in, and based on their brief browsing behavior, you need to suggest a book they'll likely enjoy. You look around the store, find customers with similar browsing patterns, and recommend what they liked. This intuitive process is quite similar to how K-Nearest Neighbours (KNN) works. 6 | 7 | **What is K-Nearest Neighbours?** 8 | 9 | K-Nearest Neighbours is one of the most straightforward yet powerful classification algorithms in machine learning. Instead of building a complex model, KNN makes predictions based on the closest neighbors to the new data point in the feature space. It’s like saying, "Show me the 'k' most similar customers, and let's see what they preferred." By identifying patterns in the behavior of these neighbors, KNN predicts the most probable outcome for the new data point. 10 | 11 | **How KNN Makes Predictions** 12 | 13 | Let’s break down the bookstore scenario further. Suppose you have a customer who picks up a mix of mystery and thriller novels. You check the records of past customers who exhibited similar behavior—those who browsed a mix of similar genres. The question then is: Should you recommend more mystery novels or perhaps suggest thrillers? 
KNN would look at the 'k' most similar customers and make a recommendation based on what the majority of them enjoyed. 14 | 15 | If \( k = 1 \), KNN looks at the single most similar customer. If that customer preferred thrillers, KNN suggests thrillers. If \( k = 5 \), it looks at the five most similar customers and takes a majority vote. This process can adapt to different scenarios based on the value of \( k \), allowing the algorithm to be flexible and simple yet effective. 16 | 17 | **Choosing the Right 'k'** 18 | 19 | Selecting the right number of neighbors (\( k \)) is crucial. If \( k \) is too small, KNN might get too focused on noise in the data, leading to overfitting—like recommending a very niche book based on the preference of just one customer. If \( k \) is too large, KNN might overlook specific preferences, resulting in underfitting, where the recommendation becomes too generic. Our goal is to find that sweet spot where KNN balances specificity and generality. 20 | 21 | **The Importance of Distance** 22 | 23 | KNN is fundamentally a distance-based algorithm. The idea is simple: the closer two data points are in the feature space, the more similar they are. Imagine the bookstore floor as a grid. Customers who prefer similar genres would cluster together. When a new customer arrives, KNN calculates the distance to each existing customer on this grid—using measures like Euclidean distance. The algorithm then finds the 'k' nearest neighbors and makes a prediction based on their preferences. 24 | 25 | **Scaling Matters** 26 | 27 | But there’s a catch: not all features are created equal. If you have one feature that ranges from 1 to 1000 (say, the price of books) and another that ranges from 1 to 10 (like a rating), the distance calculations could become skewed. In KNN, this could lead to misleading results because the algorithm might give undue weight to features with larger numeric ranges. This is why scaling features to a common range is crucial before applying KNN. It ensures that each feature contributes fairly to the distance calculation. 28 | 29 | **Visualizing KNN with Decision Boundaries** 30 | 31 | One way to understand how KNN works is by visualizing its decision boundaries. Imagine plotting different genres of books on a 2D map based on their attributes—mystery, thriller, romance, etc. KNN draws boundaries between these genres, classifying a new book based on the region it falls into. As we change \( k \), these boundaries shift, becoming more fluid or rigid. When \( k \) is small, the boundaries are complex, potentially overfitting to noise in the data. When \( k \) is large, the boundaries become smoother, generalizing across broader categories. 32 | 33 | **Evaluating KNN’s Performance** 34 | 35 | KNN is intuitive and easy to implement, but how do we know if it's doing a good job? Evaluating the model’s performance on unseen data is essential. This involves splitting the dataset into training, validation, and test sets: 36 | - **Training Set**: Used to teach the model the patterns in the data. 37 | - **Validation Set**: Helps fine-tune hyperparameters like \( k \). 38 | - **Test Set**: Measures how well the model generalizes to new data. 39 | 40 | We want our KNN model to perform well on both the training and test sets, indicating that it has learned useful patterns rather than memorizing the data. 
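As a concrete illustration of this workflow (scale the features, tune \( k \) on a validation split, then confirm generalisation on a held-out test set), here is a minimal sketch. The wine dataset, the scaler, and the candidate values of \( k \) are assumptions made for the example, not choices prescribed by this note.

```python
# Illustrative sketch: scale features, choose k on a validation split,
# then check generalisation on a held-out test set.
# The wine dataset and the candidate k values are assumptions for illustration.
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

X, y = load_wine(return_X_y=True)

# Split off a test set first, then carve a validation set out of the remainder
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)
X_tr, X_val, y_tr, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=1)

# Scale features so columns with large numeric ranges do not dominate the distance
scaler = StandardScaler().fit(X_tr)
X_tr_s, X_val_s, X_test_s = (scaler.transform(X_tr), scaler.transform(X_val),
                             scaler.transform(X_test))

# Pick k by validation accuracy
best_k, best_acc = None, 0.0
for k in (1, 3, 5, 7, 9, 15):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_tr_s, y_tr)
    acc = accuracy_score(y_val, knn.predict(X_val_s))
    if acc > best_acc:
        best_k, best_acc = k, acc

# Refit with the chosen k and report performance on unseen data
final = KNeighborsClassifier(n_neighbors=best_k).fit(X_tr_s, y_tr)
print(f"chosen k = {best_k}, test accuracy = {accuracy_score(y_test, final.predict(X_test_s)):.3f}")
```

A fuller search over \( k \) would normally use cross-validation rather than a single validation split, but the single split keeps the train/validation/test roles described above easy to see.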
41 | 42 | **Bias-Variance Trade-Off in KNN** 43 | 44 | KNN provides a hands-on example of the bias-variance trade-off: 45 | - **High Bias (Underfitting)**: If \( k \) is large, KNN may overlook important patterns, treating unique preferences as noise. 46 | - **High Variance (Overfitting)**: If \( k \) is small, KNN might tailor its predictions too closely to the training data, failing to generalize. 47 | 48 | The challenge is to find a balance where KNN captures the underlying patterns without being overly sensitive to every detail in the training data. 49 | 50 | **Real-World Applications of KNN** 51 | 52 | KNN isn't just a theoretical concept. It has practical applications in various fields: 53 | - **Healthcare**: Classifying patients based on medical records to predict diseases. 54 | - **Finance**: Identifying fraudulent transactions by comparing them with known patterns. 55 | - **Retail**: Recommending products to customers by finding similar users and their preferences. 56 | 57 | Its simplicity and interpretability make KNN an attractive option for many real-world problems, especially when transparency and easy implementation are priorities. 58 | 59 | **Summary** 60 | 61 | K-Nearest Neighbours Classification offers a straightforward yet powerful approach to making predictions based on similarity. By examining the closest neighbors in the feature space, KNN can classify new data points effectively. Its success hinges on choosing the right number of neighbors (\( k \)), scaling features appropriately, and understanding the bias-variance trade-off. While it has limitations, particularly with large datasets and high-dimensional spaces, KNN serves as a foundational tool in the machine learning toolkit, helping us intuitively grasp how machines can learn from patterns in data. 62 | -------------------------------------------------------------------------------- /Lectures/3 K-Nearest Neighbours Regression/K-Nearest Neighbours Regression.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Lectures/3 K-Nearest Neighbours Regression/K-Nearest Neighbours Regression.pdf -------------------------------------------------------------------------------- /Lectures/3 K-Nearest Neighbours Regression/Note: K-Nearest Neighbours Regression.md: -------------------------------------------------------------------------------- 1 | ## **K-Nearest Neighbours Regression** 2 | 3 | **Predicting House Prices** 4 | 5 | Imagine you're tasked with estimating the price of a house. You don’t have a complex algorithm at hand, but you do have access to the prices of nearby houses that are similar in size, age, and number of rooms. Intuitively, you might guess that the price of the house in question should be similar to those of its "neighbors" in the dataset. This is the essence of K-Nearest Neighbours (KNN) Regression: making predictions based on the known outcomes of the closest examples. 6 | 7 | KNN Regression is one of the most intuitive machine learning algorithms. Unlike traditional regression models that try to fit a global equation to the entire dataset, KNN makes predictions by looking at the 'k' nearest neighbors to a new data point and averaging their house prices. It’s like asking the neighbors to estimate a house’s value based on their own experiences. Simple, yet surprisingly effective in many scenarios. 
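Before walking through the mechanics, here is roughly what averaging the closest neighbours' prices looks like in code. This is only a sketch: the tiny housing table, the min-max scaler, and the choice of k = 3 are assumptions for illustration, not part of the original note.

```python
# Minimal sketch of the idea above: predict a house price as the average price
# of the k most similar houses. The tiny dataset here is made up for illustration.
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.neighbors import KNeighborsRegressor

# Features: [size in sq ft, bedrooms, age in years]; target: price
X = np.array([[1400, 3, 20], [1600, 3, 15], [1700, 4, 30],
              [1100, 2, 40], [2000, 4, 10], [1250, 2, 35]], dtype=float)
y = np.array([240_000, 280_000, 300_000, 180_000, 360_000, 200_000], dtype=float)

# Scale features to [0, 1] so square footage does not dominate the distance
scaler = MinMaxScaler().fit(X)
X_scaled = scaler.transform(X)

# k = 3: the prediction is the average price of the three nearest neighbours
knn = KNeighborsRegressor(n_neighbors=3).fit(X_scaled, y)

new_house = scaler.transform([[1500, 3, 25]])
print("Estimated price:", knn.predict(new_house)[0])
```

With only six training houses the number itself is not meaningful; the point is the shape of the workflow: scale, fit, find the neighbours, average their prices.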
8 | 9 | **How It Works** 10 | 11 | In KNN Regression, every data point has a set of features, and we use these features to find the 'k' closest points to the one we're trying to predict. The distance metric, typically Euclidean distance, helps us determine which points are "nearest." Once we identify these neighbors, we average their target values to make our prediction. If 'k' is 3, for example, we look at the three closest points and use their average value as our prediction. 12 | 13 | But here's where it gets interesting. The choice of 'k' greatly influences the model's performance. A small 'k' captures local nuances but may be sensitive to noise, leading to overfitting. A large 'k' provides a smoother prediction by considering a broader context, which can be beneficial but might also lead to underfitting. Finding the right balance is key. 14 | 15 | **Feature Scaling and Its Importance** 16 | 17 | Consider two features: the size of a house in square feet and the number of bedrooms. The size might range from 500 to 5000 square feet, while the number of bedrooms usually varies between 1 and 5. If we don't scale these features, the model might give undue importance to the size, simply because its range is larger. In KNN, the distance metric can be heavily influenced by feature scales, so it's crucial to normalize or standardize the data to ensure fair comparison. 18 | 19 | **KNN in Action** 20 | 21 | Let's bring this to life with an example. Suppose we're predicting house prices. We have a dataset with features like size, number of rooms, and age of the house, along with the corresponding prices. When a new house comes in, KNN finds the 'k' most similar houses in terms of these features and averages their prices to give us an estimate. It's straightforward but can capture complex patterns if the right 'k' and feature scaling are applied. 22 | 23 | However, KNN Regression isn't without its limitations. It can be computationally intensive, especially with large datasets, because it requires calculating the distance to every other point. Additionally, it struggles with irrelevant features, as they can skew the distance calculations and affect the predictions. 24 | 25 | **Why KNN Regression Matters** 26 | 27 | KNN Regression offers a unique perspective in the machine learning toolkit. It's non-parametric, meaning it doesn't assume a specific form for the function we're trying to approximate. This makes it flexible, capable of modeling complex relationships in data that might be challenging for more rigid algorithms. It's like having a rule of thumb that adapts to the local context rather than applying a one-size-fits-all solution. 28 | 29 | While it may not always be the go-to method for large-scale problems due to its computational demands, KNN Regression is invaluable for its simplicity and interpretability. It encourages us to think locally, focusing on immediate surroundings to make informed predictions. This aligns with how we often approach problems in real life—by drawing on the experiences of those around us. 30 | 31 | **Summary** 32 | 33 | K-Nearest Neighbours Regression provides an intuitive approach to making predictions by leveraging local patterns in data. It avoids making strong assumptions about the data structure, offering a flexible way to model complex relationships. The choice of 'k,' the importance of feature scaling, and the consideration of distance metrics are crucial elements that shape the model's effectiveness. 
34 | 35 | While simple and easy to understand, KNN Regression does have its challenges, particularly in terms of computational efficiency and sensitivity to irrelevant features. Despite these limitations, it serves as a foundational concept in machine learning, highlighting the power of local information in making informed decisions. 36 | -------------------------------------------------------------------------------- /Lectures/4 Linear Regression/Linear Regression.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Lectures/4 Linear Regression/Linear Regression.pdf -------------------------------------------------------------------------------- /Lectures/4 Linear Regression/Note: Linear Regression.md: -------------------------------------------------------------------------------- 1 | ## **Linear Regression** 2 | 3 | **Uncovering Patterns: Correlation in Data** 4 | 5 | Imagine you frequently order food from different restaurants using an online delivery app. Over time, you’ve noticed a pattern: the farther the restaurant, the longer it takes for your food to arrive. This observation suggests a **correlation**—a relationship between two variables: the distance from the restaurant and the delivery time. However, correlation alone doesn’t tell us exactly how much delivery time increases for every additional mile of distance. To uncover this, we use **linear regression**, which allows us to quantify this relationship and make predictions. 6 | 7 | **What is Linear Regression?** 8 | 9 | Linear regression is one of the simplest yet most powerful tools in machine learning. It helps us quantify the relationship between two (or more) variables and make predictions based on that relationship. In our case, we can use past delivery data to model the relationship between **distance** (input feature) and **delivery time** (target variable). With this model, we can predict how long it will take for a new order to arrive based on the distance. 10 | 11 | Linear regression works by fitting a **straight line** through the data points that best represents the underlying trend. Mathematically, this line is expressed as: 12 | 13 | $$ 14 | y = mx + c 15 | $$ 16 | 17 | Where: 18 | - $$y$$ is the delivery time (what we want to predict), 19 | - $$x$$ is the distance (the feature), 20 | - $$m$$ is the slope of the line (how much delivery time changes with distance), 21 | - $$c$$ is the intercept (the delivery time when the distance is zero). 22 | 23 | The slope tells us how much the delivery time increases for every additional mile of distance—this makes the model interpretable and easy to apply in everyday scenarios. 24 | 25 | **From Simple Equations to Matrix Representation** 26 | 27 | As the delivery app becomes more sophisticated, you want to add other features to your prediction model, such as the **day of the week** or **traffic conditions**. Now the model can no longer be represented by a simple line. 
Instead, we move to a matrix representation to handle multiple features: 28 | 29 | $$ 30 | \mathbf{y} = \mathbf{X} \vec{w} + \epsilon 31 | $$ 32 | 33 | Where: 34 | - $$\mathbf{y}$$ represents the observed delivery times, 35 | - $$\mathbf{X}$$ contains all the features (distance, day of the week, traffic, etc.), 36 | - $$\vec{w}$$ represents the weights or coefficients that the model learns, 37 | - $$\epsilon$$ is the error term, accounting for the difference between actual and predicted delivery times. 38 | 39 | This matrix form allows us to efficiently handle multiple features and gives us a clearer way to express the model when there are several inputs. 40 | 41 | **Finding the Best Line** 42 | 43 | The goal of linear regression is to find the line (or plane, in higher dimensions) that best fits the data. But how do we know what makes a line “best”? The key lies in minimizing the **error** between the predicted and actual values. In linear regression, we use a metric called **Sum of Squared Errors (SSE)** to quantify this difference. 44 | 45 | We find the line that minimizes this error using a closed-form solution known as the **Normal Equation**: 46 | 47 | $$ 48 | \vec{w} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y} 49 | $$ 50 | 51 | The solution is where the **gradient of the error function is zero**—this gives us the coefficients that define our best-fitting line. 52 | 53 | **Understanding Error and Residuals** 54 | 55 | In linear regression, the difference between the observed value and the predicted value is called the **residual**. The goal is to minimize the sum of squared residuals (SSE) to ensure that the model captures the true relationship between the variables. A common metric used to evaluate the model is **Mean Squared Error (MSE)**, which provides an average of how far the predictions are from the actual values: 56 | 57 | $$ 58 | MSE = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 59 | $$ 60 | 61 | Where $$y_i$$ represents the actual values, and $$\hat{y}_i$$ are the predicted values. Minimizing the MSE helps the model make more accurate predictions on unseen data. 62 | 63 | **Feature Scaling and Outliers** 64 | 65 | Not all features are on the same scale. In our delivery example, distance might be measured in miles, while traffic conditions could be on a scale from 1 to 10. If we don't **scale features** to the same range, features with larger scales (like distance) can dominate the model and bias the results. To ensure that each feature contributes equally to the model, we apply feature scaling techniques, such as **min-max scaling** or **z-score normalization**. 66 | 67 | Additionally, **outliers**—such as an unusually long delivery time due to a restaurant mishap—can distort the model by pulling the regression line toward them. Identifying and handling outliers is crucial to creating a model that generalizes well to new data points. Techniques like **outlier removal** or using **robust regression** help in reducing their influence. 68 | 69 | **Interpreting the Model** 70 | 71 | One of the key advantages of linear regression is its **interpretability**. We can look at the coefficients and easily understand the relationship between each feature and the target variable. For example, in our delivery model, the coefficient for distance tells us how much delivery time increases for every additional mile. This transparency makes linear regression particularly useful in domains where understanding the underlying relationships is as important as making predictions. 
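As a concrete illustration of the equations above, here is a minimal sketch that fits the delivery-time model with the Normal Equation in NumPy. The synthetic data, and the assumed 12-minute base time and 3.5 minutes per mile used to generate it, are illustrative assumptions rather than real delivery records.

```python
# Fit linear regression via the Normal Equation: w = (X^T X)^{-1} X^T y
import numpy as np

rng = np.random.default_rng(42)
distance = rng.uniform(0.5, 10.0, size=(50, 1))                        # miles
delivery_time = 12 + 3.5 * distance + rng.normal(0, 2, size=(50, 1))   # minutes

# Prepend a column of ones so the intercept c is learned as a weight.
X = np.c_[np.ones((distance.shape[0], 1)), distance]
y = delivery_time

# Solve (X^T X) w = X^T y; np.linalg.solve is more stable than an explicit inverse.
w = np.linalg.solve(X.T @ X, X.T @ y)
print("intercept (c):", round(w[0, 0], 2), "minutes")
print("slope (m):", round(w[1, 0], 2), "minutes per extra mile")
```

With several features (distance, day of the week, traffic), only the construction of $$\mathbf{X}$$ changes; the same one-line solve recovers the full weight vector $$\vec{w}$$.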
72 | 73 | **Summary** 74 | 75 | Linear regression is a foundational technique in machine learning that allows us to model and predict relationships between variables. By finding the line of best fit, we can make accurate predictions for new data points, like estimating delivery times based on distance and other features. However, it’s essential to handle **outliers** and **scale features** properly to ensure the model's effectiveness. The solution is found by minimizing the error through the **Normal Equation**, where the **gradient** of the error function reaches zero. Despite its simplicity, linear regression is a powerful tool, especially when **interpretability** and transparency are required. 76 | 77 | -------------------------------------------------------------------------------- /Lectures/5 Gradient Descent/Gradient Descent.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Lectures/5 Gradient Descent/Gradient Descent.pdf -------------------------------------------------------------------------------- /Lectures/5 Gradient Descent/Lab: Gradient Descent from Scratch.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "include_colab_link": true 8 | }, 9 | "kernelspec": { 10 | "name": "python3", 11 | "display_name": "Python 3" 12 | }, 13 | "language_info": { 14 | "name": "python" 15 | } 16 | }, 17 | "cells": [ 18 | { 19 | "cell_type": "markdown", 20 | "metadata": { 21 | "id": "view-in-github", 22 | "colab_type": "text" 23 | }, 24 | "source": [ 25 | "\"Open" 26 | ] 27 | }, 28 | { 29 | "cell_type": "markdown", 30 | "source": [ 31 | "# Gradient Descent from Scratch" 32 | ], 33 | "metadata": { 34 | "id": "DNQ86nYu3_Sx" 35 | } 36 | }, 37 | { 38 | "cell_type": "markdown", 39 | "source": [ 40 | "### **Objective**\n", 41 | "In this lab, we’ll implement Gradient Descent from scratch. By the end, you’ll understand how gradient descent iteratively minimizes an error function by updating weights in the direction of the negative gradient.\n", 42 | "\n", 43 | "### **Lab Outline**\n", 44 | "\n", 45 | "1. Introduction to Gradient Descent\n", 46 | "2. Data Generation\n", 47 | "3. Gradient Descent Implementation\n", 48 | "4. Parameter Tuning: Learning Rate and Convergence\n", 49 | "5. Tracking Cost History and Plotting Convergence\n", 50 | "6. Comparison with `scikit-learn`’s LinearRegression" 51 | ], 52 | "metadata": { 53 | "id": "SVO2poMAr81g" 54 | } 55 | }, 56 | { 57 | "cell_type": "markdown", 58 | "source": [ 59 | "### **1. Introduction to Gradient Descent**\n", 60 | "\n", 61 | "Gradient Descent is an optimization technique used to minimize an error function by iteratively adjusting model parameters. We follow the gradient's direction to reach a minimum, ideally finding the point where the error is smallest.\n", 62 | "\n", 63 | "#### Intuition\n", 64 | "Imagine trying to find the lowest point in a valley by taking steps downhill. The direction you move each time depends on the slope. 
This is similar to Gradient Descent, where we take steps proportional to the slope (or gradient) of the function at each point.\n", 65 | "\n", 66 | "#### The Update Rule\n", 67 | "The weights in Gradient Descent are updated as follows:\n", 68 | "$$\n", 69 | "w_{new} = w_{old} - \\eta \\cdot \\nabla \\text{Error}(w)\n", 70 | "$$\n", 71 | "where:\n", 72 | "- $\\eta$ is the learning rate, controlling the step size.\n", 73 | "- $\\nabla \\text{Error}(w)$ is the gradient of the error with respect to the weights." 74 | ], 75 | "metadata": { 76 | "id": "MNaJv5QYsHsu" 77 | } 78 | }, 79 | { 80 | "cell_type": "markdown", 81 | "source": [ 82 | "### **2. Data Generation**\n", 83 | "\n", 84 | "To demonstrate gradient descent, we’ll generate a synthetic dataset for linear regression. This data will have a linear relationship with some added noise." 85 | ], 86 | "metadata": { 87 | "id": "ZHdPHw0usWUr" 88 | } 89 | }, 90 | { 91 | "cell_type": "code", 92 | "source": [ 93 | "import numpy as np\n", 94 | "import matplotlib.pyplot as plt\n", 95 | "\n", 96 | "# Generate synthetic data\n", 97 | "np.random.seed(0)\n", 98 | "X = 2 * np.random.rand(100, 1)\n", 99 | "y = 4 + 3 * X + np.random.randn(100, 1)\n", 100 | "\n", 101 | "# Add a column of ones to X for the bias term (intercept)\n", 102 | "X_b = np.c_[np.ones((X.shape[0], 1)), X]" 103 | ], 104 | "metadata": { 105 | "id": "3CabMDSXsUS-" 106 | }, 107 | "execution_count": null, 108 | "outputs": [] 109 | }, 110 | { 111 | "cell_type": "markdown", 112 | "source": [ 113 | "### **3. Gradient Descent Implementation**\n", 114 | "\n", 115 | "We’ll now implement gradient descent from scratch to find the weights that minimize the Sum of Squared Errors (SSE).\n", 116 | "\n", 117 | "#### Steps:\n", 118 | "1. Initialize the weights to zero.\n", 119 | "2. Compute the gradient of the error function with respect to each weight.\n", 120 | "3. Update the weights by moving them in the direction of the negative gradient." 121 | ], 122 | "metadata": { 123 | "id": "pozWgoiIschX" 124 | } 125 | }, 126 | { 127 | "cell_type": "code", 128 | "source": [ 129 | "# Gradient Descent function\n", 130 | "def gradient_descent(X, y, learning_rate=0.01, max_iterations=1000, tolerance=1e-6):\n", 131 | " # Step 1: Initialize parameters\n", 132 | " weights = np.zeros(X.shape[1]) # Start with zero weights\n", 133 | " history = [] # To record the cost at each iteration\n", 134 | "\n", 135 | " # Step 2: Start the optimization loop\n", 136 | " for iteration in range(max_iterations):\n", 137 | " # Step 2.1: Compute the gradient\n", 138 | " predictions = X @ weights\n", 139 | " errors = predictions - y.flatten()\n", 140 | " gradient = 2 * X.T @ errors / len(y) # Gradient of SSE with respect to weights\n", 141 | "\n", 142 | " # Step 2.2: Update weights\n", 143 | " weights -= learning_rate * gradient\n", 144 | "\n", 145 | " # Step 2.3: Calculate and record the cost (Sum of Squared Errors)\n", 146 | " cost = np.sum(errors ** 2)\n", 147 | " history.append(cost)\n", 148 | "\n", 149 | " # Step 2.4: Check for convergence\n", 150 | " if np.linalg.norm(gradient) < tolerance:\n", 151 | " print(f\"Convergence achieved at iteration {iteration}\")\n", 152 | " break\n", 153 | "\n", 154 | " # Step 3: Return the optimized weights and cost history\n", 155 | " return weights, history" 156 | ], 157 | "metadata": { 158 | "id": "7IVaLC_JsgfD" 159 | }, 160 | "execution_count": null, 161 | "outputs": [] 162 | }, 163 | { 164 | "cell_type": "markdown", 165 | "source": [ 166 | "### **4. 
Parameter Tuning: Learning Rate and Convergence**\n", 167 | "\n", 168 | "The learning rate ($\\eta$) is crucial in gradient descent. If it’s too high, the algorithm might oscillate or diverge. If it’s too low, convergence will be slow.\n", 169 | "\n", 170 | "Try different values of `learning_rate` and observe how it affects convergence." 171 | ], 172 | "metadata": { 173 | "id": "tfCwArF2slnC" 174 | } 175 | }, 176 | { 177 | "cell_type": "code", 178 | "source": [ 179 | "# Set parameters for gradient descent\n", 180 | "learning_rate = 0.01\n", 181 | "max_iterations = 1000\n", 182 | "tolerance = 1e-6\n", 183 | "\n", 184 | "# Perform gradient descent optimization\n", 185 | "weights, cost_history = gradient_descent(X_b, y, learning_rate, max_iterations, tolerance)\n", 186 | "\n", 187 | "# Print the optimized weights\n", 188 | "print(\"Optimized Weights:\", weights)" 189 | ], 190 | "metadata": { 191 | "id": "q01ln28qstLs" 192 | }, 193 | "execution_count": null, 194 | "outputs": [] 195 | }, 196 | { 197 | "cell_type": "markdown", 198 | "source": [ 199 | "### **5. Tracking Cost History and Plotting Convergence**\n", 200 | "\n", 201 | "Visualizing the cost (error) over iterations can provide insight into the convergence behavior of gradient descent. We’ll plot the cost history to observe how the cost decreases with each iteration." 202 | ], 203 | "metadata": { 204 | "id": "yXp8NSmDs473" 205 | } 206 | }, 207 | { 208 | "cell_type": "code", 209 | "source": [ 210 | "# Plot cost history to visualize convergence\n", 211 | "plt.plot(cost_history)\n", 212 | "plt.xlabel(\"Iteration\")\n", 213 | "plt.ylabel(\"Cost (SSE)\")\n", 214 | "plt.title(\"Cost Convergence over Iterations\")\n", 215 | "plt.show()" 216 | ], 217 | "metadata": { 218 | "id": "k26AWF5As8Rq" 219 | }, 220 | "execution_count": null, 221 | "outputs": [] 222 | }, 223 | { 224 | "cell_type": "markdown", 225 | "source": [ 226 | "#### Interpretation\n", 227 | "- **If the cost steadily decreases**, gradient descent is converging.\n", 228 | "- **If the cost fluctuates or increases**, try reducing the learning rate." 229 | ], 230 | "metadata": { 231 | "id": "rAuYq-5ns_mx" 232 | } 233 | }, 234 | { 235 | "cell_type": "markdown", 236 | "source": [ 237 | "### **6. Comparison with `scikit-learn`’s LinearRegression**\n", 238 | "\n", 239 | "To confirm our implementation, we’ll compare the optimized weights obtained through gradient descent with those from `scikit-learn`’s `LinearRegression`." 
240 | ], 241 | "metadata": { 242 | "id": "Znp3pHD4tBLm" 243 | } 244 | }, 245 | { 246 | "cell_type": "code", 247 | "source": [ 248 | "from sklearn.linear_model import LinearRegression\n", 249 | "\n", 250 | "# Fit scikit-learn's LinearRegression model\n", 251 | "sklearn_model = LinearRegression()\n", 252 | "sklearn_model.fit(X, y)\n", 253 | "\n", 254 | "print(\"Weights from Gradient Descent:\", weights)\n", 255 | "print(\"Weights from scikit-learn:\", [sklearn_model.intercept_[0], sklearn_model.coef_[0][0]])" 256 | ], 257 | "metadata": { 258 | "id": "W3RFOZPCtGDm" 259 | }, 260 | "execution_count": null, 261 | "outputs": [] 262 | }, 263 | { 264 | "cell_type": "markdown", 265 | "source": [ 266 | "### **Reflection and Summary**\n", 267 | "\n", 268 | "In this lab, you:\n", 269 | "- Implemented **Gradient Descent from scratch** to minimize an error function iteratively.\n", 270 | "- Experimented with different learning rates to understand their impact on convergence.\n", 271 | "- Tracked and visualized **cost history** to assess convergence.\n", 272 | "- Compared your results with `scikit-learn`’s **LinearRegression** to validate the implementation.\n", 273 | "\n", 274 | "This hands-on lab provides an understanding of how gradient descent optimizes parameters and emphasizes the importance of tuning hyperparameters like the learning rate. By building gradient descent from scratch, you’ve gained a deeper understanding of this optimization technique, applicable in various machine learning algorithms." 275 | ], 276 | "metadata": { 277 | "id": "g765RzPXtJsO" 278 | } 279 | } 280 | ] 281 | } -------------------------------------------------------------------------------- /Lectures/5 Gradient Descent/Note: Gradient Descent.md: -------------------------------------------------------------------------------- 1 | ## **Gradient Descent** 2 | 3 | **Navigating Complex Equations: A Numerical Solution** 4 | 5 | Imagine facing a complex equation that can’t be solved with standard algebra due to its many variables and nonlinear behavior. This is common in machine learning, where we frequently deal with intricate relationships within large datasets. In these cases, instead of looking for an exact solution, we rely on **numerical methods** for an approximate answer. 6 | 7 | **Gradient Descent** is one such method: an iterative approach that involves "stepping" closer to a solution by moving in the direction that minimizes our error. Picture navigating down a mountain by always taking a step in the steepest downhill direction. Each step gets us closer to the lowest point—our solution. Gradient Descent works similarly, guiding us towards the optimal set of parameters for our model by iteratively reducing the error. This iterative process is a core component of training in machine learning models. 8 | 9 | **How Gradient Descent Works** 10 | 11 | Gradient Descent minimizes the error by updating model parameters, like weights in a regression model, to iteratively find the optimal values that reduce the cost function (or error). Mathematically, we adjust each parameter based on its **gradient**—the slope of the cost function with respect to that parameter. 12 | 13 | In a **linear regression context**, our error or **cost function** could be the sum of squared errors (SSE) between predicted and actual values. The gradient of this cost function with respect to the weights tells us the direction to adjust the weights to reduce the error. 
The update rule can be expressed as: 14 | 15 | $$ 16 | \vec{w} \leftarrow \vec{w} - \text{learning rate} \times \nabla \text{SSE} 17 | $$ 18 | 19 | where: 20 | - $\vec{w}$ represents the weights (parameters) of our model. 21 | - The **learning rate** is a factor that controls the size of each step. 22 | - $\nabla \text{SSE}$ is the gradient of the error function, given by: 23 | 24 | $$ 25 | \nabla \text{SSE} = -2 \mathbf{X}^T (\mathbf{Y} - \mathbf{X} \vec{w}) 26 | $$ 27 | 28 | This update rule iteratively moves the weights closer to their optimal values by taking steps in the opposite direction of the gradient—toward the minimum error. 29 | 30 | **Learning Rate and Convergence** 31 | 32 | The **learning rate** is essential in controlling the speed and accuracy of Gradient Descent. If the learning rate is too high, we risk "overshooting" the minimum, like taking steps that are too large and missing our target. If it’s too low, the process becomes slow, requiring many iterations to converge. Choosing an appropriate learning rate involves balancing these two extremes to ensure efficient and reliable convergence. 33 | 34 | As we iterate, we stop when the error no longer decreases significantly or when we reach a maximum number of steps (iterations). This stopping criterion ensures that the model doesn’t continue training unnecessarily. 35 | 36 | **Challenges in Gradient Descent** 37 | 38 | 1. **Feature Scaling**: Since Gradient Descent relies on calculating distances and gradients, features on different scales can affect the update steps and convergence. For example, if one feature ranges from 0 to 1000 and another from 0 to 1, the larger feature may dominate. **Scaling features** to a similar range, like between 0 and 1 or by using z-score normalization, allows each feature to contribute more evenly to the model. 39 | 40 | 2. **Learning Rate Selection**: Choosing the right learning rate is critical. A too-large learning rate may cause the updates to overshoot the minimum, while a too-small learning rate can lead to slow convergence. Often, experimentation or adaptive methods (e.g., using learning rate schedules) help find an optimal balance. 41 | 42 | 3. **Local Minima**: In more complex, nonlinear functions, Gradient Descent can get “stuck” in local minima (small dips in the function) rather than finding the global minimum. However, in linear regression, the cost function is convex, meaning Gradient Descent is guaranteed to find the global minimum, making it highly effective for this type of problem. 43 | 44 | **Gradient Descent’s Flexibility Across Models** 45 | 46 | While we’ve discussed Gradient Descent in the context of **linear regression**, it’s worth noting that this method is foundational across machine learning. From **logistic regression** to complex **neural networks**, Gradient Descent serves as the underlying learning algorithm, helping models improve by minimizing error iteratively. In each case, Gradient Descent is the process that enables models to adapt and learn from data, regardless of the complexity of the model itself. 47 | 48 | **Summary** 49 | 50 | Gradient Descent is a powerful and flexible optimization method that underpins learning in machine learning models. By iteratively adjusting parameters to minimize error, it guides models toward an optimal state, whether in linear regression or advanced neural networks. 
The balance of **learning rate** and **feature scaling** are essential for effective training, ensuring Gradient Descent efficiently converges to the best solution. While the math behind Gradient Descent provides the structure, its intuitive basis as a “downhill navigation” tool makes it an essential and approachable method for solving machine learning problems. 51 | 52 | -------------------------------------------------------------------------------- /Lectures/6 Logistic Regression/Logistic Regression.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Lectures/6 Logistic Regression/Logistic Regression.pdf -------------------------------------------------------------------------------- /Lectures/6 Logistic Regression/Note: Logistic Regression.md: -------------------------------------------------------------------------------- 1 | ## **Logistic Regression** 2 | 3 | **Predicting Customer Churn: Focusing Resources with Precision** 4 | 5 | Imagine you're working for an automotive brand and need to address customer retention. Research shows that car owners tend to leave dealer services after around three years. With a limited budget, your goal is to focus retention efforts on those most likely to churn. By calculating a **churn probability score** for each owner, you can prioritize high-risk individuals, maximizing the impact of your retention budget. Logistic regression helps us calculate this probability score, making it a powerful tool for tasks where we need confidence levels to guide decisions. 6 | 7 | **What Makes Logistic Regression Different?** 8 | 9 | Logistic regression is a classification technique designed to estimate the probability of a binary outcome—such as whether a customer will churn (1) or stay (0). Unlike linear regression, which predicts a continuous outcome, logistic regression provides a **probability score** between 0 and 1, making it ideal for binary classification tasks. Linear regression would not work well here because it could give probabilities outside this range, leading to impractical interpretations. 10 | 11 | **Mapping Predictions to Probabilities with the Sigmoid Function** 12 | 13 | To convert a linear combination of input features (like vehicle age and service frequency) into a probability, logistic regression uses the **sigmoid function**. This function is central to logistic regression as it “squeezes” values into a range between 0 and 1, making them interpretable as probabilities. The sigmoid function is defined as: 14 | 15 | $$ 16 | \sigma(z) = \frac{1}{1 + e^{-z}} 17 | $$ 18 | 19 | where $$z = w \cdot x + b$$: 20 | - $w$ represents the weights for each feature, 21 | - $x$ represents the input features (e.g., vehicle age, service frequency), 22 | - $b$ is the bias term. 23 | 24 | This transformation allows us to interpret high $z$ values as probabilities close to 1 (indicating likely churn) and low $z$ values as probabilities close to 0. 25 | 26 | **Finding Optimal Weights through Maximum Likelihood Estimation** 27 | 28 | To determine the best weights $w$, logistic regression uses **Maximum Likelihood Estimation (MLE)**. This method seeks the parameters that make the observed data most probable under the model. The goal is to maximize the **log-likelihood function**, which expresses how well our model explains the actual data. 
For binary outcomes, the log-likelihood function is: 29 | 30 | $$ 31 | \log L(w) = \sum_{i=1}^{N} \left( y^{(i)} \log P(y=1|x^{(i)}) + (1 - y^{(i)}) \log (1 - P(y=1|x^{(i)})) \right) 32 | $$ 33 | 34 | where $y^{(i)}$ is the actual label (1 for churn, 0 for stay), and $P(y=1|x^{(i)})$ is the predicted probability. To find the weights that maximize this function, we use **gradient descent**. This iterative method adjusts the weights in small steps to increase the likelihood, moving closer to a model that accurately reflects the data. 35 | 36 | **Evaluating with Precision, Recall, and Cross-Validation** 37 | 38 | Logistic regression’s effectiveness can be measured using metrics like **precision** and **recall**, especially in cases where the data is imbalanced or the cost of false positives and false negatives varies: 39 | - **Precision** measures how many of our predicted positives (high churn probability) are actual churn cases. This is crucial if false positives are costly. 40 | - **Recall** measures how many actual churn cases we correctly identified, which is important if missing true positives (actual churners) is costly. 41 | 42 | **K-Fold Cross Validation** is commonly used to assess how well our model generalizes to new data. It involves splitting the data into $K$ parts, training on $K-1$ parts, and validating on the remaining part. This process repeats $K$ times, with each part serving as the validation set once. Cross-validation ensures our model is robust and minimizes overfitting, making it especially valuable when data is limited. 43 | 44 | **Summary** 45 | 46 | Logistic regression is a powerful tool for binary classification, helping predict the probability of an outcome, such as customer churn. By mapping a linear combination of input features to probabilities through the **sigmoid function**, it provides interpretable scores between 0 and 1. Logistic regression uses **Maximum Likelihood Estimation (MLE)** to find the best-fitting model parameters, often optimized through **gradient descent**. 47 | 48 | In practice, logistic regression models are evaluated with metrics like **precision** and **recall** to handle imbalanced data effectively. **K-Fold Cross Validation** further ensures that the model generalizes well, making it reliable even when data is limited. This combination of probabilistic predictions, robust evaluation, and interpretability makes logistic regression a valuable choice for applications like customer retention, fraud detection, and medical diagnostics. 49 | 50 | -------------------------------------------------------------------------------- /Lectures/7 Regularisation/Note: Regularisation.md: -------------------------------------------------------------------------------- 1 | ## **Regularization** 2 | 3 | **Addressing Model Complexity: A Symptomatic Approach to Overfitting** 4 | 5 | Imagine visiting a doctor with a fever and runny nose, symptoms of a larger viral infection. The doctor prescribes paracetamol to manage these symptoms, though it doesn’t cure the virus itself. In machine learning, overfitting is much like this: it’s a symptom of a deeper problem—high model complexity and memorization of noise. Regularization addresses these symptoms by controlling weight magnitudes, “treating” the overfitting without directly simplifying the data itself. Through this lens, regularization helps our models generalize better, even when complex patterns or noise may be present. 
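Before formalizing the idea in the next section, here is a small sketch of the effect in code. The synthetic data, the degree-9 polynomial features, and the alpha value are made-up choices for demonstration, and Ridge regression is used simply as the most direct way to show an L2 penalty shrinking weights; the same penalty term is added to the logistic-regression cost discussed below.

```python
# L2 regularization in action: compare weight magnitudes with and without a penalty.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(30, 1))
y = 0.5 * X[:, 0] ** 2 + rng.normal(0, 2, size=30)  # noisy quadratic trend

def fitted_weights(model):
    # Expand to degree-9 polynomial features, scale them, then fit the model.
    pipe = make_pipeline(PolynomialFeatures(degree=9, include_bias=False),
                         StandardScaler(), model)
    pipe.fit(X, y)
    return np.round(pipe[-1].coef_, 2)

print("No penalty:   ", fitted_weights(LinearRegression()))
print("L2, alpha=10: ", fitted_weights(Ridge(alpha=10.0)))
```

The unpenalized fit typically produces large, erratic coefficients as it chases the noise, while the L2-penalized fit keeps every weight small. In scikit-learn's `LogisticRegression`, the same knob is exposed through `C`, the inverse of the regularization strength $\lambda$.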
6 | 7 | **What is L2 Regularization?** 8 | 9 | Regularization is a technique that modifies the learning algorithm to prevent overfitting by adding a penalty for larger weight values. In L2 regularization, also known as Ridge regression, this penalty is proportional to the square of the weights. Regularization changes the cost function by combining the original error term with this penalty term, encouraging the model to keep weights smaller. The cost function for regularized logistic regression can be expressed as: 10 | 11 | $$ 12 | J(\vec{w}) = -L(\vec{w}) + \lambda \lVert\vec{w}\rVert_2^2 13 | $$ 14 | 15 | where: 16 | - $-L(\vec{w})$ is the negative log-likelihood function, representing the original error term that logistic regression seeks to minimize, 17 | - $\lambda$ is the regularization parameter, controlling the strength of the penalty, 18 | - $\vec{w}$ represents the model’s weights. 19 | 20 | In this modified cost function, the penalty term $\lambda \lVert\vec{w}\rVert_2^2$ discourages excessively large weights. The higher the value of $\lambda$, the greater the penalty for large weights, leading to more regularization. By penalizing larger weights, L2 regularization implicitly forces the model to be simpler and less prone to capturing noise, which improves its ability to generalize to new data. 21 | 22 | **Balancing the Bias-Variance Trade-off** 23 | 24 | One of the main reasons regularization is effective is that it helps balance the bias-variance trade-off. When a model has too much flexibility, it fits training data very closely (low bias) but performs poorly on new data due to high variance. L2 regularization helps by penalizing large weights, which reduces model complexity and variance. However, if $\lambda$ is set too high, the model becomes overly simplistic, increasing bias. In practice, the right $\lambda$ allows the model to fit the training data well without capturing noise, leading to better generalization on new data. 25 | 26 | To find the best $\lambda$, we can use **cross-validation**. Cross-validation involves splitting the data into multiple parts, training the model on some parts, and validating on others. By testing different values of $\lambda$ across these splits, we can identify the value that achieves the best balance between bias and variance, ensuring optimal model performance on unseen data. 27 | 28 | **Implicit Feature Selection and Model Simplicity** 29 | 30 | Regularization also acts as an implicit form of feature selection. By penalizing larger weights, it suppresses less important features, giving preference to those that contribute the most to the predictive power of the model. Features with weights close to zero are effectively “ignored” by the model, meaning that their influence on the predictions is minimized. This selective focus helps the model capture relevant patterns without being influenced by noise. 31 | 32 | For instance, in a dataset with many features, some features might not significantly contribute to the model’s performance. With L2 regularization, weights for these features can be reduced to nearly zero, effectively excluding them from the model. This “weight suppression” simplifies the model and enhances generalization by focusing on features that truly matter. 33 | 34 | **Observing Overfitting and Model Complexity** 35 | 36 | When we observe overfitting, we often see very large weight values, as the model attempts to fit every nuance in the data, including noise. 
This results in a complex model with high variance that performs well on training data but poorly on new data. Regularization addresses this by controlling weight magnitudes, thus reducing the model’s complexity and making it less likely to overfit. 37 | 38 | As a practical indicator, if a model’s weights are large and varied, it may be overfitting. By applying L2 regularization, we encourage the model to maintain smaller weights, leading to a smoother, more stable model that captures only the essential trends in the data. 39 | 40 | **Summary** 41 | 42 | L2 regularization is a powerful technique for managing model complexity and preventing overfitting. By adding a penalty to large weights in the cost function, it encourages simpler models with smaller weights, reducing variance and helping the model generalize better. Regularization implicitly controls model complexity, acting as a form of feature selection by reducing the influence of less important features. Cross-validation can be used to select an optimal value for $\lambda$, ensuring the model strikes the right balance in the bias-variance trade-off. Through these mechanisms, L2 regularization enables machine learning models to capture true patterns without being swayed by noise, making it an essential tool for robust, generalizable models. 43 | -------------------------------------------------------------------------------- /Lectures/7 Regularisation/Regularisation.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Lectures/7 Regularisation/Regularisation.pdf -------------------------------------------------------------------------------- /Lectures/8 Feature Selection/dd.md: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /Lectures/9 Naive Bayes/Naive Bayes.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Lectures/9 Naive Bayes/Naive Bayes.pdf -------------------------------------------------------------------------------- /Lectures/9 Naive Bayes/Note: Naive Bayes.md: -------------------------------------------------------------------------------- 1 | ## **Naïve Bayes** 2 | 3 | **A verdict by independent experts — why the “naïve” trick still works** 4 | 5 | > *“If ten independent experts all whisper the same name, we don’t need a courtroom drama — the verdict is obvious.”* 6 | 7 | Imagine we are detectives with a panel of hyper-specialised colleagues. 8 | One knows every brand of glove, another memorises getaway playlists, a third can identify tyre tracks at a glance. 9 | Each colleague inspects **only the clue they understand best** and quietly hands us a likelihood: 10 | “*If Suspect A did it, the red hatchback is 80 % likely; under Suspect B only 5 %.*” 11 | We multiply those private likelihoods, mix in how common each suspect is, and—click—out pops a posterior probability. 12 | No debate, no waiting: the naive-but-dead-fast way to solve a case. 13 | That, in a nutshell, is **Naïve Bayes** (NB). 14 | 15 | 16 | ### 1 Why the model is called “naïve” 17 | 18 | Bayes’ rule says 19 | 20 | $$ 21 | P(y\mid\mathbf x)\;=\;\frac{P(y)\,P(\mathbf x\mid y)}{\sum_{y'}P(y')P(\mathbf x\mid y')}. 
22 | $$ 23 | 24 | Estimating the full joint likelihood $P(\mathbf x\mid y)$ is hopeless when $\mathbf x$ has thousands of coordinates: we would need elephant-size tables of probabilities. 25 | NB waves a magic wand: **assume every feature is conditionally independent given the class**. 26 | Now the joint factorises into a neat product $\prod_i P(x^{(i)}\mid y)$. 27 | We’ve traded realism for tractability — but, like our expert panel, the math becomes a row of single-dimension look-ups and one big multiplication. 28 | That is the “naïve” move, and it turns an impossible density-estimation task into a statistics-on-a-napkin exercise. 29 | 30 | 31 | ### 2 Two flavours that cover most menus 32 | 33 | | Variant | Likelihood model | Typical domain | Gotchas | 34 | | ------------------ | ---------------------------------------------------------- | ---------------------------------------- | ----------------------------------------------------------------------------------------- | 35 | | **Multinomial NB** | Discrete word/token counts | Text, spam, topic, short messages | Laplace smoothing α is crucial; counts, not just presence, give stronger evidence. | 36 | | **Gaussian NB** | Normal $N(\mu_{ic},\sigma_{ic}^2)$ per feature i & class c | Continuous tabular data, sensor readings | Very **sensitive to outliers** which skew $\mu,\sigma$; robust scaling or trimming helps. | 37 | 38 | Both flavours finish training in a single pass: count, or compute mean/variance — that’s it. 39 | 40 | 41 | ### 3 Log-probabilities and the underflow monster 42 | 43 | NB multiplies hundreds or thousands of probabilities < 1; in raw space we soon hit numbers that underflow to zero. 44 | The remedy is to **take logs**: products turn into sums, and the biggest log-sum wins. 45 | A favourite demo is typing `0.1**1000` into Python and watching the result collapse to 0, then rerunning the same calculation in log-space with no drama. 46 | Every industrial NB implementation works in log-space; the denominator is just a *log-sum-exp* that rescales everything back to finite probabilities. 47 | 48 | 49 | ### 4 Bias, variance, and why NB wins small-data races 50 | 51 | Those independence (or Gaussian) assumptions inject a **lot** of bias, but they also smash variance. 52 | NB can squeeze useful posteriors from a few dozen examples because the number of parameters is tiny — one count or one mean per feature, not a full weight vector that must be learned by optimisation. 53 | Compare the learning curves in the slides: NB hits respectable accuracy after 50 labelled emails; logistic regression needs ten times more to overtake. 54 | So NB is the “high-bias, low-variance” detective: quick to form an opinion, a bit blind to subtle interactions, but hard to overfit. 55 | 56 | 57 | ### 5 Representations that love (or hate) NB 58 | 59 | * **Raw token counts** let Multinomial NB exploit *frequency* as well as presence — perfect for short, repetitive spam where “FREE” appears ten times. 60 | * **TF-IDF weighting** can help when common words are just noise, yet it sometimes *hurts* NB by damping precisely those class-defining words; our SMS-spam study (TF 0.949 vs TF-IDF 0.944 F1) is a textbook example. 61 | * **N-grams** patch NB’s biggest linguistic blind spot: context. The bigram “not good” or “very bad” becomes its own feature, flipping or amplifying sentiment without abandoning the model. 
62 | * **Negation/intensifier tagging** goes one step further: turn every word between “not” and the next punctuation into `NEG_word`, or merge “very good” into `very_good`, so the independence trick still works on enriched tokens. 63 | 64 | The guiding rule: give NB features whose meaning really is close to independent; then its naive multiplication becomes shockingly effective. 65 | 66 | 67 | ### **Take-home intuition** 68 | 69 | > *A panel of independent experts, each whispering a likelihood, can reach a solid consensus in the blink of an eye.* 70 | 71 | That is Naïve Bayes: multiply a handful of one-dimensional clues, divide by a normalising constant, and we are done. 72 | Its assumptions are heroic, yet on sparse, high-dimensional, low-sample problems NB often **wins the sprint** while heavier models are still tying their shoelaces. 73 | Just remember the fine print: correlated clues and wild outliers can fool the panel — but as long as the evidence really *is* roughly independent, few methods deliver more accuracy per microsecond. 74 | -------------------------------------------------------------------------------- /Lectures/AI as a Growth Enabler.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Lectures/AI as a Growth Enabler.pdf -------------------------------------------------------------------------------- /Lectures/Case Study 1: AI as a Growth Enabler/AI as a Growth Enabler.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sreent/machine-learning/f401a259a35261671e1924ca3364e207e402af04/Lectures/Case Study 1: AI as a Growth Enabler/AI as a Growth Enabler.pdf -------------------------------------------------------------------------------- /Linear Regression/17 Linear Regression in scikit-learn - Solution.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "authorship_tag": "ABX9TyMPpqCTgUnmk2kGjPmkk+94", 8 | "include_colab_link": true 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 | }, 14 | "language_info": { 15 | "name": "python" 16 | } 17 | }, 18 | "cells": [ 19 | { 20 | "cell_type": "markdown", 21 | "metadata": { 22 | "id": "view-in-github", 23 | "colab_type": "text" 24 | }, 25 | "source": [ 26 | "\"Open" 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "source": [ 32 | "# Linear Regression in scikit-learn\n", 33 | "## Linear Regression\n", 34 | "In this section, you'll use linear regression to predict life expectancy from [Body Mass Index (BMI)](https://en.wikipedia.org/wiki/Body_mass_index). Before you do that, let's go over the tools required to build this model.\n", 35 | "\n", 36 | "For your linear regression model, you'll be using scikit-learn's LinearRegression class. This class provides the function fit() to fit the model to your data.\n", 37 | "\n", 38 | "```\n", 39 | ">>> from sklearn.linear_model import LinearRegression\n", 40 | ">>> model = LinearRegression()\n", 41 | ">>> model.fit(x_values, y_values)\n", 42 | "```\n", 43 | "\n", 44 | "In the example above, the model variable is a linear regression model that has been fitted to the data x_values and y_values. Fitting the model means finding the best line that fits the training data. 
Let's make two predictions using the model's predict() function.\n", 45 | "\n", 46 | "```\n", 47 | ">>> print(model.predict([ [127], [248] ]))\n", 48 | "[[ 438.94308857, 127.14839521]]\n", 49 | "```\n", 50 | "\n", 51 | "The model returned an array of predictions, one prediction for each input array. The first input, [127], got a prediction of 438.94308857. The second input, [248], got a prediction of 127.14839521. The reason for predicting on an array like [127] and not just 127, is because you can have a model that makes a prediction using multiple features. We'll go over using multiple variables in linear regression later in this lesson. For now, let's stick to a single value." 52 | ], 53 | "metadata": { 54 | "id": "9-lIRdPsFUv4" 55 | } 56 | }, 57 | { 58 | "cell_type": "markdown", 59 | "source": [ 60 | "\n", 61 | "## Linear Regression Quiz\n", 62 | "\n", 63 | "In this quiz, you'll be working with data on the average life expectancy at birth and the average BMI for males across the world. The data comes from [Gapminder](https://www.gapminder.org/).\n", 64 | "\n", 65 | "The data file can be found under the \"bmi_and_life_expectancy.csv\" tab in the quiz below. It includes three columns, containing the following data:\n", 66 | "\n", 67 | "- **Country** – The country the person was born in.\n", 68 | "- **Life expectancy** – The average life expectancy at birth for a person in that country.\n", 69 | "- **BMI** – The mean BMI of males in that country.\n", 70 | "\n", 71 | "### You'll need to complete each of the following steps:\n", 72 | "1. **Load the data**\n", 73 | " - The data is in the file called bmi_and_life_expectancy.csv.\n", 74 | " - Use pandas read_csv to load the data into a dataframe (don't forget to import pandas!)\n", 75 | " - Assign the dataframe to the variable bmi_life_data.\n", 76 | "\n", 77 | "2. **Build a linear regression model**\n", 78 | " - Create a regression model using scikit-learn's LinearRegression and assign it to bmi_life_model.\n", 79 | " - Fit the model to the data.\n", 80 | "\n", 81 | "3. **Predict using the model**\n", 82 | " - Predict using a BMI of 21.07931 and assign it to the variable laos_life_exp." 
83 | ], 84 | "metadata": { 85 | "id": "lyoUyYZoFm89" 86 | } 87 | }, 88 | { 89 | "cell_type": "code", 90 | "source": [ 91 | "# TODO: Add import statements\n", 92 | "import pandas as pd\n", 93 | "from sklearn.linear_model import LinearRegression" 94 | ], 95 | "metadata": { 96 | "id": "yi9IGr0NHjfQ" 97 | }, 98 | "execution_count": null, 99 | "outputs": [] 100 | }, 101 | { 102 | "cell_type": "code", 103 | "source": [ 104 | "# Assign the dataframe to this variable.\n", 105 | "# TODO: Load the data\n", 106 | "\n", 107 | "# URL for our dataset, bmi_and_life_expectancy.csv\n", 108 | "URL = \"https://drive.google.com/file/d/1AOMCvdCQrPZgCQHRreVJgQu6zrdc6nVB/view?usp=sharing\"\n", 109 | "FILE_PATH = \"https://drive.google.com/uc?export=download&id=\" + URL.split(\"/\")[-2]\n", 110 | "\n", 111 | "bmi_life_data = pd.read_csv(FILE_PATH)\n", 112 | "X = bmi_life_data[['BMI']]\n", 113 | "y = bmi_life_data[['Life expectancy']]" 114 | ], 115 | "metadata": { 116 | "id": "6wzqweUnHmRV" 117 | }, 118 | "execution_count": null, 119 | "outputs": [] 120 | }, 121 | { 122 | "cell_type": "code", 123 | "source": [ 124 | "# Make and fit the linear regression model\n", 125 | "#TODO: Fit the model and Assign it to bmi_life_model\n", 126 | "bmi_life_model = LinearRegression()\n", 127 | "bmi_life_model.fit(X, y)" 128 | ], 129 | "metadata": { 130 | "id": "cMl04-lpHoRL" 131 | }, 132 | "execution_count": null, 133 | "outputs": [] 134 | }, 135 | { 136 | "cell_type": "code", 137 | "source": [ 138 | "# Make a prediction using the model\n", 139 | "# TODO: Predict life expectancy for a BMI value of 21.07931\n", 140 | "# predict() expects a 2-d array (one row per sample), not a bare scalar\n", 141 | "laos_life_exp = bmi_life_model.predict([[21.07931]])\n", 142 | "laos_life_exp" 143 | ], 144 | "metadata": { 145 | "id": "M6-mxU8BHqQs" 146 | }, 147 | "execution_count": null, 148 | "outputs": [] 149 | } 150 | ] 151 | } -------------------------------------------------------------------------------- /Linear Regression/17 Linear Regression in scikit-learn.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "authorship_tag": "ABX9TyP9WJ9DUSNj5L/TJ1XOIAfZ", 8 | "include_colab_link": true 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 | }, 14 | "language_info": { 15 | "name": "python" 16 | } 17 | }, 18 | "cells": [ 19 | { 20 | "cell_type": "markdown", 21 | "metadata": { 22 | "id": "view-in-github", 23 | "colab_type": "text" 24 | }, 25 | "source": [ 26 | "\"Open" 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "source": [ 32 | "# Linear Regression in scikit-learn\n", 33 | "## Linear Regression\n", 34 | "In this section, you'll use linear regression to predict life expectancy from [Body Mass Index (BMI)](https://en.wikipedia.org/wiki/Body_mass_index). Before you do that, let's go over the tools required to build this model.\n", 35 | "\n", 36 | "For your linear regression model, you'll be using scikit-learn's LinearRegression class. This class provides the function fit() to fit the model to your data.\n", 37 | "\n", 38 | "```\n", 39 | ">>> from sklearn.linear_model import LinearRegression\n", 40 | ">>> model = LinearRegression()\n", 41 | ">>> model.fit(x_values, y_values)\n", 42 | "```\n", 43 | "\n", 44 | "In the example above, the model variable is a linear regression model that has been fitted to the data x_values and y_values. Fitting the model means finding the best line that fits the training data. 
Let's make two predictions using the model's predict() function.\n", 45 | "\n", 46 | "```\n", 47 | ">>> print(model.predict([ [127], [248] ]))\n", 48 | "[[ 438.94308857, 127.14839521]]\n", 49 | "```\n", 50 | "\n", 51 | "The model returned an array of predictions, one prediction for each input array. The first input, [127], got a prediction of 438.94308857. The second input, [248], got a prediction of 127.14839521. The reason for predicting on an array like [127] and not just 127, is because you can have a model that makes a prediction using multiple features. We'll go over using multiple variables in linear regression later in this lesson. For now, let's stick to a single value." 52 | ], 53 | "metadata": { 54 | "id": "9-lIRdPsFUv4" 55 | } 56 | }, 57 | { 58 | "cell_type": "markdown", 59 | "source": [ 60 | "\n", 61 | "## Linear Regression Quiz\n", 62 | "\n", 63 | "In this quiz, you'll be working with data on the average life expectancy at birth and the average BMI for males across the world. The data comes from [Gapminder](https://www.gapminder.org/).\n", 64 | "\n", 65 | "The data file can be found under the \"bmi_and_life_expectancy.csv\" tab in the quiz below. It includes three columns, containing the following data:\n", 66 | "\n", 67 | "- **Country** – The country the person was born in.\n", 68 | "- **Life expectancy** – The average life expectancy at birth for a person in that country.\n", 69 | "- **BMI** – The mean BMI of males in that country.\n", 70 | "\n", 71 | "### You'll need to complete each of the following steps:\n", 72 | "1. **Load the data**\n", 73 | " - The data is in the file called bmi_and_life_expectancy.csv.\n", 74 | " - Use pandas read_csv to load the data into a dataframe (don't forget to import pandas!)\n", 75 | " - Assign the dataframe to the variable bmi_life_data.\n", 76 | "\n", 77 | "2. **Build a linear regression model**\n", 78 | " - Create a regression model using scikit-learn's LinearRegression and assign it to bmi_life_model.\n", 79 | " - Fit the model to the data.\n", 80 | "\n", 81 | "3. **Predict using the model**\n", 82 | " - Predict using a BMI of 21.07931 and assign it to the variable laos_life_exp." 
83 | ], 84 | "metadata": { 85 | "id": "lyoUyYZoFm89" 86 | } 87 | }, 88 | { 89 | "cell_type": "code", 90 | "source": [ 91 | "# TODO: Add import statements\n", 92 | "import pandas as pd\n", 93 | "from sklearn.linear_model import LinearRegression" 94 | ], 95 | "metadata": { 96 | "id": "yi9IGr0NHjfQ" 97 | }, 98 | "execution_count": null, 99 | "outputs": [] 100 | }, 101 | { 102 | "cell_type": "code", 103 | "source": [ 104 | "# Assign the dataframe to this variable.\n", 105 | "# TODO: Load the data\n", 106 | "\n", 107 | "# URL for our dataset, bmi_and_life_expectancy.csv\n", 108 | "URL = \"https://drive.google.com/file/d/1AOMCvdCQrPZgCQHRreVJgQu6zrdc6nVB/view?usp=sharing\"\n", 109 | "FILE_PATH = \"https://drive.google.com/uc?export=download&id=\" + URL.split(\"/\")[-2]\n", 110 | "\n", 111 | "bmi_life_data = pd.read_csv(FILE_PATH)\n", 112 | "X = None\n", 113 | "y = None" 114 | ], 115 | "metadata": { 116 | "id": "6wzqweUnHmRV" 117 | }, 118 | "execution_count": null, 119 | "outputs": [] 120 | }, 121 | { 122 | "cell_type": "code", 123 | "source": [ 124 | "# Make and fit the linear regression model\n", 125 | "#TODO: Fit the model and Assign it to bmi_life_model\n", 126 | "bmi_life_model = None\n", 127 | "bmi_life_model.fit(None, None)" 128 | ], 129 | "metadata": { 130 | "id": "cMl04-lpHoRL" 131 | }, 132 | "execution_count": null, 133 | "outputs": [] 134 | }, 135 | { 136 | "cell_type": "code", 137 | "source": [ 138 | "# Make a prediction using the model\n", 139 | "# TODO: Predict life expectancy for a BMI value of 21.07931\n", 140 | "laos_life_exp = None\n", 141 | "laos_life_exp" 142 | ], 143 | "metadata": { 144 | "id": "M6-mxU8BHqQs" 145 | }, 146 | "execution_count": null, 147 | "outputs": [] 148 | } 149 | ] 150 | } -------------------------------------------------------------------------------- /Linear Regression/24 Exercise: Polynomial Regression - Solution.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "authorship_tag": "ABX9TyNIM70cdBDP7gUFhHvVFFlB", 8 | "include_colab_link": true 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 | }, 14 | "language_info": { 15 | "name": "python" 16 | } 17 | }, 18 | "cells": [ 19 | { 20 | "cell_type": "markdown", 21 | "metadata": { 22 | "id": "view-in-github", 23 | "colab_type": "text" 24 | }, 25 | "source": [ 26 | "\"Open" 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "source": [ 32 | "# Exercise: Polynomial Regression\n", 33 | "\n", 34 | "Get some practice implementing polynomial regression in this exercise. In data.csv, you can see data generated for one predictor feature ('Var_X') and one outcome feature ('Var_Y'), following a non-linear trend. Use sklearn's PolynomialFeatures class to extend the predictor feature column into multiple columns with polynomial features. Play around with different degrees of polynomial and the Test Run button to see what fits best: when you think you have the best-fitting degree, press the Submit button to check your work!\n", 35 | "\n", 36 | "## Perform the following steps below:\n", 37 | "1. **Load in the data**\n", 38 | " - The data is in the file called 'data.csv'. Note that this data has a header line.\n", 39 | " - Make sure that you've split out the data into the predictor feature in X and outcome feature in y.\n", 40 | " - For X, make sure it is in a 2-d array of 20 rows by 1 column. 
You might need to use NumPy's reshape function to accomplish this.\n", 41 | "\n", 42 | "2. **Create polynomial features**\n", 43 | " - Create an instance of sklearn's PolynomialFeatures class and assign it to the variable poly_feat. Pay attention to how to set the degree of features, since that will be how the exercise is evaluated.\n", 44 | " - Create the polynomial features by using the PolynomialFeatures object's .fit_transform() method. The \"fit\" side of the method considers how many features are needed in the output, and the \"transform\" side applies those considerations to the data provided to the method as an argument. Assign the new feature matrix to the X_poly variable.\n", 45 | "\n", 46 | "3. **Build a polynomial regression model**\n", 47 | " - Create a polynomial regression model by combining sklearn's LinearRegression class with the polynomial features. Assign the fit model to poly_model.\n" 48 | ], 49 | "metadata": { 50 | "id": "IbHgrZcTTp3o" 51 | } 52 | }, 53 | { 54 | "cell_type": "code", 55 | "source": [ 56 | "# TODO: Add import statements\n", 57 | "import numpy as np\n", 58 | "import pandas as pd\n", 59 | "from sklearn.linear_model import LinearRegression\n", 60 | "from sklearn.preprocessing import PolynomialFeatures" 61 | ], 62 | "metadata": { 63 | "id": "kbWPUPDMUjrb" 64 | }, 65 | "execution_count": null, 66 | "outputs": [] 67 | }, 68 | { 69 | "cell_type": "code", 70 | "source": [ 71 | "# Assign the data to predictor and outcome variables\n", 72 | "# TODO: Load the data\n", 73 | "\n", 74 | "# URL for our dataset, poly-data.csv\n", 75 | "URL = \"https://drive.google.com/file/d/1YXjQt6QKTbBmTNB9P6VGyxo7KC1fWoUx/view?usp=sharing\"\n", 76 | "FILE_PATH = \"https://drive.google.com/uc?export=download&id=\" + URL.split(\"/\")[-2]\n", 77 | "\n", 78 | "train_data = pd.read_csv(FILE_PATH)\n", 79 | "X = train_data['Var_X'].values.reshape(-1, 1)\n", 80 | "y = train_data['Var_Y'].values" 81 | ], 82 | "metadata": { 83 | "id": "vd7Zxg4eUm7D" 84 | }, 85 | "execution_count": null, 86 | "outputs": [] 87 | }, 88 | { 89 | "cell_type": "code", 90 | "source": [ 91 | "# Create polynomial features\n", 92 | "# TODO: Create a PolynomialFeatures object, then fit and transform the\n", 93 | "# predictor feature\n", 94 | "poly_feat = PolynomialFeatures(degree = 4)\n", 95 | "X_poly = poly_feat.fit_transform(X)" 96 | ], 97 | "metadata": { 98 | "id": "R5jOWhTvUpL6" 99 | }, 100 | "execution_count": null, 101 | "outputs": [] 102 | }, 103 | { 104 | "cell_type": "code", 105 | "source": [ 106 | "# Make and fit the polynomial regression model\n", 107 | "# TODO: Create a LinearRegression object and fit it to the polynomial predictor\n", 108 | "# features\n", 109 | "poly_model = LinearRegression(fit_intercept = False).fit(X_poly, y)\n", 110 | "\n", 111 | "print(f\"Coefficients: {poly_model.coef_}\")\n", 112 | "print(f\"Intercept: {poly_model.intercept_}\")" 113 | ], 114 | "metadata": { 115 | "id": "gcj9ed6CUrj4" 116 | }, 117 | "execution_count": null, 118 | "outputs": [] 119 | } 120 | ] 121 | } -------------------------------------------------------------------------------- /Linear Regression/24 Exercise: Polynomial Regression.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "authorship_tag": "ABX9TyOED+2t7lq6rPV+98tgikNf", 8 | "include_colab_link": true 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 | }, 14 | 
"language_info": { 15 | "name": "python" 16 | } 17 | }, 18 | "cells": [ 19 | { 20 | "cell_type": "markdown", 21 | "metadata": { 22 | "id": "view-in-github", 23 | "colab_type": "text" 24 | }, 25 | "source": [ 26 | "\"Open" 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "source": [ 32 | "# Exercise: Polynomial Regression\n", 33 | "\n", 34 | "Get some practice implementing polynomial regression in this exercise. In data.csv, you can see data generated for one predictor feature ('Var_X') and one outcome feature ('Var_Y'), following a non-linear trend. Use sklearn's PolynomialFeatures class to extend the predictor feature column into multiple columns with polynomial features. Play around with different degrees of polynomial and the Test Run button to see what fits best: when you think you have the best-fitting degree, press the Submit button to check your work!\n", 35 | "\n", 36 | "## Perform the following steps below:\n", 37 | "1. **Load in the data**\n", 38 | " - The data is in the file called 'data.csv'. Note that this data has a header line.\n", 39 | " - Make sure that you've split out the data into the predictor feature in X and outcome feature in y.\n", 40 | " - For X, make sure it is in a 2-d array of 20 rows by 1 column. You might need to use NumPy's reshape function to accomplish this.\n", 41 | "\n", 42 | "2. **Create polynomial features**\n", 43 | " - Create an instance of sklearn's PolynomialFeatures class and assign it to the variable poly_feat. Pay attention to how to set the degree of features, since that will be how the exercise is evaluated.\n", 44 | " - Create the polynomial features by using the PolynomialFeatures object's .fit_transform() method. The \"fit\" side of the method considers how many features are needed in the output, and the \"transform\" side applies those considerations to the data provided to the method as an argument. Assign the new feature matrix to the X_poly variable.\n", 45 | "\n", 46 | "3. **Build a polynomial regression model**\n", 47 | " - Create a polynomial regression model by combining sklearn's LinearRegression class with the polynomial features. 
Assign the fit model to poly_model.\n" 48 | ], 49 | "metadata": { 50 | "id": "IbHgrZcTTp3o" 51 | } 52 | }, 53 | { 54 | "cell_type": "code", 55 | "source": [ 56 | "# TODO: Add import statements\n", 57 | "import numpy as np\n", 58 | "import pandas as pd\n", 59 | "from sklearn.linear_model import LinearRegression\n", 60 | "from sklearn.preprocessing import PolynomialFeatures" 61 | ], 62 | "metadata": { 63 | "id": "kbWPUPDMUjrb" 64 | }, 65 | "execution_count": null, 66 | "outputs": [] 67 | }, 68 | { 69 | "cell_type": "code", 70 | "source": [ 71 | "# Assign the data to predictor and outcome variables\n", 72 | "# TODO: Load the data\n", 73 | "\n", 74 | "# URL for our dataset, data.csv\n", 75 | "URL = \"https://drive.google.com/file/d/1tWr6TuxJnLn5XzCP7Q5y1f0n522KGTsO/view?usp=sharing\"\n", 76 | "FILE_PATH = \"https://drive.google.com/uc?export=download&id=\" + URL.split(\"/\")[-2]\n", 77 | "\n", 78 | "train_data = pd.read_csv(FILE_PATH)\n", 79 | "X = None\n", 80 | "y = None" 81 | ], 82 | "metadata": { 83 | "id": "vd7Zxg4eUm7D" 84 | }, 85 | "execution_count": null, 86 | "outputs": [] 87 | }, 88 | { 89 | "cell_type": "code", 90 | "source": [ 91 | "# Create polynomial features\n", 92 | "# TODO: Create a PolynomialFeatures object, then fit and transform the\n", 93 | "# predictor feature\n", 94 | "poly_feat = None\n", 95 | "X_poly = None" 96 | ], 97 | "metadata": { 98 | "id": "R5jOWhTvUpL6" 99 | }, 100 | "execution_count": null, 101 | "outputs": [] 102 | }, 103 | { 104 | "cell_type": "code", 105 | "source": [ 106 | "# Make and fit the polynomial regression model\n", 107 | "# TODO: Create a LinearRegression object and fit it to the polynomial predictor\n", 108 | "# features\n", 109 | "poly_model = None\n", 110 | "\n", 111 | "print(f\"Coefficients: {poly_model.coef_}\")\n", 112 | "print(f\"Intercept: {poly_model.intercept_}\")" 113 | ], 114 | "metadata": { 115 | "id": "gcj9ed6CUrj4" 116 | }, 117 | "execution_count": null, 118 | "outputs": [] 119 | } 120 | ] 121 | } -------------------------------------------------------------------------------- /Linear Regression/26 Exercise: Regularization - Solution.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "authorship_tag": "ABX9TyN9Frf7U0wo8p1PCU2duRxx", 8 | "include_colab_link": true 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 | }, 14 | "language_info": { 15 | "name": "python" 16 | } 17 | }, 18 | "cells": [ 19 | { 20 | "cell_type": "markdown", 21 | "metadata": { 22 | "id": "view-in-github", 23 | "colab_type": "text" 24 | }, 25 | "source": [ 26 | "\"Open" 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "source": [ 32 | "# Regularization Exercise\n", 33 | "\n", 34 | "Perhaps it's not too surprising at this point, but there are classes in sklearn that will help you perform regularization with your linear regression. You'll get practice with implementing that in this exercise. In this assignment's data.csv, you'll find data for a bunch of points including six predictor variables and one outcome variable. Use sklearn's Lasso class to fit a linear regression model to the data, while also using L1 regularization to control for model complexity.\n", 35 | "\n", 36 | "# Perform the following steps:\n", 37 | "1. **Load in the data**\n", 38 | " - The data is in the file called 'data.csv'. 
Note that there's no header row on this file.\n", 39 | " - Split the data so that the six predictor features (first six columns) are stored in X, and the outcome feature (last column) is stored in y.\n", 40 | "2. **Fit data using linear regression with Lasso regularization**\n", 41 | " - Create an instance of sklearn's Lasso class and assign it to the variable lasso_reg. You don't need to set any parameter values: use the default values for the quiz.\n", 42 | " - Use the Lasso object's .fit() method to fit the regression model onto the data.\n", 43 | "\n", 44 | "3. **Inspect the coefficients of the regression model**\n", 45 | " - Obtain the coefficients of the fit regression model using the .coef_ attribute of the Lasso object. Store this in the reg_coef variable: the coefficients will be printed out, and you will use your observations to answer the question at the bottom of the page." 46 | ], 47 | "metadata": { 48 | "id": "pCrXQPZkVafV" 49 | } 50 | }, 51 | { 52 | "cell_type": "code", 53 | "source": [ 54 | "# TODO: Add import statements\n", 55 | "import numpy as np\n", 56 | "import pandas as pd\n", 57 | "from sklearn.linear_model import Lasso" 58 | ], 59 | "metadata": { 60 | "id": "csiLxtHcWC85" 61 | }, 62 | "execution_count": null, 63 | "outputs": [] 64 | }, 65 | { 66 | "cell_type": "code", 67 | "source": [ 68 | "# Assign the data to predictor and outcome variables\n", 69 | "# TODO: Load the data data.csv\n", 70 | "# URL for our dataset, data.csv\n", 71 | "URL = \"https://drive.google.com/file/d/1UfRe1oUjr3qQhl_mKA31X4-jtgYmvu45/view?usp=sharing\"\n", 72 | "FILE_PATH = \"https://drive.google.com/uc?export=download&id=\" + URL.split(\"/\")[-2]\n", 73 | "\n", 74 | "train_data = pd.read_csv(FILE_PATH, header = None)\n", 75 | "X = train_data.iloc[:,:-1]\n", 76 | "y = train_data.iloc[:,-1]" 77 | ], 78 | "metadata": { 79 | "id": "uTOA6Si7WGd3" 80 | }, 81 | "execution_count": null, 82 | "outputs": [] 83 | }, 84 | { 85 | "cell_type": "code", 86 | "source": [ 87 | "# TODO: Create the linear regression model with lasso regularization.\n", 88 | "lasso_reg = Lasso()\n", 89 | "\n", 90 | "# TODO: Fit the model.\n", 91 | "lasso_reg.fit(X, y)\n", 92 | "\n", 93 | "# TODO: Retrieve and print out the coefficients from the regression model.\n", 94 | "reg_coef = lasso_reg.coef_\n", 95 | "reg_coef" 96 | ], 97 | "metadata": { 98 | "id": "d8yE0SZuWIqI" 99 | }, 100 | "execution_count": null, 101 | "outputs": [] 102 | }, 103 | { 104 | "cell_type": "markdown", 105 | "source": [ 106 | "## Quiz Question\n", 107 | "\n", 108 | "For which of the predictor features has the lasso regularization step zeroed the corresponding coefficient?\n", 109 | "\n", 110 | "- Column X1\n", 111 | "- Column X2\n", 112 | "- Column X3\n", 113 | "- Column X4\n", 114 | "- Column X5\n", 115 | "- Column X6" 116 | ], 117 | "metadata": { 118 | "id": "uMYtrkWbWNEz" 119 | } 120 | }, 121 | { 122 | "cell_type": "code", 123 | "source": [], 124 | "metadata": { 125 | "id": "lvESbX0kWYeG" 126 | }, 127 | "execution_count": null, 128 | "outputs": [] 129 | } 130 | ] 131 | } -------------------------------------------------------------------------------- /Linear Regression/26 Exercise: Regularization.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "authorship_tag": "ABX9TyPuTKQQIPDhHZpedyHVyq15", 8 | "include_colab_link": true 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 
| }, 14 | "language_info": { 15 | "name": "python" 16 | } 17 | }, 18 | "cells": [ 19 | { 20 | "cell_type": "markdown", 21 | "metadata": { 22 | "id": "view-in-github", 23 | "colab_type": "text" 24 | }, 25 | "source": [ 26 | "\"Open" 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "source": [ 32 | "# Regularization Exercise\n", 33 | "\n", 34 | "Perhaps it's not too surprising at this point, but there are classes in sklearn that will help you perform regularization with your linear regression. You'll get practice with implementing that in this exercise. In this assignment's data.csv, you'll find data for a bunch of points including six predictor variables and one outcome variable. Use sklearn's Lasso class to fit a linear regression model to the data, while also using L1 regularization to control for model complexity.\n", 35 | "\n", 36 | "# Perform the following steps:\n", 37 | "1. **Load in the data**\n", 38 | " - The data is in the file called 'data.csv'. Note that there's no header row on this file.\n", 39 | " - Split the data so that the six predictor features (first six columns) are stored in X, and the outcome feature (last column) is stored in y.\n", 40 | "2. **Fit data using linear regression with Lasso regularization**\n", 41 | " - Create an instance of sklearn's Lasso class and assign it to the variable lasso_reg. You don't need to set any parameter values: use the default values for the quiz.\n", 42 | " - Use the Lasso object's .fit() method to fit the regression model onto the data.\n", 43 | "\n", 44 | "3. **Inspect the coefficients of the regression model**\n", 45 | " - Obtain the coefficients of the fit regression model using the .coef_ attribute of the Lasso object. Store this in the reg_coef variable: the coefficients will be printed out, and you will use your observations to answer the question at the bottom of the page." 
46 | ], 47 | "metadata": { 48 | "id": "pCrXQPZkVafV" 49 | } 50 | }, 51 | { 52 | "cell_type": "code", 53 | "source": [ 54 | "# TODO: Add import statements\n", 55 | "import numpy as np\n", 56 | "import pandas as pd\n", 57 | "from sklearn.linear_model import Lasso" 58 | ], 59 | "metadata": { 60 | "id": "csiLxtHcWC85" 61 | }, 62 | "execution_count": null, 63 | "outputs": [] 64 | }, 65 | { 66 | "cell_type": "code", 67 | "source": [ 68 | "# Assign the data to predictor and outcome variables\n", 69 | "# TODO: Load the data\n", 70 | "# URL for our dataset, data.csv\n", 71 | "URL = \"https://drive.google.com/file/d/1UfRe1oUjr3qQhl_mKA31X4-jtgYmvu45/view?usp=sharing\"\n", 72 | "FILE_PATH = \"https://drive.google.com/uc?export=download&id=\" + URL.split(\"/\")[-2]\n", 73 | "\n", 74 | "train_data = pd.read_csv(FILE_PATH, header = None)\n", 75 | "X = None\n", 76 | "y = None" 77 | ], 78 | "metadata": { 79 | "id": "uTOA6Si7WGd3" 80 | }, 81 | "execution_count": null, 82 | "outputs": [] 83 | }, 84 | { 85 | "cell_type": "code", 86 | "source": [ 87 | "# TODO: Create the linear regression model with lasso regularization.\n", 88 | "lasso_reg = None\n", 89 | "\n", 90 | "# TODO: Fit the model.\n", 91 | "lasso_reg.fit(None, None)\n", 92 | "\n", 93 | "# TODO: Retrieve and print out the coefficients from the regression model.\n", 94 | "reg_coef = None\n", 95 | "reg_coef" 96 | ], 97 | "metadata": { 98 | "id": "d8yE0SZuWIqI" 99 | }, 100 | "execution_count": null, 101 | "outputs": [] 102 | }, 103 | { 104 | "cell_type": "markdown", 105 | "source": [ 106 | "## Quiz Question\n", 107 | "\n", 108 | "For which of the predictor features has the lasso regularization step zeroed the corresponding coefficient?\n", 109 | "\n", 110 | "- Column X1\n", 111 | "- Column X2\n", 112 | "- Column X3\n", 113 | "- Column X4\n", 114 | "- Column X5\n", 115 | "- Column X6" 116 | ], 117 | "metadata": { 118 | "id": "uMYtrkWbWNEz" 119 | } 120 | }, 121 | { 122 | "cell_type": "code", 123 | "source": [], 124 | "metadata": { 125 | "id": "lvESbX0kWYeG" 126 | }, 127 | "execution_count": null, 128 | "outputs": [] 129 | } 130 | ] 131 | } -------------------------------------------------------------------------------- /Linear Regression/8 Exercise-Quiz: Absolute and Square Trick - Solution.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "authorship_tag": "ABX9TyPmacitlDV4/6deBuQKSqm1", 8 | "include_colab_link": true 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 | }, 14 | "language_info": { 15 | "name": "python" 16 | } 17 | }, 18 | "cells": [ 19 | { 20 | "cell_type": "markdown", 21 | "metadata": { 22 | "id": "view-in-github", 23 | "colab_type": "text" 24 | }, 25 | "source": [ 26 | "\"Open" 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "source": [ 32 | "# Exercise-Quiz: Absolute and Square Trick" 33 | ], 34 | "metadata": { 35 | "id": "rsxCzehma4pM" 36 | } 37 | }, 38 | { 39 | "cell_type": "markdown", 40 | "source": [ 41 | "## Quiz for Absolute Trick\n", 42 | "\n", 43 | "Let's say that we have a line whose equation is **y = -0.6x + 4**. 
For the point **(x,y) = (-5, 3)**, apply the **absolute trick** to get the new equation for the line, using a learning rate of $alpha=0.1$.\n", 44 | "\n", 45 | "Report your answer in the form **y = w_1x + w_2**, substituting appropriate values for **w_1** and **w_2**.\n", 46 | "\n", 47 | "Note: $y = (w_1+p\\alpha)x + (w_2+α)$" 48 | ], 49 | "metadata": { 50 | "id": "apgoN00va7yK" 51 | } 52 | }, 53 | { 54 | "cell_type": "code", 55 | "source": [ 56 | "The new equation of the line should be y = −0.1x+3.9" 57 | ], 58 | "metadata": { 59 | "id": "DkeIPKgLbVat" 60 | }, 61 | "execution_count": null, 62 | "outputs": [] 63 | }, 64 | { 65 | "cell_type": "markdown", 66 | "source": [ 67 | "## Quiz for Square Trick\n", 68 | "\n", 69 | "Let's say that we have a line whose equation is **y = -0.6x + 4**. For the point **(x,y) = (-5, 3)**, apply the **square trick** to get the new equation for the line, using a learning rate of $alpha=0.1$.\n", 70 | "\n", 71 | "Report your answer in the form **y = w_1x + w_2**, substituting appropriate values for **w_1** and **w_2**.\n", 72 | "\n", 73 | "Note: $y = (w_1 + p(q-q^{'})\\alpha)x + (w_2+(q-q^{'})α)$" 74 | ], 75 | "metadata": { 76 | "id": "Po0LzSjHa8Gc" 77 | } 78 | }, 79 | { 80 | "cell_type": "code", 81 | "source": [ 82 | "The new equation of the line should be y=−0.4x+3.96" 83 | ], 84 | "metadata": { 85 | "id": "9ufH8QIzbkwv" 86 | }, 87 | "execution_count": null, 88 | "outputs": [] 89 | } 90 | ] 91 | } -------------------------------------------------------------------------------- /Linear Regression/8 Exercise-Quiz: Absolute and Square Trick.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "authorship_tag": "ABX9TyPzMyF5wO/r9YEPUC+u0BXU", 8 | "include_colab_link": true 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 | }, 14 | "language_info": { 15 | "name": "python" 16 | } 17 | }, 18 | "cells": [ 19 | { 20 | "cell_type": "markdown", 21 | "metadata": { 22 | "id": "view-in-github", 23 | "colab_type": "text" 24 | }, 25 | "source": [ 26 | "\"Open" 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "source": [ 32 | "# Exercise-Quiz: Absolute and Square Trick" 33 | ], 34 | "metadata": { 35 | "id": "rsxCzehma4pM" 36 | } 37 | }, 38 | { 39 | "cell_type": "markdown", 40 | "source": [ 41 | "## Quiz for Absolute Trick\n", 42 | "\n", 43 | "Let's say that we have a line whose equation is **y = -0.6x + 4**. For the point **(x,y) = (-5, 3)**, apply the **absolute trick** to get the new equation for the line, using a learning rate of $alpha=0.1$.\n", 44 | "\n", 45 | "Report your answer in the form **y = w_1x + w_2**, substituting appropriate values for **w_1** and **w_2**.\n", 46 | "\n", 47 | "Note: $y = (w_1+p\\alpha)x + (w_2+α)$" 48 | ], 49 | "metadata": { 50 | "id": "apgoN00va7yK" 51 | } 52 | }, 53 | { 54 | "cell_type": "code", 55 | "source": [], 56 | "metadata": { 57 | "id": "DkeIPKgLbVat" 58 | }, 59 | "execution_count": null, 60 | "outputs": [] 61 | }, 62 | { 63 | "cell_type": "markdown", 64 | "source": [ 65 | "## Quiz for Square Trick\n", 66 | "\n", 67 | "Let's say that we have a line whose equation is **y = -0.6x + 4**. 
For the point **(x,y) = (-5, 3)**, apply the **square trick** to get the new equation for the line, using a learning rate of $alpha=0.1$.\n", 68 | "\n", 69 | "Report your answer in the form **y = w_1x + w_2**, substituting appropriate values for **w_1** and **w_2**.\n", 70 | "\n", 71 | "Note: $y = (w_1 + p(q-q^{'})\\alpha)x + (w_2+(q-q^{'})α)$" 72 | ], 73 | "metadata": { 74 | "id": "Po0LzSjHa8Gc" 75 | } 76 | }, 77 | { 78 | "cell_type": "code", 79 | "source": [], 80 | "metadata": { 81 | "id": "9ufH8QIzbkwv" 82 | }, 83 | "execution_count": null, 84 | "outputs": [] 85 | } 86 | ] 87 | } -------------------------------------------------------------------------------- /Linear Regression/data.csv: -------------------------------------------------------------------------------- 1 | 1.25664,2.04978,-6.23640,4.71926,-4.26931,0.20590,12.31798 2 | -3.89012,-0.37511,6.14979,4.94585,-3.57844,0.00640,23.67628 3 | 5.09784,0.98120,-0.29939,5.85805,0.28297,-0.20626,-1.53459 4 | 0.39034,-3.06861,-5.63488,6.43941,0.39256,-0.07084,-24.68670 5 | 5.84727,-0.15922,11.41246,7.52165,1.69886,0.29022,17.54122 6 | -2.86202,-0.84337,-1.08165,0.67115,-2.48911,0.52328,9.39789 7 | -7.09328,-0.07233,6.76632,13.06072,0.12876,-0.01048,11.73565 8 | -7.17614,0.62875,-2.89924,-5.21458,-2.70344,-0.22035,4.42482 9 | 8.67430,2.09933,-11.23591,-5.99532,-2.79770,-0.08710,-5.94615 10 | -6.03324,-4.16724,2.42063,-3.61827,1.96815,0.17723,-13.11848 11 | 8.67485,1.48271,-1.31205,-1.81154,2.67940,0.04803,-9.25647 12 | 4.36248,-2.69788,-4.60562,-0.12849,3.40617,-0.07841,-29.94048 13 | 9.97205,-0.61515,2.63039,2.81044,5.68249,-0.04495,-20.46775 14 | -1.44556,0.18337,4.61021,-2.54824,0.86388,0.17696,7.12822 15 | -3.90381,0.53243,2.83416,-5.42397,-0.06367,-0.22810,6.05628 16 | -12.39824,-1.54269,-2.66748,10.82084,5.92054,0.13415,-32.91328 17 | 5.75911,-0.82222,10.24701,0.33635,0.26025,-0.02588,17.75036 18 | -7.12657,3.28707,-0.22508,13.42902,2.16708,-0.09153,-2.80277 19 | 7.22736,1.27122,0.99188,-8.87118,-6.86533,0.09410,33.98791 20 | -10.31393,2.23819,-7.87166,-3.44388,-1.43267,-0.07893,-3.18407 21 | -8.25971,-0.15799,-1.81740,1.12972,4.24165,-0.01607,-20.57366 22 | 13.37454,-0.91051,4.61334,0.93989,4.81350,-0.07428,-12.66661 23 | 1.49973,-0.50929,-2.66670,-1.28560,-0.18299,-0.00552,-6.56370 24 | -10.46766,0.73077,3.93791,-1.73489,-3.26768,0.02366,23.19621 25 | -1.15898,3.14709,-4.73329,13.61355,-3.87487,-0.14112,13.89143 26 | 4.42275,-2.09867,3.06395,-0.45331,-2.07717,0.22815,10.29282 27 | -3.34113,-0.31138,4.49844,-2.32619,-2.95757,-0.00793,21.21512 28 | -1.85433,-1.32509,8.06274,12.75080,-0.89005,-0.04312,14.54248 29 | 0.85474,-0.50002,-3.52152,-4.30405,4.13943,-0.02834,-24.77918 30 | 0.33271,-5.28025,-4.95832,22.48546,4.95051,0.17153,-45.01710 31 | -0.07308,0.51247,-1.38120,7.86552,3.31641,0.06808,-12.63583 32 | 2.99294,2.85192,5.51751,8.53749,4.30806,-0.17462,0.84415 33 | 1.41135,-1.01899,2.27500,5.27479,-4.90004,0.19508,23.54972 34 | 3.84816,-0.66249,-1.35364,16.51379,0.32115,0.41051,-2.28650 35 | 3.30223,0.23152,-2.16852,0.75257,-0.05749,-0.03427,-4.22022 36 | -6.12524,-2.56204,0.79878,-3.36284,1.00396,0.06219,-9.10749 37 | -7.47524,1.31401,-3.30847,4.83057,1.00104,-0.19851,-7.69059 38 | 5.84884,-0.53504,-0.19543,10.27451,6.98704,0.22706,-29.21246 39 | 6.44377,0.47687,-0.08731,22.88008,-2.86604,0.03142,10.90274 40 | 6.35366,-2.04444,1.98872,-1.45189,-1.24062,0.23626,4.62178 41 | 6.85563,-0.94543,5.16637,2.85611,4.64812,0.29535,-7.83647 42 | 1.61758,1.31067,-2.16795,8.07492,-0.17166,-0.10273,0.06922 43 | 
3.80137,1.02276,-3.15429,6.09774,3.18885,-0.00163,-16.11486 44 | -6.81855,-0.15776,-10.69117,8.07818,4.14656,0.10691,-38.47710 45 | -6.43852,4.30120,2.63923,-1.98297,-0.89599,-0.08174,20.77790 46 | -2.35292,1.26425,-6.80877,3.31220,-6.17515,-0.04764,14.92507 47 | 9.13580,-1.21425,1.17227,-6.33648,-0.85276,-0.13366,-0.17285 48 | -3.02986,-0.48694,0.24329,-0.38830,-4.70410,-0.18065,15.95300 49 | 3.27244,2.22393,-1.96640,17.53694,1.62378,0.11539,-4.29743 50 | -4.44346,-1.96429,0.22209,15.29785,-1.98503,0.40131,4.07647 51 | -2.61294,-0.24905,-4.02974,-23.82024,-5.94171,-0.04932,16.50504 52 | 3.65962,1.69832,0.78025,9.88639,-1.61555,-0.18570,9.99506 53 | 2.22893,-4.62231,-3.33440,0.07179,0.21983,0.14348,-19.94698 54 | -5.43092,1.39655,-2.79175,0.16622,-2.38112,-0.09009,6.49039 55 | -5.88117,-3.04210,-0.87931,3.96197,-1.01125,0.08132,-6.01714 56 | 0.51401,-0.30742,6.01407,-6.85848,-3.61343,-0.15710,24.56965 57 | 4.45547,2.34283,0.98094,-4.66298,-3.79507,0.37084,27.19791 58 | 0.05320,0.27458,6.95838,7.50119,-5.50256,0.06913,36.21698 59 | 4.72057,0.17165,4.83822,-1.03917,4.11211,-0.14773,-6.32623 60 | -11.60674,-1.15594,-10.23150,0.49843,0.32477,-0.14543,-28.54003 61 | -7.55406,0.45765,10.67537,-15.12397,3.49680,0.20350,11.97581 62 | -1.73618,-1.56867,3.98355,-5.16723,-1.20911,0.19377,9.55247 63 | 2.01963,-1.12612,1.16531,-2.71553,-5.39782,0.01086,21.83478 64 | -1.68542,-1.08901,-3.55426,3.14201,0.82668,0.04372,-13.11204 65 | -3.09104,-0.23295,-5.62436,-3.03831,0.77772,0.02000,-14.74251 66 | -3.87717,0.74098,-2.88109,-2.88103,3.36945,-0.30445,-18.44363 67 | -0.42754,-0.42819,5.02998,-3.45859,-4.21739,0.25281,29.20439 68 | 8.31292,2.30543,-1.52645,-8.39725,-2.65715,-0.30785,12.65607 69 | 8.96352,2.15330,7.97777,-2.99501,2.19453,0.11162,13.62118 70 | -0.90896,-0.03845,11.60698,5.39133,1.58423,-0.23637,13.73746 71 | 2.03663,-0.49245,4.30331,17.83947,-0.96290,0.10803,10.85762 72 | -1.72766,1.38544,1.88234,-0.58255,-1.55674,0.08176,16.49896 73 | -2.40833,-0.00177,2.32146,-1.06438,2.92114,-0.05635,-8.16292 74 | -1.22998,-1.81632,-2.81740,12.29083,-1.40781,-0.15404,-6.76994 75 | -3.85332,-1.24892,-6.24187,0.95304,-3.66314,0.02746,-0.87206 76 | -7.18419,-0.91048,-2.41759,2.46251,-5.11125,-0.05417,11.48350 77 | 5.69279,-0.66299,-3.40195,1.77690,3.70297,-0.02102,-23.71307 78 | 5.82082,1.75872,1.50493,-1.14792,-0.66104,0.14593,11.82506 79 | 0.98854,-0.91971,11.94650,1.36820,2.53711,0.30359,13.23011 80 | 1.55873,0.25462,2.37448,16.04402,-0.06938,-0.36479,-0.67043 81 | -0.66650,-2.27045,6.40325,7.64815,1.58676,-0.11790,-3.12393 82 | 4.58728,-2.90732,-0.05803,2.27259,2.29507,0.13907,-16.76419 83 | -11.73607,-2.26595,1.63461,6.21257,0.73723,0.03777,-7.00464 84 | -2.03125,1.83364,1.57590,5.52329,-3.64759,0.06059,23.96407 85 | 4.63339,1.37232,-0.62675,13.46151,3.69937,-0.09897,-13.66325 86 | -0.93955,-1.39664,-4.69027,-5.30208,-2.70883,0.07360,-0.26176 87 | 3.19531,-1.43186,3.82859,-9.83963,-2.83611,0.09403,14.30309 88 | -0.66991,-0.33925,-0.26224,-6.71810,0.52439,0.00654,-2.45750 89 | 3.32705,-0.20431,-0.61940,-5.82014,-3.30832,-0.13399,9.94820 90 | -3.01400,-1.40133,7.13418,-15.85676,3.92442,0.29137,-0.19544 91 | 10.75129,-0.08744,4.35843,-9.89202,-0.71794,0.12349,12.68742 92 | 4.74271,-1.32895,-2.73218,9.15129,0.93902,-0.17934,-15.58698 93 | 3.96678,-1.93074,-1.98368,-12.52082,7.35129,-0.30941,-40.20406 94 | 2.98664,1.85034,2.54075,-2.98750,0.37193,0.16048,9.08819 95 | -6.73878,-1.08637,-1.55835,-3.93097,-3.02271,0.11860,6.24185 96 | -4.58240,-1.27825,7.55098,8.83930,-3.80318,0.04386,26.14768 97 | 
-10.00364,2.66002,-4.26776,-3.73792,-0.72349,-0.24617,0.76214 98 | -4.32624,-2.30314,-8.16044,4.46366,-3.33569,-0.01655,-10.05262 99 | -1.90167,-0.15858,-10.43466,4.89762,-0.64606,-0.14519,-19.63970 100 | 2.43213,2.41613,2.49949,-8.03891,-1.64164,-0.63444,12.76193 -------------------------------------------------------------------------------- /Linear Regression/gd-data.csv: -------------------------------------------------------------------------------- 1 | -0.72407,2.23863 2 | -2.40724,-0.00156 3 | 2.64837,3.01665 4 | 0.36092,2.31019 5 | 0.67312,2.05950 6 | -0.45460,1.24736 7 | 2.20168,2.82497 8 | 1.15605,2.21802 9 | 0.50694,1.43644 10 | -0.85952,1.74980 11 | -0.59970,1.63259 12 | 1.46804,2.43461 13 | -1.05659,1.02226 14 | 1.29177,3.11769 15 | -0.74565,0.81194 16 | 0.15033,2.81910 17 | -1.49627,0.53105 18 | -0.72071,1.64845 19 | 0.32924,1.91416 20 | -0.28053,2.11376 21 | -1.36115,1.70969 22 | 0.74678,2.92253 23 | 0.10621,3.29827 24 | 0.03256,1.58565 25 | -0.98290,2.30455 26 | -1.15661,1.79169 27 | 0.09024,1.54723 28 | -1.03816,1.06893 29 | -0.00604,1.78802 30 | 0.16278,1.84746 31 | -0.69869,1.58732 32 | 1.03857,1.94799 33 | -0.11783,3.09324 34 | -0.95409,1.86155 35 | -0.81839,1.88817 36 | -1.28802,1.39474 37 | 0.62822,1.71526 38 | -2.29674,1.75695 39 | -0.85601,1.12981 40 | -1.75223,1.67000 41 | -1.19662,0.66711 42 | 0.97781,3.11987 43 | -1.17110,0.56924 44 | 0.15835,2.28231 45 | -0.58918,1.23798 46 | -1.79678,1.35803 47 | -0.95727,1.75579 48 | 0.64556,1.91470 49 | 0.24625,2.33029 50 | 0.45917,3.25263 51 | 1.21036,2.07602 52 | -0.60116,1.54254 53 | 0.26851,2.79202 54 | 0.49594,1.96178 55 | -2.67877,0.95898 56 | 0.49402,1.96690 57 | 1.18643,3.06144 58 | -0.17741,1.85984 59 | 0.57938,1.82967 60 | -2.14926,0.62285 61 | 2.27700,3.63838 62 | -1.05695,1.11807 63 | 1.68288,2.91735 64 | -1.53513,1.99668 65 | 0.00099,1.76149 66 | 0.45520,2.31938 67 | -0.37855,0.90172 68 | 1.35638,3.49432 69 | 0.01763,1.87838 70 | 2.21725,2.61171 71 | -0.44442,2.06623 72 | 0.89583,3.04041 73 | 1.30499,2.42824 74 | 0.10883,0.63190 75 | 1.79466,2.95265 76 | -0.00733,1.87546 77 | 0.79862,3.44953 78 | -0.12353,1.53740 79 | -1.34999,1.59958 80 | -0.67825,1.57832 81 | -0.17901,1.73312 82 | 0.12577,2.00244 83 | 1.11943,2.08990 84 | -3.02296,1.09255 85 | 0.64965,1.28183 86 | 1.05994,2.32358 87 | 0.53360,1.75136 88 | -0.73591,1.43076 89 | -0.09569,2.81376 90 | 1.04694,2.56597 91 | 0.46511,2.36401 92 | -0.75463,2.30161 93 | -0.94159,1.94500 94 | -0.09314,1.87619 95 | -0.98641,1.46602 96 | -0.92159,1.21538 97 | 0.76953,2.39377 98 | 0.03283,1.55730 99 | -1.07619,0.70874 100 | 0.20174,1.76894 -------------------------------------------------------------------------------- /Linear Regression/poly-data.csv: -------------------------------------------------------------------------------- 1 | Var_X,Var_Y 2 | -0.33532,6.66854 3 | 0.02160,3.86398 4 | -1.19438,5.16161 5 | -0.65046,8.43823 6 | -0.28001,5.57201 7 | 1.93258,-11.13270 8 | 1.22620,-5.31226 9 | 0.74727,-4.63725 10 | 3.32853,3.80650 11 | 2.87457,-6.06084 12 | -1.48662,7.22328 13 | 0.37629,2.38887 14 | 1.43918,-7.13415 15 | 0.24183,2.00412 16 | -2.79140,4.29794 17 | 1.08176,-5.86553 18 | 2.81555,-5.20711 19 | 0.54924,-3.52863 20 | 2.36449,-10.16202 21 | -1.01925,5.31123 -------------------------------------------------------------------------------- /Linear Regression/reg-data.csv: -------------------------------------------------------------------------------- 1 | 1.25664,2.04978,-6.23640,4.71926,-4.26931,0.20590,12.31798 2 | 
-3.89012,-0.37511,6.14979,4.94585,-3.57844,0.00640,23.67628 3 | 5.09784,0.98120,-0.29939,5.85805,0.28297,-0.20626,-1.53459 4 | 0.39034,-3.06861,-5.63488,6.43941,0.39256,-0.07084,-24.68670 5 | 5.84727,-0.15922,11.41246,7.52165,1.69886,0.29022,17.54122 6 | -2.86202,-0.84337,-1.08165,0.67115,-2.48911,0.52328,9.39789 7 | -7.09328,-0.07233,6.76632,13.06072,0.12876,-0.01048,11.73565 8 | -7.17614,0.62875,-2.89924,-5.21458,-2.70344,-0.22035,4.42482 9 | 8.67430,2.09933,-11.23591,-5.99532,-2.79770,-0.08710,-5.94615 10 | -6.03324,-4.16724,2.42063,-3.61827,1.96815,0.17723,-13.11848 11 | 8.67485,1.48271,-1.31205,-1.81154,2.67940,0.04803,-9.25647 12 | 4.36248,-2.69788,-4.60562,-0.12849,3.40617,-0.07841,-29.94048 13 | 9.97205,-0.61515,2.63039,2.81044,5.68249,-0.04495,-20.46775 14 | -1.44556,0.18337,4.61021,-2.54824,0.86388,0.17696,7.12822 15 | -3.90381,0.53243,2.83416,-5.42397,-0.06367,-0.22810,6.05628 16 | -12.39824,-1.54269,-2.66748,10.82084,5.92054,0.13415,-32.91328 17 | 5.75911,-0.82222,10.24701,0.33635,0.26025,-0.02588,17.75036 18 | -7.12657,3.28707,-0.22508,13.42902,2.16708,-0.09153,-2.80277 19 | 7.22736,1.27122,0.99188,-8.87118,-6.86533,0.09410,33.98791 20 | -10.31393,2.23819,-7.87166,-3.44388,-1.43267,-0.07893,-3.18407 21 | -8.25971,-0.15799,-1.81740,1.12972,4.24165,-0.01607,-20.57366 22 | 13.37454,-0.91051,4.61334,0.93989,4.81350,-0.07428,-12.66661 23 | 1.49973,-0.50929,-2.66670,-1.28560,-0.18299,-0.00552,-6.56370 24 | -10.46766,0.73077,3.93791,-1.73489,-3.26768,0.02366,23.19621 25 | -1.15898,3.14709,-4.73329,13.61355,-3.87487,-0.14112,13.89143 26 | 4.42275,-2.09867,3.06395,-0.45331,-2.07717,0.22815,10.29282 27 | -3.34113,-0.31138,4.49844,-2.32619,-2.95757,-0.00793,21.21512 28 | -1.85433,-1.32509,8.06274,12.75080,-0.89005,-0.04312,14.54248 29 | 0.85474,-0.50002,-3.52152,-4.30405,4.13943,-0.02834,-24.77918 30 | 0.33271,-5.28025,-4.95832,22.48546,4.95051,0.17153,-45.01710 31 | -0.07308,0.51247,-1.38120,7.86552,3.31641,0.06808,-12.63583 32 | 2.99294,2.85192,5.51751,8.53749,4.30806,-0.17462,0.84415 33 | 1.41135,-1.01899,2.27500,5.27479,-4.90004,0.19508,23.54972 34 | 3.84816,-0.66249,-1.35364,16.51379,0.32115,0.41051,-2.28650 35 | 3.30223,0.23152,-2.16852,0.75257,-0.05749,-0.03427,-4.22022 36 | -6.12524,-2.56204,0.79878,-3.36284,1.00396,0.06219,-9.10749 37 | -7.47524,1.31401,-3.30847,4.83057,1.00104,-0.19851,-7.69059 38 | 5.84884,-0.53504,-0.19543,10.27451,6.98704,0.22706,-29.21246 39 | 6.44377,0.47687,-0.08731,22.88008,-2.86604,0.03142,10.90274 40 | 6.35366,-2.04444,1.98872,-1.45189,-1.24062,0.23626,4.62178 41 | 6.85563,-0.94543,5.16637,2.85611,4.64812,0.29535,-7.83647 42 | 1.61758,1.31067,-2.16795,8.07492,-0.17166,-0.10273,0.06922 43 | 3.80137,1.02276,-3.15429,6.09774,3.18885,-0.00163,-16.11486 44 | -6.81855,-0.15776,-10.69117,8.07818,4.14656,0.10691,-38.47710 45 | -6.43852,4.30120,2.63923,-1.98297,-0.89599,-0.08174,20.77790 46 | -2.35292,1.26425,-6.80877,3.31220,-6.17515,-0.04764,14.92507 47 | 9.13580,-1.21425,1.17227,-6.33648,-0.85276,-0.13366,-0.17285 48 | -3.02986,-0.48694,0.24329,-0.38830,-4.70410,-0.18065,15.95300 49 | 3.27244,2.22393,-1.96640,17.53694,1.62378,0.11539,-4.29743 50 | -4.44346,-1.96429,0.22209,15.29785,-1.98503,0.40131,4.07647 51 | -2.61294,-0.24905,-4.02974,-23.82024,-5.94171,-0.04932,16.50504 52 | 3.65962,1.69832,0.78025,9.88639,-1.61555,-0.18570,9.99506 53 | 2.22893,-4.62231,-3.33440,0.07179,0.21983,0.14348,-19.94698 54 | -5.43092,1.39655,-2.79175,0.16622,-2.38112,-0.09009,6.49039 55 | -5.88117,-3.04210,-0.87931,3.96197,-1.01125,0.08132,-6.01714 56 | 
0.51401,-0.30742,6.01407,-6.85848,-3.61343,-0.15710,24.56965 57 | 4.45547,2.34283,0.98094,-4.66298,-3.79507,0.37084,27.19791 58 | 0.05320,0.27458,6.95838,7.50119,-5.50256,0.06913,36.21698 59 | 4.72057,0.17165,4.83822,-1.03917,4.11211,-0.14773,-6.32623 60 | -11.60674,-1.15594,-10.23150,0.49843,0.32477,-0.14543,-28.54003 61 | -7.55406,0.45765,10.67537,-15.12397,3.49680,0.20350,11.97581 62 | -1.73618,-1.56867,3.98355,-5.16723,-1.20911,0.19377,9.55247 63 | 2.01963,-1.12612,1.16531,-2.71553,-5.39782,0.01086,21.83478 64 | -1.68542,-1.08901,-3.55426,3.14201,0.82668,0.04372,-13.11204 65 | -3.09104,-0.23295,-5.62436,-3.03831,0.77772,0.02000,-14.74251 66 | -3.87717,0.74098,-2.88109,-2.88103,3.36945,-0.30445,-18.44363 67 | -0.42754,-0.42819,5.02998,-3.45859,-4.21739,0.25281,29.20439 68 | 8.31292,2.30543,-1.52645,-8.39725,-2.65715,-0.30785,12.65607 69 | 8.96352,2.15330,7.97777,-2.99501,2.19453,0.11162,13.62118 70 | -0.90896,-0.03845,11.60698,5.39133,1.58423,-0.23637,13.73746 71 | 2.03663,-0.49245,4.30331,17.83947,-0.96290,0.10803,10.85762 72 | -1.72766,1.38544,1.88234,-0.58255,-1.55674,0.08176,16.49896 73 | -2.40833,-0.00177,2.32146,-1.06438,2.92114,-0.05635,-8.16292 74 | -1.22998,-1.81632,-2.81740,12.29083,-1.40781,-0.15404,-6.76994 75 | -3.85332,-1.24892,-6.24187,0.95304,-3.66314,0.02746,-0.87206 76 | -7.18419,-0.91048,-2.41759,2.46251,-5.11125,-0.05417,11.48350 77 | 5.69279,-0.66299,-3.40195,1.77690,3.70297,-0.02102,-23.71307 78 | 5.82082,1.75872,1.50493,-1.14792,-0.66104,0.14593,11.82506 79 | 0.98854,-0.91971,11.94650,1.36820,2.53711,0.30359,13.23011 80 | 1.55873,0.25462,2.37448,16.04402,-0.06938,-0.36479,-0.67043 81 | -0.66650,-2.27045,6.40325,7.64815,1.58676,-0.11790,-3.12393 82 | 4.58728,-2.90732,-0.05803,2.27259,2.29507,0.13907,-16.76419 83 | -11.73607,-2.26595,1.63461,6.21257,0.73723,0.03777,-7.00464 84 | -2.03125,1.83364,1.57590,5.52329,-3.64759,0.06059,23.96407 85 | 4.63339,1.37232,-0.62675,13.46151,3.69937,-0.09897,-13.66325 86 | -0.93955,-1.39664,-4.69027,-5.30208,-2.70883,0.07360,-0.26176 87 | 3.19531,-1.43186,3.82859,-9.83963,-2.83611,0.09403,14.30309 88 | -0.66991,-0.33925,-0.26224,-6.71810,0.52439,0.00654,-2.45750 89 | 3.32705,-0.20431,-0.61940,-5.82014,-3.30832,-0.13399,9.94820 90 | -3.01400,-1.40133,7.13418,-15.85676,3.92442,0.29137,-0.19544 91 | 10.75129,-0.08744,4.35843,-9.89202,-0.71794,0.12349,12.68742 92 | 4.74271,-1.32895,-2.73218,9.15129,0.93902,-0.17934,-15.58698 93 | 3.96678,-1.93074,-1.98368,-12.52082,7.35129,-0.30941,-40.20406 94 | 2.98664,1.85034,2.54075,-2.98750,0.37193,0.16048,9.08819 95 | -6.73878,-1.08637,-1.55835,-3.93097,-3.02271,0.11860,6.24185 96 | -4.58240,-1.27825,7.55098,8.83930,-3.80318,0.04386,26.14768 97 | -10.00364,2.66002,-4.26776,-3.73792,-0.72349,-0.24617,0.76214 98 | -4.32624,-2.30314,-8.16044,4.46366,-3.33569,-0.01655,-10.05262 99 | -1.90167,-0.15858,-10.43466,4.89762,-0.64606,-0.14519,-19.63970 100 | 2.43213,2.41613,2.49949,-8.03891,-1.64164,-0.63444,12.76193 -------------------------------------------------------------------------------- /Model Evaluation Metrics/2 Quiz: Testing Your Models - Solution.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "authorship_tag": "ABX9TyNkKpEWIHXNnTVKCwY5BYZT", 8 | "include_colab_link": true 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 | }, 14 | "language_info": { 15 | "name": "python" 16 | } 17 | }, 18 | "cells": [ 19 | { 20 
| "cell_type": "markdown", 21 | "metadata": { 22 | "id": "view-in-github", 23 | "colab_type": "text" 24 | }, 25 | "source": [ 26 | "\"Open" 27 | ] 28 | }, 29 | { 30 | "cell_type": "code", 31 | "execution_count": null, 32 | "metadata": { 33 | "id": "DHQ4PQRyjqHT" 34 | }, 35 | "outputs": [], 36 | "source": [ 37 | "# Import statements\n", 38 | "import pandas as pd\n", 39 | "import numpy as np\n", 40 | "from sklearn.tree import DecisionTreeClassifier\n", 41 | "from sklearn.metrics import accuracy_score\n", 42 | "from sklearn.model_selection import train_test_split" 43 | ] 44 | }, 45 | { 46 | "cell_type": "code", 47 | "source": [ 48 | "# URL for our dataset, data.csv\n", 49 | "URL = \"https://drive.google.com/file/d/1SJVKhCW-oHfsQDCbOqca-eW4tnNKfFHC/view?usp=sharing\"\n", 50 | "FILE_PATH = \"https://drive.google.com/uc?export=download&id=\" + URL.split(\"/\")[-2]\n", 51 | "\n", 52 | "# Read the data.\n", 53 | "data = np.asarray(pd.read_csv(FILE_PATH, header=None))\n", 54 | "\n", 55 | "# Assign the features to the variable X, and the labels to the variable y.\n", 56 | "X = data[:,0:2]\n", 57 | "y = data[:,2]" 58 | ], 59 | "metadata": { 60 | "id": "nkIrrCWAkFZ4" 61 | }, 62 | "execution_count": null, 63 | "outputs": [] 64 | }, 65 | { 66 | "cell_type": "code", 67 | "source": [ 68 | "# Use train test split to split your data\n", 69 | "# Use a test size of 25% and a random state of 42\n", 70 | "X_train, X_test, y_train, y_test = train_test_split(X,\n", 71 | " y,\n", 72 | " test_size=0.25,\n", 73 | " random_state=42)" 74 | ], 75 | "metadata": { 76 | "id": "5tw5D11okIn6" 77 | }, 78 | "execution_count": null, 79 | "outputs": [] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "source": [ 84 | "# TODO: Create the decision tree model and assign it to the variable model.\n", 85 | "model = DecisionTreeClassifier()\n", 86 | "\n", 87 | "# TODO: Fit the model to the training data.\n", 88 | "model.fit(X_train,y_train)\n", 89 | "\n", 90 | "# TODO: Make predictions on the test data\n", 91 | "y_pred = model.predict(X_test)\n", 92 | "\n", 93 | "# TODO: Calculate the accuracy and assign it to the variable acc. 
on the test data\n", 94 | "acc = accuracy_score(y_test, y_pred)\n", 95 | "acc" 96 | ], 97 | "metadata": { 98 | "id": "vRsTNrUykJdy" 99 | }, 100 | "execution_count": null, 101 | "outputs": [] 102 | } 103 | ] 104 | } -------------------------------------------------------------------------------- /Model Evaluation Metrics/2 Quiz: Testing Your Models.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "authorship_tag": "ABX9TyOzOyU4pkoQy4ufyPYTNSfO", 8 | "include_colab_link": true 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 | }, 14 | "language_info": { 15 | "name": "python" 16 | } 17 | }, 18 | "cells": [ 19 | { 20 | "cell_type": "markdown", 21 | "metadata": { 22 | "id": "view-in-github", 23 | "colab_type": "text" 24 | }, 25 | "source": [ 26 | "\"Open" 27 | ] 28 | }, 29 | { 30 | "cell_type": "code", 31 | "execution_count": null, 32 | "metadata": { 33 | "id": "DHQ4PQRyjqHT" 34 | }, 35 | "outputs": [], 36 | "source": [ 37 | "# Import statements\n", 38 | "import pandas as pd\n", 39 | "import numpy as np\n", 40 | "from sklearn.tree import DecisionTreeClassifier\n", 41 | "from sklearn.metrics import accuracy_score\n", 42 | "from sklearn.model_selection import train_test_split" 43 | ] 44 | }, 45 | { 46 | "cell_type": "code", 47 | "source": [ 48 | "# URL for our dataset, data.csv\n", 49 | "URL = \"https://drive.google.com/file/d/1SJVKhCW-oHfsQDCbOqca-eW4tnNKfFHC/view?usp=sharing\"\n", 50 | "FILE_PATH = \"https://drive.google.com/uc?export=download&id=\" + URL.split(\"/\")[-2]\n", 51 | "\n", 52 | "# Read the data.\n", 53 | "data = np.asarray(pd.read_csv(FILE_PATH, header=None))\n", 54 | "\n", 55 | "# Assign the features to the variable X, and the labels to the variable y.\n", 56 | "X = None\n", 57 | "y = None" 58 | ], 59 | "metadata": { 60 | "id": "nkIrrCWAkFZ4" 61 | }, 62 | "execution_count": null, 63 | "outputs": [] 64 | }, 65 | { 66 | "cell_type": "code", 67 | "source": [ 68 | "# Use train test split to split your data\n", 69 | "# Use a test size of 25% and a random state of 42\n", 70 | "X_train, X_test, y_train, y_test = None" 71 | ], 72 | "metadata": { 73 | "id": "5tw5D11okIn6" 74 | }, 75 | "execution_count": null, 76 | "outputs": [] 77 | }, 78 | { 79 | "cell_type": "code", 80 | "source": [ 81 | "# TODO: Create the decision tree model and assign it to the variable model.\n", 82 | "model = None\n", 83 | "\n", 84 | "# TODO: Fit the model to the training data.\n", 85 | "model.fit(None, None)\n", 86 | "\n", 87 | "# TODO: Make predictions on the test data\n", 88 | "y_pred = model.predict(None)\n", 89 | "\n", 90 | "# TODO: Calculate the accuracy and assign it to the variable acc. 
on the test data\n", 91 | "acc = accuracy_score(None, None)\n", 92 | "acc" 93 | ], 94 | "metadata": { 95 | "id": "vRsTNrUykJdy" 96 | }, 97 | "execution_count": null, 98 | "outputs": [] 99 | } 100 | ] 101 | } -------------------------------------------------------------------------------- /Model Evaluation Metrics/data.csv: -------------------------------------------------------------------------------- 1 | 0.24539,0.81725,0 2 | 0.21774,0.76462,0 3 | 0.20161,0.69737,0 4 | 0.20161,0.58041,0 5 | 0.2477,0.49561,0 6 | 0.32834,0.44883,0 7 | 0.39516,0.48099,0 8 | 0.39286,0.57164,0 9 | 0.33525,0.62135,0 10 | 0.33986,0.71199,0 11 | 0.34447,0.81433,0 12 | 0.28226,0.82602,0 13 | 0.26613,0.75,0 14 | 0.26613,0.63596,0 15 | 0.32604,0.54825,0 16 | 0.28917,0.65643,0 17 | 0.80069,0.71491,0 18 | 0.80069,0.64181,0 19 | 0.80069,0.50146,0 20 | 0.79839,0.36988,0 21 | 0.73157,0.25,0 22 | 0.63249,0.18275,0 23 | 0.60023,0.27047,0 24 | 0.66014,0.34649,0 25 | 0.70161,0.42251,0 26 | 0.70853,0.53947,0 27 | 0.71544,0.63304,0 28 | 0.74309,0.72076,0 29 | 0.75,0.63596,0 30 | 0.75,0.46345,0 31 | 0.72235,0.35526,0 32 | 0.66935,0.28509,0 33 | 0.20622,0.94298,1 34 | 0.26613,0.8962,1 35 | 0.38134,0.8962,1 36 | 0.42051,0.94591,1 37 | 0.49885,0.86404,1 38 | 0.31452,0.93421,1 39 | 0.53111,0.72076,1 40 | 0.45276,0.74415,1 41 | 0.53571,0.6038,1 42 | 0.60484,0.71491,1 43 | 0.60945,0.58333,1 44 | 0.51267,0.47807,1 45 | 0.50806,0.59211,1 46 | 0.46198,0.30556,1 47 | 0.5288,0.41082,1 48 | 0.38594,0.35819,1 49 | 0.31682,0.31433,1 50 | 0.29608,0.20906,1 51 | 0.36982,0.27632,1 52 | 0.42972,0.18275,1 53 | 0.51498,0.10965,1 54 | 0.53111,0.20906,1 55 | 0.59793,0.095029,1 56 | 0.73848,0.086257,1 57 | 0.83065,0.18275,1 58 | 0.8629,0.10965,1 59 | 0.88364,0.27924,1 60 | 0.93433,0.30848,1 61 | 0.93433,0.19444,1 62 | 0.92512,0.43421,1 63 | 0.87903,0.43421,1 64 | 0.87903,0.58626,1 65 | 0.9182,0.71491,1 66 | 0.85138,0.8348,1 67 | 0.85599,0.94006,1 68 | 0.70853,0.94298,1 69 | 0.70853,0.87281,1 70 | 0.59793,0.93129,1 71 | 0.61175,0.83187,1 72 | 0.78226,0.82895,1 73 | 0.78917,0.8962,1 74 | 0.90668,0.89912,1 75 | 0.14862,0.92251,1 76 | 0.15092,0.85819,1 77 | 0.097926,0.85819,1 78 | 0.079493,0.91374,1 79 | 0.079493,0.77632,1 80 | 0.10945,0.79678,1 81 | 0.12327,0.67982,1 82 | 0.077189,0.6886,1 83 | 0.081797,0.58626,1 84 | 0.14862,0.58041,1 85 | 0.14862,0.5307,1 86 | 0.14171,0.41959,1 87 | 0.08871,0.49269,1 88 | 0.095622,0.36696,1 89 | 0.24539,0.3962,1 90 | 0.1947,0.29678,1 91 | 0.16935,0.22368,1 92 | 0.15553,0.13596,1 93 | 0.23848,0.12427,1 94 | 0.33065,0.12427,1 95 | 0.095622,0.2617,1 96 | 0.091014,0.20322,1 97 | -------------------------------------------------------------------------------- /Model Evaluation Metrics/temp.txt: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Machine Learning and Neural Networks 2 | -------------------------------------------------------------------------------- /Walk-Through/temp.txt: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /aligning-contents-with-coursework-requirements.md: -------------------------------------------------------------------------------- 1 | | **Key Requirement** | **Explanation** | **How the Report Meets the Requirement** | 2 | 
|----------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------| 3 | | **Adherence to the Universal Workflow** | Follow the workflow from *Deep Learning with Python* by François Chollet, covering all steps from problem definition to conclusions. | The report is structured according to the universal workflow, covering problem definition, evaluation metrics, model development, regularization, and conclusions. | 4 | | **Report Structure and Quality** | The report should read as a comprehensive document with clear headings, subheadings, and a logical flow, avoiding excessive repetition. | The report is well-organized with clear sections (e.g., Introduction, Methodology, Analysis of Results, Conclusions) and a logical progression of ideas. | 5 | | **Systematic Investigation** | Conduct a thorough and systematic investigation, including data exploration, model development, and performance evaluation. | The report systematically explores different models (baseline, overfitting, regularized, wider, deeper, narrower) and evaluates their performance using various metrics. | 6 | | **Interpretation of Results** | Provide a clear interpretation of the results, discussing the strengths and weaknesses of each model and how they align with the objectives. | The report includes an Analysis of Results section that discusses the performance of each model, highlighting key insights and comparing them against the baseline model. | 7 | | **Use of Evaluation Protocols** | Employ appropriate evaluation protocols, such as hold-out validation, K-fold cross-validation, and validation split, to ensure robust model evaluation. | The report uses K-fold cross-validation combined with a validation split during model refitting, ensuring that the models are evaluated robustly on unseen data. | 8 | | **Focus on Model Regularization** | Apply regularization techniques (e.g., Dropout, L2 regularization) to prevent overfitting and improve model generalization. | The report discusses overfitting in the baseline model and implements Dropout and L2 regularization in more complex models to mitigate this issue and improve generalization.| 9 | | **Exploration of Model Architectures** | Explore different neural network architectures (e.g., wider, deeper, narrower) to optimize model performance and draw comparisons. | The report investigates various architectures (wider, deeper, narrower) and compares their performance, providing insights into the effectiveness of each approach. | 10 | | **Implementation of Class Weights** | Address class imbalance in the dataset by applying class weights during model training. | The report computes and applies class weights during model training to handle the significant class imbalance observed in the dataset. | 11 | | **Avoidance of Advanced Techniques** | The assignment constraints require that only Dense layers, Dropout, and L1/L2 regularization are used, avoiding CNNs, RNNs, and Early Stopping. 
| The report adheres to these constraints by only using Dense layers, Dropout, and L1/L2 regularization, without implementing advanced techniques like CNNs or Early Stopping.| 12 | | **Thorough Examination and Discussion of Model Performance** | Examine and discuss the performance of each model thoroughly, highlighting the impact of different architectural choices and regularization techniques. | The report provides a detailed comparison of model performances, discussing how each architectural choice and regularization technique affects metrics like accuracy, F1, and AUC. | 13 | | **Suggestions for Future Work** | Identify areas for future work, including potential improvements to models and techniques not explored due to constraints. | The report includes a succinct Future Work section, suggesting the exploration of optimization techniques and model interpretability methods like LIME and SHAP. | 14 | 15 | --------------------------------------------------------------------------------
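The alignment table above repeatedly refers to a Dense-only network trained with Dropout, L1/L2 regularization, class weights for an imbalanced dataset, and a validation split. As a point of reference for those rows, below is a minimal illustrative sketch of that combination in Keras. It is not code from the report or from any notebook in this repository: the layer sizes, dropout rate, regularization strength, and the randomly generated `X_train`/`y_train` arrays are placeholder assumptions.

```python
# Illustrative sketch only: hyperparameters and data below are placeholders,
# not values taken from the report or the notebooks in this repository.
import numpy as np
from sklearn.utils.class_weight import compute_class_weight
from tensorflow import keras
from tensorflow.keras import layers, regularizers

# Hypothetical imbalanced binary-classification data standing in for the real dataset.
X_train = np.random.rand(1000, 20).astype("float32")
y_train = np.random.binomial(1, 0.1, size=1000)  # roughly 10% positive class

# Class weights to counter the imbalance, as described in the table.
classes = np.unique(y_train)
weights = compute_class_weight("balanced", classes=classes, y=y_train)
class_weight = {int(c): w for c, w in zip(classes, weights)}

# Dense-only network with Dropout and L2 regularization
# (no CNNs, RNNs, or Early Stopping, per the stated constraints).
model = keras.Sequential([
    layers.Input(shape=(X_train.shape[1],)),
    layers.Dense(64, activation="relu", kernel_regularizer=regularizers.l2(1e-3)),
    layers.Dropout(0.3),
    layers.Dense(32, activation="relu", kernel_regularizer=regularizers.l2(1e-3)),
    layers.Dropout(0.3),
    layers.Dense(1, activation="sigmoid"),
])

model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy", keras.metrics.AUC(name="auc")],
)

# Validation split during fitting, with class weights applied.
history = model.fit(
    X_train, y_train,
    epochs=20, batch_size=32,
    validation_split=0.2,
    class_weight=class_weight,
    verbose=0,
)
```

This sketch covers only the ingredients named in the table (Dense layers, Dropout, L2, class weights, a validation split); the K-fold cross-validation protocol and the wider/deeper/narrower architecture comparisons discussed in the report are outside its scope.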