├── README.md
├── brain_body.txt
├── challenge_dataset.txt
├── demo.py
└── requirements.txt

/README.md:
--------------------------------------------------------------------------------
# linear_regression_demo
This is the code for "How to Make a Prediction - Intro to Deep Learning #1" by Siraj Raval on YouTube

## Overview
This is the code for [this](https://youtu.be/vOppzHpvTiQ) video by Siraj Raval on YouTube. This is the 1st episode in my 'Intro to Deep Learning' series. The goal is to predict an animal's body weight given its brain weight. The model we'll be using is called [Linear Regression](http://www.statisticssolutions.com/what-is-linear-regression/). The dataset we're using to train our model is a list of brain weight and body weight measurements from a bunch of animals. We'll fit our line to the data using the scikit-learn machine learning library, then plot our graph using matplotlib.

## Dependencies

* pandas
* scikit-learn
* matplotlib

You can just run
`pip install -r requirements.txt`
in a terminal to install the necessary dependencies. Here is a link to [pip](https://pip.pypa.io/en/stable/installing/) if you don't already have it.

## Usage

Type `python demo.py` into a terminal and you'll see the scatter plot and line of best fit appear.

## Challenge

The challenge for this video is to use scikit-learn to create a line of best fit for the included 'challenge_dataset'. Then, make a prediction for an existing data point and see how closely it matches the actual value. Print out the error you get. You can use scikit-learn's [documentation](http://scikit-learn.org/stable/documentation.html) for more help. These weekly challenges are not related to the Udacity nanodegree projects; those are additional.

*Bonus points if you perform linear regression on a dataset with 3 different variables*

## Credits

The credits for the original code go to [gcrowder](https://github.com/gcrowder). I've merely created a wrapper to get people started.

--------------------------------------------------------------------------------
/brain_body.txt:
--------------------------------------------------------------------------------
Brain     Body
3.385     44.500
0.480     15.500
1.350     8.100
465.000   423.000
36.330    119.500
27.660    115.000
14.830    98.200
1.040     5.500
4.190     58.000
0.425     6.400
0.101     4.000
0.920     5.700
1.000     6.600
0.005     0.140
0.060     1.000
3.500     10.800
2.000     12.300
1.700     6.300
2547.000  4603.000
0.023     0.300
187.100   419.000
521.000   655.000
0.785     3.500
10.000    115.000
3.300     25.600
0.200     5.000
1.410     17.500
529.000   680.000
207.000   406.000
85.000    325.000
0.750     12.300
62.000    1320.000
6654.000  5712.000
3.500     3.900
6.800     179.000
35.000    56.000
4.050     17.000
0.120     1.000
0.023     0.400
0.010     0.250
1.400     12.500
250.000   490.000
2.500     12.100
55.500    175.000
100.000   157.000
52.160    440.000
10.550    179.500
0.550     2.400
60.000    81.000
3.600     21.000
4.288     39.200
0.280     1.900
0.075     1.200
0.122     3.000
0.048     0.330
192.000   180.000
3.000     25.000
160.000   169.000
0.900     2.600
1.620     11.400
0.104     2.500
4.235     50.400

--------------------------------------------------------------------------------
/challenge_dataset.txt:
--------------------------------------------------------------------------------
6.1101,17.592
5.5277,9.1302
8.5186,13.662
7.0032,11.854
5.8598,6.8233
8.3829,11.886
7.4764,4.3483
8.5781,12
6.4862,6.5987
5.0546,3.8166
5.7107,3.2522
14.164,15.505
5.734,3.1551
8.4084,7.2258
5.6407,0.71618
5.3794,3.5129
6.3654,5.3048
5.1301,0.56077
6.4296,3.6518
7.0708,5.3893
6.1891,3.1386
20.27,21.767
5.4901,4.263
6.3261,5.1875
5.5649,3.0825
18.945,22.638
12.828,13.501
10.957,7.0467
13.176,14.692
22.203,24.147
5.2524,-1.22
6.5894,5.9966
9.2482,12.134
5.8918,1.8495
8.2111,6.5426
7.9334,4.5623
8.0959,4.1164
5.6063,3.3928
12.836,10.117
6.3534,5.4974
5.4069,0.55657
6.8825,3.9115
11.708,5.3854
5.7737,2.4406
7.8247,6.7318
7.0931,1.0463
5.0702,5.1337
5.8014,1.844
11.7,8.0043
5.5416,1.0179
7.5402,6.7504
5.3077,1.8396
7.4239,4.2885
7.6031,4.9981
6.3328,1.4233
6.3589,-1.4211
6.2742,2.4756
5.6397,4.6042
9.3102,3.9624
9.4536,5.4141
8.8254,5.1694
5.1793,-0.74279
21.279,17.929
14.908,12.054
18.959,17.054
7.2182,4.8852
8.2951,5.7442
10.236,7.7754
5.4994,1.0173
20.341,20.992
10.136,6.6799
7.3345,4.0259
6.0062,1.2784
7.2259,3.3411
5.0269,-2.6807
6.5479,0.29678
7.5386,3.8845
5.0365,5.7014
10.274,6.7526
5.1077,2.0576
5.7292,0.47953
5.1884,0.20421
6.3557,0.67861
9.7687,7.5435
6.5159,5.3436
8.5172,4.2415
9.1802,6.7981
6.002,0.92695
5.5204,0.152
5.0594,2.8214
5.7077,1.8451
7.6366,4.2959
5.8707,7.2029
5.3054,1.9869
8.2934,0.14454
13.394,9.0551
5.4369,0.61705

--------------------------------------------------------------------------------
/demo.py:
--------------------------------------------------------------------------------
import pandas as pd
from sklearn import linear_model
import matplotlib.pyplot as plt

# read the fixed-width data file into a DataFrame
dataframe = pd.read_fwf('brain_body.txt')
x_values = dataframe[['Brain']]
y_values = dataframe[['Body']]

# train a linear regression model on the data
body_reg = linear_model.LinearRegression()
body_reg.fit(x_values, y_values)

# visualize results: scatter plot of the data plus the fitted line
plt.scatter(x_values, y_values)
plt.plot(x_values, body_reg.predict(x_values))
plt.show()

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
matplotlib==1.5.1
numpy==1.11.0
pandas==0.18.0
scikit-learn==0.17.1

--------------------------------------------------------------------------------
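The challenge described in the README (fit a line to `challenge_dataset.txt`, predict for an existing data point, print the error) could be approached roughly as follows. This is only a sketch, not the author's solution: the column names `x` and `y` are assumptions (the file has no header row), absolute error is one possible choice of error measure, and the first five rows of the dataset are inlined so the snippet runs on its own.

```python
import io

import pandas as pd
from sklearn.linear_model import LinearRegression

# First five rows of challenge_dataset.txt, inlined so the sketch is self-contained.
# To use the full file, pass 'challenge_dataset.txt' to pd.read_csv instead.
SAMPLE = """6.1101,17.592
5.5277,9.1302
8.5186,13.662
7.0032,11.854
5.8598,6.8233
"""

# The file has no header row; the names "x" and "y" are assumed for illustration.
data = pd.read_csv(io.StringIO(SAMPLE), header=None, names=["x", "y"])

# Fit a line of best fit: y is modeled as a linear function of x.
model = LinearRegression()
model.fit(data[["x"]], data["y"])

# Predict for an existing data point (row 0) and compare with the actual value.
row = data.iloc[[0]]
predicted = model.predict(row[["x"]])[0]
actual = row["y"].iloc[0]
error = abs(predicted - actual)  # absolute error, chosen here as a simple measure
print(f"predicted={predicted:.3f} actual={actual:.3f} absolute error={error:.3f}")
```

Swapping the inlined sample for the full file only changes the `pd.read_csv` argument; the rest of the sketch stays the same.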