├── .gitignore
├── README.md
├── car_features.csv
├── dnn_from_scratch.py
├── dnn_from_scratch_tensorflow.py
├── dnn_from_scratch_tensorflow_load_model.py
├── download_lbc_cars_data.py
├── normalize_lbc_cars_data.py
├── normalized_car_features.csv
├── predict.py
├── requirements.txt
└── saved_model
    ├── saved_model.pb
    └── variables
        ├── variables.data-00000-of-00001
        └── variables.index

/.gitignore:
--------------------------------------------------------------------------------
.idea
tensorboard/
__pycache__/
*.pyc

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Note

The project should work with both Python 2 and 3.

# About

This project provides the Python code that supports this [blog post](https://matrices.io/deep-neural-network-from-scratch/) (if you are a beginner, you should read it).

The goal is to build a neural network from scratch using numpy, then the same one using TensorFlow.

As a toy example, we try to predict the price of a car using online data.

`download_lbc_cars_data.py` downloads data from leboncoin.fr, a classified-ads website. The data retrieved are about the BMW Serie 1 (a single car model).

For each BMW Serie 1 we save one example with the number of kilometers, the fuel type, the age, and the price. The data are saved into `car_features.csv`.

These data are then normalized by `normalize_lbc_cars_data.py` to produce `normalized_car_features.csv`.

`normalized_car_features.csv` is used as input by `dnn_from_scratch.py`, the neural network using numpy, and by `dnn_from_scratch_tensorflow.py`, the neural network using TensorFlow.

`predict.py` transforms the data back and forth between the normalized and the human-readable versions. For instance, to predict a price the user inputs the raw car attributes; `predict.py` converts them to their normalized form. The neural network output is also passed through `predict.py` so that the user obtains a readable price rather than a normalized one (a short usage sketch is given at the end of this README).

Overall results are fairly good, considering that the price is influenced by more than the three attributes we use.

# Network architecture
The architecture is fairly simple and well described in the blog post. Here is an illustration:
![Network architecture](https://matrices.io/content/images/2017/02/DNN-S12.png)

# Usage
A requirements.txt file exists at the root of the repository. Run `pip install -r requirements.txt`.

# Issue
If you spot a bad implementation or come across a bug, open an issue. I'll help you.
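# Example
A minimal sketch of the normalization round trip performed by `predict.py` (assuming `normalize_lbc_cars_data.py` has already written the normalization metadata to the first line of `normalized_car_features.csv`; the `0.5` network output below is only an illustrative value):

```python
import csv
import predict as util

# the first CSV line stores: mean_km, std_km, mean_age, std_age, min_price, max_price
with open("normalized_car_features.csv", "r") as f:
    meta = [float(v) for v in next(csv.reader(f, delimiter=","))]

p = util.Predict(*meta)
normalized_input = p.input(168000, "Diesel", 5)  # raw attributes -> normalized network input
price_in_euros = p.output(0.5)                   # normalized network output -> price in euros
```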
--------------------------------------------------------------------------------
/dnn_from_scratch.py:
--------------------------------------------------------------------------------
import numpy as np
import csv
import predict as util


class NeuralNetwork:
    def __init__(self):

        # load the dataset from the CSV file
        reader = csv.reader(open("normalized_car_features.csv", "r"), delimiter=",")
        x = list(reader)
        features = np.array(x[2:]).astype("float")
        np.random.shuffle(features)

        # car attributes and price are split; note that a 1 is appended to each car for the bias
        data_x = np.concatenate((features[:, :3], np.ones((features.shape[0], 1))), axis=1)
        data_y = features[:, 3:]

        # we save the dataset metadata for the prediction part of the network
        # (named "predictor" so that it does not shadow the predict() method below)
        self.predictor = util.Predict(float(x[0][0]), float(x[0][1]), float(x[0][2]), float(x[0][3]),
                                      float(x[0][4]), float(x[0][5]))

        # we set a threshold at 80% of the data
        self.m = float(features.shape[0])
        self.m_train_set = int(self.m * 0.8)

        # we split the train and test set using the threshold
        self.x, self.x_test = data_x[:self.m_train_set, :], data_x[self.m_train_set:, :]
        self.y, self.y_test = data_y[:self.m_train_set, :], data_y[self.m_train_set:, :]

        # we init the network parameters
        self.z2, self.a2, self.z3, self.a3, self.z4, self.a4 = (None,) * 6
        self.delta2, self.delta3, self.delta4 = (None,) * 3
        self.djdw1, self.djdw2, self.djdw3 = (None,) * 3
        self.gradient, self.numericalGradient = (None,) * 2
        self.Lambda = 0.01
        self.learning_rate = 0.01

        # we init the weights using the blog post values
        self.w1 = np.matrix([
            [0.01, 0.05, 0.07],
            [0.2, 0.041, 0.11],
            [0.04, 0.56, 0.13],
            [0.1, 0.1, 0.1]
        ])

        self.w2 = np.matrix([
            [0.04, 0.78],
            [0.4, 0.45],
            [0.65, 0.23],
            [0.1, 0.1]
        ])

        self.w3 = np.matrix([
            [0.04],
            [0.41],
            [0.1]
        ])

    def forward(self):

        # first layer
        self.z2 = np.dot(self.x, self.w1)
        self.a2 = np.tanh(self.z2)

        # we add the 1 unit (bias) at the output of the first layer
        ba2 = np.ones((self.x.shape[0], 1))
        self.a2 = np.concatenate((self.a2, ba2), axis=1)

        # second layer
        self.z3 = np.dot(self.a2, self.w2)
        self.a3 = np.tanh(self.z3)

        # we add the 1 unit (bias) at the output of the second layer
        ba3 = np.ones((self.a3.shape[0], 1))
        self.a3 = np.concatenate((self.a3, ba3), axis=1)

        # output layer, prediction of our network
        self.z4 = np.dot(self.a3, self.w3)
        self.a4 = np.tanh(self.z4)

    def backward(self):

        # gradient of the cost function with regard to W3
        self.delta4 = np.multiply(-(self.y - self.a4), tanh_prime(self.z4))
        self.djdw3 = (self.a3.T * self.delta4) / self.m_train_set + self.Lambda * self.w3

        # gradient of the cost function with regard to W2
        self.delta3 = np.multiply(self.delta4 * self.w3.T, tanh_prime(np.concatenate((self.z3, np.ones((self.z3.shape[0], 1))), axis=1)))
        self.djdw2 = (self.a2.T * np.delete(self.delta3, 2, axis=1)) / self.m_train_set + self.Lambda * self.w2

        # gradient of the cost function with regard to W1
        self.delta2 = np.multiply(np.delete(self.delta3, 2, axis=1) * self.w2.T, tanh_prime(np.concatenate((self.z2, np.ones((self.z2.shape[0], 1))), axis=1)))
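        # (added note) delta2 has four columns because a bias column of ones was
        # concatenated to z2 above; column index 3 is deleted in the line below
        # since the bias unit receives no input, so it propagates no gradient to w1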
        self.djdw1 = (self.x.T * np.delete(self.delta2, 3, axis=1)) / self.m_train_set + self.Lambda * self.w1

    def update_gradient(self):
        self.w1 -= self.learning_rate * self.djdw1
        self.w2 -= self.learning_rate * self.djdw2
        self.w3 -= self.learning_rate * self.djdw3

    def cost_function(self):
        return 0.5 * np.sum(np.square(self.y - self.a4)) / self.m_train_set + (self.Lambda / 2) * (
            np.sum(np.square(self.w1)) +
            np.sum(np.square(self.w2)) +
            np.sum(np.square(self.w3))
        )

    def set_weights(self, weights):
        self.w1 = np.reshape(weights[0:12], (4, 3))
        self.w2 = np.reshape(weights[12:20], (4, 2))
        self.w3 = np.reshape(weights[20:23], (3, 1))

    def compute_gradients(self):
        self.forward()
        self.backward()
        self.gradient = np.concatenate((self.djdw1.ravel(), self.djdw2.ravel(), self.djdw3.ravel()), axis=1).T

    def compute_numerical_gradients(self):
        weights = np.concatenate((self.w1.ravel(), self.w2.ravel(), self.w3.ravel()), axis=1).T

        self.numericalGradient = np.zeros(weights.shape)
        perturbation = np.zeros(weights.shape)
        e = 1e-4

        for p in range(len(weights)):
            # set the perturbation vector
            perturbation[p] = e

            self.set_weights(weights + perturbation)
            self.forward()
            loss2 = self.cost_function()

            self.set_weights(weights - perturbation)
            self.forward()
            loss1 = self.cost_function()

            # centered difference approximation of the derivative
            self.numericalGradient[p] = (loss2 - loss1) / (2 * e)

            perturbation[p] = 0

        self.set_weights(weights)

    def check_gradients(self):
        self.compute_gradients()
        self.compute_numerical_gradients()
        print("Gradient checked: " + str(np.linalg.norm(self.gradient - self.numericalGradient) / np.linalg.norm(
            self.gradient + self.numericalGradient)))

    def predict(self, X):
        self.x = X
        self.forward()
        return self.a4

    def r2(self):
        y_mean = np.mean(self.y)
        ss_res = np.sum(np.square(self.y - self.a4))
        ss_tot = np.sum(np.square(self.y - y_mean))
        return 1 - (ss_res / ss_tot)

    def summary(self, step):
        print("Iteration: %d, Loss %f" % (step, self.cost_function()))
        print("RMSE: " + str(np.sqrt(np.mean(np.square(self.a4 - self.y)))))
        # the mean is taken over the current set, so this is correct on both the
        # train and the test set (the original divided by the train set size)
        print("MAE: " + str(np.mean(np.absolute(self.a4 - self.y))))
        print("R2: " + str(self.r2()))

    def predict_price(self, km, fuel, age):
        self.x = np.concatenate((self.predictor.input(km, fuel, age), np.ones((1, 1))), axis=1)
        self.forward()
        print("Predicted price: " + str(self.predictor.output(self.a4[0])))


def tanh_prime(x):
    return 1.0 - np.square(np.tanh(x))


nn = NeuralNetwork()

print("### Gradient checking ###")
nn.check_gradients()

print("### Training data ###")
nb_it = 5000
for step in range(nb_it):

    nn.forward()
    nn.backward()
    nn.update_gradient()

    if step % 100 == 0:
        nn.summary(step)

print("### Testing data ###")
nn.x = nn.x_test
nn.y = nn.y_test
nn.forward()

print("### Testing summary ###")
nn.summary(nb_it)

print("### Predict ###")
nn.predict_price(168000, "Diesel", 5)
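
# (added sketch, not in the original blog post) report the test-set MAE in euros
# rather than in the min-max normalized price space, using the dataset metadata;
# under min-max scaling, absolute errors scale by (max_price - min_price)
nn.x, nn.y = nn.x_test, nn.y_test
nn.forward()
mae_normalized = np.mean(np.absolute(nn.a4 - nn.y))
mae_euros = mae_normalized * (nn.predictor.max_price - nn.predictor.min_price)
print("Test MAE in euros: " + str(mae_euros))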
--------------------------------------------------------------------------------
/dnn_from_scratch_tensorflow.py:
--------------------------------------------------------------------------------
import csv
import os
import shutil
import numpy as np
import tensorflow as tf
import predict as util

saved_model_directory = "saved_model"

# if a saved model directory exists we delete it because TF
# will not overwrite it (and will throw an error)
if os.path.isdir(saved_model_directory):
    shutil.rmtree(saved_model_directory)

# read the data from the CSV
reader = csv.reader(open("normalized_car_features.csv", "r"), delimiter=",")
x = list(reader)
features = np.array(x[2:]).astype("float")
np.random.shuffle(features)

predict = util.Predict(float(x[0][0]), float(x[0][1]), float(x[0][2]), float(x[0][3]), float(x[0][4]),
                       float(x[0][5]))

data_x = features[:, :3]
data_y = features[:, 3:]

# size of the dataset
m = float(features.shape[0])

# size of the train set
train_set_size = int(m * 0.8)

# the data are split between the train and test set
x_data, x_test = data_x[:train_set_size, :], data_x[train_set_size:, :]
y_data, y_test = data_y[:train_set_size, :], data_y[train_set_size:, :]

# regularization strength and learning rate
Lambda = 0.01
learning_rate = 0.01

with tf.name_scope('input'):
    # training data
    x = tf.placeholder("float", name="cars")
    y = tf.placeholder("float", name="prices")

with tf.name_scope('weights'):
    w1 = tf.Variable(tf.random_normal([3, 3]), name="W1")
    w2 = tf.Variable(tf.random_normal([3, 2]), name="W2")
    w3 = tf.Variable(tf.random_normal([2, 1]), name="W3")

with tf.name_scope('biases'):
    # biases (kept separate from the weights because that is easier with TensorFlow)
    b1 = tf.Variable(tf.random_normal([1, 3]), name="b1")
    b2 = tf.Variable(tf.random_normal([1, 2]), name="b2")
    b3 = tf.Variable(tf.random_normal([1, 1]), name="b3")

with tf.name_scope('layer_1'):
    # first hidden layer
    layer_1 = tf.nn.tanh(tf.add(tf.matmul(x, w1), b1))

with tf.name_scope('layer_2'):
    # second hidden layer
    layer_2 = tf.nn.tanh(tf.add(tf.matmul(layer_1, w2), b2))

with tf.name_scope('layer_3'):
    # output layer
    layer_3 = tf.nn.tanh(tf.add(tf.matmul(layer_2, w3), b3))

with tf.name_scope('regularization'):
    # L2 regularization applied to each weight matrix
    regularization = tf.nn.l2_loss(w1) + tf.nn.l2_loss(w2) + tf.nn.l2_loss(w3)

with tf.name_scope('loss'):
    # loss function + regularization value
    loss = tf.reduce_mean(tf.square(layer_3 - y)) + Lambda * regularization
    loss = tf.Print(loss, [loss], "loss")

with tf.name_scope('train'):
    # we use gradient descent as the optimization algorithm
    train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)

# launching the previously defined model begins here
init = tf.global_variables_initializer()

# we save the model once the training is done
builder = tf.saved_model.builder.SavedModelBuilder(saved_model_directory)

with tf.Session() as session:
    session.run(init)

    # we run 10000 gradient descent iterations
    for i in range(10000):
        session.run(train_op, feed_dict={x: x_data, y: y_data})

    builder.add_meta_graph_and_variables(session, ["dnn_from_scratch_tensorflow"])
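    # (added note) add_meta_graph_and_variables captures the variable values and
    # tags the graph under "dnn_from_scratch_tensorflow"; nothing is written to
    # disk until builder.save() is called below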
    # testing the network
    print("Testing data")
    print("Loss: " + str(session.run([loss], feed_dict={x: x_test, y: y_test})[0]))

    # do a forward pass
    print("Predicted price: " + str(predict.output(session.run(layer_3,
                                                               feed_dict={x: predict.input(168000, "Diesel", 5)}))))

    # saving the model
    builder.save()

--------------------------------------------------------------------------------
/dnn_from_scratch_tensorflow_load_model.py:
--------------------------------------------------------------------------------
import csv
import tensorflow as tf
import predict as util

reader = csv.reader(open("normalized_car_features.csv", "r"), delimiter=",")
x = list(reader)

predict = util.Predict(float(x[0][0]), float(x[0][1]), float(x[0][2]), float(x[0][3]), float(x[0][4]),
                       float(x[0][5]))

saved_model_directory = "saved_model"

with tf.Session() as session:
    tf.saved_model.loader.load(session, ["dnn_from_scratch_tensorflow"], saved_model_directory)

    # do a forward pass
    print("Predicted price: " + str(predict.output(session.run('layer_3/Tanh:0',
                                                               {'input/cars:0': predict.input(168000, "Diesel", 5)}))))
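
# (added note) the tensor names above come from the name scopes defined in
# dnn_from_scratch_tensorflow.py: 'layer_3/Tanh:0' is the output of the tanh op
# inside the 'layer_3' scope, and 'input/cars:0' is the placeholder named "cars"
# inside the 'input' scope. If in doubt, the operations of the restored graph
# can be listed inside the session with, e.g.:
#
#     for op in session.graph.get_operations():
#         print(op.name)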
--------------------------------------------------------------------------------
/download_lbc_cars_data.py:
--------------------------------------------------------------------------------
# coding=utf-8

"""
This script downloads data from leboncoin.fr and saves them into a CSV file named car_features.csv.
"""
from bs4 import BeautifulSoup
import requests
try:
    from urllib.parse import urlparse, parse_qs
except ImportError:
    from urlparse import urlparse, parse_qs
import csv

# the BMW Serie 1 listing page is used
url = "https://www.leboncoin.fr/voitures/offres/rhone_alpes/occasions/?o=0&brd=Bmw&mdl=Serie%201"
r = requests.get(url)
data = r.text

soup = BeautifulSoup(data, "html.parser")
carLinks = set()
data_set = []

parsed = urlparse(soup.select('a#last')[0].get('href'))
nbPage = parse_qs(parsed.query)['o'][0]
print("There are " + str(nbPage) + " web pages to process")

# each car offer link of the first page is saved into carLinks
for link in soup.select('#listingAds > section > section > ul > li > a'):
    carLinks.add(link.get('href').replace("//", "http://"))

# for each remaining web page that contains a grid of car offers
# (each page is fetched first, then its links are collected, so the
# last page is no longer skipped)
for i in range(1, int(nbPage) + 1):

    print("Processing web page: " + str(i))

    url = "https://www.leboncoin.fr/voitures/offres/rhone_alpes/occasions/?o=" + str(i) + "&brd=Bmw&mdl=Serie%201"
    r = requests.get(url)
    soup = BeautifulSoup(r.text, "html.parser")

    # each car offer link is saved into carLinks
    for link in soup.select('#listingAds > section > section > ul > li > a'):
        carLinks.add(link.get('href').replace("//", "http://"))

# for each car link
for carLink in carLinks:

    print("Processing car page: " + carLink)

    # we load the car page
    r = requests.get(carLink)
    soup = BeautifulSoup(r.text, "html.parser")
    km = 0
    fuel = ""
    age = 0
    price = 0

    # for each attribute of the car
    for info in soup.select("div.line h2"):

        # we keep the ones that we need
        if info.select('.property')[0].text == u'Kilométrage':
            km = int(info.select('.value')[0].text.replace(" ", "").replace("KM", ""))
        if info.select('.property')[0].text == u'Carburant':
            fuel = info.select('.value')[0].text
        if info.select('.property')[0].text == u'Année-modèle':
            age = 2017 - int(info.select('.value')[0].text)
        if info.select('.property')[0].text == u'Prix':
            price = int(info.select('.value')[0].text.replace(" ", "").replace(u"€", ""))

    # each car is an array of four features added to the data_set
    data_set.append([km, fuel, age, price])

# the data_set is saved into the CSV file
fl = open('car_features.csv', 'w')
writer = csv.writer(fl)
writer.writerow(['km', 'fuel', 'age', 'price'])
for values in data_set:
    writer.writerow(values)

fl.close()

--------------------------------------------------------------------------------
/normalize_lbc_cars_data.py:
--------------------------------------------------------------------------------
"""
This script normalizes the car dataset and produces normalized_car_features.csv.
"""
import csv
import numpy as np

reader = csv.reader(open("car_features.csv", "r"), delimiter=",")
x = list(reader)
features = np.array(x).astype("str")

feature_cleaned = []

# cleaning: we keep the Diesel and Essence cars whose price is higher than 1000 euros
# (the header row is also removed)
for feature in features[1:, :]:
    if (feature[1] == 'Diesel' or feature[1] == 'Essence') and int(feature[3]) > 1000:
        feature_cleaned.append(feature)

print("Original dataset size: " + str(features.shape[0] - 1))
# an object array is used so that assigning the normalized float columns below
# does not truncate them to a fixed-width string dtype
features = np.array(feature_cleaned, dtype=object)
print("Cleaned dataset size: " + str(features.shape[0]))

# standardize kilometers: (x - mean)/std
km = features[:, 0].astype("int")
mean_km = np.mean(km)
std_km = np.std(km)
features[:, 0] = (km - mean_km)/std_km

# binarize fuel: Diesel = -1, Essence = 1
features[:, 1] = [-1 if x == 'Diesel' else 1 for x in features[:, 1]]

# standardize age: (x - mean)/std
age = features[:, 2].astype("int")
mean_age = np.mean(age)
std_age = np.std(age)
features[:, 2] = (age - mean_age)/std_age

# min-max normalize price: (x - min)/(max - min)
price = features[:, 3].astype("float")
min_price = np.min(price)
max_price = np.max(price)
features[:, 3] = (price - min_price)/(max_price - min_price)

# summary
print("Mean km: " + str(mean_km))
print("Std km: " + str(std_km))
print("Mean age: " + str(mean_age))
print("Std age: " + str(std_age))
print("Min price: " + str(min_price))
print("Max price: " + str(max_price))

fl = open('normalized_car_features.csv', 'w')

writer = csv.writer(fl)
# the first line contains the normalization metadata
writer.writerow([mean_km, std_km, mean_age, std_age, min_price, max_price])
writer.writerow(['km', 'fuel', 'age', 'price'])
for values in features:
    writer.writerow(values)

fl.close()
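
# (added sanity check, not in the original script) the first normalized price
# should map back to its original value via the inverse of the min-max scaling,
# which is exactly what predict.py applies to the network output
reconstructed = float(features[0, 3]) * (max_price - min_price) + min_price
print("First price reconstructed from its normalized value: " + str(reconstructed))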
--------------------------------------------------------------------------------
/predict.py:
--------------------------------------------------------------------------------
"""
The goal of this file is to normalize raw car attributes for the prediction, and to transform the network output back
into a price using the inverse of the normalization process.
"""
import numpy as np


class Predict:
    def __init__(self, mean_km, std_km, mean_age, std_age, min_price, max_price):
        """We save the dataset metadata."""
        self.mean_km = mean_km
        self.std_km = std_km
        self.mean_age = mean_age
        self.std_age = std_age
        self.min_price = min_price
        self.max_price = max_price

    def input(self, km, fuel, age):
        """Returns the car's data normalized using the dataset metadata."""
        km = (km - self.mean_km) / self.std_km
        fuel = -1 if fuel == 'Diesel' else 1
        age = (age - self.mean_age) / self.std_age
        return np.matrix([[
            km, fuel, age
        ]])

    def output(self, price):
        """Returns the price in euros from the output of the network, by applying the inverse of the min-max
        normalization used on the price."""
        return price * (self.max_price - self.min_price) + self.min_price

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
numpy==1.12.0
tensorflow==1.4.1
requests==2.13.0
beautifulsoup4==4.5.3

--------------------------------------------------------------------------------
/saved_model/saved_model.pb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/theflofly/dnn_from_scratch_py/2ca351078b6666f43f8794951283dbb9648bfd80/saved_model/saved_model.pb

--------------------------------------------------------------------------------
/saved_model/variables/variables.data-00000-of-00001:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/theflofly/dnn_from_scratch_py/2ca351078b6666f43f8794951283dbb9648bfd80/saved_model/variables/variables.data-00000-of-00001

--------------------------------------------------------------------------------
/saved_model/variables/variables.index:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/theflofly/dnn_from_scratch_py/2ca351078b6666f43f8794951283dbb9648bfd80/saved_model/variables/variables.index
--------------------------------------------------------------------------------