├── model.h5 ├── src ├── ml.gif ├── ml2.gif └── mnist-sample.png ├── __pycache__ └── mnist_test.cpython-36.pyc ├── mnist_test.py ├── mnist_test.ipynb ├── README.md ├── digit_recogniser.py ├── model.json ├── mnist_train.py └── mnist_train.ipynb /model.h5: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/shubham99bisht/Handwritten-digit-recognition-MNIST/HEAD/model.h5 -------------------------------------------------------------------------------- /src/ml.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/shubham99bisht/Handwritten-digit-recognition-MNIST/HEAD/src/ml.gif -------------------------------------------------------------------------------- /src/ml2.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/shubham99bisht/Handwritten-digit-recognition-MNIST/HEAD/src/ml2.gif -------------------------------------------------------------------------------- /src/mnist-sample.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/shubham99bisht/Handwritten-digit-recognition-MNIST/HEAD/src/mnist-sample.png -------------------------------------------------------------------------------- /__pycache__/mnist_test.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/shubham99bisht/Handwritten-digit-recognition-MNIST/HEAD/__pycache__/mnist_test.cpython-36.pyc -------------------------------------------------------------------------------- /mnist_test.py: -------------------------------------------------------------------------------- 1 | 2 | def model(): 3 | from keras.models import model_from_json 4 | json_file = open('model.json', 'r') 5 | loaded_model_json = json_file.read() 6 | json_file.close() 7 | loaded_model = 
model_from_json(loaded_model_json) 8 | # load weights into new model 9 | loaded_model.load_weights("model.h5") 10 | print("Loaded model from disk") 11 | 12 | 13 | return loaded_model 14 | -------------------------------------------------------------------------------- /mnist_test.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "This script will load pre-trained model from *model.h5* and *model.json* files.\n", 8 | "The pre-trained model is returned to the calling function.\n", 9 | "\n", 10 | "These files are generated after running \"mnist_train.ipynb\" file." 11 | ] 12 | }, 13 | { 14 | "cell_type": "code", 15 | "execution_count": 2, 16 | "metadata": { 17 | "collapsed": true 18 | }, 19 | "outputs": [], 20 | "source": [ 21 | "\n", 22 | "def model():\n", 23 | " from keras.models import model_from_json\n", 24 | " json_file = open('model.json', 'r')\n", 25 | " loaded_model_json = json_file.read()\n", 26 | " json_file.close()\n", 27 | " loaded_model = model_from_json(loaded_model_json)\n", 28 | " # load weights into new model\n", 29 | " loaded_model.load_weights(\"model.h5\")\n", 30 | " print(\"Loaded model from disk\")\n", 31 | "\n", 32 | "\n", 33 | " return loaded_model\n" 34 | ] 35 | }, 36 | { 37 | "cell_type": "code", 38 | "execution_count": null, 39 | "metadata": { 40 | "collapsed": true 41 | }, 42 | "outputs": [], 43 | "source": [] 44 | } 45 | ], 46 | "metadata": { 47 | "kernelspec": { 48 | "display_name": "Python 3", 49 | "language": "python", 50 | "name": "python3" 51 | }, 52 | "language_info": { 53 | "codemirror_mode": { 54 | "name": "ipython", 55 | "version": 3 56 | }, 57 | "file_extension": ".py", 58 | "mimetype": "text/x-python", 59 | "name": "python", 60 | "nbconvert_exporter": "python", 61 | "pygments_lexer": "ipython3", 62 | "version": "3.6.3" 63 | } 64 | }, 65 | "nbformat": 4, 66 | "nbformat_minor": 2 67 | } 68 | 
-------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Handwritten-digit-recognition-MNIST 2 | Handwritten digit recognition using convolutional neural networks in Python with Keras 3 | 4 | ## MNIST dataset: 5 | 6 | MNIST is a collection of handwritten digits from 0 to 9. 7 | Each image is 28 x 28 pixels. 8 | 9 | ![alt text](https://github.com/shubham99bisht/Handwritten-digit-recognition-MNIST/blob/master/src/mnist-sample.png "MNIST") 10 | 11 | ## Code Requirements 12 | Python 3.x with the following modules installed: 13 | 14 | 1. numpy 15 | 2. seaborn 16 | 3. tensorflow 17 | 4. keras 18 | 5. opencv-python (imported as `cv2`) 19 | 20 | ## Description 21 | This is a 5-layer Sequential convolutional neural network for digit recognition, trained on the MNIST dataset. I chose to build it with the Keras API (TensorFlow backend), which is very intuitive. 22 | 23 | It achieved 98.51% accuracy on the validation set when trained on a GPU, which took about a minute. If you don't have a GPU-powered machine, training might take a little longer; you can try reducing the number of epochs to cut computation. 24 | 25 | ## Execution 26 | 27 | ![alt text](https://github.com/shubham99bisht/Handwritten-digit-recognition-MNIST/blob/master/src/ml2.gif) 28 | 29 | To run the code, type: 30 | 31 | `python digit_recogniser.py` 32 | 33 | 34 | ## Tutorial 35 | **Note: This page is not complete. Sorry for the delay.** 36 | 37 | **Help is needed to complete it.** 38 | 39 | For a step-by-step tutorial, please refer to the [wiki](https://github.com/shubham99bisht/Handwritten-digit-recognition-MNIST/wiki). It takes you through all the steps, from loading the data to recognising digits through a live webcam. 40 | 41 | ## Update 42 | 43 | ### Running on GPU-enabled devices: 44 | 45 | Please uncomment the following line from **digit_recogniser.py** (line no. 
70) file: 46 | ``` 47 | tfback._get_available_gpus = _get_available_gpus 48 | ``` 49 | 50 | **Note: If you are using TensorFlow 2.1, you may get the error "AttributeError: module 'tensorflow_core._api.v2.config' has no attribute 'experimental_list_devices'".** 51 | 52 | Because `experimental_list_devices` is deprecated in TensorFlow 2.1, a small snippet is injected into the code to make it work. 53 | The snippet is taken from: https://github.com/keras-team/keras/issues/13684#issuecomment-595054461 54 | -------------------------------------------------------------------------------- /digit_recogniser.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 3 | import mnist_test 4 | import keras.backend.tensorflow_backend as tfback 5 | import tensorflow as tf 6 | 7 | #code is taken from here : https://github.com/keras-team/keras/issues/13684#issuecomment-595054461 8 | 9 | def _get_available_gpus(): 10 | """Get a list of available gpu devices (formatted as strings). 11 | 12 | # Returns 13 | A list of available GPU devices. 
14 | """ 15 | #global _LOCAL_DEVICES 16 | if tfback._LOCAL_DEVICES is None: 17 | devices = tf.config.list_logical_devices() 18 | tfback._LOCAL_DEVICES = [x.name for x in devices] 19 | return [x for x in tfback._LOCAL_DEVICES if 'device:gpu' in x.lower()] 20 | #experimental_list_devices is deprecated in tf 2.1 21 | 22 | 23 | def get_img_contour_thresh(img): 24 | x, y, w, h = 0, 0, 300, 300 25 | gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) 26 | blur = cv2.GaussianBlur(gray, (35, 35), 0) 27 | ret, thresh1 = cv2.threshold(blur, 70, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU) 28 | thresh1 = thresh1[y:y + h, x:x + w] 29 | contours, hierarchy = cv2.findContours(thresh1, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:] 30 | return img, contours, thresh1 31 | 32 | def main(): 33 | loaded_model = mnist_test.model() 34 | 35 | cap = cv2.VideoCapture(0) 36 | while (cap.isOpened()): 37 | ret, img = cap.read() 38 | img, contours, thresh = get_img_contour_thresh(img) 39 | ans1 = '' 40 | 41 | if len(contours) > 0: 42 | contour = max(contours, key=cv2.contourArea) 43 | if cv2.contourArea(contour) > 2500: 44 | x, y, w, h = cv2.boundingRect(contour) 45 | newImage = thresh[y:y + h, x:x + w] 46 | newImage = cv2.resize(newImage, (28, 28)) 47 | newImage = np.array(newImage) 48 | newImage = newImage.flatten() 49 | newImage = newImage.reshape(1, 1, 28, 28).astype('float32') / 255 # scale to [0, 1], as mnist_train.py does 50 | ans1 = loaded_model.predict(newImage) 51 | ans1=ans1.tolist() 52 | ans1 = ans1[0].index(max(ans1[0])) 53 | 54 | x, y, w, h = 0, 0, 300, 300 55 | cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2) 56 | 57 | cv2.putText(img, " Deep Network : " + str(ans1), (10, 330), 58 | cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2) 59 | 60 | #change the window size to fit screen properly 61 | #img = cv2.resize(img, (1000, 600)) 62 | cv2.imshow("Frame", img) 63 | cv2.imshow("Contours", thresh) 64 | if cv2.waitKey(1) & 0xFF == ord('q'): 65 | break 66 | 67 | cap.release() 68 | cv2.destroyAllWindows() 69 | 70 | # tfback._get_available_gpus = 
_get_available_gpus 71 | main() 72 | -------------------------------------------------------------------------------- /model.json: -------------------------------------------------------------------------------- 1 | {"class_name": "Sequential", "config": [{"class_name": "Conv2D", "config": {"name": "conv2d_3", "trainable": true, "batch_input_shape": [null, 1, 28, 28], "dtype": "float32", "filters": 30, "kernel_size": [5, 5], "strides": [1, 1], "padding": "valid", "data_format": "channels_first", "dilation_rate": [1, 1], "activation": "relu", "use_bias": true, "kernel_initializer": {"class_name": "VarianceScaling", "config": {"scale": 1.0, "mode": "fan_avg", "distribution": "uniform", "seed": null}}, "bias_initializer": {"class_name": "Zeros", "config": {}}, "kernel_regularizer": null, "bias_regularizer": null, "activity_regularizer": null, "kernel_constraint": null, "bias_constraint": null}}, {"class_name": "MaxPooling2D", "config": {"name": "max_pooling2d_3", "trainable": true, "pool_size": [2, 2], "padding": "valid", "strides": [2, 2], "data_format": "channels_first"}}, {"class_name": "Conv2D", "config": {"name": "conv2d_4", "trainable": true, "filters": 15, "kernel_size": [3, 3], "strides": [1, 1], "padding": "valid", "data_format": "channels_first", "dilation_rate": [1, 1], "activation": "relu", "use_bias": true, "kernel_initializer": {"class_name": "VarianceScaling", "config": {"scale": 1.0, "mode": "fan_avg", "distribution": "uniform", "seed": null}}, "bias_initializer": {"class_name": "Zeros", "config": {}}, "kernel_regularizer": null, "bias_regularizer": null, "activity_regularizer": null, "kernel_constraint": null, "bias_constraint": null}}, {"class_name": "MaxPooling2D", "config": {"name": "max_pooling2d_4", "trainable": true, "pool_size": [2, 2], "padding": "valid", "strides": [2, 2], "data_format": "channels_first"}}, {"class_name": "Dropout", "config": {"name": "dropout_2", "trainable": true, "rate": 0.2, "noise_shape": null, "seed": null}}, 
{"class_name": "Flatten", "config": {"name": "flatten_2", "trainable": true, "data_format": "channels_last"}}, {"class_name": "Dense", "config": {"name": "dense_3", "trainable": true, "units": 128, "activation": "relu", "use_bias": true, "kernel_initializer": {"class_name": "VarianceScaling", "config": {"scale": 1.0, "mode": "fan_avg", "distribution": "uniform", "seed": null}}, "bias_initializer": {"class_name": "Zeros", "config": {}}, "kernel_regularizer": null, "bias_regularizer": null, "activity_regularizer": null, "kernel_constraint": null, "bias_constraint": null}}, {"class_name": "Dense", "config": {"name": "dense_4", "trainable": true, "units": 50, "activation": "relu", "use_bias": true, "kernel_initializer": {"class_name": "VarianceScaling", "config": {"scale": 1.0, "mode": "fan_avg", "distribution": "uniform", "seed": null}}, "bias_initializer": {"class_name": "Zeros", "config": {}}, "kernel_regularizer": null, "bias_regularizer": null, "activity_regularizer": null, "kernel_constraint": null, "bias_constraint": null}}, {"class_name": "Dense", "config": {"name": "dense_5", "trainable": true, "units": 10, "activation": "softmax", "use_bias": true, "kernel_initializer": {"class_name": "VarianceScaling", "config": {"scale": 1.0, "mode": "fan_avg", "distribution": "uniform", "seed": null}}, "bias_initializer": {"class_name": "Zeros", "config": {}}, "kernel_regularizer": null, "bias_regularizer": null, "activity_regularizer": null, "kernel_constraint": null, "bias_constraint": null}}], "keras_version": "2.1.6", "backend": "tensorflow"} -------------------------------------------------------------------------------- /mnist_train.py: -------------------------------------------------------------------------------- 1 | # In[1]: 2 | 3 | import pandas as pd 4 | import numpy as np 5 | import matplotlib.pyplot as plt 6 | import matplotlib.image as mpimg 7 | import seaborn as sns 8 | from keras.utils import np_utils 9 | get_ipython().magic('matplotlib inline') 10 | 11 | 
12 | # Importing data from csv files: 13 | # 1. train.csv: The training data set has 785 columns. The first column, called "label", is the digit that was drawn by the user. The rest of the columns contain the pixel values of the associated image. 14 | # 2. test.csv: The test data set is the same as the training set, except that it does not contain the "label" column. 15 | # 16 | # Reading train data using pandas. 17 | # 18 | # 19 | 20 | # In[2]: 21 | 22 | 23 | data = pd.read_csv("../input/train.csv") 24 | data = data.values 25 | #Taking labels(first column) out of data. 26 | label = data[:,0] 27 | 28 | # Drop 'label' column 29 | data = data[:,1:] 30 | 31 | print("Data loaded, ready to go!") 32 | 33 | 34 | # In[3]: 35 | 36 | 37 | #plot distribution of label values 38 | g = sns.countplot(x=label) 39 | 40 | 41 | # Dividing data and label into two parts: train and validation. 42 | # We have a separate test.csv file for testing our model predictions 43 | 44 | # In[4]: 45 | 46 | 47 | #splitting data into train and valid 48 | train_data=data[:35000,:] 49 | valid_data=data[35000:,:] 50 | 51 | #reshaping to make it in proper input shape for a neural network 52 | train_data = train_data.reshape(train_data.shape[0], 1, 28, 28).astype('float32') 53 | valid_data = valid_data.reshape(valid_data.shape[0], 1, 28, 28).astype('float32') 54 | 55 | #normalise data 56 | train_data = train_data / 255 57 | valid_data= valid_data/255 58 | 59 | #splitting label into train and valid 60 | train_label = label[:35000] 61 | valid_label = label[35000:] 62 | 63 | #one-hot-encoding 64 | #Encode labels to one hot vectors (ex : 2 -> [0,0,1,0,0,0,0,0,0,0]) 65 | train_label = np_utils.to_categorical(train_label) 66 | valid_label = np_utils.to_categorical(valid_label) 67 | 68 | #print shape 69 | print("train_data shape: ",train_data.shape) 70 | print("train_label shape: ",train_label.shape) 71 | print("valid_data shape: ",valid_data.shape) 72 | print("valid_label shape: ",valid_label.shape) 
73 | 74 | 75 | # Importing modules needed to build model 76 | # Keras provides extensive support for building convolutional neural networks. 77 | 78 | # In[5]: 79 | 80 | 81 | from keras.models import Sequential 82 | from keras.layers import Dense 83 | from keras.layers import Dropout 84 | from keras.layers import Flatten 85 | from keras.layers.convolutional import Conv2D 86 | from keras.layers.convolutional import MaxPooling2D 87 | from keras.utils import np_utils 88 | from keras import backend as K 89 | K.set_image_data_format('channels_first') 90 | 91 | # fix random seed for reproducibility 92 | seed = 7 93 | np.random.seed(seed) 94 | 95 | 96 | # Define a function to create a model 97 | 98 | # In[6]: 99 | 100 | 101 | def create_model(): 102 | # create model 103 | model = Sequential() 104 | model.add(Conv2D(32, (5, 5), input_shape=(1, 28, 28), activation='relu')) 105 | model.add(MaxPooling2D(pool_size=(2, 2))) 106 | model.add(Dropout(0.2)) 107 | model.add(Flatten()) 108 | model.add(Dense(128, activation='relu')) 109 | model.add(Dense(10, activation='softmax')) 110 | # Compile model 111 | model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy']) 112 | return model 113 | 114 | 115 | # Passing training and validation data along with labels to model 116 | 117 | # In[7]: 118 | 119 | 120 | # build the model 121 | model = create_model() 122 | # Fit the model 123 | model.fit(train_data, train_label, validation_data=(valid_data, valid_label), epochs=10, batch_size=200, verbose=2) 124 | 125 | 126 | # In[8]: 127 | 128 | 129 | # Final evaluation of the model 130 | scores = model.evaluate(valid_data, valid_label, verbose=0) 131 | print("CNN Error: %.2f%%" % (100-scores[1]*100)) 132 | 133 | 134 | # Exciting! 
we have trained our model 135 | # Saving model weights for later use 136 | 137 | # In[9]: 138 | 139 | 140 | model.save("model.h5") 141 | print("model weights saved in model.h5 file") 142 | 143 | 144 | # Saving model information in .json file 145 | 146 | # In[10]: 147 | 148 | 149 | from keras.models import model_from_json 150 | model_json = model.to_json() 151 | with open("model.json", "w") as json_file: 152 | json_file.write(model_json) 153 | print("model saved as model.json file") 154 | -------------------------------------------------------------------------------- /mnist_train.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "metadata": { 5 | "_uuid": "1976eb3fb0f4524f8641833247f7eef69b6afebd" 6 | }, 7 | "cell_type": "markdown", 8 | "source": "Importing required python modules\n1. pandas and numpy for reading csv files and saving the results\n2. matplotlib and seaborn for plotting graphs to visualise and make sense of data\n3. np_utils for \"one-hot encoding\" of labels" 9 | }, 10 | { 11 | "metadata": { 12 | "_uuid": "8f2839f25d086af736a60e9eeb907d3b93b6e0e5", 13 | "_cell_guid": "b1076dfc-b9ad-4769-8c92-a6c4dae69d19", 14 | "trusted": true 15 | }, 16 | "cell_type": "code", 17 | "source": "import pandas as pd\nimport numpy as np\nimport matplotlib.pyplot as plt\nimport matplotlib.image as mpimg\nimport seaborn as sns\nfrom keras.utils import np_utils\n%matplotlib inline", 18 | "execution_count": 1, 19 | "outputs": [ 20 | { 21 | "output_type": "stream", 22 | "text": "/opt/conda/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. 
In future, it will be treated as `np.float64 == np.dtype(float).type`.\n from ._conv import register_converters as _register_converters\nUsing TensorFlow backend.\n", 23 | "name": "stderr" 24 | } 25 | ] 26 | }, 27 | { 28 | "metadata": { 29 | "_uuid": "4d261d218055b3551257657fb6ae7070789e04fb" 30 | }, 31 | "cell_type": "markdown", 32 | "source": "Importing data from csv files:\n1. train.csv : The training data set, has 785 columns. The first column, called \"label\", is the digit that was drawn by the user. The rest of the columns contain the pixel-values of the associated image.\n2. test.csv: The test data set, (test.csv), is the same as the training set, except that it does not contain the \"label\" column.\n\nReading train data using pandas.\n\n" 33 | }, 34 | { 35 | "metadata": { 36 | "_cell_guid": "79c7e3d0-c299-4dcb-8224-4455121ee9b0", 37 | "_uuid": "d629ff2d2480ee46fbb7e2d37f6b5fab8052498a", 38 | "trusted": true 39 | }, 40 | "cell_type": "code", 41 | "source": "data = pd.read_csv(\"../input/train.csv\")\ndata = data.values\n#Taking labels(first column) out of data.\nlabel = data[:,0]\n\n# Drop 'label' column\ndata = data[:,1:]\n\nprint(\"Data loaded, ready to go!\")", 42 | "execution_count": 2, 43 | "outputs": [ 44 | { 45 | "output_type": "stream", 46 | "text": "Data loaded, ready to go!\n", 47 | "name": "stdout" 48 | } 49 | ] 50 | }, 51 | { 52 | "metadata": { 53 | "trusted": true, 54 | "_uuid": "d3753a7add38650a9b5e7efa0b5b8527a1411b40" 55 | }, 56 | "cell_type": "code", 57 | "source": "#plot distribution of label values\ng = sns.countplot(label)", 58 | "execution_count": 3, 59 | "outputs": [ 60 | { 61 | "output_type": "display_data", 62 | "data": { 63 | "text/plain": "", 64 | "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAY4AAAD8CAYAAABgmUMCAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAAEZtJREFUeJzt3X2wXVV9xvHvQ4IvqAjK1WJCG9pS\nR2pbpRmkZQYdaAFfoQ5YnKoZS4dOBx1sO219mSlWy4xOfWutpcMYNKiVUtBKHaY2BV9aO4IJoALR\nkqqFFGpig/hWX2J//eOsyDHcJHfBPfucm/v9zNw5e6+9zlm/XG54stdee99UFZIkLdRB0y5AkrS0\nGBySpC4GhySpi8EhSepicEiSuhgckqQuBockqYvBIUnqYnBIkrqsnHYBk3DEEUfUmjVrpl2GJC0p\nmzdv/mpVze2v3wEZHGvWrGHTpk3TLkOSlpQk/7mQfk5VSZK6GBySpC4GhySpi8EhSepicEiSuhgc\nkqQuBockqYvBIUnqYnBIkrockHeOz6I7Xvdzg43143/8ucHGkrT8eMYhSepicEiSuhgckqQuBock\nqYvBIUnqYnBIkroYHJKkLgaHJKmLwSFJ6mJwSJK6GBySpC4+q0rSTHjta197QI51IPKMQ5LUxTMO\nDe7jJz19sLGe/omPDzaWtFx4xiFJ6mJwSJK6GBySpC4GhySpi8EhSepicEiSuhgckqQu3sexzJz4\n9hMHGeeTL//kIONIB6JfuPIjg431mbNO636PZxySpC7L4ozjF//gskHG2fxnLxlkHGmxbbnoukHG\nedJrTh5kHE2WZxySpC4TD44kK5LclOTDbf/oJNcnuT3J3yZ5SGt/aNvf2o6vGfuMV7X2LyTpn5CT\nJC2aIaaqLgC2AIe2/TcCb62qy5P8NXAucHF7vaeqfjrJOa3fryc5FjgH+FngCcA/J/mZqvrBALXr\nAPaXv/8Pg4zzsjc/d5BxtDiu+LvjBxnnBWffMMg4kzDRM44kq4FnA+9s+wFOBq5sXTYAZ7btM9o+\n7fgprf8ZwOVV9d2q+hKwFRjmv6wk6X4mPVX1NuAPgf9r+48FvlZVu9r+NmBV214F3AnQjt/b+v+w\nfZ73SJIGNrHgSPIcYHtVbR5vnqdr7efYvt4zPt55STYl2bRjx47ueiVJCzPJM44Tgecl+TJwOaMp\nqrcBhyXZfW1lNXBX294GHAXQjj8a2DnePs97fqiqLqmqtVW1dm5ubvH/NJIkYILBUVWvqqrVVbWG\n0cXt66rqN4CPAme1buuAD7Xtq9s+7fh1VVWt/Zy26upo4Bhg6V5VkqQlbho3AP4RcHmSPwVuAta3\n9vXAe5JsZXSmcQ5AVd2a5ArgNmAXcL4rqiRpegYJjqr6GPCxtv1F5lkVVVXfAc7ey/svAi6aXIWS\npIXyznFJUheDQ5LUxeCQJHUxOCRJXZbFY9WlWXXRi87af6dF8pr3Xrn/TtICeMYhSepicEiSuhgc\nkqQuBockqYvBIUnqYnBIkroYHJKkLgaHJKmLwSFJ6mJwSJK6GBySpC4GhySpi8EhSepicEiSuhgc\nkqQuBockqYvBIUnqYnBIkroYHJKkLgaHJKmLwSFJ6mJwSJK6GBySpC4GhySpi8EhSepicEiSuhgc\nkqQuBockqYvBIUnqYnBIkroYHJKkLhMLjiQPS3JDks8kuTXJn7T2o5Ncn+T2JH+b5CGt/aFtf2s7\nvmbss17V2r+Q5LRJ1SxJ2r9JnnF8Fzi5qn4BeApwepITgDcCb62qY4B7gHNb/3OBe6rqp4G3tn4k\nORY4B/hZ4HTgr5KsmGDdkqR9mFhw1Mg32+7B7auAk4ErW/sG4My2fUbbpx0/JUla++VV9d2q+hKw\nFTh+UnVLkvZtotc4kqxIcjOwHdgI/Afwtara1bpsA1a17VXAn
QDt+L3AY8fb53nP+FjnJdmUZNOO\nHTsm8ceRJDHh4KiqH1TVU4DVjM4SnjRft/aavRzbW/ueY11SVWurau3c3NwDLVmStB+DrKqqqq8B\nHwNOAA5LsrIdWg3c1ba3AUcBtOOPBnaOt8/zHknSwCa5qmouyWFt++HArwBbgI8CZ7Vu64APte2r\n2z7t+HVVVa39nLbq6mjgGOCGSdUtSdq3lfvv8oAdCWxoK6AOAq6oqg8nuQ24PMmfAjcB61v/9cB7\nkmxldKZxDkBV3ZrkCuA2YBdwflX9YIJ1S5L2YWLBUVWfBZ46T/sXmWdVVFV9Bzh7L591EXDRYtco\nSernneOSpC4GhySpi8EhSepicEiSuhgckqQuBockqcuCgiPJtQtpkyQd+PZ5H0eShwGHAEckOZz7\nnht1KPCECdcmSZpB+7sB8LeBVzAKic3cFxxfB94xwbokSTNqn8FRVX8O/HmSl1fV2weqSZI0wxb0\nyJGqenuSXwbWjL+nqi6bUF2SpBm1oOBI8h7gp4Cbgd0PGCzA4JCkZWahDzlcCxzbHnMuSVrGFnof\nxy3Aj02yEEnS0rDQM44jgNuS3AB8d3djVT1vIlVJkmbWQoPjtZMsQpK0dCx0VdXHJ12IJGlpWOiq\nqm8wWkUF8BDgYOBbVXXopAqTJM2mhZ5xPGp8P8mZzPPrXyVJB74H9HTcqvp74ORFrkWStAQsdKrq\n+WO7BzG6r8N7OiRpGVroqqrnjm3vAr4MnLHo1UiSZt5Cr3G8dNKFSJKWhoX+IqfVST6YZHuSryS5\nKsnqSRcnSZo9C704/i7gaka/l2MV8A+tTZK0zCw0OOaq6l1Vtat9vRuYm2BdkqQZtdDg+GqSFyVZ\n0b5eBPzPJAuTJM2mhQbHbwIvAP4buBs4C/CCuSQtQwtdjvt6YF1V3QOQ5DHAmxgFiiRpGVnoGcfP\n7w4NgKraCTx1MiVJkmbZQoPjoCSH795pZxwLPVuRJB1AFvo//zcD/5bkSkaPGnkBcNHEqpIkzayF\n3jl+WZJNjB5sGOD5VXXbRCuTJM2kBU83taAwLCRpmXtAj1WXJC1fEwuOJEcl+WiSLUluTXJBa39M\nko1Jbm+vh7f2JPmLJFuTfDbJcWOfta71vz3JuknVLEnav0mecewCfr+qngScAJyf5FjglcC1VXUM\ncG3bB3gmcEz7Og+4GH64gutC4GmMfuvgheMrvCRJw5pYcFTV3VV1Y9v+BrCF0QMSzwA2tG4bgDPb\n9hnAZTXyKeCwJEcCpwEbq2pnu5dkI3D6pOqWJO3bINc4kqxhdMPg9cDjq+puGIUL8LjWbRVw59jb\ntrW2vbXvOcZ5STYl2bRjx47F/iNIkpqJB0eSRwJXAa+oqq/vq+s8bbWP9h9tqLqkqtZW1dq5OR/c\nK0mTMtHgSHIwo9B4X1V9oDV/pU1B0V63t/ZtwFFjb18N3LWPdknSFExyVVWA9cCWqnrL2KGrgd0r\no9YBHxprf0lbXXUCcG+byvoIcGqSw9tF8VNbmyRpCib5vKkTgRcDn0tyc2t7NfAG4Iok5wJ3AGe3\nY9cAzwK2At+mPba9qnYmeT3w6dbvde0hi5KkKZhYcFTVvzL/9QmAU+bpX8D5e/msS4FLF686SdID\n5Z3jkqQuBockqYvBIUnqYnBIkroYHJKkLgaHJKmLwSFJ6mJwSJK6GBySpC4GhySpi8EhSepicEiS\nuhgckqQuBockqYvBIUnqYnBIkroYHJKkLgaHJKmLwSFJ6mJwSJK6GBySpC4GhySpi8EhSepicEiS\nuhgckqQuBockqYvBIUnqYnBIkroYHJKkLgaHJKmLwSFJ6mJwSJK6GBySpC4GhySpi8EhSeoyseBI\ncmmS7UluGWt7TJKNSW5vr4e39iT5iyRbk3w2yXFj71nX+t+eZN2k6pUkLcwkzzjeDZy+R9srgWur\n6hjg2rYP8EzgmPZ1HnAxj
IIGuBB4GnA8cOHusJEkTcfEgqOqPgHs3KP5DGBD294AnDnWflmNfAo4\nLMmRwGnAxqraWVX3ABu5fxhJkgY09DWOx1fV3QDt9XGtfRVw51i/ba1tb+2SpCmZlYvjmaet9tF+\n/w9IzkuyKcmmHTt2LGpxkqT7DB0cX2lTULTX7a19G3DUWL/VwF37aL+fqrqkqtZW1dq5ublFL1yS\nNDJ0cFwN7F4ZtQ740Fj7S9rqqhOAe9tU1keAU5Mc3i6Kn9raJElTsnJSH5zk/cAzgCOSbGO0OuoN\nwBVJzgXuAM5u3a8BngVsBb4NvBSgqnYmeT3w6dbvdVW15wV3SdKAJhYcVfXCvRw6ZZ6+BZy/l8+5\nFLh0EUuTJD0Is3JxXJK0RBgckqQuBockqYvBIUnqYnBIkroYHJKkLgaHJKmLwSFJ6mJwSJK6GByS\npC4GhySpi8EhSepicEiSuhgckqQuBockqYvBIUnqYnBIkroYHJKkLgaHJKmLwSFJ6mJwSJK6GByS\npC4GhySpi8EhSepicEiSuhgckqQuBockqYvBIUnqYnBIkroYHJKkLgaHJKmLwSFJ6mJwSJK6GByS\npC4GhySpy5IJjiSnJ/lCkq1JXjnteiRpuVoSwZFkBfAO4JnAscALkxw73aokaXlaEsEBHA9sraov\nVtX3gMuBM6ZckyQtS0slOFYBd47tb2ttkqSBpaqmXcN+JTkbOK2qfqvtvxg4vqpePtbnPOC8tvtE\n4AsPctgjgK8+yM9YDLNQxyzUALNRhzXcZxbqmIUaYDbqWIwafqKq5vbXaeWDHGQo24CjxvZXA3eN\nd6iqS4BLFmvAJJuqau1ifd5SrmMWapiVOqxhtuqYhRpmpY4ha1gqU1WfBo5JcnSShwDnAFdPuSZJ\nWpaWxBlHVe1K8jLgI8AK4NKqunXKZUnSsrQkggOgqq4BrhlwyEWb9nqQZqGOWagBZqMOa7jPLNQx\nCzXAbNQxWA1L4uK4JGl2LJVrHJKkGWFwzGPajzdJcmmS7UluGXrsPeo4KslHk2xJcmuSC6ZQw8OS\n3JDkM62GPxm6hrFaViS5KcmHp1jDl5N8LsnNSTZNsY7DklyZ5PPt5+OXBh7/ie17sPvr60leMWQN\nrY7fbT+XtyR5f5KHDV1Dq+OCVsOtQ3wfnKraQ3u8yb8Dv8poGfCngRdW1W0D1nAS8E3gsqp68lDj\nzlPHkcCRVXVjkkcBm4EzB/5eBHhEVX0zycHAvwIXVNWnhqphrJbfA9YCh1bVc4Yev9XwZWBtVU31\nnoEkG4B/qap3tpWOh1TV16ZUywrgv4CnVdV/DjjuKkY/j8dW1f8muQK4pqrePVQNrY4nM3qaxvHA\n94B/BH6nqm6f1Jiecdzf1B9vUlWfAHYOOeZe6ri7qm5s298AtjDwHfs18s22e3D7GvxfO0lWA88G\n3jn02LMmyaHAScB6gKr63rRCozkF+I8hQ2PMSuDhSVYCh7DH/WUDeRLwqar6dlXtAj4O/NokBzQ4\n7s/Hm8wjyRrgqcD1Uxh7RZKbge3AxqoavAbgbcAfAv83hbHHFfBPSTa3pyVMw08CO4B3tam7dyZ5\nxJRqgdF9Xe8fetCq+i/gTcAdwN3AvVX1T0PXAdwCnJTksUkOAZ7Fj94wvegMjvvLPG3Lej4vySOB\nq4BXVNXXhx6/qn5QVU9h9MSA49up+WCSPAfYXlWbhxx3L06squMYPSn6/DatObSVwHHAxVX1VOBb\nwFR+1UGbJnse8HdTGPtwRrMRRwNPAB6R5EVD11FVW4A3AhsZTVN9Btg1yTENjvvb7+NNlpN2XeEq\n4H1V9YFp1tKmQz4GnD7w0CcCz2vXFy4HTk7y3oFrAKCq7mqv24EPMppaHdo2YNvYmd+VjIJkGp4J\n3FhVX5nC2L8CfKmqdlTV94EPAL88hTqoqvVVdVxVncRomnti1zfA4JiPjzdp2oXp9cCWqnr
LlGqY\nS3JY2344o7+snx+yhqp6VVWtrqo1jH4erquqwf9lmeQRbZECbWroVEbTFIOqqv8G7kzyxNZ0CjDY\ngok9vJApTFM1dwAnJDmk/V05hdF1wMEleVx7/XHg+Uz4e7Jk7hwfyiw83iTJ+4FnAEck2QZcWFXr\nh6yhORF4MfC5do0B4NXtLv6hHAlsaCtnDgKuqKqpLYedsscDHxz9P4qVwN9U1T9OqZaXA+9r/7j6\nIvDSoQto8/m/Cvz20GMDVNX1Sa4EbmQ0NXQT07uD/KokjwW+D5xfVfdMcjCX40qSujhVJUnqYnBI\nkroYHJKkLgaHJKmLwSFJ6mJwSJK6GBySpC4GhySpy/8DQBTOC9jS9A8AAAAASUVORK5CYII=\n" 65 | }, 66 | "metadata": {} 67 | } 68 | ] 69 | }, 70 | { 71 | "metadata": { 72 | "trusted": true, 73 | "collapsed": true, 74 | "_uuid": "31973fa0f4f20d2ffec4eb8e3668962fc014289c" 75 | }, 76 | "cell_type": "markdown", 77 | "source": "Diving data and label into two parts: train and validation.\nWe have a separate test.csv file for testing our model predictions" 78 | }, 79 | { 80 | "metadata": { 81 | "trusted": true, 82 | "_uuid": "053b9dba1b91c71305e19343230b7329538e19f2" 83 | }, 84 | "cell_type": "code", 85 | "source": "#splitting data into train and valid\ntrain_data=data[:35000,:]\nvalid_data=data[35000:,:]\n\n#reshaping to make it in proper input shape for a neural network\ntrain_data = train_data.reshape(train_data.shape[0], 1, 28, 28).astype('float32')\nvalid_data = valid_data.reshape(valid_data.shape[0], 1, 28, 28).astype('float32')\n\n#normalise data\ntrain_data = train_data / 255\nvalid_data= valid_data/255\n\n#spliting label into train and valid\ntrain_label = label[:35000]\nvalid_label = label[35000:]\n\n#one-hot-encoding\n#Encode labels to one hot vectors (ex : 2 -> [0,0,1,0,0,0,0,0,0,0])\ntrain_label = np_utils.to_categorical(train_label)\nvalid_label = np_utils.to_categorical(valid_label)\n\n#print shape\nprint(\"train_data shape: \",train_data.shape)\nprint(\"train_label shape: \",train_label.shape)\nprint(\"valid_data shape: \",valid_data.shape)\nprint(\"valid_label shape: \",valid_label.shape)", 86 | "execution_count": 4, 87 | "outputs": [ 88 | { 89 | "output_type": "stream", 90 | "text": "train_data shape: (35000, 1, 28, 28)\ntrain_label shape: (35000, 
10)\nvalid_data shape: (7000, 1, 28, 28)\nvalid_label shape: (7000, 10)\n", 91 | "name": "stdout" 92 | } 93 | ] 94 | }, 95 | { 96 | "metadata": { 97 | "_uuid": "1af9d926660eb3880ad0e4cb5c66c032f11ede59" 98 | }, 99 | "cell_type": "markdown", 100 | "source": "Importing modules needed to build model\nKeras does provide a lot of capability for creating convolutional neural networks." 101 | }, 102 | { 103 | "metadata": { 104 | "trusted": true, 105 | "_uuid": "4c3c18e6f503ce6a56e12c73edc1207c9a5d3b5d", 106 | "collapsed": true 107 | }, 108 | "cell_type": "code", 109 | "source": "from keras.models import Sequential\nfrom keras.layers import Dense\nfrom keras.layers import Dropout\nfrom keras.layers import Flatten\nfrom keras.layers.convolutional import Conv2D\nfrom keras.layers.convolutional import MaxPooling2D\nfrom keras.utils import np_utils\nfrom keras import backend as K\nK.set_image_dim_ordering('th')\n\n# fix random seed for reproducibility\nseed = 7\nnp.random.seed(seed)", 110 | "execution_count": 5, 111 | "outputs": [] 112 | }, 113 | { 114 | "metadata": { 115 | "_uuid": "c69c0701c1084022f59dc1ffec33ec8886866402" 116 | }, 117 | "cell_type": "markdown", 118 | "source": "Define a function to create a model" 119 | }, 120 | { 121 | "metadata": { 122 | "trusted": true, 123 | "collapsed": true, 124 | "_uuid": "dff559899ab9a7705d9c2dcec5cdbdf61d4ebf07" 125 | }, 126 | "cell_type": "code", 127 | "source": "def create_model():\n # create model\n model = Sequential()\n model.add(Conv2D(32, (5, 5), input_shape=(1, 28, 28), activation='relu'))\n model.add(MaxPooling2D(pool_size=(2, 2)))\n model.add(Dropout(0.2))\n model.add(Flatten())\n model.add(Dense(128, activation='relu'))\n model.add(Dense(10, activation='softmax'))\n # Compile model\n model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])\n return model", 128 | "execution_count": 6, 129 | "outputs": [] 130 | }, 131 | { 132 | "metadata": { 133 | "_uuid": 
"53f4a7b167027287b525314767a9267b5ddaf7cf" 134 | }, 135 | "cell_type": "markdown", 136 | "source": "Passing training and validation data along with labels to model" 137 | }, 138 | { 139 | "metadata": { 140 | "trusted": true, 141 | "_uuid": "b1124814d1f7d24b2a1e33cec590ff92807cfee3" 142 | }, 143 | "cell_type": "code", 144 | "source": "# build the model\nmodel = create_model()\n# Fit the model\nmodel.fit(train_data, train_label, validation_data=(valid_data, valid_label), epochs=10, batch_size=200, verbose=2)\n", 145 | "execution_count": 7, 146 | "outputs": [ 147 | { 148 | "output_type": "stream", 149 | "text": "Train on 35000 samples, validate on 7000 samples\nEpoch 1/10\n - 5s - loss: 0.3262 - acc: 0.9066 - val_loss: 0.1186 - val_acc: 0.9660\nEpoch 2/10\n - 2s - loss: 0.0956 - acc: 0.9716 - val_loss: 0.0731 - val_acc: 0.9774\nEpoch 3/10\n - 2s - loss: 0.0646 - acc: 0.9810 - val_loss: 0.0633 - val_acc: 0.9804\nEpoch 4/10\n - 2s - loss: 0.0513 - acc: 0.9847 - val_loss: 0.0563 - val_acc: 0.9829\nEpoch 5/10\n - 2s - loss: 0.0408 - acc: 0.9872 - val_loss: 0.0514 - val_acc: 0.9844\nEpoch 6/10\n - 2s - loss: 0.0309 - acc: 0.9904 - val_loss: 0.0570 - val_acc: 0.9833\nEpoch 7/10\n - 2s - loss: 0.0282 - acc: 0.9916 - val_loss: 0.0655 - val_acc: 0.9800\nEpoch 8/10\n - 2s - loss: 0.0231 - acc: 0.9929 - val_loss: 0.0478 - val_acc: 0.9859\nEpoch 9/10\n - 2s - loss: 0.0211 - acc: 0.9937 - val_loss: 0.0521 - val_acc: 0.9854\nEpoch 10/10\n - 2s - loss: 0.0190 - acc: 0.9939 - val_loss: 0.0507 - val_acc: 0.9851\n", 150 | "name": "stdout" 151 | }, 152 | { 153 | "output_type": "execute_result", 154 | "execution_count": 7, 155 | "data": { 156 | "text/plain": "" 157 | }, 158 | "metadata": {} 159 | } 160 | ] 161 | }, 162 | { 163 | "metadata": { 164 | "trusted": true, 165 | "_uuid": "23e342cce3be6515dcfb10bd780b591bfbaeb2a3" 166 | }, 167 | "cell_type": "code", 168 | "source": "# Final evaluation of the model\nscores = model.evaluate(valid_data, valid_label, verbose=0)\nprint(\"CNN Error: 
%.2f%%\" % (100-scores[1]*100))", 169 | "execution_count": 8, 170 | "outputs": [ 171 | { 172 | "output_type": "stream", 173 | "text": "CNN Error: 1.49%\n", 174 | "name": "stdout" 175 | } 176 | ] 177 | }, 178 | { 179 | "metadata": { 180 | "_uuid": "25377bafcf1ca418f32cab5ab4f68558c0a916bd" 181 | }, 182 | "cell_type": "markdown", 183 | "source": "Exciting! we have trained our model\nSaving model weights for later use" 184 | }, 185 | { 186 | "metadata": { 187 | "trusted": true, 188 | "_uuid": "dfe9153fbbdde64a4d5d754f9ce1e466095e018a" 189 | }, 190 | "cell_type": "code", 191 | "source": "model.save(\"model.h5\")\nprint(\"model weights saved in model.h5 file\")", 192 | "execution_count": 9, 193 | "outputs": [ 194 | { 195 | "output_type": "stream", 196 | "text": "model weights saved in model.h5 file\n", 197 | "name": "stdout" 198 | } 199 | ] 200 | }, 201 | { 202 | "metadata": { 203 | "_uuid": "7998d66d539b2f50f9a70e735f39484240c3835b" 204 | }, 205 | "cell_type": "markdown", 206 | "source": "Saving model information in .json file" 207 | }, 208 | { 209 | "metadata": { 210 | "trusted": true, 211 | "_uuid": "2f8de0384de4e7395cb43ce0e593b4479f80fe34" 212 | }, 213 | "cell_type": "code", 214 | "source": "from keras.models import model_from_json\nmodel_json = model.to_json()\nwith open(\"model.json\", \"w\") as json_file:\n json_file.write(model_json)\nprint(\"model saved as model.json file\")", 215 | "execution_count": 10, 216 | "outputs": [ 217 | { 218 | "output_type": "stream", 219 | "text": "model saved as model.json file\n", 220 | "name": "stdout" 221 | } 222 | ] 223 | } 224 | ], 225 | "metadata": { 226 | "kernelspec": { 227 | "display_name": "Python 3", 228 | "language": "python", 229 | "name": "python3" 230 | }, 231 | "language_info": { 232 | "name": "python", 233 | "version": "3.6.4", 234 | "mimetype": "text/x-python", 235 | "codemirror_mode": { 236 | "name": "ipython", 237 | "version": 3 238 | }, 239 | "pygments_lexer": "ipython3", 240 | "nbconvert_exporter": "python", 
241 | "file_extension": ".py" 242 | } 243 | }, 244 | "nbformat": 4, 245 | "nbformat_minor": 1 246 | } --------------------------------------------------------------------------------
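The preprocessing done in mnist_train.py (take the label from the first column, reshape each 784-pixel row to 1 x 28 x 28, scale pixels to [0, 1], one-hot encode the label) can be illustrated on a single row. This is a minimal, dependency-free sketch rather than the repository's code: `preprocess_row` and `one_hot` are hypothetical helper names, and plain Python lists stand in for the NumPy arrays and `np_utils.to_categorical` used by the training script.

```python
# Hypothetical helpers illustrating the preprocessing steps of mnist_train.py
# on one CSV row, using only plain Python lists (no NumPy or Keras required).

NUM_CLASSES = 10
SIDE = 28  # MNIST images are 28 x 28 pixels


def one_hot(label, num_classes=NUM_CLASSES):
    """Return a one-hot list, e.g. 2 -> [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range: %r" % (label,))
    return [1 if i == label else 0 for i in range(num_classes)]


def preprocess_row(row):
    """Split a train.csv-style row into (image, one-hot label).

    row: sequence of 785 integers; the first is the label and the rest are
    pixel values in 0..255. The image comes back in channels-first layout
    (1 x 28 x 28) with values scaled to [0, 1].
    """
    if len(row) != 1 + SIDE * SIDE:
        raise ValueError("expected 785 values, got %d" % len(row))
    label, pixels = row[0], row[1:]
    scaled = [p / 255 for p in pixels]
    rows = [scaled[r * SIDE:(r + 1) * SIDE] for r in range(SIDE)]
    return [rows], one_hot(label)


if __name__ == "__main__":
    # Synthetic row: label 2, every pixel 0 except the last one (255).
    example = [2] + [0] * 783 + [255]
    image, target = preprocess_row(example)
    print(len(image), len(image[0]), len(image[0][0]))
    print(image[0][27][27], target)
```

Whatever scaling is used for training should also be applied to frames at inference time, so that the network sees inputs in the same range in both cases.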