├── .gitignore
├── digits.png
├── test_image.png
├── final_digits.png
├── original_overlay.png
├── custom_train_digits.jpg
├── training_box_overlay.png
├── needed.py
├── requirements.txt
├── README.md
├── digit_recog.py
└── NEW_digit_recog.py

/.gitignore:
--------------------------------------------------------------------------------
1 | .idea
2 | __pycache__/
3 | github-test/
--------------------------------------------------------------------------------
/digits.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/pavitrakumar78/Python-Custom-Digit-Recognition/HEAD/digits.png
--------------------------------------------------------------------------------
/test_image.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/pavitrakumar78/Python-Custom-Digit-Recognition/HEAD/test_image.png
--------------------------------------------------------------------------------
/final_digits.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/pavitrakumar78/Python-Custom-Digit-Recognition/HEAD/final_digits.png
--------------------------------------------------------------------------------
/original_overlay.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/pavitrakumar78/Python-Custom-Digit-Recognition/HEAD/original_overlay.png
--------------------------------------------------------------------------------
/custom_train_digits.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/pavitrakumar78/Python-Custom-Digit-Recognition/HEAD/custom_train_digits.jpg
--------------------------------------------------------------------------------
/training_box_overlay.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/pavitrakumar78/Python-Custom-Digit-Recognition/HEAD/training_box_overlay.png
--------------------------------------------------------------------------------
/needed.py:
--------------------------------------------------------------------------------
1 | from PIL import Image
2 | from numpy import asarray
3 |
4 |
5 | def imresize(arr, size):
6 |     img = Image.fromarray(arr)
7 |     img = img.resize(size)
8 |     return asarray(img)
9 |
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | cycler==0.10.0
2 | dask==2021.10.0
3 | decorator==4.4.2
4 | joblib==1.2.0
5 | kiwisolver==1.3.1
6 | matplotlib==3.3.3
7 | networkx==2.5
8 | numpy==1.22.0
9 | opencv-python==4.2.0.32
10 | Pillow==10.2.0
11 | pkg-resources==0.0.0
12 | pyparsing==2.4.7
13 | python-dateutil==2.8.1
14 | PyYAML==5.4
15 | scikit-image==0.12.1
16 | scikit-learn==1.5.0
17 | scipy==1.10.0
18 | six==1.15.0
19 | toolz==0.11.1
20 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Python-Custom-Digit-Recognition
2 |
3 | You can apply a simple OCR on your own handwritten digits using this Python script.
4 | I have used OpenCV to pre-process the image and to extract the digits from the picture.
5 | Using K-Nearest Neighbours (or SVM) as my model, I trained it on my own handwritten data set. I have also included the freely [available](http://yann.lecun.com/exdb/mnist/) MNIST data set so you can experiment with how different datasets work with different handwriting.
6 |
7 | ## Analysis
8 | I first tried using just the extracted pixels as data to train and predict the digits, but the accuracy was too low even with popular classification algorithms like SVM, KNN and neural networks.
I did improve the accuracy a little by trying some custom threshold values. The best accuracy I could achieve using only pixel values was about 55-60%, and that was after converting all the images from grayscale ("black and white") to strictly black or white pixels.
9 |
10 | After searching and reading about feature extraction from images for OCR, I stumbled upon [HOG](https://en.wikipedia.org/wiki/Histogram_of_oriented_gradients) (Histogram of Oriented Gradients). Basically, it tries to capture the shape of structures in a region by capturing information about gradients. Image gradients are simply intensity changes across pixels in an image.
11 |
12 | ![pic-explain](https://gilscvblog.files.wordpress.com/2013/08/figure5.jpg "pic")
13 |
14 |
15 | It works by dividing the image into small cells (usually 8x8 pixels) and blocks of 4x4 cells. Each cell has a fixed number of gradient orientation bins. Each pixel in the cell votes for a gradient orientation bin with a vote proportional to the gradient magnitude at that pixel. Simply put, the "histogram" counts how many pixels have an edge with a specific orientation. For more info, please refer to [this](https://gilscvblog.wordpress.com/2013/08/18/a-short-introduction-to-descriptors/) blog post.
16 |
17 | Using only HOG histogram vectors as features drastically improved the accuracy of the prediction. Currently, I use KNN from OpenCV as my model. I tried SVM from the same module, but its accuracy was not as good as KNN's. The best accuracy I have achieved on a sample image of about 100 digits is 80%.
In the future, I might add more features after looking into SIFT and SURF, or try to get better accuracy using just plain pixels as data.
18 |
19 | ## Usage
20 |
21 | `digit_recog.py` *is deprecated - may not work with newer versions of libraries*
22 |
23 | UPDATED CODE: `NEW_digit_recog.py`
24 |
25 | To run the code, download/clone the repo and execute:
26 | ```python NEW_digit_recog.py ```
27 |
28 | This code uses my own handwritten digits (`custom_train_digits.jpg`) as training data. You can also use your own, but keep the positioning of the digits similar to what is in the `custom_train_digits.jpg` file. If you change the format of the custom training data (your handwritten digits), make sure to edit the `load_digits_custom` function in `NEW_digit_recog.py` accordingly.
29 |
30 | Executing the program will generate 2 output files:
31 |
32 | This is the original image with digit boxes and the numbers on the top.
33 | ![original_overlay](https://github.com/pavitrakumar78/Python-Custom-Digit-Recognition/blob/master/original_overlay.png)
34 | This is a plain image with just the recognized numbers printed.
35 | ![final_digits](https://github.com/pavitrakumar78/Python-Custom-Digit-Recognition/blob/master/final_digits.png)
36 |
37 | ### Note:
38 | - User image should be a scanned (at least 300 dpi) image.
39 | - Image can be any format supported by OpenCV.
40 |
41 | In `NEW_digit_recog.py`, use either
42 | ```digits, labels = load_digits(TRAIN_MNIST_IMG) #original MNIST data```
43 | for the MNIST dataset OR
44 | ```digits, labels = load_digits_custom('custom_train_digits.jpg') #my handwritten dataset```
45 | for your own custom dataset.
46 |
47 | Edit `TRAIN_MNIST_IMG`, `TRAIN_USER_IMG` and `TEST_USER_IMG` (lines 197-199 of `NEW_digit_recog.py`) if you want to use your own images for training and testing.
48 |
49 | ## Libraries and Environment:
50 |
51 | # NOTE: To run this code without errors, you need a virtualenv with the correct libraries because the code is outdated (it was written over 5 years ago...)
52 | Recommended Python version: 3.6+
53 | ```
54 | sudo apt-get install python3-venv
55 | sudo apt-get install libgtk2.0-dev pkg-config
56 |
57 | python3 -m venv github-test
58 | source github-test/bin/activate
59 |
60 | pip3 install numpy==1.18
61 | pip3 install scipy==1.1.0
62 | pip3 install scikit-learn==0.21.3
63 | pip3 install opencv-python==3.2.0.8
64 | pip3 install scikit-image==0.12.1
65 | pip3 install Pillow==2.2.2
66 |
67 | git clone https://github.com/pavitrakumar78/Python-Custom-Digit-Recognition.git
68 |
69 | python NEW_digit_recog.py
70 | ```
71 | If you don't want to manually work with the versions, I've also added a tested requirements.txt file. Just install whatever is in there and this script should run without any issues.
72 |
73 | The accuracy may be lower. You will need to tune the hyperparameters of the model and try modifying the image processing pipeline.
74 |
75 | [PROBABLY DOES NOT WORK:]
76 | ~~Tested on:
77 | Windows 10
78 | Python 3.5
79 |
80 | ~~Dependencies:
81 | numpy 1.13.1
82 | SciPy 0.19.0
83 | OpenCv (cv2) 3.2.0
84 | ~~
85 |
86 | ## Similar Project
87 | I recently did a project where I use 2 CNNs to do both bounding box regression for detection and classification of digits on the Street View House Numbers (SVHN) dataset.
You can view the project here:
88 | https://github.com/pavitrakumar78/Street-View-House-Numbers-SVHN-Detection-and-Classification-using-CNN
89 |
--------------------------------------------------------------------------------
/digit_recog.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """
3 | Created on Sat Nov 21 14:38:53 2015
4 |
5 | @author: Pavitrakumar
6 |
7 | """
8 |
9 | import numpy as np
10 | # from scipy.misc.pilutil import imresize
11 | from needed import imresize
12 | import cv2
13 | from skimage.feature import hog
14 | import sys
15 |
16 | # sys arg 1
17 | # TRAIN_DATA_IMG = 'digits.png'
18 |
19 | # sys arg 2
20 | # USER_IMG = 'test_image.png'
21 |
22 | DIGIT_DIM = 20  # size of each digit is DIGIT_DIM x DIGIT_DIM
23 | CLASS_N = 10  # 0-9
24 |
25 |
26 | # This method splits the input training image into small cells (of a single digit) and uses these cells as training data.
27 | # The default training image (MNIST) is a 1000x1000 size image and each digit is of size 20x20, so we divide 1000/20 horizontally and 1000/20 vertically.
28 | # If you are going to use a custom digit training image, then adjust the code below so that it properly captures the digits in your image.
29 | # Also, change the labelling scheme in line 44 to correspond to your image.
30 | def split2d(img, cell_size, flatten=True):
31 |     h, w = img.shape[:2]
32 |     sx, sy = cell_size
33 |     cells = [np.hsplit(row, w // sx) for row in np.vsplit(img, h // sy)]
34 |     cells = np.array(cells)
35 |     if flatten:
36 |         cells = cells.reshape(-1, sy, sx)
37 |     return cells
38 |
39 |
40 | def load_digits(fn):
41 |     print('loading "%s for training" ...' % fn)
42 |     digits_img = cv2.imread(fn, 0)
43 |     digits = split2d(digits_img, (DIGIT_DIM, DIGIT_DIM))
44 |     labels = np.repeat(np.arange(CLASS_N), len(digits) // CLASS_N)  # integer division: np.repeat needs an int count
45 |     # 2500 samples in the digits.png so repeat 0-9 2500/10(0-9 - no. of classes) times.
46 |     return digits, labels
47 |
48 |
49 | class KNN_MODEL():  # can also define a custom model in a similar class wrapper with train and predict methods
50 |     def __init__(self, k=3):
51 |         self.k = k
52 |         self.model = cv2.ml.KNearest_create()
53 |
54 |     def train(self, samples, responses):
55 |         self.model = cv2.ml.KNearest_create()
56 |         self.model.train(samples, cv2.ml.ROW_SAMPLE, responses)
57 |
58 |     def predict(self, samples):
59 |         retval, results, neigh_resp, dists = self.model.findNearest(samples, self.k)  # cv2.ml API name (was find_nearest)
60 |         return results.ravel()
61 |
62 |
63 | def contains(r1, r2):
64 |     r1_x1 = r1[0]
65 |     r1_y1 = r1[1]
66 |     r2_x1 = r2[0]
67 |     r2_y1 = r2[1]
68 |
69 |     r1_x2 = r1[0] + r1[2]
70 |     r1_y2 = r1[1] + r1[3]
71 |     r2_x2 = r2[0] + r2[2]
72 |     r2_y2 = r2[1] + r2[3]
73 |
74 |     # does r1 contain r2?
75 |     return r1_x1 < r2_x1 < r2_x2 < r1_x2 and r1_y1 < r2_y1 < r2_y2 < r1_y2
76 |
77 |
78 | def pixels_to_hog_20(pixel_array):
79 |     hog_featuresData = []
80 |     for img in pixel_array:
81 |         # img = 20x20
82 |         fd = hog(img, orientations=9, pixels_per_cell=(10, 10), cells_per_block=(1, 1))
83 |         hog_featuresData.append(fd)
84 |     hog_features = np.array(hog_featuresData, 'float64')
85 |     return np.float32(hog_features)
86 |
87 |
88 | def get_digits(contours):
89 |     digit_rects = [cv2.boundingRect(ctr) for ctr in contours]
90 |     rects_final = digit_rects[:]
91 |
92 |     for r in digit_rects:
93 |         x, y, w, h = r
94 |         if w < 15 and h < 15:  # too small, remove it
95 |             rects_final.remove(r)
96 |
97 |     for r1 in digit_rects:
98 |         for r2 in digit_rects:
99 |             if (r1[1] != 1 and r1[1] != 1) and (
100 |                     r2[1] != 1 and r2[1] != 1):  # if the rectangle is not the page-bounding rectangle,
101 |                 if contains(r1, r2) and (r2 in rects_final):
102 |                     rects_final.remove(r2)
103 |     return rects_final
104 |
105 |
106 | def proc_user_img(fn, model):
107 |     print('loading "%s for digit recognition" ...'
% fn)
108 |     im = cv2.imread(fn)
109 |     im_original = cv2.imread(fn)
110 |
111 |     blank_image = np.zeros((im.shape[0], im.shape[1], 3), np.uint8)
112 |     blank_image.fill(255)
113 |
114 |     imgray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
115 |
116 |     kernel = np.ones((5, 5), np.uint8)
117 |
118 |     ret, thresh = cv2.threshold(imgray, 127, 255, 0)
119 |
120 |     thresh = cv2.erode(thresh, kernel, iterations=1)
121 |     thresh = cv2.dilate(thresh, kernel, iterations=1)
122 |     thresh = cv2.erode(thresh, kernel, iterations=1)
123 |
124 |     # for opencv 3.0.x
125 |     # _,contours,hierarchy = cv2.findContours(thresh,cv2.RETR_CCOMP,cv2.CHAIN_APPROX_SIMPLE)
126 |     # for opencv 2.4.x (4.x also returns two values)
127 |     contours, hierarchy = cv2.findContours(thresh, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
128 |
129 |     digits_rect = get_digits(contours)  # bounding rectangles of the digits in the user image
130 |
131 |     for rect in digits_rect:
132 |         x, y, w, h = rect
133 |         _ = cv2.rectangle(im, (x, y), (x + w, y + h), (0, 255, 0), 2)
134 |
135 |         im_digit = im_original[y:y + h, x:x + w]
136 |         sz = 28
137 |         im_digit = imresize(im_digit, (sz, sz))
138 |
139 |         for i in range(sz):  # need to remove border pixels
140 |             im_digit[i, 0] = 255
141 |             im_digit[i, 1] = 255
142 |             im_digit[0, i] = 255
143 |             im_digit[1, i] = 255
144 |
145 |         thresh = 210
146 |         im_digit = cv2.cvtColor(im_digit, cv2.COLOR_BGR2GRAY)
147 |         im_digit = cv2.threshold(im_digit, thresh, 255, cv2.THRESH_BINARY)[1]
148 |         # im_digit = cv2.adaptiveThreshold(im_digit,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C ,cv2.THRESH_BINARY,11,2)
149 |         im_digit = (255 - im_digit)
150 |
151 |         im_digit = imresize(im_digit, (20, 20))
152 |
153 |         hog_img_data = pixels_to_hog_20([im_digit])
154 |
155 |         pred = model.predict(hog_img_data)
156 |
157 |         _ = cv2.putText(im, str(int(pred[0])), (x, y), cv2.FONT_HERSHEY_SIMPLEX, 2, (255, 0, 0), 3)
158 |         _ = cv2.putText(blank_image, str(int(pred[0])), (x, y), cv2.FONT_HERSHEY_SIMPLEX, 3, (255, 0, 0), 5)
159 |
160 |     cv2.imwrite("original_overlay.png", im)
161 |
cv2.imwrite("final_digits.png", blank_image)
162 |     cv2.destroyAllWindows()
163 |
164 |
165 | if __name__ == '__main__':
166 |     print(__doc__)
167 |
168 |     if len(sys.argv) < 3:
169 |         print(
170 |             "Enter Proper Arguments \n Usage: digit_recog.py training_image.png testing_image.png \n Example: digit_recog.py digits.png test_image.png")
171 |         exit(0)
172 |
173 |     TRAIN_DATA_IMG = sys.argv[1]
174 |     USER_IMG = sys.argv[2]
175 |
176 |     digits, labels = load_digits(TRAIN_DATA_IMG)
177 |
178 |     print('training ....')
179 |     # shuffle digits
180 |     rand = np.random.RandomState(123)
181 |     shuffle_index = rand.permutation(len(digits))
182 |
183 |     digits, labels = digits[shuffle_index], labels[shuffle_index]
184 |
185 |     train_digits_data = pixels_to_hog_20(digits)
186 |     train_digits_labels = labels
187 |
188 |     print('training KNearest...')  # gets 80% in most user images
189 |     model = KNN_MODEL(k=4)
190 |     model.train(train_digits_data, train_digits_labels)
191 |
192 |     proc_user_img(USER_IMG, model)
193 |
--------------------------------------------------------------------------------
/NEW_digit_recog.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """
3 | Created on Sat Nov 21 14:38:53 2015
4 |
5 | @author: Pavitrakumar
6 |
7 | """
8 |
9 | import numpy as np
10 | # from scipy.misc.pilutil import imresize
11 | from needed import imresize
12 | from PIL import Image
13 | import cv2  # version 3.2.0
14 | from skimage.feature import hog
15 | from matplotlib import pyplot as plt
16 | from sklearn.model_selection import train_test_split
17 | from sklearn.metrics import accuracy_score
18 | from sklearn.utils import shuffle
19 |
20 | DIGIT_WIDTH = 10
21 | DIGIT_HEIGHT = 20
22 | IMG_HEIGHT = 28
23 | IMG_WIDTH = 28
24 | CLASS_N = 10  # 0-9
25 |
26 |
27 | # This method splits the input training image into small cells (of a single digit) and uses these cells as training data.
28 | # The default training image (MNIST) is a 1000x1000 size image and each digit is of size 10x20, so we divide 1000/10 horizontally and 1000/20 vertically.
29 | def split2d(img, cell_size, flatten=True):
30 |     h, w = img.shape[:2]
31 |     sx, sy = cell_size
32 |     cells = [np.hsplit(row, w // sx) for row in np.vsplit(img, h // sy)]
33 |     cells = np.array(cells)
34 |     if flatten:
35 |         cells = cells.reshape(-1, sy, sx)
36 |     return cells
37 |
38 |
39 | def load_digits(fn):
40 |     print('loading "%s for training" ...' % fn)
41 |     digits_img = cv2.imread(fn, 0)
42 |     digits = split2d(digits_img, (DIGIT_WIDTH, DIGIT_HEIGHT))
43 |     resized_digits = []
44 |     for digit in digits:
45 |         resized_digits.append(imresize(digit, (IMG_WIDTH, IMG_HEIGHT)))
46 |     labels = np.repeat(np.arange(CLASS_N), len(digits) // CLASS_N)  # integer division: np.repeat needs an int count
47 |     return np.array(resized_digits), labels
48 |
49 |
50 | def pixels_to_hog_20(img_array):
51 |     hog_featuresData = []
52 |     for img in img_array:
53 |         fd = hog(img,
54 |                  orientations=10,
55 |                  pixels_per_cell=(5, 5),
56 |                  cells_per_block=(1, 1))
57 |         hog_featuresData.append(fd)
58 |     hog_features = np.array(hog_featuresData, 'float64')
59 |     return np.float32(hog_features)
60 |
61 |
62 | # define a custom model in a similar class wrapper with train and predict methods
63 | class KNN_MODEL():
64 |     def __init__(self, k=3):
65 |         self.k = k
66 |         self.model = cv2.ml.KNearest_create()
67 |
68 |     def train(self, samples, responses):
69 |         self.model.train(samples, cv2.ml.ROW_SAMPLE, responses)
70 |
71 |     def predict(self, samples):
72 |         retval, results, neigh_resp, dists = self.model.findNearest(samples, self.k)
73 |         return results.ravel()
74 |
75 |
76 | class SVM_MODEL():
77 |     def __init__(self, num_feats, C=1, gamma=0.1):
78 |         self.model = cv2.ml.SVM_create()
79 |         self.model.setType(cv2.ml.SVM_C_SVC)
80 |         self.model.setKernel(cv2.ml.SVM_RBF)  # SVM_LINEAR, SVM_RBF
81 |         self.model.setC(C)
82 |         self.model.setGamma(gamma)
83 |         self.features = num_feats
84 |
85 |     def train(self, samples,
responses):
86 |         self.model.train(samples, cv2.ml.ROW_SAMPLE, responses)
87 |
88 |     def predict(self, samples):
89 |         results = self.model.predict(samples.reshape(-1, self.features))
90 |         return results[1].ravel()
91 |
92 |
93 | def get_digits(contours, hierarchy):
94 |     hierarchy = hierarchy[0]
95 |     bounding_rectangles = [cv2.boundingRect(ctr) for ctr in contours]
96 |     final_bounding_rectangles = []
97 |     # find the most common hierarchy level - that is where our digits' bounding boxes are
98 |     u, indices = np.unique(hierarchy[:, -1], return_inverse=True)
99 |     most_common_heirarchy = u[np.argmax(np.bincount(indices))]
100 |
101 |     for r, hr in zip(bounding_rectangles, hierarchy):
102 |         x, y, w, h = r
103 |         # this could vary depending on the image you are trying to predict
104 |         # we are trying to extract ONLY the rectangles with images in them (this is a very simple way to do it)
105 |         # we use the hierarchy to extract only the boxes that are on the same global level - to avoid digits inside other digits
106 |         # ex: there could be a bounding box inside every 6, 9, 8 because of the loops in the number's appearance - we don't want that.
107 |         # read more about it here: https://docs.opencv.org/trunk/d9/d8b/tutorial_py_contours_hierarchy.html
108 |         if ((w * h) > 250) and (10 <= w <= 200) and (10 <= h <= 200) and hr[3] == most_common_heirarchy:
109 |             final_bounding_rectangles.append(r)
110 |
111 |     return final_bounding_rectangles
112 |
113 |
114 | def proc_user_img(img_file, model):
115 |     print('loading "%s for digit recognition" ...'
% img_file)
116 |     im = cv2.imread(img_file)
117 |     blank_image = np.zeros((im.shape[0], im.shape[1], 3), np.uint8)
118 |     blank_image.fill(255)
119 |
120 |     imgray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
121 |     plt.imshow(imgray)
122 |     kernel = np.ones((5, 5), np.uint8)
123 |
124 |     ret, thresh = cv2.threshold(imgray, 127, 255, 0)
125 |     thresh = cv2.erode(thresh, kernel, iterations=1)
126 |     thresh = cv2.dilate(thresh, kernel, iterations=1)
127 |     thresh = cv2.erode(thresh, kernel, iterations=1)
128 |
129 |     contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
130 |
131 |     digits_rectangles = get_digits(contours, hierarchy)  # bounding rectangles of the digits in the user image
132 |
133 |     for rect in digits_rectangles:
134 |         x, y, w, h = rect
135 |         cv2.rectangle(im, (x, y), (x + w, y + h), (0, 255, 0), 2)
136 |         im_digit = imgray[y:y + h, x:x + w]
137 |         im_digit = (255 - im_digit)
138 |         im_digit = imresize(im_digit, (IMG_WIDTH, IMG_HEIGHT))
139 |
140 |         hog_img_data = pixels_to_hog_20([im_digit])
141 |         pred = model.predict(hog_img_data)
142 |         cv2.putText(im, str(int(pred[0])), (x, y), cv2.FONT_HERSHEY_SIMPLEX, 2, (255, 0, 0), 3)
143 |         cv2.putText(blank_image, str(int(pred[0])), (x, y), cv2.FONT_HERSHEY_SIMPLEX, 3, (255, 0, 0), 5)
144 |
145 |     plt.imshow(im)
146 |     cv2.imwrite("original_overlay.png", im)
147 |     cv2.imwrite("final_digits.png", blank_image)
148 |     # cv2.destroyAllWindows()
149 |
150 |
151 | def get_contour_precedence(contour, cols):
152 |     return contour[1] * cols + contour[0]  # row-wise ordering
153 |
154 |
155 | # this function processes a custom training image
156 | # see example: custom_train_digits.jpg
157 | # if you want to use your own, it should be in a similar format
158 | def load_digits_custom(img_file):
159 |     train_data = []
160 |     train_target = []
161 |     start_class = 1
162 |     im = cv2.imread(img_file)
163 |     imgray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
164 |     plt.imshow(imgray)
165 |     kernel = np.ones((5, 5), np.uint8)
166 |
167 |     ret, thresh = cv2.threshold(imgray, 127, 255, 0)
168 |     thresh = cv2.erode(thresh, kernel, iterations=1)
169 |     thresh = cv2.dilate(thresh, kernel, iterations=1)
170 |     thresh = cv2.erode(thresh, kernel, iterations=1)
171 |
172 |     contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
173 |     digits_rectangles = get_digits(contours, hierarchy)  # bounding rectangles of the digits in the training image
174 |
175 |     # sort rectangles according to x,y pos so that we can label them
176 |     digits_rectangles.sort(key=lambda x: get_contour_precedence(x, im.shape[1]))
177 |
178 |     for index, rect in enumerate(digits_rectangles):
179 |         x, y, w, h = rect
180 |         cv2.rectangle(im, (x, y), (x + w, y + h), (0, 255, 0), 2)
181 |         im_digit = imgray[y:y + h, x:x + w]
182 |         im_digit = (255 - im_digit)
183 |
184 |         im_digit = imresize(im_digit, (IMG_WIDTH, IMG_HEIGHT))
185 |         train_data.append(im_digit)
186 |         train_target.append(start_class % 10)
187 |
188 |         if index > 0 and (index + 1) % 10 == 0:
189 |             start_class += 1
190 |     cv2.imwrite("training_box_overlay.png", im)
191 |
192 |     return np.array(train_data), np.array(train_target)
193 |
194 |
195 | # ------------------data preparation--------------------------------------------
196 |
197 | TRAIN_MNIST_IMG = 'digits.png'
198 | TRAIN_USER_IMG = 'custom_train_digits.jpg'
199 | TEST_USER_IMG = 'test_image.png'
200 |
201 | # digits, labels = load_digits(TRAIN_MNIST_IMG) #original MNIST data (not good detection)
202 | digits, labels = load_digits_custom(
203 |     TRAIN_USER_IMG)  # my handwritten dataset (better than MNIST on my handwritten digits)
204 |
205 | print('train data shape', digits.shape)
206 | print('train labels shape', labels.shape)
207 |
208 | digits, labels = shuffle(digits, labels, random_state=256)
209 | train_digits_data = pixels_to_hog_20(digits)
210 | X_train, X_test, y_train, y_test = train_test_split(train_digits_data, labels, test_size=0.33, random_state=42)
211 |
212 | # ------------------training and
testing----------------------------------------
213 |
214 | model = KNN_MODEL(k=3)
215 | model.train(X_train, y_train)
216 | preds = model.predict(X_test)
217 | print('Accuracy: ', accuracy_score(y_test, preds))
218 |
219 | model = KNN_MODEL(k=4)
220 | model.train(train_digits_data, labels)
221 | proc_user_img(TEST_USER_IMG, model)
222 |
223 | model = SVM_MODEL(num_feats=train_digits_data.shape[1])
224 | model.train(X_train, y_train)
225 | preds = model.predict(X_test)
226 | print('Accuracy: ', accuracy_score(y_test, preds))
227 |
228 | model = SVM_MODEL(num_feats=train_digits_data.shape[1])
229 | model.train(train_digits_data, labels)
230 | proc_user_img(TEST_USER_IMG, model)
231 |
232 | # ------------------------------------------------------------------------------
233 |
--------------------------------------------------------------------------------
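
A side note on the two `pixels_to_hog_20` functions in the scripts above: both pass `num_feats=train_digits_data.shape[1]` style feature vectors to the models, and it can help to know how long those vectors are before tuning the HOG settings. The sketch below is not part of the repo; `hog_feature_length` is a hypothetical helper, and it assumes that `skimage.feature.hog` drops partial cells at the image border, slides blocks one cell at a time, and returns a flattened vector.

```python
def hog_feature_length(img_shape, orientations, pixels_per_cell, cells_per_block):
    """Estimate the length of a flattened HOG descriptor.

    Hypothetical helper for illustration only. Assumes partial cells at the
    border are discarded and blocks advance by one cell per step.
    """
    rows, cols = img_shape
    c_row, c_col = pixels_per_cell
    b_row, b_col = cells_per_block

    # Whole cells that fit along each axis (leftover pixels are ignored).
    n_cells_row = rows // c_row
    n_cells_col = cols // c_col

    # Block positions along each axis when sliding one cell at a time.
    n_blocks_row = n_cells_row - b_row + 1
    n_blocks_col = n_cells_col - b_col + 1

    return n_blocks_row * n_blocks_col * b_row * b_col * orientations


# Settings used in NEW_digit_recog.py: 28x28 image, 5x5 cells, 10 bins.
new_len = hog_feature_length((28, 28), 10, (5, 5), (1, 1))
# Settings used in digit_recog.py: 20x20 image, 10x10 cells, 9 bins.
old_len = hog_feature_length((20, 20), 9, (10, 10), (1, 1))
print(new_len, old_len)
```

Under these assumptions the newer script works with 250 features per digit and the deprecated one with 36. With 5x5 cells on a 28x28 image, the last 3 pixel rows and columns do not fill a whole cell, so they would not contribute to the descriptor.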