├── .gitignore
├── digits.png
├── test_image.png
├── final_digits.png
├── original_overlay.png
├── custom_train_digits.jpg
├── training_box_overlay.png
├── needed.py
├── requirements.txt
├── README.md
├── digit_recog.py
└── NEW_digit_recog.py


/.gitignore:
--------------------------------------------------------------------------------
1 | .idea
2 | __pycache__/
3 | github-test/


--------------------------------------------------------------------------------
/digits.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/pavitrakumar78/Python-Custom-Digit-Recognition/HEAD/digits.png


--------------------------------------------------------------------------------
/test_image.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/pavitrakumar78/Python-Custom-Digit-Recognition/HEAD/test_image.png


--------------------------------------------------------------------------------
/final_digits.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/pavitrakumar78/Python-Custom-Digit-Recognition/HEAD/final_digits.png


--------------------------------------------------------------------------------
/original_overlay.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/pavitrakumar78/Python-Custom-Digit-Recognition/HEAD/original_overlay.png


--------------------------------------------------------------------------------
/custom_train_digits.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/pavitrakumar78/Python-Custom-Digit-Recognition/HEAD/custom_train_digits.jpg


--------------------------------------------------------------------------------
/training_box_overlay.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/pavitrakumar78/Python-Custom-Digit-Recognition/HEAD/training_box_overlay.png


--------------------------------------------------------------------------------
/needed.py:
--------------------------------------------------------------------------------
1 | from PIL import Image
2 | from numpy import asarray
3 | 
4 | 
5 | def imresize(arr, size):  # size is (width, height), the order PIL's Image.resize expects
6 | 	img=Image.fromarray(arr)
7 | 	img=img.resize(size)
8 | 	return asarray(img)
9 | 


--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
 1 | cycler==0.10.0
 2 | dask==2021.10.0
 3 | decorator==4.4.2
 4 | joblib==1.2.0
 5 | kiwisolver==1.3.1
 6 | matplotlib==3.3.3
 7 | networkx==2.5
 8 | numpy==1.22.0
 9 | opencv-python==4.2.0.32
10 | Pillow==10.2.0
11 | pkg-resources==0.0.0
12 | pyparsing==2.4.7
13 | python-dateutil==2.8.1
14 | PyYAML==5.4
15 | scikit-image==0.12.1
16 | scikit-learn==1.5.0
17 | scipy==1.10.0
18 | six==1.15.0
19 | toolz==0.11.1
20 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Python-Custom-Digit-Recognition
 2 | 
 3 | You can run simple OCR on your own handwritten digits using this Python script.
 4 | I have used OpenCV to pre-process the image and to extract the digits from the picture.
 5 | The model is K-Nearest Neighbours (or SVM), trained on my own handwritten data set. I have also included the freely [available](http://yann.lecun.com/exdb/mnist/) MNIST data set so you can experiment with how different datasets work with different handwriting.
 6 | 
 7 | ## Analysis  
 8 | I first tried using the raw pixel values as features for training and prediction, but the accuracy was too low, even with popular classification algorithms such as SVM, KNN and neural networks.  Trying some custom threshold values improved the accuracy a little.  The best accuracy I could achieve using only pixel values was about 55-60%, and that was after binarizing the images (pure black or white instead of grayscale).    
 9 | 
10 | After searching and reading about feature extraction from images for OCR, I stumbled upon [HOG](https://en.wikipedia.org/wiki/Histogram_of_oriented_gradients) (Histogram of Oriented Gradients).  Basically, it tries to capture the shape of structures in a region by capturing information about gradients. Image gradients are simply intensity changes across pixels in an image.  
11 | 
12 | ![pic-explain](https://gilscvblog.files.wordpress.com/2013/08/figure5.jpg "pic")
13 | 
14 | 
15 | It works by dividing the image into small cells (usually 8x8 pixels) and blocks of 4x4 cells. Each cell has a fixed number of gradient orientation bins. Each pixel in the cell votes for a gradient orientation bin with a vote proportional to the gradient magnitude at that pixel. Simply put, the "histogram" counts how many pixels have an edge with a specific orientation.  For more info, please refer to [this](https://gilscvblog.wordpress.com/2013/08/18/a-short-introduction-to-descriptors/) blog post.
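
The voting step described above can be sketched in plain Python for a single cell. This is only an illustration of the idea, not what `skimage.feature.hog` does internally (the real thing also normalizes over blocks); `cell_histogram` and `n_bins` are made-up names for this sketch, and the gradients are simple central differences.

```python
import math


def cell_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations for one cell.

    `cell` is a list of rows of grayscale values. Border pixels are skipped
    because central differences need a neighbour on both sides.
    """
    height, width = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins  # unsigned orientations cover 0..180 degrees
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal intensity change
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical intensity change
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat region: no edge, no vote
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # each pixel votes for its orientation bin, weighted by magnitude
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist


# An 8x8 cell with a vertical edge: dark on the left, bright on the right.
cell = [[0, 0, 0, 0, 255, 255, 255, 255] for _ in range(8)]
hist = cell_histogram(cell)
print(hist)
```

For this cell all the votes land in the first bin (horizontal gradient, i.e. a vertical edge), which is exactly the kind of shape information that raw pixel values do not expose directly.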
16 | 
17 | Using only HOG histogram vectors as features drastically improved the accuracy of the prediction.  Currently, I use KNN from OpenCV as my model - I tried using SVM from the same module, but its accuracy was not as good as KNN. The best accuracy I have achieved on a sample image of about 100 digits is 80%.  In the future, I might add more features after looking into SIFT or SURF, or even try to get better accuracy using just plain pixels as data.
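
For readers new to KNN, here is a minimal sketch of the idea in plain Python: find the `k` closest training vectors and take a majority vote over their labels. It is not the OpenCV implementation used in this repo; `knn_predict` and the toy data are invented for illustration, and in the real script the vectors would be HOG features.

```python
from collections import Counter


def knn_predict(train_vectors, train_labels, query, k=3):
    """Return the majority label among the k training vectors nearest to `query`."""
    # squared Euclidean distance is enough for ranking neighbours
    distances = sorted(
        (sum((a - b) ** 2 for a, b in zip(vector, query)), label)
        for vector, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy 2-D "features": class 0 clusters near the origin, class 1 near (5, 5).
train_vectors = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_labels = [0, 0, 0, 1, 1, 1]

near_origin = knn_predict(train_vectors, train_labels, [0.2, 0.1])
near_five = knn_predict(train_vectors, train_labels, [5.5, 5.5])
print(near_origin, near_five)
```

A small odd `k` avoids most tied votes on two-class toy data; with ten digit classes ties can still happen, so `k` is worth tuning.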
18 | 
19 | ## Usage  
20 | 
21 | `digit_recog.py` *is deprecated - may not work with newer versions of libraries*  
22 | 
23 | UPDATED CODE: `NEW_digit_recog.py`
24 | 
25 | To run code, download/clone repo and execute:
26 | ```python NEW_digit_recog.py ```
27 | 
28 | This code uses my own handwritten digits (`custom_train_digits.jpg`) as training data. You can also use your own, but keep the positioning of the digits similar to what's in the `custom_train_digits.jpg` file. If you change the format of the custom training data (your handwritten digits), make sure to edit the `load_digits_custom` function in `NEW_digit_recog.py` to match.
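
Why the positioning matters: the labels are not read from the image, they are inferred from the order of the detected boxes. The sketch below mirrors the scheme used in `load_digits_custom` under the assumption that every group of ten boxes (in row-wise order) holds one digit class, starting at 1 and wrapping to 0. `label_boxes` and its parameters are hypothetical names for this example, not functions from the repo.

```python
import random


def label_boxes(rects, image_width, per_class=10, start_class=1):
    """Sort (x, y, w, h) boxes row-wise and give each group of `per_class` one label."""
    # same precedence idea as get_contour_precedence: y * image width + x
    ordered = sorted(rects, key=lambda r: r[1] * image_width + r[0])
    labels = [(start_class + i // per_class) % 10 for i in range(len(ordered))]
    return ordered, labels


# Synthetic layout: 10 rows of 10 boxes, detected in arbitrary order.
rects = [(col * 30, row * 40, 20, 30) for row in range(10) for col in range(10)]
random.shuffle(rects)

ordered, labels = label_boxes(rects, image_width=300)
print(labels[:10], labels[90:])
```

Because a label depends only on which group of ten a box falls into, the order inside a row is not important, but a single missed or extra box shifts the label of every box after it. If your sheet has a different number of digits per class, that is the part to change.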
29 | 
30 | Executing the program will generate 2 output files
31 | 
32 | This is the original image with digit boxes and the numbers on the top.   
33 | ![original_overlay](https://github.com/pavitrakumar78/Python-Custom-Digit-Recognition/blob/master/original_overlay.png)
34 | This is a plain image with just the recognized numbers printed.   
35 | ![final_digits](https://github.com/pavitrakumar78/Python-Custom-Digit-Recognition/blob/master/final_digits.png)
36 | 
37 | ### Note:  
38 | - User image should be a scanned image (at least 300 dpi).  
39 | - Image can be any format supported by OpenCV.  
40 | 
41 | In `NEW_digit_recog.py`, use either  
42 | ```digits, labels = load_digits(TRAIN_MNIST_IMG) #original MNIST data```  
43 | for the MNIST dataset OR  
44 | ```digits, labels = load_digits_custom('custom_train_digits.jpg') #my handwritten dataset```
45 | for your own custom dataset.  
46 |   
47 | Edit `TRAIN_MNIST_IMG`, `TRAIN_USER_IMG` and `TEST_USER_IMG` at lines 197-199 if you want to use your own images for training and testing.  
48 |     
49 | ## Libraries and Environment:
50 | 
51 | # NOTE: To run this code without errors, you need a virtualenv with the correct libraries because the code is outdated (it was written over 5 years ago...)
52 | Recommended py version: 3.6+
53 | ```
54 | sudo apt-get install python3-venv 
55 | sudo apt-get install libgtk2.0-dev pkg-config
56 | 
57 | python3 -m venv github-test
58 | source github-test/bin/activate
59 | 
60 | pip3 install numpy==1.18
61 | pip3 install scipy==1.1.0
62 | pip3 install scikit-learn==0.21.3
63 | pip3 install opencv-python==3.2.0.8
64 | pip3 install scikit-image==0.12.1
65 | pip3 install Pillow==2.2.2
66 | 
67 | git clone https://github.com/pavitrakumar78/Python-Custom-Digit-Recognition.git
68 | 
69 | python NEW_digit_recog.py
70 | ```
71 | If you don't want to manage the versions manually, I've also added a tested requirements.txt file. Just install whatever is in there and this script should run without any issues.
72 | 
73 | The accuracy may be lower. You will need to tune the hyperparameters of the model and try modifying the image processing pipeline.
74 | 
75 | [PROBABLY DOES NOT WORK:]
76 | ~~Tested on:  
77 | Windows 10    
78 | Python 3.5    
79 | 
80 | ~~Dependencies:  
81 | numpy 1.13.1  
82 | SciPy 0.19.0  
83 | OpenCv (cv2) 3.2.0
84 | ~~
85 | 
86 | ## Similar Project
87 | I recently did a project where I use 2 CNNs to do both bounding box regression for detection and classification for digits on the street view house numbers dataset (SVHN). You can view the project here:  
88 | https://github.com/pavitrakumar78/Street-View-House-Numbers-SVHN-Detection-and-Classification-using-CNN
89 | 


--------------------------------------------------------------------------------
/digit_recog.py:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | """
  3 | Created on Sat Nov 21 14:38:53 2015
  4 | 
  5 | @author: Pavitrakumar
  6 | 
  7 | """
  8 | 
  9 | import numpy as np
 10 | # from scipy.misc.pilutil import imresize
 11 | from needed import imresize
 12 | import cv2
 13 | from skimage.feature import hog
 14 | import sys
 15 | 
 16 | # sys arg 1
 17 | # TRAIN_DATA_IMG = 'digits.png'
 18 | 
 19 | # sys arg 2
 20 | # USER_IMG = 'test_image.png'
 21 | 
 22 | DIGIT_DIM = 20  # size of each digit is SZ x SZ
 23 | CLASS_N = 10  # 0-9
 24 | 
 25 | 
 26 | # This method splits the input training image into small cells (of a single digit) and uses these cells as training data.
 27 | # The default training image (MNIST) is a 1000x1000 size image and each digit is of size 20x20. so we divide 1000/20 horizontally and 1000/20 vertically.
 28 | # If you are going to use a custom digit training image, then adjust the code below so that it properly captures the digits in your image.
 29 | # Also, change the labelling scheme in line 44 to correspond to your image.
 30 | def split2d(img, cell_size, flatten=True):
 31 |     h, w = img.shape[:2]
 32 |     sx, sy = cell_size
 33 |     cells = [np.hsplit(row, w // sx) for row in np.vsplit(img, h // sy)]
 34 |     cells = np.array(cells)
 35 |     if flatten:
 36 |         cells = cells.reshape(-1, sy, sx)
 37 |     return cells
 38 | 
 39 | 
 40 | def load_digits(fn):
 41 |     print('loading "%s for training" ...' % fn)
 42 |     digits_img = cv2.imread(fn, 0)
 43 |     digits = split2d(digits_img, (DIGIT_DIM, DIGIT_DIM))
 44 |     labels = np.repeat(np.arange(CLASS_N), len(digits) // CLASS_N)  # integer division: repeat count must be an int
 45 |     # 2500 samples in the digits.png so repeat 0-9 2500/10(0-9 - no. of classes) times.
 46 |     return digits, labels
 47 | 
 48 | 
 49 | class KNN_MODEL():  # can also define a custom model in a similar class wrapper with train and predict methods
 50 |     def __init__(self, k=3):
 51 |         self.k = k
 52 |         self.model = cv2.ml.KNearest_create()
 53 | 
 54 |     def train(self, samples, responses):
 55 |         self.model = cv2.ml.KNearest_create()
 56 |         self.model.train(samples, cv2.ml.ROW_SAMPLE, responses)
 57 | 
 58 |     def predict(self, samples):
 59 |         retval, results, neigh_resp, dists = self.model.findNearest(samples, self.k)  # cv2.ml API name (find_nearest was 2.4.x)
 60 |         return results.ravel()
 61 | 
 62 | 
 63 | def contains(r1, r2):
 64 |     r1_x1 = r1[0]
 65 |     r1_y1 = r1[1]
 66 |     r2_x1 = r2[0]
 67 |     r2_y1 = r2[1]
 68 | 
 69 |     r1_x2 = r1[0] + r1[2]
 70 |     r1_y2 = r1[1] + r1[3]
 71 |     r2_x2 = r2[0] + r2[2]
 72 |     r2_y2 = r2[1] + r2[3]
 73 | 
 74 |     # does r1 contain r2?
 75 |     return r1_x1 < r2_x1 < r2_x2 < r1_x2 and r1_y1 < r2_y1 < r2_y2 < r1_y2
 76 | 
 77 | 
 78 | def pixels_to_hog_20(pixel_array):
 79 |     hog_featuresData = []
 80 |     for img in pixel_array:
 81 |         # img = 20x20
 82 |         fd = hog(img, orientations=9, pixels_per_cell=(10, 10), cells_per_block=(1, 1))
 83 |         hog_featuresData.append(fd)
 84 |     hog_features = np.array(hog_featuresData, 'float64')
 85 |     return np.float32(hog_features)
 86 | 
 87 | 
 88 | def get_digits(contours):
 89 |     digit_rects = [cv2.boundingRect(ctr) for ctr in contours]
 90 |     rects_final = digit_rects[:]
 91 | 
 92 |     for r in digit_rects:
 93 |         x, y, w, h = r
 94 |         if w < 15 and h < 15:  # too small, remove it
 95 |             rects_final.remove(r)
 96 | 
 97 |     for r1 in digit_rects:
 98 |         for r2 in digit_rects:
 99 |             if (r1[1] != 1 and r1[1] != 1) and (
100 |                     r2[1] != 1 and r2[1] != 1):  # if the rectangle is not the page-bounding rectangle,
101 |                 if contains(r1, r2) and (r2 in rects_final):
102 |                     rects_final.remove(r2)
103 |     return rects_final
104 | 
105 | 
106 | def proc_user_img(fn, model):
107 |     print('loading "%s for digit recognition" ...' % fn)
108 |     im = cv2.imread(fn)
109 |     im_original = cv2.imread(fn)
110 | 
111 |     blank_image = np.zeros((im.shape[0], im.shape[1], 3), np.uint8)
112 |     blank_image.fill(255)
113 | 
114 |     imgray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
115 | 
116 |     kernel = np.ones((5, 5), np.uint8)
117 | 
118 |     ret, thresh = cv2.threshold(imgray, 127, 255, 0)
119 | 
120 |     thresh = cv2.erode(thresh, kernel, iterations=1)
121 |     thresh = cv2.dilate(thresh, kernel, iterations=1)
122 |     thresh = cv2.erode(thresh, kernel, iterations=1)
123 | 
124 |     # findContours returns 3 values in OpenCV 3.x and 2 values in 2.4.x / 4.x;
125 |     # taking the last two (contours, hierarchy) works on all of them
126 |     # RETR_CCOMP: two-level hierarchy (outer boundaries and holes)
127 |     contours, hierarchy = cv2.findContours(thresh, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)[-2:]
128 | 
129 |     digits_rect = get_digits(contours)  # rectangles of bounding the digits in user image
130 | 
131 |     for rect in digits_rect:
132 |         x, y, w, h = rect
133 |         _ = cv2.rectangle(im, (x, y), (x + w, y + h), (0, 255, 0), 2)
134 | 
135 |         im_digit = im_original[y:y + h, x:x + w]
136 |         sz = 28
137 |         im_digit = imresize(im_digit, (sz, sz))
138 | 
139 |         for i in range(sz):  # need to remove border pixels
140 |             im_digit[i, 0] = 255
141 |             im_digit[i, 1] = 255
142 |             im_digit[0, i] = 255
143 |             im_digit[1, i] = 255
144 | 
145 |         thresh = 210
146 |         im_digit = cv2.cvtColor(im_digit, cv2.COLOR_BGR2GRAY)
147 |         im_digit = cv2.threshold(im_digit, thresh, 255, cv2.THRESH_BINARY)[1]
148 |         # im_digit = cv2.adaptiveThreshold(im_digit,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C ,cv2.THRESH_BINARY,11,2)
149 |         im_digit = (255 - im_digit)
150 | 
151 |         im_digit = imresize(im_digit, (20, 20))
152 | 
153 |         hog_img_data = pixels_to_hog_20([im_digit])
154 | 
155 |         pred = model.predict(hog_img_data)
156 | 
157 |         _ = cv2.putText(im, str(int(pred[0])), (x, y), cv2.FONT_HERSHEY_SIMPLEX, 2, (255, 0, 0), 3)
158 |         _ = cv2.putText(blank_image, str(int(pred[0])), (x, y), cv2.FONT_HERSHEY_SIMPLEX, 3, (255, 0, 0), 5)
159 | 
160 |     cv2.imwrite("original_overlay.png", im)
161 |     cv2.imwrite("final_digits.png", blank_image)
162 |     cv2.destroyAllWindows()
163 | 
164 | 
165 | if __name__ == '__main__':
166 |     print(__doc__)
167 | 
168 |     if len(sys.argv) < 3:
169 |         print(
170 |             "Enter Proper Arguments \n Usage: digit_recog.py training_image.png testing_image.png \n Example: digit_recog.py digits.png test_image.png")
171 |         sys.exit(1)  # non-zero exit code: the arguments were missing
172 | 
173 |     TRAIN_DATA_IMG = sys.argv[1]
174 |     USER_IMG = sys.argv[2]
175 | 
176 |     digits, labels = load_digits(TRAIN_DATA_IMG)
177 | 
178 |     print('training ....')
179 |     # shuffle digits
180 |     rand = np.random.RandomState(123)
181 |     shuffle_index = rand.permutation(len(digits))
182 | 
183 |     digits, labels = digits[shuffle_index], labels[shuffle_index]
184 | 
185 |     train_digits_data = pixels_to_hog_20(digits)
186 |     train_digits_labels = labels
187 | 
188 |     print('training KNearest...')  # gets 80% in most user images
189 |     model = KNN_MODEL(k=4)
190 |     model.train(train_digits_data, train_digits_labels)
191 | 
192 |     proc_user_img(USER_IMG, model)
193 | 


--------------------------------------------------------------------------------
/NEW_digit_recog.py:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | """
  3 | Created on Sat Nov 21 14:38:53 2015
  4 | 
  5 | @author: Pavitrakumar
  6 | 
  7 | """
  8 | 
  9 | import numpy as np
 10 | # from scipy.misc.pilutil import imresize
 11 | from needed import imresize
 12 | from PIL import Image
 13 | import cv2  # version 3.2.0
 14 | from skimage.feature import hog
 15 | from matplotlib import pyplot as plt
 16 | from sklearn.model_selection import train_test_split
 17 | from sklearn.metrics import accuracy_score
 18 | from sklearn.utils import shuffle
 19 | 
 20 | DIGIT_WIDTH = 10
 21 | DIGIT_HEIGHT = 20
 22 | IMG_HEIGHT = 28
 23 | IMG_WIDTH = 28
 24 | CLASS_N = 10  # 0-9
 25 | 
 26 | 
 27 | # This method splits the input training image into small cells (of a single digit) and uses these cells as training data.
 28 | # The default training image (MNIST) is a 1000x1000 size image and each digit is of size 10x20. so we divide 1000/10 horizontally and 1000/20 vertically.
 29 | def split2d(img, cell_size, flatten=True):
 30 |     h, w = img.shape[:2]
 31 |     sx, sy = cell_size
 32 |     cells = [np.hsplit(row, w // sx) for row in np.vsplit(img, h // sy)]
 33 |     cells = np.array(cells)
 34 |     if flatten:
 35 |         cells = cells.reshape(-1, sy, sx)
 36 |     return cells
 37 | 
 38 | 
 39 | def load_digits(fn):
 40 |     print('loading "%s for training" ...' % fn)
 41 |     digits_img = cv2.imread(fn, 0)
 42 |     digits = split2d(digits_img, (DIGIT_WIDTH, DIGIT_HEIGHT))
 43 |     resized_digits = []
 44 |     for digit in digits:
 45 |         resized_digits.append(imresize(digit, (IMG_WIDTH, IMG_HEIGHT)))
 46 |     labels = np.repeat(np.arange(CLASS_N), len(digits) // CLASS_N)  # integer division: repeat count must be an int
 47 |     return np.array(resized_digits), labels
 48 | 
 49 | 
 50 | def pixels_to_hog_20(img_array):
 51 |     hog_featuresData = []
 52 |     for img in img_array:
 53 |         fd = hog(img,
 54 |                  orientations=10,
 55 |                  pixels_per_cell=(5, 5),
 56 |                  cells_per_block=(1, 1))
 57 |         hog_featuresData.append(fd)
 58 |     hog_features = np.array(hog_featuresData, 'float64')
 59 |     return np.float32(hog_features)
 60 | 
 61 | 
 62 | # define a custom model in a similar class wrapper with train and predict methods
 63 | class KNN_MODEL():
 64 |     def __init__(self, k=3):
 65 |         self.k = k
 66 |         self.model = cv2.ml.KNearest_create()
 67 | 
 68 |     def train(self, samples, responses):
 69 |         self.model.train(samples, cv2.ml.ROW_SAMPLE, responses)
 70 | 
 71 |     def predict(self, samples):
 72 |         retval, results, neigh_resp, dists = self.model.findNearest(samples, self.k)
 73 |         return results.ravel()
 74 | 
 75 | 
 76 | class SVM_MODEL():
 77 |     def __init__(self, num_feats, C=1, gamma=0.1):
 78 |         self.model = cv2.ml.SVM_create()
 79 |         self.model.setType(cv2.ml.SVM_C_SVC)
 80 |         self.model.setKernel(cv2.ml.SVM_RBF)  # SVM_LINEAR, SVM_RBF
 81 |         self.model.setC(C)
 82 |         self.model.setGamma(gamma)
 83 |         self.features = num_feats
 84 | 
 85 |     def train(self, samples, responses):
 86 |         self.model.train(samples, cv2.ml.ROW_SAMPLE, responses)
 87 | 
 88 |     def predict(self, samples):
 89 |         results = self.model.predict(samples.reshape(-1, self.features))
 90 |         return results[1].ravel()
 91 | 
 92 | 
 93 | def get_digits(contours, hierarchy):
 94 |     hierarchy = hierarchy[0]
 95 |     bounding_rectangles = [cv2.boundingRect(ctr) for ctr in contours]
 96 |     final_bounding_rectangles = []
 97 |     # find the most common hierarchy level - that is where our digits' bounding boxes are
 98 |     u, indices = np.unique(hierarchy[:, -1], return_inverse=True)
 99 |     most_common_heirarchy = u[np.argmax(np.bincount(indices))]
100 | 
101 |     for r, hr in zip(bounding_rectangles, hierarchy):
102 |         x, y, w, h = r
103 |         # these size limits could vary depending on the image you are trying to predict
104 |         # we are trying to extract ONLY the rectangles with digits in them (this is a very simple way to do it)
105 |         # we use the hierarchy to keep only the boxes that share the same parent - to avoid boxes inside other digits
106 |         # ex: there could be a bounding box inside every 6, 9, 8 because of the loops in the number's appearance - we don't want that.
107 |         # read more about it here: https://docs.opencv.org/trunk/d9/d8b/tutorial_py_contours_hierarchy.html
108 |         if ((w * h) > 250) and (10 <= w <= 200) and (10 <= h <= 200) and hr[3] == most_common_heirarchy:
109 |             final_bounding_rectangles.append(r)
110 | 
111 |     return final_bounding_rectangles
112 | 
113 | 
114 | def proc_user_img(img_file, model):
115 |     print('loading "%s for digit recognition" ...' % img_file)
116 |     im = cv2.imread(img_file)
117 |     blank_image = np.zeros((im.shape[0], im.shape[1], 3), np.uint8)
118 |     blank_image.fill(255)
119 | 
120 |     imgray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
121 |     plt.imshow(imgray)
122 |     kernel = np.ones((5, 5), np.uint8)
123 | 
124 |     ret, thresh = cv2.threshold(imgray, 127, 255, 0)
125 |     thresh = cv2.erode(thresh, kernel, iterations=1)
126 |     thresh = cv2.dilate(thresh, kernel, iterations=1)
127 |     thresh = cv2.erode(thresh, kernel, iterations=1)
128 | 
129 |     contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:]  # [-2:] works on OpenCV 2.4/3.x/4.x
130 | 
131 |     digits_rectangles = get_digits(contours, hierarchy)  # rectangles of bounding the digits in user image
132 | 
133 |     for rect in digits_rectangles:
134 |         x, y, w, h = rect
135 |         cv2.rectangle(im, (x, y), (x + w, y + h), (0, 255, 0), 2)
136 |         im_digit = imgray[y:y + h, x:x + w]
137 |         im_digit = (255 - im_digit)
138 |         im_digit = imresize(im_digit, (IMG_WIDTH, IMG_HEIGHT))
139 | 
140 |         hog_img_data = pixels_to_hog_20([im_digit])
141 |         pred = model.predict(hog_img_data)
142 |         cv2.putText(im, str(int(pred[0])), (x, y), cv2.FONT_HERSHEY_SIMPLEX, 2, (255, 0, 0), 3)
143 |         cv2.putText(blank_image, str(int(pred[0])), (x, y), cv2.FONT_HERSHEY_SIMPLEX, 3, (255, 0, 0), 5)
144 | 
145 |     plt.imshow(im)
146 |     cv2.imwrite("original_overlay.png", im)
147 |     cv2.imwrite("final_digits.png", blank_image)
148 |     # cv2.destroyAllWindows()
149 | 
150 | 
151 | def get_contour_precedence(contour, cols):
152 |     return contour[1] * cols + contour[0]  # row-wise ordering
153 | 
154 | 
155 | # this function processes a custom training image
156 | # see example : custom_train_digits.jpg
157 | # if you want to use your own, it should be in a similar format
158 | def load_digits_custom(img_file):
159 |     train_data = []
160 |     train_target = []
161 |     start_class = 1
162 |     im = cv2.imread(img_file)
163 |     imgray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
164 |     plt.imshow(imgray)
165 |     kernel = np.ones((5, 5), np.uint8)
166 | 
167 |     ret, thresh = cv2.threshold(imgray, 127, 255, 0)
168 |     thresh = cv2.erode(thresh, kernel, iterations=1)
169 |     thresh = cv2.dilate(thresh, kernel, iterations=1)
170 |     thresh = cv2.erode(thresh, kernel, iterations=1)
171 | 
172 |     contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:]  # [-2:] works on OpenCV 2.4/3.x/4.x
173 |     digits_rectangles = get_digits(contours, hierarchy)  # rectangles bounding the digits in the training image
174 | 
175 |     # sort rectangles according to x,y pos so that we can label them
176 |     digits_rectangles.sort(key=lambda x: get_contour_precedence(x, im.shape[1]))
177 | 
178 |     for index, rect in enumerate(digits_rectangles):
179 |         x, y, w, h = rect
180 |         cv2.rectangle(im, (x, y), (x + w, y + h), (0, 255, 0), 2)
181 |         im_digit = imgray[y:y + h, x:x + w]
182 |         im_digit = (255 - im_digit)
183 | 
184 |         im_digit = imresize(im_digit, (IMG_WIDTH, IMG_HEIGHT))
185 |         train_data.append(im_digit)
186 |         train_target.append(start_class % 10)
187 | 
188 |         if index > 0 and (index + 1) % 10 == 0:
189 |             start_class += 1
190 |     cv2.imwrite("training_box_overlay.png", im)
191 | 
192 |     return np.array(train_data), np.array(train_target)
193 | 
194 | 
195 | # ------------------data preparation--------------------------------------------
196 | 
197 | TRAIN_MNIST_IMG = 'digits.png'
198 | TRAIN_USER_IMG = 'custom_train_digits.jpg'
199 | TEST_USER_IMG = 'test_image.png'
200 | 
201 | # digits, labels = load_digits(TRAIN_MNIST_IMG) #original MNIST data (not good detection)
202 | digits, labels = load_digits_custom(
203 |     TRAIN_USER_IMG)  # my handwritten dataset (better than MNIST on my handwritten digits)
204 | 
205 | print('train data shape', digits.shape)
206 | print('train labels shape', labels.shape)
207 | 
208 | digits, labels = shuffle(digits, labels, random_state=256)
209 | train_digits_data = pixels_to_hog_20(digits)
210 | X_train, X_test, y_train, y_test = train_test_split(train_digits_data, labels, test_size=0.33, random_state=42)
211 | 
212 | # ------------------training and testing----------------------------------------
213 | 
214 | model = KNN_MODEL(k=3)
215 | model.train(X_train, y_train)
216 | preds = model.predict(X_test)
217 | print('Accuracy: ', accuracy_score(y_test, preds))
218 | 
219 | model = KNN_MODEL(k=4)
220 | model.train(train_digits_data, labels)
221 | proc_user_img(TEST_USER_IMG, model)
222 | 
223 | model = SVM_MODEL(num_feats=train_digits_data.shape[1])
224 | model.train(X_train, y_train)
225 | preds = model.predict(X_test)
226 | print('Accuracy: ', accuracy_score(y_test, preds))
227 | 
228 | model = SVM_MODEL(num_feats=train_digits_data.shape[1])
229 | model.train(train_digits_data, labels)
230 | proc_user_img(TEST_USER_IMG, model)
231 | 
232 | # ------------------------------------------------------------------------------
233 | 


--------------------------------------------------------------------------------