├── LICENSE
├── README.md
├── frozen_east_text_detection.pb
├── images
│   ├── car_wash.png
│   ├── lebron_james.jpg
│   └── sign.jpg
└── opencv_text_detection_image.py

/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2019 Abhishek Singh
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # EAST Detector for Text Detection
2 |
3 | OpenCV’s EAST (Efficient and Accurate Scene Text Detector) text detector is a deep learning model based on a novel architecture and training pattern. It is capable of
4 | - running in near real-time at 13 FPS on 720p images, and
5 | - obtaining state-of-the-art text detection accuracy.
6 |
7 | [Link to paper](https://arxiv.org/pdf/1704.03155.pdf)
8 |
9 | OpenCV’s text detector implementation of EAST is quite robust, capable of localizing text even when it’s blurred, reflective, or partially obscured.
10 |
11 | Many natural scene text detection challenges are described by Celine Mancas-Thillou and Bernard Gosselin in their excellent 2007 paper, [Natural Scene Text Understanding](https://www.tcts.fpms.ac.be/publications/regpapers/2007/VS_cmtbg2007.pdf):
12 |
13 | - **Image/sensor noise**: Sensor noise from a handheld camera is typically higher than that of a traditional scanner. Additionally, low-priced cameras will typically interpolate the pixels of their raw sensors to produce real colors.
14 |
15 | - **Viewing angles**: Natural scene text can naturally have viewing angles that are not parallel to the text, making the text harder to recognize.
16 | - **Blurring**: Uncontrolled environments tend to have blur, especially if the end user is utilizing a smartphone that does not have some form of stabilization.
17 |
18 | - **Lighting conditions**: We cannot make any assumptions regarding our lighting conditions in natural scene images. It may be near dark, the flash on the camera may be on, or the sun may be shining brightly, saturating the entire image.
19 |
20 | - **Resolution**: Not all cameras are created equal — we may be dealing with cameras with sub-par resolution.
21 |
22 | - **Non-paper objects**: Most, but not all, paper is not reflective (at least in the context of paper you are trying to scan). Text in natural scenes may be reflective, including logos, signs, etc.
23 |
24 | - **Non-planar objects**: Consider what happens when you wrap text around a bottle — the text on the surface becomes distorted and deformed. While humans may still be able to easily “detect” and read the text, our algorithms will struggle. We need to be able to handle such use cases.
25 |
26 | - **Unknown layout**: We cannot use any a priori information to give our algorithms “clues” as to where the text resides.
27 |
28 |
29 | ## Contributing
30 | Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
31 |
32 | ### Thanks to [Adrian's Blog](https://www.pyimagesearch.com/2018/08/20/opencv-text-detection-east-text-detector/) for a comprehensive blog post on the EAST Detector.
33 |
34 | ## License
35 | [MIT](https://choosealicense.com/licenses/mit/)
--------------------------------------------------------------------------------
/frozen_east_text_detection.pb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ZER-0-NE/EAST-Detector-for-text-detection-using-OpenCV/5b6c8d025778e5402c327a4a3f484a16ce7dda84/frozen_east_text_detection.pb
--------------------------------------------------------------------------------
/images/car_wash.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ZER-0-NE/EAST-Detector-for-text-detection-using-OpenCV/5b6c8d025778e5402c327a4a3f484a16ce7dda84/images/car_wash.png
--------------------------------------------------------------------------------
/images/lebron_james.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ZER-0-NE/EAST-Detector-for-text-detection-using-OpenCV/5b6c8d025778e5402c327a4a3f484a16ce7dda84/images/lebron_james.jpg
--------------------------------------------------------------------------------
/images/sign.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ZER-0-NE/EAST-Detector-for-text-detection-using-OpenCV/5b6c8d025778e5402c327a4a3f484a16ce7dda84/images/sign.jpg
--------------------------------------------------------------------------------
/opencv_text_detection_image.py:
--------------------------------------------------------------------------------
1 | # USAGE
2 | # python3 opencv_text_detection_image.py --image images/lebron_james.jpg --east frozen_east_text_detection.pb
3 |
4 | # import the necessary packages
5 | from imutils.object_detection import non_max_suppression
6 | import numpy as np
7 | import argparse
8 | import time
9 | import cv2
10 |
11 | # construct the argument parser and parse the arguments
12 | ap = argparse.ArgumentParser()
13 | ap.add_argument("-i", "--image", type=str,
14 | 	help="path to input image")
15 | ap.add_argument("-east", "--east", type=str,
16 | 	help="path to input EAST text detector")
17 | ap.add_argument("-c", "--min-confidence", type=float, default=0.5,
18 | 	help="minimum probability required to inspect a region")
19 | ap.add_argument("-w", "--width", type=int, default=320,
20 | 	help="resized image width (should be a multiple of 32)")
21 | ap.add_argument("-e", "--height", type=int, default=320,
22 | 	help="resized image height (should be a multiple of 32)")
23 | args = vars(ap.parse_args())
24 |
25 | # load the input image and grab the image dimensions
26 | image = cv2.imread(args["image"])
27 | orig = image.copy()
28 | (H, W) = image.shape[:2]
29 |
30 | # set the new width and height and then determine the ratio of change
31 | # for both the width and height
32 | (newW, newH) = (args["width"], args["height"])
33 | rW = W / float(newW)
34 | rH = H / float(newH)
35 |
36 | # resize the image and grab the new image dimensions
37 | image = cv2.resize(image, (newW, newH))
38 | (H, W) = image.shape[:2]
39 |
40 | # define the two output layer names for the EAST detector model that
41 | # we are interested in -- the first is the output probabilities and the
42 | # second can be used to derive the bounding box coordinates of text
43 | layerNames = [
44 | 	"feature_fusion/Conv_7/Sigmoid",
45 | 	"feature_fusion/concat_3"]
46 |
47 | # load the pre-trained EAST text detector
48 | print("[INFO] loading EAST text
detector...")
49 | net = cv2.dnn.readNet(args["east"])
50 |
51 | # construct a blob from the image and then perform a forward pass of
52 | # the model to obtain the two output layer sets
53 | blob = cv2.dnn.blobFromImage(image, 1.0, (W, H),
54 | 	(123.68, 116.78, 103.94), swapRB=True, crop=False)
55 | start = time.time()
56 | net.setInput(blob)
57 | (scores, geometry) = net.forward(layerNames)
58 | end = time.time()
59 |
60 | # show timing information on text prediction
61 | print("[INFO] text detection took {:.6f} seconds".format(end - start))
62 |
63 | # grab the number of rows and columns from the scores volume, then
64 | # initialize our set of bounding box rectangles and corresponding
65 | # confidence scores
66 | (numRows, numCols) = scores.shape[2:4]
67 | rects = []
68 | confidences = []
69 |
70 | # loop over the number of rows
71 | for y in range(0, numRows):
72 | 	# extract the scores (probabilities), followed by the geometrical
73 | 	# data used to derive potential bounding box coordinates that
74 | 	# surround text
75 | 	scoresData = scores[0, 0, y]
76 | 	xData0 = geometry[0, 0, y]
77 | 	xData1 = geometry[0, 1, y]
78 | 	xData2 = geometry[0, 2, y]
79 | 	xData3 = geometry[0, 3, y]
80 | 	anglesData = geometry[0, 4, y]
81 |
82 | 	# loop over the number of columns
83 | 	for x in range(0, numCols):
84 | 		# if our score does not have sufficient probability, ignore it
85 | 		if scoresData[x] < args["min_confidence"]:
86 | 			continue
87 |
88 | 		# compute the offset factor as our resulting feature maps will
89 | 		# be 4x smaller than the input image
90 | 		(offsetX, offsetY) = (x * 4.0, y * 4.0)
91 |
92 | 		# extract the rotation angle for the prediction and then
93 | 		# compute the sine and cosine
94 | 		angle = anglesData[x]
95 | 		cos = np.cos(angle)
96 | 		sin = np.sin(angle)
97 |
98 | 		# use the geometry volume to derive the width and height of
99 | 		# the bounding box
100 | 		h = xData0[x] + xData2[x]
101 | 		w = xData1[x] + xData3[x]
102 |
103 | 		# compute both the starting and ending (x,
y)-coordinates for
104 | 		# the text prediction bounding box
105 | 		endX = int(offsetX + (cos * xData1[x]) + (sin * xData2[x]))
106 | 		endY = int(offsetY - (sin * xData1[x]) + (cos * xData2[x]))
107 | 		startX = int(endX - w)
108 | 		startY = int(endY - h)
109 |
110 | 		# add the bounding box coordinates and probability score to
111 | 		# our respective lists
112 | 		rects.append((startX, startY, endX, endY))
113 | 		confidences.append(scoresData[x])
114 |
115 | # apply non-maxima suppression to suppress weak, overlapping bounding
116 | # boxes
117 | boxes = non_max_suppression(np.array(rects), probs=confidences)
118 |
119 | # loop over the bounding boxes
120 | for (startX, startY, endX, endY) in boxes:
121 | 	# scale the bounding box coordinates based on the respective
122 | 	# ratios
123 | 	startX = int(startX * rW)
124 | 	startY = int(startY * rH)
125 | 	endX = int(endX * rW)
126 | 	endY = int(endY * rH)
127 |
128 | 	# draw the bounding box on the image
129 | 	cv2.rectangle(orig, (startX, startY), (endX, endY), (0, 255, 0), 2)
130 |
131 | # show the output image
132 | cv2.imshow("Text Detection", orig)
133 | cv2.waitKey(0)
134 |
--------------------------------------------------------------------------------
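The row/column decoding loop in `opencv_text_detection_image.py` can be factored into a standalone, testable helper. A minimal sketch follows; the function name `decode_predictions` and the synthetic test values are hypothetical (not part of the repository), but the geometry math mirrors the script's loop, assuming the same 4x-downsampled EAST output layout (score map plus four edge distances and a rotation angle per cell):

```python
import numpy as np

def decode_predictions(scores, geometry, min_confidence=0.5):
    """Decode EAST score/geometry volumes into boxes and confidences.

    scores:   shape (1, 1, numRows, numCols) -- text/no-text probabilities
    geometry: shape (1, 5, numRows, numCols) -- distances to the top,
              right, bottom, and left box edges, plus a rotation angle
    Returns (rects, confidences) suitable for non-maxima suppression.
    """
    (numRows, numCols) = scores.shape[2:4]
    rects, confidences = [], []
    for y in range(numRows):
        scoresData = scores[0, 0, y]
        d0, d1, d2, d3 = (geometry[0, i, y] for i in range(4))
        anglesData = geometry[0, 4, y]
        for x in range(numCols):
            # skip cells below the confidence threshold
            if scoresData[x] < min_confidence:
                continue
            # feature maps are 4x smaller than the (resized) input image
            offsetX, offsetY = (x * 4.0, y * 4.0)
            cos, sin = np.cos(anglesData[x]), np.sin(anglesData[x])
            # box size from the edge distances
            h = d0[x] + d2[x]
            w = d1[x] + d3[x]
            # ending corner, rotated by the predicted angle, then the
            # starting corner offset back by the box size
            endX = int(offsetX + (cos * d1[x]) + (sin * d2[x]))
            endY = int(offsetY - (sin * d1[x]) + (cos * d2[x]))
            rects.append((int(endX - w), int(endY - h), endX, endY))
            confidences.append(float(scoresData[x]))
    return rects, confidences
```

The returned `rects` and `confidences` can be passed directly to `imutils.object_detection.non_max_suppression`, exactly as the script does with its inline lists; extracting the helper simply makes the decoding step unit-testable without running the network.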