├── 1.png
├── 2.png
├── README.md
└── main.py

/1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahmedgulabkhan/TEI2S/2f7e3ea134ab8336b5d65121b5e2744f64573cf2/1.png
--------------------------------------------------------------------------------
/2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahmedgulabkhan/TEI2S/2f7e3ea134ab8336b5d65121b5e2744f64573cf2/2.png
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# TEI2S

## About
TEI2S (Text Embedded Image to Speech conversion) is a project intended to help the visually impaired. It takes an image containing embedded text as input, extracts the text from the image, and converts that text to speech. The output is an audio file that reads out the text found in the input image.

## Configuration Steps
1. Cloning the repository

```
$ git clone https://github.com/ahmedgulabkhan/TEI2S.git
```

2. Installing the dependencies

First, go to the [tesseract-OCR](https://github.com/tesseract-ocr/tesseract/wiki) engine page, read the steps, and install it on your system. After installing it, run the commands below.

```
$ pip install numpy

$ pip install Pillow

$ pip install opencv-python

$ pip install pytesseract

$ pip install gTTS
```
This installs all the required dependencies: NumPy, Pillow (the maintained fork of PIL, the Python Imaging Library, which is still imported as `PIL`), OpenCV, pytesseract and gTTS.

3. Running the file

Open the terminal and `cd` to the directory where you have cloned the repository. Then run the file by typing `python main.py`. The script reads `2.png`, prints the recognized text, and saves the speech to `output1.mp3`.
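`main.py` points `pytesseract.pytesseract.tesseract_cmd` at the default Windows install location, so the script may fail if Tesseract is installed elsewhere. The snippet below is a minimal sketch of one way to locate the executable before running the script. `find_tesseract` and `WINDOWS_DEFAULT` are hypothetical names introduced here for illustration; they are not part of this repository. It assumes the executable is named `tesseract` when it is on `PATH`.

```python
import os
import shutil

# Default Windows install location, as hard-coded in main.py.
WINDOWS_DEFAULT = r'C:\Program Files\Tesseract-OCR\tesseract.exe'


def find_tesseract(candidates=(WINDOWS_DEFAULT,)):
    """Return a path to the tesseract executable, or None if none is found.

    Hypothetical helper for illustration only; not part of main.py.
    """
    # Prefer an executable that is already on PATH.
    on_path = shutil.which('tesseract')
    if on_path:
        return on_path
    # Otherwise fall back to known install locations.
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


if __name__ == '__main__':
    path = find_tesseract()
    if path is None:
        print('tesseract not found; install it or edit tesseract_cmd in main.py')
    else:
        print('tesseract found at', path)
```

If a path is found, it can be assigned to `pytesseract.pytesseract.tesseract_cmd` in place of the hard-coded value.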

--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------
import os

import cv2
import numpy as np
import pytesseract
from PIL import Image
from gtts import gTTS

# Default Windows install location; on other systems tesseract is expected on PATH
if os.name == 'nt':
    pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

# Path of working folder on disk
src_path = './'


def get_string(img_path):
    """Preprocess the image at img_path and return the text Tesseract recognizes in it."""
    # Read image with OpenCV; imread returns None if the file cannot be read
    img = cv2.imread(img_path)
    if img is None:
        raise FileNotFoundError('Could not read image: ' + img_path)

    # Convert to grayscale
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Apply dilation and erosion to remove some noise.
    # Note: a 1x1 kernel leaves the image unchanged; use a larger kernel to filter noise.
    kernel = np.ones((1, 1), np.uint8)
    img = cv2.dilate(img, kernel, iterations=1)
    img = cv2.erode(img, kernel, iterations=1)

    # Write the image after the noise-removal step
    cv2.imwrite(src_path + "removed_noise.png", img)

    # Apply an adaptive threshold to get an image with only black and white
    img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2)

    # Write the thresholded image
    cv2.imwrite(src_path + "thres.png", img)

    # Recognize text with Tesseract through pytesseract
    result = pytesseract.image_to_string(Image.open(src_path + "thres.png"))

    return result


if __name__ == '__main__':
    print('--- Recognizing text from image ---')
    img2txt = get_string(src_path + "2.png")
    print(img2txt)

    # gTTS raises an error on empty text, so only synthesize when text was found
    if img2txt.strip():
        myobj = gTTS(text=img2txt, lang='en', slow=False)
        myobj.save('output1.mp3')
        # Open the audio file with the default player (os.startfile is Windows-only)
        if os.name == 'nt':
            os.startfile('output1.mp3')
        # On Linux, a player such as mpg321 can be used instead:
        # os.system('mpg321 output1.mp3')
    else:
        print('No text recognized; skipping speech synthesis.')
    print("------ Done -------")
--------------------------------------------------------------------------------