├── LICENSE
├── README.md
├── final.py
└── images
    ├── devinfo1.png
    ├── devinfo2.png
    ├── system1.jpg
    └── system2.jpg


/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2018 Boudhayan Dev

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

# Blind Reader [![Ask Me Anything !](https://img.shields.io/badge/Ask%20me-anything-1abc9c.svg?longCache=true&style=plastic)](mailto:dev.dibyo@gmail.com) [![made-with-python](https://img.shields.io/badge/Made%20with-Python-blue.svg?longCache=true&style=plastic)](https://www.python.org/) [![GitHub license](https://img.shields.io/github/license/Naereen/StrapDown.js.svg?longCache=true&style=plastic)](https://github.com/Naereen/StrapDown.js/blob/master/LICENSE) ![PyPI - Status](https://img.shields.io/pypi/status/Django.svg?style=plastic) [![Contributor](https://img.shields.io/badge/Contributors-2-orange.svg?longCache=true&style=plastic)](https://github.com/boudhayan-dev/Blind-Reader-project/graphs/contributors)

###### Welcome to the Blind Reader project!

Blind Reader is a portable, low-cost reading device for blind people. Braille machines are expensive and therefore out of reach for many. Blind Reader addresses this limitation by offering an affordable alternative. The system uses OCR to convert an image of a page into text and reads the text aloud using Text-to-Speech conversion. It supports audio output through speakers as well as headphones, and the user can pause the audio at any time. It also stores the captured images in their respective book folder, creating a digital backup at the same time. With this system, a blind user does not need a Braille machine to read a book. All it takes is a button to control the entire system!

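The capture, OCR, and speech flow described above can be sketched as a small pipeline. This is a minimal sketch, not the project's actual code: `read_page`, `capitalize_words`, and the injected `capture`, `ocr`, `tts`, `play`, and `read_bytes` callables are hypothetical names standing in for the Pi Camera, the Vision API, the Text-to-Speech API, the audio player, and file loading.

```python
from datetime import datetime
from typing import Callable


def capitalize_words(text: str) -> str:
    """Collapse whitespace and capitalize the first letter of every word."""
    return " ".join(w[0].upper() + w[1:].lower() for w in text.split())


def read_page(
    capture: Callable[[str], None],      # hypothetical: saves a photo to the given path
    ocr: Callable[[bytes], str],         # hypothetical: image bytes -> recognized text
    tts: Callable[[str, str], None],     # hypothetical: writes speech for text to a path
    play: Callable[[str], None],         # hypothetical: starts playback of an audio file
    read_bytes: Callable[[str], bytes],  # hypothetical: loads the saved image
) -> str:
    """Run one capture -> OCR -> speech -> playback cycle and return the text."""
    stamp = datetime.now().strftime("%H-%M-%S")
    image_name = f"Book-{stamp}.jpg"
    audio_name = f"Book-{stamp}.mp3"

    capture(image_name)
    text = capitalize_words(ocr(read_bytes(image_name)))
    tts(text, audio_name)
    play(audio_name)
    return text
```

Passing the hardware and cloud calls in as functions keeps the flow testable on a machine without a camera or network access.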

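The pause feature mentioned above amounts to a two-state toggle driven by one button. A minimal sketch, assuming a hypothetical `PlaybackToggle` class with injected `pause` and `resume` callbacks; in this project those would plausibly be `pygame.mixer.music.pause` and `pygame.mixer.music.unpause`, which `final.py` already calls.

```python
from typing import Callable


class PlaybackToggle:
    """Tracks whether audio is paused and flips the state on each button press."""

    def __init__(self, pause: Callable[[], None], resume: Callable[[], None]) -> None:
        self._pause = pause
        self._resume = resume
        self.paused = False

    def press(self) -> bool:
        """Handle one button press; return True if audio is now paused."""
        if self.paused:
            self._resume()
        else:
            self._pause()
        self.paused = not self.paused
        return self.paused

    def reset(self) -> None:
        """Mark playback as running, e.g. when a new page starts playing."""
        self.paused = False
```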

## Dependency

#### Hardware Requirements:

- Raspberry Pi 3B.
- Pi Camera.
- Speakers / Headphones.
- Push buttons - 2.
- LDR - 1.
- LED - 4.
- Power supply - 5V, 2A.

#### Software Requirements:

- Python 3.
- Python Dependencies:
    - RPi.GPIO
    - Pygame library.
    - picamera library.
    - google-cloud.
    - time.
    - os.
    - datetime.
- Google Cloud API - Vision, Text-to-Speech

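Before wiring anything up, it can help to confirm that the Python dependencies listed above are importable. A minimal sketch, assuming the import names used in `final.py`; `REQUIRED_MODULES` and `check_modules` are hypothetical names, not part of the project.

```python
import importlib.util
from typing import Dict, Iterable

# Import names taken from final.py; the stdlib modules are listed for completeness.
REQUIRED_MODULES = (
    "RPi.GPIO", "pygame", "picamera",
    "google.cloud.vision", "google.cloud.texttospeech",
    "time", "os", "datetime",
)


def check_modules(names: Iterable[str] = REQUIRED_MODULES) -> Dict[str, bool]:
    """Return a mapping of module name -> whether it can be located."""
    status = {}
    for name in names:
        try:
            status[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. "google.cloud") is missing.
            status[name] = False
    return status


if __name__ == "__main__":
    for name, ok in check_modules().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```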

## Usage

- Use the following commands to install the Google Cloud Python dependencies.

```
pip3 install --upgrade google-api-python-client
pip3 install --upgrade google-cloud-vision
pip3 install --upgrade google-cloud-texttospeech
pip3 install --upgrade google-cloud
```

  See the Google Cloud Vision API documentation for further details.


- Activate the Cloud Vision API and the Google Cloud Text-to-Speech API from the Google Cloud dashboard, then download the service account credentials (JSON file). Put the path to this file in place of `YOUR_GCLOUD_JSON_CREDENTIALS` in `final.py`.

- Connect the hardware as follows (pin numbers are physical BOARD pin numbers, since `final.py` calls `GPIO.setmode(GPIO.BOARD)`):
    - Pi Camera --> Camera slot on the Raspberry Pi 3.
    - Pair a Bluetooth speaker / plug headphones into the Raspberry Pi 3 audio jack.
    - LDR --> pin 37.
    - 4 LEDs --> pins 29, 31, 33, 35 respectively.
    - Push Button 1 (Camera capture) --> pin 16.
    - Push Button 2 (Play/Pause audio) --> pin 18.


- Use the following command to start the system:

```
python3 /path/to/your/final.py
```


- Place the page to be read under the camera and press Button 1 to have it read out. Press Button 2 to pause or resume the audio.

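The wiring from the list above can be kept in one place as a pin map. A minimal sketch, assuming physical BOARD numbering as in `final.py`; the `PINS` dictionary and the `validate_pins` helper are hypothetical names, not part of the project.

```python
from typing import Dict, List

# Physical (BOARD) pin numbers from the wiring list.
PINS: Dict[str, List[int]] = {
    "capture_button": [16],
    "pause_button": [18],
    "ldr": [37],
    "leds": [29, 31, 33, 35],
}


def validate_pins(pins: Dict[str, List[int]]) -> List[int]:
    """Return all pins sorted; raise ValueError on duplicates or out-of-range pins."""
    flat = [p for group in pins.values() for p in group]
    if len(flat) != len(set(flat)):
        raise ValueError("a pin is assigned to more than one role")
    for p in flat:
        if not 1 <= p <= 40:  # the Raspberry Pi 3B header has 40 pins
            raise ValueError(f"pin {p} is outside the 40-pin header")
    return sorted(flat)
```

Running `validate_pins(PINS)` before calling `GPIO.setup` catches a pin that was accidentally assigned to two roles.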

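The credentials step above comes down to pointing the `GOOGLE_APPLICATION_CREDENTIALS` environment variable at the downloaded JSON file before any Google Cloud client is created. A minimal sketch; `use_credentials` is a hypothetical helper, and the check only confirms that the file is readable JSON, not that the key itself is valid.

```python
import json
import os


def use_credentials(path: str) -> str:
    """Check that `path` is a JSON file and export it for the Google Cloud clients."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"credentials file not found: {path}")
    with open(path, "r", encoding="utf-8") as fh:
        json.load(fh)  # raises json.JSONDecodeError (a ValueError) if malformed
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = os.path.abspath(path)
    return os.environ["GOOGLE_APPLICATION_CREDENTIALS"]
```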

## Demonstration


## Resources

- [Google Cloud Platform.](https://cloud.google.com/python/docs/reference/)

- [Pygame Python library.](https://www.pygame.org/news)

- [Raspberry Pi.](https://www.raspberrypi.org/)

- [Python.](https://www.python.org/)



--------------------------------------------------------------------------------
/final.py:
--------------------------------------------------------------------------------
# Dependencies are imported here.
# See https://cloud.google.com/python/docs/ for instructions on installing the Google Cloud dependencies.
# Note: `vision.types`, `texttospeech.types` and `texttospeech.enums` belong to the
# pre-2.0 releases of the Google Cloud client libraries.
import datetime
import os
import time

import pygame
import RPi.GPIO as GPIO
from picamera import PiCamera
from google.cloud import vision
from google.cloud.vision import types
from google.cloud import texttospeech

# Replace "YOUR_GCLOUD_JSON_CREDENTIALS" with the path to your credentials file.
# Credentials can be found here ----> https://console.cloud.google.com
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'YOUR_GCLOUD_JSON_CREDENTIALS'

# client ---> Vision API client object.
# camera ---> PiCamera object.
client = vision.ImageAnnotatorClient()
camera = PiCamera()

# Set the camera resolution to 512x512.
camera.resolution = (512, 512)

# Pin assignments (physical BOARD numbering).
# Pin 16 ---> Triggers camera capture.
# Pin 18 ---> Pause/Play the audio.
# Pin 37 ---> LDR input.
# Pins 35, 33, 31, 29 are connected to LEDs.
CAPTURE_PIN = 16
PAUSE_PIN = 18
LDR_PIN = 37
LED_PINS = [35, 33, 31, 29]

# Set the GPIO pin configuration of the RPi3 to BOARD mode.
GPIO.setmode(GPIO.BOARD)
GPIO.setup(CAPTURE_PIN, GPIO.IN, pull_up_down=GPIO.PUD_UP)
GPIO.setup(PAUSE_PIN, GPIO.IN, pull_up_down=GPIO.PUD_UP)
GPIO.setup(LDR_PIN, GPIO.IN)  # LDR input
for pin in LED_PINS:
    GPIO.setup(pin, GPIO.OUT)

# Initialize the pygame mixer for playing the audio files.
# pygame.init() ---> Uncomment this line if there is a problem with audio playback.
pygame.mixer.init()

print("Press Start button to read out the page")

# flag ---> Status flag for the audio file: 0 = playing, 1 = paused.
# light_on ---> Status flag for the LEDs: 0 = OFF, 1 = ON.
flag = 0
light_on = 0
file_playing = 0


# Helper function to perform Text-to-Speech conversion using the Google Text-to-Speech API.
# text ---> The text to be converted to audio.
# audioname ---> The name of the output audio file.
def synthesize_text(text, audioname):

    # tts_client ---> Text-to-Speech API client object.
    tts_client = texttospeech.TextToSpeechClient()
    input_text = texttospeech.types.SynthesisInput(text=text)

    # Note: the voice can also be specified by name.
    # Names of voices can be retrieved with tts_client.list_voices().
    voice = texttospeech.types.VoiceSelectionParams(
        language_code='en-US',
        ssml_gender=texttospeech.enums.SsmlVoiceGender.FEMALE)

    # audio_config ---> Configuration for the output audio file.
    # Other encodings such as LINEAR16 and OGG_OPUS are also supported.
    audio_config = texttospeech.types.AudioConfig(
        audio_encoding=texttospeech.enums.AudioEncoding.MP3)

    # response ---> Receives the API response for the input text.
    response = tts_client.synthesize_speech(input_text, voice, audio_config)

    # The response's audio_content is binary. Write it to the output file.
    with open(audioname, 'wb') as out:
        out.write(response.audio_content)
        print('Audio content written to file {}'.format(audioname))


# Poll the buttons and the LDR.
# Capture a new image when the camera capture button is pressed.
try:
    while True:

        # capture_button ---> Button input to capture a new image.
        # pause_button ---> Button input to pause/play the audio file.
        # Both inputs use pull-ups, so they read 0 while pressed.
        capture_button = GPIO.input(CAPTURE_PIN)
        pause_button = GPIO.input(PAUSE_PIN)

        if not capture_button:
            time.sleep(1)
            flag = 0
            print('Capturing new image...')

            # image_name ---> Name of the captured image.
            image_name = "Book-" + datetime.datetime.now().strftime("%H-%M-%S") + ".jpg"

            # 2 second sleep to give the camera sufficient time to initialize.
            camera.start_preview()
            time.sleep(2)
            camera.capture(image_name)
            camera.stop_preview()
            print("Image clicked . . .")

            # content ---> Image contents in binary.
            with open(image_name, 'rb') as image_file:
                content = image_file.read()
            print("Sending Image to OCR . . ")

            # Send the binary to the OCR API for text extraction.
            image = types.Image(content=content)
            response = client.document_text_detection(image=image)
            labels = response.full_text_annotation

            # Capitalize every word's first letter and lower-case the rest.
            # Add an identifier at the end to signify the end of the page.
            t = "".join(w[0].upper() + w[1:].lower() + " " for w in labels.text.split())

            print("\n" + t)
            t += ". ,End Of Page Mr. X"
            print("Converting your text to sound . . .")

            # audioname ---> Name of the audio file.
            audioname = "Book-" + datetime.datetime.now().strftime("%H-%M-%S") + ".mp3"
            synthesize_text(t, audioname)
            print("Starting audio. . .")
            print("Press pause button to Pause/Resume")

            # Load the audio file into the mixer and start playing it.
            pygame.mixer.music.load(audioname)
            pygame.mixer.music.play()
            time.sleep(4)
            file_playing = 1
            # os.system("vlc {}".format(audioname))

        # Check the pause button.
        # flag=0 ---> audio playing.
        # flag=1 ---> audio paused.
        if not pause_button:
            time.sleep(1)
            if flag == 0:
                pygame.mixer.music.pause()
                flag = 1
                print("Paused. . . ")
            elif flag == 1:
                pygame.mixer.music.unpause()
                flag = 0
                print("Resumed. . . ")

        # LDR ---> pin 37. Switch on the LEDs in low lighting conditions.
        if GPIO.input(LDR_PIN) and light_on == 0:
            time.sleep(0.2)
            for pin in LED_PINS:
                GPIO.output(pin, 1)
            light_on = 1

        # Switch off the LEDs.
        if not GPIO.input(LDR_PIN) and light_on == 1:
            time.sleep(0.2)
            for pin in LED_PINS:
                GPIO.output(pin, 0)
            light_on = 0

except Exception as e:
    print(e)
finally:
    # Release the GPIO pins on any exit, including Ctrl-C.
    GPIO.cleanup()


--------------------------------------------------------------------------------
/images/devinfo1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/boudhayan-dev/Blind-Reader-project/fdd3fdebf618cb611f9810ebf68bd3bc1924aea2/images/devinfo1.png


--------------------------------------------------------------------------------
/images/devinfo2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/boudhayan-dev/Blind-Reader-project/fdd3fdebf618cb611f9810ebf68bd3bc1924aea2/images/devinfo2.png


--------------------------------------------------------------------------------
/images/system1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/boudhayan-dev/Blind-Reader-project/fdd3fdebf618cb611f9810ebf68bd3bc1924aea2/images/system1.jpg


--------------------------------------------------------------------------------
/images/system2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/boudhayan-dev/Blind-Reader-project/fdd3fdebf618cb611f9810ebf68bd3bc1924aea2/images/system2.jpg


--------------------------------------------------------------------------------