├── Testing.PNG
├── Link_Extraction.py
└── README.md


/Testing.PNG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AdityaPai2398/Extract-Links-from-Images/HEAD/Testing.PNG


--------------------------------------------------------------------------------
/Link_Extraction.py:
--------------------------------------------------------------------------------
from PIL import Image
import pytesseract
import cv2
import re
import os
import webbrowser

# Load the image. The file in this repo is named Testing.PNG, and the case
# matters on case-sensitive file systems.
img = cv2.imread("Testing.PNG")
if img is None:
    raise FileNotFoundError("Could not read Testing.PNG")

# Convert to grayscale before OCR.
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Write the grayscale image to a temporary file named after the process ID.
file = "{}.png".format(os.getpid())
cv2.imwrite(file, gray)

# Run Tesseract on the temporary file, then delete it.
text = pytesseract.image_to_string(Image.open(file))
os.remove(file)

print(text)

# Find URLs in the OCR text. This pattern stops at the first character outside
# the listed set (for example "/"), so anything after the host name is dropped.
urls = re.findall(r'http[s]?://(?:[a-zA-Z]|[0-9]|[.]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', text)

# Open the first URL, if any was found.
if urls:
    print(urls[0])
    webbrowser.open(urls[0])
else:
    print("No URL found in the image.")


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Extract-Links-from-Images
Using OCR and pytesseract to extract links from images in Python

A tutorial that I made explaining the code: https://www.youtube.com/watch?v=8exztOf6ul4

## Why did I build this?
Whenever there is an event such as a hackathon or workshop, we get a banner or poster for it with the registration link on it. We then have to look at the image and type the damn characters into the browser by hand, which is very annoying.

## What does this do?
It extracts links from images, then automatically opens the browser and takes us to the first link found.

## Installation
```
pip install pillow
pip install pytesseract
pip install opencv-python
```

pytesseract is only a wrapper, so the Tesseract OCR engine itself must also be installed on the system.

Once the links are extracted, the script uses `webbrowser` to automatically open the browser and take us to that link.
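
The extraction step can also be tried on its own, without an image. Below is a minimal sketch that assumes the OCR text is already available as a string. `URL_PATTERN`, `extract_urls`, and `open_first_url` are hypothetical names that do not exist in `Link_Extraction.py`, and the pattern is an assumption: it is deliberately broader than the one in the script so that paths and query strings are kept.

```python
import re
import webbrowser

# Hypothetical pattern (not the one used in Link_Extraction.py): the scheme,
# followed by any run of characters that are not whitespace, quotes, or angle brackets.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_urls(text):
    """Return the URLs found in the text, with trailing punctuation stripped."""
    return [match.rstrip(".,;:!?)") for match in URL_PATTERN.findall(text)]


def open_first_url(text):
    """Open the first URL found in the text; return it, or None if there is none."""
    urls = extract_urls(text)
    if not urls:
        return None
    webbrowser.open(urls[0])
    return urls[0]
```

Calling `open_first_url(text)` with the string returned by `pytesseract.image_to_string` would then replace the last few lines of the script. OCR output is often noisy, so the result may still need a manual check before it is trusted.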
--------------------------------------------------------------------------------