├── LICENSE
├── README.md
├── app.py
├── assets
    ├── demo.mp4
    ├── question.jpg
    └── signs.png
├── model train.ipynb
├── model
    ├── __init__.py
    └── keypoint_classifier
    │   ├── coord.csv
    │   ├── keypoint_classifier.py
    │   ├── label.csv
    │   ├── model.hdf5
    │   └── model.tflite
└── requirements.txt


/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2023 Meet Patel

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Sign Language Translator

### Translate sign language to text with a camera and Python (GUI and ML) 📷🤖📝

https://github.com/meet244/Sign-Language-Translator/assets/83262693/8020b9f0-6c23-4af4-a43c-bae3d59a8899

## Problem Statement 🧩

Develop an innovative system that utilizes camera technology in web and mobile applications to translate sign language gestures into text. The primary goal is to enhance communication accessibility for the Deaf and Hard of Hearing community by providing a real-time sign language-to-text translation solution. 🌐🤟📱

## Key Features 🚀

1. **Real-Time Gesture Recognition:** Advanced algorithms for recognizing sign language gestures in real-time through the device's camera. 📹👋

2. **Text Translation:** Accurate translation mechanism to convert recognized gestures into text. 📝🔄

3. **Accessible Interface:** User-friendly interface for both sign language users and those who rely on the translated text. 🖥️👨‍👩‍🦳

4. **Multiple Sign Languages:** Support for a variety of sign languages to cater to a diverse user base. 🌍🤟

5. **Customizable Settings:** Allow users to personalize the system's settings and preferences. ⚙️🛠️

## Solution Overview 🌟

We first studied how sign language works and which signs are used in India. Here are the signs this project supports; you can try them out:

![signs](https://github.com/meet244/Sign-Language-Translator/assets/83262693/30087850-85a3-4850-bdc4-fbe3daf87cc6)

We solved this problem with a system that combines customtkinter (a modern UI library built on tkinter) for the interface, MediaPipe for hand tracking, and TensorFlow for classifying the signs.
Here's a more detailed breakdown of our solution:

- **Real-Time Gesture Recognition:** We used MediaPipe's hand tracking to detect hand landmarks in real time through the device's camera. This lets us track hand position and shape frame by frame. 👐🕐

- **Text Translation:** To convert recognized gestures into text, we trained a TensorFlow classifier on hand-landmark coordinates. At run time the app loads it as a TensorFlow Lite model and maps each prediction to a letter label from `model/keypoint_classifier/label.csv`. 🤖💬

- **Accessible Interface:** We designed a user-friendly interface using customtkinter, which offers enhanced customization and a smoother user experience. Our interface facilitates communication for both sign language users and those who rely on the translated text. 🖼️🤝


## Installation ⚙️

```shell
# Clone the repository
git clone https://github.com/meet244/Sign-Language-Translator.git
cd Sign-Language-Translator

# Install modules
pip install -r requirements.txt

# Start the application
python app.py
```

## Contributing 🤝

If you'd like to contribute to this project, please follow the guidelines. 🙌

## License 📜

This project is licensed under the [MIT License](LICENSE).
📄


--------------------------------------------------------------------------------
/app.py:
--------------------------------------------------------------------------------
import copy
import csv
import itertools

import customtkinter as ctk
import cv2
import mediapipe as mp
from PIL import Image

from model import KeyPointClassifier


# Convert MediaPipe's normalized landmarks to pixel coordinates
def calc_landmark_list(image, landmarks):
    image_width, image_height = image.shape[1], image.shape[0]

    landmark_point = []

    # Scale each landmark to the image size and clamp it inside the frame
    for landmark in landmarks.landmark:
        landmark_x = min(int(landmark.x * image_width), image_width - 1)
        landmark_y = min(int(landmark.y * image_height), image_height - 1)

        landmark_point.append([landmark_x, landmark_y])

    return landmark_point


# Make landmarks relative to the first point and scale them to [-1, 1]
def pre_process_landmark(landmark_list):
    temp_landmark_list = copy.deepcopy(landmark_list)

    # Convert to coordinates relative to landmark 0 (the wrist)
    base_x, base_y = 0, 0
    for index, landmark_point in enumerate(temp_landmark_list):
        if index == 0:
            base_x, base_y = landmark_point[0], landmark_point[1]

        temp_landmark_list[index][0] = temp_landmark_list[index][0] - base_x
        temp_landmark_list[index][1] = temp_landmark_list[index][1] - base_y

    # Convert to a one-dimensional list
    temp_landmark_list = list(
        itertools.chain.from_iterable(temp_landmark_list))

    # Normalize by the largest absolute value; fall back to 1 when every
    # point coincides, to avoid dividing by zero
    max_value = max(map(abs, temp_landmark_list)) or 1

    def normalize_(n):
        return n / max_value

    temp_landmark_list = list(map(normalize_, temp_landmark_list))

    return temp_landmark_list


# Load the KeyPointClassifier model
keypoint_classifier = KeyPointClassifier()

# Read labels from a CSV file; an empty row maps to an empty label so that
# the class indices stay aligned with the rows
with open('model/keypoint_classifier/label.csv', encoding='utf-8-sig') as f:
    keypoint_classifier_labels = [
        row[0] if row else '' for row in csv.reader(f)
    ]

# Set the appearance mode and color theme for the custom tkinter library
ctk.set_appearance_mode("Dark")
ctk.set_default_color_theme("blue")

# Create the main window
window = ctk.CTk()
window.geometry('1080x1080')
window.title("HAND SIGNS")
prev = ""


# Read one camera frame, classify the hand sign, and schedule the next update
def open_camera1():
    global prev
    width, height = 800, 600

    ok, frame = vid.read()
    if not ok:
        # No frame available (camera busy or disconnected); retry shortly
        video_lable.after(10, open_camera1)
        return

    opencv_image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    opencv_image = cv2.resize(opencv_image, (width, height))

    process_frames = hands.process(opencv_image)
    if process_frames.multi_hand_landmarks:
        for lm in process_frames.multi_hand_landmarks:
            mpdrawing.draw_landmarks(frame, lm, mphands.HAND_CONNECTIONS)

            landmark_list = calc_landmark_list(frame, lm)

            pre_processed_landmark_list = pre_process_landmark(
                landmark_list)

            hand_sign_id = keypoint_classifier(pre_processed_landmark_list)

            # Show a letter only when two consecutive predictions agree
            cur = keypoint_classifier_labels[hand_sign_id]
            if cur == prev:
                letter.configure(text=cur)
            elif cur:
                prev = cur

    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGBA)
    frame = cv2.flip(frame, 1)
    captured_image = Image.fromarray(frame)
    my_image = ctk.CTkImage(dark_image=captured_image, size=(340, 335))
    video_lable.configure(image=my_image)
    video_lable.after(10, open_camera1)


# Initialize the video capture and the MediaPipe helpers
vid = cv2.VideoCapture(0)
mphands = mp.solutions.hands
mpdrawing = mp.solutions.drawing_utils

# Create the hand tracker once so tracking state carries over between frames
hands = mphands.Hands(min_detection_confidence=0.5,
                      min_tracking_confidence=0.5,
                      static_image_mode=False)

# Create the title label
title = ctk.CTkFont(
    family='Consolas',
    weight='bold',
    size=25
)
Label = ctk.CTkLabel(
    window,
    text='HAND SIGNS',
    fg_color='steelblue',
    text_color='white',
    height=40,
    font=title,
    corner_radius=8)
Label.pack(side=ctk.TOP, fill=ctk.X, pady=(10, 4), padx=(10, 10))

# Create the main frame
main_frame = ctk.CTkFrame(master=window,
                          height=770,
                          corner_radius=8
                          )

main_frame.pack(fill=ctk.X, padx=(10, 10), pady=(5, 0))
MyFrame1 = ctk.CTkFrame(master=main_frame,
                        height=375,
                        width=365
                        )
MyFrame1.pack(fill=ctk.BOTH, expand=ctk.TRUE, side=ctk.LEFT, padx=(10, 10), pady=(10, 10))

# Create the video frame
video_frame = ctk.CTkFrame(master=MyFrame1, height=340, width=365, corner_radius=12)
video_frame.pack(side=ctk.TOP, fill=ctk.BOTH, expand=ctk.TRUE, padx=(10, 10), pady=(10, 5))

# Create the video label
video_lable = ctk.CTkLabel(master=video_frame, text='', height=340, width=365, corner_radius=12)
video_lable.pack(fill=ctk.BOTH, padx=(0, 0), pady=(0, 0))

# Create a button to start the camera feed
Camera_feed_start = ctk.CTkButton(master=MyFrame1, text='START', height=40, width=250, border_width=0, corner_radius=12, command=open_camera1)
Camera_feed_start.pack(side=ctk.TOP, pady=(5, 10))

MyFrame2 = ctk.CTkFrame(master=main_frame,
                        height=375
                        )
MyFrame2.pack(fill=ctk.BOTH, side=ctk.LEFT, expand=ctk.TRUE, padx=(10, 10), pady=(10, 10))

# Create a font for displaying letters
myfont = ctk.CTkFont(
    family='Consolas',
    weight='bold',
    size=200
)
letter = ctk.CTkLabel(MyFrame2,
                      font=myfont, fg_color='#2B2B2B', justify=ctk.CENTER)
letter.pack(fill=ctk.BOTH, side=ctk.LEFT, expand=ctk.TRUE, padx=(10, 10), pady=(10, 10))
letter.configure(text='')

MyFrame3 = ctk.CTkFrame(master=window,
                        height=175,
                        corner_radius=12
                        )
MyFrame3.pack(fill=ctk.X, expand=ctk.TRUE, padx=(10, 10), pady=(10, 10))

# Create a textbox for displaying a sentence
Sentence = ctk.CTkTextbox(MyFrame3,
                          font=("Consolas", 24))
Sentence.pack(fill=ctk.X, side=ctk.LEFT, expand=ctk.TRUE, padx=(10, 10), pady=(10, 10))

# Start the tkinter main loop
window.mainloop()

--------------------------------------------------------------------------------
/assets/demo.mp4:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/meet244/Sign-Language-Translator/fbc7add1df6e45d3388817aa21fd9f1bd4a2286a/assets/demo.mp4

--------------------------------------------------------------------------------
/assets/question.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/meet244/Sign-Language-Translator/fbc7add1df6e45d3388817aa21fd9f1bd4a2286a/assets/question.jpg

--------------------------------------------------------------------------------
/assets/signs.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/meet244/Sign-Language-Translator/fbc7add1df6e45d3388817aa21fd9f1bd4a2286a/assets/signs.png

--------------------------------------------------------------------------------
/model/__init__.py:
--------------------------------------------------------------------------------
from model.keypoint_classifier.keypoint_classifier import KeyPointClassifier

--------------------------------------------------------------------------------
/model/keypoint_classifier/keypoint_classifier.py:
--------------------------------------------------------------------------------
import numpy as np
import tensorflow as tf

class KeyPointClassifier(object):
    def __init__(
        self,
        model_path='model/keypoint_classifier/model.tflite',
        num_threads=1,
    ):
        # Initialize the KeyPointClassifier object
        # Parameters:
        # - model_path: Path to the TensorFlow Lite model file
        # - num_threads: Number of threads to use for model inference (default is 1)

        # Create an interpreter for the TensorFlow Lite model
        self.interpreter = tf.lite.Interpreter(model_path=model_path,
                                               num_threads=num_threads)

        # Allocate memory for the interpreter
        self.interpreter.allocate_tensors()

        # Get input and output details of the model
        self.input_details = self.interpreter.get_input_details()
        self.output_details = self.interpreter.get_output_details()

    def __call__(
        self,
        landmark_list,
    ):
        # Perform inference using the KeyPointClassifier
        # Parameters:
        # - landmark_list: A list of landmarks to classify

        # Get the index of the input tensor
        input_details_tensor_index = self.input_details[0]['index']

        # Set the input tensor with the landmark_list data
        self.interpreter.set_tensor(
            input_details_tensor_index,
            np.array([landmark_list], dtype=np.float32))

        # Run inference
        self.interpreter.invoke()

        # Get the index of the output tensor
        output_details_tensor_index = self.output_details[0]['index']

        # Get the result from the output tensor
        result = self.interpreter.get_tensor(output_details_tensor_index)

        # Find the index with the highest confidence score
        result_index = np.argmax(np.squeeze(result))

        # Return the index of the predicted class
        return result_index

--------------------------------------------------------------------------------
/model/keypoint_classifier/label.csv:
--------------------------------------------------------------------------------
C
A
B
D
E
F
G
H
I
J
K
L
M
N
O
P
R
T
U
V
W
 
X
Y
Z

--------------------------------------------------------------------------------
/model/keypoint_classifier/model.hdf5:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/meet244/Sign-Language-Translator/fbc7add1df6e45d3388817aa21fd9f1bd4a2286a/model/keypoint_classifier/model.hdf5

--------------------------------------------------------------------------------
/model/keypoint_classifier/model.tflite:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/meet244/Sign-Language-Translator/fbc7add1df6e45d3388817aa21fd9f1bd4a2286a/model/keypoint_classifier/model.tflite

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
customtkinter
opencv-python
Pillow
mediapipe
tensorflow
numpy

--------------------------------------------------------------------------------
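
A note on the landmark preprocessing in `app.py`: `pre_process_landmark` subtracts the first landmark (the wrist in MediaPipe's hand model) from every point, flattens the result, and divides by the largest absolute value. The sketch below restates that idea with NumPy instead of `copy` and `itertools`, to show why the classifier input should not depend on where the hand is in the frame or how large it appears. The function name `normalize_landmarks` and the three-point sample "hand" are made up for illustration and are not part of the repository.

```python
import numpy as np


def normalize_landmarks(points):
    """Return a flat, wrist-relative landmark vector scaled to [-1, 1].

    `points` is a sequence of (x, y) pixel coordinates; index 0 is used as
    the base point, mirroring pre_process_landmark in app.py.
    """
    pts = np.asarray(points, dtype=np.float32)
    rel = (pts - pts[0]).ravel()       # translate so the base point is the origin
    scale = np.abs(rel).max()
    if scale == 0:                     # all points coincide; avoid dividing by zero
        return rel.tolist()
    return (rel / scale).tolist()


# Hypothetical three-point "hand", and the same shape shifted and doubled in size
hand = [(100, 200), (110, 180), (130, 160)]
moved = [(2 * x + 50, 2 * y - 30) for x, y in hand]

features = normalize_landmarks(hand)
features_moved = normalize_landmarks(moved)
```

Both calls should yield the same feature vector, which is the property that lets one trained model handle hands at different distances and positions. The zero-scale guard is an addition for the degenerate case where every landmark lands on the same pixel.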