├── LICENSE
├── README.md
├── align_dataset_mtcnn.py
├── det1.npy
├── det2.npy
├── det3.npy
├── detect_face.py
├── face_aligner.py
├── face_detect.py
├── facenet.py
├── final_sotware.py
├── images
│   ├── image2.png
│   ├── image3.png
│   ├── image4.png
│   ├── image5.png
│   └── images.txt
├── sheet.py
└── user_interface.py

/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2019 aashishrai3799
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # This is the official implementation of
 2 | 
 3 | ## An End-to-End Real-Time Face Identification and Attendance System using Convolutional Neural Networks
 4 | (https://ieeexplore.ieee.org/document/9029001)
 5 | 
 6 | 
 7 | 
 8 | An end-to-end face identification and attendance approach using Convolutional Neural Networks (CNN), which processes CCTV footage or a video of the class and marks the attendance of the entire class simultaneously. One of the main advantages of the proposed solution is its robustness against common challenges such as occlusion (partially visible/covered faces), orientation, alignment, and illumination in the classroom.
 9 | 
10 | # Libraries
11 | 1. Tensorflow 1.14
12 | 2. Numpy
13 | 3. OpenCV
14 | 4. MTCNN
15 | 5. Sklearn
16 | 6. xlsxwriter, xlrd
17 | 7. scipy
18 | 8. pickle
19 | 
20 | 
21 | # How to use
22 | 
23 | ## Installation
24 | 1. Install the required libraries (a Conda environment is preferred).
25 | 2. Download the pre-trained model from [[link]](https://drive.google.com/open?id=1EXPBSXwTaqrSC0OhUdXNmKSh9qJUQ55-) and copy it to the main directory.
26 | 3. Make sure you have the directory structure mentioned below (you have to manually create two folders named "attendance" and "output" in the main directory; refer to the "Main" directory structure).
27 | 4. To verify that everything is installed correctly, run 'user_interface.py'.
28 | 
29 | ## Create Dataset
30 | 1. Run 'user_interface.py'.
31 | 2. Click on the 'Create' button.
32 | 3. Select 'webcam' if you wish to create a live dataset (you can leave all other fields empty).
33 | 4. Click on the 'Continue' button to start streaming the webcam feed.
34 | 5. Press 's' to save face images. Capture as many images as you can (approx. 80-100 preferred); a rough sketch of this capture loop is shown below.
35 | 6. Press 'q' to exit.
36 | 7. Likewise, create the datasets for the other people.
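For reference, steps 4-6 amount to a standard OpenCV capture loop. The sketch below is purely illustrative and is **not** part of this repository: the output folder `dataset/person_1`, the file-name pattern, and the default webcam index are assumptions, and the actual 'Create' flow in `user_interface.py` may additionally detect and crop faces before saving.

```python
# Illustrative sketch only (not part of this repository); names are assumptions.
import os
import cv2

save_dir = 'dataset/person_1'      # hypothetical output folder for one person
os.makedirs(save_dir, exist_ok=True)

cap = cv2.VideoCapture(0)          # default webcam
count = 0
while True:
    ret, frame = cap.read()
    if not ret:
        break
    cv2.imshow('Create Dataset', frame)
    key = cv2.waitKey(1) & 0xFF
    if key == ord('s'):            # 's' saves the current frame
        count += 1
        cv2.imwrite(os.path.join(save_dir, 'img_%03d.png' % count), frame)
    elif key == ord('q'):          # 'q' quits
        break

cap.release()
cv2.destroyAllWindows()
```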
37 | 
38 | ## Training
39 | 1. Run 'user_interface.py'.
40 | 2. Click on the 'Train' button.
41 | 3. Training may take several minutes (depending on your system configuration).
42 | 4. Once training is completed, a 'classifier.pkl' file will be generated.
43 | 
44 | ## Run
45 | 1. Run 'user_interface.py'.
46 | 2. Click on the 'Run' button.
47 | 3. Select 'Webcam' from the list and leave all fields blank.
48 | 4. Click on the 'Mark Attendance' button.
49 | 5. The attendance sheet will be generated automatically with the current date/time.
50 | 
51 | ## Make sure to have the following directory structure
52 | 1. 'Main' directory:
53 | 
54 | 2. 'output' directory:
55 | 
56 | 3. '20180402-114759' directory:
57 | 
58 | 
59 | 
60 | 
61 | The file for data augmentation will be uploaded soon.
62 | 
63 | To learn more about how the software works, refer to our paper.
64 | 
65 | 
66 | 
67 | ## Download pre-trained model:
68 | https://drive.google.com/open?id=1EXPBSXwTaqrSC0OhUdXNmKSh9qJUQ55-
69 | 
70 | 
71 | ## Cite
72 | If you find this paper/code useful, please consider citing
73 | 
74 | ```
75 | @INPROCEEDINGS{9029001,
76 | author={Rai, Aashish and Karnani, Rashmi and Chudasama, Vishal and Upla, Kishor},
77 | booktitle={2019 IEEE 16th India Council International Conference (INDICON)},
78 | title={An End-to-End Real-Time Face Identification and Attendance System using Convolutional Neural Networks},
79 | year={2019}, volume={}, number={}, pages={1-4},
80 | doi={10.1109/INDICON47234.2019.9029001}}
81 | ```
82 | 
83 | ## License
84 | 
85 | The code is available under the MIT License. Please read the license terms available at [[Link]](https://github.com/aashishrai3799/Automated-Attendance-System-using-CNN/blob/master/LICENSE)
86 | 
--------------------------------------------------------------------------------
/align_dataset_mtcnn.py:
--------------------------------------------------------------------------------
 1 | """Performs face alignment and stores face thumbnails in the output directory."""
 2 | # MIT License
 3 | #
 4 | # Copyright (c) 2016 David Sandberg
 5 | #
 6 | # Permission is hereby granted, free of charge, to any person obtaining a copy
 7 | # of this software and associated documentation files (the "Software"), to deal
 8 | # in the Software without restriction, including without limitation the rights
 9 | # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
10 | # copies of the Software, and to permit persons to whom the Software is
11 | # furnished to do so, subject to the following conditions:
12 | #
13 | # The above copyright notice and this permission notice shall be included in all
14 | # copies or substantial portions of the Software.
15 | #
16 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
17 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
18 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
19 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
20 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
21 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
22 | # SOFTWARE.
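# ---------------------------------------------------------------------------
# Editorial usage note (hedged example, not part of the original file):
# this script is normally invoked from the command line. The directory names
# below are illustrative assumptions; the flags and their defaults come from
# the argparse definitions at the bottom of this file.
#
#   python align_dataset_mtcnn.py raw_faces/ aligned_faces/ \
#       --image_size 182 --margin 44 --random_order
#
# 'raw_faces/' is expected to contain one sub-folder per person (see
# facenet.get_dataset); aligned thumbnails are written to 'aligned_faces/'
# with the same per-person folder layout, together with a
# bounding_boxes_XXXXX.txt log. Note that 'import align.detect_face' below
# assumes an 'align' package; with the flat layout of this repository it may
# need to be changed to 'import detect_face'.
# ---------------------------------------------------------------------------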
23 | 24 | from __future__ import absolute_import 25 | from __future__ import division 26 | from __future__ import print_function 27 | 28 | from scipy import misc 29 | import sys 30 | import os 31 | import argparse 32 | import tensorflow as tf 33 | import numpy as np 34 | import facenet 35 | import align.detect_face 36 | import random 37 | from time import sleep 38 | 39 | def main(args): 40 | sleep(random.random()) 41 | output_dir = os.path.expanduser(args.output_dir) 42 | if not os.path.exists(output_dir): 43 | os.makedirs(output_dir) 44 | # Store some git revision info in a text file in the log directory 45 | src_path,_ = os.path.split(os.path.realpath(__file__)) 46 | facenet.store_revision_info(src_path, output_dir, ' '.join(sys.argv)) 47 | dataset = facenet.get_dataset(args.input_dir) 48 | 49 | print('Creating networks and loading parameters') 50 | 51 | with tf.Graph().as_default(): 52 | gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=args.gpu_memory_fraction) 53 | sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False)) 54 | with sess.as_default(): 55 | pnet, rnet, onet = align.detect_face.create_mtcnn(sess, None) 56 | 57 | minsize = 20 # minimum size of face 58 | threshold = [ 0.6, 0.7, 0.7 ] # three steps's threshold 59 | factor = 0.709 # scale factor 60 | 61 | # Add a random key to the filename to allow alignment using multiple processes 62 | random_key = np.random.randint(0, high=99999) 63 | bounding_boxes_filename = os.path.join(output_dir, 'bounding_boxes_%05d.txt' % random_key) 64 | 65 | with open(bounding_boxes_filename, "w") as text_file: 66 | nrof_images_total = 0 67 | nrof_successfully_aligned = 0 68 | if args.random_order: 69 | random.shuffle(dataset) 70 | for cls in dataset: 71 | output_class_dir = os.path.join(output_dir, cls.name) 72 | if not os.path.exists(output_class_dir): 73 | os.makedirs(output_class_dir) 74 | if args.random_order: 75 | random.shuffle(cls.image_paths) 76 | for image_path in cls.image_paths: 77 | nrof_images_total += 1 78 | filename = os.path.splitext(os.path.split(image_path)[1])[0] 79 | output_filename = os.path.join(output_class_dir, filename+'.png') 80 | print(image_path) 81 | if not os.path.exists(output_filename): 82 | try: 83 | img = misc.imread(image_path) 84 | except (IOError, ValueError, IndexError) as e: 85 | errorMessage = '{}: {}'.format(image_path, e) 86 | print(errorMessage) 87 | else: 88 | if img.ndim<2: 89 | print('Unable to align "%s"' % image_path) 90 | text_file.write('%s\n' % (output_filename)) 91 | continue 92 | if img.ndim == 2: 93 | img = facenet.to_rgb(img) 94 | img = img[:,:,0:3] 95 | 96 | bounding_boxes, _ = align.detect_face.detect_face(img, minsize, pnet, rnet, onet, threshold, factor) 97 | nrof_faces = bounding_boxes.shape[0] 98 | if nrof_faces>0: 99 | det = bounding_boxes[:,0:4] 100 | det_arr = [] 101 | img_size = np.asarray(img.shape)[0:2] 102 | if nrof_faces>1: 103 | if args.detect_multiple_faces: 104 | for i in range(nrof_faces): 105 | det_arr.append(np.squeeze(det[i])) 106 | else: 107 | bounding_box_size = (det[:,2]-det[:,0])*(det[:,3]-det[:,1]) 108 | img_center = img_size / 2 109 | offsets = np.vstack([ (det[:,0]+det[:,2])/2-img_center[1], (det[:,1]+det[:,3])/2-img_center[0] ]) 110 | offset_dist_squared = np.sum(np.power(offsets,2.0),0) 111 | index = np.argmax(bounding_box_size-offset_dist_squared*2.0) # some extra weight on the centering 112 | det_arr.append(det[index,:]) 113 | else: 114 | det_arr.append(np.squeeze(det)) 115 | 116 | for i, det in enumerate(det_arr): 
117 | det = np.squeeze(det) 118 | bb = np.zeros(4, dtype=np.int32) 119 | bb[0] = np.maximum(det[0]-args.margin/2, 0) 120 | bb[1] = np.maximum(det[1]-args.margin/2, 0) 121 | bb[2] = np.minimum(det[2]+args.margin/2, img_size[1]) 122 | bb[3] = np.minimum(det[3]+args.margin/2, img_size[0]) 123 | cropped = img[bb[1]:bb[3],bb[0]:bb[2],:] 124 | scaled = misc.imresize(cropped, (args.image_size, args.image_size), interp='bilinear') 125 | nrof_successfully_aligned += 1 126 | filename_base, file_extension = os.path.splitext(output_filename) 127 | if args.detect_multiple_faces: 128 | output_filename_n = "{}_{}{}".format(filename_base, i, file_extension) 129 | else: 130 | output_filename_n = "{}{}".format(filename_base, file_extension) 131 | misc.imsave(output_filename_n, scaled) 132 | text_file.write('%s %d %d %d %d\n' % (output_filename_n, bb[0], bb[1], bb[2], bb[3])) 133 | else: 134 | print('Unable to align "%s"' % image_path) 135 | text_file.write('%s\n' % (output_filename)) 136 | 137 | print('Total number of images: %d' % nrof_images_total) 138 | print('Number of successfully aligned images: %d' % nrof_successfully_aligned) 139 | 140 | 141 | def parse_arguments(argv): 142 | parser = argparse.ArgumentParser() 143 | 144 | parser.add_argument('input_dir', type=str, help='Directory with unaligned images.') 145 | parser.add_argument('output_dir', type=str, help='Directory with aligned face thumbnails.') 146 | parser.add_argument('--image_size', type=int, 147 | help='Image size (height, width) in pixels.', default=182) 148 | parser.add_argument('--margin', type=int, 149 | help='Margin for the crop around the bounding box (height, width) in pixels.', default=44) 150 | parser.add_argument('--random_order', 151 | help='Shuffles the order of images to enable alignment using multiple processes.', action='store_true') 152 | parser.add_argument('--gpu_memory_fraction', type=float, 153 | help='Upper bound on the amount of GPU memory that will be used by the process.', default=1.0) 154 | parser.add_argument('--detect_multiple_faces', type=bool, 155 | help='Detect and align multiple faces per image.', default=False) 156 | return parser.parse_args(argv) 157 | 158 | if __name__ == '__main__': 159 | main(parse_arguments(sys.argv[1:])) 160 | -------------------------------------------------------------------------------- /det1.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aashishrai3799/Automated-Attendance-System-using-CNN/ed8ce04100558c91288a1faa5aba7609233701dd/det1.npy -------------------------------------------------------------------------------- /det2.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aashishrai3799/Automated-Attendance-System-using-CNN/ed8ce04100558c91288a1faa5aba7609233701dd/det2.npy -------------------------------------------------------------------------------- /det3.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aashishrai3799/Automated-Attendance-System-using-CNN/ed8ce04100558c91288a1faa5aba7609233701dd/det3.npy -------------------------------------------------------------------------------- /detect_face.py: -------------------------------------------------------------------------------- 1 | """ Tensorflow implementation of the face detection / alignment algorithm found at 2 | https://github.com/kpzhang93/MTCNN_face_detection_alignment 3 | """ 4 | # MIT License 5 | # 6 | # 
Copyright (c) 2016 David Sandberg 7 | # 8 | # Permission is hereby granted, free of charge, to any person obtaining a copy 9 | # of this software and associated documentation files (the "Software"), to deal 10 | # in the Software without restriction, including without limitation the rights 11 | # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 12 | # copies of the Software, and to permit persons to whom the Software is 13 | # furnished to do so, subject to the following conditions: 14 | # 15 | # The above copyright notice and this permission notice shall be included in all 16 | # copies or substantial portions of the Software. 17 | # 18 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 19 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 20 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 21 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 22 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 23 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 24 | # SOFTWARE. 25 | 26 | from __future__ import absolute_import 27 | from __future__ import division 28 | from __future__ import print_function 29 | from six import string_types, iteritems 30 | 31 | import numpy as np 32 | import tensorflow as tf 33 | import cv2 34 | import os 35 | 36 | def layer(op): 37 | """Decorator for composable network layers.""" 38 | 39 | def layer_decorated(self, *args, **kwargs): 40 | # Automatically set a name if not provided. 41 | name = kwargs.setdefault('name', self.get_unique_name(op.__name__)) 42 | # Figure out the layer inputs. 43 | if len(self.terminals) == 0: 44 | raise RuntimeError('No input variables found for layer %s.' % name) 45 | elif len(self.terminals) == 1: 46 | layer_input = self.terminals[0] 47 | else: 48 | layer_input = list(self.terminals) 49 | # Perform the operation and get the output. 50 | layer_output = op(self, layer_input, *args, **kwargs) 51 | # Add to layer LUT. 52 | self.layers[name] = layer_output 53 | # This output is now the input for the next layer. 54 | self.feed(layer_output) 55 | # Return self for chained calls. 56 | return self 57 | 58 | return layer_decorated 59 | 60 | class Network(object): 61 | 62 | def __init__(self, inputs, trainable=True): 63 | # The input nodes for this network 64 | self.inputs = inputs 65 | # The current list of terminal nodes 66 | self.terminals = [] 67 | # Mapping from layer names to layers 68 | self.layers = dict(inputs) 69 | # If true, the resulting variables are set as trainable 70 | self.trainable = trainable 71 | 72 | self.setup() 73 | 74 | def setup(self): 75 | """Construct the network. """ 76 | raise NotImplementedError('Must be implemented by the subclass.') 77 | 78 | def load(self, data_path, session, ignore_missing=False): 79 | """Load network weights. 80 | data_path: The path to the numpy-serialized network weights 81 | session: The current TensorFlow session 82 | ignore_missing: If true, serialized weights for missing layers are ignored. 
83 | """ 84 | data_dict = np.load(data_path, encoding='latin1').item() #pylint: disable=no-member 85 | 86 | for op_name in data_dict: 87 | with tf.variable_scope(op_name, reuse=True): 88 | for param_name, data in iteritems(data_dict[op_name]): 89 | try: 90 | var = tf.get_variable(param_name) 91 | session.run(var.assign(data)) 92 | except ValueError: 93 | if not ignore_missing: 94 | raise 95 | 96 | def feed(self, *args): 97 | """Set the input(s) for the next operation by replacing the terminal nodes. 98 | The arguments can be either layer names or the actual layers. 99 | """ 100 | assert len(args) != 0 101 | self.terminals = [] 102 | for fed_layer in args: 103 | if isinstance(fed_layer, string_types): 104 | try: 105 | fed_layer = self.layers[fed_layer] 106 | except KeyError: 107 | raise KeyError('Unknown layer name fed: %s' % fed_layer) 108 | self.terminals.append(fed_layer) 109 | return self 110 | 111 | def get_output(self): 112 | """Returns the current network output.""" 113 | return self.terminals[-1] 114 | 115 | def get_unique_name(self, prefix): 116 | """Returns an index-suffixed unique name for the given prefix. 117 | This is used for auto-generating layer names based on the type-prefix. 118 | """ 119 | ident = sum(t.startswith(prefix) for t, _ in self.layers.items()) + 1 120 | return '%s_%d' % (prefix, ident) 121 | 122 | def make_var(self, name, shape): 123 | """Creates a new TensorFlow variable.""" 124 | return tf.get_variable(name, shape, trainable=self.trainable) 125 | 126 | def validate_padding(self, padding): 127 | """Verifies that the padding is one of the supported ones.""" 128 | assert padding in ('SAME', 'VALID') 129 | 130 | @layer 131 | def conv(self, 132 | inp, 133 | k_h, 134 | k_w, 135 | c_o, 136 | s_h, 137 | s_w, 138 | name, 139 | relu=True, 140 | padding='SAME', 141 | group=1, 142 | biased=True): 143 | # Verify that the padding is acceptable 144 | self.validate_padding(padding) 145 | # Get the number of channels in the input 146 | c_i = int(inp.get_shape()[-1]) 147 | # Verify that the grouping parameter is valid 148 | assert c_i % group == 0 149 | assert c_o % group == 0 150 | # Convolution for a given input and kernel 151 | convolve = lambda i, k: tf.nn.conv2d(i, k, [1, s_h, s_w, 1], padding=padding) 152 | with tf.variable_scope(name) as scope: 153 | kernel = self.make_var('weights', shape=[k_h, k_w, c_i // group, c_o]) 154 | # This is the common-case. Convolve the input without any further complications. 155 | output = convolve(inp, kernel) 156 | # Add the biases 157 | if biased: 158 | biases = self.make_var('biases', [c_o]) 159 | output = tf.nn.bias_add(output, biases) 160 | if relu: 161 | # ReLU non-linearity 162 | output = tf.nn.relu(output, name=scope.name) 163 | return output 164 | 165 | @layer 166 | def prelu(self, inp, name): 167 | with tf.variable_scope(name): 168 | i = int(inp.get_shape()[-1]) 169 | alpha = self.make_var('alpha', shape=(i,)) 170 | output = tf.nn.relu(inp) + tf.multiply(alpha, -tf.nn.relu(-inp)) 171 | return output 172 | 173 | @layer 174 | def max_pool(self, inp, k_h, k_w, s_h, s_w, name, padding='SAME'): 175 | self.validate_padding(padding) 176 | return tf.nn.max_pool(inp, 177 | ksize=[1, k_h, k_w, 1], 178 | strides=[1, s_h, s_w, 1], 179 | padding=padding, 180 | name=name) 181 | 182 | @layer 183 | def fc(self, inp, num_out, name, relu=True): 184 | with tf.variable_scope(name): 185 | input_shape = inp.get_shape() 186 | if input_shape.ndims == 4: 187 | # The input is spatial. Vectorize it first. 
188 | dim = 1 189 | for d in input_shape[1:].as_list(): 190 | dim *= int(d) 191 | feed_in = tf.reshape(inp, [-1, dim]) 192 | else: 193 | feed_in, dim = (inp, input_shape[-1].value) 194 | weights = self.make_var('weights', shape=[dim, num_out]) 195 | biases = self.make_var('biases', [num_out]) 196 | op = tf.nn.relu_layer if relu else tf.nn.xw_plus_b 197 | fc = op(feed_in, weights, biases, name=name) 198 | return fc 199 | 200 | 201 | """ 202 | Multi dimensional softmax, 203 | refer to https://github.com/tensorflow/tensorflow/issues/210 204 | compute softmax along the dimension of target 205 | the native softmax only supports batch_size x dimension 206 | """ 207 | @layer 208 | def softmax(self, target, axis, name=None): 209 | max_axis = tf.reduce_max(target, axis, keepdims=True) 210 | target_exp = tf.exp(target-max_axis) 211 | normalize = tf.reduce_sum(target_exp, axis, keepdims=True) 212 | softmax = tf.div(target_exp, normalize, name) 213 | return softmax 214 | 215 | class PNet(Network): 216 | def setup(self): 217 | (self.feed('data') #pylint: disable=no-value-for-parameter, no-member 218 | .conv(3, 3, 10, 1, 1, padding='VALID', relu=False, name='conv1') 219 | .prelu(name='PReLU1') 220 | .max_pool(2, 2, 2, 2, name='pool1') 221 | .conv(3, 3, 16, 1, 1, padding='VALID', relu=False, name='conv2') 222 | .prelu(name='PReLU2') 223 | .conv(3, 3, 32, 1, 1, padding='VALID', relu=False, name='conv3') 224 | .prelu(name='PReLU3') 225 | .conv(1, 1, 2, 1, 1, relu=False, name='conv4-1') 226 | .softmax(3,name='prob1')) 227 | 228 | (self.feed('PReLU3') #pylint: disable=no-value-for-parameter 229 | .conv(1, 1, 4, 1, 1, relu=False, name='conv4-2')) 230 | 231 | class RNet(Network): 232 | def setup(self): 233 | (self.feed('data') #pylint: disable=no-value-for-parameter, no-member 234 | .conv(3, 3, 28, 1, 1, padding='VALID', relu=False, name='conv1') 235 | .prelu(name='prelu1') 236 | .max_pool(3, 3, 2, 2, name='pool1') 237 | .conv(3, 3, 48, 1, 1, padding='VALID', relu=False, name='conv2') 238 | .prelu(name='prelu2') 239 | .max_pool(3, 3, 2, 2, padding='VALID', name='pool2') 240 | .conv(2, 2, 64, 1, 1, padding='VALID', relu=False, name='conv3') 241 | .prelu(name='prelu3') 242 | .fc(128, relu=False, name='conv4') 243 | .prelu(name='prelu4') 244 | .fc(2, relu=False, name='conv5-1') 245 | .softmax(1,name='prob1')) 246 | 247 | (self.feed('prelu4') #pylint: disable=no-value-for-parameter 248 | .fc(4, relu=False, name='conv5-2')) 249 | 250 | class ONet(Network): 251 | def setup(self): 252 | (self.feed('data') #pylint: disable=no-value-for-parameter, no-member 253 | .conv(3, 3, 32, 1, 1, padding='VALID', relu=False, name='conv1') 254 | .prelu(name='prelu1') 255 | .max_pool(3, 3, 2, 2, name='pool1') 256 | .conv(3, 3, 64, 1, 1, padding='VALID', relu=False, name='conv2') 257 | .prelu(name='prelu2') 258 | .max_pool(3, 3, 2, 2, padding='VALID', name='pool2') 259 | .conv(3, 3, 64, 1, 1, padding='VALID', relu=False, name='conv3') 260 | .prelu(name='prelu3') 261 | .max_pool(2, 2, 2, 2, name='pool3') 262 | .conv(2, 2, 128, 1, 1, padding='VALID', relu=False, name='conv4') 263 | .prelu(name='prelu4') 264 | .fc(256, relu=False, name='conv5') 265 | .prelu(name='prelu5') 266 | .fc(2, relu=False, name='conv6-1') 267 | .softmax(1, name='prob1')) 268 | 269 | (self.feed('prelu5') #pylint: disable=no-value-for-parameter 270 | .fc(4, relu=False, name='conv6-2')) 271 | 272 | (self.feed('prelu5') #pylint: disable=no-value-for-parameter 273 | .fc(10, relu=False, name='conv6-3')) 274 | 275 | def create_mtcnn(sess, model_path): 276 | if not 
model_path: 277 | model_path,_ = os.path.split(os.path.realpath(__file__)) 278 | 279 | with tf.variable_scope('pnet'): 280 | data = tf.placeholder(tf.float32, (None,None,None,3), 'input') 281 | pnet = PNet({'data':data}) 282 | pnet.load(os.path.join(model_path, 'det1.npy'), sess) 283 | with tf.variable_scope('rnet'): 284 | data = tf.placeholder(tf.float32, (None,24,24,3), 'input') 285 | rnet = RNet({'data':data}) 286 | rnet.load(os.path.join(model_path, 'det2.npy'), sess) 287 | with tf.variable_scope('onet'): 288 | data = tf.placeholder(tf.float32, (None,48,48,3), 'input') 289 | onet = ONet({'data':data}) 290 | onet.load(os.path.join(model_path, 'det3.npy'), sess) 291 | 292 | pnet_fun = lambda img : sess.run(('pnet/conv4-2/BiasAdd:0', 'pnet/prob1:0'), feed_dict={'pnet/input:0':img}) 293 | rnet_fun = lambda img : sess.run(('rnet/conv5-2/conv5-2:0', 'rnet/prob1:0'), feed_dict={'rnet/input:0':img}) 294 | onet_fun = lambda img : sess.run(('onet/conv6-2/conv6-2:0', 'onet/conv6-3/conv6-3:0', 'onet/prob1:0'), feed_dict={'onet/input:0':img}) 295 | return pnet_fun, rnet_fun, onet_fun 296 | 297 | def detect_face(img, minsize, pnet, rnet, onet, threshold, factor): 298 | """Detects faces in an image, and returns bounding boxes and points for them. 299 | img: input image 300 | minsize: minimum faces' size 301 | pnet, rnet, onet: caffemodel 302 | threshold: threshold=[th1, th2, th3], th1-3 are three steps's threshold 303 | factor: the factor used to create a scaling pyramid of face sizes to detect in the image. 304 | """ 305 | factor_count=0 306 | total_boxes=np.empty((0,9)) 307 | points=np.empty(0) 308 | h=img.shape[0] 309 | w=img.shape[1] 310 | minl=np.amin([h, w]) 311 | m=12.0/minsize 312 | minl=minl*m 313 | # create scale pyramid 314 | scales=[] 315 | while minl>=12: 316 | scales += [m*np.power(factor, factor_count)] 317 | minl = minl*factor 318 | factor_count += 1 319 | 320 | # first stage 321 | for scale in scales: 322 | hs=int(np.ceil(h*scale)) 323 | ws=int(np.ceil(w*scale)) 324 | im_data = imresample(img, (hs, ws)) 325 | im_data = (im_data-127.5)*0.0078125 326 | img_x = np.expand_dims(im_data, 0) 327 | img_y = np.transpose(img_x, (0,2,1,3)) 328 | out = pnet(img_y) 329 | out0 = np.transpose(out[0], (0,2,1,3)) 330 | out1 = np.transpose(out[1], (0,2,1,3)) 331 | 332 | boxes, _ = generateBoundingBox(out1[0,:,:,1].copy(), out0[0,:,:,:].copy(), scale, threshold[0]) 333 | 334 | # inter-scale nms 335 | pick = nms(boxes.copy(), 0.5, 'Union') 336 | if boxes.size>0 and pick.size>0: 337 | boxes = boxes[pick,:] 338 | total_boxes = np.append(total_boxes, boxes, axis=0) 339 | 340 | numbox = total_boxes.shape[0] 341 | if numbox>0: 342 | pick = nms(total_boxes.copy(), 0.7, 'Union') 343 | total_boxes = total_boxes[pick,:] 344 | regw = total_boxes[:,2]-total_boxes[:,0] 345 | regh = total_boxes[:,3]-total_boxes[:,1] 346 | qq1 = total_boxes[:,0]+total_boxes[:,5]*regw 347 | qq2 = total_boxes[:,1]+total_boxes[:,6]*regh 348 | qq3 = total_boxes[:,2]+total_boxes[:,7]*regw 349 | qq4 = total_boxes[:,3]+total_boxes[:,8]*regh 350 | total_boxes = np.transpose(np.vstack([qq1, qq2, qq3, qq4, total_boxes[:,4]])) 351 | total_boxes = rerec(total_boxes.copy()) 352 | total_boxes[:,0:4] = np.fix(total_boxes[:,0:4]).astype(np.int32) 353 | dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph = pad(total_boxes.copy(), w, h) 354 | 355 | numbox = total_boxes.shape[0] 356 | if numbox>0: 357 | # second stage 358 | tempimg = np.zeros((24,24,3,numbox)) 359 | for k in range(0,numbox): 360 | tmp = np.zeros((int(tmph[k]),int(tmpw[k]),3)) 361 | 
tmp[dy[k]-1:edy[k],dx[k]-1:edx[k],:] = img[y[k]-1:ey[k],x[k]-1:ex[k],:] 362 | if tmp.shape[0]>0 and tmp.shape[1]>0 or tmp.shape[0]==0 and tmp.shape[1]==0: 363 | tempimg[:,:,:,k] = imresample(tmp, (24, 24)) 364 | else: 365 | return np.empty() 366 | tempimg = (tempimg-127.5)*0.0078125 367 | tempimg1 = np.transpose(tempimg, (3,1,0,2)) 368 | out = rnet(tempimg1) 369 | out0 = np.transpose(out[0]) 370 | out1 = np.transpose(out[1]) 371 | score = out1[1,:] 372 | ipass = np.where(score>threshold[1]) 373 | total_boxes = np.hstack([total_boxes[ipass[0],0:4].copy(), np.expand_dims(score[ipass].copy(),1)]) 374 | mv = out0[:,ipass[0]] 375 | if total_boxes.shape[0]>0: 376 | pick = nms(total_boxes, 0.7, 'Union') 377 | total_boxes = total_boxes[pick,:] 378 | total_boxes = bbreg(total_boxes.copy(), np.transpose(mv[:,pick])) 379 | total_boxes = rerec(total_boxes.copy()) 380 | 381 | numbox = total_boxes.shape[0] 382 | if numbox>0: 383 | # third stage 384 | total_boxes = np.fix(total_boxes).astype(np.int32) 385 | dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph = pad(total_boxes.copy(), w, h) 386 | tempimg = np.zeros((48,48,3,numbox)) 387 | for k in range(0,numbox): 388 | tmp = np.zeros((int(tmph[k]),int(tmpw[k]),3)) 389 | tmp[dy[k]-1:edy[k],dx[k]-1:edx[k],:] = img[y[k]-1:ey[k],x[k]-1:ex[k],:] 390 | if tmp.shape[0]>0 and tmp.shape[1]>0 or tmp.shape[0]==0 and tmp.shape[1]==0: 391 | tempimg[:,:,:,k] = imresample(tmp, (48, 48)) 392 | else: 393 | return np.empty() 394 | tempimg = (tempimg-127.5)*0.0078125 395 | tempimg1 = np.transpose(tempimg, (3,1,0,2)) 396 | out = onet(tempimg1) 397 | out0 = np.transpose(out[0]) 398 | out1 = np.transpose(out[1]) 399 | out2 = np.transpose(out[2]) 400 | score = out2[1,:] 401 | points = out1 402 | ipass = np.where(score>threshold[2]) 403 | points = points[:,ipass[0]] 404 | total_boxes = np.hstack([total_boxes[ipass[0],0:4].copy(), np.expand_dims(score[ipass].copy(),1)]) 405 | mv = out0[:,ipass[0]] 406 | 407 | w = total_boxes[:,2]-total_boxes[:,0]+1 408 | h = total_boxes[:,3]-total_boxes[:,1]+1 409 | points[0:5,:] = np.tile(w,(5, 1))*points[0:5,:] + np.tile(total_boxes[:,0],(5, 1))-1 410 | points[5:10,:] = np.tile(h,(5, 1))*points[5:10,:] + np.tile(total_boxes[:,1],(5, 1))-1 411 | if total_boxes.shape[0]>0: 412 | total_boxes = bbreg(total_boxes.copy(), np.transpose(mv)) 413 | pick = nms(total_boxes.copy(), 0.7, 'Min') 414 | total_boxes = total_boxes[pick,:] 415 | points = points[:,pick] 416 | 417 | return total_boxes, points 418 | 419 | 420 | def bulk_detect_face(images, detection_window_size_ratio, pnet, rnet, onet, threshold, factor): 421 | """Detects faces in a list of images 422 | images: list containing input images 423 | detection_window_size_ratio: ratio of minimum face size to smallest image dimension 424 | pnet, rnet, onet: caffemodel 425 | threshold: threshold=[th1 th2 th3], th1-3 are three steps's threshold [0-1] 426 | factor: the factor used to create a scaling pyramid of face sizes to detect in the image. 
427 | """ 428 | all_scales = [None] * len(images) 429 | images_with_boxes = [None] * len(images) 430 | 431 | for i in range(len(images)): 432 | images_with_boxes[i] = {'total_boxes': np.empty((0, 9))} 433 | 434 | # create scale pyramid 435 | for index, img in enumerate(images): 436 | all_scales[index] = [] 437 | h = img.shape[0] 438 | w = img.shape[1] 439 | minsize = int(detection_window_size_ratio * np.minimum(w, h)) 440 | factor_count = 0 441 | minl = np.amin([h, w]) 442 | if minsize <= 12: 443 | minsize = 12 444 | 445 | m = 12.0 / minsize 446 | minl = minl * m 447 | while minl >= 12: 448 | all_scales[index].append(m * np.power(factor, factor_count)) 449 | minl = minl * factor 450 | factor_count += 1 451 | 452 | # # # # # # # # # # # # # 453 | # first stage - fast proposal network (pnet) to obtain face candidates 454 | # # # # # # # # # # # # # 455 | 456 | images_obj_per_resolution = {} 457 | 458 | # TODO: use some type of rounding to number module 8 to increase probability that pyramid images will have the same resolution across input images 459 | 460 | for index, scales in enumerate(all_scales): 461 | h = images[index].shape[0] 462 | w = images[index].shape[1] 463 | 464 | for scale in scales: 465 | hs = int(np.ceil(h * scale)) 466 | ws = int(np.ceil(w * scale)) 467 | 468 | if (ws, hs) not in images_obj_per_resolution: 469 | images_obj_per_resolution[(ws, hs)] = [] 470 | 471 | im_data = imresample(images[index], (hs, ws)) 472 | im_data = (im_data - 127.5) * 0.0078125 473 | img_y = np.transpose(im_data, (1, 0, 2)) # caffe uses different dimensions ordering 474 | images_obj_per_resolution[(ws, hs)].append({'scale': scale, 'image': img_y, 'index': index}) 475 | 476 | for resolution in images_obj_per_resolution: 477 | images_per_resolution = [i['image'] for i in images_obj_per_resolution[resolution]] 478 | outs = pnet(images_per_resolution) 479 | 480 | for index in range(len(outs[0])): 481 | scale = images_obj_per_resolution[resolution][index]['scale'] 482 | image_index = images_obj_per_resolution[resolution][index]['index'] 483 | out0 = np.transpose(outs[0][index], (1, 0, 2)) 484 | out1 = np.transpose(outs[1][index], (1, 0, 2)) 485 | 486 | boxes, _ = generateBoundingBox(out1[:, :, 1].copy(), out0[:, :, :].copy(), scale, threshold[0]) 487 | 488 | # inter-scale nms 489 | pick = nms(boxes.copy(), 0.5, 'Union') 490 | if boxes.size > 0 and pick.size > 0: 491 | boxes = boxes[pick, :] 492 | images_with_boxes[image_index]['total_boxes'] = np.append(images_with_boxes[image_index]['total_boxes'], 493 | boxes, 494 | axis=0) 495 | 496 | for index, image_obj in enumerate(images_with_boxes): 497 | numbox = image_obj['total_boxes'].shape[0] 498 | if numbox > 0: 499 | h = images[index].shape[0] 500 | w = images[index].shape[1] 501 | pick = nms(image_obj['total_boxes'].copy(), 0.7, 'Union') 502 | image_obj['total_boxes'] = image_obj['total_boxes'][pick, :] 503 | regw = image_obj['total_boxes'][:, 2] - image_obj['total_boxes'][:, 0] 504 | regh = image_obj['total_boxes'][:, 3] - image_obj['total_boxes'][:, 1] 505 | qq1 = image_obj['total_boxes'][:, 0] + image_obj['total_boxes'][:, 5] * regw 506 | qq2 = image_obj['total_boxes'][:, 1] + image_obj['total_boxes'][:, 6] * regh 507 | qq3 = image_obj['total_boxes'][:, 2] + image_obj['total_boxes'][:, 7] * regw 508 | qq4 = image_obj['total_boxes'][:, 3] + image_obj['total_boxes'][:, 8] * regh 509 | image_obj['total_boxes'] = np.transpose(np.vstack([qq1, qq2, qq3, qq4, image_obj['total_boxes'][:, 4]])) 510 | image_obj['total_boxes'] = 
rerec(image_obj['total_boxes'].copy()) 511 | image_obj['total_boxes'][:, 0:4] = np.fix(image_obj['total_boxes'][:, 0:4]).astype(np.int32) 512 | dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph = pad(image_obj['total_boxes'].copy(), w, h) 513 | 514 | numbox = image_obj['total_boxes'].shape[0] 515 | tempimg = np.zeros((24, 24, 3, numbox)) 516 | 517 | if numbox > 0: 518 | for k in range(0, numbox): 519 | tmp = np.zeros((int(tmph[k]), int(tmpw[k]), 3)) 520 | tmp[dy[k] - 1:edy[k], dx[k] - 1:edx[k], :] = images[index][y[k] - 1:ey[k], x[k] - 1:ex[k], :] 521 | if tmp.shape[0] > 0 and tmp.shape[1] > 0 or tmp.shape[0] == 0 and tmp.shape[1] == 0: 522 | tempimg[:, :, :, k] = imresample(tmp, (24, 24)) 523 | else: 524 | return np.empty() 525 | 526 | tempimg = (tempimg - 127.5) * 0.0078125 527 | image_obj['rnet_input'] = np.transpose(tempimg, (3, 1, 0, 2)) 528 | 529 | # # # # # # # # # # # # # 530 | # second stage - refinement of face candidates with rnet 531 | # # # # # # # # # # # # # 532 | 533 | bulk_rnet_input = np.empty((0, 24, 24, 3)) 534 | for index, image_obj in enumerate(images_with_boxes): 535 | if 'rnet_input' in image_obj: 536 | bulk_rnet_input = np.append(bulk_rnet_input, image_obj['rnet_input'], axis=0) 537 | 538 | out = rnet(bulk_rnet_input) 539 | out0 = np.transpose(out[0]) 540 | out1 = np.transpose(out[1]) 541 | score = out1[1, :] 542 | 543 | i = 0 544 | for index, image_obj in enumerate(images_with_boxes): 545 | if 'rnet_input' not in image_obj: 546 | continue 547 | 548 | rnet_input_count = image_obj['rnet_input'].shape[0] 549 | score_per_image = score[i:i + rnet_input_count] 550 | out0_per_image = out0[:, i:i + rnet_input_count] 551 | 552 | ipass = np.where(score_per_image > threshold[1]) 553 | image_obj['total_boxes'] = np.hstack([image_obj['total_boxes'][ipass[0], 0:4].copy(), 554 | np.expand_dims(score_per_image[ipass].copy(), 1)]) 555 | 556 | mv = out0_per_image[:, ipass[0]] 557 | 558 | if image_obj['total_boxes'].shape[0] > 0: 559 | h = images[index].shape[0] 560 | w = images[index].shape[1] 561 | pick = nms(image_obj['total_boxes'], 0.7, 'Union') 562 | image_obj['total_boxes'] = image_obj['total_boxes'][pick, :] 563 | image_obj['total_boxes'] = bbreg(image_obj['total_boxes'].copy(), np.transpose(mv[:, pick])) 564 | image_obj['total_boxes'] = rerec(image_obj['total_boxes'].copy()) 565 | 566 | numbox = image_obj['total_boxes'].shape[0] 567 | 568 | if numbox > 0: 569 | tempimg = np.zeros((48, 48, 3, numbox)) 570 | image_obj['total_boxes'] = np.fix(image_obj['total_boxes']).astype(np.int32) 571 | dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph = pad(image_obj['total_boxes'].copy(), w, h) 572 | 573 | for k in range(0, numbox): 574 | tmp = np.zeros((int(tmph[k]), int(tmpw[k]), 3)) 575 | tmp[dy[k] - 1:edy[k], dx[k] - 1:edx[k], :] = images[index][y[k] - 1:ey[k], x[k] - 1:ex[k], :] 576 | if tmp.shape[0] > 0 and tmp.shape[1] > 0 or tmp.shape[0] == 0 and tmp.shape[1] == 0: 577 | tempimg[:, :, :, k] = imresample(tmp, (48, 48)) 578 | else: 579 | return np.empty() 580 | tempimg = (tempimg - 127.5) * 0.0078125 581 | image_obj['onet_input'] = np.transpose(tempimg, (3, 1, 0, 2)) 582 | 583 | i += rnet_input_count 584 | 585 | # # # # # # # # # # # # # 586 | # third stage - further refinement and facial landmarks positions with onet 587 | # # # # # # # # # # # # # 588 | 589 | bulk_onet_input = np.empty((0, 48, 48, 3)) 590 | for index, image_obj in enumerate(images_with_boxes): 591 | if 'onet_input' in image_obj: 592 | bulk_onet_input = np.append(bulk_onet_input, image_obj['onet_input'], axis=0) 593 | 594 | out 
= onet(bulk_onet_input) 595 | 596 | out0 = np.transpose(out[0]) 597 | out1 = np.transpose(out[1]) 598 | out2 = np.transpose(out[2]) 599 | score = out2[1, :] 600 | points = out1 601 | 602 | i = 0 603 | ret = [] 604 | for index, image_obj in enumerate(images_with_boxes): 605 | if 'onet_input' not in image_obj: 606 | ret.append(None) 607 | continue 608 | 609 | onet_input_count = image_obj['onet_input'].shape[0] 610 | 611 | out0_per_image = out0[:, i:i + onet_input_count] 612 | score_per_image = score[i:i + onet_input_count] 613 | points_per_image = points[:, i:i + onet_input_count] 614 | 615 | ipass = np.where(score_per_image > threshold[2]) 616 | points_per_image = points_per_image[:, ipass[0]] 617 | 618 | image_obj['total_boxes'] = np.hstack([image_obj['total_boxes'][ipass[0], 0:4].copy(), 619 | np.expand_dims(score_per_image[ipass].copy(), 1)]) 620 | mv = out0_per_image[:, ipass[0]] 621 | 622 | w = image_obj['total_boxes'][:, 2] - image_obj['total_boxes'][:, 0] + 1 623 | h = image_obj['total_boxes'][:, 3] - image_obj['total_boxes'][:, 1] + 1 624 | points_per_image[0:5, :] = np.tile(w, (5, 1)) * points_per_image[0:5, :] + np.tile( 625 | image_obj['total_boxes'][:, 0], (5, 1)) - 1 626 | points_per_image[5:10, :] = np.tile(h, (5, 1)) * points_per_image[5:10, :] + np.tile( 627 | image_obj['total_boxes'][:, 1], (5, 1)) - 1 628 | 629 | if image_obj['total_boxes'].shape[0] > 0: 630 | image_obj['total_boxes'] = bbreg(image_obj['total_boxes'].copy(), np.transpose(mv)) 631 | pick = nms(image_obj['total_boxes'].copy(), 0.7, 'Min') 632 | image_obj['total_boxes'] = image_obj['total_boxes'][pick, :] 633 | points_per_image = points_per_image[:, pick] 634 | 635 | ret.append((image_obj['total_boxes'], points_per_image)) 636 | else: 637 | ret.append(None) 638 | 639 | i += onet_input_count 640 | 641 | return ret 642 | 643 | 644 | # function [boundingbox] = bbreg(boundingbox,reg) 645 | def bbreg(boundingbox,reg): 646 | """Calibrate bounding boxes""" 647 | if reg.shape[1]==1: 648 | reg = np.reshape(reg, (reg.shape[2], reg.shape[3])) 649 | 650 | w = boundingbox[:,2]-boundingbox[:,0]+1 651 | h = boundingbox[:,3]-boundingbox[:,1]+1 652 | b1 = boundingbox[:,0]+reg[:,0]*w 653 | b2 = boundingbox[:,1]+reg[:,1]*h 654 | b3 = boundingbox[:,2]+reg[:,2]*w 655 | b4 = boundingbox[:,3]+reg[:,3]*h 656 | boundingbox[:,0:4] = np.transpose(np.vstack([b1, b2, b3, b4 ])) 657 | return boundingbox 658 | 659 | def generateBoundingBox(imap, reg, scale, t): 660 | """Use heatmap to generate bounding boxes""" 661 | stride=2 662 | cellsize=12 663 | 664 | imap = np.transpose(imap) 665 | dx1 = np.transpose(reg[:,:,0]) 666 | dy1 = np.transpose(reg[:,:,1]) 667 | dx2 = np.transpose(reg[:,:,2]) 668 | dy2 = np.transpose(reg[:,:,3]) 669 | y, x = np.where(imap >= t) 670 | if y.shape[0]==1: 671 | dx1 = np.flipud(dx1) 672 | dy1 = np.flipud(dy1) 673 | dx2 = np.flipud(dx2) 674 | dy2 = np.flipud(dy2) 675 | score = imap[(y,x)] 676 | reg = np.transpose(np.vstack([ dx1[(y,x)], dy1[(y,x)], dx2[(y,x)], dy2[(y,x)] ])) 677 | if reg.size==0: 678 | reg = np.empty((0,3)) 679 | bb = np.transpose(np.vstack([y,x])) 680 | q1 = np.fix((stride*bb+1)/scale) 681 | q2 = np.fix((stride*bb+cellsize-1+1)/scale) 682 | boundingbox = np.hstack([q1, q2, np.expand_dims(score,1), reg]) 683 | return boundingbox, reg 684 | 685 | # function pick = nms(boxes,threshold,type) 686 | def nms(boxes, threshold, method): 687 | if boxes.size==0: 688 | return np.empty((0,3)) 689 | x1 = boxes[:,0] 690 | y1 = boxes[:,1] 691 | x2 = boxes[:,2] 692 | y2 = boxes[:,3] 693 | s = boxes[:,4] 694 | area 
= (x2-x1+1) * (y2-y1+1) 695 | I = np.argsort(s) 696 | pick = np.zeros_like(s, dtype=np.int16) 697 | counter = 0 698 | while I.size>0: 699 | i = I[-1] 700 | pick[counter] = i 701 | counter += 1 702 | idx = I[0:-1] 703 | xx1 = np.maximum(x1[i], x1[idx]) 704 | yy1 = np.maximum(y1[i], y1[idx]) 705 | xx2 = np.minimum(x2[i], x2[idx]) 706 | yy2 = np.minimum(y2[i], y2[idx]) 707 | w = np.maximum(0.0, xx2-xx1+1) 708 | h = np.maximum(0.0, yy2-yy1+1) 709 | inter = w * h 710 | if method is 'Min': 711 | o = inter / np.minimum(area[i], area[idx]) 712 | else: 713 | o = inter / (area[i] + area[idx] - inter) 714 | I = I[np.where(o<=threshold)] 715 | pick = pick[0:counter] 716 | return pick 717 | 718 | # function [dy edy dx edx y ey x ex tmpw tmph] = pad(total_boxes,w,h) 719 | def pad(total_boxes, w, h): 720 | """Compute the padding coordinates (pad the bounding boxes to square)""" 721 | tmpw = (total_boxes[:,2]-total_boxes[:,0]+1).astype(np.int32) 722 | tmph = (total_boxes[:,3]-total_boxes[:,1]+1).astype(np.int32) 723 | numbox = total_boxes.shape[0] 724 | 725 | dx = np.ones((numbox), dtype=np.int32) 726 | dy = np.ones((numbox), dtype=np.int32) 727 | edx = tmpw.copy().astype(np.int32) 728 | edy = tmph.copy().astype(np.int32) 729 | 730 | x = total_boxes[:,0].copy().astype(np.int32) 731 | y = total_boxes[:,1].copy().astype(np.int32) 732 | ex = total_boxes[:,2].copy().astype(np.int32) 733 | ey = total_boxes[:,3].copy().astype(np.int32) 734 | 735 | tmp = np.where(ex>w) 736 | edx.flat[tmp] = np.expand_dims(-ex[tmp]+w+tmpw[tmp],1) 737 | ex[tmp] = w 738 | 739 | tmp = np.where(ey>h) 740 | edy.flat[tmp] = np.expand_dims(-ey[tmp]+h+tmph[tmp],1) 741 | ey[tmp] = h 742 | 743 | tmp = np.where(x<1) 744 | dx.flat[tmp] = np.expand_dims(2-x[tmp],1) 745 | x[tmp] = 1 746 | 747 | tmp = np.where(y<1) 748 | dy.flat[tmp] = np.expand_dims(2-y[tmp],1) 749 | y[tmp] = 1 750 | 751 | return dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph 752 | 753 | # function [bboxA] = rerec(bboxA) 754 | def rerec(bboxA): 755 | """Convert bboxA to square.""" 756 | h = bboxA[:,3]-bboxA[:,1] 757 | w = bboxA[:,2]-bboxA[:,0] 758 | l = np.maximum(w, h) 759 | bboxA[:,0] = bboxA[:,0]+w*0.5-l*0.5 760 | bboxA[:,1] = bboxA[:,1]+h*0.5-l*0.5 761 | bboxA[:,2:4] = bboxA[:,0:2] + np.transpose(np.tile(l,(2,1))) 762 | return bboxA 763 | 764 | def imresample(img, sz): 765 | im_data = cv2.resize(img, (sz[1], sz[0]), interpolation=cv2.INTER_AREA) 766 | return im_data 767 | -------------------------------------------------------------------------------- /face_aligner.py: -------------------------------------------------------------------------------- 1 | # import the necessary packages 2 | import numpy as np 3 | import cv2 4 | 5 | class FaceAligner: 6 | def __init__(self, desiredLeftEye=(0.4, 0.4), 7 | desiredFaceWidth=256, desiredFaceHeight=None): 8 | self.desiredLeftEye = desiredLeftEye 9 | self.desiredFaceWidth = desiredFaceWidth 10 | self.desiredFaceHeight = desiredFaceHeight 11 | 12 | # if the desired face height is None, set it to be the 13 | # desired face width (normal behavior) 14 | if self.desiredFaceHeight is None: 15 | self.desiredFaceHeight = self.desiredFaceWidth 16 | 17 | def align(self, image, points): 18 | 19 | # compute the center of mass for each eye 20 | leftEyeCenter = (int(points[0]), int(points[5])) 21 | rightEyeCenter = (int(points[1]), int(points[6])) 22 | 23 | # compute the angle between the eye centroids 24 | dY = rightEyeCenter[1] - leftEyeCenter[1] 25 | dX = rightEyeCenter[0] - leftEyeCenter[0] 26 | angle = np.degrees(np.arctan2(dY, dX)) 27 | 28 | # 
compute the desired right eye x-coordinate based on the 29 | # desired x-coordinate of the left eye 30 | desiredRightEyeX = 1.0 - self.desiredLeftEye[0] 31 | 32 | # determine the scale of the new resulting image by taking 33 | # the ratio of the distance between eyes in the *current* 34 | # image to the ratio of distance between eyes in the 35 | # *desired* image 36 | dist = np.sqrt((dX ** 2) + (dY ** 2)) 37 | desiredDist = (desiredRightEyeX - self.desiredLeftEye[0]) 38 | desiredDist *= self.desiredFaceWidth 39 | scale = desiredDist / dist 40 | 41 | # compute center (x, y)-coordinates (i.e., the median point) 42 | # between the two eyes in the input image 43 | eyesCenter = ((leftEyeCenter[0] + rightEyeCenter[0]) // 2, 44 | (leftEyeCenter[1] + rightEyeCenter[1]) // 2) 45 | 46 | # grab the rotation matrix for rotating and scaling the face 47 | M = cv2.getRotationMatrix2D(eyesCenter, angle, scale) 48 | 49 | # update the translation component of the matrix 50 | tX = self.desiredFaceWidth * 0.5 51 | tY = self.desiredFaceHeight * self.desiredLeftEye[1] 52 | M[0, 2] += (tX - eyesCenter[0]) 53 | M[1, 2] += (tY - eyesCenter[1]) 54 | 55 | # apply the affine transformation 56 | (w, h) = (self.desiredFaceWidth, self.desiredFaceHeight) 57 | output = cv2.warpAffine(image, M, (w, h), 58 | flags=cv2.INTER_CUBIC) 59 | 60 | # return the aligned face 61 | return output -------------------------------------------------------------------------------- /face_detect.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | from mtcnn2 import MTCNN 3 | from draw_points import * 4 | import os 5 | import numpy as np 6 | 7 | #ckpts = np.zeros((5000, 2500), dtype='uint8') 8 | 9 | print('Welcome to Face Detection \n\n Enter 1 to add image manually\n Enter 2 to detect face in Webcam feed') 10 | n = int(input()) 11 | if n != 1 and n != 2: 12 | print('Wrong Choice') 13 | exit(0) 14 | count = 0 15 | if n == 1: 16 | print('Enter complete address of the image') 17 | #addr = str(input()) 18 | #addr = 'C:/Users/Rashmi/Downloads/21.jpg' 19 | addr = '/home/ml/Documents/attendance_dl/21.jpg' 20 | if not os.path.exists(addr): 21 | print('Invalid Address') 22 | exit(0) 23 | 24 | print('Enter Resolution of output image (in heightXwidth format)') 25 | res = input().split('X') 26 | img = cv2.imread(addr) 27 | img = cv2.resize(img, (int(res[0]), int(res[1]))) 28 | ckpts = np.zeros((int(res[0]), int(res[1])), dtype = 'uint8') 29 | 30 | elif n ==2: 31 | #video_capture = cv2.VideoCapture(0) 32 | #/home/ml/Documents/attendance_dl/dataset/dtst7.mp4 33 | video_capture = cv2.VideoCapture('dataset/Mam.mp4') 34 | 35 | 36 | detector = MTCNN() 37 | ct = 0 38 | alpha = 0.12 39 | beta = 0.04 40 | 41 | while True: 42 | 43 | if n == 2: 44 | ret, frame = video_capture.read() 45 | #frame = cv2.resize(frame) 46 | elif n == 1: 47 | frame = img 48 | 49 | #edges = cv2.Canny(frame,500,1000) 50 | #b, g, r = cv2.split(frame) 51 | #dst = cv2.add(r, edges) 52 | #frame2 = cv2.merge((r, b, dst)) 53 | m = cv2.getRotationMatrix2D((frame.shape[1]/2, frame.shape[0]/2+250), -90, 1) 54 | frame = cv2.warpAffine(frame, m, (frame.shape[1], frame.shape[0])) 55 | frame = cv2.resize(frame, (840, 480)) 56 | 57 | detect = detector.detect_faces(frame) 58 | 59 | if detect: 60 | 61 | for i in range(int(len(detect[:]))): 62 | boxes = detect[i]['box'] 63 | keypoints = detect[i]['keypoints'] 64 | #print(keypoints['nose']) 65 | if ckpts[keypoints['nose']] == 0 and ckpts[keypoints['left_eye']] == 0 and ckpts[keypoints['right_eye']] == 0 
and ckpts[keypoints['mouth_left']] == 0 and ckpts[keypoints['mouth_right']] == 0: 66 | #show_points(frame, boxes, keypoints, alpha, beta) 67 | draw_lines(frame, boxes, keypoints, alpha, beta, count) 68 | count = count+1 69 | print('count', count) 70 | '''for w in range(boxes[0], boxes[0]+boxes[2]): 71 | for h in range(boxes[1], boxes[1]+boxes[3]): 72 | ckpts[w][h] = 1''' 73 | 74 | 75 | # Display the resulting frame 76 | cv2.imshow('Frame', frame) 77 | #cv2.waitKey(0) 78 | #break 79 | if cv2.waitKey(1) & 0xFF == ord('q'): 80 | break 81 | 82 | # Release the capture 83 | #video_capture.release() 84 | cv2.destroyAllWindows() 85 | -------------------------------------------------------------------------------- /facenet.py: -------------------------------------------------------------------------------- 1 | import os 2 | from subprocess import Popen, PIPE 3 | import tensorflow as tf 4 | import numpy as np 5 | from scipy import misc 6 | from sklearn.model_selection import KFold 7 | from scipy import interpolate 8 | from tensorflow.python.training import training 9 | import random 10 | import re 11 | from tensorflow.python.platform import gfile 12 | import math 13 | from six import iteritems 14 | import cv2 15 | 16 | def triplet_loss(anchor, positive, negative, alpha): 17 | """Calculate the triplet loss according to the FaceNet paper 18 | 19 | Args: 20 | anchor: the embeddings for the anchor images. 21 | positive: the embeddings for the positive images. 22 | negative: the embeddings for the negative images. 23 | 24 | Returns: 25 | the triplet loss according to the FaceNet paper as a float tensor. 26 | """ 27 | with tf.variable_scope('triplet_loss'): 28 | pos_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, positive)), 1) 29 | neg_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, negative)), 1) 30 | 31 | basic_loss = tf.add(tf.subtract(pos_dist,neg_dist), alpha) 32 | loss = tf.reduce_mean(tf.maximum(basic_loss, 0.0), 0) 33 | 34 | return loss 35 | 36 | def center_loss(features, label, alfa, nrof_classes): 37 | """Center loss based on the paper "A Discriminative Feature Learning Approach for Deep Face Recognition" 38 | (http://ydwen.github.io/papers/WenECCV16.pdf) 39 | """ 40 | nrof_features = features.get_shape()[1] 41 | centers = tf.get_variable('centers', [nrof_classes, nrof_features], dtype=tf.float32, 42 | initializer=tf.constant_initializer(0), trainable=False) 43 | label = tf.reshape(label, [-1]) 44 | centers_batch = tf.gather(centers, label) 45 | diff = (1 - alfa) * (centers_batch - features) 46 | centers = tf.scatter_sub(centers, label, diff) 47 | with tf.control_dependencies([centers]): 48 | loss = tf.reduce_mean(tf.square(features - centers_batch)) 49 | return loss, centers 50 | 51 | def get_image_paths_and_labels(dataset): 52 | image_paths_flat = [] 53 | labels_flat = [] 54 | for i in range(len(dataset)): 55 | image_paths_flat += dataset[i].image_paths 56 | labels_flat += [i] * len(dataset[i].image_paths) 57 | return image_paths_flat, labels_flat 58 | 59 | def shuffle_examples(image_paths, labels): 60 | shuffle_list = list(zip(image_paths, labels)) 61 | random.shuffle(shuffle_list) 62 | image_paths_shuff, labels_shuff = zip(*shuffle_list) 63 | return image_paths_shuff, labels_shuff 64 | 65 | def random_rotate_image(image): 66 | angle = np.random.uniform(low=-10.0, high=10.0) 67 | return misc.imrotate(image, angle, 'bicubic') 68 | 69 | # 1: Random rotate 2: Random crop 4: Random flip 8: Fixed image standardization 16: Flip 70 | RANDOM_ROTATE = 1 71 | RANDOM_CROP = 2 72 | RANDOM_FLIP = 4 
73 | FIXED_STANDARDIZATION = 8 74 | FLIP = 16 75 | def create_input_pipeline(input_queue, image_size, nrof_preprocess_threads, batch_size_placeholder): 76 | images_and_labels_list = [] 77 | for _ in range(nrof_preprocess_threads): 78 | filenames, label, control = input_queue.dequeue() 79 | images = [] 80 | for filename in tf.unstack(filenames): 81 | file_contents = tf.read_file(filename) 82 | image = tf.image.decode_image(file_contents, 3) 83 | image = tf.cond(get_control_flag(control[0], RANDOM_ROTATE), 84 | lambda:tf.py_func(random_rotate_image, [image], tf.uint8), 85 | lambda:tf.identity(image)) 86 | image = tf.cond(get_control_flag(control[0], RANDOM_CROP), 87 | lambda:tf.random_crop(image, image_size + (3,)), 88 | lambda:tf.image.resize_image_with_crop_or_pad(image, image_size[0], image_size[1])) 89 | image = tf.cond(get_control_flag(control[0], RANDOM_FLIP), 90 | lambda:tf.image.random_flip_left_right(image), 91 | lambda:tf.identity(image)) 92 | image = tf.cond(get_control_flag(control[0], FIXED_STANDARDIZATION), 93 | lambda:(tf.cast(image, tf.float32) - 127.5)/128.0, 94 | lambda:tf.image.per_image_standardization(image)) 95 | image = tf.cond(get_control_flag(control[0], FLIP), 96 | lambda:tf.image.flip_left_right(image), 97 | lambda:tf.identity(image)) 98 | #pylint: disable=no-member 99 | image.set_shape(image_size + (3,)) 100 | images.append(image) 101 | images_and_labels_list.append([images, label]) 102 | 103 | image_batch, label_batch = tf.train.batch_join( 104 | images_and_labels_list, batch_size=batch_size_placeholder, 105 | shapes=[image_size + (3,), ()], enqueue_many=True, 106 | capacity=4 * nrof_preprocess_threads * 100, 107 | allow_smaller_final_batch=True) 108 | 109 | return image_batch, label_batch 110 | 111 | def get_control_flag(control, field): 112 | return tf.equal(tf.mod(tf.floor_div(control, field), 2), 1) 113 | 114 | def _add_loss_summaries(total_loss): 115 | """Add summaries for losses. 116 | 117 | Generates moving average for all losses and associated summaries for 118 | visualizing the performance of the network. 119 | 120 | Args: 121 | total_loss: Total loss from loss(). 122 | Returns: 123 | loss_averages_op: op for generating moving averages of losses. 124 | """ 125 | # Compute the moving average of all individual losses and the total loss. 126 | loss_averages = tf.train.ExponentialMovingAverage(0.9, name='avg') 127 | losses = tf.get_collection('losses') 128 | loss_averages_op = loss_averages.apply(losses + [total_loss]) 129 | 130 | # Attach a scalar summmary to all individual losses and the total loss; do the 131 | # same for the averaged version of the losses. 132 | for l in losses + [total_loss]: 133 | # Name each loss as '(raw)' and name the moving average version of the loss 134 | # as the original loss name. 135 | tf.summary.scalar(l.op.name +' (raw)', l) 136 | tf.summary.scalar(l.op.name, loss_averages.average(l)) 137 | 138 | return loss_averages_op 139 | 140 | def train(total_loss, global_step, optimizer, learning_rate, moving_average_decay, update_gradient_vars, log_histograms=True): 141 | # Generate moving averages of all losses and associated summaries. 142 | loss_averages_op = _add_loss_summaries(total_loss) 143 | 144 | # Compute gradients. 
145 | with tf.control_dependencies([loss_averages_op]): 146 | if optimizer=='ADAGRAD': 147 | opt = tf.train.AdagradOptimizer(learning_rate) 148 | elif optimizer=='ADADELTA': 149 | opt = tf.train.AdadeltaOptimizer(learning_rate, rho=0.9, epsilon=1e-6) 150 | elif optimizer=='ADAM': 151 | opt = tf.train.AdamOptimizer(learning_rate, beta1=0.9, beta2=0.999, epsilon=0.1) 152 | elif optimizer=='RMSPROP': 153 | opt = tf.train.RMSPropOptimizer(learning_rate, decay=0.9, momentum=0.9, epsilon=1.0) 154 | elif optimizer=='MOM': 155 | opt = tf.train.MomentumOptimizer(learning_rate, 0.9, use_nesterov=True) 156 | else: 157 | raise ValueError('Invalid optimization algorithm') 158 | 159 | grads = opt.compute_gradients(total_loss, update_gradient_vars) 160 | 161 | # Apply gradients. 162 | apply_gradient_op = opt.apply_gradients(grads, global_step=global_step) 163 | 164 | # Add histograms for trainable variables. 165 | if log_histograms: 166 | for var in tf.trainable_variables(): 167 | tf.summary.histogram(var.op.name, var) 168 | 169 | # Add histograms for gradients. 170 | if log_histograms: 171 | for grad, var in grads: 172 | if grad is not None: 173 | tf.summary.histogram(var.op.name + '/gradients', grad) 174 | 175 | # Track the moving averages of all trainable variables. 176 | variable_averages = tf.train.ExponentialMovingAverage( 177 | moving_average_decay, global_step) 178 | variables_averages_op = variable_averages.apply(tf.trainable_variables()) 179 | 180 | with tf.control_dependencies([apply_gradient_op, variables_averages_op]): 181 | train_op = tf.no_op(name='train') 182 | 183 | return train_op 184 | 185 | def prewhiten(x): 186 | mean = np.mean(x) 187 | std = np.std(x) 188 | std_adj = np.maximum(std, 1.0/np.sqrt(x.size)) 189 | y = np.multiply(np.subtract(x, mean), 1/std_adj) 190 | return y 191 | 192 | def crop(image, random_crop, image_size): 193 | if image.shape[1]>image_size: 194 | sz1 = int(image.shape[1]//2) 195 | sz2 = int(image_size//2) 196 | if random_crop: 197 | diff = sz1-sz2 198 | (h, v) = (np.random.randint(-diff, diff+1), np.random.randint(-diff, diff+1)) 199 | else: 200 | (h, v) = (0,0) 201 | image = image[(sz1-sz2+v):(sz1+sz2+v),(sz1-sz2+h):(sz1+sz2+h),:] 202 | return image 203 | 204 | def flip(image, random_flip): 205 | if random_flip and np.random.choice([True, False]): 206 | image = np.fliplr(image) 207 | return image 208 | 209 | def to_rgb(img): 210 | w, h = img.shape 211 | ret = np.empty((w, h, 3), dtype=np.uint8) 212 | ret[:, :, 0] = ret[:, :, 1] = ret[:, :, 2] = img 213 | return ret 214 | 215 | def load_data(image_paths, do_random_crop, do_random_flip, image_size, do_prewhiten=True): 216 | nrof_samples = len(image_paths) 217 | images = np.zeros((nrof_samples, image_size, image_size, 3)) 218 | for i in range(nrof_samples): 219 | img = cv2.imread(image_paths[i]) 220 | if img.ndim == 2: 221 | img = to_rgb(img) 222 | if do_prewhiten: 223 | img = prewhiten(img) 224 | img = crop(img, do_random_crop, image_size) 225 | img = flip(img, do_random_flip) 226 | images[i,:,:,:] = img 227 | return images 228 | 229 | def get_label_batch(label_data, batch_size, batch_index): 230 | nrof_examples = np.size(label_data, 0) 231 | j = batch_index*batch_size % nrof_examples 232 | if j+batch_size<=nrof_examples: 233 | batch = label_data[j:j+batch_size] 234 | else: 235 | x1 = label_data[j:nrof_examples] 236 | x2 = label_data[0:nrof_examples-j] 237 | batch = np.vstack([x1,x2]) 238 | batch_int = batch.astype(np.int64) 239 | return batch_int 240 | 241 | def get_batch(image_data, batch_size, batch_index): 242 
| nrof_examples = np.size(image_data, 0) 243 | j = batch_index*batch_size % nrof_examples 244 | if j+batch_size<=nrof_examples: 245 | batch = image_data[j:j+batch_size,:,:,:] 246 | else: 247 | x1 = image_data[j:nrof_examples,:,:,:] 248 | x2 = image_data[0:nrof_examples-j,:,:,:] 249 | batch = np.vstack([x1,x2]) 250 | batch_float = batch.astype(np.float32) 251 | return batch_float 252 | 253 | def get_triplet_batch(triplets, batch_index, batch_size): 254 | ax, px, nx = triplets 255 | a = get_batch(ax, int(batch_size/3), batch_index) 256 | p = get_batch(px, int(batch_size/3), batch_index) 257 | n = get_batch(nx, int(batch_size/3), batch_index) 258 | batch = np.vstack([a, p, n]) 259 | return batch 260 | 261 | def get_learning_rate_from_file(filename, epoch): 262 | with open(filename, 'r') as f: 263 | for line in f.readlines(): 264 | line = line.split('#', 1)[0] 265 | if line: 266 | par = line.strip().split(':') 267 | e = int(par[0]) 268 | if par[1]=='-': 269 | lr = -1 270 | else: 271 | lr = float(par[1]) 272 | if e <= epoch: 273 | learning_rate = lr 274 | else: 275 | return learning_rate 276 | 277 | class ImageClass(): 278 | "Stores the paths to images for a given class" 279 | def __init__(self, name, image_paths): 280 | self.name = name 281 | self.image_paths = image_paths 282 | 283 | def __str__(self): 284 | return self.name + ', ' + str(len(self.image_paths)) + ' images' 285 | 286 | def __len__(self): 287 | return len(self.image_paths) 288 | 289 | def get_dataset(path, has_class_directories=True): 290 | dataset = [] 291 | path_exp = os.path.expanduser(path) 292 | classes = [path for path in os.listdir(path_exp) \ 293 | if os.path.isdir(os.path.join(path_exp, path))] 294 | classes.sort() 295 | nrof_classes = len(classes) 296 | for i in range(nrof_classes): 297 | class_name = classes[i] 298 | facedir = os.path.join(path_exp, class_name) 299 | image_paths = get_image_paths(facedir) 300 | dataset.append(ImageClass(class_name, image_paths)) 301 | 302 | return dataset 303 | 304 | def get_image_paths(facedir): 305 | image_paths = [] 306 | if os.path.isdir(facedir): 307 | images = os.listdir(facedir) 308 | image_paths = [os.path.join(facedir,img) for img in images] 309 | return image_paths 310 | 311 | def split_dataset(dataset, split_ratio, min_nrof_images_per_class, mode): 312 | if mode=='SPLIT_CLASSES': 313 | nrof_classes = len(dataset) 314 | class_indices = np.arange(nrof_classes) 315 | np.random.shuffle(class_indices) 316 | split = int(round(nrof_classes*(1-split_ratio))) 317 | train_set = [dataset[i] for i in class_indices[0:split]] 318 | test_set = [dataset[i] for i in class_indices[split:-1]] 319 | elif mode=='SPLIT_IMAGES': 320 | train_set = [] 321 | test_set = [] 322 | for cls in dataset: 323 | paths = cls.image_paths 324 | np.random.shuffle(paths) 325 | nrof_images_in_class = len(paths) 326 | split = int(math.floor(nrof_images_in_class*(1-split_ratio))) 327 | if split==nrof_images_in_class: 328 | split = nrof_images_in_class-1 329 | if split>=min_nrof_images_per_class and nrof_images_in_class-split>=1: 330 | train_set.append(ImageClass(cls.name, paths[:split])) 331 | test_set.append(ImageClass(cls.name, paths[split:])) 332 | else: 333 | raise ValueError('Invalid train/test split mode "%s"' % mode) 334 | return train_set, test_set 335 | 336 | def load_model(model, input_map=None): 337 | # Check if the model is a model directory (containing a metagraph and a checkpoint file) 338 | # or if it is a protobuf file with a frozen graph 339 | model_exp = os.path.expanduser(model) 340 | if 
(os.path.isfile(model_exp)): 341 | print('Model filename: %s' % model_exp) 342 | with gfile.FastGFile(model_exp,'rb') as f: 343 | graph_def = tf.GraphDef() 344 | graph_def.ParseFromString(f.read()) 345 | tf.import_graph_def(graph_def, input_map=input_map, name='') 346 | else: 347 | print('Model directory: %s' % model_exp) 348 | meta_file, ckpt_file = get_model_filenames(model_exp) 349 | 350 | print('Metagraph file: %s' % meta_file) 351 | print('Checkpoint file: %s' % ckpt_file) 352 | 353 | saver = tf.train.import_meta_graph(os.path.join(model_exp, meta_file), input_map=input_map) 354 | saver.restore(tf.get_default_session(), os.path.join(model_exp, ckpt_file)) 355 | 356 | def get_model_filenames(model_dir): 357 | files = os.listdir(model_dir) 358 | meta_files = [s for s in files if s.endswith('.meta')] 359 | if len(meta_files)==0: 360 | raise ValueError('No meta file found in the model directory (%s)' % model_dir) 361 | elif len(meta_files)>1: 362 | raise ValueError('There should not be more than one meta file in the model directory (%s)' % model_dir) 363 | meta_file = meta_files[0] 364 | ckpt = tf.train.get_checkpoint_state(model_dir) 365 | if ckpt and ckpt.model_checkpoint_path: 366 | ckpt_file = os.path.basename(ckpt.model_checkpoint_path) 367 | return meta_file, ckpt_file 368 | 369 | meta_files = [s for s in files if '.ckpt' in s] 370 | max_step = -1 371 | for f in files: 372 | step_str = re.match(r'(^model-[\w\- ]+.ckpt-(\d+))', f) 373 | if step_str is not None and len(step_str.groups())>=2: 374 | step = int(step_str.groups()[1]) 375 | if step > max_step: 376 | max_step = step 377 | ckpt_file = step_str.groups()[0] 378 | return meta_file, ckpt_file 379 | 380 | def distance(embeddings1, embeddings2, distance_metric=0): 381 | if distance_metric==0: 382 | # Euclidian distance 383 | diff = np.subtract(embeddings1, embeddings2) 384 | dist = np.sum(np.square(diff),1) 385 | elif distance_metric==1: 386 | # Distance based on cosine similarity 387 | dot = np.sum(np.multiply(embeddings1, embeddings2), axis=1) 388 | norm = np.linalg.norm(embeddings1, axis=1) * np.linalg.norm(embeddings2, axis=1) 389 | similarity = dot / norm 390 | dist = np.arccos(similarity) / math.pi 391 | else: 392 | raise 'Undefined distance metric %d' % distance_metric 393 | 394 | return dist 395 | 396 | def calculate_roc(thresholds, embeddings1, embeddings2, actual_issame, nrof_folds=10, distance_metric=0, subtract_mean=False): 397 | assert(embeddings1.shape[0] == embeddings2.shape[0]) 398 | assert(embeddings1.shape[1] == embeddings2.shape[1]) 399 | nrof_pairs = min(len(actual_issame), embeddings1.shape[0]) 400 | nrof_thresholds = len(thresholds) 401 | k_fold = KFold(n_splits=nrof_folds, shuffle=False) 402 | 403 | tprs = np.zeros((nrof_folds,nrof_thresholds)) 404 | fprs = np.zeros((nrof_folds,nrof_thresholds)) 405 | accuracy = np.zeros((nrof_folds)) 406 | 407 | indices = np.arange(nrof_pairs) 408 | 409 | for fold_idx, (train_set, test_set) in enumerate(k_fold.split(indices)): 410 | if subtract_mean: 411 | mean = np.mean(np.concatenate([embeddings1[train_set], embeddings2[train_set]]), axis=0) 412 | else: 413 | mean = 0.0 414 | dist = distance(embeddings1-mean, embeddings2-mean, distance_metric) 415 | 416 | # Find the best threshold for the fold 417 | acc_train = np.zeros((nrof_thresholds)) 418 | for threshold_idx, threshold in enumerate(thresholds): 419 | _, _, acc_train[threshold_idx] = calculate_accuracy(threshold, dist[train_set], actual_issame[train_set]) 420 | best_threshold_index = np.argmax(acc_train) 421 | for 
threshold_idx, threshold in enumerate(thresholds): 422 | tprs[fold_idx,threshold_idx], fprs[fold_idx,threshold_idx], _ = calculate_accuracy(threshold, dist[test_set], actual_issame[test_set]) 423 | _, _, accuracy[fold_idx] = calculate_accuracy(thresholds[best_threshold_index], dist[test_set], actual_issame[test_set]) 424 | 425 | tpr = np.mean(tprs,0) 426 | fpr = np.mean(fprs,0) 427 | return tpr, fpr, accuracy 428 | 429 | def calculate_accuracy(threshold, dist, actual_issame): 430 | predict_issame = np.less(dist, threshold) 431 | tp = np.sum(np.logical_and(predict_issame, actual_issame)) 432 | fp = np.sum(np.logical_and(predict_issame, np.logical_not(actual_issame))) 433 | tn = np.sum(np.logical_and(np.logical_not(predict_issame), np.logical_not(actual_issame))) 434 | fn = np.sum(np.logical_and(np.logical_not(predict_issame), actual_issame)) 435 | 436 | tpr = 0 if (tp+fn==0) else float(tp) / float(tp+fn) 437 | fpr = 0 if (fp+tn==0) else float(fp) / float(fp+tn) 438 | acc = float(tp+tn)/dist.size 439 | return tpr, fpr, acc 440 | 441 | 442 | 443 | def calculate_val(thresholds, embeddings1, embeddings2, actual_issame, far_target, nrof_folds=10, distance_metric=0, subtract_mean=False): 444 | assert(embeddings1.shape[0] == embeddings2.shape[0]) 445 | assert(embeddings1.shape[1] == embeddings2.shape[1]) 446 | nrof_pairs = min(len(actual_issame), embeddings1.shape[0]) 447 | nrof_thresholds = len(thresholds) 448 | k_fold = KFold(n_splits=nrof_folds, shuffle=False) 449 | 450 | val = np.zeros(nrof_folds) 451 | far = np.zeros(nrof_folds) 452 | 453 | indices = np.arange(nrof_pairs) 454 | 455 | for fold_idx, (train_set, test_set) in enumerate(k_fold.split(indices)): 456 | if subtract_mean: 457 | mean = np.mean(np.concatenate([embeddings1[train_set], embeddings2[train_set]]), axis=0) 458 | else: 459 | mean = 0.0 460 | dist = distance(embeddings1-mean, embeddings2-mean, distance_metric) 461 | 462 | # Find the threshold that gives FAR = far_target 463 | far_train = np.zeros(nrof_thresholds) 464 | for threshold_idx, threshold in enumerate(thresholds): 465 | _, far_train[threshold_idx] = calculate_val_far(threshold, dist[train_set], actual_issame[train_set]) 466 | if np.max(far_train)>=far_target: 467 | f = interpolate.interp1d(far_train, thresholds, kind='slinear') 468 | threshold = f(far_target) 469 | else: 470 | threshold = 0.0 471 | 472 | val[fold_idx], far[fold_idx] = calculate_val_far(threshold, dist[test_set], actual_issame[test_set]) 473 | 474 | val_mean = np.mean(val) 475 | far_mean = np.mean(far) 476 | val_std = np.std(val) 477 | return val_mean, val_std, far_mean 478 | 479 | 480 | def calculate_val_far(threshold, dist, actual_issame): 481 | predict_issame = np.less(dist, threshold) 482 | true_accept = np.sum(np.logical_and(predict_issame, actual_issame)) 483 | false_accept = np.sum(np.logical_and(predict_issame, np.logical_not(actual_issame))) 484 | n_same = np.sum(actual_issame) 485 | n_diff = np.sum(np.logical_not(actual_issame)) 486 | val = float(true_accept) / float(n_same) 487 | far = float(false_accept) / float(n_diff) 488 | return val, far 489 | 490 | def store_revision_info(src_path, output_dir, arg_string): 491 | try: 492 | # Get git hash 493 | cmd = ['git', 'rev-parse', 'HEAD'] 494 | gitproc = Popen(cmd, stdout = PIPE, cwd=src_path) 495 | (stdout, _) = gitproc.communicate() 496 | git_hash = stdout.strip() 497 | except OSError as e: 498 | git_hash = ' '.join(cmd) + ': ' + e.strerror 499 | 500 | try: 501 | # Get local changes 502 | cmd = ['git', 'diff', 'HEAD'] 503 | gitproc = Popen(cmd, 
stdout = PIPE, cwd=src_path) 504 | (stdout, _) = gitproc.communicate() 505 | git_diff = stdout.strip() 506 | except OSError as e: 507 | git_diff = ' '.join(cmd) + ': ' + e.strerror 508 | 509 | # Store a text file in the log directory 510 | rev_info_filename = os.path.join(output_dir, 'revision_info.txt') 511 | with open(rev_info_filename, "w") as text_file: 512 | text_file.write('arguments: %s\n--------------------\n' % arg_string) 513 | text_file.write('tensorflow version: %s\n--------------------\n' % tf.__version__) # @UndefinedVariable 514 | text_file.write('git hash: %s\n--------------------\n' % git_hash) 515 | text_file.write('%s' % git_diff) 516 | 517 | def list_variables(filename): 518 | reader = training.NewCheckpointReader(filename) 519 | variable_map = reader.get_variable_to_shape_map() 520 | names = sorted(variable_map.keys()) 521 | return names 522 | 523 | def put_images_on_grid(images, shape=(16,8)): 524 | nrof_images = images.shape[0] 525 | img_size = images.shape[1] 526 | bw = 3 527 | img = np.zeros((shape[1]*(img_size+bw)+bw, shape[0]*(img_size+bw)+bw, 3), np.float32) 528 | for i in range(shape[1]): 529 | x_start = i*(img_size+bw)+bw 530 | for j in range(shape[0]): 531 | img_index = i*shape[0]+j 532 | if img_index>=nrof_images: 533 | break 534 | y_start = j*(img_size+bw)+bw 535 | img[x_start:x_start+img_size, y_start:y_start+img_size, :] = images[img_index, :, :, :] 536 | if img_index>=nrof_images: 537 | break 538 | return img 539 | 540 | def write_arguments_to_file(args, filename): 541 | with open(filename, 'w') as f: 542 | for key, value in iteritems(vars(args)): 543 | f.write('%s: %s\n' % (key, str(value))) 544 | -------------------------------------------------------------------------------- /final_sotware.py: -------------------------------------------------------------------------------- 1 | # /home/aashish/Documents/deep_learning/attendance_deep_learning 2 | 3 | import tensorflow as tf 4 | from scipy import misc 5 | import numpy as np 6 | import argparse 7 | import facenet 8 | import cv2 9 | import sys 10 | import os 11 | import math 12 | import pickle 13 | from sklearn.svm import SVC 14 | from PIL import Image 15 | from face_aligner import FaceAligner 16 | import detect_face 17 | from sheet import mark_present 18 | from mtcnn.mtcnn import MTCNN 19 | 20 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 21 | 22 | 23 | 24 | 25 | 26 | 27 | def dataset_creation(parameters): 28 | path1, webcam, face_dim, gpu, username, vid_path = parameters 29 | path = "" 30 | res = () 31 | personNo = 1 32 | folder_name = "" 33 | 34 | path = path1 35 | 36 | if os.path.isdir(path): 37 | path += '/output' 38 | if os.path.isdir(path): 39 | print("Directory already exists. Using it \n") 40 | else: 41 | if not os.makedirs(path): 42 | print("Directory successfully made in: " + path + "\n") 43 | 44 | else: 45 | if path == "": 46 | print("Making an output folder in this directory only. \n") 47 | else: 48 | print("No such directory exists. Making an output folder in this current code directory only. \n") 49 | 50 | path = 'output' 51 | if os.path.isdir(path): 52 | print("Directory already exists. Using it \n") 53 | else: 54 | if os.makedirs(path): 55 | print("error in making directory. 
\n") 56 | sys.exit() 57 | else: 58 | print("Directory successfully made: " + path + "\n") 59 | detector = MTCNN() 60 | res = webcam 61 | if res == "": 62 | res = (640, 480) 63 | else: 64 | res = tuple(map(int, res.split('x'))) 65 | 66 | gpu_fraction = gpu 67 | if gpu_fraction == "": 68 | gpu_fraction = 0.8 69 | else: 70 | gpu_fraction = round(float(gpu_fraction), 1) 71 | 72 | minsize = 20 73 | threshold = [0.6, 0.7, 0.7] 74 | factor = 0.7 75 | 76 | with tf.Graph().as_default(): 77 | gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=gpu_fraction) 78 | sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False)) 79 | with sess.as_default(): 80 | pnet, rnet, onet = detect_face.create_mtcnn(sess, None) 81 | 82 | face_size = face_dim 83 | if face_size == "": 84 | face_size = (160, 160) 85 | print('default face size') 86 | else: 87 | face_size = tuple(map(int, face_size.split('x'))) 88 | affine = FaceAligner(desiredLeftEye=(0.33, 0.33), desiredFaceWidth=face_size[0], desiredFaceHeight=face_size[1]) 89 | 90 | while True: 91 | ask = username 92 | ask = ask.replace(" ", "_") 93 | 94 | if ask == "": 95 | folder_name = 'person' + str(personNo) 96 | else: 97 | folder_name = ask 98 | 99 | personNo += 1 100 | users_folder = path + "/" + folder_name 101 | image_no = 1 102 | 103 | if os.path.isdir(users_folder): 104 | print("Directory already exists. Using it \n") 105 | else: 106 | if os.makedirs(users_folder): 107 | print("error in making directory. \n") 108 | sys.exit() 109 | else: 110 | print("Directory successfully made: " + users_folder + "\n") 111 | 112 | data_type = vid_path 113 | loop_type = False 114 | total_frames = 0 115 | 116 | if data_type == "": 117 | data_type = 0 118 | loop_type = True 119 | 120 | # Initialize webcam or video 121 | device = cv2.VideoCapture(data_type) 122 | 123 | # If webcam set resolution 124 | if data_type == 0: 125 | device.set(3, res[0]) 126 | device.set(4, res[1]) 127 | else: 128 | # Finding total number of frames of video. 129 | total_frames = int(device.get(cv2.CAP_PROP_FRAME_COUNT)) 130 | # Shutting down webcam variable 131 | loop_type = False 132 | 133 | # Start web cam or start video and start creating dataset by user. 
134 | while loop_type or (total_frames > 0): 135 | 136 | # If video selected dec counter 137 | if loop_type == False: 138 | total_frames -= 1 139 | 140 | ret, image = device.read() 141 | 142 | # Run MTCNN and do face detection until 's' keyword is pressed 143 | if (cv2.waitKey(1) & 0xFF) == ord("s"): 144 | 145 | #bb, points = detect_face.detect_face(image, minsize, pnet, rnet, onet, threshold, factor) 146 | detect = detector.detect_faces(image) 147 | print(detect) 148 | 149 | # See if face is detected 150 | if detect: 151 | bb = detect[0]['box'] 152 | points = detect[0]['keypoints'] 153 | print(bb) 154 | x, y, w, h = bb 155 | aligned_image = image[y:y+h, x:x+w] 156 | #aligned_image = affine.align(image, points) 157 | image_name = users_folder + "/" + folder_name + "_" + str(image_no).zfill(4) + ".png" 158 | cv2.imwrite(image_name, aligned_image) 159 | image_no += 1 160 | 161 | ''' 162 | for i in range(bb.shape[0]): 163 | cv2.rectangle(image, (int(bb[i][0]), int(bb[i][1])), (int(bb[i][2]), int(bb[i][3])), (0, 255, 0), 2) 164 | 165 | # loop over the (x, y)-coordinates for the facial landmarks 166 | # and draw each of them 167 | for col in range(points.shape[1]): 168 | for i in range(5): 169 | cv2.circle(image, (int(points[i][col]), int(points[i + 5][col])), 1, (0, 255, 0), -1)''' 170 | 171 | # Show the output video to user 172 | cv2.imshow("Output", image) 173 | 174 | # Break this loop if 'q' keyword pressed to go to next user. 175 | if (cv2.waitKey(0) & 0xFF) == ord("q"): 176 | device.release() 177 | cv2.destroyAllWindows() 178 | # break 179 | abcd = 1 180 | return abcd 181 | 182 | def train(parameters): 183 | path1, path2, batch, img_dim, gpu, svm_name, split_percent, split_data = parameters 184 | 185 | path = path1 # input("\nEnter the path to the face images directory inside which multiple user folders are present or press ENTER if the default created output folder is present in this code directory only: ") 186 | if path == "": 187 | path = 'output' 188 | 189 | gpu_fraction = gpu # input("\nEnter the gpu memory fraction u want to allocate out of 1 or press ENTER for default 0.8: ").rstrip().lstrip() 190 | if gpu_fraction == "": 191 | gpu_fraction = 0.8 192 | else: 193 | gpu_fraction = round(float(gpu_fraction), 1) 194 | 195 | model = path2 # input("\nEnter the FOLDER PATH inside which 20180402-114759 FOLDER is present. 
Press ENTER stating that the FOLDER 20180402-114759 is present in this code directory itself: ").rstrip().lstrip() 196 | if model == "": 197 | model = "20180402-114759/20180402-114759.pb" 198 | else: 199 | model += "/20180402-114759/20180402-114759.pb" 200 | 201 | batch_size = 90 202 | ask = batch # input("\nEnter the batch size of images to process at once OR press ENTER for default 90: ").rstrip().lstrip() 203 | if ask != "": 204 | batch_size = int(ask) 205 | 206 | image_size = 160 207 | ask = img_dim # input("\nEnter the width_size of face images OR press ENTER for default 160: ").rstrip().lstrip() 208 | if ask != "": 209 | image_size = int(ask) 210 | 211 | classifier_filename = svm_name # input("Enter the output SVM classifier filename OR press ENTER for default name= classifier: ") 212 | if classifier_filename == "": 213 | classifier_filename = 'classifier.pkl' 214 | else: 215 | classifier_filename += '.pkl' 216 | classifier_filename = os.path.expanduser(classifier_filename) 217 | 218 | split_dataset = split_data # input("\nPress Y if you want to split the dataset for Training and Testing: ").rstrip().lstrip().lower() 219 | 220 | # If yes ask for the percentage of training and testing division. 221 | percentage = 70 222 | if split_dataset == 'y': 223 | ask = split_percent # input("\nEnter the percentage of training dataset for splitting OR press ENTER for default 70: ").rstrip().lstrip() 224 | if ask != "": 225 | percentage = float(ask) 226 | 227 | min_nrof_images_per_class = 0 228 | 229 | dataset = facenet.get_dataset(path) 230 | train_set = [] 231 | test_set = [] 232 | 233 | if split_dataset == 'y': 234 | for cls in dataset: 235 | paths = cls.image_paths 236 | # Remove classes with less than min_nrof_images_per_class 237 | if len(paths) >= min_nrof_images_per_class: 238 | np.random.shuffle(paths) 239 | 240 | # Find the number of images in training set and testing set images for this class 241 | no_train_images = int(percentage * len(paths) * 0.01) 242 | 243 | train_set.append(facenet.ImageClass(cls.name, paths[:no_train_images])) 244 | test_set.append(facenet.ImageClass(cls.name, paths[no_train_images:])) 245 | 246 | 247 | paths_train = [] 248 | labels_train = [] 249 | paths_test = [] 250 | labels_test = [] 251 | emb_array = [] 252 | class_names = [] 253 | 254 | if split_dataset == 'y': 255 | paths_train, labels_train = facenet.get_image_paths_and_labels(train_set) 256 | paths_test, labels_test = facenet.get_image_paths_and_labels(test_set) 257 | print('\nNumber of classes: %d' % len(train_set)) 258 | print('\nNumber of images in TRAIN set: %d' % len(paths_train)) 259 | print('\nNumber of images in TEST set: %d' % len(paths_test)) 260 | else: 261 | paths_train, labels_train = facenet.get_image_paths_and_labels(dataset) 262 | print('\nNumber of classes: %d' % len(dataset)) 263 | print('\nNumber of images: %d' % len(paths_train)) 264 | 265 | # Find embedding 266 | emb_array = get_embeddings(model, paths_train, batch_size, image_size, gpu_fraction) 267 | 268 | # Train the classifier 269 | print('\nTraining classifier') 270 | model_svc = SVC(kernel='linear', probability=True) 271 | model_svc.fit(emb_array, labels_train) 272 | 273 | # Create a list of class names 274 | if split_dataset == 'y': 275 | class_names = [cls.name.replace('_', ' ') for cls in train_set] 276 | else: 277 | class_names = [cls.name.replace('_', ' ') for cls in dataset] 278 | 279 | # Saving classifier model 280 | with open(classifier_filename, 'wb') as outfile: 281 | pickle.dump((model_svc, class_names), outfile) 282 
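# The pickle written above stores a (sklearn SVC, class_names) tuple. test() and
# recognize() below reload it in the same way; a minimal usage sketch (variable
# names are illustrative):
#   with open(classifier_filename, 'rb') as infile:
#       model_svc, class_names = pickle.load(infile)
#   predictions = model_svc.predict_proba(emb_array)  # one row of class probabilities per embedding
#   best = np.argmax(predictions, axis=1)             # indices into class_names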
| 283 | print('\nSaved classifier model to file: "%s"' % classifier_filename) 284 | 285 | if split_dataset == 'y': 286 | # Find embedding for test data 287 | emb_array = get_embeddings(model, paths_test, batch_size, image_size, gpu_fraction) 288 | 289 | # Call test on the test set. 290 | parameters = '', '', '', '', '', gpu_fraction 291 | test(parameters, classifier_filename, emb_array, labels_test, model, batch_size, image_size) 292 | 293 | c = 1 294 | return c 295 | 296 | 297 | def test(parameters, classifier_filename="", emb_array=[], labels_test=[], model="", batch_size=0, image_size=0): 298 | path1, path2, path3, batch_size, img_dim, gpu = parameters 299 | 300 | if classifier_filename == "": 301 | classifier_filename = path1 # input("\nEnter the path of the classifier .pkl file or press ENTER if a filename classifier.pkl is present in this code directory itself: ") 302 | if classifier_filename == "": 303 | classifier_filename = 'classifier.pkl' 304 | classifier_filename = os.path.expanduser(classifier_filename) 305 | 306 | gpu_fraction = gpu # input("\nEnter the gpu memory fraction u want to allocate out of 1 or press ENTER for default 0.8: ").rstrip().lstrip() 307 | if gpu_fraction == "": 308 | gpu_fraction = 0.8 309 | else: 310 | gpu_fraction = round(float(gpu_fraction), 1) 311 | 312 | if model == "": 313 | model = path2 # input("\nEnter the FOLDER PATH inside which 20180402-114759 FOLDER is present. Press ENTER stating that the FOLDER 20180402-114759 is present in this code directory itself: ").rstrip() 314 | if model == "": 315 | model = "20180402-114759/20180402-114759.pb" 316 | 317 | if batch_size == 0 or batch_size == '': 318 | ask = batch_size # input("\nEnter the batch size of images to process at once OR press ENTER for default 90: ").rstrip().lstrip() 319 | if ask == "": 320 | batch_size = 90 321 | else: 322 | batch_size = int(ask) 323 | 324 | if image_size == 0: 325 | ask = img_dim # input("\nEnter the width_size of face images OR press ENTER for default 160: ").rstrip().lstrip() 326 | if ask == "": 327 | image_size = 160 328 | else: 329 | image_size = int(ask) 330 | 331 | if labels_test == []: 332 | path = path3 # input("\nEnter the path to the face images directory inside which multiple user folders are present or press ENTER if the default created output folder is present in this code directory only: ") 333 | if path == "": 334 | path = 'output' 335 | dataset = facenet.get_dataset(path) 336 | paths, labels_test = facenet.get_image_paths_and_labels(dataset) 337 | print('\nNumber of classes to test: %d' % len(dataset)) 338 | print('\nNumber of images to test: %d' % len(paths)) 339 | # Generate embeddings of these paths 340 | emb_array = get_embeddings(model, paths, batch_size, image_size, gpu_fraction) 341 | 342 | # Classify images 343 | print('\nTesting classifier') 344 | with open(classifier_filename, 'rb') as infile: 345 | (modelSVM, class_names) = pickle.load(infile) 346 | 347 | print('\nLoaded classifier model from file "%s"' % classifier_filename) 348 | 349 | predictions = modelSVM.predict_proba(emb_array) 350 | best_class_indices = np.argmax(predictions, axis=1) 351 | best_class_probabilities = predictions[np.arange(len(best_class_indices)), best_class_indices] 352 | 353 | for i in range(len(best_class_indices)): 354 | print('%4d %s: %.3f' % (i, class_names[best_class_indices[i]], best_class_probabilities[i])) 355 | 356 | accuracy = np.mean(np.equal(best_class_indices, labels_test)) 357 | print('\nAccuracy: %.3f' % accuracy) 358 | 359 | 360 | def 
get_embeddings(model, paths, batch_size, image_size, gpu_fraction): 361 | # initializing the facenet tensorflow model 362 | gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=gpu_fraction) 363 | with tf.Graph().as_default(): 364 | with tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False)) as sess: 365 | # Load the model 366 | print('\nLoading feature extraction model') 367 | facenet.load_model(model) 368 | 369 | # Get input and output tensors 370 | images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0") 371 | embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0") 372 | phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0") 373 | embedding_size = embeddings.get_shape()[1] 374 | 375 | # Run forward pass to calculate embeddings 376 | print('Calculating features for images') 377 | nrof_images = len(paths) 378 | nrof_batches_per_epoch = int(math.ceil(1.0 * nrof_images / batch_size)) 379 | emb_array = np.zeros((nrof_images, embedding_size)) 380 | 381 | for i in range(nrof_batches_per_epoch): 382 | start_index = i * batch_size 383 | end_index = min((i + 1) * batch_size, nrof_images) 384 | paths_batch = paths[start_index:end_index] 385 | # print(paths_batch) 386 | 387 | # Does random crop, prewhitening and flipping. 388 | images = facenet.load_data(paths_batch, False, False, image_size) 389 | 390 | # Get the embeddings 391 | feed_dict = {images_placeholder: images, phase_train_placeholder: False} 392 | emb_array[start_index:end_index, :] = sess.run(embeddings, feed_dict=feed_dict) 393 | 394 | return emb_array 395 | 396 | 397 | def recognize(mode, parameters): 398 | print(parameters) 399 | path1, path2, face_dim, gpu, thresh1, thresh2, resolution, img_path, out_img_path, vid_path, vid_save, vid_see = parameters 400 | st_name = '' 401 | # Taking the parameters for recogniton by the user 402 | if path1: 403 | classifier_filename = path1 # input("\nEnter the path of the classifier .pkl file or press ENTER if a filename 'classifier.pkl' is present in this code directory itself: ") 404 | else: 405 | classifier_filename = 'classifier.pkl' 406 | classifier_filename = os.path.expanduser(classifier_filename) 407 | 408 | if path2: 409 | model = path2 # input("\nEnter the FOLDER PATH inside which 20180402-114759 FOLDER is present. 
Press ENTER stating that the FOLDER 20180402-114759 is present in this code directory itself: ").rstrip() 410 | else: 411 | model = "20180402-114759/20180402-114759.pb" 412 | 413 | # Create an object of face aligner module 414 | image_size = (160, 160) 415 | if face_dim: 416 | ask = face_dim # input("\nEnter desired face width and height in WidthxHeight format for face aligner to take OR press ENTER for default 160x160 pixel: ").rstrip().lower() 417 | image_size = tuple(map(int, ask.split('x'))) 418 | 419 | # Take gpu fraction values 420 | if gpu: 421 | gpu_fraction = gpu # input("\nEnter the gpu memory fraction u want to allocate out of 1 or press ENTER for default 0.8: ").rstrip() 422 | gpu_fraction = round(float(gpu_fraction), 1) 423 | 424 | else: 425 | gpu_fraction = 0.8 426 | 427 | # input_type = input("\nPress I for image input OR\nPress V for video input OR\nPress W for webcam input OR\nPress ENTER for default webcam: ").lstrip().rstrip().lower() 428 | # if input_type == "": 429 | # input_type = 'w' 430 | input_type = mode 431 | 432 | # Load the face aligner model 433 | affine = FaceAligner(desiredLeftEye=(0.33, 0.33), desiredFaceWidth=image_size[0], desiredFaceHeight=image_size[1]) 434 | 435 | # Building seperate graphs for both the tf architectures 436 | g1 = tf.Graph() 437 | g2 = tf.Graph() 438 | 439 | # Load the model for FaceNet image recognition 440 | with g1.as_default(): 441 | gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=gpu_fraction) 442 | sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False)) 443 | with tf.Session() as sess: 444 | facenet.load_model(model) 445 | 446 | # Load the model of MTCNN face detection. 447 | with g2.as_default(): 448 | gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=gpu_fraction) 449 | sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False)) 450 | with sess.as_default(): 451 | pnet, rnet, onet = detect_face.create_mtcnn(sess, None) 452 | 453 | # Some MTCNN network parameters 454 | minsize = 20 # minimum size of face 455 | threshold = [0.6, 0.7, 0.8] # Three steps's threshold 456 | factor = 0.709 # scale factor 457 | if thresh1: 458 | ask = thresh1 # input("\nEnter the threshold FACE DETECTION CONFIDENCE SCORE to consider detection by MTCNN OR press ENTER for default 0.80: ") 459 | if ask != "" and float(ask) < 1: 460 | threshold[2] = round(float(ask), 2) 461 | 462 | classifier_threshold = 0.50 463 | if thresh2: 464 | ask = thresh2 # input("\nEnter the threshold FACE RECOGNITION CONFIDENCE SCORE to consider face is recognised OR press ENTER for default 0.50: ") 465 | if ask != "": 466 | classifier_threshold = float(ask) 467 | 468 | # Loading the classifier model 469 | with open(classifier_filename, 'rb') as infile: 470 | (modelSVM, class_names) = pickle.load(infile) 471 | 472 | # helper variables 473 | image = [] 474 | device = [] 475 | display_output = True 476 | 477 | # Webcam variables 478 | loop_type = False 479 | res = (640, 480) 480 | 481 | # Video input variables 482 | total_frames = 0 483 | save_video = False 484 | frame_no = 1 485 | output_video = [] 486 | 487 | # image input type variables 488 | save_images = False 489 | image_folder = "" 490 | out_img_folder = "" 491 | imageNo = 1 492 | image_list = [] 493 | image_name = "" 494 | 495 | # If web cam is selected 496 | if input_type == "w": 497 | data_type = 0 498 | loop_type = True 499 | # Ask for webcam resolution 500 | if resolution: 501 | ask = resolution # input("\nEnter your webcam 
SUPPORTED resolution for face detection. For eg. 640x480 OR press ENTER for default 640x480: ").rstrip().lower() 502 | if ask != "": 503 | res = tuple(map(int, ask.split('x'))) 504 | 505 | # If image selected, go to image function. 506 | elif input_type == "i": 507 | 508 | # Create a list of images inside the given folder 509 | if img_path: 510 | image_folder = img_path # input("\nWrite the folder path inside which images are kept: ").rstrip().lstrip() 511 | for img in os.listdir(image_folder): 512 | image_list.append(img) 513 | total_frames = len(image_list) 514 | 515 | path = 'y' # vid_save #input("\nIf you want to save the output images to a folder press Y OR press ENTER to ignore it: ").lstrip().rstrip().lower() 516 | 517 | if path == "y": 518 | save_images = True 519 | path = out_img_path # input("\nEnter the location of output folder OR press ENTER to default create an output_images directory here only: ").lstrip().rstrip() 520 | if os.path.isdir(path) or path == "": 521 | # User given path is present. 522 | if path == "": 523 | path = "output_images" 524 | else: 525 | path += '/output_images' 526 | if os.path.isdir(path): 527 | print("Directory already exists. Using it \n") 528 | else: 529 | if not os.makedirs(path): 530 | print("Directory successfully made in: " + path + "\n") 531 | else: 532 | print("Error image folder path. Exiting") 533 | sys.exit() 534 | out_img_folder = path + "/" 535 | 536 | 537 | # Video is selected 538 | else: 539 | data_type = vid_path # input("\nWrite the video path file to open: ").rstrip().lstrip() 540 | ask = vid_save # input("\nPress y to save the output video OR simply press ENTER to ignore it: ").lstrip().rstrip().lower() 541 | if ask == "y": 542 | save_video = True 543 | 544 | if input_type != "w": 545 | ask = vid_see # input("\nSimply press ENTER to see the output video OR press N to switch off the display: ").lstrip().rstrip().lower() 546 | if ask != "y": 547 | display_output = False 548 | 549 | # Initialize webcam or video if no image format 550 | if input_type != "i": 551 | device = cv2.VideoCapture(data_type) 552 | 553 | # If webcam set resolution 554 | if input_type == "w": 555 | device.set(3, res[0]) 556 | device.set(4, res[1]) 557 | 558 | elif input_type == "v": 559 | # Finding total number of frames of video. 560 | total_frames = int(device.get(cv2.CAP_PROP_FRAME_COUNT)) 561 | # save video feature. 562 | if save_video: 563 | # Finding the file format, size and the fps rate 564 | fps = device.get(cv2.CAP_PROP_FPS) 565 | video_format = int(device.get(cv2.CAP_PROP_FOURCC)) 566 | frame_size = (int(device.get(cv2.CAP_PROP_FRAME_WIDTH)), int(device.get(cv2.CAP_PROP_FRAME_HEIGHT))) 567 | # Creating video writer to save the video after process if needed 568 | output_video = cv2.VideoWriter("/home/ml/Documents/attendance_dl/videos/dslr/Output_" + data_type, video_format, fps, frame_size) 569 | 570 | # Start web cam or start video and start creating dataset by user. 
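# Recognition loop: for every webcam frame, video frame or input image the code below
# (1) resizes the frame to 800x600 and round-trips it through HSV (the value offset is
# currently 0, so this step leaves the image unchanged), (2) runs the MTCNN networks
# (pnet/rnet/onet) in graph g2 to get bounding boxes and five facial landmarks per face,
# (3) aligns and prewhitens each face, (4) runs FaceNet in graph g1 to compute one
# embedding per face, (5) classifies the embeddings with the loaded SVM, and (6) draws
# boxes and landmarks, optionally saves the output, and passes the recognised names to
# mark_present(). Note that a new tf.Session is opened on g2 and g1 for every frame.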
571 | while loop_type or (frame_no <= total_frames): 572 | 573 | if input_type == "i": 574 | image = cv2.imread(image_folder + "/" + image_list[frame_no - 1]) 575 | else: 576 | ret, image = device.read() 577 | 578 | # Run MTCNN model to detect faces 579 | g2.as_default() 580 | with tf.Session(graph=g2) as sess: 581 | # we get the bounding boxes as well as the points for the face 582 | frame = image 583 | #/home/ml/Documents/attendance_dl/dataset/test.mp4 584 | image = cv2.resize(image, (800, 600)) 585 | 586 | 587 | hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV) 588 | value = 0 589 | h, s, v = cv2.split(hsv) 590 | v -= value 591 | #h -= value 592 | image = cv2.merge((h, s, v)) 593 | image = cv2.cvtColor(image, cv2.COLOR_HSV2BGR) ################################################################ 594 | 595 | #image = noisy('speckle', image) 596 | image = np.asarray(image, dtype = 'uint8') 597 | 598 | bb, points = detect_face.detect_face(image, minsize, pnet, rnet, onet, threshold, factor) 599 | 600 | # See if face is detected 601 | if bb.shape[0] > 0: 602 | 603 | # ALIGNMENT - use the bounding boxes and facial landmarks points to align images 604 | 605 | # create a numpy array to feed the network 606 | img_list = [] 607 | images = np.empty([bb.shape[0], image.shape[0], image.shape[1]]) 608 | 609 | for col in range(points.shape[1]): 610 | aligned_image = affine.align(image, points[:, col]) 611 | print(aligned_image) 612 | print("\n" + str(len(aligned_image))) 613 | 614 | # Prewhiten the image for facenet architecture to give better results 615 | mean = np.mean(aligned_image) 616 | std = np.std(aligned_image) 617 | std_adj = np.maximum(std, 1.0 / np.sqrt(aligned_image.size)) 618 | ready_image = np.multiply(np.subtract(aligned_image, mean), 1 / std_adj) 619 | img_list.append(ready_image) 620 | images = np.stack(img_list) 621 | 622 | # EMBEDDINGS: Use the processed aligned images for Facenet embeddings 623 | 624 | g1.as_default() 625 | with tf.Session(graph=g1) as sess: 626 | # Run forward pass on FaceNet to get the embeddings 627 | images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0") 628 | embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0") 629 | phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0") 630 | feed_dict = {images_placeholder: images, phase_train_placeholder: False} 631 | embedding = sess.run(embeddings, feed_dict=feed_dict) 632 | 633 | # PREDICTION: use the classifier to predict the most likely class (person). 
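# predict_proba returns an array of shape (number of aligned faces, number of classes);
# the most probable class per face is taken with argmax, and a face is labelled only if
# that probability exceeds classifier_threshold. Faces below the threshold still get a
# bounding box and landmarks but no name; only names above the threshold are appended
# to st_name and marked present in the attendance sheet.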
634 | predictions = modelSVM.predict_proba(embedding) 635 | best_class_indices = np.argmax(predictions, axis=1) 636 | best_class_probabilities = predictions[np.arange(len(best_class_indices)), best_class_indices] 637 | 638 | # DRAW: draw bounding boxes, landmarks and predicted names 639 | 640 | if save_video or display_output or save_images: 641 | for i in range(bb.shape[0]): 642 | cv2.rectangle(image, (int(bb[i][0]), int(bb[i][1])), (int(bb[i][2]), int(bb[i][3])), (0, 255, 0), 1) 643 | 644 | # Put name and probability of detection only if given threshold is crossed 645 | if best_class_probabilities[i] > classifier_threshold: 646 | cv2.putText(image, class_names[best_class_indices[i]], (int(bb[i][0] + 1), int(bb[i][1]) + 10), cv2.FONT_HERSHEY_COMPLEX_SMALL, 0.6, (255, 255, 0), 1, cv2.LINE_AA) 647 | print(class_names[best_class_indices[i]]) 648 | st_name += ',' 649 | st_name += class_names[best_class_indices[i]] 650 | mark_present(st_name) 651 | #cv2.waitKey(0) 652 | #cv2.putText(image, str(round(best_class_probabilities[i] * 100, 2)) + "%", (int(bb[i][0]), int(bb[i][3]) + 7), cv2.FONT_HERSHEY_COMPLEX_SMALL, 1, (0, 255, 255), 1, cv2.LINE_AA) 653 | 654 | # loop over the (x, y)-coordinates for the facial landmarks 655 | for col in range(points.shape[1]): 656 | for i in range(5): 657 | cv2.circle(image, (int(points[i][col]), int(points[i + 5][col])), 1, (0, 255, 0), 1) 658 | 659 | if display_output: 660 | cv2.imshow("Output", image) 661 | if save_video: 662 | output_video.write(image) 663 | if save_images: 664 | output_name = out_img_folder + image_list[frame_no - 1] 665 | # Just taking the initial name of the input image and save in jpg which opencv supports for sure 666 | # output_name = out_img_folder + image_list[frame_no-1].split(".")[0] + ".jpg" 667 | cv2.imwrite(output_name, image) 668 | 669 | # If video or images selected dec counter 670 | if loop_type == False: 671 | # Display the progress 672 | print("\nProgress: %.2f" % (100 * frame_no / total_frames) + "%") 673 | frame_no += 1 674 | 675 | # if the `q` key was pressed, break from the loop 676 | if cv2.waitKey(1) == 'q': 677 | # do a bit of cleanup 678 | if save_video: 679 | output_video.release() 680 | device.release() 681 | cv2.destroyAllWindows() 682 | break 683 | 684 | return st_name 685 | 686 | 687 | 688 | 689 | 690 | if __name__ == '__main__': 691 | main() 692 | 693 | 694 | 695 | -------------------------------------------------------------------------------- /images/image2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aashishrai3799/Automated-Attendance-System-using-CNN/ed8ce04100558c91288a1faa5aba7609233701dd/images/image2.png -------------------------------------------------------------------------------- /images/image3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aashishrai3799/Automated-Attendance-System-using-CNN/ed8ce04100558c91288a1faa5aba7609233701dd/images/image3.png -------------------------------------------------------------------------------- /images/image4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aashishrai3799/Automated-Attendance-System-using-CNN/ed8ce04100558c91288a1faa5aba7609233701dd/images/image4.png -------------------------------------------------------------------------------- /images/image5.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/aashishrai3799/Automated-Attendance-System-using-CNN/ed8ce04100558c91288a1faa5aba7609233701dd/images/image5.png -------------------------------------------------------------------------------- /images/images.txt: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /sheet.py: -------------------------------------------------------------------------------- 1 | import xlwt 2 | import xlrd 3 | from xlutils.copy import copy 4 | import os 5 | import datetime 6 | import xlsxwriter 7 | 8 | st_name = 'Aashish' 9 | def mark_present(st_name): 10 | 11 | names = os.listdir('output/') 12 | print(names) 13 | 14 | sub = 'SAMPLE' 15 | 16 | if not os.path.exists('attendance/' + sub + '.xlsx'): 17 | count = 2 18 | workbook = xlsxwriter.Workbook('attendance/' + sub + '.xlsx') 19 | print("Creating Spreadsheet with Title: " + sub) 20 | sheet = workbook.add_worksheet() 21 | for i in names: 22 | sheet.write(count, 0, i) 23 | count += 1 24 | workbook.close() 25 | 26 | rb = xlrd.open_workbook('attendance/' + sub + '.xlsx') 27 | wb = copy(rb) 28 | sheet = wb.get_sheet(0) 29 | sheet.write(1,1,str(datetime.datetime.now())) 30 | 31 | 32 | count = 2 33 | for i in names: 34 | if i in st_name: 35 | sheet.write(count, 1, 'P') 36 | else: 37 | sheet.write(count, 1, 'A') 38 | sheet.write(count, 0, i) 39 | count += 1 40 | 41 | wb.save('attendance/' + sub + '.xlsx') 42 | 43 | 44 | mark_present(st_name) 45 | -------------------------------------------------------------------------------- /user_interface.py: -------------------------------------------------------------------------------- 1 | import tkinter as tk 2 | from tkinter import * 3 | import tkinter 4 | from tkinter import filedialog 5 | from tkinter import ttk, StringVar, IntVar 6 | from PIL import ImageTk, Image 7 | from tkinter import messagebox 8 | from PIL import Image 9 | import final_sotware 10 | import xlwt 11 | from xlwt import Workbook 12 | 13 | 14 | def s_exit(): 15 | exit(0) 16 | 17 | 18 | def putwindow(): 19 | 20 | window = Tk() 21 | window.geometry("800x500") 22 | #window.configure(background='') 23 | window.title("Attendance System") 24 | #window.geometry("800x500") 25 | tkinter.Label(window, text = "^^ WELCOME TO FACE DETECTION AND RECOGNITION SOFTWARE ^^", fg = "black", bg = "darkorange").pack(fill = "x") 26 | tkinter.Label(window, text = '- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -', bg = 'orange').pack(fill = 'x') 27 | tkinter.Label(window, text = '- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -', bg = 'orange').pack(fill = 'x') 28 | 29 | tkinter.Label(window, text = "GUIDELINES TO USE THIS SOFTWARE", fg = "black", bg = "salmon1").pack(fill = "x") 30 | 31 | tkinter.Label(window, text = " ").pack(fill = 'y') 32 | 33 | tkinter.Label(window, text = ": \n\n" 34 | "This software allows user to:\n\n" 35 | "1) CREATE DATASET using MTCNN face detection and alignment\n" 36 | "2) TRAIN FaceNet for face recognition \n" 37 | "3) Do both \n\n\n " 38 | , fg = "black", bg = "aquamarine").pack(fill = "y") 39 | 40 | tkinter.Label(window, text = "\n\n").pack(fill = 'y') 41 | 42 | 43 | tkinter.Label(window, text = ": \n\n" 44 | "The user will multiple times get option to choose webcam (default option) or \n" 45 | "video file to do face detection and will be asked for output folder, username\n" 46 | "on 
folder and image files etc also (default options exists for that too) \n\n\n " 47 | "************** IMPORTANT *************\n\n" 48 | "1) Whenever webcam or video starts press 's' keyword to start face detection in video or webcam frames \n" 49 | " and save the faces in the folder for a single user. This dataset creation will stop the moment you \n" 50 | " release the 's' key. This can be done multiple times. \n\n" 51 | "2) Press 'q' to close it when you are done with one person, and want to detect face for another person.\n\n" 52 | "3) Make sure you press the keywords on the image window and not the terminal window. \n" 53 | , fg = "black", bg = "gray").pack(fill = "y") 54 | 55 | def cont_inue(): 56 | window.destroy() 57 | show() 58 | 59 | btn1 = tkinter.Button(window, text = "CONTINUE", fg = "black", bg = 'turquoise1', command = cont_inue) 60 | btn1.place(x=360, y=450, width=80) 61 | 62 | 63 | 64 | 65 | window.mainloop() 66 | 67 | 68 | def show(): 69 | #putwindow.window.destroy() 70 | window2 = Tk() 71 | window2.title("Attendance System") 72 | window2.geometry("800x500") 73 | tkinter.Label(window2, text = "^^ WELCOME TO FACE DETECTION AND RECOGNITION SOFTWARE ^^", fg="black", bg="darkorange").pack(fill="x") 74 | tkinter.Label(window2, text = '- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -', bg='orange').pack(fill='x') 75 | tkinter.Label(window2, text = '- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -', bg='orange').pack(fill='x') 76 | 77 | #tkinter.Label(window2, text="TEST", fg="lightblue", bg="gray").pack(fill="x") 78 | tkinter.Label(window2, text = "\n\n ").pack(fill = 'y') 79 | 80 | 81 | tkinter.Label(window2, text = "Click 'TRAIN' to TRAIN and maybe 'TEST later' by making a classifer on the facenet model.\n\n" 82 | "Click 'TEST' to TEST on previously MTCNN created dataset by loading already created \n" 83 | "facenet classification model. \n\n" 84 | "Click 'CREATE' to first create dataset and then 'maybe' train later. \n\n" 85 | "Click 'RUN' to TEST by loading a classifier model and using webcam OR user given video \n" 86 | "OR given set of images (save option is also available). 
\n", 87 | fg = 'blue', bg = 'pink').pack(fill = 'y') 88 | 89 | 90 | bottom_frame = tkinter.Frame(window2).pack(side = "bottom") 91 | 92 | 93 | 94 | def train(): 95 | print('train') 96 | window2.destroy() 97 | show_train() 98 | 99 | def test(): 100 | print('test') 101 | window2.destroy() 102 | show_test() 103 | 104 | def create(): 105 | print('create') 106 | window2.destroy() 107 | show_create() 108 | 109 | def run(): 110 | print('run') 111 | window2.destroy() 112 | show_run() 113 | 114 | 115 | btn1 = tkinter.Button(bottom_frame, text = "TRAIN", fg = "black", bg = 'turquoise1', command = train) 116 | btn1.place(x=230, y=350, width=50) 117 | 118 | btn2 = tkinter.Button(bottom_frame, text = "TEST", fg = "black", bg = 'turquoise1', command = test) 119 | btn2.place(x=330, y=350, width=50) 120 | 121 | btn3 = tkinter.Button(bottom_frame, text = "CREATE", fg = "black", bg = 'turquoise1', command = create) 122 | btn3.place(x=430, y=350, width=50) 123 | 124 | btn3 = tkinter.Button(bottom_frame, text = "RUN", fg = "black", bg = 'turquoise1', command = run) 125 | btn3.place(x=530, y=350, width=50) 126 | 127 | btn4 = tkinter.Button(bottom_frame, text = "EXIT", fg = "red", bg = 'indianred1', command = s_exit) 128 | btn4.place(x=370, y=450, width=60) 129 | 130 | window2.mainloop() 131 | 132 | 133 | 134 | def show_run(): 135 | 136 | window3 = Tk() 137 | #window3.configure(background='lightyellow') 138 | window3.title("Attendance System") 139 | window3.geometry("1200x800") 140 | tkinter.Label(window3, text = "^^ WELCOME TO FACE DETECTION AND RECOGNITION SOFTWARE ^^", fg="black", bg="darkorange").pack(fill="x") 141 | tkinter.Label(window3, text = '- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -', bg='orange').pack(fill='x') 142 | tkinter.Label(window3, text="\n\n ").pack(fill='y') 143 | 144 | #path1 = tk.StringVar() 145 | 146 | tkinter.Label(window3, text = "Enter the path to classifier.pkl").place(x=50, y=50, width=250) 147 | path1 = tkinter.Entry(window3) 148 | path1.place(x=60, y=70, width=400) 149 | 150 | tkinter.Label(window3, text = "Enter the path to 20180402-114759 FOLDER").place(x=50, y=100, width=300) 151 | path2 = tkinter.Entry(window3) 152 | path2.place(x=60, y=120, width=400) 153 | 154 | tkinter.Label(window3, text = "Enter desired face width and height for face aligner (WidthxHeight format)").place(x=50, y=150, width=530) 155 | face_dim = tkinter.Entry(window3) 156 | face_dim.place(x=60, y=170, width=400) 157 | 158 | tkinter.Label(window3, text = "Enter the gpu memory fraction u want to allocate out of 1").place(x=50, y=200, width=420) 159 | gpu = tkinter.Entry(window3) 160 | gpu.place(x=60, y=220, width=400) 161 | 162 | tkinter.Label(window3, text = "Enter the threshold to consider detection by MTCNN").place(x=50, y=250, width=380) 163 | thresh1 = tkinter.Entry(window3) 164 | thresh1.place(x=60, y=270, width=400) 165 | 166 | tkinter.Label(window3, text = "Enter the threshold to consider face is recognised").place(x=50, y=300, width=380) 167 | thresh2 = tkinter.Entry(window3) 168 | thresh2.place(x=60, y=320, width=400) 169 | 170 | tkinter.Label(window3, text="Default values would be assigned to empyt fields", fg="navy", bg="lightblue").place(x=700, y=200, width=380) 171 | tkinter.Label(window3, bg = 'orange') 172 | #Label.place(y = 350, width = 1400) 173 | 174 | rdbtn1 = IntVar() 175 | rdbtn2 = IntVar() 176 | rdbtn3 = IntVar() 177 | 178 | rdbi = tkinter.Checkbutton(window3, text="Input: Image", variable = rdbtn3, fg="blue", bg="cyan") 
179 | rdbi.place(x=60, y=400, width=200) 180 | 181 | #tkinter.Label(window3, text="Input: Image", fg="blue", bg="cyan").place(x=60, y=400, width=200) 182 | tkinter.Label(window3, text = "Enter the folder path inside which images are kept").place(x=300, y=400, width=360) 183 | img_path = tkinter.Entry(window3) 184 | img_path.place(x=300, y=420, width=400) 185 | 186 | tkinter.Label(window3, text = "Enter folder path inside which output images are to be saved").place(x=720, y=400, width=420) 187 | out_img_path = tkinter.Entry(window3) 188 | out_img_path.place(x=720, y=420, width=400) 189 | 190 | rdbv = tkinter.Checkbutton(window3, text="Input: Video", variable = rdbtn2, fg="blue", bg="cyan") 191 | rdbv.place(x=60, y=450, width=200) 192 | 193 | #tkinter.Label(window3, text="Input: Video", fg="blue", bg="cyan").place(x=60, y=450, width=200) 194 | tkinter.Label(window3, text = "Enter path to the video file").place(x=300, y=450, width=200) 195 | vid_path = tkinter.Entry(window3) 196 | vid_path.place(x=300, y=470, width=400) 197 | 198 | tkinter.Label(window3, text = "To Save output video type 'y'").place(x=720, y=450, width=200) 199 | vid_save = tkinter.Entry(window3) 200 | vid_save.place(x=720, y=470, width=100) 201 | 202 | tkinter.Label(window3, text = "To See output video type y").place(x=950, y=450, width=200) 203 | vid_see = tkinter.Entry(window3) 204 | vid_see.place(x=960, y=470, width=100) 205 | 206 | rdbw = tkinter.Checkbutton(window3, text="Input: Webcam", variable = rdbtn1, fg="blue", bg="cyan") 207 | rdbw.place(x=60, y=500, width=200) 208 | tkinter.Label(window3, text = "Enter your supported webcam resolution (eg 640x480)").place(x=300, y=500, width=380) 209 | resolution = tkinter.Entry(window3) 210 | resolution.place(x=300, y=520, width=400) 211 | 212 | #parameters = path1, path2, face_dim, gpu, thresh1, thresh2, resolution 213 | 214 | 215 | def submit(): 216 | print('submit') 217 | if rdbtn1.get(): 218 | print('Webcam') 219 | mode = 'w' 220 | elif rdbtn2.get(): 221 | print('Video') 222 | mode = 'v' 223 | elif rdbtn3.get(): 224 | print('Image') 225 | mode = 'i' 226 | else: 227 | print('default') 228 | mode = 'w' 229 | 230 | print(mode) 231 | parameters = path1.get(), path2.get(), face_dim.get(), gpu.get(), thresh1.get(), thresh2.get(), resolution.get(), \ 232 | img_path.get(), out_img_path.get(), vid_path.get(), vid_save.get(), vid_see.get() 233 | print(parameters) 234 | #mode = 'w' 235 | st_name = final_sotware.recognize(mode, parameters) 236 | print('students recognised', st_name) 237 | 238 | def mark_attend(): 239 | if rdbtn1.get(): 240 | print('Webcam') 241 | mode = 'w' 242 | elif rdbtn2.get(): 243 | print('Video') 244 | mode = 'v' 245 | elif rdbtn3.get(): 246 | print('Image') 247 | mode = 'i' 248 | else: 249 | print('default') 250 | mode = 'w' 251 | 252 | print(mode) 253 | parameters = path1.get(), path2.get(), face_dim.get(), gpu.get(), thresh1.get(), thresh2.get(), resolution.get(), \ 254 | img_path.get(), out_img_path.get(), vid_path.get(), vid_save.get(), vid_see.get() 255 | run_attend(mode, parameters) 256 | 257 | btn9 = tkinter.Button(window3, text = "CONTINUE", fg = "black", bg = 'turquoise1', command = submit) 258 | btn9.place(x=550, y=600, width=90) 259 | 260 | btn11 = tkinter.Button(window3, text = "Mark Attendance", fg = "black", bg = 'turquoise1', command = mark_attend) 261 | btn11.place(x=635, y=630, width=120) 262 | 263 | btn10 = tkinter.Button(window3, text = "EXIT", fg = "red", bg = 'indianred1', command = s_exit) 264 | btn10.place(x=750, y=600, width=60) 265 | 266 
|     def home():
267 |         window3.destroy()
268 |         gotohome()
269 |
270 |     btn12 = tkinter.Button(window3, text = "HOME", fg = "black", bg = 'turquoise1', command = home)
271 |     btn12.place(x=650, y=600, width=90)
272 |
273 |
274 |
275 |     window3.mainloop()
276 |
277 | def run_attend(mode, parameters):  # forwards the Run dialog's selections to the recognizer
278 |     present = final_sotware.recognize(mode, parameters)
279 |
280 |
281 | def show_create():  # dialog for creating a face dataset from a webcam or a video file
282 |
283 |     window4 = Tk()
284 |     window4.title("Attendance System")
285 |     window4.geometry("800x500")
286 |     tkinter.Label(window4, text = "^^ WELCOME TO FACE DETECTION AND RECOGNITION SOFTWARE ^^", fg="black", bg="darkorange").pack(fill="x")
287 |     tkinter.Label(window4, text = '- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -', bg='orange').pack(fill='x')
288 |     tkinter.Label(window4, text="\n\n ").pack(fill='y')
289 |
290 |     tkinter.Label(window4, text = "Enter the path to output folder").place(x=50, y=50, width=240)
291 |     path1 = tkinter.Entry(window4)
292 |     path1.place(x=60, y=70, width=400)
293 |
294 |     tkinter.Label(window4, text = "Enter your supported webcam resolution (eg 640x480)").place(x=50, y=100, width=380)
295 |     webcam = tkinter.Entry(window4)
296 |     webcam.place(x=60, y=120, width=400)
297 |
298 |     tkinter.Label(window4, text = "Enter the GPU memory fraction you want to allocate (out of 1)").place(x=50, y=150, width=430)
299 |     gpu = tkinter.Entry(window4)
300 |     gpu.place(x=60, y=170, width=400)
301 |
302 |     tkinter.Label(window4, text = "Enter desired face width and height (WidthxHeight format)").place(x=50, y=200, width=430)
303 |     face_dim = tkinter.Entry(window4)
304 |     face_dim.place(x=60, y=220, width=400)
305 |
306 |     tkinter.Label(window4, text = "Enter user name (default: person)").place(x=50, y=250, width=260)
307 |     username = tkinter.Entry(window4)
308 |     username.place(x=60, y=270, width=400)
309 |
310 |     tkinter.Label(window4, text = "Create dataset using:").place(x=50, y=300, width=180)
311 |
312 |     rdbtn1 = IntVar()
313 |     rdbtn2 = IntVar()
314 |
315 |     rdbv = tkinter.Checkbutton(window4, text="Video", variable = rdbtn1, fg="black", bg="skyblue1")
316 |     rdbv.place(x=220, y=300, width=80)
317 |
318 |     rdbw = tkinter.Checkbutton(window4, text="Webcam", variable = rdbtn2, fg="black", bg="skyblue1")
319 |     rdbw.place(x=320, y=300, width=80)
320 |
321 |     tkinter.Label(window4, text = "Enter video path (if applicable)").place(x=50, y=330, width=250)
322 |     vid_path = tkinter.Entry(window4)
323 |     vid_path.place(x=60, y=350, width=400)
324 |
325 |     tkinter.Label(window4, text="Default values will be assigned to empty fields", fg="navy", bg="lightblue").place(x=50, y=450, width=380)
326 |
327 |     tkinter.Label(window4, bg = 'orange').place(y = 485, width = 800)
328 |
329 |     get_f = 0
330 |
331 |     def submit():  # collect the form fields and hand them to the dataset-creation backend
332 |         #vid_path2 = ''
333 |         print('submit')
334 |         '''if rdbtn1.get():
335 |             print('Video')
336 |             #vid_path = '/home/aashish/Documents/deep_learning/attendance_deep_learning/scripts_used/video/uri1.webm'
337 |             vid_path2 = vid_path.get()
338 |
339 |         elif rdbtn2.get():
340 |             print('Webcam')
341 |             vid_path = ''
342 |         else:
343 |             print('default')
344 |             vid_path = '' '''
345 |
346 |         parameters = path1.get(), webcam.get(), face_dim.get(), gpu.get(), username.get(), vid_path.get()
347 |         print(parameters)
348 |         # mode = 'w'
349 |         get_f = final_sotware.dataset_creation(parameters)
350 |
351 |         if get_f == 1:
352 |             tkinter.messagebox.showinfo("Attendance", "Dataset Created")
353 |
354 |
355 |     btn9 = tkinter.Button(window4, text = "CONTINUE", fg = "black", bg = 'turquoise1', command = submit)
356 |     btn9.place(x=650, y=200, width=90)
357 |
358 |     btn9 = tkinter.Button(window4, text = "EXIT", fg = "red", bg = 'indianred1', command = s_exit)
359 |     btn9.place(x=650, y=300, width=90)
360 |
361 |     def home():
362 |         window4.destroy()
363 |         gotohome()
364 |
365 |     btn10 = tkinter.Button(window4, text = "HOME", fg = "black", bg = 'turquoise1', command = home)
366 |     btn10.place(x=650, y=250, width=90)
367 |
368 |     window4.mainloop()
369 |
370 |
371 |
372 | def show_train():  # dialog for training the SVM classifier on the created dataset
373 |
374 |     window5 = Tk()
375 |     window5.title("Attendance System")
376 |     window5.geometry("800x500")
377 |     tkinter.Label(window5, text = "^^ WELCOME TO FACE DETECTION AND RECOGNITION SOFTWARE ^^", fg="black", bg="darkorange").pack(fill="x")
378 |     tkinter.Label(window5, text = '- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -', bg='orange').pack(fill='x')
379 |     tkinter.Label(window5, text="\n\n ").pack(fill='y')
380 |
381 |     tkinter.Label(window5, text = "Enter the path to dataset folder").place(x=50, y=50, width=250)
382 |     path1 = tkinter.Entry(window5)
383 |     path1.place(x=60, y=70, width=400)
384 |
385 |     tkinter.Label(window5, text = "Enter the path to 20180402-114759 FOLDER").place(x=50, y=100, width=300)
386 |     path2 = tkinter.Entry(window5)
387 |     path2.place(x=60, y=120, width=400)
388 |
389 |     tkinter.Label(window5, text = "Enter the GPU memory fraction you want to allocate (out of 1)").place(x=50, y=150, width=430)
390 |     gpu = tkinter.Entry(window5)
391 |     gpu.place(x=60, y=170, width=400)
392 |
393 |     tkinter.Label(window5, text = "Enter the batch size of images to process at once").place(x=50, y=200, width=370)
394 |     batch = tkinter.Entry(window5)
395 |     batch.place(x=60, y=220, width=400)
396 |
397 |     tkinter.Label(window5, text = "Enter input image dimension (eg. 160)").place(x=50, y=250, width=285)
398 |     img_dim = tkinter.Entry(window5)
399 |     img_dim.place(x=60, y=270, width=400)
400 |
401 |     tkinter.Label(window5, text = "Enter output SVM classifier filename").place(x=50, y=300, width=275)
402 |     svm_name = tkinter.Entry(window5)
403 |     svm_name.place(x=60, y=320, width=400)
404 |
405 |     tkinter.Label(window5, text = "Split dataset into training and testing:").place(x=50, y=350, width=305)
406 |
407 |     chkbtn1 = IntVar()
408 |     chkbtn2 = IntVar()
409 |
410 |     ckbt1 = tkinter.Checkbutton(window5, text="Yes", variable = chkbtn1, fg="black", bg="skyblue1")
411 |     ckbt1.place(x=350, y=350, width=50)
412 |
413 |     ckbt2 = tkinter.Checkbutton(window5, text="No", variable = chkbtn2, fg="black", bg="skyblue1")
414 |     ckbt2.place(x=410, y=350, width=50)
415 |
416 |     tkinter.Label(window5, text = "Enter split percentage (if applicable)").place(x=50, y=380, width=290)
417 |     split_percent = tkinter.Entry(window5)
418 |     split_percent.place(x=60, y=400, width=400)
419 |
420 |     tkinter.Label(window5, text="Default values will be assigned to empty fields", fg="navy", bg="lightblue").place(x=60, y=450, width=380)
421 |
422 |     def submit():  # collect the form fields and hand them to the training backend
423 |
424 |         print('submit')
425 |         if chkbtn1.get():
426 |             print('Yes')
427 |             split_data = 'y'
428 |
429 |         elif chkbtn2.get():
430 |             print('No')
431 |             split_data = ''
432 |         else:
433 |             print('default')
434 |             split_data = 'y'
435 |
436 |         parameters = path1.get(), path2.get(), batch.get(), img_dim.get(), gpu.get(), svm_name.get(), split_percent.get(), split_data
437 |         print(parameters)
438 |         # mode = 'w'
439 |         get_f = final_sotware.train(parameters)
440 |
441 |         if get_f == 1:
442 |             tkinter.messagebox.showinfo("Title", "Training Completed")
443 |
444 |
445 |     btn9 = tkinter.Button(window5, text = "CONTINUE", fg = "black", bg = 'turquoise1', command = submit)
446 |     btn9.place(x=650, y=200, width=90)
447 |
448 |     btn9 = tkinter.Button(window5, text = "EXIT", fg = "red", bg = 'indianred1', command = s_exit)
449 |     btn9.place(x=650, y=300, width=90)
450 |
451 |     def home():
452 |         window5.destroy()
453 |         gotohome()
454 |
455 |     btn10 = tkinter.Button(window5, text = "HOME", fg = "black", bg = 'turquoise1', command = home)
456 |     btn10.place(x=650, y=250, width=90)
457 |
458 |
459 |     window5.mainloop()
460 |
461 |
462 |
463 | def show_test():  # dialog for evaluating the trained classifier on a dataset
464 |
465 |     window6 = Tk()
466 |     window6.title("Attendance System")
467 |     window6.geometry("800x500")
468 |     tkinter.Label(window6, text = "^^ WELCOME TO FACE DETECTION AND RECOGNITION SOFTWARE ^^", fg="black", bg="darkorange").pack(fill="x")
469 |     tkinter.Label(window6, text = '- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -', bg='orange').pack(fill='x')
470 |     tkinter.Label(window6, text="\n\n ").pack(fill='y')
471 |
472 |     tkinter.Label(window6, text = "Enter the path to classifier.pkl").place(x=50, y=50, width=250)
473 |     path1 = tkinter.Entry(window6)
474 |     path1.place(x=60, y=70, width=400)
475 |
476 |     tkinter.Label(window6, text = "Enter the path to 20180402-114759 FOLDER").place(x=50, y=100, width=300)
477 |     path2 = tkinter.Entry(window6)
478 |     path2.place(x=60, y=120, width=400)
479 |
480 |     tkinter.Label(window6, text="Enter path to dataset folder").place(x=50, y=150, width=230)
481 |     path3 = tkinter.Entry(window6)
482 |     path3.place(x=60, y=170, width=400)
483 |
484 |     tkinter.Label(window6, text="Enter the batch size of images to process at once").place(x=50, y=200, width=370)
485 |     batch = tkinter.Entry(window6)
486 |     batch.place(x=60, y=220, width=400)
487 |
488 |     tkinter.Label(window6, text="Enter input image dimension (eg. 160)").place(x=50, y=250, width=285)
489 |     img_dim = tkinter.Entry(window6)
490 |     img_dim.place(x=60, y=270, width=400)
491 |
492 |     tkinter.Label(window6, text="Default values will be assigned to empty fields", fg="navy", bg="lightblue").place(x=60, y=450, width=380)
493 |
494 |     def submit():  # collect the form fields and hand them to the testing backend
495 |
496 |         gpu = 0.8
497 |         parameters = path1.get(), path2.get(), path3.get(), batch.get(), img_dim.get(), gpu
498 |         print(parameters)
499 |         get_f = final_sotware.test(parameters = parameters)
500 |
501 |         if get_f == 1:
502 |             tkinter.messagebox.showinfo("Title", "Testing Completed")
503 |
504 |
505 |     btn9 = tkinter.Button(window6, text = "CONTINUE", fg = "black", bg = 'turquoise1', command = submit)
506 |     btn9.place(x=650, y=200, width=90)
507 |
508 |     btn9 = tkinter.Button(window6, text = "EXIT", fg = "red", bg = 'indianred1', command = s_exit)
509 |     btn9.place(x=650, y=300, width=90)
510 |
511 |     def home():
512 |         window6.destroy()
513 |         gotohome()
514 |
515 |     btn10 = tkinter.Button(window6, text = "HOME", fg = "black", bg = 'turquoise1', command = home)
516 |     btn10.place(x=650, y=250, width=90)
517 |
518 |
519 |
520 |     window6.mainloop()
521 |
522 |
523 |
524 |
525 |
526 | def gotohome():  # return to the main menu window
527 |
528 |     show()
529 |     #show_test()
530 |     #show_train()
531 |     #putwindow()
532 |     #show_run()
533 |     #show_create()
534 |
535 | if __name__ == '__main__':
536 |     putwindow()
537 |     #show_attend()
538 |
--------------------------------------------------------------------------------
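
For reference, the `submit()` callbacks above fix the argument order that each `final_sotware` entry point receives from the GUI. The sketch below is illustrative only: it drives the same entry points without the Tkinter dialogs, all paths and values are placeholders, and it assumes (as the dialog labels state) that empty strings fall back to the backend's defaults.

```python
# Illustrative sketch: call the final_sotware entry points the GUI wraps, directly.
# Argument order is taken from the parameter tuples built in user_interface.py;
# every path and value below is a placeholder, and empty strings are assumed to
# trigger the backend's defaults.
import final_sotware

# Dataset creation: (output folder, webcam resolution, face WxH, GPU fraction, user name, video path)
final_sotware.dataset_creation(('output/', '640x480', '160x160', '0.8', 'person', ''))

# Training: (dataset folder, 20180402-114759 folder, batch size, image dim,
#            GPU fraction, classifier filename, split percentage, split flag)
final_sotware.train(('output/', '20180402-114759/', '', '160', '0.8', 'classifier.pkl', '', 'y'))

# Testing: (classifier path, 20180402-114759 folder, dataset folder, batch size, image dim, GPU fraction)
final_sotware.test(parameters=('classifier.pkl', '20180402-114759/', 'output/', '', '160', 0.8))

# Attendance marking is wrapped by run_attend(mode, parameters) above:
# final_sotware.recognize(mode, parameters)
```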