├── LICENSE
├── README.md
├── align_dataset_mtcnn.py
├── det1.npy
├── det2.npy
├── det3.npy
├── detect_face.py
├── face_aligner.py
├── face_detect.py
├── facenet.py
├── final_sotware.py
├── images
│   ├── image2.png
│   ├── image3.png
│   ├── image4.png
│   ├── image5.png
│   └── images.txt
├── sheet.py
└── user_interface.py
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2019 aashishrai3799
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # This is the official implementation of
2 |
3 | ## An End-to-End Real-Time Face Identification and Attendance System using Convolutional Neural Networks
4 | (https://ieeexplore.ieee.org/document/9029001)
5 |
6 |
7 |
8 | An end-to-end face identification and attendance approach using Convolutional Neural Networks (CNN): it processes CCTV footage or a video of the class and marks the attendance of the entire class simultaneously. One of the main advantages of the proposed solution is its robustness against common challenges such as occlusion (partially visible/covered faces), orientation, alignment and illumination of the classroom.
9 |
10 | # Libraries
11 | 1. Tensorflow 1.14
12 | 2. Numpy
13 | 3. OpenCV
14 | 4. MTCNN
15 | 5. Sklearn
16 | 6. xlsxwriter, xlrd
17 | 7. scipy
18 | 8. pickle
19 |
20 |
21 | # How to use
22 |
23 | ## Installation
24 | 1. Install the required libraries (a Conda environment is preferred).
25 | 2. Download the pre-trained model from [[link]](https://drive.google.com/open?id=1EXPBSXwTaqrSC0OhUdXNmKSh9qJUQ55-) and copy it to the main directory.
26 | 3. Make sure you have the directory structure mentioned below (you have to manually create two folders named "attendance" and "output" in the main directory | refer to the "Main" directory structure).
27 | 4. To verify that everything is installed correctly, run 'user_interface.py' (a quick import check is also sketched below).
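
A quick way to confirm the core dependencies before launching the UI (a minimal sketch; it covers the libraries listed above except MTCNN, whose pip package name may differ from the import name used in this repository):

```python
# Environment sanity check: imports fail loudly if a dependency is missing.
import pickle
import cv2, numpy, scipy, sklearn, xlrd, xlsxwriter
import tensorflow as tf

print('TensorFlow', tf.__version__)   # expected: 1.14.x
print('OpenCV', cv2.__version__)
print('NumPy', numpy.__version__)
```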
28 |
29 | ## Create Dataset
30 | 1. Run 'user_interface.py'
31 | 2. Click on the 'Create' button.
32 | 3. Select 'webcam' if you wish to create a live dataset. (You can leave all other fields empty.)
33 | 4. Click on the 'Continue' button to start streaming webcam feed.
34 | 5. Press 's' to save the face images. Capture as many images as you can (approx. 80-100 preferred); the capture loop is sketched after this list.
35 | 6. Press 'q' to exit.
36 | 7. Repeat the above steps to create datasets for other people.
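
For reference, the webcam capture behind the 'Create' step behaves roughly like the OpenCV loop below (a minimal sketch with illustrative names; the actual implementation lives in this repository's UI code and may differ):

```python
import os
import cv2

person = 'person_name'                      # hypothetical label for the new dataset
out_dir = os.path.join('images', person)
os.makedirs(out_dir, exist_ok=True)

cap = cv2.VideoCapture(0)                   # live webcam feed
saved = 0
while True:
    ret, frame = cap.read()
    if not ret:
        break
    cv2.imshow('Create dataset', frame)
    key = cv2.waitKey(1) & 0xFF
    if key == ord('s'):                     # 's' saves the current frame
        cv2.imwrite(os.path.join(out_dir, '%03d.png' % saved), frame)
        saved += 1
    elif key == ord('q'):                   # 'q' exits
        break

cap.release()
cv2.destroyAllWindows()
```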
37 |
38 | ## Training
39 | 1. Run 'user_interface.py'
40 | 2. Click on the 'Train' button.
41 | 3. Training may take several minutes (depending on your system configuration).
42 | 4. Once training is complete, a 'classifier.pkl' file will be generated (a rough sketch of what it contains follows this list).
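
The 'Train' step ends with a pickled classifier. The sketch below only illustrates the general shape of such a file, assuming an sklearn SVC fitted on FaceNet embeddings and pickled together with the class names; the exact routine and file layout are defined by this repository's training code:

```python
import pickle
import numpy as np
from sklearn.svm import SVC

# Placeholder data: in the real pipeline each row is a FaceNet embedding of a face crop.
emb_array = np.random.rand(10, 512)
labels = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]                  # class index per embedding
class_names = ['alice', 'bob', 'carol', 'dave', 'eve']   # hypothetical student names

model = SVC(kernel='linear', probability=True)
model.fit(emb_array, labels)

with open('classifier.pkl', 'wb') as f:
    pickle.dump((model, class_names), f)
```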
43 |
44 | ## Run
45 | 1. Run 'user_interface.py'
46 | 2. Click on the 'Run' button.
47 | 3. Select 'Webcam' from the list and leave all other fields blank.
48 | 4. Click on the 'Mark Attendance' button.
49 | 5. An attendance sheet will be generated automatically, named with the current date/time (a rough sketch of the sheet-writing step follows this list).
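
The sheet itself is written by sheet.py using xlsxwriter; the snippet below is only a rough sketch of how such a date-stamped sheet could be produced (names and layout are illustrative, not the repository's exact format):

```python
import datetime
import xlsxwriter

present = ['alice', 'carol']                 # hypothetical list of recognised students
stamp = datetime.datetime.now().strftime('%Y-%m-%d_%H-%M-%S')

# The 'attendance' folder must already exist (see the Installation notes above).
workbook = xlsxwriter.Workbook('attendance/attendance_%s.xlsx' % stamp)
sheet = workbook.add_worksheet()
sheet.write(0, 0, 'Name')
sheet.write(0, 1, 'Status')
for row, name in enumerate(present, start=1):
    sheet.write(row, 0, name)
    sheet.write(row, 1, 'Present')
workbook.close()
```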
50 |
51 | ## Make sure to have the following directory structure
52 | 1. 'Main' directory:
53 |
54 | 2. 'output' directory:
55 |
56 | 3. '20180402-114759' directory:
57 |
58 |
59 |
60 |
61 | The file for data augmentation will be uploaded soon.
62 |
63 | To learn more about how the software works, refer to our paper.
64 |
65 |
66 |
67 | ## Download pre-trained model:
68 | https://drive.google.com/open?id=1EXPBSXwTaqrSC0OhUdXNmKSh9qJUQ55-
69 |
70 |
71 | ## Cite
72 | If you find this paper/code useful, please consider citing:
73 |
74 | ```
75 | @INPROCEEDINGS{9029001,
76 | author={Rai, Aashish and Karnani, Rashmi and Chudasama, Vishal and Upla, Kishor},
77 | booktitle={2019 IEEE 16th India Council International Conference (INDICON)},
78 | title={An End-to-End Real-Time Face Identification and Attendance System using Convolutional Neural Networks},
79 | year={2019}, volume={}, number={}, pages={1-4},
80 | doi={10.1109/INDICON47234.2019.9029001}}
81 | ```
82 |
83 | ## License
84 |
85 | The code is available under the MIT License. Please read the license terms available at [[Link]](https://github.com/aashishrai3799/Automated-Attendance-System-using-CNN/blob/master/LICENSE).
86 |
87 |
--------------------------------------------------------------------------------
/align_dataset_mtcnn.py:
--------------------------------------------------------------------------------
1 | """Performs face alignment and stores face thumbnails in the output directory."""
2 | # MIT License
3 | #
4 | # Copyright (c) 2016 David Sandberg
5 | #
6 | # Permission is hereby granted, free of charge, to any person obtaining a copy
7 | # of this software and associated documentation files (the "Software"), to deal
8 | # in the Software without restriction, including without limitation the rights
9 | # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
10 | # copies of the Software, and to permit persons to whom the Software is
11 | # furnished to do so, subject to the following conditions:
12 | #
13 | # The above copyright notice and this permission notice shall be included in all
14 | # copies or substantial portions of the Software.
15 | #
16 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
17 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
18 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
19 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
20 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
21 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
22 | # SOFTWARE.
23 |
24 | from __future__ import absolute_import
25 | from __future__ import division
26 | from __future__ import print_function
27 |
28 | from scipy import misc
29 | import sys
30 | import os
31 | import argparse
32 | import tensorflow as tf
33 | import numpy as np
34 | import facenet
35 | import align.detect_face
36 | import random
37 | from time import sleep
38 |
39 | def main(args):
40 | sleep(random.random())
41 | output_dir = os.path.expanduser(args.output_dir)
42 | if not os.path.exists(output_dir):
43 | os.makedirs(output_dir)
44 | # Store some git revision info in a text file in the log directory
45 | src_path,_ = os.path.split(os.path.realpath(__file__))
46 | facenet.store_revision_info(src_path, output_dir, ' '.join(sys.argv))
47 | dataset = facenet.get_dataset(args.input_dir)
48 |
49 | print('Creating networks and loading parameters')
50 |
51 | with tf.Graph().as_default():
52 | gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=args.gpu_memory_fraction)
53 | sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False))
54 | with sess.as_default():
55 | pnet, rnet, onet = align.detect_face.create_mtcnn(sess, None)
56 |
57 | minsize = 20 # minimum size of face
58 |     threshold = [ 0.6, 0.7, 0.7 ]  # thresholds for the three detection stages
59 | factor = 0.709 # scale factor
60 |
61 | # Add a random key to the filename to allow alignment using multiple processes
62 | random_key = np.random.randint(0, high=99999)
63 | bounding_boxes_filename = os.path.join(output_dir, 'bounding_boxes_%05d.txt' % random_key)
64 |
65 | with open(bounding_boxes_filename, "w") as text_file:
66 | nrof_images_total = 0
67 | nrof_successfully_aligned = 0
68 | if args.random_order:
69 | random.shuffle(dataset)
70 | for cls in dataset:
71 | output_class_dir = os.path.join(output_dir, cls.name)
72 | if not os.path.exists(output_class_dir):
73 | os.makedirs(output_class_dir)
74 | if args.random_order:
75 | random.shuffle(cls.image_paths)
76 | for image_path in cls.image_paths:
77 | nrof_images_total += 1
78 | filename = os.path.splitext(os.path.split(image_path)[1])[0]
79 | output_filename = os.path.join(output_class_dir, filename+'.png')
80 | print(image_path)
81 | if not os.path.exists(output_filename):
82 | try:
83 | img = misc.imread(image_path)
84 | except (IOError, ValueError, IndexError) as e:
85 | errorMessage = '{}: {}'.format(image_path, e)
86 | print(errorMessage)
87 | else:
88 | if img.ndim<2:
89 | print('Unable to align "%s"' % image_path)
90 | text_file.write('%s\n' % (output_filename))
91 | continue
92 | if img.ndim == 2:
93 | img = facenet.to_rgb(img)
94 | img = img[:,:,0:3]
95 |
96 | bounding_boxes, _ = align.detect_face.detect_face(img, minsize, pnet, rnet, onet, threshold, factor)
97 | nrof_faces = bounding_boxes.shape[0]
98 | if nrof_faces>0:
99 | det = bounding_boxes[:,0:4]
100 | det_arr = []
101 | img_size = np.asarray(img.shape)[0:2]
102 | if nrof_faces>1:
103 | if args.detect_multiple_faces:
104 | for i in range(nrof_faces):
105 | det_arr.append(np.squeeze(det[i]))
106 | else:
107 | bounding_box_size = (det[:,2]-det[:,0])*(det[:,3]-det[:,1])
108 | img_center = img_size / 2
109 | offsets = np.vstack([ (det[:,0]+det[:,2])/2-img_center[1], (det[:,1]+det[:,3])/2-img_center[0] ])
110 | offset_dist_squared = np.sum(np.power(offsets,2.0),0)
111 | index = np.argmax(bounding_box_size-offset_dist_squared*2.0) # some extra weight on the centering
112 | det_arr.append(det[index,:])
113 | else:
114 | det_arr.append(np.squeeze(det))
115 |
116 | for i, det in enumerate(det_arr):
117 | det = np.squeeze(det)
118 | bb = np.zeros(4, dtype=np.int32)
119 | bb[0] = np.maximum(det[0]-args.margin/2, 0)
120 | bb[1] = np.maximum(det[1]-args.margin/2, 0)
121 | bb[2] = np.minimum(det[2]+args.margin/2, img_size[1])
122 | bb[3] = np.minimum(det[3]+args.margin/2, img_size[0])
123 | cropped = img[bb[1]:bb[3],bb[0]:bb[2],:]
124 | scaled = misc.imresize(cropped, (args.image_size, args.image_size), interp='bilinear')
125 | nrof_successfully_aligned += 1
126 | filename_base, file_extension = os.path.splitext(output_filename)
127 | if args.detect_multiple_faces:
128 | output_filename_n = "{}_{}{}".format(filename_base, i, file_extension)
129 | else:
130 | output_filename_n = "{}{}".format(filename_base, file_extension)
131 | misc.imsave(output_filename_n, scaled)
132 | text_file.write('%s %d %d %d %d\n' % (output_filename_n, bb[0], bb[1], bb[2], bb[3]))
133 | else:
134 | print('Unable to align "%s"' % image_path)
135 | text_file.write('%s\n' % (output_filename))
136 |
137 | print('Total number of images: %d' % nrof_images_total)
138 | print('Number of successfully aligned images: %d' % nrof_successfully_aligned)
139 |
140 |
141 | def parse_arguments(argv):
142 | parser = argparse.ArgumentParser()
143 |
144 | parser.add_argument('input_dir', type=str, help='Directory with unaligned images.')
145 | parser.add_argument('output_dir', type=str, help='Directory with aligned face thumbnails.')
146 | parser.add_argument('--image_size', type=int,
147 | help='Image size (height, width) in pixels.', default=182)
148 | parser.add_argument('--margin', type=int,
149 | help='Margin for the crop around the bounding box (height, width) in pixels.', default=44)
150 | parser.add_argument('--random_order',
151 | help='Shuffles the order of images to enable alignment using multiple processes.', action='store_true')
152 | parser.add_argument('--gpu_memory_fraction', type=float,
153 | help='Upper bound on the amount of GPU memory that will be used by the process.', default=1.0)
154 | parser.add_argument('--detect_multiple_faces', type=bool,
155 | help='Detect and align multiple faces per image.', default=False)
156 | return parser.parse_args(argv)
157 |
158 | if __name__ == '__main__':
159 | main(parse_arguments(sys.argv[1:]))
160 |
--------------------------------------------------------------------------------
/det1.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aashishrai3799/Automated-Attendance-System-using-CNN/ed8ce04100558c91288a1faa5aba7609233701dd/det1.npy
--------------------------------------------------------------------------------
/det2.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aashishrai3799/Automated-Attendance-System-using-CNN/ed8ce04100558c91288a1faa5aba7609233701dd/det2.npy
--------------------------------------------------------------------------------
/det3.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aashishrai3799/Automated-Attendance-System-using-CNN/ed8ce04100558c91288a1faa5aba7609233701dd/det3.npy
--------------------------------------------------------------------------------
/detect_face.py:
--------------------------------------------------------------------------------
1 | """ Tensorflow implementation of the face detection / alignment algorithm found at
2 | https://github.com/kpzhang93/MTCNN_face_detection_alignment
3 | """
4 | # MIT License
5 | #
6 | # Copyright (c) 2016 David Sandberg
7 | #
8 | # Permission is hereby granted, free of charge, to any person obtaining a copy
9 | # of this software and associated documentation files (the "Software"), to deal
10 | # in the Software without restriction, including without limitation the rights
11 | # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
12 | # copies of the Software, and to permit persons to whom the Software is
13 | # furnished to do so, subject to the following conditions:
14 | #
15 | # The above copyright notice and this permission notice shall be included in all
16 | # copies or substantial portions of the Software.
17 | #
18 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
19 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
20 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
21 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
22 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
23 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
24 | # SOFTWARE.
25 |
26 | from __future__ import absolute_import
27 | from __future__ import division
28 | from __future__ import print_function
29 | from six import string_types, iteritems
30 |
31 | import numpy as np
32 | import tensorflow as tf
33 | import cv2
34 | import os
35 |
36 | def layer(op):
37 | """Decorator for composable network layers."""
38 |
39 | def layer_decorated(self, *args, **kwargs):
40 | # Automatically set a name if not provided.
41 | name = kwargs.setdefault('name', self.get_unique_name(op.__name__))
42 | # Figure out the layer inputs.
43 | if len(self.terminals) == 0:
44 | raise RuntimeError('No input variables found for layer %s.' % name)
45 | elif len(self.terminals) == 1:
46 | layer_input = self.terminals[0]
47 | else:
48 | layer_input = list(self.terminals)
49 | # Perform the operation and get the output.
50 | layer_output = op(self, layer_input, *args, **kwargs)
51 | # Add to layer LUT.
52 | self.layers[name] = layer_output
53 | # This output is now the input for the next layer.
54 | self.feed(layer_output)
55 | # Return self for chained calls.
56 | return self
57 |
58 | return layer_decorated
59 |
60 | class Network(object):
61 |
62 | def __init__(self, inputs, trainable=True):
63 | # The input nodes for this network
64 | self.inputs = inputs
65 | # The current list of terminal nodes
66 | self.terminals = []
67 | # Mapping from layer names to layers
68 | self.layers = dict(inputs)
69 | # If true, the resulting variables are set as trainable
70 | self.trainable = trainable
71 |
72 | self.setup()
73 |
74 | def setup(self):
75 | """Construct the network. """
76 | raise NotImplementedError('Must be implemented by the subclass.')
77 |
78 | def load(self, data_path, session, ignore_missing=False):
79 | """Load network weights.
80 | data_path: The path to the numpy-serialized network weights
81 | session: The current TensorFlow session
82 | ignore_missing: If true, serialized weights for missing layers are ignored.
83 | """
84 | data_dict = np.load(data_path, encoding='latin1').item() #pylint: disable=no-member
85 |
86 | for op_name in data_dict:
87 | with tf.variable_scope(op_name, reuse=True):
88 | for param_name, data in iteritems(data_dict[op_name]):
89 | try:
90 | var = tf.get_variable(param_name)
91 | session.run(var.assign(data))
92 | except ValueError:
93 | if not ignore_missing:
94 | raise
95 |
96 | def feed(self, *args):
97 | """Set the input(s) for the next operation by replacing the terminal nodes.
98 | The arguments can be either layer names or the actual layers.
99 | """
100 | assert len(args) != 0
101 | self.terminals = []
102 | for fed_layer in args:
103 | if isinstance(fed_layer, string_types):
104 | try:
105 | fed_layer = self.layers[fed_layer]
106 | except KeyError:
107 | raise KeyError('Unknown layer name fed: %s' % fed_layer)
108 | self.terminals.append(fed_layer)
109 | return self
110 |
111 | def get_output(self):
112 | """Returns the current network output."""
113 | return self.terminals[-1]
114 |
115 | def get_unique_name(self, prefix):
116 | """Returns an index-suffixed unique name for the given prefix.
117 | This is used for auto-generating layer names based on the type-prefix.
118 | """
119 | ident = sum(t.startswith(prefix) for t, _ in self.layers.items()) + 1
120 | return '%s_%d' % (prefix, ident)
121 |
122 | def make_var(self, name, shape):
123 | """Creates a new TensorFlow variable."""
124 | return tf.get_variable(name, shape, trainable=self.trainable)
125 |
126 | def validate_padding(self, padding):
127 | """Verifies that the padding is one of the supported ones."""
128 | assert padding in ('SAME', 'VALID')
129 |
130 | @layer
131 | def conv(self,
132 | inp,
133 | k_h,
134 | k_w,
135 | c_o,
136 | s_h,
137 | s_w,
138 | name,
139 | relu=True,
140 | padding='SAME',
141 | group=1,
142 | biased=True):
143 | # Verify that the padding is acceptable
144 | self.validate_padding(padding)
145 | # Get the number of channels in the input
146 | c_i = int(inp.get_shape()[-1])
147 | # Verify that the grouping parameter is valid
148 | assert c_i % group == 0
149 | assert c_o % group == 0
150 | # Convolution for a given input and kernel
151 | convolve = lambda i, k: tf.nn.conv2d(i, k, [1, s_h, s_w, 1], padding=padding)
152 | with tf.variable_scope(name) as scope:
153 | kernel = self.make_var('weights', shape=[k_h, k_w, c_i // group, c_o])
154 | # This is the common-case. Convolve the input without any further complications.
155 | output = convolve(inp, kernel)
156 | # Add the biases
157 | if biased:
158 | biases = self.make_var('biases', [c_o])
159 | output = tf.nn.bias_add(output, biases)
160 | if relu:
161 | # ReLU non-linearity
162 | output = tf.nn.relu(output, name=scope.name)
163 | return output
164 |
165 | @layer
166 | def prelu(self, inp, name):
167 | with tf.variable_scope(name):
168 | i = int(inp.get_shape()[-1])
169 | alpha = self.make_var('alpha', shape=(i,))
170 | output = tf.nn.relu(inp) + tf.multiply(alpha, -tf.nn.relu(-inp))
171 | return output
172 |
173 | @layer
174 | def max_pool(self, inp, k_h, k_w, s_h, s_w, name, padding='SAME'):
175 | self.validate_padding(padding)
176 | return tf.nn.max_pool(inp,
177 | ksize=[1, k_h, k_w, 1],
178 | strides=[1, s_h, s_w, 1],
179 | padding=padding,
180 | name=name)
181 |
182 | @layer
183 | def fc(self, inp, num_out, name, relu=True):
184 | with tf.variable_scope(name):
185 | input_shape = inp.get_shape()
186 | if input_shape.ndims == 4:
187 | # The input is spatial. Vectorize it first.
188 | dim = 1
189 | for d in input_shape[1:].as_list():
190 | dim *= int(d)
191 | feed_in = tf.reshape(inp, [-1, dim])
192 | else:
193 | feed_in, dim = (inp, input_shape[-1].value)
194 | weights = self.make_var('weights', shape=[dim, num_out])
195 | biases = self.make_var('biases', [num_out])
196 | op = tf.nn.relu_layer if relu else tf.nn.xw_plus_b
197 | fc = op(feed_in, weights, biases, name=name)
198 | return fc
199 |
200 |
201 | """
202 | Multi dimensional softmax,
203 | refer to https://github.com/tensorflow/tensorflow/issues/210
204 | compute softmax along the dimension of target
205 | the native softmax only supports batch_size x dimension
206 | """
207 | @layer
208 | def softmax(self, target, axis, name=None):
209 | max_axis = tf.reduce_max(target, axis, keepdims=True)
210 | target_exp = tf.exp(target-max_axis)
211 | normalize = tf.reduce_sum(target_exp, axis, keepdims=True)
212 | softmax = tf.div(target_exp, normalize, name)
213 | return softmax
214 |
215 | class PNet(Network):
216 | def setup(self):
217 | (self.feed('data') #pylint: disable=no-value-for-parameter, no-member
218 | .conv(3, 3, 10, 1, 1, padding='VALID', relu=False, name='conv1')
219 | .prelu(name='PReLU1')
220 | .max_pool(2, 2, 2, 2, name='pool1')
221 | .conv(3, 3, 16, 1, 1, padding='VALID', relu=False, name='conv2')
222 | .prelu(name='PReLU2')
223 | .conv(3, 3, 32, 1, 1, padding='VALID', relu=False, name='conv3')
224 | .prelu(name='PReLU3')
225 | .conv(1, 1, 2, 1, 1, relu=False, name='conv4-1')
226 | .softmax(3,name='prob1'))
227 |
228 | (self.feed('PReLU3') #pylint: disable=no-value-for-parameter
229 | .conv(1, 1, 4, 1, 1, relu=False, name='conv4-2'))
230 |
231 | class RNet(Network):
232 | def setup(self):
233 | (self.feed('data') #pylint: disable=no-value-for-parameter, no-member
234 | .conv(3, 3, 28, 1, 1, padding='VALID', relu=False, name='conv1')
235 | .prelu(name='prelu1')
236 | .max_pool(3, 3, 2, 2, name='pool1')
237 | .conv(3, 3, 48, 1, 1, padding='VALID', relu=False, name='conv2')
238 | .prelu(name='prelu2')
239 | .max_pool(3, 3, 2, 2, padding='VALID', name='pool2')
240 | .conv(2, 2, 64, 1, 1, padding='VALID', relu=False, name='conv3')
241 | .prelu(name='prelu3')
242 | .fc(128, relu=False, name='conv4')
243 | .prelu(name='prelu4')
244 | .fc(2, relu=False, name='conv5-1')
245 | .softmax(1,name='prob1'))
246 |
247 | (self.feed('prelu4') #pylint: disable=no-value-for-parameter
248 | .fc(4, relu=False, name='conv5-2'))
249 |
250 | class ONet(Network):
251 | def setup(self):
252 | (self.feed('data') #pylint: disable=no-value-for-parameter, no-member
253 | .conv(3, 3, 32, 1, 1, padding='VALID', relu=False, name='conv1')
254 | .prelu(name='prelu1')
255 | .max_pool(3, 3, 2, 2, name='pool1')
256 | .conv(3, 3, 64, 1, 1, padding='VALID', relu=False, name='conv2')
257 | .prelu(name='prelu2')
258 | .max_pool(3, 3, 2, 2, padding='VALID', name='pool2')
259 | .conv(3, 3, 64, 1, 1, padding='VALID', relu=False, name='conv3')
260 | .prelu(name='prelu3')
261 | .max_pool(2, 2, 2, 2, name='pool3')
262 | .conv(2, 2, 128, 1, 1, padding='VALID', relu=False, name='conv4')
263 | .prelu(name='prelu4')
264 | .fc(256, relu=False, name='conv5')
265 | .prelu(name='prelu5')
266 | .fc(2, relu=False, name='conv6-1')
267 | .softmax(1, name='prob1'))
268 |
269 | (self.feed('prelu5') #pylint: disable=no-value-for-parameter
270 | .fc(4, relu=False, name='conv6-2'))
271 |
272 | (self.feed('prelu5') #pylint: disable=no-value-for-parameter
273 | .fc(10, relu=False, name='conv6-3'))
274 |
275 | def create_mtcnn(sess, model_path):
276 | if not model_path:
277 | model_path,_ = os.path.split(os.path.realpath(__file__))
278 |
279 | with tf.variable_scope('pnet'):
280 | data = tf.placeholder(tf.float32, (None,None,None,3), 'input')
281 | pnet = PNet({'data':data})
282 | pnet.load(os.path.join(model_path, 'det1.npy'), sess)
283 | with tf.variable_scope('rnet'):
284 | data = tf.placeholder(tf.float32, (None,24,24,3), 'input')
285 | rnet = RNet({'data':data})
286 | rnet.load(os.path.join(model_path, 'det2.npy'), sess)
287 | with tf.variable_scope('onet'):
288 | data = tf.placeholder(tf.float32, (None,48,48,3), 'input')
289 | onet = ONet({'data':data})
290 | onet.load(os.path.join(model_path, 'det3.npy'), sess)
291 |
292 | pnet_fun = lambda img : sess.run(('pnet/conv4-2/BiasAdd:0', 'pnet/prob1:0'), feed_dict={'pnet/input:0':img})
293 | rnet_fun = lambda img : sess.run(('rnet/conv5-2/conv5-2:0', 'rnet/prob1:0'), feed_dict={'rnet/input:0':img})
294 | onet_fun = lambda img : sess.run(('onet/conv6-2/conv6-2:0', 'onet/conv6-3/conv6-3:0', 'onet/prob1:0'), feed_dict={'onet/input:0':img})
295 | return pnet_fun, rnet_fun, onet_fun
296 |
297 | def detect_face(img, minsize, pnet, rnet, onet, threshold, factor):
298 | """Detects faces in an image, and returns bounding boxes and points for them.
299 | img: input image
300 | minsize: minimum faces' size
301 | pnet, rnet, onet: caffemodel
302 |     threshold: threshold=[th1, th2, th3], th1-3 are the thresholds for the three stages
303 | factor: the factor used to create a scaling pyramid of face sizes to detect in the image.
304 | """
305 | factor_count=0
306 | total_boxes=np.empty((0,9))
307 | points=np.empty(0)
308 | h=img.shape[0]
309 | w=img.shape[1]
310 | minl=np.amin([h, w])
311 | m=12.0/minsize
312 | minl=minl*m
313 | # create scale pyramid
314 | scales=[]
315 | while minl>=12:
316 | scales += [m*np.power(factor, factor_count)]
317 | minl = minl*factor
318 | factor_count += 1
319 |
320 | # first stage
321 | for scale in scales:
322 | hs=int(np.ceil(h*scale))
323 | ws=int(np.ceil(w*scale))
324 | im_data = imresample(img, (hs, ws))
325 | im_data = (im_data-127.5)*0.0078125
326 | img_x = np.expand_dims(im_data, 0)
327 | img_y = np.transpose(img_x, (0,2,1,3))
328 | out = pnet(img_y)
329 | out0 = np.transpose(out[0], (0,2,1,3))
330 | out1 = np.transpose(out[1], (0,2,1,3))
331 |
332 | boxes, _ = generateBoundingBox(out1[0,:,:,1].copy(), out0[0,:,:,:].copy(), scale, threshold[0])
333 |
334 | # inter-scale nms
335 | pick = nms(boxes.copy(), 0.5, 'Union')
336 | if boxes.size>0 and pick.size>0:
337 | boxes = boxes[pick,:]
338 | total_boxes = np.append(total_boxes, boxes, axis=0)
339 |
340 | numbox = total_boxes.shape[0]
341 | if numbox>0:
342 | pick = nms(total_boxes.copy(), 0.7, 'Union')
343 | total_boxes = total_boxes[pick,:]
344 | regw = total_boxes[:,2]-total_boxes[:,0]
345 | regh = total_boxes[:,3]-total_boxes[:,1]
346 | qq1 = total_boxes[:,0]+total_boxes[:,5]*regw
347 | qq2 = total_boxes[:,1]+total_boxes[:,6]*regh
348 | qq3 = total_boxes[:,2]+total_boxes[:,7]*regw
349 | qq4 = total_boxes[:,3]+total_boxes[:,8]*regh
350 | total_boxes = np.transpose(np.vstack([qq1, qq2, qq3, qq4, total_boxes[:,4]]))
351 | total_boxes = rerec(total_boxes.copy())
352 | total_boxes[:,0:4] = np.fix(total_boxes[:,0:4]).astype(np.int32)
353 | dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph = pad(total_boxes.copy(), w, h)
354 |
355 | numbox = total_boxes.shape[0]
356 | if numbox>0:
357 | # second stage
358 | tempimg = np.zeros((24,24,3,numbox))
359 | for k in range(0,numbox):
360 | tmp = np.zeros((int(tmph[k]),int(tmpw[k]),3))
361 | tmp[dy[k]-1:edy[k],dx[k]-1:edx[k],:] = img[y[k]-1:ey[k],x[k]-1:ex[k],:]
362 | if tmp.shape[0]>0 and tmp.shape[1]>0 or tmp.shape[0]==0 and tmp.shape[1]==0:
363 | tempimg[:,:,:,k] = imresample(tmp, (24, 24))
364 | else:
365 | return np.empty()
366 | tempimg = (tempimg-127.5)*0.0078125
367 | tempimg1 = np.transpose(tempimg, (3,1,0,2))
368 | out = rnet(tempimg1)
369 | out0 = np.transpose(out[0])
370 | out1 = np.transpose(out[1])
371 | score = out1[1,:]
372 | ipass = np.where(score>threshold[1])
373 | total_boxes = np.hstack([total_boxes[ipass[0],0:4].copy(), np.expand_dims(score[ipass].copy(),1)])
374 | mv = out0[:,ipass[0]]
375 | if total_boxes.shape[0]>0:
376 | pick = nms(total_boxes, 0.7, 'Union')
377 | total_boxes = total_boxes[pick,:]
378 | total_boxes = bbreg(total_boxes.copy(), np.transpose(mv[:,pick]))
379 | total_boxes = rerec(total_boxes.copy())
380 |
381 | numbox = total_boxes.shape[0]
382 | if numbox>0:
383 | # third stage
384 | total_boxes = np.fix(total_boxes).astype(np.int32)
385 | dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph = pad(total_boxes.copy(), w, h)
386 | tempimg = np.zeros((48,48,3,numbox))
387 | for k in range(0,numbox):
388 | tmp = np.zeros((int(tmph[k]),int(tmpw[k]),3))
389 | tmp[dy[k]-1:edy[k],dx[k]-1:edx[k],:] = img[y[k]-1:ey[k],x[k]-1:ex[k],:]
390 | if tmp.shape[0]>0 and tmp.shape[1]>0 or tmp.shape[0]==0 and tmp.shape[1]==0:
391 | tempimg[:,:,:,k] = imresample(tmp, (48, 48))
392 | else:
393 | return np.empty()
394 | tempimg = (tempimg-127.5)*0.0078125
395 | tempimg1 = np.transpose(tempimg, (3,1,0,2))
396 | out = onet(tempimg1)
397 | out0 = np.transpose(out[0])
398 | out1 = np.transpose(out[1])
399 | out2 = np.transpose(out[2])
400 | score = out2[1,:]
401 | points = out1
402 | ipass = np.where(score>threshold[2])
403 | points = points[:,ipass[0]]
404 | total_boxes = np.hstack([total_boxes[ipass[0],0:4].copy(), np.expand_dims(score[ipass].copy(),1)])
405 | mv = out0[:,ipass[0]]
406 |
407 | w = total_boxes[:,2]-total_boxes[:,0]+1
408 | h = total_boxes[:,3]-total_boxes[:,1]+1
409 | points[0:5,:] = np.tile(w,(5, 1))*points[0:5,:] + np.tile(total_boxes[:,0],(5, 1))-1
410 | points[5:10,:] = np.tile(h,(5, 1))*points[5:10,:] + np.tile(total_boxes[:,1],(5, 1))-1
411 | if total_boxes.shape[0]>0:
412 | total_boxes = bbreg(total_boxes.copy(), np.transpose(mv))
413 | pick = nms(total_boxes.copy(), 0.7, 'Min')
414 | total_boxes = total_boxes[pick,:]
415 | points = points[:,pick]
416 |
417 | return total_boxes, points
418 |
419 |
420 | def bulk_detect_face(images, detection_window_size_ratio, pnet, rnet, onet, threshold, factor):
421 | """Detects faces in a list of images
422 | images: list containing input images
423 | detection_window_size_ratio: ratio of minimum face size to smallest image dimension
424 | pnet, rnet, onet: caffemodel
425 |     threshold: threshold=[th1, th2, th3], th1-3 are the thresholds for the three stages [0-1]
426 | factor: the factor used to create a scaling pyramid of face sizes to detect in the image.
427 | """
428 | all_scales = [None] * len(images)
429 | images_with_boxes = [None] * len(images)
430 |
431 | for i in range(len(images)):
432 | images_with_boxes[i] = {'total_boxes': np.empty((0, 9))}
433 |
434 | # create scale pyramid
435 | for index, img in enumerate(images):
436 | all_scales[index] = []
437 | h = img.shape[0]
438 | w = img.shape[1]
439 | minsize = int(detection_window_size_ratio * np.minimum(w, h))
440 | factor_count = 0
441 | minl = np.amin([h, w])
442 | if minsize <= 12:
443 | minsize = 12
444 |
445 | m = 12.0 / minsize
446 | minl = minl * m
447 | while minl >= 12:
448 | all_scales[index].append(m * np.power(factor, factor_count))
449 | minl = minl * factor
450 | factor_count += 1
451 |
452 | # # # # # # # # # # # # #
453 | # first stage - fast proposal network (pnet) to obtain face candidates
454 | # # # # # # # # # # # # #
455 |
456 | images_obj_per_resolution = {}
457 |
458 | # TODO: use some type of rounding to number module 8 to increase probability that pyramid images will have the same resolution across input images
459 |
460 | for index, scales in enumerate(all_scales):
461 | h = images[index].shape[0]
462 | w = images[index].shape[1]
463 |
464 | for scale in scales:
465 | hs = int(np.ceil(h * scale))
466 | ws = int(np.ceil(w * scale))
467 |
468 | if (ws, hs) not in images_obj_per_resolution:
469 | images_obj_per_resolution[(ws, hs)] = []
470 |
471 | im_data = imresample(images[index], (hs, ws))
472 | im_data = (im_data - 127.5) * 0.0078125
473 | img_y = np.transpose(im_data, (1, 0, 2)) # caffe uses different dimensions ordering
474 | images_obj_per_resolution[(ws, hs)].append({'scale': scale, 'image': img_y, 'index': index})
475 |
476 | for resolution in images_obj_per_resolution:
477 | images_per_resolution = [i['image'] for i in images_obj_per_resolution[resolution]]
478 | outs = pnet(images_per_resolution)
479 |
480 | for index in range(len(outs[0])):
481 | scale = images_obj_per_resolution[resolution][index]['scale']
482 | image_index = images_obj_per_resolution[resolution][index]['index']
483 | out0 = np.transpose(outs[0][index], (1, 0, 2))
484 | out1 = np.transpose(outs[1][index], (1, 0, 2))
485 |
486 | boxes, _ = generateBoundingBox(out1[:, :, 1].copy(), out0[:, :, :].copy(), scale, threshold[0])
487 |
488 | # inter-scale nms
489 | pick = nms(boxes.copy(), 0.5, 'Union')
490 | if boxes.size > 0 and pick.size > 0:
491 | boxes = boxes[pick, :]
492 | images_with_boxes[image_index]['total_boxes'] = np.append(images_with_boxes[image_index]['total_boxes'],
493 | boxes,
494 | axis=0)
495 |
496 | for index, image_obj in enumerate(images_with_boxes):
497 | numbox = image_obj['total_boxes'].shape[0]
498 | if numbox > 0:
499 | h = images[index].shape[0]
500 | w = images[index].shape[1]
501 | pick = nms(image_obj['total_boxes'].copy(), 0.7, 'Union')
502 | image_obj['total_boxes'] = image_obj['total_boxes'][pick, :]
503 | regw = image_obj['total_boxes'][:, 2] - image_obj['total_boxes'][:, 0]
504 | regh = image_obj['total_boxes'][:, 3] - image_obj['total_boxes'][:, 1]
505 | qq1 = image_obj['total_boxes'][:, 0] + image_obj['total_boxes'][:, 5] * regw
506 | qq2 = image_obj['total_boxes'][:, 1] + image_obj['total_boxes'][:, 6] * regh
507 | qq3 = image_obj['total_boxes'][:, 2] + image_obj['total_boxes'][:, 7] * regw
508 | qq4 = image_obj['total_boxes'][:, 3] + image_obj['total_boxes'][:, 8] * regh
509 | image_obj['total_boxes'] = np.transpose(np.vstack([qq1, qq2, qq3, qq4, image_obj['total_boxes'][:, 4]]))
510 | image_obj['total_boxes'] = rerec(image_obj['total_boxes'].copy())
511 | image_obj['total_boxes'][:, 0:4] = np.fix(image_obj['total_boxes'][:, 0:4]).astype(np.int32)
512 | dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph = pad(image_obj['total_boxes'].copy(), w, h)
513 |
514 | numbox = image_obj['total_boxes'].shape[0]
515 | tempimg = np.zeros((24, 24, 3, numbox))
516 |
517 | if numbox > 0:
518 | for k in range(0, numbox):
519 | tmp = np.zeros((int(tmph[k]), int(tmpw[k]), 3))
520 | tmp[dy[k] - 1:edy[k], dx[k] - 1:edx[k], :] = images[index][y[k] - 1:ey[k], x[k] - 1:ex[k], :]
521 | if tmp.shape[0] > 0 and tmp.shape[1] > 0 or tmp.shape[0] == 0 and tmp.shape[1] == 0:
522 | tempimg[:, :, :, k] = imresample(tmp, (24, 24))
523 | else:
524 | return np.empty()
525 |
526 | tempimg = (tempimg - 127.5) * 0.0078125
527 | image_obj['rnet_input'] = np.transpose(tempimg, (3, 1, 0, 2))
528 |
529 | # # # # # # # # # # # # #
530 | # second stage - refinement of face candidates with rnet
531 | # # # # # # # # # # # # #
532 |
533 | bulk_rnet_input = np.empty((0, 24, 24, 3))
534 | for index, image_obj in enumerate(images_with_boxes):
535 | if 'rnet_input' in image_obj:
536 | bulk_rnet_input = np.append(bulk_rnet_input, image_obj['rnet_input'], axis=0)
537 |
538 | out = rnet(bulk_rnet_input)
539 | out0 = np.transpose(out[0])
540 | out1 = np.transpose(out[1])
541 | score = out1[1, :]
542 |
543 | i = 0
544 | for index, image_obj in enumerate(images_with_boxes):
545 | if 'rnet_input' not in image_obj:
546 | continue
547 |
548 | rnet_input_count = image_obj['rnet_input'].shape[0]
549 | score_per_image = score[i:i + rnet_input_count]
550 | out0_per_image = out0[:, i:i + rnet_input_count]
551 |
552 | ipass = np.where(score_per_image > threshold[1])
553 | image_obj['total_boxes'] = np.hstack([image_obj['total_boxes'][ipass[0], 0:4].copy(),
554 | np.expand_dims(score_per_image[ipass].copy(), 1)])
555 |
556 | mv = out0_per_image[:, ipass[0]]
557 |
558 | if image_obj['total_boxes'].shape[0] > 0:
559 | h = images[index].shape[0]
560 | w = images[index].shape[1]
561 | pick = nms(image_obj['total_boxes'], 0.7, 'Union')
562 | image_obj['total_boxes'] = image_obj['total_boxes'][pick, :]
563 | image_obj['total_boxes'] = bbreg(image_obj['total_boxes'].copy(), np.transpose(mv[:, pick]))
564 | image_obj['total_boxes'] = rerec(image_obj['total_boxes'].copy())
565 |
566 | numbox = image_obj['total_boxes'].shape[0]
567 |
568 | if numbox > 0:
569 | tempimg = np.zeros((48, 48, 3, numbox))
570 | image_obj['total_boxes'] = np.fix(image_obj['total_boxes']).astype(np.int32)
571 | dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph = pad(image_obj['total_boxes'].copy(), w, h)
572 |
573 | for k in range(0, numbox):
574 | tmp = np.zeros((int(tmph[k]), int(tmpw[k]), 3))
575 | tmp[dy[k] - 1:edy[k], dx[k] - 1:edx[k], :] = images[index][y[k] - 1:ey[k], x[k] - 1:ex[k], :]
576 | if tmp.shape[0] > 0 and tmp.shape[1] > 0 or tmp.shape[0] == 0 and tmp.shape[1] == 0:
577 | tempimg[:, :, :, k] = imresample(tmp, (48, 48))
578 | else:
579 | return np.empty()
580 | tempimg = (tempimg - 127.5) * 0.0078125
581 | image_obj['onet_input'] = np.transpose(tempimg, (3, 1, 0, 2))
582 |
583 | i += rnet_input_count
584 |
585 | # # # # # # # # # # # # #
586 | # third stage - further refinement and facial landmarks positions with onet
587 | # # # # # # # # # # # # #
588 |
589 | bulk_onet_input = np.empty((0, 48, 48, 3))
590 | for index, image_obj in enumerate(images_with_boxes):
591 | if 'onet_input' in image_obj:
592 | bulk_onet_input = np.append(bulk_onet_input, image_obj['onet_input'], axis=0)
593 |
594 | out = onet(bulk_onet_input)
595 |
596 | out0 = np.transpose(out[0])
597 | out1 = np.transpose(out[1])
598 | out2 = np.transpose(out[2])
599 | score = out2[1, :]
600 | points = out1
601 |
602 | i = 0
603 | ret = []
604 | for index, image_obj in enumerate(images_with_boxes):
605 | if 'onet_input' not in image_obj:
606 | ret.append(None)
607 | continue
608 |
609 | onet_input_count = image_obj['onet_input'].shape[0]
610 |
611 | out0_per_image = out0[:, i:i + onet_input_count]
612 | score_per_image = score[i:i + onet_input_count]
613 | points_per_image = points[:, i:i + onet_input_count]
614 |
615 | ipass = np.where(score_per_image > threshold[2])
616 | points_per_image = points_per_image[:, ipass[0]]
617 |
618 | image_obj['total_boxes'] = np.hstack([image_obj['total_boxes'][ipass[0], 0:4].copy(),
619 | np.expand_dims(score_per_image[ipass].copy(), 1)])
620 | mv = out0_per_image[:, ipass[0]]
621 |
622 | w = image_obj['total_boxes'][:, 2] - image_obj['total_boxes'][:, 0] + 1
623 | h = image_obj['total_boxes'][:, 3] - image_obj['total_boxes'][:, 1] + 1
624 | points_per_image[0:5, :] = np.tile(w, (5, 1)) * points_per_image[0:5, :] + np.tile(
625 | image_obj['total_boxes'][:, 0], (5, 1)) - 1
626 | points_per_image[5:10, :] = np.tile(h, (5, 1)) * points_per_image[5:10, :] + np.tile(
627 | image_obj['total_boxes'][:, 1], (5, 1)) - 1
628 |
629 | if image_obj['total_boxes'].shape[0] > 0:
630 | image_obj['total_boxes'] = bbreg(image_obj['total_boxes'].copy(), np.transpose(mv))
631 | pick = nms(image_obj['total_boxes'].copy(), 0.7, 'Min')
632 | image_obj['total_boxes'] = image_obj['total_boxes'][pick, :]
633 | points_per_image = points_per_image[:, pick]
634 |
635 | ret.append((image_obj['total_boxes'], points_per_image))
636 | else:
637 | ret.append(None)
638 |
639 | i += onet_input_count
640 |
641 | return ret
642 |
643 |
644 | # function [boundingbox] = bbreg(boundingbox,reg)
645 | def bbreg(boundingbox,reg):
646 | """Calibrate bounding boxes"""
647 | if reg.shape[1]==1:
648 | reg = np.reshape(reg, (reg.shape[2], reg.shape[3]))
649 |
650 | w = boundingbox[:,2]-boundingbox[:,0]+1
651 | h = boundingbox[:,3]-boundingbox[:,1]+1
652 | b1 = boundingbox[:,0]+reg[:,0]*w
653 | b2 = boundingbox[:,1]+reg[:,1]*h
654 | b3 = boundingbox[:,2]+reg[:,2]*w
655 | b4 = boundingbox[:,3]+reg[:,3]*h
656 | boundingbox[:,0:4] = np.transpose(np.vstack([b1, b2, b3, b4 ]))
657 | return boundingbox
658 |
659 | def generateBoundingBox(imap, reg, scale, t):
660 | """Use heatmap to generate bounding boxes"""
661 | stride=2
662 | cellsize=12
663 |
664 | imap = np.transpose(imap)
665 | dx1 = np.transpose(reg[:,:,0])
666 | dy1 = np.transpose(reg[:,:,1])
667 | dx2 = np.transpose(reg[:,:,2])
668 | dy2 = np.transpose(reg[:,:,3])
669 | y, x = np.where(imap >= t)
670 | if y.shape[0]==1:
671 | dx1 = np.flipud(dx1)
672 | dy1 = np.flipud(dy1)
673 | dx2 = np.flipud(dx2)
674 | dy2 = np.flipud(dy2)
675 | score = imap[(y,x)]
676 | reg = np.transpose(np.vstack([ dx1[(y,x)], dy1[(y,x)], dx2[(y,x)], dy2[(y,x)] ]))
677 | if reg.size==0:
678 | reg = np.empty((0,3))
679 | bb = np.transpose(np.vstack([y,x]))
680 | q1 = np.fix((stride*bb+1)/scale)
681 | q2 = np.fix((stride*bb+cellsize-1+1)/scale)
682 | boundingbox = np.hstack([q1, q2, np.expand_dims(score,1), reg])
683 | return boundingbox, reg
684 |
685 | # function pick = nms(boxes,threshold,type)
686 | def nms(boxes, threshold, method):
687 | if boxes.size==0:
688 | return np.empty((0,3))
689 | x1 = boxes[:,0]
690 | y1 = boxes[:,1]
691 | x2 = boxes[:,2]
692 | y2 = boxes[:,3]
693 | s = boxes[:,4]
694 | area = (x2-x1+1) * (y2-y1+1)
695 | I = np.argsort(s)
696 | pick = np.zeros_like(s, dtype=np.int16)
697 | counter = 0
698 | while I.size>0:
699 | i = I[-1]
700 | pick[counter] = i
701 | counter += 1
702 | idx = I[0:-1]
703 | xx1 = np.maximum(x1[i], x1[idx])
704 | yy1 = np.maximum(y1[i], y1[idx])
705 | xx2 = np.minimum(x2[i], x2[idx])
706 | yy2 = np.minimum(y2[i], y2[idx])
707 | w = np.maximum(0.0, xx2-xx1+1)
708 | h = np.maximum(0.0, yy2-yy1+1)
709 | inter = w * h
710 |         if method == 'Min':
711 | o = inter / np.minimum(area[i], area[idx])
712 | else:
713 | o = inter / (area[i] + area[idx] - inter)
714 | I = I[np.where(o<=threshold)]
715 | pick = pick[0:counter]
716 | return pick
717 |
718 | # function [dy edy dx edx y ey x ex tmpw tmph] = pad(total_boxes,w,h)
719 | def pad(total_boxes, w, h):
720 | """Compute the padding coordinates (pad the bounding boxes to square)"""
721 | tmpw = (total_boxes[:,2]-total_boxes[:,0]+1).astype(np.int32)
722 | tmph = (total_boxes[:,3]-total_boxes[:,1]+1).astype(np.int32)
723 | numbox = total_boxes.shape[0]
724 |
725 | dx = np.ones((numbox), dtype=np.int32)
726 | dy = np.ones((numbox), dtype=np.int32)
727 | edx = tmpw.copy().astype(np.int32)
728 | edy = tmph.copy().astype(np.int32)
729 |
730 | x = total_boxes[:,0].copy().astype(np.int32)
731 | y = total_boxes[:,1].copy().astype(np.int32)
732 | ex = total_boxes[:,2].copy().astype(np.int32)
733 | ey = total_boxes[:,3].copy().astype(np.int32)
734 |
735 | tmp = np.where(ex>w)
736 | edx.flat[tmp] = np.expand_dims(-ex[tmp]+w+tmpw[tmp],1)
737 | ex[tmp] = w
738 |
739 | tmp = np.where(ey>h)
740 | edy.flat[tmp] = np.expand_dims(-ey[tmp]+h+tmph[tmp],1)
741 | ey[tmp] = h
742 |
743 | tmp = np.where(x<1)
744 | dx.flat[tmp] = np.expand_dims(2-x[tmp],1)
745 | x[tmp] = 1
746 |
747 | tmp = np.where(y<1)
748 | dy.flat[tmp] = np.expand_dims(2-y[tmp],1)
749 | y[tmp] = 1
750 |
751 | return dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph
752 |
753 | # function [bboxA] = rerec(bboxA)
754 | def rerec(bboxA):
755 | """Convert bboxA to square."""
756 | h = bboxA[:,3]-bboxA[:,1]
757 | w = bboxA[:,2]-bboxA[:,0]
758 | l = np.maximum(w, h)
759 | bboxA[:,0] = bboxA[:,0]+w*0.5-l*0.5
760 | bboxA[:,1] = bboxA[:,1]+h*0.5-l*0.5
761 | bboxA[:,2:4] = bboxA[:,0:2] + np.transpose(np.tile(l,(2,1)))
762 | return bboxA
763 |
764 | def imresample(img, sz):
765 | im_data = cv2.resize(img, (sz[1], sz[0]), interpolation=cv2.INTER_AREA)
766 | return im_data
767 |
--------------------------------------------------------------------------------
/face_aligner.py:
--------------------------------------------------------------------------------
1 | # import the necessary packages
2 | import numpy as np
3 | import cv2
4 |
5 | class FaceAligner:
6 | def __init__(self, desiredLeftEye=(0.4, 0.4),
7 | desiredFaceWidth=256, desiredFaceHeight=None):
8 | self.desiredLeftEye = desiredLeftEye
9 | self.desiredFaceWidth = desiredFaceWidth
10 | self.desiredFaceHeight = desiredFaceHeight
11 |
12 | # if the desired face height is None, set it to be the
13 | # desired face width (normal behavior)
14 | if self.desiredFaceHeight is None:
15 | self.desiredFaceHeight = self.desiredFaceWidth
16 |
17 | def align(self, image, points):
18 |
19 | # compute the center of mass for each eye
20 | leftEyeCenter = (int(points[0]), int(points[5]))
21 | rightEyeCenter = (int(points[1]), int(points[6]))
22 |
23 | # compute the angle between the eye centroids
24 | dY = rightEyeCenter[1] - leftEyeCenter[1]
25 | dX = rightEyeCenter[0] - leftEyeCenter[0]
26 | angle = np.degrees(np.arctan2(dY, dX))
27 |
28 | # compute the desired right eye x-coordinate based on the
29 | # desired x-coordinate of the left eye
30 | desiredRightEyeX = 1.0 - self.desiredLeftEye[0]
31 |
32 | # determine the scale of the new resulting image by taking
33 | # the ratio of the distance between eyes in the *current*
34 | # image to the ratio of distance between eyes in the
35 | # *desired* image
36 | dist = np.sqrt((dX ** 2) + (dY ** 2))
37 | desiredDist = (desiredRightEyeX - self.desiredLeftEye[0])
38 | desiredDist *= self.desiredFaceWidth
39 | scale = desiredDist / dist
40 |
41 | # compute center (x, y)-coordinates (i.e., the median point)
42 | # between the two eyes in the input image
43 | eyesCenter = ((leftEyeCenter[0] + rightEyeCenter[0]) // 2,
44 | (leftEyeCenter[1] + rightEyeCenter[1]) // 2)
45 |
46 | # grab the rotation matrix for rotating and scaling the face
47 | M = cv2.getRotationMatrix2D(eyesCenter, angle, scale)
48 |
49 | # update the translation component of the matrix
50 | tX = self.desiredFaceWidth * 0.5
51 | tY = self.desiredFaceHeight * self.desiredLeftEye[1]
52 | M[0, 2] += (tX - eyesCenter[0])
53 | M[1, 2] += (tY - eyesCenter[1])
54 |
55 | # apply the affine transformation
56 | (w, h) = (self.desiredFaceWidth, self.desiredFaceHeight)
57 | output = cv2.warpAffine(image, M, (w, h),
58 | flags=cv2.INTER_CUBIC)
59 |
60 | # return the aligned face
61 | return output
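
# ---------------------------------------------------------------------------
# Illustrative usage (a minimal sketch, assuming detect_face.py from this
# repository): `points` is one column of the 10-element landmark array
# returned by detect_face.detect_face(), laid out as [x1..x5, y1..y5], so the
# eyes sit at indices 0/5 (left) and 1/6 (right).
#
#   import cv2
#   import tensorflow as tf
#   import detect_face
#
#   img = cv2.imread('images/image2.png')          # any test image
#   with tf.Session() as sess:
#       pnet, rnet, onet = detect_face.create_mtcnn(sess, None)
#       boxes, points = detect_face.detect_face(img, 20, pnet, rnet, onet,
#                                               [0.6, 0.7, 0.7], 0.709)
#   aligner = FaceAligner(desiredFaceWidth=182)
#   if points.size:
#       aligned = aligner.align(img, points[:, 0])  # first detected face
# ---------------------------------------------------------------------------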
--------------------------------------------------------------------------------
/face_detect.py:
--------------------------------------------------------------------------------
1 | import cv2
2 | from mtcnn2 import MTCNN
3 | from draw_points import *
4 | import os
5 | import numpy as np
6 |
7 | ckpts = np.zeros((5000, 2500), dtype='uint8')  # marks pixels already covered by a detected face (needed for the webcam/video path)
8 |
9 | print('Welcome to Face Detection \n\n Enter 1 to add image manually\n Enter 2 to detect face in Webcam feed')
10 | n = int(input())
11 | if n != 1 and n != 2:
12 | print('Wrong Choice')
13 | exit(0)
14 | count = 0
15 | if n == 1:
16 | print('Enter complete address of the image')
17 | #addr = str(input())
18 | #addr = 'C:/Users/Rashmi/Downloads/21.jpg'
19 | addr = '/home/ml/Documents/attendance_dl/21.jpg'
20 | if not os.path.exists(addr):
21 | print('Invalid Address')
22 | exit(0)
23 |
24 | print('Enter Resolution of output image (in heightXwidth format)')
25 | res = input().split('X')
26 | img = cv2.imread(addr)
27 | img = cv2.resize(img, (int(res[0]), int(res[1])))
28 | ckpts = np.zeros((int(res[0]), int(res[1])), dtype = 'uint8')
29 |
30 | elif n == 2:
31 | #video_capture = cv2.VideoCapture(0)
32 | #/home/ml/Documents/attendance_dl/dataset/dtst7.mp4
33 | video_capture = cv2.VideoCapture('dataset/Mam.mp4')
34 |
35 |
36 | detector = MTCNN()
37 | ct = 0
38 | alpha = 0.12
39 | beta = 0.04
40 |
41 | while True:
42 |
43 | if n == 2:
44 | ret, frame = video_capture.read()
45 | #frame = cv2.resize(frame)
46 | elif n == 1:
47 | frame = img
48 |
49 | #edges = cv2.Canny(frame,500,1000)
50 | #b, g, r = cv2.split(frame)
51 | #dst = cv2.add(r, edges)
52 | #frame2 = cv2.merge((r, b, dst))
53 | m = cv2.getRotationMatrix2D((frame.shape[1]/2, frame.shape[0]/2+250), -90, 1)
54 | frame = cv2.warpAffine(frame, m, (frame.shape[1], frame.shape[0]))
55 | frame = cv2.resize(frame, (840, 480))
56 |
57 | detect = detector.detect_faces(frame)
58 |
59 | if detect:
60 |
61 | for i in range(int(len(detect[:]))):
62 | boxes = detect[i]['box']
63 | keypoints = detect[i]['keypoints']
64 | #print(keypoints['nose'])
65 | if ckpts[keypoints['nose']] == 0 and ckpts[keypoints['left_eye']] == 0 and ckpts[keypoints['right_eye']] == 0 and ckpts[keypoints['mouth_left']] == 0 and ckpts[keypoints['mouth_right']] == 0:
66 | #show_points(frame, boxes, keypoints, alpha, beta)
67 | draw_lines(frame, boxes, keypoints, alpha, beta, count)
68 | count = count+1
69 | print('count', count)
70 | '''for w in range(boxes[0], boxes[0]+boxes[2]):
71 | for h in range(boxes[1], boxes[1]+boxes[3]):
72 | ckpts[w][h] = 1'''
73 |
74 |
75 | # Display the resulting frame
76 | cv2.imshow('Frame', frame)
77 | #cv2.waitKey(0)
78 | #break
79 | if cv2.waitKey(1) & 0xFF == ord('q'):
80 | break
81 |
82 | # Release the capture
83 | #video_capture.release()
84 | cv2.destroyAllWindows()
85 |
--------------------------------------------------------------------------------
/facenet.py:
--------------------------------------------------------------------------------
1 | import os
2 | from subprocess import Popen, PIPE
3 | import tensorflow as tf
4 | import numpy as np
5 | from scipy import misc
6 | from sklearn.model_selection import KFold
7 | from scipy import interpolate
8 | from tensorflow.python.training import training
9 | import random
10 | import re
11 | from tensorflow.python.platform import gfile
12 | import math
13 | from six import iteritems
14 | import cv2
15 |
16 | def triplet_loss(anchor, positive, negative, alpha):
17 | """Calculate the triplet loss according to the FaceNet paper
18 |
19 | Args:
20 | anchor: the embeddings for the anchor images.
21 | positive: the embeddings for the positive images.
22 | negative: the embeddings for the negative images.
23 |
24 | Returns:
25 | the triplet loss according to the FaceNet paper as a float tensor.
26 | """
27 | with tf.variable_scope('triplet_loss'):
28 | pos_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, positive)), 1)
29 | neg_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, negative)), 1)
30 |
31 | basic_loss = tf.add(tf.subtract(pos_dist,neg_dist), alpha)
32 | loss = tf.reduce_mean(tf.maximum(basic_loss, 0.0), 0)
33 |
34 | return loss
35 |
36 | def center_loss(features, label, alfa, nrof_classes):
37 | """Center loss based on the paper "A Discriminative Feature Learning Approach for Deep Face Recognition"
38 | (http://ydwen.github.io/papers/WenECCV16.pdf)
39 | """
40 | nrof_features = features.get_shape()[1]
41 | centers = tf.get_variable('centers', [nrof_classes, nrof_features], dtype=tf.float32,
42 | initializer=tf.constant_initializer(0), trainable=False)
43 | label = tf.reshape(label, [-1])
44 | centers_batch = tf.gather(centers, label)
45 | diff = (1 - alfa) * (centers_batch - features)
46 | centers = tf.scatter_sub(centers, label, diff)
47 | with tf.control_dependencies([centers]):
48 | loss = tf.reduce_mean(tf.square(features - centers_batch))
49 | return loss, centers
50 |
51 | def get_image_paths_and_labels(dataset):
52 | image_paths_flat = []
53 | labels_flat = []
54 | for i in range(len(dataset)):
55 | image_paths_flat += dataset[i].image_paths
56 | labels_flat += [i] * len(dataset[i].image_paths)
57 | return image_paths_flat, labels_flat
58 |
59 | def shuffle_examples(image_paths, labels):
60 | shuffle_list = list(zip(image_paths, labels))
61 | random.shuffle(shuffle_list)
62 | image_paths_shuff, labels_shuff = zip(*shuffle_list)
63 | return image_paths_shuff, labels_shuff
64 |
65 | def random_rotate_image(image):
66 | angle = np.random.uniform(low=-10.0, high=10.0)
67 | return misc.imrotate(image, angle, 'bicubic')
68 |
69 | # 1: Random rotate 2: Random crop 4: Random flip 8: Fixed image standardization 16: Flip
70 | RANDOM_ROTATE = 1
71 | RANDOM_CROP = 2
72 | RANDOM_FLIP = 4
73 | FIXED_STANDARDIZATION = 8
74 | FLIP = 16
75 | def create_input_pipeline(input_queue, image_size, nrof_preprocess_threads, batch_size_placeholder):
76 | images_and_labels_list = []
77 | for _ in range(nrof_preprocess_threads):
78 | filenames, label, control = input_queue.dequeue()
79 | images = []
80 | for filename in tf.unstack(filenames):
81 | file_contents = tf.read_file(filename)
82 | image = tf.image.decode_image(file_contents, 3)
83 | image = tf.cond(get_control_flag(control[0], RANDOM_ROTATE),
84 | lambda:tf.py_func(random_rotate_image, [image], tf.uint8),
85 | lambda:tf.identity(image))
86 | image = tf.cond(get_control_flag(control[0], RANDOM_CROP),
87 | lambda:tf.random_crop(image, image_size + (3,)),
88 | lambda:tf.image.resize_image_with_crop_or_pad(image, image_size[0], image_size[1]))
89 | image = tf.cond(get_control_flag(control[0], RANDOM_FLIP),
90 | lambda:tf.image.random_flip_left_right(image),
91 | lambda:tf.identity(image))
92 | image = tf.cond(get_control_flag(control[0], FIXED_STANDARDIZATION),
93 | lambda:(tf.cast(image, tf.float32) - 127.5)/128.0,
94 | lambda:tf.image.per_image_standardization(image))
95 | image = tf.cond(get_control_flag(control[0], FLIP),
96 | lambda:tf.image.flip_left_right(image),
97 | lambda:tf.identity(image))
98 | #pylint: disable=no-member
99 | image.set_shape(image_size + (3,))
100 | images.append(image)
101 | images_and_labels_list.append([images, label])
102 |
103 | image_batch, label_batch = tf.train.batch_join(
104 | images_and_labels_list, batch_size=batch_size_placeholder,
105 | shapes=[image_size + (3,), ()], enqueue_many=True,
106 | capacity=4 * nrof_preprocess_threads * 100,
107 | allow_smaller_final_batch=True)
108 |
109 | return image_batch, label_batch
110 |
111 | def get_control_flag(control, field):
112 | return tf.equal(tf.mod(tf.floor_div(control, field), 2), 1)
113 |
114 | def _add_loss_summaries(total_loss):
115 | """Add summaries for losses.
116 |
117 | Generates moving average for all losses and associated summaries for
118 | visualizing the performance of the network.
119 |
120 | Args:
121 | total_loss: Total loss from loss().
122 | Returns:
123 | loss_averages_op: op for generating moving averages of losses.
124 | """
125 | # Compute the moving average of all individual losses and the total loss.
126 | loss_averages = tf.train.ExponentialMovingAverage(0.9, name='avg')
127 | losses = tf.get_collection('losses')
128 | loss_averages_op = loss_averages.apply(losses + [total_loss])
129 |
130 | # Attach a scalar summmary to all individual losses and the total loss; do the
131 | # same for the averaged version of the losses.
132 | for l in losses + [total_loss]:
133 | # Name each loss as '(raw)' and name the moving average version of the loss
134 | # as the original loss name.
135 | tf.summary.scalar(l.op.name +' (raw)', l)
136 | tf.summary.scalar(l.op.name, loss_averages.average(l))
137 |
138 | return loss_averages_op
139 |
140 | def train(total_loss, global_step, optimizer, learning_rate, moving_average_decay, update_gradient_vars, log_histograms=True):
141 | # Generate moving averages of all losses and associated summaries.
142 | loss_averages_op = _add_loss_summaries(total_loss)
143 |
144 | # Compute gradients.
145 | with tf.control_dependencies([loss_averages_op]):
146 | if optimizer=='ADAGRAD':
147 | opt = tf.train.AdagradOptimizer(learning_rate)
148 | elif optimizer=='ADADELTA':
149 | opt = tf.train.AdadeltaOptimizer(learning_rate, rho=0.9, epsilon=1e-6)
150 | elif optimizer=='ADAM':
151 | opt = tf.train.AdamOptimizer(learning_rate, beta1=0.9, beta2=0.999, epsilon=0.1)
152 | elif optimizer=='RMSPROP':
153 | opt = tf.train.RMSPropOptimizer(learning_rate, decay=0.9, momentum=0.9, epsilon=1.0)
154 | elif optimizer=='MOM':
155 | opt = tf.train.MomentumOptimizer(learning_rate, 0.9, use_nesterov=True)
156 | else:
157 | raise ValueError('Invalid optimization algorithm')
158 |
159 | grads = opt.compute_gradients(total_loss, update_gradient_vars)
160 |
161 | # Apply gradients.
162 | apply_gradient_op = opt.apply_gradients(grads, global_step=global_step)
163 |
164 | # Add histograms for trainable variables.
165 | if log_histograms:
166 | for var in tf.trainable_variables():
167 | tf.summary.histogram(var.op.name, var)
168 |
169 | # Add histograms for gradients.
170 | if log_histograms:
171 | for grad, var in grads:
172 | if grad is not None:
173 | tf.summary.histogram(var.op.name + '/gradients', grad)
174 |
175 | # Track the moving averages of all trainable variables.
176 | variable_averages = tf.train.ExponentialMovingAverage(
177 | moving_average_decay, global_step)
178 | variables_averages_op = variable_averages.apply(tf.trainable_variables())
179 |
180 | with tf.control_dependencies([apply_gradient_op, variables_averages_op]):
181 | train_op = tf.no_op(name='train')
182 |
183 | return train_op
184 |
185 | def prewhiten(x):
186 | mean = np.mean(x)
187 | std = np.std(x)
188 | std_adj = np.maximum(std, 1.0/np.sqrt(x.size))
189 | y = np.multiply(np.subtract(x, mean), 1/std_adj)
190 | return y
191 |
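# Note (added for clarity): prewhiten standardises an image to roughly zero mean and unit
# variance, with the standard deviation floored at 1/sqrt(x.size) so near-constant images do
# not blow up; this is the per-image normalisation applied before feeding faces to FaceNet here.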
192 | def crop(image, random_crop, image_size):
193 | if image.shape[1]>image_size:
194 | sz1 = int(image.shape[1]//2)
195 | sz2 = int(image_size//2)
196 | if random_crop:
197 | diff = sz1-sz2
198 | (h, v) = (np.random.randint(-diff, diff+1), np.random.randint(-diff, diff+1))
199 | else:
200 | (h, v) = (0,0)
201 | image = image[(sz1-sz2+v):(sz1+sz2+v),(sz1-sz2+h):(sz1+sz2+h),:]
202 | return image
203 |
204 | def flip(image, random_flip):
205 | if random_flip and np.random.choice([True, False]):
206 | image = np.fliplr(image)
207 | return image
208 |
209 | def to_rgb(img):
210 | w, h = img.shape
211 | ret = np.empty((w, h, 3), dtype=np.uint8)
212 | ret[:, :, 0] = ret[:, :, 1] = ret[:, :, 2] = img
213 | return ret
214 |
215 | def load_data(image_paths, do_random_crop, do_random_flip, image_size, do_prewhiten=True):
216 | nrof_samples = len(image_paths)
217 | images = np.zeros((nrof_samples, image_size, image_size, 3))
218 | for i in range(nrof_samples):
219 | img = cv2.imread(image_paths[i])
220 | if img.ndim == 2:
221 | img = to_rgb(img)
222 | if do_prewhiten:
223 | img = prewhiten(img)
224 | img = crop(img, do_random_crop, image_size)
225 | img = flip(img, do_random_flip)
226 | images[i,:,:,:] = img
227 | return images
228 |
229 | def get_label_batch(label_data, batch_size, batch_index):
230 | nrof_examples = np.size(label_data, 0)
231 | j = batch_index*batch_size % nrof_examples
232 | if j+batch_size<=nrof_examples:
233 | batch = label_data[j:j+batch_size]
234 | else:
235 |         x1 = label_data[j:nrof_examples]
236 |         x2 = label_data[0:j+batch_size-nrof_examples]  # wrap around so the batch holds exactly batch_size labels
237 |         batch = np.concatenate([x1, x2])
238 | batch_int = batch.astype(np.int64)
239 | return batch_int
240 |
241 | def get_batch(image_data, batch_size, batch_index):
242 | nrof_examples = np.size(image_data, 0)
243 | j = batch_index*batch_size % nrof_examples
244 | if j+batch_size<=nrof_examples:
245 | batch = image_data[j:j+batch_size,:,:,:]
246 | else:
247 |         x1 = image_data[j:nrof_examples,:,:,:]
248 |         x2 = image_data[0:j+batch_size-nrof_examples,:,:,:]  # wrap around so the batch holds exactly batch_size images
249 |         batch = np.vstack([x1,x2])
250 | batch_float = batch.astype(np.float32)
251 | return batch_float
252 |
253 | def get_triplet_batch(triplets, batch_index, batch_size):
254 | ax, px, nx = triplets
255 | a = get_batch(ax, int(batch_size/3), batch_index)
256 | p = get_batch(px, int(batch_size/3), batch_index)
257 | n = get_batch(nx, int(batch_size/3), batch_index)
258 | batch = np.vstack([a, p, n])
259 | return batch
260 |
261 | def get_learning_rate_from_file(filename, epoch):
262 | with open(filename, 'r') as f:
263 | for line in f.readlines():
264 | line = line.split('#', 1)[0]
265 | if line:
266 | par = line.strip().split(':')
267 | e = int(par[0])
268 | if par[1]=='-':
269 | lr = -1
270 | else:
271 | lr = float(par[1])
272 | if e <= epoch:
273 | learning_rate = lr
274 | else:
275 | return learning_rate
276 |
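# Illustrative sketch (not part of the original source) of the schedule file format that
# get_learning_rate_from_file expects: one "epoch: value" pair per line, '#' starts a comment,
# and '-' maps to a learning rate of -1. For example:
#
#   # epoch: learning rate
#   0:   0.05
#   100: 0.005
#   200: 0.0005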
277 | class ImageClass():
278 | "Stores the paths to images for a given class"
279 | def __init__(self, name, image_paths):
280 | self.name = name
281 | self.image_paths = image_paths
282 |
283 | def __str__(self):
284 | return self.name + ', ' + str(len(self.image_paths)) + ' images'
285 |
286 | def __len__(self):
287 | return len(self.image_paths)
288 |
289 | def get_dataset(path, has_class_directories=True):
290 | dataset = []
291 | path_exp = os.path.expanduser(path)
292 | classes = [path for path in os.listdir(path_exp) \
293 | if os.path.isdir(os.path.join(path_exp, path))]
294 | classes.sort()
295 | nrof_classes = len(classes)
296 | for i in range(nrof_classes):
297 | class_name = classes[i]
298 | facedir = os.path.join(path_exp, class_name)
299 | image_paths = get_image_paths(facedir)
300 | dataset.append(ImageClass(class_name, image_paths))
301 |
302 | return dataset
303 |
304 | def get_image_paths(facedir):
305 | image_paths = []
306 | if os.path.isdir(facedir):
307 | images = os.listdir(facedir)
308 | image_paths = [os.path.join(facedir,img) for img in images]
309 | return image_paths
310 |
311 | def split_dataset(dataset, split_ratio, min_nrof_images_per_class, mode):
312 | if mode=='SPLIT_CLASSES':
313 | nrof_classes = len(dataset)
314 | class_indices = np.arange(nrof_classes)
315 | np.random.shuffle(class_indices)
316 | split = int(round(nrof_classes*(1-split_ratio)))
317 | train_set = [dataset[i] for i in class_indices[0:split]]
318 |         test_set = [dataset[i] for i in class_indices[split:]]
319 | elif mode=='SPLIT_IMAGES':
320 | train_set = []
321 | test_set = []
322 | for cls in dataset:
323 | paths = cls.image_paths
324 | np.random.shuffle(paths)
325 | nrof_images_in_class = len(paths)
326 | split = int(math.floor(nrof_images_in_class*(1-split_ratio)))
327 | if split==nrof_images_in_class:
328 | split = nrof_images_in_class-1
329 | if split>=min_nrof_images_per_class and nrof_images_in_class-split>=1:
330 | train_set.append(ImageClass(cls.name, paths[:split]))
331 | test_set.append(ImageClass(cls.name, paths[split:]))
332 | else:
333 | raise ValueError('Invalid train/test split mode "%s"' % mode)
334 | return train_set, test_set
335 |
336 | def load_model(model, input_map=None):
337 | # Check if the model is a model directory (containing a metagraph and a checkpoint file)
338 | # or if it is a protobuf file with a frozen graph
339 | model_exp = os.path.expanduser(model)
340 | if (os.path.isfile(model_exp)):
341 | print('Model filename: %s' % model_exp)
342 | with gfile.FastGFile(model_exp,'rb') as f:
343 | graph_def = tf.GraphDef()
344 | graph_def.ParseFromString(f.read())
345 | tf.import_graph_def(graph_def, input_map=input_map, name='')
346 | else:
347 | print('Model directory: %s' % model_exp)
348 | meta_file, ckpt_file = get_model_filenames(model_exp)
349 |
350 | print('Metagraph file: %s' % meta_file)
351 | print('Checkpoint file: %s' % ckpt_file)
352 |
353 | saver = tf.train.import_meta_graph(os.path.join(model_exp, meta_file), input_map=input_map)
354 | saver.restore(tf.get_default_session(), os.path.join(model_exp, ckpt_file))
355 |
356 | def get_model_filenames(model_dir):
357 | files = os.listdir(model_dir)
358 | meta_files = [s for s in files if s.endswith('.meta')]
359 | if len(meta_files)==0:
360 | raise ValueError('No meta file found in the model directory (%s)' % model_dir)
361 | elif len(meta_files)>1:
362 | raise ValueError('There should not be more than one meta file in the model directory (%s)' % model_dir)
363 | meta_file = meta_files[0]
364 | ckpt = tf.train.get_checkpoint_state(model_dir)
365 | if ckpt and ckpt.model_checkpoint_path:
366 | ckpt_file = os.path.basename(ckpt.model_checkpoint_path)
367 | return meta_file, ckpt_file
368 |
369 | meta_files = [s for s in files if '.ckpt' in s]
370 | max_step = -1
371 | for f in files:
372 |         step_str = re.match(r'(^model-[\w\- ]+\.ckpt-(\d+))', f)
373 | if step_str is not None and len(step_str.groups())>=2:
374 | step = int(step_str.groups()[1])
375 | if step > max_step:
376 | max_step = step
377 | ckpt_file = step_str.groups()[0]
378 | return meta_file, ckpt_file
379 |
380 | def distance(embeddings1, embeddings2, distance_metric=0):
381 | if distance_metric==0:
382 | # Euclidian distance
383 | diff = np.subtract(embeddings1, embeddings2)
384 | dist = np.sum(np.square(diff),1)
385 | elif distance_metric==1:
386 | # Distance based on cosine similarity
387 | dot = np.sum(np.multiply(embeddings1, embeddings2), axis=1)
388 | norm = np.linalg.norm(embeddings1, axis=1) * np.linalg.norm(embeddings2, axis=1)
389 | similarity = dot / norm
390 | dist = np.arccos(similarity) / math.pi
391 | else:
392 |         raise ValueError('Undefined distance metric %d' % distance_metric)
393 |
394 | return dist
395 |
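# Illustrative note (not part of the original source): with distance_metric=0 the result is the
# squared Euclidean distance between embedding pairs; with distance_metric=1 the angle between
# embeddings is mapped to [0, 1] via arccos(cosine_similarity)/pi. For example,
#   distance(np.array([[1.0, 0.0]]), np.array([[0.0, 1.0]]), distance_metric=1)  ->  [0.5]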
396 | def calculate_roc(thresholds, embeddings1, embeddings2, actual_issame, nrof_folds=10, distance_metric=0, subtract_mean=False):
397 | assert(embeddings1.shape[0] == embeddings2.shape[0])
398 | assert(embeddings1.shape[1] == embeddings2.shape[1])
399 | nrof_pairs = min(len(actual_issame), embeddings1.shape[0])
400 | nrof_thresholds = len(thresholds)
401 | k_fold = KFold(n_splits=nrof_folds, shuffle=False)
402 |
403 | tprs = np.zeros((nrof_folds,nrof_thresholds))
404 | fprs = np.zeros((nrof_folds,nrof_thresholds))
405 | accuracy = np.zeros((nrof_folds))
406 |
407 | indices = np.arange(nrof_pairs)
408 |
409 | for fold_idx, (train_set, test_set) in enumerate(k_fold.split(indices)):
410 | if subtract_mean:
411 | mean = np.mean(np.concatenate([embeddings1[train_set], embeddings2[train_set]]), axis=0)
412 | else:
413 | mean = 0.0
414 | dist = distance(embeddings1-mean, embeddings2-mean, distance_metric)
415 |
416 | # Find the best threshold for the fold
417 | acc_train = np.zeros((nrof_thresholds))
418 | for threshold_idx, threshold in enumerate(thresholds):
419 | _, _, acc_train[threshold_idx] = calculate_accuracy(threshold, dist[train_set], actual_issame[train_set])
420 | best_threshold_index = np.argmax(acc_train)
421 | for threshold_idx, threshold in enumerate(thresholds):
422 | tprs[fold_idx,threshold_idx], fprs[fold_idx,threshold_idx], _ = calculate_accuracy(threshold, dist[test_set], actual_issame[test_set])
423 | _, _, accuracy[fold_idx] = calculate_accuracy(thresholds[best_threshold_index], dist[test_set], actual_issame[test_set])
424 |
425 | tpr = np.mean(tprs,0)
426 | fpr = np.mean(fprs,0)
427 | return tpr, fpr, accuracy
428 |
429 | def calculate_accuracy(threshold, dist, actual_issame):
430 | predict_issame = np.less(dist, threshold)
431 | tp = np.sum(np.logical_and(predict_issame, actual_issame))
432 | fp = np.sum(np.logical_and(predict_issame, np.logical_not(actual_issame)))
433 | tn = np.sum(np.logical_and(np.logical_not(predict_issame), np.logical_not(actual_issame)))
434 | fn = np.sum(np.logical_and(np.logical_not(predict_issame), actual_issame))
435 |
436 | tpr = 0 if (tp+fn==0) else float(tp) / float(tp+fn)
437 | fpr = 0 if (fp+tn==0) else float(fp) / float(fp+tn)
438 | acc = float(tp+tn)/dist.size
439 | return tpr, fpr, acc
440 |
441 |
442 |
443 | def calculate_val(thresholds, embeddings1, embeddings2, actual_issame, far_target, nrof_folds=10, distance_metric=0, subtract_mean=False):
444 | assert(embeddings1.shape[0] == embeddings2.shape[0])
445 | assert(embeddings1.shape[1] == embeddings2.shape[1])
446 | nrof_pairs = min(len(actual_issame), embeddings1.shape[0])
447 | nrof_thresholds = len(thresholds)
448 | k_fold = KFold(n_splits=nrof_folds, shuffle=False)
449 |
450 | val = np.zeros(nrof_folds)
451 | far = np.zeros(nrof_folds)
452 |
453 | indices = np.arange(nrof_pairs)
454 |
455 | for fold_idx, (train_set, test_set) in enumerate(k_fold.split(indices)):
456 | if subtract_mean:
457 | mean = np.mean(np.concatenate([embeddings1[train_set], embeddings2[train_set]]), axis=0)
458 | else:
459 | mean = 0.0
460 | dist = distance(embeddings1-mean, embeddings2-mean, distance_metric)
461 |
462 | # Find the threshold that gives FAR = far_target
463 | far_train = np.zeros(nrof_thresholds)
464 | for threshold_idx, threshold in enumerate(thresholds):
465 | _, far_train[threshold_idx] = calculate_val_far(threshold, dist[train_set], actual_issame[train_set])
466 | if np.max(far_train)>=far_target:
467 | f = interpolate.interp1d(far_train, thresholds, kind='slinear')
468 | threshold = f(far_target)
469 | else:
470 | threshold = 0.0
471 |
472 | val[fold_idx], far[fold_idx] = calculate_val_far(threshold, dist[test_set], actual_issame[test_set])
473 |
474 | val_mean = np.mean(val)
475 | far_mean = np.mean(far)
476 | val_std = np.std(val)
477 | return val_mean, val_std, far_mean
478 |
479 |
480 | def calculate_val_far(threshold, dist, actual_issame):
481 | predict_issame = np.less(dist, threshold)
482 | true_accept = np.sum(np.logical_and(predict_issame, actual_issame))
483 | false_accept = np.sum(np.logical_and(predict_issame, np.logical_not(actual_issame)))
484 | n_same = np.sum(actual_issame)
485 | n_diff = np.sum(np.logical_not(actual_issame))
486 | val = float(true_accept) / float(n_same)
487 | far = float(false_accept) / float(n_diff)
488 | return val, far
489 |
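# Note (added for clarity): `val` is the fraction of genuine (same-identity) pairs accepted at the
# given threshold, while `far` is the fraction of impostor (different-identity) pairs wrongly
# accepted; calculate_val above searches for the threshold whose FAR matches far_target.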
490 | def store_revision_info(src_path, output_dir, arg_string):
491 | try:
492 | # Get git hash
493 | cmd = ['git', 'rev-parse', 'HEAD']
494 | gitproc = Popen(cmd, stdout = PIPE, cwd=src_path)
495 | (stdout, _) = gitproc.communicate()
496 | git_hash = stdout.strip()
497 | except OSError as e:
498 | git_hash = ' '.join(cmd) + ': ' + e.strerror
499 |
500 | try:
501 | # Get local changes
502 | cmd = ['git', 'diff', 'HEAD']
503 | gitproc = Popen(cmd, stdout = PIPE, cwd=src_path)
504 | (stdout, _) = gitproc.communicate()
505 | git_diff = stdout.strip()
506 | except OSError as e:
507 | git_diff = ' '.join(cmd) + ': ' + e.strerror
508 |
509 | # Store a text file in the log directory
510 | rev_info_filename = os.path.join(output_dir, 'revision_info.txt')
511 | with open(rev_info_filename, "w") as text_file:
512 | text_file.write('arguments: %s\n--------------------\n' % arg_string)
513 | text_file.write('tensorflow version: %s\n--------------------\n' % tf.__version__) # @UndefinedVariable
514 | text_file.write('git hash: %s\n--------------------\n' % git_hash)
515 | text_file.write('%s' % git_diff)
516 |
517 | def list_variables(filename):
518 | reader = training.NewCheckpointReader(filename)
519 | variable_map = reader.get_variable_to_shape_map()
520 | names = sorted(variable_map.keys())
521 | return names
522 |
523 | def put_images_on_grid(images, shape=(16,8)):
524 | nrof_images = images.shape[0]
525 | img_size = images.shape[1]
526 | bw = 3
527 | img = np.zeros((shape[1]*(img_size+bw)+bw, shape[0]*(img_size+bw)+bw, 3), np.float32)
528 | for i in range(shape[1]):
529 | x_start = i*(img_size+bw)+bw
530 | for j in range(shape[0]):
531 | img_index = i*shape[0]+j
532 | if img_index>=nrof_images:
533 | break
534 | y_start = j*(img_size+bw)+bw
535 | img[x_start:x_start+img_size, y_start:y_start+img_size, :] = images[img_index, :, :, :]
536 | if img_index>=nrof_images:
537 | break
538 | return img
539 |
540 | def write_arguments_to_file(args, filename):
541 | with open(filename, 'w') as f:
542 | for key, value in iteritems(vars(args)):
543 | f.write('%s: %s\n' % (key, str(value)))
544 |
--------------------------------------------------------------------------------
/final_sotware.py:
--------------------------------------------------------------------------------
1 | # /home/aashish/Documents/deep_learning/attendance_deep_learning
2 |
3 | import tensorflow as tf
4 | from scipy import misc
5 | import numpy as np
6 | import argparse
7 | import facenet
8 | import cv2
9 | import sys
10 | import os
11 | import math
12 | import pickle
13 | from sklearn.svm import SVC
14 | from PIL import Image
15 | from face_aligner import FaceAligner
16 | import detect_face
17 | from sheet import mark_present
18 | from mtcnn.mtcnn import MTCNN
19 |
20 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
21 |
22 |
23 |
24 |
25 |
26 |
27 | def dataset_creation(parameters):
28 | path1, webcam, face_dim, gpu, username, vid_path = parameters
29 | path = ""
30 | res = ()
31 | personNo = 1
32 | folder_name = ""
33 |
34 | path = path1
35 |
36 | if os.path.isdir(path):
37 | path += '/output'
38 | if os.path.isdir(path):
39 | print("Directory already exists. Using it \n")
40 | else:
41 |             os.makedirs(path)
42 |             print("Directory successfully made in: " + path + "\n")
43 |
44 | else:
45 | if path == "":
46 | print("Making an output folder in this directory only. \n")
47 | else:
48 | print("No such directory exists. Making an output folder in this current code directory only. \n")
49 |
50 | path = 'output'
51 | if os.path.isdir(path):
52 | print("Directory already exists. Using it \n")
53 | else:
54 | if os.makedirs(path):
55 | print("error in making directory. \n")
56 | sys.exit()
57 | else:
58 | print("Directory successfully made: " + path + "\n")
59 | detector = MTCNN()
60 | res = webcam
61 | if res == "":
62 | res = (640, 480)
63 | else:
64 | res = tuple(map(int, res.split('x')))
65 |
66 | gpu_fraction = gpu
67 | if gpu_fraction == "":
68 | gpu_fraction = 0.8
69 | else:
70 | gpu_fraction = round(float(gpu_fraction), 1)
71 |
72 | minsize = 20
73 | threshold = [0.6, 0.7, 0.7]
74 | factor = 0.7
75 |
76 | with tf.Graph().as_default():
77 | gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=gpu_fraction)
78 | sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False))
79 | with sess.as_default():
80 | pnet, rnet, onet = detect_face.create_mtcnn(sess, None)
81 |
82 | face_size = face_dim
83 | if face_size == "":
84 | face_size = (160, 160)
85 | print('default face size')
86 | else:
87 | face_size = tuple(map(int, face_size.split('x')))
88 | affine = FaceAligner(desiredLeftEye=(0.33, 0.33), desiredFaceWidth=face_size[0], desiredFaceHeight=face_size[1])
89 |
90 | while True:
91 | ask = username
92 | ask = ask.replace(" ", "_")
93 |
94 | if ask == "":
95 | folder_name = 'person' + str(personNo)
96 | else:
97 | folder_name = ask
98 |
99 | personNo += 1
100 | users_folder = path + "/" + folder_name
101 | image_no = 1
102 |
103 | if os.path.isdir(users_folder):
104 | print("Directory already exists. Using it \n")
105 | else:
106 | if os.makedirs(users_folder):
107 | print("error in making directory. \n")
108 | sys.exit()
109 | else:
110 | print("Directory successfully made: " + users_folder + "\n")
111 |
112 | data_type = vid_path
113 | loop_type = False
114 | total_frames = 0
115 |
116 | if data_type == "":
117 | data_type = 0
118 | loop_type = True
119 |
120 | # Initialize webcam or video
121 | device = cv2.VideoCapture(data_type)
122 |
123 | # If webcam set resolution
124 | if data_type == 0:
125 | device.set(3, res[0])
126 | device.set(4, res[1])
127 | else:
128 | # Finding total number of frames of video.
129 | total_frames = int(device.get(cv2.CAP_PROP_FRAME_COUNT))
130 | # Shutting down webcam variable
131 | loop_type = False
132 |
133 | # Start web cam or start video and start creating dataset by user.
134 | while loop_type or (total_frames > 0):
135 |
136 | # If video selected dec counter
137 | if loop_type == False:
138 | total_frames -= 1
139 |
140 | ret, image = device.read()
141 |
142 | # Run MTCNN and do face detection until 's' keyword is pressed
143 | if (cv2.waitKey(1) & 0xFF) == ord("s"):
144 |
145 | #bb, points = detect_face.detect_face(image, minsize, pnet, rnet, onet, threshold, factor)
146 | detect = detector.detect_faces(image)
147 | print(detect)
148 |
149 | # See if face is detected
150 | if detect:
151 | bb = detect[0]['box']
152 | points = detect[0]['keypoints']
153 | print(bb)
154 | x, y, w, h = bb
155 | aligned_image = image[y:y+h, x:x+w]
156 | #aligned_image = affine.align(image, points)
157 | image_name = users_folder + "/" + folder_name + "_" + str(image_no).zfill(4) + ".png"
158 | cv2.imwrite(image_name, aligned_image)
159 | image_no += 1
160 |
161 | '''
162 | for i in range(bb.shape[0]):
163 | cv2.rectangle(image, (int(bb[i][0]), int(bb[i][1])), (int(bb[i][2]), int(bb[i][3])), (0, 255, 0), 2)
164 |
165 | # loop over the (x, y)-coordinates for the facial landmarks
166 | # and draw each of them
167 | for col in range(points.shape[1]):
168 | for i in range(5):
169 | cv2.circle(image, (int(points[i][col]), int(points[i + 5][col])), 1, (0, 255, 0), -1)'''
170 |
171 | # Show the output video to user
172 | cv2.imshow("Output", image)
173 |
174 | # Break this loop if 'q' keyword pressed to go to next user.
175 | if (cv2.waitKey(0) & 0xFF) == ord("q"):
176 | device.release()
177 | cv2.destroyAllWindows()
178 | # break
179 |     # Signal successful dataset creation to the caller
180 |     return 1
181 |
182 | def train(parameters):
183 | path1, path2, batch, img_dim, gpu, svm_name, split_percent, split_data = parameters
184 |
185 | path = path1 # input("\nEnter the path to the face images directory inside which multiple user folders are present or press ENTER if the default created output folder is present in this code directory only: ")
186 | if path == "":
187 | path = 'output'
188 |
189 | gpu_fraction = gpu # input("\nEnter the gpu memory fraction u want to allocate out of 1 or press ENTER for default 0.8: ").rstrip().lstrip()
190 | if gpu_fraction == "":
191 | gpu_fraction = 0.8
192 | else:
193 | gpu_fraction = round(float(gpu_fraction), 1)
194 |
195 | model = path2 # input("\nEnter the FOLDER PATH inside which 20180402-114759 FOLDER is present. Press ENTER stating that the FOLDER 20180402-114759 is present in this code directory itself: ").rstrip().lstrip()
196 | if model == "":
197 | model = "20180402-114759/20180402-114759.pb"
198 | else:
199 | model += "/20180402-114759/20180402-114759.pb"
200 |
201 | batch_size = 90
202 | ask = batch # input("\nEnter the batch size of images to process at once OR press ENTER for default 90: ").rstrip().lstrip()
203 | if ask != "":
204 | batch_size = int(ask)
205 |
206 | image_size = 160
207 | ask = img_dim # input("\nEnter the width_size of face images OR press ENTER for default 160: ").rstrip().lstrip()
208 | if ask != "":
209 | image_size = int(ask)
210 |
211 | classifier_filename = svm_name # input("Enter the output SVM classifier filename OR press ENTER for default name= classifier: ")
212 | if classifier_filename == "":
213 | classifier_filename = 'classifier.pkl'
214 | else:
215 | classifier_filename += '.pkl'
216 | classifier_filename = os.path.expanduser(classifier_filename)
217 |
218 | split_dataset = split_data # input("\nPress Y if you want to split the dataset for Training and Testing: ").rstrip().lstrip().lower()
219 |
220 | # If yes ask for the percentage of training and testing division.
221 | percentage = 70
222 | if split_dataset == 'y':
223 | ask = split_percent # input("\nEnter the percentage of training dataset for splitting OR press ENTER for default 70: ").rstrip().lstrip()
224 | if ask != "":
225 | percentage = float(ask)
226 |
227 | min_nrof_images_per_class = 0
228 |
229 | dataset = facenet.get_dataset(path)
230 | train_set = []
231 | test_set = []
232 |
233 | if split_dataset == 'y':
234 | for cls in dataset:
235 | paths = cls.image_paths
236 | # Remove classes with less than min_nrof_images_per_class
237 | if len(paths) >= min_nrof_images_per_class:
238 | np.random.shuffle(paths)
239 |
240 | # Find the number of images in training set and testing set images for this class
241 | no_train_images = int(percentage * len(paths) * 0.01)
242 |
243 | train_set.append(facenet.ImageClass(cls.name, paths[:no_train_images]))
244 | test_set.append(facenet.ImageClass(cls.name, paths[no_train_images:]))
245 |
246 |
247 | paths_train = []
248 | labels_train = []
249 | paths_test = []
250 | labels_test = []
251 | emb_array = []
252 | class_names = []
253 |
254 | if split_dataset == 'y':
255 | paths_train, labels_train = facenet.get_image_paths_and_labels(train_set)
256 | paths_test, labels_test = facenet.get_image_paths_and_labels(test_set)
257 | print('\nNumber of classes: %d' % len(train_set))
258 | print('\nNumber of images in TRAIN set: %d' % len(paths_train))
259 | print('\nNumber of images in TEST set: %d' % len(paths_test))
260 | else:
261 | paths_train, labels_train = facenet.get_image_paths_and_labels(dataset)
262 | print('\nNumber of classes: %d' % len(dataset))
263 | print('\nNumber of images: %d' % len(paths_train))
264 |
265 | # Find embedding
266 | emb_array = get_embeddings(model, paths_train, batch_size, image_size, gpu_fraction)
267 |
268 | # Train the classifier
269 | print('\nTraining classifier')
270 | model_svc = SVC(kernel='linear', probability=True)
271 | model_svc.fit(emb_array, labels_train)
272 |
273 | # Create a list of class names
274 | if split_dataset == 'y':
275 | class_names = [cls.name.replace('_', ' ') for cls in train_set]
276 | else:
277 | class_names = [cls.name.replace('_', ' ') for cls in dataset]
278 |
279 | # Saving classifier model
280 | with open(classifier_filename, 'wb') as outfile:
281 | pickle.dump((model_svc, class_names), outfile)
282 |
283 | print('\nSaved classifier model to file: "%s"' % classifier_filename)
284 |
285 | if split_dataset == 'y':
286 | # Find embedding for test data
287 | emb_array = get_embeddings(model, paths_test, batch_size, image_size, gpu_fraction)
288 |
289 | # Call test on the test set.
290 | parameters = '', '', '', '', '', gpu_fraction
291 | test(parameters, classifier_filename, emb_array, labels_test, model, batch_size, image_size)
292 |
293 |     # Signal successful training to the caller
294 |     return 1
295 |
296 |
297 | def test(parameters, classifier_filename="", emb_array=[], labels_test=[], model="", batch_size=0, image_size=0):
298 | path1, path2, path3, batch_size, img_dim, gpu = parameters
299 |
300 | if classifier_filename == "":
301 | classifier_filename = path1 # input("\nEnter the path of the classifier .pkl file or press ENTER if a filename classifier.pkl is present in this code directory itself: ")
302 | if classifier_filename == "":
303 | classifier_filename = 'classifier.pkl'
304 | classifier_filename = os.path.expanduser(classifier_filename)
305 |
306 | gpu_fraction = gpu # input("\nEnter the gpu memory fraction u want to allocate out of 1 or press ENTER for default 0.8: ").rstrip().lstrip()
307 | if gpu_fraction == "":
308 | gpu_fraction = 0.8
309 | else:
310 | gpu_fraction = round(float(gpu_fraction), 1)
311 |
312 | if model == "":
313 | model = path2 # input("\nEnter the FOLDER PATH inside which 20180402-114759 FOLDER is present. Press ENTER stating that the FOLDER 20180402-114759 is present in this code directory itself: ").rstrip()
314 | if model == "":
315 | model = "20180402-114759/20180402-114759.pb"
316 |
317 | if batch_size == 0 or batch_size == '':
318 | ask = batch_size # input("\nEnter the batch size of images to process at once OR press ENTER for default 90: ").rstrip().lstrip()
319 | if ask == "":
320 | batch_size = 90
321 | else:
322 | batch_size = int(ask)
323 |
324 | if image_size == 0:
325 | ask = img_dim # input("\nEnter the width_size of face images OR press ENTER for default 160: ").rstrip().lstrip()
326 | if ask == "":
327 | image_size = 160
328 | else:
329 | image_size = int(ask)
330 |
331 | if labels_test == []:
332 | path = path3 # input("\nEnter the path to the face images directory inside which multiple user folders are present or press ENTER if the default created output folder is present in this code directory only: ")
333 | if path == "":
334 | path = 'output'
335 | dataset = facenet.get_dataset(path)
336 | paths, labels_test = facenet.get_image_paths_and_labels(dataset)
337 | print('\nNumber of classes to test: %d' % len(dataset))
338 | print('\nNumber of images to test: %d' % len(paths))
339 | # Generate embeddings of these paths
340 | emb_array = get_embeddings(model, paths, batch_size, image_size, gpu_fraction)
341 |
342 | # Classify images
343 | print('\nTesting classifier')
344 | with open(classifier_filename, 'rb') as infile:
345 | (modelSVM, class_names) = pickle.load(infile)
346 |
347 | print('\nLoaded classifier model from file "%s"' % classifier_filename)
348 |
349 | predictions = modelSVM.predict_proba(emb_array)
350 | best_class_indices = np.argmax(predictions, axis=1)
351 | best_class_probabilities = predictions[np.arange(len(best_class_indices)), best_class_indices]
352 |
353 | for i in range(len(best_class_indices)):
354 | print('%4d %s: %.3f' % (i, class_names[best_class_indices[i]], best_class_probabilities[i]))
355 |
356 | accuracy = np.mean(np.equal(best_class_indices, labels_test))
357 | print('\nAccuracy: %.3f' % accuracy)
358 |
359 |
360 | def get_embeddings(model, paths, batch_size, image_size, gpu_fraction):
361 | # initializing the facenet tensorflow model
362 | gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=gpu_fraction)
363 | with tf.Graph().as_default():
364 | with tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False)) as sess:
365 | # Load the model
366 | print('\nLoading feature extraction model')
367 | facenet.load_model(model)
368 |
369 | # Get input and output tensors
370 | images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0")
371 | embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0")
372 | phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0")
373 | embedding_size = embeddings.get_shape()[1]
374 |
375 | # Run forward pass to calculate embeddings
376 | print('Calculating features for images')
377 | nrof_images = len(paths)
378 | nrof_batches_per_epoch = int(math.ceil(1.0 * nrof_images / batch_size))
379 | emb_array = np.zeros((nrof_images, embedding_size))
380 |
381 | for i in range(nrof_batches_per_epoch):
382 | start_index = i * batch_size
383 | end_index = min((i + 1) * batch_size, nrof_images)
384 | paths_batch = paths[start_index:end_index]
385 | # print(paths_batch)
386 |
387 | # Does random crop, prewhitening and flipping.
388 | images = facenet.load_data(paths_batch, False, False, image_size)
389 |
390 | # Get the embeddings
391 | feed_dict = {images_placeholder: images, phase_train_placeholder: False}
392 | emb_array[start_index:end_index, :] = sess.run(embeddings, feed_dict=feed_dict)
393 |
394 | return emb_array
395 |
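# Note (added for clarity): with the 20180402-114759 FaceNet model assumed throughout this
# project, each row of emb_array is a 512-dimensional L2-normalised face embedding, and rows
# are aligned with the order of `paths`.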
396 |
397 | def recognize(mode, parameters):
398 | print(parameters)
399 | path1, path2, face_dim, gpu, thresh1, thresh2, resolution, img_path, out_img_path, vid_path, vid_save, vid_see = parameters
400 | st_name = ''
401 |     # Taking the parameters for recognition, supplied by the user
402 | if path1:
403 | classifier_filename = path1 # input("\nEnter the path of the classifier .pkl file or press ENTER if a filename 'classifier.pkl' is present in this code directory itself: ")
404 | else:
405 | classifier_filename = 'classifier.pkl'
406 | classifier_filename = os.path.expanduser(classifier_filename)
407 |
408 | if path2:
409 | model = path2 # input("\nEnter the FOLDER PATH inside which 20180402-114759 FOLDER is present. Press ENTER stating that the FOLDER 20180402-114759 is present in this code directory itself: ").rstrip()
410 | else:
411 | model = "20180402-114759/20180402-114759.pb"
412 |
413 | # Create an object of face aligner module
414 | image_size = (160, 160)
415 | if face_dim:
416 | ask = face_dim # input("\nEnter desired face width and height in WidthxHeight format for face aligner to take OR press ENTER for default 160x160 pixel: ").rstrip().lower()
417 | image_size = tuple(map(int, ask.split('x')))
418 |
419 | # Take gpu fraction values
420 | if gpu:
421 | gpu_fraction = gpu # input("\nEnter the gpu memory fraction u want to allocate out of 1 or press ENTER for default 0.8: ").rstrip()
422 | gpu_fraction = round(float(gpu_fraction), 1)
423 |
424 | else:
425 | gpu_fraction = 0.8
426 |
427 | # input_type = input("\nPress I for image input OR\nPress V for video input OR\nPress W for webcam input OR\nPress ENTER for default webcam: ").lstrip().rstrip().lower()
428 | # if input_type == "":
429 | # input_type = 'w'
430 | input_type = mode
431 |
432 | # Load the face aligner model
433 | affine = FaceAligner(desiredLeftEye=(0.33, 0.33), desiredFaceWidth=image_size[0], desiredFaceHeight=image_size[1])
434 |
435 |     # Building separate graphs for the two TensorFlow architectures
436 | g1 = tf.Graph()
437 | g2 = tf.Graph()
438 |
439 | # Load the model for FaceNet image recognition
440 | with g1.as_default():
441 | gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=gpu_fraction)
442 | sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False))
443 |         with sess.as_default():
444 | facenet.load_model(model)
445 |
446 | # Load the model of MTCNN face detection.
447 | with g2.as_default():
448 | gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=gpu_fraction)
449 | sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False))
450 | with sess.as_default():
451 | pnet, rnet, onet = detect_face.create_mtcnn(sess, None)
452 |
453 | # Some MTCNN network parameters
454 | minsize = 20 # minimum size of face
455 |     threshold = [0.6, 0.7, 0.8]  # thresholds for the three MTCNN stages
456 | factor = 0.709 # scale factor
457 | if thresh1:
458 | ask = thresh1 # input("\nEnter the threshold FACE DETECTION CONFIDENCE SCORE to consider detection by MTCNN OR press ENTER for default 0.80: ")
459 | if ask != "" and float(ask) < 1:
460 | threshold[2] = round(float(ask), 2)
461 |
462 | classifier_threshold = 0.50
463 | if thresh2:
464 | ask = thresh2 # input("\nEnter the threshold FACE RECOGNITION CONFIDENCE SCORE to consider face is recognised OR press ENTER for default 0.50: ")
465 | if ask != "":
466 | classifier_threshold = float(ask)
467 |
468 | # Loading the classifier model
469 | with open(classifier_filename, 'rb') as infile:
470 | (modelSVM, class_names) = pickle.load(infile)
471 |
472 | # helper variables
473 | image = []
474 | device = []
475 | display_output = True
476 |
477 | # Webcam variables
478 | loop_type = False
479 | res = (640, 480)
480 |
481 | # Video input variables
482 | total_frames = 0
483 | save_video = False
484 | frame_no = 1
485 | output_video = []
486 |
487 | # image input type variables
488 | save_images = False
489 | image_folder = ""
490 | out_img_folder = ""
491 | imageNo = 1
492 | image_list = []
493 | image_name = ""
494 |
495 | # If web cam is selected
496 | if input_type == "w":
497 | data_type = 0
498 | loop_type = True
499 | # Ask for webcam resolution
500 | if resolution:
501 | ask = resolution # input("\nEnter your webcam SUPPORTED resolution for face detection. For eg. 640x480 OR press ENTER for default 640x480: ").rstrip().lower()
502 | if ask != "":
503 | res = tuple(map(int, ask.split('x')))
504 |
505 | # If image selected, go to image function.
506 | elif input_type == "i":
507 |
508 | # Create a list of images inside the given folder
509 | if img_path:
510 | image_folder = img_path # input("\nWrite the folder path inside which images are kept: ").rstrip().lstrip()
511 | for img in os.listdir(image_folder):
512 | image_list.append(img)
513 | total_frames = len(image_list)
514 |
515 | path = 'y' # vid_save #input("\nIf you want to save the output images to a folder press Y OR press ENTER to ignore it: ").lstrip().rstrip().lower()
516 |
517 | if path == "y":
518 | save_images = True
519 | path = out_img_path # input("\nEnter the location of output folder OR press ENTER to default create an output_images directory here only: ").lstrip().rstrip()
520 | if os.path.isdir(path) or path == "":
521 | # User given path is present.
522 | if path == "":
523 | path = "output_images"
524 | else:
525 | path += '/output_images'
526 | if os.path.isdir(path):
527 | print("Directory already exists. Using it \n")
528 | else:
529 |                     os.makedirs(path)
530 |                     print("Directory successfully made in: " + path + "\n")
531 | else:
532 | print("Error image folder path. Exiting")
533 | sys.exit()
534 | out_img_folder = path + "/"
535 |
536 |
537 | # Video is selected
538 | else:
539 | data_type = vid_path # input("\nWrite the video path file to open: ").rstrip().lstrip()
540 | ask = vid_save # input("\nPress y to save the output video OR simply press ENTER to ignore it: ").lstrip().rstrip().lower()
541 | if ask == "y":
542 | save_video = True
543 |
544 | if input_type != "w":
545 | ask = vid_see # input("\nSimply press ENTER to see the output video OR press N to switch off the display: ").lstrip().rstrip().lower()
546 | if ask != "y":
547 | display_output = False
548 |
549 | # Initialize webcam or video if no image format
550 | if input_type != "i":
551 | device = cv2.VideoCapture(data_type)
552 |
553 | # If webcam set resolution
554 | if input_type == "w":
555 | device.set(3, res[0])
556 | device.set(4, res[1])
557 |
558 | elif input_type == "v":
559 | # Finding total number of frames of video.
560 | total_frames = int(device.get(cv2.CAP_PROP_FRAME_COUNT))
561 | # save video feature.
562 | if save_video:
563 | # Finding the file format, size and the fps rate
564 | fps = device.get(cv2.CAP_PROP_FPS)
565 | video_format = int(device.get(cv2.CAP_PROP_FOURCC))
566 | frame_size = (int(device.get(cv2.CAP_PROP_FRAME_WIDTH)), int(device.get(cv2.CAP_PROP_FRAME_HEIGHT)))
567 | # Creating video writer to save the video after process if needed
568 |                 output_video = cv2.VideoWriter("Output_" + os.path.basename(str(data_type)), video_format, fps, frame_size)  # save in the working directory instead of a machine-specific path
569 |
570 | # Start web cam or start video and start creating dataset by user.
571 | while loop_type or (frame_no <= total_frames):
572 |
573 | if input_type == "i":
574 | image = cv2.imread(image_folder + "/" + image_list[frame_no - 1])
575 | else:
576 | ret, image = device.read()
577 |
578 | # Run MTCNN model to detect faces
579 | g2.as_default()
580 | with tf.Session(graph=g2) as sess:
581 | # we get the bounding boxes as well as the points for the face
582 | frame = image
583 | #/home/ml/Documents/attendance_dl/dataset/test.mp4
584 | image = cv2.resize(image, (800, 600))
585 |
586 |
587 | hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
588 | value = 0
589 | h, s, v = cv2.split(hsv)
590 | v -= value
591 | #h -= value
592 | image = cv2.merge((h, s, v))
593 |             image = cv2.cvtColor(image, cv2.COLOR_HSV2BGR)
594 |
595 | #image = noisy('speckle', image)
596 | image = np.asarray(image, dtype = 'uint8')
597 |
598 | bb, points = detect_face.detect_face(image, minsize, pnet, rnet, onet, threshold, factor)
599 |
600 | # See if face is detected
601 | if bb.shape[0] > 0:
602 |
603 | # ALIGNMENT - use the bounding boxes and facial landmarks points to align images
604 |
605 | # create a numpy array to feed the network
606 | img_list = []
607 | images = np.empty([bb.shape[0], image.shape[0], image.shape[1]])
608 |
609 | for col in range(points.shape[1]):
610 | aligned_image = affine.align(image, points[:, col])
611 | print(aligned_image)
612 | print("\n" + str(len(aligned_image)))
613 |
614 | # Prewhiten the image for facenet architecture to give better results
615 | mean = np.mean(aligned_image)
616 | std = np.std(aligned_image)
617 | std_adj = np.maximum(std, 1.0 / np.sqrt(aligned_image.size))
618 | ready_image = np.multiply(np.subtract(aligned_image, mean), 1 / std_adj)
619 | img_list.append(ready_image)
620 | images = np.stack(img_list)
621 |
622 | # EMBEDDINGS: Use the processed aligned images for Facenet embeddings
623 |
624 | g1.as_default()
625 | with tf.Session(graph=g1) as sess:
626 | # Run forward pass on FaceNet to get the embeddings
627 | images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0")
628 | embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0")
629 | phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0")
630 | feed_dict = {images_placeholder: images, phase_train_placeholder: False}
631 | embedding = sess.run(embeddings, feed_dict=feed_dict)
632 |
633 | # PREDICTION: use the classifier to predict the most likely class (person).
634 | predictions = modelSVM.predict_proba(embedding)
635 | best_class_indices = np.argmax(predictions, axis=1)
636 | best_class_probabilities = predictions[np.arange(len(best_class_indices)), best_class_indices]
637 |
638 | # DRAW: draw bounding boxes, landmarks and predicted names
639 |
640 | if save_video or display_output or save_images:
641 | for i in range(bb.shape[0]):
642 | cv2.rectangle(image, (int(bb[i][0]), int(bb[i][1])), (int(bb[i][2]), int(bb[i][3])), (0, 255, 0), 1)
643 |
644 | # Put name and probability of detection only if given threshold is crossed
645 | if best_class_probabilities[i] > classifier_threshold:
646 | cv2.putText(image, class_names[best_class_indices[i]], (int(bb[i][0] + 1), int(bb[i][1]) + 10), cv2.FONT_HERSHEY_COMPLEX_SMALL, 0.6, (255, 255, 0), 1, cv2.LINE_AA)
647 | print(class_names[best_class_indices[i]])
648 | st_name += ','
649 | st_name += class_names[best_class_indices[i]]
650 | mark_present(st_name)
651 | #cv2.waitKey(0)
652 | #cv2.putText(image, str(round(best_class_probabilities[i] * 100, 2)) + "%", (int(bb[i][0]), int(bb[i][3]) + 7), cv2.FONT_HERSHEY_COMPLEX_SMALL, 1, (0, 255, 255), 1, cv2.LINE_AA)
653 |
654 | # loop over the (x, y)-coordinates for the facial landmarks
655 | for col in range(points.shape[1]):
656 | for i in range(5):
657 | cv2.circle(image, (int(points[i][col]), int(points[i + 5][col])), 1, (0, 255, 0), 1)
658 |
659 | if display_output:
660 | cv2.imshow("Output", image)
661 | if save_video:
662 | output_video.write(image)
663 | if save_images:
664 | output_name = out_img_folder + image_list[frame_no - 1]
665 | # Just taking the initial name of the input image and save in jpg which opencv supports for sure
666 | # output_name = out_img_folder + image_list[frame_no-1].split(".")[0] + ".jpg"
667 | cv2.imwrite(output_name, image)
668 |
669 | # If video or images selected dec counter
670 | if loop_type == False:
671 | # Display the progress
672 | print("\nProgress: %.2f" % (100 * frame_no / total_frames) + "%")
673 | frame_no += 1
674 |
675 | # if the `q` key was pressed, break from the loop
676 |         if (cv2.waitKey(1) & 0xFF) == ord('q'):
677 | # do a bit of cleanup
678 | if save_video:
679 | output_video.release()
680 | device.release()
681 | cv2.destroyAllWindows()
682 | break
683 |
684 | return st_name
685 |
686 |
687 |
688 |
689 |
690 | if __name__ == '__main__':
691 |     print('Run user_interface.py to use this module.')  # no standalone main() is defined here
692 |
693 |
694 |
695 |
--------------------------------------------------------------------------------
/images/image2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aashishrai3799/Automated-Attendance-System-using-CNN/ed8ce04100558c91288a1faa5aba7609233701dd/images/image2.png
--------------------------------------------------------------------------------
/images/image3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aashishrai3799/Automated-Attendance-System-using-CNN/ed8ce04100558c91288a1faa5aba7609233701dd/images/image3.png
--------------------------------------------------------------------------------
/images/image4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aashishrai3799/Automated-Attendance-System-using-CNN/ed8ce04100558c91288a1faa5aba7609233701dd/images/image4.png
--------------------------------------------------------------------------------
/images/image5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aashishrai3799/Automated-Attendance-System-using-CNN/ed8ce04100558c91288a1faa5aba7609233701dd/images/image5.png
--------------------------------------------------------------------------------
/images/images.txt:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/sheet.py:
--------------------------------------------------------------------------------
1 | import xlwt
2 | import xlrd
3 | from xlutils.copy import copy
4 | import os
5 | import datetime
6 | import xlsxwriter
7 |
8 | st_name = 'Aashish'
9 | def mark_present(st_name):
10 |
11 | names = os.listdir('output/')
12 | print(names)
13 |
14 | sub = 'SAMPLE'
15 |
16 | if not os.path.exists('attendance/' + sub + '.xlsx'):
17 | count = 2
18 | workbook = xlsxwriter.Workbook('attendance/' + sub + '.xlsx')
19 | print("Creating Spreadsheet with Title: " + sub)
20 | sheet = workbook.add_worksheet()
21 | for i in names:
22 | sheet.write(count, 0, i)
23 | count += 1
24 | workbook.close()
25 |
26 | rb = xlrd.open_workbook('attendance/' + sub + '.xlsx')
27 | wb = copy(rb)
28 | sheet = wb.get_sheet(0)
29 | sheet.write(1,1,str(datetime.datetime.now()))
30 |
31 |
32 | count = 2
33 | for i in names:
34 | if i in st_name:
35 | sheet.write(count, 1, 'P')
36 | else:
37 | sheet.write(count, 1, 'A')
38 | sheet.write(count, 0, i)
39 | count += 1
40 |
41 | wb.save('attendance/' + sub + '.xlsx')
42 |
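# Sketch of the resulting sheet layout (derived from the code above, 0-based xlwt indices):
#   cell (1, 1)            -> timestamp of this attendance run
#   rows 2..N, column 0    -> one student folder name per row
#   rows 2..N, column 1    -> 'P' if the name occurs in st_name, otherwise 'A'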
43 |
44 | if __name__ == '__main__':
45 |     mark_present(st_name)  # manual test; avoids writing the sheet on import from final_sotware.py
46 |
--------------------------------------------------------------------------------
/user_interface.py:
--------------------------------------------------------------------------------
1 | import tkinter as tk
2 | from tkinter import *
3 | import tkinter
4 | from tkinter import filedialog
5 | from tkinter import ttk, StringVar, IntVar
6 | from PIL import ImageTk, Image
7 | from tkinter import messagebox
8 | from PIL import Image
9 | import final_sotware
10 | import xlwt
11 | from xlwt import Workbook
12 |
13 |
14 | def s_exit():
15 | exit(0)
16 |
17 |
18 | def putwindow():
19 |
20 | window = Tk()
21 | window.geometry("800x500")
22 | #window.configure(background='')
23 | window.title("Attendance System")
24 | #window.geometry("800x500")
25 | tkinter.Label(window, text = "^^ WELCOME TO FACE DETECTION AND RECOGNITION SOFTWARE ^^", fg = "black", bg = "darkorange").pack(fill = "x")
26 | tkinter.Label(window, text = '- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -', bg = 'orange').pack(fill = 'x')
27 | tkinter.Label(window, text = '- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -', bg = 'orange').pack(fill = 'x')
28 |
29 | tkinter.Label(window, text = "GUIDELINES TO USE THIS SOFTWARE", fg = "black", bg = "salmon1").pack(fill = "x")
30 |
31 | tkinter.Label(window, text = " ").pack(fill = 'y')
32 |
33 | tkinter.Label(window, text = ": \n\n"
34 |                             "This software allows the user to:\n\n"
35 | "1) CREATE DATASET using MTCNN face detection and alignment\n"
36 | "2) TRAIN FaceNet for face recognition \n"
37 | "3) Do both \n\n\n "
38 | , fg = "black", bg = "aquamarine").pack(fill = "y")
39 |
40 | tkinter.Label(window, text = "\n\n").pack(fill = 'y')
41 |
42 |
43 | tkinter.Label(window, text = ": \n\n"
44 |                             "You will be asked, at several steps, to choose between the webcam (the default) or a \n"
45 |                             "video file for face detection, and for an output folder, a username for that folder, \n"
46 |                             "image files, and so on (defaults exist for these as well). \n\n\n "
47 |                             "************** IMPORTANT *************\n\n"
48 |                             "1) Whenever the webcam or video starts, press the 's' key to detect faces in the frames \n"
49 |                             "   and save them to the current user's folder. Saving stops as soon as you \n"
50 |                             "   release the 's' key; you can repeat this as often as needed. \n\n"
51 |                             "2) Press 'q' to close the window when you are done with one person and want to capture another.\n\n"
52 |                             "3) Make sure you press the keys on the image window, not on the terminal window. \n"
53 | , fg = "black", bg = "gray").pack(fill = "y")
54 |
55 | def cont_inue():
56 | window.destroy()
57 | show()
58 |
59 | btn1 = tkinter.Button(window, text = "CONTINUE", fg = "black", bg = 'turquoise1', command = cont_inue)
60 | btn1.place(x=360, y=450, width=80)
61 |
62 |
63 |
64 |
65 | window.mainloop()
66 |
67 |
68 | def show():
69 | #putwindow.window.destroy()
70 | window2 = Tk()
71 | window2.title("Attendance System")
72 | window2.geometry("800x500")
73 | tkinter.Label(window2, text = "^^ WELCOME TO FACE DETECTION AND RECOGNITION SOFTWARE ^^", fg="black", bg="darkorange").pack(fill="x")
74 | tkinter.Label(window2, text = '- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -', bg='orange').pack(fill='x')
75 | tkinter.Label(window2, text = '- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -', bg='orange').pack(fill='x')
76 |
77 | #tkinter.Label(window2, text="TEST", fg="lightblue", bg="gray").pack(fill="x")
78 | tkinter.Label(window2, text = "\n\n ").pack(fill = 'y')
79 |
80 |
81 |     tkinter.Label(window2, text = "Click 'TRAIN' to TRAIN (and maybe TEST later) by building a classifier on the FaceNet model.\n\n"
82 |                              "Click 'TEST' to TEST on a previously created MTCNN dataset by loading an already trained \n"
83 |                              "FaceNet classifier model. \n\n"
84 |                              "Click 'CREATE' to first create a dataset and (optionally) train on it later. \n\n"
85 |                              "Click 'RUN' to recognise faces by loading a classifier model and using the webcam, a user-given video, \n"
86 |                              "OR a given set of images (a save option is also available). \n",
87 | fg = 'blue', bg = 'pink').pack(fill = 'y')
88 |
89 |
90 | bottom_frame = tkinter.Frame(window2).pack(side = "bottom")
91 |
92 |
93 |
94 | def train():
95 | print('train')
96 | window2.destroy()
97 | show_train()
98 |
99 | def test():
100 | print('test')
101 | window2.destroy()
102 | show_test()
103 |
104 | def create():
105 | print('create')
106 | window2.destroy()
107 | show_create()
108 |
109 | def run():
110 | print('run')
111 | window2.destroy()
112 | show_run()
113 |
114 |
115 | btn1 = tkinter.Button(bottom_frame, text = "TRAIN", fg = "black", bg = 'turquoise1', command = train)
116 | btn1.place(x=230, y=350, width=50)
117 |
118 | btn2 = tkinter.Button(bottom_frame, text = "TEST", fg = "black", bg = 'turquoise1', command = test)
119 | btn2.place(x=330, y=350, width=50)
120 |
121 | btn3 = tkinter.Button(bottom_frame, text = "CREATE", fg = "black", bg = 'turquoise1', command = create)
122 | btn3.place(x=430, y=350, width=50)
123 |
124 | btn3 = tkinter.Button(bottom_frame, text = "RUN", fg = "black", bg = 'turquoise1', command = run)
125 | btn3.place(x=530, y=350, width=50)
126 |
127 | btn4 = tkinter.Button(bottom_frame, text = "EXIT", fg = "red", bg = 'indianred1', command = s_exit)
128 | btn4.place(x=370, y=450, width=60)
129 |
130 | window2.mainloop()
131 |
132 |
133 |
134 | def show_run():
135 |
136 | window3 = Tk()
137 | #window3.configure(background='lightyellow')
138 | window3.title("Attendance System")
139 | window3.geometry("1200x800")
140 | tkinter.Label(window3, text = "^^ WELCOME TO FACE DETECTION AND RECOGNITION SOFTWARE ^^", fg="black", bg="darkorange").pack(fill="x")
141 | tkinter.Label(window3, text = '- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -', bg='orange').pack(fill='x')
142 | tkinter.Label(window3, text="\n\n ").pack(fill='y')
143 |
144 | #path1 = tk.StringVar()
145 |
146 | tkinter.Label(window3, text = "Enter the path to classifier.pkl").place(x=50, y=50, width=250)
147 | path1 = tkinter.Entry(window3)
148 | path1.place(x=60, y=70, width=400)
149 |
150 | tkinter.Label(window3, text = "Enter the path to 20180402-114759 FOLDER").place(x=50, y=100, width=300)
151 | path2 = tkinter.Entry(window3)
152 | path2.place(x=60, y=120, width=400)
153 |
154 | tkinter.Label(window3, text = "Enter desired face width and height for face aligner (WidthxHeight format)").place(x=50, y=150, width=530)
155 | face_dim = tkinter.Entry(window3)
156 | face_dim.place(x=60, y=170, width=400)
157 |
158 |     tkinter.Label(window3, text = "Enter the GPU memory fraction you want to allocate (out of 1)").place(x=50, y=200, width=420)
159 | gpu = tkinter.Entry(window3)
160 | gpu.place(x=60, y=220, width=400)
161 |
162 | tkinter.Label(window3, text = "Enter the threshold to consider detection by MTCNN").place(x=50, y=250, width=380)
163 | thresh1 = tkinter.Entry(window3)
164 | thresh1.place(x=60, y=270, width=400)
165 |
166 | tkinter.Label(window3, text = "Enter the threshold to consider face is recognised").place(x=50, y=300, width=380)
167 | thresh2 = tkinter.Entry(window3)
168 | thresh2.place(x=60, y=320, width=400)
169 |
170 |     tkinter.Label(window3, text="Default values will be assigned to empty fields", fg="navy", bg="lightblue").place(x=700, y=200, width=380)
171 | tkinter.Label(window3, bg = 'orange')
172 | #Label.place(y = 350, width = 1400)
173 |
174 | rdbtn1 = IntVar()
175 | rdbtn2 = IntVar()
176 | rdbtn3 = IntVar()
177 |
178 | rdbi = tkinter.Checkbutton(window3, text="Input: Image", variable = rdbtn3, fg="blue", bg="cyan")
179 | rdbi.place(x=60, y=400, width=200)
180 |
181 | #tkinter.Label(window3, text="Input: Image", fg="blue", bg="cyan").place(x=60, y=400, width=200)
182 | tkinter.Label(window3, text = "Enter the folder path inside which images are kept").place(x=300, y=400, width=360)
183 | img_path = tkinter.Entry(window3)
184 | img_path.place(x=300, y=420, width=400)
185 |
186 | tkinter.Label(window3, text = "Enter folder path inside which output images are to be saved").place(x=720, y=400, width=420)
187 | out_img_path = tkinter.Entry(window3)
188 | out_img_path.place(x=720, y=420, width=400)
189 |
190 | rdbv = tkinter.Checkbutton(window3, text="Input: Video", variable = rdbtn2, fg="blue", bg="cyan")
191 | rdbv.place(x=60, y=450, width=200)
192 |
193 | #tkinter.Label(window3, text="Input: Video", fg="blue", bg="cyan").place(x=60, y=450, width=200)
194 | tkinter.Label(window3, text = "Enter path to the video file").place(x=300, y=450, width=200)
195 | vid_path = tkinter.Entry(window3)
196 | vid_path.place(x=300, y=470, width=400)
197 |
198 |     tkinter.Label(window3, text = "To save the output video, type 'y'").place(x=720, y=450, width=200)
199 | vid_save = tkinter.Entry(window3)
200 | vid_save.place(x=720, y=470, width=100)
201 |
202 |     tkinter.Label(window3, text = "To see the output video, type 'y'").place(x=950, y=450, width=200)
203 | vid_see = tkinter.Entry(window3)
204 | vid_see.place(x=960, y=470, width=100)
205 |
206 | rdbw = tkinter.Checkbutton(window3, text="Input: Webcam", variable = rdbtn1, fg="blue", bg="cyan")
207 | rdbw.place(x=60, y=500, width=200)
208 | tkinter.Label(window3, text = "Enter your supported webcam resolution (eg 640x480)").place(x=300, y=500, width=380)
209 | resolution = tkinter.Entry(window3)
210 | resolution.place(x=300, y=520, width=400)
211 |
212 | #parameters = path1, path2, face_dim, gpu, thresh1, thresh2, resolution
213 |
214 |
215 | def submit():
216 | print('submit')
217 | if rdbtn1.get():
218 | print('Webcam')
219 | mode = 'w'
220 | elif rdbtn2.get():
221 | print('Video')
222 | mode = 'v'
223 | elif rdbtn3.get():
224 | print('Image')
225 | mode = 'i'
226 | else:
227 | print('default')
228 | mode = 'w'
229 |
230 | print(mode)
231 | parameters = path1.get(), path2.get(), face_dim.get(), gpu.get(), thresh1.get(), thresh2.get(), resolution.get(), \
232 | img_path.get(), out_img_path.get(), vid_path.get(), vid_save.get(), vid_see.get()
233 | print(parameters)
234 | #mode = 'w'
235 | st_name = final_sotware.recognize(mode, parameters)
236 | print('students recognised', st_name)
237 |
238 | def mark_attend():
239 | if rdbtn1.get():
240 | print('Webcam')
241 | mode = 'w'
242 | elif rdbtn2.get():
243 | print('Video')
244 | mode = 'v'
245 | elif rdbtn3.get():
246 | print('Image')
247 | mode = 'i'
248 | else:
249 | print('default')
250 | mode = 'w'
251 |
252 | print(mode)
253 | parameters = path1.get(), path2.get(), face_dim.get(), gpu.get(), thresh1.get(), thresh2.get(), resolution.get(), \
254 | img_path.get(), out_img_path.get(), vid_path.get(), vid_save.get(), vid_see.get()
255 | run_attend(mode, parameters)
256 |
257 | btn9 = tkinter.Button(window3, text = "CONTINUE", fg = "black", bg = 'turquoise1', command = submit)
258 | btn9.place(x=550, y=600, width=90)
259 |
260 | btn11 = tkinter.Button(window3, text = "Mark Attendance", fg = "black", bg = 'turquoise1', command = mark_attend)
261 | btn11.place(x=635, y=630, width=120)
262 |
263 | btn10 = tkinter.Button(window3, text = "EXIT", fg = "red", bg = 'indianred1', command = s_exit)
264 | btn10.place(x=750, y=600, width=60)
265 |
266 | def home():
267 | window3.destroy()
268 | gotohome()
269 |
270 | btn12 = tkinter.Button(window3, text = "HOME", fg = "black", bg = 'turquoise1', command = home)
271 | btn12.place(x=650, y=600, width=90)
272 |
273 |
274 |
275 | window3.mainloop()
276 |
277 | def run_attend(mode, parameters):
278 | present = final_sotware.recognize(mode, parameters)
279 |
280 |
281 | def show_create():
282 |
283 | window4 = Tk()
284 | window4.title("Attendance System")
285 | window4.geometry("800x500")
286 | tkinter.Label(window4, text = "^^ WELCOME TO FACE DETECTION AND RECOGNITION SOFTWARE ^^", fg="black", bg="darkorange").pack(fill="x")
287 | tkinter.Label(window4, text = '- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -', bg='orange').pack(fill='x')
288 | tkinter.Label(window4, text="\n\n ").pack(fill='y')
289 |
290 | tkinter.Label(window4, text = "Enter the path to output folder").place(x=50, y=50, width=240)
291 | path1 = tkinter.Entry(window4)
292 | path1.place(x=60, y=70, width=400)
293 |
294 | tkinter.Label(window4, text = "Enter your supported webcam resolution (eg 640x480)").place(x=50, y=100, width=380)
295 | webcam = tkinter.Entry(window4)
296 | webcam.place(x=60, y=120, width=400)
297 |
298 |     tkinter.Label(window4, text = "Enter the GPU memory fraction you want to allocate (out of 1)").place(x=50, y=150, width=430)
299 | gpu = tkinter.Entry(window4)
300 | gpu.place(x=60, y=170, width=400)
301 |
302 | tkinter.Label(window4, text = "Enter desired face width and height (WidthxHeight format)").place(x=50, y=200, width=430)
303 | face_dim = tkinter.Entry(window4)
304 | face_dim.place(x=60, y=220, width=400)
305 |
306 | tkinter.Label(window4, text = "Enter user name (default: person)").place(x=50, y=250, width=260)
307 | username = tkinter.Entry(window4)
308 | username.place(x=60, y=270, width=400)
309 |
310 | tkinter.Label(window4, text = "Create dataset using:").place(x=50, y=300, width=180)
311 |
312 | rdbtn1 = IntVar()
313 | rdbtn2 = IntVar()
314 |
315 | rdbv = tkinter.Checkbutton(window4, text="Video", variable = rdbtn1, fg="black", bg="skyblue1")
316 | rdbv.place(x=220, y=300, width=80)
317 |
318 | rdbw = tkinter.Checkbutton(window4, text="Webcam", variable = rdbtn2, fg="black", bg="skyblue1")
319 | rdbw.place(x=320, y=300, width=80)
320 |
321 | tkinter.Label(window4, text = "Enter video path (if applicable)").place(x=50, y=330, width=250)
322 | vid_path = tkinter.Entry(window4)
323 | vid_path.place(x=60, y=350, width=400)
324 |
325 | tkinter.Label(window4, text="Default values would be assigned to empyt fields", fg="navy", bg="lightblue").place(x=50, y=450, width=380)
326 |
327 | tkinter.Label(window4, bg = 'orange').place(y = 485, width = 800)
328 |
329 | get_f = 0
330 |
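# submit() packs the dataset-creation inputs in the order they are read here:
# output path, webcam resolution, face dimensions, GPU memory fraction,
# user name and (optional) video path, then hands the tuple to
# final_sotware.dataset_creation().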
331 | def submit():
332 | #vid_path2 = ''
333 | print('submit')
334 | '''if rdbtn1.get():
335 | print('Video')
336 | #vid_path = '/home/aashish/Documents/deep_learning/attendance_deep_learning/scripts_used/video/uri1.webm'
337 | vid_path2 = vid_path.get()
338 |
339 | elif rdbtn2.get():
340 | print('Webcam')
341 | vid_path = ''
342 | else:
343 | print('default')
344 | vid_path = '' '''
345 |
346 | parameters = path1.get(), webcam.get(), face_dim.get(), gpu.get(), username.get(), vid_path.get()
347 | print(parameters)
348 | # mode = 'w'
349 | get_f = final_sotware.dataset_creation(parameters)
350 |
351 | if get_f == 1:
352 | tkinter.messagebox.showinfo("Attendance", "Dataset Created")
353 |
354 |
355 | btn9 = tkinter.Button(window4, text = "CONTINUE", fg = "black", bg = 'turquoise1', command = submit)
356 | btn9.place(x=650, y=200, width=90)
357 |
358 | btn_exit = tkinter.Button(window4, text = "EXIT", fg = "red", bg = 'indianred1', command = s_exit)
359 | btn_exit.place(x=650, y=300, width=90)
360 |
361 | def home():
362 | window4.destroy()
363 | gotohome()
364 |
365 | btn10 = tkinter.Button(window4, text = "HOME", fg = "black", bg = 'turquoise1', command = home)
366 | btn10.place(x=650, y=250, width=90)
367 |
368 | window4.mainloop()
369 |
370 |
371 |
372 | def show_train():
373 |
374 | window5 = Tk()
375 | window5.title("Attendance System")
376 | window5.geometry("800x500")
377 | tkinter.Label(window5, text = "^^ WELCOME TO FACE DETECTION AND RECOGNITION SOFTWARE ^^", fg="black", bg="darkorange").pack(fill="x")
378 | tkinter.Label(window5, text = '- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -', bg='orange').pack(fill='x')
379 | tkinter.Label(window5, text="\n\n ").pack(fill='y')
380 |
381 | tkinter.Label(window5, text = "Enter the path to dataset folder").place(x=50, y=50, width=250)
382 | path1 = tkinter.Entry(window5)
383 | path1.place(x=60, y=70, width=400)
384 |
385 | tkinter.Label(window5, text = "Enter the path to 20180402-114759 FOLDER").place(x=50, y=100, width=300)
386 | path2 = tkinter.Entry(window5)
387 | path2.place(x=60, y=120, width=400)
388 |
389 | tkinter.Label(window5, text = "Enter the GPU memory fraction you want to allocate (out of 1)").place(x=50, y=150, width=430)
390 | gpu = tkinter.Entry(window5)
391 | gpu.place(x=60, y=170, width=400)
392 |
393 | tkinter.Label(window5, text = "Enter the batch size of images to process at once").place(x=50, y=200, width=370)
394 | batch = tkinter.Entry(window5)
395 | batch.place(x=60, y=220, width=400)
396 |
397 | tkinter.Label(window5, text = "Enter input image dimension (e.g. 160)").place(x=50, y=250, width=285)
398 | img_dim = tkinter.Entry(window5)
399 | img_dim.place(x=60, y=270, width=400)
400 |
401 | tkinter.Label(window5, text = "Enter output SVM classifier filename").place(x=50, y=300, width=275)
402 | svm_name = tkinter.Entry(window5)
403 | svm_name.place(x=60, y=320, width=400)
404 |
405 | tkinter.Label(window5, text = "Split dataset into training and testing:").place(x=50, y=350, width=305)
406 |
407 | chkbtn1 = IntVar()
408 | chkbtn2 = IntVar()
409 |
410 | ckbt1 = tkinter.Checkbutton(window5, text="Yes", variable = chkbtn1, fg="black", bg="skyblue1")
411 | ckbt1.place(x=350, y=350, width=50)
412 |
413 | ckbt2 = tkinter.Checkbutton(window5, text="No", variable = chkbtn2, fg="black", bg="skyblue1")
414 | ckbt2.place(x=410, y=350, width=50)
415 |
416 | tkinter.Label(window5, text = "Enter split percentage (if applicable)").place(x=50, y=380, width=290)
417 | split_percent = tkinter.Entry(window5)
418 | split_percent.place(x=60, y=400, width=400)
419 |
420 | tkinter.Label(window5, text="Default values would be assigned to empyt fields", fg="navy", bg="lightblue").place(x=60, y=450, width=380)
421 |
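# submit() turns the Yes/No checkbuttons into a split_data flag ('y' to split the
# dataset, '' to keep it whole) and passes the training inputs, in the order read
# here, to final_sotware.train().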
422 | def submit():
423 |
424 | print('submit')
425 | if chkbtn1.get():
426 | print('Yes')
427 | split_data = 'y'
428 |
429 | elif chkbtn2.get():
430 | print('No')
431 | split_data = ''
432 | else:
433 | print('default')
434 | split_data = 'y'
435 |
436 | parameters = path1.get(), path2.get(), batch.get(), img_dim.get(), gpu.get(), svm_name.get(), split_percent.get(), split_data
437 | print(parameters)
438 | # mode = 'w'
439 | get_f = final_sotware.train(parameters)
440 |
441 | if get_f == 1:
442 | tkinter.messagebox.showinfo("Attendance", "Training Completed")
443 |
444 |
445 | btn9 = tkinter.Button(window5, text = "CONTINUE", fg = "black", bg = 'turquoise1', command = submit)
446 | btn9.place(x=650, y=200, width=90)
447 |
448 | btn_exit = tkinter.Button(window5, text = "EXIT", fg = "red", bg = 'indianred1', command = s_exit)
449 | btn_exit.place(x=650, y=300, width=90)
450 |
451 | def home():
452 | window5.destroy()
453 | gotohome()
454 |
455 | btn10 = tkinter.Button(window5, text = "HOME", fg = "black", bg = 'turquoise1', command = home)
456 | btn10.place(x=650, y=250, width=90)
457 |
458 |
459 | window5.mainloop()
460 |
461 |
462 |
463 | def show_test():
464 |
465 | window6 = Tk()
466 | window6.title("Attendance System")
467 | window6.geometry("800x500")
468 | tkinter.Label(window6, text = "^^ WELCOME TO FACE DETECTION AND RECOGNITION SOFTWARE ^^", fg="black", bg="darkorange").pack(fill="x")
469 | tkinter.Label(window6, text = '- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -', bg='orange').pack(fill='x')
470 | tkinter.Label(window6, text="\n\n ").pack(fill='y')
471 |
472 | tkinter.Label(window6, text = "Enter the path to classifier.pkl").place(x=50, y=50, width=250)
473 | path1 = tkinter.Entry(window6)
474 | path1.place(x=60, y=70, width=400)
475 |
476 | tkinter.Label(window6, text = "Enter the path to 20180402-114759 FOLDER").place(x=50, y=100, width=300)
477 | path2 = tkinter.Entry(window6)
478 | path2.place(x=60, y=120, width=400)
479 |
480 | tkinter.Label(window6, text="Enter path to dataset folder").place(x=50, y=150, width=230)
481 | path3 = tkinter.Entry(window6)
482 | path3.place(x=60, y=170, width=400)
483 |
484 | tkinter.Label(window6, text="Enter the batch size of images to process at once").place(x=50, y=200, width=370)
485 | batch = tkinter.Entry(window6)
486 | batch.place(x=60, y=220, width=400)
487 |
488 | tkinter.Label(window6, text="Enter input image dimension (e.g. 160)").place(x=50, y=250, width=285)
489 | img_dim = tkinter.Entry(window6)
490 | img_dim.place(x=60, y=270, width=400)
491 |
492 | tkinter.Label(window6, text="Default values would be assigned to empyt fields", fg="navy", bg="lightblue").place(x=60, y=450, width=380)
493 |
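# submit() for the test window: the GPU memory fraction is hard-coded to 0.8
# rather than read from an Entry; the remaining inputs are passed to
# final_sotware.test() in the order they are read here.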
494 | def submit():
495 |
496 | gpu = 0.8
497 | parameters = path1.get(), path2.get(), path3.get(), batch.get(), img_dim.get(), gpu
498 | print(parameters)
499 | get_f = final_sotware.test(parameters = parameters)
500 |
501 | if get_f == 1:
502 | tkinter.messagebox.showinfo("Attendance", "Testing Completed")
503 |
504 |
505 | btn9 = tkinter.Button(window6, text = "CONTINUE", fg = "black", bg = 'turquoise1', command = submit)
506 | btn9.place(x=650, y=200, width=90)
507 |
508 | btn_exit = tkinter.Button(window6, text = "EXIT", fg = "red", bg = 'indianred1', command = s_exit)
509 | btn_exit.place(x=650, y=300, width=90)
510 |
511 | def home():
512 | window6.destroy()
513 | gotohome()
514 |
515 | btn10 = tkinter.Button(window6, text = "HOME", fg = "black", bg = 'turquoise1', command = home)
516 | btn10.place(x=650, y=250, width=90)
517 |
518 |
519 |
520 | window6.mainloop()
521 |
522 |
523 |
524 |
525 |
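# gotohome() reopens the main menu; the commented calls below appear to be
# alternative entry points left in for quick testing.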
526 | def gotohome():
527 |
528 | show()
529 | #show_test()
530 | #show_train()
531 | #putwindow()
532 | #show_run()
533 | #show_create()
534 |
535 | if __name__ == '__main__':
536 | putwindow()
537 | #show_attend()
538 |
--------------------------------------------------------------------------------