├── README.md
├── data
│   └── images
│       ├── Julien.jpg
│       ├── Muzzamil.jpg
│       └── Zuheng.jpg
├── figs
│   └── fig1.png
└── src
    ├── __pycache__
    │   ├── __init__.cpython-35.pyc
    │   ├── detect_face.cpython-35.pyc
    │   └── face_align_mtcnn.cpython-35.pyc
    ├── align
    │   ├── __init__.py
    │   ├── __init__.pyc
    │   ├── detect_face.py
    │   ├── detect_face.pyc
    │   ├── face_align_mtcnn.py
    │   └── face_align_mtcnn.pyc
    ├── face_verification.py
    ├── facenet.py
    ├── facenet_ext.py
    ├── facenet_train_classifier_expression_pretrainExpr_multidata_addcnns_simple.py
    ├── lfw_ext.py
    ├── metrics_loss.py
    ├── models
    │   └── inception_resnet_v1_expression_simple.pyc
    ├── test_realtime.py
    ├── train.py
    └── train_BP.py

/README.md:
--------------------------------------------------------------------------------

# Demo of the FaceLiveNet1.0
This demo shows the face-authentication function of the FaceLiveNet, which performs face verification and liveness control simultaneously. The demo is implemented in TensorFlow 1.6, Python 2.7 and OpenCV 3.0 under Ubuntu 16.04. The details of the FaceLiveNet are described in the paper
["FaceLiveNet: End-to-End Face Verification Networks Combining With Interactive Facial Expression-based Liveness Detection"](https://www.researchgate.net/publication/325229686_FaceLiveNet_End-to-End_Face_Verification_Networks_Combining_With_Interactive_Facial_Expression-based_Liveness_Detection). The FaceLiveNet is a holistic, end-to-end deep network for face authentication that combines face verification with a liveness control verifying the real presence of the user. A Challenge-Response mechanism based on facial expression recognition (Happy and Surprise) is used for the liveness control.

The accuracy of the face verification is measured on the benchmarks LFW (99.4%) and YTF (95%); the accuracy of the facial expression recognition of the six basic expressions is measured on the benchmarks CK+ (99.1%), OuluCasia (87.5%), SFEW (53.2%) and FER2013 (68.6%). Fusing the face verification and the facial expression recognition, the global accuracy of the face authentication on the proposed datasets based on CK+ and OuluCasia is 99% and 92% respectively. The proposed architecture is shown in ![Fig.1](https://github.com/zuhengming/face_recognition/blob/master/figs/fig1.png). More details can be found in the paper.

## Dependencies
- The code is tested on Ubuntu 16.04.

- Install TensorFlow 1.6 (CPU version)

- Install OpenCV 2.4.13

- Install Python 2.7

## Protocol for the face authentication
The face authentication, comprising the face verification and the liveness control, can run in two modes: a real-time mode and an off-line mode. In both modes the liveness control is based on the facial-expression Challenge-Response mechanism: the system randomly proposes an expression as a request, and if the user gives the right response by acting the requested expression and is verified by the system, the user passes the liveness control. In this demo, at most two expressions (Happy/Surprise) can be used as the request for the liveness control. Besides the requested expression, the neutral expression is always detected in both modes, since people normally start from a neutral face when acting an expression. In this way, the system is protected against attacks using a photo that already shows the requested expression.

### 1. Real-time face authentication
In the real-time mode, the face authentication runs on the live camera video stream; the system does not unlock until the user gives the right response and is verified by the system.

### 2. Off-line face authentication based on an uploaded video
In the off-line mode, the face authentication is based on the user's uploaded video-clips, with one expression corresponding to one video. The user records a video-clip of her/his facial expression and uploads it to the backend for the face authentication. This mode carries the risk of an inappropriate or incomplete video-clip; on the other hand, the system only has to process a short video-clip instead of the continuous video stream of the real-time mode.

## Pretrained model
The pretrained model for FaceLiveNet1.0 is [here](https://drive.google.com/file/d/1B-ZRtWk1UoAQXHTewhKV5UPvwP3L102X/view?usp=sharing).


## Training
The face verification network is trained on [CASIA-WebFace](http://www.cbsr.ia.ac.cn/english/CASIA-WebFace-Database.html) and [MSCeleb](https://www.msceleb.org/); the facial expression recognition branch is trained on [CK+](http://www.consortium.ri.cmu.edu/ckagree/), [OuluCasia](http://www.cse.oulu.fi/CMV/Downloads/Oulu-CASIA), [SFEW](https://computervisiononline.com/dataset/1105138659) and [FER2013](https://www.kaggle.com/c/challenges-in-representation-learning-facial-expression-recognition-challenge/data).



## Face alignment
The face detection is implemented with the [Multi-task Cascaded CNN (MTCNN, "Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks")](https://kpzhang93.github.io/MTCNN_face_detection_alignment/index.html).


## Parameters and Example
### Parameters:

    --model_dir: Directory containing the metagraph (.meta) file and the checkpoint (ckpt) file of the model parameters
    --image_ref: The reference image for the face verification
    --num_expression: The number of required expressions for the face authentication. The maximum is 2 (Happy and Surprise); with any other value, only Happy is used

### Examples for command line:

1. Real-time mode:

    python Demo_FaceLiveNet1.0_Realtime.py --model_dir /mnt/hgfs/VMshare-2/Fer2013/20180115-025629_model/best_model/ --image_ref ../data/images/Zuheng.jpg --num_expression 1

2. Off-line video mode:

    python Demo_FaceLiveNet1.0_video.py --model_dir /mnt/hgfs/VMshare-2/Fer2013/20180115-025629_model/best_model/ --image_ref ../data/images/Zuheng.jpg --num_expression 2
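For reference, the flags above map onto a standard argparse parser. A minimal sketch (illustrative only; the demo scripts define their own parsers):

    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument('--model_dir', type=str,
                        help='Directory with the metagraph (.meta) and checkpoint (ckpt) files')
    parser.add_argument('--image_ref', type=str,
                        help='Reference image for the face verification')
    parser.add_argument('--num_expression', type=int, default=1,
                        help='Number of required expressions: 2 -> Happy and Surprise, otherwise Happy')
    args = parser.parse_args()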
--------------------------------------------------------------------------------
/data/images/Julien.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gonghao67/FaceLiveNet/b5bb9ab97dd6cd9d6cfbaa167e4d966f5c4a8605/data/images/Julien.jpg

--------------------------------------------------------------------------------
/data/images/Muzzamil.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gonghao67/FaceLiveNet/b5bb9ab97dd6cd9d6cfbaa167e4d966f5c4a8605/data/images/Muzzamil.jpg

--------------------------------------------------------------------------------
/data/images/Zuheng.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gonghao67/FaceLiveNet/b5bb9ab97dd6cd9d6cfbaa167e4d966f5c4a8605/data/images/Zuheng.jpg

--------------------------------------------------------------------------------
/figs/fig1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gonghao67/FaceLiveNet/b5bb9ab97dd6cd9d6cfbaa167e4d966f5c4a8605/figs/fig1.png

--------------------------------------------------------------------------------
/src/__pycache__/__init__.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gonghao67/FaceLiveNet/b5bb9ab97dd6cd9d6cfbaa167e4d966f5c4a8605/src/__pycache__/__init__.cpython-35.pyc

--------------------------------------------------------------------------------
/src/__pycache__/detect_face.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gonghao67/FaceLiveNet/b5bb9ab97dd6cd9d6cfbaa167e4d966f5c4a8605/src/__pycache__/detect_face.cpython-35.pyc

--------------------------------------------------------------------------------
/src/__pycache__/face_align_mtcnn.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gonghao67/FaceLiveNet/b5bb9ab97dd6cd9d6cfbaa167e4d966f5c4a8605/src/__pycache__/face_align_mtcnn.cpython-35.pyc

--------------------------------------------------------------------------------
/src/align/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gonghao67/FaceLiveNet/b5bb9ab97dd6cd9d6cfbaa167e4d966f5c4a8605/src/align/__init__.py

--------------------------------------------------------------------------------
/src/align/__init__.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gonghao67/FaceLiveNet/b5bb9ab97dd6cd9d6cfbaa167e4d966f5c4a8605/src/align/__init__.pyc

--------------------------------------------------------------------------------
/src/align/detect_face.py:
--------------------------------------------------------------------------------
""" Tensorflow implementation of the face detection / alignment algorithm found at
https://github.com/kpzhang93/MTCNN_face_detection_alignment
"""
# MIT License
#
# Copyright (c) 2016 David Sandberg
#
# Permission is hereby granted, free of charge, to any
person obtaining a copy 9 | # of this software and associated documentation files (the "Software"), to deal 10 | # in the Software without restriction, including without limitation the rights 11 | # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 12 | # copies of the Software, and to permit persons to whom the Software is 13 | # furnished to do so, subject to the following conditions: 14 | # 15 | # The above copyright notice and this permission notice shall be included in all 16 | # copies or substantial portions of the Software. 17 | # 18 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 19 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 20 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 21 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 22 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 23 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 24 | # SOFTWARE. 25 | 26 | from __future__ import absolute_import 27 | from __future__ import division 28 | from __future__ import print_function 29 | 30 | import numpy as np 31 | import tensorflow as tf 32 | #from math import floor 33 | import cv2 34 | import os 35 | 36 | import random 37 | import matplotlib.pyplot as plt 38 | import matplotlib.patches as patches 39 | 40 | import time 41 | 42 | 43 | def layer(op): 44 | '''Decorator for composable network layers.''' 45 | 46 | def layer_decorated(self, *args, **kwargs): 47 | # Automatically set a name if not provided. 48 | name = kwargs.setdefault('name', self.get_unique_name(op.__name__)) 49 | # Figure out the layer inputs. 50 | if len(self.terminals) == 0: 51 | raise RuntimeError('No input variables found for layer %s.' % name) 52 | elif len(self.terminals) == 1: 53 | layer_input = self.terminals[0] 54 | else: 55 | layer_input = list(self.terminals) 56 | # Perform the operation and get the output. 57 | layer_output = op(self, layer_input, *args, **kwargs) 58 | # Add to layer LUT. 59 | self.layers[name] = layer_output 60 | # This output is now the input for the next layer. 61 | self.feed(layer_output) 62 | # Return self for chained calls. 63 | return self 64 | 65 | return layer_decorated 66 | 67 | class Network(object): 68 | 69 | def __init__(self, inputs, trainable=True): 70 | # The input nodes for this network 71 | self.inputs = inputs 72 | # The current list of terminal nodes 73 | self.terminals = [] 74 | # Mapping from layer names to layers 75 | self.layers = dict(inputs) 76 | # If true, the resulting variables are set as trainable 77 | self.trainable = trainable 78 | 79 | self.setup() 80 | 81 | def setup(self): 82 | '''Construct the network. ''' 83 | raise NotImplementedError('Must be implemented by the subclass.') 84 | 85 | def load(self, data_path, session, ignore_missing=False): 86 | '''Load network weights. 87 | data_path: The path to the numpy-serialized network weights 88 | session: The current TensorFlow session 89 | ignore_missing: If true, serialized weights for missing layers are ignored. 
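        A minimal usage sketch (illustrative; mirrors create_mtcnn below):
            net = PNet({'data': data})
            net.load(os.path.join(model_path, 'det1.npy'), sess)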
90 | ''' 91 | #data_dict = np.load(data_path).item() #pylint: disable=no-member ##python2, encoding in ascii 92 | data_dict = np.load(data_path, encoding='latin1').item() #pylint: disable=no-member ##python2, encoding in UTF8 93 | for op_name in data_dict: 94 | with tf.variable_scope(op_name, reuse=True): 95 | #for param_name, data in data_dict[op_name].iteritems(): #python2.7 96 | for param_name, data in data_dict[op_name].items(): ##python3.5 97 | try: 98 | var = tf.get_variable(param_name) 99 | session.run(var.assign(data)) 100 | except ValueError: 101 | if not ignore_missing: 102 | raise 103 | 104 | def feed(self, *args): 105 | '''Set the input(s) for the next operation by replacing the terminal nodes. 106 | The arguments can be either layer names or the actual layers. 107 | ''' 108 | assert len(args) != 0 109 | self.terminals = [] 110 | for fed_layer in args: 111 | #if isinstance(fed_layer, basestring): ##python 2.7 112 | if isinstance(fed_layer, str): ##python 3.5 113 | try: 114 | fed_layer = self.layers[fed_layer] 115 | except KeyError: 116 | raise KeyError('Unknown layer name fed: %s' % fed_layer) 117 | self.terminals.append(fed_layer) 118 | return self 119 | 120 | def get_output(self): 121 | '''Returns the current network output.''' 122 | return self.terminals[-1] 123 | 124 | def get_unique_name(self, prefix): 125 | '''Returns an index-suffixed unique name for the given prefix. 126 | This is used for auto-generating layer names based on the type-prefix. 127 | ''' 128 | ident = sum(t.startswith(prefix) for t, _ in self.layers.items()) + 1 129 | return '%s_%d' % (prefix, ident) 130 | 131 | def make_var(self, name, shape): 132 | '''Creates a new TensorFlow variable.''' 133 | return tf.get_variable(name, shape, trainable=self.trainable) 134 | 135 | def validate_padding(self, padding): 136 | '''Verifies that the padding is one of the supported ones.''' 137 | assert padding in ('SAME', 'VALID') 138 | 139 | @layer 140 | def conv(self, 141 | inp, 142 | k_h, 143 | k_w, 144 | c_o, 145 | s_h, 146 | s_w, 147 | name, 148 | relu=True, 149 | padding='SAME', 150 | group=1, 151 | biased=True): 152 | # Verify that the padding is acceptable 153 | self.validate_padding(padding) 154 | # Get the number of channels in the input 155 | c_i = inp.get_shape()[-1] 156 | # Verify that the grouping parameter is valid 157 | assert c_i % group == 0 158 | assert c_o % group == 0 159 | # Convolution for a given input and kernel 160 | convolve = lambda i, k: tf.nn.conv2d(i, k, [1, s_h, s_w, 1], padding=padding) 161 | with tf.variable_scope(name) as scope: 162 | kernel = self.make_var('weights', shape=[k_h, k_w, c_i // group, c_o]) 163 | # This is the common-case. Convolve the input without any further complications. 
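            # Note: group > 1 (grouped convolution) is never used by the P-/R-/O-Net
            # definitions below, so a single tf.nn.conv2d over all input channels suffices.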
164 | output = convolve(inp, kernel) 165 | # Add the biases 166 | if biased: 167 | biases = self.make_var('biases', [c_o]) 168 | output = tf.nn.bias_add(output, biases) 169 | if relu: 170 | # ReLU non-linearity 171 | output = tf.nn.relu(output, name=scope.name) 172 | return output 173 | 174 | @layer 175 | def prelu(self, inp, name): 176 | with tf.variable_scope(name): 177 | i = inp.get_shape().as_list() 178 | alpha = self.make_var('alpha', shape=(i[-1])) 179 | output = tf.nn.relu(inp) + tf.multiply(alpha, -tf.nn.relu(-inp)) 180 | return output 181 | 182 | @layer 183 | def max_pool(self, inp, k_h, k_w, s_h, s_w, name, padding='SAME'): 184 | self.validate_padding(padding) 185 | return tf.nn.max_pool(inp, 186 | ksize=[1, k_h, k_w, 1], 187 | strides=[1, s_h, s_w, 1], 188 | padding=padding, 189 | name=name) 190 | 191 | @layer 192 | def fc(self, inp, num_out, name, relu=True): 193 | with tf.variable_scope(name): 194 | input_shape = inp.get_shape() 195 | if input_shape.ndims == 4: 196 | # The input is spatial. Vectorize it first. 197 | dim = 1 198 | for d in input_shape[1:].as_list(): 199 | dim *= d 200 | feed_in = tf.reshape(inp, [-1, dim]) 201 | else: 202 | feed_in, dim = (inp, input_shape[-1].value) 203 | weights = self.make_var('weights', shape=[dim, num_out]) 204 | biases = self.make_var('biases', [num_out]) 205 | op = tf.nn.relu_layer if relu else tf.nn.xw_plus_b 206 | fc = op(feed_in, weights, biases, name=name) 207 | return fc 208 | 209 | 210 | """ 211 | Multi dimensional softmax, 212 | refer to https://github.com/tensorflow/tensorflow/issues/210 213 | compute softmax along the dimension of target 214 | the native softmax only supports batch_size x dimension 215 | """ 216 | @layer 217 | def softmax(self, target, axis, name=None): 218 | max_axis = tf.reduce_max(target, axis, keep_dims=True) 219 | target_exp = tf.exp(target-max_axis) 220 | normalize = tf.reduce_sum(target_exp, axis, keep_dims=True) 221 | softmax = tf.div(target_exp, normalize, name) 222 | return softmax 223 | 224 | class PNet(Network): 225 | def setup(self): 226 | (self.feed('data') #pylint: disable=no-value-for-parameter, no-member 227 | .conv(3, 3, 10, 1, 1, padding='VALID', relu=False, name='conv1') 228 | .prelu(name='PReLU1') 229 | .max_pool(2, 2, 2, 2, name='pool1') 230 | .conv(3, 3, 16, 1, 1, padding='VALID', relu=False, name='conv2') 231 | .prelu(name='PReLU2') 232 | .conv(3, 3, 32, 1, 1, padding='VALID', relu=False, name='conv3') 233 | .prelu(name='PReLU3') 234 | .conv(1, 1, 2, 1, 1, relu=False, name='conv4-1') 235 | .softmax(3,name='prob1')) 236 | 237 | (self.feed('PReLU3') #pylint: disable=no-value-for-parameter 238 | .conv(1, 1, 4, 1, 1, relu=False, name='conv4-2')) 239 | 240 | class RNet(Network): 241 | def setup(self): 242 | (self.feed('data') #pylint: disable=no-value-for-parameter, no-member 243 | .conv(3, 3, 28, 1, 1, padding='VALID', relu=False, name='conv1') 244 | .prelu(name='prelu1') 245 | .max_pool(3, 3, 2, 2, name='pool1') 246 | .conv(3, 3, 48, 1, 1, padding='VALID', relu=False, name='conv2') 247 | .prelu(name='prelu2') 248 | .max_pool(3, 3, 2, 2, padding='VALID', name='pool2') 249 | .conv(2, 2, 64, 1, 1, padding='VALID', relu=False, name='conv3') 250 | .prelu(name='prelu3') 251 | .fc(128, relu=False, name='conv4') 252 | .prelu(name='prelu4') 253 | .fc(2, relu=False, name='conv5-1') 254 | .softmax(1,name='prob1')) 255 | 256 | (self.feed('prelu4') #pylint: disable=no-value-for-parameter 257 | .fc(4, relu=False, name='conv5-2')) 258 | 259 | class ONet(Network): 260 | def setup(self): 261 | 
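        # O-Net: 48x48 input; three output heads are built below -- face
        # probability ('prob1'), bounding-box regression ('conv6-2') and
        # 5-point facial landmark positions ('conv6-3').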
(self.feed('data') #pylint: disable=no-value-for-parameter, no-member 262 | .conv(3, 3, 32, 1, 1, padding='VALID', relu=False, name='conv1') 263 | .prelu(name='prelu1') 264 | .max_pool(3, 3, 2, 2, name='pool1') 265 | .conv(3, 3, 64, 1, 1, padding='VALID', relu=False, name='conv2') 266 | .prelu(name='prelu2') 267 | .max_pool(3, 3, 2, 2, padding='VALID', name='pool2') 268 | .conv(3, 3, 64, 1, 1, padding='VALID', relu=False, name='conv3') 269 | .prelu(name='prelu3') 270 | .max_pool(2, 2, 2, 2, name='pool3') 271 | .conv(2, 2, 128, 1, 1, padding='VALID', relu=False, name='conv4') 272 | .prelu(name='prelu4') 273 | .fc(256, relu=False, name='conv5') 274 | .prelu(name='prelu5') 275 | .fc(2, relu=False, name='conv6-1') 276 | .softmax(1, name='prob1')) 277 | 278 | (self.feed('prelu5') #pylint: disable=no-value-for-parameter 279 | .fc(4, relu=False, name='conv6-2')) 280 | 281 | (self.feed('prelu5') #pylint: disable=no-value-for-parameter 282 | .fc(10, relu=False, name='conv6-3')) 283 | 284 | def create_mtcnn(sess, model_path): 285 | with tf.variable_scope('pnet'): 286 | data = tf.placeholder(tf.float32, (None,None,None,3), 'input') 287 | pnet = PNet({'data':data}) 288 | pnet.load(os.path.join(model_path, 'det1.npy'), sess) 289 | with tf.variable_scope('rnet'): 290 | data = tf.placeholder(tf.float32, (None,24,24,3), 'input') 291 | rnet = RNet({'data':data}) 292 | rnet.load(os.path.join(model_path, 'det2.npy'), sess) 293 | with tf.variable_scope('onet'): 294 | data = tf.placeholder(tf.float32, (None,48,48,3), 'input') 295 | onet = ONet({'data':data}) 296 | onet.load(os.path.join(model_path, 'det3.npy'), sess) 297 | 298 | pnet_fun = lambda img : sess.run(('pnet/conv4-2/BiasAdd:0', 'pnet/prob1:0'), feed_dict={'pnet/input:0':img}) 299 | rnet_fun = lambda img : sess.run(('rnet/conv5-2/conv5-2:0', 'rnet/prob1:0'), feed_dict={'rnet/input:0':img}) 300 | onet_fun = lambda img : sess.run(('onet/conv6-2/conv6-2:0', 'onet/conv6-3/conv6-3:0', 'onet/prob1:0'), feed_dict={'onet/input:0':img}) 301 | return pnet_fun, rnet_fun, onet_fun 302 | 303 | def detect_face(img, minsize, pnet, rnet, onet, threshold, factor): 304 | # im: input image 305 | # minsize: minimum of faces' size 306 | # pnet, rnet, onet: caffemodel 307 | # threshold: threshold=[th1 th2 th3], th1-3 are three steps's threshold 308 | # fastresize: resize img from last scale (using in high-resolution images) if fastresize==true 309 | 310 | # plt.figure() 311 | # plt.imshow(img) 312 | factor_count=0 313 | total_boxes=np.empty((0,9)) 314 | points=[] 315 | h=img.shape[0] 316 | w=img.shape[1] 317 | # h=img.size[0] 318 | # w=img.size[1] 319 | minl=np.amin([h, w]) 320 | m=12.0/minsize 321 | minl=minl*m 322 | # creat scale pyramid 323 | scales=[] 324 | while minl>=12: 325 | scales += [m*np.power(factor, factor_count)] 326 | minl = minl*factor 327 | factor_count += 1 328 | 329 | start_time = time.time() 330 | # first stage 331 | for j in range(len(scales)): 332 | #scale=scales[j] 333 | scale = scales[len(scales)-j-1] 334 | hs=int(np.ceil(h*scale)) 335 | ws=int(np.ceil(w*scale)) 336 | im_data = imresample(img, (hs, ws)) 337 | #plt.imshow(im_data) 338 | im_data = (im_data-127.5)*0.0078125 339 | img_x = np.expand_dims(im_data, 0) 340 | img_y = np.transpose(img_x, (0,2,1,3)) 341 | out = pnet(img_y) 342 | out0 = np.transpose(out[0], (0,2,1,3)) 343 | out1 = np.transpose(out[1], (0,2,1,3)) 344 | 345 | boxes, _ = generateBoundingBox(out1[0,:,:,1].copy(), out0[0,:,:,:].copy(), scale, threshold[0]) 346 | 347 | # inter-scale nms 348 | pick = nms(boxes.copy(), 0.5, 'Union') 
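        # (The NMS above operates on the boxes of the current scale only; the
        # merged total_boxes from all scales are filtered again with a stricter
        # NMS threshold of 0.7 after this loop.)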
349 | if boxes.size>0 and pick.size>0: 350 | boxes = boxes[pick,:] 351 | total_boxes = np.append(total_boxes, boxes, axis=0) 352 | elapse_time = time.time() - start_time 353 | 354 | numbox = total_boxes.shape[0] 355 | if numbox>0: 356 | pick = nms(total_boxes.copy(), 0.7, 'Union') 357 | total_boxes = total_boxes[pick,:] 358 | #plotbb(img, total_boxes, output_filename) 359 | regw = total_boxes[:,2]-total_boxes[:,0] 360 | regh = total_boxes[:,3]-total_boxes[:,1] 361 | qq1 = total_boxes[:,0]+total_boxes[:,5]*regw 362 | qq2 = total_boxes[:,1]+total_boxes[:,6]*regh 363 | qq3 = total_boxes[:,2]+total_boxes[:,7]*regw 364 | qq4 = total_boxes[:,3]+total_boxes[:,8]*regh 365 | total_boxes = np.transpose(np.vstack([qq1, qq2, qq3, qq4, total_boxes[:,4]])) 366 | total_boxes = rerec(total_boxes.copy()) 367 | total_boxes[:,0:4] = np.fix(total_boxes[:,0:4]).astype(np.int32) 368 | dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph = pad(total_boxes.copy(), w, h) 369 | 370 | numbox = total_boxes.shape[0] 371 | if numbox>0: 372 | # second stage 373 | tempimg = np.zeros((24,24,3,numbox)) 374 | for k in range(0,numbox): 375 | tmp = np.zeros((int(tmph[k]),int(tmpw[k]),3)) 376 | tmp[dy[k]-1:edy[k],dx[k]-1:edx[k],:] = img[y[k] - 1:ey[k], x[k] - 1:ex[k], :] 377 | if tmp.shape[0]>0 and tmp.shape[1]>0 or tmp.shape[0]==0 and tmp.shape[1]==0: 378 | tempimg[:,:,:,k] = imresample(tmp, (24, 24)) 379 | else: 380 | return np.empty() 381 | tempimg = (tempimg-127.5)*0.0078125 382 | tempimg1 = np.transpose(tempimg, (3,1,0,2)) 383 | out = rnet(tempimg1) 384 | out0 = np.transpose(out[0]) 385 | out1 = np.transpose(out[1]) 386 | score = out1[1,:] 387 | ipass = np.where(score>threshold[1]) 388 | total_boxes = np.hstack([total_boxes[ipass[0],0:4].copy(), np.expand_dims(score[ipass].copy(),1)]) 389 | mv = out0[:,ipass[0]] 390 | if total_boxes.shape[0]>0: 391 | pick = nms(total_boxes, 0.7, 'Union') 392 | total_boxes = total_boxes[pick,:] 393 | total_boxes = bbreg(total_boxes.copy(), np.transpose(mv[:,pick])) 394 | total_boxes = rerec(total_boxes.copy()) 395 | 396 | numbox = total_boxes.shape[0] 397 | if numbox>0: 398 | #plotbb(img, total_boxes, output_filename) 399 | # third stage 400 | total_boxes = np.fix(total_boxes).astype(np.int32) 401 | dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph = pad(total_boxes.copy(), w, h) 402 | tempimg = np.zeros((48,48,3,numbox)) 403 | for k in range(0,numbox): 404 | tmp = np.zeros((int(tmph[k]),int(tmpw[k]),3)) 405 | tmp[dy[k]-1:edy[k],dx[k]-1:edx[k],:] = img[y[k]-1:ey[k],x[k]-1:ex[k],:] 406 | if tmp.shape[0]>0 and tmp.shape[1]>0 or tmp.shape[0]==0 and tmp.shape[1]==0: 407 | tempimg[:,:,:,k] = imresample(tmp, (48, 48)) 408 | else: 409 | return np.empty() 410 | tempimg = (tempimg-127.5)*0.0078125 411 | tempimg1 = np.transpose(tempimg, (3,1,0,2)) 412 | out = onet(tempimg1) 413 | out0 = np.transpose(out[0]) 414 | out1 = np.transpose(out[1]) 415 | out2 = np.transpose(out[2]) 416 | score = out2[1,:] 417 | points = out1 ## points[0:5] is the ratios of the width of bounding box corresponding to x of landmarks,points[5:9] is the ratios of the height of bounding box corresponding to y of landmarks. 
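        # The raw landmark outputs in `points` are fractions of the bounding-box
        # width/height; they are converted to absolute image coordinates (offset
        # by the box corners) a few lines below.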
418 | ipass = np.where(score>threshold[2]) 419 | points = points[:,ipass[0]] 420 | total_boxes = np.hstack([total_boxes[ipass[0],0:4].copy(), np.expand_dims(score[ipass].copy(),1)]) 421 | mv = out0[:,ipass[0]] 422 | 423 | w = total_boxes[:,2]-total_boxes[:,0]+1 424 | h = total_boxes[:,3]-total_boxes[:,1]+1 425 | points[0:5,:] = np.tile(w,(5, 1))*points[0:5,:] + np.tile(total_boxes[:,0],(5, 1))-1 ## points[0:5] is the ratios of the width of bounding box corresponding to x, i.e. the coloum 426 | points[5:10,:] = np.tile(h,(5, 1))*points[5:10,:] + np.tile(total_boxes[:,1],(5, 1))-1## points[0:5] is the ratios of the width of bounding box corresponding to x, i.e. the row 427 | if total_boxes.shape[0]>0: 428 | total_boxes = bbreg(total_boxes.copy(), np.transpose(mv)) 429 | pick = nms(total_boxes.copy(), 0.7, 'Min') 430 | total_boxes = total_boxes[pick,:] 431 | points = points[:,pick] 432 | #plotbb1(img, total_boxes, points, output_filename) 433 | 434 | 435 | #print('detect_face time %f\n'%elapse_time) 436 | return total_boxes, points 437 | 438 | 439 | # function [boundingbox] = bbreg(boundingbox,reg) 440 | def bbreg(boundingbox,reg): 441 | # calibrate bounding boxes 442 | if reg.shape[1]==1: 443 | reg = np.reshape(reg, (reg.shape[2], reg.shape[3])) 444 | 445 | w = boundingbox[:,2]-boundingbox[:,0]+1 446 | h = boundingbox[:,3]-boundingbox[:,1]+1 447 | b1 = boundingbox[:,0]+reg[:,0]*w 448 | b2 = boundingbox[:,1]+reg[:,1]*h 449 | b3 = boundingbox[:,2]+reg[:,2]*w 450 | b4 = boundingbox[:,3]+reg[:,3]*h 451 | boundingbox[:,0:4] = np.transpose(np.vstack([b1, b2, b3, b4 ])) 452 | return boundingbox 453 | 454 | def generateBoundingBox(imap, reg, scale, t): 455 | # use heatmap to generate bounding boxes 456 | stride=2 ## the stride of maxpooling in the step of pnet is 2 457 | cellsize=12 ## the size of the boudning box side in the original image 458 | 459 | imap = np.transpose(imap) 460 | dx1 = np.transpose(reg[:,:,0]) 461 | dy1 = np.transpose(reg[:,:,1]) 462 | dx2 = np.transpose(reg[:,:,2]) 463 | dy2 = np.transpose(reg[:,:,3]) 464 | y, x = np.where(imap >= t) 465 | if y.shape[0]==1: 466 | dx1 = np.flipud(dx1) 467 | dy1 = np.flipud(dy1) 468 | dx2 = np.flipud(dx2) 469 | dy2 = np.flipud(dy2) 470 | score = imap[(y,x)] 471 | reg = np.transpose(np.vstack([ dx1[(y,x)], dy1[(y,x)], dx2[(y,x)], dy2[(y,x)] ])) 472 | if reg.size==0: 473 | reg = np.empty((0,3)) 474 | bb = np.transpose(np.vstack([y,x])) 475 | q1 = np.fix((stride*bb+1)/scale) ## remaping the bb coordinates founded in the scaled pyramides images to the original image 476 | q2 = np.fix((stride*bb+cellsize-1+1)/scale) 477 | boundingbox = np.hstack([q1, q2, np.expand_dims(score,1), reg]) 478 | return boundingbox, reg 479 | 480 | # function pick = nms(boxes,threshold,type) 481 | def nms(boxes, threshold, method): 482 | if boxes.size==0: 483 | return np.empty((0,3)) 484 | x1 = boxes[:,0] 485 | y1 = boxes[:,1] 486 | x2 = boxes[:,2] 487 | y2 = boxes[:,3] 488 | s = boxes[:,4] 489 | area = (x2-x1+1) * (y2-y1+1) 490 | I = np.argsort(s) 491 | pick = np.zeros_like(s, dtype=np.int16) 492 | counter = 0 493 | while I.size>0: 494 | i = I[-1] 495 | pick[counter] = i 496 | counter += 1 497 | idx = I[0:-1] 498 | xx1 = np.maximum(x1[i], x1[idx]) 499 | yy1 = np.maximum(y1[i], y1[idx]) 500 | xx2 = np.minimum(x2[i], x2[idx]) 501 | yy2 = np.minimum(y2[i], y2[idx]) 502 | w = np.maximum(0.0, xx2-xx1+1) 503 | h = np.maximum(0.0, yy2-yy1+1) 504 | inter = w * h 505 | if method is 'Min': 506 | o = inter / np.minimum(area[i], area[idx]) 507 | else: 508 | o = inter / (area[i] + 
area[idx] - inter) 509 | I = I[np.where(o<=threshold)] 510 | pick = pick[0:counter] 511 | return pick 512 | 513 | # function [dy edy dx edx y ey x ex tmpw tmph] = pad(total_boxes,w,h) 514 | def pad(total_boxes, w, h): ## resise the bounding box whose cordinates are out of the image region 515 | # compute the padding coordinates (pad the bounding boxes to square) 516 | tmpw = (total_boxes[:,2]-total_boxes[:,0]+1).astype(np.int32) 517 | tmph = (total_boxes[:,3]-total_boxes[:,1]+1).astype(np.int32) 518 | numbox = total_boxes.shape[0] 519 | 520 | dx = np.ones((numbox), dtype=np.int32) 521 | dy = np.ones((numbox), dtype=np.int32) 522 | edx = tmpw.copy().astype(np.int32) 523 | edy = tmph.copy().astype(np.int32) 524 | 525 | x = total_boxes[:,0].copy().astype(np.int32) 526 | y = total_boxes[:,1].copy().astype(np.int32) 527 | ex = total_boxes[:,2].copy().astype(np.int32) 528 | ey = total_boxes[:,3].copy().astype(np.int32) 529 | 530 | ## resize x of bottom-right of bb+ 531 | tmp = np.where(ex>w) 532 | ##edx[tmp] = np.expand_dims(-ex[tmp]+w+tmpw[tmp],1) ##python 2.7.6 533 | edx[tmp] = -ex[tmp] + w + tmpw[tmp] ##python 2.7.12 534 | ex[tmp] = w 535 | 536 | ## resize y of bottom-right of bb 537 | tmp = np.where(ey>h) 538 | ##edy[tmp] = np.expand_dims(-ey[tmp]+h+tmph[tmp],1) ##python 2.7.6 539 | edy[tmp] = -ey[tmp]+h+tmph[tmp] ##python 2.7.12 540 | ey[tmp] = h 541 | 542 | ## resize x of top-left of bb 543 | tmp = np.where(x<1) 544 | ##dx[tmp] = np.expand_dims(2-x[tmp],1) ##python 2.7.6 545 | dx[tmp] = 2 - x[tmp] ##python 2.7.12 546 | x[tmp] = 1 547 | 548 | ## resize y of top-left of bb 549 | tmp = np.where(y<1) 550 | ##dy[tmp] = np.expand_dims(2-y[tmp],1) ##python 2.7.6 551 | dy[tmp] = 2-y[tmp] ##python 2.7.12 552 | y[tmp] = 1 553 | 554 | return dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph 555 | 556 | # function [bboxA] = rerec(bboxA) 557 | def rerec(bboxA): 558 | # convert bboxA to square 559 | h = bboxA[:,3]-bboxA[:,1] 560 | w = bboxA[:,2]-bboxA[:,0] 561 | l = np.maximum(w, h) 562 | bboxA[:,0] = bboxA[:,0]+w*0.5-l*0.5 563 | bboxA[:,1] = bboxA[:,1]+h*0.5-l*0.5 564 | bboxA[:,2:4] = bboxA[:,0:2] + np.transpose(np.tile(l,(2,1))) 565 | return bboxA 566 | 567 | def imresample(img, sz): 568 | #im_data = cv2.resize(img, (sz[1], sz[0]), interpolation=cv2.INTER_AREA) #pylint: disable=no-member 569 | im_data = cv2.resize(img, (sz[1], sz[0]), interpolation=cv2.INTER_NEAREST) 570 | return im_data 571 | 572 | # This method is kept for debugging purpose 573 | # h=img.shape[0] 574 | # w=img.shape[1] 575 | # hs, ws = sz 576 | # dx = float(w) / ws 577 | # dy = float(h) / hs 578 | # im_data = np.zeros((hs,ws,3)) 579 | # for a1 in range(0,hs): 580 | # for a2 in range(0,ws): 581 | # for a3 in range(0,3): 582 | # im_data[a1,a2,a3] = img[int(floor(a1*dy)),int(floor(a2*dx)),a3] 583 | # return im_data 584 | 585 | def plotbb(img, bboxs, output_filename): 586 | #plt.imshow(img) 587 | #patterns = ['-', '+', 'x', 'o', 'O', '.', '*'] # more patterns 588 | fig,ax = plt.subplots(1) 589 | ax.imshow(img) 590 | 591 | for i in range(bboxs.shape[0]): 592 | rect = patches.Rectangle( 593 | (bboxs[i,0], bboxs[i,1]), 594 | bboxs[i, 2] - bboxs[i, 0], 595 | bboxs[i, 3] - bboxs[i, 1], 596 | #hatch=patterns[i], 597 | fill=False, 598 | linewidth=1, 599 | edgecolor='r', 600 | facecolor='none' 601 | ) 602 | ax.add_patch(rect) 603 | score = '%.02f'%(bboxs[i,4]) 604 | ax.text(int(bboxs[i,0]), int(bboxs[i,1]), score, color='green', fontsize=10) 605 | #dirtmp, dirtmp1= output_filename.split('.') 606 | dirtmp = output_filename 607 | if not 
os.path.exists(dirtmp): 608 | os.mkdir(dirtmp) 609 | random_key = np.random.randint(0, high=99999) 610 | 611 | fig.savefig(os.path.join(dirtmp,'face_dd_ld_%03d.png')%random_key, dpi=90, bbox_inches='tight') 612 | 613 | # for p in [ 614 | # patches.Rectangle( 615 | # (bboxs[i,0], bboxs[i,1]), 616 | # bboxs[i, 2] - bboxs[i, 0], 617 | # bboxs[i, 3] - bboxs[i, 1], 618 | # #hatch=patterns[i], 619 | # fill=False, 620 | # linewidth=1, 621 | # edgecolor='r', 622 | # facecolor='none' 623 | # ) for i in range(bboxs.shape[0]) 624 | # ]: 625 | # ax.add_patch(p) 626 | 627 | 628 | #plt.show() 629 | #fig.savefig('rect.png', dpi=90, bbox_inches='tight') 630 | 631 | 632 | 633 | def plotbb1(img, bboxs, ld, output_filename): 634 | # plt.imshow(img) 635 | # patterns = ['-', '+', 'x', 'o', 'O', '.', '*'] # more patterns 636 | fig, ax = plt.subplots(1) 637 | ax.imshow(img) 638 | 639 | for i in range(bboxs.shape[0]): 640 | rect = patches.Rectangle( 641 | (bboxs[i, 0], bboxs[i, 1]), 642 | bboxs[i, 2] - bboxs[i, 0], 643 | bboxs[i, 3] - bboxs[i, 1], 644 | # hatch=patterns[i], 645 | fill=False, 646 | linewidth=1, 647 | edgecolor='r', 648 | facecolor='none' 649 | ) 650 | ax.add_patch(rect) 651 | score = '%.02f' % (bboxs[i, 4]) 652 | ax.text(int(bboxs[i, 0]), int(bboxs[i, 1]), score, color='green', fontsize=10) 653 | 654 | ax = plt.gca() 655 | ld = np.int32(np.squeeze(ld)) 656 | for i in range(np.int32(ld.shape[0]/2)): 657 | ax.plot(ld[i],ld[i+5], 'o', color='r', linewidth=0.1) 658 | #dirtmp, dirtmp1= output_filename.split('.') 659 | dirtmp = output_filename 660 | if not os.path.exists(dirtmp): 661 | os.mkdir(dirtmp) 662 | fig.savefig(os.path.join(dirtmp,'face_dd_ld.png'), dpi=90, bbox_inches='tight') 663 | -------------------------------------------------------------------------------- /src/align/detect_face.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gonghao67/FaceLiveNet/b5bb9ab97dd6cd9d6cfbaa167e4d966f5c4a8605/src/align/detect_face.pyc -------------------------------------------------------------------------------- /src/align/face_align_mtcnn.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | from __future__ import division 3 | from __future__ import print_function 4 | 5 | from scipy import misc 6 | import sys 7 | import os 8 | import argparse 9 | import tensorflow as tf 10 | import numpy as np 11 | import random 12 | from PIL import Image 13 | import time 14 | import shutil 15 | 16 | sys.path.append('../') 17 | import facenet 18 | import align.detect_face 19 | 20 | 21 | def load_align_mtcnn(model_dir): 22 | print('Creating networks and loading parameters') 23 | 24 | with tf.Graph().as_default(): 25 | with tf.device('/cpu:0'): 26 | # gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=args.gpu_memory_fraction) 27 | # sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False)) 28 | sess = tf.Session(config=tf.ConfigProto(log_device_placement=False)) 29 | with sess.as_default(): 30 | pnet, rnet, onet = align.detect_face.create_mtcnn(sess, model_dir) 31 | 32 | return pnet, rnet, onet 33 | 34 | def align_mtcnn(args, pnet, rnet, onet): 35 | 36 | src_path,_ = os.path.split(os.path.realpath(__file__)) 37 | 38 | minsize = 20 # minimum size of face 39 | threshold = [ 0.6, 0.7, 0.7 ] # three steps's threshold 40 | factor = 0.709 # scale factor 41 | 42 | nrof_images_total = 0 43 | nrof_successfully_aligned = 0 44 | 45 | 
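    # Buffers for exactly two aligned faces: the pair (args.img1, args.img2)
    # compared by the face verification.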
scaled_imgs = np.zeros((2, args.image_size, args.image_size, 3)) 46 | 47 | time_rec = [] 48 | bboxes = np.zeros((2,5)) 49 | for image_path in [args.img1,args.img2]: 50 | time_rec.append(time.time()) 51 | nrof_images_total += 1 52 | filename = os.path.splitext(os.path.split(image_path)[1])[0] 53 | 54 | if os.path.exists(image_path): 55 | try: 56 | img = misc.imread(image_path) 57 | except (IOError, ValueError, IndexError) as e: 58 | errorMessage = '{}: {}'.format(image_path, e) 59 | print(errorMessage) 60 | else: 61 | if img.ndim<2: 62 | print('Unable to align "%s"' % image_path) 63 | continue 64 | if img.ndim == 2: 65 | img = facenet.to_rgb(img) 66 | 67 | img = img[:, :, 0:3] 68 | output_tmp_path = os.path.join(os.path.abspath(args.output_dir),filename) 69 | 70 | bounding_boxes, _ = align.detect_face.detect_face(img, minsize, pnet, rnet, onet, threshold, factor, output_tmp_path) 71 | 72 | nrof_faces = bounding_boxes.shape[0] 73 | if nrof_faces>0: 74 | det = bounding_boxes[:,0:4] 75 | img_size = np.asarray(img.shape)[0:2] 76 | if nrof_faces>1: 77 | bounding_box_size = (det[:,2]-det[:,0])*(det[:,3]-det[:,1]) 78 | img_center = img_size / 2 79 | offsets = np.vstack([ (det[:,0]+det[:,2])/2-img_center[1], (det[:,1]+det[:,3])/2-img_center[0] ]) 80 | offset_dist_squared = np.sum(np.power(offsets,2.0),0) 81 | index = np.argmax(bounding_box_size-offset_dist_squared*2.0) # some extra weight on the centering 82 | det = det[index,:] 83 | det = np.squeeze(det) 84 | bb = np.zeros(4, dtype=np.int32) 85 | bb[0] = np.maximum(det[0]-args.margin/2, 0) 86 | bb[1] = np.maximum(det[1]-args.margin/2, 0) 87 | bb[2] = np.minimum(det[2]+args.margin/2, img_size[1]) 88 | bb[3] = np.minimum(det[3]+args.margin/2, img_size[0]) 89 | 90 | cropped = img[bb[1]:bb[3],bb[0]:bb[2],:] 91 | scaled_imgs[nrof_successfully_aligned,:,:,:] = misc.imresize(cropped, (args.image_size, args.image_size), interp='bilinear') 92 | bboxes[nrof_successfully_aligned,0:4]=bb ## bounding box with the margin 93 | if nrof_faces > 1: 94 | bboxes[nrof_successfully_aligned, 4] = bounding_boxes[index, 4] ## the detection score/probability of the bouding box 95 | else: 96 | bboxes[nrof_successfully_aligned, 4] = bounding_boxes[0, 4] 97 | 98 | 99 | nrof_successfully_aligned += 1 100 | else: 101 | print('Unable to align "%s"' % image_path) 102 | time_rec.append(time.time()) 103 | 104 | def align_mtcnn_realplay(img, pnet, rnet, onet): 105 | 106 | minsize = 20 # minimum size of face 107 | threshold = [0.6, 0.7, 0.7] # three steps's threshold 108 | factor = 0.709 # scale factor 109 | bb = [] 110 | prob = [] 111 | 112 | bounding_boxes, _ = align.detect_face.detect_face(img, minsize, pnet, rnet, onet, threshold, factor) 113 | 114 | nrof_faces = bounding_boxes.shape[0] 115 | if nrof_faces > 0: 116 | det = bounding_boxes[:, 0:5] 117 | bb = det[:, 0:4] 118 | prob = det[:, 4] 119 | img_size = np.asarray(img.shape)[0:2] 120 | # if nrof_faces > 1: 121 | # bounding_box_size = (det[:, 2] - det[:, 0]) * (det[:, 3] - det[:, 1]) 122 | # img_center = img_size / 2 123 | # offsets = np.vstack([(det[:, 0] + det[:, 2]) / 2 - img_center[1], 124 | # (det[:, 1] + det[:, 3]) / 2 - img_center[0]]) 125 | # offset_dist_squared = np.sum(np.power(offsets, 2.0), 0) 126 | # index = np.argmax(bounding_box_size - offset_dist_squared * 2.0) # some extra weight on the centering 127 | # # if det.shape[0]>0: 128 | # # print (det.shape) 129 | # # print (index.shape) 130 | # det = det[np.newaxis, index, :] 131 | # bb = det[:, 0:4] 132 | # prob = det[:, 4] 133 | # else: 134 | # print('Unable to 
align image') 135 | 136 | 137 | return bb, prob 138 | -------------------------------------------------------------------------------- /src/align/face_align_mtcnn.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gonghao67/FaceLiveNet/b5bb9ab97dd6cd9d6cfbaa167e4d966f5c4a8605/src/align/face_align_mtcnn.pyc -------------------------------------------------------------------------------- /src/face_verification.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | from __future__ import division 3 | from __future__ import print_function 4 | 5 | import tensorflow as tf 6 | import numpy as np 7 | import argparse 8 | import os 9 | import sys 10 | import math 11 | from datetime import datetime 12 | from PIL import Image 13 | 14 | sys.path.append('../') 15 | import align.face_align_mtcnn 16 | import facenet 17 | from scipy import spatial 18 | import time 19 | import matplotlib.pyplot as plt 20 | import matplotlib.patches as patches 21 | 22 | import importlib 23 | import random 24 | import tensorflow as tf 25 | import tensorflow.contrib.slim as slim 26 | from tensorflow.python.ops import data_flow_ops 27 | from tensorflow.python.framework import ops 28 | from tensorflow.python.ops import array_ops 29 | 30 | import time 31 | from datetime import datetime 32 | import lfw 33 | 34 | def exp_forward(args): 35 | network = importlib.import_module(args.model_def, 'inference') 36 | 37 | subdir = datetime.strftime(datetime.now(), '%Y%m%d-%H%M%S') 38 | log_dir = os.path.join(os.path.expanduser(args.logs_base_dir), subdir) 39 | if not os.path.isdir(log_dir): # Create the log directory if it doesn't exist 40 | os.makedirs(log_dir) 41 | model_dir = os.path.join(os.path.expanduser(args.models_base_dir), subdir) 42 | if not os.path.isdir(model_dir): # Create the model directory if it doesn't exist 43 | os.makedirs(model_dir) 44 | 45 | # Store some git revision info in a text file in the log directory 46 | src_path, _ = os.path.split(os.path.realpath(__file__)) 47 | facenet.store_revision_info(src_path, log_dir, ' '.join(sys.argv)) 48 | 49 | np.random.seed(seed=args.seed) 50 | random.seed(args.seed) 51 | 52 | train_set = facenet.get_dataset(args.data_dir) 53 | image_list, label_list, usage_list, nrof_classes = facenet.get_image_paths_and_labels_fer2013(args.data_dir, 54 | args.labels_expression, 55 | 'Training') 56 | 57 | 58 | print('Total number of subjects: %d' % nrof_classes) 59 | print('Total number of images: %d' % len(image_list)) 60 | 61 | 62 | print('Model directory: %s' % model_dir) 63 | print('Log directory: %s' % log_dir) 64 | pretrained_model = None 65 | if args.pretrained_model: 66 | pretrained_model = os.path.expanduser(args.pretrained_model) 67 | meta_file, ckpt_file = facenet.get_model_filenames(os.path.expanduser(args.pretrained_model)) 68 | print('Pre-trained model: %s' % pretrained_model) 69 | 70 | if args.lfw_dir: 71 | print('LFW directory: %s' % args.lfw_dir) 72 | # Read the file containing the pairs used for testing 73 | pairs = lfw.read_pairs(os.path.expanduser(args.lfw_pairs)) 74 | # Get the paths for the corresponding images 75 | lfw_paths, actual_issame = lfw.get_paths(os.path.expanduser(args.lfw_dir), pairs, args.lfw_file_ext) 76 | 77 | if args.evaluate_express: 78 | print('FER2013 test data directory: %s' % args.data_dir) 79 | fer2013_paths_test, label_list_test, usage_list_test, nrof_classes_test = facenet.get_image_paths_and_labels_fer2013( 
80 | args.data_dir, 81 | args.labels_expression, 82 | 'PublicTest') 83 | 84 | with tf.Graph().as_default(): 85 | tf.set_random_seed(args.seed) 86 | global_step = tf.Variable(0, trainable=False) 87 | 88 | # Create a queue that produces indices into the image_list and label_list 89 | labels = ops.convert_to_tensor(label_list, dtype=tf.int32) 90 | range_size = array_ops.shape(labels)[0] 91 | index_queue = tf.train.range_input_producer(range_size, num_epochs=None, 92 | shuffle=True, seed=None, capacity=32) 93 | 94 | index_dequeue_op = index_queue.dequeue_many(args.batch_size * args.epoch_size, 'index_dequeue') 95 | 96 | learning_rate_placeholder = tf.placeholder(tf.float32, name='learning_rate') 97 | 98 | batch_size_placeholder = tf.placeholder(tf.int32, name='batch_size') 99 | 100 | phase_train_placeholder = tf.placeholder(tf.bool, name='phase_train') 101 | 102 | image_paths_placeholder = tf.placeholder(tf.string, shape=(None, 1), name='image_paths') 103 | 104 | labels_placeholder = tf.placeholder(tf.int64, shape=(None, 1), name='labels') 105 | 106 | keep_probability_placeholder = tf.placeholder(tf.float32, name='keep_probability') 107 | 108 | input_queue = data_flow_ops.FIFOQueue(capacity=100000, 109 | dtypes=[tf.string, tf.int64], 110 | shapes=[(1,), (1,)], 111 | shared_name=None, name=None) 112 | enqueue_op = input_queue.enqueue_many([image_paths_placeholder, labels_placeholder], name='enqueue_op') 113 | 114 | nrof_preprocess_threads = 4 115 | images_and_labels = [] 116 | for _ in range(nrof_preprocess_threads): 117 | filenames, label = input_queue.dequeue() 118 | images = [] 119 | for filename in tf.unpack(filenames): 120 | file_contents = tf.read_file(filename) 121 | image = tf.image.decode_png(file_contents) 122 | if args.random_rotate: 123 | image = tf.py_func(facenet.random_rotate_image, [image], tf.uint8) 124 | if args.random_crop: 125 | image = tf.random_crop(image, [args.image_size, args.image_size, 3]) 126 | else: 127 | image = tf.image.resize_image_with_crop_or_pad(image, args.image_size, args.image_size) 128 | if args.random_flip: 129 | image = tf.image.random_flip_left_right(image) 130 | 131 | # pylint: disable=no-member 132 | image.set_shape((args.image_size, args.image_size, 3)) 133 | images.append(tf.image.per_image_standardization(image)) 134 | images_and_labels.append([images, label]) 135 | 136 | image_batch, label_batch = tf.train.batch_join( 137 | images_and_labels, batch_size=batch_size_placeholder, 138 | shapes=[(args.image_size, args.image_size, 3), ()], enqueue_many=True, 139 | capacity=4 * nrof_preprocess_threads * args.batch_size, 140 | allow_smaller_final_batch=True) 141 | # image_batch = tf.identity(image_batch, 'image_batch') 142 | image_batch = tf.identity(image_batch, 'input') 143 | label_batch = tf.identity(label_batch, 'label_batch') 144 | 145 | print('Building training graph') 146 | 147 | # Build the inference graph 148 | prelogits, _ = network.inference(image_batch, keep_probability_placeholder, 149 | phase_train=phase_train_placeholder, weight_decay=args.weight_decay) 150 | # logits = slim.fully_connected(prelogits, len(train_set), activation_fn=None, 151 | # weights_initializer=tf.truncated_normal_initializer(stddev=0.1), 152 | # weights_regularizer=slim.l2_regularizer(args.weight_decay), 153 | # scope='Logits', reuse=False) 154 | 155 | logits0 = slim.fully_connected(prelogits, 512, activation_fn=tf.nn.relu, 156 | weights_initializer=tf.truncated_normal_initializer(stddev=0.1), 157 | weights_regularizer=slim.l2_regularizer(args.weight_decay), 158 | 
scope='Logits0', reuse=False) 159 | 160 | logits = slim.fully_connected(logits0, len(set(label_list)), activation_fn=tf.nn.relu, 161 | weights_initializer=tf.truncated_normal_initializer(stddev=0.1), 162 | weights_regularizer=slim.l2_regularizer(args.weight_decay), 163 | scope='Logits', reuse=False) 164 | 165 | logits = tf.identity(logits, 'logits') 166 | 167 | embeddings = tf.nn.l2_normalize(prelogits, 1, 1e-10, name='embeddings') 168 | 169 | # Add center loss 170 | if args.center_loss_factor > 0.0: 171 | # prelogits_center_loss, centers, _, centers_cts_batch_reshape = facenet.center_loss(prelogits, label_batch, args.center_loss_alfa, nrof_classes) 172 | prelogits_center_loss, centers, _, centers_cts_batch_reshape = facenet.center_loss(embeddings, label_batch, 173 | args.center_loss_alfa, 174 | nrof_classes) 175 | # prelogits_center_loss, _ = facenet.center_loss_similarity(prelogits, label_batch, args.center_loss_alfa, nrof_classes) ####Similarity cosine distance, center loss 176 | tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, prelogits_center_loss * args.center_loss_factor) 177 | 178 | learning_rate = tf.train.exponential_decay(learning_rate_placeholder, global_step, 179 | args.learning_rate_decay_epochs * args.epoch_size, 180 | args.learning_rate_decay_factor, staircase=True) 181 | tf.summary.scalar('learning_rate', learning_rate) 182 | 183 | # Calculate the average cross entropy loss across the batch 184 | cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits( 185 | logits, label_batch, name='cross_entropy_per_example') 186 | cross_entropy_mean = tf.reduce_mean(cross_entropy, name='cross_entropy') 187 | tf.add_to_collection('losses', cross_entropy_mean) 188 | 189 | # Calculate the total losses 190 | regularization_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES) 191 | # total_loss = tf.add_n([cross_entropy_mean] + regularization_losses, name='total_loss') 192 | total_loss = tf.add_n([cross_entropy_mean], name='total_loss') 193 | 194 | #### Training accuracy of softmax: check the underfitting or overfiting ############################# 195 | correct_prediction = tf.equal(tf.argmax(logits, 1), label_batch) 196 | softmax_acc = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) 197 | ######################################################################################################## 198 | 199 | ########## edit mzh ##################### 200 | # Create list with variables to restore 201 | restore_vars = [] 202 | update_gradient_vars = [] 203 | # logits_var = [] 204 | if args.pretrained_model: 205 | for var in tf.global_variables(): 206 | # if 'InceptionResnet' in var.op.name: 207 | # restore_vars.append(var) 208 | if 'Logits' in var.op.name or 'Logits0' in var.op.name: 209 | print(var.op.name) 210 | update_gradient_vars.append(var) 211 | restore_vars = tf.global_variables(); 212 | restore_saver = tf.train.Saver(restore_vars) 213 | else: 214 | update_gradient_vars = tf.trainable_variables() 215 | 216 | # update_gradient_vars0 = tf.global_variables() 217 | # update_gradient_vars = tf.trainable_variables() 218 | train_op = facenet.train(total_loss, global_step, args.optimizer, 219 | learning_rate, args.moving_average_decay, update_gradient_vars, args.log_histograms) 220 | 221 | # saver = tf.train.Saver(tf.trainable_variables(), max_to_keep=3) 222 | saver = tf.train.Saver(tf.global_variables(), max_to_keep=3) 223 | 224 | # Build the summary operation based on the TF collection of Summaries. 
225 | summary_op = tf.summary.merge_all() 226 | 227 | # Start running operations on the Graph. 228 | gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=args.gpu_memory_fraction) 229 | sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False)) 230 | sess.run(tf.global_variables_initializer()) 231 | sess.run(tf.local_variables_initializer()) 232 | summary_writer = tf.summary.FileWriter(log_dir, sess.graph) 233 | tf.train.start_queue_runners(sess=sess) 234 | 235 | with sess.as_default(): 236 | 237 | if pretrained_model: 238 | print( 239 | 'Restoring pretrained model: %s' % os.path.join(os.path.expanduser(args.pretrained_model), ckpt_file)) 240 | # saver.restore(sess, pretrained_model) 241 | restore_saver.restore(sess, os.path.join(os.path.expanduser(args.pretrained_model), ckpt_file)) 242 | 243 | # Training and validation loop 244 | print('Running training') 245 | epoch = 0 246 | acc = 0 247 | val = 0 248 | far = 0 249 | best_acc = 0 250 | acc_expression = 0 251 | while epoch < args.max_nrof_epochs: 252 | step = sess.run(global_step, feed_dict=None) 253 | print('Epoch step: %d' % step) 254 | epoch = step // args.epoch_size 255 | 256 | if not (epoch % 1): 257 | if args.evaluate_express: 258 | acc_expression = evaluate_expression(sess, enqueue_op, image_paths_placeholder, 259 | labels_placeholder, 260 | phase_train_placeholder, batch_size_placeholder, logits, 261 | label_batch, fer2013_paths_test, label_list_test, 262 | args.lfw_batch_size, 263 | log_dir, step, summary_writer, 264 | keep_probability_placeholder) 265 | 266 | return model_dir 267 | 268 | def evaluate_expression(sess, enqueue_op, image_paths_placeholder, labels_placeholder, phase_train_placeholder, 269 | batch_size_placeholder, 270 | logits, labels, image_paths, actual_expre, batch_size, log_dir, step, summary_writer, 271 | keep_probability_placeholder): 272 | start_time = time.time() 273 | # Run forward pass to calculate embeddings 274 | print('Runnning forward pass on FER2013 images') 275 | nrof_images = len(actual_expre) 276 | nrof_batches = nrof_images // batch_size 277 | 278 | # Enqueue one epoch of image paths and labels 279 | labels_array = np.expand_dims(np.arange((nrof_batches*batch_size)), 1) ## used for noting the locations of the image in the queue 280 | image_paths_array = np.expand_dims(np.array(image_paths[0:nrof_batches*batch_size]), 1) 281 | sess.run(enqueue_op, {image_paths_placeholder: image_paths_array, labels_placeholder: labels_array}) 282 | 283 | 284 | logits_size = logits.get_shape()[1] 285 | 286 | 287 | logits_array = np.zeros((nrof_batches*batch_size, logits_size), dtype=float) 288 | lab_array = np.zeros((nrof_batches*batch_size), dtype=int) 289 | for ii in range(nrof_batches): 290 | #print('nrof_batches %d'%ii) 291 | feed_dict = {phase_train_placeholder: False, batch_size_placeholder: batch_size, 292 | keep_probability_placeholder: 1.0} 293 | logits_batch, lab = sess.run([logits, labels], feed_dict=feed_dict) 294 | lab_array[lab] = lab 295 | logits_array[lab] = logits_batch 296 | assert np.array_equal(lab_array, np.arange(nrof_batches*batch_size)) == True, 'Wrong labels used for evaluation, possibly caused by training examples left in the input pipeline' 297 | 298 | express_probs = np.exp(logits_array) / np.tile(np.reshape(np.sum(np.exp(logits_array), 1), (logits_array.shape[0], 1)), (1, logits_array.shape[1])) 299 | nrof_expression = express_probs.shape[1] 300 | expressions_predict = np.argmax(express_probs, 1) 301 | #### Training accuracy of softmax: check 
the underfitting or overfiting ############################# 302 | correct_prediction = np.equal(expressions_predict, actual_expre[0:nrof_batches*batch_size]) 303 | test_expr_acc = np.mean(correct_prediction) 304 | 305 | ######################################################################################################## 306 | print('%d expressions recognition accuracy is: %f' % (nrof_expression, test_expr_acc)) 307 | 308 | fer_time = time.time() - start_time 309 | # Add validation loss and accuracy to summary 310 | summary = tf.Summary() 311 | # pylint: disable=maybe-no-member 312 | summary.value.add(tag='fer/accuracy', simple_value=test_expr_acc) 313 | summary.value.add(tag='time/fer', simple_value=fer_time) 314 | summary_writer.add_summary(summary, step) 315 | with open(os.path.join(log_dir, 'Fer2013_result.txt'), 'at') as f: 316 | f.write('%d\t%.5f\n' % (step, test_expr_acc)) 317 | 318 | return test_expr_acc 319 | 320 | 321 | def load_align_model(args): 322 | with tf.device('/cpu:0'): 323 | 324 | pnet, rnet, onet = align.face_align_mtcnn.load_align_mtcnn(args.align_model_dir) 325 | 326 | return pnet, rnet, onet 327 | 328 | def load_face_verif_model(args): 329 | with tf.device('/cpu:0'): 330 | 331 | sess = tf.Session(config=tf.ConfigProto(log_device_placement=True)) 332 | 333 | # Load the model of face verification 334 | print('Model directory: %s' % args.model_dir) 335 | meta_file, ckpt_file = facenet.get_model_filenames(os.path.expanduser(args.model_dir)) 336 | 337 | print('Metagraph file: %s' % meta_file) 338 | print('Checkpoint file: %s' % ckpt_file) 339 | 340 | model_dir_exp = os.path.expanduser(args.model_dir) 341 | saver = tf.train.import_meta_graph(os.path.join(model_dir_exp, meta_file)) 342 | #saver.restore(tf.get_default_session(), os.path.join(model_dir_exp, ckpt_file)) 343 | saver.restore(sess, os.path.join(model_dir_exp, ckpt_file)) 344 | 345 | return sess 346 | 347 | 348 | def load_models(args): 349 | with tf.device('/cpu:0'): 350 | # gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=args.gpu_memory_fraction) 351 | 352 | # with tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, device_count = {'CPU': 0})) as sess: 353 | # with tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=True)) as sess: 354 | #with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess: 355 | 356 | sess = tf.Session(config=tf.ConfigProto(log_device_placement=True)) 357 | # Load the model of face detection 358 | pnet, rnet, onet = align.face_align_mtcnn.load_align_mtcnn(args.align_model_dir) 359 | 360 | # Load the model of face verification 361 | print('Model directory: %s' % args.model_dir) 362 | meta_file, ckpt_file = facenet.get_model_filenames(os.path.expanduser(args.model_dir)) 363 | 364 | print('Metagraph file: %s' % meta_file) 365 | print('Checkpoint file: %s' % ckpt_file) 366 | 367 | model_dir_exp = os.path.expanduser(args.model_dir) 368 | saver = tf.train.import_meta_graph(os.path.join(model_dir_exp, meta_file)) 369 | #saver.restore(tf.get_default_session(), os.path.join(model_dir_exp, ckpt_file)) 370 | saver.restore(sess, os.path.join(model_dir_exp, ckpt_file)) 371 | 372 | return pnet, rnet, onet, sess 373 | 374 | def load_models_forward(args, nrof_expressions): 375 | with tf.device('/cpu:0'): 376 | # gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=args.gpu_memory_fraction) 377 | 378 | # with tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, device_count = {'CPU': 0})) as sess: 379 | # with 
tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=True)) as sess: 380 | # with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess: 381 | 382 | 383 | # Load the model of face detection 384 | pnet, rnet, onet = align.face_align_mtcnn.load_align_mtcnn(args.align_model_dir) 385 | 386 | 387 | ########### load face verif_expression model ############################# 388 | # Load the model of face verification 389 | print('Model directory: %s' % args.model_dir) 390 | meta_file, ckpt_file = facenet.get_model_filenames(os.path.expanduser(args.model_dir)) 391 | 392 | network = importlib.import_module(args.model_def, 'inference') 393 | 394 | image_batch_placeholder = tf.placeholder(tf.float32, shape=(None, args.image_size, args.image_size, 3), name='image_batch') 395 | 396 | # Build the inference graph 397 | prelogits, _ = network.inference(image_batch_placeholder, 1.0, 398 | phase_train=False, weight_decay=args.weight_decay) 399 | 400 | embeddings = tf.nn.l2_normalize(prelogits, 1, 1e-10, name='embeddings') 401 | 402 | 403 | logits0 = slim.fully_connected(prelogits, 512, activation_fn=tf.nn.relu, 404 | weights_initializer=tf.truncated_normal_initializer(stddev=0.1), 405 | weights_regularizer=slim.l2_regularizer(args.weight_decay), 406 | scope='Logits0', reuse=False) 407 | 408 | logits = slim.fully_connected(logits0, nrof_expressions, activation_fn=tf.nn.relu, 409 | weights_initializer=tf.truncated_normal_initializer(stddev=0.1), 410 | weights_regularizer=slim.l2_regularizer(args.weight_decay), 411 | scope='Logits', reuse=False) 412 | 413 | sess = tf.Session(config=tf.ConfigProto(log_device_placement=True)) 414 | 415 | restore_vars = tf.global_variables(); 416 | restore_saver = tf.train.Saver(restore_vars) 417 | 418 | restore_saver.restore(sess, os.path.join(os.path.expanduser(args.model_dir), ckpt_file)) 419 | 420 | return pnet, rnet, onet, sess, embeddings, logits, image_batch_placeholder 421 | 422 | def load_models_forward_v2(args, Expr_dataset): 423 | with tf.device('/cpu:0'): 424 | # gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=args.gpu_memory_fraction) 425 | 426 | # with tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, device_count = {'CPU': 0})) as sess: 427 | # with tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=True)) as sess: 428 | # with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess: 429 | 430 | 431 | # Load the model of face detection 432 | pnet, rnet, onet = align.face_align_mtcnn.load_align_mtcnn(args.align_model_dir) 433 | 434 | 435 | ########### load face verif_expression model ############################# 436 | # Load the model of face verification 437 | print('Model directory: %s' % args.model_dir) 438 | meta_file, ckpt_file = facenet.get_model_filenames(os.path.expanduser(args.model_dir)) 439 | 440 | #facenet.load_model(args.model_dir, meta_file, ckpt_file) 441 | sess = tf.Session(config=tf.ConfigProto(log_device_placement=True)) 442 | model_dir_exp = os.path.expanduser(args.model_dir) 443 | saver = tf.train.import_meta_graph(os.path.join(args.model_dir, meta_file)) 444 | saver.restore(sess, os.path.join(model_dir_exp, ckpt_file)) 445 | 446 | if Expr_dataset == 'CK+': 447 | args_model.images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0") 448 | args_model.embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0") 449 | args_model.keep_probability_placeholder = tf.get_default_graph().get_tensor_by_name('keep_probability:0') 
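            # Note (based on standard TF1 behavior, not stated in the original
            # code): get_tensor_by_name() only succeeds if the metagraph was
            # exported with these exact tensor names ("input:0", "embeddings:0",
            # ...). If a lookup fails, the names present in the restored graph
            # can be listed for debugging with, e.g.:
            #     print([op.name for op in tf.get_default_graph().get_operations()])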
450 |             args_model.phase_train_placeholder = tf.get_default_graph().get_tensor_by_name('phase_train:0')
451 |             args_model.logits = tf.get_default_graph().get_tensor_by_name('logits:0')
452 | 
453 | 
454 |         if Expr_dataset == 'FER2013':
455 |             args_model.images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0")
456 |             args_model.embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0")
457 |             args_model.keep_probability_placeholder = tf.get_default_graph().get_tensor_by_name('keep_probability:0')
458 |             args_model.phase_train_placeholder = tf.get_default_graph().get_tensor_by_name('phase_train:0')
459 |             args_model.logits = tf.get_default_graph().get_tensor_by_name('logits:0')
460 |             args_model.phase_train_placeholder_expression = tf.get_default_graph().get_tensor_by_name('phase_train_expression:0')
461 | 
462 |     return pnet, rnet, onet, sess, args_model
463 | 
464 | 
465 | 
466 | def compare_faces(args, pnet, rnet, onet, sess):
467 | 
468 |     start_time = time.time()
469 |     align_imgs, bboxes = align.face_align_mtcnn.align_mtcnn(args, pnet, rnet, onet)
470 |     elapsed_time1 = time.time() - start_time
471 |     # print('Face_loop Elapsed time: %fs\n' % (elapsed_time1))
472 |     # print('The probability of the same person : %f\n' % simi)
473 | 
474 | 
475 |     # Get input and output tensors
476 |     #images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0")
477 |     images_placeholder = tf.get_default_graph().get_tensor_by_name("image_batch:0")
478 |     embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0")
479 |     phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0")
480 | 
481 |     # Load images
482 |     image_size = images_placeholder.get_shape()[1]
483 |     images = facenet.load_data_im(align_imgs, False, False, image_size)
484 | 
485 |     feed_dict = {images_placeholder: images, phase_train_placeholder: False}
486 |     try:
487 |         emb_array = sess.run(embeddings, feed_dict=feed_dict)
488 |     except Exception:
489 |         sess.close()
490 |         print("Unexpected error:", sys.exc_info()[0])
491 |         raise
492 |     embeddings1 = emb_array[0]
493 |     embeddings2 = emb_array[1]
494 | 
495 |     # Calculate the distance between the embeddings and verify the two faces
496 |     assert (embeddings1.shape[0] == embeddings2.shape[0])
497 |     diff = np.subtract(embeddings1, embeddings2)
498 |     dist = np.sum(np.square(diff), 0)
499 |     simi = 1 - spatial.distance.cosine(embeddings1, embeddings2)
500 | 
501 |     predict_issame = np.less(dist, args.threshold)
502 |     # predict_issame = np.greater(simi, args.threshold)
503 |     elapsed_time = time.time() - start_time
504 | 
505 | 
506 |     # print('Image 1: %s\n' % args.img1)
507 |     # print('Image 2: %s\n' % args.img2)
508 |     # print('The same person: %s.......The distance of the two persons: %f | The threshold: %f\n' % (
509 |     #     predict_issame, dist, args.threshold))
510 | 
511 |     elapsed_time2 = elapsed_time - elapsed_time1
512 |     print(
513 |         'Elapsed time: %fs....Detection: %fs....Face verification %fs\n' % (elapsed_time, elapsed_time1, elapsed_time2))
514 | 
515 |     # img1 = Image.open(args.img1)
516 |     # img2 = Image.open(args.img2)
517 |     # plotbb(img1, bboxes[0, :])
518 |     #
519 |     # plotbb(img2, bboxes[1, :])
520 |     #
521 |     # raw_input('pause...')
522 |     return predict_issame, dist, bboxes
523 | 
524 | def face_expression(face_img_, img_ref_, args, sess):
525 |     start_time = time.time()
526 |     imgs = np.zeros((2, args.image_size, args.image_size, 3))
527 |     imgs[0, :, :, :]=face_img_
528 |     imgs[1, :, :, :] = img_ref_
529 | 
530 |     express_probs = []
531 | 
532 |     # Get input and output tensors
533 | 
images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0") 534 | #images_placeholder = tf.get_default_graph().get_tensor_by_name("image_batch:0") 535 | phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0") 536 | #keep_probability_placeholder = tf.get_default_graph().get_tensor_by_name("keep_probability:0") 537 | 538 | embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0") 539 | #logits = tf.get_default_graph().get_tensor_by_name("logits:0") 540 | 541 | 542 | # Load images 543 | image_size = images_placeholder.get_shape()[1] 544 | images = facenet.load_data_im(imgs, False, False, image_size) 545 | 546 | #feed_dict = {images_placeholder: images, phase_train_placeholder: False, keep_probability_placeholder:1.0} 547 | feed_dict = {images_placeholder: images, phase_train_placeholder: False} 548 | try: 549 | #emb_array, logits_array = sess.run([embeddings, logits], feed_dict=feed_dict) 550 | emb_array = sess.run(embeddings, feed_dict=feed_dict) 551 | except: 552 | sess.close() 553 | print("Unexpected error:", sys.exc_info()[0]) 554 | 555 | embeddings1 = emb_array[0] 556 | embeddings2 = emb_array[1] 557 | 558 | 559 | # Caculate the distance of embeddings and verification the two face 560 | assert (embeddings1.shape[0] == embeddings2.shape[0]) 561 | diff = np.subtract(embeddings1, embeddings2) 562 | dist = np.sum(np.square(diff), 0) 563 | simi = 1 - spatial.distance.cosine(embeddings1, embeddings2) 564 | 565 | predict_issame = np.less(dist, args.threshold) 566 | # predict_issame = np.greater(simi, args.threshold) 567 | 568 | # logits0 = logits_array[0] 569 | # express_probs = np.exp(logits0)/sum(np.exp(logits0)) 570 | 571 | elapsed_time = time.time() - start_time 572 | 573 | 574 | # print('Image 1: %s\n' % args.img1) 575 | # print('Image 2: %s\n' % args.img2) 576 | # print('The same person: %s.......The distance of the two persons: %f | The threshold: %f\n' % ( 577 | # predict_issame, dist, args.threshold)) 578 | 579 | # print( 580 | # 'Elapsed time: %fs....Detection: %fs....Face verification %fs\n' % (elapsed_time, elapsed_time1, elapsed_time2)) 581 | 582 | # img1 = Image.open(args.img1) 583 | # img2 = Image.open(args.img2) 584 | # plotbb(img1, bboxes[0, :]) 585 | # 586 | # plotbb(img2, bboxes[1, :]) 587 | # 588 | # raw_input('pause...') 589 | return predict_issame, dist, express_probs 590 | 591 | def face_expression_multiref(face_img_, img_refs_, args, sess): 592 | start_time = time.time() 593 | if len(img_refs_.shape) == 4: 594 | nrof_imgs = 1+img_refs_.shape[0] 595 | elif len(img_refs_.shape) == 3: 596 | nrof_imgs = 2 597 | else: 598 | raise ValueError("Dimensions of img_refs is not correct!") 599 | 600 | imgs = np.zeros((nrof_imgs, args.image_size, args.image_size, 3)) 601 | imgs[0, :, :, :]=face_img_ 602 | imgs[1:nrof_imgs, :, :, :] = img_refs_ 603 | 604 | express_probs = [] 605 | 606 | # Get input and output tensors 607 | images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0") 608 | embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0") 609 | keep_probability_placeholder = tf.get_default_graph().get_tensor_by_name('keep_probability:0') 610 | # weight_decay_placeholder = tf.get_default_graph().get_tensor_by_name('weight_decay:0') 611 | phase_train_placeholder = tf.get_default_graph().get_tensor_by_name('phase_train:0') 612 | logits = tf.get_default_graph().get_tensor_by_name('logits:0') 613 | 614 | # images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0") 615 | # 
#images_placeholder = tf.get_default_graph().get_tensor_by_name("image_batch:0")
616 |     # phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0")
617 |     # keep_probability_placeholder = tf.get_default_graph().get_tensor_by_name("keep_probability:0")
618 |     #
619 |     # embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0")
620 |     # logits = tf.get_default_graph().get_tensor_by_name("logits:0")
621 | 
622 | 
623 |     # Load images
624 |     image_size = images_placeholder.get_shape()[1]
625 |     images = facenet.load_data_im(imgs, False, False, image_size)
626 | 
627 |     feed_dict = {images_placeholder: images, phase_train_placeholder: False, keep_probability_placeholder:1.0}
628 |     #feed_dict = {images_placeholder: images, phase_train_placeholder: False}
629 |     try:
630 |         t2 = time.time()
631 |         emb_array, logits_array = sess.run([embeddings, logits], feed_dict=feed_dict)
632 |         #emb_array = sess.run(embeddings, feed_dict=feed_dict)
633 |         t3 = time.time()
634 |         print('Embedding calculation FPS:%d' % (int(1 / (t3 - t2))))
635 |     except Exception:
636 |         sess.close()
637 |         print("Unexpected error:", sys.exc_info()[0])
638 |         raise
639 |     embeddings1 = emb_array[0]
640 |     embeddings2 = emb_array[1:len(emb_array)]
641 | 
642 | 
643 |     # Calculate the distances between the embeddings and verify the faces
644 |     assert (embeddings1.shape[0] == embeddings2[0].shape[0])
645 |     diff = np.subtract(embeddings1, embeddings2)
646 |     if len(diff.shape)==2:
647 |         dist = np.sum(np.square(diff), 1)
648 |     elif len(diff.shape)==1:
649 |         dist = np.sum(np.square(diff), 0)
650 |     else:
651 |         raise ValueError("The dimension of the embeddings is not correct!")
652 | 
653 |     #simi = 1 - spatial.distance.cosine(embeddings1, embeddings2)
654 | 
655 |     predict_issame = np.less(dist, args.threshold)
656 |     ##predict_issame = np.greater(simi, args.threshold)
657 | 
658 |     logits0 = logits_array[0]
659 |     express_probs = np.exp(logits0)/sum(np.exp(logits0))
660 | 
661 |     return predict_issame, dist, express_probs
662 | 
663 | #def face_embeddings(img_refs_, args, sess, images_placeholder, embeddings, keep_probability_placeholder, phase_train_placeholder):
664 | def face_embeddings(img_refs_, args, sess, args_model, Expr_dataset):
665 |     start_time = time.time()
666 | 
667 |     # Load images
668 |     image_size = args.image_size
669 | 
670 |     images = facenet.load_data_im(img_refs_, False, False, image_size)
671 |     if len(images.shape)==3:
672 |         images = np.expand_dims(images, axis=0)
673 | 
674 |     if Expr_dataset in ('CK+', 'FER2013'):
675 |         # feed_dict = {images_placeholder: images}
676 |         feed_dict = {args_model.phase_train_placeholder: False, args_model.images_placeholder: images, args_model.keep_probability_placeholder: 1.0}
677 | 
678 |     t2 = time.time()
679 |     emb_array = sess.run([args_model.embeddings], feed_dict=feed_dict)
680 |     t3 = time.time()
681 |     print('Embedding calculation FPS:%d' % (int(1 / (t3 - t2))))
682 |     t2 = time.time()
683 | 
684 |     return emb_array
685 | 
686 | 
687 | 
688 | #def face_expression_multiref_forward(face_img_, emb_ref, args, sess, images_placeholder, embeddings, keep_probability_placeholder, phase_train_placeholder, logits):
689 | def face_expression_multiref_forward(face_img_, emb_ref, args, sess, args_model, Expr_dataset):
690 |     start_time = time.time()
691 |     # if len(img_refs_.shape) == 4:
692 |     #     nrof_imgs = 1+img_refs_.shape[0]
693 |     # elif len(img_refs_.shape) == 3:
694 |     #     nrof_imgs = 2
695 |     # else:
696 |     #     raise ValueError("Dimensions of img_refs is not correct!")
697 |     nrof_imgs = 1
698 |     imgs = np.zeros((nrof_imgs, 
args.image_size, args.image_size, 3)) 699 | imgs[0, :, :, :]=face_img_ 700 | 701 | # Load images 702 | image_size = args.image_size 703 | 704 | images = facenet.load_data_im(imgs, False, False, image_size) 705 | if len(images.shape) == 3: 706 | images = np.expand_dims(images,axis=0) 707 | 708 | if Expr_dataset == 'CK+': 709 | feed_dict = {args_model.phase_train_placeholder: False, args_model.images_placeholder: images, args_model.keep_probability_placeholder: 1.0} 710 | if Expr_dataset == 'FER2013': 711 | feed_dict = {args_model.phase_train_placeholder: False, args_model.phase_train_placeholder_expression: False, args_model.images_placeholder: images, args_model.keep_probability_placeholder: 1.0} 712 | 713 | t2 = time.time() 714 | emb_array, logits_array = sess.run([args_model.embeddings, args_model.logits], feed_dict=feed_dict) 715 | t3 = time.time() 716 | print('Embedding calculation FPS:%d' % (int(1 / (t3 - t2)))) 717 | t2 = time.time() 718 | embeddings1 = emb_array[0] 719 | embeddings2 = emb_ref[0] 720 | 721 | 722 | # Caculate the distance of embeddings and verification the two face 723 | assert (embeddings1.shape[0] == embeddings2[0].shape[0]) 724 | diff = np.subtract(embeddings1, embeddings2) 725 | if len(diff.shape)==2: 726 | dist = np.sum(np.square(diff), 1) 727 | elif len(diff.shape)==1: 728 | dist = np.sum(np.square(diff), 0) 729 | else: 730 | raise ValueError("Dimension of the embeddings2 is not correct!") 731 | 732 | #simi = 1 - spatial.distance.cosine(embeddings1, embeddings2) 733 | 734 | predict_issame = np.less(dist, args.threshold) 735 | ##predict_issame = np.greater(simi, args.threshold) 736 | 737 | logits0 = logits_array[0] 738 | express_probs = np.exp(logits0)/sum(np.exp(logits0)) 739 | return predict_issame, dist, express_probs 740 | 741 | def face_verif_batch(face_img_batch, img_refs_, args, sess): 742 | start_time = time.time() 743 | predict_issame_batch = [] 744 | dist_batch = [] 745 | if len(img_refs_.shape) == 4: 746 | nrof_imgs = len(face_img_batch)+img_refs_.shape[0] 747 | elif len(img_refs_.shape) == 3: 748 | nrof_imgs = len(face_img_batch) + 1 749 | else: 750 | raise ValueError("Dimensions of img_refs is not correct!") 751 | 752 | imgs = np.zeros((nrof_imgs, args.image_size, args.image_size, 3)) 753 | imgs[0:len(face_img_batch), :, :, :]=np.array(face_img_batch) 754 | imgs[len(face_img_batch):nrof_imgs, :, :, :] = img_refs_ 755 | 756 | express_probs = [] 757 | 758 | # Get input and output tensors 759 | images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0") 760 | #images_placeholder = tf.get_default_graph().get_tensor_by_name("image_batch:0") 761 | phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0") 762 | #keep_probability_placeholder = tf.get_default_graph().get_tensor_by_name("keep_probability:0") 763 | 764 | embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0") 765 | #logits = tf.get_default_graph().get_tensor_by_name("logits:0") 766 | 767 | 768 | # Load images 769 | image_size = images_placeholder.get_shape()[1] 770 | images = facenet.load_data_im(imgs, False, False, image_size) 771 | 772 | #feed_dict = {images_placeholder: images, phase_train_placeholder: False, keep_probability_placeholder:1.0} 773 | feed_dict = {images_placeholder: images, phase_train_placeholder: False} 774 | try: 775 | t2 = time.time() 776 | #emb_array, logits_array = sess.run([embeddings, logits], feed_dict=feed_dict) 777 | emb_array = sess.run(embeddings, feed_dict=feed_dict) 778 | t3 = time.time() 779 | 
print('Embedding calculation FPS:%d' % (int(1 / (t3 - t2) * nrof_imgs))) 780 | except: 781 | sess.close() 782 | print("Unexpected error:", sys.exc_info()[0]) 783 | 784 | embeddings1 = emb_array[0:len(face_img_batch)] 785 | embeddings2 = emb_array[len(face_img_batch):len(emb_array)] 786 | 787 | 788 | # Caculate the distance of embeddings and verification the two face 789 | assert (embeddings1[1].shape[0] == embeddings2[1].shape[0]) 790 | for i in range(nrof_imgs-len(face_img_batch)): 791 | diff = np.subtract(embeddings1, embeddings2[i]) 792 | if len(diff.shape)==2: 793 | dist = np.sum(np.square(diff), 1) 794 | elif len(diff.shape)==1: 795 | dist = np.sum(np.square(diff), 0) 796 | else: 797 | raise ValueError("Dimension of the embeddings2 is not correct!") 798 | 799 | #simi = 1 - spatial.distance.cosine(embeddings1, embeddings2) 800 | 801 | predict_issame = np.less(dist, args.threshold) 802 | 803 | # predict_issame = np.greater(simi, args.threshold) 804 | 805 | # logits0 = logits_array[0] 806 | # express_probs = np.exp(logits0)/sum(np.exp(logits0)) 807 | predict_issame_batch.append(predict_issame) 808 | dist_batch.append(dist) 809 | elapsed_time = time.time() - start_time 810 | 811 | return predict_issame_batch, dist_batch 812 | 813 | 814 | 815 | 816 | def plotbb(img, bboxes, ld=[], output_filename=[]): 817 | #img = np.float32(image) 818 | # plt.imshow(img) 819 | # patterns = ['-', '+', 'x', 'o', 'O', '.', '*'] # more patterns 820 | 821 | 822 | fig, ax = plt.subplots(1) 823 | ax.imshow(img) 824 | 825 | if bboxes.ndim <2: 826 | bboxes = np.expand_dims(bboxes, axis=0) 827 | 828 | for i in range(bboxes.shape[0]): 829 | rect = patches.Rectangle( 830 | (bboxes[i, 0], bboxes[i, 1]), 831 | bboxes[i, 2] - bboxes[i, 0], 832 | bboxes[i, 3] - bboxes[i, 1], 833 | # hatch=patterns[i], 834 | fill=False, 835 | linewidth=1, 836 | edgecolor='r', 837 | facecolor='none' 838 | ) 839 | ax.add_patch(rect) 840 | score = '%.02f' % (bboxes[i, 4]) 841 | ax.text(int(bboxes[i, 0]), int(bboxes[i, 1]), score, color='green', fontsize=10) 842 | plt.pause(0.0001) 843 | 844 | if(ld): 845 | ax = plt.gca() 846 | ld = np.int32(np.squeeze(ld)) 847 | for i in range(np.int32(ld.shape[0] / 2)): 848 | ax.plot(ld[i], ld[i + 5], 'o', color='r', linewidth=0.1) 849 | 850 | if (output_filename): 851 | dirtmp = output_filename 852 | if not os.path.exists(dirtmp): 853 | os.mkdir(dirtmp) 854 | random_key = np.random.randint(0, high=99999) 855 | fig.savefig(os.path.join(dirtmp, 'face_dd_ld_%03d.png') % random_key, dpi=90, bbox_inches='tight') 856 | 857 | 858 | 859 | 860 | class args_model(): 861 | def __init__(self): 862 | self.images_placeholder = None 863 | self.embeddings = None 864 | self.keep_probability_placeholder = None 865 | self.phase_train_placeholder = None 866 | self.logits = None 867 | self.phase_train_placeholder_expression = None -------------------------------------------------------------------------------- /src/facenet.py: -------------------------------------------------------------------------------- 1 | # pylint: disable=missing-docstring 2 | from __future__ import absolute_import 3 | from __future__ import division 4 | from __future__ import print_function 5 | 6 | import os 7 | from subprocess import Popen, PIPE 8 | import tensorflow as tf 9 | from tensorflow.python.framework import ops 10 | import numpy as np 11 | from scipy import misc 12 | import matplotlib.pyplot as plt 13 | from sklearn.cross_validation import KFold 14 | from scipy import interpolate 15 | from tensorflow.python.training import training 16 | import 
random 17 | import re 18 | from collections import Counter 19 | import matplotlib.pyplot as plt 20 | import cv2 21 | #import python_getdents 22 | from scipy import spatial 23 | from sklearn.decomposition import PCA 24 | from itertools import islice 25 | import itertools 26 | 27 | 28 | def shuffle_examples(image_paths, labels): 29 | shuffle_list = list(zip(image_paths, labels)) 30 | random.shuffle(shuffle_list) 31 | image_paths_shuff, labels_shuff = zip(*shuffle_list) 32 | return image_paths_shuff, labels_shuff 33 | 34 | def read_images_from_disk(input_queue): 35 | """Consumes a single filename and label as a ' '-delimited string. 36 | Args: 37 | filename_and_label_tensor: A scalar string tensor. 38 | Returns: 39 | Two tensors: the decoded image, and the string label. 40 | """ 41 | label = input_queue[1] 42 | file_contents = tf.read_file(input_queue[0]) 43 | example = tf.image.decode_png(file_contents, channels=3) 44 | return example, label 45 | 46 | def random_rotate_image(image): 47 | #angle = np.random.uniform(low=-10.0, high=10.0) 48 | angle = np.random.uniform(low=-180.0, high=180.0) 49 | return misc.imrotate(image, angle, 'bicubic') 50 | 51 | def read_and_augument_data(image_list, label_list, image_size, batch_size, max_nrof_epochs, 52 | random_crop, random_flip, random_rotate, nrof_preprocess_threads, shuffle=True): 53 | 54 | images = ops.convert_to_tensor(image_list, dtype=tf.string) 55 | labels = ops.convert_to_tensor(label_list, dtype=tf.int32) 56 | 57 | # Makes an input queue 58 | input_queue = tf.train.slice_input_producer([images, labels], 59 | num_epochs=max_nrof_epochs, shuffle=shuffle) 60 | 61 | images_and_labels = [] 62 | for _ in range(nrof_preprocess_threads): 63 | image, label = read_images_from_disk(input_queue) 64 | if random_rotate: 65 | image = tf.py_func(random_rotate_image, [image], tf.uint8) 66 | if random_crop: 67 | image = tf.random_crop(image, [image_size, image_size, 3]) 68 | else: 69 | image = tf.image.resize_image_with_crop_or_pad(image, image_size, image_size) 70 | if random_flip: 71 | image = tf.image.random_flip_left_right(image) 72 | #pylint: disable=no-member 73 | image.set_shape((image_size, image_size, 3)) 74 | image = tf.image.per_image_standardization(image) 75 | images_and_labels.append([image, label]) 76 | 77 | image_batch, label_batch = tf.train.batch_join( 78 | images_and_labels, batch_size=batch_size, 79 | capacity=4 * nrof_preprocess_threads * batch_size, 80 | allow_smaller_final_batch=True) 81 | 82 | return image_batch, label_batch 83 | 84 | def _add_loss_summaries(total_loss): 85 | """Add summaries for losses. 86 | 87 | Generates moving average for all losses and associated summaries for 88 | visualizing the performance of the network. 89 | 90 | Args: 91 | total_loss: Total loss from loss(). 92 | Returns: 93 | loss_averages_op: op for generating moving averages of losses. 94 | """ 95 | # Compute the moving average of all individual losses and the total loss. 96 | loss_averages = tf.train.ExponentialMovingAverage(0.9, name='avg') 97 | losses = tf.get_collection('losses') 98 | loss_averages_op = loss_averages.apply(losses + [total_loss]) 99 | 100 | # Attach a scalar summmary to all individual losses and the total loss; do the 101 | # same for the averaged version of the losses. 102 | for l in losses + [total_loss]: 103 | # Name each loss as '(raw)' and name the moving average version of the loss 104 | # as the original loss name. 
105 | tf.summary.scalar(l.op.name +' (raw)', l) 106 | tf.summary.scalar(l.op.name, loss_averages.average(l)) 107 | 108 | return loss_averages_op 109 | 110 | 111 | def get_learning_rate_from_file(filename, epoch): 112 | with open(filename, 'r') as f: 113 | for line in f.readlines(): 114 | line = line.split('#', 1)[0] 115 | if line: 116 | par = line.strip().split(':') 117 | e = int(par[0]) 118 | lr = float(par[1]) 119 | if e <= epoch: 120 | learning_rate = lr 121 | # else: 122 | # return learning_rate 123 | 124 | return learning_rate 125 | 126 | class ImageClass(): 127 | "Stores the paths to images for a given class" 128 | def __init__(self, name, image_paths): 129 | self.name = name 130 | self.image_paths = image_paths 131 | 132 | def __str__(self): 133 | return self.name + ', ' + str(len(self.image_paths)) + ' images' 134 | 135 | def __len__(self): 136 | return len(self.image_paths) 137 | 138 | def get_dataset(paths): 139 | dataset = [] 140 | for path in paths.split(':'): 141 | path_exp = os.path.expanduser(path) 142 | classes = os.listdir(path_exp) 143 | classes.sort() 144 | nrof_classes = len(classes) 145 | for i in range(nrof_classes): 146 | class_name = classes[i] 147 | facedir = os.path.join(path_exp, class_name) 148 | if os.path.isdir(facedir): 149 | images = os.listdir(facedir) 150 | image_paths = [os.path.join(facedir,img) for img in images] 151 | dataset.append(ImageClass(class_name, image_paths)) 152 | 153 | return dataset 154 | 155 | def get_huge_dataset(paths, start_n, end_n): 156 | dataset = [] 157 | classes = [] 158 | for path in paths.split(':'): 159 | path_exp = os.path.expanduser(path) 160 | for (d_ino, d_off, d_reclen, d_type, d_name) in python_getdents.getdents64(path_exp): 161 | if d_name=='.' or d_name == '..': 162 | continue 163 | classes += [d_name] 164 | 165 | classes.sort() 166 | nrof_classes = len(classes) 167 | if end_n == -1: 168 | end_n = nrof_classes 169 | if end_n>nrof_classes: 170 | raise ValueError('Invalid end_n:%d more than nrof_class:%d'%(end_n,nrof_classes)) 171 | for i in range(start_n,end_n): 172 | if(i%1000 == 0): 173 | print('reading identities: %d/%d\n'%(i,end_n)) 174 | class_name = classes[i] 175 | facedir = os.path.join(path_exp, class_name) 176 | if os.path.isdir(facedir): 177 | images = os.listdir(facedir) 178 | image_paths = [os.path.join(facedir,img) for img in images] 179 | dataset.append(ImageClass(class_name, image_paths)) 180 | 181 | 182 | return dataset 183 | 184 | 185 | 186 | def split_dataset(dataset, split_ratio, mode): 187 | if mode=='SPLIT_CLASSES': 188 | nrof_classes = len(dataset) 189 | class_indices = np.arange(nrof_classes) 190 | np.random.shuffle(class_indices) 191 | split = int(round(nrof_classes*split_ratio)) 192 | train_set = [dataset[i] for i in class_indices[0:split]] 193 | test_set = [dataset[i] for i in class_indices[split:-1]] 194 | elif mode=='SPLIT_IMAGES': 195 | train_set = [] 196 | test_set = [] 197 | min_nrof_images = 2 198 | for cls in dataset: 199 | paths = cls.image_paths 200 | np.random.shuffle(paths) 201 | split = int(round(len(paths)*split_ratio)) 202 | if split1: 221 | # raise ValueError('There should not be more than one meta file in the model directory (%s)' % model_dir) 222 | # meta_file = meta_files[0] 223 | # ckpt_file = tf.train.get_checkpoint_state(model_dir).model_checkpoint_path 224 | # return meta_file, ckpt_file 225 | 226 | 227 | 228 | def store_revision_info(src_path, output_dir, arg_string): 229 | 230 | # # git hash 231 | # gitproc = Popen(['git', 'rev-parse', 'HEAD'], stdout = PIPE, 
cwd=src_path) 232 | # (stdout, _) = gitproc.communicate() 233 | # git_hash = stdout.strip() 234 | # 235 | # # Get local changes 236 | # gitproc = Popen(['git', 'diff', 'HEAD'], stdout = PIPE, cwd=src_path) 237 | # (stdout, _) = gitproc.communicate() 238 | # git_diff = stdout.strip() 239 | 240 | # Store a text file in the log directory 241 | rev_info_filename = os.path.join(output_dir, 'revision_info.txt') 242 | with open(rev_info_filename, "w") as text_file: 243 | text_file.write('arguments: %s\n--------------------\n' % arg_string) 244 | # text_file.write('git hash: %s\n--------------------\n' % git_hash) 245 | # text_file.write('%s' % git_diff) 246 | 247 | def list_variables(filename): 248 | reader = training.NewCheckpointReader(filename) 249 | variable_map = reader.get_variable_to_shape_map() 250 | names = sorted(variable_map.keys()) 251 | return names 252 | 253 | ## get the labels of the triplet paths for calculating the center loss - mzh edit 31012017 254 | def get_label_triplet(triplet_paths): 255 | classes = [] 256 | classes_list = [] 257 | labels_triplet = [] 258 | for image_path in triplet_paths: 259 | str_items=image_path.split('/') 260 | classes_list.append(str_items[-2]) 261 | 262 | classes = list(sorted(set(classes_list), key=classes_list.index)) 263 | 264 | for item in classes_list: 265 | labels_triplet.append(classes.index(item)) 266 | 267 | return labels_triplet 268 | 269 | def get_model_filenames(model_dir): 270 | files = os.listdir(model_dir) 271 | meta_files = [s for s in files if s.endswith('.meta')] 272 | if len(meta_files)==0: 273 | raise ValueError('No meta file found in the model directory (%s)' % model_dir) 274 | elif len(meta_files)>1: 275 | raise ValueError('There should not be more than one meta file in the model directory (%s)' % model_dir) 276 | meta_file = meta_files[0] 277 | meta_files = [s for s in files if '.ckpt' in s] 278 | max_step = -1 279 | for f in files: 280 | step_str = re.match(r'(^model-[\w\- ]+.ckpt-(\d+))', f) 281 | if step_str is not None and len(step_str.groups())>=2: 282 | step = int(step_str.groups()[1]) 283 | if step > max_step: 284 | max_step = step 285 | ckpt_file = step_str.groups()[0] 286 | return meta_file, ckpt_file 287 | 288 | def class_filter(image_list, label_list, num_imgs_class): 289 | counter = Counter(label_list) 290 | label_num = counter.values() 291 | label_key = counter.keys() 292 | 293 | idx = [idx for idx, val in enumerate(label_num) if val > num_imgs_class] 294 | label_idx = [label_key[i] for i in idx] 295 | idx_list = [i for i in range(0,len(label_list)) if label_list[i] in label_idx] 296 | label_list_new = [label_list[i] for i in idx_list] 297 | image_list_new = [image_list[i] for i in idx_list] 298 | 299 | #plt.hist(label_num, bins = 'auto') 300 | return image_list_new, label_list_new 301 | 302 | 303 | -------------------------------------------------------------------------------- /src/facenet_ext.py: -------------------------------------------------------------------------------- 1 | """Functions for building the face recognition network. 
2 | """ 3 | # MIT License 4 | # 5 | # Copyright (c) 2016 David Sandberg 6 | # 7 | # Permission is hereby granted, free of charge, to any person obtaining a copy 8 | # of this software and associated documentation files (the "Software"), to deal 9 | # in the Software without restriction, including without limitation the rights 10 | # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 11 | # copies of the Software, and to permit persons to whom the Software is 12 | # furnished to do so, subject to the following conditions: 13 | # 14 | # The above copyright notice and this permission notice shall be included in all 15 | # copies or substantial portions of the Software. 16 | # 17 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 18 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 19 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 20 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 21 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 22 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 23 | # SOFTWARE. 24 | 25 | # pylint: disable=missing-docstring 26 | from __future__ import absolute_import 27 | from __future__ import division 28 | from __future__ import print_function 29 | 30 | import os 31 | from subprocess import Popen, PIPE 32 | import tensorflow as tf 33 | from tensorflow.python.framework import ops 34 | import numpy as np 35 | from scipy import misc 36 | import matplotlib.pyplot as plt 37 | from sklearn.cross_validation import KFold 38 | from scipy import interpolate 39 | from tensorflow.python.training import training 40 | import random 41 | import re 42 | from collections import Counter 43 | import matplotlib.pyplot as plt 44 | import cv2 45 | import python_getdents 46 | from scipy import spatial 47 | from sklearn.decomposition import PCA 48 | from itertools import islice 49 | import itertools 50 | import sys 51 | 52 | #### libs of DavaideSanderburg #### 53 | sys.path.insert(0, '../lib/facenet/src') 54 | import facenet 55 | 56 | 57 | # import h5py 58 | 59 | def label_mapping(label_list_src, EXPRSSIONS_TYPE_src, EXPRSSIONS_TYPE_trg): 60 | labels_mapping = [] 61 | idx_label_notexist = [] 62 | for i, label in enumerate(label_list_src): 63 | expre_src = str.split(EXPRSSIONS_TYPE_src[label], '=')[1] 64 | expre_trg = [x for x in EXPRSSIONS_TYPE_trg if expre_src in x] 65 | if expre_trg == []: 66 | label_trg = -1 67 | idx_label_notexist.append(i) 68 | else: 69 | label_trg = int(str.split(expre_trg[0], '=')[0]) 70 | labels_mapping.append(label_trg) 71 | 72 | return idx_label_notexist, labels_mapping 73 | 74 | 75 | def gather(data, label): 76 | i = 0 77 | if data.ndim == 1: 78 | data_batch = np.zeros(len(label)) 79 | for idx in label: 80 | data_batch[i] = data[idx] 81 | i += 1 82 | if data.ndim == 2: 83 | data_batch = np.zeros([len(label), np.shape(data)[1]]) 84 | for idx in label: 85 | data_batch[i, :] = data[idx, :] 86 | i += 1 87 | if data.ndim > 2: 88 | print('The data of dimension should be less than 3!\n') 89 | assert (data.ndim < 3) 90 | 91 | return data_batch 92 | 93 | 94 | # def scatter(data, index): 95 | # return data_sactter 96 | 97 | def generate_labels_id(subs): 98 | subjects = list(set(subs)) 99 | subjects = np.sort(subjects) 100 | labels_id = [] 101 | for sub in subs: 102 | labels_id.append([idx for idx, subject in enumerate(subjects) if sub == subject][0]) 103 | 104 | return 
labels_id 105 | 106 | 107 | def generate_idiap_image_label(path, quality, framenum_dist): 108 | if quality == 'real': 109 | label = 1 110 | elif quality == 'attack': 111 | label = 0 112 | else: 113 | raise ValueError("Invalid quality of the images!") 114 | 115 | img_list = [] 116 | label_list = [] 117 | i = 0 118 | videos = os.listdir(path) 119 | videos.sort() 120 | for video in videos: 121 | video_path = os.path.join(path, video) 122 | if os.path.isdir(video_path): 123 | imgs = os.listdir(video_path) 124 | imgs.sort() 125 | imgs_select = random.sample(imgs, framenum_dist) 126 | for img in imgs_select: 127 | img_path = os.path.join(video_path, img) 128 | img_list.append(img_path) 129 | label_list.append(label) 130 | i += 1 131 | # print('%d %s %s' % (i, img_path, label)) 132 | 133 | return img_list, label_list, i 134 | 135 | 136 | def get_image_paths_and_labels_idiap(data_dir_dist, framenum_dist_real, framenum_dist_attack): 137 | images_list = [] 138 | labels_list = [] 139 | cnt_total = 0 140 | 141 | qualities = os.listdir(data_dir_dist) 142 | for quality in qualities: 143 | if quality == 'real': 144 | imgs, labels, cnt = generate_idiap_image_label(os.path.join(data_dir_dist, quality), quality, 145 | framenum_dist_real) 146 | images_list += imgs 147 | labels_list += labels 148 | cnt_total += cnt 149 | 150 | elif quality == 'attack': 151 | attack_styles = os.listdir(os.path.join(data_dir_dist, quality)) 152 | for attack_style in attack_styles: 153 | imgs, labels, cnt = generate_idiap_image_label(os.path.join(data_dir_dist, quality, attack_style), 154 | quality, framenum_dist_attack) 155 | images_list += imgs 156 | labels_list += labels 157 | cnt_total += cnt 158 | else: 159 | raise ValueError("Invalid quality of the images!") 160 | 161 | return images_list, labels_list, cnt_total 162 | 163 | def get_image_paths_and_labels_genki4k(labels_expression, usage): 164 | image_paths_flat = [] 165 | labels_flat = [] 166 | usage_flat = [] 167 | 168 | ## read labels 169 | with open(labels_expression,'r') as f: 170 | s = f.readlines() 171 | for line in s: 172 | x = line[66:-1] 173 | if x == usage: 174 | label = int(line[0]) 175 | labels_flat.append(label) 176 | image = line[2:65] 177 | image_paths_flat.append(image) 178 | usage_flat.append(usage) 179 | 180 | 181 | nrof_classes = len(labels_flat) 182 | 183 | return image_paths_flat, labels_flat, usage_flat, nrof_classes 184 | 185 | def get_image_paths_and_labels_sfew(images_path, labels_expression): 186 | image_paths_flat = [] 187 | labels_flat = [] 188 | usage_flat = [] 189 | subs = [] 190 | sub_imgs = [] 191 | idx_train = [] 192 | idx_test = [] 193 | 194 | idx_train_sub_all = [] 195 | idx_test_sub_all = [] 196 | 197 | with open(labels_expression, 'r') as text_file: 198 | for line in islice(text_file, 1, None): 199 | [No, expression, label, img] = str.split(line) 200 | labels_flat.append(int(label)) 201 | image_paths_flat.append(img) 202 | 203 | nrof_classes = len(image_paths_flat) 204 | return image_paths_flat, labels_flat, nrof_classes 205 | 206 | 207 | def get_image_paths_and_labels_oulucasia(images_path, labels_expression, usage, nfold, ifold, isaug=False): 208 | image_paths_flat = [] 209 | labels_flat = [] 210 | usage_flat = [] 211 | subs = [] 212 | sub_imgs = [] 213 | idx_train = [] 214 | idx_test = [] 215 | 216 | idx_train_sub_all = [] 217 | idx_test_sub_all = [] 218 | 219 | with open(labels_expression, 'r') as text_file: 220 | for line in islice(text_file, 1, None): 221 | [No, sub, expression, label, img] = str.split(line) 222 | 
labels_flat.append(int(label)) 223 | image_paths_flat.append(img) 224 | subs.append(sub) 225 | 226 | subjects = list(set(subs)) 227 | nrof_subj = len(subjects) 228 | 229 | for idx_subj, subj in enumerate(subjects): 230 | sub_imgs.append([]) 231 | for idx_sub, sub in enumerate(subs): 232 | if sub == subj: 233 | sub_imgs[idx_subj].append(idx_sub) 234 | 235 | # folds = KFold(n=len(labels_flat), n_folds=nrof_folds, shuffle=True) 236 | folds = KFold(n=nrof_subj, n_folds=nfold, shuffle=False) 237 | 238 | i = 0 239 | for idx_train_sub, idx_test_sub in folds: 240 | idx_train_sub_all.append([]) 241 | idx_train_sub_all[i].append(idx_train_sub) 242 | idx_test_sub_all.append([]) 243 | idx_test_sub_all[i].append(idx_test_sub) 244 | # print('train:', idx_train_sub, 'test', idx_test_sub) 245 | i += 1 246 | 247 | idx_train_sub = idx_train_sub_all[ifold][0] 248 | idx_test_sub = idx_test_sub_all[ifold][0] 249 | 250 | image_paths_flat_array = np.asarray(image_paths_flat) 251 | labels_flat_array = np.asarray(labels_flat) 252 | 253 | if usage == 'Training': 254 | for idx in idx_train_sub: 255 | idx_train += sub_imgs[idx] 256 | 257 | image_paths_flat_array = image_paths_flat_array[idx_train] 258 | labels_flat_array = labels_flat_array[idx_train] 259 | 260 | ### Reduce the number of the samples of the 'neutral' to balance the number of the classes 261 | if isaug: 262 | labels_unique = set(labels_flat_array) 263 | nrof_classes = len(labels_unique) 264 | idx_labels_expression = [] 265 | for _ in range(nrof_classes): 266 | idx_labels_expression.append([]) 267 | 268 | for i, lab in enumerate(labels_flat_array): 269 | idx_labels_expression[lab].append(i) 270 | idx_labels_neutral = random.sample(idx_labels_expression[0], len(idx_labels_expression[1])) 271 | idx_labels_augmentation = idx_labels_neutral 272 | for i in range(1, nrof_classes): 273 | idx_labels_augmentation += idx_labels_expression[i] 274 | 275 | image_paths_flat_array = image_paths_flat_array[idx_labels_augmentation] 276 | labels_flat_array = labels_flat_array[idx_labels_augmentation] 277 | 278 | if usage == 'Test': 279 | for idx in idx_test_sub: 280 | idx_test += sub_imgs[idx] 281 | 282 | image_paths_flat_array = image_paths_flat_array[idx_test] 283 | labels_flat_array = labels_flat_array[idx_test] 284 | 285 | image_paths_flat = image_paths_flat_array.tolist() 286 | labels_flat = labels_flat_array.tolist() 287 | 288 | # nrof_classes = len(image_paths_flat) 289 | nrof_classes = nrof_subj 290 | return image_paths_flat, labels_flat, usage_flat, nrof_classes 291 | 292 | 293 | def get_image_paths_and_labels_joint_oulucasia(images_path, labels_expression, usage, nfold, ifold, isaug=True): 294 | image_paths_flat = [] 295 | labels_flat = [] 296 | usage_flat = [] 297 | subs = [] 298 | sub_imgs = [] 299 | idx_train = [] 300 | idx_test = [] 301 | 302 | idx_train_sub_all = [] 303 | idx_test_sub_all = [] 304 | 305 | with open(labels_expression, 'r') as text_file: 306 | for line in islice(text_file, 1, None): 307 | [No, sub, expression, label, img] = str.split(line) 308 | labels_flat.append(int(label)) 309 | image_paths_flat.append(img) 310 | subs.append(sub) 311 | 312 | subjects = list(set(subs)) 313 | nrof_subj = len(subjects) 314 | labels_id = generate_labels_id(subs) 315 | 316 | for idx_subj, subj in enumerate(subjects): 317 | sub_imgs.append([]) 318 | for idx_sub, sub in enumerate(subs): 319 | if sub == subj: 320 | sub_imgs[idx_subj].append(idx_sub) 321 | 322 | # folds = KFold(n=len(labels_flat), n_folds=nrof_folds, shuffle=True) 323 | folds = KFold(n=nrof_subj, 
n_folds=nfold, shuffle=False) 324 | 325 | i = 0 326 | for idx_train_sub, idx_test_sub in folds: 327 | idx_train_sub_all.append([]) 328 | idx_train_sub_all[i].append(idx_train_sub) 329 | idx_test_sub_all.append([]) 330 | idx_test_sub_all[i].append(idx_test_sub) 331 | print('train:', idx_train_sub, 'test', idx_test_sub) 332 | i += 1 333 | 334 | idx_train_sub = idx_train_sub_all[ifold][0] 335 | idx_test_sub = idx_test_sub_all[ifold][0] 336 | 337 | image_paths_flat_array = np.asarray(image_paths_flat) 338 | labels_flat_array = np.asarray(labels_flat) 339 | labels_id_array = np.asarray(labels_id) 340 | 341 | if usage == 'Training': 342 | for idx in idx_train_sub: 343 | idx_train += sub_imgs[idx] 344 | 345 | image_paths_flat_array = image_paths_flat_array[idx_train] 346 | labels_flat_array = labels_flat_array[idx_train] 347 | labels_id_array = labels_id_array[idx_train] 348 | 349 | ### Reduce the number of the samples of the 'neutral' to balance the number of the classes 350 | if isaug: 351 | labels_unique = set(labels_flat_array) 352 | nrof_classes = len(labels_unique) 353 | idx_labels_expression = [] 354 | for _ in range(nrof_classes): 355 | idx_labels_expression.append([]) 356 | 357 | for i, lab in enumerate(labels_flat_array): 358 | idx_labels_expression[lab].append(i) 359 | idx_labels_neutral = random.sample(idx_labels_expression[0], len(idx_labels_expression[1])) 360 | idx_labels_augmentation = idx_labels_neutral 361 | for i in range(1, nrof_classes): 362 | idx_labels_augmentation += idx_labels_expression[i] 363 | 364 | image_paths_flat_array = image_paths_flat_array[idx_labels_augmentation] 365 | labels_flat_array = labels_flat_array[idx_labels_augmentation] 366 | labels_id_array = labels_id_array[idx_labels_augmentation] 367 | 368 | if usage == 'Test': 369 | for idx in idx_test_sub: 370 | idx_test += sub_imgs[idx] 371 | 372 | image_paths_flat_array = image_paths_flat_array[idx_test] 373 | labels_flat_array = labels_flat_array[idx_test] 374 | labels_id_array = labels_id_array[idx_test] 375 | 376 | image_paths_flat = image_paths_flat_array.tolist() 377 | labels_flat = labels_flat_array.tolist() 378 | labels_id_flat = labels_id_array.tolist() 379 | 380 | nrof_classes = len(set(labels_id_flat)) 381 | 382 | return image_paths_flat, labels_flat, usage_flat, nrof_classes, labels_id_flat 383 | 384 | 385 | def get_image_paths_and_labels_ckplus(images_path, labels_expression, usage, nfold, ifold): 386 | image_paths_flat = [] 387 | labels_flat = [] 388 | usage_flat = [] 389 | subs = [] 390 | sub_imgs = [] 391 | idx_train = [] 392 | idx_test = [] 393 | 394 | idx_train_sub_all = [] 395 | idx_test_sub_all = [] 396 | 397 | with open(labels_expression, 'r') as text_file: 398 | for line in islice(text_file, 1, None): 399 | [cnt, sub, session, frame, label] = str.split(line) 400 | labels_flat.append(int(float(label))) 401 | subs.append(sub) 402 | 403 | img_name = sub + '_' + session + '_' + frame 404 | img_path = os.path.join(images_path, sub, session, img_name + '.png') 405 | image_paths_flat.append(img_path) 406 | 407 | subjects = list(set(subs)) 408 | nrof_subj = len(subjects) 409 | 410 | for idx_subj, subj in enumerate(subjects): 411 | sub_imgs.append([]) 412 | for idx_sub, sub in enumerate(subs): 413 | if sub == subj: 414 | sub_imgs[idx_subj].append(idx_sub) 415 | 416 | # folds = KFold(n=len(labels_flat), n_folds=nrof_folds, shuffle=True) 417 | folds = KFold(n=nrof_subj, n_folds=nfold, shuffle=False) 418 | 419 | i = 0 420 | for idx_train_sub, idx_test_sub in folds: 421 | idx_train_sub_all.append([]) 
422 | idx_train_sub_all[i].append(idx_train_sub) 423 | idx_test_sub_all.append([]) 424 | idx_test_sub_all[i].append(idx_test_sub) 425 | print('train:', idx_train_sub, 'test', idx_test_sub) 426 | i += 1 427 | 428 | idx_train_sub = idx_train_sub_all[ifold][0] 429 | idx_test_sub = idx_test_sub_all[ifold][0] 430 | 431 | image_paths_flat_array = np.asarray(image_paths_flat) 432 | labels_flat_array = np.asarray(labels_flat) 433 | 434 | if usage == 'Training': 435 | for idx in idx_train_sub: 436 | idx_train += sub_imgs[idx] 437 | 438 | image_paths_flat_array = image_paths_flat_array[idx_train] 439 | labels_flat_array = labels_flat_array[idx_train] 440 | 441 | if usage == 'Test': 442 | for idx in idx_test_sub: 443 | idx_test += sub_imgs[idx] 444 | 445 | image_paths_flat_array = image_paths_flat_array[idx_test] 446 | labels_flat_array = labels_flat_array[idx_test] 447 | 448 | image_paths_flat = image_paths_flat_array.tolist() 449 | labels_flat = labels_flat_array.tolist() 450 | 451 | nrof_classes = len(image_paths_flat) 452 | return image_paths_flat, labels_flat, usage_flat, nrof_classes 453 | 454 | 455 | def get_image_paths_and_labels_joint_ckplus(images_path, labels_expression, usage, nfold, ifold): 456 | image_paths_flat = [] 457 | labels_flat = [] 458 | usage_flat = [] 459 | subs = [] 460 | sub_imgs = [] 461 | idx_train = [] 462 | idx_test = [] 463 | 464 | idx_train_sub_all = [] 465 | idx_test_sub_all = [] 466 | 467 | with open(labels_expression, 'r') as text_file: 468 | for line in islice(text_file, 1, None): 469 | [cnt, sub, session, frame, label] = str.split(line) 470 | labels_flat.append(int(float(label))) 471 | subs.append(sub) 472 | 473 | img_name = sub + '_' + session + '_' + frame 474 | img_path = os.path.join(images_path, sub, session, img_name + '.png') 475 | image_paths_flat.append(img_path) 476 | 477 | subjects = list(set(subs)) 478 | nrof_subj = len(subjects) 479 | labels_id = generate_labels_id(subs) 480 | 481 | for idx_subj, subj in enumerate(subjects): 482 | sub_imgs.append([]) 483 | for idx_sub, sub in enumerate(subs): 484 | if sub == subj: 485 | sub_imgs[idx_subj].append(idx_sub) 486 | 487 | # folds = KFold(n=len(labels_flat), n_folds=nrof_folds, shuffle=True) 488 | folds = KFold(n=nrof_subj, n_folds=nfold, shuffle=False) 489 | 490 | i = 0 491 | for idx_train_sub, idx_test_sub in folds: 492 | idx_train_sub_all.append([]) 493 | idx_train_sub_all[i].append(idx_train_sub) 494 | idx_test_sub_all.append([]) 495 | idx_test_sub_all[i].append(idx_test_sub) 496 | print('train:', idx_train_sub, 'test', idx_test_sub) 497 | i += 1 498 | 499 | idx_train_sub = idx_train_sub_all[ifold][0] 500 | idx_test_sub = idx_test_sub_all[ifold][0] 501 | 502 | image_paths_flat_array = np.asarray(image_paths_flat) 503 | labels_flat_array = np.asarray(labels_flat) 504 | labels_id_array = np.asarray(labels_id) 505 | 506 | if usage == 'Training': 507 | for idx in idx_train_sub: 508 | idx_train += sub_imgs[idx] 509 | 510 | image_paths_flat_array = image_paths_flat_array[idx_train] 511 | labels_flat_array = labels_flat_array[idx_train] 512 | labels_id_array = labels_id_array[idx_train] 513 | 514 | if usage == 'Test': 515 | for idx in idx_test_sub: 516 | idx_test += sub_imgs[idx] 517 | 518 | image_paths_flat_array = image_paths_flat_array[idx_test] 519 | labels_flat_array = labels_flat_array[idx_test] 520 | labels_id_array = labels_id_array[idx_test] 521 | 522 | image_paths_flat = image_paths_flat_array.tolist() 523 | labels_flat = labels_flat_array.tolist() 524 | labels_id_flat = labels_id_array.tolist() 525 | 
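    # Reading of the code above: the folds are built over subjects (sub_imgs
    # maps each subject to the indices of all of its images), so a fold's
    # training and test partitions never share a person -- the usual
    # subject-independent protocol for CK+. nrof_classes below therefore
    # counts distinct identities, not expression categories.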
526 | nrof_classes = len(set(labels_id_flat)) 527 | 528 | return image_paths_flat, labels_flat, usage_flat, nrof_classes, labels_id_flat 529 | 530 | 531 | def get_image_paths_and_labels_fer2013(paths, labels_expression, usage, isaug=False): 532 | image_paths_flat = [] 533 | labels_flat = [] 534 | usage_flat = [] 535 | for path in paths.split(':'): 536 | path_exp = os.path.expanduser(path) 537 | images = os.listdir(path_exp) 538 | images.sort() 539 | image_paths_flat = [os.path.join(path_exp, image) for image in images] 540 | 541 | with open(labels_expression, 'r') as text_file: 542 | for line in islice(text_file, 1, None): 543 | strtmp = str.split(line) 544 | labels_flat.append(np.int(strtmp[1])) 545 | usage_flat.append(strtmp[2]) 546 | 547 | index = [idx for idx, phrase in enumerate(usage_flat) if phrase == usage] 548 | image_paths_flat = [im for idx, im in enumerate(image_paths_flat) if idx in index] 549 | labels_flat = [im for idx, im in enumerate(labels_flat) if idx in index] 550 | usage_flat = [im for idx, im in enumerate(usage_flat) if idx in index] 551 | 552 | nrof_classes = len(image_paths_flat) 553 | 554 | if usage == 'PublicTest' or usage == 'PrivateTest': 555 | return image_paths_flat, labels_flat, usage_flat, nrof_classes 556 | 557 | if isaug: 558 | ################################################################################# 559 | ###### data augmentation: blancing the number of the samples of each class ###### 560 | ################################################################################# 561 | ## compute the histogram of the dataset 562 | nrof_expressions = len(set(labels_flat)) 563 | idx_expression_classes = [] 564 | for _ in range(nrof_expressions): 565 | idx_expression_classes.append([]) 566 | 567 | for idx, lab in enumerate(labels_flat): 568 | if usage_flat[idx] == 'Training': 569 | idx_expression_classes[lab].append(idx) 570 | 571 | expression_classes_lens = np.zeros(nrof_expressions) 572 | for i in range(nrof_expressions): 573 | expression_classes_lens[i] = len(idx_expression_classes[i]) 574 | max_expression_classes_len = int(np.max(expression_classes_lens)) 575 | 576 | ## Fill the lists of each expression class by the random samples chosen from the original classes 577 | for i in range(nrof_expressions): 578 | idx_expression = idx_expression_classes[i] 579 | num_to_fill = max_expression_classes_len - len(idx_expression) 580 | if num_to_fill < len(idx_expression): 581 | gen_rand_samples = random.sample(idx_expression, num_to_fill) 582 | else: 583 | gen_rand_samples = list(itertools.chain.from_iterable( 584 | itertools.repeat(x, num_to_fill // len(idx_expression)) for x in idx_expression)) 585 | 586 | idx_expression_classes[i] += gen_rand_samples 587 | 588 | ## Flatten the 2D list to 1D list 589 | idx_expression_classes_1D = list(itertools.chain.from_iterable(idx_expression_classes)) 590 | 591 | image_paths_flat_aug = [image_paths_flat[i] for i in idx_expression_classes_1D] 592 | labels_flat_aug = [labels_flat[i] for i in idx_expression_classes_1D] 593 | usage_flat_aug = [usage_flat[i] for i in idx_expression_classes_1D] 594 | nrof_classes_aug = len(labels_flat_aug) 595 | 596 | # return image_paths_flat, labels_flat, usage_flat, nrof_classes 597 | return image_paths_flat_aug, labels_flat_aug, usage_flat_aug, nrof_classes_aug 598 | else: 599 | return image_paths_flat, labels_flat, usage_flat, nrof_classes 600 | 601 | 602 | def get_image_paths_and_labels_expression(dataset, labels_expression): 603 | image_paths_flat = [] 604 | labels_flat = [] 605 | 
image_paths = [] 606 | for i in range(len(dataset)): 607 | image_paths += dataset[i].image_paths 608 | 609 | with open(labels_expression, 'r') as text_file: 610 | for line in islice(text_file, 1, None): 611 | strtmp = str.split(line) 612 | expr_img = strtmp[1] + '_' + strtmp[2] + '_' + strtmp[3] + '.png' 613 | matching = [img for img in image_paths if expr_img in img] 614 | if len([matching]) == 1: 615 | image_paths_flat.append(matching[0]) 616 | labels_flat.append(int(float(strtmp[-1]))) 617 | else: 618 | raise ValueError('Find no or more than one image corresponding to the emotion label!') 619 | 620 | return image_paths_flat, labels_flat 621 | 622 | 623 | def get_image_paths_and_labels_recog(dataset): 624 | image_paths_flat = [] 625 | labels_flat = [] 626 | classes_flat = [] 627 | for i in range(len(dataset)): 628 | image_paths_flat += dataset[i].image_paths 629 | classes_flat += [dataset[i].name] 630 | labels_flat += [i] * len(dataset[i].image_paths) 631 | 632 | return image_paths_flat, labels_flat, classes_flat 633 | 634 | 635 | def random_rotate_image(image): 636 | # angle = np.random.uniform(low=-10.0, high=10.0) 637 | angle = np.random.uniform(low=-180.0, high=180.0) 638 | return misc.imrotate(image, angle, 'bicubic') 639 | 640 | 641 | def flip(image, random_flip): 642 | if random_flip and np.random.choice([True, False]): 643 | image = np.fliplr(image) 644 | return image 645 | 646 | 647 | def to_rgb(img): 648 | w, h = img.shape 649 | ret = np.empty((w, h, 3), dtype=np.uint8) 650 | ret[:, :, 0] = ret[:, :, 1] = ret[:, :, 2] = img 651 | return ret 652 | 653 | def prewhiten(x): 654 | mean = np.mean(x) 655 | std = np.std(x) 656 | std_adj = np.maximum(std, 1.0 / np.sqrt(x.size)) 657 | y = np.multiply(np.subtract(x, mean), 1 / std_adj) 658 | return y 659 | 660 | def crop(image, random_crop, image_size): 661 | if min(image.shape[0], image.shape[1]) > image_size: 662 | sz1 = image.shape[0] // 2 663 | sz2 = image.shape[1] // 2 664 | 665 | crop_size = image_size//2 666 | diff_h = sz1 - crop_size 667 | diff_v = sz2 - crop_size 668 | (h, v) = (np.random.randint(-diff_h, diff_h + 1), np.random.randint(-diff_v, diff_v + 1)) 669 | 670 | image = image[(sz1+h-crop_size):(sz1+h+crop_size ), (sz2+v-crop_size):(sz2+v+crop_size ), :] 671 | else: 672 | print("Image size is small than crop image size!") 673 | 674 | return image 675 | 676 | def load_data_test(image_paths, do_random_crop, do_random_flip, image_size, do_prewhiten=True): 677 | nrof_samples = len(image_paths) 678 | images = np.zeros((nrof_samples, image_size, image_size, 3)) 679 | for i in range(nrof_samples): 680 | img = misc.imread(image_paths[i]) 681 | img = cv2.resize(img, (image_size, image_size)) 682 | if img.ndim == 2: 683 | img = to_rgb(img) 684 | if do_prewhiten: 685 | img = prewhiten(img) 686 | img = cv2.resize(img, (image_size, image_size)) 687 | ##img = crop(img, do_random_crop, image_size) 688 | img = flip(img, do_random_flip) 689 | images[i, :, :, :] = img 690 | return images 691 | 692 | 693 | def load_data_mega(image_paths, do_random_crop, do_random_flip, do_resize, image_size, BBox, do_prewhiten=True): 694 | nrof_samples = len(image_paths) 695 | images = np.zeros((nrof_samples, image_size, image_size, 3)) 696 | for i in range(nrof_samples): 697 | image = misc.imread(image_paths[i]) 698 | BBox = BBox.astype(int) 699 | img = image[BBox[i, 0]:BBox[i, 0] + BBox[i, 2], BBox[i, 1]:BBox[i, 1] + BBox[i, 3], :] 700 | if img.ndim == 2: 701 | img = to_rgb(img) 702 | if do_prewhiten: 703 | img = prewhiten(img) 704 | if do_resize: 705 | img 
= cv2.resize(img, (image_size, image_size), interpolation=cv2.INTER_NEAREST) 706 | img = crop(img, do_random_crop, image_size) 707 | img = flip(img, do_random_flip) 708 | images[i, :, :, :] = img 709 | 710 | return images 711 | 712 | 713 | def load_data_facescrub(image_paths, do_random_crop, do_random_flip, do_resize, image_size, do_prewhiten=True): 714 | nrof_samples = len(image_paths) 715 | images = np.zeros((nrof_samples, image_size, image_size, 3)) 716 | for i in range(nrof_samples): 717 | img = misc.imread(image_paths[i]) 718 | if img.ndim == 2: 719 | img = to_rgb(img) 720 | if do_prewhiten: 721 | img = prewhiten(img) 722 | if do_resize: 723 | img = cv2.resize(img, (image_size, image_size), interpolation=cv2.INTER_NEAREST) 724 | if do_random_crop: 725 | img = crop(img, do_random_crop, image_size) 726 | if do_random_flip: 727 | img = flip(img, do_random_flip) 728 | 729 | images[i, :, :, :] = img 730 | return images 731 | 732 | 733 | def get_learning_rate_from_file(filename, epoch): 734 | with open(filename, 'r') as f: 735 | for line in f.readlines(): 736 | line = line.split('#', 1)[0] 737 | if line: 738 | par = line.strip().split(':') 739 | e = int(par[0]) 740 | lr = float(par[1]) 741 | if e <= epoch: 742 | learning_rate = lr 743 | # else: 744 | # return learning_rate 745 | 746 | return learning_rate 747 | 748 | 749 | def get_dataset(paths): 750 | dataset = [] 751 | for path in paths.split(':'): 752 | path_exp = os.path.expanduser(path) 753 | classes = os.listdir(path_exp) 754 | classes.sort() 755 | nrof_classes = len(classes) 756 | for i in range(nrof_classes): 757 | class_name = classes[i] 758 | facedir = os.path.join(path_exp, class_name) 759 | if os.path.isdir(facedir): 760 | images = os.listdir(facedir) 761 | image_paths = [os.path.join(facedir, img) for img in images] 762 | dataset.append(ImageClass(class_name, image_paths)) 763 | 764 | return dataset 765 | 766 | 767 | def get_huge_dataset(paths, start_n=0, end_n=-1): 768 | dataset = [] 769 | classes = [] 770 | for path in paths.split(':'): 771 | path_exp = os.path.expanduser(path) 772 | for (d_ino, d_off, d_reclen, d_type, d_name) in python_getdents.getdents64(path_exp): 773 | if d_name == '.' 
or d_name == '..': 774 | continue 775 | classes += [d_name] 776 | 777 | classes.sort() 778 | nrof_classes = len(classes) 779 | if end_n == -1: 780 | end_n = nrof_classes 781 | if end_n > nrof_classes: 782 | raise ValueError('Invalid end_n:%d more than nrof_class:%d' % (end_n, nrof_classes)) 783 | for i in range(start_n, end_n): 784 | if (i % 1000 == 0): 785 | print('reading identities: %d/%d\n' % (i, end_n)) 786 | class_name = classes[i] 787 | facedir = os.path.join(path_exp, class_name) 788 | if os.path.isdir(facedir): 789 | images = os.listdir(facedir) 790 | image_paths = [os.path.join(facedir, img) for img in images] 791 | dataset.append(ImageClass(class_name, image_paths)) 792 | 793 | return dataset 794 | 795 | 796 | class ImageClass(): 797 | "Stores the paths to images for a given class" 798 | 799 | def __init__(self, name, image_paths): 800 | self.name = name 801 | self.image_paths = image_paths 802 | 803 | def __str__(self): 804 | return self.name + ', ' + str(len(self.image_paths)) + ' images' 805 | 806 | def __len__(self): 807 | return len(self.image_paths) 808 | 809 | 810 | def load_model(model_dir, meta_file, ckpt_file): 811 | model_dir_exp = os.path.expanduser(model_dir) 812 | saver = tf.train.import_meta_graph(os.path.join(model_dir_exp, meta_file)) 813 | saver.restore(tf.get_default_session(), os.path.join(model_dir_exp, ckpt_file)) 814 | 815 | 816 | def load_data_im(imgs, do_random_crop, do_random_flip, image_size, do_prewhiten=True): 817 | # nrof_samples = len(image_paths) 818 | if (len(imgs.shape) > 3): 819 | nrof_samples = imgs.shape[0] 820 | elif (len(imgs.shape) == 3): 821 | nrof_samples = 1 822 | elif (len(imgs.shape) == 1): 823 | nrof_samples = len(imgs) 824 | else: 825 | print('No images!') 826 | return -1 827 | 828 | images = np.zeros((nrof_samples, image_size, image_size, 3)) 829 | for i in range(nrof_samples): 830 | # img = misc.imread(image_paths[i]) 831 | if nrof_samples > 1: 832 | img = imgs[i] 833 | else: 834 | img = imgs 835 | 836 | if len(img): 837 | if img.ndim == 2: 838 | img = facenet.to_rgb(img) 839 | if do_prewhiten: 840 | img = prewhiten(img) 841 | img = crop(img, do_random_crop, image_size) 842 | img = flip(img, do_random_flip) 843 | images[i] = img 844 | images = np.squeeze(images) 845 | return images 846 | 847 | 848 | def calculate_roc(thresholds, embeddings1, embeddings2, actual_issame, nrof_folds=10): 849 | assert (embeddings1.shape[0] == embeddings2.shape[0]) 850 | assert (embeddings1.shape[1] == embeddings2.shape[1]) 851 | nrof_pairs = min(len(actual_issame), embeddings1.shape[0]) 852 | nrof_thresholds = len(thresholds) 853 | folds = KFold(n=nrof_pairs, n_folds=nrof_folds, shuffle=False) 854 | # folds = KFold(n=nrof_pairs, n_folds=nrof_folds, shuffle=True, seed=666) 855 | 856 | tprs = np.zeros((nrof_folds, nrof_thresholds)) 857 | fprs = np.zeros((nrof_folds, nrof_thresholds)) 858 | accuracy = np.zeros((nrof_folds)) 859 | best_threshold = np.zeros((nrof_folds)) 860 | 861 | diff = np.subtract(embeddings1, embeddings2) 862 | dist = np.sum(np.square(diff), 1) 863 | 864 | for fold_idx, (train_set, test_set) in enumerate(folds): 865 | 866 | # Find the best threshold for the fold 867 | acc_train = np.zeros((nrof_thresholds)) 868 | for threshold_idx, threshold in enumerate(thresholds): 869 | _, _, acc_train[threshold_idx], fp_idx, fn_idx = calculate_accuracy(threshold, dist[train_set], 870 | actual_issame[train_set]) 871 | best_threshold_index = np.argmax(acc_train) 872 | best_threshold[fold_idx] = thresholds[best_threshold_index] 873 | 874 | for 
threshold_idx, threshold in enumerate(thresholds): 875 | tprs[fold_idx, threshold_idx], fprs[fold_idx, threshold_idx], _, fp_idx, fn_idx = calculate_accuracy( 876 | threshold, dist[test_set], actual_issame[test_set]) 877 | _, _, accuracy[fold_idx], fp_idx, fn_idx = calculate_accuracy(thresholds[best_threshold_index], dist[test_set], 878 | actual_issame[test_set]) 879 | 880 | tpr = np.mean(tprs, 0) 881 | fpr = np.mean(fprs, 0) 882 | mean_best_threshold = np.mean(best_threshold) 883 | 884 | # #### Global evaluation (not n-fold evaluation) for collecting the indices of the False positive/negative examples ##### 885 | _, _, acc_total, fp_idx, fn_idx = calculate_accuracy(mean_best_threshold, dist, actual_issame) 886 | 887 | return tpr, fpr, accuracy, fp_idx, fn_idx, mean_best_threshold 888 | 889 | 890 | def calculate_roc_cosine(thresholds, embeddings1, embeddings2, actual_issame, nrof_folds=10): 891 | assert (embeddings1.shape[0] == embeddings2.shape[0]) 892 | assert (embeddings1.shape[1] == embeddings2.shape[1]) 893 | nrof_pairs = min(len(actual_issame), embeddings1.shape[0]) 894 | nrof_thresholds = len(thresholds) 895 | folds = KFold(n=nrof_pairs, n_folds=nrof_folds, shuffle=False) 896 | # folds = KFold(n=nrof_pairs, n_folds=nrof_folds, shuffle=True, seed=666) 897 | 898 | tprs = np.zeros((nrof_folds, nrof_thresholds)) 899 | fprs = np.zeros((nrof_folds, nrof_thresholds)) 900 | accuracy = np.zeros((nrof_folds)) 901 | 902 | # diff = np.subtract(embeddings1, embeddings2) ###Eucldian l2 distance 903 | # dist = np.sum(np.square(diff), 1) 904 | 905 | dist_all = spatial.distance.cdist(embeddings1, embeddings2, 906 | 'cosine') ## cosine_distance = 1 - similarity; similarity=dot(u,v)/(||u||*||v||) 907 | dist = dist_all.diagonal() 908 | 909 | for fold_idx, (train_set, test_set) in enumerate(folds): 910 | 911 | # Find the best threshold for the fold 912 | acc_train = np.zeros((nrof_thresholds)) 913 | for threshold_idx, threshold in enumerate(thresholds): 914 | _, _, acc_train[threshold_idx], fp_idx, fn_idx = calculate_accuracy(threshold, dist[train_set], 915 | actual_issame[train_set]) 916 | best_threshold_index = np.argmax(acc_train) 917 | for threshold_idx, threshold in enumerate(thresholds): 918 | tprs[fold_idx, threshold_idx], fprs[fold_idx, threshold_idx], _, fp_idx, fn_idx = calculate_accuracy( 919 | threshold, 920 | dist[test_set], 921 | actual_issame[ 922 | test_set]) 923 | _, _, accuracy[fold_idx], fp_idx, fn_idx = calculate_accuracy(thresholds[best_threshold_index], dist[test_set], 924 | actual_issame[test_set]) 925 | 926 | tpr = np.mean(tprs, 0) 927 | fpr = np.mean(fprs, 0) 928 | best_threshold = thresholds[best_threshold_index] 929 | 930 | # #### Global evaluation (not n-fold evaluation) for collecting the indices of the False positive/negative examples ##### 931 | _, _, acc_total, fp_idx, fn_idx = calculate_accuracy(best_threshold, dist, actual_issame) 932 | 933 | return tpr, fpr, accuracy, fp_idx, fn_idx, best_threshold 934 | 935 | 936 | def calculate_accuracy(threshold, dist, actual_issame): 937 | predict_issame = np.less(dist, threshold) 938 | tp = np.sum(np.logical_and(predict_issame, actual_issame)) 939 | fp = np.sum(np.logical_and(predict_issame, np.logical_not(actual_issame))) 940 | tn = np.sum(np.logical_and(np.logical_not(predict_issame), np.logical_not(actual_issame))) 941 | fn = np.sum(np.logical_and(np.logical_not(predict_issame), actual_issame)) 942 | 943 | tpr = 0 if (tp + fn == 0) else float(tp) / float(tp + fn) 944 | fpr = 0 if (fp + tn == 0) else float(fp) / float(fp + tn) 
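# Editorial sketch (illustrative values, not from the source): a worked example of the
# metrics computed in calculate_accuracy. For dist = [0.3, 1.2, 0.8],
# actual_issame = [True, False, True] and threshold = 1.0, predict_issame = np.less(dist, threshold)
# gives [True, False, True], so tp = 2, fp = 0, tn = 1, fn = 0; hence tpr = 2/2 = 1.0,
# fpr = 0/1 = 0.0 and, on the next line, acc = (2 + 1)/3 = 1.0.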
945 | acc = float(tp + tn) / dist.size
946 |
947 | # #################################### Edit by mzh 11012017 ####################################
948 | # #### save the falsely predicted samples: the false positive (fp) or the false negative (fn) #####
949 | fp_idx = np.logical_and(predict_issame, np.logical_not(actual_issame))
950 | fn_idx = np.logical_and(np.logical_not(predict_issame), actual_issame)
951 | # #################################### Edit by mzh 11012017 ####################################
952 |
953 | return tpr, fpr, acc, fp_idx, fn_idx
954 |
955 |
956 | def plot_roc(fpr, tpr, label):
957 | figure = plt.figure()
958 | plt.plot(fpr, tpr, label=label)
959 | plt.title('Receiver Operating Characteristic')
960 | plt.xlabel('False Positive Rate')
961 | plt.ylabel('True Positive Rate')
962 | plt.legend()
963 | plt.plot([0, 1], [0, 1], 'g--')
964 | plt.grid(True)
965 | plt.show()
966 |
967 | return figure
968 |
969 |
970 | def calculate_val(thresholds, embeddings1, embeddings2, actual_issame, far_target, nrof_folds=10):
971 | assert (embeddings1.shape[0] == embeddings2.shape[0])
972 | assert (embeddings1.shape[1] == embeddings2.shape[1])
973 | nrof_pairs = min(len(actual_issame), embeddings1.shape[0])
974 | diff = np.subtract(embeddings1, embeddings2)
975 | dist = np.sum(np.square(diff), 1)
976 |
977 | nrof_thresholds = len(thresholds)
978 | folds = KFold(n=nrof_pairs, n_folds=nrof_folds, shuffle=False)
979 | # folds = KFold(n=nrof_pairs, n_folds=nrof_folds, shuffle=True, seed=666)
980 |
981 | val = np.zeros(nrof_folds)
982 | far = np.zeros(nrof_folds)
983 |
984 | for fold_idx, (train_set, test_set) in enumerate(folds):
985 |
986 | if nrof_thresholds > 1:
987 | # Find the threshold that gives FAR = far_target
988 | far_train = np.zeros(nrof_thresholds)
989 | for threshold_idx, threshold in enumerate(thresholds):
990 | _, far_train[threshold_idx] = calculate_val_far(threshold, dist[train_set], actual_issame[train_set])
991 | if np.max(far_train) >= far_target:
992 | f = interpolate.interp1d(far_train, thresholds, kind='slinear')
993 | threshold = f(far_target)
994 | else:
995 | threshold = 0.0
996 | else:
997 | threshold = thresholds[0]
998 |
999 | val[fold_idx], far[fold_idx] = calculate_val_far(threshold, dist[test_set], actual_issame[test_set])
1000 |
1001 | val_mean = np.mean(val)
1002 | far_mean = np.mean(far)
1003 | val_std = np.std(val)
1004 |
1005 |
1006 |
1007 | return val_mean, val_std, far_mean, threshold
1008 |
1009 |
1010 | def calculate_val_cosine(thresholds, embeddings1, embeddings2, actual_issame, far_target, nrof_folds=10):
1011 | assert (embeddings1.shape[0] == embeddings2.shape[0])
1012 | assert (embeddings1.shape[1] == embeddings2.shape[1])
1013 | nrof_pairs = min(len(actual_issame), embeddings1.shape[0])
1014 | nrof_thresholds = len(thresholds)
1015 | folds = KFold(n=nrof_pairs, n_folds=nrof_folds, shuffle=False)
1016 | # folds = KFold(n=nrof_pairs, n_folds=nrof_folds, shuffle=True, seed=666)
1017 |
1018 | val = np.zeros(nrof_folds)
1019 | far = np.zeros(nrof_folds)
1020 |
1021 | # diff = np.subtract(embeddings1, embeddings2)
1022 | # dist = np.sum(np.square(diff), 1)
1023 | dist_all = spatial.distance.cdist(embeddings1, embeddings2,
1024 | 'cosine') ## cosine_distance = 1 - similarity; similarity=dot(u,v)/(||u||*||v||)
1025 | dist = dist_all.diagonal()
1026 |
1027 | for fold_idx, (train_set, test_set) in enumerate(folds):
1028 |
1029 | # Find the threshold that
gives FAR = far_target 1030 | far_train = np.zeros(nrof_thresholds) 1031 | for threshold_idx, threshold in enumerate(thresholds): 1032 | _, far_train[threshold_idx] = calculate_val_far(threshold, dist[train_set], actual_issame[train_set]) 1033 | if np.max(far_train) >= far_target: 1034 | f = interpolate.interp1d(far_train, thresholds, kind='slinear') 1035 | threshold = f(far_target) 1036 | else: 1037 | threshold = 0.0 1038 | 1039 | val[fold_idx], far[fold_idx] = calculate_val_far(threshold, dist[test_set], actual_issame[test_set]) 1040 | 1041 | val_mean = np.mean(val) 1042 | far_mean = np.mean(far) 1043 | val_std = np.std(val) 1044 | return val_mean, val_std, far_mean, threshold 1045 | 1046 | 1047 | def calculate_val_far(threshold, dist, actual_issame): 1048 | predict_issame = np.less(dist, threshold) 1049 | true_accept = np.sum(np.logical_and(predict_issame, actual_issame)) 1050 | false_accept = np.sum(np.logical_and(predict_issame, np.logical_not(actual_issame))) 1051 | n_same = np.sum(actual_issame) 1052 | n_diff = np.sum(np.logical_not(actual_issame)) 1053 | if n_same > 0: 1054 | val = float(true_accept) / float(n_same) 1055 | else: 1056 | val = 0 1057 | if n_diff > 0: 1058 | far = float(false_accept) / float(n_diff) 1059 | else: 1060 | far = 0 1061 | return val, far 1062 | 1063 | 1064 | ## get the labels of the triplet paths for calculating the center loss - mzh edit 31012017 1065 | def get_label_triplet(triplet_paths): 1066 | classes = [] 1067 | classes_list = [] 1068 | labels_triplet = [] 1069 | for image_path in triplet_paths: 1070 | str_items = image_path.split('/') 1071 | classes_list.append(str_items[-2]) 1072 | 1073 | classes = list(sorted(set(classes_list), key=classes_list.index)) 1074 | 1075 | for item in classes_list: 1076 | labels_triplet.append(classes.index(item)) 1077 | 1078 | return labels_triplet 1079 | 1080 | 1081 | def get_model_filenames(model_dir): 1082 | files = os.listdir(model_dir) 1083 | meta_files = [s for s in files if s.endswith('.meta')] 1084 | if len(meta_files) == 0: 1085 | raise ValueError('No meta file found in the model directory (%s)' % model_dir) 1086 | elif len(meta_files) > 1: 1087 | raise ValueError('There should not be more than one meta file in the model directory (%s)' % model_dir) 1088 | meta_file = meta_files[0] 1089 | meta_files = [s for s in files if '.ckpt' in s] 1090 | max_step = -1 1091 | for f in files: 1092 | step_str = re.match(r'(^model-[\w\- ]+.ckpt-(\d+))', f) 1093 | if step_str is not None and len(step_str.groups()) >= 2: 1094 | step = int(step_str.groups()[1]) 1095 | if step > max_step: 1096 | max_step = step 1097 | ckpt_file = step_str.groups()[0] 1098 | return meta_file, ckpt_file 1099 | 1100 | 1101 | def class_filter(image_list, label_list, num_imgs_class): 1102 | counter = Counter(label_list) 1103 | label_num = counter.values() 1104 | label_key = counter.keys() 1105 | 1106 | idx = [idx for idx, val in enumerate(label_num) if val > num_imgs_class] 1107 | label_idx = [label_key[i] for i in idx] 1108 | idx_list = [i for i in range(0, len(label_list)) if label_list[i] in label_idx] 1109 | label_list_new = [label_list[i] for i in idx_list] 1110 | image_list_new = [image_list[i] for i in idx_list] 1111 | 1112 | # plt.hist(label_num, bins = 'auto') 1113 | return image_list_new, label_list_new 1114 | 1115 | 1116 | ## Select the images for a epoch in which each batch includes at least two different classes and each class has more than one image 1117 | def select_batch_images(image_list, label_list, epoch, epoch_size, batch_size, 
num_classes_batch, num_imgs_class):
1118 | label_epoch = []
1119 | image_epoch = []
1120 |
1121 | counter = Counter(label_list)
1122 | label_num = counter.values()
1123 | label_key = counter.keys()
1124 | nrof_examples = len(image_list)
1125 | nrof_examples_per_epoch = epoch_size * batch_size
1126 | j = epoch * nrof_examples_per_epoch % nrof_examples
1127 |
1128 | if j + epoch_size * batch_size > nrof_examples:
1129 | j = random.choice(range(0, nrof_examples - epoch_size * batch_size))
1130 |
1131 | for i in range(epoch_size):
1132 | print('In select_batch_images, batch %d selecting...\n' % (i))
1133 | label_batch = label_list[j + i * batch_size:j + (i + 1) * batch_size]
1134 | image_batch = image_list[j + i * batch_size:j + (i + 1) * batch_size]
1135 |
1136 | label_unique = set(label_batch)
1137 | if (len(label_unique) < num_classes_batch or len(label_unique) > (batch_size / num_imgs_class)):
1138 | if (num_classes_batch > (batch_size / num_imgs_class)):
1139 | raise ValueError(
1140 | 'The wanted minimum number of classes in a batch (%d classes) exceeds the limit that can be assigned (%d classes)' % (
1141 | num_classes_batch, batch_size / num_imgs_class))
1142 | label_batch = []
1143 | image_batch = []
1144 | ## re-select an image batch which includes num_classes_batch classes
1145 | nrof_im_each_class = np.int(batch_size / num_classes_batch)
1146 | idx = [idx for idx, val in enumerate(label_num) if val > nrof_im_each_class]
1147 | if (len(idx) < num_classes_batch):
1148 | raise ValueError('Not enough classes to choose from!')
1149 | idx_select = random.sample(idx, num_classes_batch)
1150 | label_key_select = [label_key[i] for i in idx_select]
1151 | for label in label_key_select:
1152 | start_tmp = label_list.index(label)
1153 | idx_tmp = range(start_tmp, start_tmp + nrof_im_each_class + 1)
1154 | label_tmp = [label_list[i] for i in idx_tmp]
1155 | img_tmp = [image_list[i] for i in idx_tmp]
1156 | label_batch += label_tmp
1157 | image_batch += img_tmp
1158 |
1159 | label_batch = label_batch[0:batch_size]
1160 | image_batch = image_batch[0:batch_size]
1161 |
1162 | label_epoch += label_batch
1163 | image_epoch += image_batch
1164 |
1165 | return image_epoch, label_epoch
1166 |
1167 |
1168 | def label_mapping(label_list_src, EXPRSSIONS_TYPE_src, EXPRSSIONS_TYPE_trg):
1169 | labels_mapping = []
1170 | idx_label_notexist = []
1171 | for i, label in enumerate(label_list_src):
1172 | expre_src = str.split(EXPRSSIONS_TYPE_src[label], '=')[1]
1173 | expre_trg = [x for x in EXPRSSIONS_TYPE_trg if expre_src in x]
1174 | if expre_trg == []:
1175 | label_trg = -1
1176 | idx_label_notexist.append(i)
1177 | else:
1178 | label_trg = int(str.split(expre_trg[0], '=')[0])
1179 | labels_mapping.append(label_trg)
1180 |
1181 | return idx_label_notexist, labels_mapping
--------------------------------------------------------------------------------
/src/facenet_train_classifier_expression_pretrainExpr_multidata_addcnns_simple.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gonghao67/FaceLiveNet/b5bb9ab97dd6cd9d6cfbaa167e4d966f5c4a8605/src/facenet_train_classifier_expression_pretrainExpr_multidata_addcnns_simple.py
--------------------------------------------------------------------------------
/src/lfw_ext.py:
--------------------------------------------------------------------------------
1 | """Helper for evaluation on the Labeled Faces in the Wild dataset
2 | """
3 |
4 | # MIT License
5 | #
6 | # Copyright (c) 2016 David Sandberg
7 | #
8 | # Permission
is hereby granted, free of charge, to any person obtaining a copy 9 | # of this software and associated documentation files (the "Software"), to deal 10 | # in the Software without restriction, including without limitation the rights 11 | # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 12 | # copies of the Software, and to permit persons to whom the Software is 13 | # furnished to do so, subject to the following conditions: 14 | # 15 | # The above copyright notice and this permission notice shall be included in all 16 | # copies or substantial portions of the Software. 17 | # 18 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 19 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 20 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 21 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 22 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 23 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 24 | # SOFTWARE. 25 | 26 | from __future__ import absolute_import 27 | from __future__ import division 28 | from __future__ import print_function 29 | 30 | import os 31 | import numpy as np 32 | import facenet 33 | import facenet_ext 34 | 35 | def evaluate(embeddings, actual_issame, nrof_folds=10): 36 | # Calculate evaluation metrics 37 | thresholds = np.arange(0, 4, 0.01) 38 | embeddings1 = embeddings[0::2] 39 | embeddings2 = embeddings[1::2] 40 | # tpr, fpr, accuracy = facenet.calculate_roc(thresholds, embeddings1, embeddings2, 41 | # np.asarray(actual_issame), nrof_folds=nrof_folds) 42 | # thresholds = np.arange(0, 4, 0.001) 43 | # val, val_std, far = facenet.calculate_val(thresholds, embeddings1, embeddings2, 44 | # np.asarray(actual_issame), 1e-3, nrof_folds=nrof_folds) 45 | 46 | tpr, fpr, accuracy, fp_idx, fn_idx, best_threshold_acc = facenet_ext.calculate_roc(thresholds, embeddings1, embeddings2, 47 | np.asarray(actual_issame), nrof_folds=nrof_folds) 48 | thresholds = np.arange(0, 4, 0.001) 49 | val, val_std, far, threshold_val = facenet_ext.calculate_val(thresholds, embeddings1, embeddings2, 50 | np.asarray(actual_issame), 1e-3, nrof_folds=nrof_folds) 51 | 52 | val_acc, val_std_acc, far_acc, threshold_val_acc = facenet_ext.calculate_val([best_threshold_acc], embeddings1, embeddings2, 53 | np.asarray(actual_issame), 1e-3, nrof_folds=nrof_folds) 54 | 55 | return tpr, fpr, accuracy, val, val_std, far, fp_idx, fn_idx, best_threshold_acc, threshold_val, val_acc, far_acc 56 | 57 | def evaluate_cosine(embeddings, actual_issame, nrof_folds=10): 58 | # Calculate evaluation metrics 59 | thresholds = np.arange(0, 4, 0.01) 60 | embeddings1 = embeddings[0::2] 61 | embeddings2 = embeddings[1::2] 62 | 63 | tpr, fpr, accuracy, fp_idx, fn_idx, best_threshold_acc = facenet.calculate_roc_cosine(thresholds, embeddings1, embeddings2, 64 | np.asarray(actual_issame), nrof_folds=nrof_folds) 65 | thresholds = np.arange(0, 4, 0.001) 66 | val, val_std, far, threshold_val = facenet.calculate_val_cosine(thresholds, embeddings1, embeddings2, 67 | np.asarray(actual_issame), 1e-3, nrof_folds=nrof_folds) 68 | 69 | 70 | return tpr, fpr, accuracy, val, val_std, far, fp_idx, fn_idx, best_threshold_acc, threshold_val 71 | 72 | def get_paths(lfw_dir, pairs, file_ext): 73 | nrof_skipped_pairs = 0 74 | path_list = [] 75 | issame_list = [] 76 | dataset = str.split(lfw_dir, '/') 77 | if dataset[4] == 'youtubefacesdb': 78 | for pair in 
pairs: 79 | if len(pair) == 3: 80 | vid0_dir = str.split(pair[1], '.') 81 | vid0_dir = vid0_dir[0] 82 | vid1_dir = str.split(pair[2], '.') 83 | vid1_dir = vid1_dir[0] 84 | path0 = os.path.join(lfw_dir, pair[0], vid0_dir, pair[1]) 85 | path1 = os.path.join(lfw_dir, pair[0], vid1_dir, pair[2]) 86 | issame = True 87 | elif len(pair) == 4: 88 | vid0_dir = str.split(pair[1], '.') 89 | vid0_dir = vid0_dir[0] 90 | vid1_dir = str.split(pair[3], '.') 91 | vid1_dir = vid1_dir[0] 92 | path0 = os.path.join(lfw_dir, pair[0], vid0_dir, pair[1]) 93 | path1 = os.path.join(lfw_dir, pair[2], vid1_dir, pair[3]) 94 | issame = False 95 | if os.path.exists(path0) and os.path.exists(path1): # Only add the pair if both paths exist 96 | path_list += (path0, path1) 97 | issame_list.append(issame) 98 | else: 99 | nrof_skipped_pairs += 1 100 | else: 101 | for pair in pairs: 102 | if len(pair) == 3: 103 | path0 = os.path.join(lfw_dir, pair[0], pair[0] + '_' + '%04d' % int(pair[1]) + '.' + file_ext) 104 | path1 = os.path.join(lfw_dir, pair[0], pair[0] + '_' + '%04d' % int(pair[2]) + '.' + file_ext) 105 | issame = True 106 | elif len(pair) == 4: 107 | path0 = os.path.join(lfw_dir, pair[0], pair[0] + '_' + '%04d' % int(pair[1]) + '.' + file_ext) 108 | path1 = os.path.join(lfw_dir, pair[2], pair[2] + '_' + '%04d' % int(pair[3]) + '.' + file_ext) 109 | issame = False 110 | if os.path.exists(path0) and os.path.exists(path1): # Only add the pair if both paths exist 111 | path_list += (path0, path1) 112 | issame_list.append(issame) 113 | else: 114 | nrof_skipped_pairs += 1 115 | 116 | if nrof_skipped_pairs > 0: 117 | print('Skipped %d image pairs' % nrof_skipped_pairs) 118 | 119 | return path_list, issame_list 120 | 121 | 122 | def read_pairs(pairs_filename): 123 | pairs = [] 124 | with open(pairs_filename, 'r') as f: 125 | for line in f.readlines()[1:]: 126 | pair = line.strip().split() 127 | pairs.append(pair) 128 | return np.array(pairs) 129 | 130 | 131 | def get_expr_paths(pairs): 132 | paths = [] 133 | actual_issame = [] 134 | for pair in pairs: 135 | #[cnt, img, img_ref, issame, expr_actual, expr_require, expr_ref, expre_isrequire] = str.split(pair) 136 | img=pair[1] 137 | img_ref = pair[2] 138 | issame = pair[3] 139 | paths.append(img) 140 | paths.append(img_ref) 141 | issame = True if issame == 'True' else False 142 | actual_issame.append(issame) 143 | 144 | return paths, actual_issame 145 | 146 | -------------------------------------------------------------------------------- /src/metrics_loss.py: -------------------------------------------------------------------------------- 1 | # pylint: disable=missing-docstring 2 | from __future__ import absolute_import 3 | from __future__ import division 4 | from __future__ import print_function 5 | 6 | import os 7 | from subprocess import Popen, PIPE 8 | import tensorflow as tf 9 | from tensorflow.python.framework import ops 10 | import numpy as np 11 | from scipy import misc 12 | import matplotlib.pyplot as plt 13 | from sklearn.cross_validation import KFold 14 | from scipy import interpolate 15 | from tensorflow.python.training import training 16 | import random 17 | import re 18 | from collections import Counter 19 | import matplotlib.pyplot as plt 20 | import cv2 21 | import python_getdents 22 | from scipy import spatial 23 | from sklearn.decomposition import PCA 24 | from itertools import islice 25 | import itertools 26 | 27 | #import h5py 28 | 29 | 30 | 31 | 32 | def center_loss(features, label, alfa, nrof_classes): 33 | """Center loss based on the paper "A 
Discriminative Feature Learning Approach for Deep Face Recognition"
34 | (http://ydwen.github.io/papers/WenECCV16.pdf)
35 | This is not exactly the algorithm proposed in the paper: the centers are not shifted towards the class means
36 | (i.e. sum(Xi)/Nj, where Xi are the elements of class j and Nj their number) but towards the sum of the elements (sum(Xi)) of the class
37 | """
38 | #nrof_features = features.get_shape()[1]
39 | #centers = tf.get_variable('centers', [nrof_classes, nrof_features], dtype=tf.float32,
40 | # initializer=tf.constant_initializer(0), trainable=False)
41 | #label = tf.reshape(label, [-1])
42 | #centers_batch = tf.gather(centers, label)
43 | #diff = (1 - alfa) * (centers_batch - features)
44 | #diff = alfa * (centers_batch - features)
45 | #centers = tf.scatter_sub(centers, label, diff)
46 | # loss = tf.nn.l2_loss(features - centers_batch)
47 | # return loss, centers, diff, centers_batch
48 |
49 | """Center loss based on the paper "A Discriminative Feature Learning Approach for Deep Face Recognition"
50 | (http://ydwen.github.io/papers/WenECCV16.pdf)
51 | -- mzh 15/02/2017
52 | -- Correcting the center updating: each center updates/shifts towards the center of the corresponding class with a weight:
53 | -- centers = centers - (1-alpha)(centers - sum(Xi)/Nj), where Xi are the elements of class j and Nj is the number of elements of class j
54 | -- code has been tested by the test script '../test/center_loss_test.py'
55 | """
56 | nrof_features = features.get_shape()[1]
57 | centers = tf.get_variable('centers', [nrof_classes, nrof_features], dtype=tf.float32, initializer=tf.constant_initializer(0), trainable=False)
58 | centers_cts = tf.get_variable('centers_cts', [nrof_classes], dtype=tf.float32, initializer=tf.constant_initializer(0), trainable=False)
59 | #centers_cts_init = tf.zeros_like(nrof_classes, tf.float32)
60 | label = tf.reshape(label, [-1])
61 | centers_batch = tf.gather(centers, label) # get the corresponding center of each element in features; the list of centers is in the same order as the features
62 | loss_n = tf.reduce_sum(tf.square(features - centers_batch)/2, 1)
63 | loss = tf.nn.l2_loss(features - centers_batch)
64 | diff = (1 - alfa) * (centers_batch - features)
65 |
66 | ## update the centers
67 | label_unique, idx = tf.unique(label)
68 | zeros = tf.zeros_like(label_unique, tf.float32)
69 | ## reset the per-class element counters before re-counting the labels of this batch
70 | nrof_elements_per_class_clean = tf.scatter_update(centers_cts, label_unique, zeros)
71 | ones = tf.ones_like(label, tf.float32)
72 | ## counting the number of elements in each class; the classes are in the order [0,1,2,3,....] as initialization
73 | nrof_elements_per_class_update = tf.scatter_add(nrof_elements_per_class_clean, label, ones)
74 | ## nrof_elements_per_class_batch is the number of elements of each sample's class within the batch
75 | nrof_elements_per_class_batch = tf.gather(nrof_elements_per_class_update, label)
76 | nrof_elements_per_class_batch_reshape = tf.reshape(nrof_elements_per_class_batch, [-1, 1]) ## reshape to a single column whatever the number of rows (-1)
77 | diff_mean = tf.div(diff, nrof_elements_per_class_batch_reshape)
78 | centers = tf.scatter_sub(centers, label, diff_mean)
79 |
80 | #return loss, centers, label, centers_batch, diff, centers_cts, centers_cts_batch, diff_mean,center_cts_clear, nrof_elements_per_class_batch_reshape
81 | return loss, loss_n, centers, nrof_elements_per_class_clean, nrof_elements_per_class_batch_reshape,diff_mean # facenet_expression_addcnns_simple_joint_v4_dynamic.py
82 | #return loss, centers, nrof_elements_per_class_clean, nrof_elements_per_class_batch_reshape,diff_mean ### facenet_train_classifier_expression_pretrainExpr_multidata_addcnns_simple.py
83 |
84 | def center_loss_similarity(features, label, alfa, nrof_classes):
85 | ## center loss on the cosine distance = 1 - similarity, instead of the L2 norm, i.e. the Euclidean distance
86 |
87 | ## normalize the feature vectors so that the dot product below gives the cosine similarity
88 | features = tf.nn.l2_normalize(features, 1, 1e-10, name='feat_emb')
89 |
90 | nrof_features = features.get_shape()[1]
91 | centers = tf.get_variable('centers', [nrof_classes, nrof_features], dtype=tf.float32, initializer=tf.constant_initializer(0), trainable=False)
92 | centers_cts = tf.get_variable('centers_cts', [nrof_classes], dtype=tf.float32, initializer=tf.constant_initializer(0), trainable=False)
93 | #centers_cts_init = tf.zeros_like(nrof_classes, tf.float32)
94 | label = tf.reshape(label, [-1])
95 | centers_batch = tf.gather(centers, label) # get the corresponding center of each element in features; the list of centers is in the same order as the features
96 | #loss = tf.nn.l2_loss(features - centers_batch) ## 0.5*(L2 norm)**2, L2 norm is the Euclidean distance
97 | similarity_all = tf.matmul(features, tf.transpose(tf.nn.l2_normalize(centers_batch, 1, 1e-10))) ## dot product = cosine similarity of x and y
98 | similarity_self = tf.diag_part(similarity_all)
99 | loss_x = tf.subtract(1.0, similarity_self)
100 | loss = tf.reduce_sum(loss_x) ## sum the cosine distance of each vector/tensor
101 | diff = (1 - alfa) * (centers_batch - features)
102 | ones = tf.ones_like(label, tf.float32)
103 | centers_cts = tf.scatter_add(centers_cts, label, ones) # counting the number of each class, the class is in the order of the [0,1,2,3,....]
as initialzation 104 | centers_cts_batch = tf.gather(centers_cts, label) 105 | #centers_cts_batch_ext = tf.tile(centers_cts_batch, nrof_features) 106 | #centers_cts_batch_reshape = tf.reshape(centers_cts_batch_ext,[-1, nrof_features]) 107 | centers_cts_batch_reshape = tf.reshape(centers_cts_batch, [-1,1]) 108 | diff_mean = tf.div(diff, centers_cts_batch_reshape) 109 | centers = tf.scatter_sub(centers, label, diff_mean) 110 | zeros = tf.zeros_like(label, tf.float32) 111 | center_cts_clear = tf.scatter_update(centers_cts, label, zeros) 112 | #return loss, centers, label, centers_batch, diff, centers_cts, centers_cts_batch, diff_mean,center_cts_clear, centers_cts_batch_reshape 113 | #return loss, centers, loss_x, similarity_all, similarity_self 114 | return loss, centers 115 | 116 | 117 | 118 | 119 | def center_inter_loss_tf(features, nrof_features, label, alfa, nrof_classes): # tensorflow version 120 | """ center_inter_loss = center_loss/||Xi - centers(0,1,2,...i-1,i+1,i+2,...)|| 121 | --mzh 22022017 122 | """ 123 | # dim_features = features.get_shape()[1] 124 | # nrof_features = features.get_shape()[0] 125 | dim_features = features.get_shape()[1].value 126 | #nrof_features = features.get_shape()[0].value 127 | # dim_features = features.shape[1] 128 | # nrof_features = features.shape[0] 129 | centers = tf.get_variable('centers', [nrof_classes, dim_features], dtype=tf.float32, 130 | initializer=tf.constant_initializer(0), trainable=False) 131 | centers_cts = tf.get_variable('centers_cts', [nrof_classes], dtype=tf.float32, 132 | initializer=tf.constant_initializer(0), trainable=False) 133 | ## center_loss calculation 134 | label = tf.reshape(label, [-1]) 135 | centers_batch = tf.gather(centers,label) # get the corresponding center of each element in features, the list of the centers is in the same order as the features 136 | dist_centers = features - centers_batch 137 | dist_centers_sum = tf.reduce_sum(dist_centers**2,1)/2 138 | loss_center = tf.nn.l2_loss(dist_centers) 139 | 140 | ## calculation the repeat time of same label 141 | ones = tf.ones_like(label, tf.float32) 142 | centers_cts = tf.scatter_add(centers_cts, label, ones) # counting the number of each class, the class is in the order of the [0,1,2,3,....] 
as initialzation 143 | centers_cts_batch = tf.gather(centers_cts, label) 144 | 145 | 146 | ## inter_center_loss calculation 147 | #label_unique, label_idx = tf.unique(label) 148 | #centers_batch1 = tf.gather(centers,label_unique) 149 | #nrof_classes_batch = centers_batch.get_shape()[0].value 150 | #centers_1D = tf.reshape(centers_batch1, [1, nrof_classes_batch * dim_features]) 151 | centers_batch1 = tf.gather(centers,label) 152 | centers_1D = tf.reshape(centers_batch1, [1, nrof_features * dim_features]) 153 | centers_2D = tf.tile(centers_1D, [nrof_features, 1]) 154 | centers_3D = tf.reshape(centers_2D,[nrof_features, nrof_features, dim_features]) 155 | features_3D = tf.reshape(features, [nrof_features, 1, dim_features]) 156 | dist_inter_centers = features_3D - centers_3D 157 | dist_inter_centers_sum_dim = tf.reduce_sum(dist_inter_centers**2,2)/2 158 | centers_cts_batch_1D = tf.tile(centers_cts_batch,[nrof_features]) 159 | centers_cts_batch_2D = tf.reshape(centers_cts_batch_1D, [nrof_features, nrof_features]) 160 | dist_inter_centers_sum_unique = tf.div(dist_inter_centers_sum_dim, centers_cts_batch_2D) 161 | dist_inter_centers_sum_all = tf.reduce_sum(dist_inter_centers_sum_unique, 1) 162 | dist_inter_centers_sum = dist_inter_centers_sum_all - dist_centers_sum 163 | loss_inter_centers = tf.reduce_mean(dist_inter_centers_sum) 164 | 165 | ## total loss 166 | loss = tf.div(loss_center, loss_inter_centers) 167 | 168 | ## update centers 169 | diff = (1 - alfa) * (centers_batch - features) 170 | # ones = tf.ones_like(label, tf.float32) 171 | # centers_cts = tf.scatter_add(centers_cts, label, ones) # counting the number of each class 172 | # centers_cts_batch = tf.gather(centers_cts, label) 173 | centers_cts_batch_reshape = tf.reshape(centers_cts_batch, [-1, 1]) 174 | diff_mean = tf.div(diff, centers_cts_batch_reshape) 175 | centers = tf.scatter_sub(centers, label, diff_mean) 176 | zeros = tf.zeros_like(label, tf.float32) 177 | center_cts_clear = tf.scatter_update(centers_cts, label, zeros) 178 | # return loss, centers, label, centers_batch, diff, centers_cts, centers_cts_batch, diff_mean,center_cts_clear, centers_cts_batch_reshape 179 | return loss, centers, loss_center, loss_inter_centers, center_cts_clear 180 | #return loss, centers, loss_center, loss_inter_centers, dist_inter_centers_sum_dim, centers_cts_batch_2D, dist_inter_centers_sum_unique, dist_inter_centers_sum_all, dist_inter_centers_sum, dist_inter_centers_sum, center_cts_clear 181 | 182 | def center_inter_triplet_loss_tf(features, nrof_features, label, alfa, nrof_classes, beta): # tensorflow version 183 | """ center_inter_loss = center_loss/||Xi - centers(0,1,2,...i-1,i+1,i+2,...)|| 184 | --mzh 22022017 185 | """ 186 | dim_features = features.get_shape()[1].value 187 | centers = tf.get_variable('centers', [nrof_classes, dim_features], dtype=tf.float32, 188 | initializer=tf.constant_initializer(0), trainable=False) 189 | nrof_elements_per_class_list = tf.get_variable('centers_cts', [nrof_classes], dtype=tf.float32, 190 | initializer=tf.constant_initializer(0), trainable=False) 191 | ## center_loss calculation 192 | label = tf.reshape(label, [-1]) 193 | centers_batch = tf.gather(centers,label) # get the corresponding center of each element in features, the list of the centers is in the same order as the features 194 | dist_centers = features - centers_batch 195 | dist_centers_sum = tf.reduce_sum(dist_centers**2,1)/2 196 | loss_center = tf.nn.l2_loss(dist_centers) 197 | 198 | ## calculation the repeat time of same label 199 | ones = 
tf.ones_like(label, tf.float32) 200 | nrof_elements_per_class_list = tf.scatter_add(nrof_elements_per_class_list, label, ones) # counting the number elments in each class, the class is in the order of the [0,1,2,3,....] as initialzation 201 | nrof_elements_per_class = tf.gather(nrof_elements_per_class_list, label) #nrof_elements_per_class is the number of the elements in each class 202 | 203 | 204 | ## inter_center_loss calculation 205 | centers_batch1 = tf.gather(centers,label) 206 | centers_1D = tf.reshape(centers_batch1, [1, nrof_features * dim_features]) 207 | centers_2D = tf.tile(centers_1D, [nrof_features, 1]) 208 | centers_3D = tf.reshape(centers_2D,[nrof_features, nrof_features, dim_features]) 209 | features_3D = tf.reshape(features, [nrof_features, 1, dim_features]) 210 | dist_inter_centers = features_3D - centers_3D 211 | dist_inter_centers_sum_dim = tf.reduce_sum(dist_inter_centers**2,2)/2 212 | centers_cts_batch_1D = tf.tile(nrof_elements_per_class,[nrof_features]) 213 | centers_cts_batch_2D = tf.reshape(centers_cts_batch_1D, [nrof_features, nrof_features]) 214 | dist_inter_centers_sum_unique = tf.div(dist_inter_centers_sum_dim, centers_cts_batch_2D) 215 | dist_inter_centers_sum_all = tf.reduce_sum(dist_inter_centers_sum_unique, 1) 216 | dist_inter_centers_sum = dist_inter_centers_sum_all - dist_centers_sum 217 | loss_inter_centers = tf.reduce_mean(dist_inter_centers_sum) 218 | 219 | ## total loss 220 | loss = loss_center + (loss_center + beta*nrof_features - loss_inter_centers) 221 | 222 | ## update centers 223 | diff = (1 - alfa) * (centers_batch - features) 224 | centers_cts_batch_reshape = tf.reshape(nrof_elements_per_class, [-1, 1]) 225 | diff_mean = tf.div(diff, centers_cts_batch_reshape) 226 | centers = tf.scatter_sub(centers, label, diff_mean) 227 | zeros = tf.zeros_like(label, tf.float32) 228 | center_cts_clear = tf.scatter_update(nrof_elements_per_class_list, label, zeros) 229 | return loss, centers, loss_center, loss_inter_centers, center_cts_clear 230 | 231 | def class_level_triplet_loss_tf(features, nrof_samples, label, alfa, nrof_classes, beta, gamma): # tensorflow version 232 | """ Class_level_Triple_loss, triple loss implemented on the centers of the class intead of the individual sample 233 | --mzh 30072017s 234 | """ 235 | dim_features = features.get_shape()[1].value 236 | centers = tf.get_variable('centers', [nrof_classes, dim_features], dtype=tf.float32, 237 | initializer=tf.constant_initializer(0), trainable=False) 238 | nrof_elements_per_class = tf.get_variable('centers_cts', [nrof_classes], dtype=tf.float32, 239 | initializer=tf.constant_initializer(0), trainable=False) 240 | 241 | ## normalisation as the embedding vectors in order to similarity distance 242 | #features = tf.nn.l2_normalize(features, 1, 1e-10, name='feat_emb') 243 | 244 | ## calculate centers 245 | centers_batch = tf.gather(centers, label) 246 | diff = (1 - alfa) * (centers_batch - features) 247 | diff_within = centers_batch - features 248 | dist_within = tf.reduce_sum(diff_within**2/2, axis=1, keep_dims=True) 249 | dist_within_center = tf.reduce_sum(dist_within, axis=0) ## sum all the elements in the dist_centers_sum, dist_within_center is a scale 250 | 251 | ## inter_center_loss calculation 252 | label_unique,idx = tf.unique(label) 253 | centers_batch_unique = tf.gather(centers,label_unique)#select the centers corresponding to the batch samples, otherwise the whole centers will cause the overflow of the centers_2D 254 | nrof_centers_batch_unique = tf.shape(centers_batch_unique)[0]##very 
important, tf.shape() can be used to get the run-time dynamic tensor shape; however .get_shape() can only be used to get the shape of the static shape of the tensor 255 | centers_1D = tf.reshape(centers_batch_unique, [1, nrof_centers_batch_unique * dim_features]) 256 | centers_2D = tf.tile(centers_1D, [nrof_samples, 1]) 257 | centers_3D = tf.reshape(centers_2D, [nrof_samples,nrof_centers_batch_unique, dim_features]) 258 | features_3D = tf.reshape(features, [nrof_samples, 1, dim_features]) 259 | dist_inter_centers = features_3D - centers_3D 260 | dist_inter_centers_sum_dim = tf.reduce_sum(dist_inter_centers**2,2)/2 # calculate the L2 of the features, [nrof_samples, nrof_classes, feature_dimension] 261 | dist_inter_centers_sum_all = tf.reduce_sum(dist_inter_centers_sum_dim)#sum all the elements in the dist_inter_centers_sum_dim 262 | 263 | ## total loss 264 | dist_within_2D = tf.tile(dist_within, [1, nrof_centers_batch_unique]) 265 | dist_matrix = dist_within_2D + beta*tf.ones([nrof_samples, nrof_centers_batch_unique]) - gamma*dist_inter_centers_sum_dim 266 | loss_matrix = tf.maximum(dist_matrix, tf.zeros([nrof_samples, nrof_centers_batch_unique], tf.float32)) 267 | loss_pre = tf.reduce_sum(loss_matrix) - nrof_samples*beta 268 | #loss = tf.divide(loss_pre, nrof_samples) 269 | loss = tf.divide(loss_pre, tf.multiply(tf.cast(nrof_samples, tf.float32), 270 | tf.cast(nrof_centers_batch_unique, tf.float32) - tf.cast(1, tf.float32))) 271 | 272 | #centers = tf.scatter_sub(centers, label, diff) 273 | 274 | ##update centers 275 | zeros = tf.zeros_like(label_unique, tf.float32) 276 | ## calculation the repeat time of same label 277 | nrof_elements_per_class_clean = tf.scatter_update(nrof_elements_per_class, label_unique, zeros) 278 | ones = tf.ones_like(label, tf.float32) 279 | ## counting the number elments in each class, the class is in the order of the [0,1,2,3,....] as initialzation 280 | nrof_elements_per_class_update = tf.scatter_add(nrof_elements_per_class_clean, label, ones) 281 | ## nrof_elements_per_class_list is the number of the elements in each class in the batch 282 | nrof_elements_per_class_batch = tf.gather(nrof_elements_per_class_update, label) 283 | centers_cts_batch_reshape = tf.reshape(nrof_elements_per_class_batch, [-1, 1]) 284 | diff_mean = tf.div(diff, centers_cts_batch_reshape) 285 | centers = tf.scatter_sub(centers, label, diff_mean) 286 | 287 | #return loss 288 | return loss, centers, dist_within_center, dist_inter_centers_sum_all, nrof_centers_batch_unique 289 | #return loss, loss_matrix, dist_matrix, dist_within_2D, dist_inter_centers_sum_dim, centers, dist_inter_centers, features_3D, centers_3D, centers_1D 290 | #return loss, dist_within_center, dist_inter_centers_sum_all, nrof_centers_batch 291 | 292 | 293 | def class_level_triplet_loss_similarity_tf(features, nrof_samples, label, nrof_classes, beta): # tensorflow version 294 | """ Class_level_Triple_loss_similarity, triple loss implemented on the centers of the class intead of the individual sample, however here the distance cosine (representing the similarity) replaces the L2 distance inclass_level_triplet_loss_tf. 
295 | --mzh 25062017 296 | """ 297 | dim_features = features.get_shape()[1].value 298 | centers = tf.get_variable('centers', [nrof_classes, dim_features], dtype=tf.float32, 299 | initializer=tf.constant_initializer(0), trainable=False) 300 | nrof_elements_per_class = tf.get_variable('centers_cts', [nrof_classes], dtype=tf.float32, 301 | initializer=tf.constant_initializer(0), trainable=False) 302 | 303 | ## normalisation as the embedding vectors in order to similarity distance 304 | features = tf.nn.l2_normalize(features, 1, 1e-10, name='feat_emb') 305 | 306 | #nrof_samples = label.get_shape()[0] 307 | #nrof_samples = tf.shape(label)[0] 308 | ## calculation the repeat time of same label 309 | ones = tf.ones_like(label, tf.float32) 310 | ## counting the number elments in each class, the class is in the order of the [0,1,2,3,....] as initialzation 311 | nrof_elements_per_class = tf.scatter_add(nrof_elements_per_class, label, ones) 312 | ## nrof_elements_per_class_list is the number of the elements in each class 313 | nrof_elements_per_class_list = tf.gather(nrof_elements_per_class, label) 314 | 315 | ## calculate centers 316 | class_sum = tf.scatter_add(centers, label, features) 317 | centers = tf.divide(class_sum, nrof_elements_per_class[:,None]) 318 | ##very important, tf.shape() can be used to get the run-time dynamic tensor shape; however .get_shape() can only be used to get the shape of the static shape of the tensor 319 | ## inter_center_loss calculation 320 | label_unique, idx = tf.unique(label) 321 | nrof_centers_batch = tf.shape(label_unique)[0] ##very important, tf.shape() can be used to get the run-time dynamic tensor shape; however .get_shape() can only be used to get the shape of the static shape of the tensor 322 | centers_batch = tf.gather(centers, label_unique) 323 | 324 | 325 | ## class distance loss 326 | label = tf.reshape(label, [-1]) 327 | #centers_list = tf.gather(centers,label) # get the corresponding center of each element in features, the list of the centers is in the same order as the features 328 | ## dot prodoct, cosine distance, similarity of x and y 329 | similarity_all = tf.matmul(features, tf.transpose(centers_batch)) 330 | 331 | centers_list = tf.gather(centers, label) 332 | similarity_all_nn = tf.matmul(features, tf.transpose(centers_list)) 333 | similarity_self = tf.diag_part(similarity_all_nn) 334 | 335 | # n = tf.cast(tf.shape(label)[0],dtype=tf.int64) 336 | # a = tf.range(n, dtype=tf.int64) 337 | # #a = tf.ones_like(label) 338 | # #self_index_1 = tf.transpose(tf.stack([a[0:], label])) 339 | # self_index = tf.transpose(tf.stack([a, label])) 340 | # similarity_self = tf.gather_nd(similarity_all, self_index) 341 | 342 | similarity_self_mn = tf.tile(tf.transpose([similarity_self]), [1, nrof_centers_batch]) 343 | #similarity_self_mn = tf.ones_like(similarity_all) 344 | similarity_all_beta = similarity_all + beta 345 | pre_loss_mtr = tf.subtract(similarity_all_beta, similarity_self_mn) 346 | 347 | ## ignore the element in loss_mtr less than 0, it means the (similarity_within + beta > similarity_inter ) is already satified. 
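## Editorial sketch of the hinge below (assumed numbers, not from the source): with beta = 0.2,
## a sample whose cosine similarity to its own center is 0.9 and to another center is 0.6 yields
## max(0.6 + 0.2 - 0.9, 0) = 0 -- the margin is already satisfied -- while a self-similarity of
## 0.65 yields max(0.6 + 0.2 - 0.65, 0) = 0.15, which contributes to the loss. The column of each
## sample's own center always contributes exactly beta, which is why beta*nrof_samples is
## subtracted from loss_sum further down.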
348 | zero_mtr = tf.zeros_like(pre_loss_mtr, tf.float32) 349 | loss_mtr = tf.maximum(pre_loss_mtr, zero_mtr) 350 | loss_sum = tf.reduce_sum(loss_mtr) 351 | loss_sum_real = tf.sub(loss_sum, beta*nrof_samples) 352 | loss_mean = tf.div(loss_sum_real, tf.cast(nrof_samples * (nrof_centers_batch-1), tf.float32)) 353 | 354 | ## Adding a regularisation term to loss to make it not equal to zero 355 | loss_reg = tf.add(loss_mean, 1e-10) 356 | 357 | #return loss_reg, loss_mtr, loss_sum, loss_sum_real, loss_mean, pre_loss_mtr, similarity_self_mn, similarity_self, similarity_all, centers, features, nrof_centers_batch 358 | #return loss_reg, loss_real_mtr, loss_real_mtr, pre_loss_mtr, similarity_self_mn_beta, similarity_self_mn, similarity_self, similarity_all, centers, centers_norm, features, nrof_centers_batch, self_index, a 359 | return loss_reg, similarity_all, similarity_self, nrof_centers_batch 360 | 361 | 362 | 363 | def center_inter_loss_python(features, label, alfa, nrof_classes, centers): # python version: very slow, 30 time cost than tf version 364 | """ center_inter_loss = center_loss/||Xi - centers(0,1,2,...i-1,i+1,i+2,...)|| 365 | mzh 22022017 366 | """ 367 | # loss calculation 368 | nrof_features = np.shape(features)[0] 369 | #dim_feature = np.shape(features)[1] 370 | centers_batch = gather(centers, label) 371 | dist_center = np.sum(np.square(features - centers_batch),1) 372 | loss_center = np.sum(dist_center) 373 | dist_inter_centers = np.zeros([nrof_features, nrof_classes], dtype=np.float32) 374 | for i in np.arange(nrof_features): 375 | dist_inter_centers[i,:] = np.sum(np.square(features[i] - centers),1) 376 | dist_inter_centers_sum = np.sum(dist_inter_centers,1) 377 | dist_inter_centers_sum = dist_inter_centers_sum - dist_center 378 | #loss_inter_centers = np.sum(dist_inter_centers_sum) 379 | loss_inter_centers = np.sum(dist_inter_centers_sum / nrof_classes) 380 | loss_inter_centers = np.maximum(1e-5, loss_inter_centers) 381 | loss = loss_center/loss_inter_centers 382 | 383 | # update centers 384 | centers_cts = np.zeros(nrof_classes, dtype=np.int32) 385 | centers_batch = gather(centers, label) 386 | diff = (1 - alfa) * (centers_batch - features) 387 | for idx in label: 388 | centers_cts[idx] += 1 389 | centers_cts_batch = gather(centers_cts, label) 390 | centers_cts_batch_reshape = np.reshape(centers_cts_batch, [-1,1]) 391 | diff_mean = diff / centers_cts_batch_reshape 392 | i = 0 393 | for idx in label: 394 | centers[idx,:] -= diff_mean[i,:] 395 | i += 1 396 | 397 | return loss, centers 398 | 399 | 400 | 401 | 402 | 403 | 404 | 405 | 406 | 407 | 408 | 409 | 410 | 411 | 412 | 413 | 414 | 415 | 416 | 417 | 418 | 419 | 420 | 421 | 422 | 423 | 424 | 425 | 426 | 427 | 428 | 429 | 430 | 431 | 432 | 433 | 434 | 435 | 436 | 437 | 438 | 439 | 440 | 441 | 442 | 443 | 444 | 445 | 446 | 447 | 448 | 449 | 450 | 451 | 452 | 453 | 454 | 455 | 456 | 457 | 458 | 459 | 460 | 461 | -------------------------------------------------------------------------------- /src/models/inception_resnet_v1_expression_simple.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gonghao67/FaceLiveNet/b5bb9ab97dd6cd9d6cfbaa167e4d966f5c4a8605/src/models/inception_resnet_v1_expression_simple.pyc -------------------------------------------------------------------------------- /src/test_realtime.py: -------------------------------------------------------------------------------- 1 | ############# face_expression 
###################################################################### 2 | ##### FER2013 : 0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise, 6=Neutral ###### 3 | ##### CK+: 0=neutral, 1=anger, 2=contempt, 3=disgust, 4=fear, 5=happy, 6=sadness, 7=surprise# ###### 4 | ############# face_verification, face_expression #################################################### 5 | import argparse 6 | import os 7 | import sys 8 | import numpy as np 9 | import matplotlib.pyplot as plt 10 | import matplotlib.patches as patches 11 | from PIL import Image 12 | import cv2 13 | from scipy import misc 14 | 15 | import time 16 | 17 | 18 | import align.face_align_mtcnn 19 | import face_verification 20 | import facenet 21 | 22 | 23 | 24 | 25 | def verification_test(args): 26 | # label = [] 27 | # phrase = [] 28 | # with open('/mnt/hgfs/VMshare-2/fer2013/fer2013.csv', 'rb') as csvfile: 29 | # reader = csv.reader(csvfile, delimiter=',', quotechar='|') 30 | # header = next(reader) 31 | # for row in reader: 32 | # label.append(row[0]) 33 | # img = row[1] 34 | # img = img.split(' ') 35 | # img = [int(i) for i in img] 36 | # img = np.array(img) 37 | # img = img.reshape(48,48) 38 | # phrase.append(row[2]) 39 | 40 | predict_issame = False 41 | rect_len = 120 42 | offset_x = 50 43 | offset_y = 40 44 | 45 | # Expr_str = ["Neu", "Ang", "Cont", "Disg", "Fear", "Hap", "Sad", "Surp"] ###CK+ 46 | # Expr_dataset = 'CK+' 47 | #Expr_str = ["Ang", "Disg", "Fear", "Hap", "Sad", "Surp", "Neu"] ###FER2013+ 48 | #Expr_str = ['Neu', 'Ang', 'Disg', 'Fear', 'Hap', 'Sad', 'Surp'] #####FER2013+ EXPRSSIONS_TYPE_fusion 49 | Expr_str = ['Neutre', 'Colere', 'Degoute', 'Peur', 'Content', 'Triste', 'Surprise'] #####FER2013+ EXPRSSIONS_TYPE_fusion 50 | Expr_dataset = 'FER2013' 51 | 52 | c_red = (0, 0, 255) 53 | c_green = (0, 255, 0) 54 | font = cv2.FONT_HERSHEY_SIMPLEX 55 | #express_probs = np.ones(7) 56 | 57 | scale_size = 3 ## scale the original image as the input image to align the face 58 | 59 | 60 | ## load models for the face detection and verfication 61 | 62 | pnet, rnet, onet, sess, args_model = face_verification.load_models_forward_v2(args, Expr_dataset) 63 | 64 | 65 | face_img_refs_ = [] 66 | img_ref_paths = [] 67 | probs_face = [] 68 | for img_ref_path in os.listdir(args.img_ref): 69 | img_ref_paths.append(img_ref_path) 70 | img_ref = misc.imread(os.path.join(args.img_ref, img_ref_path)) # python format 71 | img_size = img_ref.shape[0:2] 72 | 73 | bb, probs = align.face_align_mtcnn.align_mtcnn_realplay(img_ref, pnet, rnet, onet) 74 | if (bb == []): 75 | continue; 76 | 77 | bb_face = [] 78 | probs_face = [] 79 | for i, prob in enumerate(probs): 80 | if prob > args.face_detect_threshold: 81 | bb_face.append(bb[i]) 82 | probs_face.append(prob) 83 | 84 | bb = np.asarray(bb_face) 85 | probs = np.asarray(probs_face) 86 | 87 | det = bb 88 | 89 | if det.shape[0] > 1: 90 | bounding_box_size = (det[:, 2] - det[:, 0]) * (det[:, 3] - det[:, 1]) 91 | img_center = np.array(img_size) / 2 92 | offsets = np.vstack( 93 | [(det[:, 0] + det[:, 2]) / 2 - img_center[1], (det[:, 1] + det[:, 3]) / 2 - img_center[0]]) 94 | offset_dist_squared = np.sum(np.power(offsets, 2.0), 0) 95 | index = np.argmax( 96 | bounding_box_size - offset_dist_squared * 2.0) # some extra weight on the centering 97 | det = det[index, :] 98 | prob = probs_face[index] 99 | 100 | det = np.squeeze(det) 101 | x0 = det[0] 102 | y0 = det[1] 103 | x1 = det[2] 104 | y1 = det[3] 105 | 106 | bb_tmp = np.zeros(4, dtype=np.int32) 107 | bb_tmp[0] = np.maximum(det[0] - args.margin / 2, 0) 
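# Editorial note (margin value assumed for illustration): the four bb_tmp lines around this point
# expand the detected box by args.margin/2 on each side and clip it to the image borders;
# e.g. with args.margin = 32, a left edge det[0] = 10 becomes max(10 - 16, 0) = 0, while
# bb_tmp[2]/bb_tmp[3] are capped at the image width/height.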
108 | bb_tmp[1] = np.maximum(det[1] - args.margin / 2, 0) 109 | bb_tmp[2] = np.minimum(det[2] + args.margin / 2, img_size[1]) 110 | bb_tmp[3] = np.minimum(det[3] + args.margin / 2, img_size[0]) 111 | 112 | face_img_ref = img_ref[bb_tmp[1]:bb_tmp[3], bb_tmp[0]:bb_tmp[2], :] 113 | face_img_ref = misc.imresize(face_img_ref, (args.image_size, args.image_size), interp='bilinear') 114 | face_img_ref_ = facenet.load_data_im(face_img_ref, False, False, args.image_size) 115 | face_img_refs_.append(face_img_ref_) 116 | 117 | img_ref_cv = cv2.cvtColor(img_ref, cv2.COLOR_BGR2RGB) 118 | cv2.rectangle(img_ref_cv, (int(det[0]), int(det[1])), (int(det[2]), int(det[3])), c_red, 2, 8, 0) 119 | # cv2.putText(img_ref_, "%.4f" % prob, (int(x0), int(y0)), font, 1, c_green, 3) 120 | img_ref_name = img_ref_path.split('.')[0] 121 | cv2.putText(img_ref_cv, "%s" % img_ref_name, (int(x0), int(y0 - 10)), font, 122 | 1, 123 | c_red, 2) 124 | cv2.imshow('%s'%img_ref_path, img_ref_cv) 125 | cv2.waitKey(20) 126 | 127 | face_img_refs_ = np.array(face_img_refs_) 128 | 129 | #emb_ref = face_verification.face_embeddings(face_img_refs_,args, sess, images_placeholder, embeddings, keep_probability_placeholder, phase_train_placeholder) 130 | emb_ref = face_verification.face_embeddings(face_img_refs_, args, sess, args_model, Expr_dataset) 131 | 132 | 133 | ################ capture the camera for realplay ############################################# 134 | cap = cv2.VideoCapture(0) 135 | cap.set(cv2.CAP_PROP_FRAME_WIDTH, 800) 136 | cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 600) 137 | 138 | realplay_window = "Realplay" 139 | cv2.namedWindow(realplay_window, cv2.WINDOW_NORMAL) 140 | 141 | 142 | while (True): 143 | if cv2.getWindowProperty(realplay_window, cv2.WINDOW_NORMAL) < 0: 144 | return 145 | # Capture frame-by-frame 146 | t6 = time.time() 147 | ret, frame = cap.read() 148 | t7 = time.time() 149 | #print('face cap eclapse %f, FPS:%d' % ((t7 - t6), int(1 / ((t7 - t6))))) 150 | 151 | # Our operations on the frame come here 152 | # gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) 153 | if frame is None: 154 | continue; 155 | 156 | 157 | 158 | cv2_im = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB) 159 | #img = Image.fromarray(cv2_im) 160 | 161 | #face alignmentation 162 | #im_np = np.asarray(img) 163 | im_np = cv2_im 164 | img_size = im_np.shape[0:2] 165 | im_np_scale = cv2.resize(im_np, (int(img_size[1] / scale_size), int(img_size[0] / scale_size)), 166 | interpolation=cv2.INTER_LINEAR) 167 | t0 = time.time() 168 | bb, probs = align.face_align_mtcnn.align_mtcnn_realplay(im_np_scale, pnet, rnet, onet) 169 | t1 = time.time() 170 | #print('align face FPS:%d' % (int(1 / ((t1 - t0))))) 171 | 172 | bb_face = [] 173 | probs_face = [] 174 | for i, prob in enumerate(probs): 175 | if prob > args.face_detect_threshold: 176 | bb_face.append(bb[i]) 177 | probs_face.append(prob) 178 | 179 | bb = np.asarray(bb_face) 180 | probs = np.asarray(probs_face) 181 | 182 | bb = bb*scale_size #re_scale of the scaled image for align_face 183 | 184 | if (len(bb) > 0): 185 | for i in range(bb.shape[0]): 186 | prob = probs[i] 187 | det = bb[i] 188 | bb_tmp = np.zeros(4, dtype=np.int32) 189 | bb_tmp[0] = np.maximum(det[0] - args.margin / 2, 0) 190 | bb_tmp[1] = np.maximum(det[1] - args.margin / 2, 0) 191 | bb_tmp[2] = np.minimum(det[2] + args.margin / 2, img_size[1]) 192 | bb_tmp[3] = np.minimum(det[3] + args.margin / 2, img_size[0]) 193 | 194 | face_img = im_np[bb_tmp[1]:bb_tmp[3], bb_tmp[0]:bb_tmp[2], :] 195 | face_img_ = misc.imresize(face_img, (args.image_size, 
args.image_size), interp='bilinear') 196 | face_img_ = facenet.load_data_im(face_img_, False, False, args.image_size) 197 | 198 | 199 | ######### 200 | x0 = bb[i][0] 201 | y0 = bb[i][1] 202 | x1 = bb[i][2] 203 | y1 = bb[i][3] 204 | offset_y = int((y1-y0)/7) 205 | 206 | 207 | # if (predict_issame): 208 | # cv2.rectangle(frame, (int(x0), int(y0)), (int(x1), int(y1)), c_red, 2, 209 | # 8, 210 | # 0) 211 | # cv2.putText(frame, "%.4f" % prob, (int(x0), int(y0)), font, 212 | # 1, 213 | # c_red, 3) 214 | # 215 | # for k in range(express_probs.shape[0]): 216 | # cv2.putText(frame, Expr_str[k], (int(x1 + offset_x / 4), int(y0 + offset_y * k + offset_y / 4)), font, 217 | # 0.5, 218 | # c_red, 2) 219 | # cv2.rectangle(frame, (int(x1 + offset_x), int(y0 + offset_y * k)), 220 | # (int(x1 + offset_x + rect_len * express_probs[i]), int(y0 + offset_y * k + offset_y/2)), 221 | # c_red, cv2.FILLED, 222 | # 8, 223 | # 0) 224 | # 225 | # else: 226 | # cv2.rectangle(frame, (int(x0), int(y0)), (int(x1), int(y1)), c_green, 2, 227 | # 8, 228 | # 0) 229 | # cv2.putText(frame, "%.4f" % prob, (int(x0), int(y0)), font, 230 | # 1, 231 | # c_green, 3) 232 | # 233 | # for k in range(express_probs.shape[0]): 234 | # cv2.putText(frame, Expr_str[k], (int(x1 + offset_x / 4), int(y0 + offset_y * k + offset_y / 4)), 235 | # font, 236 | # 0.5, 237 | # c_green, 2) 238 | # cv2.rectangle(frame, (int(x1 + offset_x), int(y0 + offset_y * k)), 239 | # (int(x1 + offset_x + rect_len * express_probs[i]), 240 | # int(y0 + offset_y * k + offset_y / 2)), 241 | # c_green, cv2.FILLED, 242 | # 8, 243 | # 0) 244 | 245 | # face experssion 246 | ##### 0=neutral, 1=anger, 2=contempt, 3=disgust, 4=fear, 5=happy, 6=sadness, 7=surprise ############ 247 | t2 = time.time() 248 | #predict_issames, dists, express_probs = face_verification.face_expression_multiref_forward(face_img_, emb_ref, args, sess, images_placeholder, embeddings, keep_probability_placeholder, phase_train_placeholder, logits) 249 | predict_issames, dists, express_probs = face_verification.face_expression_multiref_forward(face_img_, emb_ref, args, sess, args_model, Expr_dataset) 250 | t3 = time.time() 251 | 252 | 253 | print('face verif FPS:%d' % (int(1 / ((t3 - t2))))) 254 | 255 | predict_issame_idx = [i for i, predict_issame in enumerate(predict_issames) if predict_issame == True] 256 | 257 | if predict_issame_idx: 258 | for i in predict_issame_idx: 259 | dist = dists[i] 260 | img_ref_name = img_ref_paths[i].split('.')[0] 261 | 262 | cv2.rectangle(frame, (int(x0), int(y0)), (int(x1), int(y1)), c_green, 2, 263 | 8, 264 | 0) 265 | cv2.putText(frame, "%.4f" % prob, (int(x0), int(y0)), font, 266 | 0.5, 267 | c_green, 1) 268 | cv2.putText(frame, "%.2f" % dist, (int(x1), int(y1)), font, 269 | 0.5, 270 | c_green, 1) 271 | cv2.putText(frame, "%s" % img_ref_name, (int((x1 + x0) / 2), int(y0 - 10)), font, 272 | 1, 273 | c_green, 2) 274 | 275 | for k in range(express_probs.shape[0]): 276 | cv2.putText(frame, Expr_str[k], 277 | (int(x1 + offset_x / 4), int(y0 + offset_y * k + offset_y / 4)), 278 | font, 279 | 0.5, 280 | c_green, 1) 281 | cv2.rectangle(frame, ( 282 | int(x1 + offset_x / 4), int(y0 + offset_y * k + offset_y / 4 + offset_y / 5)), 283 | (int(x1 + offset_x / 4 + rect_len * express_probs[k]), 284 | int(y0 + offset_y * k + + offset_y / 4 + offset_y / 2)), 285 | c_green, cv2.FILLED, 286 | 8, 287 | 0) 288 | else: 289 | dist = min(dists) 290 | cv2.rectangle(frame, (int(x0), int(y0)), (int(x1), int(y1)), c_red, 2, 291 | 8, 292 | 0) 293 | cv2.putText(frame, "%.4f" % prob, (int(x0), 
                    cv2.putText(frame, "%.2f" % dist, (int(x1), int(y1)), font, 0.5, c_red, 1)

                    for k in range(express_probs.shape[0]):
                        cv2.putText(frame, Expr_str[k],
                                    (int(x1 + offset_x / 4), int(y0 + offset_y * k + offset_y / 4)),
                                    font, 0.5, c_red, 1)
                        cv2.rectangle(frame,
                                      (int(x1 + offset_x / 4),
                                       int(y0 + offset_y * k + offset_y / 4 + offset_y / 5)),
                                      (int(x1 + offset_x / 4 + rect_len * express_probs[k]),
                                       int(y0 + offset_y * k + offset_y / 4 + offset_y / 2)),
                                      c_red, cv2.FILLED, 8, 0)

        # visualization
        cv2.imshow(realplay_window, frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    # When everything is done, release the capture
    cap.release()
    cv2.destroyAllWindows()
    ################ capture the camera for realplay #############################################

    return


def read_pairs(pairs_filename):
    pairs = []
    with open(pairs_filename, 'r') as f:
        for line in f.readlines():
            pair = line.strip().split()
            pairs.append(pair)
    return np.array(pairs)


def plotbb(img, bboxes, ld=None, output_filename=None):
    fig, ax = plt.subplots(1)
    ax.imshow(img)

    if bboxes.ndim < 2:
        bboxes = np.expand_dims(bboxes, axis=0)

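    # Each row of bboxes is [x1, y1, x2, y2, score]: the box corners plus the
    # detection confidence, which is printed next to each rectangle.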
    for i in range(bboxes.shape[0]):
        rect = patches.Rectangle(
            (bboxes[i, 0], bboxes[i, 1]),
            bboxes[i, 2] - bboxes[i, 0],
            bboxes[i, 3] - bboxes[i, 1],
            fill=False,
            linewidth=1,
            edgecolor='r',
            facecolor='none'
        )
        ax.add_patch(rect)
        score = '%.02f' % bboxes[i, 4]
        ax.text(int(bboxes[i, 0]), int(bboxes[i, 1]), score, color='green', fontsize=10)
    plt.pause(0.0001)

    if ld is not None and len(ld) > 0:
        ax = plt.gca()
        ld = np.int32(np.squeeze(ld))
        # The first five values are the landmark x coordinates, the next five the y coordinates.
        for i in range(np.int32(ld.shape[0] / 2)):
            ax.plot(ld[i], ld[i + 5], 'o', color='r', linewidth=0.1)

    if output_filename:
        dirtmp = output_filename
        if not os.path.exists(dirtmp):
            os.mkdir(dirtmp)
        random_key = np.random.randint(0, high=99999)
        fig.savefig(os.path.join(dirtmp, 'face_dd_ld_%03d.png' % random_key), dpi=90, bbox_inches='tight')


def parse_arguments(argv):
    parser = argparse.ArgumentParser()

    parser.add_argument('--img_ref', type=str,
                        help='Directory containing the reference image(s) for the face verification.',
                        default='../data/images')

    ## face detection and verification arguments
    parser.add_argument('--align_model_dir', type=str,
                        help='Directory containing the models for the face detection.', default='../../model')
    parser.add_argument('--model_dir', type=str,
                        help='Directory containing the metagraph (.meta) file and the checkpoint (ckpt) file containing model parameters.',
                        default='../../model/20180115-025629_model/best_model')
    parser.add_argument('--threshold', type=float,
                        help='The threshold for the face verification.', default=0.9)
    parser.add_argument('--face_detect_threshold', type=float,
                        help='The threshold for the face detection.', default=0.9)

    ## face alignment arguments
    parser.add_argument('--output_dir', type=str,
                        help='Directory with aligned face thumbnails.', default='./align/output')
    parser.add_argument('--image_size', type=int,
                        help='Image size (height, width) in pixels.', default=160)
    parser.add_argument('--margin', type=int,
                        help='Margin for the crop around the bounding box (height, width) in pixels.', default=32)
    parser.add_argument('--random_order',
                        help='Shuffles the order of images to enable alignment using multiple processes.',
                        action='store_true')

    return parser.parse_args(argv)


if __name__ == '__main__':
    verification_test(parse_arguments(sys.argv[1:]))
--------------------------------------------------------------------------------
/src/train_BP.py:
--------------------------------------------------------------------------------
# pylint: disable=missing-docstring
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf


def _add_loss_summaries(total_loss):
    """Add summaries for losses.

    Generates a moving average for all losses and associated summaries for
    visualizing the performance of the network.

    Args:
        total_loss: Total loss from loss().
    Returns:
        loss_averages_op: op for generating moving averages of losses.
    """
    # Compute the moving average of all individual losses and the total loss.
    loss_averages = tf.train.ExponentialMovingAverage(0.9, name='avg')
    losses = tf.get_collection('losses')
    loss_averages_op = loss_averages.apply(losses + [total_loss])

    # Attach a scalar summary to all individual losses and the total loss; do
    # the same for the averaged version of the losses.
    for l in losses + [total_loss]:
        # Name each loss as '(raw)' and name the moving average version of the
        # loss as the original loss name.
        tf.summary.scalar(l.op.name + ' (raw)', l)
        tf.summary.scalar(l.op.name, loss_averages.average(l))

    return loss_averages_op


def train(total_loss, global_step, optimizer, learning_rate, moving_average_decay,
          update_gradient_vars, summary, log_histograms=True):
    # Generate moving averages of all losses and associated summaries.
    loss_averages_op = _add_loss_summaries(total_loss)

    print('######## length of update_gradient_vars: %d\n' % len(update_gradient_vars))

    # Compute gradients.
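    # The control dependency ensures the loss moving-average update runs
    # before the gradients for this step are computed.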
    with tf.control_dependencies([loss_averages_op]):
        if optimizer == 'Adagrad':
            opt = tf.train.AdagradOptimizer(learning_rate)
        elif optimizer == 'Adadelta':
            opt = tf.train.AdadeltaOptimizer(learning_rate, rho=0.9, epsilon=1e-6)
        elif optimizer == 'Adam':
            opt = tf.train.AdamOptimizer(learning_rate, beta1=0.9, beta2=0.999, epsilon=0.1)
        elif optimizer == 'RMSProp':
            opt = tf.train.RMSPropOptimizer(learning_rate, decay=0.9, momentum=0.9, epsilon=1.0)
        elif optimizer == 'Momentum':
            opt = tf.train.MomentumOptimizer(learning_rate, 0.9, use_nesterov=True)
        elif optimizer == 'SGD':
            opt = tf.train.GradientDescentOptimizer(learning_rate)
        else:
            raise ValueError('Invalid optimization algorithm')

        gvs = opt.compute_gradients(total_loss, update_gradient_vars)

    ### gradient clipping to handle exploding gradients
    gradslist, varslist = zip(*gvs)
    grads_clip, _ = tf.clip_by_global_norm(gradslist, 5.0)

    # Apply the clipped gradients.
    apply_gradient_op = opt.apply_gradients(zip(grads_clip, varslist), global_step=global_step)

    # Add histograms for trainable variables.
    if log_histograms:
        for var in update_gradient_vars:
            tf.summary.histogram(var.op.name, var)

    # Add histograms for the (unclipped) gradients.
    if log_histograms:
        for grad, var in gvs:
            if grad is not None:
                tf.summary.histogram(var.op.name + '/gradients', grad)

    # Track the moving averages of all trainable variables.
    variable_averages = tf.train.ExponentialMovingAverage(moving_average_decay, global_step)
    variables_averages_op = variable_averages.apply(update_gradient_vars)

    with tf.control_dependencies([apply_gradient_op, variables_averages_op]):
        train_op = tf.no_op(name='train')

    return train_op, gvs, grads_clip
--------------------------------------------------------------------------------
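For reference, below is a minimal sketch of how the `train()` helper in `src/train_BP.py` could be wired into a training step. The placeholder input, the toy loss, and the hyper-parameter values are illustrative assumptions, not part of the repository:

```python
import tensorflow as tf
import train_BP  # assumes the script is run from the src/ directory

# Toy model: a single linear layer with a squared-error style loss.
x = tf.placeholder(tf.float32, [None, 2], name='x')
w = tf.Variable(tf.ones([2, 1]), name='w')
loss = tf.reduce_mean(tf.square(tf.matmul(x, w)))
tf.add_to_collection('losses', loss)  # _add_loss_summaries() reads this collection

global_step = tf.Variable(0, trainable=False, name='global_step')

# 'Adam' selects tf.train.AdamOptimizer inside train(); the gradients are
# clipped to a global norm of 5.0 before being applied.
train_op, gvs, grads_clip = train_BP.train(
    total_loss=loss, global_step=global_step, optimizer='Adam',
    learning_rate=0.001, moving_average_decay=0.9999,
    update_gradient_vars=tf.trainable_variables(), summary=None)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(train_op, feed_dict={x: [[1.0, 2.0]]})  # one optimization step
```

Note that `train()` returns the pre-clip gradient/variable pairs (`gvs`) alongside the clipped gradients (`grads_clip`), so a caller can inspect both when diagnosing exploding gradients.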