├── .gitignore ├── LICENSE ├── README.md ├── itracker.py ├── itracker_adv.py ├── itracker_adv_arch.png ├── itracker_arch.png ├── pretrained_models └── itracker_adv │ ├── checkpoint │ ├── model-23.data-00000-of-00001 │ ├── model-23.index │ └── model-23.meta └── validation_script.py /.gitignore: -------------------------------------------------------------------------------- 1 | ckpt/ 2 | *.pyc 3 | *.npz 4 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | BSD 3-Clause License 2 | 3 | Copyright (c) 2017, Hugo Chan 4 | All rights reserved. 5 | 6 | Redistribution and use in source and binary forms, with or without 7 | modification, are permitted provided that the following conditions are met: 8 | 9 | * Redistributions of source code must retain the above copyright notice, this 10 | list of conditions and the following disclaimer. 11 | 12 | * Redistributions in binary form must reproduce the above copyright notice, 13 | this list of conditions and the following disclaimer in the documentation 14 | and/or other materials provided with the distribution. 15 | 16 | * Neither the name of the copyright holder nor the names of its 17 | contributors may be used to endorse or promote products derived from 18 | this software without specific prior written permission. 19 | 20 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 21 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 22 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 23 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 24 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 25 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 26 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 27 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 28 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 29 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 30 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Eye Tracker 2 | Implemented and improved the iTracker model proposed in the paper [Eye Tracking for Everyone](https://arxiv.org/abs/1606.05814). 3 | 4 | ![](itracker_arch.png) 5 | *
Figure 1: iTracker architecture* 6 | 7 | ![](itracker_adv_arch.png) 8 | *
Figure 2: modified iTracker architecture* 9 | 10 | Figures 1 and 2 show the architectures of the iTracker model 11 | and the modified model. The only difference between the two is 12 | that we first concatenate the face layer FC-F1 and the face mask layer FC-FG1 and apply a fully connected layer FC-F2; 13 | we then concatenate the eye layer FC-E1 with the FC-F2 layer. 14 | We claim that this modified architecture is superior to the iTracker architecture. 15 | Intuitively, concatenating the face mask information directly with the eye information 16 | may confuse the model, since the face mask information is irrelevant to the eye information. 17 | Even though the iTracker model managed to learn this fact from the data, 18 | the modified model outperforms it by encoding this knowledge explicitly. 19 | In our experiments, the modified model converged faster (28 epochs vs. 40+ epochs) and achieved a lower validation 20 | error (2.19 cm vs. 2.514 cm). 21 | The iTracker model is implemented in itracker.py and the modified model 22 | in itracker_adv.py. 23 | Note that a smaller dataset (i.e., a subset of the full dataset from the original paper) was used in these experiments and no data augmentation was applied. 24 | This smaller dataset contains 48,000 training samples and 5,000 validation samples. 25 | You can download it [here](http://hugochan.net/download/eye_tracker_train_and_val.npz); the snippet at the end of this README shows how to inspect it. 26 | 27 | # Get started 28 | To train the model, run 29 | `python itracker_adv.py --train -i input_data -sm saved_model` 30 | 31 | To test the trained model, run 32 | `python itracker_adv.py -i input_data -lm saved_model` 33 | 34 | A model pretrained on the smaller dataset is available under the pretrained_models/itracker_adv/ folder. 35 | 36 | # FAQ 37 | 1) What are the datasets? 38 | 39 | The original dataset comes from the [GazeCapture](http://gazecapture.csail.mit.edu/) project. It involves over 1,400 subjects and more than 2 million face images. Due to limited computation power, a much [smaller dataset](http://hugochan.net/download/eye_tracker_train_and_val.npz) with 48,000 training samples and 5,000 validation samples was used here. Each sample contains 5 items: face, left eye, right eye, face mask, and labels. 40 | 41 | # Other implementations 42 | For a PyTorch implementation, see [GazeCapture](https://github.com/CSAILVision/GazeCapture).
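
# Inspecting the dataset
If you want to sanity-check the downloaded `.npz` archive before training, it can be read directly with NumPy. This is a minimal sketch, not one of the repo scripts; the key names match those read by `load_data()` in itracker.py and itracker_adv.py, and the shapes follow from the network parameters (64x64x3 crops, a 25x25 face mask, 2-D gaze labels):

```python
import numpy as np

# Path to the archive downloaded from the link above.
data = np.load("eye_tracker_train_and_val.npz")

for key in ["train_eye_left", "train_eye_right", "train_face",
            "train_face_mask", "train_y"]:
    print(key, data[key].shape)

# Expected shapes (N = 48,000; the val_* arrays are analogous with N = 5,000):
#   train_eye_left / train_eye_right / train_face: (N, 64, 64, 3) image crops
#   train_face_mask: (N, 25, 25) binary grid marking the face position
#   train_y: (N, 2) gaze position relative to the camera, in centimeters
```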
43 | -------------------------------------------------------------------------------- /itracker.py: -------------------------------------------------------------------------------- 1 | import os 2 | import argparse 3 | import timeit 4 | import numpy as np 5 | import tensorflow as tf 6 | import matplotlib.pyplot as plt 7 | 8 | 9 | # Network Parameters 10 | img_size = 64 11 | n_channel = 3 12 | mask_size = 25 13 | 14 | # pathway: eye_left and eye_right 15 | conv1_eye_size = 11 16 | conv1_eye_out = 96 17 | pool1_eye_size = 2 18 | pool1_eye_stride = 2 19 | 20 | conv2_eye_size = 5 21 | conv2_eye_out = 256 22 | pool2_eye_size = 2 23 | pool2_eye_stride = 2 24 | 25 | conv3_eye_size = 3 26 | conv3_eye_out = 384 27 | pool3_eye_size = 2 28 | pool3_eye_stride = 2 29 | 30 | conv4_eye_size = 1 31 | conv4_eye_out = 64 32 | pool4_eye_size = 2 33 | pool4_eye_stride = 2 34 | 35 | eye_size = 2 * 2 * 2 * conv4_eye_out 36 | 37 | # pathway: face 38 | conv1_face_size = 11 39 | conv1_face_out = 96 40 | pool1_face_size = 2 41 | pool1_face_stride = 2 42 | 43 | conv2_face_size = 5 44 | conv2_face_out = 256 45 | pool2_face_size = 2 46 | pool2_face_stride = 2 47 | 48 | conv3_face_size = 3 49 | conv3_face_out = 384 50 | pool3_face_size = 2 51 | pool3_face_stride = 2 52 | 53 | conv4_face_size = 1 54 | conv4_face_out = 64 55 | pool4_face_size = 2 56 | pool4_face_stride = 2 57 | 58 | face_size = 2 * 2 * conv4_face_out 59 | 60 | # fc layer 61 | fc_eye_size = 128 62 | fc_face_size = 128 63 | fc2_face_size = 64 64 | fc_face_mask_size = 256 65 | fc2_face_mask_size = 128 66 | fc_size = 128 67 | fc2_size = 2 68 | 69 | 70 | # Import data 71 | def load_data(file): 72 | npzfile = np.load(file) 73 | train_eye_left = npzfile["train_eye_left"] 74 | train_eye_right = npzfile["train_eye_right"] 75 | train_face = npzfile["train_face"] 76 | train_face_mask = npzfile["train_face_mask"] 77 | train_y = npzfile["train_y"] 78 | val_eye_left = npzfile["val_eye_left"] 79 | val_eye_right = npzfile["val_eye_right"] 80 | val_face = npzfile["val_face"] 81 | val_face_mask = npzfile["val_face_mask"] 82 | val_y = npzfile["val_y"] 83 | return [train_eye_left, train_eye_right, train_face, train_face_mask, train_y], [val_eye_left, val_eye_right, val_face, val_face_mask, val_y] 84 | 85 | def normalize(data): 86 | shape = data.shape 87 | data = np.reshape(data, (shape[0], -1)) 88 | data = data.astype('float32') / 255. 
# scaling 89 | data = data - np.mean(data, axis=0) # normalizing 90 | return np.reshape(data, shape) 91 | 92 | def prepare_data(data): 93 | eye_left, eye_right, face, face_mask, y = data 94 | eye_left = normalize(eye_left) 95 | eye_right = normalize(eye_right) 96 | face = normalize(face) 97 | face_mask = np.reshape(face_mask, (face_mask.shape[0], -1)).astype('float32') 98 | y = y.astype('float32') 99 | return [eye_left, eye_right, face, face_mask, y] 100 | 101 | def shuffle_data(data): 102 | idx = np.arange(data[0].shape[0]) 103 | np.random.shuffle(idx) 104 | for i in range(len(data)): 105 | data[i] = data[i][idx] 106 | return data 107 | 108 | def next_batch(data, batch_size): 109 | for i in np.arange(0, data[0].shape[0], batch_size): 110 | # yield a tuple of the current batched data 111 | yield [each[i: i + batch_size] for each in data] 112 | 113 | class EyeTracker(object): 114 | def __init__(self): 115 | # tf Graph input 116 | self.eye_left = tf.placeholder(tf.float32, [None, img_size, img_size, n_channel], name='eye_left') 117 | self.eye_right = tf.placeholder(tf.float32, [None, img_size, img_size, n_channel], name='eye_right') 118 | self.face = tf.placeholder(tf.float32, [None, img_size, img_size, n_channel], name='face') 119 | self.face_mask = tf.placeholder(tf.float32, [None, mask_size * mask_size], name='face_mask') 120 | self.y = tf.placeholder(tf.float32, [None, 2], name='pos') 121 | # Store layers weight & bias 122 | self.weights = { 123 | 'conv1_eye': tf.get_variable('conv1_eye_w', shape=(conv1_eye_size, conv1_eye_size, n_channel, conv1_eye_out), initializer=tf.contrib.layers.xavier_initializer()), 124 | 'conv2_eye': tf.get_variable('conv2_eye_w', shape=(conv2_eye_size, conv2_eye_size, conv1_eye_out, conv2_eye_out), initializer=tf.contrib.layers.xavier_initializer()), 125 | 'conv3_eye': tf.get_variable('conv3_eye_w', shape=(conv3_eye_size, conv3_eye_size, conv2_eye_out, conv3_eye_out), initializer=tf.contrib.layers.xavier_initializer()), 126 | 'conv4_eye': tf.get_variable('conv4_eye_w', shape=(conv4_eye_size, conv4_eye_size, conv3_eye_out, conv4_eye_out), initializer=tf.contrib.layers.xavier_initializer()), 127 | 'conv1_face': tf.get_variable('conv1_face_w', shape=(conv1_face_size, conv1_face_size, n_channel, conv1_face_out), initializer=tf.contrib.layers.xavier_initializer()), 128 | 'conv2_face': tf.get_variable('conv2_face_w', shape=(conv2_face_size, conv2_face_size, conv1_face_out, conv2_face_out), initializer=tf.contrib.layers.xavier_initializer()), 129 | 'conv3_face': tf.get_variable('conv3_face_w', shape=(conv3_face_size, conv3_face_size, conv2_face_out, conv3_face_out), initializer=tf.contrib.layers.xavier_initializer()), 130 | 'conv4_face': tf.get_variable('conv4_face_w', shape=(conv4_face_size, conv4_face_size, conv3_face_out, conv4_face_out), initializer=tf.contrib.layers.xavier_initializer()), 131 | 'fc_eye': tf.get_variable('fc_eye_w', shape=(eye_size, fc_eye_size), initializer=tf.contrib.layers.xavier_initializer()), 132 | 'fc_face': tf.get_variable('fc_face_w', shape=(face_size, fc_face_size), initializer=tf.contrib.layers.xavier_initializer()), 133 | 'fc2_face': tf.get_variable('fc2_face_w', shape=(fc_face_size, fc2_face_size), initializer=tf.contrib.layers.xavier_initializer()), 134 | 'fc_face_mask': tf.get_variable('fc_face_mask_w', shape=(mask_size * mask_size, fc_face_mask_size), initializer=tf.contrib.layers.xavier_initializer()), 135 | 'fc2_face_mask': tf.get_variable('fc2_face_mask_w', shape=(fc_face_mask_size, fc2_face_mask_size), 
initializer=tf.contrib.layers.xavier_initializer()), 136 | 'fc': tf.get_variable('fc_w', shape=(fc_eye_size + fc2_face_size + fc2_face_mask_size, fc_size), initializer=tf.contrib.layers.xavier_initializer()), 137 | 'fc2': tf.get_variable('fc2_w', shape=(fc_size, fc2_size), initializer=tf.contrib.layers.xavier_initializer()) 138 | } 139 | self.biases = { 140 | 'conv1_eye': tf.Variable(tf.constant(0.1, shape=[conv1_eye_out])), 141 | 'conv2_eye': tf.Variable(tf.constant(0.1, shape=[conv2_eye_out])), 142 | 'conv3_eye': tf.Variable(tf.constant(0.1, shape=[conv3_eye_out])), 143 | 'conv4_eye': tf.Variable(tf.constant(0.1, shape=[conv4_eye_out])), 144 | 'conv1_face': tf.Variable(tf.constant(0.1, shape=[conv1_face_out])), 145 | 'conv2_face': tf.Variable(tf.constant(0.1, shape=[conv2_face_out])), 146 | 'conv3_face': tf.Variable(tf.constant(0.1, shape=[conv3_face_out])), 147 | 'conv4_face': tf.Variable(tf.constant(0.1, shape=[conv4_face_out])), 148 | 'fc_eye': tf.Variable(tf.constant(0.1, shape=[fc_eye_size])), 149 | 'fc_face': tf.Variable(tf.constant(0.1, shape=[fc_face_size])), 150 | 'fc2_face': tf.Variable(tf.constant(0.1, shape=[fc2_face_size])), 151 | 'fc_face_mask': tf.Variable(tf.constant(0.1, shape=[fc_face_mask_size])), 152 | 'fc2_face_mask': tf.Variable(tf.constant(0.1, shape=[fc2_face_mask_size])), 153 | 'fc': tf.Variable(tf.constant(0.1, shape=[fc_size])), 154 | 'fc2': tf.Variable(tf.constant(0.1, shape=[fc2_size])) 155 | } 156 | 157 | # Construct model 158 | self.pred = self.itracker_nets(self.eye_left, self.eye_right, self.face, self.face_mask, self.weights, self.biases) 159 | 160 | # Create some wrappers for simplicity 161 | def conv2d(self, x, W, b, strides=1): 162 | # Conv2D wrapper, with bias and relu activation 163 | x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='VALID') 164 | x = tf.nn.bias_add(x, b) 165 | return tf.nn.relu(x) 166 | 167 | def maxpool2d(self, x, k, strides): 168 | # MaxPool2D wrapper 169 | return tf.nn.max_pool(x, ksize=[1, k, k, 1], strides=[1, strides, strides, 1], 170 | padding='VALID') 171 | 172 | # Create model 173 | def itracker_nets(self, eye_left, eye_right, face, face_mask, weights, biases): 174 | # pathway: left eye 175 | eye_left = self.conv2d(eye_left, weights['conv1_eye'], biases['conv1_eye'], strides=1) 176 | eye_left = self.maxpool2d(eye_left, k=pool1_eye_size, strides=pool1_eye_stride) 177 | 178 | eye_left = self.conv2d(eye_left, weights['conv2_eye'], biases['conv2_eye'], strides=1) 179 | eye_left = self.maxpool2d(eye_left, k=pool2_eye_size, strides=pool2_eye_stride) 180 | 181 | eye_left = self.conv2d(eye_left, weights['conv3_eye'], biases['conv3_eye'], strides=1) 182 | eye_left = self.maxpool2d(eye_left, k=pool3_eye_size, strides=pool3_eye_stride) 183 | 184 | eye_left = self.conv2d(eye_left, weights['conv4_eye'], biases['conv4_eye'], strides=1) 185 | eye_left = self.maxpool2d(eye_left, k=pool4_eye_size, strides=pool4_eye_stride) 186 | 187 | # pathway: right eye 188 | eye_right = self.conv2d(eye_right, weights['conv1_eye'], biases['conv1_eye'], strides=1) 189 | eye_right = self.maxpool2d(eye_right, k=pool1_eye_size, strides=pool1_eye_stride) 190 | 191 | eye_right = self.conv2d(eye_right, weights['conv2_eye'], biases['conv2_eye'], strides=1) 192 | eye_right = self.maxpool2d(eye_right, k=pool2_eye_size, strides=pool2_eye_stride) 193 | 194 | eye_right = self.conv2d(eye_right, weights['conv3_eye'], biases['conv3_eye'], strides=1) 195 | eye_right = self.maxpool2d(eye_right, k=pool3_eye_size, strides=pool3_eye_stride) 196 | 197 | 
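# Output-size bookkeeping (VALID padding throughout): each 64x64 eye crop goes
# 64 -> conv 11x11 -> 54 -> pool/2 -> 27 -> conv 5x5 -> 23 -> pool/2 -> 11
# -> conv 3x3 -> 9 -> pool/2 -> 4 -> conv 1x1 -> 4 -> pool/2 -> 2,
# i.e. 2 x 2 x conv4_eye_out = 256 features per eye, so the two eyes together
# give eye_size = 2 * 2 * 2 * conv4_eye_out = 512 for the fc_eye layer below.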
eye_right = self.conv2d(eye_right, weights['conv4_eye'], biases['conv4_eye'], strides=1) 198 | eye_right = self.maxpool2d(eye_right, k=pool4_eye_size, strides=pool4_eye_stride) 199 | 200 | # pathway: face 201 | face = self.conv2d(face, weights['conv1_face'], biases['conv1_face'], strides=1) 202 | face = self.maxpool2d(face, k=pool1_face_size, strides=pool1_face_stride) 203 | 204 | face = self.conv2d(face, weights['conv2_face'], biases['conv2_face'], strides=1) 205 | face = self.maxpool2d(face, k=pool2_face_size, strides=pool2_face_stride) 206 | 207 | face = self.conv2d(face, weights['conv3_face'], biases['conv3_face'], strides=1) 208 | face = self.maxpool2d(face, k=pool3_face_size, strides=pool3_face_stride) 209 | 210 | face = self.conv2d(face, weights['conv4_face'], biases['conv4_face'], strides=1) 211 | face = self.maxpool2d(face, k=pool4_face_size, strides=pool4_face_stride) 212 | 213 | # fc layer 214 | # eye 215 | eye_left = tf.reshape(eye_left, [-1, int(np.prod(eye_left.get_shape()[1:]))]) 216 | eye_right = tf.reshape(eye_right, [-1, int(np.prod(eye_right.get_shape()[1:]))]) 217 | eye = tf.concat([eye_left, eye_right], 1) 218 | eye = tf.nn.relu(tf.add(tf.matmul(eye, weights['fc_eye']), biases['fc_eye'])) 219 | 220 | # face 221 | face = tf.reshape(face, [-1, int(np.prod(face.get_shape()[1:]))]) 222 | face = tf.nn.relu(tf.add(tf.matmul(face, weights['fc_face']), biases['fc_face'])) 223 | face = tf.nn.relu(tf.add(tf.matmul(face, weights['fc2_face']), biases['fc2_face'])) 224 | 225 | # face mask 226 | face_mask = tf.nn.relu(tf.add(tf.matmul(face_mask, weights['fc_face_mask']), biases['fc_face_mask'])) 227 | face_mask = tf.nn.relu(tf.add(tf.matmul(face_mask, weights['fc2_face_mask']), biases['fc2_face_mask'])) 228 | 229 | # all 230 | fc = tf.concat([eye, face, face_mask], 1) 231 | fc = tf.nn.relu(tf.add(tf.matmul(fc, weights['fc']), biases['fc'])) 232 | out = tf.add(tf.matmul(fc, weights['fc2']), biases['fc2']) 233 | return out 234 | 235 | def train(self, train_data, val_data, lr=1e-3, batch_size=128, max_epoch=1000, min_delta=1e-4, patience=10, print_per_epoch=10, out_model='my_model'): 236 | ckpt = os.path.split(out_model)[0] 237 | if not os.path.exists(ckpt): 238 | os.makedirs(ckpt) 239 | 240 | print 'Train on %s samples, validate on %s samples' % (train_data[0].shape[0], val_data[0].shape[0]) 241 | # Define loss and optimizer 242 | self.cost = tf.losses.mean_squared_error(self.y, self.pred) 243 | self.optimizer = tf.train.AdamOptimizer(learning_rate=lr).minimize(self.cost) 244 | 245 | # Evaluate model 246 | self.err = tf.reduce_mean(tf.sqrt(tf.reduce_sum(tf.squared_difference(self.pred, self.y), axis=1))) 247 | train_loss_history = [] 248 | train_err_history = [] 249 | val_loss_history = [] 250 | val_err_history = [] 251 | n_incr_error = 0 # nb. of consecutive increase in error 252 | best_loss = np.Inf 253 | n_batches = train_data[0].shape[0] / batch_size + (train_data[0].shape[0] % batch_size != 0) 254 | 255 | # Create the collection 256 | tf.get_collection("validation_nodes") 257 | # Add stuff to the collection. 
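# These five nodes (four input placeholders plus the prediction op) are looked
# up again by extract_validation_handles() and by validation_script.py after
# the saved graph is re-imported, so the order they are added here must match
# the order they are unpacked there.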
258 | tf.add_to_collection("validation_nodes", self.eye_left) 259 | tf.add_to_collection("validation_nodes", self.eye_right) 260 | tf.add_to_collection("validation_nodes", self.face) 261 | tf.add_to_collection("validation_nodes", self.face_mask) 262 | tf.add_to_collection("validation_nodes", self.pred) 263 | saver = tf.train.Saver(max_to_keep=1) 264 | 265 | # Initializing the variables 266 | init = tf.global_variables_initializer() 267 | # Launch the graph 268 | with tf.Session() as sess: 269 | sess.run(init) 270 | # Keep training until reach max iterations 271 | for n_epoch in range(1, max_epoch + 1): 272 | n_incr_error += 1 273 | train_loss = 0. 274 | val_loss = 0. 275 | train_err = 0. 276 | val_err = 0. 277 | train_data = shuffle_data(train_data) 278 | for batch_train_data in next_batch(train_data, batch_size): 279 | # Run optimization op (backprop) 280 | sess.run(self.optimizer, feed_dict={self.eye_left: batch_train_data[0], \ 281 | self.eye_right: batch_train_data[1], self.face: batch_train_data[2], \ 282 | self.face_mask: batch_train_data[3], self.y: batch_train_data[4]}) 283 | train_batch_loss, train_batch_err = sess.run([self.cost, self.err], feed_dict={self.eye_left: batch_train_data[0], \ 284 | self.eye_right: batch_train_data[1], self.face: batch_train_data[2], \ 285 | self.face_mask: batch_train_data[3], self.y: batch_train_data[4]}) 286 | train_loss += train_batch_loss / n_batches 287 | train_err += train_batch_err / n_batches 288 | val_loss, val_err = sess.run([self.cost, self.err], feed_dict={self.eye_left: val_data[0], \ 289 | self.eye_right: val_data[1], self.face: val_data[2], \ 290 | self.face_mask: val_data[3], self.y: val_data[4]}) 291 | 292 | train_loss_history.append(train_loss) 293 | train_err_history.append(train_err) 294 | val_loss_history.append(val_loss) 295 | val_err_history.append(val_err) 296 | if val_loss - min_delta < best_loss: 297 | best_loss = val_loss 298 | save_path = saver.save(sess, out_model, global_step=n_epoch) 299 | print "Model saved in file: %s" % save_path 300 | n_incr_error = 0 301 | 302 | if n_epoch % print_per_epoch == 0: 303 | print 'Epoch %s/%s, train loss: %.5f, train error: %.5f, val loss: %.5f, val error: %.5f' % \ 304 | (n_epoch, max_epoch, train_loss, train_err, val_loss, val_err) 305 | 306 | if n_incr_error >= patience: 307 | print 'Early stopping occured. Optimization Finished!' 308 | return train_loss_history, train_err_history, val_loss_history, val_err_history 309 | 310 | return train_loss_history, train_err_history, val_loss_history, val_err_history 311 | 312 | def extract_validation_handles(session): 313 | """ Extracts the input and predict_op handles that we use for validation. 314 | Args: 315 | session: The session with the loaded graph. 316 | Returns: 317 | validation handles. 318 | """ 319 | valid_nodes = tf.get_collection_ref("validation_nodes") 320 | if len(valid_nodes) != 5: 321 | raise Exception("ERROR: Expected 5 items in validation_nodes, got %d." % len(valid_nodes)) 322 | return valid_nodes 323 | 324 | def load_model(session, save_path): 325 | """ Loads a saved TF model from a file. 326 | Args: 327 | session: The tf.Session to use. 328 | save_path: The save path for the saved session, returned by Saver.save(). 329 | Returns: 330 | The inputs placehoder and the prediction operation. 331 | """ 332 | print "Loading model from file '%s'..." % save_path 333 | 334 | meta_file = save_path + ".meta" 335 | if not os.path.exists(meta_file): 336 | raise Exception("ERROR: Expected .meta file '%s', but could not find it." 
% meta_file) 337 | 338 | saver = tf.train.import_meta_graph(meta_file) 339 | # It's finicky about the save path. 340 | save_path = os.path.join("./", save_path) 341 | saver.restore(session, save_path) 342 | 343 | # Check that we have the handles we expected. 344 | return extract_validation_handles(session) 345 | 346 | def validate_model(session, val_data, val_ops): 347 | """ Validates the model stored in a session. 348 | Args: 349 | session: The session where the model is loaded. 350 | val_data: The validation data to use for evaluating the model. 351 | val_ops: The validation operations. 352 | Returns: 353 | The overall validation error for the model. """ 354 | print "Validating model..." 355 | 356 | eye_left, eye_right, face, face_mask, pred = val_ops 357 | val_eye_left, val_eye_right, val_face, val_face_mask, val_y = val_data 358 | y = tf.placeholder(tf.float32, [None, 2], name='pos') 359 | err = tf.reduce_mean(tf.sqrt(tf.reduce_sum(tf.squared_difference(pred, y), axis=1))) 360 | # Validate the model. 361 | error = session.run(err, feed_dict={eye_left: val_eye_left, \ 362 | eye_right: val_eye_right, face: val_face, \ 363 | face_mask: val_face_mask, y: val_y}) 364 | return error 365 | 366 | def plot_loss(train_loss, train_err, test_err, start=0, per=1, save_file='loss.png'): 367 | assert len(train_err) == len(test_err) 368 | idx = np.arange(start, len(train_loss), per) 369 | fig, ax1 = plt.subplots() 370 | lns1 = ax1.plot(idx, train_loss[idx], 'b-', alpha=1.0, label='train loss') 371 | ax1.set_xlabel('epochs') 372 | # Make the y-axis label, ticks and tick labels match the line color. 373 | ax1.set_ylabel('loss', color='b') 374 | ax1.tick_params('y', colors='b') 375 | 376 | ax2 = ax1.twinx() 377 | lns2 = ax2.plot(idx, train_err[idx], 'r-', alpha=1.0, label='train error') 378 | lns3 = ax2.plot(idx, test_err[idx], 'g-', alpha=1.0, label='test error') 379 | ax2.set_ylabel('error', color='r') 380 | ax2.tick_params('y', colors='r') 381 | 382 | # added these three lines 383 | lns = lns1 + lns2 + lns3 384 | labs = [l.get_label() for l in lns] 385 | ax1.legend(lns, labs, loc=0) 386 | 387 | fig.tight_layout() 388 | plt.savefig(save_file) 389 | # plt.show() 390 | 391 | def train(args): 392 | train_data, val_data = load_data(args.input) 393 | 394 | # train_size = 10 395 | # train_data = [each[:train_size] for each in train_data] 396 | # val_size = 1 397 | # val_data = [each[:val_size] for each in val_data] 398 | train_data = prepare_data(train_data) 399 | val_data = prepare_data(val_data) 400 | 401 | start = timeit.default_timer() 402 | et = EyeTracker() 403 | train_loss_history, train_err_history, val_loss_history, val_err_history = et.train(train_data, val_data, \ 404 | lr=args.learning_rate, \ 405 | batch_size=args.batch_size, \ 406 | max_epoch=args.max_epoch, \ 407 | min_delta=1e-4, \ 408 | patience=args.patience, \ 409 | print_per_epoch=args.print_per_epoch, 410 | out_model=args.save_model) 411 | 412 | print 'runtime: %.1fs' % (timeit.default_timer() - start) 413 | 414 | if args.save_loss: 415 | with open(args.save_loss, 'w') as outfile: 416 | np.savez(outfile, train_loss_history=train_loss_history, train_err_history=train_err_history, \ 417 | val_loss_history=val_loss_history, val_err_history=val_err_history) 418 | 419 | if args.plot_loss: 420 | plot_loss(np.array(train_loss_history), np.array(train_err_history), np.array(val_err_history), start=0, per=1, save_file=args.plot_loss) 421 | 422 | def test(args): 423 | _, val_data = load_data(args.input) 424 | 425 | # val_size = 10 426 | # val_data 
= [each[:val_size] for each in val_data] 427 | 428 | val_data = prepare_data(val_data) 429 | 430 | # Load and validate the network. 431 | with tf.Session() as sess: 432 | val_ops = load_model(sess, args.load_model) 433 | error = validate_model(sess, val_data, val_ops) 434 | print 'Overall validation error: %f' % error 435 | 436 | def main(): 437 | parser = argparse.ArgumentParser() 438 | parser.add_argument('--train', action='store_true', help='train flag') 439 | parser.add_argument('-i', '--input', required=True, type=str, help='path to the input data') 440 | parser.add_argument('-max_epoch', '--max_epoch', type=int, default=100, help='max number of epochs (default 100)') 441 | parser.add_argument('-lr', '--learning_rate', type=float, default=0.002, help='learning rate (default 0.002)') 442 | parser.add_argument('-bs', '--batch_size', type=int, default=128, help='batch size (default 128)') 443 | parser.add_argument('-p', '--patience', type=int, default=5, help='early stopping patience (default 5)') 444 | parser.add_argument('-pp_iter', '--print_per_epoch', type=int, default=1, help='print every n epochs (default 1)') 445 | parser.add_argument('-sm', '--save_model', type=str, default='my_model', help='path to the output model (default my_model)') 446 | parser.add_argument('-lm', '--load_model', type=str, help='path to the loaded model') 447 | parser.add_argument('-pf', '--plot_filter', type=str, default='filter.png', help='plot filters') 448 | parser.add_argument('-pl', '--plot_loss', type=str, default='loss.png', help='plot loss') 449 | parser.add_argument('-sl', '--save_loss', type=str, default='loss.npz', help='save loss') 450 | args = parser.parse_args() 451 | 452 | if args.train: 453 | train(args) 454 | else: 455 | if not args.load_model: 456 | raise Exception('load_model arg needed in test phase') 457 | test(args) 458 | 459 | if __name__ == '__main__': 460 | main() 461 | -------------------------------------------------------------------------------- /itracker_adv.py: -------------------------------------------------------------------------------- 1 | import os 2 | import argparse 3 | import timeit 4 | import numpy as np 5 | import tensorflow as tf 6 | import matplotlib.pyplot as plt 7 | 8 | 9 | # Network Parameters 10 | img_size = 64 11 | n_channel = 3 12 | mask_size = 25 13 | 14 | # pathway: eye_left and eye_right 15 | conv1_eye_size = 11 16 | conv1_eye_out = 96 17 | pool1_eye_size = 2 18 | pool1_eye_stride = 2 19 | 20 | conv2_eye_size = 5 21 | conv2_eye_out = 256 22 | pool2_eye_size = 2 23 | pool2_eye_stride = 2 24 | 25 | conv3_eye_size = 3 26 | conv3_eye_out = 384 27 | pool3_eye_size = 2 28 | pool3_eye_stride = 2 29 | 30 | conv4_eye_size = 1 31 | conv4_eye_out = 64 32 | pool4_eye_size = 2 33 | pool4_eye_stride = 2 34 | 35 | eye_size = 2 * 2 * 2 * conv4_eye_out 36 | 37 | # pathway: face 38 | conv1_face_size = 11 39 | conv1_face_out = 96 40 | pool1_face_size = 2 41 | pool1_face_stride = 2 42 | 43 | conv2_face_size = 5 44 | conv2_face_out = 256 45 | pool2_face_size = 2 46 | pool2_face_stride = 2 47 | 48 | conv3_face_size = 3 49 | conv3_face_out = 384 50 | pool3_face_size = 2 51 | pool3_face_stride = 2 52 | 53 | conv4_face_size = 1 54 | conv4_face_out = 64 55 | pool4_face_size = 2 56 | pool4_face_stride = 2 57 | 58 | face_size = 2 * 2 * conv4_face_out 59 | 60 | # fc layer 61 | fc_eye_size = 128 62 | fc_face_size = 128 63 | fc_face_mask_size = 256 64 | face_face_mask_size = 128 65 | fc_size = 128 66 | fc2_size = 2 67 | 68 | 69 | # Import data 70 | def load_data(file): 71 | npzfile 
= np.load(file) 72 | train_eye_left = npzfile["train_eye_left"] 73 | train_eye_right = npzfile["train_eye_right"] 74 | train_face = npzfile["train_face"] 75 | train_face_mask = npzfile["train_face_mask"] 76 | train_y = npzfile["train_y"] 77 | val_eye_left = npzfile["val_eye_left"] 78 | val_eye_right = npzfile["val_eye_right"] 79 | val_face = npzfile["val_face"] 80 | val_face_mask = npzfile["val_face_mask"] 81 | val_y = npzfile["val_y"] 82 | return [train_eye_left, train_eye_right, train_face, train_face_mask, train_y], [val_eye_left, val_eye_right, val_face, val_face_mask, val_y] 83 | 84 | def normalize(data): 85 | shape = data.shape 86 | data = np.reshape(data, (shape[0], -1)) 87 | data = data.astype('float32') / 255. # scaling 88 | data = data - np.mean(data, axis=0) # normalizing 89 | return np.reshape(data, shape) 90 | 91 | def prepare_data(data): 92 | eye_left, eye_right, face, face_mask, y = data 93 | eye_left = normalize(eye_left) 94 | eye_right = normalize(eye_right) 95 | face = normalize(face) 96 | face_mask = np.reshape(face_mask, (face_mask.shape[0], -1)).astype('float32') 97 | y = y.astype('float32') 98 | return [eye_left, eye_right, face, face_mask, y] 99 | 100 | def shuffle_data(data): 101 | idx = np.arange(data[0].shape[0]) 102 | np.random.shuffle(idx) 103 | for i in range(len(data)): 104 | data[i] = data[i][idx] 105 | return data 106 | 107 | def next_batch(data, batch_size): 108 | for i in np.arange(0, data[0].shape[0], batch_size): 109 | # yield a tuple of the current batched data 110 | yield [each[i: i + batch_size] for each in data] 111 | 112 | class EyeTracker(object): 113 | def __init__(self): 114 | # tf Graph input 115 | self.eye_left = tf.placeholder(tf.float32, [None, img_size, img_size, n_channel], name='eye_left') 116 | self.eye_right = tf.placeholder(tf.float32, [None, img_size, img_size, n_channel], name='eye_right') 117 | self.face = tf.placeholder(tf.float32, [None, img_size, img_size, n_channel], name='face') 118 | self.face_mask = tf.placeholder(tf.float32, [None, mask_size * mask_size], name='face_mask') 119 | self.y = tf.placeholder(tf.float32, [None, 2], name='pos') 120 | # Store layers weight & bias 121 | self.weights = { 122 | 'conv1_eye': tf.get_variable('conv1_eye_w', shape=(conv1_eye_size, conv1_eye_size, n_channel, conv1_eye_out), initializer=tf.contrib.layers.xavier_initializer()), 123 | 'conv2_eye': tf.get_variable('conv2_eye_w', shape=(conv2_eye_size, conv2_eye_size, conv1_eye_out, conv2_eye_out), initializer=tf.contrib.layers.xavier_initializer()), 124 | 'conv3_eye': tf.get_variable('conv3_eye_w', shape=(conv3_eye_size, conv3_eye_size, conv2_eye_out, conv3_eye_out), initializer=tf.contrib.layers.xavier_initializer()), 125 | 'conv4_eye': tf.get_variable('conv4_eye_w', shape=(conv4_eye_size, conv4_eye_size, conv3_eye_out, conv4_eye_out), initializer=tf.contrib.layers.xavier_initializer()), 126 | 'conv1_face': tf.get_variable('conv1_face_w', shape=(conv1_face_size, conv1_face_size, n_channel, conv1_face_out), initializer=tf.contrib.layers.xavier_initializer()), 127 | 'conv2_face': tf.get_variable('conv2_face_w', shape=(conv2_face_size, conv2_face_size, conv1_face_out, conv2_face_out), initializer=tf.contrib.layers.xavier_initializer()), 128 | 'conv3_face': tf.get_variable('conv3_face_w', shape=(conv3_face_size, conv3_face_size, conv2_face_out, conv3_face_out), initializer=tf.contrib.layers.xavier_initializer()), 129 | 'conv4_face': tf.get_variable('conv4_face_w', shape=(conv4_face_size, conv4_face_size, conv3_face_out, conv4_face_out), 
initializer=tf.contrib.layers.xavier_initializer()), 130 | 'fc_eye': tf.get_variable('fc_eye_w', shape=(eye_size, fc_eye_size), initializer=tf.contrib.layers.xavier_initializer()), 131 | 'fc_face': tf.get_variable('fc_face_w', shape=(face_size, fc_face_size), initializer=tf.contrib.layers.xavier_initializer()), 132 | 'fc_face_mask': tf.get_variable('fc_face_mask_w', shape=(mask_size * mask_size, fc_face_mask_size), initializer=tf.contrib.layers.xavier_initializer()), 133 | 'face_face_mask': tf.get_variable('face_face_mask_w', shape=(fc_face_size + fc_face_mask_size, face_face_mask_size), initializer=tf.contrib.layers.xavier_initializer()), 134 | 'fc': tf.get_variable('fc_w', shape=(fc_eye_size + face_face_mask_size, fc_size), initializer=tf.contrib.layers.xavier_initializer()), 135 | 'fc2': tf.get_variable('fc2_w', shape=(fc_size, fc2_size), initializer=tf.contrib.layers.xavier_initializer()) 136 | } 137 | self.biases = { 138 | 'conv1_eye': tf.Variable(tf.constant(0.1, shape=[conv1_eye_out])), 139 | 'conv2_eye': tf.Variable(tf.constant(0.1, shape=[conv2_eye_out])), 140 | 'conv3_eye': tf.Variable(tf.constant(0.1, shape=[conv3_eye_out])), 141 | 'conv4_eye': tf.Variable(tf.constant(0.1, shape=[conv4_eye_out])), 142 | 'conv1_face': tf.Variable(tf.constant(0.1, shape=[conv1_face_out])), 143 | 'conv2_face': tf.Variable(tf.constant(0.1, shape=[conv2_face_out])), 144 | 'conv3_face': tf.Variable(tf.constant(0.1, shape=[conv3_face_out])), 145 | 'conv4_face': tf.Variable(tf.constant(0.1, shape=[conv4_face_out])), 146 | 'fc_eye': tf.Variable(tf.constant(0.1, shape=[fc_eye_size])), 147 | 'fc_face': tf.Variable(tf.constant(0.1, shape=[fc_face_size])), 148 | 'fc_face_mask': tf.Variable(tf.constant(0.1, shape=[fc_face_mask_size])), 149 | 'face_face_mask': tf.Variable(tf.constant(0.1, shape=[face_face_mask_size])), 150 | 'fc': tf.Variable(tf.constant(0.1, shape=[fc_size])), 151 | 'fc2': tf.Variable(tf.constant(0.1, shape=[fc2_size])) 152 | } 153 | 154 | # Construct model 155 | self.pred = self.itracker_nets(self.eye_left, self.eye_right, self.face, self.face_mask, self.weights, self.biases) 156 | 157 | # Create some wrappers for simplicity 158 | def conv2d(self, x, W, b, strides=1): 159 | # Conv2D wrapper, with bias and relu activation 160 | x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='VALID') 161 | x = tf.nn.bias_add(x, b) 162 | return tf.nn.relu(x) 163 | 164 | def maxpool2d(self, x, k, strides): 165 | # MaxPool2D wrapper 166 | return tf.nn.max_pool(x, ksize=[1, k, k, 1], strides=[1, strides, strides, 1], 167 | padding='VALID') 168 | 169 | # Create model 170 | def itracker_nets(self, eye_left, eye_right, face, face_mask, weights, biases): 171 | # pathway: left eye 172 | eye_left = self.conv2d(eye_left, weights['conv1_eye'], biases['conv1_eye'], strides=1) 173 | eye_left = self.maxpool2d(eye_left, k=pool1_eye_size, strides=pool1_eye_stride) 174 | 175 | eye_left = self.conv2d(eye_left, weights['conv2_eye'], biases['conv2_eye'], strides=1) 176 | eye_left = self.maxpool2d(eye_left, k=pool2_eye_size, strides=pool2_eye_stride) 177 | 178 | eye_left = self.conv2d(eye_left, weights['conv3_eye'], biases['conv3_eye'], strides=1) 179 | eye_left = self.maxpool2d(eye_left, k=pool3_eye_size, strides=pool3_eye_stride) 180 | 181 | eye_left = self.conv2d(eye_left, weights['conv4_eye'], biases['conv4_eye'], strides=1) 182 | eye_left = self.maxpool2d(eye_left, k=pool4_eye_size, strides=pool4_eye_stride) 183 | 184 | # pathway: right eye 185 | eye_right = self.conv2d(eye_right, weights['conv1_eye'], 
biases['conv1_eye'], strides=1) 186 | eye_right = self.maxpool2d(eye_right, k=pool1_eye_size, strides=pool1_eye_stride) 187 | 188 | eye_right = self.conv2d(eye_right, weights['conv2_eye'], biases['conv2_eye'], strides=1) 189 | eye_right = self.maxpool2d(eye_right, k=pool2_eye_size, strides=pool2_eye_stride) 190 | 191 | eye_right = self.conv2d(eye_right, weights['conv3_eye'], biases['conv3_eye'], strides=1) 192 | eye_right = self.maxpool2d(eye_right, k=pool3_eye_size, strides=pool3_eye_stride) 193 | 194 | eye_right = self.conv2d(eye_right, weights['conv4_eye'], biases['conv4_eye'], strides=1) 195 | eye_right = self.maxpool2d(eye_right, k=pool4_eye_size, strides=pool4_eye_stride) 196 | 197 | # pathway: face 198 | face = self.conv2d(face, weights['conv1_face'], biases['conv1_face'], strides=1) 199 | face = self.maxpool2d(face, k=pool1_face_size, strides=pool1_face_stride) 200 | 201 | face = self.conv2d(face, weights['conv2_face'], biases['conv2_face'], strides=1) 202 | face = self.maxpool2d(face, k=pool2_face_size, strides=pool2_face_stride) 203 | 204 | face = self.conv2d(face, weights['conv3_face'], biases['conv3_face'], strides=1) 205 | face = self.maxpool2d(face, k=pool3_face_size, strides=pool3_face_stride) 206 | 207 | face = self.conv2d(face, weights['conv4_face'], biases['conv4_face'], strides=1) 208 | face = self.maxpool2d(face, k=pool4_face_size, strides=pool4_face_stride) 209 | 210 | # fc layer 211 | # eye 212 | eye_left = tf.reshape(eye_left, [-1, int(np.prod(eye_left.get_shape()[1:]))]) 213 | eye_right = tf.reshape(eye_right, [-1, int(np.prod(eye_right.get_shape()[1:]))]) 214 | eye = tf.concat([eye_left, eye_right], 1) 215 | eye = tf.nn.relu(tf.add(tf.matmul(eye, weights['fc_eye']), biases['fc_eye'])) 216 | 217 | # face 218 | face = tf.reshape(face, [-1, int(np.prod(face.get_shape()[1:]))]) 219 | face = tf.nn.relu(tf.add(tf.matmul(face, weights['fc_face']), biases['fc_face'])) 220 | 221 | # face mask 222 | face_mask = tf.nn.relu(tf.add(tf.matmul(face_mask, weights['fc_face_mask']), biases['fc_face_mask'])) 223 | 224 | face_face_mask = tf.concat([face, face_mask], 1) 225 | face_face_mask = tf.nn.relu(tf.add(tf.matmul(face_face_mask, weights['face_face_mask']), biases['face_face_mask'])) 226 | 227 | # all 228 | fc = tf.concat([eye, face_face_mask], 1) 229 | fc = tf.nn.relu(tf.add(tf.matmul(fc, weights['fc']), biases['fc'])) 230 | out = tf.add(tf.matmul(fc, weights['fc2']), biases['fc2']) 231 | return out 232 | 233 | def train(self, train_data, val_data, lr=1e-3, batch_size=128, max_epoch=1000, min_delta=1e-4, patience=10, print_per_epoch=10, out_model='my_model'): 234 | ckpt = os.path.split(out_model)[0] 235 | if not os.path.exists(ckpt): 236 | os.makedirs(ckpt) 237 | 238 | print 'Train on %s samples, validate on %s samples' % (train_data[0].shape[0], val_data[0].shape[0]) 239 | # Define loss and optimizer 240 | self.cost = tf.losses.mean_squared_error(self.y, self.pred) 241 | self.optimizer = tf.train.AdamOptimizer(learning_rate=lr).minimize(self.cost) 242 | 243 | # Evaluate model 244 | self.err = tf.reduce_mean(tf.sqrt(tf.reduce_sum(tf.squared_difference(self.pred, self.y), axis=1))) 245 | train_loss_history = [] 246 | train_err_history = [] 247 | val_loss_history = [] 248 | val_err_history = [] 249 | n_incr_error = 0 # nb. 
of consecutive increase in error 250 | best_loss = np.Inf 251 | n_batches = train_data[0].shape[0] / batch_size + (train_data[0].shape[0] % batch_size != 0) 252 | val_n_batches = val_data[0].shape[0] / batch_size + (val_data[0].shape[0] % batch_size != 0) 253 | 254 | # Create the collection 255 | tf.get_collection("validation_nodes") 256 | # Add stuff to the collection. 257 | tf.add_to_collection("validation_nodes", self.eye_left) 258 | tf.add_to_collection("validation_nodes", self.eye_right) 259 | tf.add_to_collection("validation_nodes", self.face) 260 | tf.add_to_collection("validation_nodes", self.face_mask) 261 | tf.add_to_collection("validation_nodes", self.pred) 262 | saver = tf.train.Saver(max_to_keep=1) 263 | 264 | # Initializing the variables 265 | init = tf.global_variables_initializer() 266 | # Launch the graph 267 | with tf.Session() as sess: 268 | sess.run(init) 269 | writer = tf.summary.FileWriter("logs", sess.graph) 270 | 271 | # Keep training until reach max iterations 272 | for n_epoch in range(1, max_epoch + 1): 273 | n_incr_error += 1 274 | train_loss = 0. 275 | train_err = 0. 276 | train_data = shuffle_data(train_data) 277 | for batch_train_data in next_batch(train_data, batch_size): 278 | # Run optimization op (backprop) 279 | sess.run(self.optimizer, feed_dict={self.eye_left: batch_train_data[0], \ 280 | self.eye_right: batch_train_data[1], self.face: batch_train_data[2], \ 281 | self.face_mask: batch_train_data[3], self.y: batch_train_data[4]}) 282 | train_batch_loss, train_batch_err = sess.run([self.cost, self.err], feed_dict={self.eye_left: batch_train_data[0], \ 283 | self.eye_right: batch_train_data[1], self.face: batch_train_data[2], \ 284 | self.face_mask: batch_train_data[3], self.y: batch_train_data[4]}) 285 | train_loss += train_batch_loss / n_batches 286 | train_err += train_batch_err / n_batches 287 | 288 | val_loss = 0. 289 | val_err = 0 290 | for batch_val_data in next_batch(val_data, batch_size): 291 | val_batch_loss, val_batch_err = sess.run([self.cost, self.err], feed_dict={self.eye_left: batch_val_data[0], \ 292 | self.eye_right: batch_val_data[1], self.face: batch_val_data[2], \ 293 | self.face_mask: batch_val_data[3], self.y: batch_val_data[4]}) 294 | val_loss += val_batch_loss / val_n_batches 295 | val_err += val_batch_err / val_n_batches 296 | 297 | train_loss_history.append(train_loss) 298 | train_err_history.append(train_err) 299 | val_loss_history.append(val_loss) 300 | val_err_history.append(val_err) 301 | if val_loss - min_delta < best_loss: 302 | best_loss = val_loss 303 | save_path = saver.save(sess, out_model, global_step=n_epoch) 304 | print "Model saved in file: %s" % save_path 305 | n_incr_error = 0 306 | 307 | if n_epoch % print_per_epoch == 0: 308 | print 'Epoch %s/%s, train loss: %.5f, train error: %.5f, val loss: %.5f, val error: %.5f' % \ 309 | (n_epoch, max_epoch, train_loss, train_err, val_loss, val_err) 310 | 311 | if n_incr_error >= patience: 312 | print 'Early stopping occured. Optimization Finished!' 313 | return train_loss_history, train_err_history, val_loss_history, val_err_history 314 | 315 | return train_loss_history, train_err_history, val_loss_history, val_err_history 316 | 317 | def extract_validation_handles(session): 318 | """ Extracts the input and predict_op handles that we use for validation. 319 | Args: 320 | session: The session with the loaded graph. 321 | Returns: 322 | validation handles. 
323 | """ 324 | valid_nodes = tf.get_collection_ref("validation_nodes") 325 | if len(valid_nodes) != 5: 326 | raise Exception("ERROR: Expected 5 items in validation_nodes, got %d." % len(valid_nodes)) 327 | return valid_nodes 328 | 329 | def load_model(session, save_path): 330 | """ Loads a saved TF model from a file. 331 | Args: 332 | session: The tf.Session to use. 333 | save_path: The save path for the saved session, returned by Saver.save(). 334 | Returns: 335 | The inputs placehoder and the prediction operation. 336 | """ 337 | print "Loading model from file '%s'..." % save_path 338 | 339 | meta_file = save_path + ".meta" 340 | if not os.path.exists(meta_file): 341 | raise Exception("ERROR: Expected .meta file '%s', but could not find it." % meta_file) 342 | 343 | saver = tf.train.import_meta_graph(meta_file) 344 | # It's finicky about the save path. 345 | save_path = os.path.join("./", save_path) 346 | saver.restore(session, save_path) 347 | 348 | # Check that we have the handles we expected. 349 | return extract_validation_handles(session) 350 | 351 | def validate_model(session, val_data, val_ops, batch_size=200): 352 | """ Validates the model stored in a session. 353 | Args: 354 | session: The session where the model is loaded. 355 | val_data: The validation data to use for evaluating the model. 356 | val_ops: The validation operations. 357 | Returns: 358 | The overall validation error for the model. """ 359 | print "Validating model..." 360 | 361 | eye_left, eye_right, face, face_mask, pred = val_ops 362 | y = tf.placeholder(tf.float32, [None, 2], name='pos') 363 | err = tf.reduce_mean(tf.sqrt(tf.reduce_sum(tf.squared_difference(pred, y), axis=1))) 364 | # Validate the model. 365 | val_n_batches = val_data[0].shape[0] / batch_size + (val_data[0].shape[0] % batch_size != 0) 366 | val_err = 0 367 | for batch_val_data in next_batch(val_data, batch_size): 368 | val_batch_err = session.run(err, feed_dict={eye_left: batch_val_data[0], \ 369 | eye_right: batch_val_data[1], face: batch_val_data[2], \ 370 | face_mask: batch_val_data[3], y: batch_val_data[4]}) 371 | val_err += val_batch_err / val_n_batches 372 | return val_err 373 | 374 | def plot_loss(train_loss, train_err, test_err, start=0, per=1, save_file='loss.png'): 375 | assert len(train_err) == len(test_err) 376 | idx = np.arange(start, len(train_loss), per) 377 | fig, ax1 = plt.subplots() 378 | lns1 = ax1.plot(idx, train_loss[idx], 'b-', alpha=1.0, label='train loss') 379 | ax1.set_xlabel('epochs') 380 | # Make the y-axis label, ticks and tick labels match the line color. 
381 | ax1.set_ylabel('loss', color='b') 382 | ax1.tick_params('y', colors='b') 383 | 384 | ax2 = ax1.twinx() 385 | lns2 = ax2.plot(idx, train_err[idx], 'r-', alpha=1.0, label='train error') 386 | lns3 = ax2.plot(idx, test_err[idx], 'g-', alpha=1.0, label='test error') 387 | ax2.set_ylabel('error', color='r') 388 | ax2.tick_params('y', colors='r') 389 | 390 | # added these three lines 391 | lns = lns1 + lns2 + lns3 392 | labs = [l.get_label() for l in lns] 393 | ax1.legend(lns, labs, loc=0) 394 | 395 | fig.tight_layout() 396 | plt.savefig(save_file) 397 | # plt.show() 398 | 399 | def train(args): 400 | train_data, val_data = load_data(args.input) 401 | 402 | # train_size = 10 403 | # train_data = [each[:train_size] for each in train_data] 404 | # val_size = 1 405 | # val_data = [each[:val_size] for each in val_data] 406 | 407 | train_data = prepare_data(train_data) 408 | val_data = prepare_data(val_data) 409 | 410 | start = timeit.default_timer() 411 | et = EyeTracker() 412 | train_loss_history, train_err_history, val_loss_history, val_err_history = et.train(train_data, val_data, \ 413 | lr=args.learning_rate, \ 414 | batch_size=args.batch_size, \ 415 | max_epoch=args.max_epoch, \ 416 | min_delta=1e-4, \ 417 | patience=args.patience, \ 418 | print_per_epoch=args.print_per_epoch, 419 | out_model=args.save_model) 420 | 421 | print 'runtime: %.1fs' % (timeit.default_timer() - start) 422 | 423 | if args.save_loss: 424 | with open(args.save_loss, 'w') as outfile: 425 | np.savez(outfile, train_loss_history=train_loss_history, train_err_history=train_err_history, \ 426 | val_loss_history=val_loss_history, val_err_history=val_err_history) 427 | 428 | if args.plot_loss: 429 | plot_loss(np.array(train_loss_history), np.array(train_err_history), np.array(val_err_history), start=0, per=1, save_file=args.plot_loss) 430 | 431 | def test(args): 432 | _, val_data = load_data(args.input) 433 | 434 | # val_size = 10 435 | # val_data = [each[:val_size] for each in val_data] 436 | 437 | val_data = prepare_data(val_data) 438 | 439 | # Load and validate the network. 
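# load_model() re-imports the graph from the checkpoint's .meta file and
# restores the trained weights into this session, so no EyeTracker() instance
# is needed here; the input placeholders and the prediction op come back
# through the "validation_nodes" collection saved during training.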
440 | with tf.Session() as sess: 441 | val_ops = load_model(sess, args.load_model) 442 | error = validate_model(sess, val_data, val_ops, batch_size=args.batch_size) 443 | print 'Overall validation error: %f' % error 444 | 445 | def main(): 446 | parser = argparse.ArgumentParser() 447 | parser.add_argument('--train', action='store_true', help='train flag') 448 | parser.add_argument('-i', '--input', required=True, type=str, help='path to the input data') 449 | parser.add_argument('-max_epoch', '--max_epoch', type=int, default=100, help='max number of iterations') 450 | parser.add_argument('-lr', '--learning_rate', type=float, default=0.0025, help='learning rate') 451 | parser.add_argument('-bs', '--batch_size', type=int, default=200, help='batch size') 452 | parser.add_argument('-p', '--patience', type=int, default=5, help='early stopping patience') 453 | parser.add_argument('-pp_iter', '--print_per_epoch', type=int, default=1, help='print per iteration') 454 | parser.add_argument('-sm', '--save_model', type=str, default='my_model', help='path to the output model') 455 | parser.add_argument('-lm', '--load_model', type=str, help='path to the loaded model') 456 | parser.add_argument('-pl', '--plot_loss', type=str, default='loss.png', help='plot loss') 457 | parser.add_argument('-sl', '--save_loss', type=str, default='loss.npz', help='save loss') 458 | args = parser.parse_args() 459 | 460 | if args.train: 461 | train(args) 462 | else: 463 | if not args.load_model: 464 | raise Exception('load_model arg needed in test phase') 465 | test(args) 466 | 467 | if __name__ == '__main__': 468 | main() 469 | -------------------------------------------------------------------------------- /itracker_adv_arch.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hugochan/Eye-Tracker/10c8f692ef14a99cd1cf9818ade328f9965662b0/itracker_adv_arch.png -------------------------------------------------------------------------------- /itracker_arch.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hugochan/Eye-Tracker/10c8f692ef14a99cd1cf9818ade328f9965662b0/itracker_arch.png -------------------------------------------------------------------------------- /pretrained_models/itracker_adv/checkpoint: -------------------------------------------------------------------------------- 1 | model_checkpoint_path: "model-23" 2 | all_model_checkpoint_paths: "model-23" 3 | -------------------------------------------------------------------------------- /pretrained_models/itracker_adv/model-23.data-00000-of-00001: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hugochan/Eye-Tracker/10c8f692ef14a99cd1cf9818ade328f9965662b0/pretrained_models/itracker_adv/model-23.data-00000-of-00001 -------------------------------------------------------------------------------- /pretrained_models/itracker_adv/model-23.index: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hugochan/Eye-Tracker/10c8f692ef14a99cd1cf9818ade328f9965662b0/pretrained_models/itracker_adv/model-23.index -------------------------------------------------------------------------------- /pretrained_models/itracker_adv/model-23.meta: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/hugochan/Eye-Tracker/10c8f692ef14a99cd1cf9818ade328f9965662b0/pretrained_models/itracker_adv/model-23.meta -------------------------------------------------------------------------------- /validation_script.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python 2 | 3 | import argparse 4 | """ 5 | try: 6 | import cPickle as pickle 7 | except ImportError: 8 | # Python 3 9 | import pickle 10 | """ 11 | import os 12 | import sys 13 | 14 | import numpy as np 15 | 16 | import tensorflow as tf 17 | 18 | 19 | # How many images to include in each validation batch. This is just a default 20 | # value, and may be set differently to accommodate network parameters. 21 | batch_size = 1000 22 | 23 | 24 | def extract_validation_handles(session): 25 | """ Extracts the input and predict_op handles that we use for validation. 26 | Args: 27 | session: The session with the loaded graph. 28 | Returns: 29 | The input placeholders and the prediction operation. """ 30 | # The students should have saved their input placeholders, mask placeholder and prediction 31 | # operation in a collection called "validation_nodes". 32 | valid_nodes = tf.get_collection_ref("validation_nodes") 33 | if len(valid_nodes) != 5: 34 | print("ERROR: Expected 5 items in validation_nodes, got %d." % \ 35 | (len(valid_nodes))) 36 | sys.exit(1) 37 | 38 | # Figure out which is which. 39 | eye_left = valid_nodes[0] 40 | eye_right = valid_nodes[1] 41 | face = valid_nodes[2] 42 | face_mask = valid_nodes[3] 43 | predict = valid_nodes[4] 44 | """if type(valid_nodes[1]) == tf.placeholder: 45 | inputs = valid_nodes[1] 46 | predict = valid_nodes[0]""" 47 | 48 | # Check to make sure we've set the batch size correctly. 49 | global batch_size 50 | try: 51 | batch_size = int(eye_left.get_shape()[0]) 52 | print("WARNING: Network does not support variable batch sizes. (inputs)") 53 | except TypeError: 54 | # It's unspecified, which is actually correct. 55 | pass 56 | try: 57 | # I've also seen people who don't specify an input shape but do specify a 58 | # shape for the prediction operation. 59 | batch_size = int(predict.get_shape()[0]) 60 | print("WARNING: Network does not support variable batch sizes. (predict)") 61 | except TypeError: 62 | pass 63 | 64 | # Predict op should also yield integers. 65 | #predict = tf.cast(predict, "int32") 66 | 67 | # Check the shape of the prediction output. 68 | p_shape = predict.get_shape() 69 | # Commented these out because there could be squeezes in the code earlier. 70 | """ 71 | print p_shape 72 | if len(p_shape) > 2: 73 | print("ERROR: Expected prediction of shape (, 1), got shape of %s." % \ 74 | (str(p_shape))) 75 | sys.exit(1) 76 | if len(p_shape) == 2: 77 | if p_shape[1] != 1: 78 | print("ERROR: Expected prediction of shape (, 1), got shape of %s." % \ 79 | (str(p_shape))) 80 | sys.exit(1) 81 | 82 | # We need to contract it into a vector. 83 | predict = predict[:, 0]""" 84 | 85 | return (eye_left, eye_right, face, face_mask, predict) 86 | 87 | def load_model(session, save_path): 88 | """ Loads a saved TF model from a file. 89 | Args: 90 | session: The tf.Session to use. 91 | save_path: The save path for the saved session, returned by Saver.save(). 92 | Returns: 93 | The input placeholders and the prediction operation. 94 | """ 95 | print("Loading model from file '%s'..." 
% (save_path)) 96 | 97 | meta_file = save_path + ".meta" 98 | if not os.path.exists(meta_file): 99 | print("ERROR: Expected .meta file '%s', but could not find it." % \ 100 | (meta_file)) 101 | sys.exit(1) 102 | 103 | saver = tf.train.import_meta_graph(meta_file) 104 | # It's finicky about the save path. 105 | save_path = os.path.join("./", save_path) 106 | saver.restore(session, save_path) 107 | 108 | # Check that we have the handles we expected. 109 | return extract_validation_handles(session) 110 | 111 | def load_validation_data(val_filename): 112 | """ Loads the validation data. 113 | Args: 114 | val_filename: The file where the validation data is stored. 115 | Returns: 116 | A tuple of the loaded validation data and validation labels. """ 117 | print("Loading validation data...") 118 | 119 | npzfile = np.load(val_filename) 120 | val_eye_left = npzfile["val_eye_left"] 121 | val_eye_right = npzfile["val_eye_right"] 122 | val_face = npzfile["val_face"] 123 | val_face_mask = npzfile["val_face_mask"] 124 | val_y = npzfile["val_y"] 125 | 126 | return (val_eye_left, val_eye_right, val_face, val_face_mask, val_y) 127 | 128 | def validate_model(session, val_data, eye_left, eye_right, face, face_mask, predict_op): 129 | """ Validates the model stored in a session. 130 | Args: 131 | session: The session where the model is loaded. 132 | val_data: The validation data to use for evaluating the model. 133 | eye_left: The inputs placeholder. 134 | eye_right: The inputs placeholder. 135 | face: The inputs placeholder. 136 | face_mask: The inputs placeholder. 137 | predict_op: The prediction operation. 138 | Returns: 139 | The overall validation accuracy for the model. """ 140 | print("Validating model...") 141 | 142 | 143 | 144 | # Validate the model. 145 | val_eye_left, val_eye_right, val_face, val_face_mask, val_y = val_data 146 | num_iters = val_eye_left.shape[0] // batch_size 147 | 148 | err_val = [] 149 | for i in range(0, int(num_iters)): 150 | start_index = i * batch_size 151 | end_index = start_index + batch_size 152 | 153 | eye_left_batch = val_eye_left[start_index:end_index, :] 154 | eye_right_batch = val_eye_right[start_index:end_index, :] 155 | face_batch = val_face[start_index:end_index, :] 156 | # face_mask_batch = val_face_mask[start_index:end_index, :] 157 | face_mask_batch = np.reshape(val_face_mask[start_index:end_index, :], (batch_size, -1)) 158 | y_batch = val_y[start_index:end_index, :] 159 | 160 | 161 | 162 | print("Validating batch %d of %d..." % (i + 1, num_iters)) 163 | yp = session.run(predict_op, 164 | feed_dict={eye_left: eye_left_batch / 255., 165 | eye_right: eye_right_batch / 255., 166 | face: face_batch / 255., 167 | face_mask: face_mask_batch}) 168 | 169 | err = np.mean(np.sqrt(np.sum((yp - y_batch)**2, axis=1))) 170 | err_val.append(err) 171 | 172 | # Compute total error 173 | error = np.mean(err_val) 174 | return error 175 | 176 | def try_with_random_data(session, eye_left, eye_right, face, face_mask, predict_op): 177 | """ Tries putting random data through the network, mostly to make sure this 178 | works. 179 | Args: 180 | session: The session to use. 181 | inputs: The inputs placeholder. 182 | predict_op: The prediction operation. """ 183 | print("Trying random batch...") 184 | 185 | # Get a random batch. 
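# The random arrays must match the saved placeholders: eye and face inputs
# are (batch, 64, 64, 3), and the face mask placeholder expects a flattened
# (batch, 25 * 25) array, just as in validate_model() above.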
186 | eye_left_batch = np.random.rand(batch_size, 64, 64, 3) 187 | eye_right_batch = np.random.rand(batch_size, 64, 64, 3) 188 | face_batch = np.random.rand(batch_size, 64, 64, 3) 189 | face_mask_batch = np.random.rand(batch_size, 25 * 25) 190 | 191 | print("Batch of shape (%d, 64, 64, 3)" % (batch_size)) 192 | 193 | # Put it through the model. 194 | predictions = session.run(predict_op, feed_dict={eye_left: eye_left_batch, 195 | eye_right: eye_right_batch, 196 | face: face_batch, 197 | face_mask: face_mask_batch}) 198 | if np.isnan(predictions).any(): 199 | print("Warning: Got NaN value in prediction!") 200 | 201 | 202 | def main(): 203 | parser = argparse.ArgumentParser("Analyze student models.") 204 | parser.add_argument("-v", "--val_data_file", default=None, 205 | help="Validate the network with the data from this " + \ 206 | "npz file.") 207 | parser.add_argument("save_path", help="The base path for your saved model.") 208 | args = parser.parse_args() 209 | 210 | if not args.val_data_file: 211 | print("Not validating, but checking network compatibility...") 212 | elif not os.path.exists(args.val_data_file): 213 | print("ERROR: Could not find validation data '%s'." % (args.val_data_file)) 214 | sys.exit(1) 215 | 216 | # Load and validate the network. 217 | with tf.Session() as session: 218 | eye_left, eye_right, face, face_mask, predict_op = load_model(session, args.save_path) 219 | if args.val_data_file: 220 | val_data = load_validation_data(args.val_data_file) 221 | accuracy = validate_model(session, val_data, eye_left, eye_right, face, face_mask, predict_op) 222 | 223 | print("Overall validation error: %f cm" % (accuracy)) 224 | print("Network seems good. Go ahead and submit.") 225 | 226 | else: 227 | try_with_random_data(session, eye_left, eye_right, face, face_mask, predict_op) 228 | print("Network seems good. Go ahead and submit.") 229 | 230 | if __name__ == "__main__": 231 | main() 232 | --------------------------------------------------------------------------------