├── .gitignore ├── LICENSE ├── README.md ├── itracker.py ├── itracker_adv.py ├── itracker_adv_arch.png ├── itracker_arch.png ├── pretrained_models └── itracker_adv │ ├── checkpoint │ ├── model-23.data-00000-of-00001 │ ├── model-23.index │ └── model-23.meta └── validation_script.py /.gitignore: -------------------------------------------------------------------------------- 1 | ckpt/ 2 | *.pyc 3 | *.npz 4 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | BSD 3-Clause License 2 | 3 | Copyright (c) 2017, Hugo Chan 4 | All rights reserved. 5 | 6 | Redistribution and use in source and binary forms, with or without 7 | modification, are permitted provided that the following conditions are met: 8 | 9 | * Redistributions of source code must retain the above copyright notice, this 10 | list of conditions and the following disclaimer. 11 | 12 | * Redistributions in binary form must reproduce the above copyright notice, 13 | this list of conditions and the following disclaimer in the documentation 14 | and/or other materials provided with the distribution. 15 | 16 | * Neither the name of the copyright holder nor the names of its 17 | contributors may be used to endorse or promote products derived from 18 | this software without specific prior written permission. 19 | 20 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 21 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 22 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 23 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 24 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 25 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 26 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 27 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 28 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 29 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 30 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Eye Tracker 2 | Implemented and improved the iTracker model proposed in the paper [Eye Tracking for Everyone](https://arxiv.org/abs/1606.05814). 3 | 4 | ![](itracker_arch.png) 5 | *
Figure 1: iTracker architecture* 6 | 7 | ![](itracker_adv_arch.png) 8 | *
Figure 2: modified iTracker architecture* 9 | 10 | Figures 1 and 2 show the architectures of the iTracker model 11 | and the modified model. The only difference between the two is 12 | that we first concatenate the face layer FC-F1 and the face mask layer FC-FG1 and apply a fully connected layer FC-F2; 13 | we then concatenate the eye layer FC-E1 with the FC-F2 layer. 14 | We claim that this modified architecture is superior to the iTracker architecture. 15 | Intuitively, concatenating the face mask information directly with the eye information 16 | may confuse the model, since the face mask information is irrelevant to the eye information. 17 | Even though the iTracker model managed to learn this fact from the data, 18 | the modified model outperforms it by encoding this knowledge explicitly. 19 | In our experiments, the modified model converged faster (28 epochs vs. 40+ epochs) and achieved a lower validation 20 | error (2.19 cm vs. 2.514 cm). 21 | The iTracker model is implemented in itracker.py and the modified model 22 | in itracker_adv.py. 23 | Note that a smaller dataset (i.e., a subset of the full dataset from the original paper) was used in these experiments and no data augmentation was applied. 24 | This smaller dataset contains 48,000 training samples and 5,000 validation samples. 25 | You can download it [here](http://hugochan.net/download/eye_tracker_train_and_val.npz); the snippet at the end of this README shows how to inspect it. 26 | 27 | # Get started 28 | To train the model, run 29 | `python itracker_adv.py --train -i input_data -sm saved_model` 30 | 31 | To test the trained model, run 32 | `python itracker_adv.py -i input_data -lm saved_model` 33 | 34 | A model pretrained on the smaller dataset is available under the pretrained_models/itracker_adv/ folder. 35 | 36 | # FAQ 37 | 1) What are the datasets? 38 | 39 | The original dataset comes from the [GazeCapture](http://gazecapture.csail.mit.edu/) project. It involves over 1,400 subjects and more than 2 million face images. Due to limited computation power, a much [smaller dataset](http://hugochan.net/download/eye_tracker_train_and_val.npz) with 48,000 training samples and 5,000 validation samples was used here. Each sample contains 5 items: face, left eye, right eye, face mask, and labels. 40 | 41 | # Other implementations 42 | For a PyTorch implementation, see [GazeCapture](https://github.com/CSAILVision/GazeCapture).
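
# Inspecting the dataset
If you want to sanity-check the downloaded `.npz` archive before training, it can be read directly with NumPy. This is a minimal sketch, not one of the repo scripts; the key names match those read by `load_data()` in itracker.py and itracker_adv.py, and the shapes follow from the network parameters (64x64x3 crops, a 25x25 face mask, 2-D gaze labels):

```python
import numpy as np

# Path to the archive downloaded from the link above.
data = np.load("eye_tracker_train_and_val.npz")

for key in ["train_eye_left", "train_eye_right", "train_face",
            "train_face_mask", "train_y"]:
    print(key, data[key].shape)

# Expected shapes (N = 48,000; the val_* arrays are analogous with N = 5,000):
#   train_eye_left / train_eye_right / train_face: (N, 64, 64, 3) image crops
#   train_face_mask: (N, 25, 25) binary grid marking the face position
#   train_y: (N, 2) gaze position relative to the camera, in centimeters
```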
43 | -------------------------------------------------------------------------------- /itracker.py: -------------------------------------------------------------------------------- 1 | import os 2 | import argparse 3 | import timeit 4 | import numpy as np 5 | import tensorflow as tf 6 | import matplotlib.pyplot as plt 7 | 8 | 9 | # Network Parameters 10 | img_size = 64 11 | n_channel = 3 12 | mask_size = 25 13 | 14 | # pathway: eye_left and eye_right 15 | conv1_eye_size = 11 16 | conv1_eye_out = 96 17 | pool1_eye_size = 2 18 | pool1_eye_stride = 2 19 | 20 | conv2_eye_size = 5 21 | conv2_eye_out = 256 22 | pool2_eye_size = 2 23 | pool2_eye_stride = 2 24 | 25 | conv3_eye_size = 3 26 | conv3_eye_out = 384 27 | pool3_eye_size = 2 28 | pool3_eye_stride = 2 29 | 30 | conv4_eye_size = 1 31 | conv4_eye_out = 64 32 | pool4_eye_size = 2 33 | pool4_eye_stride = 2 34 | 35 | eye_size = 2 * 2 * 2 * conv4_eye_out 36 | 37 | # pathway: face 38 | conv1_face_size = 11 39 | conv1_face_out = 96 40 | pool1_face_size = 2 41 | pool1_face_stride = 2 42 | 43 | conv2_face_size = 5 44 | conv2_face_out = 256 45 | pool2_face_size = 2 46 | pool2_face_stride = 2 47 | 48 | conv3_face_size = 3 49 | conv3_face_out = 384 50 | pool3_face_size = 2 51 | pool3_face_stride = 2 52 | 53 | conv4_face_size = 1 54 | conv4_face_out = 64 55 | pool4_face_size = 2 56 | pool4_face_stride = 2 57 | 58 | face_size = 2 * 2 * conv4_face_out 59 | 60 | # fc layer 61 | fc_eye_size = 128 62 | fc_face_size = 128 63 | fc2_face_size = 64 64 | fc_face_mask_size = 256 65 | fc2_face_mask_size = 128 66 | fc_size = 128 67 | fc2_size = 2 68 | 69 | 70 | # Import data 71 | def load_data(file): 72 | npzfile = np.load(file) 73 | train_eye_left = npzfile["train_eye_left"] 74 | train_eye_right = npzfile["train_eye_right"] 75 | train_face = npzfile["train_face"] 76 | train_face_mask = npzfile["train_face_mask"] 77 | train_y = npzfile["train_y"] 78 | val_eye_left = npzfile["val_eye_left"] 79 | val_eye_right = npzfile["val_eye_right"] 80 | val_face = npzfile["val_face"] 81 | val_face_mask = npzfile["val_face_mask"] 82 | val_y = npzfile["val_y"] 83 | return [train_eye_left, train_eye_right, train_face, train_face_mask, train_y], [val_eye_left, val_eye_right, val_face, val_face_mask, val_y] 84 | 85 | def normalize(data): 86 | shape = data.shape 87 | data = np.reshape(data, (shape[0], -1)) 88 | data = data.astype('float32') / 255. 
# scaling 89 | data = data - np.mean(data, axis=0) # normalizing 90 | return np.reshape(data, shape) 91 | 92 | def prepare_data(data): 93 | eye_left, eye_right, face, face_mask, y = data 94 | eye_left = normalize(eye_left) 95 | eye_right = normalize(eye_right) 96 | face = normalize(face) 97 | face_mask = np.reshape(face_mask, (face_mask.shape[0], -1)).astype('float32') 98 | y = y.astype('float32') 99 | return [eye_left, eye_right, face, face_mask, y] 100 | 101 | def shuffle_data(data): 102 | idx = np.arange(data[0].shape[0]) 103 | np.random.shuffle(idx) 104 | for i in range(len(data)): 105 | data[i] = data[i][idx] 106 | return data 107 | 108 | def next_batch(data, batch_size): 109 | for i in np.arange(0, data[0].shape[0], batch_size): 110 | # yield a tuple of the current batched data 111 | yield [each[i: i + batch_size] for each in data] 112 | 113 | class EyeTracker(object): 114 | def __init__(self): 115 | # tf Graph input 116 | self.eye_left = tf.placeholder(tf.float32, [None, img_size, img_size, n_channel], name='eye_left') 117 | self.eye_right = tf.placeholder(tf.float32, [None, img_size, img_size, n_channel], name='eye_right') 118 | self.face = tf.placeholder(tf.float32, [None, img_size, img_size, n_channel], name='face') 119 | self.face_mask = tf.placeholder(tf.float32, [None, mask_size * mask_size], name='face_mask') 120 | self.y = tf.placeholder(tf.float32, [None, 2], name='pos') 121 | # Store layers weight & bias 122 | self.weights = { 123 | 'conv1_eye': tf.get_variable('conv1_eye_w', shape=(conv1_eye_size, conv1_eye_size, n_channel, conv1_eye_out), initializer=tf.contrib.layers.xavier_initializer()), 124 | 'conv2_eye': tf.get_variable('conv2_eye_w', shape=(conv2_eye_size, conv2_eye_size, conv1_eye_out, conv2_eye_out), initializer=tf.contrib.layers.xavier_initializer()), 125 | 'conv3_eye': tf.get_variable('conv3_eye_w', shape=(conv3_eye_size, conv3_eye_size, conv2_eye_out, conv3_eye_out), initializer=tf.contrib.layers.xavier_initializer()), 126 | 'conv4_eye': tf.get_variable('conv4_eye_w', shape=(conv4_eye_size, conv4_eye_size, conv3_eye_out, conv4_eye_out), initializer=tf.contrib.layers.xavier_initializer()), 127 | 'conv1_face': tf.get_variable('conv1_face_w', shape=(conv1_face_size, conv1_face_size, n_channel, conv1_face_out), initializer=tf.contrib.layers.xavier_initializer()), 128 | 'conv2_face': tf.get_variable('conv2_face_w', shape=(conv2_face_size, conv2_face_size, conv1_face_out, conv2_face_out), initializer=tf.contrib.layers.xavier_initializer()), 129 | 'conv3_face': tf.get_variable('conv3_face_w', shape=(conv3_face_size, conv3_face_size, conv2_face_out, conv3_face_out), initializer=tf.contrib.layers.xavier_initializer()), 130 | 'conv4_face': tf.get_variable('conv4_face_w', shape=(conv4_face_size, conv4_face_size, conv3_face_out, conv4_face_out), initializer=tf.contrib.layers.xavier_initializer()), 131 | 'fc_eye': tf.get_variable('fc_eye_w', shape=(eye_size, fc_eye_size), initializer=tf.contrib.layers.xavier_initializer()), 132 | 'fc_face': tf.get_variable('fc_face_w', shape=(face_size, fc_face_size), initializer=tf.contrib.layers.xavier_initializer()), 133 | 'fc2_face': tf.get_variable('fc2_face_w', shape=(fc_face_size, fc2_face_size), initializer=tf.contrib.layers.xavier_initializer()), 134 | 'fc_face_mask': tf.get_variable('fc_face_mask_w', shape=(mask_size * mask_size, fc_face_mask_size), initializer=tf.contrib.layers.xavier_initializer()), 135 | 'fc2_face_mask': tf.get_variable('fc2_face_mask_w', shape=(fc_face_mask_size, fc2_face_mask_size), 
initializer=tf.contrib.layers.xavier_initializer()), 136 | 'fc': tf.get_variable('fc_w', shape=(fc_eye_size + fc2_face_size + fc2_face_mask_size, fc_size), initializer=tf.contrib.layers.xavier_initializer()), 137 | 'fc2': tf.get_variable('fc2_w', shape=(fc_size, fc2_size), initializer=tf.contrib.layers.xavier_initializer()) 138 | } 139 | self.biases = { 140 | 'conv1_eye': tf.Variable(tf.constant(0.1, shape=[conv1_eye_out])), 141 | 'conv2_eye': tf.Variable(tf.constant(0.1, shape=[conv2_eye_out])), 142 | 'conv3_eye': tf.Variable(tf.constant(0.1, shape=[conv3_eye_out])), 143 | 'conv4_eye': tf.Variable(tf.constant(0.1, shape=[conv4_eye_out])), 144 | 'conv1_face': tf.Variable(tf.constant(0.1, shape=[conv1_face_out])), 145 | 'conv2_face': tf.Variable(tf.constant(0.1, shape=[conv2_face_out])), 146 | 'conv3_face': tf.Variable(tf.constant(0.1, shape=[conv3_face_out])), 147 | 'conv4_face': tf.Variable(tf.constant(0.1, shape=[conv4_face_out])), 148 | 'fc_eye': tf.Variable(tf.constant(0.1, shape=[fc_eye_size])), 149 | 'fc_face': tf.Variable(tf.constant(0.1, shape=[fc_face_size])), 150 | 'fc2_face': tf.Variable(tf.constant(0.1, shape=[fc2_face_size])), 151 | 'fc_face_mask': tf.Variable(tf.constant(0.1, shape=[fc_face_mask_size])), 152 | 'fc2_face_mask': tf.Variable(tf.constant(0.1, shape=[fc2_face_mask_size])), 153 | 'fc': tf.Variable(tf.constant(0.1, shape=[fc_size])), 154 | 'fc2': tf.Variable(tf.constant(0.1, shape=[fc2_size])) 155 | } 156 | 157 | # Construct model 158 | self.pred = self.itracker_nets(self.eye_left, self.eye_right, self.face, self.face_mask, self.weights, self.biases) 159 | 160 | # Create some wrappers for simplicity 161 | def conv2d(self, x, W, b, strides=1): 162 | # Conv2D wrapper, with bias and relu activation 163 | x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='VALID') 164 | x = tf.nn.bias_add(x, b) 165 | return tf.nn.relu(x) 166 | 167 | def maxpool2d(self, x, k, strides): 168 | # MaxPool2D wrapper 169 | return tf.nn.max_pool(x, ksize=[1, k, k, 1], strides=[1, strides, strides, 1], 170 | padding='VALID') 171 | 172 | # Create model 173 | def itracker_nets(self, eye_left, eye_right, face, face_mask, weights, biases): 174 | # pathway: left eye 175 | eye_left = self.conv2d(eye_left, weights['conv1_eye'], biases['conv1_eye'], strides=1) 176 | eye_left = self.maxpool2d(eye_left, k=pool1_eye_size, strides=pool1_eye_stride) 177 | 178 | eye_left = self.conv2d(eye_left, weights['conv2_eye'], biases['conv2_eye'], strides=1) 179 | eye_left = self.maxpool2d(eye_left, k=pool2_eye_size, strides=pool2_eye_stride) 180 | 181 | eye_left = self.conv2d(eye_left, weights['conv3_eye'], biases['conv3_eye'], strides=1) 182 | eye_left = self.maxpool2d(eye_left, k=pool3_eye_size, strides=pool3_eye_stride) 183 | 184 | eye_left = self.conv2d(eye_left, weights['conv4_eye'], biases['conv4_eye'], strides=1) 185 | eye_left = self.maxpool2d(eye_left, k=pool4_eye_size, strides=pool4_eye_stride) 186 | 187 | # pathway: right eye 188 | eye_right = self.conv2d(eye_right, weights['conv1_eye'], biases['conv1_eye'], strides=1) 189 | eye_right = self.maxpool2d(eye_right, k=pool1_eye_size, strides=pool1_eye_stride) 190 | 191 | eye_right = self.conv2d(eye_right, weights['conv2_eye'], biases['conv2_eye'], strides=1) 192 | eye_right = self.maxpool2d(eye_right, k=pool2_eye_size, strides=pool2_eye_stride) 193 | 194 | eye_right = self.conv2d(eye_right, weights['conv3_eye'], biases['conv3_eye'], strides=1) 195 | eye_right = self.maxpool2d(eye_right, k=pool3_eye_size, strides=pool3_eye_stride) 196 | 197 | 
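# Output-size bookkeeping (VALID padding throughout): each 64x64 eye crop goes
# 64 -> conv 11x11 -> 54 -> pool/2 -> 27 -> conv 5x5 -> 23 -> pool/2 -> 11
# -> conv 3x3 -> 9 -> pool/2 -> 4 -> conv 1x1 -> 4 -> pool/2 -> 2,
# i.e. 2 x 2 x conv4_eye_out = 256 features per eye, so the two eyes together
# give eye_size = 2 * 2 * 2 * conv4_eye_out = 512 for the fc_eye layer below.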
eye_right = self.conv2d(eye_right, weights['conv4_eye'], biases['conv4_eye'], strides=1) 198 | eye_right = self.maxpool2d(eye_right, k=pool4_eye_size, strides=pool4_eye_stride) 199 | 200 | # pathway: face 201 | face = self.conv2d(face, weights['conv1_face'], biases['conv1_face'], strides=1) 202 | face = self.maxpool2d(face, k=pool1_face_size, strides=pool1_face_stride) 203 | 204 | face = self.conv2d(face, weights['conv2_face'], biases['conv2_face'], strides=1) 205 | face = self.maxpool2d(face, k=pool2_face_size, strides=pool2_face_stride) 206 | 207 | face = self.conv2d(face, weights['conv3_face'], biases['conv3_face'], strides=1) 208 | face = self.maxpool2d(face, k=pool3_face_size, strides=pool3_face_stride) 209 | 210 | face = self.conv2d(face, weights['conv4_face'], biases['conv4_face'], strides=1) 211 | face = self.maxpool2d(face, k=pool4_face_size, strides=pool4_face_stride) 212 | 213 | # fc layer 214 | # eye 215 | eye_left = tf.reshape(eye_left, [-1, int(np.prod(eye_left.get_shape()[1:]))]) 216 | eye_right = tf.reshape(eye_right, [-1, int(np.prod(eye_right.get_shape()[1:]))]) 217 | eye = tf.concat([eye_left, eye_right], 1) 218 | eye = tf.nn.relu(tf.add(tf.matmul(eye, weights['fc_eye']), biases['fc_eye'])) 219 | 220 | # face 221 | face = tf.reshape(face, [-1, int(np.prod(face.get_shape()[1:]))]) 222 | face = tf.nn.relu(tf.add(tf.matmul(face, weights['fc_face']), biases['fc_face'])) 223 | face = tf.nn.relu(tf.add(tf.matmul(face, weights['fc2_face']), biases['fc2_face'])) 224 | 225 | # face mask 226 | face_mask = tf.nn.relu(tf.add(tf.matmul(face_mask, weights['fc_face_mask']), biases['fc_face_mask'])) 227 | face_mask = tf.nn.relu(tf.add(tf.matmul(face_mask, weights['fc2_face_mask']), biases['fc2_face_mask'])) 228 | 229 | # all 230 | fc = tf.concat([eye, face, face_mask], 1) 231 | fc = tf.nn.relu(tf.add(tf.matmul(fc, weights['fc']), biases['fc'])) 232 | out = tf.add(tf.matmul(fc, weights['fc2']), biases['fc2']) 233 | return out 234 | 235 | def train(self, train_data, val_data, lr=1e-3, batch_size=128, max_epoch=1000, min_delta=1e-4, patience=10, print_per_epoch=10, out_model='my_model'): 236 | ckpt = os.path.split(out_model)[0] 237 | if not os.path.exists(ckpt): 238 | os.makedirs(ckpt) 239 | 240 | print 'Train on %s samples, validate on %s samples' % (train_data[0].shape[0], val_data[0].shape[0]) 241 | # Define loss and optimizer 242 | self.cost = tf.losses.mean_squared_error(self.y, self.pred) 243 | self.optimizer = tf.train.AdamOptimizer(learning_rate=lr).minimize(self.cost) 244 | 245 | # Evaluate model 246 | self.err = tf.reduce_mean(tf.sqrt(tf.reduce_sum(tf.squared_difference(self.pred, self.y), axis=1))) 247 | train_loss_history = [] 248 | train_err_history = [] 249 | val_loss_history = [] 250 | val_err_history = [] 251 | n_incr_error = 0 # nb. of consecutive increase in error 252 | best_loss = np.Inf 253 | n_batches = train_data[0].shape[0] / batch_size + (train_data[0].shape[0] % batch_size != 0) 254 | 255 | # Create the collection 256 | tf.get_collection("validation_nodes") 257 | # Add stuff to the collection. 
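# These five nodes (four input placeholders plus the prediction op) are looked
# up again by extract_validation_handles() and by validation_script.py after
# the saved graph is re-imported, so the order they are added here must match
# the order they are unpacked there.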
258 | tf.add_to_collection("validation_nodes", self.eye_left) 259 | tf.add_to_collection("validation_nodes", self.eye_right) 260 | tf.add_to_collection("validation_nodes", self.face) 261 | tf.add_to_collection("validation_nodes", self.face_mask) 262 | tf.add_to_collection("validation_nodes", self.pred) 263 | saver = tf.train.Saver(max_to_keep=1) 264 | 265 | # Initializing the variables 266 | init = tf.global_variables_initializer() 267 | # Launch the graph 268 | with tf.Session() as sess: 269 | sess.run(init) 270 | # Keep training until reach max iterations 271 | for n_epoch in range(1, max_epoch + 1): 272 | n_incr_error += 1 273 | train_loss = 0. 274 | val_loss = 0. 275 | train_err = 0. 276 | val_err = 0. 277 | train_data = shuffle_data(train_data) 278 | for batch_train_data in next_batch(train_data, batch_size): 279 | # Run optimization op (backprop) 280 | sess.run(self.optimizer, feed_dict={self.eye_left: batch_train_data[0], \ 281 | self.eye_right: batch_train_data[1], self.face: batch_train_data[2], \ 282 | self.face_mask: batch_train_data[3], self.y: batch_train_data[4]}) 283 | train_batch_loss, train_batch_err = sess.run([self.cost, self.err], feed_dict={self.eye_left: batch_train_data[0], \ 284 | self.eye_right: batch_train_data[1], self.face: batch_train_data[2], \ 285 | self.face_mask: batch_train_data[3], self.y: batch_train_data[4]}) 286 | train_loss += train_batch_loss / n_batches 287 | train_err += train_batch_err / n_batches 288 | val_loss, val_err = sess.run([self.cost, self.err], feed_dict={self.eye_left: val_data[0], \ 289 | self.eye_right: val_data[1], self.face: val_data[2], \ 290 | self.face_mask: val_data[3], self.y: val_data[4]}) 291 | 292 | train_loss_history.append(train_loss) 293 | train_err_history.append(train_err) 294 | val_loss_history.append(val_loss) 295 | val_err_history.append(val_err) 296 | if val_loss - min_delta < best_loss: 297 | best_loss = val_loss 298 | save_path = saver.save(sess, out_model, global_step=n_epoch) 299 | print "Model saved in file: %s" % save_path 300 | n_incr_error = 0 301 | 302 | if n_epoch % print_per_epoch == 0: 303 | print 'Epoch %s/%s, train loss: %.5f, train error: %.5f, val loss: %.5f, val error: %.5f' % \ 304 | (n_epoch, max_epoch, train_loss, train_err, val_loss, val_err) 305 | 306 | if n_incr_error >= patience: 307 | print 'Early stopping occured. Optimization Finished!' 308 | return train_loss_history, train_err_history, val_loss_history, val_err_history 309 | 310 | return train_loss_history, train_err_history, val_loss_history, val_err_history 311 | 312 | def extract_validation_handles(session): 313 | """ Extracts the input and predict_op handles that we use for validation. 314 | Args: 315 | session: The session with the loaded graph. 316 | Returns: 317 | validation handles. 318 | """ 319 | valid_nodes = tf.get_collection_ref("validation_nodes") 320 | if len(valid_nodes) != 5: 321 | raise Exception("ERROR: Expected 5 items in validation_nodes, got %d." % len(valid_nodes)) 322 | return valid_nodes 323 | 324 | def load_model(session, save_path): 325 | """ Loads a saved TF model from a file. 326 | Args: 327 | session: The tf.Session to use. 328 | save_path: The save path for the saved session, returned by Saver.save(). 329 | Returns: 330 | The inputs placehoder and the prediction operation. 331 | """ 332 | print "Loading model from file '%s'..." % save_path 333 | 334 | meta_file = save_path + ".meta" 335 | if not os.path.exists(meta_file): 336 | raise Exception("ERROR: Expected .meta file '%s', but could not find it." 
% meta_file) 337 | 338 | saver = tf.train.import_meta_graph(meta_file) 339 | # It's finicky about the save path. 340 | save_path = os.path.join("./", save_path) 341 | saver.restore(session, save_path) 342 | 343 | # Check that we have the handles we expected. 344 | return extract_validation_handles(session) 345 | 346 | def validate_model(session, val_data, val_ops): 347 | """ Validates the model stored in a session. 348 | Args: 349 | session: The session where the model is loaded. 350 | val_data: The validation data to use for evaluating the model. 351 | val_ops: The validation operations. 352 | Returns: 353 | The overall validation error for the model. """ 354 | print "Validating model..." 355 | 356 | eye_left, eye_right, face, face_mask, pred = val_ops 357 | val_eye_left, val_eye_right, val_face, val_face_mask, val_y = val_data 358 | y = tf.placeholder(tf.float32, [None, 2], name='pos') 359 | err = tf.reduce_mean(tf.sqrt(tf.reduce_sum(tf.squared_difference(pred, y), axis=1))) 360 | # Validate the model. 361 | error = session.run(err, feed_dict={eye_left: val_eye_left, \ 362 | eye_right: val_eye_right, face: val_face, \ 363 | face_mask: val_face_mask, y: val_y}) 364 | return error 365 | 366 | def plot_loss(train_loss, train_err, test_err, start=0, per=1, save_file='loss.png'): 367 | assert len(train_err) == len(test_err) 368 | idx = np.arange(start, len(train_loss), per) 369 | fig, ax1 = plt.subplots() 370 | lns1 = ax1.plot(idx, train_loss[idx], 'b-', alpha=1.0, label='train loss') 371 | ax1.set_xlabel('epochs') 372 | # Make the y-axis label, ticks and tick labels match the line color. 373 | ax1.set_ylabel('loss', color='b') 374 | ax1.tick_params('y', colors='b') 375 | 376 | ax2 = ax1.twinx() 377 | lns2 = ax2.plot(idx, train_err[idx], 'r-', alpha=1.0, label='train error') 378 | lns3 = ax2.plot(idx, test_err[idx], 'g-', alpha=1.0, label='test error') 379 | ax2.set_ylabel('error', color='r') 380 | ax2.tick_params('y', colors='r') 381 | 382 | # added these three lines 383 | lns = lns1 + lns2 + lns3 384 | labs = [l.get_label() for l in lns] 385 | ax1.legend(lns, labs, loc=0) 386 | 387 | fig.tight_layout() 388 | plt.savefig(save_file) 389 | # plt.show() 390 | 391 | def train(args): 392 | train_data, val_data = load_data(args.input) 393 | 394 | # train_size = 10 395 | # train_data = [each[:train_size] for each in train_data] 396 | # val_size = 1 397 | # val_data = [each[:val_size] for each in val_data] 398 | train_data = prepare_data(train_data) 399 | val_data = prepare_data(val_data) 400 | 401 | start = timeit.default_timer() 402 | et = EyeTracker() 403 | train_loss_history, train_err_history, val_loss_history, val_err_history = et.train(train_data, val_data, \ 404 | lr=args.learning_rate, \ 405 | batch_size=args.batch_size, \ 406 | max_epoch=args.max_epoch, \ 407 | min_delta=1e-4, \ 408 | patience=args.patience, \ 409 | print_per_epoch=args.print_per_epoch, 410 | out_model=args.save_model) 411 | 412 | print 'runtime: %.1fs' % (timeit.default_timer() - start) 413 | 414 | if args.save_loss: 415 | with open(args.save_loss, 'w') as outfile: 416 | np.savez(outfile, train_loss_history=train_loss_history, train_err_history=train_err_history, \ 417 | val_loss_history=val_loss_history, val_err_history=val_err_history) 418 | 419 | if args.plot_loss: 420 | plot_loss(np.array(train_loss_history), np.array(train_err_history), np.array(val_err_history), start=0, per=1, save_file=args.plot_loss) 421 | 422 | def test(args): 423 | _, val_data = load_data(args.input) 424 | 425 | # val_size = 10 426 | # val_data 
= [each[:val_size] for each in val_data] 427 | 428 | val_data = prepare_data(val_data) 429 | 430 | # Load and validate the network. 431 | with tf.Session() as sess: 432 | val_ops = load_model(sess, args.load_model) 433 | error = validate_model(sess, val_data, val_ops) 434 | print 'Overall validation error: %f' % error 435 | 436 | def main(): 437 | parser = argparse.ArgumentParser() 438 | parser.add_argument('--train', action='store_true', help='train flag') 439 | parser.add_argument('-i', '--input', required=True, type=str, help='path to the input data') 440 | parser.add_argument('-max_epoch', '--max_epoch', type=int, default=100, help='max number of epochs (default 100)') 441 | parser.add_argument('-lr', '--learning_rate', type=float, default=0.002, help='learning rate (default 0.002)') 442 | parser.add_argument('-bs', '--batch_size', type=int, default=128, help='batch size (default 128)') 443 | parser.add_argument('-p', '--patience', type=int, default=5, help='early stopping patience (default 5)') 444 | parser.add_argument('-pp_iter', '--print_per_epoch', type=int, default=1, help='print every n epochs (default 1)') 445 | parser.add_argument('-sm', '--save_model', type=str, default='my_model', help='path to the output model (default my_model)') 446 | parser.add_argument('-lm', '--load_model', type=str, help='path to the loaded model') 447 | parser.add_argument('-pf', '--plot_filter', type=str, default='filter.png', help='plot filters') 448 | parser.add_argument('-pl', '--plot_loss', type=str, default='loss.png', help='plot loss') 449 | parser.add_argument('-sl', '--save_loss', type=str, default='loss.npz', help='save loss') 450 | args = parser.parse_args() 451 | 452 | if args.train: 453 | train(args) 454 | else: 455 | if not args.load_model: 456 | raise Exception('load_model arg needed in test phase') 457 | test(args) 458 | 459 | if __name__ == '__main__': 460 | main() 461 | -------------------------------------------------------------------------------- /itracker_adv.py: -------------------------------------------------------------------------------- 1 | import os 2 | import argparse 3 | import timeit 4 | import numpy as np 5 | import tensorflow as tf 6 | import matplotlib.pyplot as plt 7 | 8 | 9 | # Network Parameters 10 | img_size = 64 11 | n_channel = 3 12 | mask_size = 25 13 | 14 | # pathway: eye_left and eye_right 15 | conv1_eye_size = 11 16 | conv1_eye_out = 96 17 | pool1_eye_size = 2 18 | pool1_eye_stride = 2 19 | 20 | conv2_eye_size = 5 21 | conv2_eye_out = 256 22 | pool2_eye_size = 2 23 | pool2_eye_stride = 2 24 | 25 | conv3_eye_size = 3 26 | conv3_eye_out = 384 27 | pool3_eye_size = 2 28 | pool3_eye_stride = 2 29 | 30 | conv4_eye_size = 1 31 | conv4_eye_out = 64 32 | pool4_eye_size = 2 33 | pool4_eye_stride = 2 34 | 35 | eye_size = 2 * 2 * 2 * conv4_eye_out 36 | 37 | # pathway: face 38 | conv1_face_size = 11 39 | conv1_face_out = 96 40 | pool1_face_size = 2 41 | pool1_face_stride = 2 42 | 43 | conv2_face_size = 5 44 | conv2_face_out = 256 45 | pool2_face_size = 2 46 | pool2_face_stride = 2 47 | 48 | conv3_face_size = 3 49 | conv3_face_out = 384 50 | pool3_face_size = 2 51 | pool3_face_stride = 2 52 | 53 | conv4_face_size = 1 54 | conv4_face_out = 64 55 | pool4_face_size = 2 56 | pool4_face_stride = 2 57 | 58 | face_size = 2 * 2 * conv4_face_out 59 | 60 | # fc layer 61 | fc_eye_size = 128 62 | fc_face_size = 128 63 | fc_face_mask_size = 256 64 | face_face_mask_size = 128 65 | fc_size = 128 66 | fc2_size = 2 67 | 68 | 69 | # Import data 70 | def load_data(file): 71 | npzfile 
= np.load(file) 72 | train_eye_left = npzfile["train_eye_left"] 73 | train_eye_right = npzfile["train_eye_right"] 74 | train_face = npzfile["train_face"] 75 | train_face_mask = npzfile["train_face_mask"] 76 | train_y = npzfile["train_y"] 77 | val_eye_left = npzfile["val_eye_left"] 78 | val_eye_right = npzfile["val_eye_right"] 79 | val_face = npzfile["val_face"] 80 | val_face_mask = npzfile["val_face_mask"] 81 | val_y = npzfile["val_y"] 82 | return [train_eye_left, train_eye_right, train_face, train_face_mask, train_y], [val_eye_left, val_eye_right, val_face, val_face_mask, val_y] 83 | 84 | def normalize(data): 85 | shape = data.shape 86 | data = np.reshape(data, (shape[0], -1)) 87 | data = data.astype('float32') / 255. # scaling 88 | data = data - np.mean(data, axis=0) # normalizing 89 | return np.reshape(data, shape) 90 | 91 | def prepare_data(data): 92 | eye_left, eye_right, face, face_mask, y = data 93 | eye_left = normalize(eye_left) 94 | eye_right = normalize(eye_right) 95 | face = normalize(face) 96 | face_mask = np.reshape(face_mask, (face_mask.shape[0], -1)).astype('float32') 97 | y = y.astype('float32') 98 | return [eye_left, eye_right, face, face_mask, y] 99 | 100 | def shuffle_data(data): 101 | idx = np.arange(data[0].shape[0]) 102 | np.random.shuffle(idx) 103 | for i in range(len(data)): 104 | data[i] = data[i][idx] 105 | return data 106 | 107 | def next_batch(data, batch_size): 108 | for i in np.arange(0, data[0].shape[0], batch_size): 109 | # yield a tuple of the current batched data 110 | yield [each[i: i + batch_size] for each in data] 111 | 112 | class EyeTracker(object): 113 | def __init__(self): 114 | # tf Graph input 115 | self.eye_left = tf.placeholder(tf.float32, [None, img_size, img_size, n_channel], name='eye_left') 116 | self.eye_right = tf.placeholder(tf.float32, [None, img_size, img_size, n_channel], name='eye_right') 117 | self.face = tf.placeholder(tf.float32, [None, img_size, img_size, n_channel], name='face') 118 | self.face_mask = tf.placeholder(tf.float32, [None, mask_size * mask_size], name='face_mask') 119 | self.y = tf.placeholder(tf.float32, [None, 2], name='pos') 120 | # Store layers weight & bias 121 | self.weights = { 122 | 'conv1_eye': tf.get_variable('conv1_eye_w', shape=(conv1_eye_size, conv1_eye_size, n_channel, conv1_eye_out), initializer=tf.contrib.layers.xavier_initializer()), 123 | 'conv2_eye': tf.get_variable('conv2_eye_w', shape=(conv2_eye_size, conv2_eye_size, conv1_eye_out, conv2_eye_out), initializer=tf.contrib.layers.xavier_initializer()), 124 | 'conv3_eye': tf.get_variable('conv3_eye_w', shape=(conv3_eye_size, conv3_eye_size, conv2_eye_out, conv3_eye_out), initializer=tf.contrib.layers.xavier_initializer()), 125 | 'conv4_eye': tf.get_variable('conv4_eye_w', shape=(conv4_eye_size, conv4_eye_size, conv3_eye_out, conv4_eye_out), initializer=tf.contrib.layers.xavier_initializer()), 126 | 'conv1_face': tf.get_variable('conv1_face_w', shape=(conv1_face_size, conv1_face_size, n_channel, conv1_face_out), initializer=tf.contrib.layers.xavier_initializer()), 127 | 'conv2_face': tf.get_variable('conv2_face_w', shape=(conv2_face_size, conv2_face_size, conv1_face_out, conv2_face_out), initializer=tf.contrib.layers.xavier_initializer()), 128 | 'conv3_face': tf.get_variable('conv3_face_w', shape=(conv3_face_size, conv3_face_size, conv2_face_out, conv3_face_out), initializer=tf.contrib.layers.xavier_initializer()), 129 | 'conv4_face': tf.get_variable('conv4_face_w', shape=(conv4_face_size, conv4_face_size, conv3_face_out, conv4_face_out), 
initializer=tf.contrib.layers.xavier_initializer()), 130 | 'fc_eye': tf.get_variable('fc_eye_w', shape=(eye_size, fc_eye_size), initializer=tf.contrib.layers.xavier_initializer()), 131 | 'fc_face': tf.get_variable('fc_face_w', shape=(face_size, fc_face_size), initializer=tf.contrib.layers.xavier_initializer()), 132 | 'fc_face_mask': tf.get_variable('fc_face_mask_w', shape=(mask_size * mask_size, fc_face_mask_size), initializer=tf.contrib.layers.xavier_initializer()), 133 | 'face_face_mask': tf.get_variable('face_face_mask_w', shape=(fc_face_size + fc_face_mask_size, face_face_mask_size), initializer=tf.contrib.layers.xavier_initializer()), 134 | 'fc': tf.get_variable('fc_w', shape=(fc_eye_size + face_face_mask_size, fc_size), initializer=tf.contrib.layers.xavier_initializer()), 135 | 'fc2': tf.get_variable('fc2_w', shape=(fc_size, fc2_size), initializer=tf.contrib.layers.xavier_initializer()) 136 | } 137 | self.biases = { 138 | 'conv1_eye': tf.Variable(tf.constant(0.1, shape=[conv1_eye_out])), 139 | 'conv2_eye': tf.Variable(tf.constant(0.1, shape=[conv2_eye_out])), 140 | 'conv3_eye': tf.Variable(tf.constant(0.1, shape=[conv3_eye_out])), 141 | 'conv4_eye': tf.Variable(tf.constant(0.1, shape=[conv4_eye_out])), 142 | 'conv1_face': tf.Variable(tf.constant(0.1, shape=[conv1_face_out])), 143 | 'conv2_face': tf.Variable(tf.constant(0.1, shape=[conv2_face_out])), 144 | 'conv3_face': tf.Variable(tf.constant(0.1, shape=[conv3_face_out])), 145 | 'conv4_face': tf.Variable(tf.constant(0.1, shape=[conv4_face_out])), 146 | 'fc_eye': tf.Variable(tf.constant(0.1, shape=[fc_eye_size])), 147 | 'fc_face': tf.Variable(tf.constant(0.1, shape=[fc_face_size])), 148 | 'fc_face_mask': tf.Variable(tf.constant(0.1, shape=[fc_face_mask_size])), 149 | 'face_face_mask': tf.Variable(tf.constant(0.1, shape=[face_face_mask_size])), 150 | 'fc': tf.Variable(tf.constant(0.1, shape=[fc_size])), 151 | 'fc2': tf.Variable(tf.constant(0.1, shape=[fc2_size])) 152 | } 153 | 154 | # Construct model 155 | self.pred = self.itracker_nets(self.eye_left, self.eye_right, self.face, self.face_mask, self.weights, self.biases) 156 | 157 | # Create some wrappers for simplicity 158 | def conv2d(self, x, W, b, strides=1): 159 | # Conv2D wrapper, with bias and relu activation 160 | x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='VALID') 161 | x = tf.nn.bias_add(x, b) 162 | return tf.nn.relu(x) 163 | 164 | def maxpool2d(self, x, k, strides): 165 | # MaxPool2D wrapper 166 | return tf.nn.max_pool(x, ksize=[1, k, k, 1], strides=[1, strides, strides, 1], 167 | padding='VALID') 168 | 169 | # Create model 170 | def itracker_nets(self, eye_left, eye_right, face, face_mask, weights, biases): 171 | # pathway: left eye 172 | eye_left = self.conv2d(eye_left, weights['conv1_eye'], biases['conv1_eye'], strides=1) 173 | eye_left = self.maxpool2d(eye_left, k=pool1_eye_size, strides=pool1_eye_stride) 174 | 175 | eye_left = self.conv2d(eye_left, weights['conv2_eye'], biases['conv2_eye'], strides=1) 176 | eye_left = self.maxpool2d(eye_left, k=pool2_eye_size, strides=pool2_eye_stride) 177 | 178 | eye_left = self.conv2d(eye_left, weights['conv3_eye'], biases['conv3_eye'], strides=1) 179 | eye_left = self.maxpool2d(eye_left, k=pool3_eye_size, strides=pool3_eye_stride) 180 | 181 | eye_left = self.conv2d(eye_left, weights['conv4_eye'], biases['conv4_eye'], strides=1) 182 | eye_left = self.maxpool2d(eye_left, k=pool4_eye_size, strides=pool4_eye_stride) 183 | 184 | # pathway: right eye 185 | eye_right = self.conv2d(eye_right, weights['conv1_eye'], 
biases['conv1_eye'], strides=1) 186 | eye_right = self.maxpool2d(eye_right, k=pool1_eye_size, strides=pool1_eye_stride) 187 | 188 | eye_right = self.conv2d(eye_right, weights['conv2_eye'], biases['conv2_eye'], strides=1) 189 | eye_right = self.maxpool2d(eye_right, k=pool2_eye_size, strides=pool2_eye_stride) 190 | 191 | eye_right = self.conv2d(eye_right, weights['conv3_eye'], biases['conv3_eye'], strides=1) 192 | eye_right = self.maxpool2d(eye_right, k=pool3_eye_size, strides=pool3_eye_stride) 193 | 194 | eye_right = self.conv2d(eye_right, weights['conv4_eye'], biases['conv4_eye'], strides=1) 195 | eye_right = self.maxpool2d(eye_right, k=pool4_eye_size, strides=pool4_eye_stride) 196 | 197 | # pathway: face 198 | face = self.conv2d(face, weights['conv1_face'], biases['conv1_face'], strides=1) 199 | face = self.maxpool2d(face, k=pool1_face_size, strides=pool1_face_stride) 200 | 201 | face = self.conv2d(face, weights['conv2_face'], biases['conv2_face'], strides=1) 202 | face = self.maxpool2d(face, k=pool2_face_size, strides=pool2_face_stride) 203 | 204 | face = self.conv2d(face, weights['conv3_face'], biases['conv3_face'], strides=1) 205 | face = self.maxpool2d(face, k=pool3_face_size, strides=pool3_face_stride) 206 | 207 | face = self.conv2d(face, weights['conv4_face'], biases['conv4_face'], strides=1) 208 | face = self.maxpool2d(face, k=pool4_face_size, strides=pool4_face_stride) 209 | 210 | # fc layer 211 | # eye 212 | eye_left = tf.reshape(eye_left, [-1, int(np.prod(eye_left.get_shape()[1:]))]) 213 | eye_right = tf.reshape(eye_right, [-1, int(np.prod(eye_right.get_shape()[1:]))]) 214 | eye = tf.concat([eye_left, eye_right], 1) 215 | eye = tf.nn.relu(tf.add(tf.matmul(eye, weights['fc_eye']), biases['fc_eye'])) 216 | 217 | # face 218 | face = tf.reshape(face, [-1, int(np.prod(face.get_shape()[1:]))]) 219 | face = tf.nn.relu(tf.add(tf.matmul(face, weights['fc_face']), biases['fc_face'])) 220 | 221 | # face mask 222 | face_mask = tf.nn.relu(tf.add(tf.matmul(face_mask, weights['fc_face_mask']), biases['fc_face_mask'])) 223 | 224 | face_face_mask = tf.concat([face, face_mask], 1) 225 | face_face_mask = tf.nn.relu(tf.add(tf.matmul(face_face_mask, weights['face_face_mask']), biases['face_face_mask'])) 226 | 227 | # all 228 | fc = tf.concat([eye, face_face_mask], 1) 229 | fc = tf.nn.relu(tf.add(tf.matmul(fc, weights['fc']), biases['fc'])) 230 | out = tf.add(tf.matmul(fc, weights['fc2']), biases['fc2']) 231 | return out 232 | 233 | def train(self, train_data, val_data, lr=1e-3, batch_size=128, max_epoch=1000, min_delta=1e-4, patience=10, print_per_epoch=10, out_model='my_model'): 234 | ckpt = os.path.split(out_model)[0] 235 | if not os.path.exists(ckpt): 236 | os.makedirs(ckpt) 237 | 238 | print 'Train on %s samples, validate on %s samples' % (train_data[0].shape[0], val_data[0].shape[0]) 239 | # Define loss and optimizer 240 | self.cost = tf.losses.mean_squared_error(self.y, self.pred) 241 | self.optimizer = tf.train.AdamOptimizer(learning_rate=lr).minimize(self.cost) 242 | 243 | # Evaluate model 244 | self.err = tf.reduce_mean(tf.sqrt(tf.reduce_sum(tf.squared_difference(self.pred, self.y), axis=1))) 245 | train_loss_history = [] 246 | train_err_history = [] 247 | val_loss_history = [] 248 | val_err_history = [] 249 | n_incr_error = 0 # nb. 
of consecutive increase in error 250 | best_loss = np.Inf 251 | n_batches = train_data[0].shape[0] / batch_size + (train_data[0].shape[0] % batch_size != 0) 252 | val_n_batches = val_data[0].shape[0] / batch_size + (val_data[0].shape[0] % batch_size != 0) 253 | 254 | # Create the collection 255 | tf.get_collection("validation_nodes") 256 | # Add stuff to the collection. 257 | tf.add_to_collection("validation_nodes", self.eye_left) 258 | tf.add_to_collection("validation_nodes", self.eye_right) 259 | tf.add_to_collection("validation_nodes", self.face) 260 | tf.add_to_collection("validation_nodes", self.face_mask) 261 | tf.add_to_collection("validation_nodes", self.pred) 262 | saver = tf.train.Saver(max_to_keep=1) 263 | 264 | # Initializing the variables 265 | init = tf.global_variables_initializer() 266 | # Launch the graph 267 | with tf.Session() as sess: 268 | sess.run(init) 269 | writer = tf.summary.FileWriter("logs", sess.graph) 270 | 271 | # Keep training until reach max iterations 272 | for n_epoch in range(1, max_epoch + 1): 273 | n_incr_error += 1 274 | train_loss = 0. 275 | train_err = 0. 276 | train_data = shuffle_data(train_data) 277 | for batch_train_data in next_batch(train_data, batch_size): 278 | # Run optimization op (backprop) 279 | sess.run(self.optimizer, feed_dict={self.eye_left: batch_train_data[0], \ 280 | self.eye_right: batch_train_data[1], self.face: batch_train_data[2], \ 281 | self.face_mask: batch_train_data[3], self.y: batch_train_data[4]}) 282 | train_batch_loss, train_batch_err = sess.run([self.cost, self.err], feed_dict={self.eye_left: batch_train_data[0], \ 283 | self.eye_right: batch_train_data[1], self.face: batch_train_data[2], \ 284 | self.face_mask: batch_train_data[3], self.y: batch_train_data[4]}) 285 | train_loss += train_batch_loss / n_batches 286 | train_err += train_batch_err / n_batches 287 | 288 | val_loss = 0. 289 | val_err = 0 290 | for batch_val_data in next_batch(val_data, batch_size): 291 | val_batch_loss, val_batch_err = sess.run([self.cost, self.err], feed_dict={self.eye_left: batch_val_data[0], \ 292 | self.eye_right: batch_val_data[1], self.face: batch_val_data[2], \ 293 | self.face_mask: batch_val_data[3], self.y: batch_val_data[4]}) 294 | val_loss += val_batch_loss / val_n_batches 295 | val_err += val_batch_err / val_n_batches 296 | 297 | train_loss_history.append(train_loss) 298 | train_err_history.append(train_err) 299 | val_loss_history.append(val_loss) 300 | val_err_history.append(val_err) 301 | if val_loss - min_delta < best_loss: 302 | best_loss = val_loss 303 | save_path = saver.save(sess, out_model, global_step=n_epoch) 304 | print "Model saved in file: %s" % save_path 305 | n_incr_error = 0 306 | 307 | if n_epoch % print_per_epoch == 0: 308 | print 'Epoch %s/%s, train loss: %.5f, train error: %.5f, val loss: %.5f, val error: %.5f' % \ 309 | (n_epoch, max_epoch, train_loss, train_err, val_loss, val_err) 310 | 311 | if n_incr_error >= patience: 312 | print 'Early stopping occured. Optimization Finished!' 313 | return train_loss_history, train_err_history, val_loss_history, val_err_history 314 | 315 | return train_loss_history, train_err_history, val_loss_history, val_err_history 316 | 317 | def extract_validation_handles(session): 318 | """ Extracts the input and predict_op handles that we use for validation. 319 | Args: 320 | session: The session with the loaded graph. 321 | Returns: 322 | validation handles. 
323 | """ 324 | valid_nodes = tf.get_collection_ref("validation_nodes") 325 | if len(valid_nodes) != 5: 326 | raise Exception("ERROR: Expected 5 items in validation_nodes, got %d." % len(valid_nodes)) 327 | return valid_nodes 328 | 329 | def load_model(session, save_path): 330 | """ Loads a saved TF model from a file. 331 | Args: 332 | session: The tf.Session to use. 333 | save_path: The save path for the saved session, returned by Saver.save(). 334 | Returns: 335 | The inputs placehoder and the prediction operation. 336 | """ 337 | print "Loading model from file '%s'..." % save_path 338 | 339 | meta_file = save_path + ".meta" 340 | if not os.path.exists(meta_file): 341 | raise Exception("ERROR: Expected .meta file '%s', but could not find it." % meta_file) 342 | 343 | saver = tf.train.import_meta_graph(meta_file) 344 | # It's finicky about the save path. 345 | save_path = os.path.join("./", save_path) 346 | saver.restore(session, save_path) 347 | 348 | # Check that we have the handles we expected. 349 | return extract_validation_handles(session) 350 | 351 | def validate_model(session, val_data, val_ops, batch_size=200): 352 | """ Validates the model stored in a session. 353 | Args: 354 | session: The session where the model is loaded. 355 | val_data: The validation data to use for evaluating the model. 356 | val_ops: The validation operations. 357 | Returns: 358 | The overall validation error for the model. """ 359 | print "Validating model..." 360 | 361 | eye_left, eye_right, face, face_mask, pred = val_ops 362 | y = tf.placeholder(tf.float32, [None, 2], name='pos') 363 | err = tf.reduce_mean(tf.sqrt(tf.reduce_sum(tf.squared_difference(pred, y), axis=1))) 364 | # Validate the model. 365 | val_n_batches = val_data[0].shape[0] / batch_size + (val_data[0].shape[0] % batch_size != 0) 366 | val_err = 0 367 | for batch_val_data in next_batch(val_data, batch_size): 368 | val_batch_err = session.run(err, feed_dict={eye_left: batch_val_data[0], \ 369 | eye_right: batch_val_data[1], face: batch_val_data[2], \ 370 | face_mask: batch_val_data[3], y: batch_val_data[4]}) 371 | val_err += val_batch_err / val_n_batches 372 | return val_err 373 | 374 | def plot_loss(train_loss, train_err, test_err, start=0, per=1, save_file='loss.png'): 375 | assert len(train_err) == len(test_err) 376 | idx = np.arange(start, len(train_loss), per) 377 | fig, ax1 = plt.subplots() 378 | lns1 = ax1.plot(idx, train_loss[idx], 'b-', alpha=1.0, label='train loss') 379 | ax1.set_xlabel('epochs') 380 | # Make the y-axis label, ticks and tick labels match the line color. 
381 | ax1.set_ylabel('loss', color='b') 382 | ax1.tick_params('y', colors='b') 383 | 384 | ax2 = ax1.twinx() 385 | lns2 = ax2.plot(idx, train_err[idx], 'r-', alpha=1.0, label='train error') 386 | lns3 = ax2.plot(idx, test_err[idx], 'g-', alpha=1.0, label='test error') 387 | ax2.set_ylabel('error', color='r') 388 | ax2.tick_params('y', colors='r') 389 | 390 | # added these three lines 391 | lns = lns1 + lns2 + lns3 392 | labs = [l.get_label() for l in lns] 393 | ax1.legend(lns, labs, loc=0) 394 | 395 | fig.tight_layout() 396 | plt.savefig(save_file) 397 | # plt.show() 398 | 399 | def train(args): 400 | train_data, val_data = load_data(args.input) 401 | 402 | # train_size = 10 403 | # train_data = [each[:train_size] for each in train_data] 404 | # val_size = 1 405 | # val_data = [each[:val_size] for each in val_data] 406 | 407 | train_data = prepare_data(train_data) 408 | val_data = prepare_data(val_data) 409 | 410 | start = timeit.default_timer() 411 | et = EyeTracker() 412 | train_loss_history, train_err_history, val_loss_history, val_err_history = et.train(train_data, val_data, \ 413 | lr=args.learning_rate, \ 414 | batch_size=args.batch_size, \ 415 | max_epoch=args.max_epoch, \ 416 | min_delta=1e-4, \ 417 | patience=args.patience, \ 418 | print_per_epoch=args.print_per_epoch, 419 | out_model=args.save_model) 420 | 421 | print 'runtime: %.1fs' % (timeit.default_timer() - start) 422 | 423 | if args.save_loss: 424 | with open(args.save_loss, 'w') as outfile: 425 | np.savez(outfile, train_loss_history=train_loss_history, train_err_history=train_err_history, \ 426 | val_loss_history=val_loss_history, val_err_history=val_err_history) 427 | 428 | if args.plot_loss: 429 | plot_loss(np.array(train_loss_history), np.array(train_err_history), np.array(val_err_history), start=0, per=1, save_file=args.plot_loss) 430 | 431 | def test(args): 432 | _, val_data = load_data(args.input) 433 | 434 | # val_size = 10 435 | # val_data = [each[:val_size] for each in val_data] 436 | 437 | val_data = prepare_data(val_data) 438 | 439 | # Load and validate the network. 
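# load_model() re-imports the graph from the checkpoint's .meta file and
# restores the trained weights into this session, so no EyeTracker() instance
# is needed here; the input placeholders and the prediction op come back
# through the "validation_nodes" collection saved during training.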
440 | with tf.Session() as sess: 441 | val_ops = load_model(sess, args.load_model) 442 | error = validate_model(sess, val_data, val_ops, batch_size=args.batch_size) 443 | print 'Overall validation error: %f' % error 444 | 445 | def main(): 446 | parser = argparse.ArgumentParser() 447 | parser.add_argument('--train', action='store_true', help='train flag') 448 | parser.add_argument('-i', '--input', required=True, type=str, help='path to the input data') 449 | parser.add_argument('-max_epoch', '--max_epoch', type=int, default=100, help='max number of iterations') 450 | parser.add_argument('-lr', '--learning_rate', type=float, default=0.0025, help='learning rate') 451 | parser.add_argument('-bs', '--batch_size', type=int, default=200, help='batch size') 452 | parser.add_argument('-p', '--patience', type=int, default=5, help='early stopping patience') 453 | parser.add_argument('-pp_iter', '--print_per_epoch', type=int, default=1, help='print per iteration') 454 | parser.add_argument('-sm', '--save_model', type=str, default='my_model', help='path to the output model') 455 | parser.add_argument('-lm', '--load_model', type=str, help='path to the loaded model') 456 | parser.add_argument('-pl', '--plot_loss', type=str, default='loss.png', help='plot loss') 457 | parser.add_argument('-sl', '--save_loss', type=str, default='loss.npz', help='save loss') 458 | args = parser.parse_args() 459 | 460 | if args.train: 461 | train(args) 462 | else: 463 | if not args.load_model: 464 | raise Exception('load_model arg needed in test phase') 465 | test(args) 466 | 467 | if __name__ == '__main__': 468 | main() 469 | -------------------------------------------------------------------------------- /itracker_adv_arch.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hugochan/Eye-Tracker/10c8f692ef14a99cd1cf9818ade328f9965662b0/itracker_adv_arch.png -------------------------------------------------------------------------------- /itracker_arch.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hugochan/Eye-Tracker/10c8f692ef14a99cd1cf9818ade328f9965662b0/itracker_arch.png -------------------------------------------------------------------------------- /pretrained_models/itracker_adv/checkpoint: -------------------------------------------------------------------------------- 1 | model_checkpoint_path: "model-23" 2 | all_model_checkpoint_paths: "model-23" 3 | -------------------------------------------------------------------------------- /pretrained_models/itracker_adv/model-23.data-00000-of-00001: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hugochan/Eye-Tracker/10c8f692ef14a99cd1cf9818ade328f9965662b0/pretrained_models/itracker_adv/model-23.data-00000-of-00001 -------------------------------------------------------------------------------- /pretrained_models/itracker_adv/model-23.index: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hugochan/Eye-Tracker/10c8f692ef14a99cd1cf9818ade328f9965662b0/pretrained_models/itracker_adv/model-23.index -------------------------------------------------------------------------------- /pretrained_models/itracker_adv/model-23.meta: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/hugochan/Eye-Tracker/10c8f692ef14a99cd1cf9818ade328f9965662b0/pretrained_models/itracker_adv/model-23.meta -------------------------------------------------------------------------------- /validation_script.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python 2 | 3 | import argparse 4 | """ 5 | try: 6 | import cPickle as pickle 7 | except ImportError: 8 | # Python 3 9 | import pickle 10 | """ 11 | import os 12 | import sys 13 | 14 | import numpy as np 15 | 16 | import tensorflow as tf 17 | 18 | 19 | # How many images to include in each validation batch. This is just a default 20 | # value, and may be set differently to accommodate network parameters. 21 | batch_size = 1000 22 | 23 | 24 | def extract_validation_handles(session): 25 | """ Extracts the input and predict_op handles that we use for validation. 26 | Args: 27 | session: The session with the loaded graph. 28 | Returns: 29 | The input placeholders and the prediction operation. """ 30 | # The students should have saved their input placeholders, mask placeholder and prediction 31 | # operation in a collection called "validation_nodes". 32 | valid_nodes = tf.get_collection_ref("validation_nodes") 33 | if len(valid_nodes) != 5: 34 | print("ERROR: Expected 5 items in validation_nodes, got %d." % \ 35 | (len(valid_nodes))) 36 | sys.exit(1) 37 | 38 | # Figure out which is which. 39 | eye_left = valid_nodes[0] 40 | eye_right = valid_nodes[1] 41 | face = valid_nodes[2] 42 | face_mask = valid_nodes[3] 43 | predict = valid_nodes[4] 44 | """if type(valid_nodes[1]) == tf.placeholder: 45 | inputs = valid_nodes[1] 46 | predict = valid_nodes[0]""" 47 | 48 | # Check to make sure we've set the batch size correctly. 49 | global batch_size 50 | try: 51 | batch_size = int(eye_left.get_shape()[0]) 52 | print("WARNING: Network does not support variable batch sizes. (inputs)") 53 | except TypeError: 54 | # It's unspecified, which is actually correct. 55 | pass 56 | try: 57 | # I've also seen people who don't specify an input shape but do specify a 58 | # shape for the prediction operation. 59 | batch_size = int(predict.get_shape()[0]) 60 | print("WARNING: Network does not support variable batch sizes. (predict)") 61 | except TypeError: 62 | pass 63 | 64 | # Predict op should also yield integers. 65 | #predict = tf.cast(predict, "int32") 66 | 67 | # Check the shape of the prediction output. 68 | p_shape = predict.get_shape() 69 | # Commented these out because there could be squeezes in the code earlier. 70 | """ 71 | print p_shape 72 | if len(p_shape) > 2: 73 | print("ERROR: Expected prediction of shape (, 1), got shape of %s." % \ 74 | (str(p_shape))) 75 | sys.exit(1) 76 | if len(p_shape) == 2: 77 | if p_shape[1] != 1: 78 | print("ERROR: Expected prediction of shape (, 1), got shape of %s." % \ 79 | (str(p_shape))) 80 | sys.exit(1) 81 | 82 | # We need to contract it into a vector. 83 | predict = predict[:, 0]""" 84 | 85 | return (eye_left, eye_right, face, face_mask, predict) 86 | 87 | def load_model(session, save_path): 88 | """ Loads a saved TF model from a file. 89 | Args: 90 | session: The tf.Session to use. 91 | save_path: The save path for the saved session, returned by Saver.save(). 92 | Returns: 93 | The input placeholders and the prediction operation. 94 | """ 95 | print("Loading model from file '%s'..." 
% (save_path)) 96 | 97 | meta_file = save_path + ".meta" 98 | if not os.path.exists(meta_file): 99 | print("ERROR: Expected .meta file '%s', but could not find it." % \ 100 | (meta_file)) 101 | sys.exit(1) 102 | 103 | saver = tf.train.import_meta_graph(meta_file) 104 | # It's finicky about the save path. 105 | save_path = os.path.join("./", save_path) 106 | saver.restore(session, save_path) 107 | 108 | # Check that we have the handles we expected. 109 | return extract_validation_handles(session) 110 | 111 | def load_validation_data(val_filename): 112 | """ Loads the validation data. 113 | Args: 114 | val_filename: The file where the validation data is stored. 115 | Returns: 116 | A tuple of the loaded validation data and validation labels. """ 117 | print("Loading validation data...") 118 | 119 | npzfile = np.load(val_filename) 120 | val_eye_left = npzfile["val_eye_left"] 121 | val_eye_right = npzfile["val_eye_right"] 122 | val_face = npzfile["val_face"] 123 | val_face_mask = npzfile["val_face_mask"] 124 | val_y = npzfile["val_y"] 125 | 126 | return (val_eye_left, val_eye_right, val_face, val_face_mask, val_y) 127 | 128 | def validate_model(session, val_data, eye_left, eye_right, face, face_mask, predict_op): 129 | """ Validates the model stored in a session. 130 | Args: 131 | session: The session where the model is loaded. 132 | val_data: The validation data to use for evaluating the model. 133 | eye_left: The inputs placeholder. 134 | eye_right: The inputs placeholder. 135 | face: The inputs placeholder. 136 | face_mask: The inputs placeholder. 137 | predict_op: The prediction operation. 138 | Returns: 139 | The overall validation accuracy for the model. """ 140 | print("Validating model...") 141 | 142 | 143 | 144 | # Validate the model. 145 | val_eye_left, val_eye_right, val_face, val_face_mask, val_y = val_data 146 | num_iters = val_eye_left.shape[0] // batch_size 147 | 148 | err_val = [] 149 | for i in range(0, int(num_iters)): 150 | start_index = i * batch_size 151 | end_index = start_index + batch_size 152 | 153 | eye_left_batch = val_eye_left[start_index:end_index, :] 154 | eye_right_batch = val_eye_right[start_index:end_index, :] 155 | face_batch = val_face[start_index:end_index, :] 156 | # face_mask_batch = val_face_mask[start_index:end_index, :] 157 | face_mask_batch = np.reshape(val_face_mask[start_index:end_index, :], (batch_size, -1)) 158 | y_batch = val_y[start_index:end_index, :] 159 | 160 | 161 | 162 | print("Validating batch %d of %d..." % (i + 1, num_iters)) 163 | yp = session.run(predict_op, 164 | feed_dict={eye_left: eye_left_batch / 255., 165 | eye_right: eye_right_batch / 255., 166 | face: face_batch / 255., 167 | face_mask: face_mask_batch}) 168 | 169 | err = np.mean(np.sqrt(np.sum((yp - y_batch)**2, axis=1))) 170 | err_val.append(err) 171 | 172 | # Compute total error 173 | error = np.mean(err_val) 174 | return error 175 | 176 | def try_with_random_data(session, eye_left, eye_right, face, face_mask, predict_op): 177 | """ Tries putting random data through the network, mostly to make sure this 178 | works. 179 | Args: 180 | session: The session to use. 181 | inputs: The inputs placeholder. 182 | predict_op: The prediction operation. """ 183 | print("Trying random batch...") 184 | 185 | # Get a random batch. 
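# The random arrays must match the saved placeholders: eye and face inputs
# are (batch, 64, 64, 3), and the face mask placeholder expects a flattened
# (batch, 25 * 25) array, just as in validate_model() above.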
186 | eye_left_batch = np.random.rand(batch_size, 64, 64, 3) 187 | eye_right_batch = np.random.rand(batch_size, 64, 64, 3) 188 | face_batch = np.random.rand(batch_size, 64, 64, 3) 189 | face_mask_batch = np.random.rand(batch_size, 25 * 25) 190 | 191 | print("Batch of shape (%d, 64, 64, 3)" % (batch_size)) 192 | 193 | # Put it through the model. 194 | predictions = session.run(predict_op, feed_dict={eye_left: eye_left_batch, 195 | eye_right: eye_right_batch, 196 | face: face_batch, 197 | face_mask: face_mask_batch}) 198 | if np.isnan(predictions).any(): 199 | print("Warning: Got NaN value in prediction!") 200 | 201 | 202 | def main(): 203 | parser = argparse.ArgumentParser("Analyze student models.") 204 | parser.add_argument("-v", "--val_data_file", default=None, 205 | help="Validate the network with the data from this " + \ 206 | "npz file.") 207 | parser.add_argument("save_path", help="The base path for your saved model.") 208 | args = parser.parse_args() 209 | 210 | if not args.val_data_file: 211 | print("Not validating, but checking network compatibility...") 212 | elif not os.path.exists(args.val_data_file): 213 | print("ERROR: Could not find validation data '%s'." % (args.val_data_file)) 214 | sys.exit(1) 215 | 216 | # Load and validate the network. 217 | with tf.Session() as session: 218 | eye_left, eye_right, face, face_mask, predict_op = load_model(session, args.save_path) 219 | if args.val_data_file: 220 | val_data = load_validation_data(args.val_data_file) 221 | accuracy = validate_model(session, val_data, eye_left, eye_right, face, face_mask, predict_op) 222 | 223 | print("Overall validation error: %f cm" % (accuracy)) 224 | print("Network seems good. Go ahead and submit.") 225 | 226 | else: 227 | try_with_random_data(session, eye_left, eye_right, face, face_mask, predict_op) 228 | print("Network seems good. Go ahead and submit.") 229 | 230 | if __name__ == "__main__": 231 | main() 232 | --------------------------------------------------------------------------------