├── .gitignore
├── LICENSE
├── README.md
├── itracker.py
├── itracker_adv.py
├── itracker_adv_arch.png
├── itracker_arch.png
├── pretrained_models
│   └── itracker_adv
│       ├── checkpoint
│       ├── model-23.data-00000-of-00001
│       ├── model-23.index
│       └── model-23.meta
└── validation_script.py
/.gitignore:
--------------------------------------------------------------------------------
1 | ckpt/
2 | *.pyc
3 | *.npz
4 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | BSD 3-Clause License
2 |
3 | Copyright (c) 2017, Hugo Chan
4 | All rights reserved.
5 |
6 | Redistribution and use in source and binary forms, with or without
7 | modification, are permitted provided that the following conditions are met:
8 |
9 | * Redistributions of source code must retain the above copyright notice, this
10 | list of conditions and the following disclaimer.
11 |
12 | * Redistributions in binary form must reproduce the above copyright notice,
13 | this list of conditions and the following disclaimer in the documentation
14 | and/or other materials provided with the distribution.
15 |
16 | * Neither the name of the copyright holder nor the names of its
17 | contributors may be used to endorse or promote products derived from
18 | this software without specific prior written permission.
19 |
20 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
21 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
22 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
23 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
24 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
25 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
26 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
27 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
28 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
29 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
30 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Eye Tracker
2 | An implementation and improvement of the iTracker model proposed in the paper [Eye Tracking for Everyone](https://arxiv.org/abs/1606.05814).
3 |
4 | 
5 | *Figure 1: iTracker architecture*
6 |
7 | 
8 | *Figure 2: modified iTracker architecture*
9 |
10 | Figures 1 and 2 show the architectures of the iTracker model
11 | and the modified model. The only difference is that the modified model first concatenates
12 | the face layer FC-F1 with the face mask layer FC-FG1 and passes the result through a
13 | fully connected layer FC-F2, and only then concatenates the eye layer FC-E1 with FC-F2.
14 | We claim that this modified architecture is superior to the iTracker architecture.
15 | Intuitively, concatenating the face mask information directly with the eye information
16 | may confuse the model, since the face mask information is irrelevant to the eye information.
17 | While the iTracker model managed to learn this fact from the data, the modified model
18 | outperforms it because this knowledge is explicitly encoded in its architecture.
19 | In experiments, the modified model converged faster (28 epochs vs. 40+ epochs) and achieved a lower validation
20 | error (2.19 cm vs. 2.514 cm, measured as the mean Euclidean distance between predicted and true gaze positions).
21 | The iTracker model is implemented in itracker.py and the modified one is
22 | implemented in itracker_adv.py.
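
In code, the difference between the two fusion orders looks roughly like this (a simplified TF 1.x sketch using the layer names from the figures; the layer sizes follow itracker_adv.py, while the actual scripts build these layers from explicit weight matrices):

```python
import tensorflow as tf  # TF 1.x, as used by the scripts in this repo

fc_e1 = tf.placeholder(tf.float32, [None, 128])   # eye features (FC-E1)
fc_f1 = tf.placeholder(tf.float32, [None, 128])   # face features (FC-F1)
fc_fg1 = tf.placeholder(tf.float32, [None, 256])  # face mask features (FC-FG1)

# iTracker (itracker.py): all three streams are merged in a single concat
fc_in_itracker = tf.concat([fc_e1, fc_f1, fc_fg1], 1)

# modified model (itracker_adv.py): face and face mask are fused into FC-F2
# first, and only then concatenated with the eye features
fc_f2 = tf.layers.dense(tf.concat([fc_f1, fc_fg1], 1), 128, activation=tf.nn.relu)
fc_in_modified = tf.concat([fc_e1, fc_f2], 1)
```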
23 | Note that a smaller dataset (i.e., a subset of the full dataset in the original paper) was used in experiments and no data augmentation was applied.
24 | This smaller dataset contains 48,000 training samples and 5,000 validation samples.
25 | You can download this smaller dataset [here](http://hugochan.net/download/eye_tracker_train_and_val.npz).
26 |
27 | # Get started
28 | To train the model, run
29 | `python itracker_adv.py --train -i input_data -sm saved_model`
30 |
31 | To test the trained model, run
32 | `python itracker_adv.py -i input_data -lm saved_model`
33 |
34 | A model pretrained on the smaller dataset is available under the pretrained_models/itracker_adv/ folder.
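
For example, to evaluate the bundled pretrained model (assuming the smaller dataset has been downloaded as `eye_tracker_train_and_val.npz` into the working directory), run

`python itracker_adv.py -i eye_tracker_train_and_val.npz -lm pretrained_models/itracker_adv/model-23`

where `-lm` takes the checkpoint prefix, i.e. the path without the `.meta`/`.index`/`.data-*` suffixes.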
35 |
36 | # FAQ
37 | 1) What are the datasets?
38 |
39 | The original dataset comes from the [GazeCapture](http://gazecapture.csail.mit.edu/) project. It covers over 1,400 subjects and more than 2 million face images. Due to limited computation power, a much [smaller dataset](http://hugochan.net/download/eye_tracker_train_and_val.npz) with 48,000 training samples and 5,000 validation samples was used here. Each sample contains 5 items: face, left eye, right eye, face mask and labels.
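
A quick way to inspect the dataset file (a sketch; the key names and shapes follow `load_data()` and the placeholder definitions in the training scripts):

```python
import numpy as np

npz = np.load("eye_tracker_train_and_val.npz")
print(npz.files)  # ['train_eye_left', ..., 'val_y'] -- 5 arrays per split
print(npz["train_face"].shape)       # (48000, 64, 64, 3): 64x64 RGB face crops
print(npz["train_face_mask"].shape)  # (48000, 25, 25): binary face-position grid
print(npz["train_y"].shape)          # (48000, 2): gaze position in cm
```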
40 |
41 | # Other implementations
42 | For a PyTorch implementation, see [GazeCapture](https://github.com/CSAILVision/GazeCapture).
43 |
--------------------------------------------------------------------------------
/itracker.py:
--------------------------------------------------------------------------------
1 | import os
2 | import argparse
3 | import timeit
4 | import numpy as np
5 | import tensorflow as tf
6 | import matplotlib.pyplot as plt
7 |
8 |
9 | # Network Parameters
10 | img_size = 64
11 | n_channel = 3
12 | mask_size = 25
13 |
14 | # pathway: eye_left and eye_right
15 | conv1_eye_size = 11
16 | conv1_eye_out = 96
17 | pool1_eye_size = 2
18 | pool1_eye_stride = 2
19 |
20 | conv2_eye_size = 5
21 | conv2_eye_out = 256
22 | pool2_eye_size = 2
23 | pool2_eye_stride = 2
24 |
25 | conv3_eye_size = 3
26 | conv3_eye_out = 384
27 | pool3_eye_size = 2
28 | pool3_eye_stride = 2
29 |
30 | conv4_eye_size = 1
31 | conv4_eye_out = 64
32 | pool4_eye_size = 2
33 | pool4_eye_stride = 2
34 |
35 | eye_size = 2 * 2 * 2 * conv4_eye_out
36 |
37 | # pathway: face
38 | conv1_face_size = 11
39 | conv1_face_out = 96
40 | pool1_face_size = 2
41 | pool1_face_stride = 2
42 |
43 | conv2_face_size = 5
44 | conv2_face_out = 256
45 | pool2_face_size = 2
46 | pool2_face_stride = 2
47 |
48 | conv3_face_size = 3
49 | conv3_face_out = 384
50 | pool3_face_size = 2
51 | pool3_face_stride = 2
52 |
53 | conv4_face_size = 1
54 | conv4_face_out = 64
55 | pool4_face_size = 2
56 | pool4_face_stride = 2
57 |
58 | face_size = 2 * 2 * conv4_face_out
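# Size arithmetic behind the constants above (a sketch): with 64x64 inputs and
# VALID padding, each conv/pool pathway shrinks as
# 64 -conv11-> 54 -pool-> 27 -conv5-> 23 -pool-> 11 -conv3-> 9 -pool-> 4 -conv1-> 4 -pool-> 2,
# leaving a 2x2 map with conv4_*_out channels; eye_size carries an extra factor
# of 2 because the flattened left- and right-eye features are concatenated.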
59 |
60 | # fc layer
61 | fc_eye_size = 128
62 | fc_face_size = 128
63 | fc2_face_size = 64
64 | fc_face_mask_size = 256
65 | fc2_face_mask_size = 128
66 | fc_size = 128
67 | fc2_size = 2
68 |
69 |
70 | # Import data
71 | def load_data(file):
72 | npzfile = np.load(file)
73 | train_eye_left = npzfile["train_eye_left"]
74 | train_eye_right = npzfile["train_eye_right"]
75 | train_face = npzfile["train_face"]
76 | train_face_mask = npzfile["train_face_mask"]
77 | train_y = npzfile["train_y"]
78 | val_eye_left = npzfile["val_eye_left"]
79 | val_eye_right = npzfile["val_eye_right"]
80 | val_face = npzfile["val_face"]
81 | val_face_mask = npzfile["val_face_mask"]
82 | val_y = npzfile["val_y"]
83 | return [train_eye_left, train_eye_right, train_face, train_face_mask, train_y], [val_eye_left, val_eye_right, val_face, val_face_mask, val_y]
84 |
85 | def normalize(data):
86 | shape = data.shape
87 | data = np.reshape(data, (shape[0], -1))
88 | data = data.astype('float32') / 255. # scaling
89 | data = data - np.mean(data, axis=0) # normalizing
90 | return np.reshape(data, shape)
91 |
92 | def prepare_data(data):
93 | eye_left, eye_right, face, face_mask, y = data
94 | eye_left = normalize(eye_left)
95 | eye_right = normalize(eye_right)
96 | face = normalize(face)
97 | face_mask = np.reshape(face_mask, (face_mask.shape[0], -1)).astype('float32')
98 | y = y.astype('float32')
99 | return [eye_left, eye_right, face, face_mask, y]
100 |
101 | def shuffle_data(data):
102 | idx = np.arange(data[0].shape[0])
103 | np.random.shuffle(idx)
104 | for i in range(len(data)):
105 | data[i] = data[i][idx]
106 | return data
107 |
108 | def next_batch(data, batch_size):
109 | for i in np.arange(0, data[0].shape[0], batch_size):
110 | # yield a tuple of the current batched data
111 | yield [each[i: i + batch_size] for each in data]
112 |
113 | class EyeTracker(object):
114 | def __init__(self):
115 | # tf Graph input
116 | self.eye_left = tf.placeholder(tf.float32, [None, img_size, img_size, n_channel], name='eye_left')
117 | self.eye_right = tf.placeholder(tf.float32, [None, img_size, img_size, n_channel], name='eye_right')
118 | self.face = tf.placeholder(tf.float32, [None, img_size, img_size, n_channel], name='face')
119 | self.face_mask = tf.placeholder(tf.float32, [None, mask_size * mask_size], name='face_mask')
120 | self.y = tf.placeholder(tf.float32, [None, 2], name='pos')
121 | # Store layers weight & bias
122 | self.weights = {
123 | 'conv1_eye': tf.get_variable('conv1_eye_w', shape=(conv1_eye_size, conv1_eye_size, n_channel, conv1_eye_out), initializer=tf.contrib.layers.xavier_initializer()),
124 | 'conv2_eye': tf.get_variable('conv2_eye_w', shape=(conv2_eye_size, conv2_eye_size, conv1_eye_out, conv2_eye_out), initializer=tf.contrib.layers.xavier_initializer()),
125 | 'conv3_eye': tf.get_variable('conv3_eye_w', shape=(conv3_eye_size, conv3_eye_size, conv2_eye_out, conv3_eye_out), initializer=tf.contrib.layers.xavier_initializer()),
126 | 'conv4_eye': tf.get_variable('conv4_eye_w', shape=(conv4_eye_size, conv4_eye_size, conv3_eye_out, conv4_eye_out), initializer=tf.contrib.layers.xavier_initializer()),
127 | 'conv1_face': tf.get_variable('conv1_face_w', shape=(conv1_face_size, conv1_face_size, n_channel, conv1_face_out), initializer=tf.contrib.layers.xavier_initializer()),
128 | 'conv2_face': tf.get_variable('conv2_face_w', shape=(conv2_face_size, conv2_face_size, conv1_face_out, conv2_face_out), initializer=tf.contrib.layers.xavier_initializer()),
129 | 'conv3_face': tf.get_variable('conv3_face_w', shape=(conv3_face_size, conv3_face_size, conv2_face_out, conv3_face_out), initializer=tf.contrib.layers.xavier_initializer()),
130 | 'conv4_face': tf.get_variable('conv4_face_w', shape=(conv4_face_size, conv4_face_size, conv3_face_out, conv4_face_out), initializer=tf.contrib.layers.xavier_initializer()),
131 | 'fc_eye': tf.get_variable('fc_eye_w', shape=(eye_size, fc_eye_size), initializer=tf.contrib.layers.xavier_initializer()),
132 | 'fc_face': tf.get_variable('fc_face_w', shape=(face_size, fc_face_size), initializer=tf.contrib.layers.xavier_initializer()),
133 | 'fc2_face': tf.get_variable('fc2_face_w', shape=(fc_face_size, fc2_face_size), initializer=tf.contrib.layers.xavier_initializer()),
134 | 'fc_face_mask': tf.get_variable('fc_face_mask_w', shape=(mask_size * mask_size, fc_face_mask_size), initializer=tf.contrib.layers.xavier_initializer()),
135 | 'fc2_face_mask': tf.get_variable('fc2_face_mask_w', shape=(fc_face_mask_size, fc2_face_mask_size), initializer=tf.contrib.layers.xavier_initializer()),
136 | 'fc': tf.get_variable('fc_w', shape=(fc_eye_size + fc2_face_size + fc2_face_mask_size, fc_size), initializer=tf.contrib.layers.xavier_initializer()),
137 | 'fc2': tf.get_variable('fc2_w', shape=(fc_size, fc2_size), initializer=tf.contrib.layers.xavier_initializer())
138 | }
139 | self.biases = {
140 | 'conv1_eye': tf.Variable(tf.constant(0.1, shape=[conv1_eye_out])),
141 | 'conv2_eye': tf.Variable(tf.constant(0.1, shape=[conv2_eye_out])),
142 | 'conv3_eye': tf.Variable(tf.constant(0.1, shape=[conv3_eye_out])),
143 | 'conv4_eye': tf.Variable(tf.constant(0.1, shape=[conv4_eye_out])),
144 | 'conv1_face': tf.Variable(tf.constant(0.1, shape=[conv1_face_out])),
145 | 'conv2_face': tf.Variable(tf.constant(0.1, shape=[conv2_face_out])),
146 | 'conv3_face': tf.Variable(tf.constant(0.1, shape=[conv3_face_out])),
147 | 'conv4_face': tf.Variable(tf.constant(0.1, shape=[conv4_face_out])),
148 | 'fc_eye': tf.Variable(tf.constant(0.1, shape=[fc_eye_size])),
149 | 'fc_face': tf.Variable(tf.constant(0.1, shape=[fc_face_size])),
150 | 'fc2_face': tf.Variable(tf.constant(0.1, shape=[fc2_face_size])),
151 | 'fc_face_mask': tf.Variable(tf.constant(0.1, shape=[fc_face_mask_size])),
152 | 'fc2_face_mask': tf.Variable(tf.constant(0.1, shape=[fc2_face_mask_size])),
153 | 'fc': tf.Variable(tf.constant(0.1, shape=[fc_size])),
154 | 'fc2': tf.Variable(tf.constant(0.1, shape=[fc2_size]))
155 | }
156 |
157 | # Construct model
158 | self.pred = self.itracker_nets(self.eye_left, self.eye_right, self.face, self.face_mask, self.weights, self.biases)
159 |
160 | # Create some wrappers for simplicity
161 | def conv2d(self, x, W, b, strides=1):
162 | # Conv2D wrapper, with bias and relu activation
163 | x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='VALID')
164 | x = tf.nn.bias_add(x, b)
165 | return tf.nn.relu(x)
166 |
167 | def maxpool2d(self, x, k, strides):
168 | # MaxPool2D wrapper
169 | return tf.nn.max_pool(x, ksize=[1, k, k, 1], strides=[1, strides, strides, 1],
170 | padding='VALID')
171 |
172 | # Create model
173 | def itracker_nets(self, eye_left, eye_right, face, face_mask, weights, biases):
174 | # pathway: left eye
175 | eye_left = self.conv2d(eye_left, weights['conv1_eye'], biases['conv1_eye'], strides=1)
176 | eye_left = self.maxpool2d(eye_left, k=pool1_eye_size, strides=pool1_eye_stride)
177 |
178 | eye_left = self.conv2d(eye_left, weights['conv2_eye'], biases['conv2_eye'], strides=1)
179 | eye_left = self.maxpool2d(eye_left, k=pool2_eye_size, strides=pool2_eye_stride)
180 |
181 | eye_left = self.conv2d(eye_left, weights['conv3_eye'], biases['conv3_eye'], strides=1)
182 | eye_left = self.maxpool2d(eye_left, k=pool3_eye_size, strides=pool3_eye_stride)
183 |
184 | eye_left = self.conv2d(eye_left, weights['conv4_eye'], biases['conv4_eye'], strides=1)
185 | eye_left = self.maxpool2d(eye_left, k=pool4_eye_size, strides=pool4_eye_stride)
186 |
187 | # pathway: right eye
188 | eye_right = self.conv2d(eye_right, weights['conv1_eye'], biases['conv1_eye'], strides=1)
189 | eye_right = self.maxpool2d(eye_right, k=pool1_eye_size, strides=pool1_eye_stride)
190 |
191 | eye_right = self.conv2d(eye_right, weights['conv2_eye'], biases['conv2_eye'], strides=1)
192 | eye_right = self.maxpool2d(eye_right, k=pool2_eye_size, strides=pool2_eye_stride)
193 |
194 | eye_right = self.conv2d(eye_right, weights['conv3_eye'], biases['conv3_eye'], strides=1)
195 | eye_right = self.maxpool2d(eye_right, k=pool3_eye_size, strides=pool3_eye_stride)
196 |
197 | eye_right = self.conv2d(eye_right, weights['conv4_eye'], biases['conv4_eye'], strides=1)
198 | eye_right = self.maxpool2d(eye_right, k=pool4_eye_size, strides=pool4_eye_stride)
199 |
200 | # pathway: face
201 | face = self.conv2d(face, weights['conv1_face'], biases['conv1_face'], strides=1)
202 | face = self.maxpool2d(face, k=pool1_face_size, strides=pool1_face_stride)
203 |
204 | face = self.conv2d(face, weights['conv2_face'], biases['conv2_face'], strides=1)
205 | face = self.maxpool2d(face, k=pool2_face_size, strides=pool2_face_stride)
206 |
207 | face = self.conv2d(face, weights['conv3_face'], biases['conv3_face'], strides=1)
208 | face = self.maxpool2d(face, k=pool3_face_size, strides=pool3_face_stride)
209 |
210 | face = self.conv2d(face, weights['conv4_face'], biases['conv4_face'], strides=1)
211 | face = self.maxpool2d(face, k=pool4_face_size, strides=pool4_face_stride)
212 |
213 | # fc layer
214 | # eye
215 | eye_left = tf.reshape(eye_left, [-1, int(np.prod(eye_left.get_shape()[1:]))])
216 | eye_right = tf.reshape(eye_right, [-1, int(np.prod(eye_right.get_shape()[1:]))])
217 | eye = tf.concat([eye_left, eye_right], 1)
218 | eye = tf.nn.relu(tf.add(tf.matmul(eye, weights['fc_eye']), biases['fc_eye']))
219 |
220 | # face
221 | face = tf.reshape(face, [-1, int(np.prod(face.get_shape()[1:]))])
222 | face = tf.nn.relu(tf.add(tf.matmul(face, weights['fc_face']), biases['fc_face']))
223 | face = tf.nn.relu(tf.add(tf.matmul(face, weights['fc2_face']), biases['fc2_face']))
224 |
225 | # face mask
226 | face_mask = tf.nn.relu(tf.add(tf.matmul(face_mask, weights['fc_face_mask']), biases['fc_face_mask']))
227 | face_mask = tf.nn.relu(tf.add(tf.matmul(face_mask, weights['fc2_face_mask']), biases['fc2_face_mask']))
228 |
229 | # all
230 | fc = tf.concat([eye, face, face_mask], 1)
231 | fc = tf.nn.relu(tf.add(tf.matmul(fc, weights['fc']), biases['fc']))
232 | out = tf.add(tf.matmul(fc, weights['fc2']), biases['fc2'])
233 | return out
234 |
235 | def train(self, train_data, val_data, lr=1e-3, batch_size=128, max_epoch=1000, min_delta=1e-4, patience=10, print_per_epoch=10, out_model='my_model'):
236 | ckpt = os.path.split(out_model)[0]
237 | if ckpt and not os.path.exists(ckpt): # guard against an empty dirname, e.g. out_model='my_model'
238 | os.makedirs(ckpt)
239 |
240 | print('Train on %s samples, validate on %s samples' % (train_data[0].shape[0], val_data[0].shape[0]))
241 | # Define loss and optimizer
242 | self.cost = tf.losses.mean_squared_error(self.y, self.pred)
243 | self.optimizer = tf.train.AdamOptimizer(learning_rate=lr).minimize(self.cost)
244 |
245 | # Evaluate model
246 | self.err = tf.reduce_mean(tf.sqrt(tf.reduce_sum(tf.squared_difference(self.pred, self.y), axis=1)))
247 | train_loss_history = []
248 | train_err_history = []
249 | val_loss_history = []
250 | val_err_history = []
251 | n_incr_error = 0 # number of consecutive epochs without sufficient improvement
252 | best_loss = np.Inf
253 | n_batches = train_data[0].shape[0] // batch_size + (train_data[0].shape[0] % batch_size != 0) # ceiling division
254 |
255 | # Create the collection
256 | tf.get_collection("validation_nodes")
257 | # Add stuff to the collection.
258 | tf.add_to_collection("validation_nodes", self.eye_left)
259 | tf.add_to_collection("validation_nodes", self.eye_right)
260 | tf.add_to_collection("validation_nodes", self.face)
261 | tf.add_to_collection("validation_nodes", self.face_mask)
262 | tf.add_to_collection("validation_nodes", self.pred)
263 | saver = tf.train.Saver(max_to_keep=1)
264 |
265 | # Initializing the variables
266 | init = tf.global_variables_initializer()
267 | # Launch the graph
268 | with tf.Session() as sess:
269 | sess.run(init)
270 | # Keep training until reach max iterations
271 | for n_epoch in range(1, max_epoch + 1):
272 | n_incr_error += 1
273 | train_loss = 0.
274 | val_loss = 0.
275 | train_err = 0.
276 | val_err = 0.
277 | train_data = shuffle_data(train_data)
278 | for batch_train_data in next_batch(train_data, batch_size):
279 | # Run optimization op (backprop)
280 | sess.run(self.optimizer, feed_dict={self.eye_left: batch_train_data[0], \
281 | self.eye_right: batch_train_data[1], self.face: batch_train_data[2], \
282 | self.face_mask: batch_train_data[3], self.y: batch_train_data[4]})
283 | train_batch_loss, train_batch_err = sess.run([self.cost, self.err], feed_dict={self.eye_left: batch_train_data[0], \
284 | self.eye_right: batch_train_data[1], self.face: batch_train_data[2], \
285 | self.face_mask: batch_train_data[3], self.y: batch_train_data[4]})
286 | train_loss += train_batch_loss / n_batches
287 | train_err += train_batch_err / n_batches
288 | val_loss, val_err = sess.run([self.cost, self.err], feed_dict={self.eye_left: val_data[0], \
289 | self.eye_right: val_data[1], self.face: val_data[2], \
290 | self.face_mask: val_data[3], self.y: val_data[4]})
291 |
292 | train_loss_history.append(train_loss)
293 | train_err_history.append(train_err)
294 | val_loss_history.append(val_loss)
295 | val_err_history.append(val_err)
296 | if val_loss - min_delta < best_loss: # a loss within min_delta of the best counts as an improvement
297 | best_loss = val_loss
298 | save_path = saver.save(sess, out_model, global_step=n_epoch)
299 | print "Model saved in file: %s" % save_path
300 | n_incr_error = 0
301 |
302 | if n_epoch % print_per_epoch == 0:
303 | print('Epoch %s/%s, train loss: %.5f, train error: %.5f, val loss: %.5f, val error: %.5f' %
304 | (n_epoch, max_epoch, train_loss, train_err, val_loss, val_err))
305 |
306 | if n_incr_error >= patience:
307 | print('Early stopping occurred. Optimization finished!')
308 | return train_loss_history, train_err_history, val_loss_history, val_err_history
309 |
310 | return train_loss_history, train_err_history, val_loss_history, val_err_history
311 |
312 | def extract_validation_handles(session):
313 | """ Extracts the input and predict_op handles that we use for validation.
314 | Args:
315 | session: The session with the loaded graph.
316 | Returns:
317 | validation handles.
318 | """
319 | valid_nodes = tf.get_collection_ref("validation_nodes")
320 | if len(valid_nodes) != 5:
321 | raise Exception("ERROR: Expected 5 items in validation_nodes, got %d." % len(valid_nodes))
322 | return valid_nodes
323 |
324 | def load_model(session, save_path):
325 | """ Loads a saved TF model from a file.
326 | Args:
327 | session: The tf.Session to use.
328 | save_path: The save path for the saved session, returned by Saver.save().
329 | Returns:
330 | The input placeholders and the prediction operation.
331 | """
332 | print "Loading model from file '%s'..." % save_path
333 |
334 | meta_file = save_path + ".meta"
335 | if not os.path.exists(meta_file):
336 | raise Exception("ERROR: Expected .meta file '%s', but could not find it." % meta_file)
337 |
338 | saver = tf.train.import_meta_graph(meta_file)
339 | # It's finicky about the save path.
340 | save_path = os.path.join("./", save_path)
341 | saver.restore(session, save_path)
342 |
343 | # Check that we have the handles we expected.
344 | return extract_validation_handles(session)
345 |
346 | def validate_model(session, val_data, val_ops):
347 | """ Validates the model stored in a session.
348 | Args:
349 | session: The session where the model is loaded.
350 | val_data: The validation data to use for evaluating the model.
351 | val_ops: The validation operations.
352 | Returns:
353 | The overall validation error for the model. """
354 | print "Validating model..."
355 |
356 | eye_left, eye_right, face, face_mask, pred = val_ops
357 | val_eye_left, val_eye_right, val_face, val_face_mask, val_y = val_data
358 | y = tf.placeholder(tf.float32, [None, 2], name='pos')
359 | err = tf.reduce_mean(tf.sqrt(tf.reduce_sum(tf.squared_difference(pred, y), axis=1)))
360 | # Validate the model.
361 | error = session.run(err, feed_dict={eye_left: val_eye_left, \
362 | eye_right: val_eye_right, face: val_face, \
363 | face_mask: val_face_mask, y: val_y})
364 | return error
365 |
366 | def plot_loss(train_loss, train_err, test_err, start=0, per=1, save_file='loss.png'):
367 | assert len(train_err) == len(test_err)
368 | idx = np.arange(start, len(train_loss), per)
369 | fig, ax1 = plt.subplots()
370 | lns1 = ax1.plot(idx, train_loss[idx], 'b-', alpha=1.0, label='train loss')
371 | ax1.set_xlabel('epochs')
372 | # Make the y-axis label, ticks and tick labels match the line color.
373 | ax1.set_ylabel('loss', color='b')
374 | ax1.tick_params('y', colors='b')
375 |
376 | ax2 = ax1.twinx()
377 | lns2 = ax2.plot(idx, train_err[idx], 'r-', alpha=1.0, label='train error')
378 | lns3 = ax2.plot(idx, test_err[idx], 'g-', alpha=1.0, label='test error')
379 | ax2.set_ylabel('error', color='r')
380 | ax2.tick_params('y', colors='r')
381 |
382 | # added these three lines
383 | lns = lns1 + lns2 + lns3
384 | labs = [l.get_label() for l in lns]
385 | ax1.legend(lns, labs, loc=0)
386 |
387 | fig.tight_layout()
388 | plt.savefig(save_file)
389 | # plt.show()
390 |
391 | def train(args):
392 | train_data, val_data = load_data(args.input)
393 |
394 | # train_size = 10
395 | # train_data = [each[:train_size] for each in train_data]
396 | # val_size = 1
397 | # val_data = [each[:val_size] for each in val_data]
398 | train_data = prepare_data(train_data)
399 | val_data = prepare_data(val_data)
400 |
401 | start = timeit.default_timer()
402 | et = EyeTracker()
403 | train_loss_history, train_err_history, val_loss_history, val_err_history = et.train(train_data, val_data, \
404 | lr=args.learning_rate, \
405 | batch_size=args.batch_size, \
406 | max_epoch=args.max_epoch, \
407 | min_delta=1e-4, \
408 | patience=args.patience, \
409 | print_per_epoch=args.print_per_epoch,
410 | out_model=args.save_model)
411 |
412 | print('runtime: %.1fs' % (timeit.default_timer() - start))
413 |
414 | if args.save_loss:
415 | with open(args.save_loss, 'wb') as outfile: # np.savez needs a binary-mode file
416 | np.savez(outfile, train_loss_history=train_loss_history, train_err_history=train_err_history, \
417 | val_loss_history=val_loss_history, val_err_history=val_err_history)
418 |
419 | if args.plot_loss:
420 | plot_loss(np.array(train_loss_history), np.array(train_err_history), np.array(val_err_history), start=0, per=1, save_file=args.plot_loss)
421 |
422 | def test(args):
423 | _, val_data = load_data(args.input)
424 |
425 | # val_size = 10
426 | # val_data = [each[:val_size] for each in val_data]
427 |
428 | val_data = prepare_data(val_data)
429 |
430 | # Load and validate the network.
431 | with tf.Session() as sess:
432 | val_ops = load_model(sess, args.load_model)
433 | error = validate_model(sess, val_data, val_ops)
434 | print('Overall validation error: %f' % error)
435 |
436 | def main():
437 | parser = argparse.ArgumentParser()
438 | parser.add_argument('--train', action='store_true', help='train flag')
439 | parser.add_argument('-i', '--input', required=True, type=str, help='path to the input data')
440 | parser.add_argument('-max_epoch', '--max_epoch', type=int, default=100, help='max number of epochs (default 100)')
441 | parser.add_argument('-lr', '--learning_rate', type=float, default=0.002, help='learning rate (default 0.002)')
442 | parser.add_argument('-bs', '--batch_size', type=int, default=128, help='batch size (default 128)')
443 | parser.add_argument('-p', '--patience', type=int, default=5, help='early stopping patience (default 5)')
444 | parser.add_argument('-pp_iter', '--print_per_epoch', type=int, default=1, help='print every N epochs (default 1)')
445 | parser.add_argument('-sm', '--save_model', type=str, default='my_model', help='path to the output model (default my_model)')
446 | parser.add_argument('-lm', '--load_model', type=str, help='path to the loaded model')
447 | parser.add_argument('-pf', '--plot_filter', type=str, default='filter.png', help='plot filters')
448 | parser.add_argument('-pl', '--plot_loss', type=str, default='loss.png', help='plot loss')
449 | parser.add_argument('-sl', '--save_loss', type=str, default='loss.npz', help='save loss')
450 | args = parser.parse_args()
451 |
452 | if args.train:
453 | train(args)
454 | else:
455 | if not args.load_model:
456 | raise Exception('load_model arg needed in test phase')
457 | test(args)
458 |
459 | if __name__ == '__main__':
460 | main()
461 |
--------------------------------------------------------------------------------
/itracker_adv.py:
--------------------------------------------------------------------------------
1 | import os
2 | import argparse
3 | import timeit
4 | import numpy as np
5 | import tensorflow as tf
6 | import matplotlib.pyplot as plt
7 |
8 |
9 | # Network Parameters
10 | img_size = 64
11 | n_channel = 3
12 | mask_size = 25
13 |
14 | # pathway: eye_left and eye_right
15 | conv1_eye_size = 11
16 | conv1_eye_out = 96
17 | pool1_eye_size = 2
18 | pool1_eye_stride = 2
19 |
20 | conv2_eye_size = 5
21 | conv2_eye_out = 256
22 | pool2_eye_size = 2
23 | pool2_eye_stride = 2
24 |
25 | conv3_eye_size = 3
26 | conv3_eye_out = 384
27 | pool3_eye_size = 2
28 | pool3_eye_stride = 2
29 |
30 | conv4_eye_size = 1
31 | conv4_eye_out = 64
32 | pool4_eye_size = 2
33 | pool4_eye_stride = 2
34 |
35 | eye_size = 2 * 2 * 2 * conv4_eye_out # final 2x2 feature map per eye, times 2 eyes
36 |
37 | # pathway: face
38 | conv1_face_size = 11
39 | conv1_face_out = 96
40 | pool1_face_size = 2
41 | pool1_face_stride = 2
42 |
43 | conv2_face_size = 5
44 | conv2_face_out = 256
45 | pool2_face_size = 2
46 | pool2_face_stride = 2
47 |
48 | conv3_face_size = 3
49 | conv3_face_out = 384
50 | pool3_face_size = 2
51 | pool3_face_stride = 2
52 |
53 | conv4_face_size = 1
54 | conv4_face_out = 64
55 | pool4_face_size = 2
56 | pool4_face_stride = 2
57 |
58 | face_size = 2 * 2 * conv4_face_out # final 2x2 feature map of the face pathway
59 |
60 | # fc layer
61 | fc_eye_size = 128
62 | fc_face_size = 128
63 | fc_face_mask_size = 256
64 | face_face_mask_size = 128 # size of the fused face + face-mask layer (FC-F2)
65 | fc_size = 128
66 | fc2_size = 2
67 |
68 |
69 | # Import data
70 | def load_data(file):
71 | npzfile = np.load(file)
72 | train_eye_left = npzfile["train_eye_left"]
73 | train_eye_right = npzfile["train_eye_right"]
74 | train_face = npzfile["train_face"]
75 | train_face_mask = npzfile["train_face_mask"]
76 | train_y = npzfile["train_y"]
77 | val_eye_left = npzfile["val_eye_left"]
78 | val_eye_right = npzfile["val_eye_right"]
79 | val_face = npzfile["val_face"]
80 | val_face_mask = npzfile["val_face_mask"]
81 | val_y = npzfile["val_y"]
82 | return [train_eye_left, train_eye_right, train_face, train_face_mask, train_y], [val_eye_left, val_eye_right, val_face, val_face_mask, val_y]
83 |
84 | def normalize(data):
85 | shape = data.shape
86 | data = np.reshape(data, (shape[0], -1))
87 | data = data.astype('float32') / 255. # scaling
88 | data = data - np.mean(data, axis=0) # normalizing
89 | return np.reshape(data, shape)
90 |
91 | def prepare_data(data):
92 | eye_left, eye_right, face, face_mask, y = data
93 | eye_left = normalize(eye_left)
94 | eye_right = normalize(eye_right)
95 | face = normalize(face)
96 | face_mask = np.reshape(face_mask, (face_mask.shape[0], -1)).astype('float32')
97 | y = y.astype('float32')
98 | return [eye_left, eye_right, face, face_mask, y]
99 |
100 | def shuffle_data(data):
101 | idx = np.arange(data[0].shape[0])
102 | np.random.shuffle(idx)
103 | for i in range(len(data)):
104 | data[i] = data[i][idx]
105 | return data
106 |
107 | def next_batch(data, batch_size):
108 | for i in np.arange(0, data[0].shape[0], batch_size):
109 | # yield a tuple of the current batched data
110 | yield [each[i: i + batch_size] for each in data]
111 |
112 | class EyeTracker(object):
113 | def __init__(self):
114 | # tf Graph input
115 | self.eye_left = tf.placeholder(tf.float32, [None, img_size, img_size, n_channel], name='eye_left')
116 | self.eye_right = tf.placeholder(tf.float32, [None, img_size, img_size, n_channel], name='eye_right')
117 | self.face = tf.placeholder(tf.float32, [None, img_size, img_size, n_channel], name='face')
118 | self.face_mask = tf.placeholder(tf.float32, [None, mask_size * mask_size], name='face_mask')
119 | self.y = tf.placeholder(tf.float32, [None, 2], name='pos')
120 | # Store layers weight & bias
121 | self.weights = {
122 | 'conv1_eye': tf.get_variable('conv1_eye_w', shape=(conv1_eye_size, conv1_eye_size, n_channel, conv1_eye_out), initializer=tf.contrib.layers.xavier_initializer()),
123 | 'conv2_eye': tf.get_variable('conv2_eye_w', shape=(conv2_eye_size, conv2_eye_size, conv1_eye_out, conv2_eye_out), initializer=tf.contrib.layers.xavier_initializer()),
124 | 'conv3_eye': tf.get_variable('conv3_eye_w', shape=(conv3_eye_size, conv3_eye_size, conv2_eye_out, conv3_eye_out), initializer=tf.contrib.layers.xavier_initializer()),
125 | 'conv4_eye': tf.get_variable('conv4_eye_w', shape=(conv4_eye_size, conv4_eye_size, conv3_eye_out, conv4_eye_out), initializer=tf.contrib.layers.xavier_initializer()),
126 | 'conv1_face': tf.get_variable('conv1_face_w', shape=(conv1_face_size, conv1_face_size, n_channel, conv1_face_out), initializer=tf.contrib.layers.xavier_initializer()),
127 | 'conv2_face': tf.get_variable('conv2_face_w', shape=(conv2_face_size, conv2_face_size, conv1_face_out, conv2_face_out), initializer=tf.contrib.layers.xavier_initializer()),
128 | 'conv3_face': tf.get_variable('conv3_face_w', shape=(conv3_face_size, conv3_face_size, conv2_face_out, conv3_face_out), initializer=tf.contrib.layers.xavier_initializer()),
129 | 'conv4_face': tf.get_variable('conv4_face_w', shape=(conv4_face_size, conv4_face_size, conv3_face_out, conv4_face_out), initializer=tf.contrib.layers.xavier_initializer()),
130 | 'fc_eye': tf.get_variable('fc_eye_w', shape=(eye_size, fc_eye_size), initializer=tf.contrib.layers.xavier_initializer()),
131 | 'fc_face': tf.get_variable('fc_face_w', shape=(face_size, fc_face_size), initializer=tf.contrib.layers.xavier_initializer()),
132 | 'fc_face_mask': tf.get_variable('fc_face_mask_w', shape=(mask_size * mask_size, fc_face_mask_size), initializer=tf.contrib.layers.xavier_initializer()),
133 | 'face_face_mask': tf.get_variable('face_face_mask_w', shape=(fc_face_size + fc_face_mask_size, face_face_mask_size), initializer=tf.contrib.layers.xavier_initializer()),
134 | 'fc': tf.get_variable('fc_w', shape=(fc_eye_size + face_face_mask_size, fc_size), initializer=tf.contrib.layers.xavier_initializer()),
135 | 'fc2': tf.get_variable('fc2_w', shape=(fc_size, fc2_size), initializer=tf.contrib.layers.xavier_initializer())
136 | }
137 | self.biases = {
138 | 'conv1_eye': tf.Variable(tf.constant(0.1, shape=[conv1_eye_out])),
139 | 'conv2_eye': tf.Variable(tf.constant(0.1, shape=[conv2_eye_out])),
140 | 'conv3_eye': tf.Variable(tf.constant(0.1, shape=[conv3_eye_out])),
141 | 'conv4_eye': tf.Variable(tf.constant(0.1, shape=[conv4_eye_out])),
142 | 'conv1_face': tf.Variable(tf.constant(0.1, shape=[conv1_face_out])),
143 | 'conv2_face': tf.Variable(tf.constant(0.1, shape=[conv2_face_out])),
144 | 'conv3_face': tf.Variable(tf.constant(0.1, shape=[conv3_face_out])),
145 | 'conv4_face': tf.Variable(tf.constant(0.1, shape=[conv4_face_out])),
146 | 'fc_eye': tf.Variable(tf.constant(0.1, shape=[fc_eye_size])),
147 | 'fc_face': tf.Variable(tf.constant(0.1, shape=[fc_face_size])),
148 | 'fc_face_mask': tf.Variable(tf.constant(0.1, shape=[fc_face_mask_size])),
149 | 'face_face_mask': tf.Variable(tf.constant(0.1, shape=[face_face_mask_size])),
150 | 'fc': tf.Variable(tf.constant(0.1, shape=[fc_size])),
151 | 'fc2': tf.Variable(tf.constant(0.1, shape=[fc2_size]))
152 | }
153 |
154 | # Construct model
155 | self.pred = self.itracker_nets(self.eye_left, self.eye_right, self.face, self.face_mask, self.weights, self.biases)
156 |
157 | # Create some wrappers for simplicity
158 | def conv2d(self, x, W, b, strides=1):
159 | # Conv2D wrapper, with bias and relu activation
160 | x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='VALID')
161 | x = tf.nn.bias_add(x, b)
162 | return tf.nn.relu(x)
163 |
164 | def maxpool2d(self, x, k, strides):
165 | # MaxPool2D wrapper
166 | return tf.nn.max_pool(x, ksize=[1, k, k, 1], strides=[1, strides, strides, 1],
167 | padding='VALID')
168 |
169 | # Create model
170 | def itracker_nets(self, eye_left, eye_right, face, face_mask, weights, biases):
171 | # pathway: left eye
172 | eye_left = self.conv2d(eye_left, weights['conv1_eye'], biases['conv1_eye'], strides=1)
173 | eye_left = self.maxpool2d(eye_left, k=pool1_eye_size, strides=pool1_eye_stride)
174 |
175 | eye_left = self.conv2d(eye_left, weights['conv2_eye'], biases['conv2_eye'], strides=1)
176 | eye_left = self.maxpool2d(eye_left, k=pool2_eye_size, strides=pool2_eye_stride)
177 |
178 | eye_left = self.conv2d(eye_left, weights['conv3_eye'], biases['conv3_eye'], strides=1)
179 | eye_left = self.maxpool2d(eye_left, k=pool3_eye_size, strides=pool3_eye_stride)
180 |
181 | eye_left = self.conv2d(eye_left, weights['conv4_eye'], biases['conv4_eye'], strides=1)
182 | eye_left = self.maxpool2d(eye_left, k=pool4_eye_size, strides=pool4_eye_stride)
183 |
184 | # pathway: right eye
185 | eye_right = self.conv2d(eye_right, weights['conv1_eye'], biases['conv1_eye'], strides=1)
186 | eye_right = self.maxpool2d(eye_right, k=pool1_eye_size, strides=pool1_eye_stride)
187 |
188 | eye_right = self.conv2d(eye_right, weights['conv2_eye'], biases['conv2_eye'], strides=1)
189 | eye_right = self.maxpool2d(eye_right, k=pool2_eye_size, strides=pool2_eye_stride)
190 |
191 | eye_right = self.conv2d(eye_right, weights['conv3_eye'], biases['conv3_eye'], strides=1)
192 | eye_right = self.maxpool2d(eye_right, k=pool3_eye_size, strides=pool3_eye_stride)
193 |
194 | eye_right = self.conv2d(eye_right, weights['conv4_eye'], biases['conv4_eye'], strides=1)
195 | eye_right = self.maxpool2d(eye_right, k=pool4_eye_size, strides=pool4_eye_stride)
196 |
197 | # pathway: face
198 | face = self.conv2d(face, weights['conv1_face'], biases['conv1_face'], strides=1)
199 | face = self.maxpool2d(face, k=pool1_face_size, strides=pool1_face_stride)
200 |
201 | face = self.conv2d(face, weights['conv2_face'], biases['conv2_face'], strides=1)
202 | face = self.maxpool2d(face, k=pool2_face_size, strides=pool2_face_stride)
203 |
204 | face = self.conv2d(face, weights['conv3_face'], biases['conv3_face'], strides=1)
205 | face = self.maxpool2d(face, k=pool3_face_size, strides=pool3_face_stride)
206 |
207 | face = self.conv2d(face, weights['conv4_face'], biases['conv4_face'], strides=1)
208 | face = self.maxpool2d(face, k=pool4_face_size, strides=pool4_face_stride)
209 |
210 | # fc layer
211 | # eye
212 | eye_left = tf.reshape(eye_left, [-1, int(np.prod(eye_left.get_shape()[1:]))])
213 | eye_right = tf.reshape(eye_right, [-1, int(np.prod(eye_right.get_shape()[1:]))])
214 | eye = tf.concat([eye_left, eye_right], 1)
215 | eye = tf.nn.relu(tf.add(tf.matmul(eye, weights['fc_eye']), biases['fc_eye']))
216 |
217 | # face
218 | face = tf.reshape(face, [-1, int(np.prod(face.get_shape()[1:]))])
219 | face = tf.nn.relu(tf.add(tf.matmul(face, weights['fc_face']), biases['fc_face']))
220 |
221 | # face mask
222 | face_mask = tf.nn.relu(tf.add(tf.matmul(face_mask, weights['fc_face_mask']), biases['fc_face_mask']))
223 |
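# The modification over iTracker: fuse the face and face mask features (FC-F2)
# before joining them with the eye features.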
224 | face_face_mask = tf.concat([face, face_mask], 1)
225 | face_face_mask = tf.nn.relu(tf.add(tf.matmul(face_face_mask, weights['face_face_mask']), biases['face_face_mask']))
226 |
227 | # all
228 | fc = tf.concat([eye, face_face_mask], 1)
229 | fc = tf.nn.relu(tf.add(tf.matmul(fc, weights['fc']), biases['fc']))
230 | out = tf.add(tf.matmul(fc, weights['fc2']), biases['fc2'])
231 | return out
232 |
233 | def train(self, train_data, val_data, lr=1e-3, batch_size=128, max_epoch=1000, min_delta=1e-4, patience=10, print_per_epoch=10, out_model='my_model'):
234 | ckpt = os.path.split(out_model)[0]
235 | if ckpt and not os.path.exists(ckpt): # guard against an empty dirname, e.g. out_model='my_model'
236 | os.makedirs(ckpt)
237 |
238 | print('Train on %s samples, validate on %s samples' % (train_data[0].shape[0], val_data[0].shape[0]))
239 | # Define loss and optimizer
240 | self.cost = tf.losses.mean_squared_error(self.y, self.pred)
241 | self.optimizer = tf.train.AdamOptimizer(learning_rate=lr).minimize(self.cost)
242 |
243 | # Evaluate model
244 | self.err = tf.reduce_mean(tf.sqrt(tf.reduce_sum(tf.squared_difference(self.pred, self.y), axis=1)))
245 | train_loss_history = []
246 | train_err_history = []
247 | val_loss_history = []
248 | val_err_history = []
249 | n_incr_error = 0 # number of consecutive epochs without sufficient improvement
250 | best_loss = np.Inf
251 | n_batches = train_data[0].shape[0] // batch_size + (train_data[0].shape[0] % batch_size != 0) # ceiling division
252 | val_n_batches = val_data[0].shape[0] // batch_size + (val_data[0].shape[0] % batch_size != 0)
253 |
254 | # Create the collection
255 | tf.get_collection("validation_nodes")
256 | # Add stuff to the collection.
257 | tf.add_to_collection("validation_nodes", self.eye_left)
258 | tf.add_to_collection("validation_nodes", self.eye_right)
259 | tf.add_to_collection("validation_nodes", self.face)
260 | tf.add_to_collection("validation_nodes", self.face_mask)
261 | tf.add_to_collection("validation_nodes", self.pred)
262 | saver = tf.train.Saver(max_to_keep=1)
263 |
264 | # Initializing the variables
265 | init = tf.global_variables_initializer()
266 | # Launch the graph
267 | with tf.Session() as sess:
268 | sess.run(init)
269 | writer = tf.summary.FileWriter("logs", sess.graph)
270 |
271 | # Keep training until reach max iterations
272 | for n_epoch in range(1, max_epoch + 1):
273 | n_incr_error += 1
274 | train_loss = 0.
275 | train_err = 0.
276 | train_data = shuffle_data(train_data)
277 | for batch_train_data in next_batch(train_data, batch_size):
278 | # Run optimization op (backprop)
279 | sess.run(self.optimizer, feed_dict={self.eye_left: batch_train_data[0], \
280 | self.eye_right: batch_train_data[1], self.face: batch_train_data[2], \
281 | self.face_mask: batch_train_data[3], self.y: batch_train_data[4]})
282 | train_batch_loss, train_batch_err = sess.run([self.cost, self.err], feed_dict={self.eye_left: batch_train_data[0], \
283 | self.eye_right: batch_train_data[1], self.face: batch_train_data[2], \
284 | self.face_mask: batch_train_data[3], self.y: batch_train_data[4]})
285 | train_loss += train_batch_loss / n_batches
286 | train_err += train_batch_err / n_batches
287 |
288 | val_loss = 0.
289 | val_err = 0.
290 | for batch_val_data in next_batch(val_data, batch_size):
291 | val_batch_loss, val_batch_err = sess.run([self.cost, self.err], feed_dict={self.eye_left: batch_val_data[0], \
292 | self.eye_right: batch_val_data[1], self.face: batch_val_data[2], \
293 | self.face_mask: batch_val_data[3], self.y: batch_val_data[4]})
294 | val_loss += val_batch_loss / val_n_batches
295 | val_err += val_batch_err / val_n_batches
296 |
297 | train_loss_history.append(train_loss)
298 | train_err_history.append(train_err)
299 | val_loss_history.append(val_loss)
300 | val_err_history.append(val_err)
301 | if val_loss - min_delta < best_loss: # a loss within min_delta of the best counts as an improvement
302 | best_loss = val_loss
303 | save_path = saver.save(sess, out_model, global_step=n_epoch)
304 | print "Model saved in file: %s" % save_path
305 | n_incr_error = 0
306 |
307 | if n_epoch % print_per_epoch == 0:
308 | print('Epoch %s/%s, train loss: %.5f, train error: %.5f, val loss: %.5f, val error: %.5f' %
309 | (n_epoch, max_epoch, train_loss, train_err, val_loss, val_err))
310 |
311 | if n_incr_error >= patience:
312 | print('Early stopping occurred. Optimization finished!')
313 | return train_loss_history, train_err_history, val_loss_history, val_err_history
314 |
315 | return train_loss_history, train_err_history, val_loss_history, val_err_history
316 |
317 | def extract_validation_handles(session):
318 | """ Extracts the input and predict_op handles that we use for validation.
319 | Args:
320 | session: The session with the loaded graph.
321 | Returns:
322 | validation handles.
323 | """
324 | valid_nodes = tf.get_collection_ref("validation_nodes")
325 | if len(valid_nodes) != 5:
326 | raise Exception("ERROR: Expected 5 items in validation_nodes, got %d." % len(valid_nodes))
327 | return valid_nodes
328 |
329 | def load_model(session, save_path):
330 | """ Loads a saved TF model from a file.
331 | Args:
332 | session: The tf.Session to use.
333 | save_path: The save path for the saved session, returned by Saver.save().
334 | Returns:
335 | The input placeholders and the prediction operation.
336 | """
337 | print "Loading model from file '%s'..." % save_path
338 |
339 | meta_file = save_path + ".meta"
340 | if not os.path.exists(meta_file):
341 | raise Exception("ERROR: Expected .meta file '%s', but could not find it." % meta_file)
342 |
343 | saver = tf.train.import_meta_graph(meta_file)
344 | # It's finicky about the save path.
345 | save_path = os.path.join("./", save_path)
346 | saver.restore(session, save_path)
347 |
348 | # Check that we have the handles we expected.
349 | return extract_validation_handles(session)
350 |
351 | def validate_model(session, val_data, val_ops, batch_size=200):
352 | """ Validates the model stored in a session.
353 | Args:
354 | session: The session where the model is loaded.
355 | val_data: The validation data to use for evaluating the model.
356 | val_ops: The validation operations.
357 | Returns:
358 | The overall validation error for the model. """
359 | print "Validating model..."
360 |
361 | eye_left, eye_right, face, face_mask, pred = val_ops
362 | y = tf.placeholder(tf.float32, [None, 2], name='pos')
363 | err = tf.reduce_mean(tf.sqrt(tf.reduce_sum(tf.squared_difference(pred, y), axis=1)))
364 | # Validate the model.
365 | val_n_batches = val_data[0].shape[0] // batch_size + (val_data[0].shape[0] % batch_size != 0) # ceiling division
366 | val_err = 0.
367 | for batch_val_data in next_batch(val_data, batch_size):
368 | val_batch_err = session.run(err, feed_dict={eye_left: batch_val_data[0], \
369 | eye_right: batch_val_data[1], face: batch_val_data[2], \
370 | face_mask: batch_val_data[3], y: batch_val_data[4]})
371 | val_err += val_batch_err / val_n_batches
372 | return val_err
373 |
374 | def plot_loss(train_loss, train_err, test_err, start=0, per=1, save_file='loss.png'):
375 | assert len(train_err) == len(test_err)
376 | idx = np.arange(start, len(train_loss), per)
377 | fig, ax1 = plt.subplots()
378 | lns1 = ax1.plot(idx, train_loss[idx], 'b-', alpha=1.0, label='train loss')
379 | ax1.set_xlabel('epochs')
380 | # Make the y-axis label, ticks and tick labels match the line color.
381 | ax1.set_ylabel('loss', color='b')
382 | ax1.tick_params('y', colors='b')
383 |
384 | ax2 = ax1.twinx()
385 | lns2 = ax2.plot(idx, train_err[idx], 'r-', alpha=1.0, label='train error')
386 | lns3 = ax2.plot(idx, test_err[idx], 'g-', alpha=1.0, label='test error')
387 | ax2.set_ylabel('error', color='r')
388 | ax2.tick_params('y', colors='r')
389 |
390 | # added these three lines
391 | lns = lns1 + lns2 + lns3
392 | labs = [l.get_label() for l in lns]
393 | ax1.legend(lns, labs, loc=0)
394 |
395 | fig.tight_layout()
396 | plt.savefig(save_file)
397 | # plt.show()
398 |
399 | def train(args):
400 | train_data, val_data = load_data(args.input)
401 |
402 | # train_size = 10
403 | # train_data = [each[:train_size] for each in train_data]
404 | # val_size = 1
405 | # val_data = [each[:val_size] for each in val_data]
406 |
407 | train_data = prepare_data(train_data)
408 | val_data = prepare_data(val_data)
409 |
410 | start = timeit.default_timer()
411 | et = EyeTracker()
412 | train_loss_history, train_err_history, val_loss_history, val_err_history = et.train(train_data, val_data, \
413 | lr=args.learning_rate, \
414 | batch_size=args.batch_size, \
415 | max_epoch=args.max_epoch, \
416 | min_delta=1e-4, \
417 | patience=args.patience, \
418 | print_per_epoch=args.print_per_epoch,
419 | out_model=args.save_model)
420 |
421 | print('runtime: %.1fs' % (timeit.default_timer() - start))
422 |
423 | if args.save_loss:
424 | with open(args.save_loss, 'wb') as outfile: # np.savez needs a binary-mode file
425 | np.savez(outfile, train_loss_history=train_loss_history, train_err_history=train_err_history, \
426 | val_loss_history=val_loss_history, val_err_history=val_err_history)
427 |
428 | if args.plot_loss:
429 | plot_loss(np.array(train_loss_history), np.array(train_err_history), np.array(val_err_history), start=0, per=1, save_file=args.plot_loss)
430 |
431 | def test(args):
432 | _, val_data = load_data(args.input)
433 |
434 | # val_size = 10
435 | # val_data = [each[:val_size] for each in val_data]
436 |
437 | val_data = prepare_data(val_data)
438 |
439 | # Load and validate the network.
440 | with tf.Session() as sess:
441 | val_ops = load_model(sess, args.load_model)
442 | error = validate_model(sess, val_data, val_ops, batch_size=args.batch_size)
443 | print('Overall validation error: %f' % error)
444 |
445 | def main():
446 | parser = argparse.ArgumentParser()
447 | parser.add_argument('--train', action='store_true', help='train flag')
448 | parser.add_argument('-i', '--input', required=True, type=str, help='path to the input data')
449 | parser.add_argument('-max_epoch', '--max_epoch', type=int, default=100, help='max number of epochs')
450 | parser.add_argument('-lr', '--learning_rate', type=float, default=0.0025, help='learning rate')
451 | parser.add_argument('-bs', '--batch_size', type=int, default=200, help='batch size')
452 | parser.add_argument('-p', '--patience', type=int, default=5, help='early stopping patience')
453 | parser.add_argument('-pp_iter', '--print_per_epoch', type=int, default=1, help='print every N epochs')
454 | parser.add_argument('-sm', '--save_model', type=str, default='my_model', help='path to the output model')
455 | parser.add_argument('-lm', '--load_model', type=str, help='path to the loaded model')
456 | parser.add_argument('-pl', '--plot_loss', type=str, default='loss.png', help='plot loss')
457 | parser.add_argument('-sl', '--save_loss', type=str, default='loss.npz', help='save loss')
458 | args = parser.parse_args()
459 |
460 | if args.train:
461 | train(args)
462 | else:
463 | if not args.load_model:
464 | raise Exception('load_model arg needed in test phase')
465 | test(args)
466 |
467 | if __name__ == '__main__':
468 | main()
469 |
--------------------------------------------------------------------------------
/itracker_adv_arch.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hugochan/Eye-Tracker/10c8f692ef14a99cd1cf9818ade328f9965662b0/itracker_adv_arch.png
--------------------------------------------------------------------------------
/itracker_arch.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hugochan/Eye-Tracker/10c8f692ef14a99cd1cf9818ade328f9965662b0/itracker_arch.png
--------------------------------------------------------------------------------
/pretrained_models/itracker_adv/checkpoint:
--------------------------------------------------------------------------------
1 | model_checkpoint_path: "model-23"
2 | all_model_checkpoint_paths: "model-23"
3 |
--------------------------------------------------------------------------------
/pretrained_models/itracker_adv/model-23.data-00000-of-00001:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hugochan/Eye-Tracker/10c8f692ef14a99cd1cf9818ade328f9965662b0/pretrained_models/itracker_adv/model-23.data-00000-of-00001
--------------------------------------------------------------------------------
/pretrained_models/itracker_adv/model-23.index:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hugochan/Eye-Tracker/10c8f692ef14a99cd1cf9818ade328f9965662b0/pretrained_models/itracker_adv/model-23.index
--------------------------------------------------------------------------------
/pretrained_models/itracker_adv/model-23.meta:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hugochan/Eye-Tracker/10c8f692ef14a99cd1cf9818ade328f9965662b0/pretrained_models/itracker_adv/model-23.meta
--------------------------------------------------------------------------------
/validation_script.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/python
2 |
3 | import argparse
4 | """
5 | try:
6 | import cPickle as pickle
7 | except ImportError:
8 | # Python 3
9 | import pickle
10 | """
11 | import os
12 | import sys
13 |
14 | import numpy as np
15 |
16 | import tensorflow as tf
17 |
18 |
19 | # How many images to include in each validation batch. This is just a default
20 | # value, and may be set differently to accommodate network parameters.
21 | batch_size = 1000
22 |
23 |
24 | def extract_validation_handles(session):
25 | """ Extracts the input and predict_op handles that we use for validation.
26 | Args:
27 | session: The session with the loaded graph.
28 | Returns:
29 | The input placeholders (left eye, right eye, face, face mask) and the prediction operation. """
30 | # The students should have saved their input placeholder, mask placeholder and prediction
31 | # operation in a collection called "validation_nodes".
32 | valid_nodes = tf.get_collection_ref("validation_nodes")
33 | if len(valid_nodes) != 5:
34 | print("ERROR: Expected 3 items in validation_nodes, got %d." % \
35 | (len(valid_nodes)))
36 | sys.exit(1)
37 |
38 | # Figure out which is which.
39 | eye_left = valid_nodes[0]
40 | eye_right = valid_nodes[1]
41 | face = valid_nodes[2]
42 | face_mask = valid_nodes[3]
43 | predict = valid_nodes[4]
44 | """if type(valid_nodes[1]) == tf.placeholder:
45 | inputs = valid_nodes[1]
46 | predict = valid_nodes[0]"""
47 |
48 | # Check to make sure we've set the batch size correctly.
49 | global batch_size
50 | try:
51 | batch_size = int(eye_left.get_shape()[0])
52 | print("WARNING: Network does not support variable batch sizes. (inputs)")
53 | except TypeError:
54 | # It's unspecified, which is actually correct.
55 | pass
56 | try:
57 | # I've also seen people who don't specify an input shape but do specify a
58 | # shape for the prediction operation.
59 | batch_size = int(predict.get_shape()[0])
60 | print("WARNING: Network does not support variable batch sizes. (predict)")
61 | except TypeError:
62 | pass
63 |
64 | # Predict op should also yield integers.
65 | #predict = tf.cast(predict, "int32")
66 |
67 | # Check the shape of the prediction output.
68 | p_shape = predict.get_shape()
69 | #Commented these out because there could be squeezes in the code earlier
70 | """
71 | print p_shape
72 | if len(p_shape) > 2:
73 | print("ERROR: Expected prediction of shape (, 1), got shape of %s." % \
74 | (str(p_shape)))
75 | sys.exit(1)
76 | if len(p_shape) == 2:
77 | if p_shape[1] != 1:
78 | print("ERROR: Expected prediction of shape (, 1), got shape of %s." % \
79 | (str(p_shape)))
80 | sys.exit(1)
81 |
82 | # We need to contract it into a vector.
83 | predict = predict[:, 0]"""
84 |
85 | return (eye_left, eye_right, face, face_mask, predict)
86 |
87 | def load_model(session, save_path):
88 | """ Loads a saved TF model from a file.
89 | Args:
90 | session: The tf.Session to use.
91 | save_path: The save path for the saved session, returned by Saver.save().
92 | Returns:
93 | The input placeholders and the prediction operation.
94 | """
95 | print("Loading model from file '%s'..." % (save_path))
96 |
97 | meta_file = save_path + ".meta"
98 | if not os.path.exists(meta_file):
99 | print("ERROR: Expected .meta file '%s', but could not find it." % \
100 | (meta_file))
101 | sys.exit(1)
102 |
103 | saver = tf.train.import_meta_graph(meta_file)
104 | # It's finicky about the save path.
105 | save_path = os.path.join("./", save_path)
106 | saver.restore(session, save_path)
107 |
108 | # Check that we have the handles we expected.
109 | return extract_validation_handles(session)
110 |
111 | def load_validation_data(val_filename):
112 | """ Loads the validation data.
113 | Args:
114 | val_filename: The file where the validation data is stored.
115 | Returns:
116 | A tuple of the loaded validation data and validation labels. """
117 | print("Loading validation data...")
118 |
119 | npzfile = np.load(val_filename)
120 | val_eye_left = npzfile["val_eye_left"]
121 | val_eye_right = npzfile["val_eye_right"]
122 | val_face = npzfile["val_face"]
123 | val_face_mask = npzfile["val_face_mask"]
124 | val_y = npzfile["val_y"]
125 |
126 | return (val_eye_left, val_eye_right, val_face, val_face_mask, val_y)
127 |
128 | def validate_model(session, val_data, eye_left, eye_right, face, face_mask, predict_op):
129 | """ Validates the model stored in a session.
130 | Args:
131 | session: The session where the model is loaded.
132 | val_data: The validation data to use for evaluating the model.
133 | eye_left: The inputs placeholder.
134 | eye_right: The inputs placeholder.
135 | face: The inputs placeholder.
136 | face_mask: The inputs placeholder.
137 | predict_op: The prediction operation.
138 | Returns:
139 | The overall validation accuracy for the model. """
140 | print("Validating model...")
141 |
142 |
143 |
144 | # Validate the model.
145 | val_eye_left, val_eye_right, val_face, val_face_mask, val_y = val_data
146 | num_iters = val_eye_left.shape[0] // batch_size
147 |
148 | err_val = []
149 | for i in range(0, int(num_iters)):
150 | start_index = i * batch_size
151 | end_index = start_index + batch_size
152 |
153 | eye_left_batch = val_eye_left[start_index:end_index, :]
154 | eye_right_batch = val_eye_right[start_index:end_index, :]
155 | face_batch = val_face[start_index:end_index, :]
156 | # face_mask_batch = val_face_mask[start_index:end_index, :]
157 | face_mask_batch = np.reshape(val_face_mask[start_index:end_index, :], (batch_size, -1))
158 | y_batch = val_y[start_index:end_index, :]
159 |
160 |
161 |
162 | print("Validating batch %d of %d..." % (i + 1, num_iters))
163 | yp = session.run(predict_op,
164 | feed_dict={eye_left: eye_left_batch / 255.,
165 | eye_right: eye_right_batch / 255.,
166 | face: face_batch / 255.,
167 | face_mask: face_mask_batch})
168 |
169 | err = np.mean(np.sqrt(np.sum((yp - y_batch)**2, axis=1)))
170 | err_val.append(err)
171 |
172 | # Compute total error
173 | error = np.mean(err_val)
174 | return error
175 |
176 | def try_with_random_data(session, eye_left, eye_right, face, face_mask, predict_op):
177 | """ Tries putting random data through the network, mostly to make sure this
178 | works.
179 | Args:
180 | session: The session to use.
181 | inputs: The inputs placeholder.
182 | predict_op: The prediction operation. """
183 | print("Trying random batch...")
184 |
185 | # Get a random batch.
186 | eye_left_batch = np.random.rand(batch_size, 64, 64, 3)
187 | eye_right_batch = np.random.rand(batch_size, 64, 64, 3)
188 | face_batch = np.random.rand(batch_size, 64, 64, 3)
189 | face_mask_batch = np.random.rand(batch_size, 25, 25)
190 |
191 | print("Batch of shape (%d, 64, 64, 3)" % (batch_size))
192 |
193 | # Put it through the model.
194 | predictions = session.run(predict_op, feed_dict={eye_left: eye_left_batch,
195 | eye_right: eye_right_batch,
196 | face: face_batch,
197 | face_mask: face_mask_batch})
198 | if np.isnan(predictions).any():
199 | print("Warning: Got NaN value in prediction!")
200 |
201 |
202 | def main():
203 | parser = argparse.ArgumentParser(description="Analyze student models.")
204 | parser.add_argument("-v", "--val_data_file", default=None,
205 | help="Validate the network with the data from this " + \
206 | "pickle file.")
207 | parser.add_argument("save_path", help="The base path for your saved model.")
208 | args = parser.parse_args()
209 |
210 | if not args.val_data_file:
211 | print("Not validating, but checking network compatibility...")
212 | elif not os.path.exists(args.val_data_file):
213 | print("ERROR: Could not find validation data '%s'." % (args.val_data))
214 | sys.exit(1)
215 |
216 | # Load and validate the network.
217 | with tf.Session() as session:
218 | eye_left, eye_right, face, face_mask, predict_op = load_model(session, args.save_path)
219 | if args.val_data_file:
220 | val_data = load_validation_data(args.val_data_file)
221 | accuracy = validate_model(session, val_data, eye_left, eye_right, face, face_mask, predict_op)
222 |
223 | print("Overall validation error: %f cm" % (accuracy))
224 | print("Network seems good. Go ahead and submit")
225 |
226 | else:
227 | try_with_random_data(session, eye_left, eye_right, face, face_mask, predict_op)
228 | print("Network seems good. Go ahead and submit.")
229 |
230 | if __name__ == "__main__":
231 | main()
232 |
--------------------------------------------------------------------------------