├── LICENSE.md
├── README.md
├── dm_arch.py
├── dm_celeba.py
├── dm_flags.py
├── dm_infer.py
├── dm_input.py
├── dm_main.py
├── dm_model.py
├── dm_show.py
├── dm_train.py
├── dm_utils.py
└── images
    ├── example_female_to_male.jpg
    └── example_male_to_female.jpg

--------------------------------------------------------------------------------
/LICENSE.md:
--------------------------------------------------------------------------------
1 | MIT License
2 | 
3 | Copyright (c) 2016-2017 David Garcia
4 | 
5 | Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
6 | 
7 | The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
8 | 
9 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
10 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # deep-makeover
2 | 
3 | The purpose of this deep-learning project is to show that it's possible to automatically transform pictures of faces in useful and fun ways. This is done by filtering the type of faces used as inputs to the model and the type of faces used as the desired target. The exact same architecture can be used to transform masculine faces into feminine ones, or vice versa, simply by switching the source and target images used during training.
4 | 
5 | Here are two examples of this in action:
6 | 
7 | ![Example male to female transformation](images/example_male_to_female.jpg)
8 | 
9 | ![Example female to male](images/example_female_to_male.jpg)
10 | 
11 | Please note that the male-to-female example is my former boss [Benj Lipchak](http://www.charitocracy.org). Used with permission.
12 | 
13 | Each of these two examples was made after training a model for just two hours on one GTX 1080 GPU.
14 | 
15 | The same technique has other potential applications, such as vanity filters that make people look more attractive. This would be done by selecting only attractive faces as the target population. More experimentation will be required.
16 | 
17 | 
18 | # How it works
19 | 
20 | The network architecture is essentially a conditioned DCGAN where the generator is composed of two parts: an encoder and a decoder. The encoder transforms the input image into a lower-dimensional latent representation, and the decoder transforms that latent representation back into an RGB image of the same dimensions as the network's input. Both the generator and the discriminator are resnets.
21 | 
22 | For details see the function `create_model()` in the file `dm_model.py`.
23 | 
24 | 
25 | # Key takeaways from this project
26 | 
27 | Here is what I learned in this project. I can't claim these are original ideas, just my personal observations.
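
For orientation in the sections that follow, here is the rough shape of the generator, condensed from `_generator_model()` in `dm_model.py` (`Model` is the small layer-builder class defined in `dm_arch.py`; this is a simplified sketch rather than the verbatim code):

```python
def generator(features):
    model = dm_arch.Model('GENE', 2 * features - 1)  # map [0,1] RGB into [-1,+1]

    # Encoder: only two pooling steps, so the latent image keeps 1/4 resolution
    for nunits in [24, 48]:
        _residual_block(model, nunits, mapsize=3)
        model.add_avg_pool()

    # Decoder: residual blocks interleaved with nearest-neighbor upscaling
    for nunits in [96, 64]:
        _residual_block(model, nunits, mapsize=3)
        _residual_block(model, nunits, mapsize=3)
        model.add_upscale()

    _residual_block(model, 48, mapsize=3)
    _residual_block(model, 48, mapsize=3)
    model.add_conv2d(3, mapsize=1)  # project back down to RGB
    model.add_sigmoid(1.1)          # output range slightly wider than [0,1]
    return model
```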
28 | 
29 | ## Tune the architecture to the nature of the problem
30 | 
31 | This project takes 80x100 pixel images as inputs and produces images of the same size as outputs. In addition, both the input and the output are faces, which means that the input and output distributions are very similar.
32 | 
33 | GANs often start from an arbitrary multidimensional distribution Z which is progressively shaped into an image. It would be possible to do the same in this project by progressively encoding the 80x100 pixel input image into a 1x1 latent embedding and later expanding it into an image again. However, since we know that the input and output distributions are very similar, we don't need to encode the input image all the way down to 1x1 pixels.
34 | 
35 | In the final architecture the encoder has only two pooling layers. Increasing the number of pooling layers actually lowered the quality of the outputs, in the sense that they no longer resembled the person in the source image. The goal was to produce an output that was clearly recognizable as the same person, and that necessarily requires making relatively small changes to the source material.
36 | 
37 | ## Resnets do best with a custom initialization
38 | 
39 | Xavier Glorot's type of initialization makes sense when your network lacks skip connections, but in a resnet it makes more sense to initialize the weights with very small values centered around zero, so that the composite function they compute is not far from the identity. For the projection layers, which can't be residual, we initialized the weights so as to approximate the identity function.
40 | 
41 | This type of initialization definitely makes sense for networks that transform images into images, as the initial network, prior to any training, will already compute a reasonable first approximation: the identity.
42 | 
43 | ## Consider annealing the loss function
44 | 
45 | A good loss function for this model has two competing elements. On the one hand, it is desirable for the output image to be similar to the input image on a per-pixel basis (either MSE or L1 distance). On the other hand, it is also desirable to let the generator be strongly influenced by the discriminator in order to avoid the blurriness that comes from a pixel-distance-based loss. Additionally, GANs often fail to converge early on, when the discriminator has no idea of what a plausible sample looks like.
46 | 
47 | What we did in this project was to modify the loss function over time. At the very beginning the generator completely ignores the gradients coming from the discriminator and instead uses only a pixel-based L1 loss. Over time, as the discriminator becomes more discerning, the importance of the adversarial loss increases.
48 | 
49 | ## A smaller dataset can be better
50 | 
51 | I initially assumed that a larger dataset would be better, and as a consequence I did no cleanup or filtering of the dataset. That assumption turned out to be wrong. To put it bluntly, there are only a few ways to be handsome, but many different ways to be ugly. If you select only faces labeled as 'attractive=true' in the dataset, the network converges more quickly, as attractive faces form a narrower target distribution. For the same reason I also filtered out people wearing glasses and sunglasses.
52 | 
53 | 
54 | # Requirements
55 | 
56 | You will need Python 3.5+ with Tensorflow r0.12+, and reasonably recent versions of numpy and scipy.
57 | 
58 | 
59 | ## Dataset
60 | 
61 | After you have the required software above, you will also need the `Large-scale CelebFaces Attributes (CelebA) Dataset`. The model expects the `Align&Cropped Images` version. Extract all images into a subfolder named `dataset`, e.g. `deep-makeover/dataset/lotsoffiles.jpg`. The attribute annotations file `list_attr_celeba.txt` is expected in the same folder (see `--attribute_file` in `dm_flags.py`).
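
As a quick sanity check that the dataset and the attribute file are in place, you can list a filtered subset of the images from Python. This is a minimal sketch: it reuses `select_samples()` from `dm_celeba.py` with the same constraints `dm_main.py` applies to the female source population, and it assumes the flags have been defined first (`select_samples()` reads `FLAGS.dataset` and `FLAGS.attribute_file`):

```python
import dm_flags
import dm_celeba

dm_flags.define_flags()  # provides FLAGS.dataset and FLAGS.attribute_file

# Same constraints dm_main.py uses for the female source population
filenames = dm_celeba.select_samples({'Male': False, 'Blurry': False, 'Eyeglasses': False})
print('%d matching images' % len(filenames))
```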
62 | 
63 | 
64 | # Training the model
65 | 
66 | Training with default settings: `python3 dm_main.py --run train`. The script will periodically write an example batch in PNG format into the `train` folder, and checkpoint data will be stored in the `checkpoint` folder.
67 | 
68 | I recommend training the model for about 40,000 to 50,000 batches. You will need to adjust `--train_time` depending on how many batches per hour your system can train.
69 | 
70 | # About the author
71 | 
72 | [LinkedIn profile of David Garcia](https://ca.linkedin.com/in/david-garcia-70913311).
73 | 
--------------------------------------------------------------------------------
/dm_arch.py:
--------------------------------------------------------------------------------
1 | import math
2 | import numpy as np
3 | import tensorflow as tf
4 | 
5 | import dm_utils
6 | 
7 | FLAGS = tf.app.flags.FLAGS
8 | 
9 | # Global switch to enable/disable training of variables
10 | _glbl_is_training = tf.Variable(initial_value=True, trainable=False, name='glbl_is_training')
11 | 
12 | # Global variable dictionary. This is how we can share variables across models
13 | _glbl_variables = {_glbl_is_training.name : _glbl_is_training}
14 | 
15 | 
16 | def initialize_variables(sess):
17 |     """Run this function exactly once, before the model begins to train"""
18 | 
19 |     # First initialize all variables
20 |     sess.run(tf.global_variables_initializer())
21 | 
22 |     # Now freeze the graph to prevent new operations from being added
23 |     #tf.get_default_graph().finalize()
24 | 
25 | def enable_training(onoff):
26 |     """Switches training on or off globally (all models are affected).
27 |     Dropout is expected to be enabled during training and disabled afterwards; batch normalization is affected as well. Note that this creates an assign op, which must be run in a session to take effect.
28 |     """
29 |     return tf.assign(_glbl_is_training, bool(onoff))
30 | 
31 | 
32 | # TBD: Add "All you need is a good init"
33 | 
34 | class Model:
35 |     """A neural network model.
36 | 
37 |     Currently only supports a feedforward architecture."""
38 | 
39 |     def __init__(self, name, features, enable_batch_norm=True):
40 |         self.name = name
41 |         self.locals = set()
42 |         self.outputs = [features]
43 | 
44 |         self.enable_batch_norm = enable_batch_norm
45 | 
46 |     def _get_variable(self, name, initializer=None):
47 |         # Variables are uniquely identified by a triplet: model name, layer number, and variable name
48 |         layer = 'L%03d' % (self.get_num_layers()+1,)
49 |         full_name = '/'.join([self.name, layer, name])
50 | 
51 |         if full_name in _glbl_variables:
52 |             # Reuse existing variable
53 |             #print("Reusing variable %s" % full_name)
54 |             var = _glbl_variables[full_name]
55 |             assert initializer is None or var.get_shape() == initializer.get_shape()
56 |         elif initializer is not None:
57 |             # Create new variable
58 |             var = tf.Variable(initializer, name=full_name)
59 |             _glbl_variables[full_name] = var
60 |         else:
61 |             raise ValueError("Initializer must be provided if variable is new")
62 | 
63 |         self.locals.add(var)
64 |         return var
65 | 
66 |     def _get_num_inputs(self):
67 |         return int(self.get_output().get_shape()[-1])
68 | 
69 |     def _variable_initializer(self, prev_units, num_units, stddev_factor=1.0):
70 |         """Initialization in the style of Glorot 2010.
71 | 
72 |         stddev_factor should be 1.0 for linear activations, and 2.0 for ReLUs"""
73 | 
74 |         assert prev_units > 0 and num_units > 0
75 |         stddev = np.sqrt(float(stddev_factor) / np.sqrt(prev_units*num_units))
76 |         return tf.truncated_normal([prev_units, num_units],
77 |                                    mean=0.0, stddev=stddev)
78 | 
79 |     def _variable_initializer_conv2d(self, prev_units, num_units, mapsize, is_residual):
80 |         """Custom initialization for convolutional layers (see README, "Resnets do best with a custom initialization").
81 | 
82 |         Residual layers get near-zero weights; non-residual (projection) layers approximate the identity function."""
83 | 
84 |         assert prev_units > 0 and num_units > 0
85 |         size = [mapsize, mapsize, prev_units, num_units]
86 |         stddev_factor = 1e-1 / (mapsize * mapsize * prev_units * num_units)
87 |         result = stddev_factor * np.random.uniform(low=-1, high=1, size=size)
88 | 
89 |         if not is_residual:
90 |             # Focus nearly all the weight on the center
91 |             for i in range(min(prev_units, num_units)):
92 |                 result[mapsize//2, mapsize//2, i, i] += 1.0
93 |         # else leaving all parameters near zero is the right thing to do
94 | 
95 |         result = tf.constant(result.astype(np.float32))
96 | 
97 |         return result
98 | 
99 |     def get_num_layers(self):
100 |         return len(self.outputs)
101 | 
102 |     def add_batch_norm(self, scale=False):
103 |         """Adds a batch normalization layer to this model.
104 | 
105 |         See ArXiv 1502.03167v3 for details."""
106 | 
107 |         if not self.enable_batch_norm:
108 |             return self
109 | 
110 |         out = tf.contrib.layers.batch_norm(self.get_output(), scale=scale, is_training=_glbl_is_training)
111 | 
112 |         self.outputs.append(out)
113 |         return self
114 | 
115 |     def add_dropout(self, keep_prob=.5):
116 |         """Applies dropout to the output of this model"""
117 | 
118 |         is_training = tf.to_float(_glbl_is_training)
119 |         keep_prob = is_training * keep_prob + (1.0 - is_training)  # becomes 1.0 (no dropout) when not training
120 |         out = tf.nn.dropout(self.get_output(), keep_prob=keep_prob)
121 | 
122 |         self.outputs.append(out)
123 |         return self
124 | 
125 |     def add_flatten(self):
126 |         """Transforms the output of this network to a 1D tensor"""
127 | 
128 |         batch_size = int(self.get_output().get_shape()[0])
129 |         out = tf.reshape(self.get_output(), [batch_size, -1])
130 | 
131 |         self.outputs.append(out)
132 |         return self
133 | 
134 |     def add_reshape(self, shape):
135 |         """Reshapes the output of this network"""
136 | 
137 |         out = tf.reshape(self.get_output(), shape)
138 | 
139 |         self.outputs.append(out)
140 |         return self
141 | 
142 |     def add_dense(self, num_units, stddev_factor=1.0):
143 |         """Adds a dense linear layer to this model.
144 | 
145 |         Uses Glorot 2010 initialization assuming linear activation."""
146 | 
147 |         assert len(self.get_output().get_shape()) == 2, "Previous layer must be 2-dimensional (batch, channels)"
148 | 
149 |         prev_units = self._get_num_inputs()
150 | 
151 |         # Weight term
152 |         initw = self._variable_initializer(prev_units, num_units,
153 |                                            stddev_factor=stddev_factor)
154 |         weight = self._get_variable('weight', initw)
155 | 
156 |         # Bias term
157 |         initb = tf.constant(0.0, shape=[num_units])
158 |         bias = self._get_variable('bias', initb)
159 | 
160 |         # Output of this layer
161 |         out = tf.matmul(self.get_output(), weight) + bias
162 | 
163 |         self.outputs.append(out)
164 |         return self
165 | 
166 |     def add_sigmoid(self, rnge=1.0):
167 |         """Adds a sigmoid activation layer scaled to the range (0.5-rnge/2, 0.5+rnge/2)."""
168 | 
169 |         prev_units = self._get_num_inputs()
170 |         out = 0.5 + rnge * (tf.nn.sigmoid(self.get_output()) - 0.5)
171 | 
172 |         self.outputs.append(out)
173 |         return self
174 | 
175 |     def add_tanh(self):
176 |         """Adds a tanh (-1,+1) activation function layer to this model."""
177 | 
178 |         prev_units = self._get_num_inputs()
179 |         out = tf.nn.tanh(self.get_output())
180 | 
181 |         self.outputs.append(out)
182 |         return self
183 | 
184 |     def add_softmax(self):
185 |         """Adds a softmax-like normalization: squares the input and normalizes it to sum to 1"""
186 | 
187 |         this_input = tf.square(self.get_output())
188 |         reduction_indices = list(range(1, len(this_input.get_shape())))
189 |         acc = tf.reduce_sum(this_input, reduction_indices=reduction_indices, keep_dims=True)
190 |         out = this_input / (acc+FLAGS.epsilon)
191 |         #out = tf.verify_tensor_all_finite(out, "add_softmax failed; is sum equal to zero?")
192 | 
193 |         self.outputs.append(out)
194 |         return self
195 | 
196 |     def add_relu(self):
197 |         """Adds a ReLU activation function to this model"""
198 | 
199 |         out = tf.nn.relu(self.get_output())
200 | 
201 |         self.outputs.append(out)
202 |         return self
203 | 
204 |     def add_elu(self):
205 |         """Adds an ELU activation function to this model"""
206 | 
207 |         out = tf.nn.elu(self.get_output())
208 | 
209 |         self.outputs.append(out)
210 |         return self
211 | 
212 |     def add_lrelu(self, leak=.2):
213 |         """Adds a leaky ReLU (LReLU) activation function to this model"""
214 | 
215 |         t1 = .5 * (1 + leak)
216 |         t2 = .5 * (1 - leak)
217 |         out = t1 * self.get_output() + \
218 |               t2 * tf.abs(self.get_output())
219 | 
220 |         self.outputs.append(out)
221 |         return self
222 | 
223 |     def add_conv2d(self, num_units, mapsize=1, stride=1, is_residual=False):
224 |         """Adds a 2D convolutional layer."""
225 | 
226 |         assert len(self.get_output().get_shape()) == 4, "Previous layer must be 4-dimensional (batch, width, height, channels)"
227 | 
228 |         prev_units = self._get_num_inputs()
229 | 
230 |         # Weight term and convolution
231 |         initw = self._variable_initializer_conv2d(prev_units, num_units, mapsize, is_residual=is_residual)
232 |         weight = self._get_variable('weight', initw)
233 |         out = tf.nn.conv2d(self.get_output(), weight,
234 |                            strides=[1, stride, stride, 1],
235 |                            padding='SAME')
236 | 
237 |         # Bias term
238 |         initb = tf.constant(0.0, shape=[num_units])
239 |         bias = self._get_variable('bias', initb)
240 |         out = tf.nn.bias_add(out, bias)
241 | 
242 |         self.outputs.append(out)
243 |         return self
244 | 
245 |     def add_conv2d_transpose(self, num_units, mapsize=1, stride=1, is_residual=False):
246 |         """Adds a transposed 2D convolutional layer"""
247 | 
248 |         raise NotImplementedError("This function is broken right now due to how _variable_initializer_conv2d is built. Use a regular convolution instead")
249 | 
250 |         assert len(self.get_output().get_shape()) == 4, "Previous layer must be 4-dimensional (batch, width, height, channels)"
251 | 
252 |         prev_units = self._get_num_inputs()
253 | 
254 |         # Weight term and convolution
255 |         initw = self._variable_initializer_conv2d(prev_units, num_units, mapsize, is_residual=is_residual)
256 |         weight = self._get_variable('weight', initw)
257 |         weight = tf.transpose(weight, perm=[0, 1, 3, 2])
258 |         prev_output = self.get_output()
259 |         output_shape = [FLAGS.batch_size,
260 |                         int(prev_output.get_shape()[1]) * stride,
261 |                         int(prev_output.get_shape()[2]) * stride,
262 |                         num_units]
263 |         out = tf.nn.conv2d_transpose(self.get_output(), weight,
264 |                                      output_shape=output_shape,
265 |                                      strides=[1, stride, stride, 1],
266 |                                      padding='SAME')
267 | 
268 |         # Bias term
269 |         initb = tf.constant(0.0, shape=[num_units])
270 |         bias = self._get_variable('bias', initb)
271 |         out = tf.nn.bias_add(out, bias)
272 | 
273 |         self.outputs.append(out)
274 |         return self
275 | 
276 |     def add_concat(self, terms):
277 |         """Adds a concatenation layer"""
278 | 
279 |         if len(terms) > 0:
280 |             axis = len(self.get_output().get_shape()) - 1
281 |             terms = terms + [self.get_output()]
282 |             out = tf.concat(axis, terms)
283 |             self.outputs.append(out)
284 | 
285 |         return self
286 | 
287 |     def add_sum(self, term):
288 |         """Adds a layer that sums the top layer with the given term"""
289 | 
290 |         prev_shape = self.get_output().get_shape()
291 |         term_shape = term.get_shape()
292 |         #print("%s %s" % (prev_shape, term_shape))
293 |         assert prev_shape[1:] == term_shape[1:], "Can't sum terms with a different size"
294 |         out = tf.add(self.get_output(), term)
295 | 
296 |         self.outputs.append(out)
297 |         return self
298 | 
299 |     def add_mean(self):
300 |         """Adds a layer that averages over the spatial dimensions of the previous layer"""
301 | 
302 |         prev_shape = self.get_output().get_shape()
303 |         reduction_indices = list(range(len(prev_shape)))
304 |         assert len(reduction_indices) > 2, "Can't average a (batch, activation) tensor"
305 |         reduction_indices = reduction_indices[1:-1]
306 |         out = tf.reduce_mean(self.get_output(), reduction_indices=reduction_indices)
307 | 
308 |         self.outputs.append(out)
309 |         return self
310 | 
311 |     def add_avg_pool(self, height=2, width=2):
312 |         """Adds a layer that performs average pooling of the given size"""
313 | 
314 |         ksize = [1, height, width, 1]
315 |         strides = [1, height, width, 1]
316 |         out = tf.nn.avg_pool(self.get_output(), ksize, strides, 'VALID')
317 | 
318 |         self.outputs.append(out)
319 |         return self
320 | 
321 |     def add_upscale(self, factor=2):
322 |         """Adds a layer that upscales the output by the given factor (default 2x) through nearest-neighbor interpolation.
323 |         See http://distill.pub/2016/deconv-checkerboard/"""
324 | 
325 |         out = dm_utils.upscale(self.get_output(), factor)
326 | 
327 |         self.outputs.append(out)
328 |         return self
329 | 
330 |     def get_output(self):
331 |         """Returns the output from the topmost layer of the network"""
332 |         return self.outputs[-1]
333 | 
334 |     def get_num_parameters(self):
335 |         """Returns the number of parameters in this model"""
336 |         num_params = 0
337 |         for var in self.locals:
338 |             size = 1
339 |             for dim in var.get_shape():
340 |                 size *= int(dim)
341 |             num_params += size
342 |         return num_params
343 | 
344 |     def get_all_variables(self):
345 |         """Returns all variables used in this model"""
346 |         return list(self.locals)
347 | 
--------------------------------------------------------------------------------
/dm_celeba.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import os.path
3 | import random
4 | import tensorflow as tf
5 | 
6 | FLAGS = tf.app.flags.FLAGS
7 | 
8 | # For convenience, here are the available attributes in the dataset:
9 | # 5_o_Clock_Shadow Arched_Eyebrows Attractive Bags_Under_Eyes Bald Bangs Big_Lips \
10 | # Big_Nose Black_Hair Blond_Hair Blurry Brown_Hair Bushy_Eyebrows Chubby Double_Chin \
11 | # Eyeglasses Goatee Gray_Hair Heavy_Makeup High_Cheekbones Male Mouth_Slightly_Open Mustache \
12 | # Narrow_Eyes No_Beard Oval_Face Pale_Skin Pointy_Nose Receding_Hairline Rosy_Cheeks Sideburns \
13 | # Smiling Straight_Hair Wavy_Hair Wearing_Earrings Wearing_Hat Wearing_Lipstick Wearing_Necklace
14 | # Wearing_Necktie Young
15 | 
16 | def _read_attributes(attrfile):
17 |     """Parses the attributes file from the Celeb-A dataset and returns (attr_names, attr_values)"""
18 | 
19 |     # The first line is the number of images in the dataset. Ignore.
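    # Expected file layout (inferred from the parsing code below):
    #   line 1:  number of images (ignored here)
    #   line 2:  attribute names
    #   line 3+: "<image>.jpg -1 1 ... 1" where 1 means the attribute is present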
20 |     f = open(attrfile, 'r')
21 |     f.readline()
22 | 
23 |     # The second line contains the names of the boolean attributes
24 |     names = f.readline().strip().split()
25 | 
26 |     attr_names = {}
27 |     for i in range(len(names)):
28 |         attr_names[names[i]] = i
29 | 
30 |     # The remaining lines contain a file name and a list of boolean attributes
31 |     attr_values = []
32 |     for line in f:
33 |         fields = line.strip().split()
34 |         img_name = fields[0]
35 |         assert img_name[-4:] == '.jpg'
36 |         attr_bitfield = [field == '1' for field in fields[1:]]
37 |         attr_bitfield = np.array(attr_bitfield, dtype=np.bool)
38 |         attr_values.append((img_name, attr_bitfield))
39 | 
40 |     return attr_names, attr_values
41 | 
42 | 
43 | def _filter_attributes(attr_names, attr_values, sel):
44 |     """Returns the filenames that match the attributes given by 'sel'"""
45 | 
46 |     # Select those files whose attributes all match the selection
47 |     filenames = []
48 |     for filename, attrs in attr_values:
49 |         all_match = True
50 |         for name, value in sel.items():
51 |             column = attr_names[name]
52 |             #print("name=%s, value=%s, column=%s, attrs[column]=%s" % (name, value, column, attrs[column]))
53 |             if attrs[column] != value:
54 |                 all_match = False
55 |                 break
56 | 
57 |         if all_match:
58 |             filenames.append(filename)
59 | 
60 |     return filenames
61 | 
62 | 
63 | def select_samples(selection={}):
64 |     """Selects those images in the Celeb-A dataset whose
65 |     attributes match the constraints given in 'selection'"""
66 | 
67 |     attrfile = os.path.join(FLAGS.dataset, FLAGS.attribute_file)
68 |     names, attributes = _read_attributes(attrfile)
69 | 
70 |     filenames = _filter_attributes(names, attributes, selection)
71 | 
72 |     filenames = sorted(filenames)
73 |     random.shuffle(filenames)
74 | 
75 |     filenames = [os.path.join(FLAGS.dataset, file) for file in filenames]
76 | 
77 |     return filenames
78 | 
--------------------------------------------------------------------------------
/dm_flags.py:
--------------------------------------------------------------------------------
1 | 
2 | import tensorflow as tf
3 | 
4 | FLAGS = tf.app.flags.FLAGS
5 | 
6 | def define_flags():
7 |     # Configuration (alphabetically)
8 |     tf.app.flags.DEFINE_integer('annealing_half_life', 10000,
9 |                                 "Number of batches until annealing temperature is halved")
10 | 
11 |     tf.app.flags.DEFINE_string('attribute_file', 'list_attr_celeba.txt',
12 |                                "Celeb-A dataset attribute file")
13 | 
14 |     tf.app.flags.DEFINE_integer('batch_size', 16,
15 |                                 "Number of samples per batch.")
16 | 
17 |     tf.app.flags.DEFINE_string('checkpoint_dir', 'checkpoint',
18 |                                "Output folder where checkpoints are dumped.")
19 | 
20 |     tf.app.flags.DEFINE_string('dataset', 'dataset',
21 |                                "Path to the dataset directory.")
22 | 
23 |     tf.app.flags.DEFINE_float('disc_loss_threshold', 0.1,
24 |                               "If the discriminator's loss is above this threshold then only the discriminator will be trained during the next step")
25 | 
26 |     tf.app.flags.DEFINE_float('disc_weights_threshold', 0.01,
27 |                               "Maximum absolute value allowed for weights in the discriminator")
28 | 
29 |     tf.app.flags.DEFINE_float('epsilon', 1e-8,
30 |                               "Fuzz term to avoid numerical instability")
31 | 
32 |     tf.app.flags.DEFINE_string('infile', None,
33 |                                "Inference input file. See also `outfile`")
34 | 
35 |     tf.app.flags.DEFINE_float('instance_noise', 0.5,
36 |                               "Standard deviation (amplitude) of instance noise")
37 | 
38 |     tf.app.flags.DEFINE_float('learning_rate_start', 0.000100,
39 |                               "Starting learning rate used for AdamOptimizer")
40 | 
41 |     tf.app.flags.DEFINE_float('learning_rate_end', 0.000001,
42 |                               "Ending learning rate used for AdamOptimizer")
43 | 
44 |     tf.app.flags.DEFINE_string('outfile', 'inference_out.png',
45 |                                "Inference output file. See also `infile`")
46 | 
47 |     tf.app.flags.DEFINE_float('pixel_loss_max', 0.95,
48 |                               "Initial pixel loss relative weight")
49 | 
50 |     tf.app.flags.DEFINE_float('pixel_loss_min', 0.70,
51 |                               "Asymptotic pixel loss relative weight")
52 | 
53 |     tf.app.flags.DEFINE_string('run', None,
54 |                                "Which operation to run. [train|inference]")
55 | 
56 |     tf.app.flags.DEFINE_integer('summary_period', 20,
57 |                                 "Number of batches between summary data dumps")
58 | 
59 |     tf.app.flags.DEFINE_integer('random_seed', 10,
60 |                                 "Seed used to initialize rng.")
61 | 
62 |     tf.app.flags.DEFINE_integer('test_vectors', 16,
63 |                                 "Number of samples reserved for testing")
64 | 
65 |     tf.app.flags.DEFINE_string('train_dir', 'train',
66 |                                "Output folder where training logs are dumped.")
67 | 
68 |     tf.app.flags.DEFINE_string('train_mode', 'mtf',
69 |                                "Training mode. Can be male-to-female (`mtf`), female-to-male (`ftm`), male-to-male (`mtm`) or female-to-female (`ftf`)")
70 | 
71 |     tf.app.flags.DEFINE_integer('train_time', 180,
72 |                                 "Time in minutes to train the model")
--------------------------------------------------------------------------------
/dm_infer.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import tensorflow as tf
3 | 
4 | import dm_utils
5 | 
6 | FLAGS = tf.app.flags.FLAGS
7 | 
8 | 
9 | def inference(infer_data):
10 | 
11 |     sess = infer_data.sess
12 |     idm = infer_data.infer_model
13 | 
14 |     image = sess.run(idm.gene_out)
15 |     image = np.squeeze(image, axis=0)
16 | 
17 |     dm_utils.save_image(image, FLAGS.outfile)
--------------------------------------------------------------------------------
/dm_input.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | 
3 | import dm_celeba
4 | 
5 | FLAGS = tf.app.flags.FLAGS
6 | 
7 | def input_data(sess, mode, filenames, capacity_factor=3):
8 | 
9 |     # Separate training and test sets
10 |     # TBD: Use partition given by dataset creators
11 |     assert mode == 'inference' or len(filenames) >= FLAGS.test_vectors
12 | 
13 |     if mode == 'train':
14 |         filenames = filenames[FLAGS.test_vectors:]
15 |         batch_size = FLAGS.batch_size
16 |     elif mode == 'test':
17 |         filenames = filenames[:FLAGS.test_vectors]
18 |         batch_size = FLAGS.batch_size
19 |     elif mode == 'inference':
20 |         filenames = filenames[:]
21 |         batch_size = 1
22 |     else:
23 |         raise ValueError('Unknown mode `%s`' % (mode,))
24 | 
25 |     # Read each JPEG file
26 |     reader = tf.WholeFileReader()
27 |     filename_queue = tf.train.string_input_producer(filenames)
28 |     key, value = reader.read(filename_queue)
29 |     channels = 3
30 |     image = tf.image.decode_jpeg(value, channels=channels, name="dataset_image")
31 |     image.set_shape([None, None, channels])
32 | 
33 |     # Crop and other random augmentations
34 |     if mode == 'train':
35 |         image = tf.image.random_flip_left_right(image)
36 |         #image = tf.image.random_saturation(image, .95, 1.05)
37 |         #image = tf.image.random_brightness(image, .05)
38 |         #image = tf.image.random_contrast(image, .95, 1.05)
39 | 
40 |     size_x, size_y = 80, 100
41 | 
42 |     if mode == 'inference':
43 |         # TBD: What does the 'align_corners' parameter do? Stretch blit?
44 |         image = tf.image.resize_images(image, (size_y, size_x), method=tf.image.ResizeMethod.AREA)
45 |     else:
46 |         # Dataset samples are 178x218 pixels
47 |         # Select the face only, without hair
48 |         off_x, off_y = 49, 90
49 |         image = tf.image.crop_to_bounding_box(image, off_y, off_x, size_y, size_x)
50 | 
51 |     feature = tf.cast(image, tf.float32)/255.0
52 | 
53 |     # Using asynchronous queues
54 |     features = tf.train.batch([feature],
55 |                               batch_size=batch_size,
56 |                               num_threads=4,
57 |                               capacity=capacity_factor*batch_size,
58 |                               name='features')
59 | 
60 |     tf.train.start_queue_runners(sess=sess)
61 | 
62 |     return features
63 | 
--------------------------------------------------------------------------------
/dm_main.py:
--------------------------------------------------------------------------------
1 | import os
2 | 
3 | # Disable Tensorflow's INFO and WARNING messages
4 | # See http://stackoverflow.com/questions/35911252
5 | if 'TF_CPP_MIN_LOG_LEVEL' not in os.environ:
6 |     os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
7 | 
8 | import numpy as np
9 | import numpy.random
10 | import os.path
11 | import random
12 | import tensorflow as tf
13 | 
14 | import dm_celeba
15 | import dm_flags
16 | import dm_infer
17 | import dm_input
18 | import dm_model
19 | import dm_show
20 | import dm_train
21 | import dm_utils
22 | 
23 | FLAGS = tf.app.flags.FLAGS
24 | 
25 | 
26 | def _setup_tensorflow():
27 |     # Create session
28 |     config = tf.ConfigProto(log_device_placement=False) #, intra_op_parallelism_threads=1)
29 |     sess = tf.Session(config=config)
30 | 
31 |     # Initialize all RNGs with a deterministic seed
32 |     with sess.graph.as_default():
33 |         tf.set_random_seed(FLAGS.random_seed)
34 | 
35 |     random.seed(FLAGS.random_seed)
36 |     np.random.seed(FLAGS.random_seed)
37 | 
38 |     return sess
39 | 
40 | 
41 | # TBD: Move to dm_train.py?
42 | def _prepare_train_dirs():
43 |     # Create checkpoint dir (do not delete anything)
44 |     if not tf.gfile.Exists(FLAGS.checkpoint_dir):
45 |         tf.gfile.MakeDirs(FLAGS.checkpoint_dir)
46 | 
47 |     # Cleanup train dir
48 |     if tf.gfile.Exists(FLAGS.train_dir):
49 |         try:
50 |             tf.gfile.DeleteRecursively(FLAGS.train_dir)
51 |         except:
52 |             pass
53 |     tf.gfile.MakeDirs(FLAGS.train_dir)
54 | 
55 |     # Ensure dataset folder exists
56 |     if not tf.gfile.Exists(FLAGS.dataset) or \
57 |        not tf.gfile.IsDirectory(FLAGS.dataset):
58 |         raise FileNotFoundError("Could not find folder `%s`" % (FLAGS.dataset,))
59 | 
60 | 
61 | # TBD: Move to dm_train.py?
62 | def _get_train_data():
63 |     # Setup global tensorflow state
64 |     sess = _setup_tensorflow()
65 | 
66 |     # Prepare directories
67 |     _prepare_train_dirs()
68 | 
69 |     # Which type of transformation?
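    # Each train_mode maps to a pair of CelebA attribute filters: source_filter
    # selects the input population and target_filter selects the population whose
    # look the generator is trained to reproduce.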
70 |     # Note: eyeglasses and sunglasses are filtered out because they tend to produce artifacts
71 |     if FLAGS.train_mode == 'ftm' or FLAGS.train_mode == 'f2m':
72 |         # Trans filter: from female to attractive male
73 |         # Note: removed facial hair from target images because otherwise the network becomes overly focused on rendering facial hair
74 |         source_filter = {'Male':False, 'Blurry':False, 'Eyeglasses':False}
75 |         target_filter = {'Male':True, 'Blurry':False, 'Eyeglasses':False, 'Attractive':True, 'Goatee':False, 'Mustache':False, 'No_Beard':True}
76 |     elif FLAGS.train_mode == 'mtf' or FLAGS.train_mode == 'm2f':
77 |         # Trans filter: from male to attractive female
78 |         source_filter = {'Male':True, 'Blurry':False, 'Eyeglasses':False}
79 |         target_filter = {'Male':False, 'Blurry':False, 'Eyeglasses':False, 'Attractive':True}
80 |     elif FLAGS.train_mode == 'ftf' or FLAGS.train_mode == 'f2f':
81 |         # Vanity filter: from female to attractive female
82 |         source_filter = {'Male':False, 'Blurry':False, 'Eyeglasses':False}
83 |         target_filter = {'Male':False, 'Blurry':False, 'Eyeglasses':False, 'Attractive':True}
84 |     elif FLAGS.train_mode == "mtm" or FLAGS.train_mode == 'm2m':
85 |         # Vanity filter: from male to attractive male
86 |         source_filter = {'Male':True, 'Blurry':False, 'Eyeglasses':False}
87 |         target_filter = {'Male':True, 'Blurry':False, 'Eyeglasses':False, 'Attractive':True}
88 |     else:
89 |         raise ValueError('`train_mode` must be one of: `ftm`, `mtf`, `ftf` or `mtm`')
90 | 
91 |     # Setup async input queues
92 |     selected = dm_celeba.select_samples(source_filter)
93 |     source_images = dm_input.input_data(sess, 'train', selected)
94 |     test_images = dm_input.input_data(sess, 'test', selected)
95 |     print('%8d source images selected' % (len(selected),))
96 | 
97 |     selected = dm_celeba.select_samples(target_filter)
98 |     target_images = dm_input.input_data(sess, 'train', selected)
99 |     print('%8d target images selected' % (len(selected),))
100 |     print()
101 | 
102 |     # Annealing temperature: starts at 1.0 and decreases exponentially over time
103 |     annealing = tf.Variable(initial_value=1.0, trainable=False, name='annealing')
104 |     halve_annealing = tf.assign(annealing, 0.5*annealing)
105 | 
106 |     # Create and initialize training and testing models
107 |     train_model = dm_model.create_model(sess, source_images, target_images, annealing, verbose=True)
108 | 
109 |     print("Building testing model...")
110 |     test_model = dm_model.create_model(sess, test_images, None, annealing)
111 |     print("Done.")
112 | 
113 |     # Without this line TF will deadlock at the beginning of training
114 |     tf.train.start_queue_runners(sess=sess)
115 | 
116 |     # Pack all for convenience
117 |     train_data = dm_utils.Container(locals())
118 | 
119 |     return train_data
120 | 
121 | 
122 | # TBD: Move to dm_infer.py?
123 | def _get_inference_data():
124 |     # Setup global tensorflow state
125 |     sess = _setup_tensorflow()
126 | 
127 |     # Load single image to use for inference
128 |     if FLAGS.infile is None:
129 |         raise ValueError('Must specify inference input file through the `--infile` command line argument')
130 | 
131 |     if not tf.gfile.Exists(FLAGS.infile) or tf.gfile.IsDirectory(FLAGS.infile):
132 |         raise FileNotFoundError('File `%s` does not exist or is a directory' % (FLAGS.infile,))
133 | 
134 |     filenames = [FLAGS.infile]
135 |     infer_images = dm_input.input_data(sess, 'inference', filenames)
136 | 
137 |     print('Loading model...')
138 |     # Create inference model
139 |     infer_model = dm_model.create_model(sess, infer_images)
140 | 
141 |     # Load model parameters from checkpoint
142 |     checkpoint = tf.train.get_checkpoint_state(FLAGS.checkpoint_dir)
143 |     try:
144 |         saver = tf.train.Saver()
145 |         saver.restore(sess, checkpoint.model_checkpoint_path)
146 |         del saver
147 |         del checkpoint
148 |     except:
149 |         raise RuntimeError('Unable to read checkpoint from `%s`' % (FLAGS.checkpoint_dir,))
150 |     print('Done.')
151 | 
152 |     # Pack all for convenience
153 |     infer_data = dm_utils.Container(locals())
154 | 
155 |     return infer_data
156 | 
157 | 
158 | def main(argv=None):
159 |     if FLAGS.run == 'train':
160 |         train_data = _get_train_data()
161 |         dm_train.train_model(train_data)
162 |     elif FLAGS.run == 'inference':
163 |         infer_data = _get_inference_data()
164 |         dm_infer.inference(infer_data)
165 |     else:
166 |         print("Operation `%s` not supported" % (FLAGS.run,))
167 | 
168 | if __name__ == '__main__':
169 |     dm_flags.define_flags()
170 |     tf.app.run()
171 | 
--------------------------------------------------------------------------------
/dm_model.py:
--------------------------------------------------------------------------------
1 | import math
2 | import numpy as np
3 | import tensorflow as tf
4 | 
5 | import dm_arch
6 | import dm_utils
7 | 
8 | FLAGS = tf.app.flags.FLAGS
9 | 
10 | def _residual_block(model, num_units, mapsize, nlayers=2):
11 |     """Adds a residual block similar to Arxiv 1512.03385, Figure 3.
12 |     """
13 | 
14 |     # TBD: Try pyramidal block as per arXiv 1610.02915.
15 |     # Note Figure 6d (the extra BN compared to 6b seems to help as per Table 2)
16 |     # Also note Figure 5b.
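    # Layout below: an optional 1x1 linear projection (only when the channel
    # count changes), a single batch norm, then nlayers of [ReLU -> conv -> add bypass]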
17 | 
18 |     assert len(model.get_output().get_shape()) == 4, "Previous layer must be 4-dimensional (batch, width, height, channels)"
19 | 
20 |     # Add *linear* projection in series if needed prior to shortcut
21 |     if num_units != int(model.get_output().get_shape()[3]):
22 |         model.add_conv2d(num_units, mapsize=1, stride=1)
23 | 
24 |     if nlayers > 0:
25 |         # Batch norm not needed for every conv layer
26 |         # and it slows down training substantially
27 |         model.add_batch_norm()
28 | 
29 |     for _ in range(nlayers):
30 |         # Bypassing on every conv layer, as implied by Arxiv 1612.07771
31 |         # Experimental results particularly favor one (Arxiv 1512.03385) or the other (this)
32 |         bypass = model.get_output()
33 |         model.add_relu()
34 |         model.add_conv2d(num_units, mapsize=mapsize, is_residual=True)
35 |         model.add_sum(bypass)
36 | 
37 |     return model
38 | 
39 | 
40 | def _generator_model(sess, features):
41 |     # See Arxiv 1603.05027
42 |     model = dm_arch.Model('GENE', 2 * features - 1)
43 | 
44 |     mapsize = 3
45 | 
46 |     # Encoder
47 |     layers = [24, 48]
48 |     for nunits in layers:
49 |         _residual_block(model, nunits, mapsize)
50 |         model.add_avg_pool()
51 | 
52 |     # Decoder
53 |     layers = [96, 64]
54 |     for nunits in layers:
55 |         _residual_block(model, nunits, mapsize)
56 |         _residual_block(model, nunits, mapsize)
57 |         model.add_upscale()
58 | 
59 |     nunits = 48
60 |     _residual_block(model, nunits, mapsize)
61 |     _residual_block(model, nunits, mapsize)
62 |     model.add_conv2d(3, mapsize=1)
63 |     model.add_sigmoid(1.1)
64 | 
65 |     return model
66 | 
67 | 
68 | def _discriminator_model(sess, image):
69 |     model = dm_arch.Model('DISC', 2 * image - 1.0)
70 | 
71 |     mapsize = 3
72 |     layers = [64, 96, 128, 192] #[32, 48, 96, 128]
73 | 
74 |     for nunits in layers:
75 |         model.add_batch_norm()
76 |         model.add_lrelu()
77 |         model.add_conv2d(nunits, mapsize=mapsize)
78 | 
79 |         model.add_avg_pool()
80 | 
81 |     nunits = layers[-1]
82 |     model.add_batch_norm()
83 |     model.add_lrelu()
84 |     model.add_conv2d(nunits, mapsize=mapsize)
85 | 
86 |     #model.add_batch_norm()
87 |     model.add_lrelu()
88 |     model.add_conv2d(1, mapsize=mapsize)
89 | 
90 |     model.add_mean()
91 | 
92 |     return model
93 | 
94 | 
95 | def _generator_loss(features, gene_output, disc_fake_output, annealing):
96 |     # I.e. did we fool the discriminator?
97 |     gene_adversarial_loss = tf.reduce_mean(-disc_fake_output, name='gene_adversarial_loss')
98 | 
99 |     # NOTE: only the adversarial term is used here; the annealed pixel-loss term described in the README is not present in this version
100 |     return gene_adversarial_loss # gene_loss
101 | 
102 | 
103 | def _discriminator_loss(disc_real_output, disc_fake_output):
104 |     # I.e. did we correctly identify the input as real or not?
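    # WGAN-style critic loss (arXiv 1701.07875): the raw score is pushed up for
    # real samples and down for fake ones; no sigmoid or cross-entropy involved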
105 |     disc_real_loss = -disc_real_output
106 |     disc_fake_loss = disc_fake_output
107 | 
108 |     disc_real_loss = tf.reduce_mean(disc_real_loss, name='disc_real_loss')
109 |     disc_fake_loss = tf.reduce_mean(disc_fake_loss, name='disc_fake_loss')
110 |     disc_loss = tf.add(disc_real_loss, disc_fake_loss, name='disc_loss')
111 | 
112 |     return disc_loss, disc_real_loss, disc_fake_loss
113 | 
114 | 
115 | def _clip_weights(var_list, weights_threshold):
116 |     """Clips all the given weights to fall within the range [-weights_threshold, weights_threshold]"""
117 |     ops = []
118 |     for var in var_list:
119 |         clipped = tf.clip_by_value(var, -weights_threshold, weights_threshold)
120 |         op = tf.assign(var, clipped)
121 |         ops.append(op)
122 | 
123 |     return tf.group(*ops, name='clip_weights')
124 | 
125 | 
126 | def create_model(sess, source_images, target_images=None, annealing=None, verbose=False):
127 |     rows = int(source_images.get_shape()[1])
128 |     cols = int(source_images.get_shape()[2])
129 |     depth = int(source_images.get_shape()[3])
130 | 
131 |     #
132 |     # Generator
133 |     #
134 |     gene = _generator_model(sess, source_images)
135 |     gene_out = gene.get_output()
136 |     gene_var_list = gene.get_all_variables()
137 | 
138 |     if verbose:
139 |         print("Generator input (feature) size is %d x %d x %d = %d" %
140 |               (rows, cols, depth, rows*cols*depth))
141 | 
142 |         print("Generator has %4.2fM parameters" % (gene.get_num_parameters()/1e6,))
143 |         print()
144 | 
145 |     if target_images is not None:
146 |         learning_rate = tf.maximum(FLAGS.learning_rate_start * annealing, FLAGS.learning_rate_end, name='learning_rate')
147 | 
148 |         # Instance noise used to aid convergence.
149 |         # See http://www.inference.vc/instance-noise-a-trick-for-stabilising-gan-training/
150 |         noise_shape = [FLAGS.batch_size, rows, cols, depth]
151 |         noise = tf.truncated_normal(noise_shape, mean=0.0, stddev=FLAGS.instance_noise*annealing, name='instance_noise')
152 |         noise = tf.reshape(noise, noise_shape) # TBD: this reshape appears redundant; truncated_normal already returns a tensor of this shape
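        # NOTE: the next assignment replaces the noise tensor with a constant,
        # effectively disabling instance noise despite the setup above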
153 |         noise = 0.0
154 | 
155 |         #
156 |         # Discriminator: one takes real inputs, another takes fake (generated) inputs
157 |         #
158 |         disc_real = _discriminator_model(sess, target_images + noise)
159 |         disc_real_out = disc_real.get_output()
160 |         disc_var_list = disc_real.get_all_variables()
161 | 
162 |         disc_fake = _discriminator_model(sess, gene_out + noise)
163 |         disc_fake_out = disc_fake.get_output()
164 | 
165 |         if verbose:
166 |             print("Discriminator input (feature) size is %d x %d x %d = %d" %
167 |                   (rows, cols, depth, rows*cols*depth))
168 | 
169 |             print("Discriminator has %4.2fM parameters" % (disc_real.get_num_parameters()/1e6,))
170 |             print()
171 | 
172 |         #
173 |         # Losses and optimizers
174 |         #
175 |         gene_loss = _generator_loss(source_images, gene_out, disc_fake_out, annealing)
176 | 
177 |         disc_loss, disc_real_loss, disc_fake_loss = _discriminator_loss(disc_real_out, disc_fake_out)
178 | 
179 |         gene_opti = tf.train.AdamOptimizer(learning_rate=learning_rate,
180 |                                            name='gene_optimizer')
181 | 
182 |         # Note WGAN doesn't work well with Adam or any other optimizer that relies on momentum
183 |         disc_opti = tf.train.RMSPropOptimizer(learning_rate=learning_rate, momentum=0.0,
184 |                                               name='disc_optimizer')
185 | 
186 |         gene_minimize = gene_opti.minimize(gene_loss, var_list=gene_var_list, name='gene_loss_minimize')
187 |         disc_minimize = disc_opti.minimize(disc_loss, var_list=disc_var_list, name='disc_loss_minimize')
188 | 
189 |         # Weight clipping a la WGAN (arXiv 1701.07875)
190 |         # TBD: We shouldn't be clipping all variables (incl biases), just the weights
191 |         disc_clip_weights = _clip_weights(disc_var_list, FLAGS.disc_weights_threshold)
192 |         disc_minimize = tf.group(disc_minimize, disc_clip_weights)
193 | 
194 |     # Package everything into a dumb object
195 |     model = dm_utils.Container(locals())
196 | 
197 |     return model
--------------------------------------------------------------------------------
/dm_show.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import os.path
3 | import scipy.misc
4 | import tensorflow as tf
5 | import time
6 | 
7 | import dm_arch
8 | import dm_input
9 | import dm_utils
10 | 
11 | FLAGS = tf.app.flags.FLAGS
12 | 
13 | 
--------------------------------------------------------------------------------
/dm_train.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import os.path
3 | import tensorflow as tf
4 | import time
5 | 
6 | import dm_arch
7 | import dm_input
8 | import dm_utils
9 | 
10 | FLAGS = tf.app.flags.FLAGS
11 | 
12 | 
13 | def _save_image(train_data, feature, gene_output, batch, suffix, max_samples=None):
14 |     """Saves a picture showing the current progress of the model"""
15 | 
16 |     if max_samples is None:
17 |         max_samples = int(feature.shape[0])
18 | 
19 |     td = train_data
20 | 
21 |     clipped = np.clip(gene_output, 0, 1)
22 |     image = np.concatenate([feature, clipped], 2)
23 | 
24 |     image = image[:max_samples,:,:,:]
25 |     cols = []
26 |     num_cols = 4
27 |     samples_per_col = max_samples//num_cols
28 | 
29 |     for c in range(num_cols):
30 |         col = np.concatenate([image[samples_per_col*c + i,:,:,:] for i in range(samples_per_col)], 0)
31 |         cols.append(col)
32 | 
33 |     image = np.concatenate(cols, 1)
34 | 
35 |     filename = 'batch%06d_%s.png' % (batch, suffix)
36 |     filename = os.path.join(FLAGS.train_dir, filename)
37 | 
38 |     dm_utils.save_image(image, filename)
39 | 
40 | 
41 | def _save_checkpoint(train_data, batch):
42 |     """Saves a checkpoint of the model which can later be restored"""
43 |     td = train_data
44 | 
45 |     oldname = 'checkpoint_old.txt'
46 |     newname = 'checkpoint_new.txt'
47 | 
48 |     oldname = os.path.join(FLAGS.checkpoint_dir, oldname)
49 |     newname = os.path.join(FLAGS.checkpoint_dir, newname)
50 | 
51 |     # Delete oldest checkpoint
52 |     try:
53 |         tf.gfile.Remove(oldname)
54 |         tf.gfile.Remove(oldname + '.meta')
55 |     except:
56 |         pass
57 | 
58 |     # Rename old checkpoint
59 |     try:
60 |         tf.gfile.Rename(newname, oldname)
61 |         tf.gfile.Rename(newname + '.meta', oldname + '.meta')
62 |     except:
63 |         pass
64 | 
65 |     # Generate new checkpoint
66 |     saver = tf.train.Saver()
67 |     saver.save(td.sess, newname)
68 | 
69 |     print("    Checkpoint saved")
70 | 
71 | 
72 | def train_model(train_data):
73 |     """Trains the given model with the given dataset"""
74 |     td = train_data
75 |     tda = td.train_model
76 |     tde = td.test_model
77 | 
78 |     dm_arch.enable_training(True)
79 |     dm_arch.initialize_variables(td.sess)
80 | 
81 |     # Train the model
82 |     minimize_ops = [tda.gene_minimize, tda.disc_minimize]
83 |     show_ops = [td.annealing, tda.gene_loss, tda.disc_loss, tda.disc_real_loss, tda.disc_fake_loss]
84 | 
85 |     start_time = time.time()
86 |     step = 0
87 |     done = False
88 |     gene_decor = " "
89 | 
90 |     print('\nModel training...')
91 | 
92 |     while not done:
93 |         # Show progress with test features
94 |         if step % FLAGS.summary_period == 0:
95 |             feature, gene_mout = td.sess.run([tde.source_images, tde.gene_out])
96 |             _save_image(td, feature, gene_mout, step, 'out')
97 | 
98 |             # Compute losses and show that we are alive
99 |             annealing, gene_loss, disc_loss, disc_real_loss, disc_fake_loss = td.sess.run(show_ops)
100 |             elapsed = int(time.time() - start_time)/60
101 |             print('  Progress[%3d%%], ETA[%4dm], Step [%5d], temp[%3.3f], %sgene[%-3.3f], *disc[%-3.3f] real[%-3.3f] fake[%-3.3f]' %
102 |                   (int(100*elapsed/FLAGS.train_time), FLAGS.train_time - elapsed, step,
103 |                    annealing, gene_decor, gene_loss, disc_loss, disc_real_loss, disc_fake_loss))
104 | 
105 |         # Tight loop to maximize GPU utilization
106 |         # TBD: Is there any way to make Tensorflow repeat multiple times an operation with a single sess.run call?
107 |         if step < 200:
108 |             # Warm-up: train the discriminator only, so that it learns a useful signal first
109 |             gene_decor = " "
110 |             for _ in range(10):
111 |                 td.sess.run(tda.disc_minimize)
112 |         else:
113 |             # Discriminator doing well --> train both generator and discriminator, but mostly discriminator
114 |             gene_decor = "*"
115 |             for _ in range(2):
116 |                 td.sess.run(minimize_ops)
117 |                 td.sess.run(tda.disc_minimize)
118 |                 td.sess.run(tda.disc_minimize)
119 |                 td.sess.run(tda.disc_minimize)
120 |         step += 1
121 | 
122 |         # Finished?
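        # Training stops on wall-clock time, once `elapsed` reaches --train_time minutes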
123 |         current_progress = elapsed / FLAGS.train_time
124 |         if current_progress >= 1.0:
125 |             done = True
126 | 
127 |         # Decrease annealing temperature exponentially
128 |         if step % FLAGS.annealing_half_life == 0:
129 |             td.sess.run(td.halve_annealing)
130 | 
131 |         # Save checkpoint
132 |         #if step % FLAGS.checkpoint_period == 0:
133 |         #    _save_checkpoint(td, step)
134 | 
135 |     _save_checkpoint(td, step)
136 |     print('Finished training!')
--------------------------------------------------------------------------------
/dm_utils.py:
--------------------------------------------------------------------------------
1 | import math
2 | import numpy as np
3 | import scipy.misc
4 | import tensorflow as tf
5 | 
6 | class Container(object):
7 |     """Dumb container object"""
8 |     def __init__(self, dictionary):
9 |         self.__dict__.update(dictionary)
10 | 
11 | def _edge_filter():
12 |     """Returns a 3x3 edge-detection filter functionally similar to Sobel"""
13 | 
14 |     # See https://en.wikipedia.org/w/index.php?title=Talk:Sobel_operator&oldid=737772121#Scharr_not_the_ultimate_solution
15 |     a = .5*(1-math.sqrt(.5))
16 |     b = math.sqrt(.5)
17 | 
18 |     # Horizontal filter as a 4-D tensor suitable for tf.nn.conv2d()
19 |     h = np.zeros([3,3,3,3])
20 | 
21 |     for d in range(3):
22 |         # I.e. each RGB channel is processed independently
23 |         h[0,:,d,d] = [ a,  b,  a]
24 |         h[2,:,d,d] = [-a, -b, -a]
25 | 
26 |     # Vertical filter
27 |     v = np.transpose(h, axes=[1, 0, 2, 3])
28 | 
29 |     return h, v
30 | 
31 | def total_variation_loss(images, name='total_variation_loss'):
32 |     """Returns a loss term that penalizes high-frequency features in the image.
33 |     Similar to the 'total variation loss' but using a different high-pass filter."""
34 | 
35 |     filter_h, filter_v = _edge_filter()
36 |     strides = [1,1,1,1]
37 | 
38 |     hor_edges = tf.nn.conv2d(images, filter_h, strides, padding='VALID', name='horizontal_edges')
39 |     ver_edges = tf.nn.conv2d(images, filter_v, strides, padding='VALID', name='vertical_edges')
40 | 
41 |     l2_edges = tf.add(hor_edges*hor_edges, ver_edges*ver_edges, name='L2_edges')
42 | 
43 |     total_variation_loss = tf.reduce_mean(l2_edges, name=name)
44 | 
45 |     return total_variation_loss
46 | 
47 | def distort_image(image):
48 |     """Performs random distortions to the given 4D image and returns the result"""
49 | 
50 |     # Switch to 3D as that's what these operations require
51 |     slices = tf.unpack(image)
52 |     output = []
53 | 
54 |     # Perform pixel-wise distortions
55 |     for image in slices:
56 |         image = tf.image.random_flip_left_right(image)
57 |         image = tf.image.random_saturation(image, .2, 2.)
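        # Additive per-pixel Gaussian noise on top of the color jitter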
58 |         image += tf.truncated_normal(image.get_shape(), stddev=.05)
59 |         image = tf.image.random_contrast(image, .85, 1.15)
60 |         image = tf.image.random_brightness(image, .3)
61 | 
62 |         output.append(image)
63 | 
64 |     # Go back to 4D
65 |     image = tf.pack(output)
66 | 
67 |     return image
68 | 
69 | def downscale(images, K):
70 |     """Differentiable image downscaling by a factor of K"""
71 |     arr = np.zeros([K, K, 3, 3])
72 |     arr[:,:,0,0] = 1.0/(K*K)
73 |     arr[:,:,1,1] = 1.0/(K*K)
74 |     arr[:,:,2,2] = 1.0/(K*K)
75 |     downscale_weight = tf.constant(arr, dtype=tf.float32)
76 | 
77 |     downscaled = tf.nn.conv2d(images, downscale_weight,
78 |                               strides=[1, K, K, 1],
79 |                               padding='SAME')
80 |     return downscaled
81 | 
82 | def upscale(images, K):
83 |     """Differentiable image upscaling by a factor of K"""
84 |     prev_shape = images.get_shape()
85 |     size = [K * int(s) for s in prev_shape[1:3]]
86 |     out = tf.image.resize_nearest_neighbor(images, size)
87 | 
88 |     return out
89 | 
90 | def save_image(image, filename, verbose=True):
91 |     """Saves a (height,width,3) numpy array into a file"""
92 |     scipy.misc.toimage(image, cmin=0., cmax=1.).save(filename)
93 |     if verbose: print(" Saved %s" % (filename,))
--------------------------------------------------------------------------------
/images/example_female_to_male.jpg:
--------------------------------------------------------------------------------
 https://raw.githubusercontent.com/david-gpu/deep-makeover/691b77be809887723f14f95b25a5eb87299c80a7/images/example_female_to_male.jpg
--------------------------------------------------------------------------------
/images/example_male_to_female.jpg:
--------------------------------------------------------------------------------
 https://raw.githubusercontent.com/david-gpu/deep-makeover/691b77be809887723f14f95b25a5eb87299c80a7/images/example_male_to_female.jpg
--------------------------------------------------------------------------------