├── LICENSE
├── README.md
├── gen.py
├── image_collection
│   ├── collection_0.png
│   ├── collection_1.png
│   ├── collection_2.png
│   └── collection_3.png
├── prepro.py
├── test.py
└── train.py
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2016 Kyubyong Park

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# An Implementation of 'Texture Synthesis Using Convolutional Neural Networks'
This project is designed to synthesize the 28 texture classes of the [Kylberg Texture Dataset](http://www.cb.uu.se/~gustaf/texture/), based on the ideas of Gatys et al.'s paper [Texture Synthesis Using Convolutional Neural Networks](https://arxiv.org/pdf/1505.07376v3.pdf).

## Requirements
* numpy >= 1.11.1
* sugartensor == 0.0.1.8 (pip install sugartensor)
* Kylberg Texture Dataset (can be freely downloaded [here](http://www.cb.uu.se/~gustaf/texture/data/without-rotations-zip/))

## Research Question
Can we generate images that are similar to real texture images using neural networks?

## Main Idea
If two images are similar, their feature maps should be similar, and vice versa. Accordingly, we first train a discriminative network such that it can correctly classify the different classes of texture images. Then, we train the generator, which here is just the noise input image itself, such that its feature maps become similar to those of its target true image.

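To make the second step concrete, here is a minimal NumPy sketch of optimizing only the input while the target statistics stay fixed. It is an illustration under strong assumptions, not the method in `gen.py`: a single identity "layer" stands in for the VGG features, plain gradient descent stands in for the sugartensor optimizer, and the names `gram`, `loss_and_grad`, the array sizes, and the step size are all hypothetical.

```python
import numpy as np

def gram(x):
    """Gram matrix of features flattened to shape (positions, channels)."""
    return np.dot(x.T, x)

def loss_and_grad(x, target_gram):
    """Mean squared Gram difference and its analytic gradient w.r.t. x."""
    diff = gram(x) - target_gram              # (C, C), symmetric
    loss = np.mean(diff ** 2)
    grad = 4.0 * np.dot(x, diff) / diff.size  # d(loss)/dx for symmetric diff
    return loss, grad

rng = np.random.RandomState(0)
target = rng.rand(64, 4)   # stand-in for the target image's features
x = rng.rand(64, 4)        # stand-in for the trainable noise image
target_gram = gram(target)

initial_loss, _ = loss_and_grad(x, target_gram)
for step in range(200):
    loss, grad = loss_and_grad(x, target_gram)
    x -= 1e-3 * grad       # only x is updated; the target stays fixed
final_loss, _ = loss_and_grad(x, target_gram)

print(final_loss < initial_loss)
```

In the real script the same role is played by the variable `generator/x`, while the convolutional weights are restored from the classifier checkpoint and left untouched.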
## Dataset
Refer to the [Kylberg Texture Dataset](http://www.cb.uu.se/~gustaf/texture/).

## Model Architecture and Objective Function

* Model: VGG-19, with the original final dense layers replaced by a convolutional layer.
* Objective function: the sum of L2 losses between the Gram matrices of the feature maps of the noise image and those of the target image.

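The objective can be sketched in plain NumPy as follows. This mirrors the structure of `get_gram_mat` in `gen.py` (flatten to positions by channels, standardize with one global mean and variance, then multiply), but it is only a sketch: the random arrays are hypothetical stand-ins for VGG feature maps, and the `eps` value is an assumption rather than the library's constant.

```python
import numpy as np

def gram_matrix(features, eps=1e-8):
    """Gram matrix of one feature map: (1, H, W, C) -> (C, C)."""
    assert features.ndim == 4 and features.shape[0] == 1
    _, h, w, c = features.shape
    flat = features.reshape(h * w, c)
    # Standardize with a single global mean/variance so entries stay small
    flat = (flat - flat.mean()) / np.sqrt(flat.var() + eps)
    return np.dot(flat.T, flat)

def texture_loss(noise_feats, target_feats):
    """Sum over layers of the mean squared Gram-matrix difference."""
    return sum(np.mean((gram_matrix(a) - gram_matrix(b)) ** 2)
               for a, b in zip(noise_feats, target_feats))

rng = np.random.RandomState(0)
# Hypothetical feature maps for two layers with 4 and 6 channels
target = [rng.rand(1, 8, 8, 4), rng.rand(1, 4, 4, 6)]
noise = [rng.rand(1, 8, 8, 4), rng.rand(1, 4, 4, 6)]

print(gram_matrix(target[0]).shape)   # (4, 4)
print(texture_loss(target, target))   # 0.0
print(texture_loss(noise, target) > 0)
```

Because the Gram matrix sums over all spatial positions, the loss compares channel co-activation statistics and ignores where in the image a feature occurs, which is what makes it suitable for textures.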
## Folder and file instructions
* prepro.py: Preprocessing. Download and unzip the dataset, then write its path to `Hyperparams.image_fpath`. This file builds the queues of images.
* train.py: Training. This trains the discriminative model so that the network can correctly classify images. By default, it creates log files and model files at `asset/train/log` and `asset/train/ckpt` respectively.
* test.py: Testing. This reads the latest model file and prints out the classification results.
* model-001-89612: Pretrained model parameters. Can be downloaded [here](https://drive.google.com/open?id=0B5M-ed49qMsDLU9SV3A3VmczV0E).
  (If you use this, copy it to `asset/train/ckpt`.)
* gen.py: Generating. This generates an image for the given target image and saves it to the `gen_images` folder. Pass the path of the target image as an argument.
  We generated 28 images, targeting the first image of each class. Here is a simple bash script that does this.

```
#!/bin/bash
for entry in ../datasets/Kylberg\ Texture\ Dataset\ v.\ 1.0/without-rotations-zip/*/*-a-p001.png
do
    python gen.py "$entry"
done
```

## Results

Classification acc. = 4285/4480 = 0.96

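The accuracy above is a hit count over all test images, accumulated batch by batch in `test.py`. A minimal sketch of that counting, using a small hypothetical batch of logits and labels rather than real model output:

```python
import numpy as np

def count_hits(logits, labels):
    """Number of rows whose arg-max class equals the true label."""
    preds = np.argmax(logits, axis=-1)
    return int(np.equal(preds, labels).sum())

# Hypothetical batch: 4 images, 3 classes
logits = np.array([[0.1, 2.0, 0.3],
                   [1.5, 0.2, 0.1],
                   [0.0, 0.1, 3.0],
                   [0.9, 0.8, 0.7]])
labels = np.array([1, 0, 2, 1])

hits = count_hits(logits, labels)
print("%d/%d = %.02f" % (hits, len(labels), float(hits) / len(labels)))  # 3/4 = 0.75

# The figure reported above, rounded the same way
print("%.02f" % (4285.0 / 4480))  # 0.96
```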
Here are the generated images.





--------------------------------------------------------------------------------
/gen.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
'''
Texture image generation
'''
import sugartensor as tf
import numpy as np
from scipy import misc
import glob
import os, sys
import time
from prepro import Hyperparams


def transform_image(target_img):
    r"""
    Arg:
      target_img: img file full path

    Returns:
      A numpy array of (1, 224, 224, 1)
    """
    # NOTE: scipy.misc.imread / imsave were removed in SciPy 1.2,
    # so this script needs an older SciPy release.
    img = misc.imread(target_img)

    # Center crop (the source images are 576 x 576)
    offset_height = (576 - 224) // 2
    offset_width = offset_height
    img = img[offset_height:offset_height+224, offset_width:offset_width+224]

    # Convert to 4-D
    img = np.expand_dims(img, 0)
    img = np.expand_dims(img, -1)

    # Normalize to [0, 1]
    img = img.astype(np.float32) / 255

    return img

class ModelGraph:
    def __init__(self):

        with tf.sg_context(name='generator'):
            self.x = tf.sg_initializer.he_uniform(name="x", shape=[1, 224, 224, 1]) # noise image
            self.y = tf.placeholder(dtype=tf.float32, shape=[1, 224, 224, 1]) # true target image

        with tf.sg_context(name='conv', act='relu'):
            self.x_conv1 = (self.x
                            .sg_conv(dim=64)
                            .sg_conv()
                            .sg_pool()) # (1, 112, 112, 64)
            self.x_conv2 = (self.x_conv1
                            .sg_conv(dim=128)
                            .sg_conv()
                            .sg_pool()) # (1, 56, 56, 128)
            self.x_conv3 = (self.x_conv2
                            .sg_conv(dim=256)
                            .sg_conv()
                            .sg_conv()
                            .sg_conv()
                            .sg_pool()) # (1, 28, 28, 256)
            self.x_conv4 = (self.x_conv3
                            .sg_conv(dim=512)
                            .sg_conv()
                            .sg_conv()
                            .sg_conv()
                            .sg_pool()) # (1, 14, 14, 512)
#                             .sg_conv(dim=512)
#                             .sg_conv()
#                             .sg_conv()
#                             .sg_conv()
#                             .sg_pool())

        # Run the target image through the same layers, sharing the weights.
        self.y_conv1 = self.x_conv1.sg_reuse(input=self.y)
        self.y_conv2 = self.x_conv2.sg_reuse(input=self.y)
        self.y_conv3 = self.x_conv3.sg_reuse(input=self.y)
        self.y_conv4 = self.x_conv4.sg_reuse(input=self.y)

        def get_gram_mat(tensor):
            '''
            Arg:
              tensor: 4-D tensor of shape (1, H, W, C). The first dimension must be 1.

            Returns:
              gram matrix of shape (C, C), e.g. 512 by 512 for conv4.
              Read `https://en.wikipedia.org/wiki/Gramian_matrix` for details.
            '''
            assert tensor.get_shape().ndims == 4, "The tensor must be 4 dimensions."

            dim0, dim1, dim2, dim3 = tensor.get_shape().as_list()
            tensor = tensor.sg_reshape(shape=[dim0*dim1*dim2, dim3]) # (1*H*W, C)

            # normalization: Why? Because the original value of gram mat. would be too huge.
            mean, variance = tf.nn.moments(tensor, [0, 1])
            tensor = (tensor - mean) / tf.sqrt(variance + tf.sg_eps)

            tensor_t = tensor.sg_transpose(perm=[1, 0]) # (C, 1*H*W)
            gram_mat = tf.matmul(tensor_t, tensor) # (C, C)

            return gram_mat

        # Loss: Add the loss of each layer
        self.mse = tf.squared_difference(get_gram_mat(self.x_conv1), get_gram_mat(self.y_conv1)).sg_mean() +\
                   tf.squared_difference(get_gram_mat(self.x_conv2), get_gram_mat(self.y_conv2)).sg_mean() +\
                   tf.squared_difference(get_gram_mat(self.x_conv3), get_gram_mat(self.y_conv3)).sg_mean() +\
                   tf.squared_difference(get_gram_mat(self.x_conv4), get_gram_mat(self.y_conv4)).sg_mean()

        self.train_gen = tf.sg_optim(self.mse, lr=0.0001, category='generator') # Note that we train only variable x.

def generate(sample_image, label):
    start_time = time.time()

    g = ModelGraph()
    target = transform_image(sample_image) # Load the target image once, not on every step.

    with tf.Session() as sess:
        # We need to initialize variables in this case because the Variable `generator/x` will not be restored.
        tf.sg_init(sess)

        # Restore everything except the generator variable from the classifier checkpoint.
        restore_vars = [v for v in tf.global_variables() if "generator" not in v.name]
        saver = tf.train.Saver(restore_vars)
        saver.restore(sess, tf.train.latest_checkpoint('asset/train/ckpt'))

        i = 0
        while True:
            mse, _ = sess.run([g.mse, g.train_gen], {g.y: target})

            if time.time() - start_time > 60: # Save every 60 seconds
                gen_image = sess.run(g.x)
                gen_image = np.squeeze(gen_image)
                misc.imsave('gen_images/%s/gen_%.2f.jpg' % (label, mse), gen_image)

                start_time = time.time()
                i += 1
                if i == 60: break # Finish after 60 saves (about 1 hour)

if __name__ == '__main__':
    sample_image = sys.argv[1]
    label = os.path.basename(sample_image).split("-")[0]
    if not os.path.exists("gen_images/" + label): os.makedirs("gen_images/" + label)

    # Save cropped image of the target image.
    misc.imsave("gen_images/{}/{}.jpg".format(label, label), np.squeeze(transform_image(sample_image)))

    generate(sample_image, label)
    print "Done"
--------------------------------------------------------------------------------
/image_collection/collection_0.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Kyubyong/texture_generation/5c03def3a72977bf81aa47af009c321092d95a09/image_collection/collection_0.png
--------------------------------------------------------------------------------
/image_collection/collection_1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Kyubyong/texture_generation/5c03def3a72977bf81aa47af009c321092d95a09/image_collection/collection_1.png
--------------------------------------------------------------------------------
/image_collection/collection_2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Kyubyong/texture_generation/5c03def3a72977bf81aa47af009c321092d95a09/image_collection/collection_2.png
--------------------------------------------------------------------------------
/image_collection/collection_3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Kyubyong/texture_generation/5c03def3a72977bf81aa47af009c321092d95a09/image_collection/collection_3.png
--------------------------------------------------------------------------------
/prepro.py:
--------------------------------------------------------------------------------
#!/usr/bin/python2
# coding: utf-8

import numpy as np
import cPickle as pickle
import codecs
import glob
import re
import sugartensor as tf
import random

# image file path
class Hyperparams:
    image_fpath = '../../datasets/Kylberg Texture Dataset v. 1.0/without-rotations-zip/*/*.png'

class Data:
    def __init__(self, batch_size=16):

        print "# Make classes"
        files = glob.glob(Hyperparams.image_fpath)
        labels = [f.split('/')[-1].split('-')[0] for f in files] # ['scarf2', 'scarf1', ...]

        # NOTE: the index given to each label follows the iteration order of `set(labels)`.
        self.idx2label = {idx:label for idx, label in enumerate(set(labels))}
        self.label2idx = {label:idx for idx, label in self.idx2label.items()}

        labels = [self.label2idx[label] for label in labels] # [3, 4, 6, ...]

        files = tf.convert_to_tensor(files) # (4480,)
        labels = tf.convert_to_tensor(labels) # (4480,)

        file_q, label_q = tf.train.slice_input_producer([files, labels], num_epochs=1) # (), ()
        img_q = tf.image.decode_png(tf.read_file(file_q), channels=1) # (576, 576, 1) uint8
        img_q = self.transform_image(img_q) # (224, 224, 1) float32

        self.x, self.y = tf.train.shuffle_batch([img_q, label_q], batch_size,
                                                num_threads=32, capacity=batch_size*128,
                                                min_after_dequeue=batch_size*32,
                                                allow_smaller_final_batch=False) # (16, 224, 224, 1) (16,)

    def transform_image(self, img):
        r"""
        Arg:
          img: A 3-D tensor of shape (576, 576, 1).

        Returns:
          A `Tensor`. Has the shape of (224, 224, 1) and dtype of float32.
        """
        # center crop
        offset_height = (576 - 224) // 2
        offset_width = offset_height
        img = tf.image.crop_to_bounding_box(img, offset_height, offset_width, 224, 224)
        # normalization to [0, 1]
        img = img.sg_float() / 255

        return img
--------------------------------------------------------------------------------
/test.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-

import sugartensor as tf
import numpy as np
from train import ModelGraph
import codecs

def main():
    g = ModelGraph()

    with tf.Session() as sess:
        tf.sg_init(sess)

        # restore parameters
        saver = tf.train.Saver()
        saver.restore(sess, tf.train.latest_checkpoint('asset/train/ckpt'))

        hits = 0
        num_imgs = 0

        with tf.sg_queue_context(sess):
            # loop until the end of the queue
            while True:
                try:
                    logits, y = sess.run([g.logits, g.y]) # (16, 28), (16,)
                    preds = np.squeeze(np.argmax(logits, -1)) # (16,)

                    hits += np.equal(preds, y).astype(np.int32).sum()
                    num_imgs += len(y)
                    print "%d/%d = %.02f" % (hits, num_imgs, float(hits) / num_imgs)
                except Exception:
                    # The input queue is exhausted after one epoch, which ends the loop.
                    break

        print "\nFinal result is\n%d/%d = %.02f" % (hits, num_imgs, float(hits) / num_imgs)

#         fout.write(u"▌file_name: {}\n".format(f))
#         fout.write(u"▌Expected: {}\n".format(label2cls[]))
#         fout.write(u"▌file_name: {}\n".format(f))
#         fout.write(u"▌Got: " + predicted + "\n\n")

if __name__ == '__main__':
    main()
    print "Done"
--------------------------------------------------------------------------------
/train.py:
--------------------------------------------------------------------------------
#!/usr/bin/python2
# -*- coding: utf-8 -*-
'''
Training.
'''
from prepro import Data
import sugartensor as tf
import numpy as np
import random

class ModelGraph():
    '''Builds a model graph'''
    def __init__(self):
        '''
        Builds the input queues, the VGG-19 convolutional stack,
        and a 28-way classification head on top of it.
        '''
        train_data = Data() # (16, 224, 224, 1), (16,)

        self.x = train_data.x
        self.y = train_data.y
        self.idx2label = train_data.idx2label
        self.label2idx = train_data.label2idx

        self.conv5 = self.x.sg_vgg_19(conv_only=True) # (batch_size, 7, 7, 512)

        # A 1x1 convolution followed by spatial averaging replaces the dense layers.
        self.logits = (self.conv5.sg_conv(size=1, stride=1, dim=28, act="linear", bn=False)
                                 .sg_mean(dims=[1, 2], keep_dims=False)) # (16, 28)

        self.ce = self.logits.sg_ce(target=self.y, mask=False) # (16,)

        # training accuracy
        self.acc = (self.logits.sg_softmax()
                               .sg_accuracy(target=self.y, name='training_acc'))


def train():
    g = ModelGraph(); print "Graph loaded!"

    tf.sg_train(lr=0.00001, lr_reset=True, log_interval=10, loss=g.ce, eval_metric=[g.acc], max_ep=5,
                save_dir='asset/train', early_stop=False, max_keep=5)

if __name__ == '__main__':
    train(); print "Done"
--------------------------------------------------------------------------------