├── LICENSE
├── README.md
├── gen.py
├── image_collection
│   ├── collection_0.png
│   ├── collection_1.png
│   ├── collection_2.png
│   └── collection_3.png
├── prepro.py
├── test.py
└── train.py

/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2016 Kyubyong Park

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# An Implementation of 'Texture Synthesis Using Convolutional Neural Networks'
This project is designed to synthesize the 28 texture classes of the [Kylberg Texture Dataset](http://www.cb.uu.se/~gustaf/texture/), based on the ideas of Gatys et al.'s paper [Texture Synthesis Using Convolutional Neural Networks](https://arxiv.org/pdf/1505.07376v3.pdf).

## Requirements
* numpy >= 1.11.1
* sugartensor == 0.0.1.8 (`pip install sugartensor`)
* Kylberg Texture Dataset (can be freely downloaded [here](http://www.cb.uu.se/~gustaf/texture/data/without-rotations-zip/))

## Research Question
Can we use neural networks to generate images that are similar to real texture images?

## Main Idea
If two images are similar, their feature maps should be similar, and vice versa. Accordingly, we first train discriminative networks so that they can correctly classify the different classes of texture images. Then, keeping those networks fixed, we optimize a noise image so that its feature maps become similar to those of the target texture image.

## Dataset
Refer to the [Kylberg Texture Dataset](http://www.cb.uu.se/~gustaf/texture/).

## Model Architecture and Objective Function

Model: VGG-19, with the original final dense layers replaced by a convolutional layer.<br/>
Objective function: Sum of the L2 losses between the Gram matrices of the feature maps of the noise image and those of the target image.
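
For intuition, here is a minimal NumPy sketch of the Gram-matrix loss (the names are illustrative; the actual TensorFlow implementation in `gen.py` additionally standardizes the features before taking the product):

```python
import numpy as np

def gram_matrix(feats):
    """feats: feature maps of shape (H, W, C). Returns the (C, C) Gram matrix."""
    h, w, c = feats.shape
    flat = feats.reshape(h * w, c)   # one row per spatial position
    return np.dot(flat.T, flat)      # channel-by-channel correlations

def gram_loss(noise_feats, target_feats):
    """Mean squared difference between the two Gram matrices."""
    diff = gram_matrix(noise_feats) - gram_matrix(target_feats)
    return np.mean(diff ** 2)

# Toy check with random "feature maps" shaped like one VGG layer's output:
x = np.random.rand(14, 14, 512).astype(np.float32)
y = np.random.rand(14, 14, 512).astype(np.float32)
print(gram_loss(x, y))
```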

## Folder and file instructions
* `prepro.py`: Preprocessing. Download and unzip the dataset, then write its path to `Hyperparams.image_fpath`. This file builds the input queues of images.
* `train.py`: Training. Trains the discriminative model so that the networks can correctly classify images. By default, it writes log files to `asset/train/log` and model files to `asset/train/ckpt`.
* `test.py`: Testing. Reads the latest model file and prints the classification results.
* `model-001-89612`: Pretrained model parameters. Can be downloaded [here](https://drive.google.com/open?id=0B5M-ed49qMsDLU9SV3A3VmczV0E). (If you use this, copy it to `asset/train/ckpt`.)
* `gen.py`: Generating. Generates an image for the given target image and writes it to the `gen_images` folder. Pass the path of the target image as an argument. We generated 28 images, targeting the first image of each class, with this simple bash script:

```
#!/bin/bash
for entry in ../datasets/Kylberg\ Texture\ Dataset\ v.\ 1.0/without-rotations-zip/*/*-a-p001.png
do
    python gen.py "$entry"
done
```

## Results

Classification accuracy: 4285/4480 ≈ 0.96<br/>

Here are the generated images.

![collection_0](image_collection/collection_0.png?raw=true)
![collection_1](image_collection/collection_1.png?raw=true)
![collection_2](image_collection/collection_2.png?raw=true)
![collection_3](image_collection/collection_3.png?raw=true)
--------------------------------------------------------------------------------
/gen.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
'''
Texture image generation.
'''
import os
import sys
import time

import numpy as np
import sugartensor as tf
from scipy import misc

sample_image = sys.argv[1]

def transform_image(target_img):
    r"""
    Arg:
      target_img: full path of the image file

    Returns:
      A numpy array of shape (1, 224, 224, 1)
    """
    img = misc.imread(target_img)

    # Center-crop the 576x576 source image to 224x224.
    offset_height = (576 - 224) / 2
    offset_width = offset_height
    img = img[offset_height:offset_height + 224, offset_width:offset_width + 224]

    # Convert to 4-D: (1, 224, 224, 1)
    img = np.expand_dims(img, 0)
    img = np.expand_dims(img, -1)

    # Normalize to [0, 1]
    img = img.astype(np.float32) / 255

    return img

class ModelGraph:
    def __init__(self):
        with tf.sg_context(name='generator'):
            self.x = tf.sg_initializer.he_uniform(name="x", shape=[1, 224, 224, 1])  # noise image: the only variable we optimize
            self.y = tf.placeholder(dtype=tf.float32, shape=[1, 224, 224, 1])  # true target image

        with tf.sg_context(name='conv', act='relu'):
            self.x_conv1 = (self.x
                            .sg_conv(dim=64)
                            .sg_conv()
                            .sg_pool())  # (1, 112, 112, 64)
            self.x_conv2 = (self.x_conv1
                            .sg_conv(dim=128)
                            .sg_conv()
                            .sg_pool())  # (1, 56, 56, 128)
            self.x_conv3 = (self.x_conv2
                            .sg_conv(dim=256)
                            .sg_conv()
                            .sg_conv()
                            .sg_conv()
                            .sg_pool())  # (1, 28, 28, 256)
            self.x_conv4 = (self.x_conv3
                            .sg_conv(dim=512)
                            .sg_conv()
                            .sg_conv()
                            .sg_conv()
                            .sg_pool())  # (1, 14, 14, 512)
            # A fifth block is left disabled:
            # .sg_conv(dim=512)
            # .sg_conv()
            # .sg_conv()
            # .sg_conv()
            # .sg_pool())

        # Encode the target image with the same (reused) weights.
        self.y_conv1 = self.x_conv1.sg_reuse(input=self.y)
        self.y_conv2 = self.x_conv2.sg_reuse(input=self.y)
        self.y_conv3 = self.x_conv3.sg_reuse(input=self.y)
        self.y_conv4 = self.x_conv4.sg_reuse(input=self.y)

        def get_gram_mat(tensor):
            '''
            Arg:
              tensor: A 4-D tensor. The first dimension must be 1.

            Returns:
              The Gram matrix of the flattened feature maps, e.g. 512 by 512.
              See `https://en.wikipedia.org/wiki/Gramian_matrix` for details.
            '''
            assert tensor.get_shape().ndims == 4, "The tensor must be 4-dimensional."

            dim0, dim1, dim2, dim3 = tensor.get_shape().as_list()
            tensor = tensor.sg_reshape(shape=[dim0 * dim1 * dim2, dim3])  # e.g. (1*14*14, 512)

            # Standardize the features first; otherwise the raw entries of the
            # Gram matrix become very large.
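            # (This zero-mean / unit-variance standardization is this
            # implementation's choice for numerical stability; Gatys et al.
            # instead normalize each layer's Gram loss by the size of its
            # feature maps.)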
            mean, variance = tf.nn.moments(tensor, [0, 1])
            tensor = (tensor - mean) / tf.sqrt(variance + tf.sg_eps)

            tensor_t = tensor.sg_transpose(perm=[1, 0])  # e.g. (512, 1*14*14)
            gram_mat = tf.matmul(tensor_t, tensor)  # e.g. (512, 512)

            return gram_mat

        # Loss: the sum of the per-layer Gram-matrix losses.
        self.mse = tf.squared_difference(get_gram_mat(self.x_conv1), get_gram_mat(self.y_conv1)).sg_mean() + \
                   tf.squared_difference(get_gram_mat(self.x_conv2), get_gram_mat(self.y_conv2)).sg_mean() + \
                   tf.squared_difference(get_gram_mat(self.x_conv3), get_gram_mat(self.y_conv3)).sg_mean() + \
                   tf.squared_difference(get_gram_mat(self.x_conv4), get_gram_mat(self.y_conv4)).sg_mean()

        self.train_gen = tf.sg_optim(self.mse, lr=0.0001, category='generator')  # Note that we train only the variable x.

def generate(sample_image):
    start_time = time.time()

    g = ModelGraph()

    with tf.Session() as sess:
        # We need to initialize variables here because the variable `generator/x` will not be restored.
        tf.sg_init(sess)

        # Restore everything except the generator variables from the latest checkpoint.
        var_list = [v for v in tf.global_variables() if "generator" not in v.name]
        saver = tf.train.Saver(var_list)
        saver.restore(sess, tf.train.latest_checkpoint('asset/train/ckpt'))

        i = 0
        while True:
            mse, _ = sess.run([g.mse, g.train_gen], {g.y: transform_image(sample_image)})

            if time.time() - start_time > 60:  # Save every 60 seconds
                gen_image = sess.run(g.x)
                gen_image = np.squeeze(gen_image)
                misc.imsave('gen_images/%s/gen_%.2f.jpg' % (label, mse), gen_image)

                start_time = time.time()
                i += 1
                if i == 60: break  # Finish after 1 hour (60 saves)
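
# Note: generate() relies on the module-level `label` set in the __main__ block
# below, so this script is meant to be run directly (as in the README's bash
# loop), not imported.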
if __name__ == '__main__':
    label = sample_image.split("/")[-1].split("-")[0]
    if not os.path.exists("gen_images/" + label):
        os.makedirs("gen_images/" + label)

    # Save the cropped version of the target image for reference.
    misc.imsave("gen_images/{}/{}.jpg".format(label, label), np.squeeze(transform_image(sample_image)))

    generate(sample_image)
    print "Done"
--------------------------------------------------------------------------------
/image_collection/collection_0.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Kyubyong/texture_generation/5c03def3a72977bf81aa47af009c321092d95a09/image_collection/collection_0.png
--------------------------------------------------------------------------------
/image_collection/collection_1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Kyubyong/texture_generation/5c03def3a72977bf81aa47af009c321092d95a09/image_collection/collection_1.png
--------------------------------------------------------------------------------
/image_collection/collection_2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Kyubyong/texture_generation/5c03def3a72977bf81aa47af009c321092d95a09/image_collection/collection_2.png
--------------------------------------------------------------------------------
/image_collection/collection_3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Kyubyong/texture_generation/5c03def3a72977bf81aa47af009c321092d95a09/image_collection/collection_3.png
--------------------------------------------------------------------------------
/prepro.py:
--------------------------------------------------------------------------------
#!/usr/bin/python2
# coding: utf-8

import glob

import numpy as np
import sugartensor as tf

class Hyperparams:
    # Path glob of the unzipped dataset images.
    image_fpath = '../../datasets/Kylberg Texture Dataset v. 1.0/without-rotations-zip/*/*.png'

class Data:
    def __init__(self, batch_size=16):
        print "# Make classes"
        files = glob.glob(Hyperparams.image_fpath)
        labels = [f.split('/')[-1].split('-')[0] for f in files]  # e.g. ['scarf2', 'scarf1', ...]

        self.idx2label = {idx: label for idx, label in enumerate(set(labels))}
        self.label2idx = {label: idx for idx, label in self.idx2label.items()}

        labels = [self.label2idx[label] for label in labels]  # e.g. [3, 4, 6, ...]

        files = tf.convert_to_tensor(files)  # (4480,)
        labels = tf.convert_to_tensor(labels)  # (4480,)

        file_q, label_q = tf.train.slice_input_producer([files, labels], num_epochs=1)  # (), ()
        img_q = tf.image.decode_png(tf.read_file(file_q), channels=1)  # (576, 576, 1) uint8
        img_q = self.transform_image(img_q)  # (224, 224, 1) float32

        self.x, self.y = tf.train.shuffle_batch([img_q, label_q], batch_size,
                                                num_threads=32, capacity=batch_size*128,
                                                min_after_dequeue=batch_size*32,
                                                allow_smaller_final_batch=False)  # (16, 224, 224, 1) (16,)
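
        # Pipeline summary (counts assume the full 4480-image dataset):
        # file/label lists -> slice_input_producer -> decode_png (576, 576, 1)
        # -> center-crop and scale to (224, 224, 1) float32 -> shuffle_batch
        # -> self.x: (16, 224, 224, 1), self.y: (16,).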
47 | """ 48 | # center crop 49 | offset_height = (576-224)/2 50 | offset_width = offset_height 51 | img = tf.image.crop_to_bounding_box(img, offset_height, offset_width, 224, 224) 52 | # normalization 53 | img = img.sg_float() / 255 54 | 55 | return img 56 | 57 | -------------------------------------------------------------------------------- /test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | import sugartensor as tf 4 | import numpy as np 5 | from train import ModelGraph 6 | import codecs 7 | 8 | def main(): 9 | g = ModelGraph() 10 | 11 | with tf.Session() as sess: 12 | tf.sg_init(sess) 13 | 14 | # restore parameters 15 | saver = tf.train.Saver() 16 | saver.restore(sess, tf.train.latest_checkpoint('asset/train/ckpt')) 17 | 18 | hits = 0 19 | num_imgs = 0 20 | 21 | with tf.sg_queue_context(sess): 22 | # loop end-of-queue 23 | while True: 24 | try: 25 | logits, y = sess.run([g.logits, g.y]) # (16, 28) 26 | preds = np.squeeze(np.argmax(logits, -1)) # (16,) 27 | 28 | hits += np.equal(preds, y).astype(np.int32).sum() 29 | num_imgs += len(y) 30 | print "%d/%d = %.02f" % (hits, num_imgs, float(hits) / num_imgs) 31 | except: 32 | break 33 | 34 | print "\nFinal result is\n%d/%d = %.02f" % (hits, num_imgs, float(hits) / num_imgs) 35 | 36 | 37 | 38 | # fout.write(u"▌file_name: {}\n".format(f)) 39 | # fout.write(u"▌Expected: {}\n".format(label2cls[])) 40 | # fout.write(u"▌file_name: {}\n".format(f)) 41 | # fout.write(u"▌Got: " + predicted + "\n\n") 42 | 43 | if __name__ == '__main__': 44 | main() 45 | print "Done" 46 | 47 | -------------------------------------------------------------------------------- /train.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | #/usr/bin/python2 3 | ''' 4 | Training. 5 | ''' 6 | from prepro import Data 7 | import sugartensor as tf 8 | import numpy as np 9 | import random 10 | 11 | class ModelGraph(): 12 | '''Builds a model graph''' 13 | def __init__(self): 14 | ''' 15 | Args: 16 | is_train: Boolean. If True, backprop is executed. 17 | ''' 18 | train_data = Data() # (16, 224, 224, 1), (16,) 19 | 20 | self.x = train_data.x 21 | self.y = train_data.y 22 | self.idx2label = train_data.idx2label 23 | self.label2idx = train_data.label2idx 24 | 25 | self.conv5 = self.x.sg_vgg_19(conv_only=True) # (batch_size, 7, 7, 512) 26 | 27 | self.logits = (self.conv5.sg_conv(size=1, stride=1, dim=28, act="linear", bn=False) 28 | .sg_mean(dims=[1, 2], keep_dims=False) )# (16, 28) 29 | 30 | self.ce = self.logits.sg_ce(target=self.y, mask=False) # (16,) 31 | 32 | # training accuracy 33 | self.acc = (self.logits.sg_softmax() 34 | .sg_accuracy(target=self.y, name='training_acc')) 35 | 36 | 37 | def train(): 38 | g = ModelGraph(); print "Graph loaded!" 39 | 40 | tf.sg_train(lr=0.00001, lr_reset=True, log_interval=10, loss=g.ce, eval_metric=[g.acc], max_ep=5, 41 | save_dir='asset/train', early_stop=False, max_keep=5) 42 | 43 | if __name__ == '__main__': 44 | train(); print "Done" 45 | --------------------------------------------------------------------------------