├── images
│   ├── style.jpg
│   └── content.jpg
├── sample
│   ├── output_1.jpg
│   ├── output_2.jpg
│   ├── input_style_1.jpg
│   ├── input_style_2.jpg
│   ├── input_content_1.jpg
│   └── input_content_2.jpg
├── settings.py
├── README.md
├── train.py
└── models.py

/images/style.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AaronJny/nerual_style_change/HEAD/images/style.jpg
--------------------------------------------------------------------------------
/images/content.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AaronJny/nerual_style_change/HEAD/images/content.jpg
--------------------------------------------------------------------------------
/sample/output_1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AaronJny/nerual_style_change/HEAD/sample/output_1.jpg
--------------------------------------------------------------------------------
/sample/output_2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AaronJny/nerual_style_change/HEAD/sample/output_2.jpg
--------------------------------------------------------------------------------
/sample/input_style_1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AaronJny/nerual_style_change/HEAD/sample/input_style_1.jpg
--------------------------------------------------------------------------------
/sample/input_style_2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AaronJny/nerual_style_change/HEAD/sample/input_style_2.jpg
--------------------------------------------------------------------------------
/sample/input_content_1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AaronJny/nerual_style_change/HEAD/sample/input_content_1.jpg
--------------------------------------------------------------------------------
/sample/input_content_2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AaronJny/nerual_style_change/HEAD/sample/input_content_2.jpg
--------------------------------------------------------------------------------
/settings.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
# @Time    : 18-3-23 12:22 PM
# @Author  : AaronJny
# @Email   : Aaron__7@163.com


# Path to the content image
CONTENT_IMAGE = 'images/content.jpg'
# Path to the style image
STYLE_IMAGE = 'images/style.jpg'
# Path prefix for output images
OUTPUT_IMAGE = 'output/output'
# Path to the pre-trained VGG model
VGG_MODEL_PATH = 'imagenet-vgg-verydeep-19.mat'
# Image width
IMAGE_WIDTH = 450
# Image height
IMAGE_HEIGHT = 300
# VGG layer names (and their weights) used to compute the content loss
CONTENT_LOSS_LAYERS = [('conv4_2', 0.5), ('conv5_2', 0.5)]
# VGG layer names (and their weights) used to compute the style loss
STYLE_LOSS_LAYERS = [('conv1_1', 0.2), ('conv2_1', 0.2), ('conv3_1', 0.2), ('conv4_1', 0.2), ('conv5_1', 0.2)]
# Noise ratio
NOISE = 0.5
# Mean RGB value of the images
IMAGE_MEAN_VALUE = [128.0, 128.0, 128.0]
# Weight of the content loss
ALPHA = 1
# Weight of the style loss
BETA = 500
# Number of training steps
TRAIN_STEPS = 3000
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
This project was written with Python 2.7 + TensorFlow 1.4. The environment is quite dated, so it may no longer run as-is.

If needed, please use the version I rewrote with Python 3.7 + TensorFlow 2.0:

[DeepLearningExamples/tf2-neural-style-transfer](https://github.com/AaronJny/DeepLearningExamples/tree/master/tf2-neural-style-transfer)


-------------------------

# Image Style Transfer Using VGG19 Transfer Learning

This project performs image style transfer with a pre-trained VGG19 network. It is written in Python, using TensorFlow as the framework.

Given a style image A and a content image B, it generates an image C that has the style of A and the content of B.

Two examples are shown below; both use Van Gogh's The Starry Night as the style image:

![Style image](https://raw.githubusercontent.com/AaronJny/nerual_style_change/master/sample/input_style_1.jpg)

**Example 1:**

A landscape photo found online.

Content image:

![Content image 1](https://raw.githubusercontent.com/AaronJny/nerual_style_change/master/sample/input_content_1.jpg)

Generated image:

![Generated image 1](https://raw.githubusercontent.com/AaronJny/nerual_style_change/master/sample/output_1.jpg)


**Example 2:**

Awoo, awoo, a werewolf howls~

Content image:

![Content image 2](https://raw.githubusercontent.com/AaronJny/nerual_style_change/master/sample/input_content_2.jpg)

Generated image:

![Generated image 2](https://raw.githubusercontent.com/AaronJny/nerual_style_change/master/sample/output_2.jpg)


For more details, see the blog post: [https://blog.csdn.net/aaronjny/article/details/79681080](https://blog.csdn.net/aaronjny/article/details/79681080)

----------------------

## Quick Start

**1. Download the pre-trained VGG network and place it in the project root**

The model is over 500 MB, so it is not included in this GitHub repository; please download it yourself if needed.

Download link: [http://www.vlfeat.org/matconvnet/models/beta16/imagenet-vgg-verydeep-19.mat](http://www.vlfeat.org/matconvnet/models/beta16/imagenet-vgg-verydeep-19.mat)

**2. Choose a style image and a content image and place them in the `images` folder in the project root**

The `images` folder in the project root contains two images, `content.jpg` and `style.jpg`, which are the default content image and style image.

If you just want to test the model with the default images, nothing needs to be done here.

To test with your own images, replace the content and/or style image in that directory with your own, keeping the default file names, or change the paths and names in `settings.py`.

**3. Generate the image**

Run `train.py` to start training. During training, the program periodically reports progress and saves intermediate images.

When training finishes, the final generated image is saved.

All generated images are saved in the `output` folder in the project root.

**4. More settings**

`settings.py` contains a number of configuration options that can be adjusted as needed.
--------------------------------------------------------------------------------
/train.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
# @Time    : 18-3-23 12:22 PM
# @Author  : AaronJny
# @Email   : Aaron__7@163.com
import tensorflow as tf
import settings
import models
import numpy as np
import scipy.misc


def loss(sess, model):
    """
    Define the model's loss function.
    :param sess: tf session
    :param model: the neural network model
    :return: the weighted sum of the content loss and the style loss
    """
    # First compute the content loss
    # Get the list of VGG layer names and weights that define the content loss
    content_layers = settings.CONTENT_LOSS_LAYERS
    # Feed the content image as input so its feature maps can be extracted from each layer
    sess.run(tf.assign(model.net['input'], model.content))
    # Accumulator for the content loss
    content_loss = 0.0
    # Iterate over the VGG layer names and weights used for the content loss
    for layer_name, weight in content_layers:
        # Feature map of the content image at layer_name
        p = sess.run(model.net[layer_name])
        # Feature map of the noise image at layer_name
        x = model.net[layer_name]
        # Height x width
        M = p.shape[1] * p.shape[2]
        # Number of channels
        N = p.shape[3]
        # Compute the loss per the formula and accumulate
        content_loss += (1.0 / (2 * M * N)) * tf.reduce_sum(tf.pow(p - x, 2)) * weight
    # Average the loss over the layers
    content_loss /= len(content_layers)

    # Then compute the style loss
    style_layers = settings.STYLE_LOSS_LAYERS
    # Feed the style image as input so its feature maps can be extracted from each layer
    sess.run(tf.assign(model.net['input'], model.style))
    # Accumulator for the style loss
    style_loss = 0.0
    # Iterate over the VGG layer names and weights used for the style loss
    for layer_name, weight in style_layers:
        # Feature map of the style image at layer_name
        a = sess.run(model.net[layer_name])
        # Feature map of the noise image at layer_name
        x = model.net[layer_name]
        # Height x width
        M = a.shape[1] * a.shape[2]
        # Number of channels
        N = a.shape[3]
        # Gram matrix of the style image's features
        A = gram(a, M, N)
        # Gram matrix of the noise image's features
        G = gram(x, M, N)
        # Compute the loss per the formula and accumulate
        style_loss += (1.0 / (4 * M * M * N * N)) * tf.reduce_sum(tf.pow(G - A, 2)) * weight
    # Average the loss over the layers
    style_loss /= len(style_layers)
    # The total loss is the weighted sum of the content loss and the style loss
    loss = settings.ALPHA * content_loss + settings.BETA * style_loss

    return loss


def gram(x, size, deep):
    """
    Build the Gram matrix of the given feature map, used to measure style.
    :param x: the feature map
    :param size: the product of the feature map's height and width
    :param deep: the number of channels
    :return: the Gram matrix
    """
    # Reshape to (size, deep)
    x = tf.reshape(x, (size, deep))
    # Compute x^T x
    g = tf.matmul(tf.transpose(x), x)
    return g


def train():
    # Build the model
    model = models.Model(settings.CONTENT_IMAGE, settings.STYLE_IMAGE)
    # Create the session
    with tf.Session() as sess:
        # Global initialization
        sess.run(tf.global_variables_initializer())
        # Define the loss
        cost = loss(sess, model)
        # Create the optimizer
        optimizer = tf.train.AdamOptimizer(1.0).minimize(cost)
        # Initialize again (mainly for ops defined after the first initialization; otherwise an error may be raised)
        sess.run(tf.global_variables_initializer())
        # Train starting from the noise image
        sess.run(tf.assign(model.net['input'], model.random_img))
        # Iterate the specified number of steps
        for step in range(settings.TRAIN_STEPS):
            # One backpropagation step
            sess.run(optimizer)
            # Periodically report progress and save the current result
            if step % 50 == 0:
                print('step {} is done.'.format(step))
                # Read out the input tensor; this is the generated image
                img = sess.run(model.net['input'])
                # The mean was subtracted during training, so add it back here
                img += settings.IMAGE_MEAN_VALUE
                # This is a batch of batch_size=1, so img[0] is the image itself
                img = img[0]
                # Clip the pixel values to 0-255 and convert to integers
                img = np.clip(img, 0, 255).astype(np.uint8)
                # Save the image
                scipy.misc.imsave('{}-{}.jpg'.format(settings.OUTPUT_IMAGE, step), img)
        # Save the final result
        img = sess.run(model.net['input'])
        img += settings.IMAGE_MEAN_VALUE
        img = img[0]
        img = np.clip(img, 0, 255).astype(np.uint8)
        scipy.misc.imsave('{}.jpg'.format(settings.OUTPUT_IMAGE), img)


if __name__ == '__main__':
    train()
--------------------------------------------------------------------------------
/models.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
# @Time    : 18-3-23 12:20 PM
# @Author  : AaronJny
# @Email   : Aaron__7@163.com
import tensorflow as tf
import numpy as np
import settings
import scipy.io
import scipy.misc


class Model(object):
    def __init__(self, content_path, style_path):
        self.content = self.loadimg(content_path)  # Load the content image
        self.style = self.loadimg(style_path)  # Load the style image
        self.random_img = self.get_random_img()  # Generate the noisy content image
        self.net = self.vggnet()  # Build the VGG network

    def vggnet(self):
        # Load the pre-trained VGG model
        vgg = scipy.io.loadmat(settings.VGG_MODEL_PATH)
        vgg_layers = vgg['layers'][0]
        net = {}
        # Build the convolution and pooling layers of the VGG network from the pre-trained parameters
        # The fully connected layers are not needed
        # Note that everything except 'input' is a tf.constant
        # Unlike usual training, the VGG parameters are not trained; they stay fixed
        # What gets trained is 'input', which is the image we ultimately generate
        net['input'] = tf.Variable(np.zeros([1, settings.IMAGE_HEIGHT, settings.IMAGE_WIDTH, 3]), dtype=tf.float32)
        # The parameter index for each layer can be checked against the VGG model diagram
        net['conv1_1'] = self.conv_relu(net['input'], self.get_wb(vgg_layers, 0))
        net['conv1_2'] = self.conv_relu(net['conv1_1'], self.get_wb(vgg_layers, 2))
        net['pool1'] = self.pool(net['conv1_2'])
        net['conv2_1'] = self.conv_relu(net['pool1'], self.get_wb(vgg_layers, 5))
        net['conv2_2'] = self.conv_relu(net['conv2_1'], self.get_wb(vgg_layers, 7))
        net['pool2'] = self.pool(net['conv2_2'])
        net['conv3_1'] = self.conv_relu(net['pool2'], self.get_wb(vgg_layers, 10))
        net['conv3_2'] = self.conv_relu(net['conv3_1'], self.get_wb(vgg_layers, 12))
        net['conv3_3'] = self.conv_relu(net['conv3_2'], self.get_wb(vgg_layers, 14))
        net['conv3_4'] = self.conv_relu(net['conv3_3'], self.get_wb(vgg_layers, 16))
        net['pool3'] = self.pool(net['conv3_4'])
        net['conv4_1'] = self.conv_relu(net['pool3'], self.get_wb(vgg_layers, 19))
        net['conv4_2'] = self.conv_relu(net['conv4_1'], self.get_wb(vgg_layers, 21))
        net['conv4_3'] = self.conv_relu(net['conv4_2'], self.get_wb(vgg_layers, 23))
        net['conv4_4'] = self.conv_relu(net['conv4_3'], self.get_wb(vgg_layers, 25))
        net['pool4'] = self.pool(net['conv4_4'])
        net['conv5_1'] = self.conv_relu(net['pool4'], self.get_wb(vgg_layers, 28))
        net['conv5_2'] = self.conv_relu(net['conv5_1'], self.get_wb(vgg_layers, 30))
        net['conv5_3'] = self.conv_relu(net['conv5_2'], self.get_wb(vgg_layers, 32))
        net['conv5_4'] = self.conv_relu(net['conv5_3'], self.get_wb(vgg_layers, 34))
        net['pool5'] = self.pool(net['conv5_4'])
        return net

    def conv_relu(self, input, wb):
        """
        Apply a convolution followed by a ReLU.
        :param input: the input layer
        :param wb: wb[0], wb[1] == w, b
        :return: the result after the ReLU
        """
        conv = tf.nn.conv2d(input, wb[0], strides=[1, 1, 1, 1], padding='SAME')
        relu = tf.nn.relu(conv + wb[1])
        return relu

    def pool(self, input):
        """
        Apply max pooling.
        :param input: the input layer
        :return: the pooled result
        """
        return tf.nn.max_pool(input, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

    def get_wb(self, layers, i):
        """
        Read parameters from the pre-trained VGG model.
        :param layers: the pre-trained VGG model
        :param i: the index of the VGG layer
        :return: that layer's w, b
        """
        w = tf.constant(layers[i][0][0][0][0][0])
        bias = layers[i][0][0][0][0][1]
        b = tf.constant(np.reshape(bias, (bias.size,)))
        return w, b

    def get_random_img(self):
        """
        Generate a random image by mixing noise with the content image.
        :return: the random image
        """
        noise_image = np.random.uniform(-20, 20, [1, settings.IMAGE_HEIGHT, settings.IMAGE_WIDTH, 3])
        random_img = noise_image * settings.NOISE + self.content * (1 - settings.NOISE)
        return random_img

    def loadimg(self, path):
        """
        Load an image and convert it to the required format.
        :param path: path to the image
        :return: the preprocessed image
        """
        # Read the image
        image = scipy.misc.imread(path)
        # Resize the image
        image = scipy.misc.imresize(image, [settings.IMAGE_HEIGHT, settings.IMAGE_WIDTH])
        # Reshape the array into a batch of batch_size=1
        image = np.reshape(image, (1, settings.IMAGE_HEIGHT, settings.IMAGE_WIDTH, 3))
        # Subtract the mean so the data is roughly centered around 0
        image = image - settings.IMAGE_MEAN_VALUE
        return image


if __name__ == '__main__':
    Model(settings.CONTENT_IMAGE, settings.STYLE_IMAGE)
--------------------------------------------------------------------------------
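As a standalone illustration of the Gram-matrix computation that `gram()` in `train.py` performs with TensorFlow ops, here is a minimal NumPy sketch. The shapes and values are hypothetical, chosen only for demonstration; this file is not part of the project.

```python
import numpy as np

def gram_np(features):
    """NumPy counterpart of train.py's gram(): flatten the spatial
    dimensions of a (1, H, W, C) feature map into x of shape (M, N),
    with M = H*W and N = C, and return x^T x."""
    _, h, w, c = features.shape
    x = features.reshape(h * w, c)  # (M, N): one row per spatial position
    return x.T.dot(x)               # (N, N): channel co-occurrence statistics

# Toy feature map: batch of 1, 2x2 spatial grid, 3 channels
f = np.arange(12, dtype=np.float64).reshape(1, 2, 2, 3)
g = gram_np(f)
# The Gram matrix is square in the channel dimension and symmetric
assert g.shape == (3, 3)
assert np.allclose(g, g.T)
```

Because the Gram matrix sums over all spatial positions, it discards where features occur and keeps only which channel pairs fire together, which is why comparing Gram matrices measures style rather than content.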