├── README.md
├── __init__.py
├── get_features.py
├── test_data
    ├── puzzle.jpeg
    └── tiger.jpeg
├── test_vgg16.py
├── test_vgg19.py
├── utils.py
├── vgg16.py
└── vgg19.py


/README.md:
--------------------------------------------------------------------------------
# TensorFlow VGG16 and VGG19
Reference: https://github.com/machrisaa/tensorflow-vgg

This is a TensorFlow implementation of VGG 16 and VGG 19 based on [tensorflow-vgg16](https://github.com/ry/tensorflow-vgg16) and [Caffe to Tensorflow](https://github.com/ethereon/caffe-tensorflow). The original Caffe implementations can be found [here](https://gist.github.com/ksimonyan/211839e770f7b538e2d8) and [here](https://gist.github.com/ksimonyan/3785162f95cd2d5fee77).

We have modified the implementation of tensorflow-vgg16 to load the weights with numpy instead of the default TensorFlow model loading, in order to speed up initialisation and reduce overall memory usage. This implementation also makes it possible to modify the network further, e.g. to remove the FC layers or to increase the batch size.

>To use the VGG networks, the npy file for [VGG16 NPY](https://mega.nz/#!YU1FWJrA!O1ywiCS2IiOlUCtCpI6HTJOMrneN-Qdv3ywQP5poecM) or [VGG19 NPY](https://mega.nz/#!xZ8glS6J!MAnE91ND_WyfZ_8mvkuSa2YcA7q-1ehfSm-Q1fxOvvs) has to be downloaded.

## Usage
Use this to build the VGG object
```
vgg = vgg19.Vgg19()
vgg.build(images)
```
or
```
vgg = vgg16.Vgg16()
vgg.build(images)
```
`images` is a tensor with shape `[None, 224, 224, 3]` holding RGB values scaled to `[0, 1]`.
>Trick: the tensor can be a placeholder, a variable or even a constant.

All the VGG layers (tensors) can then be accessed through the vgg object, for example `vgg.conv1_1`, `vgg.conv1_2`, `vgg.pool5`, `vgg.prob`, ...

`test_vgg16.py` and `test_vgg19.py` contain sample usage.
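
As a rough illustration of why `images` should hold RGB values in `[0, 1]`, the sketch below mirrors in plain NumPy the preprocessing that `build` applies before the first convolution in `vgg16.py` and `vgg19.py`: scale to `[0, 255]`, reorder RGB to BGR, and subtract `VGG_MEAN`. The helper name `preprocess_like_build` is hypothetical and not part of this library; the real code does the same steps with TensorFlow ops.

```python
import numpy as np

# Per-channel means in BGR order, copied from vgg16.py / vgg19.py.
VGG_MEAN = [103.939, 116.779, 123.68]


def preprocess_like_build(rgb):
    """NumPy sketch of the input preprocessing done inside build().

    rgb: array of shape [batch, 224, 224, 3] with RGB values in [0, 1].
    Returns a BGR array of the same shape with VGG_MEAN subtracted.
    (Hypothetical helper for illustration only; not part of the library.)
    """
    rgb_scaled = rgb * 255.0
    # Split the channel axis into three single-channel arrays.
    red, green, blue = np.split(rgb_scaled, 3, axis=3)
    # Reassemble in BGR order and subtract the matching channel mean.
    return np.concatenate(
        [blue - VGG_MEAN[0], green - VGG_MEAN[1], red - VGG_MEAN[2]], axis=3
    )


batch = np.ones((2, 224, 224, 3), dtype=np.float32)  # an all-white batch
bgr = preprocess_like_build(batch)
print(bgr.shape)  # (2, 224, 224, 3)
```

Feeding images that are already in `[0, 255]` would therefore be scaled a second time, which is why `utils.load_image` divides by 255 before returning.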

## Extra
This library has been used in another TensorFlow image style synthesis project of mine: [stylenet](https://github.com/machrisaa/stylenet)


## Update
Added a trainable version of VGG19, `vgg19_trainable`. It supports training from existing variables or from scratch. (But the trainer is not included.)

A very simple test, `test_vgg19_trainable`, is added. It demonstrates how to train, how to switch off train mode for verification, and how to save.

A separate file is added (instead of changing the existing one) because I want to keep the simplicity of the original VGG networks.

--------------------------------------------------------------------------------
/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/2281123066/vgg-tensorflow/5a7ec53248a5059f7695d75b1863b1bae105820e/__init__.py

--------------------------------------------------------------------------------
/get_features.py:
--------------------------------------------------------------------------------
import scipy.io as sio
import vgg16
import tensorflow as tf
import utils
import cv2
import numpy as np
import skimage
from skimage import io
from skimage import transform
import pandas as pd
from scipy.linalg import norm
import os

batch_size = 32
# Hard-coded local paths to the Flickr30k annotations, images and output files.
annotation_path = 'D:/dataset_code/数据集/flickr+mscoco/flickr30k/results_20130124.token'
flickr_image_path = 'D:/dataset_code/数据集/flickr+mscoco/flickr30k/flickr30k-images/'
feat_path = 'D:/dataset_code/数据集/flickr+mscoco/flickr30k/data/vgg16_feats.npy'
annotation_result_path = 'D:/dataset_code/数据集/flickr+mscoco/flickr30k/data/annotations.pickle'
annotations = pd.read_table(annotation_path, sep='\t', header=None, names=['image', 'caption'])
annotations['image_num'] = annotations['image'].map(lambda x: x.split('#')[1])
annotations['image'] = \
    annotations['image'].map(lambda x: os.path.join(flickr_image_path, x.split('#')[0]))

# Collect every distinct image in the folder
unique_images = annotations['image'].unique()
# print(len(unique_images))  # 31783
image_df = pd.DataFrame({'image': unique_images, 'image_id': range(len(unique_images))})
# Each image corresponds to 5 captions
annotations = pd.merge(annotations, image_df)
annotations.to_pickle(annotation_result_path)

def get_feats():
    vgg16_feats = np.zeros((len(unique_images), 4096))
    with tf.Session() as sess:
        images = tf.placeholder(dtype=tf.float32, shape=[None, 224, 224, 3])
        vgg = vgg16.Vgg16()
        vgg.build(images)
        for i in range(len(unique_images)):
            img_list = utils.load_image(unique_images[i])
            batch = img_list.reshape((1, 224, 224, 3))
            feature = sess.run(vgg.fc7, feed_dict={images: batch})  # extract the fc7-layer features
            feature = np.reshape(feature, [4096])
            feature /= norm(feature)  # L2-normalise the feature vector
            vgg16_feats[i, :] = feature  # one row per image
    # np.save returns None, so save first and return the array itself.
    np.save(feat_path, vgg16_feats)
    return vgg16_feats


if __name__ == '__main__':
    get_feats()

--------------------------------------------------------------------------------
/test_data/puzzle.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/2281123066/vgg-tensorflow/5a7ec53248a5059f7695d75b1863b1bae105820e/test_data/puzzle.jpeg

--------------------------------------------------------------------------------
/test_data/tiger.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/2281123066/vgg-tensorflow/5a7ec53248a5059f7695d75b1863b1bae105820e/test_data/tiger.jpeg

--------------------------------------------------------------------------------
/test_vgg16.py:
--------------------------------------------------------------------------------
import numpy as np
import tensorflow as tf

import vgg16
import utils

img1 = utils.load_image("./test_data/tiger.jpeg")
img2 = utils.load_image("./test_data/puzzle.jpeg")

batch1 = img1.reshape((1, 224, 224, 3))
batch2 = img2.reshape((1, 224, 224, 3))

batch = np.concatenate((batch1, batch2), 0)

# with tf.Session(\
#         config=tf.ConfigProto(gpu_options=(tf.GPUOptions(per_process_gpu_memory_fraction=0.7)))) as sess:
with tf.device('/cpu:0'):
    with tf.Session() as sess:
        images = tf.placeholder("float", [2, 224, 224, 3])
        feed_dict = {images: batch}

        vgg = vgg16.Vgg16()
        with tf.name_scope("content_vgg"):
            vgg.build(images)
        # To extract features from a different layer, change it here,
        # e.g. for fc6 replace vgg.fc7 with vgg.fc6.
        feature = sess.run(vgg.fc7, feed_dict=feed_dict)

--------------------------------------------------------------------------------
/test_vgg19.py:
--------------------------------------------------------------------------------
import numpy as np
import tensorflow as tf

import vgg19
import utils

img1 = utils.load_image("./test_data/tiger.jpeg")
img2 = utils.load_image("./test_data/puzzle.jpeg")

batch1 = img1.reshape((1, 224, 224, 3))
batch2 = img2.reshape((1, 224, 224, 3))

batch = np.concatenate((batch1, batch2), 0)

with tf.Session(
        config=tf.ConfigProto(gpu_options=(tf.GPUOptions(per_process_gpu_memory_fraction=0.7)))) as sess:
    images = tf.placeholder("float", [2, 224, 224, 3])
    feed_dict = {images: batch}

    vgg = vgg19.Vgg19()
    with tf.name_scope("content_vgg"):
        vgg.build(images)
    feature = sess.run(vgg.fc7, feed_dict=feed_dict)

    # prob = sess.run(vgg.prob, feed_dict=feed_dict)
    # print(prob)
    # utils.print_prob(prob[0], './synset.txt')
    # utils.print_prob(prob[1], './synset.txt')

--------------------------------------------------------------------------------
/utils.py:
--------------------------------------------------------------------------------
import skimage
import skimage.io
import skimage.transform
import numpy as np


# synset = [l.strip() for l in open('synset.txt').readlines()]


# returns image of shape [224, 224, 3]
# [height, width, depth]
def load_image(path):
    # load image and scale pixel values to [0, 1]
    img = skimage.io.imread(path)
    img = img / 255.0
    assert (0 <= img).all() and (img <= 1.0).all()
    # print "Original Image Shape: ", img.shape
    # we crop image from center
    short_edge = min(img.shape[:2])
    yy = int((img.shape[0] - short_edge) / 2)
    xx = int((img.shape[1] - short_edge) / 2)
    crop_img = img[yy: yy + short_edge, xx: xx + short_edge]
    # resize to 224, 224
    resized_img = skimage.transform.resize(crop_img, (224, 224))
    return resized_img


# returns the top1 string
def print_prob(prob, file_path):
    # read the class labels, one per line
    with open(file_path) as f:
        synset = [l.strip() for l in f]

    # print prob
    pred = np.argsort(prob)[::-1]

    # Get top1 label
    top1 = synset[pred[0]]
    print("Top1: ", top1, prob[pred[0]])
    # Get top5 label
    top5 = [(synset[pred[i]], prob[pred[i]]) for i in range(5)]
    print("Top5: ", top5)
    return top1


def load_image2(path, height=None, width=None):
    # load image
    img = skimage.io.imread(path)
    img = img / 255.0
    if height is not None and width is not None:
        ny = height
        nx = width
    elif height is not None:
        # keep the aspect ratio; the output size must be an integer
        ny = height
        nx = int(img.shape[1] * ny / img.shape[0])
    elif width is not None:
        nx = width
        ny = int(img.shape[0] * nx / img.shape[1])
    else:
        ny = img.shape[0]
        nx = img.shape[1]
    return skimage.transform.resize(img, (ny, nx))


def test():
    img = skimage.io.imread("./test_data/starry_night.jpg")
    ny = 300
    nx = int(img.shape[1] * ny / img.shape[0])
    img = skimage.transform.resize(img,
        (ny, nx))
    skimage.io.imsave("./test_data/test/output.jpg", img)


if __name__ == "__main__":
    test()

--------------------------------------------------------------------------------
/vgg16.py:
--------------------------------------------------------------------------------
import inspect
import os

import numpy as np
import tensorflow as tf
import time

VGG_MEAN = [103.939, 116.779, 123.68]


class Vgg16:
    def __init__(self, vgg16_npy_path=None):
        if vgg16_npy_path is None:
            path = inspect.getfile(Vgg16)
            path = os.path.abspath(os.path.join(path, os.pardir))
            path = os.path.join(path, "D:/myproject/tensorflow-vgg-master/pre_model/vgg16.npy")
            vgg16_npy_path = path
            print(path)

        # The npy file stores a pickled dict; recent NumPy versions need allow_pickle=True to load it.
        self.data_dict = np.load(vgg16_npy_path, encoding='latin1', allow_pickle=True).item()
        print("npy file loaded")

    def build(self, rgb):
        """
        load variable from npy to build the VGG

        :param rgb: rgb image [batch, height, width, 3] values scaled [0, 1]
        """

        start_time = time.time()
        print("build model started")
        rgb_scaled = rgb * 255.0

        # Convert RGB to BGR
        red, green, blue = tf.split(value=rgb_scaled, num_or_size_splits=3, axis=3)
        assert red.get_shape().as_list()[1:] == [224, 224, 1]
        assert green.get_shape().as_list()[1:] == [224, 224, 1]
        assert blue.get_shape().as_list()[1:] == [224, 224, 1]
        bgr = tf.concat(axis=3, values=[
            blue - VGG_MEAN[0],
            green - VGG_MEAN[1],
            red - VGG_MEAN[2],
        ])
        assert bgr.get_shape().as_list()[1:] == [224, 224, 3]

        # block 1 -- outputs 112x112x64
        self.conv1_1 = self.conv_layer(bgr, "conv1_1")
        self.conv1_2 = self.conv_layer(self.conv1_1, "conv1_2")
        self.pool1 = self.max_pool(self.conv1_2, 'pool1')

        # block 2 -- outputs 56x56x128
        self.conv2_1 = self.conv_layer(self.pool1, "conv2_1")
        self.conv2_2 = self.conv_layer(self.conv2_1,
                                       "conv2_2")
        self.pool2 = self.max_pool(self.conv2_2, 'pool2')

        # block 3 -- outputs 28x28x256
        self.conv3_1 = self.conv_layer(self.pool2, "conv3_1")
        self.conv3_2 = self.conv_layer(self.conv3_1, "conv3_2")
        self.conv3_3 = self.conv_layer(self.conv3_2, "conv3_3")
        self.pool3 = self.max_pool(self.conv3_3, 'pool3')

        # block 4 -- outputs 14x14x512
        self.conv4_1 = self.conv_layer(self.pool3, "conv4_1")
        self.conv4_2 = self.conv_layer(self.conv4_1, "conv4_2")
        self.conv4_3 = self.conv_layer(self.conv4_2, "conv4_3")
        self.pool4 = self.max_pool(self.conv4_3, 'pool4')

        # block 5 -- outputs 7x7x512
        self.conv5_1 = self.conv_layer(self.pool4, "conv5_1")
        self.conv5_2 = self.conv_layer(self.conv5_1, "conv5_2")
        self.conv5_3 = self.conv_layer(self.conv5_2, "conv5_3")
        self.pool5 = self.max_pool(self.conv5_3, 'pool5')

        self.fc6 = self.fc_layer(self.pool5, "fc6")
        assert self.fc6.get_shape().as_list()[1:] == [4096]  # flatten
        self.relu6 = tf.nn.relu(self.fc6)

        self.fc7 = self.fc_layer(self.relu6, "fc7")
        self.relu7 = tf.nn.relu(self.fc7)

        self.fc8 = self.fc_layer(self.relu7, "fc8")

        self.prob = tf.nn.softmax(self.fc8, name="prob")

        self.data_dict = None
        print("build model finished: %ds" % (time.time() - start_time))

    def avg_pool(self, bottom, name):
        return tf.nn.avg_pool(bottom, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME', name=name)

    def max_pool(self, bottom, name):
        return tf.nn.max_pool(bottom, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME', name=name)

    def conv_layer(self, bottom, name):
        with tf.variable_scope(name):
            filt = self.get_conv_filter(name)

            conv = tf.nn.conv2d(bottom, filt, [1, 1, 1, 1], padding='SAME')

            conv_biases = self.get_bias(name)
            bias = tf.nn.bias_add(conv, conv_biases)

            relu = tf.nn.relu(bias)
            return relu

    def fc_layer(self, bottom, name):
        with tf.variable_scope(name):
            # flatten everything except the batch dimension
            shape = bottom.get_shape().as_list()
            dim = 1
            for d in shape[1:]:
                dim *= d
            x = tf.reshape(bottom, [-1, dim])

            weights = self.get_fc_weight(name)
            biases = self.get_bias(name)

            # Fully connected layer. tf.nn.bias_add broadcasts the biases
            # across the batch dimension.
            fc = tf.nn.bias_add(tf.matmul(x, weights), biases)

            return fc

    def get_conv_filter(self, name):
        return tf.constant(self.data_dict[name][0], name="filter")

    def get_bias(self, name):
        return tf.constant(self.data_dict[name][1], name="biases")

    def get_fc_weight(self, name):
        return tf.constant(self.data_dict[name][0], name="weights")

--------------------------------------------------------------------------------
/vgg19.py:
--------------------------------------------------------------------------------
import os
import tensorflow as tf

import numpy as np
import time
import inspect

VGG_MEAN = [103.939, 116.779, 123.68]


class Vgg19:
    def __init__(self, vgg19_npy_path=None):
        if vgg19_npy_path is None:
            path = inspect.getfile(Vgg19)
            path = os.path.abspath(os.path.join(path, os.pardir))
            path = os.path.join(path, "D:/myproject/tensorflow-vgg-master/pre_model/vgg19.npy")
            vgg19_npy_path = path
            print(vgg19_npy_path)

        # The npy file stores a pickled dict; recent NumPy versions need allow_pickle=True to load it.
        self.data_dict = np.load(vgg19_npy_path, encoding='latin1', allow_pickle=True).item()
        print("npy file loaded")

    def build(self, rgb):
        """
        load variable from npy to build the VGG

        :param rgb: rgb image [batch, height, width, 3] values scaled [0, 1]
        """

        start_time = time.time()
        print("build model started")
        rgb_scaled = rgb * 255.0

        # Convert RGB to BGR
        red, green, blue = tf.split(value=rgb_scaled, num_or_size_splits=3, axis=3)
        assert \
            red.get_shape().as_list()[1:] == [224, 224, 1]
        assert green.get_shape().as_list()[1:] == [224, 224, 1]
        assert blue.get_shape().as_list()[1:] == [224, 224, 1]
        bgr = tf.concat(axis=3, values=[
            blue - VGG_MEAN[0],
            green - VGG_MEAN[1],
            red - VGG_MEAN[2],
        ])
        assert bgr.get_shape().as_list()[1:] == [224, 224, 3]

        self.conv1_1 = self.conv_layer(bgr, "conv1_1")
        self.conv1_2 = self.conv_layer(self.conv1_1, "conv1_2")
        self.pool1 = self.max_pool(self.conv1_2, 'pool1')

        self.conv2_1 = self.conv_layer(self.pool1, "conv2_1")
        self.conv2_2 = self.conv_layer(self.conv2_1, "conv2_2")
        self.pool2 = self.max_pool(self.conv2_2, 'pool2')

        self.conv3_1 = self.conv_layer(self.pool2, "conv3_1")
        self.conv3_2 = self.conv_layer(self.conv3_1, "conv3_2")
        self.conv3_3 = self.conv_layer(self.conv3_2, "conv3_3")
        self.conv3_4 = self.conv_layer(self.conv3_3, "conv3_4")
        self.pool3 = self.max_pool(self.conv3_4, 'pool3')

        self.conv4_1 = self.conv_layer(self.pool3, "conv4_1")
        self.conv4_2 = self.conv_layer(self.conv4_1, "conv4_2")
        self.conv4_3 = self.conv_layer(self.conv4_2, "conv4_3")
        self.conv4_4 = self.conv_layer(self.conv4_3, "conv4_4")
        self.pool4 = self.max_pool(self.conv4_4, 'pool4')

        self.conv5_1 = self.conv_layer(self.pool4, "conv5_1")
        self.conv5_2 = self.conv_layer(self.conv5_1, "conv5_2")
        self.conv5_3 = self.conv_layer(self.conv5_2, "conv5_3")
        self.conv5_4 = self.conv_layer(self.conv5_3, "conv5_4")
        self.pool5 = self.max_pool(self.conv5_4, 'pool5')

        self.fc6 = self.fc_layer(self.pool5, "fc6")
        assert self.fc6.get_shape().as_list()[1:] == [4096]
        self.relu6 = tf.nn.relu(self.fc6)

        self.fc7 = self.fc_layer(self.relu6, "fc7")
        self.relu7 = tf.nn.relu(self.fc7)

        self.fc8 = self.fc_layer(self.relu7, "fc8")

        self.prob = tf.nn.softmax(self.fc8, name="prob")

        self.data_dict = None
        print("build model finished: %ds" % (time.time() - start_time))

    def avg_pool(self, bottom, name):
        return tf.nn.avg_pool(bottom, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME', name=name)

    def max_pool(self, bottom, name):
        return tf.nn.max_pool(bottom, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME', name=name)

    def conv_layer(self, bottom, name):
        with tf.variable_scope(name):
            filt = self.get_conv_filter(name)

            conv = tf.nn.conv2d(bottom, filt, [1, 1, 1, 1], padding='SAME')

            conv_biases = self.get_bias(name)
            bias = tf.nn.bias_add(conv, conv_biases)

            relu = tf.nn.relu(bias)
            return relu

    def fc_layer(self, bottom, name):
        with tf.variable_scope(name):
            # flatten everything except the batch dimension
            shape = bottom.get_shape().as_list()
            dim = 1
            for d in shape[1:]:
                dim *= d
            x = tf.reshape(bottom, [-1, dim])

            weights = self.get_fc_weight(name)
            biases = self.get_bias(name)

            # Fully connected layer. tf.nn.bias_add broadcasts the biases
            # across the batch dimension.
            fc = tf.nn.bias_add(tf.matmul(x, weights), biases)

            return fc

    def get_conv_filter(self, name):
        return tf.constant(self.data_dict[name][0], name="filter")

    def get_bias(self, name):
        return tf.constant(self.data_dict[name][1], name="biases")

    def get_fc_weight(self, name):
        return tf.constant(self.data_dict[name][0], name="weights")

--------------------------------------------------------------------------------