├── LICENSE
├── README.md
├── main.py
├── session_params
│   └── README.md
├── ssd300.py
├── ssd300_resnet.py
└── train_datasets
    ├── README.md
    ├── voc2007
    │   └── README.md
    └── voc2012
        └── README.md

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------

                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.

      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.

      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.

      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.

      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.

      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).

      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.

      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."

      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.

   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.

   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.

   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:

      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and

      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and

      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and

      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.

      You may add Your own copyright statement to Your modifications and
      may provide additional or different license terms and conditions
      for use, reproduction, or distribution of Your modifications, or
      for any such Derivative Works as a whole, provided Your use,
      reproduction, and distribution of the Work otherwise complies with
      the conditions stated in this License.

   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.

   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.

   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE. You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.

   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.

   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.

   END OF TERMS AND CONDITIONS

   APPENDIX: How to apply the Apache License to your work.
      To apply the Apache License to your work, attach the following
      boilerplate notice, with the fields enclosed by brackets "[]"
      replaced with your own identifying information. (Don't include
      the brackets!) The text should be enclosed in the appropriate
      comment syntax for the file format. We also recommend that a
      file or class name and description of purpose be included on the
      same "printed page" as the copyright notice for easier
      identification within third-party archives.

   Copyright [yyyy] [name of copyright owner]

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

# SSD_for_Tensorflow

A TensorFlow implementation of the Single Shot MultiBox Detector (SSD) object detection algorithm.<br>
The original paper is here.<br>
<br>
There are quite a few TensorFlow implementations of SSD available online, but most are coded in an unnecessarily complicated way.<br>
After reading several of them, I felt a strong urge to write one that is as simple and direct as possible: partly to understand SSD's internals better, and partly to give beginners a simple reference implementation to start from.<br>
<br>
Code layout:<br>
ssd300.py - the core SSD implementation, for 300 * 300 input images.<br>
main.py - usage examples for ssd300, covering both the training and the detection calls. Training is described against the VOC2012 dataset (note that the current main.py reads from train_datasets/voc2007); the data can be downloaded here and unpacked into the \train_datasets\voc2012 directory.<br>
<br>
For the sake of simplicity, these are the only 2 files.<br>
<br>
Differences from the original paper (see the sketch after this list):<br>
1. The paper describes a box location as [center_X, center_Y, width, height]; for better compatibility and readability, this implementation uses [top_X, top_Y, width, height] throughout.<br>
<br>
2. The paper's default box sizing width=scale*sqrt(aspect_ratio), height=scale/sqrt(aspect_ratio) is wrong; it is changed to width=sqrt(scale * aspect_ratio), height=sqrt(scale / aspect_ratio), so that width*height equals scale (scale acting as an area ratio) while width/height equals aspect_ratio. Interested readers can verify this by multiplying and dividing the two expressions.<br>
<br>
3. As described in the paper, the extra box for aspect ratio = 1 uses scale=sqrt(scale0 * scale1), i.e. sqrt(1.0 * 2.0)=1.414, which is too close to scale4=1.5 to keep the default boxes distinguishable; it is therefore changed to the midpoint of scale0 and scale4: (scale0+scale4)/2=(1.0+1.5)/2=1.25.<br>
<br>
4. The paper generates default_box_scale from the formula s_k=s_min+(s_max-s_min) * (k-1)/(m-1); the source uses np.linspace to generate the same arithmetic progression, with identical results.<br>
<br>
5. The box scale range is changed from [ 0.2 , 0.9 ] to [ 0.1 , 0.9 ], because a minimum box area of 0.2 makes small objects hard to detect.<br>
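To make differences 2 and 4 concrete, here is a minimal sketch (not part of the repository) that reproduces the default-box computations, using the constants actually found in ssd300.py (min_box_scale = 0.05, max_box_scale = 0.9, 6 scales):

```python
import numpy as np

min_box_scale = 0.05   # values from ssd300.py; the 0.1 mentioned above differs slightly
max_box_scale = 0.9
num_scales = 6         # np.amax(default_box_size), one scale per feature layer

# Paper formula: s_k = s_min + (s_max - s_min) * (k - 1) / (m - 1), k = 1..m
paper_scales = [min_box_scale + (max_box_scale - min_box_scale) * k / (num_scales - 1)
                for k in range(num_scales)]
# Equivalent arithmetic progression via np.linspace (difference 4)
linspace_scales = np.linspace(min_box_scale, max_box_scale, num=num_scales)
assert np.allclose(paper_scales, linspace_scales)

# Modified box shape (difference 2): width*height == scale, width/height == ratio
scale = linspace_scales[0]
for ratio in [1.0, 1.25, 2.0, 3.0]:
    width = np.sqrt(scale * ratio)
    height = np.sqrt(scale / ratio)
    print(ratio, width * height, width / height)  # area stays equal to scale
```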

Minimal usage examples (the return values below match the signatures of run() in ssd300.py)<br>
1. Detection<br>
    ssd_model = ssd300.SSD300(tf_sess=sess, isTraining=False)<br>
    pred_class, pred_class_val, pred_location = ssd_model.run(input_img, None)<br>
2. Training<br>
    ssd_model = ssd300.SSD300(tf_sess=sess, isTraining=True)<br>
    loss_all, loss_class, loss_location, pred_class, pred_location = ssd_model.run(train_data, actual_data)<br>
<br>
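A slightly fuller detection sketch, distilled from testing() in main.py; it assumes a TensorFlow 1.x environment, a checkpoint already saved under ./session_params/, and input_img as a list of 300x300 images preprocessed the same way as in main.py:

```python
import tensorflow as tf
import ssd300

# input_img: list/array of 300x300x3 images, mean-subtracted as in main.py (assumption)
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.9)
with tf.Session(config=tf.ConfigProto(gpu_options=gpu_options)) as sess:
    ssd_model = ssd300.SSD300(sess, False)  # isTraining=False selects the detection branch
    sess.run(tf.global_variables_initializer())
    saver = tf.train.Saver(var_list=tf.trainable_variables())
    saver.restore(sess, './session_params/session.ckpt')
    # the detection branch returns three parallel per-image lists
    pred_class, pred_class_val, pred_location = ssd_model.run(input_img, None)
```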
[The overall framework is complete and can be used for reference and study. Training has not been finished yet, so problems may remain; if you find one, please tell me: jasonli8848@qq.com]<br>
<br>
[Notes]<br>
1. [Experiments show that top_x, top_y is not well suited to convolutional regression and lowers accuracy; it should be changed to center_x, center_y (see the conversion sketch below)];<br>
2. [The VGG base network in this source is not ideal; it would be better to switch to ResNet + Inception v2];<br>
3. [The default boxes should be configured for the concrete application, to avoid wasting resources and hurting accuracy];<br>
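The encoding change in note 1 is a two-line transform either way. A minimal sketch (hypothetical helper names, not part of the repository):

```python
def topleft_to_center(box):
    # [top_x, top_y, width, height] -> [center_x, center_y, width, height]
    top_x, top_y, w, h = box
    return [top_x + w / 2.0, top_y + h / 2.0, w, h]

def center_to_topleft(box):
    # [center_x, center_y, width, height] -> [top_x, top_y, width, height]
    center_x, center_y, w, h = box
    return [center_x - w / 2.0, center_y - h / 2.0, w, h]
```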
--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------

"""
date: 2017/11/10
author: lslcode [jasonli8848@qq.com]
"""

import os
import gc
import xml.etree.ElementTree as etxml
import random
import skimage.io
import skimage.transform
import numpy as np
import tensorflow as tf
import ssd300
import time

'''
SSD detection
'''
def testing():
    gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.9)
    with tf.Session(config=tf.ConfigProto(gpu_options=gpu_options)) as sess:
        ssd_model = ssd300.SSD300(sess, False)
        sess.run(tf.global_variables_initializer())
        saver = tf.train.Saver(var_list=tf.trainable_variables())
        if os.path.exists('./session_params/session.ckpt.index'):
            saver.restore(sess, './session_params/session.ckpt')
            image, actual, file_list = get_traindata_voc2007(1)
            pred_class, pred_class_val, pred_location = ssd_model.run(image, None)
            print('file_list:' + str(file_list))

            for index, act in zip(range(len(image)), actual):
                for a in act:
                    print('[img-' + str(index) + ' actual]:' + str(a))
                print('pred_class:' + str(pred_class[index]))
                print('pred_class_val:' + str(pred_class_val[index]))
                print('pred_location:' + str(pred_location[index]))

        else:
            print('No Data Exists!')
        sess.close()

'''
SSD training
'''
def training():
    batch_size = 15
    running_count = 0

    gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.9)
    with tf.Session(config=tf.ConfigProto(gpu_options=gpu_options)) as sess:
        ssd_model = ssd300.SSD300(sess, True)
        sess.run(tf.global_variables_initializer())
        saver = tf.train.Saver(var_list=tf.trainable_variables())
        if os.path.exists('./session_params/session.ckpt.index'):
            print('\nStart Restore')
            saver.restore(sess, './session_params/session.ckpt')
            print('\nEnd Restore')

        print('\nStart Training')
        min_loss_location = 100000.
        min_loss_class = 100000.
        while (min_loss_location + min_loss_class) > 0.001 and running_count < 100000:
            running_count += 1

            train_data, actual_data, _ = get_traindata_voc2007(batch_size)
            if len(train_data) > 0:
                loss_all, loss_class, loss_location, pred_class, pred_location = ssd_model.run(train_data, actual_data)
                l = np.sum(loss_location)
                c = np.sum(loss_class)
                if min_loss_location > l:
                    min_loss_location = l
                if min_loss_class > c:
                    min_loss_class = c

                print('Running:[' + str(running_count) + '] | Loss All:[' + str(min_loss_location + min_loss_class) + '|' + str(loss_all) + '] | Location:[' + str(np.sum(loss_location)) + '] | Class:[' + str(np.sum(loss_class)) + '] | pred_class:[' + str(np.sum(pred_class)) + '|' + str(np.amax(pred_class)) + '|' + str(np.min(pred_class)) + '] | pred_location:[' + str(np.sum(pred_location)) + '|' + str(np.amax(pred_location)) + '|' + str(np.min(pred_location)) + ']')

                # save the checkpoint periodically
                if running_count % 100 == 0:
                    saver.save(sess, './session_params/session.ckpt')
                    print('session.ckpt has been saved.')
                    gc.collect()
            else:
                print('No Data Exists!')
                break

        saver.save(sess, './session_params/session.ckpt')
        sess.close()
        gc.collect()

    print('End Training')

'''
Fetch VOC2007 training data
train_data: batch of training images, format [None, width, height, 3]
actual_data: image annotations, format [None, [None, center_x, center_y, width, height, label]]
'''
file_name_list = os.listdir('./train_datasets/voc2007/JPEGImages/')
label_arr = ['background','aeroplane','bicycle','bird','boat','bottle','bus','car','cat','chair','cow','diningtable','dog','horse','motorbike','person','pottedplant','sheep','sofa','train','tvmonitor']
# per-channel mean for image whitening, format: [R, G, B]
whitened_RGB_mean = [123.68, 116.78, 103.94]
def get_traindata_voc2007(batch_size):
    def get_actual_data_from_xml(xml_path):
        actual_item = []
        try:
            annotation_node = etxml.parse(xml_path).getroot()
            img_width = float(annotation_node.find('size').find('width').text.strip())
            img_height = float(annotation_node.find('size').find('height').text.strip())
            object_node_list = annotation_node.findall('object')
            for obj_node in object_node_list:
                label = label_arr.index(obj_node.find('name').text.strip())
                bndbox = obj_node.find('bndbox')
                x_min = float(bndbox.find('xmin').text.strip())
                y_min = float(bndbox.find('ymin').text.strip())
                x_max = float(bndbox.find('xmax').text.strip())
                y_max = float(bndbox.find('ymax').text.strip())
                # box coordinates are stored as ratios of the image size, format [center_x, center_y, width, height, label]
                actual_item.append([((x_min + x_max) / 2 / img_width), ((y_min + y_max) / 2 / img_height), ((x_max - x_min) / img_width), ((y_max - y_min) / img_height), label])
            return actual_item
        except:
            return None

    train_data = []
    actual_data = []

    file_list = random.sample(file_name_list, batch_size)

    for f_name in file_list:
        img_path = './train_datasets/voc2007/JPEGImages/' + f_name
        xml_path = './train_datasets/voc2007/Annotations/' + f_name.replace('.jpg', '.xml')
        if os.path.splitext(img_path)[1].lower() == '.jpg':
            actual_item = get_actual_data_from_xml(xml_path)
            if actual_item is not None:
                actual_data.append(actual_item)
            else:
                print('Error : ' + xml_path)
                continue
            img = skimage.io.imread(img_path)
            img = skimage.transform.resize(img, (300, 300))
            # image whitening: skimage returns floats in [0, 1], so rescale to [0, 255] before subtracting the per-channel mean
            img = img * 255.0 - whitened_RGB_mean
            train_data.append(img)

    return train_data, actual_data, file_list


'''
Main entry point
'''
if __name__ == '__main__':
    print('\nStart Running')
    # detection
    #testing()
    # training
    training()
    print('\nEnd Running')

--------------------------------------------------------------------------------
/session_params/README.md:
--------------------------------------------------------------------------------


--------------------------------------------------------------------------------
/ssd300.py:
--------------------------------------------------------------------------------

"""
date: 2017/11/10
author: lslcode [jasonli8848@qq.com]
"""

import numpy as np
import tensorflow as tf
from tensorflow.python.training.moving_averages import assign_moving_average

class SSD300:
    def __init__(self, tf_sess, isTraining):
        # tensorflow session
        self.sess = tf_sess
        # whether the network is being trained
        self.isTraining = isTraining
        # accepted input image size
        self.img_size = [300, 300]
        # total number of classes
        self.classes_size = 21
        # class value reserved for the background
        self.background_classes_val = 0
        # number of default boxes per feature map cell
        self.default_box_size = [4, 6, 6, 6, 4, 4]
        # default box aspect ratios (width/height)
        self.box_aspect_ratio = [
            [1.0, 1.25, 2.0, 3.0],
            [1.0, 1.25, 2.0, 3.0, 1.0 / 2.0, 1.0 / 3.0],
            [1.0, 1.25, 2.0, 3.0, 1.0 / 2.0, 1.0 / 3.0],
            [1.0, 1.25, 2.0, 3.0, 1.0 / 2.0, 1.0 / 3.0],
            [1.0, 1.25, 2.0, 3.0],
            [1.0, 1.25, 2.0, 3.0]
        ]
        # minimum default box area ratio
        self.min_box_scale = 0.05
        # maximum default box area ratio
        self.max_box_scale = 0.9
        # area ratio of each feature layer
        # np.linspace generates an arithmetic progression, equivalent to the paper's s_k = s_min + (s_max - s_min) * (k - 1) / (m - 1)
        self.default_box_scale = np.linspace(self.min_box_scale, self.max_box_scale, num = np.amax(self.default_box_size))
        print('## default_box_scale:' + str(self.default_box_scale))
        # convolution strides
        self.conv_strides_1 = [1, 1, 1, 1]
        self.conv_strides_2 = [1, 2, 2, 1]
        self.conv_strides_3 = [1, 3, 3, 1]
        # pooling window
        self.pool_size = [1, 2, 2, 1]
        # pooling strides
        self.pool_strides = [1, 2, 2, 1]
        # decay parameter of the Batch Normalization algorithm
        self.conv_bn_decay = 0.99999
        # variance_epsilon parameter of the Batch Normalization algorithm
        self.conv_bn_epsilon = 0.00001
        # Jaccard similarity matching threshold
        self.jaccard_value = 0.6

        # build the Tensorflow graph
        self.generate_graph()

    def generate_graph(self):
        # input data
        self.input = tf.placeholder(shape=[None, self.img_size[0], self.img_size[1], 3], dtype=tf.float32, name='input_image')

        # vgg16 convolution block 1
        self.conv_1_1 = self.convolution(self.input, [3, 3, 3, 32], self.conv_strides_1, 'conv_1_1')
        self.conv_1_2 = self.convolution(self.conv_1_1, [3, 3, 32, 32], self.conv_strides_1, 'conv_1_2')
        self.conv_1_2 = tf.nn.avg_pool(self.conv_1_2, self.pool_size, self.pool_strides, padding='SAME', name='pool_1_2')
        print('## conv_1_2 shape: ' + str(self.conv_1_2.get_shape().as_list()))
        # vgg16 convolution block 2
        self.conv_2_1 = self.convolution(self.conv_1_2, [3, 3, 32, 64], self.conv_strides_1, 'conv_2_1')
        self.conv_2_2 = self.convolution(self.conv_2_1, [3, 3, 64, 64], self.conv_strides_1, 'conv_2_2')
        #self.conv_2_2 = tf.nn.avg_pool(self.conv_2_2, self.pool_size, self.pool_strides, padding='SAME', name='pool_2_2')
        print('## conv_2_2 shape: ' + str(self.conv_2_2.get_shape().as_list()))
        # vgg16 convolution block 3
        self.conv_3_1 = self.convolution(self.conv_2_2, [3, 3, 64, 128], self.conv_strides_1, 'conv_3_1')
        self.conv_3_2 = self.convolution(self.conv_3_1, [3, 3, 128, 128], self.conv_strides_1, 'conv_3_2')
        self.conv_3_3 = self.convolution(self.conv_3_2, [3, 3, 128, 128], self.conv_strides_1, 'conv_3_3')
        self.conv_3_3 = tf.nn.avg_pool(self.conv_3_3, self.pool_size, self.pool_strides, padding='SAME', name='pool_3_3')
        print('## conv_3_3 shape: ' + str(self.conv_3_3.get_shape().as_list()))
        # vgg16 convolution block 4
        self.conv_4_1 = self.convolution(self.conv_3_3, [3, 3, 128, 256], self.conv_strides_1, 'conv_4_1')
        self.conv_4_2 = self.convolution(self.conv_4_1, [3, 3, 256, 256], self.conv_strides_1, 'conv_4_2')
        self.conv_4_3 = self.convolution(self.conv_4_2, [3, 3, 256, 256], self.conv_strides_1, 'conv_4_3')
        self.conv_4_3 = tf.nn.avg_pool(self.conv_4_3, self.pool_size, self.pool_strides, padding='SAME', name='pool_4_3')
        print('## conv_4_3 shape: ' + str(self.conv_4_3.get_shape().as_list()))
        # vgg16 convolution block 5
        self.conv_5_1 = self.convolution(self.conv_4_3, [3, 3, 256, 256], self.conv_strides_1, 'conv_5_1')
        self.conv_5_2 = self.convolution(self.conv_5_1, [3, 3, 256, 256], self.conv_strides_1, 'conv_5_2')
        self.conv_5_3 = self.convolution(self.conv_5_2, [3, 3, 256, 256], self.conv_strides_1, 'conv_5_3')
        self.conv_5_3 = tf.nn.avg_pool(self.conv_5_3, self.pool_size, self.pool_strides, padding='SAME', name='pool_5_3')
        print('## conv_5_3 shape: ' + str(self.conv_5_3.get_shape().as_list()))
        # ssd convolution block 6
        self.conv_6_1 = self.convolution(self.conv_5_3, [3, 3, 256, 512], self.conv_strides_1, 'conv_6_1')
        print('## conv_6_1 shape: ' + str(self.conv_6_1.get_shape().as_list()))
        # ssd convolution block 7
        self.conv_7_1 = self.convolution(self.conv_6_1, [1, 1, 512, 512], self.conv_strides_1, 'conv_7_1')
        print('## conv_7_1 shape: ' + str(self.conv_7_1.get_shape().as_list()))
        # ssd convolution block 8
        self.conv_8_1 = self.convolution(self.conv_7_1, [1, 1, 512, 128], self.conv_strides_1, 'conv_8_1')
        self.conv_8_2 = self.convolution(self.conv_8_1, [3, 3, 128, 256], self.conv_strides_2, 'conv_8_2')
        print('## conv_8_2 shape: ' + str(self.conv_8_2.get_shape().as_list()))
        # ssd convolution block 9
        self.conv_9_1 = self.convolution(self.conv_8_2, [1, 1, 256, 64], self.conv_strides_1, 'conv_9_1')
        self.conv_9_2 = self.convolution(self.conv_9_1, [3, 3, 64, 128], self.conv_strides_2, 'conv_9_2')
        print('## conv_9_2 shape: ' + str(self.conv_9_2.get_shape().as_list()))
        # ssd convolution block 10
        self.conv_10_1 = self.convolution(self.conv_9_2, [1, 1, 128, 64], self.conv_strides_1, 'conv_10_1')
        self.conv_10_2 = self.convolution(self.conv_10_1, [3, 3, 64, 128], self.conv_strides_2, 'conv_10_2')
        print('## conv_10_2 shape: ' + str(self.conv_10_2.get_shape().as_list()))
        # ssd layer 11
        self.conv_11 = tf.nn.avg_pool(self.conv_10_2, self.pool_size, self.pool_strides, "VALID")
        print('## conv_11 shape: ' + str(self.conv_11.get_shape().as_list()))

        # feature layer 1, taken from conv_4_3
        self.features_1 = self.convolution(self.conv_4_3, [3, 3, 256, self.default_box_size[0] * (self.classes_size + 4)], self.conv_strides_1, 'features_1')
        print('## features_1 shape: ' + str(self.features_1.get_shape().as_list()))
        # feature layer 2, taken from conv_7_1
        self.features_2 = self.convolution(self.conv_7_1, [3, 3, 512, self.default_box_size[1] * (self.classes_size + 4)], self.conv_strides_1, 'features_2')
        print('## features_2 shape: ' + str(self.features_2.get_shape().as_list()))
        # feature layer 3, taken from conv_8_2
        self.features_3 = self.convolution(self.conv_8_2, [3, 3, 256, self.default_box_size[2] * (self.classes_size + 4)], self.conv_strides_1, 'features_3')
        print('## features_3 shape: ' + str(self.features_3.get_shape().as_list()))
        # feature layer 4, taken from conv_9_2
        self.features_4 = self.convolution(self.conv_9_2, [3, 3, 128, self.default_box_size[3] * (self.classes_size + 4)], self.conv_strides_1, 'features_4')
        print('## features_4 shape: ' + str(self.features_4.get_shape().as_list()))
        # feature layer 5, taken from conv_10_2
        self.features_5 = self.convolution(self.conv_10_2, [3, 3, 128, self.default_box_size[4] * (self.classes_size + 4)], self.conv_strides_1, 'features_5')
        print('## features_5 shape: ' + str(self.features_5.get_shape().as_list()))
        # feature layer 6, taken from conv_11
        self.features_6 = self.convolution(self.conv_11, [1, 1, 128, self.default_box_size[5] * (self.classes_size + 4)], self.conv_strides_1, 'features_6')
        print('## features_6 shape: ' + str(self.features_6.get_shape().as_list()))

        # collection of feature layers
        self.feature_maps = [self.features_1, self.features_2, self.features_3, self.features_4, self.features_5, self.features_6]
        # record the shape of every feature layer, so the groundtruth can be generated in the same layout as the features
        self.feature_maps_shape = [m.get_shape().as_list() for m in self.feature_maps]

        # arrange the feature data
        self.tmp_all_feature = []
        for i, fmap in zip(range(len(self.feature_maps)), self.feature_maps):
            width = self.feature_maps_shape[i][1]
            height = self.feature_maps_shape[i][2]
            # the reshape prepares for the two regression heads, localization and classification
            # before reshape: shape=[None, width, height, default_box*(classes+4)]
            # after reshape:  shape=[None, width*height*default_box, (classes+4)]
            self.tmp_all_feature.append(tf.reshape(fmap, [-1, (width * height * self.default_box_size[i]), (self.classes_size + 4)]))
        # concatenate all features produced by one image
        self.tmp_all_feature = tf.concat(self.tmp_all_feature, axis=1)
        # split into the localization and classification parts
        self.feature_class = self.tmp_all_feature[:, :, :self.classes_size]
        self.feature_location = self.tmp_all_feature[:, :, self.classes_size:]

        print('## feature_class shape : ' + str(self.feature_class.get_shape().as_list()))
        print('## feature_location shape : ' + str(self.feature_location.get_shape().as_list()))
        # generate all default boxes
        self.all_default_boxs = self.generate_all_default_boxs()
        self.all_default_boxs_len = len(self.all_default_boxs)
        print('## all default boxs : ' + str(self.all_default_boxs_len))

        # ground-truth inputs
        self.groundtruth_class = tf.placeholder(shape=[None, self.all_default_boxs_len], dtype=tf.int32, name='groundtruth_class')
        self.groundtruth_location = tf.placeholder(shape=[None, self.all_default_boxs_len, 4], dtype=tf.float32, name='groundtruth_location')
        self.groundtruth_positives = tf.placeholder(shape=[None, self.all_default_boxs_len], dtype=tf.float32, name='groundtruth_positives')
        self.groundtruth_negatives = tf.placeholder(shape=[None, self.all_default_boxs_len], dtype=tf.float32, name='groundtruth_negatives')

        # loss function
        self.groundtruth_count = tf.add(self.groundtruth_positives, self.groundtruth_negatives)
        self.softmax_cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=self.feature_class, labels=self.groundtruth_class)
        self.loss_location = tf.div(tf.reduce_sum(tf.multiply(tf.reduce_sum(self.smooth_L1(tf.subtract(self.groundtruth_location, self.feature_location)), reduction_indices=2), self.groundtruth_positives), reduction_indices=1), tf.reduce_sum(self.groundtruth_positives, reduction_indices=1))
        self.loss_class = tf.div(tf.reduce_sum(tf.multiply(self.softmax_cross_entropy, self.groundtruth_count), reduction_indices=1), tf.reduce_sum(self.groundtruth_count, reduction_indices=1))
        self.loss_all = tf.reduce_sum(tf.add(self.loss_class, self.loss_location))

        # loss optimizer
        self.optimizer = tf.train.AdamOptimizer(0.001)
        #self.optimizer = tf.train.GradientDescentOptimizer(0.001)
        self.train = self.optimizer.minimize(self.loss_all)

    # image detection and training
    # input_images : input image data, format: [None, width, height, channel]
    # actual_data : annotation data, format: [None, [None, center_X, center_Y, width, height, classes]], where classes is in [0, classes_size)
    def run(self, input_images, actual_data):
        # training branch
        if self.isTraining:
            if actual_data is None:
                raise Exception('The actual_data parameter is missing!')
            if len(input_images) != len(actual_data):
                raise Exception('input_images and actual_data do not have matching lengths!')

            f_class, f_location = self.sess.run([self.feature_class, self.feature_location], feed_dict={self.input: input_images})

            with tf.control_dependencies([self.feature_class, self.feature_location]):
                # check the data for numeric errors
                f_class = self.check_numerics(f_class, 'prediction f_class')
                f_location = self.check_numerics(f_location, 'prediction f_location')

            gt_class, gt_location, gt_positives, gt_negatives = self.generate_groundtruth_data(actual_data, f_class)
            #print('gt_positives : [' + str(np.sum(gt_positives)) + '|' + str(np.amax(gt_positives)) + '|' + str(np.amin(gt_positives)) + '] | gt_negatives : [' + str(np.sum(gt_negatives)) + '|' + str(np.amax(gt_negatives)) + '|' + str(np.amin(gt_negatives)) + ']')
            self.sess.run(self.train, feed_dict={
                self.input: input_images,
                self.groundtruth_class: gt_class,
                self.groundtruth_location: gt_location,
                self.groundtruth_positives: gt_positives,
                self.groundtruth_negatives: gt_negatives
            })
            with tf.control_dependencies([self.train]):
                loss_all, loss_location, loss_class = self.sess.run([self.loss_all, self.loss_location, self.loss_class], feed_dict={
                    self.input: input_images,
                    self.groundtruth_class: gt_class,
                    self.groundtruth_location: gt_location,
                    self.groundtruth_positives: gt_positives,
                    self.groundtruth_negatives: gt_negatives
                })
                # check the data for numeric errors
                loss_all = self.check_numerics(loss_all, 'loss value loss_all')
                return loss_all, loss_class, loss_location, f_class, f_location

        # detection branch
        else:
            # normalize the predictions with softmax
            feature_class_softmax = tf.nn.softmax(logits=self.feature_class, dim=-1)
            # filter out the background predictions
            background_filter = np.ones(self.classes_size, dtype=np.float32)
            background_filter[self.background_classes_val] = 0
            background_filter = tf.constant(background_filter)
            feature_class_softmax = tf.multiply(feature_class_softmax, background_filter)
            # maximum predicted value of each box
            feature_class_softmax = tf.reduce_max(feature_class_softmax, 2)
            # filter out redundant predictions
            box_top_set = tf.nn.top_k(feature_class_softmax, int(self.all_default_boxs_len / 20))
            box_top_index = box_top_set.indices
            box_top_value = box_top_set.values
            f_class, f_location, f_class_softmax, box_top_index, box_top_value = self.sess.run(
                [self.feature_class, self.feature_location, feature_class_softmax, box_top_index, box_top_value],
                feed_dict={self.input: input_images}
            )
            top_shape = np.shape(box_top_index)
            pred_class = []
            pred_class_val = []
            pred_location = []
            for i in range(top_shape[0]):
                item_img_class = []
                item_img_class_val = []
                item_img_location = []
                for j in range(top_shape[1]):
                    p_class_val = f_class_softmax[i][box_top_index[i][j]]
                    if p_class_val < 0.5:
                        continue
                    p_class = np.argmax(f_class[i][box_top_index[i][j]])
                    if p_class == self.background_classes_val:
                        continue
                    p_location = f_location[i][box_top_index[i][j]]
                    if p_location[0] < 0 or p_location[1] < 0 or p_location[2] < 0 or p_location[3] < 0 or p_location[2] == 0 or p_location[3] == 0:
                        continue
                    is_box_filter = False
                    for f_index in range(len(item_img_class)):
                        if self.jaccard(p_location, item_img_location[f_index]) > 0.3 and p_class == item_img_class[f_index]:
                            is_box_filter = True
                            break
                    if not is_box_filter:
                        item_img_class.append(p_class)
                        item_img_class_val.append(p_class_val)
                        item_img_location.append(p_location)
                pred_class.append(item_img_class)
                pred_class_val.append(item_img_class_val)
                pred_location.append(item_img_location)
            return pred_class, pred_class_val, pred_location

    # convolution op
    def convolution(self, input, shape, strides, name):
        with tf.variable_scope(name):
            weight = tf.get_variable(initializer=tf.truncated_normal(shape, 0, 1), dtype=tf.float32, name=name + '_weight')
            bias = tf.get_variable(initializer=tf.truncated_normal(shape[-1:], 0, 1), dtype=tf.float32, name=name + '_bias')
            result = tf.nn.conv2d(input, weight, strides, padding='SAME', name=name + '_conv')
            result = tf.nn.bias_add(result, bias)
            result = self.batch_normalization(result, name=name + '_bn')
            result = tf.nn.relu(result, name=name + '_relu')
            return result

    # fully connected op
    def fc(self, input, out_shape, name):
        with tf.variable_scope(name + '_fc'):
            in_shape = 1
            for d in input.get_shape().as_list()[1:]:
                in_shape *= d
            weight = tf.get_variable(initializer=tf.truncated_normal([in_shape, out_shape], 0, 1), dtype=tf.float32, name=name + '_fc_weight')
            bias = tf.get_variable(initializer=tf.truncated_normal([out_shape], 0, 1), dtype=tf.float32, name=name + '_fc_bias')
            result = tf.reshape(input, [-1, in_shape])
            result = tf.nn.xw_plus_b(result, weight, bias, name=name + '_fc_do')
            return result

    # Batch Normalization
    def batch_normalization(self, input, name):
        with tf.variable_scope(name):
            bn_input_shape = input.get_shape()
            moving_mean = tf.get_variable(name + '_mean', bn_input_shape[-1:], initializer=tf.zeros_initializer, trainable=False)
            moving_variance = tf.get_variable(name + '_variance', bn_input_shape[-1:], initializer=tf.ones_initializer, trainable=False)
            def mean_var_with_update():
                mean, variance = tf.nn.moments(input, list(range(len(bn_input_shape) - 1)), name=name + '_moments')
                with tf.control_dependencies([assign_moving_average(moving_mean, mean, self.conv_bn_decay), assign_moving_average(moving_variance, variance, self.conv_bn_decay)]):
                    return tf.identity(mean), tf.identity(variance)
            # NOTE: tf.cast(True, ...) means the batch statistics (and moving-average updates) are always used;
            # the commented line below would switch on self.isTraining instead.
            #mean, variance = tf.cond(tf.cast(self.isTraining, tf.bool), mean_var_with_update, lambda: (moving_mean, moving_variance))
            mean, variance = tf.cond(tf.cast(True, tf.bool), mean_var_with_update, lambda: (moving_mean, moving_variance))
            beta = tf.get_variable(name + '_beta', bn_input_shape[-1:], initializer=tf.zeros_initializer)
            gamma = tf.get_variable(name + '_gamma', bn_input_shape[-1:], initializer=tf.ones_initializer)
            return tf.nn.batch_normalization(input, mean, variance, beta, gamma, self.conv_bn_epsilon, name + '_bn_opt')

    # smooth L1 function
    def smooth_L1(self, x):
        return tf.where(tf.less_equal(tf.abs(x), 1.0), tf.multiply(0.5, tf.pow(x, 2.0)), tf.subtract(tf.abs(x), 0.5))

    # initialize and arrange the training data
    def generate_all_default_boxs(self):
        # compute, as ratios, the coordinates and size of every default box an image produces
        # used later for jaccard matching
        all_default_boxes = []
        for index, map_shape in zip(range(len(self.feature_maps_shape)), self.feature_maps_shape):
            width = int(map_shape[1])
            height = int(map_shape[2])
            cell_scale = self.default_box_scale[index]
            for x in range(width):
                for y in range(height):
                    for ratio in self.box_aspect_ratio[index]:
                        center_x = (x / float(width)) + (0.5 / float(width))
                        center_y = (y / float(height)) + (0.5 / float(height))
                        box_width = np.sqrt(cell_scale * ratio)
                        box_height = np.sqrt(cell_scale / ratio)
                        all_default_boxes.append([center_x, center_y, box_width, box_height])
        all_default_boxes = np.array(all_default_boxes)
        # check the data for numeric errors
        all_default_boxes = self.check_numerics(all_default_boxes, 'all_default_boxes')
        return all_default_boxes

    # build the groundtruth data
    def generate_groundtruth_data(self, input_actual_data, f_class):
        # allocate empty arrays to hold the groundtruth
        input_actual_data_len = len(input_actual_data)
        gt_class = np.zeros((input_actual_data_len, self.all_default_boxs_len))
        gt_location = np.zeros((input_actual_data_len, self.all_default_boxs_len, 4))
        gt_positives_jacc = np.zeros((input_actual_data_len, self.all_default_boxs_len))
        gt_positives = np.zeros((input_actual_data_len, self.all_default_boxs_len))
        gt_negatives = np.zeros((input_actual_data_len, self.all_default_boxs_len))
        background_jacc = max(0, (self.jaccard_value - 0.2))
        # initialize the positive training examples
        for img_index in range(input_actual_data_len):
            for pre_actual in input_actual_data[img_index]:
                gt_class_val = pre_actual[-1:][0]
                gt_box_val = pre_actual[:-1]
                for boxe_index in range(self.all_default_boxs_len):
                    jacc = self.jaccard(gt_box_val, self.all_default_boxs[boxe_index])
                    if jacc >= self.jaccard_value:
                        gt_class[img_index][boxe_index] = gt_class_val
                        gt_location[img_index][boxe_index] = gt_box_val
                        gt_positives_jacc[img_index][boxe_index] = jacc
                        gt_positives[img_index][boxe_index] = 1
                        gt_negatives[img_index][boxe_index] = 0
            # if there is no positive example, create one at random to guard against nan
            if np.sum(gt_positives[img_index]) == 0:
                #print('[no jaccard match] : ' + str(input_actual_data[img_index]))
                random_pos_index = np.random.randint(low=0, high=self.all_default_boxs_len, size=1)[0]
                gt_class[img_index][random_pos_index] = self.background_classes_val
                gt_location[img_index][random_pos_index] = [0, 0, 0, 0]
                gt_positives_jacc[img_index][random_pos_index] = self.jaccard_value
                gt_positives[img_index][random_pos_index] = 1
                gt_negatives[img_index][random_pos_index] = 0
            # keep a positive:negative ratio of 1:3
            gt_neg_end_count = int(np.sum(gt_positives[img_index]) * 3)
            if (gt_neg_end_count + np.sum(gt_positives[img_index])) > self.all_default_boxs_len:
                gt_neg_end_count = self.all_default_boxs_len - np.sum(gt_positives[img_index])
            # pick the negative examples at random
            gt_neg_index = np.random.randint(low=0, high=self.all_default_boxs_len, size=gt_neg_end_count)
            for r_index in gt_neg_index:
                if gt_positives_jacc[img_index][r_index] < background_jacc:
                    gt_class[img_index][r_index] = self.background_classes_val
                    gt_positives[img_index][r_index] = 0
                    gt_negatives[img_index][r_index] = 1
        return gt_class, gt_location, gt_positives, gt_negatives

    # jaccard algorithm
    # computes the IOU; rect1 and rect2 are in the format [center_x, center_y, width, height]
    def jaccard(self, rect1, rect2):
        x_overlap = max(0, (min(rect1[0] + (rect1[2] / 2), rect2[0] + (rect2[2] / 2)) - max(rect1[0] - (rect1[2] / 2), rect2[0] - (rect2[2] / 2))))
        y_overlap = max(0, (min(rect1[1] + (rect1[3] / 2), rect2[1] + (rect2[3] / 2)) - max(rect1[1] - (rect1[3] / 2), rect2[1] - (rect2[3] / 2))))
        intersection = x_overlap * y_overlap
        # trim away the parts of each box that fall outside the image
        rect1_width_sub = 0
        rect1_height_sub = 0
        rect2_width_sub = 0
        rect2_height_sub = 0
        if (rect1[0] - rect1[2] / 2) < 0: rect1_width_sub += 0 - (rect1[0] - rect1[2] / 2)
        if (rect1[0] + rect1[2] / 2) > 1: rect1_width_sub += (rect1[0] + rect1[2] / 2) - 1
        if (rect1[1] - rect1[3] / 2) < 0: rect1_height_sub += 0 - (rect1[1] - rect1[3] / 2)
        if (rect1[1] + rect1[3] / 2) > 1: rect1_height_sub += (rect1[1] + rect1[3] / 2) - 1
        if (rect2[0] - rect2[2] / 2) < 0: rect2_width_sub += 0 - (rect2[0] - rect2[2] / 2)
        if (rect2[0] + rect2[2] / 2) > 1: rect2_width_sub += (rect2[0] + rect2[2] / 2) - 1
        if (rect2[1] - rect2[3] / 2) < 0: rect2_height_sub += 0 - (rect2[1] - rect2[3] / 2)
        if (rect2[1] + rect2[3] / 2) > 1: rect2_height_sub += (rect2[1] + rect2[3] / 2) - 1
        area_box_a = (rect1[2] - rect1_width_sub) * (rect1[3] - rect1_height_sub)
        area_box_b = (rect2[2] - rect2_width_sub) * (rect2[3] - rect2_height_sub)
        union = area_box_a + area_box_b - intersection
        if intersection > 0 and union > 0:
            return intersection / union
        else:
            return 0

    # check the data for nan/inf
    def check_numerics(self, input_dataset, message):
        if str(input_dataset).find('Tensor') == 0:
            input_dataset = tf.check_numerics(input_dataset, message)
        else:
            dataset = np.array(input_dataset)
            nan_count = np.count_nonzero(dataset != dataset)
            inf_count = len(dataset[dataset == float("inf")])
            n_inf_count = len(dataset[dataset == float("-inf")])
            if nan_count > 0 or inf_count > 0 or n_inf_count > 0:
                data_error = '[' + message + '] contains invalid values! [nan:' + str(nan_count) + '|inf:' + str(inf_count) + '|-inf:' + str(n_inf_count) + ']'
                raise Exception(data_error)
        return input_dataset

--------------------------------------------------------------------------------
/ssd300_resnet.py:
--------------------------------------------------------------------------------

"""
date: 2018/01/17
author: lslcode [jasonli8848@qq.com]
"""

--------------------------------------------------------------------------------
/train_datasets/README.md:
--------------------------------------------------------------------------------


--------------------------------------------------------------------------------
/train_datasets/voc2007/README.md:
--------------------------------------------------------------------------------


--------------------------------------------------------------------------------
/train_datasets/voc2012/README.md:
--------------------------------------------------------------------------------


--------------------------------------------------------------------------------