├── README.md ├── config.py ├── data_aug ├── __pycache__ │ └── data_aug.cpython-35.pyc └── data_aug.py ├── load_image ├── __pycache__ │ └── load_image.cpython-35.pyc └── load_image.py ├── main.py ├── net ├── alexnet │ └── alexnet.py ├── cifarnet │ └── cifarnet.py ├── inception_resnet_v2 │ └── inception_resnet_v2.py ├── inception_v4 │ ├── __pycache__ │ │ ├── inception_utils.cpython-35.pyc │ │ └── inception_v4.cpython-35.pyc │ ├── inception_utils.py │ └── inception_v4.py ├── resnet_v2 │ ├── __pycache__ │ │ ├── resnet_utils.cpython-35.pyc │ │ └── resnet_v2.cpython-35.pyc │ ├── resnet_utils.py │ └── resnet_v2.py ├── vgg │ ├── __pycache__ │ │ └── vgg.cpython-35.pyc │ └── vgg.py └── z_build_net │ ├── __pycache__ │ └── build_net.cpython-35.pyc │ └── build_net.py ├── pretrain └── README.md ├── sample_train ├── 0male │ ├── 0(1).jpeg │ ├── 0(1).jpg │ ├── 0(2).jpeg │ ├── 0(2).jpg │ ├── 0(3).jpeg │ └── 0(3).jpg ├── 1female │ ├── 1(1).jpg │ ├── 1(2).jpg │ ├── 1(3).jpg │ ├── 1(4).jpg │ ├── 1(5).jpg │ └── 1(6).jpg ├── 2many │ ├── 0_Parade_marchingband_1_12.jpg │ ├── 0_Parade_marchingband_1_13.jpg │ ├── 0_Parade_marchingband_1_17.jpg │ ├── 0_Parade_marchingband_1_5.jpg │ ├── 0_Parade_marchingband_1_6.jpg │ └── 0_Parade_marchingband_1_8.jpg └── 3other │ ├── 6(2).jpg │ ├── 6(3).jpg │ ├── 6(4).jpg │ ├── 6(5).jpg │ ├── 6(6).jpg │ └── 6(9).jpg ├── train_net ├── __pycache__ │ └── train.cpython-35.pyc └── train.py └── z_ckpt_pb ├── ckpt_pb.py ├── img_preprocessing.py ├── inception_preprocessing.py ├── inception_utils.py ├── inception_v4.py └── test.py /README.md: -------------------------------------------------------------------------------- 1 | 2 | # 自己搭建的一个训练框架,包含模型有:cnn+rnn: vgg(vgg16,vgg19)+rnn(LSTM, GRU), resnet(resnet_v2_50,resnet_v2_101,resnet_v2_152)+rnn(LSTM, GRU), inception_v4+rnn(LSTM, GRU), inception_resnet_v2+rnn(LSTM, GRU)等。 3 | # 此框架主要针对分类任务, 后面会陆续搭建多任务多标签、检测等框架,欢迎关注。 4 | 使用说明: 5 | 搭建时使用的环境为:Python3.5, tensorflow1.4 6 | 7 | 变量设置参考config.py。 8 | 详细说明参见config.py。 9 | 10 | ( mkdir pretrain/inception_v4, 下载与训练模型, cp到pretrain/inception_v4/ ) 11 | 12 | 运行代码: python main.py 13 | 14 | 另外此代码加了tensorboard,将在工程目录下生成 xxx_log 的文件。 然后使用:tensorboard --logdir arch_inceion_v4_rnn_log查看。 后续有时间会把其它的功能加上。 15 | 16 | 其中,z_ckpt_pb:ckpt转pb的代码,和测试接口。 17 | 18 | 19 | 对dl感兴趣,还可以关注我的博客,这是我的博客目录:(地址: http://blog.csdn.net/u014365862/article/details/78422372 ) 20 | 本文为博主原创文章,未经博主允许不得转载。有问题可以加微信:lp9628(注明CSDN)。 21 | 22 | 公众号MachineLN,邀请您扫码关注: 23 | 24 | ![image](http://upload-images.jianshu.io/upload_images/4618424-3ef1722341ba72d2?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240) 25 | 26 | **机器学习进阶系列:(下面内容在微信公众号同步)** 27 | 28 | 1. [机器学习-1:MachineLN之三要素](http://blog.csdn.net/u014365862/article/details/78955063) 29 | 30 | 2. [机器学习-2:MachineLN之模型评估](http://blog.csdn.net/u014365862/article/details/78959353) 31 | 32 | 3. [机器学习-3:MachineLN之dl](http://blog.csdn.net/u014365862/article/details/78980142) 33 | 34 | 4. [机器学习-4:DeepLN之CNN解析](http://blog.csdn.net/u014365862/article/details/78986089) 35 | 36 | 5. [机器学习-5:DeepLN之CNN权重更新(笔记)](http://blog.csdn.net/u014365862/article/details/78959211) 37 | 38 | 6. [机器学习-6:DeepLN之CNN源码](http://blog.csdn.net/u014365862/article/details/79010248) 39 | 40 | 7. [机器学习-7:MachineLN之激活函数](http://blog.csdn.net/u014365862/article/details/79007801) 41 | 42 | 8. [机器学习-8:DeepLN之BN](http://blog.csdn.net/u014365862/article/details/79019518) 43 | 44 | 9. [机器学习-9:MachineLN之数据归一化](http://blog.csdn.net/u014365862/article/details/79031089) 45 | 46 | 10. 
[机器学习-10:MachineLN之样本不均衡](http://blog.csdn.net/u014365862/article/details/79040390) 47 | 48 | 11. [机器学习-11:MachineLN之过拟合](http://blog.csdn.net/u014365862/article/details/79057073) 49 | 50 | 12. [机器学习-12:MachineLN之优化算法](http://blog.csdn.net/u014365862/article/details/79070721) 51 | 52 | 13. [机器学习-13:MachineLN之kNN](http://blog.csdn.net/u014365862/article/details/79091913) 53 | 54 | 14. [机器学习-14:MachineLN之kNN源码](http://blog.csdn.net/u014365862/article/details/79101209) 55 | 56 | 15. [](http://mp.blog.csdn.net/postedit/79135612)[机器学习-15:MachineLN之感知机](http://blog.csdn.net/u014365862/article/details/79135612) 57 | 58 | 16. [机器学习-16:MachineLN之感知机源码](http://blog.csdn.net/u014365862/article/details/79135767) 59 | 60 | 17. [机器学习-17:MachineLN之逻辑回归](http://blog.csdn.net/u014365862/article/details/79157777) 61 | 62 | 18. [机器学习-18:MachineLN之逻辑回归源码](http://blog.csdn.net/u014365862/article/details/79157841) 63 | 64 | 19. [机器学习-19:MachineLN之SVM(1)](http://blog.csdn.net/u014365862/article/details/79184858) 65 | 66 | 20. [机器学习-20:MachineLN之SVM(2)](http://blog.csdn.net/u014365862/article/details/79202089) 67 | 68 | 21. [机器学习-21:MachineLN之SVM源码](http://blog.csdn.net/u014365862/article/details/79224119) 69 | 70 | 22. [机器学习-22:MachineLN之RL](http://blog.csdn.net/u014365862/article/details/79240997) 71 | 72 | **人脸检测系列:** 73 | 74 | 1. [人脸检测——AFLW准备人脸](http://blog.csdn.net/u014365862/article/details/74682464) 75 | 76 | 2. [人脸检测——生成矫正人脸——cascade cnn的思想, 但是mtcnn的效果貌似更赞](http://blog.csdn.net/u014365862/article/details/74690865) 77 | 78 | 3. [人脸检测——准备非人脸](http://blog.csdn.net/u014365862/article/details/74719498) 79 | 80 | 4. [人脸检测——矫正人脸生成标签](http://blog.csdn.net/u014365862/article/details/74853099) 81 | 82 | 5. [人脸检测——mtcnn思想,生成negative、positive、part样本。](http://blog.csdn.net/u014365862/article/details/78051411) 83 | 84 | 6. [**人脸检测——滑动窗口篇(训练和实现)**](http://blog.csdn.net/u014365862/article/details/77816493) 85 | 86 | 7. [**人脸检测——fcn**](http://blog.csdn.net/u014365862/article/details/78036382) 87 | 88 | 8. [简单的人脸跟踪](http://blog.csdn.net/u014365862/article/details/77989896) 89 | 90 | 9. [Face Detection(OpenCV) Using Hadoop Streaming API](http://blog.csdn.net/u014365862/article/details/78173740) 91 | 92 | 10. [Face Recognition(face_recognition) Using Hadoop Streaming API](http://blog.csdn.net/u014365862/article/details/78175803) 93 | 94 | 11. [非极大值抑制(Non-Maximum-Suppression)](http://blog.csdn.net/u014365862/article/details/53376516) 95 | 96 | **OCR系列:** 97 | 98 | **1. [tf20: CNN—识别字符验证码](http://blog.csdn.net/u014365862/article/details/53869816)** 99 | 100 | 2. [**身份证识别——生成身份证号和汉字**](http://blog.csdn.net/u014365862/article/details/78581949) 101 | 102 | 3. [**tf21: 身份证识别——识别身份证号**](http://blog.csdn.net/u014365862/article/details/78582128) 103 | 104 | 4. **[tf22: ocr识别——不定长数字串识别](http://blog.csdn.net/u014365862/article/details/78582417)** 105 | 106 | **机器学习,深度学习系列:** 107 | 108 | 1. [反向传播与它的直观理解](http://blog.csdn.net/u014365862/article/details/54728707) 109 | 110 | 2. [卷积神经网络(CNN):从原理到实现](http://blog.csdn.net/u014365862/article/details/54865609) 111 | 112 | 3. [机器学习算法应用中常用技巧-1](http://blog.csdn.net/u014365862/article/details/54890040) 113 | 114 | 4. [机器学习算法应用中常用技巧-2](http://blog.csdn.net/u014365862/article/details/54890046) 115 | 116 | 5. [一个隐马尔科夫模型的应用实例:中文分词](http://blog.csdn.net/u014365862/article/details/54891582) 117 | 118 | 6. [**Pandas处理csv表格**](http://blog.csdn.net/u014365862/article/details/54923429) 119 | 120 | 7. [sklearn查看数据分布](http://blog.csdn.net/u014365862/article/details/54973495) 121 | 122 | 8. 
[TensorFlow 聊天机器人](http://blog.csdn.net/u014365862/article/details/57518873) 123 | 124 | 9. [YOLO](http://blog.csdn.net/u014365862/article/details/60321879) 125 | 126 | 10. [感知机--模型与策略](http://blog.csdn.net/u014365862/article/details/61413859) 127 | 128 | 11. [从 0 到 1 走进 Kaggle](http://blog.csdn.net/u014365862/article/details/72794198) 129 | 130 | 12. [python调用Face++,玩坏了!](http://blog.csdn.net/u014365862/article/details/74149097) 131 | 132 | 13. [人脸识别keras实现教程](http://blog.csdn.net/u014365862/article/details/74332028) 133 | 134 | 14. [机器学习中的Bias(偏差),Error(误差),和Variance(方差)有什么区别和联系?](http://blog.csdn.net/u014365862/article/details/76360351) 135 | 136 | 15. [CNN—pooling层的作用](http://blog.csdn.net/u014365862/article/details/77159143) 137 | 138 | 16. [trick—Batch Normalization](http://blog.csdn.net/u014365862/article/details/77159778) 139 | 140 | 17. [**tensorflow使用BN—Batch Normalization**](http://blog.csdn.net/u014365862/article/details/77188011) 141 | 142 | 18. [trick—Data Augmentation](http://blog.csdn.net/u014365862/article/details/77193754) 143 | 144 | 19. [CNN图图图](http://blog.csdn.net/u014365862/article/details/77367172) 145 | 146 | 20. [为什么很多做人脸的Paper会最后加入一个Local Connected Conv?](http://blog.csdn.net/u014365862/article/details/77795902) 147 | 148 | 21. [**Faster RCNN:RPN,anchor,sliding windows**](http://blog.csdn.net/u014365862/article/details/77887230) 149 | 150 | 22. [**深度学习这些坑你都遇到过吗?**](http://blog.csdn.net/u014365862/article/details/77961624) 151 | 152 | 23. [**image——Data Augmentation的代码**](http://blog.csdn.net/u014365862/article/details/78086604) 153 | 154 | 24. [8种常见机器学习算法比较](http://blog.csdn.net/u014365862/article/details/52937983) 155 | 156 | 25. [几种常见的激活函数](http://blog.csdn.net/u014365862/article/details/52710698) 157 | 158 | 26. [**Building powerful image classification models using very little data**](http://blog.csdn.net/u014365862/article/details/78519629) 159 | 160 | 27. [**机器学习模型训练时候tricks**](http://blog.csdn.net/u014365862/article/details/78519727) 161 | 162 | 28. [OCR综述](https://handong1587.github.io/deep_learning/2015/10/09/ocr.html#handwritten-recognition) 163 | 164 | 29. [一个有趣的周报](http://blog.csdn.net/u014365862/article/details/78757109) 165 | 166 | 30. [根据已给字符数据,训练逻辑回归、随机森林、SVM,生成ROC和箱线图](http://blog.csdn.net/u014365862/article/details/78835541) 167 | 168 | **图像处理系列:** 169 | 170 | 1. [python下使用cv2.drawContours填充轮廓颜色](http://blog.csdn.net/u014365862/article/details/77720368) 171 | 172 | 2. [imge stitching图像拼接stitching](http://blog.csdn.net/u014365862/article/details/53433048) 173 | 174 | 3. [用python简单处理图片(1):打开\显示\保存图像](http://blog.csdn.net/u014365862/article/details/52652256) 175 | 176 | 4. [用python简单处理图片(2):图像通道\几何变换\裁剪](http://blog.csdn.net/u014365862/article/details/52652273) 177 | 178 | 5. [用python简单处理图片(3):添加水印](http://blog.csdn.net/u014365862/article/details/52652296) 179 | 180 | 6. [用python简单处理图片(4):图像中的像素访问](http://blog.csdn.net/u014365862/article/details/52652300) 181 | 182 | 7. [用python简单处理图片(5):图像直方图](http://blog.csdn.net/u014365862/article/details/52652309) 183 | 184 | 8. [**仿射变换,透视变换:二维坐标到二维坐标之间的线性变换,可用于landmark人脸矫正。**](http://blog.csdn.net/u014365862/article/details/78678872) 185 | 186 | **代码整合系列:** 187 | 188 | 1. [windows下C++如何调用matlab程序](http://blog.csdn.net/u014365862/article/details/77480325) 189 | 190 | 2. [ubuntu下C++如何调用matlab程序](http://blog.csdn.net/u014365862/article/details/77529096) 191 | 192 | 3. [matlab使用TCP/IP Server Sockets](http://blog.csdn.net/u014365862/article/details/77745476) 193 | 194 | 4. 
[ubuntu下C++如何调用python程序,gdb调试C++代码](http://blog.csdn.net/u014365862/article/details/77864743) 195 | 196 | 5. [How to pass an array from C++ to an embedded python](http://blog.csdn.net/u014365862/article/details/77891487) 197 | 198 | 6. [如何使用Python为Hadoop编写一个简单的MapReduce程序](http://blog.csdn.net/u014365862/article/details/78169554) 199 | 200 | 7. [图像的遍历](http://blog.csdn.net/u014365862/article/details/53513710) 201 | 202 | 8. [**ubuntu下CMake编译生成动态库和静态库,以OpenTLD为例。**](http://blog.csdn.net/u014365862/article/details/78663269) 203 | 204 | 9. [**ubuntu下make编译生成动态库,然后python调用cpp。**](http://blog.csdn.net/u014365862/article/details/78675033) 205 | 206 | **数据结构和算法系列:** 207 | 208 | 1. [堆排序](http://blog.csdn.net/u014365862/article/details/78200711) 209 | 210 | 2. [red and black (深度优先搜索算法dfs)](http://blog.csdn.net/u014365862/article/details/48781603) 211 | 212 | 3. [深度优先搜索算法](http://blog.csdn.net/u014365862/article/details/48729681) 213 | 214 | 4. [qsort原理和实现](http://blog.csdn.net/u014365862/article/details/48688457) 215 | 216 | 5. [stack实现queue ; list实现stack](http://blog.csdn.net/u014365862/article/details/48594323) 217 | 218 | 6. [另一种斐波那契数列](http://blog.csdn.net/u014365862/article/details/48573545) 219 | 220 | 7. [堆和栈的区别(个人感觉挺不错的)](http://blog.csdn.net/u014365862/article/details/49159499) 221 | 222 | 8. [排序方法比较](http://blog.csdn.net/u014365862/article/details/52502824) 223 | 224 | 9. [漫画 :什么是红黑树?](https://mp.weixin.qq.com/s/JJVbi7kqDpLUuh696J7oLg) 225 | 226 | 10. [牛客网刷题](https://www.nowcoder.com/activity/oj) 227 | 228 | 11. [莫烦python 666](https://morvanzhou.github.io/) 229 | 230 | **kinect 系列:** 231 | 232 | 1. [Kinect v2.0原理介绍之一:硬件结构](http://blog.csdn.net/u014365862/article/details/46713807) 233 | 234 | 2. [Kinect v2.0原理介绍之二:6种数据源](http://blog.csdn.net/u014365862/article/details/46849253) 235 | 236 | 3. [Kinect v2.0原理介绍之三:骨骼跟踪的原理](http://blog.csdn.net/u014365862/article/details/46849309) 237 | 238 | 4. [Kinect v2.0原理介绍之四:人脸跟踪探讨](http://blog.csdn.net/u014365862/article/details/46849357) 239 | 240 | 5. [Kinect v2.0原理介绍之五:只检测离kinect最近的人脸](http://blog.csdn.net/u014365862/article/details/47809401) 241 | 242 | 6. [Kinect v2.0原理介绍之六:Kinect深度图与彩色图的坐标校准](http://blog.csdn.net/u014365862/article/details/48212085) 243 | 244 | 7. [Kinect v2.0原理介绍之七:彩色帧获取](http://blog.csdn.net/u014365862/article/details/48212377) 245 | 246 | 8. [Kinect v2.0原理介绍之八:高清面部帧(1) FACS 介绍](http://blog.csdn.net/u014365862/article/details/48212631) 247 | 248 | 9. [Kinect v2.0原理介绍之九:高清面部帧(2) 面部特征对齐](http://blog.csdn.net/u014365862/article/details/48212757) 249 | 250 | 10. [Kinect v2.0原理介绍之十:获取高清面部帧的AU单元特征保存到文件](http://blog.csdn.net/u014365862/article/details/48780361) 251 | 252 | 11. [kinect v2.0原理介绍之十一:录制视频](http://blog.csdn.net/u014365862/article/details/77929405) 253 | 254 | 12. [Kinect v2.0原理介绍之十二:音频获取](http://blog.csdn.net/u014365862/article/details/49204931) 255 | 256 | 13. [Kinect v2.0原理介绍之十三:面部帧获取](http://blog.csdn.net/u014365862/article/details/50434088) 257 | 258 | 14. [Kinect for Windows V2和V1对比开发___彩色数据获取并用OpenCV2.4.10显示](http://blog.csdn.net/u014365862/article/details/48948861) 259 | 260 | 15. [Kinect for Windows V2和V1对比开发___骨骼数据获取并用OpenCV2.4.10显示](http://blog.csdn.net/u014365862/article/details/48949055) 261 | 262 | 16. [用kinect录视频库](http://blog.csdn.net/u014365862/article/details/48946543) 263 | 264 | **tensorflow系列:** 265 | 266 | 1. [](http://blog.csdn.net/u014365862/article/details/78422315)[Ubuntu 16.04 安装 Tensorflow(GPU支持)](http://blog.csdn.net/u014365862/article/details/53868411) 267 | 268 | 2. 
[使用Python实现神经网络](http://blog.csdn.net/u014365862/article/details/53868414) 269 | 270 | 3. [tf1: nn实现评论分类](http://blog.csdn.net/u014365862/article/details/53868418) 271 | 272 | 4. [tf2: nn和cnn实现评论分类](http://blog.csdn.net/u014365862/article/details/53868422) 273 | 274 | 5. [tf3: RNN—mnist识别](http://blog.csdn.net/u014365862/article/details/53868425) 275 | 276 | 6. [tf4: CNN—mnist识别](http://blog.csdn.net/u014365862/article/details/53868430) 277 | 278 | 7\.  [tf5: Deep Q Network—AI游戏](http://blog.csdn.net/u014365862/article/details/53868436) 279 | 280 | 8. [tf6: autoencoder—WiFi指纹的室内定位](http://blog.csdn.net/u014365862/article/details/53868533) 281 | 282 | 9. [tf7: RNN—古诗词](http://blog.csdn.net/u014365862/article/details/53868544) 283 | 284 | 10. [tf8:RNN—生成音乐](http://blog.csdn.net/u014365862/article/details/53868549) 285 | 286 | 11. [tf9: PixelCNN](http://blog.csdn.net/u014365862/article/details/53868557) 287 | 288 | 12. [tf10: 谷歌Deep Dream](http://blog.csdn.net/u014365862/article/details/53868560) 289 | 290 | 13. [tf11: retrain谷歌Inception模型](http://blog.csdn.net/u014365862/article/details/53868568) 291 | 292 | 14. [tf12: 判断男声女声](http://blog.csdn.net/u014365862/article/details/54600398) 293 | 294 | 15. [tf13: 简单聊天机器人](http://blog.csdn.net/u014365862/article/details/53869660) 295 | 296 | 16. [tf14: 黑白图像上色](http://blog.csdn.net/u014365862/article/details/53869682) 297 | 298 | 17. [tf15: 中文语音识别](http://blog.csdn.net/u014365862/article/details/53869701) 299 | 300 | 18. [tf16: 脸部特征识别性别和年龄](http://blog.csdn.net/u014365862/article/details/53869712) 301 | 302 | 19. [tf17: “声音大挪移”](http://blog.csdn.net/u014365862/article/details/53869724) 303 | 304 | 20. [tf18: 根据姓名判断性别](http://blog.csdn.net/u014365862/article/details/53869732) 305 | 306 | 21\.  [tf19: 预测铁路客运量](http://blog.csdn.net/u014365862/article/details/53869802) 307 | 308 | 22. [**tf20: CNN—识别字符验证码**](http://blog.csdn.net/u014365862/article/details/53869816) 309 | 310 | 23. [tf21: 身份证识别——识别身份证号](http://blog.csdn.net/u014365862/article/details/78582128) 311 | 312 | 24. [](http://blog.csdn.net/u014365862/article/details/78582417)[tf22: ocr识别——不定长数字串识别](http://blog.csdn.net/u014365862/article/details/78582417) 313 | 314 | 25. [tf23: “恶作剧” --人脸检测](http://blog.csdn.net/u014365862/article/details/53978811) 315 | 316 | 26. [tf24: GANs—生成明星脸](http://blog.csdn.net/u014365862/article/details/54380277) 317 | 318 | 27. [](http://blog.csdn.net/u014365862/article/details/54706771)[tf25: 使用深度学习做阅读理解+完形填空](http://blog.csdn.net/u014365862/article/details/54428325) 319 | 320 | 28. [tf26: AI操盘手](http://blog.csdn.net/u014365862/article/details/54706771) 321 | 322 | 29. [tensorflow_cookbook--preface](http://blog.csdn.net/u014365862/article/details/70837573) 323 | 324 | 30. [01 TensorFlow入门(1)](http://blog.csdn.net/u014365862/article/details/70837638) 325 | 326 | 31. [01 TensorFlow入门(2)](http://blog.csdn.net/u014365862/article/details/70849334) 327 | 328 | 32. [02 The TensorFlow Way(1)](http://blog.csdn.net/u014365862/article/details/70884624) 329 | 330 | 33. [02 The TensorFlow Way(2)](http://blog.csdn.net/u014365862/article/details/70887213) 331 | 332 | 34. [02 The TensorFlow Way(3)](http://blog.csdn.net/u014365862/article/details/71038528) 333 | 334 | 35. [03 Linear Regression](http://blog.csdn.net/u014365862/article/details/71064855) 335 | 336 | 36. [04 Support Vector Machines](http://blog.csdn.net/u014365862/article/details/71078010) 337 | 338 | 37. [tf API 研读1:tf.nn,tf.layers, tf.contrib概述](http://blog.csdn.net/u014365862/article/details/77833481) 339 | 340 | 38. 
[tf API 研读2:math](http://blog.csdn.net/u014365862/article/details/77847410) 341 | 342 | 39. [tensorflow中的上采样(unpool)和反卷积(conv2d_transpose)](http://blog.csdn.net/u014365862/article/details/77936259) 343 | 344 | 40. [tf API 研读3:Building Graphs](http://blog.csdn.net/u014365862/article/details/77944301) 345 | 346 | 41. [tf API 研读4:Inputs and Readers](http://blog.csdn.net/u014365862/article/details/77946268) 347 | 348 | 42. [](http://blog.csdn.net/u014365862/article/details/77967231)[tf API 研读5:Data IO](http://blog.csdn.net/u014365862/article/details/77967231) 349 | 350 | 43. [tf API 研读6:Running Graphs](http://blog.csdn.net/u014365862/article/details/77967995) 351 | 352 | 44. [**tf.contrib.rnn.static_rnn与tf.nn.dynamic_rnn区别**](http://blog.csdn.net/u014365862/article/details/78238807) 353 | 354 | 45. [**Tensorflow使用的预训练的resnet_v2_50,resnet_v2_101,resnet_v2_152等模型预测,训练**](http://blog.csdn.net/u014365862/article/details/78272811) 355 | 356 | 46. [**tensorflow下设置使用某一块GPU、多GPU、CPU的情况**](http://blog.csdn.net/u014365862/article/details/78292475) 357 | 358 | 47. [**工业器件检测和识别**](http://blog.csdn.net/u014365862/article/details/78359194) 359 | 360 | 48. [**将tf训练的权重保存为CKPT,PB ,CKPT 转换成 PB格式。并将权重固化到图里面,并使用该模型进行预测**](http://blog.csdn.net/u014365862/article/details/78404980) 361 | 362 | 49. **[tensorsor快速获取所有变量,和快速计算L2范数](http://blog.csdn.net/u014365862/article/details/78422315)** 363 | 364 | 50. [**cnn+rnn+attention**](http://blog.csdn.net/u014365862/article/details/78495870) 365 | 366 | 51. [Tensorflow实战学习笔记](https://github.com/MachineLP/Tensorflow-) 367 | 368 | 52. [tf27: Deep Dream—应用到视频](http://blog.csdn.net/u014365862/article/details/53869830) 369 | 370 | 53. [tf28: 手写汉字识别](http://blog.csdn.net/u014365862/article/details/53869837) 371 | 372 | 54. [tf29: 使用tensorboard可视化inception_v4](http://blog.csdn.net/u014365862/article/details/79115556) 373 | 374 | 55. [tf30: center loss及其mnist上的应用](http://blog.csdn.net/u014365862/article/details/79184966) 375 | 376 | 56. [tf31: keras的LSTM腾讯人数在线预测](http://blog.csdn.net/u014365862/article/details/79186993) 377 | 378 | 57. [tf32: 一个简单的cnn模型:人脸特征点训练](http://blog.csdn.net/u014365862/article/details/79187157) 379 | 380 | 58. [tf33: 图片降噪:卷积自编码](http://blog.csdn.net/u014365862/article/details/79246179) 381 | 382 | **C++系列:** 383 | 384 | 1. [c++ primer之const限定符](http://blog.csdn.net/u014365862/article/details/46848613) 385 | 386 | 2. [c++primer之auto类型说明符](http://blog.csdn.net/u014365862/article/details/46849697) 387 | 388 | 3. [c++primer之预处理器](http://blog.csdn.net/u014365862/article/details/46853869) 389 | 390 | 4. [c++primer之string](http://blog.csdn.net/u014365862/article/details/46860037) 391 | 392 | 5. [c++primer之vector](http://blog.csdn.net/u014365862/article/details/46885087) 393 | 394 | 6. [c++primer之多维数组](http://blog.csdn.net/u014365862/article/details/46933199) 395 | 396 | 7. [c++primer之范围for循环](http://blog.csdn.net/u014365862/article/details/47706255) 397 | 398 | 8. [c++primer之运算符优先级表](http://blog.csdn.net/u014365862/article/details/47706423) 399 | 400 | 9. [c++primer之try语句块和异常处理](http://blog.csdn.net/u014365862/article/details/47707669) 401 | 402 | 10. [c++primer之函数(函数基础和参数传递)](http://blog.csdn.net/u014365862/article/details/47783193) 403 | 404 | 11. [c++primer之函数(返回类型和return语句)](http://blog.csdn.net/u014365862/article/details/47808711) 405 | 406 | 12. [c++primer之函数重载](http://blog.csdn.net/u014365862/article/details/47834667) 407 | 408 | 13. [c++重写卷积网络的前向计算过程,完美复现theano的测试结果](http://blog.csdn.net/u014365862/article/details/48010697) 409 | 410 | 14. 
[c++ primer之类](http://blog.csdn.net/u014365862/article/details/48165685) 411 | 412 | 15. [c++primer之类(构造函数再探)](http://blog.csdn.net/u014365862/article/details/48198595) 413 | 414 | 16. [c++primer之类(类的静态成员)](http://blog.csdn.net/u014365862/article/details/48199161) 415 | 416 | 17. [c++primer之顺序容器(容器库概览)](http://blog.csdn.net/u014365862/article/details/48209805) 417 | 418 | 18. [c++primer之顺序容器(添加元素)](http://blog.csdn.net/u014365862/article/details/48226673) 419 | 420 | 19. [c++primer之顺序容器(访问元素)](http://blog.csdn.net/u014365862/article/details/48230053) 421 | 422 | **OpenCV系列:** 423 | 424 | 1. [自己训练SVM分类器,进行HOG行人检测。](http://blog.csdn.net/u014365862/article/details/53243604) 425 | 426 | 2. [opencv-haar-classifier-training](http://blog.csdn.net/u014365862/article/details/53096367) 427 | 428 | 3. [vehicleDectection with Haar Cascades](http://blog.csdn.net/u014365862/article/details/53087675) 429 | 430 | 4. [LaneDetection](http://blog.csdn.net/u014365862/article/details/53083143) 431 | 432 | 5. [OpenCV学习笔记大集锦](http://blog.csdn.net/u014365862/article/details/53063627) 433 | 434 | 6. [Why always OpenCV Error: Assertion failed (elements_read == 1) in unknown function ?](http://blog.csdn.net/u014365862/article/details/53000619) 435 | 436 | 7. [目标检测之训练opencv自带的分类器(opencv_haartraining 或 opencv_traincascade)](http://blog.csdn.net/u014365862/article/details/52997019) 437 | 438 | 8. [车牌识别 之 字符分割](http://blog.csdn.net/u014365862/article/details/52672747) 439 | 440 | 9. **[仿射变换,透视变换:二维坐标到二维坐标之间的线性变换,可用于landmark人脸矫正。](http://blog.csdn.net/u014365862/article/details/78678872)** 441 | 442 | 10. [opencv实现抠图(单一背景),替换背景图](http://blog.csdn.net/u014365862/article/details/78863756) 443 | 444 | **python系列(**web开发、多线程等**):** 445 | 446 | 1. [**flask的web开发,用于机器学习(主要还是DL)模型的简单演示。**](http://blog.csdn.net/u014365862/article/details/78818334) 447 | 448 | 2. **[python多线程,获取多线程的返回值](http://blog.csdn.net/u014365862/article/details/78835348)** 449 | 450 | 3. [文件中字的统计及创建字典](http://blog.csdn.net/u014365862/article/details/78914151) 451 | 452 | **其他:** 453 | 454 | 1. [MAC平台下Xcode配置使用OpenCV的具体方法 (2016最新)](http://blog.csdn.net/u014365862/article/details/53067565) 455 | 456 | 2. [**python下如何安装.whl包?**](http://blog.csdn.net/u014365862/article/details/51817390) 457 | 458 | 3. [给中国学生的第三封信:成功、自信、快乐](http://blog.csdn.net/u014365862/article/details/47972321) 459 | 460 | 4. [自己-社会-机器学习](http://blog.csdn.net/u014365862/article/details/48604145) 461 | 462 | 5. [不执著才叫看破,不完美才叫人生。](http://blog.csdn.net/u014365862/article/details/49079047) 463 | 464 | 6. [PCANet的C++代码——详细注释版](http://blog.csdn.net/u014365862/article/details/51213280) 465 | 466 | 7. [责任与担当](http://blog.csdn.net/u014365862/article/details/51841590) 467 | 468 | 8. [好走的都是下坡路](http://blog.csdn.net/u014365862/article/details/53244402) 469 | 470 | 9. [一些零碎的语言,却触动到内心深处。](http://blog.csdn.net/u014365862/article/details/53186012) 471 | 472 | 10. [用一个脚本学习 python](http://blog.csdn.net/u014365862/article/details/54428373) 473 | 474 | 11. 
[一个有趣的周报](http://blog.csdn.net/u014365862/article/details/78757109) 475 | 476 | -------------------------------------------------------------------------------- /config.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on 2017 10.17 4 | @author: liupeng 5 | wechat: lp9628 6 | blog: http://blog.csdn.net/u014365862/article/details/78422372 7 | """ 8 | 9 | # inception_v4:299 10 | # resnet_v2:224 11 | # vgg:224 12 | 13 | IMAGE_HEIGHT = 299 14 | IMAGE_WIDTH = 299 15 | num_classes = 4 16 | EARLY_STOP_PATIENCE = 1000 17 | # epoch 18 | epoch = 1000 19 | batch_size = 1 20 | # 模型的学习率 21 | learning_rate = 0.00001 22 | keep_prob = 0.8 23 | 24 | 25 | # 设置训练样本的占总样本的比例: 26 | train_rate = 0.9 27 | 28 | # 每个类别保存到一个文件中,放在此目录下,只要是二级目录就可以。 29 | craterDir = "sample_train" 30 | 31 | # 选择需要的模型 32 | # arch_model="arch_inception_v4"; 33 | # arch_model="arch_resnet_v2_50" 34 | # arch_model="vgg_16" 35 | arch_model="arch_inception_v4_rnn" 36 | 37 | # 设置要更新的参数和加载的参数,目前是非此即彼,可以自己修改哦 38 | checkpoint_exclude_scopes = "Logits_out" 39 | 40 | # 迁移学习模型参数, 下载训练好模型:https://github.com/MachineLP/models/tree/master/research/slim 41 | # checkpoint_path="pretrain/inception_v4/inception_v4.ckpt"; 42 | # checkpoint_path="pretrain/resnet_v2_50/resnet_v2_50.ckpt" 43 | checkpoint_path="pretrain/inception_v4/inception_v4.ckpt" 44 | 45 | #训练好的模型参数在model文件夹下。 46 | 47 | # 接下来可以添加的功能: 48 | # 图像归一化:默认的是归一化到[-1,1]:(load_image/load_image.py:get_next_batch_from_path) (可以自行加一些设置参数,在此处设置) 49 | # 需要加入模型 需修改 (train_net/train.py) 50 | # 设置GPU使用, train_net/train.py (多GPU), main.py 51 | # 设置学习率衰减:learningRate_1 = tf.train.exponential_decay(lr1_init, tf.subtract(global_step, 1), decay_steps, decay_rate, True) 52 | # 加入tensorboard 可视化 53 | # 需要修改参数更新的方法请参考:(train_net/train.py) 54 | ''' 55 | def _configure_optimizer(learning_rate): 56 | """Configures the optimizer used for training. 57 | 58 | Args: 59 | learning_rate: A scalar or `Tensor` learning rate. 60 | 61 | Returns: 62 | An instance of an optimizer. 63 | 64 | Raises: 65 | ValueError: if FLAGS.optimizer is not recognized. 
66 | """ 67 | if FLAGS.optimizer == 'adadelta': 68 | optimizer = tf.train.AdadeltaOptimizer( 69 | learning_rate, 70 | rho=FLAGS.adadelta_rho, 71 | epsilon=FLAGS.opt_epsilon) 72 | elif FLAGS.optimizer == 'adagrad': 73 | optimizer = tf.train.AdagradOptimizer( 74 | learning_rate, 75 | initial_accumulator_value=FLAGS.adagrad_initial_accumulator_value) 76 | elif FLAGS.optimizer == 'adam': 77 | optimizer = tf.train.AdamOptimizer( 78 | learning_rate, 79 | beta1=FLAGS.adam_beta1, 80 | beta2=FLAGS.adam_beta2, 81 | epsilon=FLAGS.opt_epsilon) 82 | elif FLAGS.optimizer == 'ftrl': 83 | optimizer = tf.train.FtrlOptimizer( 84 | learning_rate, 85 | learning_rate_power=FLAGS.ftrl_learning_rate_power, 86 | initial_accumulator_value=FLAGS.ftrl_initial_accumulator_value, 87 | l1_regularization_strength=FLAGS.ftrl_l1, 88 | l2_regularization_strength=FLAGS.ftrl_l2) 89 | elif FLAGS.optimizer == 'momentum': 90 | optimizer = tf.train.MomentumOptimizer( 91 | learning_rate, 92 | momentum=FLAGS.momentum, 93 | name='Momentum') 94 | elif FLAGS.optimizer == 'rmsprop': 95 | optimizer = tf.train.RMSPropOptimizer( 96 | learning_rate, 97 | decay=FLAGS.rmsprop_decay, 98 | momentum=FLAGS.rmsprop_momentum, 99 | epsilon=FLAGS.opt_epsilon) 100 | elif FLAGS.optimizer == 'sgd': 101 | optimizer = tf.train.GradientDescentOptimizer(learning_rate) 102 | else: 103 | raise ValueError('Optimizer [%s] was not recognized', FLAGS.optimizer) 104 | 105 | 106 | return optimizer''' 107 | -------------------------------------------------------------------------------- /data_aug/__pycache__/data_aug.cpython-35.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/data_aug/__pycache__/data_aug.cpython-35.pyc -------------------------------------------------------------------------------- /data_aug/data_aug.py: -------------------------------------------------------------------------------- 1 | 2 | # -*- coding: utf-8 -*- 3 | """ 4 | Created on 2017 10.17 5 | @author: liupeng 6 | wechat: lp9628 7 | blog: http://blog.csdn.net/u014365862/article/details/78422372 8 | """ 9 | 10 | import numpy as np 11 | import tensorflow as tf 12 | import numpy as np 13 | import os 14 | from PIL import Image 15 | import cv2 16 | from skimage import exposure 17 | 18 | class data_aug(object): 19 | 20 | def __init__(self, img): 21 | self.image= img 22 | 23 | # 左右镜像 24 | def random_fliplr(self, random_flip = True): 25 | if random_flip and np.random.choice([True, False]): 26 | self.image = np.fliplr(self.image) # 左右 27 | 28 | # 上下镜像 29 | def random_flidud(self, random_flip = True): 30 | if random_flip and np.random.choice([True, False]): 31 | self.image = np.flipud(self.image) # 上下 32 | 33 | # 改变光照 34 | def random_exposure(self, random_exposure = True): 35 | if random_exposure and np.random.choice([True, False]): 36 | e_rate = np.random.uniform(0.5,1.5) 37 | self.image = exposure.adjust_gamma(self.image, e_rate) 38 | 39 | # 旋转 40 | def random_rotation(self, random_rotation = True): 41 | if random_rotation and np.random.choice([True, False]): 42 | w,h = self.image.shape[1], self.image.shape[0] 43 | # 0-180随机产生旋转角度。 44 | angle = np.random.randint(0,10) 45 | RotateMatrix = cv2.getRotationMatrix2D(center=(w/2, h/2), angle=angle, scale=0.7) 46 | # image = cv2.warpAffine(image, RotateMatrix, (w,h), borderValue=(129,137,130)) 47 | self.image = cv2.warpAffine(self.image, RotateMatrix, (w,h), borderMode=cv2.BORDER_REPLICATE) 48 | 49 | # 裁剪 50 | def 
random_crop(self, crop_size = 299, random_crop = True): 51 | if random_crop and np.random.choice([True, False]): 52 | if self.image.shape[1] > crop_size: 53 | sz1 = self.image.shape[1] // 2 54 | sz2 = crop_size // 2 55 | diff = sz1 - sz2 56 | (h, v) = (np.random.randint(0, diff + 1), np.random.randint(0, diff + 1)) 57 | self.image = self.image[v:(v + crop_size), h:(h + crop_size), :] 58 | # 59 | def get_aug_img(self): 60 | self.random_fliplr() 61 | self.random_flidud() 62 | self.random_exposure() 63 | self.random_rotation() 64 | self.random_crop() 65 | return self.image 66 | 67 | 68 | -------------------------------------------------------------------------------- /load_image/__pycache__/load_image.cpython-35.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/load_image/__pycache__/load_image.cpython-35.pyc -------------------------------------------------------------------------------- /load_image/load_image.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on 2017 10.17 4 | @author: liupeng 5 | wechat: lp9628 6 | blog: http://blog.csdn.net/u014365862/article/details/78422372 7 | """ 8 | 9 | import numpy as np 10 | import tensorflow as tf 11 | import numpy as np 12 | import os 13 | from PIL import Image 14 | import cv2 15 | from data_aug import data_aug 16 | 17 | # 适用于二级目录 。。。/图片类别文件/图片(.png ,jpg等) 18 | 19 | class load_image(object): 20 | 21 | def __init__(self, img_dir, train_rate): 22 | self.imgDir = img_dir 23 | self.train_rate = train_rate 24 | 25 | def load_img_path(self, img_label): 26 | imgs = os.listdir(self.craterDir + self.foldName) 27 | imgNum = len(imgs) 28 | self.data = [] 29 | self.label = [] 30 | for i in range (imgNum): 31 | img_path = self.craterDir+self.foldName+"/"+imgs[i] 32 | # 用来检测图片是否有效,放在这里会太费时间。 33 | # img = cv2.imread(img_path) 34 | # if img is not None: 35 | self.data.append(img_path) 36 | self.label.append(int(img_label)) 37 | 38 | def shuffle_train_data(self): 39 | index = [i for i in range(len(self.train_imgs))] 40 | np.random.shuffle(index) 41 | self.train_imgs = np.asarray(self.train_imgs) 42 | self.train_labels = np.asarray(self.train_labels) 43 | self.train_imgs = self.train_imgs[index] 44 | self.train_labels = self.train_labels[index] 45 | 46 | 47 | def load_database_path(self): 48 | img_path = os.listdir(self.imgDir) 49 | self.train_imgs = [] 50 | self.train_labels = [] 51 | for i, path in enumerate(img_path): 52 | self.craterDir = self.imgDir + '/' 53 | self.foldName = path 54 | self.load_img_path(i) 55 | self.train_imgs.extend(self.data) 56 | self.train_labels.extend(self.label) 57 | print ("文件名对应的label:") 58 | print (path, i) 59 | #打乱数据集 60 | self.shuffle_train_data() 61 | # 数据集的数量 62 | self.image_n = len(self.train_imgs) 63 | 64 | def gen_train_valid_image(self): 65 | self.load_database_path() 66 | self.train_n = int(self.image_n * self.train_rate) 67 | self.valid_n = int(self.image_n * (1 - self.train_rate)) 68 | return self.train_imgs[0:self.train_n], self.train_labels[0:self.train_n], self.train_imgs[self.train_n:self.image_n], self.train_labels[self.train_n:self.image_n] 69 | 70 | ''' 71 | def get_next_batch_from_path(image_path, image_labels, pointer, IMAGE_HEIGHT=299, IMAGE_WIDTH=299, batch_size=64, is_train=True): 72 | batch_x = np.zeros([batch_size, IMAGE_HEIGHT,IMAGE_WIDTH,3]) 73 | num_classes = len(image_labels[0]) 74 | 
batch_y = np.zeros([batch_size, num_classes]) 75 | for i in range(batch_size): 76 | image = cv2.imread(image_path[i+pointer*batch_size]) 77 | image = cv2.resize(image, (IMAGE_HEIGHT, IMAGE_WIDTH)) 78 | if is_train: 79 | img_aug = data_aug.data_aug(image) 80 | image = img_aug.get_aug_img() 81 | # image = cv2.resize(image, (IMAGE_HEIGHT, IMAGE_WIDTH)) 82 | # 选择自己预处理方式: 83 | 84 | # m = image.mean() 85 | # s = image.std() 86 | # min_s = 1.0/(np.sqrt(image.shape[0]*image.shape[1])) 87 | # std = max(min_s, s) 88 | # image = (image-m)/std 89 | # image = (image-127.5) 90 | image = image / 255.0 91 | image = image - 0.5 92 | image = image * 2 93 | 94 | batch_x[i,:,:,:] = image 95 | # print labels[i+pointer*batch_size] 96 | batch_y[i] = image_labels[i+pointer*batch_size] 97 | return batch_x, batch_y''' 98 | 99 | 100 | def shuffle_train_data(train_imgs, train_labels): 101 | index = [i for i in range(len(train_imgs))] 102 | np.random.shuffle(index) 103 | train_imgs = np.asarray(train_imgs) 104 | train_labels = np.asarray(train_labels) 105 | train_imgs = train_imgs[index] 106 | train_labels = train_labels[index] 107 | return train_imgs, train_labels 108 | 109 | #------------------------------------------------# 110 | # 功能:按照图像最小的边进行缩放 111 | # 输入:img:图像,resize_size:需要的缩放大小 112 | # 输出:缩放后的图像 113 | #------------------------------------------------# 114 | def img_crop_pre(img, resize_size=336): 115 | h, w, _ = img.shape 116 | deta = h if h < w else w 117 | alpha = resize_size / float(deta) 118 | # print (alpha) 119 | img = cv2.resize(img, (int(h*alpha), int(w*alpha))) 120 | return img 121 | 122 | def get_next_batch_from_path(image_path, image_labels, pointer, IMAGE_HEIGHT=299, IMAGE_WIDTH=299, batch_size=64, is_train=True): 123 | batch_x = np.zeros([batch_size, IMAGE_HEIGHT,IMAGE_WIDTH,3]) 124 | num_classes = len(image_labels[0]) 125 | batch_y = np.zeros([batch_size, num_classes]) 126 | for i in range(batch_size): 127 | image = cv2.imread(image_path[i+pointer*batch_size]) 128 | image = img_crop_pre(image, resize_size=336) 129 | # image = cv2.resize(image, (IMAGE_HEIGHT, IMAGE_WIDTH)) 130 | if is_train: 131 | img_aug = data_aug.data_aug(image) 132 | image = img_aug.get_aug_img() 133 | image = cv2.resize(image, (IMAGE_HEIGHT, IMAGE_WIDTH)) 134 | # 选择自己预处理方式: 135 | ''' 136 | m = image.mean() 137 | s = image.std() 138 | min_s = 1.0/(np.sqrt(image.shape[0]*image.shape[1])) 139 | std = max(min_s, s) 140 | image = (image-m)/std''' 141 | # image = (image-127.5) 142 | image = image / 255.0 143 | image = image - 0.5 144 | image = image * 2 145 | 146 | batch_x[i,:,:,:] = image 147 | # print labels[i+pointer*batch_size] 148 | batch_y[i] = image_labels[i+pointer*batch_size] 149 | return batch_x, batch_y 150 | 151 | 152 | def test(): 153 | 154 | craterDir = "train" 155 | data, label = load_database(craterDir) 156 | print (data.shape) 157 | print (len(data)) 158 | print (data[0].shape) 159 | print (label[0]) 160 | batch_x, batch_y = get_next_batch_from_path(data, label, 0, IMAGE_HEIGHT=299, IMAGE_WIDTH=299, batch_size=64, is_train=True) 161 | print (batch_x) 162 | print (batch_y) 163 | 164 | if __name__ == '__main__': 165 | test() 166 | 167 | -------------------------------------------------------------------------------- /main.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on 2017 10.17 4 | @author: liupeng 5 | wechat: lp9628 6 | blog: http://blog.csdn.net/u014365862/article/details/78422372 7 | """ 8 | 9 | import numpy as np 10 | 
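# ---------------------------------------------------------------------------
# Illustrative sketch (not part of the original sources): how a single image
# flows through the load_image.py preprocessing above before it reaches the
# network. The file name "face.jpg" is a made-up example; everything else
# restates the img_crop_pre / data_aug / get_next_batch_from_path steps shown
# earlier.
#
#   import cv2
#   from load_image.load_image import img_crop_pre
#   from data_aug.data_aug import data_aug
#
#   img = cv2.imread("face.jpg")
#   img = img_crop_pre(img, resize_size=336)   # scale by the shorter side
#   img = data_aug(img).get_aug_img()          # random flip / gamma / rotation / crop
#   img = cv2.resize(img, (299, 299))          # 299 for inception_v4 (see config.py)
#   img = (img / 255.0 - 0.5) * 2              # normalize pixels to [-1, 1]
# ---------------------------------------------------------------------------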
import tensorflow as tf 11 | slim = tf.contrib.slim 12 | import numpy as np 13 | import argparse 14 | import os 15 | from PIL import Image 16 | from datetime import datetime 17 | import math 18 | import time 19 | from load_image import load_image 20 | try: 21 | from train import train 22 | except: 23 | from train_net.train import train 24 | import cv2 25 | import os 26 | from keras.utils import np_utils 27 | os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" 28 | os.environ["CUDA_VISIBLE_DEVICES"] = "0" 29 | 30 | import config 31 | 32 | if __name__ == '__main__': 33 | 34 | IMAGE_HEIGHT = config.IMAGE_HEIGHT 35 | IMAGE_WIDTH = config.IMAGE_WIDTH 36 | num_classes = config.num_classes 37 | EARLY_STOP_PATIENCE = config.EARLY_STOP_PATIENCE 38 | # epoch 39 | epoch = config.epoch 40 | batch_size = config.batch_size 41 | # 模型的学习率 42 | learning_rate = config.learning_rate 43 | keep_prob = config.keep_prob 44 | 45 | ##----------------------------------------------------------------------------## 46 | # 设置训练样本的占总样本的比例: 47 | train_rate = config.train_rate 48 | # 每个类别保存到一个文件中,放在此目录下,只要是二级目录就可以。 49 | craterDir = config.craterDir 50 | 51 | # 选择需要的模型 52 | # arch_model="arch_inception_v4"; arch_model="arch_resnet_v2_50"; arch_model="vgg_16" 53 | arch_model=config.arch_model 54 | # 设置要更新的参数和加载的参数,目前是非此即彼,可以自己修改哦 55 | checkpoint_exclude_scopes = config.checkpoint_exclude_scopes 56 | # 迁移学习模型参数 57 | checkpoint_path=config.checkpoint_path 58 | 59 | ##----------------------------------------------------------------------------## 60 | print ("-----------------------------load_image.py start--------------------------") 61 | # 准备训练数据 62 | all_image = load_image.load_image(craterDir, train_rate) 63 | train_data, train_label, valid_data, valid_label= all_image.gen_train_valid_image() 64 | image_n = all_image.image_n 65 | # 样本的总数量 66 | print ("样本的总数量:") 67 | print (image_n) 68 | # 定义90%作为训练样本 69 | train_n = all_image.train_n 70 | valid_n = all_image.valid_n 71 | # ont-hot 72 | train_label = np_utils.to_categorical(train_label, num_classes) 73 | valid_label = np_utils.to_categorical(valid_label, num_classes) 74 | ##----------------------------------------------------------------------------## 75 | 76 | print ("-----------------------------train.py start--------------------------") 77 | train(train_data,train_label,valid_data,valid_label,train_n,valid_n,IMAGE_HEIGHT,IMAGE_WIDTH,learning_rate,num_classes,epoch,EARLY_STOP_PATIENCE,batch_size,keep_prob, 78 | arch_model, checkpoint_exclude_scopes, checkpoint_path) 79 | -------------------------------------------------------------------------------- /net/alexnet/alexnet.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on 2017 10.17 4 | @author: liupeng 5 | wechat: lp9628 6 | blog: http://blog.csdn.net/u014365862/article/details/78422372 7 | """ 8 | 9 | from __future__ import absolute_import 10 | from __future__ import division 11 | from __future__ import print_function 12 | 13 | import tensorflow as tf 14 | 15 | slim = tf.contrib.slim 16 | trunc_normal = lambda stddev: tf.truncated_normal_initializer(0.0, stddev) 17 | 18 | 19 | def alexnet_v2_arg_scope(weight_decay=0.0005): 20 | with slim.arg_scope([slim.conv2d, slim.fully_connected], 21 | activation_fn=tf.nn.relu, 22 | biases_initializer=tf.constant_initializer(0.1), 23 | weights_regularizer=slim.l2_regularizer(weight_decay)): 24 | with slim.arg_scope([slim.conv2d], padding='SAME'): 25 | with slim.arg_scope([slim.max_pool2d], padding='VALID') 
as arg_sc: 26 | return arg_sc 27 | 28 | 29 | def alexnet_v2(inputs, 30 | num_classes=1000, 31 | is_training=True, 32 | dropout_keep_prob=0.5, 33 | spatial_squeeze=True, 34 | scope='alexnet_v2'): 35 | """AlexNet version 2. 36 | 37 | Described in: http://arxiv.org/pdf/1404.5997v2.pdf 38 | Parameters from: 39 | github.com/akrizhevsky/cuda-convnet2/blob/master/layers/ 40 | layers-imagenet-1gpu.cfg 41 | 42 | Note: All the fully_connected layers have been transformed to conv2d layers. 43 | To use in classification mode, resize input to 224x224. To use in fully 44 | convolutional mode, set spatial_squeeze to false. 45 | The LRN layers have been removed and change the initializers from 46 | random_normal_initializer to xavier_initializer. 47 | 48 | Args: 49 | inputs: a tensor of size [batch_size, height, width, channels]. 50 | num_classes: number of predicted classes. 51 | is_training: whether or not the model is being trained. 52 | dropout_keep_prob: the probability that activations are kept in the dropout 53 | layers during training. 54 | spatial_squeeze: whether or not should squeeze the spatial dimensions of the 55 | outputs. Useful to remove unnecessary dimensions for classification. 56 | scope: Optional scope for the variables. 57 | 58 | Returns: 59 | the last op containing the log predictions and end_points dict. 60 | """ 61 | with tf.variable_scope(scope, 'alexnet_v2', [inputs]) as sc: 62 | end_points_collection = sc.name + '_end_points' 63 | # Collect outputs for conv2d, fully_connected and max_pool2d. 64 | with slim.arg_scope([slim.conv2d, slim.fully_connected, slim.max_pool2d], 65 | outputs_collections=[end_points_collection]): 66 | net = slim.conv2d(inputs, 64, [11, 11], 4, padding='VALID', 67 | scope='conv1') 68 | net = slim.max_pool2d(net, [3, 3], 2, scope='pool1') 69 | net = slim.conv2d(net, 192, [5, 5], scope='conv2') 70 | net = slim.max_pool2d(net, [3, 3], 2, scope='pool2') 71 | net = slim.conv2d(net, 384, [3, 3], scope='conv3') 72 | net = slim.conv2d(net, 384, [3, 3], scope='conv4') 73 | net = slim.conv2d(net, 256, [3, 3], scope='conv5') 74 | net = slim.max_pool2d(net, [3, 3], 2, scope='pool5') 75 | 76 | # Use conv2d instead of fully_connected layers. 77 | with slim.arg_scope([slim.conv2d], 78 | weights_initializer=trunc_normal(0.005), 79 | biases_initializer=tf.constant_initializer(0.1)): 80 | net = slim.conv2d(net, 4096, [5, 5], padding='VALID', 81 | scope='fc6') 82 | net = slim.dropout(net, dropout_keep_prob, is_training=is_training, 83 | scope='dropout6') 84 | net = slim.conv2d(net, 4096, [1, 1], scope='fc7') 85 | net = slim.dropout(net, dropout_keep_prob, is_training=is_training, 86 | scope='dropout7') 87 | net = slim.conv2d(net, num_classes, [1, 1], 88 | activation_fn=None, 89 | normalizer_fn=None, 90 | biases_initializer=tf.zeros_initializer(), 91 | scope='fc8') 92 | 93 | # Convert end_points_collection into a end_point dict. 
94 | end_points = slim.utils.convert_collection_to_dict(end_points_collection) 95 | if spatial_squeeze: 96 | net = tf.squeeze(net, [1, 2], name='fc8/squeezed') 97 | end_points[sc.name + '/fc8'] = net 98 | return net, end_points 99 | alexnet_v2.default_image_size = 224 100 | -------------------------------------------------------------------------------- /net/cifarnet/cifarnet.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on 2017 10.17 4 | @author: liupeng 5 | wechat: lp9628 6 | blog: http://blog.csdn.net/u014365862/article/details/78422372 7 | """ 8 | 9 | from __future__ import absolute_import 10 | from __future__ import division 11 | from __future__ import print_function 12 | 13 | import tensorflow as tf 14 | 15 | slim = tf.contrib.slim 16 | 17 | trunc_normal = lambda stddev: tf.truncated_normal_initializer(stddev=stddev) 18 | 19 | 20 | def cifarnet(images, num_classes=10, is_training=False, 21 | dropout_keep_prob=0.5, 22 | prediction_fn=slim.softmax, 23 | scope='CifarNet'): 24 | """Creates a variant of the CifarNet model. 25 | 26 | Note that since the output is a set of 'logits', the values fall in the 27 | interval of (-infinity, infinity). Consequently, to convert the outputs to a 28 | probability distribution over the characters, one will need to convert them 29 | using the softmax function: 30 | 31 | logits = cifarnet.cifarnet(images, is_training=False) 32 | probabilities = tf.nn.softmax(logits) 33 | predictions = tf.argmax(logits, 1) 34 | 35 | Args: 36 | images: A batch of `Tensors` of size [batch_size, height, width, channels]. 37 | num_classes: the number of classes in the dataset. 38 | is_training: specifies whether or not we're currently training the model. 39 | This variable will determine the behaviour of the dropout layer. 40 | dropout_keep_prob: the percentage of activation values that are retained. 41 | prediction_fn: a function to get predictions out of logits. 42 | scope: Optional variable_scope. 43 | 44 | Returns: 45 | logits: the pre-softmax activations, a tensor of size 46 | [batch_size, `num_classes`] 47 | end_points: a dictionary from components of the network to the corresponding 48 | activation. 
49 | """ 50 | end_points = {} 51 | 52 | with tf.variable_scope(scope, 'CifarNet', [images, num_classes]): 53 | net = slim.conv2d(images, 64, [5, 5], scope='conv1') 54 | end_points['conv1'] = net 55 | net = slim.max_pool2d(net, [2, 2], 2, scope='pool1') 56 | end_points['pool1'] = net 57 | net = tf.nn.lrn(net, 4, bias=1.0, alpha=0.001/9.0, beta=0.75, name='norm1') 58 | net = slim.conv2d(net, 64, [5, 5], scope='conv2') 59 | end_points['conv2'] = net 60 | net = tf.nn.lrn(net, 4, bias=1.0, alpha=0.001/9.0, beta=0.75, name='norm2') 61 | net = slim.max_pool2d(net, [2, 2], 2, scope='pool2') 62 | end_points['pool2'] = net 63 | net = slim.flatten(net) 64 | end_points['Flatten'] = net 65 | net = slim.fully_connected(net, 384, scope='fc3') 66 | end_points['fc3'] = net 67 | net = slim.dropout(net, dropout_keep_prob, is_training=is_training, 68 | scope='dropout3') 69 | net = slim.fully_connected(net, 192, scope='fc4') 70 | end_points['fc4'] = net 71 | logits = slim.fully_connected(net, num_classes, 72 | biases_initializer=tf.zeros_initializer(), 73 | weights_initializer=trunc_normal(1/192.0), 74 | weights_regularizer=None, 75 | activation_fn=None, 76 | scope='logits') 77 | 78 | end_points['Logits'] = logits 79 | end_points['Predictions'] = prediction_fn(logits, scope='Predictions') 80 | 81 | return logits, end_points 82 | cifarnet.default_image_size = 32 83 | 84 | 85 | def cifarnet_arg_scope(weight_decay=0.004): 86 | """Defines the default cifarnet argument scope. 87 | 88 | Args: 89 | weight_decay: The weight decay to use for regularizing the model. 90 | 91 | Returns: 92 | An `arg_scope` to use for the inception v3 model. 93 | """ 94 | with slim.arg_scope( 95 | [slim.conv2d], 96 | weights_initializer=tf.truncated_normal_initializer(stddev=5e-2), 97 | activation_fn=tf.nn.relu): 98 | with slim.arg_scope( 99 | [slim.fully_connected], 100 | biases_initializer=tf.constant_initializer(0.1), 101 | weights_initializer=trunc_normal(0.04), 102 | weights_regularizer=slim.l2_regularizer(weight_decay), 103 | activation_fn=tf.nn.relu) as sc: 104 | return sc 105 | -------------------------------------------------------------------------------- /net/inception_resnet_v2/inception_resnet_v2.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on 2017 10.17 4 | @author: liupeng 5 | """ 6 | 7 | from __future__ import absolute_import 8 | from __future__ import division 9 | from __future__ import print_function 10 | 11 | 12 | import tensorflow as tf 13 | 14 | slim = tf.contrib.slim 15 | 16 | 17 | def block35(net, scale=1.0, activation_fn=tf.nn.relu, scope=None, reuse=None): 18 | """Builds the 35x35 resnet block.""" 19 | with tf.variable_scope(scope, 'Block35', [net], reuse=reuse): 20 | with tf.variable_scope('Branch_0'): 21 | tower_conv = slim.conv2d(net, 32, 1, scope='Conv2d_1x1') 22 | with tf.variable_scope('Branch_1'): 23 | tower_conv1_0 = slim.conv2d(net, 32, 1, scope='Conv2d_0a_1x1') 24 | tower_conv1_1 = slim.conv2d(tower_conv1_0, 32, 3, scope='Conv2d_0b_3x3') 25 | with tf.variable_scope('Branch_2'): 26 | tower_conv2_0 = slim.conv2d(net, 32, 1, scope='Conv2d_0a_1x1') 27 | tower_conv2_1 = slim.conv2d(tower_conv2_0, 48, 3, scope='Conv2d_0b_3x3') 28 | tower_conv2_2 = slim.conv2d(tower_conv2_1, 64, 3, scope='Conv2d_0c_3x3') 29 | mixed = tf.concat(axis=3, values=[tower_conv, tower_conv1_1, tower_conv2_2]) 30 | up = slim.conv2d(mixed, net.get_shape()[3], 1, normalizer_fn=None, 31 | activation_fn=None, scope='Conv2d_1x1') 32 | net += scale * up 33 
| if activation_fn: 34 | net = activation_fn(net) 35 | return net 36 | 37 | 38 | def block17(net, scale=1.0, activation_fn=tf.nn.relu, scope=None, reuse=None): 39 | """Builds the 17x17 resnet block.""" 40 | with tf.variable_scope(scope, 'Block17', [net], reuse=reuse): 41 | with tf.variable_scope('Branch_0'): 42 | tower_conv = slim.conv2d(net, 192, 1, scope='Conv2d_1x1') 43 | with tf.variable_scope('Branch_1'): 44 | tower_conv1_0 = slim.conv2d(net, 128, 1, scope='Conv2d_0a_1x1') 45 | tower_conv1_1 = slim.conv2d(tower_conv1_0, 160, [1, 7], 46 | scope='Conv2d_0b_1x7') 47 | tower_conv1_2 = slim.conv2d(tower_conv1_1, 192, [7, 1], 48 | scope='Conv2d_0c_7x1') 49 | mixed = tf.concat(axis=3, values=[tower_conv, tower_conv1_2]) 50 | up = slim.conv2d(mixed, net.get_shape()[3], 1, normalizer_fn=None, 51 | activation_fn=None, scope='Conv2d_1x1') 52 | net += scale * up 53 | if activation_fn: 54 | net = activation_fn(net) 55 | return net 56 | 57 | 58 | def block8(net, scale=1.0, activation_fn=tf.nn.relu, scope=None, reuse=None): 59 | """Builds the 8x8 resnet block.""" 60 | with tf.variable_scope(scope, 'Block8', [net], reuse=reuse): 61 | with tf.variable_scope('Branch_0'): 62 | tower_conv = slim.conv2d(net, 192, 1, scope='Conv2d_1x1') 63 | with tf.variable_scope('Branch_1'): 64 | tower_conv1_0 = slim.conv2d(net, 192, 1, scope='Conv2d_0a_1x1') 65 | tower_conv1_1 = slim.conv2d(tower_conv1_0, 224, [1, 3], 66 | scope='Conv2d_0b_1x3') 67 | tower_conv1_2 = slim.conv2d(tower_conv1_1, 256, [3, 1], 68 | scope='Conv2d_0c_3x1') 69 | mixed = tf.concat(axis=3, values=[tower_conv, tower_conv1_2]) 70 | up = slim.conv2d(mixed, net.get_shape()[3], 1, normalizer_fn=None, 71 | activation_fn=None, scope='Conv2d_1x1') 72 | net += scale * up 73 | if activation_fn: 74 | net = activation_fn(net) 75 | return net 76 | 77 | 78 | def inception_resnet_v2_base(inputs, 79 | final_endpoint='Conv2d_7b_1x1', 80 | output_stride=16, 81 | align_feature_maps=False, 82 | scope=None): 83 | """Inception model from http://arxiv.org/abs/1602.07261. 84 | 85 | Constructs an Inception Resnet v2 network from inputs to the given final 86 | endpoint. This method can construct the network up to the final inception 87 | block Conv2d_7b_1x1. 88 | 89 | Args: 90 | inputs: a tensor of size [batch_size, height, width, channels]. 91 | final_endpoint: specifies the endpoint to construct the network up to. It 92 | can be one of ['Conv2d_1a_3x3', 'Conv2d_2a_3x3', 'Conv2d_2b_3x3', 93 | 'MaxPool_3a_3x3', 'Conv2d_3b_1x1', 'Conv2d_4a_3x3', 'MaxPool_5a_3x3', 94 | 'Mixed_5b', 'Mixed_6a', 'PreAuxLogits', 'Mixed_7a', 'Conv2d_7b_1x1'] 95 | output_stride: A scalar that specifies the requested ratio of input to 96 | output spatial resolution. Only supports 8 and 16. 97 | align_feature_maps: When true, changes all the VALID paddings in the network 98 | to SAME padding so that the feature maps are aligned. 99 | scope: Optional variable_scope. 100 | 101 | Returns: 102 | tensor_out: output tensor corresponding to the final_endpoint. 103 | end_points: a set of activations for external use, for example summaries or 104 | losses. 105 | 106 | Raises: 107 | ValueError: if final_endpoint is not set to one of the predefined values, 108 | or if the output_stride is not 8 or 16, or if the output_stride is 8 and 109 | we request an end point after 'PreAuxLogits'. 
110 | """ 111 | if output_stride != 8 and output_stride != 16: 112 | raise ValueError('output_stride must be 8 or 16.') 113 | 114 | padding = 'SAME' if align_feature_maps else 'VALID' 115 | 116 | end_points = {} 117 | 118 | def add_and_check_final(name, net): 119 | end_points[name] = net 120 | return name == final_endpoint 121 | 122 | with tf.variable_scope(scope, 'InceptionResnetV2', [inputs]): 123 | with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], 124 | stride=1, padding='SAME'): 125 | # 149 x 149 x 32 126 | net = slim.conv2d(inputs, 32, 3, stride=2, padding=padding, 127 | scope='Conv2d_1a_3x3') 128 | if add_and_check_final('Conv2d_1a_3x3', net): return net, end_points 129 | 130 | # 147 x 147 x 32 131 | net = slim.conv2d(net, 32, 3, padding=padding, 132 | scope='Conv2d_2a_3x3') 133 | if add_and_check_final('Conv2d_2a_3x3', net): return net, end_points 134 | # 147 x 147 x 64 135 | net = slim.conv2d(net, 64, 3, scope='Conv2d_2b_3x3') 136 | if add_and_check_final('Conv2d_2b_3x3', net): return net, end_points 137 | # 73 x 73 x 64 138 | net = slim.max_pool2d(net, 3, stride=2, padding=padding, 139 | scope='MaxPool_3a_3x3') 140 | if add_and_check_final('MaxPool_3a_3x3', net): return net, end_points 141 | # 73 x 73 x 80 142 | net = slim.conv2d(net, 80, 1, padding=padding, 143 | scope='Conv2d_3b_1x1') 144 | if add_and_check_final('Conv2d_3b_1x1', net): return net, end_points 145 | # 71 x 71 x 192 146 | net = slim.conv2d(net, 192, 3, padding=padding, 147 | scope='Conv2d_4a_3x3') 148 | if add_and_check_final('Conv2d_4a_3x3', net): return net, end_points 149 | # 35 x 35 x 192 150 | net = slim.max_pool2d(net, 3, stride=2, padding=padding, 151 | scope='MaxPool_5a_3x3') 152 | if add_and_check_final('MaxPool_5a_3x3', net): return net, end_points 153 | 154 | # 35 x 35 x 320 155 | with tf.variable_scope('Mixed_5b'): 156 | with tf.variable_scope('Branch_0'): 157 | tower_conv = slim.conv2d(net, 96, 1, scope='Conv2d_1x1') 158 | with tf.variable_scope('Branch_1'): 159 | tower_conv1_0 = slim.conv2d(net, 48, 1, scope='Conv2d_0a_1x1') 160 | tower_conv1_1 = slim.conv2d(tower_conv1_0, 64, 5, 161 | scope='Conv2d_0b_5x5') 162 | with tf.variable_scope('Branch_2'): 163 | tower_conv2_0 = slim.conv2d(net, 64, 1, scope='Conv2d_0a_1x1') 164 | tower_conv2_1 = slim.conv2d(tower_conv2_0, 96, 3, 165 | scope='Conv2d_0b_3x3') 166 | tower_conv2_2 = slim.conv2d(tower_conv2_1, 96, 3, 167 | scope='Conv2d_0c_3x3') 168 | with tf.variable_scope('Branch_3'): 169 | tower_pool = slim.avg_pool2d(net, 3, stride=1, padding='SAME', 170 | scope='AvgPool_0a_3x3') 171 | tower_pool_1 = slim.conv2d(tower_pool, 64, 1, 172 | scope='Conv2d_0b_1x1') 173 | net = tf.concat( 174 | [tower_conv, tower_conv1_1, tower_conv2_2, tower_pool_1], 3) 175 | 176 | if add_and_check_final('Mixed_5b', net): return net, end_points 177 | # TODO(alemi): Register intermediate endpoints 178 | net = slim.repeat(net, 10, block35, scale=0.17) 179 | 180 | # 17 x 17 x 1088 if output_stride == 8, 181 | # 33 x 33 x 1088 if output_stride == 16 182 | use_atrous = output_stride == 8 183 | 184 | with tf.variable_scope('Mixed_6a'): 185 | with tf.variable_scope('Branch_0'): 186 | tower_conv = slim.conv2d(net, 384, 3, stride=1 if use_atrous else 2, 187 | padding=padding, 188 | scope='Conv2d_1a_3x3') 189 | with tf.variable_scope('Branch_1'): 190 | tower_conv1_0 = slim.conv2d(net, 256, 1, scope='Conv2d_0a_1x1') 191 | tower_conv1_1 = slim.conv2d(tower_conv1_0, 256, 3, 192 | scope='Conv2d_0b_3x3') 193 | tower_conv1_2 = slim.conv2d(tower_conv1_1, 384, 3, 194 | stride=1 if 
use_atrous else 2, 195 | padding=padding, 196 | scope='Conv2d_1a_3x3') 197 | with tf.variable_scope('Branch_2'): 198 | tower_pool = slim.max_pool2d(net, 3, stride=1 if use_atrous else 2, 199 | padding=padding, 200 | scope='MaxPool_1a_3x3') 201 | net = tf.concat([tower_conv, tower_conv1_2, tower_pool], 3) 202 | 203 | if add_and_check_final('Mixed_6a', net): return net, end_points 204 | 205 | # TODO(alemi): register intermediate endpoints 206 | with slim.arg_scope([slim.conv2d], rate=2 if use_atrous else 1): 207 | net = slim.repeat(net, 20, block17, scale=0.10) 208 | if add_and_check_final('PreAuxLogits', net): return net, end_points 209 | 210 | if output_stride == 8: 211 | # TODO(gpapan): Properly support output_stride for the rest of the net. 212 | raise ValueError('output_stride==8 is only supported up to the ' 213 | 'PreAuxlogits end_point for now.') 214 | 215 | # 8 x 8 x 2080 216 | with tf.variable_scope('Mixed_7a'): 217 | with tf.variable_scope('Branch_0'): 218 | tower_conv = slim.conv2d(net, 256, 1, scope='Conv2d_0a_1x1') 219 | tower_conv_1 = slim.conv2d(tower_conv, 384, 3, stride=2, 220 | padding=padding, 221 | scope='Conv2d_1a_3x3') 222 | with tf.variable_scope('Branch_1'): 223 | tower_conv1 = slim.conv2d(net, 256, 1, scope='Conv2d_0a_1x1') 224 | tower_conv1_1 = slim.conv2d(tower_conv1, 288, 3, stride=2, 225 | padding=padding, 226 | scope='Conv2d_1a_3x3') 227 | with tf.variable_scope('Branch_2'): 228 | tower_conv2 = slim.conv2d(net, 256, 1, scope='Conv2d_0a_1x1') 229 | tower_conv2_1 = slim.conv2d(tower_conv2, 288, 3, 230 | scope='Conv2d_0b_3x3') 231 | tower_conv2_2 = slim.conv2d(tower_conv2_1, 320, 3, stride=2, 232 | padding=padding, 233 | scope='Conv2d_1a_3x3') 234 | with tf.variable_scope('Branch_3'): 235 | tower_pool = slim.max_pool2d(net, 3, stride=2, 236 | padding=padding, 237 | scope='MaxPool_1a_3x3') 238 | net = tf.concat( 239 | [tower_conv_1, tower_conv1_1, tower_conv2_2, tower_pool], 3) 240 | 241 | if add_and_check_final('Mixed_7a', net): return net, end_points 242 | 243 | # TODO(alemi): register intermediate endpoints 244 | net = slim.repeat(net, 9, block8, scale=0.20) 245 | net = block8(net, activation_fn=None) 246 | 247 | # 8 x 8 x 1536 248 | net = slim.conv2d(net, 1536, 1, scope='Conv2d_7b_1x1') 249 | if add_and_check_final('Conv2d_7b_1x1', net): return net, end_points 250 | 251 | raise ValueError('final_endpoint (%s) not recognized', final_endpoint) 252 | 253 | 254 | def inception_resnet_v2(inputs, num_classes=1001, is_training=True, 255 | dropout_keep_prob=0.8, 256 | reuse=None, 257 | scope='InceptionResnetV2', 258 | create_aux_logits=True): 259 | """Creates the Inception Resnet V2 model. 260 | 261 | Args: 262 | inputs: a 4-D tensor of size [batch_size, height, width, 3]. 263 | num_classes: number of predicted classes. 264 | is_training: whether is training or not. 265 | dropout_keep_prob: float, the fraction to keep before final layer. 266 | reuse: whether or not the network and its variables should be reused. To be 267 | able to reuse 'scope' must be given. 268 | scope: Optional variable_scope. 269 | create_aux_logits: Whether to include the auxilliary logits. 270 | 271 | Returns: 272 | logits: the logits outputs of the model. 273 | end_points: the set of end_points from the inception model. 
274 | """ 275 | end_points = {} 276 | 277 | with tf.variable_scope(scope, 'InceptionResnetV2', [inputs, num_classes], 278 | reuse=reuse) as scope: 279 | with slim.arg_scope([slim.batch_norm, slim.dropout], 280 | is_training=is_training): 281 | 282 | net, end_points = inception_resnet_v2_base(inputs, scope=scope) 283 | 284 | if create_aux_logits: 285 | with tf.variable_scope('AuxLogits'): 286 | aux = end_points['PreAuxLogits'] 287 | aux = slim.avg_pool2d(aux, 5, stride=3, padding='VALID', 288 | scope='Conv2d_1a_3x3') 289 | aux = slim.conv2d(aux, 128, 1, scope='Conv2d_1b_1x1') 290 | aux = slim.conv2d(aux, 768, aux.get_shape()[1:3], 291 | padding='VALID', scope='Conv2d_2a_5x5') 292 | aux = slim.flatten(aux) 293 | aux = slim.fully_connected(aux, num_classes, activation_fn=None, 294 | scope='Logits') 295 | end_points['AuxLogits'] = aux 296 | 297 | with tf.variable_scope('Logits'): 298 | net = slim.avg_pool2d(net, net.get_shape()[1:3], padding='VALID', 299 | scope='AvgPool_1a_8x8') 300 | net = slim.flatten(net) 301 | 302 | net = slim.dropout(net, dropout_keep_prob, is_training=is_training, 303 | scope='Dropout') 304 | 305 | end_points['PreLogitsFlatten'] = net 306 | logits = slim.fully_connected(net, num_classes, activation_fn=None, 307 | scope='Logits') 308 | end_points['Logits'] = logits 309 | end_points['Predictions'] = tf.nn.softmax(logits, name='Predictions') 310 | 311 | return logits, end_points 312 | inception_resnet_v2.default_image_size = 299 313 | 314 | 315 | def inception_resnet_v2_arg_scope(weight_decay=0.00004, 316 | batch_norm_decay=0.9997, 317 | batch_norm_epsilon=0.001): 318 | """Returns the scope with the default parameters for inception_resnet_v2. 319 | 320 | Args: 321 | weight_decay: the weight decay for weights variables. 322 | batch_norm_decay: decay for the moving average of batch_norm momentums. 323 | batch_norm_epsilon: small float added to variance to avoid dividing by zero. 324 | 325 | Returns: 326 | a arg_scope with the parameters needed for inception_resnet_v2. 327 | """ 328 | # Set weight_decay for weights in conv2d and fully_connected layers. 329 | with slim.arg_scope([slim.conv2d, slim.fully_connected], 330 | weights_regularizer=slim.l2_regularizer(weight_decay), 331 | biases_regularizer=slim.l2_regularizer(weight_decay)): 332 | 333 | batch_norm_params = { 334 | 'decay': batch_norm_decay, 335 | 'epsilon': batch_norm_epsilon, 336 | } 337 | # Set activation_fn and parameters for batch_norm. 
338 | with slim.arg_scope([slim.conv2d], activation_fn=tf.nn.relu, 339 | normalizer_fn=slim.batch_norm, 340 | normalizer_params=batch_norm_params) as scope: 341 | return scope 342 | -------------------------------------------------------------------------------- /net/inception_v4/__pycache__/inception_utils.cpython-35.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/net/inception_v4/__pycache__/inception_utils.cpython-35.pyc -------------------------------------------------------------------------------- /net/inception_v4/__pycache__/inception_v4.cpython-35.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/net/inception_v4/__pycache__/inception_v4.cpython-35.pyc -------------------------------------------------------------------------------- /net/inception_v4/inception_utils.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on 2017 10.17 4 | @author: liupeng 5 | wechat: lp9628 6 | blog: http://blog.csdn.net/u014365862/article/details/78422372 7 | """ 8 | 9 | from __future__ import absolute_import 10 | from __future__ import division 11 | from __future__ import print_function 12 | 13 | import tensorflow as tf 14 | 15 | slim = tf.contrib.slim 16 | 17 | 18 | def inception_arg_scope(weight_decay=0.00004, 19 | use_batch_norm=True, 20 | batch_norm_decay=0.9997, 21 | batch_norm_epsilon=0.001): 22 | """Defines the default arg scope for inception models. 23 | Args: 24 | weight_decay: The weight decay to use for regularizing the model. 25 | use_batch_norm: "If `True`, batch_norm is applied after each convolution. 26 | batch_norm_decay: Decay for batch norm moving average. 27 | batch_norm_epsilon: Small float added to variance to avoid dividing by zero 28 | in batch norm. 29 | Returns: 30 | An `arg_scope` to use for the inception models. 31 | """ 32 | batch_norm_params = { 33 | # Decay for the moving averages. 34 | 'decay': batch_norm_decay, 35 | # epsilon to prevent 0s in variance. 36 | 'epsilon': batch_norm_epsilon, 37 | # collection containing update_ops. 38 | 'updates_collections': tf.GraphKeys.UPDATE_OPS, 39 | } 40 | if use_batch_norm: 41 | normalizer_fn = slim.batch_norm 42 | normalizer_params = batch_norm_params 43 | else: 44 | normalizer_fn = None 45 | normalizer_params = {} 46 | # Set weight_decay for weights in Conv and FC layers. 
47 | with slim.arg_scope([slim.conv2d, slim.fully_connected], 48 | weights_regularizer=slim.l2_regularizer(weight_decay)): 49 | with slim.arg_scope( 50 | [slim.conv2d], 51 | weights_initializer=slim.variance_scaling_initializer(), 52 | activation_fn=tf.nn.relu, 53 | normalizer_fn=normalizer_fn, 54 | normalizer_params=normalizer_params) as sc: 55 | return sc 56 | -------------------------------------------------------------------------------- /net/inception_v4/inception_v4.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on 2017 10.17 4 | @author: liupeng 5 | wechat: lp9628 6 | blog: http://blog.csdn.net/u014365862/article/details/78422372 7 | """ 8 | 9 | from __future__ import absolute_import 10 | from __future__ import division 11 | from __future__ import print_function 12 | 13 | import tensorflow as tf 14 | try: 15 | import inception_utils 16 | except: 17 | from net.inception_v4 import inception_utils 18 | 19 | slim = tf.contrib.slim 20 | 21 | 22 | def block_inception_a(inputs, scope=None, reuse=None): 23 | """Builds Inception-A block for Inception v4 network.""" 24 | # By default use stride=1 and SAME padding 25 | with slim.arg_scope([slim.conv2d, slim.avg_pool2d, slim.max_pool2d], 26 | stride=1, padding='SAME'): 27 | with tf.variable_scope(scope, 'BlockInceptionA', [inputs], reuse=reuse): 28 | with tf.variable_scope('Branch_0'): 29 | branch_0 = slim.conv2d(inputs, 96, [1, 1], scope='Conv2d_0a_1x1') 30 | with tf.variable_scope('Branch_1'): 31 | branch_1 = slim.conv2d(inputs, 64, [1, 1], scope='Conv2d_0a_1x1') 32 | branch_1 = slim.conv2d(branch_1, 96, [3, 3], scope='Conv2d_0b_3x3') 33 | with tf.variable_scope('Branch_2'): 34 | branch_2 = slim.conv2d(inputs, 64, [1, 1], scope='Conv2d_0a_1x1') 35 | branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3') 36 | branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0c_3x3') 37 | with tf.variable_scope('Branch_3'): 38 | branch_3 = slim.avg_pool2d(inputs, [3, 3], scope='AvgPool_0a_3x3') 39 | branch_3 = slim.conv2d(branch_3, 96, [1, 1], scope='Conv2d_0b_1x1') 40 | return tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3]) 41 | 42 | 43 | def block_reduction_a(inputs, scope=None, reuse=None): 44 | """Builds Reduction-A block for Inception v4 network.""" 45 | # By default use stride=1 and SAME padding 46 | with slim.arg_scope([slim.conv2d, slim.avg_pool2d, slim.max_pool2d], 47 | stride=1, padding='SAME'): 48 | with tf.variable_scope(scope, 'BlockReductionA', [inputs], reuse=reuse): 49 | with tf.variable_scope('Branch_0'): 50 | branch_0 = slim.conv2d(inputs, 384, [3, 3], stride=2, padding='VALID', 51 | scope='Conv2d_1a_3x3') 52 | with tf.variable_scope('Branch_1'): 53 | branch_1 = slim.conv2d(inputs, 192, [1, 1], scope='Conv2d_0a_1x1') 54 | branch_1 = slim.conv2d(branch_1, 224, [3, 3], scope='Conv2d_0b_3x3') 55 | branch_1 = slim.conv2d(branch_1, 256, [3, 3], stride=2, 56 | padding='VALID', scope='Conv2d_1a_3x3') 57 | with tf.variable_scope('Branch_2'): 58 | branch_2 = slim.max_pool2d(inputs, [3, 3], stride=2, padding='VALID', 59 | scope='MaxPool_1a_3x3') 60 | return tf.concat(axis=3, values=[branch_0, branch_1, branch_2]) 61 | 62 | 63 | def block_inception_b(inputs, scope=None, reuse=None): 64 | """Builds Inception-B block for Inception v4 network.""" 65 | # By default use stride=1 and SAME padding 66 | with slim.arg_scope([slim.conv2d, slim.avg_pool2d, slim.max_pool2d], 67 | stride=1, padding='SAME'): 68 | with tf.variable_scope(scope, 
'BlockInceptionB', [inputs], reuse=reuse): 69 | with tf.variable_scope('Branch_0'): 70 | branch_0 = slim.conv2d(inputs, 384, [1, 1], scope='Conv2d_0a_1x1') 71 | with tf.variable_scope('Branch_1'): 72 | branch_1 = slim.conv2d(inputs, 192, [1, 1], scope='Conv2d_0a_1x1') 73 | branch_1 = slim.conv2d(branch_1, 224, [1, 7], scope='Conv2d_0b_1x7') 74 | branch_1 = slim.conv2d(branch_1, 256, [7, 1], scope='Conv2d_0c_7x1') 75 | with tf.variable_scope('Branch_2'): 76 | branch_2 = slim.conv2d(inputs, 192, [1, 1], scope='Conv2d_0a_1x1') 77 | branch_2 = slim.conv2d(branch_2, 192, [7, 1], scope='Conv2d_0b_7x1') 78 | branch_2 = slim.conv2d(branch_2, 224, [1, 7], scope='Conv2d_0c_1x7') 79 | branch_2 = slim.conv2d(branch_2, 224, [7, 1], scope='Conv2d_0d_7x1') 80 | branch_2 = slim.conv2d(branch_2, 256, [1, 7], scope='Conv2d_0e_1x7') 81 | with tf.variable_scope('Branch_3'): 82 | branch_3 = slim.avg_pool2d(inputs, [3, 3], scope='AvgPool_0a_3x3') 83 | branch_3 = slim.conv2d(branch_3, 128, [1, 1], scope='Conv2d_0b_1x1') 84 | return tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3]) 85 | 86 | 87 | def block_reduction_b(inputs, scope=None, reuse=None): 88 | """Builds Reduction-B block for Inception v4 network.""" 89 | # By default use stride=1 and SAME padding 90 | with slim.arg_scope([slim.conv2d, slim.avg_pool2d, slim.max_pool2d], 91 | stride=1, padding='SAME'): 92 | with tf.variable_scope(scope, 'BlockReductionB', [inputs], reuse=reuse): 93 | with tf.variable_scope('Branch_0'): 94 | branch_0 = slim.conv2d(inputs, 192, [1, 1], scope='Conv2d_0a_1x1') 95 | branch_0 = slim.conv2d(branch_0, 192, [3, 3], stride=2, 96 | padding='VALID', scope='Conv2d_1a_3x3') 97 | with tf.variable_scope('Branch_1'): 98 | branch_1 = slim.conv2d(inputs, 256, [1, 1], scope='Conv2d_0a_1x1') 99 | branch_1 = slim.conv2d(branch_1, 256, [1, 7], scope='Conv2d_0b_1x7') 100 | branch_1 = slim.conv2d(branch_1, 320, [7, 1], scope='Conv2d_0c_7x1') 101 | branch_1 = slim.conv2d(branch_1, 320, [3, 3], stride=2, 102 | padding='VALID', scope='Conv2d_1a_3x3') 103 | with tf.variable_scope('Branch_2'): 104 | branch_2 = slim.max_pool2d(inputs, [3, 3], stride=2, padding='VALID', 105 | scope='MaxPool_1a_3x3') 106 | return tf.concat(axis=3, values=[branch_0, branch_1, branch_2]) 107 | 108 | 109 | def block_inception_c(inputs, scope=None, reuse=None): 110 | """Builds Inception-C block for Inception v4 network.""" 111 | # By default use stride=1 and SAME padding 112 | with slim.arg_scope([slim.conv2d, slim.avg_pool2d, slim.max_pool2d], 113 | stride=1, padding='SAME'): 114 | with tf.variable_scope(scope, 'BlockInceptionC', [inputs], reuse=reuse): 115 | with tf.variable_scope('Branch_0'): 116 | branch_0 = slim.conv2d(inputs, 256, [1, 1], scope='Conv2d_0a_1x1') 117 | with tf.variable_scope('Branch_1'): 118 | branch_1 = slim.conv2d(inputs, 384, [1, 1], scope='Conv2d_0a_1x1') 119 | branch_1 = tf.concat(axis=3, values=[ 120 | slim.conv2d(branch_1, 256, [1, 3], scope='Conv2d_0b_1x3'), 121 | slim.conv2d(branch_1, 256, [3, 1], scope='Conv2d_0c_3x1')]) 122 | with tf.variable_scope('Branch_2'): 123 | branch_2 = slim.conv2d(inputs, 384, [1, 1], scope='Conv2d_0a_1x1') 124 | branch_2 = slim.conv2d(branch_2, 448, [3, 1], scope='Conv2d_0b_3x1') 125 | branch_2 = slim.conv2d(branch_2, 512, [1, 3], scope='Conv2d_0c_1x3') 126 | branch_2 = tf.concat(axis=3, values=[ 127 | slim.conv2d(branch_2, 256, [1, 3], scope='Conv2d_0d_1x3'), 128 | slim.conv2d(branch_2, 256, [3, 1], scope='Conv2d_0e_3x1')]) 129 | with tf.variable_scope('Branch_3'): 130 | branch_3 = 
slim.avg_pool2d(inputs, [3, 3], scope='AvgPool_0a_3x3') 131 | branch_3 = slim.conv2d(branch_3, 256, [1, 1], scope='Conv2d_0b_1x1') 132 | return tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3]) 133 | 134 | 135 | def inception_v4_base(inputs, final_endpoint='Mixed_7d', scope=None): 136 | """Creates the Inception V4 network up to the given final endpoint. 137 | Args: 138 | inputs: a 4-D tensor of size [batch_size, height, width, 3]. 139 | final_endpoint: specifies the endpoint to construct the network up to. 140 | It can be one of [ 'Conv2d_1a_3x3', 'Conv2d_2a_3x3', 'Conv2d_2b_3x3', 141 | 'Mixed_3a', 'Mixed_4a', 'Mixed_5a', 'Mixed_5b', 'Mixed_5c', 'Mixed_5d', 142 | 'Mixed_5e', 'Mixed_6a', 'Mixed_6b', 'Mixed_6c', 'Mixed_6d', 'Mixed_6e', 143 | 'Mixed_6f', 'Mixed_6g', 'Mixed_6h', 'Mixed_7a', 'Mixed_7b', 'Mixed_7c', 144 | 'Mixed_7d'] 145 | scope: Optional variable_scope. 146 | Returns: 147 | logits: the logits outputs of the model. 148 | end_points: the set of end_points from the inception model. 149 | Raises: 150 | ValueError: if final_endpoint is not set to one of the predefined values, 151 | """ 152 | end_points = {} 153 | 154 | def add_and_check_final(name, net): 155 | end_points[name] = net 156 | return name == final_endpoint 157 | 158 | with tf.variable_scope(scope, 'InceptionV4', [inputs]): 159 | with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], 160 | stride=1, padding='SAME'): 161 | # 299 x 299 x 3 162 | net = slim.conv2d(inputs, 32, [3, 3], stride=2, 163 | padding='VALID', scope='Conv2d_1a_3x3') 164 | if add_and_check_final('Conv2d_1a_3x3', net): return net, end_points 165 | # 149 x 149 x 32 166 | net = slim.conv2d(net, 32, [3, 3], padding='VALID', 167 | scope='Conv2d_2a_3x3') 168 | if add_and_check_final('Conv2d_2a_3x3', net): return net, end_points 169 | # 147 x 147 x 32 170 | net = slim.conv2d(net, 64, [3, 3], scope='Conv2d_2b_3x3') 171 | if add_and_check_final('Conv2d_2b_3x3', net): return net, end_points 172 | # 147 x 147 x 64 173 | with tf.variable_scope('Mixed_3a'): 174 | with tf.variable_scope('Branch_0'): 175 | branch_0 = slim.max_pool2d(net, [3, 3], stride=2, padding='VALID', 176 | scope='MaxPool_0a_3x3') 177 | with tf.variable_scope('Branch_1'): 178 | branch_1 = slim.conv2d(net, 96, [3, 3], stride=2, padding='VALID', 179 | scope='Conv2d_0a_3x3') 180 | net = tf.concat(axis=3, values=[branch_0, branch_1]) 181 | if add_and_check_final('Mixed_3a', net): return net, end_points 182 | 183 | # 73 x 73 x 160 184 | with tf.variable_scope('Mixed_4a'): 185 | with tf.variable_scope('Branch_0'): 186 | branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1') 187 | branch_0 = slim.conv2d(branch_0, 96, [3, 3], padding='VALID', 188 | scope='Conv2d_1a_3x3') 189 | with tf.variable_scope('Branch_1'): 190 | branch_1 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1') 191 | branch_1 = slim.conv2d(branch_1, 64, [1, 7], scope='Conv2d_0b_1x7') 192 | branch_1 = slim.conv2d(branch_1, 64, [7, 1], scope='Conv2d_0c_7x1') 193 | branch_1 = slim.conv2d(branch_1, 96, [3, 3], padding='VALID', 194 | scope='Conv2d_1a_3x3') 195 | net = tf.concat(axis=3, values=[branch_0, branch_1]) 196 | if add_and_check_final('Mixed_4a', net): return net, end_points 197 | 198 | # 71 x 71 x 192 199 | with tf.variable_scope('Mixed_5a'): 200 | with tf.variable_scope('Branch_0'): 201 | branch_0 = slim.conv2d(net, 192, [3, 3], stride=2, padding='VALID', 202 | scope='Conv2d_1a_3x3') 203 | with tf.variable_scope('Branch_1'): 204 | branch_1 = slim.max_pool2d(net, [3, 3], stride=2, 
padding='VALID', 205 | scope='MaxPool_1a_3x3') 206 | net = tf.concat(axis=3, values=[branch_0, branch_1]) 207 | if add_and_check_final('Mixed_5a', net): return net, end_points 208 | 209 | # 35 x 35 x 384 210 | # 4 x Inception-A blocks 211 | for idx in range(4): 212 | block_scope = 'Mixed_5' + chr(ord('b') + idx) 213 | net = block_inception_a(net, block_scope) 214 | if add_and_check_final(block_scope, net): return net, end_points 215 | 216 | # 35 x 35 x 384 217 | # Reduction-A block 218 | net = block_reduction_a(net, 'Mixed_6a') 219 | if add_and_check_final('Mixed_6a', net): return net, end_points 220 | 221 | # 17 x 17 x 1024 222 | # 7 x Inception-B blocks 223 | for idx in range(7): 224 | block_scope = 'Mixed_6' + chr(ord('b') + idx) 225 | net = block_inception_b(net, block_scope) 226 | if add_and_check_final(block_scope, net): return net, end_points 227 | 228 | # 17 x 17 x 1024 229 | # Reduction-B block 230 | net = block_reduction_b(net, 'Mixed_7a') 231 | if add_and_check_final('Mixed_7a', net): return net, end_points 232 | 233 | # 8 x 8 x 1536 234 | # 3 x Inception-C blocks 235 | for idx in range(3): 236 | block_scope = 'Mixed_7' + chr(ord('b') + idx) 237 | net = block_inception_c(net, block_scope) 238 | if add_and_check_final(block_scope, net): return net, end_points 239 | raise ValueError('Unknown final endpoint %s' % final_endpoint) 240 | 241 | 242 | # num_classes=1001 243 | def inception_v4(inputs, num_classes=None, is_training=True, 244 | dropout_keep_prob=0.8, 245 | reuse=None, 246 | scope='InceptionV4', 247 | create_aux_logits=True): 248 | """Creates the Inception V4 model. 249 | Args: 250 | inputs: a 4-D tensor of size [batch_size, height, width, 3]. 251 | num_classes: number of predicted classes. 252 | is_training: whether is training or not. 253 | dropout_keep_prob: float, the fraction to keep before final layer. 254 | reuse: whether or not the network and its variables should be reused. To be 255 | able to reuse 'scope' must be given. 256 | scope: Optional variable_scope. 257 | create_aux_logits: Whether to include the auxiliary logits. 258 | Returns: 259 | logits: the logits outputs of the model. 260 | end_points: the set of end_points from the inception model. 
261 | """ 262 | end_points = {} 263 | with tf.variable_scope(scope, 'InceptionV4', [inputs], reuse=reuse) as scope: 264 | with slim.arg_scope([slim.batch_norm, slim.dropout], 265 | is_training=is_training): 266 | net, end_points = inception_v4_base(inputs, scope=scope) 267 | if num_classes is not None: 268 | with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], 269 | stride=1, padding='SAME'): 270 | # Auxiliary Head logits 271 | if create_aux_logits: 272 | with tf.variable_scope('AuxLogits'): 273 | # 17 x 17 x 1024 274 | aux_logits = end_points['Mixed_6h'] 275 | aux_logits = slim.avg_pool2d(aux_logits, [5, 5], stride=3, 276 | padding='VALID', 277 | scope='AvgPool_1a_5x5') 278 | aux_logits = slim.conv2d(aux_logits, 128, [1, 1], 279 | scope='Conv2d_1b_1x1') 280 | aux_logits = slim.conv2d(aux_logits, 768, 281 | aux_logits.get_shape()[1:3], 282 | padding='VALID', scope='Conv2d_2a') 283 | aux_logits = slim.flatten(aux_logits) 284 | aux_logits = slim.fully_connected(aux_logits, num_classes, 285 | activation_fn=None, 286 | scope='Aux_logits') 287 | end_points['AuxLogits'] = aux_logits 288 | 289 | # Final pooling and prediction 290 | with tf.variable_scope('Logits'): 291 | # 8 x 8 x 1536 292 | net = slim.avg_pool2d(net, net.get_shape()[1:3], padding='VALID', 293 | scope='AvgPool_1a') 294 | # 1 x 1 x 1536 295 | net = slim.dropout(net, dropout_keep_prob, scope='Dropout_1b') 296 | net = slim.flatten(net, scope='PreLogitsFlatten') 297 | end_points['PreLogitsFlatten'] = net 298 | # 1536 299 | logits = slim.fully_connected(net, num_classes, activation_fn=None, 300 | scope='Logits') 301 | end_points['Logits'] = logits 302 | end_points['Predictions'] = tf.nn.softmax(logits, name='Predictions') 303 | else: 304 | logits = net 305 | end_points = end_points 306 | return logits, end_points 307 | inception_v4.default_image_size = 299 308 | 309 | 310 | inception_v4_arg_scope = inception_utils.inception_arg_scope 311 | -------------------------------------------------------------------------------- /net/resnet_v2/__pycache__/resnet_utils.cpython-35.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/net/resnet_v2/__pycache__/resnet_utils.cpython-35.pyc -------------------------------------------------------------------------------- /net/resnet_v2/__pycache__/resnet_v2.cpython-35.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/net/resnet_v2/__pycache__/resnet_v2.cpython-35.pyc -------------------------------------------------------------------------------- /net/resnet_v2/resnet_utils.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on 2017 10.17 4 | @author: liupeng 5 | wechat: lp9628 6 | blog: http://blog.csdn.net/u014365862/article/details/78422372 7 | """ 8 | 9 | from __future__ import absolute_import 10 | from __future__ import division 11 | from __future__ import print_function 12 | 13 | import collections 14 | import tensorflow as tf 15 | 16 | slim = tf.contrib.slim 17 | 18 | 19 | class Block(collections.namedtuple('Block', ['scope', 'unit_fn', 'args'])): 20 | """A named tuple describing a ResNet block. 21 | 22 | Its parts are: 23 | scope: The scope of the `Block`. 
24 | unit_fn: The ResNet unit function which takes as input a `Tensor` and 25 | returns another `Tensor` with the output of the ResNet unit. 26 | args: A list of length equal to the number of units in the `Block`. The list 27 | contains one (depth, depth_bottleneck, stride) tuple for each unit in the 28 | block to serve as argument to unit_fn. 29 | """ 30 | 31 | 32 | def subsample(inputs, factor, scope=None): 33 | """Subsamples the input along the spatial dimensions. 34 | 35 | Args: 36 | inputs: A `Tensor` of size [batch, height_in, width_in, channels]. 37 | factor: The subsampling factor. 38 | scope: Optional variable_scope. 39 | 40 | Returns: 41 | output: A `Tensor` of size [batch, height_out, width_out, channels] with the 42 | input, either intact (if factor == 1) or subsampled (if factor > 1). 43 | """ 44 | if factor == 1: 45 | return inputs 46 | else: 47 | return slim.max_pool2d(inputs, [1, 1], stride=factor, scope=scope) 48 | 49 | 50 | def conv2d_same(inputs, num_outputs, kernel_size, stride, rate=1, scope=None): 51 | """Strided 2-D convolution with 'SAME' padding. 52 | 53 | When stride > 1, then we do explicit zero-padding, followed by conv2d with 54 | 'VALID' padding. 55 | 56 | Note that 57 | 58 | net = conv2d_same(inputs, num_outputs, 3, stride=stride) 59 | 60 | is equivalent to 61 | 62 | net = slim.conv2d(inputs, num_outputs, 3, stride=1, padding='SAME') 63 | net = subsample(net, factor=stride) 64 | 65 | whereas 66 | 67 | net = slim.conv2d(inputs, num_outputs, 3, stride=stride, padding='SAME') 68 | 69 | is different when the input's height or width is even, which is why we add the 70 | current function. For more details, see ResnetUtilsTest.testConv2DSameEven(). 71 | 72 | Args: 73 | inputs: A 4-D tensor of size [batch, height_in, width_in, channels]. 74 | num_outputs: An integer, the number of output filters. 75 | kernel_size: An int with the kernel_size of the filters. 76 | stride: An integer, the output stride. 77 | rate: An integer, rate for atrous convolution. 78 | scope: Scope. 79 | 80 | Returns: 81 | output: A 4-D tensor of size [batch, height_out, width_out, channels] with 82 | the convolution output. 83 | """ 84 | if stride == 1: 85 | return slim.conv2d(inputs, num_outputs, kernel_size, stride=1, rate=rate, 86 | padding='SAME', scope=scope) 87 | else: 88 | kernel_size_effective = kernel_size + (kernel_size - 1) * (rate - 1) 89 | pad_total = kernel_size_effective - 1 90 | pad_beg = pad_total // 2 91 | pad_end = pad_total - pad_beg 92 | inputs = tf.pad(inputs, 93 | [[0, 0], [pad_beg, pad_end], [pad_beg, pad_end], [0, 0]]) 94 | return slim.conv2d(inputs, num_outputs, kernel_size, stride=stride, 95 | rate=rate, padding='VALID', scope=scope) 96 | 97 | 98 | @slim.add_arg_scope 99 | def stack_blocks_dense(net, blocks, output_stride=None, 100 | outputs_collections=None): 101 | """Stacks ResNet `Blocks` and controls output feature density. 102 | 103 | First, this function creates scopes for the ResNet in the form of 104 | 'block_name/unit_1', 'block_name/unit_2', etc. 105 | 106 | Second, this function allows the user to explicitly control the ResNet 107 | output_stride, which is the ratio of the input to output spatial resolution. 108 | This is useful for dense prediction tasks such as semantic segmentation or 109 | object detection. 110 | 111 | Most ResNets consist of 4 ResNet blocks and subsample the activations by a 112 | factor of 2 when transitioning between consecutive ResNet blocks. This results 113 | to a nominal ResNet output_stride equal to 8. 
If we set the output_stride to 114 | half the nominal network stride (e.g., output_stride=4), then we compute 115 | responses twice. 116 | 117 | Control of the output feature density is implemented by atrous convolution. 118 | 119 | Args: 120 | net: A `Tensor` of size [batch, height, width, channels]. 121 | blocks: A list of length equal to the number of ResNet `Blocks`. Each 122 | element is a ResNet `Block` object describing the units in the `Block`. 123 | output_stride: If `None`, then the output will be computed at the nominal 124 | network stride. If output_stride is not `None`, it specifies the requested 125 | ratio of input to output spatial resolution, which needs to be equal to 126 | the product of unit strides from the start up to some level of the ResNet. 127 | For example, if the ResNet employs units with strides 1, 2, 1, 3, 4, 1, 128 | then valid values for the output_stride are 1, 2, 6, 24 or None (which 129 | is equivalent to output_stride=24). 130 | outputs_collections: Collection to add the ResNet block outputs. 131 | 132 | Returns: 133 | net: Output tensor with stride equal to the specified output_stride. 134 | 135 | Raises: 136 | ValueError: If the target output_stride is not valid. 137 | """ 138 | # The current_stride variable keeps track of the effective stride of the 139 | # activations. This allows us to invoke atrous convolution whenever applying 140 | # the next residual unit would result in the activations having stride larger 141 | # than the target output_stride. 142 | current_stride = 1 143 | 144 | # The atrous convolution rate parameter. 145 | rate = 1 146 | 147 | for block in blocks: 148 | with tf.variable_scope(block.scope, 'block', [net]) as sc: 149 | for i, unit in enumerate(block.args): 150 | if output_stride is not None and current_stride > output_stride: 151 | raise ValueError('The target output_stride cannot be reached.') 152 | 153 | with tf.variable_scope('unit_%d' % (i + 1), values=[net]): 154 | # If we have reached the target output_stride, then we need to employ 155 | # atrous convolution with stride=1 and multiply the atrous rate by the 156 | # current unit's stride for use in subsequent layers. 157 | if output_stride is not None and current_stride == output_stride: 158 | net = block.unit_fn(net, rate=rate, **dict(unit, stride=1)) 159 | rate *= unit.get('stride', 1) 160 | 161 | else: 162 | net = block.unit_fn(net, rate=1, **unit) 163 | current_stride *= unit.get('stride', 1) 164 | net = slim.utils.collect_named_outputs(outputs_collections, sc.name, net) 165 | 166 | if output_stride is not None and current_stride != output_stride: 167 | raise ValueError('The target output_stride cannot be reached.') 168 | 169 | return net 170 | 171 | 172 | def resnet_arg_scope(weight_decay=0.0001, 173 | batch_norm_decay=0.997, 174 | batch_norm_epsilon=1e-5, 175 | batch_norm_scale=True, 176 | activation_fn=tf.nn.relu, 177 | use_batch_norm=True): 178 | """Defines the default ResNet arg scope. 179 | 180 | TODO(gpapan): The batch-normalization related default values above are 181 | appropriate for use in conjunction with the reference ResNet models 182 | released at https://github.com/KaimingHe/deep-residual-networks. When 183 | training ResNets from scratch, they might need to be tuned. 184 | 185 | Args: 186 | weight_decay: The weight decay to use for regularizing the model. 187 | batch_norm_decay: The moving average decay when estimating layer activation 188 | statistics in batch normalization. 
189 | batch_norm_epsilon: Small constant to prevent division by zero when 190 | normalizing activations by their variance in batch normalization. 191 | batch_norm_scale: If True, uses an explicit `gamma` multiplier to scale the 192 | activations in the batch normalization layer. 193 | activation_fn: The activation function which is used in ResNet. 194 | use_batch_norm: Whether or not to use batch normalization. 195 | 196 | Returns: 197 | An `arg_scope` to use for the resnet models. 198 | """ 199 | batch_norm_params = { 200 | 'decay': batch_norm_decay, 201 | 'epsilon': batch_norm_epsilon, 202 | 'scale': batch_norm_scale, 203 | 'updates_collections': tf.GraphKeys.UPDATE_OPS, 204 | } 205 | 206 | with slim.arg_scope( 207 | [slim.conv2d], 208 | weights_regularizer=slim.l2_regularizer(weight_decay), 209 | weights_initializer=slim.variance_scaling_initializer(), 210 | activation_fn=activation_fn, 211 | normalizer_fn=slim.batch_norm if use_batch_norm else None, 212 | normalizer_params=batch_norm_params): 213 | with slim.arg_scope([slim.batch_norm], **batch_norm_params): 214 | # The following implies padding='SAME' for pool1, which makes feature 215 | # alignment easier for dense prediction tasks. This is also used in 216 | # https://github.com/facebook/fb.resnet.torch. However the accompanying 217 | # code of 'Deep Residual Learning for Image Recognition' uses 218 | # padding='VALID' for pool1. You can switch to that choice by setting 219 | # slim.arg_scope([slim.max_pool2d], padding='VALID'). 220 | with slim.arg_scope([slim.max_pool2d], padding='SAME') as arg_sc: 221 | return arg_sc 222 | -------------------------------------------------------------------------------- /net/resnet_v2/resnet_v2.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on 2017 10.17 4 | @author: liupeng 5 | wechat: lp9628 6 | blog: http://blog.csdn.net/u014365862/article/details/78422372 7 | """ 8 | 9 | from __future__ import absolute_import 10 | from __future__ import division 11 | from __future__ import print_function 12 | 13 | import tensorflow as tf 14 | 15 | try: 16 | import resnet_utils 17 | except: 18 | from net.resnet_v2 import resnet_utils 19 | 20 | slim = tf.contrib.slim 21 | resnet_arg_scope = resnet_utils.resnet_arg_scope 22 | 23 | 24 | @slim.add_arg_scope 25 | def bottleneck(inputs, depth, depth_bottleneck, stride, rate=1, 26 | outputs_collections=None, scope=None): 27 | """Bottleneck residual unit variant with BN before convolutions. 28 | 29 | This is the full preactivation residual unit variant proposed in [2]. See 30 | Fig. 1(b) of [2] for its definition. Note that we use here the bottleneck 31 | variant which has an extra bottleneck layer. 32 | 33 | When putting together two consecutive ResNet blocks that use this unit, one 34 | should use stride = 2 in the last unit of the first block. 35 | 36 | Args: 37 | inputs: A tensor of size [batch, height, width, channels]. 38 | depth: The depth of the ResNet unit output. 39 | depth_bottleneck: The depth of the bottleneck layers. 40 | stride: The ResNet unit's stride. Determines the amount of downsampling of 41 | the units output compared to its input. 42 | rate: An integer, rate for atrous convolution. 43 | outputs_collections: Collection to add the ResNet unit output. 44 | scope: Optional variable_scope. 45 | 46 | Returns: 47 | The ResNet unit's output. 
48 | """ 49 | with tf.variable_scope(scope, 'bottleneck_v2', [inputs]) as sc: 50 | depth_in = slim.utils.last_dimension(inputs.get_shape(), min_rank=4) 51 | preact = slim.batch_norm(inputs, activation_fn=tf.nn.relu, scope='preact') 52 | if depth == depth_in: 53 | shortcut = resnet_utils.subsample(inputs, stride, 'shortcut') 54 | else: 55 | shortcut = slim.conv2d(preact, depth, [1, 1], stride=stride, 56 | normalizer_fn=None, activation_fn=None, 57 | scope='shortcut') 58 | 59 | residual = slim.conv2d(preact, depth_bottleneck, [1, 1], stride=1, 60 | scope='conv1') 61 | residual = resnet_utils.conv2d_same(residual, depth_bottleneck, 3, stride, 62 | rate=rate, scope='conv2') 63 | residual = slim.conv2d(residual, depth, [1, 1], stride=1, 64 | normalizer_fn=None, activation_fn=None, 65 | scope='conv3') 66 | 67 | output = shortcut + residual 68 | 69 | return slim.utils.collect_named_outputs(outputs_collections, 70 | sc.original_name_scope, 71 | output) 72 | 73 | 74 | def resnet_v2(inputs, 75 | blocks, 76 | num_classes=None, 77 | is_training=True, 78 | global_pool=True, 79 | output_stride=None, 80 | include_root_block=True, 81 | spatial_squeeze=True, 82 | reuse=None, 83 | scope=None): 84 | """Generator for v2 (preactivation) ResNet models. 85 | 86 | This function generates a family of ResNet v2 models. See the resnet_v2_*() 87 | methods for specific model instantiations, obtained by selecting different 88 | block instantiations that produce ResNets of various depths. 89 | 90 | Training for image classification on Imagenet is usually done with [224, 224] 91 | inputs, resulting in [7, 7] feature maps at the output of the last ResNet 92 | block for the ResNets defined in [1] that have nominal stride equal to 32. 93 | However, for dense prediction tasks we advise that one uses inputs with 94 | spatial dimensions that are multiples of 32 plus 1, e.g., [321, 321]. In 95 | this case the feature maps at the ResNet output will have spatial shape 96 | [(height - 1) / output_stride + 1, (width - 1) / output_stride + 1] 97 | and corners exactly aligned with the input image corners, which greatly 98 | facilitates alignment of the features to the image. Using as input [225, 225] 99 | images results in [8, 8] feature maps at the output of the last ResNet block. 100 | 101 | For dense prediction tasks, the ResNet needs to run in fully-convolutional 102 | (FCN) mode and global_pool needs to be set to False. The ResNets in [1, 2] all 103 | have nominal stride equal to 32 and a good choice in FCN mode is to use 104 | output_stride=16 in order to increase the density of the computed features at 105 | small computational and memory overhead, cf. http://arxiv.org/abs/1606.00915. 106 | 107 | Args: 108 | inputs: A tensor of size [batch, height_in, width_in, channels]. 109 | blocks: A list of length equal to the number of ResNet blocks. Each element 110 | is a resnet_utils.Block object describing the units in the block. 111 | num_classes: Number of predicted classes for classification tasks. If None 112 | we return the features before the logit layer. 113 | is_training: whether is training or not. 114 | global_pool: If True, we perform global average pooling before computing the 115 | logits. Set to True for image classification, False for dense prediction. 116 | output_stride: If None, then the output will be computed at the nominal 117 | network stride. If output_stride is not None, it specifies the requested 118 | ratio of input to output spatial resolution. 
119 | include_root_block: If True, include the initial convolution followed by 120 | max-pooling, if False excludes it. If excluded, `inputs` should be the 121 | results of an activation-less convolution. 122 | spatial_squeeze: if True, logits is of shape [B, C], if false logits is 123 | of shape [B, 1, 1, C], where B is batch_size and C is number of classes. 124 | To use this parameter, the input images must be smaller than 300x300 125 | pixels, in which case the output logit layer does not contain spatial 126 | information and can be removed. 127 | reuse: whether or not the network and its variables should be reused. To be 128 | able to reuse 'scope' must be given. 129 | scope: Optional variable_scope. 130 | 131 | 132 | Returns: 133 | net: A rank-4 tensor of size [batch, height_out, width_out, channels_out]. 134 | If global_pool is False, then height_out and width_out are reduced by a 135 | factor of output_stride compared to the respective height_in and width_in, 136 | else both height_out and width_out equal one. If num_classes is None, then 137 | net is the output of the last ResNet block, potentially after global 138 | average pooling. If num_classes is not None, net contains the pre-softmax 139 | activations. 140 | end_points: A dictionary from components of the network to the corresponding 141 | activation. 142 | 143 | Raises: 144 | ValueError: If the target output_stride is not valid. 145 | """ 146 | with tf.variable_scope(scope, 'resnet_v2', [inputs], reuse=reuse) as sc: 147 | end_points_collection = sc.name + '_end_points' 148 | with slim.arg_scope([slim.conv2d, bottleneck, 149 | resnet_utils.stack_blocks_dense], 150 | outputs_collections=end_points_collection): 151 | with slim.arg_scope([slim.batch_norm], is_training=is_training): 152 | net = inputs 153 | if include_root_block: 154 | if output_stride is not None: 155 | if output_stride % 4 != 0: 156 | raise ValueError('The output_stride needs to be a multiple of 4.') 157 | output_stride /= 4 158 | # We do not include batch normalization or activation functions in 159 | # conv1 because the first ResNet unit will perform these. Cf. 160 | # Appendix of [2]. 161 | with slim.arg_scope([slim.conv2d], 162 | activation_fn=None, normalizer_fn=None): 163 | net = resnet_utils.conv2d_same(net, 64, 7, stride=2, scope='conv1') 164 | net = slim.max_pool2d(net, [3, 3], stride=2, scope='pool1') 165 | net = resnet_utils.stack_blocks_dense(net, blocks, output_stride) 166 | # This is needed because the pre-activation variant does not have batch 167 | # normalization or activation functions in the residual unit output. See 168 | # Appendix of [2]. 169 | net = slim.batch_norm(net, activation_fn=tf.nn.relu, scope='postnorm') 170 | if global_pool: 171 | # Global average pooling. 172 | net = tf.reduce_mean(net, [1, 2], name='pool5', keep_dims=True) 173 | if num_classes is not None: 174 | net = slim.conv2d(net, num_classes, [1, 1], activation_fn=None, 175 | normalizer_fn=None, scope='logits') 176 | if spatial_squeeze: 177 | net = tf.squeeze(net, [1, 2], name='SpatialSqueeze') 178 | # Convert end_points_collection into a dictionary of end_points. 179 | end_points = slim.utils.convert_collection_to_dict( 180 | end_points_collection) 181 | if num_classes is not None: 182 | end_points['predictions'] = slim.softmax(net, scope='predictions') 183 | return net, end_points 184 | resnet_v2.default_image_size = 224 185 | 186 | 187 | def resnet_v2_block(scope, base_depth, num_units, stride): 188 | """Helper function for creating a resnet_v2 bottleneck block. 
189 | 190 | Args: 191 | scope: The scope of the block. 192 | base_depth: The depth of the bottleneck layer for each unit. 193 | num_units: The number of units in the block. 194 | stride: The stride of the block, implemented as a stride in the last unit. 195 | All other units have stride=1. 196 | 197 | Returns: 198 | A resnet_v2 bottleneck block. 199 | """ 200 | return resnet_utils.Block(scope, bottleneck, [{ 201 | 'depth': base_depth * 4, 202 | 'depth_bottleneck': base_depth, 203 | 'stride': 1 204 | }] * (num_units - 1) + [{ 205 | 'depth': base_depth * 4, 206 | 'depth_bottleneck': base_depth, 207 | 'stride': stride 208 | }]) 209 | resnet_v2.default_image_size = 224 210 | 211 | 212 | def resnet_v2_50(inputs, 213 | num_classes=None, 214 | is_training=True, 215 | global_pool=True, 216 | output_stride=None, 217 | spatial_squeeze=True, 218 | reuse=None, 219 | scope='resnet_v2_50'): 220 | """ResNet-50 model of [1]. See resnet_v2() for arg and return description.""" 221 | blocks = [ 222 | resnet_v2_block('block1', base_depth=64, num_units=3, stride=2), 223 | resnet_v2_block('block2', base_depth=128, num_units=4, stride=2), 224 | resnet_v2_block('block3', base_depth=256, num_units=6, stride=2), 225 | resnet_v2_block('block4', base_depth=512, num_units=3, stride=1), 226 | ] 227 | return resnet_v2(inputs, blocks, num_classes, is_training=is_training, 228 | global_pool=global_pool, output_stride=output_stride, 229 | include_root_block=True, spatial_squeeze=spatial_squeeze, 230 | reuse=reuse, scope=scope) 231 | resnet_v2_50.default_image_size = resnet_v2.default_image_size 232 | 233 | 234 | def resnet_v2_101(inputs, 235 | num_classes=None, 236 | is_training=True, 237 | global_pool=True, 238 | output_stride=None, 239 | spatial_squeeze=True, 240 | reuse=None, 241 | scope='resnet_v2_101'): 242 | """ResNet-101 model of [1]. See resnet_v2() for arg and return description.""" 243 | blocks = [ 244 | resnet_v2_block('block1', base_depth=64, num_units=3, stride=2), 245 | resnet_v2_block('block2', base_depth=128, num_units=4, stride=2), 246 | resnet_v2_block('block3', base_depth=256, num_units=23, stride=2), 247 | resnet_v2_block('block4', base_depth=512, num_units=3, stride=1), 248 | ] 249 | return resnet_v2(inputs, blocks, num_classes, is_training=is_training, 250 | global_pool=global_pool, output_stride=output_stride, 251 | include_root_block=True, spatial_squeeze=spatial_squeeze, 252 | reuse=reuse, scope=scope) 253 | resnet_v2_101.default_image_size = resnet_v2.default_image_size 254 | 255 | 256 | def resnet_v2_152(inputs, 257 | num_classes=None, 258 | is_training=True, 259 | global_pool=True, 260 | output_stride=None, 261 | spatial_squeeze=True, 262 | reuse=None, 263 | scope='resnet_v2_152'): 264 | """ResNet-152 model of [1]. 
See resnet_v2() for arg and return description.""" 265 | blocks = [ 266 | resnet_v2_block('block1', base_depth=64, num_units=3, stride=2), 267 | resnet_v2_block('block2', base_depth=128, num_units=8, stride=2), 268 | resnet_v2_block('block3', base_depth=256, num_units=36, stride=2), 269 | resnet_v2_block('block4', base_depth=512, num_units=3, stride=1), 270 | ] 271 | return resnet_v2(inputs, blocks, num_classes, is_training=is_training, 272 | global_pool=global_pool, output_stride=output_stride, 273 | include_root_block=True, spatial_squeeze=spatial_squeeze, 274 | reuse=reuse, scope=scope) 275 | resnet_v2_152.default_image_size = resnet_v2.default_image_size 276 | 277 | 278 | def resnet_v2_200(inputs, 279 | num_classes=None, 280 | is_training=True, 281 | global_pool=True, 282 | output_stride=None, 283 | spatial_squeeze=True, 284 | reuse=None, 285 | scope='resnet_v2_200'): 286 | """ResNet-200 model of [2]. See resnet_v2() for arg and return description.""" 287 | blocks = [ 288 | resnet_v2_block('block1', base_depth=64, num_units=3, stride=2), 289 | resnet_v2_block('block2', base_depth=128, num_units=24, stride=2), 290 | resnet_v2_block('block3', base_depth=256, num_units=36, stride=2), 291 | resnet_v2_block('block4', base_depth=512, num_units=3, stride=1), 292 | ] 293 | return resnet_v2(inputs, blocks, num_classes, is_training=is_training, 294 | global_pool=global_pool, output_stride=output_stride, 295 | include_root_block=True, spatial_squeeze=spatial_squeeze, 296 | reuse=reuse, scope=scope) 297 | resnet_v2_200.default_image_size = resnet_v2.default_image_size 298 | -------------------------------------------------------------------------------- /net/vgg/__pycache__/vgg.cpython-35.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/net/vgg/__pycache__/vgg.cpython-35.pyc -------------------------------------------------------------------------------- /net/vgg/vgg.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on 2017 10.17 4 | @author: liupeng 5 | wechat: lp9628 6 | blog: http://blog.csdn.net/u014365862/article/details/78422372 7 | """ 8 | 9 | from __future__ import absolute_import 10 | from __future__ import division 11 | from __future__ import print_function 12 | 13 | import tensorflow as tf 14 | 15 | slim = tf.contrib.slim 16 | 17 | 18 | def vgg_arg_scope(weight_decay=0.0005): 19 | """Defines the VGG arg scope. 20 | 21 | Args: 22 | weight_decay: The l2 regularization coefficient. 23 | 24 | Returns: 25 | An arg_scope. 26 | """ 27 | with slim.arg_scope([slim.conv2d, slim.fully_connected], 28 | activation_fn=tf.nn.relu, 29 | weights_regularizer=slim.l2_regularizer(weight_decay), 30 | biases_initializer=tf.zeros_initializer()): 31 | with slim.arg_scope([slim.conv2d], padding='SAME') as arg_sc: 32 | return arg_sc 33 | 34 | 35 | def vgg_a(inputs, 36 | num_classes=1000, 37 | is_training=True, 38 | dropout_keep_prob=0.5, 39 | spatial_squeeze=True, 40 | scope='vgg_a', 41 | fc_conv_padding='VALID'): 42 | """Oxford Net VGG 11-Layers version A Example. 43 | 44 | Note: All the fully_connected layers have been transformed to conv2d layers. 45 | To use in classification mode, resize input to 224x224. 46 | 47 | Args: 48 | inputs: a tensor of size [batch_size, height, width, channels]. 49 | num_classes: number of predicted classes. 
50 | is_training: whether or not the model is being trained. 51 | dropout_keep_prob: the probability that activations are kept in the dropout 52 | layers during training. 53 | spatial_squeeze: whether or not should squeeze the spatial dimensions of the 54 | outputs. Useful to remove unnecessary dimensions for classification. 55 | scope: Optional scope for the variables. 56 | fc_conv_padding: the type of padding to use for the fully connected layer 57 | that is implemented as a convolutional layer. Use 'SAME' padding if you 58 | are applying the network in a fully convolutional manner and want to 59 | get a prediction map downsampled by a factor of 32 as an output. 60 | Otherwise, the output prediction map will be (input / 32) - 6 in case of 61 | 'VALID' padding. 62 | 63 | Returns: 64 | the last op containing the log predictions and end_points dict. 65 | """ 66 | with tf.variable_scope(scope, 'vgg_a', [inputs]) as sc: 67 | end_points_collection = sc.name + '_end_points' 68 | # Collect outputs for conv2d, fully_connected and max_pool2d. 69 | with slim.arg_scope([slim.conv2d, slim.max_pool2d], 70 | outputs_collections=end_points_collection): 71 | net = slim.repeat(inputs, 1, slim.conv2d, 64, [3, 3], scope='conv1') 72 | net = slim.max_pool2d(net, [2, 2], scope='pool1') 73 | net = slim.repeat(net, 1, slim.conv2d, 128, [3, 3], scope='conv2') 74 | net = slim.max_pool2d(net, [2, 2], scope='pool2') 75 | net = slim.repeat(net, 2, slim.conv2d, 256, [3, 3], scope='conv3') 76 | net = slim.max_pool2d(net, [2, 2], scope='pool3') 77 | net = slim.repeat(net, 2, slim.conv2d, 512, [3, 3], scope='conv4') 78 | net = slim.max_pool2d(net, [2, 2], scope='pool4') 79 | net = slim.repeat(net, 2, slim.conv2d, 512, [3, 3], scope='conv5') 80 | net = slim.max_pool2d(net, [2, 2], scope='pool5') 81 | # Use conv2d instead of fully_connected layers. 82 | net = slim.conv2d(net, 4096, [7, 7], padding=fc_conv_padding, scope='fc6') 83 | net = slim.dropout(net, dropout_keep_prob, is_training=is_training, 84 | scope='dropout6') 85 | net = slim.conv2d(net, 4096, [1, 1], scope='fc7') 86 | net = slim.dropout(net, dropout_keep_prob, is_training=is_training, 87 | scope='dropout7') 88 | net = slim.conv2d(net, num_classes, [1, 1], 89 | activation_fn=None, 90 | normalizer_fn=None, 91 | scope='fc8') 92 | # Convert end_points_collection into a end_point dict. 93 | end_points = slim.utils.convert_collection_to_dict(end_points_collection) 94 | if spatial_squeeze: 95 | net = tf.squeeze(net, [1, 2], name='fc8/squeezed') 96 | end_points[sc.name + '/fc8'] = net 97 | return net, end_points 98 | vgg_a.default_image_size = 224 99 | 100 | 101 | def vgg_16(inputs, 102 | num_classes=None, 103 | is_training=True, 104 | dropout_keep_prob=0.5, 105 | spatial_squeeze=True, 106 | scope='vgg_16', 107 | fc_conv_padding='VALID'): 108 | """Oxford Net VGG 16-Layers version D Example. 109 | 110 | Note: All the fully_connected layers have been transformed to conv2d layers. 111 | To use in classification mode, resize input to 224x224. 112 | 113 | Args: 114 | inputs: a tensor of size [batch_size, height, width, channels]. 115 | num_classes: number of predicted classes. 116 | is_training: whether or not the model is being trained. 117 | dropout_keep_prob: the probability that activations are kept in the dropout 118 | layers during training. 119 | spatial_squeeze: whether or not should squeeze the spatial dimensions of the 120 | outputs. Useful to remove unnecessary dimensions for classification. 121 | scope: Optional scope for the variables. 
122 | fc_conv_padding: the type of padding to use for the fully connected layer 123 | that is implemented as a convolutional layer. Use 'SAME' padding if you 124 | are applying the network in a fully convolutional manner and want to 125 | get a prediction map downsampled by a factor of 32 as an output. 126 | Otherwise, the output prediction map will be (input / 32) - 6 in case of 127 | 'VALID' padding. 128 | 129 | Returns: 130 | the last op containing the log predictions and end_points dict. 131 | """ 132 | with tf.variable_scope(scope, 'vgg_16', [inputs]) as sc: 133 | end_points_collection = sc.name + '_end_points' 134 | # Collect outputs for conv2d, fully_connected and max_pool2d. 135 | with slim.arg_scope([slim.conv2d, slim.fully_connected, slim.max_pool2d], 136 | outputs_collections=end_points_collection): 137 | net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1') 138 | net = slim.max_pool2d(net, [2, 2], scope='pool1') 139 | net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2') 140 | net = slim.max_pool2d(net, [2, 2], scope='pool2') 141 | net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3') 142 | net = slim.max_pool2d(net, [2, 2], scope='pool3') 143 | net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv4') 144 | net = slim.max_pool2d(net, [2, 2], scope='pool4') 145 | net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv5') 146 | net = slim.max_pool2d(net, [2, 2], scope='pool5') 147 | # Use conv2d instead of fully_connected layers. 148 | net = slim.conv2d(net, 4096, [7, 7], padding=fc_conv_padding, scope='fc6') 149 | net = slim.dropout(net, dropout_keep_prob, is_training=is_training, 150 | scope='dropout6') 151 | net = slim.conv2d(net, 4096, [1, 1], scope='fc7') 152 | net = slim.dropout(net, dropout_keep_prob, is_training=is_training, 153 | scope='dropout7') 154 | # Convert end_points_collection into a end_point dict. 155 | end_points = slim.utils.convert_collection_to_dict(end_points_collection) 156 | if num_classes is not None: 157 | net = slim.conv2d(net, num_classes, [1, 1], 158 | activation_fn=None, 159 | normalizer_fn=None, 160 | scope='fc8') 161 | if spatial_squeeze: 162 | net = tf.squeeze(net, [1, 2], name='fc8/squeezed') 163 | end_points[sc.name + '/fc8'] = net 164 | else: 165 | net = net 166 | end_points = end_points 167 | return net, end_points 168 | vgg_16.default_image_size = 224 169 | 170 | 171 | def vgg_19(inputs, 172 | num_classes=None, 173 | is_training=True, 174 | dropout_keep_prob=0.5, 175 | spatial_squeeze=True, 176 | scope='vgg_19', 177 | fc_conv_padding='VALID'): 178 | """Oxford Net VGG 19-Layers version E Example. 179 | 180 | Note: All the fully_connected layers have been transformed to conv2d layers. 181 | To use in classification mode, resize input to 224x224. 182 | 183 | Args: 184 | inputs: a tensor of size [batch_size, height, width, channels]. 185 | num_classes: number of predicted classes. 186 | is_training: whether or not the model is being trained. 187 | dropout_keep_prob: the probability that activations are kept in the dropout 188 | layers during training. 189 | spatial_squeeze: whether or not should squeeze the spatial dimensions of the 190 | outputs. Useful to remove unnecessary dimensions for classification. 191 | scope: Optional scope for the variables. 192 | fc_conv_padding: the type of padding to use for the fully connected layer 193 | that is implemented as a convolutional layer. 
Use 'SAME' padding if you 194 | are applying the network in a fully convolutional manner and want to 195 | get a prediction map downsampled by a factor of 32 as an output. 196 | Otherwise, the output prediction map will be (input / 32) - 6 in case of 197 | 'VALID' padding. 198 | 199 | 200 | Returns: 201 | the last op containing the log predictions and end_points dict. 202 | """ 203 | with tf.variable_scope(scope, 'vgg_19', [inputs]) as sc: 204 | end_points_collection = sc.name + '_end_points' 205 | # Collect outputs for conv2d, fully_connected and max_pool2d. 206 | with slim.arg_scope([slim.conv2d, slim.fully_connected, slim.max_pool2d], 207 | outputs_collections=end_points_collection): 208 | net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1') 209 | net = slim.max_pool2d(net, [2, 2], scope='pool1') 210 | net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2') 211 | net = slim.max_pool2d(net, [2, 2], scope='pool2') 212 | net = slim.repeat(net, 4, slim.conv2d, 256, [3, 3], scope='conv3') 213 | net = slim.max_pool2d(net, [2, 2], scope='pool3') 214 | net = slim.repeat(net, 4, slim.conv2d, 512, [3, 3], scope='conv4') 215 | net = slim.max_pool2d(net, [2, 2], scope='pool4') 216 | net = slim.repeat(net, 4, slim.conv2d, 512, [3, 3], scope='conv5') 217 | net = slim.max_pool2d(net, [2, 2], scope='pool5') 218 | # Use conv2d instead of fully_connected layers. 219 | net = slim.conv2d(net, 4096, [7, 7], padding=fc_conv_padding, scope='fc6') 220 | net = slim.dropout(net, dropout_keep_prob, is_training=is_training, 221 | scope='dropout6') 222 | net = slim.conv2d(net, 4096, [1, 1], scope='fc7') 223 | net = slim.dropout(net, dropout_keep_prob, is_training=is_training, 224 | scope='dropout7') 225 | end_points = slim.utils.convert_collection_to_dict(end_points_collection) 226 | if num_classes is not None: 227 | net = slim.conv2d(net, num_classes, [1, 1], 228 | activation_fn=None, 229 | normalizer_fn=None, 230 | scope='fc8') 231 | if spatial_squeeze: 232 | net = tf.squeeze(net, [1, 2], name='fc8/squeezed') 233 | end_points[sc.name + '/fc8'] = net 234 | else: 235 | net = net 236 | end_points = end_points 237 | return net, end_pointss 238 | vgg_19.default_image_size = 224 239 | 240 | # Alias 241 | vgg_d = vgg_16 242 | vgg_e = vgg_19 243 | -------------------------------------------------------------------------------- /net/z_build_net/__pycache__/build_net.cpython-35.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/net/z_build_net/__pycache__/build_net.cpython-35.pyc -------------------------------------------------------------------------------- /net/z_build_net/build_net.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on 2017 10.17 4 | @author: liupeng 5 | wechat: lp9628 6 | blog: http://blog.csdn.net/u014365862/article/details/78422372 7 | """ 8 | 9 | import numpy as np 10 | import tensorflow as tf 11 | slim = tf.contrib.slim 12 | import numpy as np 13 | import argparse 14 | import os 15 | from PIL import Image 16 | from datetime import datetime 17 | import math 18 | import time 19 | import cv2 20 | from keras.utils import np_utils 21 | 22 | # inception_v4 23 | try: 24 | from inception_v4 import inception_v4_arg_scope, inception_v4 25 | except: 26 | from net.inception_v4.inception_v4 import inception_v4_arg_scope, inception_v4 27 | # resnet_v2_50, resnet_v2_101, 
resnet_v2_152 28 | try: 29 | from resnet_v2 import resnet_arg_scope, resnet_v2_50 30 | except: 31 | from net.resnet_v2.resnet_v2 import resnet_arg_scope, resnet_v2_50 32 | # vgg16, vgg19 33 | try: 34 | from vgg import vgg_arg_scope, vgg_16 35 | except: 36 | from net.vgg.vgg import vgg_arg_scope, vgg_16 37 | 38 | 39 | class net_arch(object): 40 | 41 | # def __init__(self): 42 | 43 | def arch_inception_v4(self, X, num_classes, dropout_keep_prob=0.8, is_train=False): 44 | arg_scope = inception_v4_arg_scope() 45 | with slim.arg_scope(arg_scope): 46 | net, end_points = inception_v4(X, is_training=is_train) 47 | with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], stride=1, padding='SAME'): 48 | with tf.variable_scope('Logits_out'): 49 | # 8 x 8 x 1536 50 | net = slim.avg_pool2d(net, net.get_shape()[1:3], padding='VALID', 51 | scope='AvgPool_1a_out') 52 | # 1 x 1 x 1536 53 | net = slim.dropout(net, dropout_keep_prob, scope='Dropout_1b_out') 54 | net = slim.flatten(net, scope='PreLogitsFlatten_out') 55 | # 1536 56 | net = slim.fully_connected(net, 256, activation_fn=tf.nn.relu, scope='Logits_out0') 57 | net = slim.fully_connected(net, num_classes, activation_fn=None,scope='Logits_out1') 58 | return net 59 | 60 | def arch_resnet_v2_50(self, X, num_classes, dropout_keep_prob=0.8, is_train=False): 61 | arg_scope = resnet_arg_scope() 62 | with slim.arg_scope(arg_scope): 63 | net, end_points = resnet_v2_50(X, is_training=is_train) 64 | with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], stride=1, padding='SAME'): 65 | with tf.variable_scope('Logits_out'): 66 | net = slim.conv2d(net, 1000, [1, 1], activation_fn=None, normalizer_fn=None, scope='Logits_out0') 67 | net = slim.dropout(net, dropout_keep_prob, scope='Dropout_1b_out0') 68 | net = slim.conv2d(net, 200, [1, 1], activation_fn=None, normalizer_fn=None, scope='Logits_out1') 69 | net = slim.dropout(net, dropout_keep_prob, scope='Dropout_1b_out1') 70 | net = slim.conv2d(net, num_classes, [1, 1], activation_fn=None, normalizer_fn=None, scope='Logits_out2') 71 | net = tf.squeeze(net,[1,2], name='SpatialSqueeze') 72 | return net 73 | 74 | def arch_vgg16(self, X, num_classes, dropout_keep_prob=0.8, is_train=False): 75 | arg_scope = vgg_arg_scope() 76 | with slim.arg_scope(arg_scope): 77 | net, end_points = vgg_16(X, is_training=is_train) 78 | with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], stride=1, padding='SAME'): 79 | with tf.variable_scope('Logits_out'): 80 | net = slim.conv2d(net, num_classes, [1, 1],activation_fn=None,normalizer_fn=None,scope='fc8') 81 | net = tf.squeeze(net,[1,2], name='fc8/squeezed') 82 | return net 83 | 84 | def arch_inception_v4_rnn(self, X, num_classes, dropout_keep_prob=0.8, is_train=False): 85 | rnn_size = 256 86 | num_layers = 2 87 | arg_scope = inception_v4_arg_scope() 88 | with slim.arg_scope(arg_scope): 89 | net, end_points = inception_v4(X, is_training=is_train) 90 | with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], stride=1, padding='SAME'): 91 | with tf.variable_scope('Logits_out'): 92 | # 8 x 8 x 1536 93 | orig_shape = net.get_shape().as_list() 94 | net = tf.reshape(net, [-1, orig_shape[1] * orig_shape[2], orig_shape[3]]) 95 | 96 | def gru_cell(): 97 | return tf.contrib.rnn.GRUCell(rnn_size) 98 | def lstm_cell(): 99 | return tf.contrib.rnn.LSTMCell(rnn_size) 100 | def attn_cell(): 101 | return tf.contrib.rnn.DropoutWrapper(lstm_cell(), output_keep_prob=dropout_keep_prob) 102 | stack = tf.contrib.rnn.MultiRNNCell([lstm_cell() for _ in range(0, 
num_layers)], state_is_tuple=True) 103 | net, _ = tf.nn.dynamic_rnn(stack, net, dtype=tf.float32) 104 | net = tf.transpose(net, (1, 0, 2)) 105 | # 1536 106 | net = slim.fully_connected(net[-1], 256, activation_fn=tf.nn.relu, scope='Logits_out0') 107 | net = slim.fully_connected(net, num_classes, activation_fn=None,scope='Logits_out1') 108 | return net 109 | 110 | def arch_resnet_v2_50_rnn(self, X, num_classes, dropout_keep_prob=0.8, is_train=False): 111 | rnn_size = 256 112 | num_layers = 2 113 | arg_scope = resnet_arg_scope() 114 | with slim.arg_scope(arg_scope): 115 | net, end_points = resnet_v2_50(X, is_training=is_train) 116 | with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], stride=1, padding='SAME'): 117 | with tf.variable_scope('Logits_out'): 118 | orig_shape = net.get_shape().as_list() 119 | net = tf.reshape(net, [-1, orig_shape[1] * orig_shape[2], orig_shape[3]]) 120 | 121 | def gru_cell(): 122 | return tf.contrib.rnn.GRUCell(run_size) 123 | def lstm_cell(): 124 | return tf.contrib.rnn.LSTMCell(rnn_size) 125 | def attn_cell(): 126 | return tf.contrib.rnn.DropoutWrapper(lstm_cell(), output_keep_prob=dropout_keep_prob) 127 | stack = tf.contrib.rnn.MultiRNNCell([lstm_cell() for _ in range(0, num_layers)], state_is_tuple=True) 128 | net, _ = tf.nn.dynamic_rnn(stack, net, dtype=tf.float32) 129 | net = tf.transpose(net, (1, 0, 2)) 130 | # 131 | net = slim.fully_connected(net[-1], 256, activation_fn=tf.nn.relu, scope='Logits_out0') 132 | net = slim.fully_connected(net, num_classes, activation_fn=None,scope='Logits_out1') 133 | return net 134 | 135 | -------------------------------------------------------------------------------- /pretrain/README.md: -------------------------------------------------------------------------------- 1 | 2 | # 预训练好的模型放在这里。 3 | -------------------------------------------------------------------------------- /sample_train/0male/0(1).jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/sample_train/0male/0(1).jpeg -------------------------------------------------------------------------------- /sample_train/0male/0(1).jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/sample_train/0male/0(1).jpg -------------------------------------------------------------------------------- /sample_train/0male/0(2).jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/sample_train/0male/0(2).jpeg -------------------------------------------------------------------------------- /sample_train/0male/0(2).jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/sample_train/0male/0(2).jpg -------------------------------------------------------------------------------- /sample_train/0male/0(3).jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/sample_train/0male/0(3).jpeg -------------------------------------------------------------------------------- /sample_train/0male/0(3).jpg: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/sample_train/0male/0(3).jpg -------------------------------------------------------------------------------- /sample_train/1female/1(1).jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/sample_train/1female/1(1).jpg -------------------------------------------------------------------------------- /sample_train/1female/1(2).jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/sample_train/1female/1(2).jpg -------------------------------------------------------------------------------- /sample_train/1female/1(3).jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/sample_train/1female/1(3).jpg -------------------------------------------------------------------------------- /sample_train/1female/1(4).jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/sample_train/1female/1(4).jpg -------------------------------------------------------------------------------- /sample_train/1female/1(5).jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/sample_train/1female/1(5).jpg -------------------------------------------------------------------------------- /sample_train/1female/1(6).jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/sample_train/1female/1(6).jpg -------------------------------------------------------------------------------- /sample_train/2many/0_Parade_marchingband_1_12.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/sample_train/2many/0_Parade_marchingband_1_12.jpg -------------------------------------------------------------------------------- /sample_train/2many/0_Parade_marchingband_1_13.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/sample_train/2many/0_Parade_marchingband_1_13.jpg -------------------------------------------------------------------------------- /sample_train/2many/0_Parade_marchingband_1_17.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/sample_train/2many/0_Parade_marchingband_1_17.jpg -------------------------------------------------------------------------------- /sample_train/2many/0_Parade_marchingband_1_5.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/sample_train/2many/0_Parade_marchingband_1_5.jpg -------------------------------------------------------------------------------- /sample_train/2many/0_Parade_marchingband_1_6.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/sample_train/2many/0_Parade_marchingband_1_6.jpg -------------------------------------------------------------------------------- /sample_train/2many/0_Parade_marchingband_1_8.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/sample_train/2many/0_Parade_marchingband_1_8.jpg -------------------------------------------------------------------------------- /sample_train/3other/6(2).jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/sample_train/3other/6(2).jpg -------------------------------------------------------------------------------- /sample_train/3other/6(3).jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/sample_train/3other/6(3).jpg -------------------------------------------------------------------------------- /sample_train/3other/6(4).jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/sample_train/3other/6(4).jpg -------------------------------------------------------------------------------- /sample_train/3other/6(5).jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/sample_train/3other/6(5).jpg -------------------------------------------------------------------------------- /sample_train/3other/6(6).jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/sample_train/3other/6(6).jpg -------------------------------------------------------------------------------- /sample_train/3other/6(9).jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/sample_train/3other/6(9).jpg -------------------------------------------------------------------------------- /train_net/__pycache__/train.cpython-35.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MachineLP/train_cnn-rnn/b14a9d65030cd4874c64ce5f52c39f771acbb776/train_net/__pycache__/train.cpython-35.pyc -------------------------------------------------------------------------------- /train_net/train.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on 2017 10.17 4 | @author: liupeng 5 | wechat: lp9628 6 | blog: http://blog.csdn.net/u014365862/article/details/78422372 7 | """ 8 | 9 | import numpy as np 10 | import tensorflow 
as tf 11 | slim = tf.contrib.slim 12 | import numpy as np 13 | import argparse 14 | import os 15 | from PIL import Image 16 | from datetime import datetime 17 | import math 18 | import time 19 | import cv2 20 | 21 | from keras.utils import np_utils 22 | os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" 23 | os.environ["CUDA_VISIBLE_DEVICES"] = "0" 24 | 25 | try: 26 | from load_image import get_next_batch_from_path, shuffle_train_data 27 | except: 28 | from load_image.load_image import get_next_batch_from_path, shuffle_train_data 29 | from load_image import load_image 30 | 31 | # net_arch 32 | from net.z_build_net import build_net 33 | 34 | 35 | 36 | def g_parameter(checkpoint_exclude_scopes): 37 | exclusions = [] 38 | if checkpoint_exclude_scopes: 39 | exclusions = [scope.strip() for scope in checkpoint_exclude_scopes.split(',')] 40 | print (exclusions) 41 | # 需要加载的参数。 42 | variables_to_restore = [] 43 | # 需要训练的参数 44 | variables_to_train = [] 45 | for var in slim.get_model_variables(): 46 | # 切记不要用下边这个,这是个天大的bug,调试了3天。 47 | # for var in tf.trainable_variables(): 48 | excluded = False 49 | for exclusion in exclusions: 50 | if var.op.name.startswith(exclusion): 51 | excluded = True 52 | variables_to_train.append(var) 53 | print ("ok") 54 | print (var.op.name) 55 | break 56 | if not excluded: 57 | variables_to_restore.append(var) 58 | return variables_to_restore,variables_to_train 59 | 60 | 61 | def train(train_data,train_label,valid_data,valid_label,train_n,valid_n,IMAGE_HEIGHT,IMAGE_WIDTH,learning_rate,num_classes,epoch,EARLY_STOP_PATIENCE,batch_size=64,keep_prob=0.8, 62 | arch_model="arch_inception_v4",checkpoint_exclude_scopes="Logits_out", checkpoint_path="pretrain/inception_v4/inception_v4.ckpt"): 63 | 64 | X = tf.placeholder(tf.float32, [None, IMAGE_HEIGHT, IMAGE_WIDTH, 3]) 65 | #Y = tf.placeholder(tf.float32, [None, 4]) 66 | Y = tf.placeholder(tf.float32, [None, num_classes]) 67 | is_training = tf.placeholder(tf.bool, name='is_training') 68 | k_prob = tf.placeholder(tf.float32) # dropout 69 | 70 | # 定义模型 71 | net = build_net.net_arch() 72 | if arch_model == "arch_inception_v4": 73 | net = net.arch_inception_v4(X, num_classes, k_prob, is_training) 74 | elif arch_model == "arch_resnet_v2_50": 75 | net = net.arch_resnet_v2_50(X, num_classes, k_prob, is_training) 76 | elif arch_model == "vgg_16": 77 | net = net.arch_vgg16(X, num_classes, k_prob, is_training) 78 | elif arch_model == "arch_inception_v4_rnn": 79 | net = net.arch_inception_v4_rnn(X, num_classes, k_prob, is_training) 80 | elif arch_model == "arch_inception_v4_rnn_attention": 81 | net = net.arch_inception_v4_rnn_attention(X, num_classes, k_prob, is_training) 82 | 83 | # 84 | variables_to_restore,variables_to_train = g_parameter(checkpoint_exclude_scopes) 85 | 86 | # loss function 87 | loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels = Y, logits = net)) 88 | # loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels = Y, logits = net)) 89 | 90 | var_list = variables_to_train 91 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) 92 | with tf.control_dependencies(update_ops): 93 | optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss, var_list=var_list) 94 | predict = tf.reshape(net, [-1, num_classes]) 95 | max_idx_p = tf.argmax(predict, 1) 96 | max_idx_l = tf.argmax(Y, 1) 97 | correct_pred = tf.equal(max_idx_p, max_idx_l) 98 | accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32)) 99 | 100 | # tensorboard 101 | with tf.name_scope('tmp/'): 102 | 
tf.summary.scalar('loss', loss) 103 | tf.summary.scalar('accuracy', accuracy) 104 | summary_op = tf.summary.merge_all() 105 | #------------------------------------------------------------------------------------# 106 | sess = tf.Session() 107 | init = tf.global_variables_initializer() 108 | sess.run(init) 109 | 110 | # 111 | log_dir = arch_model + '_log' 112 | if not os.path.exists(log_dir): 113 | os.makedirs(log_dir) 114 | writer = tf.summary.FileWriter(log_dir, sess.graph) 115 | 116 | saver2 = tf.train.Saver(tf.global_variables()) 117 | model_path = 'model/fine-tune' 118 | 119 | net_vars = variables_to_restore 120 | saver_net = tf.train.Saver(net_vars) 121 | # checkpoint_path = 'pretrain/inception_v4.ckpt' 122 | saver_net.restore(sess, checkpoint_path) 123 | 124 | # early stopping 125 | best_valid = np.inf 126 | best_valid_epoch = 0 127 | 128 | # saver2.restore(sess, "model/fine-tune-1120") 129 | for epoch_i in range(epoch): 130 | for batch_i in range(int(train_n/batch_size)): 131 | 132 | images_train, labels_train = get_next_batch_from_path(train_data, train_label, batch_i, IMAGE_HEIGHT, IMAGE_WIDTH, batch_size=batch_size, is_train=True) 133 | 134 | los, _ = sess.run([loss,optimizer], feed_dict={X: images_train, Y: labels_train, k_prob:keep_prob, is_training:True}) 135 | # print (los) 136 | 137 | if batch_i % 100 == 0: 138 | images_valid, labels_valid = get_next_batch_from_path(valid_data, valid_label, batch_i%(int(valid_n/batch_size)), IMAGE_HEIGHT, IMAGE_WIDTH, batch_size=batch_size, is_train=False) 139 | ls, acc = sess.run([loss, accuracy], feed_dict={X: images_valid, Y: labels_valid, k_prob:1.0, is_training:False}) 140 | print('Batch: {:>2}: Validation loss: {:>3.5f}, Validation accuracy: {:>3.5f}'.format(batch_i, ls, acc)) 141 | #if acc > 0.90: 142 | # saver2.save(sess, model_path, global_step=batch_i, write_meta_graph=False) 143 | elif batch_i % 20 == 0: 144 | loss_, acc_, summary_str = sess.run([loss, accuracy, summary_op], feed_dict={X: images_train, Y: labels_train, k_prob:1.0, is_training:False}) 145 | writer.add_summary(summary_str, global_step=((int(train_n/batch_size))*epoch_i+batch_i)) 146 | print('Batch: {:>2}: Training loss: {:>3.5f}, Training accuracy: {:>3.5f}'.format(batch_i, loss_, acc_)) 147 | 148 | print('Epoch===================================>: {:>2}'.format(epoch_i)) 149 | valid_ls = 0 150 | valid_acc = 0 151 | for batch_i in range(int(valid_n/batch_size)): 152 | images_valid, labels_valid = get_next_batch_from_path(valid_data, valid_label, batch_i, IMAGE_HEIGHT, IMAGE_WIDTH, batch_size=batch_size, is_train=False) 153 | epoch_ls, epoch_acc = sess.run([loss, accuracy], feed_dict={X: images_valid, Y: labels_valid, k_prob:1.0, is_training:False}) 154 | valid_ls = valid_ls + epoch_ls 155 | valid_acc = valid_acc + epoch_acc 156 | print('Epoch: {:>2}: Validation loss: {:>3.5f}, Validation accuracy: {:>3.5f}'.format(epoch_i, valid_ls/int(valid_n/batch_size), valid_acc/int(valid_n/batch_size))) 157 | if valid_acc/int(valid_n/batch_size) > 0.90: 158 | saver2.save(sess, model_path, global_step=epoch_i, write_meta_graph=False) 159 | loss_valid = valid_ls/int(valid_n/batch_size) 160 | if loss_valid < best_valid: 161 | best_valid = loss_valid 162 | best_valid_epoch = epoch_i 163 | elif best_valid_epoch + EARLY_STOP_PATIENCE < epoch_i: 164 | print("Early stopping.") 165 | print("Best valid loss was {:.6f} at epoch {}.".format(best_valid, best_valid_epoch)) 166 | break 167 | if valid_acc/int(valid_n/batch_size) > 0.90: 168 | saver2.save(sess, model_path, global_step=epoch_i, 
write_meta_graph=False) 169 | 170 | print('>>>>>>>>>>>>>>>>>>>shuffle train_data<<<<<<<<<<<<<<<<<') 171 | # 每个epoch,重新打乱一次训练集: 172 | train_data, train_label = shuffle_train_data(train_data, train_label) 173 | writer.close() 174 | sess.close() 175 | 176 | if __name__ == '__main__': 177 | 178 | IMAGE_HEIGHT = 299 179 | IMAGE_WIDTH = 299 180 | num_classes = 4 181 | # epoch 182 | epoch = 100 183 | batch_size = 16 184 | # 模型的学习率 185 | learning_rate = 0.00001 186 | keep_prob = 0.8 187 | 188 | 189 | ##----------------------------------------------------------------------------## 190 | # 设置训练样本的占总样本的比例: 191 | train_rate = 0.9 192 | # 每个类别保存到一个文件中,放在此目录下,只要是二级目录就可以。 193 | craterDir = "train" 194 | # arch_model="arch_inception_v4"; arch_model="arch_resnet_v2_50"; arch_model="vgg_16" 195 | arch_model="arch_inception_v4" 196 | checkpoint_exclude_scopes = "Logits_out" 197 | checkpoint_path="pretrain/inception_v4/inception_v4.ckpt" 198 | 199 | 200 | ##----------------------------------------------------------------------------## 201 | print ("-----------------------------load_image.py start--------------------------") 202 | # 准备训练数据 203 | all_image = load_image.load_image(craterDir, train_rate) 204 | train_data, train_label, valid_data, valid_label= all_image.gen_train_valid_image() 205 | image_n = all_image.image_n 206 | # 样本的总数量 207 | print ("样本的总数量:") 208 | print (image_n) 209 | # 定义90%作为训练样本 210 | train_n = all_image.train_n 211 | valid_n = all_image.valid_n 212 | # ont-hot 213 | train_label = np_utils.to_categorical(train_label, num_classes) 214 | valid_label = np_utils.to_categorical(valid_label, num_classes) 215 | ##----------------------------------------------------------------------------## 216 | 217 | print ("-----------------------------train.py start--------------------------") 218 | train(train_data,train_label,valid_data,valid_label,train_n,valid_n,IMAGE_HEIGHT,IMAGE_WIDTH,learning_rate,num_classes,epoch,batch_size,keep_prob, 219 | arch_model, checkpoint_exclude_scopes, checkpoint_path) 220 | -------------------------------------------------------------------------------- /z_ckpt_pb/ckpt_pb.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on 2017 10.17 4 | @author: liupeng 5 | wechat: lp9628 6 | blog: http://blog.csdn.net/u014365862/article/details/78422372 7 | """ 8 | 9 | 10 | import tensorflow as tf 11 | slim = tf.contrib.slim 12 | import os.path 13 | import argparse 14 | from tensorflow.python.framework import graph_util 15 | from inception_v4 import * 16 | from inception_preprocessing import * 17 | 18 | 19 | 20 | MODEL_DIR = "model/" 21 | MODEL_NAME = "frozen_model.pb" 22 | 23 | if not tf.gfile.Exists(MODEL_DIR): #创建目录 24 | tf.gfile.MakeDirs(MODEL_DIR) 25 | 26 | batch_size = 32 27 | height, width = 299, 299 28 | num_classes = 3 29 | X = tf.placeholder(tf.float32, [None, height, width, 3], name = "inputs_placeholder") 30 | ''' 31 | X = tf.placeholder(tf.uint8, [None, None, 3],name = "inputs_placeholder") 32 | X = tf.image.encode_jpeg(X, format='rgb') # 单通道用 'grayscale' 33 | X = tf.image.decode_jpeg(X, channels=3) 34 | X = preprocess_for_eval(X, 299,299) 35 | X = tf.reshape(X, [-1,299,299,3])''' 36 | Y = tf.placeholder(tf.float32, [None, num_classes]) 37 | #keep_prob = tf.placeholder(tf.float32) # dropout 38 | #keep_prob_fc = tf.placeholder(tf.float32) # dropout 39 | arg_scope = inception_v4_arg_scope() 40 | with slim.arg_scope(arg_scope): 41 | net, end_points = inception_v4(X, is_training=False) 42 | 
#sess1 = tf.Session() 43 | #saver1 = tf.train.Saver(tf.global_variables()) 44 | #checkpoint_path = 'model/inception_v4.ckpt' 45 | #saver1.restore(sess1, checkpoint_path) 46 | with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], stride=1, padding='SAME'): 47 | with tf.variable_scope('Logits_out'): 48 | # 8 x 8 x 1536 49 | net = slim.avg_pool2d(net, net.get_shape()[1:3], padding='VALID', 50 | scope='AvgPool_1a_out') 51 | # 1 x 1 x 1536 52 | dropout_keep_prob = 1.0 53 | net = slim.dropout(net, dropout_keep_prob, scope='Dropout_1b_out') 54 | net = slim.flatten(net, scope='PreLogitsFlatten_out') 55 | # 1536 56 | net = slim.fully_connected(net, 256, activation_fn=tf.nn.relu, scope='Logits_out0') 57 | net = slim.fully_connected(net, num_classes, activation_fn=None,scope='Logits_out1') 58 | # net = tf.nn.softmax(net) 59 | net = tf.nn.sigmoid(net) 60 | predict = tf.reshape(net, [-1, num_classes], name='predictions') 61 | 62 | for var in tf.trainable_variables(): 63 | print (var.op.name) 64 | 65 | 66 | def freeze_graph(model_folder): 67 | #checkpoint = tf.train.get_checkpoint_state(model_folder) #检查目录下ckpt文件状态是否可用 68 | #input_checkpoint = checkpoint.model_checkpoint_path #得ckpt文件路径 69 | input_checkpoint = model_folder 70 | output_graph = os.path.join(MODEL_DIR, MODEL_NAME) #PB模型保存路径 71 | 72 | output_node_names = "predictions" #原模型输出操作节点的名字 73 | #saver = tf.train.import_meta_graph(input_checkpoint + '.meta', clear_devices=True) #得到图、clear_devices :Whether or not to clear the device field for an `Operation` or `Tensor` during import. 74 | saver = tf.train.Saver() 75 | 76 | graph = tf.get_default_graph() #获得默认的图 77 | input_graph_def = graph.as_graph_def() #返回一个序列化的图代表当前的图 78 | 79 | with tf.Session() as sess: 80 | sess.run(tf.initialize_all_variables()) 81 | saver.restore(sess, input_checkpoint) #恢复图并得到数据 82 | 83 | #print "predictions : ", sess.run("predictions:0", feed_dict={"input_holder:0": [10.0]}) # 测试读出来的模型是否正确,注意这里传入的是输出 和输入 节点的 tensor的名字,不是操作节点的名字 84 | 85 | output_graph_def = graph_util.convert_variables_to_constants( #模型持久化,将变量值固定 86 | sess, 87 | input_graph_def, 88 | output_node_names.split(",") #如果有多个输出节点,以逗号隔开 89 | ) 90 | with tf.gfile.GFile(output_graph, "wb") as f: #保存模型 91 | f.write(output_graph_def.SerializeToString()) #序列化输出 92 | print("%d ops in the final graph." 
% len(output_graph_def.node)) #得到当前图有几个操作节点 93 | 94 | for op in graph.get_operations(): 95 | #print(op.name, op.values()) 96 | print("name:",op.name) 97 | print ("success!") 98 | 99 | 100 | #下面是用于测试, 读取pd模型,答应每个变量的名字。 101 | graph = load_graph("model/frozen_model.pb") 102 | for op in graph.get_operations(): 103 | #print(op.name, op.values()) 104 | print("name111111111111:",op.name) 105 | pred = graph.get_tensor_by_name('prefix/inputs_placeholder:0') 106 | print (pred) 107 | temp = graph.get_tensor_by_name('prefix/predictions:0') 108 | print (temp) 109 | 110 | def load_graph(frozen_graph_filename): 111 | # We load the protobuf file from the disk and parse it to retrieve the 112 | # unserialized graph_def 113 | with tf.gfile.GFile(frozen_graph_filename, "rb") as f: 114 | graph_def = tf.GraphDef() 115 | graph_def.ParseFromString(f.read()) 116 | 117 | # Then, we can use again a convenient built-in function to import a graph_def into the 118 | # current default Graph 119 | with tf.Graph().as_default() as graph: 120 | tf.import_graph_def( 121 | graph_def, 122 | input_map=None, 123 | return_elements=None, 124 | name="prefix", 125 | op_dict=None, 126 | producer_op_list=None 127 | ) 128 | return graph 129 | 130 | if __name__ == '__main__': 131 | parser = argparse.ArgumentParser() 132 | parser.add_argument("model_folder", type=str, help="input ckpt model dir", default="model/cnn_model-1700") #命令行解析,help是提示符,type是输入的类型, 133 | # 这里运行程序时需要带上模型ckpt的路径,不然会报 error: too few arguments 134 | aggs = parser.parse_args() 135 | freeze_graph(aggs.model_folder) 136 | # freeze_graph("model/ckpt") #模型目录 137 | # python ckpt_pb.py "model/fine-tune-160" 138 | -------------------------------------------------------------------------------- /z_ckpt_pb/img_preprocessing.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on 2017 10.17 4 | @author: liupeng 5 | wechat: lp9628 6 | blog: http://blog.csdn.net/u014365862/article/details/78422372 7 | """ 8 | ''' 9 | 在进行训练之前要将训练数据筛选一下; 10 | 是不是为空;并且另存为jpg格式; 11 | ''' 12 | 13 | import numpy as np 14 | import tensorflow as tf 15 | import numpy as np 16 | import os 17 | from PIL import Image 18 | import cv2 19 | #from vgg_preprocessing import* 20 | import tensorflow as tf 21 | # 适用于二级目录 。。。/图片类别文件/图片(.png ,jpg等) 22 | 23 | 24 | #input = tf.placeholder(tf.float32, [None, None, 3]) 25 | #out = preprocess_image(input, 224,224, is_training=False) 26 | #out = preprocess_image(input, 224,224, is_training=True) 27 | #sess = tf.Session() 28 | def load_img(imgDir,imgFoldName, img_label): 29 | imgs = os.listdir(imgDir+imgFoldName) 30 | imgNum = len(imgs) 31 | data = []#np.empty((imgNum,224,224,3),dtype="float32") 32 | label = []#np.empty((imgNum,),dtype="uint8") 33 | for i in range (imgNum): 34 | img = cv2.imread(imgDir+imgFoldName+"/"+imgs[i]) 35 | #for j in range(1): 36 | if img is not None: 37 | #img_ = sess.run(out, feed_dict={input: img}) 38 | img_ = img 39 | # img_ = cv2.resize(img_, (960, 540)) 40 | #save_path = "train/"+imgFoldName+"/"+imgs[i] 41 | save_path = dir_path+imgFoldName+"/"+imgs[i] 42 | save_path = save_path.split('.')[0] 43 | #save_path = save_path + str(j) + '.jpg' 44 | save_path = save_path + '.jpg' 45 | print (save_path) 46 | cv2.imwrite(save_path, img_) 47 | '''img = cv2.resize(img, (224, 224)) 48 | arr = np.asarray(img,dtype="float32") 49 | data[i,:,:,:] = arr 50 | # label[i] = int(imgs[i].split('.')[0]) 51 | label[i] = int(img_label)''' 52 | return data,label 53 | ''' 54 | craterDir = 
"train/" 55 | foldName = "0male" 56 | data, label = load_Img(craterDir,foldName, 0) 57 | 58 | print (data[0].shape) 59 | print (label[0])''' 60 | 61 | def load_database(imgDir): 62 | img_path = os.listdir(imgDir) 63 | train_imgs = [] 64 | train_labels = [] 65 | for i, path in enumerate(img_path): 66 | craterDir = imgDir + '/' 67 | foldName = path 68 | data, label = load_img(craterDir,foldName, i) 69 | train_imgs.extend(data) 70 | train_labels.extend(label) 71 | #打乱数据集 72 | index = [i for i in range(len(train_imgs))] 73 | np.random.shuffle(index) 74 | train_imgs = np.asarray(train_imgs) 75 | train_labels = np.asarray(train_labels) 76 | train_imgs = train_imgs[index] 77 | train_labels = train_labels[index] 78 | return train_imgs, train_labels 79 | 80 | 81 | def get_next_batch(train_imgs, train_labels, pointer, batch_size=64): 82 | batch_x = np.zeros([batch_size, 224,224,3]) 83 | batch_y = np.zeros([batch_size, ]) 84 | for i in range(batch_size): 85 | #image = cv2.imread(image_path[i+pointer*batch_size]) 86 | #image = cv2.resize(image, (IMAGE_WIDTH, IMAGE_HEIGHT)) 87 | #image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) 88 | #image = Image.open(image_path[i+pointer*batch_size]) 89 | #image = image.resize((IMAGE_WIDTH, IMAGE_HEIGHT)) 90 | #image = image.convert('L') 91 | #大神说,转成数组再搞 92 | #image=np.array(image) 93 | # 94 | image = train_imgs[i+pointer*batch_size] 95 | ''' 96 | m = image.mean() 97 | s = image.std() 98 | min_s = 1.0/(np.sqrt(image.shape[0]*image.shape[1])) 99 | std = max(min_s, s) 100 | image = (image-m)/std''' 101 | image = (image-127.5) 102 | 103 | batch_x[i,:,:,:] = image 104 | # print labels[i+pointer*batch_size] 105 | batch_y[i] = train_labels[i+pointer*batch_size] 106 | return batch_x, batch_y 107 | 108 | 109 | def test(): 110 | 111 | craterDir = "gender" 112 | global dir_path 113 | dir_path = "train_gender/" 114 | #dir_path = "train/" 115 | data, label = load_database(craterDir) 116 | #dir = "/1female" 117 | #data, label = load_img(craterDir,dir,0) 118 | print (data.shape) 119 | print (len(data)) 120 | print (data[0].shape) 121 | print (label[0]) 122 | 123 | 124 | if __name__ == '__main__': 125 | test() 126 | -------------------------------------------------------------------------------- /z_ckpt_pb/inception_preprocessing.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on 2017 10.17 4 | @author: liupeng 5 | wechat: lp9628 6 | blog: http://blog.csdn.net/u014365862/article/details/78422372 7 | """ 8 | 9 | from __future__ import absolute_import 10 | from __future__ import division 11 | from __future__ import print_function 12 | 13 | import tensorflow as tf 14 | 15 | from tensorflow.python.ops import control_flow_ops 16 | 17 | 18 | def apply_with_random_selector(x, func, num_cases): 19 | """Computes func(x, sel), with sel sampled from [0...num_cases-1]. 20 | 21 | Args: 22 | x: input Tensor. 23 | func: Python function to apply. 24 | num_cases: Python int32, number of cases to sample sel from. 25 | 26 | Returns: 27 | The result of func(x, sel), where func receives the value of the 28 | selector as a python integer, but sel is sampled dynamically. 29 | """ 30 | sel = tf.random_uniform([], maxval=num_cases, dtype=tf.int32) 31 | # Pass the real x only to one of the func calls. 
32 | return control_flow_ops.merge([ 33 | func(control_flow_ops.switch(x, tf.equal(sel, case))[1], case) 34 | for case in range(num_cases)])[0] 35 | 36 | 37 | def distort_color(image, color_ordering=0, fast_mode=True, scope=None): 38 | """Distort the color of a Tensor image. 39 | 40 | Each color distortion is non-commutative and thus ordering of the color ops 41 | matters. Ideally we would randomly permute the ordering of the color ops. 42 | Rather then adding that level of complication, we select a distinct ordering 43 | of color ops for each preprocessing thread. 44 | 45 | Args: 46 | image: 3-D Tensor containing single image in [0, 1]. 47 | color_ordering: Python int, a type of distortion (valid values: 0-3). 48 | fast_mode: Avoids slower ops (random_hue and random_contrast) 49 | scope: Optional scope for name_scope. 50 | Returns: 51 | 3-D Tensor color-distorted image on range [0, 1] 52 | Raises: 53 | ValueError: if color_ordering not in [0, 3] 54 | """ 55 | with tf.name_scope(scope, 'distort_color', [image]): 56 | if fast_mode: 57 | if color_ordering == 0: 58 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 59 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 60 | else: 61 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 62 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 63 | else: 64 | if color_ordering == 0: 65 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 66 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 67 | image = tf.image.random_hue(image, max_delta=0.2) 68 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5) 69 | elif color_ordering == 1: 70 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 71 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 72 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5) 73 | image = tf.image.random_hue(image, max_delta=0.2) 74 | elif color_ordering == 2: 75 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5) 76 | image = tf.image.random_hue(image, max_delta=0.2) 77 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 78 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 79 | elif color_ordering == 3: 80 | image = tf.image.random_hue(image, max_delta=0.2) 81 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 82 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5) 83 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 84 | else: 85 | raise ValueError('color_ordering must be in [0, 3]') 86 | 87 | # The random_* ops do not necessarily clamp. 88 | return tf.clip_by_value(image, 0.0, 1.0) 89 | 90 | 91 | def distorted_bounding_box_crop(image, 92 | bbox, 93 | min_object_covered=0.1, 94 | aspect_ratio_range=(0.75, 1.33), 95 | area_range=(0.05, 1.0), 96 | max_attempts=100, 97 | scope=None): 98 | """Generates cropped_image using a one of the bboxes randomly distorted. 99 | 100 | See `tf.image.sample_distorted_bounding_box` for more documentation. 101 | 102 | Args: 103 | image: 3-D Tensor of image (it will be converted to floats in [0, 1]). 104 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords] 105 | where each coordinate is [0, 1) and the coordinates are arranged 106 | as [ymin, xmin, ymax, xmax]. If num_boxes is 0 then it would use the whole 107 | image. 108 | min_object_covered: An optional `float`. Defaults to `0.1`. 
The cropped 109 | area of the image must contain at least this fraction of any bounding box 110 | supplied. 111 | aspect_ratio_range: An optional list of `floats`. The cropped area of the 112 | image must have an aspect ratio = width / height within this range. 113 | area_range: An optional list of `floats`. The cropped area of the image 114 | must contain a fraction of the supplied image within in this range. 115 | max_attempts: An optional `int`. Number of attempts at generating a cropped 116 | region of the image of the specified constraints. After `max_attempts` 117 | failures, return the entire image. 118 | scope: Optional scope for name_scope. 119 | Returns: 120 | A tuple, a 3-D Tensor cropped_image and the distorted bbox 121 | """ 122 | with tf.name_scope(scope, 'distorted_bounding_box_crop', [image, bbox]): 123 | # Each bounding box has shape [1, num_boxes, box coords] and 124 | # the coordinates are ordered [ymin, xmin, ymax, xmax]. 125 | 126 | # A large fraction of image datasets contain a human-annotated bounding 127 | # box delineating the region of the image containing the object of interest. 128 | # We choose to create a new bounding box for the object which is a randomly 129 | # distorted version of the human-annotated bounding box that obeys an 130 | # allowed range of aspect ratios, sizes and overlap with the human-annotated 131 | # bounding box. If no box is supplied, then we assume the bounding box is 132 | # the entire image. 133 | sample_distorted_bounding_box = tf.image.sample_distorted_bounding_box( 134 | tf.shape(image), 135 | bounding_boxes=bbox, 136 | min_object_covered=min_object_covered, 137 | aspect_ratio_range=aspect_ratio_range, 138 | area_range=area_range, 139 | max_attempts=max_attempts, 140 | use_image_if_no_bounding_boxes=True) 141 | bbox_begin, bbox_size, distort_bbox = sample_distorted_bounding_box 142 | 143 | # Crop the image to the specified bounding box. 144 | cropped_image = tf.slice(image, bbox_begin, bbox_size) 145 | return cropped_image, distort_bbox 146 | 147 | 148 | def preprocess_for_train(image, height, width, bbox, 149 | fast_mode=True, 150 | scope=None): 151 | """Distort one image for training a network. 152 | 153 | Distorting images provides a useful technique for augmenting the data 154 | set during training in order to make the network invariant to aspects 155 | of the image that do not effect the label. 156 | 157 | Additionally it would create image_summaries to display the different 158 | transformations applied to the image. 159 | 160 | Args: 161 | image: 3-D Tensor of image. If dtype is tf.float32 then the range should be 162 | [0, 1], otherwise it would converted to tf.float32 assuming that the range 163 | is [0, MAX], where MAX is largest positive representable number for 164 | int(8/16/32) data type (see `tf.image.convert_image_dtype` for details). 165 | height: integer 166 | width: integer 167 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords] 168 | where each coordinate is [0, 1) and the coordinates are arranged 169 | as [ymin, xmin, ymax, xmax]. 170 | fast_mode: Optional boolean, if True avoids slower transformations (i.e. 171 | bi-cubic resizing, random_hue or random_contrast). 172 | scope: Optional scope for name_scope. 173 | Returns: 174 | 3-D float Tensor of distorted image used for training with range [-1, 1]. 
175 | """ 176 | with tf.name_scope(scope, 'distort_image', [image, height, width, bbox]): 177 | if bbox is None: 178 | bbox = tf.constant([0.0, 0.0, 1.0, 1.0], 179 | dtype=tf.float32, 180 | shape=[1, 1, 4]) 181 | if image.dtype != tf.float32: 182 | image = tf.image.convert_image_dtype(image, dtype=tf.float32) 183 | # Each bounding box has shape [1, num_boxes, box coords] and 184 | # the coordinates are ordered [ymin, xmin, ymax, xmax]. 185 | image_with_box = tf.image.draw_bounding_boxes(tf.expand_dims(image, 0), 186 | bbox) 187 | tf.summary.image('image_with_bounding_boxes', image_with_box) 188 | 189 | distorted_image, distorted_bbox = distorted_bounding_box_crop(image, bbox) 190 | # Restore the shape since the dynamic slice based upon the bbox_size loses 191 | # the third dimension. 192 | distorted_image.set_shape([None, None, 3]) 193 | image_with_distorted_box = tf.image.draw_bounding_boxes( 194 | tf.expand_dims(image, 0), distorted_bbox) 195 | tf.summary.image('images_with_distorted_bounding_box', 196 | image_with_distorted_box) 197 | 198 | # This resizing operation may distort the images because the aspect 199 | # ratio is not respected. We select a resize method in a round robin 200 | # fashion based on the thread number. 201 | # Note that ResizeMethod contains 4 enumerated resizing methods. 202 | 203 | # We select only 1 case for fast_mode bilinear. 204 | num_resize_cases = 1 if fast_mode else 4 205 | distorted_image = apply_with_random_selector( 206 | distorted_image, 207 | lambda x, method: tf.image.resize_images(x, [height, width], method=method), 208 | num_cases=num_resize_cases) 209 | 210 | tf.summary.image('cropped_resized_image', 211 | tf.expand_dims(distorted_image, 0)) 212 | 213 | # Randomly flip the image horizontally. 214 | distorted_image = tf.image.random_flip_left_right(distorted_image) 215 | 216 | # Randomly distort the colors. There are 4 ways to do it. 217 | distorted_image = apply_with_random_selector( 218 | distorted_image, 219 | lambda x, ordering: distort_color(x, ordering, fast_mode), 220 | num_cases=4) 221 | 222 | tf.summary.image('final_distorted_image', 223 | tf.expand_dims(distorted_image, 0)) 224 | distorted_image = tf.subtract(distorted_image, 0.5) 225 | distorted_image = tf.multiply(distorted_image, 2.0) 226 | return distorted_image 227 | 228 | 229 | def preprocess_for_eval(image, height, width, 230 | central_fraction=0.875, scope=None): 231 | """Prepare one image for evaluation. 232 | 233 | If height and width are specified it would output an image with that size by 234 | applying resize_bilinear. 235 | 236 | If central_fraction is specified it would crop the central fraction of the 237 | input image. 238 | 239 | Args: 240 | image: 3-D Tensor of image. If dtype is tf.float32 then the range should be 241 | [0, 1], otherwise it would converted to tf.float32 assuming that the range 242 | is [0, MAX], where MAX is largest positive representable number for 243 | int(8/16/32) data type (see `tf.image.convert_image_dtype` for details) 244 | height: integer 245 | width: integer 246 | central_fraction: Optional Float, fraction of the image to crop. 247 | scope: Optional scope for name_scope. 248 | Returns: 249 | 3-D float Tensor of prepared image. 250 | """ 251 | with tf.name_scope(scope, 'eval_image', [image, height, width]): 252 | if image.dtype != tf.float32: 253 | image = tf.image.convert_image_dtype(image, dtype=tf.float32) 254 | # Crop the central region of the image with an area containing 87.5% of 255 | # the original image. 
256 | if central_fraction: 257 | image = tf.image.central_crop(image, central_fraction=central_fraction) 258 | 259 | if height and width: 260 | # Resize the image to the specified height and width. 261 | image = tf.expand_dims(image, 0) 262 | image = tf.image.resize_bilinear(image, [height, width], 263 | align_corners=False) 264 | image = tf.squeeze(image, [0]) 265 | image = tf.subtract(image, 0.5) 266 | image = tf.multiply(image, 2.0) 267 | return image 268 | 269 | 270 | def preprocess_image(image, height, width, 271 | is_training=False, 272 | bbox=None, 273 | fast_mode=True): 274 | """Pre-process one image for training or evaluation. 275 | 276 | Args: 277 | image: 3-D Tensor [height, width, channels] with the image. 278 | height: integer, image expected height. 279 | width: integer, image expected width. 280 | is_training: Boolean. If true it would transform an image for train, 281 | otherwise it would transform it for evaluation. 282 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords] 283 | where each coordinate is [0, 1) and the coordinates are arranged as 284 | [ymin, xmin, ymax, xmax]. 285 | fast_mode: Optional boolean, if True avoids slower transformations. 286 | 287 | Returns: 288 | 3-D float Tensor containing an appropriately scaled image 289 | 290 | Raises: 291 | ValueError: if user does not provide bounding box 292 | """ 293 | if is_training: 294 | return preprocess_for_train(image, height, width, bbox, fast_mode) 295 | else: 296 | return preprocess_for_eval(image, height, width) 297 | -------------------------------------------------------------------------------- /z_ckpt_pb/inception_utils.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on 2017 10.17 4 | @author: liupeng 5 | wechat: lp9628 6 | blog: http://blog.csdn.net/u014365862/article/details/78422372 7 | """ 8 | 9 | from __future__ import absolute_import 10 | from __future__ import division 11 | from __future__ import print_function 12 | 13 | import tensorflow as tf 14 | 15 | slim = tf.contrib.slim 16 | 17 | 18 | def inception_arg_scope(weight_decay=0.00004, 19 | use_batch_norm=True, 20 | batch_norm_decay=0.9997, 21 | batch_norm_epsilon=0.001): 22 | """Defines the default arg scope for inception models. 23 | Args: 24 | weight_decay: The weight decay to use for regularizing the model. 25 | use_batch_norm: "If `True`, batch_norm is applied after each convolution. 26 | batch_norm_decay: Decay for batch norm moving average. 27 | batch_norm_epsilon: Small float added to variance to avoid dividing by zero 28 | in batch norm. 29 | Returns: 30 | An `arg_scope` to use for the inception models. 31 | """ 32 | batch_norm_params = { 33 | # Decay for the moving averages. 34 | 'decay': batch_norm_decay, 35 | # epsilon to prevent 0s in variance. 36 | 'epsilon': batch_norm_epsilon, 37 | # collection containing update_ops. 38 | 'updates_collections': tf.GraphKeys.UPDATE_OPS, 39 | } 40 | if use_batch_norm: 41 | normalizer_fn = slim.batch_norm 42 | normalizer_params = batch_norm_params 43 | else: 44 | normalizer_fn = None 45 | normalizer_params = {} 46 | # Set weight_decay for weights in Conv and FC layers. 
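# Note on the scopes below: the outer arg_scope attaches the L2 weight regularizer to both conv and fully-connected layers, while the nested conv-only scope additionally sets the variance-scaling initializer, ReLU activation and (when use_batch_norm is True) batch normalization; the innermost scope object is what the function returns for callers to re-enter.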
47 | with slim.arg_scope([slim.conv2d, slim.fully_connected], 48 | weights_regularizer=slim.l2_regularizer(weight_decay)): 49 | with slim.arg_scope( 50 | [slim.conv2d], 51 | weights_initializer=slim.variance_scaling_initializer(), 52 | activation_fn=tf.nn.relu, 53 | normalizer_fn=normalizer_fn, 54 | normalizer_params=normalizer_params) as sc: 55 | return sc 56 | -------------------------------------------------------------------------------- /z_ckpt_pb/inception_v4.py: -------------------------------------------------------------------------------- 1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================== 15 | """Contains the definition of the Inception V4 architecture. 16 | As described in http://arxiv.org/abs/1602.07261. 17 | Inception-v4, Inception-ResNet and the Impact of Residual Connections 18 | on Learning 19 | Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Alex Alemi 20 | """ 21 | from __future__ import absolute_import 22 | from __future__ import division 23 | from __future__ import print_function 24 | 25 | import tensorflow as tf 26 | 27 | import inception_utils 28 | 29 | slim = tf.contrib.slim 30 | 31 | 32 | def block_inception_a(inputs, scope=None, reuse=None): 33 | """Builds Inception-A block for Inception v4 network.""" 34 | # By default use stride=1 and SAME padding 35 | with slim.arg_scope([slim.conv2d, slim.avg_pool2d, slim.max_pool2d], 36 | stride=1, padding='SAME'): 37 | with tf.variable_scope(scope, 'BlockInceptionA', [inputs], reuse=reuse): 38 | with tf.variable_scope('Branch_0'): 39 | branch_0 = slim.conv2d(inputs, 96, [1, 1], scope='Conv2d_0a_1x1') 40 | with tf.variable_scope('Branch_1'): 41 | branch_1 = slim.conv2d(inputs, 64, [1, 1], scope='Conv2d_0a_1x1') 42 | branch_1 = slim.conv2d(branch_1, 96, [3, 3], scope='Conv2d_0b_3x3') 43 | with tf.variable_scope('Branch_2'): 44 | branch_2 = slim.conv2d(inputs, 64, [1, 1], scope='Conv2d_0a_1x1') 45 | branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3') 46 | branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0c_3x3') 47 | with tf.variable_scope('Branch_3'): 48 | branch_3 = slim.avg_pool2d(inputs, [3, 3], scope='AvgPool_0a_3x3') 49 | branch_3 = slim.conv2d(branch_3, 96, [1, 1], scope='Conv2d_0b_1x1') 50 | return tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3]) 51 | 52 | 53 | def block_reduction_a(inputs, scope=None, reuse=None): 54 | """Builds Reduction-A block for Inception v4 network.""" 55 | # By default use stride=1 and SAME padding 56 | with slim.arg_scope([slim.conv2d, slim.avg_pool2d, slim.max_pool2d], 57 | stride=1, padding='SAME'): 58 | with tf.variable_scope(scope, 'BlockReductionA', [inputs], reuse=reuse): 59 | with tf.variable_scope('Branch_0'): 60 | branch_0 = slim.conv2d(inputs, 384, [3, 3], stride=2, padding='VALID', 61 | scope='Conv2d_1a_3x3') 62 | with tf.variable_scope('Branch_1'): 63 | 
branch_1 = slim.conv2d(inputs, 192, [1, 1], scope='Conv2d_0a_1x1') 64 | branch_1 = slim.conv2d(branch_1, 224, [3, 3], scope='Conv2d_0b_3x3') 65 | branch_1 = slim.conv2d(branch_1, 256, [3, 3], stride=2, 66 | padding='VALID', scope='Conv2d_1a_3x3') 67 | with tf.variable_scope('Branch_2'): 68 | branch_2 = slim.max_pool2d(inputs, [3, 3], stride=2, padding='VALID', 69 | scope='MaxPool_1a_3x3') 70 | return tf.concat(axis=3, values=[branch_0, branch_1, branch_2]) 71 | 72 | 73 | def block_inception_b(inputs, scope=None, reuse=None): 74 | """Builds Inception-B block for Inception v4 network.""" 75 | # By default use stride=1 and SAME padding 76 | with slim.arg_scope([slim.conv2d, slim.avg_pool2d, slim.max_pool2d], 77 | stride=1, padding='SAME'): 78 | with tf.variable_scope(scope, 'BlockInceptionB', [inputs], reuse=reuse): 79 | with tf.variable_scope('Branch_0'): 80 | branch_0 = slim.conv2d(inputs, 384, [1, 1], scope='Conv2d_0a_1x1') 81 | with tf.variable_scope('Branch_1'): 82 | branch_1 = slim.conv2d(inputs, 192, [1, 1], scope='Conv2d_0a_1x1') 83 | branch_1 = slim.conv2d(branch_1, 224, [1, 7], scope='Conv2d_0b_1x7') 84 | branch_1 = slim.conv2d(branch_1, 256, [7, 1], scope='Conv2d_0c_7x1') 85 | with tf.variable_scope('Branch_2'): 86 | branch_2 = slim.conv2d(inputs, 192, [1, 1], scope='Conv2d_0a_1x1') 87 | branch_2 = slim.conv2d(branch_2, 192, [7, 1], scope='Conv2d_0b_7x1') 88 | branch_2 = slim.conv2d(branch_2, 224, [1, 7], scope='Conv2d_0c_1x7') 89 | branch_2 = slim.conv2d(branch_2, 224, [7, 1], scope='Conv2d_0d_7x1') 90 | branch_2 = slim.conv2d(branch_2, 256, [1, 7], scope='Conv2d_0e_1x7') 91 | with tf.variable_scope('Branch_3'): 92 | branch_3 = slim.avg_pool2d(inputs, [3, 3], scope='AvgPool_0a_3x3') 93 | branch_3 = slim.conv2d(branch_3, 128, [1, 1], scope='Conv2d_0b_1x1') 94 | return tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3]) 95 | 96 | 97 | def block_reduction_b(inputs, scope=None, reuse=None): 98 | """Builds Reduction-B block for Inception v4 network.""" 99 | # By default use stride=1 and SAME padding 100 | with slim.arg_scope([slim.conv2d, slim.avg_pool2d, slim.max_pool2d], 101 | stride=1, padding='SAME'): 102 | with tf.variable_scope(scope, 'BlockReductionB', [inputs], reuse=reuse): 103 | with tf.variable_scope('Branch_0'): 104 | branch_0 = slim.conv2d(inputs, 192, [1, 1], scope='Conv2d_0a_1x1') 105 | branch_0 = slim.conv2d(branch_0, 192, [3, 3], stride=2, 106 | padding='VALID', scope='Conv2d_1a_3x3') 107 | with tf.variable_scope('Branch_1'): 108 | branch_1 = slim.conv2d(inputs, 256, [1, 1], scope='Conv2d_0a_1x1') 109 | branch_1 = slim.conv2d(branch_1, 256, [1, 7], scope='Conv2d_0b_1x7') 110 | branch_1 = slim.conv2d(branch_1, 320, [7, 1], scope='Conv2d_0c_7x1') 111 | branch_1 = slim.conv2d(branch_1, 320, [3, 3], stride=2, 112 | padding='VALID', scope='Conv2d_1a_3x3') 113 | with tf.variable_scope('Branch_2'): 114 | branch_2 = slim.max_pool2d(inputs, [3, 3], stride=2, padding='VALID', 115 | scope='MaxPool_1a_3x3') 116 | return tf.concat(axis=3, values=[branch_0, branch_1, branch_2]) 117 | 118 | 119 | def block_inception_c(inputs, scope=None, reuse=None): 120 | """Builds Inception-C block for Inception v4 network.""" 121 | # By default use stride=1 and SAME padding 122 | with slim.arg_scope([slim.conv2d, slim.avg_pool2d, slim.max_pool2d], 123 | stride=1, padding='SAME'): 124 | with tf.variable_scope(scope, 'BlockInceptionC', [inputs], reuse=reuse): 125 | with tf.variable_scope('Branch_0'): 126 | branch_0 = slim.conv2d(inputs, 256, [1, 1], scope='Conv2d_0a_1x1') 127 | with 
tf.variable_scope('Branch_1'): 128 | branch_1 = slim.conv2d(inputs, 384, [1, 1], scope='Conv2d_0a_1x1') 129 | branch_1 = tf.concat(axis=3, values=[ 130 | slim.conv2d(branch_1, 256, [1, 3], scope='Conv2d_0b_1x3'), 131 | slim.conv2d(branch_1, 256, [3, 1], scope='Conv2d_0c_3x1')]) 132 | with tf.variable_scope('Branch_2'): 133 | branch_2 = slim.conv2d(inputs, 384, [1, 1], scope='Conv2d_0a_1x1') 134 | branch_2 = slim.conv2d(branch_2, 448, [3, 1], scope='Conv2d_0b_3x1') 135 | branch_2 = slim.conv2d(branch_2, 512, [1, 3], scope='Conv2d_0c_1x3') 136 | branch_2 = tf.concat(axis=3, values=[ 137 | slim.conv2d(branch_2, 256, [1, 3], scope='Conv2d_0d_1x3'), 138 | slim.conv2d(branch_2, 256, [3, 1], scope='Conv2d_0e_3x1')]) 139 | with tf.variable_scope('Branch_3'): 140 | branch_3 = slim.avg_pool2d(inputs, [3, 3], scope='AvgPool_0a_3x3') 141 | branch_3 = slim.conv2d(branch_3, 256, [1, 1], scope='Conv2d_0b_1x1') 142 | return tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3]) 143 | 144 | 145 | def inception_v4_base(inputs, final_endpoint='Mixed_7d', scope=None): 146 | """Creates the Inception V4 network up to the given final endpoint. 147 | Args: 148 | inputs: a 4-D tensor of size [batch_size, height, width, 3]. 149 | final_endpoint: specifies the endpoint to construct the network up to. 150 | It can be one of [ 'Conv2d_1a_3x3', 'Conv2d_2a_3x3', 'Conv2d_2b_3x3', 151 | 'Mixed_3a', 'Mixed_4a', 'Mixed_5a', 'Mixed_5b', 'Mixed_5c', 'Mixed_5d', 152 | 'Mixed_5e', 'Mixed_6a', 'Mixed_6b', 'Mixed_6c', 'Mixed_6d', 'Mixed_6e', 153 | 'Mixed_6f', 'Mixed_6g', 'Mixed_6h', 'Mixed_7a', 'Mixed_7b', 'Mixed_7c', 154 | 'Mixed_7d'] 155 | scope: Optional variable_scope. 156 | Returns: 157 | logits: the logits outputs of the model. 158 | end_points: the set of end_points from the inception model. 
159 | Raises: 160 | ValueError: if final_endpoint is not set to one of the predefined values, 161 | """ 162 | end_points = {} 163 | 164 | def add_and_check_final(name, net): 165 | end_points[name] = net 166 | return name == final_endpoint 167 | 168 | with tf.variable_scope(scope, 'InceptionV4', [inputs]): 169 | with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], 170 | stride=1, padding='SAME'): 171 | # 299 x 299 x 3 172 | net = slim.conv2d(inputs, 32, [3, 3], stride=2, 173 | padding='VALID', scope='Conv2d_1a_3x3') 174 | if add_and_check_final('Conv2d_1a_3x3', net): return net, end_points 175 | # 149 x 149 x 32 176 | net = slim.conv2d(net, 32, [3, 3], padding='VALID', 177 | scope='Conv2d_2a_3x3') 178 | if add_and_check_final('Conv2d_2a_3x3', net): return net, end_points 179 | # 147 x 147 x 32 180 | net = slim.conv2d(net, 64, [3, 3], scope='Conv2d_2b_3x3') 181 | if add_and_check_final('Conv2d_2b_3x3', net): return net, end_points 182 | # 147 x 147 x 64 183 | with tf.variable_scope('Mixed_3a'): 184 | with tf.variable_scope('Branch_0'): 185 | branch_0 = slim.max_pool2d(net, [3, 3], stride=2, padding='VALID', 186 | scope='MaxPool_0a_3x3') 187 | with tf.variable_scope('Branch_1'): 188 | branch_1 = slim.conv2d(net, 96, [3, 3], stride=2, padding='VALID', 189 | scope='Conv2d_0a_3x3') 190 | net = tf.concat(axis=3, values=[branch_0, branch_1]) 191 | if add_and_check_final('Mixed_3a', net): return net, end_points 192 | 193 | # 73 x 73 x 160 194 | with tf.variable_scope('Mixed_4a'): 195 | with tf.variable_scope('Branch_0'): 196 | branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1') 197 | branch_0 = slim.conv2d(branch_0, 96, [3, 3], padding='VALID', 198 | scope='Conv2d_1a_3x3') 199 | with tf.variable_scope('Branch_1'): 200 | branch_1 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1') 201 | branch_1 = slim.conv2d(branch_1, 64, [1, 7], scope='Conv2d_0b_1x7') 202 | branch_1 = slim.conv2d(branch_1, 64, [7, 1], scope='Conv2d_0c_7x1') 203 | branch_1 = slim.conv2d(branch_1, 96, [3, 3], padding='VALID', 204 | scope='Conv2d_1a_3x3') 205 | net = tf.concat(axis=3, values=[branch_0, branch_1]) 206 | if add_and_check_final('Mixed_4a', net): return net, end_points 207 | 208 | # 71 x 71 x 192 209 | with tf.variable_scope('Mixed_5a'): 210 | with tf.variable_scope('Branch_0'): 211 | branch_0 = slim.conv2d(net, 192, [3, 3], stride=2, padding='VALID', 212 | scope='Conv2d_1a_3x3') 213 | with tf.variable_scope('Branch_1'): 214 | branch_1 = slim.max_pool2d(net, [3, 3], stride=2, padding='VALID', 215 | scope='MaxPool_1a_3x3') 216 | net = tf.concat(axis=3, values=[branch_0, branch_1]) 217 | if add_and_check_final('Mixed_5a', net): return net, end_points 218 | 219 | # 35 x 35 x 384 220 | # 4 x Inception-A blocks 221 | for idx in range(4): 222 | block_scope = 'Mixed_5' + chr(ord('b') + idx) 223 | net = block_inception_a(net, block_scope) 224 | if add_and_check_final(block_scope, net): return net, end_points 225 | 226 | # 35 x 35 x 384 227 | # Reduction-A block 228 | net = block_reduction_a(net, 'Mixed_6a') 229 | if add_and_check_final('Mixed_6a', net): return net, end_points 230 | 231 | # 17 x 17 x 1024 232 | # 7 x Inception-B blocks 233 | for idx in range(7): 234 | block_scope = 'Mixed_6' + chr(ord('b') + idx) 235 | net = block_inception_b(net, block_scope) 236 | if add_and_check_final(block_scope, net): return net, end_points 237 | 238 | # 17 x 17 x 1024 239 | # Reduction-B block 240 | net = block_reduction_b(net, 'Mixed_7a') 241 | if add_and_check_final('Mixed_7a', net): return net, end_points 
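# (Each add_and_check_final(name, net) call above registers the block output in end_points and ends graph construction early once final_endpoint is reached.)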
242 | 243 | # 8 x 8 x 1536 244 | # 3 x Inception-C blocks 245 | for idx in range(3): 246 | block_scope = 'Mixed_7' + chr(ord('b') + idx) 247 | net = block_inception_c(net, block_scope) 248 | if add_and_check_final(block_scope, net): return net, end_points 249 | raise ValueError('Unknown final endpoint %s' % final_endpoint) 250 | 251 | 252 | def inception_v4(inputs, num_classes=None, is_training=True, 253 | dropout_keep_prob=0.8, 254 | reuse=None, 255 | scope='InceptionV4', 256 | create_aux_logits=True): 257 | """Creates the Inception V4 model. 258 | Args: 259 | inputs: a 4-D tensor of size [batch_size, height, width, 3]. 260 | num_classes: number of predicted classes. 261 | is_training: whether is training or not. 262 | dropout_keep_prob: float, the fraction to keep before final layer. 263 | reuse: whether or not the network and its variables should be reused. To be 264 | able to reuse 'scope' must be given. 265 | scope: Optional variable_scope. 266 | create_aux_logits: Whether to include the auxiliary logits. 267 | Returns: 268 | logits: the logits outputs of the model. 269 | end_points: the set of end_points from the inception model. 270 | """ 271 | end_points = {} 272 | with tf.variable_scope(scope, 'InceptionV4', [inputs], reuse=reuse) as scope: 273 | with slim.arg_scope([slim.batch_norm, slim.dropout], 274 | is_training=is_training): 275 | net, end_points = inception_v4_base(inputs, scope=scope) 276 | if num_classes is not None: 277 | with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], 278 | stride=1, padding='SAME'): 279 | # Auxiliary Head logits 280 | if create_aux_logits: 281 | with tf.variable_scope('AuxLogits'): 282 | # 17 x 17 x 1024 283 | aux_logits = end_points['Mixed_6h'] 284 | aux_logits = slim.avg_pool2d(aux_logits, [5, 5], stride=3, 285 | padding='VALID', 286 | scope='AvgPool_1a_5x5') 287 | aux_logits = slim.conv2d(aux_logits, 128, [1, 1], 288 | scope='Conv2d_1b_1x1') 289 | aux_logits = slim.conv2d(aux_logits, 768, 290 | aux_logits.get_shape()[1:3], 291 | padding='VALID', scope='Conv2d_2a') 292 | aux_logits = slim.flatten(aux_logits) 293 | aux_logits = slim.fully_connected(aux_logits, num_classes, 294 | activation_fn=None, 295 | scope='Aux_logits') 296 | end_points['AuxLogits'] = aux_logits 297 | 298 | # Final pooling and prediction 299 | with tf.variable_scope('Logits'): 300 | # 8 x 8 x 1536 301 | net = slim.avg_pool2d(net, net.get_shape()[1:3], padding='VALID', 302 | scope='AvgPool_1a') 303 | # 1 x 1 x 1536 304 | net = slim.dropout(net, dropout_keep_prob, scope='Dropout_1b') 305 | net = slim.flatten(net, scope='PreLogitsFlatten') 306 | end_points['PreLogitsFlatten'] = net 307 | # 1536 308 | logits = slim.fully_connected(net, num_classes, activation_fn=None, 309 | scope='Logits') 310 | end_points['Logits'] = logits 311 | end_points['Predictions'] = tf.nn.softmax(logits, name='Predictions') 312 | else: 313 | logits = net 314 | end_points = end_points 315 | return logits, end_points 316 | inception_v4.default_image_size = 299 317 | 318 | 319 | inception_v4_arg_scope = inception_utils.inception_arg_scope -------------------------------------------------------------------------------- /z_ckpt_pb/test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on 2017 10.17 4 | @author: liupeng""" 5 | 6 | import numpy as np 7 | import numpy as np 8 | import os 9 | from PIL import Image 10 | import cv2 11 | 12 | import csv 13 | import argparse, json, textwrap 14 | import sys 15 | import 
csv 16 | 17 | 18 | def result2num(out, image_path): 19 | # print (out) 20 | dc = out[image_path] 21 | # print (dc) 22 | 23 | if dc.get("help", ""): 24 | print ("help is true!") 25 | dc.pop('help') 26 | print (">>>>>>>>", dc) 27 | 28 | def dict2list(dic:dict): 29 | ''' Convert a dict into a list of (key, value) tuples. ''' 30 | keys = dic.keys() 31 | vals = dic.values() 32 | lst = [(key, val) for key, val in zip(keys, vals)] 33 | return lst 34 | dc = sorted(dict2list(dc), key=lambda d:d[1], reverse=True) 35 | # print (dc[0][0]) 36 | if dc[0][0] == 'NG1': 37 | return 0 38 | if dc[0][0] == 'NG2': 39 | return 1 40 | if dc[0][0] == 'OK': 41 | return 2 42 | 43 | file = open("output.csv", "r") 44 | 45 | err_num = 0 46 | sample_num = 0 47 | for r in file: 48 | sample_num = sample_num + 1 49 | # parse the line into a dict 50 | r = eval(r) 51 | # collect the image paths as a list 52 | image_path = list(r.keys()) 53 | la = 888888888888888 54 | label = str (str(image_path[0]).split('/')[1]) 55 | # print (label) 56 | if label == 'NG1': 57 | la = 0 58 | if label == 'NG2': 59 | la = 1 60 | if label == 'OK': 61 | la = 2 62 | print (la) 63 | 64 | image_path = str(image_path[0]) 65 | res = result2num(r, image_path) 66 | print (res) 67 | 68 | if (la != res): 69 | err_num = err_num + 1 70 | print (sample_num) 71 | print (err_num) 72 | acc_num = sample_num - err_num 73 | print ('accuracy >>> ', acc_num/sample_num) 74 | --------------------------------------------------------------------------------
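Note: below is a minimal inference sketch against the frozen graph written by z_ckpt_pb/ckpt_pb.py. It reuses the tensor names that ckpt_pb.py itself creates and prints ('prefix/inputs_placeholder:0', 'prefix/predictions:0') and the 299x299 input size; the test image path is only a placeholder, and the [-1, 1] scaling mirrors inception_preprocessing.preprocess_for_eval rather than being baked into the frozen graph.

# -*- coding: utf-8 -*-
import cv2
import numpy as np
import tensorflow as tf

def load_frozen_graph(pb_path):
    # Parse the serialized GraphDef and import it under the 'prefix' name scope,
    # the same naming used by load_graph() in ckpt_pb.py.
    with tf.gfile.GFile(pb_path, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name="prefix")
    return graph

graph = load_frozen_graph("model/frozen_model.pb")
inputs = graph.get_tensor_by_name("prefix/inputs_placeholder:0")
predictions = graph.get_tensor_by_name("prefix/predictions:0")

img = cv2.imread("test.jpg")  # placeholder path; any 3-channel image works
img = cv2.resize(img, (299, 299)).astype(np.float32)
img = (img / 255.0 - 0.5) * 2.0  # scale to [-1, 1], matching preprocess_for_eval
with tf.Session(graph=graph) as sess:
    scores = sess.run(predictions, feed_dict={inputs: img[np.newaxis, :]})
    print(scores)  # per-class sigmoid scores, shape (1, num_classes)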