├── .gitignore ├── LICENSE ├── README.md ├── data ├── README.md ├── dbnet-2018 │ └── README.md └── demo │ ├── README.md │ └── data.csv ├── docs ├── logo.jpeg └── pred.jpg ├── evaluate.py ├── models ├── densenet169_io.py ├── densenet169_pm.py ├── densenet169_pn.py ├── inception_v4_io.py ├── inception_v4_pm.py ├── inception_v4_pn.py ├── nvidia_io.py ├── nvidia_pm.py ├── nvidia_pn.py ├── resnet152_io.py ├── resnet152_pm.py └── resnet152_pn.py ├── predict.py ├── provider.py ├── tools ├── README.md ├── img_pre.py ├── las2fmap.py ├── pcd2las.py └── video2img.py ├── train.py ├── train_demo.py └── utils ├── custom_layers.py ├── helper.py ├── pointnet.py ├── tf_util.py └── weights └── README.md /.gitignore: -------------------------------------------------------------------------------- 1 | .vscode/ 2 | *pyc 3 | data/dbnet-2018/train 4 | data/dbnet-2018/val 5 | data/dbnet-2018/test 6 | data/demo/DVR 7 | data/demo/fmap 8 | data/demo/points_16384 9 | logs/ 10 | results/ 11 | dbnet_test.py 12 | utils/weights/*.h5 13 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 
47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. 
You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "[]" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright [yyyy] [name of copyright owner] 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | ![db-prediction](docs/pred.jpg) 4 | 5 | [DBNet](http://www.dbehavior.net/) is a __large-scale driving behavior dataset__, which provides large-scale __high-quality point clouds__ scanned by Velodyne lasers, __high-resolution videos__ recorded by dashboard cameras and __standard drivers' behaviors__ (vehicle speed, steering angle) collected by real-time sensors. 
6 | 
7 | Extensive experiments demonstrate that extra depth information indeed helps networks determine driving policies. We hope DBNet will become a useful resource for the autonomous driving research community.
8 | 
9 | _Created by [Yiping Chen*](https://scholar.google.com/citations?user=e9lv2fUAAAAJ&hl=en), [Jingkang Wang*](https://wangjksjtu.github.io/), [Jonathan Li](https://uwaterloo.ca/mobile-sensing/people-profiles/jonathan-li), [Cewu Lu](http://www.mvig.org/), Zhipeng Luo, HanXue and [Cheng Wang](http://chwang.xmu.edu.cn/). (*equal contribution)_
10 | 
11 | The resources of our work are available: [[paper]](http://openaccess.thecvf.com/content_cvpr_2018/papers/Chen_LiDAR-Video_Driving_Dataset_CVPR_2018_paper.pdf), [[code]](https://github.com/driving-behavior/DBNet), [[video]](http://www.dbehavior.net/data/demo.mp4), [[website]](http://www.dbehavior.net/), [[challenge]](http://www.dbehavior.net/task.html), [[prepared data]](https://drive.google.com/file/d/1WxzOrhvMnHCOkh6EFGWltflyPb_UnGqo/view?usp=sharing)
12 | 
13 | 
18 | 
19 | ## Contents
20 | 1. [Introduction](#introduction)
21 | 2. [Requirements](#requirements)
22 | 3. [Quick Start](#quick-start)
23 | 4. [Baseline](#baseline)
24 | 5. [Contributors](#contributors)
25 | 6. [Citation](#citation)
26 | 7. [License](#license)
27 | 
28 | ## Introduction
29 | This work is based on our [research paper](http://openaccess.thecvf.com/content_cvpr_2018/html/Chen_LiDAR-Video_Driving_Dataset_CVPR_2018_paper.html), which appeared at CVPR 2018. We propose a large-scale dataset for driving behavior learning, namely DBNet. You can also check our [dataset webpage](http://www.dbehavior.net/) for a deeper introduction.
30 | 
31 | In this repository, we release __demo code__ and __part of the prepared data__ for training with only images, as well as for leveraging feature maps or point clouds. The prepared data are accessible [here](https://drive.google.com/open?id=14RPdVTwBTuCTo0tFeYmL_SyN8fD0g6Hc). (__More demo models and scripts will be released soon!__)
32 | 
33 | ## Requirements
34 | 
35 | * **Tensorflow 1.2.0**
36 | * Python 2.7
37 | * CUDA 8.0+ (for GPU)
38 | * Python libraries: numpy, scipy and __laspy__
39 | 
40 | The code has been tested with Python 2.7, Tensorflow 1.2.0, CUDA 8.0 and cuDNN 5.1 on Ubuntu 14.04. It may also work in other environments (directly or with minor modifications); pull requests and test reports are very welcome.
41 | 
42 | ## Quick Start
43 | ### Training
44 | To train a model to predict vehicle speeds and steering angles:
45 | 
46 |     python train.py --model nvidia_pn --batch_size 16 --max_epoch 125 --gpu 0
47 | 
48 | The names of the models are consistent with our [paper](http://www.dbehavior.net/publications.html).
49 | Log files and network parameters are saved to the `logs` folder by default.
50 | 
51 | To see HELP for the training script:
52 | 
53 |     python train.py -h
54 | 
55 | We can use TensorBoard to view the network architecture and monitor the training progress:
56 | 
57 |     tensorboard --logdir logs
58 | 
59 | ### Evaluation
60 | After training, you can evaluate the performance of a model using `evaluate.py`. To plot figures or calculate AUC, you need the matplotlib library installed.
61 | 
62 |     python evaluate.py --model_path logs/nvidia_pn/model.ckpt
63 | 
64 | ### Prediction
65 | To get predictions on the test data:
66 | 
67 |     python predict.py
68 | 
69 | The results are saved in `results/results` (per segment) and `results/behavior_pred.txt` (merged) by default.
70 | To change the storage location:
71 | 
72 |     python predict.py --result_dir specified_dir
73 | 
74 | The result directory will be created automatically if it doesn't exist.
75 | 
76 | ## Baseline
77 | 
| Method | Setting | | Accuracy | AUC | ME | AE | AME |
| --- | --- | --- | --- | --- | --- | --- | --- |
| nvidia-pn | Videos + Laser Points | angle | 70.65% (<5) | 0.7799 | 29.46 | 4.23 | 20.88 |
| | | speed | 82.21% (<3) | 0.8701 | 18.56 | 1.80 | 9.68 |
78 | 
79 | This baseline is run on the __dbnet-2018 challenge data__ and only __nvidia\_pn__ is tested. To measure architectures comprehensively, several metrics are used, including accuracy under different thresholds, area under curve (__AUC__), max error (__ME__), mean error (__AE__) and mean of max errors (__AME__).
80 | 
81 | The implementations of these metrics can be found in `evaluate.py`.
82 | 
83 | ## Contributors
84 | DBNet was developed by [MVIG](http://www.mvig.org/), Shanghai Jiao Tong University* and [SCSC](http://scsc.xmu.edu.cn/) Lab, Xiamen University* (*alphabetical order*).
85 | 
86 | ## Citation
87 | If you find our work useful in your research, please consider citing:
88 | 
89 |     @InProceedings{DBNet2018,
90 |       author = {Yiping Chen and Jingkang Wang and Jonathan Li and Cewu Lu and Zhipeng Luo and HanXue and Cheng Wang},
91 |       title = {LiDAR-Video Driving Dataset: Learning Driving Policies Effectively},
92 |       booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
93 |       month = {June},
94 |       year = {2018}
95 |     }
96 | 
97 | ## License
98 | Our code is released under the Apache 2.0 License. The copyright of DBNet can be checked [here](http://www.dbehavior.net/contact.html).
99 | 
-------------------------------------------------------------------------------- 
/data/README.md:
--------------------------------------------------------------------------------
1 | ## Home Directory of DBNet Data
2 | 
3 | This is the place where the DBNet data should be placed in order to fit the default paths in `../provider.py`. Two kinds of prepared data are provided, listed in the `dbnet-2018` and `demo` folders, respectively.
4 | 
5 | ### dbnet-2018
6 | Download the DBNet-2018 challenge data [here](https://drive.google.com/open?id=14RPdVTwBTuCTo0tFeYmL_SyN8fD0g6Hc) and organize the folders as follows (in `dbnet-2018/`):
7 | ```
8 | ├── train
9 | ├─ └── i [56 folders] (6569 in total, will be released continuously)
10 | ├─     ├── dvr_66x200 [<= 120 images]
11 | ├─     ├── dvr_1920x1080 [<= 120 images]
12 | ├─     ├── points_16384 [<= 120 clouds]
13 | ├─     └── behavior.csv [labels]
14 | ├── val
15 | ├─ └── j [20 folders] (2349 in total)
16 | ├─     ├── dvr_66x200 [<= 120 images]
17 | ├─     ├── dvr_1920x1080 [<= 120 images]
18 | ├─     ├── points_16384 [<= 120 clouds]
19 | ├─     └── behavior.csv [labels]
20 | └── test
21 |     └── k [20 folders] (2376 in total)
22 |         ├── dvr_66x200 [<= 120 images]
23 |         ├── dvr_1920x1080 [<= 120 images]
24 |         └── points_16384 [<= 120 clouds]
25 | 
26 | ```
27 | In general, the train/val/test ratio is approximately 8:1:1 and all of the val/test data have already been released. Almost five eighths of the training data are still being pre-processed and will be __uploaded soon__.
28 | 
29 | Please note that the data in the subfolders of `train/`, `val/` and `test/` are __continuous__ and __time-ordered__. The `i`th line of `behavior.csv` corresponds to `i-1.jpg` in `dvr_66x200/` and `i-1.las` in `points_16384/` (see the sketch below). Moreover, if you don't intend to use the prepared data directly, please download and pre-process the [raw data]() with your preferred methods.
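
For clarity, the index convention above can be read as follows. This is a minimal sketch (not part of the repo), assuming `pandas` is installed; the session folder `train/0` and the `header=None` column layout of `behavior.csv` are assumptions for illustration:

```
import os
import pandas as pd  # any CSV reader works; pandas is an assumption, not a repo dependency

session = "train/0"  # hypothetical session folder
behavior = pd.read_csv(os.path.join(session, "behavior.csv"), header=None)

i = 1  # the i-th line of behavior.csv (1-indexed) ...
img_path = os.path.join(session, "dvr_66x200", "%d.jpg" % (i - 1))    # ... pairs with i-1.jpg
las_path = os.path.join(session, "points_16384", "%d.las" % (i - 1))  # ... and with i-1.las
```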
30 | 
31 | ### demo
32 | Download the DBNet demo data [here](https://drive.google.com/open?id=1NjhHwV_q6EMZ6MiGhZnqxg7yRCawx79c) and organize the folders as follows (in `demo`):
33 | 
34 | ```
35 | ├── data.csv
36 | ├── DVR
37 | ├─ └── i.jpg [3788 images]
38 | ├── fmap
39 | ├─ └── i.jpg [3788 feature maps]
40 | └── points_16384
41 |     └── i.las [3788 point clouds]
42 | ```
43 | 
-------------------------------------------------------------------------------- 
/data/dbnet-2018/README.md:
--------------------------------------------------------------------------------
1 | ## DBNet-2018 Challenge
2 | The DBNet-2018 challenge data are organized as follows:
3 | 
4 | ```
5 | ├── train
6 | ├─ └── i [56 folders] (6569 in total, will be released continuously)
7 | ├─     ├── dvr_66x200 [<= 120 images]
8 | ├─     ├── points_16384 [<= 120 clouds]
9 | ├─     └── behavior.csv [labels]
10 | ├── val
11 | ├─ └── j [20 folders] (2349 in total)
12 | ├─     ├── dvr_66x200 [<= 120 images]
13 | ├─     ├── points_16384 [<= 120 clouds]
14 | ├─     └── behavior.csv [labels]
15 | └── test
16 |     └── k [20 folders] (2376 in total)
17 |         ├── dvr_66x200 [<= 120 images]
18 |         └── points_16384 [<= 120 clouds]
19 | ```
20 | 
21 | In general, the train/val/test ratio is approximately 8:1:1 and all of the val/test data have already been released. Almost five eighths of the training data are still being pre-processed and will be __uploaded soon__.
22 | 
23 | Please note that the data in the subfolders of `train/`, `val/` and `test/` are __continuous__ and __time-ordered__. The `i`th line of `behavior.csv` corresponds to `i-1.jpg` in `dvr_66x200/` and `i-1.las` in `points_16384/`. Moreover, if you don't intend to use the prepared data directly, please download and pre-process the raw data with your preferred methods.
-------------------------------------------------------------------------------- 
/data/demo/README.md:
--------------------------------------------------------------------------------
1 | ## Demo Data
2 | 
3 | Download the DBNet demo data and organize the folders as follows:
4 | 
5 | ```
6 | ├── data.csv
7 | ├── DVR
8 | ├─ └── i.jpg [3788 images]
9 | ├── fmap
10 | ├─ └── i.jpg [3788 feature maps]
11 | └── points_16384
12 |     └── i.las [3788 point clouds]
13 | ```
-------------------------------------------------------------------------------- 
/docs/logo.jpeg: https://raw.githubusercontent.com/driving-behavior/DBNet/c32ad78afb9f83e354eae7eb6793dbeef7e83bac/docs/logo.jpeg
--------------------------------------------------------------------------------
/docs/pred.jpg: https://raw.githubusercontent.com/driving-behavior/DBNet/c32ad78afb9f83e354eae7eb6793dbeef7e83bac/docs/pred.jpg
--------------------------------------------------------------------------------
/evaluate.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import importlib
3 | import os
4 | import sys
5 | import time
6 | 
7 | # import matplotlib.pyplot as plt  (imported lazily in plot_acc so evaluation can run headless)
8 | import numpy as np
9 | import scipy
10 | 
11 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
12 | sys.path.append(os.path.join(BASE_DIR, 'models'))
13 | sys.path.append(os.path.join(BASE_DIR, 'utils'))
14 | 
15 | import provider
16 | import tensorflow as tf
17 | from helper import str2bool
18 | 
19 | 
20 | parser = argparse.ArgumentParser()
21 | parser.add_argument('--gpu', type=int, default=0,
22 |                     help='GPU to use [default: GPU
0]') 23 | parser.add_argument('--model', default='nvidia_pn', 24 | help='Model name [default: nvidia_pn]') 25 | parser.add_argument('--model_path', default='logs/nvidia_pn/model_best.ckpt', 26 | help='Model checkpoint file path [default: logs/nvidia_pn/model_best.ckpt]') 27 | parser.add_argument('--max_epoch', type=int, default=250, 28 | help='Epoch to run [default: 250]') 29 | parser.add_argument('--batch_size', type=int, default=8, 30 | help='Batch Size during training [default: 8]') 31 | parser.add_argument('--result_dir', default='results', 32 | help='Result folder path [default: results]') 33 | parser.add_argument('--test', type=str2bool, default=False, # only used in test server 34 | help='Get performance on test data [default: False]') 35 | 36 | FLAGS = parser.parse_args() 37 | BATCH_SIZE = FLAGS.batch_size 38 | GPU_INDEX = FLAGS.gpu 39 | MODEL_PATH = FLAGS.model_path 40 | 41 | supported_models = ["nvidia_io", "nvidia_pn", 42 | "resnet152_io", "resnet152_pn", 43 | "inception_v4_io", "inception_v4_pn", 44 | "densenet169_io", "densenet169_pn"] 45 | assert (FLAGS.model in supported_models) 46 | 47 | MODEL = importlib.import_module(FLAGS.model) # import network module 48 | MODEL_FILE = os.path.join(BASE_DIR, 'models', FLAGS.model+'.py') 49 | 50 | RESULT_DIR = os.path.join(FLAGS.result_dir, FLAGS.model) 51 | if not os.path.exists(RESULT_DIR): 52 | os.makedirs(RESULT_DIR) 53 | if FLAGS.test: 54 | TEST_RESULT_DIR = os.path.join(RESULT_DIR, "test") 55 | if not os.path.exists(TEST_RESULT_DIR): 56 | os.makedirs(TEST_RESULT_DIR) 57 | LOG_FOUT = open(os.path.join(TEST_RESULT_DIR, 'log_test.txt'), 'w') 58 | LOG_FOUT.write(str(FLAGS)+'\n') 59 | else: 60 | VAL_RESULT_DIR = os.path.join(RESULT_DIR, "val") 61 | if not os.path.exists(VAL_RESULT_DIR): 62 | os.makedirs(VAL_RESULT_DIR) 63 | LOG_FOUT = open(os.path.join(VAL_RESULT_DIR, 'log_evaluate.txt'), 'w') 64 | LOG_FOUT.write(str(FLAGS)+'\n') 65 | 66 | 67 | def log_string(out_str): 68 | LOG_FOUT.write(out_str+'\n') 69 | LOG_FOUT.flush() 70 | print(out_str) 71 | 72 | def evaluate(): 73 | with tf.device('/gpu:'+str(GPU_INDEX)): 74 | if '_pn' in MODEL_FILE: 75 | data_input = provider.Provider() 76 | imgs_pl, pts_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE) 77 | imgs_pl = [imgs_pl, pts_pl] 78 | elif '_io' in MODEL_FILE: 79 | data_input = provider.Provider() 80 | imgs_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE) 81 | else: 82 | raise NotImplementedError 83 | 84 | is_training_pl = tf.placeholder(tf.bool, shape=()) 85 | print(is_training_pl) 86 | 87 | # Get model and loss 88 | pred = MODEL.get_model(imgs_pl, is_training_pl) 89 | 90 | loss = MODEL.get_loss(pred, labels_pl) 91 | 92 | # Add ops to save and restore all the variables. 93 | saver = tf.train.Saver() 94 | 95 | # Create a session 96 | config = tf.ConfigProto() 97 | config.gpu_options.allow_growth = True 98 | config.allow_soft_placement = True 99 | config.log_device_placement = True 100 | sess = tf.Session(config=config) 101 | 102 | # Restore variables from disk. 
103 | saver.restore(sess, MODEL_PATH) 104 | log_string("Model restored.") 105 | 106 | ops = {'imgs_pl': imgs_pl, 107 | 'labels_pl': labels_pl, 108 | 'is_training_pl': is_training_pl, 109 | 'pred': pred, 110 | 'loss': loss} 111 | 112 | eval_one_epoch(sess, ops, data_input) 113 | 114 | 115 | def eval_one_epoch(sess, ops, data_input): 116 | """ ops: dict mapping from string to tf ops """ 117 | is_training = False 118 | loss_sum = 0 119 | 120 | num_batches = data_input.num_val // BATCH_SIZE 121 | acc_a_sum = [0] * 5 122 | acc_s_sum = [0] * 5 123 | 124 | preds = [] 125 | labels_total = [] 126 | acc_a = [0] * 5 127 | acc_s = [0] * 5 128 | for batch_idx in range(num_batches): 129 | if "_io" in MODEL_FILE: 130 | imgs, labels = data_input.load_one_batch(BATCH_SIZE, "val", reader_type="io") 131 | if "resnet" in MODEL_FILE or "inception" in MODEL_FILE or "densenet" in MODEL_FILE: 132 | imgs = MODEL.resize(imgs) 133 | feed_dict = {ops['imgs_pl']: imgs, 134 | ops['labels_pl']: labels, 135 | ops['is_training_pl']: is_training} 136 | else: 137 | imgs, others, labels = data_input.load_one_batch(BATCH_SIZE, "val") 138 | if "resnet" in MODEL_FILE or "inception" in MODEL_FILE or "densenet" in MODEL_FILE: 139 | imgs = MODEL.resize(imgs) 140 | feed_dict = {ops['imgs_pl'][0]: imgs, 141 | ops['imgs_pl'][1]: others, 142 | ops['labels_pl']: labels, 143 | ops['is_training_pl']: is_training} 144 | 145 | loss_val, pred_val = sess.run([ops['loss'], ops['pred']], 146 | feed_dict=feed_dict) 147 | 148 | preds.append(pred_val) 149 | labels_total.append(labels) 150 | loss_sum += np.mean(np.square(np.subtract(pred_val, labels))) 151 | for i in range(5): 152 | acc_a[i] = np.mean(np.abs(np.subtract(pred_val[:, 1], labels[:, 1])) < (1.0 * (i+1) / 180 * scipy.pi)) 153 | acc_a_sum[i] += acc_a[i] 154 | acc_s[i] = np.mean(np.abs(np.subtract(pred_val[:, 0], labels[:, 0])) < (1.0 * (i+1) / 20)) 155 | acc_s_sum[i] += acc_s[i] 156 | 157 | log_string('eval mean loss: %f' % (loss_sum / float(num_batches))) 158 | for i in range(5): 159 | log_string('eval accuracy (angle-%d): %f' % (float(i+1), (acc_a_sum[i] / float(num_batches)))) 160 | log_string('eval accuracy (speed-%d): %f' % (float(i+1), (acc_s_sum[i] / float(num_batches)))) 161 | 162 | preds = np.vstack(preds) 163 | labels = np.vstack(labels_total) 164 | 165 | a_error, s_error = mean_max_error(preds, labels, dicts=get_dicts()) 166 | log_string('eval error (mean-max): angle:%.2f speed:%.2f' % 167 | (a_error / scipy.pi * 180, s_error * 20)) 168 | a_error, s_error = max_error(preds, labels) 169 | log_string('eval error (max): angle:%.2f speed:%.2f' % 170 | (a_error / scipy.pi * 180, s_error * 20)) 171 | a_error, s_error = mean_topk_error(preds, labels, 5) 172 | log_string('eval error (mean-top5): angle:%.2f speed:%.2f' % 173 | (a_error / scipy.pi * 180, s_error * 20)) 174 | a_error, s_error = mean_error(preds, labels) 175 | log_string('eval error (mean): angle:%.2f speed:%.2f' % 176 | (a_error / scipy.pi * 180, s_error * 20)) 177 | 178 | print (preds.shape, labels.shape) 179 | np.savetxt(os.path.join(VAL_RESULT_DIR, "preds_val.txt"), preds) 180 | np.savetxt(os.path.join(VAL_RESULT_DIR, "labels_val.txt"), labels) 181 | # plot_acc(preds, labels) 182 | 183 | 184 | def test(): 185 | with tf.device('/gpu:'+str(GPU_INDEX)): 186 | if '_pn' in MODEL_FILE: 187 | data_input = provider.Provider2() 188 | imgs_pl, pts_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE) 189 | imgs_pl = [imgs_pl, pts_pl] 190 | elif '_io' in MODEL_FILE: 191 | data_input = provider.Provider2() 192 | imgs_pl, 
labels_pl = MODEL.placeholder_inputs(BATCH_SIZE) 193 | else: 194 | raise NotImplementedError 195 | 196 | 197 | is_training_pl = tf.placeholder(tf.bool, shape=()) 198 | print(is_training_pl) 199 | 200 | # Get model and loss 201 | pred = MODEL.get_model(imgs_pl, is_training_pl) 202 | 203 | loss = MODEL.get_loss(pred, labels_pl) 204 | 205 | # Add ops to save and restore all the variables. 206 | saver = tf.train.Saver() 207 | 208 | # Create a session 209 | config = tf.ConfigProto() 210 | config.gpu_options.allow_growth = True 211 | config.allow_soft_placement = True 212 | config.log_device_placement = True 213 | sess = tf.Session(config=config) 214 | 215 | # Restore variables from disk. 216 | saver.restore(sess, MODEL_PATH) 217 | log_string("Model restored.") 218 | 219 | ops = {'imgs_pl': imgs_pl, 220 | 'labels_pl': labels_pl, 221 | 'is_training_pl': is_training_pl, 222 | 'pred': pred, 223 | 'loss': loss} 224 | 225 | test_one_epoch(sess, ops, data_input) 226 | 227 | 228 | def test_one_epoch(sess, ops, data_input): 229 | """ ops: dict mapping from string to tf ops """ 230 | is_training = False 231 | loss_sum = 0 232 | 233 | num_batches = data_input.num_test // BATCH_SIZE 234 | acc_a_sum = [0] * 5 235 | acc_s_sum = [0] * 5 236 | 237 | preds = [] 238 | labels_total = [] 239 | acc_a = [0] * 5 240 | acc_s = [0] * 5 241 | for batch_idx in range(num_batches): 242 | if "_io" in MODEL_FILE: 243 | imgs, labels = data_input.load_one_batch(BATCH_SIZE, reader_type="io") 244 | if "resnet" in MODEL_FILE or "inception" in MODEL_FILE or "densenet" in MODEL_FILE: 245 | imgs = MODEL.resize(imgs) 246 | feed_dict = {ops['imgs_pl']: imgs, 247 | ops['labels_pl']: labels, 248 | ops['is_training_pl']: is_training} 249 | else: 250 | imgs, others, labels = data_input.load_one_batch(BATCH_SIZE) 251 | if "resnet" in MODEL_FILE or "inception" in MODEL_FILE or "densenet" in MODEL_FILE: 252 | imgs = MODEL.resize(imgs) 253 | feed_dict = {ops['imgs_pl'][0]: imgs, 254 | ops['imgs_pl'][1]: others, 255 | ops['labels_pl']: labels, 256 | ops['is_training_pl']: is_training} 257 | 258 | loss_val, pred_val = sess.run([ops['loss'], ops['pred']], 259 | feed_dict=feed_dict) 260 | 261 | preds.append(pred_val) 262 | labels_total.append(labels) 263 | loss_sum += np.mean(np.square(np.subtract(pred_val, labels))) 264 | for i in range(5): 265 | acc_a[i] = np.mean(np.abs(np.subtract(pred_val[:, 1], labels[:, 1])) < (1.0 * (i+1) / 180 * scipy.pi)) 266 | acc_a_sum[i] += acc_a[i] 267 | acc_s[i] = np.mean(np.abs(np.subtract(pred_val[:, 0], labels[:, 0])) < (1.0 * (i+1) / 20)) 268 | acc_s_sum[i] += acc_s[i] 269 | 270 | log_string('test mean loss: %f' % (loss_sum / float(num_batches))) 271 | for i in range(5): 272 | log_string('test accuracy (angle-%d): %f' % (float(i+1), (acc_a_sum[i] / float(num_batches)))) 273 | log_string('test accuracy (speed-%d): %f' % (float(i+1), (acc_s_sum[i] / float(num_batches)))) 274 | 275 | preds = np.vstack(preds) 276 | labels = np.vstack(labels_total) 277 | 278 | a_error, s_error = mean_max_error(preds, labels, dicts=get_dicts()) 279 | log_string('test error (mean-max): angle:%.2f speed:%.2f' % 280 | (a_error / scipy.pi * 180, s_error * 20)) 281 | a_error, s_error = max_error(preds, labels) 282 | log_string('test error (max): angle:%.2f speed:%.2f' % 283 | (a_error / scipy.pi * 180, s_error * 20)) 284 | a_error, s_error = mean_topk_error(preds, labels, 5) 285 | log_string('test error (mean-top5): angle:%.2f speed:%.2f' % 286 | (a_error / scipy.pi * 180, s_error * 20)) 287 | a_error, s_error = mean_error(preds, labels) 
288 |     log_string('test error (mean): angle:%.2f speed:%.2f' %
289 |                (a_error / scipy.pi * 180, s_error * 20))
290 | 
291 |     print (preds.shape, labels.shape)
292 |     np.savetxt(os.path.join(TEST_RESULT_DIR, "preds_val.txt"), preds)
293 |     np.savetxt(os.path.join(TEST_RESULT_DIR, "labels_val.txt"), labels)
294 |     # plot_acc(preds, labels)
295 | 
296 | 
297 | def plot_acc(preds, labels, counts=100):
298 |     import matplotlib.pyplot as plt  # lazy import: the top-level import is commented out
299 |     a_list = []
300 |     s_list = []
301 |     for i in range(counts):
302 |         acc_a = np.abs(np.subtract(preds[:, 1], labels[:, 1])) < (20.0 / 180 * scipy.pi / counts * i)
303 |         a_list.append(np.mean(acc_a))
304 |     for i in range(counts):
305 |         acc_s = np.abs(np.subtract(preds[:, 0], labels[:, 0])) < (15.0 / 20 / counts * i)
306 |         s_list.append(np.mean(acc_s))
307 | 
308 |     print (len(a_list), len(s_list))
309 |     a_xaxis = [20.0 / counts * i for i in range(counts)]
310 |     s_xaxis = [15.0 / counts * i for i in range(counts)]
311 | 
312 |     auc_angle = np.trapz(np.array(a_list), x=a_xaxis) / 20.0
313 |     auc_speed = np.trapz(np.array(s_list), x=s_xaxis) / 15.0
314 | 
315 |     plt.style.use('ggplot')
316 |     plt.figure()
317 |     plt.plot(a_xaxis, np.array(a_list), label='Area Under Curve (AUC): %f' % auc_angle)
318 |     plt.legend(loc='best')
319 |     plt.xlabel("Threshold (angle)")
320 |     plt.ylabel("Validation accuracy")
321 |     plt.savefig(os.path.join(RESULT_DIR, "acc_angle.png"))
322 |     plt.figure()
323 |     plt.plot(s_xaxis, np.array(s_list), label='Area Under Curve (AUC): %f' % auc_speed)
324 |     plt.xlabel("Threshold (speed)")
325 |     plt.ylabel("Validation accuracy")
326 |     plt.legend(loc='best')
327 |     plt.savefig(os.path.join(RESULT_DIR, 'acc_speed.png'))
328 | 
329 | def plot_acc_from_txt(counts=100):
330 |     preds = np.loadtxt(os.path.join(RESULT_DIR, "test/preds_val.txt"))
331 |     labels = np.loadtxt(os.path.join(RESULT_DIR, "test/labels_val.txt"))
332 |     print (preds.shape, labels.shape)
333 |     plot_acc(preds, labels, counts)
334 | 
335 | def get_dicts(description="val"):
336 |     if description == "train":
337 |         raise NotImplementedError
338 |     elif description == "val":  # batch_size == 8
339 |         return [120] * 4 + [111] + [120] * 4 + [109] + [120] * 9 + [89 - 87 % 8]
340 |     elif description == "test":  # batch_size == 8
341 |         return [120] * 9 + [116] + [120] * 4 + [106] + [120] * 4 + [114 - 114 % 8]
342 |     else:
343 |         raise NotImplementedError
344 | 
345 | def mean_max_error(preds, labels, dicts):
346 |     cnt = 0
347 |     a_error = 0
348 |     s_error = 0
349 |     for i in dicts:
350 |         print (preds.shape, cnt, cnt + i)
351 |         a_error += np.max(np.abs(preds[cnt:cnt+i, 1] - labels[cnt:cnt+i, 1]))
352 |         s_error += np.max(np.abs(preds[cnt:cnt+i, 0] - labels[cnt:cnt+i, 0]))
353 |         cnt += i
354 |     return a_error / float(len(dicts)), s_error / float(len(dicts))
355 | 
356 | def max_error(preds, labels):
357 |     return np.max(np.abs(preds[:,1] - labels[:,1])), np.max(np.abs(preds[:, 0] - labels[:, 0]))
358 | 
359 | def mean_error(preds, labels):
360 |     return np.mean(np.abs(preds[:,1] - labels[:,1])), np.mean(np.abs(preds[:,0] - labels[:,0]))
361 | 
362 | def mean_topk_error(preds, labels, k):
363 |     a_error = np.abs(preds[:,1] - labels[:,1])
364 |     s_error = np.abs(preds[:,0] - labels[:,0])
365 |     return np.mean(np.sort(a_error)[::-1][0:k]), np.mean(np.sort(s_error)[::-1][0:k])
366 | 
367 | if __name__ == "__main__":
368 |     if FLAGS.test: test()
369 |     else: evaluate()
370 |     # plot_acc_from_txt()
371 | 
-------------------------------------------------------------------------------- 
/models/densenet169_io.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | 
4 | import tensorflow as tf
5 | import scipy
6 | import scipy.misc  # resize() below uses scipy.misc.imresize; not pulled in by "import scipy" alone
7 | import numpy as np
8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
9 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
10 | 
11 | import tf_util
12 | from custom_layers import Scale
13 | from keras.layers import (Input, Dense, Convolution2D, MaxPooling2D,
14 |                           AveragePooling2D, GlobalAveragePooling2D,
15 |                           ZeroPadding2D, Dropout, Flatten, add,
16 |                           concatenate, Reshape, Activation)
17 | from keras.layers.normalization import BatchNormalization
18 | from keras.models import Model
19 | 
20 | 
21 | def placeholder_inputs(batch_size, img_rows=224, img_cols=224, separately=False):
22 |     imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
23 |     if separately:
24 |         speeds_pl = tf.placeholder(tf.float32, shape=(batch_size))
25 |         angles_pl = tf.placeholder(tf.float32, shape=(batch_size))
26 |         labels_pl = [speeds_pl, angles_pl]
27 |     else:  # default: one (speed, angle) label pair per sample
28 |         labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))
29 |     return imgs_pl, labels_pl
30 | 
31 | def get_densenet(img_rows, img_cols, nb_dense_block=4,
32 |                  growth_rate=32, nb_filter=64, reduction=0.5,
33 |                  dropout_rate=0.0, weight_decay=1e-4):
34 |     '''
35 |     DenseNet 169 Model for Keras
36 | 
37 |     Model Schema is based on
38 |     https://github.com/flyyufelix/DenseNet-Keras
39 | 
40 |     ImageNet Pretrained Weights
41 |     Theano: https://drive.google.com/open?id=0Byy2AcGyEVxfN0d3T1F1MXg0NlU
42 |     TensorFlow: https://drive.google.com/open?id=0Byy2AcGyEVxfSEc5UC1ROUFJdmM
43 | 
44 |     # Arguments
45 |         nb_dense_block: number of dense blocks to add to end
46 |         growth_rate: number of filters to add per dense block
47 |         nb_filter: initial number of filters
48 |         reduction: reduction factor of transition blocks.
49 |         dropout_rate: dropout rate
50 |         weight_decay: weight decay factor
51 |         classes: optional number of classes to classify images
52 |         weights_path: path to pre-trained weights
53 |     # Returns
54 |         A Keras model instance.
55 | ''' 56 | eps = 1.1e-5 57 | 58 | # compute compression factor 59 | compression = 1.0 - reduction 60 | 61 | # Handle Dimension Ordering for different backends 62 | img_input = Input(shape=(224, 224, 3), name='data') 63 | 64 | # From architecture for ImageNet (Table 1 in the paper) 65 | nb_filter = 64 66 | nb_layers = [6,12,32,32] # For DenseNet-169 67 | 68 | # Initial convolution 69 | x = ZeroPadding2D((3, 3), name='conv1_zeropadding')(img_input) 70 | x = Convolution2D(nb_filter, (7, 7), strides=(2, 2), name='conv1', use_bias=False)(x) 71 | x = BatchNormalization(epsilon=eps, axis=3, name='conv1_bn')(x) 72 | x = Scale(axis=3, name='conv1_scale')(x) 73 | x = Activation('relu', name='relu1')(x) 74 | x = ZeroPadding2D((1, 1), name='pool1_zeropadding')(x) 75 | x = MaxPooling2D((3, 3), strides=(2, 2), name='pool1')(x) 76 | 77 | # Add dense blocks 78 | for block_idx in range(nb_dense_block - 1): 79 | stage = block_idx+2 80 | x, nb_filter = dense_block(x, stage, nb_layers[block_idx], nb_filter, growth_rate, dropout_rate=dropout_rate, weight_decay=weight_decay) 81 | 82 | # Add transition_block 83 | x = transition_block(x, stage, nb_filter, compression=compression, dropout_rate=dropout_rate, weight_decay=weight_decay) 84 | nb_filter = int(nb_filter * compression) 85 | 86 | final_stage = stage + 1 87 | x, nb_filter = dense_block(x, final_stage, nb_layers[-1], nb_filter, growth_rate, dropout_rate=dropout_rate, weight_decay=weight_decay) 88 | 89 | x = BatchNormalization(epsilon=eps, axis=3, name='conv'+str(final_stage)+'_blk_bn')(x) 90 | x = Scale(axis=3, name='conv'+str(final_stage)+'_blk_scale')(x) 91 | x = Activation('relu', name='relu'+str(final_stage)+'_blk')(x) 92 | 93 | x_fc = GlobalAveragePooling2D(name='pool'+str(final_stage))(x) 94 | x_fc = Dense(1000, name='fc6')(x_fc) 95 | x_fc = Activation('softmax', name='prob')(x_fc) 96 | 97 | model = Model(img_input, x_fc, name='densenet') 98 | 99 | # Use pre-trained weights for Tensorflow backend 100 | weights_path = 'utils/weights/densenet169_weights_tf.h5' 101 | 102 | model.load_weights(weights_path, by_name=True) 103 | 104 | # Truncate and replace softmax layer for transfer learning 105 | # Cannot use model.layers.pop() since model is not of Sequential() type 106 | # The method below works since pre-trained weights are stored in layers but not in the model 107 | x_newfc = GlobalAveragePooling2D(name='pool'+str(final_stage))(x) 108 | 109 | x_newfc = Dense(256, name='fc7')(x_newfc) 110 | model = Model(img_input, x_newfc) 111 | 112 | return model 113 | 114 | 115 | def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False): 116 | """ Densenet169 regression model, input is BxWxHx3, output Bx2""" 117 | net = get_densenet(224, 224)(net) 118 | 119 | if not add_lstm: 120 | net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc_final') 121 | 122 | else: 123 | net = tf_util.fully_connected(net, 784, bn=True, 124 | is_training=is_training, 125 | scope='fc_lstm', 126 | bn_decay=bn_decay) 127 | net = tf_util.dropout(net, keep_prob=0.7, 128 | is_training=is_training, 129 | scope="dp1") 130 | net = cnn_lstm_block(net) 131 | 132 | return net 133 | 134 | 135 | def cnn_lstm_block(input_tensor): 136 | lstm_in = tf.reshape(input_tensor, [-1, 28, 28]) 137 | lstm_out = tf_util.stacked_lstm(lstm_in, 138 | num_outputs=10, 139 | time_steps=28, 140 | scope="cnn_lstm") 141 | 142 | W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1)) 143 | b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1)) 144 | return 
tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2) 145 | 146 | 147 | def conv_block(x, stage, branch, nb_filter, dropout_rate=None, weight_decay=1e-4): 148 | '''Apply BatchNorm, Relu, bottleneck 1x1 Conv2D, 3x3 Conv2D, and option dropout 149 | # Arguments 150 | x: input tensor 151 | stage: index for dense block 152 | branch: layer index within each dense block 153 | nb_filter: number of filters 154 | dropout_rate: dropout rate 155 | weight_decay: weight decay factor 156 | ''' 157 | eps = 1.1e-5 158 | conv_name_base = 'conv' + str(stage) + '_' + str(branch) 159 | relu_name_base = 'relu' + str(stage) + '_' + str(branch) 160 | 161 | # 1x1 Convolution (Bottleneck layer) 162 | inter_channel = nb_filter * 4 163 | x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_x1_bn')(x) 164 | x = Scale(axis=3, name=conv_name_base+'_x1_scale')(x) 165 | x = Activation('relu', name=relu_name_base+'_x1')(x) 166 | x = Convolution2D(inter_channel, (1, 1), name=conv_name_base+'_x1', use_bias=False)(x) 167 | 168 | if dropout_rate: 169 | x = Dropout(dropout_rate)(x) 170 | 171 | # 3x3 Convolution 172 | x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_x2_bn')(x) 173 | x = Scale(axis=3, name=conv_name_base+'_x2_scale')(x) 174 | x = Activation('relu', name=relu_name_base+'_x2')(x) 175 | x = ZeroPadding2D((1, 1), name=conv_name_base+'_x2_zeropadding')(x) 176 | x = Convolution2D(nb_filter, (3, 3), name=conv_name_base+'_x2', use_bias=False)(x) 177 | 178 | if dropout_rate: 179 | x = Dropout(dropout_rate)(x) 180 | 181 | return x 182 | 183 | 184 | def transition_block(x, stage, nb_filter, compression=1.0, dropout_rate=None, weight_decay=1E-4): 185 | ''' Apply BatchNorm, 1x1 Convolution, averagePooling, optional compression, dropout 186 | # Arguments 187 | x: input tensor 188 | stage: index for dense block 189 | nb_filter: number of filters 190 | compression: calculated as 1 - reduction. Reduces the number of feature maps in the transition block. 191 | dropout_rate: dropout rate 192 | weight_decay: weight decay factor 193 | ''' 194 | 195 | eps = 1.1e-5 196 | conv_name_base = 'conv' + str(stage) + '_blk' 197 | relu_name_base = 'relu' + str(stage) + '_blk' 198 | pool_name_base = 'pool' + str(stage) 199 | 200 | x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_bn')(x) 201 | x = Scale(axis=3, name=conv_name_base+'_scale')(x) 202 | x = Activation('relu', name=relu_name_base)(x) 203 | x = Convolution2D(int(nb_filter * compression), (1, 1), name=conv_name_base, use_bias=False)(x) 204 | 205 | if dropout_rate: 206 | x = Dropout(dropout_rate)(x) 207 | 208 | x = AveragePooling2D((2, 2), strides=(2, 2), name=pool_name_base)(x) 209 | 210 | return x 211 | 212 | 213 | def dense_block(x, stage, nb_layers, nb_filter, growth_rate, dropout_rate=None, weight_decay=1e-4, grow_nb_filters=True): 214 | ''' Build a dense_block where the output of each conv_block is fed to subsequent ones 215 | # Arguments 216 | x: input tensor 217 | stage: index for dense block 218 | nb_layers: the number of layers of conv_block to append to the model. 
219 |         nb_filter: number of filters
220 |         growth_rate: growth rate
221 |         dropout_rate: dropout rate
222 |         weight_decay: weight decay factor
223 |         grow_nb_filters: flag that allows the number of filters to grow
224 |     '''
225 | 
226 |     eps = 1.1e-5
227 |     concat_feat = x
228 | 
229 |     for i in range(nb_layers):
230 |         branch = i+1
231 |         x = conv_block(concat_feat, stage, branch, growth_rate, dropout_rate, weight_decay)
232 |         concat_feat = concatenate([concat_feat, x], axis=3, name='concat_'+str(stage)+'_'+str(branch))
233 | 
234 |         if grow_nb_filters:
235 |             nb_filter += growth_rate
236 | 
237 |     return concat_feat, nb_filter
238 | 
239 | 
240 | def get_loss(pred, label, l2_weight=0.0001):
241 |     diff = tf.square(tf.subtract(pred, label))
242 |     train_vars = tf.trainable_variables()
243 |     l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
244 |     loss = tf.reduce_mean(diff + l2_loss)
245 |     tf.summary.scalar('l2 loss', l2_loss)  # l2_loss is already scaled by l2_weight above
246 |     tf.summary.scalar('loss', loss)
247 | 
248 |     return loss
249 | 
250 | 
251 | def summary_scalar(pred, label):
252 |     thresholds = [5, 4, 3, 2, 1, 0.5]
253 |     angles = [float(t) / 180 * scipy.pi for t in thresholds]  # degrees -> radians
254 |     speeds = [float(t) / 20 for t in thresholds]              # km/h -> normalized speed
255 | 
256 |     for i in range(len(thresholds)):
257 |         scalar_angle = "angle(" + str(angles[i]) + ")"
258 |         scalar_speed = "speed(" + str(speeds[i]) + ")"
259 |         ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]  # compare in the labels' radian scale
260 |         ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]  # compare in the labels' normalized scale
261 |         ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
262 |         ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))
263 | 
264 |         tf.summary.scalar(scalar_angle, ac_angle)
265 |         tf.summary.scalar(scalar_speed, ac_speed)
266 | 
267 | 
268 | def resize(imgs):
269 |     batch_size = imgs.shape[0]
270 |     imgs_new = []
271 |     for j in range(batch_size):
272 |         img = imgs[j,:,:,:]
273 |         new = scipy.misc.imresize(img, (224, 224))
274 |         imgs_new.append(new)
275 |     imgs_new = np.stack(imgs_new, axis=0)
276 |     return imgs_new
277 | 
278 | 
279 | if __name__ == '__main__':
280 |     with tf.Graph().as_default():
281 |         inputs = tf.zeros((32, 224, 224, 3))
282 |         outputs = get_model(inputs, tf.constant(True))
283 |         print(outputs)
284 | 
-------------------------------------------------------------------------------- 
/models/densenet169_pm.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | 
4 | import tensorflow as tf
5 | import scipy
6 | import scipy.misc  # resize() below uses scipy.misc.imresize; not pulled in by "import scipy" alone
7 | import numpy as np
8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
9 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
10 | 
11 | import tf_util
12 | import pointnet
13 | from custom_layers import Scale
14 | from keras.layers import (Input, Dense, Convolution2D, MaxPooling2D,
15 |                           AveragePooling2D, GlobalAveragePooling2D,
16 |                           ZeroPadding2D, Dropout, Flatten, add,
17 |                           concatenate, Reshape, Activation)
18 | from keras.layers.normalization import BatchNormalization
19 | from keras.models import Model
20 | 
21 | from keras import backend as K
22 | K.set_learning_phase(1)  # set Keras learning phase
23 | 
24 | def placeholder_inputs(batch_size, img_rows=224, img_cols=224, points=16384, separately=False):
25 |     imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
26 |     fmaps_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
27 |     if separately:
28 |         speeds_pl = tf.placeholder(tf.float32, shape=(batch_size))
29 |         angles_pl = tf.placeholder(tf.float32,
shape=(batch_size))
30 |         labels_pl = [speeds_pl, angles_pl]
31 |     else:  # default: one (speed, angle) label pair per sample
32 |         labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))
33 |     return imgs_pl, fmaps_pl, labels_pl
34 | 
35 | def get_densenet(img_rows, img_cols, nb_dense_block=4,
36 |                  growth_rate=32, nb_filter=64, reduction=0.5,
37 |                  dropout_rate=0.0, weight_decay=1e-4):
38 |     '''
39 |     DenseNet 169 Model for Keras
40 | 
41 |     Model Schema is based on
42 |     https://github.com/flyyufelix/DenseNet-Keras
43 | 
44 |     ImageNet Pretrained Weights
45 |     Theano: https://drive.google.com/open?id=0Byy2AcGyEVxfN0d3T1F1MXg0NlU
46 |     TensorFlow: https://drive.google.com/open?id=0Byy2AcGyEVxfSEc5UC1ROUFJdmM
47 | 
48 |     # Arguments
49 |         nb_dense_block: number of dense blocks to add to end
50 |         growth_rate: number of filters to add per dense block
51 |         nb_filter: initial number of filters
52 |         reduction: reduction factor of transition blocks.
53 |         dropout_rate: dropout rate
54 |         weight_decay: weight decay factor
55 |         classes: optional number of classes to classify images
56 |         weights_path: path to pre-trained weights
57 |     # Returns
58 |         A Keras model instance.
59 |     '''
60 |     eps = 1.1e-5
61 | 
62 |     # compute compression factor
63 |     compression = 1.0 - reduction
64 | 
65 |     # Handle Dimension Ordering for different backends
66 |     img_input = Input(shape=(224, 224, 3), name='data')
67 | 
68 |     # From architecture for ImageNet (Table 1 in the paper)
69 |     nb_filter = 64
70 |     nb_layers = [6,12,32,32]  # For DenseNet-169
71 | 
72 |     # Initial convolution
73 |     x = ZeroPadding2D((3, 3), name='conv1_zeropadding')(img_input)
74 |     x = Convolution2D(nb_filter, (7, 7), strides=(2, 2), name='conv1', use_bias=False)(x)
75 |     x = BatchNormalization(epsilon=eps, axis=3, name='conv1_bn')(x)
76 |     x = Scale(axis=3, name='conv1_scale')(x)
77 |     x = Activation('relu', name='relu1')(x)
78 |     x = ZeroPadding2D((1, 1), name='pool1_zeropadding')(x)
79 |     x = MaxPooling2D((3, 3), strides=(2, 2), name='pool1')(x)
80 | 
81 |     # Add dense blocks
82 |     for block_idx in range(nb_dense_block - 1):
83 |         stage = block_idx+2
84 |         x, nb_filter = dense_block(x, stage, nb_layers[block_idx], nb_filter, growth_rate, dropout_rate=dropout_rate, weight_decay=weight_decay)
85 | 
86 |         # Add transition_block
87 |         x = transition_block(x, stage, nb_filter, compression=compression, dropout_rate=dropout_rate, weight_decay=weight_decay)
88 |         nb_filter = int(nb_filter * compression)
89 | 
90 |     final_stage = stage + 1
91 |     x, nb_filter = dense_block(x, final_stage, nb_layers[-1], nb_filter, growth_rate, dropout_rate=dropout_rate, weight_decay=weight_decay)
92 | 
93 |     x = BatchNormalization(epsilon=eps, axis=3, name='conv'+str(final_stage)+'_blk_bn')(x)
94 |     x = Scale(axis=3, name='conv'+str(final_stage)+'_blk_scale')(x)
95 |     x = Activation('relu', name='relu'+str(final_stage)+'_blk')(x)
96 | 
97 |     x_fc = GlobalAveragePooling2D(name='pool'+str(final_stage))(x)
98 |     x_fc = Dense(1000, name='fc6')(x_fc)
99 |     x_fc = Activation('softmax', name='prob')(x_fc)
100 | 
101 |     model = Model(img_input, x_fc, name='densenet')
102 | 
103 |     # Use pre-trained weights for Tensorflow backend
104 |     weights_path = 'utils/weights/densenet169_weights_tf.h5'
105 | 
106 |     model.load_weights(weights_path, by_name=True)
107 | 
108 |     # Truncate and replace softmax layer for transfer learning
109 |     # Cannot use model.layers.pop() since model is not of Sequential() type
110 |     # The method below works since pre-trained weights are stored in layers but not in the model
111 |     x_newfc = GlobalAveragePooling2D(name='pool'+str(final_stage))(x)
112 | 
113 |     x_newfc =
Dense(256, name='fc7')(x_newfc) 114 | model = Model(img_input, x_newfc) 115 | 116 | return model 117 | 118 | 119 | def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False): 120 | """ Densenet169 regression model, input is BxWxHx3, output Bx2""" 121 | batch_size = net[0].get_shape()[0].value 122 | img_net, fmap_net = net[0], net[1] 123 | 124 | img_net = get_densenet(224, 224)(img_net) 125 | fmap_net = get_densenet(224, 224)(fmap_net) 126 | 127 | net = tf.reshape(tf.stack([img_net, fmap_net]), [batch_size, -1]) 128 | 129 | if not add_lstm: 130 | for i, dim in enumerate([256, 128, 16]): 131 | fc_scope = "fc" + str(i + 1) 132 | dp_scope = "dp" + str(i + 1) 133 | net = tf_util.fully_connected(net, dim, bn=True, 134 | is_training=is_training, 135 | scope=fc_scope, 136 | bn_decay=bn_decay) 137 | net = tf_util.dropout(net, keep_prob=0.7, 138 | is_training=is_training, 139 | scope=dp_scope) 140 | net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc4') 141 | else: 142 | fc_scope = "fc1" 143 | net = tf_util.fully_connected(net, 784, bn=True, 144 | is_training=is_training, 145 | scope=fc_scope, 146 | bn_decay=bn_decay) 147 | net = tf_util.dropout(net, keep_prob=0.7, 148 | is_training=is_training, 149 | scope="dp1") 150 | net = cnn_lstm_block(net) 151 | return net 152 | 153 | 154 | def cnn_lstm_block(input_tensor): 155 | lstm_in = tf.reshape(input_tensor, [-1, 28, 28]) 156 | lstm_out = tf_util.stacked_lstm(lstm_in, 157 | num_outputs=10, 158 | time_steps=28, 159 | scope="cnn_lstm") 160 | 161 | W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1)) 162 | b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1)) 163 | return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2) 164 | 165 | 166 | def conv_block(x, stage, branch, nb_filter, dropout_rate=None, weight_decay=1e-4): 167 | '''Apply BatchNorm, Relu, bottleneck 1x1 Conv2D, 3x3 Conv2D, and option dropout 168 | # Arguments 169 | x: input tensor 170 | stage: index for dense block 171 | branch: layer index within each dense block 172 | nb_filter: number of filters 173 | dropout_rate: dropout rate 174 | weight_decay: weight decay factor 175 | ''' 176 | eps = 1.1e-5 177 | conv_name_base = 'conv' + str(stage) + '_' + str(branch) 178 | relu_name_base = 'relu' + str(stage) + '_' + str(branch) 179 | 180 | # 1x1 Convolution (Bottleneck layer) 181 | inter_channel = nb_filter * 4 182 | x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_x1_bn')(x) 183 | x = Scale(axis=3, name=conv_name_base+'_x1_scale')(x) 184 | x = Activation('relu', name=relu_name_base+'_x1')(x) 185 | x = Convolution2D(inter_channel, (1, 1), name=conv_name_base+'_x1', use_bias=False)(x) 186 | 187 | if dropout_rate: 188 | x = Dropout(dropout_rate)(x) 189 | 190 | # 3x3 Convolution 191 | x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_x2_bn')(x) 192 | x = Scale(axis=3, name=conv_name_base+'_x2_scale')(x) 193 | x = Activation('relu', name=relu_name_base+'_x2')(x) 194 | x = ZeroPadding2D((1, 1), name=conv_name_base+'_x2_zeropadding')(x) 195 | x = Convolution2D(nb_filter, (3, 3), name=conv_name_base+'_x2', use_bias=False)(x) 196 | 197 | if dropout_rate: 198 | x = Dropout(dropout_rate)(x) 199 | 200 | return x 201 | 202 | 203 | def transition_block(x, stage, nb_filter, compression=1.0, dropout_rate=None, weight_decay=1E-4): 204 | ''' Apply BatchNorm, 1x1 Convolution, averagePooling, optional compression, dropout 205 | # Arguments 206 | x: input tensor 207 | stage: index for dense block 208 | nb_filter: number 
of filters
209 |         compression: calculated as 1 - reduction. Reduces the number of feature maps in the transition block.
210 |         dropout_rate: dropout rate
211 |         weight_decay: weight decay factor
212 |     '''
213 | 
214 |     eps = 1.1e-5
215 |     conv_name_base = 'conv' + str(stage) + '_blk'
216 |     relu_name_base = 'relu' + str(stage) + '_blk'
217 |     pool_name_base = 'pool' + str(stage)
218 | 
219 |     x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_bn')(x)
220 |     x = Scale(axis=3, name=conv_name_base+'_scale')(x)
221 |     x = Activation('relu', name=relu_name_base)(x)
222 |     x = Convolution2D(int(nb_filter * compression), (1, 1), name=conv_name_base, use_bias=False)(x)
223 | 
224 |     if dropout_rate:
225 |         x = Dropout(dropout_rate)(x)
226 | 
227 |     x = AveragePooling2D((2, 2), strides=(2, 2), name=pool_name_base)(x)
228 | 
229 |     return x
230 | 
231 | 
232 | def dense_block(x, stage, nb_layers, nb_filter, growth_rate, dropout_rate=None, weight_decay=1e-4, grow_nb_filters=True):
233 |     ''' Build a dense_block where the output of each conv_block is fed to subsequent ones
234 |     # Arguments
235 |         x: input tensor
236 |         stage: index for dense block
237 |         nb_layers: the number of layers of conv_block to append to the model.
238 |         nb_filter: number of filters
239 |         growth_rate: growth rate
240 |         dropout_rate: dropout rate
241 |         weight_decay: weight decay factor
242 |         grow_nb_filters: flag that allows the number of filters to grow
243 |     '''
244 | 
245 |     eps = 1.1e-5
246 |     concat_feat = x
247 | 
248 |     for i in range(nb_layers):
249 |         branch = i+1
250 |         x = conv_block(concat_feat, stage, branch, growth_rate, dropout_rate, weight_decay)
251 |         concat_feat = concatenate([concat_feat, x], axis=3, name='concat_'+str(stage)+'_'+str(branch))
252 | 
253 |         if grow_nb_filters:
254 |             nb_filter += growth_rate
255 | 
256 |     return concat_feat, nb_filter
257 | 
258 | 
259 | def get_loss(pred, label, l2_weight=0.0001):
260 |     diff = tf.square(tf.subtract(pred, label))
261 |     train_vars = tf.trainable_variables()
262 |     l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
263 |     loss = tf.reduce_mean(diff + l2_loss)
264 |     tf.summary.scalar('l2 loss', l2_loss)  # l2_loss is already scaled by l2_weight above
265 |     tf.summary.scalar('loss', loss)
266 | 
267 |     return loss
268 | 
269 | 
270 | def summary_scalar(pred, label):
271 |     thresholds = [5, 4, 3, 2, 1, 0.5]
272 |     angles = [float(t) / 180 * scipy.pi for t in thresholds]  # degrees -> radians
273 |     speeds = [float(t) / 20 for t in thresholds]              # km/h -> normalized speed
274 | 
275 |     for i in range(len(thresholds)):
276 |         scalar_angle = "angle(" + str(angles[i]) + ")"
277 |         scalar_speed = "speed(" + str(speeds[i]) + ")"
278 |         ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]  # compare in the labels' radian scale
279 |         ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]  # compare in the labels' normalized scale
280 |         ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
281 |         ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))
282 | 
283 |         tf.summary.scalar(scalar_angle, ac_angle)
284 |         tf.summary.scalar(scalar_speed, ac_speed)
285 | 
286 | 
287 | def resize(imgs):
288 |     batch_size = imgs.shape[0]
289 |     imgs_new = []
290 |     for j in range(batch_size):
291 |         img = imgs[j,:,:,:]
292 |         new = scipy.misc.imresize(img, (224, 224))
293 |         imgs_new.append(new)
294 |     imgs_new = np.stack(imgs_new, axis=0)
295 |     return imgs_new
296 | 
297 | 
298 | if __name__ == '__main__':
299 |     with tf.Graph().as_default():
300 |         imgs = tf.zeros((32, 224, 224, 3))
301 |         fmaps = tf.zeros((32, 224, 224, 3))
302 |         outputs = get_model([imgs, fmaps], tf.constant(True))
303 |         print(outputs)
304 | 
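The three `densenet169_*` model variants share one calling pattern: `placeholder_inputs()` → `get_model()` → `get_loss()`. Below is a minimal wiring sketch (not part of the repo) for the `_pm` variant above, assuming the TF 1.x graph/session setup used elsewhere in this project; the import path and the optimizer choice are illustrative assumptions:

```
import tensorflow as tf
import densenet169_pm as MODEL  # assumes models/ is on sys.path, as in evaluate.py

imgs_pl, fmaps_pl, labels_pl = MODEL.placeholder_inputs(batch_size=8)
is_training_pl = tf.placeholder(tf.bool, shape=())

pred = MODEL.get_model([imgs_pl, fmaps_pl], is_training_pl)  # Bx2: (speed, angle)
loss = MODEL.get_loss(pred, labels_pl)
train_op = tf.train.AdamOptimizer(1e-4).minimize(loss)       # optimizer choice is illustrative
```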
--------------------------------------------------------------------------------
/models/densenet169_pn.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | 
4 | import tensorflow as tf
5 | import scipy
6 | import numpy as np
7 | 
8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
9 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
10 | 
11 | import tf_util
12 | import pointnet
13 | from custom_layers import Scale
14 | from keras.layers import (Input, Dense, Convolution2D, MaxPooling2D,
15 |                           AveragePooling2D, GlobalAveragePooling2D,
16 |                           ZeroPadding2D, Dropout, Flatten, add,
17 |                           concatenate, Reshape, Activation)
18 | from keras.layers.normalization import BatchNormalization
19 | from keras.models import Model
20 | 
21 | 
22 | def placeholder_inputs(batch_size, img_rows=224, img_cols=224, points=16384, separately=False):
23 |     imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
24 |     pts_pl = tf.placeholder(tf.float32, shape=(batch_size, points, 3))
25 |     if separately:
26 |         speeds_pl = tf.placeholder(tf.float32, shape=(batch_size,))
27 |         angles_pl = tf.placeholder(tf.float32, shape=(batch_size,))
28 |         labels_pl = [speeds_pl, angles_pl]
29 |     else: labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))  # joint Bx2 (speed, angle) labels
30 |     return imgs_pl, pts_pl, labels_pl
31 | 
32 | 
33 | def get_densenet(img_rows, img_cols, nb_dense_block=4,
34 |                  growth_rate=32, nb_filter=64, reduction=0.5,
35 |                  dropout_rate=0.0, weight_decay=1e-4):
36 |     '''
37 |     DenseNet 169 Model for Keras
38 | 
39 |     Model Schema is based on
40 |     https://github.com/flyyufelix/DenseNet-Keras
41 | 
42 |     ImageNet Pretrained Weights
43 |     Theano: https://drive.google.com/open?id=0Byy2AcGyEVxfN0d3T1F1MXg0NlU
44 |     TensorFlow: https://drive.google.com/open?id=0Byy2AcGyEVxfSEc5UC1ROUFJdmM
45 | 
46 |     # Arguments
47 |         nb_dense_block: number of dense blocks to add to end
48 |         growth_rate: number of filters to add per dense block
49 |         nb_filter: initial number of filters
50 |         reduction: reduction factor of transition blocks.
51 |         dropout_rate: dropout rate
52 |         weight_decay: weight decay factor
53 |         classes: optional number of classes to classify images
54 |         weights_path: path to pre-trained weights
55 |     # Returns
56 |         A Keras model instance.
57 | ''' 58 | eps = 1.1e-5 59 | 60 | # compute compression factor 61 | compression = 1.0 - reduction 62 | 63 | # Handle Dimension Ordering for different backends 64 | img_input = Input(shape=(224, 224, 3), name='data') 65 | 66 | # From architecture for ImageNet (Table 1 in the paper) 67 | nb_filter = 64 68 | nb_layers = [6,12,32,32] # For DenseNet-169 69 | 70 | # Initial convolution 71 | x = ZeroPadding2D((3, 3), name='conv1_zeropadding')(img_input) 72 | x = Convolution2D(nb_filter, (7, 7), strides=(2, 2), name='conv1', use_bias=False)(x) 73 | x = BatchNormalization(epsilon=eps, axis=3, name='conv1_bn')(x) 74 | x = Scale(axis=3, name='conv1_scale')(x) 75 | x = Activation('relu', name='relu1')(x) 76 | x = ZeroPadding2D((1, 1), name='pool1_zeropadding')(x) 77 | x = MaxPooling2D((3, 3), strides=(2, 2), name='pool1')(x) 78 | 79 | # Add dense blocks 80 | for block_idx in range(nb_dense_block - 1): 81 | stage = block_idx+2 82 | x, nb_filter = dense_block(x, stage, nb_layers[block_idx], nb_filter, growth_rate, dropout_rate=dropout_rate, weight_decay=weight_decay) 83 | 84 | # Add transition_block 85 | x = transition_block(x, stage, nb_filter, compression=compression, dropout_rate=dropout_rate, weight_decay=weight_decay) 86 | nb_filter = int(nb_filter * compression) 87 | 88 | final_stage = stage + 1 89 | x, nb_filter = dense_block(x, final_stage, nb_layers[-1], nb_filter, growth_rate, dropout_rate=dropout_rate, weight_decay=weight_decay) 90 | 91 | x = BatchNormalization(epsilon=eps, axis=3, name='conv'+str(final_stage)+'_blk_bn')(x) 92 | x = Scale(axis=3, name='conv'+str(final_stage)+'_blk_scale')(x) 93 | x = Activation('relu', name='relu'+str(final_stage)+'_blk')(x) 94 | 95 | x_fc = GlobalAveragePooling2D(name='pool'+str(final_stage))(x) 96 | x_fc = Dense(1000, name='fc6')(x_fc) 97 | x_fc = Activation('softmax', name='prob')(x_fc) 98 | 99 | model = Model(img_input, x_fc, name='densenet') 100 | 101 | # Use pre-trained weights for Tensorflow backend 102 | weights_path = 'utils/weights/densenet169_weights_tf.h5' 103 | 104 | model.load_weights(weights_path, by_name=True) 105 | 106 | # Truncate and replace softmax layer for transfer learning 107 | # Cannot use model.layers.pop() since model is not of Sequential() type 108 | # The method below works since pre-trained weights are stored in layers but not in the model 109 | x_newfc = GlobalAveragePooling2D(name='pool'+str(final_stage))(x) 110 | 111 | x_newfc = Dense(256, name='fc7')(x_newfc) 112 | model = Model(img_input, x_newfc) 113 | 114 | return model 115 | 116 | 117 | def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False): 118 | """ Densenet169 regression model, input is BxWxHx3, output Bx2""" 119 | batch_size = net[0].get_shape()[0].value 120 | img_net, pt_net = net[0], net[1] 121 | 122 | img_net = get_densenet(299, 299)(img_net) 123 | with tf.variable_scope('pointnet'): 124 | pt_net = pointnet.get_model(pt_net, tf.constant(True)) 125 | net = tf.reshape(tf.stack([img_net, pt_net], axis=2), [batch_size, -1]) 126 | 127 | if not add_lstm: 128 | for i, dim in enumerate([256, 128, 16]): 129 | fc_scope = "fc" + str(i + 1) 130 | dp_scope = "dp" + str(i + 1) 131 | net = tf_util.fully_connected(net, dim, bn=True, 132 | is_training=is_training, 133 | scope=fc_scope, 134 | bn_decay=bn_decay) 135 | net = tf_util.dropout(net, keep_prob=0.7, 136 | is_training=is_training, 137 | scope=dp_scope) 138 | 139 | net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc4') 140 | else: 141 | fc_scope = "fc1" 142 | net = 
tf_util.fully_connected(net, 784, bn=True, 143 | is_training=is_training, 144 | scope=fc_scope, 145 | bn_decay=bn_decay) 146 | net = tf_util.dropout(net, keep_prob=0.7, 147 | is_training=is_training, 148 | scope="dp1") 149 | net = cnn_lstm_block(net) 150 | return net 151 | 152 | 153 | def cnn_lstm_block(input_tensor): 154 | lstm_in = tf.reshape(input_tensor, [-1, 28, 28]) 155 | lstm_out = tf_util.stacked_lstm(lstm_in, 156 | num_outputs=10, 157 | time_steps=28, 158 | scope="cnn_lstm") 159 | 160 | W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1)) 161 | b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1)) 162 | return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2) 163 | 164 | 165 | def conv_block(x, stage, branch, nb_filter, dropout_rate=None, weight_decay=1e-4): 166 | '''Apply BatchNorm, Relu, bottleneck 1x1 Conv2D, 3x3 Conv2D, and option dropout 167 | # Arguments 168 | x: input tensor 169 | stage: index for dense block 170 | branch: layer index within each dense block 171 | nb_filter: number of filters 172 | dropout_rate: dropout rate 173 | weight_decay: weight decay factor 174 | ''' 175 | eps = 1.1e-5 176 | conv_name_base = 'conv' + str(stage) + '_' + str(branch) 177 | relu_name_base = 'relu' + str(stage) + '_' + str(branch) 178 | 179 | # 1x1 Convolution (Bottleneck layer) 180 | inter_channel = nb_filter * 4 181 | x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_x1_bn')(x) 182 | x = Scale(axis=3, name=conv_name_base+'_x1_scale')(x) 183 | x = Activation('relu', name=relu_name_base+'_x1')(x) 184 | x = Convolution2D(inter_channel, (1, 1), name=conv_name_base+'_x1', use_bias=False)(x) 185 | 186 | if dropout_rate: 187 | x = Dropout(dropout_rate)(x) 188 | 189 | # 3x3 Convolution 190 | x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_x2_bn')(x) 191 | x = Scale(axis=3, name=conv_name_base+'_x2_scale')(x) 192 | x = Activation('relu', name=relu_name_base+'_x2')(x) 193 | x = ZeroPadding2D((1, 1), name=conv_name_base+'_x2_zeropadding')(x) 194 | x = Convolution2D(nb_filter, (3, 3), name=conv_name_base+'_x2', use_bias=False)(x) 195 | 196 | if dropout_rate: 197 | x = Dropout(dropout_rate)(x) 198 | 199 | return x 200 | 201 | 202 | def transition_block(x, stage, nb_filter, compression=1.0, dropout_rate=None, weight_decay=1E-4): 203 | ''' Apply BatchNorm, 1x1 Convolution, averagePooling, optional compression, dropout 204 | # Arguments 205 | x: input tensor 206 | stage: index for dense block 207 | nb_filter: number of filters 208 | compression: calculated as 1 - reduction. Reduces the number of feature maps in the transition block. 
209 |             dropout_rate: dropout rate
210 |             weight_decay: weight decay factor
211 |     '''
212 | 
213 |     eps = 1.1e-5
214 |     conv_name_base = 'conv' + str(stage) + '_blk'
215 |     relu_name_base = 'relu' + str(stage) + '_blk'
216 |     pool_name_base = 'pool' + str(stage)
217 | 
218 |     x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_bn')(x)
219 |     x = Scale(axis=3, name=conv_name_base+'_scale')(x)
220 |     x = Activation('relu', name=relu_name_base)(x)
221 |     x = Convolution2D(int(nb_filter * compression), (1, 1), name=conv_name_base, use_bias=False)(x)
222 | 
223 |     if dropout_rate:
224 |         x = Dropout(dropout_rate)(x)
225 | 
226 |     x = AveragePooling2D((2, 2), strides=(2, 2), name=pool_name_base)(x)
227 | 
228 |     return x
229 | 
230 | 
231 | def dense_block(x, stage, nb_layers, nb_filter, growth_rate, dropout_rate=None, weight_decay=1e-4, grow_nb_filters=True):
232 |     ''' Build a dense_block where the output of each conv_block is fed to subsequent ones
233 |         # Arguments
234 |             x: input tensor
235 |             stage: index for dense block
236 |             nb_layers: the number of layers of conv_block to append to the model.
237 |             nb_filter: number of filters
238 |             growth_rate: growth rate
239 |             dropout_rate: dropout rate
240 |             weight_decay: weight decay factor
241 |             grow_nb_filters: flag to decide to allow number of filters to grow
242 |     '''
243 | 
244 |     eps = 1.1e-5
245 |     concat_feat = x
246 | 
247 |     for i in range(nb_layers):
248 |         branch = i+1
249 |         x = conv_block(concat_feat, stage, branch, growth_rate, dropout_rate, weight_decay)
250 |         concat_feat = concatenate([concat_feat, x], axis=3, name='concat_'+str(stage)+'_'+str(branch))
251 | 
252 |         if grow_nb_filters:
253 |             nb_filter += growth_rate
254 | 
255 |     return concat_feat, nb_filter
256 | 
257 | 
258 | def get_loss(pred, label, l2_weight=0.0001):
259 |     diff = tf.square(tf.subtract(pred, label))
260 |     train_vars = tf.trainable_variables()
261 |     l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
262 |     loss = tf.reduce_mean(diff + l2_loss)
263 |     tf.summary.scalar('l2_loss', l2_loss)  # l2_loss is already scaled by l2_weight above
264 |     tf.summary.scalar('loss', loss)
265 | 
266 |     return loss
267 | 
268 | 
269 | def summary_scalar(pred, label):
270 |     thresholds = [5, 4, 3, 2, 1, 0.5]
271 |     angles = [float(t) / 180 * scipy.pi for t in thresholds]  # degree thresholds converted to radians
272 |     speeds = [float(t) / 20 for t in thresholds]  # thresholds on the 1/20-scaled speed axis
273 | 
274 |     for i in range(len(thresholds)):
275 |         scalar_angle = "angle(" + str(angles[i]) + ")"
276 |         scalar_speed = "speed(" + str(speeds[i]) + ")"
277 |         ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]
278 |         ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]
279 |         ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
280 |         ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))
281 | 
282 |         tf.summary.scalar(scalar_angle, ac_angle)
283 |         tf.summary.scalar(scalar_speed, ac_speed)
284 | 
285 | 
286 | def resize(imgs):
287 |     batch_size = imgs.shape[0]
288 |     imgs_new = []
289 |     for j in range(batch_size):
290 |         img = imgs[j, :, :, :]
291 |         new = scipy.misc.imresize(img, (224, 224))
292 |         imgs_new.append(new)
293 |     imgs_new = np.stack(imgs_new, axis=0)
294 |     return imgs_new
295 | 
296 | 
297 | if __name__ == '__main__':
298 |     with tf.Graph().as_default():
299 |         imgs = tf.zeros((32, 224, 224, 3))
300 |         pts = tf.zeros((32, 16384, 3))
301 |         outputs = get_model([imgs, pts], tf.constant(True))
302 |         print(outputs)
303 | 
--------------------------------------------------------------------------------
/models/inception_v4_io.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | 
4 | import tensorflow as tf
5 | import scipy
6 | import numpy as np
7 | 
8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
9 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
10 | 
11 | import tf_util
12 | from keras.layers import Input, Dense, Convolution2D, MaxPooling2D, AveragePooling2D, ZeroPadding2D, Dropout, Flatten, add, concatenate, Reshape, Activation
13 | from keras.layers.normalization import BatchNormalization
14 | from keras.models import Model
15 | 
16 | 
17 | def placeholder_inputs(batch_size, img_rows=299, img_cols=299, separately=False):
18 |     imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
19 |     if separately:
20 |         speeds_pl = tf.placeholder(tf.float32, shape=(batch_size,))
21 |         angles_pl = tf.placeholder(tf.float32, shape=(batch_size,))
22 |         labels_pl = [speeds_pl, angles_pl]
23 |     else: labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))  # joint Bx2 (speed, angle) labels
24 |     return imgs_pl, labels_pl
25 | 
26 | 
27 | def get_inception(img_rows=299, img_cols=299, dropout_keep_prob=0.2, separately=False):
28 |     '''
29 |     Inception V4 Model for Keras
30 | 
31 |     Model Schema is based on
32 |     https://github.com/kentsommer/keras-inceptionV4
33 | 
34 |     ImageNet Pretrained Weights
35 |     Theano: https://github.com/kentsommer/keras-inceptionV4/releases/download/2.0/inception-v4_weights_th_dim_ordering_th_kernels.h5
36 |     TensorFlow: https://github.com/kentsommer/keras-inceptionV4/releases/download/2.0/inception-v4_weights_tf_dim_ordering_tf_kernels.h5
37 | 
38 |     Parameters:
39 |         img_rows, img_cols - resolution of inputs
40 |         channel - 1 for grayscale, 3 for color
41 |         num_classes - number of class labels for our classification task
42 |     '''
43 | 
44 |     # Input Shape is 299 x 299 x 3 (tf)
45 |     img_input = Input(shape=(img_rows, img_cols, 3), name='data')
46 | 
47 |     # Make inception base
48 |     net = inception_v4_base(img_input)
49 | 
50 |     # Final pooling and prediction
51 | 
52 |     # 8 x 8 x 1536
53 |     net_old = AveragePooling2D((8,8), padding='valid')(net)
54 | 
55 |     # 1 x 1 x 1536
56 |     net_old = Dropout(dropout_keep_prob)(net_old)
57 |     net_old = Flatten()(net_old)
58 | 
59 |     # 1536
60 |     predictions = Dense(units=1001, activation='softmax')(net_old)
61 | 
62 |     model = Model(img_input, predictions, name='inception_v4')
63 | 
64 |     weights_path = 'utils/weights/inception-v4_weights_tf.h5'
65 |     assert (os.path.exists(weights_path))
66 |     model.load_weights(weights_path, by_name=True)
67 | 
68 |     # Truncate and replace softmax layer for transfer learning
69 |     # Cannot use model.layers.pop() since model is not of Sequential() type
70 |     # The method below works since pre-trained weights are stored in layers but not in the model
71 |     net_ft = AveragePooling2D((8,8), padding='valid')(net)  # padding, not the old Keras-1 border_mode kwarg
72 |     net_ft = Dropout(dropout_keep_prob)(net_ft)
73 |     net_ft = Flatten()(net_ft)
74 |     net = Dense(256, name='fc_mid')(net_ft)
75 | 
76 |     model = Model(img_input, net, name='inception_v4')
77 |     return model
78 | 
79 | 
80 | def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False):
81 |     """ Inception_V4 regression model, input is BxHxWx3, output Bx2"""
82 |     net = get_inception(299, 299)(net)
83 | 
84 |     if not add_lstm:
85 |         net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc_final')
86 | 
87 |     else:
88 |         net = tf_util.fully_connected(net, 784, bn=True,
89 |                                       is_training=is_training,
90 |                                       scope='fc_lstm',
91 |                                       bn_decay=bn_decay)
92 |         net = tf_util.dropout(net, keep_prob=0.7,
93
| is_training=is_training, 94 | scope="dp1") 95 | net = cnn_lstm_block(net) 96 | 97 | return net 98 | 99 | 100 | def cnn_lstm_block(input_tensor): 101 | lstm_in = tf.reshape(input_tensor, [-1, 28, 28]) 102 | lstm_out = tf_util.stacked_lstm(lstm_in, 103 | num_outputs=10, 104 | time_steps=28, 105 | scope="cnn_lstm") 106 | 107 | W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1)) 108 | b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1)) 109 | return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2) 110 | 111 | 112 | def conv2d_bn(x, nb_filter, nb_row, nb_col, 113 | border_mode='same', subsample=(1, 1), bias=False): 114 | """ 115 | Utility function to apply conv + BN. 116 | (Slightly modified from https://github.com/fchollet/keras/blob/master/keras/applications/inception_v3.py) 117 | """ 118 | channel_axis = -1 119 | x = Convolution2D(nb_filter, (nb_row, nb_col), 120 | strides=subsample, 121 | padding=border_mode, 122 | use_bias=bias)(x) 123 | x = BatchNormalization(axis=channel_axis)(x) 124 | x = Activation('relu')(x) 125 | return x 126 | 127 | def block_inception_a(input): 128 | channel_axis = -1 129 | 130 | branch_0 = conv2d_bn(input, 96, 1, 1) 131 | 132 | branch_1 = conv2d_bn(input, 64, 1, 1) 133 | branch_1 = conv2d_bn(branch_1, 96, 3, 3) 134 | 135 | branch_2 = conv2d_bn(input, 64, 1, 1) 136 | branch_2 = conv2d_bn(branch_2, 96, 3, 3) 137 | branch_2 = conv2d_bn(branch_2, 96, 3, 3) 138 | 139 | branch_3 = AveragePooling2D((3,3), strides=(1,1), padding='same')(input) 140 | branch_3 = conv2d_bn(branch_3, 96, 1, 1) 141 | 142 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis) 143 | return x 144 | 145 | 146 | def block_reduction_a(input): 147 | channel_axis = -1 148 | 149 | branch_0 = conv2d_bn(input, 384, 3, 3, subsample=(2,2), border_mode='valid') 150 | 151 | branch_1 = conv2d_bn(input, 192, 1, 1) 152 | branch_1 = conv2d_bn(branch_1, 224, 3, 3) 153 | branch_1 = conv2d_bn(branch_1, 256, 3, 3, subsample=(2,2), border_mode='valid') 154 | 155 | branch_2 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(input) 156 | 157 | x = concatenate([branch_0, branch_1, branch_2], axis=channel_axis) 158 | return x 159 | 160 | 161 | def block_inception_b(input): 162 | channel_axis = -1 163 | 164 | branch_0 = conv2d_bn(input, 384, 1, 1) 165 | 166 | branch_1 = conv2d_bn(input, 192, 1, 1) 167 | branch_1 = conv2d_bn(branch_1, 224, 1, 7) 168 | branch_1 = conv2d_bn(branch_1, 256, 7, 1) 169 | 170 | branch_2 = conv2d_bn(input, 192, 1, 1) 171 | branch_2 = conv2d_bn(branch_2, 192, 7, 1) 172 | branch_2 = conv2d_bn(branch_2, 224, 1, 7) 173 | branch_2 = conv2d_bn(branch_2, 224, 7, 1) 174 | branch_2 = conv2d_bn(branch_2, 256, 1, 7) 175 | 176 | branch_3 = AveragePooling2D((3,3), strides=(1,1), padding='same')(input) 177 | branch_3 = conv2d_bn(branch_3, 128, 1, 1) 178 | 179 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis) 180 | return x 181 | 182 | 183 | def block_reduction_b(input): 184 | channel_axis = -1 185 | 186 | branch_0 = conv2d_bn(input, 192, 1, 1) 187 | branch_0 = conv2d_bn(branch_0, 192, 3, 3, subsample=(2, 2), border_mode='valid') 188 | 189 | branch_1 = conv2d_bn(input, 256, 1, 1) 190 | branch_1 = conv2d_bn(branch_1, 256, 1, 7) 191 | branch_1 = conv2d_bn(branch_1, 320, 7, 1) 192 | branch_1 = conv2d_bn(branch_1, 320, 3, 3, subsample=(2,2), border_mode='valid') 193 | 194 | branch_2 = MaxPooling2D((3, 3), strides=(2, 2), padding='valid')(input) 195 | 196 | x = concatenate([branch_0, branch_1, branch_2], axis=channel_axis) 197 
| return x 198 | 199 | 200 | def block_inception_c(input): 201 | channel_axis = -1 202 | 203 | branch_0 = conv2d_bn(input, 256, 1, 1) 204 | 205 | branch_1 = conv2d_bn(input, 384, 1, 1) 206 | branch_10 = conv2d_bn(branch_1, 256, 1, 3) 207 | branch_11 = conv2d_bn(branch_1, 256, 3, 1) 208 | branch_1 = concatenate([branch_10, branch_11], axis=channel_axis) 209 | 210 | 211 | branch_2 = conv2d_bn(input, 384, 1, 1) 212 | branch_2 = conv2d_bn(branch_2, 448, 3, 1) 213 | branch_2 = conv2d_bn(branch_2, 512, 1, 3) 214 | branch_20 = conv2d_bn(branch_2, 256, 1, 3) 215 | branch_21 = conv2d_bn(branch_2, 256, 3, 1) 216 | branch_2 = concatenate([branch_20, branch_21], axis=channel_axis) 217 | 218 | branch_3 = AveragePooling2D((3, 3), strides=(1, 1), padding='same')(input) 219 | branch_3 = conv2d_bn(branch_3, 256, 1, 1) 220 | 221 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis) 222 | return x 223 | 224 | 225 | def inception_v4_base(input): 226 | channel_axis = -1 227 | 228 | # Input Shape is 299 x 299 x 3 (th) or 3 x 299 x 299 (th) 229 | net = conv2d_bn(input, 32, 3, 3, subsample=(2,2), border_mode='valid') 230 | net = conv2d_bn(net, 32, 3, 3, border_mode='valid') 231 | net = conv2d_bn(net, 64, 3, 3) 232 | 233 | branch_0 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(net) 234 | 235 | branch_1 = conv2d_bn(net, 96, 3, 3, subsample=(2,2), border_mode='valid') 236 | 237 | net = concatenate([branch_0, branch_1], axis=channel_axis) 238 | 239 | branch_0 = conv2d_bn(net, 64, 1, 1) 240 | branch_0 = conv2d_bn(branch_0, 96, 3, 3, border_mode='valid') 241 | 242 | branch_1 = conv2d_bn(net, 64, 1, 1) 243 | branch_1 = conv2d_bn(branch_1, 64, 1, 7) 244 | branch_1 = conv2d_bn(branch_1, 64, 7, 1) 245 | branch_1 = conv2d_bn(branch_1, 96, 3, 3, border_mode='valid') 246 | 247 | net = concatenate([branch_0, branch_1], axis=channel_axis) 248 | 249 | branch_0 = conv2d_bn(net, 192, 3, 3, subsample=(2,2), border_mode='valid') 250 | branch_1 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(net) 251 | 252 | net = concatenate([branch_0, branch_1], axis=channel_axis) 253 | 254 | # 35 x 35 x 384 255 | # 4 x Inception-A blocks 256 | for idx in xrange(4): 257 | net = block_inception_a(net) 258 | 259 | # 35 x 35 x 384 260 | # Reduction-A block 261 | net = block_reduction_a(net) 262 | 263 | # 17 x 17 x 1024 264 | # 7 x Inception-B blocks 265 | for idx in xrange(7): 266 | net = block_inception_b(net) 267 | 268 | # 17 x 17 x 1024 269 | # Reduction-B block 270 | net = block_reduction_b(net) 271 | 272 | # 8 x 8 x 1536 273 | # 3 x Inception-C blocks 274 | for idx in xrange(3): 275 | net = block_inception_c(net) 276 | 277 | return net 278 | 279 | 280 | def get_loss(pred, label, l2_weight=0.0001): 281 | diff = tf.square(tf.subtract(pred, label)) 282 | train_vars = tf.trainable_variables() 283 | l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight 284 | loss = tf.reduce_mean(diff + l2_loss) 285 | tf.summary.scalar('l2 loss', l2_loss * l2_weight) 286 | tf.summary.scalar('loss', loss) 287 | 288 | return loss 289 | 290 | 291 | def summary_scalar(pred, label): 292 | threholds = [5, 4, 3, 2, 1, 0.5] 293 | angles = [float(t) / 180 * scipy.pi for t in threholds] 294 | speeds = [float(t) / 20 for t in threholds] 295 | 296 | for i in range(len(threholds)): 297 | scalar_angle = "angle(" + str(angles[i]) + ")" 298 | scalar_speed = "speed(" + str(speeds[i]) + ")" 299 | ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < threholds[i] 300 | ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) 
< threholds[i] 301 | ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32)) 302 | ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32)) 303 | 304 | tf.summary.scalar(scalar_angle, ac_angle) 305 | tf.summary.scalar(scalar_speed, ac_speed) 306 | 307 | 308 | def resize(imgs): 309 | batch_size = imgs.shape[0] 310 | imgs_new = [] 311 | for j in range(batch_size): 312 | img = imgs[j,:,:,:] 313 | new = scipy.misc.imresize(img, (299, 299)) 314 | imgs_new.append(new) 315 | imgs_new = np.stack(imgs_new, axis=0) 316 | return imgs_new 317 | 318 | 319 | if __name__ == '__main__': 320 | with tf.Graph().as_default(): 321 | inputs = tf.zeros((32, 224, 224, 3)) 322 | outputs = get_model(inputs, tf.constant(True)) 323 | print(outputs) 324 | 325 | -------------------------------------------------------------------------------- /models/inception_v4_pm.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | 4 | import tensorflow as tf 5 | import scipy 6 | import numpy as np 7 | 8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__)) 9 | sys.path.append(os.path.join(BASE_DIR, '../utils')) 10 | 11 | import tf_util 12 | from keras.layers import Input, Dense, Convolution2D, MaxPooling2D, AveragePooling2D, ZeroPadding2D, Dropout, Flatten, add, concatenate, Reshape, Activation 13 | from keras.layers.normalization import BatchNormalization 14 | from keras.models import Model 15 | 16 | from keras import backend as K 17 | K.set_learning_phase(1) #set learning phase 18 | 19 | 20 | def placeholder_inputs(batch_size, img_rows=299, img_cols=299, points=16384, separately=False): 21 | imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3)) 22 | fmaps_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3)) 23 | if separately: 24 | speeds_pl = tf.placeholder(tf.float32, shape=(batch_size)) 25 | angles_pl = tf.placeholder(tf.float32, shape=(batch_size)) 26 | labels_pl = [speeds_pl, angles_pl] 27 | labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2)) 28 | return imgs_pl, fmaps_pl, labels_pl 29 | 30 | 31 | def get_inception(img_rows=299, img_cols=299, dropout_keep_prob=0.2, separately=False): 32 | ''' 33 | Inception V4 Model for Keras 34 | 35 | Model Schema is based on 36 | https://github.com/kentsommer/keras-inceptionV4 37 | 38 | ImageNet Pretrained Weights 39 | Theano: https://github.com/kentsommer/keras-inceptionV4/releases/download/2.0/inception-v4_weights_th_dim_ordering_th_kernels.h5 40 | TensorFlow: https://github.com/kentsommer/keras-inceptionV4/releases/download/2.0/inception-v4_weights_tf_dim_ordering_tf_kernels.h5 41 | 42 | Parameters: 43 | img_rows, img_cols - resolution of inputs 44 | channel - 1 for grayscale, 3 for color 45 | num_classes - number of class labels for our classification task 46 | ''' 47 | 48 | # Input Shape is 299 x 299 x 3 (tf) 49 | img_input = Input(shape=(img_rows, img_cols, 3), name='data') 50 | 51 | # Make inception base 52 | net = inception_v4_base(img_input) 53 | 54 | # Final pooling and prediction 55 | 56 | # 8 x 8 x 1536 57 | net_old = AveragePooling2D((8,8), padding='valid')(net) 58 | 59 | # 1 x 1 x 1536 60 | net_old = Dropout(dropout_keep_prob)(net_old) 61 | net_old = Flatten()(net_old) 62 | 63 | # 1536 64 | predictions = Dense(units=1001, activation='softmax')(net_old) 65 | 66 | model = Model(img_input, predictions, name='inception_v4') 67 | 68 | weights_path = 'utils/weights/inception-v4_weights_tf.h5' 69 | assert (os.path.exists(weights_path)) 70 | 
model.load_weights(weights_path, by_name=True) 71 | 72 | # Truncate and replace softmax layer for transfer learning 73 | # Cannot use model.layers.pop() since model is not of Sequential() type 74 | # The method below works since pre-trained weights are stored in layers but not in the model 75 | net_ft = AveragePooling2D((8,8), border_mode='valid')(net) 76 | net_ft = Dropout(dropout_keep_prob)(net_ft) 77 | net_ft = Flatten()(net_ft) 78 | net = Dense(256, name='fc_mid')(net_ft) 79 | 80 | model = Model(img_input, net, name='inception_v4') 81 | return model 82 | 83 | 84 | def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False): 85 | """ Inception_V4 regression model, input is BxWxHx3, output Bx2""" 86 | batch_size = net[0].get_shape()[0].value 87 | img_net, fmap_net = net[0], net[1] 88 | 89 | img_net = get_inception(299, 299)(img_net) 90 | fmap_net = get_inception(299, 299)(fmap_net) 91 | 92 | net = tf.reshape(tf.stack([img_net, fmap_net]), [batch_size, -1]) 93 | 94 | if not add_lstm: 95 | for i, dim in enumerate([256, 128, 16]): 96 | fc_scope = "fc" + str(i + 1) 97 | dp_scope = "dp" + str(i + 1) 98 | net = tf_util.fully_connected(net, dim, bn=True, 99 | is_training=is_training, 100 | scope=fc_scope, 101 | bn_decay=bn_decay) 102 | net = tf_util.dropout(net, keep_prob=0.7, 103 | is_training=is_training, 104 | scope=dp_scope) 105 | net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc4') 106 | else: 107 | fc_scope = "fc1" 108 | net = tf_util.fully_connected(net, 784, bn=True, 109 | is_training=is_training, 110 | scope=fc_scope, 111 | bn_decay=bn_decay) 112 | net = tf_util.dropout(net, keep_prob=0.7, 113 | is_training=is_training, 114 | scope="dp1") 115 | net = cnn_lstm_block(net) 116 | return net 117 | 118 | 119 | def cnn_lstm_block(input_tensor): 120 | lstm_in = tf.reshape(input_tensor, [-1, 28, 28]) 121 | lstm_out = tf_util.stacked_lstm(lstm_in, 122 | num_outputs=10, 123 | time_steps=28, 124 | scope="cnn_lstm") 125 | 126 | W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1)) 127 | b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1)) 128 | return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2) 129 | 130 | 131 | def conv2d_bn(x, nb_filter, nb_row, nb_col, 132 | border_mode='same', subsample=(1, 1), bias=False): 133 | """ 134 | Utility function to apply conv + BN. 
135 | (Slightly modified from https://github.com/fchollet/keras/blob/master/keras/applications/inception_v3.py) 136 | """ 137 | channel_axis = -1 138 | x = Convolution2D(nb_filter, (nb_row, nb_col), 139 | strides=subsample, 140 | padding=border_mode, 141 | use_bias=bias)(x) 142 | x = BatchNormalization(axis=channel_axis)(x) 143 | x = Activation('relu')(x) 144 | return x 145 | 146 | def block_inception_a(input): 147 | channel_axis = -1 148 | 149 | branch_0 = conv2d_bn(input, 96, 1, 1) 150 | 151 | branch_1 = conv2d_bn(input, 64, 1, 1) 152 | branch_1 = conv2d_bn(branch_1, 96, 3, 3) 153 | 154 | branch_2 = conv2d_bn(input, 64, 1, 1) 155 | branch_2 = conv2d_bn(branch_2, 96, 3, 3) 156 | branch_2 = conv2d_bn(branch_2, 96, 3, 3) 157 | 158 | branch_3 = AveragePooling2D((3,3), strides=(1,1), padding='same')(input) 159 | branch_3 = conv2d_bn(branch_3, 96, 1, 1) 160 | 161 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis) 162 | return x 163 | 164 | 165 | def block_reduction_a(input): 166 | channel_axis = -1 167 | 168 | branch_0 = conv2d_bn(input, 384, 3, 3, subsample=(2,2), border_mode='valid') 169 | 170 | branch_1 = conv2d_bn(input, 192, 1, 1) 171 | branch_1 = conv2d_bn(branch_1, 224, 3, 3) 172 | branch_1 = conv2d_bn(branch_1, 256, 3, 3, subsample=(2,2), border_mode='valid') 173 | 174 | branch_2 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(input) 175 | 176 | x = concatenate([branch_0, branch_1, branch_2], axis=channel_axis) 177 | return x 178 | 179 | 180 | def block_inception_b(input): 181 | channel_axis = -1 182 | 183 | branch_0 = conv2d_bn(input, 384, 1, 1) 184 | 185 | branch_1 = conv2d_bn(input, 192, 1, 1) 186 | branch_1 = conv2d_bn(branch_1, 224, 1, 7) 187 | branch_1 = conv2d_bn(branch_1, 256, 7, 1) 188 | 189 | branch_2 = conv2d_bn(input, 192, 1, 1) 190 | branch_2 = conv2d_bn(branch_2, 192, 7, 1) 191 | branch_2 = conv2d_bn(branch_2, 224, 1, 7) 192 | branch_2 = conv2d_bn(branch_2, 224, 7, 1) 193 | branch_2 = conv2d_bn(branch_2, 256, 1, 7) 194 | 195 | branch_3 = AveragePooling2D((3,3), strides=(1,1), padding='same')(input) 196 | branch_3 = conv2d_bn(branch_3, 128, 1, 1) 197 | 198 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis) 199 | return x 200 | 201 | 202 | def block_reduction_b(input): 203 | channel_axis = -1 204 | 205 | branch_0 = conv2d_bn(input, 192, 1, 1) 206 | branch_0 = conv2d_bn(branch_0, 192, 3, 3, subsample=(2, 2), border_mode='valid') 207 | 208 | branch_1 = conv2d_bn(input, 256, 1, 1) 209 | branch_1 = conv2d_bn(branch_1, 256, 1, 7) 210 | branch_1 = conv2d_bn(branch_1, 320, 7, 1) 211 | branch_1 = conv2d_bn(branch_1, 320, 3, 3, subsample=(2,2), border_mode='valid') 212 | 213 | branch_2 = MaxPooling2D((3, 3), strides=(2, 2), padding='valid')(input) 214 | 215 | x = concatenate([branch_0, branch_1, branch_2], axis=channel_axis) 216 | return x 217 | 218 | 219 | def block_inception_c(input): 220 | channel_axis = -1 221 | 222 | branch_0 = conv2d_bn(input, 256, 1, 1) 223 | 224 | branch_1 = conv2d_bn(input, 384, 1, 1) 225 | branch_10 = conv2d_bn(branch_1, 256, 1, 3) 226 | branch_11 = conv2d_bn(branch_1, 256, 3, 1) 227 | branch_1 = concatenate([branch_10, branch_11], axis=channel_axis) 228 | 229 | 230 | branch_2 = conv2d_bn(input, 384, 1, 1) 231 | branch_2 = conv2d_bn(branch_2, 448, 3, 1) 232 | branch_2 = conv2d_bn(branch_2, 512, 1, 3) 233 | branch_20 = conv2d_bn(branch_2, 256, 1, 3) 234 | branch_21 = conv2d_bn(branch_2, 256, 3, 1) 235 | branch_2 = concatenate([branch_20, branch_21], axis=channel_axis) 236 | 237 | branch_3 = 
AveragePooling2D((3, 3), strides=(1, 1), padding='same')(input) 238 | branch_3 = conv2d_bn(branch_3, 256, 1, 1) 239 | 240 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis) 241 | return x 242 | 243 | 244 | def inception_v4_base(input): 245 | channel_axis = -1 246 | 247 | # Input Shape is 299 x 299 x 3 (th) or 3 x 299 x 299 (th) 248 | net = conv2d_bn(input, 32, 3, 3, subsample=(2,2), border_mode='valid') 249 | net = conv2d_bn(net, 32, 3, 3, border_mode='valid') 250 | net = conv2d_bn(net, 64, 3, 3) 251 | 252 | branch_0 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(net) 253 | 254 | branch_1 = conv2d_bn(net, 96, 3, 3, subsample=(2,2), border_mode='valid') 255 | 256 | net = concatenate([branch_0, branch_1], axis=channel_axis) 257 | 258 | branch_0 = conv2d_bn(net, 64, 1, 1) 259 | branch_0 = conv2d_bn(branch_0, 96, 3, 3, border_mode='valid') 260 | 261 | branch_1 = conv2d_bn(net, 64, 1, 1) 262 | branch_1 = conv2d_bn(branch_1, 64, 1, 7) 263 | branch_1 = conv2d_bn(branch_1, 64, 7, 1) 264 | branch_1 = conv2d_bn(branch_1, 96, 3, 3, border_mode='valid') 265 | 266 | net = concatenate([branch_0, branch_1], axis=channel_axis) 267 | 268 | branch_0 = conv2d_bn(net, 192, 3, 3, subsample=(2,2), border_mode='valid') 269 | branch_1 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(net) 270 | 271 | net = concatenate([branch_0, branch_1], axis=channel_axis) 272 | 273 | # 35 x 35 x 384 274 | # 4 x Inception-A blocks 275 | for idx in xrange(4): 276 | net = block_inception_a(net) 277 | 278 | # 35 x 35 x 384 279 | # Reduction-A block 280 | net = block_reduction_a(net) 281 | 282 | # 17 x 17 x 1024 283 | # 7 x Inception-B blocks 284 | for idx in xrange(7): 285 | net = block_inception_b(net) 286 | 287 | # 17 x 17 x 1024 288 | # Reduction-B block 289 | net = block_reduction_b(net) 290 | 291 | # 8 x 8 x 1536 292 | # 3 x Inception-C blocks 293 | for idx in xrange(3): 294 | net = block_inception_c(net) 295 | 296 | return net 297 | 298 | 299 | def get_loss(pred, label, l2_weight=0.0001): 300 | diff = tf.square(tf.subtract(pred, label)) 301 | train_vars = tf.trainable_variables() 302 | l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight 303 | loss = tf.reduce_mean(diff + l2_loss) 304 | tf.summary.scalar('l2 loss', l2_loss * l2_weight) 305 | tf.summary.scalar('loss', loss) 306 | 307 | return loss 308 | 309 | 310 | def summary_scalar(pred, label): 311 | threholds = [5, 4, 3, 2, 1, 0.5] 312 | angles = [float(t) / 180 * scipy.pi for t in threholds] 313 | speeds = [float(t) / 20 for t in threholds] 314 | 315 | for i in range(len(threholds)): 316 | scalar_angle = "angle(" + str(angles[i]) + ")" 317 | scalar_speed = "speed(" + str(speeds[i]) + ")" 318 | ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < threholds[i] 319 | ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < threholds[i] 320 | ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32)) 321 | ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32)) 322 | 323 | tf.summary.scalar(scalar_angle, ac_angle) 324 | tf.summary.scalar(scalar_speed, ac_speed) 325 | 326 | 327 | def resize(imgs): 328 | batch_size = imgs.shape[0] 329 | imgs_new = [] 330 | for j in range(batch_size): 331 | img = imgs[j,:,:,:] 332 | new = scipy.misc.imresize(img, (299, 299)) 333 | imgs_new.append(new) 334 | imgs_new = np.stack(imgs_new, axis=0) 335 | return imgs_new 336 | 337 | 338 | if __name__ == '__main__': 339 | with tf.Graph().as_default(): 340 | imgs = tf.zeros((32, 224, 224, 3)) 341 | fmaps = tf.zeros((32, 224, 224, 3)) 
342 | outputs = get_model([imgs, fmaps], tf.constant(True)) 343 | print(outputs) 344 | 345 | -------------------------------------------------------------------------------- /models/inception_v4_pn.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | 4 | import tensorflow as tf 5 | import scipy 6 | import numpy as np 7 | 8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__)) 9 | sys.path.append(os.path.join(BASE_DIR, '../utils')) 10 | 11 | import tf_util 12 | import pointnet 13 | from keras.layers import Input, Dense, Convolution2D, MaxPooling2D, AveragePooling2D, ZeroPadding2D, Dropout, Flatten, add, concatenate, Reshape, Activation 14 | from keras.layers.normalization import BatchNormalization 15 | from keras.models import Model 16 | 17 | from keras import backend as K 18 | K.set_learning_phase(1) #set learning phase 19 | 20 | 21 | def placeholder_inputs(batch_size, img_rows=299, img_cols=299, points=16384, separately=False): 22 | imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3)) 23 | pts_pl = tf.placeholder(tf.float32, shape=(batch_size, points, 3)) 24 | if separately: 25 | speeds_pl = tf.placeholder(tf.float32, shape=(batch_size)) 26 | angles_pl = tf.placeholder(tf.float32, shape=(batch_size)) 27 | labels_pl = [speeds_pl, angles_pl] 28 | labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2)) 29 | return imgs_pl, pts_pl, labels_pl 30 | 31 | 32 | def get_inception(img_rows=299, img_cols=299, dropout_keep_prob=0.2, separately=False): 33 | ''' 34 | Inception V4 Model for Keras 35 | 36 | Model Schema is based on 37 | https://github.com/kentsommer/keras-inceptionV4 38 | 39 | ImageNet Pretrained Weights 40 | Theano: https://github.com/kentsommer/keras-inceptionV4/releases/download/2.0/inception-v4_weights_th_dim_ordering_th_kernels.h5 41 | TensorFlow: https://github.com/kentsommer/keras-inceptionV4/releases/download/2.0/inception-v4_weights_tf_dim_ordering_tf_kernels.h5 42 | 43 | Parameters: 44 | img_rows, img_cols - resolution of inputs 45 | channel - 1 for grayscale, 3 for color 46 | num_classes - number of class labels for our classification task 47 | ''' 48 | 49 | # Input Shape is 299 x 299 x 3 (tf) 50 | img_input = Input(shape=(img_rows, img_cols, 3), name='data') 51 | 52 | # Make inception base 53 | net = inception_v4_base(img_input) 54 | 55 | # Final pooling and prediction 56 | 57 | # 8 x 8 x 1536 58 | net_old = AveragePooling2D((8,8), padding='valid')(net) 59 | 60 | # 1 x 1 x 1536 61 | net_old = Dropout(dropout_keep_prob)(net_old) 62 | net_old = Flatten()(net_old) 63 | 64 | # 1536 65 | predictions = Dense(units=1001, activation='softmax')(net_old) 66 | 67 | model = Model(img_input, predictions, name='inception_v4') 68 | 69 | weights_path = 'utils/weights/inception-v4_weights_tf.h5' 70 | assert (os.path.exists(weights_path)) 71 | model.load_weights(weights_path, by_name=True) 72 | 73 | # Truncate and replace softmax layer for transfer learning 74 | # Cannot use model.layers.pop() since model is not of Sequential() type 75 | # The method below works since pre-trained weights are stored in layers but not in the model 76 | net_ft = AveragePooling2D((8,8), border_mode='valid')(net) 77 | net_ft = Dropout(dropout_keep_prob)(net_ft) 78 | net_ft = Flatten()(net_ft) 79 | net = Dense(256, name='fc_mid')(net_ft) 80 | 81 | model = Model(img_input, net, name='inception_v4') 82 | return model 83 | 84 | 85 | def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False): 86 | """ 
Inception_V4 regression model, input is BxWxHx3, output Bx2""" 87 | batch_size = net[0].get_shape()[0].value 88 | img_net, pt_net = net[0], net[1] 89 | 90 | img_net = get_inception(299, 299)(img_net) 91 | with tf.variable_scope('pointnet'): 92 | pt_net = pointnet.get_model(pt_net, tf.constant(True)) 93 | net = tf.reshape(tf.stack([img_net, pt_net], axis=2), [batch_size, -1]) 94 | 95 | if not add_lstm: 96 | for i, dim in enumerate([256, 128, 16]): 97 | fc_scope = "fc" + str(i + 1) 98 | dp_scope = "dp" + str(i + 1) 99 | net = tf_util.fully_connected(net, dim, bn=True, 100 | is_training=is_training, 101 | scope=fc_scope, 102 | bn_decay=bn_decay) 103 | net = tf_util.dropout(net, keep_prob=0.7, 104 | is_training=is_training, 105 | scope=dp_scope) 106 | 107 | net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc4') 108 | else: 109 | fc_scope = "fc1" 110 | net = tf_util.fully_connected(net, 784, bn=True, 111 | is_training=is_training, 112 | scope=fc_scope, 113 | bn_decay=bn_decay) 114 | net = tf_util.dropout(net, keep_prob=0.7, 115 | is_training=is_training, 116 | scope="dp1") 117 | net = cnn_lstm_block(net) 118 | return net 119 | 120 | 121 | def cnn_lstm_block(input_tensor): 122 | lstm_in = tf.reshape(input_tensor, [-1, 28, 28]) 123 | lstm_out = tf_util.stacked_lstm(lstm_in, 124 | num_outputs=10, 125 | time_steps=28, 126 | scope="cnn_lstm") 127 | 128 | W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1)) 129 | b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1)) 130 | return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2) 131 | 132 | 133 | def conv2d_bn(x, nb_filter, nb_row, nb_col, 134 | border_mode='same', subsample=(1, 1), bias=False): 135 | """ 136 | Utility function to apply conv + BN. 137 | (Slightly modified from https://github.com/fchollet/keras/blob/master/keras/applications/inception_v3.py) 138 | """ 139 | channel_axis = -1 140 | x = Convolution2D(nb_filter, (nb_row, nb_col), 141 | strides=subsample, 142 | padding=border_mode, 143 | use_bias=bias)(x) 144 | x = BatchNormalization(axis=channel_axis)(x) 145 | x = Activation('relu')(x) 146 | return x 147 | 148 | def block_inception_a(input): 149 | channel_axis = -1 150 | 151 | branch_0 = conv2d_bn(input, 96, 1, 1) 152 | 153 | branch_1 = conv2d_bn(input, 64, 1, 1) 154 | branch_1 = conv2d_bn(branch_1, 96, 3, 3) 155 | 156 | branch_2 = conv2d_bn(input, 64, 1, 1) 157 | branch_2 = conv2d_bn(branch_2, 96, 3, 3) 158 | branch_2 = conv2d_bn(branch_2, 96, 3, 3) 159 | 160 | branch_3 = AveragePooling2D((3,3), strides=(1,1), padding='same')(input) 161 | branch_3 = conv2d_bn(branch_3, 96, 1, 1) 162 | 163 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis) 164 | return x 165 | 166 | 167 | def block_reduction_a(input): 168 | channel_axis = -1 169 | 170 | branch_0 = conv2d_bn(input, 384, 3, 3, subsample=(2,2), border_mode='valid') 171 | 172 | branch_1 = conv2d_bn(input, 192, 1, 1) 173 | branch_1 = conv2d_bn(branch_1, 224, 3, 3) 174 | branch_1 = conv2d_bn(branch_1, 256, 3, 3, subsample=(2,2), border_mode='valid') 175 | 176 | branch_2 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(input) 177 | 178 | x = concatenate([branch_0, branch_1, branch_2], axis=channel_axis) 179 | return x 180 | 181 | 182 | def block_inception_b(input): 183 | channel_axis = -1 184 | 185 | branch_0 = conv2d_bn(input, 384, 1, 1) 186 | 187 | branch_1 = conv2d_bn(input, 192, 1, 1) 188 | branch_1 = conv2d_bn(branch_1, 224, 1, 7) 189 | branch_1 = conv2d_bn(branch_1, 256, 7, 1) 190 | 191 | branch_2 = 
conv2d_bn(input, 192, 1, 1) 192 | branch_2 = conv2d_bn(branch_2, 192, 7, 1) 193 | branch_2 = conv2d_bn(branch_2, 224, 1, 7) 194 | branch_2 = conv2d_bn(branch_2, 224, 7, 1) 195 | branch_2 = conv2d_bn(branch_2, 256, 1, 7) 196 | 197 | branch_3 = AveragePooling2D((3,3), strides=(1,1), padding='same')(input) 198 | branch_3 = conv2d_bn(branch_3, 128, 1, 1) 199 | 200 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis) 201 | return x 202 | 203 | 204 | def block_reduction_b(input): 205 | channel_axis = -1 206 | 207 | branch_0 = conv2d_bn(input, 192, 1, 1) 208 | branch_0 = conv2d_bn(branch_0, 192, 3, 3, subsample=(2, 2), border_mode='valid') 209 | 210 | branch_1 = conv2d_bn(input, 256, 1, 1) 211 | branch_1 = conv2d_bn(branch_1, 256, 1, 7) 212 | branch_1 = conv2d_bn(branch_1, 320, 7, 1) 213 | branch_1 = conv2d_bn(branch_1, 320, 3, 3, subsample=(2,2), border_mode='valid') 214 | 215 | branch_2 = MaxPooling2D((3, 3), strides=(2, 2), padding='valid')(input) 216 | 217 | x = concatenate([branch_0, branch_1, branch_2], axis=channel_axis) 218 | return x 219 | 220 | 221 | def block_inception_c(input): 222 | channel_axis = -1 223 | 224 | branch_0 = conv2d_bn(input, 256, 1, 1) 225 | 226 | branch_1 = conv2d_bn(input, 384, 1, 1) 227 | branch_10 = conv2d_bn(branch_1, 256, 1, 3) 228 | branch_11 = conv2d_bn(branch_1, 256, 3, 1) 229 | branch_1 = concatenate([branch_10, branch_11], axis=channel_axis) 230 | 231 | 232 | branch_2 = conv2d_bn(input, 384, 1, 1) 233 | branch_2 = conv2d_bn(branch_2, 448, 3, 1) 234 | branch_2 = conv2d_bn(branch_2, 512, 1, 3) 235 | branch_20 = conv2d_bn(branch_2, 256, 1, 3) 236 | branch_21 = conv2d_bn(branch_2, 256, 3, 1) 237 | branch_2 = concatenate([branch_20, branch_21], axis=channel_axis) 238 | 239 | branch_3 = AveragePooling2D((3, 3), strides=(1, 1), padding='same')(input) 240 | branch_3 = conv2d_bn(branch_3, 256, 1, 1) 241 | 242 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis) 243 | return x 244 | 245 | 246 | def inception_v4_base(input): 247 | channel_axis = -1 248 | 249 | # Input Shape is 299 x 299 x 3 (th) or 3 x 299 x 299 (th) 250 | net = conv2d_bn(input, 32, 3, 3, subsample=(2,2), border_mode='valid') 251 | net = conv2d_bn(net, 32, 3, 3, border_mode='valid') 252 | net = conv2d_bn(net, 64, 3, 3) 253 | 254 | branch_0 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(net) 255 | 256 | branch_1 = conv2d_bn(net, 96, 3, 3, subsample=(2,2), border_mode='valid') 257 | 258 | net = concatenate([branch_0, branch_1], axis=channel_axis) 259 | 260 | branch_0 = conv2d_bn(net, 64, 1, 1) 261 | branch_0 = conv2d_bn(branch_0, 96, 3, 3, border_mode='valid') 262 | 263 | branch_1 = conv2d_bn(net, 64, 1, 1) 264 | branch_1 = conv2d_bn(branch_1, 64, 1, 7) 265 | branch_1 = conv2d_bn(branch_1, 64, 7, 1) 266 | branch_1 = conv2d_bn(branch_1, 96, 3, 3, border_mode='valid') 267 | 268 | net = concatenate([branch_0, branch_1], axis=channel_axis) 269 | 270 | branch_0 = conv2d_bn(net, 192, 3, 3, subsample=(2,2), border_mode='valid') 271 | branch_1 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(net) 272 | 273 | net = concatenate([branch_0, branch_1], axis=channel_axis) 274 | 275 | # 35 x 35 x 384 276 | # 4 x Inception-A blocks 277 | for idx in xrange(4): 278 | net = block_inception_a(net) 279 | 280 | # 35 x 35 x 384 281 | # Reduction-A block 282 | net = block_reduction_a(net) 283 | 284 | # 17 x 17 x 1024 285 | # 7 x Inception-B blocks 286 | for idx in xrange(7): 287 | net = block_inception_b(net) 288 | 289 | # 17 x 17 x 1024 290 | # Reduction-B block 
291 |     net = block_reduction_b(net)
292 | 
293 |     # 8 x 8 x 1536
294 |     # 3 x Inception-C blocks
295 |     for idx in xrange(3):
296 |         net = block_inception_c(net)
297 | 
298 |     return net
299 | 
300 | 
301 | def get_loss(pred, label, l2_weight=0.0001):
302 |     diff = tf.square(tf.subtract(pred, label))
303 |     train_vars = tf.trainable_variables()
304 |     l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
305 |     loss = tf.reduce_mean(diff + l2_loss)
306 |     tf.summary.scalar('l2_loss', l2_loss)  # l2_loss is already scaled by l2_weight above
307 |     tf.summary.scalar('loss', loss)
308 | 
309 |     return loss
310 | 
311 | 
312 | def summary_scalar(pred, label):
313 |     thresholds = [5, 4, 3, 2, 1, 0.5]
314 |     angles = [float(t) / 180 * scipy.pi for t in thresholds]  # degree thresholds converted to radians
315 |     speeds = [float(t) / 20 for t in thresholds]  # thresholds on the 1/20-scaled speed axis
316 | 
317 |     for i in range(len(thresholds)):
318 |         scalar_angle = "angle(" + str(angles[i]) + ")"
319 |         scalar_speed = "speed(" + str(speeds[i]) + ")"
320 |         ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]
321 |         ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]
322 |         ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
323 |         ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))
324 | 
325 |         tf.summary.scalar(scalar_angle, ac_angle)
326 |         tf.summary.scalar(scalar_speed, ac_speed)
327 | 
328 | 
329 | def resize(imgs):
330 |     batch_size = imgs.shape[0]
331 |     imgs_new = []
332 |     for j in range(batch_size):
333 |         img = imgs[j, :, :, :]
334 |         new = scipy.misc.imresize(img, (299, 299))
335 |         imgs_new.append(new)
336 |     imgs_new = np.stack(imgs_new, axis=0)
337 |     return imgs_new
338 | 
339 | 
340 | if __name__ == '__main__':
341 |     with tf.Graph().as_default():
342 |         imgs = tf.zeros((32, 299, 299, 3))  # Inception V4 expects 299x299 inputs
343 |         pts = tf.zeros((32, 16384, 3))
344 |         outputs = get_model([imgs, pts], tf.constant(True))
345 |         print(outputs)
346 | 
--------------------------------------------------------------------------------
/models/nvidia_io.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | 
4 | import tensorflow as tf
5 | import scipy
6 | 
7 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
8 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
9 | 
10 | import tf_util
11 | 
12 | 
13 | def placeholder_inputs(batch_size, img_rows=66, img_cols=200, separately=False):
14 |     imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
15 |     if separately:
16 |         speeds_pl = tf.placeholder(tf.float32, shape=(batch_size,))
17 |         angles_pl = tf.placeholder(tf.float32, shape=(batch_size,))
18 |         labels_pl = [speeds_pl, angles_pl]
19 |     else: labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))  # joint Bx2 (speed, angle) labels
20 |     return imgs_pl, labels_pl
21 | 
22 | 
23 | def get_model(net, is_training, bn_decay=None, separately=False):
24 |     """ NVIDIA regression model, input is BxHxWx3, output Bx2"""
25 |     batch_size = net.get_shape()[0].value
26 | 
27 |     for i, dim in enumerate([24, 36, 48, 64, 64]):
28 |         scope = "conv" + str(i + 1)
29 |         net = tf_util.conv2d(net, dim, [5, 5],
30 |                              padding='VALID', stride=[1, 1],
31 |                              bn=True, is_training=is_training,
32 |                              scope=scope, bn_decay=bn_decay)
33 | 
34 |     net = tf.reshape(net, [batch_size, -1])
35 |     for i, dim in enumerate([256, 100, 50, 10]):
36 |         fc_scope = "fc" + str(i + 1)
37 |         dp_scope = "dp" + str(i + 1)
38 |         net = tf_util.fully_connected(net, dim, bn=True,
39 |                                       is_training=is_training,
40 |                                       scope=fc_scope,
41 |                                       bn_decay=bn_decay)
42 |         net = tf_util.dropout(net, keep_prob=0.7,
43 |                               is_training=is_training,
44 |                               scope=dp_scope)
45 | 
46 |     net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc5')
47 | 
48 |     return net
49 | 
50 | 
51 | def get_loss(pred, label, l2_weight=0.0001):
52 |     diff = tf.square(tf.subtract(pred, label))
53 |     train_vars = tf.trainable_variables()
54 |     l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
55 |     loss = tf.reduce_mean(diff + l2_loss)
56 |     tf.summary.scalar('l2_loss', l2_loss)  # l2_loss is already scaled by l2_weight above
57 |     tf.summary.scalar('loss', loss)
58 | 
59 |     return loss
60 | 
61 | 
62 | def summary_scalar(pred, label):
63 |     thresholds = [5, 4, 3, 2, 1, 0.5]
64 |     angles = [float(t) / 180 * scipy.pi for t in thresholds]  # degree thresholds converted to radians
65 |     speeds = [float(t) / 20 for t in thresholds]  # thresholds on the 1/20-scaled speed axis
66 | 
67 |     for i in range(len(thresholds)):
68 |         scalar_angle = "angle(" + str(angles[i]) + ")"
69 |         scalar_speed = "speed(" + str(speeds[i]) + ")"
70 |         ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]
71 |         ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]
72 |         ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
73 |         ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))
74 | 
75 |         tf.summary.scalar(scalar_angle, ac_angle)
76 |         tf.summary.scalar(scalar_speed, ac_speed)
77 | 
78 | 
79 | if __name__ == '__main__':
80 |     with tf.Graph().as_default():
81 |         inputs = tf.zeros((32, 66, 200, 3))
82 |         outputs = get_model(inputs, tf.constant(True))
83 |         print(outputs)
84 | 
--------------------------------------------------------------------------------
/models/nvidia_pm.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | 
4 | import tensorflow as tf
5 | import scipy
6 | 
7 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
8 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
9 | 
10 | import tf_util
11 | 
12 | 
13 | def placeholder_inputs(batch_size, img_rows=66, img_cols=200, separately=False):
14 |     imgs_pl = tf.placeholder(tf.float32,
15 |                              shape=(batch_size, img_rows, img_cols, 3))
16 |     fmaps_pl = tf.placeholder(tf.float32,
17 |                               shape=(batch_size, img_rows, img_cols, 3))
18 |     if separately:
19 |         speeds_pl = tf.placeholder(tf.float32, shape=(batch_size,))
20 |         angles_pl = tf.placeholder(tf.float32, shape=(batch_size,))
21 |         labels_pl = [speeds_pl, angles_pl]
22 |     else: labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))  # joint Bx2 (speed, angle) labels
23 |     return imgs_pl, fmaps_pl, labels_pl
24 | 
25 | 
26 | def get_model(net, is_training, bn_decay=None, separately=False):
27 |     """ NVIDIA regression model, inputs are two BxHxWx3 tensors (image, feature map), output Bx2"""
28 |     batch_size = net[0].get_shape()[0].value
29 |     img_net, fmap_net = net
30 |     for i, dim in enumerate([24, 36, 48, 64, 64]):
31 |         scope_img = "image_conv" + str(i + 1)
32 |         scope_fmap = "fmap_conv" + str(i + 1)
33 |         img_net = tf_util.conv2d(img_net, dim, [5, 5],
34 |                                  padding='VALID', stride=[1, 1],
35 |                                  bn=True, is_training=is_training,
36 |                                  scope=scope_img, bn_decay=bn_decay)
37 |         fmap_net = tf_util.conv2d(fmap_net, dim, [5, 5],
38 |                                   padding='VALID', stride=[1, 1],
39 |                                   bn=True, is_training=is_training,
40 |                                   scope=scope_fmap, bn_decay=bn_decay)
41 |     net = tf.reshape(tf.stack([img_net, fmap_net]), [batch_size, -1])
42 |     for i, dim in enumerate([256, 100, 50, 10]):
43 |         fc_scope = "fc" + str(i + 1)
44 |         dp_scope = "dp" + str(i + 1)
45 |         net = tf_util.fully_connected(net, dim, bn=True,
46 |                                       is_training=is_training,
47 |                                       scope=fc_scope,
48 |                                       bn_decay=bn_decay)
49 |         net = tf_util.dropout(net, keep_prob=0.7,
50 |                               is_training=is_training,
51 |                               scope=dp_scope)
52 | 
53 |     net = tf_util.fully_connected(net, 2,
activation_fn=None, scope='fc5')
54 | 
55 |     return net
56 | 
57 | 
58 | def get_loss(pred, label, l2_weight=0.0001):
59 |     diff = tf.square(tf.subtract(pred, label))
60 |     train_vars = tf.trainable_variables()
61 |     l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
62 |     loss = tf.reduce_mean(diff + l2_loss)
63 |     tf.summary.scalar('l2_loss', l2_loss)  # l2_loss is already scaled by l2_weight above
64 |     tf.summary.scalar('loss', loss)
65 | 
66 |     return loss
67 | 
68 | 
69 | def summary_scalar(pred, label):
70 |     thresholds = [5, 4, 3, 2, 1, 0.5]
71 |     angles = [float(t) / 180 * scipy.pi for t in thresholds]  # degree thresholds converted to radians
72 |     speeds = [float(t) / 20 for t in thresholds]  # thresholds on the 1/20-scaled speed axis
73 | 
74 |     for i in range(len(thresholds)):
75 |         scalar_angle = "angle(" + str(angles[i]) + ")"
76 |         scalar_speed = "speed(" + str(speeds[i]) + ")"
77 |         ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]
78 |         ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]
79 |         ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
80 |         ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))
81 | 
82 |         tf.summary.scalar(scalar_angle, ac_angle)
83 |         tf.summary.scalar(scalar_speed, ac_speed)
84 | 
85 | 
86 | if __name__ == '__main__':
87 |     with tf.Graph().as_default():
88 |         inputs = [tf.zeros((32, 66, 200, 3)), tf.zeros((32, 66, 200, 3))]
89 |         outputs = get_model(inputs, tf.constant(True))
90 |         print(outputs)
91 | 
--------------------------------------------------------------------------------
/models/nvidia_pn.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | 
4 | import tensorflow as tf
5 | import scipy
6 | import numpy as np
7 | 
8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
9 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
10 | 
11 | import tf_util
12 | import pointnet
13 | 
14 | 
15 | def placeholder_inputs(batch_size, img_rows=66, img_cols=200, points=16384, separately=False):
16 |     imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
17 |     pts_pl = tf.placeholder(tf.float32, shape=(batch_size, points, 3))
18 |     if separately:
19 |         speeds_pl = tf.placeholder(tf.float32, shape=(batch_size,))
20 |         angles_pl = tf.placeholder(tf.float32, shape=(batch_size,))
21 |         labels_pl = [speeds_pl, angles_pl]
22 |     else: labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))  # joint Bx2 (speed, angle) labels
23 |     return imgs_pl, pts_pl, labels_pl
24 | 
25 | 
26 | def get_model(net, is_training, bn_decay=None, separately=False):
27 |     """ NVIDIA regression model; inputs are a BxHxWx3 image and a BxNx3 point cloud, output Bx2"""
28 |     batch_size = net[0].get_shape()[0].value
29 |     img_net, pt_net = net[0], net[1]
30 | 
31 |     for i, dim in enumerate([24, 36, 48, 64, 64]):
32 |         scope = "conv" + str(i + 1)
33 |         img_net = tf_util.conv2d(img_net, dim, [5, 5],
34 |                                  padding='VALID', stride=[1, 1],
35 |                                  bn=True, is_training=is_training,
36 |                                  scope=scope, bn_decay=bn_decay)
37 | 
38 |     img_net = tf.reshape(img_net, [batch_size, -1])
39 |     img_net = tf_util.fully_connected(img_net, 256, bn=True,
40 |                                       is_training=is_training,
41 |                                       scope='img_fc0',
42 |                                       bn_decay=bn_decay)
43 |     with tf.variable_scope('pointnet'):
44 |         pt_net = pointnet.get_model(pt_net, tf.constant(True))
45 |     net = tf.reshape(tf.stack([img_net, pt_net], axis=2), [batch_size, 512])
46 | 
47 |     for i, dim in enumerate([256, 128, 16]):
48 |         fc_scope = "fc" + str(i + 1)
49 |         dp_scope = "dp" + str(i + 1)
50 |         net = tf_util.fully_connected(net, dim, bn=True,
51 |                                       is_training=is_training,
52 |                                       scope=fc_scope,
53 |                                       bn_decay=bn_decay)
54 |         net = tf_util.dropout(net, keep_prob=0.7,
55 |
    net = tf.reshape(tf.stack([img_net, pt_net], axis=2), [batch_size, 512])

    for i, dim in enumerate([256, 128, 16]):
        fc_scope = "fc" + str(i + 1)
        dp_scope = "dp" + str(i + 1)
        net = tf_util.fully_connected(net, dim, bn=True,
                                      is_training=is_training,
                                      scope=fc_scope,
                                      bn_decay=bn_decay)
        net = tf_util.dropout(net, keep_prob=0.7,
                              is_training=is_training,
                              scope=dp_scope)

    net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc5')

    return net


def get_loss(pred, label, l2_weight=0.0001):
    diff = tf.square(tf.subtract(pred, label))
    train_vars = tf.trainable_variables()
    l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
    loss = tf.reduce_mean(diff + l2_loss)
    # l2_loss already includes the l2_weight factor
    tf.summary.scalar('l2 loss', l2_loss)
    tf.summary.scalar('loss', loss)

    return loss


def summary_scalar(pred, label):
    thresholds = [5, 4, 3, 2, 1, 0.5]
    angles = [float(t) / 180 * scipy.pi for t in thresholds]
    speeds = [float(t) / 20 for t in thresholds]

    for i in range(len(thresholds)):
        scalar_angle = "angle(" + str(angles[i]) + ")"
        scalar_speed = "speed(" + str(speeds[i]) + ")"
        # compare in the same normalized units as the labels
        # (angles in radians, speeds scaled by 1/20, cf. predict.py)
        ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]
        ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]
        ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
        ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))

        tf.summary.scalar(scalar_angle, ac_angle)
        tf.summary.scalar(scalar_speed, ac_speed)


if __name__ == '__main__':
    with tf.Graph().as_default():
        inputs = [tf.zeros((32, 66, 200, 3)), tf.zeros((32, 16384, 3))]
        outputs = get_model(inputs, tf.constant(True))
        print(outputs)

--------------------------------------------------------------------------------
/models/resnet152_io.py:
--------------------------------------------------------------------------------
import os
import sys

import tensorflow as tf
import scipy
import numpy as np

BASE_DIR = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.join(BASE_DIR, '../utils'))

import tf_util
from custom_layers import Scale
from keras.layers import Input, Dense, Convolution2D, MaxPooling2D, AveragePooling2D, ZeroPadding2D, Dropout, Flatten, add, Reshape, Activation
from keras.layers.normalization import BatchNormalization
from keras.models import Model


def placeholder_inputs(batch_size, img_rows=224, img_cols=224, separately=False):
    imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
    if separately:
        speeds_pl = tf.placeholder(tf.float32, shape=(batch_size))
        angles_pl = tf.placeholder(tf.float32, shape=(batch_size))
        labels_pl = [speeds_pl, angles_pl]
    else:
        labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))
    return imgs_pl, labels_pl


def get_resnet(img_rows=224, img_cols=224, separately=False):
    """
    ResNet-152 model for Keras

    Model schema and layer naming follow those of the original Caffe implementation
    https://github.com/KaimingHe/deep-residual-networks

    ImageNet pretrained weights
    Theano: https://drive.google.com/file/d/0Byy2AcGyEVxfZHhUT3lWVWxRN28/view?usp=sharing
    TensorFlow: https://drive.google.com/file/d/0Byy2AcGyEVxfeXExMzNNOHpEODg/view?usp=sharing

    Parameters:
        img_rows, img_cols - resolution of inputs
        channel - 1 for grayscale, 3 for color
    """
    eps = 1.1e-5
    # Handle dimension ordering for different backends
    img_input = Input(shape=(img_rows, img_cols, 3), name='data')

    x = ZeroPadding2D((3, 3), name='conv1_zeropadding')(img_input)
    x = Convolution2D(64, (7, 7), strides=(2, 2), name='conv1', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name='bn_conv1')(x)
    x = Scale(axis=3, name='scale_conv1')(x)
    x = Activation('relu', name='conv1_relu')(x)
    x = MaxPooling2D((3, 3), strides=(2, 2), name='pool1')(x)
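    # Stage block counts below are 3 + 8 + 36 + 3; at 3 conv layers per
    # block, plus conv1 and the final fc layer, this gives the 152
    # weighted layers of ResNet-152.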
    x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='b')
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='c')

    x = conv_block(x, 3, [128, 128, 512], stage=3, block='a')
    for i in range(1, 8):
        x = identity_block(x, 3, [128, 128, 512], stage=3, block='b' + str(i))

    x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a')
    for i in range(1, 36):
        x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b' + str(i))

    x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c')

    x_fc = AveragePooling2D((7, 7), name='avg_pool')(x)
    x_fc = Flatten()(x_fc)
    x_fc = Dense(1000, activation='softmax', name='fc1000')(x_fc)

    model = Model(img_input, x_fc)

    # Use pre-trained ImageNet weights (TensorFlow backend)
    weights_path = 'utils/weights/resnet152_weights_tf.h5'
    assert (os.path.exists(weights_path))
    model.load_weights(weights_path, by_name=True)

    # Truncate and replace the softmax layer for transfer learning.
    # Cannot use model.layers.pop() since the model is not of Sequential() type.
    # The method below works since pre-trained weights are stored in layers,
    # not in the model.
    x_newfc = AveragePooling2D((7, 7), name='avg_pool')(x)
    x_newfc = Flatten()(x_newfc)
    x_newfc = Dense(256, name='fc8')(x_newfc)
    model = Model(img_input, x_newfc)

    return model


def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False):
    """ ResNet-152 regression model, input is BxHxWx3, output Bx2 """
    net = get_resnet(224, 224)(net)

    if not add_lstm:
        net = tf_util.fully_connected(net, 2, activation_fn=None,
                                      scope='fc_final')
    else:
        net = tf_util.fully_connected(net, 784, bn=True,
                                      is_training=is_training,
                                      scope='fc_lstm',
                                      bn_decay=bn_decay)
        net = tf_util.dropout(net, keep_prob=0.7,
                              is_training=is_training,
                              scope="dp1")
        net = cnn_lstm_block(net)

    return net


def cnn_lstm_block(input_tensor):
    lstm_in = tf.reshape(input_tensor, [-1, 28, 28])
    lstm_out = tf_util.stacked_lstm(lstm_in,
                                    num_outputs=10,
                                    time_steps=28,
                                    scope="cnn_lstm")

    W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1))
    b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1))
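    # tf.atan bounds each output to (-pi/2, pi/2); the factor of 2 widens
    # the range to (-pi, pi).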
    return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2)


def identity_block(input_tensor, kernel_size, filters, stage, block):
    '''The identity_block is the block that has no conv layer at shortcut
    # Arguments
        input_tensor: input tensor
        kernel_size: default 3, the kernel size of the middle conv layer in the main path
        filters: list of integers, the nb_filters of the 3 conv layers in the main path
        stage: integer, current stage label, used for generating layer names
        block: 'a', 'b'..., current block label, used for generating layer names
    '''
    eps = 1.1e-5
    nb_filter1, nb_filter2, nb_filter3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    scale_name_base = 'scale' + str(stage) + block + '_branch'

    x = Convolution2D(nb_filter1, (1, 1), name=conv_name_base + '2a', use_bias=False)(input_tensor)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2a')(x)
    x = Scale(axis=3, name=scale_name_base + '2a')(x)
    x = Activation('relu', name=conv_name_base + '2a_relu')(x)

    x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x)
    x = Convolution2D(nb_filter2, (kernel_size, kernel_size),
                      name=conv_name_base + '2b', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2b')(x)
    x = Scale(axis=3, name=scale_name_base + '2b')(x)
    x = Activation('relu', name=conv_name_base + '2b_relu')(x)

    x = Convolution2D(nb_filter3, (1, 1), name=conv_name_base + '2c', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2c')(x)
    x = Scale(axis=3, name=scale_name_base + '2c')(x)

    x = add([x, input_tensor], name='res' + str(stage) + block)
    x = Activation('relu', name='res' + str(stage) + block + '_relu')(x)
    return x


def conv_block(input_tensor, kernel_size, filters, stage, block, strides=(2, 2)):
    '''conv_block is the block that has a conv layer at shortcut
    # Arguments
        input_tensor: input tensor
        kernel_size: default 3, the kernel size of the middle conv layer in the main path
        filters: list of integers, the nb_filters of the 3 conv layers in the main path
        stage: integer, current stage label, used for generating layer names
        block: 'a', 'b'..., current block label, used for generating layer names
    Note that from stage 3, the first conv layer in the main path has strides=(2, 2),
    and the shortcut has strides=(2, 2) as well.
    '''
    eps = 1.1e-5
    nb_filter1, nb_filter2, nb_filter3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    scale_name_base = 'scale' + str(stage) + block + '_branch'

    x = Convolution2D(nb_filter1, (1, 1), strides=strides,
                      name=conv_name_base + '2a', use_bias=False)(input_tensor)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2a')(x)
    x = Scale(axis=3, name=scale_name_base + '2a')(x)
    x = Activation('relu', name=conv_name_base + '2a_relu')(x)

    x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x)
    x = Convolution2D(nb_filter2, (kernel_size, kernel_size),
                      name=conv_name_base + '2b', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2b')(x)
    x = Scale(axis=3, name=scale_name_base + '2b')(x)
    x = Activation('relu', name=conv_name_base + '2b_relu')(x)

    x = Convolution2D(nb_filter3, (1, 1), name=conv_name_base + '2c', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2c')(x)
    x = Scale(axis=3, name=scale_name_base + '2c')(x)

    shortcut = Convolution2D(nb_filter3, (1, 1), strides=strides,
                             name=conv_name_base + '1', use_bias=False)(input_tensor)
    shortcut = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '1')(shortcut)
    shortcut = Scale(axis=3, name=scale_name_base + '1')(shortcut)

    x = add([x, shortcut], name='res' + str(stage) + block)
    x = Activation('relu', name='res' + str(stage) + block + '_relu')(x)
    return x
def get_loss(pred, label, l2_weight=0.0001):
    diff = tf.square(tf.subtract(pred, label))
    train_vars = tf.trainable_variables()
    l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
    loss = tf.reduce_mean(diff + l2_loss)
    # l2_loss already includes the l2_weight factor
    tf.summary.scalar('l2 loss', l2_loss)
    tf.summary.scalar('loss', loss)

    return loss


def summary_scalar(pred, label):
    thresholds = [5, 4, 3, 2, 1, 0.5]
    angles = [float(t) / 180 * scipy.pi for t in thresholds]
    speeds = [float(t) / 20 for t in thresholds]

    for i in range(len(thresholds)):
        scalar_angle = "angle(" + str(angles[i]) + ")"
        scalar_speed = "speed(" + str(speeds[i]) + ")"
        # compare in the same normalized units as the labels
        # (angles in radians, speeds scaled by 1/20, cf. predict.py)
        ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]
        ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]
        ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
        ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))

        tf.summary.scalar(scalar_angle, ac_angle)
        tf.summary.scalar(scalar_speed, ac_speed)


def resize(imgs):
    """ Resize a batch of images to the 224x224 ResNet input resolution """
    batch_size = imgs.shape[0]
    imgs_new = []
    for j in range(batch_size):
        img = imgs[j, :, :, :]
        new = scipy.misc.imresize(img, (224, 224))
        imgs_new.append(new)
    imgs_new = np.stack(imgs_new, axis=0)
    return imgs_new


if __name__ == '__main__':
    with tf.Graph().as_default():
        inputs = tf.zeros((32, 224, 224, 3))
        outputs = get_model(inputs, tf.constant(True))
        print(outputs)

--------------------------------------------------------------------------------
/models/resnet152_pm.py:
--------------------------------------------------------------------------------
import os
import sys

import tensorflow as tf
import scipy
import numpy as np

BASE_DIR = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.join(BASE_DIR, '../utils'))

import tf_util
from custom_layers import Scale
from keras.layers import Input, Dense, Convolution2D, MaxPooling2D, AveragePooling2D, ZeroPadding2D, Dropout, Flatten, add, Reshape, Activation
from keras.layers.normalization import BatchNormalization
from keras.models import Model


def placeholder_inputs(batch_size, img_rows=66, img_cols=200, separately=False):
    imgs_pl = tf.placeholder(tf.float32,
                             shape=(batch_size, img_rows, img_cols, 3))
    fmaps_pl = tf.placeholder(tf.float32,
                              shape=(batch_size, img_rows, img_cols, 3))
    if separately:
        speeds_pl = tf.placeholder(tf.float32, shape=(batch_size))
        angles_pl = tf.placeholder(tf.float32, shape=(batch_size))
        labels_pl = [speeds_pl, angles_pl]
    else:
        labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))
    return imgs_pl, fmaps_pl, labels_pl


def get_resnet(img_rows=224, img_cols=224, separately=False):
    """
    ResNet-152 model for Keras

    Model schema and layer naming follow those of the original Caffe implementation
    https://github.com/KaimingHe/deep-residual-networks

    ImageNet pretrained weights
    Theano: https://drive.google.com/file/d/0Byy2AcGyEVxfZHhUT3lWVWxRN28/view?usp=sharing
    TensorFlow: https://drive.google.com/file/d/0Byy2AcGyEVxfeXExMzNNOHpEODg/view?usp=sharing

    Parameters:
        img_rows, img_cols - resolution of inputs
        channel - 1 for grayscale, 3 for color
    """
    img_input = Input(shape=(img_rows, img_cols, 3), name='data')

    eps = 1.1e-5
    x = ZeroPadding2D((3, 3), name='conv1_zeropadding')(img_input)
    x = Convolution2D(64, (7, 7), strides=(2, 2), name='conv1', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name='bn_conv1')(x)
    x = Scale(axis=3, name='scale_conv1')(x)
    x = Activation('relu', name='conv1_relu')(x)
    x = MaxPooling2D((3, 3), strides=(2, 2), name='pool1')(x)

    x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='b')
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='c')

    x = conv_block(x, 3, [128, 128, 512], stage=3, block='a')
    for i in range(1, 8):
        x = identity_block(x, 3, [128, 128, 512], stage=3, block='b' + str(i))

    x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a')
    for i in range(1, 36):
        x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b' + str(i))

    x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c')

    x_fc = AveragePooling2D((7, 7), name='avg_pool')(x)
    x_fc = Flatten()(x_fc)
    x_fc = Dense(1000, activation='softmax', name='fc1000')(x_fc)

    model = Model(img_input, x_fc)

    # Use pre-trained weights for the TensorFlow backend
    weights_path = 'utils/weights/resnet152_weights_tf.h5'
    assert (os.path.exists(weights_path))

    model.load_weights(weights_path, by_name=True)

    # Truncate and replace the softmax layer for transfer learning.
    # Cannot use model.layers.pop() since the model is not of Sequential() type.
    # The method below works since pre-trained weights are stored in layers,
    # not in the model.
    x_newfc = AveragePooling2D((7, 7), name='avg_pool')(x)
    x_newfc = Flatten()(x_newfc)
    x_newfc = Dense(256, name='fc8')(x_newfc)

    model = Model(img_input, x_newfc)
    return model


def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False):
    """ ResNet-152 regression model, input is [BxHxWx3 images, BxHxWx3 feature maps], output Bx2 """
    batch_size = net[0].get_shape()[0].value
    img_net, fmap_net = net[0], net[1]

    img_net = get_resnet(224, 224)(img_net)
    fmap_net = get_resnet(224, 224)(fmap_net)

    net = tf.reshape(tf.stack([img_net, fmap_net]), [batch_size, -1])

    if not add_lstm:
        for i, dim in enumerate([256, 128, 16]):
            fc_scope = "fc" + str(i + 1)
            dp_scope = "dp" + str(i + 1)
            net = tf_util.fully_connected(net, dim, bn=True,
                                          is_training=is_training,
                                          scope=fc_scope,
                                          bn_decay=bn_decay)
            net = tf_util.dropout(net, keep_prob=0.7,
                                  is_training=is_training,
                                  scope=dp_scope)
        net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc4')
    else:
        fc_scope = "fc1"
        net = tf_util.fully_connected(net, 784, bn=True,
                                      is_training=is_training,
                                      scope=fc_scope,
                                      bn_decay=bn_decay)
        net = tf_util.dropout(net, keep_prob=0.7,
                              is_training=is_training,
                              scope="dp1")
        net = cnn_lstm_block(net)
    return net


def cnn_lstm_block(input_tensor):
    lstm_in = tf.reshape(input_tensor, [-1, 28, 28])
    lstm_out = tf_util.stacked_lstm(lstm_in,
                                    num_outputs=10,
                                    time_steps=28,
                                    scope="cnn_lstm")

    W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1))
    b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1))
    return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2)
def identity_block(input_tensor, kernel_size, filters, stage, block):
    '''The identity_block is the block that has no conv layer at shortcut
    # Arguments
        input_tensor: input tensor
        kernel_size: default 3, the kernel size of the middle conv layer in the main path
        filters: list of integers, the nb_filters of the 3 conv layers in the main path
        stage: integer, current stage label, used for generating layer names
        block: 'a', 'b'..., current block label, used for generating layer names
    '''
    eps = 1.1e-5
    nb_filter1, nb_filter2, nb_filter3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    scale_name_base = 'scale' + str(stage) + block + '_branch'

    x = Convolution2D(nb_filter1, (1, 1), name=conv_name_base + '2a', use_bias=False)(input_tensor)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2a')(x)
    x = Scale(axis=3, name=scale_name_base + '2a')(x)
    x = Activation('relu', name=conv_name_base + '2a_relu')(x)

    x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x)
    x = Convolution2D(nb_filter2, (kernel_size, kernel_size),
                      name=conv_name_base + '2b', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2b')(x)
    x = Scale(axis=3, name=scale_name_base + '2b')(x)
    x = Activation('relu', name=conv_name_base + '2b_relu')(x)

    x = Convolution2D(nb_filter3, (1, 1), name=conv_name_base + '2c', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2c')(x)
    x = Scale(axis=3, name=scale_name_base + '2c')(x)

    x = add([x, input_tensor], name='res' + str(stage) + block)
    x = Activation('relu', name='res' + str(stage) + block + '_relu')(x)
    return x


def conv_block(input_tensor, kernel_size, filters, stage, block, strides=(2, 2)):
    '''conv_block is the block that has a conv layer at shortcut
    # Arguments
        input_tensor: input tensor
        kernel_size: default 3, the kernel size of the middle conv layer in the main path
        filters: list of integers, the nb_filters of the 3 conv layers in the main path
        stage: integer, current stage label, used for generating layer names
        block: 'a', 'b'..., current block label, used for generating layer names
    Note that from stage 3, the first conv layer in the main path has strides=(2, 2),
    and the shortcut has strides=(2, 2) as well.
    '''
    eps = 1.1e-5
    nb_filter1, nb_filter2, nb_filter3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    scale_name_base = 'scale' + str(stage) + block + '_branch'

    x = Convolution2D(nb_filter1, (1, 1), strides=strides,
                      name=conv_name_base + '2a', use_bias=False)(input_tensor)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2a')(x)
    x = Scale(axis=3, name=scale_name_base + '2a')(x)
    x = Activation('relu', name=conv_name_base + '2a_relu')(x)

    x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x)
    x = Convolution2D(nb_filter2, (kernel_size, kernel_size),
                      name=conv_name_base + '2b', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2b')(x)
    x = Scale(axis=3, name=scale_name_base + '2b')(x)
    x = Activation('relu', name=conv_name_base + '2b_relu')(x)

    x = Convolution2D(nb_filter3, (1, 1), name=conv_name_base + '2c', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2c')(x)
    x = Scale(axis=3, name=scale_name_base + '2c')(x)

    shortcut = Convolution2D(nb_filter3, (1, 1), strides=strides,
                             name=conv_name_base + '1', use_bias=False)(input_tensor)
    shortcut = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '1')(shortcut)
    shortcut = Scale(axis=3, name=scale_name_base + '1')(shortcut)

    x = add([x, shortcut], name='res' + str(stage) + block)
    x = Activation('relu', name='res' + str(stage) + block + '_relu')(x)
    return x


def get_loss(pred, label, l2_weight=0.0001):
    diff = tf.square(tf.subtract(pred, label))
    train_vars = tf.trainable_variables()
    l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
    loss = tf.reduce_mean(diff + l2_loss)
    # l2_loss already includes the l2_weight factor
    tf.summary.scalar('l2 loss', l2_loss)
    tf.summary.scalar('loss', loss)

    return loss


def summary_scalar(pred, label):
    thresholds = [5, 4, 3, 2, 1, 0.5]
    angles = [float(t) / 180 * scipy.pi for t in thresholds]
    speeds = [float(t) / 20 for t in thresholds]

    for i in range(len(thresholds)):
        scalar_angle = "angle(" + str(angles[i]) + ")"
        scalar_speed = "speed(" + str(speeds[i]) + ")"
        # compare in the same normalized units as the labels
        # (angles in radians, speeds scaled by 1/20, cf. predict.py)
        ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]
        ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]
        ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
        ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))

        tf.summary.scalar(scalar_angle, ac_angle)
        tf.summary.scalar(scalar_speed, ac_speed)


def resize(imgs):
    """ Resize a batch of images to the 224x224 ResNet input resolution """
    batch_size = imgs.shape[0]
    imgs_new = []
    for j in range(batch_size):
        img = imgs[j, :, :, :]
        new = scipy.misc.imresize(img, (224, 224))
        imgs_new.append(new)
    imgs_new = np.stack(imgs_new, axis=0)
    return imgs_new


if __name__ == '__main__':
    with tf.Graph().as_default():
        imgs = tf.zeros((32, 224, 224, 3))
        fmaps = tf.zeros((32, 224, 224, 3))
        outputs = get_model([imgs, fmaps], tf.constant(True))
        print(outputs)

--------------------------------------------------------------------------------
/models/resnet152_pn.py:
--------------------------------------------------------------------------------
import os
import sys

import tensorflow as tf
import scipy
import numpy as np

BASE_DIR = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.join(BASE_DIR, '../utils'))

import tf_util
import pointnet
from custom_layers import Scale
from keras.layers import Input, Dense, Convolution2D, MaxPooling2D, AveragePooling2D, ZeroPadding2D, Dropout, Flatten, add, Reshape, Activation
from keras.layers.normalization import BatchNormalization
from keras.models import Model

from keras import backend as K
K.set_learning_phase(1)  # set learning phase


def placeholder_inputs(batch_size, img_rows=224, img_cols=224, points=16384, separately=False):
    imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
    pts_pl = tf.placeholder(tf.float32, shape=(batch_size, points, 3))
    if separately:
        speeds_pl = tf.placeholder(tf.float32, shape=(batch_size))
        angles_pl = tf.placeholder(tf.float32, shape=(batch_size))
        labels_pl = [speeds_pl, angles_pl]
    else:
        labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))
    return imgs_pl, pts_pl, labels_pl
def get_resnet(img_rows=224, img_cols=224, separately=False):
    """
    ResNet-152 model for Keras

    Model schema and layer naming follow those of the original Caffe implementation
    https://github.com/KaimingHe/deep-residual-networks

    ImageNet pretrained weights
    Theano: https://drive.google.com/file/d/0Byy2AcGyEVxfZHhUT3lWVWxRN28/view?usp=sharing
    TensorFlow: https://drive.google.com/file/d/0Byy2AcGyEVxfeXExMzNNOHpEODg/view?usp=sharing

    Parameters:
        img_rows, img_cols - resolution of inputs
        channel - 1 for grayscale, 3 for color
    """

    img_input = Input(shape=(img_rows, img_cols, 3), name='data')

    eps = 1.1e-5
    x = ZeroPadding2D((3, 3), name='conv1_zeropadding')(img_input)
    x = Convolution2D(64, (7, 7), strides=(2, 2), name='conv1', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name='bn_conv1')(x)
    x = Scale(axis=3, name='scale_conv1')(x)
    x = Activation('relu', name='conv1_relu')(x)
    x = MaxPooling2D((3, 3), strides=(2, 2), name='pool1')(x)

    x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='b')
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='c')

    x = conv_block(x, 3, [128, 128, 512], stage=3, block='a')
    for i in range(1, 8):
        x = identity_block(x, 3, [128, 128, 512], stage=3, block='b' + str(i))

    x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a')
    for i in range(1, 36):
        x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b' + str(i))

    x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c')

    x_fc = AveragePooling2D((7, 7), name='avg_pool')(x)
    x_fc = Flatten()(x_fc)
    x_fc = Dense(1000, activation='softmax', name='fc1000')(x_fc)

    model = Model(img_input, x_fc)

    # Use pre-trained weights for the TensorFlow backend
    weights_path = 'utils/weights/resnet152_weights_tf.h5'
    assert (os.path.exists(weights_path))

    model.load_weights(weights_path, by_name=True)

    # Truncate and replace the softmax layer for transfer learning.
    # Cannot use model.layers.pop() since the model is not of Sequential() type.
    # The method below works since pre-trained weights are stored in layers,
    # not in the model.
    x_newfc = AveragePooling2D((7, 7), name='avg_pool')(x)
    x_newfc = Flatten()(x_newfc)
    x_newfc = Dense(256, name='fc8')(x_newfc)

    model = Model(img_input, x_newfc)
    return model


def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False):
    """ ResNet-152 regression model, input is [BxHxWx3 images, BxNx3 points], output Bx2 """
    batch_size = net[0].get_shape()[0].value
    img_net, pt_net = net[0], net[1]

    img_net = get_resnet(224, 224)(img_net)
    with tf.variable_scope('pointnet'):
        pt_net = pointnet.get_model(pt_net, tf.constant(True))
    net = tf.reshape(tf.stack([img_net, pt_net], axis=2), [batch_size, -1])

    if not add_lstm:
        for i, dim in enumerate([256, 128, 16]):
            fc_scope = "fc" + str(i + 1)
            dp_scope = "dp" + str(i + 1)
            net = tf_util.fully_connected(net, dim, bn=True,
                                          is_training=is_training,
                                          scope=fc_scope,
                                          bn_decay=bn_decay)
            net = tf_util.dropout(net, keep_prob=0.7,
                                  is_training=is_training,
                                  scope=dp_scope)

        net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc4')
    else:
        fc_scope = "fc1"
        net = tf_util.fully_connected(net, 784, bn=True,
                                      is_training=is_training,
                                      scope=fc_scope,
                                      bn_decay=bn_decay)
        net = tf_util.dropout(net, keep_prob=0.7,
                              is_training=is_training,
                              scope="dp1")
        net = cnn_lstm_block(net)
    return net


def cnn_lstm_block(input_tensor):
    lstm_in = tf.reshape(input_tensor, [-1, 28, 28])
    lstm_out = tf_util.stacked_lstm(lstm_in,
                                    num_outputs=10,
                                    time_steps=28,
                                    scope="cnn_lstm")

    W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1))
    b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1))
    return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2)


def identity_block(input_tensor, kernel_size, filters, stage, block):
    '''The identity_block is the block that has no conv layer at shortcut
    # Arguments
        input_tensor: input tensor
        kernel_size: default 3, the kernel size of the middle conv layer in the main path
        filters: list of integers, the nb_filters of the 3 conv layers in the main path
        stage: integer, current stage label, used for generating layer names
        block: 'a', 'b'..., current block label, used for generating layer names
    '''
    eps = 1.1e-5
    nb_filter1, nb_filter2, nb_filter3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    scale_name_base = 'scale' + str(stage) + block + '_branch'

    x = Convolution2D(nb_filter1, (1, 1), name=conv_name_base + '2a', use_bias=False)(input_tensor)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2a')(x)
    x = Scale(axis=3, name=scale_name_base + '2a')(x)
    x = Activation('relu', name=conv_name_base + '2a_relu')(x)

    x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x)
    x = Convolution2D(nb_filter2, (kernel_size, kernel_size),
                      name=conv_name_base + '2b', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2b')(x)
    x = Scale(axis=3, name=scale_name_base + '2b')(x)
    x = Activation('relu', name=conv_name_base + '2b_relu')(x)

    x = Convolution2D(nb_filter3, (1, 1), name=conv_name_base + '2c', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2c')(x)
    x = Scale(axis=3, name=scale_name_base + '2c')(x)

    x = add([x, input_tensor], name='res' + str(stage) + block)
    x = Activation('relu', name='res' + str(stage) + block + '_relu')(x)
    return x


def conv_block(input_tensor, kernel_size, filters, stage, block, strides=(2, 2)):
    '''conv_block is the block that has a conv layer at shortcut
    # Arguments
        input_tensor: input tensor
        kernel_size: default 3, the kernel size of the middle conv layer in the main path
        filters: list of integers, the nb_filters of the 3 conv layers in the main path
        stage: integer, current stage label,
                used for generating layer names
        block: 'a', 'b'..., current block label, used for generating layer names
    Note that from stage 3, the first conv layer in the main path has strides=(2, 2),
    and the shortcut has strides=(2, 2) as well.
    '''
    eps = 1.1e-5
    nb_filter1, nb_filter2, nb_filter3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    scale_name_base = 'scale' + str(stage) + block + '_branch'

    x = Convolution2D(nb_filter1, (1, 1), strides=strides,
                      name=conv_name_base + '2a', use_bias=False)(input_tensor)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2a')(x)
    x = Scale(axis=3, name=scale_name_base + '2a')(x)
    x = Activation('relu', name=conv_name_base + '2a_relu')(x)

    x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x)
    x = Convolution2D(nb_filter2, (kernel_size, kernel_size),
                      name=conv_name_base + '2b', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2b')(x)
    x = Scale(axis=3, name=scale_name_base + '2b')(x)
    x = Activation('relu', name=conv_name_base + '2b_relu')(x)

    x = Convolution2D(nb_filter3, (1, 1), name=conv_name_base + '2c', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2c')(x)
    x = Scale(axis=3, name=scale_name_base + '2c')(x)

    shortcut = Convolution2D(nb_filter3, (1, 1), strides=strides,
                             name=conv_name_base + '1', use_bias=False)(input_tensor)
    shortcut = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '1')(shortcut)
    shortcut = Scale(axis=3, name=scale_name_base + '1')(shortcut)

    x = add([x, shortcut], name='res' + str(stage) + block)
    x = Activation('relu', name='res' + str(stage) + block + '_relu')(x)
    return x


def get_loss(pred, label, l2_weight=0.0001):
    diff = tf.square(tf.subtract(pred, label))
    train_vars = tf.trainable_variables()
    l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
    loss = tf.reduce_mean(diff + l2_loss)
    # l2_loss already includes the l2_weight factor
    tf.summary.scalar('l2 loss', l2_loss)
    tf.summary.scalar('loss', loss)

    return loss


def summary_scalar(pred, label):
    thresholds = [5, 4, 3, 2, 1, 0.5]
    angles = [float(t) / 180 * scipy.pi for t in thresholds]
    speeds = [float(t) / 20 for t in thresholds]

    for i in range(len(thresholds)):
        scalar_angle = "angle(" + str(angles[i]) + ")"
        scalar_speed = "speed(" + str(speeds[i]) + ")"
        # compare in the same normalized units as the labels
        # (angles in radians, speeds scaled by 1/20, cf. predict.py)
        ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]
        ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]
        ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
        ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))

        tf.summary.scalar(scalar_angle, ac_angle)
        tf.summary.scalar(scalar_speed, ac_speed)


def resize(imgs):
    """ Resize a batch of images to the 224x224 ResNet input resolution """
    batch_size = imgs.shape[0]
    imgs_new = []
    for j in range(batch_size):
        img = imgs[j, :, :, :]
        new = scipy.misc.imresize(img, (224, 224))
        imgs_new.append(new)
    imgs_new = np.stack(imgs_new, axis=0)
    return imgs_new


if __name__ == '__main__':
    with tf.Graph().as_default():
        imgs = tf.zeros((32, 224, 224, 3))
        pts = tf.zeros((32, 16384, 3))
        outputs = get_model([imgs, pts], tf.constant(True))
        print(outputs)

--------------------------------------------------------------------------------
/predict.py:
--------------------------------------------------------------------------------
import argparse
import importlib
import os
import sys
import time

import numpy as np
import scipy

import provider
import tensorflow as tf

import matplotlib.pyplot as plt

BASE_DIR = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.join(BASE_DIR, 'models'))

parser = argparse.ArgumentParser()
parser.add_argument('--gpu', type=int, default=0,
                    help='GPU to use [default: GPU 0]')
parser.add_argument('--model', default='nvidia_pn',
                    help='Model name [default: nvidia_pn]')
parser.add_argument('--model_path', default='logs/nvidia_pn/model.ckpt',
                    help='Model checkpoint file path [default: logs/nvidia_pn/model.ckpt]')
parser.add_argument('--max_epoch', type=int, default=250,
                    help='Epoch to run [default: 250]')
parser.add_argument('--batch_size', type=int, default=8,
                    help='Batch Size during training [default: 8]')
parser.add_argument('--result_dir', default='results',
                    help='Result folder path [default: results]')

FLAGS = parser.parse_args()
BATCH_SIZE = FLAGS.batch_size
GPU_INDEX = FLAGS.gpu
MODEL_PATH = FLAGS.model_path

assert (FLAGS.model == "nvidia_pn")
MODEL = importlib.import_module(FLAGS.model)  # import network module
MODEL_FILE = os.path.join(BASE_DIR, 'models', FLAGS.model + '.py')

RESULT_DIR = os.path.join(FLAGS.result_dir, FLAGS.model)
if not os.path.exists(RESULT_DIR):
    os.makedirs(RESULT_DIR)
LOG_FOUT = open(os.path.join(RESULT_DIR, 'log_predict.txt'), 'w')
LOG_FOUT.write(str(FLAGS) + '\n')


def log_string(out_str):
    LOG_FOUT.write(out_str + '\n')
    LOG_FOUT.flush()
    print(out_str)


def predict():
    with tf.device('/gpu:' + str(GPU_INDEX)):
        if 'pn' in MODEL_FILE:
            data_input = provider.Provider()
            imgs_pl, pts_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE)
            imgs_pl = [imgs_pl, pts_pl]
        else:
            raise NotImplementedError

        is_training_pl = tf.placeholder(tf.bool, shape=())
        print(is_training_pl)

        # Get model and loss
        pred = MODEL.get_model(imgs_pl, is_training_pl)

        loss = MODEL.get_loss(pred, labels_pl)

        # Add ops to save and restore all the variables.
        saver = tf.train.Saver()

        # Create a session
        config = tf.ConfigProto()
        config.gpu_options.allow_growth = True
        config.allow_soft_placement = True
        config.log_device_placement = True
        sess = tf.Session(config=config)

        # Restore variables from disk.
        saver.restore(sess, MODEL_PATH)
        log_string("Model restored.")

        ops = {'imgs_pl': imgs_pl,
               'labels_pl': labels_pl,
               'is_training_pl': is_training_pl,
               'pred': pred,
               'loss': loss}

        pred_one_epoch(sess, ops, data_input)


def pred_one_epoch(sess, ops, data_input):
    """ ops: dict mapping from string to tf ops """
    is_training = False
    preds = []
    num_batches = data_input.num_test // BATCH_SIZE

    for batch_idx in range(num_batches):
        if "io" in MODEL_FILE:
            imgs = data_input.load_one_batch(BATCH_SIZE, "test")
            feed_dict = {ops['imgs_pl']: imgs,
                         ops['is_training_pl']: is_training}
        else:
            imgs, others = data_input.load_one_batch(BATCH_SIZE, "test")
            feed_dict = {ops['imgs_pl'][0]: imgs,
                         ops['imgs_pl'][1]: others,
                         ops['is_training_pl']: is_training}

        pred_val = sess.run(ops['pred'], feed_dict=feed_dict)
        preds.append(pred_val)

    preds = np.vstack(preds)
    print(preds.shape)
    # Optionally convert back to physical units (degrees, mph):
    # preds[:, 1] = preds[:, 1] * 180.0 / scipy.pi
    # preds[:, 0] = preds[:, 0] * 20 + 20

    np.savetxt(os.path.join(RESULT_DIR, "behavior_pred.txt"), preds)

    output_dir = os.path.join(RESULT_DIR, "results")
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)
    i_list = get_dicts(description="test")
    counter = 0
    for i, num in enumerate(i_list):
        np.savetxt(os.path.join(output_dir, str(i) + ".txt"),
                   preds[counter:counter + num, :])
        counter += num
    # plot_acc(preds, labels)


def get_dicts(description="val"):
    # Number of predictions belonging to each recorded sequence,
    # used to split `preds` into one result file per sequence.
    if description == "train":
        raise NotImplementedError
    elif description == "val":  # batch_size == 8
        return [120] * 4 + [111] + [120] * 4 + [109] + [120] * 9 + [89 - 87 % 8]
    elif description == "test":  # batch_size == 8
        return [120] * 9 + [116] + [120] * 4 + [106] + [120] * 4 + [114 - 114 % 8]
    else:
        raise NotImplementedError


if __name__ == "__main__":
    predict()
    # plot_acc_from_txt()

--------------------------------------------------------------------------------
/tools/README.md:
--------------------------------------------------------------------------------
## Tools to process point clouds and images

A set of tools (Python scripts) is provided to make data processing easier and more convenient. Note that they are not professional tools, so you may need to modify some lines before using them in your own cases.

If you have more efficient tools, code or other suggestions to process DBNet data, especially point clouds, don't hesitate to contact [@wangjksjtu(wangjksjtu@gmail.com)](https://github.com/wangjksjtu) or __submit pull-requests directly__.
Your contributions are highly encouraged and appreciated!

- __img_pre.py__: cropping and resizing images using python-opencv
- __las2fmap.py__: extracting feature maps from point clouds
- __pcd2las.py__: downsampling point clouds; converting point clouds from '.pcd' to '.las' format
- __video2img.py__: converting one video to continuous frames

To see HELP for these scripts:

    python <script_name>.py -h

### Requirements
- python-opencv
- numpy, pickle, scipy, __laspy__
- __CloudCompare (CC)__ (set __PATH variables__)

### CC Examples
Convert point clouds to `.las` format:

    CloudCompare.exe -SILENT -NO_TIMESTAMP -C_EXPORT_FMT LAS -O %s

Downsample point clouds to 16384 points and save in `.las` format:

    CloudCompare.exe -SILENT -NO_TIMESTAMP -C_EXPORT_FMT LAS -O %s -SS RANDOM 16384

More command-line usages of CloudCompare are available on the [official manual page](http://www.cloudcompare.org/doc/wiki/index.php?title=Command_line_mode).

### las2fmap Examples
Download the example point cloud from [Google Drive](https://drive.google.com/file/d/1lxl7M2MTA7afg5UItA5hCvh-Wt5bTSNJ/view?usp=sharing).

    python las2fmap.py -f example.las

To see HELP for the `las2fmap.py` script:

    python las2fmap.py -h
    # usage: las2fmap.py [-h] [-d DIR] [-f FILE]
    #
    # optional arguments:
    #   -h, --help            show this help message and exit
    #   -d DIR, --dir DIR     Directory of las files [default: '']
    #   -f FILE, --file FILE  Specify one las file you want to convert
    #                         [default: '']
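### Image Pipeline Example
As a sketch of how the image tools chain together (the file and directory names below are placeholders; the flags are the ones listed by each script's `-h`), one possible pipeline from a DVR video to network-ready 200x66 frames is:

    # extract one frame per second into a subfolder of DVR_1920x1080
    python video2img.py -i dvr.mp4 -t 1.0 -o DVR_1920x1080/run1

    # crop the 1920x1080 frames to the 1080x600 road region
    python img_pre.py --input_dir DVR_1920x1080 --output_dir DVR_1080x600 --oper crop

    # resize the crops to the 200x66 network input resolution
    python img_pre.py --input_dir DVR_1080x600 --output_dir DVR_200x66 --oper resize

Note that `img_pre.py` processes the subfolders of `--input_dir` and, for resizing, parses the target resolution from the `--output_dir` suffix (e.g. `_200x66`).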
--------------------------------------------------------------------------------
/tools/img_pre.py:
--------------------------------------------------------------------------------
"""
Simple python scripts for cropping and resizing images
Author: Jingkang Wang
Date: November 2017
Dependency: python-opencv
"""

import argparse
import glob
import os

import cv2


def crop(input_dir="DVR_1920x1080",
         output_dir="DVR_1080x600"):
    """
    Crop images in folders
    :param input_dir: path of input directory
    :param output_dir: path of output directory
    """
    assert os.path.exists(input_dir)
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)
    subfolders = glob.glob(os.path.join(input_dir, "*"))

    for folder in subfolders:
        new_subfolder = os.path.join(output_dir, folder[folder.rfind("/") + 1:])
        # print(new_subfolder)
        if not os.path.exists(new_subfolder):
            os.mkdir(new_subfolder)
        files = glob.glob(os.path.join(folder, "*.jpg"))
        # print(files)
        for filename in files:
            out_filename = os.path.join(output_dir, filename[filename.find("/") + 1:])
            print(filename, out_filename)
            crop_img(filename, out_filename)


def crop_img(input_img, output_img,
             left=500, right=1580, down=200, up=800):
    """
    Crop a single image
    :param input_img: path of input image
    :param output_img: path of cropped image
    :param left, right, down, up: crop boundaries
        (the defaults cut a 1080x600 region out of 1920x1080 frames)
    """
    img = cv2.imread(input_img)
    cropped = img[down:up, left:right]
    cv2.imwrite(output_img, cropped)


def resize(input_dir="DVR_1080x600",
           output_dir="DVR_200x66"):
    """
    Resize images in folders
    (the target resolution is parsed from the output directory suffix)
    :param input_dir: path of input directory
    :param output_dir: path of output directory
    """
    width = int(output_dir.split("_")[-1].split("x")[0])
    height = int(output_dir.split("_")[-1].split("x")[-1])
    print(width, height)
    assert os.path.exists(input_dir)
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)
    subfolders = glob.glob(os.path.join(input_dir, "*"))
    for folder in subfolders:
        new_subfolder = os.path.join(output_dir, folder[folder.rfind("/") + 1:])
        # print(new_subfolder)
        if not os.path.exists(new_subfolder):
            os.mkdir(new_subfolder)
        files = glob.glob(os.path.join(folder, "*.jpg"))
        # print(files)
        for filename in files:
            out_filename = os.path.join(output_dir, filename[filename.find("/") + 1:])
            print(filename, out_filename)
            resize_img(filename, out_filename, width, height)


def resize_img(input_img, output_img, newx, newy):
    """
    Resize a single image
    :param input_img: path of input image
    :param output_img: path of resized image
    :param newx, newy: width and height of the resized image
    """
    img = cv2.imread(input_img)
    newimage = cv2.resize(img, (newx, newy))
    cv2.imwrite(output_img, newimage)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('--input_dir', type=str, default="DVR_1920x1080",
                        help='Path of input directory [default: DVR_1920x1080]')
    parser.add_argument('--output_dir', type=str, default="DVR_1080x600",
                        help='Path of output directory [default: DVR_1080x600]')
    parser.add_argument('--oper', type=str, default="crop",
                        help='Operation to conduct (crop/resize) [default: crop]')
    FLAGS = parser.parse_args()

    INPUT_DIR = FLAGS.input_dir
    OUTPUT_DIR = FLAGS.output_dir
    OPER = FLAGS.oper

    assert (os.path.exists(INPUT_DIR))
    if (OPER == "crop"):
        crop(INPUT_DIR, OUTPUT_DIR)
    elif (OPER == "resize"):
        resize(INPUT_DIR, OUTPUT_DIR)
    else:
        raise NotImplementedError

--------------------------------------------------------------------------------
/tools/las2fmap.py:
--------------------------------------------------------------------------------
"""
Simple python scripts for extracting feature maps from point clouds
Author: Jingkang Wang
Date: November 2017
Dependency: python-opencv, numpy, pickle, scipy, laspy
"""

import argparse
import glob
import math
import os
import pickle

import numpy as np
from scipy.misc import imsave, imshow

import cv2
from laspy.base import Writer
from laspy.file import File


def lasReader(filename):
    """
    Read xyz points from a single las file
    :param filename: path of a single point cloud
    """
    f = File(filename, mode='r')
    x_max, x_min = np.max(f.x), np.min(f.x)
    y_max, y_min = np.max(f.y), np.min(f.y)
    z_max, z_min = np.max(f.z), np.min(f.z)
    return np.transpose(np.asarray([f.x, f.y, f.z])), \
        [(x_min, x_max), (y_min, y_max), (z_min, z_max)], f.header


def transform(merge, ranges, order=[0, 1, 2]):
    """
    Swap the xyz axes
    :param merge: xyz points (3xN)
    :param ranges: ranges of each axis
    :param order: new order of the axes [default: [0, 1, 2]]
    """
    i = np.argsort(order)
    merge = merge[i, :]
    ranges = np.asarray(ranges)[i, :]
    return merge, ranges


def standardize(points, ranges=None):
    """
    Standardize points in point clouds (shift so the minima sit at zero)
    :param points: xyz points (Nx3)
    :param ranges: specified ranges to shift by [default: None]
    """
    if ranges is not None:
        points -= np.array([ranges[0][0], ranges[1][0], ranges[2][0]])
    else:
        x_min = np.min(points[:, 0])
        y_min = np.min(points[:, 1])
        z_min = np.min(points[:, 2])
        points -= np.array([x_min, y_min, z_min])
    return np.transpose(points), [(0, np.max(points[:, 0])),
                                  (0, np.max(points[:, 1])),
                                  (0, np.max(points[:, 2]))]
def rotate(img, angle=180):
    """
    Rotate images using opencv
    :param img: one image (opencv format)
    :param angle: rotation angle [default: 180]
    """
    rows, cols = img.shape[0], img.shape[1]
    rotation_matrix = cv2.getRotationMatrix2D((rows / 2, cols / 2), angle, 1)
    dst = cv2.warpAffine(img, rotation_matrix, (cols, rows))
    return dst


def rotate_about_center(src, angle, scale=1.):
    """
    Rotate an image about its center
    :param src: one image (opencv format)
    :param angle: rotation angle
    :param scale: re-scaling factor [default: 1.]
    """
    w = src.shape[1]
    h = src.shape[0]
    rangle = np.deg2rad(angle)  # angle in radians
    # now calculate new image width and height
    nw = (abs(np.sin(rangle) * h) + abs(np.cos(rangle) * w)) * scale
    nh = (abs(np.cos(rangle) * h) + abs(np.sin(rangle) * w)) * scale
    # ask opencv for the rotation matrix
    rot_mat = cv2.getRotationMatrix2D((nw * 0.5, nh * 0.5), angle, scale)
    # calculate the move from the old center to the new center combined
    # with the rotation
    rot_move = np.dot(rot_mat, np.array([(nw - w) * 0.5, (nh - h) * 0.5, 0]))
    # the move only affects the translation, so update the translation
    # part of the transform
    rot_mat[0, 2] += rot_move[0]
    rot_mat[1, 2] += rot_move[1]
    return cv2.warpAffine(src, rot_mat, (int(math.ceil(nw)), int(math.ceil(nh))),
                          flags=cv2.INTER_LANCZOS4)


def feature_map(merge, ranges, alpha=0.2, beta=0.8, GSD=0.5):
    """
    Obtain feature maps from point clouds
    :param merge: merged xyz points (3xN)
    :param ranges: focused ranges
    :param alpha, beta, GSD: hyper-parameters in the paper
    """
    (X_min, X_max) = ranges[0]
    (Y_min, Y_max) = ranges[1]
    (Z_min, Z_max) = ranges[2]

    W = int((X_max - X_min) / GSD) + 1
    H = int((Y_max - Y_min) / GSD) + 1
    print("(W, H) = (" + str(W) + ", " + str(H) + ")")
    feature_map = np.zeros((W, H))

    # bucket every point into its (x, y) grid cell
    net_dict = dict()
    for i in range(merge.shape[1]):
        if i % 1000000 == 0:
            print("processed %d points..." % i)
        point = merge[:, i]
        x = int((point[0] - X_min) / GSD)
        y = int((point[1] - Y_min) / GSD)
        try:
            net_dict[(x, y)].append(i)
        except KeyError:
            net_dict[(x, y)] = [i]

    print("mapping points...")
    # calculate the feature
    count = 0
    F_ij_min = 1000
    F_ij_max = -1000
    for i in range(W):
        for j in range(H):
            F_ij = 0

            try:
                h_min = 1000
                h_max = -1000
                for num in net_dict[(i, j)]:
                    point = merge[:, num]
                    h_max = max(point[2], h_max)
                    h_min = min(point[2], h_min)

                Z_ijs = []
                W_ijs = []
                tol = 1e-5
                for num in net_dict[(i, j)]:
                    point = merge[:, num]
                    Z_ij = point[2]
                    H_ij = Z_ij - Z_min
                    # obtain D_ij from Eqn.(5)
                    x_ij = (i + 0.5) * GSD + X_min
                    y_ij = (j + 0.5) * GSD + Y_min
                    D_ij = math.sqrt((point[0] - x_ij)**2 + (point[1] - y_ij)**2)
                    # obtain W_ij_XY and W_ij_H from Eqn.(4)
                    W_ij_XY = math.sqrt(2) * GSD / (D_ij + tol)
                    W_ij_H = H_ij * (h_min - Z_min) / (Z_max - h_max + tol)
                    # obtain W_ij from Eqn.(3)
                    W_ij = alpha * W_ij_XY + beta * W_ij_H
                    Z_ijs.append(Z_ij)
                    W_ijs.append(W_ij)
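                # Each cell value is the weight-averaged height of its points:
                #   F_ij = sum_k(W_ijs[k] * Z_ijs[k]) / sum(W_ijs),
                # so points near the cell center and with larger relative
                # height carry more weight.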
                for k in range(len(Z_ijs)):
                    # obtain feature value F_ij from Eqn.(2)
                    F_ij += W_ijs[k] * Z_ijs[k]

                F_ij /= sum(W_ijs)
                count += 1

                F_ij_min = min(F_ij, F_ij_min)
                F_ij_max = max(F_ij, F_ij_max)
            except KeyError:
                # empty cell: keep F_ij = 0
                pass

            feature_map[i][j] = F_ij

    # normalize the map to [0, 255]
    feature_map -= F_ij_min
    feature_map /= (F_ij_max - F_ij_min)
    feature_map *= 255

    return feature_map


def clean_map(fmap):
    """
    Clean the feature map by dropping mostly-empty rows and columns
    :param fmap: feature map
    """
    # fmap = fmap[~(fmap == 0).all(1)]
    fmap = fmap[(fmap != 0).sum(axis=1) >= 100, :]
    fmap = fmap[:, (fmap != 0).sum(axis=0) >= 50]

    return fmap


def resize(path, x_axis, y_axis):
    """
    Resize images
    :param path: path of an image
    :param x_axis: width of resized image
    :param y_axis: height of resized image
    """
    img = cv2.imread(path)
    new_image = cv2.resize(img, (x_axis, y_axis))
    cv2.imwrite(path, new_image)


def get_fmap(filename, dir1='gray', dir2='jet'):
    """
    Visualize feature maps
    :param filename: path of one las file
    :param dir1: path of gray images to be saved
    :param dir2: path of jet images to be saved
    """
    if not os.path.exists(dir1): os.mkdir(dir1)
    if not os.path.exists(dir2): os.mkdir(dir2)

    if not os.path.isfile(filename):
        print("[Error]: '%s' is not a valid filename" % filename)
        return False

    merge, ranges, _ = lasReader(filename)
    merge, ranges = standardize(merge, ranges)
    print("standardized point clouds")
    print("total: " + str(merge.shape[1]) + " points")

    # transform x,y,z axis: 0,2,1
    merge, ranges = transform(merge, ranges, order=[1, 2, 0])

    # clean the feature map
    fmap = clean_map(feature_map(merge, ranges=ranges, GSD=0.05))
    cv2.imwrite(os.path.join(dir1, '%s.jpg' % filename[:-4]),
                rotate_about_center(fmap, 180, 1.0))

    # uncomment the following line if you want to resize the feature map
    # resize(os.path.join(dir1, '%s.jpg' % filename[:-4]), x_axis=1080, y_axis=270)

    gray = cv2.imread(os.path.join(dir1, '%s.jpg' % filename[:-4]))
    gray_single = gray[:, :, 0]
    imC = cv2.applyColorMap(gray_single, cv2.COLORMAP_JET)
    cv2.imwrite(os.path.join(dir1, '%s_jet_tmp.jpg' % filename[:-4]), imC)

    img = cv2.imread(os.path.join(dir1, '%s_jet_tmp.jpg' % filename[:-4]))
    cv2.imwrite(os.path.join(dir2, '%s_jet.jpg' % filename[:-4]), img)
    os.system("rm %s_jet_tmp.jpg" % os.path.join(dir1, filename[:-4]))

    return True


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('-d', '--dir', default='',
                        help='Directory of las files [default: \'\']')
    parser.add_argument('-f', '--file', default='',
                        help='Specify one las file you want to convert [default: \'\']')
    FLAGS = parser.parse_args()

    d = FLAGS.dir
    f = FLAGS.file

    if f == '' and d == '':
        parser.print_help()
    elif f != '' and d != '':
        if not os.path.isdir(d):
            print("[Error]: '%s' is not a valid directory!" % d)
        else:
            p = os.path.join(d, f)
            print(p)
            if get_fmap(p):
                print("Finished!")
    elif f != '':
        p = f
        if get_fmap(p):
            print("Finished!")
    else:
        if not os.path.isdir(d):
            print("[Error]: '%s' is not a valid directory!" % d)
        else:
            files = sorted(glob.glob(os.path.join(d, "*.las")))
            count = 0
            for f in files:
                if get_fmap(f):
                    count += 1
                    if count % 25 == 0:
                        print("%d finished!" % count)


if __name__ == "__main__":
    main()

--------------------------------------------------------------------------------
/tools/pcd2las.py:
--------------------------------------------------------------------------------
"""
Simple python scripts for 1) downsampling point clouds
                          2) converting point clouds from '.pcd' to '.las' format.
Author: Jingkang Wang
Date: November 2017
Dependency: CloudCompare
"""

import argparse
import glob
import os
import time


def downsample(absolute_path):
    """
    Downsample point clouds (supported formats: las/pcd/...)
    :param absolute_path: directory of point clouds
    """
    files = glob.glob(absolute_path + "*.las")
    files.sort()
    files.sort(key=len)
    time_in = time.time()
    for f in files:
        os.system("CloudCompare.exe -SILENT \
                   -NO_TIMESTAMP -C_EXPORT_FMT LAS \
                   -O %s -SS RANDOM 16384" % f)
    print(time.time() - time_in)


def pcd2las(absolute_path):
    """
    Convert point clouds from '.pcd' to '.las' format (with downsampling)
    :param absolute_path: directory of point clouds
    """
    print(absolute_path)
    files = glob.glob(absolute_path + "*.pcd")
    files.sort()
    files.sort(key=len)
    print(files)
    time_in = time.time()
    for f in files:
        os.system("CloudCompare.exe -SILENT \
                   -NO_TIMESTAMP -C_EXPORT_FMT LAS \
                   -O %s -SS RANDOM 16384" % f)
    print(time.time() - time_in)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('input_dir', type=str,
                        help='Input directory of point clouds')
    parser.add_argument('oper', type=str, nargs='?', default="downsample",
                        help='Operation to conduct (downsample/pcd2las) [default: downsample]')
    FLAGS = parser.parse_args()
    INPUT_DIR = FLAGS.input_dir
    OPER = FLAGS.oper

    assert (os.path.exists(INPUT_DIR))
    if (OPER == "downsample"):
        downsample(INPUT_DIR)
    else:
        pcd2las(INPUT_DIR)

--------------------------------------------------------------------------------
/tools/video2img.py:
--------------------------------------------------------------------------------
"""
Simple python scripts for converting one video to continuous frames
Author: Jingkang Wang
Date: November 2017
Dependency: python-opencv
"""

import argparse
import math
import os
import sys

import cv2

parser = argparse.ArgumentParser()
parser.add_argument('-i', help='Path of video')
parser.add_argument('-t', default=1.0, help='Time interval')
parser.add_argument('-o', default='./images', help='Dir of images')
FLAGS = parser.parse_args()

videoFile = FLAGS.i
imagesFolder = FLAGS.o
t_int = FLAGS.t

if videoFile is None:
    print("[Error]: Please input path of video")
    sys.exit(0)

if not os.path.exists(videoFile):
    print("[Error]: %s is not a valid video" % videoFile)
    sys.exit(0)

if not os.path.exists(imagesFolder):
    os.makedirs(imagesFolder)

cap = cv2.VideoCapture(videoFile)
frameRate = cap.get(5)  # frame rate

count = 0
while cap.isOpened():
    frameId = cap.get(1)  # current frame index
    success, frame = cap.read()
    if not success:
        break
--------------------------------------------------------------------------------
/tools/video2img.py:
--------------------------------------------------------------------------------
1 | """
2 | Simple python scripts for converting one video to continuous frames
3 | Author: Jingkang Wang
4 | Date: November 2017
5 | Dependency: python-opencv
6 | """
7 | 
8 | import argparse
9 | import math
10 | import os
11 | import sys
12 | 
13 | import cv2
14 | 
15 | parser = argparse.ArgumentParser()
16 | parser.add_argument('-i', help='Path of video')
17 | parser.add_argument('-t', type=float, default=1.0, help='Time interval between saved frames (seconds)')
18 | parser.add_argument('-o', default='./images', help='Dir of images')
19 | FLAGS = parser.parse_args()
20 | 
21 | videoFile = FLAGS.i
22 | imagesFolder = FLAGS.o
23 | t_int = FLAGS.t
24 | 
25 | if videoFile is None:
26 |     print ("[Error]: Please input path of video")
27 |     sys.exit(0)
28 | 
29 | if not os.path.exists(videoFile):
30 |     print ("[Error]: %s is not a valid video" % videoFile)
31 |     sys.exit(0)
32 | 
33 | if not os.path.exists(imagesFolder): os.makedirs(imagesFolder)
34 | 
35 | cap = cv2.VideoCapture(videoFile)
36 | frameRate = cap.get(cv2.CAP_PROP_FPS)  # frame rate (property index 5)
37 | 
38 | count = 0
39 | while(cap.isOpened()):
40 |     frameId = cap.get(cv2.CAP_PROP_POS_FRAMES)  # current frame index (property index 1)
41 |     success, frame = cap.read()
42 |     if not success:
43 |         break
44 |     #print frameId
45 |     if (int(frameId) % math.floor(float(t_int) * frameRate) == 0):
46 |         filename = imagesFolder + "/images_" + str(int(frameId)) + ".jpg"
47 |         cv2.imwrite(filename, frame)
48 |         count += 1
49 | 
50 |     if (count % 100 == 0): print ("100 finished!")
51 | 
52 | cap.release()
53 | print ("Done!")
54 | print ("FrameRate: %f" % frameRate)
55 | print ("Total: %d" % count)
56 | 
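The sampling rule above keeps every `floor(t_int * frameRate)`-th frame, which reduces the time interval to one integer step. A small sketch (`kept_frame_ids` is illustrative only) showing which frames survive, and why the step should be clamped to at least 1 when `t_int * frameRate < 1`:

```python
import math

def kept_frame_ids(frame_rate, t_int, total_frames):
    """Frame indices the loop above writes out: every floor(t_int * frame_rate)-th."""
    step = max(int(math.floor(float(t_int) * frame_rate)), 1)  # avoid modulo-by-zero
    return [i for i in range(total_frames) if i % step == 0]

# At 30 fps with a 1-second interval, roughly one frame per second is kept:
print(kept_frame_ids(frame_rate=30.0, t_int=1.0, total_frames=91))  # [0, 30, 60, 90]
```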
--------------------------------------------------------------------------------
/train.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import importlib
3 | import os
4 | import sys
5 | import time
6 | 
7 | import numpy as np
8 | import scipy
9 | 
10 | import provider
11 | import tensorflow as tf
12 | import keras
13 | 
14 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
15 | sys.path.append(os.path.join(BASE_DIR, 'models'))
16 | 
17 | parser = argparse.ArgumentParser()
18 | parser.add_argument('--gpu', type=int, default=0,
19 |                     help='GPU to use [default: GPU 0]')
20 | parser.add_argument('--model', default='nvidia_pn',
21 |                     help='Model name [default: nvidia_pn]')
22 | parser.add_argument('--add_lstm', type=bool, default=False,
23 |                     help='Introduce LSTM mechanism in network [default: False]')
24 | parser.add_argument('--log_dir', default='logs',
25 |                     help='Log dir [default: logs]')
26 | parser.add_argument('--max_epoch', type=int, default=250,
27 |                     help='Epoch to run [default: 250]')
28 | parser.add_argument('--batch_size', type=int, default=8,
29 |                     help='Batch Size during training [default: 8]')
30 | parser.add_argument('--learning_rate', type=float, default=0.001,
31 |                     help='Learning rate during training [default: 0.001]')
32 | parser.add_argument('--momentum', type=float, default=0.9,
33 |                     help='Momentum for the momentum optimizer [default: 0.9]')
34 | parser.add_argument('--optimizer', default='adam',
35 |                     help='adam or momentum [default: adam]')
36 | parser.add_argument('--decay_step', type=int, default=200000,
37 |                     help='Decay step for lr decay [default: 200000]')
38 | parser.add_argument('--decay_rate', type=float, default=0.7,
39 |                     help='Decay rate for lr decay [default: 0.7]')
40 | FLAGS = parser.parse_args()
41 | 
42 | BATCH_SIZE = FLAGS.batch_size
43 | MAX_EPOCH = FLAGS.max_epoch
44 | LEARNING_RATE = FLAGS.learning_rate
45 | OPTIMIZER = FLAGS.optimizer
46 | BASE_LEARNING_RATE = FLAGS.learning_rate
47 | GPU_INDEX = FLAGS.gpu
48 | MOMENTUM = FLAGS.momentum
49 | DECAY_STEP = FLAGS.decay_step
50 | DECAY_RATE = FLAGS.decay_rate
51 | ADD_LSTM = FLAGS.add_lstm
52 | 
53 | BN_INIT_DECAY = 0.5
54 | BN_DECAY_DECAY_RATE = 0.5
55 | BN_DECAY_DECAY_STEP = float(DECAY_STEP)
56 | BN_DECAY_CLIP = 0.99
57 | 
58 | supported_models = ["nvidia_io", "nvidia_pn",
59 |                     "resnet152_io", "resnet152_pn",
60 |                     "inception_v4_io", "inception_v4_pn",
61 |                     "densenet169_io", "densenet169_pn"]
62 | assert (FLAGS.model in supported_models)
63 | MODEL = importlib.import_module(FLAGS.model)  # import network module
64 | MODEL_FILE = os.path.join(BASE_DIR, 'models', FLAGS.model+'.py')
65 | 
66 | LOG_DIR = os.path.join(FLAGS.log_dir, FLAGS.model)
67 | if not os.path.exists(LOG_DIR):
68 |     os.makedirs(LOG_DIR)
69 | os.system('cp %s %s' % (MODEL_FILE, LOG_DIR))  # bkp of model def
70 | os.system('cp train.py %s' % (LOG_DIR))  # bkp of train procedure
71 | LOG_FOUT = open(os.path.join(LOG_DIR, 'log_train.txt'), 'w')
72 | LOG_FOUT.write(str(FLAGS)+'\n')
73 | 
74 | 
75 | def log_string(out_str):
76 |     LOG_FOUT.write(out_str+'\n')
77 |     LOG_FOUT.flush()
78 |     print(out_str)
79 | 
80 | 
81 | def get_learning_rate(batch):
82 |     learning_rate = tf.train.exponential_decay(
83 |         BASE_LEARNING_RATE,  # Base learning rate.
84 |         batch * BATCH_SIZE,  # Current index into the dataset.
85 |         DECAY_STEP,          # Decay step.
86 |         DECAY_RATE,          # Decay rate.
87 |         staircase=True)
88 |     learning_rate = tf.maximum(learning_rate, 0.00001)  # CLIP THE LEARNING RATE!
89 |     return learning_rate
90 | 
91 | 
92 | def get_bn_decay(batch):
93 |     bn_momentum = tf.train.exponential_decay(
94 |         BN_INIT_DECAY,
95 |         batch*BATCH_SIZE,
96 |         BN_DECAY_DECAY_STEP,
97 |         BN_DECAY_DECAY_RATE,
98 |         staircase=True)
99 |     bn_decay = tf.minimum(BN_DECAY_CLIP, 1 - bn_momentum)
100 |     return bn_decay
101 | 
102 | 
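`get_learning_rate()` and `get_bn_decay()` are both staircase exponential schedules. A plain-Python rendering of the learning-rate curve under the default flags (illustrative, not repo code), showing the step-wise drops every `DECAY_STEP` examples:

```python
def decayed_lr(global_step, base_lr=0.001, batch_size=8,
               decay_step=200000, decay_rate=0.7, floor=1e-5):
    """lr = base_lr * decay_rate ** floor(step * batch_size / decay_step), clipped."""
    lr = base_lr * decay_rate ** ((global_step * batch_size) // decay_step)
    return max(lr, floor)

for step in (0, 25000, 50000, 75000):
    print(step, decayed_lr(step))  # ~0.001, ~0.0007, ~0.00049, ~0.000343
```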
103 | def train():
104 |     with tf.Graph().as_default():
105 |         with tf.device('/gpu:'+str(GPU_INDEX)):
106 |             if '_pn' in MODEL_FILE:
107 |                 data_input = provider.Provider()
108 |                 imgs_pl, pts_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE)
109 |                 imgs_pl = [imgs_pl, pts_pl]
110 |             elif '_io' in MODEL_FILE:
111 |                 data_input = provider.Provider()
112 |                 imgs_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE)
113 |             else:
114 |                 raise NotImplementedError
115 | 
116 |             is_training_pl = tf.placeholder(tf.bool, shape=())
117 |             print(is_training_pl)
118 | 
119 |             # Note the global_step=batch parameter to minimize.
120 |             # That tells the optimizer to helpfully increment the 'batch'
121 |             # parameter for you every time it trains.
122 |             batch = tf.Variable(0)
123 |             bn_decay = get_bn_decay(batch)
124 |             tf.summary.scalar('bn_decay', bn_decay)
125 | 
126 |             # Get model and loss
127 |             pred = MODEL.get_model(imgs_pl, is_training_pl,
128 |                                    bn_decay=bn_decay)
129 | 
130 |             loss = MODEL.get_loss(pred, labels_pl)
131 |             MODEL.summary_scalar(pred, labels_pl)
132 | 
133 |             # Get training operator
134 |             learning_rate = get_learning_rate(batch)
135 |             tf.summary.scalar('learning_rate', learning_rate)
136 |             if OPTIMIZER == 'momentum':
137 |                 optimizer = tf.train.MomentumOptimizer(learning_rate,
138 |                                                        momentum=MOMENTUM)
139 |             elif OPTIMIZER == 'adam':
140 |                 optimizer = tf.train.AdamOptimizer(learning_rate)
141 |             train_op = optimizer.minimize(loss, global_step=batch)
142 |             # Add ops to save and restore all the variables.
143 |             saver = tf.train.Saver()
144 | 
145 |             # Create a session
146 |             config = tf.ConfigProto()
147 |             config.gpu_options.allow_growth = True
148 |             config.allow_soft_placement = True
149 |             config.log_device_placement = False
150 |             sess = tf.Session(config=config)
151 | 
152 |             # Add summary writers
153 |             # merged = tf.merge_all_summaries()
154 |             merged = tf.summary.merge_all()
155 |             train_writer = tf.summary.FileWriter(os.path.join(LOG_DIR, 'train'),
156 |                                                  sess.graph)
157 |             test_writer = tf.summary.FileWriter(os.path.join(LOG_DIR, 'test'))
158 | 
159 |             # Init variables
160 |             init = tf.global_variables_initializer()
161 |             sess.run(init, {is_training_pl: True})
162 | 
163 |             ops = {'imgs_pl': imgs_pl,
164 |                    'labels_pl': labels_pl,
165 |                    'is_training_pl': is_training_pl,
166 |                    'pred': pred,
167 |                    'loss': loss,
168 |                    'train_op': train_op,
169 |                    'merged': merged,
170 |                    'step': batch}
171 | 
172 |             eval_acc_max = 0
173 |             for epoch in range(MAX_EPOCH):
174 |                 log_string('**** EPOCH %03d ****' % (epoch))
175 |                 sys.stdout.flush()
176 | 
177 |                 train_one_epoch(sess, ops, train_writer, data_input)
178 |                 eval_acc = eval_one_epoch(sess, ops, test_writer, data_input)
179 |                 if eval_acc > eval_acc_max:
180 |                     eval_acc_max = eval_acc
181 |                     save_path = saver.save(sess, os.path.join(LOG_DIR, "model_best.ckpt"))
182 |                     log_string("Model saved in file: %s" % save_path)
183 | 
184 |                 # Save the variables to disk.
185 |                 if epoch % 10 == 0:
186 |                     save_path = saver.save(sess, os.path.join(LOG_DIR, "model.ckpt"))
187 |                     log_string("Model saved in file: %s" % save_path)
188 | 
189 | 
190 | def train_one_epoch(sess, ops, train_writer, data_input):
191 |     """ ops: dict mapping from string to tf ops """
192 |     is_training = True
193 |     num_batches = data_input.num_train // BATCH_SIZE
194 |     loss_sum = 0
195 |     acc_a_sum = 0
196 |     acc_s_sum = 0
197 |     counter = 0
198 | 
199 |     for batch_idx in range(num_batches):
200 |         if "_io" in MODEL_FILE:
201 |             imgs, labels = data_input.load_one_batch(BATCH_SIZE, "train", reader_type="io")
202 |             if "resnet" in MODEL_FILE or "inception" in MODEL_FILE or "densenet" in MODEL_FILE:
203 |                 imgs = MODEL.resize(imgs)
204 |             feed_dict = {ops['imgs_pl']: imgs,
205 |                          ops['labels_pl']: labels,
206 |                          ops['is_training_pl']: is_training}
207 |         else:
208 |             imgs, others, labels = data_input.load_one_batch(BATCH_SIZE, "train")
209 |             if "resnet" in MODEL_FILE or "inception" in MODEL_FILE or "densenet" in MODEL_FILE:
210 |                 imgs = MODEL.resize(imgs)
211 |             feed_dict = {ops['imgs_pl'][0]: imgs,
212 |                          ops['imgs_pl'][1]: others,
213 |                          ops['labels_pl']: labels,
214 |                          ops['is_training_pl']: is_training}
215 | 
216 |         summary, step, _, loss_val, pred_val = sess.run([ops['merged'],
217 |                                                          ops['step'],
218 |                                                          ops['train_op'],
219 |                                                          ops['loss'],
220 |                                                          ops['pred']],
221 |                                                         feed_dict=feed_dict)
222 |         train_writer.add_summary(summary, step)
223 |         loss_sum += np.mean(np.square(np.subtract(pred_val, labels)))
224 |         acc_a = np.abs(np.subtract(pred_val[:, 1], labels[:, 1])) < (5.0 / 180 * scipy.pi)
225 |         acc_a = np.mean(acc_a)
226 |         acc_a_sum += acc_a
227 |         acc_s = np.abs(np.subtract(pred_val[:, 0], labels[:, 0])) < (5.0 / 20)
228 |         acc_s = np.mean(acc_s)
229 |         acc_s_sum += acc_s
230 | 
231 |         counter += 1
232 |         if counter % 200 == 0:
233 |             log_string(str(counter) + " step:")
234 |             log_string('loss: %f' % (loss_sum / float(batch_idx + 1)))
235 |             log_string('acc (angle): %f' % (acc_a_sum / float(batch_idx + 1)))
236 |             log_string('acc (speed): %f' % (acc_s_sum / float(batch_idx + 1)))
237 | 
238 |     log_string('mean loss: %f' % (loss_sum / float(num_batches)))
239 |     log_string('accuracy (angle): %f' % (acc_a_sum / float(num_batches)))
240 |     log_string('accuracy (speed): %f' % (acc_s_sum / float(num_batches)))
241 | 
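The per-batch accuracies above count a prediction as correct when the steering-angle error is under 5 degrees and the speed error is under 5.0 / 20; the 1/20 factor appears to be the dataset's speed normalization (an inference from the threshold, not documented here), and the `[speed, angle]` column layout is inferred from the `acc_s`/`acc_a` indexing. A standalone sketch:

```python
import numpy as np

ANGLE_TOL = 5.0 / 180 * np.pi  # 5 degrees, in radians
SPEED_TOL = 5.0 / 20           # 5 units of (apparently normalized) speed

def batch_accuracy(pred, labels):
    """pred, labels: (B, 2) arrays laid out [speed, angle], as in train.py."""
    acc_angle = np.mean(np.abs(pred[:, 1] - labels[:, 1]) < ANGLE_TOL)
    acc_speed = np.mean(np.abs(pred[:, 0] - labels[:, 0]) < SPEED_TOL)
    return acc_angle, acc_speed

pred   = np.array([[0.50, 0.05], [0.80, 0.30]])
labels = np.array([[0.48, 0.00], [0.40, 0.00]])
print(batch_accuracy(pred, labels))  # (0.5, 0.5): one row passes each tolerance
```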
242 | 
243 | def eval_one_epoch(sess, ops, test_writer, data_input):
244 |     """ ops: dict mapping from string to tf ops """
245 |     is_training = False
246 | 
247 | 
248 |     num_batches = data_input.num_val // BATCH_SIZE
249 |     loss_sum = 0
250 |     acc_a_sum = 0
251 |     acc_s_sum = 0
252 | 
253 |     for batch_idx in range(num_batches):
254 |         if "_io" in MODEL_FILE:
255 |             imgs, labels = data_input.load_one_batch(BATCH_SIZE, "val", reader_type="io")
256 |             if "resnet" in MODEL_FILE or "inception" in MODEL_FILE or "densenet" in MODEL_FILE:
257 |                 imgs = MODEL.resize(imgs)
258 |             feed_dict = {ops['imgs_pl']: imgs,
259 |                          ops['labels_pl']: labels,
260 |                          ops['is_training_pl']: is_training}
261 |         else:
262 |             imgs, others, labels = data_input.load_one_batch(BATCH_SIZE, "val")
263 |             if "resnet" in MODEL_FILE or "inception" in MODEL_FILE or "densenet" in MODEL_FILE:
264 |                 imgs = MODEL.resize(imgs)
265 |             feed_dict = {ops['imgs_pl'][0]: imgs,
266 |                          ops['imgs_pl'][1]: others,
267 |                          ops['labels_pl']: labels,
268 |                          ops['is_training_pl']: is_training}
269 |         # do not run ops['train_op'] here: evaluation must not update the weights
270 |         summary, step, loss_val, pred_val = sess.run([ops['merged'],
271 |                                                       ops['step'],
272 |                                                       ops['loss'],
273 |                                                       ops['pred']],
274 |                                                      feed_dict=feed_dict)
275 |         test_writer.add_summary(summary, step)
276 |         loss_sum += np.mean(np.square(np.subtract(pred_val, labels)))
277 |         acc_a = np.abs(np.subtract(pred_val[:, 1], labels[:, 1])) < (5.0 / 180 * scipy.pi)
278 |         acc_a = np.mean(acc_a)
279 |         acc_a_sum += acc_a
280 |         acc_s = np.abs(np.subtract(pred_val[:, 0], labels[:, 0])) < (5.0 / 20)
281 |         acc_s = np.mean(acc_s)
282 |         acc_s_sum += acc_s
283 | 
284 |     log_string('eval mean loss: %f' % (loss_sum / float(num_batches)))
285 |     log_string('eval accuracy (angle): %f' % (acc_a_sum / float(num_batches)))
286 |     log_string('eval accuracy (speed): %f' % (acc_s_sum / float(num_batches)))
287 |     return acc_a_sum / float(num_batches)
288 | 
289 | 
290 | if __name__ == "__main__":
291 |     train()
292 | 
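`train()` keeps the checkpoint with the best validation accuracy as `model_best.ckpt`. A hedged restore sketch (`restore_best` is illustrative, not repo code); it assumes the caller has already rebuilt the exact same graph, since `tf.train.Saver` matches variables by name:

```python
import os
import tensorflow as tf

def restore_best(sess, log_dir):
    """Load model_best.ckpt into an already-constructed, matching graph."""
    saver = tf.train.Saver()
    saver.restore(sess, os.path.join(log_dir, "model_best.ckpt"))
    return sess
```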
--------------------------------------------------------------------------------
/train_demo.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import importlib
3 | import os
4 | import sys
5 | import time
6 | 
7 | import numpy as np
8 | import scipy
9 | 
10 | import provider
11 | import tensorflow as tf
12 | 
13 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
14 | sys.path.append(os.path.join(BASE_DIR, 'models'))
15 | 
16 | parser = argparse.ArgumentParser()
17 | parser.add_argument('--gpu', type=int, default=0,
18 |                     help='GPU to use [default: GPU 0]')
19 | parser.add_argument('--model', default='nvidia_io',
20 |                     help='Model name [default: nvidia_io]')
21 | parser.add_argument('--log_dir', default='logs',
22 |                     help='Log dir [default: logs]')
23 | parser.add_argument('--max_epoch', type=int, default=250,
24 |                     help='Epoch to run [default: 250]')
25 | parser.add_argument('--batch_size', type=int, default=8,
26 |                     help='Batch Size during training [default: 8]')
27 | parser.add_argument('--learning_rate', type=float, default=0.001,
28 |                     help='Learning rate during training [default: 0.001]')
29 | parser.add_argument('--momentum', type=float, default=0.9,
30 |                     help='Momentum for the momentum optimizer [default: 0.9]')
31 | parser.add_argument('--optimizer', default='adam',
32 |                     help='adam or momentum [default: adam]')
33 | parser.add_argument('--decay_step', type=int, default=200000,
34 |                     help='Decay step for lr decay [default: 200000]')
35 | parser.add_argument('--decay_rate', type=float, default=0.7,
36 |                     help='Decay rate for lr decay [default: 0.7]')
37 | FLAGS = parser.parse_args()
38 | 
39 | BATCH_SIZE = FLAGS.batch_size
40 | MAX_EPOCH = FLAGS.max_epoch
41 | LEARNING_RATE = FLAGS.learning_rate
42 | OPTIMIZER = FLAGS.optimizer
43 | BASE_LEARNING_RATE = FLAGS.learning_rate
44 | GPU_INDEX = FLAGS.gpu
45 | MOMENTUM = FLAGS.momentum
46 | DECAY_STEP = FLAGS.decay_step
47 | DECAY_RATE = FLAGS.decay_rate
48 | 
49 | BN_INIT_DECAY = 0.5
50 | BN_DECAY_DECAY_RATE = 0.5
51 | BN_DECAY_DECAY_STEP = float(DECAY_STEP)
52 | BN_DECAY_CLIP = 0.99
53 | 
54 | MODEL = importlib.import_module(FLAGS.model)  # import network module
55 | MODEL_FILE = os.path.join(BASE_DIR, 'models', FLAGS.model+'.py')
56 | 
57 | LOG_DIR = os.path.join(FLAGS.log_dir, FLAGS.model)
58 | if not os.path.exists(LOG_DIR):
59 |     os.makedirs(LOG_DIR)
60 | os.system('cp %s %s' % (MODEL_FILE, LOG_DIR))  # bkp of model def
61 | os.system('cp train_demo.py %s' % (LOG_DIR))  # bkp of train procedure
62 | LOG_FOUT = open(os.path.join(LOG_DIR, 'log_train.txt'), 'w')
63 | LOG_FOUT.write(str(FLAGS)+'\n')
64 | 
65 | 
66 | def log_string(out_str):
67 |     LOG_FOUT.write(out_str+'\n')
68 |     LOG_FOUT.flush()
69 |     print(out_str)
70 | 
71 | 
72 | def get_learning_rate(batch):
73 |     learning_rate = tf.train.exponential_decay(
74 |         BASE_LEARNING_RATE,  # Base learning rate.
75 |         batch * BATCH_SIZE,  # Current index into the dataset.
76 |         DECAY_STEP,          # Decay step.
77 |         DECAY_RATE,          # Decay rate.
78 |         staircase=True)
79 |     learning_rate = tf.maximum(learning_rate, 0.00001)  # CLIP THE LEARNING RATE!
80 |     return learning_rate
81 | 
82 | 
83 | def get_bn_decay(batch):
84 |     bn_momentum = tf.train.exponential_decay(
85 |         BN_INIT_DECAY,
86 |         batch*BATCH_SIZE,
87 |         BN_DECAY_DECAY_STEP,
88 |         BN_DECAY_DECAY_RATE,
89 |         staircase=True)
90 |     bn_decay = tf.minimum(BN_DECAY_CLIP, 1 - bn_momentum)
91 |     return bn_decay
92 | 
93 | 
94 | def train():
95 |     with tf.Graph().as_default():
96 |         with tf.device('/gpu:'+str(GPU_INDEX)):
97 |             if 'io' in MODEL_FILE:
98 |                 data_input = provider.DVR_Provider()
99 |                 imgs_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE)
100 |             elif 'pm' in MODEL_FILE:
101 |                 data_input = provider.DVR_FMAP_Provider()
102 |                 imgs_pl, fmaps_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE)
103 |                 imgs_pl = [imgs_pl, fmaps_pl]
104 |             else:
105 |                 data_input = provider.DVR_Points_Provider()
106 |                 imgs_pl, pts_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE)
107 |                 imgs_pl = [imgs_pl, pts_pl]
108 | 
109 |             is_training_pl = tf.placeholder(tf.bool, shape=())
110 |             print(is_training_pl)
111 | 
112 |             # Note the global_step=batch parameter to minimize.
113 |             # That tells the optimizer to helpfully increment the 'batch'
114 |             # parameter for you every time it trains.
115 |             batch = tf.Variable(0)
116 |             bn_decay = get_bn_decay(batch)
117 |             tf.summary.scalar('bn_decay', bn_decay)
118 | 
119 |             # Get model and loss
120 |             pred = MODEL.get_model(imgs_pl, is_training_pl,
121 |                                    bn_decay=bn_decay)
122 | 
123 |             loss = MODEL.get_loss(pred, labels_pl)
124 |             MODEL.summary_scalar(pred, labels_pl)
125 | 
126 |             # Get training operator
127 |             learning_rate = get_learning_rate(batch)
128 |             tf.summary.scalar('learning_rate', learning_rate)
129 |             if OPTIMIZER == 'momentum':
130 |                 optimizer = tf.train.MomentumOptimizer(learning_rate,
131 |                                                        momentum=MOMENTUM)
132 |             elif OPTIMIZER == 'adam':
133 |                 optimizer = tf.train.AdamOptimizer(learning_rate)
134 |             train_op = optimizer.minimize(loss, global_step=batch)
135 |             # Add ops to save and restore all the variables.
136 |             saver = tf.train.Saver()
137 | 
138 |             # Create a session
139 |             config = tf.ConfigProto()
140 |             config.gpu_options.allow_growth = True
141 |             config.allow_soft_placement = True
142 |             config.log_device_placement = False
143 |             sess = tf.Session(config=config)
144 | 
145 |             # Add summary writers
146 |             # merged = tf.merge_all_summaries()
147 |             merged = tf.summary.merge_all()
148 |             train_writer = tf.summary.FileWriter(os.path.join(LOG_DIR, 'train'),
149 |                                                  sess.graph)
150 |             test_writer = tf.summary.FileWriter(os.path.join(LOG_DIR, 'test'))
151 | 
152 |             # Init variables
153 |             init = tf.global_variables_initializer()
154 |             sess.run(init, {is_training_pl: True})
155 | 
156 |             ops = {'imgs_pl': imgs_pl,
157 |                    'labels_pl': labels_pl,
158 |                    'is_training_pl': is_training_pl,
159 |                    'pred': pred,
160 |                    'loss': loss,
161 |                    'train_op': train_op,
162 |                    'merged': merged,
163 |                    'step': batch}
164 | 
165 |             for epoch in range(MAX_EPOCH):
166 |                 log_string('**** EPOCH %03d ****' % (epoch))
167 |                 sys.stdout.flush()
168 | 
169 |                 train_one_epoch(sess, ops, train_writer, data_input)
170 |                 eval_one_epoch(sess, ops, test_writer, data_input)
171 | 
172 |                 # Save the variables to disk.
173 |                 if epoch % 10 == 0:
174 |                     save_path = saver.save(sess, os.path.join(LOG_DIR, "model.ckpt"))
175 |                     log_string("Model saved in file: %s" % save_path)
176 | 
177 | 
178 | def train_one_epoch(sess, ops, train_writer, data_input):
179 |     """ ops: dict mapping from string to tf ops """
180 |     is_training = True
181 |     num_batches = data_input.num_train // BATCH_SIZE
182 |     loss_sum = 0
183 |     acc_a_sum = 0
184 |     acc_s_sum = 0
185 | 
186 |     for batch_idx in range(num_batches):
187 |         if "io" in MODEL_FILE:
188 |             imgs, labels = data_input.load_one_batch(BATCH_SIZE, "train")
189 |             feed_dict = {ops['imgs_pl']: imgs,
190 |                          ops['labels_pl']: labels,
191 |                          ops['is_training_pl']: is_training}
192 |         else:
193 |             imgs, others, labels = data_input.load_one_batch(BATCH_SIZE, "train")
194 |             feed_dict = {ops['imgs_pl'][0]: imgs,
195 |                          ops['imgs_pl'][1]: others,
196 |                          ops['labels_pl']: labels,
197 |                          ops['is_training_pl']: is_training}
198 | 
199 |         summary, step, _, loss_val, pred_val = sess.run([ops['merged'],
200 |                                                          ops['step'],
201 |                                                          ops['train_op'],
202 |                                                          ops['loss'],
203 |                                                          ops['pred']],
204 |                                                         feed_dict=feed_dict)
205 |         train_writer.add_summary(summary, step)
206 |         loss_sum += np.mean(np.square(np.subtract(pred_val, labels)))
207 |         acc_a = np.abs(np.subtract(pred_val[:, 1], labels[:, 1])) < (5.0 / 180 * scipy.pi)
208 |         acc_a = np.mean(acc_a)
209 |         acc_a_sum += acc_a
210 |         acc_s = np.abs(np.subtract(pred_val[:, 0], labels[:, 0])) < (5.0 / 20)
211 |         acc_s = np.mean(acc_s)
212 |         acc_s_sum += acc_s
213 | 
214 |     log_string('mean loss: %f' % (loss_sum / float(num_batches)))
215 |     log_string('accuracy (angle): %f' % (acc_a_sum / float(num_batches)))
216 |     log_string('accuracy (speed): %f' % (acc_s_sum / float(num_batches)))
217 | 
218 | 
219 | def eval_one_epoch(sess, ops, test_writer, data_input):
220 |     """ ops: dict mapping from string to tf ops """
221 |     is_training = False
222 | 
223 | 
224 |     num_batches = data_input.num_val // BATCH_SIZE
225 |     loss_sum = 0
226 |     acc_a_sum = 0
227 |     acc_s_sum = 0
228 | 
229 |     for batch_idx in range(num_batches):
230 |         if "io" in MODEL_FILE:
231 |             imgs, labels = data_input.load_one_batch(BATCH_SIZE, "val")
232 |             feed_dict = {ops['imgs_pl']: imgs,
233 |                          ops['labels_pl']: labels,
234 |                          ops['is_training_pl']: is_training}
235 |         else:
236 |             imgs, others, labels = data_input.load_one_batch(BATCH_SIZE, "val")
237 |             feed_dict = {ops['imgs_pl'][0]: imgs,
238 |                          ops['imgs_pl'][1]: others,
239 |                          ops['labels_pl']: labels,
240 |                          ops['is_training_pl']: is_training}
241 |         # do not run ops['train_op'] here: evaluation must not update the weights
242 |         summary, step, loss_val, pred_val = sess.run([ops['merged'],
243 |                                                       ops['step'],
244 |                                                       ops['loss'],
245 |                                                       ops['pred']],
246 |                                                      feed_dict=feed_dict)
247 |         test_writer.add_summary(summary, step)
248 |         loss_sum += np.mean(np.square(np.subtract(pred_val, labels)))
249 |         acc_a = np.abs(np.subtract(pred_val[:, 1], labels[:, 1])) < (5.0 / 180 * scipy.pi)
250 |         acc_a = np.mean(acc_a)
251 |         acc_a_sum += acc_a
252 |         acc_s = np.abs(np.subtract(pred_val[:, 0], labels[:, 0])) < (5.0 / 20)
253 |         acc_s = np.mean(acc_s)
254 |         acc_s_sum += acc_s
255 | 
256 |     log_string('eval mean loss: %f' % (loss_sum / float(num_batches)))
257 |     log_string('eval accuracy (angle): %f' % (acc_a_sum / float(num_batches)))
258 |     log_string('eval accuracy (speed): %f' % (acc_s_sum / float(num_batches)))
259 | 
260 | 
261 | if __name__ == "__main__":
262 |     train()
263 | 
--------------------------------------------------------------------------------
/utils/custom_layers.py:
--------------------------------------------------------------------------------
1 | from keras.layers.core import Layer
2 | from keras.engine import InputSpec
3 | from keras import backend as K
4 | try:
5 |     from keras import initializations
6 | except ImportError:
7 |     from keras import initializers as initializations
8 | 
9 | 
10 | class Scale(Layer):
11 |     '''Learns a set of weights and biases used for scaling the input data.
12 |     The output consists simply of an element-wise multiplication of the input
13 |     and an addition of a set of constants:
14 | 
15 |         out = in * gamma + beta,
16 | 
17 |     where 'gamma' and 'beta' are the weights and biases learned.
18 | 
19 |     # Arguments
20 |         axis: integer, axis along which to normalize in mode 0. For instance,
21 |             if your input tensor has shape (samples, channels, rows, cols),
22 |             set axis to 1 to normalize per feature map (channels axis).
23 |         momentum: momentum in the computation of the
24 |             exponential average of the mean and standard deviation
25 |             of the data, for feature-wise normalization.
26 |         weights: Initialization weights.
27 |             List of 2 Numpy arrays, with shapes:
28 |             `[(input_shape,), (input_shape,)]`
29 |         beta_init: name of initialization function for shift parameter
30 |             (see [initializations](../initializations.md)), or alternatively,
31 |             Theano/TensorFlow function to use for weights initialization.
32 |             This parameter is only relevant if you don't pass a `weights` argument.
33 |         gamma_init: name of initialization function for scale parameter (see
34 |             [initializations](../initializations.md)), or alternatively,
35 |             Theano/TensorFlow function to use for weights initialization.
36 |             This parameter is only relevant if you don't pass a `weights` argument.
37 |     '''
38 |     def __init__(self, weights=None, axis=-1, momentum=0.9, beta_init='zero', gamma_init='one', **kwargs):
39 |         self.momentum = momentum
40 |         self.axis = axis
41 |         self.beta_init = initializations.get(beta_init)
42 |         self.gamma_init = initializations.get(gamma_init)
43 |         self.initial_weights = weights
44 |         super(Scale, self).__init__(**kwargs)
45 | 
46 |     def build(self, input_shape):
47 |         self.input_spec = [InputSpec(shape=input_shape)]
48 |         shape = (int(input_shape[self.axis]),)
49 | 
50 |         # Compatibility with TensorFlow >= 1.0.0
51 |         self.gamma = K.variable(self.gamma_init(shape), name='{}_gamma'.format(self.name))
52 |         self.beta = K.variable(self.beta_init(shape), name='{}_beta'.format(self.name))
53 |         #self.gamma = self.gamma_init(shape, name='{}_gamma'.format(self.name))
54 |         #self.beta = self.beta_init(shape, name='{}_beta'.format(self.name))
55 |         self.trainable_weights = [self.gamma, self.beta]
56 | 
57 |         if self.initial_weights is not None:
58 |             self.set_weights(self.initial_weights)
59 |             del self.initial_weights
60 | 
61 |     def call(self, x, mask=None):
62 |         input_shape = self.input_spec[0].shape
63 |         broadcast_shape = [1] * len(input_shape)
64 |         broadcast_shape[self.axis] = input_shape[self.axis]
65 | 
66 |         out = K.reshape(self.gamma, broadcast_shape) * x + K.reshape(self.beta, broadcast_shape)
67 |         return out
68 | 
69 |     def get_config(self):
70 |         config = {"momentum": self.momentum, "axis": self.axis}
71 |         base_config = super(Scale, self).get_config()
72 |         return dict(list(base_config.items()) + list(config.items()))
73 | 
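A minimal usage sketch for `Scale`, assuming the pre-Keras-2 API this file targets (note the `initializations` fallback at the top). With the default `'one'`/`'zero'` initializers the layer starts out as the identity mapping; the snippet below is illustrative, not repo code:

```python
import numpy as np
from keras.models import Sequential

from custom_layers import Scale  # assumes utils/ is on the import path

model = Sequential()
model.add(Scale(input_shape=(4,)))  # learns a per-feature gamma (scale) and beta (shift)

x = np.ones((2, 4), dtype='float32')
print(model.predict(x))  # == x while gamma is all ones and beta all zeros
```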
--------------------------------------------------------------------------------
/utils/helper.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | 
3 | 
4 | def str2bool(v):
5 |     if v.lower() in ('yes', 'true', 't', 'y', '1'):
6 |         return True
7 |     elif v.lower() in ('no', 'false', 'f', 'n', '0'):
8 |         return False
9 |     else:
10 |         raise argparse.ArgumentTypeError('Boolean value expected.')
--------------------------------------------------------------------------------
/utils/pointnet.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | import math
4 | import sys
5 | import os
6 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
7 | sys.path.append(BASE_DIR)
8 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
9 | import tf_util
10 | 
11 | def placeholder_inputs(batch_size, num_point):
12 |     pointclouds_pl = tf.placeholder(tf.float32, shape=(batch_size, num_point, 3))
13 |     labels_pl = tf.placeholder(tf.int32, shape=(batch_size))
14 |     return pointclouds_pl, labels_pl
15 | 
16 | 
17 | def get_model(point_cloud, is_training, bn_decay=None):
18 |     """ Global-feature PointNet: input is BxNx3, output is a Bx256 global feature """
19 |     batch_size = point_cloud.get_shape()[0].value
20 |     num_point = point_cloud.get_shape()[1].value
21 |     end_points = {}
22 |     input_image = tf.expand_dims(point_cloud, -1)
23 | 
24 |     # Point functions (MLP implemented as conv2d)
25 |     net = tf_util.conv2d(input_image, 64, [1,3],
26 |                          padding='VALID', stride=[1,1],
27 |                          bn=True, is_training=is_training,
28 |                          scope='conv1', bn_decay=bn_decay)
29 |     net = tf_util.conv2d(net, 64, [1,1],
30 |                          padding='VALID', stride=[1,1],
31 |                          bn=True, is_training=is_training,
32 |                          scope='conv2', bn_decay=bn_decay)
33 |     net = tf_util.conv2d(net, 64, [1,1],
34 |                          padding='VALID', stride=[1,1],
35 |                          bn=True, is_training=is_training,
36 |                          scope='conv3', bn_decay=bn_decay)
37 |     net = tf_util.conv2d(net, 128, [1,1],
38 |                          padding='VALID', stride=[1,1],
39 |                          bn=True, is_training=is_training,
40 |                          scope='conv4', bn_decay=bn_decay)
41 |     net = tf_util.conv2d(net, 256, [1,1],
42 |                          padding='VALID', stride=[1,1],
43 |                          bn=True, is_training=is_training,
44 |                          scope='conv5', bn_decay=bn_decay)
45 | 
46 |     # Symmetric function: max pooling
47 |     net = tf_util.max_pool2d(net, [num_point,1],
48 |                              padding='VALID', scope='maxpool')
49 | 
50 |     # Flatten the pooled global feature vector
51 |     net = tf.reshape(net, [batch_size, -1])
52 | 
53 |     return net
54 | 
55 | if __name__=='__main__':
56 |     with tf.Graph().as_default():
57 |         inputs = tf.zeros((32,100000,3))
58 |         outputs = get_model(inputs, tf.constant(True))
59 |         print(outputs)
60 | 
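The `str2bool` helper in `utils/helper.py` above exists because argparse's `type=bool` is a trap: `bool('False')` is `True`, so any non-empty string parses as true (the `--add_lstm` flag in `train.py` is exposed to exactly this). A usage sketch wiring the helper in as the argument type; the import path assumes the script runs from the repo root:

```python
import argparse

from utils.helper import str2bool  # assumes the repo root is on sys.path

parser = argparse.ArgumentParser()
parser.add_argument('--add_lstm', type=str2bool, default=False,
                    help='Introduce LSTM mechanism in network [default: False]')

print(parser.parse_args(['--add_lstm', 'no']).add_lstm)    # False
print(parser.parse_args(['--add_lstm', 'true']).add_lstm)  # True
```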
--------------------------------------------------------------------------------
/utils/weights/README.md:
--------------------------------------------------------------------------------
1 | ## ImageNet Pretrained Models
2 | 
3 | This is where the weights of models pre-trained on ImageNet should be placed. This part is modified from [this repo](https://github.com/flyyufelix/cnn_finetune).
4 | 
5 | ### Folder Structure
6 | Download pre-trained weights and organize the files as follows (in `utils/weights/`):
7 | ```
8 | ├── resnet152_weights_tf.h5
9 | ├── inception-v4_weights_tf.h5
10 | └── densenet169_weights_tf.h5
11 | ```
12 | 
13 | ### Download the Weights
14 | 
15 | Network | TensorFlow
16 | :---: | :---:
17 | Inception-V4 | [model (172 MB)](https://drive.google.com/file/d/0Byy2AcGyEVxfTmRRVmpGWDczaXM/view?usp=sharing)
18 | ResNet-152 | [model (243 MB)](https://drive.google.com/file/d/0Byy2AcGyEVxfeXExMzNNOHpEODg/view?usp=sharing)
19 | DenseNet-169 | [model (56 MB)](https://drive.google.com/open?id=0Byy2AcGyEVxfSEc5UC1ROUFJdmM)
20 | 
21 | More pre-trained weights are available [here](https://github.com/flyyufelix/cnn_finetune).
--------------------------------------------------------------------------------