├── .gitignore
├── LICENSE
├── README.md
├── data
│   ├── README.md
│   ├── dbnet-2018
│   │   └── README.md
│   └── demo
│       ├── README.md
│       └── data.csv
├── docs
│   ├── logo.jpeg
│   └── pred.jpg
├── evaluate.py
├── models
│   ├── densenet169_io.py
│   ├── densenet169_pm.py
│   ├── densenet169_pn.py
│   ├── inception_v4_io.py
│   ├── inception_v4_pm.py
│   ├── inception_v4_pn.py
│   ├── nvidia_io.py
│   ├── nvidia_pm.py
│   ├── nvidia_pn.py
│   ├── resnet152_io.py
│   ├── resnet152_pm.py
│   └── resnet152_pn.py
├── predict.py
├── provider.py
├── tools
│   ├── README.md
│   ├── img_pre.py
│   ├── las2fmap.py
│   ├── pcd2las.py
│   └── video2img.py
├── train.py
├── train_demo.py
└── utils
    ├── custom_layers.py
    ├── helper.py
    ├── pointnet.py
    ├── tf_util.py
    └── weights
        └── README.md
/.gitignore:
--------------------------------------------------------------------------------
1 | .vscode/
2 | *pyc
3 | data/dbnet-2018/train
4 | data/dbnet-2018/val
5 | data/dbnet-2018/test
6 | data/demo/DVR
7 | data/demo/fmap
8 | data/demo/points_16384
9 | logs/
10 | results/
11 | dbnet_test.py
12 | utils/weights/*.h5
13 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | Apache License
2 | Version 2.0, January 2004
3 | http://www.apache.org/licenses/
4 |
5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6 |
7 | 1. Definitions.
8 |
9 | "License" shall mean the terms and conditions for use, reproduction,
10 | and distribution as defined by Sections 1 through 9 of this document.
11 |
12 | "Licensor" shall mean the copyright owner or entity authorized by
13 | the copyright owner that is granting the License.
14 |
15 | "Legal Entity" shall mean the union of the acting entity and all
16 | other entities that control, are controlled by, or are under common
17 | control with that entity. For the purposes of this definition,
18 | "control" means (i) the power, direct or indirect, to cause the
19 | direction or management of such entity, whether by contract or
20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the
21 | outstanding shares, or (iii) beneficial ownership of such entity.
22 |
23 | "You" (or "Your") shall mean an individual or Legal Entity
24 | exercising permissions granted by this License.
25 |
26 | "Source" form shall mean the preferred form for making modifications,
27 | including but not limited to software source code, documentation
28 | source, and configuration files.
29 |
30 | "Object" form shall mean any form resulting from mechanical
31 | transformation or translation of a Source form, including but
32 | not limited to compiled object code, generated documentation,
33 | and conversions to other media types.
34 |
35 | "Work" shall mean the work of authorship, whether in Source or
36 | Object form, made available under the License, as indicated by a
37 | copyright notice that is included in or attached to the work
38 | (an example is provided in the Appendix below).
39 |
40 | "Derivative Works" shall mean any work, whether in Source or Object
41 | form, that is based on (or derived from) the Work and for which the
42 | editorial revisions, annotations, elaborations, or other modifications
43 | represent, as a whole, an original work of authorship. For the purposes
44 | of this License, Derivative Works shall not include works that remain
45 | separable from, or merely link (or bind by name) to the interfaces of,
46 | the Work and Derivative Works thereof.
47 |
48 | "Contribution" shall mean any work of authorship, including
49 | the original version of the Work and any modifications or additions
50 | to that Work or Derivative Works thereof, that is intentionally
51 | submitted to Licensor for inclusion in the Work by the copyright owner
52 | or by an individual or Legal Entity authorized to submit on behalf of
53 | the copyright owner. For the purposes of this definition, "submitted"
54 | means any form of electronic, verbal, or written communication sent
55 | to the Licensor or its representatives, including but not limited to
56 | communication on electronic mailing lists, source code control systems,
57 | and issue tracking systems that are managed by, or on behalf of, the
58 | Licensor for the purpose of discussing and improving the Work, but
59 | excluding communication that is conspicuously marked or otherwise
60 | designated in writing by the copyright owner as "Not a Contribution."
61 |
62 | "Contributor" shall mean Licensor and any individual or Legal Entity
63 | on behalf of whom a Contribution has been received by Licensor and
64 | subsequently incorporated within the Work.
65 |
66 | 2. Grant of Copyright License. Subject to the terms and conditions of
67 | this License, each Contributor hereby grants to You a perpetual,
68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
69 | copyright license to reproduce, prepare Derivative Works of,
70 | publicly display, publicly perform, sublicense, and distribute the
71 | Work and such Derivative Works in Source or Object form.
72 |
73 | 3. Grant of Patent License. Subject to the terms and conditions of
74 | this License, each Contributor hereby grants to You a perpetual,
75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
76 | (except as stated in this section) patent license to make, have made,
77 | use, offer to sell, sell, import, and otherwise transfer the Work,
78 | where such license applies only to those patent claims licensable
79 | by such Contributor that are necessarily infringed by their
80 | Contribution(s) alone or by combination of their Contribution(s)
81 | with the Work to which such Contribution(s) was submitted. If You
82 | institute patent litigation against any entity (including a
83 | cross-claim or counterclaim in a lawsuit) alleging that the Work
84 | or a Contribution incorporated within the Work constitutes direct
85 | or contributory patent infringement, then any patent licenses
86 | granted to You under this License for that Work shall terminate
87 | as of the date such litigation is filed.
88 |
89 | 4. Redistribution. You may reproduce and distribute copies of the
90 | Work or Derivative Works thereof in any medium, with or without
91 | modifications, and in Source or Object form, provided that You
92 | meet the following conditions:
93 |
94 | (a) You must give any other recipients of the Work or
95 | Derivative Works a copy of this License; and
96 |
97 | (b) You must cause any modified files to carry prominent notices
98 | stating that You changed the files; and
99 |
100 | (c) You must retain, in the Source form of any Derivative Works
101 | that You distribute, all copyright, patent, trademark, and
102 | attribution notices from the Source form of the Work,
103 | excluding those notices that do not pertain to any part of
104 | the Derivative Works; and
105 |
106 | (d) If the Work includes a "NOTICE" text file as part of its
107 | distribution, then any Derivative Works that You distribute must
108 | include a readable copy of the attribution notices contained
109 | within such NOTICE file, excluding those notices that do not
110 | pertain to any part of the Derivative Works, in at least one
111 | of the following places: within a NOTICE text file distributed
112 | as part of the Derivative Works; within the Source form or
113 | documentation, if provided along with the Derivative Works; or,
114 | within a display generated by the Derivative Works, if and
115 | wherever such third-party notices normally appear. The contents
116 | of the NOTICE file are for informational purposes only and
117 | do not modify the License. You may add Your own attribution
118 | notices within Derivative Works that You distribute, alongside
119 | or as an addendum to the NOTICE text from the Work, provided
120 | that such additional attribution notices cannot be construed
121 | as modifying the License.
122 |
123 | You may add Your own copyright statement to Your modifications and
124 | may provide additional or different license terms and conditions
125 | for use, reproduction, or distribution of Your modifications, or
126 | for any such Derivative Works as a whole, provided Your use,
127 | reproduction, and distribution of the Work otherwise complies with
128 | the conditions stated in this License.
129 |
130 | 5. Submission of Contributions. Unless You explicitly state otherwise,
131 | any Contribution intentionally submitted for inclusion in the Work
132 | by You to the Licensor shall be under the terms and conditions of
133 | this License, without any additional terms or conditions.
134 | Notwithstanding the above, nothing herein shall supersede or modify
135 | the terms of any separate license agreement you may have executed
136 | with Licensor regarding such Contributions.
137 |
138 | 6. Trademarks. This License does not grant permission to use the trade
139 | names, trademarks, service marks, or product names of the Licensor,
140 | except as required for reasonable and customary use in describing the
141 | origin of the Work and reproducing the content of the NOTICE file.
142 |
143 | 7. Disclaimer of Warranty. Unless required by applicable law or
144 | agreed to in writing, Licensor provides the Work (and each
145 | Contributor provides its Contributions) on an "AS IS" BASIS,
146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
147 | implied, including, without limitation, any warranties or conditions
148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
149 | PARTICULAR PURPOSE. You are solely responsible for determining the
150 | appropriateness of using or redistributing the Work and assume any
151 | risks associated with Your exercise of permissions under this License.
152 |
153 | 8. Limitation of Liability. In no event and under no legal theory,
154 | whether in tort (including negligence), contract, or otherwise,
155 | unless required by applicable law (such as deliberate and grossly
156 | negligent acts) or agreed to in writing, shall any Contributor be
157 | liable to You for damages, including any direct, indirect, special,
158 | incidental, or consequential damages of any character arising as a
159 | result of this License or out of the use or inability to use the
160 | Work (including but not limited to damages for loss of goodwill,
161 | work stoppage, computer failure or malfunction, or any and all
162 | other commercial damages or losses), even if such Contributor
163 | has been advised of the possibility of such damages.
164 |
165 | 9. Accepting Warranty or Additional Liability. While redistributing
166 | the Work or Derivative Works thereof, You may choose to offer,
167 | and charge a fee for, acceptance of support, warranty, indemnity,
168 | or other liability obligations and/or rights consistent with this
169 | License. However, in accepting such obligations, You may act only
170 | on Your own behalf and on Your sole responsibility, not on behalf
171 | of any other Contributor, and only if You agree to indemnify,
172 | defend, and hold each Contributor harmless for any liability
173 | incurred by, or claims asserted against, such Contributor by reason
174 | of your accepting any such warranty or additional liability.
175 |
176 | END OF TERMS AND CONDITIONS
177 |
178 | APPENDIX: How to apply the Apache License to your work.
179 |
180 | To apply the Apache License to your work, attach the following
181 | boilerplate notice, with the fields enclosed by brackets "[]"
182 | replaced with your own identifying information. (Don't include
183 | the brackets!) The text should be enclosed in the appropriate
184 | comment syntax for the file format. We also recommend that a
185 | file or class name and description of purpose be included on the
186 | same "printed page" as the copyright notice for easier
187 | identification within third-party archives.
188 |
189 | Copyright [yyyy] [name of copyright owner]
190 |
191 | Licensed under the Apache License, Version 2.0 (the "License");
192 | you may not use this file except in compliance with the License.
193 | You may obtain a copy of the License at
194 |
195 | http://www.apache.org/licenses/LICENSE-2.0
196 |
197 | Unless required by applicable law or agreed to in writing, software
198 | distributed under the License is distributed on an "AS IS" BASIS,
199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200 | See the License for the specific language governing permissions and
201 | limitations under the License.
202 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 |
2 |
3 | 
4 |
5 | [DBNet](http://www.dbehavior.net/) is a __large-scale driving behavior dataset__, which provides large-scale __high-quality point clouds__ scanned by Velodyne lasers, __high-resolution videos__ recorded by dashboard cameras and __standard drivers' behaviors__ (vehicle speed, steering angle) collected by real-time sensors.
6 |
7 | Extensive experiments demonstrate that extra depth information indeed helps networks determine driving policies. We hope DBNet will become a useful resource for the autonomous driving research community.
8 |
9 | _Created by [Yiping Chen*](https://scholar.google.com/citations?user=e9lv2fUAAAAJ&hl=en), [Jingkang Wang*](https://wangjksjtu.github.io/), [Jonathan Li](https://uwaterloo.ca/mobile-sensing/people-profiles/jonathan-li), [Cewu Lu](http://www.mvig.org/), Zhipeng Luo, HanXue and [Cheng Wang](http://chwang.xmu.edu.cn/). (*equal contribution)_
10 |
11 | The resources of our work are available: [[paper]](http://openaccess.thecvf.com/content_cvpr_2018/papers/Chen_LiDAR-Video_Driving_Dataset_CVPR_2018_paper.pdf), [[code]](https://github.com/driving-behavior/DBNet), [[video]](http://www.dbehavior.net/data/demo.mp4), [[website]](http://www.dbehavior.net/), [[challenge]](http://www.dbehavior.net/task.html), [[prepared data]](https://drive.google.com/file/d/1WxzOrhvMnHCOkh6EFGWltflyPb_UnGqo/view?usp=sharing)
12 |
13 |
18 |
19 | ## Contents
20 | 1. [Introduction](#introduction)
21 | 2. [Requirements](#requirements)
22 | 3. [Quick Start](#quick-start)
23 | 4. [Baseline](#baseline)
24 | 5. [Contributors](#contributors)
25 | 6. [Citation](#citation)
26 | 7. [License](#license)
27 |
28 | ## Introduction
29 | This work is based on our [research paper](http://openaccess.thecvf.com/content_cvpr_2018/html/Chen_LiDAR-Video_Driving_Dataset_CVPR_2018_paper.html), which appears in CVPR 2018. We propose a large-scale dataset for driving behavior learning, namely, DBNet. You can also check our [dataset webpage](http://www.dbehavior.net/) for a deeper introduction.
30 |
31 | In this repository, we release __demo code__ and __partial prepared data__ for training with only images, as well as leveraging feature maps or point clouds. The prepared data are accessible [here](https://drive.google.com/open?id=14RPdVTwBTuCTo0tFeYmL_SyN8fD0g6Hc). (__More demo models and scripts will be released soon!__)
32 |
33 | ## Requirements
34 |
35 | * **TensorFlow 1.2.0**
36 | * Python 2.7
37 | * CUDA 8.0+ (For GPU)
38 | * Python Libraries: numpy, scipy and __laspy__
39 |
40 | The code has been tested with Python 2.7, TensorFlow 1.2.0, CUDA 8.0 and cuDNN 5.1 on Ubuntu 14.04. It may work in other environments, directly or with minor modifications; pull requests and test reports are very welcome.
41 |
42 | ## Quick Start
43 | ### Training
44 | To train a model to predict vehicle speeds and steering angles:
45 |
46 | python train.py --model nvidia_pn --batch_size 16 --max_epoch 125 --gpu 0
47 |
48 | The names of the models are consistent with our [paper](http://www.dbehavior.net/publications.html).
49 | Log files and network parameters will be saved to the `logs` folder by default.
50 |
51 | To see HELP for the training script:
52 |
53 | python train.py -h
54 |
55 | You can use TensorBoard to view the network architecture and monitor training progress.
56 |
57 | tensorboard --logdir logs
58 |
59 | ### Evaluation
60 | After training, you can evaluate the performance of models using `evaluate.py`. To plot figures or calculate AUC, you need the matplotlib library installed.
61 |
62 | python evaluate.py --model_path logs/nvidia_pn/model.ckpt
63 |
64 | ### Prediction
65 | To get the predictions of test data:
66 |
67 | python predict.py
68 |
69 | The results are saved in `results/results` (every segment) and `results/behavior_pred.txt` (merged) by default.
70 | To change the storage location:
71 |
72 | python predict.py --result_dir specified_dir
73 |
74 | The result directory will be created automatically if it doesn't exist.
75 |
76 | ## Baseline
77 |
| Method | Setting | Target | Accuracy | AUC | ME | AE | AME |
| --- | --- | --- | --- | --- | --- | --- | --- |
| nvidia-pn | Videos + Laser Points | angle | 70.65% (<5) | 0.7799 | 29.46 | 4.23 | 20.88 |
| nvidia-pn | Videos + Laser Points | speed | 82.21% (<3) | 0.8701 | 18.56 | 1.80 | 9.68 |
78 |
79 | This baseline is run on __dbnet-2018 challenge data__ and only __nvidia\_pn__ is tested. To measure different architectures comprehensively, several metrics are reported, including accuracy under different thresholds, area under curve (__AUC__), max error (__ME__), mean error (__AE__) and mean of max errors (__AME__).
80 |
81 | The implementations of these metrics can be found in `evaluate.py`.
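
For reference, here is a minimal NumPy sketch of ME, AE and AME, mirroring the helper functions in `evaluate.py` (`preds` and `labels` are N x 2 arrays with column 0 = speed and column 1 = angle; `segments` holds per-segment frame counts, e.g. the output of `get_dicts` in `evaluate.py`):

```python
import numpy as np

def mean_error(preds, labels):
    # AE: mean absolute error over all frames -> (angle, speed)
    return (np.mean(np.abs(preds[:, 1] - labels[:, 1])),
            np.mean(np.abs(preds[:, 0] - labels[:, 0])))

def max_error(preds, labels):
    # ME: the single worst frame over the whole split
    return (np.max(np.abs(preds[:, 1] - labels[:, 1])),
            np.max(np.abs(preds[:, 0] - labels[:, 0])))

def mean_max_error(preds, labels, segments):
    # AME: average of the per-segment maximum errors
    a_err, s_err, cnt = 0.0, 0.0, 0
    for n in segments:
        a_err += np.max(np.abs(preds[cnt:cnt+n, 1] - labels[cnt:cnt+n, 1]))
        s_err += np.max(np.abs(preds[cnt:cnt+n, 0] - labels[cnt:cnt+n, 0]))
        cnt += n
    return a_err / len(segments), s_err / len(segments)
```

Accuracy under a threshold is the fraction of frames whose absolute error falls below it, and AUC integrates that accuracy curve over the threshold range (via `np.trapz` in `plot_acc`).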
82 |
83 | ## Contributors
84 | DBNet was developed by [MVIG](http://www.mvig.org/), Shanghai Jiao Tong University* and [SCSC](http://scsc.xmu.edu.cn/) Lab, Xiamen University* (*alphabetical order*).
85 |
86 | ## Citation
87 | If you find our work useful in your research, please consider citing:
88 |
89 | @InProceedings{DBNet2018,
90 | author = {Yiping Chen and Jingkang Wang and Jonathan Li and Cewu Lu and Zhipeng Luo and HanXue and Cheng Wang},
91 | title = {LiDAR-Video Driving Dataset: Learning Driving Policies Effectively},
92 | booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
93 | month = {June},
94 | year = {2018}
95 | }
96 |
97 | ## License
98 | Our code is released under the Apache 2.0 License. The copyright of DBNet can be checked [here](http://www.dbehavior.net/contact.html).
99 |
--------------------------------------------------------------------------------
/data/README.md:
--------------------------------------------------------------------------------
1 | ## Home Directory of DBNet Data
2 |
3 | This is where the DBNet data should be placed to match the default paths in `../provider.py`. Two kinds of prepared data are provided, located in the `dbnet-2018` and `demo` folders, respectively.
4 |
5 | ### dbnet-2018
6 | Download DBNet-2018 challenge data [here](https://drive.google.com/open?id=14RPdVTwBTuCTo0tFeYmL_SyN8fD0g6Hc) and organize the folders as follows (in `dbnet-2018/`):
7 | ```
8 | ├── train
9 | │   └── i [56 folders] (6569 in total, will be released continuously)
10 | │       ├── dvr_66x200 [<= 120 images]
11 | │       ├── dvr_1920x1080 [<= 120 images]
12 | │       ├── points_16384 [<= 120 clouds]
13 | │       └── behavior.csv [labels]
14 | ├── val
15 | │   └── j [20 folders] (2349 in total)
16 | │       ├── dvr_66x200 [<= 120 images]
17 | │       ├── dvr_1920x1080 [<= 120 images]
18 | │       ├── points_16384 [<= 120 clouds]
19 | │       └── behavior.csv [labels]
20 | └── test
21 |     └── k [20 folders] (2376 in total)
22 |         ├── dvr_66x200 [<= 120 images]
23 |         ├── dvr_1920x1080 [<= 120 images]
24 |         └── points_16384 [<= 120 clouds]
25 |
26 | ```
27 | In general, the train/val/test ratio is approximately 8:1:1, and all of the val/test data have already been released. Almost five eighths of the training data are still being pre-processed and will be __uploaded soon__.
28 |
29 | Please note that the data in the subfolders of `train/`, `val/` and `test/` are __continuous__ and __time-ordered__. The `i`th line of `behavior.csv` corresponds to `i-1.jpg` in `dvr_66x200/` and `i-1.las` in `points_16384/`. Moreover, if you don't intend to use the prepared data directly, please download and pre-process the [raw data]() with your preferred methods.
30 |
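As a minimal loading sketch of this indexing (assuming laspy 1.x, an older SciPy that still provides `scipy.misc.imread`, and comma-separated labels; the segment folder and index below are placeholders):

```python
import os
import numpy as np
import scipy.misc
from laspy.file import File

seg = "data/dbnet-2018/train/1"   # hypothetical segment folder
i = 10                            # 1-based line number in behavior.csv

# label: the i-th line of behavior.csv -> (vehicle speed, steering angle)
behavior = np.loadtxt(os.path.join(seg, "behavior.csv"), delimiter=",")
label = behavior[i - 1]

# the matching inputs share the zero-based index i-1
img = scipy.misc.imread(os.path.join(seg, "dvr_66x200", "%d.jpg" % (i - 1)))
las = File(os.path.join(seg, "points_16384", "%d.las" % (i - 1)), mode="r")
points = np.vstack([las.x, las.y, las.z]).T   # 16384 x 3 point cloud
```
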
31 | ### demo
32 | Download DBNet demo data [here](https://drive.google.com/open?id=1NjhHwV_q6EMZ6MiGhZnqxg7yRCawx79c) and organize the folders as follows (in `demo`):
33 |
34 | ```
35 | ├── data.csv
36 | ├── DVR
37 | │   └── i.jpg [3788 images]
38 | ├── fmap
39 | │   └── i.jpg [3788 feature maps]
40 | └── points_16384
41 | └── i.las [3788 point clouds]
42 | ```
43 |
--------------------------------------------------------------------------------
/data/dbnet-2018/README.md:
--------------------------------------------------------------------------------
1 | ## DBNet-2018 Challenge
2 | The DBNet-2018 challenge data are organized as follows:
3 |
4 | ```
5 | ├── train
6 | │   └── i [56 folders] (6569 in total, will be released continuously)
7 | │       ├── dvr_66x200 [<= 120 images]
8 | │       ├── points_16384 [<= 120 clouds]
9 | │       └── behavior.csv [labels]
10 | ├── val
11 | │   └── j [20 folders] (2349 in total)
12 | │       ├── dvr_66x200 [<= 120 images]
13 | │       ├── points_16384 [<= 120 clouds]
14 | │       └── behavior.csv [labels]
15 | └── test
16 |     └── k [20 folders] (2376 in total)
17 |         ├── dvr_66x200 [<= 120 images]
18 |         └── points_16384 [<= 120 clouds]
19 | ```
20 |
21 | In general, the train/val/test ratio is approximately 8:1:1, and all of the val/test data have already been released. Almost five eighths of the training data are still being pre-processed and will be __uploaded soon__.
22 |
23 | Please note that the data in the subfolders of `train/`, `val/` and `test/` are __continuous__ and __time-ordered__. The `i`th line of `behavior.csv` corresponds to `i-1.jpg` in `dvr_66x200/` and `i-1.las` in `points_16384/`. Moreover, if you don't intend to use the prepared data directly, please download and pre-process the raw data with your preferred methods.
24 |
--------------------------------------------------------------------------------
/data/demo/README.md:
--------------------------------------------------------------------------------
1 | ## Demo Data
2 |
3 | Download DBNet demo data and organize the folders as follows:
4 |
5 | ```
6 | ├── data.csv
7 | ├── DVR
8 | │   └── i.jpg [3788 images]
9 | ├── fmap
10 | │   └── i.jpg [3788 feature maps]
11 | └── points_16384
12 | └── i.las [3788 point clouds]
13 | ```
14 |
--------------------------------------------------------------------------------
/docs/logo.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/driving-behavior/DBNet/c32ad78afb9f83e354eae7eb6793dbeef7e83bac/docs/logo.jpeg
--------------------------------------------------------------------------------
/docs/pred.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/driving-behavior/DBNet/c32ad78afb9f83e354eae7eb6793dbeef7e83bac/docs/pred.jpg
--------------------------------------------------------------------------------
/evaluate.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import importlib
3 | import os
4 | import sys
5 | import time
6 |
7 | # import matplotlib.pyplot as plt  # needed only by plot_acc / plot_acc_from_txt; uncomment to plot
8 | import numpy as np
9 | import scipy
10 |
11 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
12 | sys.path.append(os.path.join(BASE_DIR, 'models'))
13 | sys.path.append(os.path.join(BASE_DIR, 'utils'))
14 |
15 | import provider
16 | import tensorflow as tf
17 | from helper import str2bool
18 |
19 |
20 | parser = argparse.ArgumentParser()
21 | parser.add_argument('--gpu', type=int, default=0,
22 | help='GPU to use [default: GPU 0]')
23 | parser.add_argument('--model', default='nvidia_pn',
24 | help='Model name [default: nvidia_pn]')
25 | parser.add_argument('--model_path', default='logs/nvidia_pn/model_best.ckpt',
26 | help='Model checkpoint file path [default: logs/nvidia_pn/model_best.ckpt]')
27 | parser.add_argument('--max_epoch', type=int, default=250,
28 | help='Epoch to run [default: 250]')
29 | parser.add_argument('--batch_size', type=int, default=8,
30 | help='Batch Size during training [default: 8]')
31 | parser.add_argument('--result_dir', default='results',
32 | help='Result folder path [default: results]')
33 | parser.add_argument('--test', type=str2bool, default=False, # only used in test server
34 | help='Get performance on test data [default: False]')
35 |
36 | FLAGS = parser.parse_args()
37 | BATCH_SIZE = FLAGS.batch_size
38 | GPU_INDEX = FLAGS.gpu
39 | MODEL_PATH = FLAGS.model_path
40 |
41 | supported_models = ["nvidia_io", "nvidia_pn",
42 | "resnet152_io", "resnet152_pn",
43 | "inception_v4_io", "inception_v4_pn",
44 | "densenet169_io", "densenet169_pn"]
45 | assert (FLAGS.model in supported_models)
46 |
47 | MODEL = importlib.import_module(FLAGS.model) # import network module
48 | MODEL_FILE = os.path.join(BASE_DIR, 'models', FLAGS.model+'.py')
49 |
50 | RESULT_DIR = os.path.join(FLAGS.result_dir, FLAGS.model)
51 | if not os.path.exists(RESULT_DIR):
52 | os.makedirs(RESULT_DIR)
53 | if FLAGS.test:
54 | TEST_RESULT_DIR = os.path.join(RESULT_DIR, "test")
55 | if not os.path.exists(TEST_RESULT_DIR):
56 | os.makedirs(TEST_RESULT_DIR)
57 | LOG_FOUT = open(os.path.join(TEST_RESULT_DIR, 'log_test.txt'), 'w')
58 | LOG_FOUT.write(str(FLAGS)+'\n')
59 | else:
60 | VAL_RESULT_DIR = os.path.join(RESULT_DIR, "val")
61 | if not os.path.exists(VAL_RESULT_DIR):
62 | os.makedirs(VAL_RESULT_DIR)
63 | LOG_FOUT = open(os.path.join(VAL_RESULT_DIR, 'log_evaluate.txt'), 'w')
64 | LOG_FOUT.write(str(FLAGS)+'\n')
65 |
66 |
67 | def log_string(out_str):
68 | LOG_FOUT.write(out_str+'\n')
69 | LOG_FOUT.flush()
70 | print(out_str)
71 |
72 | def evaluate():
73 | with tf.device('/gpu:'+str(GPU_INDEX)):
74 | if '_pn' in MODEL_FILE:
75 | data_input = provider.Provider()
76 | imgs_pl, pts_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE)
77 | imgs_pl = [imgs_pl, pts_pl]
78 | elif '_io' in MODEL_FILE:
79 | data_input = provider.Provider()
80 | imgs_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE)
81 | else:
82 | raise NotImplementedError
83 |
84 | is_training_pl = tf.placeholder(tf.bool, shape=())
85 | print(is_training_pl)
86 |
87 | # Get model and loss
88 | pred = MODEL.get_model(imgs_pl, is_training_pl)
89 |
90 | loss = MODEL.get_loss(pred, labels_pl)
91 |
92 | # Add ops to save and restore all the variables.
93 | saver = tf.train.Saver()
94 |
95 | # Create a session
96 | config = tf.ConfigProto()
97 | config.gpu_options.allow_growth = True
98 | config.allow_soft_placement = True
99 | config.log_device_placement = True
100 | sess = tf.Session(config=config)
101 |
102 | # Restore variables from disk.
103 | saver.restore(sess, MODEL_PATH)
104 | log_string("Model restored.")
105 |
106 | ops = {'imgs_pl': imgs_pl,
107 | 'labels_pl': labels_pl,
108 | 'is_training_pl': is_training_pl,
109 | 'pred': pred,
110 | 'loss': loss}
111 |
112 | eval_one_epoch(sess, ops, data_input)
113 |
114 |
115 | def eval_one_epoch(sess, ops, data_input):
116 | """ ops: dict mapping from string to tf ops """
117 | is_training = False
118 | loss_sum = 0
119 |
120 | num_batches = data_input.num_val // BATCH_SIZE
121 | acc_a_sum = [0] * 5
122 | acc_s_sum = [0] * 5
123 |
124 | preds = []
125 | labels_total = []
126 | acc_a = [0] * 5
127 | acc_s = [0] * 5
128 | for batch_idx in range(num_batches):
129 | if "_io" in MODEL_FILE:
130 | imgs, labels = data_input.load_one_batch(BATCH_SIZE, "val", reader_type="io")
131 | if "resnet" in MODEL_FILE or "inception" in MODEL_FILE or "densenet" in MODEL_FILE:
132 | imgs = MODEL.resize(imgs)
133 | feed_dict = {ops['imgs_pl']: imgs,
134 | ops['labels_pl']: labels,
135 | ops['is_training_pl']: is_training}
136 | else:
137 | imgs, others, labels = data_input.load_one_batch(BATCH_SIZE, "val")
138 | if "resnet" in MODEL_FILE or "inception" in MODEL_FILE or "densenet" in MODEL_FILE:
139 | imgs = MODEL.resize(imgs)
140 | feed_dict = {ops['imgs_pl'][0]: imgs,
141 | ops['imgs_pl'][1]: others,
142 | ops['labels_pl']: labels,
143 | ops['is_training_pl']: is_training}
144 |
145 | loss_val, pred_val = sess.run([ops['loss'], ops['pred']],
146 | feed_dict=feed_dict)
147 |
148 | preds.append(pred_val)
149 | labels_total.append(labels)
150 | loss_sum += np.mean(np.square(np.subtract(pred_val, labels)))
151 | for i in range(5):
152 | acc_a[i] = np.mean(np.abs(np.subtract(pred_val[:, 1], labels[:, 1])) < (1.0 * (i+1) / 180 * scipy.pi))
153 | acc_a_sum[i] += acc_a[i]
154 | acc_s[i] = np.mean(np.abs(np.subtract(pred_val[:, 0], labels[:, 0])) < (1.0 * (i+1) / 20))
155 | acc_s_sum[i] += acc_s[i]
156 |
157 | log_string('eval mean loss: %f' % (loss_sum / float(num_batches)))
158 | for i in range(5):
159 | log_string('eval accuracy (angle-%d): %f' % (float(i+1), (acc_a_sum[i] / float(num_batches))))
160 | log_string('eval accuracy (speed-%d): %f' % (float(i+1), (acc_s_sum[i] / float(num_batches))))
161 |
162 | preds = np.vstack(preds)
163 | labels = np.vstack(labels_total)
164 |
165 | a_error, s_error = mean_max_error(preds, labels, dicts=get_dicts())
166 | log_string('eval error (mean-max): angle:%.2f speed:%.2f' %
167 | (a_error / scipy.pi * 180, s_error * 20))
168 | a_error, s_error = max_error(preds, labels)
169 | log_string('eval error (max): angle:%.2f speed:%.2f' %
170 | (a_error / scipy.pi * 180, s_error * 20))
171 | a_error, s_error = mean_topk_error(preds, labels, 5)
172 | log_string('eval error (mean-top5): angle:%.2f speed:%.2f' %
173 | (a_error / scipy.pi * 180, s_error * 20))
174 | a_error, s_error = mean_error(preds, labels)
175 | log_string('eval error (mean): angle:%.2f speed:%.2f' %
176 | (a_error / scipy.pi * 180, s_error * 20))
177 |
178 | print (preds.shape, labels.shape)
179 | np.savetxt(os.path.join(VAL_RESULT_DIR, "preds_val.txt"), preds)
180 | np.savetxt(os.path.join(VAL_RESULT_DIR, "labels_val.txt"), labels)
181 | # plot_acc(preds, labels)
182 |
183 |
184 | def test():
185 | with tf.device('/gpu:'+str(GPU_INDEX)):
186 | if '_pn' in MODEL_FILE:
187 | data_input = provider.Provider2()
188 | imgs_pl, pts_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE)
189 | imgs_pl = [imgs_pl, pts_pl]
190 | elif '_io' in MODEL_FILE:
191 | data_input = provider.Provider2()
192 | imgs_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE)
193 | else:
194 | raise NotImplementedError
195 |
196 |
197 | is_training_pl = tf.placeholder(tf.bool, shape=())
198 | print(is_training_pl)
199 |
200 | # Get model and loss
201 | pred = MODEL.get_model(imgs_pl, is_training_pl)
202 |
203 | loss = MODEL.get_loss(pred, labels_pl)
204 |
205 | # Add ops to save and restore all the variables.
206 | saver = tf.train.Saver()
207 |
208 | # Create a session
209 | config = tf.ConfigProto()
210 | config.gpu_options.allow_growth = True
211 | config.allow_soft_placement = True
212 | config.log_device_placement = True
213 | sess = tf.Session(config=config)
214 |
215 | # Restore variables from disk.
216 | saver.restore(sess, MODEL_PATH)
217 | log_string("Model restored.")
218 |
219 | ops = {'imgs_pl': imgs_pl,
220 | 'labels_pl': labels_pl,
221 | 'is_training_pl': is_training_pl,
222 | 'pred': pred,
223 | 'loss': loss}
224 |
225 | test_one_epoch(sess, ops, data_input)
226 |
227 |
228 | def test_one_epoch(sess, ops, data_input):
229 | """ ops: dict mapping from string to tf ops """
230 | is_training = False
231 | loss_sum = 0
232 |
233 | num_batches = data_input.num_test // BATCH_SIZE
234 | acc_a_sum = [0] * 5
235 | acc_s_sum = [0] * 5
236 |
237 | preds = []
238 | labels_total = []
239 | acc_a = [0] * 5
240 | acc_s = [0] * 5
241 | for batch_idx in range(num_batches):
242 | if "_io" in MODEL_FILE:
243 | imgs, labels = data_input.load_one_batch(BATCH_SIZE, reader_type="io")
244 | if "resnet" in MODEL_FILE or "inception" in MODEL_FILE or "densenet" in MODEL_FILE:
245 | imgs = MODEL.resize(imgs)
246 | feed_dict = {ops['imgs_pl']: imgs,
247 | ops['labels_pl']: labels,
248 | ops['is_training_pl']: is_training}
249 | else:
250 | imgs, others, labels = data_input.load_one_batch(BATCH_SIZE)
251 | if "resnet" in MODEL_FILE or "inception" in MODEL_FILE or "densenet" in MODEL_FILE:
252 | imgs = MODEL.resize(imgs)
253 | feed_dict = {ops['imgs_pl'][0]: imgs,
254 | ops['imgs_pl'][1]: others,
255 | ops['labels_pl']: labels,
256 | ops['is_training_pl']: is_training}
257 |
258 | loss_val, pred_val = sess.run([ops['loss'], ops['pred']],
259 | feed_dict=feed_dict)
260 |
261 | preds.append(pred_val)
262 | labels_total.append(labels)
263 | loss_sum += np.mean(np.square(np.subtract(pred_val, labels)))
264 | for i in range(5):
265 | acc_a[i] = np.mean(np.abs(np.subtract(pred_val[:, 1], labels[:, 1])) < (1.0 * (i+1) / 180 * scipy.pi))
266 | acc_a_sum[i] += acc_a[i]
267 | acc_s[i] = np.mean(np.abs(np.subtract(pred_val[:, 0], labels[:, 0])) < (1.0 * (i+1) / 20))
268 | acc_s_sum[i] += acc_s[i]
269 |
270 | log_string('test mean loss: %f' % (loss_sum / float(num_batches)))
271 | for i in range(5):
272 | log_string('test accuracy (angle-%d): %f' % (float(i+1), (acc_a_sum[i] / float(num_batches))))
273 | log_string('test accuracy (speed-%d): %f' % (float(i+1), (acc_s_sum[i] / float(num_batches))))
274 |
275 | preds = np.vstack(preds)
276 | labels = np.vstack(labels_total)
277 |
278 | a_error, s_error = mean_max_error(preds, labels, dicts=get_dicts())
279 | log_string('test error (mean-max): angle:%.2f speed:%.2f' %
280 | (a_error / scipy.pi * 180, s_error * 20))
281 | a_error, s_error = max_error(preds, labels)
282 | log_string('test error (max): angle:%.2f speed:%.2f' %
283 | (a_error / scipy.pi * 180, s_error * 20))
284 | a_error, s_error = mean_topk_error(preds, labels, 5)
285 | log_string('test error (mean-top5): angle:%.2f speed:%.2f' %
286 | (a_error / scipy.pi * 180, s_error * 20))
287 | a_error, s_error = mean_error(preds, labels)
288 | log_string('test error (mean): angle:%.2f speed:%.2f' %
289 | (a_error / scipy.pi * 180, s_error * 20))
290 |
291 | print (preds.shape, labels.shape)
292 | np.savetxt(os.path.join(TEST_RESULT_DIR, "preds_val.txt"), preds)
293 | np.savetxt(os.path.join(TEST_RESULT_DIR, "labels_val.txt"), labels)
294 | # plot_acc(preds, labels)
295 |
296 |
297 | def plot_acc(preds, labels, counts = 100):
298 | a_list = []
299 | s_list = []
300 | for i in range(counts):
301 | acc_a = np.abs(np.subtract(preds[:, 1], labels[:, 1])) < (20.0 / 180 * scipy.pi / counts * i)
302 | a_list.append(np.mean(acc_a))
303 |
304 | for i in range(counts):
305 | acc_s = np.abs(np.subtract(preds[:, 0], labels[:, 0])) < (15.0 / 20 / counts * i)
306 | s_list.append(np.mean(acc_s))
307 |
308 | print (len(a_list), len(s_list))
309 | a_xaxis = [20.0 / counts * i for i in range(counts)]
310 | s_xaxis = [15.0 / counts * i for i in range(counts)]
311 |
312 | auc_angle = np.trapz(np.array(a_list), x=a_xaxis) / 20.0
313 | auc_speed = np.trapz(np.array(s_list), x=s_xaxis) / 15.0
314 |
315 | plt.style.use('ggplot')
316 | plt.figure()
317 | plt.plot(a_xaxis, np.array(a_list), label='Area Under Curve (AUC): %f' % auc_angle)
318 | plt.legend(loc='best')
319 | plt.xlabel("Threshold (angle)")
320 | plt.ylabel("Validation accuracy")
321 | plt.savefig(os.path.join(RESULT_DIR, "acc_angle.png"))
322 | plt.figure()
323 | plt.plot(s_xaxis, np.array(s_list), label='Area Under Curve (AUC): %f' % auc_speed)
324 | plt.xlabel("Threshold (speed)")
325 | plt.ylabel("Validation accuracy")
326 | plt.legend(loc='best')
327 |     plt.savefig(os.path.join(RESULT_DIR, 'acc_speed.png'))
328 |
329 | def plot_acc_from_txt(counts=100):
330 | preds = np.loadtxt(os.path.join(RESULT_DIR, "test/preds_val.txt"))
331 | labels = np.loadtxt(os.path.join(RESULT_DIR, "test/labels_val.txt"))
332 | print (preds.shape, labels.shape)
333 | plot_acc(preds, labels, counts)
334 |
335 | def get_dicts(description="val"):  # per-segment frame counts used by mean_max_error; the tail is trimmed to a multiple of batch_size
336 | if description == "train":
337 | raise NotImplementedError
338 | elif description == "val": # batch_size == 8
339 | return [120] * 4 + [111] + [120] * 4 + [109] + [120] * 9 + [89 - 87 % 8]
340 | elif description == "test": # batch_size == 8
341 | return [120] * 9 + [116] + [120] * 4 + [106] + [120] * 4 + [114 - 114 % 8]
342 | else:
343 | raise NotImplementedError
344 |
345 | def mean_max_error(preds, labels, dicts):
346 | cnt = 0
347 | a_error = 0
348 | s_error = 0
349 | for i in dicts:
350 | print (preds.shape, cnt, cnt + i)
351 | a_error += np.max(np.abs(preds[cnt:cnt+i, 1] - labels[cnt:cnt+i, 1]))
352 | s_error += np.max(np.abs(preds[cnt:cnt+i, 0] - labels[cnt:cnt+i, 0]))
353 | cnt += i
354 | return a_error / float(len(dicts)), s_error / float(len(dicts))
355 |
356 | def max_error(preds, labels):
357 | return np.max(np.abs(preds[:,1] - labels[:,1])), np.max(np.abs(preds[:, 0] - labels[:, 0]))
358 |
359 | def mean_error(preds, labels):
360 | return np.mean(np.abs(preds[:,1] - labels[:,1])), np.mean(np.abs(preds[:,0] - labels[:,0]))
361 |
362 | def mean_topk_error(preds, labels, k):
363 | a_error = np.abs(preds[:,1] - labels[:,1])
364 | s_error = np.abs(preds[:,0] - labels[:,0])
365 | return np.mean(np.sort(a_error)[::-1][0:k]), np.mean(np.sort(s_error)[::-1][0:k])
366 |
367 | if __name__ == "__main__":
368 | if FLAGS.test: test()
369 | else: evaluate()
370 | # plot_acc_from_txt()
371 |
--------------------------------------------------------------------------------
/models/densenet169_io.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 |
4 | import tensorflow as tf
5 | import scipy.misc  # submodule import needed for scipy.misc.imresize below; also exposes scipy.pi
6 | import numpy as np
7 |
8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
9 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
10 |
11 | import tf_util
12 | from custom_layers import Scale
13 | from keras.layers import (Input, Dense, Convolution2D, MaxPooling2D,
14 | AveragePooling2D, GlobalAveragePooling2D,
15 | ZeroPadding2D, Dropout, Flatten, add,
16 | concatenate, Reshape, Activation)
17 | from keras.layers.normalization import BatchNormalization
18 | from keras.models import Model
19 |
20 |
21 | def placeholder_inputs(batch_size, img_rows=224, img_cols=224, separately=False):
22 | imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
23 | if separately:
24 | speeds_pl = tf.placeholder(tf.float32, shape=(batch_size))
25 | angles_pl = tf.placeholder(tf.float32, shape=(batch_size))
26 |         labels_pl = [speeds_pl, angles_pl]
27 |     else: labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))  # joint (speed, angle) labels
28 | return imgs_pl, labels_pl
29 |
30 |
31 | def get_densenet(img_rows, img_cols, nb_dense_block=4,
32 | growth_rate=32, nb_filter=64, reduction=0.5,
33 | dropout_rate=0.0, weight_decay=1e-4):
34 | '''
35 | DenseNet 169 Model for Keras
36 |
37 | Model Schema is based on
38 | https://github.com/flyyufelix/DenseNet-Keras
39 |
40 | ImageNet Pretrained Weights
41 | Theano: https://drive.google.com/open?id=0Byy2AcGyEVxfN0d3T1F1MXg0NlU
42 | TensorFlow: https://drive.google.com/open?id=0Byy2AcGyEVxfSEc5UC1ROUFJdmM
43 |
44 | # Arguments
45 | nb_dense_block: number of dense blocks to add to end
46 | growth_rate: number of filters to add per dense block
47 | nb_filter: initial number of filters
48 | reduction: reduction factor of transition blocks.
49 | dropout_rate: dropout rate
50 | weight_decay: weight decay factor
51 | classes: optional number of classes to classify images
52 | weights_path: path to pre-trained weights
53 | # Returns
54 | A Keras model instance.
55 | '''
56 | eps = 1.1e-5
57 |
58 | # compute compression factor
59 | compression = 1.0 - reduction
60 |
61 | # Handle Dimension Ordering for different backends
62 | img_input = Input(shape=(224, 224, 3), name='data')
63 |
64 | # From architecture for ImageNet (Table 1 in the paper)
65 | nb_filter = 64
66 | nb_layers = [6,12,32,32] # For DenseNet-169
67 |
68 | # Initial convolution
69 | x = ZeroPadding2D((3, 3), name='conv1_zeropadding')(img_input)
70 | x = Convolution2D(nb_filter, (7, 7), strides=(2, 2), name='conv1', use_bias=False)(x)
71 | x = BatchNormalization(epsilon=eps, axis=3, name='conv1_bn')(x)
72 | x = Scale(axis=3, name='conv1_scale')(x)
73 | x = Activation('relu', name='relu1')(x)
74 | x = ZeroPadding2D((1, 1), name='pool1_zeropadding')(x)
75 | x = MaxPooling2D((3, 3), strides=(2, 2), name='pool1')(x)
76 |
77 | # Add dense blocks
78 | for block_idx in range(nb_dense_block - 1):
79 | stage = block_idx+2
80 | x, nb_filter = dense_block(x, stage, nb_layers[block_idx], nb_filter, growth_rate, dropout_rate=dropout_rate, weight_decay=weight_decay)
81 |
82 | # Add transition_block
83 | x = transition_block(x, stage, nb_filter, compression=compression, dropout_rate=dropout_rate, weight_decay=weight_decay)
84 | nb_filter = int(nb_filter * compression)
85 |
86 | final_stage = stage + 1
87 | x, nb_filter = dense_block(x, final_stage, nb_layers[-1], nb_filter, growth_rate, dropout_rate=dropout_rate, weight_decay=weight_decay)
88 |
89 | x = BatchNormalization(epsilon=eps, axis=3, name='conv'+str(final_stage)+'_blk_bn')(x)
90 | x = Scale(axis=3, name='conv'+str(final_stage)+'_blk_scale')(x)
91 | x = Activation('relu', name='relu'+str(final_stage)+'_blk')(x)
92 |
93 | x_fc = GlobalAveragePooling2D(name='pool'+str(final_stage))(x)
94 | x_fc = Dense(1000, name='fc6')(x_fc)
95 | x_fc = Activation('softmax', name='prob')(x_fc)
96 |
97 | model = Model(img_input, x_fc, name='densenet')
98 |
99 | # Use pre-trained weights for Tensorflow backend
100 | weights_path = 'utils/weights/densenet169_weights_tf.h5'
101 |
102 | model.load_weights(weights_path, by_name=True)
103 |
104 | # Truncate and replace softmax layer for transfer learning
105 | # Cannot use model.layers.pop() since model is not of Sequential() type
106 | # The method below works since pre-trained weights are stored in layers but not in the model
107 | x_newfc = GlobalAveragePooling2D(name='pool'+str(final_stage))(x)
108 |
109 | x_newfc = Dense(256, name='fc7')(x_newfc)
110 | model = Model(img_input, x_newfc)
111 |
112 | return model
113 |
114 |
115 | def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False):
116 | """ Densenet169 regression model, input is BxWxHx3, output Bx2"""
117 | net = get_densenet(224, 224)(net)
118 |
119 | if not add_lstm:
120 | net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc_final')
121 |
122 | else:
123 | net = tf_util.fully_connected(net, 784, bn=True,
124 | is_training=is_training,
125 | scope='fc_lstm',
126 | bn_decay=bn_decay)
127 | net = tf_util.dropout(net, keep_prob=0.7,
128 | is_training=is_training,
129 | scope="dp1")
130 | net = cnn_lstm_block(net)
131 |
132 | return net
133 |
134 |
135 | def cnn_lstm_block(input_tensor):  # treat the 784-dim feature as 28 time steps of 28-dim inputs
136 | lstm_in = tf.reshape(input_tensor, [-1, 28, 28])
137 | lstm_out = tf_util.stacked_lstm(lstm_in,
138 | num_outputs=10,
139 | time_steps=28,
140 | scope="cnn_lstm")
141 |
142 | W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1))
143 | b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1))
144 | return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2)
145 |
146 |
147 | def conv_block(x, stage, branch, nb_filter, dropout_rate=None, weight_decay=1e-4):
148 | '''Apply BatchNorm, Relu, bottleneck 1x1 Conv2D, 3x3 Conv2D, and option dropout
149 | # Arguments
150 | x: input tensor
151 | stage: index for dense block
152 | branch: layer index within each dense block
153 | nb_filter: number of filters
154 | dropout_rate: dropout rate
155 | weight_decay: weight decay factor
156 | '''
157 | eps = 1.1e-5
158 | conv_name_base = 'conv' + str(stage) + '_' + str(branch)
159 | relu_name_base = 'relu' + str(stage) + '_' + str(branch)
160 |
161 | # 1x1 Convolution (Bottleneck layer)
162 | inter_channel = nb_filter * 4
163 | x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_x1_bn')(x)
164 | x = Scale(axis=3, name=conv_name_base+'_x1_scale')(x)
165 | x = Activation('relu', name=relu_name_base+'_x1')(x)
166 | x = Convolution2D(inter_channel, (1, 1), name=conv_name_base+'_x1', use_bias=False)(x)
167 |
168 | if dropout_rate:
169 | x = Dropout(dropout_rate)(x)
170 |
171 | # 3x3 Convolution
172 | x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_x2_bn')(x)
173 | x = Scale(axis=3, name=conv_name_base+'_x2_scale')(x)
174 | x = Activation('relu', name=relu_name_base+'_x2')(x)
175 | x = ZeroPadding2D((1, 1), name=conv_name_base+'_x2_zeropadding')(x)
176 | x = Convolution2D(nb_filter, (3, 3), name=conv_name_base+'_x2', use_bias=False)(x)
177 |
178 | if dropout_rate:
179 | x = Dropout(dropout_rate)(x)
180 |
181 | return x
182 |
183 |
184 | def transition_block(x, stage, nb_filter, compression=1.0, dropout_rate=None, weight_decay=1E-4):
185 | ''' Apply BatchNorm, 1x1 Convolution, averagePooling, optional compression, dropout
186 | # Arguments
187 | x: input tensor
188 | stage: index for dense block
189 | nb_filter: number of filters
190 | compression: calculated as 1 - reduction. Reduces the number of feature maps in the transition block.
191 | dropout_rate: dropout rate
192 | weight_decay: weight decay factor
193 | '''
194 |
195 | eps = 1.1e-5
196 | conv_name_base = 'conv' + str(stage) + '_blk'
197 | relu_name_base = 'relu' + str(stage) + '_blk'
198 | pool_name_base = 'pool' + str(stage)
199 |
200 | x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_bn')(x)
201 | x = Scale(axis=3, name=conv_name_base+'_scale')(x)
202 | x = Activation('relu', name=relu_name_base)(x)
203 | x = Convolution2D(int(nb_filter * compression), (1, 1), name=conv_name_base, use_bias=False)(x)
204 |
205 | if dropout_rate:
206 | x = Dropout(dropout_rate)(x)
207 |
208 | x = AveragePooling2D((2, 2), strides=(2, 2), name=pool_name_base)(x)
209 |
210 | return x
211 |
212 |
213 | def dense_block(x, stage, nb_layers, nb_filter, growth_rate, dropout_rate=None, weight_decay=1e-4, grow_nb_filters=True):
214 | ''' Build a dense_block where the output of each conv_block is fed to subsequent ones
215 | # Arguments
216 | x: input tensor
217 | stage: index for dense block
218 | nb_layers: the number of layers of conv_block to append to the model.
219 | nb_filter: number of filters
220 | growth_rate: growth rate
221 | dropout_rate: dropout rate
222 | weight_decay: weight decay factor
223 | grow_nb_filters: flag to decide to allow number of filters to grow
224 | '''
225 |
226 | eps = 1.1e-5
227 | concat_feat = x
228 |
229 | for i in range(nb_layers):
230 | branch = i+1
231 | x = conv_block(concat_feat, stage, branch, growth_rate, dropout_rate, weight_decay)
232 | concat_feat = concatenate([concat_feat, x], axis=3, name='concat_'+str(stage)+'_'+str(branch))
233 |
234 | if grow_nb_filters:
235 | nb_filter += growth_rate
236 |
237 | return concat_feat, nb_filter
238 |
239 |
240 | def get_loss(pred, label, l2_weight=0.0001):
241 | diff = tf.square(tf.subtract(pred, label))
242 | train_vars = tf.trainable_variables()
243 | l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
244 | loss = tf.reduce_mean(diff + l2_loss)
245 |     tf.summary.scalar('l2 loss', l2_loss)  # l2_loss already includes l2_weight
246 | tf.summary.scalar('loss', loss)
247 |
248 | return loss
249 |
250 |
251 | def summary_scalar(pred, label):
252 |     thresholds = [5, 4, 3, 2, 1, 0.5]
253 |     angles = [float(t) / 180 * scipy.pi for t in thresholds]
254 |     speeds = [float(t) / 20 for t in thresholds]
255 |
256 |     for i in range(len(thresholds)):
257 |         scalar_angle = "angle(" + str(angles[i]) + ")"
258 |         scalar_speed = "speed(" + str(speeds[i]) + ")"
259 |         ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]
260 |         ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]
261 | ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
262 | ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))
263 |
264 | tf.summary.scalar(scalar_angle, ac_angle)
265 | tf.summary.scalar(scalar_speed, ac_speed)
266 |
267 |
268 | def resize(imgs):
269 | batch_size = imgs.shape[0]
270 | imgs_new = []
271 | for j in range(batch_size):
272 | img = imgs[j,:,:,:]
273 | new = scipy.misc.imresize(img, (224, 224))
274 | imgs_new.append(new)
275 | imgs_new = np.stack(imgs_new, axis=0)
276 | return imgs_new
277 |
278 |
279 | if __name__ == '__main__':
280 | with tf.Graph().as_default():
281 | inputs = tf.zeros((32, 224, 224, 3))
282 | outputs = get_model(inputs, tf.constant(True))
283 | print(outputs)
284 |
--------------------------------------------------------------------------------
/models/densenet169_pm.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 |
4 | import tensorflow as tf
5 | import scipy.misc  # submodule import needed for scipy.misc.imresize below; also exposes scipy.pi
6 | import numpy as np
7 |
8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
9 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
10 |
11 | import tf_util
12 | import pointnet
13 | from custom_layers import Scale
14 | from keras.layers import (Input, Dense, Convolution2D, MaxPooling2D,
15 | AveragePooling2D, GlobalAveragePooling2D,
16 | ZeroPadding2D, Dropout, Flatten, add,
17 | concatenate, Reshape, Activation)
18 | from keras.layers.normalization import BatchNormalization
19 | from keras.models import Model
20 |
21 | from keras import backend as K
22 | K.set_learning_phase(1) #set learning phase
23 |
24 | def placeholder_inputs(batch_size, img_rows=224, img_cols=224, points=16384, separately=False):
25 | imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
26 | fmaps_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
27 | if separately:
28 | speeds_pl = tf.placeholder(tf.float32, shape=(batch_size))
29 | angles_pl = tf.placeholder(tf.float32, shape=(batch_size))
30 |         labels_pl = [speeds_pl, angles_pl]
31 |     else: labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))  # joint (speed, angle) labels
32 | return imgs_pl, fmaps_pl, labels_pl
33 |
34 |
35 | def get_densenet(img_rows, img_cols, nb_dense_block=4,
36 | growth_rate=32, nb_filter=64, reduction=0.5,
37 | dropout_rate=0.0, weight_decay=1e-4):
38 | '''
39 | DenseNet 169 Model for Keras
40 |
41 | Model Schema is based on
42 | https://github.com/flyyufelix/DenseNet-Keras
43 |
44 | ImageNet Pretrained Weights
45 | Theano: https://drive.google.com/open?id=0Byy2AcGyEVxfN0d3T1F1MXg0NlU
46 | TensorFlow: https://drive.google.com/open?id=0Byy2AcGyEVxfSEc5UC1ROUFJdmM
47 |
48 | # Arguments
49 | nb_dense_block: number of dense blocks to add to end
50 | growth_rate: number of filters to add per dense block
51 | nb_filter: initial number of filters
52 | reduction: reduction factor of transition blocks.
53 | dropout_rate: dropout rate
54 | weight_decay: weight decay factor
55 | classes: optional number of classes to classify images
56 | weights_path: path to pre-trained weights
57 | # Returns
58 | A Keras model instance.
59 | '''
60 | eps = 1.1e-5
61 |
62 | # compute compression factor
63 | compression = 1.0 - reduction
64 |
65 | # Handle Dimension Ordering for different backends
66 | img_input = Input(shape=(224, 224, 3), name='data')
67 |
68 | # From architecture for ImageNet (Table 1 in the paper)
69 | nb_filter = 64
70 | nb_layers = [6,12,32,32] # For DenseNet-169
71 |
72 | # Initial convolution
73 | x = ZeroPadding2D((3, 3), name='conv1_zeropadding')(img_input)
74 | x = Convolution2D(nb_filter, (7, 7), strides=(2, 2), name='conv1', use_bias=False)(x)
75 | x = BatchNormalization(epsilon=eps, axis=3, name='conv1_bn')(x)
76 | x = Scale(axis=3, name='conv1_scale')(x)
77 | x = Activation('relu', name='relu1')(x)
78 | x = ZeroPadding2D((1, 1), name='pool1_zeropadding')(x)
79 | x = MaxPooling2D((3, 3), strides=(2, 2), name='pool1')(x)
80 |
81 | # Add dense blocks
82 | for block_idx in range(nb_dense_block - 1):
83 | stage = block_idx+2
84 | x, nb_filter = dense_block(x, stage, nb_layers[block_idx], nb_filter, growth_rate, dropout_rate=dropout_rate, weight_decay=weight_decay)
85 |
86 | # Add transition_block
87 | x = transition_block(x, stage, nb_filter, compression=compression, dropout_rate=dropout_rate, weight_decay=weight_decay)
88 | nb_filter = int(nb_filter * compression)
89 |
90 | final_stage = stage + 1
91 | x, nb_filter = dense_block(x, final_stage, nb_layers[-1], nb_filter, growth_rate, dropout_rate=dropout_rate, weight_decay=weight_decay)
92 |
93 | x = BatchNormalization(epsilon=eps, axis=3, name='conv'+str(final_stage)+'_blk_bn')(x)
94 | x = Scale(axis=3, name='conv'+str(final_stage)+'_blk_scale')(x)
95 | x = Activation('relu', name='relu'+str(final_stage)+'_blk')(x)
96 |
97 | x_fc = GlobalAveragePooling2D(name='pool'+str(final_stage))(x)
98 | x_fc = Dense(1000, name='fc6')(x_fc)
99 | x_fc = Activation('softmax', name='prob')(x_fc)
100 |
101 | model = Model(img_input, x_fc, name='densenet')
102 |
103 | # Use pre-trained weights for Tensorflow backend
104 | weights_path = 'utils/weights/densenet169_weights_tf.h5'
105 |
106 | model.load_weights(weights_path, by_name=True)
107 |
108 | # Truncate and replace softmax layer for transfer learning
109 | # Cannot use model.layers.pop() since model is not of Sequential() type
110 | # The method below works since pre-trained weights are stored in layers but not in the model
111 | x_newfc = GlobalAveragePooling2D(name='pool'+str(final_stage))(x)
112 |
113 | x_newfc = Dense(256, name='fc7')(x_newfc)
114 | model = Model(img_input, x_newfc)
115 |
116 | return model
117 |
118 |
119 | def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False):
120 | """ Densenet169 regression model, input is BxWxHx3, output Bx2"""
121 | batch_size = net[0].get_shape()[0].value
122 | img_net, fmap_net = net[0], net[1]
123 |
124 | img_net = get_densenet(224, 224)(img_net)
125 | fmap_net = get_densenet(224, 224)(fmap_net)
126 |
127 | net = tf.reshape(tf.stack([img_net, fmap_net]), [batch_size, -1])
128 |
129 | if not add_lstm:
130 | for i, dim in enumerate([256, 128, 16]):
131 | fc_scope = "fc" + str(i + 1)
132 | dp_scope = "dp" + str(i + 1)
133 | net = tf_util.fully_connected(net, dim, bn=True,
134 | is_training=is_training,
135 | scope=fc_scope,
136 | bn_decay=bn_decay)
137 | net = tf_util.dropout(net, keep_prob=0.7,
138 | is_training=is_training,
139 | scope=dp_scope)
140 | net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc4')
141 | else:
142 | fc_scope = "fc1"
143 | net = tf_util.fully_connected(net, 784, bn=True,
144 | is_training=is_training,
145 | scope=fc_scope,
146 | bn_decay=bn_decay)
147 | net = tf_util.dropout(net, keep_prob=0.7,
148 | is_training=is_training,
149 | scope="dp1")
150 | net = cnn_lstm_block(net)
151 | return net
152 |
153 |
154 | def cnn_lstm_block(input_tensor):  # treat the 784-dim feature as 28 time steps of 28-dim inputs
155 | lstm_in = tf.reshape(input_tensor, [-1, 28, 28])
156 | lstm_out = tf_util.stacked_lstm(lstm_in,
157 | num_outputs=10,
158 | time_steps=28,
159 | scope="cnn_lstm")
160 |
161 | W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1))
162 | b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1))
163 | return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2)
164 |
165 |
166 | def conv_block(x, stage, branch, nb_filter, dropout_rate=None, weight_decay=1e-4):
167 | '''Apply BatchNorm, Relu, bottleneck 1x1 Conv2D, 3x3 Conv2D, and option dropout
168 | # Arguments
169 | x: input tensor
170 | stage: index for dense block
171 | branch: layer index within each dense block
172 | nb_filter: number of filters
173 | dropout_rate: dropout rate
174 | weight_decay: weight decay factor
175 | '''
176 | eps = 1.1e-5
177 | conv_name_base = 'conv' + str(stage) + '_' + str(branch)
178 | relu_name_base = 'relu' + str(stage) + '_' + str(branch)
179 |
180 | # 1x1 Convolution (Bottleneck layer)
181 | inter_channel = nb_filter * 4
182 | x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_x1_bn')(x)
183 | x = Scale(axis=3, name=conv_name_base+'_x1_scale')(x)
184 | x = Activation('relu', name=relu_name_base+'_x1')(x)
185 | x = Convolution2D(inter_channel, (1, 1), name=conv_name_base+'_x1', use_bias=False)(x)
186 |
187 | if dropout_rate:
188 | x = Dropout(dropout_rate)(x)
189 |
190 | # 3x3 Convolution
191 | x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_x2_bn')(x)
192 | x = Scale(axis=3, name=conv_name_base+'_x2_scale')(x)
193 | x = Activation('relu', name=relu_name_base+'_x2')(x)
194 | x = ZeroPadding2D((1, 1), name=conv_name_base+'_x2_zeropadding')(x)
195 | x = Convolution2D(nb_filter, (3, 3), name=conv_name_base+'_x2', use_bias=False)(x)
196 |
197 | if dropout_rate:
198 | x = Dropout(dropout_rate)(x)
199 |
200 | return x
201 |
202 |
203 | def transition_block(x, stage, nb_filter, compression=1.0, dropout_rate=None, weight_decay=1E-4):
204 | ''' Apply BatchNorm, 1x1 Convolution, averagePooling, optional compression, dropout
205 | # Arguments
206 | x: input tensor
207 | stage: index for dense block
208 | nb_filter: number of filters
209 | compression: calculated as 1 - reduction. Reduces the number of feature maps in the transition block.
210 | dropout_rate: dropout rate
211 | weight_decay: weight decay factor
212 | '''
213 |
214 | eps = 1.1e-5
215 | conv_name_base = 'conv' + str(stage) + '_blk'
216 | relu_name_base = 'relu' + str(stage) + '_blk'
217 | pool_name_base = 'pool' + str(stage)
218 |
219 | x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_bn')(x)
220 | x = Scale(axis=3, name=conv_name_base+'_scale')(x)
221 | x = Activation('relu', name=relu_name_base)(x)
222 | x = Convolution2D(int(nb_filter * compression), (1, 1), name=conv_name_base, use_bias=False)(x)
223 |
224 | if dropout_rate:
225 | x = Dropout(dropout_rate)(x)
226 |
227 | x = AveragePooling2D((2, 2), strides=(2, 2), name=pool_name_base)(x)
228 |
229 | return x
230 |
231 |
232 | def dense_block(x, stage, nb_layers, nb_filter, growth_rate, dropout_rate=None, weight_decay=1e-4, grow_nb_filters=True):
233 | ''' Build a dense_block where the output of each conv_block is fed to subsequent ones
234 | # Arguments
235 | x: input tensor
236 | stage: index for dense block
237 | nb_layers: the number of layers of conv_block to append to the model.
238 | nb_filter: number of filters
239 | growth_rate: growth rate
240 | dropout_rate: dropout rate
241 | weight_decay: weight decay factor
242 | grow_nb_filters: flag to decide to allow number of filters to grow
243 | '''
244 |
245 | eps = 1.1e-5
246 | concat_feat = x
247 |
248 | for i in range(nb_layers):
249 | branch = i+1
250 | x = conv_block(concat_feat, stage, branch, growth_rate, dropout_rate, weight_decay)
251 | concat_feat = concatenate([concat_feat, x], axis=3, name='concat_'+str(stage)+'_'+str(branch))
252 |
253 | if grow_nb_filters:
254 | nb_filter += growth_rate
255 |
256 | return concat_feat, nb_filter
257 |
258 |
259 | def get_loss(pred, label, l2_weight=0.0001):
260 | diff = tf.square(tf.subtract(pred, label))
261 | train_vars = tf.trainable_variables()
262 | l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
263 | loss = tf.reduce_mean(diff + l2_loss)
264 |     tf.summary.scalar('l2_loss', l2_loss)
265 | tf.summary.scalar('loss', loss)
266 |
267 | return loss
268 |
269 |
270 | def summary_scalar(pred, label):
271 |     thresholds = [5, 4, 3, 2, 1, 0.5]
272 |     angles = [float(t) / 180 * scipy.pi for t in thresholds]
273 |     speeds = [float(t) / 20 for t in thresholds]
274 |
275 |     for i in range(len(thresholds)):
276 |         scalar_angle = "angle(" + str(angles[i]) + ")"
277 |         scalar_speed = "speed(" + str(speeds[i]) + ")"
278 |         ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < thresholds[i]
279 |         ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < thresholds[i]
280 | ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
281 | ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))
282 |
283 | tf.summary.scalar(scalar_angle, ac_angle)
284 | tf.summary.scalar(scalar_speed, ac_speed)
285 |
286 |
287 | def resize(imgs):
288 | batch_size = imgs.shape[0]
289 | imgs_new = []
290 | for j in range(batch_size):
291 | img = imgs[j,:,:,:]
292 | new = scipy.misc.imresize(img, (224, 224))
293 | imgs_new.append(new)
294 | imgs_new = np.stack(imgs_new, axis=0)
295 | return imgs_new
296 |
297 |
298 | if __name__ == '__main__':
299 | with tf.Graph().as_default():
300 | imgs = tf.zeros((32, 224, 224, 3))
301 | fmaps = tf.zeros((32, 224, 224, 3))
302 | outputs = get_model([imgs, fmaps], tf.constant(True))
303 | print(outputs)
304 |
--------------------------------------------------------------------------------
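A minimal end-to-end sketch (not part of the repository) for driving the two-stream model above with dummy data. It assumes densenet169_pm exposes the same placeholder_inputs(batch_size) as the other *_pm models, that the repo root is the working directory, and that the DenseNet-169 weights loaded by get_densenet() exist under utils/weights/:

    import sys

    import numpy as np
    import tensorflow as tf

    sys.path.append('models')
    import densenet169_pm as model

    BATCH = 32
    with tf.Graph().as_default():
        imgs_pl, fmaps_pl, labels_pl = model.placeholder_inputs(BATCH)
        pred = model.get_model([imgs_pl, fmaps_pl], tf.constant(False))
        loss = model.get_loss(pred, labels_pl)
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            feed = {imgs_pl: np.zeros((BATCH, 224, 224, 3), np.float32),
                    fmaps_pl: np.zeros((BATCH, 224, 224, 3), np.float32),
                    labels_pl: np.zeros((BATCH, 2), np.float32)}
            print(sess.run([pred, loss], feed_dict=feed))
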
/models/densenet169_pn.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 |
4 | import tensorflow as tf
5 | import scipy.misc
6 | import numpy as np
7 |
8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
9 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
10 |
11 | import tf_util
12 | import pointnet
13 | from custom_layers import Scale
14 | from keras.layers import (Input, Dense, Convolution2D, MaxPooling2D,
15 | AveragePooling2D, GlobalAveragePooling2D,
16 | ZeroPadding2D, Dropout, Flatten, add,
17 | concatenate, Reshape, Activation)
18 | from keras.layers.normalization import BatchNormalization
19 | from keras.models import Model
20 |
21 |
22 | def placeholder_inputs(batch_size, img_rows=224, img_cols=224, points=16384, separately=False):
23 | imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
24 | pts_pl = tf.placeholder(tf.float32, shape=(batch_size, points, 3))
25 |     if separately:
26 |         speeds_pl = tf.placeholder(tf.float32, shape=(batch_size,))
27 |         angles_pl = tf.placeholder(tf.float32, shape=(batch_size,))
28 |         labels_pl = [speeds_pl, angles_pl]
29 |     else:
30 |         labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))
31 |     return imgs_pl, pts_pl, labels_pl
32 |
33 | def get_densenet(img_rows, img_cols, nb_dense_block=4,
34 | growth_rate=32, nb_filter=64, reduction=0.5,
35 | dropout_rate=0.0, weight_decay=1e-4):
36 | '''
37 | DenseNet 169 Model for Keras
38 |
39 | Model Schema is based on
40 | https://github.com/flyyufelix/DenseNet-Keras
41 |
42 | ImageNet Pretrained Weights
43 | Theano: https://drive.google.com/open?id=0Byy2AcGyEVxfN0d3T1F1MXg0NlU
44 | TensorFlow: https://drive.google.com/open?id=0Byy2AcGyEVxfSEc5UC1ROUFJdmM
45 |
46 | # Arguments
47 | nb_dense_block: number of dense blocks to add to end
48 | growth_rate: number of filters to add per dense block
49 | nb_filter: initial number of filters
50 | reduction: reduction factor of transition blocks.
51 | dropout_rate: dropout rate
52 | weight_decay: weight decay factor
53 | classes: optional number of classes to classify images
54 | weights_path: path to pre-trained weights
55 | # Returns
56 | A Keras model instance.
57 | '''
58 | eps = 1.1e-5
59 |
60 | # compute compression factor
61 | compression = 1.0 - reduction
62 |
63 |     # TensorFlow backend: channels-last input
64 |     img_input = Input(shape=(img_rows, img_cols, 3), name='data')
65 |
66 | # From architecture for ImageNet (Table 1 in the paper)
67 | nb_filter = 64
68 | nb_layers = [6,12,32,32] # For DenseNet-169
69 |
70 | # Initial convolution
71 | x = ZeroPadding2D((3, 3), name='conv1_zeropadding')(img_input)
72 | x = Convolution2D(nb_filter, (7, 7), strides=(2, 2), name='conv1', use_bias=False)(x)
73 | x = BatchNormalization(epsilon=eps, axis=3, name='conv1_bn')(x)
74 | x = Scale(axis=3, name='conv1_scale')(x)
75 | x = Activation('relu', name='relu1')(x)
76 | x = ZeroPadding2D((1, 1), name='pool1_zeropadding')(x)
77 | x = MaxPooling2D((3, 3), strides=(2, 2), name='pool1')(x)
78 |
79 | # Add dense blocks
80 | for block_idx in range(nb_dense_block - 1):
81 | stage = block_idx+2
82 | x, nb_filter = dense_block(x, stage, nb_layers[block_idx], nb_filter, growth_rate, dropout_rate=dropout_rate, weight_decay=weight_decay)
83 |
84 | # Add transition_block
85 | x = transition_block(x, stage, nb_filter, compression=compression, dropout_rate=dropout_rate, weight_decay=weight_decay)
86 | nb_filter = int(nb_filter * compression)
87 |
88 | final_stage = stage + 1
89 | x, nb_filter = dense_block(x, final_stage, nb_layers[-1], nb_filter, growth_rate, dropout_rate=dropout_rate, weight_decay=weight_decay)
90 |
91 | x = BatchNormalization(epsilon=eps, axis=3, name='conv'+str(final_stage)+'_blk_bn')(x)
92 | x = Scale(axis=3, name='conv'+str(final_stage)+'_blk_scale')(x)
93 | x = Activation('relu', name='relu'+str(final_stage)+'_blk')(x)
94 |
95 | x_fc = GlobalAveragePooling2D(name='pool'+str(final_stage))(x)
96 | x_fc = Dense(1000, name='fc6')(x_fc)
97 | x_fc = Activation('softmax', name='prob')(x_fc)
98 |
99 | model = Model(img_input, x_fc, name='densenet')
100 |
101 | # Use pre-trained weights for Tensorflow backend
102 | weights_path = 'utils/weights/densenet169_weights_tf.h5'
103 |
104 | model.load_weights(weights_path, by_name=True)
105 |
106 | # Truncate and replace softmax layer for transfer learning
107 | # Cannot use model.layers.pop() since model is not of Sequential() type
108 | # The method below works since pre-trained weights are stored in layers but not in the model
109 | x_newfc = GlobalAveragePooling2D(name='pool'+str(final_stage))(x)
110 |
111 | x_newfc = Dense(256, name='fc7')(x_newfc)
112 | model = Model(img_input, x_newfc)
113 |
114 | return model
115 |
116 |
117 | def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False):
118 | """ Densenet169 regression model, input is BxWxHx3, output Bx2"""
119 | batch_size = net[0].get_shape()[0].value
120 | img_net, pt_net = net[0], net[1]
121 |
122 |     img_net = get_densenet(224, 224)(img_net)
123 | with tf.variable_scope('pointnet'):
124 | pt_net = pointnet.get_model(pt_net, tf.constant(True))
125 | net = tf.reshape(tf.stack([img_net, pt_net], axis=2), [batch_size, -1])
126 |
127 | if not add_lstm:
128 | for i, dim in enumerate([256, 128, 16]):
129 | fc_scope = "fc" + str(i + 1)
130 | dp_scope = "dp" + str(i + 1)
131 | net = tf_util.fully_connected(net, dim, bn=True,
132 | is_training=is_training,
133 | scope=fc_scope,
134 | bn_decay=bn_decay)
135 | net = tf_util.dropout(net, keep_prob=0.7,
136 | is_training=is_training,
137 | scope=dp_scope)
138 |
139 | net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc4')
140 | else:
141 | fc_scope = "fc1"
142 | net = tf_util.fully_connected(net, 784, bn=True,
143 | is_training=is_training,
144 | scope=fc_scope,
145 | bn_decay=bn_decay)
146 | net = tf_util.dropout(net, keep_prob=0.7,
147 | is_training=is_training,
148 | scope="dp1")
149 | net = cnn_lstm_block(net)
150 | return net
151 |
152 |
153 | def cnn_lstm_block(input_tensor):
154 | lstm_in = tf.reshape(input_tensor, [-1, 28, 28])
155 | lstm_out = tf_util.stacked_lstm(lstm_in,
156 | num_outputs=10,
157 | time_steps=28,
158 | scope="cnn_lstm")
159 |
160 | W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1))
161 | b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1))
162 | return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2)
163 |
164 |
165 | def conv_block(x, stage, branch, nb_filter, dropout_rate=None, weight_decay=1e-4):
166 |     '''Apply BatchNorm, ReLU, bottleneck 1x1 Conv2D, 3x3 Conv2D, and optional dropout
167 | # Arguments
168 | x: input tensor
169 | stage: index for dense block
170 | branch: layer index within each dense block
171 | nb_filter: number of filters
172 | dropout_rate: dropout rate
173 | weight_decay: weight decay factor
174 | '''
175 | eps = 1.1e-5
176 | conv_name_base = 'conv' + str(stage) + '_' + str(branch)
177 | relu_name_base = 'relu' + str(stage) + '_' + str(branch)
178 |
179 | # 1x1 Convolution (Bottleneck layer)
180 | inter_channel = nb_filter * 4
181 | x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_x1_bn')(x)
182 | x = Scale(axis=3, name=conv_name_base+'_x1_scale')(x)
183 | x = Activation('relu', name=relu_name_base+'_x1')(x)
184 | x = Convolution2D(inter_channel, (1, 1), name=conv_name_base+'_x1', use_bias=False)(x)
185 |
186 | if dropout_rate:
187 | x = Dropout(dropout_rate)(x)
188 |
189 | # 3x3 Convolution
190 | x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_x2_bn')(x)
191 | x = Scale(axis=3, name=conv_name_base+'_x2_scale')(x)
192 | x = Activation('relu', name=relu_name_base+'_x2')(x)
193 | x = ZeroPadding2D((1, 1), name=conv_name_base+'_x2_zeropadding')(x)
194 | x = Convolution2D(nb_filter, (3, 3), name=conv_name_base+'_x2', use_bias=False)(x)
195 |
196 | if dropout_rate:
197 | x = Dropout(dropout_rate)(x)
198 |
199 | return x
200 |
201 |
202 | def transition_block(x, stage, nb_filter, compression=1.0, dropout_rate=None, weight_decay=1E-4):
203 |     ''' Apply BatchNorm, 1x1 Convolution and AveragePooling, with optional compression and dropout
204 | # Arguments
205 | x: input tensor
206 | stage: index for dense block
207 | nb_filter: number of filters
208 | compression: calculated as 1 - reduction. Reduces the number of feature maps in the transition block.
209 | dropout_rate: dropout rate
210 | weight_decay: weight decay factor
211 | '''
212 |
213 | eps = 1.1e-5
214 | conv_name_base = 'conv' + str(stage) + '_blk'
215 | relu_name_base = 'relu' + str(stage) + '_blk'
216 | pool_name_base = 'pool' + str(stage)
217 |
218 | x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_bn')(x)
219 | x = Scale(axis=3, name=conv_name_base+'_scale')(x)
220 | x = Activation('relu', name=relu_name_base)(x)
221 | x = Convolution2D(int(nb_filter * compression), (1, 1), name=conv_name_base, use_bias=False)(x)
222 |
223 | if dropout_rate:
224 | x = Dropout(dropout_rate)(x)
225 |
226 | x = AveragePooling2D((2, 2), strides=(2, 2), name=pool_name_base)(x)
227 |
228 | return x
229 |
230 |
231 | def dense_block(x, stage, nb_layers, nb_filter, growth_rate, dropout_rate=None, weight_decay=1e-4, grow_nb_filters=True):
232 | ''' Build a dense_block where the output of each conv_block is fed to subsequent ones
233 | # Arguments
234 | x: input tensor
235 | stage: index for dense block
236 | nb_layers: the number of layers of conv_block to append to the model.
237 | nb_filter: number of filters
238 | growth_rate: growth rate
239 | dropout_rate: dropout rate
240 | weight_decay: weight decay factor
241 | grow_nb_filters: flag to decide to allow number of filters to grow
242 | '''
243 |
244 | eps = 1.1e-5
245 | concat_feat = x
246 |
247 | for i in range(nb_layers):
248 | branch = i+1
249 | x = conv_block(concat_feat, stage, branch, growth_rate, dropout_rate, weight_decay)
250 | concat_feat = concatenate([concat_feat, x], axis=3, name='concat_'+str(stage)+'_'+str(branch))
251 |
252 | if grow_nb_filters:
253 | nb_filter += growth_rate
254 |
255 | return concat_feat, nb_filter
256 |
257 |
258 | def get_loss(pred, label, l2_weight=0.0001):
259 | diff = tf.square(tf.subtract(pred, label))
260 | train_vars = tf.trainable_variables()
261 | l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
262 | loss = tf.reduce_mean(diff + l2_loss)
264 |     tf.summary.scalar('l2_loss', l2_loss)
264 | tf.summary.scalar('loss', loss)
265 |
266 | return loss
267 |
268 |
269 | def summary_scalar(pred, label):
270 |     thresholds = [5, 4, 3, 2, 1, 0.5]
271 |     angles = [float(t) / 180 * scipy.pi for t in thresholds]
272 |     speeds = [float(t) / 20 for t in thresholds]
273 |
274 |     for i in range(len(thresholds)):
275 |         scalar_angle = "angle(" + str(angles[i]) + ")"
276 |         scalar_speed = "speed(" + str(speeds[i]) + ")"
277 |         ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < thresholds[i]
278 |         ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < thresholds[i]
279 | ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
280 | ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))
281 |
282 | tf.summary.scalar(scalar_angle, ac_angle)
283 | tf.summary.scalar(scalar_speed, ac_speed)
284 |
285 |
286 | def resize(imgs):
287 | batch_size = imgs.shape[0]
288 | imgs_new = []
289 | for j in range(batch_size):
290 | img = imgs[j,:,:,:]
291 | new = scipy.misc.imresize(img, (224, 224))
292 | imgs_new.append(new)
293 | imgs_new = np.stack(imgs_new, axis=0)
294 | return imgs_new
295 |
296 |
297 | if __name__ == '__main__':
298 | with tf.Graph().as_default():
299 | imgs = tf.zeros((32, 224, 224, 3))
300 | pts = tf.zeros((32, 16384, 3))
301 | outputs = get_model([imgs, pts], tf.constant(True))
302 | print(outputs)
303 |
--------------------------------------------------------------------------------
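The fusion step in get_model above depends on how tf.stack composes with tf.reshape. A NumPy sketch (identical semantics) showing that stacking two per-sample feature vectors on a trailing axis keeps each fused row tied to a single sample, while stacking on axis 0 would mix rows from different samples:

    import numpy as np

    B, D = 4, 256
    img_feat = np.arange(B * D).reshape(B, D)
    pt_feat = -np.arange(B * D).reshape(B, D)

    # stack(axis=2) -> (B, D, 2); the reshape keeps the batch dimension intact
    fused = np.stack([img_feat, pt_feat], axis=2).reshape(B, -1)   # (B, 2*D)
    assert np.array_equal(fused[1].reshape(D, 2)[:, 0], img_feat[1])
    assert np.array_equal(fused[1].reshape(D, 2)[:, 1], pt_feat[1])

    # stack(axis=0) -> (2, B, D); the same reshape would interleave samples
    bad = np.stack([img_feat, pt_feat], axis=0).reshape(B, -1)
    assert np.array_equal(bad[0], np.concatenate([img_feat[0], img_feat[1]]))
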
/models/inception_v4_io.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 |
4 | import tensorflow as tf
5 | import scipy.misc
6 | import numpy as np
7 |
8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
9 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
10 |
11 | import tf_util
12 | from keras.layers import Input, Dense, Convolution2D, MaxPooling2D, AveragePooling2D, ZeroPadding2D, Dropout, Flatten, add, concatenate, Reshape, Activation
13 | from keras.layers.normalization import BatchNormalization
14 | from keras.models import Model
15 |
16 |
17 | def placeholder_inputs(batch_size, img_rows=299, img_cols=299, separately=False):
18 | imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
19 |     if separately:
20 |         speeds_pl = tf.placeholder(tf.float32, shape=(batch_size,))
21 |         angles_pl = tf.placeholder(tf.float32, shape=(batch_size,))
22 |         labels_pl = [speeds_pl, angles_pl]
23 |     else:
24 |         labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))
25 |     return imgs_pl, labels_pl
26 |
27 | def get_inception(img_rows=299, img_cols=299, dropout_keep_prob=0.2, separately=False):
28 | '''
29 | Inception V4 Model for Keras
30 |
31 | Model Schema is based on
32 | https://github.com/kentsommer/keras-inceptionV4
33 |
34 | ImageNet Pretrained Weights
35 | Theano: https://github.com/kentsommer/keras-inceptionV4/releases/download/2.0/inception-v4_weights_th_dim_ordering_th_kernels.h5
36 | TensorFlow: https://github.com/kentsommer/keras-inceptionV4/releases/download/2.0/inception-v4_weights_tf_dim_ordering_tf_kernels.h5
37 |
38 | Parameters:
39 | img_rows, img_cols - resolution of inputs
40 | channel - 1 for grayscale, 3 for color
41 | num_classes - number of class labels for our classification task
42 | '''
43 |
44 | # Input Shape is 299 x 299 x 3 (tf)
45 | img_input = Input(shape=(img_rows, img_cols, 3), name='data')
46 |
47 | # Make inception base
48 | net = inception_v4_base(img_input)
49 |
50 | # Final pooling and prediction
51 |
52 | # 8 x 8 x 1536
53 | net_old = AveragePooling2D((8,8), padding='valid')(net)
54 |
55 | # 1 x 1 x 1536
56 | net_old = Dropout(dropout_keep_prob)(net_old)
57 | net_old = Flatten()(net_old)
58 |
59 | # 1536
60 | predictions = Dense(units=1001, activation='softmax')(net_old)
61 |
62 | model = Model(img_input, predictions, name='inception_v4')
63 |
64 | weights_path = 'utils/weights/inception-v4_weights_tf.h5'
65 | assert (os.path.exists(weights_path))
66 | model.load_weights(weights_path, by_name=True)
67 |
68 | # Truncate and replace softmax layer for transfer learning
69 | # Cannot use model.layers.pop() since model is not of Sequential() type
70 | # The method below works since pre-trained weights are stored in layers but not in the model
71 |     net_ft = AveragePooling2D((8,8), padding='valid')(net)
72 | net_ft = Dropout(dropout_keep_prob)(net_ft)
73 | net_ft = Flatten()(net_ft)
74 | net = Dense(256, name='fc_mid')(net_ft)
75 |
76 | model = Model(img_input, net, name='inception_v4')
77 | return model
78 |
79 |
80 | def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False):
81 | """ Inception_V4 regression model, input is BxWxHx3, output Bx2"""
82 | net = get_inception(299, 299)(net)
83 |
84 | if not add_lstm:
85 | net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc_final')
86 |
87 | else:
88 | net = tf_util.fully_connected(net, 784, bn=True,
89 | is_training=is_training,
90 | scope='fc_lstm',
91 | bn_decay=bn_decay)
92 | net = tf_util.dropout(net, keep_prob=0.7,
93 | is_training=is_training,
94 | scope="dp1")
95 | net = cnn_lstm_block(net)
96 |
97 | return net
98 |
99 |
100 | def cnn_lstm_block(input_tensor):
101 | lstm_in = tf.reshape(input_tensor, [-1, 28, 28])
102 | lstm_out = tf_util.stacked_lstm(lstm_in,
103 | num_outputs=10,
104 | time_steps=28,
105 | scope="cnn_lstm")
106 |
107 | W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1))
108 | b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1))
109 | return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2)
110 |
111 |
112 | def conv2d_bn(x, nb_filter, nb_row, nb_col,
113 | border_mode='same', subsample=(1, 1), bias=False):
114 | """
115 | Utility function to apply conv + BN.
116 | (Slightly modified from https://github.com/fchollet/keras/blob/master/keras/applications/inception_v3.py)
117 | """
118 | channel_axis = -1
119 | x = Convolution2D(nb_filter, (nb_row, nb_col),
120 | strides=subsample,
121 | padding=border_mode,
122 | use_bias=bias)(x)
123 | x = BatchNormalization(axis=channel_axis)(x)
124 | x = Activation('relu')(x)
125 | return x
126 |
127 | def block_inception_a(input):
128 | channel_axis = -1
129 |
130 | branch_0 = conv2d_bn(input, 96, 1, 1)
131 |
132 | branch_1 = conv2d_bn(input, 64, 1, 1)
133 | branch_1 = conv2d_bn(branch_1, 96, 3, 3)
134 |
135 | branch_2 = conv2d_bn(input, 64, 1, 1)
136 | branch_2 = conv2d_bn(branch_2, 96, 3, 3)
137 | branch_2 = conv2d_bn(branch_2, 96, 3, 3)
138 |
139 | branch_3 = AveragePooling2D((3,3), strides=(1,1), padding='same')(input)
140 | branch_3 = conv2d_bn(branch_3, 96, 1, 1)
141 |
142 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis)
143 | return x
144 |
145 |
146 | def block_reduction_a(input):
147 | channel_axis = -1
148 |
149 | branch_0 = conv2d_bn(input, 384, 3, 3, subsample=(2,2), border_mode='valid')
150 |
151 | branch_1 = conv2d_bn(input, 192, 1, 1)
152 | branch_1 = conv2d_bn(branch_1, 224, 3, 3)
153 | branch_1 = conv2d_bn(branch_1, 256, 3, 3, subsample=(2,2), border_mode='valid')
154 |
155 | branch_2 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(input)
156 |
157 | x = concatenate([branch_0, branch_1, branch_2], axis=channel_axis)
158 | return x
159 |
160 |
161 | def block_inception_b(input):
162 | channel_axis = -1
163 |
164 | branch_0 = conv2d_bn(input, 384, 1, 1)
165 |
166 | branch_1 = conv2d_bn(input, 192, 1, 1)
167 | branch_1 = conv2d_bn(branch_1, 224, 1, 7)
168 | branch_1 = conv2d_bn(branch_1, 256, 7, 1)
169 |
170 | branch_2 = conv2d_bn(input, 192, 1, 1)
171 | branch_2 = conv2d_bn(branch_2, 192, 7, 1)
172 | branch_2 = conv2d_bn(branch_2, 224, 1, 7)
173 | branch_2 = conv2d_bn(branch_2, 224, 7, 1)
174 | branch_2 = conv2d_bn(branch_2, 256, 1, 7)
175 |
176 | branch_3 = AveragePooling2D((3,3), strides=(1,1), padding='same')(input)
177 | branch_3 = conv2d_bn(branch_3, 128, 1, 1)
178 |
179 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis)
180 | return x
181 |
182 |
183 | def block_reduction_b(input):
184 | channel_axis = -1
185 |
186 | branch_0 = conv2d_bn(input, 192, 1, 1)
187 | branch_0 = conv2d_bn(branch_0, 192, 3, 3, subsample=(2, 2), border_mode='valid')
188 |
189 | branch_1 = conv2d_bn(input, 256, 1, 1)
190 | branch_1 = conv2d_bn(branch_1, 256, 1, 7)
191 | branch_1 = conv2d_bn(branch_1, 320, 7, 1)
192 | branch_1 = conv2d_bn(branch_1, 320, 3, 3, subsample=(2,2), border_mode='valid')
193 |
194 | branch_2 = MaxPooling2D((3, 3), strides=(2, 2), padding='valid')(input)
195 |
196 | x = concatenate([branch_0, branch_1, branch_2], axis=channel_axis)
197 | return x
198 |
199 |
200 | def block_inception_c(input):
201 | channel_axis = -1
202 |
203 | branch_0 = conv2d_bn(input, 256, 1, 1)
204 |
205 | branch_1 = conv2d_bn(input, 384, 1, 1)
206 | branch_10 = conv2d_bn(branch_1, 256, 1, 3)
207 | branch_11 = conv2d_bn(branch_1, 256, 3, 1)
208 | branch_1 = concatenate([branch_10, branch_11], axis=channel_axis)
209 |
210 |
211 | branch_2 = conv2d_bn(input, 384, 1, 1)
212 | branch_2 = conv2d_bn(branch_2, 448, 3, 1)
213 | branch_2 = conv2d_bn(branch_2, 512, 1, 3)
214 | branch_20 = conv2d_bn(branch_2, 256, 1, 3)
215 | branch_21 = conv2d_bn(branch_2, 256, 3, 1)
216 | branch_2 = concatenate([branch_20, branch_21], axis=channel_axis)
217 |
218 | branch_3 = AveragePooling2D((3, 3), strides=(1, 1), padding='same')(input)
219 | branch_3 = conv2d_bn(branch_3, 256, 1, 1)
220 |
221 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis)
222 | return x
223 |
224 |
225 | def inception_v4_base(input):
226 | channel_axis = -1
227 |
228 |     # Input Shape is 299 x 299 x 3 (tf) or 3 x 299 x 299 (th)
229 | net = conv2d_bn(input, 32, 3, 3, subsample=(2,2), border_mode='valid')
230 | net = conv2d_bn(net, 32, 3, 3, border_mode='valid')
231 | net = conv2d_bn(net, 64, 3, 3)
232 |
233 | branch_0 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(net)
234 |
235 | branch_1 = conv2d_bn(net, 96, 3, 3, subsample=(2,2), border_mode='valid')
236 |
237 | net = concatenate([branch_0, branch_1], axis=channel_axis)
238 |
239 | branch_0 = conv2d_bn(net, 64, 1, 1)
240 | branch_0 = conv2d_bn(branch_0, 96, 3, 3, border_mode='valid')
241 |
242 | branch_1 = conv2d_bn(net, 64, 1, 1)
243 | branch_1 = conv2d_bn(branch_1, 64, 1, 7)
244 | branch_1 = conv2d_bn(branch_1, 64, 7, 1)
245 | branch_1 = conv2d_bn(branch_1, 96, 3, 3, border_mode='valid')
246 |
247 | net = concatenate([branch_0, branch_1], axis=channel_axis)
248 |
249 | branch_0 = conv2d_bn(net, 192, 3, 3, subsample=(2,2), border_mode='valid')
250 | branch_1 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(net)
251 |
252 | net = concatenate([branch_0, branch_1], axis=channel_axis)
253 |
254 | # 35 x 35 x 384
255 | # 4 x Inception-A blocks
256 |     for idx in range(4):
257 | net = block_inception_a(net)
258 |
259 | # 35 x 35 x 384
260 | # Reduction-A block
261 | net = block_reduction_a(net)
262 |
263 | # 17 x 17 x 1024
264 | # 7 x Inception-B blocks
265 |     for idx in range(7):
266 | net = block_inception_b(net)
267 |
268 | # 17 x 17 x 1024
269 | # Reduction-B block
270 | net = block_reduction_b(net)
271 |
272 | # 8 x 8 x 1536
273 | # 3 x Inception-C blocks
274 |     for idx in range(3):
275 | net = block_inception_c(net)
276 |
277 | return net
278 |
279 |
280 | def get_loss(pred, label, l2_weight=0.0001):
281 | diff = tf.square(tf.subtract(pred, label))
282 | train_vars = tf.trainable_variables()
283 | l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
284 | loss = tf.reduce_mean(diff + l2_loss)
285 |     tf.summary.scalar('l2_loss', l2_loss)
286 | tf.summary.scalar('loss', loss)
287 |
288 | return loss
289 |
290 |
291 | def summary_scalar(pred, label):
292 |     thresholds = [5, 4, 3, 2, 1, 0.5]
293 |     angles = [float(t) / 180 * scipy.pi for t in thresholds]
294 |     speeds = [float(t) / 20 for t in thresholds]
295 |
296 |     for i in range(len(thresholds)):
297 |         scalar_angle = "angle(" + str(angles[i]) + ")"
298 |         scalar_speed = "speed(" + str(speeds[i]) + ")"
299 |         ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < thresholds[i]
300 |         ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < thresholds[i]
301 | ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
302 | ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))
303 |
304 | tf.summary.scalar(scalar_angle, ac_angle)
305 | tf.summary.scalar(scalar_speed, ac_speed)
306 |
307 |
308 | def resize(imgs):
309 | batch_size = imgs.shape[0]
310 | imgs_new = []
311 | for j in range(batch_size):
312 | img = imgs[j,:,:,:]
313 | new = scipy.misc.imresize(img, (299, 299))
314 | imgs_new.append(new)
315 | imgs_new = np.stack(imgs_new, axis=0)
316 | return imgs_new
317 |
318 |
319 | if __name__ == '__main__':
320 | with tf.Graph().as_default():
321 |         inputs = tf.zeros((32, 299, 299, 3))
322 | outputs = get_model(inputs, tf.constant(True))
323 | print(outputs)
324 |
325 |
--------------------------------------------------------------------------------
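A sketch (not in the repository) of one optimization step for the single-stream model above; the optimizer and learning rate are illustrative choices, and the pretrained weights asserted in get_inception() must be in place:

    import sys

    import tensorflow as tf

    sys.path.append('models')
    import inception_v4_io as model

    BATCH = 8
    with tf.Graph().as_default():
        imgs_pl, labels_pl = model.placeholder_inputs(BATCH)
        is_training_pl = tf.placeholder(tf.bool, shape=())
        pred = model.get_model(imgs_pl, is_training_pl)
        loss = model.get_loss(pred, labels_pl)
        model.summary_scalar(pred, labels_pl)
        train_op = tf.train.AdamOptimizer(1e-4).minimize(loss)
        merged = tf.summary.merge_all()
        # each step: sess.run([train_op, loss, merged],
        #                     feed_dict={imgs_pl: batch_imgs,
        #                                labels_pl: batch_labels,
        #                                is_training_pl: True})
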
/models/inception_v4_pm.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 |
4 | import tensorflow as tf
5 | import scipy.misc
6 | import numpy as np
7 |
8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
9 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
10 |
11 | import tf_util
12 | from keras.layers import Input, Dense, Convolution2D, MaxPooling2D, AveragePooling2D, ZeroPadding2D, Dropout, Flatten, add, concatenate, Reshape, Activation
13 | from keras.layers.normalization import BatchNormalization
14 | from keras.models import Model
15 |
16 | from keras import backend as K
17 | K.set_learning_phase(1) #set learning phase
18 |
19 |
20 | def placeholder_inputs(batch_size, img_rows=299, img_cols=299, points=16384, separately=False):
21 | imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
22 | fmaps_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
23 |     if separately:
24 |         speeds_pl = tf.placeholder(tf.float32, shape=(batch_size,))
25 |         angles_pl = tf.placeholder(tf.float32, shape=(batch_size,))
26 |         labels_pl = [speeds_pl, angles_pl]
27 |     else:
28 |         labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))
29 |     return imgs_pl, fmaps_pl, labels_pl
30 |
31 | def get_inception(img_rows=299, img_cols=299, dropout_keep_prob=0.2, separately=False):
32 | '''
33 | Inception V4 Model for Keras
34 |
35 | Model Schema is based on
36 | https://github.com/kentsommer/keras-inceptionV4
37 |
38 | ImageNet Pretrained Weights
39 | Theano: https://github.com/kentsommer/keras-inceptionV4/releases/download/2.0/inception-v4_weights_th_dim_ordering_th_kernels.h5
40 | TensorFlow: https://github.com/kentsommer/keras-inceptionV4/releases/download/2.0/inception-v4_weights_tf_dim_ordering_tf_kernels.h5
41 |
42 | Parameters:
43 | img_rows, img_cols - resolution of inputs
44 | channel - 1 for grayscale, 3 for color
45 | num_classes - number of class labels for our classification task
46 | '''
47 |
48 | # Input Shape is 299 x 299 x 3 (tf)
49 | img_input = Input(shape=(img_rows, img_cols, 3), name='data')
50 |
51 | # Make inception base
52 | net = inception_v4_base(img_input)
53 |
54 | # Final pooling and prediction
55 |
56 | # 8 x 8 x 1536
57 | net_old = AveragePooling2D((8,8), padding='valid')(net)
58 |
59 | # 1 x 1 x 1536
60 | net_old = Dropout(dropout_keep_prob)(net_old)
61 | net_old = Flatten()(net_old)
62 |
63 | # 1536
64 | predictions = Dense(units=1001, activation='softmax')(net_old)
65 |
66 | model = Model(img_input, predictions, name='inception_v4')
67 |
68 | weights_path = 'utils/weights/inception-v4_weights_tf.h5'
69 | assert (os.path.exists(weights_path))
70 | model.load_weights(weights_path, by_name=True)
71 |
72 | # Truncate and replace softmax layer for transfer learning
73 | # Cannot use model.layers.pop() since model is not of Sequential() type
74 | # The method below works since pre-trained weights are stored in layers but not in the model
75 |     net_ft = AveragePooling2D((8,8), padding='valid')(net)
76 | net_ft = Dropout(dropout_keep_prob)(net_ft)
77 | net_ft = Flatten()(net_ft)
78 | net = Dense(256, name='fc_mid')(net_ft)
79 |
80 | model = Model(img_input, net, name='inception_v4')
81 | return model
82 |
83 |
84 | def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False):
85 | """ Inception_V4 regression model, input is BxWxHx3, output Bx2"""
86 | batch_size = net[0].get_shape()[0].value
87 | img_net, fmap_net = net[0], net[1]
88 |
89 | img_net = get_inception(299, 299)(img_net)
90 | fmap_net = get_inception(299, 299)(fmap_net)
91 |
92 |     # stack along a per-sample axis so the reshape does not mix batch rows
93 |     net = tf.reshape(tf.stack([img_net, fmap_net], axis=2), [batch_size, -1])
93 |
94 | if not add_lstm:
95 | for i, dim in enumerate([256, 128, 16]):
96 | fc_scope = "fc" + str(i + 1)
97 | dp_scope = "dp" + str(i + 1)
98 | net = tf_util.fully_connected(net, dim, bn=True,
99 | is_training=is_training,
100 | scope=fc_scope,
101 | bn_decay=bn_decay)
102 | net = tf_util.dropout(net, keep_prob=0.7,
103 | is_training=is_training,
104 | scope=dp_scope)
105 | net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc4')
106 | else:
107 | fc_scope = "fc1"
108 | net = tf_util.fully_connected(net, 784, bn=True,
109 | is_training=is_training,
110 | scope=fc_scope,
111 | bn_decay=bn_decay)
112 | net = tf_util.dropout(net, keep_prob=0.7,
113 | is_training=is_training,
114 | scope="dp1")
115 | net = cnn_lstm_block(net)
116 | return net
117 |
118 |
119 | def cnn_lstm_block(input_tensor):
120 | lstm_in = tf.reshape(input_tensor, [-1, 28, 28])
121 | lstm_out = tf_util.stacked_lstm(lstm_in,
122 | num_outputs=10,
123 | time_steps=28,
124 | scope="cnn_lstm")
125 |
126 | W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1))
127 | b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1))
128 | return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2)
129 |
130 |
131 | def conv2d_bn(x, nb_filter, nb_row, nb_col,
132 | border_mode='same', subsample=(1, 1), bias=False):
133 | """
134 | Utility function to apply conv + BN.
135 | (Slightly modified from https://github.com/fchollet/keras/blob/master/keras/applications/inception_v3.py)
136 | """
137 | channel_axis = -1
138 | x = Convolution2D(nb_filter, (nb_row, nb_col),
139 | strides=subsample,
140 | padding=border_mode,
141 | use_bias=bias)(x)
142 | x = BatchNormalization(axis=channel_axis)(x)
143 | x = Activation('relu')(x)
144 | return x
145 |
146 | def block_inception_a(input):
147 | channel_axis = -1
148 |
149 | branch_0 = conv2d_bn(input, 96, 1, 1)
150 |
151 | branch_1 = conv2d_bn(input, 64, 1, 1)
152 | branch_1 = conv2d_bn(branch_1, 96, 3, 3)
153 |
154 | branch_2 = conv2d_bn(input, 64, 1, 1)
155 | branch_2 = conv2d_bn(branch_2, 96, 3, 3)
156 | branch_2 = conv2d_bn(branch_2, 96, 3, 3)
157 |
158 | branch_3 = AveragePooling2D((3,3), strides=(1,1), padding='same')(input)
159 | branch_3 = conv2d_bn(branch_3, 96, 1, 1)
160 |
161 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis)
162 | return x
163 |
164 |
165 | def block_reduction_a(input):
166 | channel_axis = -1
167 |
168 | branch_0 = conv2d_bn(input, 384, 3, 3, subsample=(2,2), border_mode='valid')
169 |
170 | branch_1 = conv2d_bn(input, 192, 1, 1)
171 | branch_1 = conv2d_bn(branch_1, 224, 3, 3)
172 | branch_1 = conv2d_bn(branch_1, 256, 3, 3, subsample=(2,2), border_mode='valid')
173 |
174 | branch_2 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(input)
175 |
176 | x = concatenate([branch_0, branch_1, branch_2], axis=channel_axis)
177 | return x
178 |
179 |
180 | def block_inception_b(input):
181 | channel_axis = -1
182 |
183 | branch_0 = conv2d_bn(input, 384, 1, 1)
184 |
185 | branch_1 = conv2d_bn(input, 192, 1, 1)
186 | branch_1 = conv2d_bn(branch_1, 224, 1, 7)
187 | branch_1 = conv2d_bn(branch_1, 256, 7, 1)
188 |
189 | branch_2 = conv2d_bn(input, 192, 1, 1)
190 | branch_2 = conv2d_bn(branch_2, 192, 7, 1)
191 | branch_2 = conv2d_bn(branch_2, 224, 1, 7)
192 | branch_2 = conv2d_bn(branch_2, 224, 7, 1)
193 | branch_2 = conv2d_bn(branch_2, 256, 1, 7)
194 |
195 | branch_3 = AveragePooling2D((3,3), strides=(1,1), padding='same')(input)
196 | branch_3 = conv2d_bn(branch_3, 128, 1, 1)
197 |
198 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis)
199 | return x
200 |
201 |
202 | def block_reduction_b(input):
203 | channel_axis = -1
204 |
205 | branch_0 = conv2d_bn(input, 192, 1, 1)
206 | branch_0 = conv2d_bn(branch_0, 192, 3, 3, subsample=(2, 2), border_mode='valid')
207 |
208 | branch_1 = conv2d_bn(input, 256, 1, 1)
209 | branch_1 = conv2d_bn(branch_1, 256, 1, 7)
210 | branch_1 = conv2d_bn(branch_1, 320, 7, 1)
211 | branch_1 = conv2d_bn(branch_1, 320, 3, 3, subsample=(2,2), border_mode='valid')
212 |
213 | branch_2 = MaxPooling2D((3, 3), strides=(2, 2), padding='valid')(input)
214 |
215 | x = concatenate([branch_0, branch_1, branch_2], axis=channel_axis)
216 | return x
217 |
218 |
219 | def block_inception_c(input):
220 | channel_axis = -1
221 |
222 | branch_0 = conv2d_bn(input, 256, 1, 1)
223 |
224 | branch_1 = conv2d_bn(input, 384, 1, 1)
225 | branch_10 = conv2d_bn(branch_1, 256, 1, 3)
226 | branch_11 = conv2d_bn(branch_1, 256, 3, 1)
227 | branch_1 = concatenate([branch_10, branch_11], axis=channel_axis)
228 |
229 |
230 | branch_2 = conv2d_bn(input, 384, 1, 1)
231 | branch_2 = conv2d_bn(branch_2, 448, 3, 1)
232 | branch_2 = conv2d_bn(branch_2, 512, 1, 3)
233 | branch_20 = conv2d_bn(branch_2, 256, 1, 3)
234 | branch_21 = conv2d_bn(branch_2, 256, 3, 1)
235 | branch_2 = concatenate([branch_20, branch_21], axis=channel_axis)
236 |
237 | branch_3 = AveragePooling2D((3, 3), strides=(1, 1), padding='same')(input)
238 | branch_3 = conv2d_bn(branch_3, 256, 1, 1)
239 |
240 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis)
241 | return x
242 |
243 |
244 | def inception_v4_base(input):
245 | channel_axis = -1
246 |
247 |     # Input Shape is 299 x 299 x 3 (tf) or 3 x 299 x 299 (th)
248 | net = conv2d_bn(input, 32, 3, 3, subsample=(2,2), border_mode='valid')
249 | net = conv2d_bn(net, 32, 3, 3, border_mode='valid')
250 | net = conv2d_bn(net, 64, 3, 3)
251 |
252 | branch_0 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(net)
253 |
254 | branch_1 = conv2d_bn(net, 96, 3, 3, subsample=(2,2), border_mode='valid')
255 |
256 | net = concatenate([branch_0, branch_1], axis=channel_axis)
257 |
258 | branch_0 = conv2d_bn(net, 64, 1, 1)
259 | branch_0 = conv2d_bn(branch_0, 96, 3, 3, border_mode='valid')
260 |
261 | branch_1 = conv2d_bn(net, 64, 1, 1)
262 | branch_1 = conv2d_bn(branch_1, 64, 1, 7)
263 | branch_1 = conv2d_bn(branch_1, 64, 7, 1)
264 | branch_1 = conv2d_bn(branch_1, 96, 3, 3, border_mode='valid')
265 |
266 | net = concatenate([branch_0, branch_1], axis=channel_axis)
267 |
268 | branch_0 = conv2d_bn(net, 192, 3, 3, subsample=(2,2), border_mode='valid')
269 | branch_1 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(net)
270 |
271 | net = concatenate([branch_0, branch_1], axis=channel_axis)
272 |
273 | # 35 x 35 x 384
274 | # 4 x Inception-A blocks
275 |     for idx in range(4):
276 | net = block_inception_a(net)
277 |
278 | # 35 x 35 x 384
279 | # Reduction-A block
280 | net = block_reduction_a(net)
281 |
282 | # 17 x 17 x 1024
283 | # 7 x Inception-B blocks
284 |     for idx in range(7):
285 | net = block_inception_b(net)
286 |
287 | # 17 x 17 x 1024
288 | # Reduction-B block
289 | net = block_reduction_b(net)
290 |
291 | # 8 x 8 x 1536
292 | # 3 x Inception-C blocks
293 |     for idx in range(3):
294 | net = block_inception_c(net)
295 |
296 | return net
297 |
298 |
299 | def get_loss(pred, label, l2_weight=0.0001):
300 | diff = tf.square(tf.subtract(pred, label))
301 | train_vars = tf.trainable_variables()
302 | l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
303 | loss = tf.reduce_mean(diff + l2_loss)
304 |     tf.summary.scalar('l2_loss', l2_loss)
305 | tf.summary.scalar('loss', loss)
306 |
307 | return loss
308 |
309 |
310 | def summary_scalar(pred, label):
311 |     thresholds = [5, 4, 3, 2, 1, 0.5]
312 |     angles = [float(t) / 180 * scipy.pi for t in thresholds]
313 |     speeds = [float(t) / 20 for t in thresholds]
314 |
315 |     for i in range(len(thresholds)):
316 |         scalar_angle = "angle(" + str(angles[i]) + ")"
317 |         scalar_speed = "speed(" + str(speeds[i]) + ")"
318 |         ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < thresholds[i]
319 |         ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < thresholds[i]
320 | ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
321 | ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))
322 |
323 | tf.summary.scalar(scalar_angle, ac_angle)
324 | tf.summary.scalar(scalar_speed, ac_speed)
325 |
326 |
327 | def resize(imgs):
328 | batch_size = imgs.shape[0]
329 | imgs_new = []
330 | for j in range(batch_size):
331 | img = imgs[j,:,:,:]
332 | new = scipy.misc.imresize(img, (299, 299))
333 | imgs_new.append(new)
334 | imgs_new = np.stack(imgs_new, axis=0)
335 | return imgs_new
336 |
337 |
338 | if __name__ == '__main__':
339 | with tf.Graph().as_default():
340 |         imgs = tf.zeros((32, 299, 299, 3))
341 |         fmaps = tf.zeros((32, 299, 299, 3))
342 | outputs = get_model([imgs, fmaps], tf.constant(True))
343 | print(outputs)
344 |
345 |
--------------------------------------------------------------------------------
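The 35x35, 17x17 and 8x8 feature sizes quoted in the comments of inception_v4_base can be verified by hand. A short sketch of the arithmetic for a 299x299 input ('valid' layers shrink the map, 'same' layers preserve it):

    def out_size(size, kernel, stride=1):
        # output length of a VALID conv/pool along one spatial dimension
        return (size - kernel) // stride + 1

    s = 299
    s = out_size(s, 3, 2)   # stem conv 3x3, stride 2, valid -> 149
    s = out_size(s, 3, 1)   # conv 3x3, valid                -> 147
    # the following 3x3 'same' conv keeps 147
    s = out_size(s, 3, 2)   # pool/conv reduction, stride 2  -> 73
    s = out_size(s, 3, 1)   # 3x3 valid conv in mixed block  -> 71
    s = out_size(s, 3, 2)   # last stem reduction            -> 35
    assert s == 35                    # entering the Inception-A blocks
    assert out_size(35, 3, 2) == 17   # after Reduction-A
    assert out_size(17, 3, 2) == 8    # after Reduction-B
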
/models/inception_v4_pn.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 |
4 | import tensorflow as tf
5 | import scipy.misc
6 | import numpy as np
7 |
8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
9 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
10 |
11 | import tf_util
12 | import pointnet
13 | from keras.layers import Input, Dense, Convolution2D, MaxPooling2D, AveragePooling2D, ZeroPadding2D, Dropout, Flatten, add, concatenate, Reshape, Activation
14 | from keras.layers.normalization import BatchNormalization
15 | from keras.models import Model
16 |
17 | from keras import backend as K
18 | K.set_learning_phase(1) #set learning phase
19 |
20 |
21 | def placeholder_inputs(batch_size, img_rows=299, img_cols=299, points=16384, separately=False):
22 | imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
23 | pts_pl = tf.placeholder(tf.float32, shape=(batch_size, points, 3))
24 |     if separately:
25 |         speeds_pl = tf.placeholder(tf.float32, shape=(batch_size,))
26 |         angles_pl = tf.placeholder(tf.float32, shape=(batch_size,))
27 |         labels_pl = [speeds_pl, angles_pl]
28 |     else:
29 |         labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))
30 |     return imgs_pl, pts_pl, labels_pl
31 |
32 | def get_inception(img_rows=299, img_cols=299, dropout_keep_prob=0.2, separately=False):
33 | '''
34 | Inception V4 Model for Keras
35 |
36 | Model Schema is based on
37 | https://github.com/kentsommer/keras-inceptionV4
38 |
39 | ImageNet Pretrained Weights
40 | Theano: https://github.com/kentsommer/keras-inceptionV4/releases/download/2.0/inception-v4_weights_th_dim_ordering_th_kernels.h5
41 | TensorFlow: https://github.com/kentsommer/keras-inceptionV4/releases/download/2.0/inception-v4_weights_tf_dim_ordering_tf_kernels.h5
42 |
43 | Parameters:
44 | img_rows, img_cols - resolution of inputs
45 | channel - 1 for grayscale, 3 for color
46 | num_classes - number of class labels for our classification task
47 | '''
48 |
49 | # Input Shape is 299 x 299 x 3 (tf)
50 | img_input = Input(shape=(img_rows, img_cols, 3), name='data')
51 |
52 | # Make inception base
53 | net = inception_v4_base(img_input)
54 |
55 | # Final pooling and prediction
56 |
57 | # 8 x 8 x 1536
58 | net_old = AveragePooling2D((8,8), padding='valid')(net)
59 |
60 | # 1 x 1 x 1536
61 | net_old = Dropout(dropout_keep_prob)(net_old)
62 | net_old = Flatten()(net_old)
63 |
64 | # 1536
65 | predictions = Dense(units=1001, activation='softmax')(net_old)
66 |
67 | model = Model(img_input, predictions, name='inception_v4')
68 |
69 | weights_path = 'utils/weights/inception-v4_weights_tf.h5'
70 | assert (os.path.exists(weights_path))
71 | model.load_weights(weights_path, by_name=True)
72 |
73 | # Truncate and replace softmax layer for transfer learning
74 | # Cannot use model.layers.pop() since model is not of Sequential() type
75 | # The method below works since pre-trained weights are stored in layers but not in the model
76 |     net_ft = AveragePooling2D((8,8), padding='valid')(net)
77 | net_ft = Dropout(dropout_keep_prob)(net_ft)
78 | net_ft = Flatten()(net_ft)
79 | net = Dense(256, name='fc_mid')(net_ft)
80 |
81 | model = Model(img_input, net, name='inception_v4')
82 | return model
83 |
84 |
85 | def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False):
86 | """ Inception_V4 regression model, input is BxWxHx3, output Bx2"""
87 | batch_size = net[0].get_shape()[0].value
88 | img_net, pt_net = net[0], net[1]
89 |
90 | img_net = get_inception(299, 299)(img_net)
91 | with tf.variable_scope('pointnet'):
92 | pt_net = pointnet.get_model(pt_net, tf.constant(True))
93 | net = tf.reshape(tf.stack([img_net, pt_net], axis=2), [batch_size, -1])
94 |
95 | if not add_lstm:
96 | for i, dim in enumerate([256, 128, 16]):
97 | fc_scope = "fc" + str(i + 1)
98 | dp_scope = "dp" + str(i + 1)
99 | net = tf_util.fully_connected(net, dim, bn=True,
100 | is_training=is_training,
101 | scope=fc_scope,
102 | bn_decay=bn_decay)
103 | net = tf_util.dropout(net, keep_prob=0.7,
104 | is_training=is_training,
105 | scope=dp_scope)
106 |
107 | net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc4')
108 | else:
109 | fc_scope = "fc1"
110 | net = tf_util.fully_connected(net, 784, bn=True,
111 | is_training=is_training,
112 | scope=fc_scope,
113 | bn_decay=bn_decay)
114 | net = tf_util.dropout(net, keep_prob=0.7,
115 | is_training=is_training,
116 | scope="dp1")
117 | net = cnn_lstm_block(net)
118 | return net
119 |
120 |
121 | def cnn_lstm_block(input_tensor):
122 | lstm_in = tf.reshape(input_tensor, [-1, 28, 28])
123 | lstm_out = tf_util.stacked_lstm(lstm_in,
124 | num_outputs=10,
125 | time_steps=28,
126 | scope="cnn_lstm")
127 |
128 | W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1))
129 | b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1))
130 | return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2)
131 |
132 |
133 | def conv2d_bn(x, nb_filter, nb_row, nb_col,
134 | border_mode='same', subsample=(1, 1), bias=False):
135 | """
136 | Utility function to apply conv + BN.
137 | (Slightly modified from https://github.com/fchollet/keras/blob/master/keras/applications/inception_v3.py)
138 | """
139 | channel_axis = -1
140 | x = Convolution2D(nb_filter, (nb_row, nb_col),
141 | strides=subsample,
142 | padding=border_mode,
143 | use_bias=bias)(x)
144 | x = BatchNormalization(axis=channel_axis)(x)
145 | x = Activation('relu')(x)
146 | return x
147 |
148 | def block_inception_a(input):
149 | channel_axis = -1
150 |
151 | branch_0 = conv2d_bn(input, 96, 1, 1)
152 |
153 | branch_1 = conv2d_bn(input, 64, 1, 1)
154 | branch_1 = conv2d_bn(branch_1, 96, 3, 3)
155 |
156 | branch_2 = conv2d_bn(input, 64, 1, 1)
157 | branch_2 = conv2d_bn(branch_2, 96, 3, 3)
158 | branch_2 = conv2d_bn(branch_2, 96, 3, 3)
159 |
160 | branch_3 = AveragePooling2D((3,3), strides=(1,1), padding='same')(input)
161 | branch_3 = conv2d_bn(branch_3, 96, 1, 1)
162 |
163 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis)
164 | return x
165 |
166 |
167 | def block_reduction_a(input):
168 | channel_axis = -1
169 |
170 | branch_0 = conv2d_bn(input, 384, 3, 3, subsample=(2,2), border_mode='valid')
171 |
172 | branch_1 = conv2d_bn(input, 192, 1, 1)
173 | branch_1 = conv2d_bn(branch_1, 224, 3, 3)
174 | branch_1 = conv2d_bn(branch_1, 256, 3, 3, subsample=(2,2), border_mode='valid')
175 |
176 | branch_2 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(input)
177 |
178 | x = concatenate([branch_0, branch_1, branch_2], axis=channel_axis)
179 | return x
180 |
181 |
182 | def block_inception_b(input):
183 | channel_axis = -1
184 |
185 | branch_0 = conv2d_bn(input, 384, 1, 1)
186 |
187 | branch_1 = conv2d_bn(input, 192, 1, 1)
188 | branch_1 = conv2d_bn(branch_1, 224, 1, 7)
189 | branch_1 = conv2d_bn(branch_1, 256, 7, 1)
190 |
191 | branch_2 = conv2d_bn(input, 192, 1, 1)
192 | branch_2 = conv2d_bn(branch_2, 192, 7, 1)
193 | branch_2 = conv2d_bn(branch_2, 224, 1, 7)
194 | branch_2 = conv2d_bn(branch_2, 224, 7, 1)
195 | branch_2 = conv2d_bn(branch_2, 256, 1, 7)
196 |
197 | branch_3 = AveragePooling2D((3,3), strides=(1,1), padding='same')(input)
198 | branch_3 = conv2d_bn(branch_3, 128, 1, 1)
199 |
200 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis)
201 | return x
202 |
203 |
204 | def block_reduction_b(input):
205 | channel_axis = -1
206 |
207 | branch_0 = conv2d_bn(input, 192, 1, 1)
208 | branch_0 = conv2d_bn(branch_0, 192, 3, 3, subsample=(2, 2), border_mode='valid')
209 |
210 | branch_1 = conv2d_bn(input, 256, 1, 1)
211 | branch_1 = conv2d_bn(branch_1, 256, 1, 7)
212 | branch_1 = conv2d_bn(branch_1, 320, 7, 1)
213 | branch_1 = conv2d_bn(branch_1, 320, 3, 3, subsample=(2,2), border_mode='valid')
214 |
215 | branch_2 = MaxPooling2D((3, 3), strides=(2, 2), padding='valid')(input)
216 |
217 | x = concatenate([branch_0, branch_1, branch_2], axis=channel_axis)
218 | return x
219 |
220 |
221 | def block_inception_c(input):
222 | channel_axis = -1
223 |
224 | branch_0 = conv2d_bn(input, 256, 1, 1)
225 |
226 | branch_1 = conv2d_bn(input, 384, 1, 1)
227 | branch_10 = conv2d_bn(branch_1, 256, 1, 3)
228 | branch_11 = conv2d_bn(branch_1, 256, 3, 1)
229 | branch_1 = concatenate([branch_10, branch_11], axis=channel_axis)
230 |
231 |
232 | branch_2 = conv2d_bn(input, 384, 1, 1)
233 | branch_2 = conv2d_bn(branch_2, 448, 3, 1)
234 | branch_2 = conv2d_bn(branch_2, 512, 1, 3)
235 | branch_20 = conv2d_bn(branch_2, 256, 1, 3)
236 | branch_21 = conv2d_bn(branch_2, 256, 3, 1)
237 | branch_2 = concatenate([branch_20, branch_21], axis=channel_axis)
238 |
239 | branch_3 = AveragePooling2D((3, 3), strides=(1, 1), padding='same')(input)
240 | branch_3 = conv2d_bn(branch_3, 256, 1, 1)
241 |
242 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis)
243 | return x
244 |
245 |
246 | def inception_v4_base(input):
247 | channel_axis = -1
248 |
249 |     # Input Shape is 299 x 299 x 3 (tf) or 3 x 299 x 299 (th)
250 | net = conv2d_bn(input, 32, 3, 3, subsample=(2,2), border_mode='valid')
251 | net = conv2d_bn(net, 32, 3, 3, border_mode='valid')
252 | net = conv2d_bn(net, 64, 3, 3)
253 |
254 | branch_0 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(net)
255 |
256 | branch_1 = conv2d_bn(net, 96, 3, 3, subsample=(2,2), border_mode='valid')
257 |
258 | net = concatenate([branch_0, branch_1], axis=channel_axis)
259 |
260 | branch_0 = conv2d_bn(net, 64, 1, 1)
261 | branch_0 = conv2d_bn(branch_0, 96, 3, 3, border_mode='valid')
262 |
263 | branch_1 = conv2d_bn(net, 64, 1, 1)
264 | branch_1 = conv2d_bn(branch_1, 64, 1, 7)
265 | branch_1 = conv2d_bn(branch_1, 64, 7, 1)
266 | branch_1 = conv2d_bn(branch_1, 96, 3, 3, border_mode='valid')
267 |
268 | net = concatenate([branch_0, branch_1], axis=channel_axis)
269 |
270 | branch_0 = conv2d_bn(net, 192, 3, 3, subsample=(2,2), border_mode='valid')
271 | branch_1 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(net)
272 |
273 | net = concatenate([branch_0, branch_1], axis=channel_axis)
274 |
275 | # 35 x 35 x 384
276 | # 4 x Inception-A blocks
277 |     for idx in range(4):
278 | net = block_inception_a(net)
279 |
280 | # 35 x 35 x 384
281 | # Reduction-A block
282 | net = block_reduction_a(net)
283 |
284 | # 17 x 17 x 1024
285 | # 7 x Inception-B blocks
286 |     for idx in range(7):
287 | net = block_inception_b(net)
288 |
289 | # 17 x 17 x 1024
290 | # Reduction-B block
291 | net = block_reduction_b(net)
292 |
293 | # 8 x 8 x 1536
294 | # 3 x Inception-C blocks
295 |     for idx in range(3):
296 | net = block_inception_c(net)
297 |
298 | return net
299 |
300 |
301 | def get_loss(pred, label, l2_weight=0.0001):
302 | diff = tf.square(tf.subtract(pred, label))
303 | train_vars = tf.trainable_variables()
304 | l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
305 | loss = tf.reduce_mean(diff + l2_loss)
306 |     tf.summary.scalar('l2_loss', l2_loss)
307 | tf.summary.scalar('loss', loss)
308 |
309 | return loss
310 |
311 |
312 | def summary_scalar(pred, label):
313 |     thresholds = [5, 4, 3, 2, 1, 0.5]
314 |     angles = [float(t) / 180 * scipy.pi for t in thresholds]
315 |     speeds = [float(t) / 20 for t in thresholds]
316 |
317 |     for i in range(len(thresholds)):
318 |         scalar_angle = "angle(" + str(angles[i]) + ")"
319 |         scalar_speed = "speed(" + str(speeds[i]) + ")"
320 |         ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < thresholds[i]
321 |         ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < thresholds[i]
322 | ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
323 | ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))
324 |
325 | tf.summary.scalar(scalar_angle, ac_angle)
326 | tf.summary.scalar(scalar_speed, ac_speed)
327 |
328 |
329 | def resize(imgs):
330 | batch_size = imgs.shape[0]
331 | imgs_new = []
332 | for j in range(batch_size):
333 | img = imgs[j,:,:,:]
334 | new = scipy.misc.imresize(img, (299, 299))
335 | imgs_new.append(new)
336 | imgs_new = np.stack(imgs_new, axis=0)
337 | return imgs_new
338 |
339 |
340 | if __name__ == '__main__':
341 | with tf.Graph().as_default():
342 |         imgs = tf.zeros((32, 299, 299, 3))
343 | pts = tf.zeros((32, 16384, 3))
344 | outputs = get_model([imgs, pts], tf.constant(True))
345 | print(outputs)
346 |
--------------------------------------------------------------------------------
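Two conventions are baked into cnn_lstm_block, which is repeated in each model above: the 784-unit fully connected output is read as a 28-step sequence of 28 features, and the final 2*atan(.) keeps both regression outputs bounded. A small NumPy sketch of both facts:

    import numpy as np

    # 784 = 28 * 28: each row of the fc output becomes 28 timesteps
    fc_out = np.random.randn(32, 784)
    lstm_in = fc_out.reshape(-1, 28, 28)   # (batch, time_steps, features)
    assert lstm_in.shape == (32, 28, 28)

    # 2 * atan(x) lies in (-pi, pi) for any x, so the speed/angle outputs
    # cannot blow up no matter what the final linear layer produces
    x = np.linspace(-1e6, 1e6, 101)
    assert np.all(np.abs(2 * np.arctan(x)) < np.pi)
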
/models/nvidia_io.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 |
4 | import tensorflow as tf
5 | import scipy
6 |
7 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
8 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
9 |
10 | import tf_util
11 |
12 |
13 | def placeholder_inputs(batch_size, img_rows=66, img_cols=200, separately=False):
14 | imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
15 |     if separately:
16 |         speeds_pl = tf.placeholder(tf.float32, shape=(batch_size,))
17 |         angles_pl = tf.placeholder(tf.float32, shape=(batch_size,))
18 |         labels_pl = [speeds_pl, angles_pl]
19 |     else:
20 |         labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))
21 |     return imgs_pl, labels_pl
22 |
23 | def get_model(net, is_training, bn_decay=None, separately=False):
24 | """ NVIDIA regression model, input is BxWxHx3, output Bx2"""
25 | batch_size = net.get_shape()[0].value
26 |
27 | for i, dim in enumerate([24, 36, 48, 64, 64]):
28 | scope = "conv" + str(i + 1)
29 | net = tf_util.conv2d(net, dim, [5, 5],
30 | padding='VALID', stride=[1, 1],
31 | bn=True, is_training=is_training,
32 | scope=scope, bn_decay=bn_decay)
33 |
34 | net = tf.reshape(net, [batch_size, -1])
35 | for i, dim in enumerate([256, 100, 50, 10]):
36 | fc_scope = "fc" + str(i + 1)
37 | dp_scope = "dp" + str(i + 1)
38 | net = tf_util.fully_connected(net, dim, bn=True,
39 | is_training=is_training,
40 | scope=fc_scope,
41 | bn_decay=bn_decay)
42 | net = tf_util.dropout(net, keep_prob=0.7,
43 | is_training=is_training,
44 | scope=dp_scope)
45 |
46 | net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc5')
47 |
48 | return net
49 |
50 |
51 | def get_loss(pred, label, l2_weight=0.0001):
52 | diff = tf.square(tf.subtract(pred, label))
53 | train_vars = tf.trainable_variables()
54 | l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
55 | loss = tf.reduce_mean(diff + l2_loss)
56 |     tf.summary.scalar('l2_loss', l2_loss)
57 | tf.summary.scalar('loss', loss)
58 |
59 | return loss
60 |
61 |
62 | def summary_scalar(pred, label):
63 |     thresholds = [5, 4, 3, 2, 1, 0.5]
64 |     angles = [float(t) / 180 * scipy.pi for t in thresholds]
65 |     speeds = [float(t) / 20 for t in thresholds]
66 |
67 |     for i in range(len(thresholds)):
68 |         scalar_angle = "angle(" + str(angles[i]) + ")"
69 |         scalar_speed = "speed(" + str(speeds[i]) + ")"
70 |         ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < thresholds[i]
71 |         ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < thresholds[i]
72 | ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
73 | ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))
74 |
75 | tf.summary.scalar(scalar_angle, ac_angle)
76 | tf.summary.scalar(scalar_speed, ac_speed)
77 |
78 |
79 | if __name__ == '__main__':
80 | with tf.Graph().as_default():
81 | inputs = tf.zeros((32, 66, 200, 3))
82 | outputs = get_model(inputs, tf.constant(True))
83 | print(outputs)
84 |
--------------------------------------------------------------------------------
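The first fully connected layer in nvidia_io.get_model consumes the flattened output of five stride-1 VALID 5x5 convolutions applied to a 66x200 input; a quick sketch of the resulting size:

    h, w = 66, 200
    for _ in range(5):        # each 5x5 VALID stride-1 conv trims 4 pixels
        h, w = h - 4, w - 4
    assert (h, w) == (46, 180)
    flat = h * w * 64         # 64 channels after the last conv
    print(flat)               # 529920 inputs feeding the 256-unit 'fc1'
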
/models/nvidia_pm.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 |
4 | import tensorflow as tf
5 | import scipy
6 |
7 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
8 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
9 |
10 | import tf_util
11 |
12 |
13 | def placeholder_inputs(batch_size, img_rows=66, img_cols=200, separately=False):
14 | imgs_pl = tf.placeholder(tf.float32,
15 | shape=(batch_size, img_rows, img_cols, 3))
16 | fmaps_pl = tf.placeholder(tf.float32,
17 | shape=(batch_size, img_rows, img_cols, 3))
18 |     if separately:
19 |         speeds_pl = tf.placeholder(tf.float32, shape=(batch_size,))
20 |         angles_pl = tf.placeholder(tf.float32, shape=(batch_size,))
21 |         labels_pl = [speeds_pl, angles_pl]
22 |     else:
23 |         labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))
24 |     return imgs_pl, fmaps_pl, labels_pl
25 |
26 | def get_model(net, is_training, bn_decay=None, separately=False):
27 | """ NVIDIA regression model, input is BxWxHx3, output Bx2"""
28 | batch_size = net[0].get_shape()[0].value
29 | img_net, fmap_net = net
30 | for i, dim in enumerate([24, 36, 48, 64, 64]):
31 | scope_img = "image_conv" + str(i + 1)
32 | scope_fmap = "fmap_conv" + str(i + 1)
33 | img_net = tf_util.conv2d(img_net, dim, [5, 5],
34 | padding='VALID', stride=[1, 1],
35 | bn=True, is_training=is_training,
36 | scope=scope_img, bn_decay=bn_decay)
37 | fmap_net = tf_util.conv2d(fmap_net, dim, [5, 5],
38 | padding='VALID', stride=[1, 1],
39 | bn=True, is_training=is_training,
40 | scope=scope_fmap, bn_decay=bn_decay)
41 |     # stack along a per-sample axis so the reshape does not mix batch rows
42 |     net = tf.reshape(tf.stack([img_net, fmap_net], axis=1), [batch_size, -1])
42 | for i, dim in enumerate([256, 100, 50, 10]):
43 | fc_scope = "fc" + str(i + 1)
44 | dp_scope = "dp" + str(i + 1)
45 | net = tf_util.fully_connected(net, dim, bn=True,
46 | is_training=is_training,
47 | scope=fc_scope,
48 | bn_decay=bn_decay)
49 | net = tf_util.dropout(net, keep_prob=0.7,
50 | is_training=is_training,
51 | scope=dp_scope)
52 |
53 | net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc5')
54 |
55 | return net
56 |
57 |
58 | def get_loss(pred, label, l2_weight=0.0001):
59 | diff = tf.square(tf.subtract(pred, label))
60 | train_vars = tf.trainable_variables()
61 | l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
62 | loss = tf.reduce_mean(diff + l2_loss)
63 |     tf.summary.scalar('l2_loss', l2_loss)
64 | tf.summary.scalar('loss', loss)
65 |
66 | return loss
67 |
68 |
69 | def summary_scalar(pred, label):
70 |     thresholds = [5, 4, 3, 2, 1, 0.5]  # degrees for angle, m/s for speed
71 |     angles = [float(t) / 180 * scipy.pi for t in thresholds]  # radians
72 |     speeds = [float(t) / 20 for t in thresholds]  # normalized like the labels
73 | 
74 |     for i in range(len(thresholds)):
75 |         scalar_angle = "angle(" + str(angles[i]) + ")"
76 |         scalar_speed = "speed(" + str(speeds[i]) + ")"
77 |         ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]
78 |         ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]
79 | ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
80 | ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))
81 |
82 | tf.summary.scalar(scalar_angle, ac_angle)
83 | tf.summary.scalar(scalar_speed, ac_speed)
84 |
85 |
86 | if __name__ == '__main__':
87 | with tf.Graph().as_default():
88 | inputs = [tf.zeros((32, 66, 200, 3)), tf.zeros((32, 66, 200, 3))]
89 | outputs = get_model(inputs, tf.constant(True))
90 | print(outputs)
91 |
--------------------------------------------------------------------------------
/models/nvidia_pn.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 |
4 | import tensorflow as tf
5 | import scipy
6 | import numpy as np
7 |
8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
9 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
10 |
11 | import tf_util
12 | import pointnet
13 |
14 |
15 | def placeholder_inputs(batch_size, img_rows=66, img_cols=200, points=16384, separately=False):
16 | imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
17 | pts_pl = tf.placeholder(tf.float32, shape=(batch_size, points, 3))
18 | if separately:
19 | speeds_pl = tf.placeholder(tf.float32, shape=(batch_size))
20 | angles_pl = tf.placeholder(tf.float32, shape=(batch_size))
21 | labels_pl = [speeds_pl, angles_pl]
22 |     else: labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))
23 | return imgs_pl, pts_pl, labels_pl
24 |
25 |
26 | def get_model(net, is_training, bn_decay=None, separately=False):
27 |     """ NVIDIA regression model; inputs are [images (Bx66x200x3), points (Bx16384x3)]; output Bx2 """
28 | batch_size = net[0].get_shape()[0].value
29 | img_net, pt_net = net[0], net[1]
30 |
31 | for i, dim in enumerate([24, 36, 48, 64, 64]):
32 | scope = "conv" + str(i + 1)
33 | img_net = tf_util.conv2d(img_net, dim, [5, 5],
34 | padding='VALID', stride=[1, 1],
35 | bn=True, is_training=is_training,
36 | scope=scope, bn_decay=bn_decay)
37 |
38 | img_net = tf.reshape(img_net, [batch_size, -1])
39 | img_net = tf_util.fully_connected(img_net, 256, bn=True,
40 | is_training=is_training,
41 | scope='img_fc0',
42 | bn_decay=bn_decay)
43 | with tf.variable_scope('pointnet'):
44 | pt_net = pointnet.get_model(pt_net, tf.constant(True))
45 | net = tf.reshape(tf.stack([img_net, pt_net], axis=2), [batch_size, 512])
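46 |     # stacking along axis=2 interleaves the two 256-d feature vectors per sample (B x 256 x 2 -> B x 512)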
46 |
47 | for i, dim in enumerate([256, 128, 16]):
48 | fc_scope = "fc" + str(i + 1)
49 | dp_scope = "dp" + str(i + 1)
50 | net = tf_util.fully_connected(net, dim, bn=True,
51 | is_training=is_training,
52 | scope=fc_scope,
53 | bn_decay=bn_decay)
54 | net = tf_util.dropout(net, keep_prob=0.7,
55 | is_training=is_training,
56 | scope=dp_scope)
57 |
58 | net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc5')
59 |
60 | return net
61 |
62 |
63 | def get_loss(pred, label, l2_weight=0.0001):
64 | diff = tf.square(tf.subtract(pred, label))
65 | train_vars = tf.trainable_variables()
66 | l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
67 | loss = tf.reduce_mean(diff + l2_loss)
68 |     tf.summary.scalar('l2 loss', l2_loss)  # l2_loss already includes the l2_weight factor
69 | tf.summary.scalar('loss', loss)
70 |
71 | return loss
72 |
73 |
74 | def summary_scalar(pred, label):
75 |     thresholds = [5, 4, 3, 2, 1, 0.5]  # degrees for angle, m/s for speed
76 |     angles = [float(t) / 180 * scipy.pi for t in thresholds]  # radians
77 |     speeds = [float(t) / 20 for t in thresholds]  # normalized like the labels
78 | 
79 |     for i in range(len(thresholds)):
80 |         scalar_angle = "angle(" + str(angles[i]) + ")"
81 |         scalar_speed = "speed(" + str(speeds[i]) + ")"
82 |         ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]
83 |         ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]
84 | ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
85 | ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))
86 |
87 | tf.summary.scalar(scalar_angle, ac_angle)
88 | tf.summary.scalar(scalar_speed, ac_speed)
89 |
90 |
91 | if __name__ == '__main__':
92 | with tf.Graph().as_default():
93 | inputs = [tf.zeros((32, 66, 200, 3)), tf.zeros((32, 16384, 3))]
94 | outputs = get_model(inputs, tf.constant(True))
95 | print(outputs)
96 |
--------------------------------------------------------------------------------
/models/resnet152_io.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 |
4 | import tensorflow as tf
5 | import scipy
6 | import numpy as np
7 |
8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
9 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
10 |
11 | import tf_util
12 | from custom_layers import Scale
13 | from keras.layers import Input, Dense, Convolution2D, MaxPooling2D, AveragePooling2D, ZeroPadding2D, Dropout, Flatten, add, Reshape, Activation
14 | from keras.layers.normalization import BatchNormalization
15 | from keras.models import Model
16 |
17 |
18 | def placeholder_inputs(batch_size, img_rows=224, img_cols=224, separately=False):
19 | imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
20 | if separately:
21 | speeds_pl = tf.placeholder(tf.float32, shape=(batch_size))
22 | angles_pl = tf.placeholder(tf.float32, shape=(batch_size))
23 | labels_pl = [speeds_pl, angles_pl]
24 |     else: labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))
25 | return imgs_pl, labels_pl
26 |
27 |
28 | def get_resnet(img_rows=224, img_cols=224, separately=False):
29 | """
30 | Resnet 152 Model for Keras
31 |
32 | Model Schema and layer naming follow that of the original Caffe implementation
33 | https://github.com/KaimingHe/deep-residual-networks
34 |
35 | ImageNet Pretrained Weights
36 | Theano: https://drive.google.com/file/d/0Byy2AcGyEVxfZHhUT3lWVWxRN28/view?usp=sharing
37 | TensorFlow: https://drive.google.com/file/d/0Byy2AcGyEVxfeXExMzNNOHpEODg/view?usp=sharing
38 |
39 | Parameters:
40 | img_rows, img_cols - resolution of inputs
41 | channel - 1 for grayscale, 3 for color
42 | """
43 | eps = 1.1e-5
44 | # Handle Dimension Ordering for different backends
45 | img_input = Input(shape=(img_rows, img_cols, 3), name='data')
46 |
47 | x = ZeroPadding2D((3, 3), name='conv1_zeropadding')(img_input)
48 | x = Convolution2D(64, (7, 7), strides=(2, 2), name='conv1', use_bias=False)(x)
49 | x = BatchNormalization(epsilon=eps, axis=3, name='bn_conv1')(x)
50 | x = Scale(axis=3, name='scale_conv1')(x)
51 | x = Activation('relu', name='conv1_relu')(x)
52 | x = MaxPooling2D((3, 3), strides=(2, 2), name='pool1')(x)
53 |
54 | x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))
55 | x = identity_block(x, 3, [64, 64, 256], stage=2, block='b')
56 | x = identity_block(x, 3, [64, 64, 256], stage=2, block='c')
57 |
58 | x = conv_block(x, 3, [128, 128, 512], stage=3, block='a')
59 | for i in range(1,8):
60 | x = identity_block(x, 3, [128, 128, 512], stage=3, block='b'+str(i))
61 |
62 | x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a')
63 | for i in range(1,36):
64 | x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b'+str(i))
65 |
66 | x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a')
67 | x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b')
68 | x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c')
69 |
70 | x_fc = AveragePooling2D((7, 7), name='avg_pool')(x)
71 | x_fc = Flatten()(x_fc)
72 | x_fc = Dense(1000, activation='softmax', name='fc1000')(x_fc)
73 |
74 | model = Model(img_input, x_fc)
75 |
76 | weights_path = 'utils/weights/resnet152_weights_tf.h5'
77 | assert (os.path.exists(weights_path))
78 | model.load_weights(weights_path, by_name=True)
79 |
80 | # Truncate and replace softmax layer for transfer learning
81 | # Cannot use model.layers.pop() since model is not of Sequential() type
82 | # The method below works since pre-trained weights are stored in layers but not in the model
83 | x_newfc = AveragePooling2D((7, 7), name='avg_pool')(x)
84 | x_newfc = Flatten()(x_newfc)
85 | x_newfc = Dense(256, name='fc8')(x_newfc)
86 | model = Model(img_input, x_newfc)
87 |
88 | return model
89 |
90 |
91 | def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False):
92 | """ ResNet152 regression model, input is BxWxHx3, output Bx2"""
93 | net = get_resnet(224, 224)(net)
94 |
95 | if not add_lstm:
96 | net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc_final')
97 |
98 | else:
99 | net = tf_util.fully_connected(net, 784, bn=True,
100 | is_training=is_training,
101 | scope='fc_lstm',
102 | bn_decay=bn_decay)
103 | net = tf_util.dropout(net, keep_prob=0.7,
104 | is_training=is_training,
105 | scope="dp1")
106 | net = cnn_lstm_block(net)
107 |
108 | return net
109 |
110 |
111 | def cnn_lstm_block(input_tensor):
112 | lstm_in = tf.reshape(input_tensor, [-1, 28, 28])
113 | lstm_out = tf_util.stacked_lstm(lstm_in,
114 | num_outputs=10,
115 | time_steps=28,
116 | scope="cnn_lstm")
117 |
118 | W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1))
119 | b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1))
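120 |     # atan bounds the raw outputs to (-pi/2, pi/2); the factor 2 widens the range to (-pi, pi)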
120 | return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2)
121 |
122 |
123 | def identity_block(input_tensor, kernel_size, filters, stage, block):
124 | '''The identity_block is the block that has no conv layer at shortcut
125 | # Arguments
126 | input_tensor: input tensor
127 | kernel_size: defualt 3, the kernel size of middle conv layer at main path
128 | filters: list of integers, the nb_filters of 3 conv layer at main path
129 | stage: integer, current stage label, used for generating layer names
130 | block: 'a','b'..., current block label, used for generating layer names
131 | '''
132 | eps = 1.1e-5
133 | nb_filter1, nb_filter2, nb_filter3 = filters
134 | conv_name_base = 'res' + str(stage) + block + '_branch'
135 | bn_name_base = 'bn' + str(stage) + block + '_branch'
136 | scale_name_base = 'scale' + str(stage) + block + '_branch'
137 |
138 | x = Convolution2D(nb_filter1, (1, 1), name=conv_name_base + '2a', use_bias=False)(input_tensor)
139 | x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2a')(x)
140 | x = Scale(axis=3, name=scale_name_base + '2a')(x)
141 | x = Activation('relu', name=conv_name_base + '2a_relu')(x)
142 |
143 | x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x)
144 | x = Convolution2D(nb_filter2, (kernel_size, kernel_size),
145 | name=conv_name_base + '2b', use_bias=False)(x)
146 | x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2b')(x)
147 | x = Scale(axis=3, name=scale_name_base + '2b')(x)
148 | x = Activation('relu', name=conv_name_base + '2b_relu')(x)
149 |
150 | x = Convolution2D(nb_filter3, (1, 1), name=conv_name_base + '2c', use_bias=False)(x)
151 | x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2c')(x)
152 | x = Scale(axis=3, name=scale_name_base + '2c')(x)
153 |
154 | x = add([x, input_tensor], name='res' + str(stage) + block)
155 | x = Activation('relu', name='res' + str(stage) + block + '_relu')(x)
156 | return x
157 |
158 |
159 | def conv_block(input_tensor, kernel_size, filters, stage, block, strides=(2, 2)):
160 | '''conv_block is the block that has a conv layer at shortcut
161 | # Arguments
162 | input_tensor: input tensor
163 | kernel_size: defualt 3, the kernel size of middle conv layer at main path
164 | filters: list of integers, the nb_filters of 3 conv layer at main path
165 | stage: integer, current stage label, used for generating layer names
166 | block: 'a','b'..., current block label, used for generating layer names
167 | Note that from stage 3, the first conv layer at main path is with subsample=(2,2)
168 | And the shortcut should have subsample=(2,2) as well
169 | '''
170 | eps = 1.1e-5
171 | nb_filter1, nb_filter2, nb_filter3 = filters
172 | conv_name_base = 'res' + str(stage) + block + '_branch'
173 | bn_name_base = 'bn' + str(stage) + block + '_branch'
174 | scale_name_base = 'scale' + str(stage) + block + '_branch'
175 |
176 | x = Convolution2D(nb_filter1, (1, 1), strides=strides,
177 | name=conv_name_base + '2a', use_bias=False)(input_tensor)
178 | x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2a')(x)
179 | x = Scale(axis=3, name=scale_name_base + '2a')(x)
180 | x = Activation('relu', name=conv_name_base + '2a_relu')(x)
181 |
182 | x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x)
183 | x = Convolution2D(nb_filter2, (kernel_size, kernel_size),
184 | name=conv_name_base + '2b', use_bias=False)(x)
185 | x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2b')(x)
186 | x = Scale(axis=3, name=scale_name_base + '2b')(x)
187 | x = Activation('relu', name=conv_name_base + '2b_relu')(x)
188 |
189 | x = Convolution2D(nb_filter3, (1, 1), name=conv_name_base + '2c', use_bias=False)(x)
190 | x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2c')(x)
191 | x = Scale(axis=3, name=scale_name_base + '2c')(x)
192 |
193 | shortcut = Convolution2D(nb_filter3, (1, 1), strides=strides,
194 | name=conv_name_base + '1', use_bias=False)(input_tensor)
195 | shortcut = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '1')(shortcut)
196 | shortcut = Scale(axis=3, name=scale_name_base + '1')(shortcut)
197 |
198 | x = add([x, shortcut], name='res' + str(stage) + block)
199 | x = Activation('relu', name='res' + str(stage) + block + '_relu')(x)
200 | return x
201 |
202 |
203 | def get_loss(pred, label, l2_weight=0.0001):
204 | diff = tf.square(tf.subtract(pred, label))
205 | train_vars = tf.trainable_variables()
206 | l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
207 | loss = tf.reduce_mean(diff + l2_loss)
208 |     tf.summary.scalar('l2 loss', l2_loss)  # l2_loss already includes the l2_weight factor
209 | tf.summary.scalar('loss', loss)
210 |
211 | return loss
212 |
213 |
214 | def summary_scalar(pred, label):
215 |     thresholds = [5, 4, 3, 2, 1, 0.5]  # degrees for angle, m/s for speed
216 |     angles = [float(t) / 180 * scipy.pi for t in thresholds]  # radians
217 |     speeds = [float(t) / 20 for t in thresholds]  # normalized like the labels
218 | 
219 |     for i in range(len(thresholds)):
220 |         scalar_angle = "angle(" + str(angles[i]) + ")"
221 |         scalar_speed = "speed(" + str(speeds[i]) + ")"
222 |         ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]
223 |         ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]
224 | ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
225 | ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))
226 |
227 | tf.summary.scalar(scalar_angle, ac_angle)
228 | tf.summary.scalar(scalar_speed, ac_speed)
229 |
230 |
231 | def resize(imgs):
232 | batch_size = imgs.shape[0]
233 | imgs_new = []
234 | for j in range(batch_size):
235 | img = imgs[j,:,:,:]
236 | new = scipy.misc.imresize(img, (224, 224))
237 | imgs_new.append(new)
238 | imgs_new = np.stack(imgs_new, axis=0)
239 | return imgs_new
240 |
241 |
242 | if __name__ == '__main__':
243 | with tf.Graph().as_default():
244 | inputs = tf.zeros((32, 224, 224, 3))
245 | outputs = get_model(inputs, tf.constant(True))
246 | print(outputs)
247 |
248 |
--------------------------------------------------------------------------------
/models/resnet152_pm.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 |
4 | import tensorflow as tf
5 | import scipy
6 | import numpy as np
7 |
8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
9 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
10 |
11 | import tf_util
12 | from custom_layers import Scale
13 | from keras.layers import Input, Dense, Convolution2D, MaxPooling2D, AveragePooling2D, ZeroPadding2D, Dropout, Flatten, add, Reshape, Activation
14 | from keras.layers.normalization import BatchNormalization
15 | from keras.models import Model
16 |
17 | def placeholder_inputs(batch_size, img_rows=66, img_cols=200, separately=False):
18 | imgs_pl = tf.placeholder(tf.float32,
19 | shape=(batch_size, img_rows, img_cols, 3))
20 | fmaps_pl = tf.placeholder(tf.float32,
21 | shape=(batch_size, img_rows, img_cols, 3))
22 | if separately:
23 | speeds_pl = tf.placeholder(tf.float32, shape=(batch_size))
24 | angles_pl = tf.placeholder(tf.float32, shape=(batch_size))
25 | labels_pl = [speeds_pl, angles_pl]
26 |     else: labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))
27 | return imgs_pl, fmaps_pl, labels_pl
28 |
29 |
30 | def get_resnet(img_rows=224, img_cols=224, separately=False):
31 | """
32 | Resnet 152 Model for Keras
33 |
34 | Model Schema and layer naming follow that of the original Caffe implementation
35 | https://github.com/KaimingHe/deep-residual-networks
36 |
37 | ImageNet Pretrained Weights
38 | Theano: https://drive.google.com/file/d/0Byy2AcGyEVxfZHhUT3lWVWxRN28/view?usp=sharing
39 | TensorFlow: https://drive.google.com/file/d/0Byy2AcGyEVxfeXExMzNNOHpEODg/view?usp=sharing
40 |
41 | Parameters:
42 | img_rows, img_cols - resolution of inputs
43 | channel - 1 for grayscale, 3 for color
44 | """
45 |
46 | img_input = Input(shape=(img_rows, img_cols, 3), name='data')
47 |
48 | eps = 1.1e-5
49 | x = ZeroPadding2D((3, 3), name='conv1_zeropadding')(img_input)
50 | x = Convolution2D(64, (7, 7), strides=(2, 2), name='conv1', use_bias=False)(x)
51 | x = BatchNormalization(epsilon=eps, axis=3, name='bn_conv1')(x)
52 | x = Scale(axis=3, name='scale_conv1')(x)
53 | x = Activation('relu', name='conv1_relu')(x)
54 | x = MaxPooling2D((3, 3), strides=(2, 2), name='pool1')(x)
55 |
56 | x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))
57 | x = identity_block(x, 3, [64, 64, 256], stage=2, block='b')
58 | x = identity_block(x, 3, [64, 64, 256], stage=2, block='c')
59 |
60 | x = conv_block(x, 3, [128, 128, 512], stage=3, block='a')
61 | for i in range(1,8):
62 | x = identity_block(x, 3, [128, 128, 512], stage=3, block='b'+str(i))
63 |
64 | x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a')
65 | for i in range(1,36):
66 | x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b'+str(i))
67 |
68 | x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a')
69 | x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b')
70 | x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c')
71 |
72 | x_fc = AveragePooling2D((7, 7), name='avg_pool')(x)
73 | x_fc = Flatten()(x_fc)
74 | x_fc = Dense(1000, activation='softmax', name='fc1000')(x_fc)
75 |
76 | model = Model(img_input, x_fc)
77 |
78 | # Use pre-trained weights for Tensorflow backend
79 | weights_path = 'utils/weights/resnet152_weights_tf.h5'
80 | assert (os.path.exists(weights_path))
81 |
82 | model.load_weights(weights_path, by_name=True)
83 |
84 | # Truncate and replace softmax layer for transfer learning
85 | # Cannot use model.layers.pop() since model is not of Sequential() type
86 | # The method below works since pre-trained weights are stored in layers but not in the model
87 | x_newfc = AveragePooling2D((7, 7), name='avg_pool')(x)
88 | x_newfc = Flatten()(x_newfc)
89 | x_newfc = Dense(256, name='fc8')(x_newfc)
90 |
91 | model = Model(img_input, x_newfc)
92 | return model
93 |
94 |
95 | def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False):
96 |     """ ResNet152 regression model; inputs are [images, fmaps], each Bx224x224x3; output Bx2 """
97 | batch_size = net[0].get_shape()[0].value
98 | img_net, fmap_net = net[0], net[1]
99 |
100 | img_net = get_resnet(224, 224)(img_net)
101 | fmap_net = get_resnet(224, 224)(fmap_net)
102 |
103 |     net = tf.reshape(tf.stack([img_net, fmap_net], axis=1), [batch_size, -1])  # axis=1 keeps each sample's features together
104 |
105 | if not add_lstm:
106 | for i, dim in enumerate([256, 128, 16]):
107 | fc_scope = "fc" + str(i + 1)
108 | dp_scope = "dp" + str(i + 1)
109 | net = tf_util.fully_connected(net, dim, bn=True,
110 | is_training=is_training,
111 | scope=fc_scope,
112 | bn_decay=bn_decay)
113 | net = tf_util.dropout(net, keep_prob=0.7,
114 | is_training=is_training,
115 | scope=dp_scope)
116 | net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc4')
117 | else:
118 | fc_scope = "fc1"
119 | net = tf_util.fully_connected(net, 784, bn=True,
120 | is_training=is_training,
121 | scope=fc_scope,
122 | bn_decay=bn_decay)
123 | net = tf_util.dropout(net, keep_prob=0.7,
124 | is_training=is_training,
125 | scope="dp1")
126 | net = cnn_lstm_block(net)
127 | return net
128 |
129 |
130 | def cnn_lstm_block(input_tensor):
131 | lstm_in = tf.reshape(input_tensor, [-1, 28, 28])
132 | lstm_out = tf_util.stacked_lstm(lstm_in,
133 | num_outputs=10,
134 | time_steps=28,
135 | scope="cnn_lstm")
136 |
137 | W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1))
138 | b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1))
139 | return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2)
140 |
141 |
142 | def identity_block(input_tensor, kernel_size, filters, stage, block):
143 | '''The identity_block is the block that has no conv layer at shortcut
144 | # Arguments
145 | input_tensor: input tensor
146 | kernel_size: defualt 3, the kernel size of middle conv layer at main path
147 | filters: list of integers, the nb_filters of 3 conv layer at main path
148 | stage: integer, current stage label, used for generating layer names
149 | block: 'a','b'..., current block label, used for generating layer names
150 | '''
151 | eps = 1.1e-5
152 | nb_filter1, nb_filter2, nb_filter3 = filters
153 | conv_name_base = 'res' + str(stage) + block + '_branch'
154 | bn_name_base = 'bn' + str(stage) + block + '_branch'
155 | scale_name_base = 'scale' + str(stage) + block + '_branch'
156 |
157 | x = Convolution2D(nb_filter1, (1, 1), name=conv_name_base + '2a', use_bias=False)(input_tensor)
158 | x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2a')(x)
159 | x = Scale(axis=3, name=scale_name_base + '2a')(x)
160 | x = Activation('relu', name=conv_name_base + '2a_relu')(x)
161 |
162 | x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x)
163 | x = Convolution2D(nb_filter2, (kernel_size, kernel_size),
164 | name=conv_name_base + '2b', use_bias=False)(x)
165 | x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2b')(x)
166 | x = Scale(axis=3, name=scale_name_base + '2b')(x)
167 | x = Activation('relu', name=conv_name_base + '2b_relu')(x)
168 |
169 | x = Convolution2D(nb_filter3, (1, 1), name=conv_name_base + '2c', use_bias=False)(x)
170 | x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2c')(x)
171 | x = Scale(axis=3, name=scale_name_base + '2c')(x)
172 |
173 | x = add([x, input_tensor], name='res' + str(stage) + block)
174 | x = Activation('relu', name='res' + str(stage) + block + '_relu')(x)
175 | return x
176 |
177 |
178 | def conv_block(input_tensor, kernel_size, filters, stage, block, strides=(2, 2)):
179 | '''conv_block is the block that has a conv layer at shortcut
180 | # Arguments
181 | input_tensor: input tensor
182 | kernel_size: defualt 3, the kernel size of middle conv layer at main path
183 | filters: list of integers, the nb_filters of 3 conv layer at main path
184 | stage: integer, current stage label, used for generating layer names
185 | block: 'a','b'..., current block label, used for generating layer names
186 | Note that from stage 3, the first conv layer at main path is with subsample=(2,2)
187 | And the shortcut should have subsample=(2,2) as well
188 | '''
189 | eps = 1.1e-5
190 | nb_filter1, nb_filter2, nb_filter3 = filters
191 | conv_name_base = 'res' + str(stage) + block + '_branch'
192 | bn_name_base = 'bn' + str(stage) + block + '_branch'
193 | scale_name_base = 'scale' + str(stage) + block + '_branch'
194 |
195 | x = Convolution2D(nb_filter1, (1, 1), strides=strides,
196 | name=conv_name_base + '2a', use_bias=False)(input_tensor)
197 | x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2a')(x)
198 | x = Scale(axis=3, name=scale_name_base + '2a')(x)
199 | x = Activation('relu', name=conv_name_base + '2a_relu')(x)
200 |
201 | x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x)
202 | x = Convolution2D(nb_filter2, (kernel_size, kernel_size),
203 | name=conv_name_base + '2b', use_bias=False)(x)
204 | x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2b')(x)
205 | x = Scale(axis=3, name=scale_name_base + '2b')(x)
206 | x = Activation('relu', name=conv_name_base + '2b_relu')(x)
207 |
208 | x = Convolution2D(nb_filter3, (1, 1), name=conv_name_base + '2c', use_bias=False)(x)
209 | x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2c')(x)
210 | x = Scale(axis=3, name=scale_name_base + '2c')(x)
211 |
212 | shortcut = Convolution2D(nb_filter3, (1, 1), strides=strides,
213 | name=conv_name_base + '1', use_bias=False)(input_tensor)
214 | shortcut = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '1')(shortcut)
215 | shortcut = Scale(axis=3, name=scale_name_base + '1')(shortcut)
216 |
217 | x = add([x, shortcut], name='res' + str(stage) + block)
218 | x = Activation('relu', name='res' + str(stage) + block + '_relu')(x)
219 | return x
220 |
221 |
222 | def get_loss(pred, label, l2_weight=0.0001):
223 | diff = tf.square(tf.subtract(pred, label))
224 | train_vars = tf.trainable_variables()
225 | l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
226 | loss = tf.reduce_mean(diff + l2_loss)
227 |     tf.summary.scalar('l2 loss', l2_loss)  # l2_loss already includes the l2_weight factor
228 | tf.summary.scalar('loss', loss)
229 |
230 | return loss
231 |
232 |
233 | def summary_scalar(pred, label):
234 |     thresholds = [5, 4, 3, 2, 1, 0.5]  # degrees for angle, m/s for speed
235 |     angles = [float(t) / 180 * scipy.pi for t in thresholds]  # radians
236 |     speeds = [float(t) / 20 for t in thresholds]  # normalized like the labels
237 | 
238 |     for i in range(len(thresholds)):
239 |         scalar_angle = "angle(" + str(angles[i]) + ")"
240 |         scalar_speed = "speed(" + str(speeds[i]) + ")"
241 |         ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]
242 |         ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]
243 | ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
244 | ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))
245 |
246 | tf.summary.scalar(scalar_angle, ac_angle)
247 | tf.summary.scalar(scalar_speed, ac_speed)
248 |
249 |
250 | def resize(imgs):
251 | batch_size = imgs.shape[0]
252 | imgs_new = []
253 | for j in range(batch_size):
254 | img = imgs[j,:,:,:]
255 | new = scipy.misc.imresize(img, (224, 224))
256 | imgs_new.append(new)
257 | imgs_new = np.stack(imgs_new, axis=0)
258 | return imgs_new
259 |
260 |
261 | if __name__ == '__main__':
262 | with tf.Graph().as_default():
263 | imgs = tf.zeros((32, 224, 224, 3))
264 | fmaps = tf.zeros((32, 224, 224, 3))
265 | outputs = get_model([imgs, fmaps], tf.constant(True))
266 | print(outputs)
267 |
--------------------------------------------------------------------------------
/models/resnet152_pn.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 |
4 | import tensorflow as tf
5 | import scipy
6 | import numpy as np
7 |
8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
9 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
10 |
11 | import tf_util
12 | import pointnet
13 | from custom_layers import Scale
14 | from keras.layers import Input, Dense, Convolution2D, MaxPooling2D, AveragePooling2D, ZeroPadding2D, Dropout, Flatten, add, Reshape, Activation
15 | from keras.layers.normalization import BatchNormalization
16 | from keras.models import Model
17 |
18 | from keras import backend as K
19 | K.set_learning_phase(1) #set learning phase
20 |
21 |
22 | def placeholder_inputs(batch_size, img_rows=224, img_cols=224, points=16384, separately=False):
23 | imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
24 | pts_pl = tf.placeholder(tf.float32, shape=(batch_size, points, 3))
25 | if separately:
26 | speeds_pl = tf.placeholder(tf.float32, shape=(batch_size))
27 | angles_pl = tf.placeholder(tf.float32, shape=(batch_size))
28 | labels_pl = [speeds_pl, angles_pl]
29 |     else: labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))
30 | return imgs_pl, pts_pl, labels_pl
31 |
32 |
33 | def get_resnet(img_rows=224, img_cols=224, separately=False):
34 | """
35 | Resnet 152 Model for Keras
36 |
37 | Model Schema and layer naming follow that of the original Caffe implementation
38 | https://github.com/KaimingHe/deep-residual-networks
39 |
40 | ImageNet Pretrained Weights
41 | Theano: https://drive.google.com/file/d/0Byy2AcGyEVxfZHhUT3lWVWxRN28/view?usp=sharing
42 | TensorFlow: https://drive.google.com/file/d/0Byy2AcGyEVxfeXExMzNNOHpEODg/view?usp=sharing
43 |
44 | Parameters:
45 | img_rows, img_cols - resolution of inputs
46 | channel - 1 for grayscale, 3 for color
47 | """
48 |
49 | img_input = Input(shape=(img_rows, img_cols, 3), name='data')
50 |
51 | eps = 1.1e-5
52 | x = ZeroPadding2D((3, 3), name='conv1_zeropadding')(img_input)
53 | x = Convolution2D(64, (7, 7), strides=(2, 2), name='conv1', use_bias=False)(x)
54 | x = BatchNormalization(epsilon=eps, axis=3, name='bn_conv1')(x)
55 | x = Scale(axis=3, name='scale_conv1')(x)
56 | x = Activation('relu', name='conv1_relu')(x)
57 | x = MaxPooling2D((3, 3), strides=(2, 2), name='pool1')(x)
58 |
59 | x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))
60 | x = identity_block(x, 3, [64, 64, 256], stage=2, block='b')
61 | x = identity_block(x, 3, [64, 64, 256], stage=2, block='c')
62 |
63 | x = conv_block(x, 3, [128, 128, 512], stage=3, block='a')
64 | for i in range(1,8):
65 | x = identity_block(x, 3, [128, 128, 512], stage=3, block='b'+str(i))
66 |
67 | x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a')
68 | for i in range(1,36):
69 | x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b'+str(i))
70 |
71 | x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a')
72 | x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b')
73 | x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c')
74 |
75 | x_fc = AveragePooling2D((7, 7), name='avg_pool')(x)
76 | x_fc = Flatten()(x_fc)
77 | x_fc = Dense(1000, activation='softmax', name='fc1000')(x_fc)
78 |
79 | model = Model(img_input, x_fc)
80 |
81 | # Use pre-trained weights for Tensorflow backend
82 | weights_path = 'utils/weights/resnet152_weights_tf.h5'
83 | assert (os.path.exists(weights_path))
84 |
85 | model.load_weights(weights_path, by_name=True)
86 |
87 | # Truncate and replace softmax layer for transfer learning
88 | # Cannot use model.layers.pop() since model is not of Sequential() type
89 | # The method below works since pre-trained weights are stored in layers but not in the model
90 | x_newfc = AveragePooling2D((7, 7), name='avg_pool')(x)
91 | x_newfc = Flatten()(x_newfc)
92 | x_newfc = Dense(256, name='fc8')(x_newfc)
93 |
94 | model = Model(img_input, x_newfc)
95 | return model
96 |
97 |
98 | def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False):
99 |     """ ResNet152 regression model; inputs are [images (Bx224x224x3), points (Bx16384x3)]; output Bx2 """
100 | batch_size = net[0].get_shape()[0].value
101 | img_net, pt_net = net[0], net[1]
102 |
103 | img_net = get_resnet(224, 224)(img_net)
104 | with tf.variable_scope('pointnet'):
105 | pt_net = pointnet.get_model(pt_net, tf.constant(True))
106 | net = tf.reshape(tf.stack([img_net, pt_net], axis=2), [batch_size, -1])
107 |
108 | if not add_lstm:
109 | for i, dim in enumerate([256, 128, 16]):
110 | fc_scope = "fc" + str(i + 1)
111 | dp_scope = "dp" + str(i + 1)
112 | net = tf_util.fully_connected(net, dim, bn=True,
113 | is_training=is_training,
114 | scope=fc_scope,
115 | bn_decay=bn_decay)
116 | net = tf_util.dropout(net, keep_prob=0.7,
117 | is_training=is_training,
118 | scope=dp_scope)
119 |
120 | net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc4')
121 | else:
122 | fc_scope = "fc1"
123 | net = tf_util.fully_connected(net, 784, bn=True,
124 | is_training=is_training,
125 | scope=fc_scope,
126 | bn_decay=bn_decay)
127 | net = tf_util.dropout(net, keep_prob=0.7,
128 | is_training=is_training,
129 | scope="dp1")
130 | net = cnn_lstm_block(net)
131 | return net
132 |
133 |
134 | def cnn_lstm_block(input_tensor):
135 | lstm_in = tf.reshape(input_tensor, [-1, 28, 28])
136 | lstm_out = tf_util.stacked_lstm(lstm_in,
137 | num_outputs=10,
138 | time_steps=28,
139 | scope="cnn_lstm")
140 |
141 | W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1))
142 | b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1))
143 | return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2)
144 |
145 |
146 | def identity_block(input_tensor, kernel_size, filters, stage, block):
147 | '''The identity_block is the block that has no conv layer at shortcut
148 | # Arguments
149 | input_tensor: input tensor
150 | kernel_size: defualt 3, the kernel size of middle conv layer at main path
151 | filters: list of integers, the nb_filters of 3 conv layer at main path
152 | stage: integer, current stage label, used for generating layer names
153 | block: 'a','b'..., current block label, used for generating layer names
154 | '''
155 | eps = 1.1e-5
156 | nb_filter1, nb_filter2, nb_filter3 = filters
157 | conv_name_base = 'res' + str(stage) + block + '_branch'
158 | bn_name_base = 'bn' + str(stage) + block + '_branch'
159 | scale_name_base = 'scale' + str(stage) + block + '_branch'
160 |
161 | x = Convolution2D(nb_filter1, (1, 1), name=conv_name_base + '2a', use_bias=False)(input_tensor)
162 | x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2a')(x)
163 | x = Scale(axis=3, name=scale_name_base + '2a')(x)
164 | x = Activation('relu', name=conv_name_base + '2a_relu')(x)
165 |
166 | x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x)
167 | x = Convolution2D(nb_filter2, (kernel_size, kernel_size),
168 | name=conv_name_base + '2b', use_bias=False)(x)
169 | x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2b')(x)
170 | x = Scale(axis=3, name=scale_name_base + '2b')(x)
171 | x = Activation('relu', name=conv_name_base + '2b_relu')(x)
172 |
173 | x = Convolution2D(nb_filter3, (1, 1), name=conv_name_base + '2c', use_bias=False)(x)
174 | x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2c')(x)
175 | x = Scale(axis=3, name=scale_name_base + '2c')(x)
176 |
177 | x = add([x, input_tensor], name='res' + str(stage) + block)
178 | x = Activation('relu', name='res' + str(stage) + block + '_relu')(x)
179 | return x
180 |
181 |
182 | def conv_block(input_tensor, kernel_size, filters, stage, block, strides=(2, 2)):
183 | '''conv_block is the block that has a conv layer at shortcut
184 | # Arguments
185 | input_tensor: input tensor
186 | kernel_size: defualt 3, the kernel size of middle conv layer at main path
187 | filters: list of integers, the nb_filters of 3 conv layer at main path
188 | stage: integer, current stage label, used for generating layer names
189 | block: 'a','b'..., current block label, used for generating layer names
190 | Note that from stage 3, the first conv layer at main path is with subsample=(2,2)
191 | And the shortcut should have subsample=(2,2) as well
192 | '''
193 | eps = 1.1e-5
194 | nb_filter1, nb_filter2, nb_filter3 = filters
195 | conv_name_base = 'res' + str(stage) + block + '_branch'
196 | bn_name_base = 'bn' + str(stage) + block + '_branch'
197 | scale_name_base = 'scale' + str(stage) + block + '_branch'
198 |
199 | x = Convolution2D(nb_filter1, (1, 1), strides=strides,
200 | name=conv_name_base + '2a', use_bias=False)(input_tensor)
201 | x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2a')(x)
202 | x = Scale(axis=3, name=scale_name_base + '2a')(x)
203 | x = Activation('relu', name=conv_name_base + '2a_relu')(x)
204 |
205 | x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x)
206 | x = Convolution2D(nb_filter2, (kernel_size, kernel_size),
207 | name=conv_name_base + '2b', use_bias=False)(x)
208 | x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2b')(x)
209 | x = Scale(axis=3, name=scale_name_base + '2b')(x)
210 | x = Activation('relu', name=conv_name_base + '2b_relu')(x)
211 |
212 | x = Convolution2D(nb_filter3, (1, 1), name=conv_name_base + '2c', use_bias=False)(x)
213 | x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2c')(x)
214 | x = Scale(axis=3, name=scale_name_base + '2c')(x)
215 |
216 | shortcut = Convolution2D(nb_filter3, (1, 1), strides=strides,
217 | name=conv_name_base + '1', use_bias=False)(input_tensor)
218 | shortcut = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '1')(shortcut)
219 | shortcut = Scale(axis=3, name=scale_name_base + '1')(shortcut)
220 |
221 | x = add([x, shortcut], name='res' + str(stage) + block)
222 | x = Activation('relu', name='res' + str(stage) + block + '_relu')(x)
223 | return x
224 |
225 |
226 | def get_loss(pred, label, l2_weight=0.0001):
227 | diff = tf.square(tf.subtract(pred, label))
228 | train_vars = tf.trainable_variables()
229 | l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
230 | loss = tf.reduce_mean(diff + l2_loss)
231 |     tf.summary.scalar('l2 loss', l2_loss)  # l2_loss already includes the l2_weight factor
232 | tf.summary.scalar('loss', loss)
233 |
234 | return loss
235 |
236 |
237 | def summary_scalar(pred, label):
238 |     thresholds = [5, 4, 3, 2, 1, 0.5]  # degrees for angle, m/s for speed
239 |     angles = [float(t) / 180 * scipy.pi for t in thresholds]  # radians
240 |     speeds = [float(t) / 20 for t in thresholds]  # normalized like the labels
241 | 
242 |     for i in range(len(thresholds)):
243 |         scalar_angle = "angle(" + str(angles[i]) + ")"
244 |         scalar_speed = "speed(" + str(speeds[i]) + ")"
245 |         ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]
246 |         ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]
247 | ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
248 | ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))
249 |
250 | tf.summary.scalar(scalar_angle, ac_angle)
251 | tf.summary.scalar(scalar_speed, ac_speed)
252 |
253 |
254 | def resize(imgs):
255 | batch_size = imgs.shape[0]
256 | imgs_new = []
257 | for j in range(batch_size):
258 | img = imgs[j,:,:,:]
259 | new = scipy.misc.imresize(img, (224, 224))
260 | imgs_new.append(new)
261 | imgs_new = np.stack(imgs_new, axis=0)
262 | return imgs_new
263 |
264 |
265 | if __name__ == '__main__':
266 | with tf.Graph().as_default():
267 | imgs = tf.zeros((32, 224, 224, 3))
268 | pts = tf.zeros((32, 16384, 3))
269 | outputs = get_model([imgs, pts], tf.constant(True))
270 | print(outputs)
--------------------------------------------------------------------------------
/predict.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import importlib
3 | import os
4 | import sys
5 | import time
6 |
7 | import numpy as np
8 | import scipy
9 |
10 | import provider
11 | import tensorflow as tf
12 |
13 | import matplotlib.pyplot as plt
14 |
15 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
16 | sys.path.append(os.path.join(BASE_DIR, 'models'))
17 |
18 | parser = argparse.ArgumentParser()
19 | parser.add_argument('--gpu', type=int, default=0,
20 | help='GPU to use [default: GPU 0]')
21 | parser.add_argument('--model', default='nvidia_pn',
22 | help='Model name [default: nvidia_pn]')
23 | parser.add_argument('--model_path', default='logs/nvidia_pn/model.ckpt',
24 | help='Model checkpoint file path [default: logs/nvidia_pn/model.ckpt]')
25 | parser.add_argument('--max_epoch', type=int, default=250,
26 | help='Epoch to run [default: 250]')
27 | parser.add_argument('--batch_size', type=int, default=8,
28 | help='Batch Size during training [default: 8]')
29 | parser.add_argument('--result_dir', default='results',
30 |                     help='Result folder path [default: results]')
31 |
32 | FLAGS = parser.parse_args()
33 | BATCH_SIZE = FLAGS.batch_size
34 | GPU_INDEX = FLAGS.gpu
35 | MODEL_PATH = FLAGS.model_path
36 |
37 | assert (FLAGS.model == "nvidia_pn")
38 | MODEL = importlib.import_module(FLAGS.model) # import network module
39 | MODEL_FILE = os.path.join(BASE_DIR, 'models', FLAGS.model+'.py')
40 |
41 | RESULT_DIR = os.path.join(FLAGS.result_dir, FLAGS.model)
42 | if not os.path.exists(RESULT_DIR):
43 | os.makedirs(RESULT_DIR)
44 | LOG_FOUT = open(os.path.join(RESULT_DIR, 'log_predict.txt'), 'w')
45 | LOG_FOUT.write(str(FLAGS)+'\n')
46 |
47 |
48 | def log_string(out_str):
49 | LOG_FOUT.write(out_str+'\n')
50 | LOG_FOUT.flush()
51 | print(out_str)
52 |
53 | def predict():
54 | with tf.device('/gpu:'+str(GPU_INDEX)):
55 | if 'pn' in MODEL_FILE:
56 | data_input = provider.Provider()
57 | imgs_pl, pts_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE)
58 | imgs_pl = [imgs_pl, pts_pl]
59 | else:
60 | raise NotImplementedError
61 |
62 | is_training_pl = tf.placeholder(tf.bool, shape=())
63 | print(is_training_pl)
64 |
65 | # Get model and loss
66 | pred = MODEL.get_model(imgs_pl, is_training_pl)
67 |
68 | loss = MODEL.get_loss(pred, labels_pl)
69 |
70 | # Add ops to save and restore all the variables.
71 | saver = tf.train.Saver()
72 |
73 | # Create a session
74 | config = tf.ConfigProto()
75 | config.gpu_options.allow_growth = True
76 | config.allow_soft_placement = True
77 | config.log_device_placement = True
78 | sess = tf.Session(config=config)
79 |
80 | # Restore variables from disk.
81 | saver.restore(sess, MODEL_PATH)
82 | log_string("Model restored.")
83 |
84 | ops = {'imgs_pl': imgs_pl,
85 | 'labels_pl': labels_pl,
86 | 'is_training_pl': is_training_pl,
87 | 'pred': pred,
88 | 'loss': loss}
89 |
90 | pred_one_epoch(sess, ops, data_input)
91 |
92 | def pred_one_epoch(sess, ops, data_input):
93 | """ ops: dict mapping from string to tf ops """
94 | is_training = False
95 | preds = []
96 | num_batches = data_input.num_test // BATCH_SIZE
97 |
98 | for batch_idx in range(num_batches):
99 | if "io" in MODEL_FILE:
100 | imgs = data_input.load_one_batch(BATCH_SIZE, "test")
101 | feed_dict = {ops['imgs_pl']: imgs,
102 | ops['is_training_pl']: is_training}
103 | else:
104 | imgs, others = data_input.load_one_batch(BATCH_SIZE, "test")
105 | feed_dict = {ops['imgs_pl'][0]: imgs,
106 | ops['imgs_pl'][1]: others,
107 | ops['is_training_pl']: is_training}
108 |
109 | pred_val = sess.run(ops['pred'], feed_dict=feed_dict)
110 | preds.append(pred_val)
111 |
112 | preds = np.vstack(preds)
113 | print (preds.shape)
114 | # preds[:, 1] = preds[:, 1] * 180.0 / scipy.pi
115 | # preds[:, 0] = preds[:, 0] * 20 + 20
116 |
117 | np.savetxt(os.path.join(RESULT_DIR, "behavior_pred.txt"), preds)
118 |
119 | output_dir = os.path.join(RESULT_DIR, "results")
120 | if not os.path.exists(output_dir):
121 | os.makedirs(output_dir)
122 | i_list = get_dicts(description="test")
123 | counter = 0
124 | for i, num in enumerate(i_list):
125 | np.savetxt(os.path.join(output_dir, str(i) + ".txt"), preds[counter:counter+num,:])
126 | counter += num
127 | # plot_acc(preds, labels)
128 |
129 | def get_dicts(description="val"):
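130 |     # hard-coded prediction counts for each output file (batch_size 8); used to split the flat prediction array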
130 | if description == "train":
131 | raise NotImplementedError
132 | elif description == "val": # batch_size == 8
133 | return [120] * 4 + [111] + [120] * 4 + [109] + [120] * 9 + [89 - 87 % 8]
134 | elif description == "test": # batch_size == 8
135 | return [120] * 9 + [116] + [120] * 4 + [106] + [120] * 4 + [114 - 114 % 8]
136 | else:
137 | raise NotImplementedError
138 |
139 | if __name__ == "__main__":
140 | predict()
141 | # plot_acc_from_txt()
142 |
--------------------------------------------------------------------------------
/tools/README.md:
--------------------------------------------------------------------------------
1 | ## Tools to process point clouds and images
2 |
3 | A set of tools (Python scripts) is provided to make data processing easier and more convenient. Note that they are not polished, general-purpose tools, so you may need to modify some lines before using them on your own data.
4 |
5 | If you have more efficient tools, code or other suggestions for processing DBNet data, especially point clouds, don't hesitate to contact [@wangjksjtu(wangjksjtu@gmail.com)](https://github.com/wangjksjtu) or __submit pull-requests directly__.
6 | Your contributions are highly encouraged and appreciated!
7 |
8 | - __img_pre.py__: cropping and resizing images using python-opencv
9 | - __las2fmap.py__: extracting feature maps from point clouds
10 | - __pcd2las.py__: downsampling point clouds; converting point clouds from '.pcd' to '.las' format.
11 | - __video2img.py__: converting a video into consecutive frames
12 |
13 | To see help for these scripts:
14 |
15 |     python img_pre.py -h
16 |
17 | ### Requirements
18 | - python-opencv
19 | - numpy, pickle, scipy, __laspy__
20 | - __CloudCompare (CC)__ (set __PATH variables__)
21 |
22 | ### CC Examples
23 | Convert point clouds to `.las` format:
24 |
25 | CloudCompare.exe -SILENT -NO_TIMESTAMP -C_EXPORT_FMT LAS -O %s
26 |
27 | Downsample point clouds to 16384 points and save in `.las` format:
28 |
29 | CloudCompare.exe -SILENT -NO_TIMESTAMP -C_EXPORT_FMT LAS -O %s -SS RANDOM 16384
30 |
31 | More command-line usage examples for CloudCompare are available on the [official manual page](http://www.cloudcompare.org/doc/wiki/index.php?title=Command_line_mode).
32 |
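33 | The `%s` placeholder in the commands above stands for a file path. As a minimal sketch (mirroring what `pcd2las.py` does; the `clouds/` directory is a placeholder), the downsampling command can be driven from Python:
34 | 
35 |     import glob
36 |     import os
37 | 
38 |     # downsample every .pcd file in clouds/ to 16384 points, exported as .las
39 |     for f in sorted(glob.glob("clouds/*.pcd")):
40 |         os.system("CloudCompare.exe -SILENT -NO_TIMESTAMP "
41 |                   "-C_EXPORT_FMT LAS -O %s -SS RANDOM 16384" % f)
42 | 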
33 | ### las2fmap Examples
34 | Download the example point cloud from [Google Drive](https://drive.google.com/file/d/1lxl7M2MTA7afg5UItA5hCvh-Wt5bTSNJ/view?usp=sharing).
35 |
36 | python las2fmap.py -f example.las
37 |
38 | To see HELP for the `las2fmap.py` script:
39 |
40 | python las2fmap.py -h
41 | # usage: las2fmap.py [-h] [-d DIR] [-f FILE]
42 | #
43 | # optional arguments:
44 | # -h, --help show this help message and exit
45 | # -d DIR, --dir DIR Directory of las files [default: '']
46 | #   -f FILE, --file FILE  Specify one las file you want to convert [default: '']
47 |
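48 | The conversion can also be driven from Python; a minimal sketch, assuming `example.las` sits in the working directory:
49 | 
50 |     from las2fmap import get_fmap
51 | 
52 |     # writes gray/example.jpg and jet/example_jet.jpg on success
53 |     if get_fmap("example.las", dir1="gray", dir2="jet"):
54 |         print("Finished!")
55 | 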
--------------------------------------------------------------------------------
/tools/img_pre.py:
--------------------------------------------------------------------------------
1 | """
2 | Simple Python script for cropping and resizing images
3 | Author: Jingkang Wang
4 | Date: November 2017
5 | Dependency: python-opencv
6 | """
7 |
8 | import argparse
9 | import glob
10 | import os
11 |
12 | import cv2
13 |
14 |
15 | def crop(input_dir="DVR_1920x1080",
16 | output_dir="DVR_1080x600"):
17 | """
18 | Crop images in folders
19 | :param input_dir: path of input directory
20 | :param output_dir: path of output directory
21 | """
22 | assert os.path.exists(input_dir)
23 | if not os.path.exists(output_dir):
24 | os.makedirs(output_dir)
25 | subfolders = glob.glob(os.path.join(input_dir, "*"))
26 |
27 | for folder in subfolders:
28 | new_subfolder = os.path.join(output_dir, folder[folder.rfind("/")+1:])
29 | # print new_subfolder
30 | if not os.path.exists(new_subfolder):
31 | os.mkdir(new_subfolder)
32 | files = glob.glob(os.path.join(folder, "*.jpg"))
33 | # print files
34 | for filename in files:
35 | out_filename = os.path.join(output_dir, filename[filename.find("/")+1:])
36 |             print("%s -> %s" % (filename, out_filename))
37 | crop_img(filename, out_filename)
38 |
39 |
40 | def crop_img(input_img, output_img,
41 | left=500, right=1580, down=200, up=800):
42 | """
43 | Crop single image
44 | :param input_img: path of input image
45 | :param output_img: path of cropped image
46 | :param left, right, down, up: cropped positions
47 | """
48 | img = cv2.imread(input_img)
49 |     cropped = img[down:up, left:right]
50 |     cv2.imwrite(output_img, cropped)
51 |
52 |
53 | def resize(input_dir="DVR_1080x600",
54 | output_dir="DVR_200x66"):
55 | """
56 | Resize images in folders
57 | :param input_dir: path of input directory
58 | :param output_dir: path of output directory
59 | """
60 | width = int(output_dir.split("_")[-1].split("x")[0])
61 | height = int(output_dir.split("_")[-1].split("x")[-1])
62 |     print("width: %d, height: %d" % (width, height))
63 | assert os.path.exists(input_dir)
64 | if not os.path.exists(output_dir):
65 | os.makedirs(output_dir)
66 | subfolders = glob.glob(os.path.join(input_dir, "*"))
67 | for folder in subfolders:
68 | new_subfolder = os.path.join(output_dir, folder[folder.rfind("/")+1:])
69 | # print new_subfolder
70 | if not os.path.exists(new_subfolder):
71 | os.mkdir(new_subfolder)
72 | files = glob.glob(os.path.join(folder, "*.jpg"))
73 | # print files
74 | for filename in files:
75 | out_filename = os.path.join(output_dir, filename[filename.find("/")+1:])
76 |             print("%s -> %s" % (filename, out_filename))
77 | resize_img(filename, out_filename, width, height)
78 |
79 |
80 | def resize_img(input_img, output_img, newx, newy):
81 | """
82 | Resize single image
83 | :param input_img: path of input image
84 | :param output_img: path of cropped image
85 |     :param newx, newy: target width and height of the resized image
86 | """
87 | img = cv2.imread(input_img)
88 | newimage = cv2.resize(img, (newx, newy))
89 | cv2.imwrite(output_img, newimage)
90 |
91 |
92 | if __name__ == "__main__":
93 | parser = argparse.ArgumentParser()
94 | parser.add_argument('--input_dir', type=str, default="DVR_1920x1080",
95 | help='Path of input directory [default: DVR_1920x1080]')
96 | parser.add_argument('--output_dir', type=str, default="DVR_1080x600",
97 |                         help='Path of output directory [default: DVR_1080x600]')
98 | parser.add_argument('--oper', type=str, default="crop",
99 | help='Operation to conduct (crop/resize) [default: crop]')
100 | FLAGS = parser.parse_args()
101 |
102 | INPUT_DIR = FLAGS.input_dir
103 | OUTPUT_DIR = FLAGS.output_dir
104 | OPER = FLAGS.oper
105 |
106 | assert (os.path.exists(INPUT_DIR))
107 | if (OPER == "crop"):
108 | crop(INPUT_DIR, OUTPUT_DIR)
109 | elif (OPER == "resize"):
110 | resize(INPUT_DIR, OUTPUT_DIR)
111 | else:
112 | raise NotImplementedError
--------------------------------------------------------------------------------
/tools/las2fmap.py:
--------------------------------------------------------------------------------
1 | """
2 | Simple python scripts for extracting feature maps from point clouds
3 | Author: Jingkang Wang
4 | Date: November 2017
5 | Dependency: python-opencv, numpy, pickle, scipy, laspy
6 | """
7 |
8 | import argparse
9 | import glob
10 | import math
11 | import os
12 | import pickle
13 |
14 | import numpy as np
15 | from scipy.misc import imsave, imshow
16 |
17 | import cv2
18 | from laspy.base import Writer
19 | from laspy.file import File
20 |
21 |
22 | def lasReader(filename):
23 | """
24 | Read xyz points from single las file
25 | :param filename: path of single point cloud
26 | """
27 | f = File(filename, mode='r')
28 | x_max, x_min = np.max(f.x), np.min(f.x)
29 | y_max, y_min = np.max(f.y), np.min(f.y)
30 | z_max, z_min = np.max(f.z), np.min(f.z)
31 | return np.transpose(np.asarray([f.x, f.y, f.z])), \
32 | [(x_min, x_max), (y_min, y_max), (z_min, z_max)], f.header
33 |
34 |
35 | def transform(merge, ranges, order=[0,1,2]):
36 |     """
37 |     Swap the xyz axes of points and their ranges
38 |     :param merge: (3, N) array of points; ranges: per-axis (min, max) pairs
39 |     :param order: new axis order [default: [0, 1, 2]]
40 |     """
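41 |     # merge has shape (3, N); argsort(order) gives the row permutation realizing the new axis order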
41 | i = np.argsort(order)
42 | merge = merge[i,:]
43 | ranges = np.asarray(ranges)[i,:]
44 | return merge, ranges
45 |
46 |
47 | def standardize(points, ranges=None):
48 | """
49 |     Shift points so that the minimum corner lies at the origin
50 |     :param points: (N, 3) array of xyz points
51 |     :param ranges: precomputed per-axis (min, max) ranges [default: None]
52 | """
53 |     if ranges is not None:
54 | points -= np.array([ranges[0][0], ranges[1][0], ranges[2][0]])
55 | else:
56 | x_min = np.min(points[:,0])
57 | y_min = np.min(points[:,1])
58 | z_min = np.min(points[:,2])
59 | points -= np.array(np.array([x_min, y_min, z_min]))
60 | return np.transpose(points), [(0, np.max(points[:,0])), \
61 | (0, np.max(points[:,1])), (0, np.max(points[:,2]))]
62 |
63 |
64 | def rotate(img, angle=180):
65 | """
66 | Rotate images using opencv
67 | :param img: one image (opencv format)
68 | :param angle: rotated angle [default: 180]
69 | """
70 | rows, cols = img.shape[0], img.shape[1]
71 | rotation_matrix = cv2.getRotationMatrix2D((rows/2, cols/2), angle, 1)
72 | dst = cv2.warpAffine(img, rotation_matrix, (cols, rows))
73 | return dst
74 |
75 |
76 | def rotate_about_center(src, angle, scale=1.):
77 | """
78 |     Rotate an image about its center
79 | :param src: one image (opencv format)
80 | :param angle: rotated angle
81 | :param scale: re-scaling images [default: 1.]
82 | """
83 | w = src.shape[1]
84 | h = src.shape[0]
85 | rangle = np.deg2rad(angle) # angle in radians
86 | # now calculate new image width and height
87 | nw = (abs(np.sin(rangle)*h) + abs(np.cos(rangle)*w))*scale
88 | nh = (abs(np.cos(rangle)*h) + abs(np.sin(rangle)*w))*scale
89 | # ask opencv for the rotation matrix
90 | rot_mat = cv2.getRotationMatrix2D((nw*0.5, nh*0.5), angle, scale)
91 | # calculate the move from the old center to the new center combined
92 | # with the rotation
93 | rot_move = np.dot(rot_mat, np.array([(nw-w)*0.5, (nh-h)*0.5,0]))
94 | # the move only affects the translation, so update the translation
95 | # part of the transform
96 | rot_mat[0,2] += rot_move[0]
97 | rot_mat[1,2] += rot_move[1]
98 | return cv2.warpAffine(src, rot_mat, (int(math.ceil(nw)), int(math.ceil(nh))), flags=cv2.INTER_LANCZOS4)
99 |
100 |
101 | def feature_map(merge, ranges, alpha=0.2, beta=0.8, GSD=0.5):
102 | """
103 |     Obtain feature maps from point clouds
104 | :param merge: merged xyz points
105 | :param ranges: focused ranges
106 | :param alpha, beta, GSD: hyper-parameters in paper
107 | """
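108 |     # Feature value of each grid cell (i, j), following the paper's equations:
109 |     #   F_ij = sum_k(W_ij^k * Z_ij^k) / sum_k(W_ij^k)          (Eqn. 2)
110 |     #   W_ij = alpha * W_ij_XY + beta * W_ij_H                 (Eqn. 3)
111 |     #   W_ij_XY = sqrt(2) * GSD / (D_ij + tol)                 (Eqn. 4)
112 |     #   D_ij = distance from the point to the cell center      (Eqn. 5)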
108 | (X_min, X_max) = ranges[0]
109 | (Y_min, Y_max) = ranges[1]
110 | (Z_min, Z_max) = ranges[2]
111 |
112 | W = int((X_max - X_min) / GSD) + 1
113 | H = int((Y_max - Y_min) / GSD) + 1
114 | print ("(W, H) = (" + str(W) + ", " + str(H) + ")")
115 | feature_map = np.zeros((W, H))
116 |
117 | net_dict = dict()
118 | for i in range(merge.shape[1]):
119 |         if i % 1000000 == 0:
120 |             print ("processed %d points..." % i)
121 | point = merge[:,i]
122 | x = int((point[0] - X_min) / GSD)
123 | y = int((point[1] - Y_min) / GSD)
124 |         if (x, y) in net_dict:
125 |             net_dict[(x, y)].append(i)
126 |         else:
127 |             net_dict[(x, y)] = [i]
128 |
129 | print ("mapping points...")
130 |     # calculate the feature
131 | count = 0
132 | F_ij_min = 1000
133 | F_ij_max = -1000
134 | for i in range(W):
135 | for j in range(H):
136 | F_ij = 0
137 |
138 | try:
139 | h_min = 1000
140 | h_max = -1000
141 | for num in net_dict[(i, j)]:
142 | point = merge[:,num]
143 | h_max = max(point[2], h_max)
144 | h_min = min(point[2], h_min)
145 |
146 | Z_ijs = []
147 | W_ijs = []
148 | tol = 1e-5
149 | for num in net_dict[(i, j)]:
150 | point = merge[:,num]
151 | Z_ij = point[2]
152 | H_ij = Z_ij - Z_min
153 | # obtain D_ij from Eqn.(5)
154 | x_ij = (i + 0.5) * GSD + X_min
155 | y_ij = (j + 0.5) * GSD + Y_min
156 | D_ij = math.sqrt((point[0] - x_ij)**2 + (point[1] - y_ij)**2)
157 | # obtain W_ij_XY and W_ij_J from Eqn.(4)
158 | W_ij_XY = math.sqrt(2) * GSD / (D_ij + tol)
159 | W_ij_H = H_ij * (h_min - Z_min) / (Z_max - h_max + tol)
160 | # obtain W_ij from Eqn.(3)
161 | W_ij = alpha * W_ij_XY + beta * W_ij_H
162 | Z_ijs.append(Z_ij)
163 | W_ijs.append(W_ij)
164 |
165 | for k in range(len(Z_ijs)):
166 | # obtain feature value F_ij from Eqn.(2)
167 | F_ij += W_ijs[k] * Z_ijs[k]
168 |
169 | F_ij /= sum(W_ijs)
170 | count += 1
171 |
172 | F_ij_min = min(F_ij, F_ij_min)
173 | F_ij_max = max(F_ij, F_ij_max)
174 |             except KeyError: pass  # no points fall into this cell
175 |
176 | feature_map[i][j] = F_ij
177 |
178 | feature_map -= F_ij_min
179 | feature_map /= (F_ij_max - F_ij_min)
180 | feature_map *= 255
181 |
182 | return feature_map
183 |
184 |
185 | def clean_map(fmap):
186 | """
187 | Clean the feature map
188 | :param fmap: feature map
189 | """
190 | # fmap = fmap[~(fmap==0).all(1)]
191 | fmap = fmap[(fmap != 0).sum(axis=1) >= 100, :]
192 | fmap = fmap[:, (fmap != 0).sum(axis=0) >= 50]
193 |
194 | return fmap
195 |
196 |
197 | def resize(path, x_axis, y_axis):
198 | """
199 | Resize images
200 | :param path: path of an image
201 | :param x_axis: width of resized image
202 | :param y_axis: height of resized image
203 | """
204 | img = cv2.imread(path)
205 | new_image = cv2.resize(img, (x_axis, y_axis))
206 | cv2.imwrite(path, new_image)
207 |
208 |
209 | def get_fmap(filename, dir1='gray', dir2='jet'):
210 | """
211 |     Extract and save feature maps for one las file
212 |     :param filename: path of the input las file
213 |     :param dir1, dir2: output dirs for the gray / jet colormap images
214 | """
215 | if not os.path.exists(dir1): os.mkdir(dir1)
216 | if not os.path.exists(dir2): os.mkdir(dir2)
217 |
218 | if not os.path.isfile(filename):
219 | print ("[Error]: \'%s\' is not a valid filename" % filename)
220 | return False
221 |
222 | merge, ranges, _ = lasReader(filename)
223 | merge, ranges = standardize(merge, ranges)
224 | print ("standardized point clouds")
225 | print ("total: " + str(merge.shape[1]) + " points")
226 |
227 | # transform x,y,z axis: 0,2,1
228 | merge, ranges = transform(merge, ranges, order=[1, 2, 0])
229 |
230 | # clean the feature map
231 | fmap = clean_map(feature_map(merge, ranges=ranges, GSD=0.05))
232 | cv2.imwrite(os.path.join(dir1, '%s.jpg' % filename[:-4]), \
233 | rotate_about_center(fmap, 180, 1.0))
234 |
235 | # uncomment the following line if you want to resize the feature map
236 | # resize(os.path.join(dir1, '%s.jpg' % filename[:-4]), x_axis=1080, y_axis=270)
237 |
238 | gray = cv2.imread(os.path.join(dir1, '%s.jpg' % filename[:-4]))
239 | gray_single = gray[:,:,0]
240 | imC = cv2.applyColorMap(gray_single, cv2.COLORMAP_JET)
241 | cv2.imwrite(os.path.join(dir1, '%s_jet_tmp.jpg' % filename[:-4]), imC)
242 |
243 | img = cv2.imread(os.path.join(dir1, '%s_jet_tmp.jpg' % filename[:-4]))
244 | cv2.imwrite(os.path.join(dir2, '%s_jet.jpg' % filename[:-4]), img)
245 |     os.remove("%s_jet_tmp.jpg" % os.path.join(dir1, filename[:-4]))
246 |
247 | return True
248 |
249 |
250 | def main():
251 | parser = argparse.ArgumentParser()
252 | parser.add_argument('-d', '--dir', default='',
253 | help='Directory of las files [default: \'\']')
254 | parser.add_argument('-f', '--file', default='',
255 | help='Specify one las file you want to convert [default: \'\']')
256 | FLAGS = parser.parse_args()
257 |
258 | d = FLAGS.dir
259 | f = FLAGS.file
260 |
261 | if f == '' and d == '':
262 | parser.print_help()
263 | elif f != '' and d != '':
264 | if not os.path.isdir(d):
265 | print ("[Error]: \'%s\' is not a valid directory!" % d)
266 | else:
267 | p = os.path.join(d, f)
268 | print (p)
269 | if get_fmap(p):
270 | print ("Finished!")
271 | elif f != '':
272 | p = f
273 | if get_fmap(p):
274 | print ("Finished!")
275 | else:
276 | if not os.path.isdir(d):
277 | print ("[Error]: \'%s\' is not a valid directory!" % d)
278 | else:
279 | files = sorted(glob.glob(os.path.join(d, "*.las")))
280 | count = 0
281 | for f in files:
282 | if get_fmap(f):
283 | count += 1
284 | if count % 25 == 0 and count != 0:
285 |                     print ("%d finished!" % count)
286 |
287 |
288 | if __name__ == "__main__":
289 | main()
290 |
--------------------------------------------------------------------------------
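As a compact restatement of the per-cell loop in `feature_map()` above: each grid cell's value is a weighted mean of its point heights, combining a planar-distance weight (Eqns. 4-5) with a height weight (Eqn. 3). A minimal sketch with illustrative names (not part of the repo):

```python
import math

def cell_feature(points, x_c, y_c, GSD, Z_min, Z_max,
                 alpha=1.0, beta=1.0, tol=1e-5):
    """Weighted height for one grid cell; points is a list of (x, y, z)."""
    h_min = min(p[2] for p in points)
    h_max = max(p[2] for p in points)
    num = den = 0.0
    for x, y, z in points:
        D = math.sqrt((x - x_c) ** 2 + (y - y_c) ** 2)   # Eqn.(5)
        W_xy = math.sqrt(2) * GSD / (D + tol)            # planar term, Eqn.(4)
        W_h = (z - Z_min) * (h_min - Z_min) / (Z_max - h_max + tol)
        W = alpha * W_xy + beta * W_h                    # Eqn.(3)
        num += W * z                                     # Eqn.(2) numerator
        den += W
    return num / den                                     # feature value F_ij
```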
/tools/pcd2las.py:
--------------------------------------------------------------------------------
1 | """
2 | Simple python scripts for 1) downsampling point clouds
3 | 2) converting point clouds from '.pcd' to '.las' format.
4 | Author: Jingkang Wang
5 | Date: November 2017
6 | Dependency: CloudCompare
7 | """
8 |
9 | import argparse
10 | import glob
11 | import os
12 | import time
13 |
14 |
15 | def downsample(absolute_path):
16 | """
17 |     Downsample '.las' point clouds (only '.las' files are globbed below)
18 | :param absolute_path: directory of point clouds
19 | """
20 | files = glob.glob(absolute_path + "*.las")
21 | files.sort()
22 | files.sort(key=len)
23 | time_in = time.time()
24 | for f in files:
25 | os.system("CloudCompare.exe -SILENT \
26 | -NO_TIMESTAMP -C_EXPORT_FMT LAS \
27 | -O %s -SS RANDOM 16384" % f)
28 |     print (time.time() - time_in)
29 |
30 |
31 | def pcd2las(absolute_path):
32 | """
33 |     Convert point clouds from '.pcd' to '.las' format.
34 | :param absolute_path: directory of point clouds
35 | """
36 | print (absolute_path)
37 | files = glob.glob(absolute_path + "*.pcd")
38 | files.sort()
39 | files.sort(key=len)
40 | print (files)
41 | time_in = time.time()
42 | for f in files:
43 | os.system("CloudCompare.exe -SILENT \
44 | -NO_TIMESTAMP -C_EXPORT_FMT LAS \
45 | -O %s -SS RANDOM 16384" % f)
46 |     print (time.time() - time_in)
47 |
48 |
49 | if __name__ == "__main__":
50 | parser = argparse.ArgumentParser()
51 |     parser.add_argument('input_dir', type=str,
52 |                         help='Input directory of point clouds')
53 |     parser.add_argument('oper', type=str, nargs='?', default="downsample",
54 |                         help='Operation to conduct: downsample or pcd2las')
55 | FLAGS = parser.parse_args()
56 | INPUT_DIR = FLAGS.input_dir
57 | OPER = FLAGS.oper
58 |
59 | assert (os.path.exists(INPUT_DIR))
60 | if (OPER == "downsample"):
61 | downsample(INPUT_DIR)
62 | else:
63 | pcd2las(INPUT_DIR)
64 |
--------------------------------------------------------------------------------
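Both helpers above shell out with `os.system`, which silently ignores a failed CloudCompare run. A hedged alternative using `subprocess` (assuming Python 3 and the same CLI flags as in the script; the helper name is illustrative):

```python
import subprocess

def run_cloudcompare(path):
    """Randomly subsample one cloud to 16384 points and export it as LAS."""
    cmd = ["CloudCompare.exe", "-SILENT", "-NO_TIMESTAMP",
           "-C_EXPORT_FMT", "LAS", "-O", path, "-SS", "RANDOM", "16384"]
    if subprocess.run(cmd).returncode != 0:
        print("[Warn] CloudCompare failed on %s" % path)
```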
/tools/video2img.py:
--------------------------------------------------------------------------------
1 | """
2 | Simple python scripts for converting one video to continuous frames
3 | Author: Jingkang Wang
4 | Date: November 2017
5 | Dependency: python-opencv
6 | """
7 |
8 | import argparse
9 | import math
10 | import os
11 | import sys
12 |
13 | import cv2
14 |
15 | parser = argparse.ArgumentParser()
16 | parser.add_argument('-i', help='Path of video')
17 | parser.add_argument('-t', type=float, default=1.0, help='Time interval in seconds')
18 | parser.add_argument('-o', default='./images', help='Dir of images')
19 | FLAGS = parser.parse_args()
20 |
21 | videoFile = FLAGS.i
22 | imagesFolder = FLAGS.o
23 | t_int = FLAGS.t
24 |
25 | if videoFile is None:
26 | print ("[Error]: Please input path of video")
27 | sys.exit(0)
28 |
29 | if not os.path.exists(videoFile):
30 | print ("[Error]: %s is not a valid video" % videoFile)
31 | sys.exit(0)
32 |
33 | if not os.path.exists(imagesFolder): os.makedirs(imagesFolder)
34 |
35 | cap = cv2.VideoCapture(videoFile)
36 | frameRate = cap.get(cv2.CAP_PROP_FPS)  # frame rate
37 |
38 | count = 0
39 | while(cap.isOpened()):
40 |     frameId = cap.get(cv2.CAP_PROP_POS_FRAMES)  # current frame index
41 | success, frame = cap.read()
42 | if not success:
43 | break
44 |     # keep one frame every t_int seconds
45 |     if int(frameId) % math.floor(t_int * frameRate) == 0:
46 | filename = imagesFolder + "/images_" + str(int(frameId)) + ".jpg"
47 | cv2.imwrite(filename, frame)
48 | count += 1
49 |
50 |         if count % 100 == 0: print ("%d frames saved!" % count)
51 |
52 | cap.release()
53 | print ("Done!")
54 | print ("FrameRate: %f" % frameRate)
55 | print ("Total: %d" % count)
56 |
--------------------------------------------------------------------------------
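The sampling rule above keeps a frame whenever `frameId % floor(t * fps) == 0`. A quick check of which indices survive, with illustrative values (30 fps video sampled every 0.5 s):

```python
import math

fps, t = 30.0, 0.5
step = math.floor(t * fps)                     # 15 frames between saves
saved = [fid for fid in range(100) if fid % step == 0]
print(saved)                                   # [0, 15, 30, 45, 60, 75, 90]
```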
/train.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import importlib
3 | import os
4 | import sys
5 | import time
6 |
7 | import numpy as np
8 | import scipy
9 |
10 | import provider
11 | import tensorflow as tf
12 | import keras
13 |
14 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
15 | sys.path.append(os.path.join(BASE_DIR, 'models'))
16 |
17 | parser = argparse.ArgumentParser()
18 | parser.add_argument('--gpu', type=int, default=0,
19 | help='GPU to use [default: GPU 0]')
20 | parser.add_argument('--model', default='nvidia_pn',
21 | help='Model name [default: nvidia_pn]')
22 | parser.add_argument('--add_lstm', type=lambda v: v.lower() in ('yes', 'true', 't', 'y', '1'),
23 |                     default=False, help='Introduce LSTM mechanism in network [default: False]')
24 | parser.add_argument('--log_dir', default='logs',
25 | help='Log dir [default: logs]')
26 | parser.add_argument('--max_epoch', type=int, default=250,
27 | help='Epoch to run [default: 250]')
28 | parser.add_argument('--batch_size', type=int, default=8,
29 | help='Batch Size during training [default: 8]')
30 | parser.add_argument('--learning_rate', type=float, default=0.001,
31 | help='Learning rate during training [default: 0.001]')
32 | parser.add_argument('--momentum', type=float, default=0.9,
33 |                     help='Momentum for momentum optimizer [default: 0.9]')
34 | parser.add_argument('--optimizer', default='adam',
35 | help='adam or momentum [default: adam]')
36 | parser.add_argument('--decay_step', type=int, default=200000,
37 | help='Decay step for lr decay [default: 200000]')
38 | parser.add_argument('--decay_rate', type=float, default=0.7,
39 |                     help='Decay rate for lr decay [default: 0.7]')
40 | FLAGS = parser.parse_args()
41 |
42 | BATCH_SIZE = FLAGS.batch_size
43 | MAX_EPOCH = FLAGS.max_epoch
44 | LEARNING_RATE = FLAGS.learning_rate
45 | OPTIMIZER = FLAGS.optimizer
46 | BASE_LEARNING_RATE = FLAGS.learning_rate
47 | GPU_INDEX = FLAGS.gpu
48 | MOMENTUM = FLAGS.momentum
49 | DECAY_STEP = FLAGS.decay_step
50 | DECAY_RATE = FLAGS.decay_rate
51 | ADD_LSTM = FLAGS.add_lstm
52 |
53 | BN_INIT_DECAY = 0.5
54 | BN_DECAY_DECAY_RATE = 0.5
55 | BN_DECAY_DECAY_STEP = float(DECAY_STEP)
56 | BN_DECAY_CLIP = 0.99
57 |
58 | supported_models = ["nvidia_io", "nvidia_pn",
59 | "resnet152_io", "resnet152_pn",
60 | "inception_v4_io", "inception_v4_pn",
61 | "densenet169_io", "densenet169_pn"]
62 | assert (FLAGS.model in supported_models)
63 | MODEL = importlib.import_module(FLAGS.model) # import network module
64 | MODEL_FILE = os.path.join(BASE_DIR, 'models', FLAGS.model+'.py')
65 |
66 | LOG_DIR = os.path.join(FLAGS.log_dir, FLAGS.model)
67 | if not os.path.exists(LOG_DIR):
68 | os.makedirs(LOG_DIR)
69 | os.system('cp %s %s' % (MODEL_FILE, LOG_DIR)) # bkp of model def
70 | os.system('cp train.py %s' % (LOG_DIR)) # bkp of train procedure
71 | LOG_FOUT = open(os.path.join(LOG_DIR, 'log_train.txt'), 'w')
72 | LOG_FOUT.write(str(FLAGS)+'\n')
73 |
74 |
75 | def log_string(out_str):
76 | LOG_FOUT.write(out_str+'\n')
77 | LOG_FOUT.flush()
78 | print(out_str)
79 |
80 |
81 | def get_learning_rate(batch):
82 | learning_rate = tf.train.exponential_decay(
83 | BASE_LEARNING_RATE, # Base learning rate.
84 | batch * BATCH_SIZE, # Current index into the dataset.
85 | DECAY_STEP, # Decay step.
86 | DECAY_RATE, # Decay rate.
87 | staircase=True)
88 | learning_rate = tf.maximum(learning_rate, 0.00001) # CLIP THE LEARNING RATE!
89 | return learning_rate
90 |
91 |
92 | def get_bn_decay(batch):
93 | bn_momentum = tf.train.exponential_decay(
94 | BN_INIT_DECAY,
95 | batch*BATCH_SIZE,
96 | BN_DECAY_DECAY_STEP,
97 | BN_DECAY_DECAY_RATE,
98 | staircase=True)
99 | bn_decay = tf.minimum(BN_DECAY_CLIP, 1 - bn_momentum)
100 | return bn_decay
101 |
102 |
103 | def train():
104 | with tf.Graph().as_default():
105 | with tf.device('/gpu:'+str(GPU_INDEX)):
106 | if '_pn' in MODEL_FILE:
107 | data_input = provider.Provider()
108 | imgs_pl, pts_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE)
109 | imgs_pl = [imgs_pl, pts_pl]
110 | elif '_io' in MODEL_FILE:
111 | data_input = provider.Provider()
112 | imgs_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE)
113 | else:
114 | raise NotImplementedError
115 |
116 | is_training_pl = tf.placeholder(tf.bool, shape=())
117 | print(is_training_pl)
118 |
119 | # Note the global_step=batch parameter to minimize.
120 | # That tells the optimizer to helpfully increment the 'batch'
121 | # parameter for you every time it trains.
122 | batch = tf.Variable(0)
123 | bn_decay = get_bn_decay(batch)
124 | tf.summary.scalar('bn_decay', bn_decay)
125 |
126 | # Get model and loss
127 | pred = MODEL.get_model(imgs_pl, is_training_pl,
128 | bn_decay=bn_decay)
129 |
130 | loss = MODEL.get_loss(pred, labels_pl)
131 | MODEL.summary_scalar(pred, labels_pl)
132 |
133 | # Get training operator
134 | learning_rate = get_learning_rate(batch)
135 | tf.summary.scalar('learning_rate', learning_rate)
136 | if OPTIMIZER == 'momentum':
137 | optimizer = tf.train.MomentumOptimizer(learning_rate,
138 | momentum=MOMENTUM)
139 | elif OPTIMIZER == 'adam':
140 | optimizer = tf.train.AdamOptimizer(learning_rate)
141 | train_op = optimizer.minimize(loss, global_step=batch)
142 | # Add ops to save and restore all the variables.
143 | saver = tf.train.Saver()
144 |
145 | # Create a session
146 | config = tf.ConfigProto()
147 | config.gpu_options.allow_growth = True
148 | config.allow_soft_placement = True
149 | config.log_device_placement = False
150 | sess = tf.Session(config=config)
151 |
152 | # Add summary writers
153 | # merged = tf.merge_all_summaries()
154 | merged = tf.summary.merge_all()
155 | train_writer = tf.summary.FileWriter(os.path.join(LOG_DIR, 'train'),
156 | sess.graph)
157 | test_writer = tf.summary.FileWriter(os.path.join(LOG_DIR, 'test'))
158 |
159 | # Init variables
160 | init = tf.global_variables_initializer()
161 | sess.run(init, {is_training_pl: True})
162 |
163 | ops = {'imgs_pl': imgs_pl,
164 | 'labels_pl': labels_pl,
165 | 'is_training_pl': is_training_pl,
166 | 'pred': pred,
167 | 'loss': loss,
168 | 'train_op': train_op,
169 | 'merged': merged,
170 | 'step': batch}
171 |
172 | eval_acc_max = 0
173 | for epoch in range(MAX_EPOCH):
174 | log_string('**** EPOCH %03d ****' % (epoch))
175 | sys.stdout.flush()
176 |
177 | train_one_epoch(sess, ops, train_writer, data_input)
178 | eval_acc = eval_one_epoch(sess, ops, test_writer, data_input)
179 | if eval_acc > eval_acc_max:
180 | eval_acc_max = eval_acc
181 | save_path = saver.save(sess, os.path.join(LOG_DIR, "model_best.ckpt"))
182 | log_string("Model saved in file: %s" % save_path)
183 |
184 | # Save the variables to disk.
185 | if epoch % 10 == 0:
186 | save_path = saver.save(sess, os.path.join(LOG_DIR, "model.ckpt"))
187 | log_string("Model saved in file: %s" % save_path)
188 |
189 |
190 | def train_one_epoch(sess, ops, train_writer, data_input):
191 | """ ops: dict mapping from string to tf ops """
192 | is_training = True
193 | num_batches = data_input.num_train // BATCH_SIZE
194 | loss_sum = 0
195 | acc_a_sum = 0
196 | acc_s_sum = 0
197 | counter = 0
198 |
199 | for batch_idx in range(num_batches):
200 | if "_io" in MODEL_FILE:
201 | imgs, labels = data_input.load_one_batch(BATCH_SIZE, "train", reader_type="io")
202 | if "resnet" in MODEL_FILE or "inception" in MODEL_FILE or "densenet" in MODEL_FILE:
203 | imgs = MODEL.resize(imgs)
204 | feed_dict = {ops['imgs_pl']: imgs,
205 | ops['labels_pl']: labels,
206 | ops['is_training_pl']: is_training}
207 | else:
208 | imgs, others, labels = data_input.load_one_batch(BATCH_SIZE, "train")
209 | if "resnet" in MODEL_FILE or "inception" in MODEL_FILE or "densenet" in MODEL_FILE:
210 | imgs = MODEL.resize(imgs)
211 | feed_dict = {ops['imgs_pl'][0]: imgs,
212 | ops['imgs_pl'][1]: others,
213 | ops['labels_pl']: labels,
214 | ops['is_training_pl']: is_training}
215 |
216 | summary, step, _, loss_val, pred_val = sess.run([ops['merged'],
217 | ops['step'],
218 | ops['train_op'],
219 | ops['loss'],
220 | ops['pred']],
221 | feed_dict=feed_dict)
222 | train_writer.add_summary(summary, step)
223 | loss_sum += np.mean(np.square(np.subtract(pred_val, labels)))
224 | acc_a = np.abs(np.subtract(pred_val[:, 1], labels[:, 1])) < (5.0 / 180 * scipy.pi)
225 | acc_a = np.mean(acc_a)
226 | acc_a_sum += acc_a
227 | acc_s = np.abs(np.subtract(pred_val[:, 0], labels[:, 0])) < (5.0 / 20)
228 | acc_s = np.mean(acc_s)
229 | acc_s_sum += acc_s
230 |
231 | counter += 1
232 | if counter % 200 == 0:
233 | log_string(str(counter) + " step:")
234 | log_string('loss: %f' % (loss_sum / float(batch_idx + 1)))
235 | log_string('acc (angle): %f' % (acc_a_sum / float(batch_idx + 1)))
236 | log_string('acc (speed): %f' % (acc_s_sum / float(batch_idx + 1)))
237 |
238 | log_string('mean loss: %f' % (loss_sum / float(num_batches)))
239 | log_string('accuracy (angle): %f' % (acc_a_sum / float(num_batches)))
240 | log_string('accuracy (speed): %f' % (acc_s_sum / float(num_batches)))
241 |
242 |
243 | def eval_one_epoch(sess, ops, test_writer, data_input):
244 | """ ops: dict mapping from string to tf ops """
245 | is_training = False
246 | 
248 | num_batches = data_input.num_val // BATCH_SIZE
249 | loss_sum = 0
250 | acc_a_sum = 0
251 | acc_s_sum = 0
252 |
253 | for batch_idx in range(num_batches):
254 | if "_io" in MODEL_FILE:
255 | imgs, labels = data_input.load_one_batch(BATCH_SIZE, "val", reader_type="io")
256 | if "resnet" in MODEL_FILE or "inception" in MODEL_FILE or "densenet" in MODEL_FILE:
257 | imgs = MODEL.resize(imgs)
258 | feed_dict = {ops['imgs_pl']: imgs,
259 | ops['labels_pl']: labels,
260 | ops['is_training_pl']: is_training}
261 | else:
262 | imgs, others, labels = data_input.load_one_batch(BATCH_SIZE, "val")
263 | if "resnet" in MODEL_FILE or "inception" in MODEL_FILE or "densenet" in MODEL_FILE:
264 | imgs = MODEL.resize(imgs)
265 | feed_dict = {ops['imgs_pl'][0]: imgs,
266 | ops['imgs_pl'][1]: others,
267 | ops['labels_pl']: labels,
268 | ops['is_training_pl']: is_training}
269 |         # do not run 'train_op' here: evaluation must not update the weights
270 |         summary, step, loss_val, pred_val = sess.run([ops['merged'],
271 |                                                       ops['step'],
272 |                                                       ops['loss'],
273 |                                                       ops['pred']],
274 |                                                      feed_dict=feed_dict)
275 | test_writer.add_summary(summary, step)
276 | loss_sum += np.mean(np.square(np.subtract(pred_val, labels)))
277 | acc_a = np.abs(np.subtract(pred_val[:, 1], labels[:, 1])) < (5.0 / 180 * scipy.pi)
278 | acc_a = np.mean(acc_a)
279 | acc_a_sum += acc_a
280 | acc_s = np.abs(np.subtract(pred_val[:, 0], labels[:, 0])) < (5.0 / 20)
281 | acc_s = np.mean(acc_s)
282 | acc_s_sum += acc_s
283 |
284 | log_string('eval mean loss: %f' % (loss_sum / float(num_batches)))
285 | log_string('eval accuracy (angle): %f' % (acc_a_sum / float(num_batches)))
286 | log_string('eval accuracy (speed): %f' % (acc_s_sum / float(num_batches)))
287 | return acc_a_sum / float(num_batches)
288 |
289 |
290 | if __name__ == "__main__":
291 | train()
292 |
--------------------------------------------------------------------------------
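The accuracy numbers logged above count a prediction as correct when it falls within a fixed tolerance: 5 degrees for the steering angle (column 1, in radians) and 5.0/20 for speed (column 0; the divisor suggests speed is normalized by 20, which should be treated as an assumption). A standalone restatement:

```python
import numpy as np

def tolerance_accuracy(pred, labels):
    """Fraction of samples within the angle/speed tolerances used in train.py."""
    acc_angle = np.mean(np.abs(pred[:, 1] - labels[:, 1]) < 5.0 / 180 * np.pi)
    acc_speed = np.mean(np.abs(pred[:, 0] - labels[:, 0]) < 5.0 / 20)
    return acc_angle, acc_speed
```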
/train_demo.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import importlib
3 | import os
4 | import sys
5 | import time
6 |
7 | import numpy as np
8 | import scipy
9 |
10 | import provider
11 | import tensorflow as tf
12 |
13 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
14 | sys.path.append(os.path.join(BASE_DIR, 'models'))
15 |
16 | parser = argparse.ArgumentParser()
17 | parser.add_argument('--gpu', type=int, default=0,
18 | help='GPU to use [default: GPU 0]')
19 | parser.add_argument('--model', default='nvidia_io',
20 | help='Model name [default: nvidia_io]')
21 | parser.add_argument('--log_dir', default='logs',
22 | help='Log dir [default: logs]')
23 | parser.add_argument('--max_epoch', type=int, default=250,
24 | help='Epoch to run [default: 250]')
25 | parser.add_argument('--batch_size', type=int, default=8,
26 | help='Batch Size during training [default: 8]')
27 | parser.add_argument('--learning_rate', type=float, default=0.001,
28 | help='Learning rate during training [default: 0.001]')
29 | parser.add_argument('--momentum', type=float, default=0.9,
30 |                     help='Momentum for momentum optimizer [default: 0.9]')
31 | parser.add_argument('--optimizer', default='adam',
32 | help='adam or momentum [default: adam]')
33 | parser.add_argument('--decay_step', type=int, default=200000,
34 | help='Decay step for lr decay [default: 200000]')
35 | parser.add_argument('--decay_rate', type=float, default=0.7,
36 |                     help='Decay rate for lr decay [default: 0.7]')
37 | FLAGS = parser.parse_args()
38 |
39 | BATCH_SIZE = FLAGS.batch_size
40 | MAX_EPOCH = FLAGS.max_epoch
41 | LEARNING_RATE = FLAGS.learning_rate
42 | OPTIMIZER = FLAGS.optimizer
43 | BASE_LEARNING_RATE = FLAGS.learning_rate
44 | GPU_INDEX = FLAGS.gpu
45 | MOMENTUM = FLAGS.momentum
46 | DECAY_STEP = FLAGS.decay_step
47 | DECAY_RATE = FLAGS.decay_rate
48 |
49 | BN_INIT_DECAY = 0.5
50 | BN_DECAY_DECAY_RATE = 0.5
51 | BN_DECAY_DECAY_STEP = float(DECAY_STEP)
52 | BN_DECAY_CLIP = 0.99
53 |
54 | MODEL = importlib.import_module(FLAGS.model) # import network module
55 | MODEL_FILE = os.path.join(BASE_DIR, 'models', FLAGS.model+'.py')
56 |
57 | LOG_DIR = os.path.join(FLAGS.log_dir, FLAGS.model)
58 | if not os.path.exists(LOG_DIR):
59 | os.makedirs(LOG_DIR)
60 | os.system('cp %s %s' % (MODEL_FILE, LOG_DIR)) # bkp of model def
61 | os.system('cp train_demo.py %s' % (LOG_DIR)) # bkp of train procedure
62 | LOG_FOUT = open(os.path.join(LOG_DIR, 'log_train.txt'), 'w')
63 | LOG_FOUT.write(str(FLAGS)+'\n')
64 |
65 |
66 | def log_string(out_str):
67 | LOG_FOUT.write(out_str+'\n')
68 | LOG_FOUT.flush()
69 | print(out_str)
70 |
71 |
72 | def get_learning_rate(batch):
73 | learning_rate = tf.train.exponential_decay(
74 | BASE_LEARNING_RATE, # Base learning rate.
75 | batch * BATCH_SIZE, # Current index into the dataset.
76 | DECAY_STEP, # Decay step.
77 | DECAY_RATE, # Decay rate.
78 | staircase=True)
79 | learning_rate = tf.maximum(learning_rate, 0.00001) # CLIP THE LEARNING RATE!
80 | return learning_rate
81 |
82 |
83 | def get_bn_decay(batch):
84 | bn_momentum = tf.train.exponential_decay(
85 | BN_INIT_DECAY,
86 | batch*BATCH_SIZE,
87 | BN_DECAY_DECAY_STEP,
88 | BN_DECAY_DECAY_RATE,
89 | staircase=True)
90 | bn_decay = tf.minimum(BN_DECAY_CLIP, 1 - bn_momentum)
91 | return bn_decay
92 |
93 |
94 | def train():
95 | with tf.Graph().as_default():
96 | with tf.device('/gpu:'+str(GPU_INDEX)):
97 |             if '_io' in MODEL_FILE:  # '_io', not 'io': 'io' also matches 'inception'
98 | data_input = provider.DVR_Provider()
99 | imgs_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE)
100 |             elif '_pm' in MODEL_FILE:
101 | data_input = provider.DVR_FMAP_Provider()
102 | imgs_pl, fmaps_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE)
103 | imgs_pl = [imgs_pl, fmaps_pl]
104 | else:
105 | data_input = provider.DVR_Points_Provider()
106 | imgs_pl, pts_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE)
107 | imgs_pl = [imgs_pl, pts_pl]
108 |
109 | is_training_pl = tf.placeholder(tf.bool, shape=())
110 | print(is_training_pl)
111 |
112 | # Note the global_step=batch parameter to minimize.
113 | # That tells the optimizer to helpfully increment the 'batch'
114 | # parameter for you every time it trains.
115 | batch = tf.Variable(0)
116 | bn_decay = get_bn_decay(batch)
117 | tf.summary.scalar('bn_decay', bn_decay)
118 |
119 | # Get model and loss
120 | pred = MODEL.get_model(imgs_pl, is_training_pl,
121 | bn_decay=bn_decay)
122 |
123 | loss = MODEL.get_loss(pred, labels_pl)
124 | MODEL.summary_scalar(pred, labels_pl)
125 |
126 | # Get training operator
127 | learning_rate = get_learning_rate(batch)
128 | tf.summary.scalar('learning_rate', learning_rate)
129 | if OPTIMIZER == 'momentum':
130 | optimizer = tf.train.MomentumOptimizer(learning_rate,
131 | momentum=MOMENTUM)
132 | elif OPTIMIZER == 'adam':
133 | optimizer = tf.train.AdamOptimizer(learning_rate)
134 | train_op = optimizer.minimize(loss, global_step=batch)
135 | # Add ops to save and restore all the variables.
136 | saver = tf.train.Saver()
137 |
138 | # Create a session
139 | config = tf.ConfigProto()
140 | config.gpu_options.allow_growth = True
141 | config.allow_soft_placement = True
142 | config.log_device_placement = False
143 | sess = tf.Session(config=config)
144 |
145 | # Add summary writers
146 | # merged = tf.merge_all_summaries()
147 | merged = tf.summary.merge_all()
148 | train_writer = tf.summary.FileWriter(os.path.join(LOG_DIR, 'train'),
149 | sess.graph)
150 | test_writer = tf.summary.FileWriter(os.path.join(LOG_DIR, 'test'))
151 |
152 | # Init variables
153 | init = tf.global_variables_initializer()
154 | sess.run(init, {is_training_pl: True})
155 |
156 | ops = {'imgs_pl': imgs_pl,
157 | 'labels_pl': labels_pl,
158 | 'is_training_pl': is_training_pl,
159 | 'pred': pred,
160 | 'loss': loss,
161 | 'train_op': train_op,
162 | 'merged': merged,
163 | 'step': batch}
164 |
165 | for epoch in range(MAX_EPOCH):
166 | log_string('**** EPOCH %03d ****' % (epoch))
167 | sys.stdout.flush()
168 |
169 | train_one_epoch(sess, ops, train_writer, data_input)
170 | eval_one_epoch(sess, ops, test_writer, data_input)
171 |
172 | # Save the variables to disk.
173 | if epoch % 10 == 0:
174 | save_path = saver.save(sess, os.path.join(LOG_DIR, "model.ckpt"))
175 | log_string("Model saved in file: %s" % save_path)
176 |
177 |
178 | def train_one_epoch(sess, ops, train_writer, data_input):
179 | """ ops: dict mapping from string to tf ops """
180 | is_training = True
181 | num_batches = data_input.num_train // BATCH_SIZE
182 | loss_sum = 0
183 | acc_a_sum = 0
184 | acc_s_sum = 0
185 |
186 | for batch_idx in range(num_batches):
187 |         if "_io" in MODEL_FILE:
188 | imgs, labels = data_input.load_one_batch(BATCH_SIZE, "train")
189 | feed_dict = {ops['imgs_pl']: imgs,
190 | ops['labels_pl']: labels,
191 | ops['is_training_pl']: is_training}
192 | else:
193 | imgs, others, labels = data_input.load_one_batch(BATCH_SIZE, "train")
194 | feed_dict = {ops['imgs_pl'][0]: imgs,
195 | ops['imgs_pl'][1]: others,
196 | ops['labels_pl']: labels,
197 | ops['is_training_pl']: is_training}
198 |
199 | summary, step, _, loss_val, pred_val = sess.run([ops['merged'],
200 | ops['step'],
201 | ops['train_op'],
202 | ops['loss'],
203 | ops['pred']],
204 | feed_dict=feed_dict)
205 | train_writer.add_summary(summary, step)
206 | loss_sum += np.mean(np.square(np.subtract(pred_val, labels)))
207 | acc_a = np.abs(np.subtract(pred_val[:, 1], labels[:, 1])) < (5.0 / 180 * scipy.pi)
208 | acc_a = np.mean(acc_a)
209 | acc_a_sum += acc_a
210 | acc_s = np.abs(np.subtract(pred_val[:, 0], labels[:, 0])) < (5.0 / 20)
211 | acc_s = np.mean(acc_s)
212 | acc_s_sum += acc_s
213 |
214 | log_string('mean loss: %f' % (loss_sum / float(num_batches)))
215 | log_string('accuracy (angle): %f' % (acc_a_sum / float(num_batches)))
216 | log_string('accuracy (speed): %f' % (acc_s_sum / float(num_batches)))
217 |
218 |
219 | def eval_one_epoch(sess, ops, test_writer, data_input):
220 | """ ops: dict mapping from string to tf ops """
221 | is_training = False
222 | 
224 | num_batches = data_input.num_val // BATCH_SIZE
225 | loss_sum = 0
226 | acc_a_sum = 0
227 | acc_s_sum = 0
228 |
229 | for batch_idx in range(num_batches):
230 |         if "_io" in MODEL_FILE:
231 | imgs, labels = data_input.load_one_batch(BATCH_SIZE, "val")
232 | feed_dict = {ops['imgs_pl']: imgs,
233 | ops['labels_pl']: labels,
234 | ops['is_training_pl']: is_training}
235 | else:
236 | imgs, others, labels = data_input.load_one_batch(BATCH_SIZE, "val")
237 | feed_dict = {ops['imgs_pl'][0]: imgs,
238 | ops['imgs_pl'][1]: others,
239 | ops['labels_pl']: labels,
240 | ops['is_training_pl']: is_training}
241 |         # do not run 'train_op' here: evaluation must not update the weights
242 |         summary, step, loss_val, pred_val = sess.run([ops['merged'],
243 |                                                       ops['step'],
244 |                                                       ops['loss'],
245 |                                                       ops['pred']],
246 |                                                      feed_dict=feed_dict)
247 | test_writer.add_summary(summary, step)
248 | loss_sum += np.mean(np.square(np.subtract(pred_val, labels)))
249 | acc_a = np.abs(np.subtract(pred_val[:, 1], labels[:, 1])) < (5.0 / 180 * scipy.pi)
250 | acc_a = np.mean(acc_a)
251 | acc_a_sum += acc_a
252 | acc_s = np.abs(np.subtract(pred_val[:, 0], labels[:, 0])) < (5.0 / 20)
253 | acc_s = np.mean(acc_s)
254 | acc_s_sum += acc_s
255 |
256 | log_string('eval mean loss: %f' % (loss_sum / float(num_batches)))
257 | log_string('eval accuracy (angle): %f' % (acc_a_sum / float(num_batches)))
258 | log_string('eval accuracy (speed): %f' % (acc_s_sum / float(num_batches)))
259 |
260 |
261 | if __name__ == "__main__":
262 | train()
263 |
--------------------------------------------------------------------------------
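`get_learning_rate()` in both training scripts implements a staircase exponential decay driven by the number of processed samples (`batch * BATCH_SIZE`). A plain-Python restatement that can be eyeballed against the defaults (base 0.001, decay 0.7 every 200000 samples, floored at 1e-5); the function name is illustrative:

```python
def staircase_lr(step, base_lr=0.001, batch_size=8,
                 decay_step=200000, decay_rate=0.7, floor=1e-5):
    """Mirror of tf.train.exponential_decay(..., staircase=True) as used above."""
    lr = base_lr * decay_rate ** ((step * batch_size) // decay_step)
    return max(lr, floor)

for s in (0, 25000, 50000, 100000):
    print(s, staircase_lr(s))   # 0.001, 0.0007, 0.00049, 0.00024...
```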
/utils/custom_layers.py:
--------------------------------------------------------------------------------
1 | from keras.layers.core import Layer
2 | from keras.engine import InputSpec
3 | from keras import backend as K
4 | try:
5 | from keras import initializations
6 | except ImportError:
7 | from keras import initializers as initializations
8 |
9 |
10 | class Scale(Layer):
11 |     '''Learns a set of weights and biases used for scaling the input data.
12 |     The output is simply an element-wise multiplication of the input
13 |     with learned weights, plus learned biases:
14 | 
15 |         out = in * gamma + beta,
16 | 
17 |     where 'gamma' and 'beta' are the learned weights and biases.
18 |
19 | # Arguments
20 | axis: integer, axis along which to normalize in mode 0. For instance,
21 | if your input tensor has shape (samples, channels, rows, cols),
22 | set axis to 1 to normalize per feature map (channels axis).
23 | momentum: momentum in the computation of the
24 | exponential average of the mean and standard deviation
25 | of the data, for feature-wise normalization.
26 | weights: Initialization weights.
27 | List of 2 Numpy arrays, with shapes:
28 | `[(input_shape,), (input_shape,)]`
29 | beta_init: name of initialization function for shift parameter
30 | (see [initializations](../initializations.md)), or alternatively,
31 | Theano/TensorFlow function to use for weights initialization.
32 | This parameter is only relevant if you don't pass a `weights` argument.
33 | gamma_init: name of initialization function for scale parameter (see
34 | [initializations](../initializations.md)), or alternatively,
35 | Theano/TensorFlow function to use for weights initialization.
36 | This parameter is only relevant if you don't pass a `weights` argument.
37 | '''
38 | def __init__(self, weights=None, axis=-1, momentum = 0.9, beta_init='zero', gamma_init='one', **kwargs):
39 | self.momentum = momentum
40 | self.axis = axis
41 | self.beta_init = initializations.get(beta_init)
42 | self.gamma_init = initializations.get(gamma_init)
43 | self.initial_weights = weights
44 | super(Scale, self).__init__(**kwargs)
45 |
46 | def build(self, input_shape):
47 | self.input_spec = [InputSpec(shape=input_shape)]
48 | shape = (int(input_shape[self.axis]),)
49 |
50 | # Compatibility with TensorFlow >= 1.0.0
51 | self.gamma = K.variable(self.gamma_init(shape), name='{}_gamma'.format(self.name))
52 | self.beta = K.variable(self.beta_init(shape), name='{}_beta'.format(self.name))
53 | #self.gamma = self.gamma_init(shape, name='{}_gamma'.format(self.name))
54 | #self.beta = self.beta_init(shape, name='{}_beta'.format(self.name))
55 | self.trainable_weights = [self.gamma, self.beta]
56 |
57 | if self.initial_weights is not None:
58 | self.set_weights(self.initial_weights)
59 | del self.initial_weights
60 |
61 | def call(self, x, mask=None):
62 | input_shape = self.input_spec[0].shape
63 | broadcast_shape = [1] * len(input_shape)
64 | broadcast_shape[self.axis] = input_shape[self.axis]
65 |
66 | out = K.reshape(self.gamma, broadcast_shape) * x + K.reshape(self.beta, broadcast_shape)
67 | return out
68 |
69 | def get_config(self):
70 | config = {"momentum": self.momentum, "axis": self.axis}
71 | base_config = super(Scale, self).get_config()
72 | return dict(list(base_config.items()) + list(config.items()))
73 |
--------------------------------------------------------------------------------
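For reference, `Scale` drops into a Keras model like any other layer. A minimal sketch, assuming `utils/` is on `sys.path` and the Keras 1/2-era API this repo targets:

```python
from keras.models import Sequential
from keras.layers import Dense
from custom_layers import Scale

model = Sequential()
model.add(Dense(64, input_shape=(16,)))
model.add(Scale(axis=-1))            # learnable per-channel gamma and beta
model.compile(optimizer='adam', loss='mse')
model.summary()
```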
/utils/helper.py:
--------------------------------------------------------------------------------
1 | import argparse
2 |
3 |
4 | def str2bool(v):
5 | if v.lower() in ('yes', 'true', 't', 'y', '1'):
6 | return True
7 | elif v.lower() in ('no', 'false', 'f', 'n', '0'):
8 | return False
9 | else:
10 | raise argparse.ArgumentTypeError('Boolean value expected.')
--------------------------------------------------------------------------------
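`str2bool` exists because `argparse` with `type=bool` treats any non-empty string, including "False", as True. Typical wiring (the flag name is illustrative, and the import assumes `utils/` is on `sys.path`):

```python
import argparse
from helper import str2bool

parser = argparse.ArgumentParser()
parser.add_argument('--add_lstm', type=str2bool, default=False)
print(parser.parse_args(['--add_lstm', 'false']).add_lstm)  # False
```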
/utils/pointnet.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | import math
4 | import sys
5 | import os
6 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
7 | sys.path.append(BASE_DIR)
8 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
9 | import tf_util
10 |
11 | def placeholder_inputs(batch_size, num_point):
12 | pointclouds_pl = tf.placeholder(tf.float32, shape=(batch_size, num_point, 3))
13 | labels_pl = tf.placeholder(tf.int32, shape=(batch_size))
14 | return pointclouds_pl, labels_pl
15 |
16 |
17 | def get_model(point_cloud, is_training, bn_decay=None):
18 |     """ PointNet feature extractor: input is BxNx3, output is a Bx256 global feature """
19 | batch_size = point_cloud.get_shape()[0].value
20 | num_point = point_cloud.get_shape()[1].value
21 | end_points = {}
22 | input_image = tf.expand_dims(point_cloud, -1)
23 |
24 | # Point functions (MLP implemented as conv2d)
25 | net = tf_util.conv2d(input_image, 64, [1,3],
26 | padding='VALID', stride=[1,1],
27 | bn=True, is_training=is_training,
28 | scope='conv1', bn_decay=bn_decay)
29 | net = tf_util.conv2d(net, 64, [1,1],
30 | padding='VALID', stride=[1,1],
31 | bn=True, is_training=is_training,
32 | scope='conv2', bn_decay=bn_decay)
33 | net = tf_util.conv2d(net, 64, [1,1],
34 | padding='VALID', stride=[1,1],
35 | bn=True, is_training=is_training,
36 | scope='conv3', bn_decay=bn_decay)
37 | net = tf_util.conv2d(net, 128, [1,1],
38 | padding='VALID', stride=[1,1],
39 | bn=True, is_training=is_training,
40 | scope='conv4', bn_decay=bn_decay)
41 | net = tf_util.conv2d(net, 256, [1,1],
42 | padding='VALID', stride=[1,1],
43 | bn=True, is_training=is_training,
44 | scope='conv5', bn_decay=bn_decay)
45 |
46 | # Symmetric function: max pooling
47 | net = tf_util.max_pool2d(net, [num_point,1],
48 | padding='VALID', scope='maxpool')
49 |
50 | # MLP on global point cloud vector
51 | net = tf.reshape(net, [batch_size, -1])
52 |
53 | return net
54 |
55 | if __name__=='__main__':
56 | with tf.Graph().as_default():
57 | inputs = tf.zeros((32,100000,3))
58 | outputs = get_model(inputs, tf.constant(True))
59 | print(outputs)
60 |
--------------------------------------------------------------------------------
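The max pooling in `get_model()` is the symmetric function that makes the network invariant to point ordering: shuffling the input points cannot change the pooled feature. A quick NumPy check of that property:

```python
import numpy as np

pts = np.random.rand(1024, 3)
feat = pts.max(axis=0)                          # channel-wise max over points
shuffled = pts[np.random.permutation(len(pts))]
assert np.allclose(feat, shuffled.max(axis=0))  # order does not matter
```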
/utils/weights/README.md:
--------------------------------------------------------------------------------
1 | ## ImageNet Pretrained Models
2 |
3 | This directory holds the weights of models pre-trained on ImageNet. This part is adapted from [this repo](https://github.com/flyyufelix/cnn_finetune).
4 |
5 | ### Folder Structure
6 | Download pre-trained weights and organize the files as follows (in `utils/weights/`):
7 | ```
8 | ├── resnet152_weights_tf.h5
9 | ├── inception-v4_weights_tf.h5
10 | └── densenet169_weights_tf.h5
11 | ```
12 |
13 | ### Download the Weights
14 |
15 | Network|TensorFlow
16 | :---:|:---:
17 | Inception-V4 | [model (172 MB)](https://drive.google.com/file/d/0Byy2AcGyEVxfTmRRVmpGWDczaXM/view?usp=sharing)
18 | ResNet-152 | [model (243 MB)](https://drive.google.com/file/d/0Byy2AcGyEVxfeXExMzNNOHpEODg/view?usp=sharing)
19 | DenseNet-169 | [model (56 MB)](https://drive.google.com/open?id=0Byy2AcGyEVxfSEc5UC1ROUFJdmM)
20 |
21 | More pre-trained weights are available [here](https://github.com/flyyufelix/cnn_finetune).
--------------------------------------------------------------------------------