├── .gitignore ├── LICENSE ├── README.md ├── data ├── README.md ├── dbnet-2018 │ └── README.md └── demo │ ├── README.md │ └── data.csv ├── docs ├── logo.jpeg └── pred.jpg ├── evaluate.py ├── models ├── densenet169_io.py ├── densenet169_pm.py ├── densenet169_pn.py ├── inception_v4_io.py ├── inception_v4_pm.py ├── inception_v4_pn.py ├── nvidia_io.py ├── nvidia_pm.py ├── nvidia_pn.py ├── resnet152_io.py ├── resnet152_pm.py └── resnet152_pn.py ├── predict.py ├── provider.py ├── tools ├── README.md ├── img_pre.py ├── las2fmap.py ├── pcd2las.py └── video2img.py ├── train.py ├── train_demo.py └── utils ├── custom_layers.py ├── helper.py ├── pointnet.py ├── tf_util.py └── weights └── README.md /.gitignore: -------------------------------------------------------------------------------- 1 | .vscode/ 2 | *pyc 3 | data/dbnet-2018/train 4 | data/dbnet-2018/val 5 | data/dbnet-2018/test 6 | data/demo/DVR 7 | data/demo/fmap 8 | data/demo/points_16384 9 | logs/ 10 | results/ 11 | dbnet_test.py 12 | utils/weights/*.h5 13 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 
47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. 
You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "[]" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright [yyyy] [name of copyright owner] 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | ![db-prediction](docs/pred.jpg) 4 | 5 | [DBNet](http://www.dbehavior.net/) is a __large-scale driving behavior dataset__, which provides large-scale __high-quality point clouds__ scanned by Velodyne lasers, __high-resolution videos__ recorded by dashboard cameras and __standard drivers' behaviors__ (vehicle speed, steering angle) collected by real-time sensors. 
6 | 
7 | Extensive experiments demonstrate that extra depth information indeed helps networks determine driving policies. We hope DBNet will become a useful resource for the autonomous driving research community.
8 | 
9 | _Created by [Yiping Chen*](https://scholar.google.com/citations?user=e9lv2fUAAAAJ&hl=en), [Jingkang Wang*](https://wangjksjtu.github.io/), [Jonathan Li](https://uwaterloo.ca/mobile-sensing/people-profiles/jonathan-li), [Cewu Lu](http://www.mvig.org/), Zhipeng Luo, HanXue and [Cheng Wang](http://chwang.xmu.edu.cn/). (*equal contribution)_
10 | 
11 | The resources of our work are available: [[paper]](http://openaccess.thecvf.com/content_cvpr_2018/papers/Chen_LiDAR-Video_Driving_Dataset_CVPR_2018_paper.pdf), [[code]](https://github.com/driving-behavior/DBNet), [[video]](http://www.dbehavior.net/data/demo.mp4), [[website]](http://www.dbehavior.net/), [[challenge]](http://www.dbehavior.net/task.html), [[prepared data]](https://drive.google.com/file/d/1WxzOrhvMnHCOkh6EFGWltflyPb_UnGqo/view?usp=sharing)
12 | 
13 | 
18 | 
19 | ## Contents
20 | 1. [Introduction](#introduction)
21 | 2. [Requirements](#requirements)
22 | 3. [Quick Start](#quick-start)
23 | 4. [Baseline](#baseline)
24 | 5. [Contributors](#contributors)
25 | 6. [Citation](#citation)
26 | 7. [License](#license)
27 | 
28 | ## Introduction
29 | This work is based on our [research paper](http://openaccess.thecvf.com/content_cvpr_2018/html/Chen_LiDAR-Video_Driving_Dataset_CVPR_2018_paper.html), which appeared at CVPR 2018. We propose a large-scale dataset for driving behavior learning, namely DBNet. You can also check our [dataset webpage](http://www.dbehavior.net/) for a deeper introduction.
30 | 
31 | In this repository, we release __demo code__ and __part of the prepared data__ for training with only images, as well as for leveraging feature maps or point clouds. The prepared data are accessible [here](https://drive.google.com/open?id=14RPdVTwBTuCTo0tFeYmL_SyN8fD0g6Hc). (__More demo models and scripts will be released soon!__)
32 | 
33 | ## Requirements
34 | 
35 | * **Tensorflow 1.2.0**
36 | * Python 2.7
37 | * CUDA 8.0+ (for GPU)
38 | * Python libraries: numpy, scipy and __laspy__
39 | 
40 | The code has been tested with Python 2.7, Tensorflow 1.2.0, CUDA 8.0 and cuDNN 5.1 on Ubuntu 14.04. It may also work in other environments (directly or with minor modifications); pull requests and test reports are very welcome.
41 | 
42 | ## Quick Start
43 | ### Training
44 | To train a model to predict vehicle speeds and steering angles:
45 | 
46 |     python train.py --model nvidia_pn --batch_size 16 --max_epoch 125 --gpu 0
47 | 
48 | The names of the models are consistent with our [paper](http://www.dbehavior.net/publications.html).
49 | Log files and network parameters are saved to the `logs` folder by default.
50 | 
51 | To see HELP for the training script:
52 | 
53 |     python train.py -h
54 | 
55 | We can use TensorBoard to view the network architecture and monitor the training progress:
56 | 
57 |     tensorboard --logdir logs
58 | 
59 | ### Evaluation
60 | After training, you can evaluate the performance of a model using `evaluate.py`. To plot figures or calculate AUC, you need the matplotlib library installed.
61 | 
62 |     python evaluate.py --model_path logs/nvidia_pn/model.ckpt
63 | 
64 | ### Prediction
65 | To get predictions on the test data:
66 | 
67 |     python predict.py
68 | 
69 | The results are saved in `results/results` (per segment) and `results/behavior_pred.txt` (merged) by default.
70 | To change the storage location:
71 | 
72 |     python predict.py --result_dir specified_dir
73 | 
74 | The result directory will be created automatically if it doesn't exist.
75 | 
76 | ## Baseline
77 | 
| Method | Setting | | Accuracy | AUC | ME | AE | AME |
| --- | --- | --- | --- | --- | --- | --- | --- |
| nvidia-pn | Videos + Laser Points | angle | 70.65% (<5) | 0.7799 | 29.46 | 4.23 | 20.88 |
| | | speed | 82.21% (<3) | 0.8701 | 18.56 | 1.80 | 9.68 |
78 | 
79 | This baseline is run on the __dbnet-2018 challenge data__ and only __nvidia\_pn__ is tested. To measure architectures comprehensively, several metrics are used, including accuracy under different thresholds, area under curve (__AUC__), max error (__ME__), mean error (__AE__) and mean of max errors (__AME__).
80 | 
81 | The implementations of these metrics can be found in `evaluate.py`.
82 | 
83 | ## Contributors
84 | DBNet was developed by [MVIG](http://www.mvig.org/), Shanghai Jiao Tong University* and [SCSC](http://scsc.xmu.edu.cn/) Lab, Xiamen University* (*alphabetical order*).
85 | 
86 | ## Citation
87 | If you find our work useful in your research, please consider citing:
88 | 
89 |     @InProceedings{DBNet2018,
90 |       author = {Yiping Chen and Jingkang Wang and Jonathan Li and Cewu Lu and Zhipeng Luo and HanXue and Cheng Wang},
91 |       title = {LiDAR-Video Driving Dataset: Learning Driving Policies Effectively},
92 |       booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
93 |       month = {June},
94 |       year = {2018}
95 |     }
96 | 
97 | ## License
98 | Our code is released under the Apache 2.0 License. The copyright of DBNet can be checked [here](http://www.dbehavior.net/contact.html).
99 | 
-------------------------------------------------------------------------------- 
/data/README.md:
--------------------------------------------------------------------------------
1 | ## Home Directory of DBNet Data
2 | 
3 | This is the place where the DBNet data should be placed in order to fit the default paths in `../provider.py`. Two kinds of prepared data are provided, listed in the `dbnet-2018` and `demo` folders, respectively.
4 | 
5 | ### dbnet-2018
6 | Download the DBNet-2018 challenge data [here](https://drive.google.com/open?id=14RPdVTwBTuCTo0tFeYmL_SyN8fD0g6Hc) and organize the folders as follows (in `dbnet-2018/`):
7 | ```
8 | ├── train
9 | ├─ └── i [56 folders] (6569 in total, will be released continuously)
10 | ├─     ├── dvr_66x200 [<= 120 images]
11 | ├─     ├── dvr_1920x1080 [<= 120 images]
12 | ├─     ├── points_16384 [<= 120 clouds]
13 | ├─     └── behavior.csv [labels]
14 | ├── val
15 | ├─ └── j [20 folders] (2349 in total)
16 | ├─     ├── dvr_66x200 [<= 120 images]
17 | ├─     ├── dvr_1920x1080 [<= 120 images]
18 | ├─     ├── points_16384 [<= 120 clouds]
19 | ├─     └── behavior.csv [labels]
20 | └── test
21 |     └── k [20 folders] (2376 in total)
22 |         ├── dvr_66x200 [<= 120 images]
23 |         ├── dvr_1920x1080 [<= 120 images]
24 |         └── points_16384 [<= 120 clouds]
25 | 
26 | ```
27 | In general, the train/val/test ratio is approximately 8:1:1 and all of the val/test data have already been released. Almost five eighths of the training data are still being pre-processed and will be __uploaded soon__.
28 | 
29 | Please note that the data in the subfolders of `train/`, `val/` and `test/` are __continuous__ and __time-ordered__. The `i`th line of `behavior.csv` corresponds to `i-1.jpg` in `dvr_66x200/` and `i-1.las` in `points_16384/` (see the sketch below). Moreover, if you don't intend to use the prepared data directly, please download and pre-process the [raw data]() with your preferred methods.
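
For clarity, the index convention above can be read as follows. This is a minimal sketch (not part of the repo), assuming `pandas` is installed; the session folder `train/0` and the `header=None` column layout of `behavior.csv` are assumptions for illustration:

```
import os
import pandas as pd  # any CSV reader works; pandas is an assumption, not a repo dependency

session = "train/0"  # hypothetical session folder
behavior = pd.read_csv(os.path.join(session, "behavior.csv"), header=None)

i = 1  # the i-th line of behavior.csv (1-indexed) ...
img_path = os.path.join(session, "dvr_66x200", "%d.jpg" % (i - 1))    # ... pairs with i-1.jpg
las_path = os.path.join(session, "points_16384", "%d.las" % (i - 1))  # ... and with i-1.las
```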
30 | 
31 | ### demo
32 | Download the DBNet demo data [here](https://drive.google.com/open?id=1NjhHwV_q6EMZ6MiGhZnqxg7yRCawx79c) and organize the folders as follows (in `demo`):
33 | 
34 | ```
35 | ├── data.csv
36 | ├── DVR
37 | ├─ └── i.jpg [3788 images]
38 | ├── fmap
39 | ├─ └── i.jpg [3788 feature maps]
40 | └── points_16384
41 |     └── i.las [3788 point clouds]
42 | ```
43 | 
-------------------------------------------------------------------------------- 
/data/dbnet-2018/README.md:
--------------------------------------------------------------------------------
1 | ## DBNet-2018 Challenge
2 | The DBNet-2018 challenge data are organized as follows:
3 | 
4 | ```
5 | ├── train
6 | ├─ └── i [56 folders] (6569 in total, will be released continuously)
7 | ├─     ├── dvr_66x200 [<= 120 images]
8 | ├─     ├── points_16384 [<= 120 clouds]
9 | ├─     └── behavior.csv [labels]
10 | ├── val
11 | ├─ └── j [20 folders] (2349 in total)
12 | ├─     ├── dvr_66x200 [<= 120 images]
13 | ├─     ├── points_16384 [<= 120 clouds]
14 | ├─     └── behavior.csv [labels]
15 | └── test
16 |     └── k [20 folders] (2376 in total)
17 |         ├── dvr_66x200 [<= 120 images]
18 |         └── points_16384 [<= 120 clouds]
19 | ```
20 | 
21 | In general, the train/val/test ratio is approximately 8:1:1 and all of the val/test data have already been released. Almost five eighths of the training data are still being pre-processed and will be __uploaded soon__.
22 | 
23 | Please note that the data in the subfolders of `train/`, `val/` and `test/` are __continuous__ and __time-ordered__. The `i`th line of `behavior.csv` corresponds to `i-1.jpg` in `dvr_66x200/` and `i-1.las` in `points_16384/`. Moreover, if you don't intend to use the prepared data directly, please download and pre-process the raw data with your preferred methods.
-------------------------------------------------------------------------------- 
/data/demo/README.md:
--------------------------------------------------------------------------------
1 | ## Demo Data
2 | 
3 | Download the DBNet demo data and organize the folders as follows:
4 | 
5 | ```
6 | ├── data.csv
7 | ├── DVR
8 | ├─ └── i.jpg [3788 images]
9 | ├── fmap
10 | ├─ └── i.jpg [3788 feature maps]
11 | └── points_16384
12 |     └── i.las [3788 point clouds]
13 | ```
-------------------------------------------------------------------------------- 
/docs/logo.jpeg: https://raw.githubusercontent.com/driving-behavior/DBNet/c32ad78afb9f83e354eae7eb6793dbeef7e83bac/docs/logo.jpeg
--------------------------------------------------------------------------------
/docs/pred.jpg: https://raw.githubusercontent.com/driving-behavior/DBNet/c32ad78afb9f83e354eae7eb6793dbeef7e83bac/docs/pred.jpg
--------------------------------------------------------------------------------
/evaluate.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import importlib
3 | import os
4 | import sys
5 | import time
6 | 
7 | # import matplotlib.pyplot as plt  (imported lazily in plot_acc so evaluation can run headless)
8 | import numpy as np
9 | import scipy
10 | 
11 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
12 | sys.path.append(os.path.join(BASE_DIR, 'models'))
13 | sys.path.append(os.path.join(BASE_DIR, 'utils'))
14 | 
15 | import provider
16 | import tensorflow as tf
17 | from helper import str2bool
18 | 
19 | 
20 | parser = argparse.ArgumentParser()
21 | parser.add_argument('--gpu', type=int, default=0,
22 |                     help='GPU to use [default: GPU
0]') 23 | parser.add_argument('--model', default='nvidia_pn', 24 | help='Model name [default: nvidia_pn]') 25 | parser.add_argument('--model_path', default='logs/nvidia_pn/model_best.ckpt', 26 | help='Model checkpoint file path [default: logs/nvidia_pn/model_best.ckpt]') 27 | parser.add_argument('--max_epoch', type=int, default=250, 28 | help='Epoch to run [default: 250]') 29 | parser.add_argument('--batch_size', type=int, default=8, 30 | help='Batch Size during training [default: 8]') 31 | parser.add_argument('--result_dir', default='results', 32 | help='Result folder path [default: results]') 33 | parser.add_argument('--test', type=str2bool, default=False, # only used in test server 34 | help='Get performance on test data [default: False]') 35 | 36 | FLAGS = parser.parse_args() 37 | BATCH_SIZE = FLAGS.batch_size 38 | GPU_INDEX = FLAGS.gpu 39 | MODEL_PATH = FLAGS.model_path 40 | 41 | supported_models = ["nvidia_io", "nvidia_pn", 42 | "resnet152_io", "resnet152_pn", 43 | "inception_v4_io", "inception_v4_pn", 44 | "densenet169_io", "densenet169_pn"] 45 | assert (FLAGS.model in supported_models) 46 | 47 | MODEL = importlib.import_module(FLAGS.model) # import network module 48 | MODEL_FILE = os.path.join(BASE_DIR, 'models', FLAGS.model+'.py') 49 | 50 | RESULT_DIR = os.path.join(FLAGS.result_dir, FLAGS.model) 51 | if not os.path.exists(RESULT_DIR): 52 | os.makedirs(RESULT_DIR) 53 | if FLAGS.test: 54 | TEST_RESULT_DIR = os.path.join(RESULT_DIR, "test") 55 | if not os.path.exists(TEST_RESULT_DIR): 56 | os.makedirs(TEST_RESULT_DIR) 57 | LOG_FOUT = open(os.path.join(TEST_RESULT_DIR, 'log_test.txt'), 'w') 58 | LOG_FOUT.write(str(FLAGS)+'\n') 59 | else: 60 | VAL_RESULT_DIR = os.path.join(RESULT_DIR, "val") 61 | if not os.path.exists(VAL_RESULT_DIR): 62 | os.makedirs(VAL_RESULT_DIR) 63 | LOG_FOUT = open(os.path.join(VAL_RESULT_DIR, 'log_evaluate.txt'), 'w') 64 | LOG_FOUT.write(str(FLAGS)+'\n') 65 | 66 | 67 | def log_string(out_str): 68 | LOG_FOUT.write(out_str+'\n') 69 | LOG_FOUT.flush() 70 | print(out_str) 71 | 72 | def evaluate(): 73 | with tf.device('/gpu:'+str(GPU_INDEX)): 74 | if '_pn' in MODEL_FILE: 75 | data_input = provider.Provider() 76 | imgs_pl, pts_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE) 77 | imgs_pl = [imgs_pl, pts_pl] 78 | elif '_io' in MODEL_FILE: 79 | data_input = provider.Provider() 80 | imgs_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE) 81 | else: 82 | raise NotImplementedError 83 | 84 | is_training_pl = tf.placeholder(tf.bool, shape=()) 85 | print(is_training_pl) 86 | 87 | # Get model and loss 88 | pred = MODEL.get_model(imgs_pl, is_training_pl) 89 | 90 | loss = MODEL.get_loss(pred, labels_pl) 91 | 92 | # Add ops to save and restore all the variables. 93 | saver = tf.train.Saver() 94 | 95 | # Create a session 96 | config = tf.ConfigProto() 97 | config.gpu_options.allow_growth = True 98 | config.allow_soft_placement = True 99 | config.log_device_placement = True 100 | sess = tf.Session(config=config) 101 | 102 | # Restore variables from disk. 
103 | saver.restore(sess, MODEL_PATH) 104 | log_string("Model restored.") 105 | 106 | ops = {'imgs_pl': imgs_pl, 107 | 'labels_pl': labels_pl, 108 | 'is_training_pl': is_training_pl, 109 | 'pred': pred, 110 | 'loss': loss} 111 | 112 | eval_one_epoch(sess, ops, data_input) 113 | 114 | 115 | def eval_one_epoch(sess, ops, data_input): 116 | """ ops: dict mapping from string to tf ops """ 117 | is_training = False 118 | loss_sum = 0 119 | 120 | num_batches = data_input.num_val // BATCH_SIZE 121 | acc_a_sum = [0] * 5 122 | acc_s_sum = [0] * 5 123 | 124 | preds = [] 125 | labels_total = [] 126 | acc_a = [0] * 5 127 | acc_s = [0] * 5 128 | for batch_idx in range(num_batches): 129 | if "_io" in MODEL_FILE: 130 | imgs, labels = data_input.load_one_batch(BATCH_SIZE, "val", reader_type="io") 131 | if "resnet" in MODEL_FILE or "inception" in MODEL_FILE or "densenet" in MODEL_FILE: 132 | imgs = MODEL.resize(imgs) 133 | feed_dict = {ops['imgs_pl']: imgs, 134 | ops['labels_pl']: labels, 135 | ops['is_training_pl']: is_training} 136 | else: 137 | imgs, others, labels = data_input.load_one_batch(BATCH_SIZE, "val") 138 | if "resnet" in MODEL_FILE or "inception" in MODEL_FILE or "densenet" in MODEL_FILE: 139 | imgs = MODEL.resize(imgs) 140 | feed_dict = {ops['imgs_pl'][0]: imgs, 141 | ops['imgs_pl'][1]: others, 142 | ops['labels_pl']: labels, 143 | ops['is_training_pl']: is_training} 144 | 145 | loss_val, pred_val = sess.run([ops['loss'], ops['pred']], 146 | feed_dict=feed_dict) 147 | 148 | preds.append(pred_val) 149 | labels_total.append(labels) 150 | loss_sum += np.mean(np.square(np.subtract(pred_val, labels))) 151 | for i in range(5): 152 | acc_a[i] = np.mean(np.abs(np.subtract(pred_val[:, 1], labels[:, 1])) < (1.0 * (i+1) / 180 * scipy.pi)) 153 | acc_a_sum[i] += acc_a[i] 154 | acc_s[i] = np.mean(np.abs(np.subtract(pred_val[:, 0], labels[:, 0])) < (1.0 * (i+1) / 20)) 155 | acc_s_sum[i] += acc_s[i] 156 | 157 | log_string('eval mean loss: %f' % (loss_sum / float(num_batches))) 158 | for i in range(5): 159 | log_string('eval accuracy (angle-%d): %f' % (float(i+1), (acc_a_sum[i] / float(num_batches)))) 160 | log_string('eval accuracy (speed-%d): %f' % (float(i+1), (acc_s_sum[i] / float(num_batches)))) 161 | 162 | preds = np.vstack(preds) 163 | labels = np.vstack(labels_total) 164 | 165 | a_error, s_error = mean_max_error(preds, labels, dicts=get_dicts()) 166 | log_string('eval error (mean-max): angle:%.2f speed:%.2f' % 167 | (a_error / scipy.pi * 180, s_error * 20)) 168 | a_error, s_error = max_error(preds, labels) 169 | log_string('eval error (max): angle:%.2f speed:%.2f' % 170 | (a_error / scipy.pi * 180, s_error * 20)) 171 | a_error, s_error = mean_topk_error(preds, labels, 5) 172 | log_string('eval error (mean-top5): angle:%.2f speed:%.2f' % 173 | (a_error / scipy.pi * 180, s_error * 20)) 174 | a_error, s_error = mean_error(preds, labels) 175 | log_string('eval error (mean): angle:%.2f speed:%.2f' % 176 | (a_error / scipy.pi * 180, s_error * 20)) 177 | 178 | print (preds.shape, labels.shape) 179 | np.savetxt(os.path.join(VAL_RESULT_DIR, "preds_val.txt"), preds) 180 | np.savetxt(os.path.join(VAL_RESULT_DIR, "labels_val.txt"), labels) 181 | # plot_acc(preds, labels) 182 | 183 | 184 | def test(): 185 | with tf.device('/gpu:'+str(GPU_INDEX)): 186 | if '_pn' in MODEL_FILE: 187 | data_input = provider.Provider2() 188 | imgs_pl, pts_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE) 189 | imgs_pl = [imgs_pl, pts_pl] 190 | elif '_io' in MODEL_FILE: 191 | data_input = provider.Provider2() 192 | imgs_pl, 
labels_pl = MODEL.placeholder_inputs(BATCH_SIZE) 193 | else: 194 | raise NotImplementedError 195 | 196 | 197 | is_training_pl = tf.placeholder(tf.bool, shape=()) 198 | print(is_training_pl) 199 | 200 | # Get model and loss 201 | pred = MODEL.get_model(imgs_pl, is_training_pl) 202 | 203 | loss = MODEL.get_loss(pred, labels_pl) 204 | 205 | # Add ops to save and restore all the variables. 206 | saver = tf.train.Saver() 207 | 208 | # Create a session 209 | config = tf.ConfigProto() 210 | config.gpu_options.allow_growth = True 211 | config.allow_soft_placement = True 212 | config.log_device_placement = True 213 | sess = tf.Session(config=config) 214 | 215 | # Restore variables from disk. 216 | saver.restore(sess, MODEL_PATH) 217 | log_string("Model restored.") 218 | 219 | ops = {'imgs_pl': imgs_pl, 220 | 'labels_pl': labels_pl, 221 | 'is_training_pl': is_training_pl, 222 | 'pred': pred, 223 | 'loss': loss} 224 | 225 | test_one_epoch(sess, ops, data_input) 226 | 227 | 228 | def test_one_epoch(sess, ops, data_input): 229 | """ ops: dict mapping from string to tf ops """ 230 | is_training = False 231 | loss_sum = 0 232 | 233 | num_batches = data_input.num_test // BATCH_SIZE 234 | acc_a_sum = [0] * 5 235 | acc_s_sum = [0] * 5 236 | 237 | preds = [] 238 | labels_total = [] 239 | acc_a = [0] * 5 240 | acc_s = [0] * 5 241 | for batch_idx in range(num_batches): 242 | if "_io" in MODEL_FILE: 243 | imgs, labels = data_input.load_one_batch(BATCH_SIZE, reader_type="io") 244 | if "resnet" in MODEL_FILE or "inception" in MODEL_FILE or "densenet" in MODEL_FILE: 245 | imgs = MODEL.resize(imgs) 246 | feed_dict = {ops['imgs_pl']: imgs, 247 | ops['labels_pl']: labels, 248 | ops['is_training_pl']: is_training} 249 | else: 250 | imgs, others, labels = data_input.load_one_batch(BATCH_SIZE) 251 | if "resnet" in MODEL_FILE or "inception" in MODEL_FILE or "densenet" in MODEL_FILE: 252 | imgs = MODEL.resize(imgs) 253 | feed_dict = {ops['imgs_pl'][0]: imgs, 254 | ops['imgs_pl'][1]: others, 255 | ops['labels_pl']: labels, 256 | ops['is_training_pl']: is_training} 257 | 258 | loss_val, pred_val = sess.run([ops['loss'], ops['pred']], 259 | feed_dict=feed_dict) 260 | 261 | preds.append(pred_val) 262 | labels_total.append(labels) 263 | loss_sum += np.mean(np.square(np.subtract(pred_val, labels))) 264 | for i in range(5): 265 | acc_a[i] = np.mean(np.abs(np.subtract(pred_val[:, 1], labels[:, 1])) < (1.0 * (i+1) / 180 * scipy.pi)) 266 | acc_a_sum[i] += acc_a[i] 267 | acc_s[i] = np.mean(np.abs(np.subtract(pred_val[:, 0], labels[:, 0])) < (1.0 * (i+1) / 20)) 268 | acc_s_sum[i] += acc_s[i] 269 | 270 | log_string('test mean loss: %f' % (loss_sum / float(num_batches))) 271 | for i in range(5): 272 | log_string('test accuracy (angle-%d): %f' % (float(i+1), (acc_a_sum[i] / float(num_batches)))) 273 | log_string('test accuracy (speed-%d): %f' % (float(i+1), (acc_s_sum[i] / float(num_batches)))) 274 | 275 | preds = np.vstack(preds) 276 | labels = np.vstack(labels_total) 277 | 278 | a_error, s_error = mean_max_error(preds, labels, dicts=get_dicts()) 279 | log_string('test error (mean-max): angle:%.2f speed:%.2f' % 280 | (a_error / scipy.pi * 180, s_error * 20)) 281 | a_error, s_error = max_error(preds, labels) 282 | log_string('test error (max): angle:%.2f speed:%.2f' % 283 | (a_error / scipy.pi * 180, s_error * 20)) 284 | a_error, s_error = mean_topk_error(preds, labels, 5) 285 | log_string('test error (mean-top5): angle:%.2f speed:%.2f' % 286 | (a_error / scipy.pi * 180, s_error * 20)) 287 | a_error, s_error = mean_error(preds, labels) 
288 |     log_string('test error (mean): angle:%.2f speed:%.2f' %
289 |                (a_error / scipy.pi * 180, s_error * 20))
290 | 
291 |     print (preds.shape, labels.shape)
292 |     np.savetxt(os.path.join(TEST_RESULT_DIR, "preds_val.txt"), preds)
293 |     np.savetxt(os.path.join(TEST_RESULT_DIR, "labels_val.txt"), labels)
294 |     # plot_acc(preds, labels)
295 | 
296 | 
297 | def plot_acc(preds, labels, counts=100):
298 |     import matplotlib.pyplot as plt  # lazy import: the top-level import is commented out
299 |     a_list = []
300 |     s_list = []
301 |     for i in range(counts):
302 |         acc_a = np.abs(np.subtract(preds[:, 1], labels[:, 1])) < (20.0 / 180 * scipy.pi / counts * i)
303 |         a_list.append(np.mean(acc_a))
304 |     for i in range(counts):
305 |         acc_s = np.abs(np.subtract(preds[:, 0], labels[:, 0])) < (15.0 / 20 / counts * i)
306 |         s_list.append(np.mean(acc_s))
307 | 
308 |     print (len(a_list), len(s_list))
309 |     a_xaxis = [20.0 / counts * i for i in range(counts)]
310 |     s_xaxis = [15.0 / counts * i for i in range(counts)]
311 | 
312 |     auc_angle = np.trapz(np.array(a_list), x=a_xaxis) / 20.0
313 |     auc_speed = np.trapz(np.array(s_list), x=s_xaxis) / 15.0
314 | 
315 |     plt.style.use('ggplot')
316 |     plt.figure()
317 |     plt.plot(a_xaxis, np.array(a_list), label='Area Under Curve (AUC): %f' % auc_angle)
318 |     plt.legend(loc='best')
319 |     plt.xlabel("Threshold (angle)")
320 |     plt.ylabel("Validation accuracy")
321 |     plt.savefig(os.path.join(RESULT_DIR, "acc_angle.png"))
322 |     plt.figure()
323 |     plt.plot(s_xaxis, np.array(s_list), label='Area Under Curve (AUC): %f' % auc_speed)
324 |     plt.xlabel("Threshold (speed)")
325 |     plt.ylabel("Validation accuracy")
326 |     plt.legend(loc='best')
327 |     plt.savefig(os.path.join(RESULT_DIR, 'acc_speed.png'))
328 | 
329 | def plot_acc_from_txt(counts=100):
330 |     preds = np.loadtxt(os.path.join(RESULT_DIR, "test/preds_val.txt"))
331 |     labels = np.loadtxt(os.path.join(RESULT_DIR, "test/labels_val.txt"))
332 |     print (preds.shape, labels.shape)
333 |     plot_acc(preds, labels, counts)
334 | 
335 | def get_dicts(description="val"):
336 |     if description == "train":
337 |         raise NotImplementedError
338 |     elif description == "val":  # batch_size == 8
339 |         return [120] * 4 + [111] + [120] * 4 + [109] + [120] * 9 + [89 - 87 % 8]
340 |     elif description == "test":  # batch_size == 8
341 |         return [120] * 9 + [116] + [120] * 4 + [106] + [120] * 4 + [114 - 114 % 8]
342 |     else:
343 |         raise NotImplementedError
344 | 
345 | def mean_max_error(preds, labels, dicts):
346 |     cnt = 0
347 |     a_error = 0
348 |     s_error = 0
349 |     for i in dicts:
350 |         print (preds.shape, cnt, cnt + i)
351 |         a_error += np.max(np.abs(preds[cnt:cnt+i, 1] - labels[cnt:cnt+i, 1]))
352 |         s_error += np.max(np.abs(preds[cnt:cnt+i, 0] - labels[cnt:cnt+i, 0]))
353 |         cnt += i
354 |     return a_error / float(len(dicts)), s_error / float(len(dicts))
355 | 
356 | def max_error(preds, labels):
357 |     return np.max(np.abs(preds[:,1] - labels[:,1])), np.max(np.abs(preds[:, 0] - labels[:, 0]))
358 | 
359 | def mean_error(preds, labels):
360 |     return np.mean(np.abs(preds[:,1] - labels[:,1])), np.mean(np.abs(preds[:,0] - labels[:,0]))
361 | 
362 | def mean_topk_error(preds, labels, k):
363 |     a_error = np.abs(preds[:,1] - labels[:,1])
364 |     s_error = np.abs(preds[:,0] - labels[:,0])
365 |     return np.mean(np.sort(a_error)[::-1][0:k]), np.mean(np.sort(s_error)[::-1][0:k])
366 | 
367 | if __name__ == "__main__":
368 |     if FLAGS.test: test()
369 |     else: evaluate()
370 |     # plot_acc_from_txt()
371 | 
-------------------------------------------------------------------------------- 
/models/densenet169_io.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | 
4 | import tensorflow as tf
5 | import scipy
6 | import scipy.misc  # resize() below uses scipy.misc.imresize; not pulled in by "import scipy" alone
7 | import numpy as np
8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
9 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
10 | 
11 | import tf_util
12 | from custom_layers import Scale
13 | from keras.layers import (Input, Dense, Convolution2D, MaxPooling2D,
14 |                           AveragePooling2D, GlobalAveragePooling2D,
15 |                           ZeroPadding2D, Dropout, Flatten, add,
16 |                           concatenate, Reshape, Activation)
17 | from keras.layers.normalization import BatchNormalization
18 | from keras.models import Model
19 | 
20 | 
21 | def placeholder_inputs(batch_size, img_rows=224, img_cols=224, separately=False):
22 |     imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
23 |     if separately:
24 |         speeds_pl = tf.placeholder(tf.float32, shape=(batch_size))
25 |         angles_pl = tf.placeholder(tf.float32, shape=(batch_size))
26 |         labels_pl = [speeds_pl, angles_pl]
27 |     else:  # default: one (speed, angle) label pair per sample
28 |         labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))
29 |     return imgs_pl, labels_pl
30 | 
31 | def get_densenet(img_rows, img_cols, nb_dense_block=4,
32 |                  growth_rate=32, nb_filter=64, reduction=0.5,
33 |                  dropout_rate=0.0, weight_decay=1e-4):
34 |     '''
35 |     DenseNet 169 Model for Keras
36 | 
37 |     Model Schema is based on
38 |     https://github.com/flyyufelix/DenseNet-Keras
39 | 
40 |     ImageNet Pretrained Weights
41 |     Theano: https://drive.google.com/open?id=0Byy2AcGyEVxfN0d3T1F1MXg0NlU
42 |     TensorFlow: https://drive.google.com/open?id=0Byy2AcGyEVxfSEc5UC1ROUFJdmM
43 | 
44 |     # Arguments
45 |         nb_dense_block: number of dense blocks to add to end
46 |         growth_rate: number of filters to add per dense block
47 |         nb_filter: initial number of filters
48 |         reduction: reduction factor of transition blocks.
49 |         dropout_rate: dropout rate
50 |         weight_decay: weight decay factor
51 |         classes: optional number of classes to classify images
52 |         weights_path: path to pre-trained weights
53 |     # Returns
54 |         A Keras model instance.
55 | ''' 56 | eps = 1.1e-5 57 | 58 | # compute compression factor 59 | compression = 1.0 - reduction 60 | 61 | # Handle Dimension Ordering for different backends 62 | img_input = Input(shape=(224, 224, 3), name='data') 63 | 64 | # From architecture for ImageNet (Table 1 in the paper) 65 | nb_filter = 64 66 | nb_layers = [6,12,32,32] # For DenseNet-169 67 | 68 | # Initial convolution 69 | x = ZeroPadding2D((3, 3), name='conv1_zeropadding')(img_input) 70 | x = Convolution2D(nb_filter, (7, 7), strides=(2, 2), name='conv1', use_bias=False)(x) 71 | x = BatchNormalization(epsilon=eps, axis=3, name='conv1_bn')(x) 72 | x = Scale(axis=3, name='conv1_scale')(x) 73 | x = Activation('relu', name='relu1')(x) 74 | x = ZeroPadding2D((1, 1), name='pool1_zeropadding')(x) 75 | x = MaxPooling2D((3, 3), strides=(2, 2), name='pool1')(x) 76 | 77 | # Add dense blocks 78 | for block_idx in range(nb_dense_block - 1): 79 | stage = block_idx+2 80 | x, nb_filter = dense_block(x, stage, nb_layers[block_idx], nb_filter, growth_rate, dropout_rate=dropout_rate, weight_decay=weight_decay) 81 | 82 | # Add transition_block 83 | x = transition_block(x, stage, nb_filter, compression=compression, dropout_rate=dropout_rate, weight_decay=weight_decay) 84 | nb_filter = int(nb_filter * compression) 85 | 86 | final_stage = stage + 1 87 | x, nb_filter = dense_block(x, final_stage, nb_layers[-1], nb_filter, growth_rate, dropout_rate=dropout_rate, weight_decay=weight_decay) 88 | 89 | x = BatchNormalization(epsilon=eps, axis=3, name='conv'+str(final_stage)+'_blk_bn')(x) 90 | x = Scale(axis=3, name='conv'+str(final_stage)+'_blk_scale')(x) 91 | x = Activation('relu', name='relu'+str(final_stage)+'_blk')(x) 92 | 93 | x_fc = GlobalAveragePooling2D(name='pool'+str(final_stage))(x) 94 | x_fc = Dense(1000, name='fc6')(x_fc) 95 | x_fc = Activation('softmax', name='prob')(x_fc) 96 | 97 | model = Model(img_input, x_fc, name='densenet') 98 | 99 | # Use pre-trained weights for Tensorflow backend 100 | weights_path = 'utils/weights/densenet169_weights_tf.h5' 101 | 102 | model.load_weights(weights_path, by_name=True) 103 | 104 | # Truncate and replace softmax layer for transfer learning 105 | # Cannot use model.layers.pop() since model is not of Sequential() type 106 | # The method below works since pre-trained weights are stored in layers but not in the model 107 | x_newfc = GlobalAveragePooling2D(name='pool'+str(final_stage))(x) 108 | 109 | x_newfc = Dense(256, name='fc7')(x_newfc) 110 | model = Model(img_input, x_newfc) 111 | 112 | return model 113 | 114 | 115 | def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False): 116 | """ Densenet169 regression model, input is BxWxHx3, output Bx2""" 117 | net = get_densenet(224, 224)(net) 118 | 119 | if not add_lstm: 120 | net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc_final') 121 | 122 | else: 123 | net = tf_util.fully_connected(net, 784, bn=True, 124 | is_training=is_training, 125 | scope='fc_lstm', 126 | bn_decay=bn_decay) 127 | net = tf_util.dropout(net, keep_prob=0.7, 128 | is_training=is_training, 129 | scope="dp1") 130 | net = cnn_lstm_block(net) 131 | 132 | return net 133 | 134 | 135 | def cnn_lstm_block(input_tensor): 136 | lstm_in = tf.reshape(input_tensor, [-1, 28, 28]) 137 | lstm_out = tf_util.stacked_lstm(lstm_in, 138 | num_outputs=10, 139 | time_steps=28, 140 | scope="cnn_lstm") 141 | 142 | W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1)) 143 | b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1)) 144 | return 
tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2) 145 | 146 | 147 | def conv_block(x, stage, branch, nb_filter, dropout_rate=None, weight_decay=1e-4): 148 | '''Apply BatchNorm, Relu, bottleneck 1x1 Conv2D, 3x3 Conv2D, and option dropout 149 | # Arguments 150 | x: input tensor 151 | stage: index for dense block 152 | branch: layer index within each dense block 153 | nb_filter: number of filters 154 | dropout_rate: dropout rate 155 | weight_decay: weight decay factor 156 | ''' 157 | eps = 1.1e-5 158 | conv_name_base = 'conv' + str(stage) + '_' + str(branch) 159 | relu_name_base = 'relu' + str(stage) + '_' + str(branch) 160 | 161 | # 1x1 Convolution (Bottleneck layer) 162 | inter_channel = nb_filter * 4 163 | x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_x1_bn')(x) 164 | x = Scale(axis=3, name=conv_name_base+'_x1_scale')(x) 165 | x = Activation('relu', name=relu_name_base+'_x1')(x) 166 | x = Convolution2D(inter_channel, (1, 1), name=conv_name_base+'_x1', use_bias=False)(x) 167 | 168 | if dropout_rate: 169 | x = Dropout(dropout_rate)(x) 170 | 171 | # 3x3 Convolution 172 | x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_x2_bn')(x) 173 | x = Scale(axis=3, name=conv_name_base+'_x2_scale')(x) 174 | x = Activation('relu', name=relu_name_base+'_x2')(x) 175 | x = ZeroPadding2D((1, 1), name=conv_name_base+'_x2_zeropadding')(x) 176 | x = Convolution2D(nb_filter, (3, 3), name=conv_name_base+'_x2', use_bias=False)(x) 177 | 178 | if dropout_rate: 179 | x = Dropout(dropout_rate)(x) 180 | 181 | return x 182 | 183 | 184 | def transition_block(x, stage, nb_filter, compression=1.0, dropout_rate=None, weight_decay=1E-4): 185 | ''' Apply BatchNorm, 1x1 Convolution, averagePooling, optional compression, dropout 186 | # Arguments 187 | x: input tensor 188 | stage: index for dense block 189 | nb_filter: number of filters 190 | compression: calculated as 1 - reduction. Reduces the number of feature maps in the transition block. 191 | dropout_rate: dropout rate 192 | weight_decay: weight decay factor 193 | ''' 194 | 195 | eps = 1.1e-5 196 | conv_name_base = 'conv' + str(stage) + '_blk' 197 | relu_name_base = 'relu' + str(stage) + '_blk' 198 | pool_name_base = 'pool' + str(stage) 199 | 200 | x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_bn')(x) 201 | x = Scale(axis=3, name=conv_name_base+'_scale')(x) 202 | x = Activation('relu', name=relu_name_base)(x) 203 | x = Convolution2D(int(nb_filter * compression), (1, 1), name=conv_name_base, use_bias=False)(x) 204 | 205 | if dropout_rate: 206 | x = Dropout(dropout_rate)(x) 207 | 208 | x = AveragePooling2D((2, 2), strides=(2, 2), name=pool_name_base)(x) 209 | 210 | return x 211 | 212 | 213 | def dense_block(x, stage, nb_layers, nb_filter, growth_rate, dropout_rate=None, weight_decay=1e-4, grow_nb_filters=True): 214 | ''' Build a dense_block where the output of each conv_block is fed to subsequent ones 215 | # Arguments 216 | x: input tensor 217 | stage: index for dense block 218 | nb_layers: the number of layers of conv_block to append to the model. 
219 |         nb_filter: number of filters
220 |         growth_rate: growth rate
221 |         dropout_rate: dropout rate
222 |         weight_decay: weight decay factor
223 |         grow_nb_filters: flag that allows the number of filters to grow
224 |     '''
225 | 
226 |     eps = 1.1e-5
227 |     concat_feat = x
228 | 
229 |     for i in range(nb_layers):
230 |         branch = i+1
231 |         x = conv_block(concat_feat, stage, branch, growth_rate, dropout_rate, weight_decay)
232 |         concat_feat = concatenate([concat_feat, x], axis=3, name='concat_'+str(stage)+'_'+str(branch))
233 | 
234 |         if grow_nb_filters:
235 |             nb_filter += growth_rate
236 | 
237 |     return concat_feat, nb_filter
238 | 
239 | 
240 | def get_loss(pred, label, l2_weight=0.0001):
241 |     diff = tf.square(tf.subtract(pred, label))
242 |     train_vars = tf.trainable_variables()
243 |     l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
244 |     loss = tf.reduce_mean(diff + l2_loss)
245 |     tf.summary.scalar('l2 loss', l2_loss)  # l2_loss is already scaled by l2_weight above
246 |     tf.summary.scalar('loss', loss)
247 | 
248 |     return loss
249 | 
250 | 
251 | def summary_scalar(pred, label):
252 |     thresholds = [5, 4, 3, 2, 1, 0.5]
253 |     angles = [float(t) / 180 * scipy.pi for t in thresholds]  # degrees -> radians
254 |     speeds = [float(t) / 20 for t in thresholds]              # km/h -> normalized speed
255 | 
256 |     for i in range(len(thresholds)):
257 |         scalar_angle = "angle(" + str(angles[i]) + ")"
258 |         scalar_speed = "speed(" + str(speeds[i]) + ")"
259 |         ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]  # compare in the labels' radian scale
260 |         ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]  # compare in the labels' normalized scale
261 |         ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
262 |         ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))
263 | 
264 |         tf.summary.scalar(scalar_angle, ac_angle)
265 |         tf.summary.scalar(scalar_speed, ac_speed)
266 | 
267 | 
268 | def resize(imgs):
269 |     batch_size = imgs.shape[0]
270 |     imgs_new = []
271 |     for j in range(batch_size):
272 |         img = imgs[j,:,:,:]
273 |         new = scipy.misc.imresize(img, (224, 224))
274 |         imgs_new.append(new)
275 |     imgs_new = np.stack(imgs_new, axis=0)
276 |     return imgs_new
277 | 
278 | 
279 | if __name__ == '__main__':
280 |     with tf.Graph().as_default():
281 |         inputs = tf.zeros((32, 224, 224, 3))
282 |         outputs = get_model(inputs, tf.constant(True))
283 |         print(outputs)
284 | 
-------------------------------------------------------------------------------- 
/models/densenet169_pm.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | 
4 | import tensorflow as tf
5 | import scipy
6 | import scipy.misc  # resize() below uses scipy.misc.imresize; not pulled in by "import scipy" alone
7 | import numpy as np
8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
9 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
10 | 
11 | import tf_util
12 | import pointnet
13 | from custom_layers import Scale
14 | from keras.layers import (Input, Dense, Convolution2D, MaxPooling2D,
15 |                           AveragePooling2D, GlobalAveragePooling2D,
16 |                           ZeroPadding2D, Dropout, Flatten, add,
17 |                           concatenate, Reshape, Activation)
18 | from keras.layers.normalization import BatchNormalization
19 | from keras.models import Model
20 | 
21 | from keras import backend as K
22 | K.set_learning_phase(1)  # set Keras learning phase
23 | 
24 | def placeholder_inputs(batch_size, img_rows=224, img_cols=224, points=16384, separately=False):
25 |     imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
26 |     fmaps_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
27 |     if separately:
28 |         speeds_pl = tf.placeholder(tf.float32, shape=(batch_size))
29 |         angles_pl = tf.placeholder(tf.float32,
shape=(batch_size))
30 |         labels_pl = [speeds_pl, angles_pl]
31 |     else:  # default: one (speed, angle) label pair per sample
32 |         labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))
33 |     return imgs_pl, fmaps_pl, labels_pl
34 | 
35 | def get_densenet(img_rows, img_cols, nb_dense_block=4,
36 |                  growth_rate=32, nb_filter=64, reduction=0.5,
37 |                  dropout_rate=0.0, weight_decay=1e-4):
38 |     '''
39 |     DenseNet 169 Model for Keras
40 | 
41 |     Model Schema is based on
42 |     https://github.com/flyyufelix/DenseNet-Keras
43 | 
44 |     ImageNet Pretrained Weights
45 |     Theano: https://drive.google.com/open?id=0Byy2AcGyEVxfN0d3T1F1MXg0NlU
46 |     TensorFlow: https://drive.google.com/open?id=0Byy2AcGyEVxfSEc5UC1ROUFJdmM
47 | 
48 |     # Arguments
49 |         nb_dense_block: number of dense blocks to add to end
50 |         growth_rate: number of filters to add per dense block
51 |         nb_filter: initial number of filters
52 |         reduction: reduction factor of transition blocks.
53 |         dropout_rate: dropout rate
54 |         weight_decay: weight decay factor
55 |         classes: optional number of classes to classify images
56 |         weights_path: path to pre-trained weights
57 |     # Returns
58 |         A Keras model instance.
59 |     '''
60 |     eps = 1.1e-5
61 | 
62 |     # compute compression factor
63 |     compression = 1.0 - reduction
64 | 
65 |     # Handle Dimension Ordering for different backends
66 |     img_input = Input(shape=(224, 224, 3), name='data')
67 | 
68 |     # From architecture for ImageNet (Table 1 in the paper)
69 |     nb_filter = 64
70 |     nb_layers = [6,12,32,32]  # For DenseNet-169
71 | 
72 |     # Initial convolution
73 |     x = ZeroPadding2D((3, 3), name='conv1_zeropadding')(img_input)
74 |     x = Convolution2D(nb_filter, (7, 7), strides=(2, 2), name='conv1', use_bias=False)(x)
75 |     x = BatchNormalization(epsilon=eps, axis=3, name='conv1_bn')(x)
76 |     x = Scale(axis=3, name='conv1_scale')(x)
77 |     x = Activation('relu', name='relu1')(x)
78 |     x = ZeroPadding2D((1, 1), name='pool1_zeropadding')(x)
79 |     x = MaxPooling2D((3, 3), strides=(2, 2), name='pool1')(x)
80 | 
81 |     # Add dense blocks
82 |     for block_idx in range(nb_dense_block - 1):
83 |         stage = block_idx+2
84 |         x, nb_filter = dense_block(x, stage, nb_layers[block_idx], nb_filter, growth_rate, dropout_rate=dropout_rate, weight_decay=weight_decay)
85 | 
86 |         # Add transition_block
87 |         x = transition_block(x, stage, nb_filter, compression=compression, dropout_rate=dropout_rate, weight_decay=weight_decay)
88 |         nb_filter = int(nb_filter * compression)
89 | 
90 |     final_stage = stage + 1
91 |     x, nb_filter = dense_block(x, final_stage, nb_layers[-1], nb_filter, growth_rate, dropout_rate=dropout_rate, weight_decay=weight_decay)
92 | 
93 |     x = BatchNormalization(epsilon=eps, axis=3, name='conv'+str(final_stage)+'_blk_bn')(x)
94 |     x = Scale(axis=3, name='conv'+str(final_stage)+'_blk_scale')(x)
95 |     x = Activation('relu', name='relu'+str(final_stage)+'_blk')(x)
96 | 
97 |     x_fc = GlobalAveragePooling2D(name='pool'+str(final_stage))(x)
98 |     x_fc = Dense(1000, name='fc6')(x_fc)
99 |     x_fc = Activation('softmax', name='prob')(x_fc)
100 | 
101 |     model = Model(img_input, x_fc, name='densenet')
102 | 
103 |     # Use pre-trained weights for Tensorflow backend
104 |     weights_path = 'utils/weights/densenet169_weights_tf.h5'
105 | 
106 |     model.load_weights(weights_path, by_name=True)
107 | 
108 |     # Truncate and replace softmax layer for transfer learning
109 |     # Cannot use model.layers.pop() since model is not of Sequential() type
110 |     # The method below works since pre-trained weights are stored in layers but not in the model
111 |     x_newfc = GlobalAveragePooling2D(name='pool'+str(final_stage))(x)
112 | 
113 |     x_newfc =
Dense(256, name='fc7')(x_newfc) 114 | model = Model(img_input, x_newfc) 115 | 116 | return model 117 | 118 | 119 | def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False): 120 | """ Densenet169 regression model, input is BxWxHx3, output Bx2""" 121 | batch_size = net[0].get_shape()[0].value 122 | img_net, fmap_net = net[0], net[1] 123 | 124 | img_net = get_densenet(224, 224)(img_net) 125 | fmap_net = get_densenet(224, 224)(fmap_net) 126 | 127 | net = tf.reshape(tf.stack([img_net, fmap_net]), [batch_size, -1]) 128 | 129 | if not add_lstm: 130 | for i, dim in enumerate([256, 128, 16]): 131 | fc_scope = "fc" + str(i + 1) 132 | dp_scope = "dp" + str(i + 1) 133 | net = tf_util.fully_connected(net, dim, bn=True, 134 | is_training=is_training, 135 | scope=fc_scope, 136 | bn_decay=bn_decay) 137 | net = tf_util.dropout(net, keep_prob=0.7, 138 | is_training=is_training, 139 | scope=dp_scope) 140 | net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc4') 141 | else: 142 | fc_scope = "fc1" 143 | net = tf_util.fully_connected(net, 784, bn=True, 144 | is_training=is_training, 145 | scope=fc_scope, 146 | bn_decay=bn_decay) 147 | net = tf_util.dropout(net, keep_prob=0.7, 148 | is_training=is_training, 149 | scope="dp1") 150 | net = cnn_lstm_block(net) 151 | return net 152 | 153 | 154 | def cnn_lstm_block(input_tensor): 155 | lstm_in = tf.reshape(input_tensor, [-1, 28, 28]) 156 | lstm_out = tf_util.stacked_lstm(lstm_in, 157 | num_outputs=10, 158 | time_steps=28, 159 | scope="cnn_lstm") 160 | 161 | W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1)) 162 | b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1)) 163 | return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2) 164 | 165 | 166 | def conv_block(x, stage, branch, nb_filter, dropout_rate=None, weight_decay=1e-4): 167 | '''Apply BatchNorm, Relu, bottleneck 1x1 Conv2D, 3x3 Conv2D, and option dropout 168 | # Arguments 169 | x: input tensor 170 | stage: index for dense block 171 | branch: layer index within each dense block 172 | nb_filter: number of filters 173 | dropout_rate: dropout rate 174 | weight_decay: weight decay factor 175 | ''' 176 | eps = 1.1e-5 177 | conv_name_base = 'conv' + str(stage) + '_' + str(branch) 178 | relu_name_base = 'relu' + str(stage) + '_' + str(branch) 179 | 180 | # 1x1 Convolution (Bottleneck layer) 181 | inter_channel = nb_filter * 4 182 | x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_x1_bn')(x) 183 | x = Scale(axis=3, name=conv_name_base+'_x1_scale')(x) 184 | x = Activation('relu', name=relu_name_base+'_x1')(x) 185 | x = Convolution2D(inter_channel, (1, 1), name=conv_name_base+'_x1', use_bias=False)(x) 186 | 187 | if dropout_rate: 188 | x = Dropout(dropout_rate)(x) 189 | 190 | # 3x3 Convolution 191 | x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_x2_bn')(x) 192 | x = Scale(axis=3, name=conv_name_base+'_x2_scale')(x) 193 | x = Activation('relu', name=relu_name_base+'_x2')(x) 194 | x = ZeroPadding2D((1, 1), name=conv_name_base+'_x2_zeropadding')(x) 195 | x = Convolution2D(nb_filter, (3, 3), name=conv_name_base+'_x2', use_bias=False)(x) 196 | 197 | if dropout_rate: 198 | x = Dropout(dropout_rate)(x) 199 | 200 | return x 201 | 202 | 203 | def transition_block(x, stage, nb_filter, compression=1.0, dropout_rate=None, weight_decay=1E-4): 204 | ''' Apply BatchNorm, 1x1 Convolution, averagePooling, optional compression, dropout 205 | # Arguments 206 | x: input tensor 207 | stage: index for dense block 208 | nb_filter: number 
of filters
209 |         compression: calculated as 1 - reduction. Reduces the number of feature maps in the transition block.
210 |         dropout_rate: dropout rate
211 |         weight_decay: weight decay factor
212 |     '''
213 | 
214 |     eps = 1.1e-5
215 |     conv_name_base = 'conv' + str(stage) + '_blk'
216 |     relu_name_base = 'relu' + str(stage) + '_blk'
217 |     pool_name_base = 'pool' + str(stage)
218 | 
219 |     x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_bn')(x)
220 |     x = Scale(axis=3, name=conv_name_base+'_scale')(x)
221 |     x = Activation('relu', name=relu_name_base)(x)
222 |     x = Convolution2D(int(nb_filter * compression), (1, 1), name=conv_name_base, use_bias=False)(x)
223 | 
224 |     if dropout_rate:
225 |         x = Dropout(dropout_rate)(x)
226 | 
227 |     x = AveragePooling2D((2, 2), strides=(2, 2), name=pool_name_base)(x)
228 | 
229 |     return x
230 | 
231 | 
232 | def dense_block(x, stage, nb_layers, nb_filter, growth_rate, dropout_rate=None, weight_decay=1e-4, grow_nb_filters=True):
233 |     ''' Build a dense_block where the output of each conv_block is fed to subsequent ones
234 |     # Arguments
235 |         x: input tensor
236 |         stage: index for dense block
237 |         nb_layers: the number of layers of conv_block to append to the model.
238 |         nb_filter: number of filters
239 |         growth_rate: growth rate
240 |         dropout_rate: dropout rate
241 |         weight_decay: weight decay factor
242 |         grow_nb_filters: flag that allows the number of filters to grow
243 |     '''
244 | 
245 |     eps = 1.1e-5
246 |     concat_feat = x
247 | 
248 |     for i in range(nb_layers):
249 |         branch = i+1
250 |         x = conv_block(concat_feat, stage, branch, growth_rate, dropout_rate, weight_decay)
251 |         concat_feat = concatenate([concat_feat, x], axis=3, name='concat_'+str(stage)+'_'+str(branch))
252 | 
253 |         if grow_nb_filters:
254 |             nb_filter += growth_rate
255 | 
256 |     return concat_feat, nb_filter
257 | 
258 | 
259 | def get_loss(pred, label, l2_weight=0.0001):
260 |     diff = tf.square(tf.subtract(pred, label))
261 |     train_vars = tf.trainable_variables()
262 |     l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
263 |     loss = tf.reduce_mean(diff + l2_loss)
264 |     tf.summary.scalar('l2 loss', l2_loss)  # l2_loss is already scaled by l2_weight above
265 |     tf.summary.scalar('loss', loss)
266 | 
267 |     return loss
268 | 
269 | 
270 | def summary_scalar(pred, label):
271 |     thresholds = [5, 4, 3, 2, 1, 0.5]
272 |     angles = [float(t) / 180 * scipy.pi for t in thresholds]  # degrees -> radians
273 |     speeds = [float(t) / 20 for t in thresholds]              # km/h -> normalized speed
274 | 
275 |     for i in range(len(thresholds)):
276 |         scalar_angle = "angle(" + str(angles[i]) + ")"
277 |         scalar_speed = "speed(" + str(speeds[i]) + ")"
278 |         ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]  # compare in the labels' radian scale
279 |         ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]  # compare in the labels' normalized scale
280 |         ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
281 |         ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))
282 | 
283 |         tf.summary.scalar(scalar_angle, ac_angle)
284 |         tf.summary.scalar(scalar_speed, ac_speed)
285 | 
286 | 
287 | def resize(imgs):
288 |     batch_size = imgs.shape[0]
289 |     imgs_new = []
290 |     for j in range(batch_size):
291 |         img = imgs[j,:,:,:]
292 |         new = scipy.misc.imresize(img, (224, 224))
293 |         imgs_new.append(new)
294 |     imgs_new = np.stack(imgs_new, axis=0)
295 |     return imgs_new
296 | 
297 | 
298 | if __name__ == '__main__':
299 |     with tf.Graph().as_default():
300 |         imgs = tf.zeros((32, 224, 224, 3))
301 |         fmaps = tf.zeros((32, 224, 224, 3))
302 |         outputs = get_model([imgs, fmaps], tf.constant(True))
303 |         print(outputs)
304 | 
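The three `densenet169_*` model variants share one calling pattern: `placeholder_inputs()` → `get_model()` → `get_loss()`. Below is a minimal wiring sketch (not part of the repo) for the `_pm` variant above, assuming the TF 1.x graph/session setup used elsewhere in this project; the import path and the optimizer choice are illustrative assumptions:

```
import tensorflow as tf
import densenet169_pm as MODEL  # assumes models/ is on sys.path, as in evaluate.py

imgs_pl, fmaps_pl, labels_pl = MODEL.placeholder_inputs(batch_size=8)
is_training_pl = tf.placeholder(tf.bool, shape=())

pred = MODEL.get_model([imgs_pl, fmaps_pl], is_training_pl)  # Bx2: (speed, angle)
loss = MODEL.get_loss(pred, labels_pl)
train_op = tf.train.AdamOptimizer(1e-4).minimize(loss)       # optimizer choice is illustrative
```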
--------------------------------------------------------------------------------
/models/densenet169_pn.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | 
4 | import tensorflow as tf
5 | import scipy
6 | import numpy as np
7 | 
8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
9 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
10 | 
11 | import tf_util
12 | import pointnet
13 | from custom_layers import Scale
14 | from keras.layers import (Input, Dense, Convolution2D, MaxPooling2D,
15 |                           AveragePooling2D, GlobalAveragePooling2D,
16 |                           ZeroPadding2D, Dropout, Flatten, add,
17 |                           concatenate, Reshape, Activation)
18 | from keras.layers.normalization import BatchNormalization
19 | from keras.models import Model
20 | 
21 | 
22 | def placeholder_inputs(batch_size, img_rows=224, img_cols=224, points=16384, separately=False):
23 |     imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
24 |     pts_pl = tf.placeholder(tf.float32, shape=(batch_size, points, 3))
25 |     if separately:
26 |         speeds_pl = tf.placeholder(tf.float32, shape=(batch_size,))
27 |         angles_pl = tf.placeholder(tf.float32, shape=(batch_size,))
28 |         labels_pl = [speeds_pl, angles_pl]
29 |     else: labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))  # joint Bx2 (speed, angle) labels
30 |     return imgs_pl, pts_pl, labels_pl
31 | 
32 | 
33 | def get_densenet(img_rows, img_cols, nb_dense_block=4,
34 |                  growth_rate=32, nb_filter=64, reduction=0.5,
35 |                  dropout_rate=0.0, weight_decay=1e-4):
36 |     '''
37 |     DenseNet 169 Model for Keras
38 | 
39 |     Model Schema is based on
40 |     https://github.com/flyyufelix/DenseNet-Keras
41 | 
42 |     ImageNet Pretrained Weights
43 |     Theano: https://drive.google.com/open?id=0Byy2AcGyEVxfN0d3T1F1MXg0NlU
44 |     TensorFlow: https://drive.google.com/open?id=0Byy2AcGyEVxfSEc5UC1ROUFJdmM
45 | 
46 |     # Arguments
47 |         nb_dense_block: number of dense blocks to add to end
48 |         growth_rate: number of filters to add per dense block
49 |         nb_filter: initial number of filters
50 |         reduction: reduction factor of transition blocks.
51 |         dropout_rate: dropout rate
52 |         weight_decay: weight decay factor
53 |         classes: optional number of classes to classify images
54 |         weights_path: path to pre-trained weights
55 |     # Returns
56 |         A Keras model instance.
57 | ''' 58 | eps = 1.1e-5 59 | 60 | # compute compression factor 61 | compression = 1.0 - reduction 62 | 63 | # Handle Dimension Ordering for different backends 64 | img_input = Input(shape=(224, 224, 3), name='data') 65 | 66 | # From architecture for ImageNet (Table 1 in the paper) 67 | nb_filter = 64 68 | nb_layers = [6,12,32,32] # For DenseNet-169 69 | 70 | # Initial convolution 71 | x = ZeroPadding2D((3, 3), name='conv1_zeropadding')(img_input) 72 | x = Convolution2D(nb_filter, (7, 7), strides=(2, 2), name='conv1', use_bias=False)(x) 73 | x = BatchNormalization(epsilon=eps, axis=3, name='conv1_bn')(x) 74 | x = Scale(axis=3, name='conv1_scale')(x) 75 | x = Activation('relu', name='relu1')(x) 76 | x = ZeroPadding2D((1, 1), name='pool1_zeropadding')(x) 77 | x = MaxPooling2D((3, 3), strides=(2, 2), name='pool1')(x) 78 | 79 | # Add dense blocks 80 | for block_idx in range(nb_dense_block - 1): 81 | stage = block_idx+2 82 | x, nb_filter = dense_block(x, stage, nb_layers[block_idx], nb_filter, growth_rate, dropout_rate=dropout_rate, weight_decay=weight_decay) 83 | 84 | # Add transition_block 85 | x = transition_block(x, stage, nb_filter, compression=compression, dropout_rate=dropout_rate, weight_decay=weight_decay) 86 | nb_filter = int(nb_filter * compression) 87 | 88 | final_stage = stage + 1 89 | x, nb_filter = dense_block(x, final_stage, nb_layers[-1], nb_filter, growth_rate, dropout_rate=dropout_rate, weight_decay=weight_decay) 90 | 91 | x = BatchNormalization(epsilon=eps, axis=3, name='conv'+str(final_stage)+'_blk_bn')(x) 92 | x = Scale(axis=3, name='conv'+str(final_stage)+'_blk_scale')(x) 93 | x = Activation('relu', name='relu'+str(final_stage)+'_blk')(x) 94 | 95 | x_fc = GlobalAveragePooling2D(name='pool'+str(final_stage))(x) 96 | x_fc = Dense(1000, name='fc6')(x_fc) 97 | x_fc = Activation('softmax', name='prob')(x_fc) 98 | 99 | model = Model(img_input, x_fc, name='densenet') 100 | 101 | # Use pre-trained weights for Tensorflow backend 102 | weights_path = 'utils/weights/densenet169_weights_tf.h5' 103 | 104 | model.load_weights(weights_path, by_name=True) 105 | 106 | # Truncate and replace softmax layer for transfer learning 107 | # Cannot use model.layers.pop() since model is not of Sequential() type 108 | # The method below works since pre-trained weights are stored in layers but not in the model 109 | x_newfc = GlobalAveragePooling2D(name='pool'+str(final_stage))(x) 110 | 111 | x_newfc = Dense(256, name='fc7')(x_newfc) 112 | model = Model(img_input, x_newfc) 113 | 114 | return model 115 | 116 | 117 | def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False): 118 | """ Densenet169 regression model, input is BxWxHx3, output Bx2""" 119 | batch_size = net[0].get_shape()[0].value 120 | img_net, pt_net = net[0], net[1] 121 | 122 | img_net = get_densenet(299, 299)(img_net) 123 | with tf.variable_scope('pointnet'): 124 | pt_net = pointnet.get_model(pt_net, tf.constant(True)) 125 | net = tf.reshape(tf.stack([img_net, pt_net], axis=2), [batch_size, -1]) 126 | 127 | if not add_lstm: 128 | for i, dim in enumerate([256, 128, 16]): 129 | fc_scope = "fc" + str(i + 1) 130 | dp_scope = "dp" + str(i + 1) 131 | net = tf_util.fully_connected(net, dim, bn=True, 132 | is_training=is_training, 133 | scope=fc_scope, 134 | bn_decay=bn_decay) 135 | net = tf_util.dropout(net, keep_prob=0.7, 136 | is_training=is_training, 137 | scope=dp_scope) 138 | 139 | net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc4') 140 | else: 141 | fc_scope = "fc1" 142 | net = 
tf_util.fully_connected(net, 784, bn=True, 143 | is_training=is_training, 144 | scope=fc_scope, 145 | bn_decay=bn_decay) 146 | net = tf_util.dropout(net, keep_prob=0.7, 147 | is_training=is_training, 148 | scope="dp1") 149 | net = cnn_lstm_block(net) 150 | return net 151 | 152 | 153 | def cnn_lstm_block(input_tensor): 154 | lstm_in = tf.reshape(input_tensor, [-1, 28, 28]) 155 | lstm_out = tf_util.stacked_lstm(lstm_in, 156 | num_outputs=10, 157 | time_steps=28, 158 | scope="cnn_lstm") 159 | 160 | W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1)) 161 | b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1)) 162 | return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2) 163 | 164 | 165 | def conv_block(x, stage, branch, nb_filter, dropout_rate=None, weight_decay=1e-4): 166 | '''Apply BatchNorm, Relu, bottleneck 1x1 Conv2D, 3x3 Conv2D, and option dropout 167 | # Arguments 168 | x: input tensor 169 | stage: index for dense block 170 | branch: layer index within each dense block 171 | nb_filter: number of filters 172 | dropout_rate: dropout rate 173 | weight_decay: weight decay factor 174 | ''' 175 | eps = 1.1e-5 176 | conv_name_base = 'conv' + str(stage) + '_' + str(branch) 177 | relu_name_base = 'relu' + str(stage) + '_' + str(branch) 178 | 179 | # 1x1 Convolution (Bottleneck layer) 180 | inter_channel = nb_filter * 4 181 | x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_x1_bn')(x) 182 | x = Scale(axis=3, name=conv_name_base+'_x1_scale')(x) 183 | x = Activation('relu', name=relu_name_base+'_x1')(x) 184 | x = Convolution2D(inter_channel, (1, 1), name=conv_name_base+'_x1', use_bias=False)(x) 185 | 186 | if dropout_rate: 187 | x = Dropout(dropout_rate)(x) 188 | 189 | # 3x3 Convolution 190 | x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_x2_bn')(x) 191 | x = Scale(axis=3, name=conv_name_base+'_x2_scale')(x) 192 | x = Activation('relu', name=relu_name_base+'_x2')(x) 193 | x = ZeroPadding2D((1, 1), name=conv_name_base+'_x2_zeropadding')(x) 194 | x = Convolution2D(nb_filter, (3, 3), name=conv_name_base+'_x2', use_bias=False)(x) 195 | 196 | if dropout_rate: 197 | x = Dropout(dropout_rate)(x) 198 | 199 | return x 200 | 201 | 202 | def transition_block(x, stage, nb_filter, compression=1.0, dropout_rate=None, weight_decay=1E-4): 203 | ''' Apply BatchNorm, 1x1 Convolution, averagePooling, optional compression, dropout 204 | # Arguments 205 | x: input tensor 206 | stage: index for dense block 207 | nb_filter: number of filters 208 | compression: calculated as 1 - reduction. Reduces the number of feature maps in the transition block. 
209 |             dropout_rate: dropout rate
210 |             weight_decay: weight decay factor
211 |     '''
212 | 
213 |     eps = 1.1e-5
214 |     conv_name_base = 'conv' + str(stage) + '_blk'
215 |     relu_name_base = 'relu' + str(stage) + '_blk'
216 |     pool_name_base = 'pool' + str(stage)
217 | 
218 |     x = BatchNormalization(epsilon=eps, axis=3, name=conv_name_base+'_bn')(x)
219 |     x = Scale(axis=3, name=conv_name_base+'_scale')(x)
220 |     x = Activation('relu', name=relu_name_base)(x)
221 |     x = Convolution2D(int(nb_filter * compression), (1, 1), name=conv_name_base, use_bias=False)(x)
222 | 
223 |     if dropout_rate:
224 |         x = Dropout(dropout_rate)(x)
225 | 
226 |     x = AveragePooling2D((2, 2), strides=(2, 2), name=pool_name_base)(x)
227 | 
228 |     return x
229 | 
230 | 
231 | def dense_block(x, stage, nb_layers, nb_filter, growth_rate, dropout_rate=None, weight_decay=1e-4, grow_nb_filters=True):
232 |     ''' Build a dense_block where the output of each conv_block is fed to subsequent ones
233 |         # Arguments
234 |             x: input tensor
235 |             stage: index for dense block
236 |             nb_layers: the number of layers of conv_block to append to the model.
237 |             nb_filter: number of filters
238 |             growth_rate: growth rate
239 |             dropout_rate: dropout rate
240 |             weight_decay: weight decay factor
241 |             grow_nb_filters: flag to decide to allow number of filters to grow
242 |     '''
243 | 
244 |     eps = 1.1e-5
245 |     concat_feat = x
246 | 
247 |     for i in range(nb_layers):
248 |         branch = i+1
249 |         x = conv_block(concat_feat, stage, branch, growth_rate, dropout_rate, weight_decay)
250 |         concat_feat = concatenate([concat_feat, x], axis=3, name='concat_'+str(stage)+'_'+str(branch))
251 | 
252 |         if grow_nb_filters:
253 |             nb_filter += growth_rate
254 | 
255 |     return concat_feat, nb_filter
256 | 
257 | 
258 | def get_loss(pred, label, l2_weight=0.0001):
259 |     diff = tf.square(tf.subtract(pred, label))
260 |     train_vars = tf.trainable_variables()
261 |     l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
262 |     loss = tf.reduce_mean(diff + l2_loss)
263 |     tf.summary.scalar('l2_loss', l2_loss)  # l2_loss is already scaled by l2_weight above
264 |     tf.summary.scalar('loss', loss)
265 | 
266 |     return loss
267 | 
268 | 
269 | def summary_scalar(pred, label):
270 |     thresholds = [5, 4, 3, 2, 1, 0.5]
271 |     angles = [float(t) / 180 * scipy.pi for t in thresholds]  # degree thresholds converted to radians
272 |     speeds = [float(t) / 20 for t in thresholds]  # thresholds on the 1/20-scaled speed axis
273 | 
274 |     for i in range(len(thresholds)):
275 |         scalar_angle = "angle(" + str(angles[i]) + ")"
276 |         scalar_speed = "speed(" + str(speeds[i]) + ")"
277 |         ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]
278 |         ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]
279 |         ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
280 |         ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))
281 | 
282 |         tf.summary.scalar(scalar_angle, ac_angle)
283 |         tf.summary.scalar(scalar_speed, ac_speed)
284 | 
285 | 
286 | def resize(imgs):
287 |     batch_size = imgs.shape[0]
288 |     imgs_new = []
289 |     for j in range(batch_size):
290 |         img = imgs[j, :, :, :]
291 |         new = scipy.misc.imresize(img, (224, 224))
292 |         imgs_new.append(new)
293 |     imgs_new = np.stack(imgs_new, axis=0)
294 |     return imgs_new
295 | 
296 | 
297 | if __name__ == '__main__':
298 |     with tf.Graph().as_default():
299 |         imgs = tf.zeros((32, 224, 224, 3))
300 |         pts = tf.zeros((32, 16384, 3))
301 |         outputs = get_model([imgs, pts], tf.constant(True))
302 |         print(outputs)
303 | 
--------------------------------------------------------------------------------
/models/inception_v4_io.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | 
4 | import tensorflow as tf
5 | import scipy
6 | import numpy as np
7 | 
8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
9 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
10 | 
11 | import tf_util
12 | from keras.layers import Input, Dense, Convolution2D, MaxPooling2D, AveragePooling2D, ZeroPadding2D, Dropout, Flatten, add, concatenate, Reshape, Activation
13 | from keras.layers.normalization import BatchNormalization
14 | from keras.models import Model
15 | 
16 | 
17 | def placeholder_inputs(batch_size, img_rows=299, img_cols=299, separately=False):
18 |     imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
19 |     if separately:
20 |         speeds_pl = tf.placeholder(tf.float32, shape=(batch_size,))
21 |         angles_pl = tf.placeholder(tf.float32, shape=(batch_size,))
22 |         labels_pl = [speeds_pl, angles_pl]
23 |     else: labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))  # joint Bx2 (speed, angle) labels
24 |     return imgs_pl, labels_pl
25 | 
26 | 
27 | def get_inception(img_rows=299, img_cols=299, dropout_keep_prob=0.2, separately=False):
28 |     '''
29 |     Inception V4 Model for Keras
30 | 
31 |     Model Schema is based on
32 |     https://github.com/kentsommer/keras-inceptionV4
33 | 
34 |     ImageNet Pretrained Weights
35 |     Theano: https://github.com/kentsommer/keras-inceptionV4/releases/download/2.0/inception-v4_weights_th_dim_ordering_th_kernels.h5
36 |     TensorFlow: https://github.com/kentsommer/keras-inceptionV4/releases/download/2.0/inception-v4_weights_tf_dim_ordering_tf_kernels.h5
37 | 
38 |     Parameters:
39 |         img_rows, img_cols - resolution of inputs
40 |         channel - 1 for grayscale, 3 for color
41 |         num_classes - number of class labels for our classification task
42 |     '''
43 | 
44 |     # Input Shape is 299 x 299 x 3 (tf)
45 |     img_input = Input(shape=(img_rows, img_cols, 3), name='data')
46 | 
47 |     # Make inception base
48 |     net = inception_v4_base(img_input)
49 | 
50 |     # Final pooling and prediction
51 | 
52 |     # 8 x 8 x 1536
53 |     net_old = AveragePooling2D((8,8), padding='valid')(net)
54 | 
55 |     # 1 x 1 x 1536
56 |     net_old = Dropout(dropout_keep_prob)(net_old)
57 |     net_old = Flatten()(net_old)
58 | 
59 |     # 1536
60 |     predictions = Dense(units=1001, activation='softmax')(net_old)
61 | 
62 |     model = Model(img_input, predictions, name='inception_v4')
63 | 
64 |     weights_path = 'utils/weights/inception-v4_weights_tf.h5'
65 |     assert (os.path.exists(weights_path))
66 |     model.load_weights(weights_path, by_name=True)
67 | 
68 |     # Truncate and replace softmax layer for transfer learning
69 |     # Cannot use model.layers.pop() since model is not of Sequential() type
70 |     # The method below works since pre-trained weights are stored in layers but not in the model
71 |     net_ft = AveragePooling2D((8,8), padding='valid')(net)  # padding, not the old Keras-1 border_mode kwarg
72 |     net_ft = Dropout(dropout_keep_prob)(net_ft)
73 |     net_ft = Flatten()(net_ft)
74 |     net = Dense(256, name='fc_mid')(net_ft)
75 | 
76 |     model = Model(img_input, net, name='inception_v4')
77 |     return model
78 | 
79 | 
80 | def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False):
81 |     """ Inception_V4 regression model, input is BxHxWx3, output Bx2"""
82 |     net = get_inception(299, 299)(net)
83 | 
84 |     if not add_lstm:
85 |         net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc_final')
86 | 
87 |     else:
88 |         net = tf_util.fully_connected(net, 784, bn=True,
89 |                                       is_training=is_training,
90 |                                       scope='fc_lstm',
91 |                                       bn_decay=bn_decay)
92 |         net = tf_util.dropout(net, keep_prob=0.7,
93
| is_training=is_training, 94 | scope="dp1") 95 | net = cnn_lstm_block(net) 96 | 97 | return net 98 | 99 | 100 | def cnn_lstm_block(input_tensor): 101 | lstm_in = tf.reshape(input_tensor, [-1, 28, 28]) 102 | lstm_out = tf_util.stacked_lstm(lstm_in, 103 | num_outputs=10, 104 | time_steps=28, 105 | scope="cnn_lstm") 106 | 107 | W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1)) 108 | b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1)) 109 | return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2) 110 | 111 | 112 | def conv2d_bn(x, nb_filter, nb_row, nb_col, 113 | border_mode='same', subsample=(1, 1), bias=False): 114 | """ 115 | Utility function to apply conv + BN. 116 | (Slightly modified from https://github.com/fchollet/keras/blob/master/keras/applications/inception_v3.py) 117 | """ 118 | channel_axis = -1 119 | x = Convolution2D(nb_filter, (nb_row, nb_col), 120 | strides=subsample, 121 | padding=border_mode, 122 | use_bias=bias)(x) 123 | x = BatchNormalization(axis=channel_axis)(x) 124 | x = Activation('relu')(x) 125 | return x 126 | 127 | def block_inception_a(input): 128 | channel_axis = -1 129 | 130 | branch_0 = conv2d_bn(input, 96, 1, 1) 131 | 132 | branch_1 = conv2d_bn(input, 64, 1, 1) 133 | branch_1 = conv2d_bn(branch_1, 96, 3, 3) 134 | 135 | branch_2 = conv2d_bn(input, 64, 1, 1) 136 | branch_2 = conv2d_bn(branch_2, 96, 3, 3) 137 | branch_2 = conv2d_bn(branch_2, 96, 3, 3) 138 | 139 | branch_3 = AveragePooling2D((3,3), strides=(1,1), padding='same')(input) 140 | branch_3 = conv2d_bn(branch_3, 96, 1, 1) 141 | 142 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis) 143 | return x 144 | 145 | 146 | def block_reduction_a(input): 147 | channel_axis = -1 148 | 149 | branch_0 = conv2d_bn(input, 384, 3, 3, subsample=(2,2), border_mode='valid') 150 | 151 | branch_1 = conv2d_bn(input, 192, 1, 1) 152 | branch_1 = conv2d_bn(branch_1, 224, 3, 3) 153 | branch_1 = conv2d_bn(branch_1, 256, 3, 3, subsample=(2,2), border_mode='valid') 154 | 155 | branch_2 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(input) 156 | 157 | x = concatenate([branch_0, branch_1, branch_2], axis=channel_axis) 158 | return x 159 | 160 | 161 | def block_inception_b(input): 162 | channel_axis = -1 163 | 164 | branch_0 = conv2d_bn(input, 384, 1, 1) 165 | 166 | branch_1 = conv2d_bn(input, 192, 1, 1) 167 | branch_1 = conv2d_bn(branch_1, 224, 1, 7) 168 | branch_1 = conv2d_bn(branch_1, 256, 7, 1) 169 | 170 | branch_2 = conv2d_bn(input, 192, 1, 1) 171 | branch_2 = conv2d_bn(branch_2, 192, 7, 1) 172 | branch_2 = conv2d_bn(branch_2, 224, 1, 7) 173 | branch_2 = conv2d_bn(branch_2, 224, 7, 1) 174 | branch_2 = conv2d_bn(branch_2, 256, 1, 7) 175 | 176 | branch_3 = AveragePooling2D((3,3), strides=(1,1), padding='same')(input) 177 | branch_3 = conv2d_bn(branch_3, 128, 1, 1) 178 | 179 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis) 180 | return x 181 | 182 | 183 | def block_reduction_b(input): 184 | channel_axis = -1 185 | 186 | branch_0 = conv2d_bn(input, 192, 1, 1) 187 | branch_0 = conv2d_bn(branch_0, 192, 3, 3, subsample=(2, 2), border_mode='valid') 188 | 189 | branch_1 = conv2d_bn(input, 256, 1, 1) 190 | branch_1 = conv2d_bn(branch_1, 256, 1, 7) 191 | branch_1 = conv2d_bn(branch_1, 320, 7, 1) 192 | branch_1 = conv2d_bn(branch_1, 320, 3, 3, subsample=(2,2), border_mode='valid') 193 | 194 | branch_2 = MaxPooling2D((3, 3), strides=(2, 2), padding='valid')(input) 195 | 196 | x = concatenate([branch_0, branch_1, branch_2], axis=channel_axis) 197 
| return x 198 | 199 | 200 | def block_inception_c(input): 201 | channel_axis = -1 202 | 203 | branch_0 = conv2d_bn(input, 256, 1, 1) 204 | 205 | branch_1 = conv2d_bn(input, 384, 1, 1) 206 | branch_10 = conv2d_bn(branch_1, 256, 1, 3) 207 | branch_11 = conv2d_bn(branch_1, 256, 3, 1) 208 | branch_1 = concatenate([branch_10, branch_11], axis=channel_axis) 209 | 210 | 211 | branch_2 = conv2d_bn(input, 384, 1, 1) 212 | branch_2 = conv2d_bn(branch_2, 448, 3, 1) 213 | branch_2 = conv2d_bn(branch_2, 512, 1, 3) 214 | branch_20 = conv2d_bn(branch_2, 256, 1, 3) 215 | branch_21 = conv2d_bn(branch_2, 256, 3, 1) 216 | branch_2 = concatenate([branch_20, branch_21], axis=channel_axis) 217 | 218 | branch_3 = AveragePooling2D((3, 3), strides=(1, 1), padding='same')(input) 219 | branch_3 = conv2d_bn(branch_3, 256, 1, 1) 220 | 221 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis) 222 | return x 223 | 224 | 225 | def inception_v4_base(input): 226 | channel_axis = -1 227 | 228 | # Input Shape is 299 x 299 x 3 (th) or 3 x 299 x 299 (th) 229 | net = conv2d_bn(input, 32, 3, 3, subsample=(2,2), border_mode='valid') 230 | net = conv2d_bn(net, 32, 3, 3, border_mode='valid') 231 | net = conv2d_bn(net, 64, 3, 3) 232 | 233 | branch_0 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(net) 234 | 235 | branch_1 = conv2d_bn(net, 96, 3, 3, subsample=(2,2), border_mode='valid') 236 | 237 | net = concatenate([branch_0, branch_1], axis=channel_axis) 238 | 239 | branch_0 = conv2d_bn(net, 64, 1, 1) 240 | branch_0 = conv2d_bn(branch_0, 96, 3, 3, border_mode='valid') 241 | 242 | branch_1 = conv2d_bn(net, 64, 1, 1) 243 | branch_1 = conv2d_bn(branch_1, 64, 1, 7) 244 | branch_1 = conv2d_bn(branch_1, 64, 7, 1) 245 | branch_1 = conv2d_bn(branch_1, 96, 3, 3, border_mode='valid') 246 | 247 | net = concatenate([branch_0, branch_1], axis=channel_axis) 248 | 249 | branch_0 = conv2d_bn(net, 192, 3, 3, subsample=(2,2), border_mode='valid') 250 | branch_1 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(net) 251 | 252 | net = concatenate([branch_0, branch_1], axis=channel_axis) 253 | 254 | # 35 x 35 x 384 255 | # 4 x Inception-A blocks 256 | for idx in xrange(4): 257 | net = block_inception_a(net) 258 | 259 | # 35 x 35 x 384 260 | # Reduction-A block 261 | net = block_reduction_a(net) 262 | 263 | # 17 x 17 x 1024 264 | # 7 x Inception-B blocks 265 | for idx in xrange(7): 266 | net = block_inception_b(net) 267 | 268 | # 17 x 17 x 1024 269 | # Reduction-B block 270 | net = block_reduction_b(net) 271 | 272 | # 8 x 8 x 1536 273 | # 3 x Inception-C blocks 274 | for idx in xrange(3): 275 | net = block_inception_c(net) 276 | 277 | return net 278 | 279 | 280 | def get_loss(pred, label, l2_weight=0.0001): 281 | diff = tf.square(tf.subtract(pred, label)) 282 | train_vars = tf.trainable_variables() 283 | l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight 284 | loss = tf.reduce_mean(diff + l2_loss) 285 | tf.summary.scalar('l2 loss', l2_loss * l2_weight) 286 | tf.summary.scalar('loss', loss) 287 | 288 | return loss 289 | 290 | 291 | def summary_scalar(pred, label): 292 | threholds = [5, 4, 3, 2, 1, 0.5] 293 | angles = [float(t) / 180 * scipy.pi for t in threholds] 294 | speeds = [float(t) / 20 for t in threholds] 295 | 296 | for i in range(len(threholds)): 297 | scalar_angle = "angle(" + str(angles[i]) + ")" 298 | scalar_speed = "speed(" + str(speeds[i]) + ")" 299 | ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < threholds[i] 300 | ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) 
< threholds[i] 301 | ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32)) 302 | ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32)) 303 | 304 | tf.summary.scalar(scalar_angle, ac_angle) 305 | tf.summary.scalar(scalar_speed, ac_speed) 306 | 307 | 308 | def resize(imgs): 309 | batch_size = imgs.shape[0] 310 | imgs_new = [] 311 | for j in range(batch_size): 312 | img = imgs[j,:,:,:] 313 | new = scipy.misc.imresize(img, (299, 299)) 314 | imgs_new.append(new) 315 | imgs_new = np.stack(imgs_new, axis=0) 316 | return imgs_new 317 | 318 | 319 | if __name__ == '__main__': 320 | with tf.Graph().as_default(): 321 | inputs = tf.zeros((32, 224, 224, 3)) 322 | outputs = get_model(inputs, tf.constant(True)) 323 | print(outputs) 324 | 325 | -------------------------------------------------------------------------------- /models/inception_v4_pm.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | 4 | import tensorflow as tf 5 | import scipy 6 | import numpy as np 7 | 8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__)) 9 | sys.path.append(os.path.join(BASE_DIR, '../utils')) 10 | 11 | import tf_util 12 | from keras.layers import Input, Dense, Convolution2D, MaxPooling2D, AveragePooling2D, ZeroPadding2D, Dropout, Flatten, add, concatenate, Reshape, Activation 13 | from keras.layers.normalization import BatchNormalization 14 | from keras.models import Model 15 | 16 | from keras import backend as K 17 | K.set_learning_phase(1) #set learning phase 18 | 19 | 20 | def placeholder_inputs(batch_size, img_rows=299, img_cols=299, points=16384, separately=False): 21 | imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3)) 22 | fmaps_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3)) 23 | if separately: 24 | speeds_pl = tf.placeholder(tf.float32, shape=(batch_size)) 25 | angles_pl = tf.placeholder(tf.float32, shape=(batch_size)) 26 | labels_pl = [speeds_pl, angles_pl] 27 | labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2)) 28 | return imgs_pl, fmaps_pl, labels_pl 29 | 30 | 31 | def get_inception(img_rows=299, img_cols=299, dropout_keep_prob=0.2, separately=False): 32 | ''' 33 | Inception V4 Model for Keras 34 | 35 | Model Schema is based on 36 | https://github.com/kentsommer/keras-inceptionV4 37 | 38 | ImageNet Pretrained Weights 39 | Theano: https://github.com/kentsommer/keras-inceptionV4/releases/download/2.0/inception-v4_weights_th_dim_ordering_th_kernels.h5 40 | TensorFlow: https://github.com/kentsommer/keras-inceptionV4/releases/download/2.0/inception-v4_weights_tf_dim_ordering_tf_kernels.h5 41 | 42 | Parameters: 43 | img_rows, img_cols - resolution of inputs 44 | channel - 1 for grayscale, 3 for color 45 | num_classes - number of class labels for our classification task 46 | ''' 47 | 48 | # Input Shape is 299 x 299 x 3 (tf) 49 | img_input = Input(shape=(img_rows, img_cols, 3), name='data') 50 | 51 | # Make inception base 52 | net = inception_v4_base(img_input) 53 | 54 | # Final pooling and prediction 55 | 56 | # 8 x 8 x 1536 57 | net_old = AveragePooling2D((8,8), padding='valid')(net) 58 | 59 | # 1 x 1 x 1536 60 | net_old = Dropout(dropout_keep_prob)(net_old) 61 | net_old = Flatten()(net_old) 62 | 63 | # 1536 64 | predictions = Dense(units=1001, activation='softmax')(net_old) 65 | 66 | model = Model(img_input, predictions, name='inception_v4') 67 | 68 | weights_path = 'utils/weights/inception-v4_weights_tf.h5' 69 | assert (os.path.exists(weights_path)) 70 | 
model.load_weights(weights_path, by_name=True) 71 | 72 | # Truncate and replace softmax layer for transfer learning 73 | # Cannot use model.layers.pop() since model is not of Sequential() type 74 | # The method below works since pre-trained weights are stored in layers but not in the model 75 | net_ft = AveragePooling2D((8,8), border_mode='valid')(net) 76 | net_ft = Dropout(dropout_keep_prob)(net_ft) 77 | net_ft = Flatten()(net_ft) 78 | net = Dense(256, name='fc_mid')(net_ft) 79 | 80 | model = Model(img_input, net, name='inception_v4') 81 | return model 82 | 83 | 84 | def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False): 85 | """ Inception_V4 regression model, input is BxWxHx3, output Bx2""" 86 | batch_size = net[0].get_shape()[0].value 87 | img_net, fmap_net = net[0], net[1] 88 | 89 | img_net = get_inception(299, 299)(img_net) 90 | fmap_net = get_inception(299, 299)(fmap_net) 91 | 92 | net = tf.reshape(tf.stack([img_net, fmap_net]), [batch_size, -1]) 93 | 94 | if not add_lstm: 95 | for i, dim in enumerate([256, 128, 16]): 96 | fc_scope = "fc" + str(i + 1) 97 | dp_scope = "dp" + str(i + 1) 98 | net = tf_util.fully_connected(net, dim, bn=True, 99 | is_training=is_training, 100 | scope=fc_scope, 101 | bn_decay=bn_decay) 102 | net = tf_util.dropout(net, keep_prob=0.7, 103 | is_training=is_training, 104 | scope=dp_scope) 105 | net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc4') 106 | else: 107 | fc_scope = "fc1" 108 | net = tf_util.fully_connected(net, 784, bn=True, 109 | is_training=is_training, 110 | scope=fc_scope, 111 | bn_decay=bn_decay) 112 | net = tf_util.dropout(net, keep_prob=0.7, 113 | is_training=is_training, 114 | scope="dp1") 115 | net = cnn_lstm_block(net) 116 | return net 117 | 118 | 119 | def cnn_lstm_block(input_tensor): 120 | lstm_in = tf.reshape(input_tensor, [-1, 28, 28]) 121 | lstm_out = tf_util.stacked_lstm(lstm_in, 122 | num_outputs=10, 123 | time_steps=28, 124 | scope="cnn_lstm") 125 | 126 | W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1)) 127 | b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1)) 128 | return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2) 129 | 130 | 131 | def conv2d_bn(x, nb_filter, nb_row, nb_col, 132 | border_mode='same', subsample=(1, 1), bias=False): 133 | """ 134 | Utility function to apply conv + BN. 
135 | (Slightly modified from https://github.com/fchollet/keras/blob/master/keras/applications/inception_v3.py) 136 | """ 137 | channel_axis = -1 138 | x = Convolution2D(nb_filter, (nb_row, nb_col), 139 | strides=subsample, 140 | padding=border_mode, 141 | use_bias=bias)(x) 142 | x = BatchNormalization(axis=channel_axis)(x) 143 | x = Activation('relu')(x) 144 | return x 145 | 146 | def block_inception_a(input): 147 | channel_axis = -1 148 | 149 | branch_0 = conv2d_bn(input, 96, 1, 1) 150 | 151 | branch_1 = conv2d_bn(input, 64, 1, 1) 152 | branch_1 = conv2d_bn(branch_1, 96, 3, 3) 153 | 154 | branch_2 = conv2d_bn(input, 64, 1, 1) 155 | branch_2 = conv2d_bn(branch_2, 96, 3, 3) 156 | branch_2 = conv2d_bn(branch_2, 96, 3, 3) 157 | 158 | branch_3 = AveragePooling2D((3,3), strides=(1,1), padding='same')(input) 159 | branch_3 = conv2d_bn(branch_3, 96, 1, 1) 160 | 161 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis) 162 | return x 163 | 164 | 165 | def block_reduction_a(input): 166 | channel_axis = -1 167 | 168 | branch_0 = conv2d_bn(input, 384, 3, 3, subsample=(2,2), border_mode='valid') 169 | 170 | branch_1 = conv2d_bn(input, 192, 1, 1) 171 | branch_1 = conv2d_bn(branch_1, 224, 3, 3) 172 | branch_1 = conv2d_bn(branch_1, 256, 3, 3, subsample=(2,2), border_mode='valid') 173 | 174 | branch_2 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(input) 175 | 176 | x = concatenate([branch_0, branch_1, branch_2], axis=channel_axis) 177 | return x 178 | 179 | 180 | def block_inception_b(input): 181 | channel_axis = -1 182 | 183 | branch_0 = conv2d_bn(input, 384, 1, 1) 184 | 185 | branch_1 = conv2d_bn(input, 192, 1, 1) 186 | branch_1 = conv2d_bn(branch_1, 224, 1, 7) 187 | branch_1 = conv2d_bn(branch_1, 256, 7, 1) 188 | 189 | branch_2 = conv2d_bn(input, 192, 1, 1) 190 | branch_2 = conv2d_bn(branch_2, 192, 7, 1) 191 | branch_2 = conv2d_bn(branch_2, 224, 1, 7) 192 | branch_2 = conv2d_bn(branch_2, 224, 7, 1) 193 | branch_2 = conv2d_bn(branch_2, 256, 1, 7) 194 | 195 | branch_3 = AveragePooling2D((3,3), strides=(1,1), padding='same')(input) 196 | branch_3 = conv2d_bn(branch_3, 128, 1, 1) 197 | 198 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis) 199 | return x 200 | 201 | 202 | def block_reduction_b(input): 203 | channel_axis = -1 204 | 205 | branch_0 = conv2d_bn(input, 192, 1, 1) 206 | branch_0 = conv2d_bn(branch_0, 192, 3, 3, subsample=(2, 2), border_mode='valid') 207 | 208 | branch_1 = conv2d_bn(input, 256, 1, 1) 209 | branch_1 = conv2d_bn(branch_1, 256, 1, 7) 210 | branch_1 = conv2d_bn(branch_1, 320, 7, 1) 211 | branch_1 = conv2d_bn(branch_1, 320, 3, 3, subsample=(2,2), border_mode='valid') 212 | 213 | branch_2 = MaxPooling2D((3, 3), strides=(2, 2), padding='valid')(input) 214 | 215 | x = concatenate([branch_0, branch_1, branch_2], axis=channel_axis) 216 | return x 217 | 218 | 219 | def block_inception_c(input): 220 | channel_axis = -1 221 | 222 | branch_0 = conv2d_bn(input, 256, 1, 1) 223 | 224 | branch_1 = conv2d_bn(input, 384, 1, 1) 225 | branch_10 = conv2d_bn(branch_1, 256, 1, 3) 226 | branch_11 = conv2d_bn(branch_1, 256, 3, 1) 227 | branch_1 = concatenate([branch_10, branch_11], axis=channel_axis) 228 | 229 | 230 | branch_2 = conv2d_bn(input, 384, 1, 1) 231 | branch_2 = conv2d_bn(branch_2, 448, 3, 1) 232 | branch_2 = conv2d_bn(branch_2, 512, 1, 3) 233 | branch_20 = conv2d_bn(branch_2, 256, 1, 3) 234 | branch_21 = conv2d_bn(branch_2, 256, 3, 1) 235 | branch_2 = concatenate([branch_20, branch_21], axis=channel_axis) 236 | 237 | branch_3 = 
AveragePooling2D((3, 3), strides=(1, 1), padding='same')(input) 238 | branch_3 = conv2d_bn(branch_3, 256, 1, 1) 239 | 240 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis) 241 | return x 242 | 243 | 244 | def inception_v4_base(input): 245 | channel_axis = -1 246 | 247 | # Input Shape is 299 x 299 x 3 (th) or 3 x 299 x 299 (th) 248 | net = conv2d_bn(input, 32, 3, 3, subsample=(2,2), border_mode='valid') 249 | net = conv2d_bn(net, 32, 3, 3, border_mode='valid') 250 | net = conv2d_bn(net, 64, 3, 3) 251 | 252 | branch_0 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(net) 253 | 254 | branch_1 = conv2d_bn(net, 96, 3, 3, subsample=(2,2), border_mode='valid') 255 | 256 | net = concatenate([branch_0, branch_1], axis=channel_axis) 257 | 258 | branch_0 = conv2d_bn(net, 64, 1, 1) 259 | branch_0 = conv2d_bn(branch_0, 96, 3, 3, border_mode='valid') 260 | 261 | branch_1 = conv2d_bn(net, 64, 1, 1) 262 | branch_1 = conv2d_bn(branch_1, 64, 1, 7) 263 | branch_1 = conv2d_bn(branch_1, 64, 7, 1) 264 | branch_1 = conv2d_bn(branch_1, 96, 3, 3, border_mode='valid') 265 | 266 | net = concatenate([branch_0, branch_1], axis=channel_axis) 267 | 268 | branch_0 = conv2d_bn(net, 192, 3, 3, subsample=(2,2), border_mode='valid') 269 | branch_1 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(net) 270 | 271 | net = concatenate([branch_0, branch_1], axis=channel_axis) 272 | 273 | # 35 x 35 x 384 274 | # 4 x Inception-A blocks 275 | for idx in xrange(4): 276 | net = block_inception_a(net) 277 | 278 | # 35 x 35 x 384 279 | # Reduction-A block 280 | net = block_reduction_a(net) 281 | 282 | # 17 x 17 x 1024 283 | # 7 x Inception-B blocks 284 | for idx in xrange(7): 285 | net = block_inception_b(net) 286 | 287 | # 17 x 17 x 1024 288 | # Reduction-B block 289 | net = block_reduction_b(net) 290 | 291 | # 8 x 8 x 1536 292 | # 3 x Inception-C blocks 293 | for idx in xrange(3): 294 | net = block_inception_c(net) 295 | 296 | return net 297 | 298 | 299 | def get_loss(pred, label, l2_weight=0.0001): 300 | diff = tf.square(tf.subtract(pred, label)) 301 | train_vars = tf.trainable_variables() 302 | l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight 303 | loss = tf.reduce_mean(diff + l2_loss) 304 | tf.summary.scalar('l2 loss', l2_loss * l2_weight) 305 | tf.summary.scalar('loss', loss) 306 | 307 | return loss 308 | 309 | 310 | def summary_scalar(pred, label): 311 | threholds = [5, 4, 3, 2, 1, 0.5] 312 | angles = [float(t) / 180 * scipy.pi for t in threholds] 313 | speeds = [float(t) / 20 for t in threholds] 314 | 315 | for i in range(len(threholds)): 316 | scalar_angle = "angle(" + str(angles[i]) + ")" 317 | scalar_speed = "speed(" + str(speeds[i]) + ")" 318 | ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < threholds[i] 319 | ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < threholds[i] 320 | ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32)) 321 | ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32)) 322 | 323 | tf.summary.scalar(scalar_angle, ac_angle) 324 | tf.summary.scalar(scalar_speed, ac_speed) 325 | 326 | 327 | def resize(imgs): 328 | batch_size = imgs.shape[0] 329 | imgs_new = [] 330 | for j in range(batch_size): 331 | img = imgs[j,:,:,:] 332 | new = scipy.misc.imresize(img, (299, 299)) 333 | imgs_new.append(new) 334 | imgs_new = np.stack(imgs_new, axis=0) 335 | return imgs_new 336 | 337 | 338 | if __name__ == '__main__': 339 | with tf.Graph().as_default(): 340 | imgs = tf.zeros((32, 224, 224, 3)) 341 | fmaps = tf.zeros((32, 224, 224, 3)) 
342 | outputs = get_model([imgs, fmaps], tf.constant(True)) 343 | print(outputs) 344 | 345 | -------------------------------------------------------------------------------- /models/inception_v4_pn.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | 4 | import tensorflow as tf 5 | import scipy 6 | import numpy as np 7 | 8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__)) 9 | sys.path.append(os.path.join(BASE_DIR, '../utils')) 10 | 11 | import tf_util 12 | import pointnet 13 | from keras.layers import Input, Dense, Convolution2D, MaxPooling2D, AveragePooling2D, ZeroPadding2D, Dropout, Flatten, add, concatenate, Reshape, Activation 14 | from keras.layers.normalization import BatchNormalization 15 | from keras.models import Model 16 | 17 | from keras import backend as K 18 | K.set_learning_phase(1) #set learning phase 19 | 20 | 21 | def placeholder_inputs(batch_size, img_rows=299, img_cols=299, points=16384, separately=False): 22 | imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3)) 23 | pts_pl = tf.placeholder(tf.float32, shape=(batch_size, points, 3)) 24 | if separately: 25 | speeds_pl = tf.placeholder(tf.float32, shape=(batch_size)) 26 | angles_pl = tf.placeholder(tf.float32, shape=(batch_size)) 27 | labels_pl = [speeds_pl, angles_pl] 28 | labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2)) 29 | return imgs_pl, pts_pl, labels_pl 30 | 31 | 32 | def get_inception(img_rows=299, img_cols=299, dropout_keep_prob=0.2, separately=False): 33 | ''' 34 | Inception V4 Model for Keras 35 | 36 | Model Schema is based on 37 | https://github.com/kentsommer/keras-inceptionV4 38 | 39 | ImageNet Pretrained Weights 40 | Theano: https://github.com/kentsommer/keras-inceptionV4/releases/download/2.0/inception-v4_weights_th_dim_ordering_th_kernels.h5 41 | TensorFlow: https://github.com/kentsommer/keras-inceptionV4/releases/download/2.0/inception-v4_weights_tf_dim_ordering_tf_kernels.h5 42 | 43 | Parameters: 44 | img_rows, img_cols - resolution of inputs 45 | channel - 1 for grayscale, 3 for color 46 | num_classes - number of class labels for our classification task 47 | ''' 48 | 49 | # Input Shape is 299 x 299 x 3 (tf) 50 | img_input = Input(shape=(img_rows, img_cols, 3), name='data') 51 | 52 | # Make inception base 53 | net = inception_v4_base(img_input) 54 | 55 | # Final pooling and prediction 56 | 57 | # 8 x 8 x 1536 58 | net_old = AveragePooling2D((8,8), padding='valid')(net) 59 | 60 | # 1 x 1 x 1536 61 | net_old = Dropout(dropout_keep_prob)(net_old) 62 | net_old = Flatten()(net_old) 63 | 64 | # 1536 65 | predictions = Dense(units=1001, activation='softmax')(net_old) 66 | 67 | model = Model(img_input, predictions, name='inception_v4') 68 | 69 | weights_path = 'utils/weights/inception-v4_weights_tf.h5' 70 | assert (os.path.exists(weights_path)) 71 | model.load_weights(weights_path, by_name=True) 72 | 73 | # Truncate and replace softmax layer for transfer learning 74 | # Cannot use model.layers.pop() since model is not of Sequential() type 75 | # The method below works since pre-trained weights are stored in layers but not in the model 76 | net_ft = AveragePooling2D((8,8), border_mode='valid')(net) 77 | net_ft = Dropout(dropout_keep_prob)(net_ft) 78 | net_ft = Flatten()(net_ft) 79 | net = Dense(256, name='fc_mid')(net_ft) 80 | 81 | model = Model(img_input, net, name='inception_v4') 82 | return model 83 | 84 | 85 | def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False): 86 | """ 
Inception_V4 regression model, input is BxWxHx3, output Bx2""" 87 | batch_size = net[0].get_shape()[0].value 88 | img_net, pt_net = net[0], net[1] 89 | 90 | img_net = get_inception(299, 299)(img_net) 91 | with tf.variable_scope('pointnet'): 92 | pt_net = pointnet.get_model(pt_net, tf.constant(True)) 93 | net = tf.reshape(tf.stack([img_net, pt_net], axis=2), [batch_size, -1]) 94 | 95 | if not add_lstm: 96 | for i, dim in enumerate([256, 128, 16]): 97 | fc_scope = "fc" + str(i + 1) 98 | dp_scope = "dp" + str(i + 1) 99 | net = tf_util.fully_connected(net, dim, bn=True, 100 | is_training=is_training, 101 | scope=fc_scope, 102 | bn_decay=bn_decay) 103 | net = tf_util.dropout(net, keep_prob=0.7, 104 | is_training=is_training, 105 | scope=dp_scope) 106 | 107 | net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc4') 108 | else: 109 | fc_scope = "fc1" 110 | net = tf_util.fully_connected(net, 784, bn=True, 111 | is_training=is_training, 112 | scope=fc_scope, 113 | bn_decay=bn_decay) 114 | net = tf_util.dropout(net, keep_prob=0.7, 115 | is_training=is_training, 116 | scope="dp1") 117 | net = cnn_lstm_block(net) 118 | return net 119 | 120 | 121 | def cnn_lstm_block(input_tensor): 122 | lstm_in = tf.reshape(input_tensor, [-1, 28, 28]) 123 | lstm_out = tf_util.stacked_lstm(lstm_in, 124 | num_outputs=10, 125 | time_steps=28, 126 | scope="cnn_lstm") 127 | 128 | W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1)) 129 | b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1)) 130 | return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2) 131 | 132 | 133 | def conv2d_bn(x, nb_filter, nb_row, nb_col, 134 | border_mode='same', subsample=(1, 1), bias=False): 135 | """ 136 | Utility function to apply conv + BN. 137 | (Slightly modified from https://github.com/fchollet/keras/blob/master/keras/applications/inception_v3.py) 138 | """ 139 | channel_axis = -1 140 | x = Convolution2D(nb_filter, (nb_row, nb_col), 141 | strides=subsample, 142 | padding=border_mode, 143 | use_bias=bias)(x) 144 | x = BatchNormalization(axis=channel_axis)(x) 145 | x = Activation('relu')(x) 146 | return x 147 | 148 | def block_inception_a(input): 149 | channel_axis = -1 150 | 151 | branch_0 = conv2d_bn(input, 96, 1, 1) 152 | 153 | branch_1 = conv2d_bn(input, 64, 1, 1) 154 | branch_1 = conv2d_bn(branch_1, 96, 3, 3) 155 | 156 | branch_2 = conv2d_bn(input, 64, 1, 1) 157 | branch_2 = conv2d_bn(branch_2, 96, 3, 3) 158 | branch_2 = conv2d_bn(branch_2, 96, 3, 3) 159 | 160 | branch_3 = AveragePooling2D((3,3), strides=(1,1), padding='same')(input) 161 | branch_3 = conv2d_bn(branch_3, 96, 1, 1) 162 | 163 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis) 164 | return x 165 | 166 | 167 | def block_reduction_a(input): 168 | channel_axis = -1 169 | 170 | branch_0 = conv2d_bn(input, 384, 3, 3, subsample=(2,2), border_mode='valid') 171 | 172 | branch_1 = conv2d_bn(input, 192, 1, 1) 173 | branch_1 = conv2d_bn(branch_1, 224, 3, 3) 174 | branch_1 = conv2d_bn(branch_1, 256, 3, 3, subsample=(2,2), border_mode='valid') 175 | 176 | branch_2 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(input) 177 | 178 | x = concatenate([branch_0, branch_1, branch_2], axis=channel_axis) 179 | return x 180 | 181 | 182 | def block_inception_b(input): 183 | channel_axis = -1 184 | 185 | branch_0 = conv2d_bn(input, 384, 1, 1) 186 | 187 | branch_1 = conv2d_bn(input, 192, 1, 1) 188 | branch_1 = conv2d_bn(branch_1, 224, 1, 7) 189 | branch_1 = conv2d_bn(branch_1, 256, 7, 1) 190 | 191 | branch_2 = 
conv2d_bn(input, 192, 1, 1) 192 | branch_2 = conv2d_bn(branch_2, 192, 7, 1) 193 | branch_2 = conv2d_bn(branch_2, 224, 1, 7) 194 | branch_2 = conv2d_bn(branch_2, 224, 7, 1) 195 | branch_2 = conv2d_bn(branch_2, 256, 1, 7) 196 | 197 | branch_3 = AveragePooling2D((3,3), strides=(1,1), padding='same')(input) 198 | branch_3 = conv2d_bn(branch_3, 128, 1, 1) 199 | 200 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis) 201 | return x 202 | 203 | 204 | def block_reduction_b(input): 205 | channel_axis = -1 206 | 207 | branch_0 = conv2d_bn(input, 192, 1, 1) 208 | branch_0 = conv2d_bn(branch_0, 192, 3, 3, subsample=(2, 2), border_mode='valid') 209 | 210 | branch_1 = conv2d_bn(input, 256, 1, 1) 211 | branch_1 = conv2d_bn(branch_1, 256, 1, 7) 212 | branch_1 = conv2d_bn(branch_1, 320, 7, 1) 213 | branch_1 = conv2d_bn(branch_1, 320, 3, 3, subsample=(2,2), border_mode='valid') 214 | 215 | branch_2 = MaxPooling2D((3, 3), strides=(2, 2), padding='valid')(input) 216 | 217 | x = concatenate([branch_0, branch_1, branch_2], axis=channel_axis) 218 | return x 219 | 220 | 221 | def block_inception_c(input): 222 | channel_axis = -1 223 | 224 | branch_0 = conv2d_bn(input, 256, 1, 1) 225 | 226 | branch_1 = conv2d_bn(input, 384, 1, 1) 227 | branch_10 = conv2d_bn(branch_1, 256, 1, 3) 228 | branch_11 = conv2d_bn(branch_1, 256, 3, 1) 229 | branch_1 = concatenate([branch_10, branch_11], axis=channel_axis) 230 | 231 | 232 | branch_2 = conv2d_bn(input, 384, 1, 1) 233 | branch_2 = conv2d_bn(branch_2, 448, 3, 1) 234 | branch_2 = conv2d_bn(branch_2, 512, 1, 3) 235 | branch_20 = conv2d_bn(branch_2, 256, 1, 3) 236 | branch_21 = conv2d_bn(branch_2, 256, 3, 1) 237 | branch_2 = concatenate([branch_20, branch_21], axis=channel_axis) 238 | 239 | branch_3 = AveragePooling2D((3, 3), strides=(1, 1), padding='same')(input) 240 | branch_3 = conv2d_bn(branch_3, 256, 1, 1) 241 | 242 | x = concatenate([branch_0, branch_1, branch_2, branch_3], axis=channel_axis) 243 | return x 244 | 245 | 246 | def inception_v4_base(input): 247 | channel_axis = -1 248 | 249 | # Input Shape is 299 x 299 x 3 (th) or 3 x 299 x 299 (th) 250 | net = conv2d_bn(input, 32, 3, 3, subsample=(2,2), border_mode='valid') 251 | net = conv2d_bn(net, 32, 3, 3, border_mode='valid') 252 | net = conv2d_bn(net, 64, 3, 3) 253 | 254 | branch_0 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(net) 255 | 256 | branch_1 = conv2d_bn(net, 96, 3, 3, subsample=(2,2), border_mode='valid') 257 | 258 | net = concatenate([branch_0, branch_1], axis=channel_axis) 259 | 260 | branch_0 = conv2d_bn(net, 64, 1, 1) 261 | branch_0 = conv2d_bn(branch_0, 96, 3, 3, border_mode='valid') 262 | 263 | branch_1 = conv2d_bn(net, 64, 1, 1) 264 | branch_1 = conv2d_bn(branch_1, 64, 1, 7) 265 | branch_1 = conv2d_bn(branch_1, 64, 7, 1) 266 | branch_1 = conv2d_bn(branch_1, 96, 3, 3, border_mode='valid') 267 | 268 | net = concatenate([branch_0, branch_1], axis=channel_axis) 269 | 270 | branch_0 = conv2d_bn(net, 192, 3, 3, subsample=(2,2), border_mode='valid') 271 | branch_1 = MaxPooling2D((3,3), strides=(2,2), padding='valid')(net) 272 | 273 | net = concatenate([branch_0, branch_1], axis=channel_axis) 274 | 275 | # 35 x 35 x 384 276 | # 4 x Inception-A blocks 277 | for idx in xrange(4): 278 | net = block_inception_a(net) 279 | 280 | # 35 x 35 x 384 281 | # Reduction-A block 282 | net = block_reduction_a(net) 283 | 284 | # 17 x 17 x 1024 285 | # 7 x Inception-B blocks 286 | for idx in xrange(7): 287 | net = block_inception_b(net) 288 | 289 | # 17 x 17 x 1024 290 | # Reduction-B block 
291 |     net = block_reduction_b(net)
292 | 
293 |     # 8 x 8 x 1536
294 |     # 3 x Inception-C blocks
295 |     for idx in xrange(3):
296 |         net = block_inception_c(net)
297 | 
298 |     return net
299 | 
300 | 
301 | def get_loss(pred, label, l2_weight=0.0001):
302 |     diff = tf.square(tf.subtract(pred, label))
303 |     train_vars = tf.trainable_variables()
304 |     l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
305 |     loss = tf.reduce_mean(diff + l2_loss)
306 |     tf.summary.scalar('l2_loss', l2_loss)  # l2_loss is already scaled by l2_weight above
307 |     tf.summary.scalar('loss', loss)
308 | 
309 |     return loss
310 | 
311 | 
312 | def summary_scalar(pred, label):
313 |     thresholds = [5, 4, 3, 2, 1, 0.5]
314 |     angles = [float(t) / 180 * scipy.pi for t in thresholds]  # degree thresholds converted to radians
315 |     speeds = [float(t) / 20 for t in thresholds]  # thresholds on the 1/20-scaled speed axis
316 | 
317 |     for i in range(len(thresholds)):
318 |         scalar_angle = "angle(" + str(angles[i]) + ")"
319 |         scalar_speed = "speed(" + str(speeds[i]) + ")"
320 |         ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]
321 |         ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]
322 |         ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
323 |         ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))
324 | 
325 |         tf.summary.scalar(scalar_angle, ac_angle)
326 |         tf.summary.scalar(scalar_speed, ac_speed)
327 | 
328 | 
329 | def resize(imgs):
330 |     batch_size = imgs.shape[0]
331 |     imgs_new = []
332 |     for j in range(batch_size):
333 |         img = imgs[j, :, :, :]
334 |         new = scipy.misc.imresize(img, (299, 299))
335 |         imgs_new.append(new)
336 |     imgs_new = np.stack(imgs_new, axis=0)
337 |     return imgs_new
338 | 
339 | 
340 | if __name__ == '__main__':
341 |     with tf.Graph().as_default():
342 |         imgs = tf.zeros((32, 299, 299, 3))  # Inception V4 expects 299x299 inputs
343 |         pts = tf.zeros((32, 16384, 3))
344 |         outputs = get_model([imgs, pts], tf.constant(True))
345 |         print(outputs)
346 | 
--------------------------------------------------------------------------------
/models/nvidia_io.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | 
4 | import tensorflow as tf
5 | import scipy
6 | 
7 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
8 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
9 | 
10 | import tf_util
11 | 
12 | 
13 | def placeholder_inputs(batch_size, img_rows=66, img_cols=200, separately=False):
14 |     imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
15 |     if separately:
16 |         speeds_pl = tf.placeholder(tf.float32, shape=(batch_size,))
17 |         angles_pl = tf.placeholder(tf.float32, shape=(batch_size,))
18 |         labels_pl = [speeds_pl, angles_pl]
19 |     else: labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))  # joint Bx2 (speed, angle) labels
20 |     return imgs_pl, labels_pl
21 | 
22 | 
23 | def get_model(net, is_training, bn_decay=None, separately=False):
24 |     """ NVIDIA regression model, input is BxHxWx3, output Bx2"""
25 |     batch_size = net.get_shape()[0].value
26 | 
27 |     for i, dim in enumerate([24, 36, 48, 64, 64]):
28 |         scope = "conv" + str(i + 1)
29 |         net = tf_util.conv2d(net, dim, [5, 5],
30 |                              padding='VALID', stride=[1, 1],
31 |                              bn=True, is_training=is_training,
32 |                              scope=scope, bn_decay=bn_decay)
33 | 
34 |     net = tf.reshape(net, [batch_size, -1])
35 |     for i, dim in enumerate([256, 100, 50, 10]):
36 |         fc_scope = "fc" + str(i + 1)
37 |         dp_scope = "dp" + str(i + 1)
38 |         net = tf_util.fully_connected(net, dim, bn=True,
39 |                                       is_training=is_training,
40 |                                       scope=fc_scope,
41 |                                       bn_decay=bn_decay)
42 |         net = tf_util.dropout(net, keep_prob=0.7,
43 |                               is_training=is_training,
44 |                               scope=dp_scope)
45 | 
46 |     net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc5')
47 | 
48 |     return net
49 | 
50 | 
51 | def get_loss(pred, label, l2_weight=0.0001):
52 |     diff = tf.square(tf.subtract(pred, label))
53 |     train_vars = tf.trainable_variables()
54 |     l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
55 |     loss = tf.reduce_mean(diff + l2_loss)
56 |     tf.summary.scalar('l2_loss', l2_loss)  # l2_loss is already scaled by l2_weight above
57 |     tf.summary.scalar('loss', loss)
58 | 
59 |     return loss
60 | 
61 | 
62 | def summary_scalar(pred, label):
63 |     thresholds = [5, 4, 3, 2, 1, 0.5]
64 |     angles = [float(t) / 180 * scipy.pi for t in thresholds]  # degree thresholds converted to radians
65 |     speeds = [float(t) / 20 for t in thresholds]  # thresholds on the 1/20-scaled speed axis
66 | 
67 |     for i in range(len(thresholds)):
68 |         scalar_angle = "angle(" + str(angles[i]) + ")"
69 |         scalar_speed = "speed(" + str(speeds[i]) + ")"
70 |         ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]
71 |         ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]
72 |         ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
73 |         ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))
74 | 
75 |         tf.summary.scalar(scalar_angle, ac_angle)
76 |         tf.summary.scalar(scalar_speed, ac_speed)
77 | 
78 | 
79 | if __name__ == '__main__':
80 |     with tf.Graph().as_default():
81 |         inputs = tf.zeros((32, 66, 200, 3))
82 |         outputs = get_model(inputs, tf.constant(True))
83 |         print(outputs)
84 | 
--------------------------------------------------------------------------------
/models/nvidia_pm.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | 
4 | import tensorflow as tf
5 | import scipy
6 | 
7 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
8 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
9 | 
10 | import tf_util
11 | 
12 | 
13 | def placeholder_inputs(batch_size, img_rows=66, img_cols=200, separately=False):
14 |     imgs_pl = tf.placeholder(tf.float32,
15 |                              shape=(batch_size, img_rows, img_cols, 3))
16 |     fmaps_pl = tf.placeholder(tf.float32,
17 |                               shape=(batch_size, img_rows, img_cols, 3))
18 |     if separately:
19 |         speeds_pl = tf.placeholder(tf.float32, shape=(batch_size,))
20 |         angles_pl = tf.placeholder(tf.float32, shape=(batch_size,))
21 |         labels_pl = [speeds_pl, angles_pl]
22 |     else: labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))  # joint Bx2 (speed, angle) labels
23 |     return imgs_pl, fmaps_pl, labels_pl
24 | 
25 | 
26 | def get_model(net, is_training, bn_decay=None, separately=False):
27 |     """ NVIDIA regression model, inputs are two BxHxWx3 tensors (image, feature map), output Bx2"""
28 |     batch_size = net[0].get_shape()[0].value
29 |     img_net, fmap_net = net
30 |     for i, dim in enumerate([24, 36, 48, 64, 64]):
31 |         scope_img = "image_conv" + str(i + 1)
32 |         scope_fmap = "fmap_conv" + str(i + 1)
33 |         img_net = tf_util.conv2d(img_net, dim, [5, 5],
34 |                                  padding='VALID', stride=[1, 1],
35 |                                  bn=True, is_training=is_training,
36 |                                  scope=scope_img, bn_decay=bn_decay)
37 |         fmap_net = tf_util.conv2d(fmap_net, dim, [5, 5],
38 |                                   padding='VALID', stride=[1, 1],
39 |                                   bn=True, is_training=is_training,
40 |                                   scope=scope_fmap, bn_decay=bn_decay)
41 |     net = tf.reshape(tf.stack([img_net, fmap_net]), [batch_size, -1])
42 |     for i, dim in enumerate([256, 100, 50, 10]):
43 |         fc_scope = "fc" + str(i + 1)
44 |         dp_scope = "dp" + str(i + 1)
45 |         net = tf_util.fully_connected(net, dim, bn=True,
46 |                                       is_training=is_training,
47 |                                       scope=fc_scope,
48 |                                       bn_decay=bn_decay)
49 |         net = tf_util.dropout(net, keep_prob=0.7,
50 |                               is_training=is_training,
51 |                               scope=dp_scope)
52 | 
53 |     net = tf_util.fully_connected(net, 2,
activation_fn=None, scope='fc5')
54 | 
55 |     return net
56 | 
57 | 
58 | def get_loss(pred, label, l2_weight=0.0001):
59 |     diff = tf.square(tf.subtract(pred, label))
60 |     train_vars = tf.trainable_variables()
61 |     l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
62 |     loss = tf.reduce_mean(diff + l2_loss)
63 |     tf.summary.scalar('l2_loss', l2_loss)  # l2_loss is already scaled by l2_weight above
64 |     tf.summary.scalar('loss', loss)
65 | 
66 |     return loss
67 | 
68 | 
69 | def summary_scalar(pred, label):
70 |     thresholds = [5, 4, 3, 2, 1, 0.5]
71 |     angles = [float(t) / 180 * scipy.pi for t in thresholds]  # degree thresholds converted to radians
72 |     speeds = [float(t) / 20 for t in thresholds]  # thresholds on the 1/20-scaled speed axis
73 | 
74 |     for i in range(len(thresholds)):
75 |         scalar_angle = "angle(" + str(angles[i]) + ")"
76 |         scalar_speed = "speed(" + str(speeds[i]) + ")"
77 |         ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]
78 |         ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]
79 |         ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
80 |         ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))
81 | 
82 |         tf.summary.scalar(scalar_angle, ac_angle)
83 |         tf.summary.scalar(scalar_speed, ac_speed)
84 | 
85 | 
86 | if __name__ == '__main__':
87 |     with tf.Graph().as_default():
88 |         inputs = [tf.zeros((32, 66, 200, 3)), tf.zeros((32, 66, 200, 3))]
89 |         outputs = get_model(inputs, tf.constant(True))
90 |         print(outputs)
91 | 
--------------------------------------------------------------------------------
/models/nvidia_pn.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | 
4 | import tensorflow as tf
5 | import scipy
6 | import numpy as np
7 | 
8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
9 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
10 | 
11 | import tf_util
12 | import pointnet
13 | 
14 | 
15 | def placeholder_inputs(batch_size, img_rows=66, img_cols=200, points=16384, separately=False):
16 |     imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
17 |     pts_pl = tf.placeholder(tf.float32, shape=(batch_size, points, 3))
18 |     if separately:
19 |         speeds_pl = tf.placeholder(tf.float32, shape=(batch_size,))
20 |         angles_pl = tf.placeholder(tf.float32, shape=(batch_size,))
21 |         labels_pl = [speeds_pl, angles_pl]
22 |     else: labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))  # joint Bx2 (speed, angle) labels
23 |     return imgs_pl, pts_pl, labels_pl
24 | 
25 | 
26 | def get_model(net, is_training, bn_decay=None, separately=False):
27 |     """ NVIDIA regression model; inputs are a BxHxWx3 image and a BxNx3 point cloud, output Bx2"""
28 |     batch_size = net[0].get_shape()[0].value
29 |     img_net, pt_net = net[0], net[1]
30 | 
31 |     for i, dim in enumerate([24, 36, 48, 64, 64]):
32 |         scope = "conv" + str(i + 1)
33 |         img_net = tf_util.conv2d(img_net, dim, [5, 5],
34 |                                  padding='VALID', stride=[1, 1],
35 |                                  bn=True, is_training=is_training,
36 |                                  scope=scope, bn_decay=bn_decay)
37 | 
38 |     img_net = tf.reshape(img_net, [batch_size, -1])
39 |     img_net = tf_util.fully_connected(img_net, 256, bn=True,
40 |                                       is_training=is_training,
41 |                                       scope='img_fc0',
42 |                                       bn_decay=bn_decay)
43 |     with tf.variable_scope('pointnet'):
44 |         pt_net = pointnet.get_model(pt_net, tf.constant(True))
45 |     net = tf.reshape(tf.stack([img_net, pt_net], axis=2), [batch_size, 512])
46 | 
47 |     for i, dim in enumerate([256, 128, 16]):
48 |         fc_scope = "fc" + str(i + 1)
49 |         dp_scope = "dp" + str(i + 1)
50 |         net = tf_util.fully_connected(net, dim, bn=True,
51 |                                       is_training=is_training,
52 |                                       scope=fc_scope,
53 |                                       bn_decay=bn_decay)
54 |         net = tf_util.dropout(net, keep_prob=0.7,
55 |
    net = tf.reshape(tf.stack([img_net, pt_net], axis=2), [batch_size, 512])

    for i, dim in enumerate([256, 128, 16]):
        fc_scope = "fc" + str(i + 1)
        dp_scope = "dp" + str(i + 1)
        net = tf_util.fully_connected(net, dim, bn=True,
                                      is_training=is_training,
                                      scope=fc_scope,
                                      bn_decay=bn_decay)
        net = tf_util.dropout(net, keep_prob=0.7,
                              is_training=is_training,
                              scope=dp_scope)

    net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc5')

    return net


def get_loss(pred, label, l2_weight=0.0001):
    diff = tf.square(tf.subtract(pred, label))
    train_vars = tf.trainable_variables()
    l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
    loss = tf.reduce_mean(diff + l2_loss)
    # l2_loss already includes the l2_weight factor
    tf.summary.scalar('l2 loss', l2_loss)
    tf.summary.scalar('loss', loss)

    return loss


def summary_scalar(pred, label):
    thresholds = [5, 4, 3, 2, 1, 0.5]
    angles = [float(t) / 180 * scipy.pi for t in thresholds]
    speeds = [float(t) / 20 for t in thresholds]

    for i in range(len(thresholds)):
        scalar_angle = "angle(" + str(angles[i]) + ")"
        scalar_speed = "speed(" + str(speeds[i]) + ")"
        # compare in the same normalized units as the labels
        # (angles in radians, speeds scaled by 1/20, cf. predict.py)
        ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]
        ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]
        ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
        ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))

        tf.summary.scalar(scalar_angle, ac_angle)
        tf.summary.scalar(scalar_speed, ac_speed)


if __name__ == '__main__':
    with tf.Graph().as_default():
        inputs = [tf.zeros((32, 66, 200, 3)), tf.zeros((32, 16384, 3))]
        outputs = get_model(inputs, tf.constant(True))
        print(outputs)

--------------------------------------------------------------------------------
/models/resnet152_io.py:
--------------------------------------------------------------------------------
import os
import sys

import tensorflow as tf
import scipy
import numpy as np

BASE_DIR = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.join(BASE_DIR, '../utils'))

import tf_util
from custom_layers import Scale
from keras.layers import Input, Dense, Convolution2D, MaxPooling2D, AveragePooling2D, ZeroPadding2D, Dropout, Flatten, add, Reshape, Activation
from keras.layers.normalization import BatchNormalization
from keras.models import Model


def placeholder_inputs(batch_size, img_rows=224, img_cols=224, separately=False):
    imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
    if separately:
        speeds_pl = tf.placeholder(tf.float32, shape=(batch_size))
        angles_pl = tf.placeholder(tf.float32, shape=(batch_size))
        labels_pl = [speeds_pl, angles_pl]
    else:
        labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))
    return imgs_pl, labels_pl


def get_resnet(img_rows=224, img_cols=224, separately=False):
    """
    ResNet-152 model for Keras

    Model schema and layer naming follow those of the original Caffe implementation
    https://github.com/KaimingHe/deep-residual-networks

    ImageNet pretrained weights
    Theano: https://drive.google.com/file/d/0Byy2AcGyEVxfZHhUT3lWVWxRN28/view?usp=sharing
    TensorFlow: https://drive.google.com/file/d/0Byy2AcGyEVxfeXExMzNNOHpEODg/view?usp=sharing

    Parameters:
        img_rows, img_cols - resolution of inputs
        channel - 1 for grayscale, 3 for color
    """
    eps = 1.1e-5
    # Handle dimension ordering for different backends
    img_input = Input(shape=(img_rows, img_cols, 3), name='data')

    x = ZeroPadding2D((3, 3), name='conv1_zeropadding')(img_input)
    x = Convolution2D(64, (7, 7), strides=(2, 2), name='conv1', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name='bn_conv1')(x)
    x = Scale(axis=3, name='scale_conv1')(x)
    x = Activation('relu', name='conv1_relu')(x)
    x = MaxPooling2D((3, 3), strides=(2, 2), name='pool1')(x)
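    # Stage block counts below are 3 + 8 + 36 + 3; at 3 conv layers per
    # block, plus conv1 and the final fc layer, this gives the 152
    # weighted layers of ResNet-152.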
    x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='b')
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='c')

    x = conv_block(x, 3, [128, 128, 512], stage=3, block='a')
    for i in range(1, 8):
        x = identity_block(x, 3, [128, 128, 512], stage=3, block='b' + str(i))

    x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a')
    for i in range(1, 36):
        x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b' + str(i))

    x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c')

    x_fc = AveragePooling2D((7, 7), name='avg_pool')(x)
    x_fc = Flatten()(x_fc)
    x_fc = Dense(1000, activation='softmax', name='fc1000')(x_fc)

    model = Model(img_input, x_fc)

    # Use pre-trained ImageNet weights (TensorFlow backend)
    weights_path = 'utils/weights/resnet152_weights_tf.h5'
    assert (os.path.exists(weights_path))
    model.load_weights(weights_path, by_name=True)

    # Truncate and replace the softmax layer for transfer learning.
    # Cannot use model.layers.pop() since the model is not of Sequential() type.
    # The method below works since pre-trained weights are stored in layers,
    # not in the model.
    x_newfc = AveragePooling2D((7, 7), name='avg_pool')(x)
    x_newfc = Flatten()(x_newfc)
    x_newfc = Dense(256, name='fc8')(x_newfc)
    model = Model(img_input, x_newfc)

    return model


def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False):
    """ ResNet-152 regression model, input is BxHxWx3, output Bx2 """
    net = get_resnet(224, 224)(net)

    if not add_lstm:
        net = tf_util.fully_connected(net, 2, activation_fn=None,
                                      scope='fc_final')
    else:
        net = tf_util.fully_connected(net, 784, bn=True,
                                      is_training=is_training,
                                      scope='fc_lstm',
                                      bn_decay=bn_decay)
        net = tf_util.dropout(net, keep_prob=0.7,
                              is_training=is_training,
                              scope="dp1")
        net = cnn_lstm_block(net)

    return net


def cnn_lstm_block(input_tensor):
    lstm_in = tf.reshape(input_tensor, [-1, 28, 28])
    lstm_out = tf_util.stacked_lstm(lstm_in,
                                    num_outputs=10,
                                    time_steps=28,
                                    scope="cnn_lstm")

    W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1))
    b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1))
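    # tf.atan bounds each output to (-pi/2, pi/2); the factor of 2 widens
    # the range to (-pi, pi).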
    return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2)


def identity_block(input_tensor, kernel_size, filters, stage, block):
    '''The identity_block is the block that has no conv layer at shortcut
    # Arguments
        input_tensor: input tensor
        kernel_size: default 3, the kernel size of the middle conv layer in the main path
        filters: list of integers, the nb_filters of the 3 conv layers in the main path
        stage: integer, current stage label, used for generating layer names
        block: 'a', 'b'..., current block label, used for generating layer names
    '''
    eps = 1.1e-5
    nb_filter1, nb_filter2, nb_filter3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    scale_name_base = 'scale' + str(stage) + block + '_branch'

    x = Convolution2D(nb_filter1, (1, 1), name=conv_name_base + '2a', use_bias=False)(input_tensor)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2a')(x)
    x = Scale(axis=3, name=scale_name_base + '2a')(x)
    x = Activation('relu', name=conv_name_base + '2a_relu')(x)

    x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x)
    x = Convolution2D(nb_filter2, (kernel_size, kernel_size),
                      name=conv_name_base + '2b', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2b')(x)
    x = Scale(axis=3, name=scale_name_base + '2b')(x)
    x = Activation('relu', name=conv_name_base + '2b_relu')(x)

    x = Convolution2D(nb_filter3, (1, 1), name=conv_name_base + '2c', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2c')(x)
    x = Scale(axis=3, name=scale_name_base + '2c')(x)

    x = add([x, input_tensor], name='res' + str(stage) + block)
    x = Activation('relu', name='res' + str(stage) + block + '_relu')(x)
    return x


def conv_block(input_tensor, kernel_size, filters, stage, block, strides=(2, 2)):
    '''conv_block is the block that has a conv layer at shortcut
    # Arguments
        input_tensor: input tensor
        kernel_size: default 3, the kernel size of the middle conv layer in the main path
        filters: list of integers, the nb_filters of the 3 conv layers in the main path
        stage: integer, current stage label, used for generating layer names
        block: 'a', 'b'..., current block label, used for generating layer names
    Note that from stage 3, the first conv layer in the main path has strides=(2, 2),
    and the shortcut has strides=(2, 2) as well.
    '''
    eps = 1.1e-5
    nb_filter1, nb_filter2, nb_filter3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    scale_name_base = 'scale' + str(stage) + block + '_branch'

    x = Convolution2D(nb_filter1, (1, 1), strides=strides,
                      name=conv_name_base + '2a', use_bias=False)(input_tensor)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2a')(x)
    x = Scale(axis=3, name=scale_name_base + '2a')(x)
    x = Activation('relu', name=conv_name_base + '2a_relu')(x)

    x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x)
    x = Convolution2D(nb_filter2, (kernel_size, kernel_size),
                      name=conv_name_base + '2b', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2b')(x)
    x = Scale(axis=3, name=scale_name_base + '2b')(x)
    x = Activation('relu', name=conv_name_base + '2b_relu')(x)

    x = Convolution2D(nb_filter3, (1, 1), name=conv_name_base + '2c', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2c')(x)
    x = Scale(axis=3, name=scale_name_base + '2c')(x)

    shortcut = Convolution2D(nb_filter3, (1, 1), strides=strides,
                             name=conv_name_base + '1', use_bias=False)(input_tensor)
    shortcut = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '1')(shortcut)
    shortcut = Scale(axis=3, name=scale_name_base + '1')(shortcut)

    x = add([x, shortcut], name='res' + str(stage) + block)
    x = Activation('relu', name='res' + str(stage) + block + '_relu')(x)
    return x
def get_loss(pred, label, l2_weight=0.0001):
    diff = tf.square(tf.subtract(pred, label))
    train_vars = tf.trainable_variables()
    l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
    loss = tf.reduce_mean(diff + l2_loss)
    # l2_loss already includes the l2_weight factor
    tf.summary.scalar('l2 loss', l2_loss)
    tf.summary.scalar('loss', loss)

    return loss


def summary_scalar(pred, label):
    thresholds = [5, 4, 3, 2, 1, 0.5]
    angles = [float(t) / 180 * scipy.pi for t in thresholds]
    speeds = [float(t) / 20 for t in thresholds]

    for i in range(len(thresholds)):
        scalar_angle = "angle(" + str(angles[i]) + ")"
        scalar_speed = "speed(" + str(speeds[i]) + ")"
        # compare in the same normalized units as the labels
        # (angles in radians, speeds scaled by 1/20, cf. predict.py)
        ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]
        ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]
        ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
        ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))

        tf.summary.scalar(scalar_angle, ac_angle)
        tf.summary.scalar(scalar_speed, ac_speed)


def resize(imgs):
    """ Resize a batch of images to the 224x224 ResNet input resolution """
    batch_size = imgs.shape[0]
    imgs_new = []
    for j in range(batch_size):
        img = imgs[j, :, :, :]
        new = scipy.misc.imresize(img, (224, 224))
        imgs_new.append(new)
    imgs_new = np.stack(imgs_new, axis=0)
    return imgs_new


if __name__ == '__main__':
    with tf.Graph().as_default():
        inputs = tf.zeros((32, 224, 224, 3))
        outputs = get_model(inputs, tf.constant(True))
        print(outputs)

--------------------------------------------------------------------------------
/models/resnet152_pm.py:
--------------------------------------------------------------------------------
import os
import sys

import tensorflow as tf
import scipy
import numpy as np

BASE_DIR = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.join(BASE_DIR, '../utils'))

import tf_util
from custom_layers import Scale
from keras.layers import Input, Dense, Convolution2D, MaxPooling2D, AveragePooling2D, ZeroPadding2D, Dropout, Flatten, add, Reshape, Activation
from keras.layers.normalization import BatchNormalization
from keras.models import Model


def placeholder_inputs(batch_size, img_rows=66, img_cols=200, separately=False):
    imgs_pl = tf.placeholder(tf.float32,
                             shape=(batch_size, img_rows, img_cols, 3))
    fmaps_pl = tf.placeholder(tf.float32,
                              shape=(batch_size, img_rows, img_cols, 3))
    if separately:
        speeds_pl = tf.placeholder(tf.float32, shape=(batch_size))
        angles_pl = tf.placeholder(tf.float32, shape=(batch_size))
        labels_pl = [speeds_pl, angles_pl]
    else:
        labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))
    return imgs_pl, fmaps_pl, labels_pl


def get_resnet(img_rows=224, img_cols=224, separately=False):
    """
    ResNet-152 model for Keras

    Model schema and layer naming follow those of the original Caffe implementation
    https://github.com/KaimingHe/deep-residual-networks

    ImageNet pretrained weights
    Theano: https://drive.google.com/file/d/0Byy2AcGyEVxfZHhUT3lWVWxRN28/view?usp=sharing
    TensorFlow: https://drive.google.com/file/d/0Byy2AcGyEVxfeXExMzNNOHpEODg/view?usp=sharing

    Parameters:
        img_rows, img_cols - resolution of inputs
        channel - 1 for grayscale, 3 for color
    """
    img_input = Input(shape=(img_rows, img_cols, 3), name='data')

    eps = 1.1e-5
    x = ZeroPadding2D((3, 3), name='conv1_zeropadding')(img_input)
    x = Convolution2D(64, (7, 7), strides=(2, 2), name='conv1', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name='bn_conv1')(x)
    x = Scale(axis=3, name='scale_conv1')(x)
    x = Activation('relu', name='conv1_relu')(x)
    x = MaxPooling2D((3, 3), strides=(2, 2), name='pool1')(x)

    x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='b')
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='c')

    x = conv_block(x, 3, [128, 128, 512], stage=3, block='a')
    for i in range(1, 8):
        x = identity_block(x, 3, [128, 128, 512], stage=3, block='b' + str(i))

    x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a')
    for i in range(1, 36):
        x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b' + str(i))

    x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c')

    x_fc = AveragePooling2D((7, 7), name='avg_pool')(x)
    x_fc = Flatten()(x_fc)
    x_fc = Dense(1000, activation='softmax', name='fc1000')(x_fc)

    model = Model(img_input, x_fc)

    # Use pre-trained weights for the TensorFlow backend
    weights_path = 'utils/weights/resnet152_weights_tf.h5'
    assert (os.path.exists(weights_path))

    model.load_weights(weights_path, by_name=True)

    # Truncate and replace the softmax layer for transfer learning.
    # Cannot use model.layers.pop() since the model is not of Sequential() type.
    # The method below works since pre-trained weights are stored in layers,
    # not in the model.
    x_newfc = AveragePooling2D((7, 7), name='avg_pool')(x)
    x_newfc = Flatten()(x_newfc)
    x_newfc = Dense(256, name='fc8')(x_newfc)

    model = Model(img_input, x_newfc)
    return model


def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False):
    """ ResNet-152 regression model, input is [BxHxWx3 images, BxHxWx3 feature maps], output Bx2 """
    batch_size = net[0].get_shape()[0].value
    img_net, fmap_net = net[0], net[1]

    img_net = get_resnet(224, 224)(img_net)
    fmap_net = get_resnet(224, 224)(fmap_net)

    net = tf.reshape(tf.stack([img_net, fmap_net]), [batch_size, -1])

    if not add_lstm:
        for i, dim in enumerate([256, 128, 16]):
            fc_scope = "fc" + str(i + 1)
            dp_scope = "dp" + str(i + 1)
            net = tf_util.fully_connected(net, dim, bn=True,
                                          is_training=is_training,
                                          scope=fc_scope,
                                          bn_decay=bn_decay)
            net = tf_util.dropout(net, keep_prob=0.7,
                                  is_training=is_training,
                                  scope=dp_scope)
        net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc4')
    else:
        fc_scope = "fc1"
        net = tf_util.fully_connected(net, 784, bn=True,
                                      is_training=is_training,
                                      scope=fc_scope,
                                      bn_decay=bn_decay)
        net = tf_util.dropout(net, keep_prob=0.7,
                              is_training=is_training,
                              scope="dp1")
        net = cnn_lstm_block(net)
    return net


def cnn_lstm_block(input_tensor):
    lstm_in = tf.reshape(input_tensor, [-1, 28, 28])
    lstm_out = tf_util.stacked_lstm(lstm_in,
                                    num_outputs=10,
                                    time_steps=28,
                                    scope="cnn_lstm")

    W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1))
    b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1))
    return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2)
def identity_block(input_tensor, kernel_size, filters, stage, block):
    '''The identity_block is the block that has no conv layer at shortcut
    # Arguments
        input_tensor: input tensor
        kernel_size: default 3, the kernel size of the middle conv layer in the main path
        filters: list of integers, the nb_filters of the 3 conv layers in the main path
        stage: integer, current stage label, used for generating layer names
        block: 'a', 'b'..., current block label, used for generating layer names
    '''
    eps = 1.1e-5
    nb_filter1, nb_filter2, nb_filter3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    scale_name_base = 'scale' + str(stage) + block + '_branch'

    x = Convolution2D(nb_filter1, (1, 1), name=conv_name_base + '2a', use_bias=False)(input_tensor)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2a')(x)
    x = Scale(axis=3, name=scale_name_base + '2a')(x)
    x = Activation('relu', name=conv_name_base + '2a_relu')(x)

    x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x)
    x = Convolution2D(nb_filter2, (kernel_size, kernel_size),
                      name=conv_name_base + '2b', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2b')(x)
    x = Scale(axis=3, name=scale_name_base + '2b')(x)
    x = Activation('relu', name=conv_name_base + '2b_relu')(x)

    x = Convolution2D(nb_filter3, (1, 1), name=conv_name_base + '2c', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2c')(x)
    x = Scale(axis=3, name=scale_name_base + '2c')(x)

    x = add([x, input_tensor], name='res' + str(stage) + block)
    x = Activation('relu', name='res' + str(stage) + block + '_relu')(x)
    return x


def conv_block(input_tensor, kernel_size, filters, stage, block, strides=(2, 2)):
    '''conv_block is the block that has a conv layer at shortcut
    # Arguments
        input_tensor: input tensor
        kernel_size: default 3, the kernel size of the middle conv layer in the main path
        filters: list of integers, the nb_filters of the 3 conv layers in the main path
        stage: integer, current stage label, used for generating layer names
        block: 'a', 'b'..., current block label, used for generating layer names
    Note that from stage 3, the first conv layer in the main path has strides=(2, 2),
    and the shortcut has strides=(2, 2) as well.
    '''
    eps = 1.1e-5
    nb_filter1, nb_filter2, nb_filter3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    scale_name_base = 'scale' + str(stage) + block + '_branch'

    x = Convolution2D(nb_filter1, (1, 1), strides=strides,
                      name=conv_name_base + '2a', use_bias=False)(input_tensor)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2a')(x)
    x = Scale(axis=3, name=scale_name_base + '2a')(x)
    x = Activation('relu', name=conv_name_base + '2a_relu')(x)

    x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x)
    x = Convolution2D(nb_filter2, (kernel_size, kernel_size),
                      name=conv_name_base + '2b', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2b')(x)
    x = Scale(axis=3, name=scale_name_base + '2b')(x)
    x = Activation('relu', name=conv_name_base + '2b_relu')(x)

    x = Convolution2D(nb_filter3, (1, 1), name=conv_name_base + '2c', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2c')(x)
    x = Scale(axis=3, name=scale_name_base + '2c')(x)

    shortcut = Convolution2D(nb_filter3, (1, 1), strides=strides,
                             name=conv_name_base + '1', use_bias=False)(input_tensor)
    shortcut = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '1')(shortcut)
    shortcut = Scale(axis=3, name=scale_name_base + '1')(shortcut)

    x = add([x, shortcut], name='res' + str(stage) + block)
    x = Activation('relu', name='res' + str(stage) + block + '_relu')(x)
    return x


def get_loss(pred, label, l2_weight=0.0001):
    diff = tf.square(tf.subtract(pred, label))
    train_vars = tf.trainable_variables()
    l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
    loss = tf.reduce_mean(diff + l2_loss)
    # l2_loss already includes the l2_weight factor
    tf.summary.scalar('l2 loss', l2_loss)
    tf.summary.scalar('loss', loss)

    return loss


def summary_scalar(pred, label):
    thresholds = [5, 4, 3, 2, 1, 0.5]
    angles = [float(t) / 180 * scipy.pi for t in thresholds]
    speeds = [float(t) / 20 for t in thresholds]

    for i in range(len(thresholds)):
        scalar_angle = "angle(" + str(angles[i]) + ")"
        scalar_speed = "speed(" + str(speeds[i]) + ")"
        # compare in the same normalized units as the labels
        # (angles in radians, speeds scaled by 1/20, cf. predict.py)
        ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]
        ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]
        ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
        ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))

        tf.summary.scalar(scalar_angle, ac_angle)
        tf.summary.scalar(scalar_speed, ac_speed)


def resize(imgs):
    """ Resize a batch of images to the 224x224 ResNet input resolution """
    batch_size = imgs.shape[0]
    imgs_new = []
    for j in range(batch_size):
        img = imgs[j, :, :, :]
        new = scipy.misc.imresize(img, (224, 224))
        imgs_new.append(new)
    imgs_new = np.stack(imgs_new, axis=0)
    return imgs_new


if __name__ == '__main__':
    with tf.Graph().as_default():
        imgs = tf.zeros((32, 224, 224, 3))
        fmaps = tf.zeros((32, 224, 224, 3))
        outputs = get_model([imgs, fmaps], tf.constant(True))
        print(outputs)

--------------------------------------------------------------------------------
/models/resnet152_pn.py:
--------------------------------------------------------------------------------
import os
import sys

import tensorflow as tf
import scipy
import numpy as np

BASE_DIR = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.join(BASE_DIR, '../utils'))

import tf_util
import pointnet
from custom_layers import Scale
from keras.layers import Input, Dense, Convolution2D, MaxPooling2D, AveragePooling2D, ZeroPadding2D, Dropout, Flatten, add, Reshape, Activation
from keras.layers.normalization import BatchNormalization
from keras.models import Model

from keras import backend as K
K.set_learning_phase(1)  # set learning phase


def placeholder_inputs(batch_size, img_rows=224, img_cols=224, points=16384, separately=False):
    imgs_pl = tf.placeholder(tf.float32, shape=(batch_size, img_rows, img_cols, 3))
    pts_pl = tf.placeholder(tf.float32, shape=(batch_size, points, 3))
    if separately:
        speeds_pl = tf.placeholder(tf.float32, shape=(batch_size))
        angles_pl = tf.placeholder(tf.float32, shape=(batch_size))
        labels_pl = [speeds_pl, angles_pl]
    else:
        labels_pl = tf.placeholder(tf.float32, shape=(batch_size, 2))
    return imgs_pl, pts_pl, labels_pl
def get_resnet(img_rows=224, img_cols=224, separately=False):
    """
    ResNet-152 model for Keras

    Model schema and layer naming follow those of the original Caffe implementation
    https://github.com/KaimingHe/deep-residual-networks

    ImageNet pretrained weights
    Theano: https://drive.google.com/file/d/0Byy2AcGyEVxfZHhUT3lWVWxRN28/view?usp=sharing
    TensorFlow: https://drive.google.com/file/d/0Byy2AcGyEVxfeXExMzNNOHpEODg/view?usp=sharing

    Parameters:
        img_rows, img_cols - resolution of inputs
        channel - 1 for grayscale, 3 for color
    """

    img_input = Input(shape=(img_rows, img_cols, 3), name='data')

    eps = 1.1e-5
    x = ZeroPadding2D((3, 3), name='conv1_zeropadding')(img_input)
    x = Convolution2D(64, (7, 7), strides=(2, 2), name='conv1', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name='bn_conv1')(x)
    x = Scale(axis=3, name='scale_conv1')(x)
    x = Activation('relu', name='conv1_relu')(x)
    x = MaxPooling2D((3, 3), strides=(2, 2), name='pool1')(x)

    x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='b')
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='c')

    x = conv_block(x, 3, [128, 128, 512], stage=3, block='a')
    for i in range(1, 8):
        x = identity_block(x, 3, [128, 128, 512], stage=3, block='b' + str(i))

    x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a')
    for i in range(1, 36):
        x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b' + str(i))

    x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c')

    x_fc = AveragePooling2D((7, 7), name='avg_pool')(x)
    x_fc = Flatten()(x_fc)
    x_fc = Dense(1000, activation='softmax', name='fc1000')(x_fc)

    model = Model(img_input, x_fc)

    # Use pre-trained weights for the TensorFlow backend
    weights_path = 'utils/weights/resnet152_weights_tf.h5'
    assert (os.path.exists(weights_path))

    model.load_weights(weights_path, by_name=True)

    # Truncate and replace the softmax layer for transfer learning.
    # Cannot use model.layers.pop() since the model is not of Sequential() type.
    # The method below works since pre-trained weights are stored in layers,
    # not in the model.
    x_newfc = AveragePooling2D((7, 7), name='avg_pool')(x)
    x_newfc = Flatten()(x_newfc)
    x_newfc = Dense(256, name='fc8')(x_newfc)

    model = Model(img_input, x_newfc)
    return model


def get_model(net, is_training, add_lstm=False, bn_decay=None, separately=False):
    """ ResNet-152 regression model, input is [BxHxWx3 images, BxNx3 points], output Bx2 """
    batch_size = net[0].get_shape()[0].value
    img_net, pt_net = net[0], net[1]

    img_net = get_resnet(224, 224)(img_net)
    with tf.variable_scope('pointnet'):
        pt_net = pointnet.get_model(pt_net, tf.constant(True))
    net = tf.reshape(tf.stack([img_net, pt_net], axis=2), [batch_size, -1])

    if not add_lstm:
        for i, dim in enumerate([256, 128, 16]):
            fc_scope = "fc" + str(i + 1)
            dp_scope = "dp" + str(i + 1)
            net = tf_util.fully_connected(net, dim, bn=True,
                                          is_training=is_training,
                                          scope=fc_scope,
                                          bn_decay=bn_decay)
            net = tf_util.dropout(net, keep_prob=0.7,
                                  is_training=is_training,
                                  scope=dp_scope)

        net = tf_util.fully_connected(net, 2, activation_fn=None, scope='fc4')
    else:
        fc_scope = "fc1"
        net = tf_util.fully_connected(net, 784, bn=True,
                                      is_training=is_training,
                                      scope=fc_scope,
                                      bn_decay=bn_decay)
        net = tf_util.dropout(net, keep_prob=0.7,
                              is_training=is_training,
                              scope="dp1")
        net = cnn_lstm_block(net)
    return net


def cnn_lstm_block(input_tensor):
    lstm_in = tf.reshape(input_tensor, [-1, 28, 28])
    lstm_out = tf_util.stacked_lstm(lstm_in,
                                    num_outputs=10,
                                    time_steps=28,
                                    scope="cnn_lstm")

    W_final = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1))
    b_final = tf.Variable(tf.truncated_normal([2], stddev=0.1))
    return tf.multiply(tf.atan(tf.matmul(lstm_out, W_final) + b_final), 2)


def identity_block(input_tensor, kernel_size, filters, stage, block):
    '''The identity_block is the block that has no conv layer at shortcut
    # Arguments
        input_tensor: input tensor
        kernel_size: default 3, the kernel size of the middle conv layer in the main path
        filters: list of integers, the nb_filters of the 3 conv layers in the main path
        stage: integer, current stage label, used for generating layer names
        block: 'a', 'b'..., current block label, used for generating layer names
    '''
    eps = 1.1e-5
    nb_filter1, nb_filter2, nb_filter3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    scale_name_base = 'scale' + str(stage) + block + '_branch'

    x = Convolution2D(nb_filter1, (1, 1), name=conv_name_base + '2a', use_bias=False)(input_tensor)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2a')(x)
    x = Scale(axis=3, name=scale_name_base + '2a')(x)
    x = Activation('relu', name=conv_name_base + '2a_relu')(x)

    x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x)
    x = Convolution2D(nb_filter2, (kernel_size, kernel_size),
                      name=conv_name_base + '2b', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2b')(x)
    x = Scale(axis=3, name=scale_name_base + '2b')(x)
    x = Activation('relu', name=conv_name_base + '2b_relu')(x)

    x = Convolution2D(nb_filter3, (1, 1), name=conv_name_base + '2c', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2c')(x)
    x = Scale(axis=3, name=scale_name_base + '2c')(x)

    x = add([x, input_tensor], name='res' + str(stage) + block)
    x = Activation('relu', name='res' + str(stage) + block + '_relu')(x)
    return x


def conv_block(input_tensor, kernel_size, filters, stage, block, strides=(2, 2)):
    '''conv_block is the block that has a conv layer at shortcut
    # Arguments
        input_tensor: input tensor
        kernel_size: default 3, the kernel size of the middle conv layer in the main path
        filters: list of integers, the nb_filters of the 3 conv layers in the main path
        stage: integer, current stage label,
                used for generating layer names
        block: 'a', 'b'..., current block label, used for generating layer names
    Note that from stage 3, the first conv layer in the main path has strides=(2, 2),
    and the shortcut has strides=(2, 2) as well.
    '''
    eps = 1.1e-5
    nb_filter1, nb_filter2, nb_filter3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    scale_name_base = 'scale' + str(stage) + block + '_branch'

    x = Convolution2D(nb_filter1, (1, 1), strides=strides,
                      name=conv_name_base + '2a', use_bias=False)(input_tensor)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2a')(x)
    x = Scale(axis=3, name=scale_name_base + '2a')(x)
    x = Activation('relu', name=conv_name_base + '2a_relu')(x)

    x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x)
    x = Convolution2D(nb_filter2, (kernel_size, kernel_size),
                      name=conv_name_base + '2b', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2b')(x)
    x = Scale(axis=3, name=scale_name_base + '2b')(x)
    x = Activation('relu', name=conv_name_base + '2b_relu')(x)

    x = Convolution2D(nb_filter3, (1, 1), name=conv_name_base + '2c', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '2c')(x)
    x = Scale(axis=3, name=scale_name_base + '2c')(x)

    shortcut = Convolution2D(nb_filter3, (1, 1), strides=strides,
                             name=conv_name_base + '1', use_bias=False)(input_tensor)
    shortcut = BatchNormalization(epsilon=eps, axis=3, name=bn_name_base + '1')(shortcut)
    shortcut = Scale(axis=3, name=scale_name_base + '1')(shortcut)

    x = add([x, shortcut], name='res' + str(stage) + block)
    x = Activation('relu', name='res' + str(stage) + block + '_relu')(x)
    return x


def get_loss(pred, label, l2_weight=0.0001):
    diff = tf.square(tf.subtract(pred, label))
    train_vars = tf.trainable_variables()
    l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in train_vars[1:]]) * l2_weight
    loss = tf.reduce_mean(diff + l2_loss)
    # l2_loss already includes the l2_weight factor
    tf.summary.scalar('l2 loss', l2_loss)
    tf.summary.scalar('loss', loss)

    return loss


def summary_scalar(pred, label):
    thresholds = [5, 4, 3, 2, 1, 0.5]
    angles = [float(t) / 180 * scipy.pi for t in thresholds]
    speeds = [float(t) / 20 for t in thresholds]

    for i in range(len(thresholds)):
        scalar_angle = "angle(" + str(angles[i]) + ")"
        scalar_speed = "speed(" + str(speeds[i]) + ")"
        # compare in the same normalized units as the labels
        # (angles in radians, speeds scaled by 1/20, cf. predict.py)
        ac_angle = tf.abs(tf.subtract(pred[:, 1], label[:, 1])) < angles[i]
        ac_speed = tf.abs(tf.subtract(pred[:, 0], label[:, 0])) < speeds[i]
        ac_angle = tf.reduce_mean(tf.cast(ac_angle, tf.float32))
        ac_speed = tf.reduce_mean(tf.cast(ac_speed, tf.float32))

        tf.summary.scalar(scalar_angle, ac_angle)
        tf.summary.scalar(scalar_speed, ac_speed)


def resize(imgs):
    """ Resize a batch of images to the 224x224 ResNet input resolution """
    batch_size = imgs.shape[0]
    imgs_new = []
    for j in range(batch_size):
        img = imgs[j, :, :, :]
        new = scipy.misc.imresize(img, (224, 224))
        imgs_new.append(new)
    imgs_new = np.stack(imgs_new, axis=0)
    return imgs_new


if __name__ == '__main__':
    with tf.Graph().as_default():
        imgs = tf.zeros((32, 224, 224, 3))
        pts = tf.zeros((32, 16384, 3))
        outputs = get_model([imgs, pts], tf.constant(True))
        print(outputs)

--------------------------------------------------------------------------------
/predict.py:
--------------------------------------------------------------------------------
import argparse
import importlib
import os
import sys
import time

import numpy as np
import scipy

import provider
import tensorflow as tf

import matplotlib.pyplot as plt

BASE_DIR = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.join(BASE_DIR, 'models'))

parser = argparse.ArgumentParser()
parser.add_argument('--gpu', type=int, default=0,
                    help='GPU to use [default: GPU 0]')
parser.add_argument('--model', default='nvidia_pn',
                    help='Model name [default: nvidia_pn]')
parser.add_argument('--model_path', default='logs/nvidia_pn/model.ckpt',
                    help='Model checkpoint file path [default: logs/nvidia_pn/model.ckpt]')
parser.add_argument('--max_epoch', type=int, default=250,
                    help='Epoch to run [default: 250]')
parser.add_argument('--batch_size', type=int, default=8,
                    help='Batch Size during training [default: 8]')
parser.add_argument('--result_dir', default='results',
                    help='Result folder path [default: results]')

FLAGS = parser.parse_args()
BATCH_SIZE = FLAGS.batch_size
GPU_INDEX = FLAGS.gpu
MODEL_PATH = FLAGS.model_path

assert (FLAGS.model == "nvidia_pn")
MODEL = importlib.import_module(FLAGS.model)  # import network module
MODEL_FILE = os.path.join(BASE_DIR, 'models', FLAGS.model + '.py')

RESULT_DIR = os.path.join(FLAGS.result_dir, FLAGS.model)
if not os.path.exists(RESULT_DIR):
    os.makedirs(RESULT_DIR)
LOG_FOUT = open(os.path.join(RESULT_DIR, 'log_predict.txt'), 'w')
LOG_FOUT.write(str(FLAGS) + '\n')


def log_string(out_str):
    LOG_FOUT.write(out_str + '\n')
    LOG_FOUT.flush()
    print(out_str)


def predict():
    with tf.device('/gpu:' + str(GPU_INDEX)):
        if 'pn' in MODEL_FILE:
            data_input = provider.Provider()
            imgs_pl, pts_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE)
            imgs_pl = [imgs_pl, pts_pl]
        else:
            raise NotImplementedError

        is_training_pl = tf.placeholder(tf.bool, shape=())
        print(is_training_pl)

        # Get model and loss
        pred = MODEL.get_model(imgs_pl, is_training_pl)

        loss = MODEL.get_loss(pred, labels_pl)

        # Add ops to save and restore all the variables.
        saver = tf.train.Saver()

        # Create a session
        config = tf.ConfigProto()
        config.gpu_options.allow_growth = True
        config.allow_soft_placement = True
        config.log_device_placement = True
        sess = tf.Session(config=config)

        # Restore variables from disk.
        saver.restore(sess, MODEL_PATH)
        log_string("Model restored.")

        ops = {'imgs_pl': imgs_pl,
               'labels_pl': labels_pl,
               'is_training_pl': is_training_pl,
               'pred': pred,
               'loss': loss}

        pred_one_epoch(sess, ops, data_input)


def pred_one_epoch(sess, ops, data_input):
    """ ops: dict mapping from string to tf ops """
    is_training = False
    preds = []
    num_batches = data_input.num_test // BATCH_SIZE

    for batch_idx in range(num_batches):
        if "io" in MODEL_FILE:
            imgs = data_input.load_one_batch(BATCH_SIZE, "test")
            feed_dict = {ops['imgs_pl']: imgs,
                         ops['is_training_pl']: is_training}
        else:
            imgs, others = data_input.load_one_batch(BATCH_SIZE, "test")
            feed_dict = {ops['imgs_pl'][0]: imgs,
                         ops['imgs_pl'][1]: others,
                         ops['is_training_pl']: is_training}

        pred_val = sess.run(ops['pred'], feed_dict=feed_dict)
        preds.append(pred_val)

    preds = np.vstack(preds)
    print(preds.shape)
    # Optionally convert back to physical units (degrees, mph):
    # preds[:, 1] = preds[:, 1] * 180.0 / scipy.pi
    # preds[:, 0] = preds[:, 0] * 20 + 20

    np.savetxt(os.path.join(RESULT_DIR, "behavior_pred.txt"), preds)

    output_dir = os.path.join(RESULT_DIR, "results")
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)
    i_list = get_dicts(description="test")
    counter = 0
    for i, num in enumerate(i_list):
        np.savetxt(os.path.join(output_dir, str(i) + ".txt"),
                   preds[counter:counter + num, :])
        counter += num
    # plot_acc(preds, labels)


def get_dicts(description="val"):
    # Number of predictions belonging to each recorded sequence,
    # used to split `preds` into one result file per sequence.
    if description == "train":
        raise NotImplementedError
    elif description == "val":  # batch_size == 8
        return [120] * 4 + [111] + [120] * 4 + [109] + [120] * 9 + [89 - 87 % 8]
    elif description == "test":  # batch_size == 8
        return [120] * 9 + [116] + [120] * 4 + [106] + [120] * 4 + [114 - 114 % 8]
    else:
        raise NotImplementedError


if __name__ == "__main__":
    predict()
    # plot_acc_from_txt()

--------------------------------------------------------------------------------
/tools/README.md:
--------------------------------------------------------------------------------
## Tools to process point clouds and images

A set of tools (Python scripts) is provided to make data processing easier and more convenient. Note that they are not professional tools, so you may need to modify some lines before using them in your own cases.

If you have more efficient tools, code or other suggestions to process DBNet data, especially point clouds, don't hesitate to contact [@wangjksjtu(wangjksjtu@gmail.com)](https://github.com/wangjksjtu) or __submit pull-requests directly__.
Your contributions are highly encouraged and appreciated!

- __img_pre.py__: cropping and resizing images using python-opencv
- __las2fmap.py__: extracting feature maps from point clouds
- __pcd2las.py__: downsampling point clouds; converting point clouds from '.pcd' to '.las' format
- __video2img.py__: converting one video to continuous frames

To see HELP for these scripts:

    python <script_name>.py -h

### Requirements
- python-opencv
- numpy, pickle, scipy, __laspy__
- __CloudCompare (CC)__ (set __PATH variables__)

### CC Examples
Convert point clouds to `.las` format:

    CloudCompare.exe -SILENT -NO_TIMESTAMP -C_EXPORT_FMT LAS -O %s

Downsample point clouds to 16384 points and save in `.las` format:

    CloudCompare.exe -SILENT -NO_TIMESTAMP -C_EXPORT_FMT LAS -O %s -SS RANDOM 16384

More command-line usages of CloudCompare are available on the [official manual page](http://www.cloudcompare.org/doc/wiki/index.php?title=Command_line_mode).

### las2fmap Examples
Download the example point cloud from [Google Drive](https://drive.google.com/file/d/1lxl7M2MTA7afg5UItA5hCvh-Wt5bTSNJ/view?usp=sharing).

    python las2fmap.py -f example.las

To see HELP for the `las2fmap.py` script:

    python las2fmap.py -h
    # usage: las2fmap.py [-h] [-d DIR] [-f FILE]
    #
    # optional arguments:
    #   -h, --help            show this help message and exit
    #   -d DIR, --dir DIR     Directory of las files [default: '']
    #   -f FILE, --file FILE  Specify one las file you want to convert
    #                         [default: '']
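### Image Pipeline Example
As a sketch of how the image tools chain together (the file and directory names below are placeholders; the flags are the ones listed by each script's `-h`), one possible pipeline from a DVR video to network-ready 200x66 frames is:

    # extract one frame per second into a subfolder of DVR_1920x1080
    python video2img.py -i dvr.mp4 -t 1.0 -o DVR_1920x1080/run1

    # crop the 1920x1080 frames to the 1080x600 road region
    python img_pre.py --input_dir DVR_1920x1080 --output_dir DVR_1080x600 --oper crop

    # resize the crops to the 200x66 network input resolution
    python img_pre.py --input_dir DVR_1080x600 --output_dir DVR_200x66 --oper resize

Note that `img_pre.py` processes the subfolders of `--input_dir` and, for resizing, parses the target resolution from the `--output_dir` suffix (e.g. `_200x66`).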
--------------------------------------------------------------------------------
/tools/img_pre.py:
--------------------------------------------------------------------------------
"""
Simple python scripts for cropping and resizing images
Author: Jingkang Wang
Date: November 2017
Dependency: python-opencv
"""

import argparse
import glob
import os

import cv2


def crop(input_dir="DVR_1920x1080",
         output_dir="DVR_1080x600"):
    """
    Crop images in folders
    :param input_dir: path of input directory
    :param output_dir: path of output directory
    """
    assert os.path.exists(input_dir)
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)
    subfolders = glob.glob(os.path.join(input_dir, "*"))

    for folder in subfolders:
        new_subfolder = os.path.join(output_dir, folder[folder.rfind("/") + 1:])
        # print(new_subfolder)
        if not os.path.exists(new_subfolder):
            os.mkdir(new_subfolder)
        files = glob.glob(os.path.join(folder, "*.jpg"))
        # print(files)
        for filename in files:
            out_filename = os.path.join(output_dir, filename[filename.find("/") + 1:])
            print(filename, out_filename)
            crop_img(filename, out_filename)


def crop_img(input_img, output_img,
             left=500, right=1580, down=200, up=800):
    """
    Crop a single image
    :param input_img: path of input image
    :param output_img: path of cropped image
    :param left, right, down, up: crop boundaries
        (the defaults cut a 1080x600 region out of 1920x1080 frames)
    """
    img = cv2.imread(input_img)
    cropped = img[down:up, left:right]
    cv2.imwrite(output_img, cropped)


def resize(input_dir="DVR_1080x600",
           output_dir="DVR_200x66"):
    """
    Resize images in folders
    (the target resolution is parsed from the output directory suffix)
    :param input_dir: path of input directory
    :param output_dir: path of output directory
    """
    width = int(output_dir.split("_")[-1].split("x")[0])
    height = int(output_dir.split("_")[-1].split("x")[-1])
    print(width, height)
    assert os.path.exists(input_dir)
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)
    subfolders = glob.glob(os.path.join(input_dir, "*"))
    for folder in subfolders:
        new_subfolder = os.path.join(output_dir, folder[folder.rfind("/") + 1:])
        # print(new_subfolder)
        if not os.path.exists(new_subfolder):
            os.mkdir(new_subfolder)
        files = glob.glob(os.path.join(folder, "*.jpg"))
        # print(files)
        for filename in files:
            out_filename = os.path.join(output_dir, filename[filename.find("/") + 1:])
            print(filename, out_filename)
            resize_img(filename, out_filename, width, height)


def resize_img(input_img, output_img, newx, newy):
    """
    Resize a single image
    :param input_img: path of input image
    :param output_img: path of resized image
    :param newx, newy: width and height of the resized image
    """
    img = cv2.imread(input_img)
    newimage = cv2.resize(img, (newx, newy))
    cv2.imwrite(output_img, newimage)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('--input_dir', type=str, default="DVR_1920x1080",
                        help='Path of input directory [default: DVR_1920x1080]')
    parser.add_argument('--output_dir', type=str, default="DVR_1080x600",
                        help='Path of output directory [default: DVR_1080x600]')
    parser.add_argument('--oper', type=str, default="crop",
                        help='Operation to conduct (crop/resize) [default: crop]')
    FLAGS = parser.parse_args()

    INPUT_DIR = FLAGS.input_dir
    OUTPUT_DIR = FLAGS.output_dir
    OPER = FLAGS.oper

    assert (os.path.exists(INPUT_DIR))
    if (OPER == "crop"):
        crop(INPUT_DIR, OUTPUT_DIR)
    elif (OPER == "resize"):
        resize(INPUT_DIR, OUTPUT_DIR)
    else:
        raise NotImplementedError

--------------------------------------------------------------------------------
/tools/las2fmap.py:
--------------------------------------------------------------------------------
"""
Simple python scripts for extracting feature maps from point clouds
Author: Jingkang Wang
Date: November 2017
Dependency: python-opencv, numpy, pickle, scipy, laspy
"""

import argparse
import glob
import math
import os
import pickle

import numpy as np
from scipy.misc import imsave, imshow

import cv2
from laspy.base import Writer
from laspy.file import File


def lasReader(filename):
    """
    Read xyz points from a single las file
    :param filename: path of a single point cloud
    """
    f = File(filename, mode='r')
    x_max, x_min = np.max(f.x), np.min(f.x)
    y_max, y_min = np.max(f.y), np.min(f.y)
    z_max, z_min = np.max(f.z), np.min(f.z)
    return np.transpose(np.asarray([f.x, f.y, f.z])), \
        [(x_min, x_max), (y_min, y_max), (z_min, z_max)], f.header


def transform(merge, ranges, order=[0, 1, 2]):
    """
    Swap the xyz axes
    :param merge: xyz points (3xN)
    :param ranges: ranges of each axis
    :param order: new order of the axes [default: [0, 1, 2]]
    """
    i = np.argsort(order)
    merge = merge[i, :]
    ranges = np.asarray(ranges)[i, :]
    return merge, ranges


def standardize(points, ranges=None):
    """
    Standardize points in point clouds (shift so the minima sit at zero)
    :param points: xyz points (Nx3)
    :param ranges: specified ranges to shift by [default: None]
    """
    if ranges is not None:
        points -= np.array([ranges[0][0], ranges[1][0], ranges[2][0]])
    else:
        x_min = np.min(points[:, 0])
        y_min = np.min(points[:, 1])
        z_min = np.min(points[:, 2])
        points -= np.array([x_min, y_min, z_min])
    return np.transpose(points), [(0, np.max(points[:, 0])),
                                  (0, np.max(points[:, 1])),
                                  (0, np.max(points[:, 2]))]
def rotate(img, angle=180):
    """
    Rotate images using opencv
    :param img: one image (opencv format)
    :param angle: rotation angle [default: 180]
    """
    rows, cols = img.shape[0], img.shape[1]
    rotation_matrix = cv2.getRotationMatrix2D((rows / 2, cols / 2), angle, 1)
    dst = cv2.warpAffine(img, rotation_matrix, (cols, rows))
    return dst


def rotate_about_center(src, angle, scale=1.):
    """
    Rotate an image about its center
    :param src: one image (opencv format)
    :param angle: rotation angle
    :param scale: re-scaling factor [default: 1.]
    """
    w = src.shape[1]
    h = src.shape[0]
    rangle = np.deg2rad(angle)  # angle in radians
    # now calculate new image width and height
    nw = (abs(np.sin(rangle) * h) + abs(np.cos(rangle) * w)) * scale
    nh = (abs(np.cos(rangle) * h) + abs(np.sin(rangle) * w)) * scale
    # ask opencv for the rotation matrix
    rot_mat = cv2.getRotationMatrix2D((nw * 0.5, nh * 0.5), angle, scale)
    # calculate the move from the old center to the new center combined
    # with the rotation
    rot_move = np.dot(rot_mat, np.array([(nw - w) * 0.5, (nh - h) * 0.5, 0]))
    # the move only affects the translation, so update the translation
    # part of the transform
    rot_mat[0, 2] += rot_move[0]
    rot_mat[1, 2] += rot_move[1]
    return cv2.warpAffine(src, rot_mat, (int(math.ceil(nw)), int(math.ceil(nh))),
                          flags=cv2.INTER_LANCZOS4)


def feature_map(merge, ranges, alpha=0.2, beta=0.8, GSD=0.5):
    """
    Obtain feature maps from point clouds
    :param merge: merged xyz points (3xN)
    :param ranges: focused ranges
    :param alpha, beta, GSD: hyper-parameters in the paper
    """
    (X_min, X_max) = ranges[0]
    (Y_min, Y_max) = ranges[1]
    (Z_min, Z_max) = ranges[2]

    W = int((X_max - X_min) / GSD) + 1
    H = int((Y_max - Y_min) / GSD) + 1
    print("(W, H) = (" + str(W) + ", " + str(H) + ")")
    feature_map = np.zeros((W, H))

    # bucket every point into its (x, y) grid cell
    net_dict = dict()
    for i in range(merge.shape[1]):
        if i % 1000000 == 0:
            print("processed %d points..." % i)
        point = merge[:, i]
        x = int((point[0] - X_min) / GSD)
        y = int((point[1] - Y_min) / GSD)
        try:
            net_dict[(x, y)].append(i)
        except KeyError:
            net_dict[(x, y)] = [i]

    print("mapping points...")
    # calculate the feature
    count = 0
    F_ij_min = 1000
    F_ij_max = -1000
    for i in range(W):
        for j in range(H):
            F_ij = 0

            try:
                h_min = 1000
                h_max = -1000
                for num in net_dict[(i, j)]:
                    point = merge[:, num]
                    h_max = max(point[2], h_max)
                    h_min = min(point[2], h_min)

                Z_ijs = []
                W_ijs = []
                tol = 1e-5
                for num in net_dict[(i, j)]:
                    point = merge[:, num]
                    Z_ij = point[2]
                    H_ij = Z_ij - Z_min
                    # obtain D_ij from Eqn.(5)
                    x_ij = (i + 0.5) * GSD + X_min
                    y_ij = (j + 0.5) * GSD + Y_min
                    D_ij = math.sqrt((point[0] - x_ij)**2 + (point[1] - y_ij)**2)
                    # obtain W_ij_XY and W_ij_H from Eqn.(4)
                    W_ij_XY = math.sqrt(2) * GSD / (D_ij + tol)
                    W_ij_H = H_ij * (h_min - Z_min) / (Z_max - h_max + tol)
                    # obtain W_ij from Eqn.(3)
                    W_ij = alpha * W_ij_XY + beta * W_ij_H
                    Z_ijs.append(Z_ij)
                    W_ijs.append(W_ij)
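                # Each cell value is the weight-averaged height of its points:
                #   F_ij = sum_k(W_ijs[k] * Z_ijs[k]) / sum(W_ijs),
                # so points near the cell center and with larger relative
                # height carry more weight.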
                for k in range(len(Z_ijs)):
                    # obtain feature value F_ij from Eqn.(2)
                    F_ij += W_ijs[k] * Z_ijs[k]

                F_ij /= sum(W_ijs)
                count += 1

                F_ij_min = min(F_ij, F_ij_min)
                F_ij_max = max(F_ij, F_ij_max)
            except KeyError:
                # empty cell: keep F_ij = 0
                pass

            feature_map[i][j] = F_ij

    # normalize the map to [0, 255]
    feature_map -= F_ij_min
    feature_map /= (F_ij_max - F_ij_min)
    feature_map *= 255

    return feature_map


def clean_map(fmap):
    """
    Clean the feature map by dropping mostly-empty rows and columns
    :param fmap: feature map
    """
    # fmap = fmap[~(fmap == 0).all(1)]
    fmap = fmap[(fmap != 0).sum(axis=1) >= 100, :]
    fmap = fmap[:, (fmap != 0).sum(axis=0) >= 50]

    return fmap


def resize(path, x_axis, y_axis):
    """
    Resize images
    :param path: path of an image
    :param x_axis: width of resized image
    :param y_axis: height of resized image
    """
    img = cv2.imread(path)
    new_image = cv2.resize(img, (x_axis, y_axis))
    cv2.imwrite(path, new_image)


def get_fmap(filename, dir1='gray', dir2='jet'):
    """
    Visualize feature maps
    :param filename: path of one las file
    :param dir1: path of gray images to be saved
    :param dir2: path of jet images to be saved
    """
    if not os.path.exists(dir1): os.mkdir(dir1)
    if not os.path.exists(dir2): os.mkdir(dir2)

    if not os.path.isfile(filename):
        print("[Error]: '%s' is not a valid filename" % filename)
        return False

    merge, ranges, _ = lasReader(filename)
    merge, ranges = standardize(merge, ranges)
    print("standardized point clouds")
    print("total: " + str(merge.shape[1]) + " points")

    # transform x,y,z axis: 0,2,1
    merge, ranges = transform(merge, ranges, order=[1, 2, 0])

    # clean the feature map
    fmap = clean_map(feature_map(merge, ranges=ranges, GSD=0.05))
    cv2.imwrite(os.path.join(dir1, '%s.jpg' % filename[:-4]),
                rotate_about_center(fmap, 180, 1.0))

    # uncomment the following line if you want to resize the feature map
    # resize(os.path.join(dir1, '%s.jpg' % filename[:-4]), x_axis=1080, y_axis=270)

    gray = cv2.imread(os.path.join(dir1, '%s.jpg' % filename[:-4]))
    gray_single = gray[:, :, 0]
    imC = cv2.applyColorMap(gray_single, cv2.COLORMAP_JET)
    cv2.imwrite(os.path.join(dir1, '%s_jet_tmp.jpg' % filename[:-4]), imC)

    img = cv2.imread(os.path.join(dir1, '%s_jet_tmp.jpg' % filename[:-4]))
    cv2.imwrite(os.path.join(dir2, '%s_jet.jpg' % filename[:-4]), img)
    os.system("rm %s_jet_tmp.jpg" % os.path.join(dir1, filename[:-4]))

    return True


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('-d', '--dir', default='',
                        help='Directory of las files [default: \'\']')
    parser.add_argument('-f', '--file', default='',
                        help='Specify one las file you want to convert [default: \'\']')
    FLAGS = parser.parse_args()

    d = FLAGS.dir
    f = FLAGS.file

    if f == '' and d == '':
        parser.print_help()
    elif f != '' and d != '':
        if not os.path.isdir(d):
            print("[Error]: '%s' is not a valid directory!" % d)
        else:
            p = os.path.join(d, f)
            print(p)
            if get_fmap(p):
                print("Finished!")
    elif f != '':
        p = f
        if get_fmap(p):
            print("Finished!")
    else:
        if not os.path.isdir(d):
            print("[Error]: '%s' is not a valid directory!" % d)
        else:
            files = sorted(glob.glob(os.path.join(d, "*.las")))
            count = 0
            for f in files:
                if get_fmap(f):
                    count += 1
                    if count % 25 == 0:
                        print("%d finished!" % count)


if __name__ == "__main__":
    main()

--------------------------------------------------------------------------------
/tools/pcd2las.py:
--------------------------------------------------------------------------------
"""
Simple python scripts for 1) downsampling point clouds
                          2) converting point clouds from '.pcd' to '.las' format.
Author: Jingkang Wang
Date: November 2017
Dependency: CloudCompare
"""

import argparse
import glob
import os
import time


def downsample(absolute_path):
    """
    Downsample point clouds (supported formats: las/pcd/...)
    :param absolute_path: directory of point clouds
    """
    files = glob.glob(absolute_path + "*.las")
    files.sort()
    files.sort(key=len)
    time_in = time.time()
    for f in files:
        os.system("CloudCompare.exe -SILENT \
                   -NO_TIMESTAMP -C_EXPORT_FMT LAS \
                   -O %s -SS RANDOM 16384" % f)
    print(time.time() - time_in)


def pcd2las(absolute_path):
    """
    Convert point clouds from '.pcd' to '.las' format (with downsampling)
    :param absolute_path: directory of point clouds
    """
    print(absolute_path)
    files = glob.glob(absolute_path + "*.pcd")
    files.sort()
    files.sort(key=len)
    print(files)
    time_in = time.time()
    for f in files:
        os.system("CloudCompare.exe -SILENT \
                   -NO_TIMESTAMP -C_EXPORT_FMT LAS \
                   -O %s -SS RANDOM 16384" % f)
    print(time.time() - time_in)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('input_dir', type=str,
                        help='Input directory of point clouds')
    parser.add_argument('oper', type=str, nargs='?', default="downsample",
                        help='Operation to conduct (downsample/pcd2las) [default: downsample]')
    FLAGS = parser.parse_args()
    INPUT_DIR = FLAGS.input_dir
    OPER = FLAGS.oper

    assert (os.path.exists(INPUT_DIR))
    if (OPER == "downsample"):
        downsample(INPUT_DIR)
    else:
        pcd2las(INPUT_DIR)

--------------------------------------------------------------------------------
/tools/video2img.py:
--------------------------------------------------------------------------------
"""
Simple python scripts for converting one video to continuous frames
Author: Jingkang Wang
Date: November 2017
Dependency: python-opencv
"""

import argparse
import math
import os
import sys

import cv2

parser = argparse.ArgumentParser()
parser.add_argument('-i', help='Path of video')
parser.add_argument('-t', default=1.0, help='Time interval')
parser.add_argument('-o', default='./images', help='Dir of images')
FLAGS = parser.parse_args()

videoFile = FLAGS.i
imagesFolder = FLAGS.o
t_int = FLAGS.t

if videoFile is None:
    print("[Error]: Please input path of video")
    sys.exit(0)

if not os.path.exists(videoFile):
    print("[Error]: %s is not a valid video" % videoFile)
    sys.exit(0)

if not os.path.exists(imagesFolder):
    os.makedirs(imagesFolder)

cap = cv2.VideoCapture(videoFile)
frameRate = cap.get(5)  # frame rate

count = 0
while cap.isOpened():
    frameId = cap.get(1)  # current frame index
    success, frame = cap.read()
    if not success:
        break
--------------------------------------------------------------------------------
/tools/video2img.py:
--------------------------------------------------------------------------------
1 | """
2 | Simple python scripts for converting one video to continuous frames
3 | Author: Jingkang Wang
4 | Date: November 2017
5 | Dependency: python-opencv
6 | """
7 | 
8 | import argparse
9 | import math
10 | import os
11 | import sys
12 | 
13 | import cv2
14 | 
15 | parser = argparse.ArgumentParser()
16 | parser.add_argument('-i', help='Path of video')
17 | parser.add_argument('-t', type=float, default=1.0, help='Time interval between saved frames (seconds)')
18 | parser.add_argument('-o', default='./images', help='Dir of images')
19 | FLAGS = parser.parse_args()
20 | 
21 | videoFile = FLAGS.i
22 | imagesFolder = FLAGS.o
23 | t_int = FLAGS.t
24 | 
25 | if videoFile is None:
26 |     print ("[Error]: Please input path of video")
27 |     sys.exit(0)
28 | 
29 | if not os.path.exists(videoFile):
30 |     print ("[Error]: %s is not a valid video" % videoFile)
31 |     sys.exit(0)
32 | 
33 | if not os.path.exists(imagesFolder): os.makedirs(imagesFolder)
34 | 
35 | cap = cv2.VideoCapture(videoFile)
36 | frameRate = cap.get(cv2.CAP_PROP_FPS)  # frame rate (property index 5)
37 | 
38 | count = 0
39 | while(cap.isOpened()):
40 |     frameId = cap.get(cv2.CAP_PROP_POS_FRAMES)  # current frame index (property index 1)
41 |     success, frame = cap.read()
42 |     if not success:
43 |         break
44 |     #print frameId
45 |     if (int(frameId) % math.floor(float(t_int) * frameRate) == 0):
46 |         filename = imagesFolder + "/images_" + str(int(frameId)) + ".jpg"
47 |         cv2.imwrite(filename, frame)
48 |         count += 1
49 | 
50 |     if (count % 100 == 0): print ("100 finished!")
51 | 
52 | cap.release()
53 | print ("Done!")
54 | print ("FrameRate: %f" % frameRate)
55 | print ("Total: %d" % count)
56 | 
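The sampling rule above keeps every `floor(t_int * frameRate)`-th frame, which reduces the time interval to one integer step. A small sketch (`kept_frame_ids` is illustrative only) showing which frames survive, and why the step should be clamped to at least 1 when `t_int * frameRate < 1`:

```python
import math

def kept_frame_ids(frame_rate, t_int, total_frames):
    """Frame indices the loop above writes out: every floor(t_int * frame_rate)-th."""
    step = max(int(math.floor(float(t_int) * frame_rate)), 1)  # avoid modulo-by-zero
    return [i for i in range(total_frames) if i % step == 0]

# At 30 fps with a 1-second interval, roughly one frame per second is kept:
print(kept_frame_ids(frame_rate=30.0, t_int=1.0, total_frames=91))  # [0, 30, 60, 90]
```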
--------------------------------------------------------------------------------
/train.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import importlib
3 | import os
4 | import sys
5 | import time
6 | 
7 | import numpy as np
8 | import scipy
9 | 
10 | import provider
11 | import tensorflow as tf
12 | import keras
13 | 
14 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
15 | sys.path.append(os.path.join(BASE_DIR, 'models'))
16 | 
17 | parser = argparse.ArgumentParser()
18 | parser.add_argument('--gpu', type=int, default=0,
19 |                     help='GPU to use [default: GPU 0]')
20 | parser.add_argument('--model', default='nvidia_pn',
21 |                     help='Model name [default: nvidia_pn]')
22 | parser.add_argument('--add_lstm', type=bool, default=False,
23 |                     help='Introduce LSTM mechanism in network [default: False]')
24 | parser.add_argument('--log_dir', default='logs',
25 |                     help='Log dir [default: logs]')
26 | parser.add_argument('--max_epoch', type=int, default=250,
27 |                     help='Epoch to run [default: 250]')
28 | parser.add_argument('--batch_size', type=int, default=8,
29 |                     help='Batch Size during training [default: 8]')
30 | parser.add_argument('--learning_rate', type=float, default=0.001,
31 |                     help='Learning rate during training [default: 0.001]')
32 | parser.add_argument('--momentum', type=float, default=0.9,
33 |                     help='Momentum for the momentum optimizer [default: 0.9]')
34 | parser.add_argument('--optimizer', default='adam',
35 |                     help='adam or momentum [default: adam]')
36 | parser.add_argument('--decay_step', type=int, default=200000,
37 |                     help='Decay step for lr decay [default: 200000]')
38 | parser.add_argument('--decay_rate', type=float, default=0.7,
39 |                     help='Decay rate for lr decay [default: 0.7]')
40 | FLAGS = parser.parse_args()
41 | 
42 | BATCH_SIZE = FLAGS.batch_size
43 | MAX_EPOCH = FLAGS.max_epoch
44 | LEARNING_RATE = FLAGS.learning_rate
45 | OPTIMIZER = FLAGS.optimizer
46 | BASE_LEARNING_RATE = FLAGS.learning_rate
47 | GPU_INDEX = FLAGS.gpu
48 | MOMENTUM = FLAGS.momentum
49 | DECAY_STEP = FLAGS.decay_step
50 | DECAY_RATE = FLAGS.decay_rate
51 | ADD_LSTM = FLAGS.add_lstm
52 | 
53 | BN_INIT_DECAY = 0.5
54 | BN_DECAY_DECAY_RATE = 0.5
55 | BN_DECAY_DECAY_STEP = float(DECAY_STEP)
56 | BN_DECAY_CLIP = 0.99
57 | 
58 | supported_models = ["nvidia_io", "nvidia_pn",
59 |                     "resnet152_io", "resnet152_pn",
60 |                     "inception_v4_io", "inception_v4_pn",
61 |                     "densenet169_io", "densenet169_pn"]
62 | assert (FLAGS.model in supported_models)
63 | MODEL = importlib.import_module(FLAGS.model)  # import network module
64 | MODEL_FILE = os.path.join(BASE_DIR, 'models', FLAGS.model+'.py')
65 | 
66 | LOG_DIR = os.path.join(FLAGS.log_dir, FLAGS.model)
67 | if not os.path.exists(LOG_DIR):
68 |     os.makedirs(LOG_DIR)
69 | os.system('cp %s %s' % (MODEL_FILE, LOG_DIR))  # bkp of model def
70 | os.system('cp train.py %s' % (LOG_DIR))  # bkp of train procedure
71 | LOG_FOUT = open(os.path.join(LOG_DIR, 'log_train.txt'), 'w')
72 | LOG_FOUT.write(str(FLAGS)+'\n')
73 | 
74 | 
75 | def log_string(out_str):
76 |     LOG_FOUT.write(out_str+'\n')
77 |     LOG_FOUT.flush()
78 |     print(out_str)
79 | 
80 | 
81 | def get_learning_rate(batch):
82 |     learning_rate = tf.train.exponential_decay(
83 |         BASE_LEARNING_RATE,  # Base learning rate.
84 |         batch * BATCH_SIZE,  # Current index into the dataset.
85 |         DECAY_STEP,          # Decay step.
86 |         DECAY_RATE,          # Decay rate.
87 |         staircase=True)
88 |     learning_rate = tf.maximum(learning_rate, 0.00001)  # CLIP THE LEARNING RATE!
89 |     return learning_rate
90 | 
91 | 
92 | def get_bn_decay(batch):
93 |     bn_momentum = tf.train.exponential_decay(
94 |         BN_INIT_DECAY,
95 |         batch*BATCH_SIZE,
96 |         BN_DECAY_DECAY_STEP,
97 |         BN_DECAY_DECAY_RATE,
98 |         staircase=True)
99 |     bn_decay = tf.minimum(BN_DECAY_CLIP, 1 - bn_momentum)
100 |     return bn_decay
101 | 
102 | 
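`get_learning_rate()` and `get_bn_decay()` are both staircase exponential schedules. A plain-Python rendering of the learning-rate curve under the default flags (illustrative, not repo code), showing the step-wise drops every `DECAY_STEP` examples:

```python
def decayed_lr(global_step, base_lr=0.001, batch_size=8,
               decay_step=200000, decay_rate=0.7, floor=1e-5):
    """lr = base_lr * decay_rate ** floor(step * batch_size / decay_step), clipped."""
    lr = base_lr * decay_rate ** ((global_step * batch_size) // decay_step)
    return max(lr, floor)

for step in (0, 25000, 50000, 75000):
    print(step, decayed_lr(step))  # ~0.001, ~0.0007, ~0.00049, ~0.000343
```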
103 | def train():
104 |     with tf.Graph().as_default():
105 |         with tf.device('/gpu:'+str(GPU_INDEX)):
106 |             if '_pn' in MODEL_FILE:
107 |                 data_input = provider.Provider()
108 |                 imgs_pl, pts_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE)
109 |                 imgs_pl = [imgs_pl, pts_pl]
110 |             elif '_io' in MODEL_FILE:
111 |                 data_input = provider.Provider()
112 |                 imgs_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE)
113 |             else:
114 |                 raise NotImplementedError
115 | 
116 |             is_training_pl = tf.placeholder(tf.bool, shape=())
117 |             print(is_training_pl)
118 | 
119 |             # Note the global_step=batch parameter to minimize.
120 |             # That tells the optimizer to helpfully increment the 'batch'
121 |             # parameter for you every time it trains.
122 |             batch = tf.Variable(0)
123 |             bn_decay = get_bn_decay(batch)
124 |             tf.summary.scalar('bn_decay', bn_decay)
125 | 
126 |             # Get model and loss
127 |             pred = MODEL.get_model(imgs_pl, is_training_pl,
128 |                                    bn_decay=bn_decay)
129 | 
130 |             loss = MODEL.get_loss(pred, labels_pl)
131 |             MODEL.summary_scalar(pred, labels_pl)
132 | 
133 |             # Get training operator
134 |             learning_rate = get_learning_rate(batch)
135 |             tf.summary.scalar('learning_rate', learning_rate)
136 |             if OPTIMIZER == 'momentum':
137 |                 optimizer = tf.train.MomentumOptimizer(learning_rate,
138 |                                                        momentum=MOMENTUM)
139 |             elif OPTIMIZER == 'adam':
140 |                 optimizer = tf.train.AdamOptimizer(learning_rate)
141 |             train_op = optimizer.minimize(loss, global_step=batch)
142 |             # Add ops to save and restore all the variables.
143 |             saver = tf.train.Saver()
144 | 
145 |             # Create a session
146 |             config = tf.ConfigProto()
147 |             config.gpu_options.allow_growth = True
148 |             config.allow_soft_placement = True
149 |             config.log_device_placement = False
150 |             sess = tf.Session(config=config)
151 | 
152 |             # Add summary writers
153 |             # merged = tf.merge_all_summaries()
154 |             merged = tf.summary.merge_all()
155 |             train_writer = tf.summary.FileWriter(os.path.join(LOG_DIR, 'train'),
156 |                                                  sess.graph)
157 |             test_writer = tf.summary.FileWriter(os.path.join(LOG_DIR, 'test'))
158 | 
159 |             # Init variables
160 |             init = tf.global_variables_initializer()
161 |             sess.run(init, {is_training_pl: True})
162 | 
163 |             ops = {'imgs_pl': imgs_pl,
164 |                    'labels_pl': labels_pl,
165 |                    'is_training_pl': is_training_pl,
166 |                    'pred': pred,
167 |                    'loss': loss,
168 |                    'train_op': train_op,
169 |                    'merged': merged,
170 |                    'step': batch}
171 | 
172 |             eval_acc_max = 0
173 |             for epoch in range(MAX_EPOCH):
174 |                 log_string('**** EPOCH %03d ****' % (epoch))
175 |                 sys.stdout.flush()
176 | 
177 |                 train_one_epoch(sess, ops, train_writer, data_input)
178 |                 eval_acc = eval_one_epoch(sess, ops, test_writer, data_input)
179 |                 if eval_acc > eval_acc_max:
180 |                     eval_acc_max = eval_acc
181 |                     save_path = saver.save(sess, os.path.join(LOG_DIR, "model_best.ckpt"))
182 |                     log_string("Model saved in file: %s" % save_path)
183 | 
184 |                 # Save the variables to disk.
185 |                 if epoch % 10 == 0:
186 |                     save_path = saver.save(sess, os.path.join(LOG_DIR, "model.ckpt"))
187 |                     log_string("Model saved in file: %s" % save_path)
188 | 
189 | 
190 | def train_one_epoch(sess, ops, train_writer, data_input):
191 |     """ ops: dict mapping from string to tf ops """
192 |     is_training = True
193 |     num_batches = data_input.num_train // BATCH_SIZE
194 |     loss_sum = 0
195 |     acc_a_sum = 0
196 |     acc_s_sum = 0
197 |     counter = 0
198 | 
199 |     for batch_idx in range(num_batches):
200 |         if "_io" in MODEL_FILE:
201 |             imgs, labels = data_input.load_one_batch(BATCH_SIZE, "train", reader_type="io")
202 |             if "resnet" in MODEL_FILE or "inception" in MODEL_FILE or "densenet" in MODEL_FILE:
203 |                 imgs = MODEL.resize(imgs)
204 |             feed_dict = {ops['imgs_pl']: imgs,
205 |                          ops['labels_pl']: labels,
206 |                          ops['is_training_pl']: is_training}
207 |         else:
208 |             imgs, others, labels = data_input.load_one_batch(BATCH_SIZE, "train")
209 |             if "resnet" in MODEL_FILE or "inception" in MODEL_FILE or "densenet" in MODEL_FILE:
210 |                 imgs = MODEL.resize(imgs)
211 |             feed_dict = {ops['imgs_pl'][0]: imgs,
212 |                          ops['imgs_pl'][1]: others,
213 |                          ops['labels_pl']: labels,
214 |                          ops['is_training_pl']: is_training}
215 | 
216 |         summary, step, _, loss_val, pred_val = sess.run([ops['merged'],
217 |                                                          ops['step'],
218 |                                                          ops['train_op'],
219 |                                                          ops['loss'],
220 |                                                          ops['pred']],
221 |                                                         feed_dict=feed_dict)
222 |         train_writer.add_summary(summary, step)
223 |         loss_sum += np.mean(np.square(np.subtract(pred_val, labels)))
224 |         acc_a = np.abs(np.subtract(pred_val[:, 1], labels[:, 1])) < (5.0 / 180 * scipy.pi)
225 |         acc_a = np.mean(acc_a)
226 |         acc_a_sum += acc_a
227 |         acc_s = np.abs(np.subtract(pred_val[:, 0], labels[:, 0])) < (5.0 / 20)
228 |         acc_s = np.mean(acc_s)
229 |         acc_s_sum += acc_s
230 | 
231 |         counter += 1
232 |         if counter % 200 == 0:
233 |             log_string(str(counter) + " step:")
234 |             log_string('loss: %f' % (loss_sum / float(batch_idx + 1)))
235 |             log_string('acc (angle): %f' % (acc_a_sum / float(batch_idx + 1)))
236 |             log_string('acc (speed): %f' % (acc_s_sum / float(batch_idx + 1)))
237 | 
238 |     log_string('mean loss: %f' % (loss_sum / float(num_batches)))
239 |     log_string('accuracy (angle): %f' % (acc_a_sum / float(num_batches)))
240 |     log_string('accuracy (speed): %f' % (acc_s_sum / float(num_batches)))
241 | 
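The per-batch accuracies above count a prediction as correct when the steering-angle error is under 5 degrees and the speed error is under 5.0 / 20; the 1/20 factor appears to be the dataset's speed normalization (an inference from the threshold, not documented here), and the `[speed, angle]` column layout is inferred from the `acc_s`/`acc_a` indexing. A standalone sketch:

```python
import numpy as np

ANGLE_TOL = 5.0 / 180 * np.pi  # 5 degrees, in radians
SPEED_TOL = 5.0 / 20           # 5 units of (apparently normalized) speed

def batch_accuracy(pred, labels):
    """pred, labels: (B, 2) arrays laid out [speed, angle], as in train.py."""
    acc_angle = np.mean(np.abs(pred[:, 1] - labels[:, 1]) < ANGLE_TOL)
    acc_speed = np.mean(np.abs(pred[:, 0] - labels[:, 0]) < SPEED_TOL)
    return acc_angle, acc_speed

pred   = np.array([[0.50, 0.05], [0.80, 0.30]])
labels = np.array([[0.48, 0.00], [0.40, 0.00]])
print(batch_accuracy(pred, labels))  # (0.5, 0.5): one row passes each tolerance
```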
242 | 
243 | def eval_one_epoch(sess, ops, test_writer, data_input):
244 |     """ ops: dict mapping from string to tf ops """
245 |     is_training = False
246 | 
247 | 
248 |     num_batches = data_input.num_val // BATCH_SIZE
249 |     loss_sum = 0
250 |     acc_a_sum = 0
251 |     acc_s_sum = 0
252 | 
253 |     for batch_idx in range(num_batches):
254 |         if "_io" in MODEL_FILE:
255 |             imgs, labels = data_input.load_one_batch(BATCH_SIZE, "val", reader_type="io")
256 |             if "resnet" in MODEL_FILE or "inception" in MODEL_FILE or "densenet" in MODEL_FILE:
257 |                 imgs = MODEL.resize(imgs)
258 |             feed_dict = {ops['imgs_pl']: imgs,
259 |                          ops['labels_pl']: labels,
260 |                          ops['is_training_pl']: is_training}
261 |         else:
262 |             imgs, others, labels = data_input.load_one_batch(BATCH_SIZE, "val")
263 |             if "resnet" in MODEL_FILE or "inception" in MODEL_FILE or "densenet" in MODEL_FILE:
264 |                 imgs = MODEL.resize(imgs)
265 |             feed_dict = {ops['imgs_pl'][0]: imgs,
266 |                          ops['imgs_pl'][1]: others,
267 |                          ops['labels_pl']: labels,
268 |                          ops['is_training_pl']: is_training}
269 |         # do not run ops['train_op'] here: evaluation must not update the weights
270 |         summary, step, loss_val, pred_val = sess.run([ops['merged'],
271 |                                                       ops['step'],
272 |                                                       ops['loss'],
273 |                                                       ops['pred']],
274 |                                                      feed_dict=feed_dict)
275 |         test_writer.add_summary(summary, step)
276 |         loss_sum += np.mean(np.square(np.subtract(pred_val, labels)))
277 |         acc_a = np.abs(np.subtract(pred_val[:, 1], labels[:, 1])) < (5.0 / 180 * scipy.pi)
278 |         acc_a = np.mean(acc_a)
279 |         acc_a_sum += acc_a
280 |         acc_s = np.abs(np.subtract(pred_val[:, 0], labels[:, 0])) < (5.0 / 20)
281 |         acc_s = np.mean(acc_s)
282 |         acc_s_sum += acc_s
283 | 
284 |     log_string('eval mean loss: %f' % (loss_sum / float(num_batches)))
285 |     log_string('eval accuracy (angle): %f' % (acc_a_sum / float(num_batches)))
286 |     log_string('eval accuracy (speed): %f' % (acc_s_sum / float(num_batches)))
287 |     return acc_a_sum / float(num_batches)
288 | 
289 | 
290 | if __name__ == "__main__":
291 |     train()
292 | 
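`train()` keeps the checkpoint with the best validation accuracy as `model_best.ckpt`. A hedged restore sketch (`restore_best` is illustrative, not repo code); it assumes the caller has already rebuilt the exact same graph, since `tf.train.Saver` matches variables by name:

```python
import os
import tensorflow as tf

def restore_best(sess, log_dir):
    """Load model_best.ckpt into an already-constructed, matching graph."""
    saver = tf.train.Saver()
    saver.restore(sess, os.path.join(log_dir, "model_best.ckpt"))
    return sess
```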
--------------------------------------------------------------------------------
/train_demo.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import importlib
3 | import os
4 | import sys
5 | import time
6 | 
7 | import numpy as np
8 | import scipy
9 | 
10 | import provider
11 | import tensorflow as tf
12 | 
13 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
14 | sys.path.append(os.path.join(BASE_DIR, 'models'))
15 | 
16 | parser = argparse.ArgumentParser()
17 | parser.add_argument('--gpu', type=int, default=0,
18 |                     help='GPU to use [default: GPU 0]')
19 | parser.add_argument('--model', default='nvidia_io',
20 |                     help='Model name [default: nvidia_io]')
21 | parser.add_argument('--log_dir', default='logs',
22 |                     help='Log dir [default: logs]')
23 | parser.add_argument('--max_epoch', type=int, default=250,
24 |                     help='Epoch to run [default: 250]')
25 | parser.add_argument('--batch_size', type=int, default=8,
26 |                     help='Batch Size during training [default: 8]')
27 | parser.add_argument('--learning_rate', type=float, default=0.001,
28 |                     help='Learning rate during training [default: 0.001]')
29 | parser.add_argument('--momentum', type=float, default=0.9,
30 |                     help='Momentum for the momentum optimizer [default: 0.9]')
31 | parser.add_argument('--optimizer', default='adam',
32 |                     help='adam or momentum [default: adam]')
33 | parser.add_argument('--decay_step', type=int, default=200000,
34 |                     help='Decay step for lr decay [default: 200000]')
35 | parser.add_argument('--decay_rate', type=float, default=0.7,
36 |                     help='Decay rate for lr decay [default: 0.7]')
37 | FLAGS = parser.parse_args()
38 | 
39 | BATCH_SIZE = FLAGS.batch_size
40 | MAX_EPOCH = FLAGS.max_epoch
41 | LEARNING_RATE = FLAGS.learning_rate
42 | OPTIMIZER = FLAGS.optimizer
43 | BASE_LEARNING_RATE = FLAGS.learning_rate
44 | GPU_INDEX = FLAGS.gpu
45 | MOMENTUM = FLAGS.momentum
46 | DECAY_STEP = FLAGS.decay_step
47 | DECAY_RATE = FLAGS.decay_rate
48 | 
49 | BN_INIT_DECAY = 0.5
50 | BN_DECAY_DECAY_RATE = 0.5
51 | BN_DECAY_DECAY_STEP = float(DECAY_STEP)
52 | BN_DECAY_CLIP = 0.99
53 | 
54 | MODEL = importlib.import_module(FLAGS.model)  # import network module
55 | MODEL_FILE = os.path.join(BASE_DIR, 'models', FLAGS.model+'.py')
56 | 
57 | LOG_DIR = os.path.join(FLAGS.log_dir, FLAGS.model)
58 | if not os.path.exists(LOG_DIR):
59 |     os.makedirs(LOG_DIR)
60 | os.system('cp %s %s' % (MODEL_FILE, LOG_DIR))  # bkp of model def
61 | os.system('cp train_demo.py %s' % (LOG_DIR))  # bkp of train procedure
62 | LOG_FOUT = open(os.path.join(LOG_DIR, 'log_train.txt'), 'w')
63 | LOG_FOUT.write(str(FLAGS)+'\n')
64 | 
65 | 
66 | def log_string(out_str):
67 |     LOG_FOUT.write(out_str+'\n')
68 |     LOG_FOUT.flush()
69 |     print(out_str)
70 | 
71 | 
72 | def get_learning_rate(batch):
73 |     learning_rate = tf.train.exponential_decay(
74 |         BASE_LEARNING_RATE,  # Base learning rate.
75 |         batch * BATCH_SIZE,  # Current index into the dataset.
76 |         DECAY_STEP,          # Decay step.
77 |         DECAY_RATE,          # Decay rate.
78 |         staircase=True)
79 |     learning_rate = tf.maximum(learning_rate, 0.00001)  # CLIP THE LEARNING RATE!
80 |     return learning_rate
81 | 
82 | 
83 | def get_bn_decay(batch):
84 |     bn_momentum = tf.train.exponential_decay(
85 |         BN_INIT_DECAY,
86 |         batch*BATCH_SIZE,
87 |         BN_DECAY_DECAY_STEP,
88 |         BN_DECAY_DECAY_RATE,
89 |         staircase=True)
90 |     bn_decay = tf.minimum(BN_DECAY_CLIP, 1 - bn_momentum)
91 |     return bn_decay
92 | 
93 | 
94 | def train():
95 |     with tf.Graph().as_default():
96 |         with tf.device('/gpu:'+str(GPU_INDEX)):
97 |             if 'io' in MODEL_FILE:
98 |                 data_input = provider.DVR_Provider()
99 |                 imgs_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE)
100 |             elif 'pm' in MODEL_FILE:
101 |                 data_input = provider.DVR_FMAP_Provider()
102 |                 imgs_pl, fmaps_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE)
103 |                 imgs_pl = [imgs_pl, fmaps_pl]
104 |             else:
105 |                 data_input = provider.DVR_Points_Provider()
106 |                 imgs_pl, pts_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE)
107 |                 imgs_pl = [imgs_pl, pts_pl]
108 | 
109 |             is_training_pl = tf.placeholder(tf.bool, shape=())
110 |             print(is_training_pl)
111 | 
112 |             # Note the global_step=batch parameter to minimize.
113 |             # That tells the optimizer to helpfully increment the 'batch'
114 |             # parameter for you every time it trains.
115 |             batch = tf.Variable(0)
116 |             bn_decay = get_bn_decay(batch)
117 |             tf.summary.scalar('bn_decay', bn_decay)
118 | 
119 |             # Get model and loss
120 |             pred = MODEL.get_model(imgs_pl, is_training_pl,
121 |                                    bn_decay=bn_decay)
122 | 
123 |             loss = MODEL.get_loss(pred, labels_pl)
124 |             MODEL.summary_scalar(pred, labels_pl)
125 | 
126 |             # Get training operator
127 |             learning_rate = get_learning_rate(batch)
128 |             tf.summary.scalar('learning_rate', learning_rate)
129 |             if OPTIMIZER == 'momentum':
130 |                 optimizer = tf.train.MomentumOptimizer(learning_rate,
131 |                                                        momentum=MOMENTUM)
132 |             elif OPTIMIZER == 'adam':
133 |                 optimizer = tf.train.AdamOptimizer(learning_rate)
134 |             train_op = optimizer.minimize(loss, global_step=batch)
135 |             # Add ops to save and restore all the variables.
136 |             saver = tf.train.Saver()
137 | 
138 |             # Create a session
139 |             config = tf.ConfigProto()
140 |             config.gpu_options.allow_growth = True
141 |             config.allow_soft_placement = True
142 |             config.log_device_placement = False
143 |             sess = tf.Session(config=config)
144 | 
145 |             # Add summary writers
146 |             # merged = tf.merge_all_summaries()
147 |             merged = tf.summary.merge_all()
148 |             train_writer = tf.summary.FileWriter(os.path.join(LOG_DIR, 'train'),
149 |                                                  sess.graph)
150 |             test_writer = tf.summary.FileWriter(os.path.join(LOG_DIR, 'test'))
151 | 
152 |             # Init variables
153 |             init = tf.global_variables_initializer()
154 |             sess.run(init, {is_training_pl: True})
155 | 
156 |             ops = {'imgs_pl': imgs_pl,
157 |                    'labels_pl': labels_pl,
158 |                    'is_training_pl': is_training_pl,
159 |                    'pred': pred,
160 |                    'loss': loss,
161 |                    'train_op': train_op,
162 |                    'merged': merged,
163 |                    'step': batch}
164 | 
165 |             for epoch in range(MAX_EPOCH):
166 |                 log_string('**** EPOCH %03d ****' % (epoch))
167 |                 sys.stdout.flush()
168 | 
169 |                 train_one_epoch(sess, ops, train_writer, data_input)
170 |                 eval_one_epoch(sess, ops, test_writer, data_input)
171 | 
172 |                 # Save the variables to disk.
173 |                 if epoch % 10 == 0:
174 |                     save_path = saver.save(sess, os.path.join(LOG_DIR, "model.ckpt"))
175 |                     log_string("Model saved in file: %s" % save_path)
176 | 
177 | 
178 | def train_one_epoch(sess, ops, train_writer, data_input):
179 |     """ ops: dict mapping from string to tf ops """
180 |     is_training = True
181 |     num_batches = data_input.num_train // BATCH_SIZE
182 |     loss_sum = 0
183 |     acc_a_sum = 0
184 |     acc_s_sum = 0
185 | 
186 |     for batch_idx in range(num_batches):
187 |         if "io" in MODEL_FILE:
188 |             imgs, labels = data_input.load_one_batch(BATCH_SIZE, "train")
189 |             feed_dict = {ops['imgs_pl']: imgs,
190 |                          ops['labels_pl']: labels,
191 |                          ops['is_training_pl']: is_training}
192 |         else:
193 |             imgs, others, labels = data_input.load_one_batch(BATCH_SIZE, "train")
194 |             feed_dict = {ops['imgs_pl'][0]: imgs,
195 |                          ops['imgs_pl'][1]: others,
196 |                          ops['labels_pl']: labels,
197 |                          ops['is_training_pl']: is_training}
198 | 
199 |         summary, step, _, loss_val, pred_val = sess.run([ops['merged'],
200 |                                                          ops['step'],
201 |                                                          ops['train_op'],
202 |                                                          ops['loss'],
203 |                                                          ops['pred']],
204 |                                                         feed_dict=feed_dict)
205 |         train_writer.add_summary(summary, step)
206 |         loss_sum += np.mean(np.square(np.subtract(pred_val, labels)))
207 |         acc_a = np.abs(np.subtract(pred_val[:, 1], labels[:, 1])) < (5.0 / 180 * scipy.pi)
208 |         acc_a = np.mean(acc_a)
209 |         acc_a_sum += acc_a
210 |         acc_s = np.abs(np.subtract(pred_val[:, 0], labels[:, 0])) < (5.0 / 20)
211 |         acc_s = np.mean(acc_s)
212 |         acc_s_sum += acc_s
213 | 
214 |     log_string('mean loss: %f' % (loss_sum / float(num_batches)))
215 |     log_string('accuracy (angle): %f' % (acc_a_sum / float(num_batches)))
216 |     log_string('accuracy (speed): %f' % (acc_s_sum / float(num_batches)))
217 | 
218 | 
219 | def eval_one_epoch(sess, ops, test_writer, data_input):
220 |     """ ops: dict mapping from string to tf ops """
221 |     is_training = False
222 | 
223 | 
224 |     num_batches = data_input.num_val // BATCH_SIZE
225 |     loss_sum = 0
226 |     acc_a_sum = 0
227 |     acc_s_sum = 0
228 | 
229 |     for batch_idx in range(num_batches):
230 |         if "io" in MODEL_FILE:
231 |             imgs, labels = data_input.load_one_batch(BATCH_SIZE, "val")
232 |             feed_dict = {ops['imgs_pl']: imgs,
233 |                          ops['labels_pl']: labels,
234 |                          ops['is_training_pl']: is_training}
235 |         else:
236 |             imgs, others, labels = data_input.load_one_batch(BATCH_SIZE, "val")
237 |             feed_dict = {ops['imgs_pl'][0]: imgs,
238 |                          ops['imgs_pl'][1]: others,
239 |                          ops['labels_pl']: labels,
240 |                          ops['is_training_pl']: is_training}
241 |         # do not run ops['train_op'] here: evaluation must not update the weights
242 |         summary, step, loss_val, pred_val = sess.run([ops['merged'],
243 |                                                       ops['step'],
244 |                                                       ops['loss'],
245 |                                                       ops['pred']],
246 |                                                      feed_dict=feed_dict)
247 |         test_writer.add_summary(summary, step)
248 |         loss_sum += np.mean(np.square(np.subtract(pred_val, labels)))
249 |         acc_a = np.abs(np.subtract(pred_val[:, 1], labels[:, 1])) < (5.0 / 180 * scipy.pi)
250 |         acc_a = np.mean(acc_a)
251 |         acc_a_sum += acc_a
252 |         acc_s = np.abs(np.subtract(pred_val[:, 0], labels[:, 0])) < (5.0 / 20)
253 |         acc_s = np.mean(acc_s)
254 |         acc_s_sum += acc_s
255 | 
256 |     log_string('eval mean loss: %f' % (loss_sum / float(num_batches)))
257 |     log_string('eval accuracy (angle): %f' % (acc_a_sum / float(num_batches)))
258 |     log_string('eval accuracy (speed): %f' % (acc_s_sum / float(num_batches)))
259 | 
260 | 
261 | if __name__ == "__main__":
262 |     train()
263 | 
--------------------------------------------------------------------------------
/utils/custom_layers.py:
--------------------------------------------------------------------------------
1 | from keras.layers.core import Layer
2 | from keras.engine import InputSpec
3 | from keras import backend as K
4 | try:
5 |     from keras import initializations
6 | except ImportError:
7 |     from keras import initializers as initializations
8 | 
9 | 
10 | class Scale(Layer):
11 |     '''Learns a set of weights and biases used for scaling the input data.
12 |     The output consists simply of an element-wise multiplication of the input
13 |     and an addition of a set of constants:
14 | 
15 |         out = in * gamma + beta,
16 | 
17 |     where 'gamma' and 'beta' are the weights and biases learned.
18 | 
19 |     # Arguments
20 |         axis: integer, axis along which to normalize in mode 0. For instance,
21 |             if your input tensor has shape (samples, channels, rows, cols),
22 |             set axis to 1 to normalize per feature map (channels axis).
23 |         momentum: momentum in the computation of the
24 |             exponential average of the mean and standard deviation
25 |             of the data, for feature-wise normalization.
26 |         weights: Initialization weights.
27 |             List of 2 Numpy arrays, with shapes:
28 |             `[(input_shape,), (input_shape,)]`
29 |         beta_init: name of initialization function for shift parameter
30 |             (see [initializations](../initializations.md)), or alternatively,
31 |             Theano/TensorFlow function to use for weights initialization.
32 |             This parameter is only relevant if you don't pass a `weights` argument.
33 |         gamma_init: name of initialization function for scale parameter (see
34 |             [initializations](../initializations.md)), or alternatively,
35 |             Theano/TensorFlow function to use for weights initialization.
36 |             This parameter is only relevant if you don't pass a `weights` argument.
37 |     '''
38 |     def __init__(self, weights=None, axis=-1, momentum=0.9, beta_init='zero', gamma_init='one', **kwargs):
39 |         self.momentum = momentum
40 |         self.axis = axis
41 |         self.beta_init = initializations.get(beta_init)
42 |         self.gamma_init = initializations.get(gamma_init)
43 |         self.initial_weights = weights
44 |         super(Scale, self).__init__(**kwargs)
45 | 
46 |     def build(self, input_shape):
47 |         self.input_spec = [InputSpec(shape=input_shape)]
48 |         shape = (int(input_shape[self.axis]),)
49 | 
50 |         # Compatibility with TensorFlow >= 1.0.0
51 |         self.gamma = K.variable(self.gamma_init(shape), name='{}_gamma'.format(self.name))
52 |         self.beta = K.variable(self.beta_init(shape), name='{}_beta'.format(self.name))
53 |         #self.gamma = self.gamma_init(shape, name='{}_gamma'.format(self.name))
54 |         #self.beta = self.beta_init(shape, name='{}_beta'.format(self.name))
55 |         self.trainable_weights = [self.gamma, self.beta]
56 | 
57 |         if self.initial_weights is not None:
58 |             self.set_weights(self.initial_weights)
59 |             del self.initial_weights
60 | 
61 |     def call(self, x, mask=None):
62 |         input_shape = self.input_spec[0].shape
63 |         broadcast_shape = [1] * len(input_shape)
64 |         broadcast_shape[self.axis] = input_shape[self.axis]
65 | 
66 |         out = K.reshape(self.gamma, broadcast_shape) * x + K.reshape(self.beta, broadcast_shape)
67 |         return out
68 | 
69 |     def get_config(self):
70 |         config = {"momentum": self.momentum, "axis": self.axis}
71 |         base_config = super(Scale, self).get_config()
72 |         return dict(list(base_config.items()) + list(config.items()))
73 | 
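A minimal usage sketch for `Scale`, assuming the pre-Keras-2 API this file targets (note the `initializations` fallback at the top). With the default `'one'`/`'zero'` initializers the layer starts out as the identity mapping; the snippet below is illustrative, not repo code:

```python
import numpy as np
from keras.models import Sequential

from custom_layers import Scale  # assumes utils/ is on the import path

model = Sequential()
model.add(Scale(input_shape=(4,)))  # learns a per-feature gamma (scale) and beta (shift)

x = np.ones((2, 4), dtype='float32')
print(model.predict(x))  # == x while gamma is all ones and beta all zeros
```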
--------------------------------------------------------------------------------
/utils/helper.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | 
3 | 
4 | def str2bool(v):
5 |     if v.lower() in ('yes', 'true', 't', 'y', '1'):
6 |         return True
7 |     elif v.lower() in ('no', 'false', 'f', 'n', '0'):
8 |         return False
9 |     else:
10 |         raise argparse.ArgumentTypeError('Boolean value expected.')
--------------------------------------------------------------------------------
/utils/pointnet.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | import math
4 | import sys
5 | import os
6 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
7 | sys.path.append(BASE_DIR)
8 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
9 | import tf_util
10 | 
11 | def placeholder_inputs(batch_size, num_point):
12 |     pointclouds_pl = tf.placeholder(tf.float32, shape=(batch_size, num_point, 3))
13 |     labels_pl = tf.placeholder(tf.int32, shape=(batch_size))
14 |     return pointclouds_pl, labels_pl
15 | 
16 | 
17 | def get_model(point_cloud, is_training, bn_decay=None):
18 |     """ Global-feature PointNet: input is BxNx3, output is a Bx256 global feature """
19 |     batch_size = point_cloud.get_shape()[0].value
20 |     num_point = point_cloud.get_shape()[1].value
21 |     end_points = {}
22 |     input_image = tf.expand_dims(point_cloud, -1)
23 | 
24 |     # Point functions (MLP implemented as conv2d)
25 |     net = tf_util.conv2d(input_image, 64, [1,3],
26 |                          padding='VALID', stride=[1,1],
27 |                          bn=True, is_training=is_training,
28 |                          scope='conv1', bn_decay=bn_decay)
29 |     net = tf_util.conv2d(net, 64, [1,1],
30 |                          padding='VALID', stride=[1,1],
31 |                          bn=True, is_training=is_training,
32 |                          scope='conv2', bn_decay=bn_decay)
33 |     net = tf_util.conv2d(net, 64, [1,1],
34 |                          padding='VALID', stride=[1,1],
35 |                          bn=True, is_training=is_training,
36 |                          scope='conv3', bn_decay=bn_decay)
37 |     net = tf_util.conv2d(net, 128, [1,1],
38 |                          padding='VALID', stride=[1,1],
39 |                          bn=True, is_training=is_training,
40 |                          scope='conv4', bn_decay=bn_decay)
41 |     net = tf_util.conv2d(net, 256, [1,1],
42 |                          padding='VALID', stride=[1,1],
43 |                          bn=True, is_training=is_training,
44 |                          scope='conv5', bn_decay=bn_decay)
45 | 
46 |     # Symmetric function: max pooling
47 |     net = tf_util.max_pool2d(net, [num_point,1],
48 |                              padding='VALID', scope='maxpool')
49 | 
50 |     # Flatten the pooled global feature vector
51 |     net = tf.reshape(net, [batch_size, -1])
52 | 
53 |     return net
54 | 
55 | if __name__=='__main__':
56 |     with tf.Graph().as_default():
57 |         inputs = tf.zeros((32,100000,3))
58 |         outputs = get_model(inputs, tf.constant(True))
59 |         print(outputs)
60 | 
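The `str2bool` helper in `utils/helper.py` above exists because argparse's `type=bool` is a trap: `bool('False')` is `True`, so any non-empty string parses as true (the `--add_lstm` flag in `train.py` is exposed to exactly this). A usage sketch wiring the helper in as the argument type; the import path assumes the script runs from the repo root:

```python
import argparse

from utils.helper import str2bool  # assumes the repo root is on sys.path

parser = argparse.ArgumentParser()
parser.add_argument('--add_lstm', type=str2bool, default=False,
                    help='Introduce LSTM mechanism in network [default: False]')

print(parser.parse_args(['--add_lstm', 'no']).add_lstm)    # False
print(parser.parse_args(['--add_lstm', 'true']).add_lstm)  # True
```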
--------------------------------------------------------------------------------
/utils/weights/README.md:
--------------------------------------------------------------------------------
1 | ## ImageNet Pretrained Models
2 | 
3 | This is where the weights of models pre-trained on ImageNet should be placed. This part is modified from [this repo](https://github.com/flyyufelix/cnn_finetune).
4 | 
5 | ### Folder Structure
6 | Download pre-trained weights and organize the files as follows (in `utils/weights/`):
7 | ```
8 | ├── resnet152_weights_tf.h5
9 | ├── inception-v4_weights_tf.h5
10 | └── densenet169_weights_tf.h5
11 | ```
12 | 
13 | ### Download the Weights
14 | 
15 | Network | TensorFlow
16 | :---: | :---:
17 | Inception-V4 | [model (172 MB)](https://drive.google.com/file/d/0Byy2AcGyEVxfTmRRVmpGWDczaXM/view?usp=sharing)
18 | ResNet-152 | [model (243 MB)](https://drive.google.com/file/d/0Byy2AcGyEVxfeXExMzNNOHpEODg/view?usp=sharing)
19 | DenseNet-169 | [model (56 MB)](https://drive.google.com/open?id=0Byy2AcGyEVxfSEc5UC1ROUFJdmM)
20 | 
21 | More pre-trained weights are available [here](https://github.com/flyyufelix/cnn_finetune).
--------------------------------------------------------------------------------