├── .gitignore
├── LICENSE
├── README.md
├── apps
│   ├── __init__.py
│   ├── demo.py
│   └── main.py
├── configs
│   ├── __init__.py
│   ├── configs.py
│   └── label.pbtxt
├── deployments
│   └── __init__.py
├── example_data
│   └── __init__.py
├── libs
│   ├── __init__.py
│   ├── label_map_util.py
│   └── string_int_label_map_pb2.py
├── models
│   ├── __init__.py
│   ├── _base_server.py
│   ├── _frustum_pointnets_v1.py
│   ├── detector_2d.py
│   ├── detector_3d.py
│   ├── frustum_proposal.py
│   ├── model_util.py
│   ├── server.py
│   └── tf_util.py
├── pretrained
│   └── __init__.py
├── requirements.txt
├── services
│   └── __init__.py
├── tests
│   └── __init__.py
└── utils
    ├── __init__.py
    └── utils.py

/.gitignore:
--------------------------------------------------------------------------------
.idea/
.DS_Store
*/*.pyc
*.pyc
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2018 AIInAi

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Summary

![3d](https://user-images.githubusercontent.com/8921629/41188550-0ed19016-6b74-11e8-92fb-193a8160d0e2.png)

(The image below is produced from a sample in the [KITTI 3D Object Detection Dataset](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d).)

![semi-endtoend](https://user-images.githubusercontent.com/8921629/41068890-76807090-69a0-11e8-9794-62fc394667b3.png)

# Run demo

### 1. Requirements

- [X] macOS or Ubuntu

- [X] TensorFlow

- [X] Mayavi (visualization only)

- [X] OpenCV

- [ ] Anaconda (preferred, but optional)

### 2. Clone this repo

```
git clone https://github.com/KleinYuan/tf-3d-object-detection.git
```

### 3. Install Dependencies

```
# Simply run this in the project root folder
cd tf-3d-object-detection
pip install -r requirements.txt
```

If installing a dependency such as `opencv` fails, do `conda install opencv` if you use Anaconda. Otherwise, build it from source and let's call it a day.

### 4. Pick a 2D Object Detection Model

We support five different 2D detection models:

| Model name | Speed | COCO mAP | Outputs |
| ------------ | :--------------: | :--------------: | :-------------: |
| [ssd_mobilenet_v1_coco](http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_11_06_2017.tar.gz) | fast | 21 | Boxes |
| [ssd_inception_v2_coco](http://download.tensorflow.org/models/object_detection/ssd_inception_v2_coco_11_06_2017.tar.gz) | fast | 24 | Boxes |
| [rfcn_resnet101_coco](http://download.tensorflow.org/models/object_detection/rfcn_resnet101_coco_11_06_2017.tar.gz) | medium | 30 | Boxes |
| [faster_rcnn_resnet101_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_coco_11_06_2017.tar.gz) | medium | 32 | Boxes |
| [faster_rcnn_inception_resnet_v2_atrous_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_coco_11_06_2017.tar.gz) | slow | 37 | Boxes |

Pick whichever one suits you, find its full name in [`_DETECTOR_2D_OPTIONS` in `configs/configs.py`](https://github.com/KleinYuan/tf-3d-object-detection/blob/master/configs/configs.py#L17), and set [`_DETECTOR_2D_MODEL_NAME`](https://github.com/KleinYuan/tf-3d-object-detection/blob/master/configs/configs.py#L16) to that value.

By default, I use [`ssd_mobilenet_v1_coco_11_06_2017`](https://github.com/KleinYuan/tf-3d-object-detection/blob/master/configs/configs.py#L16) because it is fast.
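For example, switching to the Faster R-CNN ResNet-101 model is a one-line change in `configs/configs.py` (a sketch; any entry of `_DETECTOR_2D_OPTIONS` works, as long as the matching pretrained folder from step 6 is in place):

```
# configs/configs.py
# Any entry of _DETECTOR_2D_OPTIONS works here; a pretrained folder with
# the same name must exist under pretrained/ (see step 6 below).
_DETECTOR_2D_MODEL_NAME = 'faster_rcnn_resnet101_coco_11_06_2017'
```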
### 5. Download Test Data

Because the KITTI license is waaaay too long to read (when I downloaded the data, I clicked some button agreeing to something that was TL;DR), I will just tell you how to get the data yourself instead of risking attaching any of it here:

```
# Step 1: Go to http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d
# Step 2: Download "left color images of object data set (12 GB)"
# Step 3: Download "Velodyne point clouds, if you want to use laser information (29 GB)"
# Step 4: Download "camera calibration matrices of object data set (16 MB)"
# Step 5: Unzip all three files; you will find ~7000 training samples, each consisting of a velodyne scan, an image and a calibration file
# Step 6: Pick one sample, copy it into the example_data folder, rename the image to 1.png and the velodyne file to 1.bin
# Step 7: Open the calibration file, find the corresponding entries and use them to replace CALIB_PARAM in configs/configs.py (see the sketch below); by default it comes from training/000000.txt
# Step 8: Sorry for making you go through the last 7 steps -- I may come up with a one-button way to do this later
```
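Step 7 is the fiddly one, so here is a minimal sketch of what it boils down to, assuming the standard KITTI calibration file format (each line is a matrix name followed by its row-major values; `P2` is the projection matrix of the left color camera, which is what `CALIB_PARAM['P']` holds). The helper `load_kitti_calib` is hypothetical -- it does not exist in this repo:

```
# Hypothetical helper: parse a KITTI calib file (e.g. training/calib/000000.txt)
# into the same structure as CALIB_PARAM in configs/configs.py.
def load_kitti_calib(calib_fp):
    raw = {}
    with open(calib_fp) as f:
        for line in f:
            if ':' not in line:
                continue
            key, values = line.split(':', 1)
            raw[key.strip()] = tuple(float(v) for v in values.split())
    # 'P' (3x4, taken from P2), 'Tr_velo_to_cam' (3x4) and 'R0_rect' (3x3)
    # are all flat row-major tuples, exactly like CALIB_PARAM expects.
    return {
        'P': raw['P2'],
        'Tr_velo_to_cam': raw['Tr_velo_to_cam'],
        'R0_rect': raw['R0_rect']
    }
```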
### 6. Download Pretrained Model

As you may have noticed, this project chains two deep neural networks together, so you need to download two pretrained models:

| 2D Object Detector Model | 3D Object Detector Model |
| ------------ | :--------------: |
| [Download Link](https://github.com/KleinYuan/tf-object-detection/blob/master/README.md#introduction) | [Download v1 (v2 is not supported yet)](https://shapenet.cs.stanford.edu/media/frustum_pointnets_snapshots.zip) (originally from [here](https://github.com/Dark-Rinnegan/frustum-pointnets/tree/app#training-frustum-pointnets)) |

Then, unzip them and put them under the [`pretrained`](https://github.com/KleinYuan/tf-3d-object-detection/tree/master/pretrained) folder. Also, rename the `checkpoint.txt` file to `checkpoint`, even though it is useless and you cannot freeze it.

The folder will look like this:

```
--tf-3d-object-detection
  |-- pretrained
      |-- log_v1
          |-- checkpoint (originally named checkpoint.txt)
          |-- log_train.txt
          |-- model.ckpt.data-00000-of-00001
          |-- model.ckpt
          |-- model.ckpt.meta
      |-- ssd_mobilenet_v1_coco_11_06_2017 (or another name, if you picked a different model)
          |-- frozen_inference_graph.pb
          |-- graph.pbtxt
          |-- model.ckpt-0.data-00000-of-00001
          |-- model.ckpt-0.index
          |-- model.ckpt-0.meta
```

As [this note](https://github.com/Dark-Rinnegan/frustum-pointnets/tree/app/app#intro) explains, the 3D object detection model is not really freezable.

(Hopefully the original TensorFlow ops for v1 will be disclosed, so that we can remove the `tf.py_func` calls and freeze the model.)

### 7. Run Demo

```
# If you use PyCharm, just click the green run button.
# If not, navigate to the root folder of this repo and run:
python apps/demo.py

# If it complains that it cannot find some modules, do:
export PYTHONPATH='.'
python apps/demo.py

# If you still have issues, your Python environment is probably messed up;
# that is out of scope for this readme, so please don't open an issue for it.
# Try Anaconda, find someone who knows Python, or hit Stack Overflow like the rest of us.
```

Then you should see three windows pop up in order; don't forget to `Press any key to continue` when the terminal prompts you.
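For reference, the demo itself is tiny: `apps/demo.py` reads the image with OpenCV, loads the velodyne scan with `utils.load_velo_scan`, and feeds both into `server.Server().predict(...)`. A KITTI velodyne `.bin` file is a flat float32 binary with four channels per point, so a loader equivalent to `utils.load_velo_scan` can be sketched as follows (the actual implementation lives in `utils/utils.py`):

```
import numpy as np

def load_velo_scan(velo_fp):
    # KITTI velodyne scans are flat float32 binaries; reshape them into
    # N x 4 points of (x, y, z, reflectance).
    scan = np.fromfile(velo_fp, dtype=np.float32)
    return scan.reshape((-1, 4))
```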
126 | 127 | 128 | # References 129 | 130 | - [X] Project Template: [AIInAi/tensorflow-project-template](https://github.com/AIInAi/tensorflow-project-template) 131 | 132 | - [X] FPNet Code: [Dark-Rinnegan/frustum-pointnets](https://github.com/Dark-Rinnegan/frustum-pointnets/tree/app/app) 133 | -------------------------------------------------------------------------------- /apps/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/KleinYuan/tf-3d-object-detection/ccbd987c08b90aaffada9e064b48574b9882db9a/apps/__init__.py -------------------------------------------------------------------------------- /apps/demo.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | from models import server 3 | from utils import utils 4 | from configs import configs 5 | 6 | # Reading example image 7 | img = cv2.imread('{}'.format(configs.TEST_DATA_FP['img'])) 8 | 9 | # Reading example points cloud 10 | pclds = utils.load_velo_scan('{}'.format(configs.TEST_DATA_FP['pclds'])) 11 | 12 | test_input = {'img': img, 'pclds': pclds} 13 | server_ins = server.Server() 14 | server_ins.predict(test_input) 15 | -------------------------------------------------------------------------------- /apps/main.py: -------------------------------------------------------------------------------- 1 | ''' 2 | To be added 3 | ''' -------------------------------------------------------------------------------- /configs/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/KleinYuan/tf-3d-object-detection/ccbd987c08b90aaffada9e064b48574b9882db9a/configs/__init__.py -------------------------------------------------------------------------------- /configs/configs.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import os 3 | 4 | BASE_PATH = '/'.join(os.getcwd().split('/')[:-1]) 5 | #################################################################### 6 | # Configurations for test/demo images/points cloud/calibration params 7 | # This is the only part, you are free to change to run the demo and 8 | # any changes in this section will not break anything 9 | #################################################################### 10 | TEST_DATA_FP = { 11 | 'img': '{}/example_data/1.png'.format(BASE_PATH), 12 | 'pclds': '{}/example_data/1.bin'.format(BASE_PATH) 13 | } 14 | 15 | # Read https://github.com/KleinYuan/tf-object-detection#introduction 16 | _DETECTOR_2D_MODEL_NAME = 'ssd_mobilenet_v1_coco_11_06_2017' 17 | _DETECTOR_2D_OPTIONS = [ 18 | 'ssd_mobilenet_v1_coco_11_06_2017', 19 | 'ssd_inception_v2_coco_11_06_2017', 20 | 'rfcn_resnet101_coco_11_06_2017', 21 | 'faster_rcnn_resnet101_coco_11_06_2017', 22 | 'faster_rcnn_inception_resnet_v2_atrous_coco_11_06_2017' 23 | ] 24 | 25 | #################################################################### 26 | # Configurations for Main Server 27 | # (Please read the readme in the link if you don't know what this section is: 28 | # https://s3.eu-central-1.amazonaws.com/avg-kitti/devkit_object.zip) 29 | #################################################################### 30 | 31 | # STUB PARAM 32 | CALIB_PARAM = { 33 | 'P': (7.070493000000e+02, 0.000000000000e+00, 6.040814000000e+02, 4.575831000000e+01, 0.000000000000e+00, 7.070493000000e+02, 1.805066000000e+02, -3.454157000000e-01, 0.000000000000e+00, 34 | 0.000000000000e+00, 1.000000000000e+00, 
4.981016000000e-03), 35 | 'Tr_velo_to_cam': ( 36 | 6.927964000000e-03, -9.999722000000e-01, -2.757829000000e-03, -2.457729000000e-02, -1.162982000000e-03, 2.749836000000e-03, -9.999955000000e-01, -6.127237000000e-02, 9.999753000000e-01, 37 | 6.931141000000e-03, -1.143899000000e-03, -3.321029000000e-01), 38 | 'R0_rect': (9.999128000000e-01, 1.009263000000e-02, -8.511932000000e-03, -1.012729000000e-02, 9.999406000000e-01, -4.037671000000e-03, 8.470675000000e-03, 4.123522000000e-03, 9.999556000000e-01) 39 | } 40 | 41 | #################################################################### 42 | # Configurations for BASE_SERVER Template 43 | # (Don't touch this section if you are not fluent in tensorflow) 44 | #################################################################### 45 | 46 | BASE_SERVER = { 47 | 'input_tensor_names': ['image_tensor:0'], 48 | 'output_tensor_names': ['detection_boxes:0', 'detection_scores:0', 'detection_classes:0', 'num_detections:0'], 49 | 'device': '/gpu:0' 50 | } 51 | 52 | #################################################################### 53 | # Configurations or 2D Detector 54 | # (Don't touch this section if you are not familiar with tensorflow) 55 | #################################################################### 56 | DETECTOR_2D = { 57 | 'MODEL_FP': '{}/pretrained/{}/frozen_inference_graph.pb'.format(BASE_PATH, _DETECTOR_2D_MODEL_NAME), 58 | 'LABEL_FP': '{}/configs/label.pbtxt'.format(BASE_PATH), 59 | 'NUM_CLASSES': 90, 60 | 'FEED_IMG_SIZE': 320, 61 | 'ONE_HOT_VECTOR_MAP': {'car': 0, 'person': 1, 'bicycle': 2} 62 | } 63 | 64 | #################################################################### 65 | # Configurations for 3D Detector 66 | # (Don't touch this section if you are not familiar with Frustum PointNet) 67 | #################################################################### 68 | DETECTOR_3D = { 69 | 'MODEL_FP': '{}/pretrained/log_v1/model.ckpt'.format(BASE_PATH) 70 | } 71 | 72 | FPNET = { 73 | 'BATCH_SIZE': 1, 74 | 'NUM_POINT': 1024, 75 | 'NUM_HEADING_BIN': 12, 76 | 'NUM_SIZE_CLUSTER': 8, 77 | 'NUM_OBJECT_POINT': 512, 78 | 'DEVICE': '/gpu:0' 79 | } 80 | 81 | # FPNET labels 82 | g_type2class = {'Car': 0, 'Van': 1, 'Truck': 2, 'Pedestrian': 3, 'Person_sitting': 4, 'Cyclist': 5, 'Tram': 6, 'Misc': 7} 83 | g_class2type = {g_type2class[t]: t for t in g_type2class} 84 | g_type2onehotclass = {'Car': 0, 'Pedestrian': 1, 'Cyclist': 2} 85 | g_type_mean_size = {'Car': np.array([3.88311640418, 1.62856739989, 1.52563191462]), 86 | 'Van': np.array([5.06763659, 1.9007158, 2.20532825]), 87 | 'Truck': np.array([10.13586957, 2.58549199, 3.2520595]), 88 | 'Pedestrian': np.array([0.84422524, 0.66068622, 1.76255119]), 89 | 'Person_sitting': np.array([0.80057803, 0.5983815, 1.27450867]), 90 | 'Cyclist': np.array([1.76282397, 0.59706367, 1.73698127]), 91 | 'Tram': np.array([16.17150617, 2.53246914, 3.53079012]), 92 | 'Misc': np.array([3.64300781, 1.54298177, 1.92320313])} 93 | g_mean_size_arr = np.zeros((FPNET['NUM_SIZE_CLUSTER'], 3)) # size clustrs 94 | -------------------------------------------------------------------------------- /configs/label.pbtxt: -------------------------------------------------------------------------------- 1 | item { 2 | name: "/m/01g317" 3 | id: 1 4 | display_name: "person" 5 | } 6 | item { 7 | name: "/m/0199g" 8 | id: 2 9 | display_name: "bicycle" 10 | } 11 | item { 12 | name: "/m/0k4j" 13 | id: 3 14 | display_name: "car" 15 | } 16 | item { 17 | name: "/m/04_sv" 18 | id: 4 19 | display_name: "motorcycle" 20 | } 21 | item { 22 | name: 
"/m/05czz6l" 23 | id: 5 24 | display_name: "airplane" 25 | } 26 | item { 27 | name: "/m/01bjv" 28 | id: 6 29 | display_name: "bus" 30 | } 31 | item { 32 | name: "/m/07jdr" 33 | id: 7 34 | display_name: "train" 35 | } 36 | item { 37 | name: "/m/07r04" 38 | id: 8 39 | display_name: "truck" 40 | } 41 | item { 42 | name: "/m/019jd" 43 | id: 9 44 | display_name: "boat" 45 | } 46 | item { 47 | name: "/m/015qff" 48 | id: 10 49 | display_name: "traffic light" 50 | } 51 | item { 52 | name: "/m/01pns0" 53 | id: 11 54 | display_name: "fire hydrant" 55 | } 56 | item { 57 | name: "/m/02pv19" 58 | id: 13 59 | display_name: "stop sign" 60 | } 61 | item { 62 | name: "/m/015qbp" 63 | id: 14 64 | display_name: "parking meter" 65 | } 66 | item { 67 | name: "/m/0cvnqh" 68 | id: 15 69 | display_name: "bench" 70 | } 71 | item { 72 | name: "/m/015p6" 73 | id: 16 74 | display_name: "bird" 75 | } 76 | item { 77 | name: "/m/01yrx" 78 | id: 17 79 | display_name: "cat" 80 | } 81 | item { 82 | name: "/m/0bt9lr" 83 | id: 18 84 | display_name: "dog" 85 | } 86 | item { 87 | name: "/m/03k3r" 88 | id: 19 89 | display_name: "horse" 90 | } 91 | item { 92 | name: "/m/07bgp" 93 | id: 20 94 | display_name: "sheep" 95 | } 96 | item { 97 | name: "/m/01xq0k1" 98 | id: 21 99 | display_name: "cow" 100 | } 101 | item { 102 | name: "/m/0bwd_0j" 103 | id: 22 104 | display_name: "elephant" 105 | } 106 | item { 107 | name: "/m/01dws" 108 | id: 23 109 | display_name: "bear" 110 | } 111 | item { 112 | name: "/m/0898b" 113 | id: 24 114 | display_name: "zebra" 115 | } 116 | item { 117 | name: "/m/03bk1" 118 | id: 25 119 | display_name: "giraffe" 120 | } 121 | item { 122 | name: "/m/01940j" 123 | id: 27 124 | display_name: "backpack" 125 | } 126 | item { 127 | name: "/m/0hnnb" 128 | id: 28 129 | display_name: "umbrella" 130 | } 131 | item { 132 | name: "/m/080hkjn" 133 | id: 31 134 | display_name: "handbag" 135 | } 136 | item { 137 | name: "/m/01rkbr" 138 | id: 32 139 | display_name: "tie" 140 | } 141 | item { 142 | name: "/m/01s55n" 143 | id: 33 144 | display_name: "suitcase" 145 | } 146 | item { 147 | name: "/m/02wmf" 148 | id: 34 149 | display_name: "frisbee" 150 | } 151 | item { 152 | name: "/m/071p9" 153 | id: 35 154 | display_name: "skis" 155 | } 156 | item { 157 | name: "/m/06__v" 158 | id: 36 159 | display_name: "snowboard" 160 | } 161 | item { 162 | name: "/m/018xm" 163 | id: 37 164 | display_name: "sports ball" 165 | } 166 | item { 167 | name: "/m/02zt3" 168 | id: 38 169 | display_name: "kite" 170 | } 171 | item { 172 | name: "/m/03g8mr" 173 | id: 39 174 | display_name: "baseball bat" 175 | } 176 | item { 177 | name: "/m/03grzl" 178 | id: 40 179 | display_name: "baseball glove" 180 | } 181 | item { 182 | name: "/m/06_fw" 183 | id: 41 184 | display_name: "skateboard" 185 | } 186 | item { 187 | name: "/m/019w40" 188 | id: 42 189 | display_name: "surfboard" 190 | } 191 | item { 192 | name: "/m/0dv9c" 193 | id: 43 194 | display_name: "tennis racket" 195 | } 196 | item { 197 | name: "/m/04dr76w" 198 | id: 44 199 | display_name: "bottle" 200 | } 201 | item { 202 | name: "/m/09tvcd" 203 | id: 46 204 | display_name: "wine glass" 205 | } 206 | item { 207 | name: "/m/08gqpm" 208 | id: 47 209 | display_name: "cup" 210 | } 211 | item { 212 | name: "/m/0dt3t" 213 | id: 48 214 | display_name: "fork" 215 | } 216 | item { 217 | name: "/m/04ctx" 218 | id: 49 219 | display_name: "knife" 220 | } 221 | item { 222 | name: "/m/0cmx8" 223 | id: 50 224 | display_name: "spoon" 225 | } 226 | item { 227 | name: "/m/04kkgm" 228 | id: 51 229 | display_name: 
"bowl" 230 | } 231 | item { 232 | name: "/m/09qck" 233 | id: 52 234 | display_name: "banana" 235 | } 236 | item { 237 | name: "/m/014j1m" 238 | id: 53 239 | display_name: "apple" 240 | } 241 | item { 242 | name: "/m/0l515" 243 | id: 54 244 | display_name: "sandwich" 245 | } 246 | item { 247 | name: "/m/0cyhj_" 248 | id: 55 249 | display_name: "orange" 250 | } 251 | item { 252 | name: "/m/0hkxq" 253 | id: 56 254 | display_name: "broccoli" 255 | } 256 | item { 257 | name: "/m/0fj52s" 258 | id: 57 259 | display_name: "carrot" 260 | } 261 | item { 262 | name: "/m/01b9xk" 263 | id: 58 264 | display_name: "hot dog" 265 | } 266 | item { 267 | name: "/m/0663v" 268 | id: 59 269 | display_name: "pizza" 270 | } 271 | item { 272 | name: "/m/0jy4k" 273 | id: 60 274 | display_name: "donut" 275 | } 276 | item { 277 | name: "/m/0fszt" 278 | id: 61 279 | display_name: "cake" 280 | } 281 | item { 282 | name: "/m/01mzpv" 283 | id: 62 284 | display_name: "chair" 285 | } 286 | item { 287 | name: "/m/02crq1" 288 | id: 63 289 | display_name: "couch" 290 | } 291 | item { 292 | name: "/m/03fp41" 293 | id: 64 294 | display_name: "potted plant" 295 | } 296 | item { 297 | name: "/m/03ssj5" 298 | id: 65 299 | display_name: "bed" 300 | } 301 | item { 302 | name: "/m/04bcr3" 303 | id: 67 304 | display_name: "dining table" 305 | } 306 | item { 307 | name: "/m/09g1w" 308 | id: 70 309 | display_name: "toilet" 310 | } 311 | item { 312 | name: "/m/07c52" 313 | id: 72 314 | display_name: "tv" 315 | } 316 | item { 317 | name: "/m/01c648" 318 | id: 73 319 | display_name: "laptop" 320 | } 321 | item { 322 | name: "/m/020lf" 323 | id: 74 324 | display_name: "mouse" 325 | } 326 | item { 327 | name: "/m/0qjjc" 328 | id: 75 329 | display_name: "remote" 330 | } 331 | item { 332 | name: "/m/01m2v" 333 | id: 76 334 | display_name: "keyboard" 335 | } 336 | item { 337 | name: "/m/050k8" 338 | id: 77 339 | display_name: "cell phone" 340 | } 341 | item { 342 | name: "/m/0fx9l" 343 | id: 78 344 | display_name: "microwave" 345 | } 346 | item { 347 | name: "/m/029bxz" 348 | id: 79 349 | display_name: "oven" 350 | } 351 | item { 352 | name: "/m/01k6s3" 353 | id: 80 354 | display_name: "toaster" 355 | } 356 | item { 357 | name: "/m/0130jx" 358 | id: 81 359 | display_name: "sink" 360 | } 361 | item { 362 | name: "/m/040b_t" 363 | id: 82 364 | display_name: "refrigerator" 365 | } 366 | item { 367 | name: "/m/0bt_c3" 368 | id: 84 369 | display_name: "book" 370 | } 371 | item { 372 | name: "/m/01x3z" 373 | id: 85 374 | display_name: "clock" 375 | } 376 | item { 377 | name: "/m/02s195" 378 | id: 86 379 | display_name: "vase" 380 | } 381 | item { 382 | name: "/m/01lsmm" 383 | id: 87 384 | display_name: "scissors" 385 | } 386 | item { 387 | name: "/m/0kmg4" 388 | id: 88 389 | display_name: "teddy bear" 390 | } 391 | item { 392 | name: "/m/03wvsk" 393 | id: 89 394 | display_name: "hair drier" 395 | } 396 | item { 397 | name: "/m/012xff" 398 | id: 90 399 | display_name: "toothbrush" 400 | } -------------------------------------------------------------------------------- /deployments/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/KleinYuan/tf-3d-object-detection/ccbd987c08b90aaffada9e064b48574b9882db9a/deployments/__init__.py -------------------------------------------------------------------------------- /example_data/__init__.py: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/KleinYuan/tf-3d-object-detection/ccbd987c08b90aaffada9e064b48574b9882db9a/example_data/__init__.py -------------------------------------------------------------------------------- /libs/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/KleinYuan/tf-3d-object-detection/ccbd987c08b90aaffada9e064b48574b9882db9a/libs/__init__.py -------------------------------------------------------------------------------- /libs/label_map_util.py: -------------------------------------------------------------------------------- 1 | # Copyright 2017 The TensorFlow Authors. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================== 15 | 16 | """Label map utility functions.""" 17 | 18 | import logging 19 | 20 | import tensorflow as tf 21 | from google.protobuf import text_format 22 | import string_int_label_map_pb2 23 | 24 | 25 | def create_category_index(categories): 26 | """Creates dictionary of COCO compatible categories keyed by category id. 27 | 28 | Args: 29 | categories: a list of dicts, each of which has the following keys: 30 | 'id': (required) an integer id uniquely identifying this category. 31 | 'name': (required) string representing category name 32 | e.g., 'cat', 'dog', 'pizza'. 33 | 34 | Returns: 35 | category_index: a dict containing the same entries as categories, but keyed 36 | by the 'id' field of each category. 37 | """ 38 | category_index = {} 39 | for cat in categories: 40 | category_index[cat['id']] = cat 41 | return category_index 42 | 43 | 44 | def convert_label_map_to_categories(label_map, 45 | max_num_classes, 46 | use_display_name=True): 47 | """Loads label map proto and returns categories list compatible with eval. 48 | 49 | This function loads a label map and returns a list of dicts, each of which 50 | has the following keys: 51 | 'id': (required) an integer id uniquely identifying this category. 52 | 'name': (required) string representing category name 53 | e.g., 'cat', 'dog', 'pizza'. 54 | We only allow class into the list if its id-label_id_offset is 55 | between 0 (inclusive) and max_num_classes (exclusive). 56 | If there are several items mapping to the same id in the label map, 57 | we will only keep the first one in the categories list. 58 | 59 | Args: 60 | label_map: a StringIntLabelMapProto or None. If None, a default categories 61 | list is created with max_num_classes categories. 62 | max_num_classes: maximum number of (consecutive) label indices to include. 63 | use_display_name: (boolean) choose whether to load 'display_name' field 64 | as category name. If False of if the display_name field does not exist, 65 | uses 'name' field as category names instead. 66 | Returns: 67 | categories: a list of dictionaries representing all possible categories. 
68 | """ 69 | categories = [] 70 | list_of_ids_already_added = [] 71 | if not label_map: 72 | label_id_offset = 1 73 | for class_id in range(max_num_classes): 74 | categories.append({ 75 | 'id': class_id + label_id_offset, 76 | 'name': 'category_{}'.format(class_id + label_id_offset) 77 | }) 78 | return categories 79 | for item in label_map.item: 80 | if not 0 < item.id <= max_num_classes: 81 | logging.info('Ignore item %d since it falls outside of requested ' 82 | 'label range.', item.id) 83 | continue 84 | if use_display_name and item.HasField('display_name'): 85 | name = item.display_name 86 | else: 87 | name = item.name 88 | if item.id not in list_of_ids_already_added: 89 | list_of_ids_already_added.append(item.id) 90 | categories.append({'id': item.id, 'name': name}) 91 | return categories 92 | 93 | 94 | # TODO: double check documentaion. 95 | def load_labelmap(path): 96 | """Loads label map proto. 97 | 98 | Args: 99 | path: path to StringIntLabelMap proto text file. 100 | Returns: 101 | a StringIntLabelMapProto 102 | """ 103 | with tf.gfile.GFile(path, 'r') as fid: 104 | label_map_string = fid.read() 105 | label_map = string_int_label_map_pb2.StringIntLabelMap() 106 | try: 107 | text_format.Merge(label_map_string, label_map) 108 | except text_format.ParseError: 109 | label_map.ParseFromString(label_map_string) 110 | return label_map 111 | 112 | 113 | def get_label_map_dict(label_map_path): 114 | """Reads a label map and returns a dictionary of label names to id. 115 | 116 | Args: 117 | label_map_path: path to label_map. 118 | 119 | Returns: 120 | A dictionary mapping label names to id. 121 | """ 122 | label_map = load_labelmap(label_map_path) 123 | label_map_dict = {} 124 | for item in label_map.item: 125 | label_map_dict[item.name] = item.id 126 | return label_map_dict 127 | -------------------------------------------------------------------------------- /libs/string_int_label_map_pb2.py: -------------------------------------------------------------------------------- 1 | # Generated by the protocol buffer compiler. DO NOT EDIT! 
2 | # source: object_detection/protos/string_int_label_map.proto 3 | 4 | import sys 5 | _b=sys.version_info[0]<3 and (lambda x:x) or (lambda x:x.encode('latin1')) 6 | from google.protobuf import descriptor as _descriptor 7 | from google.protobuf import message as _message 8 | from google.protobuf import reflection as _reflection 9 | from google.protobuf import symbol_database as _symbol_database 10 | from google.protobuf import descriptor_pb2 11 | # @@protoc_insertion_point(imports) 12 | 13 | _sym_db = _symbol_database.Default() 14 | 15 | 16 | 17 | 18 | DESCRIPTOR = _descriptor.FileDescriptor( 19 | name='object_detection/protos/string_int_label_map.proto', 20 | package='object_detection.protos', 21 | syntax='proto2', 22 | serialized_pb=_b('\n2object_detection/protos/string_int_label_map.proto\x12\x17object_detection.protos\"G\n\x15StringIntLabelMapItem\x12\x0c\n\x04name\x18\x01 \x01(\t\x12\n\n\x02id\x18\x02 \x01(\x05\x12\x14\n\x0c\x64isplay_name\x18\x03 \x01(\t\"Q\n\x11StringIntLabelMap\x12<\n\x04item\x18\x01 \x03(\x0b\x32..object_detection.protos.StringIntLabelMapItem') 23 | ) 24 | 25 | 26 | 27 | 28 | _STRINGINTLABELMAPITEM = _descriptor.Descriptor( 29 | name='StringIntLabelMapItem', 30 | full_name='object_detection.protos.StringIntLabelMapItem', 31 | filename=None, 32 | file=DESCRIPTOR, 33 | containing_type=None, 34 | fields=[ 35 | _descriptor.FieldDescriptor( 36 | name='name', full_name='object_detection.protos.StringIntLabelMapItem.name', index=0, 37 | number=1, type=9, cpp_type=9, label=1, 38 | has_default_value=False, default_value=_b("").decode('utf-8'), 39 | message_type=None, enum_type=None, containing_type=None, 40 | is_extension=False, extension_scope=None, 41 | options=None), 42 | _descriptor.FieldDescriptor( 43 | name='id', full_name='object_detection.protos.StringIntLabelMapItem.id', index=1, 44 | number=2, type=5, cpp_type=1, label=1, 45 | has_default_value=False, default_value=0, 46 | message_type=None, enum_type=None, containing_type=None, 47 | is_extension=False, extension_scope=None, 48 | options=None), 49 | _descriptor.FieldDescriptor( 50 | name='display_name', full_name='object_detection.protos.StringIntLabelMapItem.display_name', index=2, 51 | number=3, type=9, cpp_type=9, label=1, 52 | has_default_value=False, default_value=_b("").decode('utf-8'), 53 | message_type=None, enum_type=None, containing_type=None, 54 | is_extension=False, extension_scope=None, 55 | options=None), 56 | ], 57 | extensions=[ 58 | ], 59 | nested_types=[], 60 | enum_types=[ 61 | ], 62 | options=None, 63 | is_extendable=False, 64 | syntax='proto2', 65 | extension_ranges=[], 66 | oneofs=[ 67 | ], 68 | serialized_start=79, 69 | serialized_end=150, 70 | ) 71 | 72 | 73 | _STRINGINTLABELMAP = _descriptor.Descriptor( 74 | name='StringIntLabelMap', 75 | full_name='object_detection.protos.StringIntLabelMap', 76 | filename=None, 77 | file=DESCRIPTOR, 78 | containing_type=None, 79 | fields=[ 80 | _descriptor.FieldDescriptor( 81 | name='item', full_name='object_detection.protos.StringIntLabelMap.item', index=0, 82 | number=1, type=11, cpp_type=10, label=3, 83 | has_default_value=False, default_value=[], 84 | message_type=None, enum_type=None, containing_type=None, 85 | is_extension=False, extension_scope=None, 86 | options=None), 87 | ], 88 | extensions=[ 89 | ], 90 | nested_types=[], 91 | enum_types=[ 92 | ], 93 | options=None, 94 | is_extendable=False, 95 | syntax='proto2', 96 | extension_ranges=[], 97 | oneofs=[ 98 | ], 99 | serialized_start=152, 100 | serialized_end=233, 101 | ) 102 | 103 | 
_STRINGINTLABELMAP.fields_by_name['item'].message_type = _STRINGINTLABELMAPITEM 104 | DESCRIPTOR.message_types_by_name['StringIntLabelMapItem'] = _STRINGINTLABELMAPITEM 105 | DESCRIPTOR.message_types_by_name['StringIntLabelMap'] = _STRINGINTLABELMAP 106 | _sym_db.RegisterFileDescriptor(DESCRIPTOR) 107 | 108 | StringIntLabelMapItem = _reflection.GeneratedProtocolMessageType('StringIntLabelMapItem', (_message.Message,), dict( 109 | DESCRIPTOR = _STRINGINTLABELMAPITEM, 110 | __module__ = 'object_detection.protos.string_int_label_map_pb2' 111 | # @@protoc_insertion_point(class_scope:object_detection.protos.StringIntLabelMapItem) 112 | )) 113 | _sym_db.RegisterMessage(StringIntLabelMapItem) 114 | 115 | StringIntLabelMap = _reflection.GeneratedProtocolMessageType('StringIntLabelMap', (_message.Message,), dict( 116 | DESCRIPTOR = _STRINGINTLABELMAP, 117 | __module__ = 'object_detection.protos.string_int_label_map_pb2' 118 | # @@protoc_insertion_point(class_scope:object_detection.protos.StringIntLabelMap) 119 | )) 120 | _sym_db.RegisterMessage(StringIntLabelMap) 121 | 122 | 123 | # @@protoc_insertion_point(module_scope) 124 | -------------------------------------------------------------------------------- /models/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/KleinYuan/tf-3d-object-detection/ccbd987c08b90aaffada9e064b48574b9882db9a/models/__init__.py -------------------------------------------------------------------------------- /models/_base_server.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | 4 | class BaseServer(object): 5 | 6 | in_progress = False 7 | prediction = None 8 | session = None 9 | graph = None 10 | feed_dict = {} 11 | output_ops = [] 12 | input_ops = [] 13 | 14 | def __init__(self, model_fp, input_tensor_names, output_tensor_names, device): 15 | self.model_fp = model_fp 16 | self.input_tensor_names = input_tensor_names 17 | self.output_tensor_names = output_tensor_names 18 | 19 | with tf.device(device): 20 | self._load_graph() 21 | self._init_predictor() 22 | 23 | def _load_graph(self): 24 | self.graph = tf.Graph() 25 | with self.graph.as_default(): 26 | od_graph_def = tf.GraphDef() 27 | with tf.gfile.GFile(self.model_fp, 'rb') as fid: 28 | serialized_graph = fid.read() 29 | od_graph_def.ParseFromString(serialized_graph) 30 | tf.import_graph_def(od_graph_def, name='') 31 | tf.get_default_graph().finalize() 32 | 33 | def _init_predictor(self): 34 | tf_config = tf.ConfigProto() 35 | tf_config.gpu_options.allow_growth = True 36 | with self.graph.as_default(): 37 | self.session = tf.Session(config=tf_config, graph=self.graph) 38 | self._fetch_tensors() 39 | 40 | def _fetch_tensors(self): 41 | assert len(self.input_tensor_names) > 0 42 | assert len(self.output_tensor_names) > 0 43 | for _tensor_name in self.input_tensor_names: 44 | _op = self.graph.get_tensor_by_name(_tensor_name) 45 | self.input_ops.append(_op) 46 | self.feed_dict[_op] = None 47 | for _tensor_name in self.output_tensor_names: 48 | _op = self.graph.get_tensor_by_name(_tensor_name) 49 | self.output_ops.append(_op) 50 | 51 | def _set_feed_dict(self, data): 52 | assert len(data) == len(self.input_ops) 53 | with self.graph.as_default(): 54 | for ind, op in enumerate(self.input_ops): 55 | self.feed_dict[op] = data[ind] 56 | 57 | def inference(self, data): 58 | self.in_progress = True 59 | 60 | with self.graph.as_default(): 61 | self._set_feed_dict(data=data) 62 | 
print("[Base Server] output ops: {}".format(self.output_ops)) 63 | self.prediction = self.session.run(self.output_ops, feed_dict=self.feed_dict) 64 | self.in_progress = False 65 | 66 | return self.prediction 67 | 68 | def get_status(self): 69 | return self.in_progress 70 | 71 | def kill_predictor(self): 72 | # In old version tensorflow 73 | # session sometimes will not be closed automatically 74 | self.session.close() 75 | self.session = None -------------------------------------------------------------------------------- /models/_frustum_pointnets_v1.py: -------------------------------------------------------------------------------- 1 | ''' Frsutum PointNets v1 Model. 2 | ''' 3 | from __future__ import print_function 4 | 5 | import sys 6 | import os 7 | import tensorflow as tf 8 | BASE_DIR = os.path.dirname(os.path.abspath(__file__)) 9 | ROOT_DIR = os.path.dirname(BASE_DIR) 10 | sys.path.append(BASE_DIR) 11 | sys.path.append(os.path.join(ROOT_DIR, 'utils')) 12 | import tf_util 13 | from model_util import NUM_HEADING_BIN, NUM_SIZE_CLUSTER, NUM_OBJECT_POINT 14 | from model_util import point_cloud_masking, get_center_regression_net 15 | from model_util import placeholder_inputs, parse_output_to_tensors, get_loss 16 | 17 | 18 | def get_instance_seg_v1_net(point_cloud, one_hot_vec, 19 | is_training, bn_decay, end_points): 20 | ''' 3D instance segmentation PointNet v1 network. 21 | Input: 22 | point_cloud: TF tensor in shape (B,N,4) 23 | frustum point clouds with XYZ and intensity in point channels 24 | XYZs are in frustum coordinate 25 | one_hot_vec: TF tensor in shape (B,3) 26 | length-3 vectors indicating predicted object type 27 | is_training: TF boolean scalar 28 | bn_decay: TF float scalar 29 | end_points: dict 30 | Output: 31 | logits: TF tensor in shape (B,N,2), scores for bkg/clutter and object 32 | end_points: dict 33 | ''' 34 | batch_size = point_cloud.get_shape()[0].value 35 | num_point = point_cloud.get_shape()[1].value 36 | 37 | net = tf.expand_dims(point_cloud, 2) 38 | 39 | net = tf_util.conv2d(net, 64, [1,1], 40 | padding='VALID', stride=[1,1], 41 | bn=True, is_training=is_training, 42 | scope='conv1', bn_decay=bn_decay) 43 | net = tf_util.conv2d(net, 64, [1,1], 44 | padding='VALID', stride=[1,1], 45 | bn=True, is_training=is_training, 46 | scope='conv2', bn_decay=bn_decay) 47 | point_feat = tf_util.conv2d(net, 64, [1,1], 48 | padding='VALID', stride=[1,1], 49 | bn=True, is_training=is_training, 50 | scope='conv3', bn_decay=bn_decay) 51 | net = tf_util.conv2d(point_feat, 128, [1,1], 52 | padding='VALID', stride=[1,1], 53 | bn=True, is_training=is_training, 54 | scope='conv4', bn_decay=bn_decay) 55 | net = tf_util.conv2d(net, 1024, [1,1], 56 | padding='VALID', stride=[1,1], 57 | bn=True, is_training=is_training, 58 | scope='conv5', bn_decay=bn_decay) 59 | global_feat = tf_util.max_pool2d(net, [num_point,1], 60 | padding='VALID', scope='maxpool') 61 | 62 | global_feat = tf.concat([global_feat, tf.expand_dims(tf.expand_dims(one_hot_vec, 1), 1)], axis=3) 63 | global_feat_expand = tf.tile(global_feat, [1, num_point, 1, 1]) 64 | concat_feat = tf.concat(axis=3, values=[point_feat, global_feat_expand]) 65 | 66 | net = tf_util.conv2d(concat_feat, 512, [1,1], 67 | padding='VALID', stride=[1,1], 68 | bn=True, is_training=is_training, 69 | scope='conv6', bn_decay=bn_decay) 70 | net = tf_util.conv2d(net, 256, [1,1], 71 | padding='VALID', stride=[1,1], 72 | bn=True, is_training=is_training, 73 | scope='conv7', bn_decay=bn_decay) 74 | net = tf_util.conv2d(net, 128, [1,1], 75 | 
padding='VALID', stride=[1,1], 76 | bn=True, is_training=is_training, 77 | scope='conv8', bn_decay=bn_decay) 78 | net = tf_util.conv2d(net, 128, [1,1], 79 | padding='VALID', stride=[1,1], 80 | bn=True, is_training=is_training, 81 | scope='conv9', bn_decay=bn_decay) 82 | net = tf_util.dropout(net, is_training, 'dp1', keep_prob=0.5) 83 | 84 | logits = tf_util.conv2d(net, 2, [1,1], 85 | padding='VALID', stride=[1,1], activation_fn=None, 86 | scope='conv10') 87 | logits = tf.squeeze(logits, [2]) # BxNxC 88 | return logits, end_points 89 | 90 | 91 | def get_3d_box_estimation_v1_net(object_point_cloud, one_hot_vec, 92 | is_training, bn_decay, end_points): 93 | ''' 3D Box Estimation PointNet v1 network. 94 | Input: 95 | object_point_cloud: TF tensor in shape (B,M,C) 96 | point clouds in object coordinate 97 | one_hot_vec: TF tensor in shape (B,3) 98 | length-3 vectors indicating predicted object type 99 | Output: 100 | output: TF tensor in shape (B,3+NUM_HEADING_BIN*2+NUM_SIZE_CLUSTER*4) 101 | including box centers, heading bin class scores and residuals, 102 | and size cluster scores and residuals 103 | ''' 104 | num_point = object_point_cloud.get_shape()[1].value 105 | net = tf.expand_dims(object_point_cloud, 2) 106 | net = tf_util.conv2d(net, 128, [1,1], 107 | padding='VALID', stride=[1,1], 108 | bn=True, is_training=is_training, 109 | scope='conv-reg1', bn_decay=bn_decay) 110 | net = tf_util.conv2d(net, 128, [1,1], 111 | padding='VALID', stride=[1,1], 112 | bn=True, is_training=is_training, 113 | scope='conv-reg2', bn_decay=bn_decay) 114 | net = tf_util.conv2d(net, 256, [1,1], 115 | padding='VALID', stride=[1,1], 116 | bn=True, is_training=is_training, 117 | scope='conv-reg3', bn_decay=bn_decay) 118 | net = tf_util.conv2d(net, 512, [1,1], 119 | padding='VALID', stride=[1,1], 120 | bn=True, is_training=is_training, 121 | scope='conv-reg4', bn_decay=bn_decay) 122 | net = tf_util.max_pool2d(net, [num_point,1], 123 | padding='VALID', scope='maxpool2') 124 | net = tf.squeeze(net, axis=[1,2]) 125 | net = tf.concat([net, one_hot_vec], axis=1) 126 | net = tf_util.fully_connected(net, 512, scope='fc1', bn=True, 127 | is_training=is_training, bn_decay=bn_decay) 128 | net = tf_util.fully_connected(net, 256, scope='fc2', bn=True, 129 | is_training=is_training, bn_decay=bn_decay) 130 | 131 | # The first 3 numbers: box center coordinates (cx,cy,cz), 132 | # the next NUM_HEADING_BIN*2: heading bin class scores and bin residuals 133 | # next NUM_SIZE_CLUSTER*4: box cluster scores and residuals 134 | output = tf_util.fully_connected(net, 135 | 3+NUM_HEADING_BIN*2+NUM_SIZE_CLUSTER*4, activation_fn=None, scope='fc3') 136 | return output, end_points 137 | 138 | 139 | def get_model(point_cloud, one_hot_vec, is_training, bn_decay=None): 140 | ''' Frustum PointNets model. The model predict 3D object masks and 141 | amodel bounding boxes for objects in frustum point clouds. 
142 | 143 | Input: 144 | point_cloud: TF tensor in shape (B,N,4) 145 | frustum point clouds with XYZ and intensity in point channels 146 | XYZs are in frustum coordinate 147 | one_hot_vec: TF tensor in shape (B,3) 148 | length-3 vectors indicating predicted object type 149 | is_training: TF boolean scalar 150 | bn_decay: TF float scalar 151 | Output: 152 | end_points: dict (map from name strings to TF tensors) 153 | ''' 154 | end_points = {} 155 | 156 | # 3D Instance Segmentation PointNet 157 | logits, end_points = get_instance_seg_v1_net(\ 158 | point_cloud, one_hot_vec, 159 | is_training, bn_decay, end_points) 160 | end_points['mask_logits'] = logits 161 | 162 | # Masking 163 | # select masked points and translate to masked points' centroid 164 | object_point_cloud_xyz, mask_xyz_mean, end_points = \ 165 | point_cloud_masking(point_cloud, logits, end_points) 166 | 167 | # T-Net and coordinate translation 168 | center_delta, end_points = get_center_regression_net(\ 169 | object_point_cloud_xyz, one_hot_vec, 170 | is_training, bn_decay, end_points) 171 | stage1_center = center_delta + mask_xyz_mean # Bx3 172 | end_points['stage1_center'] = stage1_center 173 | # Get object point cloud in object coordinate 174 | object_point_cloud_xyz_new = \ 175 | object_point_cloud_xyz - tf.expand_dims(center_delta, 1) 176 | 177 | # Amodel Box Estimation PointNet 178 | output, end_points = get_3d_box_estimation_v1_net(\ 179 | object_point_cloud_xyz_new, one_hot_vec, 180 | is_training, bn_decay, end_points) 181 | 182 | # Parse output to 3D box parameters 183 | end_points = parse_output_to_tensors(output, end_points) 184 | end_points['center'] = end_points['center_boxnet'] + stage1_center # Bx3 185 | 186 | return end_points 187 | 188 | -------------------------------------------------------------------------------- /models/detector_2d.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import _base_server 3 | import cv2 4 | import numpy as np 5 | from configs import configs 6 | 7 | sys.path.append("..") 8 | import libs.label_map_util 9 | 10 | 11 | class Detector2D(_base_server.BaseServer): 12 | 13 | img_height_received = 0 14 | img_width_received = 0 15 | img_feed = None 16 | img_received = None 17 | img_resized = None 18 | num_classes = configs.DETECTOR_2D['NUM_CLASSES'] 19 | img_resize_size = configs.DETECTOR_2D['FEED_IMG_SIZE'] 20 | labels_fp = configs.DETECTOR_2D['LABEL_FP'] 21 | one_hot_vec_map = configs.DETECTOR_2D['ONE_HOT_VECTOR_MAP'] 22 | 23 | def __init__(self, *args, **kwargs): 24 | super(Detector2D, self).__init__(*args, **kwargs) 25 | self._load_labels() 26 | 27 | def inference_verbose(self, data): 28 | self.img_received = cv2.cvtColor(data, cv2.COLOR_RGB2BGR) 29 | self.img_height_received, self.img_width_received, _ = self.img_received.shape 30 | self.img_resized = cv2.resize(self.img_received, (self.img_resize_size, self.img_resize_size)) 31 | print('[Detector2D]Resizing image from {} to {}'.format(self.img_received.shape, self.img_resized.shape)) 32 | self.img_feed = np.expand_dims(self.img_resized, axis=0) 33 | self.inference([self.img_feed]) 34 | bboxes_2d, one_hot_vectors = self.post_process() 35 | print('[Detector2D]boxes 2d are {}\n one_hot_vectors are {}'.format(bboxes_2d, one_hot_vectors)) 36 | return bboxes_2d, one_hot_vectors 37 | 38 | def _load_labels(self): 39 | self.label_map = libs.label_map_util.load_labelmap(self.labels_fp) 40 | self.categories = libs.label_map_util.convert_label_map_to_categories(self.label_map, 41 | 
max_num_classes=self.num_classes,
42 |                                                                                use_display_name=True)
43 |         self.category_index = libs.label_map_util.create_category_index(self.categories)
44 | 
45 |     def _get_one_hot_vet(self, cls):
46 |         one_hot_vec = np.zeros((3))
47 |         one_hot_vec[self.one_hot_vec_map[cls]] = 1
48 |         print('[Detector2D]Converting {} to {}'.format(cls, one_hot_vec))
49 |         return one_hot_vec
50 | 
51 |     def post_process(self, threshold=0.2):
52 |         boxes, scores, classes, num_detections = self.prediction
53 |         filtered_results = []
54 |         bb_o = []
55 |         one_hot_vectors = []
56 |         print('[Detector2D]Number of detections is {}'.format(num_detections))
57 |         for i in range(0, int(num_detections)):
58 |             score = scores[0][i]
59 |             if score >= threshold:
60 |                 print('[Detector2D]Found a detected class with score %s, above the threshold' % score)
61 |                 y1, x1, y2, x2 = boxes[0][i]
62 |                 y1_o = int(y1 * self.img_height_received)
63 |                 x1_o = int(x1 * self.img_width_received)
64 |                 y2_o = int(y2 * self.img_height_received)
65 |                 x2_o = int(x2 * self.img_width_received)
66 |                 predicted_class = self.category_index[classes[0][i]]['name']
67 |                 filtered_results.append({
68 |                     "score": score,
69 |                     "bb": boxes[0][i],
70 |                     "bb_o": [x1_o, y1_o, x2_o, y2_o],
71 |                     "img_size": [self.img_height_received, self.img_width_received],
72 |                     "class": predicted_class
73 |                 })
74 |                 print('[Detector2D]%s: %s, %s' % (predicted_class, score, [x1_o, y1_o, x2_o, y2_o]))
75 |                 bb_o.append([x1_o, y1_o, x2_o, y2_o])
76 |                 one_hot_vectors.append(self._get_one_hot_vet(predicted_class))
77 |         self._viz(filtered_results)
78 |         return bb_o, one_hot_vectors
79 | 
80 |     def _viz(self, filtered_results):
81 |         font = cv2.FONT_HERSHEY_SIMPLEX
82 |         font_scale = 1
83 |         font_color = (0, 255, 0)
84 |         line_type = 2
85 |         offset = 20
86 |         for res in filtered_results:
87 |             x1, y1, x2, y2 = res["bb_o"]
88 |             cv2.rectangle(self.img_received, (x1, y1), (x2, y2), (255, 0, 0), 2)
89 |             cv2.putText(self.img_received, res["class"],
90 |                         (x1 + offset, y1 - offset),
91 |                         font,
92 |                         font_scale,
93 |                         font_color,
94 |                         line_type)
95 |         cv2.imshow('img', self.img_received)
96 |         cv2.waitKey(0)
97 |         cv2.destroyAllWindows()
98 | 
--------------------------------------------------------------------------------
/models/detector_3d.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import _frustum_pointnets_v1 as fp_nets
3 | from configs import configs
4 | 
5 | tf.logging.set_verbosity(tf.logging.INFO)
6 | 
7 | 
8 | class FPNetPredictor(object):
9 | 
10 |     graph = tf.Graph()
11 |     sess = None
12 |     saver = None
13 |     ops = None
14 | 
15 |     BATCH_SIZE = configs.FPNET['BATCH_SIZE']
16 |     NUM_POINT = configs.FPNET['NUM_POINT']
17 |     DEVICE = configs.FPNET['DEVICE']  # note: the key in configs.FPNET is upper-case
18 | 
19 |     def __init__(self, model_fp):
20 |         tf.logging.info("Initializing FPNetPredictor Instance ...")
21 |         self.model_fp = model_fp
22 |         with tf.device(self.DEVICE):
23 |             self._init_session()
24 |             self._init_graph()
25 |         tf.logging.info("Initialized FPNetPredictor Instance!")
26 | 
27 |     def _init_session(self):
28 |         tf.logging.info("Initializing Session ...")
29 |         with self.graph.as_default():
30 |             config = tf.ConfigProto()
31 |             config.gpu_options.allow_growth = True
32 |             config.allow_soft_placement = True
33 |             self.sess = tf.Session(config=config)
34 | 
35 |     def _init_graph(self):
36 |         tf.logging.info("Initializing Graph ...")
37 |         with self.graph.as_default():
38 |             pointclouds_pl, one_hot_vec_pl, labels_pl, centers_pl, \
39 |             heading_class_label_pl, heading_residual_label_pl, \
40 |             size_class_label_pl,
size_residual_label_pl = \ 41 | fp_nets.placeholder_inputs(self.BATCH_SIZE, self.NUM_POINT) 42 | 43 | is_training_pl = tf.placeholder(tf.bool, shape=()) 44 | end_points = fp_nets.get_model(pointclouds_pl, one_hot_vec_pl, is_training_pl) 45 | 46 | self.saver = tf.train.Saver() 47 | # Restore variables from disk. 48 | self.saver.restore(self.sess, self.model_fp) 49 | self.ops = {'pointclouds_pl': pointclouds_pl, 50 | 'one_hot_vec_pl': one_hot_vec_pl, 51 | 'labels_pl': labels_pl, 52 | 'centers_pl': centers_pl, 53 | 'heading_class_label_pl': heading_class_label_pl, 54 | 'heading_residual_label_pl': heading_residual_label_pl, 55 | 'size_class_label_pl': size_class_label_pl, 56 | 'size_residual_label_pl': size_residual_label_pl, 57 | 'is_training_pl': is_training_pl, 58 | 'logits': end_points['mask_logits'], 59 | 'center': end_points['center'], 60 | 'end_points': end_points} 61 | 62 | def predict(self, pc, one_hot_vec): 63 | tf.logging.info("Predicting with pointcloud and one hot vector ...") 64 | _ops = self.ops 65 | _ep = _ops['end_points'] 66 | 67 | feed_dict = {_ops['pointclouds_pl']: pc, _ops['one_hot_vec_pl']: one_hot_vec, _ops['is_training_pl']: False} 68 | 69 | logits, centers, heading_logits, \ 70 | heading_residuals, size_scores, size_residuals = \ 71 | self.sess.run([_ops['logits'], _ops['center'], 72 | _ep['heading_scores'], _ep['heading_residuals'], 73 | _ep['size_scores'], _ep['size_residuals']], 74 | feed_dict=feed_dict) 75 | 76 | tf.logging.info("Prediction done ! \nResults:\nCenter: {}\nSize Score: {}".format(centers, size_scores)) 77 | return logits, centers, heading_logits, heading_residuals, size_scores, size_residuals 78 | -------------------------------------------------------------------------------- /models/frustum_proposal.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | ''' 4 | Note from KITTI Object detection note: 5 | 6 | The coordinates in the camera coordinate system can be projected in the image 7 | by using the 3x4 projection matrix in the calib folder, where for the left 8 | color camera for which the images are provided, P2 must be used. The 9 | difference between rotation_y and alpha is, that rotation_y is directly 10 | given in camera coordinates, while alpha also considers the vector from the 11 | camera center to the object center, to compute the relative orientation of 12 | the object with respect to the camera. For example, a car which is facing 13 | along the X-axis of the camera coordinate system corresponds to rotation_y=0, 14 | no matter where it is located in the X/Z plane (bird's eye view), while 15 | alpha is zero only, when this object is located along the Z-axis of the 16 | camera. When moving the car away from the Z-axis, the observation angle 17 | will change. 18 | 19 | To project a point from Velodyne coordinates into the left color image, 20 | you can use this formula: x = P2 * R0_rect * Tr_velo_to_cam * y 21 | For the right color image: x = P3 * R0_rect * Tr_velo_to_cam * y 22 | 23 | Note: All matrices are stored row-major, i.e., the first values correspond 24 | to the first row. R0_rect contains a 3x3 matrix which you need to extend to 25 | a 4x4 matrix by adding a 1 as the bottom-right element and 0's elsewhere. 26 | Tr_xxx is a 3x4 matrix (R|t), which you need to extend to a 4x4 matrix 27 | in the same way! 
28 | 
29 | Note, that while all this information is available for the training data,
30 | only the data which is actually needed for the particular benchmark must
31 | be provided to the evaluation server. However, all 15 values must be provided
32 | at all times, with the unused ones set to their default values (=invalid) as
33 | specified in writeLabels.m. Additionally a 16'th value must be provided
34 | with a floating value of the score for a particular detection, where higher
35 | indicates higher confidence in the detection. The range of your scores will
36 | be automatically determined by our evaluation server, you don't have to
37 | normalize it, but it should be roughly linear. If you use writeLabels.m for
38 | writing your results, this function will take care of storing all required
39 | data correctly.
40 | 
41 | '''
42 | 
43 | 
44 | class FrustumProposal(object):
45 |     def __init__(self, calibs):
46 |         assert all(k in calibs for k in ('P', 'Tr_velo_to_cam', 'R0_rect'))  # all three calib matrices are required
47 | 
48 |         self.P = calibs['P']
49 |         self.P = np.reshape(self.P, [3, 4])
50 | 
51 |         self.V2C = calibs['Tr_velo_to_cam']
52 |         self.V2C = np.reshape(self.V2C, [3, 4])
53 | 
54 |         self.C2V = self.inverse_rigid_trans(self.V2C)
55 | 
56 |         self.R0 = calibs['R0_rect']
57 |         self.R0 = np.reshape(self.R0, [3, 3])
58 | 
59 |     @staticmethod
60 |     def inverse_rigid_trans(Tr):
61 |         ''' Inverse a rigid body transform matrix (3x4 as [R|t])
62 |             [R'|-R't; 0|1]
63 |         '''
64 |         inv_Tr = np.zeros_like(Tr)  # 3x4
65 |         inv_Tr[0:3, 0:3] = np.transpose(Tr[0:3, 0:3])
66 |         inv_Tr[0:3, 3] = np.dot(-np.transpose(Tr[0:3, 0:3]), Tr[0:3, 3])
67 |         return inv_Tr
68 | 
69 |     def _cart2hom(self, pts_3d):
70 |         ''' Input: nx3 points in Cartesian
71 |             Output: nx4 points in Homogeneous, by appending 1
72 |         '''
73 |         n = pts_3d.shape[0]
74 |         pts_3d_hom = np.hstack((pts_3d, np.ones((n, 1))))
75 |         return pts_3d_hom
76 | 
77 |     def _project_velo_to_ref(self, pts_3d_velo):
78 |         pts_3d_velo = self._cart2hom(pts_3d_velo)  # nx4
79 |         return np.dot(pts_3d_velo, np.transpose(self.V2C))
80 | 
81 |     def _project_ref_to_velo(self, pts_3d_ref):
82 |         pts_3d_ref = self._cart2hom(pts_3d_ref)  # nx4
83 |         return np.dot(pts_3d_ref, np.transpose(self.C2V))
84 | 
85 |     def _project_rect_to_ref(self, pts_3d_rect):
86 |         ''' Input and Output are nx3 points '''
87 |         return np.transpose(np.dot(np.linalg.inv(self.R0), np.transpose(pts_3d_rect)))
88 | 
89 |     def _project_ref_to_rect(self, pts_3d_ref):
90 |         ''' Input and Output are nx3 points '''
91 |         return np.transpose(np.dot(self.R0, np.transpose(pts_3d_ref)))
92 | 
93 |     def project_rect_to_velo(self, pts_3d_rect):
94 |         ''' Input: nx3 points in rect camera coord.
95 |             Output: nx3 points in velodyne coord.
96 |         '''
97 |         pts_3d_ref = self._project_rect_to_ref(pts_3d_rect)
98 |         return self._project_ref_to_velo(pts_3d_ref)
99 | 
100 |     def _project_velo_to_rect(self, pts_3d_velo):
101 |         pts_3d_ref = self._project_velo_to_ref(pts_3d_velo)
102 |         return self._project_ref_to_rect(pts_3d_ref)
103 | 
104 |     def _project_rect_to_image(self, pts_3d_rect):
105 |         ''' Input: nx3 points in rect camera coord.
106 |             Output: nx2 points in image2 coord.
107 |         '''
108 |         pts_3d_rect = self._cart2hom(pts_3d_rect)
109 |         pts_2d = np.dot(pts_3d_rect, np.transpose(self.P))  # nx3
110 |         pts_2d[:, 0] /= pts_2d[:, 2]
111 |         pts_2d[:, 1] /= pts_2d[:, 2]
112 |         return pts_2d[:, 0:2]
113 | 
114 |     def _project_velo_to_image(self, pts_3d_velo):
115 |         ''' Input: nx3 points in velodyne coord.
116 |             Output: nx2 points in image2 coord.
117 |         '''
118 |         pts_3d_rect = self._project_velo_to_rect(pts_3d_velo)
119 |         return self._project_rect_to_image(pts_3d_rect)
120 | 
121 |     def _get_lidar_in_image_fov(self, pc_velo, xmin, ymin, xmax, ymax,
122 |                                 return_more=False, clip_distance=2.0):
123 |         ''' Filter lidar points, keep those in image FOV '''
124 |         pts_2d = self._project_velo_to_image(pc_velo)
125 |         fov_inds = (pts_2d[:, 0] < xmax) & (pts_2d[:, 0] >= xmin) & \
126 |                    (pts_2d[:, 1] < ymax) & (pts_2d[:, 1] >= ymin)
127 |         fov_inds = fov_inds & (pc_velo[:, 0] > clip_distance)
128 |         imgfov_pc_velo = pc_velo[fov_inds, :]
129 |         if return_more:
130 |             return imgfov_pc_velo, pts_2d, fov_inds
131 |         else:
132 |             return imgfov_pc_velo
133 | 
134 |     def get_frustum_proposal(self, img_shape, boxes2d, pc_velo):
135 |         print('[FrustumProposal] Fetching frustum proposal from:')
136 |         print('[FrustumProposal] image_shape: {} '.format(img_shape))
137 |         print('[FrustumProposal] boxes2d: {} '.format(boxes2d))
138 |         print('[FrustumProposal] pc_velo.shape: {} '.format(pc_velo.shape))
139 |         frustum_proposals = []
140 |         frustum_proposals_velo = []
141 |         img_height, img_width, _ = img_shape
142 |         _num_objs = len(boxes2d)
143 |         _, pc_image_coord, img_fov_inds = self._get_lidar_in_image_fov(pc_velo[:, 0:3], 0, 0, img_width, img_height, True)
144 |         pc_rect = np.zeros_like(pc_velo)
145 |         pc_rect[:, 0:3] = self._project_velo_to_rect(pc_velo[:, 0:3])
146 |         pc_rect[:, 3] = pc_velo[:, 3]
147 |         for obj_idx in range(_num_objs):
148 |             box2d = boxes2d[obj_idx]
149 |             xmin, ymin, xmax, ymax = box2d
150 |             box_fov_inds = (pc_image_coord[:, 0] < xmax) & \
151 |                            (pc_image_coord[:, 0] >= xmin) & \
152 |                            (pc_image_coord[:, 1] < ymax) & \
153 |                            (pc_image_coord[:, 1] >= ymin)
154 |             box_fov_inds = box_fov_inds & img_fov_inds
155 |             pc_in_box_fov = pc_rect[box_fov_inds, :]
156 |             # The block below is equivalent to the commented-out one-liner; done this way to verify the projection
157 |             pc_in_velo_fov = np.zeros_like(pc_in_box_fov)
158 |             pc_in_velo_fov[:, 0:3] = self.project_rect_to_velo(pc_in_box_fov[:, 0:3])
159 |             pc_in_velo_fov[:, 3] = pc_in_box_fov[:, 3]
160 | 
161 |             # pc_in_velo_fov = pc_velo[box_fov_inds, :]
162 |             frustum_proposals.append(pc_in_box_fov)
163 |             frustum_proposals_velo.append(pc_in_velo_fov)
164 |         print('[Frustum Proposal] Proposed %s frustum proposals' % len(frustum_proposals))
165 |         return frustum_proposals, frustum_proposals_velo
166 | 
--------------------------------------------------------------------------------
/models/model_util.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import tensorflow as tf
3 | import os
4 | import sys
5 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
6 | sys.path.append(BASE_DIR)
7 | import tf_util
8 | 
9 | # -----------------
10 | # Global Constants
11 | # -----------------
12 | 
13 | from configs import configs
14 | 
15 | NUM_HEADING_BIN = configs.FPNET['NUM_HEADING_BIN']
16 | NUM_SIZE_CLUSTER = configs.FPNET['NUM_SIZE_CLUSTER']
17 | NUM_OBJECT_POINT = configs.FPNET['NUM_OBJECT_POINT']
18 | 
19 | g_type2class={'Car':0, 'Van':1, 'Truck':2, 'Pedestrian':3,
20 |               'Person_sitting':4, 'Cyclist':5, 'Tram':6, 'Misc':7}
21 | g_class2type = {g_type2class[t]:t for t in g_type2class}
22 | g_type2onehotclass = {'Car': 0, 'Pedestrian': 1, 'Cyclist': 2}
23 | g_type_mean_size = {'Car': np.array([3.88311640418,1.62856739989,1.52563191462]),
24 |                     'Van': np.array([5.06763659,1.9007158,2.20532825]),
25 |                     'Truck': np.array([10.13586957,2.58549199,3.2520595]),
26 |                     'Pedestrian': np.array([0.84422524,0.66068622,1.76255119]),
27 |                     'Person_sitting': np.array([0.80057803,0.5983815,1.27450867]),
28 |                     'Cyclist': np.array([1.76282397,0.59706367,1.73698127]),
29 |                     'Tram': np.array([16.17150617,2.53246914,3.53079012]),
30 |                     'Misc': np.array([3.64300781,1.54298177,1.92320313])}
31 | g_mean_size_arr = np.zeros((NUM_SIZE_CLUSTER, 3))  # size clusters
32 | for i in range(NUM_SIZE_CLUSTER):
33 |     g_mean_size_arr[i,:] = g_type_mean_size[g_class2type[i]]
34 | 
35 | # -----------------
36 | # TF Functions Helpers
37 | # -----------------
38 | 
39 | def tf_gather_object_pc(point_cloud, mask, npoints=512):
40 |     ''' Gather object point clouds according to predicted masks.
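
        Behavioral note (a sketch of the sampling logic below): if a mask
        picks more than `npoints` points, a random subset is kept; if it
        picks fewer, the positive indices are padded by re-sampling with
        replacement, so each example always yields exactly `npoints`
        indices. E.g. with npoints=512 and only 3 masked points, those 3
        indices are repeated in shuffled order to fill all 512 slots.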
41 | Input: 42 | point_cloud: TF tensor in shape (B,N,C) 43 | mask: TF tensor in shape (B,N) of 0 (not pick) or 1 (pick) 44 | npoints: int scalar, maximum number of points to keep (default: 512) 45 | Output: 46 | object_pc: TF tensor in shape (B,npoint,C) 47 | indices: TF int tensor in shape (B,npoint,2) 48 | ''' 49 | def mask_to_indices(mask): 50 | indices = np.zeros((mask.shape[0], npoints, 2), dtype=np.int32) 51 | for i in range(mask.shape[0]): 52 | pos_indices = np.where(mask[i,:]>0.5)[0] 53 | # skip cases when pos_indices is empty 54 | if len(pos_indices) > 0: 55 | if len(pos_indices) > npoints: 56 | choice = np.random.choice(len(pos_indices), 57 | npoints, replace=False) 58 | else: 59 | choice = np.random.choice(len(pos_indices), 60 | npoints-len(pos_indices), replace=True) 61 | choice = np.concatenate((np.arange(len(pos_indices)), choice)) 62 | np.random.shuffle(choice) 63 | indices[i,:,1] = pos_indices[choice] 64 | indices[i,:,0] = i 65 | return indices 66 | 67 | indices = tf.py_func(mask_to_indices, [mask], tf.int32) 68 | object_pc = tf.gather_nd(point_cloud, indices) 69 | return object_pc, indices 70 | 71 | 72 | def get_box3d_corners_helper(centers, headings, sizes): 73 | """ TF layer. Input: (N,3), (N,), (N,3), Output: (N,8,3) """ 74 | #print '-----', centers 75 | N = centers.get_shape()[0].value 76 | l = tf.slice(sizes, [0,0], [-1,1]) # (N,1) 77 | w = tf.slice(sizes, [0,1], [-1,1]) # (N,1) 78 | h = tf.slice(sizes, [0,2], [-1,1]) # (N,1) 79 | #print l,w,h 80 | x_corners = tf.concat([l/2,l/2,-l/2,-l/2,l/2,l/2,-l/2,-l/2], axis=1) # (N,8) 81 | y_corners = tf.concat([h/2,h/2,h/2,h/2,-h/2,-h/2,-h/2,-h/2], axis=1) # (N,8) 82 | z_corners = tf.concat([w/2,-w/2,-w/2,w/2,w/2,-w/2,-w/2,w/2], axis=1) # (N,8) 83 | corners = tf.concat([tf.expand_dims(x_corners,1), tf.expand_dims(y_corners,1), tf.expand_dims(z_corners,1)], axis=1) # (N,3,8) 84 | #print x_corners, y_corners, z_corners 85 | c = tf.cos(headings) 86 | s = tf.sin(headings) 87 | ones = tf.ones([N], dtype=tf.float32) 88 | zeros = tf.zeros([N], dtype=tf.float32) 89 | row1 = tf.stack([c,zeros,s], axis=1) # (N,3) 90 | row2 = tf.stack([zeros,ones,zeros], axis=1) 91 | row3 = tf.stack([-s,zeros,c], axis=1) 92 | R = tf.concat([tf.expand_dims(row1,1), tf.expand_dims(row2,1), tf.expand_dims(row3,1)], axis=1) # (N,3,3) 93 | #print row1, row2, row3, R, N 94 | corners_3d = tf.matmul(R, corners) # (N,3,8) 95 | corners_3d += tf.tile(tf.expand_dims(centers,2), [1,1,8]) # (N,3,8) 96 | corners_3d = tf.transpose(corners_3d, perm=[0,2,1]) # (N,8,3) 97 | return corners_3d 98 | 99 | def get_box3d_corners(center, heading_residuals, size_residuals): 100 | """ TF layer. 
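
    (Sketch of the idea: rather than decoding a single box, this layer
    decodes one candidate box per (heading bin, size cluster) pair so the
    loss can later select the ground-truth combination with a one-hot
    mask; e.g. with illustrative values B=32, NH=12, NS=8 it produces
    32 * 12 * 8 = 3072 candidate boxes per forward pass.)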
101 |     Inputs:
102 |         center: (B,3)
103 |         heading_residuals: (B,NH)
104 |         size_residuals: (B,NS,3)
105 |     Outputs:
106 |         box3d_corners: (B,NH,NS,8,3) tensor
107 |     """
108 |     batch_size = center.get_shape()[0].value
109 |     heading_bin_centers = tf.constant(np.arange(0,2*np.pi,2*np.pi/NUM_HEADING_BIN), dtype=tf.float32) # (NH,)
110 |     headings = heading_residuals + tf.expand_dims(heading_bin_centers, 0) # (B,NH)
111 | 
112 |     mean_sizes = tf.expand_dims(tf.constant(g_mean_size_arr, dtype=tf.float32), 0) # (1,NS,3)
113 |     sizes = mean_sizes + size_residuals # (B,NS,3); add the predicted residuals to the cluster means once
114 |     sizes = tf.tile(tf.expand_dims(sizes,1), [1,NUM_HEADING_BIN,1,1]) # (B,NH,NS,3)
115 |     headings = tf.tile(tf.expand_dims(headings,-1), [1,1,NUM_SIZE_CLUSTER]) # (B,NH,NS)
116 |     centers = tf.tile(tf.expand_dims(tf.expand_dims(center,1),1), [1,NUM_HEADING_BIN, NUM_SIZE_CLUSTER,1]) # (B,NH,NS,3)
117 | 
118 |     N = batch_size*NUM_HEADING_BIN*NUM_SIZE_CLUSTER
119 |     corners_3d = get_box3d_corners_helper(tf.reshape(centers, [N,3]), tf.reshape(headings, [N]), tf.reshape(sizes, [N,3]))
120 | 
121 |     return tf.reshape(corners_3d, [batch_size, NUM_HEADING_BIN, NUM_SIZE_CLUSTER, 8, 3])
122 | 
123 | 
124 | def huber_loss(error, delta):
125 |     abs_error = tf.abs(error)
126 |     quadratic = tf.minimum(abs_error, delta)
127 |     linear = (abs_error - quadratic)
128 |     losses = 0.5 * quadratic**2 + delta * linear
129 |     return tf.reduce_mean(losses)
130 | 
131 | 
132 | def parse_output_to_tensors(output, end_points):
133 |     ''' Parse batch output to separate tensors (added to end_points)
134 |     Input:
135 |         output: TF tensor in shape (B,3+2*NUM_HEADING_BIN+4*NUM_SIZE_CLUSTER)
136 |         end_points: dict
137 |     Output:
138 |         end_points: dict (updated)
139 |     '''
140 |     batch_size = output.get_shape()[0].value
141 |     center = tf.slice(output, [0,0], [-1,3])
142 |     end_points['center_boxnet'] = center
143 | 
144 |     heading_scores = tf.slice(output, [0,3], [-1,NUM_HEADING_BIN])
145 |     heading_residuals_normalized = tf.slice(output, [0,3+NUM_HEADING_BIN],
146 |                                             [-1,NUM_HEADING_BIN])
147 |     end_points['heading_scores'] = heading_scores # BxNUM_HEADING_BIN
148 |     end_points['heading_residuals_normalized'] = \
149 |         heading_residuals_normalized # BxNUM_HEADING_BIN (-1 to 1)
150 |     end_points['heading_residuals'] = \
151 |         heading_residuals_normalized * (np.pi/NUM_HEADING_BIN) # BxNUM_HEADING_BIN
152 | 
153 |     size_scores = tf.slice(output, [0,3+NUM_HEADING_BIN*2],
154 |                            [-1,NUM_SIZE_CLUSTER]) # BxNUM_SIZE_CLUSTER
155 |     size_residuals_normalized = tf.slice(output,
156 |         [0,3+NUM_HEADING_BIN*2+NUM_SIZE_CLUSTER], [-1,NUM_SIZE_CLUSTER*3])
157 |     size_residuals_normalized = tf.reshape(size_residuals_normalized,
158 |         [batch_size, NUM_SIZE_CLUSTER, 3]) # BxNUM_SIZE_CLUSTERx3
159 |     end_points['size_scores'] = size_scores
160 |     end_points['size_residuals_normalized'] = size_residuals_normalized
161 |     end_points['size_residuals'] = size_residuals_normalized * \
162 |         tf.expand_dims(tf.constant(g_mean_size_arr, dtype=tf.float32), 0)
163 | 
164 |     return end_points
165 | 
166 | # --------------------------------------
167 | # Shared subgraphs for v1 and v2 models
168 | # --------------------------------------
169 | 
170 | def placeholder_inputs(batch_size, num_point):
171 |     ''' Get useful placeholder tensors.
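
        A minimal usage sketch (the batch size and point count below are
        illustrative; the real values come from configs.FPNET):

            pointclouds_pl, one_hot_vec_pl, labels_pl, centers_pl, \
                heading_class_label_pl, heading_residual_label_pl, \
                size_class_label_pl, size_residual_label_pl = \
                placeholder_inputs(batch_size=32, num_point=1024)
            # pointclouds_pl: float32, shape (32, 1024, 4) -- x, y, z, intensity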
172 | Input: 173 | batch_size: scalar int 174 | num_point: scalar int 175 | Output: 176 | TF placeholders for inputs and ground truths 177 | ''' 178 | pointclouds_pl = tf.placeholder(tf.float32, 179 | shape=(batch_size, num_point, 4)) 180 | one_hot_vec_pl = tf.placeholder(tf.float32, shape=(batch_size, 3)) 181 | 182 | # labels_pl is for segmentation label 183 | labels_pl = tf.placeholder(tf.int32, shape=(batch_size, num_point)) 184 | centers_pl = tf.placeholder(tf.float32, shape=(batch_size, 3)) 185 | heading_class_label_pl = tf.placeholder(tf.int32, shape=(batch_size,)) 186 | heading_residual_label_pl = tf.placeholder(tf.float32, shape=(batch_size,)) 187 | size_class_label_pl = tf.placeholder(tf.int32, shape=(batch_size,)) 188 | size_residual_label_pl = tf.placeholder(tf.float32, shape=(batch_size,3)) 189 | 190 | return pointclouds_pl, one_hot_vec_pl, labels_pl, centers_pl, \ 191 | heading_class_label_pl, heading_residual_label_pl, \ 192 | size_class_label_pl, size_residual_label_pl 193 | 194 | 195 | def point_cloud_masking(point_cloud, logits, end_points, xyz_only=True): 196 | ''' Select point cloud with predicted 3D mask, 197 | translate coordinates to the masked points centroid. 198 | 199 | Input: 200 | point_cloud: TF tensor in shape (B,N,C) 201 | logits: TF tensor in shape (B,N,2) 202 | end_points: dict 203 | xyz_only: boolean, if True only return XYZ channels 204 | Output: 205 | object_point_cloud: TF tensor in shape (B,M,3) 206 | for simplicity we only keep XYZ here 207 | M = NUM_OBJECT_POINT as a hyper-parameter 208 | mask_xyz_mean: TF tensor in shape (B,3) 209 | ''' 210 | batch_size = point_cloud.get_shape()[0].value 211 | num_point = point_cloud.get_shape()[1].value 212 | mask = tf.slice(logits,[0,0,0],[-1,-1,1]) < \ 213 | tf.slice(logits,[0,0,1],[-1,-1,1]) 214 | mask = tf.to_float(mask) # BxNx1 215 | mask_count = tf.tile(tf.reduce_sum(mask,axis=1,keep_dims=True), 216 | [1,1,3]) # Bx1x3 217 | point_cloud_xyz = tf.slice(point_cloud, [0,0,0], [-1,-1,3]) # BxNx3 218 | mask_xyz_mean = tf.reduce_sum(tf.tile(mask, [1,1,3])*point_cloud_xyz, 219 | axis=1, keep_dims=True) # Bx1x3 220 | mask = tf.squeeze(mask, axis=[2]) # BxN 221 | end_points['mask'] = mask 222 | mask_xyz_mean = mask_xyz_mean/tf.maximum(mask_count,1) # Bx1x3 223 | 224 | # Translate to masked points' centroid 225 | point_cloud_xyz_stage1 = point_cloud_xyz - \ 226 | tf.tile(mask_xyz_mean, [1,num_point,1]) 227 | 228 | if xyz_only: 229 | point_cloud_stage1 = point_cloud_xyz_stage1 230 | else: 231 | point_cloud_features = tf.slice(point_cloud, [0,0,3], [-1,-1,-1]) 232 | point_cloud_stage1 = tf.concat(\ 233 | [point_cloud_xyz_stage1, point_cloud_features], axis=-1) 234 | num_channels = point_cloud_stage1.get_shape()[2].value 235 | 236 | object_point_cloud, _ = tf_gather_object_pc(point_cloud_stage1, 237 | mask, NUM_OBJECT_POINT) 238 | object_point_cloud.set_shape([batch_size, NUM_OBJECT_POINT, num_channels]) 239 | 240 | return object_point_cloud, tf.squeeze(mask_xyz_mean, axis=1), end_points 241 | 242 | 243 | def get_center_regression_net(object_point_cloud, one_hot_vec, 244 | is_training, bn_decay, end_points): 245 | ''' Regression network for center delta. a.k.a. T-Net. 
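
        (Structure, as read from the code below: three shared 1x1 conv
        layers of width 128, 128, 256 over the points, a max-pool across
        the point dimension, concatenation with the one-hot class vector,
        then fully connected layers 256 -> 128 -> 3 regressing the center
        delta.)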
246 |     Input:
247 |         object_point_cloud: TF tensor in shape (B,M,C)
248 |             point clouds in 3D mask coordinate
249 |         one_hot_vec: TF tensor in shape (B,3)
250 |             length-3 vectors indicating predicted object type
251 |     Output:
252 |         predicted_center: TF tensor in shape (B,3)
253 |     '''
254 |     num_point = object_point_cloud.get_shape()[1].value
255 |     net = tf.expand_dims(object_point_cloud, 2)
256 |     net = tf_util.conv2d(net, 128, [1,1],
257 |                          padding='VALID', stride=[1,1],
258 |                          bn=True, is_training=is_training,
259 |                          scope='conv-reg1-stage1', bn_decay=bn_decay)
260 |     net = tf_util.conv2d(net, 128, [1,1],
261 |                          padding='VALID', stride=[1,1],
262 |                          bn=True, is_training=is_training,
263 |                          scope='conv-reg2-stage1', bn_decay=bn_decay)
264 |     net = tf_util.conv2d(net, 256, [1,1],
265 |                          padding='VALID', stride=[1,1],
266 |                          bn=True, is_training=is_training,
267 |                          scope='conv-reg3-stage1', bn_decay=bn_decay)
268 |     net = tf_util.max_pool2d(net, [num_point,1],
269 |                              padding='VALID', scope='maxpool-stage1')
270 |     net = tf.squeeze(net, axis=[1,2])
271 |     net = tf.concat([net, one_hot_vec], axis=1)
272 |     net = tf_util.fully_connected(net, 256, scope='fc1-stage1', bn=True,
273 |                                   is_training=is_training, bn_decay=bn_decay)
274 |     net = tf_util.fully_connected(net, 128, scope='fc2-stage1', bn=True,
275 |                                   is_training=is_training, bn_decay=bn_decay)
276 |     predicted_center = tf_util.fully_connected(net, 3, activation_fn=None,
277 |                                                scope='fc3-stage1')
278 |     return predicted_center, end_points
279 | 
280 | 
281 | def get_loss(mask_label, center_label, \
282 |              heading_class_label, heading_residual_label, \
283 |              size_class_label, size_residual_label, \
284 |              end_points, \
285 |              corner_loss_weight=10.0, \
286 |              box_loss_weight=1.0):
287 |     ''' Loss functions for 3D object detection.
288 |     Input:
289 |         mask_label: TF int32 tensor in shape (B,N)
290 |         center_label: TF tensor in shape (B,3)
291 |         heading_class_label: TF int32 tensor in shape (B,)
292 |         heading_residual_label: TF tensor in shape (B,)
293 |         size_class_label: TF int32 tensor in shape (B,)
294 |         size_residual_label: TF tensor in shape (B,3)
295 |         end_points: dict, outputs from our model
296 |         corner_loss_weight: float scalar
297 |         box_loss_weight: float scalar
298 |     Output:
299 |         total_loss: TF scalar tensor
300 |             the total_loss is also added to the losses collection
301 |     '''
302 |     # 3D Segmentation loss
303 |     mask_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(\
304 |         logits=end_points['mask_logits'], labels=mask_label))
305 |     tf.summary.scalar('3d mask loss', mask_loss)
306 | 
307 |     # Center regression losses
308 |     center_dist = tf.norm(center_label - end_points['center'], axis=-1)
309 |     center_loss = huber_loss(center_dist, delta=2.0)
310 |     tf.summary.scalar('center loss', center_loss)
311 |     stage1_center_dist = tf.norm(center_label - \
312 |         end_points['stage1_center'], axis=-1)
313 |     stage1_center_loss = huber_loss(stage1_center_dist, delta=1.0)
314 |     tf.summary.scalar('stage1 center loss', stage1_center_loss)
315 | 
316 |     # Heading loss
317 |     heading_class_loss = tf.reduce_mean( \
318 |         tf.nn.sparse_softmax_cross_entropy_with_logits( \
319 |             logits=end_points['heading_scores'], labels=heading_class_label))
320 |     tf.summary.scalar('heading class loss', heading_class_loss)
321 | 
322 |     hcls_onehot = tf.one_hot(heading_class_label,
323 |                              depth=NUM_HEADING_BIN,
324 |                              on_value=1, off_value=0, axis=-1) # BxNUM_HEADING_BIN
325 |     heading_residual_normalized_label = \
326 |         heading_residual_label / (np.pi/NUM_HEADING_BIN)
327 | 
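    # A worked sketch of the normalization above (the bin count is
    # illustrative; the real value is configs.FPNET['NUM_HEADING_BIN']):
    # with 12 heading bins each bin spans 2*pi/12 = 30 degrees, a residual
    # is at most half a bin (pi/12) from its bin center, and dividing by
    # pi/NUM_HEADING_BIN therefore maps it into roughly [-1, 1], which
    # keeps the huber loss below well-scaled.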
heading_residual_normalized_loss = huber_loss(tf.reduce_sum( \ 328 | end_points['heading_residuals_normalized']*tf.to_float(hcls_onehot), axis=1) - \ 329 | heading_residual_normalized_label, delta=1.0) 330 | tf.summary.scalar('heading residual normalized loss', 331 | heading_residual_normalized_loss) 332 | 333 | # Size loss 334 | size_class_loss = tf.reduce_mean( \ 335 | tf.nn.sparse_softmax_cross_entropy_with_logits( \ 336 | logits=end_points['size_scores'], labels=size_class_label)) 337 | tf.summary.scalar('size class loss', size_class_loss) 338 | 339 | scls_onehot = tf.one_hot(size_class_label, 340 | depth=NUM_SIZE_CLUSTER, 341 | on_value=1, off_value=0, axis=-1) # BxNUM_SIZE_CLUSTER 342 | scls_onehot_tiled = tf.tile(tf.expand_dims( \ 343 | tf.to_float(scls_onehot), -1), [1,1,3]) # BxNUM_SIZE_CLUSTERx3 344 | predicted_size_residual_normalized = tf.reduce_sum( \ 345 | end_points['size_residuals_normalized']*scls_onehot_tiled, axis=[1]) # Bx3 346 | 347 | mean_size_arr_expand = tf.expand_dims( \ 348 | tf.constant(g_mean_size_arr, dtype=tf.float32),0) # 1xNUM_SIZE_CLUSTERx3 349 | mean_size_label = tf.reduce_sum( \ 350 | scls_onehot_tiled * mean_size_arr_expand, axis=[1]) # Bx3 351 | size_residual_label_normalized = size_residual_label / mean_size_label 352 | size_normalized_dist = tf.norm( \ 353 | size_residual_label_normalized - predicted_size_residual_normalized, 354 | axis=-1) 355 | size_residual_normalized_loss = huber_loss(size_normalized_dist, delta=1.0) 356 | tf.summary.scalar('size residual normalized loss', 357 | size_residual_normalized_loss) 358 | 359 | # Corner loss 360 | # We select the predicted corners corresponding to the 361 | # GT heading bin and size cluster. 362 | corners_3d = get_box3d_corners(end_points['center'], 363 | end_points['heading_residuals'], 364 | end_points['size_residuals']) # (B,NH,NS,8,3) 365 | gt_mask = tf.tile(tf.expand_dims(hcls_onehot, 2), [1,1,NUM_SIZE_CLUSTER]) * \ 366 | tf.tile(tf.expand_dims(scls_onehot,1), [1,NUM_HEADING_BIN,1]) # (B,NH,NS) 367 | corners_3d_pred = tf.reduce_sum( \ 368 | tf.to_float(tf.expand_dims(tf.expand_dims(gt_mask,-1),-1)) * corners_3d, 369 | axis=[1,2]) # (B,8,3) 370 | 371 | heading_bin_centers = tf.constant( \ 372 | np.arange(0,2*np.pi,2*np.pi/NUM_HEADING_BIN), dtype=tf.float32) # (NH,) 373 | heading_label = tf.expand_dims(heading_residual_label,1) + \ 374 | tf.expand_dims(heading_bin_centers, 0) # (B,NH) 375 | heading_label = tf.reduce_sum(tf.to_float(hcls_onehot)*heading_label, 1) 376 | mean_sizes = tf.expand_dims( \ 377 | tf.constant(g_mean_size_arr, dtype=tf.float32), 0) # (1,NS,3) 378 | size_label = mean_sizes + \ 379 | tf.expand_dims(size_residual_label, 1) # (1,NS,3) + (B,1,3) = (B,NS,3) 380 | size_label = tf.reduce_sum( \ 381 | tf.expand_dims(tf.to_float(scls_onehot),-1)*size_label, axis=[1]) # (B,3) 382 | corners_3d_gt = get_box3d_corners_helper( \ 383 | center_label, heading_label, size_label) # (B,8,3) 384 | corners_3d_gt_flip = get_box3d_corners_helper( \ 385 | center_label, heading_label+np.pi, size_label) # (B,8,3) 386 | 387 | corners_dist = tf.minimum(tf.norm(corners_3d_pred - corners_3d_gt, axis=-1), 388 | tf.norm(corners_3d_pred - corners_3d_gt_flip, axis=-1)) 389 | corners_loss = huber_loss(corners_dist, delta=1.0) 390 | tf.summary.scalar('corners loss', corners_loss) 391 | 392 | # Weighted sum of all losses 393 | total_loss = mask_loss + box_loss_weight * (center_loss + \ 394 | heading_class_loss + size_class_loss + \ 395 | heading_residual_normalized_loss*20 + \ 396 | size_residual_normalized_loss*20 + 
\
397 |                  stage1_center_loss + \
398 |                  corner_loss_weight*corners_loss)
399 |     tf.add_to_collection('losses', total_loss)
400 | 
401 |     return total_loss
402 | 
--------------------------------------------------------------------------------
/models/server.py:
--------------------------------------------------------------------------------
1 | import detector_3d, frustum_proposal, detector_2d
2 | import numpy as np
3 | from configs import configs
4 | from utils import utils
5 | 
6 | 
7 | class Server(object):
8 | 
9 |     frt_proposal_server = None
10 |     detector_2d = None
11 |     detector_3d = None
12 |     in_progress = False
13 |     CALIB_PARAM = configs.CALIB_PARAM
14 |     NUM_POINT = configs.FPNET['NUM_POINT']
15 |     DETECTOR_3D_MODEL_FP = configs.DETECTOR_3D['MODEL_FP']
16 |     NUM_HEADING_BIN = configs.FPNET['NUM_HEADING_BIN']
17 |     DETECTOR_2D_MODEL_FP = configs.DETECTOR_2D['MODEL_FP']
18 |     input_tensor_names = configs.BASE_SERVER['input_tensor_names']
19 |     output_tensor_names = configs.BASE_SERVER['output_tensor_names']
20 |     device = configs.BASE_SERVER['device']
21 | 
22 |     def __init__(self):
23 |         self._load_params()
24 |         self._init_detector_2d()
25 |         self._init_frt_proposal_server()
26 |         self._init_detector_3d()
27 | 
28 |     def _load_params(self):
29 |         print('[Server] Init Params ...')
30 |         self.calib_param = self.CALIB_PARAM
31 | 
32 |     def _init_frt_proposal_server(self):
33 |         print('[Server] Init frustum proposal server ...')
34 |         self.frt_proposal_server = frustum_proposal.FrustumProposal(self.calib_param)
35 | 
36 |     def _init_detector_2d(self):
37 |         print('[Server] Init image 2d detection server ...')
38 |         self.detector_2d = detector_2d.Detector2D(
39 |             model_fp=self.DETECTOR_2D_MODEL_FP,
40 |             input_tensor_names=self.input_tensor_names,
41 |             output_tensor_names=self.output_tensor_names,
42 |             device=self.device)
43 | 
44 |     def _init_detector_3d(self):
45 |         print('[Server] Init 3d object detection server ...')
46 |         self.detector_3d = detector_3d.FPNetPredictor(model_fp=self.DETECTOR_3D_MODEL_FP)
47 | 
48 |     def predict(self, inputs):
49 |         print('[Server | Init] Run prediction ...')
50 |         # Process one image and one frame of point cloud at once
51 |         assert 'img' in inputs and 'pclds' in inputs  # check both keys; a chained 'and' only tests the last one
52 |         self.in_progress = True
53 | 
54 |         print('[Server | Step1] Run 2d bounding box detection ...')
55 |         bboxes_2d, one_hot_vectors = self.detector_2d.inference_verbose(inputs['img'])
56 | 
57 |         print('[Server | Step2] Run frustum proposal server ...')
58 |         f_prop_cam_all, f_prop_velo_all = self.frt_proposal_server.get_frustum_proposal(inputs['img'].shape, bboxes_2d, inputs['pclds'])
59 | 
60 |         print('[Server | Step3] Down sampling points ...')
61 |         for idx, f_prop_cam in enumerate(f_prop_cam_all):
62 |             choice = np.random.choice(f_prop_cam.shape[0], self.NUM_POINT, replace=True)
63 |             f_prop_cam_all[idx] = f_prop_cam[choice, :]
64 | 
65 |         print('[Server | Step4] Detecting 3D Bounding boxes from frustum proposals ...')
66 |         logits, centers, \
67 |         heading_logits, heading_residuals, \
68 |         size_scores, size_residuals = self.detector_3d.predict(pc=f_prop_cam_all, one_hot_vec=one_hot_vectors)
69 | 
70 |         print('[Server | Step5] Preparing visualization ...')
71 |         for idx in range(len(centers)):
72 |             heading_class = np.argmax(heading_logits, 1)
73 |             size_logits = size_scores
74 |             size_class = np.argmax(size_logits, 1)
75 |             size_residual = size_residuals[np.arange(len(size_class)), size_class, :]  # (B,3): each object's own residual, not object 0's
76 |             heading_residual = heading_residuals[np.arange(len(heading_class)), heading_class]  # (B,)
77 |             heading_angle = utils.class2angle(heading_class[idx], heading_residual[idx],
self.NUM_HEADING_BIN) 78 | box_size = utils.class2size(size_class[idx], size_residual[idx]) 79 | corners_3d = utils.get_3d_box(box_size, heading_angle, centers[idx]) 80 | 81 | corners_3d_in_velo_frame = np.zeros_like(corners_3d) 82 | centers_in_velo_frame = np.zeros_like(centers) 83 | corners_3d_in_velo_frame[:, 0:3] = self.frt_proposal_server.project_rect_to_velo(corners_3d[:, 0:3]) 84 | centers_in_velo_frame[:, 0:3] = self.frt_proposal_server.project_rect_to_velo(centers[:, 0:3]) 85 | utils.viz_single(f_prop_velo_all[idx]) 86 | utils.viz(f_prop_velo_all[idx], centers_in_velo_frame, corners_3d_in_velo_frame, inputs['pclds']) 87 | 88 | self.in_progress = False 89 | -------------------------------------------------------------------------------- /models/tf_util.py: -------------------------------------------------------------------------------- 1 | """ Wrapper functions for TensorFlow layers. 2 | 3 | Author: Charles R. Qi 4 | Date: November 2017 5 | """ 6 | 7 | import tensorflow as tf 8 | 9 | def _variable_on_cpu(name, shape, initializer, use_fp16=False): 10 | """Helper to create a Variable stored on CPU memory. 11 | Args: 12 | name: name of the variable 13 | shape: list of ints 14 | initializer: initializer for Variable 15 | Returns: 16 | Variable Tensor 17 | """ 18 | with tf.device("/cpu:0"): 19 | dtype = tf.float16 if use_fp16 else tf.float32 20 | var = tf.get_variable(name, shape, initializer=initializer, dtype=dtype) 21 | return var 22 | 23 | def _variable_with_weight_decay(name, shape, stddev, wd, use_xavier=True): 24 | """Helper to create an initialized Variable with weight decay. 25 | 26 | Note that the Variable is initialized with a truncated normal distribution. 27 | A weight decay is added only if one is specified. 28 | 29 | Args: 30 | name: name of the variable 31 | shape: list of ints 32 | stddev: standard deviation of a truncated Gaussian 33 | wd: add L2Loss weight decay multiplied by this float. If None, weight 34 | decay is not added for this Variable. 35 | use_xavier: bool, whether to use xavier initializer 36 | 37 | Returns: 38 | Variable Tensor 39 | """ 40 | if use_xavier: 41 | initializer = tf.contrib.layers.xavier_initializer() 42 | else: 43 | initializer = tf.truncated_normal_initializer(stddev=stddev) 44 | var = _variable_on_cpu(name, shape, initializer) 45 | if wd is not None: 46 | weight_decay = tf.multiply(tf.nn.l2_loss(var), wd, name='weight_loss') 47 | tf.add_to_collection('losses', weight_decay) 48 | return var 49 | 50 | 51 | def conv1d(inputs, 52 | num_output_channels, 53 | kernel_size, 54 | scope, 55 | stride=1, 56 | padding='SAME', 57 | data_format='NHWC', 58 | use_xavier=True, 59 | stddev=1e-3, 60 | weight_decay=None, 61 | activation_fn=tf.nn.relu, 62 | bn=False, 63 | bn_decay=None, 64 | is_training=None): 65 | """ 1D convolution with non-linear operation. 
66 | 67 | Args: 68 | inputs: 3-D tensor variable BxLxC 69 | num_output_channels: int 70 | kernel_size: int 71 | scope: string 72 | stride: int 73 | padding: 'SAME' or 'VALID' 74 | data_format: 'NHWC' or 'NCHW' 75 | use_xavier: bool, use xavier_initializer if true 76 | stddev: float, stddev for truncated_normal init 77 | weight_decay: float 78 | activation_fn: function 79 | bn: bool, whether to use batch norm 80 | bn_decay: float or float tensor variable in [0,1] 81 | is_training: bool Tensor variable 82 | 83 | Returns: 84 | Variable tensor 85 | """ 86 | with tf.variable_scope(scope) as sc: 87 | assert(data_format=='NHWC' or data_format=='NCHW') 88 | if data_format == 'NHWC': 89 | num_in_channels = inputs.get_shape()[-1].value 90 | elif data_format=='NCHW': 91 | num_in_channels = inputs.get_shape()[1].value 92 | kernel_shape = [kernel_size, 93 | num_in_channels, num_output_channels] 94 | kernel = _variable_with_weight_decay('weights', 95 | shape=kernel_shape, 96 | use_xavier=use_xavier, 97 | stddev=stddev, 98 | wd=weight_decay) 99 | outputs = tf.nn.conv1d(inputs, kernel, 100 | stride=stride, 101 | padding=padding, 102 | data_format=data_format) 103 | biases = _variable_on_cpu('biases', [num_output_channels], 104 | tf.constant_initializer(0.0)) 105 | outputs = tf.nn.bias_add(outputs, biases, data_format=data_format) 106 | 107 | if bn: 108 | outputs = batch_norm_for_conv1d(outputs, is_training, 109 | bn_decay=bn_decay, scope='bn', 110 | data_format=data_format) 111 | 112 | if activation_fn is not None: 113 | outputs = activation_fn(outputs) 114 | return outputs 115 | 116 | 117 | 118 | 119 | def conv2d(inputs, 120 | num_output_channels, 121 | kernel_size, 122 | scope, 123 | stride=[1, 1], 124 | padding='SAME', 125 | data_format='NHWC', 126 | use_xavier=True, 127 | stddev=1e-3, 128 | weight_decay=None, 129 | activation_fn=tf.nn.relu, 130 | bn=False, 131 | bn_decay=None, 132 | is_training=None): 133 | """ 2D convolution with non-linear operation. 
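
    A usage sketch (mirrors the calls in model_util.py; the placeholder
    names are illustrative):

        net = conv2d(images_pl, 64, [3, 3], scope='conv1',
                     stride=[1, 1], padding='SAME',
                     bn=True, is_training=is_training_pl, bn_decay=0.9)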
134 | 135 | Args: 136 | inputs: 4-D tensor variable BxHxWxC 137 | num_output_channels: int 138 | kernel_size: a list of 2 ints 139 | scope: string 140 | stride: a list of 2 ints 141 | padding: 'SAME' or 'VALID' 142 | data_format: 'NHWC' or 'NCHW' 143 | use_xavier: bool, use xavier_initializer if true 144 | stddev: float, stddev for truncated_normal init 145 | weight_decay: float 146 | activation_fn: function 147 | bn: bool, whether to use batch norm 148 | bn_decay: float or float tensor variable in [0,1] 149 | is_training: bool Tensor variable 150 | 151 | Returns: 152 | Variable tensor 153 | """ 154 | with tf.variable_scope(scope) as sc: 155 | kernel_h, kernel_w = kernel_size 156 | assert(data_format=='NHWC' or data_format=='NCHW') 157 | if data_format == 'NHWC': 158 | num_in_channels = inputs.get_shape()[-1].value 159 | elif data_format=='NCHW': 160 | num_in_channels = inputs.get_shape()[1].value 161 | kernel_shape = [kernel_h, kernel_w, 162 | num_in_channels, num_output_channels] 163 | kernel = _variable_with_weight_decay('weights', 164 | shape=kernel_shape, 165 | use_xavier=use_xavier, 166 | stddev=stddev, 167 | wd=weight_decay) 168 | stride_h, stride_w = stride 169 | outputs = tf.nn.conv2d(inputs, kernel, 170 | [1, stride_h, stride_w, 1], 171 | padding=padding, 172 | data_format=data_format) 173 | biases = _variable_on_cpu('biases', [num_output_channels], 174 | tf.constant_initializer(0.0)) 175 | outputs = tf.nn.bias_add(outputs, biases, data_format=data_format) 176 | 177 | if bn: 178 | outputs = batch_norm_for_conv2d(outputs, is_training, 179 | bn_decay=bn_decay, scope='bn', 180 | data_format=data_format) 181 | 182 | if activation_fn is not None: 183 | outputs = activation_fn(outputs) 184 | return outputs 185 | 186 | 187 | def conv2d_transpose(inputs, 188 | num_output_channels, 189 | kernel_size, 190 | scope, 191 | stride=[1, 1], 192 | padding='SAME', 193 | use_xavier=True, 194 | stddev=1e-3, 195 | weight_decay=None, 196 | activation_fn=tf.nn.relu, 197 | bn=False, 198 | bn_decay=None, 199 | is_training=None): 200 | """ 2D convolution transpose with non-linear operation. 
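
    (A worked sketch of the output-shape rule in get_deconv_dim below:
    with 'VALID' padding, out = in * stride + max(kernel - stride, 0),
    e.g. in=8, stride=2, kernel=4 gives 8*2 + 2 = 18; with 'SAME' padding
    it is simply in * stride.)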
201 | 
202 |     Args:
203 |         inputs: 4-D tensor variable BxHxWxC
204 |         num_output_channels: int
205 |         kernel_size: a list of 2 ints
206 |         scope: string
207 |         stride: a list of 2 ints
208 |         padding: 'SAME' or 'VALID'
209 |         use_xavier: bool, use xavier_initializer if true
210 |         stddev: float, stddev for truncated_normal init
211 |         weight_decay: float
212 |         activation_fn: function
213 |         bn: bool, whether to use batch norm
214 |         bn_decay: float or float tensor variable in [0,1]
215 |         is_training: bool Tensor variable
216 | 
217 |     Returns:
218 |         Variable tensor
219 | 
220 |     Note: conv2d(conv2d_transpose(a, num_out, ksize, stride), a.shape[-1], ksize, stride) == a
221 |     """
222 |     with tf.variable_scope(scope) as sc:
223 |         kernel_h, kernel_w = kernel_size
224 |         num_in_channels = inputs.get_shape()[-1].value
225 |         kernel_shape = [kernel_h, kernel_w,
226 |                         num_output_channels, num_in_channels] # reversed compared to conv2d
227 |         kernel = _variable_with_weight_decay('weights',
228 |                                              shape=kernel_shape,
229 |                                              use_xavier=use_xavier,
230 |                                              stddev=stddev,
231 |                                              wd=weight_decay)
232 |         stride_h, stride_w = stride
233 | 
234 |         # from slim.convolution2d_transpose
235 |         def get_deconv_dim(dim_size, stride_size, kernel_size, padding):
236 |             dim_size *= stride_size
237 | 
238 |             if padding == 'VALID' and dim_size is not None:
239 |                 dim_size += max(kernel_size - stride_size, 0)
240 |             return dim_size
241 | 
242 |         # calculate output shape
243 |         batch_size = inputs.get_shape()[0].value
244 |         height = inputs.get_shape()[1].value
245 |         width = inputs.get_shape()[2].value
246 |         out_height = get_deconv_dim(height, stride_h, kernel_h, padding)
247 |         out_width = get_deconv_dim(width, stride_w, kernel_w, padding)
248 |         output_shape = [batch_size, out_height, out_width, num_output_channels]
249 | 
250 |         outputs = tf.nn.conv2d_transpose(inputs, kernel, output_shape,
251 |                                          [1, stride_h, stride_w, 1],
252 |                                          padding=padding)
253 |         biases = _variable_on_cpu('biases', [num_output_channels],
254 |                                   tf.constant_initializer(0.0))
255 |         outputs = tf.nn.bias_add(outputs, biases)
256 | 
257 |         if bn:
258 |             outputs = batch_norm_for_conv2d(outputs, is_training,
259 |                                             bn_decay=bn_decay, scope='bn')
260 | 
261 |         if activation_fn is not None:
262 |             outputs = activation_fn(outputs)
263 |         return outputs
264 | 
265 | 
266 | 
267 | def conv3d(inputs,
268 |            num_output_channels,
269 |            kernel_size,
270 |            scope,
271 |            stride=[1, 1, 1],
272 |            padding='SAME',
273 |            use_xavier=True,
274 |            stddev=1e-3,
275 |            weight_decay=None,
276 |            activation_fn=tf.nn.relu,
277 |            bn=False,
278 |            bn_decay=None,
279 |            is_training=None):
280 |     """ 3D convolution with non-linear operation.
281 | 282 | Args: 283 | inputs: 5-D tensor variable BxDxHxWxC 284 | num_output_channels: int 285 | kernel_size: a list of 3 ints 286 | scope: string 287 | stride: a list of 3 ints 288 | padding: 'SAME' or 'VALID' 289 | use_xavier: bool, use xavier_initializer if true 290 | stddev: float, stddev for truncated_normal init 291 | weight_decay: float 292 | activation_fn: function 293 | bn: bool, whether to use batch norm 294 | bn_decay: float or float tensor variable in [0,1] 295 | is_training: bool Tensor variable 296 | 297 | Returns: 298 | Variable tensor 299 | """ 300 | with tf.variable_scope(scope) as sc: 301 | kernel_d, kernel_h, kernel_w = kernel_size 302 | num_in_channels = inputs.get_shape()[-1].value 303 | kernel_shape = [kernel_d, kernel_h, kernel_w, 304 | num_in_channels, num_output_channels] 305 | kernel = _variable_with_weight_decay('weights', 306 | shape=kernel_shape, 307 | use_xavier=use_xavier, 308 | stddev=stddev, 309 | wd=weight_decay) 310 | stride_d, stride_h, stride_w = stride 311 | outputs = tf.nn.conv3d(inputs, kernel, 312 | [1, stride_d, stride_h, stride_w, 1], 313 | padding=padding) 314 | biases = _variable_on_cpu('biases', [num_output_channels], 315 | tf.constant_initializer(0.0)) 316 | outputs = tf.nn.bias_add(outputs, biases) 317 | 318 | if bn: 319 | outputs = batch_norm_for_conv3d(outputs, is_training, 320 | bn_decay=bn_decay, scope='bn') 321 | 322 | if activation_fn is not None: 323 | outputs = activation_fn(outputs) 324 | return outputs 325 | 326 | def fully_connected(inputs, 327 | num_outputs, 328 | scope, 329 | use_xavier=True, 330 | stddev=1e-3, 331 | weight_decay=None, 332 | activation_fn=tf.nn.relu, 333 | bn=False, 334 | bn_decay=None, 335 | is_training=None): 336 | """ Fully connected layer with non-linear operation. 337 | 338 | Args: 339 | inputs: 2-D tensor BxN 340 | num_outputs: int 341 | 342 | Returns: 343 | Variable tensor of size B x num_outputs. 344 | """ 345 | with tf.variable_scope(scope) as sc: 346 | num_input_units = inputs.get_shape()[-1].value 347 | weights = _variable_with_weight_decay('weights', 348 | shape=[num_input_units, num_outputs], 349 | use_xavier=use_xavier, 350 | stddev=stddev, 351 | wd=weight_decay) 352 | outputs = tf.matmul(inputs, weights) 353 | biases = _variable_on_cpu('biases', [num_outputs], 354 | tf.constant_initializer(0.0)) 355 | outputs = tf.nn.bias_add(outputs, biases) 356 | 357 | if bn: 358 | outputs = batch_norm_for_fc(outputs, is_training, bn_decay, 'bn') 359 | 360 | if activation_fn is not None: 361 | outputs = activation_fn(outputs) 362 | return outputs 363 | 364 | 365 | def max_pool2d(inputs, 366 | kernel_size, 367 | scope, 368 | stride=[2, 2], 369 | padding='VALID'): 370 | """ 2D max pooling. 371 | 372 | Args: 373 | inputs: 4-D tensor BxHxWxC 374 | kernel_size: a list of 2 ints 375 | stride: a list of 2 ints 376 | 377 | Returns: 378 | Variable tensor 379 | """ 380 | with tf.variable_scope(scope) as sc: 381 | kernel_h, kernel_w = kernel_size 382 | stride_h, stride_w = stride 383 | outputs = tf.nn.max_pool(inputs, 384 | ksize=[1, kernel_h, kernel_w, 1], 385 | strides=[1, stride_h, stride_w, 1], 386 | padding=padding, 387 | name=sc.name) 388 | return outputs 389 | 390 | def avg_pool2d(inputs, 391 | kernel_size, 392 | scope, 393 | stride=[2, 2], 394 | padding='VALID'): 395 | """ 2D avg pooling. 
396 | 
397 |     Args:
398 |         inputs: 4-D tensor BxHxWxC
399 |         kernel_size: a list of 2 ints
400 |         stride: a list of 2 ints
401 | 
402 |     Returns:
403 |         Variable tensor
404 |     """
405 |     with tf.variable_scope(scope) as sc:
406 |         kernel_h, kernel_w = kernel_size
407 |         stride_h, stride_w = stride
408 |         outputs = tf.nn.avg_pool(inputs,
409 |                                  ksize=[1, kernel_h, kernel_w, 1],
410 |                                  strides=[1, stride_h, stride_w, 1],
411 |                                  padding=padding,
412 |                                  name=sc.name)
413 |         return outputs
414 | 
415 | 
416 | def max_pool3d(inputs,
417 |                kernel_size,
418 |                scope,
419 |                stride=[2, 2, 2],
420 |                padding='VALID'):
421 |     """ 3D max pooling.
422 | 
423 |     Args:
424 |         inputs: 5-D tensor BxDxHxWxC
425 |         kernel_size: a list of 3 ints
426 |         stride: a list of 3 ints
427 | 
428 |     Returns:
429 |         Variable tensor
430 |     """
431 |     with tf.variable_scope(scope) as sc:
432 |         kernel_d, kernel_h, kernel_w = kernel_size
433 |         stride_d, stride_h, stride_w = stride
434 |         outputs = tf.nn.max_pool3d(inputs,
435 |                                    ksize=[1, kernel_d, kernel_h, kernel_w, 1],
436 |                                    strides=[1, stride_d, stride_h, stride_w, 1],
437 |                                    padding=padding,
438 |                                    name=sc.name)
439 |         return outputs
440 | 
441 | def avg_pool3d(inputs,
442 |                kernel_size,
443 |                scope,
444 |                stride=[2, 2, 2],
445 |                padding='VALID'):
446 |     """ 3D avg pooling.
447 | 
448 |     Args:
449 |         inputs: 5-D tensor BxDxHxWxC
450 |         kernel_size: a list of 3 ints
451 |         stride: a list of 3 ints
452 | 
453 |     Returns:
454 |         Variable tensor
455 |     """
456 |     with tf.variable_scope(scope) as sc:
457 |         kernel_d, kernel_h, kernel_w = kernel_size
458 |         stride_d, stride_h, stride_w = stride
459 |         outputs = tf.nn.avg_pool3d(inputs,
460 |                                    ksize=[1, kernel_d, kernel_h, kernel_w, 1],
461 |                                    strides=[1, stride_d, stride_h, stride_w, 1],
462 |                                    padding=padding,
463 |                                    name=sc.name)
464 |         return outputs
465 | 
466 | 
467 | def batch_norm_template_unused(inputs, is_training, scope, moments_dims, bn_decay):
468 |     """ NOTE: this is an older version of the util func. It is deprecated.
469 |     Batch normalization on convolutional maps and beyond...
470 |     Ref.: http://stackoverflow.com/questions/33949786/how-could-i-use-batch-normalization-in-tensorflow
471 | 
472 |     Args:
473 |         inputs: Tensor, k-D input ... x C could be BC or BHWC or BDHWC
474 |         is_training: boolean tf.Variable, true indicates training phase
475 |         scope: string, variable scope
476 |         moments_dims: a list of ints, indicating dimensions for moments calculation
477 |         bn_decay: float or float tensor variable, controlling moving average weight
478 |     Return:
479 |         normed: batch-normalized maps
480 |     """
481 |     with tf.variable_scope(scope) as sc:
482 |         num_channels = inputs.get_shape()[-1].value
483 |         beta = _variable_on_cpu(name='beta', shape=[num_channels],
484 |                                 initializer=tf.constant_initializer(0))
485 |         gamma = _variable_on_cpu(name='gamma', shape=[num_channels],
486 |                                  initializer=tf.constant_initializer(1.0))
487 |         batch_mean, batch_var = tf.nn.moments(inputs, moments_dims, name='moments')
488 |         decay = bn_decay if bn_decay is not None else 0.9
489 |         ema = tf.train.ExponentialMovingAverage(decay=decay)
490 |         # Operator that maintains moving averages of variables.
491 |         # Need to set reuse=False, otherwise if reuse, will see moments_1/mean/ExponentialMovingAverage/ does not exist
492 |         # https://github.com/shekkizh/WassersteinGAN.tensorflow/issues/3
493 |         with tf.variable_scope(tf.get_variable_scope(), reuse=False):
494 |             ema_apply_op = tf.cond(is_training,
495 |                                    lambda: ema.apply([batch_mean, batch_var]),
496 |                                    lambda: tf.no_op())
497 | 
498 |         # Update moving average and return current batch's avg and var.
499 |         def mean_var_with_update():
500 |             with tf.control_dependencies([ema_apply_op]):
501 |                 return tf.identity(batch_mean), tf.identity(batch_var)
502 | 
503 |         # ema.average returns the Variable holding the average of var.
504 |         mean, var = tf.cond(is_training,
505 |                             mean_var_with_update,
506 |                             lambda: (ema.average(batch_mean), ema.average(batch_var)))
507 |         normed = tf.nn.batch_normalization(inputs, mean, var, beta, gamma, 1e-3)
508 |     return normed
509 | 
510 | 
511 | def batch_norm_template(inputs, is_training, scope, moments_dims_unused, bn_decay, data_format='NHWC'):
512 |     """ Batch normalization on convolutional maps and beyond...
513 |     Ref.: http://stackoverflow.com/questions/33949786/how-could-i-use-batch-normalization-in-tensorflow
514 | 
515 |     Args:
516 |         inputs: Tensor, k-D input ... x C could be BC or BHWC or BDHWC
517 |         is_training: boolean tf.Variable, true indicates training phase
518 |         scope: string, variable scope
519 |         moments_dims: a list of ints, indicating dimensions for moments calculation
520 |         bn_decay: float or float tensor variable, controlling moving average weight
521 |         data_format: 'NHWC' or 'NCHW'
522 |     Return:
523 |         normed: batch-normalized maps
524 |     """
525 |     bn_decay = bn_decay if bn_decay is not None else 0.9
526 |     return tf.contrib.layers.batch_norm(inputs,
527 |                                         center=True, scale=True,
528 |                                         is_training=is_training, decay=bn_decay, updates_collections=None,
529 |                                         scope=scope,
530 |                                         data_format=data_format)
531 | 
532 | 
533 | def batch_norm_for_fc(inputs, is_training, bn_decay, scope):
534 |     """ Batch normalization on FC data.
535 | 
536 |     Args:
537 |         inputs: Tensor, 2D BxC input
538 |         is_training: boolean tf.Variable, true indicates training phase
539 |         bn_decay: float or float tensor variable, controlling moving average weight
540 |         scope: string, variable scope
541 |     Return:
542 |         normed: batch-normalized maps
543 |     """
544 |     return batch_norm_template(inputs, is_training, scope, [0,], bn_decay)
545 | 
546 | 
547 | def batch_norm_for_conv1d(inputs, is_training, bn_decay, scope, data_format):
548 |     """ Batch normalization on 1D convolutional maps.
549 | 
550 |     Args:
551 |         inputs: Tensor, 3D BLC input maps
552 |         is_training: boolean tf.Variable, true indicates training phase
553 |         bn_decay: float or float tensor variable, controlling moving average weight
554 |         scope: string, variable scope
555 |         data_format: 'NHWC' or 'NCHW'
556 |     Return:
557 |         normed: batch-normalized maps
558 |     """
559 |     return batch_norm_template(inputs, is_training, scope, [0,1], bn_decay, data_format)
560 | 
561 | 
562 | 
563 | 
564 | def batch_norm_for_conv2d(inputs, is_training, bn_decay, scope, data_format):
565 |     """ Batch normalization on 2D convolutional maps.
566 | 
567 |     Args:
568 |         inputs: Tensor, 4D BHWC input maps
569 |         is_training: boolean tf.Variable, true indicates training phase
570 |         bn_decay: float or float tensor variable, controlling moving average weight
571 |         scope: string, variable scope
572 |         data_format: 'NHWC' or 'NCHW'
573 |     Return:
574 |         normed: batch-normalized maps
575 |     """
576 |     return batch_norm_template(inputs, is_training, scope, [0,1,2], bn_decay, data_format)
577 | 
578 | 
579 | def batch_norm_for_conv3d(inputs, is_training, bn_decay, scope):
580 |     """ Batch normalization on 3D convolutional maps.
581 | 
582 |     Args:
583 |         inputs: Tensor, 5D BDHWC input maps
584 |         is_training: boolean tf.Variable, true indicates training phase
585 |         bn_decay: float or float tensor variable, controlling moving average weight
586 |         scope: string, variable scope
587 |     Return:
588 |         normed: batch-normalized maps
589 |     """
590 |     return batch_norm_template(inputs, is_training, scope, [0,1,2,3], bn_decay)
591 | 
592 | 
593 | def dropout(inputs,
594 |             is_training,
595 |             scope,
596 |             keep_prob=0.5,
597 |             noise_shape=None):
598 |     """ Dropout layer.
599 | 
600 |     Args:
601 |         inputs: tensor
602 |         is_training: boolean tf.Variable
603 |         scope: string
604 |         keep_prob: float in [0,1]
605 |         noise_shape: list of ints
606 | 
607 |     Returns:
608 |         tensor variable
609 |     """
610 |     with tf.variable_scope(scope) as sc:
611 |         outputs = tf.cond(is_training,
612 |                           lambda: tf.nn.dropout(inputs, keep_prob, noise_shape),
613 |                           lambda: inputs)
614 |     return outputs
615 | 
--------------------------------------------------------------------------------
/pretrained/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KleinYuan/tf-3d-object-detection/ccbd987c08b90aaffada9e064b48574b9882db9a/pretrained/__init__.py
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | numpy
2 | tensorflow
3 | mayavi
4 | opencv-python  # provides the cv2 module; 'cv2' itself is not a pip package
--------------------------------------------------------------------------------
/services/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KleinYuan/tf-3d-object-detection/ccbd987c08b90aaffada9e064b48574b9882db9a/services/__init__.py
--------------------------------------------------------------------------------
/tests/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KleinYuan/tf-3d-object-detection/ccbd987c08b90aaffada9e064b48574b9882db9a/tests/__init__.py
--------------------------------------------------------------------------------
/utils/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KleinYuan/tf-3d-object-detection/ccbd987c08b90aaffada9e064b48574b9882db9a/utils/__init__.py
--------------------------------------------------------------------------------
/utils/utils.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | from configs import configs
3 | 
4 | 
5 | def viz(pc, centers, corners_3d, pc_origin):
6 |     import mayavi.mlab as mlab
7 |     fig = mlab.figure(figure=None, bgcolor=(0.4, 0.4, 0.4),
8 |                       fgcolor=None, engine=None, size=(500, 500))
9 |     mlab.points3d(pc[:, 0], pc[:, 1], pc[:, 2], mode='sphere',
10 |                   colormap='gnuplot', scale_factor=0.1, figure=fig)
11 |     mlab.points3d(centers[:, 0], centers[:, 1], centers[:, 2], mode='sphere',
12 |                   color=(1, 0, 1), scale_factor=0.3, figure=fig)
13 |     mlab.points3d(corners_3d[:, 0], corners_3d[:, 1], corners_3d[:, 2], mode='sphere',
14 |                   color=(1, 1, 0), scale_factor=0.3, figure=fig)
15 |     mlab.points3d(pc_origin[:, 0], pc_origin[:, 1], pc_origin[:, 2], mode='sphere',
16 |                   color=(0, 1, 0), scale_factor=0.05, figure=fig)
17 |     '''
18 |     Green points are the original PC from KITTI
19 |     Colormapped points are the frustum PC fed into the network
20 |     Magenta points are the predicted centers
21 |     Yellow points are the post-processed predicted bounding box corners
22 |     '''
23 |     raw_input("Press Enter to continue")
24 | 
25 | def viz_single(pc):
26 |     import mayavi.mlab as mlab
27 |     fig = mlab.figure(figure=None, bgcolor=(0.4, 0.4, 0.4),
28 |                       fgcolor=None, engine=None, size=(500, 500))
29 |     mlab.points3d(pc[:, 0], pc[:, 1], pc[:, 2], mode='sphere',
30 |                   colormap='gnuplot', scale_factor=0.1, figure=fig)
31 |     '''
32 |     Colormapped points are a single frustum proposal PC fed into the network
33 |     '''
34 |     raw_input("Press Enter to continue")
35 | 
36 | 
37 | def load_velo_scan(velo_filename):
38 |     scan = np.fromfile(velo_filename, dtype=np.float32)
39 |     scan = scan.reshape((-1, 4))
40 |     return scan
41 | 
42 | 
43 | def read_calib_file(filepath):
44 |     ''' Read in a calibration file and parse into a dictionary.
45 |     Ref: https://github.com/utiasSTARS/pykitti/blob/master/pykitti/utils.py
46 |     '''
47 |     data = {}
48 |     with open(filepath, 'r') as f:
49 |         for line in f.readlines():
50 |             line = line.rstrip()
51 |             if len(line) == 0: continue
52 |             key, value = line.split(':', 1)
53 |             # The only non-float values in these files are dates, which
54 |             # we don't care about anyway
55 |             try:
56 |                 data[key] = np.array([float(x) for x in value.split()])
57 |             except ValueError:
58 |                 pass
59 | 
60 |     return data
61 | 
62 | 
63 | def class2size(pred_cls, residual):
64 |     ''' Inverse function to size2class. '''
65 |     mean_size = configs.g_type_mean_size[configs.g_class2type[pred_cls]]
66 |     return mean_size + residual
67 | 
68 | 
69 | def class2angle(pred_cls, residual, num_class, to_label_format=True):
70 |     ''' Inverse function to angle2class.
71 |     If to_label_format, adjust angle to the range as in labels.
72 |     '''
73 |     angle_per_class = 2 * np.pi / float(num_class)
74 |     angle_center = pred_cls * angle_per_class
75 |     angle = angle_center + residual
76 |     if to_label_format and angle > np.pi:
77 |         angle = angle - 2 * np.pi
78 |     return angle
79 | 
80 | 
81 | def get_3d_box(box_size, heading_angle, center):
82 |     ''' Calculate 3D bounding box corners from its parameterization.
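
    A quick sanity check (axis-aligned box, heading 0, centered at the
    origin): l=4.0, w=1.6, h=1.5 yields corners at x = +/-2.0,
    y = +/-0.75, z = +/-0.8:

        corners = get_3d_box((4.0, 1.6, 1.5), 0.0, (0.0, 0.0, 0.0))
        # corners.shape == (8, 3)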
86 | 
87 |     Input:
88 |         box_size: tuple of (l,w,h)
89 |         heading_angle: rad scalar, clockwise from pos x axis
90 |         center: tuple of (x,y,z)
91 |     Output:
92 |         corners_3d: numpy array of shape (8,3) for 3D box corners
93 |     '''
94 | 
95 |     def roty(t):
96 |         c = np.cos(t)
97 |         s = np.sin(t)
98 |         return np.array([[c, 0, s],
99 |                          [0, 1, 0],
100 |                          [-s, 0, c]])
101 | 
102 |     R = roty(heading_angle)
103 |     l, w, h = box_size
104 |     x_corners = [l / 2, l / 2, -l / 2, -l / 2, l / 2, l / 2, -l / 2, -l / 2]
105 |     y_corners = [h / 2, h / 2, h / 2, h / 2, -h / 2, -h / 2, -h / 2, -h / 2]
106 |     z_corners = [w / 2, -w / 2, -w / 2, w / 2, w / 2, -w / 2, -w / 2, w / 2]
107 |     corners_3d = np.dot(R, np.vstack([x_corners, y_corners, z_corners]))
108 |     corners_3d[0, :] = corners_3d[0, :] + center[0]
109 |     corners_3d[1, :] = corners_3d[1, :] + center[1]
110 |     corners_3d[2, :] = corners_3d[2, :] + center[2]
111 |     corners_3d = np.transpose(corners_3d)
112 |     return corners_3d
113 | 
--------------------------------------------------------------------------------
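
How the pieces above fit together, as a rough illustrative sketch (the repo's
own entry points live under `apps/`; the file names below follow Step 6 of the
README):

```
import cv2
from models.server import Server
from utils import utils

img = cv2.imread('example_data/1.png')              # left color image
pclds = utils.load_velo_scan('example_data/1.bin')  # (N, 4) velodyne scan: x, y, z, intensity
Server().predict({'img': img, 'pclds': pclds})      # 2D boxes -> frustums -> 3D boxes -> viz
```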