├── LICENSE
├── README.md
├── checkpoint.py
├── eval.py
├── models
    ├── ap_helper.py
    ├── backbone_module.py
    ├── backbone_module_scale.py
    ├── dump_helper.py
    ├── hdnet.py
    ├── hdnet_1bb.py
    ├── loss_helper.py
    ├── proposal_module_refine.py
    ├── proposal_module_surface.py
    └── voting_module.py
├── overview.jpg
├── pointnet2
    ├── _ext_src
    │   ├── include
    │   │   ├── ball_query.h
    │   │   ├── cuda_utils.h
    │   │   ├── group_points.h
    │   │   ├── interpolate.h
    │   │   ├── sampling.h
    │   │   └── utils.h
    │   └── src
    │   │   ├── ball_query.cpp
    │   │   ├── ball_query_gpu.cu
    │   │   ├── bindings.cpp
    │   │   ├── group_points.cpp
    │   │   ├── group_points_gpu.cu
    │   │   ├── interpolate.cpp
    │   │   ├── interpolate_gpu.cu
    │   │   ├── sampling.cpp
    │   │   └── sampling_gpu.cu
    ├── pointnet2_modules.py
    ├── pointnet2_utils.py
    ├── pytorch_utils.py
    └── setup.py
├── scannet
    ├── meta_data
    │   ├── scannet_means.npz
    │   ├── scannet_means_v2.npz.npy
    │   ├── scannet_train.txt
    │   ├── scannetv2-labels.combined.tsv
    │   ├── scannetv2_test.txt
    │   ├── scannetv2_train.txt
    │   └── scannetv2_val.txt
    ├── model_util_scannet.py
    └── scannet_detection_dataset_hd.py
├── sunrgbd
    ├── model_util_sunrgbd.py
    ├── sunrgbd_detection_dataset_hd.py
    └── sunrgbd_utils.py
├── train.py
├── train_1bb.py
└── utils
    ├── box_util.py
    ├── eval_det.py
    ├── metric_util.py
    ├── nms.py
    ├── nn_distance.py
    ├── pc_util.py
    ├── show_results_scannet.py
    ├── show_results_sunrgbd.py
    ├── tf_logger.py
    ├── tf_visualizer.py
    ├── utils.py
    └── viewpoint.json


/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2020 zaiweizhang
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # H3DNet: 3D Object Detection Using Hybrid Geometric Primitives
 2 | Created by <a href="https://sites.google.com/a/utexas.edu/zaiwei-zhang/" target="_blank">Zaiwei Zhang</a>, <a href="https://yanghtr.github.io/" target="_blank">Haitao Yang</a>, <a href="https://sites.google.com/view/bosun/home" target="_blank">Bo Sun</a> and <a href="https://www.cs.utexas.edu/~huangqx/" target="_blank">Qixing Huang</a>.
 3 | 
 4 | ![overview](overview.jpg)
 5 | 
 6 | ## Introduction
 7 | This repository is code release for our paper (arXiv report [here](https://arxiv.org/abs/2006.05682)).
 8 | 
 9 | We introduce H3DNet, which takes a colorless 3D point cloud as input and outputs a collection of oriented object bounding boxes (or BB) and their semantic labels. The critical idea of H3DNet is to predict a hybrid set of geometric primitives, i.e., BB centers, BB face centers, and BB edge centers. We show how to convert the predicted geometric primitives into object proposals by defining a distance function between an object and the geometric primitives. This distance function enables continuous optimization of object proposals, and its local minimums provide high-fidelity object proposals. H3DNet then utilizes a matching and refinement module to classify object proposals into detected objects and fine-tune the geometric parameters of the detected objects. The hybrid set of geometric primitives not only provides more accurate signals for object detection than using a single type of geometric primitives, but it also provides an overcomplete set of constraints on the resulting 3D layout. Therefore, H3DNet can tolerate outliers in predicted geometric primitives. Our model achieves state-of-the-art 3D detection results, with only pointclouds input, on two large datasets with real 3D scans, ScanNet and SUN RGB-D.
10 | 
11 | In this repository, we provide H3DNet model implementation (with Pytorch) as well as data preparation, training and evaluation scripts on SUN RGB-D and ScanNet. Since our model is built on <a href="https://github.com/facebookresearch/votenet" target="_blank">VoteNet</a>, we borrowed a lot of codes from their codebase.
12 | 
13 | ## Installation
14 | 
15 | Since we are built on top of VoteNet, we require similar packages before using our code. Install [Pytorch](https://pytorch.org/get-started/locally/) and [Tensorflow](https://github.com/tensorflow/tensorflow) (for TensorBoard). It is required that you have access to GPUs. Matlab is required to prepare data for SUN RGB-D. The code is tested with Ubuntu 18.04, Pytorch v1.1, TensorFlow v1.14, CUDA 10.0 and cuDNN v7.4.
16 | 
17 | Compile the CUDA layers for [PointNet++](http://arxiv.org/abs/1706.02413), which we used in the backbone network:
18 | 
19 |     cd pointnet2
20 |     python setup.py install
21 | 
22 | Install the following Python dependencies (with `pip install`):
23 | 
24 |     numpy
25 |     matplotlib
26 |     scipy
27 |     sklearn
28 |     opencv-python
29 |     plyfile
30 |     pytorch=1.1.0
31 |     tensorflow-gpu==1.12.0 (only for visualization)
32 |     'trimesh>=2.35.39,<2.35.40'
33 |     'networkx>=2.2,<2.3'
34 | 
35 | ## Training and evaluating
36 | 
37 | ### Data preparation
38 | 
39 | For data preparation, we share the same data pre-processing steps with VoteNet. We provide the processed training and testing data for SUN RGB-D [here](https://drive.google.com/file/d/1uwoi34N43jfreZooG-SuYhG5mdAsSHvK/view?usp=sharing), and for ScanNet [here](https://drive.google.com/file/d/1WtzsQBqU9rxc3tsa4kooRU_DbhRpuIyb/view?usp=sharing).
40 | 
41 | ### Train and test on SUN RGB-D
42 | 
43 | To train a new H3DNet model on SUN RGB-D data (depth images):
44 | 
45 |     python train.py --data_path path/to/sunrgbd --dataset sunrgbd --log_dir log_sunrgbd --num_point 40000 --model hdnet --batch_size 16
46 |   
47 | In order to train in batch_size 16, you will have to use at least 3/4 GPUs. You can use `CUDA_VISIBLE_DEVICES=0,1,2` to specify which GPU(s) to use. Without specifying CUDA devices, the training will use all the available GPUs and train with data parallel.
48 | While training you can check the `log_sunrgbd/log_train.txt` file on its progress, or use the TensorBoard to see loss curves.
49 | 
50 | To run H3DNet with one backbone (less memory):
51 | 
52 |     python train_1bb.py --data_path path/to/sunrgbd --dataset sunrgbd --log_dir log_sunrgbd --num_point 40000 --model hdnet_1bb --batch_size 16
53 | 
54 | You can set the pretrained-weight using --pre_checkpoint_path flag. You can use the pretrained weight in [here](https://github.com/facebookresearch/DepthContrast). Please set the scale of the backbone using --scale accordingly. Use pretrained weight with scale 3 should achieve around 63.5 mAP@0.25.
55 | 
56 | To test the trained model with its checkpoint:
57 | 
58 |     python eval.py --data_path path/to/sunrgbd --dataset sunrgbd --model hdnet --checkpoint_path path/to/checkpoint --dump_dir eval_sunrgbd --cluster_sampling seed_fps --use_3d_nms --use_cls_nms --per_class_proposal
59 | 
60 | Example results will be dumped in the `eval_sunrgbd` folder (or any other folder you specify). You can run `python eval.py -h` to see the full options for evaluation. After the evaluation, you can use MeshLab to visualize the predicted votes and 3D bounding boxes (select wireframe mode to view the boxes). Final evaluation results will be printed on screen and also written in the `log_eval.txt` file under the dump directory. In default we evaluate with both AP@0.25 and AP@0.5 with 3D IoU on oriented boxes. A properly trained H3DNet should have around 60 mAP@0.25 and 39 mAP@0.5.
61 | 
62 | ### Train and test on ScanNet
63 | 
64 | To train a H3DNet model on Scannet data (fused scan):
65 | 
66 |     python train.py --data_path path/to/scannet_train_detection_data --dataset scannet --log_dir log_scannet --num_point 40000 --model hdnet --batch_size 8
67 | 
68 | To run H3DNet with one backbone (less memory):
69 | 
70 |     python train_1bb.py --data_path path/to/scannet_train_detection_data --dataset scannet --log_dir log_scannet --num_point 40000 --model hdnet_1bb --batch_size 8
71 | 
72 | It should provide 66 mAP@0.25 with training from scratch. You can set the pretrained-weight using --pre_checkpoint_path flag. You can use the pretrained weight in [here](https://github.com/facebookresearch/DepthContrast). Please set the scale of the backbone using --scale accordingly. Use pretrained weight with scale 3 should achieve around 69.0 mAP@0.25.
73 | 
74 | To test the trained model with its checkpoint:
75 | 
76 |     python eval.py --data_path path/to/scannet_train_detection_data --dataset scannet --model hdnet --checkpoint_path path/to/checkpoint --dump_dir eval_scannet --num_point 40000 --cluster_sampling seed_fps --use_3d_nms --use_cls_nms --per_class_proposal
77 | 
78 | Example results will be dumped in the `eval_scannet` folder (or any other folder you specify). In default we evaluate with both AP@0.25 and AP@0.5 with 3D IoU on axis aligned boxes. A properly trained H3DNet should have around 67 mAP@0.25 and 48 mAP@0.5.
79 | 
80 | ### Visualize predictions and ground truths 
81 | Visualization codes for ScanNet and SUN RGB-D are in `utils/show_results_scannet.py` and `utils/show_results_sunrgbd.py` saparately. 
82 | 
83 | Before running them, you should change the data paths in the beginning of each script. 
84 | 
85 | To visualize ground truth scenes and bounding boxes of ScanNet, run 
86 | 
87 |     python show_results_scannet.py gt
88 | 
89 | To visualize ground truth scenes and bounding boxes of ScanNet, run 
90 | 
91 |     python show_results_scannet.py pred
92 | 
93 | Usages for SUN RGB-D are just replacing scripts with args unchanged. 
94 | ## License
95 | H3DNet is relased under the MIT License. See the LICENSE file for more details.
96 | 


--------------------------------------------------------------------------------
/checkpoint.py:
--------------------------------------------------------------------------------
 1 | #!/usr/bin/env python3
 2 | # Copyright (c) Facebook, Inc. and its affiliates.
 3 | # All rights reserved.
 4 | #
 5 | # This source code is licensed under the license found in the
 6 | # LICENSE file in the root directory of this source tree.
 7 | #
 8 | import logging
 9 | import os
10 | 
11 | import torch
12 | 
13 | def _print_state_dict_shapes(state_dict):
14 |     logging.info("Model state_dict:")
15 |     for param_tensor in state_dict.keys():
16 |         logging.info(f"{param_tensor}:\t{state_dict[param_tensor].size()}")
17 | 
18 | def init_model_from_weights(
19 |     model,
20 |     state_dict,
21 |     skip_layers=None,
22 |     print_init_layers=True,
23 | ):
24 |     """
25 |     Initialize the model from any given params file. This is particularly useful
26 |     during the finetuning process or when we want to evaluate a model on a range
27 |     of tasks.
28 |     skip_layers:     string : layer names with this key are not copied
29 |     print_init_layers:   print whether layer was init or ignored
30 |                     indicates whether the layername was copied or not
31 |     """
32 |     # whether it's a model from somewhere else or a model from this codebase
33 |     state_dict = state_dict["model"]
34 |     
35 |     all_layers = model.state_dict()
36 |     init_layers = {layername: False for layername in all_layers}
37 | 
38 | 
39 |     new_state_dict = {}
40 |     for param_name in state_dict:
41 |         if "module.trunk.0" not in param_name:
42 |             continue
43 |         param_data = param_name.split(".")
44 |         newname = "backbone_net1"
45 |         for i in range(len(param_data[3:])):
46 |             newname += "."+param_data[i+3]
47 |         new_state_dict[newname] = state_dict[param_name]
48 |     state_dict = new_state_dict
49 |     
50 |     local_rank = int(os.environ.get("LOCAL_RANK", 0))
51 |     not_found, not_init = [], []
52 |     for layername in all_layers.keys():
53 |         if (
54 |             skip_layers and len(skip_layers) > 0 and layername.find(skip_layers) >= 0
55 |         ) or layername.find("num_batches_tracked") >= 0:
56 |             if print_init_layers and (local_rank == 0):
57 |                 not_init.append(layername)
58 |                 print(f"Ignored layer:\t{layername}")
59 |             continue
60 |         if layername in state_dict:
61 |             param = state_dict[layername]
62 |             if not isinstance(param, torch.Tensor):
63 |                 param = torch.from_numpy(param)
64 |             all_layers[layername].copy_(param)
65 |             init_layers[layername] = True
66 |             if print_init_layers and (local_rank == 0):
67 |                 print(f"Init layer:\t{layername}")
68 |         else:
69 |             not_found.append(layername)
70 |             if print_init_layers and (local_rank == 0):
71 |                 print(f"Not found:\t{layername}")
72 |     ####################### DEBUG ############################
73 |     # _print_state_dict_shapes(model.state_dict())
74 |     return model
75 | 


--------------------------------------------------------------------------------
/eval.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) Facebook, Inc. and its affiliates.
  2 | # 
  3 | # This source code is licensed under the MIT license found in the
  4 | # LICENSE file in the root directory of this source tree.
  5 | 
  6 | """ Evaluation routine for 3D object detection with SUN RGB-D and ScanNet.
  7 | """
  8 | 
  9 | import os
 10 | import sys
 11 | import numpy as np
 12 | from datetime import datetime
 13 | import argparse
 14 | import importlib
 15 | import torch
 16 | import torch.nn as nn
 17 | import torch.optim as optim
 18 | from torch.utils.data import DataLoader
 19 | 
 20 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
 21 | ROOT_DIR = BASE_DIR
 22 | sys.path.append(os.path.join(ROOT_DIR, 'models'))
 23 | from ap_helper import APCalculator, parse_predictions, parse_groundtruths
 24 | from dump_helper import dump_results
 25 | 
 26 | parser = argparse.ArgumentParser()
 27 | parser.add_argument('--data_path', default='/scratch/cluster/yanght/Dataset/sunrgbd/', help='path to dataset')
 28 | parser.add_argument('--model', default='hdnet', help='Model file name [default: hdnet]')
 29 | parser.add_argument('--dataset', default='sunrgbd', help='Dataset name. sunrgbd or scannet. [default: sunrgbd]')
 30 | parser.add_argument('--checkpoint_path', default=None, help='Model checkpoint path [default: None]')
 31 | parser.add_argument('--dump_dir', default=None, help='Dump dir to save sample outputs [default: None]')
 32 | parser.add_argument('--num_point', type=int, default=20000, help='Point Number [default: 20000]')
 33 | parser.add_argument('--num_target', type=int, default=256, help='Point Number [default: 256]')
 34 | parser.add_argument('--batch_size', type=int, default=8, help='Batch Size during training [default: 8]')
 35 | parser.add_argument('--vote_factor', type=int, default=1, help='Number of votes generated from each seed [default: 1]')
 36 | parser.add_argument('--cluster_sampling', default='vote_fps', help='Sampling strategy for vote clusters: vote_fps, seed_fps, random [default: vote_fps]')
 37 | parser.add_argument('--ap_iou_thresh', type=float, default=0.25, help='AP IoU threshold [default: 0.25]')
 38 | parser.add_argument('--no_height', action='store_true', help='Do NOT use height signal in input.')
 39 | parser.add_argument('--use_color', action='store_true', help='Use RGB color in input.')
 40 | parser.add_argument('--use_sunrgbd_v2', action='store_true', help='Use SUN RGB-D V2 box labels.')
 41 | parser.add_argument('--use_3d_nms', action='store_true', help='Use 3D NMS instead of 2D NMS.')
 42 | parser.add_argument('--use_cls_nms', action='store_true', help='Use per class NMS.')
 43 | parser.add_argument('--use_old_type_nms', action='store_true', help='Use old type of NMS, IoBox2Area.')
 44 | parser.add_argument('--per_class_proposal', action='store_true', help='Duplicate each proposal num_class times.')
 45 | parser.add_argument('--nms_iou', type=float, default=0.25, help='NMS IoU threshold. [default: 0.25]')
 46 | parser.add_argument('--conf_thresh', type=float, default=0.05, help='Filter out predictions with obj prob less than it. [default: 0.05]')
 47 | parser.add_argument('--faster_eval', action='store_true', help='Faster evaluation by skippling empty bounding box removal.')
 48 | parser.add_argument('--shuffle_dataset', action='store_true', help='Shuffle the dataset (random order).')
 49 | parser.add_argument('--dump_results', action='store_true', help='Dump results.')
 50 | 
 51 | FLAGS = parser.parse_args()
 52 | 
 53 | if FLAGS.use_cls_nms:
 54 |     assert(FLAGS.use_3d_nms)
 55 | 
 56 | # ------------------------------------------------------------------------- GLOBAL CONFIG BEG
 57 | BATCH_SIZE = FLAGS.batch_size
 58 | NUM_POINT = FLAGS.num_point
 59 | DUMP_DIR = FLAGS.dump_dir
 60 | CHECKPOINT_PATH = FLAGS.checkpoint_path
 61 | assert(CHECKPOINT_PATH is not None)
 62 | FLAGS.DUMP_DIR = DUMP_DIR
 63 | 
 64 | 
 65 | # Prepare DUMP_DIR
 66 | if not os.path.exists(DUMP_DIR): os.mkdir(DUMP_DIR)
 67 | DUMP_FOUT = open(os.path.join(DUMP_DIR, 'log_eval.txt'), 'w')
 68 | DUMP_FOUT.write(str(FLAGS)+'\n')
 69 | def log_string(out_str):
 70 |     DUMP_FOUT.write(out_str+'\n')
 71 |     DUMP_FOUT.flush()
 72 |     print(out_str)
 73 | 
 74 | # Init datasets and dataloaders 
 75 | def my_worker_init_fn(worker_id):
 76 |     np.random.seed(np.random.get_state()[1][0] + worker_id)
 77 | 
 78 | if FLAGS.dataset == 'sunrgbd':
 79 |     sys.path.append(os.path.join(ROOT_DIR, 'sunrgbd'))
 80 |     from sunrgbd_detection_dataset_hd import SunrgbdDetectionVotesDataset, MAX_NUM_OBJ
 81 |     from model_util_sunrgbd import SunrgbdDatasetConfig
 82 |     DATASET_CONFIG = SunrgbdDatasetConfig()
 83 |     TEST_DATASET = SunrgbdDetectionVotesDataset(FLAGS.data_path, 'val', num_points=NUM_POINT,
 84 |         augment=False, use_color=FLAGS.use_color, use_height=(not FLAGS.no_height),
 85 |         use_v1=(not FLAGS.use_sunrgbd_v2))
 86 | elif FLAGS.dataset == 'scannet':
 87 |     sys.path.append(os.path.join(ROOT_DIR, 'scannet'))
 88 |     from scannet_detection_dataset_hd import ScannetDetectionDataset, MAX_NUM_OBJ
 89 |     from model_util_scannet import ScannetDatasetConfig
 90 |     DATASET_CONFIG = ScannetDatasetConfig()
 91 |     TEST_DATASET = ScannetDetectionDataset(FLAGS.data_path, 'val', num_points=NUM_POINT,
 92 |                                            augment=False, use_angle=False,
 93 |                                            use_color=FLAGS.use_color, use_height=(not FLAGS.no_height))
 94 | else:
 95 |     print('Unknown dataset %s. Exiting...'%(FLAGS.dataset))
 96 |     exit(-1)
 97 | print(len(TEST_DATASET))
 98 | TEST_DATALOADER = DataLoader(TEST_DATASET, batch_size=BATCH_SIZE,
 99 |     shuffle=FLAGS.shuffle_dataset, num_workers=4, worker_init_fn=my_worker_init_fn)
100 | 
101 | # Init the model and optimzier
102 | MODEL = importlib.import_module(FLAGS.model) # import network module
103 | device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
104 | num_input_channel = int(FLAGS.use_color)*3 + int(not FLAGS.no_height)*1
105 | 
106 | Detector = MODEL.HDNet
107 | 
108 | net = Detector(num_class=DATASET_CONFIG.num_class,
109 |                num_heading_bin=DATASET_CONFIG.num_heading_bin,
110 |                num_size_cluster=DATASET_CONFIG.num_size_cluster,
111 |                mean_size_arr=DATASET_CONFIG.mean_size_arr,
112 |                num_proposal=FLAGS.num_target,
113 |                input_feature_dim=num_input_channel,
114 |                vote_factor=FLAGS.vote_factor,
115 |                sampling=FLAGS.cluster_sampling)
116 | 
117 | if torch.cuda.device_count() > 1:
118 |     log_string("Let's use %d GPUs!" % (torch.cuda.device_count()))
119 |     # dim = 0 [30, xxx] -> [10, ...], [10, ...], [10, ...] on 3 GPUs
120 |     net = nn.DataParallel(net) 
121 | net.to(device)
122 | criterion = MODEL.get_loss
123 | 
124 | # Load the Adam optimizer
125 | optimizer = optim.Adam(net.parameters(), lr=0.001)
126 | 
127 | # Load checkpoint if there is any
128 | if CHECKPOINT_PATH is not None and os.path.isfile(CHECKPOINT_PATH):
129 |     checkpoint = torch.load(CHECKPOINT_PATH)
130 |     checkpoint_multigpu = dict()
131 |     if torch.cuda.device_count() > 1:
132 |         for name, param in checkpoint['model_state_dict'].items():
133 |             checkpoint_multigpu.update({'module.' + name: param})
134 |         net.load_state_dict(checkpoint_multigpu)
135 |     else:
136 |         net.load_state_dict(checkpoint['model_state_dict'])
137 |     optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
138 |     epoch = checkpoint['epoch']
139 |     log_string("Loaded checkpoint %s (epoch: %d)"%(CHECKPOINT_PATH, epoch))
140 | 
141 | # Used for AP calculation
142 | CONFIG_DICT = {'remove_empty_box':False, 'use_3d_nms':True,
143 |     'nms_iou':0.25, 'use_old_type_nms':False, 'cls_nms':True,
144 |     'per_class_proposal': True, 'conf_thresh':0.05,
145 |     'dataset_config':DATASET_CONFIG}
146 | 
147 | CONFIG_DICT_L = {'remove_empty_box':False, 'use_3d_nms':True,
148 |     'nms_iou':0.5, 'use_old_type_nms':False, 'cls_nms':True,
149 |     'per_class_proposal': True, 'conf_thresh':0.05,
150 |     'dataset_config':DATASET_CONFIG}
151 | 
152 | # ------------------------------------------------------------------------- GLOBAL CONFIG END
153 | 
154 | def evaluate_one_epoch():
155 |     stat_dict = {}
156 | 
157 |     ap_calculator = APCalculator(ap_iou_thresh=FLAGS.ap_iou_thresh,
158 |         class2type_map=DATASET_CONFIG.class2type)
159 |     ap_calculator_l = APCalculator(ap_iou_thresh=FLAGS.ap_iou_thresh*2,
160 |         class2type_map=DATASET_CONFIG.class2type)
161 | 
162 |     net.eval() # set model to eval mode (for bn and dp)
163 |     for batch_idx, batch_data_label in enumerate(TEST_DATALOADER):
164 |         end_points = {}
165 |         if batch_idx % 10 == 0:
166 |             print('Eval batch: %d'%(batch_idx))
167 |         for key in batch_data_label:
168 |             batch_data_label[key] = batch_data_label[key].to(device)
169 |         
170 |         # Forward pass
171 |         inputs = {'point_clouds': batch_data_label['point_clouds']}
172 |         with torch.no_grad():
173 |             end_points = net(inputs, end_points)
174 | 
175 |         # Compute loss
176 |         for key in batch_data_label:
177 |             end_points[key] = batch_data_label[key]
178 |         loss, end_points = criterion(inputs, end_points, DATASET_CONFIG)
179 | 
180 |         # Accumulate statistics and print out
181 |         for key in end_points:
182 |             if 'loss' in key or 'acc' in key or 'ratio' in key:
183 |                 if key not in stat_dict: stat_dict[key] = 0
184 |                 stat_dict[key] += end_points[key].item()
185 | 
186 |         batch_pred_map_cls = parse_predictions(end_points, CONFIG_DICT, opt_ang=(FLAGS.dataset == 'sunrgbd'))
187 |         batch_gt_map_cls = parse_groundtruths(end_points, CONFIG_DICT) 
188 |         ap_calculator.step(batch_pred_map_cls, batch_gt_map_cls)
189 | 
190 |         batch_pred_map_cls = parse_predictions(end_points, CONFIG_DICT_L, opt_ang=(FLAGS.dataset == 'sunrgbd'))
191 |         batch_gt_map_cls = parse_groundtruths(end_points, CONFIG_DICT_L) 
192 |         ap_calculator_l.step(batch_pred_map_cls, batch_gt_map_cls)
193 |         
194 |         if FLAGS.dump_results:
195 |             dump_results(end_points, DUMP_DIR+'/result/', DATASET_CONFIG, TEST_DATASET, opt_ang=(FLAGS.dataset == 'sunrgbd'))
196 | 
197 | 
198 |     # Log statistics
199 |     for key in sorted(stat_dict.keys()):
200 |         log_string('eval mean %s: %f'%(key, stat_dict[key]/(float(batch_idx+1))))
201 | 
202 |     metrics_dict = ap_calculator.compute_metrics()
203 |     for key in metrics_dict:
204 |         log_string('iou = 0.25, eval %s: %f'%(key, metrics_dict[key]))
205 |     metrics_dict = ap_calculator_l.compute_metrics()
206 |     for key in metrics_dict:
207 |         log_string('iou = 0.5, eval %s: %f'%(key, metrics_dict[key]))
208 | 
209 |     mean_loss = stat_dict['loss']/float(batch_idx+1)
210 |     return mean_loss
211 | 
212 | 
213 | def eval():
214 |     log_string(str(datetime.now()))
215 |     # Reset numpy seed.
216 |     # REF: https://github.com/pytorch/pytorch/issues/5059
217 |     np.random.seed()
218 |     loss = evaluate_one_epoch()
219 | 
220 | if __name__=='__main__':
221 |     eval()
222 | 


--------------------------------------------------------------------------------
/models/ap_helper.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) Facebook, Inc. and its affiliates.
  2 | # 
  3 | # This source code is licensed under the MIT license found in the
  4 | # LICENSE file in the root directory of this source tree.
  5 | 
  6 | """ Helper functions and class to calculate Average Precisions for 3D object detection.
  7 | """
  8 | import os
  9 | import sys
 10 | import numpy as np
 11 | import torch
 12 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
 13 | ROOT_DIR = os.path.dirname(BASE_DIR)
 14 | sys.path.append(os.path.join(ROOT_DIR, 'utils'))
 15 | from eval_det import eval_det_cls, eval_det_multiprocessing
 16 | from eval_det import get_iou_obb
 17 | from nms import nms_2d_faster, nms_3d_faster, nms_3d_faster_samecls
 18 | from box_util import get_3d_box
 19 | sys.path.append(os.path.join(ROOT_DIR, 'sunrgbd'))
 20 | from sunrgbd_utils import extract_pc_in_box3d
 21 | 
 22 | def flip_axis_to_camera(pc):
 23 |     ''' Flip X-right,Y-forward,Z-up to X-right,Y-down,Z-forward
 24 |     Input and output are both (N,3) array
 25 |     '''
 26 |     pc2 = np.copy(pc)
 27 |     pc2[...,[0,1,2]] = pc2[...,[0,2,1]] # cam X,Y,Z = depth X,-Z,Y
 28 |     pc2[...,1] *= -1
 29 |     return pc2
 30 | 
 31 | def flip_axis_to_depth(pc):
 32 |     pc2 = np.copy(pc)
 33 |     pc2[...,[0,1,2]] = pc2[...,[0,2,1]] # depth X,Y,Z = cam X,Z,-Y
 34 |     pc2[...,2] *= -1
 35 |     return pc2
 36 | 
 37 | def softmax(x):
 38 |     ''' Numpy function for softmax'''
 39 |     shape = x.shape
 40 |     probs = np.exp(x - np.max(x, axis=len(shape)-1, keepdims=True))
 41 |     probs /= np.sum(probs, axis=len(shape)-1, keepdims=True)
 42 |     return probs
 43 | 
 44 | def parse_predictions(end_points, config_dict, opt_ang=False, opt_sem=False):
 45 |     """ Parse predictions to OBB parameters and suppress overlapping boxes
 46 |     
 47 |     Args:
 48 |         end_points: dict
 49 |             {point_clouds, center, heading_scores, heading_residuals,
 50 |             size_scores, size_residuals, sem_cls_scores}
 51 |         config_dict: dict
 52 |             {dataset_config, remove_empty_box, use_3d_nms, nms_iou,
 53 |             use_old_type_nms, conf_thresh, per_class_proposal}
 54 | 
 55 |     Returns:
 56 |         batch_pred_map_cls: a list of len == batch size (BS)
 57 |             [pred_list_i], i = 0, 1, ..., BS-1
 58 |             where pred_list_i = [(pred_sem_cls, box_params, box_score)_j]
 59 |             where j = 0, ..., num of valid detections - 1 from sample input i
 60 |     """
 61 | 
 62 |     pred_center = end_points['center'+'opt']# + end_points['center'+'opt'] # B,num_proposal,3
 63 |     
 64 |     if opt_ang:
 65 |         pred_heading_class = torch.argmax(end_points['heading_scores'+'center'], -1) # B,num_proposal
 66 |         pred_heading_residual = torch.gather(end_points['heading_residuals'+'opt'], 2,
 67 |                                              pred_heading_class.unsqueeze(-1)) # B,num_proposal,1
 68 |     else:
 69 |         pred_heading_class = torch.argmax(end_points['heading_scores'+'center'], -1) # B,num_proposal
 70 |         pred_heading_residual = torch.gather(end_points['heading_residuals'+'center'], 2,
 71 |                                              pred_heading_class.unsqueeze(-1)) # B,num_proposal,1
 72 |     pred_heading_residual.squeeze_(2)
 73 | 
 74 |     pred_size_class = torch.argmax(end_points['size_scores'+'center'], -1) # B,num_proposal
 75 |     pred_size_residual = torch.gather(end_points['size_residuals'+'opt'], 2,
 76 |                                       pred_size_class.unsqueeze(-1).unsqueeze(-1).repeat(1,1,1,3)) # B,num_proposal,1,3
 77 |     pred_size_residual.squeeze_(2)
 78 |     
 79 |     if opt_sem:
 80 |         pred_sem_cls = torch.argmax(end_points['sem_cls_scores'+'opt'], -1) # B,num_proposal
 81 |         sem_cls_probs = softmax(end_points['sem_cls_scores'+'opt'].detach().cpu().numpy()) # B,num_proposal,10
 82 |     else:
 83 |         pred_sem_cls = torch.argmax(end_points['sem_cls_scores'+'center'], -1) # B,num_proposal
 84 |         sem_cls_probs = softmax(end_points['sem_cls_scores'+'center'].detach().cpu().numpy()) # B,num_proposal,10
 85 |     pred_sem_cls_prob = np.max(sem_cls_probs,-1) # B,num_proposal
 86 | 
 87 |     num_proposal = pred_center.shape[1] 
 88 |     # Since we operate in upright_depth coord for points, while util functions
 89 |     # assume upright_camera coord.
 90 |     bsize = pred_center.shape[0]
 91 |     pred_corners_3d_upright_camera = np.zeros((bsize, num_proposal, 8, 3))
 92 |     pred_center_upright_camera = flip_axis_to_camera(pred_center.detach().cpu().numpy())
 93 | 
 94 |     for i in range(bsize):
 95 |         for j in range(num_proposal):
 96 |             heading_angle = config_dict['dataset_config'].class2angle(\
 97 |                 pred_heading_class[i,j].detach().cpu().numpy(), pred_heading_residual[i,j].detach().cpu().numpy())
 98 |             box_size = config_dict['dataset_config'].class2size(\
 99 |                 int(pred_size_class[i,j].detach().cpu().numpy()), pred_size_residual[i,j].detach().cpu().numpy())     
100 |             corners_3d_upright_camera = get_3d_box(box_size, heading_angle, pred_center_upright_camera[i,j,:])
101 |             pred_corners_3d_upright_camera[i,j] = corners_3d_upright_camera
102 | 
103 |     K = pred_center.shape[1] # K==num_proposal
104 |     nonempty_box_mask = np.ones((bsize, K))
105 | 
106 |     if config_dict['remove_empty_box']:
107 |         # -------------------------------------
108 |         # Remove predicted boxes without any point within them..
109 |         batch_pc = end_points['point_clouds'].cpu().numpy()[:,:,0:3] # B,N,3
110 |         for i in range(bsize):
111 |             pc = batch_pc[i,:,:] # (N,3)
112 |             for j in range(K):
113 |                 box3d = pred_corners_3d_upright_camera[i,j,:,:] # (8,3)
114 |                 box3d = flip_axis_to_depth(box3d)
115 |                 pc_in_box,inds = extract_pc_in_box3d(pc, box3d)
116 |                 if len(pc_in_box) < 5:
117 |                     nonempty_box_mask[i,j] = 0
118 |         # -------------------------------------
119 | 
120 |     obj_logits = end_points['objectness_scores'+'opt'].detach().cpu().numpy()
121 |     obj_prob = softmax(obj_logits)[:,:,1] # (B,K)
122 |     
123 |     if not config_dict['use_3d_nms']:
124 |         # ---------- NMS input: pred_with_prob in (B,K,7) -----------
125 |         pred_mask = np.zeros((bsize, K))
126 |         for i in range(bsize):
127 |             boxes_2d_with_prob = np.zeros((K,5))
128 |             for j in range(K):
129 |                 boxes_2d_with_prob[j,0] = np.min(pred_corners_3d_upright_camera[i,j,:,0])
130 |                 boxes_2d_with_prob[j,2] = np.max(pred_corners_3d_upright_camera[i,j,:,0])
131 |                 boxes_2d_with_prob[j,1] = np.min(pred_corners_3d_upright_camera[i,j,:,2])
132 |                 boxes_2d_with_prob[j,3] = np.max(pred_corners_3d_upright_camera[i,j,:,2])
133 |                 boxes_2d_with_prob[j,4] = obj_prob[i,j]
134 |             nonempty_box_inds = np.where(nonempty_box_mask[i,:]==1)[0]
135 |             pick = nms_2d_faster(boxes_2d_with_prob[nonempty_box_mask[i,:]==1,:],
136 |                 config_dict['nms_iou'], config_dict['use_old_type_nms'])
137 |             assert(len(pick)>0)
138 |             pred_mask[i, nonempty_box_inds[pick]] = 1
139 |         end_points['pred_mask'] = pred_mask
140 |         # ---------- NMS output: pred_mask in (B,K) -----------
141 |     elif config_dict['use_3d_nms'] and (not config_dict['cls_nms']):
142 |         # ---------- NMS input: pred_with_prob in (B,K,7) -----------
143 |         pred_mask = np.zeros((bsize, K))
144 |         for i in range(bsize):
145 |             boxes_3d_with_prob = np.zeros((K,7))
146 |             for j in range(K):
147 |                 boxes_3d_with_prob[j,0] = np.min(pred_corners_3d_upright_camera[i,j,:,0])
148 |                 boxes_3d_with_prob[j,1] = np.min(pred_corners_3d_upright_camera[i,j,:,1])
149 |                 boxes_3d_with_prob[j,2] = np.min(pred_corners_3d_upright_camera[i,j,:,2])
150 |                 boxes_3d_with_prob[j,3] = np.max(pred_corners_3d_upright_camera[i,j,:,0])
151 |                 boxes_3d_with_prob[j,4] = np.max(pred_corners_3d_upright_camera[i,j,:,1])
152 |                 boxes_3d_with_prob[j,5] = np.max(pred_corners_3d_upright_camera[i,j,:,2])
153 |                 boxes_3d_with_prob[j,6] = obj_prob[i,j]
154 |             nonempty_box_inds = np.where(nonempty_box_mask[i,:]==1)[0]
155 |             pick = nms_3d_faster(boxes_3d_with_prob[nonempty_box_mask[i,:]==1,:],
156 |                 config_dict['nms_iou'], config_dict['use_old_type_nms'])
157 |             assert(len(pick)>0)
158 |             pred_mask[i, nonempty_box_inds[pick]] = 1
159 |         end_points['pred_mask'] = pred_mask
160 |         # ---------- NMS output: pred_mask in (B,K) -----------
161 |     elif config_dict['use_3d_nms'] and config_dict['cls_nms']:
162 |         # ---------- NMS input: pred_with_prob in (B,K,8) -----------
163 |         pred_mask = np.zeros((bsize, K))
164 |         for i in range(bsize):
165 |             boxes_3d_with_prob = np.zeros((K,8))
166 |             for j in range(K):
167 |                 boxes_3d_with_prob[j,0] = np.min(pred_corners_3d_upright_camera[i,j,:,0])
168 |                 boxes_3d_with_prob[j,1] = np.min(pred_corners_3d_upright_camera[i,j,:,1])
169 |                 boxes_3d_with_prob[j,2] = np.min(pred_corners_3d_upright_camera[i,j,:,2])
170 |                 boxes_3d_with_prob[j,3] = np.max(pred_corners_3d_upright_camera[i,j,:,0])
171 |                 boxes_3d_with_prob[j,4] = np.max(pred_corners_3d_upright_camera[i,j,:,1])
172 |                 boxes_3d_with_prob[j,5] = np.max(pred_corners_3d_upright_camera[i,j,:,2])
173 |                 boxes_3d_with_prob[j,6] = obj_prob[i,j]
174 |                 boxes_3d_with_prob[j,7] = pred_sem_cls[i,j] # only suppress if the two boxes are of the same class!!
175 |             nonempty_box_inds = np.where(nonempty_box_mask[i,:]==1)[0]
176 |             pick = nms_3d_faster_samecls(boxes_3d_with_prob[nonempty_box_mask[i,:]==1,:],
177 |                 config_dict['nms_iou'], config_dict['use_old_type_nms'])
178 |             assert(len(pick)>0)
179 |             pred_mask[i, nonempty_box_inds[pick]] = 1
180 |         end_points['pred_mask'] = pred_mask
181 |         # ---------- NMS output: pred_mask in (B,K) -----------
182 | 
183 |     batch_pred_map_cls = [] # a list (len: batch_size) of list (len: num of predictions per sample) of tuples of pred_cls, pred_box and conf (0-1)
184 |     for i in range(bsize):
185 |         if config_dict['per_class_proposal']:
186 |             cur_list = []
187 |             for ii in range(config_dict['dataset_config'].num_class):
188 |                 cur_list += [(ii, pred_corners_3d_upright_camera[i,j], sem_cls_probs[i,j,ii]*obj_prob[i,j]) \
189 |                     for j in range(pred_center.shape[1]) if pred_mask[i,j]==1 and obj_prob[i,j]>config_dict['conf_thresh']]
190 |             batch_pred_map_cls.append(cur_list)
191 |         else:
192 |             batch_pred_map_cls.append([(pred_sem_cls[i,j].item(), pred_corners_3d_upright_camera[i,j], obj_prob[i,j]) \
193 |                 for j in range(pred_center.shape[1]) if pred_mask[i,j]==1 and obj_prob[i,j]>config_dict['conf_thresh']])
194 |     end_points['batch_pred_map_cls'] = batch_pred_map_cls
195 | 
196 |     return batch_pred_map_cls
197 | 
198 | def parse_groundtruths(end_points, config_dict):
199 |     """ Parse groundtruth labels to OBB parameters.
200 |     
201 |     Args:
202 |         end_points: dict
203 |             {center_label, heading_class_label, heading_residual_label,
204 |             size_class_label, size_residual_label, sem_cls_label,
205 |             box_label_mask}
206 |         config_dict: dict
207 |             {dataset_config}
208 | 
209 |     Returns:
210 |         batch_gt_map_cls: a list  of len == batch_size (BS)
211 |             [gt_list_i], i = 0, 1, ..., BS-1
212 |             where gt_list_i = [(gt_sem_cls, gt_box_params)_j]
213 |             where j = 0, ..., num of objects - 1 at sample input i
214 |     """
215 |     center_label = end_points['center_label']
216 |     heading_class_label = end_points['heading_class_label']
217 |     heading_residual_label = end_points['heading_residual_label']
218 |     size_class_label = end_points['size_class_label']
219 |     size_residual_label = end_points['size_residual_label']
220 |     box_label_mask = end_points['box_label_mask']
221 |     sem_cls_label = end_points['sem_cls_label']
222 |     bsize = center_label.shape[0]
223 | 
224 |     K2 = center_label.shape[1] # K2==MAX_NUM_OBJ
225 |     gt_corners_3d_upright_camera = np.zeros((bsize, K2, 8, 3))
226 |     gt_center_upright_camera = flip_axis_to_camera(center_label[:,:,0:3].detach().cpu().numpy())
227 |     for i in range(bsize):
228 |         for j in range(K2):
229 |             if box_label_mask[i,j] == 0: continue
230 |             heading_angle = config_dict['dataset_config'].class2angle(heading_class_label[i,j].detach().cpu().numpy(), heading_residual_label[i,j].detach().cpu().numpy())
231 |             box_size = config_dict['dataset_config'].class2size(int(size_class_label[i,j].detach().cpu().numpy()), size_residual_label[i,j].detach().cpu().numpy())
232 |             corners_3d_upright_camera = get_3d_box(box_size, heading_angle, gt_center_upright_camera[i,j,:])
233 |             gt_corners_3d_upright_camera[i,j] = corners_3d_upright_camera
234 | 
235 |     batch_gt_map_cls = []
236 |     for i in range(bsize):
237 |         batch_gt_map_cls.append([(sem_cls_label[i,j].item(), gt_corners_3d_upright_camera[i,j]) for j in range(gt_corners_3d_upright_camera.shape[1]) if box_label_mask[i,j]==1])
238 |     end_points['batch_gt_map_cls'] = batch_gt_map_cls
239 | 
240 |     return batch_gt_map_cls
241 | 
242 | class APCalculator(object):
243 |     ''' Calculating Average Precision '''
244 |     def __init__(self, ap_iou_thresh=0.25, class2type_map=None):
245 |         """
246 |         Args:
247 |             ap_iou_thresh: float between 0 and 1.0
248 |                 IoU threshold to judge whether a prediction is positive.
249 |             class2type_map: [optional] dict {class_int:class_name}
250 |         """
251 |         self.ap_iou_thresh = ap_iou_thresh
252 |         self.class2type_map = class2type_map
253 |         self.reset()
254 |         
255 |     def step(self, batch_pred_map_cls, batch_gt_map_cls):
256 |         """ Accumulate one batch of prediction and groundtruth.
257 |         
258 |         Args:
259 |             batch_pred_map_cls: a list of lists [[(pred_cls, pred_box_params, score),...],...]
260 |             batch_gt_map_cls: a list of lists [[(gt_cls, gt_box_params),...],...]
261 |                 should have the same length with batch_pred_map_cls (batch_size)
262 |         """
263 |         
264 |         bsize = len(batch_pred_map_cls)
265 |         assert(bsize == len(batch_gt_map_cls))
266 |         for i in range(bsize):
267 |             self.gt_map_cls[self.scan_cnt] = batch_gt_map_cls[i] 
268 |             self.pred_map_cls[self.scan_cnt] = batch_pred_map_cls[i] 
269 |             self.scan_cnt += 1
270 |     
271 |     def compute_metrics(self):
272 |         """ Use accumulated predictions and groundtruths to compute Average Precision.
273 |         """
274 |         rec, prec, ap = eval_det_multiprocessing(self.pred_map_cls, self.gt_map_cls, ovthresh=self.ap_iou_thresh, get_iou_func=get_iou_obb)
275 |         ret_dict = {} 
276 |         for key in sorted(ap.keys()):
277 |             clsname = self.class2type_map[key] if self.class2type_map else str(key)
278 |             ret_dict['%s Average Precision'%(clsname)] = ap[key]
279 |         ret_dict['mAP'] = np.mean(list(ap.values()))
280 |         rec_list = []
281 |         for key in sorted(ap.keys()):
282 |             clsname = self.class2type_map[key] if self.class2type_map else str(key)
283 |             try:
284 |                 ret_dict['%s Recall'%(clsname)] = rec[key][-1]
285 |                 rec_list.append(rec[key][-1])
286 |             except:
287 |                 ret_dict['%s Recall'%(clsname)] = 0
288 |                 rec_list.append(0)
289 |         ret_dict['AR'] = np.mean(rec_list)
290 |         return ret_dict
291 | 
292 |     def reset(self):
293 |         self.gt_map_cls = {} # {scan_id: [(classname, bbox)]}
294 |         self.pred_map_cls = {} # {scan_id: [(classname, bbox, score)]}
295 |         self.scan_cnt = 0
296 | 


--------------------------------------------------------------------------------
/models/dump_helper.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) Facebook, Inc. and its affiliates.
  2 | # 
  3 | # This source code is licensed under the MIT license found in the
  4 | # LICENSE file in the root directory of this source tree.
  5 | 
  6 | import numpy as np
  7 | import torch
  8 | import os
  9 | import sys
 10 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
 11 | ROOT_DIR = os.path.dirname(BASE_DIR)
 12 | sys.path.append(os.path.join(ROOT_DIR, 'utils'))
 13 | import pc_util
 14 | import scipy.io as sio
 15 | import scipy
 16 | 
 17 | DUMP_CONF_THRESH = 0.5 # Dump boxes with obj prob larger than that.
 18 | 
 19 | def params2bbox(center, xsize, ysize, zsize, angle):
 20 |     ''' from bbox_center, angle and size to bbox
 21 |     @Args:
 22 |         center: (3)
 23 |         x/y/zsize: scalar
 24 |         angle: -pi ~ pi
 25 |     @Returns:
 26 |         bbox: 8 x 3, order:
 27 |          [[xmin, ymin, zmin], [xmin, ymin, zmax], [xmin, ymax, zmin], [xmin, ymax, zmax],
 28 |           [xmax, ymin, zmin], [xmax, ymin, zmax], [xmax, ymax, zmin], [xmax, ymax, zmax]]
 29 |     '''
 30 |     vx = np.array([np.cos(angle), np.sin(angle), 0])
 31 |     vy = np.array([-np.sin(angle), np.cos(angle), 0])
 32 |     vx = vx * np.abs(xsize) / 2
 33 |     vy = vy * np.abs(ysize) / 2
 34 |     vz = np.array([0, 0, np.abs(zsize) / 2])
 35 |     bbox = np.array([\
 36 |         center - vx - vy - vz, center - vx - vy + vz,
 37 |         center - vx + vy - vz, center - vx + vy + vz,
 38 |         center + vx - vy - vz, center + vx - vy + vz,
 39 |         center + vx + vy - vz, center + vx + vy + vz])
 40 |     return bbox
 41 | 
 42 | def softmax(x):
 43 |     ''' Numpy function for softmax'''
 44 |     shape = x.shape
 45 |     probs = np.exp(x - np.max(x, axis=len(shape)-1, keepdims=True))
 46 |     probs /= np.sum(probs, axis=len(shape)-1, keepdims=True)
 47 |     return probs
 48 | 
 49 | 
 50 | DUMP_CONF_THRESH = 0.5 # Dump boxes with obj prob larger than that.
 51 | 
 52 | def softmax(x):
 53 |     ''' Numpy function for softmax'''
 54 |     shape = x.shape
 55 |     probs = np.exp(x - np.max(x, axis=len(shape)-1, keepdims=True))
 56 |     probs /= np.sum(probs, axis=len(shape)-1, keepdims=True)
 57 |     return probs
 58 | 
 59 | def dump_results(end_points, dump_dir, config, dataset, opt_ang, mode='opt'):
 60 |     '''
 61 |         similar to dump results
 62 |         scan_names: all scan names
 63 |     '''
 64 |     if not os.path.exists(dump_dir):
 65 |         os.system('mkdir %s'%(dump_dir))
 66 |         
 67 |     # INPUT
 68 |     point_clouds = end_points['point_clouds'].cpu().numpy()
 69 |     batch_size = point_clouds.shape[0]
 70 |     
 71 |     # NETWORK OUTPUTS
 72 |     seed_xyz_z = end_points['seed_xyz'].detach().cpu().numpy() # (B,num_seed,3)
 73 |     seed_xyz_xy = end_points['seed_xyz'].detach().cpu().numpy() # (B,num_seed,3)
 74 |     seed_xyz_line = end_points['seed_xyz'].detach().cpu().numpy() # (B,num_seed,3)
 75 |     
 76 |     gt_center = end_points['center_label'].cpu().numpy() # (B,MAX_NUM_OBJ,3)
 77 |     gt_num = end_points['num_instance'].cpu().numpy() # (B,MAX_NUM_OBJ,3)
 78 |     scan_idxes = end_points['scan_idx'].detach().cpu().numpy()
 79 | 
 80 |     pred_center = end_points['vote_xyz'].detach().cpu().numpy()
 81 | 
 82 |     aggregated_vote_xyz = end_points['aggregated_vote_xyz'+mode].detach().cpu().numpy()
 83 |     objectness_scores = end_points['objectness_scores'+mode].detach().cpu().numpy() # (B,K,2)
 84 |     pred_center = end_points['center'+mode].detach().cpu().numpy() # (B,K,3)
 85 | 
 86 |     pred_heading_class = torch.argmax(end_points['heading_scores'+'center'], -1) # B,num_proposal
 87 |     if opt_ang:
 88 |         pred_heading_residual = torch.gather(end_points['heading_residuals'+'opt'], 2, pred_heading_class.unsqueeze(-1)) # B,num_proposal,1
 89 |     else:
 90 |         pred_heading_residual = torch.gather(end_points['heading_residuals'+'center'], 2, pred_heading_class.unsqueeze(-1)) # B,num_proposal,1
 91 | 
 92 |     pred_heading_class = pred_heading_class.detach().cpu().numpy() # B,num_proposal
 93 |     pred_heading_residual = pred_heading_residual.squeeze(2).detach().cpu().numpy() # B,num_proposal
 94 |     
 95 |     pred_size_class = torch.argmax(end_points['size_scores'+'center'], -1) # B,num_proposal
 96 |     pred_size_residual = torch.gather(end_points['size_residuals'+mode], 2, pred_size_class.unsqueeze(-1).unsqueeze(-1).repeat(1,1,1,3)) # B,num_proposal,1,3
 97 |     pred_size_residual = pred_size_residual.squeeze(2).detach().cpu().numpy() # B,num_proposal,3
 98 |     pred_sem_cls = torch.argmax(end_points['sem_cls_scores'+'center'], -1) # B, num_proposal
 99 |     pred_sem_cls = pred_sem_cls.detach().cpu().numpy()
100 | 
101 |     pred_mask = end_points['pred_mask'] # B,num_proposal
102 | 
103 |     # LABELS
104 |     gt_center = end_points['center_label'].cpu().numpy() # (B,MAX_NUM_OBJ,3)
105 |     gt_mask = end_points['box_label_mask'].cpu().numpy() # B,K2
106 |     gt_heading_class = end_points['heading_class_label'].cpu().numpy() # B,K2
107 |     gt_heading_residual = end_points['heading_residual_label'].cpu().numpy() # B,K2
108 |     gt_size_class = end_points['size_class_label'].cpu().numpy() # B,K2
109 |     gt_size_residual = end_points['size_residual_label'].cpu().numpy() # B,K2,3
110 |     objectness_label = end_points['objectness_label'+mode].detach().cpu().numpy() # (B,K,)
111 |     objectness_mask = end_points['objectness_mask'+mode].detach().cpu().numpy() # (B,K,)
112 |     sem_cls_label = end_points['sem_cls_label'].detach().cpu().numpy()
113 | 
114 |     ### Boundary points
115 |     boundary_gt_z = end_points['sub_point_sem_cls_label'+'_z'].detach().cpu().numpy()
116 |     boundary_pred_z = end_points['pred_flag'+'_z'].detach().cpu().numpy()
117 |     boundary_gt_xy = end_points['sub_point_sem_cls_label'+'_xy'].detach().cpu().numpy()
118 |     boundary_pred_xy = end_points['pred_flag'+'_xy'].detach().cpu().numpy()
119 |     boundary_gt_line = end_points['sub_point_sem_cls_label'+'_line'].detach().cpu().numpy()
120 |     boundary_pred_line = end_points['pred_flag'+'_line'].detach().cpu().numpy()
121 | 
122 |     gt_center_z = end_points['surface_center_gt_z'].detach().cpu().numpy()
123 |     gt_sem_z = end_points['surface_sem_gt_z'].detach().cpu().numpy()
124 |     gt_mask_z = end_points['surface_mask_gt_z'].detach().cpu().numpy()
125 | 
126 |     gt_center_xy = end_points['surface_center_gt_xy'].detach().cpu().numpy()
127 |     gt_sem_xy = end_points['surface_sem_gt_xy'].detach().cpu().numpy()
128 |     gt_mask_xy = end_points['surface_mask_gt_xy'].detach().cpu().numpy()
129 | 
130 |     gt_center_line = end_points['surface_center_gt_line'].detach().cpu().numpy()
131 |     gt_sem_line = end_points['surface_sem_gt_line'].detach().cpu().numpy()
132 |     gt_mask_line = end_points['surface_mask_gt_line'].detach().cpu().numpy()
133 |     
134 |     pred_center_z = end_points['center_z'].detach().cpu().numpy()
135 |     pred_center_xy = end_points['center_xy'].detach().cpu().numpy()
136 |     pred_center_line = end_points['center_line'].detach().cpu().numpy()
137 |     
138 |     pred_size_z = end_points['size_residuals_z'].detach().cpu().numpy()
139 |     pred_size_xy = end_points['size_residuals_xy'].detach().cpu().numpy()
140 | 
141 |     pred_sem_z = end_points['sem_cls_scores_z'].detach().cpu().numpy()
142 |     pred_sem_xy = end_points['sem_cls_scores_xy'].detach().cpu().numpy()
143 |     pred_sem_line = end_points['sem_cls_scores_line'].detach().cpu().numpy()
144 | 
145 |     num_proposal = pred_center.shape[1]
146 |     for i in range(batch_size):
147 |         idx = scan_idxes[i]
148 |         scan = dataset.scan_names[idx]
149 |         print('-' * 30)
150 |         print(scan)
151 |         print('-' * 30)
152 |     
153 |         box_pred_list = []
154 |         box_gt_list = []
155 |         obb_pred_list = []
156 |         obb_gt_list = []
157 | 
158 |         for j in range(num_proposal):
159 |             obb = config.param2obb2(pred_center[i,j,0:3], pred_heading_class[i,j], pred_heading_residual[i,j],
160 |                             pred_size_class[i,j], pred_size_residual[i,j])
161 |             obb_pred_list.append(np.hstack([obb, pred_sem_cls[i, j] + 1])) # ATTENTION: need to + 1
162 |             box = params2bbox(obb[:3], obb[3], obb[4], obb[5], obb[6])
163 |             box_pred_list.append(box)
164 |         obb_pred_mat = np.array(obb_pred_list)
165 |         
166 |         for j in range(gt_center.shape[1]):
167 |             if gt_mask[i, j] == 0: continue
168 |             obb = config.param2obb2(gt_center[i,j,0:3], gt_heading_class[i,j], gt_heading_residual[i,j],
169 |                             gt_size_class[i,j], gt_size_residual[i,j])
170 |             obb_gt_list.append(np.hstack([obb, sem_cls_label[i, j] + 1])) # ATTENTION: need to + 1
171 |             box = params2bbox(obb[:3], obb[3], obb[4], obb[5], obb[6])
172 |             box_gt_list.append(box)
173 |         obb_gt_mat = np.array(obb_gt_list)
174 | 
175 |         scipy.io.savemat(dump_dir + mode + scan + '_gt.mat', {'gt': obb_gt_mat})
176 |         scipy.io.savemat(dump_dir + mode + scan + '_boundary_z.mat', {'gt': boundary_gt_z[i,...], 'pred': boundary_pred_z[i,...], 'origpc': point_clouds[i,...], 'seedpc': seed_xyz_z[i,...], 'gt_center': gt_center_z[i,...], 'gt_sem': gt_sem_z[i,...], 'gt_mask': gt_mask_z[i,...], 'pred_center': pred_center_z[i,...], 'pred_sem': pred_sem_z[i,...], 'pred_size': pred_size_z[i,...]})
177 |         scipy.io.savemat(dump_dir + mode + scan + '_boundary_xy.mat', {'gt': boundary_gt_xy[i,...], 'pred': boundary_pred_xy[i,...], 'origpc': point_clouds[i,...], 'seedpc': seed_xyz_xy[i,...], 'gt_center': gt_center_xy[i,...], 'gt_sem': gt_sem_xy[i,...], 'gt_mask': gt_mask_xy[i,...], 'pred_center': pred_center_xy[i,...], 'pred_sem': pred_sem_xy[i,...], 'pred_size': pred_size_xy[i,...]})
178 |         scipy.io.savemat(dump_dir + mode + scan + '_boundary_line.mat', {'gt': boundary_gt_line[i,...], 'pred': boundary_pred_line[i,...], 'origpc': point_clouds[i,...], 'seedpc': seed_xyz_line[i,...], 'gt_center': gt_center_line[i,...], 'gt_sem': gt_sem_line[i,...], 'gt_mask': gt_mask_line[i,...], 'pred_center': pred_center_line[i,...], 'pred_sem': pred_sem_line[i,...]})
179 | 
180 |         # uncomment to visualize
181 |         # Dump predicted bounding boxes
182 |         objectness_prob = softmax(objectness_scores[i,:,:])[:,1] # (K,)
183 |         select_idx = np.logical_and(objectness_prob>DUMP_CONF_THRESH, pred_mask[i,:]==1)
184 |         box_pred_nms_list = []
185 |         obb_pred_nms_list = []
186 |         for i, val in enumerate(select_idx.tolist()):
187 |             if val:
188 |                 box_pred_nms_list.append(box_pred_list[i])
189 |                 obb_pred_nms_list.append(obb_pred_list[i])
190 | 
191 |         votenet_pred_nms_arr = np.array(obb_pred_nms_list)
192 |         np.save(dump_dir + mode + scan + '_nms.npy', votenet_pred_nms_arr)
193 | 


--------------------------------------------------------------------------------
/models/hdnet.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) Facebook, Inc. and its affiliates.
  2 | # 
  3 | # This source code is licensed under the MIT license found in the
  4 | # LICENSE file in the root directory of this source tree.
  5 | 
  6 | """ Deep hough voting network for 3D object detection in point clouds.
  7 | 
  8 | Author: Charles R. Qi and Or Litany
  9 | """
 10 | 
 11 | import torch
 12 | import torch.nn as nn
 13 | import torch.nn.functional as F
 14 | import numpy as np
 15 | import sys
 16 | import os
 17 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
 18 | ROOT_DIR = os.path.dirname(BASE_DIR)
 19 | sys.path.append(BASE_DIR)
 20 | sys.path.append(os.path.join(ROOT_DIR, 'utils'))
 21 | import pc_util
 22 | 
 23 | from backbone_module import Pointnet2Backbone
 24 | from voting_module import VotingModule
 25 | 
 26 | from proposal_module_refine import ProposalModuleRefine
 27 | from proposal_module_surface import PrimitiveModule
 28 | 
 29 | from dump_helper import dump_results
 30 | from loss_helper import get_loss
 31 | 
 32 | class HDNet(nn.Module):
 33 |     r"""
 34 |         A deep neural network for 3D object detection with end-to-end optimizable hough voting.
 35 | 
 36 |         Parameters
 37 |         ----------
 38 |         num_class: int
 39 |             Number of semantics classes to predict over -- size of softmax classifier
 40 |         num_heading_bin: int
 41 |         num_size_cluster: int
 42 |         input_feature_dim: (default: 0)
 43 |             Input dim in the feature descriptor for each point.  If the point cloud is Nx9, this
 44 |             value should be 6 as in an Nx9 point cloud, 3 of the channels are xyz, and 6 are feature descriptors
 45 |         num_proposal: int (default: 128)
 46 |             Number of proposals/detections generated from the network. Each proposal is a 3D OBB with a semantic class.
 47 |         vote_factor: (default: 1)
 48 |             Number of votes generated from each seed point.
 49 |     """
 50 | 
 51 |     def __init__(self, num_class, num_heading_bin, num_size_cluster, mean_size_arr,
 52 |         input_feature_dim=0, num_proposal=128, vote_factor=1, sampling='vote_fps', with_angle=False):
 53 |         super().__init__()
 54 | 
 55 |         self.num_class = num_class
 56 |         self.num_heading_bin = num_heading_bin
 57 |         self.num_size_cluster = num_size_cluster
 58 |         self.mean_size_arr = mean_size_arr
 59 |         self.input_feature_dim = input_feature_dim
 60 |         self.num_proposal = num_proposal
 61 |         self.vote_factor = vote_factor
 62 |         self.sampling=sampling
 63 | 
 64 |         # Backbone point feature learning: 4 bb tower
 65 |         self.backbone_net1 = Pointnet2Backbone(input_feature_dim=self.input_feature_dim) ### Just xyz + height
 66 |         self.backbone_net2 = Pointnet2Backbone(input_feature_dim=self.input_feature_dim) ### Just xyz + height
 67 |         self.backbone_net3 = Pointnet2Backbone(input_feature_dim=self.input_feature_dim) ### Just xyz + height
 68 |         self.backbone_net4 = Pointnet2Backbone(input_feature_dim=self.input_feature_dim) ### Just xyz + height
 69 | 
 70 |         ### Feature concatenation
 71 |         self.conv_agg1 = torch.nn.Conv1d(256*4,256*2,1) 
 72 |         self.bn_agg1 = torch.nn.BatchNorm1d(256*2)
 73 |         self.conv_agg2 = torch.nn.Conv1d(256*2,256,1)
 74 |         self.bn_agg2 = torch.nn.BatchNorm1d(256)
 75 |     
 76 |         ### Existence flag prediction
 77 |         self.conv_flag_z1 = torch.nn.Conv1d(256,128,1) 
 78 |         self.bn_flag_z1 = torch.nn.BatchNorm1d(128)
 79 |         self.conv_flag_z2 = torch.nn.Conv1d(128,2,1)
 80 | 
 81 |         self.conv_flag_xy1 = torch.nn.Conv1d(256,128,1)
 82 |         self.bn_flag_xy1 = torch.nn.BatchNorm1d(128)
 83 |         self.conv_flag_xy2 = torch.nn.Conv1d(128,2,1)
 84 | 
 85 |         self.conv_flag_line1 = torch.nn.Conv1d(256,128,1)
 86 |         self.bn_flag_line1 = torch.nn.BatchNorm1d(128)
 87 |         self.conv_flag_line2 = torch.nn.Conv1d(128,2,1) 
 88 |         
 89 |         # Hough voting and clustering
 90 |         self.vgen = VotingModule(self.vote_factor, 256)
 91 |         self.vgen_z = VotingModule(self.vote_factor, 256)
 92 |         self.vgen_xy = VotingModule(self.vote_factor, 256)
 93 |         self.vgen_line = VotingModule(self.vote_factor, 256)
 94 |     
 95 |         # Vote aggregation and detection
 96 |         self.pnet_z = PrimitiveModule(num_class, num_heading_bin, num_size_cluster,
 97 |                                      mean_size_arr, num_proposal, sampling, seed_feat_dim=256, numd=2)
 98 |         self.pnet_xy = PrimitiveModule(num_class, num_heading_bin, num_size_cluster,
 99 |                                      mean_size_arr, num_proposal, sampling, seed_feat_dim=256, numd=1)
100 |         self.pnet_line = PrimitiveModule(num_class, num_heading_bin, num_size_cluster,
101 |                                         mean_size_arr, num_proposal, sampling, seed_feat_dim=256, numd=0)
102 |         
103 |         self.pnet_final = ProposalModuleRefine(num_class, num_heading_bin, num_size_cluster,
104 |                                    mean_size_arr, num_proposal, sampling, seed_feat_dim=256, with_angle=with_angle)
105 |         
106 |     def forward(self, inputs, end_points, mode=""):
107 |         """ Forward pass of the network
108 | 
109 |         Args:
110 |             inputs: dict
111 |                 {point_clouds}
112 | 
113 |                 point_clouds: Variable(torch.cuda.FloatTensor)
114 |                     (B, N, 3 + input_channels) tensor
115 |                     Point cloud to run predicts on
116 |                     Each point in the point-cloud MUST
117 |                     be formated as (x, y, z, features...)
118 |         Returns:
119 |             end_points: dict
120 |         """
121 |         batch_size = inputs['point_clouds'].shape[0]
122 | 
123 |         end_points = self.backbone_net1(inputs['point_clouds'], end_points)
124 |         end_points = self.backbone_net2(inputs['point_clouds'], end_points, mode='net1')
125 |         end_points = self.backbone_net3(inputs['point_clouds'], end_points, mode='net2')
126 |         end_points = self.backbone_net4(inputs['point_clouds'], end_points, mode='net3')
127 | 
128 |         ### Extract feature here
129 |         xyz = end_points['fp2_xyz']
130 |         features1 = end_points['fp2_features']
131 |         features2 = end_points['fp2_features'+'net1']
132 |         features3 = end_points['fp2_features'+'net2']
133 |         features4 = end_points['fp2_features'+'net3']
134 |         end_points['seed_inds'] = end_points['fp2_inds']
135 |         end_points['seed_xyz'] = xyz
136 |         end_points['seed_features'] = features1
137 |         
138 |         ### Combine the feature here
139 |         features_hd_discriptor = torch.cat((features1, features2, features3, features4), dim=1)
140 |         features_hd_discriptor = F.relu(self.bn_agg1(self.conv_agg1(features_hd_discriptor)))
141 |         features_hd_discriptor = F.relu(self.bn_agg2(self.conv_agg2(features_hd_discriptor)))
142 | 
143 |         end_points['hd_feature'] = features_hd_discriptor
144 |         
145 |         net_flag_z = F.relu(self.bn_flag_z1(self.conv_flag_z1(features_hd_discriptor)))
146 |         net_flag_z = self.conv_flag_z2(net_flag_z)
147 |         end_points["pred_flag_z"] = net_flag_z
148 | 
149 |         net_flag_xy = F.relu(self.bn_flag_xy1(self.conv_flag_xy1(features_hd_discriptor)))
150 |         net_flag_xy = self.conv_flag_xy2(net_flag_xy)
151 |         end_points["pred_flag_xy"] = net_flag_xy
152 | 
153 |         net_flag_line = F.relu(self.bn_flag_line1(self.conv_flag_line1(features_hd_discriptor)))
154 |         net_flag_line = self.conv_flag_line2(net_flag_line)
155 |         end_points["pred_flag_line"] = net_flag_line
156 | 
157 |         proposal_xyz, proposal_features, center_offset, center_residual = self.vgen(xyz, features_hd_discriptor)
158 |         proposal_features_norm = torch.norm(proposal_features, p=2, dim=1)
159 |         proposal_features = proposal_features.div(proposal_features_norm.unsqueeze(1))
160 |         end_points['vote_xyz'] = proposal_xyz
161 |         end_points['vote_features'] = proposal_features
162 |         
163 |         voted_z, voted_z_feature, z_offset, z_residual = self.vgen_z(xyz, features_hd_discriptor)
164 |         voted_z_feature_norm = torch.norm(voted_z_feature, p=2, dim=1)
165 |         voted_z_feature = voted_z_feature.div(voted_z_feature_norm.unsqueeze(1))
166 |         end_points['vote_z'] = voted_z
167 |         end_points['vote_z_feature'] = voted_z_feature
168 | 
169 |         voted_xy, voted_xy_feature, xy_offset, xy_residual = self.vgen_xy(xyz, features_hd_discriptor)
170 |         voted_xy_feature_norm = torch.norm(voted_xy_feature, p=2, dim=1)
171 |         voted_xy_feature = voted_xy_feature.div(voted_xy_feature_norm.unsqueeze(1))
172 |         end_points['vote_xy'] = voted_xy
173 |         end_points['vote_xy_feature'] = voted_xy_feature
174 | 
175 |         voted_line, voted_line_feature, line_offset, line_residual = self.vgen_line(xyz, features_hd_discriptor)
176 |         voted_line_feature_norm = torch.norm(voted_line_feature, p=2, dim=1)
177 |         voted_line_feature = voted_line_feature.div(voted_line_feature_norm.unsqueeze(1))
178 |         end_points['vote_line'] = voted_line
179 |         end_points['vote_line_feature'] = voted_line_feature
180 |         
181 |         center_z, feature_z, end_points = self.pnet_z(voted_z, voted_z_feature, end_points, mode='_z')
182 |         center_xy, feature_xy, end_points = self.pnet_xy(voted_xy, voted_xy_feature, end_points, mode='_xy')
183 |         center_line, feature_line, end_points = self.pnet_line(voted_line, voted_line_feature, end_points, mode='_line')
184 | 
185 |         end_points = self.pnet_final(proposal_xyz, proposal_features, center_z, feature_z, center_xy, feature_xy, center_line, feature_line, end_points)
186 |         return end_points
187 | 
188 | 


--------------------------------------------------------------------------------
/models/hdnet_1bb.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) Facebook, Inc. and its affiliates.
  2 | # 
  3 | # This source code is licensed under the MIT license found in the
  4 | # LICENSE file in the root directory of this source tree.
  5 | 
  6 | """ Deep hough voting network for 3D object detection in point clouds.
  7 | 
  8 | Author: Charles R. Qi and Or Litany
  9 | """
 10 | 
 11 | import torch
 12 | import torch.nn as nn
 13 | import torch.nn.functional as F
 14 | import numpy as np
 15 | import sys
 16 | import os
 17 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
 18 | ROOT_DIR = os.path.dirname(BASE_DIR)
 19 | sys.path.append(BASE_DIR)
 20 | sys.path.append(os.path.join(ROOT_DIR, 'utils'))
 21 | import pc_util
 22 | 
 23 | from backbone_module_scale import Pointnet2Backbone
 24 | from voting_module import VotingModule
 25 | 
 26 | from proposal_module_refine import ProposalModuleRefine
 27 | from proposal_module_surface import PrimitiveModule
 28 | 
 29 | from dump_helper import dump_results
 30 | from loss_helper import get_loss
 31 | 
 32 | class HDNet_1bb(nn.Module):
 33 |     r"""
 34 |         A deep neural network for 3D object detection with end-to-end optimizable hough voting.
 35 | 
 36 |         Parameters
 37 |         ----------
 38 |         num_class: int
 39 |             Number of semantics classes to predict over -- size of softmax classifier
 40 |         num_heading_bin: int
 41 |         num_size_cluster: int
 42 |         input_feature_dim: (default: 0)
 43 |             Input dim in the feature descriptor for each point.  If the point cloud is Nx9, this
 44 |             value should be 6 as in an Nx9 point cloud, 3 of the channels are xyz, and 6 are feature descriptors
 45 |         num_proposal: int (default: 128)
 46 |             Number of proposals/detections generated from the network. Each proposal is a 3D OBB with a semantic class.
 47 |         vote_factor: (default: 1)
 48 |             Number of votes generated from each seed point.
 49 |     """
 50 | 
 51 |     def __init__(self, num_class, num_heading_bin, num_size_cluster, mean_size_arr,
 52 |                  input_feature_dim=0, num_proposal=128, vote_factor=1, sampling='vote_fps', with_angle=False, scale=1):
 53 |         super().__init__()
 54 | 
 55 |         self.num_class = num_class
 56 |         self.num_heading_bin = num_heading_bin
 57 |         self.num_size_cluster = num_size_cluster
 58 |         self.mean_size_arr = mean_size_arr
 59 |         self.input_feature_dim = input_feature_dim
 60 |         self.num_proposal = num_proposal
 61 |         self.vote_factor = vote_factor
 62 |         self.sampling=sampling
 63 | 
 64 |         # Backbone point feature learning: 4 bb tower
 65 |         self.backbone_net1 = Pointnet2Backbone(input_feature_dim=self.input_feature_dim, scale=scale) ### Just xyz + height
 66 |         scale = max(scale, 2)
 67 |         
 68 |         ### Existence flag prediction
 69 |         self.conv_flag_z1 = torch.nn.Conv1d(256*scale,128,1) 
 70 |         self.bn_flag_z1 = torch.nn.BatchNorm1d(128)
 71 |         self.conv_flag_z2 = torch.nn.Conv1d(128,2,1)
 72 | 
 73 |         self.conv_flag_xy1 = torch.nn.Conv1d(256*scale,128,1)
 74 |         self.bn_flag_xy1 = torch.nn.BatchNorm1d(128)
 75 |         self.conv_flag_xy2 = torch.nn.Conv1d(128,2,1)
 76 | 
 77 |         self.conv_flag_line1 = torch.nn.Conv1d(256*scale,128,1)
 78 |         self.bn_flag_line1 = torch.nn.BatchNorm1d(128)
 79 |         self.conv_flag_line2 = torch.nn.Conv1d(128,2,1) 
 80 |         
 81 |         # Hough voting and clustering
 82 |         self.vgen = VotingModule(self.vote_factor, 256*scale)
 83 |         self.vgen_z = VotingModule(self.vote_factor, 256*scale)
 84 |         self.vgen_xy = VotingModule(self.vote_factor, 256*scale)
 85 |         self.vgen_line = VotingModule(self.vote_factor, 256*scale)
 86 |     
 87 |         # Vote aggregation and detection
 88 |         self.pnet_z = PrimitiveModule(num_class, num_heading_bin, num_size_cluster,
 89 |                                      mean_size_arr, num_proposal, sampling, seed_feat_dim=256*scale, numd=2)
 90 |         self.pnet_xy = PrimitiveModule(num_class, num_heading_bin, num_size_cluster,
 91 |                                      mean_size_arr, num_proposal, sampling, seed_feat_dim=256*scale, numd=1)
 92 |         self.pnet_line = PrimitiveModule(num_class, num_heading_bin, num_size_cluster,
 93 |                                         mean_size_arr, num_proposal, sampling, seed_feat_dim=256*scale, numd=0)
 94 |         
 95 |         self.pnet_final = ProposalModuleRefine(num_class, num_heading_bin, num_size_cluster,
 96 |                                    mean_size_arr, num_proposal, sampling, seed_feat_dim=256*scale, with_angle=with_angle)
 97 |         
 98 |     def forward(self, inputs, end_points, mode=""):
 99 |         """ Forward pass of the network
100 | 
101 |         Args:
102 |             inputs: dict
103 |                 {point_clouds}
104 | 
105 |                 point_clouds: Variable(torch.cuda.FloatTensor)
106 |                     (B, N, 3 + input_channels) tensor
107 |                     Point cloud to run predicts on
108 |                     Each point in the point-cloud MUST
109 |                     be formated as (x, y, z, features...)
110 |         Returns:
111 |             end_points: dict
112 |         """
113 |         batch_size = inputs['point_clouds'].shape[0]
114 | 
115 |         end_points = self.backbone_net1(inputs['point_clouds'], end_points)
116 | 
117 |         ### Extract feature here
118 |         xyz = end_points['fp2_xyz']
119 |         features1 = end_points['fp2_features']
120 |         end_points['seed_inds'] = end_points['fp2_inds']
121 |         end_points['seed_xyz'] = xyz
122 |         end_points['seed_features'] = features1
123 |         
124 |         ### Combine the feature here
125 |         features_hd_discriptor = features1
126 | 
127 |         end_points['hd_feature'] = features_hd_discriptor
128 |         
129 |         net_flag_z = F.relu(self.bn_flag_z1(self.conv_flag_z1(features_hd_discriptor)))
130 |         net_flag_z = self.conv_flag_z2(net_flag_z)
131 |         end_points["pred_flag_z"] = net_flag_z
132 | 
133 |         net_flag_xy = F.relu(self.bn_flag_xy1(self.conv_flag_xy1(features_hd_discriptor)))
134 |         net_flag_xy = self.conv_flag_xy2(net_flag_xy)
135 |         end_points["pred_flag_xy"] = net_flag_xy
136 | 
137 |         net_flag_line = F.relu(self.bn_flag_line1(self.conv_flag_line1(features_hd_discriptor)))
138 |         net_flag_line = self.conv_flag_line2(net_flag_line)
139 |         end_points["pred_flag_line"] = net_flag_line
140 | 
141 |         proposal_xyz, proposal_features, center_offset, center_residual = self.vgen(xyz, features_hd_discriptor)
142 |         proposal_features_norm = torch.norm(proposal_features, p=2, dim=1)
143 |         proposal_features = proposal_features.div(proposal_features_norm.unsqueeze(1))
144 |         end_points['vote_xyz'] = proposal_xyz
145 |         end_points['vote_features'] = proposal_features
146 |         
147 |         voted_z, voted_z_feature, z_offset, z_residual = self.vgen_z(xyz, features_hd_discriptor)
148 |         voted_z_feature_norm = torch.norm(voted_z_feature, p=2, dim=1)
149 |         voted_z_feature = voted_z_feature.div(voted_z_feature_norm.unsqueeze(1))
150 |         end_points['vote_z'] = voted_z
151 |         end_points['vote_z_feature'] = voted_z_feature
152 | 
153 |         voted_xy, voted_xy_feature, xy_offset, xy_residual = self.vgen_xy(xyz, features_hd_discriptor)
154 |         voted_xy_feature_norm = torch.norm(voted_xy_feature, p=2, dim=1)
155 |         voted_xy_feature = voted_xy_feature.div(voted_xy_feature_norm.unsqueeze(1))
156 |         end_points['vote_xy'] = voted_xy
157 |         end_points['vote_xy_feature'] = voted_xy_feature
158 | 
159 |         voted_line, voted_line_feature, line_offset, line_residual = self.vgen_line(xyz, features_hd_discriptor)
160 |         voted_line_feature_norm = torch.norm(voted_line_feature, p=2, dim=1)
161 |         voted_line_feature = voted_line_feature.div(voted_line_feature_norm.unsqueeze(1))
162 |         end_points['vote_line'] = voted_line
163 |         end_points['vote_line_feature'] = voted_line_feature
164 |         
165 |         center_z, feature_z, end_points = self.pnet_z(voted_z, voted_z_feature, end_points, mode='_z')
166 |         center_xy, feature_xy, end_points = self.pnet_xy(voted_xy, voted_xy_feature, end_points, mode='_xy')
167 |         center_line, feature_line, end_points = self.pnet_line(voted_line, voted_line_feature, end_points, mode='_line')
168 | 
169 |         end_points = self.pnet_final(proposal_xyz, proposal_features, center_z, feature_z, center_xy, feature_xy, center_line, feature_line, end_points)
170 |         return end_points
171 | 
172 | 


--------------------------------------------------------------------------------
/models/proposal_module_surface.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) Facebook, Inc. and its affiliates.
  2 | # 
  3 | # This source code is licensed under the MIT license found in the
  4 | # LICENSE file in the root directory of this source tree.
  5 | 
  6 | import torch
  7 | import torch.nn as nn
  8 | import torch.nn.functional as F
  9 | import numpy as np
 10 | import os
 11 | import sys
 12 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
 13 | ROOT_DIR = os.path.dirname(BASE_DIR)
 14 | sys.path.append(os.path.join(ROOT_DIR, 'pointnet2'))
 15 | from pointnet2_modules import PointnetSAModuleVotes
 16 | import pointnet2_utils
 17 | 
 18 | def decode_scores(net, end_points, num_class, mode=''):
 19 |     net_transposed = net.transpose(2,1) # (batch_size, 1024, ..)
 20 |     batch_size = net_transposed.shape[0]
 21 |     num_proposal = net_transposed.shape[1]
 22 |     
 23 |     base_xyz = end_points['aggregated_vote_xyz'+mode] # (batch_size, num_proposal, 3)
 24 |     center = base_xyz + net_transposed[:,:,0:3] # (batch_size, num_proposal, 3)
 25 |     end_points['center'+mode] = center
 26 | 
 27 |     if mode == '_z':
 28 |         end_points['size_residuals'+mode] = net_transposed[:,:,3:5]
 29 |         sem_cls_scores = net_transposed[:,:,5:] # Bxnum_proposalx10
 30 |         end_points['sem_cls_scores'+mode] = sem_cls_scores
 31 |     elif mode == '_xy':
 32 |         end_points['size_residuals'+mode] = net_transposed[:,:,3:4]
 33 |         sem_cls_scores = net_transposed[:,:,4:] # Bxnum_proposalx10
 34 |         end_points['sem_cls_scores'+mode] = sem_cls_scores
 35 |     else:
 36 |         sem_cls_scores = net_transposed[:,:,3:] # Bxnum_proposalx10
 37 |         end_points['sem_cls_scores'+mode] = sem_cls_scores
 38 |     return center, end_points
 39 | 
 40 | 
 41 | class PrimitiveModule(nn.Module):
 42 |     def __init__(self, num_class, num_heading_bin, num_size_cluster, mean_size_arr, num_proposal, sampling, seed_feat_dim=256, numd=1):
 43 |         super().__init__() 
 44 | 
 45 |         self.num_class = num_class
 46 |         self.num_heading_bin = num_heading_bin
 47 |         self.num_size_cluster = num_size_cluster
 48 |         self.mean_size_arr = mean_size_arr
 49 |         self.num_proposal = num_proposal
 50 |         self.sampling = sampling
 51 |         self.seed_feat_dim = seed_feat_dim
 52 | 
 53 |         # Vote clustering
 54 |         self.vote_aggregation = PointnetSAModuleVotes( 
 55 |                 npoint=self.num_proposal,
 56 |                 radius=0.3,
 57 |                 nsample=16,
 58 |                 mlp=[self.seed_feat_dim, 128, 128, 128],
 59 |                 use_xyz=True,
 60 |                 normalize_xyz=True,
 61 |                 same_idx=True
 62 |             )
 63 |     
 64 |         # Object proposal/detection
 65 |         # Objectness scores (2), center residual (3),
 66 |         # heading class+residual (num_heading_bin*2), size class+residual(num_size_cluster*4)
 67 |         self.conv1 = torch.nn.Conv1d(128,128,1)
 68 |         self.conv2 = torch.nn.Conv1d(128,128,1)
 69 |         self.conv3 = torch.nn.Conv1d(128,3+numd+self.num_class,1)
 70 |         self.bn1 = torch.nn.BatchNorm1d(128)
 71 |         self.bn2 = torch.nn.BatchNorm1d(128)
 72 | 
 73 |     def forward(self, xyz, features, end_points, mode=''):
 74 |         """
 75 |         Args:
 76 |             xyz: (B,K,3)
 77 |             features: (B,C,K)
 78 |         Returns:
 79 |             scores: (B,num_proposal,2+3+NH*2+NS*4) 
 80 |         """
 81 |         if self.sampling == 'vote_fps':
 82 |             # Farthest point sampling (FPS) on votes
 83 |             original_feature = features
 84 |             xyz, features, fps_inds = self.vote_aggregation(xyz, features)
 85 |             #original_feature = torch.gather(original_features, 2, fps_inds.unsqueeze(1).repeat(1,256,1).detach().long()).contiguous()
 86 |             sample_inds = fps_inds
 87 |         elif self.sampling == 'seed_fps': 
 88 |             # FPS on seed and choose the votes corresponding to the seeds
 89 |             # This gets us a slightly better coverage of *object* votes than vote_fps (which tends to get more cluster votes)
 90 |             sample_inds = pointnet2_utils.furthest_point_sample(end_points['seed_xyz'], self.num_proposal)
 91 |             xyz, features, _ = self.vote_aggregation(xyz, features, sample_inds)
 92 |         elif self.sampling == 'random':
 93 |             # Random sampling from the votes
 94 |             num_seed = end_points['seed_xyz'].shape[1]
 95 |             sample_inds = torch.randint(0, num_seed, (batch_size, self.num_proposal), dtype=torch.int).cuda()
 96 |             xyz, features, _ = self.vote_aggregation(xyz, features, sample_inds)
 97 |         else:
 98 |             log_string('Unknown sampling strategy: %s. Exiting!'%(self.sampling))
 99 |             exit()
100 |         end_points['aggregated_vote_xyz'+mode] = xyz # (batch_size, num_proposal, 3)
101 |         end_points['aggregated_vote_inds'+mode] = sample_inds # (batch_size, num_proposal,) # should be 0,1,2,...,num_proposal
102 |         end_points['aggregated_feature'+mode] = features
103 |         
104 |         # --------- PROPOSAL GENERATION ---------
105 |         net = F.relu(self.bn1(self.conv1(features))) 
106 |         last_net = F.relu(self.bn2(self.conv2(net))) 
107 |         net = self.conv3(last_net) # (batch_size, 2+3+num_heading_bin*2+num_size_cluster*4, num_proposal)
108 | 
109 |         newcenter, end_points = decode_scores(net, end_points, self.num_class, mode=mode)
110 |         return newcenter.contiguous(), features.contiguous(), end_points
111 | 
112 | 
113 | 


--------------------------------------------------------------------------------
/models/voting_module.py:
--------------------------------------------------------------------------------
 1 | # Copyright (c) Facebook, Inc. and its affiliates.
 2 | # 
 3 | # This source code is licensed under the MIT license found in the
 4 | # LICENSE file in the root directory of this source tree.
 5 | 
 6 | ''' Voting module: generate votes from XYZ and features of seed points.
 7 | 
 8 | Date: July, 2019
 9 | Author: Charles R. Qi and Or Litany
10 | '''
11 | 
12 | import torch
13 | import torch.nn as nn
14 | import torch.nn.functional as F
15 | 
16 | class VotingModule(nn.Module):
17 |     def __init__(self, vote_factor, seed_feature_dim):
18 |         """ Votes generation from seed point features.
19 | 
20 |         Args:
21 |             vote_facotr: int
22 |                 number of votes generated from each seed point
23 |             seed_feature_dim: int
24 |                 number of channels of seed point features
25 |             vote_feature_dim: int
26 |                 number of channels of vote features
27 |         """
28 |         super().__init__()
29 |         self.vote_factor = vote_factor
30 |         self.in_dim = seed_feature_dim
31 |         self.out_dim = self.in_dim # due to residual feature, in_dim has to be == out_dim
32 |         self.conv1 = torch.nn.Conv1d(self.in_dim, self.in_dim, 1)
33 |         self.conv2 = torch.nn.Conv1d(self.in_dim, self.in_dim, 1)
34 |         self.conv3 = torch.nn.Conv1d(self.in_dim, (3+self.out_dim) * self.vote_factor, 1)
35 |         self.bn1 = torch.nn.BatchNorm1d(self.in_dim)
36 |         self.bn2 = torch.nn.BatchNorm1d(self.in_dim)
37 |         
38 |     def forward(self, seed_xyz, seed_features):
39 |         """ Forward pass.
40 | 
41 |         Arguments:
42 |             seed_xyz: (batch_size, num_seed, 3) Pytorch tensor
43 |             seed_features: (batch_size, feature_dim, num_seed) Pytorch tensor
44 |         Returns:
45 |             vote_xyz: (batch_size, num_seed*vote_factor, 3)
46 |             vote_features: (batch_size, vote_feature_dim, num_seed*vote_factor)
47 |         """
48 |         batch_size = seed_xyz.shape[0]
49 |         num_seed = seed_xyz.shape[1]
50 |         num_vote = num_seed*self.vote_factor
51 |         net = F.relu(self.bn1(self.conv1(seed_features))) 
52 |         net = F.relu(self.bn2(self.conv2(net)))
53 |         net = self.conv3(net) # (batch_size, (3+out_dim)*vote_factor, num_seed)
54 |                 
55 |         net = net.transpose(2,1).view(batch_size, num_seed, self.vote_factor, 3+self.out_dim)
56 |         offset = net[:,:,:,:3]
57 |         vote_xyz = seed_xyz.unsqueeze(2) + offset
58 |         vote_xyz = vote_xyz.contiguous().view(batch_size, num_vote, 3)
59 |         
60 |         residual_features = net[:,:,:,3:] # (batch_size, num_seed, vote_factor, out_dim)
61 |         vote_features = seed_features.transpose(2,1).unsqueeze(2) + residual_features
62 |         vote_features = vote_features.contiguous().view(batch_size, num_vote, self.out_dim)
63 |         vote_features = vote_features.transpose(2,1).contiguous()
64 |         
65 |         return vote_xyz, vote_features, offset.squeeze(2), residual_features
66 |  
67 | if __name__=='__main__':
68 |     net = VotingModule(2, 256).cuda()
69 |     xyz, features = net(torch.rand(8,1024,3).cuda(), torch.rand(8,256,1024).cuda())
70 |     print('xyz', xyz.shape)
71 |     print('features', features.shape)
72 | 
73 | 


--------------------------------------------------------------------------------
/overview.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zaiweizhang/H3DNet/81bd6af37cb131fd9e81774f52f29a0f3b0a0f43/overview.jpg


--------------------------------------------------------------------------------
/pointnet2/_ext_src/include/ball_query.h:
--------------------------------------------------------------------------------
 1 | // Copyright (c) Facebook, Inc. and its affiliates.
 2 | // 
 3 | // This source code is licensed under the MIT license found in the
 4 | // LICENSE file in the root directory of this source tree.
 5 | 
 6 | #pragma once
 7 | #include <torch/extension.h>
 8 | 
 9 | at::Tensor ball_query(at::Tensor new_xyz, at::Tensor xyz, const float radius,
10 |                       const int nsample);
11 | 


--------------------------------------------------------------------------------
/pointnet2/_ext_src/include/cuda_utils.h:
--------------------------------------------------------------------------------
 1 | // Copyright (c) Facebook, Inc. and its affiliates.
 2 | // 
 3 | // This source code is licensed under the MIT license found in the
 4 | // LICENSE file in the root directory of this source tree.
 5 | 
 6 | #ifndef _CUDA_UTILS_H
 7 | #define _CUDA_UTILS_H
 8 | 
 9 | #include <ATen/ATen.h>
10 | #include <ATen/cuda/CUDAContext.h>
11 | #include <cmath>
12 | 
13 | #include <cuda.h>
14 | #include <cuda_runtime.h>
15 | 
16 | #include <vector>
17 | 
18 | #define TOTAL_THREADS 512
19 | 
20 | inline int opt_n_threads(int work_size) {
21 |   const int pow_2 = std::log(static_cast<double>(work_size)) / std::log(2.0);
22 | 
23 |   return max(min(1 << pow_2, TOTAL_THREADS), 1);
24 | }
25 | 
26 | inline dim3 opt_block_config(int x, int y) {
27 |   const int x_threads = opt_n_threads(x);
28 |   const int y_threads =
29 |       max(min(opt_n_threads(y), TOTAL_THREADS / x_threads), 1);
30 |   dim3 block_config(x_threads, y_threads, 1);
31 | 
32 |   return block_config;
33 | }
34 | 
35 | #define CUDA_CHECK_ERRORS()                                           \
36 |   do {                                                                \
37 |     cudaError_t err = cudaGetLastError();                             \
38 |     if (cudaSuccess != err) {                                         \
39 |       fprintf(stderr, "CUDA kernel failed : %s\n%s at L:%d in %s\n",  \
40 |               cudaGetErrorString(err), __PRETTY_FUNCTION__, __LINE__, \
41 |               __FILE__);                                              \
42 |       exit(-1);                                                       \
43 |     }                                                                 \
44 |   } while (0)
45 | 
46 | #endif
47 | 


--------------------------------------------------------------------------------
/pointnet2/_ext_src/include/group_points.h:
--------------------------------------------------------------------------------
 1 | // Copyright (c) Facebook, Inc. and its affiliates.
 2 | // 
 3 | // This source code is licensed under the MIT license found in the
 4 | // LICENSE file in the root directory of this source tree.
 5 | 
 6 | #pragma once
 7 | #include <torch/extension.h>
 8 | 
 9 | at::Tensor group_points(at::Tensor points, at::Tensor idx);
10 | at::Tensor group_points_grad(at::Tensor grad_out, at::Tensor idx, const int n);
11 | 


--------------------------------------------------------------------------------
/pointnet2/_ext_src/include/interpolate.h:
--------------------------------------------------------------------------------
 1 | // Copyright (c) Facebook, Inc. and its affiliates.
 2 | // 
 3 | // This source code is licensed under the MIT license found in the
 4 | // LICENSE file in the root directory of this source tree.
 5 | 
 6 | #pragma once
 7 | 
 8 | #include <torch/extension.h>
 9 | #include <vector>
10 | 
11 | std::vector<at::Tensor> three_nn(at::Tensor unknowns, at::Tensor knows);
12 | at::Tensor three_interpolate(at::Tensor points, at::Tensor idx,
13 |                              at::Tensor weight);
14 | at::Tensor three_interpolate_grad(at::Tensor grad_out, at::Tensor idx,
15 |                                   at::Tensor weight, const int m);
16 | 


--------------------------------------------------------------------------------
/pointnet2/_ext_src/include/sampling.h:
--------------------------------------------------------------------------------
 1 | // Copyright (c) Facebook, Inc. and its affiliates.
 2 | // 
 3 | // This source code is licensed under the MIT license found in the
 4 | // LICENSE file in the root directory of this source tree.
 5 | 
 6 | #pragma once
 7 | #include <torch/extension.h>
 8 | 
 9 | at::Tensor gather_points(at::Tensor points, at::Tensor idx);
10 | at::Tensor gather_points_grad(at::Tensor grad_out, at::Tensor idx, const int n);
11 | at::Tensor furthest_point_sampling(at::Tensor points, const int nsamples);
12 | 


--------------------------------------------------------------------------------
/pointnet2/_ext_src/include/utils.h:
--------------------------------------------------------------------------------
 1 | // Copyright (c) Facebook, Inc. and its affiliates.
 2 | // 
 3 | // This source code is licensed under the MIT license found in the
 4 | // LICENSE file in the root directory of this source tree.
 5 | 
 6 | #pragma once
 7 | #include <ATen/cuda/CUDAContext.h>
 8 | #include <torch/extension.h>
 9 | 
10 | #define CHECK_CUDA(x)                                          \
11 |   do {                                                         \
12 |     AT_CHECK(x.type().is_cuda(), #x " must be a CUDA tensor"); \
13 |   } while (0)
14 | 
15 | #define CHECK_CONTIGUOUS(x)                                         \
16 |   do {                                                              \
17 |     AT_CHECK(x.is_contiguous(), #x " must be a contiguous tensor"); \
18 |   } while (0)
19 | 
20 | #define CHECK_IS_INT(x)                              \
21 |   do {                                               \
22 |     AT_CHECK(x.scalar_type() == at::ScalarType::Int, \
23 |              #x " must be an int tensor");           \
24 |   } while (0)
25 | 
26 | #define CHECK_IS_FLOAT(x)                              \
27 |   do {                                                 \
28 |     AT_CHECK(x.scalar_type() == at::ScalarType::Float, \
29 |              #x " must be a float tensor");            \
30 |   } while (0)
31 | 


--------------------------------------------------------------------------------
/pointnet2/_ext_src/src/ball_query.cpp:
--------------------------------------------------------------------------------
 1 | // Copyright (c) Facebook, Inc. and its affiliates.
 2 | // 
 3 | // This source code is licensed under the MIT license found in the
 4 | // LICENSE file in the root directory of this source tree.
 5 | 
 6 | #include "ball_query.h"
 7 | #include "utils.h"
 8 | 
 9 | void query_ball_point_kernel_wrapper(int b, int n, int m, float radius,
10 |                                      int nsample, const float *new_xyz,
11 |                                      const float *xyz, int *idx);
12 | 
13 | at::Tensor ball_query(at::Tensor new_xyz, at::Tensor xyz, const float radius,
14 |                       const int nsample) {
15 |   CHECK_CONTIGUOUS(new_xyz);
16 |   CHECK_CONTIGUOUS(xyz);
17 |   CHECK_IS_FLOAT(new_xyz);
18 |   CHECK_IS_FLOAT(xyz);
19 | 
20 |   if (new_xyz.type().is_cuda()) {
21 |     CHECK_CUDA(xyz);
22 |   }
23 | 
24 |   at::Tensor idx =
25 |       torch::zeros({new_xyz.size(0), new_xyz.size(1), nsample},
26 |                    at::device(new_xyz.device()).dtype(at::ScalarType::Int));
27 | 
28 |   if (new_xyz.type().is_cuda()) {
29 |     query_ball_point_kernel_wrapper(xyz.size(0), xyz.size(1), new_xyz.size(1),
30 |                                     radius, nsample, new_xyz.data<float>(),
31 |                                     xyz.data<float>(), idx.data<int>());
32 |   } else {
33 |     AT_CHECK(false, "CPU not supported");
34 |   }
35 | 
36 |   return idx;
37 | }
38 | 


--------------------------------------------------------------------------------
/pointnet2/_ext_src/src/ball_query_gpu.cu:
--------------------------------------------------------------------------------
 1 | // Copyright (c) Facebook, Inc. and its affiliates.
 2 | // 
 3 | // This source code is licensed under the MIT license found in the
 4 | // LICENSE file in the root directory of this source tree.
 5 | 
 6 | #include <math.h>
 7 | #include <stdio.h>
 8 | #include <stdlib.h>
 9 | 
10 | #include "cuda_utils.h"
11 | 
12 | // input: new_xyz(b, m, 3) xyz(b, n, 3)
13 | // output: idx(b, m, nsample)
14 | __global__ void query_ball_point_kernel(int b, int n, int m, float radius,
15 |                                         int nsample,
16 |                                         const float *__restrict__ new_xyz,
17 |                                         const float *__restrict__ xyz,
18 |                                         int *__restrict__ idx) {
19 |   int batch_index = blockIdx.x;
20 |   xyz += batch_index * n * 3;
21 |   new_xyz += batch_index * m * 3;
22 |   idx += m * nsample * batch_index;
23 | 
24 |   int index = threadIdx.x;
25 |   int stride = blockDim.x;
26 | 
27 |   float radius2 = radius * radius;
28 |   for (int j = index; j < m; j += stride) {
29 |     float new_x = new_xyz[j * 3 + 0];
30 |     float new_y = new_xyz[j * 3 + 1];
31 |     float new_z = new_xyz[j * 3 + 2];
32 |     for (int k = 0, cnt = 0; k < n && cnt < nsample; ++k) {
33 |       float x = xyz[k * 3 + 0];
34 |       float y = xyz[k * 3 + 1];
35 |       float z = xyz[k * 3 + 2];
36 |       float d2 = (new_x - x) * (new_x - x) + (new_y - y) * (new_y - y) +
37 |                  (new_z - z) * (new_z - z);
38 |       if (d2 < radius2) {
39 |         if (cnt == 0) {
40 |           for (int l = 0; l < nsample; ++l) {
41 |             idx[j * nsample + l] = k;
42 |           }
43 |         }
44 |         idx[j * nsample + cnt] = k;
45 |         ++cnt;
46 |       }
47 |     }
48 |   }
49 | }
50 | 
51 | void query_ball_point_kernel_wrapper(int b, int n, int m, float radius,
52 |                                      int nsample, const float *new_xyz,
53 |                                      const float *xyz, int *idx) {
54 |   cudaStream_t stream = at::cuda::getCurrentCUDAStream();
55 |   query_ball_point_kernel<<<b, opt_n_threads(m), 0, stream>>>(
56 |       b, n, m, radius, nsample, new_xyz, xyz, idx);
57 | 
58 |   CUDA_CHECK_ERRORS();
59 | }
60 | 


--------------------------------------------------------------------------------
/pointnet2/_ext_src/src/bindings.cpp:
--------------------------------------------------------------------------------
 1 | // Copyright (c) Facebook, Inc. and its affiliates.
 2 | // 
 3 | // This source code is licensed under the MIT license found in the
 4 | // LICENSE file in the root directory of this source tree.
 5 | 
 6 | #include "ball_query.h"
 7 | #include "group_points.h"
 8 | #include "interpolate.h"
 9 | #include "sampling.h"
10 | 
11 | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
12 |   m.def("gather_points", &gather_points);
13 |   m.def("gather_points_grad", &gather_points_grad);
14 |   m.def("furthest_point_sampling", &furthest_point_sampling);
15 | 
16 |   m.def("three_nn", &three_nn);
17 |   m.def("three_interpolate", &three_interpolate);
18 |   m.def("three_interpolate_grad", &three_interpolate_grad);
19 | 
20 |   m.def("ball_query", &ball_query);
21 | 
22 |   m.def("group_points", &group_points);
23 |   m.def("group_points_grad", &group_points_grad);
24 | }
25 | 


--------------------------------------------------------------------------------
/pointnet2/_ext_src/src/group_points.cpp:
--------------------------------------------------------------------------------
 1 | // Copyright (c) Facebook, Inc. and its affiliates.
 2 | // 
 3 | // This source code is licensed under the MIT license found in the
 4 | // LICENSE file in the root directory of this source tree.
 5 | 
 6 | #include "group_points.h"
 7 | #include "utils.h"
 8 | 
 9 | void group_points_kernel_wrapper(int b, int c, int n, int npoints, int nsample,
10 |                                  const float *points, const int *idx,
11 |                                  float *out);
12 | 
13 | void group_points_grad_kernel_wrapper(int b, int c, int n, int npoints,
14 |                                       int nsample, const float *grad_out,
15 |                                       const int *idx, float *grad_points);
16 | 
17 | at::Tensor group_points(at::Tensor points, at::Tensor idx) {
18 |   CHECK_CONTIGUOUS(points);
19 |   CHECK_CONTIGUOUS(idx);
20 |   CHECK_IS_FLOAT(points);
21 |   CHECK_IS_INT(idx);
22 | 
23 |   if (points.type().is_cuda()) {
24 |     CHECK_CUDA(idx);
25 |   }
26 | 
27 |   at::Tensor output =
28 |       torch::zeros({points.size(0), points.size(1), idx.size(1), idx.size(2)},
29 |                    at::device(points.device()).dtype(at::ScalarType::Float));
30 | 
31 |   if (points.type().is_cuda()) {
32 |     group_points_kernel_wrapper(points.size(0), points.size(1), points.size(2),
33 |                                 idx.size(1), idx.size(2), points.data<float>(),
34 |                                 idx.data<int>(), output.data<float>());
35 |   } else {
36 |     AT_CHECK(false, "CPU not supported");
37 |   }
38 | 
39 |   return output;
40 | }
41 | 
42 | at::Tensor group_points_grad(at::Tensor grad_out, at::Tensor idx, const int n) {
43 |   CHECK_CONTIGUOUS(grad_out);
44 |   CHECK_CONTIGUOUS(idx);
45 |   CHECK_IS_FLOAT(grad_out);
46 |   CHECK_IS_INT(idx);
47 | 
48 |   if (grad_out.type().is_cuda()) {
49 |     CHECK_CUDA(idx);
50 |   }
51 | 
52 |   at::Tensor output =
53 |       torch::zeros({grad_out.size(0), grad_out.size(1), n},
54 |                    at::device(grad_out.device()).dtype(at::ScalarType::Float));
55 | 
56 |   if (grad_out.type().is_cuda()) {
57 |     group_points_grad_kernel_wrapper(
58 |         grad_out.size(0), grad_out.size(1), n, idx.size(1), idx.size(2),
59 |         grad_out.data<float>(), idx.data<int>(), output.data<float>());
60 |   } else {
61 |     AT_CHECK(false, "CPU not supported");
62 |   }
63 | 
64 |   return output;
65 | }
66 | 


--------------------------------------------------------------------------------
/pointnet2/_ext_src/src/group_points_gpu.cu:
--------------------------------------------------------------------------------
 1 | // Copyright (c) Facebook, Inc. and its affiliates.
 2 | // 
 3 | // This source code is licensed under the MIT license found in the
 4 | // LICENSE file in the root directory of this source tree.
 5 | 
 6 | #include <stdio.h>
 7 | #include <stdlib.h>
 8 | 
 9 | #include "cuda_utils.h"
10 | 
11 | // input: points(b, c, n) idx(b, npoints, nsample)
12 | // output: out(b, c, npoints, nsample)
13 | __global__ void group_points_kernel(int b, int c, int n, int npoints,
14 |                                     int nsample,
15 |                                     const float *__restrict__ points,
16 |                                     const int *__restrict__ idx,
17 |                                     float *__restrict__ out) {
18 |   int batch_index = blockIdx.x;
19 |   points += batch_index * n * c;
20 |   idx += batch_index * npoints * nsample;
21 |   out += batch_index * npoints * nsample * c;
22 | 
23 |   const int index = threadIdx.y * blockDim.x + threadIdx.x;
24 |   const int stride = blockDim.y * blockDim.x;
25 |   for (int i = index; i < c * npoints; i += stride) {
26 |     const int l = i / npoints;
27 |     const int j = i % npoints;
28 |     for (int k = 0; k < nsample; ++k) {
29 |       int ii = idx[j * nsample + k];
30 |       out[(l * npoints + j) * nsample + k] = points[l * n + ii];
31 |     }
32 |   }
33 | }
34 | 
35 | void group_points_kernel_wrapper(int b, int c, int n, int npoints, int nsample,
36 |                                  const float *points, const int *idx,
37 |                                  float *out) {
38 |   cudaStream_t stream = at::cuda::getCurrentCUDAStream();
39 | 
40 |   group_points_kernel<<<b, opt_block_config(npoints, c), 0, stream>>>(
41 |       b, c, n, npoints, nsample, points, idx, out);
42 | 
43 |   CUDA_CHECK_ERRORS();
44 | }
45 | 
46 | // input: grad_out(b, c, npoints, nsample), idx(b, npoints, nsample)
47 | // output: grad_points(b, c, n)
48 | __global__ void group_points_grad_kernel(int b, int c, int n, int npoints,
49 |                                          int nsample,
50 |                                          const float *__restrict__ grad_out,
51 |                                          const int *__restrict__ idx,
52 |                                          float *__restrict__ grad_points) {
53 |   int batch_index = blockIdx.x;
54 |   grad_out += batch_index * npoints * nsample * c;
55 |   idx += batch_index * npoints * nsample;
56 |   grad_points += batch_index * n * c;
57 | 
58 |   const int index = threadIdx.y * blockDim.x + threadIdx.x;
59 |   const int stride = blockDim.y * blockDim.x;
60 |   for (int i = index; i < c * npoints; i += stride) {
61 |     const int l = i / npoints;
62 |     const int j = i % npoints;
63 |     for (int k = 0; k < nsample; ++k) {
64 |       int ii = idx[j * nsample + k];
65 |       atomicAdd(grad_points + l * n + ii,
66 |                 grad_out[(l * npoints + j) * nsample + k]);
67 |     }
68 |   }
69 | }
70 | 
71 | void group_points_grad_kernel_wrapper(int b, int c, int n, int npoints,
72 |                                       int nsample, const float *grad_out,
73 |                                       const int *idx, float *grad_points) {
74 |   cudaStream_t stream = at::cuda::getCurrentCUDAStream();
75 | 
76 |   group_points_grad_kernel<<<b, opt_block_config(npoints, c), 0, stream>>>(
77 |       b, c, n, npoints, nsample, grad_out, idx, grad_points);
78 | 
79 |   CUDA_CHECK_ERRORS();
80 | }
81 | 


--------------------------------------------------------------------------------
/pointnet2/_ext_src/src/interpolate.cpp:
--------------------------------------------------------------------------------
  1 | // Copyright (c) Facebook, Inc. and its affiliates.
  2 | // 
  3 | // This source code is licensed under the MIT license found in the
  4 | // LICENSE file in the root directory of this source tree.
  5 | 
  6 | #include "interpolate.h"
  7 | #include "utils.h"
  8 | 
  9 | void three_nn_kernel_wrapper(int b, int n, int m, const float *unknown,
 10 |                              const float *known, float *dist2, int *idx);
 11 | void three_interpolate_kernel_wrapper(int b, int c, int m, int n,
 12 |                                       const float *points, const int *idx,
 13 |                                       const float *weight, float *out);
 14 | void three_interpolate_grad_kernel_wrapper(int b, int c, int n, int m,
 15 |                                            const float *grad_out,
 16 |                                            const int *idx, const float *weight,
 17 |                                            float *grad_points);
 18 | 
 19 | std::vector<at::Tensor> three_nn(at::Tensor unknowns, at::Tensor knows) {
 20 |   CHECK_CONTIGUOUS(unknowns);
 21 |   CHECK_CONTIGUOUS(knows);
 22 |   CHECK_IS_FLOAT(unknowns);
 23 |   CHECK_IS_FLOAT(knows);
 24 | 
 25 |   if (unknowns.type().is_cuda()) {
 26 |     CHECK_CUDA(knows);
 27 |   }
 28 | 
 29 |   at::Tensor idx =
 30 |       torch::zeros({unknowns.size(0), unknowns.size(1), 3},
 31 |                    at::device(unknowns.device()).dtype(at::ScalarType::Int));
 32 |   at::Tensor dist2 =
 33 |       torch::zeros({unknowns.size(0), unknowns.size(1), 3},
 34 |                    at::device(unknowns.device()).dtype(at::ScalarType::Float));
 35 | 
 36 |   if (unknowns.type().is_cuda()) {
 37 |     three_nn_kernel_wrapper(unknowns.size(0), unknowns.size(1), knows.size(1),
 38 |                             unknowns.data<float>(), knows.data<float>(),
 39 |                             dist2.data<float>(), idx.data<int>());
 40 |   } else {
 41 |     AT_CHECK(false, "CPU not supported");
 42 |   }
 43 | 
 44 |   return {dist2, idx};
 45 | }
 46 | 
 47 | at::Tensor three_interpolate(at::Tensor points, at::Tensor idx,
 48 |                              at::Tensor weight) {
 49 |   CHECK_CONTIGUOUS(points);
 50 |   CHECK_CONTIGUOUS(idx);
 51 |   CHECK_CONTIGUOUS(weight);
 52 |   CHECK_IS_FLOAT(points);
 53 |   CHECK_IS_INT(idx);
 54 |   CHECK_IS_FLOAT(weight);
 55 | 
 56 |   if (points.type().is_cuda()) {
 57 |     CHECK_CUDA(idx);
 58 |     CHECK_CUDA(weight);
 59 |   }
 60 | 
 61 |   at::Tensor output =
 62 |       torch::zeros({points.size(0), points.size(1), idx.size(1)},
 63 |                    at::device(points.device()).dtype(at::ScalarType::Float));
 64 | 
 65 |   if (points.type().is_cuda()) {
 66 |     three_interpolate_kernel_wrapper(
 67 |         points.size(0), points.size(1), points.size(2), idx.size(1),
 68 |         points.data<float>(), idx.data<int>(), weight.data<float>(),
 69 |         output.data<float>());
 70 |   } else {
 71 |     AT_CHECK(false, "CPU not supported");
 72 |   }
 73 | 
 74 |   return output;
 75 | }
 76 | at::Tensor three_interpolate_grad(at::Tensor grad_out, at::Tensor idx,
 77 |                                   at::Tensor weight, const int m) {
 78 |   CHECK_CONTIGUOUS(grad_out);
 79 |   CHECK_CONTIGUOUS(idx);
 80 |   CHECK_CONTIGUOUS(weight);
 81 |   CHECK_IS_FLOAT(grad_out);
 82 |   CHECK_IS_INT(idx);
 83 |   CHECK_IS_FLOAT(weight);
 84 | 
 85 |   if (grad_out.type().is_cuda()) {
 86 |     CHECK_CUDA(idx);
 87 |     CHECK_CUDA(weight);
 88 |   }
 89 | 
 90 |   at::Tensor output =
 91 |       torch::zeros({grad_out.size(0), grad_out.size(1), m},
 92 |                    at::device(grad_out.device()).dtype(at::ScalarType::Float));
 93 | 
 94 |   if (grad_out.type().is_cuda()) {
 95 |     three_interpolate_kernel_wrapper(
 96 |         grad_out.size(0), grad_out.size(1), grad_out.size(2), m,
 97 |         grad_out.data<float>(), idx.data<int>(), weight.data<float>(),
 98 |         output.data<float>());
 99 |   } else {
100 |     AT_CHECK(false, "CPU not supported");
101 |   }
102 | 
103 |   return output;
104 | }
105 | 


--------------------------------------------------------------------------------
/pointnet2/_ext_src/src/interpolate_gpu.cu:
--------------------------------------------------------------------------------
  1 | // Copyright (c) Facebook, Inc. and its affiliates.
  2 | // 
  3 | // This source code is licensed under the MIT license found in the
  4 | // LICENSE file in the root directory of this source tree.
  5 | 
  6 | #include <math.h>
  7 | #include <stdio.h>
  8 | #include <stdlib.h>
  9 | 
 10 | #include "cuda_utils.h"
 11 | 
 12 | // input: unknown(b, n, 3) known(b, m, 3)
 13 | // output: dist2(b, n, 3), idx(b, n, 3)
 14 | __global__ void three_nn_kernel(int b, int n, int m,
 15 |                                 const float *__restrict__ unknown,
 16 |                                 const float *__restrict__ known,
 17 |                                 float *__restrict__ dist2,
 18 |                                 int *__restrict__ idx) {
 19 |   int batch_index = blockIdx.x;
 20 |   unknown += batch_index * n * 3;
 21 |   known += batch_index * m * 3;
 22 |   dist2 += batch_index * n * 3;
 23 |   idx += batch_index * n * 3;
 24 | 
 25 |   int index = threadIdx.x;
 26 |   int stride = blockDim.x;
 27 |   for (int j = index; j < n; j += stride) {
 28 |     float ux = unknown[j * 3 + 0];
 29 |     float uy = unknown[j * 3 + 1];
 30 |     float uz = unknown[j * 3 + 2];
 31 | 
 32 |     double best1 = 1e40, best2 = 1e40, best3 = 1e40;
 33 |     int besti1 = 0, besti2 = 0, besti3 = 0;
 34 |     for (int k = 0; k < m; ++k) {
 35 |       float x = known[k * 3 + 0];
 36 |       float y = known[k * 3 + 1];
 37 |       float z = known[k * 3 + 2];
 38 |       float d = (ux - x) * (ux - x) + (uy - y) * (uy - y) + (uz - z) * (uz - z);
 39 |       if (d < best1) {
 40 |         best3 = best2;
 41 |         besti3 = besti2;
 42 |         best2 = best1;
 43 |         besti2 = besti1;
 44 |         best1 = d;
 45 |         besti1 = k;
 46 |       } else if (d < best2) {
 47 |         best3 = best2;
 48 |         besti3 = besti2;
 49 |         best2 = d;
 50 |         besti2 = k;
 51 |       } else if (d < best3) {
 52 |         best3 = d;
 53 |         besti3 = k;
 54 |       }
 55 |     }
 56 |     dist2[j * 3 + 0] = best1;
 57 |     dist2[j * 3 + 1] = best2;
 58 |     dist2[j * 3 + 2] = best3;
 59 | 
 60 |     idx[j * 3 + 0] = besti1;
 61 |     idx[j * 3 + 1] = besti2;
 62 |     idx[j * 3 + 2] = besti3;
 63 |   }
 64 | }
 65 | 
 66 | void three_nn_kernel_wrapper(int b, int n, int m, const float *unknown,
 67 |                              const float *known, float *dist2, int *idx) {
 68 |   cudaStream_t stream = at::cuda::getCurrentCUDAStream();
 69 |   three_nn_kernel<<<b, opt_n_threads(n), 0, stream>>>(b, n, m, unknown, known,
 70 |                                                       dist2, idx);
 71 | 
 72 |   CUDA_CHECK_ERRORS();
 73 | }
 74 | 
 75 | // input: points(b, c, m), idx(b, n, 3), weight(b, n, 3)
 76 | // output: out(b, c, n)
 77 | __global__ void three_interpolate_kernel(int b, int c, int m, int n,
 78 |                                          const float *__restrict__ points,
 79 |                                          const int *__restrict__ idx,
 80 |                                          const float *__restrict__ weight,
 81 |                                          float *__restrict__ out) {
 82 |   int batch_index = blockIdx.x;
 83 |   points += batch_index * m * c;
 84 | 
 85 |   idx += batch_index * n * 3;
 86 |   weight += batch_index * n * 3;
 87 | 
 88 |   out += batch_index * n * c;
 89 | 
 90 |   const int index = threadIdx.y * blockDim.x + threadIdx.x;
 91 |   const int stride = blockDim.y * blockDim.x;
 92 |   for (int i = index; i < c * n; i += stride) {
 93 |     const int l = i / n;
 94 |     const int j = i % n;
 95 |     float w1 = weight[j * 3 + 0];
 96 |     float w2 = weight[j * 3 + 1];
 97 |     float w3 = weight[j * 3 + 2];
 98 | 
 99 |     int i1 = idx[j * 3 + 0];
100 |     int i2 = idx[j * 3 + 1];
101 |     int i3 = idx[j * 3 + 2];
102 | 
103 |     out[i] = points[l * m + i1] * w1 + points[l * m + i2] * w2 +
104 |              points[l * m + i3] * w3;
105 |   }
106 | }
107 | 
108 | void three_interpolate_kernel_wrapper(int b, int c, int m, int n,
109 |                                       const float *points, const int *idx,
110 |                                       const float *weight, float *out) {
111 |   cudaStream_t stream = at::cuda::getCurrentCUDAStream();
112 |   three_interpolate_kernel<<<b, opt_block_config(n, c), 0, stream>>>(
113 |       b, c, m, n, points, idx, weight, out);
114 | 
115 |   CUDA_CHECK_ERRORS();
116 | }
117 | 
118 | // input: grad_out(b, c, n), idx(b, n, 3), weight(b, n, 3)
119 | // output: grad_points(b, c, m)
120 | 
121 | __global__ void three_interpolate_grad_kernel(
122 |     int b, int c, int n, int m, const float *__restrict__ grad_out,
123 |     const int *__restrict__ idx, const float *__restrict__ weight,
124 |     float *__restrict__ grad_points) {
125 |   int batch_index = blockIdx.x;
126 |   grad_out += batch_index * n * c;
127 |   idx += batch_index * n * 3;
128 |   weight += batch_index * n * 3;
129 |   grad_points += batch_index * m * c;
130 | 
131 |   const int index = threadIdx.y * blockDim.x + threadIdx.x;
132 |   const int stride = blockDim.y * blockDim.x;
133 |   for (int i = index; i < c * n; i += stride) {
134 |     const int l = i / n;
135 |     const int j = i % n;
136 |     float w1 = weight[j * 3 + 0];
137 |     float w2 = weight[j * 3 + 1];
138 |     float w3 = weight[j * 3 + 2];
139 | 
140 |     int i1 = idx[j * 3 + 0];
141 |     int i2 = idx[j * 3 + 1];
142 |     int i3 = idx[j * 3 + 2];
143 | 
144 |     atomicAdd(grad_points + l * m + i1, grad_out[i] * w1);
145 |     atomicAdd(grad_points + l * m + i2, grad_out[i] * w2);
146 |     atomicAdd(grad_points + l * m + i3, grad_out[i] * w3);
147 |   }
148 | }
149 | 
150 | void three_interpolate_grad_kernel_wrapper(int b, int c, int n, int m,
151 |                                            const float *grad_out,
152 |                                            const int *idx, const float *weight,
153 |                                            float *grad_points) {
154 |   cudaStream_t stream = at::cuda::getCurrentCUDAStream();
155 |   three_interpolate_grad_kernel<<<b, opt_block_config(n, c), 0, stream>>>(
156 |       b, c, n, m, grad_out, idx, weight, grad_points);
157 | 
158 |   CUDA_CHECK_ERRORS();
159 | }
160 | 


--------------------------------------------------------------------------------
/pointnet2/_ext_src/src/sampling.cpp:
--------------------------------------------------------------------------------
 1 | // Copyright (c) Facebook, Inc. and its affiliates.
 2 | // 
 3 | // This source code is licensed under the MIT license found in the
 4 | // LICENSE file in the root directory of this source tree.
 5 | 
 6 | #include "sampling.h"
 7 | #include "utils.h"
 8 | 
 9 | void gather_points_kernel_wrapper(int b, int c, int n, int npoints,
10 |                                   const float *points, const int *idx,
11 |                                   float *out);
12 | void gather_points_grad_kernel_wrapper(int b, int c, int n, int npoints,
13 |                                        const float *grad_out, const int *idx,
14 |                                        float *grad_points);
15 | 
16 | void furthest_point_sampling_kernel_wrapper(int b, int n, int m,
17 |                                             const float *dataset, float *temp,
18 |                                             int *idxs);
19 | 
20 | at::Tensor gather_points(at::Tensor points, at::Tensor idx) {
21 |   CHECK_CONTIGUOUS(points);
22 |   CHECK_CONTIGUOUS(idx);
23 |   CHECK_IS_FLOAT(points);
24 |   CHECK_IS_INT(idx);
25 | 
26 |   if (points.type().is_cuda()) {
27 |     CHECK_CUDA(idx);
28 |   }
29 | 
30 |   at::Tensor output =
31 |       torch::zeros({points.size(0), points.size(1), idx.size(1)},
32 |                    at::device(points.device()).dtype(at::ScalarType::Float));
33 | 
34 |   if (points.type().is_cuda()) {
35 |     gather_points_kernel_wrapper(points.size(0), points.size(1), points.size(2),
36 |                                  idx.size(1), points.data<float>(),
37 |                                  idx.data<int>(), output.data<float>());
38 |   } else {
39 |     AT_CHECK(false, "CPU not supported");
40 |   }
41 | 
42 |   return output;
43 | }
44 | 
45 | at::Tensor gather_points_grad(at::Tensor grad_out, at::Tensor idx,
46 |                               const int n) {
47 |   CHECK_CONTIGUOUS(grad_out);
48 |   CHECK_CONTIGUOUS(idx);
49 |   CHECK_IS_FLOAT(grad_out);
50 |   CHECK_IS_INT(idx);
51 | 
52 |   if (grad_out.type().is_cuda()) {
53 |     CHECK_CUDA(idx);
54 |   }
55 | 
56 |   at::Tensor output =
57 |       torch::zeros({grad_out.size(0), grad_out.size(1), n},
58 |                    at::device(grad_out.device()).dtype(at::ScalarType::Float));
59 | 
60 |   if (grad_out.type().is_cuda()) {
61 |     gather_points_grad_kernel_wrapper(grad_out.size(0), grad_out.size(1), n,
62 |                                       idx.size(1), grad_out.data<float>(),
63 |                                       idx.data<int>(), output.data<float>());
64 |   } else {
65 |     AT_CHECK(false, "CPU not supported");
66 |   }
67 | 
68 |   return output;
69 | }
70 | at::Tensor furthest_point_sampling(at::Tensor points, const int nsamples) {
71 |   CHECK_CONTIGUOUS(points);
72 |   CHECK_IS_FLOAT(points);
73 | 
74 |   at::Tensor output =
75 |       torch::zeros({points.size(0), nsamples},
76 |                    at::device(points.device()).dtype(at::ScalarType::Int));
77 | 
78 |   at::Tensor tmp =
79 |       torch::full({points.size(0), points.size(1)}, 1e10,
80 |                   at::device(points.device()).dtype(at::ScalarType::Float));
81 | 
82 |   if (points.type().is_cuda()) {
83 |     furthest_point_sampling_kernel_wrapper(
84 |         points.size(0), points.size(1), nsamples, points.data<float>(),
85 |         tmp.data<float>(), output.data<int>());
86 |   } else {
87 |     AT_CHECK(false, "CPU not supported");
88 |   }
89 | 
90 |   return output;
91 | }
92 | 


--------------------------------------------------------------------------------
/pointnet2/_ext_src/src/sampling_gpu.cu:
--------------------------------------------------------------------------------
  1 | // Copyright (c) Facebook, Inc. and its affiliates.
  2 | // 
  3 | // This source code is licensed under the MIT license found in the
  4 | // LICENSE file in the root directory of this source tree.
  5 | 
  6 | #include <stdio.h>
  7 | #include <stdlib.h>
  8 | 
  9 | #include "cuda_utils.h"
 10 | 
 11 | // input: points(b, c, n) idx(b, m)
 12 | // output: out(b, c, m)
 13 | __global__ void gather_points_kernel(int b, int c, int n, int m,
 14 |                                      const float *__restrict__ points,
 15 |                                      const int *__restrict__ idx,
 16 |                                      float *__restrict__ out) {
 17 |   for (int i = blockIdx.x; i < b; i += gridDim.x) {
 18 |     for (int l = blockIdx.y; l < c; l += gridDim.y) {
 19 |       for (int j = threadIdx.x; j < m; j += blockDim.x) {
 20 |         int a = idx[i * m + j];
 21 |         out[(i * c + l) * m + j] = points[(i * c + l) * n + a];
 22 |       }
 23 |     }
 24 |   }
 25 | }
 26 | 
 27 | void gather_points_kernel_wrapper(int b, int c, int n, int npoints,
 28 |                                   const float *points, const int *idx,
 29 |                                   float *out) {
 30 |   gather_points_kernel<<<dim3(b, c, 1), opt_n_threads(npoints), 0,
 31 |                          at::cuda::getCurrentCUDAStream()>>>(b, c, n, npoints,
 32 |                                                              points, idx, out);
 33 | 
 34 |   CUDA_CHECK_ERRORS();
 35 | }
 36 | 
 37 | // input: grad_out(b, c, m) idx(b, m)
 38 | // output: grad_points(b, c, n)
 39 | __global__ void gather_points_grad_kernel(int b, int c, int n, int m,
 40 |                                           const float *__restrict__ grad_out,
 41 |                                           const int *__restrict__ idx,
 42 |                                           float *__restrict__ grad_points) {
 43 |   for (int i = blockIdx.x; i < b; i += gridDim.x) {
 44 |     for (int l = blockIdx.y; l < c; l += gridDim.y) {
 45 |       for (int j = threadIdx.x; j < m; j += blockDim.x) {
 46 |         int a = idx[i * m + j];
 47 |         atomicAdd(grad_points + (i * c + l) * n + a,
 48 |                   grad_out[(i * c + l) * m + j]);
 49 |       }
 50 |     }
 51 |   }
 52 | }
 53 | 
 54 | void gather_points_grad_kernel_wrapper(int b, int c, int n, int npoints,
 55 |                                        const float *grad_out, const int *idx,
 56 |                                        float *grad_points) {
 57 |   gather_points_grad_kernel<<<dim3(b, c, 1), opt_n_threads(npoints), 0,
 58 |                               at::cuda::getCurrentCUDAStream()>>>(
 59 |       b, c, n, npoints, grad_out, idx, grad_points);
 60 | 
 61 |   CUDA_CHECK_ERRORS();
 62 | }
 63 | 
 64 | __device__ void __update(float *__restrict__ dists, int *__restrict__ dists_i,
 65 |                          int idx1, int idx2) {
 66 |   const float v1 = dists[idx1], v2 = dists[idx2];
 67 |   const int i1 = dists_i[idx1], i2 = dists_i[idx2];
 68 |   dists[idx1] = max(v1, v2);
 69 |   dists_i[idx1] = v2 > v1 ? i2 : i1;
 70 | }
 71 | 
 72 | // Input dataset: (b, n, 3), tmp: (b, n)
 73 | // Ouput idxs (b, m)
 74 | template <unsigned int block_size>
 75 | __global__ void furthest_point_sampling_kernel(
 76 |     int b, int n, int m, const float *__restrict__ dataset,
 77 |     float *__restrict__ temp, int *__restrict__ idxs) {
 78 |   if (m <= 0) return;
 79 |   __shared__ float dists[block_size];
 80 |   __shared__ int dists_i[block_size];
 81 | 
 82 |   int batch_index = blockIdx.x;
 83 |   dataset += batch_index * n * 3;
 84 |   temp += batch_index * n;
 85 |   idxs += batch_index * m;
 86 | 
 87 |   int tid = threadIdx.x;
 88 |   const int stride = block_size;
 89 | 
 90 |   int old = 0;
 91 |   if (threadIdx.x == 0) idxs[0] = old;
 92 | 
 93 |   __syncthreads();
 94 |   for (int j = 1; j < m; j++) {
 95 |     int besti = 0;
 96 |     float best = -1;
 97 |     float x1 = dataset[old * 3 + 0];
 98 |     float y1 = dataset[old * 3 + 1];
 99 |     float z1 = dataset[old * 3 + 2];
100 |     for (int k = tid; k < n; k += stride) {
101 |       float x2, y2, z2;
102 |       x2 = dataset[k * 3 + 0];
103 |       y2 = dataset[k * 3 + 1];
104 |       z2 = dataset[k * 3 + 2];
105 |       float mag = (x2 * x2) + (y2 * y2) + (z2 * z2);
106 |       if (mag <= 1e-3) continue;
107 | 
108 |       float d =
109 |           (x2 - x1) * (x2 - x1) + (y2 - y1) * (y2 - y1) + (z2 - z1) * (z2 - z1);
110 | 
111 |       float d2 = min(d, temp[k]);
112 |       temp[k] = d2;
113 |       besti = d2 > best ? k : besti;
114 |       best = d2 > best ? d2 : best;
115 |     }
116 |     dists[tid] = best;
117 |     dists_i[tid] = besti;
118 |     __syncthreads();
119 | 
120 |     if (block_size >= 512) {
121 |       if (tid < 256) {
122 |         __update(dists, dists_i, tid, tid + 256);
123 |       }
124 |       __syncthreads();
125 |     }
126 |     if (block_size >= 256) {
127 |       if (tid < 128) {
128 |         __update(dists, dists_i, tid, tid + 128);
129 |       }
130 |       __syncthreads();
131 |     }
132 |     if (block_size >= 128) {
133 |       if (tid < 64) {
134 |         __update(dists, dists_i, tid, tid + 64);
135 |       }
136 |       __syncthreads();
137 |     }
138 |     if (block_size >= 64) {
139 |       if (tid < 32) {
140 |         __update(dists, dists_i, tid, tid + 32);
141 |       }
142 |       __syncthreads();
143 |     }
144 |     if (block_size >= 32) {
145 |       if (tid < 16) {
146 |         __update(dists, dists_i, tid, tid + 16);
147 |       }
148 |       __syncthreads();
149 |     }
150 |     if (block_size >= 16) {
151 |       if (tid < 8) {
152 |         __update(dists, dists_i, tid, tid + 8);
153 |       }
154 |       __syncthreads();
155 |     }
156 |     if (block_size >= 8) {
157 |       if (tid < 4) {
158 |         __update(dists, dists_i, tid, tid + 4);
159 |       }
160 |       __syncthreads();
161 |     }
162 |     if (block_size >= 4) {
163 |       if (tid < 2) {
164 |         __update(dists, dists_i, tid, tid + 2);
165 |       }
166 |       __syncthreads();
167 |     }
168 |     if (block_size >= 2) {
169 |       if (tid < 1) {
170 |         __update(dists, dists_i, tid, tid + 1);
171 |       }
172 |       __syncthreads();
173 |     }
174 | 
175 |     old = dists_i[0];
176 |     if (tid == 0) idxs[j] = old;
177 |   }
178 | }
179 | 
180 | void furthest_point_sampling_kernel_wrapper(int b, int n, int m,
181 |                                             const float *dataset, float *temp,
182 |                                             int *idxs) {
183 |   unsigned int n_threads = opt_n_threads(n);
184 | 
185 |   cudaStream_t stream = at::cuda::getCurrentCUDAStream();
186 | 
187 |   switch (n_threads) {
188 |     case 512:
189 |       furthest_point_sampling_kernel<512>
190 |           <<<b, n_threads, 0, stream>>>(b, n, m, dataset, temp, idxs);
191 |       break;
192 |     case 256:
193 |       furthest_point_sampling_kernel<256>
194 |           <<<b, n_threads, 0, stream>>>(b, n, m, dataset, temp, idxs);
195 |       break;
196 |     case 128:
197 |       furthest_point_sampling_kernel<128>
198 |           <<<b, n_threads, 0, stream>>>(b, n, m, dataset, temp, idxs);
199 |       break;
200 |     case 64:
201 |       furthest_point_sampling_kernel<64>
202 |           <<<b, n_threads, 0, stream>>>(b, n, m, dataset, temp, idxs);
203 |       break;
204 |     case 32:
205 |       furthest_point_sampling_kernel<32>
206 |           <<<b, n_threads, 0, stream>>>(b, n, m, dataset, temp, idxs);
207 |       break;
208 |     case 16:
209 |       furthest_point_sampling_kernel<16>
210 |           <<<b, n_threads, 0, stream>>>(b, n, m, dataset, temp, idxs);
211 |       break;
212 |     case 8:
213 |       furthest_point_sampling_kernel<8>
214 |           <<<b, n_threads, 0, stream>>>(b, n, m, dataset, temp, idxs);
215 |       break;
216 |     case 4:
217 |       furthest_point_sampling_kernel<4>
218 |           <<<b, n_threads, 0, stream>>>(b, n, m, dataset, temp, idxs);
219 |       break;
220 |     case 2:
221 |       furthest_point_sampling_kernel<2>
222 |           <<<b, n_threads, 0, stream>>>(b, n, m, dataset, temp, idxs);
223 |       break;
224 |     case 1:
225 |       furthest_point_sampling_kernel<1>
226 |           <<<b, n_threads, 0, stream>>>(b, n, m, dataset, temp, idxs);
227 |       break;
228 |     default:
229 |       furthest_point_sampling_kernel<512>
230 |           <<<b, n_threads, 0, stream>>>(b, n, m, dataset, temp, idxs);
231 |   }
232 | 
233 |   CUDA_CHECK_ERRORS();
234 | }
235 | 


--------------------------------------------------------------------------------
/pointnet2/pointnet2_utils.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) Facebook, Inc. and its affiliates.
  2 | # 
  3 | # This source code is licensed under the MIT license found in the
  4 | # LICENSE file in the root directory of this source tree.
  5 | 
  6 | ''' Modified based on: https://github.com/erikwijmans/Pointnet2_PyTorch '''
  7 | from __future__ import (
  8 |     division,
  9 |     absolute_import,
 10 |     with_statement,
 11 |     print_function,
 12 |     unicode_literals,
 13 | )
 14 | import torch
 15 | from torch.autograd import Function
 16 | import torch.nn as nn
 17 | import pytorch_utils as pt_utils
 18 | import sys
 19 | 
 20 | try:
 21 |     import builtins
 22 | except:
 23 |     import __builtin__ as builtins
 24 | 
 25 | try:
 26 |     import pointnet2._ext as _ext
 27 | except ImportError:
 28 |     if not getattr(builtins, "__POINTNET2_SETUP__", False):
 29 |         raise ImportError(
 30 |             "Could not import _ext module.\n"
 31 |             "Please see the setup instructions in the README: "
 32 |             "https://github.com/erikwijmans/Pointnet2_PyTorch/blob/master/README.rst"
 33 |         )
 34 | 
 35 | if False:
 36 |     # Workaround for type hints without depending on the `typing` module
 37 |     from typing import *
 38 | 
 39 | 
 40 | class RandomDropout(nn.Module):
 41 |     def __init__(self, p=0.5, inplace=False):
 42 |         super(RandomDropout, self).__init__()
 43 |         self.p = p
 44 |         self.inplace = inplace
 45 | 
 46 |     def forward(self, X):
 47 |         theta = torch.Tensor(1).uniform_(0, self.p)[0]
 48 |         return pt_utils.feature_dropout_no_scaling(X, theta, self.train, self.inplace)
 49 | 
 50 | 
 51 | class FurthestPointSampling(Function):
 52 |     @staticmethod
 53 |     def forward(ctx, xyz, npoint):
 54 |         # type: (Any, torch.Tensor, int) -> torch.Tensor
 55 |         r"""
 56 |         Uses iterative furthest point sampling to select a set of npoint features that have the largest
 57 |         minimum distance
 58 | 
 59 |         Parameters
 60 |         ----------
 61 |         xyz : torch.Tensor
 62 |             (B, N, 3) tensor where N > npoint
 63 |         npoint : int32
 64 |             number of features in the sampled set
 65 | 
 66 |         Returns
 67 |         -------
 68 |         torch.Tensor
 69 |             (B, npoint) tensor containing the set
 70 |         """
 71 |         return _ext.furthest_point_sampling(xyz, npoint)
 72 | 
 73 |     @staticmethod
 74 |     def backward(xyz, a=None):
 75 |         return None, None
 76 | 
 77 | 
 78 | furthest_point_sample = FurthestPointSampling.apply
 79 | 
 80 | 
 81 | class GatherOperation(Function):
 82 |     @staticmethod
 83 |     def forward(ctx, features, idx):
 84 |         # type: (Any, torch.Tensor, torch.Tensor) -> torch.Tensor
 85 |         r"""
 86 | 
 87 |         Parameters
 88 |         ----------
 89 |         features : torch.Tensor
 90 |             (B, C, N) tensor
 91 | 
 92 |         idx : torch.Tensor
 93 |             (B, npoint) tensor of the features to gather
 94 | 
 95 |         Returns
 96 |         -------
 97 |         torch.Tensor
 98 |             (B, C, npoint) tensor
 99 |         """
100 | 
101 |         _, C, N = features.size()
102 | 
103 |         ctx.for_backwards = (idx, C, N)
104 | 
105 |         return _ext.gather_points(features, idx)
106 | 
107 |     @staticmethod
108 |     def backward(ctx, grad_out):
109 |         idx, C, N = ctx.for_backwards
110 | 
111 |         grad_features = _ext.gather_points_grad(grad_out.contiguous(), idx, N)
112 |         return grad_features, None
113 | 
114 | 
115 | gather_operation = GatherOperation.apply
116 | 
117 | 
118 | class ThreeNN(Function):
119 |     @staticmethod
120 |     def forward(ctx, unknown, known):
121 |         # type: (Any, torch.Tensor, torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]
122 |         r"""
123 |             Find the three nearest neighbors of unknown in known
124 |         Parameters
125 |         ----------
126 |         unknown : torch.Tensor
127 |             (B, n, 3) tensor of known features
128 |         known : torch.Tensor
129 |             (B, m, 3) tensor of unknown features
130 | 
131 |         Returns
132 |         -------
133 |         dist : torch.Tensor
134 |             (B, n, 3) l2 distance to the three nearest neighbors
135 |         idx : torch.Tensor
136 |             (B, n, 3) index of 3 nearest neighbors
137 |         """
138 |         dist2, idx = _ext.three_nn(unknown, known)
139 | 
140 |         return torch.sqrt(dist2), idx
141 | 
142 |     @staticmethod
143 |     def backward(ctx, a=None, b=None):
144 |         return None, None
145 | 
146 | 
147 | three_nn = ThreeNN.apply
148 | 
149 | 
150 | class ThreeInterpolate(Function):
151 |     @staticmethod
152 |     def forward(ctx, features, idx, weight):
153 |         # type(Any, torch.Tensor, torch.Tensor, torch.Tensor) -> Torch.Tensor
154 |         r"""
155 |             Performs weight linear interpolation on 3 features
156 |         Parameters
157 |         ----------
158 |         features : torch.Tensor
159 |             (B, c, m) Features descriptors to be interpolated from
160 |         idx : torch.Tensor
161 |             (B, n, 3) three nearest neighbors of the target features in features
162 |         weight : torch.Tensor
163 |             (B, n, 3) weights
164 | 
165 |         Returns
166 |         -------
167 |         torch.Tensor
168 |             (B, c, n) tensor of the interpolated features
169 |         """
170 |         B, c, m = features.size()
171 |         n = idx.size(1)
172 | 
173 |         ctx.three_interpolate_for_backward = (idx, weight, m)
174 | 
175 |         return _ext.three_interpolate(features, idx, weight)
176 | 
177 |     @staticmethod
178 |     def backward(ctx, grad_out):
179 |         # type: (Any, torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]
180 |         r"""
181 |         Parameters
182 |         ----------
183 |         grad_out : torch.Tensor
184 |             (B, c, n) tensor with gradients of ouputs
185 | 
186 |         Returns
187 |         -------
188 |         grad_features : torch.Tensor
189 |             (B, c, m) tensor with gradients of features
190 | 
191 |         None
192 | 
193 |         None
194 |         """
195 |         idx, weight, m = ctx.three_interpolate_for_backward
196 | 
197 |         grad_features = _ext.three_interpolate_grad(
198 |             grad_out.contiguous(), idx, weight, m
199 |         )
200 | 
201 |         return grad_features, None, None
202 | 
203 | 
204 | three_interpolate = ThreeInterpolate.apply
205 | 
206 | 
207 | class GroupingOperation(Function):
208 |     @staticmethod
209 |     def forward(ctx, features, idx):
210 |         # type: (Any, torch.Tensor, torch.Tensor) -> torch.Tensor
211 |         r"""
212 | 
213 |         Parameters
214 |         ----------
215 |         features : torch.Tensor
216 |             (B, C, N) tensor of features to group
217 |         idx : torch.Tensor
218 |             (B, npoint, nsample) tensor containing the indicies of features to group with
219 | 
220 |         Returns
221 |         -------
222 |         torch.Tensor
223 |             (B, C, npoint, nsample) tensor
224 |         """
225 |         B, nfeatures, nsample = idx.size()
226 |         _, C, N = features.size()
227 | 
228 |         ctx.for_backwards = (idx, N)
229 | 
230 |         return _ext.group_points(features, idx)
231 | 
232 |     @staticmethod
233 |     def backward(ctx, grad_out):
234 |         # type: (Any, torch.tensor) -> Tuple[torch.Tensor, torch.Tensor]
235 |         r"""
236 | 
237 |         Parameters
238 |         ----------
239 |         grad_out : torch.Tensor
240 |             (B, C, npoint, nsample) tensor of the gradients of the output from forward
241 | 
242 |         Returns
243 |         -------
244 |         torch.Tensor
245 |             (B, C, N) gradient of the features
246 |         None
247 |         """
248 |         idx, N = ctx.for_backwards
249 | 
250 |         grad_features = _ext.group_points_grad(grad_out.contiguous(), idx, N)
251 | 
252 |         return grad_features, None
253 | 
254 | 
255 | grouping_operation = GroupingOperation.apply
256 | 
257 | 
258 | class BallQuery(Function):
259 |     @staticmethod
260 |     def forward(ctx, radius, nsample, xyz, new_xyz):
261 |         # type: (Any, float, int, torch.Tensor, torch.Tensor) -> torch.Tensor
262 |         r"""
263 | 
264 |         Parameters
265 |         ----------
266 |         radius : float
267 |             radius of the balls
268 |         nsample : int
269 |             maximum number of features in the balls
270 |         xyz : torch.Tensor
271 |             (B, N, 3) xyz coordinates of the features
272 |         new_xyz : torch.Tensor
273 |             (B, npoint, 3) centers of the ball query
274 | 
275 |         Returns
276 |         -------
277 |         torch.Tensor
278 |             (B, npoint, nsample) tensor with the indicies of the features that form the query balls
279 |         """
280 |         return _ext.ball_query(new_xyz, xyz, radius, nsample)
281 | 
282 |     @staticmethod
283 |     def backward(ctx, a=None):
284 |         return None, None, None, None
285 | 
286 | 
287 | ball_query = BallQuery.apply
288 | 
289 | 
290 | class QueryAndGroup(nn.Module):
291 |     r"""
292 |     Groups with a ball query of radius
293 | 
294 |     Parameters
295 |     ---------
296 |     radius : float32
297 |         Radius of ball
298 |     nsample : int32
299 |         Maximum number of features to gather in the ball
300 |     """
301 | 
302 |     def __init__(self, radius, nsample, use_xyz=True, ret_grouped_xyz=False, normalize_xyz=False, sample_uniformly=False, ret_unique_cnt=False, use_feature=False, ret_idx=False):
303 |         # type: (QueryAndGroup, float, int, bool) -> None
304 |         super(QueryAndGroup, self).__init__()
305 |         self.radius, self.nsample, self.use_xyz = radius, nsample, use_xyz
306 |         self.ret_grouped_xyz = ret_grouped_xyz
307 |         self.normalize_xyz = normalize_xyz
308 |         self.sample_uniformly = sample_uniformly
309 |         self.ret_unique_cnt = ret_unique_cnt
310 |         self.ret_idx = ret_idx
311 |         self.use_feature = use_feature
312 |         if self.ret_unique_cnt:
313 |             assert(self.sample_uniformly)
314 | 
315 |     def forward(self, xyz, new_xyz, features=None):
316 |         # type: (QueryAndGroup, torch.Tensor. torch.Tensor, torch.Tensor) -> Tuple[Torch.Tensor]
317 |         r"""
318 |         Parameters
319 |         ----------
320 |         xyz : torch.Tensor
321 |             xyz coordinates of the features (B, N, 3)
322 |         new_xyz : torch.Tensor
323 |             centriods (B, npoint, 3)
324 |         features : torch.Tensor
325 |             Descriptors of the features (B, C, N)
326 | 
327 |         Returns
328 |         -------
329 |         new_features : torch.Tensor
330 |             (B, 3 + C, npoint, nsample) tensor
331 |         """
332 |         idx = ball_query(self.radius, self.nsample, xyz, new_xyz)
333 | 
334 |         if self.sample_uniformly:
335 |             unique_cnt = torch.zeros((idx.shape[0], idx.shape[1]))
336 |             for i_batch in range(idx.shape[0]):
337 |                 for i_region in range(idx.shape[1]):
338 |                     unique_ind = torch.unique(idx[i_batch, i_region, :])
339 |                     num_unique = unique_ind.shape[0]
340 |                     unique_cnt[i_batch, i_region] = num_unique
341 |                     sample_ind = torch.randint(0, num_unique, (self.nsample - num_unique,), dtype=torch.long)
342 |                     all_ind = torch.cat((unique_ind, unique_ind[sample_ind]))
343 |                     idx[i_batch, i_region, :] = all_ind
344 | 
345 | 
346 |         xyz_trans = xyz.transpose(1, 2).contiguous()
347 |         grouped_xyz = grouping_operation(xyz_trans, idx)  # (B, 3, npoint, nsample)
348 |         grouped_xyz -= new_xyz.transpose(1, 2).unsqueeze(-1)
349 |         if self.normalize_xyz:
350 |             grouped_xyz /= self.radius
351 | 
352 |         if features is not None:
353 |             grouped_features = grouping_operation(features, idx)
354 |             if self.use_xyz:
355 |                 new_features = torch.cat(
356 |                     [grouped_xyz, grouped_features], dim=1
357 |                 )  # (B, C + 3, npoint, nsample)
358 |             else:
359 |                 new_features = grouped_features
360 |             if self.use_feature:
361 |                 orig_features = features.unsqueeze(-1).repeat(1,1,1,self.nsample)
362 |                 new_features = torch.cat(
363 |                     [orig_features, new_features], dim=1
364 |                 )  # (B, C + 3, npoint, nsample)
365 |         else:
366 |             assert (
367 |                 self.use_xyz
368 |             ), "Cannot have not features and not use xyz as a feature!"
369 |             new_features = grouped_xyz
370 | 
371 |         ret = [new_features]
372 |         if self.ret_grouped_xyz:
373 |             ret.append(grouped_xyz)
374 |         if self.ret_unique_cnt:
375 |             ret.append(unique_cnt)
376 |         if self.ret_idx:
377 |             ret.append(idx)
378 |         if len(ret) == 1:
379 |             return ret[0]
380 |         else:
381 |             return tuple(ret)
382 | 
383 | class PairwiseGroup(nn.Module):
384 |     r"""
385 |     Groups with a ball query of radius
386 | 
387 |     Parameters
388 |     ---------
389 |     radius : float32
390 |         Radius of ball
391 |     nsample : int32
392 |         Maximum number of features to gather in the ball
393 |     """
394 | 
395 |     def __init__(self, radius, nsample, use_xyz=True, ret_grouped_xyz=False, normalize_xyz=False, sample_uniformly=False, ret_unique_cnt=False, use_feature=False):
396 |         # type: (QueryAndGroup, float, int, bool) -> None
397 |         super(PairwiseGroup, self).__init__()
398 |         self.radius, self.nsample, self.use_xyz = radius, nsample, use_xyz
399 |         self.ret_grouped_xyz = ret_grouped_xyz
400 |         self.normalize_xyz = normalize_xyz
401 |         self.sample_uniformly = sample_uniformly
402 |         self.ret_unique_cnt = ret_unique_cnt
403 |         self.use_feature = use_feature
404 |         if self.ret_unique_cnt:
405 |             assert(self.sample_uniformly)
406 | 
407 |     def forward(self, xyz, new_xyz, features=None):
408 |         # type: (QueryAndGroup, torch.Tensor. torch.Tensor, torch.Tensor) -> Tuple[Torch.Tensor]
409 |         r"""
410 |         Parameters
411 |         ----------
412 |         xyz : torch.Tensor
413 |             xyz coordinates of the features (B, N, 3)
414 |         new_xyz : torch.Tensor
415 |             centriods (B, npoint, 3)
416 |         features : torch.Tensor
417 |             Descriptors of the features (B, C, N)
418 | 
419 |         Returns
420 |         -------
421 |         new_features : torch.Tensor
422 |             (B, 3 + C, npoint, nsample) tensor
423 |         """
424 |         xyz_trans = xyz.transpose(1, 2).contiguous()
425 |         grouped_xyz = xyz_trans.unsqueeze(-1).repeat(1,1,1,self.nsample)#grouping_operation(xyz_trans, idx)  # (B, 3, npoint, nsample)
426 |         grouped_features1 = features.unsqueeze(-1).repeat(1,1,1,self.nsample)
427 |         grouped_features2 = features.unsqueeze(-2).repeat(1,1,self.nsample,1)
428 |         grouped_features = torch.cat([grouped_features1, grouped_features2], dim=1)
429 |         ret = [grouped_features]
430 |         if self.ret_grouped_xyz:
431 |             ret.append(grouped_xyz)
432 |         return tuple(ret)
433 |         
434 | class GroupAll(nn.Module):
435 |     r"""
436 |     Groups all features
437 | 
438 |     Parameters
439 |     ---------
440 |     """
441 | 
442 |     def __init__(self, use_xyz=True, ret_grouped_xyz=False):
443 |         # type: (GroupAll, bool) -> None
444 |         super(GroupAll, self).__init__()
445 |         self.use_xyz = use_xyz
446 |         self.ret_grouped_xyz = ret_grouped_xyz
447 |         
448 |     def forward(self, xyz, new_xyz, features=None):
449 |         # type: (GroupAll, torch.Tensor, torch.Tensor, torch.Tensor) -> Tuple[torch.Tensor]
450 |         r"""
451 |         Parameters
452 |         ----------
453 |         xyz : torch.Tensor
454 |             xyz coordinates of the features (B, N, 3)
455 |         new_xyz : torch.Tensor
456 |             Ignored
457 |         features : torch.Tensor
458 |             Descriptors of the features (B, C, N)
459 | 
460 |         Returns
461 |         -------
462 |         new_features : torch.Tensor
463 |             (B, C + 3, 1, N) tensor
464 |         """
465 | 
466 |         grouped_xyz = xyz.transpose(1, 2).unsqueeze(2)
467 | 
468 |         if features is not None:
469 |             grouped_features = features.unsqueeze(2)
470 |             if self.use_xyz:
471 |                 new_features = torch.cat(
472 |                     [grouped_xyz, grouped_features], dim=1
473 |                 )  # (B, 3 + C, 1, N)
474 |             else:
475 |                 new_features = grouped_features
476 |         else:
477 |             new_features = grouped_xyz
478 | 
479 |         if self.ret_grouped_xyz:
480 |             return new_features, grouped_xyz
481 |         else:
482 |             return new_features
483 | 


--------------------------------------------------------------------------------
/pointnet2/pytorch_utils.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) Facebook, Inc. and its affiliates.
  2 | # 
  3 | # This source code is licensed under the MIT license found in the
  4 | # LICENSE file in the root directory of this source tree.
  5 | 
  6 | ''' Modified based on Ref: https://github.com/erikwijmans/Pointnet2_PyTorch '''
  7 | import torch
  8 | import torch.nn as nn
  9 | from typing import List, Tuple
 10 | 
 11 | class SharedMLP(nn.Sequential):
 12 | 
 13 |     def __init__(
 14 |             self,
 15 |             args: List[int],
 16 |             *,
 17 |             bn: bool = False,
 18 |             activation=nn.ReLU(inplace=True),
 19 |             preact: bool = False,
 20 |             first: bool = False,
 21 |             name: str = ""
 22 |     ):
 23 |         super().__init__()
 24 | 
 25 |         for i in range(len(args) - 1):
 26 |             self.add_module(
 27 |                 name + 'layer{}'.format(i),
 28 |                 Conv2d(
 29 |                     args[i],
 30 |                     args[i + 1],
 31 |                     bn=(not first or not preact or (i != 0)) and bn,
 32 |                     activation=activation
 33 |                     if (not first or not preact or (i != 0)) else None,
 34 |                     preact=preact
 35 |                 )
 36 |             )
 37 | 
 38 | class SplitMLP(nn.Sequential):
 39 | 
 40 |     def __init__(
 41 |             self,
 42 |             args: List[int],
 43 |             *,
 44 |             split: int = 18,
 45 |             bn: bool = False,
 46 |             activation=nn.ReLU(inplace=True),
 47 |             preact: bool = False,
 48 |             first: bool = False,
 49 |             name: str = ""
 50 |     ):
 51 |         super().__init__()
 52 |         for j in range(split):
 53 |             for i in range(len(args) - 1):
 54 |                 self.add_module(
 55 |                     name + 'layer{}'.format(i),
 56 |                     Conv2d(
 57 |                         args[i],
 58 |                         args[i + 1],
 59 |                         bn=(not first or not preact or (i != 0)) and bn,
 60 |                         activation=activation
 61 |                         if (not first or not preact or (i != 0)) else None,
 62 |                         preact=preact
 63 |                     )
 64 |                 )
 65 | 
 66 | class _BNBase(nn.Sequential):
 67 | 
 68 |     def __init__(self, in_size, batch_norm=None, name=""):
 69 |         super().__init__()
 70 |         self.add_module(name + "bn", batch_norm(in_size))
 71 | 
 72 |         nn.init.constant_(self[0].weight, 1.0)
 73 |         nn.init.constant_(self[0].bias, 0)
 74 | 
 75 | 
 76 | class BatchNorm1d(_BNBase):
 77 | 
 78 |     def __init__(self, in_size: int, *, name: str = ""):
 79 |         super().__init__(in_size, batch_norm=nn.BatchNorm1d, name=name)
 80 | 
 81 | 
 82 | class BatchNorm2d(_BNBase):
 83 | 
 84 |     def __init__(self, in_size: int, name: str = ""):
 85 |         super().__init__(in_size, batch_norm=nn.BatchNorm2d, name=name)
 86 | 
 87 | 
 88 | class BatchNorm3d(_BNBase):
 89 | 
 90 |     def __init__(self, in_size: int, name: str = ""):
 91 |         super().__init__(in_size, batch_norm=nn.BatchNorm3d, name=name)
 92 | 
 93 | 
 94 | class _ConvBase(nn.Sequential):
 95 | 
 96 |     def __init__(
 97 |             self,
 98 |             in_size,
 99 |             out_size,
100 |             kernel_size,
101 |             stride,
102 |             padding,
103 |             activation,
104 |             bn,
105 |             init,
106 |             conv=None,
107 |             batch_norm=None,
108 |             bias=True,
109 |             preact=False,
110 |             name=""
111 |     ):
112 |         super().__init__()
113 | 
114 |         bias = bias and (not bn)
115 |         conv_unit = conv(
116 |             in_size,
117 |             out_size,
118 |             kernel_size=kernel_size,
119 |             stride=stride,
120 |             padding=padding,
121 |             bias=bias
122 |         )
123 |         init(conv_unit.weight)
124 |         if bias:
125 |             nn.init.constant_(conv_unit.bias, 0)
126 | 
127 |         if bn:
128 |             if not preact:
129 |                 bn_unit = batch_norm(out_size)
130 |             else:
131 |                 bn_unit = batch_norm(in_size)
132 | 
133 |         if preact:
134 |             if bn:
135 |                 self.add_module(name + 'bn', bn_unit)
136 | 
137 |             if activation is not None:
138 |                 self.add_module(name + 'activation', activation)
139 | 
140 |         self.add_module(name + 'conv', conv_unit)
141 | 
142 |         if not preact:
143 |             if bn:
144 |                 self.add_module(name + 'bn', bn_unit)
145 | 
146 |             if activation is not None:
147 |                 self.add_module(name + 'activation', activation)
148 | 
149 | 
150 | class Conv1d(_ConvBase):
151 | 
152 |     def __init__(
153 |             self,
154 |             in_size: int,
155 |             out_size: int,
156 |             *,
157 |             kernel_size: int = 1,
158 |             stride: int = 1,
159 |             padding: int = 0,
160 |             activation=nn.ReLU(inplace=True),
161 |             bn: bool = False,
162 |             init=nn.init.kaiming_normal_,
163 |             bias: bool = True,
164 |             preact: bool = False,
165 |             name: str = ""
166 |     ):
167 |         super().__init__(
168 |             in_size,
169 |             out_size,
170 |             kernel_size,
171 |             stride,
172 |             padding,
173 |             activation,
174 |             bn,
175 |             init,
176 |             conv=nn.Conv1d,
177 |             batch_norm=BatchNorm1d,
178 |             bias=bias,
179 |             preact=preact,
180 |             name=name
181 |         )
182 | 
183 | 
184 | class Conv2d(_ConvBase):
185 | 
186 |     def __init__(
187 |             self,
188 |             in_size: int,
189 |             out_size: int,
190 |             *,
191 |             kernel_size: Tuple[int, int] = (1, 1),
192 |             stride: Tuple[int, int] = (1, 1),
193 |             padding: Tuple[int, int] = (0, 0),
194 |             activation=nn.ReLU(inplace=True),
195 |             bn: bool = False,
196 |             init=nn.init.kaiming_normal_,
197 |             bias: bool = True,
198 |             preact: bool = False,
199 |             name: str = ""
200 |     ):
201 |         super().__init__(
202 |             in_size,
203 |             out_size,
204 |             kernel_size,
205 |             stride,
206 |             padding,
207 |             activation,
208 |             bn,
209 |             init,
210 |             conv=nn.Conv2d,
211 |             batch_norm=BatchNorm2d,
212 |             bias=bias,
213 |             preact=preact,
214 |             name=name
215 |         )
216 | 
217 | 
218 | class Conv3d(_ConvBase):
219 | 
220 |     def __init__(
221 |             self,
222 |             in_size: int,
223 |             out_size: int,
224 |             *,
225 |             kernel_size: Tuple[int, int, int] = (1, 1, 1),
226 |             stride: Tuple[int, int, int] = (1, 1, 1),
227 |             padding: Tuple[int, int, int] = (0, 0, 0),
228 |             activation=nn.ReLU(inplace=True),
229 |             bn: bool = False,
230 |             init=nn.init.kaiming_normal_,
231 |             bias: bool = True,
232 |             preact: bool = False,
233 |             name: str = ""
234 |     ):
235 |         super().__init__(
236 |             in_size,
237 |             out_size,
238 |             kernel_size,
239 |             stride,
240 |             padding,
241 |             activation,
242 |             bn,
243 |             init,
244 |             conv=nn.Conv3d,
245 |             batch_norm=BatchNorm3d,
246 |             bias=bias,
247 |             preact=preact,
248 |             name=name
249 |         )
250 | 
251 | 
252 | class FC(nn.Sequential):
253 | 
254 |     def __init__(
255 |             self,
256 |             in_size: int,
257 |             out_size: int,
258 |             *,
259 |             activation=nn.ReLU(inplace=True),
260 |             bn: bool = False,
261 |             init=None,
262 |             preact: bool = False,
263 |             name: str = ""
264 |     ):
265 |         super().__init__()
266 | 
267 |         fc = nn.Linear(in_size, out_size, bias=not bn)
268 |         if init is not None:
269 |             init(fc.weight)
270 |         if not bn:
271 |             nn.init.constant_(fc.bias, 0)
272 | 
273 |         if preact:
274 |             if bn:
275 |                 self.add_module(name + 'bn', BatchNorm1d(in_size))
276 | 
277 |             if activation is not None:
278 |                 self.add_module(name + 'activation', activation)
279 | 
280 |         self.add_module(name + 'fc', fc)
281 | 
282 |         if not preact:
283 |             if bn:
284 |                 self.add_module(name + 'bn', BatchNorm1d(out_size))
285 | 
286 |             if activation is not None:
287 |                 self.add_module(name + 'activation', activation)
288 | 
289 | def set_bn_momentum_default(bn_momentum):
290 | 
291 |     def fn(m):
292 |         if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
293 |             m.momentum = bn_momentum
294 | 
295 |     return fn
296 | 
297 | 
298 | class BNMomentumScheduler(object):
299 | 
300 |     def __init__(
301 |             self, model, bn_lambda, last_epoch=-1,
302 |             setter=set_bn_momentum_default
303 |     ):
304 |         if not isinstance(model, nn.Module):
305 |             raise RuntimeError(
306 |                 "Class '{}' is not a PyTorch nn Module".format(
307 |                     type(model).__name__
308 |                 )
309 |             )
310 | 
311 |         self.model = model
312 |         self.setter = setter
313 |         self.lmbd = bn_lambda
314 | 
315 |         self.step(last_epoch + 1)
316 |         self.last_epoch = last_epoch
317 | 
318 |     def step(self, epoch=None):
319 |         if epoch is None:
320 |             epoch = self.last_epoch + 1
321 | 
322 |         self.last_epoch = epoch
323 |         self.model.apply(self.setter(self.lmbd(epoch)))
324 | 
325 | 
326 | 


--------------------------------------------------------------------------------
/pointnet2/setup.py:
--------------------------------------------------------------------------------
 1 | # Copyright (c) Facebook, Inc. and its affiliates.
 2 | # 
 3 | # This source code is licensed under the MIT license found in the
 4 | # LICENSE file in the root directory of this source tree.
 5 | 
 6 | from setuptools import setup
 7 | from torch.utils.cpp_extension import BuildExtension, CUDAExtension
 8 | import glob
 9 | 
10 | _ext_src_root = "_ext_src"
11 | _ext_sources = glob.glob("{}/src/*.cpp".format(_ext_src_root)) + glob.glob(
12 |     "{}/src/*.cu".format(_ext_src_root)
13 | )
14 | _ext_headers = glob.glob("{}/include/*".format(_ext_src_root))
15 | 
16 | setup(
17 |     name='pointnet2',
18 |     ext_modules=[
19 |         CUDAExtension(
20 |             name='pointnet2._ext',
21 |             sources=_ext_sources,
22 |             extra_compile_args={
23 |                 "cxx": ["-O2", "-I{}".format("{}/include".format(_ext_src_root))],
24 |                 "nvcc": ["-O2", "-I{}".format("{}/include".format(_ext_src_root))],
25 |             },
26 |         )
27 |     ],
28 |     cmdclass={
29 |         'build_ext': BuildExtension
30 |     }
31 | )
32 | 


--------------------------------------------------------------------------------
/scannet/meta_data/scannet_means.npz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zaiweizhang/H3DNet/81bd6af37cb131fd9e81774f52f29a0f3b0a0f43/scannet/meta_data/scannet_means.npz


--------------------------------------------------------------------------------
/scannet/meta_data/scannet_means_v2.npz.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zaiweizhang/H3DNet/81bd6af37cb131fd9e81774f52f29a0f3b0a0f43/scannet/meta_data/scannet_means_v2.npz.npy


--------------------------------------------------------------------------------
/scannet/meta_data/scannetv2_test.txt:
--------------------------------------------------------------------------------
  1 | scene0707_00
  2 | scene0708_00
  3 | scene0709_00
  4 | scene0710_00
  5 | scene0711_00
  6 | scene0712_00
  7 | scene0713_00
  8 | scene0714_00
  9 | scene0715_00
 10 | scene0716_00
 11 | scene0717_00
 12 | scene0718_00
 13 | scene0719_00
 14 | scene0720_00
 15 | scene0721_00
 16 | scene0722_00
 17 | scene0723_00
 18 | scene0724_00
 19 | scene0725_00
 20 | scene0726_00
 21 | scene0727_00
 22 | scene0728_00
 23 | scene0729_00
 24 | scene0730_00
 25 | scene0731_00
 26 | scene0732_00
 27 | scene0733_00
 28 | scene0734_00
 29 | scene0735_00
 30 | scene0736_00
 31 | scene0737_00
 32 | scene0738_00
 33 | scene0739_00
 34 | scene0740_00
 35 | scene0741_00
 36 | scene0742_00
 37 | scene0743_00
 38 | scene0744_00
 39 | scene0745_00
 40 | scene0746_00
 41 | scene0747_00
 42 | scene0748_00
 43 | scene0749_00
 44 | scene0750_00
 45 | scene0751_00
 46 | scene0752_00
 47 | scene0753_00
 48 | scene0754_00
 49 | scene0755_00
 50 | scene0756_00
 51 | scene0757_00
 52 | scene0758_00
 53 | scene0759_00
 54 | scene0760_00
 55 | scene0761_00
 56 | scene0762_00
 57 | scene0763_00
 58 | scene0764_00
 59 | scene0765_00
 60 | scene0766_00
 61 | scene0767_00
 62 | scene0768_00
 63 | scene0769_00
 64 | scene0770_00
 65 | scene0771_00
 66 | scene0772_00
 67 | scene0773_00
 68 | scene0774_00
 69 | scene0775_00
 70 | scene0776_00
 71 | scene0777_00
 72 | scene0778_00
 73 | scene0779_00
 74 | scene0780_00
 75 | scene0781_00
 76 | scene0782_00
 77 | scene0783_00
 78 | scene0784_00
 79 | scene0785_00
 80 | scene0786_00
 81 | scene0787_00
 82 | scene0788_00
 83 | scene0789_00
 84 | scene0790_00
 85 | scene0791_00
 86 | scene0792_00
 87 | scene0793_00
 88 | scene0794_00
 89 | scene0795_00
 90 | scene0796_00
 91 | scene0797_00
 92 | scene0798_00
 93 | scene0799_00
 94 | scene0800_00
 95 | scene0801_00
 96 | scene0802_00
 97 | scene0803_00
 98 | scene0804_00
 99 | scene0805_00
100 | scene0806_00
101 | 


--------------------------------------------------------------------------------
/scannet/meta_data/scannetv2_val.txt:
--------------------------------------------------------------------------------
  1 | scene0568_00
  2 | scene0568_01
  3 | scene0568_02
  4 | scene0304_00
  5 | scene0488_00
  6 | scene0488_01
  7 | scene0412_00
  8 | scene0412_01
  9 | scene0217_00
 10 | scene0019_00
 11 | scene0019_01
 12 | scene0414_00
 13 | scene0575_00
 14 | scene0575_01
 15 | scene0575_02
 16 | scene0426_00
 17 | scene0426_01
 18 | scene0426_02
 19 | scene0426_03
 20 | scene0549_00
 21 | scene0549_01
 22 | scene0578_00
 23 | scene0578_01
 24 | scene0578_02
 25 | scene0665_00
 26 | scene0665_01
 27 | scene0050_00
 28 | scene0050_01
 29 | scene0050_02
 30 | scene0257_00
 31 | scene0025_00
 32 | scene0025_01
 33 | scene0025_02
 34 | scene0583_00
 35 | scene0583_01
 36 | scene0583_02
 37 | scene0701_00
 38 | scene0701_01
 39 | scene0701_02
 40 | scene0580_00
 41 | scene0580_01
 42 | scene0565_00
 43 | scene0169_00
 44 | scene0169_01
 45 | scene0655_00
 46 | scene0655_01
 47 | scene0655_02
 48 | scene0063_00
 49 | scene0221_00
 50 | scene0221_01
 51 | scene0591_00
 52 | scene0591_01
 53 | scene0591_02
 54 | scene0678_00
 55 | scene0678_01
 56 | scene0678_02
 57 | scene0462_00
 58 | scene0427_00
 59 | scene0595_00
 60 | scene0193_00
 61 | scene0193_01
 62 | scene0164_00
 63 | scene0164_01
 64 | scene0164_02
 65 | scene0164_03
 66 | scene0598_00
 67 | scene0598_01
 68 | scene0598_02
 69 | scene0599_00
 70 | scene0599_01
 71 | scene0599_02
 72 | scene0328_00
 73 | scene0300_00
 74 | scene0300_01
 75 | scene0354_00
 76 | scene0458_00
 77 | scene0458_01
 78 | scene0423_00
 79 | scene0423_01
 80 | scene0423_02
 81 | scene0307_00
 82 | scene0307_01
 83 | scene0307_02
 84 | scene0606_00
 85 | scene0606_01
 86 | scene0606_02
 87 | scene0432_00
 88 | scene0432_01
 89 | scene0608_00
 90 | scene0608_01
 91 | scene0608_02
 92 | scene0651_00
 93 | scene0651_01
 94 | scene0651_02
 95 | scene0430_00
 96 | scene0430_01
 97 | scene0689_00
 98 | scene0357_00
 99 | scene0357_01
100 | scene0574_00
101 | scene0574_01
102 | scene0574_02
103 | scene0329_00
104 | scene0329_01
105 | scene0329_02
106 | scene0153_00
107 | scene0153_01
108 | scene0616_00
109 | scene0616_01
110 | scene0671_00
111 | scene0671_01
112 | scene0618_00
113 | scene0382_00
114 | scene0382_01
115 | scene0490_00
116 | scene0621_00
117 | scene0607_00
118 | scene0607_01
119 | scene0149_00
120 | scene0695_00
121 | scene0695_01
122 | scene0695_02
123 | scene0695_03
124 | scene0389_00
125 | scene0377_00
126 | scene0377_01
127 | scene0377_02
128 | scene0342_00
129 | scene0139_00
130 | scene0629_00
131 | scene0629_01
132 | scene0629_02
133 | scene0496_00
134 | scene0633_00
135 | scene0633_01
136 | scene0518_00
137 | scene0652_00
138 | scene0406_00
139 | scene0406_01
140 | scene0406_02
141 | scene0144_00
142 | scene0144_01
143 | scene0494_00
144 | scene0278_00
145 | scene0278_01
146 | scene0316_00
147 | scene0609_00
148 | scene0609_01
149 | scene0609_02
150 | scene0609_03
151 | scene0084_00
152 | scene0084_01
153 | scene0084_02
154 | scene0696_00
155 | scene0696_01
156 | scene0696_02
157 | scene0351_00
158 | scene0351_01
159 | scene0643_00
160 | scene0644_00
161 | scene0645_00
162 | scene0645_01
163 | scene0645_02
164 | scene0081_00
165 | scene0081_01
166 | scene0081_02
167 | scene0647_00
168 | scene0647_01
169 | scene0535_00
170 | scene0353_00
171 | scene0353_01
172 | scene0353_02
173 | scene0559_00
174 | scene0559_01
175 | scene0559_02
176 | scene0593_00
177 | scene0593_01
178 | scene0246_00
179 | scene0653_00
180 | scene0653_01
181 | scene0064_00
182 | scene0064_01
183 | scene0356_00
184 | scene0356_01
185 | scene0356_02
186 | scene0030_00
187 | scene0030_01
188 | scene0030_02
189 | scene0222_00
190 | scene0222_01
191 | scene0338_00
192 | scene0338_01
193 | scene0338_02
194 | scene0378_00
195 | scene0378_01
196 | scene0378_02
197 | scene0660_00
198 | scene0553_00
199 | scene0553_01
200 | scene0553_02
201 | scene0527_00
202 | scene0663_00
203 | scene0663_01
204 | scene0663_02
205 | scene0664_00
206 | scene0664_01
207 | scene0664_02
208 | scene0334_00
209 | scene0334_01
210 | scene0334_02
211 | scene0046_00
212 | scene0046_01
213 | scene0046_02
214 | scene0203_00
215 | scene0203_01
216 | scene0203_02
217 | scene0088_00
218 | scene0088_01
219 | scene0088_02
220 | scene0088_03
221 | scene0086_00
222 | scene0086_01
223 | scene0086_02
224 | scene0670_00
225 | scene0670_01
226 | scene0256_00
227 | scene0256_01
228 | scene0256_02
229 | scene0249_00
230 | scene0441_00
231 | scene0658_00
232 | scene0704_00
233 | scene0704_01
234 | scene0187_00
235 | scene0187_01
236 | scene0131_00
237 | scene0131_01
238 | scene0131_02
239 | scene0207_00
240 | scene0207_01
241 | scene0207_02
242 | scene0461_00
243 | scene0011_00
244 | scene0011_01
245 | scene0343_00
246 | scene0251_00
247 | scene0077_00
248 | scene0077_01
249 | scene0684_00
250 | scene0684_01
251 | scene0550_00
252 | scene0686_00
253 | scene0686_01
254 | scene0686_02
255 | scene0208_00
256 | scene0500_00
257 | scene0500_01
258 | scene0552_00
259 | scene0552_01
260 | scene0648_00
261 | scene0648_01
262 | scene0435_00
263 | scene0435_01
264 | scene0435_02
265 | scene0435_03
266 | scene0690_00
267 | scene0690_01
268 | scene0693_00
269 | scene0693_01
270 | scene0693_02
271 | scene0700_00
272 | scene0700_01
273 | scene0700_02
274 | scene0699_00
275 | scene0231_00
276 | scene0231_01
277 | scene0231_02
278 | scene0697_00
279 | scene0697_01
280 | scene0697_02
281 | scene0697_03
282 | scene0474_00
283 | scene0474_01
284 | scene0474_02
285 | scene0474_03
286 | scene0474_04
287 | scene0474_05
288 | scene0355_00
289 | scene0355_01
290 | scene0146_00
291 | scene0146_01
292 | scene0146_02
293 | scene0196_00
294 | scene0702_00
295 | scene0702_01
296 | scene0702_02
297 | scene0314_00
298 | scene0277_00
299 | scene0277_01
300 | scene0277_02
301 | scene0095_00
302 | scene0095_01
303 | scene0015_00
304 | scene0100_00
305 | scene0100_01
306 | scene0100_02
307 | scene0558_00
308 | scene0558_01
309 | scene0558_02
310 | scene0685_00
311 | scene0685_01
312 | scene0685_02
313 | 


--------------------------------------------------------------------------------
/scannet/model_util_scannet.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) Facebook, Inc. and its affiliates.
  2 | # 
  3 | # This source code is licensed under the MIT license found in the
  4 | # LICENSE file in the root directory of this source tree.
  5 | 
  6 | import numpy as np
  7 | import sys
  8 | import os
  9 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
 10 | sys.path.append(BASE_DIR)
 11 | ROOT_DIR = os.path.dirname(BASE_DIR)
 12 | sys.path.append(os.path.join(ROOT_DIR, 'utils'))
 13 | from box_util import get_3d_box
 14 | 
 15 | class ScannetDatasetConfig(object):
 16 |     def __init__(self):
 17 |         self.dataset = 'scannet'
 18 |         self.num_class = 18
 19 |         self.num_heading_bin = 24 # angle: -pi/2~pi/2, so divide 0~2*pi into 24 bin
 20 |         self.num_size_cluster = 18
 21 | 
 22 |         self.type2class = {'cabinet':0, 'bed':1, 'chair':2, 'sofa':3, 'table':4, 'door':5, 'window':6,'bookshelf':7,'picture':8, 'counter':9, 'desk':10, 'curtain':11, 'refrigerator':12, 'showercurtrain':13, 'toilet':14, 'sink':15, 'bathtub':16, 'garbagebin':17}
 23 |         #self.type2class = {'wall':0, 'floor':1, 'cabinet':2, 'bed':3, 'chair':4, 'sofa':5, 'table':6, 'door':7,'window':8,'bookshelf':9,'picture':10, 'counter':11, 'blinds':12, 'desk':13, 'shelves':14, 'curtain':15, 'dresser':16, 'pillow':17, 'mirror':18, 'floormat':19, 'clothes':20, 'ceiling':21, 'books':22, 'refrigerator':23, 'television':24, 'paper':25, 'towel':26, 'showercurtrain':27, 'box':28, 'whiteboard':29, 'person':30, 'nightstand':31, 'toilet':32, 'sink':33, 'lamp':34, 'bathtub':35, 'bag':36}
 24 |         self.type2class_room = {'other':0, 'wall':1, 'floor':2}
 25 |         self.class2type = {self.type2class[t]:t for t in self.type2class}
 26 |         self.class2type_room = {self.type2class_room[t]:t for t in self.type2class_room}
 27 |         self.nyu40ids = np.array([3,4,5,6,7,8,9,10,11,12,14,16,24,28,33,34,36,39])
 28 |         #self.nyu40ids = np.array([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36])
 29 |         self.nyu40ids_room = np.array([1,2])
 30 |         
 31 |         self.nyu40id2class = {nyu40id: i for i,nyu40id in enumerate(list(self.nyu40ids))}
 32 |         self.nyu40id2class_sem = {nyu40id: i for i,nyu40id in enumerate(list(self.nyu40ids))}
 33 |         self.mean_size_arr = np.load(os.path.join(ROOT_DIR,'scannet/meta_data/scannet_means.npz'))['arr_0']
 34 |         #self.mean_size_arr = np.load(os.path.join(ROOT_DIR,'scannet/meta_data/scannet_means_v2.npz.npy'))[:self.num_class,:]
 35 |         self.type_mean_size = {}
 36 |         for i in range(self.num_size_cluster):
 37 |             self.type_mean_size[self.class2type[i]] = self.mean_size_arr[i,:]
 38 | 
 39 | 
 40 |     def class2angle(self, pred_cls, residual, to_label_format=True):
 41 |         return 0
 42 |             
 43 |     '''    
 44 |     def angle2class(self, angle):
 45 |         # assert(False)
 46 |         num_class = self.num_heading_bin
 47 |         angle = angle%(2*np.pi)
 48 |         assert(angle>=0 and angle<=2*np.pi)
 49 |         angle_per_class = 2*np.pi/float(num_class)
 50 |         shifted_angle = (angle+angle_per_class/2)%(2*np.pi)
 51 |         class_id = int(shifted_angle/angle_per_class)
 52 |         residual_angle = shifted_angle - (class_id*angle_per_class+angle_per_class/2)
 53 |         return class_id, residual_angle
 54 |     def class2angle(self, pred_cls, residual, to_label_format=True):
 55 |         num_class = self.num_heading_bin
 56 |         angle_per_class = 2*np.pi/float(num_class)
 57 |         angle_center = pred_cls * angle_per_class
 58 |         angle = angle_center + residual
 59 |         if to_label_format and angle>np.pi:
 60 |             angle = angle - 2*np.pi
 61 |         return angle
 62 |     '''
 63 |     def angle2class2(self, angle):
 64 |         ''' modify according to sunrgbd         
 65 |             scannet_angle: angle: -pi/2 ~ pi/2       
 66 |             1: angle += pi/2 -> 0~pi                  
 67 |             2: class*(2pi/N) + number = angle + pi/2  
 68 |         '''   
 69 |         class_id, residual_angle = self.angle2class(angle + np.pi / 2)
 70 |         return class_id, residual_angle
 71 |             
 72 |     def class2angle2(self, pred_cls, residual, to_label_format=True):
 73 |         angle = self.class2angle(pred_cls, residual)
 74 |         angle = angle - np.pi / 2
 75 |         return angle
 76 | 
 77 |     def size2class(self, size, type_name):
 78 |         ''' Convert 3D box size (l,w,h) to size class and size residual '''
 79 |         size_class = self.type2class[type_name]
 80 |         size_residual = size - self.type_mean_size[type_name]
 81 |         return size_class, size_residual
 82 |     
 83 |     def class2size(self, pred_cls, residual):
 84 |         ''' Inverse function to size2class '''        
 85 |         return self.mean_size_arr[pred_cls, :] + residual
 86 | 
 87 |     def param2obb(self, center, heading_class, heading_residual, size_class, size_residual):
 88 |         heading_angle = self.class2angle(heading_class, heading_residual)
 89 |         box_size = self.class2size(int(size_class), size_residual)
 90 |         obb = np.zeros((7,))
 91 |         obb[0:3] = center
 92 |         obb[3:6] = box_size
 93 |         obb[6] = heading_angle*-1
 94 |         return obb
 95 | 
 96 |     def param2obb2(self, center, heading_class, heading_residual, size_class, size_residual):
 97 |         heading_angle = self.class2angle(heading_class, heading_residual)
 98 |         box_size = self.class2size(int(size_class), size_residual)
 99 |         obb = np.zeros((7,))
100 |         obb[0:3] = center
101 |         obb[3:6] = box_size
102 |         obb[6] = heading_angle
103 |         return obb
104 | 
105 | def rotate_aligned_boxes(input_boxes, rot_mat):    
106 |     centers, lengths = input_boxes[:,0:3], input_boxes[:,3:6]    
107 |     new_centers = np.dot(centers, np.transpose(rot_mat))
108 |            
109 |     dx, dy = lengths[:,0]/2.0, lengths[:,1]/2.0
110 |     new_x = np.zeros((dx.shape[0], 4))
111 |     new_y = np.zeros((dx.shape[0], 4))
112 |     
113 |     for i, crnr in enumerate([(-1,-1), (1, -1), (1, 1), (-1, 1)]):        
114 |         crnrs = np.zeros((dx.shape[0], 3))
115 |         crnrs[:,0] = crnr[0]*dx
116 |         crnrs[:,1] = crnr[1]*dy
117 |         crnrs = np.dot(crnrs, np.transpose(rot_mat))
118 |         new_x[:,i] = crnrs[:,0]
119 |         new_y[:,i] = crnrs[:,1]
120 |     
121 |     
122 |     new_dx = 2.0*np.max(new_x, 1)
123 |     new_dy = 2.0*np.max(new_y, 1)    
124 |     new_lengths = np.stack((new_dx, new_dy, lengths[:,2]), axis=1)
125 |                   
126 |     return np.concatenate([new_centers, new_lengths], axis=1)
127 | 


--------------------------------------------------------------------------------
/sunrgbd/model_util_sunrgbd.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) Facebook, Inc. and its affiliates.
  2 | # 
  3 | # This source code is licensed under the MIT license found in the
  4 | # LICENSE file in the root directory of this source tree.
  5 | 
  6 | import numpy as np
  7 | import sys
  8 | import os
  9 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
 10 | sys.path.append(BASE_DIR)
 11 | ROOT_DIR = os.path.dirname(BASE_DIR)
 12 | sys.path.append(os.path.join(ROOT_DIR, 'utils'))
 13 | 
 14 | class SunrgbdDatasetConfig(object):
 15 |     def __init__(self):
 16 |         self.dataset = 'sunrgbd'
 17 |         self.num_class = 10
 18 |         self.num_heading_bin = 12
 19 |         self.num_size_cluster = 10
 20 | 
 21 |         self.type2class={'bed':0, 'table':1, 'sofa':2, 'chair':3, 'toilet':4, 'desk':5, 'dresser':6, 'night_stand':7, 'bookshelf':8, 'bathtub':9}
 22 | 
 23 |         self.class37_2_class10_multi_map = {4:[0], 7:[1, 5], 6:[2, 3], 5:[2, 3], 33:[4], 14:[1, 5], 17:[6], 32:[7], 10:[8], 36:[9]}
 24 |         self.class37_2_class10_multi = {}
 25 |         for i in range(38):
 26 |             if i in self.class37_2_class10_multi_map:
 27 |                 self.class37_2_class10_multi.update({i: self.class37_2_class10_multi_map[i]})
 28 |             else:
 29 |                 self.class37_2_class10_multi.update({i: [-1]})
 30 | 
 31 |         self.class37_2_class10_map = {4:0, 7:1, 6:2, 5:3, 33:4, 14:5, 17:6, 32:7, 10:8, 36:9}
 32 |         self.class37_2_class10 = {}
 33 |         for i in range(38):
 34 |             if i in self.class37_2_class10_map:
 35 |                 self.class37_2_class10.update({i: self.class37_2_class10_map[i]})
 36 |             else:
 37 |                 self.class37_2_class10.update({i: -1})
 38 | 
 39 |         self.class2type = {self.type2class[t]:t for t in self.type2class}
 40 |         self.type2onehotclass={'bed':0, 'table':1, 'sofa':2, 'chair':3, 'toilet':4, 'desk':5, 'dresser':6, 'night_stand':7, 'bookshelf':8, 'bathtub':9}
 41 |         self.type_mean_size = {'bathtub': np.array([0.765840,1.398258,0.472728]),
 42 |                           'bed': np.array([2.114256,1.620300,0.927272]),
 43 |                           'bookshelf': np.array([0.404671,1.071108,1.688889]),
 44 |                           'chair': np.array([0.591958,0.552978,0.827272]),
 45 |                           'desk': np.array([0.695190,1.346299,0.736364]),
 46 |                           'dresser': np.array([0.528526,1.002642,1.172878]),
 47 |                           'night_stand': np.array([0.500618,0.632163,0.683424]),
 48 |                           'sofa': np.array([0.923508,1.867419,0.845495]),
 49 |                           'table': np.array([0.791118,1.279516,0.718182]),
 50 |                           'toilet': np.array([0.699104,0.454178,0.756250])}
 51 | 
 52 |         self.mean_size_arr = np.zeros((self.num_size_cluster, 3))
 53 |         for i in range(self.num_size_cluster):
 54 |             self.mean_size_arr[i,:] = self.type_mean_size[self.class2type[i]]
 55 | 
 56 |     def size2class(self, size, type_name):
 57 |         ''' Convert 3D box size (l,w,h) to size class and size residual '''
 58 |         size_class = self.type2class[type_name]
 59 |         size_residual = size - self.type_mean_size[type_name]
 60 |         return size_class, size_residual
 61 |     
 62 |     def class2size(self, pred_cls, residual):
 63 |         ''' Inverse function to size2class '''
 64 |         mean_size = self.type_mean_size[self.class2type[pred_cls]]
 65 |         return mean_size + residual
 66 |     
 67 |     def angle2class(self, angle):
 68 |         ''' Convert continuous angle to discrete class
 69 |             [optinal] also small regression number from  
 70 |             class center angle to current angle.
 71 |            
 72 |             angle is from 0-2pi (or -pi~pi), class center at 0, 1*(2pi/N), 2*(2pi/N) ...  (N-1)*(2pi/N)
 73 |             return is class of int32 of 0,1,...,N-1 and a number such that
 74 |                 class*(2pi/N) + number = angle
 75 |         '''
 76 |         num_class = self.num_heading_bin
 77 |         angle = angle%(2*np.pi)
 78 |         assert(angle>=0 and angle<=2*np.pi)
 79 |         angle_per_class = 2*np.pi/float(num_class)
 80 |         shifted_angle = (angle+angle_per_class/2)%(2*np.pi)
 81 |         class_id = int(shifted_angle/angle_per_class)
 82 |         residual_angle = shifted_angle - (class_id*angle_per_class+angle_per_class/2)
 83 |         return class_id, residual_angle
 84 |     
 85 |     def class2angle(self, pred_cls, residual, to_label_format=True):
 86 |         ''' Inverse function to angle2class '''
 87 |         num_class = self.num_heading_bin
 88 |         angle_per_class = 2*np.pi/float(num_class)
 89 |         angle_center = pred_cls * angle_per_class
 90 |         angle = angle_center + residual
 91 |         if to_label_format and angle>np.pi:
 92 |             angle = angle - 2*np.pi
 93 |         return angle
 94 | 
 95 |     def param2obb(self, center, heading_class, heading_residual, size_class, size_residual):
 96 |         heading_angle = self.class2angle(heading_class, heading_residual)
 97 |         box_size = self.class2size(int(size_class), size_residual)
 98 |         obb = np.zeros((7,))
 99 |         obb[0:3] = center
100 |         obb[3:6] = box_size
101 |         obb[6] = heading_angle*-1
102 |         return obb
103 | 
104 | 
105 | 


--------------------------------------------------------------------------------
/sunrgbd/sunrgbd_utils.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) Facebook, Inc. and its affiliates.
  2 | # 
  3 | # This source code is licensed under the MIT license found in the
  4 | # LICENSE file in the root directory of this source tree.
  5 | 
  6 | ''' Provides Python helper function to read My SUNRGBD dataset.
  7 | 
  8 | Author: Charles R. Qi
  9 | Date: October, 2017
 10 | 
 11 | Updated by Charles R. Qi
 12 | Date: December, 2018
 13 | Note: removed basis loading.
 14 | '''
 15 | import numpy as np
 16 | import cv2
 17 | import os
 18 | import scipy.io as sio # to load .mat files for depth points
 19 | 
 20 | type2class={'bed':0, 'table':1, 'sofa':2, 'chair':3, 'toilet':4, 'desk':5, 'dresser':6, 'night_stand':7, 'bookshelf':8, 'bathtub':9}
 21 | class2type = {type2class[t]:t for t in type2class}
 22 | 
 23 | 
 24 | def flip_axis_to_camera(pc):
 25 |     ''' Flip X-right,Y-forward,Z-up to X-right,Y-down,Z-forward
 26 |         Input and output are both (N,3) array
 27 |     '''
 28 |     pc2 = np.copy(pc)
 29 |     pc2[:,[0,1,2]] = pc2[:,[0,2,1]] # cam X,Y,Z = depth X,-Z,Y
 30 |     pc2[:,1] *= -1
 31 |     return pc2
 32 | 
 33 | def flip_axis_to_depth(pc):
 34 |     pc2 = np.copy(pc)
 35 |     pc2[:,[0,1,2]] = pc2[:,[0,2,1]] # depth X,Y,Z = cam X,Z,-Y
 36 |     pc2[:,2] *= -1
 37 |     return pc2
 38 | 
 39 | 
 40 | class SUNObject3d(object):
 41 |     def __init__(self, line):
 42 |         data = line.split(' ')
 43 |         data[1:] = [float(x) for x in data[1:]]
 44 |         self.classname = data[0]
 45 |         self.xmin = data[1] 
 46 |         self.ymin = data[2]
 47 |         self.xmax = data[1]+data[3]
 48 |         self.ymax = data[2]+data[4]
 49 |         self.box2d = np.array([self.xmin,self.ymin,self.xmax,self.ymax])
 50 |         self.centroid = np.array([data[5],data[6],data[7]])
 51 |         self.unused_dimension = np.array([data[8],data[9],data[10]])
 52 |         self.w = data[8]
 53 |         self.l = data[9]
 54 |         self.h = data[10]
 55 |         self.orientation = np.zeros((3,))
 56 |         self.orientation[0] = data[11]
 57 |         self.orientation[1] = data[12]
 58 |         self.heading_angle = -1 * np.arctan2(self.orientation[1], self.orientation[0])
 59 | 
 60 | class SUNRGBD_Calibration(object):
 61 |     ''' Calibration matrices and utils
 62 |         We define five coordinate system in SUN RGBD dataset
 63 | 
 64 |         camera coodinate:
 65 |             Z is forward, Y is downward, X is rightward
 66 | 
 67 |         depth coordinate:
 68 |             Just change axis order and flip up-down axis from camera coord
 69 | 
 70 |         upright depth coordinate: tilted depth coordinate by Rtilt such that Z is gravity direction,
 71 |             Z is up-axis, Y is forward, X is right-ward
 72 | 
 73 |         upright camera coordinate:
 74 |             Just change axis order and flip up-down axis from upright depth coordinate
 75 | 
 76 |         image coordinate:
 77 |             ----> x-axis (u)
 78 |            |
 79 |            v
 80 |             y-axis (v) 
 81 | 
 82 |         depth points are stored in upright depth coordinate.
 83 |         labels for 3d box (basis, centroid, size) are in upright depth coordinate.
 84 |         2d boxes are in image coordinate
 85 | 
 86 |         We generate frustum point cloud and 3d box in upright camera coordinate
 87 |     '''
 88 | 
 89 |     def __init__(self, calib_filepath):
 90 |         lines = [line.rstrip() for line in open(calib_filepath)]
 91 |         Rtilt = np.array([float(x) for x in lines[0].split(' ')])
 92 |         self.Rtilt = np.reshape(Rtilt, (3,3), order='F')
 93 |         K = np.array([float(x) for x in lines[1].split(' ')])
 94 |         self.K = np.reshape(K, (3,3), order='F')
 95 |         self.f_u = self.K[0,0]
 96 |         self.f_v = self.K[1,1]
 97 |         self.c_u = self.K[0,2]
 98 |         self.c_v = self.K[1,2]
 99 |    
100 |     def project_upright_depth_to_camera(self, pc):
101 |         ''' project point cloud from depth coord to camera coordinate
102 |             Input: (N,3) Output: (N,3)
103 |         '''
104 |         # Project upright depth to depth coordinate
105 |         pc2 = np.dot(np.transpose(self.Rtilt), np.transpose(pc[:,0:3])) # (3,n)
106 |         return flip_axis_to_camera(np.transpose(pc2))
107 | 
108 |     def project_upright_depth_to_image(self, pc):
109 |         ''' Input: (N,3) Output: (N,2) UV and (N,) depth '''
110 |         pc2 = self.project_upright_depth_to_camera(pc)
111 |         uv = np.dot(pc2, np.transpose(self.K)) # (n,3)
112 |         uv[:,0] /= uv[:,2]
113 |         uv[:,1] /= uv[:,2]
114 |         return uv[:,0:2], pc2[:,2]
115 | 
116 |     def project_upright_depth_to_upright_camera(self, pc):
117 |         return flip_axis_to_camera(pc)
118 | 
119 |     def project_upright_camera_to_upright_depth(self, pc):
120 |         return flip_axis_to_depth(pc)
121 | 
122 |     def project_image_to_camera(self, uv_depth):
123 |         n = uv_depth.shape[0]
124 |         x = ((uv_depth[:,0]-self.c_u)*uv_depth[:,2])/self.f_u
125 |         y = ((uv_depth[:,1]-self.c_v)*uv_depth[:,2])/self.f_v
126 |         pts_3d_camera = np.zeros((n,3))
127 |         pts_3d_camera[:,0] = x
128 |         pts_3d_camera[:,1] = y
129 |         pts_3d_camera[:,2] = uv_depth[:,2]
130 |         return pts_3d_camera
131 | 
132 |     def project_image_to_upright_camerea(self, uv_depth):
133 |         pts_3d_camera = self.project_image_to_camera(uv_depth)
134 |         pts_3d_depth = flip_axis_to_depth(pts_3d_camera)
135 |         pts_3d_upright_depth = np.transpose(np.dot(self.Rtilt, np.transpose(pts_3d_depth)))
136 |         return self.project_upright_depth_to_upright_camera(pts_3d_upright_depth)
137 | 
138 |  
139 |  
140 | def rotx(t):
141 |     """Rotation about the x-axis."""
142 |     c = np.cos(t)
143 |     s = np.sin(t)
144 |     return np.array([[1,  0,  0],
145 |                      [0,  c, -s],
146 |                      [0,  s,  c]])
147 | 
148 | 
149 | def roty(t):
150 |     """Rotation about the y-axis."""
151 |     c = np.cos(t)
152 |     s = np.sin(t)
153 |     return np.array([[c,  0,  s],
154 |                      [0,  1,  0],
155 |                      [-s, 0,  c]])
156 | 
157 | 
158 | def rotz(t):
159 |     """Rotation about the z-axis."""
160 |     c = np.cos(t)
161 |     s = np.sin(t)
162 |     return np.array([[c, -s,  0],
163 |                      [s,  c,  0],
164 |                      [0,  0,  1]])
165 | 
166 | 
167 | def transform_from_rot_trans(R, t):
168 |     """Transforation matrix from rotation matrix and translation vector."""
169 |     R = R.reshape(3, 3)
170 |     t = t.reshape(3, 1)
171 |     return np.vstack((np.hstack([R, t]), [0, 0, 0, 1]))
172 | 
173 | 
174 | def inverse_rigid_trans(Tr):
175 |     """Inverse a rigid body transform matrix (3x4 as [R|t])
176 |         [R'|-R't; 0|1]
177 |     """ 
178 |     inv_Tr = np.zeros_like(Tr) # 3x4
179 |     inv_Tr[0:3,0:3] = np.transpose(Tr[0:3,0:3])
180 |     inv_Tr[0:3,3] = np.dot(-np.transpose(Tr[0:3,0:3]), Tr[0:3,3])
181 |     return inv_Tr
182 | 
183 | def read_sunrgbd_label(label_filename):
184 |     lines = [line.rstrip() for line in open(label_filename)]
185 |     objects = [SUNObject3d(line) for line in lines]
186 |     return objects
187 | 
188 | def load_image(img_filename):
189 |     return cv2.imread(img_filename)
190 | 
191 | def load_depth_points(depth_filename):
192 |     depth = np.loadtxt(depth_filename)
193 |     return depth
194 | 
195 | def load_depth_points_mat(depth_filename):
196 |     depth = sio.loadmat(depth_filename)['instance']
197 |     return depth
198 | 
199 | def random_shift_box2d(box2d, shift_ratio=0.1):
200 |     ''' Randomly shift box center, randomly scale width and height 
201 |     '''
202 |     r = shift_ratio
203 |     xmin,ymin,xmax,ymax = box2d
204 |     h = ymax-ymin
205 |     w = xmax-xmin
206 |     cx = (xmin+xmax)/2.0
207 |     cy = (ymin+ymax)/2.0
208 |     cx2 = cx + w*r*(np.random.random()*2-1)
209 |     cy2 = cy + h*r*(np.random.random()*2-1)
210 |     h2 = h*(1+np.random.random()*2*r-r) # 0.9 to 1.1
211 |     w2 = w*(1+np.random.random()*2*r-r) # 0.9 to 1.1
212 |     return np.array([cx2-w2/2.0, cy2-h2/2.0, cx2+w2/2.0, cy2+h2/2.0])
213 |  
214 | def in_hull(p, hull):
215 |     from scipy.spatial import Delaunay
216 |     if not isinstance(hull,Delaunay):
217 |         hull = Delaunay(hull)
218 |     return hull.find_simplex(p)>=0
219 | 
220 | def extract_pc_in_box3d(pc, box3d):
221 |     ''' pc: (N,3), box3d: (8,3) '''
222 |     box3d_roi_inds = in_hull(pc[:,0:3], box3d)
223 |     return pc[box3d_roi_inds,:], box3d_roi_inds
224 | 
225 | 
226 | def my_compute_box_3d(center, size, heading_angle):
227 |     R = rotz(-1*heading_angle)
228 |     l,w,h = size
229 |     x_corners = [-l,l,l,-l,-l,l,l,-l]
230 |     y_corners = [w,w,-w,-w,w,w,-w,-w]
231 |     z_corners = [h,h,h,h,-h,-h,-h,-h]
232 |     corners_3d = np.dot(R, np.vstack([x_corners, y_corners, z_corners]))
233 |     corners_3d[0,:] += center[0]
234 |     corners_3d[1,:] += center[1]
235 |     corners_3d[2,:] += center[2]
236 |     return np.transpose(corners_3d)
237 | 
238 | 
239 | def compute_box_3d(obj, calib):
240 |     ''' Takes an object and a projection matrix (P) and projects the 3d
241 |         bounding box into the image plane.
242 |         Returns:
243 |             corners_2d: (8,2) array in image coord.
244 |             corners_3d: (8,3) array in in upright depth coord.
245 |     '''
246 |     center = obj.centroid
247 | 
248 |     # compute rotational matrix around yaw axis
249 |     R = rotz(-1*obj.heading_angle)
250 |     #b,a,c = dimension
251 |     #print R, a,b,c
252 |     
253 |     # 3d bounding box dimensions
254 |     l = obj.l # along heading arrow
255 |     w = obj.w # perpendicular to heading arrow
256 |     h = obj.h
257 | 
258 |     # rotate and translate 3d bounding box
259 |     x_corners = [-l,l,l,-l,-l,l,l,-l]
260 |     y_corners = [w,w,-w,-w,w,w,-w,-w]
261 |     z_corners = [h,h,h,h,-h,-h,-h,-h]
262 |     corners_3d = np.dot(R, np.vstack([x_corners, y_corners, z_corners]))
263 |     corners_3d[0,:] += center[0]
264 |     corners_3d[1,:] += center[1]
265 |     corners_3d[2,:] += center[2]
266 | 
267 |     # project the 3d bounding box into the image plane
268 |     corners_2d,_ = calib.project_upright_depth_to_image(np.transpose(corners_3d))
269 |     #print 'corners_2d: ', corners_2d
270 |     return corners_2d, np.transpose(corners_3d)
271 | 
272 | def compute_orientation_3d(obj, calib):
273 |     ''' Takes an object and a projection matrix (P) and projects the 3d
274 |         object orientation vector into the image plane.
275 |         Returns:
276 |             orientation_2d: (2,2) array in image coord.
277 |             orientation_3d: (2,3) array in depth coord.
278 |     '''
279 |     
280 |     # orientation in object coordinate system
281 |     ori = obj.orientation
282 |     orientation_3d = np.array([[0, ori[0]],[0, ori[1]],[0,0]])
283 |     center = obj.centroid
284 |     orientation_3d[0,:] = orientation_3d[0,:] + center[0]
285 |     orientation_3d[1,:] = orientation_3d[1,:] + center[1]
286 |     orientation_3d[2,:] = orientation_3d[2,:] + center[2]
287 |     
288 |     # project orientation into the image plane
289 |     orientation_2d,_ = calib.project_upright_depth_to_image(np.transpose(orientation_3d))
290 |     return orientation_2d, np.transpose(orientation_3d)
291 | 
292 | def draw_projected_box3d(image, qs, color=(255,255,255), thickness=2):
293 |     ''' Draw 3d bounding box in image
294 |         qs: (8,2) array of vertices for the 3d box in following order:
295 |             1 -------- 0
296 |            /|         /|
297 |           2 -------- 3 .
298 |           | |        | |
299 |           . 5 -------- 4
300 |           |/         |/
301 |           6 -------- 7
302 |     '''
303 |     qs = qs.astype(np.int32)
304 |     for k in range(0,4):
305 |        #http://docs.enthought.com/mayavi/mayavi/auto/mlab_helper_functions.html
306 |        i,j=k,(k+1)%4
307 |        cv2.line(image, (qs[i,0],qs[i,1]), (qs[j,0],qs[j,1]), color, thickness, cv2.CV_AA) # use LINE_AA for opencv3
308 | 
309 |        i,j=k+4,(k+1)%4 + 4
310 |        cv2.line(image, (qs[i,0],qs[i,1]), (qs[j,0],qs[j,1]), color, thickness, cv2.CV_AA)
311 | 
312 |        i,j=k,k+4
313 |        cv2.line(image, (qs[i,0],qs[i,1]), (qs[j,0],qs[j,1]), color, thickness, cv2.CV_AA)
314 |     return image
315 | 
316 | 
317 | import pickle
318 | import gzip
319 | 
320 | def save_zipped_pickle(obj, filename, protocol=-1):
321 |     with gzip.open(filename, 'wb') as f:
322 |         pickle.dump(obj, f, protocol)
323 | 
324 | def load_zipped_pickle(filename):
325 |     with gzip.open(filename, 'rb') as f:
326 |         loaded_object = pickle.load(f)
327 |         return loaded_object
328 | 


--------------------------------------------------------------------------------
/utils/eval_det.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) Facebook, Inc. and its affiliates.
  2 | # 
  3 | # This source code is licensed under the MIT license found in the
  4 | # LICENSE file in the root directory of this source tree.
  5 | 
  6 | """ Generic Code for Object Detection Evaluation
  7 | 
  8 |     Input:
  9 |     For each class:
 10 |         For each image:
 11 |             Predictions: box, score
 12 |             Groundtruths: box
 13 |     
 14 |     Output:
 15 |     For each class:
 16 |         precision-recal and average precision
 17 |     
 18 |     Author: Charles R. Qi
 19 |     
 20 |     Ref: https://raw.githubusercontent.com/rbgirshick/py-faster-rcnn/master/lib/datasets/voc_eval.py
 21 | """
 22 | import numpy as np
 23 | 
 24 | def voc_ap(rec, prec, use_07_metric=False):
 25 |     """ ap = voc_ap(rec, prec, [use_07_metric])
 26 |     Compute VOC AP given precision and recall.
 27 |     If use_07_metric is true, uses the
 28 |     VOC 07 11 point method (default:False).
 29 |     """
 30 |     if use_07_metric:
 31 |         # 11 point metric
 32 |         ap = 0.
 33 |         for t in np.arange(0., 1.1, 0.1):
 34 |             if np.sum(rec >= t) == 0:
 35 |                 p = 0
 36 |             else:
 37 |                 p = np.max(prec[rec >= t])
 38 |             ap = ap + p / 11.
 39 |     else:
 40 |         # correct AP calculation
 41 |         # first append sentinel values at the end
 42 |         mrec = np.concatenate(([0.], rec, [1.]))
 43 |         mpre = np.concatenate(([0.], prec, [0.]))
 44 | 
 45 |         # compute the precision envelope
 46 |         for i in range(mpre.size - 1, 0, -1):
 47 |             mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i])
 48 | 
 49 |         # to calculate area under PR curve, look for points
 50 |         # where X axis (recall) changes value
 51 |         i = np.where(mrec[1:] != mrec[:-1])[0]
 52 | 
 53 |         # and sum (\Delta recall) * prec
 54 |         ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1])
 55 |     return ap
 56 | 
 57 | import os
 58 | import sys
 59 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
 60 | from metric_util import calc_iou # axis-aligned 3D box IoU
 61 | def get_iou(bb1, bb2):
 62 |     """ Compute IoU of two bounding boxes.
 63 |         ** Define your bod IoU function HERE **
 64 |     """
 65 |     #pass
 66 |     iou3d = calc_iou(bb1, bb2)
 67 |     return iou3d
 68 | 
 69 | from box_util import box3d_iou
 70 | def get_iou_obb(bb1,bb2):
 71 |     iou3d, iou2d = box3d_iou(bb1,bb2)
 72 |     return iou3d
 73 | 
 74 | def get_iou_main(get_iou_func, args):
 75 |     return get_iou_func(*args)
 76 | 
 77 | def eval_det_cls(pred, gt, ovthresh=0.25, use_07_metric=False, get_iou_func=get_iou):
 78 |     """ Generic functions to compute precision/recall for object detection
 79 |         for a single class.
 80 |         Input:
 81 |             pred: map of {img_id: [(bbox, score)]} where bbox is numpy array
 82 |             gt: map of {img_id: [bbox]}
 83 |             ovthresh: scalar, iou threshold
 84 |             use_07_metric: bool, if True use VOC07 11 point method
 85 |         Output:
 86 |             rec: numpy array of length nd
 87 |             prec: numpy array of length nd
 88 |             ap: scalar, average precision
 89 |     """
 90 | 
 91 |     # construct gt objects
 92 |     class_recs = {} # {img_id: {'bbox': bbox list, 'det': matched list}}
 93 |     npos = 0
 94 |     for img_id in gt.keys():
 95 |         bbox = np.array(gt[img_id])
 96 |         det = [False] * len(bbox)
 97 |         npos += len(bbox)
 98 |         class_recs[img_id] = {'bbox': bbox, 'det': det}
 99 |     # pad empty list to all other imgids
100 |     for img_id in pred.keys():
101 |         if img_id not in gt:
102 |             class_recs[img_id] = {'bbox': np.array([]), 'det': []}
103 | 
104 |     # construct dets
105 |     image_ids = []
106 |     confidence = []
107 |     BB = []
108 |     for img_id in pred.keys():
109 |         for box,score in pred[img_id]:
110 |             image_ids.append(img_id)
111 |             confidence.append(score)
112 |             BB.append(box)
113 |     confidence = np.array(confidence)
114 |     BB = np.array(BB) # (nd,4 or 8,3 or 6)
115 | 
116 |     # sort by confidence
117 |     sorted_ind = np.argsort(-confidence)
118 |     sorted_scores = np.sort(-confidence)
119 |     BB = BB[sorted_ind, ...]
120 |     image_ids = [image_ids[x] for x in sorted_ind]
121 | 
122 |     # go down dets and mark TPs and FPs
123 |     nd = len(image_ids)
124 |     tp = np.zeros(nd)
125 |     fp = np.zeros(nd)
126 |     for d in range(nd):
127 |         #if d%100==0: print(d)
128 |         R = class_recs[image_ids[d]]
129 |         bb = BB[d,...].astype(float)
130 |         ovmax = -np.inf
131 |         BBGT = R['bbox'].astype(float)
132 | 
133 |         if BBGT.size > 0:
134 |             # compute overlaps
135 |             for j in range(BBGT.shape[0]):
136 |                 iou = get_iou_main(get_iou_func, (bb, BBGT[j,...]))
137 |                 if iou > ovmax:
138 |                     ovmax = iou
139 |                     jmax = j
140 | 
141 |         #print d, ovmax
142 |         if ovmax > ovthresh:
143 |             if not R['det'][jmax]:
144 |                 tp[d] = 1.
145 |                 R['det'][jmax] = 1
146 |             else:
147 |                 fp[d] = 1.
148 |         else:
149 |             fp[d] = 1.
150 | 
151 |     # compute precision recall
152 |     fp = np.cumsum(fp)
153 |     tp = np.cumsum(tp)
154 |     rec = tp / float(npos)
155 |     #print('NPOS: ', npos)
156 |     # avoid divide by zero in case the first detection matches a difficult
157 |     # ground truth
158 |     prec = tp / np.maximum(tp + fp, np.finfo(np.float64).eps)
159 |     ap = voc_ap(rec, prec, use_07_metric)
160 | 
161 |     return rec, prec, ap
162 | 
163 | def eval_det_cls_wrapper(arguments):
164 |     pred, gt, ovthresh, use_07_metric, get_iou_func = arguments
165 |     rec, prec, ap = eval_det_cls(pred, gt, ovthresh, use_07_metric, get_iou_func)
166 |     return (rec, prec, ap)
167 | 
168 | def eval_det(pred_all, gt_all, ovthresh=0.25, use_07_metric=False, get_iou_func=get_iou):
169 |     """ Generic functions to compute precision/recall for object detection
170 |         for multiple classes.
171 |         Input:
172 |             pred_all: map of {img_id: [(classname, bbox, score)]}
173 |             gt_all: map of {img_id: [(classname, bbox)]}
174 |             ovthresh: scalar, iou threshold
175 |             use_07_metric: bool, if true use VOC07 11 point method
176 |         Output:
177 |             rec: {classname: rec}
178 |             prec: {classname: prec_all}
179 |             ap: {classname: scalar}
180 |     """
181 |     pred = {} # map {classname: pred}
182 |     gt = {} # map {classname: gt}
183 |     for img_id in pred_all.keys():
184 |         for classname, bbox, score in pred_all[img_id]:
185 |             if classname not in pred: pred[classname] = {}
186 |             if img_id not in pred[classname]:
187 |                 pred[classname][img_id] = []
188 |             if classname not in gt: gt[classname] = {}
189 |             if img_id not in gt[classname]:
190 |                 gt[classname][img_id] = []
191 |             pred[classname][img_id].append((bbox,score))
192 |     for img_id in gt_all.keys():
193 |         for classname, bbox in gt_all[img_id]:
194 |             if classname not in gt: gt[classname] = {}
195 |             if img_id not in gt[classname]:
196 |                 gt[classname][img_id] = []
197 |             gt[classname][img_id].append(bbox)
198 | 
199 |     rec = {}
200 |     prec = {}
201 |     ap = {}
202 |     for classname in gt.keys():
203 |         print('Computing AP for class: ', classname)
204 |         rec[classname], prec[classname], ap[classname] = eval_det_cls(pred[classname], gt[classname], ovthresh, use_07_metric, get_iou_func)
205 |         print(classname, ap[classname])
206 |     
207 |     return rec, prec, ap 
208 | 
209 | from multiprocessing import Pool
210 | def eval_det_multiprocessing(pred_all, gt_all, ovthresh=0.25, use_07_metric=False, get_iou_func=get_iou):
211 |     """ Generic functions to compute precision/recall for object detection
212 |         for multiple classes.
213 |         Input:
214 |             pred_all: map of {img_id: [(classname, bbox, score)]}
215 |             gt_all: map of {img_id: [(classname, bbox)]}
216 |             ovthresh: scalar, iou threshold
217 |             use_07_metric: bool, if true use VOC07 11 point method
218 |         Output:
219 |             rec: {classname: rec}
220 |             prec: {classname: prec_all}
221 |             ap: {classname: scalar}
222 |     """
223 |     pred = {} # map {classname: pred}
224 |     gt = {} # map {classname: gt}
225 |     for img_id in pred_all.keys():
226 |         for classname, bbox, score in pred_all[img_id]:
227 |             if classname not in pred: pred[classname] = {}
228 |             if img_id not in pred[classname]:
229 |                 pred[classname][img_id] = []
230 |             if classname not in gt: gt[classname] = {}
231 |             if img_id not in gt[classname]:
232 |                 gt[classname][img_id] = []
233 |             pred[classname][img_id].append((bbox,score))
234 |     for img_id in gt_all.keys():
235 |         for classname, bbox in gt_all[img_id]:
236 |             if classname not in gt: gt[classname] = {}
237 |             if img_id not in gt[classname]:
238 |                 gt[classname][img_id] = []
239 |             gt[classname][img_id].append(bbox)
240 | 
241 |     rec = {}
242 |     prec = {}
243 |     ap = {}
244 |     p = Pool(processes=10)
245 |     ret_values = p.map(eval_det_cls_wrapper, [(pred[classname], gt[classname], ovthresh, use_07_metric, get_iou_func) for classname in gt.keys() if classname in pred])
246 |     p.close()
247 |     for i, classname in enumerate(gt.keys()):
248 |         if classname in pred:
249 |             rec[classname], prec[classname], ap[classname] = ret_values[i]
250 |         else:
251 |             rec[classname] = 0
252 |             prec[classname] = 0
253 |             ap[classname] = 0
254 |         print(classname, ap[classname])
255 |     
256 |     return rec, prec, ap 
257 | 


--------------------------------------------------------------------------------
/utils/metric_util.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) Facebook, Inc. and its affiliates.
  2 | # 
  3 | # This source code is licensed under the MIT license found in the
  4 | # LICENSE file in the root directory of this source tree.
  5 | 
  6 | """ Utility functions for metric evaluation.
  7 | 
  8 | Author: Or Litany and Charles R. Qi
  9 | """
 10 | 
 11 | import os
 12 | import sys
 13 | import torch
 14 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
 15 | sys.path.append(BASE_DIR)
 16 | 
 17 | import numpy as np
 18 | 
 19 | # Mesh IO
 20 | import trimesh
 21 | 
 22 |  
 23 | # ----------------------------------------
 24 | # Precision and Recall
 25 | # ----------------------------------------
 26 | 
 27 | def multi_scene_precision_recall(labels, pred, iou_thresh, conf_thresh, label_mask, pred_mask=None):
 28 |     '''
 29 |     Args:
 30 |         labels: (B, N, 6)
 31 |         pred: (B, M, 6)
 32 |         iou_thresh: scalar
 33 |         conf_thresh: scalar
 34 |         label_mask: (B, N,) with values in 0 or 1 to indicate which GT boxes to consider.
 35 |         pred_mask: (B, M,) with values in 0 or 1 to indicate which PRED boxes to consider.
 36 |     Returns:
 37 |         TP,FP,FN,Precision,Recall
 38 |     '''
 39 |     # Make sure the masks are not Torch tensor, otherwise the mask==1 returns uint8 array instead
 40 |     # of True/False array as in numpy
 41 |     assert(not torch.is_tensor(label_mask))
 42 |     assert(not torch.is_tensor(pred_mask))
 43 |     TP, FP, FN = 0, 0, 0
 44 |     if label_mask is None: label_mask = np.ones((labels.shape[0], labels.shape[1]))
 45 |     if pred_mask is None: pred_mask = np.ones((pred.shape[0], pred.shape[1]))
 46 |     for batch_idx in range(labels.shape[0]):
 47 |         TP_i, FP_i, FN_i = single_scene_precision_recall(labels[batch_idx, label_mask[batch_idx,:]==1, :],
 48 |                                                          pred[batch_idx, pred_mask[batch_idx,:]==1, :],
 49 |                                                          iou_thresh, conf_thresh)
 50 |         TP += TP_i
 51 |         FP += FP_i
 52 |         FN += FN_i
 53 |     
 54 |     return TP, FP, FN, precision_recall(TP, FP, FN)
 55 |       
 56 | 
 57 | def single_scene_precision_recall(labels, pred, iou_thresh, conf_thresh):
 58 |     """Compute P and R for predicted bounding boxes. Ignores classes!
 59 |     Args:
 60 |         labels: (N x bbox) ground-truth bounding boxes (6 dims) 
 61 |         pred: (M x (bbox + conf)) predicted bboxes with confidence and maybe classification
 62 |     Returns:
 63 |         TP, FP, FN
 64 |     """
 65 |     
 66 |     
 67 |     # for each pred box with high conf (C), compute IoU with all gt boxes. 
 68 |     # TP = number of times IoU > th ; FP = C - TP 
 69 |     # FN - number of scene objects without good match
 70 |     
 71 |     gt_bboxes = labels[:, :6]      
 72 |     
 73 |     num_scene_bboxes = gt_bboxes.shape[0]
 74 |     conf = pred[:, 6]    
 75 |         
 76 |     conf_pred_bbox = pred[np.where(conf > conf_thresh)[0], :6]
 77 |     num_conf_pred_bboxes = conf_pred_bbox.shape[0]
 78 |     
 79 |     # init an array to keep iou between generated and scene bboxes
 80 |     iou_arr = np.zeros([num_conf_pred_bboxes, num_scene_bboxes])    
 81 |     for g_idx in range(num_conf_pred_bboxes):
 82 |         for s_idx in range(num_scene_bboxes):            
 83 |             iou_arr[g_idx, s_idx] = calc_iou(conf_pred_bbox[g_idx ,:], gt_bboxes[s_idx, :])
 84 |     
 85 |     
 86 |     good_match_arr = (iou_arr >= iou_thresh)
 87 |             
 88 |     TP = good_match_arr.any(axis=1).sum()    
 89 |     FP = num_conf_pred_bboxes - TP        
 90 |     FN = num_scene_bboxes - good_match_arr.any(axis=0).sum()
 91 |     
 92 |     return TP, FP, FN
 93 |     
 94 | 
 95 | def precision_recall(TP, FP, FN):
 96 |     Prec = 1.0 * TP / (TP + FP) if TP+FP>0 else 0
 97 |     Rec = 1.0 * TP / (TP + FN)
 98 |     return Prec, Rec
 99 |     
100 | 
101 | def calc_iou(box_a, box_b):
102 |     """Computes IoU of two axis aligned bboxes.
103 |     Args:
104 |         box_a, box_b: 6D of center and lengths        
105 |     Returns:
106 |         iou
107 |     """        
108 |         
109 |     max_a = box_a[0:3] + box_a[3:6]/2
110 |     max_b = box_b[0:3] + box_b[3:6]/2    
111 |     min_max = np.array([max_a, max_b]).min(0)
112 |         
113 |     min_a = box_a[0:3] - box_a[3:6]/2
114 |     min_b = box_b[0:3] - box_b[3:6]/2
115 |     max_min = np.array([min_a, min_b]).max(0)
116 |     if not ((min_max > max_min).all()):
117 |         return 0.0
118 | 
119 |     intersection = (min_max - max_min).prod()
120 |     vol_a = box_a[3:6].prod()
121 |     vol_b = box_b[3:6].prod()
122 |     union = vol_a + vol_b - intersection
123 |     return 1.0*intersection / union
124 | 
125 | 
126 | if __name__ == '__main__':
127 |     print('running some tests')
128 |     
129 |     ############
130 |     ## Test IoU 
131 |     ############
132 |     box_a = np.array([0,0,0,1,1,1])
133 |     box_b = np.array([0,0,0,2,2,2])
134 |     expected_iou = 1.0/8
135 |     pred_iou = calc_iou(box_a, box_b)
136 |     assert expected_iou == pred_iou, 'function returned wrong IoU'
137 |     
138 |     box_a = np.array([0,0,0,1,1,1])
139 |     box_b = np.array([10,10,10,2,2,2])
140 |     expected_iou = 0.0
141 |     pred_iou = calc_iou(box_a, box_b)
142 |     assert expected_iou == pred_iou, 'function returned wrong IoU'
143 |     
144 |     print('IoU test -- PASSED')
145 |     
146 |     #########################
147 |     ## Test Precition Recall 
148 |     #########################
149 |     gt_boxes = np.array([[0,0,0,1,1,1],[3, 0, 1, 1, 10, 1]])
150 |     detected_boxes = np.array([[0,0,0,1,1,1, 1.0],[3, 0, 1, 1, 10, 1, 0.9]])
151 |     TP, FP, FN = single_scene_precision_recall(gt_boxes, detected_boxes, 0.5, 0.5)
152 |     assert TP == 2 and FP == 0 and FN == 0
153 |     assert precision_recall(TP, FP, FN) == (1, 1)
154 |     
155 |     detected_boxes = np.array([[0,0,0,1,1,1, 1.0]])
156 |     TP, FP, FN = single_scene_precision_recall(gt_boxes, detected_boxes, 0.5, 0.5)
157 |     assert TP == 1 and FP == 0 and FN == 1
158 |     assert precision_recall(TP, FP, FN) == (1, 0.5)
159 |     
160 |     detected_boxes = np.array([[0,0,0,1,1,1, 1.0], [-1,-1,0,0.1,0.1,1, 1.0]])
161 |     TP, FP, FN = single_scene_precision_recall(gt_boxes, detected_boxes, 0.5, 0.5)
162 |     assert TP == 1 and FP == 1 and FN == 1
163 |     assert precision_recall(TP, FP, FN) == (0.5, 0.5)
164 |     
165 |     # wrong box has low confidence
166 |     detected_boxes = np.array([[0,0,0,1,1,1, 1.0], [-1,-1,0,0.1,0.1,1, 0.1]])
167 |     TP, FP, FN = single_scene_precision_recall(gt_boxes, detected_boxes, 0.5, 0.5)
168 |     assert TP == 1 and FP == 0 and FN == 1
169 |     assert precision_recall(TP, FP, FN) == (1, 0.5)
170 |     
171 |     print('Precition Recall test -- PASSED')
172 |     
173 | 


--------------------------------------------------------------------------------
/utils/nms.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) Facebook, Inc. and its affiliates.
  2 | # 
  3 | # This source code is licensed under the MIT license found in the
  4 | # LICENSE file in the root directory of this source tree.
  5 | 
  6 | import numpy as np
  7 | from pc_util import bbox_corner_dist_measure
  8 | 
  9 | # boxes are axis aigned 2D boxes of shape (n,5) in FLOAT numbers with (x1,y1,x2,y2,score)
 10 | ''' Ref: https://www.pyimagesearch.com/2015/02/16/faster-non-maximum-suppression-python/
 11 | Ref: https://github.com/vickyboy47/nms-python/blob/master/nms.py 
 12 | '''
 13 | def nms_2d(boxes, overlap_threshold):
 14 |     x1 = boxes[:,0]
 15 |     y1 = boxes[:,1]
 16 |     x2 = boxes[:,2]
 17 |     y2 = boxes[:,3]
 18 |     score = boxes[:,4]
 19 |     area = (x2-x1)*(y2-y1)
 20 | 
 21 |     I = np.argsort(score)
 22 |     pick = []
 23 |     while (I.size!=0):
 24 |         last = I.size
 25 |         i = I[-1]
 26 |         pick.append(i)
 27 |         suppress = [last-1]
 28 |         for pos in range(last-1):
 29 |             j = I[pos]
 30 |             xx1 = max(x1[i],x1[j])
 31 |             yy1 = max(y1[i],y1[j])
 32 |             xx2 = min(x2[i],x2[j])
 33 |             yy2 = min(y2[i],y2[j])
 34 |             w = xx2-xx1
 35 |             h = yy2-yy1
 36 |             if (w>0 and h>0):
 37 |                 o = w*h/area[j]
 38 |                 print('Overlap is', o)
 39 |                 if (o>overlap_threshold):
 40 |                     suppress.append(pos)
 41 |         I = np.delete(I,suppress)
 42 |     return pick
 43 | 
 44 | def nms_2d_faster(boxes, overlap_threshold, old_type=False):
 45 |     x1 = boxes[:,0]
 46 |     y1 = boxes[:,1]
 47 |     x2 = boxes[:,2]
 48 |     y2 = boxes[:,3]
 49 |     score = boxes[:,4]
 50 |     area = (x2-x1)*(y2-y1)
 51 | 
 52 |     I = np.argsort(score)
 53 |     pick = []
 54 |     while (I.size!=0):
 55 |         last = I.size
 56 |         i = I[-1]
 57 |         pick.append(i)
 58 | 
 59 |         xx1 = np.maximum(x1[i], x1[I[:last-1]])
 60 |         yy1 = np.maximum(y1[i], y1[I[:last-1]])
 61 |         xx2 = np.minimum(x2[i], x2[I[:last-1]])
 62 |         yy2 = np.minimum(y2[i], y2[I[:last-1]])
 63 | 
 64 |         w = np.maximum(0, xx2-xx1)
 65 |         h = np.maximum(0, yy2-yy1)
 66 | 
 67 |         if old_type:
 68 |             o = (w*h)/area[I[:last-1]]
 69 |         else:
 70 |             inter = w*h
 71 |             o = inter / (area[i] + area[I[:last-1]] - inter)
 72 | 
 73 |         I = np.delete(I, np.concatenate(([last-1], np.where(o>overlap_threshold)[0])))
 74 | 
 75 |     return pick
 76 | 
 77 | def nms_3d_faster(boxes, overlap_threshold, old_type=False):
 78 |     x1 = boxes[:,0]
 79 |     y1 = boxes[:,1]
 80 |     z1 = boxes[:,2]
 81 |     x2 = boxes[:,3]
 82 |     y2 = boxes[:,4]
 83 |     z2 = boxes[:,5]
 84 |     score = boxes[:,6]
 85 |     area = (x2-x1)*(y2-y1)*(z2-z1)
 86 | 
 87 |     I = np.argsort(score)
 88 |     pick = []
 89 |     while (I.size!=0):
 90 |         last = I.size
 91 |         i = I[-1]
 92 |         pick.append(i)
 93 | 
 94 |         xx1 = np.maximum(x1[i], x1[I[:last-1]])
 95 |         yy1 = np.maximum(y1[i], y1[I[:last-1]])
 96 |         zz1 = np.maximum(z1[i], z1[I[:last-1]])
 97 |         xx2 = np.minimum(x2[i], x2[I[:last-1]])
 98 |         yy2 = np.minimum(y2[i], y2[I[:last-1]])
 99 |         zz2 = np.minimum(z2[i], z2[I[:last-1]])
100 | 
101 |         l = np.maximum(0, xx2-xx1)
102 |         w = np.maximum(0, yy2-yy1)
103 |         h = np.maximum(0, zz2-zz1)
104 | 
105 |         if old_type:
106 |             o = (l*w*h)/area[I[:last-1]]
107 |         else:
108 |             inter = l*w*h
109 |             o = inter / (area[i] + area[I[:last-1]] - inter)
110 | 
111 |         I = np.delete(I, np.concatenate(([last-1], np.where(o>overlap_threshold)[0])))
112 | 
113 |     return pick
114 | 
115 | def nms_3d_faster_samecls(boxes, overlap_threshold, old_type=False):
116 |     x1 = boxes[:,0]
117 |     y1 = boxes[:,1]
118 |     z1 = boxes[:,2]
119 |     x2 = boxes[:,3]
120 |     y2 = boxes[:,4]
121 |     z2 = boxes[:,5]
122 |     score = boxes[:,6]
123 |     cls = boxes[:,7]
124 |     area = (x2-x1)*(y2-y1)*(z2-z1)
125 | 
126 |     I = np.argsort(score)
127 |     pick = []
128 |     while (I.size!=0):
129 |         last = I.size
130 |         i = I[-1]
131 |         pick.append(i)
132 | 
133 |         xx1 = np.maximum(x1[i], x1[I[:last-1]])
134 |         yy1 = np.maximum(y1[i], y1[I[:last-1]])
135 |         zz1 = np.maximum(z1[i], z1[I[:last-1]])
136 |         xx2 = np.minimum(x2[i], x2[I[:last-1]])
137 |         yy2 = np.minimum(y2[i], y2[I[:last-1]])
138 |         zz2 = np.minimum(z2[i], z2[I[:last-1]])
139 |         cls1 = cls[i]
140 |         cls2 = cls[I[:last-1]]
141 | 
142 |         l = np.maximum(0, xx2-xx1)
143 |         w = np.maximum(0, yy2-yy1)
144 |         h = np.maximum(0, zz2-zz1)
145 | 
146 |         if old_type:
147 |             o = (l*w*h)/area[I[:last-1]]
148 |         else:
149 |             inter = l*w*h
150 |             o = inter / (area[i] + area[I[:last-1]] - inter)
151 |         o = o * (cls1==cls2)
152 | 
153 |         I = np.delete(I, np.concatenate(([last-1], np.where(o>overlap_threshold)[0])))
154 | 
155 |     return pick
156 | 
157 | 
158 | def nms_crnr_dist(boxes, conf, overlap_threshold):
159 |         
160 |     I = np.argsort(conf)
161 |     pick = []
162 |     while (I.size!=0):
163 |         last = I.size
164 |         i = I[-1]
165 |         pick.append(i)        
166 |         
167 |         scores = []
168 |         for ind in I[:-1]:
169 |             scores.append(bbox_corner_dist_measure(boxes[i,:], boxes[ind, :]))
170 | 
171 |         I = np.delete(I, np.concatenate(([last-1], np.where(np.array(scores)>overlap_threshold)[0])))
172 | 
173 |     return pick
174 | 
175 | if __name__=='__main__':
176 |     a = np.random.random((100,5))
177 |     print(nms_2d(a,0.9))
178 |     print(nms_2d_faster(a,0.9))
179 | 


--------------------------------------------------------------------------------
/utils/nn_distance.py:
--------------------------------------------------------------------------------
 1 | # Copyright (c) Facebook, Inc. and its affiliates.
 2 | # 
 3 | # This source code is licensed under the MIT license found in the
 4 | # LICENSE file in the root directory of this source tree.
 5 | 
 6 | """ Chamfer distance in Pytorch.
 7 | Author: Charles R. Qi
 8 | """
 9 | 
10 | import torch
11 | import torch.nn as nn
12 | import numpy as np
13 | 
14 | 
15 | def huber_loss(error, delta=1.0):
16 |     """
17 |     Args:
18 |         error: Torch tensor (d1,d2,...,dk)
19 |     Returns:
20 |         loss: Torch tensor (d1,d2,...,dk)
21 | 
22 |     x = error = pred - gt or dist(pred,gt)
23 |     0.5 * |x|^2                 if |x|<=d
24 |     0.5 * d^2 + d * (|x|-d)     if |x|>d
25 |     Ref: https://github.com/charlesq34/frustum-pointnets/blob/master/models/model_util.py
26 |     """
27 |     abs_error = torch.abs(error)
28 |     #quadratic = torch.min(abs_error, torch.FloatTensor([delta]))
29 |     quadratic = torch.clamp(abs_error, max=delta)
30 |     linear = (abs_error - quadratic)
31 |     loss = 0.5 * quadratic**2 + delta * linear
32 |     return loss
33 | 
34 | def nn_distance(pc1, pc2, l1smooth=False, delta=1.0, l1=False):
35 |     """
36 |     Input:
37 |         pc1: (B,N,C) torch tensor
38 |         pc2: (B,M,C) torch tensor
39 |         l1smooth: bool, whether to use l1smooth loss
40 |         delta: scalar, the delta used in l1smooth loss
41 |     Output:
42 |         dist1: (B,N) torch float32 tensor
43 |         idx1: (B,N) torch int64 tensor
44 |         dist2: (B,M) torch float32 tensor
45 |         idx2: (B,M) torch int64 tensor
46 |     """
47 |     N = pc1.shape[1]
48 |     M = pc2.shape[1]
49 |     pc1_expand_tile = pc1.unsqueeze(2).repeat(1,1,M,1)
50 |     pc2_expand_tile = pc2.unsqueeze(1).repeat(1,N,1,1)
51 |     pc_diff = pc1_expand_tile - pc2_expand_tile
52 |     
53 |     if l1smooth:
54 |         pc_dist = torch.sum(huber_loss(pc_diff, delta), dim=-1) # (B,N,M)
55 |     elif l1:
56 |         pc_dist = torch.sum(torch.abs(pc_diff), dim=-1) # (B,N,M)
57 |     else:
58 |         pc_dist = torch.sum(pc_diff**2, dim=-1) # (B,N,M)
59 |     dist1, idx1 = torch.min(pc_dist, dim=2) # (B,N)
60 |     dist2, idx2 = torch.min(pc_dist, dim=1) # (B,M)
61 |     return dist1, idx1, dist2, idx2
62 | 
63 | def demo_nn_distance():
64 |     np.random.seed(0)
65 |     pc1arr = np.random.random((1,5,3))
66 |     pc2arr = np.random.random((1,6,3))
67 |     pc1 = torch.from_numpy(pc1arr.astype(np.float32))
68 |     pc2 = torch.from_numpy(pc2arr.astype(np.float32))
69 |     dist1, idx1, dist2, idx2 = nn_distance(pc1, pc2)
70 |     print(dist1)
71 |     print(idx1)
72 |     dist = np.zeros((5,6))
73 |     for i in range(5):
74 |         for j in range(6):
75 |             dist[i,j] = np.sum((pc1arr[0,i,:] - pc2arr[0,j,:]) ** 2)
76 |     print(dist)
77 |     print('-'*30)
78 |     print('L1smooth dists:')
79 |     dist1, idx1, dist2, idx2 = nn_distance(pc1, pc2, True)
80 |     print(dist1)
81 |     print(idx1)
82 |     dist = np.zeros((5,6))
83 |     for i in range(5):
84 |         for j in range(6):
85 |             error = np.abs(pc1arr[0,i,:] - pc2arr[0,j,:])
86 |             quad = np.minimum(error, 1.0)
87 |             linear = error - quad
88 |             loss = 0.5*quad**2 + 1.0*linear
89 |             dist[i,j] = np.sum(loss)
90 |     print(dist)
91 | 
92 | 
93 | if __name__ == '__main__':
94 |     demo_nn_distance()
95 | 


--------------------------------------------------------------------------------
/utils/show_results_scannet.py:
--------------------------------------------------------------------------------
  1 | import os
  2 | import sys
  3 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
  4 | ROOT_DIR = BASE_DIR
  5 | sys.path.append(os.path.join(ROOT_DIR, 'scannet'))
  6 | import datetime
  7 | import numpy as np
  8 | import pdb
  9 | import matplotlib.pyplot as pyplot
 10 | import open3d as o3d
 11 | from scipy.spatial.distance import directed_hausdorff
 12 | import json
 13 | import pickle
 14 | import random
 15 | import scipy.io as sio
 16 | 
 17 | 
 18 | THRESH = 0
 19 | THRESH2 = -0.1
 20 | VAL_SCAN_NAMES = [line.rstrip() for line in open('scannet/meta_data/scannetv2_val.txt')] 
 21 | SCANNET_DIR = '/home/bo/data/scannet/scans/' # path of scannet dataset 
 22 | LABEL_MAP_FILE = 'scannet/meta_data/scannetv2-labels.combined.tsv'
 23 | DONOTCARE_CLASS_IDS = np.array([])
 24 | OBJ_CLASS_IDS = np.array([3,4,5,6,7,8,9,10,11,12,14,16,24,28,33,34,36,39])
 25 | MAX_NUM_POINT = 40000
 26 | GT_PATH = '/home/bo/data/scannet/scannet_train_detection_data' # path of data dumped with scripts in scannet folder 
 27 | PRED_PATH = '/home/bo/data/scannet/dump/supp/result' # path of predictions 
 28 | mode = sys.argv[1] # gt or pred
 29 | color_mapping = {3: [255,140,0], 4:[30,144,255], 5:[50,205,50], 6:[255,215,0], 7:[255,69,0], 8:[138,43,226],9:[0,255,255],10:[210,105,30],11:[255,0,255], 12:[255,255,0], 14:[255,20,147], 16:[165,42,42], 24:[100,149,237], 28:[0,128,0], 33:[255,127,80],34:[221,160,221], 36:[95,158,160], 39:[119,136,153]}
 30 | 
 31 | def create_lineset(bbox, colors=[1, 0, 0]):
 32 |     ''' create bounding box
 33 |     '''
 34 |     xmin = bbox[0] - bbox[3] / 2
 35 |     xmax = bbox[0] + bbox[3] / 2
 36 |     ymin = bbox[1] - bbox[4] / 2
 37 |     ymax = bbox[1] + bbox[4] / 2
 38 |     zmin = bbox[2] - bbox[5] / 2
 39 |     zmax = bbox[2] + bbox[5] / 2
 40 |     points = [[xmin, ymin, zmin], [xmin, ymin, zmax], [xmin, ymax, zmin], [xmin, ymax, zmax],
 41 |               [xmax, ymin, zmin], [xmax, ymin, zmax], [xmax, ymax, zmin], [xmax, ymax, zmax]]
 42 |     lines = [[0, 1], [0, 2], [2, 3], [1, 3], [0, 4], [1, 5], [3, 7], [2, 6],
 43 |              [4, 5], [5, 7], [6, 7], [4, 6]]
 44 |     line_set = o3d.geometry.LineSet()
 45 |     line_set.points = o3d.utility.Vector3dVector(points)
 46 |     line_set.lines = o3d.utility.Vector2iVector(lines)
 47 |     line_set.colors = o3d.utility.Vector3dVector(np.tile(colors, [12, 1]))
 48 |     return line_set
 49 | 
 50 | def load_view_point(pcd, filename, window_name):
 51 |     if mode=='pred':
 52 |         left = 50
 53 |         top=50
 54 |     elif mode=='gt':
 55 |         left = 1000
 56 |         top=50
 57 |     else:
 58 |          print("model must be gt or pred")
 59 |          return
 60 |     vis = o3d.visualization.Visualizer()
 61 |     vis.create_window(window_name, width=880, height=680, left=left, top=top)
 62 |     for part in pcd:
 63 |         vis.add_geometry(part)
 64 |     ctr = vis.get_view_control()
 65 |     current_param = ctr.convert_to_pinhole_camera_parameters()
 66 |     trajectory = o3d.io.read_pinhole_camera_trajectory(filename)
 67 |     f = 983.80485869912241
 68 |     cx = current_param.intrinsic.width / 2 - 0.5
 69 |     cy = current_param.intrinsic.height / 2 - 0.5
 70 |     trajectory.parameters[0].intrinsic.set_intrinsics(current_param.intrinsic.width, current_param.intrinsic.height, f, f, cx, cy)
 71 | 
 72 |     ctr.convert_from_pinhole_camera_parameters(trajectory.parameters[0])
 73 |     vis.run()
 74 |     vis.destroy_window()
 75 | 
 76 | def select_bbox(bboxes):
 77 |     choose_ids = []
 78 |     for i in range(bboxes.shape[0]):
 79 |         if bboxes[i,-1] in OBJ_CLASS_IDS:
 80 |             choose_ids.append(i)
 81 |     bboxes = bboxes[choose_ids]
 82 |     return bboxes
 83 | 
 84 | def export_one_scan(scan_name):
 85 |     pt = np.load(os.path.join(GT_PATH, scan_name+'_vert.npy'))
 86 |     np.savetxt('tmp.xyz', pt)
 87 |     os.system("mv tmp.xyz tmp.xyzrgb")
 88 |     pcd  = o3d.io.read_point_cloud('tmp.xyzrgb')
 89 | 
 90 |     gt_bbox = np.load(os.path.join(GT_PATH, scan_name+'_all_angle_40cls.npy'))
 91 |     gt_bbox = select_bbox(np.unique(gt_bbox,axis=0))
 92 |     semantic_labels = gt_bbox[:,-1]
 93 |     pred_proposals = np.load(os.path.join(PRED_PATH, 'opt'+scan_name+'_nms.npy'))
 94 | 
 95 |     mask = np.logical_not(np.in1d(semantic_labels, DONOTCARE_CLASS_IDS))
 96 |     semantic_labels = semantic_labels[mask]
 97 | 
 98 |     bb =[]
 99 |     if mode=='gt':
100 |         boundingboxes = gt_bbox
101 |     elif mode =='pred':
102 |         boundingboxes = pred_proposals
103 |     else:
104 |           print("model must be gt or pred")
105 |           return
106 | 
107 |     for i in range(boundingboxes.shape[0]):
108 |         if mode =='gt':
109 |             c = np.array(color_mapping[int(boundingboxes[i,-1])])/255.0
110 |         else:
111 |             c = np.array(color_mapping[int(OBJ_CLASS_IDS[int(boundingboxes[i,-1])-1])])/255.0
112 |         for _ in range(2):
113 |             bb.append(create_lineset(boundingboxes[i]+0.005*(np.random.rand()-0.5)*2, colors=c))
114 |     load_view_point([pcd] + bb, './viewpoint.json', window_name=scan_name+'_'+mode)
115 | 
116 | 
117 | def batch_export():
118 |     for i, scan_name in enumerate(sorted(VAL_SCAN_NAMES)):
119 |         #if not scan_name.endswith('_00'):
120 |         #    continue
121 |         print('-'*20+'begin')
122 |         print(datetime.datetime.now())
123 |         print(scan_name)
124 |         export_one_scan(scan_name)
125 |         print('-'*20+'done')
126 | 
127 | if __name__=='__main__':
128 |     batch_export()
129 | 


--------------------------------------------------------------------------------
/utils/show_results_sunrgbd.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) Facebook, Inc. and its affiliates.
  2 | #
  3 | # This source code is licensed under the MIT license found in the
  4 | # LICENSE file in the root directory of this source tree.
  5 | 
  6 | """ Batch mode in loading Scannet scenes with vertices and ground truth labels
  7 | for semantic and instance segmentations
  8 | 
  9 | Usage example: python ./batch_load_scannet_data.py
 10 | """
 11 | import os
 12 | import sys
 13 | import datetime
 14 | import numpy as np
 15 | import pdb
 16 | import matplotlib.pyplot as pyplot
 17 | import open3d as o3d
 18 | from scipy.spatial.distance import directed_hausdorff
 19 | import json
 20 | import pickle
 21 | import random
 22 | import scipy.io as sio
 23 | from pc_util import params2bbox, write_ply_rgb
 24 | 
 25 | THRESH = 0
 26 | THRESH2 = -0.1
 27 | DATA_DIR = os.path.join('/home/bo/data/sunrgbd/sunrgbd_pc_bbox_votes_50k_v1_val') # path of sunrgbd dataset 
 28 | VAL_SCAN_NAMES = sorted(list(set([os.path.basename(x)[0:6] for x in os.listdir(DATA_DIR)])))
 29 | PRED_PATH= '/home/bo/projects/cvpr2020/detection/new/new/sunrgbd/code_sunrgbd/indoor_scene_understanding/dump_sunrgbd/result' # path of predictions
 30 | 
 31 | DONOTCARE_CLASS_IDS = np.array([])
 32 | MAX_NUM_POINT = 40000
 33 | mode = sys.argv[1]
 34 | 
 35 | color_mapping = {1:[30,144,255], 2:[255,69,0], 3:[255,215,0], 4:[50,205,50], 5:[255,127,80],
 36 |         6:[255,20,147], 7:[100,149,237], 8:[255,127,80],9:[210,105,30], 10:[221,160,221],11:[95,158,  160]}
 37 | 
 38 | def create_lineset_old(bbox, colors=[1, 0, 0]):
 39 |     ''' create bounding box
 40 |     '''
 41 |     xmin = bbox[0] - bbox[3] / 2
 42 |     xmax = bbox[0] + bbox[3] / 2
 43 |     ymin = bbox[1] - bbox[4] / 2
 44 |     ymax = bbox[1] + bbox[4] / 2
 45 |     zmin = bbox[2] - bbox[5] / 2
 46 |     zmax = bbox[2] + bbox[5] / 2
 47 |     points = [[xmin, ymin, zmin], [xmin, ymin, zmax], [xmin, ymax, zmin], [xmin, ymax, zmax],
 48 |               [xmax, ymin, zmin], [xmax, ymin, zmax], [xmax, ymax, zmin], [xmax, ymax, zmax]]
 49 |     lines = [[0, 1], [0, 2], [2, 3], [1, 3], [0, 4], [1, 5], [3, 7], [2, 6],
 50 |              [4, 5], [5, 7], [6, 7], [4, 6]]
 51 |     line_set = o3d.geometry.LineSet()
 52 |     line_set.points = o3d.utility.Vector3dVector(points)
 53 |     line_set.lines = o3d.utility.Vector2iVector(lines)
 54 |     line_set.colors = o3d.utility.Vector3dVector(np.tile(colors, [12, 1]))
 55 |     return line_set
 56 | 
 57 | 
 58 | def create_lineset(bbox, colors=[1, 0, 0]):
 59 |     ''' create bounding box
 60 |     '''
 61 |     points = params2bbox(bbox)
 62 |     lines = [[0, 1], [0, 2], [2, 3], [1, 3], [0, 4], [1, 5], [3, 7], [2, 6],
 63 |              [4, 5], [5, 7], [6, 7], [4, 6]]
 64 |     line_set = o3d.geometry.LineSet()
 65 |     line_set.points = o3d.utility.Vector3dVector(points)
 66 |     line_set.lines = o3d.utility.Vector2iVector(lines)
 67 |     line_set.colors = o3d.utility.Vector3dVector(np.tile(colors, [12, 1]))
 68 |     return line_set
 69 | 
 70 | 
 71 | def load_view_point(pcd, filename, window_name):
 72 |     if mode=='pred':
 73 |         left = 50
 74 |         top=50
 75 |     elif mode=='gt':
 76 |         left = 1000
 77 |         top=730
 78 |     else:
 79 |         print("model must be gt or pred")
 80 |         return
 81 | 
 82 |     vis = o3d.visualization.Visualizer()
 83 |     vis.create_window(window_name, width=880, height=680, left=left, top=top)
 84 |     for part in pcd:
 85 |         vis.add_geometry(part)
 86 |     ctr = vis.get_view_control()
 87 |     current_param = ctr.convert_to_pinhole_camera_parameters()
 88 |     trajectory = o3d.io.read_pinhole_camera_trajectory(filename)
 89 |     f = 983.80485869912241
 90 |     cx = current_param.intrinsic.width / 2 - 0.5
 91 |     cy = current_param.intrinsic.height / 2 - 0.5
 92 |     trajectory.parameters[0].intrinsic.set_intrinsics(current_param.intrinsic.width, current_param.intrinsic.height, f, f, cx, cy)
 93 | 
 94 |     ctr.convert_from_pinhole_camera_parameters(trajectory.parameters[0])
 95 |     vis.run()
 96 |     vis.destroy_window()
 97 | 
 98 | def select_bbox(bboxes):
 99 |     choose_ids = []
100 |     for i in range(bboxes.shape[0]):
101 |         if bboxes[i,-1] in OBJ_CLASS_IDS:
102 |             choose_ids.append(i)
103 |     bboxes = bboxes[choose_ids]
104 |     return bboxes
105 | 
106 | def export_one_scan(scan_name):
107 |     pt = np.load(os.path.join(DATA_DIR, scan_name+'_pc.npz'))['pc']
108 |     np.savetxt(mode+'tmp.xyz', pt)
109 |     os.system("mv {}tmp.xyz {}tmp.xyzrgb".format(mode, mode))
110 |     point_cloud = o3d.io.read_point_cloud(mode+'tmp.xyzrgb')
111 | 
112 |     pred_proposals = np.load(os.path.join(PRED_PATH, 'center'+scan_name+'_nms.npy'))
113 |     gt_bbox = sio.loadmat(os.path.join(PRED_PATH, 'center'+scan_name+'_gt.mat'))['gt']
114 |     bb =[]
115 |     if mode=='gt':
116 |         boundingboxes = gt_bbox
117 |     elif mode =='pred':
118 |         boundingboxes = pred_proposals
119 |     else:
120 |         print("model must be gt or pred")
121 |         return
122 |     for i in range(boundingboxes.shape[0]):
123 |         c = np.array(color_mapping[int(boundingboxes[i,-1])])/255.0
124 |         for _ in range(2):
125 |             bb.append(create_lineset(boundingboxes[i]+0.005*(np.random.rand()-0.5)*2, colors=c))
126 |     load_view_point([point_cloud] + bb, './viewpoint.json', window_name=scan_name+'_'+mode)
127 | 
128 | 
129 | def batch_export():
130 |     for i, scan_name in enumerate(VAL_SCAN_NAMES):
131 |         if not scan_name.endswith('10'):
132 |             continue
133 |         print('-'*20+'begin')
134 |         print(scan_name)
135 |         export_one_scan(scan_name)
136 |         print('-'*20+'done')
137 | 
138 | if __name__=='__main__':
139 |     batch_export()
140 | 


--------------------------------------------------------------------------------
/utils/tf_logger.py:
--------------------------------------------------------------------------------
 1 | # Copyright (c) Facebook, Inc. and its affiliates.
 2 | # 
 3 | # This source code is licensed under the MIT license found in the
 4 | # LICENSE file in the root directory of this source tree.
 5 | 
 6 | import tensorflow as tf
 7 | import numpy as np
 8 | import scipy.misc 
 9 | try:
10 |     from StringIO import StringIO  # Python 2.7
11 | except ImportError:
12 |     from io import BytesIO         # Python 3.x
13 | 
14 | 
15 | class Logger(object):
16 |     
17 |     def __init__(self, log_dir):
18 |         """Create a summary writer logging to log_dir."""
19 |         self.writer = tf.summary.FileWriter(log_dir)
20 | 
21 |     def scalar_summary(self, tag, value, step):
22 |         """Log a scalar variable."""
23 |         summary = tf.Summary(value=[tf.Summary.Value(tag=tag, simple_value=value)])
24 |         self.writer.add_summary(summary, step)
25 | 
26 |     def image_summary(self, tag, images, step):
27 |         """Log a list of images."""
28 | 
29 |         img_summaries = []
30 |         for i, img in enumerate(images):
31 |             # Write the image to a string
32 |             try:
33 |                 s = StringIO()
34 |             except:
35 |                 s = BytesIO()
36 |             scipy.misc.toimage(img).save(s, format="png")
37 | 
38 |             # Create an Image object
39 |             img_sum = tf.Summary.Image(encoded_image_string=s.getvalue(),
40 |                                        height=img.shape[0],
41 |                                        width=img.shape[1])
42 |             # Create a Summary value
43 |             img_summaries.append(tf.Summary.Value(tag='%s/%d' % (tag, i), image=img_sum))
44 | 
45 |         # Create and write Summary
46 |         summary = tf.Summary(value=img_summaries)
47 |         self.writer.add_summary(summary, step)
48 |         
49 |     def histo_summary(self, tag, values, step, bins=1000):
50 |         """Log a histogram of the tensor of values."""
51 | 
52 |         # Create a histogram using numpy
53 |         counts, bin_edges = np.histogram(values, bins=bins)
54 | 
55 |         # Fill the fields of the histogram proto
56 |         hist = tf.HistogramProto()
57 |         hist.min = float(np.min(values))
58 |         hist.max = float(np.max(values))
59 |         hist.num = int(np.prod(values.shape))
60 |         hist.sum = float(np.sum(values))
61 |         hist.sum_squares = float(np.sum(values**2))
62 | 
63 |         # Drop the start of the first bin
64 |         bin_edges = bin_edges[1:]
65 | 
66 |         # Add bin edges and counts
67 |         for edge in bin_edges:
68 |             hist.bucket_limit.append(edge)
69 |         for c in counts:
70 |             hist.bucket.append(c)
71 | 
72 |         # Create and write Summary
73 |         summary = tf.Summary(value=[tf.Summary.Value(tag=tag, histo=hist)])
74 |         self.writer.add_summary(summary, step)
75 |         self.writer.flush()
76 | 


--------------------------------------------------------------------------------
/utils/tf_visualizer.py:
--------------------------------------------------------------------------------
 1 | # Copyright (c) Facebook, Inc. and its affiliates.
 2 | # 
 3 | # This source code is licensed under the MIT license found in the
 4 | # LICENSE file in the root directory of this source tree.
 5 | 
 6 | '''Code adapted from https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix'''
 7 | import os
 8 | import time
 9 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
10 | import sys
11 | sys.path.append(BASE_DIR)
12 | import tf_logger
13 | 
14 | 
15 | class Visualizer():
16 |     def __init__(self, opt, name='train'):
17 |         # self.opt = opt
18 |         #self.logger = tf_logger.Logger(os.path.join(opt.logging_dir, opt.name))
19 |         #self.log_name = os.path.join(opt.checkpoint_dir, opt.name, 'loss_log.txt')
20 |         self.logger = tf_logger.Logger(os.path.join(opt.log_dir, name))
21 |         self.log_name = os.path.join(opt.log_dir, 'tf_visualizer_log.txt')
22 |         with open(self.log_name, "a") as log_file:
23 |             now = time.strftime("%c")
24 |             log_file.write('================ Training Loss (%s) ================\n' % now)
25 | 
26 |     # |visuals|: dictionary of images to save
27 |     def log_images(self, visuals, step):
28 |             for label, image_numpy in visuals.items():
29 |                 self.logger.image_summary(
30 |                     label, [image_numpy], step)
31 | 
32 |     # scalars: dictionary of scalar labels and values
33 |     def log_scalars(self, scalars, step):
34 |         for label, val in scalars.items():
35 |             self.logger.scalar_summary(label, val, step)
36 | 
37 |     # scatter plots
38 |     def plot_current_points(self, points, disp_offset=10):
39 |         pass
40 | 
41 |     # scalars: same format as |scalars| of plot_current_scalars
42 |     def print_current_scalars(self, epoch, i, scalars):
43 |         message = '(epoch: %d, iters: %d) ' % (epoch, i)
44 |         for k, v in scalars.items():
45 |             message += '%s: %.3f ' % (k, v)
46 | 
47 |         print(message)
48 |         with open(self.log_name, "a") as log_file:
49 |             log_file.write('%s\n' % message)
50 | 


--------------------------------------------------------------------------------
/utils/utils.py:
--------------------------------------------------------------------------------
 1 | import torch
 2 | import torch.nn as nn
 3 | import torch.nn.functional as F
 4 | 
 5 | def conv3x3x3(in_planes, out_planes, stride):
 6 |     # 3x3x3 convolution with padding
 7 |     return nn.Conv3d(
 8 |         in_planes,
 9 |         out_planes,
10 |         kernel_size=3,
11 |         stride=stride,
12 |         padding=1)
13 | def upconv3x3x3(in_planes, out_planes, stride):
14 |     return nn.ConvTranspose3d(
15 |         in_planes, 
16 |         out_planes, 
17 |         kernel_size=3, 
18 |         stride=1,
19 |         padding=1,
20 |         output_padding=1)
21 |     
22 | def conv_block_3d(in_dim, out_dim, activation):
23 |     return nn.Sequential(
24 |         nn.Conv3d(in_dim, out_dim, kernel_size=3, stride=1, padding=1),
25 |         nn.BatchNorm3d(out_dim),
26 |         activation,)
27 | 
28 | 
29 | def conv_trans_block_3d(in_dim, out_dim, activation, stride=2):
30 |     return nn.Sequential(
31 |         nn.ConvTranspose3d(in_dim, out_dim, kernel_size=3, stride=stride, padding=1, output_padding=1),
32 |         nn.BatchNorm3d(out_dim),
33 |         activation,)
34 | 
35 | 
36 | def max_pooling_3d():
37 |     return nn.MaxPool3d(kernel_size=2, stride=2, padding=0)
38 | 
39 | 
40 | def conv_block_2_3d(in_dim, out_dim, activation, stride=1):
41 |     return nn.Sequential(
42 |         conv_block_3d(in_dim, out_dim, activation),
43 |         nn.Conv3d(out_dim, out_dim, kernel_size=3, stride=stride, padding=1),
44 |         nn.BatchNorm3d(out_dim),)
45 | 
46 | 


--------------------------------------------------------------------------------
/utils/viewpoint.json:
--------------------------------------------------------------------------------
 1 | {
 2 | 	"class_name" : "PinholeCameraTrajectory",
 3 | 	"parameters" : 
 4 | 	[
 5 | 		{
 6 | 			"class_name" : "PinholeCameraParameters",
 7 | 			"extrinsic" : 
 8 | 			[
 9 | 				0.99916142714838663,
10 | 				-0.007048749653398266,
11 | 				-0.040333083531057058,
12 | 				0.0,
13 | 				0.020877457243770447,
14 | 				-0.75968410011193177,
15 | 				0.64995707536433445,
16 | 				0.0,
17 | 				-0.035221786976728557,
18 | 				-0.65025409123314892,
19 | 				-0.75889988968026456,
20 | 				0.0,
21 | 				0.27650272158383526,
22 | 				0.43341214902144198,
23 | 				12.630418838778768,
24 | 				1.0
25 | 			],
26 | 			"intrinsic" : 
27 | 			{
28 | 				"height" : 1136,
29 | 				"intrinsic_matrix" : 
30 | 				[
31 | 					983.80485869912241,
32 | 					0.0,
33 | 					0.0,
34 | 					0.0,
35 | 					983.80485869912241,
36 | 					0.0,
37 | 					959.5,
38 | 					567.5,
39 | 					1.0
40 | 				],
41 | 				"width" : 1920
42 | 			},
43 | 			"version_major" : 1,
44 | 			"version_minor" : 0
45 | 		}
46 | 	],
47 | 	"version_major" : 1,
48 | 	"version_minor" : 0
49 | }


--------------------------------------------------------------------------------