├── .gitignore ├── 3DInsSegNet.py ├── LICENSE ├── README.md ├── comparison.png ├── data_preprocessing ├── PointsData │ ├── Points.py │ └── util.py ├── README.md ├── convert_points_to_tfrecords_level123.py ├── get_points_filelist.py ├── get_points_filelist_level123.py ├── parse_h5_to_points.py ├── run_convert_points_to_tfrecords_level123.py ├── run_get_points_filelist.py └── run_get_points_filelist_level123.py ├── get_results.py ├── ocnn └── tensorflow │ ├── libs │ ├── __init__.py │ └── libocnn_docker.so │ └── script │ └── ocnn.py ├── run_partnet_test.py └── util ├── category_info.py ├── cluster.py ├── config.py ├── dataset.py ├── instance_metric.py ├── network.py ├── numeric_function.py ├── transform.py └── vis_pointcloud.py /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | pip-wheel-metadata/ 24 | share/python-wheels/ 25 | *.egg-info/ 26 | .installed.cfg 27 | *.egg 28 | MANIFEST 29 | 30 | # PyInstaller 31 | # Usually these files are written by a python script from a template 32 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 
33 | *.manifest 34 | *.spec 35 | 36 | # Installer logs 37 | pip-log.txt 38 | pip-delete-this-directory.txt 39 | 40 | # Unit test / coverage reports 41 | htmlcov/ 42 | .tox/ 43 | .nox/ 44 | .coverage 45 | .coverage.* 46 | .cache 47 | nosetests.xml 48 | coverage.xml 49 | *.cover 50 | *.py,cover 51 | .hypothesis/ 52 | .pytest_cache/ 53 | 54 | # Translations 55 | *.mo 56 | *.pot 57 | 58 | # Django stuff: 59 | *.log 60 | local_settings.py 61 | db.sqlite3 62 | db.sqlite3-journal 63 | 64 | # Flask stuff: 65 | instance/ 66 | .webassets-cache 67 | 68 | # Scrapy stuff: 69 | .scrapy 70 | 71 | # Sphinx documentation 72 | docs/_build/ 73 | 74 | # PyBuilder 75 | target/ 76 | 77 | # Jupyter Notebook 78 | .ipynb_checkpoints 79 | 80 | # IPython 81 | profile_default/ 82 | ipython_config.py 83 | 84 | # pyenv 85 | .python-version 86 | 87 | # pipenv 88 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 89 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 90 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 91 | # install all needed dependencies. 92 | #Pipfile.lock 93 | 94 | # PEP 582; used by e.g. 
github.com/David-OConnor/pyflow 95 | __pypackages__/ 96 | 97 | # Celery stuff 98 | celerybeat-schedule 99 | celerybeat.pid 100 | 101 | # SageMath parsed files 102 | *.sage.py 103 | 104 | # Environments 105 | .env 106 | .venv 107 | env/ 108 | venv/ 109 | ENV/ 110 | env.bak/ 111 | venv.bak/ 112 | 113 | # Spyder project settings 114 | .spyderproject 115 | .spyproject 116 | 117 | # Rope project settings 118 | .ropeproject 119 | 120 | # mkdocs documentation 121 | /site 122 | 123 | # mypy 124 | .mypy_cache/ 125 | .dmypy.json 126 | dmypy.json 127 | 128 | # Pyre type checker 129 | .pyre/ 130 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2022 Chunyu Sun 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Semantic Segmentation-Assisted Instance Feature Fusion for Multi-level 3D Part Instance Segmentation 2 | 3 | comparison 4 | 5 | 6 | ## Introduction 7 | 8 | This work is based on our CVM paper. We propose a new method for 3D shape instance segmentation. You can check our [project webpage](https://isunchy.github.io/projects/3d_instance_segmentation.html) for a quick overview. 9 | 10 | Recognizing 3D part instances from 3D point clouds is crucial for 3D structure and scene understanding. Many learning-based approaches simply utilize semantic segmentation and instance center prediction as training tasks and fail to further exploit the inherent relationship between shape semantics and part instances. In this paper, we present a new method for 3D part instance segmentation. Our method exploits semantic segmentation for fusing nonlocal instance features for instance center prediction and further enhances the fusion scheme in a multi- and cross-level way. We also propose a semantic region center prediction task for training and leverage the prediction results to improve the clustering of instance points. Our method outperforms existing methods with a large-margin improvement in the PartNet benchmark. We also demonstrate that our feature fusion scheme can be applied to other existing methods to improve their performance in indoor scene instance segmentation tasks. 11 | 12 | In this repository, we release the code and data for training the networks for 3D shape instance segmentation. 
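The introduction above mentions instance center prediction followed by clustering of instance points. As a rough illustration of that general idea only (a minimal sketch, not the implementation in `util/cluster.py`; the function name `cluster_shifted_points`, its parameters, and the greedy radius-based grouping are all assumptions made for this example), each point can be moved by its predicted offset toward an instance center, and the shifted points that share a semantic label and lie close together can then be grouped:

```python
import numpy as np

def cluster_shifted_points(points, offsets, semantic_labels, radius=0.05):
    """Group points into instances after shifting them by predicted offsets.

    Hypothetical helper for illustration; the grouping rule (greedy,
    fixed radius) is an assumption, not the method used in the paper.
    """
    shifted = points + offsets  # each point "votes" for its instance center
    instance_ids = np.full(len(points), -1, dtype=np.int64)
    next_id = 0
    for label in np.unique(semantic_labels):
        idx = np.flatnonzero(semantic_labels == label)
        for i in idx:
            if instance_ids[i] != -1:
                continue  # already absorbed by an earlier seed
            # Seed a new instance at point i and absorb every unassigned
            # point with the same semantic label within `radius`.
            dist = np.linalg.norm(shifted[idx] - shifted[i], axis=1)
            members = idx[(dist < radius) & (instance_ids[idx] == -1)]
            instance_ids[members] = next_id
            next_id += 1
    return instance_ids

# Toy input: four points of one semantic class around two instance centers.
points = np.array([[0.1, 0, 0], [-0.1, 0, 0], [1.1, 0, 0], [0.9, 0, 0]])
centers = np.array([[0.0, 0, 0], [0.0, 0, 0], [1.0, 0, 0], [1.0, 0, 0]])
ids = cluster_shifted_points(points, centers - points, np.zeros(4, dtype=int))
```

In this toy input the offsets are exact, so the first two points end up in one instance and the last two in another. With real network predictions the offsets are noisy, which is why the choice of grouping rule and radius matters.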
13 | 14 | ## Citation 15 | 16 | If you use our code for research, please cite our paper: 17 | ``` 18 | @article{sun2022ins, 19 | title = {Semantic Segmentation-Assisted Instance Feature Fusion for Multi-level 3D Part Instance Segmentation}, 20 | author = {Sun, Chunyu and Tong, Xin and Liu, Yang}, 21 | journal = {Computational Visual Media}, 22 | year = {2022}, 23 | publisher = {Springer} 24 | } 25 | ``` 26 | 27 | ## Setup 28 | 29 | 30 | docker pull tensorflow/tensorflow:1.15.0-gpu-py3 31 | docker run -it --runtime=nvidia -v /path/to/3d_instance_segmentation/:/workspace tensorflow/tensorflow:1.15.0-gpu-py3 32 | cd /workspace 33 | pip install tqdm scipy scikit-learn --user 34 | 35 | 36 | ## Experiments 37 | 38 | 39 | ### Data Preparation 40 | 41 | See the `data_preprocessing` folder for instructions on generating the training and test data. 42 | 43 | We also provide a Baidu drive link for downloading the training and test datasets: 44 | 45 | >[Training data](https://pan.baidu.com/s/1IQoUcak971ENxQQNfn0Q0w?pwd=3ins) 46 | 47 | 48 | ### Training 49 | 50 | To start the training, run 51 | 52 | $ python 3DInsSegNet.py --logdir log/test_chair --train_data data/Chair_level123_train_4489.tfrecords --test_data data/Chair_level123_test_1217.tfrecords --test_data_visual data/Chair_level123_test_1217.tfrecords --train_batch_size 8 --test_batch_size 1 --max_iter 100000 --test_every_iter 5000 --test_iter 1217 --test_iter_visual 0 --cache_folder test_chair --gpu 0 --n_part_1 6 --n_part_2 30 --n_part_3 39 --level_1_weight 1 --level_2_weight 1 --level_3_weight 1 --phase train --seg_loss_weight 1 --offset_weight 1 --sem_offset_weight 1 --learning_rate 0.1 --delete_0 --notest_visual --depth 6 --weight_decay 0.0001 --stop_gradient --category Chair 53 | 54 | ### Test 55 | 56 | To test a trained model, run 57 | 58 | $ python 3DInsSegNet.py --logdir log/test_chair --train_data data/Chair_level123_train_4489.tfrecords --test_data data/Chair_level123_test_1217.tfrecords --test_data_visual 
data/Chair_level123_test_1217.tfrecords --train_batch_size 8 --test_batch_size 1 --max_iter 100000 --test_every_iter 5000 --test_iter 1217 --test_iter_visual 0 --cache_folder test_chair --gpu 0 --n_part_1 6 --n_part_2 30 --n_part_3 39 --level_1_weight 1 --level_2_weight 1 --level_3_weight 1 --phase test --seg_loss_weight 1 --offset_weight 1 --sem_offset_weight 1 --learning_rate 0.1 --ckpt weight/Chair --delete_0 --notest_visual --depth 6 --weight_decay 0.0001 --stop_gradient --category Chair 59 | 60 | We provide the trained weights used in our paper: 61 | 62 | >[Weights](https://pan.baidu.com/s/1EumXaBohQ0p9daw9R5xhLQ?pwd=3ins) 63 | 64 | 65 | 66 | ## License 67 | 68 | MIT License 69 | 70 | ## Contact 71 | 72 | Please contact us (Chunyu Sun sunchyqd@gmail.com, Yang Liu yangliu@microsoft.com) if you have any problems with our implementation. 73 | 74 | -------------------------------------------------------------------------------- /comparison.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/isunchy/3d_instance_segmentation/5eb6d2bfbe76e76e27d1045aa335e5829bec5aeb/comparison.png -------------------------------------------------------------------------------- /data_preprocessing/PointsData/Points.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import os 3 | from util import * 4 | import json 5 | import math 6 | class Part(): 7 | def __init__(self): 8 | self.__part_id = 0 9 | self.__parent_part_id = 0 10 | self.__motion_parent_id = -1 11 | self.__level_id = 0 12 | self.with_motion = 0 13 | self.motion_type = -1 14 | self.center = np.array([0, 0, 0], dtype=np.float32) 15 | self.direction = np.array([0, 0, 0], dtype=np.float32) 16 | self.int_attri_num = 5 17 | self.float_attri_num = 6 18 | self.motion_sons = [] 19 | self.hrch_sons = [] 20 | self.__transform_list = [] 21 | self.__points_index = None 22 | 23 | def get_id(self): 24 | return 
self.__part_id 25 | 26 | def get_parent_id(self): 27 | return self.__parent_part_id 28 | 29 | def get_motion_parent_id(self): 30 | return self.__motion_parent_id 31 | 32 | def get_level_id(self): 33 | return self.__level_id 34 | 35 | def set_id(self, part_id, parent_part_id, motion_parent_id, level_id): 36 | self.__part_id = part_id 37 | self.__parent_part_id = parent_part_id 38 | self.__motion_parent_id = motion_parent_id 39 | self.__level_id = level_id 40 | 41 | def get_header_length(self): 42 | return self.int_attri_num * 4 + self.float_attri_num * 4 43 | 44 | def get_header(self): 45 | header = bytes() 46 | header += encode_int32(self.__part_id) 47 | header += encode_int32(self.__parent_part_id) 48 | header += encode_int32(self.with_motion) 49 | header += encode_int32(self.__motion_parent_id) 50 | header += encode_int32(self.motion_type) 51 | header += self.center.tobytes() 52 | header += self.direction.tobytes() 53 | return header 54 | def decode_header(self, header): 55 | int_attri = self.int_attri_num 56 | bytes_offset = 0 57 | self.__part_id, self.__parent_part_id, self.with_motion, self.__motion_parent_id, self.motion_type = np.frombuffer(header, dtype=np.int32, count=int_attri, offset=bytes_offset) 58 | bytes_offset += int_attri * 4 59 | self.center = np.frombuffer(header, dtype=np.float32, count=3, offset=bytes_offset) 60 | bytes_offset += 3 * 4 61 | self.direction = np.frombuffer(header, dtype=np.float32, count=3, offset=bytes_offset) 62 | return 63 | 64 | def build_points_index(self, part_id_list): 65 | self.__points_index = part_id_list == self.__part_id 66 | 67 | def pass_transform(self, transform): 68 | self.__transform_list.append(transform) 69 | for child_part in self.motion_sons: 70 | child_part.pass_transform(transform) 71 | 72 | def pass_global_transform(self, transform): 73 | self.center = apply_transform_list(self.center, [transform]) 74 | self.direction = apply_transform_list(self.direction, [transform]) 75 | for child_part in 
self.motion_sons: 76 | child_part.pass_global_transform(transform) 77 | 78 | def apply_transform(self, points, normals): 79 | part_points = points[self.__points_index] 80 | part_points = apply_transform_list(part_points, self.__transform_list) 81 | points[self.__points_index] = part_points 82 | part_normals = normals[self.__points_index] 83 | part_normals = apply_transform_list(part_normals, self.__transform_list) 84 | normals[self.__points_index] = part_normals 85 | self.__transform_list = [] 86 | 87 | def generate_random_motion(self): 88 | step = np.random.uniform(0, 0.1) 89 | angle = np.random.uniform(-math.pi/2, math.pi/2) 90 | self.center = apply_transform_list(self.center, self.__transform_list) 91 | self.direction = apply_transform_list(self.direction, self.__transform_list, is_vector=True) 92 | if self.motion_type == -1: 93 | return 94 | elif self.motion_type == 0:#tranlation 95 | transform = Translation_Matrix(self.direction, step) 96 | elif self.motion_type == 1:#rotation 97 | transform = Rotation_Matrix(self.center, self.direction, angle) 98 | elif self.motion_type == 2:#spiral 99 | transform = Spiral_Matrix(self.center, self.direction, step, angle) 100 | self.pass_transform(transform) 101 | return 102 | 103 | class Level(): 104 | def __init__(self): 105 | self.part_num = 0 106 | self.part_list = [] 107 | self.attr_num = 1 108 | self.parent_dict = {} 109 | self.part_index_dict = {} 110 | self.root_part = None 111 | 112 | def get_header_length(self): 113 | return self.attr_num * 4 114 | 115 | def get_header(self): 116 | header = bytes() 117 | header += encode_int32(self.part_num) 118 | return header 119 | 120 | def decode_header(self, header): 121 | self.part_num = np.frombuffer(header, dtype=np.int32)[0] 122 | for part_i in range(self.part_num): 123 | self.part_list.append(Part()) 124 | return 125 | 126 | def build_index_dict(self): 127 | self.part_index_dict = {} 128 | for i in range(self.part_num): 129 | self.part_index_dict[self.part_list[i].get_id()] 
= i 130 | 131 | def build_parent_dict(self): 132 | self.parent_dict = {} 133 | for i in range(self.part_num): 134 | self.parent_dict[self.part_list[i].get_id()] = self.part_list[i].get_parent_id() 135 | 136 | def get_parent_id(self, part_id): 137 | if self.parent_dict.get(part_id)!=None: 138 | return self.parent_dict[part_id] 139 | else: 140 | return -1 141 | 142 | def get_part(self, part_id): 143 | if self.part_index_dict.get(part_id)!=None: 144 | return self.part_list[self.part_index_dict[part_id]] 145 | else: 146 | return None 147 | def build_motion_tree(self): 148 | for part in self.part_list: 149 | if part.get_motion_parent_id()==-1: 150 | self.root_part = part 151 | else: 152 | self.get_part(part.get_motion_parent_id()).motion_sons.append(part) 153 | 154 | def apply_transform(self, points, normals): 155 | for part in self.part_list: 156 | part.apply_transform(points, normals) 157 | 158 | def generate_random_motion(self): 159 | node_stack = [self.root_part] 160 | while len(node_stack)>0: 161 | curr_part = node_stack.pop() 162 | curr_part.generate_random_motion() 163 | for child_part in curr_part.motion_sons: 164 | node_stack.append(child_part) 165 | 166 | def pass_global_transform(self, transform_list): 167 | global_transform = compose_transform_list(transform_list) 168 | self.root_part.pass_global_transform(global_transform) 169 | return 170 | 171 | class Points(): 172 | def __init__(self): 173 | self.points = None 174 | self.normals = None 175 | self.part_ids = None 176 | self.labels = None 177 | self.features = None 178 | 179 | self.points_num = 0 180 | self.level_num = 0 181 | self.with_part = 0 182 | self.with_label = 0 183 | self.with_motion = 0 184 | self.feature_channel = 0 185 | self.attr_num = 6 186 | self.level_list = [] 187 | self.center = None 188 | self.bBox = None 189 | self.radius = 0 190 | self.upright = np.array([0, 0, 1], dtype=np.float32) 191 | 192 | def calc_boundingBox(self): 193 | coord_max, coord_min = boundingBox(self.points) 194 | 
self.center = (coord_max + coord_min) / 2 195 | self.bBox = [coord_max, coord_min] 196 | self.radius = np.linalg.norm(coord_max-coord_min, ord=2) / 2 197 | 198 | def global_transform(self): 199 | scale = np.random.uniform(1, 1.1, (3)) 200 | normal_transform_list = [] 201 | global_transform_list = [] 202 | global_transform_list.append(Scale_Matrix(self.center, scale)) 203 | sheer = np.random.uniform(0, 0.1, (6)) 204 | global_transform_list.append(Sheer_Matrx(self.center, sheer)) 205 | #translation_direction = np.random.uniform(0, 0.5*self.radius, (3)) 206 | #global_transform_list.append(Translation_Matrix(translation_direction, 1)) 207 | angle = np.random.uniform(-math.pi, math.pi) 208 | global_transform_list.append(Rotation_Matrix(self.center, self.upright, angle)) 209 | normal_transform_list.append(Rotation_Matrix(np.array([0., 0., 0.]), self.upright, angle)) 210 | self.points = apply_transform_list(self.points, global_transform_list) 211 | self.normals = apply_transform_list(self.normals, normal_transform_list) 212 | if self.with_motion: 213 | self.level_list[0].pass_global_transform(global_transform_list) 214 | return 215 | 216 | def get_part(self, level, part_id): 217 | return self.level_list[level].get_part(part_id) 218 | 219 | def get_part_id(self, index, level): 220 | lowest_part_id = self.part_ids[index] 221 | part_id = lowest_part_id # start from the finest-level id, then walk up the parents 222 | if level == self.level_num-1: 223 | part_id = lowest_part_id 224 | else: 225 | for i in range(self.level_num-1, level, -1): 226 | part_id = self.level_list[i].get_parent_id(part_id) 227 | return part_id 228 | 229 | def get_header_length(self): 230 | return self.attr_num * 4 231 | 232 | def build_check_tree(self): 233 | for level_i in self.level_list: 234 | level_i.build_index_dict() 235 | level_i.build_parent_dict() 236 | for part_i in level_i.part_list: 237 | part_i.build_points_index(self.part_ids) 238 | #check until last but one level 239 | for i in range(self.level_num-2): 240 | for part in self.level_list[i+1].part_list: 241 | 
parent_part = self.level_list[i].get_part(part.get_parent_id()) 242 | if parent_part == None: 243 | print("Error parent ID: %d of part %d in level %d"%(part.get_parent_id(), part.get_id(), i+1)) 244 | else: 245 | parent_part.hrch_sons.append(part) 246 | return 247 | 248 | def build_motion_tree(self): 249 | self.level_list[0].build_motion_tree() 250 | 251 | def apply_transform(self): 252 | self.level_list[0].apply_transform(self.points, self.normals) 253 | 254 | def generate_random_motion(self): 255 | self.level_list[0].generate_random_motion() 256 | 257 | def get_points_length(self): 258 | bytes_length = 0 259 | bytes_length += self.points_num * 3 * 4 260 | bytes_length += self.points_num * 3 * 4 261 | if self.with_part: 262 | bytes_length += self.points_num * 4 263 | if self.with_label: 264 | bytes_length += self.points_num * 4 265 | if self.feature_channel: 266 | bytes_length += self.points_num * self.feature_channel * 4 267 | return bytes_length 268 | 269 | def read_from_PartNet(self): 270 | return 271 | 272 | def read_from_numpy(self, points, normals, semantic_labels, instance_labels=None): 273 | points_num = points.shape[0] 274 | assert(points.shape == (points_num, 3)) 275 | assert(normals.shape == (points_num, 3)) 276 | assert(semantic_labels.shape == (points_num,)) 277 | if instance_labels is not None: assert(instance_labels.shape == (points_num,)) 278 | part_ids = semantic_labels 279 | if instance_labels is not None: part_ids = instance_labels*100 + semantic_labels 280 | Level0 = Level() 281 | part_id_list = np.unique(np.array(part_ids)) 282 | Level0.part_num = len(part_id_list) 283 | for i in range(len(part_id_list)): 284 | new_part = Part() 285 | new_part.set_id(part_id_list[i], 0, -1, 0) 286 | Level0.part_list.append(new_part) 287 | self.points_num = points_num 288 | self.level_num = 1 289 | self.level_list.append(Level0) 290 | self.points = points 291 | self.normals = normals 292 | self.part_ids = semantic_labels 293 | self.with_part = 1 294 | 
self.with_motion = 0 295 | self.with_label = 1 296 | self.labels = part_ids.astype(np.float32) 297 | return 298 | 299 | def read_from_Relabeled_PartNet(self, filename): 300 | with open(filename) as f: 301 | content = f.readlines() 302 | points_num = len(content) 303 | points = np.zeros([points_num, 3], dtype=np.float32) 304 | normals = np.zeros([points_num, 3], dtype=np.float32) 305 | part_ids = np.zeros([points_num], dtype=np.int32) 306 | for i in range(points_num): 307 | x, y, z, nx, ny, nz, part_id = content[i].split(' ') 308 | points[i] = [x, y, z] 309 | normals[i] = [nx, ny, nz] 310 | part_ids[i] = int(float(part_id)) 311 | part_ids[part_ids==36] = -1 # note the bad shape index 312 | valid_point_index = part_ids != -1 313 | points = points[valid_point_index] 314 | normals = normals[valid_point_index] 315 | part_ids = part_ids[valid_point_index] 316 | # new_points_num = part_ids.size 317 | select_index = np.linspace(0, part_ids.size-1, num=10000, dtype=int) 318 | points = points[select_index] 319 | normals = normals[select_index] 320 | part_ids = part_ids[select_index] 321 | new_points_num = part_ids.size 322 | Level0 = Level() 323 | part_id_list = np.unique(np.array(part_ids)) 324 | Level0.part_num = len(part_id_list) 325 | for i in range(len(part_id_list)): 326 | new_part = Part() 327 | new_part.set_id(part_id_list[i], 0, -1, 0) 328 | Level0.part_list.append(new_part) 329 | self.points_num = new_points_num 330 | self.level_num = 1 331 | self.level_list.append(Level0) 332 | self.points = points 333 | self.normals = normals 334 | self.part_ids = part_ids 335 | self.with_part = 1 336 | self.with_motion = 0 337 | # for point integration 338 | self.with_label = 1 339 | self.labels = part_ids.astype(np.float32) 340 | # ##################### 341 | return 342 | 343 | def read_from_ShapeNetCorev2(self, dataset_path, class_name, object_id): 344 | dir_path = os.path.join(dataset_path, class_name, object_id) 345 | with open(os.path.join(dir_path, 'models', 
'model_normalized_deformation_mix.pts')) as f: 346 | content = f.readlines() 347 | points_num = len(content) 348 | points = np.zeros([points_num, 3], dtype=np.float32) 349 | normals = np.zeros([points_num, 3], dtype=np.float32) 350 | part_ids = np.zeros([points_num], dtype=np.int32) 351 | for i in range(points_num): 352 | x, y, z, nx, ny, nz, part_id, _, _, _ = content[i].split(' ') 353 | points[i] = [x, y, z] 354 | normals[i] = [nx, ny, nz] 355 | part_ids[i] = part_id 356 | Level0 = Level() 357 | part_id_list = np.unique(np.array(part_ids)) 358 | Level0.part_num = len(part_id_list) 359 | for i in range(len(part_id_list)): 360 | new_part = Part() 361 | new_part.set_id(part_id_list[i], 0, -1, 0) 362 | Level0.part_list.append(new_part) 363 | self.points_num = points_num 364 | self.level_num = 1 365 | self.level_list.append(Level0) 366 | self.points = points 367 | self.normals = normals 368 | self.part_ids = part_ids 369 | self.with_part = 1 370 | self.with_motion = 0 371 | return 372 | 373 | def read_from_MotionDataset(self, dataset_path, class_name, object_id, density): 374 | dir_path = os.path.join(dataset_path, class_name, object_id) 375 | with open(os.path.join(dir_path, "motion_attributes.json"), "r") as fid: 376 | root_node = json.loads(fid.read()) 377 | fid.close() 378 | node_stack = [(root_node, -1)] 379 | points = [] 380 | normals = [] 381 | part_ids = [] 382 | part_id_count = 0 383 | Level0 = Level() 384 | motion_type_dict = { 385 | 'none': -1, 386 | 'translation': 0, 387 | 'rotation': 1, 388 | 'spiral': 2 389 | } 390 | while len(node_stack)>0: 391 | curr_node, motion_parent_id = node_stack.pop() 392 | new_part = Part() 393 | new_part.set_id(part_id_count, -1, motion_parent_id, 0) 394 | part_id_count += 1 395 | new_part.with_motion = 1 396 | new_part.motion_type = motion_type_dict[curr_node['motion_type']] 397 | new_part.center = np.array(curr_node['center'], dtype=np.float32) 398 | new_part.direction = np.array(curr_node['direction'], dtype=np.float32) 399 
| if motion_parent_id!=-1: 400 | name = curr_node['dof_name'] 401 | else: 402 | name = "none_motion" 403 | part_points, part_normals = points_from_mesh(os.path.join(dir_path, "part_objs", name+".obj"), density=density) 404 | part_points_num = part_points.shape[0] 405 | #print("part %d point num: %d" % (new_part.get_id(), part_points_num)) 406 | self.points_num += part_points_num 407 | points.append(part_points) 408 | normals.append(part_normals) 409 | part_ids = part_ids + [new_part.get_id()] * part_points_num 410 | Level0.part_num += 1 411 | Level0.part_list.append(new_part) 412 | for child in curr_node['children']: 413 | node_stack.append((child, new_part.get_id())) 414 | self.level_num = 1 415 | self.level_list.append(Level0) 416 | self.points = np.array(np.concatenate(points, axis=0), dtype=np.float32) 417 | self.normals = np.array(np.concatenate(normals, axis=0), dtype=np.float32) 418 | self.part_ids = np.array(part_ids, dtype=np.int32) 419 | self.with_part = 1 420 | self.with_motion = 1 421 | self.build_check_tree() 422 | self.build_motion_tree() 423 | self.calc_boundingBox() 424 | return 425 | 426 | def get_header(self): 427 | header = bytes() 428 | header += encode_int32(self.points_num) 429 | header += encode_int32(self.level_num) 430 | header += encode_int32(self.with_part) 431 | header += encode_int32(self.with_label) 432 | header += encode_int32(self.with_motion) 433 | header += encode_int32(self.feature_channel) 434 | return header 435 | 436 | def decode_header(self, header): 437 | self.points_num, self.level_num, self.with_part, self.with_label, self.with_motion, self.feature_channel = np.frombuffer(header, dtype=np.int32) 438 | for level_i in range(self.level_num): 439 | self.level_list.append(Level()) 440 | return 441 | 442 | def read_points_1(self, path): 443 | with open(path, "rb") as f: 444 | self.__init__() 445 | magic_str_ = f.read(16) 446 | self.points_num = np.frombuffer(f.read(4), dtype=np.int32)[0] 447 | int_flags_ = 
np.frombuffer(f.read(4), dtype=np.int32)[0] 448 | content_flags_ = [int_flags_%2, (int_flags_>>1)%2, (int_flags_>>2)%2, (int_flags_>>3)%2] 449 | channels_ = np.frombuffer(f.read(4 * 8), dtype=np.int32) 450 | ptr_dis_ = np.frombuffer(f.read(4 * 8), dtype=np.int32) 451 | 452 | self.points = np.frombuffer(f.read(4 * 3 * self.points_num), dtype=np.float32) 453 | self.points = np.reshape(self.points, (self.points_num, 3)) 454 | self.normals = np.frombuffer(f.read(4 * 3 * self.points_num), dtype=np.float32) 455 | self.normals = np.reshape(self.normals, (self.points_num, 3)) 456 | if content_flags_[2]: 457 | self.feature_channel = channels_[2] 458 | self.features = np.frombuffer(f.read(4 * self.feature_channel * self.points_num), dtype=np.float32) 459 | self.features = np.reshape(self.features, (self.points_num, self.feature_channel)) 460 | if content_flags_[3]: 461 | self.with_label = 1 462 | # self.labels = np.frombuffer(f.read(4 * self.points_num), dtype=np.int32) 463 | self.labels = np.frombuffer(f.read(4 * self.points_num), dtype=np.float32) 464 | f.close() 465 | self.with_part = 0 466 | self.level_num = 1 467 | level0 = Level() 468 | level0.part_num = 1 469 | part0 = Part() 470 | level0.part_list.append(part0) 471 | self.level_list.append(level0) 472 | self.build_check_tree() 473 | self.calc_boundingBox() 474 | 475 | def write_to_points_1(self, path=None): 476 | magic_str_ = '_POINTS_1.0_\x00\x00\x00\x00' 477 | channels = [3, 3, self.feature_channel, self.with_label , 0, 0, 0, 0] 478 | point_bytes = self.points.tobytes() 479 | normal_bytes = self.normals.tobytes() 480 | # label_bytes = IntList_to_Bytes(self.labels if self.with_label else []) 481 | label_bytes = self.labels.tobytes() if self.with_label else bytes() # empty bytes, not a list, so it can be concatenated below 482 | feature_bytes = Float32List_to_Bytes(self.features if self.feature_channel>0 else []) 483 | 484 | ptr_dis_ = [0] * 8 485 | offset = len(magic_str_) + 4 + 4 + 4 * len(channels) + 4 * len(ptr_dis_) 486 | ptr_dis_[0] = offset 487 | offset += len(point_bytes) 
488 | ptr_dis_[1] = offset 489 | offset += len(normal_bytes) 490 | ptr_dis_[2] = offset 491 | offset += len(feature_bytes) 492 | ptr_dis_[3] = offset 493 | offset += len(label_bytes) 494 | ptr_dis_[4] = offset 495 | 496 | content_flags_ = 1 + (1<<1) + (int(self.feature_channel>0)<<2) + (int(self.with_label)<<3) 497 | 498 | bytes_list = bytes() 499 | bytes_list += magic_str_.encode() 500 | bytes_list += IntList_to_Bytes(self.points_num) 501 | bytes_list += IntList_to_Bytes(content_flags_) 502 | bytes_list += IntList_to_Bytes(channels) 503 | bytes_list += IntList_to_Bytes(ptr_dis_) 504 | bytes_list += point_bytes 505 | bytes_list += normal_bytes 506 | bytes_list += feature_bytes 507 | bytes_list += label_bytes 508 | if path is not None: 509 | f = open(path, "wb") 510 | f.write(bytes_list) 511 | f.close() 512 | return 513 | else: 514 | return bytes_list 515 | 516 | 517 | def encode_points(self): 518 | bytes_list = bytes() 519 | bytes_list += self.points.tobytes() 520 | bytes_list += self.normals.tobytes() 521 | if self.with_part: 522 | bytes_list += self.part_ids.tobytes() 523 | if self.with_label: 524 | bytes_list += IntList_to_Bytes(self.labels) 525 | if self.feature_channel: 526 | bytes_list += Float32List_to_Bytes(self.features) 527 | return bytes_list 528 | 529 | def decode_points(self, bytes_list): 530 | bytes_offset = 0 531 | self.points = np.frombuffer(bytes_list, dtype=np.float32, count=self.points_num * 3, offset=bytes_offset) 532 | self.points = self.points.reshape((self.points_num, 3)) 533 | bytes_offset += self.points_num * 3 * 4 534 | self.normals = np.frombuffer(bytes_list, dtype=np.float32, count=self.points_num * 3, offset=bytes_offset) 535 | self.normals = self.normals.reshape((self.points_num, 3)) 536 | bytes_offset += self.points_num * 3 * 4 537 | if self.with_part: 538 | self.part_ids = np.frombuffer(bytes_list, dtype=np.int32, count=self.points_num, offset=bytes_offset) 539 | bytes_offset += self.points_num * 4 540 | if self.with_label: 541 | 
self.labels = np.frombuffer(bytes_list, dtype=np.int32, count=self.points_num, offset=bytes_offset) 542 | bytes_offset += self.points_num * 4 543 | if self.feature_channel: 544 | self.features = np.frombuffer(bytes_list, dtype=np.float32, count=self.points_num * self.feature_channel, offset=bytes_offset) # features, not labels 545 | bytes_offset += self.points_num * self.feature_channel * 4 546 | return 547 | 548 | def write_to_off(self, path): 549 | np.savetxt(path, self.points, fmt="%.6f", header="OFF\n%d 0 0"%self.points.shape[0], comments="") 550 | 551 | def write_to_points(self, path): 552 | bytes_array = bytes() 553 | #Header 554 | bytes_array += self.get_header() 555 | #LevelHeader 556 | for level_i in self.level_list: 557 | bytes_array += level_i.get_header() 558 | #PartHeader 559 | for level_i in self.level_list: 560 | for part_i in level_i.part_list: 561 | bytes_array += part_i.get_header() 562 | bytes_array += self.encode_points() 563 | with open(path, "wb") as f: 564 | f.write(bytes_array) 565 | f.close() 566 | return 567 | 568 | def read_from_points(self, path): 569 | self.__init__() 570 | with open(path, "rb") as f: 571 | #Decode Header 572 | self.decode_header(f.read(self.get_header_length())) 573 | #Decode LevelHeader 574 | for level_i in self.level_list: 575 | level_i.decode_header(f.read(level_i.get_header_length())) 576 | #Decode PartHeader 577 | for level_i in self.level_list: 578 | for part_i in level_i.part_list: 579 | part_i.decode_header(f.read(part_i.get_header_length())) 580 | self.decode_points(f.read(self.get_points_length())) 581 | f.close() 582 | return 583 | 584 | def read_from_points_buffer(self, bytes_list): 585 | self.__init__() 586 | #Decode Header 587 | offset_begin = 0 588 | offset_end = offset_begin + self.get_header_length() 589 | self.decode_header(bytes_list[offset_begin: offset_end]) 590 | 591 | #Decode LevelHeader 592 | for level_i in self.level_list: 593 | offset_begin = offset_end 594 | offset_end = offset_begin + level_i.get_header_length() 595 
| level_i.decode_header(bytes_list[offset_begin: offset_end]) 596 | #Decode PartHeader 597 | for level_i in self.level_list: 598 | for part_i in level_i.part_list: 599 | offset_begin = offset_end 600 | offset_end = offset_begin + part_i.get_header_length() 601 | part_i.decode_header(bytes_list[offset_begin: offset_end]) 602 | offset_begin = offset_end 603 | self.decode_points(bytes_list[offset_begin:-1]) 604 | return -------------------------------------------------------------------------------- /data_preprocessing/PointsData/util.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import openmesh 3 | import sys 4 | import os 5 | 6 | def blockPrint(): 7 | sys.stdout = open(os.devnull, 'w') 8 | sys.stderr = open(os.devnull, 'w') 9 | # Restore 10 | def enablePrint(): 11 | sys.stdout = sys.__stdout__ 12 | sys.stderr = sys.__stderr__ 13 | 14 | def boundingBox(points): 15 | coord_min = np.min(points, axis=0) 16 | coord_max = np.max(points, axis=0) 17 | return coord_max, coord_min 18 | 19 | def FreeFormDeformation(points): 20 | coord_max, coord_min = boundingBox(points) 21 | 22 | def encode_int32(x): 23 | x=int(x) 24 | return x.to_bytes(length=4, byteorder='little', signed=True) 25 | 26 | def IntList_to_Bytes(int_list): 27 | #list_bytes = struct.pack('i'*len(int_list), *int_list) 28 | x = np.array(int_list, dtype=np.int32) 29 | list_bytes = x.tobytes() 30 | return list_bytes 31 | 32 | def DoubleList_to_Bytes(float_list): 33 | #list_bytes = struct.pack('d'*len(float_list), *float_list) 34 | x = np.array(float_list, dtype=np.float64) 35 | list_bytes = x.tobytes() 36 | return list_bytes 37 | 38 | def Float32List_to_Bytes(float_list): 39 | #list_bytes = struct.pack('f'*len(float_list), *float_list) 40 | x = np.array(float_list, dtype=np.float32) 41 | list_bytes = x.tobytes() 42 | return list_bytes 43 | 44 | def normalized(a, axis=-1, order=2): 45 | norm = np.atleast_1d(np.linalg.norm(a, order, axis)) 46 | 
norm[norm==0] = 1 47 | return a / np.expand_dims(norm, axis) 48 | 49 | def compose_transform_list(transform_list): 50 | if(len(transform_list)==1): 51 | return transform_list[0] 52 | transform = np.eye(4,4) 53 | for i in range(len(transform_list)): 54 | transform = np.matmul(transform_list[i], transform) 55 | return transform 56 | 57 | def apply_transform_list(points, transform_list, is_vector=False): 58 | squeeze = False 59 | if points.ndim == 1: 60 | points = np.expand_dims(points, axis=0) 61 | squeeze = True 62 | if is_vector: 63 | points = np.insert(points, 3, 0, axis=-1) 64 | else: 65 | points = np.insert(points, 3, 1, axis=-1) 66 | transform = compose_transform_list(transform_list) 67 | transform = np.array(transform, dtype=np.float32) 68 | points = np.matmul(transform, points.transpose()).transpose()[:,0:3] 69 | if squeeze: 70 | points = np.squeeze(points, axis=0) 71 | return points 72 | 73 | def Sheer_Matrx(center, sheer): 74 | T1 = Translation_Matrix(-center, 1) 75 | Sh = np.eye(4) 76 | Sh[0][1] = sheer[0] 77 | Sh[0][2] = sheer[1] 78 | Sh[1][0] = sheer[2] 79 | Sh[1][2] = sheer[3] 80 | Sh[2][0] = sheer[4] 81 | Sh[2][1] = sheer[5] 82 | T2 = Translation_Matrix(center, 1) 83 | return np.matmul(T2, np.matmul(Sh, T1)) 84 | 85 | def Scale_Matrix(center, scale): 86 | T1 = Translation_Matrix(-center, 1) 87 | Sc = np.diag([scale[0], scale[1], scale[2], 1.]) 88 | T2 = Translation_Matrix(center, 1) 89 | return np.matmul(T2, np.matmul(Sc, T1)) 90 | 91 | def Translation_Matrix(direction, step): 92 | direction = np.array(direction, dtype=np.float32) 93 | v = direction * step 94 | transform = np.array([ 95 | [1, 0, 0, v[0]], 96 | [0, 1, 0, v[1]], 97 | [0, 0, 1, v[2]], 98 | [0, 0, 0, 1] 99 | ], dtype=np.float32) 100 | return transform 101 | 102 | def Rotation_Matrix(center, direction, angle): 103 | T1 = Translation_Matrix(-center, 1) 104 | v = normalized(direction, order=2)[0] 105 | R = np.array([ 106 | [np.cos(angle)+v[0]*v[0]*(1-np.cos(angle)), 
v[0]*v[1]*(1-np.cos(angle))-v[2]*np.sin(angle), v[0]*v[2]*(1-np.cos(angle))+v[1]*np.sin(angle), 0], 107 | [v[1]*v[0]*(1-np.cos(angle))+v[2]*np.sin(angle), np.cos(angle)+v[1]*v[1]*(1-np.cos(angle)), v[1]*v[2]*(1-np.cos(angle))-v[0]*np.sin(angle), 0], 108 | [v[2]*v[0]*(1-np.cos(angle))-v[1]*np.sin(angle), v[2]*v[1]*(1-np.cos(angle))+v[0]*np.sin(angle), np.cos(angle)+v[2]*v[2]*(1-np.cos(angle)), 0], 109 | [0, 0, 0, 1] 110 | ], dtype=np.float32) 111 | T2 = Translation_Matrix(center, 1) 112 | return np.matmul(T2, np.matmul(R, T1)) 113 | 114 | def Spiral_Matrix(center, direction, step, angle): 115 | return np.matmul(Translation_Matrix(direction, step), Rotation_Matrix(center, direction, angle)) 116 | 117 | def points_from_mesh(mesh_path, density=5000): 118 | mesh = openmesh.read_trimesh(mesh_path) 119 | points = mesh.points() 120 | mesh.update_vertex_normals() 121 | normals = mesh.vertex_normals() 122 | if mesh.n_vertices() == 0: 123 | return points, normals 124 | #help(mesh) 125 | for fh in mesh.faces(): 126 | point_list = [] 127 | face_normal = mesh.calc_face_normal(fh) 128 | for vh in mesh.fv(fh): 129 | point_list.append(mesh.point(vh)) 130 | p, q, r = point_list 131 | area = np.linalg.norm(np.cross(p-q, p-r))/2 132 | sample_num = max(int(area * density), 1) 133 | bary_coord = np.random.uniform(0, 1, (sample_num, 3)) 134 | bary_coord = normalized(bary_coord, order=1) 135 | vert_coord = np.array(point_list) 136 | sample_coord = np.matmul(bary_coord, vert_coord) 137 | points = np.concatenate([points, sample_coord], axis=0) 138 | if sample_num > 1: 139 | stacked_face_normal = np.stack([face_normal]*sample_num, axis=0) 140 | else: 141 | stacked_face_normal = np.reshape(face_normal, (1,3)) 142 | normals = np.concatenate([normals, stacked_face_normal], axis=0) 143 | return points, normals 144 | -------------------------------------------------------------------------------- /data_preprocessing/README.md: 
-------------------------------------------------------------------------------- 1 | ## About data preprocessing 2 | 3 | 1. Follow the PartNet data download instructions on [the webpage](https://github.com/daerduoCarey/partnet_seg_exps/blob/master/data/README.md) to download `ins_seg_h5.zip`, and unzip it into the current folder. 4 | 2. Download the folders `after_merging_label_ids` and `train_val_test_split` from [the repo](https://github.com/daerduoCarey/partnet_seg_exps/tree/master/stats). 5 | 3. Run `python parse_h5_to_points.py` to generate all the points files. 6 | 4. Run `python run_get_points_filelist.py` to generate the filelists for each level. 7 | 5. Run `python run_get_points_filelist_level123.py` to generate the concatenated filelists for each category. 8 | 6. Run `python run_convert_points_to_tfrecords_level123.py` to generate the tfrecords for each category. 9 | -------------------------------------------------------------------------------- /data_preprocessing/convert_points_to_tfrecords_level123.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import argparse 4 | import tensorflow as tf 5 | from tqdm import tqdm 6 | 7 | sys.path.append('PointsData') 8 | from Points import * 9 | 10 | 11 | def _bytes_feature(value): 12 | return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value])) 13 | 14 | 15 | def _float_feature(value): 16 | return tf.train.Feature(float_list=tf.train.FloatList(value=value)) 17 | 18 | 19 | def _int64_feature(value): 20 | return tf.train.Feature(int64_list=tf.train.Int64List(value=value)) 21 | 22 | 23 | def parse_points(filename): 24 | assert os.path.isfile(filename), filename 25 | p = Points() 26 | p.read_from_Relabeled_PartNet(filename) 27 | points_bytes = p.write_to_points_1() 28 | return points_bytes 29 | 30 | 31 | def load_points(filename): 32 | assert os.path.isfile(filename), filename 33 | with open(filename, 'rb') as f: 34 | points_bytes = f.read() 35 | p = 
Points() 36 | p.read_points_1(filename) 37 | label = p.labels 38 | assert(label.size == 10000) 39 | return points_bytes, label 40 | 41 | 42 | def get_data_label_pair(list_file): 43 | shape_id_list_1 = [] 44 | shape_id_list_2 = [] 45 | shape_id_list_3 = [] 46 | shape_flag_list = [] 47 | shape_index_list = [] 48 | with open(list_file) as f: 49 | content = f.readlines() 50 | for line in content: 51 | shape_id_1, shape_id_2, shape_id_3, shape_flag, shape_index = line.split() 52 | shape_id_list_1.append(shape_id_1) 53 | shape_id_list_2.append(shape_id_2) 54 | shape_id_list_3.append(shape_id_3) 55 | shape_flag_list.append(int(shape_flag)) 56 | shape_index_list.append(int(shape_index)) 57 | return shape_id_list_1, shape_id_list_2, shape_id_list_3, shape_flag_list, shape_index_list 58 | 59 | 60 | def write_data_to_tfrecords(list_file, records_name): 61 | assert 'level123' in list_file, list_file 62 | shape_id_list_1, shape_id_list_2, shape_id_list_3, shape_flag_list, shape_index_list = get_data_label_pair(list_file) 63 | n_shape = len(shape_id_list_1) 64 | writer = tf.io.TFRecordWriter(records_name) 65 | for i in tqdm(range(n_shape)): 66 | points_bytes_1, label_1 = load_points(shape_id_list_1[i]) 67 | points_bytes_2, label_2 = load_points(shape_id_list_2[i]) 68 | points_bytes_3, label_3 = load_points(shape_id_list_3[i]) 69 | 70 | feature = {'points_bytes': _bytes_feature(points_bytes_1), 71 | 'label_1': _int64_feature(label_1.astype(np.int64).tolist()), 72 | 'label_2': _int64_feature(label_2.astype(np.int64).tolist()), 73 | 'label_3': _int64_feature(label_3.astype(np.int64).tolist()), 74 | 'points_flag': _int64_feature([shape_flag_list[i]]), 75 | 'shape_index': _int64_feature([shape_index_list[i]])} 76 | 77 | example = tf.train.Example(features=tf.train.Features(feature=feature)) 78 | writer.write(example.SerializeToString()) 79 | 80 | writer.close() 81 | 82 | 83 | if __name__ == '__main__': 84 | parser = argparse.ArgumentParser() 85 | 86 | # Required Arguments 87 | 
parser.add_argument('--list_file', 88 | type=str, 89 | help='File containing the list of points data, and \ 90 | append the identity filename as the label', 91 | required=True) 92 | parser.add_argument('--records_name', 93 | type=str, 94 | help='Name of tfrecords', 95 | required=True) 96 | 97 | args = parser.parse_args() 98 | 99 | write_data_to_tfrecords(args.list_file, 100 | args.records_name) -------------------------------------------------------------------------------- /data_preprocessing/get_points_filelist.py: -------------------------------------------------------------------------------- 1 | import os 2 | import argparse 3 | 4 | def get_points_filelist(category, phase, level): 5 | points_folder = 'ins_seg_points' 6 | shape_list = sorted([file for file in os.listdir(points_folder) if file.startswith(category) and file.endswith('level{}_{}.points'.format(level, phase))]) 7 | print(category, phase, level, len(shape_list)) 8 | out_file = os.path.join('filelist', '{}_level{}_{}_{}.txt'.format(category, level, phase, len(shape_list))) 9 | flag = 1 10 | with open(out_file, 'w') as f: 11 | for index, shape in enumerate(shape_list): 12 | f.write('{} {} {:04d}\n'.format(os.path.join('ins_seg_points', shape), flag, index)) 13 | print('write to {}'.format(out_file)) 14 | 15 | if __name__ == '__main__': 16 | parser = argparse.ArgumentParser() 17 | 18 | # Required Arguments 19 | parser.add_argument('--category', 20 | type=str, 21 | help='Category', 22 | default='Chair') 23 | parser.add_argument('--phase', 24 | type=str, 25 | help='train/val/test', 26 | default='test') 27 | parser.add_argument('--level', 28 | type=str, 29 | help='1/2/3', 30 | default='3') 31 | 32 | args = parser.parse_args() 33 | 34 | get_points_filelist(args.category, args.phase, args.level) -------------------------------------------------------------------------------- /data_preprocessing/get_points_filelist_level123.py: -------------------------------------------------------------------------------- 
1 | import os 2 | import argparse 3 | 4 | def get_points_filelist(category, phase, f1, f2, f3): 5 | points_folder = 'ins_seg_points' 6 | content1 = open(f1).readlines() 7 | content2 = open(f2).readlines() 8 | content3 = open(f3).readlines() 9 | n_shape = len(content1) 10 | assert(n_shape==len(content2)==len(content3)) 11 | filelist_folder = 'filelist_level123' 12 | if not os.path.isdir(filelist_folder): os.mkdir(filelist_folder) 13 | out_file = os.path.join(filelist_folder, '{}_level123_{}_{}.txt'.format(category, phase, n_shape)) 14 | flag = 1 15 | with open(out_file, 'w') as f: 16 | for index in range(n_shape): 17 | f.write('{} {} {} {} {:04d}\n'.format(content1[index].split()[0], content2[index].split()[0], content3[index].split()[0], flag, index)) 18 | print('write to {}'.format(out_file)) 19 | 20 | if __name__ == '__main__': 21 | parser = argparse.ArgumentParser() 22 | 23 | # Required Arguments 24 | parser.add_argument('--category', 25 | type=str, 26 | help='Category', 27 | default='Chair') 28 | parser.add_argument('--phase', 29 | type=str, 30 | help='train/val/test', 31 | default='test') 32 | parser.add_argument('--f1', 33 | type=str) 34 | parser.add_argument('--f2', 35 | type=str) 36 | parser.add_argument('--f3', 37 | type=str) 38 | 39 | args = parser.parse_args() 40 | 41 | get_points_filelist(args.category, args.phase, args.f1, args.f2, args.f3) -------------------------------------------------------------------------------- /data_preprocessing/parse_h5_to_points.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import numpy as np 4 | import h5py 5 | import json 6 | from tqdm import tqdm 7 | 8 | sys.path.append('PointsData') 9 | from Points import * 10 | 11 | pts_h5_folder = 'ins_seg_h5' 12 | category_list = os.listdir(pts_h5_folder) 13 | # print(category_list) 14 | label_h5_folder = 'ins_seg_h5_for_detection' 15 | save_points_folder = 'ins_seg_points' 16 | if not 
os.path.isdir(save_points_folder): os.mkdir(save_points_folder) 17 | 18 | def get_h5_file_list(folder, prefix): 19 | return [file for file in os.listdir(folder) if file.startswith(prefix) and file.endswith('h5')] 20 | 21 | def get_instance_label(gt_mask, gt_valid): 22 | instance_label = -np.ones(gt_mask.shape[1], dtype=np.int32) 23 | for i, valid in enumerate(gt_valid): 24 | if valid: instance_label[gt_mask[i]] = i 25 | return instance_label 26 | 27 | for category in category_list: 28 | print(category) 29 | for phase in ['train', 'val', 'test']: 30 | if phase == 'test': 31 | label_h5_folder = 'ins_seg_h5_gt' 32 | else: 33 | label_h5_folder = 'ins_seg_h5_for_detection' 34 | h5_file_list = get_h5_file_list(os.path.join(pts_h5_folder, category), phase) 35 | print(h5_file_list) 36 | with open(os.path.join('train_val_test_split', '{}.{}.json'.format(category, phase)), 'r') as f: 37 | shape_name_list = json.load(f) 38 | print(len(shape_name_list)) 39 | shape_cum_index = 0 40 | for h5_file in h5_file_list: 41 | filename = os.path.join(pts_h5_folder, category, h5_file) 42 | f = h5py.File(filename, 'r') 43 | filename = os.path.join(label_h5_folder, '{}-1'.format(category), h5_file) 44 | f1 = h5py.File(filename, 'r') 45 | f2, f3 = None, None 46 | filename = os.path.join(label_h5_folder, '{}-2'.format(category), h5_file) 47 | if os.path.isfile(filename): f2 = h5py.File(filename, 'r') 48 | filename = os.path.join(label_h5_folder, '{}-3'.format(category), h5_file) 49 | if os.path.isfile(filename): f3 = h5py.File(filename, 'r') 50 | # print(f1, f2, f3) 51 | n_shape = f['pts'].shape[0] 52 | print(shape_cum_index, n_shape, h5_file) 53 | for i in tqdm(range(n_shape)): 54 | points = f['pts'][i] 55 | normals = f['nor'][i] 56 | shape_index = shape_cum_index+i 57 | model_id = shape_name_list[shape_index]['model_id'] 58 | anno_id = shape_name_list[shape_index]['anno_id'] 59 | for j, file in enumerate([f1, f2, f3]): 60 | if file is not None: 61 | instance_labels = 
get_instance_label(file['gt_mask'][i], file['gt_valid'][i] if phase != 'test' else file['gt_mask_valid'][i]) 62 | if phase != 'test': 63 | semantic_labels = file['gt_label'][i] 64 | else: 65 | mask_label = file['gt_mask_label'][i] 66 | semantic_labels = mask_label[instance_labels]+1 67 | semantic_labels[instance_labels==-1] = 0 68 | p = Points() 69 | p.read_from_numpy(points, normals, semantic_labels, instance_labels=instance_labels) 70 | save_filename = os.path.join(save_points_folder, '{}_{:04d}_{}_{}_level{}_{}.points'.format(category, shape_index, model_id, anno_id, j+1, phase)) 71 | # print(save_filename) 72 | points_bytes = p.write_to_points_1(save_filename) 73 | shape_cum_index += n_shape -------------------------------------------------------------------------------- /data_preprocessing/run_convert_points_to_tfrecords_level123.py: -------------------------------------------------------------------------------- 1 | import os 2 | from tqdm import tqdm 3 | 4 | if not os.path.isdir('tfrecords_level123'): os.mkdir('tfrecords_level123') 5 | 6 | filelist_folder = 'filelist_level123' 7 | 8 | filelist = os.listdir(filelist_folder) 9 | print(len(filelist)) 10 | print(filelist) 11 | for file in tqdm(filelist): 12 | category = file.split('_')[0] 13 | cmd = 'python convert_points_to_tfrecords_level123.py --list_file {} --records_name {}'.format(os.path.join(filelist_folder, file), os.path.join('tfrecords_level123', file.replace('.txt', '.tfrecords'))) 14 | print(cmd) 15 | os.system(cmd) 16 | 17 | 18 | 19 | 20 | -------------------------------------------------------------------------------- /data_preprocessing/run_get_points_filelist.py: -------------------------------------------------------------------------------- 1 | import os 2 | 3 | if not os.path.isdir('filelist'): os.mkdir('filelist') 4 | 5 | category_level = {} 6 | 7 | category_level_folder = 'after_merging_label_ids' 8 | 9 | category_level_filelist = os.listdir(category_level_folder) 10 | for file in 
category_level_filelist: 11 | if len(file.split('-'))==3: 12 | if file.split('-')[1] == 'level': 13 | category = file.split('-')[0] 14 | level = file[:-4].split('-')[-1] 15 | if category not in category_level: 16 | category_level[category] = [level] 17 | else: 18 | category_level[category].append(level) 19 | print(category_level) 20 | 21 | for k, v in category_level.items(): 22 | print(k, v) 23 | for level in v: 24 | for phase in ['train', 'val', 'test']: 25 | cmd = 'python get_points_filelist.py --category {} --phase {} --level {}'.format(k, phase, level) 26 | print(cmd) 27 | os.system(cmd) 28 | 29 | 30 | 31 | 32 | -------------------------------------------------------------------------------- /data_preprocessing/run_get_points_filelist_level123.py: -------------------------------------------------------------------------------- 1 | import os 2 | 3 | if not os.path.isdir('filelist_level123'): os.mkdir('filelist_level123') 4 | 5 | category_level = {} 6 | 7 | category_level_folder = 'after_merging_label_ids' 8 | 9 | category_level_filelist = os.listdir(category_level_folder) 10 | for file in category_level_filelist: 11 | if len(file.split('-'))==3: 12 | if file.split('-')[1] == 'level': 13 | category = file.split('-')[0] 14 | level = file[:-4].split('-')[-1] 15 | if category not in category_level: 16 | category_level[category] = [level] 17 | else: 18 | category_level[category].append(level) 19 | print(category_level) 20 | 21 | filelist_list = os.listdir('filelist') 22 | 23 | for category, v in category_level.items(): 24 | print(category, v) 25 | for phase in ['train', 'val', 'test']: 26 | flist = [] 27 | for level in range(1, 4): 28 | level = str(level) 29 | if level in v: 30 | filename = [file for file in filelist_list if file.startswith('{}_level{}_{}'.format(category, level, phase))][0] 31 | flist.append(filename) 32 | else: 33 | filename = [file for file in filelist_list if file.startswith('{}_level{}_{}'.format(category, str(int(level)-1), phase)) or 
file.startswith('{}_level{}_{}'.format(category, str(int(level)-2), phase))][0] 34 | flist.append(filename) 35 | print(category, level, phase, filename) 36 | print(phase, category, flist) 37 | cmd = 'python get_points_filelist_level123.py --category {} --phase {} --f1 {} --f2 {} --f3 {}'.format(category, phase, os.path.join('filelist', flist[0]), os.path.join('filelist', flist[1]), os.path.join('filelist', flist[2])) 38 | print(cmd) 39 | os.system(cmd) 40 | -------------------------------------------------------------------------------- /get_results.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import numpy as np 4 | 5 | 6 | category_info = { 7 | 'Chair': { 8 | '#part': { 9 | 1: 6, 10 | 2: 30, 11 | 3: 39 12 | } 13 | }, 14 | 'Table': { 15 | '#part': { 16 | 1: 11, 17 | 2: 42, 18 | 3: 51 19 | } 20 | }, 21 | 'Lamp': { 22 | '#part': { 23 | 1: 18, 24 | 2: 28, 25 | 3: 41 26 | } 27 | }, 28 | 'StorageFurniture': { 29 | '#part': { 30 | 1: 7, 31 | 2: 19, 32 | 3: 24 33 | } 34 | }, 35 | 'Bed': { 36 | '#part': { 37 | 1: 4, 38 | 2: 10, 39 | 3: 15 40 | } 41 | }, 42 | 'Bag': { 43 | '#part': { 44 | 1: 4 45 | } 46 | }, 47 | 'Bottle': { 48 | '#part': { 49 | 1: 6, 50 | 3: 9 51 | } 52 | }, 53 | 'Bowl': { 54 | '#part': { 55 | 1: 4 56 | } 57 | }, 58 | 'Clock': { 59 | '#part': { 60 | 1: 6, 61 | 3: 11 62 | } 63 | }, 64 | 'Dishwasher': { 65 | '#part': { 66 | 1: 3, 67 | 2: 5, 68 | 3: 7 69 | } 70 | }, 71 | 'Display': { 72 | '#part': { 73 | 1: 3, 74 | 3: 4 75 | } 76 | }, 77 | 'Door': { 78 | '#part': { 79 | 1: 3, 80 | 2: 4, 81 | 3: 5 82 | } 83 | }, 84 | 'Earphone': { 85 | '#part': { 86 | 1: 6, 87 | 3: 10 88 | } 89 | }, 90 | 'Faucet': { 91 | '#part': { 92 | 1: 8, 93 | 3: 12 94 | } 95 | }, 96 | 'Hat': { 97 | '#part': { 98 | 1: 6 99 | } 100 | }, 101 | 'Keyboard': { 102 | '#part': { 103 | 1: 3 104 | } 105 | }, 106 | 'Knife': { 107 | '#part': { 108 | 1: 5, 109 | 3: 10 110 | } 111 | }, 112 | 'Laptop': { 113 | '#part': { 114 | 1: 3 
115 | } 116 | }, 117 | 'Microwave': { 118 | '#part': { 119 | 1: 3, 120 | 2: 5, 121 | 3: 6 122 | } 123 | }, 124 | 'Mug': { 125 | '#part': { 126 | 1: 4 127 | } 128 | }, 129 | 'Refrigerator': { 130 | '#part': { 131 | 1: 3, 132 | 2: 6, 133 | 3: 7 134 | } 135 | }, 136 | 'Scissors': { 137 | '#part': { 138 | 1: 3 139 | } 140 | }, 141 | 'TrashCan': { 142 | '#part': { 143 | 1: 5, 144 | 3: 11 145 | } 146 | }, 147 | 'Vase': { 148 | '#part': { 149 | 1: 4, 150 | 3: 6 151 | } 152 | } 153 | } 154 | 155 | 156 | def load_metric(filename): 157 | assert os.path.isfile(filename), filename 158 | content = open(filename).readlines() 159 | prefix_list = ['mAP50'] 160 | # prefix_list = ['miou v2'] 161 | # prefix_list = ['mAP25'] 162 | # prefix_list = ['mAP75'] 163 | # prefix_list = ['shape mAP'] 164 | # prefix_list = ['miou v3'] 165 | metric = [] 166 | for prefix in prefix_list: 167 | for line in content: 168 | if line.startswith(' {}'.format(prefix)): 169 | metric.append(float(line.split()[-1])) 170 | return metric 171 | 172 | 173 | 174 | 175 | def main(): 176 | folder = os.path.join('log', 'PartNet') 177 | assert os.path.isdir(folder), folder 178 | category_list = os.listdir(folder) 179 | n_cat = len(category_list) 180 | metric_list = -np.ones([3, n_cat], np.float32) 181 | print('n_cat: {}'.format(n_cat)) 182 | for category in category_list: 183 | category_folder = os.path.join(folder, category) 184 | filename = [file for file in os.listdir(category_folder) if file.endswith('.txt')][0] 185 | metric = load_metric(os.path.join(category_folder, filename)) 186 | 187 | for level in [1, 2, 3]: 188 | if level in category_info[category]['#part']: 189 | metric_list[level-1][category_list.index(category)] = metric[level-1] 190 | 191 | for category in category_list: 192 | print('{:>5} '.format(category[:5]), end='') 193 | print(''); sys.stdout.flush() 194 | for level in [1, 2, 3]: 195 | for i in range(n_cat): 196 | value = metric_list[level-1][i] 197 | if value >= 0: 198 | 
print('{:5.1f},'.format(value), end='') 199 | else: 200 | print(' ,', end='') 201 | print('') 202 | 203 | 204 | if __name__ == '__main__': 205 | main() -------------------------------------------------------------------------------- /ocnn/tensorflow/libs/__init__.py: -------------------------------------------------------------------------------- 1 | import os 2 | import tensorflow as tf 3 | from tensorflow.python.framework import ops 4 | 5 | if 'OCTREE_KEY' in os.environ and os.environ['OCTREE_KEY'] == '64': 6 | print('INFO from ocnn: The octree key is 64 bits') 7 | octree_key64 = True 8 | tf_uintk = tf.uint64 9 | tf_uints = tf.uint16 10 | tf_intk = tf.int64 11 | else: 12 | print('INFO from ocnn: The octree key is 32 bits, ' 13 | 'the octree depth should be smaller than 8.') 14 | octree_key64 = False 15 | tf_uintk = tf.uint32 16 | tf_uints = tf.uint8 17 | tf_intk = tf.int32 18 | 19 | _current_path = os.path.dirname(os.path.realpath(__file__)) 20 | _tf_ocnn_module = tf.load_op_library(os.path.join(_current_path, 'libocnn_docker.so')) 21 | 22 | bounding_sphere = _tf_ocnn_module.bounding_sphere 23 | points_property = _tf_ocnn_module.points_property 24 | transform_points = _tf_ocnn_module.transform_points 25 | normalize_points = _tf_ocnn_module.normalize_points 26 | points_new = _tf_ocnn_module.points_new 27 | points_set_property = _tf_ocnn_module.points_set_property 28 | octree_drop = _tf_ocnn_module.octree_drop 29 | octree_scan = _tf_ocnn_module.octree_scan 30 | octree_cast = _tf_ocnn_module.octree_cast 31 | octree_batch = _tf_ocnn_module.octree_batch 32 | points2octree = _tf_ocnn_module.points_to_octree 33 | octree_property = _tf_ocnn_module.octree_property 34 | octree_pad = _tf_ocnn_module.octree_pad 35 | octree_depad = _tf_ocnn_module.octree_depad 36 | octree2col = _tf_ocnn_module.octree_to_col 37 | col2octree = _tf_ocnn_module.col_to_octree 38 | octree_grow = _tf_ocnn_module.octree_grow 39 | octree_new = _tf_ocnn_module.octree_new 40 | octree_update = 
_tf_ocnn_module.octree_update 41 | octree_align = _tf_ocnn_module.octree_align 42 | octree_mask = _tf_ocnn_module.octree_mask 43 | octree_samples = _tf_ocnn_module.octree_samples 44 | octree_search = _tf_ocnn_module.octree_search 45 | octree_key2xyz = _tf_ocnn_module.octree_key_to_xyz 46 | octree_xyz2key = _tf_ocnn_module.octree_xyz_to_key 47 | octree_decode_key = _tf_ocnn_module.octree_decode_key 48 | octree_encode_key = _tf_ocnn_module.octree_encode_key 49 | octree_search_key = _tf_ocnn_module.octree_search_key 50 | octree_set_property = _tf_ocnn_module.octree_set_property 51 | octree_gather = _tf_ocnn_module.octree_gather 52 | octree_gatherbk = _tf_ocnn_module.octree_gatherbk 53 | _octree_max_pool = _tf_ocnn_module.octree_max_pool 54 | _octree_mask_pool = _tf_ocnn_module.octree_mask_pool 55 | _octree_max_unpool = _tf_ocnn_module.octree_max_unpool 56 | _octree_conv = _tf_ocnn_module.octree_conv 57 | _octree_deconv = _tf_ocnn_module.octree_deconv 58 | _octree_conv_grad = _tf_ocnn_module.octree_conv_grad 59 | _octree_deconv_grad = _tf_ocnn_module.octree_deconv_grad 60 | _octree_align_grad = _tf_ocnn_module.octree_align_grad 61 | _octree_bilinear = _tf_ocnn_module.octree_bilinear 62 | 63 | 64 | ops.NotDifferentiable('BoundingSphere') 65 | ops.NotDifferentiable('OctreeSetProperty') 66 | ops.NotDifferentiable('OctreeBatch') 67 | ops.NotDifferentiable('TransformPoints') 68 | ops.NotDifferentiable('NormalizePoints') 69 | ops.NotDifferentiable('PointsNew') 70 | ops.NotDifferentiable('PointsSetProperty') 71 | ops.NotDifferentiable('PointsToOctree') 72 | ops.NotDifferentiable('OctreeProperty') 73 | ops.NotDifferentiable('OctreeNew') 74 | ops.NotDifferentiable('OctreeUpdate') 75 | ops.NotDifferentiable('OctreeGrow') 76 | ops.NotDifferentiable('OctreeSamples') 77 | ops.NotDifferentiable('OctreeBilinear') 78 | ops.NotDifferentiable('OctreeKeyToXyz') 79 | ops.NotDifferentiable('OctreeXyzToKey') 80 | ops.NotDifferentiable('OctreeDecodeKey') 81 | 
ops.NotDifferentiable('OctreeEncodeKey') 82 | ops.NotDifferentiable('OctreeSearchKey') 83 | ops.NotDifferentiable('OctreeSearch') 84 | ops.NotDifferentiable('PointsProperty') 85 | ops.NotDifferentiable('OctreeScan') 86 | ops.NotDifferentiable('OctreeCast') 87 | ops.NotDifferentiable('OctreeDrop') 88 | 89 | 90 | @ops.RegisterGradient('OctreePad') 91 | def _OctreePadGrad(op, grad): 92 | grad_out = octree_depad(grad, op.inputs[1], op.get_attr('depth')) 93 | return [grad_out, None] 94 | 95 | 96 | @ops.RegisterGradient('OctreeDepad') 97 | def _OctreeDepadGrad(op, grad): 98 | grad_out = octree_pad(grad, op.inputs[1], op.get_attr('depth')) 99 | return [grad_out, None] 100 | 101 | 102 | @ops.RegisterGradient('OctreeToCol') 103 | def _OctreeToColGrad(op, grad): 104 | grad_out = col2octree(grad, op.inputs[1], op.get_attr('depth'), 105 | op.get_attr('kernel_size'), op.get_attr('stride')) 106 | return [grad_out, None] 107 | 108 | 109 | @ops.RegisterGradient('ColToOctree') 110 | def _ColToOctreeGrad(op, grad): 111 | grad_out = octree2col(grad, op.inputs[1], op.get_attr('depth'), 112 | op.get_attr('kernel_size'), op.get_attr('stride')) 113 | return [grad_out, None] 114 | 115 | 116 | @ops.RegisterGradient('OctreeMaxPool') 117 | def _OctreeMaxPoolGrad(op, *grad): 118 | grad_out = _octree_max_unpool(grad[0], op.outputs[1], op.inputs[1], 119 | op.get_attr('depth')) 120 | return [grad_out, None] 121 | 122 | 123 | @ops.RegisterGradient('OctreeMaxUnpool') 124 | def _OctreeMaxUnpoolGrad(op, grad): 125 | grad_out = _octree_mask_pool(grad, op.inputs[1], op.inputs[2], 126 | op.get_attr('depth')) 127 | return [grad_out, None, None] 128 | 129 | 130 | @ops.RegisterGradient('OctreeMaskPool') 131 | def _OctreeMaskPoolGrad(op, grad): 132 | grad_out = _octree_max_unpool(grad, op.inputs[1], op.inputs[2], 133 | op.get_attr('depth')) 134 | return [grad_out, None, None] 135 | 136 | 137 | @ops.RegisterGradient('OctreeConv') 138 | def _OctreeConvGrad(op, grad): 139 | grad_out = 
_octree_conv_grad(op.inputs[0], op.inputs[1], op.inputs[2], grad, 140 | op.get_attr('depth'), op.get_attr('num_output'), 141 | op.get_attr('kernel_size'), op.get_attr('stride')) 142 | return grad_out + (None, ) 143 | 144 | 145 | @ops.RegisterGradient('OctreeDeconv') 146 | def _OctreeDeconvGrad(op, grad): 147 | grad_out = _octree_deconv_grad(op.inputs[0], op.inputs[1], op.inputs[2], grad, 148 | op.get_attr('depth'), op.get_attr('num_output'), 149 | op.get_attr('kernel_size'), op.get_attr('stride')) 150 | return grad_out + (None, ) 151 | 152 | 153 | @ops.RegisterGradient('OctreeAlign') 154 | def _OctreeAlignGrad(op, *grad): 155 | grad_out = _octree_align_grad(grad[0], op.outputs[1]) 156 | return [grad_out, None, None] 157 | 158 | 159 | @ops.RegisterGradient('OctreeMask') 160 | def _OctreeMaskGrad(op, grad): 161 | grad_out = octree_mask(grad, op.inputs[1], op.get_attr('mask')) 162 | return [grad_out, None] 163 | 164 | 165 | @ops.RegisterGradient('OctreeGather') 166 | def _OctreeGatherGrad(op, grad): 167 | shape = tf.shape(op.inputs[0]) 168 | grad_out = octree_gatherbk(grad, op.inputs[1], shape) 169 | return [grad_out, None] 170 | 171 | 172 | def octree_max_pool(data, octree, depth): 173 | with tf.variable_scope('octree_max_pool'): 174 | data, mask = _octree_max_pool(data, octree, depth) # the bottom data depth 175 | data = octree_pad(data, octree, depth-1) # !!! depth-1 176 | return data, mask 177 | 178 | 179 | def octree_max_unpool(data, mask, octree, depth): 180 | with tf.variable_scope('octree_max_unpool'): 181 | data = octree_depad(data, octree, depth) # !!! depth 182 | data = _octree_max_unpool(data, mask, octree, depth) # the bottom data depth 183 | return data 184 | 185 | 186 | def octree_avg_pool(data, octree, depth): 187 | with tf.variable_scope('octree_avg_pool'): 188 | data = tf.reshape(data, [1, int(data.shape[1]), -1, 8]) 189 | data = tf.reduce_mean(data, axis=3, keepdims=True) 190 | data = octree_pad(data, octree, depth-1) # !!! 
depth-1 191 | return data 192 | 193 | 194 | # todo: merge octree_conv_fast and octree_conv_memory to reduce code redundancy 195 | def octree_conv_fast(data, octree, depth, channel, kernel_size=[3], stride=1): 196 | assert(type(kernel_size) is list and len(kernel_size) < 4) 197 | for i in range(len(kernel_size), 3): 198 | kernel_size.append(kernel_size[-1]) 199 | 200 | with tf.variable_scope('octree_conv'): 201 | dim = int(data.shape[1]) * kernel_size[0] * kernel_size[1] * kernel_size[2] 202 | kernel = tf.get_variable('weights', shape=[channel, dim], dtype=tf.float32, 203 | initializer=tf.contrib.layers.xavier_initializer()) 204 | col = octree2col(data, octree, depth, kernel_size, stride) 205 | col = tf.reshape(col, [dim, -1]) 206 | conv = tf.matmul(kernel, col) 207 | conv = tf.expand_dims(tf.expand_dims(conv, 0), -1) # [C, H] -> [1, C, H, 1] 208 | if stride == 2: 209 | conv = octree_pad(conv, octree, depth-1, 0) 210 | return conv 211 | 212 | 213 | def octree_conv_memory(data, octree, depth, channel, kernel_size=[3], stride=1): 214 | assert(type(kernel_size) is list and len(kernel_size) < 4) 215 | for i in range(len(kernel_size), 3): 216 | kernel_size.append(kernel_size[-1]) 217 | 218 | with tf.variable_scope('octree_conv'): 219 | dim = int(data.shape[1]) * kernel_size[0] * kernel_size[1] * kernel_size[2] 220 | kernel = tf.get_variable('weights', shape=[channel, dim], dtype=tf.float32, 221 | initializer=tf.contrib.layers.xavier_initializer()) 222 | conv = _octree_conv(data, kernel, octree, depth, channel, kernel_size, stride) 223 | if stride == 2: 224 | conv = octree_pad(conv, octree, depth-1) 225 | return conv 226 | 227 | 228 | def octree_deconv_fast(data, octree, depth, channel, kernel_size=[3], stride=1): 229 | assert(type(kernel_size) is list and len(kernel_size) < 4) 230 | for i in range(len(kernel_size), 3): 231 | kernel_size.append(kernel_size[-1]) 232 | 233 | with tf.variable_scope('octree_deconv'): 234 | kernel_sdim = kernel_size[0] * kernel_size[1] * 
kernel_size[2] 235 | dim = channel * kernel_sdim 236 | kernel = tf.get_variable('weights', shape=[int(data.shape[1]), dim], dtype=tf.float32, 237 | initializer=tf.contrib.layers.xavier_initializer()) 238 | if stride == 2: 239 | data = octree_depad(data, octree, depth) 240 | depth = depth + 1 241 | data = tf.squeeze(data, [0, 3]) 242 | deconv = tf.matmul(kernel, data, transpose_a=True, transpose_b=False) 243 | deconv = tf.reshape(deconv, [channel, kernel_sdim, -1]) 244 | col = col2octree(deconv, octree, depth, kernel_size, stride) 245 | return col 246 | 247 | 248 | def octree_deconv_memory(data, octree, depth, channel, kernel_size=[3], stride=1): 249 | assert(type(kernel_size) is list and len(kernel_size) < 4) 250 | for i in range(len(kernel_size), 3): 251 | kernel_size.append(kernel_size[-1]) 252 | 253 | with tf.variable_scope('octree_deconv'): 254 | kernel_sdim = kernel_size[0] * kernel_size[1] * kernel_size[2] 255 | dim = channel * kernel_sdim 256 | kernel = tf.get_variable('weights', shape=[int(data.shape[1]), dim], dtype=tf.float32, 257 | initializer=tf.contrib.layers.xavier_initializer()) 258 | if stride == 2: 259 | data = octree_depad(data, octree, depth) 260 | deconv = _octree_deconv(data, kernel, octree, depth, channel, kernel_size, stride) 261 | return deconv 262 | 263 | 264 | def octree_full_voxel(data, depth): 265 | height = 2 ** (3 * depth) 266 | channel = int(data.shape[1]) 267 | with tf.variable_scope('octree_full_voxel'): 268 | data = tf.reshape(data, [channel, -1, height]) # (1, C, H, 1) -> (C, batch_size, H1) 269 | data = tf.transpose(data, perm=[1, 0, 2]) 270 | return data 271 | 272 | 273 | def octree_tile(data, octree, depth): 274 | with tf.variable_scope('octree_tile'): 275 | data = octree_depad(data, octree, depth) # (1, C, H, 1) 276 | data = tf.tile(data, [1, 1, 1, 8]) # (1, C, H, 8) 277 | channel = int(data.shape[1]) 278 | output = tf.reshape(data, [1, channel, -1, 1]) 279 | return output 280 | 281 | 282 | def octree_global_pool(data, octree, 
depth): 283 | with tf.variable_scope('octree_global_pool'): 284 | segment_ids = octree_property(octree, property_name='index', dtype=tf.int32, 285 | depth=depth, channel=1) 286 | segment_ids = tf.reshape(segment_ids, [-1]) 287 | data = tf.squeeze(data, axis=[0, 3]) # (1, C, H, 1) -> (C, H) 288 | data = tf.transpose(data) # (C, H) -> (H, C) 289 | output = tf.math.segment_mean(data, segment_ids) # (H, C) -> (batch_size, C) 290 | return output 291 | 292 | 293 | def octree_bilinear_legacy(data, octree, depth, target_depth): 294 | with tf.variable_scope('octree_bilinear'): 295 | mask = tf.constant( 296 | [[0, 0, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0], 297 | [0, 1, 1], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=tf.float32) 298 | index, fracs = _octree_bilinear(octree, depth, target_depth) 299 | feat = tf.transpose(tf.squeeze(data, [0, 3])) # (1, C, H, 1) -> (H, C) 300 | output = tf.zeros([tf.shape(index)[0], tf.shape(feat)[1]], dtype=tf.float32) 301 | norm = tf.zeros([tf.shape(index)[0], 1], dtype=tf.float32) 302 | for i in range(8): 303 | idxi = index[:, i] 304 | weight = tf.abs(tf.reduce_prod(mask[i, :] - fracs, axis=1, keepdims=True)) 305 | output += weight * tf.gather(feat, idxi) 306 | norm += weight * tf.expand_dims(tf.cast(idxi > -1, dtype=tf.float32), -1) 307 | output = tf.div(output, norm) 308 | output = tf.expand_dims(tf.expand_dims(tf.transpose(output), 0), -1) 309 | return output 310 | 311 | 312 | # pts: (N, 4), i.e. 
N x (x, y, z, id) 313 | # data: (1, C, H, 1) 314 | def octree_bilinear_v1(pts, data, octree, depth): 315 | with tf.variable_scope('octree_bilinear'): 316 | mask = tf.constant( 317 | [[0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 1, 1], 318 | [1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=tf.float32) 319 | 320 | xyzf, ids = tf.split(pts, [3, 1], 1) 321 | xyzf = xyzf - 0.5 # since the value is defined on the center of each voxel 322 | xyzi = tf.floor(xyzf) # the integer part 323 | frac = xyzf - xyzi # the fraction part 324 | 325 | feat = tf.transpose(tf.squeeze(data, [0, 3])) # (1, C, H, 1) -> (H, C) 326 | output = tf.zeros([tf.shape(xyzi)[0], tf.shape(feat)[1]], dtype=tf.float32) 327 | norm = tf.zeros([tf.shape(xyzi)[0], 1], dtype=tf.float32) 328 | 329 | for i in range(8): 330 | maski = mask[i, :] 331 | maskc = 1.0 - maski 332 | xyzm = xyzi + maski 333 | xyzm = tf.cast(tf.concat([xyzm, ids], axis=1), dtype=tf_uints) 334 | idxi = octree_search_key(octree_encode_key(xyzm), octree, depth, is_xyz=True) 335 | 336 | weight = tf.abs(tf.reduce_prod(maskc - frac, axis=1, keepdims=True)) 337 | output += weight * tf.gather(feat, idxi) 338 | norm += weight * tf.expand_dims(tf.cast(idxi > -1, dtype=tf.float32), -1) 339 | output = tf.div(output, norm) 340 | 341 | output = tf.expand_dims(tf.expand_dims(tf.transpose(output), 0), -1) 342 | frac = tf.expand_dims(tf.expand_dims(tf.transpose(frac), 0), -1) 343 | 344 | return output, frac 345 | 346 | # pts: (N, 4), i.e. 
N x (x, y, z, id) 347 | # data: (1, C, H, 1) 348 | def octree_bilinear_v2(pts, data, octree, depth): 349 | with tf.variable_scope('octree_bilinear'): 350 | mask = tf.constant( 351 | [[0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 1, 1], 352 | [1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=tf.float32) 353 | 354 | xyzf, ids = tf.split(pts, [3, 1], 1) 355 | xyzf = xyzf - 0.5 # since the value is defined on the center of each voxel 356 | xyzi = tf.floor(xyzf) # the integer part 357 | frac = xyzf - xyzi # the fraction part 358 | 359 | output = tf.zeros([1, tf.shape(data)[1], tf.shape(xyzi)[0], 1], dtype=tf.float32) 360 | norm = tf.zeros([tf.shape(xyzi)[0], 1], dtype=tf.float32) 361 | 362 | for i in range(8): 363 | maski = mask[i, :] 364 | maskc = 1.0 - maski 365 | xyzm = xyzi + maski 366 | xyzm = tf.cast(tf.concat([xyzm, ids], axis=1), dtype=tf_uints) 367 | # !!! Note some elements of idxi may be -1 368 | idxi = octree_search_key(octree_encode_key(xyzm), octree, depth, is_xyz=True) 369 | 370 | weight = tf.abs(tf.reduce_prod(maskc - frac, axis=1, keepdims=True)) 371 | # output += weight * tf.gather(data, idxi, axis=2) 372 | output += weight * octree_gather(data, idxi) 373 | norm += weight * tf.expand_dims(tf.cast(idxi > -1, dtype=tf.float32), -1) 374 | output = tf.div(output, norm) 375 | return output 376 | 377 | 378 | # pts: (N, 4), i.e. N x (x, y, z, id). 379 | # data: (1, C, H, 1) 380 | # !!! 
Note: the pts should be scaled into [0, 2^depth] 381 | def octree_bilinear_v3(pts, data, octree, depth): 382 | with tf.variable_scope('octree_linear'): 383 | mask = tf.constant( 384 | [[0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 1, 1], 385 | [1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=tf.float32) 386 | if octree_key64: 387 | masku = tf.constant([0, 4294967296, 65536, 4295032832, 388 | 1, 4294967297, 65537, 4295032833], dtype=tf.int64) 389 | else: 390 | masku = tf.constant([0, 65536, 256, 65792, 391 | 1, 65537, 257, 65793], dtype=tf.int32) 392 | 393 | maskc = 1 - mask 394 | 395 | xyzf, ids = tf.split(pts, [3, 1], 1) 396 | xyzf = xyzf - 0.5 # since the value is defined on the center of each voxel 397 | xyzi = tf.floor(xyzf) # the integer part (N, 3) 398 | frac = xyzf - xyzi # the fraction part (N, 3) 399 | 400 | key = tf.cast(tf.concat([xyzi, ids], axis=1), dtype=tf_uints) 401 | key = tf.cast(octree_encode_key(key), dtype=tf_intk) 402 | # Cast the key to `int32` since the `add` below does not support `uint64` 403 | # The side effect is that the batch_size must be smaller than 128 404 | key = tf.expand_dims(key, 1) + masku # (N, 8) 405 | key = tf.cast(tf.reshape(key, [-1]), dtype=tf_uintk) 406 | 407 | idx = octree_search_key(key, octree, depth) # (N*8,) 408 | flgs = idx > -1 # filtering flags 409 | idx = tf.boolean_mask(idx, flgs) 410 | 411 | npt = tf.shape(xyzi)[0] 412 | ids = tf.reshape(tf.range(npt), [-1, 1]) 413 | ids = tf.reshape(tf.tile(ids, [1, 8]), [-1]) # (N*8,) 414 | ids = tf.boolean_mask(ids, flgs) 415 | 416 | frac = maskc - tf.expand_dims(frac, axis=1) 417 | weight = tf.abs(tf.reshape(tf.reduce_prod(frac, axis=2), [-1])) 418 | weight = tf.boolean_mask(weight, flgs) 419 | 420 | indices = tf.concat([tf.expand_dims(ids, 1), tf.expand_dims(idx, 1)], 1) 421 | indices = tf.cast(indices, tf.int64) 422 | data = tf.squeeze(data, [0, 3]) # (C, H) 423 | h = tf.shape(data)[1] 424 | mat = tf.SparseTensor(indices=indices, values=weight, dense_shape=[npt, h]) 425 | 426 
| # channel, max_channel = int(data.shape[0]), 512 427 | # if channel > max_channel: 428 | # num = channel // max_channel 429 | # remain = channel % max_channel 430 | # splits = [max_channel] * num 431 | # if remain != 0: 432 | # splits.append(remain) 433 | # num += 1 434 | # output_split = [None] * num 435 | # data_split = tf.split(data, splits, axis=0) 436 | # for i in range(num): 437 | # with tf.name_scope('mat_%d' % i): 438 | # output_split[i] = tf.sparse.sparse_dense_matmul( 439 | # mat, data_split[i], adjoint_a=False, adjoint_b=True) 440 | # output = tf.concat(output_split, axis=1) 441 | # else: 442 | # output = tf.sparse.sparse_dense_matmul(mat, data, adjoint_a=False, adjoint_b=True) 443 | 444 | output = tf.sparse.sparse_dense_matmul(mat, data, adjoint_a=False, adjoint_b=True) 445 | norm = tf.sparse.sparse_dense_matmul(mat, tf.ones([h, 1])) 446 | output = tf.div(output, norm + 1.0e-10) # avoid dividing by zeros 447 | output = tf.expand_dims(tf.expand_dims(tf.transpose(output), 0), -1) 448 | return output 449 | 450 | 451 | def octree_bilinear(data, octree, depth, target_depth, mask=None): 452 | with tf.name_scope('Octree_bilinear'): 453 | xyz = octree_property(octree, property_name='xyz', depth=target_depth, 454 | channel=1, dtype=tf_uintk) 455 | xyz = tf.reshape(xyz, [-1]) 456 | if mask is not None: 457 | xyz = tf.boolean_mask(xyz, mask) 458 | xyz = tf.cast(octree_decode_key(xyz), dtype=tf.float32) 459 | 460 | # Attention: displacement 0.5, scale 461 | scale = 2.0**(depth-target_depth) 462 | xyz += tf.constant([0.5, 0.5, 0.5, 0.0], dtype=tf.float32) 463 | xyz *= tf.constant([scale, scale, scale, 1.0], dtype=tf.float32) 464 | 465 | output = octree_bilinear_v3(xyz, data, octree, depth) 466 | return output 467 | 468 | 469 | # pts: (N, 4), i.e. 
N x (x, y, z, id) 470 | # data: (1, C, H, 1) 471 | def octree_nearest_interp(pts, data, octree, depth): 472 | with tf.variable_scope('octree_nearest_interp'): 473 | # The value is defined on the center of each voxel, 474 | # so we can get the closest grid point by simply casting the value to tf_uints 475 | pts = tf.cast(pts, dtype=tf_uints) 476 | key = tf.reshape(octree_encode_key(pts), [-1]) 477 | 478 | idx = octree_search_key(key, octree, depth) 479 | # !!! Note that some entries of idx may be -1 or out of bounds 480 | # Using tf.gather may be problematic with some versions of TensorFlow 481 | # according to my experiments, so I implemented octree_gather to 482 | # replace the original tf.gather. If you encounter errors, please 483 | # use octree_gather 484 | # output = tf.gather(data, idx, axis=2) 485 | output = octree_gather(data, idx) 486 | return output 487 | 488 | 489 | 490 | def octree_signal(octree, depth, channel): 491 | with tf.name_scope('octree_signal'): 492 | signal = octree_property(octree, property_name='feature', dtype=tf.float32, 493 | depth=depth, channel=channel) 494 | signal = tf.reshape(signal, [1, channel, -1, 1]) 495 | return signal 496 | 497 | 498 | def octree_xyz(octree, depth, decode=True): 499 | with tf.name_scope('octree_xyz'): 500 | xyz = octree_property(octree, property_name='xyz', dtype=tf_uintk, 501 | depth=depth, channel=1) 502 | xyz = tf.reshape(xyz, [-1]) # uint32, N 503 | if decode: 504 | xyz = octree_decode_key(xyz) # uint8, Nx4 505 | return xyz 506 | 507 | 508 | def octree_child(octree, depth): 509 | with tf.name_scope('octree_child'): 510 | child = octree_property(octree, property_name='child', dtype=tf.int32, 511 | depth=depth, channel=1) 512 | child = tf.reshape(child, [-1]) 513 | return child 514 | 515 | 516 | def octree_split(octree, depth): 517 | with tf.name_scope('octree_split'): 518 | split = octree_property(octree, property_name='split', dtype=tf.float32, 519 | depth=depth, channel=1) 520 | split = tf.reshape(split, [-1]) 521 
| return split -------------------------------------------------------------------------------- /ocnn/tensorflow/libs/libocnn_docker.so: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/isunchy/3d_instance_segmentation/5eb6d2bfbe76e76e27d1045aa335e5829bec5aeb/ocnn/tensorflow/libs/libocnn_docker.so -------------------------------------------------------------------------------- /ocnn/tensorflow/script/ocnn.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import tensorflow as tf 3 | sys.path.append("..") 4 | from libs import * 5 | 6 | 7 | def get_variables_with_name(name=None, without=None, train_only=True, verbose=False): 8 | if name is None: 9 | raise Exception("please input a name") 10 | 11 | t_vars = tf.trainable_variables() if train_only else tf.global_variables() 12 | d_vars = [var for var in t_vars if name in var.name] 13 | 14 | if without is not None: 15 | d_vars = [var for var in d_vars if without not in var.name] 16 | 17 | if verbose: 18 | print(" [*] getting variables with %s" % name) 19 | for idx, v in enumerate(d_vars): 20 | print(" got {:3}: {:15} {}".format(idx, v.name, str(v.get_shape()))) 21 | 22 | return d_vars 23 | 24 | 25 | def dense(inputs, nout, use_bias=False): 26 | inputs = tf.layers.flatten(inputs) 27 | fc = tf.layers.dense(inputs, nout, use_bias=use_bias, 28 | kernel_initializer=tf.contrib.layers.xavier_initializer()) 29 | return fc 30 | 31 | 32 | def batch_norm(inputs, training, axis=1): 33 | return tf.layers.batch_normalization(inputs, axis=axis, training=training) 34 | 35 | 36 | def fc_bn_relu(inputs, nout, training): 37 | fc = dense(inputs, nout, use_bias=False) 38 | bn = batch_norm(fc, training) 39 | return tf.nn.relu(bn) 40 | 41 | 42 | def conv2d(inputs, nout, kernel_size, stride, padding='SAME', data_format='channels_first'): 43 | return tf.layers.conv2d(inputs, nout, kernel_size=kernel_size, strides=stride, 44 | 
padding=padding, data_format=data_format, use_bias=False, 45 | kernel_initializer=tf.contrib.layers.xavier_initializer()) 46 | 47 | 48 | # def octree_conv1x1(inputs, nout, use_bias=False): 49 | # outputs = tf.layers.conv2d(inputs, nout, kernel_size=1, strides=1, 50 | # data_format='channels_first', use_bias=use_bias, 51 | # kernel_initializer=tf.contrib.layers.xavier_initializer()) 52 | # return outputs 53 | 54 | 55 | def octree_conv1x1(inputs, nout, use_bias=False): 56 | with tf.variable_scope('conv2d_1x1'): 57 | inputs = tf.squeeze(inputs, axis=[0, 3]) # (1, C, H, 1) -> (C, H) 58 | weights = tf.get_variable('weights', shape=[nout, int(inputs.shape[0])], 59 | dtype=tf.float32, initializer=tf.contrib.layers.xavier_initializer()) 60 | outputs = tf.matmul(weights, inputs) # (C, H) -> (nout, H) 61 | if use_bias: 62 | bias = tf.get_variable('bias', shape=[nout, 1], dtype=tf.float32, 63 | initializer=tf.contrib.layers.xavier_initializer()) 64 | outputs = bias + outputs 65 | outputs = tf.expand_dims(tf.expand_dims(outputs, axis=0), axis=-1) 66 | return outputs 67 | 68 | 69 | def octree_conv1x1_bn(inputs, nout, training): 70 | conv = octree_conv1x1(inputs, nout, use_bias=False) 71 | return batch_norm(conv, training) 72 | 73 | 74 | def octree_conv1x1_bn_relu(inputs, nout, training): 75 | conv = octree_conv1x1_bn(inputs, nout, training) 76 | return tf.nn.relu(conv) 77 | 78 | 79 | def octree_conv1x1_bn_lrelu(inputs, nout, training, alpha=0.1): 80 | conv = octree_conv1x1_bn(inputs, nout, training) 81 | return tf.nn.leaky_relu(conv, alpha=alpha) 82 | 83 | 84 | def conv2d_bn(inputs, nout, kernel_size, stride, training): 85 | conv = conv2d(inputs, nout, kernel_size, stride) 86 | return batch_norm(conv, training) 87 | 88 | 89 | def conv2d_bn_relu(inputs, nout, kernel_size, stride, training): 90 | conv = conv2d_bn(inputs, nout, kernel_size, stride, training) 91 | return tf.nn.relu(conv) 92 | 93 | 94 | def upsample(data, channel, training): 95 | deconv = 
tf.layers.conv2d_transpose(data, channel, kernel_size=[8, 1], 96 | strides=[8, 1], data_format='channels_first', use_bias=False, 97 | kernel_initializer=tf.contrib.layers.xavier_initializer()) 98 | bn = tf.layers.batch_normalization(deconv, axis=1, training=training) 99 | return tf.nn.relu(bn) 100 | 101 | 102 | def downsample(data, channel, training): 103 | deconv = tf.layers.conv2d(data, channel, kernel_size=[8, 1], 104 | strides=[8, 1], data_format='channels_first', use_bias=False, 105 | kernel_initializer=tf.contrib.layers.xavier_initializer()) 106 | bn = tf.layers.batch_normalization(deconv, axis=1, training=training) 107 | return tf.nn.relu(bn) 108 | 109 | 110 | def avg_pool2d(inputs, data_format='NCHW'): 111 | return tf.nn.avg_pool2d(inputs, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], 112 | padding='SAME', data_format=data_format) 113 | 114 | 115 | def global_pool(inputs, data_format='channels_first'): 116 | axis = [2, 3] if data_format == 'channels_first' else [1, 2] 117 | return tf.reduce_mean(inputs, axis=axis) 118 | 119 | 120 | # !!! 
Deprecated 121 | def octree_upsample(data, octree, depth, channel, training): 122 | with tf.variable_scope('octree_upsample'): 123 | depad = octree_depad(data, octree, depth) 124 | up = upsample(depad, channel, training) 125 | return up 126 | 127 | 128 | def octree_upsample(data, octree, depth, channel, training): 129 | up = octree_deconv_bn_relu(data, octree, depth, channel, training, 130 | kernel_size=[2], stride=2, fast_mode=False) 131 | return up 132 | 133 | 134 | def octree_downsample(data, octree, depth, channel, training): 135 | with tf.variable_scope('octree_downsample'): 136 | down = downsample(data, channel, training) 137 | pad = octree_pad(down, octree, depth) 138 | return pad 139 | 140 | 141 | def octree_conv_bn(data, octree, depth, channel, training, kernel_size=[3], 142 | stride=1, fast_mode=False): 143 | if fast_mode == True: 144 | conv = octree_conv_fast(data, octree, depth, channel, kernel_size, stride) 145 | else: 146 | conv = octree_conv_memory( 147 | data, octree, depth, channel, kernel_size, stride) 148 | return tf.layers.batch_normalization(conv, axis=1, training=training) 149 | 150 | 151 | def octree_conv_bn_relu(data, octree, depth, channel, training, kernel_size=[3], 152 | stride=1, fast_mode=False): 153 | with tf.variable_scope('conv_bn_relu'): 154 | conv_bn = octree_conv_bn(data, octree, depth, channel, training, kernel_size, 155 | stride, fast_mode) 156 | rl = tf.nn.relu(conv_bn) 157 | return rl 158 | 159 | 160 | def octree_conv_bn_leakyrelu(data, octree, depth, channel, training): 161 | cb = octree_conv_bn(data, octree, depth, channel, training) 162 | return tf.nn.leaky_relu(cb, alpha=0.2) 163 | 164 | 165 | def octree_deconv_bn(data, octree, depth, channel, training, kernel_size=[3], 166 | stride=1, fast_mode=False): 167 | if fast_mode == True: 168 | conv = octree_deconv_fast( 169 | data, octree, depth, channel, kernel_size, stride) 170 | else: 171 | conv = octree_deconv_memory( 172 | data, octree, depth, channel, kernel_size, stride) 
173 | return tf.layers.batch_normalization(conv, axis=1, training=training) 174 | 175 | 176 | def octree_deconv_bn_relu(data, octree, depth, channel, training, kernel_size=[3], 177 | stride=1, fast_mode=False): 178 | with tf.variable_scope('deconv_bn_relu'): 179 | conv_bn = octree_deconv_bn(data, octree, depth, channel, training, kernel_size, 180 | stride, fast_mode) 181 | rl = tf.nn.relu(conv_bn) 182 | return rl 183 | 184 | 185 | def octree_resblock(data, octree, depth, num_out, stride, training, bottleneck=4): 186 | num_in = int(data.shape[1]) 187 | channelb = int(num_out / bottleneck) 188 | if stride == 2: 189 | data, mask = octree_max_pool(data, octree, depth=depth) 190 | depth = depth - 1 191 | 192 | with tf.variable_scope("1x1x1_a"): 193 | block1 = octree_conv1x1_bn_relu(data, channelb, training=training) 194 | 195 | with tf.variable_scope("3x3x3"): 196 | block2 = octree_conv_bn_relu(block1, octree, depth, channelb, training) 197 | 198 | with tf.variable_scope("1x1x1_b"): 199 | block3 = octree_conv1x1_bn(block2, num_out, training=training) 200 | 201 | block4 = data 202 | if num_in != num_out: 203 | with tf.variable_scope("1x1x1_c"): 204 | block4 = octree_conv1x1_bn(data, num_out, training=training) 205 | 206 | return tf.nn.relu(block3 + block4) 207 | 208 | 209 | def octree_resblock2(data, octree, depth, num_out, training): 210 | num_in = int(data.shape[1]) 211 | with tf.variable_scope("conv_1"): 212 | conv = octree_conv_bn_relu(data, octree, depth, num_out // 4, training) # integer channel count, also under Python 3 213 | with tf.variable_scope("conv_2"): 214 | conv = octree_conv_bn(conv, octree, depth, num_out, training) 215 | 216 | link = data 217 | if num_in != num_out: 218 | with tf.variable_scope("conv_1x1"): 219 | link = octree_conv1x1_bn(data, num_out, training=training) 220 | 221 | out = tf.nn.relu(conv + link) 222 | return out 223 | 224 | 225 | def predict_module(data, num_output, num_hidden, training): 226 | # MLP with one hidden layer 227 | with tf.variable_scope('conv1'): 228 | conv = 
octree_conv1x1_bn_relu(data, num_hidden, training) 229 | with tf.variable_scope('conv2'): 230 | logit = octree_conv1x1(conv, num_output, use_bias=True) 231 | return logit 232 | 233 | 234 | def predict_label(data, num_output, num_hidden, training): 235 | logit = predict_module(data, num_output, num_hidden, training) 236 | # prob = tf.nn.softmax(logit, axis=1) # logit (1, num_output, ?, 1) 237 | label = tf.argmax(logit, axis=1, output_type=tf.int32) # predict (1, ?, 1) 238 | label = tf.reshape(label, [-1]) # flatten 239 | return logit, label 240 | 241 | 242 | def predict_signal(data, num_output, num_hidden, training): 243 | return tf.nn.tanh(predict_module(data, num_output, num_hidden, training)) 244 | 245 | 246 | def softmax_loss(logit, label_gt, num_class, label_smoothing=0.0): 247 | with tf.name_scope('softmax_loss'): 248 | label_gt = tf.cast(label_gt, tf.int32) 249 | onehot = tf.one_hot(label_gt, depth=num_class) 250 | loss = tf.losses.softmax_cross_entropy( 251 | onehot, logit, label_smoothing=label_smoothing) 252 | return loss 253 | 254 | 255 | def l2_regularizer(name, weight_decay): 256 | with tf.name_scope('l2_regularizer'): 257 | var = get_variables_with_name(name) 258 | regularizer = tf.add_n([tf.nn.l2_loss(v) for v in var]) * weight_decay 259 | return regularizer 260 | 261 | 262 | def label_accuracy(label, label_gt): 263 | label_gt = tf.cast(label_gt, tf.int32) 264 | accuracy = tf.reduce_mean(tf.to_float(tf.equal(label, label_gt))) 265 | return accuracy 266 | 267 | 268 | def softmax_accuracy(logit, label): 269 | with tf.name_scope('softmax_accuracy'): 270 | predict = tf.argmax(logit, axis=1, output_type=tf.int32) 271 | accu = label_accuracy(predict, tf.cast(label, tf.int32)) 272 | return accu 273 | 274 | 275 | def regress_loss(signal, signal_gt): 276 | return tf.reduce_mean(tf.reduce_sum(tf.square(signal-signal_gt), 1)) 277 | 278 | 279 | def normalize_signal(data): 280 | channel = data.shape[1] 281 | assert(channel == 3 or channel == 4) 282 | with 
tf.variable_scope("normalize"): 283 | if channel == 4: 284 | normals = tf.slice(data, [0, 0, 0, 0], [1, 3, -1, 1]) 285 | displacement = tf.slice(data, [0, 3, 0, 0], [1, 1, -1, 1]) 286 | normals = tf.nn.l2_normalize(normals, axis=1) 287 | output = tf.concat([normals, displacement], axis=1) 288 | else: 289 | output = tf.nn.l2_normalize(data, axis=1) 290 | return output 291 | 292 | 293 | def average_tensors(tower_tensors): 294 | avg_tensors = [] 295 | with tf.name_scope('avg_tensors'): 296 | for tensors in tower_tensors: 297 | tensors = [tf.expand_dims(tensor, 0) for tensor in tensors] 298 | avg_tensor = tf.concat(tensors, axis=0) 299 | avg_tensor = tf.reduce_mean(avg_tensor, 0) 300 | avg_tensors.append(avg_tensor) 301 | return avg_tensors 302 | 303 | 304 | def solver_single_gpu(total_loss, learning_rate_handle, gpu_num=1): 305 | with tf.variable_scope('solver'): 306 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) 307 | with tf.control_dependencies(update_ops): 308 | global_step = tf.Variable(0, trainable=False, name='global_step') 309 | lr = learning_rate_handle(global_step) 310 | solver = tf.train.MomentumOptimizer(lr, 0.9) \ 311 | .minimize(total_loss, global_step=global_step) 312 | return solver, lr 313 | 314 | 315 | def solver_multiple_gpus(total_loss, learning_rate_handle, gpu_num): 316 | tower_grads, variables = [], [] 317 | with tf.device('/cpu:0'): 318 | with tf.variable_scope('solver'): 319 | global_step = tf.Variable(0, trainable=False, name='global_step') 320 | lr = learning_rate_handle(global_step) 321 | opt = tf.train.MomentumOptimizer(lr, 0.9) 322 | 323 | for i in range(gpu_num): 324 | with tf.device('/gpu:%d' % i): 325 | with tf.name_scope('device_b%d' % i): 326 | grads_and_vars = opt.compute_gradients(total_loss[i]) 327 | grads, variables = zip(*grads_and_vars) 328 | tower_grads.append(grads) 329 | 330 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) 331 | # !!! 
Only get the update_ops defined on `device_0` to avoid the sync 332 | # between different GPUs to speed up the training process. !!! 333 | update_ops = [op for op in update_ops if 'device_0' in op.name] 334 | assert update_ops, 'The update ops of BN are empty, check the namescope \'device_0\'' 335 | with tf.device('/cpu:0'): 336 | with tf.name_scope('sync_and_apply_grad'): 337 | with tf.control_dependencies(update_ops): 338 | tower_grads = list(zip(*tower_grads)) 339 | avg_grads = average_tensors(tower_grads) 340 | grads_and_vars = list(zip(avg_grads, variables)) 341 | solver = opt.apply_gradients(grads_and_vars, global_step=global_step) 342 | return solver, lr 343 | 344 | 345 | def build_solver(total_loss, learning_rate_handle, gpu_num=1): 346 | assert (gpu_num > 0) 347 | the_solver = solver_single_gpu if gpu_num == 1 else solver_multiple_gpus 348 | return the_solver(total_loss, learning_rate_handle, gpu_num) 349 | 350 | 351 | def summary_train(names, tensors): 352 | with tf.name_scope('summary_train'): 353 | summaries = [] 354 | for it in zip(names, tensors): 355 | summaries.append(tf.summary.scalar(it[0], it[1])) 356 | summ = tf.summary.merge(summaries) 357 | return summ 358 | 359 | 360 | def summary_test(names): 361 | with tf.name_scope('summary_test'): 362 | summaries = [] 363 | summ_placeholder = [] 364 | for name in names: 365 | summ_placeholder.append(tf.placeholder(tf.float32)) 366 | summaries.append(tf.summary.scalar(name, summ_placeholder[-1])) 367 | summ = tf.summary.merge(summaries) 368 | return summ, summ_placeholder 369 | 370 | 371 | def loss_functions(logit, label_gt, num_class, weight_decay, var_name, label_smoothing=0.0): 372 | with tf.name_scope('loss'): 373 | loss = softmax_loss(logit, label_gt, num_class, label_smoothing) 374 | accu = softmax_accuracy(logit, label_gt) 375 | regularizer = l2_regularizer(var_name, weight_decay) 376 | return [loss, accu, regularizer] 377 | 378 | 379 | def loss_functions_seg(logit, label_gt, num_class, 
weight_decay, var_name, mask=-1): 380 | with tf.name_scope('loss_seg'): 381 | label_mask = label_gt > mask # filter label -1 382 | masked_logit = tf.boolean_mask(logit, label_mask) 383 | masked_label = tf.boolean_mask(label_gt, label_mask) 384 | loss = softmax_loss(masked_logit, masked_label, num_class) 385 | 386 | accu = softmax_accuracy(masked_logit, masked_label) 387 | regularizer = l2_regularizer(var_name, weight_decay) 388 | return [loss, accu, regularizer] 389 | 390 | 391 | def get_seg_label(octree, depth): 392 | with tf.name_scope('seg_label'): 393 | label = octree_property(octree, property_name='label', dtype=tf.float32, 394 | depth=depth, channel=1) 395 | label = tf.reshape(tf.cast(label, tf.int32), [-1]) 396 | return label 397 | 398 | 399 | def run_k_iterations(sess, k, tensors): 400 | num = len(tensors) 401 | avg_results = [0] * num 402 | for _ in range(k): 403 | iter_results = sess.run(tensors) 404 | for j in range(num): 405 | avg_results[j] += iter_results[j] 406 | 407 | for j in range(num): 408 | avg_results[j] /= k 409 | return avg_results 410 | 411 | 412 | def tf_IoU_per_shape(pred, label, class_num, mask=-1): 413 | with tf.name_scope('IoU'): 414 | label_mask = label > mask # filter label -1 415 | pred = tf.boolean_mask(pred, label_mask) 416 | label = tf.boolean_mask(label, label_mask) 417 | pred = tf.argmax(pred, axis=1, output_type=tf.int32) 418 | IoU, valid_part_num, esp = 0.0, 0.0, 1.0e-10 419 | for k in range(class_num): 420 | pk, lk = tf.equal(pred, k), tf.equal(label, k) 421 | # pk, lk = pred == k, label == k # wrong in TF 1.x: `==` on tensors compares object identity, not elements, so use tf.equal 
422 | intsc = tf.reduce_sum(tf.cast(pk & lk, dtype=tf.float32)) 423 | union = tf.reduce_sum(tf.cast(pk | lk, dtype=tf.float32)) 424 | valid = tf.cast(tf.reduce_any(lk), dtype=tf.float32) 425 | valid_part_num += valid 426 | IoU += valid * intsc / (union + esp) 427 | IoU /= valid_part_num + esp 428 | return IoU, valid_part_num 429 | 430 | 431 | class Optimizer: 432 | def __init__(self, stype='SGD', var_list=None, mul=1.0): 433 | self.stype = stype # TODO: support more optimizers 434 | self.mul = mul # used to modulate the global learning rate 435 | self.var_list = var_list 436 | 437 | def __call__(self, total_loss, learning_rate): 438 | with tf.name_scope('solver'): 439 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) 440 | with tf.control_dependencies(update_ops): 441 | global_step = tf.Variable(0, trainable=False, name='global_step') 442 | lr = learning_rate(global_step) * self.mul 443 | solver = tf.train.MomentumOptimizer(lr, 0.9) \ 444 | .minimize(total_loss, global_step=global_step, 445 | var_list=self.var_list) 446 | return solver, lr 447 | 448 | 449 | 450 | def octree2points(octree, depth, pts_channel=4, output_normal=False): 451 | with tf.name_scope('octree2points'): 452 | signal = octree_signal(octree, depth, 4) # normal and displacement 453 | signal = tf.transpose(tf.squeeze(signal, [0, 3])) # (1, C, H, 1) -> (H, C) 454 | xyz = octree_xyz(octree, depth) 455 | xyz = tf.cast(xyz, dtype=tf.float32) 456 | 457 | mask = octree_child(octree, depth) > -1 458 | signal = tf.boolean_mask(signal, mask) 459 | xyz = tf.boolean_mask(xyz, mask) 460 | 461 | c = 3.0 ** 0.5 / 2.0 462 | normal, dis = tf.split(signal, [3, 1], axis=1) 463 | pts, idx = tf.split(xyz, [3, 1], axis=1) 464 | pts = (pts + 0.5) + normal * (dis * c) 465 | if pts_channel == 4: 466 | pts = tf.concat([pts, idx], axis=1) 467 | output = pts if not output_normal else (pts, normal) 468 | return output 469 | 470 | -------------------------------------------------------------------------------- 
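Note on tf_IoU_per_shape in ocnn.py above: the following is a minimal NumPy sketch of the same per-shape IoU, meant only as a way to sanity-check numbers outside a TensorFlow session. The function name `iou_per_shape_np`, its `eps` argument, and the toy inputs are illustrative assumptions and are not part of the repository. As in the original, points whose label is not greater than `mask` are dropped, and only classes that actually occur in the ground truth contribute to the mean.

```python
import numpy as np


def iou_per_shape_np(logits, label, class_num, mask=-1, eps=1.0e-10):
    """Mean IoU over the classes present in `label` (hypothetical helper).

    logits: (N, class_num) per-point scores.
    label:  (N,) integer ground-truth labels; entries <= mask are ignored.
    """
    logits = np.asarray(logits)
    label = np.asarray(label)

    keep = label > mask                      # filter out label -1
    pred = np.argmax(logits[keep], axis=1)   # predicted class per kept point
    label = label[keep]

    iou_sum, valid_part_num = 0.0, 0.0
    for k in range(class_num):
        pk, lk = pred == k, label == k       # NumPy `==` is elementwise
        intsc = np.sum(pk & lk)
        union = np.sum(pk | lk)
        valid = float(lk.any())              # 1.0 only if class k is in the ground truth
        valid_part_num += valid
        iou_sum += valid * intsc / (union + eps)
    return iou_sum / (valid_part_num + eps), valid_part_num


if __name__ == "__main__":
    scores = [[2, 1, 0], [0, 3, 1], [1, 0, 5], [4, 0, 0]]
    labels = [0, 1, 1, -1]
    print(iou_per_shape_np(scores, labels, class_num=3))
```

In the toy call, the last point is ignored because its label is -1, class 2 never occurs in the ground truth and is skipped, and the remaining two classes have IoU 1 and 1/2.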
/run_partnet_test.py: -------------------------------------------------------------------------------- 1 | import os 2 | 3 | from util.category_info import category_info 4 | 5 | 6 | category_list = category_info.keys() 7 | 8 | 9 | print(category_info) 10 | 11 | print(category_list) 12 | 13 | 14 | gpu_id = 2 15 | 16 | 17 | for category in category_list: 18 | n_test = category_info[category]['#shape']['test'] 19 | train_data = os.path.join('data', '{}_level123_test_{}.tfrecords'.format(category, n_test)) 20 | test_data = os.path.join('data', '{}_level123_test_{}.tfrecords'.format(category, n_test)) 21 | assert os.path.isfile(train_data) 22 | assert os.path.isfile(test_data) 23 | n_part_1 = category_info[category]['#part'][1] 24 | n_part_2 = category_info[category]['#part'][2] 25 | n_part_3 = category_info[category]['#part'][3] 26 | 27 | script = 'python 3DInsSegNet.py --logdir log/PartNet/{} --train_data {} --test_data {} --test_data_visual {} --train_batch_size 8 --test_batch_size 1 --max_iter 100000 --test_every_iter 5000 --test_iter {} --test_iter_visual 0 --gpu {} --n_part_1 {} --n_part_2 {} --n_part_3 {} --level_1_weight 1 --level_2_weight 1 --level_3_weight 1 --phase test --seg_loss_weight 1 --offset_weight 1 --sem_offset_weight 1 --learning_rate 0.1 --ckpt weight/{} --delete_0 --notest_visual --depth 6 --weight_decay 0.0001 --stop_gradient --category {}'.format(category, train_data, test_data, test_data, n_test, gpu_id, n_part_1, n_part_2, n_part_3, category, category) 28 | 29 | print(script) 30 | os.system(script) -------------------------------------------------------------------------------- /util/category_info.py: -------------------------------------------------------------------------------- 1 | category_info = { 2 | 'Chair': { 3 | '#part': { 4 | 1: 6, 5 | 2: 30, 6 | 3: 39 7 | }, 8 | '#shape': { 9 | 'train': 4489, 10 | 'val': 617, 11 | 'test': 1217 12 | } 13 | }, 14 | 'Table': { 15 | '#part': { 16 | 1: 11, 17 | 2: 42, 18 | 3: 51 19 | }, 20 | '#shape': { 
21 | 'train': 5707, 22 | 'val': 843, 23 | 'test': 1668 24 | } 25 | }, 26 | 'Lamp': { 27 | '#part': { 28 | 1: 18, 29 | 2: 28, 30 | 3: 41 31 | }, 32 | '#shape': { 33 | 'train': 1554, 34 | 'val': 234, 35 | 'test': 419 36 | } 37 | }, 38 | 'StorageFurniture': { 39 | '#part': { 40 | 1: 7, 41 | 2: 19, 42 | 3: 24 43 | }, 44 | '#shape': { 45 | 'train': 1588, 46 | 'val': 230, 47 | 'test': 451 48 | } 49 | }, 50 | 'TrashCan': { 51 | '#part': { 52 | 1: 5, 53 | 2: 5, 54 | 3: 11 55 | }, 56 | '#shape': { 57 | 'train': 221, 58 | 'val': 37, 59 | 'test': 63 60 | } 61 | }, 62 | 'Vase': { 63 | '#part': { 64 | 1: 4, 65 | 2: 4, 66 | 3: 6 67 | }, 68 | '#shape': { 69 | 'train': 741, 70 | 'val': 102, 71 | 'test': 233 72 | } 73 | }, 74 | 'Microwave': { 75 | '#part': { 76 | 1: 3, 77 | 2: 5, 78 | 3: 6 79 | }, 80 | '#shape': { 81 | 'train': 133, 82 | 'val': 12, 83 | 'test': 39 84 | } 85 | }, 86 | 'Mug': { 87 | '#part': { 88 | 1: 4, 89 | 2: 4, 90 | 3: 4, 91 | }, 92 | '#shape': { 93 | 'train': 138, 94 | 'val': 19, 95 | 'test': 35 96 | } 97 | }, 98 | 'Refrigerator': { 99 | '#part': { 100 | 1: 3, 101 | 2: 6, 102 | 3: 7 103 | }, 104 | '#shape': { 105 | 'train': 136, 106 | 'val': 20, 107 | 'test': 31 108 | } 109 | }, 110 | 'Hat': { 111 | '#part': { 112 | 1: 6, 113 | 2: 6, 114 | 3: 6, 115 | }, 116 | '#shape': { 117 | 'train': 170, 118 | 'val': 16, 119 | 'test': 45 120 | } 121 | }, 122 | 'Keyboard': { 123 | '#part': { 124 | 1: 3, 125 | 2: 3, 126 | 3: 3, 127 | }, 128 | '#shape': { 129 | 'train': 111, 130 | 'val': 14, 131 | 'test': 31 132 | } 133 | }, 134 | 'Bed': { 135 | '#part': { 136 | 1: 4, 137 | 2: 10, 138 | 3: 15 139 | }, 140 | '#shape': { 141 | 'train': 133, 142 | 'val': 24, 143 | 'test': 37 144 | } 145 | }, 146 | 'Bag': { 147 | '#part': { 148 | 1: 4, 149 | 2: 4, 150 | 3: 4, 151 | }, 152 | '#shape': { 153 | 'train': 92, 154 | 'val': 5, 155 | 'test': 29 156 | } 157 | }, 158 | 'Bottle': { 159 | '#part': { 160 | 1: 6, 161 | 2: 6, 162 | 3: 9 163 | }, 164 | '#shape': { 165 | 'train': 315, 166 | 'val': 
37, 167 | 'test': 84 168 | } 169 | }, 170 | 'Bowl': { 171 | '#part': { 172 | 1: 4, 173 | 2: 4, 174 | 3: 4, 175 | }, 176 | '#shape': { 177 | 'train': 131, 178 | 'val': 18, 179 | 'test': 39 180 | } 181 | }, 182 | 'Clock': { 183 | '#part': { 184 | 1: 6, 185 | 2: 6, 186 | 3: 11 187 | }, 188 | '#shape': { 189 | 'train': 406, 190 | 'val': 50, 191 | 'test': 98 192 | } 193 | }, 194 | 'Dishwasher': { 195 | '#part': { 196 | 1: 3, 197 | 2: 5, 198 | 3: 7 199 | }, 200 | '#shape': { 201 | 'train': 111, 202 | 'val': 19, 203 | 'test': 51 204 | } 205 | }, 206 | 'Display': { 207 | '#part': { 208 | 1: 3, 209 | 2: 3, 210 | 3: 4 211 | }, 212 | '#shape': { 213 | 'train': 633, 214 | 'val': 104, 215 | 'test': 191 216 | } 217 | }, 218 | 'Door': { 219 | '#part': { 220 | 1: 3, 221 | 2: 4, 222 | 3: 5 223 | }, 224 | '#shape': { 225 | 'train': 149, 226 | 'val': 25, 227 | 'test': 51 228 | } 229 | }, 230 | 'Earphone': { 231 | '#part': { 232 | 1: 6, 233 | 2: 6, 234 | 3: 10 235 | }, 236 | '#shape': { 237 | 'train': 147, 238 | 'val': 28, 239 | 'test': 53 240 | } 241 | }, 242 | 'Faucet': { 243 | '#part': { 244 | 1: 8, 245 | 2: 8, 246 | 3: 12 247 | }, 248 | '#shape': { 249 | 'train': 435, 250 | 'val': 81, 251 | 'test': 132 252 | } 253 | }, 254 | 'Laptop': { 255 | '#part': { 256 | 1: 3, 257 | 2: 3, 258 | 3: 3, 259 | }, 260 | '#shape': { 261 | 'train': 306, 262 | 'val': 45, 263 | 'test': 82 264 | } 265 | }, 266 | 'Scissors': { 267 | '#part': { 268 | 1: 3, 269 | 2: 3, 270 | 3: 3, 271 | }, 272 | '#shape': { 273 | 'train': 45, 274 | 'val': 10, 275 | 'test': 13 276 | } 277 | }, 278 | 'Knife': { 279 | '#part': { 280 | 1: 5, 281 | 2: 5, 282 | 3: 10 283 | }, 284 | '#shape': { 285 | 'train': 221, 286 | 'val': 29, 287 | 'test': 77 288 | } 289 | } 290 | } 291 | -------------------------------------------------------------------------------- /util/cluster.py: -------------------------------------------------------------------------------- 1 | from sklearn.cluster import MeanShift 2 | import numpy as np 3 | 4 | 5 | 
def semantic_mean_shift_cluster(point_semantic_label, pts, bandwidth=0.1): 6 | point_instance_label = np.zeros_like(point_semantic_label, dtype=np.int32) 7 | cum_instance_index = 0 8 | for label in np.unique(point_semantic_label): 9 | cur_pts_index = np.reshape(np.argwhere(point_semantic_label==label), [-1]) 10 | if cur_pts_index.size > 1: 11 | cur_pts = pts[cur_pts_index] 12 | ms = MeanShift(bandwidth=bandwidth, bin_seeding=True) 13 | ms.fit(cur_pts) 14 | point_instance_label[cur_pts_index] = ms.labels_ + cum_instance_index 15 | cur_n_instance = ms.cluster_centers_.shape[0] 16 | else: 17 | point_instance_label[cur_pts_index] = cum_instance_index 18 | cur_n_instance = 1 19 | cum_instance_index += cur_n_instance 20 | return point_instance_label 21 | 22 | -------------------------------------------------------------------------------- /util/config.py: -------------------------------------------------------------------------------- 1 | import os 2 | import tensorflow as tf 3 | 4 | tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR) 5 | 6 | tf_flags = tf.app.flags 7 | 8 | tf_flags.DEFINE_string('logdir', 'log/test', 'Directory where to write event logs.') 9 | tf_flags.DEFINE_string('train_data', '', 'Training data.') 10 | tf_flags.DEFINE_string('test_data', '', 'Test data.') 11 | tf_flags.DEFINE_string('test_data_visual', '', 'Testing data for visualization.') 12 | tf_flags.DEFINE_integer('train_batch_size', 8, 'Batch size for the training.') 13 | tf_flags.DEFINE_integer('test_batch_size', 1, 'Batch size for the testing.') 14 | tf_flags.DEFINE_float('learning_rate', 0.1, 'Initial learning rate.') 15 | tf_flags.DEFINE_string('optimizer', 'sgd', 'Optimizer (adam/sgd).') 16 | tf_flags.DEFINE_string('decay_policy', 'step', 'Learning rate decay policy (step/poly/constant).') 17 | tf_flags.DEFINE_float('weight_decay', 0.0001, 'Weight decay.') 18 | tf_flags.DEFINE_integer('max_iter', 100000, 'Maximum training iterations.') 19 | 
tf_flags.DEFINE_integer('test_every_iter', 5000, 'Test model every n training steps.') 20 | tf_flags.DEFINE_integer('test_iter', 100, '#shapes in test data.') 21 | tf_flags.DEFINE_integer('test_iter_visual', 20, 'Test steps in testing phase for visualization.') 22 | tf_flags.DEFINE_boolean('test_visual', False, """Test with visualization.""") 23 | tf_flags.DEFINE_string('cache_folder', 'test', 'Directory where to dump intermediate data.') 24 | tf_flags.DEFINE_string('ckpt', '', 'Restore weights from checkpoint file.') 25 | tf_flags.DEFINE_string('gpu', '0', 'The gpu index.') 26 | tf_flags.DEFINE_string('phase', 'train', 'Choose from train, test or dump.') 27 | tf_flags.DEFINE_integer('n_part_1', 6, 'Number of semantic parts in level one.') 28 | tf_flags.DEFINE_integer('n_part_2', 30, 'Number of semantic parts in level two.') 29 | tf_flags.DEFINE_integer('n_part_3', 39, 'Number of semantic parts in level three.') 30 | tf_flags.DEFINE_boolean('delete_0', True, """Whether to exclude label 0 from metric computation.""") 31 | tf_flags.DEFINE_float('seg_loss_weight', 1.0, 'Weight of segmentation loss.') 32 | tf_flags.DEFINE_float('offset_weight', 1.0, 'Weight of offset loss.') 33 | tf_flags.DEFINE_float('sem_offset_weight', 1.0, 'Weight of semantic offset loss.') 34 | tf_flags.DEFINE_float('level_1_weight', 0.0, 'Weight of level 1 loss.') 35 | tf_flags.DEFINE_float('level_2_weight', 0.0, 'Weight of level 2 loss.') 36 | tf_flags.DEFINE_float('level_3_weight', 0.0, 'Weight of level 3 loss.') 37 | tf_flags.DEFINE_integer('test_shape_average_point_number', 10000, 'Mean point number of each shape in test phase. 
Must be greater than real point number.') 38 | tf_flags.DEFINE_integer('depth', 6, 'The depth of octree.') 39 | tf_flags.DEFINE_boolean('stop_gradient', True, """Stop gradient in fusion module.""") 40 | tf_flags.DEFINE_string('category', 'Chair', 'Category.') 41 | tf_flags.DEFINE_float('bandwidth', 0.1, 'Bandwidth of mean-shift.') 42 | tf_flags.DEFINE_float('semantic_center_offset', 0.05, 'semantic center offset.') 43 | 44 | 45 | FLAGS = tf_flags.FLAGS 46 | 47 | os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" 48 | os.environ['CUDA_VISIBLE_DEVICES'] = FLAGS.gpu 49 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 50 | 51 | max_iter = FLAGS.max_iter 52 | test_iter = FLAGS.test_iter 53 | test_iter_visual = FLAGS.test_iter_visual 54 | n_part_1 = FLAGS.n_part_1 55 | n_part_2 = FLAGS.n_part_2 56 | n_part_3 = FLAGS.n_part_3 57 | n_test_point = FLAGS.test_shape_average_point_number 58 | -------------------------------------------------------------------------------- /util/dataset.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import tensorflow as tf 4 | 5 | assert(os.path.isdir('ocnn/tensorflow')) 6 | sys.path.append('ocnn/tensorflow') 7 | 8 | from libs import * 9 | import numpy as np 10 | from transform import get_transform_matrix, get_inverse_transform_matrix 11 | 12 | 13 | def get_label_mapping(category, slevel=3, dlevel=2): 14 | assert(slevel > dlevel) 15 | label_mapping = { 16 | 'Bag': 17 | [ 18 | [ 19 | tf.constant([0,1,2,3], tf.int32), # 4 20 | tf.constant([0,1,2,3], tf.int32), # 4 21 | tf.constant([0,1,2,3], tf.int32), # 4 22 | ], 23 | [ 24 | tf.constant([0,1,2,3], tf.int32), # 4 25 | tf.constant([0,1,2,3], tf.int32), # 4 26 | ] 27 | ], 28 | 'Bed': 29 | [ 30 | [ 31 | tf.constant([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14], tf.int32), # 15 32 | tf.constant([0,1,2,3,3,4,4,4,5,5, 5, 6, 7, 8, 9], tf.int32), # 10 33 | tf.constant([0,1,1,2,2,2,2,2,2,2, 2, 2, 2, 3, 3], tf.int32), # 4 34 | ], 35 | [ 36 | 
tf.constant([0,1,2,3,4,5,6,7,8,9], tf.int32), # 10 37 | tf.constant([0,1,1,2,2,2,2,2,3,3], tf.int32), # 4 38 | ] 39 | ], 40 | 'Bottle': 41 | [ 42 | [ 43 | tf.constant([0,1,2,3,4,5,6,7,8], tf.int32), # 9 44 | tf.constant([0,1,0,2,3,0,0,4,5], tf.int32), # 6 45 | tf.constant([0,1,0,2,3,0,0,4,5], tf.int32), # 6 46 | ], 47 | [ 48 | tf.constant([0,1,2,3,4,5], tf.int32), # 6 49 | tf.constant([0,1,2,3,4,5], tf.int32), # 6 50 | ] 51 | ], 52 | 'Bowl': 53 | [ 54 | [ 55 | tf.constant([0,1,2,3], tf.int32), # 4 56 | tf.constant([0,1,2,3], tf.int32), # 4 57 | tf.constant([0,1,2,3], tf.int32), # 4 58 | ], 59 | [ 60 | tf.constant([0,1,2,3], tf.int32), # 4 61 | tf.constant([0,1,2,3], tf.int32), # 4 62 | ] 63 | ], 64 | 'Chair': 65 | [ 66 | [ 67 | tf.constant([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38], tf.int32), # 39 68 | tf.constant([0,1,2,3,3,3,4,5,6,6, 6, 7, 8, 9,10,11,12,13,13,13,14,15,15,16,17,18,19,20,21,22,23,24,25,25,26,26,27,28,29], tf.int32), # 30 69 | tf.constant([0,1,1,2,2,2,2,2,2,2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 0, 0, 0], tf.int32), # 6 70 | ], 71 | [ 72 | tf.constant([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29], tf.int32), # 30 73 | tf.constant([0,1,1,2,2,2,2,3,3,3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 0, 0, 0], tf.int32), # 6 74 | ] 75 | ], 76 | 'Clock': 77 | [ 78 | [ 79 | tf.constant([0,1,2,3,4,5,6,7,8,9,10], tf.int32), # 11 80 | tf.constant([0,1,1,2,2,3,3,4,5,5, 0], tf.int32), # 6 81 | tf.constant([0,1,1,2,2,3,3,4,5,5, 0], tf.int32), # 6 82 | ], 83 | [ 84 | tf.constant([0,1,2,3,4,5], tf.int32), # 6 85 | tf.constant([0,1,2,3,4,5], tf.int32), # 6 86 | ] 87 | ], 88 | 'Dishwasher': 89 | [ 90 | [ 91 | tf.constant([0,1,2,3,4,5,6], tf.int32), # 7 92 | tf.constant([0,1,2,2,3,3,4], tf.int32), # 5 93 | tf.constant([0,1,1,1,2,2,2], tf.int32), # 3 94 | ], 95 | [ 96 | tf.constant([0,1,2,3,4], tf.int32), # 5 97 | 
tf.constant([0,1,1,2,2], tf.int32), # 3 98 | ] 99 | ], 100 | 'Display': 101 | [ 102 | [ 103 | tf.constant([0,1,2,3], tf.int32), # 4 104 | tf.constant([0,1,2,2], tf.int32), # 3 105 | tf.constant([0,1,2,2], tf.int32), # 3 106 | ], 107 | [ 108 | tf.constant([0,1,2], tf.int32), # 3 109 | tf.constant([0,1,2], tf.int32), # 3 110 | ] 111 | ], 112 | 'Door': 113 | [ 114 | [ 115 | tf.constant([0,1,2,3,4], tf.int32), # 5 116 | tf.constant([0,1,2,2,3], tf.int32), # 4 117 | tf.constant([0,1,2,2,2], tf.int32), # 3 118 | ], 119 | [ 120 | tf.constant([0,1,2,3], tf.int32), # 4 121 | tf.constant([0,1,2,2], tf.int32), # 3 122 | ] 123 | ], 124 | 'Earphone': 125 | [ 126 | [ 127 | tf.constant([0,1,2,3,4,5,6,7,8,9], tf.int32), # 10 128 | tf.constant([0,1,1,1,2,3,4,4,4,5], tf.int32), # 6 129 | tf.constant([0,1,1,1,2,3,4,4,4,5], tf.int32), # 6 130 | ], 131 | [ 132 | tf.constant([0,1,2,3,4,5], tf.int32), # 6 133 | tf.constant([0,1,2,3,4,5], tf.int32), # 6 134 | ] 135 | ], 136 | 'Faucet': 137 | [ 138 | [ 139 | tf.constant([0,1,2,3,4,5,6,7,8,9,10,11], tf.int32), # 12 140 | tf.constant([0,1,2,3,4,4,4,5,6,7, 7, 7], tf.int32), # 8 141 | tf.constant([0,1,2,3,4,4,4,5,6,7, 7, 7], tf.int32), # 8 142 | ], 143 | [ 144 | tf.constant([0,1,2,3,4,5,6,7], tf.int32), # 8 145 | tf.constant([0,1,2,3,4,5,6,7], tf.int32), # 8 146 | ] 147 | ], 148 | 'Hat': 149 | [ 150 | [ 151 | tf.constant([0,1,2,3,4,5], tf.int32), # 6 152 | tf.constant([0,1,2,3,4,5], tf.int32), # 6 153 | tf.constant([0,1,2,3,4,5], tf.int32), # 6 154 | ], 155 | [ 156 | tf.constant([0,1,2,3,4,5], tf.int32), # 6 157 | tf.constant([0,1,2,3,4,5], tf.int32), # 6 158 | ] 159 | ], 160 | 'Keyboard': 161 | [ 162 | [ 163 | tf.constant([0,1,2], tf.int32), # 3 164 | tf.constant([0,1,2], tf.int32), # 3 165 | tf.constant([0,1,2], tf.int32), # 3 166 | ], 167 | [ 168 | tf.constant([0,1,2], tf.int32), # 3 169 | tf.constant([0,1,2], tf.int32), # 3 170 | ] 171 | ], 172 | 'Knife': 173 | [ 174 | [ 175 | tf.constant([0,1,2,3,4,5,6,7,8,9], tf.int32), # 10 176 | 
tf.constant([0,1,1,1,2,3,3,3,4,4], tf.int32), # 5 177 | tf.constant([0,1,1,1,2,3,3,3,4,4], tf.int32), # 5 178 | ], 179 | [ 180 | tf.constant([0,1,2,3,4], tf.int32), # 5 181 | tf.constant([0,1,2,3,4], tf.int32), # 5 182 | ] 183 | ], 184 | 'Lamp': 185 | [ 186 | [ 187 | tf.constant([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40], tf.int32), # 41 188 | tf.constant([0,1,2,2,3,4,5,6,7,8, 8, 9,10,11,12,13,14,15,16,16,17,17,17,17,18,19,20,21,21,22,23,24,25,26,26,27,27,27,27,27,27], tf.int32), # 28 189 | tf.constant([0,1,1,1,2,3,4,5,6,6, 6, 7, 8, 9, 9, 9, 9,10,10,10,10,10,10,10,11,11,12,13,13,13,14,15,16,17,17,17,17,17,17,17,17], tf.int32), # 18 190 | ], 191 | [ 192 | tf.constant([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27], tf.int32), # 28 193 | tf.constant([0,1,1,2,3,4,5,6,6,7, 8, 9, 9, 9, 9,10,10,10,11,11,12,13,13,14,15,16,17,17], tf.int32), # 18 194 | ] 195 | ], 196 | 'Laptop': 197 | [ 198 | [ 199 | tf.constant([0,1,2], tf.int32), # 3 200 | tf.constant([0,1,2], tf.int32), # 3 201 | tf.constant([0,1,2], tf.int32), # 3 202 | ], 203 | [ 204 | tf.constant([0,1,2], tf.int32), # 3 205 | tf.constant([0,1,2], tf.int32), # 3 206 | ] 207 | ], 208 | 'Microwave': 209 | [ 210 | [ 211 | tf.constant([0,1,2,3,4,5], tf.int32), # 6 212 | tf.constant([0,1,2,2,3,4], tf.int32), # 5 213 | tf.constant([0,1,1,1,1,2], tf.int32), # 3 214 | ], 215 | [ 216 | tf.constant([0,1,2,3,4], tf.int32), # 5 217 | tf.constant([0,1,1,1,2], tf.int32), # 3 218 | ] 219 | ], 220 | 'Mug': 221 | [ 222 | [ 223 | tf.constant([0,1,2,3], tf.int32), # 4 224 | tf.constant([0,1,2,3], tf.int32), # 4 225 | tf.constant([0,1,2,3], tf.int32), # 4 226 | ], 227 | [ 228 | tf.constant([0,1,2,3], tf.int32), # 4 229 | tf.constant([0,1,2,3], tf.int32), # 4 230 | ] 231 | ], 232 | 'Refrigerator': 233 | [ 234 | [ 235 | tf.constant([0,1,2,3,4,5,6], tf.int32), # 7 236 | tf.constant([0,1,2,2,3,4,5], tf.int32), # 6 237 | 
tf.constant([0,1,1,1,1,2,2], tf.int32), # 3 238 | ], 239 | [ 240 | tf.constant([0,1,2,3,4,5], tf.int32), # 6 241 | tf.constant([0,1,1,1,2,2], tf.int32), # 3 242 | ] 243 | ], 244 | 'Scissors': 245 | [ 246 | [ 247 | tf.constant([0,1,2], tf.int32), # 3 248 | tf.constant([0,1,2], tf.int32), # 3 249 | tf.constant([0,1,2], tf.int32), # 3 250 | ], 251 | [ 252 | tf.constant([0,1,2], tf.int32), # 3 253 | tf.constant([0,1,2], tf.int32), # 3 254 | ] 255 | ], 256 | 'StorageFurniture': 257 | [ 258 | [ 259 | tf.constant([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23], tf.int32), # 24 260 | tf.constant([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,16,17,17,17,18,18,18], tf.int32), # 19 261 | tf.constant([0,1,2,3,3,3,3,3,3,3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6], tf.int32), # 7 262 | ], 263 | [ 264 | tf.constant([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18], tf.int32), # 19 265 | tf.constant([0,1,2,3,3,3,3,3,3,3, 3, 4, 4, 4, 4, 4, 5, 5, 6], tf.int32), # 7 266 | ] 267 | ], 268 | 'Table': 269 | [ 270 | [ 271 | tf.constant([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50], tf.int32), # 51 272 | tf.constant([0,1,2,3,4,5,6,7,8,9,10,11,12,13,13,13,14,15,16,17,18,19,20,21,22,23,24,24,25,26,27,28,29,30,31,31,32,32,32,32,32,33,34,35,36,37,37,38,39,40,41], tf.int32), # 42 273 | tf.constant([0,1,2,3,3,0,4,4,5,6, 7, 0, 8, 9, 9, 9, 9, 9,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10], tf.int32), # 11 274 | ], 275 | [ 276 | tf.constant([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41], tf.int32), # 42 277 | tf.constant([0,1,2,3,3,0,4,4,5,6, 7, 0, 8, 9, 9, 9,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10], tf.int32), # 11 278 | ] 279 | ], 280 | 'TrashCan': 281 | [ 282 | [ 283 | tf.constant([0,1,2,3,4,5,6,7,8,9,10], tf.int32), # 11 284 
| tf.constant([0,1,1,1,2,2,2,2,3,4, 4], tf.int32), # 5 285 | tf.constant([0,1,1,1,2,2,2,2,3,4, 4], tf.int32), # 5 286 | ], 287 | [ 288 | tf.constant([0,1,2,3,4], tf.int32), # 5 289 | tf.constant([0,1,2,3,4], tf.int32), # 5 290 | ] 291 | ], 292 | 'Vase': 293 | [ 294 | [ 295 | tf.constant([0,1,2,3,4,5], tf.int32), # 6 296 | tf.constant([0,1,1,2,3,3], tf.int32), # 4 297 | tf.constant([0,1,1,2,3,3], tf.int32), # 4 298 | ], 299 | [ 300 | tf.constant([0,1,2,3], tf.int32), # 4 301 | tf.constant([0,1,2,3], tf.int32), # 4 302 | ] 303 | ], 304 | } 305 | return label_mapping[category][3-slevel][slevel-dlevel] 306 | 307 | 308 | def compute_instance_center(pts, label): 309 | label = label.astype(np.int32) # astype returns a copy; assign it so the cast takes effect 310 | pts_instance_center = np.empty_like(pts) 311 | for label_id in np.unique(label): 312 | instance_mask = label==label_id 313 | instance_pts = pts[instance_mask] 314 | instance_center = np.mean(instance_pts, axis=0) 315 | pts_instance_center[instance_mask] = instance_center 316 | return pts_instance_center 317 | 318 | def compute_semantic_center(instance_label, semantic_label, instance_center): 319 | instance_label = instance_label.astype(np.int32).flatten() 320 | semantic_label = semantic_label.astype(np.int32).flatten() 321 | pts_semantic_center = np.empty_like(instance_center) 322 | for slabel in np.unique(semantic_label): 323 | semantic_center = np.zeros([3], dtype=np.float32) 324 | instance_count = 0 325 | for ilabel in np.unique(instance_label[semantic_label==slabel]): 326 | semantic_center += instance_center[np.argwhere(instance_label==ilabel)[0]].flatten() 327 | instance_count += 1 328 | semantic_center /= instance_count 329 | pts_semantic_center[semantic_label==slabel] = semantic_center 330 | return pts_semantic_center 331 | 332 | def float_to_bytes(data): 333 | return data.tobytes() 334 | 335 | def apply_transform_to_points(transform_matrix, points): 336 | points = tf.pad(points, tf.constant([[0, 0], [0, 1]]), constant_values=1.0) # [n_point, 4] 337 | transformed_points = 
tf.matmul(transform_matrix, points, transpose_b=True) # [4, n_point] 338 | return tf.transpose(transformed_points[:3, :]) 339 | 340 | def split_semantic_instance_label(label): 341 | semantic_label = label % 100 342 | instance_label = label // 100 343 | return semantic_label.astype(np.float32), instance_label.astype(np.float32) 344 | 345 | class PointsPreprocessor: 346 | def __init__(self, depth, test=False): 347 | self._depth = depth 348 | self._test = test 349 | 350 | def __call__(self, record): 351 | raw_points_bytes, label_1, label_2, label_3, points_flag, shape_index = self.parse_example(record) 352 | radius, center = bounding_sphere(raw_points_bytes) 353 | raw_points_bytes = normalize_points(raw_points_bytes, radius, center) 354 | 355 | # get semantic and instance label 356 | semantic_points_bytes_1, instance_points_bytes_1 = self.split_label(raw_points_bytes, label_1) # [], [] 357 | semantic_points_bytes_2, instance_points_bytes_2 = self.split_label(raw_points_bytes, label_2) # [], [] 358 | semantic_points_bytes_3, instance_points_bytes_3 = self.split_label(raw_points_bytes, label_3) # [], [] 359 | 360 | # get instance and semantic center 361 | points_instance_center_1 = self.get_instance_center(instance_points_bytes_1) # [n_point, 3] 362 | points_semantic_center_1 = self.get_semantic_center(instance_points_bytes_1, semantic_points_bytes_1, points_instance_center_1) # [n_point, 3] 363 | points_instance_center_2 = self.get_instance_center(instance_points_bytes_2) # [n_point, 3] 364 | points_semantic_center_2 = self.get_semantic_center(instance_points_bytes_2, semantic_points_bytes_2, points_instance_center_2) # [n_point, 3] 365 | points_instance_center_3 = self.get_instance_center(instance_points_bytes_3) # [n_point, 3] 366 | points_semantic_center_3 = self.get_semantic_center(instance_points_bytes_3, semantic_points_bytes_3, points_instance_center_3) # [n_point, 3] 367 | 368 | if self._test: 369 | octree = points2octree(semantic_points_bytes_1, 
depth=self._depth, full_depth=2, node_dis=True) 370 | instance_center_bytes_1 = tf.py_func(float_to_bytes, [points_instance_center_1], Tout=tf.string) 371 | semantic_center_bytes_1 = tf.py_func(float_to_bytes, [points_semantic_center_1], Tout=tf.string) 372 | instance_center_bytes_2 = tf.py_func(float_to_bytes, [points_instance_center_2], Tout=tf.string) 373 | semantic_center_bytes_2 = tf.py_func(float_to_bytes, [points_semantic_center_2], Tout=tf.string) 374 | instance_center_bytes_3 = tf.py_func(float_to_bytes, [points_instance_center_3], Tout=tf.string) 375 | semantic_center_bytes_3 = tf.py_func(float_to_bytes, [points_semantic_center_3], Tout=tf.string) 376 | return [octree, semantic_points_bytes_1, instance_center_bytes_1, instance_points_bytes_1, semantic_center_bytes_1, 377 | semantic_points_bytes_2, instance_center_bytes_2, instance_points_bytes_2, semantic_center_bytes_2, 378 | semantic_points_bytes_3, instance_center_bytes_3, instance_points_bytes_3, semantic_center_bytes_3] 379 | else: 380 | # augment points 381 | transform_matrix, rotation_matrix = self.get_augment_matrix() # [4, 4], [4, 4] 382 | points_bytes_1 = self.augment_points(semantic_points_bytes_1, transform_matrix, rotation_matrix) # [] 383 | points_bytes_2 = self.augment_points(semantic_points_bytes_2, transform_matrix, rotation_matrix) # [] 384 | points_bytes_3 = self.augment_points(semantic_points_bytes_3, transform_matrix, rotation_matrix) # [] 385 | # clip points 386 | inbox_points_bytes_1, _, inbox_instance_points_bytes_1, instance_center_bytes_1, semantic_center_bytes_1 = self.get_clip_pts( 387 | points_bytes_1, instance_points_bytes_1, points_instance_center_1, points_semantic_center_1, transform_matrix) # [], _, [], [] 388 | inbox_points_bytes_2, _, inbox_instance_points_bytes_2, instance_center_bytes_2, semantic_center_bytes_2 = self.get_clip_pts( 389 | points_bytes_2, instance_points_bytes_2, points_instance_center_2, points_semantic_center_2, transform_matrix) # [], _, [], [] 390 | 
inbox_points_bytes_3, _, inbox_instance_points_bytes_3, instance_center_bytes_3, semantic_center_bytes_3 = self.get_clip_pts( 391 | points_bytes_3, instance_points_bytes_3, points_instance_center_3, points_semantic_center_3, transform_matrix) # [], _, [], [] 392 | # get octree 393 | octree = points2octree(inbox_points_bytes_1, depth=self._depth, full_depth=2, node_dis=True) 394 | return [octree, inbox_points_bytes_1, inbox_points_bytes_2, inbox_points_bytes_3, points_flag, 395 | inbox_instance_points_bytes_1, inbox_instance_points_bytes_2, inbox_instance_points_bytes_3, 396 | instance_center_bytes_1, semantic_center_bytes_1, instance_center_bytes_2, semantic_center_bytes_2, 397 | instance_center_bytes_3, semantic_center_bytes_3] 398 | 399 | 400 | def split_label(self, points_bytes, label): 401 | points_pts = points_property(points_bytes, property_name='xyz', channel=3) # [n_point, 3] 402 | points_normal = points_property(points_bytes, property_name='normal', channel=3) # [n_point, 3] 403 | points_semantic_label, points_instance_label = tf.py_func(split_semantic_instance_label, [label], Tout=[tf.float32, tf.float32]) # [10000], [10000] 404 | semantic_points_bytes = points_new(points_pts, points_normal, tf.zeros([0]), points_semantic_label) 405 | instance_points_bytes = points_new(points_pts, points_normal, tf.zeros([0]), points_instance_label) 406 | return semantic_points_bytes, instance_points_bytes 407 | 408 | def get_instance_center(self, points_bytes): 409 | points_pts = points_property(points_bytes, property_name='xyz', channel=3) # [n_point, 3] 410 | points_label = points_property(points_bytes, property_name='label', channel=1) # [n_point, 1] 411 | points_label = tf.reshape(points_label, [-1]) # [n_point] 412 | points_instance_center = tf.py_func(compute_instance_center, [points_pts, points_label], Tout=tf.float32) # [n_point, 3] 413 | return points_instance_center 414 | 415 | def get_semantic_center(self, instance_points_bytes, semantic_points_bytes, 
points_instance_center): 416 | points_pts = points_property(instance_points_bytes, property_name='xyz', channel=3) # [n_point, 3] 417 | points_instance_label = points_property(instance_points_bytes, property_name='label', channel=1) # [n_point, 1] 418 | points_semantic_label = points_property(semantic_points_bytes, property_name='label', channel=1) # [n_point, 1] 419 | points_semantic_center = tf.py_func(compute_semantic_center, [points_instance_label, points_semantic_label, points_instance_center], Tout=tf.float32) # [n_point, 3] 420 | return points_semantic_center 421 | 422 | def get_augment_matrix(self): 423 | rotation_angle = 10 424 | rnd = tf.random.uniform(shape=[3], minval=-rotation_angle, maxval=rotation_angle, dtype=tf.int32) 425 | angle = tf.cast(rnd, dtype=tf.float32) * 3.14159265 / 180.0 426 | scale = tf.random.uniform(shape=[3], minval=0.75, maxval=1.25, dtype=tf.float32) 427 | scale = tf.stack([scale[0]]*3) 428 | jitter = tf.random.uniform(shape=[3], minval=-0.125, maxval=0.125, dtype=tf.float32) 429 | transform_matrix, rotation_matrix = tf.py_func(get_transform_matrix, [angle, jitter, scale, True], Tout=[tf.float32, tf.float32]) # [4, 4], [4, 4] 430 | return transform_matrix, rotation_matrix 431 | 432 | def augment_points(self, points_bytes, transform_matrix, rotation_matrix): 433 | points_pts = points_property(points_bytes, property_name='xyz', channel=3) # [n_point, 3] 434 | points_normal = points_property(points_bytes, property_name='normal', channel=3) # [n_point, 3] 435 | points_label = points_property(points_bytes, property_name='label', channel=1) # [n_point, 1] 436 | transformed_points_pts = apply_transform_to_points(transform_matrix, points_pts) # [n_point, 3] 437 | transformed_points_normal = apply_transform_to_points(rotation_matrix, points_normal) # [n_point, 3] 438 | points_bytes = points_new(transformed_points_pts, transformed_points_normal, tf.zeros([0]), points_label) 439 | return points_bytes 440 | 441 | def get_clip_pts(self, 
points_bytes, instance_points_bytes, points_instance_center, points_semantic_center, transform_matrix): 442 | points_pts = points_property(points_bytes, property_name='xyz', channel=3) # [n_point, 3] 443 | points_normal = points_property(points_bytes, property_name='normal', channel=3) # [n_point, 3] 444 | points_label = points_property(points_bytes, property_name='label', channel=1) # [n_point, 1] 445 | points_instance_label = points_property(instance_points_bytes, property_name='label', channel=1) # [n_point, 1] 446 | inbox_mask = self.clip_pts(points_pts) # [n_point] 447 | points_pts = tf.boolean_mask(points_pts, inbox_mask) # [n_inbox_point, 3] 448 | points_normal = tf.boolean_mask(points_normal, inbox_mask) # [n_inbox_point, 3] 449 | points_label = tf.boolean_mask(points_label, inbox_mask) # [n_inbox_point, 1] 450 | points_instance_label = tf.boolean_mask(points_instance_label, inbox_mask) # [n_inbox_point, 1] 451 | points_bytes = points_new(points_pts, points_normal, tf.zeros([0]), points_label) 452 | instance_points_bytes = points_new(points_pts, points_normal, tf.zeros([0]), points_instance_label) 453 | points_instance_center = tf.boolean_mask(points_instance_center, inbox_mask) # [n_inbox_point, 3] 454 | points_instance_center = apply_transform_to_points(transform_matrix, points_instance_center) # [n_inbox_point, 3] 455 | points_instance_center_bytes = tf.py_func(float_to_bytes, [points_instance_center], Tout=tf.string) 456 | points_semantic_center = tf.boolean_mask(points_semantic_center, inbox_mask) # [n_inbox_point, 3] 457 | points_semantic_center = apply_transform_to_points(transform_matrix, points_semantic_center) # [n_inbox_point, 3] 458 | points_semantic_center_bytes = tf.py_func(float_to_bytes, [points_semantic_center], Tout=tf.string) 459 | return points_bytes, inbox_mask, instance_points_bytes, points_instance_center_bytes, points_semantic_center_bytes 460 | 461 | def clip_pts(self, pts): 462 | abs_pts = tf.abs(pts) # [n_point, 3] 463 | max_value 
= tf.math.reduce_max(abs_pts, axis=1) # [n_point] 464 | inbox_mask = tf.cast(max_value <= 1.0, dtype=tf.bool) # [n_point] 465 | return inbox_mask 466 | 467 | def parse_example(self, record): 468 | features = {'points_bytes': tf.io.FixedLenFeature([], tf.string), 469 | 'label_1': tf.io.FixedLenFeature([10000], tf.int64), 470 | 'label_2': tf.io.FixedLenFeature([10000], tf.int64), 471 | 'label_3': tf.io.FixedLenFeature([10000], tf.int64), 472 | 'points_flag': tf.io.FixedLenFeature([1], tf.int64), 473 | 'shape_index': tf.io.FixedLenFeature([1], tf.int64) 474 | } 475 | parsed = tf.io.parse_single_example(record, features) 476 | return [parsed['points_bytes'], parsed['label_1'], parsed['label_2'], parsed['label_3'], 477 | parsed['points_flag'], parsed['shape_index']] 478 | 479 | 480 | def points_dataset(record_name, batch_size, depth=6, test=False): 481 | def merge_octrees_training(octrees, inbox_points_bytes_1, inbox_points_bytes_2, inbox_points_bytes_3, points_flag, 482 | inbox_instance_points_bytes_1, inbox_instance_points_bytes_2, inbox_instance_points_bytes_3, 483 | instance_center_bytes_1, semantic_center_bytes_1, instance_center_bytes_2, 484 | semantic_center_bytes_2, instance_center_bytes_3, semantic_center_bytes_3): 485 | octree = octree_batch(octrees) 486 | return [octree, inbox_points_bytes_1, inbox_points_bytes_2, inbox_points_bytes_3, points_flag, 487 | inbox_instance_points_bytes_1, inbox_instance_points_bytes_2, inbox_instance_points_bytes_3, 488 | instance_center_bytes_1, semantic_center_bytes_1, instance_center_bytes_2, 489 | semantic_center_bytes_2, instance_center_bytes_3, semantic_center_bytes_3] 490 | def merge_octrees_test(octrees, points_bytes_1, instance_center_bytes_1, instance_points_bytes_1, semantic_center_bytes_1, 491 | points_bytes_2, instance_center_bytes_2, instance_points_bytes_2, semantic_center_bytes_2, 492 | points_bytes_3, instance_center_bytes_3, instance_points_bytes_3, semantic_center_bytes_3): 493 | octree = octree_batch(octrees) 
494 | return [octree, points_bytes_1, instance_center_bytes_1, instance_points_bytes_1, semantic_center_bytes_1, 495 | points_bytes_2, instance_center_bytes_2, instance_points_bytes_2, semantic_center_bytes_2, 496 | points_bytes_3, instance_center_bytes_3, instance_points_bytes_3, semantic_center_bytes_3] 497 | with tf.name_scope('points_dataset'): 498 | dataset = tf.data.TFRecordDataset([record_name]).repeat() 499 | if test is False: 500 | dataset = dataset.shuffle(100) 501 | return dataset.map(PointsPreprocessor(depth, test=test), num_parallel_calls=8).batch(batch_size) \ 502 | .map(merge_octrees_test if test else merge_octrees_training, num_parallel_calls=8) \ 503 | .prefetch(8).make_one_shot_iterator().get_next() 504 | -------------------------------------------------------------------------------- /util/instance_metric.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | from scipy import stats 4 | 5 | 6 | def compute_instance_score(point_predict_instance_label, point_predict_prob_entropy, min_points_per_cluster=10): 7 | n_instance = np.max(point_predict_instance_label)+1 8 | instance_score = np.zeros(n_instance, dtype=np.float32) 9 | instance_valid = np.zeros(n_instance, dtype=bool) # np.bool alias was removed in NumPy 1.24 10 | for i in np.unique(point_predict_instance_label): 11 | instance_score[i] = 1-np.mean(point_predict_prob_entropy[point_predict_instance_label==i]) 12 | instance_valid[i] = np.sum(point_predict_instance_label==i) > min_points_per_cluster 13 | return instance_score, instance_valid 14 | 15 | 16 | def compute_ap(tp, fp, gt_npart, n_bins=100, plot_fn=None): 17 | assert len(tp) == len(fp) 18 | tp = np.cumsum(tp) 19 | fp = np.cumsum(fp) 20 | rec = tp/gt_npart 21 | prec = tp/(tp+fp) 22 | rec = np.insert(rec, 0, 0.0) 23 | prec = np.insert(prec, 0, 1.0) 24 | ap = 0.0 25 | delta = 1.0/n_bins 26 | out_rec = np.arange(0, 1+delta, delta) 27 | out_prec = np.zeros(n_bins+1, dtype=np.float32) 28 | for idx, t in 
enumerate(out_rec): 29 | prec1 = prec[rec>=t] 30 | if prec1.size == 0: 31 | p = 0.0 32 | else: 33 | p = np.max(prec1) 34 | out_prec[idx] = p 35 | ap = ap + p/(n_bins+1) 36 | if plot_fn is not None: 37 | import matplotlib.pyplot as plt 38 | base_folder = os.path.split(plot_fn)[0] 39 | if not os.path.isdir(base_folder): os.mkdir(base_folder) 40 | fig = plt.figure() 41 | plt.plot(out_rec, out_prec, 'b-') 42 | plt.title('PR-Curve (AP: {:4.2f}%)'.format(ap*100)) 43 | plt.xlabel('Recall') 44 | plt.ylabel('Precision') 45 | plt.xlim([0, 1]) 46 | plt.ylim([0, 1.05]) 47 | fig.savefig(plot_fn) 48 | plt.close(fig) 49 | return ap 50 | 51 | 52 | def per_shape_mean_ap(point_predict_instance_label, point_gt_instance_label, point_predict_semantic_label, point_gt_semantic_label, point_shape_index, point_predict_prob_entropy, n_part, iou_threshold=0.5, folder=None, delete_0=True, non_instance_semantic_label=0, min_points_per_cluster=10): 53 | n_shape = np.max(point_shape_index) + 1 54 | mean_aps = [] 55 | shape_valids = [] 56 | for i in range(n_shape): 57 | shape_point_index = point_shape_index == i 58 | shape_point_predict_instance_label = point_predict_instance_label[shape_point_index] 59 | shape_point_gt_instance_label = point_gt_instance_label[shape_point_index] 60 | shape_point_predict_semantic_label = point_predict_semantic_label[shape_point_index] 61 | shape_point_gt_semantic_label = point_gt_semantic_label[shape_point_index] 62 | shape_point_predict_prob_entropy = point_predict_prob_entropy[shape_point_index] 63 | if delete_0: 64 | non_neg_mask = shape_point_gt_semantic_label > non_instance_semantic_label 65 | if np.sum(non_neg_mask) == 0: 66 | mean_aps.append(0.0) 67 | shape_valids.append(False) 68 | continue 69 | shape_point_predict_instance_label = shape_point_predict_instance_label[non_neg_mask] 70 | shape_point_gt_instance_label = shape_point_gt_instance_label[non_neg_mask] 71 | shape_point_predict_semantic_label = shape_point_predict_semantic_label[non_neg_mask] 72 | 
shape_point_gt_semantic_label = shape_point_gt_semantic_label[non_neg_mask] 73 | shape_point_predict_prob_entropy = shape_point_predict_prob_entropy[non_neg_mask] 74 | 75 | gt_ins_label = np.unique(shape_point_gt_instance_label) 76 | pred_n_ins = np.max(shape_point_predict_instance_label) + 1 77 | shape_instance_score, shape_instance_valid = compute_instance_score(shape_point_predict_instance_label, shape_point_predict_prob_entropy, min_points_per_cluster=min_points_per_cluster) 78 | 79 | true_pos_list = [[] for i in range(n_part)] 80 | false_pos_list = [[] for i in range(n_part)] 81 | gt_npart = np.zeros(n_part, dtype=np.int32) 82 | 83 | gt_mask_per_cat = [[] for i in range(n_part)] 84 | for j in gt_ins_label: 85 | sem_id = stats.mode(shape_point_gt_semantic_label[shape_point_gt_instance_label==j])[0][0] 86 | gt_mask_per_cat[sem_id].append(j) 87 | gt_npart[sem_id] += 1 88 | 89 | order = np.argsort(-shape_instance_score) 90 | gt_used = set() 91 | for j in range(pred_n_ins): 92 | idx = order[j] 93 | if shape_instance_valid[idx]: 94 | sem_id = stats.mode(shape_point_predict_semantic_label[shape_point_predict_instance_label==idx])[0][0] 95 | iou_max = 0.0; match_gt_id = -1 96 | for k in gt_mask_per_cat[sem_id]: 97 | if not(k in gt_used): 98 | predict_instance_index = shape_point_predict_instance_label==idx 99 | gt_instance_index = shape_point_gt_instance_label==k 100 | intersect = np.sum(predict_instance_index & gt_instance_index) 101 | union = np.sum(predict_instance_index | gt_instance_index) 102 | iou = intersect*1.0 / union 103 | if iou > iou_max: 104 | iou_max = iou 105 | match_gt_id = k 106 | if iou_max > iou_threshold: 107 | gt_used.add(match_gt_id) 108 | true_pos_list[sem_id].append(True) 109 | false_pos_list[sem_id].append(False) 110 | else: 111 | true_pos_list[sem_id].append(False) 112 | false_pos_list[sem_id].append(True) 113 | 114 | aps = np.zeros(n_part, dtype=np.float32) 115 | ap_valids = np.zeros(n_part, dtype=np.bool) 116 | start_j = 1 if delete_0 else 
0 117 | for j in range(start_j, n_part): 118 | has_pred = len(true_pos_list[j]) > 0 119 | has_gt = gt_npart[j] > 0 120 | 121 | if has_pred and has_gt: 122 | cur_true_pos = np.array(true_pos_list[j], dtype=np.float32) 123 | cur_false_pos = np.array(false_pos_list[j], dtype=np.float32) 124 | aps[j] = compute_ap(cur_true_pos, cur_false_pos, gt_npart[j]) 125 | ap_valids[j] = True 126 | elif has_pred and not has_gt: 127 | aps[j] = 0.0 128 | ap_valids[j] = True 129 | elif not has_pred and has_gt: 130 | aps[j] = 0.0 131 | ap_valids[j] = True 132 | if np.sum(ap_valids) > 0: 133 | mean_aps.append(np.sum(aps*ap_valids)/np.sum(ap_valids)) 134 | shape_valids.append(True) 135 | else: 136 | mean_aps.append(0.0) 137 | shape_valids.append(False) 138 | 139 | mean_aps = np.array(mean_aps, dtype=np.float32) 140 | shape_valids = np.array(shape_valids, dtype=np.bool) 141 | mean_mean_ap = np.sum(mean_aps*shape_valids)/np.sum(shape_valids) 142 | return mean_aps, shape_valids, mean_mean_ap 143 | 144 | 145 | def per_part_mean_ap(point_predict_instance_label, point_gt_instance_label, point_predict_semantic_label, point_gt_semantic_label, point_shape_index, point_predict_prob_entropy, n_part, iou_threshold=0.5, folder=None, delete_0=True, non_instance_semantic_label=0, min_points_per_cluster=10): 146 | n_shape = np.max(point_shape_index) + 1 147 | 148 | true_pos_list = [[] for i in range(n_part)] 149 | false_pos_list = [[] for i in range(n_part)] 150 | conf_score_list = [[] for i in range(n_part)] 151 | gt_npart = np.zeros(n_part, dtype=np.int32) 152 | 153 | for i in range(n_shape): 154 | shape_point_index = point_shape_index == i 155 | shape_point_predict_instance_label = point_predict_instance_label[shape_point_index] 156 | shape_point_gt_instance_label = point_gt_instance_label[shape_point_index] 157 | shape_point_predict_semantic_label = point_predict_semantic_label[shape_point_index] 158 | shape_point_gt_semantic_label = point_gt_semantic_label[shape_point_index] 159 | 
shape_point_predict_prob_entropy = point_predict_prob_entropy[shape_point_index] 160 | if delete_0: 161 | non_neg_mask = shape_point_gt_semantic_label > non_instance_semantic_label 162 | if np.sum(non_neg_mask) == 0: continue 163 | shape_point_predict_instance_label = shape_point_predict_instance_label[non_neg_mask] 164 | shape_point_gt_instance_label = shape_point_gt_instance_label[non_neg_mask] 165 | shape_point_predict_semantic_label = shape_point_predict_semantic_label[non_neg_mask] 166 | shape_point_gt_semantic_label = shape_point_gt_semantic_label[non_neg_mask] 167 | shape_point_predict_prob_entropy = shape_point_predict_prob_entropy[non_neg_mask] 168 | 169 | gt_ins_label = np.unique(shape_point_gt_instance_label) 170 | pred_n_ins = np.max(shape_point_predict_instance_label) + 1 171 | shape_instance_score, shape_instance_valid = compute_instance_score(shape_point_predict_instance_label, shape_point_predict_prob_entropy, min_points_per_cluster=min_points_per_cluster) 172 | 173 | gt_mask_per_cat = [[] for i in range(n_part)] 174 | for j in gt_ins_label: 175 | if j == -1: print('detect -1 instance label') 176 | sem_id = stats.mode(shape_point_gt_semantic_label[shape_point_gt_instance_label==j])[0][0] 177 | gt_mask_per_cat[sem_id].append(j) 178 | gt_npart[sem_id] += 1 179 | 180 | order = np.argsort(-shape_instance_score) 181 | gt_used = set() 182 | for j in range(pred_n_ins): 183 | idx = order[j] 184 | if shape_instance_valid[idx]: 185 | sem_id = stats.mode(shape_point_predict_semantic_label[shape_point_predict_instance_label==idx])[0][0] 186 | iou_max = 0.0; match_gt_id = -1 187 | for k in gt_mask_per_cat[sem_id]: 188 | if not(k in gt_used): 189 | predict_instance_index = shape_point_predict_instance_label==idx 190 | gt_instance_index = shape_point_gt_instance_label==k 191 | intersect = np.sum(predict_instance_index & gt_instance_index) 192 | union = np.sum(predict_instance_index | gt_instance_index) 193 | iou = intersect*1.0 / union 194 | if iou > iou_max: 195 
| iou_max = iou 196 | match_gt_id = k 197 | if iou_max > iou_threshold: 198 | gt_used.add(match_gt_id) 199 | true_pos_list[sem_id].append(True) 200 | false_pos_list[sem_id].append(False) 201 | conf_score_list[sem_id].append(shape_instance_score[idx]) 202 | else: 203 | true_pos_list[sem_id].append(False) 204 | false_pos_list[sem_id].append(True) 205 | conf_score_list[sem_id].append(shape_instance_score[idx]) 206 | 207 | aps = np.zeros(n_part, dtype=np.float32) 208 | ap_valids = np.ones(n_part, dtype=np.bool) 209 | for i in range(n_part): 210 | if delete_0 and (i == 0): 211 | ap_valids[i] = False 212 | continue 213 | 214 | has_pred = len(true_pos_list[i]) > 0 215 | has_gt = gt_npart[i] > 0 216 | 217 | if not has_gt: 218 | ap_valids[i] = False 219 | continue 220 | if has_gt and not has_pred: 221 | continue 222 | 223 | cur_true_pos = np.array(true_pos_list[i], dtype=np.float32) 224 | cur_false_pos = np.array(false_pos_list[i], dtype=np.float32) 225 | cur_conf_score = np.array(conf_score_list[i], dtype=np.float32) 226 | 227 | order = np.argsort(-cur_conf_score) 228 | sorted_true_pos = cur_true_pos[order] 229 | sorted_false_pos = cur_false_pos[order] 230 | 231 | if folder is not None: 232 | filename = os.path.join(folder, 'img', 'part_{:04d}.png'.format(i)) 233 | aps[i] = compute_ap(sorted_true_pos, sorted_false_pos, gt_npart[i], plot_fn=filename) 234 | else: 235 | aps[i] = compute_ap(sorted_true_pos, sorted_false_pos, gt_npart[i], plot_fn=None) 236 | 237 | mean_ap = np.sum(aps*ap_valids)/np.sum(ap_valids) 238 | return aps, ap_valids, mean_ap, gt_npart 239 | -------------------------------------------------------------------------------- /util/network.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import tensorflow as tf 4 | 5 | assert(os.path.isdir('ocnn/tensorflow')) 6 | sys.path.append('ocnn/tensorflow') 7 | sys.path.append('ocnn/tensorflow/script') 8 | 9 | from libs import * 10 | from ocnn import 
* 11 | 12 | 13 | def predict_module(data, num_output, num_hidden, n_layer, training=True, reuse=False): 14 | with tf.variable_scope('predict_%d' % num_output, reuse=reuse): 15 | for i in range(n_layer): 16 | with tf.variable_scope('conv{}'.format(i)): 17 | data = octree_conv1x1_bn_lrelu(data, num_hidden, training) 18 | with tf.variable_scope('conv{}'.format(n_layer)): 19 | logit = octree_conv1x1(data, num_output, use_bias=True) 20 | logit = tf.transpose(tf.squeeze(logit, [0, 3])) # (1, C, H, 1) -> (H, C) 21 | data = tf.transpose(tf.squeeze(data, [0, 3])) # (1, C, H, 1) -> (H, C) 22 | return logit, data 23 | 24 | 25 | def feature_aggregation(point_feature, point_predict_prob, point_batch_index, batch_size): 26 | point_aggregation_feature_list = [] 27 | for i in range(batch_size): 28 | shape_point_index = tf.math.equal(point_batch_index, tf.constant(i, dtype=tf.int32)) 29 | shape_point_predict_prob = tf.boolean_mask(point_predict_prob, shape_point_index) # [Ni, n_part] 30 | shape_point_feature = tf.boolean_mask(point_feature, shape_point_index) # [Ni, n_feat] 31 | shape_part_feature = tf.matmul(tf.transpose(shape_point_predict_prob), shape_point_feature) # [n_part, n_feat] 32 | part_point_prob_sum = tf.reduce_sum(shape_point_predict_prob, axis=0) # [n_part] 33 | shape_part_feature = tf.math.divide_no_nan(shape_part_feature, tf.reshape(part_point_prob_sum, [-1, 1])) # [n_part, n_feat] 34 | shape_point_aggregation_feature = tf.matmul(shape_point_predict_prob, shape_part_feature) # [Ni, n_feat] 35 | point_aggregation_feature_list.append(shape_point_aggregation_feature) 36 | point_aggregation_feature = tf.concat(point_aggregation_feature_list, axis=0) # [N, n_feat] 37 | return point_aggregation_feature 38 | 39 | 40 | def predict_module_offset(data, point_predict_prob, point_predict_prob_1, point_predict_prob_2, point_batch_index, node_position, batch_size, num_hidden, n_layer, training=True, reuse=False): 41 | with tf.variable_scope('predict_offset', reuse=reuse): 42 | 
for i in range(n_layer): 43 | with tf.variable_scope('conv{}'.format(i)): 44 | data = octree_conv1x1_bn_lrelu(data, num_hidden, training) 45 | point_feature = tf.transpose(tf.squeeze(data, [0, 3])) # (1, C, H, 1) -> (H, C) 46 | 47 | point_aggregation_feature = feature_aggregation(point_feature, point_predict_prob, point_batch_index, batch_size) # [H, C] 48 | point_aggregation_feature = tf.expand_dims(tf.expand_dims(tf.transpose(point_aggregation_feature), axis=0), axis=-1) # [1, C, H, 1] 49 | with tf.variable_scope('convtransfer'): 50 | point_aggregation_feature = octree_conv1x1_bn_lrelu(point_aggregation_feature, num_hidden, training) 51 | point_aggregation_feature = tf.transpose(tf.squeeze(point_aggregation_feature, [0, 3])) # (1, C, H, 1) -> (H, C) 52 | 53 | point_aggregation_feature_1 = feature_aggregation(point_feature, point_predict_prob_1, point_batch_index, batch_size) # [H, C] 54 | point_aggregation_feature_1 = tf.expand_dims(tf.expand_dims(tf.transpose(point_aggregation_feature_1), axis=0), axis=-1) # [1, C, H, 1] 55 | with tf.variable_scope('convtransfer_1'): 56 | point_aggregation_feature_1 = octree_conv1x1_bn_lrelu(point_aggregation_feature_1, num_hidden, training) 57 | point_aggregation_feature_1 = tf.transpose(tf.squeeze(point_aggregation_feature_1, [0, 3])) # (1, C, H, 1) -> (H, C) 58 | 59 | point_aggregation_feature_2 = feature_aggregation(point_feature, point_predict_prob_2, point_batch_index, batch_size) # [H, C] 60 | point_aggregation_feature_2 = tf.expand_dims(tf.expand_dims(tf.transpose(point_aggregation_feature_2), axis=0), axis=-1) # [1, C, H, 1] 61 | with tf.variable_scope('convtransfer_2'): 62 | point_aggregation_feature_2 = octree_conv1x1_bn_lrelu(point_aggregation_feature_2, num_hidden, training) 63 | point_aggregation_feature_2 = tf.transpose(tf.squeeze(point_aggregation_feature_2, [0, 3])) # (1, C, H, 1) -> (H, C) 64 | 65 | point_feature = tf.concat([point_feature, point_aggregation_feature, point_aggregation_feature_1, 66 | 
point_aggregation_feature_2, node_position[:, :3]], axis=1) # [H, C*4+3] 67 | point_feature = tf.expand_dims(tf.expand_dims(tf.transpose(point_feature), axis=0), axis=-1) # [1, C*4+3, H, 1] 68 | with tf.variable_scope('convfusion'): 69 | point_feature = octree_conv1x1_bn_lrelu(point_feature, num_hidden, training) 70 | with tf.variable_scope('conv{}'.format(n_layer)): 71 | offset = octree_conv1x1(point_feature, 6, use_bias=True) 72 | offset = tf.transpose(tf.squeeze(offset, [0, 3])) # (1, C, H, 1) -> (H, C) 73 | return offset 74 | 75 | 76 | def extract_pts_feature_from_octree_node(inputs, octree, pts, depth): 77 | # pts shape: [n_pts, 4] 78 | xyz, ids = tf.split(pts, [3, 1], axis=1) 79 | xyz = xyz + 1.0 # [0, 2] 80 | pts_input = tf.concat([xyz * (2**(depth-1)), ids], axis=1) 81 | feature = octree_bilinear_v3(pts_input, inputs, octree, depth=depth) 82 | return feature 83 | 84 | 85 | def network_unet_two_decoder(octree, depth, channel=4, training=True, reuse=False): 86 | nout = [512, 256, 256, 256, 256, 128, 64, 32, 16, 16, 16] 87 | with tf.variable_scope('ocnn_unet', reuse=reuse): 88 | with tf.variable_scope('signal'): 89 | data = octree_property(octree, property_name='feature', dtype=tf.float32, 90 | depth=depth, channel=channel) 91 | data = tf.abs(data) 92 | data = tf.reshape(data, [1, channel, -1, 1]) 93 | 94 | ## encoder 95 | convd = [None]*11 96 | convd[depth+1] = data 97 | for d in range(depth, 1, -1): 98 | with tf.variable_scope('encoder_d%d' % d): 99 | # downsampling 100 | dd = d if d == depth else d + 1 101 | stride = 1 if d == depth else 2 102 | kernel_size = [3] if d == depth else [2] 103 | convd[d] = octree_conv_bn_relu(convd[d+1], octree, dd, nout[d], training, 104 | stride=stride, kernel_size=kernel_size) 105 | # resblock 106 | for n in range(0, 3): 107 | with tf.variable_scope('resblock_%d' % n): 108 | convd[d] = octree_resblock(convd[d], octree, d, nout[d], 1, training) 109 | 110 | ## decoder 111 | deconv_seg = convd[2] 112 | for d in range(3, depth + 
1): 113 | with tf.variable_scope('decoder_seg_d%d' % d): 114 | # upsampling 115 | deconv_seg = octree_deconv_bn_relu(deconv_seg, octree, d-1, nout[d], training, 116 | kernel_size=[2], stride=2, fast_mode=False) 117 | deconv_seg = convd[d] + deconv_seg # skip connections 118 | 119 | # resblock 120 | for n in range(0, 3): 121 | with tf.variable_scope('resblock_%d' % n): 122 | deconv_seg = octree_resblock(deconv_seg, octree, d, nout[d], 1, training) 123 | 124 | ## decoder 125 | deconv_offset = convd[2] 126 | for d in range(3, depth + 1): 127 | with tf.variable_scope('decoder_offset_d%d' % d): 128 | # upsampling 129 | deconv_offset = octree_deconv_bn_relu(deconv_offset, octree, d-1, nout[d], training, 130 | kernel_size=[2], stride=2, fast_mode=False) 131 | deconv_offset = convd[d] + deconv_offset # skip connections 132 | 133 | # resblock 134 | for n in range(0, 3): 135 | with tf.variable_scope('resblock_%d' % n): 136 | deconv_offset = octree_resblock(deconv_offset, octree, d, nout[d], 1, training) 137 | 138 | return deconv_seg, deconv_offset 139 | -------------------------------------------------------------------------------- /util/numeric_function.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | def compute_iou_v1(point_cube_index, point_part_index, point_shape_index, n_cube, delete_0=False): 4 | assert(point_cube_index.size == point_part_index.size == point_shape_index.size) 5 | n_shape = np.max(point_shape_index) + 1 6 | shape_iou = np.zeros([n_shape]) 7 | valid_shape_num = 0 8 | for i in range(n_shape): 9 | shape_point_index = point_shape_index == i 10 | shape_cube_index = point_cube_index[shape_point_index] 11 | shape_part_index = point_part_index[shape_point_index] 12 | if delete_0: 13 | non_neg_mask = shape_part_index > 0 14 | shape_cube_index = shape_cube_index[non_neg_mask] 15 | shape_part_index = shape_part_index[non_neg_mask] 16 | shape_part_count = 0 17 | for part_id in 
np.unique(shape_part_index): 18 | part_id_point_index_of_part = shape_part_index == part_id 19 | part_id_point_index_of_cube = shape_cube_index == part_id 20 | intersection = np.sum(np.logical_and(part_id_point_index_of_part, part_id_point_index_of_cube)) 21 | union = np.sum(np.logical_or(part_id_point_index_of_part, part_id_point_index_of_cube)) 22 | iou = intersection/union 23 | shape_iou[i] += iou 24 | shape_part_count += 1 25 | valid_shape_num += (1 if shape_part_count>0 else 0) 26 | shape_iou[i] /= (shape_part_count if shape_part_count>0 else 1) 27 | return np.sum(shape_iou)/valid_shape_num, shape_iou 28 | 29 | 30 | def compute_iou_v2(point_cube_index, point_part_index, n_cube, delete_0=False): 31 | assert(point_cube_index.size == point_part_index.size) 32 | part_intersection = np.zeros([n_cube]) 33 | part_union = np.zeros([n_cube]) 34 | part_flag = np.zeros([n_cube], dtype=int) 35 | if delete_0: 36 | non_neg_mask = point_part_index > 0 37 | point_cube_index = point_cube_index[non_neg_mask] 38 | point_part_index = point_part_index[non_neg_mask] 39 | for part_id in range(n_cube): 40 | part_id_point_index_of_cube = point_cube_index == part_id 41 | part_id_point_index_of_part = point_part_index == part_id 42 | intersection = np.sum(np.logical_and(part_id_point_index_of_part, part_id_point_index_of_cube)) 43 | union = np.sum(np.logical_or(part_id_point_index_of_part, part_id_point_index_of_cube)) 44 | if np.sum(part_id_point_index_of_part) > 0: part_flag[part_id] = 1 45 | part_intersection[part_id] = intersection 46 | part_union[part_id] = union 47 | part_iou = part_intersection/(part_union+1e-5) 48 | if delete_0: 49 | part_iou = part_iou[1:] 50 | part_flag = part_flag[1:] 51 | mean_part_iou = np.sum(part_iou)/np.sum(part_flag) 52 | return mean_part_iou, part_iou 53 | 54 | 55 | def compute_iou_v3(point_cube_index, point_part_index, point_shape_index, n_cube, delete_0=False): 56 | assert(point_cube_index.size == point_part_index.size == point_shape_index.size) 57 | 
n_shape = np.max(point_shape_index) + 1 58 | shape_iou = np.zeros([n_shape]) 59 | valid_shape_num = 0 60 | for i in range(n_shape): 61 | shape_point_index = point_shape_index == i 62 | shape_cube_index = point_cube_index[shape_point_index] 63 | shape_part_index = point_part_index[shape_point_index] 64 | if delete_0: 65 | non_neg_mask = shape_part_index > 0 66 | shape_cube_index = shape_cube_index[non_neg_mask] 67 | shape_part_index = shape_part_index[non_neg_mask] 68 | shape_part_count = 0 69 | for part_id in np.unique(np.concatenate((shape_part_index, shape_cube_index))): 70 | part_id_point_index_of_part = shape_part_index == part_id 71 | part_id_point_index_of_cube = shape_cube_index == part_id 72 | intersection = np.sum(np.logical_and(part_id_point_index_of_part, part_id_point_index_of_cube)) 73 | union = np.sum(np.logical_or(part_id_point_index_of_part, part_id_point_index_of_cube)) 74 | iou = intersection/union 75 | shape_iou[i] += iou 76 | shape_part_count += 1 77 | valid_shape_num += (1 if shape_part_count>0 else 0) 78 | shape_iou[i] /= (shape_part_count if shape_part_count>0 else 1) 79 | return np.sum(shape_iou)/valid_shape_num, shape_iou 80 | -------------------------------------------------------------------------------- /util/transform.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import numpy as np 4 | 5 | 6 | def fill_rotation_matrix(angle): 7 | cosx = np.cos(angle[0]); sinx = np.sin(angle[0]) 8 | cosy = np.cos(angle[1]); siny = np.sin(angle[1]) 9 | cosz = np.cos(angle[2]); sinz = np.sin(angle[2]) 10 | rotx = np.array([[1.0, 0.0, 0.0, 0.0], [0.0, cosx, sinx, 0.0], [0.0, -sinx, cosx, 0.0], [0.0, 0.0, 0.0, 1.0]]) 11 | roty = np.array([[cosy, 0.0, -siny, 0.0], [0.0, 1.0, 0.0, 0.0], [siny, 0.0, cosy, 0.0], [0.0, 0.0, 0.0, 1.0]]) 12 | rotz = np.array([[cosz, sinz, 0.0, 0.0], [-sinz, cosz, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]) 13 | return np.matmul(rotz, np.matmul(rotx, 
roty)).astype(np.float32) 14 | 15 | def fill_translation_matrix(jitter): 16 | translation_matrix = np.array([[1.0, 0.0, 0.0, jitter[0]], [0.0, 1.0, 0.0, jitter[1]], [0.0, 0.0, 1.0, jitter[2]], [0.0, 0.0, 0.0, 1.0]]) 17 | return translation_matrix.astype(np.float32) 18 | 19 | def fill_scale_matrix(scale): 20 | scale_matrix = np.array([[scale[0], 0.0, 0.0, 0.0], [0.0, scale[1], 0.0, 0.0], [0.0, 0.0, scale[2], 0.0], [0.0, 0.0, 0.0, 1.0]]) 21 | return scale_matrix.astype(np.float32) 22 | 23 | def get_transform_matrix(angle, jitter, scale, return_rotation=False): 24 | rotation_matrix = fill_rotation_matrix(angle) 25 | translation_matrix = fill_translation_matrix(jitter) 26 | scale_matrix = fill_scale_matrix(scale) 27 | transform_matrix = np.matmul(scale_matrix, np.matmul(translation_matrix, rotation_matrix)) 28 | if return_rotation is False: 29 | return transform_matrix 30 | else: 31 | return transform_matrix, rotation_matrix 32 | 33 | def get_inverse_transform_matrix(angle, jitter, scale): 34 | rotation_matrix = fill_rotation_matrix(angle).T # rotations are orthogonal, so the transpose is the exact inverse (negating the angles is only approximate) 35 | translation_matrix = fill_translation_matrix(-jitter) 36 | scale_matrix = fill_scale_matrix(1.0/scale) 37 | inverse_transform_matrix = np.matmul(rotation_matrix, np.matmul(translation_matrix, scale_matrix)) 38 | return inverse_transform_matrix 39 | 40 | if __name__ == '__main__': 41 | for i in range(5): 42 | angle = np.random.uniform(-5, 5, 3) 43 | angle = angle * np.pi / 180.0 44 | jitter = np.random.uniform(-0.125, 0.125, 3) 45 | scale = np.random.uniform(0.75, 1.25, 1) 46 | scale = np.array([scale[0], scale[0], scale[0]]) 47 | # for item in [angle, jitter, scale]: 48 | # print(item, type(item)) 49 | m = get_transform_matrix(angle, jitter, scale) 50 | # print(m) 51 | im = get_inverse_transform_matrix(angle, jitter, scale) 52 | # print(im) 53 | imm = np.matmul(im, m) 54 | print(imm) 55 | np.testing.assert_allclose(imm, np.eye(4), atol=1e-2) --------------------------------------------------------------------------------
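A note on inverting the rotation in util/transform.py: the rotation is composed as rotz · rotx · roty, so negating the three angles flips each factor but not the order in which they are multiplied. The result is close to the inverse only for small angles, while the transpose is exact for any angle. The check below is a minimal sketch of that point, not part of the repository; `rot_zxy` and `inverse_error` are hypothetical helper names, and the 3x3 matrices assume the same sign convention as `fill_rotation_matrix`.

```python
import numpy as np


def rot_zxy(angle):
    """3x3 rotation composed as rotz @ rotx @ roty (hypothetical helper)."""
    cx, sx = np.cos(angle[0]), np.sin(angle[0])
    cy, sy = np.cos(angle[1]), np.sin(angle[1])
    cz, sz = np.cos(angle[2]), np.sin(angle[2])
    rotx = np.array([[1.0, 0.0, 0.0], [0.0, cx, sx], [0.0, -sx, cx]])
    roty = np.array([[cy, 0.0, -sy], [0.0, 1.0, 0.0], [sy, 0.0, cy]])
    rotz = np.array([[cz, sz, 0.0], [-sz, cz, 0.0], [0.0, 0.0, 1.0]])
    return rotz @ rotx @ roty


def inverse_error(candidate, rotation):
    """Largest absolute deviation of candidate @ rotation from the identity."""
    return np.abs(candidate @ rotation - np.eye(3)).max()


# Deliberately large angles, so the difference is easy to see.
angle = np.deg2rad([40.0, 25.0, 60.0])
rotation = rot_zxy(angle)

err_negated = inverse_error(rot_zxy(-angle), rotation)   # approximate inverse
err_transpose = inverse_error(rotation.T, rotation)      # exact inverse
print('negated angles:', err_negated)
print('transpose     :', err_transpose)
```

For augmentation ranges of a few degrees the two candidates differ only slightly, which is consistent with the loose `atol=1e-2` used in the self-test of that file.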
/util/vis_pointcloud.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import os 3 | import shutil 4 | 5 | cube_vert_raw = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 1, 1], [1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float) 6 | cube_face = np.array([[1, 3, 7, 5], [1, 2, 4, 3], [3, 4, 8, 7], [5, 7, 8, 6], [1, 5, 6, 2], [2, 6, 8, 4]]) 7 | good_color = np.array([ 8 | [246, 83, 20], 9 | [124, 187, 0], 10 | [ 0, 161, 241], 11 | [255, 187, 0], 12 | [ 11, 239, 239], 13 | [247, 230, 49], 14 | [255, 96, 165], 15 | [178, 96, 255], 16 | [242, 198, 4], 17 | [252, 218, 123], 18 | [ 77, 146, 33], 19 | [161, 206, 107], 20 | [ 41, 125, 198], 21 | [126, 193, 221], 22 | [198, 31, 40], 23 | [252, 136, 123], 24 | [ 5, 112, 103], 25 | [ 87, 193, 177], 26 | [107, 53, 168], 27 | [139, 117, 198], 28 | [206, 37, 135], 29 | [247, 155, 222], 30 | [196, 98, 13], 31 | [253, 184, 99], 32 | [158, 1, 66], 33 | [233, 93, 71], 34 | [253, 190, 111], 35 | [230, 245, 152], 36 | [125, 203, 164], 37 | [ 64, 117, 180], 38 | [163, 79, 132]], np.float32)/256. 
39 | 40 | 41 | def generate_random_color_palette(n_color): 42 | np.random.seed(0) 43 | color = np.random.rand(n_color, 3) 44 | if n_color >=1: 45 | color[0] = np.array([1., 1., 1.]) 46 | n_copy = min(31, n_color-1) 47 | color[1:1+n_copy] = good_color[:n_copy] 48 | return color 49 | 50 | 51 | def generate_squential_color_palette(): 52 | palette = np.array([ 53 | [255,255,255], 54 | [253,212,158], 55 | [252,141,89], 56 | [215,48,31], 57 | [127,0,0]], dtype=float) / 255 58 | return palette 59 | 60 | 61 | def save_material(palette, output_file): 62 | with open(output_file, 'w') as f: 63 | n_color = np.shape(palette)[0] 64 | for i in range(n_color): 65 | part_color = palette[i] 66 | f.write('newmtl m{}\nKd {} {} {}\nKa 0 0 0\n'.format(i, 67 | float(part_color[0]), float(part_color[1]), float(part_color[2]))) 68 | 69 | 70 | def copy_material(src_mtl_filename, des_mtl_filename): 71 | shutil.copyfile(src_mtl_filename, des_mtl_filename) 72 | 73 | 74 | def save_points(position, part_ids, save_file, depth=5, refmtl_filename=None, 75 | squantial_color=False): 76 | n_color = np.max(part_ids) + 1 77 | mtl_filename = save_file.replace('.obj', '.mtl') 78 | if refmtl_filename is None: 79 | if squantial_color: 80 | color_palette = generate_squential_color_palette() 81 | else: 82 | color_palette = generate_random_color_palette(n_color) 83 | save_material(color_palette, mtl_filename) 84 | else: 85 | copy_material(refmtl_filename, mtl_filename) 86 | 87 | with open(save_file, 'w') as f: 88 | n_vert = np.shape(position)[0] 89 | assert(n_vert == np.shape(part_ids)[0]) 90 | f.write('mtllib {}\n'.format(mtl_filename.split('/')[-1])) 91 | vert_offset = 0 92 | cube_vert = (cube_vert_raw-0.5) / (2**depth * 2) 93 | for i in range(n_vert): 94 | part_id = part_ids[i] 95 | if squantial_color: 96 | if part_id > 4: part_id = 4 97 | f.write('usemtl m{}\n'.format(part_id)) 98 | for j in range(8): 99 | x = position[i][0] + cube_vert[j][0] 100 | y = position[i][1] + cube_vert[j][1] 101 | z = 
position[i][2] + cube_vert[j][2] 102 | f.write("v {:6.4f} {:6.4f} {:6.4f}\n".format(x, y, z)) 103 | faces = cube_face + vert_offset 104 | for j in range(6): 105 | f.write("f {} {} {} {}\n".format(faces[j][0], faces[j][1], faces[j][2], faces[j][3])) 106 | vert_offset += 8 107 | --------------------------------------------------------------------------------
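The per-point loop in `save_points` writes eight vertices and six quad faces for every point. The same geometry can be built in one step with NumPy broadcasting, which may be convenient when the cubes are needed in memory rather than in an OBJ file. The sketch below assumes the same vertex and face tables as util/vis_pointcloud.py and the same 1-based face indexing that OBJ uses; `cubes_for_points` is a hypothetical name and not part of the repository.

```python
import numpy as np

# Same tables as in util/vis_pointcloud.py (face indices are 1-based, as in OBJ).
CUBE_VERT = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 1, 1],
                      [1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
CUBE_FACE = np.array([[1, 3, 7, 5], [1, 2, 4, 3], [3, 4, 8, 7],
                      [5, 7, 8, 6], [1, 5, 6, 2], [2, 6, 8, 4]])


def cubes_for_points(position, depth=5):
    """Return (vertices [N*8, 3], faces [N*6, 4]) for one small cube per point."""
    position = np.asarray(position, dtype=float)            # [N, 3]
    # Centered unit cube scaled to side length 1 / (2**depth * 2).
    corner = (CUBE_VERT - 0.5) / (2 ** depth * 2)            # [8, 3]
    verts = position[:, None, :] + corner[None, :, :]        # [N, 8, 3]
    # Each cube's faces point at its own block of 8 vertices.
    offset = 8 * np.arange(position.shape[0])                # [N]
    faces = CUBE_FACE[None, :, :] + offset[:, None, None]    # [N, 6, 4]
    return verts.reshape(-1, 3), faces.reshape(-1, 4)


verts, faces = cubes_for_points([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]], depth=5)
print(verts.shape, faces.shape)
```

The vertex and face arrays are ordered cube by cube, so writing them out row by row would give the same vertex numbering as the loop, apart from the per-point `usemtl` lines.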