├── .DS_Store
├── .gitattributes
├── LICENSE
├── README.md
├── doc
│   └── teaser.png
├── evaluate.py
├── models
│   ├── pointnet_cls.py
│   ├── pointnet_cls_basic.py
│   ├── pointnet_seg.py
│   └── transform_nets.py
├── part_seg
│   ├── download_data.sh
│   ├── pointnet_part_seg.py
│   ├── test.py
│   ├── testing_ply_file_list.txt
│   └── train.py
├── provider.py
├── sem_seg
│   ├── README.md
│   ├── batch_inference.py
│   ├── collect_indoor3d_data.py
│   ├── download_data.sh
│   ├── eval_iou_accuracy.py
│   ├── gen_indoor3d_h5.py
│   ├── indoor3d_util.py
│   ├── meta
│   │   ├── all_data_label.txt
│   │   ├── anno_paths.txt
│   │   ├── area6_data_label.txt
│   │   └── class_names.txt
│   ├── model.py
│   └── train.py
├── train.py
└── utils
    ├── data_prep_util.py
    ├── eulerangles.py
    ├── pc_util.py
    ├── plyfile.py
    └── tf_util.py

--------------------------------------------------------------------------------
/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ZhihaoZhu/PointNet-Implementation-Tensorflow/14709d960aaf47a642bc6e45c928bfd02ed47cfd/.DS_Store
--------------------------------------------------------------------------------
/.gitattributes:
--------------------------------------------------------------------------------
1 | # Auto detect text files and perform LF normalization
2 | * text=auto
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation.
2 | 
3 | Copyright (c) 2017, Geometric Computation Group of Stanford University
4 | 
5 | The MIT License (MIT)
6 | 
7 | Copyright (c) 2017 Charles R. Qi
8 | 
9 | Permission is hereby granted, free of charge, to any person obtaining a copy
10 | of this software and associated documentation files (the "Software"), to deal
11 | in the Software without restriction, including without limitation the rights
12 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
13 | copies of the Software, and to permit persons to whom the Software is
14 | furnished to do so, subject to the following conditions:
15 | 
16 | The above copyright notice and this permission notice shall be included in all
17 | copies or substantial portions of the Software.
18 | 
19 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
20 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
21 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
22 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
23 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
24 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
25 | SOFTWARE.
26 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | 
2 | 
3 | ### Introduction
4 | This work is based on our [arXiv tech report](https://arxiv.org/abs/1612.00593), which is going to appear in CVPR 2017. We proposed a novel deep net architecture for point clouds (as unordered point sets). You can also check our [project webpage](http://stanford.edu/~rqi/pointnet) for a deeper introduction.
5 | 
6 | Point cloud is an important type of geometric data structure. Due to its irregular format, most researchers transform such data to regular 3D voxel grids or collections of images. This, however, renders data unnecessarily voluminous and causes issues. In this paper, we design a novel type of neural network that directly consumes point clouds, which well respects the permutation invariance of points in the input. Our network, named PointNet, provides a unified architecture for applications ranging from object classification, part segmentation, to scene semantic parsing. Though simple, PointNet is highly efficient and effective.
7 | 
8 | In this repository, we release code and data for training a PointNet classification network on point clouds sampled from 3D shapes, as well as for training a part segmentation network on the ShapeNet Part dataset.
9 | 
10 | ### Citation
11 | If you find our work useful in your research, please consider citing:
12 | 
13 |     @article{qi2016pointnet,
14 |       title={PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation},
15 |       author={Qi, Charles R and Su, Hao and Mo, Kaichun and Guibas, Leonidas J},
16 |       journal={arXiv preprint arXiv:1612.00593},
17 |       year={2016}
18 |     }
19 | 
20 | ### Installation
21 | 
22 | Install TensorFlow. You may also need to install h5py. The code has been tested with Python 2.7, TensorFlow 1.0.1, CUDA 8.0 and cuDNN 5.1 on Ubuntu 14.04.
23 | 
24 | If you are using PyTorch, a third-party PyTorch implementation is also available.
25 | 
26 | To install h5py for Python:
27 | ```bash
28 | sudo apt-get install libhdf5-dev
29 | sudo pip install h5py
30 | ```
31 | 
32 | ### Usage
33 | To train a model to classify point clouds sampled from 3D shapes:
34 | 
35 |     python train.py
36 | 
37 | Log files and network parameters will be saved to the `log` folder by default. Point clouds of ModelNet40 models in HDF5 files will be automatically downloaded (416MB) to the `data` folder. Each point cloud contains 2048 points uniformly sampled from a shape surface. Each point cloud is zero-mean and normalized into a unit sphere. There are also text files in `data/modelnet40_ply_hdf5_2048` specifying the IDs of shapes in the h5 files.
38 | 
39 | To see the help message for the training script:
40 | 
41 |     python train.py -h
42 | 
43 | We can use TensorBoard to view the network architecture and monitor the training progress.
44 | 
45 |     tensorboard --logdir log
46 | 
47 | After the above training, we can evaluate the model and output some visualizations of the error cases.
48 | 
49 |     python evaluate.py --visu
50 | 
51 | Point clouds that are wrongly classified will be saved to the `dump` folder by default. We visualize each point cloud by rendering it into three-view images.
52 | 
53 | If you'd like to prepare your own data, you can refer to some helper functions in `utils/data_prep_util.py` for saving and loading HDF5 files.
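For reference, here is a minimal sketch (editor's addition) of writing and reading such an HDF5 file with h5py; the file name and array shapes below are illustrative, and `utils/data_prep_util.py` contains the repo's actual helpers:

```python
import h5py
import numpy as np

# Save: an (N, 2048, 3) array of point clouds plus one integer label per cloud.
data = np.random.rand(10, 2048, 3).astype(np.float32)
label = np.zeros((10,), dtype=np.uint8)
with h5py.File('my_data.h5', 'w') as f:
    f.create_dataset('data', data=data, compression='gzip', dtype='float32')
    f.create_dataset('label', data=label, compression='gzip', dtype='uint8')

# Load: mirrors what provider.loadDataFile returns (data, label).
with h5py.File('my_data.h5', 'r') as f:
    data, label = f['data'][:], f['label'][:]
```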
54 | 
55 | ### Part Segmentation
56 | To train a model for object part segmentation, first download the data:
57 | 
58 |     cd part_seg
59 |     sh download_data.sh
60 | 
61 | The download script will fetch the ShapeNetPart dataset (around 1.08GB) and our prepared HDF5 files (around 346MB).
62 | 
63 | Then you can run `train.py` and `test.py` in the `part_seg` folder for training and testing (computing mIoU for evaluation).
64 | 
65 | ### License
66 | Our code is released under the MIT License (see the LICENSE file for details).
67 | 
68 | ### Selected Projects that Use PointNet
69 | 
70 | * PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space by Qi et al. (NIPS 2017). A hierarchical feature learning framework on point clouds. The PointNet++ architecture applies PointNet recursively on a nested partitioning of the input point set. It also proposes novel layers for point clouds with non-uniform densities.
71 | * Exploring Spatial Context for 3D Semantic Segmentation of Point Clouds by Engelmann et al. (ICCV 2017 workshop). This work extends PointNet for large-scale scene segmentation.
72 | * PCPNET: Learning Local Shape Properties from Raw Point Clouds by Guerrero et al. (arXiv). This work adapts PointNet to estimate local geometric properties (e.g., normals and curvature) in noisy point clouds.
73 | * VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection by Zhou et al. from Apple (arXiv). This work studies 3D object detection using LiDAR point clouds. It splits space into voxels, uses PointNet to learn local voxel features, and then uses a 3D CNN for region proposal, object classification and 3D bounding box estimation.
74 | * Frustum PointNets for 3D Object Detection from RGB-D Data by Qi et al. (arXiv). A novel framework for 3D object detection with RGB-D data. The proposed method achieved first place on the KITTI 3D object detection benchmark across all categories (last checked on 11/30/2017).
75 | 
--------------------------------------------------------------------------------
/doc/teaser.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ZhihaoZhu/PointNet-Implementation-Tensorflow/14709d960aaf47a642bc6e45c928bfd02ed47cfd/doc/teaser.png
--------------------------------------------------------------------------------
/evaluate.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | import argparse
4 | import socket
5 | import importlib
6 | import time
7 | import os
8 | import scipy.misc
9 | import sys
10 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
11 | sys.path.append(BASE_DIR)
12 | sys.path.append(os.path.join(BASE_DIR, 'models'))
13 | sys.path.append(os.path.join(BASE_DIR, 'utils'))
14 | import provider
15 | import pc_util
16 | 
17 | 
18 | parser = argparse.ArgumentParser()
19 | parser.add_argument('--gpu', type=int, default=0, help='GPU to use [default: GPU 0]')
20 | parser.add_argument('--model', default='pointnet_cls', help='Model name: pointnet_cls or pointnet_cls_basic [default: pointnet_cls]')
21 | parser.add_argument('--batch_size', type=int, default=4, help='Batch Size during evaluation [default: 4]')
22 | parser.add_argument('--num_point', type=int, default=1024, help='Point Number [256/512/1024/2048] [default: 1024]')
23 | parser.add_argument('--model_path', default='log/model.ckpt', help='model checkpoint file path [default: log/model.ckpt]')
24 | parser.add_argument('--dump_dir', default='dump', help='dump folder path [default: dump]')
25 | parser.add_argument('--visu', action='store_true', help='Whether to dump images for error cases [default: False]')
26 | FLAGS = parser.parse_args()
27 | 
28 | 
29 | BATCH_SIZE = FLAGS.batch_size
30 | NUM_POINT = FLAGS.num_point
31 | MODEL_PATH = FLAGS.model_path
32 | GPU_INDEX = FLAGS.gpu
33 | MODEL = importlib.import_module(FLAGS.model) # import network module
34 | DUMP_DIR = FLAGS.dump_dir
35 | if not os.path.exists(DUMP_DIR): os.mkdir(DUMP_DIR)
36 | LOG_FOUT = open(os.path.join(DUMP_DIR, 'log_evaluate.txt'), 'w')
37 | LOG_FOUT.write(str(FLAGS)+'\n')
38 | 
39 | NUM_CLASSES = 40
40 | SHAPE_NAMES = [line.rstrip() for line in \
41 |     open(os.path.join(BASE_DIR, 'data/modelnet40_ply_hdf5_2048/shape_names.txt'))]
42 | 
43 | HOSTNAME = socket.gethostname()
44 | 
45 | # ModelNet40 official train/test split
46 | TRAIN_FILES = provider.getDataFiles( \
47 |     os.path.join(BASE_DIR, 'data/modelnet40_ply_hdf5_2048/train_files.txt'))
48 | TEST_FILES = provider.getDataFiles(\
49 |     os.path.join(BASE_DIR, 'data/modelnet40_ply_hdf5_2048/test_files.txt'))
50 | 
51 | def log_string(out_str):
52 |     LOG_FOUT.write(out_str+'\n')
53 |     LOG_FOUT.flush()
54 |     print(out_str)
55 | 
56 | def evaluate(num_votes):
57 |     is_training = False
58 | 
59 |     with tf.device('/gpu:'+str(GPU_INDEX)):
60 |         pointclouds_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE, NUM_POINT)
61 |         is_training_pl = tf.placeholder(tf.bool, shape=())
62 | 
63 |         # simple model
64 |         pred, end_points = MODEL.get_model(pointclouds_pl, is_training_pl)
65 |         loss = MODEL.get_loss(pred, labels_pl, end_points)
66 | 
67 |         # Add ops to save and restore all the variables.
68 |         saver = tf.train.Saver()
69 | 
70 |     # Create a session
71 |     config = tf.ConfigProto()
72 |     config.gpu_options.allow_growth = True
73 |     config.allow_soft_placement = True
74 |     config.log_device_placement = True
75 |     sess = tf.Session(config=config)
76 | 
77 |     # Restore variables from disk.
78 |     saver.restore(sess, MODEL_PATH)
79 |     log_string("Model restored.")
80 | 
81 |     ops = {'pointclouds_pl': pointclouds_pl,
82 |            'labels_pl': labels_pl,
83 |            'is_training_pl': is_training_pl,
84 |            'pred': pred,
85 |            'loss': loss}
86 | 
87 |     eval_one_epoch(sess, ops, num_votes)
88 | 
89 | 
90 | def eval_one_epoch(sess, ops, num_votes=1, topk=1):
91 |     error_cnt = 0
92 |     is_training = False
93 |     total_correct = 0
94 |     total_seen = 0
95 |     loss_sum = 0
96 |     total_seen_class = [0 for _ in range(NUM_CLASSES)]
97 |     total_correct_class = [0 for _ in range(NUM_CLASSES)]
98 |     fout = open(os.path.join(DUMP_DIR, 'pred_label.txt'), 'w')
99 |     for fn in range(len(TEST_FILES)):
100 |         log_string('----'+str(fn)+'----')
101 |         current_data, current_label = provider.loadDataFile(TEST_FILES[fn])
102 |         current_data = current_data[:,0:NUM_POINT,:]
103 |         current_label = np.squeeze(current_label)
104 |         print(current_data.shape)
105 | 
106 |         file_size = current_data.shape[0]
107 |         num_batches = file_size // BATCH_SIZE
108 |         print(file_size)
109 | 
110 |         for batch_idx in range(num_batches):
111 |             start_idx = batch_idx * BATCH_SIZE
112 |             end_idx = (batch_idx+1) * BATCH_SIZE
113 |             cur_batch_size = end_idx - start_idx
114 | 
115 |             # Aggregating BEG
116 |             batch_loss_sum = 0 # sum of losses for the batch
117 |             batch_pred_sum = np.zeros((cur_batch_size, NUM_CLASSES)) # score for classes
118 |             batch_pred_classes = np.zeros((cur_batch_size, NUM_CLASSES)) # 0/1 for classes
119 |             for vote_idx in range(num_votes):
120 |                 rotated_data = provider.rotate_point_cloud_by_angle(current_data[start_idx:end_idx, :, :],
121 |                                                                     vote_idx/float(num_votes) * np.pi * 2)
122 |                 feed_dict = {ops['pointclouds_pl']: rotated_data,
123 |                              ops['labels_pl']: current_label[start_idx:end_idx],
124 |                              ops['is_training_pl']: is_training}
125 |                 loss_val, pred_val = sess.run([ops['loss'], ops['pred']],
126 |                                               feed_dict=feed_dict)
127 |                 batch_pred_sum += pred_val
128 |                 batch_pred_val = np.argmax(pred_val, 1)
129 |                 for el_idx in range(cur_batch_size):
130 |                     batch_pred_classes[el_idx, batch_pred_val[el_idx]] += 1
131 |                 batch_loss_sum += (loss_val * cur_batch_size / float(num_votes))
132 |             # pred_val_topk = np.argsort(batch_pred_sum, axis=-1)[:,-1*np.array(range(topk))-1]
133 |             # pred_val = np.argmax(batch_pred_classes, 1)
134 |             pred_val = np.argmax(batch_pred_sum, 1)
135 |             # Aggregating END
136 | 
137 |             correct = np.sum(pred_val == current_label[start_idx:end_idx])
138 |             # correct = np.sum(pred_val_topk[:,0:topk] == label_val)
139 |             total_correct += correct
140 |             total_seen += cur_batch_size
141 |             loss_sum += batch_loss_sum
142 | 
143 |             for i in range(start_idx, end_idx):
144 |                 l = current_label[i]
145 |                 total_seen_class[l] += 1
146 |                 total_correct_class[l] += (pred_val[i-start_idx] == l)
147 |                 fout.write('%d, %d\n' % (pred_val[i-start_idx], l))
148 | 
149 |                 if pred_val[i-start_idx] != l and FLAGS.visu: # ERROR CASE, DUMP!
150 |                     img_filename = '%d_label_%s_pred_%s.jpg' % (error_cnt, SHAPE_NAMES[l],
151 |                                                                 SHAPE_NAMES[pred_val[i-start_idx]])
152 |                     img_filename = os.path.join(DUMP_DIR, img_filename)
153 |                     output_img = pc_util.point_cloud_three_views(np.squeeze(current_data[i, :, :]))
154 |                     scipy.misc.imsave(img_filename, output_img)
155 |                     error_cnt += 1
156 | 
157 |     log_string('eval mean loss: %f' % (loss_sum / float(total_seen)))
158 |     log_string('eval accuracy: %f' % (total_correct / float(total_seen)))
159 |     log_string('eval avg class acc: %f' % (np.mean(np.array(total_correct_class)/np.array(total_seen_class,dtype=np.float))))
160 | 
161 |     class_accuracies = np.array(total_correct_class)/np.array(total_seen_class,dtype=np.float)
162 |     for i, name in enumerate(SHAPE_NAMES):
163 |         log_string('%10s:\t%0.3f' % (name, class_accuracies[i]))
164 | 
165 | 
166 | 
167 | if __name__=='__main__':
168 |     with tf.Graph().as_default():
169 |         evaluate(num_votes=1)
170 |     LOG_FOUT.close()
171 | 
--------------------------------------------------------------------------------
/models/pointnet_cls.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | import math
4 | import sys
5 | import os
6 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
7 | sys.path.append(BASE_DIR)
8 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
9 | import tf_util
10 | from transform_nets import input_transform_net, feature_transform_net
11 | 
12 | def placeholder_inputs(batch_size, num_point):
13 |     pointclouds_pl = tf.placeholder(tf.float32, shape=(batch_size, num_point, 3))
14 |     labels_pl = tf.placeholder(tf.int32, shape=(batch_size))
15 |     return pointclouds_pl, labels_pl
16 | 
17 | 
18 | def get_model(point_cloud, is_training, bn_decay=None):
19 |     """ Classification PointNet, input is BxNx3, output Bx40 """
20 |     batch_size = point_cloud.get_shape()[0].value
21 |     num_point = point_cloud.get_shape()[1].value
22 |     end_points = {}
23 | 
24 |     with tf.variable_scope('transform_net1') as sc:
25 |         transform = input_transform_net(point_cloud, is_training, bn_decay, K=3)
26 |     point_cloud_transformed = tf.matmul(point_cloud, transform)
27 |     input_image = tf.expand_dims(point_cloud_transformed, -1)
28 | 
29 |     net = tf_util.conv2d(input_image, 64, [1,3],
30 |                          padding='VALID', stride=[1,1],
31 |                          bn=True, is_training=is_training,
32 |                          scope='conv1', bn_decay=bn_decay)
33 |     net = tf_util.conv2d(net, 64, [1,1],
34 |                          padding='VALID', stride=[1,1],
35 |                          bn=True, is_training=is_training,
36 |                          scope='conv2', bn_decay=bn_decay)
37 | 
38 |     with tf.variable_scope('transform_net2') as sc:
39 |         transform = feature_transform_net(net, is_training, bn_decay, K=64)
40 |     end_points['transform'] = transform
41 |     net_transformed = tf.matmul(tf.squeeze(net, axis=[2]), transform)
42 |     net_transformed = tf.expand_dims(net_transformed, [2])
43 | 
44 |     net = tf_util.conv2d(net_transformed, 64, [1,1],
45 |                          padding='VALID', stride=[1,1],
46 |                          bn=True, is_training=is_training,
47 |                          scope='conv3', bn_decay=bn_decay)
48 |     net = tf_util.conv2d(net, 128, [1,1],
49 |                          padding='VALID', stride=[1,1],
50 |                          bn=True, is_training=is_training,
51 |                          scope='conv4', bn_decay=bn_decay)
52 |     net = tf_util.conv2d(net, 1024, [1,1],
53 |                          padding='VALID', stride=[1,1],
54 |                          bn=True, is_training=is_training,
55 |                          scope='conv5', bn_decay=bn_decay)
56 | 
57 |     # Symmetric function: max pooling
58 |     net = tf_util.max_pool2d(net, [num_point,1],
59 |                              padding='VALID', scope='maxpool')
60 | 
61 |     net = tf.reshape(net, [batch_size, -1])
62 |     net = tf_util.fully_connected(net, 512, bn=True, is_training=is_training,
63 |                                   scope='fc1', bn_decay=bn_decay)
64 |     net = tf_util.dropout(net, keep_prob=0.7, is_training=is_training,
65 |                           scope='dp1')
66 |     net = tf_util.fully_connected(net, 256, bn=True, is_training=is_training,
67 |                                   scope='fc2', bn_decay=bn_decay)
68 |     net = tf_util.dropout(net, keep_prob=0.7, is_training=is_training,
69 |                           scope='dp2')
70 |     net = tf_util.fully_connected(net, 40, activation_fn=None, scope='fc3')
71 | 
72 |     return net, end_points
73 | 
74 | 
75 | def get_loss(pred, label, end_points, reg_weight=0.001):
76 |     """ pred: B*NUM_CLASSES,
77 |         label: B, """
78 |     loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=pred, labels=label)
79 |     classify_loss = tf.reduce_mean(loss)
80 |     tf.summary.scalar('classify loss', classify_loss)
81 | 
82 |     # Enforce the transformation as orthogonal matrix
83 |     transform = end_points['transform'] # BxKxK
84 |     K = transform.get_shape()[1].value
85 |     mat_diff = tf.matmul(transform, tf.transpose(transform, perm=[0,2,1]))
86 |     mat_diff -= tf.constant(np.eye(K), dtype=tf.float32)
87 |     mat_diff_loss = tf.nn.l2_loss(mat_diff)
88 |     tf.summary.scalar('mat loss', mat_diff_loss)
89 | 
90 |     return classify_loss + mat_diff_loss * reg_weight
91 | 
92 | 
93 | if __name__=='__main__':
94 |     with tf.Graph().as_default():
95 |         inputs = tf.zeros((32,1024,3))
96 |         outputs = get_model(inputs, tf.constant(True))
97 |         print(outputs)
98 | 
--------------------------------------------------------------------------------
/models/pointnet_cls_basic.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | import math
4 | import sys
5 | import os
6 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
7 | sys.path.append(BASE_DIR)
8 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
9 | import tf_util
10 | 
11 | def placeholder_inputs(batch_size, num_point):
12 |     pointclouds_pl = tf.placeholder(tf.float32, shape=(batch_size, num_point, 3))
13 |     labels_pl = tf.placeholder(tf.int32, shape=(batch_size))
14 |     return pointclouds_pl, labels_pl
15 | 
16 | 
17 | def get_model(point_cloud, is_training, bn_decay=None):
18 |     """ Classification PointNet, input is BxNx3, output Bx40 """
19 |     batch_size = point_cloud.get_shape()[0].value
20 |     num_point = point_cloud.get_shape()[1].value
21 |     end_points = {}
22 |     input_image = tf.expand_dims(point_cloud, -1)
23 | 
24 |     # Point functions (MLP implemented as conv2d)
25 |     net = tf_util.conv2d(input_image, 64, [1,3],
26 |                          padding='VALID', stride=[1,1],
27 |                          bn=True, is_training=is_training,
28 |                          scope='conv1', bn_decay=bn_decay)
29 |     net = tf_util.conv2d(net, 64, [1,1],
30 |                          padding='VALID', stride=[1,1],
31 |                          bn=True, is_training=is_training,
32 |                          scope='conv2', bn_decay=bn_decay)
33 |     net = tf_util.conv2d(net, 64, [1,1],
34 |                          padding='VALID', stride=[1,1],
35 |                          bn=True, is_training=is_training,
36 |                          scope='conv3', bn_decay=bn_decay)
37 |     net = tf_util.conv2d(net, 128, [1,1],
38 |                          padding='VALID', stride=[1,1],
39 |                          bn=True, is_training=is_training,
40 |                          scope='conv4', bn_decay=bn_decay)
41 |     net = tf_util.conv2d(net, 1024, [1,1],
42 |                          padding='VALID', stride=[1,1],
43 |                          bn=True, is_training=is_training,
44 |                          scope='conv5', bn_decay=bn_decay)
45 | 
46 |     # Symmetric function: max pooling
47 |     net = tf_util.max_pool2d(net, [num_point,1],
48 |                              padding='VALID', scope='maxpool')
49 | 
50 |     # MLP on global point cloud vector
51 |     net = tf.reshape(net, [batch_size, -1])
52 |     net = tf_util.fully_connected(net, 512, bn=True, is_training=is_training,
53 |                                   scope='fc1', bn_decay=bn_decay)
54 |     net = tf_util.fully_connected(net, 256, bn=True, is_training=is_training,
55 |                                   scope='fc2', bn_decay=bn_decay)
56 |     net = tf_util.dropout(net, keep_prob=0.7, is_training=is_training,
57 |                           scope='dp1')
58 |     net = tf_util.fully_connected(net, 40, activation_fn=None, scope='fc3')
59 | 
60 |     return net, end_points
61 | 
62 | 
63 | def get_loss(pred, label, end_points):
64 |     """ pred: B*NUM_CLASSES,
65 |         label: B, """
66 |     loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=pred, labels=label)
67 |     classify_loss = tf.reduce_mean(loss)
68 |     tf.summary.scalar('classify loss', classify_loss)
69 |     return classify_loss
70 | 
71 | 
72 | if __name__=='__main__':
73 |     with tf.Graph().as_default():
74 |         inputs = tf.zeros((32,1024,3))
75 |         outputs = get_model(inputs, tf.constant(True))
76 |         print(outputs)
77 | 
--------------------------------------------------------------------------------
/models/pointnet_seg.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | import math
4 | import sys
5 | import os
6 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
7 | sys.path.append(BASE_DIR)
8 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
9 | import tf_util
10 | from transform_nets import input_transform_net, feature_transform_net
11 | 
12 | def placeholder_inputs(batch_size, num_point):
13 |     pointclouds_pl = tf.placeholder(tf.float32,
14 |                                     shape=(batch_size, num_point, 3))
15 |     labels_pl = tf.placeholder(tf.int32,
16 |                                shape=(batch_size, num_point))
17 |     return pointclouds_pl, labels_pl
18 | 
19 | 
20 | def get_model(point_cloud, is_training, bn_decay=None):
21 |     """ Segmentation PointNet, input is BxNx3, output BxNx50 """
22 |     batch_size = point_cloud.get_shape()[0].value
23 |     num_point = point_cloud.get_shape()[1].value
24 |     end_points = {}
25 | 
26 |     with tf.variable_scope('transform_net1') as sc:
27 |         transform = input_transform_net(point_cloud, is_training, bn_decay, K=3)
28 |     point_cloud_transformed = tf.matmul(point_cloud, transform)
29 |     input_image = tf.expand_dims(point_cloud_transformed, -1)
30 | 
31 |     net = tf_util.conv2d(input_image, 64, [1,3],
32 |                          padding='VALID', stride=[1,1],
33 |                          bn=True, is_training=is_training,
34 |                          scope='conv1', bn_decay=bn_decay)
35 |     net = tf_util.conv2d(net, 64, [1,1],
36 |                          padding='VALID', stride=[1,1],
37 |                          bn=True, is_training=is_training,
38 |                          scope='conv2', bn_decay=bn_decay)
39 | 
40 |     with tf.variable_scope('transform_net2') as sc:
41 |         transform = feature_transform_net(net, is_training, bn_decay, K=64)
42 |     end_points['transform'] = transform
43 |     net_transformed = tf.matmul(tf.squeeze(net, axis=[2]), transform)
44 |     point_feat = tf.expand_dims(net_transformed, [2])
45 |     print(point_feat)
46 | 
47 |     net = tf_util.conv2d(point_feat, 64, [1,1],
48 |                          padding='VALID', stride=[1,1],
49 |                          bn=True, is_training=is_training,
50 |                          scope='conv3', bn_decay=bn_decay)
51 |     net = tf_util.conv2d(net, 128, [1,1],
52 |                          padding='VALID', stride=[1,1],
53 |                          bn=True, is_training=is_training,
54 |                          scope='conv4', bn_decay=bn_decay)
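    # (Editor's note) `point_feat` above holds the per-point 64-d local features;
    # the max-pooled global feature computed below is tiled back to every point
    # and concatenated with `point_feat`, so each per-point 50-way part prediction
    # sees both local and global context.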
55 |     net = tf_util.conv2d(net, 1024, [1,1],
56 |                          padding='VALID', stride=[1,1],
57 |                          bn=True, is_training=is_training,
58 |                          scope='conv5', bn_decay=bn_decay)
59 |     global_feat = tf_util.max_pool2d(net, [num_point,1],
60 |                                      padding='VALID', scope='maxpool')
61 |     print(global_feat)
62 | 
63 |     global_feat_expand = tf.tile(global_feat, [1, num_point, 1, 1])
64 |     concat_feat = tf.concat(axis=3, values=[point_feat, global_feat_expand])
65 |     print(concat_feat)
66 | 
67 |     net = tf_util.conv2d(concat_feat, 512, [1,1],
68 |                          padding='VALID', stride=[1,1],
69 |                          bn=True, is_training=is_training,
70 |                          scope='conv6', bn_decay=bn_decay)
71 |     net = tf_util.conv2d(net, 256, [1,1],
72 |                          padding='VALID', stride=[1,1],
73 |                          bn=True, is_training=is_training,
74 |                          scope='conv7', bn_decay=bn_decay)
75 |     net = tf_util.conv2d(net, 128, [1,1],
76 |                          padding='VALID', stride=[1,1],
77 |                          bn=True, is_training=is_training,
78 |                          scope='conv8', bn_decay=bn_decay)
79 |     net = tf_util.conv2d(net, 128, [1,1],
80 |                          padding='VALID', stride=[1,1],
81 |                          bn=True, is_training=is_training,
82 |                          scope='conv9', bn_decay=bn_decay)
83 | 
84 |     net = tf_util.conv2d(net, 50, [1,1],
85 |                          padding='VALID', stride=[1,1], activation_fn=None,
86 |                          scope='conv10')
87 |     net = tf.squeeze(net, [2]) # BxNxC
88 | 
89 |     return net, end_points
90 | 
91 | 
92 | def get_loss(pred, label, end_points, reg_weight=0.001):
93 |     """ pred: BxNxC,
94 |         label: BxN, """
95 |     loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=pred, labels=label)
96 |     classify_loss = tf.reduce_mean(loss)
97 |     tf.summary.scalar('classify loss', classify_loss)
98 | 
99 |     # Enforce the transformation as orthogonal matrix
100 |     transform = end_points['transform'] # BxKxK
101 |     K = transform.get_shape()[1].value
102 |     mat_diff = tf.matmul(transform, tf.transpose(transform, perm=[0,2,1]))
103 |     mat_diff -= tf.constant(np.eye(K), dtype=tf.float32)
104 |     mat_diff_loss = tf.nn.l2_loss(mat_diff)
105 |     tf.summary.scalar('mat_loss', mat_diff_loss)
106 | 
107 |     return classify_loss + mat_diff_loss * reg_weight
108 | 
109 | 
110 | if __name__=='__main__':
111 |     with tf.Graph().as_default():
112 |         inputs = tf.zeros((32,1024,3))
113 |         outputs = get_model(inputs, tf.constant(True))
114 |         print(outputs)
115 | 
--------------------------------------------------------------------------------
/models/transform_nets.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | import sys
4 | import os
5 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
6 | sys.path.append(BASE_DIR)
7 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
8 | import tf_util
9 | 
10 | def input_transform_net(point_cloud, is_training, bn_decay=None, K=3):
11 |     """ Input (XYZ) Transform Net, input is BxNx3 point cloud
12 |         Return:
13 |             Transformation matrix of size 3xK """
14 |     batch_size = point_cloud.get_shape()[0].value
15 |     num_point = point_cloud.get_shape()[1].value
16 | 
17 |     input_image = tf.expand_dims(point_cloud, -1)
18 |     net = tf_util.conv2d(input_image, 64, [1,3],
19 |                          padding='VALID', stride=[1,1],
20 |                          bn=True, is_training=is_training,
21 |                          scope='tconv1', bn_decay=bn_decay)
22 |     net = tf_util.conv2d(net, 128, [1,1],
23 |                          padding='VALID', stride=[1,1],
24 |                          bn=True, is_training=is_training,
25 |                          scope='tconv2', bn_decay=bn_decay)
26 |     net = tf_util.conv2d(net, 1024, [1,1],
27 |                          padding='VALID', stride=[1,1],
28 |                          bn=True, is_training=is_training,
29 |                          scope='tconv3', bn_decay=bn_decay)
30 |     net = tf_util.max_pool2d(net, [num_point,1],
31 |                              padding='VALID', scope='tmaxpool')
32 | 
33 |     net = tf.reshape(net, [batch_size, -1])
34 |     net = tf_util.fully_connected(net, 512, bn=True, is_training=is_training,
35 |                                   scope='tfc1', bn_decay=bn_decay)
36 |     net = tf_util.fully_connected(net, 256, bn=True, is_training=is_training,
37 |                                   scope='tfc2', bn_decay=bn_decay)
38 | 
39 |     with tf.variable_scope('transform_XYZ') as sc:
40 |         assert(K==3)
41 |         weights = tf.get_variable('weights', [256, 3*K],
42 |                                   initializer=tf.constant_initializer(0.0),
43 |                                   dtype=tf.float32)
44 |         biases = tf.get_variable('biases', [3*K],
45 |                                  initializer=tf.constant_initializer(0.0),
46 |                                  dtype=tf.float32)
47 |         biases += tf.constant([1,0,0,0,1,0,0,0,1], dtype=tf.float32)
48 |         transform = tf.matmul(net, weights)
49 |         transform = tf.nn.bias_add(transform, biases)
50 | 
51 |     transform = tf.reshape(transform, [batch_size, 3, K])
52 |     return transform
53 | 
54 | 
55 | def feature_transform_net(inputs, is_training, bn_decay=None, K=64):
56 |     """ Feature Transform Net, input is BxNx1xK
57 |         Return:
58 |             Transformation matrix of size KxK """
59 |     batch_size = inputs.get_shape()[0].value
60 |     num_point = inputs.get_shape()[1].value
61 | 
62 |     net = tf_util.conv2d(inputs, 64, [1,1],
63 |                          padding='VALID', stride=[1,1],
64 |                          bn=True, is_training=is_training,
65 |                          scope='tconv1', bn_decay=bn_decay)
66 |     net = tf_util.conv2d(net, 128, [1,1],
67 |                          padding='VALID', stride=[1,1],
68 |                          bn=True, is_training=is_training,
69 |                          scope='tconv2', bn_decay=bn_decay)
70 |     net = tf_util.conv2d(net, 1024, [1,1],
71 |                          padding='VALID', stride=[1,1],
72 |                          bn=True, is_training=is_training,
73 |                          scope='tconv3', bn_decay=bn_decay)
74 |     net = tf_util.max_pool2d(net, [num_point,1],
75 |                              padding='VALID', scope='tmaxpool')
76 | 
77 |     net = tf.reshape(net, [batch_size, -1])
78 |     net = tf_util.fully_connected(net, 512, bn=True, is_training=is_training,
79 |                                   scope='tfc1', bn_decay=bn_decay)
80 |     net = tf_util.fully_connected(net, 256, bn=True, is_training=is_training,
81 |                                   scope='tfc2', bn_decay=bn_decay)
82 | 
83 |     with tf.variable_scope('transform_feat') as sc:
84 |         weights = tf.get_variable('weights', [256, K*K],
85 |                                   initializer=tf.constant_initializer(0.0),
86 |                                   dtype=tf.float32)
87 |         biases = tf.get_variable('biases', [K*K],
88 |                                  initializer=tf.constant_initializer(0.0),
89 |                                  dtype=tf.float32)
90 |         biases += tf.constant(np.eye(K).flatten(), dtype=tf.float32)
91 |         transform = tf.matmul(net, weights)
92 |         transform = tf.nn.bias_add(transform, biases)
93 | 
94 |     transform = tf.reshape(transform, [batch_size, K, K])
95 |     return transform
96 | 
--------------------------------------------------------------------------------
/part_seg/download_data.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | 
3 | # Download original ShapeNetPart dataset (around 1GB)
4 | wget https://shapenet.cs.stanford.edu/ericyi/shapenetcore_partanno_v0.zip
5 | unzip shapenetcore_partanno_v0.zip
6 | rm shapenetcore_partanno_v0.zip
7 | 
8 | # Download HDF5 for ShapeNet Part segmentation (around 346MB)
9 | wget https://shapenet.cs.stanford.edu/media/shapenet_part_seg_hdf5_data.zip
10 | unzip shapenet_part_seg_hdf5_data.zip
11 | rm shapenet_part_seg_hdf5_data.zip
12 | 
--------------------------------------------------------------------------------
/part_seg/pointnet_part_seg.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | import math
4 | import os
5 | import sys
6 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
7 | sys.path.append(os.path.dirname(BASE_DIR))
8 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
9 | import tf_util
10 | 
11 | 
12 | def get_transform_K(inputs, is_training, bn_decay=None, K = 3):
13 |     """ Transform Net, input is BxNx1xK gray image
14 |         Return:
15 |             Transformation matrix of size KxK """
16 |     batch_size = inputs.get_shape()[0].value
17 |     num_point = inputs.get_shape()[1].value
18 | 
19 |     net = tf_util.conv2d(inputs, 256, [1,1], padding='VALID', stride=[1,1],
20 |                          bn=True, is_training=is_training, scope='tconv1', bn_decay=bn_decay)
21 |     net = tf_util.conv2d(net, 1024, [1,1], padding='VALID', stride=[1,1],
22 |                          bn=True, is_training=is_training, scope='tconv2', bn_decay=bn_decay)
23 |     net = tf_util.max_pool2d(net, [num_point,1], padding='VALID', scope='tmaxpool')
24 | 
25 |     net = tf.reshape(net, [batch_size, -1])
26 |     net = tf_util.fully_connected(net, 512, bn=True, is_training=is_training, scope='tfc1', bn_decay=bn_decay)
27 |     net = tf_util.fully_connected(net, 256, bn=True, is_training=is_training, scope='tfc2', bn_decay=bn_decay)
28 | 
29 |     with tf.variable_scope('transform_feat') as sc:
30 |         weights = tf.get_variable('weights', [256, K*K], initializer=tf.constant_initializer(0.0), dtype=tf.float32)
31 |         biases = tf.get_variable('biases', [K*K], initializer=tf.constant_initializer(0.0), dtype=tf.float32) + tf.constant(np.eye(K).flatten(), dtype=tf.float32)
32 |         transform = tf.matmul(net, weights)
33 |         transform = tf.nn.bias_add(transform, biases)
34 | 
35 |     #transform = tf_util.fully_connected(net, 3*K, activation_fn=None, scope='tfc3')
36 |     transform = tf.reshape(transform, [batch_size, K, K])
37 |     return transform
38 | 
39 | 
40 | 
41 | 
42 | 
43 | def get_transform(point_cloud, is_training, bn_decay=None, K = 3):
44 |     """ Transform Net, input is BxNx3 gray image
45 |         Return:
46 |             Transformation matrix of size 3xK """
47 |     batch_size = point_cloud.get_shape()[0].value
48 |     num_point = point_cloud.get_shape()[1].value
49 | 
50 |     input_image = tf.expand_dims(point_cloud, -1)
51 |     net = tf_util.conv2d(input_image, 64, [1,3], padding='VALID', stride=[1,1],
52 |                          bn=True, is_training=is_training, scope='tconv1', bn_decay=bn_decay)
53 |     net = tf_util.conv2d(net, 128, [1,1], padding='VALID', stride=[1,1],
54 |                          bn=True, is_training=is_training, scope='tconv3', bn_decay=bn_decay)
55 |     net = tf_util.conv2d(net, 1024, [1,1], padding='VALID', stride=[1,1],
56 |                          bn=True, is_training=is_training, scope='tconv4', bn_decay=bn_decay)
57 |     net = tf_util.max_pool2d(net, [num_point,1], padding='VALID', scope='tmaxpool')
58 | 
59 |     net = tf.reshape(net, [batch_size, -1])
60 |     net = tf_util.fully_connected(net, 128, bn=True, is_training=is_training, scope='tfc1', bn_decay=bn_decay)
61 |     net = tf_util.fully_connected(net, 128, bn=True, is_training=is_training, scope='tfc2', bn_decay=bn_decay)
62 | 
63 |     with tf.variable_scope('transform_XYZ') as sc:
64 |         assert(K==3)
65 |         weights = tf.get_variable('weights', [128, 3*K], initializer=tf.constant_initializer(0.0), dtype=tf.float32)
66 |         biases = tf.get_variable('biases', [3*K], initializer=tf.constant_initializer(0.0), dtype=tf.float32) + tf.constant([1,0,0,0,1,0,0,0,1], dtype=tf.float32)
67 |         transform = tf.matmul(net, weights)
68 |         transform = tf.nn.bias_add(transform, biases)
69 | 
70 |     #transform = tf_util.fully_connected(net, 3*K, activation_fn=None, scope='tfc3')
71 |     transform = tf.reshape(transform, [batch_size, 3, K])
72 |     return transform
73 | 
74 | 
75 | def get_model(point_cloud, input_label, is_training, cat_num, part_num,
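              # (Editor's note) cat_num is the number of object categories (16 for
              # ShapeNetPart) and part_num the number of part classes (50); input_label
              # is the one-hot object category, concatenated into the segmentation
              # branch below.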
76 |               batch_size, num_point, weight_decay, bn_decay=None):
77 |     """ ConvNet baseline, input is BxNx3 gray image """
78 |     end_points = {}
79 | 
80 |     with tf.variable_scope('transform_net1') as sc:
81 |         K = 3
82 |         transform = get_transform(point_cloud, is_training, bn_decay, K = 3)
83 |     point_cloud_transformed = tf.matmul(point_cloud, transform)
84 | 
85 |     input_image = tf.expand_dims(point_cloud_transformed, -1)
86 |     out1 = tf_util.conv2d(input_image, 64, [1,K], padding='VALID', stride=[1,1],
87 |                           bn=True, is_training=is_training, scope='conv1', bn_decay=bn_decay)
88 |     out2 = tf_util.conv2d(out1, 128, [1,1], padding='VALID', stride=[1,1],
89 |                           bn=True, is_training=is_training, scope='conv2', bn_decay=bn_decay)
90 |     out3 = tf_util.conv2d(out2, 128, [1,1], padding='VALID', stride=[1,1],
91 |                           bn=True, is_training=is_training, scope='conv3', bn_decay=bn_decay)
92 | 
93 | 
94 |     with tf.variable_scope('transform_net2') as sc:
95 |         K = 128
96 |         transform = get_transform_K(out3, is_training, bn_decay, K)
97 | 
98 |     end_points['transform'] = transform
99 | 
100 |     squeezed_out3 = tf.reshape(out3, [batch_size, num_point, 128])
101 |     net_transformed = tf.matmul(squeezed_out3, transform)
102 |     net_transformed = tf.expand_dims(net_transformed, [2])
103 | 
104 |     out4 = tf_util.conv2d(net_transformed, 512, [1,1], padding='VALID', stride=[1,1],
105 |                           bn=True, is_training=is_training, scope='conv4', bn_decay=bn_decay)
106 |     out5 = tf_util.conv2d(out4, 2048, [1,1], padding='VALID', stride=[1,1],
107 |                           bn=True, is_training=is_training, scope='conv5', bn_decay=bn_decay)
108 |     out_max = tf_util.max_pool2d(out5, [num_point,1], padding='VALID', scope='maxpool')
109 | 
110 |     # classification network
111 |     net = tf.reshape(out_max, [batch_size, -1])
112 |     net = tf_util.fully_connected(net, 256, bn=True, is_training=is_training, scope='cla/fc1', bn_decay=bn_decay)
113 |     net = tf_util.fully_connected(net, 256, bn=True, is_training=is_training, scope='cla/fc2', bn_decay=bn_decay)
114 |     net = tf_util.dropout(net, keep_prob=0.7, is_training=is_training, scope='cla/dp1')
115 |     net = tf_util.fully_connected(net, cat_num, activation_fn=None, scope='cla/fc3')
116 | 
117 |     # segmentation network
118 |     one_hot_label_expand = tf.reshape(input_label, [batch_size, 1, 1, cat_num])
119 |     out_max = tf.concat(axis=3, values=[out_max, one_hot_label_expand])
120 | 
121 |     expand = tf.tile(out_max, [1, num_point, 1, 1])
122 |     concat = tf.concat(axis=3, values=[expand, out1, out2, out3, out4, out5])
123 | 
124 |     net2 = tf_util.conv2d(concat, 256, [1,1], padding='VALID', stride=[1,1], bn_decay=bn_decay,
125 |                           bn=True, is_training=is_training, scope='seg/conv1', weight_decay=weight_decay)
126 |     net2 = tf_util.dropout(net2, keep_prob=0.8, is_training=is_training, scope='seg/dp1')
127 |     net2 = tf_util.conv2d(net2, 256, [1,1], padding='VALID', stride=[1,1], bn_decay=bn_decay,
128 |                           bn=True, is_training=is_training, scope='seg/conv2', weight_decay=weight_decay)
129 |     net2 = tf_util.dropout(net2, keep_prob=0.8, is_training=is_training, scope='seg/dp2')
130 |     net2 = tf_util.conv2d(net2, 128, [1,1], padding='VALID', stride=[1,1], bn_decay=bn_decay,
131 |                           bn=True, is_training=is_training, scope='seg/conv3', weight_decay=weight_decay)
132 |     net2 = tf_util.conv2d(net2, part_num, [1,1], padding='VALID', stride=[1,1], activation_fn=None,
133 |                           bn=False, scope='seg/conv4', weight_decay=weight_decay)
134 | 
135 |     net2 = tf.reshape(net2, [batch_size, num_point, part_num])
136 | 
137 |     return net, net2, end_points
138 | 
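# (Editor's note) get_loss below combines three terms:
#     total = weight * seg_loss + (1 - weight) * label_loss + 1e-3 * ||A A^T - I||_F^2 / 2
# where A is the predicted KxK feature transform; the last term (tf.nn.l2_loss
# of mat_diff) nudges A toward an orthogonal matrix.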
139 | def get_loss(l_pred, seg_pred, label, seg, weight, end_points):
140 |     per_instance_label_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=l_pred, labels=label)
141 |     label_loss = tf.reduce_mean(per_instance_label_loss)
142 | 
143 |     # size of seg_pred is batch_size x point_num x part_cat_num
144 |     # size of seg is batch_size x point_num
145 |     per_instance_seg_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=seg_pred, labels=seg), axis=1)
146 |     seg_loss = tf.reduce_mean(per_instance_seg_loss)
147 | 
148 |     per_instance_seg_pred_res = tf.argmax(seg_pred, 2)
149 | 
150 |     # Enforce the transformation as orthogonal matrix
151 |     transform = end_points['transform'] # BxKxK
152 |     K = transform.get_shape()[1].value
153 |     mat_diff = tf.matmul(transform, tf.transpose(transform, perm=[0,2,1])) - tf.constant(np.eye(K), dtype=tf.float32)
154 |     mat_diff_loss = tf.nn.l2_loss(mat_diff)
155 | 
156 | 
157 |     total_loss = weight * seg_loss + (1 - weight) * label_loss + mat_diff_loss * 1e-3
158 | 
159 |     return total_loss, label_loss, per_instance_label_loss, seg_loss, per_instance_seg_loss, per_instance_seg_pred_res
160 | 
161 | 
--------------------------------------------------------------------------------
/part_seg/test.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import tensorflow as tf
3 | import json
4 | import numpy as np
5 | import os
6 | import sys
7 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
8 | sys.path.append(BASE_DIR)
9 | sys.path.append(os.path.dirname(BASE_DIR))
10 | import provider
11 | import pointnet_part_seg as model
12 | 
13 | parser = argparse.ArgumentParser()
14 | parser.add_argument('--model_path', default='train_results/trained_models/epoch_190.ckpt', help='Model checkpoint path')
15 | FLAGS = parser.parse_args()
16 | 
17 | 
18 | # DEFAULT SETTINGS
19 | pretrained_model_path = FLAGS.model_path # os.path.join(BASE_DIR, './pretrained_model/model.ckpt')
20 | hdf5_data_dir = os.path.join(BASE_DIR, './hdf5_data')
21 | ply_data_dir = os.path.join(BASE_DIR, './PartAnnotation')
22 | gpu_to_use = 0
23 | output_dir = os.path.join(BASE_DIR, './test_results')
24 | output_verbose = True # If true, output all color-coded part segmentation obj files
25 | 
26 | # MAIN SCRIPT
27 | point_num = 3000 # the max number of points over all testing data shapes
28 | batch_size = 1
29 | 
30 | test_file_list = os.path.join(BASE_DIR, 'testing_ply_file_list.txt')
31 | 
32 | oid2cpid = json.load(open(os.path.join(hdf5_data_dir, 'overallid_to_catid_partid.json'), 'r'))
33 | 
34 | object2setofoid = {}
35 | for idx in range(len(oid2cpid)):
36 |     objid, pid = oid2cpid[idx]
37 |     if not objid in object2setofoid.keys():
38 |         object2setofoid[objid] = []
39 |     object2setofoid[objid].append(idx)
40 | 
41 | all_obj_cat_file = os.path.join(hdf5_data_dir, 'all_object_categories.txt')
42 | fin = open(all_obj_cat_file, 'r')
43 | lines = [line.rstrip() for line in fin.readlines()]
44 | objcats = [line.split()[1] for line in lines]
45 | objnames = [line.split()[0] for line in lines]
46 | on2oid = {objcats[i]:i for i in range(len(objcats))}
47 | fin.close()
48 | 
49 | color_map_file = os.path.join(hdf5_data_dir, 'part_color_mapping.json')
50 | color_map = json.load(open(color_map_file, 'r'))
51 | 
52 | NUM_OBJ_CATS = 16
53 | NUM_PART_CATS = 50
54 | 
55 | cpid2oid = json.load(open(os.path.join(hdf5_data_dir, 'catid_partid_to_overallid.json'), 'r'))
56 | 
57 | def printout(flog, data):
58 |     print(data)
59 |     flog.write(data + '\n')
60 | 
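# (Editor's note) The output_color_point_cloud helpers below write one
# 'v x y z r g b' line per point -- a minimal OBJ-style point cloud with
# per-vertex colors that viewers such as MeshLab can display.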
61 | def output_color_point_cloud(data, seg, out_file):
62 |     with open(out_file, 'w') as f:
63 |         l = len(seg)
64 |         for i in range(l):
65 |             color = color_map[seg[i]]
66 |             f.write('v %f %f %f %f %f %f\n' % (data[i][0], data[i][1], data[i][2], color[0], color[1], color[2]))
67 | 
68 | def output_color_point_cloud_red_blue(data, seg, out_file):
69 |     with open(out_file, 'w') as f:
70 |         l = len(seg)
71 |         for i in range(l):
72 |             if seg[i] == 1:
73 |                 color = [0, 0, 1]
74 |             elif seg[i] == 0:
75 |                 color = [1, 0, 0]
76 |             else:
77 |                 color = [0, 0, 0]
78 | 
79 |             f.write('v %f %f %f %f %f %f\n' % (data[i][0], data[i][1], data[i][2], color[0], color[1], color[2]))
80 | 
81 | 
82 | def pc_normalize(pc):
83 |     l = pc.shape[0]
84 |     centroid = np.mean(pc, axis=0)
85 |     pc = pc - centroid
86 |     m = np.max(np.sqrt(np.sum(pc**2, axis=1)))
87 |     pc = pc / m
88 |     return pc
89 | 
90 | def placeholder_inputs():
91 |     pointclouds_ph = tf.placeholder(tf.float32, shape=(batch_size, point_num, 3))
92 |     input_label_ph = tf.placeholder(tf.float32, shape=(batch_size, NUM_OBJ_CATS))
93 |     return pointclouds_ph, input_label_ph
94 | 
102 | def load_pts_seg_files(pts_file, seg_file, catid):
103 |     with open(pts_file, 'r') as f:
104 |         pts_str = [item.rstrip() for item in f.readlines()]
105 |         pts = np.array([np.float32(s.split()) for s in pts_str], dtype=np.float32)
106 |     with open(seg_file, 'r') as f:
107 |         part_ids = np.array([int(item.rstrip()) for item in f.readlines()], dtype=np.uint8)
108 |         seg = np.array([cpid2oid[catid+'_'+str(x)] for x in part_ids])
109 |     return pts, seg
110 | 
111 | def pc_augment_to_point_num(pts, pn):
112 |     assert(pts.shape[0] <= pn)
113 |     cur_len = pts.shape[0]
114 |     res = np.array(pts)
115 |     while cur_len < pn:
116 |         res = np.concatenate((res, pts))
117 |         cur_len += pts.shape[0]
118 |     return res[:pn, :]
119 | 
120 | def convert_label_to_one_hot(labels):
121 |     label_one_hot = np.zeros((labels.shape[0], NUM_OBJ_CATS))
122 |     for idx in range(labels.shape[0]):
123 |         label_one_hot[idx, labels[idx]] = 1
124 |     return label_one_hot
125 | 
126 | def predict():
127 |     is_training = False
128 | 
129 |     with tf.device('/gpu:'+str(gpu_to_use)):
130 |         pointclouds_ph, input_label_ph = placeholder_inputs()
131 |         is_training_ph = tf.placeholder(tf.bool, shape=())
132 | 
133 |         # simple model
134 |         pred, seg_pred, end_points = model.get_model(pointclouds_ph, input_label_ph, \
135 |             cat_num=NUM_OBJ_CATS, part_num=NUM_PART_CATS, is_training=is_training_ph, \
136 |             batch_size=batch_size, num_point=point_num, weight_decay=0.0, bn_decay=None)
137 | 
138 |     # Add ops to save and restore all the variables.
139 |     saver = tf.train.Saver()
140 | 
141 |     # Later, launch the model, use the saver to restore variables from disk, and
142 |     # do some work with the model.
143 | 
144 |     config = tf.ConfigProto()
145 |     config.gpu_options.allow_growth = True
146 |     config.allow_soft_placement = True
147 | 
148 |     with tf.Session(config=config) as sess:
149 |         if not os.path.exists(output_dir):
150 |             os.mkdir(output_dir)
151 | 
152 |         flog = open(os.path.join(output_dir, 'log.txt'), 'w')
153 | 
154 |         # Restore variables from disk.
155 |         printout(flog, 'Loading model %s' % pretrained_model_path)
156 |         saver.restore(sess, pretrained_model_path)
157 |         printout(flog, 'Model restored.')
158 | 
159 |         # Note: the evaluation for the model with BN has to have some statistics
160 |         # Using some test data as the statistics
161 |         batch_data = np.zeros([batch_size, point_num, 3]).astype(np.float32)
162 | 
163 |         total_acc = 0.0
164 |         total_seen = 0
165 |         total_acc_iou = 0.0
166 | 
167 |         total_per_cat_acc = np.zeros((NUM_OBJ_CATS)).astype(np.float32)
168 |         total_per_cat_iou = np.zeros((NUM_OBJ_CATS)).astype(np.float32)
169 |         total_per_cat_seen = np.zeros((NUM_OBJ_CATS)).astype(np.int32)
170 | 
171 |         ffiles = open(test_file_list, 'r')
172 |         lines = [line.rstrip() for line in ffiles.readlines()]
173 |         pts_files = [line.split()[0] for line in lines]
174 |         seg_files = [line.split()[1] for line in lines]
175 |         labels = [line.split()[2] for line in lines]
176 |         ffiles.close()
177 | 
178 |         len_pts_files = len(pts_files)
179 |         for shape_idx in range(len_pts_files):
180 |             if shape_idx % 100 == 0:
181 |                 printout(flog, '%d/%d ...' % (shape_idx, len_pts_files))
182 | 
183 |             cur_gt_label = on2oid[labels[shape_idx]]
184 | 
185 |             cur_label_one_hot = np.zeros((1, NUM_OBJ_CATS), dtype=np.float32)
186 |             cur_label_one_hot[0, cur_gt_label] = 1
187 | 
188 |             pts_file_to_load = os.path.join(ply_data_dir, pts_files[shape_idx])
189 |             seg_file_to_load = os.path.join(ply_data_dir, seg_files[shape_idx])
190 | 
191 |             pts, seg = load_pts_seg_files(pts_file_to_load, seg_file_to_load, objcats[cur_gt_label])
192 |             ori_point_num = len(seg)
193 | 
194 |             batch_data[0, ...] = pc_augment_to_point_num(pc_normalize(pts), point_num)
195 | 
196 |             label_pred_val, seg_pred_res = sess.run([pred, seg_pred], feed_dict={
197 |                 pointclouds_ph: batch_data,
198 |                 input_label_ph: cur_label_one_hot,
199 |                 is_training_ph: is_training,
200 |             })
201 | 
202 |             label_pred_val = np.argmax(label_pred_val[0, :])
203 | 
204 |             seg_pred_res = seg_pred_res[0, ...]
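            # (Editor's note) The block below restricts predictions to the parts of
            # the ground-truth object category: scores for part labels outside
            # iou_oids are pushed below the global minimum, so argmax can only pick
            # a part id from that category. Per-shape IoU is then averaged over the
            # category's parts, counting an empty union as IoU = 1.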
205 | 
206 |             iou_oids = object2setofoid[objcats[cur_gt_label]]
207 |             non_cat_labels = list(set(np.arange(NUM_PART_CATS)).difference(set(iou_oids)))
208 | 
209 |             mini = np.min(seg_pred_res)
210 |             seg_pred_res[:, non_cat_labels] = mini - 1000
211 | 
212 |             seg_pred_val = np.argmax(seg_pred_res, axis=1)[:ori_point_num]
213 | 
214 |             seg_acc = np.mean(seg_pred_val == seg)
215 | 
216 |             total_acc += seg_acc
217 |             total_seen += 1
218 | 
219 |             total_per_cat_seen[cur_gt_label] += 1
220 |             total_per_cat_acc[cur_gt_label] += seg_acc
221 | 
222 |             mask = np.int32(seg_pred_val == seg)
223 | 
224 |             total_iou = 0.0
225 |             iou_log = ''
226 |             for oid in iou_oids:
227 |                 n_pred = np.sum(seg_pred_val == oid)
228 |                 n_gt = np.sum(seg == oid)
229 |                 n_intersect = np.sum(np.int32(seg == oid) * mask)
230 |                 n_union = n_pred + n_gt - n_intersect
231 |                 iou_log += '_' + str(n_pred)+'_'+str(n_gt)+'_'+str(n_intersect)+'_'+str(n_union)+'_'
232 |                 if n_union == 0:
233 |                     total_iou += 1
234 |                     iou_log += '_1\n'
235 |                 else:
236 |                     total_iou += n_intersect * 1.0 / n_union
237 |                     iou_log += '_'+str(n_intersect * 1.0 / n_union)+'\n'
238 | 
239 |             avg_iou = total_iou / len(iou_oids)
240 |             total_acc_iou += avg_iou
241 |             total_per_cat_iou[cur_gt_label] += avg_iou
242 | 
243 |             if output_verbose:
244 |                 output_color_point_cloud(pts, seg, os.path.join(output_dir, str(shape_idx)+'_gt.obj'))
245 |                 output_color_point_cloud(pts, seg_pred_val, os.path.join(output_dir, str(shape_idx)+'_pred.obj'))
246 |                 output_color_point_cloud_red_blue(pts, np.int32(seg == seg_pred_val),
247 |                     os.path.join(output_dir, str(shape_idx)+'_diff.obj'))
248 | 
249 |                 with open(os.path.join(output_dir, str(shape_idx)+'.log'), 'w') as fout:
250 |                     fout.write('Total Point: %d\n\n' % ori_point_num)
251 |                     fout.write('Ground Truth: %s\n' % objnames[cur_gt_label])
252 |                     fout.write('Predict: %s\n\n' % objnames[label_pred_val])
253 |                     fout.write('Accuracy: %f\n' % seg_acc)
254 |                     fout.write('IoU: %f\n\n' % avg_iou)
255 |                     fout.write('IoU details: %s\n' % iou_log)
256 | 
257 |         printout(flog, 'Accuracy: %f' % (total_acc / total_seen))
258 |         printout(flog, 'IoU: %f' % (total_acc_iou / total_seen))
259 | 
260 |         for cat_idx in range(NUM_OBJ_CATS):
261 |             printout(flog, '\t ' + objcats[cat_idx] + ' Total Number: ' + str(total_per_cat_seen[cat_idx]))
262 |             if total_per_cat_seen[cat_idx] > 0:
263 |                 printout(flog, '\t ' + objcats[cat_idx] + ' Accuracy: ' + \
264 |                     str(total_per_cat_acc[cat_idx] / total_per_cat_seen[cat_idx]))
265 |                 printout(flog, '\t ' + objcats[cat_idx] + ' IoU: '+ \
266 |                     str(total_per_cat_iou[cat_idx] / total_per_cat_seen[cat_idx]))
267 | 
268 | 
269 | with tf.Graph().as_default():
270 |     predict()
271 | 
--------------------------------------------------------------------------------
/part_seg/train.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import subprocess
3 | import tensorflow as tf
4 | import numpy as np
5 | from datetime import datetime
6 | import json
7 | import os
8 | import sys
9 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
10 | sys.path.append(BASE_DIR)
11 | sys.path.append(os.path.dirname(BASE_DIR))
12 | import provider
13 | import pointnet_part_seg as model
14 | 
15 | # DEFAULT SETTINGS
16 | parser = argparse.ArgumentParser()
17 | parser.add_argument('--gpu', type=int, default=1, help='GPU to use [default: GPU 1]')
18 | parser.add_argument('--batch', type=int, default=32, help='Batch Size during training [default: 32]')
19 | parser.add_argument('--epoch', type=int, default=200, help='Epoch to run [default: 200]')
20 | parser.add_argument('--point_num', type=int, default=2048, help='Point Number [256/512/1024/2048]')
21 | parser.add_argument('--output_dir', type=str, default='train_results', help='Directory that stores all training logs and trained models')
22 | parser.add_argument('--wd', type=float, default=0, help='Weight Decay [default: 0.0]')
23 | FLAGS = parser.parse_args()
24 | 
25 | hdf5_data_dir = os.path.join(BASE_DIR, './hdf5_data')
26 | 
27 | # MAIN SCRIPT
28 | point_num = FLAGS.point_num
29 | batch_size = FLAGS.batch
30 | output_dir = FLAGS.output_dir
31 | 
32 | if not os.path.exists(output_dir):
33 |     os.mkdir(output_dir)
34 | 
35 | color_map_file = os.path.join(hdf5_data_dir, 'part_color_mapping.json')
36 | color_map = json.load(open(color_map_file, 'r'))
37 | 
38 | all_obj_cats_file = os.path.join(hdf5_data_dir, 'all_object_categories.txt')
39 | fin = open(all_obj_cats_file, 'r')
40 | lines = [line.rstrip() for line in fin.readlines()]
41 | all_obj_cats = [(line.split()[0], line.split()[1]) for line in lines]
42 | fin.close()
43 | 
44 | all_cats = json.load(open(os.path.join(hdf5_data_dir, 'overallid_to_catid_partid.json'), 'r'))
45 | NUM_CATEGORIES = 16
46 | NUM_PART_CATS = len(all_cats)
47 | 
48 | print('#### Batch Size: {0}'.format(batch_size))
49 | print('#### Point Number: {0}'.format(point_num))
50 | print('#### Training using GPU: {0}'.format(FLAGS.gpu))
51 | 
52 | DECAY_STEP = 16881 * 20
53 | DECAY_RATE = 0.5
54 | 
55 | LEARNING_RATE_CLIP = 1e-5
56 | 
57 | BN_INIT_DECAY = 0.5
58 | BN_DECAY_DECAY_RATE = 0.5
59 | BN_DECAY_DECAY_STEP = float(DECAY_STEP * 2)
60 | BN_DECAY_CLIP = 0.99
61 | 
62 | BASE_LEARNING_RATE = 0.001
63 | MOMENTUM = 0.9
64 | TRAINING_EPOCHES = FLAGS.epoch
65 | print('### Training epoch: {0}'.format(TRAINING_EPOCHES))
66 | 
67 | TRAINING_FILE_LIST = os.path.join(hdf5_data_dir, 'train_hdf5_file_list.txt')
68 | TESTING_FILE_LIST = os.path.join(hdf5_data_dir, 'val_hdf5_file_list.txt')
69 | 
70 | MODEL_STORAGE_PATH = os.path.join(output_dir, 'trained_models')
71 | if not os.path.exists(MODEL_STORAGE_PATH):
72 |     os.mkdir(MODEL_STORAGE_PATH)
73 | 
74 | LOG_STORAGE_PATH = os.path.join(output_dir, 'logs')
75 | if not os.path.exists(LOG_STORAGE_PATH):
76 |     os.mkdir(LOG_STORAGE_PATH)
77 | 
78 | SUMMARIES_FOLDER = os.path.join(output_dir, 'summaries')
79 | if not os.path.exists(SUMMARIES_FOLDER):
80 |     os.mkdir(SUMMARIES_FOLDER)
81 | 
82 | def printout(flog, data):
83 |     print(data)
84 |     flog.write(data + '\n')
85 | 
86 | def placeholder_inputs():
87 |     pointclouds_ph = tf.placeholder(tf.float32, shape=(batch_size, point_num, 3))
88 |     input_label_ph = tf.placeholder(tf.float32, shape=(batch_size, NUM_CATEGORIES))
89 |     labels_ph = tf.placeholder(tf.int32, shape=(batch_size))
90 |     seg_ph = tf.placeholder(tf.int32, shape=(batch_size, point_num))
91 |     return pointclouds_ph, input_label_ph, labels_ph, seg_ph
92 | 
93 | def convert_label_to_one_hot(labels):
94 |     label_one_hot = np.zeros((labels.shape[0], NUM_CATEGORIES))
95 |     for idx in range(labels.shape[0]):
96 |         label_one_hot[idx, labels[idx]] = 1
97 |     return label_one_hot
98 | 
99 | def train():
100 |     with tf.Graph().as_default():
101 |         with tf.device('/gpu:'+str(FLAGS.gpu)):
102 |             pointclouds_ph, input_label_ph, labels_ph, seg_ph = placeholder_inputs()
103 |             is_training_ph = tf.placeholder(tf.bool, shape=())
104 | 
105 |             batch = tf.Variable(0, trainable=False)
106 |             learning_rate = tf.train.exponential_decay(
107 |                 BASE_LEARNING_RATE,     # base learning rate
108 |                 batch * batch_size,     # global_var indicating the number of steps
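                # (Editor's note) with staircase=True this evaluates to
                #   BASE_LEARNING_RATE * DECAY_RATE ** floor((batch * batch_size) / DECAY_STEP);
                # the tf.maximum below then keeps it from dropping under LEARNING_RATE_CLIP.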
109 |                 DECAY_STEP,             # step size
110 |                 DECAY_RATE,             # decay rate
111 |                 staircase=True          # Stair-case or continuous decreasing
112 |             )
113 |             learning_rate = tf.maximum(learning_rate, LEARNING_RATE_CLIP)
114 | 
115 |             bn_momentum = tf.train.exponential_decay(
116 |                 BN_INIT_DECAY,
117 |                 batch*batch_size,
118 |                 BN_DECAY_DECAY_STEP,
119 |                 BN_DECAY_DECAY_RATE,
120 |                 staircase=True)
121 |             bn_decay = tf.minimum(BN_DECAY_CLIP, 1 - bn_momentum)
122 | 
123 |             lr_op = tf.summary.scalar('learning_rate', learning_rate)
124 |             batch_op = tf.summary.scalar('batch_number', batch)
125 |             bn_decay_op = tf.summary.scalar('bn_decay', bn_decay)
126 | 
127 |             labels_pred, seg_pred, end_points = model.get_model(pointclouds_ph, input_label_ph, \
128 |                 is_training=is_training_ph, bn_decay=bn_decay, cat_num=NUM_CATEGORIES, \
129 |                 part_num=NUM_PART_CATS, batch_size=batch_size, num_point=point_num, weight_decay=FLAGS.wd)
130 | 
131 |             # pointnet_part_seg.py defines both the classification net and the segmentation net, which share a common global feature extractor.
132 |             # In model.get_loss, the total loss is defined as a weighted sum of the classification and segmentation losses.
133 |             # Here we only train the segmentation network, so we set the weight to 1.0.
134 |             loss, label_loss, per_instance_label_loss, seg_loss, per_instance_seg_loss, per_instance_seg_pred_res \
135 |                 = model.get_loss(labels_pred, seg_pred, labels_ph, seg_ph, 1.0, end_points)
136 | 
137 |             total_training_loss_ph = tf.placeholder(tf.float32, shape=())
138 |             total_testing_loss_ph = tf.placeholder(tf.float32, shape=())
139 | 
140 |             label_training_loss_ph = tf.placeholder(tf.float32, shape=())
141 |             label_testing_loss_ph = tf.placeholder(tf.float32, shape=())
142 | 
143 |             seg_training_loss_ph = tf.placeholder(tf.float32, shape=())
144 |             seg_testing_loss_ph = tf.placeholder(tf.float32, shape=())
145 | 
146 |             label_training_acc_ph = tf.placeholder(tf.float32, shape=())
147 |             label_testing_acc_ph = tf.placeholder(tf.float32, shape=())
148 |             label_testing_acc_avg_cat_ph = tf.placeholder(tf.float32, shape=())
149 | 
150 |             seg_training_acc_ph = tf.placeholder(tf.float32, shape=())
151 |             seg_testing_acc_ph = tf.placeholder(tf.float32, shape=())
152 |             seg_testing_acc_avg_cat_ph = tf.placeholder(tf.float32, shape=())
153 | 
154 |             total_train_loss_sum_op = tf.summary.scalar('total_training_loss', total_training_loss_ph)
155 |             total_test_loss_sum_op = tf.summary.scalar('total_testing_loss', total_testing_loss_ph)
156 | 
157 |             label_train_loss_sum_op = tf.summary.scalar('label_training_loss', label_training_loss_ph)
158 |             label_test_loss_sum_op = tf.summary.scalar('label_testing_loss', label_testing_loss_ph)
159 | 
160 |             seg_train_loss_sum_op = tf.summary.scalar('seg_training_loss', seg_training_loss_ph)
161 |             seg_test_loss_sum_op = tf.summary.scalar('seg_testing_loss', seg_testing_loss_ph)
162 | 
163 |             label_train_acc_sum_op = tf.summary.scalar('label_training_acc', label_training_acc_ph)
164 |             label_test_acc_sum_op = tf.summary.scalar('label_testing_acc', label_testing_acc_ph)
165 |             label_test_acc_avg_cat_op = tf.summary.scalar('label_testing_acc_avg_cat', label_testing_acc_avg_cat_ph)
166 | 
167 |             seg_train_acc_sum_op = tf.summary.scalar('seg_training_acc', seg_training_acc_ph)
168 |             seg_test_acc_sum_op = tf.summary.scalar('seg_testing_acc', seg_testing_acc_ph)
169 |             seg_test_acc_avg_cat_op = tf.summary.scalar('seg_testing_acc_avg_cat', seg_testing_acc_avg_cat_ph)
170 | 
171 |             train_variables = tf.trainable_variables()
172 | 
173 |             trainer = tf.train.AdamOptimizer(learning_rate)
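            # (Editor's note) passing global_step=batch makes each training step
            # increment `batch`, which in turn drives the learning-rate and bn_decay
            # schedules defined above.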
trainer.minimize(loss, var_list=train_variables, global_step=batch) 175 | 176 | saver = tf.train.Saver() 177 | 178 | config = tf.ConfigProto() 179 | config.gpu_options.allow_growth = True 180 | config.allow_soft_placement = True 181 | sess = tf.Session(config=config) 182 | 183 | init = tf.global_variables_initializer() 184 | sess.run(init) 185 | 186 | train_writer = tf.summary.FileWriter(SUMMARIES_FOLDER + '/train', sess.graph) 187 | test_writer = tf.summary.FileWriter(SUMMARIES_FOLDER + '/test') 188 | 189 | train_file_list = provider.getDataFiles(TRAINING_FILE_LIST) 190 | num_train_file = len(train_file_list) 191 | test_file_list = provider.getDataFiles(TESTING_FILE_LIST) 192 | num_test_file = len(test_file_list) 193 | 194 | fcmd = open(os.path.join(LOG_STORAGE_PATH, 'cmd.txt'), 'w') 195 | fcmd.write(str(FLAGS)) 196 | fcmd.close() 197 | 198 | # write logs to the disk 199 | flog = open(os.path.join(LOG_STORAGE_PATH, 'log.txt'), 'w') 200 | 201 | def train_one_epoch(train_file_idx, epoch_num): 202 | is_training = True 203 | 204 | for i in range(num_train_file): 205 | cur_train_filename = os.path.join(hdf5_data_dir, train_file_list[train_file_idx[i]]) 206 | printout(flog, 'Loading train file ' + cur_train_filename) 207 | 208 | cur_data, cur_labels, cur_seg = provider.loadDataFile_with_seg(cur_train_filename) 209 | cur_data, cur_labels, order = provider.shuffle_data(cur_data, np.squeeze(cur_labels)) 210 | cur_seg = cur_seg[order, ...] 211 | 212 | cur_labels_one_hot = convert_label_to_one_hot(cur_labels) 213 | 214 | num_data = len(cur_labels) 215 | num_batch = num_data // batch_size 216 | 217 | total_loss = 0.0 218 | total_label_loss = 0.0 219 | total_seg_loss = 0.0 220 | total_label_acc = 0.0 221 | total_seg_acc = 0.0 222 | 223 | for j in range(num_batch): 224 | begidx = j * batch_size 225 | endidx = (j + 1) * batch_size 226 | 227 | feed_dict = { 228 | pointclouds_ph: cur_data[begidx: endidx, ...], 229 | labels_ph: cur_labels[begidx: endidx, ...], 230 | input_label_ph: cur_labels_one_hot[begidx: endidx, ...], 231 | seg_ph: cur_seg[begidx: endidx, ...], 232 | is_training_ph: is_training, 233 | } 234 | 235 | _, loss_val, label_loss_val, seg_loss_val, per_instance_label_loss_val, \ 236 | per_instance_seg_loss_val, label_pred_val, seg_pred_val, pred_seg_res \ 237 | = sess.run([train_op, loss, label_loss, seg_loss, per_instance_label_loss, \ 238 | per_instance_seg_loss, labels_pred, seg_pred, per_instance_seg_pred_res], \ 239 | feed_dict=feed_dict) 240 | 241 | per_instance_part_acc = np.mean(pred_seg_res == cur_seg[begidx: endidx, ...], axis=1) 242 | average_part_acc = np.mean(per_instance_part_acc) 243 | 244 | total_loss += loss_val 245 | total_label_loss += label_loss_val 246 | total_seg_loss += seg_loss_val 247 | 248 | per_instance_label_pred = np.argmax(label_pred_val, axis=1) 249 | total_label_acc += np.mean(np.float32(per_instance_label_pred == cur_labels[begidx: endidx, ...])) 250 | total_seg_acc += average_part_acc 251 | 252 | total_loss = total_loss * 1.0 / num_batch 253 | total_label_loss = total_label_loss * 1.0 / num_batch 254 | total_seg_loss = total_seg_loss * 1.0 / num_batch 255 | total_label_acc = total_label_acc * 1.0 / num_batch 256 | total_seg_acc = total_seg_acc * 1.0 / num_batch 257 | 258 | lr_sum, bn_decay_sum, batch_sum, train_loss_sum, train_label_acc_sum, \ 259 | train_label_loss_sum, train_seg_loss_sum, train_seg_acc_sum = sess.run(\ 260 | [lr_op, bn_decay_op, batch_op, total_train_loss_sum_op, label_train_acc_sum_op, \ 261 | label_train_loss_sum_op, seg_train_loss_sum_op, 
seg_train_acc_sum_op], \ 262 | feed_dict={total_training_loss_ph: total_loss, label_training_loss_ph: total_label_loss, \ 263 | seg_training_loss_ph: total_seg_loss, label_training_acc_ph: total_label_acc, \ 264 | seg_training_acc_ph: total_seg_acc}) 265 | 266 | train_writer.add_summary(train_loss_sum, i + epoch_num * num_train_file) 267 | train_writer.add_summary(train_label_loss_sum, i + epoch_num * num_train_file) 268 | train_writer.add_summary(train_seg_loss_sum, i + epoch_num * num_train_file) 269 | train_writer.add_summary(lr_sum, i + epoch_num * num_train_file) 270 | train_writer.add_summary(bn_decay_sum, i + epoch_num * num_train_file) 271 | train_writer.add_summary(train_label_acc_sum, i + epoch_num * num_train_file) 272 | train_writer.add_summary(train_seg_acc_sum, i + epoch_num * num_train_file) 273 | train_writer.add_summary(batch_sum, i + epoch_num * num_train_file) 274 | 275 | printout(flog, '\tTraining Total Mean_loss: %f' % total_loss) 276 | printout(flog, '\t\tTraining Label Mean_loss: %f' % total_label_loss) 277 | printout(flog, '\t\tTraining Label Accuracy: %f' % total_label_acc) 278 | printout(flog, '\t\tTraining Seg Mean_loss: %f' % total_seg_loss) 279 | printout(flog, '\t\tTraining Seg Accuracy: %f' % total_seg_acc) 280 | 281 | def eval_one_epoch(epoch_num): 282 | is_training = False 283 | 284 | total_loss = 0.0 285 | total_label_loss = 0.0 286 | total_seg_loss = 0.0 287 | total_label_acc = 0.0 288 | total_seg_acc = 0.0 289 | total_seen = 0 290 | 291 | total_label_acc_per_cat = np.zeros((NUM_CATEGORIES)).astype(np.float32) 292 | total_seg_acc_per_cat = np.zeros((NUM_CATEGORIES)).astype(np.float32) 293 | total_seen_per_cat = np.zeros((NUM_CATEGORIES)).astype(np.int32) 294 | 295 | for i in range(num_test_file): 296 | cur_test_filename = os.path.join(hdf5_data_dir, test_file_list[i]) 297 | printout(flog, 'Loading test file ' + cur_test_filename) 298 | 299 | cur_data, cur_labels, cur_seg = provider.loadDataFile_with_seg(cur_test_filename) 300 | cur_labels = np.squeeze(cur_labels) 301 | 302 | cur_labels_one_hot = convert_label_to_one_hot(cur_labels) 303 | 304 | num_data = len(cur_labels) 305 | num_batch = num_data // batch_size 306 | 307 | for j in range(num_batch): 308 | begidx = j * batch_size 309 | endidx = (j + 1) * batch_size 310 | feed_dict = { 311 | pointclouds_ph: cur_data[begidx: endidx, ...], 312 | labels_ph: cur_labels[begidx: endidx, ...], 313 | input_label_ph: cur_labels_one_hot[begidx: endidx, ...], 314 | seg_ph: cur_seg[begidx: endidx, ...], 315 | is_training_ph: is_training, 316 | } 317 | 318 | loss_val, label_loss_val, seg_loss_val, per_instance_label_loss_val, \ 319 | per_instance_seg_loss_val, label_pred_val, seg_pred_val, pred_seg_res \ 320 | = sess.run([loss, label_loss, seg_loss, per_instance_label_loss, \ 321 | per_instance_seg_loss, labels_pred, seg_pred, per_instance_seg_pred_res], \ 322 | feed_dict=feed_dict) 323 | 324 | per_instance_part_acc = np.mean(pred_seg_res == cur_seg[begidx: endidx, ...], axis=1) 325 | average_part_acc = np.mean(per_instance_part_acc) 326 | 327 | total_seen += 1 328 | total_loss += loss_val 329 | total_label_loss += label_loss_val 330 | total_seg_loss += seg_loss_val 331 | 332 | per_instance_label_pred = np.argmax(label_pred_val, axis=1) 333 | total_label_acc += np.mean(np.float32(per_instance_label_pred == cur_labels[begidx: endidx, ...])) 334 | total_seg_acc += average_part_acc 335 | 336 | for shape_idx in range(begidx, endidx): 337 | total_seen_per_cat[cur_labels[shape_idx]] += 1 338 | 
total_label_acc_per_cat[cur_labels[shape_idx]] += np.int32(per_instance_label_pred[shape_idx-begidx] == cur_labels[shape_idx]) 339 | total_seg_acc_per_cat[cur_labels[shape_idx]] += per_instance_part_acc[shape_idx - begidx] 340 | 341 | total_loss = total_loss * 1.0 / total_seen 342 | total_label_loss = total_label_loss * 1.0 / total_seen 343 | total_seg_loss = total_seg_loss * 1.0 / total_seen 344 | total_label_acc = total_label_acc * 1.0 / total_seen 345 | total_seg_acc = total_seg_acc * 1.0 / total_seen 346 | 347 | test_loss_sum, test_label_acc_sum, test_label_loss_sum, test_seg_loss_sum, test_seg_acc_sum = sess.run(\ 348 | [total_test_loss_sum_op, label_test_acc_sum_op, label_test_loss_sum_op, seg_test_loss_sum_op, seg_test_acc_sum_op], \ 349 | feed_dict={total_testing_loss_ph: total_loss, label_testing_loss_ph: total_label_loss, \ 350 | seg_testing_loss_ph: total_seg_loss, label_testing_acc_ph: total_label_acc, seg_testing_acc_ph: total_seg_acc}) 351 | 352 | test_writer.add_summary(test_loss_sum, (epoch_num+1) * num_train_file-1) 353 | test_writer.add_summary(test_label_loss_sum, (epoch_num+1) * num_train_file-1) 354 | test_writer.add_summary(test_seg_loss_sum, (epoch_num+1) * num_train_file-1) 355 | test_writer.add_summary(test_label_acc_sum, (epoch_num+1) * num_train_file-1) 356 | test_writer.add_summary(test_seg_acc_sum, (epoch_num+1) * num_train_file-1) 357 | 358 | printout(flog, '\tTesting Total Mean_loss: %f' % total_loss) 359 | printout(flog, '\t\tTesting Label Mean_loss: %f' % total_label_loss) 360 | printout(flog, '\t\tTesting Label Accuracy: %f' % total_label_acc) 361 | printout(flog, '\t\tTesting Seg Mean_loss: %f' % total_seg_loss) 362 | printout(flog, '\t\tTesting Seg Accuracy: %f' % total_seg_acc) 363 | 364 | for cat_idx in range(NUM_CATEGORIES): 365 | if total_seen_per_cat[cat_idx] > 0: 366 | printout(flog, '\n\t\tCategory %s Object Number: %d' % (all_obj_cats[cat_idx][0], total_seen_per_cat[cat_idx])) 367 | printout(flog, '\t\tCategory %s Label Accuracy: %f' % (all_obj_cats[cat_idx][0], total_label_acc_per_cat[cat_idx]/total_seen_per_cat[cat_idx])) 368 | printout(flog, '\t\tCategory %s Seg Accuracy: %f' % (all_obj_cats[cat_idx][0], total_seg_acc_per_cat[cat_idx]/total_seen_per_cat[cat_idx])) 369 | 370 | if not os.path.exists(MODEL_STORAGE_PATH): 371 | os.mkdir(MODEL_STORAGE_PATH) 372 | 373 | for epoch in range(TRAINING_EPOCHES): 374 | printout(flog, '\n<<< Testing on the test dataset ...') 375 | eval_one_epoch(epoch) 376 | 377 | printout(flog, '\n>>> Training for the epoch %d/%d ...' 
% (epoch, TRAINING_EPOCHES)) 378 | 379 | train_file_idx = np.arange(0, len(train_file_list)) 380 | np.random.shuffle(train_file_idx) 381 | 382 | train_one_epoch(train_file_idx, epoch) 383 | 384 | if (epoch+1) % 10 == 0: 385 | cp_filename = saver.save(sess, os.path.join(MODEL_STORAGE_PATH, 'epoch_' + str(epoch+1)+'.ckpt')) 386 | printout(flog, 'Successfully store the checkpoint model into ' + cp_filename) 387 | 388 | flog.flush() 389 | 390 | flog.close() 391 | 392 | if __name__=='__main__': 393 | train() 394 | -------------------------------------------------------------------------------- /provider.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import numpy as np 4 | import h5py 5 | BASE_DIR = os.path.dirname(os.path.abspath(__file__)) 6 | sys.path.append(BASE_DIR) 7 | 8 | # Download dataset for point cloud classification 9 | DATA_DIR = os.path.join(BASE_DIR, 'data') 10 | if not os.path.exists(DATA_DIR): 11 | os.mkdir(DATA_DIR) 12 | if not os.path.exists(os.path.join(DATA_DIR, 'modelnet40_ply_hdf5_2048')): 13 | www = 'https://shapenet.cs.stanford.edu/media/modelnet40_ply_hdf5_2048.zip' 14 | zipfile = os.path.basename(www) 15 | os.system('wget %s; unzip %s' % (www, zipfile)) 16 | os.system('mv %s %s' % (zipfile[:-4], DATA_DIR)) 17 | os.system('rm %s' % (zipfile)) 18 | 19 | 20 | def shuffle_data(data, labels): 21 | """ Shuffle data and labels. 22 | Input: 23 | data: B,N,... numpy array 24 | label: B,... numpy array 25 | Return: 26 | shuffled data, label and shuffle indices 27 | """ 28 | idx = np.arange(len(labels)) 29 | np.random.shuffle(idx) 30 | return data[idx, ...], labels[idx], idx 31 | 32 | 33 | def rotate_point_cloud(batch_data): 34 | """ Randomly rotate the point clouds to augument the dataset 35 | rotation is per shape based along up direction 36 | Input: 37 | BxNx3 array, original batch of point clouds 38 | Return: 39 | BxNx3 array, rotated batch of point clouds 40 | """ 41 | rotated_data = np.zeros(batch_data.shape, dtype=np.float32) 42 | for k in range(batch_data.shape[0]): 43 | rotation_angle = np.random.uniform() * 2 * np.pi 44 | cosval = np.cos(rotation_angle) 45 | sinval = np.sin(rotation_angle) 46 | rotation_matrix = np.array([[cosval, 0, sinval], 47 | [0, 1, 0], 48 | [-sinval, 0, cosval]]) 49 | shape_pc = batch_data[k, ...] 50 | rotated_data[k, ...] = np.dot(shape_pc.reshape((-1, 3)), rotation_matrix) 51 | return rotated_data 52 | 53 | 54 | def rotate_point_cloud_by_angle(batch_data, rotation_angle): 55 | """ Rotate the point cloud along up direction with certain angle. 56 | Input: 57 | BxNx3 array, original batch of point clouds 58 | Return: 59 | BxNx3 array, rotated batch of point clouds 60 | """ 61 | rotated_data = np.zeros(batch_data.shape, dtype=np.float32) 62 | for k in range(batch_data.shape[0]): 63 | #rotation_angle = np.random.uniform() * 2 * np.pi 64 | cosval = np.cos(rotation_angle) 65 | sinval = np.sin(rotation_angle) 66 | rotation_matrix = np.array([[cosval, 0, sinval], 67 | [0, 1, 0], 68 | [-sinval, 0, cosval]]) 69 | shape_pc = batch_data[k, ...] 70 | rotated_data[k, ...] = np.dot(shape_pc.reshape((-1, 3)), rotation_matrix) 71 | return rotated_data 72 | 73 | 74 | def jitter_point_cloud(batch_data, sigma=0.01, clip=0.05): 75 | """ Randomly jitter points. jittering is per point. 
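Gaussian noise with standard deviation sigma is drawn independently for
every coordinate and clipped to [-clip, clip] before being added.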
76 |     Input:
77 |       BxNx3 array, original batch of point clouds
78 |     Return:
79 |       BxNx3 array, jittered batch of point clouds
80 |     """
81 |     B, N, C = batch_data.shape
82 |     assert(clip > 0)
83 |     jittered_data = np.clip(sigma * np.random.randn(B, N, C), -1*clip, clip)
84 |     jittered_data += batch_data
85 |     return jittered_data
86 | 
87 | def getDataFiles(list_filename):
88 |     return [line.rstrip() for line in open(list_filename)]
89 | 
90 | def load_h5(h5_filename):
91 |     f = h5py.File(h5_filename, 'r')  # open read-only
92 |     data = f['data'][:]
93 |     label = f['label'][:]
94 |     return (data, label)
95 | 
96 | def loadDataFile(filename):
97 |     return load_h5(filename)
98 | 
99 | def load_h5_data_label_seg(h5_filename):
100 |     f = h5py.File(h5_filename, 'r')  # open read-only
101 |     data = f['data'][:]
102 |     label = f['label'][:]
103 |     seg = f['pid'][:]
104 |     return (data, label, seg)
105 | 
106 | 
107 | def loadDataFile_with_seg(filename):
108 |     return load_h5_data_label_seg(filename)
109 | 
--------------------------------------------------------------------------------
/sem_seg/README.md:
--------------------------------------------------------------------------------
1 | ## Semantic Segmentation of Indoor Scenes
2 | 
3 | ### Dataset
4 | 
5 | Download the prepared HDF5 data for training:
6 | 
7 |     sh download_data.sh
8 | 
9 | (Optional) Download the 3D indoor parsing dataset (S3DIS) for testing and visualization; version 1.2 of the dataset is used in this work.
10 | 
11 | 
12 | To prepare your own HDF5 data, first download the 3D indoor parsing dataset, then run `python collect_indoor3d_data.py` to re-organize the data and `python gen_indoor3d_h5.py` to generate the HDF5 files.
13 | 
14 | ### Training
15 | 
16 | Once you have downloaded the prepared HDF5 files or generated your own, start training with:
17 | 
18 |     python train.py --log_dir log6 --test_area 6
19 | 
20 | By default a simple model based on vanilla PointNet is used for training, with Area 6 held out as the test set.
21 | 
22 | ### Testing
23 | 
24 | Testing requires downloading the 3D indoor parsing data and preprocessing it with `collect_indoor3d_data.py`.
25 | 
26 | After training, use the `batch_inference.py` script to segment rooms in the test set. In our work we use 6-fold training, i.e. six models: for model 1, Areas 2-6 are the train set and Area 1 the test set; for model 2, Areas 1 and 3-6 are the train set and Area 2 the test set; and so on. Note that the S3DIS dataset paper uses a different 3-fold split, which had not been publicly announced at the time of our work.
27 | 
28 | For example, to test model 6, run:
29 | 
30 |     python batch_inference.py --model_path log6/model.ckpt --dump_dir log6/dump --output_filelist log6/output_filelist.txt --room_data_filelist meta/area6_data_label.txt --visu
31 | 
32 | Some OBJ files will be created in `log6/dump` for prediction visualization.
33 | 
34 | To evaluate overall segmentation accuracy, we evaluate the 6 models on their corresponding test areas and use `eval_iou_accuracy.py` to produce the point classification accuracy and IoU reported in the paper.
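For reference, the per-class IoU reported by `eval_iou_accuracy.py` is the usual intersection-over-union counted over points, TP / (GT + Pred - TP). A minimal NumPy sketch; the `point_iou` helper and its `pred`/`gt` integer label arrays are illustrative names, not part of this repo:

```python
import numpy as np

def point_iou(pred, gt, num_classes=13):
    # Per-class IoU over point labels; mirrors the counting in eval_iou_accuracy.py.
    ious = []
    for c in range(num_classes):
        tp = np.sum((pred == c) & (gt == c))             # true positives for class c
        union = np.sum(gt == c) + np.sum(pred == c) - tp # points labeled c in either array
        ious.append(tp / float(union) if union > 0 else float('nan'))
    return ious
```

`eval_iou_accuracy.py` then averages over all 13 classes to get the mean IoU.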
35 | 
36 | 
37 | 
--------------------------------------------------------------------------------
/sem_seg/batch_inference.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import os
3 | import sys
4 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
5 | ROOT_DIR = os.path.dirname(BASE_DIR)
6 | sys.path.append(BASE_DIR)
7 | from model import *
8 | import indoor3d_util
9 | 
10 | parser = argparse.ArgumentParser()
11 | parser.add_argument('--gpu', type=int, default=0, help='GPU to use [default: GPU 0]')
12 | parser.add_argument('--batch_size', type=int, default=1, help='Batch size during inference [default: 1]')
13 | parser.add_argument('--num_point', type=int, default=4096, help='Point number [default: 4096]')
14 | parser.add_argument('--model_path', required=True, help='model checkpoint file path')
15 | parser.add_argument('--dump_dir', required=True, help='dump folder path')
16 | parser.add_argument('--output_filelist', required=True, help='TXT filelist, each line is an output file for a room')
17 | parser.add_argument('--room_data_filelist', required=True, help='TXT filelist, each line is a test room data label file.')
18 | parser.add_argument('--no_clutter', action='store_true', help='If set, do not count the clutter class')
19 | parser.add_argument('--visu', action='store_true', help='Whether to output OBJ files for prediction visualization.')
20 | FLAGS = parser.parse_args()
21 | 
22 | BATCH_SIZE = FLAGS.batch_size
23 | NUM_POINT = FLAGS.num_point
24 | MODEL_PATH = FLAGS.model_path
25 | GPU_INDEX = FLAGS.gpu
26 | DUMP_DIR = FLAGS.dump_dir
27 | if not os.path.exists(DUMP_DIR): os.mkdir(DUMP_DIR)
28 | LOG_FOUT = open(os.path.join(DUMP_DIR, 'log_evaluate.txt'), 'w')
29 | LOG_FOUT.write(str(FLAGS)+'\n')
30 | ROOM_PATH_LIST = [os.path.join(ROOT_DIR,line.rstrip()) for line in open(FLAGS.room_data_filelist)]
31 | 
32 | NUM_CLASSES = 13
33 | 
34 | def log_string(out_str):
35 |     LOG_FOUT.write(out_str+'\n')
36 |     LOG_FOUT.flush()
37 |     print(out_str)
38 | 
39 | def evaluate():
40 |     is_training = False
41 | 
42 |     with tf.device('/gpu:'+str(GPU_INDEX)):
43 |         pointclouds_pl, labels_pl = placeholder_inputs(BATCH_SIZE, NUM_POINT)
44 |         is_training_pl = tf.placeholder(tf.bool, shape=())
45 | 
46 |         # simple model
47 |         pred = get_model(pointclouds_pl, is_training_pl)
48 |         loss = get_loss(pred, labels_pl)
49 |         pred_softmax = tf.nn.softmax(pred)
50 | 
51 |     # Add ops to save and restore all the variables.
52 |     saver = tf.train.Saver()
53 | 
54 |     # Create a session
55 |     config = tf.ConfigProto()
56 |     config.gpu_options.allow_growth = True
57 |     config.allow_soft_placement = True
58 |     config.log_device_placement = True
59 |     sess = tf.Session(config=config)
60 | 
61 |     # Restore variables from disk.
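# (MODEL_PATH is a checkpoint written by sem_seg/train.py, e.g. log6/model.ckpt as in the README.)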
62 | saver.restore(sess, MODEL_PATH) 63 | log_string("Model restored.") 64 | 65 | ops = {'pointclouds_pl': pointclouds_pl, 66 | 'labels_pl': labels_pl, 67 | 'is_training_pl': is_training_pl, 68 | 'pred': pred, 69 | 'pred_softmax': pred_softmax, 70 | 'loss': loss} 71 | 72 | total_correct = 0 73 | total_seen = 0 74 | fout_out_filelist = open(FLAGS.output_filelist, 'w') 75 | for room_path in ROOM_PATH_LIST: 76 | out_data_label_filename = os.path.basename(room_path)[:-4] + '_pred.txt' 77 | out_data_label_filename = os.path.join(DUMP_DIR, out_data_label_filename) 78 | out_gt_label_filename = os.path.basename(room_path)[:-4] + '_gt.txt' 79 | out_gt_label_filename = os.path.join(DUMP_DIR, out_gt_label_filename) 80 | print(room_path, out_data_label_filename) 81 | a, b = eval_one_epoch(sess, ops, room_path, out_data_label_filename, out_gt_label_filename) 82 | total_correct += a 83 | total_seen += b 84 | fout_out_filelist.write(out_data_label_filename+'\n') 85 | fout_out_filelist.close() 86 | log_string('all room eval accuracy: %f'% (total_correct / float(total_seen))) 87 | 88 | def eval_one_epoch(sess, ops, room_path, out_data_label_filename, out_gt_label_filename): 89 | error_cnt = 0 90 | is_training = False 91 | total_correct = 0 92 | total_seen = 0 93 | loss_sum = 0 94 | total_seen_class = [0 for _ in range(NUM_CLASSES)] 95 | total_correct_class = [0 for _ in range(NUM_CLASSES)] 96 | if FLAGS.visu: 97 | fout = open(os.path.join(DUMP_DIR, os.path.basename(room_path)[:-4]+'_pred.obj'), 'w') 98 | fout_gt = open(os.path.join(DUMP_DIR, os.path.basename(room_path)[:-4]+'_gt.obj'), 'w') 99 | fout_data_label = open(out_data_label_filename, 'w') 100 | fout_gt_label = open(out_gt_label_filename, 'w') 101 | 102 | current_data, current_label = indoor3d_util.room2blocks_wrapper_normalized(room_path, NUM_POINT) 103 | current_data = current_data[:,0:NUM_POINT,:] 104 | current_label = np.squeeze(current_label) 105 | # Get room dimension.. 
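# The raw room file is reloaded only to recover the room's max XYZ extent; the normalized
# coordinates in channels 6-8 are scaled back up by these maxima below, so predictions are
# written out in world coordinates.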
106 | data_label = np.load(room_path) 107 | data = data_label[:,0:6] 108 | max_room_x = max(data[:,0]) 109 | max_room_y = max(data[:,1]) 110 | max_room_z = max(data[:,2]) 111 | 112 | file_size = current_data.shape[0] 113 | num_batches = file_size // BATCH_SIZE 114 | print(file_size) 115 | 116 | 117 | for batch_idx in range(num_batches): 118 | start_idx = batch_idx * BATCH_SIZE 119 | end_idx = (batch_idx+1) * BATCH_SIZE 120 | cur_batch_size = end_idx - start_idx 121 | 122 | feed_dict = {ops['pointclouds_pl']: current_data[start_idx:end_idx, :, :], 123 | ops['labels_pl']: current_label[start_idx:end_idx], 124 | ops['is_training_pl']: is_training} 125 | loss_val, pred_val = sess.run([ops['loss'], ops['pred_softmax']], 126 | feed_dict=feed_dict) 127 | 128 | if FLAGS.no_clutter: 129 | pred_label = np.argmax(pred_val[:,:,0:12], 2) # BxN 130 | else: 131 | pred_label = np.argmax(pred_val, 2) # BxN 132 | # Save prediction labels to OBJ file 133 | for b in range(BATCH_SIZE): 134 | pts = current_data[start_idx+b, :, :] 135 | l = current_label[start_idx+b,:] 136 | pts[:,6] *= max_room_x 137 | pts[:,7] *= max_room_y 138 | pts[:,8] *= max_room_z 139 | pts[:,3:6] *= 255.0 140 | pred = pred_label[b, :] 141 | for i in range(NUM_POINT): 142 | color = indoor3d_util.g_label2color[pred[i]] 143 | color_gt = indoor3d_util.g_label2color[current_label[start_idx+b, i]] 144 | if FLAGS.visu: 145 | fout.write('v %f %f %f %d %d %d\n' % (pts[i,6], pts[i,7], pts[i,8], color[0], color[1], color[2])) 146 | fout_gt.write('v %f %f %f %d %d %d\n' % (pts[i,6], pts[i,7], pts[i,8], color_gt[0], color_gt[1], color_gt[2])) 147 | fout_data_label.write('%f %f %f %d %d %d %f %d\n' % (pts[i,6], pts[i,7], pts[i,8], pts[i,3], pts[i,4], pts[i,5], pred_val[b,i,pred[i]], pred[i])) 148 | fout_gt_label.write('%d\n' % (l[i])) 149 | correct = np.sum(pred_label == current_label[start_idx:end_idx,:]) 150 | total_correct += correct 151 | total_seen += (cur_batch_size*NUM_POINT) 152 | loss_sum += (loss_val*BATCH_SIZE) 153 | for i in range(start_idx, end_idx): 154 | for j in range(NUM_POINT): 155 | l = current_label[i, j] 156 | total_seen_class[l] += 1 157 | total_correct_class[l] += (pred_label[i-start_idx, j] == l) 158 | 159 | log_string('eval mean loss: %f' % (loss_sum / float(total_seen/NUM_POINT))) 160 | log_string('eval accuracy: %f'% (total_correct / float(total_seen))) 161 | fout_data_label.close() 162 | fout_gt_label.close() 163 | if FLAGS.visu: 164 | fout.close() 165 | fout_gt.close() 166 | return total_correct, total_seen 167 | 168 | 169 | if __name__=='__main__': 170 | with tf.Graph().as_default(): 171 | evaluate() 172 | LOG_FOUT.close() 173 | -------------------------------------------------------------------------------- /sem_seg/collect_indoor3d_data.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | BASE_DIR = os.path.dirname(os.path.abspath(__file__)) 4 | ROOT_DIR = os.path.dirname(BASE_DIR) 5 | sys.path.append(BASE_DIR) 6 | import indoor3d_util 7 | 8 | anno_paths = [line.rstrip() for line in open(os.path.join(BASE_DIR, 'meta/anno_paths.txt'))] 9 | anno_paths = [os.path.join(indoor3d_util.DATA_PATH, p) for p in anno_paths] 10 | 11 | output_folder = os.path.join(ROOT_DIR, 'data/stanford_indoor3d') 12 | if not os.path.exists(output_folder): 13 | os.mkdir(output_folder) 14 | 15 | # Note: there is an extra character in the v1.2 data in Area_5/hallway_6. It's fixed manually. 
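# Each annotation folder becomes one Area_X_room_Y.npy holding an N x 7 XYZRGBL array,
# with the room shifted so its most negative point sits at the origin
# (see indoor3d_util.collect_point_label).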
16 | for anno_path in anno_paths:
17 |     print(anno_path)
18 |     try:
19 |         elements = anno_path.split('/')
20 |         out_filename = elements[-3]+'_'+elements[-2]+'.npy' # Area_1_hallway_1.npy
21 |         indoor3d_util.collect_point_label(anno_path, os.path.join(output_folder, out_filename), 'numpy')
22 |     except Exception: # keep going past malformed rooms without swallowing KeyboardInterrupt
23 |         print(anno_path, 'ERROR!!')
24 | 
--------------------------------------------------------------------------------
/sem_seg/download_data.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | 
3 | # Download HDF5 for indoor 3d semantic segmentation (around 1.6GB)
4 | wget https://shapenet.cs.stanford.edu/media/indoor3d_sem_seg_hdf5_data.zip
5 | unzip indoor3d_sem_seg_hdf5_data.zip
6 | rm indoor3d_sem_seg_hdf5_data.zip
7 | 
8 | 
--------------------------------------------------------------------------------
/sem_seg/eval_iou_accuracy.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | 
3 | pred_data_label_filenames = [line.rstrip() for line in open('all_pred_data_label_filelist.txt')]
4 | gt_label_filenames = [f[:-len('_pred.txt')] + '_gt.txt' for f in pred_data_label_filenames] # slice off the suffix; rstrip('_pred\.txt') would strip a character set, not a suffix
5 | num_room = len(gt_label_filenames)
6 | 
7 | 
8 | gt_classes = [0 for _ in range(13)]
9 | positive_classes = [0 for _ in range(13)]
10 | true_positive_classes = [0 for _ in range(13)]
11 | for i in range(num_room):
12 |     print(i)
13 |     data_label = np.loadtxt(pred_data_label_filenames[i])
14 |     pred_label = data_label[:,-1]
15 |     gt_label = np.loadtxt(gt_label_filenames[i])
16 |     print(gt_label.shape)
17 |     for j in range(gt_label.shape[0]): # range, not xrange, so this also runs under Python 3
18 |         gt_l = int(gt_label[j])
19 |         pred_l = int(pred_label[j])
20 |         gt_classes[gt_l] += 1
21 |         positive_classes[pred_l] += 1
22 |         true_positive_classes[gt_l] += int(gt_l==pred_l)
23 | 
24 | 
25 | print(gt_classes)
26 | print(positive_classes)
27 | print(true_positive_classes)
28 | 
29 | 
30 | print('Overall accuracy: {0}'.format(sum(true_positive_classes)/float(sum(positive_classes))))
31 | 
32 | print('IoU:')
33 | iou_list = []
34 | for i in range(13):
35 |     iou = true_positive_classes[i]/float(gt_classes[i]+positive_classes[i]-true_positive_classes[i])
36 |     print(iou)
37 |     iou_list.append(iou)
38 | 
39 | print(sum(iou_list)/13.0)
40 | 
--------------------------------------------------------------------------------
/sem_seg/gen_indoor3d_h5.py:
--------------------------------------------------------------------------------
1 | import os
2 | import numpy as np
3 | import sys
4 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
5 | ROOT_DIR = os.path.dirname(BASE_DIR)
6 | sys.path.append(BASE_DIR)
7 | sys.path.append(os.path.join(ROOT_DIR, 'utils'))
8 | import data_prep_util
9 | import indoor3d_util
10 | 
11 | # Constants
12 | data_dir = os.path.join(ROOT_DIR, 'data')
13 | indoor3d_data_dir = os.path.join(data_dir, 'stanford_indoor3d')
14 | NUM_POINT = 4096
15 | H5_BATCH_SIZE = 1000
16 | data_dim = [NUM_POINT, 9]
17 | label_dim = [NUM_POINT]
18 | data_dtype = 'float32'
19 | label_dtype = 'uint8'
20 | 
21 | # Set paths
22 | filelist = os.path.join(BASE_DIR, 'meta/all_data_label.txt')
23 | data_label_files = [os.path.join(indoor3d_data_dir, line.rstrip()) for line in open(filelist)]
24 | output_dir = os.path.join(data_dir, 'indoor3d_sem_seg_hdf5_data')
25 | if not os.path.exists(output_dir):
26 |     os.mkdir(output_dir)
27 | output_filename_prefix = os.path.join(output_dir, 'ply_data_all')
28 | output_room_filelist = os.path.join(output_dir, 'room_filelist.txt')
29 | fout_room =
open(output_room_filelist, 'w') 30 | 31 | # -------------------------------------- 32 | # ----- BATCH WRITE TO HDF5 ----- 33 | # -------------------------------------- 34 | batch_data_dim = [H5_BATCH_SIZE] + data_dim 35 | batch_label_dim = [H5_BATCH_SIZE] + label_dim 36 | h5_batch_data = np.zeros(batch_data_dim, dtype = np.float32) 37 | h5_batch_label = np.zeros(batch_label_dim, dtype = np.uint8) 38 | buffer_size = 0 # state: record how many samples are currently in buffer 39 | h5_index = 0 # state: the next h5 file to save 40 | 41 | def insert_batch(data, label, last_batch=False): 42 | global h5_batch_data, h5_batch_label 43 | global buffer_size, h5_index 44 | data_size = data.shape[0] 45 | # If there is enough space, just insert 46 | if buffer_size + data_size <= h5_batch_data.shape[0]: 47 | h5_batch_data[buffer_size:buffer_size+data_size, ...] = data 48 | h5_batch_label[buffer_size:buffer_size+data_size] = label 49 | buffer_size += data_size 50 | else: # not enough space 51 | capacity = h5_batch_data.shape[0] - buffer_size 52 | assert(capacity>=0) 53 | if capacity > 0: 54 | h5_batch_data[buffer_size:buffer_size+capacity, ...] = data[0:capacity, ...] 55 | h5_batch_label[buffer_size:buffer_size+capacity, ...] = label[0:capacity, ...] 56 | # Save batch data and label to h5 file, reset buffer_size 57 | h5_filename = output_filename_prefix + '_' + str(h5_index) + '.h5' 58 | data_prep_util.save_h5(h5_filename, h5_batch_data, h5_batch_label, data_dtype, label_dtype) 59 | print('Stored {0} with size {1}'.format(h5_filename, h5_batch_data.shape[0])) 60 | h5_index += 1 61 | buffer_size = 0 62 | # recursive call 63 | insert_batch(data[capacity:, ...], label[capacity:, ...], last_batch) 64 | if last_batch and buffer_size > 0: 65 | h5_filename = output_filename_prefix + '_' + str(h5_index) + '.h5' 66 | data_prep_util.save_h5(h5_filename, h5_batch_data[0:buffer_size, ...], h5_batch_label[0:buffer_size, ...], data_dtype, label_dtype) 67 | print('Stored {0} with size {1}'.format(h5_filename, buffer_size)) 68 | h5_index += 1 69 | buffer_size = 0 70 | return 71 | 72 | 73 | sample_cnt = 0 74 | for i, data_label_filename in enumerate(data_label_files): 75 | print(data_label_filename) 76 | data, label = indoor3d_util.room2blocks_wrapper_normalized(data_label_filename, NUM_POINT, block_size=1.0, stride=0.5, 77 | random_sample=False, sample_num=None) 78 | print('{0}, {1}'.format(data.shape, label.shape)) 79 | for _ in range(data.shape[0]): 80 | fout_room.write(os.path.basename(data_label_filename)[0:-4]+'\n') 81 | 82 | sample_cnt += data.shape[0] 83 | insert_batch(data, label, i == len(data_label_files)-1) 84 | 85 | fout_room.close() 86 | print("Total samples: {0}".format(sample_cnt)) 87 | -------------------------------------------------------------------------------- /sem_seg/indoor3d_util.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import glob 3 | import os 4 | import sys 5 | BASE_DIR = os.path.dirname(os.path.abspath(__file__)) 6 | ROOT_DIR = os.path.dirname(BASE_DIR) 7 | sys.path.append(BASE_DIR) 8 | 9 | # ----------------------------------------------------------------------------- 10 | # CONSTANTS 11 | # ----------------------------------------------------------------------------- 12 | 13 | DATA_PATH = os.path.join(ROOT_DIR, 'data', 'Stanford3dDataset_v1.2_Aligned_Version') 14 | g_classes = [x.rstrip() for x in open(os.path.join(BASE_DIR, 'meta/class_names.txt'))] 15 | g_class2label = {cls: i for i,cls in enumerate(g_classes)} 16 | 
g_class2color = {'ceiling': [0,255,0], 17 | 'floor': [0,0,255], 18 | 'wall': [0,255,255], 19 | 'beam': [255,255,0], 20 | 'column': [255,0,255], 21 | 'window': [100,100,255], 22 | 'door': [200,200,100], 23 | 'table': [170,120,200], 24 | 'chair': [255,0,0], 25 | 'sofa': [200,100,100], 26 | 'bookcase': [10,200,100], 27 | 'board': [200,200,200], 28 | 'clutter': [50,50,50]} 29 | g_easy_view_labels = [7,8,9,10,11,1] 30 | g_label2color = {g_classes.index(cls): g_class2color[cls] for cls in g_classes} 31 | 32 | 33 | # ----------------------------------------------------------------------------- 34 | # CONVERT ORIGINAL DATA TO OUR DATA_LABEL FILES 35 | # ----------------------------------------------------------------------------- 36 | 37 | def collect_point_label(anno_path, out_filename, file_format='txt'): 38 | """ Convert original dataset files to data_label file (each line is XYZRGBL). 39 | We aggregated all the points from each instance in the room. 40 | 41 | Args: 42 | anno_path: path to annotations. e.g. Area_1/office_2/Annotations/ 43 | out_filename: path to save collected points and labels (each line is XYZRGBL) 44 | file_format: txt or numpy, determines what file format to save. 45 | Returns: 46 | None 47 | Note: 48 | the points are shifted before save, the most negative point is now at origin. 49 | """ 50 | points_list = [] 51 | 52 | for f in glob.glob(os.path.join(anno_path, '*.txt')): 53 | cls = os.path.basename(f).split('_')[0] 54 | if cls not in g_classes: # note: in some room there is 'staris' class.. 55 | cls = 'clutter' 56 | points = np.loadtxt(f) 57 | labels = np.ones((points.shape[0],1)) * g_class2label[cls] 58 | points_list.append(np.concatenate([points, labels], 1)) # Nx7 59 | 60 | data_label = np.concatenate(points_list, 0) 61 | xyz_min = np.amin(data_label, axis=0)[0:3] 62 | data_label[:, 0:3] -= xyz_min 63 | 64 | if file_format=='txt': 65 | fout = open(out_filename, 'w') 66 | for i in range(data_label.shape[0]): 67 | fout.write('%f %f %f %d %d %d %d\n' % \ 68 | (data_label[i,0], data_label[i,1], data_label[i,2], 69 | data_label[i,3], data_label[i,4], data_label[i,5], 70 | data_label[i,6])) 71 | fout.close() 72 | elif file_format=='numpy': 73 | np.save(out_filename, data_label) 74 | else: 75 | print('ERROR!! Unknown file format: %s, please use txt or numpy.' 
% \ 76 | (file_format)) 77 | exit() 78 | 79 | def point_label_to_obj(input_filename, out_filename, label_color=True, easy_view=False, no_wall=False): 80 | """ For visualization of a room from data_label file, 81 | input_filename: each line is X Y Z R G B L 82 | out_filename: OBJ filename, 83 | visualize input file by coloring point with label color 84 | easy_view: only visualize furnitures and floor 85 | """ 86 | data_label = np.loadtxt(input_filename) 87 | data = data_label[:, 0:6] 88 | label = data_label[:, -1].astype(int) 89 | fout = open(out_filename, 'w') 90 | for i in range(data.shape[0]): 91 | color = g_label2color[label[i]] 92 | if easy_view and (label[i] not in g_easy_view_labels): 93 | continue 94 | if no_wall and ((label[i] == 2) or (label[i]==0)): 95 | continue 96 | if label_color: 97 | fout.write('v %f %f %f %d %d %d\n' % \ 98 | (data[i,0], data[i,1], data[i,2], color[0], color[1], color[2])) 99 | else: 100 | fout.write('v %f %f %f %d %d %d\n' % \ 101 | (data[i,0], data[i,1], data[i,2], data[i,3], data[i,4], data[i,5])) 102 | fout.close() 103 | 104 | 105 | 106 | # ----------------------------------------------------------------------------- 107 | # PREPARE BLOCK DATA FOR DEEPNETS TRAINING/TESTING 108 | # ----------------------------------------------------------------------------- 109 | 110 | def sample_data(data, num_sample): 111 | """ data is in N x ... 112 | we want to keep num_samplexC of them. 113 | if N > num_sample, we will randomly keep num_sample of them. 114 | if N < num_sample, we will randomly duplicate samples. 115 | """ 116 | N = data.shape[0] 117 | if (N == num_sample): 118 | return data, range(N) 119 | elif (N > num_sample): 120 | sample = np.random.choice(N, num_sample) 121 | return data[sample, ...], sample 122 | else: 123 | sample = np.random.choice(N, num_sample-N) 124 | dup_data = data[sample, ...] 125 | return np.concatenate([data, dup_data], 0), range(N)+list(sample) 126 | 127 | def sample_data_label(data, label, num_sample): 128 | new_data, sample_indices = sample_data(data, num_sample) 129 | new_label = label[sample_indices] 130 | return new_data, new_label 131 | 132 | def room2blocks(data, label, num_point, block_size=1.0, stride=1.0, 133 | random_sample=False, sample_num=None, sample_aug=1): 134 | """ Prepare block training data. 135 | Args: 136 | data: N x 6 numpy array, 012 are XYZ in meters, 345 are RGB in [0,1] 137 | assumes the data is shifted (min point is origin) and aligned 138 | (aligned with XYZ axis) 139 | label: N size uint8 numpy array from 0-12 140 | num_point: int, how many points to sample in each block 141 | block_size: float, physical size of the block in meters 142 | stride: float, stride for block sweeping 143 | random_sample: bool, if True, we will randomly sample blocks in the room 144 | sample_num: int, if random sample, how many blocks to sample 145 | [default: room area] 146 | sample_aug: if random sample, how much aug 147 | Returns: 148 | block_datas: K x num_point x 6 np array of XYZRGB, RGB is in [0,1] 149 | block_labels: K x num_point x 1 np array of uint8 labels 150 | 151 | TODO: for this version, blocking is in fixed, non-overlapping pattern. 
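(Note: the sweep honors any stride <= block_size; gen_indoor3d_h5.py calls this
with block_size=1.0 and stride=0.5, giving 50%-overlapping blocks.)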
152 | """ 153 | assert(stride<=block_size) 154 | 155 | limit = np.amax(data, 0)[0:3] 156 | 157 | # Get the corner location for our sampling blocks 158 | xbeg_list = [] 159 | ybeg_list = [] 160 | if not random_sample: 161 | num_block_x = int(np.ceil((limit[0] - block_size) / stride)) + 1 162 | num_block_y = int(np.ceil((limit[1] - block_size) / stride)) + 1 163 | for i in range(num_block_x): 164 | for j in range(num_block_y): 165 | xbeg_list.append(i*stride) 166 | ybeg_list.append(j*stride) 167 | else: 168 | num_block_x = int(np.ceil(limit[0] / block_size)) 169 | num_block_y = int(np.ceil(limit[1] / block_size)) 170 | if sample_num is None: 171 | sample_num = num_block_x * num_block_y * sample_aug 172 | for _ in range(sample_num): 173 | xbeg = np.random.uniform(-block_size, limit[0]) 174 | ybeg = np.random.uniform(-block_size, limit[1]) 175 | xbeg_list.append(xbeg) 176 | ybeg_list.append(ybeg) 177 | 178 | # Collect blocks 179 | block_data_list = [] 180 | block_label_list = [] 181 | idx = 0 182 | for idx in range(len(xbeg_list)): 183 | xbeg = xbeg_list[idx] 184 | ybeg = ybeg_list[idx] 185 | xcond = (data[:,0]<=xbeg+block_size) & (data[:,0]>=xbeg) 186 | ycond = (data[:,1]<=ybeg+block_size) & (data[:,1]>=ybeg) 187 | cond = xcond & ycond 188 | if np.sum(cond) < 100: # discard block if there are less than 100 pts. 189 | continue 190 | 191 | block_data = data[cond, :] 192 | block_label = label[cond] 193 | 194 | # randomly subsample data 195 | block_data_sampled, block_label_sampled = \ 196 | sample_data_label(block_data, block_label, num_point) 197 | block_data_list.append(np.expand_dims(block_data_sampled, 0)) 198 | block_label_list.append(np.expand_dims(block_label_sampled, 0)) 199 | 200 | return np.concatenate(block_data_list, 0), \ 201 | np.concatenate(block_label_list, 0) 202 | 203 | 204 | def room2blocks_plus(data_label, num_point, block_size, stride, 205 | random_sample, sample_num, sample_aug): 206 | """ room2block with input filename and RGB preprocessing. 207 | """ 208 | data = data_label[:,0:6] 209 | data[:,3:6] /= 255.0 210 | label = data_label[:,-1].astype(np.uint8) 211 | 212 | return room2blocks(data, label, num_point, block_size, stride, 213 | random_sample, sample_num, sample_aug) 214 | 215 | def room2blocks_wrapper(data_label_filename, num_point, block_size=1.0, stride=1.0, 216 | random_sample=False, sample_num=None, sample_aug=1): 217 | if data_label_filename[-3:] == 'txt': 218 | data_label = np.loadtxt(data_label_filename) 219 | elif data_label_filename[-3:] == 'npy': 220 | data_label = np.load(data_label_filename) 221 | else: 222 | print('Unknown file type! exiting.') 223 | exit() 224 | return room2blocks_plus(data_label, num_point, block_size, stride, 225 | random_sample, sample_num, sample_aug) 226 | 227 | def room2blocks_plus_normalized(data_label, num_point, block_size, stride, 228 | random_sample, sample_num, sample_aug): 229 | """ room2block, with input filename and RGB preprocessing. 
230 | for each block centralize XYZ, add normalized XYZ as 678 channels 231 | """ 232 | data = data_label[:,0:6] 233 | data[:,3:6] /= 255.0 234 | label = data_label[:,-1].astype(np.uint8) 235 | max_room_x = max(data[:,0]) 236 | max_room_y = max(data[:,1]) 237 | max_room_z = max(data[:,2]) 238 | 239 | data_batch, label_batch = room2blocks(data, label, num_point, block_size, stride, 240 | random_sample, sample_num, sample_aug) 241 | new_data_batch = np.zeros((data_batch.shape[0], num_point, 9)) 242 | for b in range(data_batch.shape[0]): 243 | new_data_batch[b, :, 6] = data_batch[b, :, 0]/max_room_x 244 | new_data_batch[b, :, 7] = data_batch[b, :, 1]/max_room_y 245 | new_data_batch[b, :, 8] = data_batch[b, :, 2]/max_room_z 246 | minx = min(data_batch[b, :, 0]) 247 | miny = min(data_batch[b, :, 1]) 248 | data_batch[b, :, 0] -= (minx+block_size/2) 249 | data_batch[b, :, 1] -= (miny+block_size/2) 250 | new_data_batch[:, :, 0:6] = data_batch 251 | return new_data_batch, label_batch 252 | 253 | 254 | def room2blocks_wrapper_normalized(data_label_filename, num_point, block_size=1.0, stride=1.0, 255 | random_sample=False, sample_num=None, sample_aug=1): 256 | if data_label_filename[-3:] == 'txt': 257 | data_label = np.loadtxt(data_label_filename) 258 | elif data_label_filename[-3:] == 'npy': 259 | data_label = np.load(data_label_filename) 260 | else: 261 | print('Unknown file type! exiting.') 262 | exit() 263 | return room2blocks_plus_normalized(data_label, num_point, block_size, stride, 264 | random_sample, sample_num, sample_aug) 265 | 266 | def room2samples(data, label, sample_num_point): 267 | """ Prepare whole room samples. 268 | 269 | Args: 270 | data: N x 6 numpy array, 012 are XYZ in meters, 345 are RGB in [0,1] 271 | assumes the data is shifted (min point is origin) and 272 | aligned (aligned with XYZ axis) 273 | label: N size uint8 numpy array from 0-12 274 | sample_num_point: int, how many points to sample in each sample 275 | Returns: 276 | sample_datas: K x sample_num_point x 9 277 | numpy array of XYZRGBX'Y'Z', RGB is in [0,1] 278 | sample_labels: K x sample_num_point x 1 np array of uint8 labels 279 | """ 280 | N = data.shape[0] 281 | order = np.arange(N) 282 | np.random.shuffle(order) 283 | data = data[order, :] 284 | label = label[order] 285 | 286 | batch_num = int(np.ceil(N / float(sample_num_point))) 287 | sample_datas = np.zeros((batch_num, sample_num_point, 6)) 288 | sample_labels = np.zeros((batch_num, sample_num_point, 1)) 289 | 290 | for i in range(batch_num): 291 | beg_idx = i*sample_num_point 292 | end_idx = min((i+1)*sample_num_point, N) 293 | num = end_idx - beg_idx 294 | sample_datas[i,0:num,:] = data[beg_idx:end_idx, :] 295 | sample_labels[i,0:num,0] = label[beg_idx:end_idx] 296 | if num < sample_num_point: 297 | makeup_indices = np.random.choice(N, sample_num_point - num) 298 | sample_datas[i,num:,:] = data[makeup_indices, :] 299 | sample_labels[i,num:,0] = label[makeup_indices] 300 | return sample_datas, sample_labels 301 | 302 | def room2samples_plus_normalized(data_label, num_point): 303 | """ room2sample, with input filename and RGB preprocessing. 
304 | for each block centralize XYZ, add normalized XYZ as 678 channels 305 | """ 306 | data = data_label[:,0:6] 307 | data[:,3:6] /= 255.0 308 | label = data_label[:,-1].astype(np.uint8) 309 | max_room_x = max(data[:,0]) 310 | max_room_y = max(data[:,1]) 311 | max_room_z = max(data[:,2]) 312 | #print(max_room_x, max_room_y, max_room_z) 313 | 314 | data_batch, label_batch = room2samples(data, label, num_point) 315 | new_data_batch = np.zeros((data_batch.shape[0], num_point, 9)) 316 | for b in range(data_batch.shape[0]): 317 | new_data_batch[b, :, 6] = data_batch[b, :, 0]/max_room_x 318 | new_data_batch[b, :, 7] = data_batch[b, :, 1]/max_room_y 319 | new_data_batch[b, :, 8] = data_batch[b, :, 2]/max_room_z 320 | #minx = min(data_batch[b, :, 0]) 321 | #miny = min(data_batch[b, :, 1]) 322 | #data_batch[b, :, 0] -= (minx+block_size/2) 323 | #data_batch[b, :, 1] -= (miny+block_size/2) 324 | new_data_batch[:, :, 0:6] = data_batch 325 | return new_data_batch, label_batch 326 | 327 | 328 | def room2samples_wrapper_normalized(data_label_filename, num_point): 329 | if data_label_filename[-3:] == 'txt': 330 | data_label = np.loadtxt(data_label_filename) 331 | elif data_label_filename[-3:] == 'npy': 332 | data_label = np.load(data_label_filename) 333 | else: 334 | print('Unknown file type! exiting.') 335 | exit() 336 | return room2samples_plus_normalized(data_label, num_point) 337 | 338 | 339 | # ----------------------------------------------------------------------------- 340 | # EXTRACT INSTANCE BBOX FROM ORIGINAL DATA (for detection evaluation) 341 | # ----------------------------------------------------------------------------- 342 | 343 | def collect_bounding_box(anno_path, out_filename): 344 | """ Compute bounding boxes from each instance in original dataset files on 345 | one room. **We assume the bbox is aligned with XYZ coordinate.** 346 | 347 | Args: 348 | anno_path: path to annotations. e.g. Area_1/office_2/Annotations/ 349 | out_filename: path to save instance bounding boxes for that room. 350 | each line is x1 y1 z1 x2 y2 z2 label, 351 | where (x1,y1,z1) is the point on the diagonal closer to origin 352 | Returns: 353 | None 354 | Note: 355 | room points are shifted, the most negative point is now at origin. 356 | """ 357 | bbox_label_list = [] 358 | 359 | for f in glob.glob(os.path.join(anno_path, '*.txt')): 360 | cls = os.path.basename(f).split('_')[0] 361 | if cls not in g_classes: # note: in some room there is 'staris' class.. 362 | cls = 'clutter' 363 | points = np.loadtxt(f) 364 | label = g_class2label[cls] 365 | # Compute tightest axis aligned bounding box 366 | xyz_min = np.amin(points[:, 0:3], axis=0) 367 | xyz_max = np.amax(points[:, 0:3], axis=0) 368 | ins_bbox_label = np.expand_dims( 369 | np.concatenate([xyz_min, xyz_max, np.array([label])], 0), 0) 370 | bbox_label_list.append(ins_bbox_label) 371 | 372 | bbox_label = np.concatenate(bbox_label_list, 0) 373 | room_xyz_min = np.amin(bbox_label[:, 0:3], axis=0) 374 | bbox_label[:, 0:3] -= room_xyz_min 375 | bbox_label[:, 3:6] -= room_xyz_min 376 | 377 | fout = open(out_filename, 'w') 378 | for i in range(bbox_label.shape[0]): 379 | fout.write('%f %f %f %f %f %f %d\n' % \ 380 | (bbox_label[i,0], bbox_label[i,1], bbox_label[i,2], 381 | bbox_label[i,3], bbox_label[i,4], bbox_label[i,5], 382 | bbox_label[i,6])) 383 | fout.close() 384 | 385 | def bbox_label_to_obj(input_filename, out_filename_prefix, easy_view=False): 386 | """ Visualization of bounding boxes. 
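Each kept box is written as its own OBJ/MTL file pair named by class and instance
index; bbox_label_to_obj_room below merges all boxes of a room into a single pair.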
387 | 388 | Args: 389 | input_filename: each line is x1 y1 z1 x2 y2 z2 label 390 | out_filename_prefix: OBJ filename prefix, 391 | visualize object by g_label2color 392 | easy_view: if True, only visualize furniture and floor 393 | Returns: 394 | output a list of OBJ file and MTL files with the same prefix 395 | """ 396 | bbox_label = np.loadtxt(input_filename) 397 | bbox = bbox_label[:, 0:6] 398 | label = bbox_label[:, -1].astype(int) 399 | v_cnt = 0 # count vertex 400 | ins_cnt = 0 # count instance 401 | for i in range(bbox.shape[0]): 402 | if easy_view and (label[i] not in g_easy_view_labels): 403 | continue 404 | obj_filename = out_filename_prefix+'_'+g_classes[label[i]]+'_'+str(ins_cnt)+'.obj' 405 | mtl_filename = out_filename_prefix+'_'+g_classes[label[i]]+'_'+str(ins_cnt)+'.mtl' 406 | fout_obj = open(obj_filename, 'w') 407 | fout_mtl = open(mtl_filename, 'w') 408 | fout_obj.write('mtllib %s\n' % (os.path.basename(mtl_filename))) 409 | 410 | length = bbox[i, 3:6] - bbox[i, 0:3] 411 | a = length[0] 412 | b = length[1] 413 | c = length[2] 414 | x = bbox[i, 0] 415 | y = bbox[i, 1] 416 | z = bbox[i, 2] 417 | color = np.array(g_label2color[label[i]], dtype=float) / 255.0 418 | 419 | material = 'material%d' % (ins_cnt) 420 | fout_obj.write('usemtl %s\n' % (material)) 421 | fout_obj.write('v %f %f %f\n' % (x,y,z+c)) 422 | fout_obj.write('v %f %f %f\n' % (x,y+b,z+c)) 423 | fout_obj.write('v %f %f %f\n' % (x+a,y+b,z+c)) 424 | fout_obj.write('v %f %f %f\n' % (x+a,y,z+c)) 425 | fout_obj.write('v %f %f %f\n' % (x,y,z)) 426 | fout_obj.write('v %f %f %f\n' % (x,y+b,z)) 427 | fout_obj.write('v %f %f %f\n' % (x+a,y+b,z)) 428 | fout_obj.write('v %f %f %f\n' % (x+a,y,z)) 429 | fout_obj.write('g default\n') 430 | v_cnt = 0 # for individual box 431 | fout_obj.write('f %d %d %d %d\n' % (4+v_cnt, 3+v_cnt, 2+v_cnt, 1+v_cnt)) 432 | fout_obj.write('f %d %d %d %d\n' % (1+v_cnt, 2+v_cnt, 6+v_cnt, 5+v_cnt)) 433 | fout_obj.write('f %d %d %d %d\n' % (7+v_cnt, 6+v_cnt, 2+v_cnt, 3+v_cnt)) 434 | fout_obj.write('f %d %d %d %d\n' % (4+v_cnt, 8+v_cnt, 7+v_cnt, 3+v_cnt)) 435 | fout_obj.write('f %d %d %d %d\n' % (5+v_cnt, 8+v_cnt, 4+v_cnt, 1+v_cnt)) 436 | fout_obj.write('f %d %d %d %d\n' % (5+v_cnt, 6+v_cnt, 7+v_cnt, 8+v_cnt)) 437 | fout_obj.write('\n') 438 | 439 | fout_mtl.write('newmtl %s\n' % (material)) 440 | fout_mtl.write('Kd %f %f %f\n' % (color[0], color[1], color[2])) 441 | fout_mtl.write('\n') 442 | fout_obj.close() 443 | fout_mtl.close() 444 | 445 | v_cnt += 8 446 | ins_cnt += 1 447 | 448 | def bbox_label_to_obj_room(input_filename, out_filename_prefix, easy_view=False, permute=None, center=False, exclude_table=False): 449 | """ Visualization of bounding boxes. 450 | 451 | Args: 452 | input_filename: each line is x1 y1 z1 x2 y2 z2 label 453 | out_filename_prefix: OBJ filename prefix, 454 | visualize object by g_label2color 455 | easy_view: if True, only visualize furniture and floor 456 | permute: if not None, permute XYZ for rendering, e.g. 
[0 2 1] 457 | center: if True, move obj to have zero origin 458 | Returns: 459 | output a list of OBJ file and MTL files with the same prefix 460 | """ 461 | bbox_label = np.loadtxt(input_filename) 462 | bbox = bbox_label[:, 0:6] 463 | if permute is not None: 464 | assert(len(permute)==3) 465 | permute = np.array(permute) 466 | bbox[:,0:3] = bbox[:,permute] 467 | bbox[:,3:6] = bbox[:,permute+3] 468 | if center: 469 | xyz_max = np.amax(bbox[:,3:6], 0) 470 | bbox[:,0:3] -= (xyz_max/2.0) 471 | bbox[:,3:6] -= (xyz_max/2.0) 472 | bbox /= np.max(xyz_max/2.0) 473 | label = bbox_label[:, -1].astype(int) 474 | obj_filename = out_filename_prefix+'.obj' 475 | mtl_filename = out_filename_prefix+'.mtl' 476 | 477 | fout_obj = open(obj_filename, 'w') 478 | fout_mtl = open(mtl_filename, 'w') 479 | fout_obj.write('mtllib %s\n' % (os.path.basename(mtl_filename))) 480 | v_cnt = 0 # count vertex 481 | ins_cnt = 0 # count instance 482 | for i in range(bbox.shape[0]): 483 | if easy_view and (label[i] not in g_easy_view_labels): 484 | continue 485 | if exclude_table and label[i] == g_classes.index('table'): 486 | continue 487 | 488 | length = bbox[i, 3:6] - bbox[i, 0:3] 489 | a = length[0] 490 | b = length[1] 491 | c = length[2] 492 | x = bbox[i, 0] 493 | y = bbox[i, 1] 494 | z = bbox[i, 2] 495 | color = np.array(g_label2color[label[i]], dtype=float) / 255.0 496 | 497 | material = 'material%d' % (ins_cnt) 498 | fout_obj.write('usemtl %s\n' % (material)) 499 | fout_obj.write('v %f %f %f\n' % (x,y,z+c)) 500 | fout_obj.write('v %f %f %f\n' % (x,y+b,z+c)) 501 | fout_obj.write('v %f %f %f\n' % (x+a,y+b,z+c)) 502 | fout_obj.write('v %f %f %f\n' % (x+a,y,z+c)) 503 | fout_obj.write('v %f %f %f\n' % (x,y,z)) 504 | fout_obj.write('v %f %f %f\n' % (x,y+b,z)) 505 | fout_obj.write('v %f %f %f\n' % (x+a,y+b,z)) 506 | fout_obj.write('v %f %f %f\n' % (x+a,y,z)) 507 | fout_obj.write('g default\n') 508 | fout_obj.write('f %d %d %d %d\n' % (4+v_cnt, 3+v_cnt, 2+v_cnt, 1+v_cnt)) 509 | fout_obj.write('f %d %d %d %d\n' % (1+v_cnt, 2+v_cnt, 6+v_cnt, 5+v_cnt)) 510 | fout_obj.write('f %d %d %d %d\n' % (7+v_cnt, 6+v_cnt, 2+v_cnt, 3+v_cnt)) 511 | fout_obj.write('f %d %d %d %d\n' % (4+v_cnt, 8+v_cnt, 7+v_cnt, 3+v_cnt)) 512 | fout_obj.write('f %d %d %d %d\n' % (5+v_cnt, 8+v_cnt, 4+v_cnt, 1+v_cnt)) 513 | fout_obj.write('f %d %d %d %d\n' % (5+v_cnt, 6+v_cnt, 7+v_cnt, 8+v_cnt)) 514 | fout_obj.write('\n') 515 | 516 | fout_mtl.write('newmtl %s\n' % (material)) 517 | fout_mtl.write('Kd %f %f %f\n' % (color[0], color[1], color[2])) 518 | fout_mtl.write('\n') 519 | 520 | v_cnt += 8 521 | ins_cnt += 1 522 | 523 | fout_obj.close() 524 | fout_mtl.close() 525 | 526 | 527 | def collect_point_bounding_box(anno_path, out_filename, file_format): 528 | """ Compute bounding boxes from each instance in original dataset files on 529 | one room. **We assume the bbox is aligned with XYZ coordinate.** 530 | Save both the point XYZRGB and the bounding box for the point's 531 | parent element. 532 | 533 | Args: 534 | anno_path: path to annotations. e.g. Area_1/office_2/Annotations/ 535 | out_filename: path to save instance bounding boxes for each point, 536 | plus the point's XYZRGBL 537 | each line is XYZRGBL offsetX offsetY offsetZ a b c, 538 | where cx = X+offsetX, cy=X+offsetY, cz=Z+offsetZ 539 | where (cx,cy,cz) is center of the box, a,b,c are distances from center 540 | to the surfaces of the box, i.e. x1 = cx-a, x2 = cx+a, y1=cy-b etc. 
541 | file_format: output file format, txt or numpy 542 | Returns: 543 | None 544 | 545 | Note: 546 | room points are shifted, the most negative point is now at origin. 547 | """ 548 | point_bbox_list = [] 549 | 550 | for f in glob.glob(os.path.join(anno_path, '*.txt')): 551 | cls = os.path.basename(f).split('_')[0] 552 | if cls not in g_classes: # note: in some room there is 'staris' class.. 553 | cls = 'clutter' 554 | points = np.loadtxt(f) # Nx6 555 | label = g_class2label[cls] # N, 556 | # Compute tightest axis aligned bounding box 557 | xyz_min = np.amin(points[:, 0:3], axis=0) # 3, 558 | xyz_max = np.amax(points[:, 0:3], axis=0) # 3, 559 | xyz_center = (xyz_min + xyz_max) / 2 560 | dimension = (xyz_max - xyz_min) / 2 561 | 562 | xyz_offsets = xyz_center - points[:,0:3] # Nx3 563 | dimensions = np.ones((points.shape[0],3)) * dimension # Nx3 564 | labels = np.ones((points.shape[0],1)) * label # N 565 | point_bbox_list.append(np.concatenate([points, labels, 566 | xyz_offsets, dimensions], 1)) # Nx13 567 | 568 | point_bbox = np.concatenate(point_bbox_list, 0) # KxNx13 569 | room_xyz_min = np.amin(point_bbox[:, 0:3], axis=0) 570 | point_bbox[:, 0:3] -= room_xyz_min 571 | 572 | if file_format == 'txt': 573 | fout = open(out_filename, 'w') 574 | for i in range(point_bbox.shape[0]): 575 | fout.write('%f %f %f %d %d %d %d %f %f %f %f %f %f\n' % \ 576 | (point_bbox[i,0], point_bbox[i,1], point_bbox[i,2], 577 | point_bbox[i,3], point_bbox[i,4], point_bbox[i,5], 578 | point_bbox[i,6], 579 | point_bbox[i,7], point_bbox[i,8], point_bbox[i,9], 580 | point_bbox[i,10], point_bbox[i,11], point_bbox[i,12])) 581 | 582 | fout.close() 583 | elif file_format == 'numpy': 584 | np.save(out_filename, point_bbox) 585 | else: 586 | print('ERROR!! Unknown file format: %s, please use txt or numpy.' 
% \ 587 | (file_format)) 588 | exit() 589 | 590 | 591 | -------------------------------------------------------------------------------- /sem_seg/meta/all_data_label.txt: -------------------------------------------------------------------------------- 1 | Area_1_conferenceRoom_1.npy 2 | Area_1_conferenceRoom_2.npy 3 | Area_1_copyRoom_1.npy 4 | Area_1_hallway_1.npy 5 | Area_1_hallway_2.npy 6 | Area_1_hallway_3.npy 7 | Area_1_hallway_4.npy 8 | Area_1_hallway_5.npy 9 | Area_1_hallway_6.npy 10 | Area_1_hallway_7.npy 11 | Area_1_hallway_8.npy 12 | Area_1_office_10.npy 13 | Area_1_office_11.npy 14 | Area_1_office_12.npy 15 | Area_1_office_13.npy 16 | Area_1_office_14.npy 17 | Area_1_office_15.npy 18 | Area_1_office_16.npy 19 | Area_1_office_17.npy 20 | Area_1_office_18.npy 21 | Area_1_office_19.npy 22 | Area_1_office_1.npy 23 | Area_1_office_20.npy 24 | Area_1_office_21.npy 25 | Area_1_office_22.npy 26 | Area_1_office_23.npy 27 | Area_1_office_24.npy 28 | Area_1_office_25.npy 29 | Area_1_office_26.npy 30 | Area_1_office_27.npy 31 | Area_1_office_28.npy 32 | Area_1_office_29.npy 33 | Area_1_office_2.npy 34 | Area_1_office_30.npy 35 | Area_1_office_31.npy 36 | Area_1_office_3.npy 37 | Area_1_office_4.npy 38 | Area_1_office_5.npy 39 | Area_1_office_6.npy 40 | Area_1_office_7.npy 41 | Area_1_office_8.npy 42 | Area_1_office_9.npy 43 | Area_1_pantry_1.npy 44 | Area_1_WC_1.npy 45 | Area_2_auditorium_1.npy 46 | Area_2_auditorium_2.npy 47 | Area_2_conferenceRoom_1.npy 48 | Area_2_hallway_10.npy 49 | Area_2_hallway_11.npy 50 | Area_2_hallway_12.npy 51 | Area_2_hallway_1.npy 52 | Area_2_hallway_2.npy 53 | Area_2_hallway_3.npy 54 | Area_2_hallway_4.npy 55 | Area_2_hallway_5.npy 56 | Area_2_hallway_6.npy 57 | Area_2_hallway_7.npy 58 | Area_2_hallway_8.npy 59 | Area_2_hallway_9.npy 60 | Area_2_office_10.npy 61 | Area_2_office_11.npy 62 | Area_2_office_12.npy 63 | Area_2_office_13.npy 64 | Area_2_office_14.npy 65 | Area_2_office_1.npy 66 | Area_2_office_2.npy 67 | Area_2_office_3.npy 68 | Area_2_office_4.npy 69 | Area_2_office_5.npy 70 | Area_2_office_6.npy 71 | Area_2_office_7.npy 72 | Area_2_office_8.npy 73 | Area_2_office_9.npy 74 | Area_2_storage_1.npy 75 | Area_2_storage_2.npy 76 | Area_2_storage_3.npy 77 | Area_2_storage_4.npy 78 | Area_2_storage_5.npy 79 | Area_2_storage_6.npy 80 | Area_2_storage_7.npy 81 | Area_2_storage_8.npy 82 | Area_2_storage_9.npy 83 | Area_2_WC_1.npy 84 | Area_2_WC_2.npy 85 | Area_3_conferenceRoom_1.npy 86 | Area_3_hallway_1.npy 87 | Area_3_hallway_2.npy 88 | Area_3_hallway_3.npy 89 | Area_3_hallway_4.npy 90 | Area_3_hallway_5.npy 91 | Area_3_hallway_6.npy 92 | Area_3_lounge_1.npy 93 | Area_3_lounge_2.npy 94 | Area_3_office_10.npy 95 | Area_3_office_1.npy 96 | Area_3_office_2.npy 97 | Area_3_office_3.npy 98 | Area_3_office_4.npy 99 | Area_3_office_5.npy 100 | Area_3_office_6.npy 101 | Area_3_office_7.npy 102 | Area_3_office_8.npy 103 | Area_3_office_9.npy 104 | Area_3_storage_1.npy 105 | Area_3_storage_2.npy 106 | Area_3_WC_1.npy 107 | Area_3_WC_2.npy 108 | Area_4_conferenceRoom_1.npy 109 | Area_4_conferenceRoom_2.npy 110 | Area_4_conferenceRoom_3.npy 111 | Area_4_hallway_10.npy 112 | Area_4_hallway_11.npy 113 | Area_4_hallway_12.npy 114 | Area_4_hallway_13.npy 115 | Area_4_hallway_14.npy 116 | Area_4_hallway_1.npy 117 | Area_4_hallway_2.npy 118 | Area_4_hallway_3.npy 119 | Area_4_hallway_4.npy 120 | Area_4_hallway_5.npy 121 | Area_4_hallway_6.npy 122 | Area_4_hallway_7.npy 123 | Area_4_hallway_8.npy 124 | Area_4_hallway_9.npy 125 | Area_4_lobby_1.npy 126 | Area_4_lobby_2.npy 127 
| Area_4_office_10.npy 128 | Area_4_office_11.npy 129 | Area_4_office_12.npy 130 | Area_4_office_13.npy 131 | Area_4_office_14.npy 132 | Area_4_office_15.npy 133 | Area_4_office_16.npy 134 | Area_4_office_17.npy 135 | Area_4_office_18.npy 136 | Area_4_office_19.npy 137 | Area_4_office_1.npy 138 | Area_4_office_20.npy 139 | Area_4_office_21.npy 140 | Area_4_office_22.npy 141 | Area_4_office_2.npy 142 | Area_4_office_3.npy 143 | Area_4_office_4.npy 144 | Area_4_office_5.npy 145 | Area_4_office_6.npy 146 | Area_4_office_7.npy 147 | Area_4_office_8.npy 148 | Area_4_office_9.npy 149 | Area_4_storage_1.npy 150 | Area_4_storage_2.npy 151 | Area_4_storage_3.npy 152 | Area_4_storage_4.npy 153 | Area_4_WC_1.npy 154 | Area_4_WC_2.npy 155 | Area_4_WC_3.npy 156 | Area_4_WC_4.npy 157 | Area_5_conferenceRoom_1.npy 158 | Area_5_conferenceRoom_2.npy 159 | Area_5_conferenceRoom_3.npy 160 | Area_5_hallway_10.npy 161 | Area_5_hallway_11.npy 162 | Area_5_hallway_12.npy 163 | Area_5_hallway_13.npy 164 | Area_5_hallway_14.npy 165 | Area_5_hallway_15.npy 166 | Area_5_hallway_1.npy 167 | Area_5_hallway_2.npy 168 | Area_5_hallway_3.npy 169 | Area_5_hallway_4.npy 170 | Area_5_hallway_5.npy 171 | Area_5_hallway_6.npy 172 | Area_5_hallway_7.npy 173 | Area_5_hallway_8.npy 174 | Area_5_hallway_9.npy 175 | Area_5_lobby_1.npy 176 | Area_5_office_10.npy 177 | Area_5_office_11.npy 178 | Area_5_office_12.npy 179 | Area_5_office_13.npy 180 | Area_5_office_14.npy 181 | Area_5_office_15.npy 182 | Area_5_office_16.npy 183 | Area_5_office_17.npy 184 | Area_5_office_18.npy 185 | Area_5_office_19.npy 186 | Area_5_office_1.npy 187 | Area_5_office_20.npy 188 | Area_5_office_21.npy 189 | Area_5_office_22.npy 190 | Area_5_office_23.npy 191 | Area_5_office_24.npy 192 | Area_5_office_25.npy 193 | Area_5_office_26.npy 194 | Area_5_office_27.npy 195 | Area_5_office_28.npy 196 | Area_5_office_29.npy 197 | Area_5_office_2.npy 198 | Area_5_office_30.npy 199 | Area_5_office_31.npy 200 | Area_5_office_32.npy 201 | Area_5_office_33.npy 202 | Area_5_office_34.npy 203 | Area_5_office_35.npy 204 | Area_5_office_36.npy 205 | Area_5_office_37.npy 206 | Area_5_office_38.npy 207 | Area_5_office_39.npy 208 | Area_5_office_3.npy 209 | Area_5_office_40.npy 210 | Area_5_office_41.npy 211 | Area_5_office_42.npy 212 | Area_5_office_4.npy 213 | Area_5_office_5.npy 214 | Area_5_office_6.npy 215 | Area_5_office_7.npy 216 | Area_5_office_8.npy 217 | Area_5_office_9.npy 218 | Area_5_pantry_1.npy 219 | Area_5_storage_1.npy 220 | Area_5_storage_2.npy 221 | Area_5_storage_3.npy 222 | Area_5_storage_4.npy 223 | Area_5_WC_1.npy 224 | Area_5_WC_2.npy 225 | Area_6_conferenceRoom_1.npy 226 | Area_6_copyRoom_1.npy 227 | Area_6_hallway_1.npy 228 | Area_6_hallway_2.npy 229 | Area_6_hallway_3.npy 230 | Area_6_hallway_4.npy 231 | Area_6_hallway_5.npy 232 | Area_6_hallway_6.npy 233 | Area_6_lounge_1.npy 234 | Area_6_office_10.npy 235 | Area_6_office_11.npy 236 | Area_6_office_12.npy 237 | Area_6_office_13.npy 238 | Area_6_office_14.npy 239 | Area_6_office_15.npy 240 | Area_6_office_16.npy 241 | Area_6_office_17.npy 242 | Area_6_office_18.npy 243 | Area_6_office_19.npy 244 | Area_6_office_1.npy 245 | Area_6_office_20.npy 246 | Area_6_office_21.npy 247 | Area_6_office_22.npy 248 | Area_6_office_23.npy 249 | Area_6_office_24.npy 250 | Area_6_office_25.npy 251 | Area_6_office_26.npy 252 | Area_6_office_27.npy 253 | Area_6_office_28.npy 254 | Area_6_office_29.npy 255 | Area_6_office_2.npy 256 | Area_6_office_30.npy 257 | Area_6_office_31.npy 258 | Area_6_office_32.npy 259 | 
Area_6_office_33.npy 260 | Area_6_office_34.npy 261 | Area_6_office_35.npy 262 | Area_6_office_36.npy 263 | Area_6_office_37.npy 264 | Area_6_office_3.npy 265 | Area_6_office_4.npy 266 | Area_6_office_5.npy 267 | Area_6_office_6.npy 268 | Area_6_office_7.npy 269 | Area_6_office_8.npy 270 | Area_6_office_9.npy 271 | Area_6_openspace_1.npy 272 | Area_6_pantry_1.npy 273 | -------------------------------------------------------------------------------- /sem_seg/meta/anno_paths.txt: -------------------------------------------------------------------------------- 1 | Area_1/conferenceRoom_1/Annotations 2 | Area_1/conferenceRoom_2/Annotations 3 | Area_1/copyRoom_1/Annotations 4 | Area_1/hallway_1/Annotations 5 | Area_1/hallway_2/Annotations 6 | Area_1/hallway_3/Annotations 7 | Area_1/hallway_4/Annotations 8 | Area_1/hallway_5/Annotations 9 | Area_1/hallway_6/Annotations 10 | Area_1/hallway_7/Annotations 11 | Area_1/hallway_8/Annotations 12 | Area_1/office_10/Annotations 13 | Area_1/office_11/Annotations 14 | Area_1/office_12/Annotations 15 | Area_1/office_13/Annotations 16 | Area_1/office_14/Annotations 17 | Area_1/office_15/Annotations 18 | Area_1/office_16/Annotations 19 | Area_1/office_17/Annotations 20 | Area_1/office_18/Annotations 21 | Area_1/office_19/Annotations 22 | Area_1/office_1/Annotations 23 | Area_1/office_20/Annotations 24 | Area_1/office_21/Annotations 25 | Area_1/office_22/Annotations 26 | Area_1/office_23/Annotations 27 | Area_1/office_24/Annotations 28 | Area_1/office_25/Annotations 29 | Area_1/office_26/Annotations 30 | Area_1/office_27/Annotations 31 | Area_1/office_28/Annotations 32 | Area_1/office_29/Annotations 33 | Area_1/office_2/Annotations 34 | Area_1/office_30/Annotations 35 | Area_1/office_31/Annotations 36 | Area_1/office_3/Annotations 37 | Area_1/office_4/Annotations 38 | Area_1/office_5/Annotations 39 | Area_1/office_6/Annotations 40 | Area_1/office_7/Annotations 41 | Area_1/office_8/Annotations 42 | Area_1/office_9/Annotations 43 | Area_1/pantry_1/Annotations 44 | Area_1/WC_1/Annotations 45 | Area_2/auditorium_1/Annotations 46 | Area_2/auditorium_2/Annotations 47 | Area_2/conferenceRoom_1/Annotations 48 | Area_2/hallway_10/Annotations 49 | Area_2/hallway_11/Annotations 50 | Area_2/hallway_12/Annotations 51 | Area_2/hallway_1/Annotations 52 | Area_2/hallway_2/Annotations 53 | Area_2/hallway_3/Annotations 54 | Area_2/hallway_4/Annotations 55 | Area_2/hallway_5/Annotations 56 | Area_2/hallway_6/Annotations 57 | Area_2/hallway_7/Annotations 58 | Area_2/hallway_8/Annotations 59 | Area_2/hallway_9/Annotations 60 | Area_2/office_10/Annotations 61 | Area_2/office_11/Annotations 62 | Area_2/office_12/Annotations 63 | Area_2/office_13/Annotations 64 | Area_2/office_14/Annotations 65 | Area_2/office_1/Annotations 66 | Area_2/office_2/Annotations 67 | Area_2/office_3/Annotations 68 | Area_2/office_4/Annotations 69 | Area_2/office_5/Annotations 70 | Area_2/office_6/Annotations 71 | Area_2/office_7/Annotations 72 | Area_2/office_8/Annotations 73 | Area_2/office_9/Annotations 74 | Area_2/storage_1/Annotations 75 | Area_2/storage_2/Annotations 76 | Area_2/storage_3/Annotations 77 | Area_2/storage_4/Annotations 78 | Area_2/storage_5/Annotations 79 | Area_2/storage_6/Annotations 80 | Area_2/storage_7/Annotations 81 | Area_2/storage_8/Annotations 82 | Area_2/storage_9/Annotations 83 | Area_2/WC_1/Annotations 84 | Area_2/WC_2/Annotations 85 | Area_3/conferenceRoom_1/Annotations 86 | Area_3/hallway_1/Annotations 87 | Area_3/hallway_2/Annotations 88 | Area_3/hallway_3/Annotations 89 
| Area_3/hallway_4/Annotations 90 | Area_3/hallway_5/Annotations 91 | Area_3/hallway_6/Annotations 92 | Area_3/lounge_1/Annotations 93 | Area_3/lounge_2/Annotations 94 | Area_3/office_10/Annotations 95 | Area_3/office_1/Annotations 96 | Area_3/office_2/Annotations 97 | Area_3/office_3/Annotations 98 | Area_3/office_4/Annotations 99 | Area_3/office_5/Annotations 100 | Area_3/office_6/Annotations 101 | Area_3/office_7/Annotations 102 | Area_3/office_8/Annotations 103 | Area_3/office_9/Annotations 104 | Area_3/storage_1/Annotations 105 | Area_3/storage_2/Annotations 106 | Area_3/WC_1/Annotations 107 | Area_3/WC_2/Annotations 108 | Area_4/conferenceRoom_1/Annotations 109 | Area_4/conferenceRoom_2/Annotations 110 | Area_4/conferenceRoom_3/Annotations 111 | Area_4/hallway_10/Annotations 112 | Area_4/hallway_11/Annotations 113 | Area_4/hallway_12/Annotations 114 | Area_4/hallway_13/Annotations 115 | Area_4/hallway_14/Annotations 116 | Area_4/hallway_1/Annotations 117 | Area_4/hallway_2/Annotations 118 | Area_4/hallway_3/Annotations 119 | Area_4/hallway_4/Annotations 120 | Area_4/hallway_5/Annotations 121 | Area_4/hallway_6/Annotations 122 | Area_4/hallway_7/Annotations 123 | Area_4/hallway_8/Annotations 124 | Area_4/hallway_9/Annotations 125 | Area_4/lobby_1/Annotations 126 | Area_4/lobby_2/Annotations 127 | Area_4/office_10/Annotations 128 | Area_4/office_11/Annotations 129 | Area_4/office_12/Annotations 130 | Area_4/office_13/Annotations 131 | Area_4/office_14/Annotations 132 | Area_4/office_15/Annotations 133 | Area_4/office_16/Annotations 134 | Area_4/office_17/Annotations 135 | Area_4/office_18/Annotations 136 | Area_4/office_19/Annotations 137 | Area_4/office_1/Annotations 138 | Area_4/office_20/Annotations 139 | Area_4/office_21/Annotations 140 | Area_4/office_22/Annotations 141 | Area_4/office_2/Annotations 142 | Area_4/office_3/Annotations 143 | Area_4/office_4/Annotations 144 | Area_4/office_5/Annotations 145 | Area_4/office_6/Annotations 146 | Area_4/office_7/Annotations 147 | Area_4/office_8/Annotations 148 | Area_4/office_9/Annotations 149 | Area_4/storage_1/Annotations 150 | Area_4/storage_2/Annotations 151 | Area_4/storage_3/Annotations 152 | Area_4/storage_4/Annotations 153 | Area_4/WC_1/Annotations 154 | Area_4/WC_2/Annotations 155 | Area_4/WC_3/Annotations 156 | Area_4/WC_4/Annotations 157 | Area_5/conferenceRoom_1/Annotations 158 | Area_5/conferenceRoom_2/Annotations 159 | Area_5/conferenceRoom_3/Annotations 160 | Area_5/hallway_10/Annotations 161 | Area_5/hallway_11/Annotations 162 | Area_5/hallway_12/Annotations 163 | Area_5/hallway_13/Annotations 164 | Area_5/hallway_14/Annotations 165 | Area_5/hallway_15/Annotations 166 | Area_5/hallway_1/Annotations 167 | Area_5/hallway_2/Annotations 168 | Area_5/hallway_3/Annotations 169 | Area_5/hallway_4/Annotations 170 | Area_5/hallway_5/Annotations 171 | Area_5/hallway_6/Annotations 172 | Area_5/hallway_7/Annotations 173 | Area_5/hallway_8/Annotations 174 | Area_5/hallway_9/Annotations 175 | Area_5/lobby_1/Annotations 176 | Area_5/office_10/Annotations 177 | Area_5/office_11/Annotations 178 | Area_5/office_12/Annotations 179 | Area_5/office_13/Annotations 180 | Area_5/office_14/Annotations 181 | Area_5/office_15/Annotations 182 | Area_5/office_16/Annotations 183 | Area_5/office_17/Annotations 184 | Area_5/office_18/Annotations 185 | Area_5/office_19/Annotations 186 | Area_5/office_1/Annotations 187 | Area_5/office_20/Annotations 188 | Area_5/office_21/Annotations 189 | Area_5/office_22/Annotations 190 | Area_5/office_23/Annotations 
191 | Area_5/office_24/Annotations 192 | Area_5/office_25/Annotations 193 | Area_5/office_26/Annotations 194 | Area_5/office_27/Annotations 195 | Area_5/office_28/Annotations 196 | Area_5/office_29/Annotations 197 | Area_5/office_2/Annotations 198 | Area_5/office_30/Annotations 199 | Area_5/office_31/Annotations 200 | Area_5/office_32/Annotations 201 | Area_5/office_33/Annotations 202 | Area_5/office_34/Annotations 203 | Area_5/office_35/Annotations 204 | Area_5/office_36/Annotations 205 | Area_5/office_37/Annotations 206 | Area_5/office_38/Annotations 207 | Area_5/office_39/Annotations 208 | Area_5/office_3/Annotations 209 | Area_5/office_40/Annotations 210 | Area_5/office_41/Annotations 211 | Area_5/office_42/Annotations 212 | Area_5/office_4/Annotations 213 | Area_5/office_5/Annotations 214 | Area_5/office_6/Annotations 215 | Area_5/office_7/Annotations 216 | Area_5/office_8/Annotations 217 | Area_5/office_9/Annotations 218 | Area_5/pantry_1/Annotations 219 | Area_5/storage_1/Annotations 220 | Area_5/storage_2/Annotations 221 | Area_5/storage_3/Annotations 222 | Area_5/storage_4/Annotations 223 | Area_5/WC_1/Annotations 224 | Area_5/WC_2/Annotations 225 | Area_6/conferenceRoom_1/Annotations 226 | Area_6/copyRoom_1/Annotations 227 | Area_6/hallway_1/Annotations 228 | Area_6/hallway_2/Annotations 229 | Area_6/hallway_3/Annotations 230 | Area_6/hallway_4/Annotations 231 | Area_6/hallway_5/Annotations 232 | Area_6/hallway_6/Annotations 233 | Area_6/lounge_1/Annotations 234 | Area_6/office_10/Annotations 235 | Area_6/office_11/Annotations 236 | Area_6/office_12/Annotations 237 | Area_6/office_13/Annotations 238 | Area_6/office_14/Annotations 239 | Area_6/office_15/Annotations 240 | Area_6/office_16/Annotations 241 | Area_6/office_17/Annotations 242 | Area_6/office_18/Annotations 243 | Area_6/office_19/Annotations 244 | Area_6/office_1/Annotations 245 | Area_6/office_20/Annotations 246 | Area_6/office_21/Annotations 247 | Area_6/office_22/Annotations 248 | Area_6/office_23/Annotations 249 | Area_6/office_24/Annotations 250 | Area_6/office_25/Annotations 251 | Area_6/office_26/Annotations 252 | Area_6/office_27/Annotations 253 | Area_6/office_28/Annotations 254 | Area_6/office_29/Annotations 255 | Area_6/office_2/Annotations 256 | Area_6/office_30/Annotations 257 | Area_6/office_31/Annotations 258 | Area_6/office_32/Annotations 259 | Area_6/office_33/Annotations 260 | Area_6/office_34/Annotations 261 | Area_6/office_35/Annotations 262 | Area_6/office_36/Annotations 263 | Area_6/office_37/Annotations 264 | Area_6/office_3/Annotations 265 | Area_6/office_4/Annotations 266 | Area_6/office_5/Annotations 267 | Area_6/office_6/Annotations 268 | Area_6/office_7/Annotations 269 | Area_6/office_8/Annotations 270 | Area_6/office_9/Annotations 271 | Area_6/openspace_1/Annotations 272 | Area_6/pantry_1/Annotations 273 | -------------------------------------------------------------------------------- /sem_seg/meta/area6_data_label.txt: -------------------------------------------------------------------------------- 1 | data/stanford_indoor3d/Area_6_conferenceRoom_1.npy 2 | data/stanford_indoor3d/Area_6_copyRoom_1.npy 3 | data/stanford_indoor3d/Area_6_hallway_1.npy 4 | data/stanford_indoor3d/Area_6_hallway_2.npy 5 | data/stanford_indoor3d/Area_6_hallway_3.npy 6 | data/stanford_indoor3d/Area_6_hallway_4.npy 7 | data/stanford_indoor3d/Area_6_hallway_5.npy 8 | data/stanford_indoor3d/Area_6_hallway_6.npy 9 | data/stanford_indoor3d/Area_6_lounge_1.npy 10 | data/stanford_indoor3d/Area_6_office_10.npy 11 | 
data/stanford_indoor3d/Area_6_office_11.npy 12 | data/stanford_indoor3d/Area_6_office_12.npy 13 | data/stanford_indoor3d/Area_6_office_13.npy 14 | data/stanford_indoor3d/Area_6_office_14.npy 15 | data/stanford_indoor3d/Area_6_office_15.npy 16 | data/stanford_indoor3d/Area_6_office_16.npy 17 | data/stanford_indoor3d/Area_6_office_17.npy 18 | data/stanford_indoor3d/Area_6_office_18.npy 19 | data/stanford_indoor3d/Area_6_office_19.npy 20 | data/stanford_indoor3d/Area_6_office_1.npy 21 | data/stanford_indoor3d/Area_6_office_20.npy 22 | data/stanford_indoor3d/Area_6_office_21.npy 23 | data/stanford_indoor3d/Area_6_office_22.npy 24 | data/stanford_indoor3d/Area_6_office_23.npy 25 | data/stanford_indoor3d/Area_6_office_24.npy 26 | data/stanford_indoor3d/Area_6_office_25.npy 27 | data/stanford_indoor3d/Area_6_office_26.npy 28 | data/stanford_indoor3d/Area_6_office_27.npy 29 | data/stanford_indoor3d/Area_6_office_28.npy 30 | data/stanford_indoor3d/Area_6_office_29.npy 31 | data/stanford_indoor3d/Area_6_office_2.npy 32 | data/stanford_indoor3d/Area_6_office_30.npy 33 | data/stanford_indoor3d/Area_6_office_31.npy 34 | data/stanford_indoor3d/Area_6_office_32.npy 35 | data/stanford_indoor3d/Area_6_office_33.npy 36 | data/stanford_indoor3d/Area_6_office_34.npy 37 | data/stanford_indoor3d/Area_6_office_35.npy 38 | data/stanford_indoor3d/Area_6_office_36.npy 39 | data/stanford_indoor3d/Area_6_office_37.npy 40 | data/stanford_indoor3d/Area_6_office_3.npy 41 | data/stanford_indoor3d/Area_6_office_4.npy 42 | data/stanford_indoor3d/Area_6_office_5.npy 43 | data/stanford_indoor3d/Area_6_office_6.npy 44 | data/stanford_indoor3d/Area_6_office_7.npy 45 | data/stanford_indoor3d/Area_6_office_8.npy 46 | data/stanford_indoor3d/Area_6_office_9.npy 47 | data/stanford_indoor3d/Area_6_openspace_1.npy 48 | data/stanford_indoor3d/Area_6_pantry_1.npy 49 | -------------------------------------------------------------------------------- /sem_seg/meta/class_names.txt: -------------------------------------------------------------------------------- 1 | ceiling 2 | floor 3 | wall 4 | beam 5 | column 6 | window 7 | door 8 | table 9 | chair 10 | sofa 11 | bookcase 12 | board 13 | clutter 14 | -------------------------------------------------------------------------------- /sem_seg/model.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | import math 3 | import time 4 | import numpy as np 5 | import os 6 | import sys 7 | BASE_DIR = os.path.dirname(os.path.abspath(__file__)) 8 | ROOT_DIR = os.path.dirname(BASE_DIR) 9 | sys.path.append(os.path.join(ROOT_DIR, 'utils')) 10 | import tf_util 11 | 12 | def placeholder_inputs(batch_size, num_point): 13 | pointclouds_pl = tf.placeholder(tf.float32, 14 | shape=(batch_size, num_point, 9)) 15 | labels_pl = tf.placeholder(tf.int32, 16 | shape=(batch_size, num_point)) 17 | return pointclouds_pl, labels_pl 18 | 19 | def get_model(point_cloud, is_training, bn_decay=None): 20 | """ ConvNet baseline, input is BxNx3 gray image """ 21 | batch_size = point_cloud.get_shape()[0].value 22 | num_point = point_cloud.get_shape()[1].value 23 | 24 | input_image = tf.expand_dims(point_cloud, -1) 25 | # CONV 26 | net = tf_util.conv2d(input_image, 64, [1,9], padding='VALID', stride=[1,1], 27 | bn=True, is_training=is_training, scope='conv1', bn_decay=bn_decay) 28 | net = tf_util.conv2d(net, 64, [1,1], padding='VALID', stride=[1,1], 29 | bn=True, is_training=is_training, scope='conv2', bn_decay=bn_decay) 30 | net = tf_util.conv2d(net, 64, [1,1], 
padding='VALID', stride=[1,1], 31 | bn=True, is_training=is_training, scope='conv3', bn_decay=bn_decay) 32 | net = tf_util.conv2d(net, 128, [1,1], padding='VALID', stride=[1,1], 33 | bn=True, is_training=is_training, scope='conv4', bn_decay=bn_decay) 34 | points_feat1 = tf_util.conv2d(net, 1024, [1,1], padding='VALID', stride=[1,1], 35 | bn=True, is_training=is_training, scope='conv5', bn_decay=bn_decay) 36 | # MAX 37 | pc_feat1 = tf_util.max_pool2d(points_feat1, [num_point,1], padding='VALID', scope='maxpool1') 38 | # FC 39 | pc_feat1 = tf.reshape(pc_feat1, [batch_size, -1]) 40 | pc_feat1 = tf_util.fully_connected(pc_feat1, 256, bn=True, is_training=is_training, scope='fc1', bn_decay=bn_decay) 41 | pc_feat1 = tf_util.fully_connected(pc_feat1, 128, bn=True, is_training=is_training, scope='fc2', bn_decay=bn_decay) 42 | print(pc_feat1) 43 | 44 | # CONCAT 45 | pc_feat1_expand = tf.tile(tf.reshape(pc_feat1, [batch_size, 1, 1, -1]), [1, num_point, 1, 1]) 46 | points_feat1_concat = tf.concat(axis=3, values=[points_feat1, pc_feat1_expand]) 47 | 48 | # CONV 49 | net = tf_util.conv2d(points_feat1_concat, 512, [1,1], padding='VALID', stride=[1,1], 50 | bn=True, is_training=is_training, scope='conv6') 51 | net = tf_util.conv2d(net, 256, [1,1], padding='VALID', stride=[1,1], 52 | bn=True, is_training=is_training, scope='conv7') 53 | net = tf_util.dropout(net, keep_prob=0.7, is_training=is_training, scope='dp1') 54 | net = tf_util.conv2d(net, 13, [1,1], padding='VALID', stride=[1,1], 55 | activation_fn=None, scope='conv8') 56 | net = tf.squeeze(net, [2]) 57 | 58 | return net 59 | 60 | def get_loss(pred, label): 61 | """ pred: B,N,13 62 | label: B,N """ 63 | loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=pred, labels=label) 64 | return tf.reduce_mean(loss) 65 | 66 | if __name__ == "__main__": 67 | with tf.Graph().as_default(): 68 | a = tf.placeholder(tf.float32, shape=(32,4096,9)) 69 | net = get_model(a, tf.constant(True)) 70 | with tf.Session() as sess: 71 | init = tf.global_variables_initializer() 72 | sess.run(init) 73 | start = time.time() 74 | for i in range(100): 75 | print(i) 76 | sess.run(net, feed_dict={a:np.random.rand(32,4096,9)}) 77 | print(time.time() - start) 78 | -------------------------------------------------------------------------------- /sem_seg/train.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import math 3 | import h5py 4 | import numpy as np 5 | import tensorflow as tf 6 | import socket 7 | 8 | import os 9 | import sys 10 | BASE_DIR = os.path.dirname(os.path.abspath(__file__)) 11 | ROOT_DIR = os.path.dirname(BASE_DIR) 12 | sys.path.append(BASE_DIR) 13 | sys.path.append(ROOT_DIR) 14 | sys.path.append(os.path.join(ROOT_DIR, 'utils')) 15 | import provider 16 | import tf_util 17 | from model import * 18 | 19 | parser = argparse.ArgumentParser() 20 | parser.add_argument('--gpu', type=int, default=0, help='GPU to use [default: GPU 0]') 21 | parser.add_argument('--log_dir', default='log', help='Log dir [default: log]') 22 | parser.add_argument('--num_point', type=int, default=4096, help='Point number [default: 4096]') 23 | parser.add_argument('--max_epoch', type=int, default=50, help='Epoch to run [default: 50]') 24 | parser.add_argument('--batch_size', type=int, default=24, help='Batch Size during training [default: 24]') 25 | parser.add_argument('--learning_rate', type=float, default=0.001, help='Initial learning rate [default: 0.001]') 26 | parser.add_argument('--momentum', type=float, default=0.9, 
help='Initial momentum [default: 0.9]') 27 | parser.add_argument('--optimizer', default='adam', help='adam or momentum [default: adam]') 28 | parser.add_argument('--decay_step', type=int, default=300000, help='Decay step for lr decay [default: 300000]') 29 | parser.add_argument('--decay_rate', type=float, default=0.5, help='Decay rate for lr decay [default: 0.5]') 30 | parser.add_argument('--test_area', type=int, default=6, help='Which area to use for test, option: 1-6 [default: 6]') 31 | FLAGS = parser.parse_args() 32 | 33 | 34 | BATCH_SIZE = FLAGS.batch_size 35 | NUM_POINT = FLAGS.num_point 36 | MAX_EPOCH = FLAGS.max_epoch 37 | 38 | BASE_LEARNING_RATE = FLAGS.learning_rate 39 | GPU_INDEX = FLAGS.gpu 40 | MOMENTUM = FLAGS.momentum 41 | OPTIMIZER = FLAGS.optimizer 42 | DECAY_STEP = FLAGS.decay_step 43 | DECAY_RATE = FLAGS.decay_rate 44 | 45 | LOG_DIR = FLAGS.log_dir 46 | if not os.path.exists(LOG_DIR): os.mkdir(LOG_DIR) 47 | os.system('cp model.py %s' % (LOG_DIR)) # bkp of model def 48 | os.system('cp train.py %s' % (LOG_DIR)) # bkp of train procedure 49 | LOG_FOUT = open(os.path.join(LOG_DIR, 'log_train.txt'), 'w') 50 | LOG_FOUT.write(str(FLAGS)+'\n') 51 | 52 | MAX_NUM_POINT = 4096 53 | NUM_CLASSES = 13 54 | 55 | BN_INIT_DECAY = 0.5 56 | BN_DECAY_DECAY_RATE = 0.5 57 | #BN_DECAY_DECAY_STEP = float(DECAY_STEP * 2) 58 | BN_DECAY_DECAY_STEP = float(DECAY_STEP) 59 | BN_DECAY_CLIP = 0.99 60 | 61 | HOSTNAME = socket.gethostname() 62 | 63 | ALL_FILES = provider.getDataFiles('indoor3d_sem_seg_hdf5_data/all_files.txt') 64 | room_filelist = [line.rstrip() for line in open('indoor3d_sem_seg_hdf5_data/room_filelist.txt')] 65 | 66 | # Load ALL data 67 | data_batch_list = [] 68 | label_batch_list = [] 69 | for h5_filename in ALL_FILES: 70 | data_batch, label_batch = provider.loadDataFile(h5_filename) 71 | data_batch_list.append(data_batch) 72 | label_batch_list.append(label_batch) 73 | data_batches = np.concatenate(data_batch_list, 0) 74 | label_batches = np.concatenate(label_batch_list, 0) 75 | print(data_batches.shape) 76 | print(label_batches.shape) 77 | 78 | test_area = 'Area_'+str(FLAGS.test_area) 79 | train_idxs = [] 80 | test_idxs = [] 81 | for i,room_name in enumerate(room_filelist): 82 | if test_area in room_name: 83 | test_idxs.append(i) 84 | else: 85 | train_idxs.append(i) 86 | 87 | train_data = data_batches[train_idxs,...] 88 | train_label = label_batches[train_idxs] 89 | test_data = data_batches[test_idxs,...] 90 | test_label = label_batches[test_idxs] 91 | print(train_data.shape, train_label.shape) 92 | print(test_data.shape, test_label.shape) 93 | 94 | 95 | 96 | 97 | def log_string(out_str): 98 | LOG_FOUT.write(out_str+'\n') 99 | LOG_FOUT.flush() 100 | print(out_str) 101 | 102 | 103 | def get_learning_rate(batch): 104 | learning_rate = tf.train.exponential_decay( 105 | BASE_LEARNING_RATE, # Base learning rate. 106 | batch * BATCH_SIZE, # Current index into the dataset. 107 | DECAY_STEP, # Decay step. 108 | DECAY_RATE, # Decay rate. 109 | staircase=True) 110 | learning_rate = tf.maximum(learning_rate, 0.00001) # CLIP THE LEARNING RATE!!
111 | return learning_rate 112 | 113 | def get_bn_decay(batch): 114 | bn_momentum = tf.train.exponential_decay( 115 | BN_INIT_DECAY, 116 | batch*BATCH_SIZE, 117 | BN_DECAY_DECAY_STEP, 118 | BN_DECAY_DECAY_RATE, 119 | staircase=True) 120 | bn_decay = tf.minimum(BN_DECAY_CLIP, 1 - bn_momentum) 121 | return bn_decay 122 | 123 | def train(): 124 | with tf.Graph().as_default(): 125 | with tf.device('/gpu:'+str(GPU_INDEX)): 126 | pointclouds_pl, labels_pl = placeholder_inputs(BATCH_SIZE, NUM_POINT) 127 | is_training_pl = tf.placeholder(tf.bool, shape=()) 128 | 129 | # Note the global_step=batch parameter to minimize. 130 | # That tells the optimizer to helpfully increment the 'batch' parameter for you every time it trains. 131 | batch = tf.Variable(0) 132 | bn_decay = get_bn_decay(batch) 133 | tf.summary.scalar('bn_decay', bn_decay) 134 | 135 | # Get model and loss 136 | pred = get_model(pointclouds_pl, is_training_pl, bn_decay=bn_decay) 137 | loss = get_loss(pred, labels_pl) 138 | tf.summary.scalar('loss', loss) 139 | 140 | correct = tf.equal(tf.argmax(pred, 2), tf.to_int64(labels_pl)) 141 | accuracy = tf.reduce_sum(tf.cast(correct, tf.float32)) / float(BATCH_SIZE*NUM_POINT) 142 | tf.summary.scalar('accuracy', accuracy) 143 | 144 | # Get training operator 145 | learning_rate = get_learning_rate(batch) 146 | tf.summary.scalar('learning_rate', learning_rate) 147 | if OPTIMIZER == 'momentum': 148 | optimizer = tf.train.MomentumOptimizer(learning_rate, momentum=MOMENTUM) 149 | elif OPTIMIZER == 'adam': 150 | optimizer = tf.train.AdamOptimizer(learning_rate) 151 | train_op = optimizer.minimize(loss, global_step=batch) 152 | 153 | # Add ops to save and restore all the variables. 154 | saver = tf.train.Saver() 155 | 156 | # Create a session 157 | config = tf.ConfigProto() 158 | config.gpu_options.allow_growth = True 159 | config.allow_soft_placement = True 160 | config.log_device_placement = True 161 | sess = tf.Session(config=config) 162 | 163 | # Add summary writers 164 | merged = tf.summary.merge_all() 165 | train_writer = tf.summary.FileWriter(os.path.join(LOG_DIR, 'train'), 166 | sess.graph) 167 | test_writer = tf.summary.FileWriter(os.path.join(LOG_DIR, 'test')) 168 | 169 | # Init variables 170 | init = tf.global_variables_initializer() 171 | sess.run(init, {is_training_pl:True}) 172 | 173 | ops = {'pointclouds_pl': pointclouds_pl, 174 | 'labels_pl': labels_pl, 175 | 'is_training_pl': is_training_pl, 176 | 'pred': pred, 177 | 'loss': loss, 178 | 'train_op': train_op, 179 | 'merged': merged, 180 | 'step': batch} 181 | 182 | for epoch in range(MAX_EPOCH): 183 | log_string('**** EPOCH %03d ****' % (epoch)) 184 | sys.stdout.flush() 185 | 186 | train_one_epoch(sess, ops, train_writer) 187 | eval_one_epoch(sess, ops, test_writer) 188 | 189 | # Save the variables to disk. 
190 | if epoch % 10 == 0: 191 | save_path = saver.save(sess, os.path.join(LOG_DIR, "model.ckpt")) 192 | log_string("Model saved in file: %s" % save_path) 193 | 194 | 195 | 196 | def train_one_epoch(sess, ops, train_writer): 197 | """ ops: dict mapping from string to tf ops """ 198 | is_training = True 199 | 200 | log_string('----') 201 | current_data, current_label, _ = provider.shuffle_data(train_data[:,0:NUM_POINT,:], train_label) 202 | 203 | file_size = current_data.shape[0] 204 | num_batches = file_size // BATCH_SIZE 205 | 206 | total_correct = 0 207 | total_seen = 0 208 | loss_sum = 0 209 | 210 | for batch_idx in range(num_batches): 211 | if batch_idx % 100 == 0: 212 | print('Current batch/total batch num: %d/%d'%(batch_idx,num_batches)) 213 | start_idx = batch_idx * BATCH_SIZE 214 | end_idx = (batch_idx+1) * BATCH_SIZE 215 | 216 | feed_dict = {ops['pointclouds_pl']: current_data[start_idx:end_idx, :, :], 217 | ops['labels_pl']: current_label[start_idx:end_idx], 218 | ops['is_training_pl']: is_training,} 219 | summary, step, _, loss_val, pred_val = sess.run([ops['merged'], ops['step'], ops['train_op'], ops['loss'], ops['pred']], 220 | feed_dict=feed_dict) 221 | train_writer.add_summary(summary, step) 222 | pred_val = np.argmax(pred_val, 2) 223 | correct = np.sum(pred_val == current_label[start_idx:end_idx]) 224 | total_correct += correct 225 | total_seen += (BATCH_SIZE*NUM_POINT) 226 | loss_sum += loss_val 227 | 228 | log_string('mean loss: %f' % (loss_sum / float(num_batches))) 229 | log_string('accuracy: %f' % (total_correct / float(total_seen))) 230 | 231 | 232 | def eval_one_epoch(sess, ops, test_writer): 233 | """ ops: dict mapping from string to tf ops """ 234 | is_training = False 235 | total_correct = 0 236 | total_seen = 0 237 | loss_sum = 0 238 | total_seen_class = [0 for _ in range(NUM_CLASSES)] 239 | total_correct_class = [0 for _ in range(NUM_CLASSES)] 240 | 241 | log_string('----') 242 | current_data = test_data[:,0:NUM_POINT,:] 243 | current_label = np.squeeze(test_label) 244 | 245 | file_size = current_data.shape[0] 246 | num_batches = file_size // BATCH_SIZE 247 | 248 | for batch_idx in range(num_batches): 249 | start_idx = batch_idx * BATCH_SIZE 250 | end_idx = (batch_idx+1) * BATCH_SIZE 251 | 252 | feed_dict = {ops['pointclouds_pl']: current_data[start_idx:end_idx, :, :], 253 | ops['labels_pl']: current_label[start_idx:end_idx], 254 | ops['is_training_pl']: is_training} 255 | summary, step, loss_val, pred_val = sess.run([ops['merged'], ops['step'], ops['loss'], ops['pred']], 256 | feed_dict=feed_dict) 257 | test_writer.add_summary(summary, step) 258 | pred_val = np.argmax(pred_val, 2) 259 | correct = np.sum(pred_val == current_label[start_idx:end_idx]) 260 | total_correct += correct 261 | total_seen += (BATCH_SIZE*NUM_POINT) 262 | loss_sum += (loss_val*BATCH_SIZE) 263 | for i in range(start_idx, end_idx): 264 | for j in range(NUM_POINT): 265 | l = current_label[i, j] 266 | total_seen_class[l] += 1 267 | total_correct_class[l] += (pred_val[i-start_idx, j] == l) 268 | 269 | log_string('eval mean loss: %f' % (loss_sum / float(total_seen/NUM_POINT))) 270 | log_string('eval accuracy: %f'% (total_correct / float(total_seen))) 271 | log_string('eval avg class acc: %f' % (np.mean(np.array(total_correct_class)/np.array(total_seen_class,dtype=np.float)))) 272 | 273 | 274 | 275 | if __name__ == "__main__": 276 | train() 277 | LOG_FOUT.close() 278 | -------------------------------------------------------------------------------- /train.py: 
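Both training scripts -- `sem_seg/train.py` above and the classification `train.py` that follows -- build their schedules with `tf.train.exponential_decay` in `staircase=True` mode and then clip the result (`tf.maximum` for the learning rate, `tf.minimum` for the batch-norm decay). As a sanity check on what those graph ops compute, here is a minimal NumPy sketch of the same arithmetic; the helper names (`staircase_decay`, `learning_rate_at`, `bn_decay_at`) and the printed steps are ours, not part of the repository, and the constants mirror the `sem_seg/train.py` defaults.

```python
import numpy as np

def staircase_decay(base, global_step, decay_step, decay_rate):
    # tf.train.exponential_decay with staircase=True computes
    # base * decay_rate ** floor(global_step / decay_step)
    return base * decay_rate ** (global_step // decay_step)

def learning_rate_at(batch, batch_size=24, base_lr=0.001,
                     decay_step=300000, decay_rate=0.5):
    # 'batch' counts optimizer steps; global_step counts samples seen.
    lr = staircase_decay(base_lr, batch * batch_size, decay_step, decay_rate)
    return max(lr, 0.00001)          # lower clip, as in get_learning_rate

def bn_decay_at(batch, batch_size=24, init_decay=0.5,
                decay_step=300000, decay_rate=0.5, clip=0.99):
    momentum = staircase_decay(init_decay, batch * batch_size,
                               decay_step, decay_rate)
    return min(clip, 1 - momentum)   # upper clip, as in get_bn_decay

for step in (0, 12500, 25000, 50000):   # 12500 steps * batch 24 = 300000 samples
    print(step, learning_rate_at(step), bn_decay_at(step))
```

In other words, the learning rate halves after every `decay_step` training samples (never dropping below 1e-5), while the effective batch-norm momentum `1 - bn_momentum` climbs from 0.5 toward the 0.99 cap, so the running statistics are averaged over a longer window as training stabilizes.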
-------------------------------------------------------------------------------- 1 | import argparse 2 | import math 3 | import h5py 4 | import numpy as np 5 | import tensorflow as tf 6 | import socket 7 | import importlib 8 | import os 9 | import sys 10 | BASE_DIR = os.path.dirname(os.path.abspath(__file__)) 11 | sys.path.append(BASE_DIR) 12 | sys.path.append(os.path.join(BASE_DIR, 'models')) 13 | sys.path.append(os.path.join(BASE_DIR, 'utils')) 14 | import provider 15 | import tf_util 16 | 17 | parser = argparse.ArgumentParser() 18 | parser.add_argument('--gpu', type=int, default=0, help='GPU to use [default: GPU 0]') 19 | parser.add_argument('--model', default='pointnet_cls', help='Model name: pointnet_cls or pointnet_cls_basic [default: pointnet_cls]') 20 | parser.add_argument('--log_dir', default='log', help='Log dir [default: log]') 21 | parser.add_argument('--num_point', type=int, default=1024, help='Point Number [256/512/1024/2048] [default: 1024]') 22 | parser.add_argument('--max_epoch', type=int, default=250, help='Epoch to run [default: 250]') 23 | parser.add_argument('--batch_size', type=int, default=32, help='Batch Size during training [default: 32]') 24 | parser.add_argument('--learning_rate', type=float, default=0.001, help='Initial learning rate [default: 0.001]') 25 | parser.add_argument('--momentum', type=float, default=0.9, help='Initial momentum [default: 0.9]') 26 | parser.add_argument('--optimizer', default='adam', help='adam or momentum [default: adam]') 27 | parser.add_argument('--decay_step', type=int, default=200000, help='Decay step for lr decay [default: 200000]') 28 | parser.add_argument('--decay_rate', type=float, default=0.7, help='Decay rate for lr decay [default: 0.7]') 29 | FLAGS = parser.parse_args() 30 | 31 | 32 | BATCH_SIZE = FLAGS.batch_size 33 | NUM_POINT = FLAGS.num_point 34 | MAX_EPOCH = FLAGS.max_epoch 35 | BASE_LEARNING_RATE = FLAGS.learning_rate 36 | GPU_INDEX = FLAGS.gpu 37 | MOMENTUM = FLAGS.momentum 38 | OPTIMIZER = FLAGS.optimizer 39 | DECAY_STEP = FLAGS.decay_step 40 | DECAY_RATE = FLAGS.decay_rate 41 | 42 | MODEL = importlib.import_module(FLAGS.model) # import network module 43 | MODEL_FILE = os.path.join(BASE_DIR, 'models', FLAGS.model+'.py') 44 | LOG_DIR = FLAGS.log_dir 45 | if not os.path.exists(LOG_DIR): os.mkdir(LOG_DIR) 46 | os.system('cp %s %s' % (MODEL_FILE, LOG_DIR)) # bkp of model def 47 | os.system('cp train.py %s' % (LOG_DIR)) # bkp of train procedure 48 | LOG_FOUT = open(os.path.join(LOG_DIR, 'log_train.txt'), 'w') 49 | LOG_FOUT.write(str(FLAGS)+'\n') 50 | 51 | MAX_NUM_POINT = 2048 52 | NUM_CLASSES = 40 53 | 54 | BN_INIT_DECAY = 0.5 55 | BN_DECAY_DECAY_RATE = 0.5 56 | BN_DECAY_DECAY_STEP = float(DECAY_STEP) 57 | BN_DECAY_CLIP = 0.99 58 | 59 | HOSTNAME = socket.gethostname() 60 | 61 | # ModelNet40 official train/test split 62 | TRAIN_FILES = provider.getDataFiles( \ 63 | os.path.join(BASE_DIR, 'data/modelnet40_ply_hdf5_2048/train_files.txt')) 64 | TEST_FILES = provider.getDataFiles(\ 65 | os.path.join(BASE_DIR, 'data/modelnet40_ply_hdf5_2048/test_files.txt')) 66 | 67 | def log_string(out_str): 68 | LOG_FOUT.write(out_str+'\n') 69 | LOG_FOUT.flush() 70 | print(out_str) 71 | 72 | 73 | def get_learning_rate(batch): 74 | learning_rate = tf.train.exponential_decay( 75 | BASE_LEARNING_RATE, # Base learning rate. 76 | batch * BATCH_SIZE, # Current index into the dataset. 77 | DECAY_STEP, # Decay step. 78 | DECAY_RATE, # Decay rate. 
79 | staircase=True) 80 | learning_rate = tf.maximum(learning_rate, 0.00001) # CLIP THE LEARNING RATE! 81 | return learning_rate 82 | 83 | def get_bn_decay(batch): 84 | bn_momentum = tf.train.exponential_decay( 85 | BN_INIT_DECAY, 86 | batch*BATCH_SIZE, 87 | BN_DECAY_DECAY_STEP, 88 | BN_DECAY_DECAY_RATE, 89 | staircase=True) 90 | bn_decay = tf.minimum(BN_DECAY_CLIP, 1 - bn_momentum) 91 | return bn_decay 92 | 93 | def train(): 94 | with tf.Graph().as_default(): 95 | with tf.device('/gpu:'+str(GPU_INDEX)): 96 | pointclouds_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE, NUM_POINT) 97 | is_training_pl = tf.placeholder(tf.bool, shape=()) 98 | print(is_training_pl) 99 | 100 | # Note the global_step=batch parameter to minimize. 101 | # That tells the optimizer to helpfully increment the 'batch' parameter for you every time it trains. 102 | batch = tf.Variable(0) 103 | bn_decay = get_bn_decay(batch) 104 | tf.summary.scalar('bn_decay', bn_decay) 105 | 106 | # Get model and loss 107 | pred, end_points = MODEL.get_model(pointclouds_pl, is_training_pl, bn_decay=bn_decay) 108 | loss = MODEL.get_loss(pred, labels_pl, end_points) 109 | tf.summary.scalar('loss', loss) 110 | 111 | correct = tf.equal(tf.argmax(pred, 1), tf.to_int64(labels_pl)) 112 | accuracy = tf.reduce_sum(tf.cast(correct, tf.float32)) / float(BATCH_SIZE) 113 | tf.summary.scalar('accuracy', accuracy) 114 | 115 | # Get training operator 116 | learning_rate = get_learning_rate(batch) 117 | tf.summary.scalar('learning_rate', learning_rate) 118 | if OPTIMIZER == 'momentum': 119 | optimizer = tf.train.MomentumOptimizer(learning_rate, momentum=MOMENTUM) 120 | elif OPTIMIZER == 'adam': 121 | optimizer = tf.train.AdamOptimizer(learning_rate) 122 | train_op = optimizer.minimize(loss, global_step=batch) 123 | 124 | # Add ops to save and restore all the variables. 125 | saver = tf.train.Saver() 126 | 127 | # Create a session 128 | config = tf.ConfigProto() 129 | config.gpu_options.allow_growth = True 130 | config.allow_soft_placement = True 131 | config.log_device_placement = False 132 | sess = tf.Session(config=config) 133 | 134 | # Add summary writers 135 | #merged = tf.merge_all_summaries() 136 | merged = tf.summary.merge_all() 137 | train_writer = tf.summary.FileWriter(os.path.join(LOG_DIR, 'train'), 138 | sess.graph) 139 | test_writer = tf.summary.FileWriter(os.path.join(LOG_DIR, 'test')) 140 | 141 | # Init variables 142 | init = tf.global_variables_initializer() 143 | # To fix the bug introduced in TF 0.12.1 as in 144 | # http://stackoverflow.com/questions/41543774/invalidargumenterror-for-tensor-bool-tensorflow-0-12-1 145 | #sess.run(init) 146 | sess.run(init, {is_training_pl: True}) 147 | 148 | ops = {'pointclouds_pl': pointclouds_pl, 149 | 'labels_pl': labels_pl, 150 | 'is_training_pl': is_training_pl, 151 | 'pred': pred, 152 | 'loss': loss, 153 | 'train_op': train_op, 154 | 'merged': merged, 155 | 'step': batch} 156 | 157 | for epoch in range(MAX_EPOCH): 158 | log_string('**** EPOCH %03d ****' % (epoch)) 159 | sys.stdout.flush() 160 | 161 | train_one_epoch(sess, ops, train_writer) 162 | eval_one_epoch(sess, ops, test_writer) 163 | 164 | # Save the variables to disk. 
165 | if epoch % 10 == 0: 166 | save_path = saver.save(sess, os.path.join(LOG_DIR, "model.ckpt")) 167 | log_string("Model saved in file: %s" % save_path) 168 | 169 | 170 | 171 | def train_one_epoch(sess, ops, train_writer): 172 | """ ops: dict mapping from string to tf ops """ 173 | is_training = True 174 | 175 | # Shuffle train files 176 | train_file_idxs = np.arange(0, len(TRAIN_FILES)) 177 | np.random.shuffle(train_file_idxs) 178 | 179 | for fn in range(len(TRAIN_FILES)): 180 | log_string('----' + str(fn) + '-----') 181 | current_data, current_label = provider.loadDataFile(TRAIN_FILES[train_file_idxs[fn]]) 182 | current_data = current_data[:,0:NUM_POINT,:] 183 | current_data, current_label, _ = provider.shuffle_data(current_data, np.squeeze(current_label)) 184 | current_label = np.squeeze(current_label) 185 | 186 | file_size = current_data.shape[0] 187 | num_batches = file_size // BATCH_SIZE 188 | 189 | total_correct = 0 190 | total_seen = 0 191 | loss_sum = 0 192 | 193 | for batch_idx in range(num_batches): 194 | start_idx = batch_idx * BATCH_SIZE 195 | end_idx = (batch_idx+1) * BATCH_SIZE 196 | 197 | # Augment batched point clouds by rotation and jittering 198 | rotated_data = provider.rotate_point_cloud(current_data[start_idx:end_idx, :, :]) 199 | jittered_data = provider.jitter_point_cloud(rotated_data) 200 | feed_dict = {ops['pointclouds_pl']: jittered_data, 201 | ops['labels_pl']: current_label[start_idx:end_idx], 202 | ops['is_training_pl']: is_training,} 203 | summary, step, _, loss_val, pred_val = sess.run([ops['merged'], ops['step'], 204 | ops['train_op'], ops['loss'], ops['pred']], feed_dict=feed_dict) 205 | train_writer.add_summary(summary, step) 206 | pred_val = np.argmax(pred_val, 1) 207 | correct = np.sum(pred_val == current_label[start_idx:end_idx]) 208 | total_correct += correct 209 | total_seen += BATCH_SIZE 210 | loss_sum += loss_val 211 | 212 | log_string('mean loss: %f' % (loss_sum / float(num_batches))) 213 | log_string('accuracy: %f' % (total_correct / float(total_seen))) 214 | 215 | 216 | def eval_one_epoch(sess, ops, test_writer): 217 | """ ops: dict mapping from string to tf ops """ 218 | is_training = False 219 | total_correct = 0 220 | total_seen = 0 221 | loss_sum = 0 222 | total_seen_class = [0 for _ in range(NUM_CLASSES)] 223 | total_correct_class = [0 for _ in range(NUM_CLASSES)] 224 | 225 | for fn in range(len(TEST_FILES)): 226 | log_string('----' + str(fn) + '-----') 227 | current_data, current_label = provider.loadDataFile(TEST_FILES[fn]) 228 | current_data = current_data[:,0:NUM_POINT,:] 229 | current_label = np.squeeze(current_label) 230 | 231 | file_size = current_data.shape[0] 232 | num_batches = file_size // BATCH_SIZE 233 | 234 | for batch_idx in range(num_batches): 235 | start_idx = batch_idx * BATCH_SIZE 236 | end_idx = (batch_idx+1) * BATCH_SIZE 237 | 238 | feed_dict = {ops['pointclouds_pl']: current_data[start_idx:end_idx, :, :], 239 | ops['labels_pl']: current_label[start_idx:end_idx], 240 | ops['is_training_pl']: is_training} 241 | summary, step, loss_val, pred_val = sess.run([ops['merged'], ops['step'], 242 | ops['loss'], ops['pred']], feed_dict=feed_dict) 243 | pred_val = np.argmax(pred_val, 1) 244 | correct = np.sum(pred_val == current_label[start_idx:end_idx]) 245 | total_correct += correct 246 | total_seen += BATCH_SIZE 247 | loss_sum += (loss_val*BATCH_SIZE) 248 | for i in range(start_idx, end_idx): 249 | l = current_label[i] 250 | total_seen_class[l] += 1 251 | total_correct_class[l] += (pred_val[i-start_idx] == l) 252 | 253 | 
log_string('eval mean loss: %f' % (loss_sum / float(total_seen))) 254 | log_string('eval accuracy: %f'% (total_correct / float(total_seen))) 255 | log_string('eval avg class acc: %f' % (np.mean(np.array(total_correct_class)/np.array(total_seen_class,dtype=np.float)))) 256 | 257 | 258 | 259 | if __name__ == "__main__": 260 | train() 261 | LOG_FOUT.close() 262 | -------------------------------------------------------------------------------- /utils/data_prep_util.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | BASE_DIR = os.path.dirname(os.path.abspath(__file__)) 4 | sys.path.append(BASE_DIR) 5 | from plyfile import (PlyData, PlyElement, make2d, PlyParseError, PlyProperty) 6 | import numpy as np 7 | import h5py 8 | 9 | SAMPLING_BIN = os.path.join(BASE_DIR, 'third_party/mesh_sampling/build/pcsample') 10 | 11 | SAMPLING_POINT_NUM = 2048 12 | SAMPLING_LEAF_SIZE = 0.005 13 | 14 | MODELNET40_PATH = '../datasets/modelnet40' 15 | def export_ply(pc, filename): 16 | vertex = np.zeros(pc.shape[0], dtype=[('x', 'f4'), ('y', 'f4'), ('z', 'f4')]) 17 | for i in range(pc.shape[0]): 18 | vertex[i] = (pc[i][0], pc[i][1], pc[i][2]) 19 | ply_out = PlyData([PlyElement.describe(vertex, 'vertex', comments=['vertices'])]) 20 | ply_out.write(filename) 21 | 22 | # Sample points on the obj shape 23 | def get_sampling_command(obj_filename, ply_filename): 24 | cmd = SAMPLING_BIN + ' ' + obj_filename 25 | cmd += ' ' + ply_filename 26 | cmd += ' -n_samples %d ' % SAMPLING_POINT_NUM 27 | cmd += ' -leaf_size %f ' % SAMPLING_LEAF_SIZE 28 | return cmd 29 | 30 | # -------------------------------------------------------------- 31 | # Following are the helper functions to load MODELNET40 shapes 32 | # -------------------------------------------------------------- 33 | 34 | # Read in the list of categories in MODELNET40 35 | def get_category_names(): 36 | shape_names_file = os.path.join(MODELNET40_PATH, 'shape_names.txt') 37 | shape_names = [line.rstrip() for line in open(shape_names_file)] 38 | return shape_names 39 | 40 | # Return all the filepaths for the shapes in MODELNET40 41 | def get_obj_filenames(): 42 | obj_filelist_file = os.path.join(MODELNET40_PATH, 'filelist.txt') 43 | obj_filenames = [os.path.join(MODELNET40_PATH, line.rstrip()) for line in open(obj_filelist_file)] 44 | print('Got %d obj files in modelnet40.' 
% len(obj_filenames)) 45 | return obj_filenames 46 | 47 | # Helper function to create the parent folder and all subdir folders if not exist 48 | def batch_mkdir(output_folder, subdir_list): 49 | if not os.path.exists(output_folder): 50 | os.mkdir(output_folder) 51 | for subdir in subdir_list: 52 | if not os.path.exists(os.path.join(output_folder, subdir)): 53 | os.mkdir(os.path.join(output_folder, subdir)) 54 | 55 | # ---------------------------------------------------------------- 56 | # Following are the helper functions to save/load HDF5 files 57 | # ---------------------------------------------------------------- 58 | 59 | # Write numpy array data and label to h5_filename 60 | def save_h5_data_label_normal(h5_filename, data, label, normal, 61 | data_dtype='float32', label_dtype='uint8', normal_dtype='float32'): 62 | h5_fout = h5py.File(h5_filename) 63 | h5_fout.create_dataset( 64 | 'data', data=data, 65 | compression='gzip', compression_opts=4, 66 | dtype=data_dtype) 67 | h5_fout.create_dataset( 68 | 'normal', data=normal, 69 | compression='gzip', compression_opts=4, 70 | dtype=normal_dtype) 71 | h5_fout.create_dataset( 72 | 'label', data=label, 73 | compression='gzip', compression_opts=1, 74 | dtype=label_dtype) 75 | h5_fout.close() 76 | 77 | 78 | # Write numpy array data and label to h5_filename 79 | def save_h5(h5_filename, data, label, data_dtype='uint8', label_dtype='uint8'): 80 | h5_fout = h5py.File(h5_filename) 81 | h5_fout.create_dataset( 82 | 'data', data=data, 83 | compression='gzip', compression_opts=4, 84 | dtype=data_dtype) 85 | h5_fout.create_dataset( 86 | 'label', data=label, 87 | compression='gzip', compression_opts=1, 88 | dtype=label_dtype) 89 | h5_fout.close() 90 | 91 | # Read numpy array data and label from h5_filename 92 | def load_h5_data_label_normal(h5_filename): 93 | f = h5py.File(h5_filename) 94 | data = f['data'][:] 95 | label = f['label'][:] 96 | normal = f['normal'][:] 97 | return (data, label, normal) 98 | 99 | # Read numpy array data and label from h5_filename 100 | def load_h5_data_label_seg(h5_filename): 101 | f = h5py.File(h5_filename) 102 | data = f['data'][:] 103 | label = f['label'][:] 104 | seg = f['pid'][:] 105 | return (data, label, seg) 106 | 107 | # Read numpy array data and label from h5_filename 108 | def load_h5(h5_filename): 109 | f = h5py.File(h5_filename) 110 | data = f['data'][:] 111 | label = f['label'][:] 112 | return (data, label) 113 | 114 | # ---------------------------------------------------------------- 115 | # Following are the helper functions to save/load PLY files 116 | # ---------------------------------------------------------------- 117 | 118 | # Load PLY file 119 | def load_ply_data(filename, point_num): 120 | plydata = PlyData.read(filename) 121 | pc = plydata['vertex'].data[:point_num] 122 | pc_array = np.array([[x, y, z] for x,y,z in pc]) 123 | return pc_array 124 | 125 | # Load PLY file 126 | def load_ply_normal(filename, point_num): 127 | plydata = PlyData.read(filename) 128 | pc = plydata['normal'].data[:point_num] 129 | pc_array = np.array([[x, y, z] for x,y,z in pc]) 130 | return pc_array 131 | 132 | # Make up rows for Nxk array 133 | # Input Pad is 'edge' or 'constant' 134 | def pad_arr_rows(arr, row, pad='edge'): 135 | assert(len(arr.shape) == 2) 136 | assert(arr.shape[0] <= row) 137 | assert(pad == 'edge' or pad == 'constant') 138 | if arr.shape[0] == row: 139 | return arr 140 | if pad == 'edge': 141 | return np.lib.pad(arr, ((0, row-arr.shape[0]), (0, 0)), 'edge') 142 | if pad == 'constant': 143 | 
return np.lib.pad(arr, ((0, row-arr.shape[0]), (0, 0)), 'constant', (0, 0)) 144 | 145 | 146 | -------------------------------------------------------------------------------- /utils/eulerangles.py: -------------------------------------------------------------------------------- 1 | # emacs: -*- mode: python-mode; py-indent-offset: 4; indent-tabs-mode: nil -*- 2 | # vi: set ft=python sts=4 ts=4 sw=4 et: 3 | ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ## 4 | # 5 | # See COPYING file distributed along with the NiBabel package for the 6 | # copyright and license terms. 7 | # 8 | ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ## 9 | ''' Module implementing Euler angle rotations and their conversions 10 | 11 | See: 12 | 13 | * http://en.wikipedia.org/wiki/Rotation_matrix 14 | * http://en.wikipedia.org/wiki/Euler_angles 15 | * http://mathworld.wolfram.com/EulerAngles.html 16 | 17 | See also: *Representing Attitude with Euler Angles and Quaternions: A 18 | Reference* (2006) by James Diebel. A cached PDF link last found here: 19 | 20 | http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.110.5134 21 | 22 | Euler's rotation theorem tells us that any rotation in 3D can be 23 | described by 3 angles. Let's call the 3 angles the *Euler angle vector* 24 | and call the angles in the vector :math:`alpha`, :math:`beta` and 25 | :math:`gamma`. The vector is [ :math:`alpha`, 26 | :math:`beta`. :math:`gamma` ] and, in this description, the order of the 27 | parameters specifies the order in which the rotations occur (so the 28 | rotation corresponding to :math:`alpha` is applied first). 29 | 30 | In order to specify the meaning of an *Euler angle vector* we need to 31 | specify the axes around which each of the rotations corresponding to 32 | :math:`alpha`, :math:`beta` and :math:`gamma` will occur. 33 | 34 | There are therefore three axes for the rotations :math:`alpha`, 35 | :math:`beta` and :math:`gamma`; let's call them :math:`i` :math:`j`, 36 | :math:`k`. 37 | 38 | Let us express the rotation :math:`alpha` around axis `i` as a 3 by 3 39 | rotation matrix `A`. Similarly :math:`beta` around `j` becomes 3 x 3 40 | matrix `B` and :math:`gamma` around `k` becomes matrix `G`. Then the 41 | whole rotation expressed by the Euler angle vector [ :math:`alpha`, 42 | :math:`beta`. :math:`gamma` ], `R` is given by:: 43 | 44 | R = np.dot(G, np.dot(B, A)) 45 | 46 | See http://mathworld.wolfram.com/EulerAngles.html 47 | 48 | The order :math:`G B A` expresses the fact that the rotations are 49 | performed in the order of the vector (:math:`alpha` around axis `i` = 50 | `A` first). 51 | 52 | To convert a given Euler angle vector to a meaningful rotation, and a 53 | rotation matrix, we need to define: 54 | 55 | * the axes `i`, `j`, `k` 56 | * whether a rotation matrix should be applied on the left of a vector to 57 | be transformed (vectors are column vectors) or on the right (vectors 58 | are row vectors). 59 | * whether the rotations move the axes as they are applied (intrinsic 60 | rotations) - compared the situation where the axes stay fixed and the 61 | vectors move within the axis frame (extrinsic) 62 | * the handedness of the coordinate system 63 | 64 | See: http://en.wikipedia.org/wiki/Rotation_matrix#Ambiguities 65 | 66 | We are using the following conventions: 67 | 68 | * axes `i`, `j`, `k` are the `z`, `y`, and `x` axes respectively. Thus 69 | an Euler angle vector [ :math:`alpha`, :math:`beta`. 
:math:`gamma` ] 70 | in our convention implies a :math:`alpha` radian rotation around the 71 | `z` axis, followed by a :math:`beta` rotation around the `y` axis, 72 | followed by a :math:`gamma` rotation around the `x` axis. 73 | * the rotation matrix applies on the left, to column vectors on the 74 | right, so if `R` is the rotation matrix, and `v` is a 3 x N matrix 75 | with N column vectors, the transformed vector set `vdash` is given by 76 | ``vdash = np.dot(R, v)``. 77 | * extrinsic rotations - the axes are fixed, and do not move with the 78 | rotations. 79 | * a right-handed coordinate system 80 | 81 | The convention of rotation around ``z``, followed by rotation around 82 | ``y``, followed by rotation around ``x``, is known (confusingly) as 83 | "xyz", pitch-roll-yaw, Cardan angles, or Tait-Bryan angles. 84 | ''' 85 | 86 | import math 87 | 88 | import sys 89 | if sys.version_info >= (3,0): 90 | from functools import reduce 91 | 92 | import numpy as np 93 | 94 | 95 | _FLOAT_EPS_4 = np.finfo(float).eps * 4.0 96 | 97 | 98 | def euler2mat(z=0, y=0, x=0): 99 | ''' Return matrix for rotations around z, y and x axes 100 | 101 | Uses the z, then y, then x convention above 102 | 103 | Parameters 104 | ---------- 105 | z : scalar 106 | Rotation angle in radians around z-axis (performed first) 107 | y : scalar 108 | Rotation angle in radians around y-axis 109 | x : scalar 110 | Rotation angle in radians around x-axis (performed last) 111 | 112 | Returns 113 | ------- 114 | M : array shape (3,3) 115 | Rotation matrix giving same rotation as for given angles 116 | 117 | Examples 118 | -------- 119 | >>> zrot = 1.3 # radians 120 | >>> yrot = -0.1 121 | >>> xrot = 0.2 122 | >>> M = euler2mat(zrot, yrot, xrot) 123 | >>> M.shape == (3, 3) 124 | True 125 | 126 | The output rotation matrix is equal to the composition of the 127 | individual rotations 128 | 129 | >>> M1 = euler2mat(zrot) 130 | >>> M2 = euler2mat(0, yrot) 131 | >>> M3 = euler2mat(0, 0, xrot) 132 | >>> composed_M = np.dot(M3, np.dot(M2, M1)) 133 | >>> np.allclose(M, composed_M) 134 | True 135 | 136 | You can specify rotations by named arguments 137 | 138 | >>> np.all(M3 == euler2mat(x=xrot)) 139 | True 140 | 141 | When applying M to a vector, the vector should column vector to the 142 | right of M. If the right hand side is a 2D array rather than a 143 | vector, then each column of the 2D array represents a vector. 144 | 145 | >>> vec = np.array([1, 0, 0]).reshape((3,1)) 146 | >>> v2 = np.dot(M, vec) 147 | >>> vecs = np.array([[1, 0, 0],[0, 1, 0]]).T # giving 3x2 array 148 | >>> vecs2 = np.dot(M, vecs) 149 | 150 | Rotations are counter-clockwise. 151 | 152 | >>> zred = np.dot(euler2mat(z=np.pi/2), np.eye(3)) 153 | >>> np.allclose(zred, [[0, -1, 0],[1, 0, 0], [0, 0, 1]]) 154 | True 155 | >>> yred = np.dot(euler2mat(y=np.pi/2), np.eye(3)) 156 | >>> np.allclose(yred, [[0, 0, 1],[0, 1, 0], [-1, 0, 0]]) 157 | True 158 | >>> xred = np.dot(euler2mat(x=np.pi/2), np.eye(3)) 159 | >>> np.allclose(xred, [[1, 0, 0],[0, 0, -1], [0, 1, 0]]) 160 | True 161 | 162 | Notes 163 | ----- 164 | The direction of rotation is given by the right-hand rule (orient 165 | the thumb of the right hand along the axis around which the rotation 166 | occurs, with the end of the thumb at the positive end of the axis; 167 | curl your fingers; the direction your fingers curl is the direction 168 | of rotation). Therefore, the rotations are counterclockwise if 169 | looking along the axis of rotation from positive to negative. 
170 | ''' 171 | Ms = [] 172 | if z: 173 | cosz = math.cos(z) 174 | sinz = math.sin(z) 175 | Ms.append(np.array( 176 | [[cosz, -sinz, 0], 177 | [sinz, cosz, 0], 178 | [0, 0, 1]])) 179 | if y: 180 | cosy = math.cos(y) 181 | siny = math.sin(y) 182 | Ms.append(np.array( 183 | [[cosy, 0, siny], 184 | [0, 1, 0], 185 | [-siny, 0, cosy]])) 186 | if x: 187 | cosx = math.cos(x) 188 | sinx = math.sin(x) 189 | Ms.append(np.array( 190 | [[1, 0, 0], 191 | [0, cosx, -sinx], 192 | [0, sinx, cosx]])) 193 | if Ms: 194 | return reduce(np.dot, Ms[::-1]) 195 | return np.eye(3) 196 | 197 | 198 | def mat2euler(M, cy_thresh=None): 199 | ''' Discover Euler angle vector from 3x3 matrix 200 | 201 | Uses the conventions above. 202 | 203 | Parameters 204 | ---------- 205 | M : array-like, shape (3,3) 206 | cy_thresh : None or scalar, optional 207 | threshold below which to give up on straightforward arctan for 208 | estimating x rotation. If None (default), estimate from 209 | precision of input. 210 | 211 | Returns 212 | ------- 213 | z : scalar 214 | y : scalar 215 | x : scalar 216 | Rotations in radians around z, y, x axes, respectively 217 | 218 | Notes 219 | ----- 220 | If there was no numerical error, the routine could be derived using 221 | Sympy expression for z then y then x rotation matrix, which is:: 222 | 223 | [ cos(y)*cos(z), -cos(y)*sin(z), sin(y)], 224 | [cos(x)*sin(z) + cos(z)*sin(x)*sin(y), cos(x)*cos(z) - sin(x)*sin(y)*sin(z), -cos(y)*sin(x)], 225 | [sin(x)*sin(z) - cos(x)*cos(z)*sin(y), cos(z)*sin(x) + cos(x)*sin(y)*sin(z), cos(x)*cos(y)] 226 | 227 | with the obvious derivations for z, y, and x 228 | 229 | z = atan2(-r12, r11) 230 | y = asin(r13) 231 | x = atan2(-r23, r33) 232 | 233 | Problems arise when cos(y) is close to zero, because both of:: 234 | 235 | z = atan2(cos(y)*sin(z), cos(y)*cos(z)) 236 | x = atan2(cos(y)*sin(x), cos(x)*cos(y)) 237 | 238 | will be close to atan2(0, 0), and highly unstable. 239 | 240 | The ``cy`` fix for numerical instability below is from: *Graphics 241 | Gems IV*, Paul Heckbert (editor), Academic Press, 1994, ISBN: 242 | 0123361559. Specifically it comes from EulerAngles.c by Ken 243 | Shoemake, and deals with the case where cos(y) is close to zero: 244 | 245 | See: http://www.graphicsgems.org/ 246 | 247 | The code appears to be licensed (from the website) as "can be used 248 | without restrictions". 
249 | ''' 250 | M = np.asarray(M) 251 | if cy_thresh is None: 252 | try: 253 | cy_thresh = np.finfo(M.dtype).eps * 4 254 | except ValueError: 255 | cy_thresh = _FLOAT_EPS_4 256 | r11, r12, r13, r21, r22, r23, r31, r32, r33 = M.flat 257 | # cy: sqrt((cos(y)*cos(z))**2 + (cos(x)*cos(y))**2) 258 | cy = math.sqrt(r33*r33 + r23*r23) 259 | if cy > cy_thresh: # cos(y) not close to zero, standard form 260 | z = math.atan2(-r12, r11) # atan2(cos(y)*sin(z), cos(y)*cos(z)) 261 | y = math.atan2(r13, cy) # atan2(sin(y), cy) 262 | x = math.atan2(-r23, r33) # atan2(cos(y)*sin(x), cos(x)*cos(y)) 263 | else: # cos(y) (close to) zero, so x -> 0.0 (see above) 264 | # so r21 -> sin(z), r22 -> cos(z) and 265 | z = math.atan2(r21, r22) 266 | y = math.atan2(r13, cy) # atan2(sin(y), cy) 267 | x = 0.0 268 | return z, y, x 269 | 270 | 271 | def euler2quat(z=0, y=0, x=0): 272 | ''' Return quaternion corresponding to these Euler angles 273 | 274 | Uses the z, then y, then x convention above 275 | 276 | Parameters 277 | ---------- 278 | z : scalar 279 | Rotation angle in radians around z-axis (performed first) 280 | y : scalar 281 | Rotation angle in radians around y-axis 282 | x : scalar 283 | Rotation angle in radians around x-axis (performed last) 284 | 285 | Returns 286 | ------- 287 | quat : array shape (4,) 288 | Quaternion in w, x, y z (real, then vector) format 289 | 290 | Notes 291 | ----- 292 | We can derive this formula in Sympy using: 293 | 294 | 1. Formula giving quaternion corresponding to rotation of theta radians 295 | about arbitrary axis: 296 | http://mathworld.wolfram.com/EulerParameters.html 297 | 2. Generated formulae from 1.) for quaternions corresponding to 298 | theta radians rotations about ``x, y, z`` axes 299 | 3. Apply quaternion multiplication formula - 300 | http://en.wikipedia.org/wiki/Quaternions#Hamilton_product - to 301 | formulae from 2.) to give formula for combined rotations. 302 | ''' 303 | z = z/2.0 304 | y = y/2.0 305 | x = x/2.0 306 | cz = math.cos(z) 307 | sz = math.sin(z) 308 | cy = math.cos(y) 309 | sy = math.sin(y) 310 | cx = math.cos(x) 311 | sx = math.sin(x) 312 | return np.array([ 313 | cx*cy*cz - sx*sy*sz, 314 | cx*sy*sz + cy*cz*sx, 315 | cx*cz*sy - sx*cy*sz, 316 | cx*cy*sz + sx*cz*sy]) 317 | 318 | 319 | def quat2euler(q): 320 | ''' Return Euler angles corresponding to quaternion `q` 321 | 322 | Parameters 323 | ---------- 324 | q : 4 element sequence 325 | w, x, y, z of quaternion 326 | 327 | Returns 328 | ------- 329 | z : scalar 330 | Rotation angle in radians around z-axis (performed first) 331 | y : scalar 332 | Rotation angle in radians around y-axis 333 | x : scalar 334 | Rotation angle in radians around x-axis (performed last) 335 | 336 | Notes 337 | ----- 338 | It's possible to reduce the amount of calculation a little, by 339 | combining parts of the ``quat2mat`` and ``mat2euler`` functions, but 340 | the reduction in computation is small, and the code repetition is 341 | large. 
342 |     '''
343 |     # delayed import to avoid cyclic dependencies
344 |     import nibabel.quaternions as nq
345 |     return mat2euler(nq.quat2mat(q))
346 | 
347 | 
348 | def euler2angle_axis(z=0, y=0, x=0):
349 |     ''' Return angle, axis corresponding to these Euler angles
350 | 
351 |     Uses the z, then y, then x convention above
352 | 
353 |     Parameters
354 |     ----------
355 |     z : scalar
356 |        Rotation angle in radians around z-axis (performed first)
357 |     y : scalar
358 |        Rotation angle in radians around y-axis
359 |     x : scalar
360 |        Rotation angle in radians around x-axis (performed last)
361 | 
362 |     Returns
363 |     -------
364 |     theta : scalar
365 |        angle of rotation
366 |     vector : array shape (3,)
367 |        axis around which rotation occurs
368 | 
369 |     Examples
370 |     --------
371 |     >>> theta, vec = euler2angle_axis(0, 1.5, 0)
372 |     >>> print(theta)
373 |     1.5
374 |     >>> np.allclose(vec, [0, 1, 0])
375 |     True
376 |     '''
377 |     # delayed import to avoid cyclic dependencies
378 |     import nibabel.quaternions as nq
379 |     return nq.quat2angle_axis(euler2quat(z, y, x))
380 | 
381 | 
382 | def angle_axis2euler(theta, vector, is_normalized=False):
383 |     ''' Convert angle, axis pair to Euler angles
384 | 
385 |     Parameters
386 |     ----------
387 |     theta : scalar
388 |        angle of rotation
389 |     vector : 3 element sequence
390 |        vector specifying axis for rotation.
391 |     is_normalized : bool, optional
392 |        True if vector is already normalized (has norm of 1).  Default
393 |        False
394 | 
395 |     Returns
396 |     -------
397 |     z : scalar
398 |     y : scalar
399 |     x : scalar
400 |        Rotations in radians around z, y, x axes, respectively
401 | 
402 |     Examples
403 |     --------
404 |     >>> z, y, x = angle_axis2euler(0, [1, 0, 0])
405 |     >>> np.allclose((z, y, x), 0)
406 |     True
407 | 
408 |     Notes
409 |     -----
410 |     It's possible to reduce the amount of calculation a little, by
411 |     combining parts of the ``angle_axis2mat`` and ``mat2euler``
412 |     functions, but the reduction in computation is small, and the code
413 |     repetition is large.
414 |     '''
415 |     # delayed import to avoid cyclic dependencies
416 |     import nibabel.quaternions as nq
417 |     M = nq.angle_axis2mat(theta, vector, is_normalized)
418 |     return mat2euler(M)
419 | 
--------------------------------------------------------------------------------
/utils/pc_util.py:
--------------------------------------------------------------------------------
1 | """ Utility functions for processing point clouds.
2 | 
3 | Author: Charles R. Qi, Hao Su
4 | Date: November 2016
5 | """
6 | 
7 | import os
8 | import sys
9 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
10 | sys.path.append(BASE_DIR)
11 | 
12 | # Draw point cloud
13 | from eulerangles import euler2mat
14 | 
15 | # Point cloud IO
16 | import numpy as np
17 | from plyfile import PlyData, PlyElement
18 | 
19 | 
20 | # ----------------------------------------
21 | # Point Cloud/Volume Conversions
22 | # ----------------------------------------
23 | 
24 | def point_cloud_to_volume_batch(point_clouds, vsize=12, radius=1.0, flatten=True):
25 |     """ Input is BxNx3 batch of point clouds.
26 |         Output is Bx(vsize^3) if flatten is True, else Bx(vsize)x(vsize)x(vsize)x1.
27 |     """
28 |     vol_list = []
29 |     for b in range(point_clouds.shape[0]):
30 |         vol = point_cloud_to_volume(np.squeeze(point_clouds[b,:,:]), vsize, radius)
31 |         if flatten:
32 |             vol_list.append(vol.flatten())
33 |         else:
34 |             vol_list.append(np.expand_dims(np.expand_dims(vol, -1), 0))
35 |     if flatten:
36 |         return np.vstack(vol_list)
37 |     else:
38 |         return np.concatenate(vol_list, 0)
39 | 
40 | 
41 | def point_cloud_to_volume(points, vsize, radius=1.0):
42 |     """ Input is Nx3 points.
43 |         Output is vsize*vsize*vsize occupancy grid.
44 |         Assumes points are in range [-radius, radius].
45 |     """
46 |     vol = np.zeros((vsize,vsize,vsize))
47 |     voxel = 2*radius/float(vsize)
48 |     locations = (points + radius)/voxel
49 |     locations = np.clip(locations.astype(int), 0, vsize-1)  # clip guards points lying exactly on the +radius boundary
50 |     vol[locations[:,0],locations[:,1],locations[:,2]] = 1.0
51 |     return vol
52 | 
53 | #a = np.zeros((16,1024,3))
54 | #print point_cloud_to_volume_batch(a, 12, 1.0, False).shape
55 | 
56 | def volume_to_point_cloud(vol):
57 |     """ vol is an occupancy grid (value = 0 or 1) of size vsize*vsize*vsize.
58 |         Return Nx3 numpy array.
59 |     """
60 |     vsize = vol.shape[0]
61 |     assert(vol.shape[1] == vsize and vol.shape[2] == vsize)
62 |     points = []
63 |     for a in range(vsize):
64 |         for b in range(vsize):
65 |             for c in range(vsize):
66 |                 if vol[a,b,c] == 1:
67 |                     points.append(np.array([a,b,c]))
68 |     if len(points) == 0:
69 |         return np.zeros((0,3))
70 |     points = np.vstack(points)
71 |     return points
72 | 
73 | # ----------------------------------------
74 | # Point cloud IO
75 | # ----------------------------------------
76 | 
77 | def read_ply(filename):
78 |     """ Read XYZ point cloud from filename PLY file. """
79 |     plydata = PlyData.read(filename)
80 |     pc = plydata['vertex'].data
81 |     pc_array = np.array([[x, y, z] for x,y,z in pc])
82 |     return pc_array
83 | 
84 | 
85 | def write_ply(points, filename, text=True):
86 |     """ Input: Nx3, write points to filename as PLY format. """
87 |     points = [(points[i,0], points[i,1], points[i,2]) for i in range(points.shape[0])]
88 |     vertex = np.array(points, dtype=[('x', 'f4'), ('y', 'f4'), ('z', 'f4')])
89 |     el = PlyElement.describe(vertex, 'vertex', comments=['vertices'])
90 |     PlyData([el], text=text).write(filename)
91 | 
92 | 
93 | # ----------------------------------------
94 | # Simple Point cloud and Volume Renderers
95 | # ----------------------------------------
96 | 
97 | def draw_point_cloud(input_points, canvasSize=500, space=200, diameter=25,
98 |                      xrot=0, yrot=0, zrot=0, switch_xyz=[0,1,2], normalize=True):
99 |     """ Render point cloud to a gray-scale image.
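        A typical call (an illustrative sketch; the angle values mirror
        point_cloud_three_views below and are in radians):
            img = draw_point_cloud(points, zrot=110/180.0*np.pi,
                                   xrot=45/180.0*np.pi, yrot=0.0)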
100 |         Input:
101 |             points: Nx3 numpy array (+y is up direction)
102 |         Output:
103 |             gray image as numpy array of size canvasSize x canvasSize
104 |     """
105 |     image = np.zeros((canvasSize, canvasSize))
106 |     if input_points is None or input_points.shape[0] == 0:
107 |         return image
108 | 
109 |     points = input_points[:, switch_xyz]
110 |     M = euler2mat(zrot, yrot, xrot)
111 |     points = (np.dot(M, points.transpose())).transpose()
112 | 
113 |     # Normalize the point cloud
114 |     # We normalize scale to fit points in a unit sphere
115 |     if normalize:
116 |         centroid = np.mean(points, axis=0)
117 |         points -= centroid
118 |         furthest_distance = np.max(np.sqrt(np.sum(abs(points)**2, axis=-1)))
119 |         points /= furthest_distance
120 | 
121 |     # Pre-compute the Gaussian disk
122 |     radius = (diameter-1)/2.0
123 |     disk = np.zeros((diameter, diameter))
124 |     for i in range(diameter):
125 |         for j in range(diameter):
126 |             if (i-radius) * (i-radius) + (j-radius) * (j-radius) <= radius * radius:
127 |                 disk[i, j] = np.exp((-(i-radius)**2 - (j-radius)**2)/(radius**2))
128 |     mask = np.argwhere(disk > 0)
129 |     dx = mask[:, 0]
130 |     dy = mask[:, 1]
131 |     dv = disk[disk > 0]
132 | 
133 |     # Order points by z-buffer
134 |     zorder = np.argsort(points[:, 2])
135 |     points = points[zorder, :]
136 |     points[:, 2] = (points[:, 2] - np.min(points[:, 2])) / (np.max(points[:, 2]) - np.min(points[:, 2]))
137 |     max_depth = np.max(points[:, 2])
138 | 
139 |     for i in range(points.shape[0]):
140 |         j = points.shape[0] - i - 1
141 |         x = points[j, 0]
142 |         y = points[j, 1]
143 |         xc = canvasSize/2 + (x*space)
144 |         yc = canvasSize/2 + (y*space)
145 |         xc = int(np.round(xc))
146 |         yc = int(np.round(yc))
147 | 
148 |         px = dx + xc
149 |         py = dy + yc
150 | 
151 |         image[px, py] = image[px, py] * 0.7 + dv * (max_depth - points[j, 2]) * 0.3
152 | 
153 |     image = image / np.max(image)
154 |     return image
155 | 
156 | def point_cloud_three_views(points):
157 |     """ Input points: Nx3 numpy array (+y is up direction).
158 |         Return a numpy array gray image of size 500x1500.
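        Typical use (an illustrative sketch; 'some_model.ply' is a
        placeholder for any PLY file with x, y, z vertices):
            pts = read_ply('some_model.ply')
            img = point_cloud_three_views(pts)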
""" 159 | # +y is up direction 160 | # xrot is azimuth 161 | # yrot is in-plane 162 | # zrot is elevation 163 | img1 = draw_point_cloud(points, zrot=110/180.0*np.pi, xrot=45/180.0*np.pi, yrot=0/180.0*np.pi) 164 | img2 = draw_point_cloud(points, zrot=70/180.0*np.pi, xrot=135/180.0*np.pi, yrot=0/180.0*np.pi) 165 | img3 = draw_point_cloud(points, zrot=180.0/180.0*np.pi, xrot=90/180.0*np.pi, yrot=0/180.0*np.pi) 166 | image_large = np.concatenate([img1, img2, img3], 1) 167 | return image_large 168 | 169 | 170 | from PIL import Image 171 | def point_cloud_three_views_demo(): 172 | """ Demo for draw_point_cloud function """ 173 | points = read_ply('../third_party/mesh_sampling/piano.ply') 174 | im_array = point_cloud_three_views(points) 175 | img = Image.fromarray(np.uint8(im_array*255.0)) 176 | img.save('piano.jpg') 177 | 178 | if __name__=="__main__": 179 | point_cloud_three_views_demo() 180 | 181 | 182 | import matplotlib.pyplot as plt 183 | def pyplot_draw_point_cloud(points, output_filename): 184 | """ points is a Nx3 numpy array """ 185 | fig = plt.figure() 186 | ax = fig.add_subplot(111, projection='3d') 187 | ax.scatter(points[:,0], points[:,1], points[:,2]) 188 | ax.set_xlabel('x') 189 | ax.set_ylabel('y') 190 | ax.set_zlabel('z') 191 | #savefig(output_filename) 192 | 193 | def pyplot_draw_volume(vol, output_filename): 194 | """ vol is of size vsize*vsize*vsize 195 | output an image to output_filename 196 | """ 197 | points = volume_to_point_cloud(vol) 198 | pyplot_draw_point_cloud(points, output_filename) 199 | -------------------------------------------------------------------------------- /utils/tf_util.py: -------------------------------------------------------------------------------- 1 | """ Wrapper functions for TensorFlow layers. 2 | 3 | Author: Charles R. Qi 4 | Date: November 2016 5 | """ 6 | 7 | import numpy as np 8 | import tensorflow as tf 9 | 10 | def _variable_on_cpu(name, shape, initializer, use_fp16=False): 11 | """Helper to create a Variable stored on CPU memory. 12 | Args: 13 | name: name of the variable 14 | shape: list of ints 15 | initializer: initializer for Variable 16 | Returns: 17 | Variable Tensor 18 | """ 19 | with tf.device('/cpu:0'): 20 | dtype = tf.float16 if use_fp16 else tf.float32 21 | var = tf.get_variable(name, shape, initializer=initializer, dtype=dtype) 22 | return var 23 | 24 | def _variable_with_weight_decay(name, shape, stddev, wd, use_xavier=True): 25 | """Helper to create an initialized Variable with weight decay. 26 | 27 | Note that the Variable is initialized with a truncated normal distribution. 28 | A weight decay is added only if one is specified. 29 | 30 | Args: 31 | name: name of the variable 32 | shape: list of ints 33 | stddev: standard deviation of a truncated Gaussian 34 | wd: add L2Loss weight decay multiplied by this float. If None, weight 35 | decay is not added for this Variable. 
36 |         use_xavier: bool, whether to use xavier initializer
37 | 
38 |     Returns:
39 |         Variable Tensor
40 |     """
41 |     if use_xavier:
42 |         initializer = tf.contrib.layers.xavier_initializer()
43 |     else:
44 |         initializer = tf.truncated_normal_initializer(stddev=stddev)
45 |     var = _variable_on_cpu(name, shape, initializer)
46 |     if wd is not None:
47 |         weight_decay = tf.multiply(tf.nn.l2_loss(var), wd, name='weight_loss')
48 |         tf.add_to_collection('losses', weight_decay)
49 |     return var
50 | 
51 | 
52 | def conv1d(inputs,
53 |            num_output_channels,
54 |            kernel_size,
55 |            scope,
56 |            stride=1,
57 |            padding='SAME',
58 |            use_xavier=True,
59 |            stddev=1e-3,
60 |            weight_decay=0.0,
61 |            activation_fn=tf.nn.relu,
62 |            bn=False,
63 |            bn_decay=None,
64 |            is_training=None):
65 |     """ 1D convolution with non-linear operation.
66 | 
67 |     Args:
68 |         inputs: 3-D tensor variable BxLxC
69 |         num_output_channels: int
70 |         kernel_size: int
71 |         scope: string
72 |         stride: int
73 |         padding: 'SAME' or 'VALID'
74 |         use_xavier: bool, use xavier_initializer if true
75 |         stddev: float, stddev for truncated_normal init
76 |         weight_decay: float
77 |         activation_fn: function
78 |         bn: bool, whether to use batch norm
79 |         bn_decay: float or float tensor variable in [0,1]
80 |         is_training: bool Tensor variable
81 | 
82 |     Returns:
83 |         Variable tensor
84 |     """
85 |     with tf.variable_scope(scope) as sc:
86 |         num_in_channels = inputs.get_shape()[-1].value
87 |         kernel_shape = [kernel_size,
88 |                         num_in_channels, num_output_channels]
89 |         kernel = _variable_with_weight_decay('weights',
90 |                                              shape=kernel_shape,
91 |                                              use_xavier=use_xavier,
92 |                                              stddev=stddev,
93 |                                              wd=weight_decay)
94 |         outputs = tf.nn.conv1d(inputs, kernel,
95 |                                stride=stride,
96 |                                padding=padding)
97 |         biases = _variable_on_cpu('biases', [num_output_channels],
98 |                                   tf.constant_initializer(0.0))
99 |         outputs = tf.nn.bias_add(outputs, biases)
100 | 
101 |         if bn:
102 |             outputs = batch_norm_for_conv1d(outputs, is_training,
103 |                                             bn_decay=bn_decay, scope='bn')
104 | 
105 |         if activation_fn is not None:
106 |             outputs = activation_fn(outputs)
107 |         return outputs
108 | 
109 | 
110 | 
111 | 
112 | def conv2d(inputs,
113 |            num_output_channels,
114 |            kernel_size,
115 |            scope,
116 |            stride=[1, 1],
117 |            padding='SAME',
118 |            use_xavier=True,
119 |            stddev=1e-3,
120 |            weight_decay=0.0,
121 |            activation_fn=tf.nn.relu,
122 |            bn=False,
123 |            bn_decay=None,
124 |            is_training=None):
125 |     """ 2D convolution with non-linear operation.
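    A representative call (an illustrative sketch using only the arguments
    documented below):
        net = conv2d(images, 64, [3, 3], scope='conv1', stride=[1, 1],
                     padding='SAME', bn=True, bn_decay=bn_decay,
                     is_training=is_training)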
126 | 
127 |     Args:
128 |         inputs: 4-D tensor variable BxHxWxC
129 |         num_output_channels: int
130 |         kernel_size: a list of 2 ints
131 |         scope: string
132 |         stride: a list of 2 ints
133 |         padding: 'SAME' or 'VALID'
134 |         use_xavier: bool, use xavier_initializer if true
135 |         stddev: float, stddev for truncated_normal init
136 |         weight_decay: float
137 |         activation_fn: function
138 |         bn: bool, whether to use batch norm
139 |         bn_decay: float or float tensor variable in [0,1]
140 |         is_training: bool Tensor variable
141 | 
142 |     Returns:
143 |         Variable tensor
144 |     """
145 |     with tf.variable_scope(scope) as sc:
146 |         kernel_h, kernel_w = kernel_size
147 |         num_in_channels = inputs.get_shape()[-1].value
148 |         kernel_shape = [kernel_h, kernel_w,
149 |                         num_in_channels, num_output_channels]
150 |         kernel = _variable_with_weight_decay('weights',
151 |                                              shape=kernel_shape,
152 |                                              use_xavier=use_xavier,
153 |                                              stddev=stddev,
154 |                                              wd=weight_decay)
155 |         stride_h, stride_w = stride
156 |         outputs = tf.nn.conv2d(inputs, kernel,
157 |                                [1, stride_h, stride_w, 1],
158 |                                padding=padding)
159 |         biases = _variable_on_cpu('biases', [num_output_channels],
160 |                                   tf.constant_initializer(0.0))
161 |         outputs = tf.nn.bias_add(outputs, biases)
162 | 
163 |         if bn:
164 |             outputs = batch_norm_for_conv2d(outputs, is_training,
165 |                                             bn_decay=bn_decay, scope='bn')
166 | 
167 |         if activation_fn is not None:
168 |             outputs = activation_fn(outputs)
169 |         return outputs
170 | 
171 | 
172 | def conv2d_transpose(inputs,
173 |                      num_output_channels,
174 |                      kernel_size,
175 |                      scope,
176 |                      stride=[1, 1],
177 |                      padding='SAME',
178 |                      use_xavier=True,
179 |                      stddev=1e-3,
180 |                      weight_decay=0.0,
181 |                      activation_fn=tf.nn.relu,
182 |                      bn=False,
183 |                      bn_decay=None,
184 |                      is_training=None):
185 |     """ 2D convolution transpose with non-linear operation.
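    Output spatial size follows the deconvolution rule implemented in
    get_deconv_dim below: out = in * stride for 'SAME' padding, and
    out = in * stride + max(kernel - stride, 0) for 'VALID' padding.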
186 | 
187 |     Args:
188 |         inputs: 4-D tensor variable BxHxWxC
189 |         num_output_channels: int
190 |         kernel_size: a list of 2 ints
191 |         scope: string
192 |         stride: a list of 2 ints
193 |         padding: 'SAME' or 'VALID'
194 |         use_xavier: bool, use xavier_initializer if true
195 |         stddev: float, stddev for truncated_normal init
196 |         weight_decay: float
197 |         activation_fn: function
198 |         bn: bool, whether to use batch norm
199 |         bn_decay: float or float tensor variable in [0,1]
200 |         is_training: bool Tensor variable
201 | 
202 |     Returns:
203 |         Variable tensor
204 | 
205 |     Note: conv2d(conv2d_transpose(a, num_out, ksize, stride), a.shape[-1], ksize, stride) has the same shape as a
206 |     """
207 |     with tf.variable_scope(scope) as sc:
208 |         kernel_h, kernel_w = kernel_size
209 |         num_in_channels = inputs.get_shape()[-1].value
210 |         kernel_shape = [kernel_h, kernel_w,
211 |                         num_output_channels, num_in_channels] # reversed compared to conv2d
212 |         kernel = _variable_with_weight_decay('weights',
213 |                                              shape=kernel_shape,
214 |                                              use_xavier=use_xavier,
215 |                                              stddev=stddev,
216 |                                              wd=weight_decay)
217 |         stride_h, stride_w = stride
218 | 
219 |         # from slim.convolution2d_transpose
220 |         def get_deconv_dim(dim_size, stride_size, kernel_size, padding):
221 |             dim_size *= stride_size
222 | 
223 |             if padding == 'VALID' and dim_size is not None:
224 |                 dim_size += max(kernel_size - stride_size, 0)
225 |             return dim_size
226 | 
227 |         # calculate output shape
228 |         batch_size = inputs.get_shape()[0].value
229 |         height = inputs.get_shape()[1].value
230 |         width = inputs.get_shape()[2].value
231 |         out_height = get_deconv_dim(height, stride_h, kernel_h, padding)
232 |         out_width = get_deconv_dim(width, stride_w, kernel_w, padding)
233 |         output_shape = [batch_size, out_height, out_width, num_output_channels]
234 | 
235 |         outputs = tf.nn.conv2d_transpose(inputs, kernel, output_shape,
236 |                                          [1, stride_h, stride_w, 1],
237 |                                          padding=padding)
238 |         biases = _variable_on_cpu('biases', [num_output_channels],
239 |                                   tf.constant_initializer(0.0))
240 |         outputs = tf.nn.bias_add(outputs, biases)
241 | 
242 |         if bn:
243 |             outputs = batch_norm_for_conv2d(outputs, is_training,
244 |                                             bn_decay=bn_decay, scope='bn')
245 | 
246 |         if activation_fn is not None:
247 |             outputs = activation_fn(outputs)
248 |         return outputs
249 | 
250 | 
251 | 
252 | def conv3d(inputs,
253 |            num_output_channels,
254 |            kernel_size,
255 |            scope,
256 |            stride=[1, 1, 1],
257 |            padding='SAME',
258 |            use_xavier=True,
259 |            stddev=1e-3,
260 |            weight_decay=0.0,
261 |            activation_fn=tf.nn.relu,
262 |            bn=False,
263 |            bn_decay=None,
264 |            is_training=None):
265 |     """ 3D convolution with non-linear operation.
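    For instance (an illustrative sketch):
        vol_feat = conv3d(voxels, 32, [3, 3, 3], scope='conv3d_1',
                          stride=[1, 1, 1], padding='SAME')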
266 | 
267 |     Args:
268 |         inputs: 5-D tensor variable BxDxHxWxC
269 |         num_output_channels: int
270 |         kernel_size: a list of 3 ints
271 |         scope: string
272 |         stride: a list of 3 ints
273 |         padding: 'SAME' or 'VALID'
274 |         use_xavier: bool, use xavier_initializer if true
275 |         stddev: float, stddev for truncated_normal init
276 |         weight_decay: float
277 |         activation_fn: function
278 |         bn: bool, whether to use batch norm
279 |         bn_decay: float or float tensor variable in [0,1]
280 |         is_training: bool Tensor variable
281 | 
282 |     Returns:
283 |         Variable tensor
284 |     """
285 |     with tf.variable_scope(scope) as sc:
286 |         kernel_d, kernel_h, kernel_w = kernel_size
287 |         num_in_channels = inputs.get_shape()[-1].value
288 |         kernel_shape = [kernel_d, kernel_h, kernel_w,
289 |                         num_in_channels, num_output_channels]
290 |         kernel = _variable_with_weight_decay('weights',
291 |                                              shape=kernel_shape,
292 |                                              use_xavier=use_xavier,
293 |                                              stddev=stddev,
294 |                                              wd=weight_decay)
295 |         stride_d, stride_h, stride_w = stride
296 |         outputs = tf.nn.conv3d(inputs, kernel,
297 |                                [1, stride_d, stride_h, stride_w, 1],
298 |                                padding=padding)
299 |         biases = _variable_on_cpu('biases', [num_output_channels],
300 |                                   tf.constant_initializer(0.0))
301 |         outputs = tf.nn.bias_add(outputs, biases)
302 | 
303 |         if bn:
304 |             outputs = batch_norm_for_conv3d(outputs, is_training,
305 |                                             bn_decay=bn_decay, scope='bn')
306 | 
307 |         if activation_fn is not None:
308 |             outputs = activation_fn(outputs)
309 |         return outputs
310 | 
311 | def fully_connected(inputs,
312 |                     num_outputs,
313 |                     scope,
314 |                     use_xavier=True,
315 |                     stddev=1e-3,
316 |                     weight_decay=0.0,
317 |                     activation_fn=tf.nn.relu,
318 |                     bn=False,
319 |                     bn_decay=None,
320 |                     is_training=None):
321 |     """ Fully connected layer with non-linear operation.
322 | 
323 |     Args:
324 |         inputs: 2-D tensor BxN
325 |         num_outputs: int
326 | 
327 |     Returns:
328 |         Variable tensor of size B x num_outputs.
329 |     """
330 |     with tf.variable_scope(scope) as sc:
331 |         num_input_units = inputs.get_shape()[-1].value
332 |         weights = _variable_with_weight_decay('weights',
333 |                                               shape=[num_input_units, num_outputs],
334 |                                               use_xavier=use_xavier,
335 |                                               stddev=stddev,
336 |                                               wd=weight_decay)
337 |         outputs = tf.matmul(inputs, weights)
338 |         biases = _variable_on_cpu('biases', [num_outputs],
339 |                                   tf.constant_initializer(0.0))
340 |         outputs = tf.nn.bias_add(outputs, biases)
341 | 
342 |         if bn:
343 |             outputs = batch_norm_for_fc(outputs, is_training, bn_decay, 'bn')
344 | 
345 |         if activation_fn is not None:
346 |             outputs = activation_fn(outputs)
347 |         return outputs
348 | 
349 | 
350 | def max_pool2d(inputs,
351 |                kernel_size,
352 |                scope,
353 |                stride=[2, 2],
354 |                padding='VALID'):
355 |     """ 2D max pooling.
356 | 
357 |     Args:
358 |         inputs: 4-D tensor BxHxWxC
359 |         kernel_size: a list of 2 ints
360 |         stride: a list of 2 ints
361 | 
362 |     Returns:
363 |         Variable tensor
364 |     """
365 |     with tf.variable_scope(scope) as sc:
366 |         kernel_h, kernel_w = kernel_size
367 |         stride_h, stride_w = stride
368 |         outputs = tf.nn.max_pool(inputs,
369 |                                  ksize=[1, kernel_h, kernel_w, 1],
370 |                                  strides=[1, stride_h, stride_w, 1],
371 |                                  padding=padding,
372 |                                  name=sc.name)
373 |         return outputs
374 | 
375 | def avg_pool2d(inputs,
376 |                kernel_size,
377 |                scope,
378 |                stride=[2, 2],
379 |                padding='VALID'):
380 |     """ 2D avg pooling.
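    Same calling convention as max_pool2d above; e.g. (an illustrative
    sketch): pooled = avg_pool2d(net, [2, 2], scope='pool1')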
381 | 
382 |     Args:
383 |         inputs: 4-D tensor BxHxWxC
384 |         kernel_size: a list of 2 ints
385 |         stride: a list of 2 ints
386 | 
387 |     Returns:
388 |         Variable tensor
389 |     """
390 |     with tf.variable_scope(scope) as sc:
391 |         kernel_h, kernel_w = kernel_size
392 |         stride_h, stride_w = stride
393 |         outputs = tf.nn.avg_pool(inputs,
394 |                                  ksize=[1, kernel_h, kernel_w, 1],
395 |                                  strides=[1, stride_h, stride_w, 1],
396 |                                  padding=padding,
397 |                                  name=sc.name)
398 |         return outputs
399 | 
400 | 
401 | def max_pool3d(inputs,
402 |                kernel_size,
403 |                scope,
404 |                stride=[2, 2, 2],
405 |                padding='VALID'):
406 |     """ 3D max pooling.
407 | 
408 |     Args:
409 |         inputs: 5-D tensor BxDxHxWxC
410 |         kernel_size: a list of 3 ints
411 |         stride: a list of 3 ints
412 | 
413 |     Returns:
414 |         Variable tensor
415 |     """
416 |     with tf.variable_scope(scope) as sc:
417 |         kernel_d, kernel_h, kernel_w = kernel_size
418 |         stride_d, stride_h, stride_w = stride
419 |         outputs = tf.nn.max_pool3d(inputs,
420 |                                    ksize=[1, kernel_d, kernel_h, kernel_w, 1],
421 |                                    strides=[1, stride_d, stride_h, stride_w, 1],
422 |                                    padding=padding,
423 |                                    name=sc.name)
424 |         return outputs
425 | 
426 | def avg_pool3d(inputs,
427 |                kernel_size,
428 |                scope,
429 |                stride=[2, 2, 2],
430 |                padding='VALID'):
431 |     """ 3D avg pooling.
432 | 
433 |     Args:
434 |         inputs: 5-D tensor BxDxHxWxC
435 |         kernel_size: a list of 3 ints
436 |         stride: a list of 3 ints
437 | 
438 |     Returns:
439 |         Variable tensor
440 |     """
441 |     with tf.variable_scope(scope) as sc:
442 |         kernel_d, kernel_h, kernel_w = kernel_size
443 |         stride_d, stride_h, stride_w = stride
444 |         outputs = tf.nn.avg_pool3d(inputs,
445 |                                    ksize=[1, kernel_d, kernel_h, kernel_w, 1],
446 |                                    strides=[1, stride_d, stride_h, stride_w, 1],
447 |                                    padding=padding,
448 |                                    name=sc.name)
449 |         return outputs
450 | 
451 | 
452 | 
453 | 
454 | 
455 | def batch_norm_template(inputs, is_training, scope, moments_dims, bn_decay):
456 |     """ Batch normalization on convolutional maps and beyond...
457 |     Ref.: http://stackoverflow.com/questions/33949786/how-could-i-use-batch-normalization-in-tensorflow
458 | 
459 |     Args:
460 |         inputs: Tensor, k-D input ... x C could be BC or BHWC or BDHWC
461 |         is_training: boolean tf.Variable, true indicates training phase
462 |         scope: string, variable scope
463 |         moments_dims: a list of ints, indicating dimensions for moments calculation
464 |         bn_decay: float or float tensor variable, controlling moving average weight
465 |     Return:
466 |         normed: batch-normalized maps
467 |     """
468 |     with tf.variable_scope(scope) as sc:
469 |         num_channels = inputs.get_shape()[-1].value
470 |         beta = tf.Variable(tf.constant(0.0, shape=[num_channels]),
471 |                            name='beta', trainable=True)
472 |         gamma = tf.Variable(tf.constant(1.0, shape=[num_channels]),
473 |                             name='gamma', trainable=True)
474 |         batch_mean, batch_var = tf.nn.moments(inputs, moments_dims, name='moments')
475 |         decay = bn_decay if bn_decay is not None else 0.9
476 |         ema = tf.train.ExponentialMovingAverage(decay=decay)
477 |         # Operator that maintains moving averages of variables.
478 |         ema_apply_op = tf.cond(is_training,
479 |                                lambda: ema.apply([batch_mean, batch_var]),
480 |                                lambda: tf.no_op())
481 | 
482 |         # Update moving average and return current batch's avg and var.
483 |         def mean_var_with_update():
484 |             with tf.control_dependencies([ema_apply_op]):
485 |                 return tf.identity(batch_mean), tf.identity(batch_var)
486 | 
487 |         # ema.average returns the Variable holding the average of var.
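        # At training time the cond below takes the first branch (use the
        # batch moments and refresh their moving averages); at test time it
        # takes the second branch and normalizes with the stored moving
        # averages instead.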
488 |         mean, var = tf.cond(is_training,
489 |                             mean_var_with_update,
490 |                             lambda: (ema.average(batch_mean), ema.average(batch_var)))
491 |         normed = tf.nn.batch_normalization(inputs, mean, var, beta, gamma, 1e-3)
492 |     return normed
493 | 
494 | 
495 | def batch_norm_for_fc(inputs, is_training, bn_decay, scope):
496 |     """ Batch normalization on FC data.
497 | 
498 |     Args:
499 |         inputs: Tensor, 2D BxC input
500 |         is_training: boolean tf.Variable, true indicates training phase
501 |         bn_decay: float or float tensor variable, controlling moving average weight
502 |         scope: string, variable scope
503 |     Return:
504 |         normed: batch-normalized maps
505 |     """
506 |     return batch_norm_template(inputs, is_training, scope, [0,], bn_decay)
507 | 
508 | 
509 | def batch_norm_for_conv1d(inputs, is_training, bn_decay, scope):
510 |     """ Batch normalization on 1D convolutional maps.
511 | 
512 |     Args:
513 |         inputs: Tensor, 3D BLC input maps
514 |         is_training: boolean tf.Variable, true indicates training phase
515 |         bn_decay: float or float tensor variable, controlling moving average weight
516 |         scope: string, variable scope
517 |     Return:
518 |         normed: batch-normalized maps
519 |     """
520 |     return batch_norm_template(inputs, is_training, scope, [0,1], bn_decay)
521 | 
522 | 
523 | 
524 | 
525 | def batch_norm_for_conv2d(inputs, is_training, bn_decay, scope):
526 |     """ Batch normalization on 2D convolutional maps.
527 | 
528 |     Args:
529 |         inputs: Tensor, 4D BHWC input maps
530 |         is_training: boolean tf.Variable, true indicates training phase
531 |         bn_decay: float or float tensor variable, controlling moving average weight
532 |         scope: string, variable scope
533 |     Return:
534 |         normed: batch-normalized maps
535 |     """
536 |     return batch_norm_template(inputs, is_training, scope, [0,1,2], bn_decay)
537 | 
538 | 
539 | 
540 | def batch_norm_for_conv3d(inputs, is_training, bn_decay, scope):
541 |     """ Batch normalization on 3D convolutional maps.
542 | 
543 |     Args:
544 |         inputs: Tensor, 5D BDHWC input maps
545 |         is_training: boolean tf.Variable, true indicates training phase
546 |         bn_decay: float or float tensor variable, controlling moving average weight
547 |         scope: string, variable scope
548 |     Return:
549 |         normed: batch-normalized maps
550 |     """
551 |     return batch_norm_template(inputs, is_training, scope, [0,1,2,3], bn_decay)
552 | 
553 | 
554 | def dropout(inputs,
555 |             is_training,
556 |             scope,
557 |             keep_prob=0.5,
558 |             noise_shape=None):
559 |     """ Dropout layer.
560 | 
561 |     Args:
562 |         inputs: tensor
563 |         is_training: boolean tf.Variable
564 |         scope: string
565 |         keep_prob: float in [0,1]
566 |         noise_shape: list of ints
567 | 
568 |     Returns:
569 |         tensor variable
570 |     """
571 |     with tf.variable_scope(scope) as sc:
572 |         outputs = tf.cond(is_training,
573 |                           lambda: tf.nn.dropout(inputs, keep_prob, noise_shape),
574 |                           lambda: inputs)
575 |         return outputs
576 | 
--------------------------------------------------------------------------------