├── .gitattributes
├── LICENSE
├── README.md
├── doc
│   └── teaser.png
├── evaluate.py
├── models
│   ├── pointnet_cls.py
│   ├── pointnet_cls_basic.py
│   ├── pointnet_seg.py
│   └── transform_nets.py
├── part_seg
│   ├── download_data.sh
│   ├── pointnet_part_seg.py
│   ├── test.py
│   ├── testing_ply_file_list.txt
│   └── train.py
├── provider.py
├── sem_seg
│   ├── README.md
│   ├── batch_inference.py
│   ├── collect_indoor3d_data.py
│   ├── download_data.sh
│   ├── eval_iou_accuracy.py
│   ├── gen_indoor3d_h5.py
│   ├── indoor3d_util.py
│   ├── meta
│   │   ├── all_data_label.txt
│   │   ├── anno_paths.txt
│   │   ├── area6_data_label.txt
│   │   └── class_names.txt
│   ├── model.py
│   └── train.py
├── train.py
└── utils
    ├── data_prep_util.py
    ├── eulerangles.py
    ├── pc_util.py
    ├── plyfile.py
    └── tf_util.py
/.gitattributes:
--------------------------------------------------------------------------------
1 | # Auto detect text files and perform LF normalization
2 | * text=auto
3 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation.
2 |
3 | Copyright (c) 2017, Geometric Computation Group of Stanford University
4 |
5 | The MIT License (MIT)
6 |
7 | Copyright (c) 2017 Charles R. Qi
8 |
9 | Permission is hereby granted, free of charge, to any person obtaining a copy
10 | of this software and associated documentation files (the "Software"), to deal
11 | in the Software without restriction, including without limitation the rights
12 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
13 | copies of the Software, and to permit persons to whom the Software is
14 | furnished to do so, subject to the following conditions:
15 |
16 | The above copyright notice and this permission notice shall be included in all
17 | copies or substantial portions of the Software.
18 |
19 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
20 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
21 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
22 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
23 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
24 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
25 | SOFTWARE.
26 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 |
2 |
3 | ### Introduction
4 | This work is based on our [arXiv tech report](https://arxiv.org/abs/1612.00593), which appeared in CVPR 2017. We proposed a novel deep net architecture for point clouds (as unordered point sets). You can also check our [project webpage](http://stanford.edu/~rqi/pointnet) for a deeper introduction.
5 |
6 | A point cloud is an important type of geometric data structure. Due to its irregular format, most researchers transform such data into regular 3D voxel grids or collections of images. This, however, renders the data unnecessarily voluminous and causes quantization issues. In this paper, we design a novel type of neural network that directly consumes point clouds and respects the permutation invariance of the input points. Our network, named PointNet, provides a unified architecture for applications ranging from object classification and part segmentation to scene semantic parsing. Though simple, PointNet is highly efficient and effective.
7 |
8 | In this repository, we release code and data for training a PointNet classification network on point clouds sampled from 3D shapes, as well as for training a part segmentation network on ShapeNet Part dataset.
9 |
10 | ### Citation
11 | If you find our work useful in your research, please consider citing:
12 |
13 | @article{qi2016pointnet,
14 | title={PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation},
15 | author={Qi, Charles R and Su, Hao and Mo, Kaichun and Guibas, Leonidas J},
16 | journal={arXiv preprint arXiv:1612.00593},
17 | year={2016}
18 | }
19 |
20 | ### Installation
21 |
22 | Install TensorFlow. You may also need to install h5py. The code has been tested with Python 2.7, TensorFlow 1.0.1, CUDA 8.0 and cuDNN 5.1 on Ubuntu 14.04.
23 |
24 | If you prefer PyTorch, third-party PyTorch implementations of PointNet are also available.
25 |
26 | To install h5py for Python:
27 | ```bash
28 | sudo apt-get install libhdf5-dev
29 | sudo pip install h5py
30 | ```
31 |
32 | ### Usage
33 | To train a model to classify point clouds sampled from 3D shapes:
34 |
35 | python train.py
36 |
37 | Log files and network parameters will be saved to the `log` folder by default. Point clouds of ModelNet40 models in HDF5 format will be automatically downloaded (416MB) to the `data` folder. Each point cloud contains 2048 points uniformly sampled from the shape surface, zero-centered and normalized to the unit sphere. There are also text files in `data/modelnet40_ply_hdf5_2048` specifying the IDs of the shapes in each h5 file.
38 |
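The HDF5 files can be inspected directly with `h5py`. A minimal sketch, assuming each file exposes `data` and `label` datasets (the keys read by `provider.loadDataFile`) and the auto-downloaded file layout:

```python
import h5py

# Open one of the auto-downloaded ModelNet40 files (file name assumed).
with h5py.File('data/modelnet40_ply_hdf5_2048/ply_data_train0.h5', 'r') as f:
    data = f['data'][:]    # (num_shapes, 2048, 3) float32 point clouds
    label = f['label'][:]  # (num_shapes, 1) integer class ids
print(data.shape, label.shape)
```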
39 | To see HELP for the training script:
40 |
41 | python train.py -h
42 |
43 | We can use TensorBoard to view the network architecture and monitor the training progress.
44 |
45 | tensorboard --logdir log
46 |
47 | After the above training, we can evaluate the model and output some visualizations of the error cases.
48 |
49 | python evaluate.py --visu
50 |
51 | Point clouds that are wrongly classified will be saved to the `dump` folder by default. We visualize each point cloud by rendering it into three-view images.
52 |
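To render a point cloud yourself, a sketch using the same helper `evaluate.py` calls (`pc_util.point_cloud_three_views`, which maps an Nx3 array to a single image array):

```python
import sys
import numpy as np
import scipy.misc
sys.path.append('utils')
import pc_util

# A random cloud stands in for a real shape here.
points = np.random.randn(1024, 3).astype(np.float32)
img = pc_util.point_cloud_three_views(points)
scipy.misc.imsave('three_views.jpg', img)
```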
53 | If you'd like to prepare your own data, you can refer to the helper functions in `utils/data_prep_util.py` for saving and loading HDF5 files; a raw `h5py` sketch follows.
54 |
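If you prefer writing the files directly, a minimal `h5py` sketch (dataset names and dtypes assumed to mirror the downloaded ModelNet40 files):

```python
import h5py
import numpy as np

# Placeholder arrays: (num_shapes, num_points, 3) clouds and (num_shapes, 1) labels.
points = np.zeros((10, 2048, 3), dtype=np.float32)
labels = np.zeros((10, 1), dtype=np.uint8)

with h5py.File('my_data.h5', 'w') as f:
    f.create_dataset('data', data=points, compression='gzip', dtype='float32')
    f.create_dataset('label', data=labels, compression='gzip', dtype='uint8')
```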
55 | ### Part Segmentation
56 | To train a model for object part segmentation, first download the data:
57 |
58 | cd part_seg
59 | sh download_data.sh
60 |
61 | The download script fetches the ShapeNetPart dataset (around 1.08GB) and our prepared HDF5 files (around 346MB).
62 |
63 | Then you can run `train.py` and `test.py` in the `part_seg` folder for training and testing (computing mIoU for evaluation).
64 |
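For reference, `test.py` scores each shape by averaging per-part IoU over the parts of the shape's category, counting a part absent from both prediction and ground truth as IoU 1. A NumPy sketch of that computation:

```python
import numpy as np

def shape_iou(seg_pred, seg_gt, part_ids):
    """Mean IoU over part_ids for one shape (mirrors part_seg/test.py)."""
    total_iou = 0.0
    for oid in part_ids:
        n_pred = np.sum(seg_pred == oid)
        n_gt = np.sum(seg_gt == oid)
        n_intersect = np.sum((seg_pred == oid) & (seg_gt == oid))
        n_union = n_pred + n_gt - n_intersect
        # Empty union: the part appears in neither prediction nor ground truth.
        total_iou += 1.0 if n_union == 0 else n_intersect / float(n_union)
    return total_iou / len(part_ids)
```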
65 | ### License
66 | Our code is released under the MIT License (see the LICENSE file for details).
67 |
68 | ### Selected Projects that Use PointNet
69 |
70 | * PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space by Qi et al. (NIPS 2017). A hierarchical feature learning framework for point clouds: the PointNet++ architecture applies PointNet recursively on a nested partitioning of the input point set, and it also proposes novel layers for point clouds with non-uniform densities.
71 | * Exploring Spatial Context for 3D Semantic Segmentation of Point Clouds by Engelmann et al. (ICCV 2017 workshop). This work extends PointNet to large-scale scene segmentation.
72 | * PCPNET: Learning Local Shape Properties from Raw Point Clouds by Guerrero et al. (arXiv). This work adapts PointNet to estimate local geometric properties (e.g., normals and curvature) in noisy point clouds.
73 | * VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection by Zhou et al. from Apple (arXiv). This work studies 3D object detection from LiDAR point clouds: it splits space into voxels, uses PointNet to learn local voxel features and then uses a 3D CNN for region proposal, object classification and 3D bounding box estimation.
74 | * Frustum PointNets for 3D Object Detection from RGB-D Data by Qi et al. (arXiv). A novel framework for 3D object detection with RGB-D data; the proposed method achieved first place on the KITTI 3D object detection benchmark across all categories (last checked on 11/30/2017).
75 |
--------------------------------------------------------------------------------
/doc/teaser.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ZhihaoZhu/PointNet-Implementation-Tensorflow/14709d960aaf47a642bc6e45c928bfd02ed47cfd/doc/teaser.png
--------------------------------------------------------------------------------
/evaluate.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | import argparse
4 | import socket
5 | import importlib
6 | import time
7 | import os
8 | import scipy.misc
9 | import sys
10 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
11 | sys.path.append(BASE_DIR)
12 | sys.path.append(os.path.join(BASE_DIR, 'models'))
13 | sys.path.append(os.path.join(BASE_DIR, 'utils'))
14 | import provider
15 | import pc_util
16 |
17 |
18 | parser = argparse.ArgumentParser()
19 | parser.add_argument('--gpu', type=int, default=0, help='GPU to use [default: GPU 0]')
20 | parser.add_argument('--model', default='pointnet_cls', help='Model name: pointnet_cls or pointnet_cls_basic [default: pointnet_cls]')
21 | parser.add_argument('--batch_size', type=int, default=4, help='Batch Size during evaluation [default: 4]')
22 | parser.add_argument('--num_point', type=int, default=1024, help='Point Number [256/512/1024/2048] [default: 1024]')
23 | parser.add_argument('--model_path', default='log/model.ckpt', help='model checkpoint file path [default: log/model.ckpt]')
24 | parser.add_argument('--dump_dir', default='dump', help='dump folder path [dump]')
25 | parser.add_argument('--visu', action='store_true', help='Whether to dump image for error case [default: False]')
26 | FLAGS = parser.parse_args()
27 |
28 |
29 | BATCH_SIZE = FLAGS.batch_size
30 | NUM_POINT = FLAGS.num_point
31 | MODEL_PATH = FLAGS.model_path
32 | GPU_INDEX = FLAGS.gpu
33 | MODEL = importlib.import_module(FLAGS.model) # import network module
34 | DUMP_DIR = FLAGS.dump_dir
35 | if not os.path.exists(DUMP_DIR): os.mkdir(DUMP_DIR)
36 | LOG_FOUT = open(os.path.join(DUMP_DIR, 'log_evaluate.txt'), 'w')
37 | LOG_FOUT.write(str(FLAGS)+'\n')
38 |
39 | NUM_CLASSES = 40
40 | SHAPE_NAMES = [line.rstrip() for line in \
41 | open(os.path.join(BASE_DIR, 'data/modelnet40_ply_hdf5_2048/shape_names.txt'))]
42 |
43 | HOSTNAME = socket.gethostname()
44 |
45 | # ModelNet40 official train/test split
46 | TRAIN_FILES = provider.getDataFiles( \
47 | os.path.join(BASE_DIR, 'data/modelnet40_ply_hdf5_2048/train_files.txt'))
48 | TEST_FILES = provider.getDataFiles(\
49 | os.path.join(BASE_DIR, 'data/modelnet40_ply_hdf5_2048/test_files.txt'))
50 |
51 | def log_string(out_str):
52 | LOG_FOUT.write(out_str+'\n')
53 | LOG_FOUT.flush()
54 | print(out_str)
55 |
56 | def evaluate(num_votes):
57 | is_training = False
58 |
59 | with tf.device('/gpu:'+str(GPU_INDEX)):
60 | pointclouds_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE, NUM_POINT)
61 | is_training_pl = tf.placeholder(tf.bool, shape=())
62 |
63 | # simple model
64 | pred, end_points = MODEL.get_model(pointclouds_pl, is_training_pl)
65 | loss = MODEL.get_loss(pred, labels_pl, end_points)
66 |
67 | # Add ops to save and restore all the variables.
68 | saver = tf.train.Saver()
69 |
70 | # Create a session
71 | config = tf.ConfigProto()
72 | config.gpu_options.allow_growth = True
73 | config.allow_soft_placement = True
74 | config.log_device_placement = True
75 | sess = tf.Session(config=config)
76 |
77 | # Restore variables from disk.
78 | saver.restore(sess, MODEL_PATH)
79 | log_string("Model restored.")
80 |
81 | ops = {'pointclouds_pl': pointclouds_pl,
82 | 'labels_pl': labels_pl,
83 | 'is_training_pl': is_training_pl,
84 | 'pred': pred,
85 | 'loss': loss}
86 |
87 | eval_one_epoch(sess, ops, num_votes)
88 |
89 |
90 | def eval_one_epoch(sess, ops, num_votes=1, topk=1):
91 | error_cnt = 0
92 | is_training = False
93 | total_correct = 0
94 | total_seen = 0
95 | loss_sum = 0
96 | total_seen_class = [0 for _ in range(NUM_CLASSES)]
97 | total_correct_class = [0 for _ in range(NUM_CLASSES)]
98 | fout = open(os.path.join(DUMP_DIR, 'pred_label.txt'), 'w')
99 | for fn in range(len(TEST_FILES)):
100 | log_string('----'+str(fn)+'----')
101 | current_data, current_label = provider.loadDataFile(TEST_FILES[fn])
102 | current_data = current_data[:,0:NUM_POINT,:]
103 | current_label = np.squeeze(current_label)
104 | print(current_data.shape)
105 |
106 | file_size = current_data.shape[0]
107 | num_batches = file_size // BATCH_SIZE
108 | print(file_size)
109 |
110 | for batch_idx in range(num_batches):
111 | start_idx = batch_idx * BATCH_SIZE
112 | end_idx = (batch_idx+1) * BATCH_SIZE
113 | cur_batch_size = end_idx - start_idx
114 |
115 | # Aggregating BEG
116 | batch_loss_sum = 0 # sum of losses for the batch
117 | batch_pred_sum = np.zeros((cur_batch_size, NUM_CLASSES)) # score for classes
118 | batch_pred_classes = np.zeros((cur_batch_size, NUM_CLASSES)) # 0/1 for classes
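            # Vote over num_votes rotated copies of each cloud: rotation k spins the
            # cloud by 2*pi*k/num_votes about the up axis, the predicted logits are
            # summed across votes, and the final label is the argmax of that sum.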
119 | for vote_idx in range(num_votes):
120 | rotated_data = provider.rotate_point_cloud_by_angle(current_data[start_idx:end_idx, :, :],
121 | vote_idx/float(num_votes) * np.pi * 2)
122 | feed_dict = {ops['pointclouds_pl']: rotated_data,
123 | ops['labels_pl']: current_label[start_idx:end_idx],
124 | ops['is_training_pl']: is_training}
125 | loss_val, pred_val = sess.run([ops['loss'], ops['pred']],
126 | feed_dict=feed_dict)
127 | batch_pred_sum += pred_val
128 | batch_pred_val = np.argmax(pred_val, 1)
129 | for el_idx in range(cur_batch_size):
130 | batch_pred_classes[el_idx, batch_pred_val[el_idx]] += 1
131 | batch_loss_sum += (loss_val * cur_batch_size / float(num_votes))
132 | # pred_val_topk = np.argsort(batch_pred_sum, axis=-1)[:,-1*np.array(range(topk))-1]
133 | # pred_val = np.argmax(batch_pred_classes, 1)
134 | pred_val = np.argmax(batch_pred_sum, 1)
135 | # Aggregating END
136 |
137 | correct = np.sum(pred_val == current_label[start_idx:end_idx])
138 | # correct = np.sum(pred_val_topk[:,0:topk] == label_val)
139 | total_correct += correct
140 | total_seen += cur_batch_size
141 | loss_sum += batch_loss_sum
142 |
143 | for i in range(start_idx, end_idx):
144 | l = current_label[i]
145 | total_seen_class[l] += 1
146 | total_correct_class[l] += (pred_val[i-start_idx] == l)
147 | fout.write('%d, %d\n' % (pred_val[i-start_idx], l))
148 |
149 | if pred_val[i-start_idx] != l and FLAGS.visu: # ERROR CASE, DUMP!
150 | img_filename = '%d_label_%s_pred_%s.jpg' % (error_cnt, SHAPE_NAMES[l],
151 | SHAPE_NAMES[pred_val[i-start_idx]])
152 | img_filename = os.path.join(DUMP_DIR, img_filename)
153 | output_img = pc_util.point_cloud_three_views(np.squeeze(current_data[i, :, :]))
154 | scipy.misc.imsave(img_filename, output_img)
155 | error_cnt += 1
156 |
157 | log_string('eval mean loss: %f' % (loss_sum / float(total_seen)))
158 | log_string('eval accuracy: %f' % (total_correct / float(total_seen)))
159 | log_string('eval avg class acc: %f' % (np.mean(np.array(total_correct_class)/np.array(total_seen_class,dtype=np.float))))
160 |
161 | class_accuracies = np.array(total_correct_class)/np.array(total_seen_class,dtype=np.float)
162 | for i, name in enumerate(SHAPE_NAMES):
163 | log_string('%10s:\t%0.3f' % (name, class_accuracies[i]))
164 |
165 |
166 |
167 | if __name__=='__main__':
168 | with tf.Graph().as_default():
169 | evaluate(num_votes=1)
170 | LOG_FOUT.close()
171 |
--------------------------------------------------------------------------------
/models/pointnet_cls.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | import math
4 | import sys
5 | import os
6 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
7 | sys.path.append(BASE_DIR)
8 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
9 | import tf_util
10 | from transform_nets import input_transform_net, feature_transform_net
11 |
12 | def placeholder_inputs(batch_size, num_point):
13 | pointclouds_pl = tf.placeholder(tf.float32, shape=(batch_size, num_point, 3))
14 | labels_pl = tf.placeholder(tf.int32, shape=(batch_size))
15 | return pointclouds_pl, labels_pl
16 |
17 |
18 | def get_model(point_cloud, is_training, bn_decay=None):
19 | """ Classification PointNet, input is BxNx3, output Bx40 """
20 | batch_size = point_cloud.get_shape()[0].value
21 | num_point = point_cloud.get_shape()[1].value
22 | end_points = {}
23 |
24 | with tf.variable_scope('transform_net1') as sc:
25 | transform = input_transform_net(point_cloud, is_training, bn_decay, K=3)
26 | point_cloud_transformed = tf.matmul(point_cloud, transform)
27 | input_image = tf.expand_dims(point_cloud_transformed, -1)
28 |
29 | net = tf_util.conv2d(input_image, 64, [1,3],
30 | padding='VALID', stride=[1,1],
31 | bn=True, is_training=is_training,
32 | scope='conv1', bn_decay=bn_decay)
33 | net = tf_util.conv2d(net, 64, [1,1],
34 | padding='VALID', stride=[1,1],
35 | bn=True, is_training=is_training,
36 | scope='conv2', bn_decay=bn_decay)
37 |
38 | with tf.variable_scope('transform_net2') as sc:
39 | transform = feature_transform_net(net, is_training, bn_decay, K=64)
40 | end_points['transform'] = transform
41 | net_transformed = tf.matmul(tf.squeeze(net, axis=[2]), transform)
42 | net_transformed = tf.expand_dims(net_transformed, [2])
43 |
44 | net = tf_util.conv2d(net_transformed, 64, [1,1],
45 | padding='VALID', stride=[1,1],
46 | bn=True, is_training=is_training,
47 | scope='conv3', bn_decay=bn_decay)
48 | net = tf_util.conv2d(net, 128, [1,1],
49 | padding='VALID', stride=[1,1],
50 | bn=True, is_training=is_training,
51 | scope='conv4', bn_decay=bn_decay)
52 | net = tf_util.conv2d(net, 1024, [1,1],
53 | padding='VALID', stride=[1,1],
54 | bn=True, is_training=is_training,
55 | scope='conv5', bn_decay=bn_decay)
56 |
57 | # Symmetric function: max pooling
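    # Max pooling across the N points is the symmetric (order-invariant)
    # aggregation that makes the network independent of point ordering.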
58 | net = tf_util.max_pool2d(net, [num_point,1],
59 | padding='VALID', scope='maxpool')
60 |
61 | net = tf.reshape(net, [batch_size, -1])
62 | net = tf_util.fully_connected(net, 512, bn=True, is_training=is_training,
63 | scope='fc1', bn_decay=bn_decay)
64 | net = tf_util.dropout(net, keep_prob=0.7, is_training=is_training,
65 | scope='dp1')
66 | net = tf_util.fully_connected(net, 256, bn=True, is_training=is_training,
67 | scope='fc2', bn_decay=bn_decay)
68 | net = tf_util.dropout(net, keep_prob=0.7, is_training=is_training,
69 | scope='dp2')
70 | net = tf_util.fully_connected(net, 40, activation_fn=None, scope='fc3')
71 |
72 | return net, end_points
73 |
74 |
75 | def get_loss(pred, label, end_points, reg_weight=0.001):
76 | """ pred: B*NUM_CLASSES,
77 | label: B, """
78 | loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=pred, labels=label)
79 | classify_loss = tf.reduce_mean(loss)
80 | tf.summary.scalar('classify loss', classify_loss)
81 |
82 | # Enforce the transformation as orthogonal matrix
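    # mat_diff below is A*A^T - I for the predicted KxK feature transform A;
    # tf.nn.l2_loss(mat_diff) = ||A*A^T - I||^2 / 2, so minimizing it pushes A
    # toward an orthogonal matrix.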
83 | transform = end_points['transform'] # BxKxK
84 | K = transform.get_shape()[1].value
85 | mat_diff = tf.matmul(transform, tf.transpose(transform, perm=[0,2,1]))
86 | mat_diff -= tf.constant(np.eye(K), dtype=tf.float32)
87 | mat_diff_loss = tf.nn.l2_loss(mat_diff)
88 | tf.summary.scalar('mat loss', mat_diff_loss)
89 |
90 | return classify_loss + mat_diff_loss * reg_weight
91 |
92 |
93 | if __name__=='__main__':
94 | with tf.Graph().as_default():
95 | inputs = tf.zeros((32,1024,3))
96 | outputs = get_model(inputs, tf.constant(True))
97 | print(outputs)
98 |
--------------------------------------------------------------------------------
/models/pointnet_cls_basic.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | import math
4 | import sys
5 | import os
6 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
7 | sys.path.append(BASE_DIR)
8 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
9 | import tf_util
10 |
11 | def placeholder_inputs(batch_size, num_point):
12 | pointclouds_pl = tf.placeholder(tf.float32, shape=(batch_size, num_point, 3))
13 | labels_pl = tf.placeholder(tf.int32, shape=(batch_size))
14 | return pointclouds_pl, labels_pl
15 |
16 |
17 | def get_model(point_cloud, is_training, bn_decay=None):
18 | """ Classification PointNet, input is BxNx3, output Bx40 """
19 | batch_size = point_cloud.get_shape()[0].value
20 | num_point = point_cloud.get_shape()[1].value
21 | end_points = {}
22 | input_image = tf.expand_dims(point_cloud, -1)
23 |
24 | # Point functions (MLP implemented as conv2d)
25 | net = tf_util.conv2d(input_image, 64, [1,3],
26 | padding='VALID', stride=[1,1],
27 | bn=True, is_training=is_training,
28 | scope='conv1', bn_decay=bn_decay)
29 | net = tf_util.conv2d(net, 64, [1,1],
30 | padding='VALID', stride=[1,1],
31 | bn=True, is_training=is_training,
32 | scope='conv2', bn_decay=bn_decay)
33 | net = tf_util.conv2d(net, 64, [1,1],
34 | padding='VALID', stride=[1,1],
35 | bn=True, is_training=is_training,
36 | scope='conv3', bn_decay=bn_decay)
37 | net = tf_util.conv2d(net, 128, [1,1],
38 | padding='VALID', stride=[1,1],
39 | bn=True, is_training=is_training,
40 | scope='conv4', bn_decay=bn_decay)
41 | net = tf_util.conv2d(net, 1024, [1,1],
42 | padding='VALID', stride=[1,1],
43 | bn=True, is_training=is_training,
44 | scope='conv5', bn_decay=bn_decay)
45 |
46 | # Symmetric function: max pooling
47 | net = tf_util.max_pool2d(net, [num_point,1],
48 | padding='VALID', scope='maxpool')
49 |
50 | # MLP on global point cloud vector
51 | net = tf.reshape(net, [batch_size, -1])
52 | net = tf_util.fully_connected(net, 512, bn=True, is_training=is_training,
53 | scope='fc1', bn_decay=bn_decay)
54 | net = tf_util.fully_connected(net, 256, bn=True, is_training=is_training,
55 | scope='fc2', bn_decay=bn_decay)
56 | net = tf_util.dropout(net, keep_prob=0.7, is_training=is_training,
57 | scope='dp1')
58 | net = tf_util.fully_connected(net, 40, activation_fn=None, scope='fc3')
59 |
60 | return net, end_points
61 |
62 |
63 | def get_loss(pred, label, end_points):
64 | """ pred: B*NUM_CLASSES,
65 | label: B, """
66 | loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=pred, labels=label)
67 | classify_loss = tf.reduce_mean(loss)
68 | tf.summary.scalar('classify loss', classify_loss)
69 | return classify_loss
70 |
71 |
72 | if __name__=='__main__':
73 | with tf.Graph().as_default():
74 | inputs = tf.zeros((32,1024,3))
75 | outputs = get_model(inputs, tf.constant(True))
76 | print(outputs)
77 |
--------------------------------------------------------------------------------
/models/pointnet_seg.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | import math
4 | import sys
5 | import os
6 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
7 | sys.path.append(BASE_DIR)
8 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
9 | import tf_util
10 | from transform_nets import input_transform_net, feature_transform_net
11 |
12 | def placeholder_inputs(batch_size, num_point):
13 | pointclouds_pl = tf.placeholder(tf.float32,
14 | shape=(batch_size, num_point, 3))
15 | labels_pl = tf.placeholder(tf.int32,
16 | shape=(batch_size, num_point))
17 | return pointclouds_pl, labels_pl
18 |
19 |
20 | def get_model(point_cloud, is_training, bn_decay=None):
21 | """ Classification PointNet, input is BxNx3, output BxNx50 """
22 | batch_size = point_cloud.get_shape()[0].value
23 | num_point = point_cloud.get_shape()[1].value
24 | end_points = {}
25 |
26 | with tf.variable_scope('transform_net1') as sc:
27 | transform = input_transform_net(point_cloud, is_training, bn_decay, K=3)
28 | point_cloud_transformed = tf.matmul(point_cloud, transform)
29 | input_image = tf.expand_dims(point_cloud_transformed, -1)
30 |
31 | net = tf_util.conv2d(input_image, 64, [1,3],
32 | padding='VALID', stride=[1,1],
33 | bn=True, is_training=is_training,
34 | scope='conv1', bn_decay=bn_decay)
35 | net = tf_util.conv2d(net, 64, [1,1],
36 | padding='VALID', stride=[1,1],
37 | bn=True, is_training=is_training,
38 | scope='conv2', bn_decay=bn_decay)
39 |
40 | with tf.variable_scope('transform_net2') as sc:
41 | transform = feature_transform_net(net, is_training, bn_decay, K=64)
42 | end_points['transform'] = transform
43 | net_transformed = tf.matmul(tf.squeeze(net, axis=[2]), transform)
44 | point_feat = tf.expand_dims(net_transformed, [2])
45 | print(point_feat)
46 |
47 | net = tf_util.conv2d(point_feat, 64, [1,1],
48 | padding='VALID', stride=[1,1],
49 | bn=True, is_training=is_training,
50 | scope='conv3', bn_decay=bn_decay)
51 | net = tf_util.conv2d(net, 128, [1,1],
52 | padding='VALID', stride=[1,1],
53 | bn=True, is_training=is_training,
54 | scope='conv4', bn_decay=bn_decay)
55 | net = tf_util.conv2d(net, 1024, [1,1],
56 | padding='VALID', stride=[1,1],
57 | bn=True, is_training=is_training,
58 | scope='conv5', bn_decay=bn_decay)
59 | global_feat = tf_util.max_pool2d(net, [num_point,1],
60 | padding='VALID', scope='maxpool')
61 | print(global_feat)
62 |
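    # Tile the 1024-d global feature to every point and concatenate it with the
    # 64-d per-point features, so each point sees both local and global context.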
63 | global_feat_expand = tf.tile(global_feat, [1, num_point, 1, 1])
64 |     concat_feat = tf.concat(axis=3, values=[point_feat, global_feat_expand])
65 | print(concat_feat)
66 |
67 | net = tf_util.conv2d(concat_feat, 512, [1,1],
68 | padding='VALID', stride=[1,1],
69 | bn=True, is_training=is_training,
70 | scope='conv6', bn_decay=bn_decay)
71 | net = tf_util.conv2d(net, 256, [1,1],
72 | padding='VALID', stride=[1,1],
73 | bn=True, is_training=is_training,
74 | scope='conv7', bn_decay=bn_decay)
75 | net = tf_util.conv2d(net, 128, [1,1],
76 | padding='VALID', stride=[1,1],
77 | bn=True, is_training=is_training,
78 | scope='conv8', bn_decay=bn_decay)
79 | net = tf_util.conv2d(net, 128, [1,1],
80 | padding='VALID', stride=[1,1],
81 | bn=True, is_training=is_training,
82 | scope='conv9', bn_decay=bn_decay)
83 |
84 | net = tf_util.conv2d(net, 50, [1,1],
85 | padding='VALID', stride=[1,1], activation_fn=None,
86 | scope='conv10')
87 | net = tf.squeeze(net, [2]) # BxNxC
88 |
89 | return net, end_points
90 |
91 |
92 | def get_loss(pred, label, end_points, reg_weight=0.001):
93 | """ pred: BxNxC,
94 | label: BxN, """
95 | loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=pred, labels=label)
96 | classify_loss = tf.reduce_mean(loss)
97 |     tf.summary.scalar('classify loss', classify_loss)
98 |
99 | # Enforce the transformation as orthogonal matrix
100 | transform = end_points['transform'] # BxKxK
101 | K = transform.get_shape()[1].value
102 | mat_diff = tf.matmul(transform, tf.transpose(transform, perm=[0,2,1]))
103 | mat_diff -= tf.constant(np.eye(K), dtype=tf.float32)
104 | mat_diff_loss = tf.nn.l2_loss(mat_diff)
105 |     tf.summary.scalar('mat_loss', mat_diff_loss)
106 |
107 | return classify_loss + mat_diff_loss * reg_weight
108 |
109 |
110 | if __name__=='__main__':
111 | with tf.Graph().as_default():
112 | inputs = tf.zeros((32,1024,3))
113 | outputs = get_model(inputs, tf.constant(True))
114 | print(outputs)
115 |
--------------------------------------------------------------------------------
/models/transform_nets.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | import sys
4 | import os
5 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
6 | sys.path.append(BASE_DIR)
7 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
8 | import tf_util
9 |
10 | def input_transform_net(point_cloud, is_training, bn_decay=None, K=3):
11 | """ Input (XYZ) Transform Net, input is BxNx3 gray image
12 | Return:
13 | Transformation matrix of size 3xK """
14 | batch_size = point_cloud.get_shape()[0].value
15 | num_point = point_cloud.get_shape()[1].value
16 |
17 | input_image = tf.expand_dims(point_cloud, -1)
18 | net = tf_util.conv2d(input_image, 64, [1,3],
19 | padding='VALID', stride=[1,1],
20 | bn=True, is_training=is_training,
21 | scope='tconv1', bn_decay=bn_decay)
22 | net = tf_util.conv2d(net, 128, [1,1],
23 | padding='VALID', stride=[1,1],
24 | bn=True, is_training=is_training,
25 | scope='tconv2', bn_decay=bn_decay)
26 | net = tf_util.conv2d(net, 1024, [1,1],
27 | padding='VALID', stride=[1,1],
28 | bn=True, is_training=is_training,
29 | scope='tconv3', bn_decay=bn_decay)
30 | net = tf_util.max_pool2d(net, [num_point,1],
31 | padding='VALID', scope='tmaxpool')
32 |
33 | net = tf.reshape(net, [batch_size, -1])
34 | net = tf_util.fully_connected(net, 512, bn=True, is_training=is_training,
35 | scope='tfc1', bn_decay=bn_decay)
36 | net = tf_util.fully_connected(net, 256, bn=True, is_training=is_training,
37 | scope='tfc2', bn_decay=bn_decay)
38 |
39 | with tf.variable_scope('transform_XYZ') as sc:
40 | assert(K==3)
41 | weights = tf.get_variable('weights', [256, 3*K],
42 | initializer=tf.constant_initializer(0.0),
43 | dtype=tf.float32)
44 | biases = tf.get_variable('biases', [3*K],
45 | initializer=tf.constant_initializer(0.0),
46 | dtype=tf.float32)
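        # Weights start at zero and the bias encodes the flattened 3x3 identity,
        # so the predicted transform is initially the identity matrix.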
47 | biases += tf.constant([1,0,0,0,1,0,0,0,1], dtype=tf.float32)
48 | transform = tf.matmul(net, weights)
49 | transform = tf.nn.bias_add(transform, biases)
50 |
51 | transform = tf.reshape(transform, [batch_size, 3, K])
52 | return transform
53 |
54 |
55 | def feature_transform_net(inputs, is_training, bn_decay=None, K=64):
56 | """ Feature Transform Net, input is BxNx1xK
57 | Return:
58 | Transformation matrix of size KxK """
59 | batch_size = inputs.get_shape()[0].value
60 | num_point = inputs.get_shape()[1].value
61 |
62 | net = tf_util.conv2d(inputs, 64, [1,1],
63 | padding='VALID', stride=[1,1],
64 | bn=True, is_training=is_training,
65 | scope='tconv1', bn_decay=bn_decay)
66 | net = tf_util.conv2d(net, 128, [1,1],
67 | padding='VALID', stride=[1,1],
68 | bn=True, is_training=is_training,
69 | scope='tconv2', bn_decay=bn_decay)
70 | net = tf_util.conv2d(net, 1024, [1,1],
71 | padding='VALID', stride=[1,1],
72 | bn=True, is_training=is_training,
73 | scope='tconv3', bn_decay=bn_decay)
74 | net = tf_util.max_pool2d(net, [num_point,1],
75 | padding='VALID', scope='tmaxpool')
76 |
77 | net = tf.reshape(net, [batch_size, -1])
78 | net = tf_util.fully_connected(net, 512, bn=True, is_training=is_training,
79 | scope='tfc1', bn_decay=bn_decay)
80 | net = tf_util.fully_connected(net, 256, bn=True, is_training=is_training,
81 | scope='tfc2', bn_decay=bn_decay)
82 |
83 | with tf.variable_scope('transform_feat') as sc:
84 | weights = tf.get_variable('weights', [256, K*K],
85 | initializer=tf.constant_initializer(0.0),
86 | dtype=tf.float32)
87 | biases = tf.get_variable('biases', [K*K],
88 | initializer=tf.constant_initializer(0.0),
89 | dtype=tf.float32)
90 | biases += tf.constant(np.eye(K).flatten(), dtype=tf.float32)
91 | transform = tf.matmul(net, weights)
92 | transform = tf.nn.bias_add(transform, biases)
93 |
94 | transform = tf.reshape(transform, [batch_size, K, K])
95 | return transform
96 |
--------------------------------------------------------------------------------
/part_seg/download_data.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | # Download original ShapeNetPart dataset (around 1GB)
4 | wget https://shapenet.cs.stanford.edu/ericyi/shapenetcore_partanno_v0.zip
5 | unzip shapenetcore_partanno_v0.zip
6 | rm shapenetcore_partanno_v0.zip
7 |
8 | # Download HDF5 for ShapeNet Part segmentation (around 346MB)
9 | wget https://shapenet.cs.stanford.edu/media/shapenet_part_seg_hdf5_data.zip
10 | unzip shapenet_part_seg_hdf5_data.zip
11 | rm shapenet_part_seg_hdf5_data.zip
12 |
13 |
--------------------------------------------------------------------------------
/part_seg/pointnet_part_seg.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | import math
4 | import os
5 | import sys
6 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
7 | sys.path.append(os.path.dirname(BASE_DIR))
8 | sys.path.append(os.path.join(BASE_DIR, '../utils'))
9 | import tf_util
10 |
11 |
12 | def get_transform_K(inputs, is_training, bn_decay=None, K = 3):
13 | """ Transform Net, input is BxNx1xK gray image
14 | Return:
15 | Transformation matrix of size KxK """
16 | batch_size = inputs.get_shape()[0].value
17 | num_point = inputs.get_shape()[1].value
18 |
19 | net = tf_util.conv2d(inputs, 256, [1,1], padding='VALID', stride=[1,1],
20 | bn=True, is_training=is_training, scope='tconv1', bn_decay=bn_decay)
21 | net = tf_util.conv2d(net, 1024, [1,1], padding='VALID', stride=[1,1],
22 | bn=True, is_training=is_training, scope='tconv2', bn_decay=bn_decay)
23 | net = tf_util.max_pool2d(net, [num_point,1], padding='VALID', scope='tmaxpool')
24 |
25 | net = tf.reshape(net, [batch_size, -1])
26 | net = tf_util.fully_connected(net, 512, bn=True, is_training=is_training, scope='tfc1', bn_decay=bn_decay)
27 | net = tf_util.fully_connected(net, 256, bn=True, is_training=is_training, scope='tfc2', bn_decay=bn_decay)
28 |
29 | with tf.variable_scope('transform_feat') as sc:
30 | weights = tf.get_variable('weights', [256, K*K], initializer=tf.constant_initializer(0.0), dtype=tf.float32)
31 | biases = tf.get_variable('biases', [K*K], initializer=tf.constant_initializer(0.0), dtype=tf.float32) + tf.constant(np.eye(K).flatten(), dtype=tf.float32)
32 | transform = tf.matmul(net, weights)
33 | transform = tf.nn.bias_add(transform, biases)
34 |
35 | #transform = tf_util.fully_connected(net, 3*K, activation_fn=None, scope='tfc3')
36 | transform = tf.reshape(transform, [batch_size, K, K])
37 | return transform
38 |
39 |
40 |
41 |
42 |
43 | def get_transform(point_cloud, is_training, bn_decay=None, K = 3):
44 | """ Transform Net, input is BxNx3 gray image
45 | Return:
46 | Transformation matrix of size 3xK """
47 | batch_size = point_cloud.get_shape()[0].value
48 | num_point = point_cloud.get_shape()[1].value
49 |
50 | input_image = tf.expand_dims(point_cloud, -1)
51 | net = tf_util.conv2d(input_image, 64, [1,3], padding='VALID', stride=[1,1],
52 | bn=True, is_training=is_training, scope='tconv1', bn_decay=bn_decay)
53 | net = tf_util.conv2d(net, 128, [1,1], padding='VALID', stride=[1,1],
54 | bn=True, is_training=is_training, scope='tconv3', bn_decay=bn_decay)
55 | net = tf_util.conv2d(net, 1024, [1,1], padding='VALID', stride=[1,1],
56 | bn=True, is_training=is_training, scope='tconv4', bn_decay=bn_decay)
57 | net = tf_util.max_pool2d(net, [num_point,1], padding='VALID', scope='tmaxpool')
58 |
59 | net = tf.reshape(net, [batch_size, -1])
60 | net = tf_util.fully_connected(net, 128, bn=True, is_training=is_training, scope='tfc1', bn_decay=bn_decay)
61 | net = tf_util.fully_connected(net, 128, bn=True, is_training=is_training, scope='tfc2', bn_decay=bn_decay)
62 |
63 | with tf.variable_scope('transform_XYZ') as sc:
64 | assert(K==3)
65 | weights = tf.get_variable('weights', [128, 3*K], initializer=tf.constant_initializer(0.0), dtype=tf.float32)
66 | biases = tf.get_variable('biases', [3*K], initializer=tf.constant_initializer(0.0), dtype=tf.float32) + tf.constant([1,0,0,0,1,0,0,0,1], dtype=tf.float32)
67 | transform = tf.matmul(net, weights)
68 | transform = tf.nn.bias_add(transform, biases)
69 |
70 | #transform = tf_util.fully_connected(net, 3*K, activation_fn=None, scope='tfc3')
71 | transform = tf.reshape(transform, [batch_size, 3, K])
72 | return transform
73 |
74 |
75 | def get_model(point_cloud, input_label, is_training, cat_num, part_num, \
76 | batch_size, num_point, weight_decay, bn_decay=None):
77 | """ ConvNet baseline, input is BxNx3 gray image """
78 | end_points = {}
79 |
80 | with tf.variable_scope('transform_net1') as sc:
81 | K = 3
82 | transform = get_transform(point_cloud, is_training, bn_decay, K = 3)
83 | point_cloud_transformed = tf.matmul(point_cloud, transform)
84 |
85 | input_image = tf.expand_dims(point_cloud_transformed, -1)
86 | out1 = tf_util.conv2d(input_image, 64, [1,K], padding='VALID', stride=[1,1],
87 | bn=True, is_training=is_training, scope='conv1', bn_decay=bn_decay)
88 | out2 = tf_util.conv2d(out1, 128, [1,1], padding='VALID', stride=[1,1],
89 | bn=True, is_training=is_training, scope='conv2', bn_decay=bn_decay)
90 | out3 = tf_util.conv2d(out2, 128, [1,1], padding='VALID', stride=[1,1],
91 | bn=True, is_training=is_training, scope='conv3', bn_decay=bn_decay)
92 |
93 |
94 | with tf.variable_scope('transform_net2') as sc:
95 | K = 128
96 | transform = get_transform_K(out3, is_training, bn_decay, K)
97 |
98 | end_points['transform'] = transform
99 |
100 | squeezed_out3 = tf.reshape(out3, [batch_size, num_point, 128])
101 | net_transformed = tf.matmul(squeezed_out3, transform)
102 | net_transformed = tf.expand_dims(net_transformed, [2])
103 |
104 | out4 = tf_util.conv2d(net_transformed, 512, [1,1], padding='VALID', stride=[1,1],
105 | bn=True, is_training=is_training, scope='conv4', bn_decay=bn_decay)
106 | out5 = tf_util.conv2d(out4, 2048, [1,1], padding='VALID', stride=[1,1],
107 | bn=True, is_training=is_training, scope='conv5', bn_decay=bn_decay)
108 | out_max = tf_util.max_pool2d(out5, [num_point,1], padding='VALID', scope='maxpool')
109 |
110 | # classification network
111 | net = tf.reshape(out_max, [batch_size, -1])
112 | net = tf_util.fully_connected(net, 256, bn=True, is_training=is_training, scope='cla/fc1', bn_decay=bn_decay)
113 | net = tf_util.fully_connected(net, 256, bn=True, is_training=is_training, scope='cla/fc2', bn_decay=bn_decay)
114 | net = tf_util.dropout(net, keep_prob=0.7, is_training=is_training, scope='cla/dp1')
115 | net = tf_util.fully_connected(net, cat_num, activation_fn=None, scope='cla/fc3')
116 |
117 | # segmentation network
118 | one_hot_label_expand = tf.reshape(input_label, [batch_size, 1, 1, cat_num])
119 | out_max = tf.concat(axis=3, values=[out_max, one_hot_label_expand])
120 |
121 | expand = tf.tile(out_max, [1, num_point, 1, 1])
122 | concat = tf.concat(axis=3, values=[expand, out1, out2, out3, out4, out5])
123 |
124 | net2 = tf_util.conv2d(concat, 256, [1,1], padding='VALID', stride=[1,1], bn_decay=bn_decay,
125 | bn=True, is_training=is_training, scope='seg/conv1', weight_decay=weight_decay)
126 | net2 = tf_util.dropout(net2, keep_prob=0.8, is_training=is_training, scope='seg/dp1')
127 | net2 = tf_util.conv2d(net2, 256, [1,1], padding='VALID', stride=[1,1], bn_decay=bn_decay,
128 | bn=True, is_training=is_training, scope='seg/conv2', weight_decay=weight_decay)
129 | net2 = tf_util.dropout(net2, keep_prob=0.8, is_training=is_training, scope='seg/dp2')
130 | net2 = tf_util.conv2d(net2, 128, [1,1], padding='VALID', stride=[1,1], bn_decay=bn_decay,
131 | bn=True, is_training=is_training, scope='seg/conv3', weight_decay=weight_decay)
132 | net2 = tf_util.conv2d(net2, part_num, [1,1], padding='VALID', stride=[1,1], activation_fn=None,
133 | bn=False, scope='seg/conv4', weight_decay=weight_decay)
134 |
135 | net2 = tf.reshape(net2, [batch_size, num_point, part_num])
136 |
137 | return net, net2, end_points
138 |
139 | def get_loss(l_pred, seg_pred, label, seg, weight, end_points):
140 | per_instance_label_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=l_pred, labels=label)
141 | label_loss = tf.reduce_mean(per_instance_label_loss)
142 |
143 | # size of seg_pred is batch_size x point_num x part_cat_num
144 | # size of seg is batch_size x point_num
145 | per_instance_seg_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=seg_pred, labels=seg), axis=1)
146 | seg_loss = tf.reduce_mean(per_instance_seg_loss)
147 |
148 | per_instance_seg_pred_res = tf.argmax(seg_pred, 2)
149 |
150 | # Enforce the transformation as orthogonal matrix
151 | transform = end_points['transform'] # BxKxK
152 | K = transform.get_shape()[1].value
153 | mat_diff = tf.matmul(transform, tf.transpose(transform, perm=[0,2,1])) - tf.constant(np.eye(K), dtype=tf.float32)
154 | mat_diff_loss = tf.nn.l2_loss(mat_diff)
155 |
156 |
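    # weight balances the two tasks: weight=1.0 trains the segmentation head only
    # (as train.py does) and weight=0.0 the classification head only; the
    # orthogonality penalty on the feature transform is always added at 1e-3.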
157 | total_loss = weight * seg_loss + (1 - weight) * label_loss + mat_diff_loss * 1e-3
158 |
159 | return total_loss, label_loss, per_instance_label_loss, seg_loss, per_instance_seg_loss, per_instance_seg_pred_res
160 |
161 |
--------------------------------------------------------------------------------
/part_seg/test.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import tensorflow as tf
3 | import json
4 | import numpy as np
5 | import os
6 | import sys
7 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
8 | sys.path.append(BASE_DIR)
9 | sys.path.append(os.path.dirname(BASE_DIR))
10 | import provider
11 | import pointnet_part_seg as model
12 |
13 | parser = argparse.ArgumentParser()
14 | parser.add_argument('--model_path', default='train_results/trained_models/epoch_190.ckpt', help='Model checkpoint path')
15 | FLAGS = parser.parse_args()
16 |
17 |
18 | # DEFAULT SETTINGS
19 | pretrained_model_path = FLAGS.model_path # os.path.join(BASE_DIR, './pretrained_model/model.ckpt')
20 | hdf5_data_dir = os.path.join(BASE_DIR, './hdf5_data')
21 | ply_data_dir = os.path.join(BASE_DIR, './PartAnnotation')
22 | gpu_to_use = 0
23 | output_dir = os.path.join(BASE_DIR, './test_results')
24 | output_verbose = True # If true, output all color-coded part segmentation obj files
25 |
26 | # MAIN SCRIPT
27 | point_num = 3000 # the max number of points over all testing shapes
28 | batch_size = 1
29 |
30 | test_file_list = os.path.join(BASE_DIR, 'testing_ply_file_list.txt')
31 |
32 | oid2cpid = json.load(open(os.path.join(hdf5_data_dir, 'overallid_to_catid_partid.json'), 'r'))
33 |
34 | object2setofoid = {}
35 | for idx in range(len(oid2cpid)):
36 | objid, pid = oid2cpid[idx]
37 | if not objid in object2setofoid.keys():
38 | object2setofoid[objid] = []
39 | object2setofoid[objid].append(idx)
40 |
41 | all_obj_cat_file = os.path.join(hdf5_data_dir, 'all_object_categories.txt')
42 | fin = open(all_obj_cat_file, 'r')
43 | lines = [line.rstrip() for line in fin.readlines()]
44 | objcats = [line.split()[1] for line in lines]
45 | objnames = [line.split()[0] for line in lines]
46 | on2oid = {objcats[i]:i for i in range(len(objcats))}
47 | fin.close()
48 |
49 | color_map_file = os.path.join(hdf5_data_dir, 'part_color_mapping.json')
50 | color_map = json.load(open(color_map_file, 'r'))
51 |
52 | NUM_OBJ_CATS = 16
53 | NUM_PART_CATS = 50
54 |
55 | cpid2oid = json.load(open(os.path.join(hdf5_data_dir, 'catid_partid_to_overallid.json'), 'r'))
56 |
57 | def printout(flog, data):
58 | print(data)
59 | flog.write(data + '\n')
60 |
61 | def output_color_point_cloud(data, seg, out_file):
62 | with open(out_file, 'w') as f:
63 | l = len(seg)
64 | for i in range(l):
65 | color = color_map[seg[i]]
66 | f.write('v %f %f %f %f %f %f\n' % (data[i][0], data[i][1], data[i][2], color[0], color[1], color[2]))
67 |
68 | def output_color_point_cloud_red_blue(data, seg, out_file):
69 | with open(out_file, 'w') as f:
70 | l = len(seg)
71 | for i in range(l):
72 | if seg[i] == 1:
73 | color = [0, 0, 1]
74 | elif seg[i] == 0:
75 | color = [1, 0, 0]
76 | else:
77 | color = [0, 0, 0]
78 |
79 | f.write('v %f %f %f %f %f %f\n' % (data[i][0], data[i][1], data[i][2], color[0], color[1], color[2]))
80 |
81 |
82 | def pc_normalize(pc):
83 | l = pc.shape[0]
84 | centroid = np.mean(pc, axis=0)
85 | pc = pc - centroid
86 | m = np.max(np.sqrt(np.sum(pc**2, axis=1)))
87 | pc = pc / m
88 | return pc
89 |
90 | def placeholder_inputs():
91 | pointclouds_ph = tf.placeholder(tf.float32, shape=(batch_size, point_num, 3))
92 | input_label_ph = tf.placeholder(tf.float32, shape=(batch_size, NUM_OBJ_CATS))
93 | return pointclouds_ph, input_label_ph
94 |
102 | def load_pts_seg_files(pts_file, seg_file, catid):
103 | with open(pts_file, 'r') as f:
104 | pts_str = [item.rstrip() for item in f.readlines()]
105 | pts = np.array([np.float32(s.split()) for s in pts_str], dtype=np.float32)
106 | with open(seg_file, 'r') as f:
107 | part_ids = np.array([int(item.rstrip()) for item in f.readlines()], dtype=np.uint8)
108 | seg = np.array([cpid2oid[catid+'_'+str(x)] for x in part_ids])
109 | return pts, seg
110 |
111 | def pc_augment_to_point_num(pts, pn):
112 | assert(pts.shape[0] <= pn)
113 | cur_len = pts.shape[0]
114 | res = np.array(pts)
115 | while cur_len < pn:
116 | res = np.concatenate((res, pts))
117 | cur_len += pts.shape[0]
118 | return res[:pn, :]
119 |
120 | def convert_label_to_one_hot(labels):
121 | label_one_hot = np.zeros((labels.shape[0], NUM_OBJ_CATS))
122 | for idx in range(labels.shape[0]):
123 | label_one_hot[idx, labels[idx]] = 1
124 | return label_one_hot
125 |
126 | def predict():
127 | is_training = False
128 |
129 | with tf.device('/gpu:'+str(gpu_to_use)):
130 | pointclouds_ph, input_label_ph = placeholder_inputs()
131 | is_training_ph = tf.placeholder(tf.bool, shape=())
132 |
133 | # simple model
134 | pred, seg_pred, end_points = model.get_model(pointclouds_ph, input_label_ph, \
135 | cat_num=NUM_OBJ_CATS, part_num=NUM_PART_CATS, is_training=is_training_ph, \
136 | batch_size=batch_size, num_point=point_num, weight_decay=0.0, bn_decay=None)
137 |
138 | # Add ops to save and restore all the variables.
139 | saver = tf.train.Saver()
140 |
141 | # Later, launch the model, use the saver to restore variables from disk, and
142 | # do some work with the model.
143 |
144 | config = tf.ConfigProto()
145 | config.gpu_options.allow_growth = True
146 | config.allow_soft_placement = True
147 |
148 | with tf.Session(config=config) as sess:
149 | if not os.path.exists(output_dir):
150 | os.mkdir(output_dir)
151 |
152 | flog = open(os.path.join(output_dir, 'log.txt'), 'w')
153 |
154 | # Restore variables from disk.
155 | printout(flog, 'Loading model %s' % pretrained_model_path)
156 | saver.restore(sess, pretrained_model_path)
157 | printout(flog, 'Model restored.')
158 |
159 |         # Note: evaluating a model with batch norm requires population
160 |         # statistics; here some test data serve as those statistics.
161 | batch_data = np.zeros([batch_size, point_num, 3]).astype(np.float32)
162 |
163 | total_acc = 0.0
164 | total_seen = 0
165 | total_acc_iou = 0.0
166 |
167 | total_per_cat_acc = np.zeros((NUM_OBJ_CATS)).astype(np.float32)
168 | total_per_cat_iou = np.zeros((NUM_OBJ_CATS)).astype(np.float32)
169 | total_per_cat_seen = np.zeros((NUM_OBJ_CATS)).astype(np.int32)
170 |
171 | ffiles = open(test_file_list, 'r')
172 | lines = [line.rstrip() for line in ffiles.readlines()]
173 | pts_files = [line.split()[0] for line in lines]
174 | seg_files = [line.split()[1] for line in lines]
175 | labels = [line.split()[2] for line in lines]
176 | ffiles.close()
177 |
178 | len_pts_files = len(pts_files)
179 | for shape_idx in range(len_pts_files):
180 | if shape_idx % 100 == 0:
181 | printout(flog, '%d/%d ...' % (shape_idx, len_pts_files))
182 |
183 | cur_gt_label = on2oid[labels[shape_idx]]
184 |
185 | cur_label_one_hot = np.zeros((1, NUM_OBJ_CATS), dtype=np.float32)
186 | cur_label_one_hot[0, cur_gt_label] = 1
187 |
188 | pts_file_to_load = os.path.join(ply_data_dir, pts_files[shape_idx])
189 | seg_file_to_load = os.path.join(ply_data_dir, seg_files[shape_idx])
190 |
191 | pts, seg = load_pts_seg_files(pts_file_to_load, seg_file_to_load, objcats[cur_gt_label])
192 | ori_point_num = len(seg)
193 |
194 | batch_data[0, ...] = pc_augment_to_point_num(pc_normalize(pts), point_num)
195 |
196 | label_pred_val, seg_pred_res = sess.run([pred, seg_pred], feed_dict={
197 | pointclouds_ph: batch_data,
198 | input_label_ph: cur_label_one_hot,
199 | is_training_ph: is_training,
200 | })
201 |
202 | label_pred_val = np.argmax(label_pred_val[0, :])
203 |
204 | seg_pred_res = seg_pred_res[0, ...]
205 |
206 | iou_oids = object2setofoid[objcats[cur_gt_label]]
207 | non_cat_labels = list(set(np.arange(NUM_PART_CATS)).difference(set(iou_oids)))
208 |
209 | mini = np.min(seg_pred_res)
210 | seg_pred_res[:, non_cat_labels] = mini - 1000
211 |
212 | seg_pred_val = np.argmax(seg_pred_res, axis=1)[:ori_point_num]
213 |
214 | seg_acc = np.mean(seg_pred_val == seg)
215 |
216 | total_acc += seg_acc
217 | total_seen += 1
218 |
219 | total_per_cat_seen[cur_gt_label] += 1
220 | total_per_cat_acc[cur_gt_label] += seg_acc
221 |
222 | mask = np.int32(seg_pred_val == seg)
223 |
224 | total_iou = 0.0
225 | iou_log = ''
226 | for oid in iou_oids:
227 | n_pred = np.sum(seg_pred_val == oid)
228 | n_gt = np.sum(seg == oid)
229 | n_intersect = np.sum(np.int32(seg == oid) * mask)
230 | n_union = n_pred + n_gt - n_intersect
231 | iou_log += '_' + str(n_pred)+'_'+str(n_gt)+'_'+str(n_intersect)+'_'+str(n_union)+'_'
232 | if n_union == 0:
233 | total_iou += 1
234 | iou_log += '_1\n'
235 | else:
236 | total_iou += n_intersect * 1.0 / n_union
237 | iou_log += '_'+str(n_intersect * 1.0 / n_union)+'\n'
238 |
239 | avg_iou = total_iou / len(iou_oids)
240 | total_acc_iou += avg_iou
241 | total_per_cat_iou[cur_gt_label] += avg_iou
242 |
243 | if output_verbose:
244 | output_color_point_cloud(pts, seg, os.path.join(output_dir, str(shape_idx)+'_gt.obj'))
245 | output_color_point_cloud(pts, seg_pred_val, os.path.join(output_dir, str(shape_idx)+'_pred.obj'))
246 | output_color_point_cloud_red_blue(pts, np.int32(seg == seg_pred_val),
247 | os.path.join(output_dir, str(shape_idx)+'_diff.obj'))
248 |
249 | with open(os.path.join(output_dir, str(shape_idx)+'.log'), 'w') as fout:
250 | fout.write('Total Point: %d\n\n' % ori_point_num)
251 | fout.write('Ground Truth: %s\n' % objnames[cur_gt_label])
252 | fout.write('Predict: %s\n\n' % objnames[label_pred_val])
253 | fout.write('Accuracy: %f\n' % seg_acc)
254 | fout.write('IoU: %f\n\n' % avg_iou)
255 | fout.write('IoU details: %s\n' % iou_log)
256 |
257 | printout(flog, 'Accuracy: %f' % (total_acc / total_seen))
258 | printout(flog, 'IoU: %f' % (total_acc_iou / total_seen))
259 |
260 | for cat_idx in range(NUM_OBJ_CATS):
261 | printout(flog, '\t ' + objcats[cat_idx] + ' Total Number: ' + str(total_per_cat_seen[cat_idx]))
262 | if total_per_cat_seen[cat_idx] > 0:
263 | printout(flog, '\t ' + objcats[cat_idx] + ' Accuracy: ' + \
264 | str(total_per_cat_acc[cat_idx] / total_per_cat_seen[cat_idx]))
265 | printout(flog, '\t ' + objcats[cat_idx] + ' IoU: '+ \
266 | str(total_per_cat_iou[cat_idx] / total_per_cat_seen[cat_idx]))
267 |
268 |
269 | with tf.Graph().as_default():
270 | predict()
271 |
--------------------------------------------------------------------------------
/part_seg/train.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import subprocess
3 | import tensorflow as tf
4 | import numpy as np
5 | from datetime import datetime
6 | import json
7 | import os
8 | import sys
9 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
10 | sys.path.append(BASE_DIR)
11 | sys.path.append(os.path.dirname(BASE_DIR))
12 | import provider
13 | import pointnet_part_seg as model
14 |
15 | # DEFAULT SETTINGS
16 | parser = argparse.ArgumentParser()
17 | parser.add_argument('--gpu', type=int, default=1, help='GPU to use [default: GPU 1]')
18 | parser.add_argument('--batch', type=int, default=32, help='Batch Size during training [default: 32]')
19 | parser.add_argument('--epoch', type=int, default=200, help='Epoch to run [default: 200]')
20 | parser.add_argument('--point_num', type=int, default=2048, help='Point Number [256/512/1024/2048]')
21 | parser.add_argument('--output_dir', type=str, default='train_results', help='Directory that stores all training logs and trained models')
22 | parser.add_argument('--wd', type=float, default=0, help='Weight Decay [Default: 0.0]')
23 | FLAGS = parser.parse_args()
24 |
25 | hdf5_data_dir = os.path.join(BASE_DIR, './hdf5_data')
26 |
27 | # MAIN SCRIPT
28 | point_num = FLAGS.point_num
29 | batch_size = FLAGS.batch
30 | output_dir = FLAGS.output_dir
31 |
32 | if not os.path.exists(output_dir):
33 | os.mkdir(output_dir)
34 |
35 | color_map_file = os.path.join(hdf5_data_dir, 'part_color_mapping.json')
36 | color_map = json.load(open(color_map_file, 'r'))
37 |
38 | all_obj_cats_file = os.path.join(hdf5_data_dir, 'all_object_categories.txt')
39 | fin = open(all_obj_cats_file, 'r')
40 | lines = [line.rstrip() for line in fin.readlines()]
41 | all_obj_cats = [(line.split()[0], line.split()[1]) for line in lines]
42 | fin.close()
43 |
44 | all_cats = json.load(open(os.path.join(hdf5_data_dir, 'overallid_to_catid_partid.json'), 'r'))
45 | NUM_CATEGORIES = 16
46 | NUM_PART_CATS = len(all_cats)
47 |
48 | print('#### Batch Size: {0}'.format(batch_size))
49 | print('#### Point Number: {0}'.format(point_num))
50 | print('#### Training using GPU: {0}'.format(FLAGS.gpu))
51 |
52 | DECAY_STEP = 16881 * 20
53 | DECAY_RATE = 0.5
54 |
55 | LEARNING_RATE_CLIP = 1e-5
56 |
57 | BN_INIT_DECAY = 0.5
58 | BN_DECAY_DECAY_RATE = 0.5
59 | BN_DECAY_DECAY_STEP = float(DECAY_STEP * 2)
60 | BN_DECAY_CLIP = 0.99
61 |
62 | BASE_LEARNING_RATE = 0.001
63 | MOMENTUM = 0.9
64 | TRAINING_EPOCHES = FLAGS.epoch
65 | print('### Training epoch: {0}'.format(TRAINING_EPOCHES))
66 |
67 | TRAINING_FILE_LIST = os.path.join(hdf5_data_dir, 'train_hdf5_file_list.txt')
68 | TESTING_FILE_LIST = os.path.join(hdf5_data_dir, 'val_hdf5_file_list.txt')
69 |
70 | MODEL_STORAGE_PATH = os.path.join(output_dir, 'trained_models')
71 | if not os.path.exists(MODEL_STORAGE_PATH):
72 | os.mkdir(MODEL_STORAGE_PATH)
73 |
74 | LOG_STORAGE_PATH = os.path.join(output_dir, 'logs')
75 | if not os.path.exists(LOG_STORAGE_PATH):
76 | os.mkdir(LOG_STORAGE_PATH)
77 |
78 | SUMMARIES_FOLDER = os.path.join(output_dir, 'summaries')
79 | if not os.path.exists(SUMMARIES_FOLDER):
80 | os.mkdir(SUMMARIES_FOLDER)
81 |
82 | def printout(flog, data):
83 | print(data)
84 | flog.write(data + '\n')
85 |
86 | def placeholder_inputs():
87 | pointclouds_ph = tf.placeholder(tf.float32, shape=(batch_size, point_num, 3))
88 | input_label_ph = tf.placeholder(tf.float32, shape=(batch_size, NUM_CATEGORIES))
89 | labels_ph = tf.placeholder(tf.int32, shape=(batch_size))
90 | seg_ph = tf.placeholder(tf.int32, shape=(batch_size, point_num))
91 | return pointclouds_ph, input_label_ph, labels_ph, seg_ph
92 |
93 | def convert_label_to_one_hot(labels):
94 | label_one_hot = np.zeros((labels.shape[0], NUM_CATEGORIES))
95 | for idx in range(labels.shape[0]):
96 | label_one_hot[idx, labels[idx]] = 1
97 | return label_one_hot
98 |
99 | def train():
100 | with tf.Graph().as_default():
101 | with tf.device('/gpu:'+str(FLAGS.gpu)):
102 | pointclouds_ph, input_label_ph, labels_ph, seg_ph = placeholder_inputs()
103 | is_training_ph = tf.placeholder(tf.bool, shape=())
104 |
105 | batch = tf.Variable(0, trainable=False)
106 | learning_rate = tf.train.exponential_decay(
107 | BASE_LEARNING_RATE, # base learning rate
108 |                     batch * batch_size,     # number of samples seen so far (global step * batch size)
109 | DECAY_STEP, # step size
110 | DECAY_RATE, # decay rate
111 | staircase=True # Stair-case or continuous decreasing
112 | )
113 | learning_rate = tf.maximum(learning_rate, LEARNING_RATE_CLIP)
114 |
115 | bn_momentum = tf.train.exponential_decay(
116 | BN_INIT_DECAY,
117 | batch*batch_size,
118 | BN_DECAY_DECAY_STEP,
119 | BN_DECAY_DECAY_RATE,
120 | staircase=True)
121 | bn_decay = tf.minimum(BN_DECAY_CLIP, 1 - bn_momentum)
122 |
123 | lr_op = tf.summary.scalar('learning_rate', learning_rate)
124 | batch_op = tf.summary.scalar('batch_number', batch)
125 | bn_decay_op = tf.summary.scalar('bn_decay', bn_decay)
126 |
127 | labels_pred, seg_pred, end_points = model.get_model(pointclouds_ph, input_label_ph, \
128 | is_training=is_training_ph, bn_decay=bn_decay, cat_num=NUM_CATEGORIES, \
129 | part_num=NUM_PART_CATS, batch_size=batch_size, num_point=point_num, weight_decay=FLAGS.wd)
130 |
131 |             # pointnet_part_seg.py defines both the classification net and the segmentation net, which share a common global feature extractor.
132 |             # In model.get_loss, the total loss is a weighted sum of the classification and segmentation losses.
133 |             # Here we train only the segmentation network, so we set the weight to 1.0.
134 | loss, label_loss, per_instance_label_loss, seg_loss, per_instance_seg_loss, per_instance_seg_pred_res \
135 | = model.get_loss(labels_pred, seg_pred, labels_ph, seg_ph, 1.0, end_points)
136 |
137 | total_training_loss_ph = tf.placeholder(tf.float32, shape=())
138 | total_testing_loss_ph = tf.placeholder(tf.float32, shape=())
139 |
140 | label_training_loss_ph = tf.placeholder(tf.float32, shape=())
141 | label_testing_loss_ph = tf.placeholder(tf.float32, shape=())
142 |
143 | seg_training_loss_ph = tf.placeholder(tf.float32, shape=())
144 | seg_testing_loss_ph = tf.placeholder(tf.float32, shape=())
145 |
146 | label_training_acc_ph = tf.placeholder(tf.float32, shape=())
147 | label_testing_acc_ph = tf.placeholder(tf.float32, shape=())
148 | label_testing_acc_avg_cat_ph = tf.placeholder(tf.float32, shape=())
149 |
150 | seg_training_acc_ph = tf.placeholder(tf.float32, shape=())
151 | seg_testing_acc_ph = tf.placeholder(tf.float32, shape=())
152 | seg_testing_acc_avg_cat_ph = tf.placeholder(tf.float32, shape=())
153 |
154 | total_train_loss_sum_op = tf.summary.scalar('total_training_loss', total_training_loss_ph)
155 | total_test_loss_sum_op = tf.summary.scalar('total_testing_loss', total_testing_loss_ph)
156 |
157 | label_train_loss_sum_op = tf.summary.scalar('label_training_loss', label_training_loss_ph)
158 | label_test_loss_sum_op = tf.summary.scalar('label_testing_loss', label_testing_loss_ph)
159 |
160 | seg_train_loss_sum_op = tf.summary.scalar('seg_training_loss', seg_training_loss_ph)
161 | seg_test_loss_sum_op = tf.summary.scalar('seg_testing_loss', seg_testing_loss_ph)
162 |
163 | label_train_acc_sum_op = tf.summary.scalar('label_training_acc', label_training_acc_ph)
164 | label_test_acc_sum_op = tf.summary.scalar('label_testing_acc', label_testing_acc_ph)
165 | label_test_acc_avg_cat_op = tf.summary.scalar('label_testing_acc_avg_cat', label_testing_acc_avg_cat_ph)
166 |
167 | seg_train_acc_sum_op = tf.summary.scalar('seg_training_acc', seg_training_acc_ph)
168 | seg_test_acc_sum_op = tf.summary.scalar('seg_testing_acc', seg_testing_acc_ph)
169 | seg_test_acc_avg_cat_op = tf.summary.scalar('seg_testing_acc_avg_cat', seg_testing_acc_avg_cat_ph)
170 |
171 | train_variables = tf.trainable_variables()
172 |
173 | trainer = tf.train.AdamOptimizer(learning_rate)
174 | train_op = trainer.minimize(loss, var_list=train_variables, global_step=batch)
175 |
176 | saver = tf.train.Saver()
177 |
178 | config = tf.ConfigProto()
179 | config.gpu_options.allow_growth = True
180 | config.allow_soft_placement = True
181 | sess = tf.Session(config=config)
182 |
183 | init = tf.global_variables_initializer()
184 | sess.run(init)
185 |
186 | train_writer = tf.summary.FileWriter(SUMMARIES_FOLDER + '/train', sess.graph)
187 | test_writer = tf.summary.FileWriter(SUMMARIES_FOLDER + '/test')
188 |
189 | train_file_list = provider.getDataFiles(TRAINING_FILE_LIST)
190 | num_train_file = len(train_file_list)
191 | test_file_list = provider.getDataFiles(TESTING_FILE_LIST)
192 | num_test_file = len(test_file_list)
193 |
194 | fcmd = open(os.path.join(LOG_STORAGE_PATH, 'cmd.txt'), 'w')
195 | fcmd.write(str(FLAGS))
196 | fcmd.close()
197 |
198 | # write logs to the disk
199 | flog = open(os.path.join(LOG_STORAGE_PATH, 'log.txt'), 'w')
200 |
201 | def train_one_epoch(train_file_idx, epoch_num):
202 | is_training = True
203 |
204 | for i in range(num_train_file):
205 | cur_train_filename = os.path.join(hdf5_data_dir, train_file_list[train_file_idx[i]])
206 | printout(flog, 'Loading train file ' + cur_train_filename)
207 |
208 | cur_data, cur_labels, cur_seg = provider.loadDataFile_with_seg(cur_train_filename)
209 | cur_data, cur_labels, order = provider.shuffle_data(cur_data, np.squeeze(cur_labels))
210 | cur_seg = cur_seg[order, ...]
211 |
212 | cur_labels_one_hot = convert_label_to_one_hot(cur_labels)
213 |
214 | num_data = len(cur_labels)
215 | num_batch = num_data // batch_size
216 |
217 | total_loss = 0.0
218 | total_label_loss = 0.0
219 | total_seg_loss = 0.0
220 | total_label_acc = 0.0
221 | total_seg_acc = 0.0
222 |
223 | for j in range(num_batch):
224 | begidx = j * batch_size
225 | endidx = (j + 1) * batch_size
226 |
227 | feed_dict = {
228 | pointclouds_ph: cur_data[begidx: endidx, ...],
229 | labels_ph: cur_labels[begidx: endidx, ...],
230 | input_label_ph: cur_labels_one_hot[begidx: endidx, ...],
231 | seg_ph: cur_seg[begidx: endidx, ...],
232 | is_training_ph: is_training,
233 | }
234 |
235 | _, loss_val, label_loss_val, seg_loss_val, per_instance_label_loss_val, \
236 | per_instance_seg_loss_val, label_pred_val, seg_pred_val, pred_seg_res \
237 | = sess.run([train_op, loss, label_loss, seg_loss, per_instance_label_loss, \
238 | per_instance_seg_loss, labels_pred, seg_pred, per_instance_seg_pred_res], \
239 | feed_dict=feed_dict)
240 |
241 | per_instance_part_acc = np.mean(pred_seg_res == cur_seg[begidx: endidx, ...], axis=1)
242 | average_part_acc = np.mean(per_instance_part_acc)
243 |
244 | total_loss += loss_val
245 | total_label_loss += label_loss_val
246 | total_seg_loss += seg_loss_val
247 |
248 | per_instance_label_pred = np.argmax(label_pred_val, axis=1)
249 | total_label_acc += np.mean(np.float32(per_instance_label_pred == cur_labels[begidx: endidx, ...]))
250 | total_seg_acc += average_part_acc
251 |
252 | total_loss = total_loss * 1.0 / num_batch
253 | total_label_loss = total_label_loss * 1.0 / num_batch
254 | total_seg_loss = total_seg_loss * 1.0 / num_batch
255 | total_label_acc = total_label_acc * 1.0 / num_batch
256 | total_seg_acc = total_seg_acc * 1.0 / num_batch
257 |
258 | lr_sum, bn_decay_sum, batch_sum, train_loss_sum, train_label_acc_sum, \
259 | train_label_loss_sum, train_seg_loss_sum, train_seg_acc_sum = sess.run(\
260 | [lr_op, bn_decay_op, batch_op, total_train_loss_sum_op, label_train_acc_sum_op, \
261 | label_train_loss_sum_op, seg_train_loss_sum_op, seg_train_acc_sum_op], \
262 | feed_dict={total_training_loss_ph: total_loss, label_training_loss_ph: total_label_loss, \
263 | seg_training_loss_ph: total_seg_loss, label_training_acc_ph: total_label_acc, \
264 | seg_training_acc_ph: total_seg_acc})
265 |
266 | train_writer.add_summary(train_loss_sum, i + epoch_num * num_train_file)
267 | train_writer.add_summary(train_label_loss_sum, i + epoch_num * num_train_file)
268 | train_writer.add_summary(train_seg_loss_sum, i + epoch_num * num_train_file)
269 | train_writer.add_summary(lr_sum, i + epoch_num * num_train_file)
270 | train_writer.add_summary(bn_decay_sum, i + epoch_num * num_train_file)
271 | train_writer.add_summary(train_label_acc_sum, i + epoch_num * num_train_file)
272 | train_writer.add_summary(train_seg_acc_sum, i + epoch_num * num_train_file)
273 | train_writer.add_summary(batch_sum, i + epoch_num * num_train_file)
274 |
275 | printout(flog, '\tTraining Total Mean_loss: %f' % total_loss)
276 | printout(flog, '\t\tTraining Label Mean_loss: %f' % total_label_loss)
277 | printout(flog, '\t\tTraining Label Accuracy: %f' % total_label_acc)
278 | printout(flog, '\t\tTraining Seg Mean_loss: %f' % total_seg_loss)
279 | printout(flog, '\t\tTraining Seg Accuracy: %f' % total_seg_acc)
280 |
281 | def eval_one_epoch(epoch_num):
282 | is_training = False
283 |
284 | total_loss = 0.0
285 | total_label_loss = 0.0
286 | total_seg_loss = 0.0
287 | total_label_acc = 0.0
288 | total_seg_acc = 0.0
289 | total_seen = 0
290 |
291 | total_label_acc_per_cat = np.zeros((NUM_CATEGORIES)).astype(np.float32)
292 | total_seg_acc_per_cat = np.zeros((NUM_CATEGORIES)).astype(np.float32)
293 | total_seen_per_cat = np.zeros((NUM_CATEGORIES)).astype(np.int32)
294 |
295 | for i in range(num_test_file):
296 | cur_test_filename = os.path.join(hdf5_data_dir, test_file_list[i])
297 | printout(flog, 'Loading test file ' + cur_test_filename)
298 |
299 | cur_data, cur_labels, cur_seg = provider.loadDataFile_with_seg(cur_test_filename)
300 | cur_labels = np.squeeze(cur_labels)
301 |
302 | cur_labels_one_hot = convert_label_to_one_hot(cur_labels)
303 |
304 | num_data = len(cur_labels)
305 | num_batch = num_data // batch_size
306 |
307 | for j in range(num_batch):
308 | begidx = j * batch_size
309 | endidx = (j + 1) * batch_size
310 | feed_dict = {
311 | pointclouds_ph: cur_data[begidx: endidx, ...],
312 | labels_ph: cur_labels[begidx: endidx, ...],
313 | input_label_ph: cur_labels_one_hot[begidx: endidx, ...],
314 | seg_ph: cur_seg[begidx: endidx, ...],
315 | is_training_ph: is_training,
316 | }
317 |
318 | loss_val, label_loss_val, seg_loss_val, per_instance_label_loss_val, \
319 | per_instance_seg_loss_val, label_pred_val, seg_pred_val, pred_seg_res \
320 | = sess.run([loss, label_loss, seg_loss, per_instance_label_loss, \
321 | per_instance_seg_loss, labels_pred, seg_pred, per_instance_seg_pred_res], \
322 | feed_dict=feed_dict)
323 |
324 | per_instance_part_acc = np.mean(pred_seg_res == cur_seg[begidx: endidx, ...], axis=1)
325 | average_part_acc = np.mean(per_instance_part_acc)
326 |
327 | total_seen += 1
328 | total_loss += loss_val
329 | total_label_loss += label_loss_val
330 | total_seg_loss += seg_loss_val
331 |
332 | per_instance_label_pred = np.argmax(label_pred_val, axis=1)
333 | total_label_acc += np.mean(np.float32(per_instance_label_pred == cur_labels[begidx: endidx, ...]))
334 | total_seg_acc += average_part_acc
335 |
336 | for shape_idx in range(begidx, endidx):
337 | total_seen_per_cat[cur_labels[shape_idx]] += 1
338 | total_label_acc_per_cat[cur_labels[shape_idx]] += np.int32(per_instance_label_pred[shape_idx-begidx] == cur_labels[shape_idx])
339 | total_seg_acc_per_cat[cur_labels[shape_idx]] += per_instance_part_acc[shape_idx - begidx]
340 |
341 | total_loss = total_loss * 1.0 / total_seen
342 | total_label_loss = total_label_loss * 1.0 / total_seen
343 | total_seg_loss = total_seg_loss * 1.0 / total_seen
344 | total_label_acc = total_label_acc * 1.0 / total_seen
345 | total_seg_acc = total_seg_acc * 1.0 / total_seen
346 |
347 | test_loss_sum, test_label_acc_sum, test_label_loss_sum, test_seg_loss_sum, test_seg_acc_sum = sess.run(\
348 | [total_test_loss_sum_op, label_test_acc_sum_op, label_test_loss_sum_op, seg_test_loss_sum_op, seg_test_acc_sum_op], \
349 | feed_dict={total_testing_loss_ph: total_loss, label_testing_loss_ph: total_label_loss, \
350 | seg_testing_loss_ph: total_seg_loss, label_testing_acc_ph: total_label_acc, seg_testing_acc_ph: total_seg_acc})
351 |
352 | test_writer.add_summary(test_loss_sum, (epoch_num+1) * num_train_file-1)
353 | test_writer.add_summary(test_label_loss_sum, (epoch_num+1) * num_train_file-1)
354 | test_writer.add_summary(test_seg_loss_sum, (epoch_num+1) * num_train_file-1)
355 | test_writer.add_summary(test_label_acc_sum, (epoch_num+1) * num_train_file-1)
356 | test_writer.add_summary(test_seg_acc_sum, (epoch_num+1) * num_train_file-1)
357 |
358 | printout(flog, '\tTesting Total Mean_loss: %f' % total_loss)
359 | printout(flog, '\t\tTesting Label Mean_loss: %f' % total_label_loss)
360 | printout(flog, '\t\tTesting Label Accuracy: %f' % total_label_acc)
361 | printout(flog, '\t\tTesting Seg Mean_loss: %f' % total_seg_loss)
362 | printout(flog, '\t\tTesting Seg Accuracy: %f' % total_seg_acc)
363 |
364 | for cat_idx in range(NUM_CATEGORIES):
365 | if total_seen_per_cat[cat_idx] > 0:
366 | printout(flog, '\n\t\tCategory %s Object Number: %d' % (all_obj_cats[cat_idx][0], total_seen_per_cat[cat_idx]))
367 | printout(flog, '\t\tCategory %s Label Accuracy: %f' % (all_obj_cats[cat_idx][0], total_label_acc_per_cat[cat_idx]/total_seen_per_cat[cat_idx]))
368 | printout(flog, '\t\tCategory %s Seg Accuracy: %f' % (all_obj_cats[cat_idx][0], total_seg_acc_per_cat[cat_idx]/total_seen_per_cat[cat_idx]))
369 |
370 | if not os.path.exists(MODEL_STORAGE_PATH):
371 | os.mkdir(MODEL_STORAGE_PATH)
372 |
373 | for epoch in range(TRAINING_EPOCHES):
374 | printout(flog, '\n<<< Testing on the test dataset ...')
375 | eval_one_epoch(epoch)
376 |
377 |             printout(flog, '\n>>> Training epoch %d/%d ...' % (epoch + 1, TRAINING_EPOCHES))
378 |
379 | train_file_idx = np.arange(0, len(train_file_list))
380 | np.random.shuffle(train_file_idx)
381 |
382 | train_one_epoch(train_file_idx, epoch)
383 |
384 | if (epoch+1) % 10 == 0:
385 | cp_filename = saver.save(sess, os.path.join(MODEL_STORAGE_PATH, 'epoch_' + str(epoch+1)+'.ckpt'))
386 |                 printout(flog, 'Successfully stored checkpoint model in ' + cp_filename)
387 |
388 | flog.flush()
389 |
390 | flog.close()
391 |
392 | if __name__=='__main__':
393 | train()
394 |
--------------------------------------------------------------------------------
/provider.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | import numpy as np
4 | import h5py
5 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
6 | sys.path.append(BASE_DIR)
7 |
8 | # Download dataset for point cloud classification
9 | DATA_DIR = os.path.join(BASE_DIR, 'data')
10 | if not os.path.exists(DATA_DIR):
11 | os.mkdir(DATA_DIR)
12 | if not os.path.exists(os.path.join(DATA_DIR, 'modelnet40_ply_hdf5_2048')):
13 | www = 'https://shapenet.cs.stanford.edu/media/modelnet40_ply_hdf5_2048.zip'
14 | zipfile = os.path.basename(www)
15 | os.system('wget %s; unzip %s' % (www, zipfile))
16 | os.system('mv %s %s' % (zipfile[:-4], DATA_DIR))
17 | os.system('rm %s' % (zipfile))
18 |
19 |
20 | def shuffle_data(data, labels):
21 | """ Shuffle data and labels.
22 | Input:
23 | data: B,N,... numpy array
24 | label: B,... numpy array
25 | Return:
26 | shuffled data, label and shuffle indices
27 | """
28 | idx = np.arange(len(labels))
29 | np.random.shuffle(idx)
30 | return data[idx, ...], labels[idx], idx
31 |
32 |
33 | def rotate_point_cloud(batch_data):
34 |     """ Randomly rotate the point clouds to augment the dataset.
35 |         Rotation is per shape, around the up direction.
36 | Input:
37 | BxNx3 array, original batch of point clouds
38 | Return:
39 | BxNx3 array, rotated batch of point clouds
40 | """
41 | rotated_data = np.zeros(batch_data.shape, dtype=np.float32)
42 | for k in range(batch_data.shape[0]):
43 | rotation_angle = np.random.uniform() * 2 * np.pi
44 | cosval = np.cos(rotation_angle)
45 | sinval = np.sin(rotation_angle)
46 | rotation_matrix = np.array([[cosval, 0, sinval],
47 | [0, 1, 0],
48 | [-sinval, 0, cosval]])
49 | shape_pc = batch_data[k, ...]
50 | rotated_data[k, ...] = np.dot(shape_pc.reshape((-1, 3)), rotation_matrix)
51 | return rotated_data
52 |
53 |
54 | def rotate_point_cloud_by_angle(batch_data, rotation_angle):
55 |     """ Rotate the point cloud around the up direction by a given angle.
56 | Input:
57 | BxNx3 array, original batch of point clouds
58 | Return:
59 | BxNx3 array, rotated batch of point clouds
60 | """
61 | rotated_data = np.zeros(batch_data.shape, dtype=np.float32)
62 | for k in range(batch_data.shape[0]):
63 | #rotation_angle = np.random.uniform() * 2 * np.pi
64 | cosval = np.cos(rotation_angle)
65 | sinval = np.sin(rotation_angle)
66 | rotation_matrix = np.array([[cosval, 0, sinval],
67 | [0, 1, 0],
68 | [-sinval, 0, cosval]])
69 | shape_pc = batch_data[k, ...]
70 | rotated_data[k, ...] = np.dot(shape_pc.reshape((-1, 3)), rotation_matrix)
71 | return rotated_data
72 |
73 |
74 | def jitter_point_cloud(batch_data, sigma=0.01, clip=0.05):
75 |     """ Randomly jitter points; jitter is applied per point.
76 | Input:
77 | BxNx3 array, original batch of point clouds
78 | Return:
79 | BxNx3 array, jittered batch of point clouds
80 | """
81 | B, N, C = batch_data.shape
82 | assert(clip > 0)
83 | jittered_data = np.clip(sigma * np.random.randn(B, N, C), -1*clip, clip)
84 | jittered_data += batch_data
85 | return jittered_data
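
# Example usage (a sketch): augment a BxNx3 training batch before feeding it
# to the network:
#   batch_data = rotate_point_cloud(batch_data)
#   batch_data = jitter_point_cloud(batch_data, sigma=0.01, clip=0.05)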
86 |
87 | def getDataFiles(list_filename):
88 | return [line.rstrip() for line in open(list_filename)]
89 |
90 | def load_h5(h5_filename):
91 |     f = h5py.File(h5_filename, 'r')
92 | data = f['data'][:]
93 | label = f['label'][:]
94 | return (data, label)
95 |
96 | def loadDataFile(filename):
97 | return load_h5(filename)
98 |
99 | def load_h5_data_label_seg(h5_filename):
100 |     f = h5py.File(h5_filename, 'r')
101 | data = f['data'][:]
102 | label = f['label'][:]
103 | seg = f['pid'][:]
104 | return (data, label, seg)
105 |
106 |
107 | def loadDataFile_with_seg(filename):
108 | return load_h5_data_label_seg(filename)
109 |
--------------------------------------------------------------------------------
/sem_seg/README.md:
--------------------------------------------------------------------------------
1 | ## Semantic Segmentation of Indoor Scenes
2 |
3 | ### Dataset
4 |
5 | Download the prepared HDF5 data for training:
6 |
7 | sh download_data.sh
8 |
9 | (Optional) Download the 3D indoor parsing dataset (S3DIS) for testing and visualization. Version 1.2 of the dataset is used in this work.
10 |
11 |
12 | To prepare your own HDF5 data, first download the 3D indoor parsing dataset, then run `python collect_indoor3d_data.py` to re-organize the data and `python gen_indoor3d_h5.py` to generate HDF5 files, as shown below.
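
For example (the scripts expect the S3DIS data under `data/Stanford3dDataset_v1.2_Aligned_Version`; see `indoor3d_util.py`):

python collect_indoor3d_data.py
python gen_indoor3d_h5.py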
13 |
14 | ### Training
15 |
16 | Once you have downloaded the prepared HDF5 files (or generated your own), start training with:
17 |
18 | python train.py --log_dir log6 --test_area 6
19 |
20 | By default, a simple model based on vanilla PointNet is used for training, and Area 6 is held out as the test set.
21 |
22 | ### Testing
23 |
24 | Testing requires downloading the 3D indoor parsing data and preprocessing it with `collect_indoor3d_data.py`.
25 |
26 | After training, use `batch_inference.py` to segment rooms in the test set. In our work we use 6-fold training, which trains 6 models: for model 1, Areas 2-6 form the train set and Area 1 the test set; for model 2, Areas 1 and 3-6 form the train set and Area 2 the test set, and so on (a sketch of the full loop follows). Note that the S3DIS dataset paper uses a different 3-fold split, which had not been publicly announced at the time of our work.
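
A sketch of the full 6-fold loop (the log directory names are only a suggestion):

for i in 1 2 3 4 5 6; do python train.py --log_dir log$i --test_area $i; done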
27 |
28 | For example, to test model6, use the command:
29 |
30 | python batch_inference.py --model_path log6/model.ckpt --dump_dir log6/dump --output_filelist log6/output_filelist.txt --room_data_filelist meta/area6_data_label.txt --visu
31 |
32 | Some OBJ files will be created in `log6/dump` for prediction visualization.
33 |
34 | To evaluate overall segmentation accuracy, we evaluate the 6 models on their corresponding test areas and use `eval_iou_accuracy.py` to produce the point classification accuracy and IoU reported in the paper.
35 |
36 |
37 |
--------------------------------------------------------------------------------
/sem_seg/batch_inference.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import os
3 | import sys
4 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
5 | ROOT_DIR = os.path.dirname(BASE_DIR)
6 | sys.path.append(BASE_DIR)
7 | from model import *
8 | import indoor3d_util
9 |
10 | parser = argparse.ArgumentParser()
11 | parser.add_argument('--gpu', type=int, default=0, help='GPU to use [default: GPU 0]')
12 | parser.add_argument('--batch_size', type=int, default=1, help='Batch Size during training [default: 1]')
13 | parser.add_argument('--num_point', type=int, default=4096, help='Point number [default: 4096]')
14 | parser.add_argument('--model_path', required=True, help='model checkpoint file path')
15 | parser.add_argument('--dump_dir', required=True, help='dump folder path')
16 | parser.add_argument('--output_filelist', required=True, help='TXT filename, filelist, each line is an output for a room')
17 | parser.add_argument('--room_data_filelist', required=True, help='TXT filename, filelist, each line is a test room data label file.')
18 | parser.add_argument('--no_clutter', action='store_true', help='If true, do not count the clutter class')
19 | parser.add_argument('--visu', action='store_true', help='Whether to output OBJ file for prediction visualization.')
20 | FLAGS = parser.parse_args()
21 |
22 | BATCH_SIZE = FLAGS.batch_size
23 | NUM_POINT = FLAGS.num_point
24 | MODEL_PATH = FLAGS.model_path
25 | GPU_INDEX = FLAGS.gpu
26 | DUMP_DIR = FLAGS.dump_dir
27 | if not os.path.exists(DUMP_DIR): os.mkdir(DUMP_DIR)
28 | LOG_FOUT = open(os.path.join(DUMP_DIR, 'log_evaluate.txt'), 'w')
29 | LOG_FOUT.write(str(FLAGS)+'\n')
30 | ROOM_PATH_LIST = [os.path.join(ROOT_DIR,line.rstrip()) for line in open(FLAGS.room_data_filelist)]
31 |
32 | NUM_CLASSES = 13
33 |
34 | def log_string(out_str):
35 | LOG_FOUT.write(out_str+'\n')
36 | LOG_FOUT.flush()
37 | print(out_str)
38 |
39 | def evaluate():
40 | is_training = False
41 |
42 | with tf.device('/gpu:'+str(GPU_INDEX)):
43 | pointclouds_pl, labels_pl = placeholder_inputs(BATCH_SIZE, NUM_POINT)
44 | is_training_pl = tf.placeholder(tf.bool, shape=())
45 |
46 | # simple model
47 | pred = get_model(pointclouds_pl, is_training_pl)
48 | loss = get_loss(pred, labels_pl)
49 | pred_softmax = tf.nn.softmax(pred)
50 |
51 | # Add ops to save and restore all the variables.
52 | saver = tf.train.Saver()
53 |
54 | # Create a session
55 | config = tf.ConfigProto()
56 | config.gpu_options.allow_growth = True
57 | config.allow_soft_placement = True
58 | config.log_device_placement = True
59 | sess = tf.Session(config=config)
60 |
61 | # Restore variables from disk.
62 | saver.restore(sess, MODEL_PATH)
63 | log_string("Model restored.")
64 |
65 | ops = {'pointclouds_pl': pointclouds_pl,
66 | 'labels_pl': labels_pl,
67 | 'is_training_pl': is_training_pl,
68 | 'pred': pred,
69 | 'pred_softmax': pred_softmax,
70 | 'loss': loss}
71 |
72 | total_correct = 0
73 | total_seen = 0
74 | fout_out_filelist = open(FLAGS.output_filelist, 'w')
75 | for room_path in ROOM_PATH_LIST:
76 | out_data_label_filename = os.path.basename(room_path)[:-4] + '_pred.txt'
77 | out_data_label_filename = os.path.join(DUMP_DIR, out_data_label_filename)
78 | out_gt_label_filename = os.path.basename(room_path)[:-4] + '_gt.txt'
79 | out_gt_label_filename = os.path.join(DUMP_DIR, out_gt_label_filename)
80 | print(room_path, out_data_label_filename)
81 | a, b = eval_one_epoch(sess, ops, room_path, out_data_label_filename, out_gt_label_filename)
82 | total_correct += a
83 | total_seen += b
84 | fout_out_filelist.write(out_data_label_filename+'\n')
85 | fout_out_filelist.close()
86 | log_string('all room eval accuracy: %f'% (total_correct / float(total_seen)))
87 |
88 | def eval_one_epoch(sess, ops, room_path, out_data_label_filename, out_gt_label_filename):
89 | error_cnt = 0
90 | is_training = False
91 | total_correct = 0
92 | total_seen = 0
93 | loss_sum = 0
94 | total_seen_class = [0 for _ in range(NUM_CLASSES)]
95 | total_correct_class = [0 for _ in range(NUM_CLASSES)]
96 | if FLAGS.visu:
97 | fout = open(os.path.join(DUMP_DIR, os.path.basename(room_path)[:-4]+'_pred.obj'), 'w')
98 | fout_gt = open(os.path.join(DUMP_DIR, os.path.basename(room_path)[:-4]+'_gt.obj'), 'w')
99 | fout_data_label = open(out_data_label_filename, 'w')
100 | fout_gt_label = open(out_gt_label_filename, 'w')
101 |
102 | current_data, current_label = indoor3d_util.room2blocks_wrapper_normalized(room_path, NUM_POINT)
103 | current_data = current_data[:,0:NUM_POINT,:]
104 | current_label = np.squeeze(current_label)
105 |     # Get room dimensions.
106 | data_label = np.load(room_path)
107 | data = data_label[:,0:6]
108 | max_room_x = max(data[:,0])
109 | max_room_y = max(data[:,1])
110 | max_room_z = max(data[:,2])
111 |
112 | file_size = current_data.shape[0]
113 | num_batches = file_size // BATCH_SIZE
114 | print(file_size)
115 |
116 |
117 | for batch_idx in range(num_batches):
118 | start_idx = batch_idx * BATCH_SIZE
119 | end_idx = (batch_idx+1) * BATCH_SIZE
120 | cur_batch_size = end_idx - start_idx
121 |
122 | feed_dict = {ops['pointclouds_pl']: current_data[start_idx:end_idx, :, :],
123 | ops['labels_pl']: current_label[start_idx:end_idx],
124 | ops['is_training_pl']: is_training}
125 | loss_val, pred_val = sess.run([ops['loss'], ops['pred_softmax']],
126 | feed_dict=feed_dict)
127 |
128 | if FLAGS.no_clutter:
129 | pred_label = np.argmax(pred_val[:,:,0:12], 2) # BxN
130 | else:
131 | pred_label = np.argmax(pred_val, 2) # BxN
132 | # Save prediction labels to OBJ file
133 | for b in range(BATCH_SIZE):
134 | pts = current_data[start_idx+b, :, :]
135 | l = current_label[start_idx+b,:]
136 | pts[:,6] *= max_room_x
137 | pts[:,7] *= max_room_y
138 | pts[:,8] *= max_room_z
139 | pts[:,3:6] *= 255.0
140 | pred = pred_label[b, :]
141 | for i in range(NUM_POINT):
142 | color = indoor3d_util.g_label2color[pred[i]]
143 | color_gt = indoor3d_util.g_label2color[current_label[start_idx+b, i]]
144 | if FLAGS.visu:
145 | fout.write('v %f %f %f %d %d %d\n' % (pts[i,6], pts[i,7], pts[i,8], color[0], color[1], color[2]))
146 | fout_gt.write('v %f %f %f %d %d %d\n' % (pts[i,6], pts[i,7], pts[i,8], color_gt[0], color_gt[1], color_gt[2]))
147 | fout_data_label.write('%f %f %f %d %d %d %f %d\n' % (pts[i,6], pts[i,7], pts[i,8], pts[i,3], pts[i,4], pts[i,5], pred_val[b,i,pred[i]], pred[i]))
148 | fout_gt_label.write('%d\n' % (l[i]))
149 | correct = np.sum(pred_label == current_label[start_idx:end_idx,:])
150 | total_correct += correct
151 | total_seen += (cur_batch_size*NUM_POINT)
152 | loss_sum += (loss_val*BATCH_SIZE)
153 | for i in range(start_idx, end_idx):
154 | for j in range(NUM_POINT):
155 | l = current_label[i, j]
156 | total_seen_class[l] += 1
157 | total_correct_class[l] += (pred_label[i-start_idx, j] == l)
158 |
159 | log_string('eval mean loss: %f' % (loss_sum / float(total_seen/NUM_POINT)))
160 | log_string('eval accuracy: %f'% (total_correct / float(total_seen)))
161 | fout_data_label.close()
162 | fout_gt_label.close()
163 | if FLAGS.visu:
164 | fout.close()
165 | fout_gt.close()
166 | return total_correct, total_seen
167 |
168 |
169 | if __name__=='__main__':
170 | with tf.Graph().as_default():
171 | evaluate()
172 | LOG_FOUT.close()
173 |
--------------------------------------------------------------------------------
/sem_seg/collect_indoor3d_data.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
4 | ROOT_DIR = os.path.dirname(BASE_DIR)
5 | sys.path.append(BASE_DIR)
6 | import indoor3d_util
7 |
8 | anno_paths = [line.rstrip() for line in open(os.path.join(BASE_DIR, 'meta/anno_paths.txt'))]
9 | anno_paths = [os.path.join(indoor3d_util.DATA_PATH, p) for p in anno_paths]
10 |
11 | output_folder = os.path.join(ROOT_DIR, 'data/stanford_indoor3d')
12 | if not os.path.exists(output_folder):
13 | os.mkdir(output_folder)
14 |
15 | # Note: there is an extra character in the v1.2 data in Area_5/hallway_6. It's fixed manually.
16 | for anno_path in anno_paths:
17 | print(anno_path)
18 | try:
19 | elements = anno_path.split('/')
20 | out_filename = elements[-3]+'_'+elements[-2]+'.npy' # Area_1_hallway_1.npy
21 | indoor3d_util.collect_point_label(anno_path, os.path.join(output_folder, out_filename), 'numpy')
22 |     except Exception:
23 | print(anno_path, 'ERROR!!')
24 |
--------------------------------------------------------------------------------
/sem_seg/download_data.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | # Download HDF5 for indoor 3d semantic segmentation (around 1.6GB)
4 | wget https://shapenet.cs.stanford.edu/media/indoor3d_sem_seg_hdf5_data.zip
5 | unzip indoor3d_sem_seg_hdf5_data.zip
6 | rm indoor3d_sem_seg_hdf5_data.zip
7 |
8 |
--------------------------------------------------------------------------------
/sem_seg/eval_iou_accuracy.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 |
3 | pred_data_label_filenames = [line.rstrip() for line in open('all_pred_data_label_filelist.txt')]
4 | gt_label_filenames = [f[:-len('_pred.txt')] + '_gt.txt' for f in pred_data_label_filenames]  # swap the '_pred.txt' suffix (str.rstrip strips characters, not a suffix)
5 | num_room = len(gt_label_filenames)
6 |
7 |
8 | gt_classes = [0 for _ in range(13)]
9 | positive_classes = [0 for _ in range(13)]
10 | true_positive_classes = [0 for _ in range(13)]
11 | for i in range(num_room):
12 | print(i)
13 | data_label = np.loadtxt(pred_data_label_filenames[i])
14 | pred_label = data_label[:,-1]
15 | gt_label = np.loadtxt(gt_label_filenames[i])
16 | print(gt_label.shape)
17 |     for j in range(gt_label.shape[0]):
18 | gt_l = int(gt_label[j])
19 | pred_l = int(pred_label[j])
20 | gt_classes[gt_l] += 1
21 | positive_classes[pred_l] += 1
22 | true_positive_classes[gt_l] += int(gt_l==pred_l)
23 |
24 |
25 | print(gt_classes)
26 | print(positive_classes)
27 | print(true_positive_classes)
28 |
29 |
30 | print('Overall accuracy: {0}'.format(sum(true_positive_classes)/float(sum(positive_classes))))
31 |
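# Per-class IoU = TP / (GT + predicted positives - TP), i.e. intersection over union.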
32 | print('IoU:')
33 | iou_list = []
34 | for i in range(13):
35 | iou = true_positive_classes[i]/float(gt_classes[i]+positive_classes[i]-true_positive_classes[i])
36 | print(iou)
37 | iou_list.append(iou)
38 |
39 | print('avg IoU: {0}'.format(sum(iou_list)/13.0))
40 |
--------------------------------------------------------------------------------
/sem_seg/gen_indoor3d_h5.py:
--------------------------------------------------------------------------------
1 | import os
2 | import numpy as np
3 | import sys
4 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
5 | ROOT_DIR = os.path.dirname(BASE_DIR)
6 | sys.path.append(BASE_DIR)
7 | sys.path.append(os.path.join(ROOT_DIR, 'utils'))
8 | import data_prep_util
9 | import indoor3d_util
10 |
11 | # Constants
12 | data_dir = os.path.join(ROOT_DIR, 'data')
13 | indoor3d_data_dir = os.path.join(data_dir, 'stanford_indoor3d')
14 | NUM_POINT = 4096
15 | H5_BATCH_SIZE = 1000
16 | data_dim = [NUM_POINT, 9]
17 | label_dim = [NUM_POINT]
18 | data_dtype = 'float32'
19 | label_dtype = 'uint8'
20 |
21 | # Set paths
22 | filelist = os.path.join(BASE_DIR, 'meta/all_data_label.txt')
23 | data_label_files = [os.path.join(indoor3d_data_dir, line.rstrip()) for line in open(filelist)]
24 | output_dir = os.path.join(data_dir, 'indoor3d_sem_seg_hdf5_data')
25 | if not os.path.exists(output_dir):
26 | os.mkdir(output_dir)
27 | output_filename_prefix = os.path.join(output_dir, 'ply_data_all')
28 | output_room_filelist = os.path.join(output_dir, 'room_filelist.txt')
29 | fout_room = open(output_room_filelist, 'w')
30 |
31 | # --------------------------------------
32 | # ----- BATCH WRITE TO HDF5 -----
33 | # --------------------------------------
34 | batch_data_dim = [H5_BATCH_SIZE] + data_dim
35 | batch_label_dim = [H5_BATCH_SIZE] + label_dim
36 | h5_batch_data = np.zeros(batch_data_dim, dtype = np.float32)
37 | h5_batch_label = np.zeros(batch_label_dim, dtype = np.uint8)
38 | buffer_size = 0 # state: record how many samples are currently in buffer
39 | h5_index = 0 # state: the next h5 file to save
40 |
41 | def insert_batch(data, label, last_batch=False):
42 | global h5_batch_data, h5_batch_label
43 | global buffer_size, h5_index
44 | data_size = data.shape[0]
45 | # If there is enough space, just insert
46 | if buffer_size + data_size <= h5_batch_data.shape[0]:
47 | h5_batch_data[buffer_size:buffer_size+data_size, ...] = data
48 | h5_batch_label[buffer_size:buffer_size+data_size] = label
49 | buffer_size += data_size
50 | else: # not enough space
51 | capacity = h5_batch_data.shape[0] - buffer_size
52 | assert(capacity>=0)
53 | if capacity > 0:
54 | h5_batch_data[buffer_size:buffer_size+capacity, ...] = data[0:capacity, ...]
55 | h5_batch_label[buffer_size:buffer_size+capacity, ...] = label[0:capacity, ...]
56 | # Save batch data and label to h5 file, reset buffer_size
57 | h5_filename = output_filename_prefix + '_' + str(h5_index) + '.h5'
58 | data_prep_util.save_h5(h5_filename, h5_batch_data, h5_batch_label, data_dtype, label_dtype)
59 | print('Stored {0} with size {1}'.format(h5_filename, h5_batch_data.shape[0]))
60 | h5_index += 1
61 | buffer_size = 0
62 | # recursive call
63 | insert_batch(data[capacity:, ...], label[capacity:, ...], last_batch)
64 | if last_batch and buffer_size > 0:
65 | h5_filename = output_filename_prefix + '_' + str(h5_index) + '.h5'
66 | data_prep_util.save_h5(h5_filename, h5_batch_data[0:buffer_size, ...], h5_batch_label[0:buffer_size, ...], data_dtype, label_dtype)
67 | print('Stored {0} with size {1}'.format(h5_filename, buffer_size))
68 | h5_index += 1
69 | buffer_size = 0
70 | return
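
# Example of the buffering behavior (a sketch): with H5_BATCH_SIZE = 1000,
# inserting rooms of 600 and then 700 samples writes ply_data_all_0.h5 with
# exactly 1000 samples (600 + the first 400 of the second room) and leaves
# the remaining 300 samples in the buffer, to be flushed to the next file
# (or immediately, if last_batch is set).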
71 |
72 |
73 | sample_cnt = 0
74 | for i, data_label_filename in enumerate(data_label_files):
75 | print(data_label_filename)
76 | data, label = indoor3d_util.room2blocks_wrapper_normalized(data_label_filename, NUM_POINT, block_size=1.0, stride=0.5,
77 | random_sample=False, sample_num=None)
78 | print('{0}, {1}'.format(data.shape, label.shape))
79 | for _ in range(data.shape[0]):
80 | fout_room.write(os.path.basename(data_label_filename)[0:-4]+'\n')
81 |
82 | sample_cnt += data.shape[0]
83 | insert_batch(data, label, i == len(data_label_files)-1)
84 |
85 | fout_room.close()
86 | print("Total samples: {0}".format(sample_cnt))
87 |
--------------------------------------------------------------------------------
/sem_seg/indoor3d_util.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import glob
3 | import os
4 | import sys
5 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
6 | ROOT_DIR = os.path.dirname(BASE_DIR)
7 | sys.path.append(BASE_DIR)
8 |
9 | # -----------------------------------------------------------------------------
10 | # CONSTANTS
11 | # -----------------------------------------------------------------------------
12 |
13 | DATA_PATH = os.path.join(ROOT_DIR, 'data', 'Stanford3dDataset_v1.2_Aligned_Version')
14 | g_classes = [x.rstrip() for x in open(os.path.join(BASE_DIR, 'meta/class_names.txt'))]
15 | g_class2label = {cls: i for i,cls in enumerate(g_classes)}
16 | g_class2color = {'ceiling': [0,255,0],
17 | 'floor': [0,0,255],
18 | 'wall': [0,255,255],
19 | 'beam': [255,255,0],
20 | 'column': [255,0,255],
21 | 'window': [100,100,255],
22 | 'door': [200,200,100],
23 | 'table': [170,120,200],
24 | 'chair': [255,0,0],
25 | 'sofa': [200,100,100],
26 | 'bookcase': [10,200,100],
27 | 'board': [200,200,200],
28 | 'clutter': [50,50,50]}
29 | g_easy_view_labels = [7,8,9,10,11,1]
30 | g_label2color = {g_classes.index(cls): g_class2color[cls] for cls in g_classes}
31 |
32 |
33 | # -----------------------------------------------------------------------------
34 | # CONVERT ORIGINAL DATA TO OUR DATA_LABEL FILES
35 | # -----------------------------------------------------------------------------
36 |
37 | def collect_point_label(anno_path, out_filename, file_format='txt'):
38 | """ Convert original dataset files to data_label file (each line is XYZRGBL).
39 |         We aggregate all the points from each instance in the room.
40 |
41 | Args:
42 | anno_path: path to annotations. e.g. Area_1/office_2/Annotations/
43 | out_filename: path to save collected points and labels (each line is XYZRGBL)
44 | file_format: txt or numpy, determines what file format to save.
45 | Returns:
46 | None
47 | Note:
48 |         the points are shifted before saving; the most negative point ends up at the origin.
49 | """
50 | points_list = []
51 |
52 | for f in glob.glob(os.path.join(anno_path, '*.txt')):
53 | cls = os.path.basename(f).split('_')[0]
54 |         if cls not in g_classes: # note: some rooms contain a 'staris' class (a typo in the raw data); map it to clutter
55 | cls = 'clutter'
56 | points = np.loadtxt(f)
57 | labels = np.ones((points.shape[0],1)) * g_class2label[cls]
58 | points_list.append(np.concatenate([points, labels], 1)) # Nx7
59 |
60 | data_label = np.concatenate(points_list, 0)
61 | xyz_min = np.amin(data_label, axis=0)[0:3]
62 | data_label[:, 0:3] -= xyz_min
63 |
64 | if file_format=='txt':
65 | fout = open(out_filename, 'w')
66 | for i in range(data_label.shape[0]):
67 | fout.write('%f %f %f %d %d %d %d\n' % \
68 | (data_label[i,0], data_label[i,1], data_label[i,2],
69 | data_label[i,3], data_label[i,4], data_label[i,5],
70 | data_label[i,6]))
71 | fout.close()
72 | elif file_format=='numpy':
73 | np.save(out_filename, data_label)
74 | else:
75 | print('ERROR!! Unknown file format: %s, please use txt or numpy.' % \
76 | (file_format))
77 | exit()
78 |
79 | def point_label_to_obj(input_filename, out_filename, label_color=True, easy_view=False, no_wall=False):
80 | """ For visualization of a room from data_label file,
81 | input_filename: each line is X Y Z R G B L
82 | out_filename: OBJ filename,
83 | visualize input file by coloring point with label color
84 |         easy_view: only visualize furniture and floor
85 | """
86 | data_label = np.loadtxt(input_filename)
87 | data = data_label[:, 0:6]
88 | label = data_label[:, -1].astype(int)
89 | fout = open(out_filename, 'w')
90 | for i in range(data.shape[0]):
91 | color = g_label2color[label[i]]
92 | if easy_view and (label[i] not in g_easy_view_labels):
93 | continue
94 | if no_wall and ((label[i] == 2) or (label[i]==0)):
95 | continue
96 | if label_color:
97 | fout.write('v %f %f %f %d %d %d\n' % \
98 | (data[i,0], data[i,1], data[i,2], color[0], color[1], color[2]))
99 | else:
100 | fout.write('v %f %f %f %d %d %d\n' % \
101 | (data[i,0], data[i,1], data[i,2], data[i,3], data[i,4], data[i,5]))
102 | fout.close()
103 |
104 |
105 |
106 | # -----------------------------------------------------------------------------
107 | # PREPARE BLOCK DATA FOR DEEPNETS TRAINING/TESTING
108 | # -----------------------------------------------------------------------------
109 |
110 | def sample_data(data, num_sample):
111 | """ data is in N x ...
112 |         we want to keep num_sample of them.
113 | if N > num_sample, we will randomly keep num_sample of them.
114 | if N < num_sample, we will randomly duplicate samples.
115 | """
116 | N = data.shape[0]
117 | if (N == num_sample):
118 |         return data, list(range(N))
119 | elif (N > num_sample):
120 |         sample = np.random.choice(N, num_sample, replace=False)  # sample without replacement when downsampling
121 | return data[sample, ...], sample
122 | else:
123 | sample = np.random.choice(N, num_sample-N)
124 | dup_data = data[sample, ...]
125 |         return np.concatenate([data, dup_data], 0), list(range(N)) + list(sample)
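
# Example: sample_data(np.zeros((5, 3)), 8) returns data of shape (8, 3), the
# 5 originals plus 3 randomly duplicated rows, together with the row indices.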
126 |
127 | def sample_data_label(data, label, num_sample):
128 | new_data, sample_indices = sample_data(data, num_sample)
129 | new_label = label[sample_indices]
130 | return new_data, new_label
131 |
132 | def room2blocks(data, label, num_point, block_size=1.0, stride=1.0,
133 | random_sample=False, sample_num=None, sample_aug=1):
134 | """ Prepare block training data.
135 | Args:
136 | data: N x 6 numpy array, 012 are XYZ in meters, 345 are RGB in [0,1]
137 | assumes the data is shifted (min point is origin) and aligned
138 | (aligned with XYZ axis)
139 | label: N size uint8 numpy array from 0-12
140 | num_point: int, how many points to sample in each block
141 | block_size: float, physical size of the block in meters
142 | stride: float, stride for block sweeping
143 | random_sample: bool, if True, we will randomly sample blocks in the room
144 | sample_num: int, if random sample, how many blocks to sample
145 | [default: room area]
146 | sample_aug: if random sample, how much aug
147 | Returns:
148 | block_datas: K x num_point x 6 np array of XYZRGB, RGB is in [0,1]
149 | block_labels: K x num_point x 1 np array of uint8 labels
150 |
151 |     TODO: in this version, blocking follows a fixed, non-overlapping pattern.
152 | """
153 | assert(stride<=block_size)
154 |
155 | limit = np.amax(data, 0)[0:3]
156 |
157 | # Get the corner location for our sampling blocks
158 | xbeg_list = []
159 | ybeg_list = []
160 | if not random_sample:
161 | num_block_x = int(np.ceil((limit[0] - block_size) / stride)) + 1
162 | num_block_y = int(np.ceil((limit[1] - block_size) / stride)) + 1
163 | for i in range(num_block_x):
164 | for j in range(num_block_y):
165 | xbeg_list.append(i*stride)
166 | ybeg_list.append(j*stride)
167 | else:
168 | num_block_x = int(np.ceil(limit[0] / block_size))
169 | num_block_y = int(np.ceil(limit[1] / block_size))
170 | if sample_num is None:
171 | sample_num = num_block_x * num_block_y * sample_aug
172 | for _ in range(sample_num):
173 | xbeg = np.random.uniform(-block_size, limit[0])
174 | ybeg = np.random.uniform(-block_size, limit[1])
175 | xbeg_list.append(xbeg)
176 | ybeg_list.append(ybeg)
177 |
178 | # Collect blocks
179 | block_data_list = []
180 | block_label_list = []
181 | idx = 0
182 | for idx in range(len(xbeg_list)):
183 | xbeg = xbeg_list[idx]
184 | ybeg = ybeg_list[idx]
185 | xcond = (data[:,0]<=xbeg+block_size) & (data[:,0]>=xbeg)
186 | ycond = (data[:,1]<=ybeg+block_size) & (data[:,1]>=ybeg)
187 | cond = xcond & ycond
188 | if np.sum(cond) < 100: # discard block if there are less than 100 pts.
189 | continue
190 |
191 | block_data = data[cond, :]
192 | block_label = label[cond]
193 |
194 | # randomly subsample data
195 | block_data_sampled, block_label_sampled = \
196 | sample_data_label(block_data, block_label, num_point)
197 | block_data_list.append(np.expand_dims(block_data_sampled, 0))
198 | block_label_list.append(np.expand_dims(block_label_sampled, 0))
199 |
200 | return np.concatenate(block_data_list, 0), \
201 | np.concatenate(block_label_list, 0)
202 |
203 |
204 | def room2blocks_plus(data_label, num_point, block_size, stride,
205 | random_sample, sample_num, sample_aug):
206 |     """ room2blocks with RGB preprocessing; takes a loaded data_label array.
207 | """
208 | data = data_label[:,0:6]
209 | data[:,3:6] /= 255.0
210 | label = data_label[:,-1].astype(np.uint8)
211 |
212 | return room2blocks(data, label, num_point, block_size, stride,
213 | random_sample, sample_num, sample_aug)
214 |
215 | def room2blocks_wrapper(data_label_filename, num_point, block_size=1.0, stride=1.0,
216 | random_sample=False, sample_num=None, sample_aug=1):
217 | if data_label_filename[-3:] == 'txt':
218 | data_label = np.loadtxt(data_label_filename)
219 | elif data_label_filename[-3:] == 'npy':
220 | data_label = np.load(data_label_filename)
221 | else:
222 | print('Unknown file type! exiting.')
223 | exit()
224 | return room2blocks_plus(data_label, num_point, block_size, stride,
225 | random_sample, sample_num, sample_aug)
226 |
227 | def room2blocks_plus_normalized(data_label, num_point, block_size, stride,
228 | random_sample, sample_num, sample_aug):
229 |     """ room2blocks with RGB preprocessing; takes a loaded data_label array.
230 |         For each block, centralize XY and add normalized XYZ as channels 6-8.
231 | """
232 | data = data_label[:,0:6]
233 | data[:,3:6] /= 255.0
234 | label = data_label[:,-1].astype(np.uint8)
235 | max_room_x = max(data[:,0])
236 | max_room_y = max(data[:,1])
237 | max_room_z = max(data[:,2])
238 |
239 | data_batch, label_batch = room2blocks(data, label, num_point, block_size, stride,
240 | random_sample, sample_num, sample_aug)
241 | new_data_batch = np.zeros((data_batch.shape[0], num_point, 9))
242 | for b in range(data_batch.shape[0]):
243 | new_data_batch[b, :, 6] = data_batch[b, :, 0]/max_room_x
244 | new_data_batch[b, :, 7] = data_batch[b, :, 1]/max_room_y
245 | new_data_batch[b, :, 8] = data_batch[b, :, 2]/max_room_z
246 | minx = min(data_batch[b, :, 0])
247 | miny = min(data_batch[b, :, 1])
248 | data_batch[b, :, 0] -= (minx+block_size/2)
249 | data_batch[b, :, 1] -= (miny+block_size/2)
250 | new_data_batch[:, :, 0:6] = data_batch
251 | return new_data_batch, label_batch
252 |
253 |
254 | def room2blocks_wrapper_normalized(data_label_filename, num_point, block_size=1.0, stride=1.0,
255 | random_sample=False, sample_num=None, sample_aug=1):
256 | if data_label_filename[-3:] == 'txt':
257 | data_label = np.loadtxt(data_label_filename)
258 | elif data_label_filename[-3:] == 'npy':
259 | data_label = np.load(data_label_filename)
260 | else:
261 | print('Unknown file type! exiting.')
262 | exit()
263 | return room2blocks_plus_normalized(data_label, num_point, block_size, stride,
264 | random_sample, sample_num, sample_aug)
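
# Example usage (a sketch; the room filename is illustrative):
#   data, label = room2blocks_wrapper_normalized('Area_6_office_1.npy', 4096)
#   # data: K x 4096 x 9 (block-centered XY, Z, RGB in [0,1], room-normalized XYZ)
#   # label: K x 4096 array of uint8 class ids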
265 |
266 | def room2samples(data, label, sample_num_point):
267 | """ Prepare whole room samples.
268 |
269 | Args:
270 | data: N x 6 numpy array, 012 are XYZ in meters, 345 are RGB in [0,1]
271 | assumes the data is shifted (min point is origin) and
272 | aligned (aligned with XYZ axis)
273 | label: N size uint8 numpy array from 0-12
274 | sample_num_point: int, how many points to sample in each sample
275 | Returns:
276 |         sample_datas: K x sample_num_point x 6
277 |             numpy array of XYZRGB, RGB is in [0,1]
278 | sample_labels: K x sample_num_point x 1 np array of uint8 labels
279 | """
280 | N = data.shape[0]
281 | order = np.arange(N)
282 | np.random.shuffle(order)
283 | data = data[order, :]
284 | label = label[order]
285 |
286 | batch_num = int(np.ceil(N / float(sample_num_point)))
287 | sample_datas = np.zeros((batch_num, sample_num_point, 6))
288 | sample_labels = np.zeros((batch_num, sample_num_point, 1))
289 |
290 | for i in range(batch_num):
291 | beg_idx = i*sample_num_point
292 | end_idx = min((i+1)*sample_num_point, N)
293 | num = end_idx - beg_idx
294 | sample_datas[i,0:num,:] = data[beg_idx:end_idx, :]
295 | sample_labels[i,0:num,0] = label[beg_idx:end_idx]
296 | if num < sample_num_point:
297 | makeup_indices = np.random.choice(N, sample_num_point - num)
298 | sample_datas[i,num:,:] = data[makeup_indices, :]
299 | sample_labels[i,num:,0] = label[makeup_indices]
300 | return sample_datas, sample_labels
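
# Example: a room with N = 10000 points and sample_num_point = 4096 yields
# ceil(10000 / 4096) = 3 samples; the last holds 1808 real points and is
# padded with 2288 randomly re-drawn points.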
301 |
302 | def room2samples_plus_normalized(data_label, num_point):
303 |     """ room2samples with RGB preprocessing; takes a loaded data_label array.
304 |         For each sample, add normalized XYZ as channels 6-8 (XY centering is disabled here).
305 | """
306 | data = data_label[:,0:6]
307 | data[:,3:6] /= 255.0
308 | label = data_label[:,-1].astype(np.uint8)
309 | max_room_x = max(data[:,0])
310 | max_room_y = max(data[:,1])
311 | max_room_z = max(data[:,2])
312 | #print(max_room_x, max_room_y, max_room_z)
313 |
314 | data_batch, label_batch = room2samples(data, label, num_point)
315 | new_data_batch = np.zeros((data_batch.shape[0], num_point, 9))
316 | for b in range(data_batch.shape[0]):
317 | new_data_batch[b, :, 6] = data_batch[b, :, 0]/max_room_x
318 | new_data_batch[b, :, 7] = data_batch[b, :, 1]/max_room_y
319 | new_data_batch[b, :, 8] = data_batch[b, :, 2]/max_room_z
320 | #minx = min(data_batch[b, :, 0])
321 | #miny = min(data_batch[b, :, 1])
322 | #data_batch[b, :, 0] -= (minx+block_size/2)
323 | #data_batch[b, :, 1] -= (miny+block_size/2)
324 | new_data_batch[:, :, 0:6] = data_batch
325 | return new_data_batch, label_batch
326 |
327 |
328 | def room2samples_wrapper_normalized(data_label_filename, num_point):
329 | if data_label_filename[-3:] == 'txt':
330 | data_label = np.loadtxt(data_label_filename)
331 | elif data_label_filename[-3:] == 'npy':
332 | data_label = np.load(data_label_filename)
333 | else:
334 | print('Unknown file type! exiting.')
335 | exit()
336 | return room2samples_plus_normalized(data_label, num_point)
337 |
338 |
339 | # -----------------------------------------------------------------------------
340 | # EXTRACT INSTANCE BBOX FROM ORIGINAL DATA (for detection evaluation)
341 | # -----------------------------------------------------------------------------
342 |
343 | def collect_bounding_box(anno_path, out_filename):
344 | """ Compute bounding boxes from each instance in original dataset files on
345 | one room. **We assume the bbox is aligned with XYZ coordinate.**
346 |
347 | Args:
348 | anno_path: path to annotations. e.g. Area_1/office_2/Annotations/
349 | out_filename: path to save instance bounding boxes for that room.
350 | each line is x1 y1 z1 x2 y2 z2 label,
351 | where (x1,y1,z1) is the point on the diagonal closer to origin
352 | Returns:
353 | None
354 | Note:
355 |         room points are shifted; the most negative point ends up at the origin.
356 | """
357 | bbox_label_list = []
358 |
359 | for f in glob.glob(os.path.join(anno_path, '*.txt')):
360 | cls = os.path.basename(f).split('_')[0]
361 |         if cls not in g_classes: # note: some rooms contain a 'staris' class (a typo in the raw data); map it to clutter
362 | cls = 'clutter'
363 | points = np.loadtxt(f)
364 | label = g_class2label[cls]
365 | # Compute tightest axis aligned bounding box
366 | xyz_min = np.amin(points[:, 0:3], axis=0)
367 | xyz_max = np.amax(points[:, 0:3], axis=0)
368 | ins_bbox_label = np.expand_dims(
369 | np.concatenate([xyz_min, xyz_max, np.array([label])], 0), 0)
370 | bbox_label_list.append(ins_bbox_label)
371 |
372 | bbox_label = np.concatenate(bbox_label_list, 0)
373 | room_xyz_min = np.amin(bbox_label[:, 0:3], axis=0)
374 | bbox_label[:, 0:3] -= room_xyz_min
375 | bbox_label[:, 3:6] -= room_xyz_min
376 |
377 | fout = open(out_filename, 'w')
378 | for i in range(bbox_label.shape[0]):
379 | fout.write('%f %f %f %f %f %f %d\n' % \
380 | (bbox_label[i,0], bbox_label[i,1], bbox_label[i,2],
381 | bbox_label[i,3], bbox_label[i,4], bbox_label[i,5],
382 | bbox_label[i,6]))
383 | fout.close()
384 |
385 | def bbox_label_to_obj(input_filename, out_filename_prefix, easy_view=False):
386 | """ Visualization of bounding boxes.
387 |
388 | Args:
389 | input_filename: each line is x1 y1 z1 x2 y2 z2 label
390 | out_filename_prefix: OBJ filename prefix,
391 | visualize object by g_label2color
392 | easy_view: if True, only visualize furniture and floor
393 | Returns:
394 | output a list of OBJ file and MTL files with the same prefix
395 | """
396 | bbox_label = np.loadtxt(input_filename)
397 | bbox = bbox_label[:, 0:6]
398 | label = bbox_label[:, -1].astype(int)
399 | v_cnt = 0 # count vertex
400 | ins_cnt = 0 # count instance
401 | for i in range(bbox.shape[0]):
402 | if easy_view and (label[i] not in g_easy_view_labels):
403 | continue
404 | obj_filename = out_filename_prefix+'_'+g_classes[label[i]]+'_'+str(ins_cnt)+'.obj'
405 | mtl_filename = out_filename_prefix+'_'+g_classes[label[i]]+'_'+str(ins_cnt)+'.mtl'
406 | fout_obj = open(obj_filename, 'w')
407 | fout_mtl = open(mtl_filename, 'w')
408 | fout_obj.write('mtllib %s\n' % (os.path.basename(mtl_filename)))
409 |
410 | length = bbox[i, 3:6] - bbox[i, 0:3]
411 | a = length[0]
412 | b = length[1]
413 | c = length[2]
414 | x = bbox[i, 0]
415 | y = bbox[i, 1]
416 | z = bbox[i, 2]
417 | color = np.array(g_label2color[label[i]], dtype=float) / 255.0
418 |
419 | material = 'material%d' % (ins_cnt)
420 | fout_obj.write('usemtl %s\n' % (material))
421 | fout_obj.write('v %f %f %f\n' % (x,y,z+c))
422 | fout_obj.write('v %f %f %f\n' % (x,y+b,z+c))
423 | fout_obj.write('v %f %f %f\n' % (x+a,y+b,z+c))
424 | fout_obj.write('v %f %f %f\n' % (x+a,y,z+c))
425 | fout_obj.write('v %f %f %f\n' % (x,y,z))
426 | fout_obj.write('v %f %f %f\n' % (x,y+b,z))
427 | fout_obj.write('v %f %f %f\n' % (x+a,y+b,z))
428 | fout_obj.write('v %f %f %f\n' % (x+a,y,z))
429 | fout_obj.write('g default\n')
430 | v_cnt = 0 # for individual box
431 | fout_obj.write('f %d %d %d %d\n' % (4+v_cnt, 3+v_cnt, 2+v_cnt, 1+v_cnt))
432 | fout_obj.write('f %d %d %d %d\n' % (1+v_cnt, 2+v_cnt, 6+v_cnt, 5+v_cnt))
433 | fout_obj.write('f %d %d %d %d\n' % (7+v_cnt, 6+v_cnt, 2+v_cnt, 3+v_cnt))
434 | fout_obj.write('f %d %d %d %d\n' % (4+v_cnt, 8+v_cnt, 7+v_cnt, 3+v_cnt))
435 | fout_obj.write('f %d %d %d %d\n' % (5+v_cnt, 8+v_cnt, 4+v_cnt, 1+v_cnt))
436 | fout_obj.write('f %d %d %d %d\n' % (5+v_cnt, 6+v_cnt, 7+v_cnt, 8+v_cnt))
437 | fout_obj.write('\n')
438 |
439 | fout_mtl.write('newmtl %s\n' % (material))
440 | fout_mtl.write('Kd %f %f %f\n' % (color[0], color[1], color[2]))
441 | fout_mtl.write('\n')
442 | fout_obj.close()
443 | fout_mtl.close()
444 |
445 | v_cnt += 8
446 | ins_cnt += 1
447 |
448 | def bbox_label_to_obj_room(input_filename, out_filename_prefix, easy_view=False, permute=None, center=False, exclude_table=False):
449 | """ Visualization of bounding boxes.
450 |
451 | Args:
452 | input_filename: each line is x1 y1 z1 x2 y2 z2 label
453 | out_filename_prefix: OBJ filename prefix,
454 | visualize object by g_label2color
455 | easy_view: if True, only visualize furniture and floor
456 | permute: if not None, permute XYZ for rendering, e.g. [0 2 1]
457 | center: if True, move obj to have zero origin
458 | Returns:
459 | output a list of OBJ file and MTL files with the same prefix
460 | """
461 | bbox_label = np.loadtxt(input_filename)
462 | bbox = bbox_label[:, 0:6]
463 | if permute is not None:
464 | assert(len(permute)==3)
465 | permute = np.array(permute)
466 | bbox[:,0:3] = bbox[:,permute]
467 | bbox[:,3:6] = bbox[:,permute+3]
468 | if center:
469 | xyz_max = np.amax(bbox[:,3:6], 0)
470 | bbox[:,0:3] -= (xyz_max/2.0)
471 | bbox[:,3:6] -= (xyz_max/2.0)
472 | bbox /= np.max(xyz_max/2.0)
473 | label = bbox_label[:, -1].astype(int)
474 | obj_filename = out_filename_prefix+'.obj'
475 | mtl_filename = out_filename_prefix+'.mtl'
476 |
477 | fout_obj = open(obj_filename, 'w')
478 | fout_mtl = open(mtl_filename, 'w')
479 | fout_obj.write('mtllib %s\n' % (os.path.basename(mtl_filename)))
480 | v_cnt = 0 # count vertex
481 | ins_cnt = 0 # count instance
482 | for i in range(bbox.shape[0]):
483 | if easy_view and (label[i] not in g_easy_view_labels):
484 | continue
485 | if exclude_table and label[i] == g_classes.index('table'):
486 | continue
487 |
488 | length = bbox[i, 3:6] - bbox[i, 0:3]
489 | a = length[0]
490 | b = length[1]
491 | c = length[2]
492 | x = bbox[i, 0]
493 | y = bbox[i, 1]
494 | z = bbox[i, 2]
495 | color = np.array(g_label2color[label[i]], dtype=float) / 255.0
496 |
497 | material = 'material%d' % (ins_cnt)
498 | fout_obj.write('usemtl %s\n' % (material))
499 | fout_obj.write('v %f %f %f\n' % (x,y,z+c))
500 | fout_obj.write('v %f %f %f\n' % (x,y+b,z+c))
501 | fout_obj.write('v %f %f %f\n' % (x+a,y+b,z+c))
502 | fout_obj.write('v %f %f %f\n' % (x+a,y,z+c))
503 | fout_obj.write('v %f %f %f\n' % (x,y,z))
504 | fout_obj.write('v %f %f %f\n' % (x,y+b,z))
505 | fout_obj.write('v %f %f %f\n' % (x+a,y+b,z))
506 | fout_obj.write('v %f %f %f\n' % (x+a,y,z))
507 | fout_obj.write('g default\n')
508 | fout_obj.write('f %d %d %d %d\n' % (4+v_cnt, 3+v_cnt, 2+v_cnt, 1+v_cnt))
509 | fout_obj.write('f %d %d %d %d\n' % (1+v_cnt, 2+v_cnt, 6+v_cnt, 5+v_cnt))
510 | fout_obj.write('f %d %d %d %d\n' % (7+v_cnt, 6+v_cnt, 2+v_cnt, 3+v_cnt))
511 | fout_obj.write('f %d %d %d %d\n' % (4+v_cnt, 8+v_cnt, 7+v_cnt, 3+v_cnt))
512 | fout_obj.write('f %d %d %d %d\n' % (5+v_cnt, 8+v_cnt, 4+v_cnt, 1+v_cnt))
513 | fout_obj.write('f %d %d %d %d\n' % (5+v_cnt, 6+v_cnt, 7+v_cnt, 8+v_cnt))
514 | fout_obj.write('\n')
515 |
516 | fout_mtl.write('newmtl %s\n' % (material))
517 | fout_mtl.write('Kd %f %f %f\n' % (color[0], color[1], color[2]))
518 | fout_mtl.write('\n')
519 |
520 | v_cnt += 8
521 | ins_cnt += 1
522 |
523 | fout_obj.close()
524 | fout_mtl.close()
525 |
526 |
527 | def collect_point_bounding_box(anno_path, out_filename, file_format):
528 | """ Compute bounding boxes from each instance in original dataset files on
529 | one room. **We assume the bbox is aligned with XYZ coordinate.**
530 | Save both the point XYZRGB and the bounding box for the point's
531 | parent element.
532 |
533 | Args:
534 | anno_path: path to annotations. e.g. Area_1/office_2/Annotations/
535 | out_filename: path to save instance bounding boxes for each point,
536 | plus the point's XYZRGBL
537 | each line is XYZRGBL offsetX offsetY offsetZ a b c,
538 | where cx = X+offsetX, cy=X+offsetY, cz=Z+offsetZ
539 | where (cx,cy,cz) is center of the box, a,b,c are distances from center
540 | to the surfaces of the box, i.e. x1 = cx-a, x2 = cx+a, y1=cy-b etc.
541 | file_format: output file format, txt or numpy
542 | Returns:
543 | None
544 |
545 | Note:
546 |         room points are shifted; the most negative point ends up at the origin.
547 | """
548 | point_bbox_list = []
549 |
550 | for f in glob.glob(os.path.join(anno_path, '*.txt')):
551 | cls = os.path.basename(f).split('_')[0]
552 |         if cls not in g_classes: # note: some rooms contain a 'staris' class (a typo in the raw data); map it to clutter
553 | cls = 'clutter'
554 | points = np.loadtxt(f) # Nx6
555 | label = g_class2label[cls] # N,
556 | # Compute tightest axis aligned bounding box
557 | xyz_min = np.amin(points[:, 0:3], axis=0) # 3,
558 | xyz_max = np.amax(points[:, 0:3], axis=0) # 3,
559 | xyz_center = (xyz_min + xyz_max) / 2
560 | dimension = (xyz_max - xyz_min) / 2
561 |
562 | xyz_offsets = xyz_center - points[:,0:3] # Nx3
563 | dimensions = np.ones((points.shape[0],3)) * dimension # Nx3
564 | labels = np.ones((points.shape[0],1)) * label # N
565 | point_bbox_list.append(np.concatenate([points, labels,
566 | xyz_offsets, dimensions], 1)) # Nx13
567 |
568 | point_bbox = np.concatenate(point_bbox_list, 0) # KxNx13
569 | room_xyz_min = np.amin(point_bbox[:, 0:3], axis=0)
570 | point_bbox[:, 0:3] -= room_xyz_min
571 |
572 | if file_format == 'txt':
573 | fout = open(out_filename, 'w')
574 | for i in range(point_bbox.shape[0]):
575 | fout.write('%f %f %f %d %d %d %d %f %f %f %f %f %f\n' % \
576 | (point_bbox[i,0], point_bbox[i,1], point_bbox[i,2],
577 | point_bbox[i,3], point_bbox[i,4], point_bbox[i,5],
578 | point_bbox[i,6],
579 | point_bbox[i,7], point_bbox[i,8], point_bbox[i,9],
580 | point_bbox[i,10], point_bbox[i,11], point_bbox[i,12]))
581 |
582 | fout.close()
583 | elif file_format == 'numpy':
584 | np.save(out_filename, point_bbox)
585 | else:
586 | print('ERROR!! Unknown file format: %s, please use txt or numpy.' % \
587 | (file_format))
588 | exit()
589 |
590 |
591 |
--------------------------------------------------------------------------------
/sem_seg/meta/all_data_label.txt:
--------------------------------------------------------------------------------
1 | Area_1_conferenceRoom_1.npy
2 | Area_1_conferenceRoom_2.npy
3 | Area_1_copyRoom_1.npy
4 | Area_1_hallway_1.npy
5 | Area_1_hallway_2.npy
6 | Area_1_hallway_3.npy
7 | Area_1_hallway_4.npy
8 | Area_1_hallway_5.npy
9 | Area_1_hallway_6.npy
10 | Area_1_hallway_7.npy
11 | Area_1_hallway_8.npy
12 | Area_1_office_10.npy
13 | Area_1_office_11.npy
14 | Area_1_office_12.npy
15 | Area_1_office_13.npy
16 | Area_1_office_14.npy
17 | Area_1_office_15.npy
18 | Area_1_office_16.npy
19 | Area_1_office_17.npy
20 | Area_1_office_18.npy
21 | Area_1_office_19.npy
22 | Area_1_office_1.npy
23 | Area_1_office_20.npy
24 | Area_1_office_21.npy
25 | Area_1_office_22.npy
26 | Area_1_office_23.npy
27 | Area_1_office_24.npy
28 | Area_1_office_25.npy
29 | Area_1_office_26.npy
30 | Area_1_office_27.npy
31 | Area_1_office_28.npy
32 | Area_1_office_29.npy
33 | Area_1_office_2.npy
34 | Area_1_office_30.npy
35 | Area_1_office_31.npy
36 | Area_1_office_3.npy
37 | Area_1_office_4.npy
38 | Area_1_office_5.npy
39 | Area_1_office_6.npy
40 | Area_1_office_7.npy
41 | Area_1_office_8.npy
42 | Area_1_office_9.npy
43 | Area_1_pantry_1.npy
44 | Area_1_WC_1.npy
45 | Area_2_auditorium_1.npy
46 | Area_2_auditorium_2.npy
47 | Area_2_conferenceRoom_1.npy
48 | Area_2_hallway_10.npy
49 | Area_2_hallway_11.npy
50 | Area_2_hallway_12.npy
51 | Area_2_hallway_1.npy
52 | Area_2_hallway_2.npy
53 | Area_2_hallway_3.npy
54 | Area_2_hallway_4.npy
55 | Area_2_hallway_5.npy
56 | Area_2_hallway_6.npy
57 | Area_2_hallway_7.npy
58 | Area_2_hallway_8.npy
59 | Area_2_hallway_9.npy
60 | Area_2_office_10.npy
61 | Area_2_office_11.npy
62 | Area_2_office_12.npy
63 | Area_2_office_13.npy
64 | Area_2_office_14.npy
65 | Area_2_office_1.npy
66 | Area_2_office_2.npy
67 | Area_2_office_3.npy
68 | Area_2_office_4.npy
69 | Area_2_office_5.npy
70 | Area_2_office_6.npy
71 | Area_2_office_7.npy
72 | Area_2_office_8.npy
73 | Area_2_office_9.npy
74 | Area_2_storage_1.npy
75 | Area_2_storage_2.npy
76 | Area_2_storage_3.npy
77 | Area_2_storage_4.npy
78 | Area_2_storage_5.npy
79 | Area_2_storage_6.npy
80 | Area_2_storage_7.npy
81 | Area_2_storage_8.npy
82 | Area_2_storage_9.npy
83 | Area_2_WC_1.npy
84 | Area_2_WC_2.npy
85 | Area_3_conferenceRoom_1.npy
86 | Area_3_hallway_1.npy
87 | Area_3_hallway_2.npy
88 | Area_3_hallway_3.npy
89 | Area_3_hallway_4.npy
90 | Area_3_hallway_5.npy
91 | Area_3_hallway_6.npy
92 | Area_3_lounge_1.npy
93 | Area_3_lounge_2.npy
94 | Area_3_office_10.npy
95 | Area_3_office_1.npy
96 | Area_3_office_2.npy
97 | Area_3_office_3.npy
98 | Area_3_office_4.npy
99 | Area_3_office_5.npy
100 | Area_3_office_6.npy
101 | Area_3_office_7.npy
102 | Area_3_office_8.npy
103 | Area_3_office_9.npy
104 | Area_3_storage_1.npy
105 | Area_3_storage_2.npy
106 | Area_3_WC_1.npy
107 | Area_3_WC_2.npy
108 | Area_4_conferenceRoom_1.npy
109 | Area_4_conferenceRoom_2.npy
110 | Area_4_conferenceRoom_3.npy
111 | Area_4_hallway_10.npy
112 | Area_4_hallway_11.npy
113 | Area_4_hallway_12.npy
114 | Area_4_hallway_13.npy
115 | Area_4_hallway_14.npy
116 | Area_4_hallway_1.npy
117 | Area_4_hallway_2.npy
118 | Area_4_hallway_3.npy
119 | Area_4_hallway_4.npy
120 | Area_4_hallway_5.npy
121 | Area_4_hallway_6.npy
122 | Area_4_hallway_7.npy
123 | Area_4_hallway_8.npy
124 | Area_4_hallway_9.npy
125 | Area_4_lobby_1.npy
126 | Area_4_lobby_2.npy
127 | Area_4_office_10.npy
128 | Area_4_office_11.npy
129 | Area_4_office_12.npy
130 | Area_4_office_13.npy
131 | Area_4_office_14.npy
132 | Area_4_office_15.npy
133 | Area_4_office_16.npy
134 | Area_4_office_17.npy
135 | Area_4_office_18.npy
136 | Area_4_office_19.npy
137 | Area_4_office_1.npy
138 | Area_4_office_20.npy
139 | Area_4_office_21.npy
140 | Area_4_office_22.npy
141 | Area_4_office_2.npy
142 | Area_4_office_3.npy
143 | Area_4_office_4.npy
144 | Area_4_office_5.npy
145 | Area_4_office_6.npy
146 | Area_4_office_7.npy
147 | Area_4_office_8.npy
148 | Area_4_office_9.npy
149 | Area_4_storage_1.npy
150 | Area_4_storage_2.npy
151 | Area_4_storage_3.npy
152 | Area_4_storage_4.npy
153 | Area_4_WC_1.npy
154 | Area_4_WC_2.npy
155 | Area_4_WC_3.npy
156 | Area_4_WC_4.npy
157 | Area_5_conferenceRoom_1.npy
158 | Area_5_conferenceRoom_2.npy
159 | Area_5_conferenceRoom_3.npy
160 | Area_5_hallway_10.npy
161 | Area_5_hallway_11.npy
162 | Area_5_hallway_12.npy
163 | Area_5_hallway_13.npy
164 | Area_5_hallway_14.npy
165 | Area_5_hallway_15.npy
166 | Area_5_hallway_1.npy
167 | Area_5_hallway_2.npy
168 | Area_5_hallway_3.npy
169 | Area_5_hallway_4.npy
170 | Area_5_hallway_5.npy
171 | Area_5_hallway_6.npy
172 | Area_5_hallway_7.npy
173 | Area_5_hallway_8.npy
174 | Area_5_hallway_9.npy
175 | Area_5_lobby_1.npy
176 | Area_5_office_10.npy
177 | Area_5_office_11.npy
178 | Area_5_office_12.npy
179 | Area_5_office_13.npy
180 | Area_5_office_14.npy
181 | Area_5_office_15.npy
182 | Area_5_office_16.npy
183 | Area_5_office_17.npy
184 | Area_5_office_18.npy
185 | Area_5_office_19.npy
186 | Area_5_office_1.npy
187 | Area_5_office_20.npy
188 | Area_5_office_21.npy
189 | Area_5_office_22.npy
190 | Area_5_office_23.npy
191 | Area_5_office_24.npy
192 | Area_5_office_25.npy
193 | Area_5_office_26.npy
194 | Area_5_office_27.npy
195 | Area_5_office_28.npy
196 | Area_5_office_29.npy
197 | Area_5_office_2.npy
198 | Area_5_office_30.npy
199 | Area_5_office_31.npy
200 | Area_5_office_32.npy
201 | Area_5_office_33.npy
202 | Area_5_office_34.npy
203 | Area_5_office_35.npy
204 | Area_5_office_36.npy
205 | Area_5_office_37.npy
206 | Area_5_office_38.npy
207 | Area_5_office_39.npy
208 | Area_5_office_3.npy
209 | Area_5_office_40.npy
210 | Area_5_office_41.npy
211 | Area_5_office_42.npy
212 | Area_5_office_4.npy
213 | Area_5_office_5.npy
214 | Area_5_office_6.npy
215 | Area_5_office_7.npy
216 | Area_5_office_8.npy
217 | Area_5_office_9.npy
218 | Area_5_pantry_1.npy
219 | Area_5_storage_1.npy
220 | Area_5_storage_2.npy
221 | Area_5_storage_3.npy
222 | Area_5_storage_4.npy
223 | Area_5_WC_1.npy
224 | Area_5_WC_2.npy
225 | Area_6_conferenceRoom_1.npy
226 | Area_6_copyRoom_1.npy
227 | Area_6_hallway_1.npy
228 | Area_6_hallway_2.npy
229 | Area_6_hallway_3.npy
230 | Area_6_hallway_4.npy
231 | Area_6_hallway_5.npy
232 | Area_6_hallway_6.npy
233 | Area_6_lounge_1.npy
234 | Area_6_office_10.npy
235 | Area_6_office_11.npy
236 | Area_6_office_12.npy
237 | Area_6_office_13.npy
238 | Area_6_office_14.npy
239 | Area_6_office_15.npy
240 | Area_6_office_16.npy
241 | Area_6_office_17.npy
242 | Area_6_office_18.npy
243 | Area_6_office_19.npy
244 | Area_6_office_1.npy
245 | Area_6_office_20.npy
246 | Area_6_office_21.npy
247 | Area_6_office_22.npy
248 | Area_6_office_23.npy
249 | Area_6_office_24.npy
250 | Area_6_office_25.npy
251 | Area_6_office_26.npy
252 | Area_6_office_27.npy
253 | Area_6_office_28.npy
254 | Area_6_office_29.npy
255 | Area_6_office_2.npy
256 | Area_6_office_30.npy
257 | Area_6_office_31.npy
258 | Area_6_office_32.npy
259 | Area_6_office_33.npy
260 | Area_6_office_34.npy
261 | Area_6_office_35.npy
262 | Area_6_office_36.npy
263 | Area_6_office_37.npy
264 | Area_6_office_3.npy
265 | Area_6_office_4.npy
266 | Area_6_office_5.npy
267 | Area_6_office_6.npy
268 | Area_6_office_7.npy
269 | Area_6_office_8.npy
270 | Area_6_office_9.npy
271 | Area_6_openspace_1.npy
272 | Area_6_pantry_1.npy
273 |
--------------------------------------------------------------------------------
/sem_seg/meta/anno_paths.txt:
--------------------------------------------------------------------------------
1 | Area_1/conferenceRoom_1/Annotations
2 | Area_1/conferenceRoom_2/Annotations
3 | Area_1/copyRoom_1/Annotations
4 | Area_1/hallway_1/Annotations
5 | Area_1/hallway_2/Annotations
6 | Area_1/hallway_3/Annotations
7 | Area_1/hallway_4/Annotations
8 | Area_1/hallway_5/Annotations
9 | Area_1/hallway_6/Annotations
10 | Area_1/hallway_7/Annotations
11 | Area_1/hallway_8/Annotations
12 | Area_1/office_10/Annotations
13 | Area_1/office_11/Annotations
14 | Area_1/office_12/Annotations
15 | Area_1/office_13/Annotations
16 | Area_1/office_14/Annotations
17 | Area_1/office_15/Annotations
18 | Area_1/office_16/Annotations
19 | Area_1/office_17/Annotations
20 | Area_1/office_18/Annotations
21 | Area_1/office_19/Annotations
22 | Area_1/office_1/Annotations
23 | Area_1/office_20/Annotations
24 | Area_1/office_21/Annotations
25 | Area_1/office_22/Annotations
26 | Area_1/office_23/Annotations
27 | Area_1/office_24/Annotations
28 | Area_1/office_25/Annotations
29 | Area_1/office_26/Annotations
30 | Area_1/office_27/Annotations
31 | Area_1/office_28/Annotations
32 | Area_1/office_29/Annotations
33 | Area_1/office_2/Annotations
34 | Area_1/office_30/Annotations
35 | Area_1/office_31/Annotations
36 | Area_1/office_3/Annotations
37 | Area_1/office_4/Annotations
38 | Area_1/office_5/Annotations
39 | Area_1/office_6/Annotations
40 | Area_1/office_7/Annotations
41 | Area_1/office_8/Annotations
42 | Area_1/office_9/Annotations
43 | Area_1/pantry_1/Annotations
44 | Area_1/WC_1/Annotations
45 | Area_2/auditorium_1/Annotations
46 | Area_2/auditorium_2/Annotations
47 | Area_2/conferenceRoom_1/Annotations
48 | Area_2/hallway_10/Annotations
49 | Area_2/hallway_11/Annotations
50 | Area_2/hallway_12/Annotations
51 | Area_2/hallway_1/Annotations
52 | Area_2/hallway_2/Annotations
53 | Area_2/hallway_3/Annotations
54 | Area_2/hallway_4/Annotations
55 | Area_2/hallway_5/Annotations
56 | Area_2/hallway_6/Annotations
57 | Area_2/hallway_7/Annotations
58 | Area_2/hallway_8/Annotations
59 | Area_2/hallway_9/Annotations
60 | Area_2/office_10/Annotations
61 | Area_2/office_11/Annotations
62 | Area_2/office_12/Annotations
63 | Area_2/office_13/Annotations
64 | Area_2/office_14/Annotations
65 | Area_2/office_1/Annotations
66 | Area_2/office_2/Annotations
67 | Area_2/office_3/Annotations
68 | Area_2/office_4/Annotations
69 | Area_2/office_5/Annotations
70 | Area_2/office_6/Annotations
71 | Area_2/office_7/Annotations
72 | Area_2/office_8/Annotations
73 | Area_2/office_9/Annotations
74 | Area_2/storage_1/Annotations
75 | Area_2/storage_2/Annotations
76 | Area_2/storage_3/Annotations
77 | Area_2/storage_4/Annotations
78 | Area_2/storage_5/Annotations
79 | Area_2/storage_6/Annotations
80 | Area_2/storage_7/Annotations
81 | Area_2/storage_8/Annotations
82 | Area_2/storage_9/Annotations
83 | Area_2/WC_1/Annotations
84 | Area_2/WC_2/Annotations
85 | Area_3/conferenceRoom_1/Annotations
86 | Area_3/hallway_1/Annotations
87 | Area_3/hallway_2/Annotations
88 | Area_3/hallway_3/Annotations
89 | Area_3/hallway_4/Annotations
90 | Area_3/hallway_5/Annotations
91 | Area_3/hallway_6/Annotations
92 | Area_3/lounge_1/Annotations
93 | Area_3/lounge_2/Annotations
94 | Area_3/office_10/Annotations
95 | Area_3/office_1/Annotations
96 | Area_3/office_2/Annotations
97 | Area_3/office_3/Annotations
98 | Area_3/office_4/Annotations
99 | Area_3/office_5/Annotations
100 | Area_3/office_6/Annotations
101 | Area_3/office_7/Annotations
102 | Area_3/office_8/Annotations
103 | Area_3/office_9/Annotations
104 | Area_3/storage_1/Annotations
105 | Area_3/storage_2/Annotations
106 | Area_3/WC_1/Annotations
107 | Area_3/WC_2/Annotations
108 | Area_4/conferenceRoom_1/Annotations
109 | Area_4/conferenceRoom_2/Annotations
110 | Area_4/conferenceRoom_3/Annotations
111 | Area_4/hallway_10/Annotations
112 | Area_4/hallway_11/Annotations
113 | Area_4/hallway_12/Annotations
114 | Area_4/hallway_13/Annotations
115 | Area_4/hallway_14/Annotations
116 | Area_4/hallway_1/Annotations
117 | Area_4/hallway_2/Annotations
118 | Area_4/hallway_3/Annotations
119 | Area_4/hallway_4/Annotations
120 | Area_4/hallway_5/Annotations
121 | Area_4/hallway_6/Annotations
122 | Area_4/hallway_7/Annotations
123 | Area_4/hallway_8/Annotations
124 | Area_4/hallway_9/Annotations
125 | Area_4/lobby_1/Annotations
126 | Area_4/lobby_2/Annotations
127 | Area_4/office_10/Annotations
128 | Area_4/office_11/Annotations
129 | Area_4/office_12/Annotations
130 | Area_4/office_13/Annotations
131 | Area_4/office_14/Annotations
132 | Area_4/office_15/Annotations
133 | Area_4/office_16/Annotations
134 | Area_4/office_17/Annotations
135 | Area_4/office_18/Annotations
136 | Area_4/office_19/Annotations
137 | Area_4/office_1/Annotations
138 | Area_4/office_20/Annotations
139 | Area_4/office_21/Annotations
140 | Area_4/office_22/Annotations
141 | Area_4/office_2/Annotations
142 | Area_4/office_3/Annotations
143 | Area_4/office_4/Annotations
144 | Area_4/office_5/Annotations
145 | Area_4/office_6/Annotations
146 | Area_4/office_7/Annotations
147 | Area_4/office_8/Annotations
148 | Area_4/office_9/Annotations
149 | Area_4/storage_1/Annotations
150 | Area_4/storage_2/Annotations
151 | Area_4/storage_3/Annotations
152 | Area_4/storage_4/Annotations
153 | Area_4/WC_1/Annotations
154 | Area_4/WC_2/Annotations
155 | Area_4/WC_3/Annotations
156 | Area_4/WC_4/Annotations
157 | Area_5/conferenceRoom_1/Annotations
158 | Area_5/conferenceRoom_2/Annotations
159 | Area_5/conferenceRoom_3/Annotations
160 | Area_5/hallway_10/Annotations
161 | Area_5/hallway_11/Annotations
162 | Area_5/hallway_12/Annotations
163 | Area_5/hallway_13/Annotations
164 | Area_5/hallway_14/Annotations
165 | Area_5/hallway_15/Annotations
166 | Area_5/hallway_1/Annotations
167 | Area_5/hallway_2/Annotations
168 | Area_5/hallway_3/Annotations
169 | Area_5/hallway_4/Annotations
170 | Area_5/hallway_5/Annotations
171 | Area_5/hallway_6/Annotations
172 | Area_5/hallway_7/Annotations
173 | Area_5/hallway_8/Annotations
174 | Area_5/hallway_9/Annotations
175 | Area_5/lobby_1/Annotations
176 | Area_5/office_10/Annotations
177 | Area_5/office_11/Annotations
178 | Area_5/office_12/Annotations
179 | Area_5/office_13/Annotations
180 | Area_5/office_14/Annotations
181 | Area_5/office_15/Annotations
182 | Area_5/office_16/Annotations
183 | Area_5/office_17/Annotations
184 | Area_5/office_18/Annotations
185 | Area_5/office_19/Annotations
186 | Area_5/office_1/Annotations
187 | Area_5/office_20/Annotations
188 | Area_5/office_21/Annotations
189 | Area_5/office_22/Annotations
190 | Area_5/office_23/Annotations
191 | Area_5/office_24/Annotations
192 | Area_5/office_25/Annotations
193 | Area_5/office_26/Annotations
194 | Area_5/office_27/Annotations
195 | Area_5/office_28/Annotations
196 | Area_5/office_29/Annotations
197 | Area_5/office_2/Annotations
198 | Area_5/office_30/Annotations
199 | Area_5/office_31/Annotations
200 | Area_5/office_32/Annotations
201 | Area_5/office_33/Annotations
202 | Area_5/office_34/Annotations
203 | Area_5/office_35/Annotations
204 | Area_5/office_36/Annotations
205 | Area_5/office_37/Annotations
206 | Area_5/office_38/Annotations
207 | Area_5/office_39/Annotations
208 | Area_5/office_3/Annotations
209 | Area_5/office_40/Annotations
210 | Area_5/office_41/Annotations
211 | Area_5/office_42/Annotations
212 | Area_5/office_4/Annotations
213 | Area_5/office_5/Annotations
214 | Area_5/office_6/Annotations
215 | Area_5/office_7/Annotations
216 | Area_5/office_8/Annotations
217 | Area_5/office_9/Annotations
218 | Area_5/pantry_1/Annotations
219 | Area_5/storage_1/Annotations
220 | Area_5/storage_2/Annotations
221 | Area_5/storage_3/Annotations
222 | Area_5/storage_4/Annotations
223 | Area_5/WC_1/Annotations
224 | Area_5/WC_2/Annotations
225 | Area_6/conferenceRoom_1/Annotations
226 | Area_6/copyRoom_1/Annotations
227 | Area_6/hallway_1/Annotations
228 | Area_6/hallway_2/Annotations
229 | Area_6/hallway_3/Annotations
230 | Area_6/hallway_4/Annotations
231 | Area_6/hallway_5/Annotations
232 | Area_6/hallway_6/Annotations
233 | Area_6/lounge_1/Annotations
234 | Area_6/office_10/Annotations
235 | Area_6/office_11/Annotations
236 | Area_6/office_12/Annotations
237 | Area_6/office_13/Annotations
238 | Area_6/office_14/Annotations
239 | Area_6/office_15/Annotations
240 | Area_6/office_16/Annotations
241 | Area_6/office_17/Annotations
242 | Area_6/office_18/Annotations
243 | Area_6/office_19/Annotations
244 | Area_6/office_1/Annotations
245 | Area_6/office_20/Annotations
246 | Area_6/office_21/Annotations
247 | Area_6/office_22/Annotations
248 | Area_6/office_23/Annotations
249 | Area_6/office_24/Annotations
250 | Area_6/office_25/Annotations
251 | Area_6/office_26/Annotations
252 | Area_6/office_27/Annotations
253 | Area_6/office_28/Annotations
254 | Area_6/office_29/Annotations
255 | Area_6/office_2/Annotations
256 | Area_6/office_30/Annotations
257 | Area_6/office_31/Annotations
258 | Area_6/office_32/Annotations
259 | Area_6/office_33/Annotations
260 | Area_6/office_34/Annotations
261 | Area_6/office_35/Annotations
262 | Area_6/office_36/Annotations
263 | Area_6/office_37/Annotations
264 | Area_6/office_3/Annotations
265 | Area_6/office_4/Annotations
266 | Area_6/office_5/Annotations
267 | Area_6/office_6/Annotations
268 | Area_6/office_7/Annotations
269 | Area_6/office_8/Annotations
270 | Area_6/office_9/Annotations
271 | Area_6/openspace_1/Annotations
272 | Area_6/pantry_1/Annotations
273 |
--------------------------------------------------------------------------------
/sem_seg/meta/area6_data_label.txt:
--------------------------------------------------------------------------------
1 | data/stanford_indoor3d/Area_6_conferenceRoom_1.npy
2 | data/stanford_indoor3d/Area_6_copyRoom_1.npy
3 | data/stanford_indoor3d/Area_6_hallway_1.npy
4 | data/stanford_indoor3d/Area_6_hallway_2.npy
5 | data/stanford_indoor3d/Area_6_hallway_3.npy
6 | data/stanford_indoor3d/Area_6_hallway_4.npy
7 | data/stanford_indoor3d/Area_6_hallway_5.npy
8 | data/stanford_indoor3d/Area_6_hallway_6.npy
9 | data/stanford_indoor3d/Area_6_lounge_1.npy
10 | data/stanford_indoor3d/Area_6_office_10.npy
11 | data/stanford_indoor3d/Area_6_office_11.npy
12 | data/stanford_indoor3d/Area_6_office_12.npy
13 | data/stanford_indoor3d/Area_6_office_13.npy
14 | data/stanford_indoor3d/Area_6_office_14.npy
15 | data/stanford_indoor3d/Area_6_office_15.npy
16 | data/stanford_indoor3d/Area_6_office_16.npy
17 | data/stanford_indoor3d/Area_6_office_17.npy
18 | data/stanford_indoor3d/Area_6_office_18.npy
19 | data/stanford_indoor3d/Area_6_office_19.npy
20 | data/stanford_indoor3d/Area_6_office_1.npy
21 | data/stanford_indoor3d/Area_6_office_20.npy
22 | data/stanford_indoor3d/Area_6_office_21.npy
23 | data/stanford_indoor3d/Area_6_office_22.npy
24 | data/stanford_indoor3d/Area_6_office_23.npy
25 | data/stanford_indoor3d/Area_6_office_24.npy
26 | data/stanford_indoor3d/Area_6_office_25.npy
27 | data/stanford_indoor3d/Area_6_office_26.npy
28 | data/stanford_indoor3d/Area_6_office_27.npy
29 | data/stanford_indoor3d/Area_6_office_28.npy
30 | data/stanford_indoor3d/Area_6_office_29.npy
31 | data/stanford_indoor3d/Area_6_office_2.npy
32 | data/stanford_indoor3d/Area_6_office_30.npy
33 | data/stanford_indoor3d/Area_6_office_31.npy
34 | data/stanford_indoor3d/Area_6_office_32.npy
35 | data/stanford_indoor3d/Area_6_office_33.npy
36 | data/stanford_indoor3d/Area_6_office_34.npy
37 | data/stanford_indoor3d/Area_6_office_35.npy
38 | data/stanford_indoor3d/Area_6_office_36.npy
39 | data/stanford_indoor3d/Area_6_office_37.npy
40 | data/stanford_indoor3d/Area_6_office_3.npy
41 | data/stanford_indoor3d/Area_6_office_4.npy
42 | data/stanford_indoor3d/Area_6_office_5.npy
43 | data/stanford_indoor3d/Area_6_office_6.npy
44 | data/stanford_indoor3d/Area_6_office_7.npy
45 | data/stanford_indoor3d/Area_6_office_8.npy
46 | data/stanford_indoor3d/Area_6_office_9.npy
47 | data/stanford_indoor3d/Area_6_openspace_1.npy
48 | data/stanford_indoor3d/Area_6_pantry_1.npy
49 |
--------------------------------------------------------------------------------
/sem_seg/meta/class_names.txt:
--------------------------------------------------------------------------------
1 | ceiling
2 | floor
3 | wall
4 | beam
5 | column
6 | window
7 | door
8 | table
9 | chair
10 | sofa
11 | bookcase
12 | board
13 | clutter
14 |
--------------------------------------------------------------------------------
/sem_seg/model.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import math
3 | import time
4 | import numpy as np
5 | import os
6 | import sys
7 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
8 | ROOT_DIR = os.path.dirname(BASE_DIR)
9 | sys.path.append(os.path.join(ROOT_DIR, 'utils'))
10 | import tf_util
11 |
12 | def placeholder_inputs(batch_size, num_point):
13 | pointclouds_pl = tf.placeholder(tf.float32,
14 | shape=(batch_size, num_point, 9))
15 | labels_pl = tf.placeholder(tf.int32,
16 | shape=(batch_size, num_point))
17 | return pointclouds_pl, labels_pl
18 |
19 | def get_model(point_cloud, is_training, bn_decay=None):
20 | """ ConvNet baseline, input is BxNx3 gray image """
21 | batch_size = point_cloud.get_shape()[0].value
22 | num_point = point_cloud.get_shape()[1].value
23 |
24 | input_image = tf.expand_dims(point_cloud, -1)
25 | # CONV
26 | net = tf_util.conv2d(input_image, 64, [1,9], padding='VALID', stride=[1,1],
27 | bn=True, is_training=is_training, scope='conv1', bn_decay=bn_decay)
28 | net = tf_util.conv2d(net, 64, [1,1], padding='VALID', stride=[1,1],
29 | bn=True, is_training=is_training, scope='conv2', bn_decay=bn_decay)
30 | net = tf_util.conv2d(net, 64, [1,1], padding='VALID', stride=[1,1],
31 | bn=True, is_training=is_training, scope='conv3', bn_decay=bn_decay)
32 | net = tf_util.conv2d(net, 128, [1,1], padding='VALID', stride=[1,1],
33 | bn=True, is_training=is_training, scope='conv4', bn_decay=bn_decay)
34 | points_feat1 = tf_util.conv2d(net, 1024, [1,1], padding='VALID', stride=[1,1],
35 | bn=True, is_training=is_training, scope='conv5', bn_decay=bn_decay)
36 | # MAX
37 | pc_feat1 = tf_util.max_pool2d(points_feat1, [num_point,1], padding='VALID', scope='maxpool1')
38 | # FC
39 | pc_feat1 = tf.reshape(pc_feat1, [batch_size, -1])
40 | pc_feat1 = tf_util.fully_connected(pc_feat1, 256, bn=True, is_training=is_training, scope='fc1', bn_decay=bn_decay)
41 | pc_feat1 = tf_util.fully_connected(pc_feat1, 128, bn=True, is_training=is_training, scope='fc2', bn_decay=bn_decay)
42 | print(pc_feat1)
43 |
44 | # CONCAT
45 | pc_feat1_expand = tf.tile(tf.reshape(pc_feat1, [batch_size, 1, 1, -1]), [1, num_point, 1, 1])
46 | points_feat1_concat = tf.concat(axis=3, values=[points_feat1, pc_feat1_expand])
47 |
48 | # CONV
49 | net = tf_util.conv2d(points_feat1_concat, 512, [1,1], padding='VALID', stride=[1,1],
50 | bn=True, is_training=is_training, scope='conv6')
51 | net = tf_util.conv2d(net, 256, [1,1], padding='VALID', stride=[1,1],
52 | bn=True, is_training=is_training, scope='conv7')
53 | net = tf_util.dropout(net, keep_prob=0.7, is_training=is_training, scope='dp1')
54 | net = tf_util.conv2d(net, 13, [1,1], padding='VALID', stride=[1,1],
55 | activation_fn=None, scope='conv8')
56 | net = tf.squeeze(net, [2])
57 |
58 | return net
59 |
60 | def get_loss(pred, label):
61 | """ pred: B,N,13
62 | label: B,N """
63 | loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=pred, labels=label)
64 | return tf.reduce_mean(loss)
65 |
66 | if __name__ == "__main__":
67 | with tf.Graph().as_default():
68 | a = tf.placeholder(tf.float32, shape=(32,4096,9))
69 | net = get_model(a, tf.constant(True))
70 | with tf.Session() as sess:
71 | init = tf.global_variables_initializer()
72 | sess.run(init)
73 | start = time.time()
74 | for i in range(100):
75 | print(i)
76 | sess.run(net, feed_dict={a:np.random.rand(32,4096,9)})
77 | print(time.time() - start)
78 |
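The crux of get_model above is the fusion step: the pooled 128-d scene descriptor (pc_feat1 after fc2) is tiled back onto all num_point points and concatenated with the 1024-d per-point features (points_feat1) before the final per-point classifier. A minimal numpy sketch of just that tile-and-concat with illustrative shapes; it is not a drop-in replacement for the TF ops:

    import numpy as np

    B, N = 24, 4096
    points_feat1 = np.random.rand(B, N, 1, 1024)   # per-point features (conv5 output)
    pc_feat1 = np.random.rand(B, 128)              # pooled global feature (fc2 output)

    # Broadcast the global feature to every point, then stack it onto the
    # local feature along the channel axis (mirrors tf.tile + tf.concat).
    pc_feat1_expand = np.tile(pc_feat1.reshape(B, 1, 1, 128), (1, N, 1, 1))
    fused = np.concatenate([points_feat1, pc_feat1_expand], axis=3)
    print(fused.shape)   # (24, 4096, 1, 1152)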
--------------------------------------------------------------------------------
/sem_seg/train.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import math
3 | import h5py
4 | import numpy as np
5 | import tensorflow as tf
6 | import socket
7 |
8 | import os
9 | import sys
10 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
11 | ROOT_DIR = os.path.dirname(BASE_DIR)
12 | sys.path.append(BASE_DIR)
13 | sys.path.append(ROOT_DIR)
14 | sys.path.append(os.path.join(ROOT_DIR, 'utils'))
15 | import provider
16 | import tf_util
17 | from model import *
18 |
19 | parser = argparse.ArgumentParser()
20 | parser.add_argument('--gpu', type=int, default=0, help='GPU to use [default: GPU 0]')
21 | parser.add_argument('--log_dir', default='log', help='Log dir [default: log]')
22 | parser.add_argument('--num_point', type=int, default=4096, help='Point number [default: 4096]')
23 | parser.add_argument('--max_epoch', type=int, default=50, help='Epoch to run [default: 50]')
24 | parser.add_argument('--batch_size', type=int, default=24, help='Batch Size during training [default: 24]')
25 | parser.add_argument('--learning_rate', type=float, default=0.001, help='Initial learning rate [default: 0.001]')
26 | parser.add_argument('--momentum', type=float, default=0.9, help='Momentum for momentum optimizer [default: 0.9]')
27 | parser.add_argument('--optimizer', default='adam', help='adam or momentum [default: adam]')
28 | parser.add_argument('--decay_step', type=int, default=300000, help='Decay step for lr decay [default: 300000]')
29 | parser.add_argument('--decay_rate', type=float, default=0.5, help='Decay rate for lr decay [default: 0.5]')
30 | parser.add_argument('--test_area', type=int, default=6, help='Which area to use for test, option: 1-6 [default: 6]')
31 | FLAGS = parser.parse_args()
32 |
33 |
34 | BATCH_SIZE = FLAGS.batch_size
35 | NUM_POINT = FLAGS.num_point
36 | MAX_EPOCH = FLAGS.max_epoch
37 | 
38 | BASE_LEARNING_RATE = FLAGS.learning_rate
39 | GPU_INDEX = FLAGS.gpu
40 | MOMENTUM = FLAGS.momentum
41 | OPTIMIZER = FLAGS.optimizer
42 | DECAY_STEP = FLAGS.decay_step
43 | DECAY_RATE = FLAGS.decay_rate
44 |
45 | LOG_DIR = FLAGS.log_dir
46 | if not os.path.exists(LOG_DIR): os.mkdir(LOG_DIR)
47 | os.system('cp model.py %s' % (LOG_DIR)) # bkp of model def
48 | os.system('cp train.py %s' % (LOG_DIR)) # bkp of train procedure
49 | LOG_FOUT = open(os.path.join(LOG_DIR, 'log_train.txt'), 'w')
50 | LOG_FOUT.write(str(FLAGS)+'\n')
51 |
52 | MAX_NUM_POINT = 4096
53 | NUM_CLASSES = 13
54 |
55 | BN_INIT_DECAY = 0.5
56 | BN_DECAY_DECAY_RATE = 0.5
57 | #BN_DECAY_DECAY_STEP = float(DECAY_STEP * 2)
58 | BN_DECAY_DECAY_STEP = float(DECAY_STEP)
59 | BN_DECAY_CLIP = 0.99
60 |
61 | HOSTNAME = socket.gethostname()
62 |
63 | ALL_FILES = provider.getDataFiles('indoor3d_sem_seg_hdf5_data/all_files.txt')
64 | room_filelist = [line.rstrip() for line in open('indoor3d_sem_seg_hdf5_data/room_filelist.txt')]
65 |
66 | # Load ALL data
67 | data_batch_list = []
68 | label_batch_list = []
69 | for h5_filename in ALL_FILES:
70 | data_batch, label_batch = provider.loadDataFile(h5_filename)
71 | data_batch_list.append(data_batch)
72 | label_batch_list.append(label_batch)
73 | data_batches = np.concatenate(data_batch_list, 0)
74 | label_batches = np.concatenate(label_batch_list, 0)
75 | print(data_batches.shape)
76 | print(label_batches.shape)
77 |
78 | test_area = 'Area_'+str(FLAGS.test_area)
79 | train_idxs = []
80 | test_idxs = []
81 | for i,room_name in enumerate(room_filelist):
82 | if test_area in room_name:
83 | test_idxs.append(i)
84 | else:
85 | train_idxs.append(i)
86 |
87 | train_data = data_batches[train_idxs,...]
88 | train_label = label_batches[train_idxs]
89 | test_data = data_batches[test_idxs,...]
90 | test_label = label_batches[test_idxs]
91 | print(train_data.shape, train_label.shape)
92 | print(test_data.shape, test_label.shape)
93 |
94 |
95 |
96 |
97 | def log_string(out_str):
98 | LOG_FOUT.write(out_str+'\n')
99 | LOG_FOUT.flush()
100 | print(out_str)
101 |
102 |
103 | def get_learning_rate(batch):
104 | learning_rate = tf.train.exponential_decay(
105 | BASE_LEARNING_RATE, # Base learning rate.
106 | batch * BATCH_SIZE, # Current index into the dataset.
107 | DECAY_STEP, # Decay step.
108 | DECAY_RATE, # Decay rate.
109 | staircase=True)
110 | learning_rate = tf.maximum(learning_rate, 0.00001) # CLIP THE LEARNING RATE!!
111 | return learning_rate
112 |
113 | def get_bn_decay(batch):
114 | bn_momentum = tf.train.exponential_decay(
115 | BN_INIT_DECAY,
116 | batch*BATCH_SIZE,
117 | BN_DECAY_DECAY_STEP,
118 | BN_DECAY_DECAY_RATE,
119 | staircase=True)
120 | bn_decay = tf.minimum(BN_DECAY_CLIP, 1 - bn_momentum)
121 | return bn_decay
122 |
123 | def train():
124 | with tf.Graph().as_default():
125 | with tf.device('/gpu:'+str(GPU_INDEX)):
126 | pointclouds_pl, labels_pl = placeholder_inputs(BATCH_SIZE, NUM_POINT)
127 | is_training_pl = tf.placeholder(tf.bool, shape=())
128 |
129 | # Note the global_step=batch parameter to minimize.
130 | # That tells the optimizer to helpfully increment the 'batch' parameter for you every time it trains.
131 | batch = tf.Variable(0)
132 | bn_decay = get_bn_decay(batch)
133 | tf.summary.scalar('bn_decay', bn_decay)
134 |
135 | # Get model and loss
136 | pred = get_model(pointclouds_pl, is_training_pl, bn_decay=bn_decay)
137 | loss = get_loss(pred, labels_pl)
138 | tf.summary.scalar('loss', loss)
139 |
140 | correct = tf.equal(tf.argmax(pred, 2), tf.to_int64(labels_pl))
141 | accuracy = tf.reduce_sum(tf.cast(correct, tf.float32)) / float(BATCH_SIZE*NUM_POINT)
142 | tf.summary.scalar('accuracy', accuracy)
143 |
144 | # Get training operator
145 | learning_rate = get_learning_rate(batch)
146 | tf.summary.scalar('learning_rate', learning_rate)
147 | if OPTIMIZER == 'momentum':
148 | optimizer = tf.train.MomentumOptimizer(learning_rate, momentum=MOMENTUM)
149 | elif OPTIMIZER == 'adam':
150 | optimizer = tf.train.AdamOptimizer(learning_rate)
151 | train_op = optimizer.minimize(loss, global_step=batch)
152 |
153 | # Add ops to save and restore all the variables.
154 | saver = tf.train.Saver()
155 |
156 | # Create a session
157 | config = tf.ConfigProto()
158 | config.gpu_options.allow_growth = True
159 | config.allow_soft_placement = True
160 | config.log_device_placement = True
161 | sess = tf.Session(config=config)
162 |
163 | # Add summary writers
164 | merged = tf.summary.merge_all()
165 | train_writer = tf.summary.FileWriter(os.path.join(LOG_DIR, 'train'),
166 | sess.graph)
167 | test_writer = tf.summary.FileWriter(os.path.join(LOG_DIR, 'test'))
168 |
169 | # Init variables
170 | init = tf.global_variables_initializer()
171 | sess.run(init, {is_training_pl:True})
172 |
173 | ops = {'pointclouds_pl': pointclouds_pl,
174 | 'labels_pl': labels_pl,
175 | 'is_training_pl': is_training_pl,
176 | 'pred': pred,
177 | 'loss': loss,
178 | 'train_op': train_op,
179 | 'merged': merged,
180 | 'step': batch}
181 |
182 | for epoch in range(MAX_EPOCH):
183 | log_string('**** EPOCH %03d ****' % (epoch))
184 | sys.stdout.flush()
185 |
186 | train_one_epoch(sess, ops, train_writer)
187 | eval_one_epoch(sess, ops, test_writer)
188 |
189 | # Save the variables to disk.
190 | if epoch % 10 == 0:
191 | save_path = saver.save(sess, os.path.join(LOG_DIR, "model.ckpt"))
192 | log_string("Model saved in file: %s" % save_path)
193 |
194 |
195 |
196 | def train_one_epoch(sess, ops, train_writer):
197 | """ ops: dict mapping from string to tf ops """
198 | is_training = True
199 |
200 | log_string('----')
201 | current_data, current_label, _ = provider.shuffle_data(train_data[:,0:NUM_POINT,:], train_label)
202 |
203 | file_size = current_data.shape[0]
204 | num_batches = file_size // BATCH_SIZE
205 |
206 | total_correct = 0
207 | total_seen = 0
208 | loss_sum = 0
209 |
210 | for batch_idx in range(num_batches):
211 | if batch_idx % 100 == 0:
212 | print('Current batch/total batch num: %d/%d'%(batch_idx,num_batches))
213 | start_idx = batch_idx * BATCH_SIZE
214 | end_idx = (batch_idx+1) * BATCH_SIZE
215 |
216 | feed_dict = {ops['pointclouds_pl']: current_data[start_idx:end_idx, :, :],
217 | ops['labels_pl']: current_label[start_idx:end_idx],
218 | ops['is_training_pl']: is_training,}
219 | summary, step, _, loss_val, pred_val = sess.run([ops['merged'], ops['step'], ops['train_op'], ops['loss'], ops['pred']],
220 | feed_dict=feed_dict)
221 | train_writer.add_summary(summary, step)
222 | pred_val = np.argmax(pred_val, 2)
223 | correct = np.sum(pred_val == current_label[start_idx:end_idx])
224 | total_correct += correct
225 | total_seen += (BATCH_SIZE*NUM_POINT)
226 | loss_sum += loss_val
227 |
228 | log_string('mean loss: %f' % (loss_sum / float(num_batches)))
229 | log_string('accuracy: %f' % (total_correct / float(total_seen)))
230 |
231 |
232 | def eval_one_epoch(sess, ops, test_writer):
233 | """ ops: dict mapping from string to tf ops """
234 | is_training = False
235 | total_correct = 0
236 | total_seen = 0
237 | loss_sum = 0
238 | total_seen_class = [0 for _ in range(NUM_CLASSES)]
239 | total_correct_class = [0 for _ in range(NUM_CLASSES)]
240 |
241 | log_string('----')
242 | current_data = test_data[:,0:NUM_POINT,:]
243 | current_label = np.squeeze(test_label)
244 |
245 | file_size = current_data.shape[0]
246 | num_batches = file_size // BATCH_SIZE
247 |
248 | for batch_idx in range(num_batches):
249 | start_idx = batch_idx * BATCH_SIZE
250 | end_idx = (batch_idx+1) * BATCH_SIZE
251 |
252 | feed_dict = {ops['pointclouds_pl']: current_data[start_idx:end_idx, :, :],
253 | ops['labels_pl']: current_label[start_idx:end_idx],
254 | ops['is_training_pl']: is_training}
255 | summary, step, loss_val, pred_val = sess.run([ops['merged'], ops['step'], ops['loss'], ops['pred']],
256 | feed_dict=feed_dict)
257 | test_writer.add_summary(summary, step)
258 | pred_val = np.argmax(pred_val, 2)
259 | correct = np.sum(pred_val == current_label[start_idx:end_idx])
260 | total_correct += correct
261 | total_seen += (BATCH_SIZE*NUM_POINT)
262 | loss_sum += (loss_val*BATCH_SIZE)
263 | for i in range(start_idx, end_idx):
264 | for j in range(NUM_POINT):
265 | l = current_label[i, j]
266 | total_seen_class[l] += 1
267 | total_correct_class[l] += (pred_val[i-start_idx, j] == l)
268 |
269 | log_string('eval mean loss: %f' % (loss_sum / float(total_seen/NUM_POINT)))
270 | log_string('eval accuracy: %f'% (total_correct / float(total_seen)))
271 |     log_string('eval avg class acc: %f' % (np.mean(np.array(total_correct_class)/np.array(total_seen_class,dtype=float))))
272 |
273 |
274 |
275 | if __name__ == "__main__":
276 | train()
277 | LOG_FOUT.close()
278 |
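get_learning_rate and get_bn_decay above are both staircase exponential schedules driven by the number of samples seen (batch * BATCH_SIZE). A small numpy sketch of the learning-rate arithmetic using this script's defaults:

    import numpy as np

    BASE_LR, DECAY_STEP, DECAY_RATE, BATCH_SIZE = 0.001, 300000, 0.5, 24

    def staircase_lr(step):
        # Mirrors tf.train.exponential_decay(..., staircase=True) plus the clip.
        samples_seen = step * BATCH_SIZE
        lr = BASE_LR * DECAY_RATE ** np.floor(samples_seen / float(DECAY_STEP))
        return max(lr, 1e-5)

    # The rate halves every 300000 samples, i.e. every 12500 steps at batch size 24.
    for step in [0, 12499, 12500, 25000]:
        print(step, staircase_lr(step))   # 0.001, 0.001, 0.0005, 0.00025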
--------------------------------------------------------------------------------
/train.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import math
3 | import h5py
4 | import numpy as np
5 | import tensorflow as tf
6 | import socket
7 | import importlib
8 | import os
9 | import sys
10 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
11 | sys.path.append(BASE_DIR)
12 | sys.path.append(os.path.join(BASE_DIR, 'models'))
13 | sys.path.append(os.path.join(BASE_DIR, 'utils'))
14 | import provider
15 | import tf_util
16 |
17 | parser = argparse.ArgumentParser()
18 | parser.add_argument('--gpu', type=int, default=0, help='GPU to use [default: GPU 0]')
19 | parser.add_argument('--model', default='pointnet_cls', help='Model name: pointnet_cls or pointnet_cls_basic [default: pointnet_cls]')
20 | parser.add_argument('--log_dir', default='log', help='Log dir [default: log]')
21 | parser.add_argument('--num_point', type=int, default=1024, help='Point Number [256/512/1024/2048] [default: 1024]')
22 | parser.add_argument('--max_epoch', type=int, default=250, help='Epoch to run [default: 250]')
23 | parser.add_argument('--batch_size', type=int, default=32, help='Batch Size during training [default: 32]')
24 | parser.add_argument('--learning_rate', type=float, default=0.001, help='Initial learning rate [default: 0.001]')
25 | parser.add_argument('--momentum', type=float, default=0.9, help='Momentum for momentum optimizer [default: 0.9]')
26 | parser.add_argument('--optimizer', default='adam', help='adam or momentum [default: adam]')
27 | parser.add_argument('--decay_step', type=int, default=200000, help='Decay step for lr decay [default: 200000]')
28 | parser.add_argument('--decay_rate', type=float, default=0.7, help='Decay rate for lr decay [default: 0.7]')
29 | FLAGS = parser.parse_args()
30 |
31 |
32 | BATCH_SIZE = FLAGS.batch_size
33 | NUM_POINT = FLAGS.num_point
34 | MAX_EPOCH = FLAGS.max_epoch
35 | BASE_LEARNING_RATE = FLAGS.learning_rate
36 | GPU_INDEX = FLAGS.gpu
37 | MOMENTUM = FLAGS.momentum
38 | OPTIMIZER = FLAGS.optimizer
39 | DECAY_STEP = FLAGS.decay_step
40 | DECAY_RATE = FLAGS.decay_rate
41 |
42 | MODEL = importlib.import_module(FLAGS.model) # import network module
43 | MODEL_FILE = os.path.join(BASE_DIR, 'models', FLAGS.model+'.py')
44 | LOG_DIR = FLAGS.log_dir
45 | if not os.path.exists(LOG_DIR): os.mkdir(LOG_DIR)
46 | os.system('cp %s %s' % (MODEL_FILE, LOG_DIR)) # bkp of model def
47 | os.system('cp train.py %s' % (LOG_DIR)) # bkp of train procedure
48 | LOG_FOUT = open(os.path.join(LOG_DIR, 'log_train.txt'), 'w')
49 | LOG_FOUT.write(str(FLAGS)+'\n')
50 |
51 | MAX_NUM_POINT = 2048
52 | NUM_CLASSES = 40
53 |
54 | BN_INIT_DECAY = 0.5
55 | BN_DECAY_DECAY_RATE = 0.5
56 | BN_DECAY_DECAY_STEP = float(DECAY_STEP)
57 | BN_DECAY_CLIP = 0.99
58 |
59 | HOSTNAME = socket.gethostname()
60 |
61 | # ModelNet40 official train/test split
62 | TRAIN_FILES = provider.getDataFiles( \
63 | os.path.join(BASE_DIR, 'data/modelnet40_ply_hdf5_2048/train_files.txt'))
64 | TEST_FILES = provider.getDataFiles(\
65 | os.path.join(BASE_DIR, 'data/modelnet40_ply_hdf5_2048/test_files.txt'))
66 |
67 | def log_string(out_str):
68 | LOG_FOUT.write(out_str+'\n')
69 | LOG_FOUT.flush()
70 | print(out_str)
71 |
72 |
73 | def get_learning_rate(batch):
74 | learning_rate = tf.train.exponential_decay(
75 | BASE_LEARNING_RATE, # Base learning rate.
76 | batch * BATCH_SIZE, # Current index into the dataset.
77 | DECAY_STEP, # Decay step.
78 | DECAY_RATE, # Decay rate.
79 | staircase=True)
80 | learning_rate = tf.maximum(learning_rate, 0.00001) # CLIP THE LEARNING RATE!
81 | return learning_rate
82 |
83 | def get_bn_decay(batch):
84 | bn_momentum = tf.train.exponential_decay(
85 | BN_INIT_DECAY,
86 | batch*BATCH_SIZE,
87 | BN_DECAY_DECAY_STEP,
88 | BN_DECAY_DECAY_RATE,
89 | staircase=True)
90 | bn_decay = tf.minimum(BN_DECAY_CLIP, 1 - bn_momentum)
91 | return bn_decay
92 |
93 | def train():
94 | with tf.Graph().as_default():
95 | with tf.device('/gpu:'+str(GPU_INDEX)):
96 | pointclouds_pl, labels_pl = MODEL.placeholder_inputs(BATCH_SIZE, NUM_POINT)
97 | is_training_pl = tf.placeholder(tf.bool, shape=())
98 | print(is_training_pl)
99 |
100 | # Note the global_step=batch parameter to minimize.
101 | # That tells the optimizer to helpfully increment the 'batch' parameter for you every time it trains.
102 | batch = tf.Variable(0)
103 | bn_decay = get_bn_decay(batch)
104 | tf.summary.scalar('bn_decay', bn_decay)
105 |
106 | # Get model and loss
107 | pred, end_points = MODEL.get_model(pointclouds_pl, is_training_pl, bn_decay=bn_decay)
108 | loss = MODEL.get_loss(pred, labels_pl, end_points)
109 | tf.summary.scalar('loss', loss)
110 |
111 | correct = tf.equal(tf.argmax(pred, 1), tf.to_int64(labels_pl))
112 | accuracy = tf.reduce_sum(tf.cast(correct, tf.float32)) / float(BATCH_SIZE)
113 | tf.summary.scalar('accuracy', accuracy)
114 |
115 | # Get training operator
116 | learning_rate = get_learning_rate(batch)
117 | tf.summary.scalar('learning_rate', learning_rate)
118 | if OPTIMIZER == 'momentum':
119 | optimizer = tf.train.MomentumOptimizer(learning_rate, momentum=MOMENTUM)
120 | elif OPTIMIZER == 'adam':
121 | optimizer = tf.train.AdamOptimizer(learning_rate)
122 | train_op = optimizer.minimize(loss, global_step=batch)
123 |
124 | # Add ops to save and restore all the variables.
125 | saver = tf.train.Saver()
126 |
127 | # Create a session
128 | config = tf.ConfigProto()
129 | config.gpu_options.allow_growth = True
130 | config.allow_soft_placement = True
131 | config.log_device_placement = False
132 | sess = tf.Session(config=config)
133 |
134 | # Add summary writers
135 | #merged = tf.merge_all_summaries()
136 | merged = tf.summary.merge_all()
137 | train_writer = tf.summary.FileWriter(os.path.join(LOG_DIR, 'train'),
138 | sess.graph)
139 | test_writer = tf.summary.FileWriter(os.path.join(LOG_DIR, 'test'))
140 |
141 | # Init variables
142 | init = tf.global_variables_initializer()
143 | # To fix the bug introduced in TF 0.12.1 as in
144 | # http://stackoverflow.com/questions/41543774/invalidargumenterror-for-tensor-bool-tensorflow-0-12-1
145 | #sess.run(init)
146 | sess.run(init, {is_training_pl: True})
147 |
148 | ops = {'pointclouds_pl': pointclouds_pl,
149 | 'labels_pl': labels_pl,
150 | 'is_training_pl': is_training_pl,
151 | 'pred': pred,
152 | 'loss': loss,
153 | 'train_op': train_op,
154 | 'merged': merged,
155 | 'step': batch}
156 |
157 | for epoch in range(MAX_EPOCH):
158 | log_string('**** EPOCH %03d ****' % (epoch))
159 | sys.stdout.flush()
160 |
161 | train_one_epoch(sess, ops, train_writer)
162 | eval_one_epoch(sess, ops, test_writer)
163 |
164 | # Save the variables to disk.
165 | if epoch % 10 == 0:
166 | save_path = saver.save(sess, os.path.join(LOG_DIR, "model.ckpt"))
167 | log_string("Model saved in file: %s" % save_path)
168 |
169 |
170 |
171 | def train_one_epoch(sess, ops, train_writer):
172 | """ ops: dict mapping from string to tf ops """
173 | is_training = True
174 |
175 | # Shuffle train files
176 | train_file_idxs = np.arange(0, len(TRAIN_FILES))
177 | np.random.shuffle(train_file_idxs)
178 |
179 | for fn in range(len(TRAIN_FILES)):
180 | log_string('----' + str(fn) + '-----')
181 | current_data, current_label = provider.loadDataFile(TRAIN_FILES[train_file_idxs[fn]])
182 | current_data = current_data[:,0:NUM_POINT,:]
183 | current_data, current_label, _ = provider.shuffle_data(current_data, np.squeeze(current_label))
184 | current_label = np.squeeze(current_label)
185 |
186 | file_size = current_data.shape[0]
187 | num_batches = file_size // BATCH_SIZE
188 |
189 | total_correct = 0
190 | total_seen = 0
191 | loss_sum = 0
192 |
193 | for batch_idx in range(num_batches):
194 | start_idx = batch_idx * BATCH_SIZE
195 | end_idx = (batch_idx+1) * BATCH_SIZE
196 |
197 | # Augment batched point clouds by rotation and jittering
198 | rotated_data = provider.rotate_point_cloud(current_data[start_idx:end_idx, :, :])
199 | jittered_data = provider.jitter_point_cloud(rotated_data)
200 | feed_dict = {ops['pointclouds_pl']: jittered_data,
201 | ops['labels_pl']: current_label[start_idx:end_idx],
202 | ops['is_training_pl']: is_training,}
203 | summary, step, _, loss_val, pred_val = sess.run([ops['merged'], ops['step'],
204 | ops['train_op'], ops['loss'], ops['pred']], feed_dict=feed_dict)
205 | train_writer.add_summary(summary, step)
206 | pred_val = np.argmax(pred_val, 1)
207 | correct = np.sum(pred_val == current_label[start_idx:end_idx])
208 | total_correct += correct
209 | total_seen += BATCH_SIZE
210 | loss_sum += loss_val
211 |
212 | log_string('mean loss: %f' % (loss_sum / float(num_batches)))
213 | log_string('accuracy: %f' % (total_correct / float(total_seen)))
214 |
215 |
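train_one_epoch above augments each batch with provider.rotate_point_cloud and provider.jitter_point_cloud, both defined in provider.py outside this excerpt. A hedged numpy sketch of what an up-axis rotation augmentation of that kind looks like; the provider implementation may differ in detail:

    import numpy as np

    def rotate_about_up_axis(batch_xyz):
        # Rotate every cloud in the batch by one random angle about the y (up) axis.
        angle = np.random.uniform() * 2 * np.pi
        c, s = np.cos(angle), np.sin(angle)
        R = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
        return batch_xyz.dot(R)

    batch = np.random.rand(32, 1024, 3)
    print(rotate_about_up_axis(batch).shape)   # (32, 1024, 3)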
216 | def eval_one_epoch(sess, ops, test_writer):
217 | """ ops: dict mapping from string to tf ops """
218 | is_training = False
219 | total_correct = 0
220 | total_seen = 0
221 | loss_sum = 0
222 | total_seen_class = [0 for _ in range(NUM_CLASSES)]
223 | total_correct_class = [0 for _ in range(NUM_CLASSES)]
224 |
225 | for fn in range(len(TEST_FILES)):
226 | log_string('----' + str(fn) + '-----')
227 | current_data, current_label = provider.loadDataFile(TEST_FILES[fn])
228 | current_data = current_data[:,0:NUM_POINT,:]
229 | current_label = np.squeeze(current_label)
230 |
231 | file_size = current_data.shape[0]
232 | num_batches = file_size // BATCH_SIZE
233 |
234 | for batch_idx in range(num_batches):
235 | start_idx = batch_idx * BATCH_SIZE
236 | end_idx = (batch_idx+1) * BATCH_SIZE
237 |
238 | feed_dict = {ops['pointclouds_pl']: current_data[start_idx:end_idx, :, :],
239 | ops['labels_pl']: current_label[start_idx:end_idx],
240 | ops['is_training_pl']: is_training}
241 | summary, step, loss_val, pred_val = sess.run([ops['merged'], ops['step'],
242 | ops['loss'], ops['pred']], feed_dict=feed_dict)
243 | pred_val = np.argmax(pred_val, 1)
244 | correct = np.sum(pred_val == current_label[start_idx:end_idx])
245 | total_correct += correct
246 | total_seen += BATCH_SIZE
247 | loss_sum += (loss_val*BATCH_SIZE)
248 | for i in range(start_idx, end_idx):
249 | l = current_label[i]
250 | total_seen_class[l] += 1
251 | total_correct_class[l] += (pred_val[i-start_idx] == l)
252 |
253 | log_string('eval mean loss: %f' % (loss_sum / float(total_seen)))
254 | log_string('eval accuracy: %f'% (total_correct / float(total_seen)))
255 |     log_string('eval avg class acc: %f' % (np.mean(np.array(total_correct_class)/np.array(total_seen_class,dtype=float))))
256 |
257 |
258 |
259 | if __name__ == "__main__":
260 | train()
261 | LOG_FOUT.close()
262 |
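Note that eval_one_epoch above reports two different numbers: overall accuracy (every sample weighted equally) and average class accuracy (every class weighted equally, so rare classes count as much as common ones). A small self-contained sketch of that bookkeeping with toy labels; the guard against unseen classes is an addition here:

    import numpy as np

    NUM_CLASSES = 40
    total_seen_class = np.zeros(NUM_CLASSES)
    total_correct_class = np.zeros(NUM_CLASSES)

    labels = np.array([0, 0, 0, 1])   # class 0 is three times more common
    preds  = np.array([0, 0, 1, 1])
    for l, p in zip(labels, preds):
        total_seen_class[l] += 1
        total_correct_class[l] += (p == l)

    overall_acc = np.mean(preds == labels)   # 3/4 = 0.75
    seen = total_seen_class > 0              # avoid 0/0 for classes absent from the toy set
    avg_class_acc = np.mean(total_correct_class[seen] / total_seen_class[seen])
    print(overall_acc, avg_class_acc)        # 0.75 vs (2/3 + 1)/2 ~= 0.83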
--------------------------------------------------------------------------------
/utils/data_prep_util.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
4 | sys.path.append(BASE_DIR)
5 | from plyfile import (PlyData, PlyElement, make2d, PlyParseError, PlyProperty)
6 | import numpy as np
7 | import h5py
8 |
9 | SAMPLING_BIN = os.path.join(BASE_DIR, 'third_party/mesh_sampling/build/pcsample')
10 |
11 | SAMPLING_POINT_NUM = 2048
12 | SAMPLING_LEAF_SIZE = 0.005
13 |
14 | MODELNET40_PATH = '../datasets/modelnet40'
15 | def export_ply(pc, filename):
16 | vertex = np.zeros(pc.shape[0], dtype=[('x', 'f4'), ('y', 'f4'), ('z', 'f4')])
17 | for i in range(pc.shape[0]):
18 | vertex[i] = (pc[i][0], pc[i][1], pc[i][2])
19 | ply_out = PlyData([PlyElement.describe(vertex, 'vertex', comments=['vertices'])])
20 | ply_out.write(filename)
21 |
22 | # Sample points on the obj shape
23 | def get_sampling_command(obj_filename, ply_filename):
24 | cmd = SAMPLING_BIN + ' ' + obj_filename
25 | cmd += ' ' + ply_filename
26 | cmd += ' -n_samples %d ' % SAMPLING_POINT_NUM
27 | cmd += ' -leaf_size %f ' % SAMPLING_LEAF_SIZE
28 | return cmd
29 |
30 | # --------------------------------------------------------------
31 | # Following are the helper functions to load MODELNET40 shapes
32 | # --------------------------------------------------------------
33 |
34 | # Read in the list of categories in MODELNET40
35 | def get_category_names():
36 | shape_names_file = os.path.join(MODELNET40_PATH, 'shape_names.txt')
37 | shape_names = [line.rstrip() for line in open(shape_names_file)]
38 | return shape_names
39 |
40 | # Return all the filepaths for the shapes in MODELNET40
41 | def get_obj_filenames():
42 | obj_filelist_file = os.path.join(MODELNET40_PATH, 'filelist.txt')
43 | obj_filenames = [os.path.join(MODELNET40_PATH, line.rstrip()) for line in open(obj_filelist_file)]
44 | print('Got %d obj files in modelnet40.' % len(obj_filenames))
45 | return obj_filenames
46 |
47 | # Helper function to create the parent folder and all subfolders if they do not exist
48 | def batch_mkdir(output_folder, subdir_list):
49 | if not os.path.exists(output_folder):
50 | os.mkdir(output_folder)
51 | for subdir in subdir_list:
52 | if not os.path.exists(os.path.join(output_folder, subdir)):
53 | os.mkdir(os.path.join(output_folder, subdir))
54 |
55 | # ----------------------------------------------------------------
56 | # Following are the helper functions to save/load HDF5 files
57 | # ----------------------------------------------------------------
58 |
59 | # Write numpy array data and label to h5_filename
60 | def save_h5_data_label_normal(h5_filename, data, label, normal,
61 |         data_dtype='float32', label_dtype='uint8', normal_dtype='float32'):
62 |     h5_fout = h5py.File(h5_filename, 'w')
63 | h5_fout.create_dataset(
64 | 'data', data=data,
65 | compression='gzip', compression_opts=4,
66 | dtype=data_dtype)
67 | h5_fout.create_dataset(
68 | 'normal', data=normal,
69 | compression='gzip', compression_opts=4,
70 | dtype=normal_dtype)
71 | h5_fout.create_dataset(
72 | 'label', data=label,
73 | compression='gzip', compression_opts=1,
74 | dtype=label_dtype)
75 | h5_fout.close()
76 |
77 |
78 | # Write numpy array data and label to h5_filename
79 | def save_h5(h5_filename, data, label, data_dtype='uint8', label_dtype='uint8'):
80 |     h5_fout = h5py.File(h5_filename, 'w')
81 | h5_fout.create_dataset(
82 | 'data', data=data,
83 | compression='gzip', compression_opts=4,
84 | dtype=data_dtype)
85 | h5_fout.create_dataset(
86 | 'label', data=label,
87 | compression='gzip', compression_opts=1,
88 | dtype=label_dtype)
89 | h5_fout.close()
90 |
91 | # Read numpy array data and label from h5_filename
92 | def load_h5_data_label_normal(h5_filename):
93 |     f = h5py.File(h5_filename, 'r')
94 | data = f['data'][:]
95 | label = f['label'][:]
96 | normal = f['normal'][:]
97 | return (data, label, normal)
98 |
99 | # Read numpy array data and label from h5_filename
100 | def load_h5_data_label_seg(h5_filename):
101 |     f = h5py.File(h5_filename, 'r')
102 | data = f['data'][:]
103 | label = f['label'][:]
104 | seg = f['pid'][:]
105 | return (data, label, seg)
106 |
107 | # Read numpy array data and label from h5_filename
108 | def load_h5(h5_filename):
109 |     f = h5py.File(h5_filename, 'r')
110 | data = f['data'][:]
111 | label = f['label'][:]
112 | return (data, label)
113 |
114 | # ----------------------------------------------------------------
115 | # Following are the helper functions to load PLY files
116 | # ----------------------------------------------------------------
117 |
118 | # Load PLY file
119 | def load_ply_data(filename, point_num):
120 | plydata = PlyData.read(filename)
121 | pc = plydata['vertex'].data[:point_num]
122 | pc_array = np.array([[x, y, z] for x,y,z in pc])
123 | return pc_array
124 |
125 | # Load PLY file
126 | def load_ply_normal(filename, point_num):
127 | plydata = PlyData.read(filename)
128 | pc = plydata['normal'].data[:point_num]
129 | pc_array = np.array([[x, y, z] for x,y,z in pc])
130 | return pc_array
131 |
132 | # Pad an Nxk array with extra rows until it has 'row' rows
133 | # pad mode is 'edge' or 'constant'
134 | def pad_arr_rows(arr, row, pad='edge'):
135 | assert(len(arr.shape) == 2)
136 | assert(arr.shape[0] <= row)
137 | assert(pad == 'edge' or pad == 'constant')
138 | if arr.shape[0] == row:
139 | return arr
140 | if pad == 'edge':
141 | return np.lib.pad(arr, ((0, row-arr.shape[0]), (0, 0)), 'edge')
142 | if pad == 'constant':
143 |         return np.lib.pad(arr, ((0, row-arr.shape[0]), (0, 0)), 'constant', constant_values=(0, 0))
144 |
145 |
146 |
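As a quick sanity check on the HDF5 helpers above, a round trip through save_h5/load_h5. The temp path is illustrative, and data_dtype must be set to float32 here since the uint8 default would truncate the coordinates:

    import numpy as np
    from data_prep_util import save_h5, load_h5   # assumes this file is on sys.path

    data = np.random.rand(8, 2048, 3).astype('float32')   # 8 clouds of 2048 xyz points
    label = np.arange(8, dtype='uint8')

    save_h5('/tmp/sample.h5', data, label, data_dtype='float32')
    data2, label2 = load_h5('/tmp/sample.h5')
    assert data2.shape == data.shape and np.all(label2 == label)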
--------------------------------------------------------------------------------
/utils/eulerangles.py:
--------------------------------------------------------------------------------
1 | # emacs: -*- mode: python-mode; py-indent-offset: 4; indent-tabs-mode: nil -*-
2 | # vi: set ft=python sts=4 ts=4 sw=4 et:
3 | ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ##
4 | #
5 | # See COPYING file distributed along with the NiBabel package for the
6 | # copyright and license terms.
7 | #
8 | ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ##
9 | ''' Module implementing Euler angle rotations and their conversions
10 |
11 | See:
12 |
13 | * http://en.wikipedia.org/wiki/Rotation_matrix
14 | * http://en.wikipedia.org/wiki/Euler_angles
15 | * http://mathworld.wolfram.com/EulerAngles.html
16 |
17 | See also: *Representing Attitude with Euler Angles and Quaternions: A
18 | Reference* (2006) by James Diebel. A cached PDF link last found here:
19 |
20 | http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.110.5134
21 |
22 | Euler's rotation theorem tells us that any rotation in 3D can be
23 | described by 3 angles. Let's call the 3 angles the *Euler angle vector*
24 | and call the angles in the vector :math:`alpha`, :math:`beta` and
25 | :math:`gamma`. The vector is [ :math:`alpha`,
26 | :math:`beta`, :math:`gamma` ] and, in this description, the order of the
27 | parameters specifies the order in which the rotations occur (so the
28 | rotation corresponding to :math:`alpha` is applied first).
29 |
30 | In order to specify the meaning of an *Euler angle vector* we need to
31 | specify the axes around which each of the rotations corresponding to
32 | :math:`alpha`, :math:`beta` and :math:`gamma` will occur.
33 |
34 | There are therefore three axes for the rotations :math:`alpha`,
35 | :math:`beta` and :math:`gamma`; let's call them :math:`i`, :math:`j`,
36 | :math:`k`.
37 |
38 | Let us express the rotation :math:`alpha` around axis `i` as a 3 by 3
39 | rotation matrix `A`. Similarly :math:`beta` around `j` becomes 3 x 3
40 | matrix `B` and :math:`gamma` around `k` becomes matrix `G`. Then the
41 | whole rotation expressed by the Euler angle vector [ :math:`alpha`,
42 | :math:`beta`, :math:`gamma` ], `R` is given by::
43 |
44 | R = np.dot(G, np.dot(B, A))
45 |
46 | See http://mathworld.wolfram.com/EulerAngles.html
47 |
48 | The order :math:`G B A` expresses the fact that the rotations are
49 | performed in the order of the vector (:math:`alpha` around axis `i` =
50 | `A` first).
51 |
52 | To convert a given Euler angle vector to a meaningful rotation, and a
53 | rotation matrix, we need to define:
54 |
55 | * the axes `i`, `j`, `k`
56 | * whether a rotation matrix should be applied on the left of a vector to
57 | be transformed (vectors are column vectors) or on the right (vectors
58 | are row vectors).
59 | * whether the rotations move the axes as they are applied (intrinsic
60 |   rotations) - compared to the situation where the axes stay fixed and the
61 | vectors move within the axis frame (extrinsic)
62 | * the handedness of the coordinate system
63 |
64 | See: http://en.wikipedia.org/wiki/Rotation_matrix#Ambiguities
65 |
66 | We are using the following conventions:
67 |
68 | * axes `i`, `j`, `k` are the `z`, `y`, and `x` axes respectively. Thus
69 |   an Euler angle vector [ :math:`alpha`, :math:`beta`, :math:`gamma` ]
70 | in our convention implies a :math:`alpha` radian rotation around the
71 | `z` axis, followed by a :math:`beta` rotation around the `y` axis,
72 | followed by a :math:`gamma` rotation around the `x` axis.
73 | * the rotation matrix applies on the left, to column vectors on the
74 | right, so if `R` is the rotation matrix, and `v` is a 3 x N matrix
75 | with N column vectors, the transformed vector set `vdash` is given by
76 | ``vdash = np.dot(R, v)``.
77 | * extrinsic rotations - the axes are fixed, and do not move with the
78 | rotations.
79 | * a right-handed coordinate system
80 |
81 | The convention of rotation around ``z``, followed by rotation around
82 | ``y``, followed by rotation around ``x``, is known (confusingly) as
83 | "xyz", pitch-roll-yaw, Cardan angles, or Tait-Bryan angles.
84 | '''
85 |
86 | import math
87 |
88 | import sys
89 | if sys.version_info >= (3,0):
90 | from functools import reduce
91 |
92 | import numpy as np
93 |
94 |
95 | _FLOAT_EPS_4 = np.finfo(float).eps * 4.0
96 |
97 |
98 | def euler2mat(z=0, y=0, x=0):
99 | ''' Return matrix for rotations around z, y and x axes
100 |
101 | Uses the z, then y, then x convention above
102 |
103 | Parameters
104 | ----------
105 | z : scalar
106 | Rotation angle in radians around z-axis (performed first)
107 | y : scalar
108 | Rotation angle in radians around y-axis
109 | x : scalar
110 | Rotation angle in radians around x-axis (performed last)
111 |
112 | Returns
113 | -------
114 | M : array shape (3,3)
115 | Rotation matrix giving same rotation as for given angles
116 |
117 | Examples
118 | --------
119 | >>> zrot = 1.3 # radians
120 | >>> yrot = -0.1
121 | >>> xrot = 0.2
122 | >>> M = euler2mat(zrot, yrot, xrot)
123 | >>> M.shape == (3, 3)
124 | True
125 |
126 | The output rotation matrix is equal to the composition of the
127 | individual rotations
128 |
129 | >>> M1 = euler2mat(zrot)
130 | >>> M2 = euler2mat(0, yrot)
131 | >>> M3 = euler2mat(0, 0, xrot)
132 | >>> composed_M = np.dot(M3, np.dot(M2, M1))
133 | >>> np.allclose(M, composed_M)
134 | True
135 |
136 | You can specify rotations by named arguments
137 |
138 | >>> np.all(M3 == euler2mat(x=xrot))
139 | True
140 |
141 |     When applying M to a vector, the vector should be a column vector to the
142 | right of M. If the right hand side is a 2D array rather than a
143 | vector, then each column of the 2D array represents a vector.
144 |
145 | >>> vec = np.array([1, 0, 0]).reshape((3,1))
146 | >>> v2 = np.dot(M, vec)
147 | >>> vecs = np.array([[1, 0, 0],[0, 1, 0]]).T # giving 3x2 array
148 | >>> vecs2 = np.dot(M, vecs)
149 |
150 | Rotations are counter-clockwise.
151 |
152 | >>> zred = np.dot(euler2mat(z=np.pi/2), np.eye(3))
153 | >>> np.allclose(zred, [[0, -1, 0],[1, 0, 0], [0, 0, 1]])
154 | True
155 | >>> yred = np.dot(euler2mat(y=np.pi/2), np.eye(3))
156 | >>> np.allclose(yred, [[0, 0, 1],[0, 1, 0], [-1, 0, 0]])
157 | True
158 | >>> xred = np.dot(euler2mat(x=np.pi/2), np.eye(3))
159 | >>> np.allclose(xred, [[1, 0, 0],[0, 0, -1], [0, 1, 0]])
160 | True
161 |
162 | Notes
163 | -----
164 | The direction of rotation is given by the right-hand rule (orient
165 | the thumb of the right hand along the axis around which the rotation
166 | occurs, with the end of the thumb at the positive end of the axis;
167 | curl your fingers; the direction your fingers curl is the direction
168 | of rotation). Therefore, the rotations are counterclockwise if
169 | looking along the axis of rotation from positive to negative.
170 | '''
171 | Ms = []
172 | if z:
173 | cosz = math.cos(z)
174 | sinz = math.sin(z)
175 | Ms.append(np.array(
176 | [[cosz, -sinz, 0],
177 | [sinz, cosz, 0],
178 | [0, 0, 1]]))
179 | if y:
180 | cosy = math.cos(y)
181 | siny = math.sin(y)
182 | Ms.append(np.array(
183 | [[cosy, 0, siny],
184 | [0, 1, 0],
185 | [-siny, 0, cosy]]))
186 | if x:
187 | cosx = math.cos(x)
188 | sinx = math.sin(x)
189 | Ms.append(np.array(
190 | [[1, 0, 0],
191 | [0, cosx, -sinx],
192 | [0, sinx, cosx]]))
193 | if Ms:
194 | return reduce(np.dot, Ms[::-1])
195 | return np.eye(3)
196 |
197 |
198 | def mat2euler(M, cy_thresh=None):
199 | ''' Discover Euler angle vector from 3x3 matrix
200 |
201 | Uses the conventions above.
202 |
203 | Parameters
204 | ----------
205 | M : array-like, shape (3,3)
206 | cy_thresh : None or scalar, optional
207 | threshold below which to give up on straightforward arctan for
208 | estimating x rotation. If None (default), estimate from
209 | precision of input.
210 |
211 | Returns
212 | -------
213 | z : scalar
214 | y : scalar
215 | x : scalar
216 | Rotations in radians around z, y, x axes, respectively
217 |
218 | Notes
219 | -----
220 |     If there were no numerical error, the routine could be derived using
221 |     the Sympy expression for the z then y then x rotation matrix, which is::
222 |
223 | [ cos(y)*cos(z), -cos(y)*sin(z), sin(y)],
224 | [cos(x)*sin(z) + cos(z)*sin(x)*sin(y), cos(x)*cos(z) - sin(x)*sin(y)*sin(z), -cos(y)*sin(x)],
225 | [sin(x)*sin(z) - cos(x)*cos(z)*sin(y), cos(z)*sin(x) + cos(x)*sin(y)*sin(z), cos(x)*cos(y)]
226 |
227 | with the obvious derivations for z, y, and x
228 |
229 | z = atan2(-r12, r11)
230 | y = asin(r13)
231 | x = atan2(-r23, r33)
232 |
233 | Problems arise when cos(y) is close to zero, because both of::
234 |
235 | z = atan2(cos(y)*sin(z), cos(y)*cos(z))
236 | x = atan2(cos(y)*sin(x), cos(x)*cos(y))
237 |
238 | will be close to atan2(0, 0), and highly unstable.
239 |
240 | The ``cy`` fix for numerical instability below is from: *Graphics
241 | Gems IV*, Paul Heckbert (editor), Academic Press, 1994, ISBN:
242 | 0123361559. Specifically it comes from EulerAngles.c by Ken
243 | Shoemake, and deals with the case where cos(y) is close to zero:
244 |
245 | See: http://www.graphicsgems.org/
246 |
247 | The code appears to be licensed (from the website) as "can be used
248 | without restrictions".
249 | '''
250 | M = np.asarray(M)
251 | if cy_thresh is None:
252 | try:
253 | cy_thresh = np.finfo(M.dtype).eps * 4
254 | except ValueError:
255 | cy_thresh = _FLOAT_EPS_4
256 | r11, r12, r13, r21, r22, r23, r31, r32, r33 = M.flat
257 |     # cy: sqrt((cos(x)*cos(y))**2 + (cos(y)*sin(x))**2) = abs(cos(y))
258 | cy = math.sqrt(r33*r33 + r23*r23)
259 | if cy > cy_thresh: # cos(y) not close to zero, standard form
260 | z = math.atan2(-r12, r11) # atan2(cos(y)*sin(z), cos(y)*cos(z))
261 | y = math.atan2(r13, cy) # atan2(sin(y), cy)
262 | x = math.atan2(-r23, r33) # atan2(cos(y)*sin(x), cos(x)*cos(y))
263 | else: # cos(y) (close to) zero, so x -> 0.0 (see above)
264 |         # and therefore r21 -> sin(z), r22 -> cos(z)
265 | z = math.atan2(r21, r22)
266 | y = math.atan2(r13, cy) # atan2(sin(y), cy)
267 | x = 0.0
268 | return z, y, x
269 |
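# ----------------------------------------------------------------------
# Editor's sketch (not part of the original file): a round-trip check
# tying euler2mat and mat2euler together. It holds while cos(y) stays
# away from zero, the degenerate case handled in mat2euler above.
def _example_euler_roundtrip():
  z, y, x = 0.4, -0.3, 1.1  # radians, well away from y = +/- pi/2
  zp, yp, xp = mat2euler(euler2mat(z, y, x))
  assert np.allclose((zp, yp, xp), (z, y, x))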
270 |
271 | def euler2quat(z=0, y=0, x=0):
272 | ''' Return quaternion corresponding to these Euler angles
273 |
274 | Uses the z, then y, then x convention above
275 |
276 | Parameters
277 | ----------
278 | z : scalar
279 | Rotation angle in radians around z-axis (performed first)
280 | y : scalar
281 | Rotation angle in radians around y-axis
282 | x : scalar
283 | Rotation angle in radians around x-axis (performed last)
284 |
285 | Returns
286 | -------
287 | quat : array shape (4,)
288 | Quaternion in w, x, y z (real, then vector) format
289 |
290 | Notes
291 | -----
292 | We can derive this formula in Sympy using:
293 |
294 | 1. Formula giving quaternion corresponding to rotation of theta radians
295 | about arbitrary axis:
296 | http://mathworld.wolfram.com/EulerParameters.html
297 | 2. Generated formulae from 1.) for quaternions corresponding to
298 | theta radians rotations about ``x, y, z`` axes
299 | 3. Apply quaternion multiplication formula -
300 | http://en.wikipedia.org/wiki/Quaternions#Hamilton_product - to
301 | formulae from 2.) to give formula for combined rotations.
302 | '''
303 | z = z/2.0
304 | y = y/2.0
305 | x = x/2.0
306 | cz = math.cos(z)
307 | sz = math.sin(z)
308 | cy = math.cos(y)
309 | sy = math.sin(y)
310 | cx = math.cos(x)
311 | sx = math.sin(x)
312 | return np.array([
313 | cx*cy*cz - sx*sy*sz,
314 | cx*sy*sz + cy*cz*sx,
315 | cx*cz*sy - sx*cy*sz,
316 | cx*cy*sz + sx*cz*sy])
317 |
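# Editor's sketch (not part of the original file): two cheap sanity checks
# on euler2quat -- the zero rotation maps to the identity quaternion, and
# the result is always unit-norm, as any rotation quaternion must be.
def _example_quat_sanity():
  assert np.allclose(euler2quat(), [1, 0, 0, 0])  # no rotation -> identity
  q = euler2quat(0.4, -0.3, 1.1)
  assert np.isclose(np.dot(q, q), 1.0)            # unit norm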
318 |
319 | def quat2euler(q):
320 | ''' Return Euler angles corresponding to quaternion `q`
321 |
322 | Parameters
323 | ----------
324 | q : 4 element sequence
325 | w, x, y, z of quaternion
326 |
327 | Returns
328 | -------
329 | z : scalar
330 | Rotation angle in radians around z-axis (performed first)
331 | y : scalar
332 | Rotation angle in radians around y-axis
333 | x : scalar
334 | Rotation angle in radians around x-axis (performed last)
335 |
336 | Notes
337 | -----
338 | It's possible to reduce the amount of calculation a little, by
339 | combining parts of the ``quat2mat`` and ``mat2euler`` functions, but
340 | the reduction in computation is small, and the code repetition is
341 | large.
342 | '''
343 | # delayed import to avoid cyclic dependencies
344 | import nibabel.quaternions as nq
345 | return mat2euler(nq.quat2mat(q))
346 |
347 |
348 | def euler2angle_axis(z=0, y=0, x=0):
349 | ''' Return angle, axis corresponding to these Euler angles
350 |
351 | Uses the z, then y, then x convention above
352 |
353 | Parameters
354 | ----------
355 | z : scalar
356 | Rotation angle in radians around z-axis (performed first)
357 | y : scalar
358 | Rotation angle in radians around y-axis
359 | x : scalar
360 | Rotation angle in radians around x-axis (performed last)
361 |
362 | Returns
363 | -------
364 | theta : scalar
365 | angle of rotation
366 | vector : array shape (3,)
367 | axis around which rotation occurs
368 |
369 | Examples
370 | --------
371 | >>> theta, vec = euler2angle_axis(0, 1.5, 0)
372 | >>> print(theta)
373 | 1.5
374 | >>> np.allclose(vec, [0, 1, 0])
375 | True
376 | '''
377 | # delayed import to avoid cyclic dependencies
378 | import nibabel.quaternions as nq
379 | return nq.quat2angle_axis(euler2quat(z, y, x))
380 |
381 |
382 | def angle_axis2euler(theta, vector, is_normalized=False):
383 | ''' Convert angle, axis pair to Euler angles
384 |
385 | Parameters
386 | ----------
387 | theta : scalar
388 | angle of rotation
389 | vector : 3 element sequence
390 | vector specifying axis for rotation.
391 | is_normalized : bool, optional
392 | True if vector is already normalized (has norm of 1). Default
393 | False
394 |
395 | Returns
396 | -------
397 | z : scalar
398 | y : scalar
399 | x : scalar
400 | Rotations in radians around z, y, x axes, respectively
401 |
402 | Examples
403 | --------
404 | >>> z, y, x = angle_axis2euler(0, [1, 0, 0])
405 | >>> np.allclose((z, y, x), 0)
406 | True
407 |
408 | Notes
409 | -----
410 | It's possible to reduce the amount of calculation a little, by
411 | combining parts of the ``angle_axis2mat`` and ``mat2euler``
412 | functions, but the reduction in computation is small, and the code
413 | repetition is large.
414 | '''
415 | # delayed import to avoid cyclic dependencies
416 | import nibabel.quaternions as nq
417 | M = nq.angle_axis2mat(theta, vector, is_normalized)
418 | return mat2euler(M)
419 |
--------------------------------------------------------------------------------
/utils/pc_util.py:
--------------------------------------------------------------------------------
1 | """ Utility functions for processing point clouds.
2 |
3 | Author: Charles R. Qi, Hao Su
4 | Date: November 2016
5 | """
6 |
7 | import os
8 | import sys
9 | BASE_DIR = os.path.dirname(os.path.abspath(__file__))
10 | sys.path.append(BASE_DIR)
11 |
12 | # Draw point cloud
13 | from eulerangles import euler2mat
14 |
15 | # Point cloud IO
16 | import numpy as np
17 | from plyfile import PlyData, PlyElement
18 |
19 |
20 | # ----------------------------------------
21 | # Point Cloud/Volume Conversions
22 | # ----------------------------------------
23 |
24 | def point_cloud_to_volume_batch(point_clouds, vsize=12, radius=1.0, flatten=True):
25 |     """ Input is a BxNx3 batch of point clouds.
26 |         Output is Bx(vsize^3) if flatten is True, else BxVxVxVx1 where V = vsize.
27 |     """
28 | vol_list = []
29 | for b in range(point_clouds.shape[0]):
30 | vol = point_cloud_to_volume(np.squeeze(point_clouds[b,:,:]), vsize, radius)
31 | if flatten:
32 | vol_list.append(vol.flatten())
33 | else:
34 | vol_list.append(np.expand_dims(np.expand_dims(vol, -1), 0))
35 | if flatten:
36 | return np.vstack(vol_list)
37 | else:
38 | return np.concatenate(vol_list, 0)
39 |
40 |
41 | def point_cloud_to_volume(points, vsize, radius=1.0):
42 | """ input is Nx3 points.
43 | output is vsize*vsize*vsize
44 | assumes points are in range [-radius, radius]
45 | """
46 | vol = np.zeros((vsize,vsize,vsize))
47 | voxel = 2*radius/float(vsize)
48 | locations = (points + radius)/voxel
49 |     locations = np.clip(locations.astype(int), 0, vsize-1)  # guard points at exactly +radius
50 | vol[locations[:,0],locations[:,1],locations[:,2]] = 1.0
51 | return vol
52 |
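# Editor's sketch (not part of the original file): voxelize a small point
# cloud and map it back. Note volume_to_point_cloud (below) returns voxel
# indices, not the original coordinates, so only occupancy is preserved.
def _example_voxelize():
    pts = np.random.randn(128, 3)
    pts /= np.linalg.norm(pts, axis=1, keepdims=True)  # project onto unit sphere
    vol = point_cloud_to_volume(pts * 0.9, vsize=16, radius=1.0)
    occupied = volume_to_point_cloud(vol)  # Kx3 array of occupied voxel indices
    assert occupied.shape[1] == 3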
55 |
56 | def volume_to_point_cloud(vol):
57 | """ vol is occupancy grid (value = 0 or 1) of size vsize*vsize*vsize
58 | return Nx3 numpy array.
59 | """
60 | vsize = vol.shape[0]
61 |     assert(vol.shape[1] == vsize and vol.shape[2] == vsize)
62 | points = []
63 | for a in range(vsize):
64 | for b in range(vsize):
65 | for c in range(vsize):
66 | if vol[a,b,c] == 1:
67 | points.append(np.array([a,b,c]))
68 | if len(points) == 0:
69 | return np.zeros((0,3))
70 | points = np.vstack(points)
71 | return points
72 |
73 | # ----------------------------------------
74 | # Point cloud IO
75 | # ----------------------------------------
76 |
77 | def read_ply(filename):
78 | """ read XYZ point cloud from filename PLY file """
79 | plydata = PlyData.read(filename)
80 | pc = plydata['vertex'].data
81 | pc_array = np.array([[x, y, z] for x,y,z in pc])
82 | return pc_array
83 |
84 |
85 | def write_ply(points, filename, text=True):
86 | """ input: Nx3, write points to filename as PLY format. """
87 | points = [(points[i,0], points[i,1], points[i,2]) for i in range(points.shape[0])]
88 | vertex = np.array(points, dtype=[('x', 'f4'), ('y', 'f4'),('z', 'f4')])
89 | el = PlyElement.describe(vertex, 'vertex', comments=['vertices'])
90 | PlyData([el], text=text).write(filename)
91 |
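# Editor's sketch (not part of the original file): PLY round-trip through a
# temp file. write_ply stores coordinates as float32 ('f4'), so float32
# input survives the round-trip up to text-formatting precision.
def _example_ply_roundtrip():
    import tempfile
    pts = np.random.rand(100, 3).astype(np.float32)
    path = os.path.join(tempfile.gettempdir(), 'pc_util_example.ply')
    write_ply(pts, path)
    assert np.allclose(pts, read_ply(path))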
92 |
93 | # ----------------------------------------
94 | # Simple Point cloud and Volume Renderers
95 | # ----------------------------------------
96 |
97 | def draw_point_cloud(input_points, canvasSize=500, space=200, diameter=25,
98 | xrot=0, yrot=0, zrot=0, switch_xyz=[0,1,2], normalize=True):
99 |     """ Render point cloud to a grayscale image.
100 |         Input:
101 |             input_points: Nx3 numpy array (+y is up direction)
102 |         Output:
103 |             gray image as numpy array of size canvasSize x canvasSize
104 |     """
105 | image = np.zeros((canvasSize, canvasSize))
106 | if input_points is None or input_points.shape[0] == 0:
107 | return image
108 |
109 | points = input_points[:, switch_xyz]
110 | M = euler2mat(zrot, yrot, xrot)
111 | points = (np.dot(M, points.transpose())).transpose()
112 |
113 | # Normalize the point cloud
114 | # We normalize scale to fit points in a unit sphere
115 | if normalize:
116 | centroid = np.mean(points, axis=0)
117 | points -= centroid
118 | furthest_distance = np.max(np.sqrt(np.sum(abs(points)**2,axis=-1)))
119 | points /= furthest_distance
120 |
121 | # Pre-compute the Gaussian disk
122 | radius = (diameter-1)/2.0
123 | disk = np.zeros((diameter, diameter))
124 | for i in range(diameter):
125 | for j in range(diameter):
126 | if (i - radius) * (i-radius) + (j-radius) * (j-radius) <= radius * radius:
127 | disk[i, j] = np.exp((-(i-radius)**2 - (j-radius)**2)/(radius**2))
128 | mask = np.argwhere(disk > 0)
129 | dx = mask[:, 0]
130 | dy = mask[:, 1]
131 | dv = disk[disk > 0]
132 |
133 | # Order points by z-buffer
134 | zorder = np.argsort(points[:, 2])
135 | points = points[zorder, :]
136 |     points[:, 2] = (points[:, 2] - np.min(points[:, 2])) / (np.max(points[:, 2]) - np.min(points[:, 2]))
137 | max_depth = np.max(points[:, 2])
138 |
139 | for i in range(points.shape[0]):
140 | j = points.shape[0] - i - 1
141 | x = points[j, 0]
142 | y = points[j, 1]
143 | xc = canvasSize/2 + (x*space)
144 | yc = canvasSize/2 + (y*space)
145 | xc = int(np.round(xc))
146 | yc = int(np.round(yc))
147 |
148 | px = dx + xc
149 | py = dy + yc
150 |
151 | image[px, py] = image[px, py] * 0.7 + dv * (max_depth - points[j, 2]) * 0.3
152 |
153 | image = image / np.max(image)
154 | return image
155 |
156 | def point_cloud_three_views(points):
157 | """ input points Nx3 numpy array (+y is up direction).
158 |         return a numpy array gray image of size 500x1500. """
159 | # +y is up direction
160 | # xrot is azimuth
161 | # yrot is in-plane
162 | # zrot is elevation
163 | img1 = draw_point_cloud(points, zrot=110/180.0*np.pi, xrot=45/180.0*np.pi, yrot=0/180.0*np.pi)
164 | img2 = draw_point_cloud(points, zrot=70/180.0*np.pi, xrot=135/180.0*np.pi, yrot=0/180.0*np.pi)
165 | img3 = draw_point_cloud(points, zrot=180.0/180.0*np.pi, xrot=90/180.0*np.pi, yrot=0/180.0*np.pi)
166 | image_large = np.concatenate([img1, img2, img3], 1)
167 | return image_large
168 |
169 |
170 | from PIL import Image
171 | def point_cloud_three_views_demo():
172 | """ Demo for draw_point_cloud function """
173 | points = read_ply('../third_party/mesh_sampling/piano.ply')
174 | im_array = point_cloud_three_views(points)
175 | img = Image.fromarray(np.uint8(im_array*255.0))
176 | img.save('piano.jpg')
177 |
178 | if __name__ == "__main__":
179 | point_cloud_three_views_demo()
180 |
181 |
182 | import matplotlib.pyplot as plt; from mpl_toolkits.mplot3d import Axes3D  # noqa -- registers the '3d' projection
183 | def pyplot_draw_point_cloud(points, output_filename):
184 | """ points is a Nx3 numpy array """
185 | fig = plt.figure()
186 | ax = fig.add_subplot(111, projection='3d')
187 | ax.scatter(points[:,0], points[:,1], points[:,2])
188 | ax.set_xlabel('x')
189 | ax.set_ylabel('y')
190 | ax.set_zlabel('z')
191 |     plt.savefig(output_filename)
192 |
193 | def pyplot_draw_volume(vol, output_filename):
194 | """ vol is of size vsize*vsize*vsize
195 | output an image to output_filename
196 | """
197 | points = volume_to_point_cloud(vol)
198 | pyplot_draw_point_cloud(points, output_filename)
199 |
--------------------------------------------------------------------------------
/utils/tf_util.py:
--------------------------------------------------------------------------------
1 | """ Wrapper functions for TensorFlow layers.
2 |
3 | Author: Charles R. Qi
4 | Date: November 2016
5 | """
6 |
7 | import numpy as np
8 | import tensorflow as tf
9 |
10 | def _variable_on_cpu(name, shape, initializer, use_fp16=False):
11 | """Helper to create a Variable stored on CPU memory.
12 | Args:
13 | name: name of the variable
14 | shape: list of ints
15 | initializer: initializer for Variable
16 | Returns:
17 | Variable Tensor
18 | """
19 | with tf.device('/cpu:0'):
20 | dtype = tf.float16 if use_fp16 else tf.float32
21 | var = tf.get_variable(name, shape, initializer=initializer, dtype=dtype)
22 | return var
23 |
24 | def _variable_with_weight_decay(name, shape, stddev, wd, use_xavier=True):
25 | """Helper to create an initialized Variable with weight decay.
26 |
27 | Note that the Variable is initialized with a truncated normal distribution.
28 | A weight decay is added only if one is specified.
29 |
30 | Args:
31 | name: name of the variable
32 | shape: list of ints
33 | stddev: standard deviation of a truncated Gaussian
34 | wd: add L2Loss weight decay multiplied by this float. If None, weight
35 | decay is not added for this Variable.
36 | use_xavier: bool, whether to use xavier initializer
37 |
38 | Returns:
39 | Variable Tensor
40 | """
41 | if use_xavier:
42 | initializer = tf.contrib.layers.xavier_initializer()
43 | else:
44 | initializer = tf.truncated_normal_initializer(stddev=stddev)
45 | var = _variable_on_cpu(name, shape, initializer)
46 | if wd is not None:
47 | weight_decay = tf.multiply(tf.nn.l2_loss(var), wd, name='weight_loss')
48 | tf.add_to_collection('losses', weight_decay)
49 | return var
50 |
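# Editor's sketch (not part of the original file): how the 'losses'
# collection above is typically consumed -- the weight-decay terms are
# summed into the training objective. `logits` and `labels` here are
# hypothetical tensors supplied by the caller (TF 1.x graph mode).
def _example_total_loss(logits, labels):
  classify_loss = tf.reduce_mean(
      tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits))
  wd_losses = tf.get_collection('losses')
  return classify_loss + tf.add_n(wd_losses) if wd_losses else classify_loss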
51 |
52 | def conv1d(inputs,
53 | num_output_channels,
54 | kernel_size,
55 | scope,
56 | stride=1,
57 | padding='SAME',
58 | use_xavier=True,
59 | stddev=1e-3,
60 | weight_decay=0.0,
61 | activation_fn=tf.nn.relu,
62 | bn=False,
63 | bn_decay=None,
64 | is_training=None):
65 | """ 1D convolution with non-linear operation.
66 |
67 | Args:
68 | inputs: 3-D tensor variable BxLxC
69 | num_output_channels: int
70 | kernel_size: int
71 | scope: string
72 | stride: int
73 | padding: 'SAME' or 'VALID'
74 | use_xavier: bool, use xavier_initializer if true
75 | stddev: float, stddev for truncated_normal init
76 | weight_decay: float
77 | activation_fn: function
78 | bn: bool, whether to use batch norm
79 | bn_decay: float or float tensor variable in [0,1]
80 | is_training: bool Tensor variable
81 |
82 | Returns:
83 | Variable tensor
84 | """
85 | with tf.variable_scope(scope) as sc:
86 | num_in_channels = inputs.get_shape()[-1].value
87 | kernel_shape = [kernel_size,
88 | num_in_channels, num_output_channels]
89 | kernel = _variable_with_weight_decay('weights',
90 | shape=kernel_shape,
91 | use_xavier=use_xavier,
92 | stddev=stddev,
93 | wd=weight_decay)
94 | outputs = tf.nn.conv1d(inputs, kernel,
95 | stride=stride,
96 | padding=padding)
97 | biases = _variable_on_cpu('biases', [num_output_channels],
98 | tf.constant_initializer(0.0))
99 | outputs = tf.nn.bias_add(outputs, biases)
100 |
101 | if bn:
102 | outputs = batch_norm_for_conv1d(outputs, is_training,
103 | bn_decay=bn_decay, scope='bn')
104 |
105 | if activation_fn is not None:
106 | outputs = activation_fn(outputs)
107 | return outputs
108 |
109 |
110 |
111 |
112 | def conv2d(inputs,
113 | num_output_channels,
114 | kernel_size,
115 | scope,
116 | stride=[1, 1],
117 | padding='SAME',
118 | use_xavier=True,
119 | stddev=1e-3,
120 | weight_decay=0.0,
121 | activation_fn=tf.nn.relu,
122 | bn=False,
123 | bn_decay=None,
124 | is_training=None):
125 | """ 2D convolution with non-linear operation.
126 |
127 | Args:
128 | inputs: 4-D tensor variable BxHxWxC
129 | num_output_channels: int
130 | kernel_size: a list of 2 ints
131 | scope: string
132 | stride: a list of 2 ints
133 | padding: 'SAME' or 'VALID'
134 | use_xavier: bool, use xavier_initializer if true
135 | stddev: float, stddev for truncated_normal init
136 | weight_decay: float
137 | activation_fn: function
138 | bn: bool, whether to use batch norm
139 | bn_decay: float or float tensor variable in [0,1]
140 | is_training: bool Tensor variable
141 |
142 | Returns:
143 | Variable tensor
144 | """
145 | with tf.variable_scope(scope) as sc:
146 | kernel_h, kernel_w = kernel_size
147 | num_in_channels = inputs.get_shape()[-1].value
148 | kernel_shape = [kernel_h, kernel_w,
149 | num_in_channels, num_output_channels]
150 | kernel = _variable_with_weight_decay('weights',
151 | shape=kernel_shape,
152 | use_xavier=use_xavier,
153 | stddev=stddev,
154 | wd=weight_decay)
155 | stride_h, stride_w = stride
156 | outputs = tf.nn.conv2d(inputs, kernel,
157 | [1, stride_h, stride_w, 1],
158 | padding=padding)
159 | biases = _variable_on_cpu('biases', [num_output_channels],
160 | tf.constant_initializer(0.0))
161 | outputs = tf.nn.bias_add(outputs, biases)
162 |
163 | if bn:
164 | outputs = batch_norm_for_conv2d(outputs, is_training,
165 | bn_decay=bn_decay, scope='bn')
166 |
167 | if activation_fn is not None:
168 | outputs = activation_fn(outputs)
169 | return outputs
170 |
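# Editor's sketch (not part of the original file): PointNet-style models use
# conv2d as a per-point shared MLP -- expand a BxNx3 cloud to BxNx1x3 and
# convolve with a [1,1] kernel so the same weights apply to every point.
# TF 1.x graph mode; the batch and point counts are illustrative only.
def _example_shared_mlp():
  is_training = tf.placeholder(tf.bool, shape=())
  point_cloud = tf.placeholder(tf.float32, shape=(32, 1024, 3))  # B x N x 3
  net = tf.expand_dims(point_cloud, 2)                           # B x N x 1 x 3
  net = conv2d(net, 64, [1, 1], scope='shared_mlp1', stride=[1, 1],
               padding='VALID', bn=True, is_training=is_training)  # B x N x 1 x 64
  return net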
171 |
172 | def conv2d_transpose(inputs,
173 | num_output_channels,
174 | kernel_size,
175 | scope,
176 | stride=[1, 1],
177 | padding='SAME',
178 | use_xavier=True,
179 | stddev=1e-3,
180 | weight_decay=0.0,
181 | activation_fn=tf.nn.relu,
182 | bn=False,
183 | bn_decay=None,
184 | is_training=None):
185 | """ 2D convolution transpose with non-linear operation.
186 |
187 | Args:
188 | inputs: 4-D tensor variable BxHxWxC
189 | num_output_channels: int
190 | kernel_size: a list of 2 ints
191 | scope: string
192 | stride: a list of 2 ints
193 | padding: 'SAME' or 'VALID'
194 | use_xavier: bool, use xavier_initializer if true
195 | stddev: float, stddev for truncated_normal init
196 | weight_decay: float
197 | activation_fn: function
198 | bn: bool, whether to use batch norm
199 | bn_decay: float or float tensor variable in [0,1]
200 | is_training: bool Tensor variable
201 |
202 | Returns:
203 | Variable tensor
204 |
205 |   Note: conv2d(conv2d_transpose(a, num_out, ksize, stride), a.shape[-1], ksize, stride) has the same shape as `a` (values generally differ)
206 | """
207 | with tf.variable_scope(scope) as sc:
208 | kernel_h, kernel_w = kernel_size
209 | num_in_channels = inputs.get_shape()[-1].value
210 | kernel_shape = [kernel_h, kernel_w,
211 |                         num_output_channels, num_in_channels] # reversed compared to conv2d
212 | kernel = _variable_with_weight_decay('weights',
213 | shape=kernel_shape,
214 | use_xavier=use_xavier,
215 | stddev=stddev,
216 | wd=weight_decay)
217 | stride_h, stride_w = stride
218 |
219 | # from slim.convolution2d_transpose
220 | def get_deconv_dim(dim_size, stride_size, kernel_size, padding):
221 | dim_size *= stride_size
222 |
223 | if padding == 'VALID' and dim_size is not None:
224 | dim_size += max(kernel_size - stride_size, 0)
225 | return dim_size
226 |
227 |     # calculate output shape
228 | batch_size = inputs.get_shape()[0].value
229 | height = inputs.get_shape()[1].value
230 | width = inputs.get_shape()[2].value
231 | out_height = get_deconv_dim(height, stride_h, kernel_h, padding)
232 | out_width = get_deconv_dim(width, stride_w, kernel_w, padding)
233 | output_shape = [batch_size, out_height, out_width, num_output_channels]
234 |
235 | outputs = tf.nn.conv2d_transpose(inputs, kernel, output_shape,
236 | [1, stride_h, stride_w, 1],
237 | padding=padding)
238 | biases = _variable_on_cpu('biases', [num_output_channels],
239 | tf.constant_initializer(0.0))
240 | outputs = tf.nn.bias_add(outputs, biases)
241 |
242 | if bn:
243 | outputs = batch_norm_for_conv2d(outputs, is_training,
244 | bn_decay=bn_decay, scope='bn')
245 |
246 | if activation_fn is not None:
247 | outputs = activation_fn(outputs)
248 | return outputs
249 |
250 |
251 |
252 | def conv3d(inputs,
253 | num_output_channels,
254 | kernel_size,
255 | scope,
256 | stride=[1, 1, 1],
257 | padding='SAME',
258 | use_xavier=True,
259 | stddev=1e-3,
260 | weight_decay=0.0,
261 | activation_fn=tf.nn.relu,
262 | bn=False,
263 | bn_decay=None,
264 | is_training=None):
265 | """ 3D convolution with non-linear operation.
266 |
267 | Args:
268 | inputs: 5-D tensor variable BxDxHxWxC
269 | num_output_channels: int
270 | kernel_size: a list of 3 ints
271 | scope: string
272 | stride: a list of 3 ints
273 | padding: 'SAME' or 'VALID'
274 | use_xavier: bool, use xavier_initializer if true
275 | stddev: float, stddev for truncated_normal init
276 | weight_decay: float
277 | activation_fn: function
278 | bn: bool, whether to use batch norm
279 | bn_decay: float or float tensor variable in [0,1]
280 | is_training: bool Tensor variable
281 |
282 | Returns:
283 | Variable tensor
284 | """
285 | with tf.variable_scope(scope) as sc:
286 | kernel_d, kernel_h, kernel_w = kernel_size
287 | num_in_channels = inputs.get_shape()[-1].value
288 | kernel_shape = [kernel_d, kernel_h, kernel_w,
289 | num_in_channels, num_output_channels]
290 | kernel = _variable_with_weight_decay('weights',
291 | shape=kernel_shape,
292 | use_xavier=use_xavier,
293 | stddev=stddev,
294 | wd=weight_decay)
295 | stride_d, stride_h, stride_w = stride
296 | outputs = tf.nn.conv3d(inputs, kernel,
297 | [1, stride_d, stride_h, stride_w, 1],
298 | padding=padding)
299 | biases = _variable_on_cpu('biases', [num_output_channels],
300 | tf.constant_initializer(0.0))
301 | outputs = tf.nn.bias_add(outputs, biases)
302 |
303 | if bn:
304 | outputs = batch_norm_for_conv3d(outputs, is_training,
305 | bn_decay=bn_decay, scope='bn')
306 |
307 | if activation_fn is not None:
308 | outputs = activation_fn(outputs)
309 | return outputs
310 |
311 | def fully_connected(inputs,
312 | num_outputs,
313 | scope,
314 | use_xavier=True,
315 | stddev=1e-3,
316 | weight_decay=0.0,
317 | activation_fn=tf.nn.relu,
318 | bn=False,
319 | bn_decay=None,
320 | is_training=None):
321 | """ Fully connected layer with non-linear operation.
322 |
323 | Args:
324 | inputs: 2-D tensor BxN
325 | num_outputs: int
326 |
327 | Returns:
328 | Variable tensor of size B x num_outputs.
329 | """
330 | with tf.variable_scope(scope) as sc:
331 | num_input_units = inputs.get_shape()[-1].value
332 | weights = _variable_with_weight_decay('weights',
333 | shape=[num_input_units, num_outputs],
334 | use_xavier=use_xavier,
335 | stddev=stddev,
336 | wd=weight_decay)
337 | outputs = tf.matmul(inputs, weights)
338 | biases = _variable_on_cpu('biases', [num_outputs],
339 | tf.constant_initializer(0.0))
340 | outputs = tf.nn.bias_add(outputs, biases)
341 |
342 | if bn:
343 | outputs = batch_norm_for_fc(outputs, is_training, bn_decay, 'bn')
344 |
345 | if activation_fn is not None:
346 | outputs = activation_fn(outputs)
347 | return outputs
348 |
349 |
350 | def max_pool2d(inputs,
351 | kernel_size,
352 | scope,
353 | stride=[2, 2],
354 | padding='VALID'):
355 | """ 2D max pooling.
356 |
357 | Args:
358 | inputs: 4-D tensor BxHxWxC
359 | kernel_size: a list of 2 ints
360 | stride: a list of 2 ints
361 |
362 | Returns:
363 | Variable tensor
364 | """
365 | with tf.variable_scope(scope) as sc:
366 | kernel_h, kernel_w = kernel_size
367 | stride_h, stride_w = stride
368 | outputs = tf.nn.max_pool(inputs,
369 | ksize=[1, kernel_h, kernel_w, 1],
370 | strides=[1, stride_h, stride_w, 1],
371 | padding=padding,
372 | name=sc.name)
373 | return outputs
374 |
375 | def avg_pool2d(inputs,
376 | kernel_size,
377 | scope,
378 | stride=[2, 2],
379 | padding='VALID'):
380 | """ 2D avg pooling.
381 |
382 | Args:
383 | inputs: 4-D tensor BxHxWxC
384 | kernel_size: a list of 2 ints
385 | stride: a list of 2 ints
386 |
387 | Returns:
388 | Variable tensor
389 | """
390 | with tf.variable_scope(scope) as sc:
391 | kernel_h, kernel_w = kernel_size
392 | stride_h, stride_w = stride
393 | outputs = tf.nn.avg_pool(inputs,
394 | ksize=[1, kernel_h, kernel_w, 1],
395 | strides=[1, stride_h, stride_w, 1],
396 | padding=padding,
397 | name=sc.name)
398 | return outputs
399 |
400 |
401 | def max_pool3d(inputs,
402 | kernel_size,
403 | scope,
404 | stride=[2, 2, 2],
405 | padding='VALID'):
406 | """ 3D max pooling.
407 |
408 | Args:
409 | inputs: 5-D tensor BxDxHxWxC
410 | kernel_size: a list of 3 ints
411 | stride: a list of 3 ints
412 |
413 | Returns:
414 | Variable tensor
415 | """
416 | with tf.variable_scope(scope) as sc:
417 | kernel_d, kernel_h, kernel_w = kernel_size
418 | stride_d, stride_h, stride_w = stride
419 | outputs = tf.nn.max_pool3d(inputs,
420 | ksize=[1, kernel_d, kernel_h, kernel_w, 1],
421 | strides=[1, stride_d, stride_h, stride_w, 1],
422 | padding=padding,
423 | name=sc.name)
424 | return outputs
425 |
426 | def avg_pool3d(inputs,
427 | kernel_size,
428 | scope,
429 | stride=[2, 2, 2],
430 | padding='VALID'):
431 | """ 3D avg pooling.
432 |
433 | Args:
434 | inputs: 5-D tensor BxDxHxWxC
435 | kernel_size: a list of 3 ints
436 | stride: a list of 3 ints
437 |
438 | Returns:
439 | Variable tensor
440 | """
441 | with tf.variable_scope(scope) as sc:
442 | kernel_d, kernel_h, kernel_w = kernel_size
443 | stride_d, stride_h, stride_w = stride
444 | outputs = tf.nn.avg_pool3d(inputs,
445 | ksize=[1, kernel_d, kernel_h, kernel_w, 1],
446 | strides=[1, stride_d, stride_h, stride_w, 1],
447 | padding=padding,
448 | name=sc.name)
449 | return outputs
450 |
451 |
452 |
453 |
454 |
455 | def batch_norm_template(inputs, is_training, scope, moments_dims, bn_decay):
456 | """ Batch normalization on convolutional maps and beyond...
457 | Ref.: http://stackoverflow.com/questions/33949786/how-could-i-use-batch-normalization-in-tensorflow
458 |
459 | Args:
460 |     inputs: Tensor, k-D input with channels last (BxC, BxHxWxC or BxDxHxWxC)
461 |     is_training: boolean tf.Variable, true indicates training phase
462 |     scope: string, variable scope
463 |     moments_dims: a list of ints, indicating dimensions for moments calculation
464 |     bn_decay: float or float tensor variable, controlling moving average weight
465 | Return:
466 | normed: batch-normalized maps
467 | """
468 | with tf.variable_scope(scope) as sc:
469 | num_channels = inputs.get_shape()[-1].value
470 | beta = tf.Variable(tf.constant(0.0, shape=[num_channels]),
471 | name='beta', trainable=True)
472 | gamma = tf.Variable(tf.constant(1.0, shape=[num_channels]),
473 | name='gamma', trainable=True)
474 | batch_mean, batch_var = tf.nn.moments(inputs, moments_dims, name='moments')
475 | decay = bn_decay if bn_decay is not None else 0.9
476 | ema = tf.train.ExponentialMovingAverage(decay=decay)
477 | # Operator that maintains moving averages of variables.
478 | ema_apply_op = tf.cond(is_training,
479 | lambda: ema.apply([batch_mean, batch_var]),
480 | lambda: tf.no_op())
481 |
482 | # Update moving average and return current batch's avg and var.
483 | def mean_var_with_update():
484 | with tf.control_dependencies([ema_apply_op]):
485 | return tf.identity(batch_mean), tf.identity(batch_var)
486 |
487 | # ema.average returns the Variable holding the average of var.
488 | mean, var = tf.cond(is_training,
489 | mean_var_with_update,
490 | lambda: (ema.average(batch_mean), ema.average(batch_var)))
491 | normed = tf.nn.batch_normalization(inputs, mean, var, beta, gamma, 1e-3)
492 | return normed
493 |
494 |
495 | def batch_norm_for_fc(inputs, is_training, bn_decay, scope):
496 | """ Batch normalization on FC data.
497 |
498 | Args:
499 | inputs: Tensor, 2D BxC input
500 |     is_training: boolean tf.Variable, true indicates training phase
501 |     bn_decay: float or float tensor variable, controlling moving average weight
502 | scope: string, variable scope
503 | Return:
504 | normed: batch-normalized maps
505 | """
506 | return batch_norm_template(inputs, is_training, scope, [0,], bn_decay)
507 |
508 |
509 | def batch_norm_for_conv1d(inputs, is_training, bn_decay, scope):
510 | """ Batch normalization on 1D convolutional maps.
511 |
512 | Args:
513 | inputs: Tensor, 3D BLC input maps
514 |     is_training: boolean tf.Variable, true indicates training phase
515 |     bn_decay: float or float tensor variable, controlling moving average weight
516 | scope: string, variable scope
517 | Return:
518 | normed: batch-normalized maps
519 | """
520 | return batch_norm_template(inputs, is_training, scope, [0,1], bn_decay)
521 |
522 |
523 |
524 |
525 | def batch_norm_for_conv2d(inputs, is_training, bn_decay, scope):
526 | """ Batch normalization on 2D convolutional maps.
527 |
528 | Args:
529 | inputs: Tensor, 4D BHWC input maps
530 |     is_training: boolean tf.Variable, true indicates training phase
531 |     bn_decay: float or float tensor variable, controlling moving average weight
532 | scope: string, variable scope
533 | Return:
534 | normed: batch-normalized maps
535 | """
536 | return batch_norm_template(inputs, is_training, scope, [0,1,2], bn_decay)
537 |
538 |
539 |
540 | def batch_norm_for_conv3d(inputs, is_training, bn_decay, scope):
541 | """ Batch normalization on 3D convolutional maps.
542 |
543 | Args:
544 | inputs: Tensor, 5D BDHWC input maps
545 |     is_training: boolean tf.Variable, true indicates training phase
546 |     bn_decay: float or float tensor variable, controlling moving average weight
547 | scope: string, variable scope
548 | Return:
549 | normed: batch-normalized maps
550 | """
551 | return batch_norm_template(inputs, is_training, scope, [0,1,2,3], bn_decay)
552 |
553 |
554 | def dropout(inputs,
555 | is_training,
556 | scope,
557 | keep_prob=0.5,
558 | noise_shape=None):
559 | """ Dropout layer.
560 |
561 | Args:
562 | inputs: tensor
563 | is_training: boolean tf.Variable
564 | scope: string
565 | keep_prob: float in [0,1]
566 | noise_shape: list of ints
567 |
568 | Returns:
569 | tensor variable
570 | """
571 | with tf.variable_scope(scope) as sc:
572 | outputs = tf.cond(is_training,
573 | lambda: tf.nn.dropout(inputs, keep_prob, noise_shape),
574 | lambda: inputs)
575 | return outputs
576 |
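# Editor's sketch (not part of the original file): chaining the wrappers
# above into a small classification head, the way the PointNet models do.
# TF 1.x graph mode; layer sizes and scope names are illustrative only.
def _example_classifier_head(net, is_training, num_classes=40):
  # net: BxC global feature tensor, is_training: scalar bool tensor
  net = fully_connected(net, 256, scope='fc1', bn=True,
                        bn_decay=0.9, is_training=is_training)
  net = dropout(net, is_training, scope='dp1', keep_prob=0.7)
  net = fully_connected(net, num_classes, scope='fc2',
                        activation_fn=None)  # raw logits, B x num_classes
  return net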
--------------------------------------------------------------------------------