├── .gitignore
├── LICENSE
├── README.md
├── alsNet
│   ├── alsNet.py
│   ├── alsNetEvaluator.py
│   ├── alsNetHistory.py
│   ├── alsNetLogger.py
│   ├── alsNetLogger2.py
│   ├── alsNetMerger.py
│   ├── alsNetPreparer.py
│   ├── alsNetRefactored.py
│   ├── alsNetRunner.py
│   ├── alsNetRunner2.py
│   ├── alsNetRunner3.py
│   ├── alsNetRunner4.py
│   ├── alsNetRunner5.py
│   ├── archs
│   │   ├── __init__.py
│   │   ├── arch1.py
│   │   ├── arch2.py
│   │   ├── arch3.py
│   │   ├── arch4.py
│   │   └── arch5.py
│   ├── dataset.py
│   ├── model.py
│   ├── model2.py
│   ├── model3.py
│   └── plots
│       └── confusion.py
├── bregenz_c1293.png
├── tf_ops
│   ├── 3d_interpolation
│   │   ├── interpolate.cpp
│   │   ├── tf_interpolate.cpp
│   │   ├── tf_interpolate.py
│   │   ├── tf_interpolate_compile.sh
│   │   └── tf_interpolate_op_test.py
│   ├── grouping
│   │   ├── .gitignore
│   │   ├── test
│   │   │   ├── compile.sh
│   │   │   ├── query_ball_point.cpp
│   │   │   ├── query_ball_point.cu
│   │   │   ├── query_ball_point_block.cu
│   │   │   ├── query_ball_point_grid.cu
│   │   │   ├── selection_sort.cpp
│   │   │   ├── selection_sort.cu
│   │   │   └── selection_sort_const.cu
│   │   ├── tf_grouping.cpp
│   │   ├── tf_grouping.py
│   │   ├── tf_grouping_compile.sh
│   │   ├── tf_grouping_g.cu
│   │   └── tf_grouping_op_test.py
│   └── sampling
│       ├── .gitignore
│       ├── tf_sampling.cpp
│       ├── tf_sampling.py
│       ├── tf_sampling_compile.sh
│       └── tf_sampling_g.cu
└── utils
    ├── pointnet_util.py
    └── tf_util.py
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | # data
2 | data/
3 | 
4 | # log files
5 | classification/log/
6 | alsNet/log/
7 | part_seg/log/
8 | 
9 | # models
10 | model.ckpt.*
11 | *.pickle
12 | 
13 | #
14 | *.pyc
15 | 
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | alsNet: Classification of 3D Point Clouds using Deep Neural Networks
2 | 
3 | Copyright (c) 2018, Lukas Winiwarter, TU Wien
4 | 
5 | Based on:
6 | PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space.
7 | 
8 | Copyright (c) 2017, Geometric Computation Group of Stanford University
9 | 
10 | The MIT License (MIT)
11 | 
12 | Copyright (c) 2017 Charles R. Qi
13 | Copyright (c) 2018 Lukas Winiwarter
14 | 
15 | Permission is hereby granted, free of charge, to any person obtaining a copy
16 | of this software and associated documentation files (the "Software"), to deal
17 | in the Software without restriction, including without limitation the rights
18 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
19 | copies of the Software, and to permit persons to whom the Software is
20 | furnished to do so, subject to the following conditions:
21 | 
22 | The above copyright notice and this permission notice shall be included in all
23 | copies or substantial portions of the Software.
24 | 
25 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
26 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
27 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
28 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
29 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
30 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
31 | SOFTWARE.
32 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | ## alsNet: Classification of 3D Point Clouds using Deep Neural Networks
2 | 
3 | This is the code repository accompanying the diploma thesis of the same
4 | name that was carried out at the Research Group Photogrammetry
5 | of TU Wien.
6 | 
7 | *alsNet* is a neural network framework for the classification of point clouds acquired by airborne laser scanning.
8 | More details can be found in the thesis itself.
9 | 
10 | ![Comparison](bregenz_c1293.png "Comparison between reference (left) and estimated (right) classes. Differences shown in red/green below.")
11 | 
12 | ### PointNet, PointNet++
13 | *alsNet* is heavily based on the neural networks of *PointNet* and *PointNet++* by Charles R. Qi et al. from Stanford University.
14 | This especially concerns the tensorflow operations.
15 | The code has been updated to run on tensorflow 1.6, CUDA 9.0 and python3. To compile the operators, the instructions below can be followed; they are copied from the *PointNet++* repository.
16 | Since some changes have been made to the `tf_xxx_compile.sh` scripts, they should run as-is, provided a correct installation of CUDA, cuDNN and tensorflow-gpu exists.
17 | #### Compile Customized TF Operators
18 | The TF operators are included under `tf_ops`; you need to compile them first (check `tf_xxx_compile.sh` under each ops subfolder). Update the `nvcc` and `python` paths if necessary. The code is tested under TF 1.2.0. If you are using an earlier version, you may need to remove the `-D_GLIBCXX_USE_CXX11_ABI=0` flag in the g++ command in order to compile correctly.
19 | 
20 | To compile the operators in TF version >=1.4, you need to modify the compile scripts slightly.
21 | 
22 | First, find the Tensorflow include and library paths.
23 | 
24 |     TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
25 |     TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')
26 | 
27 | Then, add the flags `-I$TF_INC/external/nsync/public -L$TF_LIB -ltensorflow_framework` to the `g++` commands.
28 | 
29 | 
30 | ### Usage
31 | This section shows how to use *alsNet*, both in training and in inference. For all of these scripts, the parameter `-h` will show help information on the other parameters.
32 | 
33 | #### Preprocessing
34 | First, the dataset has to be tiled into chunks of 200,000 points each. Here we take a number of laz files, do not thin them out (`thinFactor 1`)
35 | and assume an average point density of 15 pts/m².
36 | 
37 |     alsNet/alsNet/alsNetPreparer.py --inFiles .../input*.laz --outFolder .../some/folder --thinFactor 1 --density 15 --kNN 200000
38 | 
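The chunk spacing follows from these parameters: `alsNetPreparer` computes the distance between neighbouring chunk centres as `sqrt(kNN * thinFactor / (pi * density)) * sqrt(2)/2 * 0.95`, where the factor 0.95 adds a 5% overlap margin. A quick check with the values used above:

    # spacing as computed in alsNetPreparer.py for --kNN 200000 --thinFactor 1 --density 15
    import numpy as np
    spacing = np.sqrt(200000 * 1 / (np.pi * 15)) * np.sqrt(2) / 2 * 0.95
    print("%.2f m" % spacing)  # -> 43.76 m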
39 | #### Training
40 | Now we can train a model based on these chunks and an architecture (e.g. `arch4`):
41 | 
42 |     alsNet/alsNet/alsNetRunner5.py --inList .../some/folder/stats.csv --threshold 20 --minBuild 0 --learningRate 0.0001 --outDir .../logs_models/ --archFile archs.arch4
43 | 
44 | Here, `--threshold` is an upper bound on the standard deviation of the class distribution (second column of `stats.csv`) and `--minBuild` a lower bound on the building fraction; chunks outside these bounds are skipped. If we want to continue training on an existing model, we can supply the path to the already saved files: `--continueModel .../logs_models/model_1_99.alsNet`
45 | 
46 | #### Inference
47 | To use a trained model on validation data, use the `alsNetEvaluator`. The data has to be prepared using the `alsNetPreparer` in advance.
48 | 
49 |     alsNet/alsNet/alsNetEvaluator.py --inFiles .../data/validate_c*.laz --model .../logs_models/model_1_99.alsNet --arch archs.arch4 --outDir .../predictions/
50 | 
51 | #### Postprocessing
52 | Finally, the chunks can be merged together to create a single output file. For this, the original reference point cloud (where each point appears once) is required: each chunk point is matched back to the reference cloud via a 2D kd-tree, the class probabilities of all overlapping chunks are averaged, and the class with the highest mean probability is assigned.
53 | 
54 |     alsNet/alsNet/alsNetMerger.py --inFiles .../predictions/*.laz --refFile .../input*.laz --outFile .../predictions/merged.laz
55 | 
56 | ### License
57 | The code is released under the MIT License (see the LICENSE file for details).
58 | 
59 | ### Related Projects
60 | 
61 | * PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation by Qi et al. (CVPR 2017 Oral Presentation). Code and data released on GitHub.
62 | * PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space by Qi et al. (NIPS 2017). A hierarchical feature learning framework on point clouds. The PointNet++ architecture applies PointNet recursively on a nested partitioning of the input point set. It also proposes novel layers for point clouds with non-uniform densities.
63 | 
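The evaluator script below wires these inference steps together; as a minimal programmatic sketch of the same workflow (the file paths are placeholders, not files shipped with the repository):

    # inference sketch distilled from alsNetEvaluator.py; paths are examples only
    from alsNetRefactored import AlsNetContainer
    from dataset import Dataset
    from archs.arch4 import arch

    model = AlsNetContainer(num_feat=3, num_classes=30, num_points=200000,
                            output_base="predictions/", arch=arch)
    model.load_model("logs_models/model_1_99.alsNet")    # restore trained weights
    ds = Dataset("data/validate_c0001.laz", load=False)  # a chunk written by alsNetPreparer
    acc = model.test_single(ds, save_to="predictions/validate_c0001_test.laz",
                            save_prob=True)              # also writes per-class probabilities
    print("accuracy: %.2f%%" % (acc * 100))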
--------------------------------------------------------------------------------
/alsNet/alsNetEvaluator.py:
--------------------------------------------------------------------------------
1 | import glob
2 | 
3 | from argparse import ArgumentParser
4 | from alsNetRefactored import AlsNetContainer
5 | from dataset import Dataset
6 | import numpy as np
7 | import os, sys
8 | import logging
9 | import importlib
10 | 
11 | 
12 | # disable tensorflow debug information:
13 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
14 | 
15 | logging.basicConfig(level=logging.DEBUG,
16 |                     format='%(asctime)s [%(levelname)s]: %(message)s',
17 |                     datefmt='%Y-%m-%d %H:%M:%S')
18 | 
19 | 
20 | def main(args):
21 |     arch = importlib.import_module(args.arch).arch
22 |     normalize = args.normalize
23 |     model = AlsNetContainer(num_feat=3, num_classes=30, num_points=200000, output_base=args.outDir, arch=arch)
24 |     logging.info("Loading pretrained model %s" % args.model)
25 |     model.load_model(args.model)
26 |     datasets = []
27 |     if not os.path.exists(args.outDir):
28 |         os.makedirs(args.outDir)
29 |     for filepattern in args.inFiles:
30 |         for file in glob.glob(filepattern):
31 |             datasets.append(Dataset(file, load=False, normalize=normalize))
32 |             logging.info("File %s loaded" % file)
33 |     total_acc = 0
34 |     total_batch = 0
35 |     for idx, dataset in enumerate(datasets):
36 |         logging.info("Loading dataset %d / %d (%s)" % (idx, len(datasets), dataset.filename))
37 |         acc = model.test_single(dataset,
38 |                                 save_to=os.path.join(args.outDir, os.path.basename(dataset.file).replace(".la", "_test.la")),
39 |                                 save_prob=True, unload=False)
40 |         logging.info("Current test accuracy: %.2f%%" % (acc * 100.))
41 |         meanxy = np.mean(dataset._xyz, axis=0)[0:2]  # centroid (mean x, mean y) of the chunk; axis=0 averages over the points
42 |         with open(os.path.join(args.outDir, 'result.csv'), 'a') as out_stat_file:
43 |             out_stat_file.write("%s, %.3f, %.3f, %.4f\n" % (dataset.file, meanxy[0], meanxy[1], acc) )
44 |         dataset.unload()
45 |         total_acc += acc
46 |         total_batch += 1
47 |         logging.info("Current avg test accuracy: %.2f%%" % ((total_acc/total_batch) * 100.))
48 |         sys.stdout.flush()
49 | 
50 | 
51 | 
52 | 
53 | if __name__ == '__main__':
54 |     parser = ArgumentParser()
55 |     parser.add_argument('--inFiles',
56 |                         default=[],
57 |                         required=True,
58 |                         help='input files (wildcard supported)',
59 |                         action='append')
60 |     parser.add_argument('--model', required=True, help='tensorflow model ckpt file')
61 |     parser.add_argument('--arch', required=True, help='python architecture file')
62 |     parser.add_argument('--outDir', required=True, help='log and output directory')
63 |     parser.add_argument('--normalize', default=1, type=int,
64 |                         help='normalize fields and coordinates [default: 1][1/0]')
65 |     args = parser.parse_args()
66 | 
67 |     main(args)
--------------------------------------------------------------------------------
/alsNet/alsNetHistory.py:
--------------------------------------------------------------------------------
1 | import datetime
2 | import numpy as np
3 | 
4 | 
5 | class AlsNetHistory:
6 |     def __init__(self):
7 |         self.cm = []
8 |         self.points_seen = []
9 |         self.timestamps = []
10 |         self.losses = []
11 | 
12 |     def add_history_step(self, cm, points_seen, loss, timestamp=None):
13 |         self.cm.append(cm+1e-8)  # +1e-8 to make sure always > 0
14 |         self.points_seen.append(points_seen)
15 |         self.losses.append(loss)
16 |         self.timestamps.append(timestamp or datetime.datetime.now())  # evaluated per call; a now() default argument would be frozen at import time
17 | 
18 |     def get_cm_timeline(self, i, j):
19 |         return [AlsNetHistory.over_gt(cm)[i, j] for cm in self.cm]
20 | 
21 |     def get_cm_timeline_compressed(self, i, j, keep_classes):
22 |         return [AlsNetHistory.over_gt(AlsNetHistory.get_cm_compressed(cm, keep_classes))[i, j] for cm in self.cm]
23 | 
24 |     def get_oa_timeline(self):
25 |         return [np.sum([cm[i, i] for i in range(cm.shape[0])]) / np.sum(cm, axis=(0,1)) for cm in self.cm]
26 | 
27 |     def get_oa_timeline_smooth(self, n_window):
28 |         return np.convolve(self.get_oa_timeline(), np.ones((n_window, ))/n_window, mode='valid')
29 | 
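    # Note on the two helpers below: over_gt() normalizes each row of a
    # confusion matrix by its row sum, i.e. over the ground-truth counts, so
    # entry [i, j] becomes the fraction of class-i reference points that were
    # predicted as class j and the diagonal holds the per-class recall.
    # get_cm_compressed() keeps only the rows/columns listed in keep_classes
    # and sums all other classes into one extra collector row/column.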
30 |     @staticmethod
31 |     def get_cm_compressed(cm, keep_classes=(2, 3, 4, 5, 6, 9), delete=False):
32 |         """
33 |         Compresses a confusion matrix into the interesting columns/rows
34 |         (careful, they are not ordered according to keep_classes, but the indices change!)
35 |         and collects the rest in the last column/row
36 |         :param cm: a 2D confusion matrix
37 |         :param keep_classes: a set of classes to keep
38 |         :param delete: delete rows from matrix after calculation (default: False)
39 |         :return: the compressed confusion matrix
40 |         """
41 |         coll_idx = cm.shape[0]
42 |         cm_buf = np.append(cm, np.zeros((1, coll_idx)), axis=0)
43 |         cm_buf = np.append(cm_buf, np.zeros((coll_idx + 1, 1)), axis=1)
44 |         sum_idxs = [i for i in range(coll_idx) if i not in keep_classes]
45 |         cm_buf[:, coll_idx] = np.sum(cm_buf[:, sum_idxs], axis=1)
46 |         cm_buf[coll_idx, :] = np.sum(cm_buf[sum_idxs, :], axis=0)
47 |         cm_buf[coll_idx, coll_idx] = np.sum(cm_buf[sum_idxs, -1])
48 |         if delete:
49 |             cm_buf = np.delete(cm_buf, sum_idxs, axis=0)
50 |             cm_buf = np.delete(cm_buf, sum_idxs, axis=1)
51 |         return cm_buf
52 | 
53 |     @staticmethod
54 |     def over_gt(cm):
55 |         return (cm.T / np.sum(cm, axis=1)).T
56 | 
57 |     def class_points_timeline(self, class_idx):
58 |         return [np.sum(cm[class_idx, :]) for cm in self.cm]
59 | 
60 | 
61 | if __name__ == '__main__':
62 |     cm = np.array([[45, 3, 4, 6, 3],
63 |                    [2, 18, 3, 5, 4],
64 |                    [9, 1, 13, 5, 7],
65 |                    [0, 4, 3, 15, 3],
66 |                    [2, 8, 3, 5, 14]])
67 |     cm_c = AlsNetHistory.get_cm_compressed(cm, (0, 1))
68 |     print(cm)
69 |     print(cm_c)
70 | 
--------------------------------------------------------------------------------
/alsNet/alsNetLogger.py:
--------------------------------------------------------------------------------
1 | import markdown
2 | import matplotlib
3 | matplotlib.use('agg')
4 | import matplotlib.pyplot as plt
5 | import numpy as np
6 | from io import BytesIO
7 | import base64
8 | import datetime
9 | import codecs
10 | import os
11 | import logging
12 | 
13 | HEAD = """
14 | 
15 | 
16 | 
115 | """
116 | TITLE="""
117 | 
118 | {name}
119 | 
120 | 
121 | 
122 | """
123 | 
124 | FOOT="""
125 | 
126 | 
127 | """
128 | 
129 | class Logger():
130 |     def __init__(self, outfile, training_files=[], num_points=0, multiclass=True, extra=""):
131 |         self.outfile = outfile
132 |         self.startdate = datetime.datetime.now()
133 |         self.arch = None
134 |         self.losses = []
135 |         self.lr = []
136 |         self.points_seen = []
137 |         self.accuracy_train = []
138 |         self.perc_ground = []
139 |         self.perc_building = []
140 |         self.perc_lo_veg = []
141 |         self.perc_med_veg = []
142 |         self.perc_hi_veg = []
143 |         self.perc_water = []
144 |         self.perc_rest = []
145 |         self.cumaccuracy_train = []
146 |         self.valid_points_seen = []
147 |         self.valid_points_acc = []
148 |         self.valid_points_cumacc = []
149 |         self.valid_confusion = []
150 |         self.plots = {}
151 |         self.container = None
152 |         self.training_files = training_files
153 |         self.num_points = num_points
154 |         self.multiclass = multiclass
155 |         self.extra = extra
156 |         if not os.path.exists(os.path.dirname(outfile)):
157 |             os.makedirs(os.path.dirname(outfile))
158 | 
159 |     def add_plot(self):
160 |         pass
161 | 
162 |     def save(self):
163 |         currdate = datetime.datetime.now()
164 |         train_repr = (self.training_files[:10] + ["..."]) if len(self.training_files) > 10 else self.training_files
165 |         md = """
166 | alsNet Logger
167 | =============
168 | Date started: {startdate}
169 | 
170 | Current Date: {currdate}
171 | 
172 | * * *
173 | 
174 | Parameters
175 | ----------
176 | 
177 | ### Global
178 | 
179 | points per batch: {ppb}
180 | 
181 | learning rate: {learning_rate}
182 | 
183 | dropout rate: {dropout_rate}
184 | 
185 | classes: {classes}
186 | 
187 | training files:
188 | 
189 | {training_files}
190 | 
191 | {extra}
192 | """.format(startdate=self.startdate.strftime('%Y-%m-%d %H:%M:%S'),
193 | 
currdate=currdate.strftime('%Y-%m-%d %H:%M:%S'), 194 | learning_rate=self.container.learning_rate, 195 | training_files="\n ".join(train_repr), 196 | ppb=self.num_points, 197 | dropout_rate=self.container.dropout, 198 | classes="all" if self.multiclass else "only ground/non-ground", 199 | extra=self.extra) 200 | for nr, level in enumerate(self.arch): 201 | md += """ 202 | ### Level {levelno} 203 | 204 | nPoints = {npoint} 205 | radius = {radius} 206 | nSample = {nsample} 207 | mlp = {mlp} 208 | pooling = {pooling} 209 | mlp2 = {mlp2} 210 | reverse_mlp = {reverse_mlp} 211 | 212 | """.format(levelno=nr, **level) 213 | 214 | 215 | self.create_plots() 216 | 217 | md += """ 218 | * * * 219 | 220 | Training 221 | -------- 222 | 223 | ![Loss]({plot_loss} "Loss") 224 | 225 | Loss (latest: {loss}) 226 | 227 | ![Instantaneous accuracy]({plot_acc} "Instantaneous accuracy") 228 | 229 | Instantaneous accuracy (latest: {acc}) 230 | 231 | ![Cumulative accuracy]({plot_cumacc} "Cumulative accuracy") 232 | 233 | Cumulative accuracy (latest: {cumacc}) 234 | 235 | ![Class representativity]({plot_class} "Class representativity") 236 | 237 | Class representativity 238 | 239 | ![Confusion matrix]({plot_confusion} "Confusion matrix") 240 | 241 | Confusion matrix 242 | 243 | * * * 244 | 245 | Testing 246 | ------- 247 | N/A 248 | """.format(loss=self.losses[-1], 249 | acc=self.accuracy_train[-1], 250 | cumacc=self.cumaccuracy_train[-1], 251 | plot_acc=self.plots['acc'], 252 | plot_cumacc=self.plots['cumacc'], 253 | plot_loss=self.plots['loss'], 254 | plot_class=self.plots['classes'], 255 | plot_confusion=self.plots['confusion']) 256 | 257 | html = markdown.markdown(md) 258 | output_file = codecs.open(self.outfile, "w", 259 | encoding="utf-8", 260 | errors="xmlcharrefreplace") 261 | output_file.write(HEAD + TITLE.format(name=os.path.dirname(self.outfile).split(os.sep)[-1]) + html + FOOT) 262 | 263 | 264 | def create_plots(self): 265 | data_folder = os.path.join(os.path.dirname(self.outfile), "plot_data") 266 | if not os.path.exists(data_folder): 267 | os.makedirs(data_folder) 268 | d = { 269 | 'points_seen': self.points_seen, 270 | 'losses': self.losses, 271 | 'lr': self.lr, 272 | 'perc_ground': self.perc_ground, 273 | 'perc_building': self.perc_building, 274 | 'perc_hi_veg': self.perc_hi_veg, 275 | 'perc_lo_veg': self.perc_lo_veg, 276 | 'perc_med_veg': self.perc_med_veg, 277 | 'perc_water': self.perc_water, 278 | 'perc_rest': self.perc_rest, 279 | 'cumaccuracy': self.cumaccuracy_train, 280 | 'accuracy': self.accuracy_train, 281 | 'valid_points': self.valid_points_seen, 282 | 'valid_accuracy': self.valid_points_acc, 283 | 'valid_cumaccuracy': self.valid_points_cumacc 284 | } 285 | np.save(os.path.join(data_folder, 'data.npy'), d) 286 | 287 | logging.debug("Starting plotting...") 288 | 289 | fig = plt.figure(figsize=(10,4)) 290 | plt.plot(self.points_seen, self.losses, label='mean loss') 291 | plt.xlabel("Mio. 
points seen") 292 | plt.ylabel("Loss (absolute)") 293 | ax2 = plt.twinx() 294 | ax2.plot(self.points_seen, self.lr, color="green", label="learning rate") 295 | ax2.set_ylabel("Learning rate") 296 | ax2.set_yscale("log") 297 | plt.tight_layout() 298 | figdata = BytesIO() 299 | plt.savefig(figdata, format='png') 300 | self.plots['loss'] = "data:image/png;base64,%s" % base64.b64encode(figdata.getvalue() 301 | ).decode('utf-8').replace('\n', '') 302 | #plt.savefig(os.path.join(outpath, 'plot_loss.png'), bbox_inches='tight') 303 | plt.close() 304 | 305 | 306 | fig = plt.figure(figsize=(10,4)) 307 | plt.plot(self.points_seen, self.accuracy_train, label='current accuracy') 308 | plt.plot(self.points_seen, self.perc_ground, color='tab:purple', label='ground point percentage') 309 | plt.plot(self.valid_points_seen, self.valid_points_acc, color='g', marker='+', linestyle='None', label='validation accuracy') 310 | plt.legend(loc=3) 311 | plt.xlabel("Mio. points seen") 312 | plt.ylabel("Percent") 313 | plt.ylim([0, 100]) 314 | plt.grid(True) 315 | plt.tight_layout() 316 | figdata = BytesIO() 317 | plt.savefig(figdata, format='png') 318 | self.plots['acc'] = "data:image/png;base64,%s" % base64.b64encode(figdata.getvalue() 319 | ).decode('utf-8').replace('\n', '') 320 | #plt.savefig(os.path.join(outpath, 'plot_acc.png'), bbox_inches='tight') 321 | plt.close() 322 | 323 | fig = plt.figure(figsize=(10,4)) 324 | plt.plot(self.points_seen, self.cumaccuracy_train, label='cumulative accuracy') 325 | plt.plot(self.valid_points_seen, self.valid_points_cumacc, label='cumulative validation accuracy') 326 | plt.legend(loc=3) 327 | plt.xlabel("Mio. points seen") 328 | plt.ylabel("Percent") 329 | plt.ylim([0, 100]) 330 | plt.grid(True) 331 | plt.tight_layout() 332 | figdata = BytesIO() 333 | plt.savefig(figdata, format='png') 334 | self.plots['cumacc'] = "data:image/png;base64,%s" % base64.b64encode(figdata.getvalue() 335 | ).decode('utf-8').replace('\n', '') 336 | #plt.savefig(os.path.join(outpath, 'plot_cumacc.png'), bbox_inches='tight') 337 | plt.close() 338 | 339 | fig = plt.figure(figsize=(10,4)) 340 | plt.stackplot(self.points_seen, 341 | self.perc_ground, self.perc_hi_veg, self.perc_med_veg, 342 | self.perc_lo_veg, self.perc_building, self.perc_water, self.perc_rest, 343 | labels=['ground', 'hi veg', 'med veg', 'lo veg', 'building'], 344 | colors=('xkcd:bright purple', 345 | 'xkcd:dark green', 346 | 'xkcd:kelly green', 347 | 'xkcd:lime', 348 | 'xkcd:light red', 349 | 'xkcd:water blue', 350 | 'xkcd:light grey')) 351 | 352 | plt.ylim([0, 100]) 353 | plt.ylabel("Percent") 354 | #ax2 = plt.twinx() 355 | #ax2.plot(self.points_seen, self.losses) 356 | #ax2.set_ylabel("Std. dev. of percent") 357 | #plt.legend(loc=3) 358 | plt.xlabel("Mio. 
points seen") 359 | plt.tight_layout() 360 | figdata = BytesIO() 361 | plt.savefig(figdata, format='png') 362 | self.plots['classes'] = "data:image/png;base64,%s" % base64.b64encode(figdata.getvalue() 363 | ).decode('utf-8').replace('\n', '') 364 | plt.close() 365 | 366 | # confusion matrix plot 367 | if self.valid_confusion: 368 | fig = plt.figure(figsize=(10, 10)) 369 | num_classes = self.valid_confusion[0].shape[0] 370 | for ref_class in range(num_classes): 371 | curr_ref_axis = None 372 | for eval_class in range(num_classes): 373 | curplt_id = ref_class * num_classes + eval_class + 1 374 | conf_timeline = [self.valid_confusion[i][ref_class, eval_class] for i in range(len(self.valid_confusion))] 375 | if curr_ref_axis: 376 | plt.subplot(num_classes, num_classes, curplt_id, sharey=curr_ref_axis) 377 | else: 378 | curr_ref_axis = plt.subplot(num_classes, num_classes, curplt_id) 379 | plt.plot(self.valid_points_seen, conf_timeline) 380 | plt.ylim([0, 1]) 381 | 382 | plt.tight_layout() 383 | figdata = BytesIO() 384 | plt.savefig(figdata, format='png') 385 | self.plots['confusion'] = "data:image/png;base64,%s" % base64.b64encode(figdata.getvalue() 386 | ).decode('utf-8').replace('\n', '') 387 | plt.close() 388 | else: 389 | self.plots['confusion'] = "" 390 | 391 | logging.debug("Plotting done.") -------------------------------------------------------------------------------- /alsNet/alsNetLogger2.py: -------------------------------------------------------------------------------- 1 | import markdown 2 | import matplotlib 3 | matplotlib.use('agg') 4 | from matplotlib import gridspec 5 | import matplotlib.pyplot as plt 6 | import numpy as np 7 | from io import BytesIO 8 | import base64 9 | import datetime 10 | import codecs 11 | import os 12 | import logging 13 | 14 | HEAD = """ 15 | 16 | 17 | 116 | """ 117 | TITLE=""" 118 | 119 | {name} 120 | 121 | 122 | 123 | """ 124 | 125 | FOOT=""" 126 | 127 | 128 | """ 129 | 130 | class Logger(): 131 | def __init__(self, outfile, inst, training_files): 132 | self.outfile = outfile 133 | self.inst = inst 134 | self.startdate = datetime.datetime.now() 135 | self.training_files = training_files 136 | self.extra = "" 137 | self.plots = {} 138 | 139 | if not os.path.exists(os.path.dirname(outfile)): 140 | os.makedirs(os.path.dirname(outfile)) 141 | 142 | 143 | def save(self): 144 | currdate = datetime.datetime.now() 145 | train_repr = ([f.file for f in self.training_files[:10]]+ ["..."]) if len(self.training_files) > 10 else [f.file for f in self.training_files] 146 | md = """ 147 | alsNet Logger2 148 | ============== 149 | Date started: {startdate} 150 | 151 | Current Date: {currdate} 152 | 153 | * * * 154 | 155 | Parameters 156 | ---------- 157 | 158 | ### Global 159 | 160 | points per batch: {ppb} 161 | 162 | learning rate: {learning_rate} 163 | 164 | dropout rate: {dropout_rate} 165 | 166 | 167 | training files: 168 | 169 | {training_files} 170 | 171 | {extra} 172 | """.format(startdate=self.startdate.strftime('%Y-%m-%d %H:%M:%S'), 173 | currdate=currdate.strftime('%Y-%m-%d %H:%M:%S'), 174 | learning_rate=self.inst.learning_rate, 175 | training_files="\n ".join(train_repr), 176 | ppb=self.inst.num_points, 177 | dropout_rate=self.inst.dropout, 178 | extra=self.extra) 179 | for nr, level in enumerate(self.inst.arch): 180 | md += """ 181 | ### Level {levelno} 182 | 183 | nPoints = {npoint} 184 | radius = {radius} 185 | nSample = {nsample} 186 | mlp = {mlp} 187 | pooling = {pooling} 188 | mlp2 = {mlp2} 189 | reverse_mlp = {reverse_mlp} 190 | 191 | 
""".format(levelno=nr, **level) 192 | 193 | 194 | self.create_plots() 195 | md += """ 196 | * * * 197 | 198 | Training 199 | -------- 200 | 201 | ![Loss]({plot_loss} "Loss") 202 | 203 | Loss (latest: {loss}) 204 | 205 | ![Accuracy]({plot_acc} "Accuracy") 206 | 207 | Accuracy (latest 5: {acc}) 208 | 209 | ![Class representativity]({plot_class} "Class representativity") 210 | 211 | Class representativity 212 | 213 | ![Confusion matrix]({plot_confusion} "Confusion matrix") 214 | 215 | Confusion matrix 216 | 217 | * * * 218 | 219 | Testing 220 | ------- 221 | N/A 222 | """.format(loss=self.inst.train_history.losses[-1], 223 | acc=self.inst.train_history.get_oa_timeline()[-1], 224 | 225 | plot_loss=self.plots['loss'], 226 | plot_acc=self.plots['acc'], 227 | plot_class=self.plots['classes'], 228 | plot_confusion=self.plots['confusion']) 229 | 230 | html = markdown.markdown(md) 231 | output_file = codecs.open(self.outfile, "w", 232 | encoding="utf-8", 233 | errors="xmlcharrefreplace") 234 | output_file.write(HEAD + TITLE.format(name=os.path.dirname(self.outfile).split(os.sep)[-1]) + html + FOOT) 235 | 236 | 237 | def create_plots(self): 238 | data_folder = os.path.join(os.path.dirname(self.outfile), "plot_data") 239 | if not os.path.exists(data_folder): 240 | os.makedirs(data_folder) 241 | d = { 242 | 'train_cm': self.inst.train_history.cm, 243 | 'train_losses': self.inst.train_history.losses, 244 | 'train_point_seen': self.inst.train_history.points_seen, 245 | 'train_timestamps': self.inst.train_history.timestamps, 246 | 'eval_cm': self.inst.eval_history.cm, 247 | 'eval_point_seen': self.inst.eval_history.points_seen, 248 | 'eval_timestamps': self.inst.eval_history.timestamps, 249 | } 250 | np.save(os.path.join(data_folder, 'data.npy'), d) 251 | 252 | logging.debug("Starting plotting...") 253 | 254 | fig = plt.figure(figsize=(10,4)) 255 | plt.plot(self.inst.train_history.points_seen, self.inst.train_history.losses, label='mean loss') 256 | plt.xlabel("Mio. points seen") 257 | plt.ylabel("Loss (absolute)") 258 | #ax2 = plt.twinx() 259 | #ax2.plot(self.points_seen, self.lr, color="green", label="learning rate") 260 | #ax2.set_ylabel("Learning rate") 261 | #ax2.set_yscale("log") 262 | plt.tight_layout() 263 | figdata = BytesIO() 264 | plt.savefig(figdata, format='png') 265 | self.plots['loss'] = "data:image/png;base64,%s" % base64.b64encode(figdata.getvalue() 266 | ).decode('utf-8').replace('\n', '') 267 | plt.close() 268 | 269 | 270 | fig = plt.figure(figsize=(10,4)) 271 | plt.plot(self.inst.train_history.points_seen, 272 | self.inst.train_history.get_oa_timeline(), 273 | label='current accuracy') 274 | #plt.plot(self.inst.train_history.points_seen, 275 | # self.inst.train_history.get_oa_timeline_smooth(5)[1:-1], 276 | # color='tab:purple', label='averaged accuracy (5)') 277 | plt.plot(self.inst.eval_history.points_seen, 278 | self.inst.eval_history.get_oa_timeline(), 279 | color='g', marker='+', linestyle='None', label='validation accuracy') 280 | 281 | plt.legend(loc=3) 282 | plt.xlabel("Mio. 
points seen") 283 | plt.ylabel("x100 Percent") 284 | plt.ylim([0, 1]) 285 | plt.grid(True) 286 | plt.tight_layout() 287 | figdata = BytesIO() 288 | plt.savefig(figdata, format='png') 289 | self.plots['acc'] = "data:image/png;base64,%s" % base64.b64encode(figdata.getvalue() 290 | ).decode('utf-8').replace('\n', '') 291 | plt.close() 292 | 293 | fig = plt.figure(figsize=(10,4)) 294 | 295 | keep_classes = (2, 3, 4, 5, 6, 9) 296 | for class_id, label, color in zip(keep_classes, 297 | ['ground', 'hi veg', 'med veg', 'lo veg', 'building', 'water'], 298 | ['xkcd:bright purple', 299 | 'xkcd:dark green', 300 | 'xkcd:kelly green', 301 | 'xkcd:lime', 302 | 'xkcd:light red', 303 | 'xkcd:water blue'] 304 | ): 305 | plt.plot(self.inst.train_history.points_seen, 306 | np.cumsum(self.inst.train_history.class_points_timeline(class_id)), 307 | label=label, 308 | color=color) 309 | 310 | plt.ylabel("Points of class seen") 311 | plt.legend(loc=3) 312 | plt.xlabel("Mio. points seen") 313 | plt.tight_layout() 314 | figdata = BytesIO() 315 | plt.savefig(figdata, format='png') 316 | self.plots['classes'] = "data:image/png;base64,%s" % base64.b64encode(figdata.getvalue() 317 | ).decode('utf-8').replace('\n', '') 318 | plt.close() 319 | 320 | # confusion matrix plot 321 | fig = plt.figure(figsize=(10, 10)) 322 | num_classes = len(keep_classes)+1 323 | keep_classes_e = keep_classes + (-1,) 324 | gs = gridspec.GridSpec(num_classes, num_classes) 325 | 326 | row = -1 327 | for ref_class in keep_classes_e: 328 | curr_ref_axis = None 329 | row += 1 330 | col = -1 331 | for eval_class in keep_classes_e: 332 | col += 1 333 | 334 | 335 | 336 | conf_timeline = self.inst.eval_history.get_cm_timeline_compressed(ref_class, eval_class, keep_classes) 337 | if curr_ref_axis: 338 | plt.subplot(gs[row, col], sharey=curr_ref_axis) 339 | else: 340 | curr_ref_axis = plt.subplot(gs[row, col]) 341 | 342 | plt.plot(self.inst.eval_history.points_seen, conf_timeline) 343 | 344 | if col == row: 345 | plt.gca().set_facecolor('xkcd:pale green') 346 | highcolor = 'xkcd:forest green' 347 | lowcolor = 'xkcd:grass green' 348 | else: 349 | plt.gca().set_facecolor('xkcd:pale pink') 350 | highcolor = 'xkcd:orange red' 351 | lowcolor = 'xkcd:dirty pink' 352 | if conf_timeline: 353 | plt.text((self.inst.eval_history.points_seen[0] + self.inst.eval_history.points_seen[-1])/2, 354 | 0.5, 355 | "%.1f%%" % (conf_timeline[-1]*100), ha='center', 356 | color=highcolor if conf_timeline[-1]>0.5 else lowcolor) 357 | cm = self.inst.eval_history.cm[-1] 358 | ref_sum = np.sum(cm, axis=1)[ref_class] 359 | eval_sum = np.sum(cm, axis=0)[eval_class] 360 | plt.text((self.inst.eval_history.points_seen[0] + self.inst.eval_history.points_seen[-1])/2, 361 | 0.3, 362 | "%d" % (cm[ref_class, eval_class]), ha='center') 363 | if col == 0: 364 | plt.ylabel('%d\n%d\n(%.0f%%)' % (ref_class, 365 | ref_sum, 366 | ref_sum/self.inst.num_points * 100)) 367 | if row == 0: 368 | plt.gca().xaxis.set_label_position('top') 369 | plt.xlabel('%d\n%d\n(%.0f%%)' % (eval_class, 370 | eval_sum, 371 | eval_sum/self.inst.num_points * 100)) 372 | 373 | 374 | plt.gca().get_yaxis().set_ticks([]) 375 | plt.gca().get_xaxis().set_ticks([]) 376 | 377 | plt.ylim([0, 1]) 378 | 379 | fig.text(0.5, 0.94, 'Estimated', ha='center', va='center') 380 | fig.text(0.06, 0.5, 'Ground truth', ha='center', va='center', rotation='vertical') 381 | 382 | plt.subplots_adjust(hspace=.0, wspace=.0) 383 | figdata = BytesIO() 384 | plt.savefig(figdata, format='png') 385 | self.plots['confusion'] = "data:image/png;base64,%s" % 
base64.b64encode(figdata.getvalue() 386 | ).decode('utf-8').replace('\n', '') 387 | plt.close() 388 | 389 | logging.debug("Plotting done.") -------------------------------------------------------------------------------- /alsNet/alsNetMerger.py: -------------------------------------------------------------------------------- 1 | import glob 2 | from argparse import ArgumentParser 3 | from dataset import Dataset 4 | from scipy.spatial import ckdtree 5 | from sklearn import metrics 6 | import numpy as np 7 | import os 8 | import logging 9 | 10 | logging.basicConfig(level=logging.DEBUG, 11 | format='%(asctime)s [%(levelname)s]: %(message)s', 12 | datefmt='%Y-%m-%d %H:%M:%S') 13 | 14 | MAX_CLASSES = 30 15 | 16 | def main(in_files, ref_file, out_file, write_probs=True): 17 | input_files = [] 18 | for filepattern in in_files: 19 | for file in glob.glob(filepattern): 20 | input_files.append(file) 21 | 22 | logging.info("Found %d files to merge" % len(input_files)) 23 | 24 | overall_points = None 25 | 26 | out_points = [] 27 | out_attrs = [] 28 | out_counts = [] 29 | out_meanvar = [] 30 | out_labels = [] 31 | new_max_class = [] 32 | names = None 33 | 34 | logging.info("Loading reference dataset") 35 | ref_ds = Dataset(ref_file) 36 | ref_points = ref_ds._xyz 37 | out_labels = ref_ds.labels 38 | 39 | prob_sums = np.zeros((ref_points.shape[0], MAX_CLASSES)) 40 | prob_counts = np.zeros((ref_points.shape[0],)) 41 | 42 | logging.info("Building 2D kD-Tree on the reference dataset") 43 | tree = ckdtree.cKDTree(ref_points[:, 0:2]) # only on 2D :D 44 | 45 | for fileidx, file in enumerate(input_files): 46 | logging.info("Processing file %d" % fileidx) 47 | ds = Dataset(file) 48 | points = np.hstack((ds.points_and_features, np.expand_dims(ds.labels, -1))) 49 | names = ds.names 50 | prob_ids_here = [] 51 | prob_ids_ref = [] 52 | for idx, name in enumerate(names): 53 | if name.startswith('prob_class'): 54 | prob_ids_here.append(idx+3) 55 | prob_ids_ref.append(int(name.split('prob_class')[-1])) 56 | 57 | for ptidx in range(points.shape[0]): 58 | xy = points[ptidx, 0:2] 59 | ref_ids = tree.query_ball_point(xy, r=0.0001, eps=0.0001) 60 | if len(ref_ids) > 1: 61 | ref_id = ref_ids[np.argmin(np.abs(ref_points[ref_ids, -1] - points[ptidx, 3]), axis=0)] 62 | elif len(ref_ids) == 0: 63 | logging.warn("Point not found: %s" % xy) 64 | continue 65 | else: 66 | ref_id = ref_ids[0] 67 | prob_counts[ref_id] += 1 68 | probs_here = points[ptidx, prob_ids_here] 69 | prob_sums[ref_id, prob_ids_ref] += probs_here 70 | del ds 71 | del points 72 | 73 | # clear memory 74 | ref_ds = None 75 | 76 | out_points = ref_points 77 | print(prob_counts) 78 | print(prob_sums[ref_id, :]) 79 | 80 | #prob_avgs = prob_sums / prob_counts[:, np.newaxis] 81 | #print(prob_avgs) 82 | #print(prob_avgs[ref_id, :]) 83 | new_max_class = np.zeros((ref_points.shape[0])) 84 | for i in range(ref_points.shape[0]): 85 | curr_point = prob_sums[i, :] / prob_counts[i] 86 | curr_point_max = np.argmax(curr_point) 87 | new_max_class[i] = curr_point_max 88 | #new_max_class = np.argmax(prob_avgs, axis=1) 89 | print(new_max_class) 90 | print(new_max_class[ref_id]) 91 | print(out_labels[ref_id]) 92 | #print(np.argmax(prob_avgs, axis=0)) 93 | print(new_max_class.shape, out_labels.shape) 94 | #avg_feats = np.mean(prob_avgs, axis=1) 95 | #var_feats = np.std(prob_avgs, axis=1) 96 | 97 | #for line in range(ref_points.shape[0]): 98 | # if line%10 == 0: 99 | # #logging.info("Currently in line %d from %d" % (line, ref_points.shape[0])) 100 | # pass 101 | # curr_xyz = 
ref_points[line,:] 102 | # multiples = tree.query_ball_point(curr_xyz, r=0.0001, eps=0.0001) 103 | # #print(curr_xyz) 104 | # #print(overall_points[multiples, 0:3]) 105 | # idx_processed += multiples 106 | # if len(multiples) == 0: 107 | # logging.info("Point missing: %s" % curr_xyz) 108 | # continue 109 | # out_points.append(curr_xyz) 110 | # out_labels.append(overall_points[multiples[0], -1]) 111 | # out_meanvar.append(np.mean(var_feats)) 112 | # out_attrs.append(avg_feats) 113 | # new_max_class.append(np.argmax(avg_feats[prob_cols], axis=0)) 114 | # #print(np.argmax(avg_feats[prob_cols], axis=0)) 115 | # #print(multiples, len(multiples)) 116 | # out_counts.append(len(multiples)) 117 | 118 | #out_attrs = np.array(out_attrs) 119 | #out_counts = np.expand_dims(np.array(out_counts), -1) 120 | #out_meanvar = np.expand_dims(np.array(out_meanvar), -1) 121 | #names.append('meanvar') 122 | #names.append('count') 123 | ##attr_avg = np.sum(out_attrs, axis=1)/out_counts 124 | #out_labels = np.array(out_labels).astype(np.int) 125 | #Dataset.Save(out_file, np.hstack((out_points, out_attrs, out_meanvar, out_counts)), names, labels=out_labels, new_classes=new_max_class, 126 | # )#addDims=addDims + ['meanvar', 'count']) 127 | #staticstics 128 | #pre_acc = np.count_nonzero(overall_points[:, estim_col] == overall_points[:, -1]) / overall_points.shape[0] 129 | pre_acc = 0 130 | post_acc = np.count_nonzero(new_max_class == out_labels) / len(out_points) 131 | post_cm = metrics.confusion_matrix(out_labels, new_max_class, labels=range(17)) 132 | post_prec = metrics.precision_score(out_labels, new_max_class, average=None) 133 | post_recall = metrics.recall_score(out_labels, new_max_class, average=None) 134 | post_f1 = metrics.f1_score(out_labels, new_max_class, average=None) 135 | np.savetxt(out_file + '_cm.txt', post_cm, fmt='%.4f') 136 | np.savetxt(out_file + '_prec.txt', post_prec, fmt='%.4f') 137 | np.savetxt(out_file + '_recall.txt', post_recall, fmt='%.4f') 138 | np.savetxt(out_file + '_f1.txt', post_f1, fmt='%.4f') 139 | logging.info("Finished. 
Pre-acc: %.3f | Post-acc: %.3f" % (pre_acc, post_acc))
140 | 
141 | 
142 | 
143 | 
144 | if __name__ == '__main__':
145 |     parser = ArgumentParser()
146 |     parser.add_argument('--inFiles',
147 |                         default=[],
148 |                         required=True,
149 |                         help='input files (wildcard supported)',
150 |                         action='append')
151 |     parser.add_argument('--refFile',
152 |                         required=True,
153 |                         help='File with all the output points present')
154 |     parser.add_argument('--outFile', required=True, help='path to write output to')
155 |     parser.add_argument('--writeProbs', default=True, type=bool, help='write class probabilities')
156 |     args = parser.parse_args()
157 | 
158 |     main(args.inFiles, args.refFile, args.outFile, args.writeProbs)
--------------------------------------------------------------------------------
/alsNet/alsNetPreparer.py:
--------------------------------------------------------------------------------
1 | from argparse import ArgumentParser
2 | import numpy as np
3 | import csv
4 | import glob
5 | import os
6 | import sys
7 | import dataset
8 | 
9 | def main(in_files, density, kNN, out_folder, thinFactor):
10 |     spacing = np.sqrt(kNN*thinFactor/(np.pi*density)) * np.sqrt(2)/2 * 0.95  # 5% MARGIN
11 |     print("Using a spacing of %.2f m" % spacing)
12 |     if not os.path.exists(out_folder):
13 |         os.makedirs(out_folder)
14 | 
15 |     statlist = [["Filename", "StdDev_Classes", "Ground", "Lo Veg", "Med Veg", "Hi Veg", "Building", "Water", "Rest"]]  # one header column per value written in list_entry below; alsNetRunner5 reads the building fraction from column 6
16 |     for file_pattern in in_files:
17 |         print(file_pattern)
18 |         for file in glob.glob(file_pattern):
19 |             print("Loading file %s" % file)
20 |             d = dataset.kNNBatchDataset(file=file, k=int(kNN*thinFactor), spacing=spacing)
21 |             while True:
22 |                 print("Processing batch %d/%d" % (d.currIdx, d.num_batches))
23 |                 points_and_features, labels = d.getBatches(batch_size=1)
24 |                 idx_to_use = np.random.choice(range(int(thinFactor*kNN)), kNN)
25 |                 names = d.names
26 |                 out_name = d.filename.replace('.la', '_c%04d.la' % d.currIdx)  # laz or las
27 |                 out_path = os.path.join(out_folder, out_name)
28 |                 if points_and_features is not None:
29 |                     stats = dataset.ChunkedDataset.chunkStatistics(labels[0], 10)
30 |                     rest = 1 - (stats['relative'][2] +
31 |                                 stats['relative'][3] +
32 |                                 stats['relative'][4] +
33 |                                 stats['relative'][5] +
34 |                                 stats['relative'][6] +
35 |                                 stats['relative'][9])
36 |                     perc = [stats['relative'][2],
37 |                             stats['relative'][3],
38 |                             stats['relative'][4],
39 |                             stats['relative'][5],
40 |                             stats['relative'][6],
41 |                             stats['relative'][9],
42 |                             rest]
43 |                     stddev = np.std(perc) * 100
44 |                     list_entry = [out_name, "%.3f" % stddev, *["%.3f" % p for p in perc]]
45 |                     statlist.append(list_entry)
46 |                     dataset.Dataset.Save(out_path, points_and_features[0][idx_to_use], names,
47 |                                          labels=labels[0][idx_to_use], new_classes=None)
48 |                 else:  # no more data
49 |                     break
50 | 
51 |     with open(os.path.join(out_folder, "stats.csv"), "wb") as f:
52 |         for line in statlist:
53 |             f.write((",".join(line) + "\n").encode('utf-8'))
54 | 
55 | if __name__ == '__main__':
56 |     parser = ArgumentParser()
57 |     parser.add_argument('--inFiles',
58 |                         default=[],
59 |                         required=True,
60 |                         help='input files (wildcard supported)',
61 |                         action='append')
62 |     parser.add_argument('--density', type=float, required=True, help='average point density')
63 |     parser.add_argument('--kNN', default=200000, type=int, required=True, help='how many points per batch [default: 200000]')
64 |     parser.add_argument('--outFolder', required=True, help='where to write output files and statistics to')
65 |     parser.add_argument('--thinFactor', type=float, default=1., help='factor to thin out points by
(2=use half of the points)') 66 | args = parser.parse_args() 67 | 68 | main(args.inFiles, args.density, args.kNN, args.outFolder, args.thinFactor) -------------------------------------------------------------------------------- /alsNet/alsNetRunner.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import glob 3 | import logging 4 | import numpy as np 5 | from alsNet import AlsNetContainer 6 | from alsNetLogger import Logger 7 | from dataset import ChunkedDataset 8 | from dataset import kNNBatchDataset 9 | import os 10 | # disable tensorflow debug information: 11 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 12 | 13 | logging.basicConfig(level=logging.DEBUG, 14 | format='%(asctime)s [%(levelname)s]: %(message)s', 15 | datefmt='%Y-%m-%d %H:%M:%S') 16 | 17 | if __name__ == '__main__': 18 | 19 | parser = argparse.ArgumentParser() 20 | parser.add_argument('--inFiles', default='./*.laz', help='input files (wildcard supported) [default: ./*.laz]') 21 | parser.add_argument('--testFiles', type=str, help='files for testing/validation ') 22 | parser.add_argument('--batchSize', default=10, type=int, help='batch size for training [default: 10]') 23 | parser.add_argument('--kNN', default=100000, type=int, help='how many points per batch [default: 100000]') 24 | parser.add_argument('--spacing', default=100, type=float, help='spatial spacing between batches in m [default: 100]') 25 | parser.add_argument('--dropout', default=0.5, type=float, help='probability to randomly drop a neuron ' + 26 | 'in the last layer [default: 0.5]') 27 | parser.add_argument('--logDir', default='log', help='directory to write html log to [default: log]') 28 | parser.add_argument('--multiclass', default=True, type=bool, help='label into multiple classes ' + 29 | '(not only ground/nonground) [default: True]') 30 | args = parser.parse_args() 31 | num_classes = 30 32 | batch_size = int(args.batchSize) 33 | train_files = glob.glob(args.inFiles) 34 | test_files = glob.glob(args.testFiles) 35 | dropout = float(args.dropout) 36 | kNN = int(args.kNN) 37 | spacing = int(args.spacing) 38 | logdir = args.logDir 39 | multiclass = args.multiclass 40 | logger = Logger(os.path.join(logdir, 'alsnet-log.html'), training_files=train_files, num_points=kNN, 41 | multiclass=multiclass) 42 | alsNetInstance = AlsNetContainer('log', 0.01, logger=logger, dropout=dropout) 43 | logger.container = alsNetInstance 44 | alsNetInstance.prepare(num_feat=1, num_classes=num_classes, points_in=kNN, batch_size=batch_size) 45 | 46 | logging.info("""Training 47 | _____ ___ _ ___ _ _ ___ _ _ ___ 48 | |_ _|| _ \ /_\ |_ _|| \| ||_ _|| \| | / __| 49 | | | | / / _ \ | | | .` | | | | .` || (_ | 50 | |_| |_|_\/_/ \_\|___||_|\_||___||_|\_| \___| 51 | """) 52 | 53 | 54 | for idx, file in enumerate(train_files): 55 | logging.info(' - FILE %d/%d (%s) -' % (idx+1, len(train_files), file)) 56 | train_ds = kNNBatchDataset(file=file, k=kNN, spacing=spacing, multiclass=multiclass) 57 | batch_idx = 0 58 | test_paf, test_labels = train_ds.getBatchByIdx(1200) 59 | while True: 60 | logging.info(" -- Fetching batches (%d-%d)/%d..." % (batch_idx + 1, 61 | batch_size + batch_idx - 1 + 1, 62 | train_ds.num_batches)) 63 | points_and_features, labels = train_ds.getBatches(batch_size=batch_size) 64 | batch_idx += 1 65 | #if labels is not None and np.max(labels) > num_classes: 66 | # logging.warning("Chunk contains points with Classes: %s" % str(np.unique(labels))) 67 | # logging.warning("but only classes 0-%s are defined in the model." 
% num_classes) 68 | # logging.warning("Removing those points...") 69 | # points_and_features = np.delete(points_and_features, labels > num_classes, axis=0) 70 | # labels = np.delete(labels, labels > num_classes, axis=0) 71 | if points_and_features is not None: 72 | logging.info(" -- Feeding batches (%d-%d)/%d..." % (batch_idx, batch_size+batch_idx-1, train_ds.num_batches)) 73 | #logging.info("Chunk %d/%d (%d points)" % (train_ds.curr_chunk, train_ds.num_rows * train_ds.num_cols, points_and_features.shape[0])) 74 | stats = ChunkedDataset.chunkStatistics(labels[0], num_classes) 75 | #logging.info("Stats: %5.2f Ground | %5.2f Building | %5.2f Hi Veg | %5.2f Med Veg | %5.2f Lo Veg" % 76 | # (stats['relative'][2]*100, stats['relative'][6]*100, stats['relative'][5]*100, 77 | # stats['relative'][4]*100, stats['relative'][3]*100)) 78 | logger.perc_building.append(stats['relative'][6]*100) 79 | logger.perc_hi_veg.append(stats['relative'][5]*100) 80 | logger.perc_med_veg.append(stats['relative'][4]*100) 81 | logger.perc_lo_veg.append(stats['relative'][3]*100) 82 | logger.perc_ground.append(stats['relative'][2]*100) 83 | logger.perc_water.append(stats['relative'][9]*100) 84 | rest = 1 - (stats['relative'][2] + 85 | stats['relative'][3] + 86 | stats['relative'][4] + 87 | stats['relative'][5] + 88 | stats['relative'][6] + 89 | stats['relative'][9]) 90 | logger.perc_rest.append(rest*100) 91 | perc = [stats['relative'][2], 92 | stats['relative'][3], 93 | stats['relative'][4], 94 | stats['relative'][5], 95 | stats['relative'][6], 96 | stats['relative'][9], 97 | rest] 98 | stddev = np.std(perc) * 100 99 | logger.losses.append(stddev) 100 | if len(logger.points_seen) == 0: 101 | logger.points_seen.append(0) 102 | else: 103 | logger.points_seen.append(logger.points_seen[-1] + kNN*1e-6) 104 | logger.accuracy_train.append(0) 105 | logger.cumaccuracy_train.append(0) 106 | logger.save() 107 | if batch_idx % 10 == 0: 108 | logging.info(" --- Testing with this batch before training...") 109 | alsNetInstance.test_chunk(points_and_features[0], labels[0], 110 | os.path.join(logdir, "test_batch_%s_%s" % (batch_idx, train_ds.filename))) 111 | logging.info(" --- Testing with fixed 'batch of interest'...") 112 | alsNetInstance.test_chunk(test_paf[0], test_labels[0], 113 | os.path.join(logdir, "const_test_%s_%s" % (batch_idx, train_ds.filename))) 114 | alsNetInstance.train_batch(points_and_features, labels) 115 | alsNetInstance.save_model('/data/lwiniwar/02_temp/models/alsNet') 116 | else: 117 | break 118 | 119 | logging.info("""Testing 120 | _____ ___ ___ _____ ___ _ _ ___ 121 | |_ _|| __|/ __||_ _||_ _|| \| | / __| 122 | | | | _| \__ \ | | | | | .` || (_ | 123 | |_| |___||___/ |_| |___||_|\_| \___| 124 | """) 125 | for idx, file in enumerate(test_files): 126 | logging.info(' - FILE %d/%d (%s) - ' % (idx, len(test_files), file)) 127 | test_ds = kNNBatchDataset(file=file, k=kNN, spacing=spacing, multiclass=multiclass) 128 | while True: 129 | points_and_features, labels = test_ds.getBatches(batch_size=1) 130 | if points_and_features is not None: 131 | logging.info(" -- Evaluating region %d/%d" % (test_ds.currIdx, test_ds.num_batches)) 132 | stats = ChunkedDataset.chunkStatistics(labels[0], num_classes) 133 | logging.info(" -- Stats: %5.2f Ground | %5.2f Building | %5.2f Hi Veg | %5.2f Med Veg | %5.2f Lo Veg" % 134 | (stats['relative'][2]*100, stats['relative'][6]*100, stats['relative'][5]*100, 135 | stats['relative'][4]*100, stats['relative'][3]*100)) 136 | alsNetInstance.test_chunk(points_and_features[0], labels[0], 
file.replace(".laz", "eval/region_%d.laz" % test_ds.currIdx)) 137 | else: 138 | break -------------------------------------------------------------------------------- /alsNet/alsNetRunner2.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import glob 3 | import logging 4 | import numpy as np 5 | from alsNet import AlsNetContainer 6 | from alsNetLogger import Logger 7 | from dataset import Dataset 8 | import os 9 | import webbrowser 10 | # disable tensorflow debug information: 11 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 12 | 13 | logging.basicConfig(level=logging.DEBUG, 14 | format='%(asctime)s [%(levelname)s]: %(message)s', 15 | datefmt='%Y-%m-%d %H:%M:%S') 16 | 17 | if __name__ == '__main__': 18 | 19 | parser = argparse.ArgumentParser() 20 | parser.add_argument('--inList', help='input text file, must be csv with filename;stddev;class representativity') 21 | parser.add_argument('--threshold', type=float, help='upper threshold for class stddev') 22 | parser.add_argument('--batchSize', default=10, type=int, help='batch size for training [default: 10]') 23 | parser.add_argument('--dropout', default=0.5, type=float, help='probability to randomly drop a neuron ' + 24 | 'in the last layer [default: 0.5]') 25 | parser.add_argument('--logDir', default='log', help='directory to write html log to [default: log]') 26 | parser.add_argument('--multiclass', default=True, type=bool, help='label into multiple classes ' + 27 | '(not only ground/nonground) [default: True]') 28 | parser.add_argument('--multiTrain', default=1, type=int, help='how often to feed the whole training dataset [default: 1]') 29 | parser.add_argument('--testList', help='list with files to test on') 30 | parser.add_argument('--gpuID', default=None, help='which GPU to run on (default: CPU only)') 31 | args = parser.parse_args() 32 | num_classes = 30 33 | num_feats = 3 34 | inlist = args.inList 35 | batch_size = int(args.batchSize) 36 | dropout = float(args.dropout) 37 | threshold = float(args.threshold) 38 | multitrain = args.multiTrain 39 | testlist = args.testList 40 | logdir = args.logDir 41 | multiclass = args.multiclass 42 | gpu = args.gpuID 43 | device = ("/gpu:%d" % int(gpu)) if gpu else "/cpu" 44 | logger = Logger(os.path.join(logdir, 'alsnet-log.html'), training_files=[], num_points="N/A", 45 | multiclass=multiclass, 46 | extra="\nThreshold for selection:\n\n{threshold}\n".format(threshold=threshold)) 47 | 48 | alsNetInstance = AlsNetContainer('log', 0.01, logger=logger, dropout=dropout, dev=device) 49 | logger.container = alsNetInstance 50 | alsNetInstance.prepare(num_feat=num_feats, num_classes=num_classes, points_in=200000, batch_size=batch_size) 51 | 52 | logging.info("""Training 53 | _____ ___ _ ___ _ _ ___ _ _ ___ 54 | |_ _|| _ \ /_\ |_ _|| \| ||_ _|| \| | / __| 55 | | | | / / _ \ | | | .` | | | | .` || (_ | 56 | |_| |_|_\/_/ \_\|___||_|\_||___||_|\_| \___| 57 | """) 58 | with open(inlist, "rb") as f: 59 | _ = f.readline() # remove header 60 | rest = f.readlines() 61 | 62 | datasets = [] 63 | for line in rest: 64 | line = line.decode('utf-8') 65 | linespl = line.split(",") 66 | if float(linespl[1]) < threshold: 67 | datasets.append(os.path.join(os.path.dirname(inlist), linespl[0])) 68 | 69 | logger.training_files = datasets 70 | 71 | logging.info("Found %s suitable files" % len(datasets)) 72 | 73 | for mult_i in range(multitrain): 74 | logging.info("Starting iteration %d through training dataset" % mult_i) 75 | seens = [] 76 | np.random.shuffle(datasets) 77 | 
for idx, file in enumerate(datasets): 78 | logging.info(' - FILE %d/%d (%s) -' % (idx+1, len(datasets), file)) 79 | train_ds = Dataset(file=file) 80 | paf = np.expand_dims(train_ds.points_and_features, 0) 81 | lab = np.expand_dims(train_ds.labels, 0) 82 | 83 | if idx % 1 == 0 and idx > 0: 84 | logging.info(" --- Testing with this batch before training...") 85 | alsNetInstance.test_chunk(paf[0], lab[0], 86 | os.path.join(logdir, "test_%s" % train_ds.filename), 87 | train_ds.names) 88 | 89 | alsNetInstance.train_batch(paf, lab) 90 | if mult_i == 0 and idx == 1: 91 | webbrowser.open(logger.outfile) 92 | if idx % 50 == 0: 93 | alsNetInstance.save_model(os.path.join(logdir, 'models', 'alsNet')) 94 | 95 | alsNetInstance.save_model(os.path.join(logdir, 'models', 'alsNet')) 96 | 97 | if testlist: 98 | datasets = [] 99 | with open(testlist, "rb") as f: 100 | _ = f.readline() # remove header 101 | rest = f.readlines() 102 | for line in rest: 103 | line = line.decode('utf-8') 104 | linespl = line.split(",") 105 | datasets.append(os.path.join(os.path.dirname(inlist), linespl[0])) 106 | 107 | for idx,file in enumerate(datasets): 108 | ds = Dataset(file=file) 109 | paf, lab = ds.points_and_features, ds.labels 110 | alsNetInstance.test_chunk(paf, lab, 111 | os.path.join(logdir, "eval_%s" % ds.filename), 112 | ds.names) 113 | -------------------------------------------------------------------------------- /alsNet/alsNetRunner3.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import argparse 4 | from alsNetRefactored import AlsNetContainer 5 | from alsNetLogger2 import Logger 6 | 7 | #Disable TF debug messages 8 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 9 | 10 | arch =[ 11 | { 12 | 'npoint': 4096, 13 | 'radius': 1, 14 | 'nsample': 32, 15 | 'mlp': [128, 128, 128], 16 | 'pooling': 'max', 17 | 'mlp2': None, 18 | 'reverse_mlp': [128,128] 19 | }, 20 | { 21 | 'npoint': 2048, 22 | 'radius': 5, 23 | 'nsample': 64, 24 | 'mlp': [128, 128, 256], 25 | 'pooling': 'max', 26 | 'mlp2': None, 27 | 'reverse_mlp': [256,256] 28 | }, 29 | { 30 | 'npoint': 1024, 31 | 'radius': 15, 32 | 'nsample': 64, 33 | 'mlp': [128, 128, 256], 34 | 'pooling': 'max', 35 | 'mlp2': None, 36 | 'reverse_mlp': [256,256] 37 | }, ] 38 | 39 | def main(args): 40 | inlist = args.inList 41 | threshold = args.threshold 42 | train_size = args.trainSize 43 | 44 | with open(inlist, "rb") as f: 45 | _ = f.readline() # remove header 46 | rest = f.readlines() 47 | 48 | datasets = [] 49 | for line in rest: 50 | line = line.decode('utf-8') 51 | linespl = line.split(",") 52 | if float(linespl[1]) < threshold: 53 | datasets.append(os.path.join(os.path.dirname(inlist), linespl[0])) 54 | 55 | np.random.shuffle(datasets) 56 | inst = AlsNetContainer(num_points=200000, num_classes=30, num_feat=3, arch=arch, 57 | output_dir=args.outDir, dropout=args.dropout) 58 | logg = Logger(outfile=os.path.join(args.outDir, 'alsNet-log.html'), 59 | inst=inst, 60 | training_files=datasets) 61 | for i in range(len(datasets)//train_size): 62 | if i > 0: 63 | test_ds = datasets[i*train_size+1] 64 | inst.test_single(test_ds, 65 | save_to=os.path.join(args.outDir, os.path.basename(test_ds).replace(".la", "_test.la")), 66 | save_prob=True) 67 | print("Training datasets %s to %s (%s total)" % (i*train_size, 68 | min((i+1)*train_size, len(datasets)), 69 | len(datasets))) 70 | inst.fit_file(datasets[i * train_size:min((i + 1) * train_size, len(datasets))], new_session=False) 71 | logg.save() 72 | 73 | 74 | if __name__ 
== '__main__': 75 | parser = argparse.ArgumentParser() 76 | parser.add_argument('--inList', help='input text file, must be csv with filename;stddev;...') 77 | parser.add_argument('--threshold', type=float, help='upper threshold for class stddev') 78 | parser.add_argument('--trainSize', default=10, type=int, help='"batch" size for training [default: 10]') 79 | parser.add_argument('--dropout', default=0.5, type=float, help='probability to randomly drop a neuron ' + 80 | 'in the last layer [default: 0.5]') 81 | parser.add_argument('--outDir', required=True, help='directory to write html log to') 82 | # parser.add_argument('--multiclass', default=True, type=bool, help='label into multiple classes ' + 83 | # '(not only ground/nonground) [default: True]') 84 | # parser.add_argument('--multiTrain', default=1, type=int, 85 | # help='how often to feed the whole training dataset [default: 1]') 86 | # parser.add_argument('--testList', help='list with files to test on') 87 | # parser.add_argument('--gpuID', default=None, help='which GPU to run on (default: CPU only)') 88 | args = parser.parse_args() 89 | main(args) -------------------------------------------------------------------------------- /alsNet/alsNetRunner4.py: -------------------------------------------------------------------------------- 1 | import os, sys 2 | import numpy as np 3 | import argparse 4 | from alsNetRefactored import AlsNetContainer, simple_loss, fp_high_loss 5 | from dataset import Dataset 6 | from sklearn.model_selection import RandomizedSearchCV 7 | 8 | #Disable TF debug messages 9 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 10 | 11 | arch1 =[ 12 | { 13 | 'npoint': 512, 14 | 'radius': 1, 15 | 'nsample': 128, 16 | 'mlp': [512, 512, 1024], 17 | 'pooling': 'max', 18 | 'mlp2': None, 19 | 'reverse_mlp': [128,128] 20 | }, 21 | { 22 | 'npoint': 256, 23 | 'radius': 5, 24 | 'nsample': 64, 25 | 'mlp': [128, 128, 256], 26 | 'pooling': 'max', 27 | 'mlp2': None, 28 | 'reverse_mlp': [256,256] 29 | }, 30 | { 31 | 'npoint': 128, 32 | 'radius': 15, 33 | 'nsample': 64, 34 | 'mlp': [128, 128, 256], 35 | 'pooling': 'max', 36 | 'mlp2': None, 37 | 'reverse_mlp': [256,256] 38 | }, ] 39 | arch2 =[ 40 | { 41 | 'npoint': 1024, 42 | 'radius': 1, 43 | 'nsample': 128, 44 | 'mlp': [512, 512, 1024], 45 | 'pooling': 'max', 46 | 'mlp2': None, 47 | 'reverse_mlp': [1024,512] 48 | }, 49 | { 50 | 'npoint': 512, 51 | 'radius': 5, 52 | 'nsample': 64, 53 | 'mlp': [128, 128, 256], 54 | 'pooling': 'max', 55 | 'mlp2': None, 56 | 'reverse_mlp': [256,256] 57 | }, 58 | { 59 | 'npoint': 256, 60 | 'radius': 15, 61 | 'nsample': 64, 62 | 'mlp': [128, 128, 256], 63 | 'pooling': 'max', 64 | 'mlp2': None, 65 | 'reverse_mlp': [256,256] 66 | }, ] 67 | arch3 =[ 68 | { 69 | 'npoint': 256, 70 | 'radius': 1, 71 | 'nsample': 128, 72 | 'mlp': [256, 256, 256], 73 | 'pooling': 'max', 74 | 'mlp2': None, 75 | 'reverse_mlp': [256,256] 76 | }, 77 | { 78 | 'npoint': 128, 79 | 'radius': 5, 80 | 'nsample': 64, 81 | 'mlp': [128, 128, 256], 82 | 'pooling': 'max', 83 | 'mlp2': None, 84 | 'reverse_mlp': [256,256] 85 | }, 86 | { 87 | 'npoint': 64, 88 | 'radius': 15, 89 | 'nsample': 64, 90 | 'mlp': [128, 128, 256], 91 | 'pooling': 'max', 92 | 'mlp2': None, 93 | 'reverse_mlp': [256,256] 94 | }, ] 95 | 96 | 97 | param_distr = { 98 | 'arch': [arch1, arch2, arch3], 99 | 'learning_rate': [0.1, 0.01, 0.001], 100 | 'dropout': [0.55, 0.6, 0.65], 101 | 'loss_fn': [simple_loss, fp_high_loss] 102 | } 103 | 104 | def main(args): 105 | inlist = args.inList 106 | threshold = args.threshold 107 | #train_size = 
args.trainSize 108 | 109 | with open(inlist, "rb") as f: 110 | _ = f.readline() # remove header 111 | rest = f.readlines() 112 | 113 | datasets = [] 114 | all_ds = [] 115 | for line in rest: 116 | line = line.decode('utf-8') 117 | linespl = line.split(",") 118 | dataset_path = os.path.join(os.path.dirname(inlist), linespl[0]) 119 | if float(linespl[1]) < threshold: 120 | datasets.append(dataset_path) 121 | all_ds.append(dataset_path) 122 | 123 | np.random.shuffle(datasets) 124 | datasets_th = [] 125 | for idx, dataset in enumerate(datasets): 126 | print("Loading dataset %s of %s" % (idx+1, len(datasets))) 127 | ds = Dataset(dataset, load=False) 128 | datasets_th.append(ds) 129 | print("%s datasets loaded." % len(datasets_th)) 130 | sys.stdout.flush() 131 | 132 | rnd_search = RandomizedSearchCV(AlsNetContainer(num_feat=3, num_classes=30, num_points=200000, 133 | output_base=args.outDir, score_sample=10), 134 | param_distr, 135 | n_iter=50, 136 | random_state=42, 137 | verbose=2, 138 | n_jobs=1) 139 | rnd_search.fit(datasets_th) 140 | print(rnd_search.best_params_) 141 | 142 | #inst = AlsNetContainer(num_points=200000, num_classes=30, num_feat=3, arch=arch, 143 | # output_dir=args.outDir, dropout=args.dropout) 144 | #logg = Logger(outfile=os.path.join(args.outDir, 'alsNet-log.html'), 145 | # inst=inst, 146 | # training_files=datasets) 147 | #for i in range(len(datasets)//train_size): 148 | # if i > 0: 149 | # test_ds = datasets[i*train_size+1] 150 | # inst.test_single(test_ds, 151 | # save_to=os.path.join(args.outDir, os.path.basename(test_ds).replace(".la", "_test.la")), 152 | # save_prob=True) 153 | # print("Training datasets %s to %s (%s total)" % (i*train_size, 154 | # min((i+1)*train_size, len(datasets)), 155 | # len(datasets))) 156 | # inst.fit(datasets[i*train_size:min((i+1)*train_size, len(datasets))], new_session=False) 157 | # logg.save() 158 | 159 | 160 | if __name__ == '__main__': 161 | parser = argparse.ArgumentParser() 162 | parser.add_argument('--inList', help='input text file, must be csv with filename;stddev;...') 163 | parser.add_argument('--threshold', type=float, help='upper threshold for class stddev') 164 | parser.add_argument('--outDir', required=True, help='directory to write html log to') 165 | # parser.add_argument('--multiclass', default=True, type=bool, help='label into multiple classes ' + 166 | # '(not only ground/nonground) [default: True]') 167 | # parser.add_argument('--multiTrain', default=1, type=int, 168 | # help='how often to feed the whole training dataset [default: 1]') 169 | # parser.add_argument('--testList', help='list with files to test on') 170 | # parser.add_argument('--gpuID', default=None, help='which GPU to run on (default: CPU only)') 171 | args = parser.parse_args() 172 | main(args) -------------------------------------------------------------------------------- /alsNet/alsNetRunner5.py: -------------------------------------------------------------------------------- 1 | import os, sys 2 | import numpy as np 3 | import argparse 4 | from alsNetRefactored import AlsNetContainer, simple_loss, fp_high_loss 5 | from alsNetLogger2 import Logger 6 | from dataset import Dataset 7 | import importlib 8 | 9 | #Disable TF debug messages 10 | #os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 11 | 12 | arch =[ 13 | { 14 | 'npoint': 8192, 15 | 'radius': 1, 16 | 'nsample': 16, 17 | 'mlp': [64, 128, 128], 18 | 'pooling': 'max', 19 | 'mlp2': None, 20 | 'reverse_mlp': [128,64] 21 | }, 22 | { 23 | 'npoint': 4096, 24 | 'radius': 2, 25 | 'nsample': 16, 26 | 'mlp': [128, 
256, 256], 27 | 'pooling': 'max', 28 | 'mlp2': None, 29 | 'reverse_mlp': [256,256] 30 | }, 31 | { 32 | 'npoint': 2048, 33 | 'radius': 5, 34 | 'nsample': 16, 35 | 'mlp': [128, 256, 256], 36 | 'pooling': 'max', 37 | 'mlp2': None, 38 | 'reverse_mlp': [256,256] 39 | }, 40 | { 41 | 'npoint': 512, 42 | 'radius': 15, 43 | 'nsample': 32, 44 | 'mlp': [128, 512, 256], 45 | 'pooling': 'max', 46 | 'mlp2': None, 47 | 'reverse_mlp': [256,256] 48 | }, ] 49 | 50 | 51 | def main(args): 52 | inlist = args.inList 53 | threshold = args.threshold 54 | train_size = args.trainSize 55 | arch = importlib.import_module(args.archFile).arch if args.archFile else arch 56 | lr = args.learningRate 57 | normalize_vals = args.normalize == 1 58 | 59 | with open(inlist, "rb") as f: 60 | _ = f.readline() # remove header 61 | rest = f.readlines() 62 | 63 | datasets = [] 64 | all_ds = [] 65 | for line in rest: 66 | line = line.decode('utf-8') 67 | linespl = line.split(",") 68 | dataset_path = os.path.join(os.path.dirname(inlist), linespl[0]) 69 | if float(linespl[1]) < threshold and float(linespl[6]) > args.minBuild: 70 | datasets.append(dataset_path) 71 | all_ds.append(dataset_path) 72 | 73 | np.random.shuffle(datasets) 74 | datasets_th = [] 75 | for idx, dataset in enumerate(datasets): 76 | print("Loading dataset %s of %s (%s)" % (idx+1, len(datasets), os.path.basename(dataset))) 77 | ds = Dataset(dataset, load=False, normalize=normalize_vals) 78 | datasets_th.append(ds) 79 | print("%s datasets loaded." % len(datasets_th)) 80 | sys.stdout.flush() 81 | 82 | inst = AlsNetContainer(num_feat=3, num_classes=30, num_points=200000, output_base=args.outDir, score_sample=10, 83 | arch=arch, 84 | learning_rate=lr, 85 | dropout=0.55, 86 | loss_fn=simple_loss if args.lossFn == "simple" else fp_high_loss) 87 | 88 | if args.continueModel is not None: 89 | inst.load_model(args.continueModel) 90 | 91 | logg = Logger(outfile=os.path.join(args.outDir, 'alsNet-log.html'), 92 | inst=inst, 93 | training_files=datasets_th) 94 | 95 | for j in range(args.multiTrain): 96 | for i in range(len(datasets_th)//train_size): 97 | if i > 0: 98 | test_ds = datasets_th[i*train_size+1] 99 | inst.test_single(test_ds, 100 | save_to=os.path.join(args.outDir, os.path.basename(test_ds.file).replace(".la", "_test.la")), 101 | save_prob=True) 102 | print("Training datasets %s to %s (%s total)" % (i*train_size, 103 | min((i+1)*train_size, len(datasets_th)), 104 | len(datasets_th))) 105 | inst.fit(datasets_th[i*train_size:min((i+1)*train_size, len(datasets_th))], new_session=False) 106 | logg.save() 107 | inst.save_model(os.path.join(args.outDir, 'models', 'model_%d_%d' % (j, i), 'alsNet.ckpt')) 108 | 109 | 110 | if __name__ == '__main__': 111 | parser = argparse.ArgumentParser() 112 | parser.add_argument('--inList', help='input text file, must be csv with filename;stddev;...') 113 | parser.add_argument('--threshold', type=float, help='upper threshold for class stddev') 114 | parser.add_argument('--minBuild', type=float, help='lower threshold for buildings class [0-1]') 115 | parser.add_argument('--outDir', required=True, help='directory to write html log to') 116 | # parser.add_argument('--multiclass', default=True, type=bool, help='label into multiple classes ' + 117 | # '(not only ground/nonground) [default: True]') 118 | parser.add_argument('--multiTrain', default=1, type=int, 119 | help='how often to feed the whole training dataset [default: 1]') 120 | parser.add_argument('--trainSize', default=1, type=int, 121 | help='how many plots to train at once [default: 
1]') 122 | parser.add_argument('--learningRate', default=0.001, type=float, 123 | help='learning rate [default: 0.001]') 124 | parser.add_argument('--archFile', default="", type=str, 125 | help='architecture file to import [default: default architecture]') 126 | parser.add_argument('--continueModel', default=None, type=str, 127 | help='continue training an existing model [default: start new model]') 128 | parser.add_argument('--lossFn', default='simple', type=str, 129 | help='loss function to use [default: simple][simple/fp_high]') 130 | parser.add_argument('--normalize', default=1, type=int, 131 | help='normalize fields and coordinates [default: 1][1/0]') 132 | # parser.add_argument('--testList', help='list with files to test on') 133 | # parser.add_argument('--gpuID', default=None, help='which GPU to run on (default: CPU only)') 134 | args = parser.parse_args() 135 | main(args) 136 | -------------------------------------------------------------------------------- /alsNet/archs/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lwiniwar/alsNet/aab6a8a8d9df7850d7ec5fccee463ef40087a1a7/alsNet/archs/__init__.py -------------------------------------------------------------------------------- /alsNet/archs/arch1.py: -------------------------------------------------------------------------------- 1 | arch =[ 2 | { 3 | 'npoint': 4096, 4 | 'radius': 1, 5 | 'nsample': 32, 6 | 'mlp': [512, 512, 1024], 7 | 'pooling': 'max', 8 | 'mlp2': None, 9 | 'reverse_mlp': [128,128] 10 | }, 11 | { 12 | 'npoint': 2048, 13 | 'radius': 5, 14 | 'nsample': 64, 15 | 'mlp': [128, 128, 256], 16 | 'pooling': 'max', 17 | 'mlp2': None, 18 | 'reverse_mlp': [256,256] 19 | }, 20 | { 21 | 'npoint': 512, 22 | 'radius': 15, 23 | 'nsample': 64, 24 | 'mlp': [128, 128, 256], 25 | 'pooling': 'max', 26 | 'mlp2': None, 27 | 'reverse_mlp': [256,256] 28 | }, ] -------------------------------------------------------------------------------- /alsNet/archs/arch2.py: -------------------------------------------------------------------------------- 1 | arch =[ 2 | { 3 | 'npoint': 8192, 4 | 'radius': 1, 5 | 'nsample': 16, 6 | 'mlp': [64, 64, 128], 7 | 'pooling': 'max_and_avg', 8 | 'mlp2': None, 9 | 'reverse_mlp': [128,64] 10 | }, 11 | { 12 | 'npoint': 4096, 13 | 'radius': 5, 14 | 'nsample': 64, 15 | 'mlp': [128, 128, 256], 16 | 'pooling': 'max_and_avg', 17 | 'mlp2': None, 18 | 'reverse_mlp': [256,256] 19 | }, 20 | { 21 | 'npoint': 2048, 22 | 'radius': 15, 23 | 'nsample': 64, 24 | 'mlp': [128, 128, 256], 25 | 'pooling': 'max_and_avg', 26 | 'mlp2': None, 27 | 'reverse_mlp': [256,256] 28 | }, ] -------------------------------------------------------------------------------- /alsNet/archs/arch3.py: -------------------------------------------------------------------------------- 1 | arch =[ 2 | { 3 | 'npoint': 8192, 4 | 'radius': 1, 5 | 'nsample': 16, 6 | 'mlp': [64, 64, 128], 7 | 'pooling': 'max_and_avg', 8 | 'mlp2': None, 9 | 'reverse_mlp': [128,64] 10 | }, 11 | { 12 | 'npoint': 4096, 13 | 'radius': 5, 14 | 'nsample': 64, 15 | 'mlp': [128, 128, 256], 16 | 'pooling': 'max_and_avg', 17 | 'mlp2': None, 18 | 'reverse_mlp': [256,256] 19 | }, 20 | { 21 | 'npoint': 2048, 22 | 'radius': 15, 23 | 'nsample': 64, 24 | 'mlp': [128, 128, 256], 25 | 'pooling': 'max_and_avg', 26 | 'mlp2': None, 27 | 'reverse_mlp': [256,256] 28 | }, ] -------------------------------------------------------------------------------- /alsNet/archs/arch4.py: 
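# Any of these archs/*.py modules can be swapped in at training time:
# alsNetRunner5.py loads them with importlib.import_module(args.archFile).arch,
# so a module only has to expose a top-level list named `arch`
# (e.g. --archFile archs.arch4). A minimal custom module
# (hypothetical archs/myarch.py, values for illustration only):
#
#   arch = [
#       {'npoint': 8192, 'radius': 1, 'nsample': 16,
#        'mlp': [64, 128, 128], 'pooling': 'max',
#        'mlp2': None, 'reverse_mlp': [128, 64]},
#   ]
#
# Each dict configures one set-abstraction level: npoint centroids, the
# grouping radius, nsample neighbours per centroid and the MLP widths, while
# reverse_mlp parametrizes the matching feature-propagation layer on the
# decoder side.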
-------------------------------------------------------------------------------- 1 | arch =[ 2 | { 3 | 'npoint': 8192, 4 | 'radius': 1, 5 | 'nsample': 16, 6 | 'mlp': [64, 512, 128], 7 | 'pooling': 'max_and_avg', 8 | 'mlp2': None, 9 | 'reverse_mlp': [128,64] 10 | }, 11 | { 12 | 'npoint': 4096, 13 | 'radius': 5, 14 | 'nsample': 64, 15 | 'mlp': [128, 512, 256], 16 | 'pooling': 'max_and_avg', 17 | 'mlp2': None, 18 | 'reverse_mlp': [256,256] 19 | }, 20 | { 21 | 'npoint': 2048, 22 | 'radius': 15, 23 | 'nsample': 64, 24 | 'mlp': [128, 512, 256], 25 | 'pooling': 'max_and_avg', 26 | 'mlp2': None, 27 | 'reverse_mlp': [256,256] 28 | }, ] -------------------------------------------------------------------------------- /alsNet/archs/arch5.py: -------------------------------------------------------------------------------- 1 | arch =[ 2 | { 3 | 'npoint': 8192, 4 | 'radius': 1, 5 | 'nsample': 16, 6 | 'mlp': [64, 128, 128], 7 | 'pooling': 'max', 8 | 'mlp2': None, 9 | 'reverse_mlp': [128,64] 10 | }, 11 | { 12 | 'npoint': 4096, 13 | 'radius': 2, 14 | 'nsample': 16, 15 | 'mlp': [128, 256, 256], 16 | 'pooling': 'max', 17 | 'mlp2': None, 18 | 'reverse_mlp': [256,256] 19 | }, 20 | { 21 | 'npoint': 2048, 22 | 'radius': 5, 23 | 'nsample': 16, 24 | 'mlp': [128, 256, 256], 25 | 'pooling': 'max', 26 | 'mlp2': None, 27 | 'reverse_mlp': [256,256] 28 | }, 29 | { 30 | 'npoint': 512, 31 | 'radius': 15, 32 | 'nsample': 32, 33 | 'mlp': [128, 512, 256], 34 | 'pooling': 'max', 35 | 'mlp2': None, 36 | 'reverse_mlp': [256,256] 37 | }, ] -------------------------------------------------------------------------------- /alsNet/dataset.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import laspy 3 | import os 4 | from scipy.spatial import KDTree 5 | from sklearn.preprocessing import normalize 6 | import logging 7 | 8 | 9 | class Dataset(): 10 | ATTR_EXLUSION_LIST = ['X', 'Y', 'Z', 'raw_classification', 'Classification', 11 | 'flag_byte', 'scan_angle_rank', 'user_data', 12 | 'pt_src_id', 'gps_time'] 13 | ATTR_EXTRA_LIST = ['num_returns', 'return_num'] 14 | 15 | def __init__(self, file, load=True, multiclass=True, normalize=False): 16 | self.file = file 17 | self._features = self._xyz = self._classes = self._names = None 18 | self.xmax = self.xmin = self.ymax = self.ymin = None 19 | self._header = None 20 | self.multiclass = multiclass 21 | self.normalize = normalize 22 | if load: 23 | self.load_data() 24 | 25 | def load_data(self): 26 | file_h = laspy.file.File(self.file, mode='r') 27 | self._xyz = np.vstack([file_h.x, file_h.y, file_h.z]).transpose() 28 | self._classes = file_h.classification 29 | points = file_h.points['point'] 30 | attr_names = [a for a in points.dtype.names] + Dataset.ATTR_EXTRA_LIST 31 | self._features = np.array([getattr(file_h, name) for name in attr_names 32 | if name not in Dataset.ATTR_EXLUSION_LIST]).transpose() 33 | self._names = [name for name in attr_names if name not in Dataset.ATTR_EXLUSION_LIST] 34 | 35 | self.xmin = file_h.header.min[0] 36 | self.ymin = file_h.header.min[1] 37 | self.xmax = file_h.header.max[0] 38 | self.ymax = file_h.header.max[1] 39 | self._header = file_h.header 40 | file_h.close() 41 | 42 | def statistics(self): 43 | stats = {'absolute': {}, 44 | 'relative': {}} 45 | for i in range(np.max(self.labels)): 46 | count = np.count_nonzero(self.labels == i) 47 | stats['absolute'][i] = count 48 | stats['relative'][i] = count/len(self) 49 | 50 | return stats 51 | 52 | @property 53 | def labels(self): 54 | if self._xyz is 
None: 55 | self.load_data() 56 | ret_val = self._classes if self.multiclass else (self._classes != 2).astype('int8') + 2 57 | return ret_val 58 | 59 | @property 60 | def names(self): 61 | return self._names 62 | 63 | @property 64 | def points_and_features(self): 65 | if self._xyz is None: 66 | self.load_data() 67 | ret_val = np.hstack((self._xyz, self._features)) 68 | if self.normalize: 69 | normalize(ret_val) 70 | return ret_val 71 | 72 | @property 73 | def filename(self): 74 | return os.path.basename(self.file) 75 | 76 | def points_and_features_f(self): 77 | return self.points_and_features 78 | 79 | def labels_f(self): 80 | return self.labels 81 | 82 | def unload(self): 83 | self._features = self._xyz = self._classes = self._names = None 84 | self.xmax = self.xmin = self.ymax = self.ymin = None 85 | self._header = None 86 | 87 | def get_label_unique_count(self): 88 | return len(np.unique(self._classes)) 89 | 90 | def get_feature_count(self): 91 | return self._features.shape[1] 92 | 93 | 94 | def __len__(self): 95 | return self.labels.shape[0] 96 | 97 | def getBatch(self, start_idx, batch_size, idx_randomizer=None): 98 | if idx_randomizer is not None: 99 | idx_range = idx_randomizer[start_idx:start_idx + batch_size] 100 | else: 101 | idx_range = range(start_idx, start_idx + batch_size) 102 | data = self.points_and_features[idx_range] 103 | labels = self.labels[idx_range] 104 | 105 | def save_with_new_classes(self, outFile, new_classes): 106 | inFile = laspy.file.File(self.file) 107 | outFile = laspy.file.File(outFile, mode='w', header=inFile.header) 108 | outFile.points = inFile.points 109 | outFile.Classification = new_classes[0] 110 | outFile.close() 111 | 112 | @staticmethod 113 | def Save(path, points_and_features, names=None, labels=None, new_classes=None, probs=None): 114 | hdr = laspy.header.Header() 115 | outfile = laspy.file.File(path, mode="w", header=hdr) 116 | if new_classes is not None: 117 | outfile.define_new_dimension(name="estim_class", data_type=5, description="estimated class") 118 | if labels is not None and new_classes is not None: 119 | outfile.define_new_dimension(name="class_correct", data_type=5, description="correctness of estimated class") 120 | if probs is not None: 121 | for classid in range(probs.shape[1]): 122 | outfile.define_new_dimension(name="prob_class%02d" % classid, data_type=9, description="p of estimated class %02d"%classid) 123 | 124 | allx = points_and_features[:, 0] 125 | ally = points_and_features[:, 1] 126 | allz = points_and_features[:, 2] 127 | 128 | xmin = np.floor(np.min(allx)) 129 | ymin = np.floor(np.min(ally)) 130 | zmin = np.floor(np.min(allz)) 131 | 132 | outfile.header.offset = [xmin, ymin, zmin] 133 | outfile.header.scale = [0.001, 0.001, 0.001] 134 | 135 | outfile.x = allx 136 | outfile.y = ally 137 | outfile.z = allz 138 | 139 | for featid in range(points_and_features.shape[1]-3): 140 | try: 141 | data = points_and_features[:, 3+featid] 142 | if names[featid] in ['num_returns', 'return_num']: # hack to treat int-values 143 | data = data.astype('int8') 144 | setattr(outfile, names[featid], data) 145 | except Exception as e: 146 | logging.warning("Could not save attribute %s to file %s: \n%s" % (names[featid], path, e)) 147 | #raise 148 | 149 | if probs is not None: 150 | for classid in range(probs.shape[1]): 151 | setattr(outfile, "prob_class%02d" % classid, probs[:, classid]) 152 | 153 | if labels is not None: 154 | outfile.classification = labels 155 | if new_classes is not None: 156 | outfile.estim_class = new_classes 157 | if 
labels is not None and new_classes is not None: 158 | outfile.class_correct = np.equal(labels, new_classes)*-1 + 6 # so that equal =5 --> green (veg) 159 | # and not equal =6 --> red (building) 160 | 161 | outfile.close() 162 | 163 | 164 | class ChunkedDataset(Dataset): 165 | def __init__(self, chunk_size, overlap, *args, **kwargs): 166 | super(ChunkedDataset, self).__init__(*args, **kwargs) 167 | self.chunk_size = chunk_size 168 | self.overlap = overlap 169 | self.curr_chunk = 0 170 | 171 | self.num_cols = (self.xmax - self.xmin) // (self.chunk_size - self.overlap) + 1 172 | self.num_rows = (self.ymax - self.ymin) // (self.chunk_size - self.overlap) + 1 173 | 174 | def idx_to_lims(self, idx): 175 | if idx >= self.num_cols * self.num_rows: 176 | return None 177 | row_idx = idx // self.num_cols 178 | col_idx = idx % self.num_cols 179 | 180 | return [self.xmin + (self.chunk_size - self.overlap) * col_idx, 181 | self.xmin + (self.chunk_size - self.overlap) * (col_idx + 1) + self.overlap, 182 | self.ymin + (self.chunk_size - self.overlap) * row_idx, 183 | self.ymin + (self.chunk_size - self.overlap) * (row_idx + 1) + self.overlap, 184 | ] 185 | 186 | def getNextChunk(self): 187 | lims = self.idx_to_lims(self.curr_chunk) 188 | if not lims: # no more chunks 189 | return None, None 190 | idxes = self._xyz[:, 0] >= lims[0] 191 | idxes &= self._xyz[:, 0] < lims[1] 192 | idxes &= self._xyz[:, 1] >= lims[2] 193 | idxes &= self._xyz[:, 1] < lims[3] 194 | self.curr_chunk += 1 195 | return self.points_and_features[idxes, :], self.labels[idxes] 196 | 197 | @staticmethod 198 | def chunkStatistics(labels, max): 199 | stats = {'absolute': {}, 200 | 'relative': {}} 201 | for i in range(max): 202 | count = np.count_nonzero(labels == i) 203 | stats['absolute'][i] = count 204 | stats['relative'][i] = count / len(labels) 205 | 206 | return stats 207 | 208 | class kNNBatchDataset(Dataset): 209 | 210 | def __init__(self, k, spacing, *args, **kwargs): 211 | super(kNNBatchDataset, self).__init__(*args, **kwargs) 212 | self.spacing = spacing 213 | self.k = k 214 | self.tree = None 215 | self.currIdx = 0 216 | 217 | self.num_cols = (self.xmax - self.xmin - self.spacing/2) // (self.spacing) + 1 218 | self.num_rows = (self.ymax - self.ymin - self.spacing/2) // (self.spacing) + 1 219 | self.num_batches = int(self.num_cols * self.num_rows) 220 | self.rndzer = list(range(self.num_batches)) 221 | np.random.shuffle(self.rndzer) 222 | self.buildKD() 223 | 224 | def buildKD(self): 225 | logging.info(" -- Building kD-Tree with %d points..." 
% len(self)) 226 | self.tree = KDTree(self._xyz[:, :2], leafsize=100) # build only on x/y 227 | logging.info(" --- kD-Tree built.") 228 | 229 | 230 | def getBatches(self, batch_size=1): 231 | centers = [] 232 | for i in range(batch_size): 233 | if self.currIdx >= self.num_batches: 234 | break 235 | centers.append([self.xmin + self.spacing/2 + (self.currIdx // self.num_cols) * self.spacing, 236 | self.ymin + self.spacing/2 + (self.currIdx % self.num_cols) * self.spacing]) 237 | self.currIdx += 1 238 | if centers: 239 | _, idx = self.tree.query(centers, k=self.k) 240 | return self.points_and_features[idx, :], self.labels[idx] 241 | else: 242 | return None, None 243 | 244 | def getBatchByIdx(self, batch_idx): 245 | centers = [[self.xmin + self.spacing / 2 + (batch_idx // self.num_cols) * self.spacing, 246 | self.ymin + self.spacing / 2 + (batch_idx % self.num_cols) * self.spacing]] 247 | _, idx = self.tree.query(centers, k=self.k) 248 | return self.points_and_features[idx, :], self.labels[idx] 249 | -------------------------------------------------------------------------------- /alsNet/model.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | BASE_DIR = os.path.dirname(__file__) 4 | sys.path.append(BASE_DIR) 5 | sys.path.append(os.path.join(BASE_DIR, '../utils')) 6 | import tensorflow as tf 7 | import numpy as np 8 | import tf_util 9 | from pointnet_util import pointnet_sa_module, pointnet_fp_module 10 | 11 | def placeholder_inputs(batch_size, points_in, num_feat): 12 | pointclouds_pl = tf.placeholder(tf.float32, shape=(batch_size, points_in, num_feat), name='pointcloud_in') 13 | labels_pl = tf.placeholder(tf.int32, shape=(batch_size, points_in), name='labels') 14 | #smpws_pl = tf.placeholder(tf.float32, shape=(batch_size, num_point)) 15 | return pointclouds_pl, labels_pl 16 | 17 | 18 | def get_model(point_cloud, is_training, num_class, bn_decay=None): 19 | """ Semantic segmentation PointNet, input is BxNx3, output Bxnum_class """ 20 | batch_size = point_cloud.get_shape()[0].value 21 | num_point = point_cloud.get_shape()[1].value 22 | end_points = {} 23 | l0_xyz = tf.slice(point_cloud, [0,0,0], [-1,-1,3]) # point coordinates 24 | l0_points = tf.slice(point_cloud, [0,0,3], [-1,-1,-1]) #point attributes 25 | end_points['l0_xyz'] = l0_xyz 26 | 27 | # Set Abstraction layers 28 | l1_xyz, l1_points, l1_indices = pointnet_sa_module(l0_xyz, l0_points, npoint=50000, radius=1, nsample=32, mlp=[32,32,64], mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer1') 29 | l2_xyz, l2_points, l2_indices = pointnet_sa_module(l1_xyz, l1_points, npoint=10000, radius=2, nsample=32, mlp=[64,64,128], mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer2') 30 | l3_xyz, l3_points, l3_indices = pointnet_sa_module(l2_xyz, l2_points, npoint=2000, radius=4, nsample=32, mlp=[128,128,256], mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer3') 31 | l4_xyz, l4_points, l4_indices = pointnet_sa_module(l3_xyz, l3_points, npoint=500, radius=9, nsample=32, mlp=[256,256,512], mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer4') 32 | l5_xyz, l5_points, l5_indices = pointnet_sa_module(l4_xyz, l4_points, npoint=100, radius=25, nsample=32, mlp=[512,512,1024], mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer5') 33 | # debug line: 34 | # l4_points = tf.Print(l1_points, [l0_xyz, l0_points, l1_xyz, l1_points], 
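# Shape trace for the set-abstraction pyramid above (batch size B, N input
# points, F extra per-point attributes) -- all numbers taken from the
# pointnet_sa_module calls in this file:
#   l0: (B, N,     3+F)   input coordinates + attributes
#   l1: (B, 50000, 64)    radius 1,  mlp [32,32,64]
#   l2: (B, 10000, 128)   radius 2,  mlp [64,64,128]
#   l3: (B, 2000,  256)   radius 4,  mlp [128,128,256]
#   l4: (B, 500,   512)   radius 9,  mlp [256,256,512]
#   l5: (B, 100,   1024)  radius 25, mlp [512,512,1024]
# The feature-propagation layers below interpolate these features back up the
# pyramid (l5 -> l4 -> ... -> l0), leaving a 128-dim descriptor per original
# point for the 1x1 convolution head that outputs the class scores.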
'ln-points', -1, 12) 35 | end_points['l1_xyz'] = l1_xyz 36 | end_points['l2_xyz'] = l2_xyz 37 | end_points['l3_xyz'] = l3_xyz 38 | end_points['l4_xyz'] = l4_xyz 39 | end_points['l5_xyz'] = l5_xyz 40 | 41 | # Feature Propagation layers 42 | l4_points = pointnet_fp_module(l4_xyz, l5_xyz, l4_points, l5_points, [512,512], is_training, bn_decay, scope='fa_layer0') 43 | l3_points = pointnet_fp_module(l3_xyz, l4_xyz, l3_points, l4_points, [256,256], is_training, bn_decay, scope='fa_layer1') 44 | l2_points = pointnet_fp_module(l2_xyz, l3_xyz, l2_points, l3_points, [256,256], is_training, bn_decay, scope='fa_layer2') 45 | l1_points = pointnet_fp_module(l1_xyz, l2_xyz, l1_points, l2_points, [256,128], is_training, bn_decay, scope='fa_layer3') 46 | l0_points = pointnet_fp_module(l0_xyz, l1_xyz, l0_points, l1_points, [128,128,128], is_training, bn_decay, scope='fa_layer4') 47 | 48 | # FC layers 49 | net = tf_util.conv1d(l0_points, 128, 1, padding='VALID', bn=True, is_training=is_training, scope='fc1', bn_decay=bn_decay) 50 | end_points['feats'] = net 51 | net = tf_util.dropout(net, keep_prob=0.7, is_training=is_training, scope='dp1') 52 | net = tf_util.conv1d(net, num_class, 1, padding='VALID', activation_fn=None, scope='fc2', name='net') 53 | 54 | return net, end_points 55 | 56 | 57 | def get_loss(pred, label): 58 | """ pred: BxNxC, 59 | label: BxN, 60 | smpw: BxN """ 61 | #classify_loss = tf.losses.sparse_softmax_cross_entropy(labels=label, logits=pred, scope='loss') 62 | classify_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=label, logits=pred, name='loss') 63 | tf.summary.scalar('classify_loss', classify_loss) 64 | return classify_loss 65 | -------------------------------------------------------------------------------- /alsNet/model2.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | BASE_DIR = os.path.dirname(__file__) 4 | sys.path.append(BASE_DIR) 5 | sys.path.append(os.path.join(BASE_DIR, '../utils')) 6 | import tensorflow as tf 7 | import numpy as np 8 | import tf_util 9 | from pointnet_util import pointnet_sa_module, pointnet_fp_module 10 | 11 | def placeholder_inputs(batch_size, points_in, num_feat): 12 | pointclouds_pl = tf.placeholder(tf.float32, shape=(batch_size, points_in, num_feat), name='pointcloud_in') 13 | labels_pl = tf.placeholder(tf.int32, shape=(batch_size, points_in), name='labels') 14 | #smpws_pl = tf.placeholder(tf.float32, shape=(batch_size, num_point)) 15 | return pointclouds_pl, labels_pl 16 | 17 | 18 | def get_model(point_cloud, is_training, num_class, bn_decay=None): 19 | """ Semantic segmentation PointNet, input is BxNx3, output Bxnum_class """ 20 | batch_size = point_cloud.get_shape()[0].value 21 | num_point = point_cloud.get_shape()[1].value 22 | end_points = {} 23 | l0_xyz = tf.slice(point_cloud, [0,0,0], [-1,-1,3]) # point coordinates 24 | l0_points = tf.slice(point_cloud, [0,0,3], [-1,-1,-1]) #point attributes 25 | end_points['l0_xyz'] = l0_xyz 26 | 27 | # Set Abstraction layers 28 | l1_xyz, l1_points, l1_indices = pointnet_sa_module(l0_xyz, l0_points, npoint=4096, radius=3, nsample=64, mlp=[64,64,64,128], pooling='max_and_avg', mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer1') 29 | l2_xyz, l2_points, l2_indices = pointnet_sa_module(l1_xyz, l1_points, npoint=1024, radius=10, nsample=32, mlp=[128,128,128], pooling='max_and_avg', mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer2') 30 | #l3_xyz, l3_points, 
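# model2.py is a deliberately shallow variant: only two set-abstraction levels
# stay active (4096 points at radius 3, then 1024 points at radius 10, both
# pooled with max_and_avg), while the deeper l3-l5 stages and their
# feature-propagation counterparts are commented out here and below.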
l3_indices = pointnet_sa_module(l2_xyz, l2_points, npoint=512, radius=10, nsample=32, mlp=[128,128,256], pooling='max_and_avg', mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer3') 31 | #l4_xyz, l4_points, l4_indices = pointnet_sa_module(l3_xyz, l3_points, npoint=256, radius=50, nsample=32, mlp=[256,256,512], pooling='max_and_avg', mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer4') 32 | #l5_xyz, l5_points, l5_indices = pointnet_sa_module(l4_xyz, l4_points, npoint=100, radius=25, nsample=32, mlp=[512,512,1024], mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer5') 33 | # debug line: 34 | # l4_points = tf.Print(l1_points, [l0_xyz, l0_points, l1_xyz, l1_points], 'ln-points', -1, 12) 35 | #end_points['l1_xyz'] = l1_xyz 36 | #end_points['l2_xyz'] = l2_xyz 37 | #end_points['l3_xyz'] = l3_xyz 38 | #end_points['l4_xyz'] = l4_xyz 39 | #end_points['l5_xyz'] = l5_xyz 40 | 41 | # Feature Propagation layers 42 | #l4_points = pointnet_fp_module(l4_xyz, l5_xyz, l4_points, l5_points, [512,512], is_training, bn_decay, scope='fa_layer0') 43 | #l3_points = pointnet_fp_module(l3_xyz, l4_xyz, l3_points, l4_points, [256,256], is_training, bn_decay, scope='fa_layer1') 44 | #l2_points = pointnet_fp_module(l2_xyz, l3_xyz, l2_points, l3_points, [256,256], is_training, bn_decay, scope='fa_layer2') 45 | l1_points = pointnet_fp_module(l1_xyz, l2_xyz, l1_points, l2_points, [128,128], is_training, bn_decay, scope='fa_layer3') 46 | l0_points = pointnet_fp_module(l0_xyz, l1_xyz, l0_points, l1_points, [128,128,128], is_training, bn_decay, scope='fa_layer4') 47 | 48 | # FC layers 49 | net = tf_util.conv1d(l0_points, 128, 1, padding='VALID', bn=True, is_training=is_training, scope='fc1', bn_decay=bn_decay) 50 | end_points['feats'] = net 51 | net = tf_util.dropout(net, keep_prob=0.7, is_training=is_training, scope='dp1') 52 | net = tf_util.conv1d(net, num_class, 1, padding='VALID', activation_fn=None, scope='fc2', name='net') 53 | 54 | return net, end_points 55 | 56 | 57 | def get_loss(pred, label): 58 | """ pred: BxNxC, 59 | label: BxN, 60 | smpw: BxN """ 61 | #classify_loss = tf.losses.sparse_softmax_cross_entropy(labels=label, logits=pred, scope='loss') 62 | classify_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=label, logits=pred, name='loss') 63 | tf.summary.scalar('classify_loss', classify_loss) 64 | return classify_loss 65 | -------------------------------------------------------------------------------- /alsNet/model3.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | BASE_DIR = os.path.dirname(__file__) 4 | sys.path.append(BASE_DIR) 5 | sys.path.append(os.path.join(BASE_DIR, '../utils')) 6 | import tensorflow as tf 7 | import numpy as np 8 | import tf_util 9 | from pointnet_util import pointnet_sa_module, pointnet_fp_module 10 | 11 | arch = [ 12 | { 13 | 'npoint': 4*4096, 14 | 'radius': 1, 15 | 'nsample': 32, 16 | 'mlp': [128, 128, 128], 17 | 'pooling': 'max', 18 | 'mlp2': None, 19 | 'reverse_mlp': [128,128] 20 | }, 21 | { 22 | 'npoint': 2*4096, 23 | 'radius': 5, 24 | 'nsample': 64, 25 | 'mlp': [128, 128, 256], 26 | 'pooling': 'max', 27 | 'mlp2': None, 28 | 'reverse_mlp': [256,256] 29 | }, 30 | { 31 | 'npoint': 1*4096, 32 | 'radius': 15, 33 | 'nsample': 64, 34 | 'mlp': [128, 128, 256], 35 | 'pooling': 'max', 36 | 'mlp2': None, 37 | 'reverse_mlp': [256,256] 38 | }, 39 | #{ 40 | # 'npoint': 16, 41 | # 'radius': 999, 42 | # 'nsample': 
4096, 43 | # 'mlp': [256, 256, 512], 44 | # 'pooling': 'max_and_avg', 45 | # 'mlp2': None, 46 | # 'reverse_mlp': [512,512] 47 | #} 48 | ] 49 | 50 | 51 | def placeholder_inputs(batch_size, points_in, num_feat): 52 | pointclouds_pl = tf.placeholder(tf.float32, shape=(batch_size, points_in, num_feat), name='pointcloud_in') 53 | labels_pl = tf.placeholder(tf.int64, shape=(batch_size, points_in), name='labels') 54 | #smpws_pl = tf.placeholder(tf.float32, shape=(batch_size, num_point)) 55 | return pointclouds_pl, labels_pl 56 | 57 | 58 | def get_model(point_cloud, is_training, num_class, dropout, bn_decay=None): 59 | """ Semantic segmentation PointNet, input is BxNx3, output Bxnum_class """ 60 | batch_size = point_cloud.get_shape()[0].value 61 | num_point = point_cloud.get_shape()[1].value 62 | end_points = {} 63 | l0_xyz = tf.slice(point_cloud, [0,0,0], [-1,-1,3]) # point coordinates 64 | l0_points = tf.slice(point_cloud, [0,0,3], [-1,-1,-1]) #point attributes 65 | end_points['l0_xyz'] = l0_xyz 66 | 67 | ln_xyz = [l0_xyz] 68 | ln_points = [l0_points] 69 | ln_indices = [None] 70 | for depth, step_dict in enumerate(arch): 71 | li_xyz, li_points, li_indices = pointnet_sa_module(ln_xyz[depth], ln_points[depth], 72 | npoint=step_dict['npoint'], 73 | radius=step_dict['radius'], 74 | nsample=step_dict['nsample'], 75 | mlp=step_dict['mlp'], 76 | pooling=step_dict['pooling'], 77 | mlp2=step_dict['mlp2'], 78 | group_all=False, 79 | is_training=is_training, 80 | bn_decay=bn_decay, 81 | scope='layer%d' % (depth+1)) 82 | ln_xyz.append(li_xyz) 83 | #end_points['l%d_xyz' % (depth+1)] = li_xyz # debug subsampled points 84 | ln_points.append(li_points) 85 | ln_indices.append(li_indices) 86 | 87 | for depth, step_dict in enumerate(reversed(arch)): 88 | depth = len(arch) - depth 89 | li_points = pointnet_fp_module(ln_xyz[depth-1], ln_xyz[depth], 90 | ln_points[depth-1], ln_points[depth], 91 | step_dict['reverse_mlp'], 92 | is_training, 93 | bn_decay, 94 | scope='fa_layer%d' % (depth-1)) 95 | ln_points[depth-1] = li_points 96 | 97 | l0_points = ln_points[0] 98 | 99 | # Set Abstraction layers 100 | #l1_xyz, l1_points, l1_indices = pointnet_sa_module(l0_xyz, l0_points, npoint=2048, radius=3, nsample=64, mlp=[64,64,128], pooling='max_and_avg', mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer1') 101 | #l2_xyz, l2_points, l2_indices = pointnet_sa_module(l1_xyz, l1_points, npoint=1024, radius=10, nsample=32, mlp=[128,128,128], pooling='max_and_avg', mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer2') 102 | #l3_xyz, l3_points, l3_indices = pointnet_sa_module(l2_xyz, l2_points, npoint=512, radius=15, nsample=32, mlp=[128,128,256], pooling='max_and_avg', mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer3') 103 | #l4_xyz, l4_points, l4_indices = pointnet_sa_module(l3_xyz, l3_points, npoint=256, radius=50, nsample=32, mlp=[256,256,512], pooling='max_and_avg', mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer4') 104 | #l5_xyz, l5_points, l5_indices = pointnet_sa_module(l4_xyz, l4_points, npoint=100, radius=25, nsample=32, mlp=[512,512,1024], mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer5') 105 | # debug line: 106 | # l4_points = tf.Print(l1_points, [l0_xyz, l0_points, l1_xyz, l1_points], 'ln-points', -1, 12) 107 | #end_points['l1_xyz'] = l1_xyz 108 | #end_points['l2_xyz'] = l2_xyz 109 | #end_points['l3_xyz'] = l3_xyz 110 | #end_points['l4_xyz'] = 
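# Index bookkeeping for the generic decoder loop above, worked through for
# len(arch) == 3: enumerate(reversed(arch)) yields (0, arch[2]), (1, arch[1]),
# (2, arch[0]), and depth is remapped to len(arch) - depth, so
#   step 0 -> depth 3: fp(l2 <- l3) with arch[2]['reverse_mlp']
#   step 1 -> depth 2: fp(l1 <- l2) with arch[1]['reverse_mlp']
#   step 2 -> depth 1: fp(l0 <- l1) with arch[0]['reverse_mlp']
# Each set-abstraction level is thus undone by the feature-propagation module
# of the same arch entry, and the upsampled features overwrite
# ln_points[depth-1] until ln_points[0] again holds one feature vector per
# input point.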
l4_xyz 111 | #end_points['l5_xyz'] = l5_xyz 112 | 113 | # Feature Propagation layers 114 | #l4_points = pointnet_fp_module(l4_xyz, l5_xyz, l4_points, l5_points, [512,512], is_training, bn_decay, scope='fa_layer0') 115 | #l3_points = pointnet_fp_module(l3_xyz, l4_xyz, l3_points, l4_points, [256,256], is_training, bn_decay, scope='fa_layer1') 116 | #l2_points = pointnet_fp_module(l2_xyz, l3_xyz, l2_points, l3_points, [256,256], is_training, bn_decay, scope='fa_layer2') 117 | #l1_points = pointnet_fp_module(l1_xyz, l2_xyz, l1_points, l2_points, [128,128], is_training, bn_decay, scope='fa_layer3') 118 | #l0_points = pointnet_fp_module(l0_xyz, l1_xyz, l0_points, l1_points, [128,128,128], is_training, bn_decay, scope='fa_layer4') 119 | 120 | # FC layers 121 | net = tf_util.conv1d(l0_points, 128, 1, padding='VALID', bn=True, is_training=is_training, scope='fc1', bn_decay=bn_decay) 122 | end_points['feats'] = net 123 | net = tf_util.dropout(net, keep_prob=1-dropout, is_training=is_training, scope='dp1') 124 | net = tf_util.conv1d(net, num_class, 1, padding='VALID', activation_fn=None, scope='fc2', name='net') 125 | 126 | return net, end_points 127 | 128 | 129 | def get_loss(pred, label): 130 | """ pred: BxNxC, 131 | label: BxN, 132 | smpw: BxN """ 133 | #classify_loss = tf.losses.sparse_softmax_cross_entropy(labels=label, logits=pred, scope='loss') 134 | weights = tf.where(tf.logical_and(label != 2, tf.argmax(pred) == 2), 10, 1) 135 | classify_loss = tf.losses.sparse_softmax_cross_entropy(labels=label, logits=pred, scope='loss', weights=weights) 136 | #classify_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=label, logits=pred, name='loss') 137 | # false negatives are less of a problem than false positives 138 | # pred_val = np.argmax(pred) 139 | # fn = np.count_nonzero(np.logical_and(label == 2, pred_val != 2)) 140 | # fp = np.count_nonzero(np.logical_and(label != 2, pred_val == 2)) 141 | # tn = np.count_nonzero(np.logical_and(label != 2, pred_val != 2)) 142 | # tp = np.count_nonzero(np.logical_and(label == 2, pred_val == 2)) 143 | # precision = tp / (tp+fp+0.01) # correctness 144 | # recall = tp / (tp+fn+0.01) # completeness 145 | # f1_score = 2*precision*recall / (precision+recall+0.01) 146 | # classify_loss = 1-f1_score 147 | tf.summary.scalar('classify_loss', classify_loss) 148 | return classify_loss 149 | -------------------------------------------------------------------------------- /alsNet/plots/confusion.py: -------------------------------------------------------------------------------- 1 | from __future__ import print_function 2 | import numpy as np 3 | import matplotlib 4 | matplotlib.use('agg') 5 | import matplotlib.pyplot as plt 6 | from matplotlib import gridspec 7 | import os 8 | from sklearn.metrics import confusion_matrix 9 | import glob 10 | import sys 11 | from alsNet.dataset import Dataset 12 | 13 | class_names={ 14 | 0: 'Power', 15 | 1: 'Low Veg.', 16 | 2: 'Imp. Surf.', 17 | 3: 'Car', 18 | 4: 'Fence/Hedge', 19 | 5: 'Roof', 20 | 6: 'Facade', 21 | 7: 'Shrub', 22 | 8: 'Tree', 23 | } 24 | class_names={ 25 | 2: 'Ground', 26 | 3: 'Low Veg.', 27 | 4: 'Med. Veg.', 28 | 5: 'High Veg.', 29 | } 30 | class_names={ 31 | 2: 'Ground', 32 | 3: 'Low Veg.', 33 | 4: 'Med. 
Veg.', 34 | 5: 'High Veg.', 35 | 6: 'Building', 36 | 9: 'Water', 37 | -1: 'Other' 38 | } 39 | 40 | def get_cm_compressed(cm, keep_classes=(2, 3, 4, 5, 6, 9), delete=False): 41 | """ 42 | Compresses a confusion matrix into the interesting columns/rows 43 | (careful, they are not ordered according to keep_classes, but the indices change!) 44 | and collects the rest in the last column/row 45 | :param cm: a 2D confusion matrix 46 | :param keep_classes: a set of classes to keep 47 | :param delete: delete rows from matrix after caluclation (default: False) 48 | :return: 49 | """ 50 | coll_idx = cm.shape[0] 51 | cm_buf = np.append(cm, np.zeros((1, coll_idx)), axis=0) 52 | cm_buf = np.append(cm_buf, np.zeros((coll_idx + 1, 1)), axis=1) 53 | sum_idxs = [i for i in range(coll_idx) if i not in keep_classes] 54 | cm_buf[:, coll_idx] = np.sum(cm_buf[:, sum_idxs], axis=1) 55 | cm_buf[coll_idx, :] = np.sum(cm_buf[sum_idxs, :], axis=0) 56 | cm_buf[coll_idx, coll_idx] = np.sum(cm_buf[sum_idxs, -1]) 57 | if delete: 58 | cm_buf = np.delete(cm_buf, sum_idxs, axis=0) 59 | cm_buf = np.delete(cm_buf, sum_idxs, axis=1) 60 | return cm_buf 61 | 62 | def over_gt(cm): 63 | return (cm.T/ np.sum(cm, axis=1)).T 64 | 65 | def main(tile_id): 66 | input_files = r"D:\91_classes\10_MSc\04_results\VSC\4\test20\2011_%s_c*.laz"% tile_id 67 | #input_files = r"D:\91_classes\10_MSc\04_results\VSC\28\test36\area1_aoi_c*_test.laz" 68 | #input_files = r"D:\91_classes\10_MSc\04_results\VSC\32\test33\merge.las" 69 | 70 | filelist = glob.glob(input_files) 71 | cm_sum = np.zeros((30,30), dtype=np.int64) 72 | pt_cnt = 0 73 | for idx, file in enumerate(filelist): 74 | print("Loading dataset '%s' (%s/%s)" % (file, idx+1, len(filelist))) 75 | ds = Dataset(file) 76 | ref = list(ds.labels) 77 | gt_col = ds.names.index('estim_class') 78 | gt = list(ds.points_and_features[:, gt_col+3]) 79 | labels = ref #[item+2 for item in ref] 80 | classes = gt #[item+2 for item in gt] 81 | pt_cnt += len(ref) 82 | print("Creating confusion matrix") 83 | eval_cm = confusion_matrix(labels, classes, range(30)) 84 | cm_sum += eval_cm 85 | 86 | keep_classes = (2,3,4,5,6,9)#(2,3,4,5) #(0, 1, 2, 3, 4, 5, 6, 7, 8) 87 | # confusion matrix plot 88 | print("Plotting") 89 | fig = plt.figure(figsize=(10, 10)) 90 | num_classes = len(keep_classes) + 1 91 | keep_classes_e = keep_classes + (-1,) 92 | gs = gridspec.GridSpec(num_classes, num_classes) 93 | 94 | cm_sum = get_cm_compressed(cm_sum, keep_classes, delete=True) 95 | conf_all = over_gt(cm_sum) 96 | row = -1 97 | for ref_idx, ref_class in enumerate(keep_classes_e): 98 | curr_ref_axis = None 99 | row += 1 100 | col = -1 101 | for eval_idx, eval_class in enumerate(keep_classes_e): 102 | col += 1 103 | conf = conf_all[ref_idx, eval_idx] 104 | if curr_ref_axis: 105 | plt.subplot(gs[row, col], sharey=curr_ref_axis) 106 | else: 107 | curr_ref_axis = plt.subplot(gs[row, col]) 108 | 109 | plt.plot([0], [0]) 110 | plt.xlim([0, 1]) 111 | plt.ylim([0, 1]) 112 | #plt.plot(points_seen, conf_timeline) 113 | 114 | if col == row: 115 | if col == num_classes-1: 116 | plt.gca().set_facecolor('gray') 117 | highcolor = 'k' 118 | lowcolor = 'k' 119 | else: 120 | plt.gca().set_facecolor(([30/255, 180/255, 60/255, conf])) 121 | highcolor = 'xkcd:forest green' 122 | lowcolor = 'xkcd:grass green' 123 | else: 124 | plt.gca().set_facecolor(([220/255, 60/255, 30/255, conf])) 125 | highcolor = 'xkcd:orange red' 126 | lowcolor = 'xkcd:dirty pink' 127 | 128 | plt.text(0.5, 129 | 0.5, 130 | "%.1f%%" % (conf * 100) if not np.isnan(conf) else "N/A", 
ha='center', 131 | )#color=highcolor if conf > 0.5 else lowcolor) 132 | cm = cm_sum 133 | ref_sum = np.sum(cm, axis=1)[ref_idx] 134 | eval_sum = np.sum(cm, axis=0)[eval_idx] 135 | plt.text(0.5, 136 | 0.3, 137 | "%d" % (cm[ref_idx, eval_idx]), ha='center') 138 | if col == 0: 139 | plt.ylabel('%s\n%d\n(%.0f%%)' % (class_names[ref_class], 140 | ref_sum, 141 | ref_sum / (pt_cnt) * 100)) 142 | if row == 0: 143 | plt.gca().xaxis.set_label_position('top') 144 | plt.xlabel('%s\n%d\n(%.0f%%)' % (class_names[eval_class], 145 | eval_sum, 146 | eval_sum / (pt_cnt) * 100)) 147 | 148 | plt.gca().get_yaxis().set_ticks([]) 149 | plt.gca().get_xaxis().set_ticks([]) 150 | 151 | plt.ylim([0, 1]) 152 | 153 | print("saving plot") 154 | fig.text(0.5, 0.94, 'Estimated', ha='center', va='center', fontweight='bold') 155 | fig.text(0.06, 0.5, 'Ground truth', ha='center', va='center', rotation='vertical', fontweight='bold') 156 | 157 | plt.subplots_adjust(hspace=.0, wspace=.0) 158 | plt.savefig((r"D:\91_classes\10_MSc\04_results\VSC\4\test20\2011_%s_cm3.png" % tile_id).replace("*", "all")) 159 | #plt.savefig((r"D:\91_classes\10_MSc\04_results\VSC\28\test36\conf.png")) 160 | #plt.savefig(r"D:\91_classes\10_MSc\04_results\VSC\32\test33\merge.png") 161 | 162 | main('13235203') 163 | main('13245200') 164 | main('13205000') 165 | main('11275100') 166 | main('*') -------------------------------------------------------------------------------- /bregenz_c1293.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lwiniwar/alsNet/aab6a8a8d9df7850d7ec5fccee463ef40087a1a7/bregenz_c1293.png -------------------------------------------------------------------------------- /tf_ops/3d_interpolation/interpolate.cpp: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include // memset 4 | #include // rand, RAND_MAX 5 | #include // sqrtf 6 | #include 7 | #include 8 | using namespace std; 9 | float randomf(){ 10 | return (rand()+0.5)/(RAND_MAX+1.0); 11 | } 12 | static double get_time(){ 13 | timespec tp; 14 | clock_gettime(CLOCK_MONOTONIC,&tp); 15 | return tp.tv_sec+tp.tv_nsec*1e-9; 16 | } 17 | 18 | // Find three nearest neigbors with square distance 19 | // input: xyz1 (b,n,3), xyz2(b,m,3) 20 | // output: dist (b,n,3), idx (b,n,3) 21 | void threenn_cpu(int b, int n, int m, const float *xyz1, const float *xyz2, float *dist, int *idx) { 22 | for (int i=0;i 2 | #include 3 | #include // memset 4 | #include // rand, RAND_MAX 5 | #include // sqrtf 6 | #include "tensorflow/core/framework/op.h" 7 | #include "tensorflow/core/framework/op_kernel.h" 8 | #include "tensorflow/core/framework/shape_inference.h" 9 | #include "tensorflow/core/framework/common_shape_fns.h" 10 | using namespace tensorflow; 11 | 12 | REGISTER_OP("ThreeNN") 13 | .Input("xyz1: float32") 14 | .Input("xyz2: float32") 15 | .Output("dist: float32") 16 | .Output("idx: int32") 17 | .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) { 18 | c->set_output(0, c->input(0)); 19 | c->set_output(1, c->input(0)); 20 | return Status::OK(); 21 | }); 22 | REGISTER_OP("ThreeInterpolate") 23 | .Input("points: float32") 24 | .Input("idx: int32") 25 | .Input("weight: float32") 26 | .Output("out: float32") 27 | .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) { 28 | ::tensorflow::shape_inference::ShapeHandle dims1; // (b,m,c) 29 | c->WithRank(c->input(0), 3, &dims1); 30 | ::tensorflow::shape_inference::ShapeHandle dims2; // (b,n,3) 31 | 
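// Shape inference, continued: after the rank checks, the ThreeInterpolate
// output shape is assembled as (b, n, c) -- batch size and channel count are
// taken from the `points` input (dims1), the number of query points n from
// the `idx` input (dims2). Every query point thus receives a c-dimensional
// feature vector blended from its three nearest known points.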
c->WithRank(c->input(1), 3, &dims2); 32 | // (b,n,c) 33 | ::tensorflow::shape_inference::ShapeHandle output = c->MakeShape({c->Dim(dims1, 0), c->Dim(dims2, 1), c->Dim(dims1, 2)}); 34 | c->set_output(0, output); 35 | return Status::OK(); 36 | }); 37 | REGISTER_OP("ThreeInterpolateGrad") 38 | .Input("points: float32") 39 | .Input("idx: int32") 40 | .Input("weight: float32") 41 | .Input("grad_out: float32") 42 | .Output("grad_points: float32") 43 | .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) { 44 | c->set_output(0, c->input(0)); 45 | return Status::OK(); 46 | }); 47 | 48 | float randomf(){ 49 | return (rand()+0.5)/(RAND_MAX+1.0); 50 | } 51 | static double get_time(){ 52 | timespec tp; 53 | clock_gettime(CLOCK_MONOTONIC,&tp); 54 | return tp.tv_sec+tp.tv_nsec*1e-9; 55 | } 56 | 57 | // Find three nearest neigbors with square distance 58 | // input: xyz1 (b,n,3), xyz2(b,m,3) 59 | // output: dist (b,n,3), idx (b,n,3) 60 | void threenn_cpu(int b, int n, int m, const float *xyz1, const float *xyz2, float *dist, int *idx) { 61 | for (int i=0;iinput(0); 163 | OP_REQUIRES(context, xyz1_tensor.dims()==3 && xyz1_tensor.shape().dim_size(2)==3, errors::InvalidArgument("ThreeNN expects (b,n,3) xyz1 shape.")); 164 | int b = xyz1_tensor.shape().dim_size(0); 165 | int n = xyz1_tensor.shape().dim_size(1); 166 | 167 | const Tensor& xyz2_tensor = context->input(1); 168 | OP_REQUIRES(context, xyz2_tensor.dims()==3 && xyz2_tensor.shape().dim_size(2)==3, errors::InvalidArgument("ThreeNN expects (b,m,3) xyz2 shape.")); 169 | int m = xyz2_tensor.shape().dim_size(1); 170 | 171 | Tensor *dist_tensor = nullptr; 172 | OP_REQUIRES_OK(context, context->allocate_output(0, TensorShape{b,n,3}, &dist_tensor)); 173 | Tensor *idx_tensor = nullptr; 174 | OP_REQUIRES_OK(context, context->allocate_output(1, TensorShape{b,n,3}, &idx_tensor)); 175 | 176 | auto xyz1_flat = xyz1_tensor.flat(); 177 | const float *xyz1 = &(xyz1_flat(0)); 178 | auto xyz2_flat = xyz2_tensor.flat(); 179 | const float *xyz2 = &(xyz2_flat(0)); 180 | auto dist_flat = dist_tensor->flat(); 181 | float *dist = &(dist_flat(0)); 182 | auto idx_flat = idx_tensor->flat(); 183 | int *idx = &(idx_flat(0)); 184 | threenn_cpu(b,n,m,xyz1,xyz2,dist,idx); 185 | } 186 | }; 187 | REGISTER_KERNEL_BUILDER(Name("ThreeNN").Device(DEVICE_CPU), ThreeNNOp); 188 | 189 | 190 | 191 | class ThreeInterpolateOp: public OpKernel{ 192 | public: 193 | explicit ThreeInterpolateOp(OpKernelConstruction * context):OpKernel(context){} 194 | 195 | void Compute(OpKernelContext * context) override { 196 | const Tensor& points_tensor=context->input(0); 197 | OP_REQUIRES(context, points_tensor.dims()==3, errors::InvalidArgument("ThreeInterpolate expects (b,m,c) points shape")); 198 | int b = points_tensor.shape().dim_size(0); 199 | int m = points_tensor.shape().dim_size(1); 200 | int c = points_tensor.shape().dim_size(2); 201 | 202 | const Tensor& idx_tensor=context->input(1); 203 | OP_REQUIRES(context,idx_tensor.dims()==3 && idx_tensor.shape().dim_size(0)==b && idx_tensor.shape().dim_size(2)==3, errors::InvalidArgument("ThreeInterpolate expects (b,n,3) idx shape")); 204 | int n = idx_tensor.shape().dim_size(1); 205 | const Tensor& weight_tensor=context->input(2); 206 | OP_REQUIRES(context,weight_tensor.dims()==3 && weight_tensor.shape().dim_size(0)==b && weight_tensor.shape().dim_size(1)==n && weight_tensor.shape().dim_size(2)==3, errors::InvalidArgument("ThreeInterpolate expects (b,n,3) weight shape")); 207 | 208 | Tensor * out_tensor = nullptr; 209 | OP_REQUIRES_OK(context, 
context->allocate_output(0,TensorShape{b,n,c}, &out_tensor)); 210 | 211 | auto points_flat = points_tensor.flat(); 212 | const float *points = &(points_flat(0)); 213 | auto idx_flat = idx_tensor.flat(); 214 | const int *idx = &(idx_flat(0)); 215 | auto weight_flat = weight_tensor.flat(); 216 | const float *weight = &(weight_flat(0)); 217 | auto out_flat = out_tensor->flat(); 218 | float *out = &(out_flat(0)); 219 | threeinterpolate_cpu(b,m,c,n,points,idx,weight,out); 220 | } 221 | }; 222 | REGISTER_KERNEL_BUILDER(Name("ThreeInterpolate").Device(DEVICE_CPU),ThreeInterpolateOp); 223 | 224 | 225 | class ThreeInterpolateGradOp: public OpKernel{ 226 | public: 227 | explicit ThreeInterpolateGradOp(OpKernelConstruction * context):OpKernel(context){} 228 | 229 | void Compute(OpKernelContext * context) override { 230 | const Tensor& points_tensor=context->input(0); 231 | OP_REQUIRES(context, points_tensor.dims()==3, errors::InvalidArgument("ThreeInterpolateGrad expects (b,m,c) points shape")); 232 | int b = points_tensor.shape().dim_size(0); 233 | int m = points_tensor.shape().dim_size(1); 234 | int c = points_tensor.shape().dim_size(2); 235 | 236 | const Tensor& idx_tensor=context->input(1); 237 | OP_REQUIRES(context,idx_tensor.dims()==3 && idx_tensor.shape().dim_size(0)==b, errors::InvalidArgument("ThreeInterpolateGrad expects (b,n,3) idx shape")); 238 | int n = idx_tensor.shape().dim_size(1); 239 | const Tensor& weight_tensor=context->input(2); 240 | OP_REQUIRES(context,weight_tensor.dims()==3 && weight_tensor.shape().dim_size(0)==b && weight_tensor.shape().dim_size(1)==n && weight_tensor.shape().dim_size(2)==3, errors::InvalidArgument("ThreeInterpolateGrad expects (b,n,3) weight shape")); 241 | 242 | const Tensor& grad_out_tensor=context->input(3); 243 | OP_REQUIRES(context,grad_out_tensor.dims()==3 && grad_out_tensor.shape().dim_size(0)==b && grad_out_tensor.shape().dim_size(1)==n && grad_out_tensor.shape().dim_size(2)==c, errors::InvalidArgument("ThreeInterpolateGrad expects (b,n,c) grad_out shape")); 244 | 245 | Tensor * grad_points_tensor = nullptr; 246 | OP_REQUIRES_OK(context, context->allocate_output(0,TensorShape{b,m,c}, &grad_points_tensor)); 247 | 248 | auto points_flat = points_tensor.flat(); 249 | const float *points = &(points_flat(0)); 250 | auto idx_flat = idx_tensor.flat(); 251 | const int *idx = &(idx_flat(0)); 252 | auto weight_flat = weight_tensor.flat(); 253 | const float *weight = &(weight_flat(0)); 254 | auto grad_out_flat = grad_out_tensor.flat(); 255 | const float *grad_out = &(grad_out_flat(0)); 256 | auto grad_points_flat = grad_points_tensor->flat(); 257 | float *grad_points = &(grad_points_flat(0)); 258 | memset(grad_points, 0, sizeof(float)*b*m*c); 259 | threeinterpolate_grad_cpu(b,n,c,m,grad_out,idx,weight,grad_points); 260 | } 261 | }; 262 | REGISTER_KERNEL_BUILDER(Name("ThreeInterpolateGrad").Device(DEVICE_CPU),ThreeInterpolateGradOp); 263 | 264 | 265 | -------------------------------------------------------------------------------- /tf_ops/3d_interpolation/tf_interpolate.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.python.framework import ops 3 | import sys 4 | import os 5 | BASE_DIR = os.path.dirname(__file__) 6 | sys.path.append(BASE_DIR) 7 | interpolate_module=tf.load_op_library(os.path.join(BASE_DIR, 'tf_interpolate_so.so')) 8 | def three_nn(xyz1, xyz2): 9 | ''' 10 | Input: 11 | xyz1: (b,n,3) float32 array, unknown points 12 | xyz2: (b,m,3) float32 array, known points 13 
| Output: 14 | dist: (b,n,3) float32 array, distances to known points 15 | idx: (b,n,3) int32 array, indices to known points 16 | ''' 17 | return interpolate_module.three_nn(xyz1, xyz2) 18 | ops.NoGradient('ThreeNN') 19 | def three_interpolate(points, idx, weight): 20 | ''' 21 | Input: 22 | points: (b,m,c) float32 array, known points 23 | idx: (b,n,3) int32 array, indices to known points 24 | weight: (b,n,3) float32 array, weights on known points 25 | Output: 26 | out: (b,n,c) float32 array, interpolated point values 27 | ''' 28 | return interpolate_module.three_interpolate(points, idx, weight) 29 | @tf.RegisterGradient('ThreeInterpolate') 30 | def _three_interpolate_grad(op, grad_out): 31 | points = op.inputs[0] 32 | idx = op.inputs[1] 33 | weight = op.inputs[2] 34 | return [interpolate_module.three_interpolate_grad(points, idx, weight, grad_out), None, None] 35 | 36 | if __name__=='__main__': 37 | import numpy as np 38 | import time 39 | np.random.seed(100) 40 | pts = np.random.random((32,128,64)).astype('float32') 41 | tmp1 = np.random.random((32,512,3)).astype('float32') 42 | tmp2 = np.random.random((32,128,3)).astype('float32') 43 | with tf.device('/cpu:0'): 44 | points = tf.constant(pts) 45 | xyz1 = tf.constant(tmp1) 46 | xyz2 = tf.constant(tmp2) 47 | dist, idx = three_nn(xyz1, xyz2) 48 | weight = tf.ones_like(dist)/3.0 49 | interpolated_points = three_interpolate(points, idx, weight) 50 | with tf.Session('') as sess: 51 | now = time.time() 52 | for _ in range(100): 53 | ret = sess.run(interpolated_points) 54 | print(time.time() - now) 55 | print(ret.shape, ret.dtype) 56 | #print ret 57 | 58 | 59 | 60 | -------------------------------------------------------------------------------- /tf_ops/3d_interpolation/tf_interpolate_compile.sh: -------------------------------------------------------------------------------- 1 | TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())') 2 | TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())') 3 | g++ -std=c++11 tf_interpolate.cpp -o tf_interpolate_so.so -shared -fPIC -I $TF_INC -I /usr/local/cuda-9.0/include -lcudart -L /usr/local/cuda-9.0/lib64/ -O2 -D_GLIBCXX_USE_CXX11_ABI=0 -I $TF_INC/external/nsync/public -L $TF_LIB -ltensorflow_framework 4 | 5 | -------------------------------------------------------------------------------- /tf_ops/3d_interpolation/tf_interpolate_op_test.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | import numpy as np 3 | from tf_interpolate import three_nn, three_interpolate 4 | 5 | class GroupPointTest(tf.test.TestCase): 6 | def test(self): 7 | pass 8 | 9 | def test_grad(self): 10 | with self.test_session(): 11 | points = tf.constant(np.random.random((1,8,16)).astype('float32')) 12 | print(points) 13 | xyz1 = tf.constant(np.random.random((1,128,3)).astype('float32')) 14 | xyz2 = tf.constant(np.random.random((1,8,3)).astype('float32')) 15 | dist, idx = three_nn(xyz1, xyz2) 16 | weight = tf.ones_like(dist)/3.0 17 | interpolated_points = three_interpolate(points, idx, weight) 18 | print(interpolated_points) 19 | err = tf.test.compute_gradient_error(points, (1,8,16), interpolated_points, (1,128,16)) 20 | print(err) 21 | self.assertLess(err, 1e-4) 22 | 23 | if __name__=='__main__': 24 | tf.test.main() 25 | -------------------------------------------------------------------------------- /tf_ops/grouping/.gitignore: -------------------------------------------------------------------------------- 1 | a.out 2 |
query_ball_point 3 | query_ball_point_block 4 | query_ball_point_cuda 5 | query_ball_point_grid 6 | tf_grouping_g.cu.o 7 | tf_grouping_so.so 8 | selection_sort 9 | selection_sort_cuda 10 | selection_sort_const_cuda 11 | -------------------------------------------------------------------------------- /tf_ops/grouping/test/compile.sh: -------------------------------------------------------------------------------- 1 | g++ query_ball_point.cpp -o query_ball_point 2 | nvcc query_ball_point.cu -o query_ball_point_cuda 3 | nvcc query_ball_point_block.cu -o query_ball_point_block 4 | nvcc query_ball_point_grid.cu -o query_ball_point_grid 5 | g++ -Wall selection_sort.cpp -o selection_sort 6 | nvcc selection_sort.cu -o selection_sort_cuda 7 | -------------------------------------------------------------------------------- /tf_ops/grouping/test/query_ball_point.cpp: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include // memset 4 | #include // rand, RAND_MAX 5 | #include // sqrtf 6 | #include 7 | #include 8 | using namespace std; 9 | float randomf(){ 10 | return (rand()+0.5)/(RAND_MAX+1.0); 11 | } 12 | static double get_time(){ 13 | timespec tp; 14 | clock_gettime(CLOCK_MONOTONIC,&tp); 15 | return tp.tv_sec+tp.tv_nsec*1e-9; 16 | } 17 | // input: radius (1), nsample (1), xyz1 (b,n,3), xyz2 (b,m,3) 18 | // output: idx (b,m,nsample) 19 | void query_ball_point_cpu(int b, int n, int m, float radius, int nsample, const float *xyz1, const float *xyz2, int *idx) { 20 | for (int i=0;i 2 | #include 3 | #include // memset 4 | #include // rand, RAND_MAX 5 | #include // sqrtf 6 | #include 7 | #include 8 | using namespace std; 9 | float randomf(){ 10 | return (rand()+0.5)/(RAND_MAX+1.0); 11 | } 12 | static double get_time(){ 13 | timespec tp; 14 | clock_gettime(CLOCK_MONOTONIC,&tp); 15 | return tp.tv_sec+tp.tv_nsec*1e-9; 16 | } 17 | // input: radius (1), nsample (1), xyz1 (b,n,3), xyz2 (b,m,3) 18 | // output: idx (b,m,nsample) 19 | __global__ void query_ball_point_gpu(int b, int n, int m, float radius, int nsample, const float *xyz1, const float *xyz2, int *idx) { 20 | for (int i=0;i>>(b,n,m,radius,nsample,xyz1,xyz2,idx); 113 | cudaDeviceSynchronize(); 114 | printf("query_ball_point gpu time %f\n",get_time()-t0); 115 | 116 | t0=get_time(); 117 | group_point_gpu<<<1,1>>>(b,n,c,m,nsample,points,idx,out); 118 | cudaDeviceSynchronize(); 119 | printf("grou_point gpu time %f\n",get_time()-t0); 120 | 121 | t0=get_time(); 122 | group_point_grad_gpu<<<1,1>>>(b,n,c,m,nsample,grad_out,idx,grad_points); 123 | cudaDeviceSynchronize(); 124 | printf("grou_point_grad gpu time %f\n",get_time()-t0); 125 | 126 | cudaFree(xyz1); 127 | cudaFree(xyz2); 128 | cudaFree(points); 129 | cudaFree(idx); 130 | cudaFree(out); 131 | cudaFree(grad_out); 132 | cudaFree(grad_points); 133 | return 0; 134 | } 135 | -------------------------------------------------------------------------------- /tf_ops/grouping/test/query_ball_point_block.cu: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include // memset 4 | #include // rand, RAND_MAX 5 | #include // sqrtf 6 | #include 7 | #include 8 | using namespace std; 9 | float randomf(){ 10 | return (rand()+0.5)/(RAND_MAX+1.0); 11 | } 12 | static double get_time(){ 13 | timespec tp; 14 | clock_gettime(CLOCK_MONOTONIC,&tp); 15 | return tp.tv_sec+tp.tv_nsec*1e-9; 16 | } 17 | // input: radius (1), nsample (1), xyz1 (b,n,3), xyz2 (b,m,3) 18 | // output: idx (b,m,nsample) 19 | 
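// The three CUDA variants in this test folder map the same brute-force ball
// query onto the GPU with increasing parallelism:
//   query_ball_point.cu       -- launched <<<1,1>>>: a single thread loops
//                                over all batches (baseline)
//   query_ball_point_block.cu -- this file, launched <<<1,b>>>: one thread
//                                per batch, each offsetting its pointers by
//                                threadIdx.x
//   query_ball_point_grid.cu  -- one block per batch (blockIdx.x), with the
//                                block's threads striding over that batch's m
//                                query points via threadIdx.x/blockDim.x
// The main() functions time the query, grouping and grouping-gradient kernels
// so the variants can be compared directly.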
__global__ void query_ball_point_gpu(int b, int n, int m, float radius, int nsample, const float *xyz1, const float *xyz2, int *idx) { 20 | int index = threadIdx.x; 21 | xyz1 += n*3*index; 22 | xyz2 += m*3*index; 23 | idx += m*nsample*index; 24 | 25 | for (int j=0;j>>(b,n,m,radius,nsample,xyz1,xyz2,idx); 113 | cudaDeviceSynchronize(); 114 | printf("query_ball_point gpu time %f\n",get_time()-t0); 115 | 116 | t0=get_time(); 117 | group_point_gpu<<<1,b>>>(b,n,c,m,nsample,points,idx,out); 118 | cudaDeviceSynchronize(); 119 | printf("grou_point gpu time %f\n",get_time()-t0); 120 | 121 | t0=get_time(); 122 | group_point_grad_gpu<<<1,b>>>(b,n,c,m,nsample,grad_out,idx,grad_points); 123 | cudaDeviceSynchronize(); 124 | printf("grou_point_grad gpu time %f\n",get_time()-t0); 125 | 126 | cudaFree(xyz1); 127 | cudaFree(xyz2); 128 | cudaFree(points); 129 | cudaFree(idx); 130 | cudaFree(out); 131 | cudaFree(grad_out); 132 | cudaFree(grad_points); 133 | return 0; 134 | } 135 | -------------------------------------------------------------------------------- /tf_ops/grouping/test/query_ball_point_grid.cu: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include // memset 4 | #include // rand, RAND_MAX 5 | #include // sqrtf 6 | #include 7 | #include 8 | using namespace std; 9 | float randomf(){ 10 | return (rand()+0.5)/(RAND_MAX+1.0); 11 | } 12 | static double get_time(){ 13 | timespec tp; 14 | clock_gettime(CLOCK_MONOTONIC,&tp); 15 | return tp.tv_sec+tp.tv_nsec*1e-9; 16 | } 17 | // input: radius (1), nsample (1), xyz1 (b,n,3), xyz2 (b,m,3) 18 | // output: idx (b,m,nsample) 19 | __global__ void query_ball_point_gpu(int b, int n, int m, float radius, int nsample, const float *xyz1, const float *xyz2, int *idx) { 20 | int batch_index = blockIdx.x; 21 | xyz1 += n*3*batch_index; 22 | xyz2 += m*3*batch_index; 23 | idx += m*nsample*batch_index; 24 | 25 | int index = threadIdx.x; 26 | int stride = blockDim.x; 27 | 28 | for (int j=index;j>>(b,n,m,radius,nsample,xyz1,xyz2,idx); 123 | cudaDeviceSynchronize(); 124 | printf("query_ball_point gpu time %f\n",get_time()-t0); 125 | 126 | t0=get_time(); 127 | group_point_gpu<<>>(b,n,c,m,nsample,points,idx,out); 128 | cudaDeviceSynchronize(); 129 | printf("grou_point gpu time %f\n",get_time()-t0); 130 | 131 | t0=get_time(); 132 | group_point_grad_gpu<<>>(b,n,c,m,nsample,grad_out,idx,grad_points); 133 | cudaDeviceSynchronize(); 134 | printf("grou_point_grad gpu time %f\n",get_time()-t0); 135 | 136 | cudaFree(xyz1); 137 | cudaFree(xyz2); 138 | cudaFree(points); 139 | cudaFree(idx); 140 | cudaFree(out); 141 | cudaFree(grad_out); 142 | cudaFree(grad_points); 143 | return 0; 144 | } 145 | -------------------------------------------------------------------------------- /tf_ops/grouping/test/selection_sort.cpp: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include // memset 4 | #include // rand, RAND_MAX 5 | #include // sqrtf 6 | #include 7 | #include 8 | using namespace std; 9 | float randomf(){ 10 | return (rand()+0.5)/(RAND_MAX+1.0); 11 | } 12 | static double get_time(){ 13 | timespec tp; 14 | clock_gettime(CLOCK_MONOTONIC,&tp); 15 | return tp.tv_sec+tp.tv_nsec*1e-9; 16 | } 17 | 18 | // input: k (1), distance matrix dist (b,m,n) 19 | // output: idx (b,m,n), val (b,m,n) 20 | void selection_sort_cpu(int b, int n, int m, int k, const float *dist, int *idx, float *val) { 21 | float *p_dist; 22 | float tmp; 23 | int tmpi; 24 | for (int i=0;i 2 | #include 3 | 
#include // memset 4 | #include // rand, RAND_MAX 5 | #include // sqrtf 6 | #include 7 | #include 8 | using namespace std; 9 | float randomf(){ 10 | return (rand()+0.5)/(RAND_MAX+1.0); 11 | } 12 | static double get_time(){ 13 | timespec tp; 14 | clock_gettime(CLOCK_MONOTONIC,&tp); 15 | return tp.tv_sec+tp.tv_nsec*1e-9; 16 | } 17 | 18 | // input: k (1), distance matrix dist (b,m,n) 19 | // output: idx (b,m,k), val (b,m,k) 20 | __global__ void selection_sort_gpu(int b, int n, int m, int k, float *dist, int *idx, float *val) { 21 | int batch_index = blockIdx.x; 22 | dist+=m*n*batch_index; 23 | idx+=m*k*batch_index; 24 | val+=m*k*batch_index; 25 | 26 | int index = threadIdx.x; 27 | int stride = blockDim.x; 28 | 29 | float *p_dist; 30 | for (int j=index;j>>(b,n,m,k,dist,idx,val); 68 | cudaDeviceSynchronize(); 69 | printf("selection sort cpu time %f\n",get_time()-t0); 70 | 71 | return 0; 72 | } 73 | -------------------------------------------------------------------------------- /tf_ops/grouping/test/selection_sort_const.cu: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include // memset 4 | #include // rand, RAND_MAX 5 | #include // sqrtf 6 | #include 7 | #include 8 | using namespace std; 9 | float randomf(){ 10 | return (rand()+0.5)/(RAND_MAX+1.0); 11 | } 12 | static double get_time(){ 13 | timespec tp; 14 | clock_gettime(CLOCK_MONOTONIC,&tp); 15 | return tp.tv_sec+tp.tv_nsec*1e-9; 16 | } 17 | 18 | // input: k (1), distance matrix dist (b,m,n) 19 | // output: idx (b,m,n), dist_out (b,m,n) 20 | __global__ void selection_sort_gpu(int b, int n, int m, int k, const float *dist, int *outi, float *out) { 21 | int batch_index = blockIdx.x; 22 | dist+=m*n*batch_index; 23 | outi+=m*n*batch_index; 24 | out+=m*n*batch_index; 25 | 26 | int index = threadIdx.x; 27 | int stride = blockDim.x; 28 | 29 | // copy from dist to dist_out 30 | for (int j=index;j>>(b,n,m,k,dist,idx,dist_out); 84 | cudaDeviceSynchronize(); 85 | printf("selection sort cpu time %f\n",get_time()-t0); 86 | 87 | //for (int i=0;i 2 | #include 3 | #include // memset 4 | #include // rand, RAND_MAX 5 | #include // sqrtf 6 | #include "tensorflow/core/framework/op.h" 7 | #include "tensorflow/core/framework/op_kernel.h" 8 | #include "tensorflow/core/framework/shape_inference.h" 9 | #include "tensorflow/core/framework/common_shape_fns.h" 10 | #include 11 | using namespace tensorflow; 12 | 13 | REGISTER_OP("QueryBallPoint") 14 | .Attr("radius: float") 15 | .Attr("nsample: int") 16 | .Input("xyz1: float32") 17 | .Input("xyz2: float32") 18 | .Output("idx: int32") 19 | .Output("pts_cnt: int32") 20 | .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) { 21 | ::tensorflow::shape_inference::ShapeHandle dims2; // batch_size * npoint * 3 22 | c->WithRank(c->input(1), 3, &dims2); 23 | int nsample; 24 | TF_RETURN_IF_ERROR(c->GetAttr("nsample", &nsample)); 25 | ::tensorflow::shape_inference::ShapeHandle output1 = c->MakeShape({c->Dim(dims2, 0), c->Dim(dims2, 1), nsample}); 26 | c->set_output(0, output1); 27 | ::tensorflow::shape_inference::ShapeHandle output2 = c->MakeShape({c->Dim(dims2, 0), c->Dim(dims2, 1)}); 28 | c->set_output(1, output2); 29 | return Status::OK(); 30 | }); 31 | REGISTER_OP("SelectionSort") 32 | .Attr("k: int") 33 | .Input("dist: float32") 34 | .Output("outi: int32") 35 | .Output("out: float32") 36 | .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) { 37 | c->set_output(0, c->input(0)); 38 | c->set_output(1, c->input(0)); 39 | return 
Status::OK(); 40 | }); 41 | REGISTER_OP("GroupPoint") 42 | .Input("points: float32") 43 | .Input("idx: int32") 44 | .Output("out: float32") 45 | .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) { 46 | ::tensorflow::shape_inference::ShapeHandle dims1; // batch_size * ndataset * channels 47 | c->WithRank(c->input(0), 3, &dims1); 48 | ::tensorflow::shape_inference::ShapeHandle dims2; // batch_size * npoints * nsample 49 | c->WithRank(c->input(1), 3, &dims2); 50 | // batch_size * npoints * nsample * channels 51 | ::tensorflow::shape_inference::ShapeHandle output = c->MakeShape({c->Dim(dims2, 0), c->Dim(dims2, 1), c->Dim(dims2, 2), c->Dim(dims1, 2)}); 52 | c->set_output(0, output); 53 | return Status::OK(); 54 | }); 55 | REGISTER_OP("GroupPointGrad") 56 | .Input("points: float32") 57 | .Input("idx: int32") 58 | .Input("grad_out: float32") 59 | .Output("grad_points: float32") 60 | .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) { 61 | c->set_output(0, c->input(0)); 62 | return Status::OK(); 63 | }); 64 | 65 | 66 | void queryBallPointLauncher(int b, int n, int m, float radius, int nsample, const float *xyz1, const float *xyz2, int *idx, int *pts_cnt); 67 | class QueryBallPointGpuOp : public OpKernel { 68 | public: 69 | explicit QueryBallPointGpuOp(OpKernelConstruction* context) : OpKernel(context) { 70 | OP_REQUIRES_OK(context, context->GetAttr("radius", &radius_)); 71 | OP_REQUIRES(context, radius_ > 0, errors::InvalidArgument("QueryBallPoint expects positive radius")); 72 | 73 | OP_REQUIRES_OK(context, context->GetAttr("nsample", &nsample_)); 74 | OP_REQUIRES(context, nsample_ > 0, errors::InvalidArgument("QueryBallPoint expects positive nsample")); 75 | } 76 | 77 | void Compute(OpKernelContext* context) override { 78 | const Tensor& xyz1_tensor = context->input(0); 79 | OP_REQUIRES(context, xyz1_tensor.dims()==3 && xyz1_tensor.shape().dim_size(2)==3, errors::InvalidArgument("QueryBallPoint expects (batch_size, ndataset, 3) xyz1 shape.")); 80 | int b = xyz1_tensor.shape().dim_size(0); 81 | int n = xyz1_tensor.shape().dim_size(1); 82 | 83 | const Tensor& xyz2_tensor = context->input(1); 84 | OP_REQUIRES(context, xyz2_tensor.dims()==3 && xyz2_tensor.shape().dim_size(2)==3, errors::InvalidArgument("QueryBallPoint expects (batch_size, npoint, 3) xyz2 shape.")); 85 | int m = xyz2_tensor.shape().dim_size(1); 86 | 87 | Tensor *idx_tensor = nullptr; 88 | OP_REQUIRES_OK(context, context->allocate_output(0, TensorShape{b,m,nsample_}, &idx_tensor)); 89 | Tensor *pts_cnt_tensor = nullptr; 90 | OP_REQUIRES_OK(context, context->allocate_output(1, TensorShape{b,m}, &pts_cnt_tensor)); 91 | 92 | auto xyz1_flat = xyz1_tensor.flat(); 93 | const float *xyz1 = &(xyz1_flat(0)); 94 | auto xyz2_flat = xyz2_tensor.flat(); 95 | const float *xyz2 = &(xyz2_flat(0)); 96 | auto idx_flat = idx_tensor->flat(); 97 | int *idx = &(idx_flat(0)); 98 | auto pts_cnt_flat = pts_cnt_tensor->flat(); 99 | int *pts_cnt = &(pts_cnt_flat(0)); 100 | queryBallPointLauncher(b,n,m,radius_,nsample_,xyz1,xyz2,idx,pts_cnt); 101 | } 102 | private: 103 | float radius_; 104 | int nsample_; 105 | }; 106 | REGISTER_KERNEL_BUILDER(Name("QueryBallPoint").Device(DEVICE_GPU), QueryBallPointGpuOp); 107 | 108 | void selectionSortLauncher(int b, int n, int m, int k, const float *dist, int *outi, float *out); 109 | class SelectionSortGpuOp : public OpKernel { 110 | public: 111 | explicit SelectionSortGpuOp(OpKernelConstruction* context) : OpKernel(context) { 112 | OP_REQUIRES_OK(context, context->GetAttr("k", &k_)); 113 | 
OP_REQUIRES(context, k_ > 0, errors::InvalidArgument("SelectionSort expects positive k")); 114 | } 115 | 116 | void Compute(OpKernelContext* context) override { 117 | const Tensor& dist_tensor = context->input(0); 118 | OP_REQUIRES(context, dist_tensor.dims()==3, errors::InvalidArgument("SelectionSort expects (b,m,n) dist shape.")); 119 | int b = dist_tensor.shape().dim_size(0); 120 | int m = dist_tensor.shape().dim_size(1); 121 | int n = dist_tensor.shape().dim_size(2); 122 | 123 | Tensor *outi_tensor = nullptr; 124 | OP_REQUIRES_OK(context, context->allocate_output(0, TensorShape{b,m,n}, &outi_tensor)); 125 | Tensor *out_tensor = nullptr; 126 | OP_REQUIRES_OK(context, context->allocate_output(1, TensorShape{b,m,n}, &out_tensor)); 127 | 128 | auto dist_flat = dist_tensor.flat(); 129 | const float *dist = &(dist_flat(0)); 130 | auto outi_flat = outi_tensor->flat(); 131 | int *outi = &(outi_flat(0)); 132 | auto out_flat = out_tensor->flat(); 133 | float *out = &(out_flat(0)); 134 | selectionSortLauncher(b,n,m,k_,dist,outi,out); 135 | } 136 | private: 137 | int k_; 138 | }; 139 | REGISTER_KERNEL_BUILDER(Name("SelectionSort").Device(DEVICE_GPU), SelectionSortGpuOp); 140 | 141 | 142 | void groupPointLauncher(int b, int n, int c, int m, int nsample, const float *points, const int *idx, float *out); 143 | class GroupPointGpuOp: public OpKernel{ 144 | public: 145 | explicit GroupPointGpuOp(OpKernelConstruction * context):OpKernel(context){} 146 | 147 | void Compute(OpKernelContext * context) override { 148 | const Tensor& points_tensor=context->input(0); 149 | OP_REQUIRES(context, points_tensor.dims()==3, errors::InvalidArgument("GroupPoint expects (batch_size, num_points, channel) points shape")); 150 | int b = points_tensor.shape().dim_size(0); 151 | int n = points_tensor.shape().dim_size(1); 152 | int c = points_tensor.shape().dim_size(2); 153 | 154 | const Tensor& idx_tensor=context->input(1); 155 | OP_REQUIRES(context,idx_tensor.dims()==3 && idx_tensor.shape().dim_size(0)==b, errors::InvalidArgument("GroupPoint expects (batch_size, npoints, nsample) idx shape")); 156 | int m = idx_tensor.shape().dim_size(1); 157 | int nsample = idx_tensor.shape().dim_size(2); 158 | 159 | Tensor * out_tensor = nullptr; 160 | OP_REQUIRES_OK(context, context->allocate_output(0,TensorShape{b,m,nsample,c}, &out_tensor)); 161 | 162 | auto points_flat = points_tensor.flat(); 163 | const float *points = &(points_flat(0)); 164 | auto idx_flat = idx_tensor.flat(); 165 | const int *idx = &(idx_flat(0)); 166 | auto out_flat = out_tensor->flat(); 167 | float *out = &(out_flat(0)); 168 | groupPointLauncher(b,n,c,m,nsample,points,idx,out); 169 | } 170 | }; 171 | REGISTER_KERNEL_BUILDER(Name("GroupPoint").Device(DEVICE_GPU),GroupPointGpuOp); 172 | 173 | void groupPointGradLauncher(int b, int n, int c, int m, int nsample, const float *grad_out, const int *idx, float *grad_points); 174 | class GroupPointGradGpuOp: public OpKernel{ 175 | public: 176 | explicit GroupPointGradGpuOp(OpKernelConstruction * context):OpKernel(context){} 177 | 178 | void Compute(OpKernelContext * context) override { 179 | const Tensor& points_tensor=context->input(0); 180 | OP_REQUIRES(context, points_tensor.dims()==3, errors::InvalidArgument("GroupPointGrad expects (batch_size, num_points, channel) points shape")); 181 | int b = points_tensor.shape().dim_size(0); 182 | int n = points_tensor.shape().dim_size(1); 183 | int c = points_tensor.shape().dim_size(2); 184 | 185 | const Tensor& idx_tensor=context->input(1); 186 | 
OP_REQUIRES(context,idx_tensor.dims()==3 && idx_tensor.shape().dim_size(0)==b, errors::InvalidArgument("GroupPointGrad expects (batch_size, npoints, nsample) idx shape")); 187 | int m = idx_tensor.shape().dim_size(1); 188 | int nsample = idx_tensor.shape().dim_size(2); 189 | 190 | const Tensor& grad_out_tensor=context->input(2); 191 | OP_REQUIRES(context,grad_out_tensor.dims()==4 && grad_out_tensor.shape().dim_size(0)==b && grad_out_tensor.shape().dim_size(1)==m && grad_out_tensor.shape().dim_size(2)==nsample && grad_out_tensor.shape().dim_size(3)==c, errors::InvalidArgument("GroupPointGrad expects (batch_size, npoints, nsample, channel) grad_out shape")); 192 | 193 | Tensor * grad_points_tensor = nullptr; 194 | OP_REQUIRES_OK(context, context->allocate_output(0,TensorShape{b,n,c}, &grad_points_tensor)); 195 | 196 | auto points_flat = points_tensor.flat(); 197 | const float *points = &(points_flat(0)); 198 | auto idx_flat = idx_tensor.flat(); 199 | const int *idx = &(idx_flat(0)); 200 | auto grad_out_flat = grad_out_tensor.flat(); 201 | const float *grad_out = &(grad_out_flat(0)); 202 | auto grad_points_flat = grad_points_tensor->flat(); 203 | float *grad_points = &(grad_points_flat(0)); 204 | cudaMemset(grad_points, 0, sizeof(float)*b*n*c); 205 | groupPointGradLauncher(b,n,c,m,nsample,grad_out,idx,grad_points); 206 | } 207 | }; 208 | REGISTER_KERNEL_BUILDER(Name("GroupPointGrad").Device(DEVICE_GPU),GroupPointGradGpuOp); 209 | 210 | 211 | -------------------------------------------------------------------------------- /tf_ops/grouping/tf_grouping.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.python.framework import ops 3 | import sys 4 | import os 5 | BASE_DIR = os.path.dirname(os.path.abspath(__file__)) 6 | sys.path.append(BASE_DIR) 7 | grouping_module=tf.load_op_library(os.path.join(BASE_DIR, 'tf_grouping_so.so')) 8 | def query_ball_point(radius, nsample, xyz1, xyz2): 9 | ''' 10 | Input: 11 | radius: float32, ball search radius 12 | nsample: int32, number of points selected in each ball region 13 | xyz1: (batch_size, ndataset, 3) float32 array, input points 14 | xyz2: (batch_size, npoint, 3) float32 array, query points 15 | Output: 16 | idx: (batch_size, npoint, nsample) int32 array, indices to input points 17 | pts_cnt: (batch_size, npoint) int32 array, number of unique points in each local region 18 | ''' 19 | #return grouping_module.query_ball_point(radius, nsample, xyz1, xyz2) 20 | return grouping_module.query_ball_point(xyz1, xyz2, radius, nsample) 21 | ops.NoGradient('QueryBallPoint') 22 | def select_top_k(k, dist): 23 | ''' 24 | Input: 25 | k: int32, number of k SMALLEST elements selected 26 | dist: (b,m,n) float32 array, distance matrix, m query points, n dataset points 27 | Output: 28 | idx: (b,m,n) int32 array, first k in n are indices to the top k 29 | dist_out: (b,m,n) float32 array, first k in n are the top k 30 | ''' 31 | return grouping_module.selection_sort(dist, k) 32 | ops.NoGradient('SelectionSort') 33 | def group_point(points, idx): 34 | ''' 35 | Input: 36 | points: (batch_size, ndataset, channel) float32 array, points to sample from 37 | idx: (batch_size, npoint, nsample) int32 array, indices to points 38 | Output: 39 | out: (batch_size, npoint, nsample, channel) float32 array, values sampled from points 40 | ''' 41 | return grouping_module.group_point(points, idx) 42 | @tf.RegisterGradient('GroupPoint') 43 | def _group_point_grad(op, grad_out): 44 | points = op.inputs[0] 45 | idx 
= op.inputs[1] 46 | return [grouping_module.group_point_grad(points, idx, grad_out), None] 47 | 48 | def knn_point(k, xyz1, xyz2): 49 | ''' 50 | Input: 51 | k: int32, number of k in k-nn search 52 | xyz1: (batch_size, ndataset, c) float32 array, input points 53 | xyz2: (batch_size, npoint, c) float32 array, query points 54 | Output: 55 | val: (batch_size, npoint, k) float32 array, L2 distances 56 | idx: (batch_size, npoint, k) int32 array, indices to input points 57 | ''' 58 | b = xyz1.get_shape()[0].value 59 | n = xyz1.get_shape()[1].value 60 | c = xyz1.get_shape()[2].value 61 | m = xyz2.get_shape()[1].value 62 | print(b, n, c, m) 63 | print(xyz1, (b,1,n,c)) 64 | xyz1 = tf.tile(tf.reshape(xyz1, (b,1,n,c)), [1,m,1,1]) 65 | xyz2 = tf.tile(tf.reshape(xyz2, (b,m,1,c)), [1,1,n,1]) 66 | dist = tf.reduce_sum((xyz1-xyz2)**2, -1) 67 | print(dist, k) 68 | outi, out = select_top_k(k, dist) 69 | idx = tf.slice(outi, [0,0,0], [-1,-1,k]) 70 | val = tf.slice(out, [0,0,0], [-1,-1,k]) 71 | print(idx, val) 72 | #val, idx = tf.nn.top_k(-dist, k=k) # ONLY SUPPORT CPU 73 | return val, idx 74 | 75 | if __name__=='__main__': 76 | knn=True 77 | import numpy as np 78 | import time 79 | np.random.seed(100) 80 | pts = np.random.random((32,512,64)).astype('float32') 81 | tmp1 = np.random.random((32,512,3)).astype('float32') 82 | tmp2 = np.random.random((32,128,3)).astype('float32') 83 | with tf.device('/gpu:1'): 84 | points = tf.constant(pts) 85 | xyz1 = tf.constant(tmp1) 86 | xyz2 = tf.constant(tmp2) 87 | radius = 0.1 88 | nsample = 64 89 | if knn: 90 | _, idx = knn_point(nsample, xyz1, xyz2) 91 | grouped_points = group_point(points, idx) 92 | else: 93 | idx, _ = query_ball_point(radius, nsample, xyz1, xyz2) 94 | grouped_points = group_point(points, idx) 95 | #grouped_points_grad = tf.ones_like(grouped_points) 96 | #points_grad = tf.gradients(grouped_points, points, grouped_points_grad) 97 | with tf.Session('') as sess: 98 | now = time.time() 99 | for _ in range(100): 100 | ret = sess.run(grouped_points) 101 | print(time.time() - now) 102 | print(ret.shape, ret.dtype) 103 | print(ret) 104 | 105 | 106 | -------------------------------------------------------------------------------- /tf_ops/grouping/tf_grouping_compile.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())') 3 | TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())') 4 | nvcc tf_grouping_g.cu -o tf_grouping_g.cu.o -c -O2 -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC 5 | g++ -std=c++11 tf_grouping.cpp tf_grouping_g.cu.o -o tf_grouping_so.so -shared -fPIC -I $TF_INC -I /usr/local/cuda-9.0/include -lcudart -L /usr/local/cuda-9.0/lib64/ -O2 -D_GLIBCXX_USE_CXX11_ABI=0 -I $TF_INC/external/nsync/public -L $TF_LIB -ltensorflow_framework 6 | -------------------------------------------------------------------------------- /tf_ops/grouping/tf_grouping_g.cu: -------------------------------------------------------------------------------- 1 | // input: radius (1), nsample (1), xyz1 (b,n,3), xyz2 (b,m,3) 2 | // output: idx (b,m,nsample), pts_cnt (b,m) 3 | __global__ void query_ball_point_gpu(int b, int n, int m, float radius, int nsample, const float *xyz1, const float *xyz2, int *idx, int *pts_cnt) { 4 | int batch_index = blockIdx.x; 5 | xyz1 += n*3*batch_index; 6 | xyz2 += m*3*batch_index; 7 | idx += m*nsample*batch_index; 8 | pts_cnt += m*batch_index; // counting how many unique points selected in local region 9 | 10 |
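// one thread block per batch element (blockIdx.x offsets above); within a block, threads stride over the m query points via threadIdx.x/blockDim.x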
int index = threadIdx.x; 11 | int stride = blockDim.x; 12 | 13 | for (int j=index;j>>(b,n,m,radius,nsample,xyz1,xyz2,idx,pts_cnt); 127 | //cudaDeviceSynchronize(); 128 | } 129 | void selectionSortLauncher(int b, int n, int m, int k, const float *dist, int *outi, float *out) { 130 | selection_sort_gpu<<<b,256>>>(b,n,m,k,dist,outi,out); 131 | //cudaDeviceSynchronize(); 132 | } 133 | void groupPointLauncher(int b, int n, int c, int m, int nsample, const float *points, const int *idx, float *out){ 134 | group_point_gpu<<<b,256>>>(b,n,c,m,nsample,points,idx,out); 135 | //cudaDeviceSynchronize(); 136 | } 137 | void groupPointGradLauncher(int b, int n, int c, int m, int nsample, const float *grad_out, const int *idx, float *grad_points){ 138 | group_point_grad_gpu<<<b,256>>>(b,n,c,m,nsample,grad_out,idx,grad_points); 139 | //group_point_grad_gpu<<<1,1>>>(b,n,c,m,nsample,grad_out,idx,grad_points); 140 | //cudaDeviceSynchronize(); 141 | } 142 | -------------------------------------------------------------------------------- /tf_ops/grouping/tf_grouping_op_test.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | import numpy as np 3 | from tf_grouping import query_ball_point, group_point 4 | 5 | class GroupPointTest(tf.test.TestCase): 6 | def test(self): 7 | pass 8 | 9 | def test_grad(self): 10 | with tf.device('/gpu:0'): 11 | points = tf.constant(np.random.random((1,128,16)).astype('float32')) 12 | print(points) 13 | xyz1 = tf.constant(np.random.random((1,128,3)).astype('float32')) 14 | xyz2 = tf.constant(np.random.random((1,8,3)).astype('float32')) 15 | radius = 0.3 16 | nsample = 32 17 | idx, pts_cnt = query_ball_point(radius, nsample, xyz1, xyz2) 18 | grouped_points = group_point(points, idx) 19 | print(grouped_points) 20 | 21 | with self.test_session(): 22 | print("---- Going to compute gradient error") 23 | err = tf.test.compute_gradient_error(points, (1,128,16), grouped_points, (1,8,32,16)) 24 | print(err) 25 | self.assertLess(err, 1e-4) 26 | 27 | if __name__=='__main__': 28 | tf.test.main() 29 | -------------------------------------------------------------------------------- /tf_ops/sampling/.gitignore: -------------------------------------------------------------------------------- 1 | *.o 2 | *.so 3 | -------------------------------------------------------------------------------- /tf_ops/sampling/tf_sampling.cpp: -------------------------------------------------------------------------------- 1 | /* Furthest point sampling 2 | * Original author: Haoqiang Fan 3 | * Modified by Charles R. Qi 4 | * All Rights Reserved. 2017.
5 | */ 6 | #include "tensorflow/core/framework/op.h" 7 | #include "tensorflow/core/framework/op_kernel.h" 8 | #include "tensorflow/core/framework/shape_inference.h" 9 | #include "tensorflow/core/framework/common_shape_fns.h" 10 | #include 11 | 12 | using namespace tensorflow; 13 | 14 | REGISTER_OP("ProbSample") 15 | .Input("inp: float32") 16 | .Input("inpr: float32") 17 | .Output("out: int32") 18 | .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) { 19 | ::tensorflow::shape_inference::ShapeHandle dims1; // batch_size * ncategory 20 | c->WithRank(c->input(0), 2, &dims1); 21 | ::tensorflow::shape_inference::ShapeHandle dims2; // batch_size * npoints 22 | c->WithRank(c->input(1), 2, &dims2); 23 | // batch_size * npoints 24 | ::tensorflow::shape_inference::ShapeHandle output = c->MakeShape({c->Dim(dims2, 0), c->Dim(dims2, 1)}); 25 | c->set_output(0, output); 26 | return Status::OK(); 27 | }); 28 | REGISTER_OP("FarthestPointSample") 29 | .Attr("npoint: int") 30 | .Input("inp: float32") 31 | .Output("out: int32") 32 | .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) { 33 | ::tensorflow::shape_inference::ShapeHandle dims1; // batch_size * npoint * 3 34 | c->WithRank(c->input(0), 3, &dims1); 35 | int npoint; 36 | TF_RETURN_IF_ERROR(c->GetAttr("npoint", &npoint)); 37 | ::tensorflow::shape_inference::ShapeHandle output = c->MakeShape({c->Dim(dims1, 0), npoint}); 38 | c->set_output(0, output); 39 | return Status::OK(); 40 | }); 41 | REGISTER_OP("GatherPoint") 42 | .Input("inp: float32") 43 | .Input("idx: int32") 44 | .Output("out: float32") 45 | .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) { 46 | ::tensorflow::shape_inference::ShapeHandle dims1; // batch_size * ndataset * 3 47 | c->WithRank(c->input(0), 3, &dims1); 48 | ::tensorflow::shape_inference::ShapeHandle dims2; // batch_size * npoints 49 | c->WithRank(c->input(1), 2, &dims2); 50 | // batch_size * npoints * 3 51 | ::tensorflow::shape_inference::ShapeHandle output = c->MakeShape({c->Dim(dims1, 0), c->Dim(dims2, 1), c->Dim(dims1, 2)}); 52 | c->set_output(0, output); 53 | return Status::OK(); 54 | }); 55 | REGISTER_OP("GatherPointGrad") 56 | .Input("inp: float32") 57 | .Input("idx: int32") 58 | .Input("out_g: float32") 59 | .Output("inp_g: float32") 60 | .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) { 61 | c->set_output(0, c->input(0)); 62 | return Status::OK(); 63 | }); 64 | 65 | void probsampleLauncher(int b,int n,int m,const float * inp_p,const float * inp_r,float * temp,int * out); 66 | class ProbSampleGpuOp: public OpKernel{ 67 | public: 68 | explicit ProbSampleGpuOp(OpKernelConstruction* context):OpKernel(context){} 69 | void Compute(OpKernelContext * context)override{ 70 | const Tensor& inp_tensor=context->input(0); 71 | const Tensor& inpr_tensor=context->input(1); 72 | auto inp_flat=inp_tensor.flat(); 73 | auto inpr_flat=inpr_tensor.flat(); 74 | const float * inp=&(inp_flat(0)); 75 | const float * inpr=&(inpr_flat(0)); 76 | OP_REQUIRES(context,inp_tensor.dims()==2,errors::InvalidArgument("ProbSample expects (batch_size,num_choices) inp shape")); 77 | int b=inp_tensor.shape().dim_size(0); 78 | int n=inp_tensor.shape().dim_size(1); 79 | OP_REQUIRES(context,inpr_tensor.dims()==2 && inpr_tensor.shape().dim_size(0)==b,errors::InvalidArgument("ProbSample expects (batch_size,num_points) inpr shape")); 80 | int m=inpr_tensor.shape().dim_size(1); 81 | Tensor * out_tensor=NULL; 82 | OP_REQUIRES_OK(context,context->allocate_output(0,TensorShape{b,m},&out_tensor)); 83 | 
auto out_flat=out_tensor->flat(); 84 | int * out=&(out_flat(0)); 85 | Tensor temp_tensor; 86 | OP_REQUIRES_OK(context,context->allocate_temp(DataTypeToEnum::value,TensorShape{b,n},&temp_tensor)); 87 | auto temp_flat=temp_tensor.flat(); 88 | float * temp=&(temp_flat(0)); 89 | probsampleLauncher(b,n,m,inp,inpr,temp,out); 90 | } 91 | }; 92 | REGISTER_KERNEL_BUILDER(Name("ProbSample").Device(DEVICE_GPU), ProbSampleGpuOp); 93 | 94 | void farthestpointsamplingLauncher(int b,int n,int m,const float * inp,float * temp,int * out); 95 | class FarthestPointSampleGpuOp: public OpKernel{ 96 | public: 97 | explicit FarthestPointSampleGpuOp(OpKernelConstruction* context):OpKernel(context) { 98 | OP_REQUIRES_OK(context, context->GetAttr("npoint", &npoint_)); 99 | OP_REQUIRES(context, npoint_ > 0, errors::InvalidArgument("FarthestPointSample expects positive npoint")); 100 | } 101 | void Compute(OpKernelContext * context)override{ 102 | int m = npoint_; 103 | 104 | const Tensor& inp_tensor=context->input(0); 105 | OP_REQUIRES(context,inp_tensor.dims()==3 && inp_tensor.shape().dim_size(2)==3,errors::InvalidArgument("FarthestPointSample expects (batch_size,num_points,3) inp shape")); 106 | int b=inp_tensor.shape().dim_size(0); 107 | int n=inp_tensor.shape().dim_size(1); 108 | auto inp_flat=inp_tensor.flat(); 109 | const float * inp=&(inp_flat(0)); 110 | Tensor * out_tensor; 111 | OP_REQUIRES_OK(context,context->allocate_output(0,TensorShape{b,m},&out_tensor)); 112 | auto out_flat=out_tensor->flat(); 113 | int * out=&(out_flat(0)); 114 | Tensor temp_tensor; 115 | OP_REQUIRES_OK(context,context->allocate_temp(DataTypeToEnum::value,TensorShape{32,n},&temp_tensor)); 116 | auto temp_flat=temp_tensor.flat(); 117 | float * temp=&(temp_flat(0)); 118 | farthestpointsamplingLauncher(b,n,m,inp,temp,out); 119 | } 120 | private: 121 | int npoint_; 122 | }; 123 | REGISTER_KERNEL_BUILDER(Name("FarthestPointSample").Device(DEVICE_GPU),FarthestPointSampleGpuOp); 124 | 125 | void gatherpointLauncher(int b,int n,int m,const float * inp,const int * idx,float * out); 126 | class GatherPointGpuOp: public OpKernel{ 127 | public: 128 | explicit GatherPointGpuOp(OpKernelConstruction * context):OpKernel(context){} 129 | void Compute(OpKernelContext * context)override{ 130 | const Tensor& inp_tensor=context->input(0); 131 | OP_REQUIRES(context,inp_tensor.dims()==3 && inp_tensor.shape().dim_size(2)==3,errors::InvalidArgument("GatherPoint expects (batch_size,num_points,3) inp shape")); 132 | int b=inp_tensor.shape().dim_size(0); 133 | int n=inp_tensor.shape().dim_size(1); 134 | const Tensor& idx_tensor=context->input(1); 135 | OP_REQUIRES(context,idx_tensor.dims()==2 && idx_tensor.shape().dim_size(0)==b,errors::InvalidArgument("GatherPoint expects (batch_size,num_result) idx shape")); 136 | int m=idx_tensor.shape().dim_size(1); 137 | auto inp_flat=inp_tensor.flat(); 138 | const float * inp=&(inp_flat(0)); 139 | auto idx_flat=idx_tensor.flat(); 140 | const int * idx=&(idx_flat(0)); 141 | Tensor * out_tensor=NULL; 142 | OP_REQUIRES_OK(context,context->allocate_output(0,TensorShape{b,m,3},&out_tensor)); 143 | auto out_flat=out_tensor->flat(); 144 | float * out=&(out_flat(0)); 145 | gatherpointLauncher(b,n,m,inp,idx,out); 146 | } 147 | }; 148 | REGISTER_KERNEL_BUILDER(Name("GatherPoint").Device(DEVICE_GPU),GatherPointGpuOp); 149 | 150 | void scatteraddpointLauncher(int b,int n,int m,const float * out_g,const int * idx,float * inp_g); 151 | class GatherPointGradGpuOp: public OpKernel{ 152 | public: 153 | explicit 
GatherPointGradGpuOp(OpKernelConstruction * context):OpKernel(context){} 154 | void Compute(OpKernelContext * context)override{ 155 | const Tensor& inp_tensor=context->input(0); 156 | OP_REQUIRES(context,inp_tensor.dims()==3 && inp_tensor.shape().dim_size(2)==3,errors::InvalidArgument("GatherPointGradGpuOp expects (batch_size,num_points,3) inp")); 157 | int b=inp_tensor.shape().dim_size(0); 158 | int n=inp_tensor.shape().dim_size(1); 159 | const Tensor& idx_tensor=context->input(1); 160 | OP_REQUIRES(context,idx_tensor.dims()==2 && idx_tensor.shape().dim_size(0)==b,errors::InvalidArgument("GatherPointGradGpuOp expects (batch_size,num_result) idx shape")); 161 | int m=idx_tensor.shape().dim_size(1); 162 | auto inp_flat=inp_tensor.flat(); 163 | const float * inp=&(inp_flat(0)); 164 | auto idx_flat=idx_tensor.flat(); 165 | const int * idx=&(idx_flat(0)); 166 | const Tensor& out_g_tensor=context->input(2); 167 | OP_REQUIRES(context,out_g_tensor.dims()==3 && out_g_tensor.shape().dim_size(0)==b && out_g_tensor.shape().dim_size(1)==m && out_g_tensor.shape().dim_size(2)==3,errors::InvalidArgument("GatherPointGradGpuOp expects (batch_size,num_result,3) out_g shape")); 168 | auto out_g_flat=out_g_tensor.flat(); 169 | const float * out_g=&(out_g_flat(0)); 170 | Tensor * inp_g_tensor=NULL; 171 | OP_REQUIRES_OK(context,context->allocate_output(0,TensorShape{b,n,3},&inp_g_tensor)); 172 | auto inp_g_flat=inp_g_tensor->flat(); 173 | float * inp_g=&(inp_g_flat(0)); 174 | cudaMemset(inp_g,0,b*n*3*4); 175 | scatteraddpointLauncher(b,n,m,out_g,idx,inp_g); 176 | } 177 | }; 178 | REGISTER_KERNEL_BUILDER(Name("GatherPointGrad").Device(DEVICE_GPU),GatherPointGradGpuOp); 179 | 180 | -------------------------------------------------------------------------------- /tf_ops/sampling/tf_sampling.py: -------------------------------------------------------------------------------- 1 | ''' Furthest point sampling 2 | Original author: Haoqiang Fan 3 | Modified by Charles R. Qi 4 | All Rights Reserved. 2017. 
5 | ''' 6 | import tensorflow as tf 7 | from tensorflow.python.framework import ops 8 | import sys 9 | import os 10 | BASE_DIR = os.path.dirname(os.path.abspath(__file__)) 11 | sys.path.append(BASE_DIR) 12 | sampling_module=tf.load_op_library(os.path.join(BASE_DIR, 'tf_sampling_so.so')) 13 | def prob_sample(inp,inpr): 14 | ''' 15 | input: 16 | batch_size * ncategory float32 17 | batch_size * npoints float32 18 | returns: 19 | batch_size * npoints int32 20 | ''' 21 | return sampling_module.prob_sample(inp,inpr) 22 | ops.NoGradient('ProbSample') 23 | # TF1.0 API requires set shape in C++ 24 | #@tf.RegisterShape('ProbSample') 25 | #def _prob_sample_shape(op): 26 | # shape1=op.inputs[0].get_shape().with_rank(2) 27 | # shape2=op.inputs[1].get_shape().with_rank(2) 28 | # return [tf.TensorShape([shape2.dims[0],shape2.dims[1]])] 29 | def gather_point(inp,idx): 30 | ''' 31 | input: 32 | batch_size * ndataset * 3 float32 33 | batch_size * npoints int32 34 | returns: 35 | batch_size * npoints * 3 float32 36 | ''' 37 | return sampling_module.gather_point(inp,idx) 38 | #@tf.RegisterShape('GatherPoint') 39 | #def _gather_point_shape(op): 40 | # shape1=op.inputs[0].get_shape().with_rank(3) 41 | # shape2=op.inputs[1].get_shape().with_rank(2) 42 | # return [tf.TensorShape([shape1.dims[0],shape2.dims[1],shape1.dims[2]])] 43 | @tf.RegisterGradient('GatherPoint') 44 | def _gather_point_grad(op,out_g): 45 | inp=op.inputs[0] 46 | idx=op.inputs[1] 47 | return [sampling_module.gather_point_grad(inp,idx,out_g),None] 48 | def farthest_point_sample(npoint,inp): 49 | ''' 50 | input: 51 | int32 52 | batch_size * ndataset * 3 float32 53 | returns: 54 | batch_size * npoint int32 55 | ''' 56 | return sampling_module.farthest_point_sample(inp, npoint) 57 | ops.NoGradient('FarthestPointSample') 58 | 59 | 60 | if __name__=='__main__': 61 | import numpy as np 62 | np.random.seed(100) 63 | triangles=np.random.rand(1,5,3,3).astype('float32') 64 | with tf.device('/gpu:1'): 65 | inp=tf.constant(triangles) 66 | tria=inp[:,:,0,:] 67 | trib=inp[:,:,1,:] 68 | tric=inp[:,:,2,:] 69 | areas=tf.sqrt(tf.reduce_sum(tf.cross(trib-tria,tric-tria)**2,2)+1e-9) 70 | randomnumbers=tf.random_uniform((1,8192)) 71 | triids=prob_sample(areas,randomnumbers) 72 | tria_sample=gather_point(tria,triids) 73 | trib_sample=gather_point(trib,triids) 74 | tric_sample=gather_point(tric,triids) 75 | us=tf.random_uniform((1,8192)) 76 | vs=tf.random_uniform((1,8192)) 77 | uplusv=1-tf.abs(us+vs-1) 78 | uminusv=us-vs 79 | us=(uplusv+uminusv)*0.5 80 | vs=(uplusv-uminusv)*0.5 81 | pt_sample=tria_sample+(trib_sample-tria_sample)*tf.expand_dims(us,-1)+(tric_sample-tria_sample)*tf.expand_dims(vs,-1) 82 | print('pt_sample: ', pt_sample) 83 | reduced_sample=gather_point(pt_sample,farthest_point_sample(1024,pt_sample)) 84 | print(reduced_sample) 85 | with tf.Session('') as sess: 86 | ret=sess.run(reduced_sample) 87 | print(ret.shape,ret.dtype) 88 | import pickle 89 | pickle.dump(ret,open('1.pkl','wb'),-1) 90 | -------------------------------------------------------------------------------- /tf_ops/sampling/tf_sampling_compile.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())') 3 | TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())') 4 | /usr/local/cuda-9.0/bin/nvcc tf_sampling_g.cu -o tf_sampling_g.cu.o -c -O2 -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC 5 | g++ -std=c++11 tf_sampling.cpp tf_sampling_g.cu.o -o
tf_sampling_so.so -shared -fPIC -I$TF_INC -I/usr/local/cuda-9.0/include -lcudart -L/usr/local/cuda-9.0/lib64/ -O2 -D_GLIBCXX_USE_CXX11_ABI=0 -I$TF_INC/external/nsync/public -L$TF_LIB -ltensorflow_framework 6 | -------------------------------------------------------------------------------- /tf_ops/sampling/tf_sampling_g.cu: -------------------------------------------------------------------------------- 1 | /* Furthest point sampling GPU implementation 2 | * Original author: Haoqiang Fan 3 | * Modified by Charles R. Qi 4 | * All Rights Reserved. 2017. 5 | */ 6 | 7 | __global__ void cumsumKernel(int b,int n,const float * __restrict__ inp,float * __restrict__ out){ 8 | const int BlockSize=2048; 9 | const int paddingLevel=5; 10 | __shared__ float buffer4[BlockSize*4]; 11 | __shared__ float buffer[BlockSize+(BlockSize>>paddingLevel)]; 12 | for (int i=blockIdx.x;i>2; 18 | for (int k=threadIdx.x*4;k>2)+(k>>(2+paddingLevel))]=v4; 33 | }else{ 34 | float v=0; 35 | for (int k2=k;k2>2)+(k>>(2+paddingLevel))]=v; 43 | } 44 | } 45 | int u=0; 46 | for (;(2<>(u+1));k+=blockDim.x){ 49 | int i1=(((k<<1)+2)<>paddingLevel; 52 | i2+=i2>>paddingLevel; 53 | buffer[i1]+=buffer[i2]; 54 | } 55 | } 56 | u--; 57 | for (;u>=0;u--){ 58 | __syncthreads(); 59 | for (int k=threadIdx.x;k>(u+1));k+=blockDim.x){ 60 | int i1=(((k<<1)+3)<>paddingLevel; 63 | i2+=i2>>paddingLevel; 64 | buffer[i1]+=buffer[i2]; 65 | } 66 | } 67 | __syncthreads(); 68 | for (int k=threadIdx.x*4;k>2)-1)+(((k>>2)-1)>>paddingLevel); 71 | buffer4[k]+=buffer[k2]; 72 | buffer4[k+1]+=buffer[k2]; 73 | buffer4[k+2]+=buffer[k2]; 74 | buffer4[k+3]+=buffer[k2]; 75 | } 76 | } 77 | __syncthreads(); 78 | for (int k=threadIdx.x;k>paddingLevel)]+runningsum2; 82 | float r2=runningsum+t; 83 | runningsum2=t-(r2-runningsum); 84 | runningsum=r2; 85 | __syncthreads(); 86 | } 87 | } 88 | } 89 | 90 | __global__ void binarysearchKernel(int b,int n,int m,const float * __restrict__ dataset,const float * __restrict__ query, int * __restrict__ result){ 91 | int base=1; 92 | while (base=1;k>>=1) 99 | if (r>=k && dataset[i*n+r-k]>=q) 100 | r-=k; 101 | result[i*m+j]=r; 102 | } 103 | } 104 | } 105 | __global__ void farthestpointsamplingKernel(int b,int n,int m,const float * __restrict__ dataset,float * __restrict__ temp,int * __restrict__ idxs){ 106 | if (m<=0) 107 | return; 108 | const int BlockSize=512; 109 | __shared__ float dists[BlockSize]; 110 | __shared__ int dists_i[BlockSize]; 111 | const int BufferSize=3072; 112 | __shared__ float buf[BufferSize*3]; 113 | for (int i=blockIdx.x;ibest){ 147 | best=d2; 148 | besti=k; 149 | } 150 | } 151 | dists[threadIdx.x]=best; 152 | dists_i[threadIdx.x]=besti; 153 | for (int u=0;(1<>(u+1))){ 156 | int i1=(threadIdx.x*2)<>>(b,n,inp,out); 196 | } 197 | //require b*n working space 198 | void probsampleLauncher(int b,int n,int m,const float * inp_p,const float * inp_r,float * temp,int * out){ 199 | cumsumKernel<<<32,512>>>(b,n,inp_p,temp); 200 | binarysearchKernel<<>>(b,n,m,temp,inp_r,out); 201 | } 202 | //require 32*n working space 203 | void farthestpointsamplingLauncher(int b,int n,int m,const float * inp,float * temp,int * out){ 204 | farthestpointsamplingKernel<<<32,512>>>(b,n,m,inp,temp,out); 205 | } 206 | void gatherpointLauncher(int b,int n,int m,const float * inp,const int * idx,float * out){ 207 | gatherpointKernel<<>>(b,n,m,inp,idx,out); 208 | } 209 | void scatteraddpointLauncher(int b,int n,int m,const float * out_g,const int * idx,float * inp_g){ 210 | scatteraddpointKernel<<>>(b,n,m,out_g,idx,inp_g); 211 | } 212 | 213 | 
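The farthestpointsamplingKernel above computes greedy farthest point sampling: it repeatedly adds the point whose distance to the already-selected set is largest, keeping the running per-point distances in the 32*n-float temp buffer instead of recomputing them. The following NumPy sketch of the same greedy scheme is ours, for illustration only; it assumes (as in the original PointNet++ code) that sampling is seeded from the first point, and it omits the batching and shared-memory reduction of the CUDA version:

    import numpy as np

    def farthest_point_sample_ref(xyz, m):
        # xyz: (n, 3) point array; returns indices of m greedily chosen samples
        n = xyz.shape[0]
        idxs = np.zeros(m, dtype=np.int32)   # idxs[0] = 0: seed from the first point
        dist = np.full(n, np.inf)            # squared distance to the selected set
        last = 0
        for j in range(1, m):
            d = np.sum((xyz - xyz[last]) ** 2, axis=1)
            dist = np.minimum(dist, d)       # distance to nearest selected point
            last = int(np.argmax(dist))      # farthest remaining point wins
            idxs[j] = last
        return idxs

Each of the m-1 rounds costs O(n), so m samples cost O(nm); the kernel parallelizes the distance update and the argmax over the 512 threads of a block, with 32 blocks striding over the batch (the <<<32,512>>> launch above).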
-------------------------------------------------------------------------------- /utils/pointnet_util.py: -------------------------------------------------------------------------------- 1 | """ PointNet++ Layers 2 | 3 | Author: Charles R. Qi 4 | Date: November 2017 5 | """ 6 | 7 | import os 8 | import sys 9 | BASE_DIR = os.path.dirname(os.path.abspath(__file__)) 10 | ROOT_DIR = os.path.dirname(BASE_DIR) 11 | sys.path.append(os.path.join(ROOT_DIR, 'utils')) 12 | sys.path.append(os.path.join(ROOT_DIR, 'tf_ops/sampling')) 13 | sys.path.append(os.path.join(ROOT_DIR, 'tf_ops/grouping')) 14 | sys.path.append(os.path.join(ROOT_DIR, 'tf_ops/3d_interpolation')) 15 | from tf_sampling import farthest_point_sample, gather_point 16 | from tf_grouping import query_ball_point, group_point, knn_point 17 | from tf_interpolate import three_nn, three_interpolate 18 | import tensorflow as tf 19 | import numpy as np 20 | import tf_util 21 | 22 | def sample_and_group(npoint, radius, nsample, xyz, points, tnet_spec=None, knn=False, use_xyz=True): 23 | ''' 24 | Input: 25 | npoint: int32 26 | radius: float32 27 | nsample: int32 28 | xyz: (batch_size, ndataset, 3) TF tensor 29 | points: (batch_size, ndataset, channel) TF tensor, if None will just use xyz as points 30 | tnet_spec: dict (keys: mlp, mlp2, is_training, bn_decay), if None do not apply tnet 31 | knn: bool, if True use kNN instead of radius search 32 | use_xyz: bool, if True concat XYZ with local point features, otherwise just use point features 33 | Output: 34 | new_xyz: (batch_size, npoint, 3) TF tensor 35 | new_points: (batch_size, npoint, nsample, 3+channel) TF tensor 36 | idx: (batch_size, npoint, nsample) TF tensor, indices of local points as in ndataset points 37 | grouped_xyz: (batch_size, npoint, nsample, 3) TF tensor, normalized point XYZs 38 | (subtracted by seed point XYZ) in local regions 39 | ''' 40 | 41 | new_xyz = gather_point(xyz, farthest_point_sample(npoint, xyz)) # (batch_size, npoint, 3) 42 | if knn: 43 | _,idx = knn_point(nsample, xyz, new_xyz) 44 | else: 45 | idx, pts_cnt = query_ball_point(radius, nsample, xyz, new_xyz) 46 | grouped_xyz = group_point(xyz, idx) # (batch_size, npoint, nsample, 3) 47 | grouped_xyz -= tf.tile(tf.expand_dims(new_xyz, 2), [1,1,nsample,1]) # translation normalization 48 | if tnet_spec is not None: 49 | grouped_xyz = tnet(grouped_xyz, tnet_spec) 50 | if points is not None: 51 | grouped_points = group_point(points, idx) # (batch_size, npoint, nsample, channel) 52 | #grouped_points = tf.Print(grouped_points, [grouped_points[0, :, :, :]], 'grouped points: ', 10, 2000) 53 | if use_xyz: 54 | new_points = tf.concat([grouped_xyz, grouped_points], axis=-1) # (batch_size, npoint, nample, 3+channel) 55 | else: 56 | new_points = grouped_points 57 | else: 58 | new_points = grouped_xyz 59 | 60 | return new_xyz, new_points, idx, grouped_xyz 61 | 62 | 63 | def sample_and_group_all(xyz, points, use_xyz=True): 64 | ''' 65 | Inputs: 66 | xyz: (batch_size, ndataset, 3) TF tensor 67 | points: (batch_size, ndataset, channel) TF tensor, if None will just use xyz as points 68 | use_xyz: bool, if True concat XYZ with local point features, otherwise just use point features 69 | Outputs: 70 | new_xyz: (batch_size, 1, 3) as (0,0,0) 71 | new_points: (batch_size, 1, ndataset, 3+channel) TF tensor 72 | Note: 73 | Equivalent to sample_and_group with npoint=1, radius=inf, use (0,0,0) as the centroid 74 | ''' 75 | batch_size = xyz.get_shape()[0].value 76 | nsample = xyz.get_shape()[1].value 77 | new_xyz = 
tf.constant(np.tile(np.array([0,0,0]).reshape((1,1,3)), (batch_size,1,1)),dtype=tf.float32) # (batch_size, 1, 3) 78 | idx = tf.constant(np.tile(np.array(range(nsample)).reshape((1,1,nsample)), (batch_size,1,1))) 79 | grouped_xyz = tf.reshape(xyz, (batch_size, 1, nsample, 3)) # (batch_size, npoint=1, nsample, 3) 80 | if points is not None: 81 | if use_xyz: 82 | new_points = tf.concat([xyz, points], axis=2) # (batch_size, 16, 259) 83 | else: 84 | new_points = points 85 | new_points = tf.expand_dims(new_points, 1) # (batch_size, 1, 16, 259) 86 | else: 87 | new_points = grouped_xyz 88 | return new_xyz, new_points, idx, grouped_xyz 89 | 90 | 91 | def pointnet_sa_module(xyz, points, npoint, radius, nsample, mlp, mlp2, group_all, is_training, bn_decay, scope, bn=True, pooling='max', tnet_spec=None, knn=False, use_xyz=True): 92 | ''' PointNet Set Abstraction (SA) Module 93 | Input: 94 | xyz: (batch_size, ndataset, 3) TF tensor 95 | points: (batch_size, ndataset, channel) TF tensor 96 | npoint: int32 -- #points sampled in farthest point sampling 97 | radius: float32 -- search radius in local region 98 | nsample: int32 -- how many points in each local region 99 | mlp: list of int32 -- output size for MLP on each point 100 | mlp2: list of int32 -- output size for MLP on each region 101 | group_all: bool -- group all points into one PC if set true, OVERRIDE 102 | npoint, radius and nsample settings 103 | use_xyz: bool, if True concat XYZ with local point features, otherwise just use point features 104 | Return: 105 | new_xyz: (batch_size, npoint, 3) TF tensor 106 | new_points: (batch_size, npoint, mlp[-1] or mlp2[-1]) TF tensor 107 | idx: (batch_size, npoint, nsample) int32 -- indices for local regions 108 | ''' 109 | with tf.variable_scope(scope) as sc: 110 | if group_all: 111 | nsample = xyz.get_shape()[1].value 112 | new_xyz, new_points, idx, grouped_xyz = sample_and_group_all(xyz, points, use_xyz) 113 | else: 114 | new_xyz, new_points, idx, grouped_xyz = sample_and_group(npoint, radius, nsample, xyz, points, tnet_spec, knn, use_xyz) 115 | for i, num_out_channel in enumerate(mlp): 116 | #new_points = tf.Print(new_points, [new_points], 'new points:', 1, 18) 117 | new_points = tf_util.conv2d(new_points, num_out_channel, kernel_size=[1,1], 118 | padding='VALID', stride=[1,1], 119 | bn=bn, is_training=is_training, 120 | scope='conv%d'%(i), bn_decay=bn_decay) 121 | # shp = new_points.shape 122 | # new_points = tf.reshape(new_points, [new_points.shape[0], -1]) 123 | # new_points = tf_util.fully_connected(new_points, num_out_channel, 'fc-mlp_%d' % i, 124 | # use_xavier=True, 125 | # bn=bn, is_training=is_training, 126 | # bn_decay=bn_decay) 127 | # new_points = tf.reshape(new_points, [shp[0], shp[1], shp[2], num_out_channel]) 128 | if pooling=='avg': 129 | new_points = tf_util.avg_pool2d(new_points, [1,nsample], stride=[1,1], padding='VALID', scope='avgpool1') 130 | elif pooling=='weighted_avg': 131 | with tf.variable_scope('weighted_avg1'): 132 | dists = tf.norm(grouped_xyz,axis=-1,ord=2,keepdims=True) 133 | exp_dists = tf.exp(-dists * 5) 134 | weights = exp_dists/tf.reduce_sum(exp_dists,axis=2,keepdims=True) # (batch_size, npoint, nsample, 1) 135 | new_points *= weights # (batch_size, npoint, nsample, mlp[-1]) 136 | new_points = tf.reduce_sum(new_points, axis=2, keepdims=True) 137 | elif pooling=='max': 138 | new_points = tf.reduce_max(new_points, axis=[2], keepdims=True) 139 | elif pooling=='min': 140 | new_points = tf_util.max_pool2d(-1*new_points, [1,nsample], stride=[1,1], padding='VALID', 
scope='minpool1') 141 | elif pooling=='max_and_avg': 142 | max_points = tf_util.max_pool2d(new_points, [1,nsample], stride=[1,1], padding='VALID', scope='maxpool1') 143 | avg_points = tf_util.avg_pool2d(new_points, [1,nsample], stride=[1,1], padding='VALID', scope='avgpool1') 144 | new_points = tf.concat([max_points, avg_points], axis=-1) 145 | 146 | if mlp2 is None: mlp2 = [] 147 | for i, num_out_channel in enumerate(mlp2): 148 | new_points = tf_util.conv2d(new_points, num_out_channel, [1,1], 149 | padding='VALID', stride=[1,1], 150 | bn=bn, is_training=is_training, 151 | scope='conv_post_%d'%(i), bn_decay=bn_decay) 152 | new_points = tf.squeeze(new_points, [2]) # (batch_size, npoints, mlp2[-1]) 153 | return new_xyz, new_points, idx 154 | 155 | def pointnet_sa_module_msg(xyz, points, npoint, radius_list, nsample_list, mlp_list, is_training, bn_decay, scope, bn=True, use_xyz=True): 156 | ''' PointNet Set Abstraction (SA) module with Multi-Scale Grouping (MSG) 157 | Input: 158 | xyz: (batch_size, ndataset, 3) TF tensor 159 | points: (batch_size, ndataset, channel) TF tensor 160 | npoint: int32 -- #points sampled in farthest point sampling 161 | radius: list of float32 -- search radius in local region 162 | nsample: list of int32 -- how many points in each local region 163 | mlp: list of list of int32 -- output size for MLP on each point 164 | use_xyz: bool, if True concat XYZ with local point features, otherwise just use point features 165 | Return: 166 | new_xyz: (batch_size, npoint, 3) TF tensor 167 | new_points: (batch_size, npoint, \sum_k{mlp[k][-1]}) TF tensor 168 | ''' 169 | with tf.variable_scope(scope) as sc: 170 | new_xyz = gather_point(xyz, farthest_point_sample(npoint, xyz)) 171 | new_points_list = [] 172 | for i in range(len(radius_list)): 173 | radius = radius_list[i] 174 | nsample = nsample_list[i] 175 | idx, pts_cnt = query_ball_point(radius, nsample, xyz, new_xyz) 176 | grouped_xyz = group_point(xyz, idx) 177 | grouped_xyz -= tf.tile(tf.expand_dims(new_xyz, 2), [1,1,nsample,1]) 178 | if points is not None: 179 | grouped_points = group_point(points, idx) 180 | if use_xyz: 181 | grouped_points = tf.concat([grouped_points, grouped_xyz], axis=-1) 182 | else: 183 | grouped_points = grouped_xyz 184 | for j,num_out_channel in enumerate(mlp_list[i]): 185 | grouped_points = tf_util.conv2d(grouped_points, num_out_channel, [1,1], 186 | padding='VALID', stride=[1,1], bn=bn, is_training=is_training, 187 | scope='conv%d_%d'%(i,j), bn_decay=bn_decay) 188 | new_points = tf.reduce_max(grouped_points, axis=[2]) 189 | new_points_list.append(new_points) 190 | new_points_concat = tf.concat(new_points_list, axis=-1) 191 | return new_xyz, new_points_concat 192 | 193 | 194 | def pointnet_fp_module(xyz1, xyz2, points1, points2, mlp, is_training, bn_decay, scope, bn=True): 195 | ''' PointNet Feature Propagation (FP) Module 196 | Input: 197 | xyz1: (batch_size, ndataset1, 3) TF tensor 198 | xyz2: (batch_size, ndataset2, 3) TF tensor, sparser than xyz1 199 | points1: (batch_size, ndataset1, nchannel1) TF tensor 200 | points2: (batch_size, ndataset2, nchannel2) TF tensor 201 | mlp: list of int32 -- output size for MLP on each point 202 | Return: 203 | new_points: (batch_size, ndataset1, mlp[-1]) TF tensor 204 | ''' 205 | with tf.variable_scope(scope) as sc, tf.device('/cpu:0'): 206 | dist, idx = three_nn(xyz1, xyz2) 207 | dist = tf.maximum(dist, 1e-10) 208 | norm = tf.reduce_sum((1.0/dist),axis=2,keepdims=True) 209 | norm = tf.tile(norm,[1,1,3]) 210 | weight = (1.0/dist) / norm 211 | interpolated_points
= three_interpolate(points2, idx, weight) 212 | 213 | if points1 is not None: 214 | new_points1 = tf.concat(axis=2, values=[interpolated_points, points1]) # B,ndataset1,nchannel1+nchannel2 215 | else: 216 | new_points1 = interpolated_points 217 | new_points1 = tf.expand_dims(new_points1, 2) 218 | for i, num_out_channel in enumerate(mlp): 219 | with tf.device('/gpu:0'): 220 | new_points1 = tf_util.conv2d(new_points1, num_out_channel, [1,1], 221 | padding='VALID', stride=[1,1], 222 | bn=bn, is_training=is_training, 223 | scope='conv_%d'%(i), bn_decay=bn_decay) 224 | new_points1 = tf.squeeze(new_points1, [2]) # B,ndataset1,mlp[-1] 225 | return new_points1 226 | --------------------------------------------------------------------------------
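Taken together, pointnet_sa_module and pointnet_fp_module are the building blocks the alsNet architectures in alsNet/archs/ are assembled from: SA levels subsample the cloud by farthest point sampling and abstract local neighbourhoods, while FP levels interpolate features back to the denser levels using the inverse-distance weights computed above, weight_i = (1/d_i) / sum_j (1/d_j) over the three nearest coarse points. A minimal sketch of such an encoder/decoder for per-point classification follows; it is ours, for illustration only, and the layer counts, radii and MLP sizes are made up rather than taken from any archs/archN.py:

    import tensorflow as tf
    from pointnet_util import pointnet_sa_module, pointnet_fp_module

    def toy_seg_net(xyz, is_training, num_classes, bn_decay=None):
        # encoder: two set abstraction levels, each subsampling via FPS
        l1_xyz, l1_points, _ = pointnet_sa_module(xyz, None, npoint=512, radius=1.0,
                                                  nsample=32, mlp=[64, 128], mlp2=None,
                                                  group_all=False, is_training=is_training,
                                                  bn_decay=bn_decay, scope='sa1')
        l2_xyz, l2_points, _ = pointnet_sa_module(l1_xyz, l1_points, npoint=128, radius=2.0,
                                                  nsample=32, mlp=[128, 256], mlp2=None,
                                                  group_all=False, is_training=is_training,
                                                  bn_decay=bn_decay, scope='sa2')
        # decoder: propagate features back up to the full-resolution points
        l1_points = pointnet_fp_module(l1_xyz, l2_xyz, l1_points, l2_points, [128],
                                       is_training, bn_decay, scope='fp1')
        l0_points = pointnet_fp_module(xyz, l1_xyz, None, l1_points, [128],
                                       is_training, bn_decay, scope='fp2')
        return tf.layers.dense(l0_points, num_classes)  # per-point class logits

Here xyz is a (batch_size, ndataset, 3) tensor; the returned logits can be fed into a per-point softmax cross-entropy loss.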