├── .gitignore ├── LICENSE ├── README.md ├── pics ├── semantic-proj.gif └── semantic-ptcl.gif ├── requirements.txt └── train ├── README.md ├── __init__.py ├── auxiliary └── __init__.py ├── backbones ├── __init__.py ├── darknet.py ├── squeezeseg.py └── squeezesegV2.py ├── common ├── __init__.py ├── avgmeter.py ├── laserscan.py ├── laserscanvis.py ├── logger.py ├── onehot.py ├── sync_batchnorm │ ├── __init__.py │ ├── batchnorm.py │ ├── comm.py │ └── replicate.py └── warmupLR.py ├── requirements.txt └── tasks ├── __init__.py ├── panoptic ├── README.md ├── __init__.py └── readme.md └── semantic ├── README.md ├── __init__.py ├── config ├── arch │ ├── darknet21.yaml │ ├── darknet53-1024px.yaml │ ├── darknet53-512px.yaml │ ├── darknet53-crf-1024px.yaml │ ├── darknet53-crf-512px.yaml │ ├── darknet53-crf.yaml │ ├── darknet53.yaml │ ├── squeezeseg.yaml │ ├── squeezesegV2.yaml │ ├── squeezesegV2_crf.yaml │ └── squeezeseg_crf.yaml └── labels │ ├── semantic-kitti-all.yaml │ └── semantic-kitti.yaml ├── dataset └── kitti │ ├── __init__.py │ └── parser.py ├── decoders ├── __init__.py ├── darknet.py ├── squeezeseg.py └── squeezesegV2.py ├── evaluate_biou.py ├── evaluate_iou.py ├── infer.py ├── modules ├── __init__.py ├── ioueval.py ├── segmentator.py ├── trainer.py └── user.py ├── postproc ├── CRF.py ├── KNN.py ├── __init__.py └── borderMask.py ├── readme.md ├── train.py └── visualize.py /.gitignore: -------------------------------------------------------------------------------- 1 | *.pyc 2 | *.pyo 3 | *.o 4 | *.so 5 | *.a 6 | **build 7 | .vscode 8 | .catkin_tools/ 9 | devel/ 10 | logs/ 11 | external.hpp -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License 2 | 3 | Copyright (c) 2019 Andres Milioto, Jens Behley, Cyrill Stachniss, Photogrammetry and Robotics Lab, University of Bonn. 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in 13 | all copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 21 | THE SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | > [!IMPORTANT] 2 | > **Repository Archived** 3 | > This repository has been archived and is no longer actively maintained. You are welcome to explore the code, but please note that no further updates, issues, or pull requests will be accepted. 4 | > 5 | > Thank you for your interest and contributions. 
6 | 7 | # LiDAR-Bonnetal 8 | 9 | Semantic Segmentation of point clouds using range images. 10 | 11 | Developed by [Andres Milioto](http://www.ipb.uni-bonn.de/people/andres-milioto/), [Jens Behley](http://www.ipb.uni-bonn.de/people/jens-behley/), [Ignacio Vizzo](http://www.ipb.uni-bonn.de/people/ignacio-vizzo/), and [Cyrill Stachniss](http://www.ipb.uni-bonn.de/people/cyrill-stachniss/) 12 | 13 | _Examples of segmentation results from the [SemanticKITTI](http://semantic-kitti.org) dataset:_ 14 | ![ptcl](pics/semantic-ptcl.gif) 15 | ![proj](pics/semantic-proj.gif) 16 | 17 | ## Description 18 | 19 | This repository provides code to train and deploy semantic segmentation networks for LiDAR scans, using range images as an intermediate representation. The training pipeline can be found in [/train](train/). We will open-source the deployment pipeline soon. 20 | 21 | ## Pre-trained Models 22 | 23 | ### [SemanticKITTI](http://semantic-kitti.org) 24 | 25 | - [squeezeseg](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/models/squeezeseg.tar.gz) 26 | - [squeezeseg + crf](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/models/squeezeseg-crf.tar.gz) 27 | - [squeezesegV2](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/models/squeezesegV2.tar.gz) 28 | - [squeezesegV2 + crf](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/models/squeezesegV2-crf.tar.gz) 29 | - [darknet21](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/models/darknet21.tar.gz) 30 | - [darknet53](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/models/darknet53.tar.gz) 31 | - [darknet53-1024](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/models/darknet53-1024.tar.gz) 32 | - [darknet53-512](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/models/darknet53-512.tar.gz) 33 | 34 | To enable kNN post-processing, set the corresponding boolean parameter to `True` in the `arch_cfg.yaml` file inside the model directory (see the short sketch below). 35 | 36 | ## Predictions from Models 37 | 38 | ### [SemanticKITTI](http://semantic-kitti.org) 39 | 40 | These are the predictions for the train, validation, and test sets. Performance can be evaluated on the training and validation sets; for the test set, a submission to the benchmark is required, since the test labels are not public. 
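The lists below are split by whether kNN post-processing was enabled. For reference, the flag mentioned above can also be flipped programmatically; a minimal sketch using PyYAML (already in `requirements.txt`), assuming the flag lives under a `post: KNN: use` key — verify the exact key path against the `arch_cfg.yaml` shipped with the model you downloaded:

```python
# Hypothetical helper: enable kNN post-processing for a downloaded model.
# The "post" -> "KNN" -> "use" key path is an assumption, not confirmed above.
import yaml

cfg_path = "darknet53/arch_cfg.yaml"  # inside the extracted model directory
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)
cfg["post"]["KNN"]["use"] = True
with open(cfg_path, "w") as f:
    yaml.dump(cfg, f, default_flow_style=False)
```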
41 | 42 | No post-processing: 43 | - [squeezeseg](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/squeezeseg.tar.gz) 44 | - [squeezeseg + crf](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/squeezeseg-crf.tar.gz) 45 | - [squeezesegV2](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/squeezesegV2.tar.gz) 46 | - [squeezesegV2 + crf](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/squeezesegV2-crf.tar.gz) 47 | - [darknet21](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/darknet21.tar.gz) 48 | - [darknet53](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/darknet53.tar.gz) 49 | - [darknet53-1024](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/darknet53-1024.tar.gz) 50 | - [darknet53-512](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/darknet53-512.tar.gz) 51 | 52 | With k-NN processing: 53 | - [squeezeseg](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/squeezeseg-knn.tar.gz) 54 | - [squeezesegV2](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/squeezesegV2-knn.tar.gz) 55 | - [darknet53](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/darknet53-knn.tar.gz) 56 | - [darknet21](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/darknet21-knn.tar.gz) 57 | - [darknet53-1024](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/darknet53-1024-knn.tar.gz) 58 | - [darknet53-512](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/darknet53-512-knn.tar.gz) 59 | 60 | ## License 61 | 62 | ### LiDAR-Bonnetal: MIT 63 | 64 | Copyright 2019, Andres Milioto, Jens Behley, Cyrill Stachniss. University of Bonn. 65 | 66 | Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: 67 | 68 | The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. 69 | 70 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 71 | 72 | ### Pretrained models: Model and Dataset Dependent 73 | 74 | The pretrained models with a specific dataset maintain the copyright of such dataset. 75 | 76 | ## Citations 77 | 78 | If you use our framework, model, or predictions for any academic work, please cite the original [paper](http://www.ipb.uni-bonn.de/wp-content/papercite-data/pdf/milioto2019iros.pdf), and the [dataset](http://semantic-kitti.org). 79 | 80 | ``` 81 | @inproceedings{milioto2019iros, 82 | author = {A. Milioto and I. Vizzo and J. Behley and C. 
Stachniss}, 83 | title = {{RangeNet++: Fast and Accurate LiDAR Semantic Segmentation}}, 84 | booktitle = {IEEE/RSJ Intl.~Conf.~on Intelligent Robots and Systems (IROS)}, 85 | year = 2019, 86 | codeurl = {https://github.com/PRBonn/lidar-bonnetal}, 87 | videourl = {https://youtu.be/wuokg7MFZyU}, 88 | } 89 | ``` 90 | 91 | ``` 92 | @inproceedings{behley2019iccv, 93 | author = {J. Behley and M. Garbade and A. Milioto and J. Quenzel and S. Behnke and C. Stachniss and J. Gall}, 94 | title = {{SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences}}, 95 | booktitle = {Proc. of the IEEE/CVF International Conf.~on Computer Vision (ICCV)}, 96 | year = {2019} 97 | } 98 | ``` 99 | -------------------------------------------------------------------------------- /pics/semantic-proj.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PRBonn/lidar-bonnetal/99b827f0228ff0e997473ac8e2cecbaa4af7d7c7/pics/semantic-proj.gif -------------------------------------------------------------------------------- /pics/semantic-ptcl.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PRBonn/lidar-bonnetal/99b827f0228ff0e997473ac8e2cecbaa4af7d7c7/pics/semantic-ptcl.gif -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | # first apt install python3-tk 2 | numpy==1.14.0 3 | torchvision==0.2.2.post3 4 | matplotlib==2.2.3 5 | tensorflow==1.13.1 6 | scipy==0.19.1 7 | torch==1.1.0 8 | vispy==0.5.3 9 | opencv_python==4.1.0.25 10 | opencv_contrib_python==4.1.0.25 11 | Pillow==6.1.0 12 | PyYAML==5.1.1 13 | -------------------------------------------------------------------------------- /train/README.md: -------------------------------------------------------------------------------- 1 | # LiDAR-Bonnetal Training 2 | 3 | This part of the framework deals with the training of segmentation networks for point cloud data using range images. 4 | 5 | ## Tasks 6 | 7 | - [Semantic Segmentation](tasks/semantic). 8 | - [Panoptic Segmentation](tasks/panoptic) \[Soon\]. 9 | 10 | ## Dependencies 11 | 12 | First you need to install the nvidia driver and CUDA, so have fun! 
13 | 14 | - CUDA Installation guide: [link](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html) 15 | 16 | - System dependencies: 17 | 18 | ```sh 19 | $ sudo apt-get update 20 | $ sudo apt-get install -yqq build-essential ninja-build \ 21 | python3-dev python3-pip apt-utils curl git cmake unzip autoconf autogen \ 22 | libtool mlocate zlib1g-dev python3-numpy python3-wheel wget \ 23 | software-properties-common openjdk-8-jdk libpng-dev \ 24 | libxft-dev ffmpeg python3-pyqt5.qtopengl 25 | $ sudo updatedb 26 | ``` 27 | 28 | - Python dependencies: 29 | 30 | ```sh 31 | $ sudo pip3 install -r requirements.txt 32 | ``` -------------------------------------------------------------------------------- /train/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PRBonn/lidar-bonnetal/99b827f0228ff0e997473ac8e2cecbaa4af7d7c7/train/__init__.py -------------------------------------------------------------------------------- /train/auxiliary/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PRBonn/lidar-bonnetal/99b827f0228ff0e997473ac8e2cecbaa4af7d7c7/train/auxiliary/__init__.py -------------------------------------------------------------------------------- /train/backbones/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PRBonn/lidar-bonnetal/99b827f0228ff0e997473ac8e2cecbaa4af7d7c7/train/backbones/__init__.py -------------------------------------------------------------------------------- /train/backbones/darknet.py: -------------------------------------------------------------------------------- 1 | # This file was modified from https://github.com/BobLiu20/YOLOv3_PyTorch 2 | # It was modified in order to accommodate the different strides used in the encoder. 3 | 4 | import torch.nn as nn 5 | from collections import OrderedDict 6 | import torch.nn.functional as F 7 | 8 | 9 | class BasicBlock(nn.Module): 10 | def __init__(self, inplanes, planes, bn_d=0.1): 11 | super(BasicBlock, self).__init__() 12 | self.conv1 = nn.Conv2d(inplanes, planes[0], kernel_size=1, 13 | stride=1, padding=0, bias=False) 14 | self.bn1 = nn.BatchNorm2d(planes[0], momentum=bn_d) 15 | self.relu1 = nn.LeakyReLU(0.1) 16 | self.conv2 = nn.Conv2d(planes[0], planes[1], kernel_size=3, 17 | stride=1, padding=1, bias=False) 18 | self.bn2 = nn.BatchNorm2d(planes[1], momentum=bn_d) 19 | self.relu2 = nn.LeakyReLU(0.1) 20 | 21 | def forward(self, x): 22 | residual = x 23 | 24 | out = self.conv1(x) 25 | out = self.bn1(out) 26 | out = self.relu1(out) 27 | 28 | out = self.conv2(out) 29 | out = self.bn2(out) 30 | out = self.relu2(out) 31 | 32 | out += residual 33 | return out 34 | 35 | 36 | # ****************************************************************************** 37 | 38 | # number of layers per model 39 | model_blocks = { 40 | 21: [1, 1, 2, 2, 1], 41 | 53: [1, 2, 8, 8, 4], 42 | } 43 | 44 | 45 | class Backbone(nn.Module): 46 | """ 47 | Class for DarknetSeg. 
Subclasses PyTorch's own "nn" module 48 | """ 49 | 50 | def __init__(self, params): 51 | super(Backbone, self).__init__() 52 | self.use_range = params["input_depth"]["range"] 53 | self.use_xyz = params["input_depth"]["xyz"] 54 | self.use_remission = params["input_depth"]["remission"] 55 | self.drop_prob = params["dropout"] 56 | self.bn_d = params["bn_d"] 57 | self.OS = params["OS"] 58 | self.layers = params["extra"]["layers"] 59 | print("Using DarknetNet" + str(self.layers) + " Backbone") 60 | 61 | # input depth calc 62 | self.input_depth = 0 63 | self.input_idxs = [] 64 | if self.use_range: 65 | self.input_depth += 1 66 | self.input_idxs.append(0) 67 | if self.use_xyz: 68 | self.input_depth += 3 69 | self.input_idxs.extend([1, 2, 3]) 70 | if self.use_remission: 71 | self.input_depth += 1 72 | self.input_idxs.append(4) 73 | print("Depth of backbone input = ", self.input_depth) 74 | 75 | # stride play 76 | self.strides = [2, 2, 2, 2, 2] 77 | # check current stride 78 | current_os = 1 79 | for s in self.strides: 80 | current_os *= s 81 | print("Original OS: ", current_os) 82 | 83 | # make the new stride 84 | if self.OS > current_os: 85 | print("Can't do OS, ", self.OS, 86 | " because it is bigger than original ", current_os) 87 | else: 88 | # redo strides according to needed stride 89 | for i, stride in enumerate(reversed(self.strides), 0): 90 | if int(current_os) != self.OS: 91 | if stride == 2: 92 | current_os /= 2 93 | self.strides[-1 - i] = 1 94 | if int(current_os) == self.OS: 95 | break 96 | print("New OS: ", int(current_os)) 97 | print("Strides: ", self.strides) 98 | 99 | # check that darknet exists 100 | assert self.layers in model_blocks.keys() 101 | 102 | # generate layers depending on darknet type 103 | self.blocks = model_blocks[self.layers] 104 | 105 | # input layer 106 | self.conv1 = nn.Conv2d(self.input_depth, 32, kernel_size=3, 107 | stride=1, padding=1, bias=False) 108 | self.bn1 = nn.BatchNorm2d(32, momentum=self.bn_d) 109 | self.relu1 = nn.LeakyReLU(0.1) 110 | 111 | # encoder 112 | self.enc1 = self._make_enc_layer(BasicBlock, [32, 64], self.blocks[0], 113 | stride=self.strides[0], bn_d=self.bn_d) 114 | self.enc2 = self._make_enc_layer(BasicBlock, [64, 128], self.blocks[1], 115 | stride=self.strides[1], bn_d=self.bn_d) 116 | self.enc3 = self._make_enc_layer(BasicBlock, [128, 256], self.blocks[2], 117 | stride=self.strides[2], bn_d=self.bn_d) 118 | self.enc4 = self._make_enc_layer(BasicBlock, [256, 512], self.blocks[3], 119 | stride=self.strides[3], bn_d=self.bn_d) 120 | self.enc5 = self._make_enc_layer(BasicBlock, [512, 1024], self.blocks[4], 121 | stride=self.strides[4], bn_d=self.bn_d) 122 | 123 | # for a bit of fun 124 | self.dropout = nn.Dropout2d(self.drop_prob) 125 | 126 | # last channels 127 | self.last_channels = 1024 128 | 129 | # make layer useful function 130 | def _make_enc_layer(self, block, planes, blocks, stride, bn_d=0.1): 131 | layers = [] 132 | 133 | # downsample 134 | layers.append(("conv", nn.Conv2d(planes[0], planes[1], 135 | kernel_size=3, 136 | stride=[1, stride], dilation=1, 137 | padding=1, bias=False))) 138 | layers.append(("bn", nn.BatchNorm2d(planes[1], momentum=bn_d))) 139 | layers.append(("relu", nn.LeakyReLU(0.1))) 140 | 141 | # blocks 142 | inplanes = planes[1] 143 | for i in range(0, blocks): 144 | layers.append(("residual_{}".format(i), 145 | block(inplanes, planes, bn_d))) 146 | 147 | return nn.Sequential(OrderedDict(layers)) 148 | 149 | def run_layer(self, x, layer, skips, os): 150 | y = layer(x) 151 | if y.shape[2] < x.shape[2] or 
y.shape[3] < x.shape[3]: 152 | skips[os] = x.detach() 153 | os *= 2 154 | x = y 155 | return x, skips, os 156 | 157 | def forward(self, x): 158 | # filter input 159 | x = x[:, self.input_idxs] 160 | 161 | # run cnn 162 | # store for skip connections 163 | skips = {} 164 | os = 1 165 | 166 | # first layer 167 | x, skips, os = self.run_layer(x, self.conv1, skips, os) 168 | x, skips, os = self.run_layer(x, self.bn1, skips, os) 169 | x, skips, os = self.run_layer(x, self.relu1, skips, os) 170 | 171 | # all encoder blocks with intermediate dropouts 172 | x, skips, os = self.run_layer(x, self.enc1, skips, os) 173 | x, skips, os = self.run_layer(x, self.dropout, skips, os) 174 | x, skips, os = self.run_layer(x, self.enc2, skips, os) 175 | x, skips, os = self.run_layer(x, self.dropout, skips, os) 176 | x, skips, os = self.run_layer(x, self.enc3, skips, os) 177 | x, skips, os = self.run_layer(x, self.dropout, skips, os) 178 | x, skips, os = self.run_layer(x, self.enc4, skips, os) 179 | x, skips, os = self.run_layer(x, self.dropout, skips, os) 180 | x, skips, os = self.run_layer(x, self.enc5, skips, os) 181 | x, skips, os = self.run_layer(x, self.dropout, skips, os) 182 | 183 | return x, skips 184 | 185 | def get_last_depth(self): 186 | return self.last_channels 187 | 188 | def get_input_depth(self): 189 | return self.input_depth 190 | -------------------------------------------------------------------------------- /train/backbones/squeezeseg.py: -------------------------------------------------------------------------------- 1 | # Adapted from https://github.com/BichenWuUCB/SqueezeSeg 2 | from __future__ import print_function 3 | import torch 4 | import torch.nn as nn 5 | import torch.nn.functional as F 6 | 7 | 8 | class Fire(nn.Module): 9 | 10 | def __init__(self, inplanes, squeeze_planes, 11 | expand1x1_planes, expand3x3_planes): 12 | super(Fire, self).__init__() 13 | self.inplanes = inplanes 14 | self.activation = nn.ReLU(inplace=True) 15 | self.squeeze = nn.Conv2d(inplanes, squeeze_planes, kernel_size=1) 16 | self.expand1x1 = nn.Conv2d(squeeze_planes, expand1x1_planes, 17 | kernel_size=1) 18 | self.expand3x3 = nn.Conv2d(squeeze_planes, expand3x3_planes, 19 | kernel_size=3, padding=1) 20 | 21 | def forward(self, x): 22 | x = self.activation(self.squeeze(x)) 23 | return torch.cat([ 24 | self.activation(self.expand1x1(x)), 25 | self.activation(self.expand3x3(x)) 26 | ], 1) 27 | 28 | 29 | # ****************************************************************************** 30 | 31 | class Backbone(nn.Module): 32 | """ 33 | Class for Squeezeseg. 
Subclasses PyTorch's own "nn" module 34 | """ 35 | 36 | def __init__(self, params): 37 | # Call the super constructor 38 | super(Backbone, self).__init__() 39 | print("Using SqueezeNet Backbone") 40 | self.use_range = params["input_depth"]["range"] 41 | self.use_xyz = params["input_depth"]["xyz"] 42 | self.use_remission = params["input_depth"]["remission"] 43 | self.drop_prob = params["dropout"] 44 | self.OS = params["OS"] 45 | 46 | # input depth calc 47 | self.input_depth = 0 48 | self.input_idxs = [] 49 | if self.use_range: 50 | self.input_depth += 1 51 | self.input_idxs.append(0) 52 | if self.use_xyz: 53 | self.input_depth += 3 54 | self.input_idxs.extend([1, 2, 3]) 55 | if self.use_remission: 56 | self.input_depth += 1 57 | self.input_idxs.append(4) 58 | print("Depth of backbone input = ", self.input_depth) 59 | 60 | # stride play 61 | self.strides = [2, 2, 2, 2] 62 | # check current stride 63 | current_os = 1 64 | for s in self.strides: 65 | current_os *= s 66 | print("Original OS: ", current_os) 67 | 68 | # make the new stride 69 | if self.OS > current_os: 70 | print("Can't do OS, ", self.OS, 71 | " because it is bigger than original ", current_os) 72 | else: 73 | # redo strides according to needed stride 74 | for i, stride in enumerate(reversed(self.strides), 0): 75 | if int(current_os) != self.OS: 76 | if stride == 2: 77 | current_os /= 2 78 | self.strides[-1 - i] = 1 79 | if int(current_os) == self.OS: 80 | break 81 | print("New OS: ", int(current_os)) 82 | print("Strides: ", self.strides) 83 | 84 | # encoder 85 | self.conv1a = nn.Sequential(nn.Conv2d(self.input_depth, 64, kernel_size=3, 86 | stride=[1, self.strides[0]], 87 | padding=1), 88 | nn.ReLU(inplace=True)) 89 | self.conv1b = nn.Conv2d(self.input_depth, 64, kernel_size=1, 90 | stride=1, padding=0) 91 | self.fire23 = nn.Sequential(nn.MaxPool2d(kernel_size=3, 92 | stride=[1, self.strides[1]], 93 | padding=1), 94 | Fire(64, 16, 64, 64), 95 | Fire(128, 16, 64, 64)) 96 | self.fire45 = nn.Sequential(nn.MaxPool2d(kernel_size=3, 97 | stride=[1, self.strides[2]], 98 | padding=1), 99 | Fire(128, 32, 128, 128), 100 | Fire(256, 32, 128, 128)) 101 | self.fire6789 = nn.Sequential(nn.MaxPool2d(kernel_size=3, 102 | stride=[1, self.strides[3]], 103 | padding=1), 104 | Fire(256, 48, 192, 192), 105 | Fire(384, 48, 192, 192), 106 | Fire(384, 64, 256, 256), 107 | Fire(512, 64, 256, 256)) 108 | 109 | # output 110 | self.dropout = nn.Dropout2d(self.drop_prob) 111 | 112 | # last channels 113 | self.last_channels = 512 114 | 115 | def run_layer(self, x, layer, skips, os): 116 | y = layer(x) 117 | if y.shape[2] < x.shape[2] or y.shape[3] < x.shape[3]: 118 | skips[os] = x.detach() 119 | os *= 2 120 | x = y 121 | return x, skips, os 122 | 123 | def forward(self, x): 124 | # filter input 125 | x = x[:, self.input_idxs] 126 | 127 | # run cnn 128 | # store for skip connections 129 | skips = {} 130 | os = 1 131 | 132 | # encoder 133 | skip_in = self.conv1b(x) 134 | x = self.conv1a(x) 135 | # first skip done manually 136 | skips[1] = skip_in.detach() 137 | os *= 2 138 | 139 | x, skips, os = self.run_layer(x, self.fire23, skips, os) 140 | x, skips, os = self.run_layer(x, self.dropout, skips, os) 141 | x, skips, os = self.run_layer(x, self.fire45, skips, os) 142 | x, skips, os = self.run_layer(x, self.dropout, skips, os) 143 | x, skips, os = self.run_layer(x, self.fire6789, skips, os) 144 | x, skips, os = self.run_layer(x, self.dropout, skips, os) 145 | 146 | return x, skips 147 | 148 | def get_last_depth(self): 149 | return self.last_channels 150 | 151 | 
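# Note: the params dict consumed by __init__ above is expected to provide the
# keys read there (illustrative values, not taken from a shipped config):
#   {"input_depth": {"range": True, "xyz": True, "remission": True},
#    "dropout": 0.01,
#    "OS": 16}
# i.e. a 5-channel [range, x, y, z, remission] input with output stride 16.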
def get_input_depth(self): 152 | return self.input_depth 153 | -------------------------------------------------------------------------------- /train/backbones/squeezesegV2.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # This file is covered by the LICENSE file in the root of this project. 3 | 4 | from __future__ import print_function 5 | import torch 6 | import torch.nn as nn 7 | import torch.nn.functional as F 8 | 9 | 10 | class Fire(nn.Module): 11 | def __init__(self, inplanes, squeeze_planes, 12 | expand1x1_planes, expand3x3_planes, bn_d=0.1): 13 | super(Fire, self).__init__() 14 | self.inplanes = inplanes 15 | self.bn_d = bn_d 16 | self.activation = nn.ReLU(inplace=True) 17 | self.squeeze = nn.Conv2d(inplanes, squeeze_planes, kernel_size=1) 18 | self.squeeze_bn = nn.BatchNorm2d(squeeze_planes, momentum=self.bn_d) 19 | self.expand1x1 = nn.Conv2d(squeeze_planes, expand1x1_planes, 20 | kernel_size=1) 21 | self.expand1x1_bn = nn.BatchNorm2d(expand1x1_planes, momentum=self.bn_d) 22 | self.expand3x3 = nn.Conv2d(squeeze_planes, expand3x3_planes, 23 | kernel_size=3, padding=1) 24 | self.expand3x3_bn = nn.BatchNorm2d(expand3x3_planes, momentum=self.bn_d) 25 | 26 | def forward(self, x): 27 | x = self.activation(self.squeeze_bn(self.squeeze(x))) 28 | return torch.cat([ 29 | self.activation(self.expand1x1_bn(self.expand1x1(x))), 30 | self.activation(self.expand3x3_bn(self.expand3x3(x))) 31 | ], 1) 32 | 33 | 34 | class CAM(nn.Module): 35 | 36 | def __init__(self, inplanes, bn_d=0.1): 37 | super(CAM, self).__init__() 38 | self.inplanes = inplanes 39 | self.bn_d = bn_d 40 | self.pool = nn.MaxPool2d(7, 1, 3) 41 | self.squeeze = nn.Conv2d(inplanes, inplanes // 16, 42 | kernel_size=1, stride=1) 43 | self.squeeze_bn = nn.BatchNorm2d(inplanes // 16, momentum=self.bn_d) 44 | self.relu = nn.ReLU(inplace=True) 45 | self.unsqueeze = nn.Conv2d(inplanes // 16, inplanes, 46 | kernel_size=1, stride=1) 47 | self.unsqueeze_bn = nn.BatchNorm2d(inplanes, momentum=self.bn_d) 48 | self.sigmoid = nn.Sigmoid() 49 | 50 | def forward(self, x): 51 | # 7x7 pooling 52 | y = self.pool(x) 53 | # squeezing and relu 54 | y = self.relu(self.squeeze_bn(self.squeeze(y))) 55 | # unsqueezing 56 | y = self.sigmoid(self.unsqueeze_bn(self.unsqueeze(y))) 57 | # attention 58 | return y * x 59 | 60 | # ****************************************************************************** 61 | 62 | 63 | class Backbone(nn.Module): 64 | """ 65 | Class for Squeezeseg. 
Subclasses PyTorch's own "nn" module 66 | """ 67 | 68 | def __init__(self, params): 69 | # Call the super constructor 70 | super(Backbone, self).__init__() 71 | print("Using SqueezeNet Backbone") 72 | self.use_range = params["input_depth"]["range"] 73 | self.use_xyz = params["input_depth"]["xyz"] 74 | self.use_remission = params["input_depth"]["remission"] 75 | self.bn_d = params["bn_d"] 76 | self.drop_prob = params["dropout"] 77 | self.OS = params["OS"] 78 | 79 | # input depth calc 80 | self.input_depth = 0 81 | self.input_idxs = [] 82 | if self.use_range: 83 | self.input_depth += 1 84 | self.input_idxs.append(0) 85 | if self.use_xyz: 86 | self.input_depth += 3 87 | self.input_idxs.extend([1, 2, 3]) 88 | if self.use_remission: 89 | self.input_depth += 1 90 | self.input_idxs.append(4) 91 | print("Depth of backbone input = ", self.input_depth) 92 | 93 | # stride play 94 | self.strides = [2, 2, 2, 2] 95 | # check current stride 96 | current_os = 1 97 | for s in self.strides: 98 | current_os *= s 99 | print("Original OS: ", current_os) 100 | 101 | # make the new stride 102 | if self.OS > current_os: 103 | print("Can't do OS, ", self.OS, 104 | " because it is bigger than original ", current_os) 105 | else: 106 | # redo strides according to needed stride 107 | for i, stride in enumerate(reversed(self.strides), 0): 108 | if int(current_os) != self.OS: 109 | if stride == 2: 110 | current_os /= 2 111 | self.strides[-1 - i] = 1 112 | if int(current_os) == self.OS: 113 | break 114 | print("New OS: ", int(current_os)) 115 | print("Strides: ", self.strides) 116 | 117 | # encoder 118 | self.conv1a = nn.Sequential(nn.Conv2d(self.input_depth, 64, kernel_size=3, 119 | stride=[1, self.strides[0]], 120 | padding=1), 121 | nn.BatchNorm2d(64, momentum=self.bn_d), 122 | nn.ReLU(inplace=True), 123 | CAM(64, bn_d=self.bn_d)) 124 | self.conv1b = nn.Sequential(nn.Conv2d(self.input_depth, 64, kernel_size=1, 125 | stride=1, padding=0), 126 | nn.BatchNorm2d(64, momentum=self.bn_d)) 127 | self.fire23 = nn.Sequential(nn.MaxPool2d(kernel_size=3, 128 | stride=[1, self.strides[1]], 129 | padding=1), 130 | Fire(64, 16, 64, 64, bn_d=self.bn_d), 131 | CAM(128, bn_d=self.bn_d), 132 | Fire(128, 16, 64, 64, bn_d=self.bn_d), 133 | CAM(128, bn_d=self.bn_d)) 134 | self.fire45 = nn.Sequential(nn.MaxPool2d(kernel_size=3, 135 | stride=[1, self.strides[2]], 136 | padding=1), 137 | Fire(128, 32, 128, 128, bn_d=self.bn_d), 138 | Fire(256, 32, 128, 128, bn_d=self.bn_d)) 139 | self.fire6789 = nn.Sequential(nn.MaxPool2d(kernel_size=3, 140 | stride=[1, self.strides[3]], 141 | padding=1), 142 | Fire(256, 48, 192, 192, bn_d=self.bn_d), 143 | Fire(384, 48, 192, 192, bn_d=self.bn_d), 144 | Fire(384, 64, 256, 256, bn_d=self.bn_d), 145 | Fire(512, 64, 256, 256, bn_d=self.bn_d)) 146 | 147 | # output 148 | self.dropout = nn.Dropout2d(self.drop_prob) 149 | 150 | # last channels 151 | self.last_channels = 512 152 | 153 | def run_layer(self, x, layer, skips, os): 154 | y = layer(x) 155 | if y.shape[2] < x.shape[2] or y.shape[3] < x.shape[3]: 156 | skips[os] = x.detach() 157 | os *= 2 158 | x = y 159 | return x, skips, os 160 | 161 | def forward(self, x): 162 | # filter input 163 | x = x[:, self.input_idxs] 164 | 165 | # run cnn 166 | # store for skip connections 167 | skips = {} 168 | os = 1 169 | 170 | # encoder 171 | skip_in = self.conv1b(x) 172 | x = self.conv1a(x) 173 | # first skip done manually 174 | skips[1] = skip_in.detach() 175 | os *= 2 176 | 177 | x, skips, os = self.run_layer(x, self.fire23, skips, os) 178 | x, skips, os = 
self.run_layer(x, self.dropout, skips, os) 179 | x, skips, os = self.run_layer(x, self.fire45, skips, os) 180 | x, skips, os = self.run_layer(x, self.dropout, skips, os) 181 | x, skips, os = self.run_layer(x, self.fire6789, skips, os) 182 | x, skips, os = self.run_layer(x, self.dropout, skips, os) 183 | 184 | return x, skips 185 | 186 | def get_last_depth(self): 187 | return self.last_channels 188 | 189 | def get_input_depth(self): 190 | return self.input_depth 191 | -------------------------------------------------------------------------------- /train/common/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PRBonn/lidar-bonnetal/99b827f0228ff0e997473ac8e2cecbaa4af7d7c7/train/common/__init__.py -------------------------------------------------------------------------------- /train/common/avgmeter.py: -------------------------------------------------------------------------------- 1 | # This file is covered by the LICENSE file in the root of this project. 2 | 3 | 4 | class AverageMeter(object): 5 | """Computes and stores the average and current value""" 6 | 7 | def __init__(self): 8 | self.reset() 9 | 10 | def reset(self): 11 | self.val = 0 12 | self.avg = 0 13 | self.sum = 0 14 | self.count = 0 15 | 16 | def update(self, val, n=1): 17 | self.val = val 18 | self.sum += val * n 19 | self.count += n 20 | self.avg = self.sum / self.count 21 | -------------------------------------------------------------------------------- /train/common/laserscan.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # This file is covered by the LICENSE file in the root of this project. 3 | import numpy as np 4 | 5 | 6 | class LaserScan: 7 | """Class that contains LaserScan with x,y,z,r""" 8 | EXTENSIONS_SCAN = ['.bin'] 9 | 10 | def __init__(self, project=False, H=64, W=1024, fov_up=3.0, fov_down=-25.0): 11 | self.project = project 12 | self.proj_H = H 13 | self.proj_W = W 14 | self.proj_fov_up = fov_up 15 | self.proj_fov_down = fov_down 16 | self.reset() 17 | 18 | def reset(self): 19 | """ Reset scan members. """ 20 | self.points = np.zeros((0, 3), dtype=np.float32) # [m, 3]: x, y, z 21 | self.remissions = np.zeros((0, 1), dtype=np.float32) # [m ,1]: remission 22 | 23 | # projected range image - [H,W] range (-1 is no data) 24 | self.proj_range = np.full((self.proj_H, self.proj_W), -1, 25 | dtype=np.float32) 26 | 27 | # unprojected range (list of depths for each point) 28 | self.unproj_range = np.zeros((0, 1), dtype=np.float32) 29 | 30 | # projected point cloud xyz - [H,W,3] xyz coord (-1 is no data) 31 | self.proj_xyz = np.full((self.proj_H, self.proj_W, 3), -1, 32 | dtype=np.float32) 33 | 34 | # projected remission - [H,W] intensity (-1 is no data) 35 | self.proj_remission = np.full((self.proj_H, self.proj_W), -1, 36 | dtype=np.float32) 37 | 38 | # projected index (for each pixel, what I am in the pointcloud) 39 | # [H,W] index (-1 is no data) 40 | self.proj_idx = np.full((self.proj_H, self.proj_W), -1, 41 | dtype=np.int32) 42 | 43 | # for each point, where it is in the range image 44 | self.proj_x = np.zeros((0, 1), dtype=np.int32) # [m, 1]: x 45 | self.proj_y = np.zeros((0, 1), dtype=np.int32) # [m, 1]: y 46 | 47 | # mask containing for each pixel, if it contains a point or not 48 | self.proj_mask = np.zeros((self.proj_H, self.proj_W), 49 | dtype=np.int32) # [H,W] mask 50 | 51 | def size(self): 52 | """ Return the size of the point cloud. 
""" 53 | return self.points.shape[0] 54 | 55 | def __len__(self): 56 | return self.size() 57 | 58 | def open_scan(self, filename): 59 | """ Open raw scan and fill in attributes 60 | """ 61 | # reset just in case there was an open structure 62 | self.reset() 63 | 64 | # check filename is string 65 | if not isinstance(filename, str): 66 | raise TypeError("Filename should be string type, " 67 | "but was {type}".format(type=str(type(filename)))) 68 | 69 | # check extension is a laserscan 70 | if not any(filename.endswith(ext) for ext in self.EXTENSIONS_SCAN): 71 | raise RuntimeError("Filename extension is not valid scan file.") 72 | 73 | # if all goes well, open pointcloud 74 | scan = np.fromfile(filename, dtype=np.float32) 75 | scan = scan.reshape((-1, 4)) 76 | 77 | # put in attribute 78 | points = scan[:, 0:3] # get xyz 79 | remissions = scan[:, 3] # get remission 80 | self.set_points(points, remissions) 81 | 82 | def set_points(self, points, remissions=None): 83 | """ Set scan attributes (instead of opening from file) 84 | """ 85 | # reset just in case there was an open structure 86 | self.reset() 87 | 88 | # check scan makes sense 89 | if not isinstance(points, np.ndarray): 90 | raise TypeError("Scan should be numpy array") 91 | 92 | # check remission makes sense 93 | if remissions is not None and not isinstance(remissions, np.ndarray): 94 | raise TypeError("Remissions should be numpy array") 95 | 96 | # put in attribute 97 | self.points = points # get xyz 98 | if remissions is not None: 99 | self.remissions = remissions # get remission 100 | else: 101 | self.remissions = np.zeros((points.shape[0]), dtype=np.float32) 102 | 103 | # if projection is wanted, then do it and fill in the structure 104 | if self.project: 105 | self.do_range_projection() 106 | 107 | def do_range_projection(self): 108 | """ Project a pointcloud into a spherical projection image.projection. 
113 | # laser parameters 114 | fov_up = self.proj_fov_up / 180.0 * np.pi # field of view up in rad 115 | fov_down = self.proj_fov_down / 180.0 * np.pi # field of view down in rad 116 | fov = abs(fov_down) + abs(fov_up) # get field of view total in rad 117 | 118 | # get depth of all points 119 | depth = np.linalg.norm(self.points, 2, axis=1) 120 | 121 | # get scan components 122 | scan_x = self.points[:, 0] 123 | scan_y = self.points[:, 1] 124 | scan_z = self.points[:, 2] 125 | 126 | # get angles of all points 127 | yaw = -np.arctan2(scan_y, scan_x) 128 | pitch = np.arcsin(scan_z / depth) 129 | 130 | # get projections in image coords 131 | proj_x = 0.5 * (yaw / np.pi + 1.0) # in [0.0, 1.0] 132 | proj_y = 1.0 - (pitch + abs(fov_down)) / fov # in [0.0, 1.0] 133 | 134 | # scale to image size using angular resolution 135 | proj_x *= self.proj_W # in [0.0, W] 136 | proj_y *= self.proj_H # in [0.0, H] 137 | 138 | # round and clamp for use as index 139 | proj_x = np.floor(proj_x) 140 | proj_x = np.minimum(self.proj_W - 1, proj_x) 141 | proj_x = np.maximum(0, proj_x).astype(np.int32) # in [0,W-1] 142 | self.proj_x = np.copy(proj_x) # store a copy in orig order 143 | 144 | proj_y = np.floor(proj_y) 145 | proj_y = np.minimum(self.proj_H - 1, proj_y) 146 | proj_y = np.maximum(0, proj_y).astype(np.int32) # in [0,H-1] 147 | self.proj_y = np.copy(proj_y) # store a copy in original order 148 | 149 | # copy of depth in original order 150 | self.unproj_range = np.copy(depth) 151 | 152 | # order in decreasing depth 153 | indices = np.arange(depth.shape[0]) 154 | order = np.argsort(depth)[::-1] 155 | depth = depth[order] 156 | indices = indices[order] 157 | points = self.points[order] 158 | remission = self.remissions[order] 159 | proj_y = proj_y[order] 160 | proj_x = proj_x[order] 161 | 162 | # assign to images 163 | self.proj_range[proj_y, proj_x] = depth 164 | self.proj_xyz[proj_y, proj_x] = points 165 | self.proj_remission[proj_y, proj_x] = remission 166 | self.proj_idx[proj_y, proj_x] = indices 167 | self.proj_mask = (self.proj_idx >= 0).astype(np.int32) # index 0 is a valid point 168 | 169 | 170 | class SemLaserScan(LaserScan): 171 | """Class that contains LaserScan with x,y,z,r,sem_label,sem_color_label,inst_label,inst_color_label""" 172 | EXTENSIONS_LABEL = ['.label'] 173 | 174 | def __init__(self, sem_color_dict=None, project=False, H=64, W=1024, fov_up=3.0, fov_down=-25.0, max_classes=300): 175 | super(SemLaserScan, self).__init__(project, H, W, fov_up, fov_down) 176 | self.reset() 177 | 178 | # make semantic colors 179 | if sem_color_dict: 180 | # if I have a dict, make it 181 | max_sem_key = 0 182 | for key, data in sem_color_dict.items(): 183 | if key + 1 > max_sem_key: 184 | max_sem_key = key + 1 185 | self.sem_color_lut = np.zeros((max_sem_key + 100, 3), dtype=np.float32) 186 | for key, value in sem_color_dict.items(): 187 | self.sem_color_lut[key] = np.array(value, np.float32) / 255.0 188 | else: 189 | # otherwise make random 190 | max_sem_key = max_classes 191 | self.sem_color_lut = np.random.uniform(low=0.0, 192 | high=1.0, 193 | size=(max_sem_key, 3)) 194 | # force zero to a gray-ish color 195 | self.sem_color_lut[0] = np.full((3), 0.1) 196 | 197 | # make instance colors 198 | max_inst_id = 100000 199 | self.inst_color_lut = np.random.uniform(low=0.0, 200 | high=1.0, 201 | size=(max_inst_id, 3)) 202 | # force zero to a gray-ish color
203 | self.inst_color_lut[0] = np.full((3), 0.1) 204 | 205 | def reset(self): 206 | """ Reset scan members. """ 207 | super(SemLaserScan, self).reset() 208 | 209 | # semantic labels 210 | self.sem_label = np.zeros((0, 1), dtype=np.int32) # [m, 1]: label 211 | self.sem_label_color = np.zeros((0, 3), dtype=np.float32) # [m ,3]: color 212 | 213 | # instance labels 214 | self.inst_label = np.zeros((0, 1), dtype=np.int32) # [m, 1]: label 215 | self.inst_label_color = np.zeros((0, 3), dtype=np.float32) # [m ,3]: color 216 | 217 | # projection color with semantic labels 218 | self.proj_sem_label = np.zeros((self.proj_H, self.proj_W), 219 | dtype=np.int32) # [H,W] label 220 | self.proj_sem_color = np.zeros((self.proj_H, self.proj_W, 3), 221 | dtype=np.float) # [H,W,3] color 222 | 223 | # projection color with instance labels 224 | self.proj_inst_label = np.zeros((self.proj_H, self.proj_W), 225 | dtype=np.int32) # [H,W] label 226 | self.proj_inst_color = np.zeros((self.proj_H, self.proj_W, 3), 227 | dtype=np.float) # [H,W,3] color 228 | 229 | def open_label(self, filename): 230 | """ Open raw label file and fill in attributes 231 | """ 232 | # check filename is string 233 | if not isinstance(filename, str): 234 | raise TypeError("Filename should be string type, " 235 | "but was {type}".format(type=str(type(filename)))) 236 | 237 | # check extension is a label 238 | if not any(filename.endswith(ext) for ext in self.EXTENSIONS_LABEL): 239 | raise RuntimeError("Filename extension is not valid label file.") 240 | 241 | # if all goes well, open label 242 | label = np.fromfile(filename, dtype=np.int32) 243 | label = label.reshape((-1)) 244 | 245 | # set it 246 | self.set_label(label) 247 | 248 | def set_label(self, label): 249 | """ Set label from a numpy array instead of reading it from file 250 | """ 251 | # check label makes sense 252 | if not isinstance(label, np.ndarray): 253 | raise TypeError("Label should be numpy array") 254 | 255 | # only fill in attribute if the right size 256 | if label.shape[0] == self.points.shape[0]: 257 | self.sem_label = label & 0xFFFF # semantic label in lower half 258 | self.inst_label = label >> 16 # instance id in upper half 259 | else: 260 | print("Points shape: ", self.points.shape) 261 | print("Label shape: ", label.shape) 262 | raise ValueError("Scan and Label don't contain same number of points") 263 | 264 | # sanity check 265 | assert((self.sem_label + (self.inst_label << 16) == label).all()) 266 | 267 | if self.project: 268 | self.do_label_projection() 269 | 270 | def colorize(self): 271 | """ Colorize pointcloud with the color of each semantic label 272 | """ 273 | self.sem_label_color = self.sem_color_lut[self.sem_label] 274 | self.sem_label_color = self.sem_label_color.reshape((-1, 3)) 275 | 276 | self.inst_label_color = self.inst_color_lut[self.inst_label] 277 | self.inst_label_color = self.inst_label_color.reshape((-1, 3)) 278 | 279 | def do_label_projection(self): 280 | # only map colors to labels that exist 281 | mask = self.proj_idx >= 0 282 | 283 | # semantics 284 | self.proj_sem_label[mask] = self.sem_label[self.proj_idx[mask]] 285 | self.proj_sem_color[mask] = self.sem_color_lut[self.sem_label[self.proj_idx[mask]]] 286 | 287 | # instances 288 | self.proj_inst_label[mask] = self.inst_label[self.proj_idx[mask]] 289 | self.proj_inst_color[mask] = self.inst_color_lut[self.inst_label[self.proj_idx[mask]]] 290 | -------------------------------------------------------------------------------- /train/common/laserscanvis.py: 
-------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # This file is covered by the LICENSE file in the root of this project. 3 | 4 | import vispy 5 | from vispy.scene import visuals, SceneCanvas 6 | import numpy as np 7 | from matplotlib import pyplot as plt 8 | from common.laserscan import LaserScan, SemLaserScan 9 | 10 | 11 | class LaserScanVis: 12 | """Class that creates and handles a visualizer for a pointcloud""" 13 | 14 | def __init__(self, scan, scan_names, label_names, offset=0, 15 | semantics=True, instances=False): 16 | self.scan = scan 17 | self.scan_names = scan_names 18 | self.label_names = label_names 19 | self.offset = offset 20 | self.semantics = semantics 21 | self.instances = instances 22 | # sanity check 23 | if not self.semantics and self.instances: 24 | print("Instances are only allowed when semantics=True") 25 | raise ValueError 26 | 27 | self.reset() 28 | self.update_scan() 29 | 30 | def reset(self): 31 | """ Reset. """ 32 | # last key press (it should have a mutex, but visualization is not 33 | # safety critical, so let's do things wrong) 34 | self.action = "no" # no, next, back, quit are the possibilities 35 | 36 | # new canvas prepared for visualizing data 37 | self.canvas = SceneCanvas(keys='interactive', show=True) 38 | # interface (n next, b back, q quit, very simple) 39 | self.canvas.events.key_press.connect(self.key_press) 40 | self.canvas.events.draw.connect(self.draw) 41 | # grid 42 | self.grid = self.canvas.central_widget.add_grid() 43 | 44 | # laserscan part 45 | self.scan_view = vispy.scene.widgets.ViewBox( 46 | border_color='white', parent=self.canvas.scene) 47 | self.grid.add_widget(self.scan_view, 0, 0) 48 | self.scan_vis = visuals.Markers() 49 | self.scan_view.camera = 'turntable' 50 | self.scan_view.add(self.scan_vis) 51 | visuals.XYZAxis(parent=self.scan_view.scene) 52 | # add semantics 53 | if self.semantics: 54 | print("Using semantics in visualizer") 55 | self.sem_view = vispy.scene.widgets.ViewBox( 56 | border_color='white', parent=self.canvas.scene) 57 | self.grid.add_widget(self.sem_view, 0, 1) 58 | self.sem_vis = visuals.Markers() 59 | self.sem_view.camera = 'turntable' 60 | self.sem_view.add(self.sem_vis) 61 | visuals.XYZAxis(parent=self.sem_view.scene) 62 | # self.sem_view.camera.link(self.scan_view.camera) 63 | 64 | if self.instances: 65 | print("Using instances in visualizer") 66 | self.inst_view = vispy.scene.widgets.ViewBox( 67 | border_color='white', parent=self.canvas.scene) 68 | self.grid.add_widget(self.inst_view, 0, 2) 69 | self.inst_vis = visuals.Markers() 70 | self.inst_view.camera = 'turntable' 71 | self.inst_view.add(self.inst_vis) 72 | visuals.XYZAxis(parent=self.inst_view.scene) 73 | # self.inst_view.camera.link(self.scan_view.camera) 74 | 75 | # img canvas size 76 | self.multiplier = 1 77 | self.canvas_W = 1024 78 | self.canvas_H = 64 79 | if self.semantics: 80 | self.multiplier += 1 81 | if self.instances: 82 | self.multiplier += 1 83 | 84 | # new canvas for img 85 | self.img_canvas = SceneCanvas(keys='interactive', show=True, 86 | size=(self.canvas_W, self.canvas_H * self.multiplier)) 87 | # grid 88 | self.img_grid = self.img_canvas.central_widget.add_grid() 89 | # interface (n next, b back, q quit, very simple) 90 | self.img_canvas.events.key_press.connect(self.key_press) 91 | self.img_canvas.events.draw.connect(self.draw) 92 | 93 | # add a view for the depth 94 | self.img_view = vispy.scene.widgets.ViewBox( 95 | border_color='white', 
parent=self.img_canvas.scene) 96 | self.img_grid.add_widget(self.img_view, 0, 0) 97 | self.img_vis = visuals.Image(cmap='viridis') 98 | self.img_view.add(self.img_vis) 99 | 100 | # add semantics 101 | if self.semantics: 102 | self.sem_img_view = vispy.scene.widgets.ViewBox( 103 | border_color='white', parent=self.img_canvas.scene) 104 | self.img_grid.add_widget(self.sem_img_view, 1, 0) 105 | self.sem_img_vis = visuals.Image(cmap='viridis') 106 | self.sem_img_view.add(self.sem_img_vis) 107 | 108 | # add instances 109 | if self.instances: 110 | self.inst_img_view = vispy.scene.widgets.ViewBox( 111 | border_color='white', parent=self.img_canvas.scene) 112 | self.img_grid.add_widget(self.inst_img_view, 2, 0) 113 | self.inst_img_vis = visuals.Image(cmap='viridis') 114 | self.inst_img_view.add(self.inst_img_vis) 115 | 116 | def get_mpl_colormap(self, cmap_name): 117 | cmap = plt.get_cmap(cmap_name) 118 | 119 | # Initialize the matplotlib color map 120 | sm = plt.cm.ScalarMappable(cmap=cmap) 121 | 122 | # Obtain linear color range 123 | color_range = sm.to_rgba(np.linspace(0, 1, 256), bytes=True)[:, 2::-1] 124 | 125 | return color_range.reshape(256, 3).astype(np.float32) / 255.0 126 | 127 | def update_scan(self): 128 | # first open data 129 | self.scan.open_scan(self.scan_names[self.offset]) 130 | if self.semantics: 131 | self.scan.open_label(self.label_names[self.offset]) 132 | self.scan.colorize() 133 | 134 | # then change names 135 | title = "scan " + str(self.offset) + " of " + str(len(self.scan_names)) 136 | self.canvas.title = title 137 | self.img_canvas.title = title 138 | 139 | # then do all the point cloud stuff 140 | 141 | # plot scan 142 | power = 16 143 | # print() 144 | range_data = np.copy(self.scan.unproj_range) 145 | # print(range_data.max(), range_data.min()) 146 | range_data = range_data**(1 / power) 147 | # print(range_data.max(), range_data.min()) 148 | viridis_range = ((range_data - range_data.min()) / 149 | (range_data.max() - range_data.min()) * 150 | 255).astype(np.uint8) 151 | viridis_map = self.get_mpl_colormap("viridis") 152 | viridis_colors = viridis_map[viridis_range] 153 | self.scan_vis.set_data(self.scan.points, 154 | face_color=viridis_colors[..., ::-1], 155 | edge_color=viridis_colors[..., ::-1], 156 | size=1) 157 | 158 | # plot semantics 159 | if self.semantics: 160 | self.sem_vis.set_data(self.scan.points, 161 | face_color=self.scan.sem_label_color[..., ::-1], 162 | edge_color=self.scan.sem_label_color[..., ::-1], 163 | size=1) 164 | 165 | # plot instances 166 | if self.instances: 167 | self.inst_vis.set_data(self.scan.points, 168 | face_color=self.scan.inst_label_color[..., ::-1], 169 | edge_color=self.scan.inst_label_color[..., ::-1], 170 | size=1) 171 | 172 | # now do all the range image stuff 173 | # plot range image 174 | data = np.copy(self.scan.proj_range) 175 | # print(data[data > 0].max(), data[data > 0].min()) 176 | data[data > 0] = data[data > 0]**(1 / power) 177 | data[data < 0] = data[data > 0].min() 178 | # print(data.max(), data.min()) 179 | data = (data - data[data > 0].min()) / \ 180 | (data.max() - data[data > 0].min()) 181 | # print(data.max(), data.min()) 182 | self.img_vis.set_data(data) 183 | self.img_vis.update() 184 | 185 | if self.semantics: 186 | self.sem_img_vis.set_data(self.scan.proj_sem_color[..., ::-1]) 187 | self.sem_img_vis.update() 188 | 189 | if self.instances: 190 | self.inst_img_vis.set_data(self.scan.proj_inst_color[..., ::-1]) 191 | self.inst_img_vis.update() 192 | 193 | # interface 194 | def key_press(self, event): 195 | 
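# Block further key events until the next draw; draw() below unblocks them,
# so at most one navigation step is handled per rendered frame.
# Keys: N -> next scan, B -> back one scan, Q/Escape -> quit.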
self.canvas.events.key_press.block() 196 | self.img_canvas.events.key_press.block() 197 | if event.key == 'N': 198 | self.offset += 1 199 | self.update_scan() 200 | elif event.key == 'B': 201 | self.offset -= 1 202 | self.update_scan() 203 | elif event.key == 'Q' or event.key == 'Escape': 204 | self.destroy() 205 | 206 | def draw(self, event): 207 | if self.canvas.events.key_press.blocked(): 208 | self.canvas.events.key_press.unblock() 209 | if self.img_canvas.events.key_press.blocked(): 210 | self.img_canvas.events.key_press.unblock() 211 | 212 | def destroy(self): 213 | # destroy the visualization 214 | self.canvas.close() 215 | self.img_canvas.close() 216 | vispy.app.quit() 217 | 218 | def run(self): 219 | vispy.app.run() 220 | -------------------------------------------------------------------------------- /train/common/logger.py: -------------------------------------------------------------------------------- 1 | # Code referenced from https://gist.github.com/gyglim/1f8dfb1b5c82627ae3efcfbbadb9f514 2 | 3 | import tensorflow as tf 4 | import numpy as np 5 | import scipy.misc 6 | try: 7 | from StringIO import StringIO # Python 2.7 8 | except ImportError: 9 | from io import BytesIO # Python 3.x 10 | 11 | 12 | class Logger(object): 13 | 14 | def __init__(self, log_dir): 15 | """Create a summary writer logging to log_dir.""" 16 | self.writer = tf.summary.FileWriter(log_dir) 17 | 18 | def scalar_summary(self, tag, value, step): 19 | """Log a scalar variable.""" 20 | summary = tf.Summary( 21 | value=[tf.Summary.Value(tag=tag, simple_value=value)]) 22 | self.writer.add_summary(summary, step) 23 | self.writer.flush() 24 | 25 | def image_summary(self, tag, images, step): 26 | """Log a list of images.""" 27 | 28 | img_summaries = [] 29 | for i, img in enumerate(images): 30 | # Write the image to a string 31 | try: 32 | s = StringIO() 33 | except: 34 | s = BytesIO() 35 | scipy.misc.toimage(img).save(s, format="png") 36 | 37 | # Create an Image object 38 | img_sum = tf.Summary.Image(encoded_image_string=s.getvalue(), 39 | height=img.shape[0], 40 | width=img.shape[1]) 41 | # Create a Summary value 42 | img_summaries.append(tf.Summary.Value( 43 | tag='%s/%d' % (tag, i), image=img_sum)) 44 | 45 | # Create and write Summary 46 | summary = tf.Summary(value=img_summaries) 47 | self.writer.add_summary(summary, step) 48 | self.writer.flush() 49 | 50 | def histo_summary(self, tag, values, step, bins=1000): 51 | """Log a histogram of the tensor of values.""" 52 | 53 | # Create a histogram using numpy 54 | counts, bin_edges = np.histogram(values, bins=bins) 55 | 56 | # Fill the fields of the histogram proto 57 | hist = tf.HistogramProto() 58 | hist.min = float(np.min(values)) 59 | hist.max = float(np.max(values)) 60 | hist.num = int(np.prod(values.shape)) 61 | hist.sum = float(np.sum(values)) 62 | hist.sum_squares = float(np.sum(values**2)) 63 | 64 | # Drop the start of the first bin 65 | bin_edges = bin_edges[1:] 66 | 67 | # Add bin edges and counts 68 | for edge in bin_edges: 69 | hist.bucket_limit.append(edge) 70 | for c in counts: 71 | hist.bucket.append(c) 72 | 73 | # Create and write Summary 74 | summary = tf.Summary(value=[tf.Summary.Value(tag=tag, histo=hist)]) 75 | self.writer.add_summary(summary, step) 76 | self.writer.flush() 77 | -------------------------------------------------------------------------------- /train/common/onehot.py: -------------------------------------------------------------------------------- 1 | # This file is covered by the LICENSE file in the root of this project. 
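# Converts integer label tensors to one-hot encodings with scatter_: e.g. a
# [h, w] label map becomes a [nclasses, h, w] tensor with a single 1 in each
# pixel's channel dimension. Batched and unbatched inputs are both handled.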
2 | import torch 3 | import torch.nn as nn 4 | import torch.nn.functional as F 5 | import __init__ as booger 6 | 7 | 8 | class oneHot(nn.Module): 9 | def __init__(self, device, nclasses, spatial_dim=2): 10 | super().__init__() 11 | self.device = device 12 | self.nclasses = nclasses 13 | self.spatial_dim = spatial_dim 14 | 15 | def onehot1dspatial(self, x): 16 | # we only handle 1d tensors, batched or not, so check 17 | assert(len(x.shape) == 1 or len(x.shape) == 2) 18 | # if not batched, batch 19 | remove_dim = False # flag to unbatch 20 | if len(x.shape) == 1: 21 | # add batch dimension 22 | x = x[None, ...] 23 | remove_dim = True 24 | 25 | # get tensor shape 26 | n, b = x.shape 27 | 28 | # scatter to onehot 29 | one_hot = torch.zeros((n, self.nclasses, b), 30 | device=self.device).scatter_(1, x.unsqueeze(1), 1) 31 | 32 | # one_hot is now [n,classes,b] 33 | 34 | # if it used to be unbatched, then unbatch it 35 | if remove_dim: 36 | one_hot = one_hot[0] 37 | 38 | return one_hot 39 | 40 | def onehot2dspatial(self, x): 41 | # we only handle 2d tensors, batched or not, so check 42 | assert(len(x.shape) == 2 or len(x.shape) == 3) 43 | # if not batched, batch 44 | remove_dim = False # flag to unbatch 45 | if len(x.shape) == 2: 46 | # add batch dimension 47 | x = x[None, ...] 48 | remove_dim = True 49 | 50 | # get tensor shape 51 | n, h, w = x.shape 52 | 53 | # scatter to onehot 54 | one_hot = torch.zeros((n, self.nclasses, h, w), 55 | device=self.device).scatter_(1, x.unsqueeze(1), 1) 56 | 57 | # one_hot is now [n,classes,h,w] 58 | 59 | # if it used to be unbatched, then unbatch it 60 | if remove_dim: 61 | one_hot = one_hot[0] 62 | 63 | return one_hot 64 | 65 | def forward(self, x): 66 | # do onehot here 67 | if self.spatial_dim == 1: 68 | return self.onehot1dspatial(x) 69 | elif self.spatial_dim == 2: 70 | return self.onehot2dspatial(x) 71 | 72 | 73 | if __name__ == "__main__": 74 | # get device 75 | if torch.cuda.is_available(): 76 | device = torch.device('cuda') 77 | else: 78 | device = torch.device('cpu') 79 | 80 | # define number of classes 81 | nclasses = 6 82 | print("*"*80) 83 | print("Num classes 1d =", nclasses) 84 | print("*"*80) 85 | 86 | # test 1d unbatched case 87 | print("Tensor 1d spat dim, unbatched") 88 | tensor = torch.arange(0, nclasses).to(device) # [0,1,2,3,4,5] 89 | print("in:", tensor) 90 | module = oneHot(device, nclasses, spatial_dim=1) 91 | print("out:", module(tensor)) 92 | print("*"*80) 93 | 94 | # test 1d batched case 95 | print("*"*80) 96 | print("Tensor 1d spat dim, batched") 97 | tensor = torch.arange(0, nclasses).to(device) # [0,1,2,3,4,5] 98 | tensor = torch.cat([tensor.unsqueeze(0), 99 | tensor.unsqueeze(0)]) # [[0,1,2,3,4,5], [0,1,2,3,4,5]] 100 | print("in:", tensor) 101 | module = oneHot(device, nclasses, spatial_dim=1) 102 | print("out:", module(tensor)) 103 | print("*"*80) 104 | 105 | # for 2d use fewer classes 106 | nclasses = 3 107 | print("*"*80) 108 | print("Num classes 2d =", nclasses) 109 | print("*"*80) 110 | 111 | # test 2d unbatched case 112 | print("*"*80) 113 | print("Tensor 2d spat dim, unbatched") 114 | tensor = torch.arange(0, nclasses).to(device) # [0,1,2] 115 | tensor = torch.cat([tensor.unsqueeze(0), # [[0,1,2], 116 | tensor.unsqueeze(0), # [0,1,2], 117 | tensor.unsqueeze(0), # [0,1,2], 118 | tensor.unsqueeze(0)]) # [0,1,2]] 119 | print("in:", tensor) 120 | module = oneHot(device, nclasses, spatial_dim=2) 121 | print("out:", module(tensor)) 122 | print("*"*80) 123 | 124 | # test 2d batched case 125 | print("*"*80) 126 | 
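# expected: input of shape [2, 4, 3] scatters to a one-hot of shape
# [2, 3, 4, 3], i.e. (batch, nclasses, h, w) with nclasses = 3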
print("Tensor 2d spat dim, unbatched") 127 | tensor = torch.arange(0, nclasses).to(device) # [0,1,2] 128 | tensor = torch.cat([tensor.unsqueeze(0), # [[0,1,2], 129 | tensor.unsqueeze(0), # [0,1,2], 130 | tensor.unsqueeze(0), # [0,1,2], 131 | tensor.unsqueeze(0)]) # [0,1,2]] 132 | tensor = torch.cat([tensor.unsqueeze(0), 133 | tensor.unsqueeze(0)]) # 2 of the same 2d tensor 134 | print("in:", tensor) 135 | module = oneHot(device, nclasses, spatial_dim=2) 136 | print("out:", module(tensor)) 137 | print("*"*80) 138 | -------------------------------------------------------------------------------- /train/common/sync_batchnorm/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PRBonn/lidar-bonnetal/99b827f0228ff0e997473ac8e2cecbaa4af7d7c7/train/common/sync_batchnorm/__init__.py -------------------------------------------------------------------------------- /train/common/sync_batchnorm/comm.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # File : comm.py 3 | # Author : Jiayuan Mao 4 | # Email : maojiayuan@gmail.com 5 | # Date : 27/01/2018 6 | # 7 | # This file is part of Synchronized-BatchNorm-PyTorch. 8 | # https://github.com/vacancy/Synchronized-BatchNorm-PyTorch 9 | # Distributed under MIT License. 10 | 11 | import queue 12 | import collections 13 | import threading 14 | 15 | __all__ = ['FutureResult', 'SlavePipe', 'SyncMaster'] 16 | 17 | 18 | class FutureResult(object): 19 | """A thread-safe future implementation. Used only as one-to-one pipe.""" 20 | 21 | def __init__(self): 22 | self._result = None 23 | self._lock = threading.Lock() 24 | self._cond = threading.Condition(self._lock) 25 | 26 | def put(self, result): 27 | with self._lock: 28 | assert self._result is None, 'Previous result has\'t been fetched.' 29 | self._result = result 30 | self._cond.notify() 31 | 32 | def get(self): 33 | with self._lock: 34 | if self._result is None: 35 | self._cond.wait() 36 | 37 | res = self._result 38 | self._result = None 39 | return res 40 | 41 | 42 | _MasterRegistry = collections.namedtuple('MasterRegistry', ['result']) 43 | _SlavePipeBase = collections.namedtuple( 44 | '_SlavePipeBase', ['identifier', 'queue', 'result']) 45 | 46 | 47 | class SlavePipe(_SlavePipeBase): 48 | """Pipe for master-slave communication.""" 49 | 50 | def run_slave(self, msg): 51 | self.queue.put((self.identifier, msg)) 52 | ret = self.result.get() 53 | self.queue.put(True) 54 | return ret 55 | 56 | 57 | class SyncMaster(object): 58 | """An abstract `SyncMaster` object. 59 | 60 | - During the replication, as the data parallel will trigger an callback of each module, all slave devices should 61 | call `register(id)` and obtain an `SlavePipe` to communicate with the master. 62 | - During the forward pass, master device invokes `run_master`, all messages from slave devices will be collected, 63 | and passed to a registered callback. 64 | - After receiving the messages, the master device should gather the information and determine to message passed 65 | back to each slave devices. 66 | """ 67 | 68 | def __init__(self, master_callback): 69 | """ 70 | 71 | Args: 72 | master_callback: a callback to be invoked after having collected messages from slave devices. 
73 | """ 74 | self._master_callback = master_callback 75 | self._queue = queue.Queue() 76 | self._registry = collections.OrderedDict() 77 | self._activated = False 78 | 79 | def __getstate__(self): 80 | return {'master_callback': self._master_callback} 81 | 82 | def __setstate__(self, state): 83 | self.__init__(state['master_callback']) 84 | 85 | def register_slave(self, identifier): 86 | """ 87 | Register an slave device. 88 | 89 | Args: 90 | identifier: an identifier, usually is the device id. 91 | 92 | Returns: a `SlavePipe` object which can be used to communicate with the master device. 93 | 94 | """ 95 | if self._activated: 96 | assert self._queue.empty(), 'Queue is not clean before next initialization.' 97 | self._activated = False 98 | self._registry.clear() 99 | future = FutureResult() 100 | self._registry[identifier] = _MasterRegistry(future) 101 | return SlavePipe(identifier, self._queue, future) 102 | 103 | def run_master(self, master_msg): 104 | """ 105 | Main entry for the master device in each forward pass. 106 | The messages were first collected from each devices (including the master device), and then 107 | an callback will be invoked to compute the message to be sent back to each devices 108 | (including the master device). 109 | 110 | Args: 111 | master_msg: the message that the master want to send to itself. This will be placed as the first 112 | message when calling `master_callback`. For detailed usage, see `_SynchronizedBatchNorm` for an example. 113 | 114 | Returns: the message to be sent back to the master device. 115 | 116 | """ 117 | self._activated = True 118 | 119 | intermediates = [(0, master_msg)] 120 | for i in range(self.nr_slaves): 121 | intermediates.append(self._queue.get()) 122 | 123 | results = self._master_callback(intermediates) 124 | assert results[0][0] == 0, 'The first result should belongs to the master.' 125 | 126 | for i, res in results: 127 | if i == 0: 128 | continue 129 | self._registry[i].result.put(res) 130 | 131 | for i in range(self.nr_slaves): 132 | assert self._queue.get() is True 133 | 134 | return results[0][1] 135 | 136 | @property 137 | def nr_slaves(self): 138 | return len(self._registry) 139 | -------------------------------------------------------------------------------- /train/common/sync_batchnorm/replicate.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # File : replicate.py 3 | # Author : Jiayuan Mao 4 | # Email : maojiayuan@gmail.com 5 | # Date : 27/01/2018 6 | # 7 | # This file is part of Synchronized-BatchNorm-PyTorch. 8 | # https://github.com/vacancy/Synchronized-BatchNorm-PyTorch 9 | # Distributed under MIT License. 10 | 11 | import functools 12 | 13 | from torch.nn.parallel.data_parallel import DataParallel 14 | 15 | __all__ = [ 16 | 'CallbackContext', 17 | 'execute_replication_callbacks', 18 | 'DataParallelWithCallback', 19 | 'patch_replication_callback' 20 | ] 21 | 22 | 23 | class CallbackContext(object): 24 | pass 25 | 26 | 27 | def execute_replication_callbacks(modules): 28 | """ 29 | Execute an replication callback `__data_parallel_replicate__` on each module created by original replication. 30 | 31 | The callback will be invoked with arguments `__data_parallel_replicate__(ctx, copy_id)` 32 | 33 | Note that, as all modules are isomorphism, we assign each sub-module with a context 34 | (shared among multiple copies of this module on different devices). 35 | Through this context, different copies can share some information. 
36 | 37 | We guarantee that the callback on the master copy (the first copy) will be called before the callback 38 | of any slave copies. 39 | """ 40 | master_copy = modules[0] 41 | nr_modules = len(list(master_copy.modules())) 42 | ctxs = [CallbackContext() for _ in range(nr_modules)] 43 | 44 | for i, module in enumerate(modules): 45 | for j, m in enumerate(module.modules()): 46 | if hasattr(m, '__data_parallel_replicate__'): 47 | m.__data_parallel_replicate__(ctxs[j], i) 48 | 49 | 50 | class DataParallelWithCallback(DataParallel): 51 | """ 52 | Data Parallel with a replication callback. 53 | 54 | A replication callback `__data_parallel_replicate__` of each module will be invoked after the module is created by 55 | the original `replicate` function. 56 | The callback will be invoked with arguments `__data_parallel_replicate__(ctx, copy_id)` 57 | 58 | Examples: 59 | > sync_bn = SynchronizedBatchNorm1d(10, eps=1e-5, affine=False) 60 | > sync_bn = DataParallelWithCallback(sync_bn, device_ids=[0, 1]) 61 | # sync_bn.__data_parallel_replicate__ will be invoked. 62 | """ 63 | 64 | def replicate(self, module, device_ids): 65 | modules = super(DataParallelWithCallback, 66 | self).replicate(module, device_ids) 67 | execute_replication_callbacks(modules) 68 | return modules 69 | 70 | 71 | def patch_replication_callback(data_parallel): 72 | """ 73 | Monkey-patch an existing `DataParallel` object. Add the replication callback. 74 | Useful when you have a customized `DataParallel` implementation. 75 | 76 | Examples: 77 | > sync_bn = SynchronizedBatchNorm1d(10, eps=1e-5, affine=False) 78 | > sync_bn = DataParallel(sync_bn, device_ids=[0, 1]) 79 | > patch_replication_callback(sync_bn) 80 | # this is equivalent to 81 | > sync_bn = SynchronizedBatchNorm1d(10, eps=1e-5, affine=False) 82 | > sync_bn = DataParallelWithCallback(sync_bn, device_ids=[0, 1]) 83 | """ 84 | 85 | assert isinstance(data_parallel, DataParallel) 86 | 87 | old_replicate = data_parallel.replicate 88 | 89 | @functools.wraps(old_replicate) 90 | def new_replicate(module, device_ids): 91 | modules = old_replicate(module, device_ids) 92 | execute_replication_callbacks(modules) 93 | return modules 94 | 95 | data_parallel.replicate = new_replicate 96 | -------------------------------------------------------------------------------- /train/common/warmupLR.py: -------------------------------------------------------------------------------- 1 | # This file is covered by the LICENSE file in the root of this project. 2 | 3 | import torch.optim.lr_scheduler as toptim 4 | 5 | 6 | class warmupLR(toptim._LRScheduler): 7 | """ Warmup learning rate scheduler. 8 | Initially increases the learning rate from 0 to the final value over a 9 | certain number of steps. After the warmup, each step decays the LR 10 | exponentially, i.e. lr_t = lr * decay**t (see get_lr below).
11 | """ 12 | 13 | def __init__(self, optimizer, lr, warmup_steps, momentum, decay): 14 | # cyclic params 15 | self.optimizer = optimizer 16 | self.lr = lr 17 | self.warmup_steps = warmup_steps 18 | self.momentum = momentum 19 | self.decay = decay 20 | 21 | # cap to one 22 | if self.warmup_steps < 1: 23 | self.warmup_steps = 1 24 | 25 | # cyclic lr 26 | self.initial_scheduler = toptim.CyclicLR(self.optimizer, 27 | base_lr=0, 28 | max_lr=self.lr, 29 | step_size_up=self.warmup_steps, 30 | step_size_down=self.warmup_steps, 31 | cycle_momentum=False, 32 | base_momentum=self.momentum, 33 | max_momentum=self.momentum) 34 | 35 | # our params 36 | self.last_epoch = -1 # fix for pytorch 1.1 and below 37 | self.finished = False # am i done 38 | super().__init__(optimizer) 39 | 40 | def get_lr(self): 41 | return [self.lr * (self.decay ** self.last_epoch) for lr in self.base_lrs] 42 | 43 | def step(self, epoch=None): 44 | if self.finished or self.initial_scheduler.last_epoch >= self.warmup_steps: 45 | if not self.finished: 46 | self.base_lrs = [self.lr for lr in self.base_lrs] 47 | self.finished = True 48 | return super(warmupLR, self).step(epoch) 49 | else: 50 | return self.initial_scheduler.step(epoch) 51 | -------------------------------------------------------------------------------- /train/requirements.txt: -------------------------------------------------------------------------------- 1 | numpy==1.14.0 2 | scipy==0.19.1 3 | torch==1.1.0 4 | tensorflow==1.13.1 5 | vispy==0.5.3 6 | torchvision==0.2.2.post3 7 | opencv_contrib_python==4.1.0.25 8 | matplotlib==2.2.3 9 | Pillow==6.1.0 10 | PyYAML==5.1.1 11 | -------------------------------------------------------------------------------- /train/tasks/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PRBonn/lidar-bonnetal/99b827f0228ff0e997473ac8e2cecbaa4af7d7c7/train/tasks/__init__.py -------------------------------------------------------------------------------- /train/tasks/panoptic/README.md: -------------------------------------------------------------------------------- 1 | # LiDAR-Bonnetal Panoptic Segmentation Training 2 | 3 | Coming Soon -------------------------------------------------------------------------------- /train/tasks/panoptic/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PRBonn/lidar-bonnetal/99b827f0228ff0e997473ac8e2cecbaa4af7d7c7/train/tasks/panoptic/__init__.py -------------------------------------------------------------------------------- /train/tasks/panoptic/readme.md: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PRBonn/lidar-bonnetal/99b827f0228ff0e997473ac8e2cecbaa4af7d7c7/train/tasks/panoptic/readme.md -------------------------------------------------------------------------------- /train/tasks/semantic/README.md: -------------------------------------------------------------------------------- 1 | # LiDAR-Bonnetal Semantic Segmentation Training 2 | 3 | This part of the framework deals with the training of semantic segmentation networks for point cloud data using range images. 
This code allows reproducing the experiments from the [RangeNet++](http://www.ipb.uni-bonn.de/wp-content/papercite-data/pdf/milioto2019iros.pdf) paper. 4 | 5 | _Examples of segmentation results from [SemanticKITTI](http://semantic-kitti.org) dataset:_ 6 | ![ptcl](../../../pics/semantic-ptcl.gif) 7 | ![ptcl](../../../pics/semantic-proj.gif) 8 | 9 | ## Configuration files 10 | 11 | Architecture configuration files are located at [config/arch](config/arch/). 12 | Dataset configuration files are located at [config/labels](config/labels/). 13 | 14 | ## Apps 15 | 16 | `ALL SCRIPTS CAN BE INVOKED WITH -h TO GET EXTRA HELP ON HOW TO RUN THEM` 17 | 18 | ### Visualization 19 | 20 | To visualize the data (in this example sequence 00): 21 | 22 | ```sh 23 | $ ./visualize.py -d /path/to/dataset/ -s 00 24 | ``` 25 | 26 | To visualize the predictions (in this example sequence 00): 27 | 28 | ```sh 29 | $ ./visualize.py -d /path/to/dataset/ -p /path/to/predictions/ -s 00 30 | ``` 31 | 32 | ### Training 33 | 34 | To train a network (from scratch): 35 | 36 | ```sh 37 | $ ./train.py -d /path/to/dataset -ac /config/arch/CHOICE.yaml -l /path/to/log 38 | ``` 39 | 40 | To train a network (from a pretrained model): 41 | 42 | ```sh 43 | $ ./train.py -d /path/to/dataset -ac /config/arch/CHOICE.yaml -dc /config/labels/CHOICE.yaml -l /path/to/log -p /path/to/pretrained 44 | ``` 45 | 46 | This will generate a tensorboard log, which can be visualized by running: 47 | 48 | ```sh 49 | $ cd /path/to/log 50 | $ tensorboard --logdir=. --port 5555 51 | ``` 52 | 53 | And accessing [http://localhost:5555](http://localhost:5555) in your browser. 54 | 55 | ### Inference 56 | 57 | To infer the predictions for the entire dataset: 58 | 59 | ```sh 60 | $ ./infer.py -d /path/to/dataset/ -l /path/for/predictions -m /path/to/model 61 | ``` 62 | 63 | ### Evaluation 64 | 65 | To evaluate the overall IoU of the point clouds (of a specific split, which for SemanticKITTI can only be train or valid, since the test set is evaluated only on our benchmark server): 66 | 67 | ```sh 68 | $ ./evaluate_iou.py -d /path/to/dataset -p /path/to/predictions/ --split valid 69 | ``` 70 | 71 | To evaluate the border IoU of the point clouds (introduced in the RangeNet++ paper): 72 | 73 | ```sh 74 | $ ./evaluate_biou.py -d /path/to/dataset -p /path/to/predictions/ --split valid --border 1 --conn 4 75 | ``` 76 | 77 | ## Pre-trained Models 78 | 79 | ### [SemanticKITTI](http://semantic-kitti.org) 80 | 81 | - [squeezeseg](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/models/squeezeseg.tar.gz) 82 | - [squeezeseg + crf](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/models/squeezeseg-crf.tar.gz) 83 | - [squeezesegV2](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/models/squeezesegV2.tar.gz) 84 | - [squeezesegV2 + crf](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/models/squeezesegV2-crf.tar.gz) 85 | - [darknet21](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/models/darknet21.tar.gz) 86 | - [darknet53](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/models/darknet53.tar.gz) 87 | - [darknet53-1024](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/models/darknet53-1024.tar.gz) 88 | - [darknet53-512](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/models/darknet53-512.tar.gz) 89 | 90 | To enable kNN post-processing, set the KNN `use` parameter to `True` in the `arch_cfg.yaml` file inside the model directory, as sketched below.
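As a minimal sketch (assuming the `post:` key layout of the arch configs in this repo and a placeholder model path), the flag can also be flipped programmatically with PyYAML; note that `yaml.safe_dump` does not preserve the comments in the shipped file, so editing by hand works just as well:

```python
# Hypothetical helper: enable kNN post-processing in a downloaded model's
# arch_cfg.yaml. Assumes the post -> KNN -> use layout of config/arch/*.yaml.
import yaml

cfg_path = "/path/to/model/arch_cfg.yaml"  # placeholder path

with open(cfg_path) as f:
    cfg = yaml.safe_load(f)

# the KNN params (knn, search, sigma, cutoff) stay as shipped
cfg["post"]["KNN"]["use"] = True

with open(cfg_path, "w") as f:
    yaml.safe_dump(cfg, f, default_flow_style=False)
```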
91 | 92 | ## Predictions from Models 93 | 94 | ### [SemanticKITTI](http://semantic-kitti.org) 95 | 96 | These are the predictions for the train, validation, and test sets. The performance can be evaluated for the training and validation set, but for test set evaluation a submission to the benchmark needs to be made (labels are not public). 97 | 98 | No post-processing: 99 | - [squeezeseg](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/squeezeseg.tar.gz) 100 | - [squeezeseg + crf](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/squeezeseg-crf.tar.gz) 101 | - [squeezesegV2](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/squeezesegV2.tar.gz) 102 | - [squeezesegV2 + crf](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/squeezesegV2-crf.tar.gz) 103 | - [darknet21](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/darknet21.tar.gz) 104 | - [darknet53](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/darknet53.tar.gz) 105 | - [darknet53-1024](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/darknet53-1024.tar.gz) 106 | - [darknet53-512](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/darknet53-512.tar.gz) 107 | 108 | With k-NN processing: 109 | - [squeezeseg](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/squeezeseg-knn.tar.gz) 110 | - [squeezesegV2](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/squeezesegV2-knn.tar.gz) 111 | - [darknet53](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/darknet53-knn.tar.gz) 112 | - [darknet21](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/darknet21-knn.tar.gz) 113 | - [darknet53-1024](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/darknet53-1024-knn.tar.gz) 114 | - [darknet53-512](http://www.ipb.uni-bonn.de/html/projects/bonnetal/lidar/semantic/predictions/darknet53-512-knn.tar.gz) 115 | 116 | ## Citations 117 | 118 | If you use our framework, model, or predictions for any academic work, please cite the original [paper](http://www.ipb.uni-bonn.de/wp-content/papercite-data/pdf/milioto2019iros.pdf), and the [dataset](http://semantic-kitti.org). 119 | 120 | ``` 121 | @inproceedings{milioto2019iros, 122 | author = {A. Milioto and I. Vizzo and J. Behley and C. Stachniss}, 123 | title = {{RangeNet++: Fast and Accurate LiDAR Semantic Segmentation}}, 124 | booktitle = {IEEE/RSJ Intl.~Conf.~on Intelligent Robots and Systems (IROS)}, 125 | year = 2019, 126 | codeurl = {https://github.com/PRBonn/lidar-bonnetal}, 127 | videourl = {https://youtu.be/wuokg7MFZyU}, 128 | } 129 | ``` 130 | 131 | ``` 132 | @inproceedings{behley2019iccv, 133 | author = {J. Behley and M. Garbade and A. Milioto and J. Quenzel and S. Behnke and C. Stachniss and J. Gall}, 134 | title = {{SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences}}, 135 | booktitle = {Proc. 
of the IEEE/CVF International Conf.~on Computer Vision (ICCV)}, 136 | year = {2019} 137 | } 138 | ``` -------------------------------------------------------------------------------- /train/tasks/semantic/__init__.py: -------------------------------------------------------------------------------- 1 | import sys 2 | TRAIN_PATH = "../../" 3 | DEPLOY_PATH = "../../../deploy" 4 | sys.path.insert(0, TRAIN_PATH) 5 | -------------------------------------------------------------------------------- /train/tasks/semantic/config/arch/darknet21.yaml: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # training parameters 3 | ################################################################################ 4 | train: 5 | loss: "xentropy" # must be either xentropy or iou 6 | max_epochs: 150 7 | lr: 0.01 # sgd learning rate 8 | wup_epochs: 1 # warmup during first XX epochs (can be float) 9 | momentum: 0.9 # sgd momentum 10 | lr_decay: 0.995 # learning rate decay per epoch after initial cycle (from min lr) 11 | w_decay: 0.0001 # weight decay 12 | batch_size: 2 # batch size 13 | report_batch: 1 # every x batches, report loss 14 | report_epoch: 1 # every x epochs, report validation set 15 | epsilon_w: 0.001 # class weight w = 1 / (content + epsilon_w) 16 | save_summary: False # Summary of weight histograms for tensorboard 17 | save_scans: True # False doesn't save anything, True saves some 18 | # sample images (one per batch of the last calculated batch) 19 | # in log folder 20 | show_scans: False # show scans during training 21 | workers: 12 # number of threads to get data 22 | 23 | ################################################################################ 24 | # backbone parameters 25 | ################################################################################ 26 | backbone: 27 | name: "darknet" # ['squeezeseg', 'squeezesegV2', 'darknet'] 28 | input_depth: 29 | range: True 30 | xyz: True 31 | remission: True 32 | dropout: 0.01 33 | bn_d: 0.01 34 | OS: 32 # output stride (only horizontally) 35 | train: True # train backbone? 36 | extra: 37 | layers: 21 38 | 39 | ################################################################################ 40 | # decoder parameters 41 | ################################################################################ 42 | decoder: 43 | name: "darknet" 44 | dropout: 0.01 45 | bn_d: 0.01 46 | train: True # train decoder? 
47 | extra: False # nothing to add for this decoder, otherwise this is a dict 48 | 49 | ################################################################################ 50 | # classification head parameters 51 | ################################################################################ 52 | head: 53 | name: "segmentation" 54 | train: True 55 | dropout: 0.01 56 | 57 | ################################################################################ 58 | # postproc parameters 59 | ################################################################################ 60 | post: 61 | CRF: 62 | use: False 63 | train: True 64 | params: False # this should be a dict when in use 65 | KNN: 66 | use: False 67 | params: 68 | knn: 5 69 | search: 5 70 | sigma: 1.0 71 | cutoff: 1.0 72 | 73 | ################################################################################ 74 | # classification head parameters 75 | ################################################################################ 76 | # dataset (to find parser) 77 | dataset: 78 | labels: "kitti" 79 | scans: "kitti" 80 | max_points: 150000 # max of any scan in dataset 81 | sensor: 82 | name: "HDL64" 83 | type: "spherical" # projective 84 | fov_up: 3 85 | fov_down: -25 86 | img_prop: 87 | width: 2048 88 | height: 64 89 | img_means: #range,x,y,z,signal 90 | - 12.12 91 | - 10.88 92 | - 0.23 93 | - -1.04 94 | - 0.21 95 | img_stds: #range,x,y,z,signal 96 | - 12.32 97 | - 11.47 98 | - 6.91 99 | - 0.86 100 | - 0.16 101 | -------------------------------------------------------------------------------- /train/tasks/semantic/config/arch/darknet53-1024px.yaml: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # training parameters 3 | ################################################################################ 4 | train: 5 | loss: "xentropy" # must be either xentropy or iou 6 | max_epochs: 150 7 | lr: 0.01 # sgd learning rate 8 | wup_epochs: 1 # warmup during first XX epochs (can be float) 9 | momentum: 0.9 # sgd momentum 10 | lr_decay: 0.99 # learning rate decay per epoch after initial cycle (from min lr) 11 | w_decay: 0.0001 # weight decay 12 | batch_size: 16 # batch size 13 | report_batch: 1 # every x batches, report loss 14 | report_epoch: 1 # every x epochs, report validation set 15 | epsilon_w: 0.001 # class weight w = 1 / (content + epsilon_w) 16 | save_summary: False # Summary of weight histograms for tensorboard 17 | save_scans: True # False doesn't save anything, True saves some 18 | # sample images (one per batch of the last calculated batch) 19 | # in log folder 20 | show_scans: False # show scans during training 21 | workers: 12 # number of threads to get data 22 | 23 | ################################################################################ 24 | # backbone parameters 25 | ################################################################################ 26 | backbone: 27 | name: "darknet" # ['squeezeseg', 'squeezesegV2', 'darknet'] 28 | input_depth: 29 | range: True 30 | xyz: True 31 | remission: True 32 | dropout: 0.05 33 | bn_d: 0.01 34 | OS: 32 # output stride (only horizontally) 35 | train: True # train backbone? 
36 | extra: 37 | layers: 53 38 | 39 | ################################################################################ 40 | # decoder parameters 41 | ################################################################################ 42 | decoder: 43 | name: "darknet" 44 | dropout: 0.05 45 | bn_d: 0.01 46 | train: True # train decoder? 47 | extra: False # nothing to add for this decoder, otherwise this is a dict 48 | 49 | ################################################################################ 50 | # classification head parameters 51 | ################################################################################ 52 | head: 53 | name: "segmentation" 54 | train: True 55 | dropout: 0.05 56 | 57 | ################################################################################ 58 | # postproc parameters 59 | ################################################################################ 60 | post: 61 | CRF: 62 | use: False 63 | train: True 64 | params: False # this should be a dict when in use 65 | KNN: 66 | use: False 67 | params: 68 | knn: 5 69 | search: 5 70 | sigma: 1.0 71 | cutoff: 1.0 72 | 73 | ################################################################################ 74 | # classification head parameters 75 | ################################################################################ 76 | # dataset (to find parser) 77 | dataset: 78 | labels: "kitti" 79 | scans: "kitti" 80 | max_points: 150000 # max of any scan in dataset 81 | sensor: 82 | name: "HDL64" 83 | type: "spherical" # projective 84 | fov_up: 3 85 | fov_down: -25 86 | img_prop: 87 | width: 1024 88 | height: 64 89 | img_means: #range,x,y,z,signal 90 | - 12.12 91 | - 10.88 92 | - 0.23 93 | - -1.04 94 | - 0.21 95 | img_stds: #range,x,y,z,signal 96 | - 12.32 97 | - 11.47 98 | - 6.91 99 | - 0.86 100 | - 0.16 101 | -------------------------------------------------------------------------------- /train/tasks/semantic/config/arch/darknet53-512px.yaml: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # training parameters 3 | ################################################################################ 4 | train: 5 | loss: "xentropy" # must be either xentropy or iou 6 | max_epochs: 150 7 | lr: 0.01 # sgd learning rate 8 | wup_epochs: 1 # warmup during first XX epochs (can be float) 9 | momentum: 0.9 # sgd momentum 10 | lr_decay: 0.99 # learning rate decay per epoch after initial cycle (from min lr) 11 | w_decay: 0.0001 # weight decay 12 | batch_size: 32 # batch size 13 | report_batch: 1 # every x batches, report loss 14 | report_epoch: 1 # every x epochs, report validation set 15 | epsilon_w: 0.001 # class weight w = 1 / (content + epsilon_w) 16 | save_summary: False # Summary of weight histograms for tensorboard 17 | save_scans: True # False doesn't save anything, True saves some 18 | # sample images (one per batch of the last calculated batch) 19 | # in log folder 20 | show_scans: False # show scans during training 21 | workers: 12 # number of threads to get data 22 | 23 | ################################################################################ 24 | # backbone parameters 25 | ################################################################################ 26 | backbone: 27 | name: "darknet" # ['squeezeseg', 'squeezesegV2', 'darknet'] 28 | input_depth: 29 | range: True 30 | xyz: True 31 | remission: True 32 | dropout: 0.05 33 | bn_d: 0.01 34 | OS: 32 # output stride (only horizontally) 
35 | train: True # train backbone? 36 | extra: 37 | layers: 53 38 | 39 | ################################################################################ 40 | # decoder parameters 41 | ################################################################################ 42 | decoder: 43 | name: "darknet" 44 | dropout: 0.05 45 | bn_d: 0.01 46 | train: True # train decoder? 47 | extra: False # nothing to add for this decoder, otherwise this is a dict 48 | 49 | ################################################################################ 50 | # classification head parameters 51 | ################################################################################ 52 | head: 53 | name: "segmentation" 54 | train: True 55 | dropout: 0.05 56 | 57 | ################################################################################ 58 | # postproc parameters 59 | ################################################################################ 60 | post: 61 | CRF: 62 | use: False 63 | train: True 64 | params: False # this should be a dict when in use 65 | KNN: 66 | use: False 67 | params: 68 | knn: 5 69 | search: 5 70 | sigma: 1.0 71 | cutoff: 1.0 72 | 73 | ################################################################################ 74 | # classification head parameters 75 | ################################################################################ 76 | # dataset (to find parser) 77 | dataset: 78 | labels: "kitti" 79 | scans: "kitti" 80 | max_points: 150000 # max of any scan in dataset 81 | sensor: 82 | name: "HDL64" 83 | type: "spherical" # projective 84 | fov_up: 3 85 | fov_down: -25 86 | img_prop: 87 | width: 512 88 | height: 64 89 | img_means: #range,x,y,z,signal 90 | - 12.12 91 | - 10.88 92 | - 0.23 93 | - -1.04 94 | - 0.21 95 | img_stds: #range,x,y,z,signal 96 | - 12.32 97 | - 11.47 98 | - 6.91 99 | - 0.86 100 | - 0.16 101 | -------------------------------------------------------------------------------- /train/tasks/semantic/config/arch/darknet53-crf-1024px.yaml: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # training parameters 3 | ################################################################################ 4 | train: 5 | loss: "xentropy" # must be either xentropy or iou 6 | max_epochs: 150 7 | lr: 0.001 # sgd learning rate 8 | wup_epochs: 1 # warmup during first XX epochs (can be float) 9 | momentum: 0.9 # sgd momentum 10 | lr_decay: 0.99 # learning rate decay per epoch after initial cycle (from min lr) 11 | w_decay: 0.0001 # weight decay 12 | batch_size: 12 # batch size 13 | report_batch: 1 # every x batches, report loss 14 | report_epoch: 1 # every x epochs, report validation set 15 | epsilon_w: 0.001 # class weight w = 1 / (content + epsilon_w) 16 | save_summary: False # Summary of weight histograms for tensorboard 17 | save_scans: True # False doesn't save anything, True saves some 18 | # sample images (one per batch of the last calculated batch) 19 | # in log folder 20 | show_scans: False # show scans during training 21 | workers: 12 # number of threads to get data 22 | 23 | ################################################################################ 24 | # backbone parameters 25 | ################################################################################ 26 | backbone: 27 | name: "darknet" # ['squeezeseg', 'squeezesegV2', 'darknet'] 28 | input_depth: 29 | range: True 30 | xyz: True 31 | remission: True 32 | dropout: 0.05 33 | bn_d: 0.01 34 | OS: 
32 # output stride (only horizontally) 35 | train: True # train backbone? 36 | extra: 37 | layers: 53 38 | 39 | ################################################################################ 40 | # decoder parameters 41 | ################################################################################ 42 | decoder: 43 | name: "darknet" 44 | dropout: 0.05 45 | bn_d: 0.01 46 | train: True # train decoder? 47 | extra: False # nothing to add for this decoder, otherwise this is a dict 48 | 49 | ################################################################################ 50 | # classification head parameters 51 | ################################################################################ 52 | head: 53 | name: "segmentation" 54 | train: True 55 | dropout: 0.05 56 | 57 | ################################################################################ 58 | # postproc parameters 59 | ################################################################################ 60 | post: 61 | CRF: 62 | use: True 63 | train: True 64 | params: 65 | iter: 3 66 | lcn_size: 67 | h: 3 68 | w: 5 69 | xyz_coef: 0.1 70 | xyz_sigma: 0.7 71 | KNN: 72 | use: False 73 | params: 74 | knn: 5 75 | search: 5 76 | sigma: 1.0 77 | cutoff: 1.0 78 | 79 | ################################################################################ 80 | # classification head parameters 81 | ################################################################################ 82 | # dataset (to find parser) 83 | dataset: 84 | labels: "kitti" 85 | scans: "kitti" 86 | max_points: 150000 # max of any scan in dataset 87 | sensor: 88 | name: "HDL64" 89 | type: "spherical" # projective 90 | fov_up: 3 91 | fov_down: -25 92 | img_prop: 93 | width: 1024 94 | height: 64 95 | img_means: #range,x,y,z,signal 96 | - 12.12 97 | - 10.88 98 | - 0.23 99 | - -1.04 100 | - 0.21 101 | img_stds: #range,x,y,z,signal 102 | - 12.32 103 | - 11.47 104 | - 6.91 105 | - 0.86 106 | - 0.16 107 | -------------------------------------------------------------------------------- /train/tasks/semantic/config/arch/darknet53-crf-512px.yaml: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # training parameters 3 | ################################################################################ 4 | train: 5 | loss: "xentropy" # must be either xentropy or iou 6 | max_epochs: 150 7 | lr: 0.001 # sgd learning rate 8 | wup_epochs: 1 # warmup during first XX epochs (can be float) 9 | momentum: 0.9 # sgd momentum 10 | lr_decay: 0.99 # learning rate decay per epoch after initial cycle (from min lr) 11 | w_decay: 0.0001 # weight decay 12 | batch_size: 12 # batch size 13 | report_batch: 1 # every x batches, report loss 14 | report_epoch: 1 # every x epochs, report validation set 15 | epsilon_w: 0.001 # class weight w = 1 / (content + epsilon_w) 16 | save_summary: False # Summary of weight histograms for tensorboard 17 | save_scans: True # False doesn't save anything, True saves some 18 | # sample images (one per batch of the last calculated batch) 19 | # in log folder 20 | show_scans: True # show scans during training 21 | workers: 12 # number of threads to get data 22 | 23 | ################################################################################ 24 | # backbone parameters 25 | ################################################################################ 26 | backbone: 27 | name: "darknet" # ['squeezeseg', 'squeezesegV2', 'darknet'] 28 | input_depth: 29 | 
range: True 30 | xyz: True 31 | remission: True 32 | dropout: 0.05 33 | bn_d: 0.01 34 | OS: 32 # output stride (only horizontally) 35 | train: True # train backbone? 36 | extra: 37 | layers: 53 38 | 39 | ################################################################################ 40 | # decoder parameters 41 | ################################################################################ 42 | decoder: 43 | name: "darknet" 44 | dropout: 0.05 45 | bn_d: 0.01 46 | train: True # train decoder? 47 | extra: False # nothing to add for this decoder, otherwise this is a dict 48 | 49 | ################################################################################ 50 | # classification head parameters 51 | ################################################################################ 52 | head: 53 | name: "segmentation" 54 | train: True 55 | dropout: 0.05 56 | 57 | ################################################################################ 58 | # postproc parameters 59 | ################################################################################ 60 | post: 61 | CRF: 62 | use: True 63 | train: True 64 | params: 65 | iter: 3 66 | lcn_size: 67 | h: 3 68 | w: 5 69 | xyz_coef: 0.1 70 | xyz_sigma: 0.7 71 | KNN: 72 | use: False 73 | params: 74 | knn: 5 75 | search: 5 76 | sigma: 1.0 77 | cutoff: 1.0 78 | 79 | ################################################################################ 80 | # classification head parameters 81 | ################################################################################ 82 | # dataset (to find parser) 83 | dataset: 84 | labels: "kitti" 85 | scans: "kitti" 86 | max_points: 150000 # max of any scan in dataset 87 | sensor: 88 | name: "HDL64" 89 | type: "spherical" # projective 90 | fov_up: 3 91 | fov_down: -25 92 | img_prop: 93 | width: 512 94 | height: 64 95 | img_means: #range,x,y,z,signal 96 | - 12.12 97 | - 10.88 98 | - 0.23 99 | - -1.04 100 | - 0.21 101 | img_stds: #range,x,y,z,signal 102 | - 12.32 103 | - 11.47 104 | - 6.91 105 | - 0.86 106 | - 0.16 107 | -------------------------------------------------------------------------------- /train/tasks/semantic/config/arch/darknet53-crf.yaml: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # training parameters 3 | ################################################################################ 4 | train: 5 | loss: "xentropy" # must be either xentropy or iou 6 | max_epochs: 150 7 | lr: 0.002 # sgd learning rate 8 | wup_epochs: 1 # warmup during first XX epochs (can be float) 9 | momentum: 0.9 # sgd momentum 10 | lr_decay: 0.99 # learning rate decay per epoch after initial cycle (from min lr) 11 | w_decay: 0.0001 # weight decay 12 | batch_size: 8 # batch size 13 | report_batch: 1 # every x batches, report loss 14 | report_epoch: 1 # every x epochs, report validation set 15 | epsilon_w: 0.001 # class weight w = 1 / (content + epsilon_w) 16 | save_summary: False # Summary of weight histograms for tensorboard 17 | save_scans: True # False doesn't save anything, True saves some 18 | # sample images (one per batch of the last calculated batch) 19 | # in log folder 20 | show_scans: False # show scans during training 21 | workers: 12 # number of threads to get data 22 | 23 | ################################################################################ 24 | # backbone parameters 25 | ################################################################################ 26 | backbone: 27 | 
name: "darknet" # ['squeezeseg', 'squeezesegV2', 'darknet'] 28 | input_depth: 29 | range: True 30 | xyz: True 31 | remission: True 32 | dropout: 0.01 33 | bn_d: 0.01 34 | OS: 32 # output stride (only horizontally) 35 | train: True # train backbone? 36 | extra: 37 | layers: 53 38 | 39 | ################################################################################ 40 | # decoder parameters 41 | ################################################################################ 42 | decoder: 43 | name: "darknet" 44 | dropout: 0.01 45 | bn_d: 0.01 46 | train: True # train decoder? 47 | extra: False # nothing to add for this decoder, otherwise this is a dict 48 | 49 | ################################################################################ 50 | # classification head parameters 51 | ################################################################################ 52 | head: 53 | name: "segmentation" 54 | train: True 55 | dropout: 0.01 56 | 57 | ################################################################################ 58 | # postproc parameters 59 | ################################################################################ 60 | post: 61 | CRF: 62 | use: True 63 | train: True 64 | params: 65 | iter: 3 66 | lcn_size: 67 | h: 3 68 | w: 5 69 | xyz_coef: 0.1 70 | xyz_sigma: 0.7 71 | KNN: 72 | use: False 73 | params: 74 | knn: 5 75 | search: 5 76 | sigma: 1.0 77 | cutoff: 1.0 78 | 79 | ################################################################################ 80 | # classification head parameters 81 | ################################################################################ 82 | # dataset (to find parser) 83 | dataset: 84 | labels: "kitti" 85 | scans: "kitti" 86 | max_points: 150000 # max of any scan in dataset 87 | sensor: 88 | name: "HDL64" 89 | type: "spherical" # projective 90 | fov_up: 3 91 | fov_down: -25 92 | img_prop: 93 | width: 2048 94 | height: 64 95 | img_means: #range,x,y,z,signal 96 | - 12.12 97 | - 10.88 98 | - 0.23 99 | - -1.04 100 | - 0.21 101 | img_stds: #range,x,y,z,signal 102 | - 12.32 103 | - 11.47 104 | - 6.91 105 | - 0.86 106 | - 0.16 107 | -------------------------------------------------------------------------------- /train/tasks/semantic/config/arch/darknet53.yaml: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # training parameters 3 | ################################################################################ 4 | train: 5 | loss: "xentropy" # must be either xentropy or iou 6 | max_epochs: 150 7 | lr: 0.005 # sgd learning rate 8 | wup_epochs: 1 # warmup during first XX epochs (can be float) 9 | momentum: 0.9 # sgd momentum 10 | lr_decay: 0.99 # learning rate decay per epoch after initial cycle (from min lr) 11 | w_decay: 0.0001 # weight decay 12 | batch_size: 8 # batch size 13 | report_batch: 1 # every x batches, report loss 14 | report_epoch: 1 # every x epochs, report validation set 15 | epsilon_w: 0.001 # class weight w = 1 / (content + epsilon_w) 16 | save_summary: False # Summary of weight histograms for tensorboard 17 | save_scans: True # False doesn't save anything, True saves some 18 | # sample images (one per batch of the last calculated batch) 19 | # in log folder 20 | show_scans: False # show scans during training 21 | workers: 12 # number of threads to get data 22 | 23 | ################################################################################ 24 | # backbone parameters 25 | 
################################################################################ 26 | backbone: 27 | name: "darknet" # ['squeezeseg', 'squeezesegV2', 'darknet'] 28 | input_depth: 29 | range: True 30 | xyz: True 31 | remission: True 32 | dropout: 0.01 33 | bn_d: 0.01 34 | OS: 32 # output stride (only horizontally) 35 | train: True # train backbone? 36 | extra: 37 | layers: 53 38 | 39 | ################################################################################ 40 | # decoder parameters 41 | ################################################################################ 42 | decoder: 43 | name: "darknet" 44 | dropout: 0.01 45 | bn_d: 0.01 46 | train: True # train decoder? 47 | extra: False # nothing to add for this decoder, otherwise this is a dict 48 | 49 | ################################################################################ 50 | # classification head parameters 51 | ################################################################################ 52 | head: 53 | name: "segmentation" 54 | train: True 55 | dropout: 0.01 56 | 57 | ################################################################################ 58 | # postproc parameters 59 | ################################################################################ 60 | post: 61 | CRF: 62 | use: False 63 | train: True 64 | params: False # this should be a dict when in use 65 | KNN: 66 | use: False 67 | params: 68 | knn: 5 69 | search: 5 70 | sigma: 1.0 71 | cutoff: 1.0 72 | 73 | ################################################################################ 74 | # classification head parameters 75 | ################################################################################ 76 | # dataset (to find parser) 77 | dataset: 78 | labels: "kitti" 79 | scans: "kitti" 80 | max_points: 150000 # max of any scan in dataset 81 | sensor: 82 | name: "HDL64" 83 | type: "spherical" # projective 84 | fov_up: 3 85 | fov_down: -25 86 | img_prop: 87 | width: 2048 88 | height: 64 89 | img_means: #range,x,y,z,signal 90 | - 12.12 91 | - 10.88 92 | - 0.23 93 | - -1.04 94 | - 0.21 95 | img_stds: #range,x,y,z,signal 96 | - 12.32 97 | - 11.47 98 | - 6.91 99 | - 0.86 100 | - 0.16 101 | -------------------------------------------------------------------------------- /train/tasks/semantic/config/arch/squeezeseg.yaml: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # training parameters 3 | ################################################################################ 4 | train: 5 | loss: "xentropy" # must be either xentropy or iou 6 | max_epochs: 150 7 | lr: 0.01 # sgd learning rate 8 | wup_epochs: 1 # warmup during first XX epochs (can be float) 9 | momentum: 0.9 # sgd momentum 10 | lr_decay: 0.995 # learning rate decay per epoch after initial cycle (from min lr) 11 | w_decay: 0.0001 # weight decay 12 | batch_size: 36 # batch size 13 | report_batch: 1 # every x batches, report loss 14 | report_epoch: 1 # every x epochs, report validation set 15 | epsilon_w: 0.001 # class weight w = 1 / (content + epsilon_w) 16 | save_summary: False # Summary of weight histograms for tensorboard 17 | save_scans: True # False doesn't save anything, True saves some 18 | # sample images (one per batch of the last calculated batch) 19 | # in log folder 20 | show_scans: False # show scans during training 21 | workers: 12 # number of threads to get data 22 | 23 | ################################################################################ 24 
| # backbone parameters 25 | ################################################################################ 26 | backbone: 27 | name: "squeezeseg" # ['squeezeseg', 'squeezesegV2', 'darknet'] 28 | input_depth: 29 | range: True 30 | xyz: True 31 | remission: True 32 | dropout: 0.01 33 | OS: 16 # output stride (only horizontally) 34 | train: True # train backbone? 35 | extra: False 36 | 37 | ################################################################################ 38 | # decoder parameters 39 | ################################################################################ 40 | decoder: 41 | name: "squeezeseg" 42 | dropout: 0.01 43 | train: True # train decoder? 44 | extra: False # nothing to add for this decoder, otherwise this is a dict 45 | 46 | ################################################################################ 47 | # classification head parameters 48 | ################################################################################ 49 | head: 50 | name: "segmentation" 51 | train: True 52 | dropout: 0.3 53 | 54 | ################################################################################ 55 | # postproc parameters 56 | ################################################################################ 57 | post: 58 | CRF: 59 | use: False 60 | train: True 61 | params: False # this should be a dict when in use 62 | KNN: 63 | use: False 64 | params: 65 | knn: 5 66 | search: 5 67 | sigma: 1.0 68 | cutoff: 1.0 69 | 70 | ################################################################################ 71 | # classification head parameters 72 | ################################################################################ 73 | # dataset (to find parser) 74 | dataset: 75 | labels: "kitti" 76 | scans: "kitti" 77 | max_points: 150000 # max of any scan in dataset 78 | sensor: 79 | name: "HDL64" 80 | type: "spherical" # projective 81 | fov_up: 3 82 | fov_down: -25 83 | img_prop: 84 | width: 2048 85 | height: 64 86 | img_means: #range,x,y,z,signal 87 | - 12.12 88 | - 10.88 89 | - 0.23 90 | - -1.04 91 | - 0.21 92 | img_stds: #range,x,y,z,signal 93 | - 12.32 94 | - 11.47 95 | - 6.91 96 | - 0.86 97 | - 0.16 98 | -------------------------------------------------------------------------------- /train/tasks/semantic/config/arch/squeezesegV2.yaml: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # training parameters 3 | ################################################################################ 4 | train: 5 | loss: "xentropy" # must be either xentropy or iou 6 | max_epochs: 150 7 | lr: 0.002 # sgd learning rate 8 | wup_epochs: 1 # warmup during first XX epochs (can be float) 9 | momentum: 0.9 # sgd momentum 10 | lr_decay: 0.99 # learning rate decay per epoch after initial cycle (from min lr) 11 | w_decay: 0.0001 # weight decay 12 | batch_size: 8 # batch size 13 | report_batch: 1 # every x batches, report loss 14 | report_epoch: 1 # every x epochs, report validation set 15 | epsilon_w: 0.001 # class weight w = 1 / (content + epsilon_w) 16 | save_summary: False # Summary of weight histograms for tensorboard 17 | save_scans: True # False doesn't save anything, True saves some 18 | # sample images (one per batch of the last calculated batch) 19 | # in log folder 20 | show_scans: False # show scans during training 21 | workers: 12 # number of threads to get data 22 | 23 | ################################################################################ 24 | # 
backbone parameters 25 | ################################################################################ 26 | backbone: 27 | name: "squeezesegV2" # ['squeezeseg', 'squeezesegV2', 'darknet'] 28 | input_depth: 29 | range: True 30 | xyz: True 31 | remission: True 32 | dropout: 0.01 33 | OS: 16 # output stride (only horizontally) 34 | bn_d: 0.01 35 | train: True # train backbone? 36 | extra: False 37 | 38 | ################################################################################ 39 | # decoder parameters 40 | ################################################################################ 41 | decoder: 42 | name: "squeezesegV2" 43 | dropout: 0.01 44 | bn_d: 0.01 45 | train: True # train decoder? 46 | extra: False # nothing to add for this decoder, otherwise this is a dict 47 | 48 | ################################################################################ 49 | # classification head parameters 50 | ################################################################################ 51 | head: 52 | name: "segmentation" 53 | train: True 54 | dropout: 0.01 55 | 56 | ################################################################################ 57 | # postproc parameters 58 | ################################################################################ 59 | post: 60 | CRF: 61 | use: False 62 | train: True 63 | params: False # this should be a dict when in use 64 | KNN: 65 | use: False 66 | params: 67 | knn: 5 68 | search: 5 69 | sigma: 1.0 70 | cutoff: 1.0 71 | 72 | ################################################################################ 73 | # classification head parameters 74 | ################################################################################ 75 | # dataset (to find parser) 76 | dataset: 77 | labels: "kitti" 78 | scans: "kitti" 79 | max_points: 150000 # max of any scan in dataset 80 | sensor: 81 | name: "HDL64" 82 | type: "spherical" # projective 83 | fov_up: 3 84 | fov_down: -25 85 | img_prop: 86 | width: 2048 87 | height: 64 88 | img_means: #range,x,y,z,signal 89 | - 12.12 90 | - 10.88 91 | - 0.23 92 | - -1.04 93 | - 0.21 94 | img_stds: #range,x,y,z,signal 95 | - 12.32 96 | - 11.47 97 | - 6.91 98 | - 0.86 99 | - 0.16 100 | -------------------------------------------------------------------------------- /train/tasks/semantic/config/arch/squeezesegV2_crf.yaml: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # training parameters 3 | ################################################################################ 4 | train: 5 | loss: "xentropy" # must be either xentropy or iou 6 | max_epochs: 150 7 | lr: 0.001 # sgd learning rate 8 | wup_epochs: 0.01 # warmup during first XX epochs (can be float) 9 | momentum: 0.9 # sgd momentum 10 | lr_decay: 0.99 # learning rate decay per epoch after initial cycle (from min lr) 11 | w_decay: 0.0001 # weight decay 12 | batch_size: 8 # batch size 13 | report_batch: 1 # every x batches, report loss 14 | report_epoch: 1 # every x epochs, report validation set 15 | epsilon_w: 0.001 # class weight w = 1 / (content + epsilon_w) 16 | save_summary: False # Summary of weight histograms for tensorboard 17 | save_scans: True # False doesn't save anything, True saves some 18 | # sample images (one per batch of the last calculated batch) 19 | # in log folder 20 | show_scans: False # show scans during training 21 | workers: 12 # number of threads to get data 22 | 23 | 
################################################################################ 24 | # backbone parameters 25 | ################################################################################ 26 | backbone: 27 | name: "squeezesegV2" # ['squeezeseg', 'squeezesegV2', 'darknet'] 28 | input_depth: 29 | range: True 30 | xyz: True 31 | remission: True 32 | dropout: 0.01 33 | OS: 16 # output stride (only horizontally) 34 | bn_d: 0.01 35 | train: True # train backbone? 36 | extra: False 37 | 38 | ################################################################################ 39 | # decoder parameters 40 | ################################################################################ 41 | decoder: 42 | name: "squeezesegV2" 43 | dropout: 0.01 44 | bn_d: 0.01 45 | train: True # train decoder? 46 | extra: False # nothing to add for this decoder, otherwise this is a dict 47 | 48 | ################################################################################ 49 | # classification head parameters 50 | ################################################################################ 51 | head: 52 | name: "segmentation" 53 | train: True 54 | dropout: 0.01 55 | 56 | ################################################################################ 57 | # postproc parameters 58 | ################################################################################ 59 | post: 60 | CRF: 61 | use: True 62 | train: True 63 | params: 64 | iter: 3 65 | lcn_size: 66 | h: 3 67 | w: 5 68 | xyz_coef: 0.1 69 | xyz_sigma: 0.7 70 | KNN: 71 | use: False 72 | params: 73 | knn: 5 74 | search: 5 75 | sigma: 1.0 76 | cutoff: 1.0 77 | 78 | ################################################################################ 79 | # classification head parameters 80 | ################################################################################ 81 | # dataset (to find parser) 82 | dataset: 83 | labels: "kitti" 84 | scans: "kitti" 85 | max_points: 150000 # max of any scan in dataset 86 | sensor: 87 | name: "HDL64" 88 | type: "spherical" # projective 89 | fov_up: 3 90 | fov_down: -25 91 | img_prop: 92 | width: 2048 93 | height: 64 94 | img_means: #range,x,y,z,signal 95 | - 12.12 96 | - 10.88 97 | - 0.23 98 | - -1.04 99 | - 0.21 100 | img_stds: #range,x,y,z,signal 101 | - 12.32 102 | - 11.47 103 | - 6.91 104 | - 0.86 105 | - 0.16 106 | -------------------------------------------------------------------------------- /train/tasks/semantic/config/arch/squeezeseg_crf.yaml: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # training parameters 3 | ################################################################################ 4 | train: 5 | loss: "xentropy" # must be either xentropy or iou 6 | max_epochs: 150 7 | lr: 0.001 # sgd learning rate 8 | wup_epochs: 0.01 # warmup during first XX epochs (can be float) 9 | momentum: 0.9 # sgd momentum 10 | lr_decay: 0.99 # learning rate decay per epoch after initial cycle (from min lr) 11 | w_decay: 0.0001 # weight decay 12 | batch_size: 8 # batch size 13 | report_batch: 1 # every x batches, report loss 14 | report_epoch: 1 # every x epochs, report validation set 15 | epsilon_w: 0.001 # class weight w = 1 / (content + epsilon_w) 16 | save_summary: False # Summary of weight histograms for tensorboard 17 | save_scans: True # False doesn't save anything, True saves some 18 | # sample images (one per batch of the last calculated batch) 19 | # in log folder 20 | show_scans: True # 
show scans during training 21 | workers: 4 # number of threads to get data 22 | 23 | ################################################################################ 24 | # backbone parameters 25 | ################################################################################ 26 | backbone: 27 | name: "squeezeseg" # ['squeezeseg', 'squeezesegV2', 'darknet'] 28 | input_depth: 29 | range: True 30 | xyz: True 31 | remission: True 32 | dropout: 0.01 33 | OS: 16 # output stride (only horizontally) 34 | train: True # train backbone? 35 | extra: False 36 | 37 | ################################################################################ 38 | # decoder parameters 39 | ################################################################################ 40 | decoder: 41 | name: "squeezeseg" 42 | dropout: 0.01 43 | train: True # train decoder? 44 | extra: False # nothing to add for this decoder, otherwise this is a dict 45 | 46 | ################################################################################ 47 | # classification head parameters 48 | ################################################################################ 49 | head: 50 | name: "segmentation" 51 | train: True 52 | dropout: 0.01 53 | 54 | ################################################################################ 55 | # postproc parameters 56 | ################################################################################ 57 | post: 58 | CRF: 59 | use: True 60 | train: True 61 | params: 62 | iter: 3 63 | lcn_size: 64 | h: 3 65 | w: 5 66 | xyz_coef: 0.1 67 | xyz_sigma: 0.7 68 | KNN: 69 | use: False 70 | params: 71 | knn: 5 72 | search: 5 73 | sigma: 1.0 74 | cutoff: 1.0 75 | 76 | ################################################################################ 77 | # classification head parameters 78 | ################################################################################ 79 | # dataset (to find parser) 80 | dataset: 81 | labels: "kitti" 82 | scans: "kitti" 83 | max_points: 150000 # max of any scan in dataset 84 | sensor: 85 | name: "HDL64" 86 | type: "spherical" # projective 87 | fov_up: 3 88 | fov_down: -25 89 | img_prop: 90 | width: 2048 91 | height: 64 92 | img_means: #range,x,y,z,signal 93 | - 12.12 94 | - 10.88 95 | - 0.23 96 | - -1.04 97 | - 0.21 98 | img_stds: #range,x,y,z,signal 99 | - 12.32 100 | - 11.47 101 | - 6.91 102 | - 0.86 103 | - 0.16 104 | -------------------------------------------------------------------------------- /train/tasks/semantic/config/labels/semantic-kitti-all.yaml: -------------------------------------------------------------------------------- 1 | # This file is covered by the LICENSE file in the root of this project. 
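Before the label definitions that follow, note what the `sensor` block shared by all the arch configs above implies: a vertical field of view from +3 degrees up to -25 degrees down, rasterized to 64 rows by 2048 columns. The spherical projection that produces those row/column indices is typically computed as below; this is a sketch consistent with those parameters, not a copy of the repo's `common/laserscan.py`:

```python
import numpy as np

def project(points, fov_up_deg=3.0, fov_down_deg=-25.0, H=64, W=2048):
    """points: [N, 3] xyz. Returns per-point (row, col) indices in the
    range image, following the usual spherical range-image projection."""
    fov_up = fov_up_deg / 180.0 * np.pi
    fov_down = fov_down_deg / 180.0 * np.pi
    fov = abs(fov_down) + abs(fov_up)          # total vertical field of view

    depth = np.linalg.norm(points, axis=1)     # range of each point
    x, y, z = points[:, 0], points[:, 1], points[:, 2]

    yaw = -np.arctan2(y, x)                    # horizontal angle
    pitch = np.arcsin(z / depth)               # vertical angle

    proj_x = 0.5 * (yaw / np.pi + 1.0) * W                 # in [0, W]
    proj_y = (1.0 - (pitch + abs(fov_down)) / fov) * H     # in [0, H]

    proj_x = np.clip(np.floor(proj_x), 0, W - 1).astype(np.int32)
    proj_y = np.clip(np.floor(proj_y), 0, H - 1).astype(np.int32)
    return proj_y, proj_x

rows, cols = project(np.array([[10.0, 0.0, 0.0]]))
print(rows, cols)  # a point straight ahead lands near mid-image: [6] [1024]
```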
2 | name: "kitti" 3 | labels: 4 | 0 : "unlabeled" 5 | 1 : "outlier" 6 | 10: "car" 7 | 11: "bicycle" 8 | 13: "bus" 9 | 15: "motorcycle" 10 | 16: "on-rails" 11 | 18: "truck" 12 | 20: "other-vehicle" 13 | 30: "person" 14 | 31: "bicyclist" 15 | 32: "motorcyclist" 16 | 40: "road" 17 | 44: "parking" 18 | 48: "sidewalk" 19 | 49: "other-ground" 20 | 50: "building" 21 | 51: "fence" 22 | 52: "other-structure" 23 | 60: "lane-marking" 24 | 70: "vegetation" 25 | 71: "trunk" 26 | 72: "terrain" 27 | 80: "pole" 28 | 81: "traffic-sign" 29 | 99: "other-object" 30 | 252: "moving-car" 31 | 253: "moving-bicyclist" 32 | 254: "moving-person" 33 | 255: "moving-motorcyclist" 34 | 256: "moving-on-rails" 35 | 257: "moving-bus" 36 | 258: "moving-truck" 37 | 259: "moving-other-vehicle" 38 | color_map: # bgr 39 | 0 : [0, 0, 0] 40 | 1 : [0, 0, 255] 41 | 10: [245, 150, 100] 42 | 11: [245, 230, 100] 43 | 13: [250, 80, 100] 44 | 15: [150, 60, 30] 45 | 16: [255, 0, 0] 46 | 18: [180, 30, 80] 47 | 20: [255, 0, 0] 48 | 30: [30, 30, 255] 49 | 31: [200, 40, 255] 50 | 32: [90, 30, 150] 51 | 40: [255, 0, 255] 52 | 44: [255, 150, 255] 53 | 48: [75, 0, 75] 54 | 49: [75, 0, 175] 55 | 50: [0, 200, 255] 56 | 51: [50, 120, 255] 57 | 52: [0, 150, 255] 58 | 60: [170, 255, 150] 59 | 70: [0, 175, 0] 60 | 71: [0, 60, 135] 61 | 72: [80, 240, 150] 62 | 80: [150, 240, 255] 63 | 81: [0, 0, 255] 64 | 99: [255, 255, 50] 65 | 252: [245, 150, 100] 66 | 256: [255, 0, 0] 67 | 253: [200, 40, 255] 68 | 254: [30, 30, 255] 69 | 255: [90, 30, 150] 70 | 257: [250, 80, 100] 71 | 258: [180, 30, 80] 72 | 259: [255, 0, 0] 73 | content: # as a ratio with the total number of points 74 | 0: 0.018889854628292943 75 | 1: 0.0002937197336781505 76 | 10: 0.040818519255974316 77 | 11: 0.00016609538710764618 78 | 13: 2.7879693665067774e-05 79 | 15: 0.00039838616015114444 80 | 16: 0.0 81 | 18: 0.0020633612104619787 82 | 20: 0.0016218197275284021 83 | 30: 0.00017698551338515307 84 | 31: 1.1065903904919655e-08 85 | 32: 5.532951952459828e-09 86 | 40: 0.1987493871255525 87 | 44: 0.014717169549888214 88 | 48: 0.14392298360372 89 | 49: 0.0039048553037472045 90 | 50: 0.1326861944777486 91 | 51: 0.0723592229456223 92 | 52: 0.002395131480328884 93 | 60: 4.7084144280367186e-05 94 | 70: 0.26681502148037506 95 | 71: 0.006035012012626033 96 | 72: 0.07814222006271769 97 | 80: 0.002855498193863172 98 | 81: 0.0006155958086189918 99 | 99: 0.009923127583046915 100 | 252: 0.001789309418528068 101 | 253: 0.00012709999297008662 102 | 254: 0.00016059776092534436 103 | 255: 3.745553104802113e-05 104 | 256: 0.0 105 | 257: 0.00011351574470342043 106 | 258: 0.00010157861367183268 107 | 259: 4.3840131989471124e-05 108 | # classes that are indistinguishable from single scan or inconsistent in 109 | # ground truth are mapped to their closest equivalent 110 | learning_map: 111 | 0 : 0 # "unlabeled" 112 | 1 : 0 # "outlier" mapped to "unlabeled" --------------------------mapped 113 | 10: 1 # "car" 114 | 11: 2 # "bicycle" 115 | 13: 5 # "bus" mapped to "other-vehicle" --------------------------mapped 116 | 15: 3 # "motorcycle" 117 | 16: 5 # "on-rails" mapped to "other-vehicle" ---------------------mapped 118 | 18: 4 # "truck" 119 | 20: 5 # "other-vehicle" 120 | 30: 6 # "person" 121 | 31: 7 # "bicyclist" 122 | 32: 8 # "motorcyclist" 123 | 40: 9 # "road" 124 | 44: 10 # "parking" 125 | 48: 11 # "sidewalk" 126 | 49: 12 # "other-ground" 127 | 50: 13 # "building" 128 | 51: 14 # "fence" 129 | 52: 0 # "other-structure" mapped to "unlabeled" ------------------mapped 130 | 60: 9 # "lane-marking" to "road" 
---------------------------------mapped 131 | 70: 15 # "vegetation" 132 | 71: 16 # "trunk" 133 | 72: 17 # "terrain" 134 | 80: 18 # "pole" 135 | 81: 19 # "traffic-sign" 136 | 99: 0 # "other-object" to "unlabeled" ----------------------------mapped 137 | 252: 20 # "moving-car" 138 | 253: 21 # "moving-bicyclist" 139 | 254: 22 # "moving-person" 140 | 255: 23 # "moving-motorcyclist" 141 | 256: 24 # "moving-on-rails" mapped to "moving-other-vehicle" ------mapped 142 | 257: 24 # "moving-bus" mapped to "moving-other-vehicle" -----------mapped 143 | 258: 25 # "moving-truck" 144 | 259: 24 # "moving-other-vehicle" 145 | learning_map_inv: # inverse of previous map 146 | 0: 0 # "unlabeled", and others ignored 147 | 1: 10 # "car" 148 | 2: 11 # "bicycle" 149 | 3: 15 # "motorcycle" 150 | 4: 18 # "truck" 151 | 5: 20 # "other-vehicle" 152 | 6: 30 # "person" 153 | 7: 31 # "bicyclist" 154 | 8: 32 # "motorcyclist" 155 | 9: 40 # "road" 156 | 10: 44 # "parking" 157 | 11: 48 # "sidewalk" 158 | 12: 49 # "other-ground" 159 | 13: 50 # "building" 160 | 14: 51 # "fence" 161 | 15: 70 # "vegetation" 162 | 16: 71 # "trunk" 163 | 17: 72 # "terrain" 164 | 18: 80 # "pole" 165 | 19: 81 # "traffic-sign" 166 | 20: 252 # "moving-car" 167 | 21: 253 # "moving-bicyclist" 168 | 22: 254 # "moving-person" 169 | 23: 255 # "moving-motorcyclist" 170 | 24: 259 # "moving-other-vehicle" 171 | 25: 258 # "moving-truck" 172 | learning_ignore: # Ignore classes 173 | 0: True # "unlabeled", and others ignored 174 | 1: False # "car" 175 | 2: False # "bicycle" 176 | 3: False # "motorcycle" 177 | 4: False # "truck" 178 | 5: False # "other-vehicle" 179 | 6: False # "person" 180 | 7: False # "bicyclist" 181 | 8: False # "motorcyclist" 182 | 9: False # "road" 183 | 10: False # "parking" 184 | 11: False # "sidewalk" 185 | 12: False # "other-ground" 186 | 13: False # "building" 187 | 14: False # "fence" 188 | 15: False # "vegetation" 189 | 16: False # "trunk" 190 | 17: False # "terrain" 191 | 18: False # "pole" 192 | 19: False # "traffic-sign" 193 | 20: False # "moving-car" 194 | 21: False # "moving-bicyclist" 195 | 22: False # "moving-person" 196 | 23: False # "moving-motorcyclist" 197 | 24: False # "moving-other-vehicle" 198 | 25: False # "moving-truck" 199 | split: # sequence numbers 200 | train: 201 | - 0 202 | - 1 203 | - 2 204 | - 3 205 | - 4 206 | - 5 207 | - 6 208 | - 7 209 | - 9 210 | - 10 211 | valid: 212 | - 8 213 | test: 214 | - 11 215 | - 12 216 | - 13 217 | - 14 218 | - 15 219 | - 16 220 | - 17 221 | - 18 222 | - 19 223 | - 20 224 | - 21 225 | -------------------------------------------------------------------------------- /train/tasks/semantic/config/labels/semantic-kitti.yaml: -------------------------------------------------------------------------------- 1 | # This file is covered by the LICENSE file in the root of this project. 
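The `content` block just above feeds the class-weighting rule quoted in the arch configs, `w = 1 / (content + epsilon_w)`; frequencies are first accumulated per training class through `learning_map`, so "car" also absorbs the "moving-car" mass. A small sketch of that computation with a two-class excerpt (the real run covers every entry):

```python
# two-class excerpt of the "content" block above (car/moving-car, road/lane-marking)
content = {10: 0.040818519255974316, 252: 0.001789309418528068,
           40: 0.1987493871255525, 60: 4.7084144280367186e-05}
learning_map = {10: 1, 252: 1, 40: 9, 60: 9}
epsilon_w = 0.001

# accumulate raw-label frequencies per training class
freq = {}
for raw_id, ratio in content.items():
    freq[learning_map[raw_id]] = freq.get(learning_map[raw_id], 0.0) + ratio

weights = {cls: 1.0 / (f + epsilon_w) for cls, f in freq.items()}
print(weights)  # class 1 (car) ~22.9, class 9 (road) ~5.0
```

Rare classes thus get weights an order of magnitude larger than dominant ones, and `epsilon_w` caps the weight of classes whose content is essentially zero.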
2 | name: "kitti" 3 | labels: 4 | 0 : "unlabeled" 5 | 1 : "outlier" 6 | 10: "car" 7 | 11: "bicycle" 8 | 13: "bus" 9 | 15: "motorcycle" 10 | 16: "on-rails" 11 | 18: "truck" 12 | 20: "other-vehicle" 13 | 30: "person" 14 | 31: "bicyclist" 15 | 32: "motorcyclist" 16 | 40: "road" 17 | 44: "parking" 18 | 48: "sidewalk" 19 | 49: "other-ground" 20 | 50: "building" 21 | 51: "fence" 22 | 52: "other-structure" 23 | 60: "lane-marking" 24 | 70: "vegetation" 25 | 71: "trunk" 26 | 72: "terrain" 27 | 80: "pole" 28 | 81: "traffic-sign" 29 | 99: "other-object" 30 | 252: "moving-car" 31 | 253: "moving-bicyclist" 32 | 254: "moving-person" 33 | 255: "moving-motorcyclist" 34 | 256: "moving-on-rails" 35 | 257: "moving-bus" 36 | 258: "moving-truck" 37 | 259: "moving-other-vehicle" 38 | color_map: # bgr 39 | 0 : [0, 0, 0] 40 | 1 : [0, 0, 255] 41 | 10: [245, 150, 100] 42 | 11: [245, 230, 100] 43 | 13: [250, 80, 100] 44 | 15: [150, 60, 30] 45 | 16: [255, 0, 0] 46 | 18: [180, 30, 80] 47 | 20: [255, 0, 0] 48 | 30: [30, 30, 255] 49 | 31: [200, 40, 255] 50 | 32: [90, 30, 150] 51 | 40: [255, 0, 255] 52 | 44: [255, 150, 255] 53 | 48: [75, 0, 75] 54 | 49: [75, 0, 175] 55 | 50: [0, 200, 255] 56 | 51: [50, 120, 255] 57 | 52: [0, 150, 255] 58 | 60: [170, 255, 150] 59 | 70: [0, 175, 0] 60 | 71: [0, 60, 135] 61 | 72: [80, 240, 150] 62 | 80: [150, 240, 255] 63 | 81: [0, 0, 255] 64 | 99: [255, 255, 50] 65 | 252: [245, 150, 100] 66 | 256: [255, 0, 0] 67 | 253: [200, 40, 255] 68 | 254: [30, 30, 255] 69 | 255: [90, 30, 150] 70 | 257: [250, 80, 100] 71 | 258: [180, 30, 80] 72 | 259: [255, 0, 0] 73 | content: # as a ratio with the total number of points 74 | 0: 0.018889854628292943 75 | 1: 0.0002937197336781505 76 | 10: 0.040818519255974316 77 | 11: 0.00016609538710764618 78 | 13: 2.7879693665067774e-05 79 | 15: 0.00039838616015114444 80 | 16: 0.0 81 | 18: 0.0020633612104619787 82 | 20: 0.0016218197275284021 83 | 30: 0.00017698551338515307 84 | 31: 1.1065903904919655e-08 85 | 32: 5.532951952459828e-09 86 | 40: 0.1987493871255525 87 | 44: 0.014717169549888214 88 | 48: 0.14392298360372 89 | 49: 0.0039048553037472045 90 | 50: 0.1326861944777486 91 | 51: 0.0723592229456223 92 | 52: 0.002395131480328884 93 | 60: 4.7084144280367186e-05 94 | 70: 0.26681502148037506 95 | 71: 0.006035012012626033 96 | 72: 0.07814222006271769 97 | 80: 0.002855498193863172 98 | 81: 0.0006155958086189918 99 | 99: 0.009923127583046915 100 | 252: 0.001789309418528068 101 | 253: 0.00012709999297008662 102 | 254: 0.00016059776092534436 103 | 255: 3.745553104802113e-05 104 | 256: 0.0 105 | 257: 0.00011351574470342043 106 | 258: 0.00010157861367183268 107 | 259: 4.3840131989471124e-05 108 | # classes that are indistinguishable from single scan or inconsistent in 109 | # ground truth are mapped to their closest equivalent 110 | learning_map: 111 | 0 : 0 # "unlabeled" 112 | 1 : 0 # "outlier" mapped to "unlabeled" --------------------------mapped 113 | 10: 1 # "car" 114 | 11: 2 # "bicycle" 115 | 13: 5 # "bus" mapped to "other-vehicle" --------------------------mapped 116 | 15: 3 # "motorcycle" 117 | 16: 5 # "on-rails" mapped to "other-vehicle" ---------------------mapped 118 | 18: 4 # "truck" 119 | 20: 5 # "other-vehicle" 120 | 30: 6 # "person" 121 | 31: 7 # "bicyclist" 122 | 32: 8 # "motorcyclist" 123 | 40: 9 # "road" 124 | 44: 10 # "parking" 125 | 48: 11 # "sidewalk" 126 | 49: 12 # "other-ground" 127 | 50: 13 # "building" 128 | 51: 14 # "fence" 129 | 52: 0 # "other-structure" mapped to "unlabeled" ------------------mapped 130 | 60: 9 # "lane-marking" to "road" 
---------------------------------mapped 131 | 70: 15 # "vegetation" 132 | 71: 16 # "trunk" 133 | 72: 17 # "terrain" 134 | 80: 18 # "pole" 135 | 81: 19 # "traffic-sign" 136 | 99: 0 # "other-object" to "unlabeled" ----------------------------mapped 137 | 252: 1 # "moving-car" to "car" ------------------------------------mapped 138 | 253: 7 # "moving-bicyclist" to "bicyclist" ------------------------mapped 139 | 254: 6 # "moving-person" to "person" ------------------------------mapped 140 | 255: 8 # "moving-motorcyclist" to "motorcyclist" ------------------mapped 141 | 256: 5 # "moving-on-rails" mapped to "other-vehicle" --------------mapped 142 | 257: 5 # "moving-bus" mapped to "other-vehicle" -------------------mapped 143 | 258: 4 # "moving-truck" to "truck" --------------------------------mapped 144 | 259: 5 # "moving-other"-vehicle to "other-vehicle" ----------------mapped 145 | learning_map_inv: # inverse of previous map 146 | 0: 0 # "unlabeled", and others ignored 147 | 1: 10 # "car" 148 | 2: 11 # "bicycle" 149 | 3: 15 # "motorcycle" 150 | 4: 18 # "truck" 151 | 5: 20 # "other-vehicle" 152 | 6: 30 # "person" 153 | 7: 31 # "bicyclist" 154 | 8: 32 # "motorcyclist" 155 | 9: 40 # "road" 156 | 10: 44 # "parking" 157 | 11: 48 # "sidewalk" 158 | 12: 49 # "other-ground" 159 | 13: 50 # "building" 160 | 14: 51 # "fence" 161 | 15: 70 # "vegetation" 162 | 16: 71 # "trunk" 163 | 17: 72 # "terrain" 164 | 18: 80 # "pole" 165 | 19: 81 # "traffic-sign" 166 | learning_ignore: # Ignore classes 167 | 0: True # "unlabeled", and others ignored 168 | 1: False # "car" 169 | 2: False # "bicycle" 170 | 3: False # "motorcycle" 171 | 4: False # "truck" 172 | 5: False # "other-vehicle" 173 | 6: False # "person" 174 | 7: False # "bicyclist" 175 | 8: False # "motorcyclist" 176 | 9: False # "road" 177 | 10: False # "parking" 178 | 11: False # "sidewalk" 179 | 12: False # "other-ground" 180 | 13: False # "building" 181 | 14: False # "fence" 182 | 15: False # "vegetation" 183 | 16: False # "trunk" 184 | 17: False # "terrain" 185 | 18: False # "pole" 186 | 19: False # "traffic-sign" 187 | split: # sequence numbers 188 | train: 189 | - 0 190 | - 1 191 | - 2 192 | - 3 193 | - 4 194 | - 5 195 | - 6 196 | - 7 197 | - 9 198 | - 10 199 | valid: 200 | - 8 201 | test: 202 | - 11 203 | - 12 204 | - 13 205 | - 14 206 | - 15 207 | - 16 208 | - 17 209 | - 18 210 | - 19 211 | - 20 212 | - 21 213 | -------------------------------------------------------------------------------- /train/tasks/semantic/dataset/kitti/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PRBonn/lidar-bonnetal/99b827f0228ff0e997473ac8e2cecbaa4af7d7c7/train/tasks/semantic/dataset/kitti/__init__.py -------------------------------------------------------------------------------- /train/tasks/semantic/decoders/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PRBonn/lidar-bonnetal/99b827f0228ff0e997473ac8e2cecbaa4af7d7c7/train/tasks/semantic/decoders/__init__.py -------------------------------------------------------------------------------- /train/tasks/semantic/decoders/darknet.py: -------------------------------------------------------------------------------- 1 | # This file was modified from https://github.com/BobLiu20/YOLOv3_PyTorch 2 | # It needed to be modified in order to accomodate for different strides in the 3 | 4 | import torch.nn as nn 5 | from collections import OrderedDict 6 | import 
torch.nn.functional as F 7 | 8 | 9 | class BasicBlock(nn.Module): 10 | def __init__(self, inplanes, planes, bn_d=0.1): 11 | super(BasicBlock, self).__init__() 12 | self.conv1 = nn.Conv2d(inplanes, planes[0], kernel_size=1, 13 | stride=1, padding=0, bias=False) 14 | self.bn1 = nn.BatchNorm2d(planes[0], momentum=bn_d) 15 | self.relu1 = nn.LeakyReLU(0.1) 16 | self.conv2 = nn.Conv2d(planes[0], planes[1], kernel_size=3, 17 | stride=1, padding=1, bias=False) 18 | self.bn2 = nn.BatchNorm2d(planes[1], momentum=bn_d) 19 | self.relu2 = nn.LeakyReLU(0.1) 20 | 21 | def forward(self, x): 22 | residual = x 23 | 24 | out = self.conv1(x) 25 | out = self.bn1(out) 26 | out = self.relu1(out) 27 | 28 | out = self.conv2(out) 29 | out = self.bn2(out) 30 | out = self.relu2(out) 31 | 32 | out += residual 33 | return out 34 | 35 | 36 | # ****************************************************************************** 37 | 38 | class Decoder(nn.Module): 39 | """ 40 | Class for DarknetSeg. Subclasses PyTorch's own "nn" module 41 | """ 42 | 43 | def __init__(self, params, stub_skips, OS=32, feature_depth=1024): 44 | super(Decoder, self).__init__() 45 | self.backbone_OS = OS 46 | self.backbone_feature_depth = feature_depth 47 | self.drop_prob = params["dropout"] 48 | self.bn_d = params["bn_d"] 49 | 50 | # stride play 51 | self.strides = [2, 2, 2, 2, 2] 52 | # check current stride 53 | current_os = 1 54 | for s in self.strides: 55 | current_os *= s 56 | print("Decoder original OS: ", int(current_os)) 57 | # redo strides according to needed stride 58 | for i, stride in enumerate(self.strides): 59 | if int(current_os) != self.backbone_OS: 60 | if stride == 2: 61 | current_os /= 2 62 | self.strides[i] = 1 63 | if int(current_os) == self.backbone_OS: 64 | break 65 | print("Decoder new OS: ", int(current_os)) 66 | print("Decoder strides: ", self.strides) 67 | 68 | # decoder 69 | self.dec5 = self._make_dec_layer(BasicBlock, 70 | [self.backbone_feature_depth, 512], 71 | bn_d=self.bn_d, 72 | stride=self.strides[0]) 73 | self.dec4 = self._make_dec_layer(BasicBlock, [512, 256], bn_d=self.bn_d, 74 | stride=self.strides[1]) 75 | self.dec3 = self._make_dec_layer(BasicBlock, [256, 128], bn_d=self.bn_d, 76 | stride=self.strides[2]) 77 | self.dec2 = self._make_dec_layer(BasicBlock, [128, 64], bn_d=self.bn_d, 78 | stride=self.strides[3]) 79 | self.dec1 = self._make_dec_layer(BasicBlock, [64, 32], bn_d=self.bn_d, 80 | stride=self.strides[4]) 81 | 82 | # layer list to execute with skips 83 | self.layers = [self.dec5, self.dec4, self.dec3, self.dec2, self.dec1] 84 | 85 | # for a bit of fun 86 | self.dropout = nn.Dropout2d(self.drop_prob) 87 | 88 | # last channels 89 | self.last_channels = 32 90 | 91 | def _make_dec_layer(self, block, planes, bn_d=0.1, stride=2): 92 | layers = [] 93 | 94 | # downsample 95 | if stride == 2: 96 | layers.append(("upconv", nn.ConvTranspose2d(planes[0], planes[1], 97 | kernel_size=[1, 4], stride=[1, 2], 98 | padding=[0, 1]))) 99 | else: 100 | layers.append(("conv", nn.Conv2d(planes[0], planes[1], 101 | kernel_size=3, padding=1))) 102 | layers.append(("bn", nn.BatchNorm2d(planes[1], momentum=bn_d))) 103 | layers.append(("relu", nn.LeakyReLU(0.1))) 104 | 105 | # blocks 106 | layers.append(("residual", block(planes[1], planes, bn_d))) 107 | 108 | return nn.Sequential(OrderedDict(layers)) 109 | 110 | def run_layer(self, x, layer, skips, os): 111 | feats = layer(x) # up 112 | if feats.shape[-1] > x.shape[-1]: 113 | os //= 2 # match skip 114 | feats = feats + skips[os].detach() # add skip 115 | x = feats 116 | return 
x, skips, os 117 | 118 | def forward(self, x, skips): 119 | os = self.backbone_OS 120 | 121 | # run layers 122 | x, skips, os = self.run_layer(x, self.dec5, skips, os) 123 | x, skips, os = self.run_layer(x, self.dec4, skips, os) 124 | x, skips, os = self.run_layer(x, self.dec3, skips, os) 125 | x, skips, os = self.run_layer(x, self.dec2, skips, os) 126 | x, skips, os = self.run_layer(x, self.dec1, skips, os) 127 | 128 | x = self.dropout(x) 129 | 130 | return x 131 | 132 | def get_last_depth(self): 133 | return self.last_channels 134 | -------------------------------------------------------------------------------- /train/tasks/semantic/decoders/squeezeseg.py: -------------------------------------------------------------------------------- 1 | # Adapted from https://github.com/BichenWuUCB/SqueezeSeg 2 | from __future__ import print_function 3 | import torch 4 | import torch.nn as nn 5 | import torch.nn.functional as F 6 | 7 | 8 | class FireUp(nn.Module): 9 | 10 | def __init__(self, inplanes, squeeze_planes, 11 | expand1x1_planes, expand3x3_planes, stride): 12 | super(FireUp, self).__init__() 13 | self.inplanes = inplanes 14 | self.stride = stride 15 | self.activation = nn.ReLU(inplace=True) 16 | self.squeeze = nn.Conv2d(inplanes, squeeze_planes, kernel_size=1) 17 | if self.stride == 2: 18 | self.upconv = nn.ConvTranspose2d(squeeze_planes, squeeze_planes, 19 | kernel_size=[1, 4], stride=[1, 2], 20 | padding=[0, 1]) 21 | self.expand1x1 = nn.Conv2d(squeeze_planes, expand1x1_planes, 22 | kernel_size=1) 23 | self.expand3x3 = nn.Conv2d(squeeze_planes, expand3x3_planes, 24 | kernel_size=3, padding=1) 25 | 26 | def forward(self, x): 27 | x = self.activation(self.squeeze(x)) 28 | if self.stride == 2: 29 | x = self.activation(self.upconv(x)) 30 | return torch.cat([ 31 | self.activation(self.expand1x1(x)), 32 | self.activation(self.expand3x3(x)) 33 | ], 1) 34 | 35 | 36 | # ****************************************************************************** 37 | 38 | class Decoder(nn.Module): 39 | """ 40 | Class for the SqueezeSeg decoder.
Subclasses PyTorch's own "nn" module 41 | """ 42 | 43 | def __init__(self, params, stub_skips, OS=32, feature_depth=512): 44 | super(Decoder, self).__init__() 45 | self.backbone_OS = OS 46 | self.backbone_feature_depth = feature_depth 47 | self.drop_prob = params["dropout"] 48 | 49 | # stride play 50 | self.strides = [2, 2, 2, 2] 51 | # check current stride 52 | current_os = 1 53 | for s in self.strides: 54 | current_os *= s 55 | print("Decoder original OS: ", int(current_os)) 56 | # redo strides according to needed stride 57 | for i, stride in enumerate(self.strides): 58 | if int(current_os) != self.backbone_OS: 59 | if stride == 2: 60 | current_os /= 2 61 | self.strides[i] = 1 62 | if int(current_os) == self.backbone_OS: 63 | break 64 | print("Decoder new OS: ", int(current_os)) 65 | print("Decoder strides: ", self.strides) 66 | 67 | # decoder 68 | # (four FireUp stages; each 2x step upsamples the width only) 69 | self.firedec10 = FireUp(self.backbone_feature_depth, 64, 128, 128, 70 | stride=self.strides[0]) 71 | self.firedec11 = FireUp(256, 32, 64, 64, 72 | stride=self.strides[1]) 73 | self.firedec12 = FireUp(128, 16, 32, 32, 74 | stride=self.strides[2]) 75 | self.firedec13 = FireUp(64, 16, 32, 32, 76 | stride=self.strides[3]) 77 | 78 | # layer list to execute with skips 79 | self.layers = [self.firedec10, self.firedec11, 80 | self.firedec12, self.firedec13] 81 | 82 | # for a bit of fun 83 | self.dropout = nn.Dropout2d(self.drop_prob) 84 | 85 | # last channels 86 | self.last_channels = 64 87 | 88 | def run_layer(self, x, layer, skips, os): 89 | feats = layer(x) # up 90 | if feats.shape[-1] > x.shape[-1]: 91 | os //= 2 # match skip 92 | feats = feats + skips[os].detach() # add skip 93 | x = feats 94 | return x, skips, os 95 | 96 | def forward(self, x, skips): 97 | os = self.backbone_OS 98 | 99 | # run layers 100 | x, skips, os = self.run_layer(x, self.firedec10, skips, os) 101 | x, skips, os = self.run_layer(x, self.firedec11, skips, os) 102 | x, skips, os = self.run_layer(x, self.firedec12, skips, os) 103 | x, skips, os = self.run_layer(x, self.firedec13, skips, os) 104 | 105 | x = self.dropout(x) 106 | 107 | return x 108 | 109 | def get_last_depth(self): 110 | return self.last_channels 111 | -------------------------------------------------------------------------------- /train/tasks/semantic/decoders/squeezesegV2.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # This file is covered by the LICENSE file in the root of this project.
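All of these decoders repeat the same stride bookkeeping: start from a list of 2x width-upsampling steps, then flip entries to 1 until the product matches the backbone's output stride. Pulled out on its own, the loop (and the console messages it prints) is easy to sanity-check; this standalone sketch mirrors the constructor logic above:

```python
def adjust_strides(strides, backbone_os):
    """Reduce 2x upsampling steps until their product equals backbone_os."""
    current_os = 1
    for s in strides:
        current_os *= s
    print("Decoder original OS:", int(current_os))
    for i, stride in enumerate(strides):
        if int(current_os) != backbone_os:
            if stride == 2:
                current_os /= 2
                strides[i] = 1          # this stage no longer upsamples
        if int(current_os) == backbone_os:
            break
    print("Decoder new OS:", int(current_os))
    return strides

# darknet decoder: five 2x steps (OS 32) squeezed down to a backbone OS of 16
print(adjust_strides([2, 2, 2, 2, 2], 16))  # -> [1, 2, 2, 2, 2]
```

With the OS 16 used in the configs above, the four-step SqueezeSeg decoders (2*2*2*2 = 16) are left untouched, while the five-step Darknet decoder disables exactly one upsampling stage.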
3 | 4 | from __future__ import print_function 5 | import torch 6 | import torch.nn as nn 7 | import torch.nn.functional as F 8 | 9 | 10 | class FireUp(nn.Module): 11 | 12 | def __init__(self, inplanes, squeeze_planes, 13 | expand1x1_planes, expand3x3_planes, bn_d, stride): 14 | super(FireUp, self).__init__() 15 | self.inplanes = inplanes 16 | self.bn_d = bn_d 17 | self.stride = stride 18 | self.activation = nn.ReLU(inplace=True) 19 | self.squeeze = nn.Conv2d(inplanes, squeeze_planes, kernel_size=1) 20 | self.squeeze_bn = nn.BatchNorm2d(squeeze_planes, momentum=self.bn_d) 21 | if self.stride == 2: 22 | self.upconv = nn.ConvTranspose2d(squeeze_planes, squeeze_planes, 23 | kernel_size=[1, 4], stride=[1, 2], 24 | padding=[0, 1]) 25 | self.expand1x1 = nn.Conv2d(squeeze_planes, expand1x1_planes, 26 | kernel_size=1) 27 | self.expand1x1_bn = nn.BatchNorm2d(expand1x1_planes, momentum=self.bn_d) 28 | self.expand3x3 = nn.Conv2d(squeeze_planes, expand3x3_planes, 29 | kernel_size=3, padding=1) 30 | self.expand3x3_bn = nn.BatchNorm2d(expand3x3_planes, momentum=self.bn_d) 31 | 32 | def forward(self, x): 33 | x = self.activation(self.squeeze_bn(self.squeeze(x))) 34 | if self.stride == 2: 35 | x = self.activation(self.upconv(x)) 36 | return torch.cat([ 37 | self.activation(self.expand1x1_bn(self.expand1x1(x))), 38 | self.activation(self.expand3x3_bn(self.expand3x3(x))) 39 | ], 1) 40 | 41 | 42 | # ****************************************************************************** 43 | 44 | class Decoder(nn.Module): 45 | """ 46 | Class for the SqueezeSegV2 decoder. Subclasses PyTorch's own "nn" module 47 | """ 48 | 49 | def __init__(self, params, stub_skips, OS=32, feature_depth=512): 50 | super(Decoder, self).__init__() 51 | self.backbone_OS = OS 52 | self.backbone_feature_depth = feature_depth 53 | self.drop_prob = params["dropout"] 54 | self.bn_d = params["bn_d"] 55 | 56 | # stride play 57 | self.strides = [2, 2, 2, 2] 58 | # check current stride 59 | current_os = 1 60 | for s in self.strides: 61 | current_os *= s 62 | print("Decoder original OS: ", int(current_os)) 63 | # redo strides according to needed stride 64 | for i, stride in enumerate(self.strides): 65 | if int(current_os) != self.backbone_OS: 66 | if stride == 2: 67 | current_os /= 2 68 | self.strides[i] = 1 69 | if int(current_os) == self.backbone_OS: 70 | break 71 | print("Decoder new OS: ", int(current_os)) 72 | print("Decoder strides: ", self.strides) 73 | 74 | # decoder 75 | # (four FireUp stages; each 2x step upsamples the width only) 76 | self.firedec10 = FireUp(self.backbone_feature_depth, 77 | 64, 128, 128, bn_d=self.bn_d, 78 | stride=self.strides[0]) 79 | self.firedec11 = FireUp(256, 32, 64, 64, bn_d=self.bn_d, 80 | stride=self.strides[1]) 81 | self.firedec12 = FireUp(128, 16, 32, 32, bn_d=self.bn_d, 82 | stride=self.strides[2]) 83 | self.firedec13 = FireUp(64, 16, 32, 32, bn_d=self.bn_d, 84 | stride=self.strides[3]) 85 | 86 | # for a bit of fun 87 | self.dropout = nn.Dropout2d(self.drop_prob) 88 | 89 | # last channels 90 | self.last_channels = 64 91 | 92 | def run_layer(self, x, layer, skips, os): 93 | feats = layer(x) # up 94 | if feats.shape[-1] > x.shape[-1]: 95 | os //= 2 # match skip 96 | feats = feats + skips[os].detach() # add skip 97 | x = feats 98 | return x, skips, os 99 | 100 | def forward(self, x, skips): 101 | os = self.backbone_OS 102 | 103 | # run layers 104 | x, skips, os = self.run_layer(x, self.firedec10, skips, os) 105 | x, skips, os = self.run_layer(x, self.firedec11, skips, os) 106 | x, skips, os = self.run_layer(x, self.firedec12, skips, os) 107 | x, skips, os = self.run_layer(x, 
self.firedec13, skips, os) 108 | 109 | x = self.dropout(x) 110 | 111 | return x 112 | 113 | def get_last_depth(self): 114 | return self.last_channels 115 | -------------------------------------------------------------------------------- /train/tasks/semantic/evaluate_biou.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # This file is covered by the LICENSE file in the root of this project. 3 | 4 | import argparse 5 | import os 6 | import yaml 7 | import sys 8 | import numpy as np 9 | import torch 10 | import __init__ as booger 11 | 12 | from tasks.semantic.modules.ioueval import biouEval 13 | from common.laserscan import SemLaserScan 14 | 15 | # possible splits 16 | splits = ["train", "valid", "test"] 17 | 18 | if __name__ == '__main__': 19 | parser = argparse.ArgumentParser("./evaluate_biou.py") 20 | parser.add_argument( 21 | '--dataset', '-d', 22 | type=str, 23 | required=True, 24 | help='Dataset dir. No Default', 25 | ) 26 | parser.add_argument( 27 | '--predictions', '-p', 28 | type=str, 29 | required=False, 30 | help='Prediction dir. Same organization as dataset, but predictions in' 31 | ' each sequence\'s "predictions" directory. No Default. If no option is' 32 | ' set we look for the labels in the same directory as the dataset' 33 | ) 34 | parser.add_argument( 35 | '--split', '-s', 36 | type=str, 37 | required=False, 38 | choices=["train", "valid", "test"], 39 | default="valid", 40 | help='Split to evaluate on. One of ' + 41 | str(splits) + '. Defaults to %(default)s', 42 | ) 43 | parser.add_argument( 44 | '--data_cfg', '-dc', 45 | type=str, 46 | required=False, 47 | default="config/labels/semantic-kitti.yaml", 48 | help='Dataset config file. Defaults to %(default)s', 49 | ) 50 | parser.add_argument( 51 | '--border', '-bs', 52 | type=int, 53 | required=False, 54 | default=1, 55 | help='Border size. Defaults to %(default)s', 56 | ) 57 | parser.add_argument( 58 | '--conn', '-c', 59 | type=int, 60 | required=False, 61 | default=4, 62 | help='Kernel connectivity. 
Defaults to %(default)s', 63 | ) 64 | FLAGS, unparsed = parser.parse_known_args() 65 | 66 | # fill in real predictions dir 67 | if FLAGS.predictions is None: 68 | FLAGS.predictions = FLAGS.dataset 69 | 70 | # print summary of what we will do 71 | print("*" * 80) 72 | print("INTERFACE:") 73 | print("Data: ", FLAGS.dataset) 74 | print("Predictions: ", FLAGS.predictions) 75 | print("Split: ", FLAGS.split) 76 | print("Config: ", FLAGS.data_cfg) 77 | print("Border Mask Size", FLAGS.border) 78 | print("Border Mask Connectivity", FLAGS.conn) 79 | print("*" * 80) 80 | 81 | # assert split 82 | assert(FLAGS.split in splits) 83 | 84 | # open data config file 85 | try: 86 | print("Opening data config file %s" % FLAGS.data_cfg) 87 | DATA = yaml.safe_load(open(FLAGS.data_cfg, 'r')) 88 | except Exception as e: 89 | print(e) 90 | print("Error opening data yaml file.") 91 | quit() 92 | 93 | # get number of interest classes, and the label mappings 94 | class_strings = DATA["labels"] 95 | class_remap = DATA["learning_map"] 96 | class_inv_remap = DATA["learning_map_inv"] 97 | class_ignore = DATA["learning_ignore"] 98 | nr_classes = len(class_inv_remap) 99 | 100 | # make lookup table for mapping 101 | maxkey = 0 102 | for key, data in class_remap.items(): 103 | if key > maxkey: 104 | maxkey = key 105 | # +100 hack making lut bigger just in case there are unknown labels 106 | remap_lut = np.zeros((maxkey + 100), dtype=np.int32) 107 | for key, data in class_remap.items(): 108 | try: 109 | remap_lut[key] = data 110 | except IndexError: 111 | print("Wrong key ", key) 112 | # print(remap_lut) 113 | 114 | # create evaluator 115 | ignore = [] 116 | for cl, ign in class_ignore.items(): 117 | if ign: 118 | x_cl = int(cl) 119 | ignore.append(x_cl) 120 | print("Ignoring xentropy class ", x_cl, " in IoU evaluation") 121 | 122 | # create evaluator 123 | device = torch.device( 124 | "cuda") if torch.cuda.is_available() else torch.device('cpu') 125 | evaluator = biouEval(nr_classes, device, ignore, 126 | FLAGS.border, FLAGS.conn) 127 | evaluator.reset() 128 | 129 | # get test set 130 | test_sequences = DATA["split"][FLAGS.split] 131 | 132 | # get scan paths 133 | scan_names = [] 134 | for sequence in test_sequences: 135 | sequence = '{0:02d}'.format(int(sequence)) 136 | scan_paths = os.path.join(FLAGS.dataset, "sequences", 137 | str(sequence), "velodyne") 138 | # populate the scan names 139 | seq_scan_names = [os.path.join(dp, f) for dp, dn, fn in os.walk( 140 | os.path.expanduser(scan_paths)) for f in fn if ".bin" in f] 141 | seq_scan_names.sort() 142 | scan_names.extend(seq_scan_names) 143 | # print(scan_names) 144 | 145 | # get label paths 146 | label_names = [] 147 | for sequence in test_sequences: 148 | sequence = '{0:02d}'.format(int(sequence)) 149 | label_paths = os.path.join(FLAGS.dataset, "sequences", 150 | str(sequence), "labels") 151 | # populate the label names 152 | seq_label_names = [os.path.join(dp, f) for dp, dn, fn in os.walk( 153 | os.path.expanduser(label_paths)) for f in fn if ".label" in f] 154 | seq_label_names.sort() 155 | label_names.extend(seq_label_names) 156 | # print(label_names) 157 | 158 | # get predictions paths 159 | pred_names = [] 160 | for sequence in test_sequences: 161 | sequence = '{0:02d}'.format(int(sequence)) 162 | pred_paths = os.path.join(FLAGS.predictions, "sequences", 163 | sequence, "predictions") 164 | # populate the label names 165 | seq_pred_names = [os.path.join(dp, f) for dp, dn, fn in os.walk( 166 | os.path.expanduser(pred_paths)) for f in fn if ".label" in f] 167 | 
seq_pred_names.sort() 168 | pred_names.extend(seq_pred_names) 169 | # print(pred_names) 170 | 171 | # check that I have the same number of files 172 | # print("labels: ", len(label_names)) 173 | # print("predictions: ", len(pred_names)) 174 | assert(len(label_names) == len(scan_names) and 175 | len(label_names) == len(pred_names)) 176 | 177 | print("Evaluating sequences: ") 178 | # open each file, get the tensor, and make the iou comparison 179 | for scan_file, label_file, pred_file in zip(scan_names, label_names, pred_names): 180 | print("evaluating label ", label_file, "with", pred_file) 181 | # open label 182 | label = SemLaserScan(project=True) 183 | label.open_scan(scan_file) 184 | label.open_label(label_file) 185 | u_label_sem = remap_lut[label.sem_label] # remap to xentropy format 186 | p_label_sem = remap_lut[label.proj_sem_label] # remap to xentropy format 187 | u_scan_px = label.proj_x 188 | u_scan_py = label.proj_y 189 | 190 | # open prediction 191 | pred = SemLaserScan(project=True) 192 | pred.open_scan(scan_file) 193 | pred.open_label(pred_file) 194 | u_pred_sem = remap_lut[pred.sem_label] # remap to xentropy format 195 | 196 | # add single scan to evaluation 197 | evaluator.addBorderBatch1d(p_label_sem, u_pred_sem, u_label_sem, 198 | u_scan_px, u_scan_py) 199 | 200 | # when I am done, print the evaluation 201 | m_accuracy = evaluator.getacc() 202 | m_jaccard, class_jaccard = evaluator.getIoU() 203 | 204 | print('Validation set:\n' 205 | 'bAcc avg {m_accuracy:.3f}\n' 206 | 'bIoU avg {m_jaccard:.3f}'.format(m_accuracy=m_accuracy, 207 | m_jaccard=m_jaccard)) 208 | # print also classwise 209 | for i, jacc in enumerate(class_jaccard): 210 | if i not in ignore: 211 | print('bIoU class {i:} [{class_str:}] = {jacc:.3f}'.format( 212 | i=i, class_str=class_strings[class_inv_remap[i]], jacc=jacc)) 213 | 214 | # print for spreadsheet 215 | print("*" * 80) 216 | print("below can be copied straight for paper table") 217 | for i, jacc in enumerate(class_jaccard): 218 | if i not in ignore: 219 | sys.stdout.write('{jacc:.3f}'.format(jacc=jacc.item())) 220 | sys.stdout.write(",") 221 | sys.stdout.write('{jacc:.3f}'.format(jacc=m_jaccard.item())) 222 | sys.stdout.write(",") 223 | sys.stdout.write('{acc:.3f}'.format(acc=m_accuracy.item())) 224 | sys.stdout.write('\n') 225 | sys.stdout.flush() 226 | -------------------------------------------------------------------------------- /train/tasks/semantic/evaluate_iou.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # This file is covered by the LICENSE file in the root of this project. 3 | 4 | import argparse 5 | import os 6 | import yaml 7 | import sys 8 | import numpy as np 9 | import torch 10 | import __init__ as booger 11 | 12 | from tasks.semantic.modules.ioueval import iouEval 13 | from common.laserscan import SemLaserScan 14 | 15 | # possible splits 16 | splits = ["train", "valid", "test"] 17 | 18 | if __name__ == '__main__': 19 | parser = argparse.ArgumentParser("./evaluate_iou.py") 20 | parser.add_argument( 21 | '--dataset', '-d', 22 | type=str, 23 | required=True, 24 | help='Dataset dir. No Default', 25 | ) 26 | parser.add_argument( 27 | '--predictions', '-p', 28 | type=str, 29 | required=None, 30 | help='Prediction dir. Same organization as dataset, but predictions in' 31 | 'each sequences "prediction" directory. No Default. 
If no option is set' 32 | ' we look for the labels in the same directory as dataset' 33 | ) 34 | parser.add_argument( 35 | '--split', '-s', 36 | type=str, 37 | required=False, 38 | choices=["train", "valid", "test"], 39 | default="valid", 40 | help='Split to evaluate on. One of ' + 41 | str(splits) + '. Defaults to %(default)s', 42 | ) 43 | parser.add_argument( 44 | '--data_cfg', '-dc', 45 | type=str, 46 | required=False, 47 | default="config/labels/semantic-kitti.yaml", 48 | help='Dataset config file. Defaults to %(default)s', 49 | ) 50 | parser.add_argument( 51 | '--limit', '-l', 52 | type=int, 53 | required=False, 54 | default=None, 55 | help='Limit to the first "--limit" points of each scan. Useful for' 56 | ' evaluating single scan from aggregated pointcloud.' 57 | ' Defaults to %(default)s', 58 | ) 59 | 60 | FLAGS, unparsed = parser.parse_known_args() 61 | 62 | # fill in real predictions dir 63 | if FLAGS.predictions is None: 64 | FLAGS.predictions = FLAGS.dataset 65 | 66 | # print summary of what we will do 67 | print("*" * 80) 68 | print("INTERFACE:") 69 | print("Data: ", FLAGS.dataset) 70 | print("Predictions: ", FLAGS.predictions) 71 | print("Split: ", FLAGS.split) 72 | print("Config: ", FLAGS.data_cfg) 73 | print("Limit: ", FLAGS.limit) 74 | print("*" * 80) 75 | 76 | # assert split 77 | assert(FLAGS.split in splits) 78 | 79 | # open data config file 80 | try: 81 | print("Opening data config file %s" % FLAGS.data_cfg) 82 | DATA = yaml.safe_load(open(FLAGS.data_cfg, 'r')) 83 | except Exception as e: 84 | print(e) 85 | print("Error opening data yaml file.") 86 | quit() 87 | 88 | # get number of interest classes, and the label mappings 89 | class_strings = DATA["labels"] 90 | class_remap = DATA["learning_map"] 91 | class_inv_remap = DATA["learning_map_inv"] 92 | class_ignore = DATA["learning_ignore"] 93 | nr_classes = len(class_inv_remap) 94 | 95 | # make lookup table for mapping 96 | maxkey = 0 97 | for key, data in class_remap.items(): 98 | if key > maxkey: 99 | maxkey = key 100 | # +100 hack making lut bigger just in case there are unknown labels 101 | remap_lut = np.zeros((maxkey + 100), dtype=np.int32) 102 | for key, data in class_remap.items(): 103 | try: 104 | remap_lut[key] = data 105 | except IndexError: 106 | print("Wrong key ", key) 107 | # print(remap_lut) 108 | 109 | # create evaluator 110 | ignore = [] 111 | for cl, ign in class_ignore.items(): 112 | if ign: 113 | x_cl = int(cl) 114 | ignore.append(x_cl) 115 | print("Ignoring xentropy class ", x_cl, " in IoU evaluation") 116 | 117 | # create evaluator 118 | device = torch.device("cpu") 119 | evaluator = iouEval(nr_classes, device, ignore) 120 | evaluator.reset() 121 | 122 | # get test set 123 | test_sequences = DATA["split"][FLAGS.split] 124 | 125 | # get scan paths 126 | scan_names = [] 127 | for sequence in test_sequences: 128 | sequence = '{0:02d}'.format(int(sequence)) 129 | scan_paths = os.path.join(FLAGS.dataset, "sequences", 130 | str(sequence), "velodyne") 131 | # populate the scan names 132 | seq_scan_names = [os.path.join(dp, f) for dp, dn, fn in os.walk( 133 | os.path.expanduser(scan_paths)) for f in fn if ".bin" in f] 134 | seq_scan_names.sort() 135 | scan_names.extend(seq_scan_names) 136 | # print(scan_names) 137 | 138 | # get label paths 139 | label_names = [] 140 | for sequence in test_sequences: 141 | sequence = '{0:02d}'.format(int(sequence)) 142 | label_paths = os.path.join(FLAGS.dataset, "sequences", 143 | str(sequence), "labels") 144 | # populate the label names 145 | seq_label_names = 
[os.path.join(dp, f) for dp, dn, fn in os.walk( 146 | os.path.expanduser(label_paths)) for f in fn if ".label" in f] 147 | seq_label_names.sort() 148 | label_names.extend(seq_label_names) 149 | # print(label_names) 150 | 151 | # get predictions paths 152 | pred_names = [] 153 | for sequence in test_sequences: 154 | sequence = '{0:02d}'.format(int(sequence)) 155 | pred_paths = os.path.join(FLAGS.predictions, "sequences", 156 | sequence, "predictions") 157 | # populate the label names 158 | seq_pred_names = [os.path.join(dp, f) for dp, dn, fn in os.walk( 159 | os.path.expanduser(pred_paths)) for f in fn if ".label" in f] 160 | seq_pred_names.sort() 161 | pred_names.extend(seq_pred_names) 162 | # print(pred_names) 163 | 164 | # check that I have the same number of files 165 | # print("labels: ", len(label_names)) 166 | # print("predictions: ", len(pred_names)) 167 | assert(len(label_names) == len(scan_names) and 168 | len(label_names) == len(pred_names)) 169 | 170 | print("Evaluating sequences: ") 171 | # open each file, get the tensor, and make the iou comparison 172 | for scan_file, label_file, pred_file in zip(scan_names, label_names, pred_names): 173 | print("evaluating label ", label_file, "with", pred_file) 174 | # open label 175 | label = SemLaserScan(project=False) 176 | label.open_scan(scan_file) 177 | label.open_label(label_file) 178 | u_label_sem = remap_lut[label.sem_label] # remap to xentropy format 179 | if FLAGS.limit is not None: 180 | u_label_sem = u_label_sem[:FLAGS.limit] 181 | 182 | # open prediction 183 | pred = SemLaserScan(project=False) 184 | pred.open_scan(scan_file) 185 | pred.open_label(pred_file) 186 | u_pred_sem = remap_lut[pred.sem_label] # remap to xentropy format 187 | if FLAGS.limit is not None: 188 | u_pred_sem = u_pred_sem[:FLAGS.limit] 189 | 190 | # add single scan to evaluation 191 | evaluator.addBatch(u_pred_sem, u_label_sem) 192 | 193 | # when I am done, print the evaluation 194 | m_accuracy = evaluator.getacc() 195 | m_jaccard, class_jaccard = evaluator.getIoU() 196 | 197 | print('Validation set:\n' 198 | 'Acc avg {m_accuracy:.3f}\n' 199 | 'IoU avg {m_jaccard:.3f}'.format(m_accuracy=m_accuracy, 200 | m_jaccard=m_jaccard)) 201 | # print also classwise 202 | for i, jacc in enumerate(class_jaccard): 203 | if i not in ignore: 204 | print('IoU class {i:} [{class_str:}] = {jacc:.3f}'.format( 205 | i=i, class_str=class_strings[class_inv_remap[i]], jacc=jacc)) 206 | 207 | # print for spreadsheet 208 | print("*" * 80) 209 | print("below can be copied straight for paper table") 210 | for i, jacc in enumerate(class_jaccard): 211 | if i not in ignore: 212 | sys.stdout.write('{jacc:.3f}'.format(jacc=jacc.item())) 213 | sys.stdout.write(",") 214 | sys.stdout.write('{jacc:.3f}'.format(jacc=m_jaccard.item())) 215 | sys.stdout.write(",") 216 | sys.stdout.write('{acc:.3f}'.format(acc=m_accuracy.item())) 217 | sys.stdout.write('\n') 218 | sys.stdout.flush() 219 | -------------------------------------------------------------------------------- /train/tasks/semantic/infer.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # This file is covered by the LICENSE file in the root of this project. 
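Both evaluation scripts above build the same remapping machinery: a lookup table indexed by raw SemanticKITTI label ID (oversized by 100 slots as a guard against unknown IDs) whose entries come from `learning_map`, so remapping an entire scan is one fancy-indexing operation instead of a Python loop. A condensed sketch with a three-entry excerpt of the map:

```python
import numpy as np

learning_map = {10: 1, 60: 9, 252: 1}   # car, lane-marking->road, moving-car->car
maxkey = max(learning_map)

# +100 hack: oversize the LUT in case a scan contains an unexpected label id
remap_lut = np.zeros((maxkey + 100), dtype=np.int32)
for raw_id, train_id in learning_map.items():
    remap_lut[raw_id] = train_id

raw_labels = np.array([10, 252, 60, 10], dtype=np.int32)  # one fake scan
print(remap_lut[raw_labels])  # [1 1 9 1] -- whole-scan remap in one indexing op
```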
3 | 4 | import argparse 5 | import subprocess 6 | import datetime 7 | import yaml 8 | from shutil import copyfile 9 | import os 10 | import shutil 11 | import __init__ as booger 12 | 13 | from tasks.semantic.modules.user import * 14 | 15 | 16 | if __name__ == '__main__': 17 | parser = argparse.ArgumentParser("./infer.py") 18 | parser.add_argument( 19 | '--dataset', '-d', 20 | type=str, 21 | required=True, 22 | help='Dataset to infer with. No Default', 23 | ) 24 | parser.add_argument( 25 | '--log', '-l', 26 | type=str, 27 | default=os.path.expanduser("~") + '/logs/' + 28 | datetime.datetime.now().strftime("%Y-%-m-%d-%H:%M") + '/', 29 | help='Directory to put the predictions. Default: ~/logs/date+time' 30 | ) 31 | parser.add_argument( 32 | '--model', '-m', 33 | type=str, 34 | required=True, 35 | default=None, 36 | help='Directory to get the trained model.' 37 | ) 38 | FLAGS, unparsed = parser.parse_known_args() 39 | 40 | # print summary of what we will do 41 | print("----------") 42 | print("INTERFACE:") 43 | print("dataset", FLAGS.dataset) 44 | print("log", FLAGS.log) 45 | print("model", FLAGS.model) 46 | print("----------\n") 47 | print("Commit hash (training version): ", str( 48 | subprocess.check_output(['git', 'rev-parse', '--short', 'HEAD']).strip())) 49 | print("----------\n") 50 | 51 | # open arch config file 52 | try: 53 | print("Opening arch config file from %s" % FLAGS.model) 54 | ARCH = yaml.safe_load(open(FLAGS.model + "/arch_cfg.yaml", 'r')) 55 | except Exception as e: 56 | print(e) 57 | print("Error opening arch yaml file.") 58 | quit() 59 | 60 | # open data config file 61 | try: 62 | print("Opening data config file from %s" % FLAGS.model) 63 | DATA = yaml.safe_load(open(FLAGS.model + "/data_cfg.yaml", 'r')) 64 | except Exception as e: 65 | print(e) 66 | print("Error opening data yaml file.") 67 | quit() 68 | 69 | # create log folder 70 | try: 71 | if os.path.isdir(FLAGS.log): 72 | shutil.rmtree(FLAGS.log) 73 | os.makedirs(FLAGS.log) 74 | os.makedirs(os.path.join(FLAGS.log, "sequences")) 75 | for seq in DATA["split"]["train"]: 76 | seq = '{0:02d}'.format(int(seq)) 77 | print("train", seq) 78 | os.makedirs(os.path.join(FLAGS.log, "sequences", seq)) 79 | os.makedirs(os.path.join(FLAGS.log, "sequences", seq, "predictions")) 80 | for seq in DATA["split"]["valid"]: 81 | seq = '{0:02d}'.format(int(seq)) 82 | print("valid", seq) 83 | os.makedirs(os.path.join(FLAGS.log, "sequences", seq)) 84 | os.makedirs(os.path.join(FLAGS.log, "sequences", seq, "predictions")) 85 | for seq in DATA["split"]["test"]: 86 | seq = '{0:02d}'.format(int(seq)) 87 | print("test", seq) 88 | os.makedirs(os.path.join(FLAGS.log, "sequences", seq)) 89 | os.makedirs(os.path.join(FLAGS.log, "sequences", seq, "predictions")) 90 | except Exception as e: 91 | print(e) 92 | print("Error creating log directory. Check permissions!") 93 | raise 94 | 95 | 96 | 97 | 98 | 99 | 100 | # does model folder exist? 101 | if os.path.isdir(FLAGS.model): 102 | print("model folder exists! Using model from %s" % (FLAGS.model)) 103 | else: 104 | print("model folder doesn't exist! 
Can't infer...") 105 | quit() 106 | 107 | # create user and infer dataset 108 | user = User(ARCH, DATA, FLAGS.dataset, FLAGS.log, FLAGS.model) 109 | user.infer() 110 | -------------------------------------------------------------------------------- /train/tasks/semantic/modules/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PRBonn/lidar-bonnetal/99b827f0228ff0e997473ac8e2cecbaa4af7d7c7/train/tasks/semantic/modules/__init__.py -------------------------------------------------------------------------------- /train/tasks/semantic/modules/ioueval.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | 3 | # This file is covered by the LICENSE file in the root of this project. 4 | 5 | import torch 6 | import numpy as np 7 | import time 8 | 9 | from tasks.semantic.postproc.borderMask import borderMask 10 | 11 | 12 | class iouEval: 13 | def __init__(self, n_classes, device, ignore=None): 14 | self.n_classes = n_classes 15 | self.device = device 16 | # if ignore is larger than n_classes, consider no ignoreIndex 17 | self.ignore = torch.tensor(ignore).long() 18 | self.include = torch.tensor( 19 | [n for n in range(self.n_classes) if n not in self.ignore]).long() 20 | print("[IOU EVAL] IGNORE: ", self.ignore) 21 | print("[IOU EVAL] INCLUDE: ", self.include) 22 | self.reset() 23 | 24 | def num_classes(self): 25 | return self.n_classes 26 | 27 | def reset(self): 28 | self.conf_matrix = torch.zeros( 29 | (self.n_classes, self.n_classes), device=self.device).long() 30 | self.ones = None 31 | self.last_scan_size = None # for when variable scan size is used 32 | 33 | def addBatch(self, x, y): # x=preds, y=targets 34 | # if numpy, pass to pytorch 35 | # to tensor 36 | if isinstance(x, np.ndarray): 37 | x = torch.from_numpy(np.array(x)).long().to(self.device) 38 | if isinstance(y, np.ndarray): 39 | y = torch.from_numpy(np.array(y)).long().to(self.device) 40 | 41 | # sizes should be "batch_size x H x W" 42 | x_row = x.reshape(-1) # de-batchify 43 | y_row = y.reshape(-1) # de-batchify 44 | 45 | # idxs are labels and predictions 46 | idxs = torch.stack([x_row, y_row], dim=0) 47 | 48 | # ones is what I want to add to conf when I 49 | if self.ones is None or self.last_scan_size != idxs.shape[-1]: 50 | self.ones = torch.ones((idxs.shape[-1]), device=self.device).long() 51 | self.last_scan_size = idxs.shape[-1] 52 | 53 | # make confusion matrix (cols = gt, rows = pred) 54 | self.conf_matrix = self.conf_matrix.index_put_( 55 | tuple(idxs), self.ones, accumulate=True) 56 | 57 | # print(self.tp.shape) 58 | # print(self.fp.shape) 59 | # print(self.fn.shape) 60 | 61 | def getStats(self): 62 | # remove fp and fn from confusion on the ignore classes cols and rows 63 | conf = self.conf_matrix.clone().double() 64 | conf[self.ignore] = 0 65 | conf[:, self.ignore] = 0 66 | 67 | # get the clean stats 68 | tp = conf.diag() 69 | fp = conf.sum(dim=1) - tp 70 | fn = conf.sum(dim=0) - tp 71 | return tp, fp, fn 72 | 73 | def getIoU(self): 74 | tp, fp, fn = self.getStats() 75 | intersection = tp 76 | union = tp + fp + fn + 1e-15 77 | iou = intersection / union 78 | iou_mean = (intersection[self.include] / union[self.include]).mean() 79 | return iou_mean, iou # returns "iou mean", "iou per class" ALL CLASSES 80 | 81 | def getacc(self): 82 | tp, fp, fn = self.getStats() 83 | total_tp = tp.sum() 84 | total = tp[self.include].sum() + fp[self.include].sum() + 1e-15 85 | acc_mean = total_tp / total 
86 | return acc_mean # returns "acc mean" 87 | 88 | 89 | class biouEval(iouEval): 90 | def __init__(self, n_classes, device, ignore=None, border_size=1, kern_conn=4): 91 | super().__init__(n_classes, device, ignore) 92 | self.border_size = border_size 93 | self.kern_conn = kern_conn 94 | 95 | # check that I am only ignoring one class 96 | if len(ignore) > 1: 97 | raise ValueError("Length of ignored class list should be 1 or 0") 98 | elif len(ignore) == 0: 99 | ignore = None 100 | else: 101 | ignore = ignore[0] 102 | 103 | self.borderer = borderMask(self.n_classes, self.device, 104 | self.border_size, self.kern_conn, 105 | background_class=ignore) 106 | self.reset() 107 | 108 | def reset(self): 109 | super().reset() 110 | return 111 | 112 | def addBorderBatch1d(self, range_y, x, y, px, py): 113 | '''range_y=target as img, x=preds, y=targets, px,py=idxs of points of 114 | pointcloud in range img 115 | WARNING: Only batch size 1 works for now 116 | ''' 117 | # if numpy, pass to pytorch 118 | # to tensor 119 | if isinstance(range_y, np.ndarray): 120 | range_y = torch.from_numpy(np.array(range_y)).long().to(self.device) 121 | if isinstance(x, np.ndarray): 122 | x = torch.from_numpy(np.array(x)).long().to(self.device) 123 | if isinstance(y, np.ndarray): 124 | y = torch.from_numpy(np.array(y)).long().to(self.device) 125 | if isinstance(px, np.ndarray): 126 | px = torch.from_numpy(np.array(px)).long().to(self.device) 127 | if isinstance(py, np.ndarray): 128 | py = torch.from_numpy(np.array(py)).long().to(self.device) 129 | 130 | # get border mask of range_y 131 | border_mask_2d = self.borderer(range_y) 132 | 133 | # filter px, py according to if they are on border mask or not 134 | border_mask_1d = border_mask_2d[0, py, px].byte() 135 | 136 | # get proper points from filtered x and y 137 | x_in_mask = torch.masked_select(x, border_mask_1d) 138 | y_in_mask = torch.masked_select(y, border_mask_1d) 139 | 140 | # add batch 141 | self.addBatch(x_in_mask, y_in_mask) 142 | 143 | 144 | if __name__ == "__main__": 145 | # mock problem 146 | nclasses = 2 147 | ignore = [] 148 | 149 | # test with 2 squares and a known IOU 150 | lbl = torch.zeros((7, 7)).long() 151 | argmax = torch.zeros((7, 7)).long() 152 | 153 | # put squares 154 | lbl[2:4, 2:4] = 1 155 | argmax[3:5, 3:5] = 1 156 | 157 | # make evaluator 158 | eval = iouEval(nclasses, torch.device('cpu'), ignore) 159 | 160 | # run 161 | eval.addBatch(argmax, lbl) 162 | m_iou, iou = eval.getIoU() 163 | print("*"*80) 164 | print("Small iou mock problem") 165 | print("IoU: ", m_iou) 166 | print("IoU class: ", iou) 167 | m_acc = eval.getacc() 168 | print("Acc: ", m_acc) 169 | print("*"*80) 170 | -------------------------------------------------------------------------------- /train/tasks/semantic/modules/segmentator.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # This file is covered by the LICENSE file in the root of this project. 
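The mock problem at the end of `ioueval.py` is easy to verify by hand: the two 2x2 squares overlap in exactly one of the 49 cells, so class 1 has tp = 1, fp = 3, fn = 3 and IoU = 1/7 ~ 0.143, while class 0 scores 42/48 = 0.875, for a mean of about 0.509. The same confusion-matrix arithmetic in a few lines of numpy, as an independent cross-check rather than the repo's implementation:

```python
import numpy as np

lbl = np.zeros((7, 7), dtype=np.int64)
pred = np.zeros((7, 7), dtype=np.int64)
lbl[2:4, 2:4] = 1    # ground-truth square
pred[3:5, 3:5] = 1   # prediction, shifted by one cell

n = 2
conf = np.zeros((n, n), dtype=np.int64)
np.add.at(conf, (pred.ravel(), lbl.ravel()), 1)  # rows = pred, cols = gt

tp = np.diag(conf).astype(float)
fp = conf.sum(axis=1) - tp
fn = conf.sum(axis=0) - tp
iou = tp / (tp + fp + fn + 1e-15)
print(iou, iou.mean())  # [0.875 0.1428...] and ~0.509
```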
3 | 4 | import imp 5 | import torch 6 | import torch.nn as nn 7 | import torch.nn.functional as F 8 | from tasks.semantic.postproc.CRF import CRF 9 | import __init__ as booger 10 | 11 | 12 | class Segmentator(nn.Module): 13 | def __init__(self, ARCH, nclasses, path=None, path_append="", strict=False): 14 | super().__init__() 15 | self.ARCH = ARCH 16 | self.nclasses = nclasses 17 | self.path = path 18 | self.path_append = path_append 19 | self.strict = False 20 | 21 | # get the model 22 | bboneModule = imp.load_source("bboneModule", 23 | booger.TRAIN_PATH + '/backbones/' + 24 | self.ARCH["backbone"]["name"] + '.py') 25 | self.backbone = bboneModule.Backbone(params=self.ARCH["backbone"]) 26 | 27 | # do a pass of the backbone to initialize the skip connections 28 | stub = torch.zeros((1, 29 | self.backbone.get_input_depth(), 30 | self.ARCH["dataset"]["sensor"]["img_prop"]["height"], 31 | self.ARCH["dataset"]["sensor"]["img_prop"]["width"])) 32 | 33 | if torch.cuda.is_available(): 34 | stub = stub.cuda() 35 | self.backbone.cuda() 36 | _, stub_skips = self.backbone(stub) 37 | 38 | decoderModule = imp.load_source("decoderModule", 39 | booger.TRAIN_PATH + '/tasks/semantic/decoders/' + 40 | self.ARCH["decoder"]["name"] + '.py') 41 | self.decoder = decoderModule.Decoder(params=self.ARCH["decoder"], 42 | stub_skips=stub_skips, 43 | OS=self.ARCH["backbone"]["OS"], 44 | feature_depth=self.backbone.get_last_depth()) 45 | 46 | self.head = nn.Sequential(nn.Dropout2d(p=ARCH["head"]["dropout"]), 47 | nn.Conv2d(self.decoder.get_last_depth(), 48 | self.nclasses, kernel_size=3, 49 | stride=1, padding=1)) 50 | 51 | if self.ARCH["post"]["CRF"]["use"]: 52 | self.CRF = CRF(self.ARCH["post"]["CRF"]["params"], self.nclasses) 53 | else: 54 | self.CRF = None 55 | 56 | # train backbone? 57 | if not self.ARCH["backbone"]["train"]: 58 | for w in self.backbone.parameters(): 59 | w.requires_grad = False 60 | 61 | # train decoder? 62 | if not self.ARCH["decoder"]["train"]: 63 | for w in self.decoder.parameters(): 64 | w.requires_grad = False 65 | 66 | # train head? 67 | if not self.ARCH["head"]["train"]: 68 | for w in self.head.parameters(): 69 | w.requires_grad = False 70 | 71 | # train CRF? 
72 | if self.CRF and not self.ARCH["post"]["CRF"]["train"]: 73 | for w in self.CRF.parameters(): 74 | w.requires_grad = False 75 | 76 | # print number of parameters and the ones requiring gradients 77 | 78 | weights_total = sum(p.numel() for p in self.parameters()) 79 | weights_grad = sum(p.numel() for p in self.parameters() if p.requires_grad) 80 | print("Total number of parameters: ", weights_total) 81 | print("Total number of parameters requires_grad: ", weights_grad) 82 | 83 | # breakdown by layer 84 | weights_enc = sum(p.numel() for p in self.backbone.parameters()) 85 | weights_dec = sum(p.numel() for p in self.decoder.parameters()) 86 | weights_head = sum(p.numel() for p in self.head.parameters()) 87 | print("Param encoder ", weights_enc) 88 | print("Param decoder ", weights_dec) 89 | print("Param head ", weights_head) 90 | if self.CRF: 91 | weights_crf = sum(p.numel() for p in self.CRF.parameters()) 92 | print("Param CRF ", weights_crf) 93 | 94 | # get weights 95 | if path is not None: 96 | # try backbone 97 | try: 98 | w_dict = torch.load(path + "/backbone" + path_append, 99 | map_location=lambda storage, loc: storage) 100 | self.backbone.load_state_dict(w_dict, strict=True) 101 | print("Successfully loaded model backbone weights") 102 | except Exception as e: 103 | print() 104 | print("Couldn't load backbone, using random weights. Error: ", e) 105 | if strict: 106 | print("I'm in strict mode and failure to load weights blows me up :)") 107 | raise e 108 | 109 | # try decoder 110 | try: 111 | w_dict = torch.load(path + "/segmentation_decoder" + path_append, 112 | map_location=lambda storage, loc: storage) 113 | self.decoder.load_state_dict(w_dict, strict=True) 114 | print("Successfully loaded model decoder weights") 115 | except Exception as e: 116 | print("Couldn't load decoder, using random weights. Error: ", e) 117 | if strict: 118 | print("I'm in strict mode and failure to load weights blows me up :)") 119 | raise e 120 | 121 | # try head 122 | try: 123 | w_dict = torch.load(path + "/segmentation_head" + path_append, 124 | map_location=lambda storage, loc: storage) 125 | self.head.load_state_dict(w_dict, strict=True) 126 | print("Successfully loaded model head weights") 127 | except Exception as e: 128 | print("Couldn't load head, using random weights. Error: ", e) 129 | if strict: 130 | print("I'm in strict mode and failure to load weights blows me up :)") 131 | raise e 132 | 133 | # try CRF 134 | if self.CRF: 135 | try: 136 | w_dict = torch.load(path + "/segmentation_CRF" + path_append, 137 | map_location=lambda storage, loc: storage) 138 | self.CRF.load_state_dict(w_dict, strict=True) 139 | print("Successfully loaded model CRF weights") 140 | except Exception as e: 141 | print("Couldn't load CRF, using random weights. 
Error: ", e) 142 | if strict: 143 | print("I'm in strict mode and failure to load weights blows me up :)") 144 | raise e 145 | else: 146 | print("No path to pretrained, using random init.") 147 | 148 | def forward(self, x, mask=None): 149 | y, skips = self.backbone(x) 150 | y = self.decoder(y, skips) 151 | y = self.head(y) 152 | y = F.softmax(y, dim=1) 153 | if self.CRF: 154 | assert(mask is not None) 155 | y = self.CRF(x, y, mask) 156 | return y 157 | 158 | def save_checkpoint(self, logdir, suffix=""): 159 | # Save the weights 160 | torch.save(self.backbone.state_dict(), logdir + 161 | "/backbone" + suffix) 162 | torch.save(self.decoder.state_dict(), logdir + 163 | "/segmentation_decoder" + suffix) 164 | torch.save(self.head.state_dict(), logdir + 165 | "/segmentation_head" + suffix) 166 | if self.CRF: 167 | torch.save(self.CRF.state_dict(), logdir + 168 | "/segmentation_CRF" + suffix) 169 | -------------------------------------------------------------------------------- /train/tasks/semantic/modules/user.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # This file is covered by the LICENSE file in the root of this project. 3 | 4 | import torch 5 | import torch.nn as nn 6 | import torch.optim as optim 7 | import torch.backends.cudnn as cudnn 8 | import torchvision.transforms as transforms 9 | import imp 10 | import yaml 11 | import time 12 | from PIL import Image 13 | import __init__ as booger 14 | import collections 15 | import copy 16 | import cv2 17 | import os 18 | import numpy as np 19 | 20 | from tasks.semantic.modules.segmentator import * 21 | from tasks.semantic.postproc.KNN import KNN 22 | 23 | 24 | class User(): 25 | def __init__(self, ARCH, DATA, datadir, logdir, modeldir): 26 | # parameters 27 | self.ARCH = ARCH 28 | self.DATA = DATA 29 | self.datadir = datadir 30 | self.logdir = logdir 31 | self.modeldir = modeldir 32 | 33 | # get the data 34 | parserModule = imp.load_source("parserModule", 35 | booger.TRAIN_PATH + '/tasks/semantic/dataset/' + 36 | self.DATA["name"] + '/parser.py') 37 | self.parser = parserModule.Parser(root=self.datadir, 38 | train_sequences=self.DATA["split"]["train"], 39 | valid_sequences=self.DATA["split"]["valid"], 40 | test_sequences=self.DATA["split"]["test"], 41 | labels=self.DATA["labels"], 42 | color_map=self.DATA["color_map"], 43 | learning_map=self.DATA["learning_map"], 44 | learning_map_inv=self.DATA["learning_map_inv"], 45 | sensor=self.ARCH["dataset"]["sensor"], 46 | max_points=self.ARCH["dataset"]["max_points"], 47 | batch_size=1, 48 | workers=self.ARCH["train"]["workers"], 49 | gt=True, 50 | shuffle_train=False) 51 | 52 | # concatenate the encoder and the head 53 | with torch.no_grad(): 54 | self.model = Segmentator(self.ARCH, 55 | self.parser.get_n_classes(), 56 | self.modeldir) 57 | 58 | # use knn post processing? 59 | self.post = None 60 | if self.ARCH["post"]["KNN"]["use"]: 61 | self.post = KNN(self.ARCH["post"]["KNN"]["params"], 62 | self.parser.get_n_classes()) 63 | 64 | # GPU? 
64 |     # GPU?
65 |     self.gpu = False
66 |     self.model_single = self.model
67 |     self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
68 |     print("Inferring on device: ", self.device)
69 |     if torch.cuda.is_available() and torch.cuda.device_count() > 0:
70 |       cudnn.benchmark = True
71 |       cudnn.fastest = True
72 |       self.gpu = True
73 |       self.model.cuda()
74 | 
75 |   def infer(self):
76 |     # do train set
77 |     self.infer_subset(loader=self.parser.get_train_set(),
78 |                       to_orig_fn=self.parser.to_original)
79 | 
80 |     # do valid set
81 |     self.infer_subset(loader=self.parser.get_valid_set(),
82 |                       to_orig_fn=self.parser.to_original)
83 |     # do test set
84 |     self.infer_subset(loader=self.parser.get_test_set(),
85 |                       to_orig_fn=self.parser.to_original)
86 | 
87 |     print('Finished Inferring')
88 | 
89 |     return
90 | 
91 |   def infer_subset(self, loader, to_orig_fn):
92 |     # switch to evaluate mode
93 |     self.model.eval()
94 | 
95 |     # empty the cache to infer in high res
96 |     if self.gpu:
97 |       torch.cuda.empty_cache()
98 | 
99 |     with torch.no_grad():
100 |       end = time.time()
101 | 
102 |       for i, (proj_in, proj_mask, _, _, path_seq, path_name, p_x, p_y, proj_range, unproj_range, _, _, _, _, npoints) in enumerate(loader):
103 |         # first cut to real size (batch size one allows it)
104 |         p_x = p_x[0, :npoints]
105 |         p_y = p_y[0, :npoints]
106 |         proj_range = proj_range[0, :npoints]
107 |         unproj_range = unproj_range[0, :npoints]
108 |         path_seq = path_seq[0]
109 |         path_name = path_name[0]
110 | 
111 |         if self.gpu:
112 |           proj_in = proj_in.cuda()
113 |           proj_mask = proj_mask.cuda()
114 |           p_x = p_x.cuda()
115 |           p_y = p_y.cuda()
116 |           if self.post:
117 |             proj_range = proj_range.cuda()
118 |             unproj_range = unproj_range.cuda()
119 | 
120 |         # compute output
121 |         proj_output = self.model(proj_in, proj_mask)
122 |         proj_argmax = proj_output[0].argmax(dim=0)
123 | 
124 |         if self.post:
125 |           # knn postproc
126 |           unproj_argmax = self.post(proj_range,
127 |                                     unproj_range,
128 |                                     proj_argmax,
129 |                                     p_x,
130 |                                     p_y)
131 |         else:
132 |           # put in original pointcloud using indexes
133 |           unproj_argmax = proj_argmax[p_y, p_x]
134 | 
135 |         # measure elapsed time
136 |         if torch.cuda.is_available():
137 |           torch.cuda.synchronize()
138 | 
139 |         print("Inferred seq", path_seq, "scan", path_name,
140 |               "in", time.time() - end, "sec")
141 |         end = time.time()
142 | 
143 |         # prepare the predictions for saving
144 |         # get the first scan in batch and project scan
145 |         pred_np = unproj_argmax.cpu().numpy()
146 |         pred_np = pred_np.reshape((-1)).astype(np.int32)
147 | 
148 |         # map to original label
149 |         pred_np = to_orig_fn(pred_np)
150 | 
151 |         # save scan
152 |         path = os.path.join(self.logdir, "sequences",
153 |                             path_seq, "predictions", path_name)
154 |         pred_np.tofile(path)
155 | 
--------------------------------------------------------------------------------
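
# A small, self-contained sketch of the unprojection step used in
# infer_subset() above: the per-point pixel coordinates (p_x, p_y) stored by
# the projection let us read a per-point label back from the 2D argmax with a
# single fancy-indexing gather. All sizes here are toy placeholders.
import torch

H, W, n_points = 4, 8, 6
proj_argmax = torch.randint(0, 20, (H, W))  # per-pixel class ids
p_y = torch.randint(0, H, (n_points,))      # image row of each 3D point
p_x = torch.randint(0, W, (n_points,))      # image column of each 3D point
unproj_argmax = proj_argmax[p_y, p_x]       # per-point class ids
print(unproj_argmax.shape)                  # torch.Size([6])
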
/train/tasks/semantic/postproc/CRF.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # This file is covered by the LICENSE file in the root of this project.
3 | 
4 | import numpy as np
5 | from scipy import signal
6 | import torch
7 | import torch.nn as nn
8 | import torch.nn.functional as F
9 | import __init__ as booger
10 | 
11 | 
12 | class LocallyConnectedXYZLayer(nn.Module):
13 |   def __init__(self, h, w, sigma, nclasses):
14 |     super().__init__()
15 |     # size of window
16 |     self.h = h
17 |     self.padh = h//2
18 |     self.w = w
19 |     self.padw = w//2
20 |     assert(self.h % 2 == 1 and self.w % 2 == 1)  # window must be odd
21 |     self.sigma = sigma
22 |     self.gauss_den = 2 * self.sigma**2
23 |     self.nclasses = nclasses
24 | 
25 |   def forward(self, xyz, softmax, mask):
26 |     # softmax size
27 |     N, C, H, W = softmax.shape
28 | 
29 |     # make softmax zero everywhere input is invalid
30 |     softmax = softmax * mask.unsqueeze(1).float()
31 | 
32 |     # get x,y,z for distance (shape N,1,H,W)
33 |     x = xyz[:, 0].unsqueeze(1)
34 |     y = xyz[:, 1].unsqueeze(1)
35 |     z = xyz[:, 2].unsqueeze(1)
36 | 
37 |     # im2col in size of window of input (x,y,z separately)
38 |     window_x = F.unfold(x, kernel_size=(self.h, self.w),
39 |                         padding=(self.padh, self.padw))
40 |     center_x = F.unfold(x, kernel_size=(1, 1),
41 |                         padding=(0, 0))
42 |     window_y = F.unfold(y, kernel_size=(self.h, self.w),
43 |                         padding=(self.padh, self.padw))
44 |     center_y = F.unfold(y, kernel_size=(1, 1),
45 |                         padding=(0, 0))
46 |     window_z = F.unfold(z, kernel_size=(self.h, self.w),
47 |                         padding=(self.padh, self.padw))
48 |     center_z = F.unfold(z, kernel_size=(1, 1),
49 |                         padding=(0, 0))
50 | 
51 |     # sq distance to center (center distance is zero)
52 |     unravel_dist2 = (window_x - center_x)**2 + \
53 |                     (window_y - center_y)**2 + \
54 |                     (window_z - center_z)**2
55 | 
56 |     # weight input distance by gaussian weights
57 |     unravel_gaussian = torch.exp(- unravel_dist2 / self.gauss_den)
58 | 
59 |     # im2col in size of window of softmax to reweight by gaussian weights from input
60 |     cloned_softmax = softmax.clone()
61 |     for i in range(self.nclasses):
62 |       # get the softmax for this class
63 |       c_softmax = softmax[:, i].unsqueeze(1)
64 |       # unfold this class to weigh it by the proper gaussian weights
65 |       unravel_softmax = F.unfold(c_softmax,
66 |                                  kernel_size=(self.h, self.w),
67 |                                  padding=(self.padh, self.padw))
68 |       unravel_w_softmax = unravel_softmax * unravel_gaussian
69 |       # add dimension 1 to obtain the new softmax for this class
70 |       unravel_added_softmax = unravel_w_softmax.sum(dim=1).unsqueeze(1)
71 |       # fold it and put it in new tensor
72 |       added_softmax = unravel_added_softmax.view(N, H, W)
73 |       cloned_softmax[:, i] = added_softmax
74 | 
75 |     return cloned_softmax
76 | 
77 | 
78 | class CRF(nn.Module):
79 |   def __init__(self, params, nclasses):
80 |     super().__init__()
81 |     self.params = params
82 |     self.iter = torch.nn.Parameter(torch.tensor(params["iter"]),
83 |                                    requires_grad=False)
84 |     self.lcn_size = torch.nn.Parameter(torch.tensor([params["lcn_size"]["h"],
85 |                                                      params["lcn_size"]["w"]]),
86 |                                        requires_grad=False)
87 |     self.xyz_coef = torch.nn.Parameter(torch.tensor(params["xyz_coef"]),
88 |                                        requires_grad=False).float()
89 |     self.xyz_sigma = torch.nn.Parameter(torch.tensor(params["xyz_sigma"]),
90 |                                         requires_grad=False).float()
91 | 
92 |     self.nclasses = nclasses
93 |     print("Using CRF!")
94 | 
95 |     # define layers here
96 |     # compat init
97 |     self.compat_kernel_init = np.reshape(np.ones((self.nclasses, self.nclasses)) -
98 |                                          np.identity(self.nclasses),
99 |                                          [self.nclasses, self.nclasses, 1, 1])
100 | 
101 |     # bilateral compatibility matrices
102 |     self.compat_conv = nn.Conv2d(self.nclasses, self.nclasses, 1)
103 |     self.compat_conv.weight = torch.nn.Parameter(torch.from_numpy(
104 |         self.compat_kernel_init).float() * self.xyz_coef, requires_grad=True)
105 | 
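
# A short sketch of the compatibility initialization above: a Potts-style
# matrix that is 0 on the diagonal (no penalty when labels agree) and 1
# elsewhere, reshaped into a 1x1 convolution kernel over the class dimension.
import numpy as np

nclasses = 4
compat = np.ones((nclasses, nclasses)) - np.identity(nclasses)
kernel = np.reshape(compat, [nclasses, nclasses, 1, 1])
print(compat)        # zeros on the diagonal, ones everywhere else
print(kernel.shape)  # (4, 4, 1, 1)
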
106 |     # locally connected layer for message passing
107 |     self.local_conn_xyz = LocallyConnectedXYZLayer(params["lcn_size"]["h"],
108 |                                                    params["lcn_size"]["w"],
109 |                                                    params["xyz_sigma"],
110 |                                                    self.nclasses)
111 | 
112 |   def forward(self, input, softmax, mask):
113 |     # use xyz
114 |     xyz = input[:, 1:4]
115 | 
116 |     # iteratively
117 |     for _ in range(self.iter):
118 |       # message passing as locally connected layer
119 |       locally_connected = self.local_conn_xyz(xyz, softmax, mask)
120 | 
121 |       # reweigh with the 1x1 convolution
122 |       reweight_softmax = self.compat_conv(locally_connected)
123 | 
124 |       # add the new values to the original softmax
125 |       reweight_softmax = reweight_softmax + softmax
126 | 
127 |       # lastly, renormalize
128 |       softmax = F.softmax(reweight_softmax, dim=1)
129 | 
130 |     return softmax
131 | 
--------------------------------------------------------------------------------
/train/tasks/semantic/postproc/KNN.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # This file is covered by the LICENSE file in the root of this project.
3 | 
4 | import math
5 | import torch
6 | import torch.nn as nn
7 | import torch.nn.functional as F
8 | import __init__ as booger
9 | 
10 | 
11 | def get_gaussian_kernel(kernel_size=3, sigma=2, channels=1):
12 |   # Create an x, y coordinate grid of shape (kernel_size, kernel_size, 2)
13 |   x_coord = torch.arange(kernel_size)
14 |   x_grid = x_coord.repeat(kernel_size).view(kernel_size, kernel_size)
15 |   y_grid = x_grid.t()
16 |   xy_grid = torch.stack([x_grid, y_grid], dim=-1).float()
17 | 
18 |   mean = (kernel_size - 1) / 2.
19 |   variance = sigma**2.
20 | 
21 |   # Calculate the 2-dimensional gaussian kernel which is
22 |   # the product of two gaussian distributions for two different
23 |   # variables (in this case called x and y)
24 |   gaussian_kernel = (1. / (2. * math.pi * variance)) *\
25 |       torch.exp(-torch.sum((xy_grid - mean)**2., dim=-1) / (2 * variance))
26 | 
27 |   # Make sure sum of values in gaussian kernel equals 1.
28 |   gaussian_kernel = gaussian_kernel / torch.sum(gaussian_kernel)
29 | 
30 |   # Reshape to 2d depthwise convolutional weight
31 |   gaussian_kernel = gaussian_kernel.view(kernel_size, kernel_size)
32 | 
33 |   return gaussian_kernel
34 | 
35 | 
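# A quick sanity-check sketch of the helper above (not part of this file):
# the returned kernel is a normalized 2D gaussian, so its entries sum to 1
# and the largest weight sits at the center.
kernel = get_gaussian_kernel(kernel_size=3, sigma=2)
print(kernel.shape)                  # torch.Size([3, 3])
print(kernel.sum())                  # tensor(1.) up to float rounding
print(kernel[1, 1] == kernel.max())  # the center is the peak
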
36 | class KNN(nn.Module):
37 |   def __init__(self, params, nclasses):
38 |     super().__init__()
39 |     print("*"*80)
40 |     print("Cleaning point-clouds with kNN post-processing")
41 |     self.knn = params["knn"]
42 |     self.search = params["search"]
43 |     self.sigma = params["sigma"]
44 |     self.cutoff = params["cutoff"]
45 |     self.nclasses = nclasses
46 |     print("kNN parameters:")
47 |     print("knn:", self.knn)
48 |     print("search:", self.search)
49 |     print("sigma:", self.sigma)
50 |     print("cutoff:", self.cutoff)
51 |     print("nclasses:", self.nclasses)
52 |     print("*"*80)
53 | 
54 |   def forward(self, proj_range, unproj_range, proj_argmax, px, py):
55 |     ''' Warning! Only works for un-batched pointclouds.
56 |         If they come batched we need to iterate over the batch dimension or do
57 |         something REALLY smart to handle unaligned number of points in memory
58 |     '''
59 |     # get device
60 |     if proj_range.is_cuda:
61 |       device = torch.device("cuda")
62 |     else:
63 |       device = torch.device("cpu")
64 | 
65 |     # sizes of projection scan
66 |     H, W = proj_range.shape
67 | 
68 |     # number of points
69 |     P = unproj_range.shape
70 | 
71 |     # check if size of kernel is odd and complain
72 |     if (self.search % 2 == 0):
73 |       raise ValueError("Nearest neighbor kernel must be odd number")
74 | 
75 |     # calculate padding
76 |     pad = int((self.search - 1) / 2)
77 | 
78 |     # unfold neighborhood to get nearest neighbors for each pixel (range image)
79 |     proj_unfold_k_rang = F.unfold(proj_range[None, None, ...],
80 |                                   kernel_size=(self.search, self.search),
81 |                                   padding=(pad, pad))
82 | 
83 |     # index with px, py to get ALL the pcld points
84 |     idx_list = py * W + px
85 |     unproj_unfold_k_rang = proj_unfold_k_rang[:, :, idx_list]
86 | 
87 |     # WARNING, THIS IS A HACK
88 |     # Make non valid (<0) range points extremely big so that there is no screwing
89 |     # up the nn search
90 |     unproj_unfold_k_rang[unproj_unfold_k_rang < 0] = float("inf")
91 | 
92 |     # now the matrix is unfolded TOTALLY, replace the middle points with the actual range points
93 |     center = int(((self.search * self.search) - 1) / 2)
94 |     unproj_unfold_k_rang[:, center, :] = unproj_range
95 | 
96 |     # now compare range
97 |     k2_distances = torch.abs(unproj_unfold_k_rang - unproj_range)
98 | 
99 |     # make a kernel to weigh the ranges according to distance in (x,y)
100 |     # I make this 1 - kernel because I want distances that are close in (x,y)
101 |     # to matter more
102 |     inv_gauss_k = (
103 |         1 - get_gaussian_kernel(self.search, self.sigma, 1)).view(1, -1, 1)
104 |     inv_gauss_k = inv_gauss_k.to(device).type(proj_range.type())
105 | 
106 |     # apply weighing
107 |     k2_distances = k2_distances * inv_gauss_k
108 | 
109 |     # find nearest neighbors
110 |     _, knn_idx = k2_distances.topk(
111 |         self.knn, dim=1, largest=False, sorted=False)
112 | 
113 |     # do the same unfolding with the argmax
114 |     proj_unfold_1_argmax = F.unfold(proj_argmax[None, None, ...].float(),
115 |                                     kernel_size=(self.search, self.search),
116 |                                     padding=(pad, pad)).long()
117 |     unproj_unfold_1_argmax = proj_unfold_1_argmax[:, :, idx_list]
118 | 
119 |     # get the top k predictions from the knn at each pixel
120 |     knn_argmax = torch.gather(
121 |         input=unproj_unfold_1_argmax, dim=1, index=knn_idx)
122 | 
123 |     # fake an invalid argmax of classes + 1 for all cutoff items
124 |     if self.cutoff > 0:
125 |       knn_distances = torch.gather(input=k2_distances, dim=1, index=knn_idx)
126 |       knn_invalid_idx = knn_distances > self.cutoff
127 |       knn_argmax[knn_invalid_idx] = self.nclasses
128 | 
129 |     # now vote
130 |     # argmax onehot has an extra class for objects after cutoff
131 |     knn_argmax_onehot = torch.zeros(
132 |         (1, self.nclasses + 1, P[0]), device=device).type(proj_range.type())
133 |     ones = torch.ones_like(knn_argmax).type(proj_range.type())
134 |     knn_argmax_onehot = knn_argmax_onehot.scatter_add_(1, knn_argmax, ones)
135 | 
136 |     # now vote (as a sum over the onehot counts) (don't let it choose unlabeled OR invalid)
137 |     knn_argmax_out = knn_argmax_onehot[:, 1:-1].argmax(dim=1) + 1
138 | 
139 |     # reshape again
140 |     knn_argmax_out = knn_argmax_out.view(P)
141 | 
142 |     return knn_argmax_out
143 | 
--------------------------------------------------------------------------------
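
# A toy, self-contained version of the voting step in KNN.forward() above:
# the k neighbor labels of each point are turned into one-hot counts with
# scatter_add_, and the most frequent class wins. Sizes are placeholders.
import torch

nclasses, n_points, k = 3, 4, 5
knn_argmax = torch.randint(0, nclasses, (1, k, n_points))
onehot = torch.zeros((1, nclasses, n_points))
onehot.scatter_add_(1, knn_argmax,
                    torch.ones_like(knn_argmax, dtype=torch.float))
votes = onehot.argmax(dim=1)  # (1, n_points): winning class per point
print(votes)
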
/train/tasks/semantic/postproc/__init__.py:
--------------------------------------------------------------------------------
1 | import sys
2 | TRAIN_PATH = "../"
3 | DEPLOY_PATH = "../../deploy"
4 | sys.path.insert(0, TRAIN_PATH)
--------------------------------------------------------------------------------
/train/tasks/semantic/postproc/borderMask.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # This file is covered by the LICENSE file in the root of this project.
3 | """Border Mask for 2D labeled range images.
4 | 
5 | Simple module to obtain the border mask of a given range image.
6 | 
7 | The border mask is defined as the zone where there are intersections between
8 | different classes for the given range image.
9 | 
10 | In this case we will bend the definition a little and augment it. We
11 | define the border mask as the zone where there are intersections between different
12 | classes for the given range image in a determined neighborhood. To obtain this
13 | border mask we will need to apply the binary erosion algorithm multiple times to
14 | the same range image.
15 | 
16 | Example:
17 |   Suppose we have 3 classes and this given range image (labeled):
18 |   [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
19 |   [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
20 |   [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
21 |   [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
22 |   [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
23 |   [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
24 |   [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
25 |   [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
26 |   [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
27 |   [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
28 |   [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
29 |   [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
30 |   [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
31 |   [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 0, 0],
32 |   [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 0, 0],
33 |   [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 0, 0],
34 |   [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 0, 0],
35 |   [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 0, 0],
36 |   [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
37 |   [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
38 | 
39 |   The output of the border mask would look like:
40 |   # 1 erode iteration with a connectivity kernel of 4:
41 |   [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
42 |   [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
43 |   [1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1],
44 |   [1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1],
45 |   [1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1],
46 |   [1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1],
47 |   [1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1],
48 |   [1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1],
49 |   [1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1],
50 |   [1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1],
51 |   [1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1],
52 |   [1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1],
53 |   [1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1],
54 |   [1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
55 |   [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1],
56 |   [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1],
57 |   [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1],
58 |   [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1],
59 |   [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1],
60 |   [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
61 | 
62 |   # 2 erode iterations with a connectivity kernel of 8:
63 |   [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
64 |   [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
65 |   [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1],
66 |   [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1],
67 |   [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1],
68 |   [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1],
69 |   [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1],
70 |   [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1],
71 |   [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1],
72 |   [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1],
73 |   [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1],
74 |   [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
75 |   [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
76 |   [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
77 |   [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
78 |   [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1],
79 |   [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1],
80 |   [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1],
81 |   [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
82 |   [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]
83 | """
84 | 
85 | import torch
86 | import torch.nn as nn
87 | import torch.nn.functional as F
88 | from common.onehot import oneHot
89 | import __init__ as booger
90 | 
91 | 
92 | class borderMask(nn.Module):
93 |   def __init__(self, nclasses, device, border_size, kern_conn=4, background_class=None):
94 |     """Get the binary border mask of a labeled 2d range image.
95 | 
96 |     Args:
97 |       nclasses(int) : The number of classes labeled in the input image
98 |       device(torch.device) : Process in host or cuda?
99 |       border_size(int) : How many erode iterations to perform for the mask
100 |       kern_conn(int) : The connectivity kernel number (4 or 8)
101 |       background_class(int) : "unlabeled" class in dataset (to avoid double borders)
102 | 
103 |     Returns:
104 |       eroded_output(tensor) : The 2d binary border mask, 1 where an intersection
105 |                               between classes occurs, 0 everywhere else
106 | 
107 |     """
108 |     super().__init__()
109 |     self.nclasses = nclasses
110 |     self.device = device
111 |     self.border_size = border_size
112 |     self.kern_conn = kern_conn
113 |     self.background_class = background_class
114 |     if self.background_class is not None:
115 |       self.include_idx = list(range(self.nclasses))
116 |       self.exclude_idx = self.include_idx.pop(self.background_class)
117 | 
118 |     # check connectivity
119 |     # For obtaining the border mask we will be eroding the input image, for this
120 |     # reason we only support erode_kernels with connectivity 4 or 8
121 |     assert self.kern_conn in (4, 8), ("The specified kernel connectivity(kern_conn= %r) is "
122 |                                       "not supported" % self.kern_conn)
123 | 
124 |     # make the onehot inferer
125 |     self.onehot = oneHot(self.device,
126 |                          self.nclasses,
127 |                          spatial_dim=2)  # range labels
128 | 
129 |   def forward(self, range_label):
130 |     # length of shape of range_label must be 3 (N, H, W)
131 |     must_unbatch = False  # remove batch dimension after operation?
132 |     if len(range_label.shape) != 3:
133 |       range_label = range_label[None, ...]
134 |       must_unbatch = True
135 | 
136 |     # The range_label comes labeled, we need to create one tensor per class, thus:
137 |     input_tensor = self.onehot(range_label)  # (N, C, H, W)
138 | 
139 |     # Because we are using GT range_labels, there are a lot of pixels that end up
140 |     # unlabeled (thus, in the background). If we feed the erosion algorithm with
141 |     # these "raw" gt_labels we will detect intersections between the other classes
142 |     # and the background, and we will end up with an incorrect border mask. To
143 |     # solve this issue we need to pre-process the input gt_label. The trick in
144 |     # this case is to sum the background channel (mostly channel 0) to
145 |     # all the rest of the channels except for the background channel itself.
146 |     # This will allow us to avoid detecting intersections between a class and the
147 |     # background. This also forces us to change the logical AND we were doing to
148 |     # obtain the border mask when we were working with predicted labels.
149 |     # With predicted labels we won't see this problem because all the pixels belong
150 |     # to at least one class
151 |     if self.background_class is not None:
152 |       input_tensor[:, self.include_idx] = input_tensor[:, self.include_idx] + \
153 |           input_tensor[:, self.exclude_idx]
154 | 
155 |     # C denotes the number of channels; N, H and W are not needed here
156 |     C = input_tensor.shape[1]
157 | 
158 |     # Create an empty erode kernel and send it to 'device'
159 |     erode_kernel = torch.zeros((C, 1, 3, 3), device=self.device)
160 |     if self.kern_conn == 4:
161 |       erode_kernel[:] = torch.tensor([[0, 1, 0],
162 |                                       [1, 1, 1],
163 |                                       [0, 1, 0]], device=self.device)
164 |     else:
165 |       erode_kernel[:] = torch.tensor([[1, 1, 1],
166 |                                       [1, 1, 1],
167 |                                       [1, 1, 1]], device=self.device)
168 | 
169 |     # to check connectivity
170 |     kernel_sum = erode_kernel[0][0].sum()  # should be kern_conn + 1
171 | 
172 |     # erode the input image border_size times
173 |     erode_input = input_tensor
174 |     for _ in range(self.border_size):
175 |       eroded_output = F.conv2d(erode_input, erode_kernel, groups=C, padding=1)
176 |       # Pick the elements that match the kernel_sum to obtain the eroded
177 |       # output and convert to dtype=float32
178 |       eroded_output = (eroded_output == kernel_sum).float()
179 |       erode_input = eroded_output
180 | 
181 |     # We want to sum up all the channels into 1 unique border mask
182 |     # Even when we added the background to all the rest of the channels, there
183 |     # might be "bodies" in the background channel, thus, the erosion process can
184 |     # output "false positives" where these "bodies" are present in the background.
185 |     # We need to obtain the background mask and add it to the eroded bodies to
186 |     # obtain a consistent output once we calculate the border mask
187 |     if self.background_class is not None:
188 |       background_mask = (eroded_output[:, self.exclude_idx] == 1)
189 | 
190 |     # The eroded_bodies mask will consist of all the pixels where the convolution
191 |     # returned 1 for all the channels, therefore we need to sum up all the
192 |     # channels into one unique tensor and add the background mask to avoid having
193 |     # the background in the border mask output
194 |     eroded_bodies = (eroded_output.sum(1, keepdim=True) == 1)
195 |     if self.background_class is not None:
196 |       eroded_bodies = eroded_bodies + background_mask
197 | 
198 |     # we want the opposite
199 |     borders = 1 - eroded_bodies
200 | 
201 |     # unbatch?
202 |     if must_unbatch:
203 |       borders = borders[0]
204 |       # import cv2
205 |       # import numpy as np
206 |       # bordersprint = (borders * 255).squeeze().cpu().numpy().astype(np.uint8)
207 |       # cv2.imshow("border", bordersprint)
208 |       # cv2.waitKey(0)
209 | 
210 |     return borders
211 | 
212 | 
213 | if __name__ == "__main__":
214 |   import argparse
215 |   parser = argparse.ArgumentParser("./borderMask.py")
216 |   parser.add_argument(
217 |       '--scan', '-s',
218 |       type=str,
219 |       required=True,
220 |       help='Scan to get xyz. No Default',
221 |   )
222 |   parser.add_argument(
223 |       '--label', '-l',
224 |       type=str,
225 |       required=True,
226 |       help='Label to calculate border mask. No Default',
227 |   )
228 |   parser.add_argument(
229 |       '--exclude_class', '-e',
230 |       type=int,
231 |       required=False,
232 |       default=None,
233 |       help='Label to ignore. No Default',
234 |   )
235 |   parser.add_argument(
236 |       '--border', '-b',
237 |       type=int,
238 |       required=False,
239 |       default=1,
240 |       help='Border size. Defaults to %(default)s',
241 |   )
242 |   parser.add_argument(
243 |       '--conn', '-c',
244 |       type=int,
245 |       required=False,
246 |       default=4,
247 |       help='Kernel connectivity. Defaults to %(default)s',
248 |   )
249 |   FLAGS, unparsed = parser.parse_known_args()
250 | 
251 |   # print summary of what we will do
252 |   print("----------")
253 |   print("INTERFACE:")
254 |   print("Scan", FLAGS.scan)
255 |   print("Label", FLAGS.label)
256 |   print("Exclude class", FLAGS.exclude_class)
257 |   print("Border", FLAGS.border)
258 |   print("Connectivity", FLAGS.conn)
259 |   print("----------\n")
260 | 
261 |   # get device
262 |   if torch.cuda.is_available():
263 |     device = torch.device('cuda')
264 |   else:
265 |     device = torch.device('cpu')
266 | 
267 |   # define the border mask
268 |   bm = borderMask(300, device, FLAGS.border,
269 |                   FLAGS.conn, FLAGS.exclude_class)
270 | 
271 |   # imports for inference part
272 |   import cv2
273 |   import numpy as np
274 |   from common.laserscan import LaserScan, SemLaserScan
275 | 
276 |   # open label and project
277 |   scan = SemLaserScan(project=True, max_classes=300)
278 |   scan.open_scan(FLAGS.scan)
279 |   scan.open_label(FLAGS.label)
280 | 
281 |   # get the things I need
282 |   proj_range = torch.from_numpy(scan.proj_range).to(device)
283 |   proj_sem_label = torch.from_numpy(scan.proj_sem_label).long().to(device)
284 |   proj_sem_color = torch.from_numpy(scan.proj_sem_color).to(device)
285 | 
286 |   # run the border mask
287 |   border_mask = bm(proj_sem_label)
288 | 
289 |   # bring to numpy and normalize for showing
290 |   proj_range = proj_range.cpu().numpy()
291 |   proj_sem_label = proj_sem_label.cpu().numpy()
292 |   proj_sem_color = proj_sem_color.cpu().numpy()
293 |   border_mask = border_mask.cpu().numpy().squeeze()
294 | 
295 |   # norm
296 |   proj_range = (proj_range / proj_range.max() * 255).astype(np.uint8)
297 |   border_mask = (border_mask * 255).astype(np.uint8)
298 | 
299 |   # show
300 |   cv2.imshow("range", proj_range)
301 |   cv2.imshow("label", proj_sem_color)
302 |   cv2.imshow("border", border_mask)
303 |   cv2.waitKey(0)
304 |   cv2.destroyAllWindows()
--------------------------------------------------------------------------------
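
# A compact check of the erosion-by-convolution trick used in
# borderMask.forward() above: convolving a binary mask with a connectivity
# kernel and keeping only the pixels whose response equals the kernel sum is
# exactly a binary erosion. The 5x5 toy mask is a placeholder.
import torch
import torch.nn.functional as F

mask = torch.zeros(1, 1, 5, 5)
mask[0, 0, 1:4, 1:4] = 1.0                 # a solid 3x3 square
kernel = torch.tensor([[[[0., 1., 0.],
                         [1., 1., 1.],
                         [0., 1., 0.]]]])  # connectivity 4, sum = 5
out = F.conv2d(mask, kernel, padding=1)
eroded = (out == kernel.sum()).float()     # only the center pixel survives
print(eroded[0, 0])
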
/train/tasks/semantic/readme.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PRBonn/lidar-bonnetal/99b827f0228ff0e997473ac8e2cecbaa4af7d7c7/train/tasks/semantic/readme.md
--------------------------------------------------------------------------------
/train/tasks/semantic/train.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # This file is covered by the LICENSE file in the root of this project.
3 | 
4 | import argparse
5 | import subprocess
6 | import datetime
7 | import yaml
8 | from shutil import copyfile
9 | import os
10 | import shutil
11 | import __init__ as booger
12 | 
13 | from tasks.semantic.modules.trainer import *
14 | 
15 | if __name__ == '__main__':
16 |   parser = argparse.ArgumentParser("./train.py")
17 |   parser.add_argument(
18 |       '--dataset', '-d',
19 |       type=str,
20 |       required=True,
21 |       help='Dataset to train with. No Default',
22 |   )
23 |   parser.add_argument(
24 |       '--arch_cfg', '-ac',
25 |       type=str,
26 |       required=True,
27 |       help='Architecture yaml cfg file. See /config/arch for sample. No default!',
28 |   )
29 |   parser.add_argument(
30 |       '--data_cfg', '-dc',
31 |       type=str,
32 |       required=False,
33 |       default='config/labels/semantic-kitti.yaml',
34 |       help='Classification yaml cfg file. See /config/labels for sample. Defaults to %(default)s',
35 |   )
36 |   parser.add_argument(
37 |       '--log', '-l',
38 |       type=str,
39 |       default=os.path.expanduser("~") + '/logs/' +
40 |       datetime.datetime.now().strftime("%Y-%-m-%d-%H:%M") + '/',
41 |       help='Directory to put the log data. Default: ~/logs/date+time'
42 |   )
43 |   parser.add_argument(
44 |       '--pretrained', '-p',
45 |       type=str,
46 |       required=False,
47 |       default=None,
48 |       help='Directory to get the pretrained model. If not passed, do from scratch!'
49 |   )
50 |   FLAGS, unparsed = parser.parse_known_args()
51 | 
52 |   # print summary of what we will do
53 |   print("----------")
54 |   print("INTERFACE:")
55 |   print("dataset", FLAGS.dataset)
56 |   print("arch_cfg", FLAGS.arch_cfg)
57 |   print("data_cfg", FLAGS.data_cfg)
58 |   print("log", FLAGS.log)
59 |   print("pretrained", FLAGS.pretrained)
60 |   print("----------\n")
61 |   print("Commit hash (training version): ", str(
62 |       subprocess.check_output(['git', 'rev-parse', '--short', 'HEAD']).strip()))
63 |   print("----------\n")
64 | 
65 |   # open arch config file
66 |   try:
67 |     print("Opening arch config file %s" % FLAGS.arch_cfg)
68 |     ARCH = yaml.safe_load(open(FLAGS.arch_cfg, 'r'))
69 |   except Exception as e:
70 |     print(e)
71 |     print("Error opening arch yaml file.")
72 |     quit()
73 | 
74 |   # open data config file
75 |   try:
76 |     print("Opening data config file %s" % FLAGS.data_cfg)
77 |     DATA = yaml.safe_load(open(FLAGS.data_cfg, 'r'))
78 |   except Exception as e:
79 |     print(e)
80 |     print("Error opening data yaml file.")
81 |     quit()
82 | 
83 |   # create log folder
84 |   try:
85 |     if os.path.isdir(FLAGS.log):
86 |       shutil.rmtree(FLAGS.log)
87 |     os.makedirs(FLAGS.log)
88 |   except Exception as e:
89 |     print(e)
90 |     print("Error creating log directory. Check permissions!")
91 |     quit()
92 | 
93 |   # does model folder exist?
94 |   if FLAGS.pretrained is not None:
95 |     if os.path.isdir(FLAGS.pretrained):
96 |       print("model folder exists! Using model from %s" % (FLAGS.pretrained))
97 |     else:
98 |       print("model folder doesn't exist! Start with random weights...")
99 |   else:
100 |     print("No pretrained directory found.")
101 | 
102 |   # copy all files to log folder (to remember what we did, and make inference
103 |   # easier). Also, standardize name to be able to open it later
104 |   try:
105 |     print("Copying files to %s for further reference." % FLAGS.log)
106 |     copyfile(FLAGS.arch_cfg, FLAGS.log + "/arch_cfg.yaml")
107 |     copyfile(FLAGS.data_cfg, FLAGS.log + "/data_cfg.yaml")
108 |   except Exception as e:
109 |     print(e)
110 |     print("Error copying files, check permissions. Exiting...")
111 |     quit()
112 | 
113 |   # create trainer and start the training
114 |   trainer = Trainer(ARCH, DATA, FLAGS.dataset, FLAGS.log, FLAGS.pretrained)
115 |   trainer.train()
--------------------------------------------------------------------------------
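
# An illustrative invocation of the training script above; the dataset and
# log paths are placeholders, and -dc can be omitted since it defaults to the
# semantic-kitti label config:
#
#   python3 train.py -d /path/to/semantic_kitti \
#                    -ac config/arch/darknet53.yaml \
#                    -l /path/to/logdir
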
/train/tasks/semantic/visualize.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # This file is covered by the LICENSE file in the root of this project.
3 | 
4 | import argparse
5 | import os
6 | import yaml
7 | import __init__ as booger
8 | 
9 | from common.laserscan import LaserScan, SemLaserScan
10 | from common.laserscanvis import LaserScanVis
11 | 
12 | if __name__ == '__main__':
13 |   parser = argparse.ArgumentParser("./visualize.py")
14 |   parser.add_argument(
15 |       '--dataset', '-d',
16 |       type=str,
17 |       required=True,
18 |       help='Dataset to visualize. No Default',
19 |   )
20 |   parser.add_argument(
21 |       '--config', '-c',
22 |       type=str,
23 |       required=False,
24 |       default="config/labels/semantic-kitti.yaml",
25 |       help='Dataset config file. Defaults to %(default)s',
26 |   )
27 |   parser.add_argument(
28 |       '--sequence', '-s',
29 |       type=str,
30 |       default="00",
31 |       required=False,
32 |       help='Sequence to visualize. Defaults to %(default)s',
33 |   )
34 |   parser.add_argument(
35 |       '--predictions', '-p',
36 |       type=str,
37 |       default=None,
38 |       required=False,
39 |       help='Alternate location for labels, to use predictions folder. '
40 |            'Must point to directory containing the predictions in the proper format '
41 |            '(see readme). '
42 |            'Defaults to %(default)s',
43 |   )
44 |   parser.add_argument(
45 |       '--ignore_semantics', '-i',
46 |       dest='ignore_semantics',
47 |       default=False,
48 |       action='store_true',
49 |       help='Ignore semantics. Visualizes uncolored pointclouds. '
50 |            'Defaults to %(default)s',
51 |   )
52 |   parser.add_argument(
53 |       '--offset',
54 |       type=int,
55 |       default=0,
56 |       required=False,
57 |       help='Scan to start with inside the sequence. Defaults to %(default)s',
58 |   )
59 |   parser.add_argument(
60 |       '--ignore_safety',
61 |       dest='ignore_safety',
62 |       default=False,
63 |       action='store_true',
64 |       help='Normally you want the number of labels and ptcls to be the same, '
65 |            'but if you are not done inferring this is not the case, so this disables '
66 |            'that safety check. '
67 |            'Defaults to %(default)s',
68 |   )
69 |   FLAGS, unparsed = parser.parse_known_args()
70 | 
71 |   # print summary of what we will do
72 |   print("*" * 80)
73 |   print("INTERFACE:")
74 |   print("Dataset", FLAGS.dataset)
75 |   print("Config", FLAGS.config)
76 |   print("Sequence", FLAGS.sequence)
77 |   print("Predictions", FLAGS.predictions)
78 |   print("ignore_semantics", FLAGS.ignore_semantics)
79 |   print("ignore_safety", FLAGS.ignore_safety)
80 |   print("offset", FLAGS.offset)
81 |   print("*" * 80)
82 | 
83 |   # open config file
84 |   try:
85 |     print("Opening config file %s" % FLAGS.config)
86 |     CFG = yaml.safe_load(open(FLAGS.config, 'r'))
87 |   except Exception as e:
88 |     print(e)
89 |     print("Error opening yaml file.")
90 |     quit()
91 | 
92 |   # fix sequence name
93 |   FLAGS.sequence = '{0:02d}'.format(int(FLAGS.sequence))
94 | 
95 |   # does sequence folder exist?
96 |   scan_paths = os.path.join(FLAGS.dataset, "sequences",
97 |                             FLAGS.sequence, "velodyne")
98 |   if os.path.isdir(scan_paths):
99 |     print("Sequence folder exists! Using sequence from %s" % scan_paths)
100 |   else:
101 |     print("Sequence folder doesn't exist! Exiting...")
102 |     quit()
103 | 
104 |   # populate the pointclouds
105 |   scan_names = [os.path.join(dp, f) for dp, dn, fn in os.walk(
106 |       os.path.expanduser(scan_paths)) for f in fn]
107 |   scan_names.sort()
108 | 
109 |   # does the labels folder exist?
110 |   if not FLAGS.ignore_semantics:
111 |     if FLAGS.predictions is not None:
112 |       label_paths = os.path.join(FLAGS.predictions, "sequences",
113 |                                  FLAGS.sequence, "predictions")
114 |     else:
115 |       label_paths = os.path.join(FLAGS.dataset, "sequences",
116 |                                  FLAGS.sequence, "labels")
117 |     if os.path.isdir(label_paths):
118 |       print("Labels folder exists! Using labels from %s" % label_paths)
119 |     else:
120 |       print("Labels folder doesn't exist! Exiting...")
Exiting...") 121 | quit() 122 | # populate the pointclouds 123 | label_names = [os.path.join(dp, f) for dp, dn, fn in os.walk( 124 | os.path.expanduser(label_paths)) for f in fn] 125 | label_names.sort() 126 | 127 | # check that there are same amount of labels and scans 128 | if not FLAGS.ignore_safety: 129 | assert(len(label_names) == len(scan_names)) 130 | 131 | # create a scan 132 | if FLAGS.ignore_semantics: 133 | scan = LaserScan(project=True) # project all opened scans to spheric proj 134 | else: 135 | color_dict = CFG["color_map"] 136 | scan = SemLaserScan(color_dict, project=True) 137 | 138 | # create a visualizer 139 | semantics = not FLAGS.ignore_semantics 140 | if not semantics: 141 | label_names = None 142 | vis = LaserScanVis(scan=scan, 143 | scan_names=scan_names, 144 | label_names=label_names, 145 | offset=FLAGS.offset, 146 | semantics=semantics, 147 | instances=False) 148 | 149 | # print instructions 150 | print("To navigate:") 151 | print("\tb: back (previous scan)") 152 | print("\tn: next (next scan)") 153 | print("\tq: quit (exit program)") 154 | 155 | # run the visualizer 156 | vis.run() 157 | --------------------------------------------------------------------------------