├── Readme.md ├── assets └── network_architecture.jpg ├── configs ├── datasets │ ├── flying_things_3d │ │ ├── test.txt │ │ ├── train.txt │ │ └── val.txt │ ├── lidar_kitti │ │ └── test.txt │ ├── semantic_kitti │ │ ├── train.txt │ │ └── val.txt │ ├── stereo_kitti │ │ └── test.txt │ └── waymo_open │ │ ├── test.txt │ │ ├── train.txt │ │ └── val.txt ├── default.yaml ├── demo │ ├── demo_flying_things_3d.yaml │ └── demo_lidar_kitti.yaml ├── eval │ ├── eval_flying_things_3d.yaml │ ├── eval_lidar_kitti.yaml │ ├── eval_semantic_kitti.yaml │ ├── eval_stereo_kitti.yaml │ └── eval_waymo_open.yaml └── train │ ├── train_fully_supervised.yaml │ └── train_weakly_supervised.yaml ├── data_preprocessing ├── IO.py ├── flyingthings3d_utils.py ├── kitti_utils.py ├── process_flyingthings3d_subset.py ├── process_kitti.py ├── process_semantKitti.py └── python_pfm.py ├── eval.py ├── lib ├── __pycache__ │ ├── __init__.cpython-36.pyc │ ├── config.cpython-36.pyc │ ├── config.cpython-37.pyc │ ├── config.cpython-38.pyc │ ├── data.cpython-36.pyc │ ├── data.cpython-37.pyc │ ├── logger.cpython-36.pyc │ ├── logger.cpython-37.pyc │ ├── loss.cpython-36.pyc │ ├── loss.cpython-37.pyc │ ├── metrics.cpython-36.pyc │ ├── metrics.cpython-37.pyc │ ├── trainer.cpython-36.pyc │ ├── trainer.cpython-37.pyc │ ├── utils.cpython-36.pyc │ ├── utils.cpython-37.pyc │ └── utils.cpython-38.pyc ├── config.py ├── data.py ├── logger.py ├── loss.py ├── metrics.py ├── model │ ├── __init__.py │ ├── minkowski │ │ ├── ME_layers.py │ │ ├── MinkowskiFlow.py │ │ └── __init__,py │ └── rigid_3d_sf.py ├── trainer.py └── utils.py ├── requirements.txt ├── scripts ├── download_data.sh ├── download_pretrained_models.sh └── download_pretrained_models_ablations.sh ├── train.py └── utils ├── __init__.py └── chamfer_distance ├── __init__.py ├── __pycache__ ├── __init__.cpython-37.pyc └── chamfer_distance.cpython-37.pyc ├── chamfer_distance.cpp ├── chamfer_distance.cu ├── chamfer_distance.py └── readme.txt /Readme.md: -------------------------------------------------------------------------------- 1 | # Weakly Supervised Learning of Rigid 3D Scene Flow 2 | This repository provides code and data to train and evaluate a weakly supervised method for rigid 3D scene flow estimation. It represents the official implementation of the paper: 3 | 4 | ### [Weakly Supervised Learning of Rigid 3D Scene Flow](https://arxiv.org/pdf/2102.08945.pdf) 5 | [Zan Gojcic](https://zgojcic.github.io/), [Or Litany](https://orlitany.github.io/), [Andreas Wieser](https://baug.ethz.ch/departement/personen/mitarbeiter/personen-detail.MTg3NzU5.TGlzdC82NzksLTU1NTc1NDEwMQ==.html), [Leonidas J. Guibas](https://geometry.stanford.edu/member/guibas/), [Tolga Birdal](http://tbirdal.me/)\ 6 | | [IGP ETH Zurich](https://igp.ethz.ch/) | [Nvidia Toronto AI Lab](https://nv-tlabs.github.io/) | [Guibas Lab Stanford University](https://geometry.stanford.edu/index.html) | 7 | 8 | For more information, please see the [project webpage](https://3dsceneflow.github.io/) 9 | 10 | ![WSR3DSF](assets/network_architecture.jpg?raw=true) 11 | 12 | 13 | ### Environment Setup 14 | 15 | > Note: the code in this repo has been tested on Ubuntu 16.04/20.04 with Python 3.7, CUDA 10.1/10.2, PyTorch 1.7.1 and MinkowskiEngine 0.5.1. It may work for other setups, but has not been tested. 16 | 17 | 18 | Before proceding, make sure CUDA is installed and set up correctly. 19 | 20 | After cloning this reposiory you can proceed by setting up and activating a virual environment with Python 3.7. 
If you are using a different version of cuda (10.1) change the pytorch installation instruction accordingly. 21 | 22 | ```bash 23 | export CXX=g++-7 24 | conda config --append channels conda-forge 25 | conda create --name rigid_3dsf python=3.7 26 | source activate rigid_3dsf 27 | conda install --file requirements.txt 28 | conda install -c open3d-admin open3d=0.9.0.0 29 | conda install -c intel scikit-learn 30 | conda install pytorch==1.7.1 torchvision cudatoolkit=10.1 -c pytorch 31 | ``` 32 | You can then proceed and install [MinkowskiEngine](https://github.com/NVIDIA/MinkowskiEngine) library for sparse tensors: 33 | 34 | ```bash 35 | pip install -U git+https://github.com/NVIDIA/MinkowskiEngine -v --no-deps 36 | ``` 37 | Our repository also includes a pytorch implementation of [Chamfer Distance](https://github.com/chrdiller/pyTorchChamferDistance) in `./utils/chamfer_distance` which will be compiled on the first run. 38 | 39 | In order to test if Pytorch and MinkwoskiEngine are installed correctly please run 40 | ```bash 41 | python -c "import torch, MinkowskiEngine" 42 | ``` 43 | which should run without an error message. 44 | 45 | ### Data 46 | 47 | We provide the preprocessed data of *flying_things_3d* (108GB), *stereo_kitti* (500MB), *lidar_kitti* (~160MB), *semantic_kitti* (78GB), and *waymo_open* (50GB) used for training and evaluating our model. 48 | 49 | To download a single dataset please run: 50 | 51 | ```bash 52 | bash ./scripts/download_data.sh name_of_the_dataset 53 | ``` 54 | 55 | To download all datasets simply run: 56 | 57 | ```bash 58 | bash ./scripts/download_data.sh 59 | ``` 60 | The data will be downloaded and extracted to `./data/name_of_the_dataset/`. 61 | 62 | ### Pretrained models 63 | 64 | We provide the checkpoints of the models trained on *flying_things_3d* or *semantic_kitti*, which we use in our main evaluations. 65 | 66 | To download these models please run: 67 | 68 | ```bash 69 | bash ./scripts/download_pretrained_models.sh 70 | ``` 71 | 72 | Additionally, we provide all the models used in the ablation studies and the model fine tuned on *waymo_open*. 73 | 74 | To download these models please run: 75 | 76 | ```bash 77 | bash ./scripts/download_pretrained_models_ablations.sh 78 | ``` 79 | 80 | All the models will be downloaded and extracted to `./logs/dataset_used_for_training/`. 81 | 82 | ### Evaluation with pretrained models 83 | 84 | Our method with pretrained weights can be evaluated using the `./eval.py` script. The configuration parameters of the evaluation can be set with the `*.yaml` configuration files located in `./configs/eval/`. We provide a configuration file for each dataset used in our paper. For all evaluations please first download the pretrained weights and the corresponding data. Note, if the data or pretrained models are saved to a non-default path the config files also has to be adapted accordingly. 85 | 86 | #### *FlyingThings3D* 87 | 88 | To evaluate our backbone + scene flow head on *FlyingThings3d* please run: 89 | 90 | ```shell 91 | python eval.py ./configs/eval/eval_flying_things_3d.yaml 92 | ``` 93 | This should recreate the results from the Table 1 of our paper (EPE3D: 0.052 m). 94 | 95 | #### *stereoKITTI* 96 | 97 | To evaluate our backbone + scene flow head on *stereoKITTI* please run: 98 | 99 | ```shell 100 | python eval.py ./configs/eval/eval_stereo_kitti.yaml 101 | ``` 102 | This should again recreate the results from the Table 1 of our paper (EPE3D: 0.042 m). 
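The reported flow metrics (EPE3D, Acc3DS, Acc3DR, Outliers) follow the definitions that are standard in the scene flow literature. Below is a minimal NumPy sketch of how such numbers are computed from predicted and ground-truth flow; the 0.05 m/0.10 m/0.30 m and 5%/10% thresholds are the commonly used values from that literature, not taken from this repository's code:

```python
import numpy as np

def flow_metrics(flow_pred, flow_gt):
    """Scene flow metrics for [N, 3] flow fields given in meters."""
    epe = np.linalg.norm(flow_pred - flow_gt, axis=1)      # per-point end-point error
    rel = epe / (np.linalg.norm(flow_gt, axis=1) + 1e-4)   # relative end-point error

    return {
        'EPE3D': epe.mean(),                                # mean end-point error [m]
        'Acc3DS': np.mean((epe < 0.05) | (rel < 0.05)),     # strict accuracy
        'Acc3DR': np.mean((epe < 0.10) | (rel < 0.10)),     # relaxed accuracy
        'Outliers': np.mean((epe > 0.30) | (rel > 0.10)),   # outlier ratio
    }
```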
103 | 104 | #### *lidarKITTI* 105 | 106 | To evaluate our full weakly supervised method on *lidarKITTI* please run: 107 | 108 | ```shell 109 | python eval.py ./configs/eval/eval_lidar_kitti.yaml 110 | ``` 111 | This should recreate the results for Ours++ on *lidarKITTI* (w/o ground) from the Table 2 of our paper (EPE3D: 0.094 m). To recreate other results on *lidarKITTI* please change the `./configs/eval/eval_lidar_kitti.yaml` file accordingly. 112 | 113 | 114 | #### *semanticKITTI* 115 | 116 | To evaluate our full weakly supervised method on *semanticKITTI* please run: 117 | 118 | ```shell 119 | python eval.py ./configs/eval/eval_semantic_kitti.yaml 120 | ``` 121 | This should recreate the results of our full model on *semanticKITTI* (w/o ground) from the Table 4 of our paper. To recreate other results on *semanticKITTI* please change the `./configs/eval/eval_semantic_kitti.yaml` file accordingly. 122 | 123 | #### *waymo open* 124 | 125 | To evaluate our fine-tuned model on *waymo open* please run: 126 | 127 | ```shell 128 | python eval.py ./configs/eval/eval_waymo_open.yaml 129 | ``` 130 | This should recreate the results for Ours++ (fine-tuned) from the Table 9 of the appendix. To recreate other results on *waymo open* please change the `./configs/eval/eval_waymo_open.yaml` file accordingly. 131 | 132 | 133 | ### Training our method from scratch 134 | 135 | Our method can be trained using the `./train.py` script. The configuration parameters of the training process can be set using the config files located in `./configs/train/`. 136 | 137 | #### Training our backbone with full supervision on *FlyingThings3D* 138 | 139 | To train our backbone network and scene flow head under full supervision (corresponds to Sec. 4.3 of our paper) please run: 140 | 141 | ```shell 142 | python train.py ./configs/train/train_fully_supervised.yaml 143 | ``` 144 | 145 | The checkpoints and tensorboard data will be saved to `./logs/logs_FlyingThings3D_ME`. If you run out of GPU memory with the default setting please adapt the `batch_size` and `acc_iter_size` in the `./configs/default.yaml` to e.g. 4 and 2, respectively. 146 | 147 | #### Training under weak supervision on *semanticKITTI* 148 | 149 | To train our full method under weak supervision on *semanticKITTI* please run 150 | 151 | ```shell 152 | python train.py ./configs/train/train_weakly_supervised.yaml 153 | ``` 154 | 155 | The checkpoints and tensorboard data will be saved to `./logs/logs_SemanticKITTI_ME`. If you run out of GPU memory with the default setting please adapt the `batch_size` and `acc_iter_size` in the `./configs/default.yaml` to e.g. 4 and 2, respectively. 156 | 157 | ### Citation 158 | 159 | If you found this code or paper useful, please consider citing: 160 | 161 | ```shell 162 | @misc{gojcic2021weakly3dsf, 163 | title = {Weakly {S}upervised {L}earning of {R}igid {3D} {S}cene {F}low}, 164 | author = {Gojcic, Zan and Litany, Or and Wieser, Andreas and Guibas, Leonidas J and Birdal, Tolga}, 165 | year = {2021}, 166 | eprint={2102.08945}, 167 | archivePrefix={arXiv}, 168 | primaryClass={cs.CV} 169 | } 170 | ``` 171 | ### Contact 172 | If you run into any problems or have questions, please create an issue or contact [Zan Gojcic](zgojcic@ethz.ch). 
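### Inspecting the preprocessed data

Each preprocessed sample is a compressed `.npz` archive. As a quick sanity check of a downloaded dataset, here is a minimal sketch of inspecting one *stereoKITTI* frame; the keys shown are the ones written by `./data_preprocessing/process_kitti.py`, the released archives are assumed to follow the same layout, and the file name is just one entry from the test split:

```python
import numpy as np

# One frame listed in ./configs/datasets/stereo_kitti/test.txt (path assumed,
# adjust it to wherever the archive was extracted)
sample = np.load('./data/stereo_kitti/000002.npz')

pc1 = sample['pc1']            # points of the first frame, [N, 3]
pc2 = sample['pc2']            # corresponding points of the second frame, [N, 3]
flow = sample['flow']          # ground-truth flow (pc2 - pc1), [N, 3]
inst_pc1 = sample['inst_pc1']  # per-point instance labels of the first frame, [N]

print(pc1.shape, flow.shape, np.unique(inst_pc1).size)
```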
173 | 174 | 175 | ### Acknowledgments 176 | In this project we use parts of the official implementations of: 177 | 178 | - [MinkowskiEngine](https://github.com/NVIDIA/MinkowskiEngine) 179 | - [MultiviewReg](https://github.com/zgojcic/3D_multiview_reg) 180 | - [RPMNet](https://github.com/yewzijian/RPMNet) 181 | - [FLOT](https://github.com/valeoai/FLOT) 182 | 183 | We thank the respective authors for open sourcing their methods. -------------------------------------------------------------------------------- /assets/network_architecture.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/assets/network_architecture.jpg -------------------------------------------------------------------------------- /configs/datasets/lidar_kitti/test.txt: -------------------------------------------------------------------------------- 1 | 000002.npz 2 | 000003.npz 3 | 000007.npz 4 | 000008.npz 5 | 000009.npz 6 | 000010.npz 7 | 000011.npz 8 | 000012.npz 9 | 000013.npz 10 | 000014.npz 11 | 000015.npz 12 | 000016.npz 13 | 000017.npz 14 | 000018.npz 15 | 000019.npz 16 | 000020.npz 17 | 000021.npz 18 | 000022.npz 19 | 000023.npz 20 | 000024.npz 21 | 000025.npz 22 | 000026.npz 23 | 000027.npz 24 | 000028.npz 25 | 000029.npz 26 | 000030.npz 27 | 000031.npz 28 | 000032.npz 29 | 000033.npz 30 | 000034.npz 31 | 000035.npz 32 | 000036.npz 33 | 000037.npz 34 | 000038.npz 35 | 000039.npz 36 | 000040.npz 37 | 000041.npz 38 | 000042.npz 39 | 000043.npz 40 | 000044.npz 41 | 000045.npz 42 | 000046.npz 43 | 000047.npz 44 | 000048.npz 45 | 000049.npz 46 | 000050.npz 47 | 000051.npz 48 | 000052.npz 49 | 000053.npz 50 | 000054.npz 51 | 000055.npz 52 | 000056.npz 53 | 000057.npz 54 | 000058.npz 55 | 000059.npz 56 | 000060.npz 57 | 000061.npz 58 | 000062.npz 59 | 000063.npz 60 | 000064.npz 61 | 000065.npz 62 | 000066.npz 63 | 000067.npz 64 | 000068.npz 65 | 000069.npz 66 | 000070.npz 67 | 000071.npz 68 | 000072.npz 69 | 000073.npz 70 | 000074.npz 71 | 000075.npz 72 | 000076.npz 73 | 000077.npz 74 | 000078.npz 75 | 000079.npz 76 | 000080.npz 77 | 000081.npz 78 | 000083.npz 79 | 000084.npz 80 | 000085.npz 81 | 000086.npz 82 | 000088.npz 83 | 000089.npz 84 | 000090.npz 85 | 000091.npz 86 | 000092.npz 87 | 000093.npz 88 | 000094.npz 89 | 000095.npz 90 | 000096.npz 91 | 000097.npz 92 | 000098.npz 93 | 000105.npz 94 | 000106.npz 95 | 000107.npz 96 | 000108.npz 97 | 000109.npz 98 | 000110.npz 99 | 000111.npz 100 | 000112.npz 101 | 000113.npz 102 | 000114.npz 103 | 000115.npz 104 | 000116.npz 105 | 000117.npz 106 | 000118.npz 107 | 000119.npz 108 | 000120.npz 109 | 000121.npz 110 | 000122.npz 111 | 000123.npz 112 | 000124.npz 113 | 000125.npz 114 | 000126.npz 115 | 000127.npz 116 | 000128.npz 117 | 000129.npz 118 | 000130.npz 119 | 000131.npz 120 | 000132.npz 121 | 000141.npz 122 | 000142.npz 123 | 000143.npz 124 | 000144.npz 125 | 000145.npz 126 | 000146.npz 127 | 000147.npz 128 | 000148.npz 129 | 000149.npz 130 | 000150.npz 131 | 000155.npz 132 | 000157.npz 133 | 000158.npz 134 | 000159.npz 135 | 000160.npz 136 | 000161.npz 137 | 000162.npz 138 | 000163.npz 139 | 000164.npz 140 | 000168.npz 141 | 000169.npz 142 | 000199.npz 143 | -------------------------------------------------------------------------------- /configs/datasets/stereo_kitti/test.txt: -------------------------------------------------------------------------------- 1 | 000002.npz 2 | 000003.npz 3 | 000007.npz 4 | 000008.npz 5 | 000009.npz 6 | 
000010.npz 7 | 000011.npz 8 | 000012.npz 9 | 000013.npz 10 | 000014.npz 11 | 000015.npz 12 | 000016.npz 13 | 000017.npz 14 | 000018.npz 15 | 000019.npz 16 | 000020.npz 17 | 000021.npz 18 | 000022.npz 19 | 000023.npz 20 | 000024.npz 21 | 000025.npz 22 | 000026.npz 23 | 000027.npz 24 | 000028.npz 25 | 000029.npz 26 | 000030.npz 27 | 000031.npz 28 | 000032.npz 29 | 000033.npz 30 | 000034.npz 31 | 000035.npz 32 | 000036.npz 33 | 000037.npz 34 | 000038.npz 35 | 000039.npz 36 | 000040.npz 37 | 000041.npz 38 | 000042.npz 39 | 000043.npz 40 | 000044.npz 41 | 000045.npz 42 | 000046.npz 43 | 000047.npz 44 | 000048.npz 45 | 000049.npz 46 | 000050.npz 47 | 000051.npz 48 | 000052.npz 49 | 000053.npz 50 | 000054.npz 51 | 000055.npz 52 | 000056.npz 53 | 000057.npz 54 | 000058.npz 55 | 000059.npz 56 | 000060.npz 57 | 000061.npz 58 | 000062.npz 59 | 000063.npz 60 | 000064.npz 61 | 000065.npz 62 | 000066.npz 63 | 000067.npz 64 | 000068.npz 65 | 000069.npz 66 | 000070.npz 67 | 000071.npz 68 | 000072.npz 69 | 000073.npz 70 | 000074.npz 71 | 000075.npz 72 | 000076.npz 73 | 000077.npz 74 | 000078.npz 75 | 000079.npz 76 | 000080.npz 77 | 000081.npz 78 | 000083.npz 79 | 000084.npz 80 | 000085.npz 81 | 000086.npz 82 | 000088.npz 83 | 000089.npz 84 | 000090.npz 85 | 000091.npz 86 | 000092.npz 87 | 000093.npz 88 | 000094.npz 89 | 000095.npz 90 | 000096.npz 91 | 000097.npz 92 | 000098.npz 93 | 000105.npz 94 | 000106.npz 95 | 000107.npz 96 | 000108.npz 97 | 000109.npz 98 | 000110.npz 99 | 000111.npz 100 | 000112.npz 101 | 000113.npz 102 | 000114.npz 103 | 000115.npz 104 | 000116.npz 105 | 000117.npz 106 | 000118.npz 107 | 000119.npz 108 | 000120.npz 109 | 000121.npz 110 | 000122.npz 111 | 000123.npz 112 | 000124.npz 113 | 000125.npz 114 | 000126.npz 115 | 000127.npz 116 | 000128.npz 117 | 000129.npz 118 | 000130.npz 119 | 000131.npz 120 | 000132.npz 121 | 000141.npz 122 | 000142.npz 123 | 000143.npz 124 | 000144.npz 125 | 000145.npz 126 | 000146.npz 127 | 000147.npz 128 | 000148.npz 129 | 000149.npz 130 | 000150.npz 131 | 000155.npz 132 | 000157.npz 133 | 000158.npz 134 | 000159.npz 135 | 000160.npz 136 | 000161.npz 137 | 000162.npz 138 | 000163.npz 139 | 000164.npz 140 | 000168.npz 141 | 000169.npz 142 | 000199.npz 143 | -------------------------------------------------------------------------------- /configs/default.yaml: -------------------------------------------------------------------------------- 1 | method: 2 | backbone: '' 3 | 4 | misc: 5 | voxel_size: 0.10 # Size of the voxel grid used for downsampling 6 | num_points: 8192 # Number of points 7 | trainer: 'FlowTrainer' # Which class of trainer to use. Can be used if multiple different trainers are defined. 
8 | use_gpu: True # If GPU should be used or not 9 | log_dir: ./logs/ # Path to the base folder where the models and logs will be saves 10 | 11 | data: 12 | input_features: absolute_coords # Input features to use (assigned to each sparse voxel) 13 | only_near_points: True # Only consider near points (less than 35m away) [Used in all scene flow algorithms] 14 | 15 | train: 16 | batch_size: 8 # Training batch size 17 | acc_iter_size: 1 # Number of iterration to accumulate the gradients before optimizer step (can be used if the gpu memory is too low) 18 | num_workers: 6 # Number of workers used for the data loader 19 | max_epoch: 5000 # Max number of training epochs 20 | stat_interval: 250 # Interval at which the stats are printed out and saved for the tensorboard (if positive it denotes iteration if negative epochs) 21 | chkpt_interval: 500 # Interval at which the model is saved (if positive it denotes iteration if negative epochs) 22 | val_interval: 2000 # Interval at which the validation is performed (if positive it denotes iteration if negative epochs) 23 | weighted_seg_loss: True 24 | 25 | val: 26 | batch_size: 8 # Validation batch size 27 | num_workers: 6 # Number of workers for the validation data set 28 | 29 | test: 30 | results_dir: ./eval/ # Path to where to save the test results 31 | batch_size: 1 # Test batch size 32 | num_workers: 1 # Num of workers to use for the test data set 33 | 34 | loss: 35 | bg_loss_w: 1.0 # Weight of the background loss 36 | fg_loss_w: 1.0 # Weight of the foreground loss 37 | flow_loss_w: 1.0 # Weight of the flow loss 38 | ego_loss_w: 1.0 # Weight of the ego motion loss 39 | inlier_loss_w: 0.005 # Weight of the inlier loss (part of ego-motion) 40 | cd_loss_w: 0.5 # Wegihts of the chamfer distance loss 41 | rigid_loss_w: 1.0 # Wegihts of the rigidity loss 42 | 43 | optimizer: 44 | alg: Adam # Which optimizer to use 45 | learning_rate: 0.001 # Initial learning rate 46 | weight_decay: 0.0 # Weight decay weight 47 | momentum: 0.8 #Momentum 48 | scheduler: ExponentialLR 49 | exp_gamma: 0.98 50 | -------------------------------------------------------------------------------- /configs/demo/demo_flying_things_3d.yaml: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/configs/demo/demo_flying_things_3d.yaml -------------------------------------------------------------------------------- /configs/demo/demo_lidar_kitti.yaml: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/configs/demo/demo_lidar_kitti.yaml -------------------------------------------------------------------------------- /configs/eval/eval_flying_things_3d.yaml: -------------------------------------------------------------------------------- 1 | method: 2 | backbone: 'ME' # Type of backbone network [ME] 3 | flow: True # Use scene-flow head 4 | ego_motion: False # Use ego-motion head 5 | semantic: False # Use background segmentation head 6 | clustering: False # Use foreground clustering head 7 | 8 | misc: 9 | run_mode: test # Mode to run the network in 10 | 11 | data: 12 | dataset: FlyingThings3D_ME # Name of the dataset [StereoKITTI_ME, FlyingThings3D_ME, SemanticKITTI_ME, LidarKITTI_ME, WaymoOpen_ME] 13 | root: ./data/flying_things_3d/ # Path to the data 14 | input_features: absolute_coords # Input features assigned to each sparse voxel 15 
| n_classes: 2 # Number of classes for the background segmentation head 16 | remove_ground: False # Remove ground by simple thresholding of the height coordinate 17 | augment_data: False # Augment the data by random rotation and translation 18 | 19 | network: 20 | normalize_features: True 21 | norm_type: IN # Type of normalization layer IN = instance, BN = batch normalization, False = no normalization 22 | in_kernel_size: 7 # Size of the initial convolutional kernel 23 | feature_dim: 64 24 | ego_motion_points: 1024 # Number of points that are randomly sampled for the ego motion estimation 25 | add_slack: True # Add slack row and column in the Sinkhorn iteration module 26 | sinkhorn_iter: 3 # Number of Sinkhorn iterations in the ego motion module 27 | use_pretrained: True # Flag for training 28 | cluster_metric: euclidean # Distance metric used to compute the cluster assignments 0 = Euclidean 29 | min_p_cluster: 30 # Min number of points in a cluster 30 | min_samples_dbscan: 5 # Min number of points in the neighborhood DBSCAN 31 | eps_dbscan: 0.75 # Eps value in DBSCAN for the Euclidean distance 32 | pretrained_path: logs/logs_FlyingThings3D_ME/flow_head_l1_loss/model_best.pt # Path to the pretrained model 33 | 34 | test: 35 | batch_size: 1 # Test batch size 36 | num_workers: 1 # Num of workers to use for the test data set 37 | postprocess_ego: False # Apply postprocessing (optimization of the ego-motion) 38 | postprocess_clusters: False # Apply postprocessing (optimization of the motion across the clusters) 39 | results_dir: ./eval_results/flying_things_3d/ 40 | 41 | loss: 42 | background_loss: False # Compute background loss 43 | flow_loss: False # Compute flow loss 44 | ego_loss: False # Compute ego-motion loss 45 | foreground_loss: False # Compute foreground loss 46 | 47 | metrics: 48 | flow: True # Compute evaluation metrics for flow estimation (EPE3D, Acc3DS, Acc3DR, Outliers) 49 | ego_motion: False # Compute evaluation metrics for ego-motion estimation (RRE, RTE) 50 | semantic: False # Compute evaluation metrics for background segmentation (Precision, Recall) 51 | -------------------------------------------------------------------------------- /configs/eval/eval_lidar_kitti.yaml: -------------------------------------------------------------------------------- 1 | method: 2 | backbone: 'ME' # Type of backbone network [ME] 3 | flow: True # Use scene-flow head 4 | ego_motion: True # Use ego-motion head 5 | semantic: True # Use background segmentation head 6 | clustering: True # Use foreground clustering head 7 | 8 | misc: 9 | run_mode: test # Mode to run the network in 10 | 11 | data: 12 | dataset: LidarKITTI_ME # Name of the dataset [StereoKITTI_ME, FlyingThings3D_ME, SemanticKITTI_ME, LidarKITTI_ME, WaymoOpen_ME] 13 | root: ./data/lidar_kitti/ # Path to the data 14 | input_features: absolute_coords # Input features assigned to each sparse voxel 15 | n_classes: 2 # Number of classes for the background segmentation head 16 | remove_ground: True # Remove ground by simple thresholding of the height coordinate 17 | augment_data: False # Augment the data by random rotation and translation 18 | 19 | network: 20 | normalize_features: True 21 | norm_type: IN # Type of normalization layer IN = instance, BN = batch normalization, False = no normalization 22 | in_kernel_size: 7 # Size of the initial convolutional kernel 23 | feature_dim: 64 24 | ego_motion_points: 1024 # Number of points that are randomly sampled for the ego motion estimation 25 | add_slack: True # Add slack row and column in the 
Sinkhorn iteration module 26 | sinkhorn_iter: 3 # Number of Sinkhorn iterations in the ego motion module 27 | use_pretrained: True # Flag for training 28 | cluster_metric: euclidean # Distance metric used to compute the cluster assignments 0 = Euclidean 29 | min_p_cluster: 30 # Min number of points in a cluster 30 | min_samples_dbscan: 5 # Min number of points in the neighborhood DBSCAN 31 | eps_dbscan: 0.75 # Eps value in DBSCAN for the Euclidean distance 32 | pretrained_path: logs/logs_SemanticKITTI_ME/full_scratch_all_loss/model_best.pt # Path to the pretrained model 33 | 34 | test: 35 | batch_size: 1 # Test batch size 36 | num_workers: 1 # Num of workers to use for the test data set 37 | postprocess_ego: True # Apply postprocessing (optimization of the ego-motion) 38 | postprocess_clusters: True # Apply postprocessing (optimization of the motion across the clusters) 39 | results_dir: ./eval_results/lidar_kitti/ 40 | 41 | loss: 42 | background_loss: False # Compute background loss 43 | flow_loss: False # Compute flow loss 44 | ego_loss: False # Compute ego-motion loss 45 | foreground_loss: False # Compute foreground loss 46 | 47 | metrics: 48 | flow: True # Compute evaluation metrics for flow estimation (EPE3D, Acc3DS, Acc3DR, Outliers) 49 | ego_motion: True # Compute evaluation metrics for ego-motion estimation (RRE, RTE) 50 | semantic: False # Compute evaluation metrics for background segmentation (Precision, Recall) -------------------------------------------------------------------------------- /configs/eval/eval_semantic_kitti.yaml: -------------------------------------------------------------------------------- 1 | method: 2 | backbone: 'ME' # Type of backbone network [ME] 3 | flow: True # Use scene-flow head 4 | ego_motion: True # Use ego-motion head 5 | semantic: True # Use background segmentation head 6 | clustering: True # Use foreground clustering head 7 | 8 | misc: 9 | run_mode: test # Mode to run the network in 10 | 11 | data: 12 | dataset: SemanticKITTI_ME # Name of the dataset [StereoKITTI_ME, FlyingThings3D_ME, SemanticKITTI_ME, LidarKITTI_ME, WaymoOpen_ME] 13 | root: ./data/semantic_kitti/ # Path to the data 14 | input_features: absolute_coords # Input features assigned to each sparse voxel 15 | n_classes: 2 # Number of classes for the background segmentation head 16 | remove_ground: True # Remove ground by simple thresholding of the height coordinate 17 | augment_data: False # Augment the data by random rotation and translation 18 | 19 | network: 20 | normalize_features: True 21 | norm_type: IN # Type of normalization layer IN = instance, BN = batch normalization, False = no normalization 22 | in_kernel_size: 7 # Size of the initial convolutional kernel 23 | feature_dim: 64 24 | ego_motion_points: 1024 # Number of points that are randomly sampled for the ego motion estimation 25 | add_slack: True # Add slack row and column in the Sinkhorn iteration module 26 | sinkhorn_iter: 3 # Number of Sinkhorn iterations in the ego motion module 27 | use_pretrained: True # Flag for training 28 | cluster_metric: euclidean # Distance metric used to compute the cluster assignments 0 = Euclidean 29 | min_p_cluster: 30 # Min number of points in a cluster 30 | min_samples_dbscan: 5 # Min number of points in the neighborhood DBSCAN 31 | eps_dbscan: 0.75 # Eps value in DBSCAN for the Euclidean distance 32 | pretrained_path: logs/logs_SemanticKITTI_ME/full_scratch_all_loss/model_best.pt # Path to the pretrained model 33 | 34 | test: 35 | batch_size: 1 # Test batch size 36 | num_workers: 1 # 
Num of workers to use for the test data set 37 | postprocess_ego: True # Apply postprocessing (optimization of the ego-motion) 38 | postprocess_clusters: True # Apply postprocessing (optimization of the motion across the clusters) 39 | results_dir: ./eval_results/semantic_kitti/ 40 | 41 | loss: 42 | background_loss: False # Compute background loss 43 | flow_loss: False # Compute flow loss 44 | ego_loss: False # Compute ego-motion loss 45 | foreground_loss: False # Compute foreground loss 46 | 47 | metrics: 48 | flow: False # Compute evaluation metrics for flow estimation (EPE3D, Acc3DS, Acc3DR, Outliers) 49 | ego_motion: True # Compute evaluation metrics for ego-motion estimation (RRE, RTE) 50 | semantic: True # Compute evaluation metrics for background segmentation (Precision, Recall) 51 | 52 | -------------------------------------------------------------------------------- /configs/eval/eval_stereo_kitti.yaml: -------------------------------------------------------------------------------- 1 | method: 2 | backbone: 'ME' # Type of backbone network [ME] 3 | flow: True # Use scene-flow head 4 | ego_motion: False # Use ego-motion head 5 | semantic: False # Use background segmentation head 6 | clustering: False # Use foreground clustering head 7 | 8 | misc: 9 | run_mode: test # Mode to run the network in 10 | 11 | data: 12 | dataset: StereoKITTI_ME # Name of the dataset [StereoKITTI_ME, FlyingThings3D_ME, SemanticKITTI_ME, LidarKITTI_ME, WaymoOpen_ME] 13 | root: ./data/stereo_kitti/ # Path to the data 14 | input_features: absolute_coords # Input features assigned to each sparse voxel 15 | n_classes: 2 # Number of classes for the background segmentation head 16 | remove_ground: True # Remove ground by simple thresholding of the height coordinate 17 | augment_data: False # Augment the data by random rotation and translation 18 | 19 | network: 20 | normalize_features: True 21 | norm_type: IN # Type of normalization layer IN = instance, BN = batch normalization, False = no normalization 22 | in_kernel_size: 7 # Size of the initial convolutional kernel 23 | feature_dim: 64 24 | ego_motion_points: 1024 # Number of points that are randomly sampled for the ego motion estimation 25 | add_slack: True # Add slack row and column in the Sinkhorn iteration module 26 | sinkhorn_iter: 3 # Number of Sinkhorn iterations in the ego motion module 27 | use_pretrained: True # Flag for training 28 | cluster_metric: euclidean # Distance metric used to compute the cluster assignments 0 = Euclidean 29 | min_p_cluster: 30 # Min number of points in a cluster 30 | min_samples_dbscan: 5 # Min number of points in the neighborhood DBSCAN 31 | eps_dbscan: 0.75 # Eps value in DBSCAN for the Euclidean distance 32 | pretrained_path: logs/logs_FlyingThings3D_ME/flow_head_l1_loss/model_best.pt # Path to the pretrained model 33 | 34 | test: 35 | batch_size: 1 # Test batch size 36 | num_workers: 1 # Num of workers to use for the test data set 37 | postprocess_ego: False # Apply postprocessing (optimization of the ego-motion) 38 | postprocess_clusters: False # Apply postprocessing (optimization of the motion across the clusters) 39 | results_dir: ./eval_results/stereo_kitti/ 40 | 41 | loss: 42 | background_loss: False # Compute background loss 43 | flow_loss: False # Compute flow loss 44 | ego_loss: False # Compute ego-motion loss 45 | foreground_loss: False # Compute foreground loss 46 | 47 | metrics: 48 | flow: True # Compute evaluation metrics for flow estimation (EPE3D, Acc3DS, Acc3DR, Outliers) 49 | ego_motion: False # Compute 
evaluation metrics for ego-motion estimation (RRE, RTE) 50 | semantic: False # Compute evaluation metrics for background segmentation (Precision, Recall) 51 | -------------------------------------------------------------------------------- /configs/eval/eval_waymo_open.yaml: -------------------------------------------------------------------------------- 1 | method: 2 | backbone: 'ME' # Type of backbone network [ME] 3 | flow: True # Use scene-flow head 4 | ego_motion: True # Use ego-motion head 5 | semantic: True # Use background segmentation head 6 | clustering: True # Use foreground clustering head 7 | 8 | misc: 9 | run_mode: test # Mode to run the network in 10 | 11 | data: 12 | dataset: WaymoOpen_ME # Name of the dataset [StereoKITTI_ME, FlyingThings3D_ME, SemanticKITTI_ME, LidarKITTI_ME, WaymoOpen_ME] 13 | root: ./data/waymo_open/ # Path to the data 14 | input_features: absolute_coords # Input features assigned to each sparse voxel 15 | n_classes: 2 # Number of classes for the background segmentation head 16 | remove_ground: False # Remove ground by simple thresholding of the height coordinate 17 | augment_data: False # Augment the data by random rotation and translation 18 | 19 | network: 20 | normalize_features: True 21 | norm_type: IN # Type of normalization layer IN = instance, BN = batch normalization, False = no normalization 22 | in_kernel_size: 7 # Size of the initial convolutional kernel 23 | feature_dim: 64 24 | ego_motion_points: 1024 # Number of points that are randomly sampled for the ego motion estimation 25 | add_slack: True # Add slack row and column in the Sinkhorn iteration module 26 | sinkhorn_iter: 3 # Number of Sinkhorn iterations in the ego motion module 27 | use_pretrained: True # Flag for training 28 | cluster_metric: euclidean # Distance metric used to compute the cluster assignments 0 = Euclidean 29 | min_p_cluster: 30 # Min number of points in a cluster 30 | min_samples_dbscan: 5 # Min number of points in the neighborhood DBSCAN 31 | eps_dbscan: 0.75 # Eps value in DBSCAN for the Euclidean distance 32 | pretrained_path: logs/logs_WaymoOpen_ME/full_pretrained_all_loss/model_best.pt # Path to the pretrained model 33 | 34 | test: 35 | batch_size: 1 # Test batch size 36 | num_workers: 1 # Num of workers to use for the test data set 37 | postprocess_ego: True # Apply postprocessing (optimization of the ego-motion) 38 | postprocess_clusters: True # Apply postprocessing (optimization of the motion across the clusters) 39 | results_dir: ./eval_results/waymo_open/ 40 | 41 | loss: 42 | background_loss: False # Compute background loss 43 | flow_loss: False # Compute flow loss 44 | ego_loss: False # Compute ego-motion loss 45 | foreground_loss: False # Compute foreground loss 46 | 47 | metrics: 48 | flow: False # Compute evaluation metrics for flow estimation (EPE3D, Acc3DS, Acc3DR, Outliers) 49 | ego_motion: True # Compute evaluation metrics for ego-motion estimation (RRE, RTE) 50 | semantic: True # Compute evaluation metrics for background segmentation (Precision, Recall) -------------------------------------------------------------------------------- /configs/train/train_fully_supervised.yaml: -------------------------------------------------------------------------------- 1 | method: 2 | backbone: 'ME' # Type of backbone network [ME] 3 | flow: True # Use scene-flow head 4 | ego_motion: False # Use ego-motion head 5 | semantic: False # Use background segmentation head 6 | clustering: False # Use foreground clustering head 7 | 8 | misc: 9 | run_mode: train # Mode to 
run the network in 10 | 11 | data: 12 | dataset: FlyingThings3D_ME # Name of the dataset [StereoKITTI_ME, FlyingThings3D_ME, SemanticKITTI_ME, LidarKITTI_ME, WaymoOpen_ME] 13 | root: ./data/flying_things_3d/ # Path to the data 14 | input_features: absolute_coords # Input features assigned to each sparse voxel 15 | remove_ground: False # Remove ground by simple thresholding of the height coordinate 16 | augment_data: False # Augment the data by random rotation and translation 17 | 18 | network: 19 | normalize_features: True # If the feature for the correspondence computation should be normalized 20 | norm_type: IN # Type of normalization layer IN = instance, BN = batch normalization, False = no normalization 21 | in_kernel_size: 7 # Size of the initial convolutional kernel 22 | feature_dim: 64 23 | use_pretrained: True # Flag for training 24 | pretrained_path: '' # Path to the pretrained model 25 | 26 | loss: 27 | background_loss: False # Compute background loss 28 | flow_loss: True # Compute flow loss 29 | ego_loss: False # Compute ego-motion loss 30 | foreground_loss: False # Compute foreground loss 31 | 32 | metrics: 33 | flow: True # Compute evaluation metrics for flow estimation (EPE3D, Acc3DS, Acc3DR, Outliers) 34 | ego_motion: False # Compute evaluation metrics for ego-motion estimation (RRE, RTE) 35 | semantic: False # Compute evaluation metrics for background segmentation (Precision, Recall) 36 | -------------------------------------------------------------------------------- /configs/train/train_weakly_supervised.yaml: -------------------------------------------------------------------------------- 1 | method: 2 | backbone: 'ME' # Type of backbone network [ME] 3 | flow: True # Use scene-flow head 4 | ego_motion: True # Use ego-motion head 5 | semantic: True # Use background segmentation head 6 | clustering: True # Use foreground clustering head 7 | 8 | misc: 9 | run_mode: train # Mode to run the network in 10 | 11 | data: 12 | dataset: SemanticKITTI_ME # Name of the dataset [StereoKITTI_ME, FlyingThings3D_ME, SemanticKITTI_ME, LidarKITTI_ME, WaymoOpen_ME] 13 | root: ./data/semantic_kitti/ # Path to the data 14 | input_features: absolute_coords # Input features assigned to each sparse voxel 15 | n_classes: 2 # Number of classes for the background segmentation head 16 | remove_ground: True # Remove ground by simple thresholding of the height coordinate 17 | augment_data: True # Augment the data by random rotation and translation 18 | 19 | network: 20 | normalize_features: True 21 | norm_type: IN # Type of normalization layer IN = instance, BN = batch normalization, False = no normalization 22 | in_kernel_size: 7 # Size of the initial convolutional kernel 23 | feature_dim: 64 24 | ego_motion_points: 1024 # Number of points that are randomly sampled for the ego motion estimation 25 | add_slack: True # Add slack row and column in the Sinkhorn iteration module 26 | sinkhorn_iter: 3 # Number of Sinkhorn iterations in the ego motion module 27 | use_pretrained: True # Flag for training 28 | cluster_metric: euclidean # Distance metric used to compute the cluster assignments 0 = Euclidean 29 | min_p_cluster: 30 # Min number of points in a cluster 30 | min_samples_dbscan: 5 # Min number of points in the neighborhood DBSCAN 31 | eps_dbscan: 0.75 # Eps value in DBSCAN for the Euclidean distance 32 | pretrained_path: '' # Path to the pretrained model 33 | 34 | loss: 35 | background_loss: True # Compute background loss 36 | flow_loss: False # Compute flow loss 37 | ego_loss: True # Compute 
ego-motion loss 38 | foreground_loss: True # Compute foreground loss 39 | 40 | metrics: 41 | flow: False # Compute evaluation metrics for flow estimation (EPE3D, Acc3DS, Acc3DR, Outliers) 42 | ego_motion: True # Compute evaluation metrics for ego-motion estimation (RRE, RTE) 43 | semantic: True # Compute evaluation metrics for background segmentation (Precision, Recall) 44 | 45 | -------------------------------------------------------------------------------- /data_preprocessing/IO.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3.4 2 | 3 | import os 4 | import re 5 | import numpy as np 6 | import uuid 7 | from scipy import misc 8 | import numpy as np 9 | from PIL import Image 10 | import sys 11 | 12 | 13 | def read(file): 14 | if file.endswith('.float3'): return readFloat(file) 15 | elif file.endswith('.flo'): return readFlow(file) 16 | elif file.endswith('.ppm'): return readImage(file) 17 | elif file.endswith('.pgm'): return readImage(file) 18 | elif file.endswith('.png'): return readImage(file) 19 | elif file.endswith('.jpg'): return readImage(file) 20 | elif file.endswith('.pfm'): return readPFM(file)[0] 21 | else: raise Exception('don\'t know how to read %s' % file) 22 | 23 | def write(file, data): 24 | if file.endswith('.float3'): return writeFloat(file, data) 25 | elif file.endswith('.flo'): return writeFlow(file, data) 26 | elif file.endswith('.ppm'): return writeImage(file, data) 27 | elif file.endswith('.pgm'): return writeImage(file, data) 28 | elif file.endswith('.png'): return writeImage(file, data) 29 | elif file.endswith('.jpg'): return writeImage(file, data) 30 | elif file.endswith('.pfm'): return writePFM(file, data) 31 | else: raise Exception('don\'t know how to write %s' % file) 32 | 33 | def readPFM(file): 34 | file = open(file, 'rb') 35 | 36 | color = None 37 | width = None 38 | height = None 39 | scale = None 40 | endian = None 41 | 42 | header = file.readline().rstrip() 43 | if header.decode("ascii") == 'PF': 44 | color = True 45 | elif header.decode("ascii") == 'Pf': 46 | color = False 47 | else: 48 | raise Exception('Not a PFM file.') 49 | 50 | dim_match = re.match(r'^(\d+)\s(\d+)\s$', file.readline().decode("ascii")) 51 | if dim_match: 52 | width, height = list(map(int, dim_match.groups())) 53 | else: 54 | raise Exception('Malformed PFM header.') 55 | 56 | scale = float(file.readline().decode("ascii").rstrip()) 57 | if scale < 0: # little-endian 58 | endian = '<' 59 | scale = -scale 60 | else: 61 | endian = '>' # big-endian 62 | 63 | data = np.fromfile(file, endian + 'f') 64 | shape = (height, width, 3) if color else (height, width) 65 | 66 | data = np.reshape(data, shape) 67 | data = np.flipud(data) 68 | return data, scale 69 | 70 | def writePFM(file, image, scale=1): 71 | file = open(file, 'wb') 72 | 73 | color = None 74 | 75 | if image.dtype.name != 'float32': 76 | raise Exception('Image dtype must be float32.') 77 | 78 | image = np.flipud(image) 79 | 80 | if len(image.shape) == 3 and image.shape[2] == 3: # color image 81 | color = True 82 | elif len(image.shape) == 2 or len(image.shape) == 3 and image.shape[2] == 1: # greyscale 83 | color = False 84 | else: 85 | raise Exception('Image must have H x W x 3, H x W x 1 or H x W dimensions.') 86 | 87 | file.write('PF\n' if color else 'Pf\n'.encode()) 88 | file.write('%d %d\n'.encode() % (image.shape[1], image.shape[0])) 89 | 90 | endian = image.dtype.byteorder 91 | 92 | if endian == '<' or endian == '=' and sys.byteorder == 'little': 93 | scale = -scale 
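    # In the PFM format the sign of the scale factor encodes byte order: a negative
    # scale means little-endian floats, a positive one big-endian (see readPFM above),
    # which is why the scale was negated above before being written to the header.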
94 | 95 | file.write('%f\n'.encode() % scale) 96 | 97 | image.tofile(file) 98 | 99 | def readFlow(name): 100 | if name.endswith('.pfm') or name.endswith('.PFM'): 101 | return readPFM(name)[0][:,:,0:2] 102 | 103 | f = open(name, 'rb') 104 | 105 | header = f.read(4) 106 | if header.decode("utf-8") != 'PIEH': 107 | raise Exception('Flow file header does not contain PIEH') 108 | 109 | width = np.fromfile(f, np.int32, 1).squeeze() 110 | height = np.fromfile(f, np.int32, 1).squeeze() 111 | 112 | flow = np.fromfile(f, np.float32, width * height * 2).reshape((height, width, 2)) 113 | 114 | return flow.astype(np.float32) 115 | 116 | def readImage(name): 117 | if name.endswith('.pfm') or name.endswith('.PFM'): 118 | data = readPFM(name)[0] 119 | if len(data.shape)==3: 120 | return data[:,:,0:3] 121 | else: 122 | return data 123 | 124 | return misc.imread(name) 125 | 126 | def writeImage(name, data): 127 | if name.endswith('.pfm') or name.endswith('.PFM'): 128 | return writePFM(name, data, 1) 129 | 130 | return misc.imsave(name, data) 131 | 132 | def writeFlow(name, flow): 133 | f = open(name, 'wb') 134 | f.write('PIEH'.encode('utf-8')) 135 | np.array([flow.shape[1], flow.shape[0]], dtype=np.int32).tofile(f) 136 | flow = flow.astype(np.float32) 137 | flow.tofile(f) 138 | 139 | def readFloat(name): 140 | f = open(name, 'rb') 141 | 142 | if(f.readline().decode("utf-8")) != 'float\n': 143 | raise Exception('float file %s did not contain keyword' % name) 144 | 145 | dim = int(f.readline()) 146 | 147 | dims = [] 148 | count = 1 149 | for i in range(0, dim): 150 | d = int(f.readline()) 151 | dims.append(d) 152 | count *= d 153 | 154 | dims = list(reversed(dims)) 155 | 156 | data = np.fromfile(f, np.float32, count).reshape(dims) 157 | if dim > 2: 158 | data = np.transpose(data, (2, 1, 0)) 159 | data = np.transpose(data, (1, 0, 2)) 160 | 161 | return data 162 | 163 | def writeFloat(name, data): 164 | f = open(name, 'wb') 165 | 166 | dim=len(data.shape) 167 | if dim>3: 168 | raise Exception('bad float file dimension: %d' % dim) 169 | 170 | f.write(('float\n').encode('ascii')) 171 | f.write(('%d\n' % dim).encode('ascii')) 172 | 173 | if dim == 1: 174 | f.write(('%d\n' % data.shape[0]).encode('ascii')) 175 | else: 176 | f.write(('%d\n' % data.shape[1]).encode('ascii')) 177 | f.write(('%d\n' % data.shape[0]).encode('ascii')) 178 | for i in range(2, dim): 179 | f.write(('%d\n' % data.shape[i]).encode('ascii')) 180 | 181 | data = data.astype(np.float32) 182 | if dim==2: 183 | data.tofile(f) 184 | 185 | else: 186 | np.transpose(data, (2, 0, 1)).tofile(f) 187 | 188 | -------------------------------------------------------------------------------- /data_preprocessing/flyingthings3d_utils.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | 4 | def next_pixel2pc(flow, disparity, save_path=None, f=-1050., cx=479.5, cy=269.5): 5 | height, width = disparity.shape 6 | 7 | BASELINE = 1.0 8 | depth = -1. * f * BASELINE / disparity 9 | 10 | x = ((np.tile(np.arange(width, dtype=np.float32)[None, :], (height, 1)) - cx + flow[..., 0]) * -1. / disparity)[:, 11 | :, None] 12 | y = ((np.tile(np.arange(height, dtype=np.float32)[:, None], (1, width)) - cy + flow[..., 1]) * 1. 
/ disparity)[:, 13 | :, None] 14 | pc = np.concatenate((x, y, depth[:, :, None]), axis=-1) 15 | 16 | if save_path is not None: 17 | np.save(save_path, pc) 18 | return pc 19 | 20 | 21 | def pixel2pc(disparity, save_path=None, f=-1050., cx=479.5, cy=269.5): 22 | height, width = disparity.shape 23 | 24 | BASELINE = 1.0 25 | depth = -1. * f * BASELINE / disparity 26 | 27 | x = ((np.tile(np.arange(width, dtype=np.float32)[None, :], (height, 1)) - cx) * -1. / disparity)[:, :, None] 28 | y = ((np.tile(np.arange(height, dtype=np.float32)[:, None], (1, width)) - cy) * 1. / disparity)[:, :, None] 29 | pc = np.concatenate((x, y, depth[:, :, None]), axis=-1) 30 | 31 | if save_path is not None: 32 | np.save(save_path, pc) 33 | return pc -------------------------------------------------------------------------------- /data_preprocessing/kitti_utils.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import png 3 | 4 | 5 | def pixel2xyz(depth, P_rect, px=None, py=None): 6 | assert P_rect[0,1] == 0 7 | assert P_rect[1,0] == 0 8 | assert P_rect[2,0] == 0 9 | assert P_rect[2,1] == 0 10 | assert P_rect[0,0] == P_rect[1,1] 11 | focal_length_pixel = P_rect[0,0] 12 | 13 | height, width = depth.shape[:2] 14 | if px is None: 15 | px = np.tile(np.arange(width, dtype=np.float32)[None, :], (height, 1)) 16 | if py is None: 17 | py = np.tile(np.arange(height, dtype=np.float32)[:, None], (1, width)) 18 | const_x = P_rect[0,2] * depth + P_rect[0,3] 19 | const_y = P_rect[1,2] * depth + P_rect[1,3] 20 | 21 | x = ((px * (depth + P_rect[2,3]) - const_x) / focal_length_pixel) [:, :, None] 22 | y = ((py * (depth + P_rect[2,3]) - const_y) / focal_length_pixel) [:, :, None] 23 | pc = np.concatenate((x, y, depth[:, :, None]), axis=-1) 24 | 25 | pc[..., :2] *= -1. 26 | return pc 27 | 28 | 29 | def load_uint16PNG(fpath): 30 | reader = png.Reader(fpath) 31 | pngdata = reader.read() 32 | px_array = np.vstack( map(np.uint16, pngdata[2]) ) 33 | if pngdata[3]['planes'] == 3: 34 | width, height = pngdata[:2] 35 | px_array = px_array.reshape(height, width, 3) 36 | return px_array 37 | 38 | 39 | def load_disp(fpath): 40 | # A 0 value indicates an invalid pixel (ie, no 41 | # ground truth exists, or the estimation algorithm didn't produce an estimate 42 | # for that pixel). 43 | array = load_uint16PNG(fpath) 44 | valid = array > 0 45 | disp = array.astype(np.float32) / 256.0 46 | disp[np.logical_not(valid)] = -1. 47 | return disp, valid 48 | 49 | 50 | def load_op_flow(fpath): 51 | array = load_uint16PNG(fpath) 52 | valid = array[..., -1] == 1 53 | array = array.astype(np.float32) 54 | flow = (array[..., :-1] - 2**15) / 64. 55 | return flow, valid 56 | 57 | 58 | def disp_2_depth(disparity, valid_disp, FOCAL_LENGTH_PIXEL): 59 | BASELINE = 0.54 60 | depth = FOCAL_LENGTH_PIXEL * BASELINE / (disparity + 1e-5) 61 | depth[np.logical_not(valid_disp)] = -1. 
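    # BASELINE = 0.54 m is the KITTI stereo rig baseline; pixels without a valid
    # disparity are flagged with depth -1 (mirroring load_disp above) so that
    # callers can mask them out.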
62 | return depth 63 | -------------------------------------------------------------------------------- /data_preprocessing/process_flyingthings3d_subset.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import sys, os 3 | import os.path as osp 4 | from multiprocessing import Pool 5 | import argparse 6 | 7 | import IO 8 | from flyingthings3d_utils import * 9 | 10 | parser = argparse.ArgumentParser() 11 | parser.add_argument('--raw_data_path', type=str, help="path to the raw data") 12 | parser.add_argument('--save_path', type=str, help="save path") 13 | parser.add_argument('--only_save_near_pts', dest='save_near', action='store_true', 14 | help='only save near points to save disk space') 15 | 16 | args = parser.parse_args() 17 | root_path = args.raw_data_path 18 | save_path = args.save_path 19 | 20 | splits = ['train', 'val'] 21 | 22 | 23 | def process_one_file(params): 24 | try: 25 | train_val, fname = params 26 | 27 | save_folder_path = osp.join(save_path, train_val) 28 | os.makedirs(save_folder_path, exist_ok=True) 29 | 30 | disp1 = IO.read(osp.join(root_path, train_val, 'disparity', 'left', fname + '.pfm')) 31 | disp1_occ = IO.read(osp.join(root_path, train_val, 'disparity_occlusions', 'left', fname + '.png')) 32 | disp1_change = IO.read( 33 | osp.join(root_path, train_val, 'disparity_change', 'left', 'into_future', fname + '.pfm')) 34 | flow = IO.read(osp.join(root_path, train_val, 'flow', 'left', 'into_future', fname + '.flo')) 35 | flow_occ = IO.read(osp.join(root_path, train_val, 'flow_occlusions', 'left', 'into_future', fname + '.png')) 36 | instance_mask = IO.read(osp.join(root_path, train_val, 'object_ids', 'left', fname + '.png')) 37 | 38 | pc1 = pixel2pc(disp1) 39 | pc2 = next_pixel2pc(flow, disp1 + disp1_change) 40 | 41 | if pc1[..., -1].max() > 0 or pc2[..., -1].max() > 0: 42 | print('z > 0', train_val, fname, pc1[..., -1].max(), pc1[..., -1].min(), pc2[..., -1].max(), 43 | pc2[..., -1].min()) 44 | 45 | valid_mask = np.logical_and(disp1_occ == 0, flow_occ == 0) 46 | 47 | pc1 = pc1[valid_mask] 48 | pc2 = pc2[valid_mask] 49 | flow = pc2 - pc1 50 | instance_mask = instance_mask[valid_mask] 51 | 52 | inst_cnt = 0 53 | 54 | 55 | 56 | if args.save_near: 57 | near_mask = np.logical_and(pc1[..., -1] > -35., pc2[..., -1] > -35.) 58 | pc1 = pc1[near_mask] 59 | pc2 = pc2[near_mask] 60 | flow = flow[near_mask] 61 | instance_mask = instance_mask[near_mask] 62 | 63 | # Map instance labels to small numbers 64 | for inst_label in set(instance_mask.tolist()): 65 | instance_mask[instance_mask==inst_label] = inst_cnt 66 | inst_cnt += 1 67 | 68 | if not args.save_near: 69 | np.savez_compressed(osp.join(save_folder_path, '{}.npz'.format(fname)), pc1=pc1, pc2=pc2, 70 | flow=flow, inst_pc1=instance_mask, 71 | inst_pc2=instance_mask) 72 | else: 73 | near_mask = np.logical_and(pc1[..., -1] > -35., pc2[..., -1] > -35.) 
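            # pixel2pc() yields negative z for points in front of the camera (see the
            # 'z > 0' warning above), so z > -35. keeps only points within 35 m of the
            # camera (the 'near points' cut that only_near_points in the configs
            # refers to) before saving.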
74 | np.savez_compressed(osp.join(save_folder_path, '{}.npz'.format(fname)), pc1=pc1[near_mask], 75 | pc2=pc2[near_mask], flow=flow[near_mask], 76 | inst_pc1=instance_mask[near_mask], 77 | inst_pc2=instance_mask[near_mask]) 78 | 79 | #np.savetxt('test.csv',np.concatenate((pc1,instance_mask.reshape(-1,1)),axis=1)) 80 | except Exception as ex: 81 | print('error in addressing params', params, 'see exception:') 82 | print(ex) 83 | sys.stdout.flush() 84 | return 85 | 86 | 87 | if __name__ == '__main__': 88 | param_list = [] 89 | for train_val in splits: 90 | tmp_path = osp.join(root_path, train_val, 'disparity_change', 'left', 'into_future') 91 | param_list.extend([(train_val, item.split('.')[0]) for item in os.listdir(tmp_path)]) 92 | 93 | pool = Pool(10) 94 | pool.map(process_one_file, param_list) 95 | pool.close() 96 | pool.join() 97 | 98 | print('Finish all!') 99 | -------------------------------------------------------------------------------- /data_preprocessing/process_kitti.py: -------------------------------------------------------------------------------- 1 | import os, sys 2 | import os.path as osp 3 | import numpy as np 4 | from multiprocessing import Pool 5 | import IO 6 | 7 | from kitti_utils import * 8 | 9 | 10 | data_root = sys.argv[1] 11 | calib_root = osp.join(data_root, 'training/calib_cam_to_cam') 12 | disp1_root = osp.join(data_root, 'training/disp_occ_0') 13 | disp2_root = osp.join(data_root, 'training/disp_occ_1') 14 | op_flow_root = osp.join(data_root, 'training/flow_occ') 15 | obj_map_root = osp.join(data_root, 'training/obj_map') 16 | save_path = sys.argv[2] 17 | 18 | 19 | def process_one_frame(idx): 20 | sidx = '{:06d}'.format(idx) 21 | 22 | calib_path = osp.join(calib_root, sidx + '.txt') 23 | with open(calib_path) as fd: 24 | lines = fd.readlines() 25 | assert len([line for line in lines if line.startswith('P_rect_02')]) == 1 26 | P_rect_left = \ 27 | np.array([float(item) for item in 28 | [line for line in lines if line.startswith('P_rect_02')][0].split()[1:]], 29 | dtype=np.float32).reshape(3, 4) 30 | 31 | assert P_rect_left[0, 0] == P_rect_left[1, 1] 32 | focal_length_pixel = P_rect_left[0, 0] 33 | 34 | disp1_path = osp.join(disp1_root, sidx + '_10.png') 35 | disp1, valid_disp1 = load_disp(disp1_path) 36 | depth1 = disp_2_depth(disp1, valid_disp1, focal_length_pixel) 37 | pc1 = pixel2xyz(depth1, P_rect_left) 38 | 39 | disp2_path = osp.join(disp2_root, sidx + '_10.png') 40 | disp2, valid_disp2 = load_disp(disp2_path) 41 | depth2 = disp_2_depth(disp2, valid_disp2, focal_length_pixel) 42 | 43 | valid_disp = np.logical_and(valid_disp1, valid_disp2) 44 | 45 | op_flow, valid_op_flow = load_op_flow(osp.join(op_flow_root, '{:06d}_10.png'.format(idx))) 46 | vertical = op_flow[..., 1] 47 | horizontal = op_flow[..., 0] 48 | height, width = op_flow.shape[:2] 49 | 50 | px2 = np.zeros((height, width), dtype=np.float32) 51 | py2 = np.zeros((height, width), dtype=np.float32) 52 | 53 | obj_map_1 = osp.join(obj_map_root, sidx + '_10.png') 54 | instance_mask = IO.read(obj_map_1) 55 | 56 | for i in range(height): 57 | for j in range(width): 58 | if valid_op_flow[i, j] and valid_disp[i, j]: 59 | try: 60 | dx = horizontal[i, j] 61 | dy = vertical[i, j] 62 | except: 63 | print('error, i,j:', i, j, 'hor and ver:', horizontal[i, j], vertical[i, j]) 64 | continue 65 | 66 | px2[i, j] = j + dx 67 | py2[i, j] = i + dy 68 | 69 | pc2 = pixel2xyz(depth2, P_rect_left, px=px2, py=py2) 70 | 71 | # Only consider points/pixels with valid disparity and flow information 72 | final_mask = 
np.logical_and(valid_disp, valid_op_flow) 73 | 74 | valid_pc1 = pc1[final_mask] 75 | valid_pc2 = pc2[final_mask] 76 | flow = valid_pc2 - valid_pc1 77 | instance_mask = instance_mask[final_mask] 78 | 79 | np.savetxt('test.csv',np.concatenate((valid_pc2,instance_mask.reshape(-1,1)),axis=1)) 80 | 81 | 82 | np.savez_compressed(osp.join(save_path, '{:06d}.npz'.format(idx)), pc1=valid_pc1, pc2=valid_pc2, 83 | flow=flow, inst_pc1=instance_mask, 84 | inst_pc2=instance_mask) 85 | 86 | 87 | pool = Pool(10) 88 | indices = range(200) 89 | pool.map(process_one_frame, indices) 90 | pool.close() 91 | pool.join() 92 | -------------------------------------------------------------------------------- /data_preprocessing/process_semantKitti.py: -------------------------------------------------------------------------------- 1 | import os 2 | import glob 3 | import argparse 4 | import re 5 | import copy 6 | 7 | import open3d as o3d 8 | import numpy as np 9 | from multiprocessing import Pool 10 | 11 | # Some of the functions are taken from pykitti https://github.com/utiasSTARS/pykitti/blob/master/pykitti/utils.py 12 | def load_velo_scan(file): 13 | """Load and parse a velodyne binary file.""" 14 | scan = np.fromfile(file, dtype=np.float32) 15 | return scan.reshape((-1, 4)) 16 | 17 | def load_poses(file): 18 | """Load and parse ground truth poses""" 19 | tmp_poses = np.genfromtxt(file, delimiter=' ').reshape(-1,3,4) 20 | poses = np.repeat(np.expand_dims(np.eye(4),0), tmp_poses.shape[0], axis=0) 21 | poses[:,0:3,:] = tmp_poses 22 | return poses 23 | 24 | def read_calib_file(filepath): 25 | """Read in a calibration file and parse into a dictionary.""" 26 | data = {} 27 | 28 | with open(filepath, 'r') as f: 29 | for line in f.readlines(): 30 | key, value = line.split(':', 1) 31 | # The only non-float values in these files are dates, which 32 | # we don't care about anyway 33 | try: 34 | data[key] = np.array([float(x) for x in value.split()]) 35 | except ValueError: 36 | pass 37 | 38 | return data 39 | 40 | # This part of the code is taken from the semanticKITTI API 41 | 42 | def open_label(filename): 43 | """ Open raw scan and fill in attributes 44 | """ 45 | # check filename is string 46 | if not isinstance(filename, str): 47 | raise TypeError("Filename should be string type, " 48 | "but was {type}".format(type=str(type(filename)))) 49 | 50 | # if all goes well, open label 51 | label = np.fromfile(filename, dtype=np.uint32) 52 | label = label.reshape((-1)) 53 | 54 | return label 55 | 56 | def set_label(label, points): 57 | """ Set points for label not from file but from np 58 | """ 59 | # check label makes sense 60 | if not isinstance(label, np.ndarray): 61 | raise TypeError("Label should be numpy array") 62 | 63 | # only fill in attribute if the right size 64 | if label.shape[0] == points.shape[0]: 65 | sem_label = label & 0xFFFF # semantic label in lower half 66 | inst_label = label >> 16 # instance id in upper half 67 | else: 68 | print("Points shape: ", points.shape) 69 | print("Label shape: ", label.shape) 70 | raise ValueError("Scan and Label don't contain same number of points") 71 | 72 | # sanity check 73 | assert((sem_label + (inst_label << 16) == label).all()) 74 | 75 | return sem_label, inst_label 76 | 77 | 78 | 79 | 80 | 81 | def transform_point_cloud(x1, R, t): 82 | """ 83 | Transforms the point cloud using the giver transformation paramaters 84 | 85 | Args: 86 | x1 (np array): points of the point cloud [n,3] 87 | R (np array): estimated rotation matrice [3,3] 88 | t (np array): estimated translation 
vectors [3,1] 89 | Returns: 90 | x1_t (np array): points of the transformed point clouds [n,3] 91 | """ 92 | x1_t = (np.matmul(R, x1.transpose()) + t).transpose() 93 | 94 | return x1_t 95 | 96 | def add_argument_group(name): 97 | arg = parser.add_argument_group(name) 98 | arg_lists.append(arg) 99 | return arg 100 | 101 | def sorted_alphanum(file_list_ordered): 102 | """ 103 | Sorts the list alphanumerically 104 | Args: 105 | file_list_ordered (list): list of files to be sorted 106 | Return: 107 | sorted_list (list): input list sorted alphanumerically 108 | """ 109 | def convert(text): 110 | return int(text) if text.isdigit() else text 111 | 112 | def alphanum_key(key): 113 | return [convert(c) for c in re.split('([0-9]+)', key)] 114 | 115 | sorted_list = sorted(file_list_ordered, key=alphanum_key) 116 | 117 | return sorted_list 118 | 119 | def get_file_list(path, extension=None): 120 | """ 121 | Build a list of all the files in the provided path 122 | Args: 123 | path (str): path to the directory 124 | extension (str): only return files with this extension 125 | Return: 126 | file_list (list): list of all the files (with the provided extension) sorted alphanumerically 127 | """ 128 | if extension is None: 129 | file_list = [os.path.join(path, f) for f in os.listdir(path) if os.path.isfile(os.path.join(path, f))] 130 | else: 131 | file_list = [ 132 | os.path.join(path, f) 133 | for f in os.listdir(path) 134 | if os.path.isfile(os.path.join(path, f)) and os.path.splitext(f)[1] == extension 135 | ] 136 | file_list = sorted_alphanum(file_list) 137 | 138 | return file_list 139 | 140 | 141 | def get_folder_list(path): 142 | """ 143 | Build a list of all the files in the provided path 144 | Args: 145 | path (str): path to the directory 146 | extension (str): only return files with this extension 147 | Returns: 148 | file_list (list): list of all the files (with the provided extension) sorted alphanumerically 149 | """ 150 | folder_list = [os.path.join(path, f) for f in os.listdir(path) if os.path.isdir(os.path.join(path, f))] 151 | folder_list = sorted_alphanum(folder_list) 152 | 153 | return folder_list 154 | 155 | 156 | def extract_moving_objects(save_path, frame_idx, pts, sem_label, inst_label, moving_threshold = 100): 157 | """ 158 | Extracts the point belonging to individual moving objects and saves them to a file 159 | Args: 160 | save_path (str): path where to save the files 161 | frame_idx (str): current frame number 162 | pts (np.array): point cloud of the source frame 163 | sem_label (np.array): semantic labels 164 | inst_label (np.array): temporally consistent instance labels 165 | moving_threshold (int): label above which the classes denote moving objects 166 | 167 | Returns: 168 | 169 | """ 170 | moving_idx_s = np.where(sem_label >= 100)[0] 171 | 172 | # Filter out the points and labels 173 | sem_label = sem_label[moving_idx_s] 174 | inst_label = inst_label[moving_idx_s] 175 | pts = pts[moving_idx_s,:] 176 | 177 | # Unique semantic labels 178 | unique_labels = np.unique(sem_label) 179 | 180 | pcd = o3d.geometry.PointCloud() 181 | for label in unique_labels: 182 | class_idx = np.where(sem_label == label)[0] 183 | class_instances = inst_label[class_idx] 184 | class_points = pts[class_idx,:] 185 | tmp_instances = np.unique(class_instances) 186 | 187 | for instance in tmp_instances: 188 | object_idx = np.where(class_instances == instance)[0] 189 | object_points = class_points[object_idx, :] 190 | 191 | # Save the points and sample a random color 192 | object_color = 
np.repeat(np.random.random(size=3).reshape(1,-1),repeats=object_points.shape[0], axis=0) 193 | pcd.points = o3d.utility.Vector3dVector(object_points) 194 | pcd.colors = o3d.utility.Vector3dVector(object_color) 195 | 196 | if not os.path.exists(os.path.join(save_path, 'objects', '{}_{}'.format(label, instance))): 197 | os.makedirs(os.path.join(save_path, 'objects', '{}_{}'.format(label, instance))) 198 | 199 | 200 | # Save point in the npz and ply format 201 | np.savez(os.path.join(save_path, 'objects', '{}_{}'.format(label, instance),'{}.npz'.format(frame_idx)), 202 | pts=object_points) 203 | 204 | o3d.io.write_point_cloud(os.path.join(save_path, 'objects', '{}_{}'.format(label, instance), 205 | '{}.ply'.format(frame_idx)), pcd) 206 | 207 | 208 | 209 | class semanticKITTIProcesor: 210 | def __init__(self, args): 211 | self.root_path = args.raw_data_path 212 | self.save_path = args.save_path 213 | self.save_ply = args.save_ply 214 | self.save_near = args.save_near 215 | self.n_processes = args.n_processes 216 | 217 | self.scenes = get_folder_list(self.root_path) 218 | 219 | def run_processing(self): 220 | 221 | if self.n_processes < 1: 222 | self.n_processes = 1 223 | 224 | pool = Pool(self.n_processes) 225 | pool.map(self.process_scene, self.scenes) 226 | pool.close() 227 | pool.join() 228 | 229 | def process_scene(self, scene): 230 | scene_name = scene.split(os.sep)[-1] 231 | 232 | # Create a save file if not existing 233 | if not os.path.exists(os.path.join(self.save_path, scene_name)): 234 | os.makedirs(os.path.join(self.save_path, scene_name)) 235 | 236 | # Load transformation paramters 237 | poses = load_poses(os.path.join(scene,'poses.txt')) 238 | tr_velo_cam = read_calib_file(os.path.join(scene,'calib.txt'))['Tr'].reshape(3,4) 239 | tr_velo_cam = np.concatenate((tr_velo_cam,np.array([0,0,0,1]).reshape(1,4)),axis=0) 240 | frames = get_file_list(os.path.join(scene,'velodyne'), extension='.bin') 241 | 242 | if os.path.isdir(os.path.join(scene,'labels')): 243 | labels = get_file_list(os.path.join(scene,'labels'), extension='.label') 244 | test_scene = False 245 | 246 | assert len(frames) == len(labels), "Number of point cloud fils and label files is not the same!!" 
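The packed 32-bit label format that `open_label` and `set_label` operate on keeps the semantic class in the lower 16 bits and the temporally consistent instance id in the upper 16 bits. A minimal, self-contained sketch (the label word below is made up purely for illustration):

```python
import numpy as np

# Hypothetical packed label word: instance id 2 in the upper 16 bits,
# semantic class 10 (a car in the standard SemanticKITTI mapping) in the lower 16 bits.
packed = np.array([0x0002000A], dtype=np.uint32)

sem_label = packed & 0xFFFF    # -> array([10])
inst_label = packed >> 16      # -> array([2])

# Same sanity check as in set_label: the two halves reconstruct the original word.
assert (sem_label + (inst_label << 16) == packed).all()
```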
247 | 248 | else: 249 | test_scene = True 250 | 251 | 252 | 253 | 254 | for idx in range(len(frames)-1): 255 | frame_name_s = frames[idx].split(os.sep)[-1].split('.')[0] 256 | frame_name_t = frames[idx + 1].split(os.sep)[-1].split('.')[0] 257 | 258 | pc_s = load_velo_scan(frames[idx])[:,:3] 259 | pc_t = load_velo_scan(frames[idx + 1])[:,:3] 260 | 261 | # Transform both point cloud to the camera coordinate system (check KITTI webpage) 262 | pc_s = transform_point_cloud(pc_s, tr_velo_cam[:3, :3], tr_velo_cam[:3, 3:4]) 263 | pc_t = transform_point_cloud(pc_t, tr_velo_cam[:3, :3], tr_velo_cam[:3, 3:4]) 264 | 265 | # Rotate 180 degrees around z axis (to be in accordance to KITTI flow as used by other datsets) 266 | pc_s[:,0], pc_s[:,1] = -pc_s[:,0], -pc_s[:,1] 267 | pc_t[:,0], pc_t[:,1] = -pc_t[:,0], -pc_t[:,1] 268 | 269 | 270 | 271 | 272 | if not test_scene: 273 | # Load the labels 274 | sem_label_s, inst_label_s = set_label(open_label(labels[idx]), pc_s) 275 | sem_label_t, inst_label_t = set_label(open_label(labels[idx + 1]), pc_t) 276 | 277 | # Filter out points which are behind the car (to be in accordance with the stereo datasets) 278 | front_mask_s = pc_s[:,2] > 1.5 279 | front_mask_t = pc_t[:,2] > 1.5 280 | pc_s = pc_s[front_mask_s, :] 281 | pc_t = pc_t[front_mask_t,:] 282 | 283 | sem_label_s = sem_label_s[front_mask_s] 284 | inst_label_s = inst_label_s[front_mask_s] 285 | 286 | sem_label_t = sem_label_t[front_mask_t] 287 | inst_label_t = inst_label_t[front_mask_t] 288 | 289 | if self.save_near: 290 | near_mask_s = pc_s[:,2] < 35 291 | near_mask_t = pc_t[:,2] < 35 292 | pc_s = pc_s[near_mask_s, :] 293 | pc_t = pc_t[near_mask_t,:] 294 | 295 | sem_label_s = sem_label_s[near_mask_s] 296 | inst_label_s = inst_label_s[near_mask_s] 297 | 298 | sem_label_t = sem_label_t[near_mask_t] 299 | inst_label_t = inst_label_t[near_mask_t] 300 | 301 | # Extract the stable parts (sem. labels above 99 denote moving objects) 302 | # Could also remove 11, 15, 30, 31, 32 (classes like cyclist, person, ...) 
303 | # Motion labels are 1 if moving and 0 if stable 304 | stable_idx_s = np.where(sem_label_s < 100)[0] 305 | stable_idx_t = np.where(sem_label_t < 100)[0] 306 | mot_label_s = np.ones_like(sem_label_s) 307 | mot_label_s[stable_idx_s] = 0 308 | 309 | mot_label_t = np.ones_like(sem_label_t) 310 | mot_label_t[stable_idx_t] = 0 311 | 312 | 313 | # Extract ego motion from the gt poses 314 | T_st = np.matmul(poses[idx,:,:],np.linalg.inv(poses[idx + 1,:,:])) 315 | 316 | 317 | 318 | np.savez_compressed(os.path.join(self.save_path, scene_name, '{}_{}.npz'.format(frame_name_s, frame_name_t)), 319 | pc1=pc_s, 320 | pc2=pc_t, 321 | sem_label_s=sem_label_s, 322 | sem_label_t=sem_label_t, 323 | inst_label_s=inst_label_s, 324 | inst_label_t=inst_label_t, 325 | mot_label_s=mot_label_s, 326 | mot_label_t=mot_label_t, 327 | pose_s=poses[idx,:,:], 328 | pose_t=poses[idx + 1,:,:]) 329 | else: 330 | # Filter out points which are behind the car (to be in accordance with the stereo datasets) 331 | front_mask_s = pc_s[:,2] > 1.5 332 | front_mask_t = pc_t[:,2] > 1.5 333 | pc_s = pc_s[front_mask_s, :] 334 | pc_t = pc_t[front_mask_t,:] 335 | 336 | if self.save_near: 337 | near_mask_s = pc_s[:,2] < 35 338 | near_mask_t = pc_t[:,2] < 35 339 | pc_s = pc_s[near_mask_s, :] 340 | pc_t = pc_t[near_mask_t,:] 341 | 342 | np.savez_compressed(os.path.join(self.save_path, scene_name, '{}_{}.npz'.format(frame_name_s, frame_name_t)), 343 | pc1=pc_s, 344 | pc2=pc_t, 345 | pose_s=poses[idx,:,:], 346 | pose_t=poses[idx + 1,:,:]) 347 | 348 | 349 | # Save point clouds as ply files 350 | if self.save_ply: 351 | pcd_s = o3d.geometry.PointCloud() 352 | pcd_t = o3d.geometry.PointCloud() 353 | pcd_s.points = o3d.utility.Vector3dVector(pc_s) 354 | pcd_t.points = o3d.utility.Vector3dVector(pc_t) 355 | 356 | o3d.io.write_point_cloud(os.path.join(self.save_path, scene_name, '{}.ply'.format(frame_name_s)), pcd_s) 357 | o3d.io.write_point_cloud(os.path.join(self.save_path, scene_name, '{}.ply'.format(frame_name_t)), pcd_t) 358 | 359 | 360 | 361 | 362 | # Define and process command line arguments 363 | parser = argparse.ArgumentParser() 364 | parser.add_argument("--raw_data_path", type=str, default="test", help='path to the raw files') 365 | parser.add_argument('--save_path', type=str, help="save path") 366 | parser.add_argument('--n_processes', type=int, default=10, 367 | help='number of processes used for multi-processing') 368 | parser.add_argument('--save_ply', action='store_true', 369 | help='save point clouds also in ply format') 370 | parser.add_argument('--save_near', action='store_true', 371 | help='only save near points (less than 35m)') 372 | 373 | 374 | args = parser.parse_args() 375 | 376 | 377 | processor = semanticKITTIProcesor(args) 378 | 379 | processor.run_processing() -------------------------------------------------------------------------------- /data_preprocessing/python_pfm.py: -------------------------------------------------------------------------------- 1 | import re 2 | import numpy as np 3 | import sys 4 | 5 | 6 | def readPFM(file): 7 | file = open(file, 'rb') 8 | 9 | color = None 10 | width = None 11 | height = None 12 | scale = None 13 | endian = None 14 | 15 | header = file.readline().rstrip() 16 | if header == 'PF': 17 | color = True 18 | elif header == 'Pf': 19 | color = False 20 | else: 21 | raise Exception('Not a PFM file.') 22 | 23 | dim_match = re.match(r'^(\d+)\s(\d+)\s$', file.readline()) 24 | if dim_match: 25 | width, height = map(int, dim_match.groups()) 26 | else: 27 | raise Exception('Malformed PFM 
header.') 28 | 29 | scale = float(file.readline().rstrip()) 30 | if scale < 0: # little-endian 31 | endian = '<' 32 | scale = -scale 33 | else: 34 | endian = '>' # big-endian 35 | 36 | data = np.fromfile(file, endian + 'f') 37 | shape = (height, width, 3) if color else (height, width) 38 | 39 | data = np.reshape(data, shape) 40 | data = np.flipud(data) 41 | return data, scale 42 | 43 | def writePFM(file, image, scale=1): 44 | file = open(file, 'wb') 45 | 46 | color = None 47 | 48 | if image.dtype.name != 'float32': 49 | raise Exception('Image dtype must be float32.') 50 | 51 | image = np.flipud(image) 52 | 53 | if len(image.shape) == 3 and image.shape[2] == 3: # color image 54 | color = True 55 | elif len(image.shape) == 2 or len(image.shape) == 3 and image.shape[2] == 1: # greyscale 56 | color = False 57 | else: 58 | raise Exception('Image must have H x W x 3, H x W x 1 or H x W dimensions.') 59 | 60 | file.write('PF\n' if color else 'Pf\n') 61 | file.write('%d %d\n' % (image.shape[1], image.shape[0])) 62 | 63 | endian = image.dtype.byteorder 64 | 65 | if endian == '<' or endian == '=' and sys.byteorder == 'little': 66 | scale = -scale 67 | 68 | file.write('%f\n' % scale) 69 | 70 | image.tofile(file) 71 | 72 | -------------------------------------------------------------------------------- /eval.py: -------------------------------------------------------------------------------- 1 | import os 2 | import logging 3 | import torch 4 | import time 5 | import argparse 6 | import yaml 7 | 8 | import numpy as np 9 | from collections import defaultdict 10 | from tqdm import tqdm 11 | 12 | import lib.config as config 13 | from lib.utils import n_model_parameters, dict_all_to_device, load_checkpoint 14 | from lib.data import make_data_loader 15 | from lib.logger import prepare_logger 16 | 17 | 18 | 19 | # Set the random seeds for repeatability 20 | np.random.seed(41) 21 | torch.manual_seed(41) 22 | if torch.cuda.is_available(): 23 | torch.cuda.manual_seed(41) 24 | 25 | def main(cfg, logger): 26 | """ 27 | Main function of this evaluation software. After preparing the data loaders, and the model start with the evaluation process. 
28 | Args: 29 | cfg (dict): current configuration paramaters 30 | """ 31 | 32 | # Create the output dir if it does not exist 33 | if not os.path.exists(cfg['test']['results_dir']): 34 | os.makedirs(cfg['test']['results_dir']) 35 | 36 | # Get model 37 | model = config.get_model(cfg) 38 | device = torch.device('cuda' if (torch.cuda.is_available() and cfg['misc']['use_gpu']) else 'cpu') 39 | 40 | # Get data loader 41 | eval_loader = make_data_loader(cfg, phase='test') 42 | 43 | # Log directory 44 | dataset_name = cfg["data"]["dataset"] 45 | 46 | path2log = os.path.join(cfg['test']['results_dir'], dataset_name, '{}_{}'.format(cfg['method']['backbone'], cfg['misc']['num_points'])) 47 | 48 | logger, checkpoint_dir = prepare_logger(cfg, path2log) 49 | 50 | # Output torch and cuda version 51 | 52 | logger.info('Torch version: {}'.format(torch.__version__)) 53 | logger.info('CUDA version: {}'.format(torch.version.cuda)) 54 | logger.info('Starting evaluation of the method {} on {} dataset'.format(cfg['method']['backbone'], dataset_name)) 55 | 56 | # Save config file that was used for this experiment 57 | with open(os.path.join(path2log, "config.yaml"),'w') as outfile: 58 | yaml.dump(cfg, outfile, default_flow_style=False, allow_unicode=True) 59 | 60 | 61 | logger.info("Parameter Count: {:d}".format(n_model_parameters(model))) 62 | 63 | # Load the pretrained weights 64 | if cfg['network']['use_pretrained'] and cfg['network']['pretrained_path']: 65 | model, optimizer, scheduler, epoch_it, total_it, metric_val_best = load_checkpoint(model, None, None, filename=cfg['network']['pretrained_path']) 66 | 67 | else: 68 | logger.warning('MODEL RUNS IN EVAL MODE, BUT NO PRETRAINED WEIGHTS WERE LOADED!!!!') 69 | 70 | 71 | # Initialize the trainer 72 | trainer = config.get_trainer(cfg, model,device) 73 | 74 | # if not a pretrained model epoch and iterations should be -1 75 | eval_metrics = defaultdict(list) 76 | start = time.time() 77 | 78 | for it, batch in enumerate(tqdm(eval_loader)): 79 | # Put all the tensors to the designated device 80 | dict_all_to_device(batch, device) 81 | 82 | 83 | metrics = trainer.eval_step(batch) 84 | 85 | for key in metrics: 86 | eval_metrics[key].append(metrics[key]) 87 | 88 | 89 | stop = time.time() 90 | 91 | # Compute mean values of the evaluation statistics 92 | result_string = '' 93 | 94 | for key, value in eval_metrics.items(): 95 | if key not in ['true_p', 'true_n', 'false_p', 'false_n']: 96 | result_string += '{}: {:.3f}; '.format(key, np.mean(value)) 97 | 98 | if 'true_p' in eval_metrics: 99 | result_string += '{}: {:.3f}; '.format('dataset_precision_f', (np.sum(eval_metrics['true_p']) / (np.sum(eval_metrics['true_p']) + np.sum(eval_metrics['false_p'])) )) 100 | result_string += '{}: {:.3f}; '.format('dataset_recall_f', (np.sum(eval_metrics['true_p']) / (np.sum(eval_metrics['true_p']) + np.sum(eval_metrics['false_n'])))) 101 | 102 | result_string += '{}: {:.3f}; '.format('dataset_precision_b', (np.sum(eval_metrics['true_n']) / (np.sum(eval_metrics['true_n']) + np.sum(eval_metrics['false_n'])))) 103 | result_string += '{}: {:.3f}; '.format('dataset_recall_b', (np.sum(eval_metrics['true_n']) / (np.sum(eval_metrics['true_n']) + np.sum(eval_metrics['false_p'])))) 104 | 105 | 106 | logger.info('Outputing the evaluation metric for: {} {} {} '.format('Flow, ' if cfg['metrics']['flow'] else '', 'Ego-Motion, ' if cfg['metrics']['ego_motion'] else '', 'Bckg. 
Segmentaion' if cfg['metrics']['semantic'] else '')) 107 | logger.info(result_string) 108 | logger.info('Evaluation completed in {}s [{}s per scene]'.format((stop - start), (stop - start)/len(eval_loader))) 109 | 110 | 111 | if __name__ == "__main__": 112 | logger = logging.getLogger 113 | 114 | 115 | parser = argparse.ArgumentParser() 116 | parser.add_argument('config', type=str, help= 'Path to the config file.') 117 | args = parser.parse_args() 118 | 119 | cfg = config.get_config(args.config, 'configs/default.yaml') 120 | 121 | main(cfg, logger) -------------------------------------------------------------------------------- /lib/__pycache__/__init__.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/lib/__pycache__/__init__.cpython-36.pyc -------------------------------------------------------------------------------- /lib/__pycache__/config.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/lib/__pycache__/config.cpython-36.pyc -------------------------------------------------------------------------------- /lib/__pycache__/config.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/lib/__pycache__/config.cpython-37.pyc -------------------------------------------------------------------------------- /lib/__pycache__/config.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/lib/__pycache__/config.cpython-38.pyc -------------------------------------------------------------------------------- /lib/__pycache__/data.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/lib/__pycache__/data.cpython-36.pyc -------------------------------------------------------------------------------- /lib/__pycache__/data.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/lib/__pycache__/data.cpython-37.pyc -------------------------------------------------------------------------------- /lib/__pycache__/logger.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/lib/__pycache__/logger.cpython-36.pyc -------------------------------------------------------------------------------- /lib/__pycache__/logger.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/lib/__pycache__/logger.cpython-37.pyc -------------------------------------------------------------------------------- /lib/__pycache__/loss.cpython-36.pyc: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/lib/__pycache__/loss.cpython-36.pyc -------------------------------------------------------------------------------- /lib/__pycache__/loss.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/lib/__pycache__/loss.cpython-37.pyc -------------------------------------------------------------------------------- /lib/__pycache__/metrics.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/lib/__pycache__/metrics.cpython-36.pyc -------------------------------------------------------------------------------- /lib/__pycache__/metrics.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/lib/__pycache__/metrics.cpython-37.pyc -------------------------------------------------------------------------------- /lib/__pycache__/trainer.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/lib/__pycache__/trainer.cpython-36.pyc -------------------------------------------------------------------------------- /lib/__pycache__/trainer.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/lib/__pycache__/trainer.cpython-37.pyc -------------------------------------------------------------------------------- /lib/__pycache__/utils.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/lib/__pycache__/utils.cpython-36.pyc -------------------------------------------------------------------------------- /lib/__pycache__/utils.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/lib/__pycache__/utils.cpython-37.pyc -------------------------------------------------------------------------------- /lib/__pycache__/utils.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/lib/__pycache__/utils.cpython-38.pyc -------------------------------------------------------------------------------- /lib/config.py: -------------------------------------------------------------------------------- 1 | from lib.model.rigid_3d_sf import MinkowskiFlow 2 | from lib.trainer import MEFlowTrainer 3 | import torch 4 | import yaml 5 | import torch.optim as optim 6 | 7 | model_dict = { 8 | 'ME': MinkowskiFlow, 9 | } 10 | 11 | trainer_dict = { 12 | 'ME': MEFlowTrainer, 13 | } 14 | 15 | 16 | def get_model(cfg): 17 | ''' 18 | Gets the model instance based on the input paramters. 
19 | Args: 20 | cfg (dict): config dictionary 21 | 22 | Returns: 23 | model (nn.Module): torch model initialized with the input params 24 | ''' 25 | 26 | method = cfg['method']['backbone'] 27 | 28 | model = model_dict[method](cfg) 29 | 30 | return model 31 | 32 | 33 | def get_trainer(cfg, model, device): 34 | ''' 35 | Returns a trainer instance. 36 | Args: 37 | cfg (dict): config dictionary 38 | model (nn.Module): the model used for training 39 | device: torch device 40 | 41 | Returns: 42 | trainer (trainer instance): trainer instance used to train the network 43 | ''' 44 | 45 | method = cfg['method']['backbone'] 46 | trainer = trainer_dict[method](cfg, model, device) 47 | 48 | 49 | return trainer 50 | 51 | 52 | def get_optimizer(cfg, model): 53 | ''' 54 | Returns an optimizer instance. 55 | Args: 56 | cfg (dict): config dictionary 57 | model (nn.Module): the model used for training 58 | 59 | Returns: 60 | optimizer (optimizer instance): optimizer used to train the network 61 | ''' 62 | 63 | method = cfg['optimizer']['alg'] 64 | 65 | if method == "SGD": 66 | optimizer = getattr(optim, method)(model.parameters(), lr=cfg['optimizer']['learning_rate'], 67 | momentum=cfg['optimizer']['momentum'], 68 | weight_decay=cfg['optimizer']['weight_decay']) 69 | 70 | elif method == "Adam": 71 | optimizer = getattr(optim, method)(model.parameters(), lr=cfg['optimizer']['learning_rate'], 72 | weight_decay=cfg['optimizer']['weight_decay']) 73 | else: 74 | print("{} optimizer is not implemented, must be one of the [SGD, Adam]".format(method)) 75 | 76 | return optimizer 77 | 78 | 79 | def get_scheduler(cfg, optimizer): 80 | ''' 81 | Returns a learning rate scheduler 82 | Args: 83 | cfg (dict): config dictionary 84 | optimizer (torch.optim): optimizer used for training the network 85 | 86 | Returns: 87 | scheduler (optimizer instance): learning rate scheduler 88 | ''' 89 | 90 | method = cfg['optimizer']['scheduler'] 91 | 92 | if method == "ExponentialLR": 93 | scheduler = getattr(optim.lr_scheduler, method)(optimizer, gamma=cfg['optimizer']['exp_gamma']) 94 | else: 95 | print("{} scheduler is not implemented, must be one of the [ExponentialLR]".format(method)) 96 | 97 | return scheduler 98 | 99 | 100 | 101 | # General config 102 | def get_config(path, default_path='./configs/default.yaml'): 103 | ''' 104 | Loads config file. 105 | 106 | Args: 107 | path (str): path to config file 108 | default_path (bool): whether to use default path 109 | ''' 110 | # Load configuration from file itself 111 | with open(path, 'r') as f: 112 | cfg_special = yaml.safe_load(f) 113 | 114 | # Check if we should inherit from a config 115 | inherit_from = cfg_special.get('inherit_from') 116 | 117 | # If yes, load this config first as default 118 | # If no, use the default_path 119 | if inherit_from is not None: 120 | cfg = load_config(inherit_from, default_path) 121 | elif default_path is not None: 122 | with open(default_path, 'r') as f: 123 | cfg = yaml.safe_load(f) 124 | else: 125 | cfg = dict() 126 | 127 | # Include main configuration 128 | update_recursive(cfg, cfg_special) 129 | 130 | return cfg 131 | 132 | 133 | def update_recursive(dict1, dict2): 134 | ''' 135 | Update two config dictionaries recursively. 
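A small worked example (made-up values) of the recursive merge performed by `update_recursive`: the experiment config only overrides the keys it explicitly sets, while every other default entry is preserved.

```python
# Hypothetical configs, merged with update_recursive as defined in this file.
default_cfg = {'optimizer': {'alg': 'Adam', 'learning_rate': 1e-3},
               'misc': {'use_gpu': True}}
experiment_cfg = {'optimizer': {'learning_rate': 5e-4}}

update_recursive(default_cfg, experiment_cfg)
# default_cfg is now:
# {'optimizer': {'alg': 'Adam', 'learning_rate': 0.0005}, 'misc': {'use_gpu': True}}
```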
136 | 137 | Args: 138 | dict1 (dict): first dictionary to be updated 139 | dict2 (dict): second dictionary which entries should be used 140 | ''' 141 | for k, v in dict2.items(): 142 | if k not in dict1: 143 | dict1[k] = dict() 144 | if isinstance(v, dict): 145 | update_recursive(dict1[k], v) 146 | else: 147 | dict1[k] = v -------------------------------------------------------------------------------- /lib/data.py: -------------------------------------------------------------------------------- 1 | import os 2 | import torch 3 | import logging 4 | 5 | import numpy as np 6 | import torch.utils.data as data 7 | import MinkowskiEngine as ME 8 | 9 | 10 | def to_tensor(x): 11 | if isinstance(x, torch.Tensor): 12 | return x 13 | elif isinstance(x, np.ndarray): 14 | return torch.from_numpy(x) 15 | else: 16 | raise ValueError("Can not convert to torch tensor {}".format(x)) 17 | 18 | def collate_fn(list_data): 19 | pc_1,pc_2, coords1, coords2, feats1, feats2, fg_labels_1, \ 20 | fg_labels_2, flow, R_ego, t_ego, pc_eval_1, pc_eval_2, flow_eval, fg_labels_eval_1, fg_labels_eval_2 = list(zip(*list_data)) 21 | 22 | pc_batch1, pc_batch2 = [], [] 23 | pc_eval_batch1, pc_eval_batch2 = [], [] 24 | fg_labels_batch1, fg_labels_batch2 = [], [] 25 | fg_labels_eval_batch1, fg_labels_eval_batch2 = [], [] 26 | R_ego_batch, t_ego_batch = [],[] 27 | flow_batch, flow_eval_batch, len_batch = [], [], [] 28 | batch_id = 0 29 | 30 | for batch_id, _ in enumerate(coords1): 31 | N1 = coords1[batch_id].shape[0] 32 | N2 = coords2[batch_id].shape[0] 33 | len_batch.append([N1, N2]) 34 | 35 | pc_batch1.append(to_tensor(pc_1[batch_id]).float()) 36 | pc_batch2.append(to_tensor(pc_2[batch_id]).float()) 37 | 38 | pc_eval_batch1.append(to_tensor(pc_eval_1[batch_id]).float()) 39 | pc_eval_batch2.append(to_tensor(pc_eval_2[batch_id]).float()) 40 | 41 | fg_labels_batch1.append(to_tensor(fg_labels_1[batch_id])) 42 | fg_labels_batch2.append(to_tensor(fg_labels_2[batch_id])) 43 | 44 | fg_labels_eval_batch1.append(to_tensor(fg_labels_eval_1[batch_id])) 45 | fg_labels_eval_batch2.append(to_tensor(fg_labels_eval_2[batch_id])) 46 | 47 | R_ego_batch.append(to_tensor(R_ego[batch_id]).unsqueeze(0)) 48 | t_ego_batch.append(to_tensor(t_ego[batch_id]).unsqueeze(0)) 49 | 50 | flow_batch.append(to_tensor(flow[batch_id])) 51 | flow_eval_batch.append(to_tensor(flow_eval[batch_id])) 52 | 53 | coords_batch1, feats_batch1 = ME.utils.sparse_collate(coords=coords1, feats=feats1) 54 | coords_batch2, feats_batch2 = ME.utils.sparse_collate(coords=coords2, feats=feats2) 55 | 56 | 57 | # Concatenate all lists 58 | fg_labels_batch1 = torch.cat(fg_labels_batch1, 0).long() 59 | fg_labels_batch2 = torch.cat(fg_labels_batch2, 0).long() 60 | flow_batch = torch.cat(flow_batch, 0).float() 61 | flow_eval_batch = torch.cat(flow_eval_batch, 0).float() 62 | R_ego_batch = torch.cat(R_ego_batch, 0).float() 63 | t_ego_batch = torch.cat(t_ego_batch, 0).float() 64 | fg_labels_eval_batch1 = torch.cat(fg_labels_eval_batch1, 0).long() 65 | fg_labels_eval_batch2 = torch.cat(fg_labels_eval_batch2, 0).long() 66 | 67 | return { 68 | 'pcd_s': pc_batch1, 69 | 'pcd_t': pc_batch2, 70 | 'sinput_s_C': coords_batch1, 71 | 'sinput_s_F': feats_batch1.float(), 72 | 'sinput_t_C': coords_batch2, 73 | 'sinput_t_F': feats_batch2.float(), 74 | 'fg_labels_s': fg_labels_batch1, 75 | 'fg_labels_t': fg_labels_batch2, 76 | 'flow': flow_batch, 77 | 'R_ego': R_ego_batch, 78 | 't_ego': t_ego_batch, 79 | 'pcd_eval_s': pc_eval_batch1, 80 | 'pcd_eval_t': pc_eval_batch2, 81 | 'flow_eval': flow_eval_batch, 82 | 
'fg_labels_eval_s': fg_labels_eval_batch1, 83 | 'fg_labels_eval_t': fg_labels_eval_batch2, 84 | 'len_batch': len_batch 85 | } 86 | 87 | 88 | class MELidarDataset(data.Dataset): 89 | def __init__(self, phase, config): 90 | 91 | self.files = [] 92 | self.root = config['data']['root'] 93 | self.config = config 94 | self.input_features = config['data']['input_features'] 95 | self.num_points = config['misc']['num_points'] 96 | self.voxel_size = config['misc']['voxel_size'] 97 | self.remove_ground = True if (config['data']['remove_ground'] and config['data']['dataset'] in ['StereoKITTI_ME','LidarKITTI_ME','SemanticKITTI_ME','WaymoOpen_ME']) else False 98 | self.dataset = config['data']['dataset'] 99 | self.only_near_points = config['data']['only_near_points'] 100 | self.phase = phase 101 | 102 | self.randng = np.random.RandomState() 103 | self.device = torch.device('cuda' if (torch.cuda.is_available() and config['misc']['use_gpu']) else 'cpu') 104 | 105 | self.augment_data = config['data']['augment_data'] 106 | 107 | logging.info("Loading the subset {} from {}".format(phase,self.root)) 108 | 109 | subset_names = open(self.DATA_FILES[phase]).read().split() 110 | 111 | for name in subset_names: 112 | self.files.append(name) 113 | 114 | def __getitem__(self, idx): 115 | file = os.path.join(self.root,self.files[idx]) 116 | file_name = file.replace(os.sep,'/').split('/')[-1] 117 | 118 | # Load the data 119 | data = np.load(file) 120 | pc_1 = data['pc1'] 121 | pc_2 = data['pc2'] 122 | 123 | if 'pose_s' in data: 124 | pose_1 = data['pose_s'] 125 | else: 126 | pose_1 = np.eye(4) 127 | 128 | if 'pose_t' in data: 129 | pose_2 = data['pose_t'] 130 | else: 131 | pose_2 = np.eye(4) 132 | 133 | if 'sem_label_s' in data: 134 | labels_1 = data['sem_label_s'] 135 | else: 136 | labels_1 = np.zeros(pc_1.shape[0]) 137 | 138 | 139 | if 'sem_label_t' in data: 140 | labels_2 = data['sem_label_t'] 141 | else: 142 | labels_2 = np.zeros(pc_2.shape[0]) 143 | 144 | if 'flow' in data: 145 | flow = data['flow'] 146 | else: 147 | flow = np.zeros_like(pc_1) 148 | 149 | # Remove the ground and far away points 150 | # In stereoKITTI the direct correspondences are provided therefore we remove, 151 | # if either of the points fullfills the condition (as in hplflownet, flot, ...) 
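The `.npz` samples loaded above follow the layout written by the preprocessing scripts (e.g. `process_semantKitti.py`). A quick way to inspect one pair, with a hypothetical path, and to recover the relative ego-motion the same way `__getitem__` does further down:

```python
import numpy as np

# Hypothetical sample path; the key names match what the preprocessing script saves.
sample = np.load('data/semantic_kitti/08/000000_000001.npz')
print(sample.files)   # e.g. ['pc1', 'pc2', 'sem_label_s', 'sem_label_t', ..., 'pose_s', 'pose_t']

pc_1, pc_2 = sample['pc1'], sample['pc2']                        # [N1, 3] and [N2, 3] points
rel_trans = np.linalg.inv(sample['pose_t']) @ sample['pose_s']   # 4x4 ego-motion, source -> target
R_ego, t_ego = rel_trans[0:3, 0:3], rel_trans[0:3, 3:4]
```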
152 | 153 | if self.dataset in ["SemanticKITTI_ME", 'LidarKITTI_ME', "WaymoOpen_ME"]: 154 | if self.remove_ground: 155 | if self.phase == 'test': 156 | is_not_ground_s = (pc_1[:, 1] > -1.4) 157 | is_not_ground_t = (pc_2[:, 1] > -1.4) 158 | 159 | pc_1 = pc_1[is_not_ground_s,:] 160 | labels_1 = labels_1[is_not_ground_s] 161 | flow = flow[is_not_ground_s,:] 162 | 163 | pc_2 = pc_2[is_not_ground_t,:] 164 | labels_2 = labels_2[is_not_ground_t] 165 | 166 | # In the training phase we randomly select if the ground should be removed or not 167 | elif np.random.rand() > 1/4: 168 | is_not_ground_s = (pc_1[:, 1] > -1.4) 169 | is_not_ground_t = (pc_2[:, 1] > -1.4) 170 | 171 | pc_1 = pc_1[is_not_ground_s,:] 172 | labels_1 = labels_1[is_not_ground_s] 173 | flow = flow[is_not_ground_s,:] 174 | 175 | pc_2 = pc_2[is_not_ground_t,:] 176 | labels_2 = labels_2[is_not_ground_t] 177 | 178 | if self.only_near_points: 179 | is_near_s = (pc_1[:, 2] < 35) 180 | is_near_t = (pc_2[:, 2] < 35) 181 | 182 | pc_1 = pc_1[is_near_s,:] 183 | labels_1 = labels_1[is_near_s] 184 | flow = flow[is_near_s,:] 185 | 186 | pc_2 = pc_2[is_near_t,:] 187 | labels_2 = labels_2[is_near_t] 188 | 189 | else: 190 | if self.remove_ground: 191 | is_not_ground = np.logical_not(np.logical_and(pc_1[:, 1] < -1.4, pc_2[:, 1] < -1.4)) 192 | pc_1 = pc_1[is_not_ground,:] 193 | pc_2 = pc_2[is_not_ground,:] 194 | flow = flow[is_not_ground,:] 195 | 196 | if self.only_near_points: 197 | is_near = np.logical_and(pc_1[:, 2] < 35, pc_1[:, 2] < 35) 198 | pc_1 = pc_1[is_near,:] 199 | pc_2 = pc_2[is_near,:] 200 | flow = flow[is_near,:] 201 | 202 | # Augment the point cloud by randomly rotating and translating them (recompute the ego-motion if augmention is applied!) 203 | if self.augment_data and self.phase != 'test': 204 | T_1 = np.eye(4) 205 | T_2 = np.eye(4) 206 | 207 | T_1[0:3,3] = (np.random.rand(3) - 0.5) * 0.5 208 | T_2[0:3,3] = (np.random.rand(3) - 0.5) * 0.5 209 | 210 | T_1[1,3] = (np.random.rand(1) - 0.5) * 0.1 211 | T_2[1,3] = (np.random.rand(1) - 0.5) * 0.1 212 | 213 | pc_1 = (np.matmul(T_1[0:3, 0:3], pc_1.transpose()) + T_1[0:3,3:4]).transpose() 214 | pc_2 = (np.matmul(T_2[0:3, 0:3], pc_2.transpose()) + T_2[0:3,3:4]).transpose() 215 | 216 | pose_1 = np.matmul(pose_1, np.linalg.inv(T_1)) 217 | pose_2 = np.matmul(pose_2, np.linalg.inv(T_2)) 218 | 219 | rel_trans = np.linalg.inv(pose_2) @ pose_1 220 | 221 | R_ego = rel_trans[0:3,0:3] 222 | t_ego = rel_trans[0:3,3:4] 223 | else: 224 | # Compute relative pose that transform the point from the source point cloud to the target 225 | rel_trans = np.linalg.inv(pose_2) @ pose_1 226 | R_ego = rel_trans[0:3,0:3] 227 | t_ego = rel_trans[0:3,3:4] 228 | 229 | 230 | # Sample n points for evaluation before the voxelization 231 | # If less than desired points are available just consider the maximum 232 | if pc_1.shape[0] > self.num_points: 233 | idx_1 = np.random.choice(pc_1.shape[0], self.num_points, replace=False) 234 | else: 235 | idx_1 = np.random.choice(pc_1.shape[0], pc_1.shape[0], replace=False) 236 | 237 | if pc_2.shape[0] > self.num_points: 238 | idx_2 = np.random.choice(pc_2.shape[0], self.num_points, replace=False) 239 | else: 240 | idx_2 = np.random.choice(pc_2.shape[0], pc_2.shape[0], replace=False) 241 | 242 | pc_1_eval = pc_1[idx_1,:] 243 | flow_eval = flow[idx_1,:] 244 | labels_1_eval = labels_1[idx_1] 245 | 246 | pc_2_eval = pc_2[idx_2,:] 247 | labels_2_eval = labels_2[idx_2] 248 | 249 | # Voxelization 250 | _, sel1 = ME.utils.sparse_quantize(np.ascontiguousarray(pc_1) / self.voxel_size, 
return_index=True) 251 | _, sel2 = ME.utils.sparse_quantize(np.ascontiguousarray(pc_2) / self.voxel_size, return_index=True) 252 | 253 | 254 | # Slect the voxelized points 255 | pc_1 = pc_1[sel1,:] 256 | labels_1 = labels_1[sel1] 257 | flow = flow[sel1,:] 258 | 259 | pc_2 = pc_2[sel2,:] 260 | labels_2 = labels_2[sel2] 261 | 262 | # If more voxels then the selected number of points are remaining randomly sample them 263 | if pc_1.shape[0] > self.num_points: 264 | idx_1 = np.random.choice(pc_1.shape[0], self.num_points, replace=False) 265 | else: 266 | idx_1 = np.random.choice(pc_1.shape[0], pc_1.shape[0], replace=False) 267 | 268 | if pc_2.shape[0] > self.num_points: 269 | idx_2 = np.random.choice(pc_2.shape[0], self.num_points, replace=False) 270 | else: 271 | idx_2 = np.random.choice(pc_2.shape[0], pc_2.shape[0], replace=False) 272 | 273 | pc_1 = pc_1[idx_1,:] 274 | labels_1 = labels_1[idx_1] 275 | flow = flow[idx_1,:] 276 | 277 | pc_2 = pc_2[idx_2,:] 278 | labels_2 = labels_2[idx_2] 279 | 280 | 281 | # Get sparse indices 282 | coords1 = np.floor(pc_1 / self.voxel_size) 283 | coords2 = np.floor(pc_2 / self.voxel_size) 284 | 285 | 286 | feats_train1, feats_train2 = [], [] 287 | 288 | if self.input_features == 'occupancy': 289 | feats_train1.append(np.ones((pc_1.shape[0], 1))) 290 | feats_train2.append(np.ones((pc_2.shape[0], 1))) 291 | 292 | elif self.input_features == 'absolute_coords': 293 | feats_train1.append(pc_1) 294 | feats_train2.append(pc_2) 295 | 296 | elif self.input_features == 'relative_coords': 297 | feats_train1.append(pc_1 - (coords1 * self.voxel_size)) 298 | feats_train2.append(pc_2 - (coords2 * self.voxel_size)) 299 | 300 | else: 301 | raise ValueError('{} not recognized as a valid input feature!'.format(self.input_features)) 302 | 303 | feats1 = np.hstack(feats_train1) 304 | feats2 = np.hstack(feats_train2) 305 | 306 | # Foreground points (class label bellow 40 or above 99 -> binary label 1) 307 | fg_labels_1 = np.zeros((labels_1.shape[0])) 308 | fg_labels_1[((labels_1 < 40) | (labels_1 > 99))] = 1 309 | fg_labels_1[labels_1 == 0] = -1 310 | 311 | fg_labels_2 = np.zeros((labels_2.shape[0])) 312 | fg_labels_2[((labels_2 < 40) | (labels_2 > 99))] = 1 313 | fg_labels_2[labels_2 == 0] = -1 314 | 315 | fg_labels_1_eval = np.zeros((labels_1_eval.shape[0])) 316 | fg_labels_1_eval[((labels_1_eval < 40) | (labels_1_eval > 99))] = 1 317 | fg_labels_1_eval[labels_1_eval == 0] = -1 318 | 319 | fg_labels_2_eval = np.zeros((labels_2_eval.shape[0])) 320 | fg_labels_2_eval[((labels_2_eval < 40) | (labels_2_eval > 99))] = 1 321 | fg_labels_2_eval[labels_2_eval == 0] = -1 322 | 323 | return (pc_1, pc_2, coords1, coords2, feats1, feats2, fg_labels_1, fg_labels_2, flow, 324 | R_ego, t_ego, pc_1_eval, pc_2_eval, flow_eval, fg_labels_1_eval, fg_labels_2_eval) 325 | 326 | def __len__(self): 327 | return len(self.files) 328 | 329 | def reset_seed(self,seed=41): 330 | logging.info('Resetting the data loader seed to {}'.format(seed)) 331 | self.randng.seed(seed) 332 | 333 | 334 | class FlyingThings3D_ME(MELidarDataset): 335 | # 3D Match dataset all files 336 | DATA_FILES = { 337 | 'train': './configs/datasets/flying_things_3d/train.txt', 338 | 'val': './configs/datasets/flying_things_3d/val.txt', 339 | 'test': './configs/datasets/flying_things_3d/test.txt' 340 | } 341 | 342 | class StereoKITTI_ME(MELidarDataset): 343 | # 3D Match dataset all files 344 | DATA_FILES = { 345 | 'train': './configs/datasets/stereo_kitti/test.txt', 346 | 'val': './configs/datasets/stereo_kitti/test.txt', 347 | 'test': 
'./configs/datasets/stereo_kitti/test.txt' 348 | } 349 | 350 | class SemanticKITTI_ME(MELidarDataset): 351 | # 3D Match dataset all files 352 | DATA_FILES = { 353 | 'train': './configs/datasets/semantic_kitti/train.txt', 354 | 'val': './configs/datasets/semantic_kitti/val.txt', 355 | 'test': './configs/datasets/semantic_kitti/val.txt' 356 | } 357 | 358 | class LidarKITTI_ME(MELidarDataset): 359 | # 3D Match dataset all files 360 | DATA_FILES = { 361 | 'train': './configs/datasets/lidar_kitti/test.txt', 362 | 'val': './configs/datasets/lidar_kitti/test.txt', 363 | 'test': './configs/datasets/lidar_kitti/test.txt' 364 | } 365 | 366 | 367 | class WaymoOpen_ME(MELidarDataset): 368 | # 3D Match dataset all files 369 | DATA_FILES = { 370 | 'train': './configs/datasets/waymo_open/train.txt', 371 | 'val': './configs/datasets/waymo_open/val.txt', 372 | 'test': './configs/datasets/waymo_open/test.txt' 373 | } 374 | 375 | 376 | # Map the datasets to string names 377 | ALL_DATASETS = [FlyingThings3D_ME, StereoKITTI_ME, SemanticKITTI_ME, LidarKITTI_ME, WaymoOpen_ME] 378 | 379 | dataset_str_mapping = {d.__name__: d for d in ALL_DATASETS} 380 | 381 | 382 | def make_data_loader(config, phase, neighborhood_limits=None, shuffle_dataset=None): 383 | """ 384 | Defines the data loader based on the parameters specified in the config file 385 | Args: 386 | config (dict): dictionary of the arguments 387 | phase (str): phase for which the data loader should be initialized in [train,val,test] 388 | shuffle_dataset (bool): shuffle the dataset or not 389 | Returns: 390 | loader (torch data loader): data loader that handles loading the data to the model 391 | """ 392 | 393 | assert config['misc']['run_mode'] in ['train','val','test'] 394 | 395 | if shuffle_dataset is None: 396 | shuffle_dataset = config['misc']['run_mode'] != 'test' 397 | 398 | # Select the defined dataset 399 | Dataset = dataset_str_mapping[config['data']['dataset']] 400 | 401 | dset = Dataset(phase, config=config) 402 | 403 | drop_last = False if config['misc']['run_mode'] == 'test' else True 404 | 405 | loader = torch.utils.data.DataLoader( 406 | dset, 407 | batch_size=config[phase]['batch_size'], 408 | shuffle=shuffle_dataset, 409 | num_workers=config[phase]['num_workers'], 410 | collate_fn=collate_fn, 411 | pin_memory=False, 412 | drop_last=drop_last 413 | ) 414 | 415 | return loader -------------------------------------------------------------------------------- /lib/logger.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import numpy as np 4 | import logging 5 | import coloredlogs 6 | 7 | 8 | _logger = logging.getLogger() 9 | 10 | 11 | def print_info(config, log_dir=None): 12 | """ Logs source code configuration 13 | Code adapted from RPMNet repository: https://github.com/yewzijian/RPMNet/ 14 | """ 15 | _logger.info('Command: {}'.format(' '.join(sys.argv))) 16 | 17 | # Arguments 18 | arg_str = [] 19 | 20 | for k_id, k_val in config.items(): 21 | if isinstance(k_val, dict): 22 | for key in k_val: 23 | arg_str.append("{}_{}: {}".format(k_id, key, k_val[key])) 24 | else: 25 | arg_str.append("{}: {}".format(k_id, k_val)) 26 | 27 | arg_str = ', '.join(arg_str) 28 | _logger.info('Arguments: {}'.format(arg_str)) 29 | 30 | 31 | def prepare_logger(config, log_path = None): 32 | """Creates logging directory, and installs colorlogs 33 | Args: 34 | opt: Program arguments, should include --dev and --logdir flag. 35 | See get_parent_parser() 36 | log_path: Logging path (optional). 
This serves to overwrite the settings in 37 | argparse namespace 38 | Returns: 39 | logger (logging.Logger) 40 | log_path (str): Logging directory 41 | Code borrowed from RPMNet repository: https://github.com/yewzijian/RPMNet/ 42 | """ 43 | 44 | os.makedirs(log_path, exist_ok=True) 45 | 46 | logger = logging.getLogger() 47 | coloredlogs.install(level='INFO', logger=logger) 48 | file_handler = logging.FileHandler('{}/console_output.txt'.format(log_path)) 49 | log_formatter = logging.Formatter('%(asctime)s [%(levelname)s] %(name)s - %(message)s') 50 | file_handler.setFormatter(log_formatter) 51 | logger.addHandler(file_handler) 52 | print_info(config, log_path) 53 | logger.info('Output and logs will be saved to {}'.format(log_path)) 54 | 55 | return logger, log_path 56 | -------------------------------------------------------------------------------- /lib/loss.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | from lib.utils import transform_point_cloud, kabsch_transformation_estimation 5 | from utils.chamfer_distance import ChamferDistance 6 | 7 | 8 | class TrainLoss(nn.Module): 9 | """ 10 | Training loss consists of a ego-motion loss, background segmentation loss, and a foreground loss. 11 | The l1 flow loss is used for the full supervised experiments only. 12 | 13 | Args: 14 | args: parameters controling the initialization of the loss functions 15 | 16 | """ 17 | 18 | def __init__(self, args): 19 | nn.Module.__init__(self) 20 | 21 | 22 | self.args = args 23 | self.device = torch.device('cuda' if (torch.cuda.is_available() and args['misc']['use_gpu']) else 'cpu') 24 | 25 | # Flow loss 26 | self.flow_criterion = nn.L1Loss(reduction='mean') 27 | 28 | # Ego motion loss 29 | self.ego_l1_criterion = nn.L1Loss(reduction='mean') 30 | self.ego_outlier_criterion = OutlierLoss() 31 | 32 | # Background segmentation loss 33 | if args['loss']['background_loss'] == 'weighted': 34 | 35 | # Based on the dataset analysis there are 14 times more background labels 36 | seg_weight = torch.tensor([1.0, 20.0]).to(self.device) 37 | self.seg_criterion = torch.nn.CrossEntropyLoss(weight=seg_weight, ignore_index=-1) 38 | 39 | else: 40 | self.seg_criterion = torch.nn.CrossEntropyLoss(ignore_index=-1) 41 | 42 | # Foreground loss 43 | self.chamfer_criterion = ChamferDistance() 44 | self.rigidity_criterion = nn.L1Loss(reduction='mean') 45 | 46 | def __call__(self, inferred_values, gt_data): 47 | 48 | # Initialize the dictionary 49 | losses = {} 50 | 51 | if self.args['method']['flow'] and self.args['loss']['flow_loss']: 52 | assert (('coarse_flow' in inferred_values) & ('flow' in gt_data)), 'Flow loss selected \ 53 | but either est or gt flow not provided' 54 | 55 | losses['refined_flow_loss'] = self.flow_criterion(inferred_values['refined_flow'], 56 | gt_data['flow']) * self.args['loss'].get('flow_loss_w', 1.0) 57 | 58 | losses['coarse_flow_loss'] = self.flow_criterion(inferred_values['coarse_flow'], 59 | gt_data['flow']) * self.args['loss'].get('flow_loss_w', 1.0) 60 | 61 | 62 | if self.args['method']['ego_motion'] and self.args['loss']['ego_loss']: 63 | assert (('R_est' in inferred_values) & ('R_s_t' in gt_data) is not None), "Ego motion loss selected \ 64 | but either est or gt ego motion not provided" 65 | 66 | assert 'permutation' in inferred_values is not None, 'Outlier loss selected \ 67 | but the permutation matrix is not provided' 68 | 69 | # Only evaluate on the background points 70 | mask = (gt_data['fg_labels_s'] == 0) 71 
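For reference, a toy illustration (made-up numbers) of how the weighted background/foreground criterion constructed in `__init__` above treats the labels, including the `-1` ignore label assigned to points without ground-truth annotations:

```python
import torch

# Three points with class scores for [background, foreground]; values are made up.
criterion = torch.nn.CrossEntropyLoss(weight=torch.tensor([1.0, 20.0]), ignore_index=-1)
logits = torch.tensor([[2.0, -1.0],   # confident background prediction
                       [0.5,  0.3],   # weak prediction
                       [0.0,  0.0]])  # this point carries no label
labels = torch.tensor([0, 1, -1])     # background, foreground, ignored
loss = criterion(logits, labels)      # only the first two points contribute
```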
| 72 | prev_idx = 0 73 | pc_t_gt, pc_t_est = [], [] 74 | 75 | # Iterate over the samples in the batch 76 | for batch_idx in range(gt_data['R_ego'].shape[0]): 77 | 78 | # Convert the voxel indices back to the coordinates 79 | p_s_temp = gt_data['sinput_s_C'][prev_idx: prev_idx + gt_data['len_batch'][batch_idx][0],:].to(self.device) * self.args['misc']['voxel_size'] 80 | mask_temp = mask[prev_idx: prev_idx + gt_data['len_batch'][batch_idx][0]] 81 | 82 | # Transform the point cloud with gt and estimated ego-motion parameters 83 | pc_t_gt_temp = transform_point_cloud(p_s_temp[mask_temp,1:4], gt_data['R_ego'][batch_idx,:,:], gt_data['t_ego'][batch_idx,:,:]) 84 | pc_t_est_temp = transform_point_cloud(p_s_temp[mask_temp,1:4], inferred_values['R_est'][batch_idx,:,:], inferred_values['t_est'][batch_idx,:,:]) 85 | 86 | pc_t_gt.append(pc_t_gt_temp.squeeze(0)) 87 | pc_t_est.append(pc_t_est_temp.squeeze(0)) 88 | 89 | prev_idx += gt_data['len_batch'][batch_idx][0] 90 | 91 | pc_t_est = torch.cat(pc_t_est, 0) 92 | pc_t_gt = torch.cat(pc_t_gt, 0) 93 | 94 | losses['ego_loss'] = self.ego_l1_criterion(pc_t_est, pc_t_gt) * self.args['loss'].get('ego_loss_w', 1.0) 95 | losses['outlier_loss'] = self.ego_outlier_criterion(inferred_values['permutation']) * self.args['loss'].get('inlier_loss_w', 1.0) 96 | 97 | # Background segmentation loss 98 | if self.args['method']['semantic'] and self.args['loss']['background_loss']: 99 | assert (('semantic_logits_s' in inferred_values) & ('fg_labels_s' in gt_data)), "Background loss selected but either est or gt labels not provided" 100 | 101 | semantic_loss = torch.tensor(0.0).to(self.device) 102 | 103 | semantic_loss += self.seg_criterion(inferred_values['semantic_logits_s'].F, gt_data['fg_labels_s']) * self.args['loss'].get('bg_loss_w', 1.0) 104 | 105 | # If the background labels for the target point cloud are available also use them for the loss computation 106 | if 'semantic_logits_t' in inferred_values: 107 | semantic_loss += self.seg_criterion(inferred_values['semantic_logits_t'].F, gt_data['fg_labels_t']) * self.args['loss'].get('bg_loss_w', 1.0) 108 | semantic_loss = semantic_loss/2 109 | 110 | losses['semantic_loss'] = semantic_loss 111 | 112 | # Foreground loss 113 | if self.args['method']['clustering'] and self.args['loss']['foreground_loss']: 114 | assert ('clusters_s' in inferred_values), "Foreground loss selected but inferred cluster labels not provided" 115 | 116 | rigidity_loss = torch.tensor(0.0).to(self.device) 117 | 118 | xyz_s = torch.cat(gt_data['pcd_s'], 0).to(self.device) 119 | xyz_t = torch.cat(gt_data['pcd_t'], 0).to(self.device) 120 | 121 | # # Two-way chamfer distance for the foreground points (only compute if both point clouds have more than 50 foreground points) 122 | # if torch.where(gt_data['fg_labels_s'] == 1)[0].shape[0] > 50 and torch.where(gt_data['fg_labels_t'] == 1)[0].shape[0] > 50: 123 | 124 | foreground_mask_s = (gt_data['fg_labels_s'] == 1) 125 | foreground_mask_t = (gt_data['fg_labels_t'] == 1) 126 | 127 | prev_idx_s = 0 128 | prev_idx_t = 0 129 | chamfer_loss = [] 130 | # Iterate over the samples in the batch 131 | for batch_idx in range(gt_data['R_ego'].shape[0]): 132 | 133 | temp_foreground_mask_s = foreground_mask_s[prev_idx_s : prev_idx_s + gt_data['len_batch'][batch_idx][0]] 134 | temp_foreground_mask_t = foreground_mask_t[prev_idx_t : prev_idx_t + gt_data['len_batch'][batch_idx][1]] 135 | 136 | if torch.sum(temp_foreground_mask_s) > 50 and torch.sum(temp_foreground_mask_t) > 50: 137 | foreground_xyz_s_temp = xyz_s[prev_idx_s: 
prev_idx_s + gt_data['len_batch'][batch_idx][0],:] 138 | foreground_xyz_t_temp = xyz_t[prev_idx_t: prev_idx_t + gt_data['len_batch'][batch_idx][1],:] 139 | foreground_flow = inferred_values['refined_rigid_flow'][prev_idx_s: prev_idx_s + gt_data['len_batch'][batch_idx][0],:] 140 | 141 | foreground_xyz_s = foreground_xyz_s_temp[temp_foreground_mask_s,:] 142 | foreground_flow = foreground_flow[temp_foreground_mask_s,:] 143 | foreground_xyz_t = foreground_xyz_t_temp[temp_foreground_mask_t,:] 144 | 145 | dist1, dist2 = self.chamfer_criterion(foreground_xyz_t.unsqueeze(0), (foreground_xyz_s + foreground_flow).unsqueeze(0)) 146 | 147 | # Clamp the distance to prevent outliers (objects that appear and disappear from the scene) 148 | dist1 = torch.clamp(torch.sqrt(dist1), max=1.0) 149 | dist2 = torch.clamp(torch.sqrt(dist2), max=1.0) 150 | 151 | chamfer_loss.append((torch.mean(dist1) + torch.mean(dist2)) / 2.0) 152 | 153 | prev_idx_s += gt_data['len_batch'][batch_idx][0] 154 | prev_idx_t += gt_data['len_batch'][batch_idx][1] 155 | 156 | # Handle the case where there are no foreground points 157 | if len(chamfer_loss) == 0: chamfer_loss.append(torch.tensor(0.0).to(self.device)) 158 | 159 | losses['chamfer_loss'] = torch.mean(torch.stack(chamfer_loss)) * self.args['loss'].get('cd_loss_w', 1.0) 160 | 161 | # Rigidity loss (flow vectors of each cluster should be congruent) 162 | n_clusters = 0 163 | # Iterate over the clusters and enforce rigidity within each cluster 164 | for batch_idx in inferred_values['clusters_s']: 165 | 166 | for cluster in inferred_values['clusters_s'][batch_idx]: 167 | cluster_xyz_s = xyz_s[cluster,:].unsqueeze(0) 168 | cluster_flow = inferred_values['refined_rigid_flow'][cluster,:].unsqueeze(0) 169 | reconstructed_xyz = cluster_xyz_s + cluster_flow 170 | 171 | # Compute the unweighted Kabsch estimation (transformation parameters which best explain the vectors) 172 | R_cluster, t_cluster, _, _ = kabsch_transformation_estimation(cluster_xyz_s, reconstructed_xyz) 173 | 174 | # Detach the gradients such that they do not flow through the tansformation parameters but only through flow 175 | rigid_xyz = (torch.matmul(R_cluster, cluster_xyz_s.transpose(1, 2)) + t_cluster ).detach().squeeze(0).transpose(0,1) 176 | 177 | rigidity_loss += self.rigidity_criterion(reconstructed_xyz.squeeze(0), rigid_xyz) 178 | 179 | n_clusters += 1 180 | 181 | n_clusters = 1.0 if n_clusters == 0 else n_clusters 182 | losses['rigidity_loss'] = (rigidity_loss / n_clusters) * self.args['loss'].get('rigid_loss_w', 1.0) 183 | 184 | # Compute the total loss as the sum of individual losses 185 | total_loss = 0.0 186 | for key in losses: 187 | total_loss += losses[key] 188 | 189 | losses['total_loss'] = total_loss 190 | return losses 191 | 192 | 193 | 194 | 195 | 196 | 197 | class OutlierLoss(): 198 | """ 199 | Outlier loss used regularize the training of the ego-motion. Aims to prevent Sinkhorn algorithm to 200 | assign to much mass to the slack row and column. 
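A minimal sketch of the quantity this loss penalises, simplified to a single dense assignment matrix with made-up values: whatever soft-assignment mass is missing from each row and column is mass that leaked into the Sinkhorn slack bins.

```python
import torch

# One 2x2 soft assignment matrix; each row and column sums to 0.95,
# i.e. 0.05 of the mass per point went to the slack row/column.
perm = torch.tensor([[[0.90, 0.05],
                      [0.05, 0.90]]])          # [batch, n_src, n_ref]

ref_outliers_strength = 1.0 - perm.sum(dim=1)  # per reference point
src_outliers_strength = 1.0 - perm.sum(dim=2)  # per source point

loss = ref_outliers_strength.mean() + src_outliers_strength.mean()   # -> 0.10
```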
201 | 202 | """ 203 | def __init__(self): 204 | 205 | self.reduction = 'mean' 206 | 207 | def __call__(self, perm_matrix): 208 | 209 | ref_outliers_strength = [] 210 | src_outliers_strength = [] 211 | 212 | for batch_idx in range(len(perm_matrix)): 213 | ref_outliers_strength.append(1.0 - torch.sum(perm_matrix[batch_idx], dim=1)) 214 | src_outliers_strength.append(1.0 - torch.sum(perm_matrix[batch_idx], dim=2)) 215 | 216 | ref_outliers_strength = torch.cat(ref_outliers_strength,1) 217 | src_outliers_strength = torch.cat(src_outliers_strength,0) 218 | 219 | if self.reduction.lower() == 'mean': 220 | return torch.mean(ref_outliers_strength) + torch.mean(src_outliers_strength) 221 | 222 | elif self.reduction.lower() == 'none': 223 | return torch.mean(ref_outliers_strength, dim=1) + \ 224 | torch.mean(src_outliers_strength, dim=1) 225 | -------------------------------------------------------------------------------- /lib/metrics.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | from lib.utils import compute_epe, rotation_error, translation_error, precision_at_one, evaluate_binary_class 5 | 6 | class EvalMetrics(nn.Module): 7 | """ 8 | Computes all the evaluation metric used to either monitor the training process or evaluate the method 9 | 10 | Args: 11 | args: parameters controling the initialization of the evaluation metrics 12 | 13 | """ 14 | 15 | def __init__(self, args): 16 | nn.Module.__init__(self) 17 | 18 | 19 | self.args = args 20 | self.device = torch.device('cuda' if (torch.cuda.is_available() and args['misc']['use_gpu']) else 'cpu') 21 | 22 | def __call__(self, inferred_values, gt_data, phase='train'): 23 | 24 | # Initialize the dictionary 25 | metrics = {} 26 | 27 | if (self.args['method']['flow'] and self.args['metrics']['flow']): 28 | assert (('refined_flow' in inferred_values) & ('flow_eval' in gt_data)), "Flow metrics selected \ 29 | but either est or gt flow not provided" 30 | 31 | 32 | gt_flow = gt_data['flow'] if phase == 'train' else gt_data['flow_eval'] 33 | # Compute the end point error of the flow vectors 34 | # If bg/fg labels are available use them to also compute f-EPE and b-EPE 35 | if 'fg_labels_eval_s' in gt_data and self.args['data']['dataset'] not in ["FlyingThings3D_ME", "StereoKITTI_ME"]: 36 | gt_label = gt_data['fg_labels_s'] if phase == 'train' else gt_data['fg_labels_eval_s'] 37 | ego_metrics = compute_epe(inferred_values['refined_rigid_flow'], gt_flow, sem_label=gt_label, eval_stats=True) 38 | else: 39 | ego_metrics = compute_epe(inferred_values['refined_rigid_flow'], gt_flow, eval_stats =True) 40 | 41 | for key, value in ego_metrics.items(): 42 | metrics[key] = value 43 | 44 | # Compute the ego-motion metric 45 | if self.args['method']['ego_motion'] and self.args['metrics']['ego_motion']: 46 | assert (('R_est' in inferred_values) & ('R_ego' in gt_data)), "Ego motion metric selected \ 47 | but either est or gt ego motion not provided" 48 | 49 | r_error = rotation_error(inferred_values['R_est'], gt_data['R_ego']) 50 | 51 | metrics['mean_r_error'] = torch.mean(r_error).item() 52 | metrics['max_r_error'] = torch.max(r_error).item() 53 | metrics['min_r_error'] = torch.min(r_error).item() 54 | 55 | t_error = translation_error(inferred_values['t_est'], gt_data['t_ego']) 56 | 57 | metrics['mean_t_error'] = torch.mean(t_error).item() 58 | metrics['max_t_error'] = torch.max(t_error).item() 59 | metrics['min_t_error'] = torch.min(t_error).item() 60 | 61 | 62 | # Compute the 
background segmentation metric 63 | if self.args['method']['semantic'] and self.args['metrics']['semantic']: 64 | assert (('semantic_logits_s_all' in inferred_values) & ('fg_labels_eval_s' in gt_data)), "Background segmentation metric selected \ 65 | but either est or gt labels not provided" 66 | 67 | gt_label = gt_data['fg_labels_s'] if phase == 'train' else gt_data['fg_labels_eval_s'] 68 | 69 | pred_label = inferred_values['semantic_logits_s_all'].max(1)[1] 70 | pre_f, pre_b, rec_f, rec_b = precision_at_one(pred_label, gt_label) 71 | 72 | metrics['precision_f'] = pre_f.item() 73 | metrics['recall_f'] = rec_f.item() 74 | metrics['precision_b'] = pre_b.item() 75 | metrics['recall_b'] = rec_b.item() 76 | 77 | 78 | true_p, true_n, false_p, false_n = evaluate_binary_class(pred_label, gt_label) 79 | 80 | metrics['true_p'] = true_p.item() 81 | metrics['true_n'] = true_n.item() 82 | metrics['false_p'] = false_p.item() 83 | metrics['false_n'] = false_n.item() 84 | 85 | return metrics -------------------------------------------------------------------------------- /lib/model/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/lib/model/__init__.py -------------------------------------------------------------------------------- /lib/model/minkowski/ME_layers.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | import MinkowskiEngine as ME 5 | import MinkowskiEngine.MinkowskiFunctional as MEF 6 | 7 | #### NORMALIZATION LAYER #### 8 | def get_norm_layer(norm_type, num_feats, bn_momentum=0.05, D=-1): 9 | 10 | if norm_type == 'BN': 11 | return ME.MinkowskiBatchNorm(num_feats, momentum=bn_momentum) 12 | 13 | elif norm_type == 'IN': 14 | return ME.MinkowskiInstanceNorm(num_feats) 15 | 16 | else: 17 | raise ValueError(f'Type {norm_type}, not defined') 18 | 19 | #### RESIDUAL BLOCK #### 20 | 21 | class ResBlockBase(nn.Module): 22 | expansion = 1 23 | NORM_TYPE = 'BN' 24 | 25 | def __init__(self, 26 | inplanes, 27 | planes, 28 | stride=1, 29 | dilation=1, 30 | downsample=None, 31 | bn_momentum=0.1, 32 | D=3): 33 | super(ResBlockBase, self).__init__() 34 | 35 | self.conv1 = ME.MinkowskiConvolution( 36 | inplanes, planes, kernel_size=3, stride=stride, dimension=D) 37 | 38 | self.norm1 = get_norm_layer(self.NORM_TYPE, planes, bn_momentum=bn_momentum, D=D) 39 | 40 | self.conv2 = ME.MinkowskiConvolution( 41 | planes, 42 | planes, 43 | kernel_size=3, 44 | stride=1, 45 | dilation=dilation, 46 | bias=False, 47 | dimension=D) 48 | 49 | self.norm2 = get_norm_layer(self.NORM_TYPE, planes, bn_momentum=bn_momentum, D=D) 50 | 51 | self.downsample = downsample 52 | 53 | def forward(self, x): 54 | residual = x 55 | 56 | out = self.conv1(x) 57 | out = self.norm1(out) 58 | out = MEF.relu(out) 59 | 60 | out = self.conv2(out) 61 | out = self.norm2(out) 62 | 63 | if self.downsample is not None: 64 | residual = self.downsample(x) 65 | 66 | out += residual 67 | out = MEF.relu(out) 68 | 69 | return out 70 | 71 | 72 | class ResBlockBN(ResBlockBase): 73 | NORM_TYPE = 'BN' 74 | 75 | 76 | class ResBlockIN(ResBlockBase): 77 | NORM_TYPE = 'IN' 78 | 79 | 80 | def get_res_block(norm_type, 81 | inplanes, 82 | planes, 83 | stride=1, 84 | dilation=1, 85 | downsample=None, 86 | bn_momentum=0.1, 87 | D=3): 88 | 89 | if norm_type == 'BN': 90 | return ResBlockBN(inplanes, planes, stride, dilation, downsample, bn_momentum, D) 
91 | 92 | elif norm_type == 'IN': 93 | return ResBlockIN(inplanes, planes, stride, dilation, downsample, bn_momentum, D) 94 | 95 | else: 96 | raise ValueError(f'Type {norm_type}, not defined') 97 | -------------------------------------------------------------------------------- /lib/model/minkowski/MinkowskiFlow.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import numpy as np 4 | import MinkowskiEngine as ME 5 | import MinkowskiEngine.MinkowskiFunctional as MEF 6 | 7 | from lib.model.minkowski.ME_layers import get_norm_layer, get_res_block 8 | from lib.utils import kabsch_transformation_estimation 9 | 10 | _EPS = 1e-6 11 | 12 | class SparseEnoder(ME.MinkowskiNetwork): 13 | CHANNELS = [None, 64, 64, 128, 128] 14 | 15 | def __init__(self, 16 | in_channels=3, 17 | out_channels=128, 18 | bn_momentum=0.1, 19 | conv1_kernel_size=9, 20 | norm_type='IN', 21 | D=3): 22 | 23 | ME.MinkowskiNetwork.__init__(self, D) 24 | 25 | NORM_TYPE = norm_type 26 | BLOCK_NORM_TYPE = norm_type 27 | CHANNELS = self.CHANNELS 28 | 29 | 30 | self.conv1 = ME.MinkowskiConvolution( 31 | in_channels=in_channels, 32 | out_channels=CHANNELS[1], 33 | kernel_size=conv1_kernel_size, 34 | stride=1, 35 | dilation=1, 36 | bias=False, 37 | dimension=D) 38 | self.norm1 = get_norm_layer(NORM_TYPE, CHANNELS[1], bn_momentum=bn_momentum, D=D) 39 | 40 | self.block1 = get_res_block( 41 | BLOCK_NORM_TYPE, CHANNELS[1], CHANNELS[1], bn_momentum=bn_momentum, D=D) 42 | 43 | self.conv2 = ME.MinkowskiConvolution( 44 | in_channels=CHANNELS[1], 45 | out_channels=CHANNELS[2], 46 | kernel_size=3, 47 | stride=2, 48 | dilation=1, 49 | bias=False, 50 | dimension=D) 51 | 52 | self.norm2 = get_norm_layer(NORM_TYPE, CHANNELS[2], bn_momentum=bn_momentum, D=D) 53 | 54 | self.block2 = get_res_block( 55 | BLOCK_NORM_TYPE, CHANNELS[2], CHANNELS[2], bn_momentum=bn_momentum, D=D) 56 | 57 | self.conv3 = ME.MinkowskiConvolution( 58 | in_channels=CHANNELS[2], 59 | out_channels=CHANNELS[3], 60 | kernel_size=3, 61 | stride=2, 62 | dilation=1, 63 | bias=False, 64 | dimension=D) 65 | self.norm3 = get_norm_layer(NORM_TYPE, CHANNELS[3], bn_momentum=bn_momentum, D=D) 66 | 67 | self.block3 = get_res_block( 68 | BLOCK_NORM_TYPE, CHANNELS[3], CHANNELS[3], bn_momentum=bn_momentum, D=D) 69 | 70 | self.conv4 = ME.MinkowskiConvolution( 71 | in_channels=CHANNELS[3], 72 | out_channels=CHANNELS[4], 73 | kernel_size=3, 74 | stride=2, 75 | dilation=1, 76 | bias=False, 77 | dimension=D) 78 | self.norm4 = get_norm_layer(NORM_TYPE, CHANNELS[4], bn_momentum=bn_momentum, D=D) 79 | 80 | self.block4 = get_res_block( 81 | BLOCK_NORM_TYPE, CHANNELS[4], CHANNELS[4], bn_momentum=bn_momentum, D=D) 82 | 83 | 84 | 85 | def forward(self, x, tgt_feature=False): 86 | 87 | skip_features = [] 88 | out_s1 = self.conv1(x) 89 | out_s1 = self.norm1(out_s1) 90 | out = self.block1(out_s1) 91 | 92 | skip_features.append(out_s1) 93 | 94 | out_s2 = self.conv2(out) 95 | out_s2 = self.norm2(out_s2) 96 | out = self.block2(out_s2) 97 | 98 | skip_features.append(out_s2) 99 | 100 | out_s4 = self.conv3(out) 101 | out_s4 = self.norm3(out_s4) 102 | out = self.block3(out_s4) 103 | 104 | skip_features.append(out_s4) 105 | 106 | out_s8 = self.conv4(out) 107 | out_s8 = self.norm4(out_s8) 108 | out = self.block4(out_s8) 109 | 110 | return out, skip_features 111 | 112 | 113 | 114 | 115 | 116 | class SparseDecoder(ME.MinkowskiNetwork): 117 | TR_CHANNELS = [None, 64, 128, 128, 128] 118 | CHANNELS = [None, 64, 64, 128, 128] 119 | 120 | def 
__init__(self, 121 | out_channels=128, 122 | bn_momentum=0.1, 123 | norm_type='IN', 124 | D=3): 125 | 126 | ME.MinkowskiNetwork.__init__(self, D) 127 | 128 | NORM_TYPE = norm_type 129 | BLOCK_NORM_TYPE = norm_type 130 | TR_CHANNELS = self.TR_CHANNELS 131 | CHANNELS = self.CHANNELS 132 | 133 | 134 | self.conv4_tr = ME.MinkowskiConvolutionTranspose( 135 | in_channels=CHANNELS[4], 136 | out_channels=TR_CHANNELS[4], 137 | kernel_size=3, 138 | stride=2, 139 | dilation=1, 140 | bias=False, 141 | dimension=D) 142 | 143 | self.norm4_tr = get_norm_layer(NORM_TYPE, TR_CHANNELS[4], bn_momentum=bn_momentum, D=D) 144 | 145 | self.block4_tr = get_res_block( 146 | BLOCK_NORM_TYPE, TR_CHANNELS[4], TR_CHANNELS[4], bn_momentum=bn_momentum, D=D) 147 | 148 | 149 | self.conv3_tr = ME.MinkowskiConvolutionTranspose( 150 | in_channels=CHANNELS[3] + TR_CHANNELS[4], 151 | out_channels=TR_CHANNELS[3], 152 | kernel_size=3, 153 | stride=2, 154 | dilation=1, 155 | bias=False, 156 | dimension=D) 157 | self.norm3_tr = get_norm_layer(NORM_TYPE, TR_CHANNELS[3], bn_momentum=bn_momentum, D=D) 158 | 159 | self.block3_tr = get_res_block( 160 | BLOCK_NORM_TYPE, TR_CHANNELS[3], TR_CHANNELS[3], bn_momentum=bn_momentum, D=D) 161 | 162 | 163 | self.conv2_tr = ME.MinkowskiConvolutionTranspose( 164 | in_channels=CHANNELS[2] + TR_CHANNELS[3], 165 | out_channels=TR_CHANNELS[2], 166 | kernel_size=3, 167 | stride=2, 168 | dilation=1, 169 | bias=False, 170 | dimension=D) 171 | self.norm2_tr = get_norm_layer(NORM_TYPE, TR_CHANNELS[2], bn_momentum=bn_momentum, D=D) 172 | 173 | self.block2_tr = get_res_block( 174 | BLOCK_NORM_TYPE, TR_CHANNELS[2], TR_CHANNELS[2], bn_momentum=bn_momentum, D=D) 175 | 176 | 177 | 178 | self.conv1_tr = ME.MinkowskiConvolutionTranspose( 179 | in_channels=CHANNELS[1] + TR_CHANNELS[2], 180 | out_channels=TR_CHANNELS[1], 181 | kernel_size=1, 182 | stride=1, 183 | dilation=1, 184 | bias=False, 185 | dimension=D) 186 | 187 | self.final = ME.MinkowskiConvolution( 188 | in_channels=TR_CHANNELS[1], 189 | out_channels=out_channels, 190 | kernel_size=1, 191 | stride=1, 192 | dilation=1, 193 | bias=True, 194 | dimension=D) 195 | 196 | 197 | def forward(self, x, skip_features): 198 | 199 | out = self.conv4_tr(x) 200 | out = self.norm4_tr(out) 201 | 202 | out_s4_tr = self.block4_tr(out) 203 | 204 | out = ME.cat(out_s4_tr, skip_features[-1]) 205 | 206 | out = self.conv3_tr(out) 207 | out = self.norm3_tr(out) 208 | out_s2_tr = self.block3_tr(out) 209 | 210 | out = ME.cat(out_s2_tr, skip_features[-2]) 211 | 212 | out = self.conv2_tr(out) 213 | out = self.norm2_tr(out) 214 | out_s1_tr = self.block2_tr(out) 215 | 216 | out = ME.cat(out_s1_tr, skip_features[-3]) 217 | 218 | out = self.conv1_tr(out) 219 | out = MEF.relu(out) 220 | out = self.final(out) 221 | 222 | return out 223 | 224 | class SparseFlowRefiner(ME.MinkowskiNetwork): 225 | BLOCK_NORM_TYPE = 'BN' 226 | NORM_TYPE = 'BN' 227 | 228 | def __init__(self, 229 | flow_dim = 3, 230 | flow_channels = 64, 231 | out_channels=3, 232 | bn_momentum=0.1, 233 | conv1_kernel_size=5, 234 | D=3): 235 | 236 | ME.MinkowskiNetwork.__init__(self, D) 237 | 238 | NORM_TYPE = self.NORM_TYPE 239 | BLOCK_NORM_TYPE = self.BLOCK_NORM_TYPE 240 | 241 | self.conv1 = ME.MinkowskiConvolution( 242 | in_channels=flow_dim, 243 | out_channels=flow_channels, 244 | kernel_size=conv1_kernel_size, 245 | stride=1, 246 | dilation=1, 247 | bias=False, 248 | dimension=D) 249 | 250 | self.conv2 = ME.MinkowskiConvolution( 251 | in_channels=flow_channels, 252 | out_channels=flow_channels, 253 | kernel_size=3, 254 | 
stride=1, 255 | dilation=1, 256 | bias=False, 257 | dimension=D) 258 | 259 | 260 | 261 | self.conv3 = ME.MinkowskiConvolution( 262 | in_channels=flow_channels, 263 | out_channels=flow_channels, 264 | kernel_size=3, 265 | stride=1, 266 | dilation=1, 267 | bias=False, 268 | dimension=D) 269 | 270 | self.conv4 = ME.MinkowskiConvolution( 271 | in_channels=flow_channels, 272 | out_channels=flow_channels, 273 | kernel_size=3, 274 | stride=1, 275 | dilation=1, 276 | bias=False, 277 | dimension=D) 278 | 279 | self.final = ME.MinkowskiConvolution( 280 | in_channels=flow_channels, 281 | out_channels=out_channels, 282 | kernel_size=1, 283 | stride=1, 284 | dilation=1, 285 | bias=False, 286 | dimension=D) 287 | 288 | 289 | def forward(self, flow): 290 | 291 | 292 | out = MEF.relu(self.conv1(flow)) 293 | out = MEF.relu(self.conv2(out)) 294 | 295 | out = MEF.relu(self.conv3(out)) 296 | out = MEF.relu(self.conv4(out)) 297 | 298 | res_flow = self.final(out) 299 | 300 | 301 | return flow + res_flow 302 | 303 | 304 | class EgoMotionHead(nn.Module): 305 | """ 306 | Class defining EgoMotionHead 307 | """ 308 | 309 | def __init__(self, add_slack=True, sinkhorn_iter=5): 310 | nn.Module.__init__(self) 311 | 312 | self.slack = add_slack 313 | self.sinkhorn_iter = sinkhorn_iter 314 | 315 | # Affinity parameters 316 | self.beta = torch.nn.Parameter(torch.tensor(-5.0)) 317 | self.alpha = torch.nn.Parameter(torch.tensor(-5.0)) 318 | 319 | self.softplus = torch.nn.Softplus() 320 | 321 | 322 | def compute_rigid_transform(self, xyz_s, xyz_t, weights): 323 | """Compute rigid transforms between two point sets 324 | 325 | Args: 326 | a (torch.Tensor): (B, M, 3) points 327 | b (torch.Tensor): (B, N, 3) points 328 | weights (torch.Tensor): (B, M) 329 | 330 | Returns: 331 | Transform T (B, 3, 4) to get from a to b, i.e. T*a = b 332 | """ 333 | 334 | weights_normalized = weights[..., None] / (torch.sum(weights[..., None], dim=1, keepdim=True) + _EPS) 335 | centroid_s = torch.sum(xyz_s * weights_normalized, dim=1) 336 | centroid_t = torch.sum(xyz_t * weights_normalized, dim=1) 337 | s_centered = xyz_s - centroid_s[:, None, :] 338 | t_centered = xyz_t - centroid_t[:, None, :] 339 | cov = s_centered.transpose(-2, -1) @ (t_centered * weights_normalized) 340 | 341 | # Compute rotation using Kabsch algorithm. Will compute two copies with +/-V[:,:3] 342 | # and choose based on determinant to avoid flips 343 | u, s, v = torch.svd(cov, some=False, compute_uv=True) 344 | rot_mat_pos = v @ u.transpose(-1, -2) 345 | v_neg = v.clone() 346 | v_neg[:, :, 2] *= -1 347 | rot_mat_neg = v_neg @ u.transpose(-1, -2) 348 | rot_mat = torch.where(torch.det(rot_mat_pos)[:, None, None] > 0, rot_mat_pos, rot_mat_neg) 349 | assert torch.all(torch.det(rot_mat) > 0) 350 | 351 | # Compute translation (uncenter centroid) 352 | translation = -rot_mat @ centroid_s[:, :, None] + centroid_t[:, :, None] 353 | 354 | transform = torch.cat((rot_mat, translation), dim=2) 355 | 356 | return transform 357 | 358 | def sinkhorn(self, log_alpha, n_iters=5, slack=True): 359 | """ Run sinkhorn iterations to generate a near doubly stochastic matrix, where each row or column sum to <=1 360 | Args: 361 | log_alpha: log of positive matrix to apply sinkhorn normalization (B, J, K) 362 | n_iters (int): Number of normalization iterations 363 | slack (bool): Whether to include slack row and column 364 | eps: eps for early termination (Used only for handcrafted RPM). Set to negative to disable. 
365 | Returns: 366 | log(perm_matrix): Doubly stochastic matrix (B, J, K) 367 | Modified from original source taken from: 368 | Learning Latent Permutations with Gumbel-Sinkhorn Networks 369 | https://github.com/HeddaCohenIndelman/Learning-Gumbel-Sinkhorn-Permutations-w-Pytorch 370 | """ 371 | 372 | # Sinkhorn iterations 373 | 374 | zero_pad = nn.ZeroPad2d((0, 1, 0, 1)) 375 | log_alpha_padded = zero_pad(log_alpha[:, None, :, :]) 376 | 377 | log_alpha_padded = torch.squeeze(log_alpha_padded, dim=1) 378 | 379 | for i in range(n_iters): 380 | # Row normalization 381 | log_alpha_padded = torch.cat(( 382 | log_alpha_padded[:, :-1, :] - (torch.logsumexp(log_alpha_padded[:, :-1, :], dim=2, keepdim=True)), 383 | log_alpha_padded[:, -1, None, :]), # Don't normalize last row 384 | dim=1) 385 | 386 | # Column normalization 387 | log_alpha_padded = torch.cat(( 388 | log_alpha_padded[:, :, :-1] - (torch.logsumexp(log_alpha_padded[:, :, :-1], dim=1, keepdim=True)), 389 | log_alpha_padded[:, :, -1, None]), # Don't normalize last column 390 | dim=2) 391 | 392 | 393 | log_alpha = log_alpha_padded[:, :-1, :-1] 394 | 395 | return log_alpha 396 | 397 | 398 | def forward(self, score_matrix, mask, xyz_s, xyz_t): 399 | 400 | affinity = -(score_matrix - self.softplus(self.alpha))/(torch.exp(self.beta) + 0.02) 401 | 402 | # Compute weighted coordinates 403 | log_perm_matrix = self.sinkhorn(affinity, n_iters=self.sinkhorn_iter, slack=self.slack) 404 | 405 | perm_matrix = torch.exp(log_perm_matrix) * mask 406 | weighted_t = perm_matrix @ xyz_t / (torch.sum(perm_matrix, dim=2, keepdim=True) + _EPS) 407 | 408 | # Compute transform and transform points 409 | #transform = self.compute_rigid_transform(xyz_s, weighted_t, weights=torch.sum(perm_matrix, dim=2)) 410 | R_est, t_est, _, _ = kabsch_transformation_estimation(xyz_s, weighted_t, weights=torch.sum(perm_matrix, dim=2)) 411 | return R_est, t_est, perm_matrix 412 | 413 | 414 | 415 | class SparseSegHead(ME.MinkowskiNetwork): 416 | 417 | def __init__(self, 418 | in_channels=64, 419 | out_channels=128, 420 | bn_momentum=0.1, 421 | norm_type='IN', 422 | D=3): 423 | 424 | ME.MinkowskiNetwork.__init__(self, D) 425 | 426 | NORM_TYPE = norm_type 427 | 428 | self.seg_head_1 = ME.MinkowskiConvolution( 429 | in_channels=in_channels, 430 | out_channels=in_channels, 431 | kernel_size=1, 432 | stride=1, 433 | dilation=1, 434 | bias=True, 435 | dimension=D) 436 | 437 | self.norm_1 = get_norm_layer(NORM_TYPE, in_channels, bn_momentum=bn_momentum, D=D) 438 | 439 | self.seg_head_2 = ME.MinkowskiConvolution( 440 | in_channels=in_channels, 441 | out_channels=out_channels, 442 | kernel_size=1, 443 | stride=1, 444 | dilation=1, 445 | bias=True, 446 | dimension=D) 447 | 448 | 449 | def forward(self, x): 450 | 451 | out = self.seg_head_1(x) 452 | out = self.norm_1(out) 453 | out = MEF.relu(out) 454 | 455 | out = self.seg_head_2(out) 456 | 457 | 458 | return out -------------------------------------------------------------------------------- /lib/model/minkowski/__init__,py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/lib/model/minkowski/__init__,py -------------------------------------------------------------------------------- /lib/model/rigid_3d_sf.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | import numpy as np 5 | from collections import defaultdict 6 | from 
sklearn.cluster import DBSCAN 7 | import MinkowskiEngine as ME 8 | 9 | from lib.utils import pairwise_distance, transform_point_cloud, kabsch_transformation_estimation, refine_ego_motion, refine_cluster_motion 10 | from lib.utils import upsample_flow, upsample_bckg_labels, upsample_cluster_labels 11 | from lib.model.minkowski.MinkowskiFlow import SparseEnoder, SparseDecoder, SparseFlowRefiner, EgoMotionHead, SparseSegHead 12 | 13 | 14 | 15 | class MinkowskiFlow(nn.Module): 16 | def __init__(self, args): 17 | super(MinkowskiFlow, self).__init__() 18 | 19 | self.args = args 20 | self.voxel_size = args['misc']['voxel_size'] 21 | self.device = torch.device('cuda' if (torch.cuda.is_available() and args['misc']['use_gpu']) else 'cpu') 22 | self.normalize_feature = args['network']['normalize_features'] 23 | self.test_flag = True if args['misc']['run_mode'] == 'test' else False 24 | 25 | if self.test_flag: 26 | self.postprocess_ego = args['test']['postprocess_ego'] 27 | self.postprocess_clusters = args['test']['postprocess_clusters'] 28 | 29 | self.estimate_ego, self.estimate_flow, self.estimate_semantic, self.estimate_cluster = False, False, False, False 30 | 31 | self.upsampling_k = 36 if args['data']['dataset'] in ['StereoKITTI_ME', 'FlyingThings3D_ME'] else 3 32 | self.tau_offset = 0.025 if args['data']['dataset'] in ['StereoKITTI_ME', 'FlyingThings3D_ME'] else 0.03 33 | 34 | if args['data']['input_features'] == 'occupancy': 35 | self.input_feature_dim = 1 36 | else: 37 | self.input_feature_dim = 3 38 | 39 | # Initialize the backbone network 40 | self.encoder = SparseEnoder(in_channels=self.input_feature_dim, 41 | conv1_kernel_size=args['network']['in_kernel_size'], 42 | norm_type=args['network']['norm_type']) 43 | 44 | self.decoder = SparseDecoder(out_channels=args['network']['feature_dim'], 45 | norm_type=args['network']['norm_type']) 46 | 47 | # Initialize the scene flow head 48 | if args['method']['flow']: 49 | self.estimate_flow = True 50 | self.epsilon = torch.nn.Parameter(torch.tensor(-5.0)) 51 | 52 | self.flow_refiner = SparseFlowRefiner(flow_dim=3) 53 | 54 | # Initialize the background segmentation head 55 | if args['method']['semantic']: 56 | self.estimate_semantic = True 57 | 58 | self.seg_decoder = SparseSegHead(in_channels=args['network']['feature_dim'], 59 | out_channels=args['data']['n_classes'], 60 | norm_type=args['network']['norm_type']) 61 | 62 | 63 | # Initialize the ego motion head 64 | if args['method']['ego_motion']: 65 | self.estimate_ego = True 66 | self.ego_n_points = args['network']['ego_motion_points'] 67 | self.add_slack = args['network']['add_slack'] 68 | self.sinkhorn_iter = args['network']['sinkhorn_iter'] 69 | 70 | self.ego_motion_decoder = EgoMotionHead(add_slack=self.add_slack, 71 | sinkhorn_iter=self.sinkhorn_iter) 72 | 73 | # Initialize the foreground clustering head 74 | if args['method']['clustering']: 75 | self.estimate_cluster = True 76 | self.min_p_cluster = args['network']['min_p_cluster'] 77 | 78 | self.cluster_estimator = DBSCAN(min_samples=args['network']['min_samples_dbscan'], 79 | metric=args['network']['cluster_metric'], eps=args['network']['eps_dbscan']) 80 | 81 | 82 | 83 | def _infer_flow(self, flow_f_1, flow_f_2): 84 | 85 | # Normalize the features 86 | if self.normalize_feature: 87 | flow_f_1= ME.SparseTensor( 88 | flow_f_1.F / torch.norm(flow_f_1.F, p=2, dim=1, keepdim=True), 89 | coordinate_map_key=flow_f_1.coordinate_map_key, 90 | coordinate_manager=flow_f_1.coordinate_manager) 91 | 92 | flow_f_2= ME.SparseTensor( 93 | flow_f_2.F / 
torch.norm(flow_f_2.F, p=2, dim=1, keepdim=True), 94 | coordinate_map_key=flow_f_2.coordinate_map_key, 95 | coordinate_manager=flow_f_2.coordinate_manager) 96 | 97 | # Extract the coarse flow based on the feature correspondences 98 | coarse_flow = [] 99 | 100 | # Iterate over the examples in the batch 101 | for b_idx in range(len(flow_f_1.decomposed_coordinates)): 102 | feat_s = flow_f_1.F[flow_f_1.C[:,0] == b_idx] 103 | feat_t = flow_f_2.F[flow_f_2.C[:,0] == b_idx] 104 | 105 | coor_s = flow_f_1.C[flow_f_1.C[:,0] == b_idx,1:].to(self.device) * self.voxel_size 106 | coor_t = flow_f_2.C[flow_f_2.C[:,0] == b_idx,1:].to(self.device) * self.voxel_size 107 | 108 | 109 | # Squared l2 distance between points points of both point clouds 110 | coor_s, coor_t = coor_s.unsqueeze(0), coor_t.unsqueeze(0) 111 | feat_s, feat_t = feat_s.unsqueeze(0), feat_t.unsqueeze(0) 112 | 113 | # Force transport to be zero for points further than 10 m apart 114 | support = (pairwise_distance(coor_s, coor_t, normalized=False ) < 10**2).float() 115 | 116 | # Transport cost matrix 117 | C = pairwise_distance(feat_s, feat_t) 118 | 119 | K = torch.exp(-C / (torch.exp(self.epsilon) + self.tau_offset)) * support 120 | 121 | row_sum = K.sum(-1, keepdim=True) 122 | 123 | # Estimate flow 124 | corr_flow = (K @ coor_t) / (row_sum + 1e-8) - coor_s 125 | 126 | coarse_flow.append(corr_flow.squeeze(0)) 127 | 128 | 129 | coarse_flow = torch.cat(coarse_flow,dim=0) 130 | 131 | st_cf = ME.SparseTensor(features=coarse_flow, 132 | coordinate_manager=flow_f_1.coordinate_manager, 133 | coordinate_map_key=flow_f_1.coordinate_map_key) 134 | 135 | self.inferred_values['coarse_flow'] = st_cf.F 136 | 137 | 138 | # Refine the flow with the second network 139 | refined_flow = self.flow_refiner(st_cf) 140 | 141 | 142 | self.inferred_values['refined_flow'] = refined_flow.F 143 | 144 | 145 | 146 | def _infer_ego_motion(self, flow_f_1, flow_f_2, sem_label_s, sem_label_t): 147 | 148 | ego_motion_R = [] 149 | ego_motion_t = [] 150 | ego_motion_perm = [] 151 | 152 | run_b_len_s = 0 153 | run_b_len_t = 0 154 | 155 | # Normalize the features 156 | if self.normalize_feature: 157 | flow_f_1= ME.SparseTensor( 158 | flow_f_1.F / torch.norm(flow_f_1.F, p=2, dim=1, keepdim=True), 159 | coordinate_map_key=flow_f_1.coordinate_map_key, 160 | coordinate_manager=flow_f_1.coordinate_manager) 161 | 162 | flow_f_2= ME.SparseTensor( 163 | flow_f_2.F / torch.norm(flow_f_2.F, p=2, dim=1, keepdim=True), 164 | coordinate_map_key=flow_f_2.coordinate_map_key, 165 | coordinate_manager=flow_f_2.coordinate_manager) 166 | 167 | for b_idx in range(len(flow_f_1.decomposed_coordinates)): 168 | feat_s = flow_f_1.F[flow_f_1.C[:,0] == b_idx] 169 | feat_t = flow_f_2.F[flow_f_2.C[:,0] == b_idx] 170 | 171 | coor_s = flow_f_1.C[flow_f_1.C[:,0] == b_idx,1:].to(self.device) * self.voxel_size 172 | coor_t = flow_f_2.C[flow_f_2.C[:,0] == b_idx,1:].to(self.device) * self.voxel_size 173 | 174 | # Get the number of points in the current b_idx 175 | b_len_s = feat_s.shape[0] 176 | b_len_t = feat_t.shape[0] 177 | 178 | # Extract the semantic labels for the current b_idx (0 are the background points) 179 | mask_s = (sem_label_s[run_b_len_s: (run_b_len_s + b_len_s)] == 0) 180 | mask_t = (sem_label_t[run_b_len_t: (run_b_len_t + b_len_t)] == 0) 181 | 182 | # Update the running number of points 183 | run_b_len_s += b_len_s 184 | run_b_len_t += b_len_t 185 | 186 | # Squared l2 distance between points points of both point clouds 187 | coor_s, coor_t = coor_s[mask_s, :].unsqueeze(0), coor_t[mask_t, 
:].unsqueeze(0) 188 | feat_s, feat_t = feat_s[mask_s, :].unsqueeze(0), feat_t[mask_t, :].unsqueeze(0) 189 | 190 | # Sample the points randomly (to keep the computation memory tracktable) 191 | idx_ego_s = torch.randperm(coor_s.shape[1])[:self.ego_n_points] 192 | idx_ego_t = torch.randperm(coor_t.shape[1])[:self.ego_n_points] 193 | 194 | coor_s_ego = coor_s[:,idx_ego_s,:] 195 | coor_t_ego = coor_t[:,idx_ego_t,:] 196 | feat_s_ego = feat_s[:,idx_ego_s,:] 197 | feat_t_ego = feat_t[:,idx_ego_t,:] 198 | 199 | # Force transport to be zero for points further than 10 m apart 200 | support_ego = (pairwise_distance(coor_s_ego, coor_t_ego, normalized=False ) < 5 ** 2).float() 201 | 202 | # Cost matrix in the feature space 203 | feat_dist = pairwise_distance(feat_s_ego, feat_t_ego) 204 | 205 | R_est, t_est, perm_matrix = self.ego_motion_decoder(feat_dist, support_ego, coor_s_ego, coor_t_ego) 206 | 207 | ego_motion_R.append(R_est) 208 | ego_motion_t.append(t_est) 209 | ego_motion_perm.append(perm_matrix) 210 | 211 | 212 | # Save ego motion results 213 | self.inferred_values['R_est'] = torch.cat(ego_motion_R, dim=0) 214 | self.inferred_values['t_est'] = torch.cat(ego_motion_t, dim=0) 215 | self.inferred_values['permutation'] = ego_motion_perm 216 | 217 | 218 | def _infer_semantics(self, dec_f_1, dec_f_2): 219 | 220 | # Extract the logits 221 | logits_s = self.seg_decoder(dec_f_1) 222 | logits_t = self.seg_decoder(dec_f_2) 223 | 224 | self.inferred_values['semantic_logits_s'] = logits_s 225 | self.inferred_values['semantic_logits_t'] = logits_t 226 | 227 | 228 | def _infer_clusters(self, st_s, st_t, sem_label_s, sem_label_t): 229 | 230 | # Cluster the source and target point cloud (only source clusters will be used) 231 | running_idx_s = 0 232 | running_idx_t = 0 233 | 234 | clusters_s = defaultdict(list) 235 | clusters_t = defaultdict(list) 236 | 237 | clusters_s_rot = defaultdict(list) 238 | clusters_s_trans = defaultdict(list) 239 | 240 | batch_size = torch.max(st_s.coordinates[:,0]) + 1 241 | 242 | for b_idx in range(batch_size): 243 | b_fgrnd_idx_s = torch.where(sem_label_s[running_idx_s:(running_idx_s + st_s.C[st_s.C[:,0] == b_idx,1:].shape[0])] == 1)[0] 244 | b_fgrnd_idx_t = torch.where(sem_label_t[running_idx_t:(running_idx_t + st_t.C[st_t.C[:,0] == b_idx,1:].shape[0])] == 1)[0] 245 | 246 | coor_s = st_s.C[st_s.C[:,0] == b_idx,1:].to(self.device) * self.voxel_size 247 | coor_t = st_t.C[st_t.C[:,0] == b_idx,1:].to(self.device) * self.voxel_size 248 | 249 | # Only perform if foreground points are in both source and target 250 | if b_fgrnd_idx_s.shape[0] and b_fgrnd_idx_t.shape[0]: 251 | xyz_fgrnd_s = coor_s[b_fgrnd_idx_s, :].cpu().numpy() 252 | xyz_fgrnd_t = coor_t[b_fgrnd_idx_t, :].cpu().numpy() 253 | 254 | # Perform clustering 255 | labels_s = self.cluster_estimator.fit_predict(xyz_fgrnd_s) 256 | labels_t = self.cluster_estimator.fit_predict(xyz_fgrnd_t) 257 | 258 | # Map cluster labels to indices (consider only clusters that have at least n points) 259 | for class_label in np.unique(labels_s): 260 | if class_label != -1 and np.where(labels_s == class_label)[0].shape[0] >= self.min_p_cluster: 261 | clusters_s[str(b_idx)].append(b_fgrnd_idx_s[np.where(labels_s == class_label)[0]] + running_idx_s) 262 | 263 | for class_label in np.unique(labels_t): 264 | if class_label != -1 and np.where(labels_t == class_label)[0].shape[0] >= self.min_p_cluster: 265 | clusters_t[str(b_idx)].append(b_fgrnd_idx_t[np.where(labels_t == class_label)[0]] + running_idx_t) 266 | 267 | # Estimate the relative transformation 
parameteres of each cluster 268 | if self.test_flag: 269 | for c_idx in clusters_s[str(b_idx)]: 270 | cluster_xyz_s = (st_s.C[c_idx,1:] * self.voxel_size).unsqueeze(0).to(self.device) 271 | cluster_flow = self.inferred_values['refined_flow'][c_idx,:].unsqueeze(0) 272 | reconstructed_xyz = cluster_xyz_s + cluster_flow 273 | 274 | R_cluster, t_cluster, _, _ = kabsch_transformation_estimation(cluster_xyz_s, reconstructed_xyz) 275 | 276 | clusters_s_rot[str(b_idx)].append(R_cluster.squeeze(0)) 277 | clusters_s_trans[str(b_idx)].append(t_cluster.squeeze(0)) 278 | 279 | running_idx_s += coor_s.shape[0] 280 | running_idx_t += coor_t.shape[0] 281 | 282 | self.inferred_values['clusters_s'] = clusters_s 283 | self.inferred_values['clusters_t'] = clusters_t 284 | self.inferred_values['clusters_s_R'] = clusters_s_rot 285 | self.inferred_values['clusters_s_t'] = clusters_s_trans 286 | 287 | 288 | 289 | def forward(self, st_1, st_2, xyz_1, xyz_2, sem_label_s, sem_label_t): 290 | 291 | self.inferred_values = {} 292 | 293 | # Run both point clouds through the backbone network 294 | enc_feat_1, skip_features_1 = self.encoder(st_1) 295 | enc_feat_2, skip_features_2 = self.encoder(st_2) 296 | 297 | dec_feat_1 = self.decoder(enc_feat_1, skip_features_1) 298 | dec_feat_2 = self.decoder(enc_feat_2, skip_features_2) 299 | 300 | # Rune the background segmentation head 301 | if self.estimate_semantic: 302 | self._infer_semantics(dec_feat_1, dec_feat_2) 303 | est_sem_label_s = self.inferred_values['semantic_logits_s'].F.max(1)[1] 304 | est_sem_label_t = self.inferred_values['semantic_logits_t'].F.max(1)[1] 305 | 306 | # Rune the scene flow head 307 | if self.estimate_flow: 308 | self._infer_flow(dec_feat_1, dec_feat_2) 309 | 310 | # Rune the ego-motion head 311 | if self.estimate_ego: 312 | # During training use the given semantic labels to sample the points 313 | if self.test_flag: 314 | if self.estimate_semantic: 315 | self._infer_ego_motion(dec_feat_1, dec_feat_2, est_sem_label_s, est_sem_label_t) 316 | else: 317 | raise ValueError("Ego motion estimation selected in test phase but background segmentation head was not used") 318 | else: 319 | self._infer_ego_motion(dec_feat_1, dec_feat_2, sem_label_s, sem_label_t) 320 | 321 | # Rune the foreground clustering 322 | if self.estimate_cluster: 323 | # During training use the given semantic labels 324 | if self.test_flag: 325 | if self.estimate_semantic: 326 | self._infer_clusters(st_1,st_2, est_sem_label_s, est_sem_label_t) 327 | else: 328 | raise ValueError("Foreground clustering selected in test phase but background segmentation head was not used") 329 | else: 330 | 331 | self._infer_clusters(st_1,st_2, sem_label_s, sem_label_t) 332 | 333 | 334 | 335 | # From rigid transformations to pointwise scene flow 336 | if self.test_flag and self.estimate_ego: 337 | 338 | coor_s = st_1.C[st_1.C[:,0] == 0,1:].to(self.device) * self.voxel_size 339 | coor_t = st_2.C[st_2.C[:,0] == 0,1:].to(self.device) * self.voxel_size 340 | 341 | # Ego-motion test-time optimization 342 | if self.test_flag and self.postprocess_ego: 343 | bckg_mask_s = (est_sem_label_s == 0).unsqueeze(0) 344 | bckg_mask_t = (est_sem_label_t == 0).unsqueeze(0) 345 | 346 | R_e, t_e = refine_ego_motion(coor_s.unsqueeze(0), coor_t.unsqueeze(0), bckg_mask_s, bckg_mask_t, self.inferred_values['R_est'], self.inferred_values['t_est'] ) 347 | 348 | self.inferred_values['R_est'] = torch.from_numpy(R_e).to(self.device) 349 | self.inferred_values['t_est'] = torch.from_numpy(t_e).to(self.device) 350 | 351 | 352 | # Update 
the flow vectors of the background based on the ego motion 353 | xyz_1_transformed = transform_point_cloud(coor_s.to(self.device), self.inferred_values['R_est'], self.inferred_values['t_est']) 354 | bckg_idx = torch.where(est_sem_label_s == 0)[0] 355 | self.inferred_values['refined_flow'][bckg_idx,:] = xyz_1_transformed[0,bckg_idx,:].to(self.device) - coor_s[bckg_idx,:].to(self.device) 356 | 357 | if self.test_flag and self.estimate_cluster: 358 | 359 | # Foreground test time optimization 360 | if self.test_flag and self.postprocess_clusters: 361 | fgnd_mask_t = (est_sem_label_t == 1).unsqueeze(0) 362 | 363 | for idx, c_idx in enumerate(self.inferred_values['clusters_s']['0']): 364 | pc_s_cluster = coor_s[c_idx,:] 365 | pc_t_fgnd = coor_t[fgnd_mask_t[0],:] 366 | 367 | R_coarse = self.inferred_values['clusters_s_R']['0'][idx] 368 | t_coarse = self.inferred_values['clusters_s_t']['0'][idx] 369 | 370 | R_c, t_c = refine_cluster_motion(pc_s_cluster, pc_t_fgnd, R_coarse, t_coarse) 371 | 372 | R_c = torch.from_numpy(R_c).to(self.device) 373 | t_c = torch.from_numpy(t_c).to(self.device) 374 | 375 | self.inferred_values['clusters_s_R']['0'][idx] = R_c 376 | self.inferred_values['clusters_s_t']['0'][idx] = t_c 377 | 378 | 379 | # Update the flow vectors of the foreground based on the object wise rigid motion 380 | for idx, c_idx in enumerate(self.inferred_values['clusters_s']['0']): 381 | pc_s_cluster = coor_s[c_idx,:] 382 | 383 | cluster_transformed = transform_point_cloud(pc_s_cluster.to(self.device), self.inferred_values['clusters_s_R']['0'][idx], 384 | self.inferred_values['clusters_s_t']['0'][idx]) 385 | 386 | self.inferred_values['refined_flow'][c_idx,:] = cluster_transformed.squeeze(0).to(self.device) - pc_s_cluster.to(self.device) 387 | 388 | 389 | # Upsample the flow from the voxel centers to the original points 390 | 391 | if self.estimate_flow: 392 | # Finally we upsample the voxel flow to the actuall raw points 393 | refined_voxel_flow = ME.SparseTensor(features=self.inferred_values['refined_flow'], 394 | coordinate_manager=dec_feat_1.coordinate_manager, 395 | coordinate_map_key=dec_feat_1.coordinate_map_key) 396 | 397 | # Interpolate the flow from the voxels to the continuos coordinates on the coarse level and upsample the labels 398 | upsampled_voxel_flow = upsample_flow(xyz_1, refined_voxel_flow, k_value=self.upsampling_k, voxel_size=self.voxel_size) 399 | self.inferred_values['refined_rigid_flow'] = torch.cat(upsampled_voxel_flow, dim=0) 400 | 401 | if self.estimate_semantic: 402 | upsampled_seg_labels = upsample_bckg_labels(xyz_1, self.inferred_values['semantic_logits_s'], voxel_size=self.voxel_size) 403 | 404 | self.inferred_values['semantic_logits_s_all'] = upsampled_seg_labels 405 | 406 | if self.estimate_cluster: 407 | 408 | upsampled_cluster_labels = upsample_cluster_labels(xyz_1, self.inferred_values['semantic_logits_s'], self.inferred_values['clusters_s'], voxel_size=self.voxel_size) 409 | 410 | self.inferred_values['clusters_s_all'] = upsampled_cluster_labels 411 | 412 | return self.inferred_values 413 | 414 | 415 | -------------------------------------------------------------------------------- /lib/trainer.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import copy 3 | import MinkowskiEngine as ME 4 | from tqdm import tqdm 5 | 6 | from lib.loss import TrainLoss 7 | from lib.metrics import EvalMetrics 8 | from lib.utils import dict_all_to_device 9 | 10 | class FlowTrainer: 11 | ''' 12 | Default trainer class for the 
scene flow training 13 | 14 | Args: 15 | args (dict): configuration parameters 16 | model (nn.Module): model 17 | device (pytorch device) 18 | 19 | ''' 20 | 21 | def __init__(self, args, model, device): 22 | 23 | self.device = device 24 | 25 | self.compute_losses = TrainLoss(args) 26 | self.compute_metrics = EvalMetrics(args) 27 | self.model = model.to(device) 28 | 29 | 30 | def train_step(self, data): 31 | ''' 32 | Performs a single training step. 33 | 34 | Args: 35 | data (dict): input data 36 | 37 | Returns: 38 | loss_values (dict): all individual loss values 39 | metric (dict): evaluation metics 40 | total_loss (torch.tensor): loss value used for training 41 | 42 | ''' 43 | 44 | self.model.train() 45 | losses, metrics = self._compute_loss_metrics(data) 46 | 47 | # Copy only the loss values not the whole tensors 48 | loss_values = {} 49 | for key, value in losses.items(): 50 | loss_values[key] = value.item() 51 | 52 | return loss_values, metrics, losses['total_loss'] 53 | 54 | 55 | def eval_step(self, data): 56 | ''' 57 | Performs a single evaluation epoch. 58 | 59 | Args: 60 | data (dict): input data 61 | 62 | Returns: 63 | metric (dict): evaluation metics 64 | 65 | ''' 66 | 67 | # evaluate model: 68 | self.model.eval() 69 | with torch.no_grad(): 70 | _, metrics = self._compute_loss_metrics(data, phase='eval') 71 | 72 | return metrics 73 | 74 | 75 | def validate(self, val_loader): 76 | ''' 77 | Performs the whole validation 78 | 79 | Args: 80 | val_loader ( torch data loader): data loader of the validation data 81 | ''' 82 | 83 | # evaluate model: 84 | self.model.eval() 85 | running_losses = {} 86 | running_metrics = {} 87 | 88 | with torch.no_grad(): 89 | for it, batch in enumerate(tqdm(val_loader)): 90 | 91 | dict_all_to_device(batch, self.device) 92 | losses, metrics = self._compute_loss_metrics(batch) 93 | 94 | # Update the running losses 95 | if not running_losses: 96 | running_losses = copy.deepcopy(losses) 97 | else: 98 | for key, value in losses.items(): 99 | running_losses[key] += value 100 | 101 | # Update the running metrics 102 | if not running_metrics: 103 | running_metrics = copy.deepcopy(metrics) 104 | else: 105 | for key, value in metrics.items(): 106 | running_metrics[key] += value 107 | 108 | 109 | for key, value in running_losses.items(): 110 | running_losses[key] = value/len(val_loader) 111 | 112 | for key, value in running_metrics.items(): 113 | running_metrics[key] = value/len(val_loader) 114 | 115 | return running_losses, running_metrics 116 | 117 | 118 | class MEFlowTrainer(FlowTrainer): 119 | ''' 120 | Trainer class of the 3D rigid scene flow network with ME backbone 121 | 122 | Args: 123 | args (dict): configuration parameters 124 | model (nn.Module): model 125 | device (pytorch device) 126 | ''' 127 | 128 | def __init__(self, args, model, device): 129 | 130 | FlowTrainer.__init__(self, args, model, device) 131 | 132 | def _compute_loss_metrics(self, input_dict, phase='train'): 133 | 134 | ''' 135 | Computes the losses and evaluation metrics 136 | 137 | Args: 138 | input_dict (dict): data dictionary 139 | 140 | Return: 141 | losses (dict): selected loss values 142 | metric (dict): selected evaluation metric 143 | ''' 144 | 145 | # Run the feature and context encoder 146 | sinput1 = ME.SparseTensor(features=input_dict['sinput_s_F'].to(self.device), 147 | coordinates=input_dict['sinput_s_C'].to(self.device)) 148 | 149 | sinput2 = ME.SparseTensor(features=input_dict['sinput_t_F'].to(self.device), 150 | coordinates=input_dict['sinput_t_C'].to(self.device)) 151 | 
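        # Note on the inputs above: 'sinput_*_C' hold the quantized voxel coordinates in the MinkowskiEngine batched format,
        # i.e. each row is [batch_idx, x, y, z], while 'sinput_*_F' hold the per-voxel input features, either a constant
        # occupancy feature (1-dim) or the raw xyz coordinates (3-dim), depending on args['data']['input_features']
        # (see lib/model/rigid_3d_sf.py).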
152 | if phase == 'train': 153 | inferred_values = self.model(sinput1, sinput2, input_dict['pcd_s'], input_dict['pcd_t'], input_dict['fg_labels_s'], input_dict['fg_labels_t']) 154 | else: 155 | inferred_values = self.model(sinput1, sinput2, input_dict['pcd_eval_s'], input_dict['pcd_eval_t'], input_dict['fg_labels_s'], input_dict['fg_labels_t']) 156 | 157 | losses = self.compute_losses(inferred_values, input_dict) 158 | 159 | metrics = self.compute_metrics(inferred_values, input_dict, phase) 160 | 161 | return losses, metrics 162 | 163 | 164 | def _demo_step(self, input_dict): 165 | 166 | ''' 167 | Runs a short demo and visualizes the output 168 | 169 | Args: 170 | input_dict (dict): data dictionary 171 | 172 | ''' 173 | 174 | # Run the feature and context encoder 175 | sinput1 = ME.SparseTensor(features=input_dict['sinput_s_F'].to(self.device), 176 | coordinates=input_dict['sinput_s_C'].to(self.device)) 177 | 178 | sinput2 = ME.SparseTensor(features=input_dict['sinput_t_F'].to(self.device), 179 | coordinates=input_dict['sinput_t_C'].to(self.device)) 180 | 181 | inferred_values = self.model(sinput1, sinput2, input_dict['pcd_eval_s'], input_dict['pcd_eval_t'], input_dict['fg_labels_s'], input_dict['fg_labels_t']) 182 | 183 | losses = self.compute_losses(inferred_values, input_dict) 184 | metrics = self.compute_metrics(inferred_values, input_dict) 185 | 186 | return losses, metrics -------------------------------------------------------------------------------- /lib/utils.py: -------------------------------------------------------------------------------- 1 | import os 2 | import re 3 | import torch 4 | import logging 5 | import math 6 | from collections import defaultdict 7 | 8 | import open3d as o3d 9 | import numpy as np 10 | 11 | from matplotlib import pyplot as plt 12 | from matplotlib import cm 13 | from matplotlib.colors import Normalize 14 | 15 | def dict_all_to_device(tensor_dict, device): 16 | """ 17 | Puts all the tensors to a specified device 18 | 19 | Args: 20 | tensor_dict (dict): dictionary of all tensors 21 | device (str): device to be used (cuda or cpu) 22 | 23 | """ 24 | 25 | for key in tensor_dict: 26 | if isinstance(tensor_dict[key], torch.Tensor): 27 | if 'sinput' not in key: 28 | tensor_dict[key] = tensor_dict[key].to(device) 29 | 30 | 31 | def save_checkpoint(filename, epoch, it, model, optimizer=None, scheduler=None, config=None, best_val=None): 32 | """ 33 | Saves the current model, optimizer, scheduler, and side information to a checkpoint 34 | 35 | Args: 36 | filename (str): path to where the checkpoint will be saved 37 | epoch (int): current epoch 38 | it (int): current iteration 39 | model (nn.Module): torch neural network model 40 | optimizer (torch.optim): selected optimizer 41 | scheduler (torch.optim): selected scheduler 42 | config (dict): config parameters 43 | best_val (float): best validation result 44 | 45 | """ 46 | 47 | state = { 48 | 'epoch': epoch, 49 | 'state_dict': model.state_dict(), 50 | 'total_it': it, 51 | 'optimizer': optimizer.state_dict(), 52 | 'config': config, 53 | 'scheduler': scheduler.state_dict(), 54 | 'best_val': best_val, 55 | } 56 | 57 | logging.info("Saving checkpoint: {} ...".format(filename)) 58 | torch.save(state, filename) 59 | 60 | 61 | def load_checkpoint(model, optimizer, scheduler, filename): 62 | """ 63 | Loads the saved checkpoint and updates the model, optimizer and scheduler.
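The model weights are loaded safely: only the parameter keys present in both the current model and the checkpoint are copied, and the optimizer, scheduler, epoch, iteration count, and best validation value are restored only if they are stored in the checkpoint.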
64 | 65 | Args: 66 | model (nn.Module): torch neural network model 67 | optimizer (torch.optim): selected optimizer 68 | scheduler (torch.optim): selected scheduler 69 | filename (str): path to the saved checkpoint 70 | 71 | Returns: 72 | model (nn.Module): model with pretrained parameters 73 | optimizer (torch.optim): optimizer loaded from the checkpoint 74 | scheduler (torch.optim): scheduler loaded from the checkpoint 75 | start_epoch (int): current epoch 76 | total_it (int): total number of iterations that were performed 77 | metric_val_best (float): current best valuation metric 78 | 79 | """ 80 | start_epoch = 0 81 | total_it = 0 82 | metric_val_best = np.inf 83 | 84 | if os.path.isfile(filename): 85 | logging.info("Loading checkpoint {}".format(filename)) 86 | checkpoint = torch.load(filename) 87 | 88 | # Safe loading of the model, load only the keys that are in the init and the saved model 89 | model_dict = model.state_dict() 90 | for key in model_dict: 91 | if key in checkpoint['state_dict']: 92 | model_dict[key] = checkpoint['state_dict'][key] 93 | 94 | model.load_state_dict(model_dict) 95 | 96 | if optimizer is not None and 'optimizer' in checkpoint: 97 | try: 98 | optimizer.load_state_dict(checkpoint['optimizer']) 99 | except: 100 | logging.info('could not load optimizer from the pretrained model') 101 | 102 | if 'epoch' in checkpoint: 103 | start_epoch = checkpoint['epoch'] 104 | 105 | if 'total_it' in checkpoint: 106 | total_it = checkpoint['total_it'] 107 | 108 | if 'best_val' in checkpoint: 109 | metric_val_best = checkpoint['best_val'] 110 | 111 | if scheduler is not None and 'scheduler' in checkpoint: 112 | scheduler.load_state_dict(checkpoint['scheduler']) 113 | 114 | else: 115 | logging.info("No checkpoint found at {}".format(filename)) 116 | 117 | return model, optimizer, scheduler, start_epoch, total_it, metric_val_best 118 | 119 | 120 | def load_point_cloud(file, data_type='numpy'): 121 | """ 122 | Loads the point cloud coordinates from the '*.ply' file. 123 | Args: 124 | file (str): path to the '*.ply' file 125 | data_type (str): data type to be returned (default: numpy) 126 | Returns: 127 | pc (np.array or open3d.PointCloud()): point coordinates [n, 3] 128 | """ 129 | temp_pc = o3d.io.read_point_cloud(file) 130 | 131 | assert data_type in ['numpy', 'open3d'], 'Wrong data type selected when loading the ply file.' 
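    # Usage sketch (the file name is purely illustrative): load_point_cloud('frame_0000.ply') returns an [n, 3] numpy array
    # of point coordinates, while load_point_cloud('frame_0000.ply', data_type='open3d') returns the open3d point cloud object itself.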
132 | 133 | if data_type == 'numpy': 134 | return np.asarray(temp_pc.points) 135 | else: 136 | return temp_pc 137 | 138 | 139 | def sorted_alphanum(file_list_ordered): 140 | """ 141 | Sorts the list alphanumerically 142 | Args: 143 | file_list_ordered (list): list of files to be sorted 144 | Return: 145 | sorted_list (list): input list sorted alphanumerically 146 | """ 147 | def convert(text): 148 | return int(text) if text.isdigit() else text 149 | 150 | def alphanum_key(key): 151 | return [convert(c) for c in re.split('([0-9]+)', key)] 152 | 153 | sorted_list = sorted(file_list_ordered, key=alphanum_key) 154 | 155 | return sorted_list 156 | 157 | def get_file_list(path, extension=None): 158 | """ 159 | Build a list of all the files in the provided path 160 | Args: 161 | path (str): path to the directory 162 | extension (str): only return files with this extension 163 | Return: 164 | file_list (list): list of all the files (with the provided extension) sorted alphanumerically 165 | """ 166 | if extension is None: 167 | file_list = [os.path.join(path, f) for f in os.listdir(path) if os.path.isfile(os.path.join(path, f))] 168 | else: 169 | file_list = [ 170 | os.path.join(path, f) 171 | for f in os.listdir(path) 172 | if os.path.isfile(os.path.join(path, f)) and os.path.splitext(f)[1] == extension 173 | ] 174 | file_list = sorted_alphanum(file_list) 175 | 176 | return file_list 177 | 178 | 179 | def get_folder_list(path): 180 | """ 181 | Build a list of all the files in the provided path 182 | Args: 183 | path (str): path to the directory 184 | extension (str): only return files with this extension 185 | Returns: 186 | file_list (list): list of all the files (with the provided extension) sorted alphanumerically 187 | """ 188 | folder_list = [os.path.join(path, f) for f in os.listdir(path) if os.path.isdir(os.path.join(path, f))] 189 | folder_list = sorted_alphanum(folder_list) 190 | 191 | return folder_list 192 | 193 | def n_model_parameters(model): 194 | """ 195 | Counts the number of parameters in a torch model 196 | Args: 197 | model (torch.Model): input model 198 | 199 | Returns: 200 | _ (int): number of the parameters 201 | """ 202 | 203 | return sum(p.numel() for p in model.parameters() if p.requires_grad) 204 | 205 | 206 | def pairwise_distance(src, dst, normalized=True): 207 | """Calculates squared Euclidean distance between each two points. 208 | Args: 209 | src (torch tensor): source data, [b, n, c] 210 | dst (torch tensor): target data, [b, m, c] 211 | normalized (bool): distance computation can be more efficient 212 | Returns: 213 | dist (torch tensor): per-point square distance, [b, n, m] 214 | """ 215 | 216 | if len(src.shape) == 2: 217 | src = src.unsqueeze(0) 218 | dst = dst.unsqueeze(0) 219 | 220 | B, N, _ = src.shape 221 | _, M, _ = dst.shape 222 | 223 | # Minus such that smaller value still means closer 224 | dist = -torch.matmul(src, dst.permute(0, 2, 1)) 225 | 226 | # If inputs are normalized just add 1 otherwise compute the norms 227 | if not normalized: 228 | dist *= 2 229 | dist += torch.sum(src ** 2, dim=-1)[:, :, None] 230 | dist += torch.sum(dst ** 2, dim=-1)[:, None, :] 231 | 232 | else: 233 | dist += 1.0 234 | 235 | # Distances can get negative due to numerical precision 236 | dist = torch.clamp(dist, min=0.0, max=None) 237 | 238 | return dist 239 | 240 | 241 | def rotation_error(R1, R2): 242 | """ 243 | Torch batch implementation of the rotation error between the estimated and the ground truth rotatiom matrix. 
244 | Rotation error is defined as r_e = \arccos(\frac{\mathrm{Trace}(\mathbf{R}_{ij}^{T}\mathbf{R}_{ij}^{\mathrm{GT}}) - 1}{2}) 245 | Args: 246 | R1 (torch tensor): Estimated rotation matrices [b,3,3] 247 | R2 (torch tensor): Ground truth rotation matrices [b,3,3] 248 | Returns: 249 | ae (torch tensor): Rotation error in angular degrees [b,1] 250 | """ 251 | R_ = torch.matmul(R1.transpose(1,2), R2) 252 | e = torch.stack([(torch.trace(R_[_, :, :]) - 1) / 2 for _ in range(R_.shape[0])], dim=0).unsqueeze(1) 253 | 254 | # Clamp the errors to the valid range (otherwise torch.acos() is nan) 255 | e = torch.clamp(e, -1, 1, out=None) 256 | 257 | ae = torch.acos(e) 258 | pi = torch.Tensor([math.pi]) 259 | ae = 180. * ae / pi.to(ae.device).type(ae.dtype) 260 | 261 | return ae 262 | 263 | 264 | def translation_error(t1, t2): 265 | """ 266 | Torch batch implementation of the translation error between the estimated and the ground truth translation vectors. 267 | Translation error is defined as t_e = \lVert \mathbf{t}_{1} - \mathbf{t}_{2} \rVert_{2} 268 | Args: 269 | t1 (torch tensor): Estimated translation vectors [b,3,1] 270 | t2 (torch tensor): Ground truth translation vectors [b,3,1] 271 | Returns: 272 | te (torch tensor): translation error in meters [b,1] 273 | """ 274 | return torch.norm(t1-t2, dim=(1, 2)) 275 | 276 | 277 | def kabsch_transformation_estimation(x1, x2, weights=None, normalize_w = True, eps = 1e-7, best_k = 0, w_threshold = 0, compute_residuals = False): 278 | """ 279 | Torch differentiable implementation of the weighted Kabsch algorithm (https://en.wikipedia.org/wiki/Kabsch_algorithm). Based on the correspondences and weights, it calculates 280 | the optimal rotation matrix in the sense of the Frobenius norm (RMSD); from the estimated rotation matrix it then estimates the translation vector, hence solving 281 | the Procrustes problem. This implementation supports batch inputs.
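In short, with normalized weights w the weighted centroids are \mu_1 = \sum_i w_i x1_i and \mu_2 = \sum_i w_i x2_i, the weighted covariance is H = (x1 - \mu_1)^{T} \mathrm{diag}(w) (x2 - \mu_2), its SVD H = U S V^{T} yields R = V \mathrm{diag}(1, 1, \det(V U^{T})) U^{T}, and the translation follows as t = \mu_2 - R \mu_1.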
282 | Args: 283 | x1 (torch array): points of the first point cloud [b,n,3] 284 | x2 (torch array): correspondences for the PC1 established in the feature space [b,n,3] 285 | weights (torch array): weights denoting if the coorespondence is an inlier (~1) or an outlier (~0) [b,n] 286 | normalize_w (bool) : flag for normalizing the weights to sum to 1 287 | best_k (int) : number of correspondences with highest weights to be used (if 0 all are used) 288 | w_threshold (float) : only use weights higher than this w_threshold (if 0 all are used) 289 | Returns: 290 | rot_matrices (torch array): estimated rotation matrices [b,3,3] 291 | trans_vectors (torch array): estimated translation vectors [b,3,1] 292 | res (torch array): pointwise residuals (Eucledean distance) [b,n] 293 | valid_gradient (bool): Flag denoting if the SVD computation converged (gradient is valid) 294 | """ 295 | if weights is None: 296 | weights = torch.ones(x1.shape[0],x1.shape[1]).type_as(x1).to(x1.device) 297 | 298 | if normalize_w: 299 | sum_weights = torch.sum(weights,dim=1,keepdim=True) + eps 300 | weights = (weights/sum_weights) 301 | 302 | weights = weights.unsqueeze(2) 303 | 304 | if best_k > 0: 305 | indices = np.argpartition(weights.cpu().numpy(), -best_k, axis=1)[0,-best_k:,0] 306 | weights = weights[:,indices,:] 307 | x1 = x1[:,indices,:] 308 | x2 = x2[:,indices,:] 309 | 310 | if w_threshold > 0: 311 | weights[weights < w_threshold] = 0 312 | 313 | 314 | x1_mean = torch.matmul(weights.transpose(1,2), x1) / (torch.sum(weights, dim=1).unsqueeze(1) + eps) 315 | x2_mean = torch.matmul(weights.transpose(1,2), x2) / (torch.sum(weights, dim=1).unsqueeze(1) + eps) 316 | 317 | x1_centered = x1 - x1_mean 318 | x2_centered = x2 - x2_mean 319 | 320 | cov_mat = torch.matmul(x1_centered.transpose(1, 2), 321 | (x2_centered * weights)) 322 | 323 | try: 324 | u, s, v = torch.svd(cov_mat) 325 | 326 | except Exception as e: 327 | r = torch.eye(3,device=x1.device) 328 | r = r.repeat(x1_mean.shape[0],1,1) 329 | t = torch.zeros((x1_mean.shape[0],3,1), device=x1.device) 330 | 331 | res = transformation_residuals(x1, x2, r, t) 332 | 333 | return r, t, res, True 334 | 335 | tm_determinant = torch.det(torch.matmul(v.transpose(1, 2), u.transpose(1, 2))) 336 | 337 | determinant_matrix = torch.diag_embed(torch.cat((torch.ones((tm_determinant.shape[0],2),device=x1.device), tm_determinant.unsqueeze(1)), 1)) 338 | 339 | rotation_matrix = torch.matmul(v,torch.matmul(determinant_matrix,u.transpose(1,2))) 340 | 341 | # translation vector 342 | translation_matrix = x2_mean.transpose(1,2) - torch.matmul(rotation_matrix,x1_mean.transpose(1,2)) 343 | 344 | # Residuals 345 | res = None 346 | if compute_residuals: 347 | res = transformation_residuals(x1, x2, rotation_matrix, translation_matrix) 348 | 349 | return rotation_matrix, translation_matrix, res, False 350 | 351 | 352 | def transformation_residuals(x1, x2, R, t): 353 | """ 354 | Computer the pointwise residuals based on the estimated transformation paramaters 355 | 356 | Args: 357 | x1 (torch array): points of the first point cloud [b,n,3] 358 | x2 (torch array): points of the second point cloud [b,n,3] 359 | R (torch array): estimated rotation matrice [b,3,3] 360 | t (torch array): estimated translation vectors [b,3,1] 361 | Returns: 362 | res (torch array): pointwise residuals (Eucledean distance) [b,n,1] 363 | """ 364 | x2_reconstruct = torch.matmul(R, x1.transpose(1, 2)) + t 365 | 366 | res = torch.norm(x2_reconstruct.transpose(1, 2) - x2, dim=2) 367 | 368 | return res 369 | 370 | def 
transform_point_cloud(x1, R, t): 371 | """ 372 | Transforms the point cloud using the giver transformation paramaters 373 | 374 | Args: 375 | x1 (np array): points of the point cloud [b,n,3] 376 | R (np array): estimated rotation matrice [b,3,3] 377 | t (np array): estimated translation vectors [b,3,1] 378 | Returns: 379 | x1_t (np array): points of the transformed point clouds [b,n,3] 380 | """ 381 | if len(R.shape) != 3: 382 | R = R.unsqueeze(0) 383 | 384 | if len(t.shape) != 3: 385 | t = t.unsqueeze(0) 386 | 387 | if len(x1.shape) != 3: 388 | x1 = x1.unsqueeze(0) 389 | 390 | x1_t = (torch.matmul(R, x1.transpose(2,1)) + t).transpose(2,1) 391 | 392 | return x1_t 393 | 394 | 395 | def refine_ego_motion(pc_s, pc_t, bckg_mask_s, bckg_mask_t, R_est, t_est): 396 | """ 397 | Refines the coarse ego motion estimate based on all background indices 398 | 399 | Args: 400 | pc_s (torch.tensor): points of the source point cloud [b,n,3] 401 | pc_t (torch.tensor): points of the target point cloud [b,n,3] 402 | bckg_mask_s (torch.tensor): background mask for the source points [b,n] 403 | bckg_mask_t (torch.tensor): background mask for the target points [b,n] 404 | R_est (torch.tensor): coarse rotation matrices [b,3,3] 405 | t_est (torch.tensor): coarse translation vectors [b,3,1] 406 | Returns: 407 | R_ref (np array): refined transformation parameters [b,3,3] 408 | t_ref (np array): refined transformation parameters [b,3,1] 409 | """ 410 | 411 | pcd_s = o3d.geometry.PointCloud() 412 | pcd_t = o3d.geometry.PointCloud() 413 | 414 | R_est = R_est.cpu().numpy() 415 | t_est = t_est.cpu().numpy() 416 | 417 | R_ref = np.zeros_like(R_est) 418 | t_ref = np.zeros_like(t_est) 419 | 420 | init_T = np.eye(4) 421 | 422 | for b_idx in range(pc_s.shape[0]): 423 | xyz_bckg_s = pc_s[b_idx, bckg_mask_s[b_idx,:], :].cpu().numpy() 424 | xyz_bckg_t = pc_t[b_idx, bckg_mask_t[b_idx,:], :].cpu().numpy() 425 | 426 | pcd_s.points = o3d.utility.Vector3dVector(xyz_bckg_s) 427 | pcd_t.points = o3d.utility.Vector3dVector(xyz_bckg_t) 428 | 429 | init_T[0:3,0:3] = R_est[b_idx,:,:] 430 | init_T[0:3,3:4] = t_est[b_idx,:,:] 431 | 432 | trans = o3d.pipelines.registration.registration_icp(pcd_s, pcd_t, 433 | max_correspondence_distance=0.15, init=init_T, 434 | criteria=o3d.pipelines.registration.ICPConvergenceCriteria(max_iteration = 300)) 435 | 436 | R_ref[b_idx,:,:] = trans.transformation[0:3,0:3] 437 | t_ref[b_idx,:,:] = trans.transformation[0:3,3:4] 438 | 439 | return R_ref, t_ref 440 | 441 | 442 | 443 | 444 | def refine_cluster_motion(pc_s, pc_t, R_est=None, t_est=None): 445 | """ 446 | Refines the motion of a foreground rigid agent (clust) 447 | 448 | Args: 449 | pc_s (torch.tensor): points of the cluster points [n,3] 450 | pc_t (torch.tensor): foreground point of the target point cloud [m,3] 451 | R_coarse (torch.tensor): coarse rotation matrices [3,3] 452 | t_coarse (torch.tensor): coarse translation vectors [3,1] 453 | Returns: 454 | R_ref (np array): refined transformation parameters [3,3] 455 | t_ref (np array): refined transformation parameters [3,1] 456 | """ 457 | 458 | pcd_s = o3d.geometry.PointCloud() 459 | pcd_t = o3d.geometry.PointCloud() 460 | 461 | init_T = np.eye(4, dtype=np.float) 462 | 463 | if R_est is not None: 464 | init_T[0:3,0:3] = R_est.cpu().numpy() 465 | init_T[0:3,3:4] = t_est.cpu().numpy() 466 | 467 | pcd_s.points = o3d.utility.Vector3dVector(pc_s.cpu()) 468 | pcd_t.points = o3d.utility.Vector3dVector(pc_t.cpu()) 469 | 470 | trans = o3d.pipelines.registration.registration_icp(pcd_s, pcd_t, 471 | 
max_correspondence_distance=0.25, init=init_T, 472 | criteria=o3d.pipelines.registration.ICPConvergenceCriteria(max_iteration = 300)) 473 | 474 | R_ref = trans.transformation[0:3,0:3].astype(np.float32) 475 | t_ref = trans.transformation[0:3,3:4].astype(np.float32) 476 | 477 | return R_ref, t_ref 478 | 479 | 480 | def compute_epe(est_flow, gt_flow, sem_label=None, eval_stats =False, mask=None): 481 | """ 482 | Compute 3d end-point-error 483 | 484 | Args: 485 | st_flow (torch.Tensor): estimated flow vectors [n,3] 486 | gt_flow (torch.Tensor): ground truth flow vectors [n,3] 487 | eval_stats (bool): compute the evaluation stats as defined in FlowNet3D 488 | mask (torch.Tensor): boolean mask used for filtering the epe [n] 489 | 490 | Returns: 491 | epe (float): mean EPE for current batch 492 | epe_bckg (float): mean EPE for the background points 493 | epe_forg (float): mean EPE for the foreground points 494 | acc3d_strict (float): inlier ratio according to strict thresh (error smaller than 5cm or 5%) 495 | acc3d_relax (float): inlier ratio according to relaxed thresh (error smaller than 10cm or 10%) 496 | outlier (float): ratio of outliers (error larger than 30cm or 10%) 497 | """ 498 | 499 | metrics = {} 500 | error = est_flow - gt_flow 501 | 502 | # If mask if provided mask out the flow 503 | if mask is not None: 504 | error = error[mask > 0.5] 505 | gt_flow = gt_flow[mask > 0.5, :] 506 | 507 | epe_per_point = torch.sqrt(torch.sum(torch.pow(error, 2.0), -1)) 508 | epe = epe_per_point.mean() 509 | 510 | metrics['epe'] = epe.item() 511 | 512 | 513 | if sem_label is not None: 514 | # Extract epe for background and foreground separately (background = class 0) 515 | bckg_mask = (sem_label == 0) 516 | forg_mask = (sem_label == 1) 517 | 518 | bckg_epe = epe_per_point[bckg_mask] 519 | forg_epe = epe_per_point[forg_mask] 520 | 521 | metrics['bckg_epe'] = bckg_epe.mean().item() 522 | metrics['bckg_epe_median'] = bckg_epe.median().item() 523 | 524 | if torch.sum(forg_mask) > 0: 525 | metrics['forg_epe_median'] = forg_epe.median().item() 526 | metrics['forg_epe'] = forg_epe.mean().item() 527 | 528 | if eval_stats: 529 | 530 | gt_f_magnitude = torch.norm(gt_flow, dim=-1) 531 | gt_f_magnitude_np = np.linalg.norm(gt_flow.cpu(), axis=-1) 532 | relative_err = epe_per_point / (gt_f_magnitude + 1e-4) 533 | acc3d_strict = ( 534 | (torch.logical_or(epe_per_point < 0.05, relative_err < 0.05)).type(torch.float).mean() 535 | ) 536 | acc3d_relax = ( 537 | (torch.logical_or(epe_per_point < 0.1, relative_err < 0.1)).type(torch.float).mean() 538 | ) 539 | outlier = (torch.logical_or(epe_per_point > 0.3, relative_err > 0.1)).type(torch.float).mean() 540 | 541 | metrics['acc3d_s'] = acc3d_strict.item() 542 | metrics['acc3d_r'] = acc3d_relax.item() 543 | metrics['outlier'] = outlier.item() 544 | 545 | return metrics 546 | 547 | 548 | def compute_l1_loss(est_flow, gt_flow): 549 | """ 550 | Compute training loss. 
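A small, self-contained example of calling `compute_epe` on synthetic flow vectors with a binary foreground/background label (hedged: it assumes the function is exposed from `lib/utils.py`):

```python
import torch

from lib.utils import compute_epe  # assumed module path

est_flow = torch.rand(1000, 3)
gt_flow = est_flow + 0.02 * torch.randn(1000, 3)     # roughly 2 cm of noise
sem_label = (torch.rand(1000) > 0.7).long()          # 1 = foreground, 0 = background

metrics = compute_epe(est_flow, gt_flow, sem_label=sem_label, eval_stats=True)
print(metrics['epe'], metrics['acc3d_s'], metrics['outlier'])
```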
551 | 552 | Args: 553 | est_flow (torch.Tensor): estimated flow 554 | gt_flow (torch.Tensor): : ground truth flow 555 | 556 | Returns 557 | loss (torch.tensor): mean l1 loss of the current batch 558 | 559 | """ 560 | 561 | error = est_flow - gt_flow 562 | loss = torch.mean(torch.abs(error)) 563 | 564 | return loss 565 | 566 | 567 | 568 | def precision_at_one(pred, target): 569 | """ 570 | Computes the precision and recall of the binary fg/bg segmentation 571 | 572 | Args: 573 | pred (torch.Tensor): predicted foreground labels 574 | target (torch.Tensor): : gt foreground labels 575 | 576 | Returns 577 | precision_f (float): foreground precision 578 | precision_b (float): background precision 579 | recall_f (float): foreground recall 580 | recall_b (float): background recall 581 | 582 | """ 583 | 584 | precision_f = (pred[target == 1] == 1).float().sum() / ((pred == 1).float().sum() + 1e-6) 585 | precision_b = (pred[target == 0] == 0).float().sum() / ((pred == 0).float().sum() + 1e-6) 586 | 587 | recall_f = (pred[target == 1] == 1).float().sum() / ((target == 1).float().sum() + 1e-6) 588 | recall_b = (pred[target == 0] == 0).float().sum() / ((target == 0).float().sum() + 1e-6) 589 | 590 | return precision_f, precision_b, recall_f, recall_b 591 | 592 | 593 | def evaluate_binary_class(pred, target): 594 | """ 595 | Computes the number of true/false positives and negatives 596 | 597 | Args: 598 | pred (torch.Tensor): predicted foreground labels 599 | target (torch.Tensor): : gt foreground labels 600 | 601 | Returns 602 | true_p (float): number of true positives 603 | true_n (float): number of true negatives 604 | false_p (float): number of false positives 605 | false_n (float): number of false negatives 606 | 607 | """ 608 | 609 | true_p = (pred[target == 1] == 1).float().sum() 610 | true_n = (pred[target == 0] == 0).float().sum() 611 | 612 | false_p = (pred[target == 0] == 1).float().sum() 613 | false_n = (pred[target == 1] == 0).float().sum() 614 | 615 | return true_p, true_n, false_p, false_n 616 | 617 | 618 | 619 | def upsample_flow(xyz, sparse_flow_tensor, k_value=3, voxel_size = None): 620 | 621 | dense_flow = [] 622 | for b_idx in range(len(xyz)): 623 | 624 | sparse_xyz = sparse_flow_tensor.coordinates_at(b_idx).cuda() * voxel_size 625 | sparse_flow = sparse_flow_tensor.features_at(b_idx) 626 | 627 | sqr_dist = pairwise_distance(xyz[b_idx].cuda(), sparse_xyz, normalized=False).squeeze(0) 628 | sqr_dist, group_idx = torch.topk(sqr_dist, k_value, dim = -1, largest=False, sorted=False) 629 | 630 | 631 | dist = torch.sqrt(sqr_dist) 632 | norm = torch.sum(1 / (dist + 1e-7), dim = 1, keepdim = True) 633 | weight = ((1 / (dist + 1e-7)) / norm ).unsqueeze(-1) 634 | 635 | test = group_idx.reshape(-1) 636 | sparse_flow = sparse_flow[group_idx.reshape(-1), :].reshape(-1,k_value,3) 637 | 638 | dense_flow.append(torch.sum(weight * sparse_flow, dim=1)) 639 | 640 | return dense_flow 641 | 642 | 643 | def upsample_bckg_labels(xyz, sparse_seg_tensor, voxel_size = None): 644 | 645 | upsampled_seg_labels = [] 646 | for b_idx in range(len(xyz)): 647 | sparse_xyz = sparse_seg_tensor.coordinates_at(b_idx).cuda() * voxel_size 648 | seg_labels = sparse_seg_tensor.features_at(b_idx) 649 | sqr_dist = pairwise_distance(xyz[b_idx].cuda(), sparse_xyz, normalized=False).squeeze(0) 650 | sqr_dist, idx = torch.topk(sqr_dist, 1, dim = -1, largest=False, sorted=False) 651 | 652 | 653 | upsampled_seg_labels.append(seg_labels[idx.reshape(-1)]) 654 | 655 | return torch.cat(upsampled_seg_labels,0) 656 | 657 | 658 | 659 | def 
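To illustrate the binary foreground/background metrics above, a short sketch on a toy prediction (again assuming `lib.utils` as the import path):

```python
import torch

from lib.utils import precision_at_one, evaluate_binary_class  # assumed module path

target = torch.tensor([1, 1, 0, 0, 1, 0])
pred   = torch.tensor([1, 0, 0, 0, 1, 1])

prec_f, prec_b, rec_f, rec_b = precision_at_one(pred, target)
tp, tn, fp, fn = evaluate_binary_class(pred, target)
print(prec_f.item(), rec_f.item(), tp.item(), fp.item())
```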
upsample_cluster_labels(xyz, sparse_seg_tensor, cluster_labels, voxel_size = None): 660 | 661 | upsampled_seg_labels = [] 662 | 663 | cluster_labels_all = defaultdict(list) 664 | for b_idx in range(len(xyz)): 665 | sparse_xyz = sparse_seg_tensor.coordinates_at(b_idx).cuda() * voxel_size 666 | seg_labels = sparse_seg_tensor.features_at(b_idx) 667 | sqr_dist = pairwise_distance(xyz[b_idx].cuda(), sparse_xyz, normalized=False).squeeze(0) 668 | sqr_dist, idx = torch.topk(sqr_dist, 1, dim = -1, largest=False, sorted=False) 669 | 670 | for cluster in cluster_labels[str(b_idx)]: 671 | cluster_indices = torch.nonzero(idx.reshape(-1)[:,None] == cluster) 672 | 673 | cluster_labels_all[str(b_idx)].append(cluster_indices[:,0]) 674 | 675 | return cluster_labels_all 676 | 677 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | coloredlogs 2 | tqdm 3 | pyyaml 4 | tensorboardx 5 | matplotlib -------------------------------------------------------------------------------- /scripts/download_data.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | DATASET=$1 4 | 5 | function download() { 6 | if [ ! -d "data" ]; then 7 | mkdir -p "data" 8 | fi 9 | cd data 10 | 11 | url="https://share.phys.ethz.ch/~gsg/weakly_supervised_3D_rigid_scene_flow/data/" 12 | 13 | extension=".tar" 14 | 15 | wget --no-check-certificate --show-progress "$url$DATASET$extension" 16 | tar -xf "$DATASET$extension" 17 | rm "$DATASET$extension" 18 | cd ../ 19 | 20 | 21 | } 22 | 23 | 24 | function download_all() { 25 | 26 | for DATASET in "stereo_kitti" "lidar_kitti" "semantic_kitti" "waymo_open" "flying_things_3d" 27 | do 28 | download 29 | done 30 | } 31 | 32 | function main() { 33 | if [ -z "$DATASET" ]; 34 | then 35 | echo "No dataset selcted. All data will be downloaded" 36 | download_all 37 | elif [ "$DATASET" == "stereo_kitti" ] || [ "$DATASET" == "lidar_kitti" ] || [ "$DATASET" == "semantic_kitti" ] || [ "$DATASET" == "waymo_open" ] || [ "$DATASET" == "flying_things_3d" ] 38 | then 39 | download 40 | else 41 | echo "Wrong dataset selected must be one of [stereo_kitti, lidar_kitti, waymo_open, semantic_kitti, flying_things_3d]." 42 | fi 43 | } 44 | 45 | main; 46 | -------------------------------------------------------------------------------- /scripts/download_pretrained_models.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | 4 | function download_models() { 5 | if [ ! -d "logs" ]; then 6 | mkdir -p "logs" 7 | fi 8 | cd logs 9 | 10 | url="https://share.phys.ethz.ch/~gsg/weakly_supervised_3D_rigid_scene_flow/pretrained_models/" 11 | 12 | model_file="pretrained_models.tar" 13 | 14 | wget --no-check-certificate --show-progress "$url$model_file" 15 | tar -xf "$model_file" 16 | rm "$model_file" 17 | cd ../ 18 | } 19 | 20 | 21 | download_models; 22 | -------------------------------------------------------------------------------- /scripts/download_pretrained_models_ablations.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | 4 | function download_models() { 5 | if [ ! 
-d "logs" ]; then 6 | mkdir -p "logs" 7 | fi 8 | cd logs 9 | 10 | url="https://share.phys.ethz.ch/~gsg/weakly_supervised_3D_rigid_scene_flow/pretrained_models/" 11 | 12 | model_file="pretrained_models_ablations.tar" 13 | 14 | wget --no-check-certificate --show-progress "$url$model_file" 15 | tar -xf "$model_file" 16 | rm "$model_file" 17 | cd ../ 18 | } 19 | 20 | 21 | download_models; 22 | -------------------------------------------------------------------------------- /train.py: -------------------------------------------------------------------------------- 1 | import os 2 | import gc 3 | import shutil 4 | import torch 5 | import time 6 | import argparse 7 | import yaml 8 | import copy 9 | import glob 10 | import numpy as np 11 | from datetime import datetime 12 | from tqdm import tqdm 13 | from tensorboardX import SummaryWriter 14 | 15 | import lib.config as config 16 | from lib.utils import n_model_parameters, save_checkpoint, dict_all_to_device, load_checkpoint 17 | from lib.data import make_data_loader 18 | from lib.logger import prepare_logger 19 | 20 | # Set the random seeds for repeatability 21 | np.random.seed(41) 22 | torch.manual_seed(41) 23 | if torch.cuda.is_available(): 24 | torch.cuda.manual_seed(41) 25 | 26 | def main(cfg, config_name): 27 | """ 28 | Main training function: after preparing the data loaders, model, optimizer, and trainer, 29 | start with the training process. 30 | 31 | Args: 32 | cfg (dict): current configuration parameters 33 | config_name (str): path to the config file 34 | """ 35 | 36 | # Create the output dir if it does not exist 37 | if not os.path.exists(cfg['misc']['log_dir']): 38 | os.makedirs(cfg['misc']['log_dir']) 39 | 40 | # Initialize the model 41 | model = config.get_model(cfg) 42 | model = model.cuda() 43 | 44 | # Get data loader 45 | train_loader = make_data_loader(cfg, phase='train') 46 | val_loader = make_data_loader(cfg, phase='val') 47 | 48 | # Log directory 49 | dataset_name = cfg["data"]["dataset"] 50 | 51 | now = datetime.now().strftime("%y_%m_%d-%H_%M_%S_%f") 52 | now += "__Method_" + str(cfg['method']['backbone']) 53 | now += "__Pretrained_" if cfg['network']['use_pretrained'] and cfg['network']['pretrained_path'] else '' 54 | if cfg['method']['flow']: now += "__Flow_" 55 | if cfg['method']['ego_motion']: now += "__Ego_" 56 | if cfg['method']['semantic']: now += "__Sem_" 57 | now += "__Rem_Ground_" if cfg['data']['remove_ground'] else '' 58 | now += "__VoxSize_" + str(cfg['misc']["voxel_size"]) 59 | now += "__Pts_" + str(cfg['misc']["num_points"]) 60 | path2log = os.path.join(cfg['misc']['log_dir'],"logs_" + dataset_name, now) 61 | 62 | logger, checkpoint_dir = prepare_logger(cfg, path2log) 63 | tboard_logger = SummaryWriter(path2log) 64 | 65 | # Output number of model parameters 66 | logger.info("Parameter Count: {:d}".format(n_model_parameters(model))) 67 | 68 | # Output torch and cuda version 69 | logger.info('Torch version: {}'.format(torch.__version__)) 70 | logger.info('CUDA version: {}'.format(torch.version.cuda)) 71 | 72 | # Save config file that was used for this experiment 73 | with open(os.path.join(path2log, config_name.split(os.sep)[-1]),'w') as outfile: 74 | yaml.dump(cfg, outfile, default_flow_style=False, allow_unicode=True) 75 | 76 | # Get optimizer and trainer 77 | optimizer = config.get_optimizer(cfg, model) 78 | scheduler = config.get_scheduler(cfg, optimizer) 79 | 80 | 81 | # Parameters determining the saving and validation interval (if positive denotes iteration if negative epoch) 82 | stat_interval = 
cfg['train']['stat_interval'] 83 | stat_interval = stat_interval if stat_interval > 0 else abs(stat_interval* len(train_loader)) 84 | 85 | chkpt_interval = cfg['train']['chkpt_interval'] 86 | chkpt_interval = chkpt_interval if chkpt_interval > 0 else abs(chkpt_interval* len(train_loader)) 87 | 88 | val_interval = cfg['train']['val_interval'] 89 | val_interval = val_interval if val_interval > 0 else abs(val_interval* len(train_loader)) 90 | 91 | # if not a pretrained model epoch and iterations should be -1 92 | metric_val_best = np.inf 93 | running_metrics = {} 94 | running_losses = {} 95 | epoch_it = -1 96 | total_it = -1 97 | 98 | # Load the pretrained weights 99 | if cfg['network']['use_pretrained'] and cfg['network']['pretrained_path']: 100 | model, optimizer, scheduler, epoch_it, total_it, metric_val_best = load_checkpoint(model, optimizer, scheduler, filename=cfg['network']['pretrained_path']) 101 | 102 | # Find previous tensorboard files and copy them 103 | tb_files = glob.glob(os.sep.join(cfg['network']['pretrained_path'].split(os.sep)[:-1]) + '/events.*') 104 | for tb_file in tb_files: 105 | shutil.copy(tb_file, os.path.join(path2log, tb_file.split(os.sep)[-1])) 106 | 107 | 108 | # Initialize the trainer 109 | device = torch.device('cuda' if (torch.cuda.is_available() and cfg['misc']['use_gpu']) else 'cpu') 110 | trainer = config.get_trainer(cfg, model, device) 111 | acc_iter_size = cfg['train']['acc_iter_size'] 112 | 113 | # Training loop 114 | while epoch_it < cfg['train']['max_epoch']: 115 | epoch_it += 1 116 | lr = scheduler.get_last_lr() 117 | logger.info('Training epoch: {}, LR: {} '.format(epoch_it, lr)) 118 | gc.collect() 119 | 120 | train_loader_iter = train_loader.__iter__() 121 | start = time.time() 122 | tbar = tqdm(total=len(train_loader) // acc_iter_size, ncols=100) 123 | 124 | for it in range(len(train_loader) // acc_iter_size): 125 | optimizer.zero_grad() 126 | total_it += 1 127 | batch_metrics = {} 128 | batch_losses = {} 129 | 130 | 131 | for iter_idx in range(acc_iter_size): 132 | 133 | batch = train_loader_iter.next() 134 | 135 | dict_all_to_device(batch, device) 136 | losses, metrics, total_loss = trainer.train_step(batch) 137 | 138 | total_loss.backward() 139 | 140 | # Save the running metrics and losses 141 | if not batch_metrics: 142 | batch_metrics = copy.deepcopy(metrics) 143 | else: 144 | for key, value in metrics.items(): 145 | batch_metrics[key] += value 146 | 147 | if not batch_losses: 148 | batch_losses = copy.deepcopy(losses) 149 | else: 150 | for key, value in losses.items(): 151 | batch_losses[key] += value 152 | 153 | 154 | # Compute the mean value of the metrics and losses of the batch 155 | for key, value in batch_metrics.items(): 156 | batch_metrics[key] = value / acc_iter_size 157 | 158 | for key, value in batch_losses.items(): 159 | batch_losses[key] = value / acc_iter_size 160 | 161 | optimizer.step() 162 | torch.cuda.empty_cache() 163 | 164 | tbar.set_description('Loss: {:.3g}'.format(batch_losses['total_loss'])) 165 | tbar.update(1) 166 | 167 | # Save the running metrics and losses 168 | if not running_metrics: 169 | running_metrics = copy.deepcopy(batch_metrics) 170 | else: 171 | for key, value in batch_metrics.items(): 172 | running_metrics[key] += value 173 | 174 | if not running_losses: 175 | running_losses = copy.deepcopy(batch_losses) 176 | else: 177 | for key, value in batch_losses.items(): 178 | running_losses[key] += value 179 | 180 | # Logs 181 | if total_it % stat_interval == stat_interval - 1: 182 | # Print / save logs 183 | 
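The training loop above accumulates gradients over `acc_iter_size` forward/backward passes before each optimizer step. The pattern in isolation (a generic PyTorch sketch, not repository code) is:

```python
import torch

model = torch.nn.Linear(3, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
acc_iter_size = 4

optimizer.zero_grad()
for _ in range(acc_iter_size):
    batch = torch.rand(8, 3)
    loss = model(batch).pow(2).mean()
    loss.backward()           # gradients of the group are summed in .grad
optimizer.step()              # a single parameter update for the whole group
```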
logger.info("Epoch {0:d} - It. {1:d}: loss = {2:.3f}".format( 184 | epoch_it, total_it, running_losses['total_loss'] / stat_interval 185 | ) 186 | ) 187 | 188 | for key, value in running_losses.items(): 189 | tboard_logger.add_scalar("Train/{}".format(key), value / stat_interval, total_it) 190 | # Reinitialize the values 191 | running_losses[key] = 0 192 | 193 | for key, value in running_metrics.items(): 194 | tboard_logger.add_scalar("Train/{}".format(key), value / stat_interval, total_it) 195 | # Reinitialize the values 196 | running_metrics[key] = 0 197 | 198 | start = time.time() 199 | 200 | 201 | # Run validation 202 | if total_it % val_interval == val_interval - 1: 203 | logger.info("Starting the validation") 204 | val_losses, val_metrics = trainer.validate(val_loader) 205 | 206 | for key, value in val_losses.items(): 207 | tboard_logger.add_scalar("Val/{}".format(key), value, total_it) 208 | 209 | 210 | for key, value in val_metrics.items(): 211 | tboard_logger.add_scalar("Val/{}".format(key), value, total_it) 212 | 213 | logger.info("VALIDATION -It. {0:d}: total loss: {1:.3f}.".format(total_it, val_losses['total_loss'])) 214 | 215 | 216 | if val_losses['total_loss'] < metric_val_best: 217 | metric_val_best = val_losses['total_loss'] 218 | logger.info('New best model (loss: {:.4f})'.format(metric_val_best)) 219 | 220 | save_checkpoint(os.path.join(path2log,'model_best.pt'), epoch=epoch_it, it=total_it, model=model, 221 | optimizer=optimizer,scheduler=scheduler,config=cfg, best_val=metric_val_best) 222 | else: 223 | save_checkpoint(os.path.join(path2log,'model_{}.pt'.format(total_it)), epoch=epoch_it, it=total_it, model=model, 224 | optimizer=optimizer, scheduler=scheduler, config=cfg, best_val=val_losses['total_loss']) 225 | 226 | # After the epoch if finished update the scheduler 227 | scheduler.step() 228 | 229 | # Quit after the maximum number of epochs is reached 230 | logger.info('Training completed after {} Epochs ({} it) with best val metric ({})={}'.format(epoch_it, it, model_selection_metric, metric_val_best)) 231 | 232 | 233 | if __name__ == "__main__": 234 | 235 | # Parse the command line arguments 236 | parser = argparse.ArgumentParser() 237 | parser.add_argument('config', type=str, help= 'Path to the config file.') 238 | args = parser.parse_args() 239 | 240 | # Combine the two config files 241 | cfg = config.get_config(args.config, 'configs/default.yaml') 242 | 243 | main(cfg, args.config) 244 | -------------------------------------------------------------------------------- /utils/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/utils/__init__.py -------------------------------------------------------------------------------- /utils/chamfer_distance/__init__.py: -------------------------------------------------------------------------------- 1 | from .chamfer_distance import ChamferDistance 2 | -------------------------------------------------------------------------------- /utils/chamfer_distance/__pycache__/__init__.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/utils/chamfer_distance/__pycache__/__init__.cpython-37.pyc -------------------------------------------------------------------------------- /utils/chamfer_distance/__pycache__/chamfer_distance.cpython-37.pyc: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/zgojcic/Rigid3DSceneFlow/ee064203873951ee42e058c35924878de8446eb8/utils/chamfer_distance/__pycache__/chamfer_distance.cpython-37.pyc -------------------------------------------------------------------------------- /utils/chamfer_distance/chamfer_distance.cpp: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | // CUDA forward declarations 4 | int ChamferDistanceKernelLauncher( 5 | const int b, const int n, 6 | const float* xyz, 7 | const int m, 8 | const float* xyz2, 9 | float* result, 10 | int* result_i, 11 | float* result2, 12 | int* result2_i); 13 | 14 | int ChamferDistanceGradKernelLauncher( 15 | const int b, const int n, 16 | const float* xyz1, 17 | const int m, 18 | const float* xyz2, 19 | const float* grad_dist1, 20 | const int* idx1, 21 | const float* grad_dist2, 22 | const int* idx2, 23 | float* grad_xyz1, 24 | float* grad_xyz2); 25 | 26 | 27 | void chamfer_distance_forward_cuda( 28 | const at::Tensor xyz1, 29 | const at::Tensor xyz2, 30 | const at::Tensor dist1, 31 | const at::Tensor dist2, 32 | const at::Tensor idx1, 33 | const at::Tensor idx2) 34 | { 35 | ChamferDistanceKernelLauncher(xyz1.size(0), xyz1.size(1), xyz1.data(), 36 | xyz2.size(1), xyz2.data(), 37 | dist1.data(), idx1.data(), 38 | dist2.data(), idx2.data()); 39 | } 40 | 41 | void chamfer_distance_backward_cuda( 42 | const at::Tensor xyz1, 43 | const at::Tensor xyz2, 44 | at::Tensor gradxyz1, 45 | at::Tensor gradxyz2, 46 | at::Tensor graddist1, 47 | at::Tensor graddist2, 48 | at::Tensor idx1, 49 | at::Tensor idx2) 50 | { 51 | ChamferDistanceGradKernelLauncher(xyz1.size(0), xyz1.size(1), xyz1.data(), 52 | xyz2.size(1), xyz2.data(), 53 | graddist1.data(), idx1.data(), 54 | graddist2.data(), idx2.data(), 55 | gradxyz1.data(), gradxyz2.data()); 56 | } 57 | 58 | 59 | void nnsearch( 60 | const int b, const int n, const int m, 61 | const float* xyz1, 62 | const float* xyz2, 63 | float* dist, 64 | int* idx) 65 | { 66 | for (int i = 0; i < b; i++) { 67 | for (int j = 0; j < n; j++) { 68 | const float x1 = xyz1[(i*n+j)*3+0]; 69 | const float y1 = xyz1[(i*n+j)*3+1]; 70 | const float z1 = xyz1[(i*n+j)*3+2]; 71 | double best = 0; 72 | int besti = 0; 73 | for (int k = 0; k < m; k++) { 74 | const float x2 = xyz2[(i*m+k)*3+0] - x1; 75 | const float y2 = xyz2[(i*m+k)*3+1] - y1; 76 | const float z2 = xyz2[(i*m+k)*3+2] - z1; 77 | const double d=x2*x2+y2*y2+z2*z2; 78 | if (k==0 || d < best){ 79 | best = d; 80 | besti = k; 81 | } 82 | } 83 | dist[i*n+j] = best; 84 | idx[i*n+j] = besti; 85 | } 86 | } 87 | } 88 | 89 | 90 | void chamfer_distance_forward( 91 | const at::Tensor xyz1, 92 | const at::Tensor xyz2, 93 | const at::Tensor dist1, 94 | const at::Tensor dist2, 95 | const at::Tensor idx1, 96 | const at::Tensor idx2) 97 | { 98 | const int batchsize = xyz1.size(0); 99 | const int n = xyz1.size(1); 100 | const int m = xyz2.size(1); 101 | 102 | const float* xyz1_data = xyz1.data(); 103 | const float* xyz2_data = xyz2.data(); 104 | float* dist1_data = dist1.data(); 105 | float* dist2_data = dist2.data(); 106 | int* idx1_data = idx1.data(); 107 | int* idx2_data = idx2.data(); 108 | 109 | nnsearch(batchsize, n, m, xyz1_data, xyz2_data, dist1_data, idx1_data); 110 | nnsearch(batchsize, m, n, xyz2_data, xyz1_data, dist2_data, idx2_data); 111 | } 112 | 113 | 114 | void chamfer_distance_backward( 115 | const at::Tensor xyz1, 116 | const at::Tensor xyz2, 117 | at::Tensor gradxyz1, 118 
| at::Tensor gradxyz2, 119 | at::Tensor graddist1, 120 | at::Tensor graddist2, 121 | at::Tensor idx1, 122 | at::Tensor idx2) 123 | { 124 | const int b = xyz1.size(0); 125 | const int n = xyz1.size(1); 126 | const int m = xyz2.size(1); 127 | 128 | const float* xyz1_data = xyz1.data(); 129 | const float* xyz2_data = xyz2.data(); 130 | float* gradxyz1_data = gradxyz1.data(); 131 | float* gradxyz2_data = gradxyz2.data(); 132 | float* graddist1_data = graddist1.data(); 133 | float* graddist2_data = graddist2.data(); 134 | const int* idx1_data = idx1.data(); 135 | const int* idx2_data = idx2.data(); 136 | 137 | for (int i = 0; i < b*n*3; i++) 138 | gradxyz1_data[i] = 0; 139 | for (int i = 0; i < b*m*3; i++) 140 | gradxyz2_data[i] = 0; 141 | for (int i = 0;i < b; i++) { 142 | for (int j = 0; j < n; j++) { 143 | const float x1 = xyz1_data[(i*n+j)*3+0]; 144 | const float y1 = xyz1_data[(i*n+j)*3+1]; 145 | const float z1 = xyz1_data[(i*n+j)*3+2]; 146 | const int j2 = idx1_data[i*n+j]; 147 | 148 | const float x2 = xyz2_data[(i*m+j2)*3+0]; 149 | const float y2 = xyz2_data[(i*m+j2)*3+1]; 150 | const float z2 = xyz2_data[(i*m+j2)*3+2]; 151 | const float g = graddist1_data[i*n+j]*2; 152 | 153 | gradxyz1_data[(i*n+j)*3+0] += g*(x1-x2); 154 | gradxyz1_data[(i*n+j)*3+1] += g*(y1-y2); 155 | gradxyz1_data[(i*n+j)*3+2] += g*(z1-z2); 156 | gradxyz2_data[(i*m+j2)*3+0] -= (g*(x1-x2)); 157 | gradxyz2_data[(i*m+j2)*3+1] -= (g*(y1-y2)); 158 | gradxyz2_data[(i*m+j2)*3+2] -= (g*(z1-z2)); 159 | } 160 | for (int j = 0; j < m; j++) { 161 | const float x1 = xyz2_data[(i*m+j)*3+0]; 162 | const float y1 = xyz2_data[(i*m+j)*3+1]; 163 | const float z1 = xyz2_data[(i*m+j)*3+2]; 164 | const int j2 = idx2_data[i*m+j]; 165 | const float x2 = xyz1_data[(i*n+j2)*3+0]; 166 | const float y2 = xyz1_data[(i*n+j2)*3+1]; 167 | const float z2 = xyz1_data[(i*n+j2)*3+2]; 168 | const float g = graddist2_data[i*m+j]*2; 169 | gradxyz2_data[(i*m+j)*3+0] += g*(x1-x2); 170 | gradxyz2_data[(i*m+j)*3+1] += g*(y1-y2); 171 | gradxyz2_data[(i*m+j)*3+2] += g*(z1-z2); 172 | gradxyz1_data[(i*n+j2)*3+0] -= (g*(x1-x2)); 173 | gradxyz1_data[(i*n+j2)*3+1] -= (g*(y1-y2)); 174 | gradxyz1_data[(i*n+j2)*3+2] -= (g*(z1-z2)); 175 | } 176 | } 177 | } 178 | 179 | 180 | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) { 181 | m.def("forward", &chamfer_distance_forward, "ChamferDistance forward"); 182 | m.def("forward_cuda", &chamfer_distance_forward_cuda, "ChamferDistance forward (CUDA)"); 183 | m.def("backward", &chamfer_distance_backward, "ChamferDistance backward"); 184 | m.def("backward_cuda", &chamfer_distance_backward_cuda, "ChamferDistance backward (CUDA)"); 185 | } 186 | -------------------------------------------------------------------------------- /utils/chamfer_distance/chamfer_distance.cu: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | #include 4 | #include 5 | 6 | __global__ 7 | void ChamferDistanceKernel( 8 | int b, 9 | int n, 10 | const float* xyz, 11 | int m, 12 | const float* xyz2, 13 | float* result, 14 | int* result_i) 15 | { 16 | const int batch=512; 17 | __shared__ float buf[batch*3]; 18 | for (int i=blockIdx.x;ibest){ 130 | result[(i*n+j)]=best; 131 | result_i[(i*n+j)]=best_i; 132 | } 133 | } 134 | __syncthreads(); 135 | } 136 | } 137 | } 138 | 139 | void ChamferDistanceKernelLauncher( 140 | const int b, const int n, 141 | const float* xyz, 142 | const int m, 143 | const float* xyz2, 144 | float* result, 145 | int* result_i, 146 | float* result2, 147 | int* result2_i) 148 | { 149 | 
ChamferDistanceKernel<<>>(b, n, xyz, m, xyz2, result, result_i); 150 | ChamferDistanceKernel<<>>(b, m, xyz2, n, xyz, result2, result2_i); 151 | 152 | cudaError_t err = cudaGetLastError(); 153 | if (err != cudaSuccess) 154 | printf("error in chamfer distance updateOutput: %s\n", cudaGetErrorString(err)); 155 | } 156 | 157 | 158 | __global__ 159 | void ChamferDistanceGradKernel( 160 | int b, int n, 161 | const float* xyz1, 162 | int m, 163 | const float* xyz2, 164 | const float* grad_dist1, 165 | const int* idx1, 166 | float* grad_xyz1, 167 | float* grad_xyz2) 168 | { 169 | for (int i = blockIdx.x; i>>(b, n, xyz1, m, xyz2, grad_dist1, idx1, grad_xyz1, grad_xyz2); 204 | ChamferDistanceGradKernel<<>>(b, m, xyz2, n, xyz1, grad_dist2, idx2, grad_xyz2, grad_xyz1); 205 | 206 | cudaError_t err = cudaGetLastError(); 207 | if (err != cudaSuccess) 208 | printf("error in chamfer distance get grad: %s\n", cudaGetErrorString(err)); 209 | } 210 | -------------------------------------------------------------------------------- /utils/chamfer_distance/chamfer_distance.py: -------------------------------------------------------------------------------- 1 | 2 | import torch 3 | 4 | from torch.utils.cpp_extension import load 5 | cd = load(name="cd", 6 | sources=["utils/chamfer_distance/chamfer_distance.cpp", 7 | "utils/chamfer_distance/chamfer_distance.cu"], verbose=True) 8 | 9 | class ChamferDistanceFunction(torch.autograd.Function): 10 | @staticmethod 11 | def forward(ctx, xyz1, xyz2): 12 | batchsize, n, _ = xyz1.size() 13 | _, m, _ = xyz2.size() 14 | xyz1 = xyz1.contiguous() 15 | xyz2 = xyz2.contiguous() 16 | dist1 = torch.zeros(batchsize, n) 17 | dist2 = torch.zeros(batchsize, m) 18 | 19 | idx1 = torch.zeros(batchsize, n, dtype=torch.int) 20 | idx2 = torch.zeros(batchsize, m, dtype=torch.int) 21 | 22 | if not xyz1.is_cuda: 23 | cd.forward(xyz1, xyz2, dist1, dist2, idx1, idx2) 24 | else: 25 | dist1 = dist1.cuda() 26 | dist2 = dist2.cuda() 27 | idx1 = idx1.cuda() 28 | idx2 = idx2.cuda() 29 | cd.forward_cuda(xyz1, xyz2, dist1, dist2, idx1, idx2) 30 | 31 | ctx.save_for_backward(xyz1, xyz2, idx1, idx2) 32 | 33 | return dist1, dist2 34 | 35 | @staticmethod 36 | def backward(ctx, graddist1, graddist2): 37 | xyz1, xyz2, idx1, idx2 = ctx.saved_tensors 38 | 39 | graddist1 = graddist1.contiguous() 40 | graddist2 = graddist2.contiguous() 41 | 42 | gradxyz1 = torch.zeros(xyz1.size()) 43 | gradxyz2 = torch.zeros(xyz2.size()) 44 | 45 | if not graddist1.is_cuda: 46 | cd.backward(xyz1, xyz2, gradxyz1, gradxyz2, graddist1, graddist2, idx1, idx2) 47 | else: 48 | gradxyz1 = gradxyz1.cuda() 49 | gradxyz2 = gradxyz2.cuda() 50 | cd.backward_cuda(xyz1, xyz2, gradxyz1, gradxyz2, graddist1, graddist2, idx1, idx2) 51 | 52 | return gradxyz1, gradxyz2 53 | 54 | 55 | class ChamferDistance(torch.nn.Module): 56 | def forward(self, xyz1, xyz2): 57 | return ChamferDistanceFunction.apply(xyz1, xyz2) 58 | -------------------------------------------------------------------------------- /utils/chamfer_distance/readme.txt: -------------------------------------------------------------------------------- 1 | git clone //github.com/chrdiller/pyTorchChamferDistance.git 2 | --------------------------------------------------------------------------------
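Finally, a minimal usage sketch of the bundled Chamfer distance module. The C++/CUDA extension shown above is JIT-compiled by `torch.utils.cpp_extension.load` on first import; a GPU is assumed in this snippet, but the CPU path follows the same interface:

```python
import torch
from utils.chamfer_distance import ChamferDistance

chamfer_dist = ChamferDistance()

xyz1 = torch.rand(1, 1024, 3).cuda()
xyz2 = torch.rand(1, 2048, 3).cuda()

# Squared distances from each point to its nearest neighbour in the other cloud.
dist1, dist2 = chamfer_dist(xyz1, xyz2)
loss = dist1.mean() + dist2.mean()
print(loss.item())
```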