├── .gitignore ├── .vscode └── settings.json ├── LICENSE.md ├── README.md ├── detection_3d ├── __init__.py ├── create_dataset_lists.py ├── data_preprocessing │ └── pandaset_tools │ │ ├── helpers.py │ │ ├── preprocess_data.py │ │ ├── transform.py │ │ └── visualize_data.py ├── detection_dataset.py ├── losses.py ├── metrics.py ├── model.py ├── parameters.py ├── tools │ ├── augmentation_tools.py │ ├── detection_helpers.py │ ├── file_io.py │ ├── statics.py │ ├── summary_helpers.py │ ├── training_helpers.py │ └── visualization_tools.py ├── train.py └── validation_inferece.py ├── pictures ├── box_parametrization.png ├── result.png └── topview.png └── setup.py /.gitignore: -------------------------------------------------------------------------------- 1 | *.pyc 2 | *~ 3 | *.txt 4 | dataset 5 | *-INFO 6 | log 7 | inference 8 | *.bin 9 | 10 | -------------------------------------------------------------------------------- /.vscode/settings.json: -------------------------------------------------------------------------------- 1 | { 2 | "python.formatting.provider": "black" 3 | } -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | Copyright 2020 Denis Tananaev 2 | 3 | Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: 4 | 5 | The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. 6 | 7 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 8 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Dynamic object detection in LiDAR 2 | 3 | [![MIT License](https://img.shields.io/badge/License-MIT-green.svg)](https://github.com/Dtananaev/lidar_dynamic_objects_detection/blob/master/LICENSE.md) 4 | 5 | ## The result of the network (click on the image below) 6 | 7 | [![result](https://github.com/Dtananaev/lidar_dynamic_objects_detection/blob/master/pictures/result.png)](https://youtu.be/f_HZg9Cq-h4) 8 | The trained network weights can be downloaded here: [weights](https://drive.google.com/file/d/1m8N5m2WXATgFNw88BRqEbUieiyV7p3S0/view?usp=sharing). 
9 | ## Installation 10 | For Ubuntu 18.04 install the necessary dependencies: 11 | ``` 12 | sudo apt update 13 | sudo apt install python3-dev python3-pip python3-venv 14 | ``` 15 | Create a virtual environment and activate it: 16 | ``` 17 | python3 -m venv --system-site-packages ./venv 18 | source ./venv/bin/activate 19 | ``` 20 | Upgrade pip tools: 21 | ``` 22 | pip install --upgrade pip 23 | ``` 24 | Install TensorFlow 2.0 (for more details check the TensorFlow install tutorial: [tensorflow](https://www.tensorflow.org/install/pip)) 25 | ``` 26 | pip install --upgrade tensorflow-gpu 27 | ``` 28 | Clone this repository and then install it: 29 | ``` 30 | cd lidar_dynamic_objects_detection 31 | pip install -r requirements.txt 32 | pip install -e . 33 | ``` 34 | This should install all the necessary packages into your environment. 35 | 36 | ## The method 37 | 38 | The LiDAR point cloud is represented as a top view image where each pixel of the image corresponds to a 12.5x12.5 cm cell. For each grid cell 39 | we project a random point and take its height and intensity (a minimal sketch of this projection follows the image below). 40 | 

41 | ![Top view projection](pictures/topview.png) 42 | 
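A minimal numpy sketch of this projection, for illustration only: it is not the repository's `make_top_view_image` (whose exact channel layout may differ), and the function name `top_view_sketch` is hypothetical. Grid and voxel sizes are taken from `parameters.py`.
```python
import numpy as np

def top_view_sketch(lidar, grid_meters=(52.0, 104.0), voxel_size=(0.125, 0.125)):
    """lidar: (N, 4) array of [x, y, z, intensity], already shifted into the
    positive quadrant with the lidar_offset from parameters.py."""
    rows = int(grid_meters[0] / voxel_size[0])  # 416 cells of 12.5 cm
    cols = int(grid_meters[1] / voxel_size[1])  # 832 cells of 12.5 cm
    image = np.zeros((rows, cols, 2), np.float32)  # channels: height, intensity
    cells = np.floor(lidar[:, :2] / np.asarray(voxel_size)).astype(np.int32)
    # keep only the points that fall inside the grid
    inside = (cells[:, 0] >= 0) & (cells[:, 0] < rows) & (cells[:, 1] >= 0) & (cells[:, 1] < cols)
    lidar, cells = lidar[inside], cells[inside]
    # with shuffled points, the last write per cell wins, i.e. a random point per cell
    image[cells[:, 0], cells[:, 1], 0] = lidar[:, 2]  # height
    image[cells[:, 0], cells[:, 1], 1] = lidar[:, 3]  # intensity
    return image
```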

43 | We do direct regression of the 3D boxes: for each pixel of the image we regress a confidence between 0 and 1, 7 box parameters (dx_centroid, dy_centroid, z_centroid, width, height, dx_front, dy_front), and the class scores. 44 | 

45 | ![Box parametrization](pictures/box_parametrization.png) 46 | 
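To make the parametrization concrete, here is a small numpy sketch of how one cell's regressed values can be decoded back into a yawed 3D box. It mirrors the logic of `get_boxes_from_box_grid` in `detection_3d/tools/detection_helpers.py` (the cell size is `bbox_voxel_size` from `parameters.py`); the helper name `decode_cell` is hypothetical.
```python
import numpy as np

def decode_cell(cell_xy, regressed, bbox_voxel_size=(0.25, 0.25)):
    """Recover [x, y, z, length, width, height, yaw] from one grid cell.
    cell_xy: integer (row, col) of the cell; regressed: the 7 regressed values."""
    dx_centroid, dy_centroid, z_centroid, width, height, dx_front, dy_front = regressed
    corner = np.asarray(cell_xy, np.float32) * np.asarray(bbox_voxel_size, np.float32)
    center_xy = corner + (dx_centroid, dy_centroid)  # box centroid in meters
    front_xy = corner + (dx_front, dy_front)         # midpoint of the front face
    delta = front_xy - center_xy
    length = 2.0 * np.linalg.norm(delta)  # the front point sits half a length from the centroid
    yaw = np.arctan2(delta[1], delta[0])  # heading from the centroid->front vector
    return np.array([center_xy[0], center_xy[1], z_centroid, length, width, height, yaw])
```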

47 | We apply a binary cross-entropy loss for the confidence, an L1 loss for all box parameter regression, and a softmax loss for the class prediction. 48 | The confidence map is computed from the ground truth boxes. We assign confidence 1.0 to the cell closest to the box centroid (green on the image above) 49 | and 0 otherwise. The confidence loss is applied to all pixels; the other losses are applied only to those pixels whose ground truth confidence is 1.0 (a toy sketch of this masking is given at the end of this README). 50 | 51 | 52 | ## The dataset preparation 53 | We work with the Pandaset dataset, which can be downloaded from here: [Pandaset](https://pandaset.org/) 54 | Download and unpack all the data into a dataset folder (e.g. ~/dataset). 55 | The dataset should have the following folder structure: 56 | ``` bash 57 | dataset 58 | ├── 001 # The sequence number 59 | │ ├── annotations # Bounding boxes and semseg annotations 60 | | | ├──cuboids 61 | | | | ├──00.pkl.gz 62 | | | | └── ... 63 | | | ├──semseg 64 | | | ├──00.pkl.gz 65 | | | └── ... 66 | │ ├── camera # camera images 67 | | | ├──back_camera 68 | | | | ├──00.jpg 69 | | | | └── .. 70 | | | ├──front_camera 71 | | | └── ... 72 | │ ├── lidar # lidar data 73 | │ | ├── 00.pkl.gz 74 | │ | └── ... 75 | | ├── meta 76 | | | ├── gps.json 77 | | | ├── timestamps.json 78 | ├── 002 79 | └── ... 80 | ``` 81 | Preprocess the dataset with the following command: 82 | ``` 83 | cd lidar_dynamic_objects_detection/detection_3d/data_preprocessing/pandaset_tools 84 | python preprocess_data.py --dataset_dir <path_to_dataset> 85 | ``` 86 | Create the dataset lists: 87 | ``` 88 | cd lidar_dynamic_objects_detection/detection_3d/ 89 | python create_dataset_lists.py --dataset_dir <path_to_dataset> 90 | ``` 91 | This should create ```train.datatxt``` and ```val.datatxt``` in your dataset folder. 92 | Finally, set the dataset directory in ```parameters.py```. 93 | ## Train 94 | To train the network: 95 | ``` 96 | python train.py 97 | ``` 98 | To resume training: 99 | ``` 100 | python train.py --resume 101 | ``` 102 | The training can be monitored in TensorBoard: 103 | ``` 104 | tensorboard --logdir=log 105 | ``` 106 | ## Inference on the validation dataset 107 | To run inference on the validation dataset: 108 | ``` 109 | python validation_inference.py --dataset_file <path_to_dataset>/val.datatxt --output_dir <output_dir> --model_dir <model_dir> 110 | ``` 111 | The inference outputs 3D boxes and also visualizes them on the top view image. The visualized top view image (top) is concatenated with the ground truth top view image (bottom). 
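The toy sketch promised in the method section: a minimal numpy version of the confidence masking used in `detection_3d/losses.py`. The real code is TensorFlow and splits nine channels; only the objectness/L1 part is illustrated here, and `masked_loss_sketch` is a hypothetical name.
```python
import numpy as np

def masked_loss_sketch(gt_grid, pred_grid):
    """gt_grid, pred_grid: (H, W, 1 + K) arrays; channel 0 is the confidence
    (predictions assumed already in (0, 1)), the remaining K channels are box
    parameters."""
    gt_conf, gt_params = gt_grid[..., 0], gt_grid[..., 1:]
    p_conf, p_params = np.clip(pred_grid[..., 0], 1e-7, 1 - 1e-7), pred_grid[..., 1:]
    # binary cross-entropy on every pixel
    conf_loss = -(gt_conf * np.log(p_conf) + (1.0 - gt_conf) * np.log(1.0 - p_conf))
    # L1 regression loss only where the ground truth confidence is 1
    reg_loss = gt_conf * np.abs(gt_params - p_params).sum(axis=-1)  # gt_conf is the 0/1 mask
    return conf_loss.sum(), reg_loss.sum()
```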
112 | -------------------------------------------------------------------------------- /detection_3d/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Dtananaev/lidar_dynamic_objects_detection/3b8f3d5fcce0fb914bb83e5d43a3ca652739139e/detection_3d/__init__.py -------------------------------------------------------------------------------- /detection_3d/create_dataset_lists.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | import os 20 | import glob 21 | import numpy as np 22 | import argparse 23 | from detection_3d.tools.file_io import save_dataset_list 24 | 25 | 26 | class PandaDetectionDataset: 27 | def __init__(self, dataset_dir): 28 | self.dataset_dir = dataset_dir 29 | 30 | def get_data(self): 31 | search_string = os.path.join(self.dataset_dir, "*", "lidar_processed", "*.bin") 32 | lidar_list = np.asarray(sorted(glob.glob(search_string))) 33 | search_string = os.path.join(self.dataset_dir, "*", "bbox_processed", "*.txt") 34 | box_list = np.asarray(sorted(glob.glob(search_string))) 35 | data = np.concatenate((lidar_list[:, None], box_list[:, None],), axis=1,) 36 | data = [";".join(x) for x in data] 37 | return data 38 | 39 | def create_datasets_file(self): 40 | """ 41 | Creates the train.datatxt and val.datatxt files 42 | """ 43 | data_list = self.get_data() 44 | 45 | split_num = 80 * int(103 * 0.75)  # 75% of the 103 sequences for train, 80 frames per sequence 46 | print(f"split_num {split_num}") 47 | # Save train and validation dataset 48 | filename = os.path.join(self.dataset_dir, "train.datatxt") 49 | save_dataset_list(filename, data_list[:split_num]) 50 | print( 51 | f"The dataset of size {len(data_list[:split_num])} saved in {filename}." 52 | ) 53 | filename = os.path.join(self.dataset_dir, "val.datatxt") 54 | save_dataset_list(filename, data_list[split_num:]) 55 | print( 56 | f"The dataset of size {len(data_list[split_num:])} saved in {filename}." 
57 | ) 58 | 59 | 60 | if __name__ == "__main__": 61 | parser = argparse.ArgumentParser(description="Create Pandaset dataset list files.") 62 | parser.add_argument("--dataset_dir", default="dataset") 63 | args = parser.parse_args() 64 | dataset_creator = PandaDetectionDataset(args.dataset_dir) 65 | dataset_creator.create_datasets_file() 66 | -------------------------------------------------------------------------------- /detection_3d/data_preprocessing/pandaset_tools/helpers.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | import numpy as np 20 | 21 | labels = { 22 | "Cones": 0, 23 | "Towed Object": 1, 24 | "Semi-truck": 2, 25 | "Train": 3, 26 | "Temporary Construction Barriers": 4, 27 | "Rolling Containers": 5, 28 | "Animals - Other": 6, 29 | "Pylons": 7, 30 | "Emergency Vehicle": 8, 31 | "Motorcycle": 9, 32 | "Construction Signs": 10, 33 | "Medium-sized Truck": 11, 34 | "Other Vehicle - Uncommon": 12, 35 | "Tram / Subway": 13, 36 | "Road Barriers": 14, 37 | "Bus": 15, 38 | "Pedestrian with Object": 16, 39 | "Personal Mobility Device": 17, 40 | "Signs": 18, 41 | "Other Vehicle - Pedicab": 19, 42 | "Pedestrian": 20, 43 | "Car": 21, 44 | "Other Vehicle - Construction Vehicle": 22, 45 | "Bicycle": 23, 46 | "Motorized Scooter": 24, 47 | "Pickup Truck": 25, 48 | } 49 | 50 | 51 | def get_color(label): 52 | # Returns the RGB color for a Pandaset label id (see the labels dict above) 53 | color = np.asarray( 54 | [ 55 | [255, 229, 204], # "Cones": 0, 56 | [255, 255, 204], # "Towed Object": 1, 57 | [204, 204, 255], # "Semi-truck": 2, 58 | [255, 204, 204], # "Train": 3, 59 | [255, 204, 153], # "Temporary Construction Barriers": 4, 60 | [204, 255, 204], # "Rolling Containers": 5, 61 | [255, 204, 229], # "Animals - Other": 6, 62 | [153, 255, 153], # "Pylons": 7, 63 | [128, 128, 128], # "Emergency Vehicle": 8, 64 | [255, 255, 102], # "Motorcycle": 9, 65 | [255, 153, 51], # "Construction Signs": 10, 66 | [153, 153, 255], # "Medium-sized Truck": 11, 67 | [255, 255, 255], # "Other Vehicle - Uncommon": 12, 68 | [255, 102, 102], # "Tram / Subway": 13, 69 | [204, 102, 0], # "Road Barriers": 14, 70 | [0, 0, 255], # "Bus": 15, 71 | [255, 51, 153], # "Pedestrian with Object": 16, 72 | [153, 153, 0], # "Personal Mobility Device": 17, 73 | [255, 153, 51], # "Signs": 18, 74 | [128, 128, 128], # "Other Vehicle - Pedicab": 19, 75 | [204, 0, 102], # "Pedestrian": 20, 76 | [0, 255, 0], # "Car": 21, 77 | [0, 0, 102], # "Other Vehicle - Construction Vehicle": 22, 78 | [255, 255, 0], # "Bicycle": 23, 79 | [255, 255, 153], # "Motorized Scooter": 24, 80 | [51, 255, 255], # "Pickup Truck": 25, 81 | ] 82 | ) 83 | return color[int(label)] 84 | 85 | 86 | def make_xzyhwly(bboxes): 87 | """ 88 | Extracts labels and [c_x, c_y, c_z, length, width, height, yaw] rows from the raw Pandaset cuboid records 89 | """ 90 | label = bboxes[:, 1] 91 | yaw = bboxes[:, 2] 92 | c_x = bboxes[:, 5] 93 | c_y = bboxes[:, 6] 94 | c_z = bboxes[:, 7] 95 | length = bboxes[:, 8] 96 | width = bboxes[:, 9] 97 | height = bboxes[:, 10] 98 | new_boxes = np.asarray([c_x, c_y, c_z, length, width, height, yaw], dtype=np.float) 99 | return label, np.transpose(new_boxes) 100 | 101 | 102 | def filter_boxes(labels, bboxes_3d, orient_3d, lidar, threshold=20): 103 | labels_res = [] 104 | box_res = [] 105 | orient_res = [] 106 | for idx, box in enumerate(bboxes_3d): 107 | min_x = np.min(box[:, 0]) 108 | max_x = np.max(box[:, 0]) 109 | min_y = np.min(box[:, 1]) 110 | max_y = np.max(box[:, 1]) 111 | min_z = np.min(box[:, 2]) 112 | max_z = np.max(box[:, 2]) 113 | mask_x = (lidar[:, 0] >= min_x) & (lidar[:, 0] <= max_x) 114 | mask_y = (lidar[:, 1] >= min_y) & (lidar[:, 1] <= max_y) 115 | mask_z = (lidar[:, 2] >= min_z) & (lidar[:, 2] <= max_z) 116 | mask = mask_x & mask_y & mask_z 117 | result = np.sum(mask.astype(float)) 118 | if result > threshold: 119 | box_res.append(box) 120 | orient_res.append(orient_3d[idx]) 121 | labels_res.append(labels[idx]) 122 | return np.asarray(labels_res), np.asarray(box_res), np.asarray(orient_res) 123 | -------------------------------------------------------------------------------- /detection_3d/data_preprocessing/pandaset_tools/preprocess_data.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 
18 | """ 19 | import argparse 20 | import numpy as np 21 | import os 22 | import glob 23 | import pandas as pd 24 | from detection_3d.data_preprocessing.pandaset_tools.helpers import ( 25 | make_xzyhwly, 26 | filter_boxes, 27 | ) 28 | from detection_3d.tools.detection_helpers import ( 29 | make_eight_points_boxes, 30 | get_bboxes_parameters_from_points, 31 | ) 32 | import mayavi.mlab as mlab 33 | from tqdm import tqdm 34 | from detection_3d.tools.file_io import read_json, save_bboxes_to_file, save_lidar 35 | from detection_3d.data_preprocessing.pandaset_tools.transform import ( 36 | quaternion_to_euler, 37 | to_transform_matrix, 38 | transform_lidar_box_3d, 39 | ) 40 | from detection_3d.tools.visualization_tools import visualize_lidar, visualize_bboxes_3d 41 | 42 | 43 | def preprocess_data(dataset_dir): 44 | """ 45 | The function prepares training data from Pandaset. 46 | Arguments: 47 | dataset_dir: directory with Pandaset data 48 | """ 49 | 50 | # Get list of data samples 51 | search_string = os.path.join(dataset_dir, "*") 52 | seq_list = sorted(glob.glob(search_string)) 53 | for seq in tqdm(seq_list, desc="Process sequences", total=len(seq_list)): 54 | # Make output dirs for data 55 | lidar_out_dir = os.path.join(seq, "lidar_processed") 56 | bbox_out_dir = os.path.join(seq, "bbox_processed") 57 | os.makedirs(lidar_out_dir, exist_ok=True) 58 | os.makedirs(bbox_out_dir, exist_ok=True) 59 | search_string = os.path.join(seq, "lidar", "*.pkl.gz") 60 | lidar_list = sorted(glob.glob(search_string)) 61 | lidar_pose_path = os.path.join(seq, "lidar", "poses.json") 62 | lidar_pose = read_json(lidar_pose_path) 63 | for idx, lidar_path in enumerate(lidar_list): 64 | sample_idx = os.path.splitext(os.path.basename(lidar_path))[0].split(".")[0] 65 | # Get pose of the lidar 66 | translation = lidar_pose[idx]["position"] 67 | translation = np.asarray([translation[key] for key in translation]) 68 | rotation = lidar_pose[idx]["heading"] 69 | rotation = np.asarray([rotation[key] for key in rotation]) 70 | rotation = quaternion_to_euler(*rotation) 71 | Rt = to_transform_matrix(translation, rotation) 72 | 73 | # Get respective bboxes 74 | bbox_path = lidar_path.split("/") 75 | bbox_path[-2] = "annotations/cuboids" 76 | bbox_path = os.path.join(*bbox_path) 77 | 78 | # Load data 79 | lidar = np.asarray(pd.read_pickle(lidar_path)) 80 | # Get only lidar 0 (there is also lidar 1) 81 | lidar = lidar[lidar[:, -1] == 0] 82 | intensity = lidar[:, 3] 83 | lidar = transform_lidar_box_3d(lidar, Rt) 84 | # add intensity 85 | lidar = np.concatenate((lidar, intensity[:, None]), axis=-1) 86 | 87 | # Load bboxes 88 | bboxes = np.asarray(pd.read_pickle(bbox_path)) 89 | labels, bboxes = make_xzyhwly(bboxes) 90 | corners_3d, orientation_3d = make_eight_points_boxes(bboxes) 91 | corners_3d = np.asarray( 92 | [transform_lidar_box_3d(box, Rt) for box in corners_3d] 93 | ) 94 | orientation_3d = np.asarray( 95 | [transform_lidar_box_3d(box, Rt) for box in orientation_3d] 96 | ) 97 | # filter out boxes containing fewer than 20 lidar points 98 | labels, corners_3d, orientation_3d = filter_boxes( 99 | labels, corners_3d, orientation_3d, lidar 100 | ) 101 | centroid, width, length, height, yaw = get_bboxes_parameters_from_points( 102 | corners_3d 103 | ) 104 | 105 | # Save data 106 | lidar_filename = os.path.join(lidar_out_dir, sample_idx + ".bin") 107 | save_lidar(lidar_filename, lidar.astype(np.float32)) 108 | box_filename = os.path.join(bbox_out_dir, sample_idx + ".txt") 109 | save_bboxes_to_file( 110 | box_filename, centroid, 
width, length, height, yaw, labels 111 | ) 112 | 113 | 114 | if __name__ == "__main__": 115 | parser = argparse.ArgumentParser(description="Preprocess 3D pandaset.") 116 | parser.add_argument("--dataset_dir", default="../../dataset") 117 | args = parser.parse_args() 118 | preprocess_data(args.dataset_dir) 119 | -------------------------------------------------------------------------------- /detection_3d/data_preprocessing/pandaset_tools/transform.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | 20 | import numpy as np 21 | 22 | 23 | def quaternion_to_euler(w, x, y, z): 24 | """ 25 | Converts quaternions with components w, x, y, z into a tuple (roll, pitch, yaw) 26 | 27 | """ 28 | sinr_cosp = 2 * (w * x + y * z) 29 | cosr_cosp = 1 - 2 * (x ** 2 + y ** 2) 30 | roll = np.arctan2(sinr_cosp, cosr_cosp) 31 | 32 | sinp = 2 * (w * y - z * x) 33 | pitch = np.where(np.abs(sinp) >= 1, np.sign(sinp) * np.pi / 2, np.arcsin(sinp)) 34 | 35 | siny_cosp = 2 * (w * z + x * y) 36 | cosy_cosp = 1 - 2 * (y ** 2 + z ** 2) 37 | yaw = np.arctan2(siny_cosp, cosy_cosp) 38 | 39 | return roll, pitch, yaw 40 | 41 | 42 | # Calculates Rotation Matrix given euler angles. 
43 | def eulerAnglesToRotationMatrix(theta): 44 | 45 | R_x = np.array( 46 | [ 47 | [1, 0, 0], 48 | [0, np.cos(theta[0]), -np.sin(theta[0])], 49 | [0, np.sin(theta[0]), np.cos(theta[0])], 50 | ] 51 | ) 52 | 53 | R_y = np.array( 54 | [ 55 | [np.cos(theta[1]), 0, np.sin(theta[1])], 56 | [0, 1, 0], 57 | [-np.sin(theta[1]), 0, np.cos(theta[1])], 58 | ] 59 | ) 60 | 61 | R_z = np.array( 62 | [ 63 | [np.cos(theta[2]), -np.sin(theta[2]), 0], 64 | [np.sin(theta[2]), np.cos(theta[2]), 0], 65 | [0, 0, 1], 66 | ] 67 | ) 68 | 69 | R = np.dot(R_z, np.dot(R_y, R_x)) 70 | 71 | return R 72 | 73 | 74 | def to_transform_matrix(translation, rotation): 75 | Rt = np.eye(4) 76 | Rt[:3, :3] = eulerAnglesToRotationMatrix(rotation) 77 | Rt[:3, 3] = translation 78 | return Rt 79 | 80 | 81 | def transform_lidar_box_3d(lidar, Rt): 82 | rt_inv = np.linalg.inv(Rt) 83 | 84 | lidar_3d = lidar[:, :3] 85 | lidar_3d = np.transpose(lidar_3d) 86 | 87 | ones = np.ones_like(lidar_3d[0])[None, :] 88 | hom_coord = np.concatenate((lidar_3d, ones), axis=0) 89 | lidar_3d = np.dot(rt_inv, hom_coord) 90 | lidar_3d = np.transpose(lidar_3d)[:, :3] 91 | 92 | return lidar_3d 93 | -------------------------------------------------------------------------------- /detection_3d/data_preprocessing/pandaset_tools/visualize_data.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | import argparse 20 | import numpy as np 21 | import os 22 | import glob 23 | import pandas as pd 24 | from detection_3d.data_preprocessing.pandaset_tools.helpers import ( 25 | make_xzyhwly, 26 | filter_boxes, 27 | ) 28 | from detection_3d.tools.detection_helpers import ( 29 | make_eight_points_boxes, 30 | get_bboxes_parameters_from_points, 31 | ) 32 | import mayavi.mlab as mlab 33 | from tqdm import tqdm 34 | from detection_3d.tools.file_io import read_json 35 | from detection_3d.data_preprocessing.pandaset_tools.transform import ( 36 | quaternion_to_euler, 37 | to_transform_matrix, 38 | transform_lidar_box_3d, 39 | ) 40 | from detection_3d.tools.visualization_tools import visualize_lidar, visualize_bboxes_3d 41 | 42 | 43 | def visualize_data(dataset_dir): 44 | """ 45 | The function visualizes data from Pandaset. 
46 | Arguments: 47 | dataset_dir: directory with Pandaset data 48 | """ 49 | shift_lidar = [ 50 | 25, 51 | 50, 52 | 2.5, 53 | ] # The lidar origin is in the middle of the point cloud; we shift it to the top left corner of the top view image 54 | # the top view image covers the area of 50x100 meters around the car, where the lidar point cloud is most dense 55 | # Get list of data samples 56 | search_string = os.path.join(dataset_dir, "*") 57 | seq_list = sorted(glob.glob(search_string)) 58 | for seq in tqdm(seq_list, desc="Process sequences", total=len(seq_list)): 59 | search_string = os.path.join(seq, "lidar", "*.pkl.gz") 60 | lidar_list = sorted(glob.glob(search_string)) 61 | lidar_pose_path = os.path.join(seq, "lidar", "poses.json") 62 | lidar_pose = read_json(lidar_pose_path) 63 | for idx, lidar_path in enumerate(lidar_list): 64 | # Get pose of the lidar 65 | translation = lidar_pose[idx]["position"] 66 | translation = np.asarray([translation[key] for key in translation]) 67 | rotation = lidar_pose[idx]["heading"] 68 | rotation = np.asarray([rotation[key] for key in rotation]) 69 | rotation = quaternion_to_euler(*rotation) 70 | Rt = to_transform_matrix(translation, rotation) 71 | 72 | # Get respective bboxes 73 | bbox_path = lidar_path.split("/") 74 | bbox_path[-2] = "annotations/cuboids" 75 | bbox_path = os.path.join(*bbox_path) 76 | 77 | # Load data 78 | lidar = np.asarray(pd.read_pickle(lidar_path)) 79 | # Get only lidar 0 (there is also lidar 1) 80 | lidar = lidar[lidar[:, -1] == 0] 81 | intensity = lidar[:, 3] 82 | lidar = transform_lidar_box_3d(lidar, Rt) 83 | # add intensity 84 | lidar = np.concatenate((lidar, intensity[:, None]), axis=-1) 85 | 86 | # Load bboxes 87 | bboxes = np.asarray(pd.read_pickle(bbox_path)) 88 | labels, bboxes = make_xzyhwly(bboxes) 89 | corners_3d, orientation_3d = make_eight_points_boxes(bboxes) 90 | corners_3d = np.asarray( 91 | [transform_lidar_box_3d(box, Rt) for box in corners_3d] 92 | ) 93 | orientation_3d = np.asarray( 94 | [transform_lidar_box_3d(box, Rt) for box in orientation_3d] 95 | ) 96 | labels, corners_3d, orientation_3d = filter_boxes( 97 | labels, corners_3d, orientation_3d, lidar 98 | ) 99 | centroid, width, length, height, yaw = get_bboxes_parameters_from_points( 100 | corners_3d 101 | ) 102 | 103 | boxes_new = np.concatenate( 104 | ( 105 | centroid, 106 | length[:, None], 107 | width[:, None], 108 | height[:, None], 109 | yaw[:, None], 110 | ), 111 | axis=-1, 112 | ) 113 | lidar[:, :3] = lidar[:, :3] + shift_lidar 114 | 115 | corners_3d, orientation_3d = make_eight_points_boxes(boxes_new) 116 | corners_3d = corners_3d + shift_lidar 117 | orientation_3d = orientation_3d + shift_lidar 118 | figure = visualize_bboxes_3d(corners_3d, None, orientation_3d) 119 | figure = visualize_lidar(lidar, figure) 120 | mlab.show(1) 121 | input() 122 | mlab.close(figure) 123 | 124 | 125 | if __name__ == "__main__": 126 | parser = argparse.ArgumentParser(description="Visualize 3D Pandaset data.") 127 | parser.add_argument("--dataset_dir", default="../../dataset") 128 | args = parser.parse_args() 129 | visualize_data(args.dataset_dir) 130 | -------------------------------------------------------------------------------- /detection_3d/detection_dataset.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files 
(the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | import tensorflow as tf 20 | import numpy as np 21 | import argparse 22 | from tqdm import tqdm 23 | from detection_3d.parameters import Parameters 24 | from detection_3d.tools.file_io import load_dataset_list, load_lidar, load_bboxes 25 | from detection_3d.tools.detection_helpers import ( 26 | make_top_view_image, 27 | make_eight_points_boxes, 28 | get_bboxes_grid, 29 | ) 30 | from detection_3d.tools.visualization_tools import visualize_2d_boxes_on_top_image 31 | from detection_3d.tools.augmentation_tools import ( 32 | random_rotate_lidar_boxes, 33 | random_flip_x_lidar_boxes, 34 | random_flip_y_lidar_boxes, 35 | ) 36 | from PIL import Image 37 | 38 | 39 | class DetectionDataset: 40 | """ 41 | The dataset layer for the 3D detection experiment 42 | Arguments: 43 | param_settings: parameters of the experiment 44 | dataset_file: name of the .datatxt dataset file 45 | augmentation, shuffle: enable augmentation / shuffling of the data (True/False) 46 | """ 47 | 48 | def __init__(self, param_settings, dataset_file, augmentation=False, shuffle=False): 49 | # Fix the random seed for reproducibility 50 | self.seed = param_settings["seed"] 51 | np.random.seed(self.seed) 52 | 53 | self.augmentation = augmentation 54 | 55 | self.param_settings = param_settings 56 | self.dataset_file = dataset_file 57 | self.inputs_list = load_dataset_list( 58 | self.param_settings["dataset_dir"], dataset_file 59 | ) 60 | self.num_samples = len(self.inputs_list) 61 | self.num_it_per_epoch = int( 62 | self.num_samples / self.param_settings["batch_size"] 63 | ) 64 | self.output_types = [tf.float32, tf.float32, tf.string] 65 | 66 | ds = tf.data.Dataset.from_tensor_slices(self.inputs_list) 67 | 68 | if shuffle: 69 | ds = ds.shuffle(self.num_samples) 70 | ds = ds.map( 71 | map_func=lambda x: tf.py_function( 72 | self.load_data, [x], Tout=self.output_types 73 | ), 74 | num_parallel_calls=12, 75 | ) 76 | ds = ds.batch(self.param_settings["batch_size"]) 77 | ds = ds.prefetch(buffer_size=tf.data.experimental.AUTOTUNE) 78 | self.dataset = ds 79 | 80 | def load_data(self, data_input): 81 | """ 82 | Loads lidar points and bounding boxes and converts them to the top view training tensors 83 | Note: This is a numpy function (wrapped with tf.py_function). 
84 | """ 85 | lidar_file, bboxes_file = np.asarray(data_input).astype("U") 86 | 87 | lidar = load_lidar(lidar_file) 88 | bboxes = load_bboxes(bboxes_file) 89 | labels = bboxes[:, -1] 90 | lidar_corners_3d, _ = make_eight_points_boxes(bboxes[:, :-1]) 91 | if self.augmentation: 92 | np.random.shuffle(lidar) 93 | if np.random.uniform(0, 1) < 0.50: # 50% probability to flip over x axis 94 | lidar, lidar_corners_3d = random_flip_x_lidar_boxes( 95 | lidar, lidar_corners_3d 96 | ) 97 | if np.random.uniform(0, 1) < 0.50: # 50% probability to flip over y axis 98 | lidar, lidar_corners_3d = random_flip_y_lidar_boxes( 99 | lidar, lidar_corners_3d 100 | ) 101 | if np.random.uniform(0, 1) < 0.80: # 80% probability to rotate 102 | lidar, lidar_corners_3d = random_rotate_lidar_boxes( 103 | lidar, lidar_corners_3d 104 | ) 105 | 106 | # Shift lidar coordinates to the positive quadrant 107 | lidar_coord = np.asarray(self.param_settings["lidar_offset"], dtype=np.float32) 108 | lidar = lidar + lidar_coord 109 | lidar_corners_3d = lidar_corners_3d + lidar_coord[:3] 110 | # Process data 111 | top_view = make_top_view_image( 112 | lidar, self.param_settings["grid_meters"], self.param_settings["voxel_size"] 113 | ) 114 | box_grid = get_bboxes_grid( 115 | labels, 116 | lidar_corners_3d, 117 | self.param_settings["grid_meters"], 118 | self.param_settings["bbox_voxel_size"], 119 | ) 120 | return top_view, box_grid, lidar_file 121 | 122 | 123 | if __name__ == "__main__": 124 | parser = argparse.ArgumentParser(description="DatasetLayer.") 125 | parser.add_argument( 126 | "--dataset_file", 127 | type=str, 128 | help="the .datatxt dataset file to load", 129 | default="train.datatxt", 130 | ) 131 | args = parser.parse_args() 132 | 133 | param_settings = Parameters().settings 134 | train_dataset = DetectionDataset(param_settings, args.dataset_file) 135 | 136 | bbox_voxel_size = np.asarray(param_settings["bbox_voxel_size"], dtype=np.float32) 137 | grid_meters = np.array(param_settings["grid_meters"], dtype=np.float32) 138 | 139 | lidar_coord = np.array(param_settings["lidar_offset"], dtype=np.float32) 140 | 141 | for samples in tqdm(train_dataset.dataset, total=train_dataset.num_it_per_epoch): 142 | top_images, boxes_grid, lidar_file = samples 143 | print( 144 | f"lidar {top_images.shape}, boxes {boxes_grid.shape}, lidar_file {lidar_file}" 145 | ) 146 | 147 | top_view = ( 148 | visualize_2d_boxes_on_top_image( 149 | boxes_grid, top_images, grid_meters, bbox_voxel_size 150 | ) 151 | * 255 152 | ) 153 | img = Image.fromarray(top_view[0].astype("uint8")) 154 | img.save("result.png") 155 | input() 156 | -------------------------------------------------------------------------------- /detection_3d/losses.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | 20 | import tensorflow as tf 21 | import numpy as np 22 | from tensorflow.keras.losses import binary_crossentropy, sparse_categorical_crossentropy 23 | 24 | 25 | def detection_loss(gt_bboxes, pred_bboxes, num_classes=26): 26 | 27 | # gt_bboxes: [batch, x_grid, y_grid, 9] 28 | # channels: (objectness, delta_x, delta_y, orient_x, orient_y, z, width, height, label) 29 | ( 30 | gt_objectness, 31 | gt_delta_xy, 32 | gt_orient_xy, 33 | gt_z_coord, 34 | gt_width, 35 | gt_height, 36 | gt_label, 37 | ) = tf.split(gt_bboxes, (1, 2, 2, 1, 1, 1, 1), axis=-1) 38 | 39 | ( 40 | p_objectness, 41 | p_delta_xy, 42 | p_orient_xy, 43 | p_z_coord, 44 | p_width, 45 | p_height, 46 | p_label, 47 | ) = tf.split(pred_bboxes, (1, 2, 2, 1, 1, 1, num_classes), axis=-1) 48 | 49 | # Objectness 50 | p_objectness = tf.sigmoid(p_objectness) 51 | obj_loss = binary_crossentropy(gt_objectness, p_objectness) 52 | 53 | # Evaluate regression only for non-zero ground truth objects 54 | obj_mask = tf.squeeze(gt_objectness, -1) 55 | 56 | # Evaluate the class and the remaining box parameters 57 | label_loss = obj_mask * sparse_categorical_crossentropy( 58 | gt_label, p_label, from_logits=True 59 | ) 60 | 61 | delta_xy_loss = obj_mask * tf.reduce_sum(tf.abs(gt_delta_xy - p_delta_xy), axis=-1) 62 | delta_orient_loss = obj_mask * tf.reduce_sum( 63 | tf.abs(gt_orient_xy - p_orient_xy), axis=-1 64 | ) 65 | 66 | z_loss = obj_mask * tf.squeeze(tf.abs(gt_z_coord - p_z_coord), -1) 67 | width_loss = obj_mask * tf.squeeze(tf.abs(gt_width - p_width), -1) 68 | height_loss = obj_mask * tf.squeeze(tf.abs(gt_height - p_height), -1) 69 | 70 | obj_loss = tf.reduce_sum(obj_loss) 71 | label_loss = tf.reduce_sum(label_loss) 72 | z_loss = tf.reduce_sum(z_loss) 73 | delta_xy_loss = tf.reduce_sum(delta_xy_loss) 74 | width_loss = tf.reduce_sum(width_loss) 75 | height_loss = tf.reduce_sum(height_loss) 76 | delta_orient_loss = tf.reduce_sum(delta_orient_loss) 77 | 78 | return ( 79 | obj_loss, 80 | label_loss, 81 | z_loss, 82 | delta_xy_loss, 83 | width_loss, 84 | height_loss, 85 | delta_orient_loss, 86 | ) 87 | 88 | -------------------------------------------------------------------------------- /detection_3d/metrics.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | 20 | 21 | import tensorflow as tf 22 | from detection_3d.tools.file_io import save_to_json 23 | import numpy as np 24 | import os 25 | 26 | 27 | class EpochMetrics: 28 | """ 29 | The class accumulates the mean loss 30 | for the train and validation steps 31 | """ 32 | 33 | def __init__(self): 34 | self.train_loss = tf.keras.metrics.Mean(name="train_loss") 35 | self.val_loss = tf.keras.metrics.Mean(name="val_loss") 36 | 37 | def reset(self): 38 | """ 39 | Reset all metrics to zero (needs to be done each epoch) 40 | """ 41 | self.train_loss.reset_states() 42 | self.val_loss.reset_states() 43 | 44 | def save_to_json(self, dir_to_save): 45 | """ 46 | Save all metrics to the json file 47 | """ 48 | 49 | # Check that the folder exists or create it 50 | os.makedirs(dir_to_save, exist_ok=True) 51 | json_filename = os.path.join(dir_to_save, "epoch_metrics.json") 52 | # fill the dict 53 | metrics_dict = { 54 | "train_loss": str(self.train_loss.result().numpy()), 55 | "val_loss": str(self.val_loss.result().numpy()), 56 | } 57 | save_to_json(json_filename, metrics_dict) 58 | 59 | def print_metrics(self): 60 | """ 61 | Print all metrics 62 | """ 63 | train_loss = np.around(self.train_loss.result().numpy(), decimals=2) 64 | val_loss = np.around(self.val_loss.result().numpy(), decimals=2) 65 | 66 | template = "train_loss {}, val_loss {}".format(train_loss, val_loss) 67 | print(template) 68 | -------------------------------------------------------------------------------- /detection_3d/model.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 
18 | """ 19 | 20 | 21 | import tensorflow as tf 22 | from tensorflow.keras.layers import ( 23 | Conv2D, 24 | Layer, 25 | UpSampling2D, 26 | BatchNormalization, 27 | LeakyReLU, 28 | ) 29 | from tensorflow.keras import Model 30 | from tensorflow.keras.regularizers import l2 31 | 32 | 33 | class DarkNetConv2D(Layer): 34 | """ 35 | The darknet conv layer yolo_v3 36 | """ 37 | 38 | def __init__( 39 | self, 40 | filters, 41 | kernel, 42 | strides, 43 | padding, 44 | weight_decay, 45 | batch_norm=True, 46 | activation_funct=True, 47 | data_format="channels_last", 48 | ): 49 | super(DarkNetConv2D, self).__init__() 50 | self.batch_norm = batch_norm 51 | self.activation_funct = activation_funct 52 | self.conv = Conv2D( 53 | filters, 54 | kernel, 55 | strides=strides, 56 | activation=None, 57 | kernel_regularizer=l2(weight_decay), 58 | padding=padding, 59 | data_format=data_format, 60 | ) 61 | self.bn = BatchNormalization() 62 | self.activation = LeakyReLU(alpha=0.1) 63 | 64 | def call(self, x, training=False): 65 | x = self.conv(x) 66 | if self.batch_norm: 67 | x = self.bn(x, training=training) 68 | if self.activation_funct: 69 | x = self.activation(x) 70 | return x 71 | 72 | 73 | class DarkNetBlock(Layer): 74 | """ 75 | The darknet block layer 76 | """ 77 | 78 | def __init__( 79 | self, filters, weight_decay, batch_norm=True, data_format="channels_last" 80 | ): 81 | super(DarkNetBlock, self).__init__() 82 | 83 | self.conv1 = DarkNetConv2D( 84 | filters // 2, 85 | 1, 86 | strides=1, 87 | padding="same", 88 | weight_decay=weight_decay, 89 | batch_norm=batch_norm, 90 | data_format=data_format, 91 | ) 92 | self.conv2 = DarkNetConv2D( 93 | filters, 94 | 3, 95 | strides=1, 96 | padding="same", 97 | weight_decay=weight_decay, 98 | batch_norm=batch_norm, 99 | data_format=data_format, 100 | ) 101 | 102 | def call(self, x, training=False): 103 | prev = x 104 | x = self.conv1(x, training=training) 105 | x = self.conv2(x, training=training) 106 | return prev + x 107 | 108 | 109 | class DarkNetDecoderBlock(Layer): 110 | """ 111 | The yolo v3 decoder layer 112 | """ 113 | 114 | def __init__(self, filters, weight_decay, data_format="channels_last"): 115 | super(DarkNetDecoderBlock, self).__init__() 116 | 117 | self.conv1_1 = DarkNetConv2D( 118 | filters, 119 | (1, 1), 120 | strides=(1, 1), 121 | weight_decay=weight_decay, 122 | padding="same", 123 | data_format=data_format, 124 | ) 125 | self.conv1_2 = DarkNetConv2D( 126 | filters * 2, 127 | (3, 3), 128 | strides=(1, 1), 129 | weight_decay=weight_decay, 130 | padding="same", 131 | data_format=data_format, 132 | ) 133 | self.conv2_1 = DarkNetConv2D( 134 | filters, 135 | (1, 1), 136 | strides=(1, 1), 137 | weight_decay=weight_decay, 138 | padding="same", 139 | data_format=data_format, 140 | ) 141 | self.conv2_2 = DarkNetConv2D( 142 | filters * 2, 143 | (3, 3), 144 | strides=(1, 1), 145 | weight_decay=weight_decay, 146 | padding="same", 147 | data_format=data_format, 148 | ) 149 | self.conv3 = DarkNetConv2D( 150 | filters, 151 | (1, 1), 152 | strides=(1, 1), 153 | weight_decay=weight_decay, 154 | padding="same", 155 | data_format=data_format, 156 | ) 157 | 158 | def call(self, x, training=False): 159 | x = self.conv1_1(x, training=training) 160 | x = self.conv1_2(x, training=training) 161 | x = self.conv2_1(x, training=training) 162 | x = self.conv2_2(x, training=training) 163 | x = self.conv3(x, training=training) 164 | return x 165 | 166 | 167 | class DarkNetEncoder(Layer): 168 | """ 169 | The darknet 53 encoder from yolo_v3 170 | See: 
https://arxiv.org/abs/1804.02767 171 | """ 172 | 173 | def __init__(self, name, weight_decay, data_format="channels_last"): 174 | super(DarkNetEncoder, self).__init__(name=name) 175 | # Input 176 | self.conv1 = DarkNetConv2D( 177 | 32, 178 | (3, 3), 179 | strides=(1, 1), 180 | weight_decay=weight_decay, 181 | padding="same", 182 | data_format=data_format, 183 | ) 184 | # Conv with stride 2 185 | self.conv2 = DarkNetConv2D( 186 | 64, 187 | (3, 3), 188 | strides=(2, 2), 189 | weight_decay=weight_decay, 190 | padding="same", 191 | data_format=data_format, 192 | ) 193 | # Residual block 194 | self.block_1 = DarkNetBlock( 195 | 64, weight_decay=weight_decay, data_format=data_format 196 | ) 197 | # Conv with stride 2 198 | self.conv3 = DarkNetConv2D( 199 | 128, 200 | (3, 3), 201 | strides=(2, 2), 202 | weight_decay=weight_decay, 203 | padding="same", 204 | data_format=data_format, 205 | ) 206 | # Residual blocks 2x 207 | self.block_2 = [] 208 | for _ in range(2): 209 | self.block_2.append( 210 | DarkNetBlock(128, weight_decay=weight_decay, data_format=data_format) 211 | ) 212 | # Conv with stride 2 213 | self.conv4 = DarkNetConv2D( 214 | 256, 215 | (3, 3), 216 | strides=(2, 2), 217 | weight_decay=weight_decay, 218 | padding="same", 219 | data_format=data_format, 220 | ) 221 | # Residual blocks 8x 222 | self.block_3 = [] 223 | for _ in range(8): 224 | self.block_3.append( 225 | DarkNetBlock(256, weight_decay=weight_decay, data_format=data_format) 226 | ) 227 | # Conv with stride 2 228 | self.conv5 = DarkNetConv2D( 229 | 512, 230 | (3, 3), 231 | strides=(2, 2), 232 | weight_decay=weight_decay, 233 | padding="same", 234 | data_format=data_format, 235 | ) 236 | # Residual blocks 8x 237 | self.block_4 = [] 238 | for _ in range(8): 239 | self.block_4.append( 240 | DarkNetBlock(512, weight_decay=weight_decay, data_format=data_format) 241 | ) 242 | # Conv with stride 2 243 | self.conv6 = DarkNetConv2D( 244 | 1024, 245 | (3, 3), 246 | strides=(2, 2), 247 | weight_decay=weight_decay, 248 | padding="same", 249 | data_format=data_format, 250 | ) 251 | # Residual blocks 4x 252 | self.block_5 = [] 253 | for _ in range(4): 254 | self.block_5.append( 255 | DarkNetBlock(1024, weight_decay=weight_decay, data_format=data_format) 256 | ) 257 | 258 | def call(self, x, training=False): 259 | x = self.conv1(x, training=training) 260 | x = self.conv2(x, training=training) 261 | x = x_b1 = self.block_1(x, training=training) 262 | x = self.conv3(x, training=training) 263 | for i in range(len(self.block_2)): 264 | x = x_b2 = self.block_2[i](x, training=training) 265 | x = self.conv4(x, training=training) 266 | for i in range(len(self.block_3)): 267 | x = x_b3 = self.block_3[i](x, training=training) 268 | x = self.conv5(x, training=training) 269 | for i in range(len(self.block_4)): 270 | x = x_b4 = self.block_4[i](x, training=training) 271 | x = self.conv6(x, training=training) 272 | for i in range(len(self.block_5)): 273 | x = x_b5 = self.block_5[i](x, training=training) 274 | 275 | return x_b5, x_b4, x_b3, x_b2, x_b1 276 | 277 | 278 | class DarkNetDecoder(Layer): 279 | """ 280 | The yolo v3 decoder 281 | """ 282 | 283 | def __init__(self, name, weight_decay, data_format="channels_last"): 284 | super(DarkNetDecoder, self).__init__(name=name) 285 | 286 | self.decoder_block_1 = DarkNetDecoderBlock( 287 | filters=512, weight_decay=weight_decay, data_format=data_format 288 | ) 289 | 290 | self.conv1 = DarkNetConv2D( 291 | 256, 292 | (1, 1), 293 | strides=(1, 1), 294 | weight_decay=weight_decay, 295 | padding="same", 296 | 
data_format=data_format, 297 | ) 298 | self.up1 = UpSampling2D(size=(2, 2), data_format=data_format) 299 | self.decoder_block_2 = DarkNetDecoderBlock( 300 | filters=256, weight_decay=weight_decay, data_format=data_format 301 | ) 302 | 303 | self.conv2 = DarkNetConv2D( 304 | 128, 305 | (1, 1), 306 | strides=(1, 1), 307 | weight_decay=weight_decay, 308 | padding="same", 309 | data_format=data_format, 310 | ) 311 | self.up2 = UpSampling2D(size=(2, 2), data_format=data_format) 312 | self.decoder_block_3 = DarkNetDecoderBlock( 313 | filters=128, weight_decay=weight_decay, data_format=data_format 314 | ) 315 | self.up3 = UpSampling2D(size=(2, 2), data_format=data_format) 316 | self.decoder_block_4 = DarkNetDecoderBlock( 317 | filters=64, weight_decay=weight_decay, data_format=data_format 318 | ) 319 | self.up4 = UpSampling2D(size=(2, 2), data_format=data_format) 320 | self.decoder_block_5 = DarkNetDecoderBlock( 321 | filters=64, weight_decay=weight_decay, data_format=data_format 322 | ) 323 | 324 | def call(self, x_in, training=False): 325 | # First lvl 326 | x_b5, x_b4, x_b3, x_b2, x_b1 = x_in 327 | x = self.decoder_block_1(x_b5, training=training) 328 | # Second lvl 329 | x = self.conv1(x, training=training) 330 | x = self.up1(x) 331 | x = tf.concat([x, x_b4], axis=-1) 332 | x = self.decoder_block_2(x, training=training) 333 | # Third lvl 334 | x = self.conv2(x, training=training) 335 | x = self.up2(x) 336 | x = tf.concat([x, x_b3], axis=-1) 337 | x = self.decoder_block_3(x, training=training) 338 | x = self.up3(x) 339 | x = tf.concat([x, x_b2], axis=-1) 340 | x = self.decoder_block_4(x, training=training) 341 | x = self.up4(x) 342 | x = tf.concat([x, x_b1], axis=-1) 343 | x = self.decoder_block_5(x, training=training) 344 | 345 | return x 346 | 347 | 348 | class YoloV3_Lidar(Model): 349 | def __init__(self, weight_decay, num_classes=26, data_format="channels_last"): 350 | super(YoloV3_Lidar, self).__init__(name="YoloV3_Lidar") 351 | self.encoder = DarkNetEncoder( 352 | name="DarkNetEncoder", weight_decay=weight_decay, data_format=data_format 353 | ) 354 | self.decoder = DarkNetDecoder( 355 | name="DarkNetDecoder", weight_decay=weight_decay, data_format=data_format 356 | ) 357 | self.final_layer = Conv2D( 358 | 8 + num_classes, 359 | (1, 1), 360 | activation=None, 361 | padding="same", 362 | data_format=data_format, 363 | ) 364 | 365 | def call(self, x, training): 366 | x = self.encoder(x, training=training) 367 | x = self.decoder(x, training=training) 368 | x = self.final_layer(x) 369 | return x 370 | -------------------------------------------------------------------------------- /detection_3d/parameters.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | import os 20 | from detection_3d.tools.file_io import save_to_json 21 | from detection_3d.tools.statics import NO_SCHEDULER, RESTARTS_SCHEDULER, ADAM 22 | 23 | 24 | class Parameters(object): 25 | """ 26 | The class contains experiment parameters. 27 | """ 28 | 29 | def __init__(self): 30 | 31 | self.settings = { 32 | # The directory of the dataset 33 | "dataset_dir": "dataset", 34 | "batch_size": 4, 35 | # The checkpoint related 36 | "checkpoints_dir": "log/checkpoints", 37 | "train_summaries": "log/summaries/train", 38 | "eval_summaries": "log/summaries/val", 39 | # Update tensorboard train images each step_summaries iterations 40 | "step_summaries": 100, # set to None to turn off 41 | # General settings 42 | "seed": 2020, 43 | "max_epochs": 1000, 44 | "weight_decay": 1, 45 | } 46 | 47 | # Set special parameters 48 | self.settings["optimizer"] = ADAM 49 | self.settings["scheduler"] = SchedulerSettings.no_scheduler() 50 | self.settings["augmentation"] = True 51 | # Detection related 52 | self.settings["grid_meters"] = [52.0, 104.0, 8.0] # [x, y, z] in meters 53 | # [x,y,z, intensity] offset to shift all lidar points in positive coordinate quadrant 54 | # (all x,y,z coords >=0) 55 | self.settings["lidar_offset"] = [26.0, 52.0, 2.5, 0.0] 56 | # [x,y,z] top view voxel size in meters 57 | self.settings["voxel_size"] = [0.125, 0.125, 8.0] 58 | # [x,y,z] bbox grid voxel size in meters 59 | self.settings["bbox_voxel_size"] = [0.25, 0.25, 1.0] 60 | 61 | # Parameters defined automatically during training 62 | self.settings["train_size"] = None # the size of train set 63 | self.settings["val_size"] = None # the size of val set 64 | 65 | def save_to_json(self, dir_to_save): 66 | """ 67 | Save parameters to .json 68 | """ 69 | # Check that the folder exists or create it 70 | os.makedirs(dir_to_save, exist_ok=True) 71 | json_filename = os.path.join(dir_to_save, "parameters.json") 72 | save_to_json(json_filename, self.settings) 73 | 74 | 75 | class SchedulerSettings: 76 | """ 77 | The class contains parameters for different schedulers. 78 | """ 79 | 80 | def __init__(self): 81 | pass 82 | 83 | # Supported schedulers 84 | @staticmethod 85 | def no_scheduler(): 86 | """ 87 | Constant learning rate scheduler. 88 | """ 89 | scheduler = { 90 | "name": NO_SCHEDULER, 91 | "initial_learning_rate": 1e-5, 92 | } 93 | return scheduler 94 | 95 | @staticmethod 96 | def restarts_scheduler(): 97 | """ 98 | The warm restarts scheduler. 
99 | See: https://arxiv.org/abs/1608.03983 100 | """ 101 | scheduler = { 102 | "name": RESTARTS_SCHEDULER, 103 | "initial_learning_rate": 1e-4, # 2e-3 104 | "first_decay_steps": 80, # Important: converted from epochs to iterations 105 | "t_mul": 2.0, 106 | "m_mul": 1.0, 107 | "alpha": 1e-6, 108 | } 109 | return scheduler 110 | -------------------------------------------------------------------------------- /detection_3d/tools/augmentation_tools.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | 20 | import numpy as np 21 | from detection_3d.data_preprocessing.pandaset_tools.transform import ( 22 | eulerAnglesToRotationMatrix, 23 | ) 24 | 25 | 26 | def random_rotate_lidar_boxes( 27 | lidar, lidar_corners_3d, min_angle=-np.pi / 4, max_angle=np.pi / 4 28 | ): 29 | yaw = np.random.uniform(min_angle, max_angle) 30 | R = eulerAnglesToRotationMatrix([0, 0, yaw]) 31 | lidar = np.transpose(lidar) 32 | lidar_corners_3d = np.transpose(lidar_corners_3d, (0, 2, 1)) 33 | 34 | lidar[:3] = np.matmul(R, lidar[:3]) 35 | lidar_corners_3d = np.matmul(R, lidar_corners_3d) 36 | 37 | lidar_corners_3d = np.transpose(lidar_corners_3d, (0, 2, 1)) 38 | lidar = np.transpose(lidar) 39 | return lidar, lidar_corners_3d 40 | 41 | 42 | def random_flip_x_lidar_boxes(lidar, lidar_corners_3d): 43 | lidar[:, 0] = -lidar[:, 0] 44 | lidar_corners_3d[:, :, 0] = -lidar_corners_3d[:, :, 0] 45 | return lidar, lidar_corners_3d 46 | 47 | 48 | def random_flip_y_lidar_boxes(lidar, lidar_corners_3d): 49 | lidar[:, 1] = -lidar[:, 1] 50 | lidar_corners_3d[:, :, 1] = -lidar_corners_3d[:, :, 1] 51 | return lidar, lidar_corners_3d 52 | -------------------------------------------------------------------------------- /detection_3d/tools/detection_helpers.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | 
-------------------------------------------------------------------------------- /detection_3d/tools/detection_helpers.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | import numpy as np 20 | import tensorflow as tf 21 | 22 | 23 | def rot_z(t): 24 | """ Rotation about the z-axis. """ 25 | c = np.cos(t) 26 | s = np.sin(t) 27 | ones = np.ones_like(c) 28 | zeros = np.zeros_like(c) 29 | return np.asarray([[c, -s, zeros], [s, c, zeros], [zeros, zeros, ones]]) 30 | 31 | 32 | def make_eight_points_boxes(bboxes_xyzlwhy): 33 | bboxes_xyzlwhy = np.asarray(bboxes_xyzlwhy) 34 | l = bboxes_xyzlwhy[:, 3] / 2.0 35 | w = bboxes_xyzlwhy[:, 4] / 2.0 36 | h = bboxes_xyzlwhy[:, 5] / 2.0 37 | # 3d bounding box corners 38 | x_corners = np.asarray([l, l, -l, -l, l, l, -l, -l]) 39 | y_corners = np.asarray([w, -w, -w, w, w, -w, -w, w]) 40 | z_corners = np.asarray([-h, -h, -h, -h, h, h, h, h]) 41 | corners_3d = np.concatenate(([x_corners], [y_corners], [z_corners]), axis=0) 42 | yaw = np.asarray(bboxes_xyzlwhy[:, -1], dtype=float)  # np.float is deprecated; use builtin float 43 | corners_3d = np.transpose(corners_3d, (2, 0, 1)) 44 | R = np.transpose(rot_z(yaw), (2, 0, 1)) 45 | 46 | corners_3d = np.matmul(R, corners_3d) 47 | 48 | centroid = bboxes_xyzlwhy[:, :3] 49 | corners_3d += centroid[:, :, None] 50 | orient_p = (corners_3d[:, :, 0] + corners_3d[:, :, 7]) / 2.0 51 | orientation_3d = np.concatenate( 52 | (centroid[:, :, None], orient_p[:, :, None]), axis=-1 53 | ) 54 | corners_3d = np.transpose(corners_3d, (0, 2, 1)) 55 | orientation_3d = np.transpose(orientation_3d, (0, 2, 1)) 56 | 57 | return corners_3d, orientation_3d 58 | 59 | 60 | def get_bboxes_parameters_from_points(lidar_corners_3d): 61 | """ 62 | Recovers box parameters (centroid [x, y, z], width, length, height, yaw) from corner points 63 | 64 | Arguments: 65 | lidar_corners_3d: [num_boxes, 8, 3] 66 | """ 67 | centroid = (lidar_corners_3d[:, -2, :] + lidar_corners_3d[:, 0, :]) / 2.0 68 | delta_l = lidar_corners_3d[:, 0, :2] - lidar_corners_3d[:, 1, :2] 69 | delta_w = lidar_corners_3d[:, 1, :2] - lidar_corners_3d[:, 2, :2] 70 | width = np.linalg.norm(delta_w, axis=-1) 71 | length = np.linalg.norm(delta_l, axis=-1) 72 | 73 | height = lidar_corners_3d[:, -1, -1] - lidar_corners_3d[:, 0, -1] 74 | yaw = np.arctan2(delta_l[:, 1], delta_l[:, 0]) 75 | 76 | return centroid, width, length, height, yaw 77 | 78 | 79 | def get_voxels_grid(voxel_size, grid_meters): 80 | voxel_size = np.asarray(voxel_size, np.float32) 81 | grid_size_meters = np.asarray(grid_meters, np.float32) 82 | voxels_grid = np.asarray(grid_size_meters / voxel_size, np.int32) 83 | return voxels_grid 84 | 85 | 
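A short sketch of the two helpers above: make_eight_points_boxes expands a parametrized box [x, y, z, l, w, h, yaw] into its 8 corners plus an orientation segment, and get_voxels_grid derives the grid resolution. The metric extents here are illustrative assumptions; only the 12.5 cm cell size is taken from the README:

```
import numpy as np
from detection_3d.tools.detection_helpers import (
    make_eight_points_boxes,
    get_voxels_grid,
)

# One box: x, y, z, l, w, h, yaw -- a 4 x 2 x 1.5 m box at (10, 5, 0.75), yawed 30 degrees.
boxes_xyzlwhy = np.array([[10.0, 5.0, 0.75, 4.0, 2.0, 1.5, np.deg2rad(30.0)]])
corners, orientation = make_eight_points_boxes(boxes_xyzlwhy)
print(corners.shape)      # (1, 8, 3) -- eight 3D corners per box
print(orientation.shape)  # (1, 2, 3) -- centroid plus a point marking the heading

# An (assumed) 80 x 80 x 4 m grid at 12.5 cm resolution:
print(get_voxels_grid([0.125, 0.125, 0.125], [80.0, 80.0, 4.0]))  # [640 640  32]
```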
90 | """ 91 | voxels_grid = get_voxels_grid(bbox_voxel_size, grid_meters) 92 | # Find box parameters 93 | centroid, width, length, height, _ = get_bboxes_parameters_from_points( 94 | lidar_corners_3d 95 | ) 96 | # find the vector of orientation [centroid, orient_point] 97 | orient_point = (lidar_corners_3d[:, 1] + lidar_corners_3d[:, 2]) / 2.0 98 | 99 | voxel_coordinates = np.asarray( 100 | np.floor(centroid[:, :2] / bbox_voxel_size[:2]), np.int32 101 | ) 102 | # Filter bboxes not fall in the grid 103 | bound_x = (voxel_coordinates[:, 0] >= 0) & ( 104 | voxel_coordinates[:, 0] < voxels_grid[0] 105 | ) 106 | bound_y = (voxel_coordinates[:, 1] >= 0) & ( 107 | voxel_coordinates[:, 1] < voxels_grid[1] 108 | ) 109 | mask = bound_x & bound_y 110 | # Filter all non related bboxes 111 | centroid = centroid[mask] 112 | orient_point = orient_point[mask] 113 | width = width[mask] 114 | length = length[mask] 115 | height = height[mask] 116 | bbox_labels = bbox_labels[mask] 117 | voxel_coordinates = voxel_coordinates[mask] 118 | # Confidence 119 | confidence = np.ones_like(width) 120 | 121 | # Voxels close corners to the coordinate system origin (0,0,0) 122 | voxels_close_corners = ( 123 | np.asarray(voxel_coordinates, np.float32) * bbox_voxel_size[:2] 124 | ) 125 | # Get x,y, coordinate 126 | delta_xy = centroid[:, :2] - voxels_close_corners 127 | orient_xy = orient_point[:, :2] - voxels_close_corners 128 | z_coord = centroid[:, -1] 129 | 130 | # print( 131 | # f"confidence {confidence.shape}, delta_xy {delta_xy.shape}, orient_xy {orient_xy.shape}, z_coord {z_coord.shape}, width {width.shape}, height {height.shape}, bbox_labels {bbox_labels.shape}" 132 | # ) 133 | # (x_grid, y_grid, (objectness, min_delta_x, min_delta_y, max_delta_x, max_delta_y, z, label)) 134 | # objectness means 1 if box exists for this grid cell else 0 135 | output_tensor = np.zeros((voxels_grid[0], voxels_grid[1], 9), np.float32) 136 | if len(bbox_labels) > 0: 137 | data = np.concatenate( 138 | ( 139 | confidence[:, None], 140 | delta_xy, 141 | orient_xy, 142 | z_coord[:, None], 143 | width[:, None], 144 | height[:, None], 145 | bbox_labels[:, None], 146 | ), 147 | axis=-1, 148 | ) 149 | output_tensor[voxel_coordinates[:, 0], voxel_coordinates[:, 1]] = data 150 | return output_tensor 151 | 152 | 153 | def get_boxes_from_box_grid(box_grid, bbox_voxel_size, conf_trhld=0.0): 154 | 155 | # Get non-zero voxels 156 | objectness, delta_xy, orient_xy, z_coord, width, height, label = tf.split( 157 | box_grid, (1, 2, 2, 1, 1, 1, -1), axis=-1 158 | ) 159 | 160 | mask = box_grid[:, :, 0] > conf_trhld 161 | valid_idx = tf.where(mask) 162 | 163 | z_coord = tf.gather_nd(z_coord, valid_idx) 164 | width = tf.gather_nd(width, valid_idx) 165 | height = tf.gather_nd(height, valid_idx) 166 | objectness = tf.gather_nd(objectness, valid_idx) 167 | label = tf.gather_nd(label, valid_idx) 168 | delta_xy = tf.gather_nd(delta_xy, valid_idx) 169 | orient_xy = tf.gather_nd(orient_xy, valid_idx) 170 | voxels_close_corners = tf.cast(valid_idx, tf.float32) * bbox_voxel_size[None, :2] 171 | xy_coord = delta_xy + voxels_close_corners 172 | xy_orient = orient_xy + voxels_close_corners 173 | 174 | delta = xy_orient[:, :2] - xy_coord[:, :2] 175 | length = 2 * tf.norm(delta, axis=-1, keepdims=True) 176 | yaw = tf.expand_dims(tf.atan2(delta[:, 1], delta[:, 0]), axis=-1) 177 | 178 | bbox = tf.concat([xy_coord, z_coord, length, width, height, yaw], axis=-1,) 179 | return bbox, label, objectness 180 | 181 | 182 | def make_top_view_image(lidar, grid_meters, voxels_size, 
channels=3): 183 | """ 184 | The function makes a top view image from the lidar scan 185 | Arguments: 186 | lidar: lidar array of the shape [num_points, 4] with (x, y, z, intensity) columns 187 | grid_meters: metric extent of the grid [x, y, z] 188 | voxels_size: metric size of a single voxel [x, y, z] 189 | channels: number of channels of the top view image (unused, the output has 2) 190 | """ 191 | mask_x = (lidar[:, 0] >= 0) & (lidar[:, 0] < grid_meters[0]) 192 | mask_y = (lidar[:, 1] >= 0) & (lidar[:, 1] < grid_meters[1]) 193 | mask_z = (lidar[:, 2] >= 0) & (lidar[:, 2] < grid_meters[2]) 194 | mask = mask_x & mask_y & mask_z 195 | lidar = lidar[mask] 196 | voxel_grid = get_voxels_grid(voxels_size, grid_meters) 197 | voxels = np.asarray(np.floor(lidar[:, :3] / voxels_size), np.int32) 198 | top_view = np.zeros((voxel_grid[0], voxel_grid[1], 2), np.float32) 199 | top_view[voxels[:, 0], voxels[:, 1], 0] = lidar[:, 2]  # z values (last point per cell wins) 200 | top_view[voxels[:, 0], voxels[:, 1], 1] = lidar[:, 3]  # intensity values 201 | 202 | return top_view 203 | -------------------------------------------------------------------------------- /detection_3d/tools/file_io.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 
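make_top_view_image, defined just above, is essentially the rasterizer behind the top-view input: points are binned into a 2D grid, and each occupied cell stores the height and intensity of the last point that fell into it. A toy sketch with assumed grid extents (only the 12.5 cm cell size comes from the README):

```
import numpy as np
from detection_3d.tools.detection_helpers import make_top_view_image

grid_meters = [80.0, 80.0, 4.0]      # assumed metric extent of the grid
voxels_size = [0.125, 0.125, 0.125]  # 12.5 cm cells, as described in the README

# Synthetic scan: 1000 points with (x, y, z, intensity); out-of-grid points are dropped.
lidar = (np.random.rand(1000, 4) * [80.0, 80.0, 4.0, 1.0]).astype(np.float32)
top_view = make_top_view_image(lidar, grid_meters, voxels_size)
print(top_view.shape)  # (640, 640, 2): channel 0 = height (z), channel 1 = intensity
```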
18 | """ 19 | import tensorflow as tf 20 | import matplotlib.pyplot as plt 21 | import numpy as np 22 | import os 23 | import json 24 | from detection_3d.data_preprocessing.pandaset_tools.helpers import labels 25 | 26 | 27 | def load_and_resize_image(image_filename, resize=None, data_type=tf.float32): 28 | """ 29 | Load png image to tensor and resize if necessary 30 | Arguments: 31 | image_filename: image file to load 32 | resize: tensor [new_width, new_height] or None 33 | Return: 34 | img: tensor of the size [1, H, W, 3] 35 | """ 36 | 37 | img = tf.io.read_file(image_filename) 38 | img = tf.image.decode_png(img) 39 | # Add batch dim 40 | img = tf.expand_dims(img, axis=0) 41 | 42 | if resize is not None: 43 | img = tf.compat.v1.image.resize_nearest_neighbor(img, resize) 44 | 45 | img = tf.cast(img, data_type) 46 | return img 47 | 48 | 49 | def save_plot_to_image(file_to_save, figure): 50 | """ 51 | Save matplotlib figure to image and close 52 | """ 53 | plt.savefig(file_to_save) 54 | plt.close(figure) 55 | 56 | 57 | def read_json(json_filename): 58 | with open(json_filename) as json_file: 59 | data = json.load(json_file) 60 | return data 61 | 62 | 63 | def save_bboxes_to_file( 64 | filename, centroid, width, length, height, alpha, label, delim=";" 65 | ): 66 | 67 | if centroid is not None: 68 | with open(filename, "w") as the_file: 69 | for c, w, l, h, a, lbl in zip( 70 | centroid, width, length, height, alpha, label 71 | ): 72 | data = ( 73 | delim.join( 74 | ( 75 | str(c[0]), 76 | str(c[1]), 77 | str(c[2]), 78 | str(l), 79 | str(w), 80 | str(h), 81 | str(a), 82 | str(lbl), 83 | ) 84 | ) 85 | + "\n" 86 | ) 87 | # data = "{};{};{};{};{};{};{};{}\n".format( 88 | # c[0], c[1], c[2], l, w, h, a, lbl 89 | # ) 90 | the_file.write(data) 91 | 92 | 93 | def load_bboxes(label_filename, label_string=True): 94 | # returns the array with [num_boxes, (bbox_parm)] 95 | with open(label_filename) as f: 96 | bboxes = np.asarray([line.rstrip().split(";") for line in f]) 97 | # Convert labels to numbers 98 | if label_string: 99 | bboxes[:, -1] = [labels[label] for label in bboxes[:, -1]] 100 | bboxes = np.asarray(bboxes, dtype=float) 101 | return bboxes 102 | 103 | 104 | def load_lidar(lidar_filename, dtype=np.float32, n_vec=4): 105 | scan = np.fromfile(lidar_filename, dtype=dtype) 106 | scan = scan.reshape((-1, n_vec)) 107 | return scan 108 | 109 | 110 | def save_lidar(lidar_filename, scan): 111 | scan = scan.reshape((-1)) 112 | scan.tofile(lidar_filename) 113 | 114 | 115 | def save_to_json(json_filename, dict_to_save): 116 | """ 117 | Save to json file 118 | """ 119 | with open(json_filename, "w") as f: 120 | json.dump(dict_to_save, f, indent=2) 121 | 122 | 123 | def save_dataset_list(dataset_file, data_list): 124 | """ 125 | Saves dataset list to file. 126 | """ 127 | with open(dataset_file, "w") as f: 128 | for item in data_list: 129 | f.write("%s\n" % item) 130 | 131 | 132 | def load_dataset_list(dataset_dir, dataset_file, delimiter=";"): 133 | """ 134 | The function loads list of data from dataset 135 | file. 136 | Args: 137 | dataset_file: path to the .dataset file. 138 | Returns: 139 | dataset_list: list of data. 
140 | """ 141 | 142 | file_path = os.path.join(dataset_dir, dataset_file) 143 | dataset_list = [] 144 | with open(file_path) as f: 145 | dataset_list = f.readlines() 146 | dataset_list = [x.strip().split(delimiter) for x in dataset_list] 147 | return dataset_list 148 | -------------------------------------------------------------------------------- /detection_3d/tools/statics.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | 20 | NO_SCHEDULER = "no_scheduler" 21 | RESTARTS_SCHEDULER = "restarts" 22 | 23 | ADAM = "adam" 24 | -------------------------------------------------------------------------------- /detection_3d/tools/summary_helpers.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 
18 | """ 19 | import tensorflow as tf 20 | import numpy as np 21 | from detection_3d.tools.visualization_tools import visualize_2d_boxes_on_top_image 22 | 23 | 24 | def train_summaries(train_out, optimizer, param_settings, learning_rate): 25 | """ 26 | Visualizes the train outputs in tensorboards 27 | """ 28 | 29 | writer = tf.summary.create_file_writer(param_settings["train_summaries"]) 30 | with writer.as_default(): 31 | # Losses 32 | ( 33 | obj_loss, 34 | label_loss, 35 | z_loss, 36 | delta_xy_loss, 37 | width_loss, 38 | height_loss, 39 | delta_orient_loss, 40 | ) = train_out["losses"] 41 | 42 | # Show learning rate given scheduler 43 | if param_settings["scheduler"]["name"] != "no_scheduler": 44 | with tf.name_scope("Optimizer info"): 45 | step = float( 46 | optimizer.iterations.numpy() 47 | ) # triangular_scheduler learning rate needs float dtype 48 | tf.summary.scalar( 49 | "learning_rate", learning_rate(step), step=optimizer.iterations 50 | ) 51 | with tf.name_scope("Training losses"): 52 | tf.summary.scalar( 53 | "1.Total loss", train_out["total_loss"], step=optimizer.iterations 54 | ) 55 | tf.summary.scalar("2.obj loss", obj_loss, step=optimizer.iterations) 56 | tf.summary.scalar("3.label_loss", label_loss, step=optimizer.iterations) 57 | tf.summary.scalar("4. z_loss", z_loss, step=optimizer.iterations) 58 | tf.summary.scalar( 59 | "5. delta_xy_loss", delta_xy_loss, step=optimizer.iterations 60 | ) 61 | tf.summary.scalar("6. width_loss", width_loss, step=optimizer.iterations) 62 | tf.summary.scalar("8. height_loss", height_loss, step=optimizer.iterations) 63 | tf.summary.scalar( 64 | "9. delta_orient_loss", delta_orient_loss, step=optimizer.iterations 65 | ) 66 | 67 | if ( 68 | param_settings["step_summaries"] is not None 69 | and optimizer.iterations % param_settings["step_summaries"] == 0 70 | ): 71 | bbox_voxel_size = np.asarray( 72 | param_settings["bbox_voxel_size"], dtype=np.float32 73 | ) 74 | lidar_coord = np.array(param_settings["lidar_offset"], dtype=np.float32) 75 | gt_bboxes = train_out["box_grid"] 76 | p_bboxes = train_out["predictions"] 77 | grid_meters = param_settings["grid_meters"] 78 | top_view = train_out["top_view"] 79 | gt_top_view = visualize_2d_boxes_on_top_image( 80 | gt_bboxes, top_view, grid_meters, bbox_voxel_size, 81 | ) 82 | 83 | p_top_view = visualize_2d_boxes_on_top_image( 84 | p_bboxes, top_view, grid_meters, bbox_voxel_size, prediction=True, 85 | ) 86 | 87 | # Show GT 88 | with tf.name_scope("1-Ground truth bounding boxes"): 89 | tf.summary.image("Top view", gt_top_view, step=optimizer.iterations) 90 | 91 | with tf.name_scope("2-Predicted bounding boxes"): 92 | tf.summary.image( 93 | "Predicted top view", p_top_view, step=optimizer.iterations 94 | ) 95 | 96 | 97 | def epoch_metrics_summaries(param_settings, epoch_metrics, epoch): 98 | """ 99 | Visualizes epoch metrics 100 | """ 101 | # Train results 102 | writer = tf.summary.create_file_writer(param_settings["train_summaries"]) 103 | with writer.as_default(): 104 | # Show epoch metrics for train 105 | with tf.name_scope("Epoch metrics"): 106 | tf.summary.scalar( 107 | "1. Loss", epoch_metrics.train_loss.result().numpy(), step=epoch 108 | ) 109 | 110 | # Val results 111 | writer = tf.summary.create_file_writer(param_settings["eval_summaries"]) 112 | with writer.as_default(): 113 | # Show epoch metrics for train 114 | with tf.name_scope("Epoch metrics"): 115 | tf.summary.scalar( 116 | "1. 
Loss", epoch_metrics.val_loss.result().numpy(), step=epoch 117 | ) 118 | -------------------------------------------------------------------------------- /detection_3d/tools/training_helpers.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | import os 20 | import glob 21 | import tensorflow as tf 22 | import copy 23 | from detection_3d.tools.statics import NO_SCHEDULER, RESTARTS_SCHEDULER, ADAM 24 | 25 | 26 | def setup_gpu(): 27 | physical_devices = tf.config.experimental.list_physical_devices("GPU") 28 | if len(physical_devices) > 0: 29 | # Will not allocate all memory but only necessary amount 30 | tf.config.experimental.set_memory_growth(physical_devices[0], True) 31 | 32 | 33 | def initialize_model(model, input_shape): 34 | """ 35 | Helper tf2 specific model initialization (need for saving mechanism) 36 | """ 37 | sample = tf.zeros(input_shape, tf.float32) 38 | model.predict(sample) 39 | 40 | 41 | def load_model(checkpoints_dir, model, resume): 42 | """ 43 | Resume model from given checkpoint 44 | """ 45 | start_epoch = 0 46 | if resume: 47 | search_string = os.path.join(checkpoints_dir, "*") 48 | checkpoints_list = sorted(glob.glob(search_string)) 49 | if len(checkpoints_list) > 0: 50 | current_epoch = int(os.path.split(checkpoints_list[-1])[-1].split("-")[-1]) 51 | model = tf.keras.models.load_model(checkpoints_list[-1]) 52 | start_epoch = current_epoch + 1 # we should continue from the next epoch 53 | print(f"RESUME TRAINING FROM CHECKPOINT: {checkpoints_list[-1]}.") 54 | else: 55 | print(f"CAN'T RESUME TRAINING! NO CHECKPOINT FOUND! 
START NEW TRAINING!") 56 | return start_epoch, model 57 | 58 | 59 | def get_optimizer(optimizer_name, scheduler, num_iter_per_epoch): 60 | if scheduler["name"] == NO_SCHEDULER: 61 | learning_rate = scheduler["initial_learning_rate"] 62 | elif scheduler["name"] == RESTARTS_SCHEDULER: 63 | tmp_scheduler = copy.deepcopy(scheduler) 64 | tmp_scheduler["first_decay_steps"] = ( 65 | scheduler["first_decay_steps"] * num_iter_per_epoch 66 | ) 67 | learning_rate = tf.keras.experimental.CosineDecayRestarts(**tmp_scheduler) 68 | 69 | if optimizer_name == ADAM: 70 | optimizer_type = tf.keras.optimizers.Adam(learning_rate) 71 | else: 72 | ValueError("Error: Unknow optimizer {}".format(optimizer_name)) 73 | 74 | return learning_rate, optimizer_type 75 | -------------------------------------------------------------------------------- /detection_3d/tools/visualization_tools.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 
18 | """ 19 | import mayavi.mlab as mlab 20 | import numpy as np 21 | import cv2 22 | import matplotlib.pyplot as plt 23 | import copy 24 | from tqdm import tqdm 25 | from detection_3d.tools.detection_helpers import ( 26 | get_boxes_from_box_grid, 27 | make_eight_points_boxes, 28 | ) 29 | from detection_3d.data_preprocessing.pandaset_tools.helpers import get_color 30 | 31 | 32 | def visualize_lidar(lidar, figure=None): 33 | """ 34 | Draw lidar points 35 | Args: 36 | lidar: numpy array (n,3) of XYZ 37 | figure: mayavi figure handler, if None create new one otherwise will use it 38 | Returns: 39 | fig: created or used fig 40 | """ 41 | 42 | if figure is None: 43 | figure = mlab.figure( 44 | figure=None, bgcolor=(0, 0, 0), fgcolor=None, engine=None, size=(1600, 1000) 45 | ) 46 | 47 | color = lidar[:, 2] 48 | mlab.points3d( 49 | lidar[:, 0], 50 | lidar[:, 1], 51 | lidar[:, 2], 52 | color, 53 | mode="point", 54 | scale_factor=0.3, 55 | figure=figure, 56 | ) 57 | 58 | # draw origin 59 | mlab.points3d( 60 | 0, 0, 0, color=(1, 1, 1), mode="sphere", scale_factor=0.2, figure=figure 61 | ) 62 | # draw axis 63 | mlab.plot3d( 64 | [0, 2], [0, 0], [0, 0], color=(1, 0, 0), tube_radius=None, figure=figure 65 | ) 66 | mlab.plot3d( 67 | [0, 0], [0, 2], [0, 0], color=(0, 1, 0), tube_radius=None, figure=figure 68 | ) 69 | mlab.plot3d( 70 | [0, 0], [0, 0], [0, 2], color=(0, 0, 1), tube_radius=None, figure=figure 71 | ) 72 | return figure 73 | 74 | 75 | def visualize_bboxes_3d(lidar_corners_3d, figure=None, orientation=None): 76 | if figure is None: 77 | figure = mlab.figure( 78 | figure=None, bgcolor=(0, 0, 0), fgcolor=None, engine=None, size=(1600, 1000) 79 | ) 80 | 81 | for b in tqdm(lidar_corners_3d, desc=f"Add bboxes", total=len(lidar_corners_3d)): 82 | for k in range(0, 4): 83 | i, j = k, (k + 1) % 4 84 | mlab.plot3d( 85 | [b[i, 0], b[j, 0]], 86 | [b[i, 1], b[j, 1]], 87 | [b[i, 2], b[j, 2]], 88 | color=(1, 1, 1), 89 | tube_radius=None, 90 | line_width=1, 91 | figure=figure, 92 | ) 93 | 94 | i, j = k + 4, (k + 1) % 4 + 4 95 | mlab.plot3d( 96 | [b[i, 0], b[j, 0]], 97 | [b[i, 1], b[j, 1]], 98 | [b[i, 2], b[j, 2]], 99 | color=(1, 1, 1), 100 | tube_radius=None, 101 | line_width=1, 102 | figure=figure, 103 | ) 104 | 105 | i, j = k, k + 4 106 | mlab.plot3d( 107 | [b[i, 0], b[j, 0]], 108 | [b[i, 1], b[j, 1]], 109 | [b[i, 2], b[j, 2]], 110 | color=(1, 1, 1), 111 | tube_radius=None, 112 | line_width=1, 113 | figure=figure, 114 | ) 115 | if orientation is not None: 116 | for o in orientation: 117 | mlab.plot3d( 118 | [o[0, 0], o[1, 0]], 119 | [o[0, 1], o[1, 1]], 120 | [o[0, 2], o[1, 2]], 121 | color=(1, 1, 1), 122 | tube_radius=None, 123 | line_width=1, 124 | figure=figure, 125 | ) 126 | print(f"Done") 127 | return figure 128 | 129 | 130 | def draw_boxes_top_view( 131 | top_view_image, boxes_3d, grid_meters, labels, orientation_3d=None 132 | ): 133 | height, width, channels = top_view_image.shape 134 | delimiter_x = grid_meters[0] / height 135 | delimiter_y = grid_meters[1] / width 136 | thickness = 2 137 | for idx, b in enumerate(boxes_3d): 138 | color = get_color(labels[idx]) / 255 139 | b = b[:4] 140 | x = np.floor(b[:, 0] / delimiter_x).astype(int) 141 | 142 | y = np.floor(b[:, 1] / delimiter_y).astype(int) 143 | 144 | cv2.line(top_view_image, (y[0], x[0]), (y[1], x[1]), color, thickness) 145 | cv2.line(top_view_image, (y[1], x[1]), (y[2], x[2]), color, thickness) 146 | cv2.line(top_view_image, (y[2], x[2]), (y[3], x[3]), color, thickness) 147 | cv2.line(top_view_image, (y[3], x[3]), (y[0], x[0]), color, 
thickness) 148 | 149 | if orientation_3d is not None: 150 | for o in orientation_3d: 151 | x = np.floor(o[:, 0] / delimiter_x).astype(int) 152 | y = np.floor(o[:, 1] / delimiter_y).astype(int) 153 | cv2.arrowedLine( 154 | top_view_image, (y[0], x[0]), (y[1], x[1]), (1, 0, 0), thickness 155 | ) 156 | return top_view_image 157 | 158 | 159 | def visualize_2d_boxes_on_top_image( 160 | bboxes_grid, top_view, grid_meters, bbox_voxel_size, prediction=False 161 | ): 162 | top_image_vis = [] 163 | for boxes, top_image in zip(bboxes_grid, top_view): # iterate over batch 164 | top_image = top_image.numpy() 165 | shape = top_image.shape 166 | rgb_image = np.zeros((shape[0], shape[1], 3)) 167 | rgb_image[top_image[:, :, 0] > 0] = 1 168 | 169 | box, labels, _ = get_boxes_from_box_grid(boxes, bbox_voxel_size) 170 | box = box.numpy() 171 | box, orientation_3d = make_eight_points_boxes(box) 172 | 173 | if prediction: 174 | labels = np.argmax(labels, axis=-1) 175 | if len(box) > 0: 176 | rgb_image = draw_boxes_top_view( 177 | rgb_image, box, grid_meters, labels, orientation_3d 178 | ) 179 | 180 | # rgb_image = np.rot90(rgb_image) 181 | top_image_vis.append(rgb_image) 182 | return np.asarray(top_image_vis) 183 | 184 | 185 | def visualize_bboxes_on_image(image, bboxes_2d, labels, orientation_2d=None): 186 | """ 187 | The function visualize the reprojected 3d bounding boxes 188 | on 2d image 189 | Arguments: 190 | images: the tensor of the shape [height, width, 3] 191 | bboxes: the reprojected bboxes of the shape [num_boxes, 8, 2] 192 | Returns: 193 | resulted_images: the tensor with bboxes of the shape [height, width, 3] 194 | """ 195 | 196 | height, width, _ = image.shape 197 | thickness = 2 198 | boundaries = np.asarray([width, height]) 199 | for idx, b in enumerate(bboxes_2d): 200 | color = get_color(labels[idx]) / 255 201 | 202 | b = b.astype(np.int32) 203 | first_square = False 204 | second_square = False 205 | if ( 206 | (b[0] >= 0).all() & (b[0] < boundaries).all() 207 | or (b[1] >= 0).all() & (b[1] < boundaries).all() 208 | or (b[4] >= 0).all() & (b[4] < boundaries).all() 209 | or (b[5] >= 0).all() & (b[5] < boundaries).all() 210 | ): 211 | first_square = True 212 | cv2.line(image, (b[0, 0], b[0, 1]), (b[1, 0], b[1, 1]), color, thickness) 213 | cv2.line(image, (b[4, 0], b[4, 1]), (b[0, 0], b[0, 1]), color, thickness) 214 | cv2.line(image, (b[5, 0], b[5, 1]), (b[1, 0], b[1, 1]), color, thickness) 215 | cv2.line(image, (b[4, 0], b[4, 1]), (b[5, 0], b[5, 1]), color, thickness) 216 | if ( 217 | (b[2] >= 0).all() & (b[2] < boundaries).all() 218 | or (b[3] >= 0).all() & (b[3] < boundaries).all() 219 | or (b[6] >= 0).all() & (b[6] < boundaries).all() 220 | or (b[7] >= 0).all() & (b[7] < boundaries).all() 221 | ): 222 | second_square = True 223 | cv2.line(image, (b[2, 0], b[2, 1]), (b[3, 0], b[3, 1]), color, thickness) 224 | cv2.line(image, (b[6, 0], b[6, 1]), (b[2, 0], b[2, 1]), color, thickness) 225 | cv2.line(image, (b[7, 0], b[7, 1]), (b[3, 0], b[3, 1]), color, thickness) 226 | cv2.line(image, (b[7, 0], b[7, 1]), (b[6, 0], b[6, 1]), color, thickness) 227 | 228 | if first_square and second_square: 229 | cv2.line(image, (b[0, 0], b[0, 1]), (b[3, 0], b[3, 1]), color, thickness) 230 | cv2.line(image, (b[1, 0], b[1, 1]), (b[2, 0], b[2, 1]), color, thickness) 231 | cv2.line(image, (b[4, 0], b[4, 1]), (b[7, 0], b[7, 1]), color, thickness) 232 | cv2.line(image, (b[5, 0], b[5, 1]), (b[6, 0], b[6, 1]), color, thickness) 233 | 234 | if orientation_2d is not None: 235 | for o in orientation_2d: 236 | o = 
o.astype(np.int32) 237 | if ( 238 | (o[0] >= 0).all() 239 | & (o[0] < boundaries).all() 240 | & (o[1] >= 0).all() 241 | & (o[1] < boundaries).all() 242 | ): 243 | cv2.arrowedLine( 244 | image, (o[0, 0], o[0, 1]), (o[1, 0], o[1, 1]), (1, 0, 0), thickness 245 | ) 246 | 247 | return image 248 | -------------------------------------------------------------------------------- /detection_3d/train.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | import argparse 20 | import os 21 | import tensorflow as tf 22 | from detection_3d.parameters import Parameters 23 | from detection_3d.detection_dataset import DetectionDataset 24 | from detection_3d.tools.detection_helpers import get_voxels_grid 25 | from detection_3d.model import YoloV3_Lidar 26 | from detection_3d.tools.training_helpers import ( 27 | setup_gpu, 28 | initialize_model, 29 | load_model, 30 | get_optimizer, 31 | ) 32 | from detection_3d.losses import detection_loss 33 | from detection_3d.tools.summary_helpers import train_summaries, epoch_metrics_summaries 34 | from detection_3d.metrics import EpochMetrics 35 | from tqdm import tqdm 36 | 37 | 38 | @tf.function 39 | def train_step(param_settings, train_samples, model, optimizer, epoch_metrics=None): 40 | 41 | with tf.GradientTape() as tape: 42 | top_view, box_grid, _ = train_samples 43 | predictions = model(top_view, training=True) 44 | ( 45 | obj_loss, 46 | label_loss, 47 | z_loss, 48 | delta_xy_loss, 49 | width_loss, 50 | height_loss, 51 | delta_orient_loss, 52 | ) = detection_loss(box_grid, predictions) 53 | losses = [ 54 | obj_loss, 55 | label_loss, 56 | z_loss, 57 | delta_xy_loss, 58 | width_loss, 59 | height_loss, 60 | delta_orient_loss, 61 | ] 62 | total_detection_loss = tf.reduce_sum(losses) 63 | # Get L2 losses for weight decay 64 | total_loss = total_detection_loss + tf.add_n(model.losses) 65 | 66 | gradients = tape.gradient(total_loss, model.trainable_variables) 67 | optimizer.apply_gradients(zip(gradients, model.trainable_variables)) 68 | if epoch_metrics is not None: 69 | epoch_metrics.train_loss(total_detection_loss) 70 | 71 | train_outputs = { 72 | "total_loss": total_loss, 73 | "losses": losses, 74 | "box_grid": box_grid, 75 | "predictions": predictions, 76 | "top_view": top_view, 77 | } 78 | 79 | return train_outputs 80 | 81 | 82 | @tf.function 83 | def val_step(samples, 
model, epoch_metrics=None): 84 | 85 | top_view, box_grid, _ = samples 86 | predictions = model(top_view, training=False) 87 | ( 88 | obj_loss, 89 | label_loss, 90 | z_loss, 91 | delta_xy_loss, 92 | width_loss, 93 | height_loss, 94 | delta_orient_loss, 95 | ) = detection_loss(box_grid, predictions) 96 | losses = [ 97 | obj_loss, 98 | label_loss, 99 | z_loss, 100 | delta_xy_loss, 101 | width_loss, 102 | height_loss, 103 | delta_orient_loss, 104 | ] 105 | total_detection_loss = tf.reduce_sum(losses) 106 | 107 | if epoch_metrics is not None: 108 | epoch_metrics.val_loss(total_detection_loss) 109 | 110 | 111 | def train(resume=False): 112 | setup_gpu() 113 | # General parameters 114 | param = Parameters() 115 | 116 | # Fix the random seed for reproducibility 117 | tf.random.set_seed(param.settings["seed"]) 118 | 119 | train_dataset = DetectionDataset( 120 | param.settings, 121 | "train.datatxt", 122 | augmentation=param.settings["augmentation"], 123 | shuffle=True, 124 | ) 125 | 126 | param.settings["train_size"] = train_dataset.num_samples 127 | val_dataset = DetectionDataset(param.settings, "val.datatxt", shuffle=False) 128 | param.settings["val_size"] = val_dataset.num_samples 129 | 130 | model = YoloV3_Lidar(weight_decay=param.settings["weight_decay"]) 131 | voxels_grid = get_voxels_grid( 132 | param.settings["voxel_size"], param.settings["grid_meters"] 133 | ) 134 | input_shape = [1, voxels_grid[0], voxels_grid[1], 2] 135 | initialize_model(model, input_shape) 136 | model.summary() 137 | start_epoch, model = load_model(param.settings["checkpoints_dir"], model, resume) 138 | model_path = os.path.join(param.settings["checkpoints_dir"], "{model}-{epoch:04d}") 139 | 140 | learning_rate, optimizer = get_optimizer( 141 | param.settings["optimizer"], 142 | param.settings["scheduler"], 143 | train_dataset.num_it_per_epoch, 144 | ) 145 | epoch_metrics = EpochMetrics() 146 | 147 | for epoch in range(start_epoch, param.settings["max_epochs"]): 148 | save_dir = model_path.format(model=model.name, epoch=epoch) 149 | epoch_metrics.reset() 150 | for train_samples in tqdm( 151 | train_dataset.dataset, 152 | desc=f"Epoch {epoch}", 153 | total=train_dataset.num_it_per_epoch, 154 | ): 155 | train_outputs = train_step( 156 | param.settings, train_samples, model, optimizer, epoch_metrics 157 | ) 158 | train_summaries(train_outputs, optimizer, param.settings, learning_rate) 159 | for val_samples in tqdm( 160 | val_dataset.dataset, desc="Validation", total=val_dataset.num_it_per_epoch 161 | ): 162 | val_step(val_samples, model, epoch_metrics) 163 | epoch_metrics_summaries(param.settings, epoch_metrics, epoch) 164 | epoch_metrics.print_metrics() 165 | # Save all 166 | param.save_to_json(save_dir) 167 | epoch_metrics.save_to_json(save_dir) 168 | model.save(save_dir) 169 | 170 | 171 | if __name__ == "__main__": 172 | parser = argparse.ArgumentParser(description="Train CNN.") 173 | parser.add_argument( 174 | "--resume", 175 | type=lambda x: x, 176 | nargs="?", 177 | const=True, 178 | default=False, 179 | help="Resume training from the latest checkpoint.", 180 | ) 181 | args = parser.parse_args() 182 | train(resume=args.resume) 183 | -------------------------------------------------------------------------------- /detection_3d/validation_inferece.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated 
documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | 20 | import argparse 21 | import os 22 | import numpy as np 23 | import tensorflow as tf 24 | from detection_3d.parameters import Parameters 25 | from detection_3d.tools.training_helpers import setup_gpu 26 | from detection_3d.detection_dataset import DetectionDataset 27 | from detection_3d.tools.visualization_tools import visualize_2d_boxes_on_top_image 28 | from detection_3d.tools.file_io import save_bboxes_to_file 29 | from detection_3d.tools.detection_helpers import ( 30 | make_eight_points_boxes, 31 | get_boxes_from_box_grid, 32 | get_bboxes_parameters_from_points, 33 | ) 34 | from PIL import Image 35 | from tqdm import tqdm 36 | import timeit 37 | 38 | 39 | def validation_inference(param_settings, dataset_file, model_dir, output_dir): 40 | setup_gpu() 41 | 42 | # Load model 43 | model = tf.keras.models.load_model(model_dir) 44 | bbox_voxel_size = np.asarray(param_settings["bbox_voxel_size"], dtype=np.float32) 45 | lidar_coord = np.array(param_settings["lidar_offset"], dtype=np.float32) 46 | grid_meters = param_settings["grid_meters"] 47 | 48 | val_dataset = DetectionDataset(param_settings, dataset_file, shuffle=False) 49 | param_settings["val_size"] = val_dataset.num_samples 50 | for val_samples in tqdm( 51 | val_dataset.dataset, desc=f"val_inference", total=val_dataset.num_it_per_epoch, 52 | ): 53 | top_view, gt_boxes, lidar_filenames = val_samples 54 | predictions = model(top_view, training=False) 55 | for image, predict, gt, filename in zip( 56 | top_view, predictions, gt_boxes, lidar_filenames 57 | ): 58 | filename = str(filename.numpy()) 59 | seq_folder = filename.split("/")[-3] 60 | name = os.path.splitext(os.path.basename(filename))[0] 61 | # Ensure that output dir exists or create it 62 | top_view_dir = os.path.join(output_dir, "top_view", seq_folder) 63 | bboxes_dir = os.path.join(output_dir, "bboxes", seq_folder) 64 | os.makedirs(top_view_dir, exist_ok=True) 65 | os.makedirs(bboxes_dir, exist_ok=True) 66 | p_top_view = ( 67 | visualize_2d_boxes_on_top_image( 68 | [predict], [image], grid_meters, bbox_voxel_size, prediction=True, 69 | ) 70 | * 255 71 | ) 72 | gt_top_view = ( 73 | visualize_2d_boxes_on_top_image( 74 | [gt], [image], grid_meters, bbox_voxel_size, prediction=False, 75 | ) 76 | * 255 77 | ) 78 | result = np.vstack((p_top_view[0], gt_top_view[0])) 79 | file_to_save = os.path.join(top_view_dir, name + ".png") 80 | img = Image.fromarray(result.astype("uint8")) 81 | img.save(file_to_save) 82 | 83 | box, labels, _ = get_boxes_from_box_grid(predict, bbox_voxel_size) 84 | box = box.numpy() 85 | box, _ = 
make_eight_points_boxes(box) 86 | if len(box) > 0: 87 | box = box - lidar_coord[:3] 88 | labels = np.argmax(labels, axis=-1) 89 | ( 90 | centroid, 91 | width, 92 | length, 93 | height, 94 | yaw, 95 | ) = get_bboxes_parameters_from_points(box) 96 | bboxes_name = os.path.join(bboxes_dir, name + ".txt") 97 | save_bboxes_to_file( 98 | bboxes_name, centroid, width, length, height, yaw, labels 99 | ) 100 | 101 | 102 | if __name__ == "__main__": 103 | parser = argparse.ArgumentParser(description="Inference validation set.") 104 | parser.add_argument( 105 | "--dataset_file", default="val.datatxt", 106 | ) 107 | 108 | parser.add_argument("--output_dir", default="inference") 109 | 110 | parser.add_argument( 111 | "--model_dir", default="YoloV3_Lidar-0085", 112 | ) 113 | args = parser.parse_args() 114 | 115 | param_settings = Parameters().settings 116 | validation_inference( 117 | param_settings, args.dataset_file, args.model_dir, args.output_dir 118 | ) 119 | -------------------------------------------------------------------------------- /pictures/box_parametrization.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Dtananaev/lidar_dynamic_objects_detection/3b8f3d5fcce0fb914bb83e5d43a3ca652739139e/pictures/box_parametrization.png -------------------------------------------------------------------------------- /pictures/result.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Dtananaev/lidar_dynamic_objects_detection/3b8f3d5fcce0fb914bb83e5d43a3ca652739139e/pictures/result.png -------------------------------------------------------------------------------- /pictures/topview.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Dtananaev/lidar_dynamic_objects_detection/3b8f3d5fcce0fb914bb83e5d43a3ca652739139e/pictures/topview.png -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | import setuptools 2 | 3 | with open("README.md", "r") as fh: 4 | long_description = fh.read() 5 | 6 | setuptools.setup( 7 | name="detection_3d-Denis-Tananaev", 8 | version="0.0.1", 9 | author="Denis Tananaev", 10 | author_email="d.d.tananaev@gmail.com", 11 | description="3D bbox detection with Lidar", 12 | long_description=long_description, 13 | long_description_content_type="text/markdown", 14 | url="https://github.com/Dtananaev/lidar_dynamic_objects_detection", 15 | packages=setuptools.find_packages(), 16 | classifiers=[ 17 | "Programming Language :: Python :: 3", 18 | "License :: OSI Approved :: MIT License", 19 | "Operating System :: OS Independent", 20 | ], 21 | python_requires='>=3.6', 22 | ) 23 | --------------------------------------------------------------------------------
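Typical end-to-end usage once the package is installed, assuming the PandaSet preprocessing and create_dataset_lists.py have already produced train.datatxt and val.datatxt (the checkpoint directory name below is a placeholder matching the argparse default):

```
# Train from scratch, then resume from the newest checkpoint:
python detection_3d/train.py
python detection_3d/train.py --resume

# Run inference and top-view visualization on the validation split:
python detection_3d/validation_inferece.py --dataset_file val.datatxt \
    --model_dir YoloV3_Lidar-0085 --output_dir inference
```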